Author: Knapp A.W.
Tags: mathematics cryptography natural sciences elliptic curves
ISBN: 0-691-08559-5
Year: 1992
ELLIPTIC CURVES
BY
ANTHONY W. KNAPP
MATHEMATICAL NOTES
PRINCETON UNIVERSITY PRESS
Elliptic Curves
by
Anthony W. Knapp
Mathematical Notes 40
Princeton University Press
Princeton, New Jersey
1992
Copyright © 1992 by Princeton University Press
ALL RIGHTS RESERVED
The Princeton Mathematical Notes are edited by
Luis A. Caffarelli, John N. Mather, and Elias M. Stein
Princeton University Press books are printed on acid-free
paper, and meet the guidelines for permanence and
durability of the Committee on Production Guidelines for Book
Longevity of the Council on Library Resources
Printed in the United States of America
Library of Congress Cataloging-in-Publication Data
Knapp, Anthony W.
Elliptic curves / by Anthony W. Knapp.
p. cm.—(Mathematical notes ; 40)
Includes bibliographical references and index.
ISBN 0-691-08559-5 (PB)
1. Curves, Elliptic. I. Title. II. Series. Mathematical notes
(Princeton University Press); 40.
QA567.2.E44K53 1993 516.3'52--dc20 92-22183
3 5 7 9 10 8 6 4 2
To Susan
CONTENTS
List of Figures x
List of Tables x
Preface xi
Standard Notation xv
I. Overview
II. Curves in Projective Space
1. Projective Space 19
2. Curves and Tangents 24
3. Flexes 32
4. Application to Cubics 40
5. Bezout's Theorem and Resultants 44
III. Cubic Curves in Weierstrass Form
1. Examples 50
2. Weierstrass Form, Discriminant, j-invariant 56
3. Group Law 67
4. Computations with the Group Law 74
5. Singular Points 77
IV. Mordell's Theorem
1. Descent 80
2. Condition for Divisibility by 2 85
3. $E(\mathbb{Q})/2E(\mathbb{Q})$, Special Case 88
4. $E(\mathbb{Q})/2E(\mathbb{Q})$, General Case 92
5. Height and Mordell's Theorem 95
6. Geometric Formula for Rank 102
7. Upper Bound on the Rank 107
8. Construction of Points in $E(\mathbb{Q})$ 115
9. Appendix on Algebraic Number Theory 122
V. Torsion Subgroup of $E(\mathbb{Q})$
1. Overview 130
2. Reduction Modulo $p$ 134
3. $p$-adic Filtration 137
4. Lutz-Nagell Theorem 144
5. Construction of Curves with Prescribed Torsion 145
6. Torsion Groups for Special Curves 148
VI. Complex Points
1. Overview 151
2. Elliptic Functions 152
3. Weierstrass $\wp$ Function 153
4. Effect on Addition 162
5. Overview of Inversion Problem 165
6. Analytic Continuation 166
7. Riemann Surface of the Integrand 169
8. An Elliptic Integral 174
9. Computability of the Correspondence 183
VII. Dirichlet's Theorem
1. Motivation 189
2. Dirichlet Series and Euler Products 192
3. Fourier Analysis on Finite Abelian Groups 199
4. Proof of Dirichlet's Theorem 201
5. Analytic Properties of Dirichlet L Functions 207
VIII. Modular Forms for $SL(2,\mathbb{Z})$
1. Overview 221
2. Definitions and Examples 222
3. Geometry of the q Expansion 227
4. Dimensions of Spaces of Modular Forms 231
5. L Function of a Cusp Form 238
6. Petersson Inner Product 241
7. Hecke Operators 242
8. Interaction with Petersson Inner Product 250
IX. Modular Forms for Hecke Subgroups
1. Hecke Subgroups 256
2. Modular and Cusp Forms 261
3. Examples of Modular Forms 265
4. L Function of a Cusp Form 267
5. Dimensions of Spaces of Cusp Forms 271
6. Hecke Operators 273
7. Oldforms and Newforms 283
X. L Function of an Elliptic Curve
1. Global Minimal Weierstrass Equations 290
2. Zeta Functions and L Functions 294
3. Hasse's Theorem 296
XI. Eichler-Shimura Theory
1. Overview 302
2. Riemann surface $X_0(N)$ 311
3. Meromorphic Differentials 312
4. Properties of Compact Riemann Surfaces 316
5. Hecke Operators on Integral Homology 320
6. Modular Function $j(\tau)$ 333
7. Varieties and Curves 341
8. Canonical Model of $X_0(N)$ 349
9. Abstract Elliptic Curves and Isogenies 359
10. Abelian Varieties and Jacobian Variety 367
11. Elliptic Curves Constructed from $S_2(\Gamma_0(N))$ 374
12. Match of L Functions 383
XII. Taniyama-Weil Conjecture
1. Relationships among Conjectures 386
2. Strong Weil Curves and Twists 392
3. Computations of Equations of Weil Curves 394
4. Connection with Fermat's Last Theorem 397
Notes 401
References 409
Index of Notation 419
Index 423
LIST OF FIGURES
1.1 Method for obtaining Q solutions of $x^2 + y^2 = 1$ 6
1.2 Newton's explanation of the Diophantus method 10
1.3 Chord-tangent composition rule 11
1.4 Group laws relative to different base points $O$ 11
1.5 Negatives relative to the group law 12
1.6 Singular behavior 13
1.7 Graphs of R points of the elliptic curve $y^2 = P(x)$ 14
3.1 Configuration for $(PP')(QQ') = (PQ)(P'Q')$ 69
8.1 Fundamental domain for $SL(2,\mathbb{Z})$ 228
8.2 Contour for calculating number of zeros 232
9.1 Fundamental domain for $\Gamma_0(2)$ 260
LIST OF TABLES
1.1 Values of $\prod_{p<R} N(p)/p$ for $y^2 = x^3 + ax$ 17
3.1 Some elliptic curves with small negative discriminant 64
3.2 Some elliptic curves with small positive discriminant 65
4.1 Image of $\varphi$ for $y^2 = x^3 - 4x$ with $x \in \{-2,0,2\}$ 111
4.2 Image of $\varphi$ for $y^2 = x^3 - 4x$ with $x \notin \{-2,0,2\}$ 111
4.3 Adjusted image of $\varphi$ for $y^2 = x^3 - 4x$ 111
4.4 Image of $\varphi$ for $y^2 = x^3 - p^2x$ with $x \notin \{-p,0,p\}$ 113
4.5 Adjusted image of $\varphi$ for $y^2 = x^3 - p^2x$ 113
4.6 Some primes $p \equiv 5 \bmod 8$ that are congruent numbers 117
4.7 Effect on $E : y^2 = x^3 + 8x$ of Fermat's descent 121
5.1 Examples of torsion subgroups of E(Q) 133
10.1 Effect of an admissible change of variables 291
12.1 Conductors of some elliptic curves 391
12.2 Curves X0(N) of low genus 395
PREFACE
For introductory purposes, an elliptic curve over the rationals is an
equation $y^2 = P(x)$, where $P$ is a monic polynomial of degree three with
rational coefficients and with distinct complex roots.
The points on such a curve, together with a point at infinity, form an
abelian group under a geometric definition of addition. Namely if we
take two points on the curve and connect them by a line, the line will
intersect the curve in a third point. The reflection of that third point in
the x-axis is taken as the sum of the given points. The identity is the
point at infinity. According to Mordell's Theorem, the abelian group of
points on the curve with rational coordinates is finitely generated. A
theorem of Lutz and Nagell describes the torsion subgroup completely,
but the rank of the free abelian part is as yet not fully understood.
This is the essence of the basic theory of rational elliptic curves. The
first five of the twelve chapters of this book give an account of this theory,
together with many examples and number-theoretic applications. This
is beautiful mathematics, of interest to people in many fields. Except for
one small part of the proof of Mordell's Theorem, it is elementary,
requiring only undergraduate mathematics. Accordingly the presentation
avoids most of the machinery of algebraic geometry.
A related theory concerns elliptic curves over the complex numbers, or
Riemann surfaces of genus one. This subject requires complex variable
theory and is discussed in Chapter VI. It leads naturally to the topic of
modular forms, which is the subject of Chapters VIII and IX.
But the book is really about something deeper, the twentieth-century
discovery of a remarkable connection between automorphy and
arithmetic algebraic geometry. This connection first shows up in the
coincidence of L functions that arise from some very special modular forms
("automorphic" L functions) with L functions that arise from number
theory ("arithmetic" or "geometric" L functions, also called "motivic").
Chapter VII introduces this theme. The automorphic L functions have
manageable analytic properties, while the arithmetic L functions
encode subtle number-theoretic information. The fact that the arithmetic
L functions are automorphic enables one to bring a great deal of
mathematics to bear on extracting the number-theoretic information from the
L function.
The prototype for this phenomenon is the Riemann zeta function $\zeta(s)$,
which should be considered as an arithmetic L function defined initially
for $\operatorname{Re} s > 1$. An example of subtle number-theoretic information that
$\zeta(s)$ encodes is the Prime Number Theorem, which follows from the
nonvanishing of $\zeta(s)$ for $\operatorname{Re} s = 1$. In particular, this property of $\zeta(s)$
is a property of points $s$ outside the initial domain of $\zeta(s)$. To get
a handle on analytic properties of $\zeta(s)$, one proves that $\zeta(s)$ has an
analytic continuation and a functional equation. These properties are
completely formal once one establishes a relationship between $\zeta(s)$ and
a theta function with known transformation properties. Establishing
this relationship is the same as proving that $\zeta(s)$ is an automorphic L
function.
The main examples of Chapter VII are the Dirichlet L functions
$L(s,\chi)$. These too are arithmetic L functions defined initially for
$\operatorname{Re} s > 1$. They encode Dirichlet's Theorem on primes in arithmetic
progressions, which follows from the nonvanishing of all $L(s,\chi)$ at $s = 1$.
As with $\zeta(s)$, the relevant properties of $L(s,\chi)$ are outside the initial
domain. Also as with $\zeta(s)$, one gets at the analytic continuation and
functional equation of $L(s,\chi)$ by identifying $L(s,\chi)$ with an automorphic
L function.
The examples at the level of Chapter VII are fairly easy. Further
examples, generalizing the Dirichlet L functions in a natural way, arise
in abelian class field theory, are well understood even if not easy, and
will not be discussed in this book. The simplest L functions that are
not well understood come from elliptic curves. An elliptic curve has a
geometric L function $L(s,E)$ initially defined for $\operatorname{Re} s > \frac{3}{2}$. An example
conjecturally of the subtle information that $L(s,E)$ encodes is the rank
of the free abelian group of rational points on the curve. This rank is
believed to be the order of vanishing of $L(s,E)$ at $s = 1$. Once again,
the relevant property of L(s, E) is outside the initial domain. To address
the necessary analytic continuation, one would like to know that L(s, E)
is an automorphic L function. Work of Eichler and Shimura provides a
clue where to look for such a relationship. Eichler and Shimura gave a
construction for passing from certain cusp forms of weight two for Hecke
subgroups of the modular group to rational elliptic curves. Under this
construction, the L function of the cusp form (which is an automorphic
L function) equals the L function of the elliptic curve. The Taniyama-
Weil Conjecture expects conversely that every elliptic curve arises from
this construction, followed by a relatively simple map between elliptic
curves. This conjecture appears to be very deep; a theorem of Frey,
Serre, and Ribet says that it implies Fermat's Last Theorem. The final
three chapters discuss these matters; the last two take for granted more
mathematics than do the earlier chapters.
If the theme were continued beyond the twelve chapters that are here,
eventually it would lead to the Langlands program, which brings in
representation theory on the automorphic side of this correspondence.
As a representation theorist, I come to elliptic curves from the point of
view of the Langlands program. Although the book neither uses nor
develops any representation theory, elliptic curves do give the simplest
case of the program where the correspondence of L functions is not
completely understood. Furthermore representation-theoretic methods
occasionally yield results about elliptic curves that seem inaccessible by
classical methods. From my point of view, they are an appropriate place
to begin to study and appreciate the Langlands program. A beginning
guide to the literature in this area appears in the section of Notes at the
end of the book.
This book grew out of a brilliant series of a half dozen lectures by
Don Zagier at the Tata Institute of Fundamental Research in Bombay
in January 1988. The book incorporates notes from parts of courses
that I gave at SUNY Stony Brook in Spring 1989 and Spring 1990. The
organization owes a great deal to Zagier's lectures, and I have reproduced
a number of illuminating examples of his. I am indebted to Zagier for
offering his series of lectures.
Much of the mathematics here can already be found in other books,
even if it has not been assembled in quite this way. Some sections of this
book follow sections of other books rather closely. Notable among these
other books are Fulton [1989],* Hartshorne [1977], Husemoller [1987],
Lang [1976] and [1987], Ogg [1969a], Serre [1973a], Shimura [1971a],
Silverman [1986], and Walker [1950]. The expository paper Swinnerton-
Dyer and Birch [1975] was also especially helpful. Detailed
acknowledgments of these dependences may be found in the section of Notes at the
end.
In addition, I would like to thank the following people for help in
various ways, some large and some small: H. Farkas, N. Katz, S. Kudla, R. P.
Langlands, S. Lichtenbaum, H. Matumoto, C.-H. Sah, V. Schechtman,
and R. Stingley. The type-setting was by AMS-TeX, and the Figures
were drawn with Mathematica. Financial support in part was from
the National Science Foundation in the form of grants DMS 87-23046
and DMS 91-00367.
A. W. Knapp
January, 1992
*A name followed by a bracketed year is an allusion to the list of References at the
end of the book. The date is followed by a letter in case of ambiguity.
STANDARD NOTATION
Item                                  Meaning
$\#S$ or $|S|$                        number of elements in $S$
$\emptyset$                           empty set
$A^c$                                 $A$ complement
$n$ positive                          $n > 0$
$\mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}$    integers, rationals, reals, complex numbers
$\operatorname{Re} z$                 real part of $z$
$\operatorname{Im} z$                 imaginary part of $z$
$O(1)$                                bounded term
$o(1)$                                term tending to 0
$\approx$                             approximately numerically equal
$\sim$                                asymptotic to, with ratio tending to 1
$\mathbb{Z}_m$                        integers modulo $m$
$a \equiv b \bmod m$                  $m$ divides $a - b$
$a \mid b$                            $a$ divides $b$
$\mathrm{GCD}(a,b)$                   greatest common divisor
$1$                                   multiplicative identity
$1$ or $I$                            identity matrix
$\dim V$                              dimension of vector space
$V'$                                  dual of vector space
$\mathrm{End}_k(V)$                   linear maps of vector space to itself
$GL(n,k)$                             general linear group over a field $k$
$SL(2,\mathbb{R})$, $SL(2,\mathbb{Z})$    group of 2-by-2 matrices of determinant 1
$\operatorname{Tr} A$                 trace of $A$
$A^{\mathrm{tr}}$                     transpose of $A$
$R^{\times}$                          multiplicative group of invertible elements
$\bar{k}$                             algebraic closure of $k$
$[A:B]$                               index of $B$ in $A$, or degree of $A$ in $B$
$\mathrm{Aut}_k(K)$                   automorphism group of $K$ fixing $k$
$\oplus$                              direct sum (for emphasis)
$\pi_1$                               fundamental group
Elliptic Curves
CHAPTER I
OVERVIEW
Diophantus lived in Alexandria around 250 A.D. He published a series
of books, Arithmetica, in 13 volumes, which were lost for more than a
thousand years. They were found again about 1570, and in the next
century Fermat studied a translation. Despite the passage of so much
time, much of Diophantus's work had still not been rediscovered. The
books are in the style of problems and solutions. The early volumes
introduced, apparently for the first time, algebraic notation and equations,
as well as negative numbers. Later volumes dealt with number theory.
The work of Diophantus is so stunning that equations to be solved in
number theory are often called Diophantine equations in his honor.
Basic Problem (affine). We consider the locus $f(x,y) = 0$ with
$f$ a nonzero polynomial. To fix the ideas, think of Q coefficients and
of solutions with $x$ and $y$ in Q. Clearing denominators, we can always
adjust $f$ so that we have Z coefficients and are seeking Q solutions, i.e.,
solutions with $x$ and $y$ in Q.
Basic Problem (projective). We study the locus $F(x,y,w) = 0$
with $F$ a nonzero homogeneous polynomial of some degree $d$, and we
seek "projective" Q solutions. This means we identify solutions $(x,y,w)$
and $(\lambda x,\lambda y,\lambda w)$ for $\lambda \neq 0$ and we discard $(0,0,0)$. Rational solutions
automatically give us integer solutions, by clearing denominators.
The two problems are related, in that we can pass from each to the
other. Two examples will illustrate.
EXAMPLE 1. Fermat equation $x^d + y^d = z^d$. This equation is
projective. We can reduce to the affine case by dividing by $z^d$: $u^d + v^d = 1$.
Relatively prime integer solutions of $x^d + y^d = z^d$ (with $z \neq 0$)
correspond to rational solutions of $u^d + v^d = 1$ provided that we identify a
solution $(x,y,z)$ with its negative $(-x,-y,-z)$. This process of
reducing to the affine case loses "affine solutions at $\infty$," those with $z = 0$, but
they are trivial here. Fermat's Last Theorem is the conjecture that
the equation has no nontrivial solutions for $d > 2$.
EXAMPLE 2. The affine equation $y^2 = x^3 + 1$. This becomes a
projective equation if we put $y = \frac{v}{w}$ and $x = \frac{u}{w}$. We get $wv^2 = u^3 + w^3$.
(Or we simply insert powers of $w$ to make all terms cubic: $wy^2 = x^3 + w^3$.)
Diophantus considered the affine problem in degrees 1, 2, and 3.
Case of degree 1.
An affine line is of the form
$ax + by + c = 0$ with $a$ and $b$ not both 0.
Two such lines are the same if and only if the coefficients of one are a
multiple of the coefficients of the other.
A projective line is of the form
$ax + by + cw = 0$ with $a$, $b$, $c$ not all 0.
Again, two such lines are the same if and only if the coefficients of one
are a multiple of the coefficients of the other. The line with $a = b = 0$
and $c = 1$ is $w = 0$, which is the "line at $\infty$."
Two facts are clear:
1) Q solutions exist (projectively).
2) We can parametrize all Q solutions.
Case of degree 2.
First we consider this case projectively. Then it turns out to be
possible to make a linear change of variables so that the equation is
$ax^2 + by^2 + cw^2 = 0$. (1.1)
The questions we shall address are:
(I) Do (nonzero) solutions exist?
(II) If solutions exist, how are they parametrized?
Of these, only the second question was considered by Diophantus.
We shall return to the second question shortly. We begin with a
discussion of Question I, first giving two examples.
EXAMPLE 1. $x^2 + y^2 + z^2 = 0$ has no nonzero Q solutions since it has no
nonzero R solutions.
EXAMPLE 2. $x^2 + y^2 = 3z^2$ has no nonzero Q solutions since it has no solutions
modulo 9 in which some variable is not divisible by 3. This condition
is necessary since existence of a nonzero Q solution implies existence of a
relatively prime Z solution. To see that the condition fails, we argue as
follows: Modulo 3, the equation is $x^2 + y^2 \equiv 0$, which forces $x \equiv y \equiv 0
\bmod 3$. Then $3^2$ divides $x^2 + y^2 = 3z^2$ and so 3 divides $z$. Thus 3 divides all
of $x$, $y$, and $z$, a contradiction.
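The congruence argument of Example 2 can also be confirmed by exhaustive search. The following is a minimal sketch in Python; the function name `primitive_solutions_mod` is illustrative and not from the text. It lists the residue triples that solve $x^2 + y^2 \equiv 3z^2$ modulo $n$ and have some coordinate not divisible by $p$.

```python
from itertools import product

def primitive_solutions_mod(n, p):
    """All (x, y, z) mod n with x^2 + y^2 = 3 z^2 (mod n) in which
    at least one coordinate is not divisible by p."""
    return [
        (x, y, z)
        for x, y, z in product(range(n), repeat=3)
        if (x * x + y * y - 3 * z * z) % n == 0
        and any(v % p != 0 for v in (x, y, z))
    ]

# Modulo 9 the search comes back empty, as the argument predicts.
solutions_mod_9 = primitive_solutions_mod(9, 3)
# Modulo 3 alone there are such solutions, e.g. (0, 0, 1).
solutions_mod_3 = primitive_solutions_mod(3, 3)
print(solutions_mod_9, solutions_mod_3)
```

The comparison with modulus 3 shows why the example works modulo 9: the obstruction is not visible modulo 3 itself.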
More generally, suppose that $a$, $b$, $c$ are relatively prime integers and
no square divides a coefficient. Then two necessary conditions for the
existence of solutions of (1.1) can be seen to be that
(1) (1.1) must have a nonzero solution in R
(2) (1.1) must have a relatively prime solution modulo $p^m$ for each
prime $p$ and each integer $m > 0$.
It turns out that (2) is automatic for all $m$ if $p$ is odd and does not divide
$abc$. If $p = 2$ or $p$ divides $abc$, then the weaker necessary condition (3a)
or (3b) below implies (2) for the corresponding $p$:
(3a) when an odd prime $p$ divides $abc$, (1.1) must have a relatively
prime solution modulo $p$
(3b) when $p = 2$, (1.1) must have a relatively prime solution modulo
16.
Theorem 1.1 (Legendre). If (1) holds and if (3) holds for $p = 2$ and
for all odd primes $p$ dividing $abc$, then (1.1) has a Q solution.
Remarks: i) (1) and (3) are decidable. So we can decide the existence
question, Question I. (And, parenthetically, the hypothesis on p = 2 in
the theorem is not needed.)
ii) To fix the ideas, think of (1) and (2) as the conditions. The Hasse-
Minkowski theorem generalizes the Legendre theorem to more variables:
A homogeneous quadratic polynomial in $n$ variables with integer
coefficients has a nonzero solution in Q if and only if it has a nonzero solution
in R and a relatively prime solution modulo $p^m$ for each prime $p$ and each
integer $m > 0$.
iii) The field of $p$-adic numbers, described below in §V.3, can be used
to combine congruence information modulo $p^m$ for all $m$. With this
device, the Hasse-Minkowski theorem says: A homogeneous quadratic
polynomial in $n$ variables with integer coefficients has a nonzero solution
in Q if and only if it has a nonzero solution in R and in the $p$-adic numbers
for every prime $p$.
The idea that solutions over R and the p-adics ought to control
solutions over Q is called the Hasse Principle. Unfortunately it is not
universally true. As we shall see shortly, it fails even for homogeneous
cubics in three variables. Nevertheless, the Hasse Principle is a useful
place to begin the study of a class of projective equations over Q. We
shall see better how the principle operates in Chapters V and VII.
In the language of algebraic number theory, the p-adics and R are
indexed by "places." A narrow (but still false) form of the Hasse Principle
then says: If a projective equation is locally solvable at every place,
then it is globally solvable. This terminology is supposed to suggest an
analogy with what happens in the theory of several complex variables.
In that theory, we consider holomorphic functions and make a ring out
of those near a point (technically by passing to germs). The ring that we
make is a "local ring" in the sense that it has a unique maximal ideal.
The global ring is the ring of all globally holomorphic functions on a
given open set.
In number theory, we attempt the same thing. Our global number field
is Q. To this we attach "places," analogous to points, by an abstract
definition that we shall not pursue. The places work out to be the primes
$p$ and also $\infty$. For each place, there is a "local field" - the $p$-adics and the
reals. The sense in which the p-adics are a "local field" is that the p-adic
integers form a "local ring," a ring with a unique maximal ideal. Local
solvability is to be solvability in the local field for each place. The hope
expressed in the Hasse Principle is that we can obtain global solutions
from local ones, because finding local ones is easier.
We turn to a discussion of Question II, the parametrization of
solutions. Let us begin with Diophantus's method to solve $x^2 + y^2 = 1$.
Take a particular solution $(-1,0)$. Choose a linear relation it satisfies,
say $x = -1 + 2y$.
FIGURE 1.1. Method for obtaining Q solutions of $x^2 + y^2 = 1$
Substitution into $x^2 + y^2 = 1$ gives $(-1 + 2y)^2 + y^2 = 1$, and we are
led to $1 - 4y + 4y^2 + y^2 = 1$, $5y - 4 = 0$, $y = \frac{4}{5}$, $x = \frac{3}{5}$. We obtain
$(x,y) = (\frac{3}{5}, \frac{4}{5})$ as a nontrivial solution. Effectively $x = -1 + ty$ works
for any rational $t$. We are led to
$(x,y) = \left(\frac{t^2 - 1}{t^2 + 1}, \frac{2t}{t^2 + 1}\right)$. (1.2)
Conversely any solution $(x,y)$ with $y \neq 0$ must make a line with $(-1,0)$ with
$\frac{x+1}{y}$ = some $t$, and $(x,y)$ is forced to be of this form.
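Formula (1.2) is easy to experiment with in exact rational arithmetic. The sketch below is only an illustration; the names `circle_point` and `slope_parameter` are not from the text. It evaluates (1.2) for a rational $t$ and recovers $t$ from the point as $(x+1)/y$.

```python
from fractions import Fraction

def circle_point(t):
    """Point (x, y) of formula (1.2) on x^2 + y^2 = 1 for a rational t."""
    t = Fraction(t)
    d = t * t + 1
    return (t * t - 1) / d, 2 * t / d

def slope_parameter(x, y):
    """Recover t = (x + 1) / y from a solution with y != 0."""
    return (Fraction(x) + 1) / Fraction(y)

x, y = circle_point(2)           # the line x = -1 + 2y used above
print(x, y, x * x + y * y)       # 3/5 4/5 1
print(slope_parameter(x, y))     # 2
```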
The actual Diophantus method did not make reference to any
geometry such as in Figure 1.1. The correspondence between algebra and
geometry was unknown before Fermat and Descartes in the 17th century.
Diophantus used this analysis to parametrize Pythagorean triples:
The problem is to find all relatively prime solutions of $a^2 + b^2 = c^2$ over
Z. For such a solution $(a,b,c)$ with $a \neq 0$, put $t = m/n$ in (1.2). Then
$\frac{b}{a} = \frac{b/c}{a/c} = \frac{2t/(t^2+1)}{(t^2-1)/(t^2+1)} = \frac{2t}{t^2-1} = \frac{2m/n}{(m/n)^2 - 1} = \frac{2mn}{m^2 - n^2}$.
So $(a,b,\pm c) = \kappa(m^2 - n^2, 2mn, m^2 + n^2)$ with $\kappa \in \mathbb{Q}$. Below we shall
tidy matters up and come to the following conclusion.
Theorem 1.2 (Diophantus). In any Pythagorean triple $(a,b,c)$,
exactly one of $a$ and $b$ is odd. If $a$ is odd and $c$ is positive, then there exist
integers $m$ and $n$ such that
$(a,b,c) = (m^2 - n^2, 2mn, m^2 + n^2)$. (1.3)
The integers $m$ and $n$ are relatively prime and not both odd. Conversely
any pair $(m,n)$ with these properties yields a Pythagorean triple via
(1.3).
PROOF. The given equation taken modulo 4 shows that $a$ and $b$ cannot
both be odd. Thus suppose $a$ is odd and $b$ is even, and write as above
$(a,b,\pm c) = \frac{u}{v}(m^2 - n^2, 2mn, m^2 + n^2)$
with $u/v$ in lowest terms. Clearing fractions, we see that $u$ divides
$a$, $b$, and $c$. Hence $u = \pm 1$ and we may take $u = +1$. We now have
$v(a,b,\pm c) = (m^2 - n^2, 2mn, m^2 + n^2)$.
We can eliminate the ambiguous sign by redefining $(m,n,v)$, leaving it
alone if the sign is $+$ or replacing it by $(-n,m,-v)$ if the sign is $-$.
Then we obtain
$v(a,b,c) = (m^2 - n^2, 2mn, m^2 + n^2)$. (1.4)
Since $c > 0$, $v > 0$. If $m$ and $n$ have any common factor, the square of
the factor will appear in $v$, and we can cancel. Thus we may assume
that $m$ and $n$ are relatively prime.
If $p$ is an odd prime dividing $v$, then $p$ divides both $m^2 - n^2$ and
$m^2 + n^2$, hence both $2m^2$ and $2n^2$, hence both $m$ and $n$. Thus no odd
prime divides $v$.
If 2 divides $v$, then the evenness of $b$ implies that 4 divides $vb = 2mn$.
Thus one of $m$ or $n$ must be even, and then the other must be odd.
Hence $m^2 - n^2$ is odd, and $va$ is odd, contradiction. Thus $v = 1$, and
(1.4) reduces to (1.3).
In (1.3) it is clear that $m$ and $n$ are not both odd. Conversely, such a
relatively prime pair $(m,n)$ trivially yields a Pythagorean triple.
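As an illustration of Theorem 1.2, formula (1.3) can be used to enumerate triples. The sketch below makes one simplifying assumption that is not in the theorem: it restricts to $m > n > 0$, so that $a$, $b$, $c$ are all positive. The function name `primitive_triples` is illustrative.

```python
from math import gcd

def primitive_triples(limit):
    """Triples (a, b, c) = (m^2 - n^2, 2mn, m^2 + n^2) with c <= limit,
    for relatively prime m > n > 0 that are not both odd."""
    triples = []
    m = 2
    while m * m + 1 <= limit:
        for n in range(1, m):
            c = m * m + n * n
            if gcd(m, n) == 1 and (m - n) % 2 == 1 and c <= limit:
                triples.append((m * m - n * n, 2 * m * n, c))
        m += 1
    return sorted(triples, key=lambda t: t[2])

triples = primitive_triples(30)
print(triples)
```

In every triple produced, $a$ is odd and $b$ is even, in line with the first statement of the theorem.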
The method of Diophantus readily generalizes. Take $f(x,y) = 0$ to be
any quadratic in $(x,y)$. Suppose $(p,q)$ is a solution. Then $y = q + t(x - p)$
and $x = p$ give all lines through $(p,q)$. Substitute for $y$. Get a quadratic
equation for $x$ with a rational solution (for each $t$), namely $x = p$. Then
the other solution must be rational, etc. Without even carrying out the
computations, we can draw an important qualitative conclusion:
Proposition 1.3. If a quadratic equation $f(x,y) = 0$ over Q has one
Q solution, it has infinitely many Q solutions.
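The computation that the paragraph leaves out can be sketched as follows. The coefficient layout $Ax^2 + Bxy + Cy^2 + Dx + Ey + F$ and the name `second_point` are assumptions made for this illustration. Writing $x = p + s$ and $y = q + ts$ and using $f(p,q) = 0$, the quadratic in $s$ has the root $s = 0$, and the other root is minus the linear coefficient divided by the leading coefficient.

```python
from fractions import Fraction

def second_point(coeffs, p, q, t):
    """Second intersection of the line y = q + t (x - p) with the conic
    A x^2 + B xy + C y^2 + D x + E y + F = 0, given that (p, q) lies on it.
    Returns None if the line meets the conic only once (leading term 0)."""
    A, B, C, D, E, F = (Fraction(c) for c in coeffs)
    p, q, t = Fraction(p), Fraction(q), Fraction(t)
    assert A * p * p + B * p * q + C * q * q + D * p + E * q + F == 0
    lead = A + B * t + C * t * t                       # coefficient of s^2
    if lead == 0:
        return None
    lin = (2 * A * p + B * q + D) + t * (B * p + 2 * C * q + E)  # coefficient of s
    s = -lin / lead                                    # the root other than s = 0
    return p + s, q + t * s

# Circle x^2 + y^2 = 1 through (-1, 0); slope 1/2 is the line x = -1 + 2y.
print(second_point((1, 0, 1, 0, 0, -1), -1, 0, Fraction(1, 2)))
```

Distinct slopes $t$ give distinct lines through $(p,q)$, which is the source of the infinitely many solutions in Proposition 1.3.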
Exercise: Parametrize the relatively prime integer solutions $(a,b,c)$
of $2a^2 + b^2 = c^2$.
Case of degree 3.
First we consider this case projectively. There is no normal form in
this generality. We shall address the same questions as in the case of
degree 2:
(I) Do Q solutions exist?
(II) If Q solutions exist, how are they parametrized?
The second of these questions was considered briefly by Diophantus.
The following example shows for Question I that the Hasse Principle
fails.
EXAMPLE (Selmer): The projective equation $3x^3 + 4y^3 + 5z^3 = 0$ has a
solution over R and over the $p$-adics for each $p$, but it has no Q solution.
Now let us discuss Question II. We begin by normalizing our cubic
suitably. Taking Question I as settled, suppose that the homogeneous
cubic $F(x,y,w) = 0$ has a Q solution. Suppose in particular that it
has an inflection point over Q in a sense that will be made precise in
§II.3. If this inflection point is mapped to $(x,y,w) = (0,1,0)$ by a linear
transformation in such a way that the tangent line becomes the line at
$\infty$, and if the variables are scaled suitably, then the equation (in affine
form) becomes
$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$.
(Notice that $(x,y,w) = (0,1,0)$ does solve this equation projectively.)
Since Q does not have characteristic 2, we can complete the square in $y$
and reduce matters to
$y^2 = P(x)$, where $\deg P = 3$.
EXAMPLE: $u^3 + v^3 = w^3$. Put $\frac{u}{w} = \frac{3x}{y}$, $\frac{v}{w} = \frac{y - 9}{y}$. The inverse is
$x = \frac{3u}{w - v}$, $y = \frac{9w}{w - v}$.
Then the equation becomes $y^2 - 9y = x^3 - 27$.
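The change of variables in this example can be checked mechanically. A direct expansion, which is not in the text, gives the identity $(y^2 - 9y) - (x^3 - 27) = 27(w^3 - u^3 - v^3)/(w - v)^3$ for $x = 3u/(w-v)$, $y = 9w/(w-v)$. The sketch below tests that identity in exact arithmetic; the function names are illustrative.

```python
from fractions import Fraction

def to_weierstrass(u, v, w):
    """Map (u, v, w) with w != v to (x, y) = (3u/(w - v), 9w/(w - v))."""
    u, v, w = Fraction(u), Fraction(v), Fraction(w)
    return 3 * u / (w - v), 9 * w / (w - v)

def from_weierstrass(x, y):
    """Inverse map, returning the ratios (u/w, v/w) = (3x/y, (y - 9)/y)."""
    x, y = Fraction(x), Fraction(y)
    return 3 * x / y, (y - 9) / y

def defect(u, v, w):
    """(y^2 - 9y) - (x^3 - 27) at the image point; zero exactly on the image curve."""
    x, y = to_weierstrass(u, v, w)
    return (y * y - 9 * y) - (x**3 - 27)

# Off the Fermat cubic the defect equals 27 (w^3 - u^3 - v^3) / (w - v)^3.
print(defect(1, 2, 5), Fraction(27 * (5**3 - 1**3 - 2**3), (5 - 2) ** 3))
# The trivial solution (1, 0, 1) of u^3 + v^3 = w^3 maps to (3, 9).
print(to_weierstrass(1, 0, 1), defect(1, 0, 1))
```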
Anyway we have a solution (the one at oo), and we want to know all
solutions. Diophantus knew how to generate some solutions from others.
EXAMPLE: Given a number (e.g., 6), divide it into two pieces whose
product is a cube diminished by its root.
Solution: Let the pieces be $y$ and $6 - y$. We want
$y(6 - y) = x^3 - x$.
The point $(x,y) = (-1,0)$ is a trivial solution. It lies on a line $x = 2y - 1$.
Substitution gives
$6y - y^2 = (2y - 1)^3 - (2y - 1) = 8y^3 - 12y^2 + 4y$.
Division by $y$ gives a quadratic equation with irrational roots. We would
surely have had rational roots if we could have made $4y$ on the right
match $6y$ on the left. The 4 comes from twice the coefficient 2 of $y$. To
get $6y$, we should use coefficient 3 for $y$. So $x = 3y - 1$. Then
$6y - y^2 = 27y^3 - 27y^2 + 6y$,
$27y = 26$.
So $y = \frac{26}{27}$ and $x = 3y - 1 = \frac{17}{9}$.
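The two substitutions can be redone in exact arithmetic. For a line $x = ky - 1$ through $(-1,0)$, substituting and dividing by $y$ leaves the quadratic $k^3y^2 + (1 - 3k^2)y + (2k - 6) = 0$; this general form is a small extension of the text's two cases, and the names below are illustrative.

```python
from fractions import Fraction

def residual_quadratic(k):
    """Coefficients (a, b, c) of a y^2 + b y + c = 0 left after substituting
    x = k*y - 1 into y(6 - y) = x^3 - x and dividing by y."""
    k = Fraction(k)
    return k**3, 1 - 3 * k**2, 2 * k - 6

def on_curve(x, y):
    return y * (6 - y) == x**3 - x

# k = 2: 8y^2 - 11y - 2 = 0, whose discriminant is not a perfect square.
a2, b2, c2 = residual_quadratic(2)
disc_for_slope_2 = b2 * b2 - 4 * a2 * c2

# k = 3 (the tangent line): the constant term vanishes, so y = 0 or y = -b/a.
a3, b3, c3 = residual_quadratic(3)
y_new = -b3 / a3
x_new = 3 * y_new - 1
print(disc_for_slope_2, c3, x_new, y_new, on_curve(x_new, y_new))
```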
Newton's explanation of the method is as follows: Start with any line
through $(-1,0)$. Rotate it to be tangent. Then it intersects the given
cubic twice at rational points; so the third intersection point must be
rational. The picture in Figure 1.2 in the context of solutions over R
makes matters much clearer.
Figure 1.2. Newton's explanation of the Diophantus method
In summary, for this (and other) cubic equations, we have a systematic
method to generate new solutions from old. The method does not give
parametric families. As with $u^3 + v^3 = w^3$ we may not even get infinitely
many solutions this way.
Let us analyze the method of Diophantus further, postponing detailed
verifications to Chapters II and III. The above construction takes a
rational solution $P$ and produces another one, denoted $PP$ or $P \cdot P$, by
means of a tangent line. More generally, if $P$ and $Q$ are distinct rational
solutions as in Figure 1.3, so is $R$. We write $R = PQ = P \cdot Q$. We have
a systematic way of getting a third point from two. This operation is
called the chord-tangent composition rule.
Figure 1.3. Chord-tangent composition rule
(a) Picture for O = (0,1,0) (b) Picture for a different O
FIGURE 1.4. Group laws relative to different base points O
The chord-tangent composition rule is not applicable to certain pairs
of points for a general cubic. But with an additional assumption of
nonsingularity that we consider below, the rule is everywhere applicable
and can be modified so as to give us an abelian group operation, as in
Figure 1.4. For this purpose fix a point $O$ on the curve. If $(x,y,w) =
(0,1,0)$ is on the curve (i.e., if there are no $y^3$ terms), one frequently
takes $O = (0,1,0)$. With $O$ fixed, we define
$P + Q = O \cdot PQ$.
Let us address the group axioms. Associativity of + is complicated
and is postponed to Chapter III. But commutativity is clear. Also O is
an identity since
$P + O = O \cdot PO = P$.
To define negatives, form $OO$ as the third point on the line tangent at
$O$. Then $-P = (OO) \cdot P$. In fact,
$P + (-P) = O \cdot P(-P) = O \cdot OO = O$.
The geometry of negatives is especially nice if $(0,1,0)$ is on the curve
(i.e., again, if there are no $y^3$ terms). Then it turns out that the choice
$O = (0,1,0)$ leads to $OO = O$ if there are no $xy^2$ and $x^2y$ terms.
Comparative pictures of negatives are in Figure 1.5.
(a) Picture for O = (0,1,0) (b) Picture for a different O
FIGURE 1.5. Negatives relative to the group law
As we indicated above, there are difficulties if the curve is "singular."
Figure 1.6 illustrates two types of singular behavior. In each case, $P =
(0,0)$, and we consider $P + P$. In (a), we have to take $PP = P$, and
then $P + P = OP$ is undefined since the line through $O$ and $P$ neither
meets the curve at a third point nor is tangent to the curve at $O$ or
$P$. In (b), we do not have a unique definition of $PP$, since there is
not a unique tangent line at $P$. The problem is that both curves look
bad at $P = (0,0)$. We cannot solve for $y$ or $x$ in terms of the other.
The condition of the Implicit Function Theorem fails at $(0,0)$ in that
$\frac{\partial f}{\partial x} = 0 = \frac{\partial f}{\partial y}$. We say that $(0,0)$ is a singular point and that the curve
is singular. A precise definition of singularities will be given in §II.2.
(a) $y^2 = x^3$ (b) $y^2 = x^2(x + 1)$
Figure 1.6. Singular behavior
Theorem 1.4 (Poincaré). For a nonsingular cubic curve with a
specified rational point $O$, the operation $+$ is well defined and makes the
set of rational solutions into an abelian group with identity element $O$.
If a different identity element $O'$ is chosen, then the two operations are
related by $P +' Q = P + Q - O'$, and the group structures are isomorphic.
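The operation $P + Q = O \cdot PQ$ can be made concrete in a special case. The sketch below assumes a curve $y^2 = x^3 + Ax + B$ with $O$ the point at infinity, so that negation is reflection in the $x$-axis. The slope formulas are the standard ones for that form and are an assumption here, since this chapter does not derive them; `O`, `pt`, and `add` are illustrative names.

```python
from fractions import Fraction

O = None  # the point at infinity, used as the identity element

def pt(x, y):
    return (Fraction(x), Fraction(y))

def add(P, Q, A):
    """P + Q on y^2 = x^3 + A x + B (B is not needed by the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O                               # vertical line: Q = -P
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)     # slope of the tangent line
    else:
        lam = (y2 - y1) / (x2 - x1)            # slope of the chord
    x3 = lam * lam - x1 - x2                   # third intersection, x-coordinate
    y3 = lam * (x1 - x3) - y1                  # reflected in the x-axis
    return (x3, y3)

# Multiples of P = (2, 3) on y^2 = x^3 + 1 (A = 0, B = 1).
P = pt(2, 3)
multiples = [P]
for _ in range(5):
    multiples.append(add(multiples[-1], P, 0))
print(multiples)
```

On this particular curve the point $(2,3)$ returns to $O$ after six steps, so the rational points it generates form a cyclic group of order 6.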
The geometry is especially simple when O is the same as OO, i.e.,
when O is an "inflection point." In this case we shall see that we can
map O out to (x,y, w) = (0,1,0) by a linear mapping with coefficients
in Q and we can arrange for the tangent line to be w = 0. After scaling,
we get
$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$,
which is called a Weierstrass form of the curve.
An elliptic curve over Q is a nonsingular cubic curve in Weierstrass
form, with rational coefficients. Wider classes of curves can be reduced to
elliptic curves in various ways, and elliptic curves are sometimes defined
as one of these more general curves.
To visualize an elliptic curve over Q we can complete the square in the
Weierstrass form and change variables in $y$ to obtain $y^2 = P(x)$, where
$P(x)$ is a monic cubic with distinct roots. We can graph $u = P(x)$ and
obtain a picture as in Figure 1.7a or 1.7b. (Alternatively the graph might
have its dip completely below the $x$-axis, or it might have no maximum
or minimum.) Then we can scale by $y = \pm\sqrt{u}$ and obtain the pictures
in Figures 1.7c and 1.7d.
(a) $u = x(x^2 - 1)$ (b) $u = x(x^2 - 1) + \frac{1}{2}$
(c) $y^2 = x(x^2 - 1)$ (d) $y^2 = x(x^2 - 1) + \frac{1}{2}$
Figure 1.7. Graphs of R points of the elliptic curve $y^2 = P(x)$
These pictures are of the R points in the affine plane. Projectively
we must include the point at infinity, (0,1,0). Then topologically (and
smoothly) we get one or two circles. This turns out to mean that the
group of R points is $S^1$ or $S^1 \oplus Z_2$. So the group of Q points is a
subgroup of $S^1$ or $S^1 \oplus Z_2$. The first main structure theorem about the
group of Q points is as follows.
Theorem 1.5 (Mordell). The group of Q points of an elliptic curve
over Q is finitely generated. Hence it is $Z^r \oplus F$ with F finite abelian.
This theorem will be proved in Chapter IV. Since the finite abelian
group F is contained in $S^1$ or $S^1 \oplus Z_2$, we must have $F = Z_n$ or
$F = Z_{2n} \oplus Z_2$. For any particular curve, F can be determined. We can change
variables to make the curve be $y^2 = x^3 + Ax + B$, with $A, B \in Z$. It
simplifies calculations to take O = (0,1,0). Then the elements of order
2 are those with y = 0, at most three in number. For the rest we can
use
Theorem 1.6 (Lutz-Nagell). For an elliptic curve $y^2 = x^3 + Ax + B$
over Q with O = (0,1,0), the finite abelian group F can be computed
as follows. If P = (x(P), y(P)) is in F but does not have order 1 or 2,
then
(a) x(P) and y(P) are in Z
(b) $y(P)^2$ divides $4A^3 + 27B^2$.
This theorem will be proved in Chapter V. To compute F, we can simply
try all integers y whose square divides $4A^3 + 27B^2$. For each such y,
we seek an integer solution x of $y^2 = x^3 + Ax + B$. We can tabulate all
integer pairs (x,y) obtained in this way, and we have a bound for the
order |F|. Then we can check which elements (x,y) are of finite order
since their orders (if finite) cannot exceed |F|.
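The tabulation step just described can be sketched in Python. This is only an illustration under stated assumptions: the function name `lutz_nagell_candidates` is hypothetical, y = 0 is included to pick up the points of order 2, and the integer roots x are found by brute force inside the Cauchy root bound. The sketch lists candidates only; it does not carry out the final check of which candidates have finite order.

```python
from math import isqrt

def lutz_nagell_candidates(A, B):
    """Integer points (x, y) on y^2 = x^3 + A x + B with y = 0 or y^2 | 4A^3 + 27B^2."""
    D = 4 * A**3 + 27 * B**2
    if D == 0:
        raise ValueError("4A^3 + 27B^2 = 0: the cubic is singular")
    # y = 0 covers the points of order 2; the other y have y^2 dividing D
    ys = [0] + [y for y in range(1, isqrt(abs(D)) + 1) if D % (y * y) == 0]
    points = set()
    for y in ys:
        c = B - y * y
        # every root of x^3 + A x + c has |x| <= 1 + max(|A|, |c|)
        bound = 1 + max(abs(A), abs(c))
        for x in range(-bound, bound + 1):
            if x**3 + A * x + c == 0:
                points.add((x, y))
                points.add((x, -y))
    return sorted(points)

print(lutz_nagell_candidates(0, 1))
# [(-1, 0), (0, -1), (0, 1), (2, -3), (2, 3)]
```

For $y^2 = x^3 + 1$ the list has five points, so together with O the theorem bounds |F| by 6 for this curve.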
As the elliptic curve varies, we can ask what finite abelian groups F
arise in Mordell's Theorem. The following deep theorem addresses this
question.
Theorem 1.7 (Mazur). The torsion subgroup of the Q points of an
elliptic curve over Q is one of the 15 groups
    $Z_n$ with n = 1, 2, 3, ..., 9, 10, 12
    $Z_{2n} \oplus Z_2$ with n = 1, 2, 3, 4.
We turn to a discussion of the rank r of the group of Q points on an
elliptic curve over Q. Elliptic curves are known for which r is any of the
numbers 0,1,2,..., 12. The proof of Mordell's Theorem gives an upper
bound for r for any particular curve.
Still, no effective way is known for deciding whether r = 0 for a
particular curve, and it is not known whether r can be arbitrarily large.
The motivation for what affects rank is based on computer calculations
and on the hope expressed in the Hasse Principle. Let us consider an
elliptic curve E given by
    $y^2 = F(x)$,    (1.5)
where F(x) is a cubic polynomial with integer coefficients and with
distinct roots $r_1, r_2, r_3$ in C. The discriminant of F is $\prod_{i<j}(r_j - r_i)^2$ and
is a nonzero integer. The group of Q points is denoted E(Q).
Guided by the Hasse principle, we consider (1.5) modulo a prime p.
The curve (1.5) over $Z_p$ will be nonsingular (hence will be an elliptic
curve) except when p = 2 or when p divides the discriminant of F.
For nonexceptional p, we let N(p) be the number of $Z_p$ solutions (x,y),
including the point at infinity. For example, $y^2 = x^3 + 3x$ over $Z_5$ has
solutions
    (0,0), (1,±2), (2,±2), (3,±1), (4,±1), ∞,
and thus N(5) = 10.
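The count N(p) can be reproduced by brute force. The following Python sketch is an illustration only; the name `count_points` and the restriction to curves of the form $y^2 = x^3 + Ax + B$ are assumptions made for the example.

```python
def count_points(p, A, B):
    """N(p) for y^2 = x^3 + A x + B over Z_p, counting the point at infinity."""
    # square_roots[v] = number of y in Z_p with y^2 = v
    square_roots = {}
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    affine = sum(square_roots.get((x**3 + A * x + B) % p, 0) for x in range(p))
    return affine + 1          # the extra 1 is the point at infinity

print(count_points(5, 3, 0))   # 10
```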
In (1.5), we always have
    $1 \le N(p) \le 2p + 1$.    (1.6)
In fact, the lower bound is a consequence of counting the point at infinity.
For the upper bound, we note that each x from 0 to p - 1 gives at most 2
values of y, and we also must count the point at infinity. We can rewrite
(1.6) as
    $|p + 1 - N(p)| \le p$.    (1.7)
So N(p) is centered about p + 1. A theorem of Hasse says that it is more
closely centered about p + 1 than (1.7) indicates.
Theorem 1.8 (Hasse). $|p + 1 - N(p)| < 2\sqrt{p}$.
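As a numerical illustration of Hasse's bound (not a proof), the sketch below checks the inequality for $y^2 = x^3 + 3x$ at the nonexceptional primes below 100. The helper names are hypothetical, and the inequality is compared after squaring so that only integer arithmetic is used.

```python
def count_points(p, A, B):
    """N(p) for y^2 = x^3 + A x + B over Z_p, counting the point at infinity."""
    square_roots = {}
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    return 1 + sum(square_roots.get((x**3 + A * x + B) % p, 0) for x in range(p))

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def hasse_holds(p, A, B):
    # |p + 1 - N(p)| < 2 sqrt(p)  is equivalent to  (p + 1 - N(p))^2 < 4p
    return (p + 1 - count_points(p, A, B)) ** 2 < 4 * p

# for y^2 = x^3 + 3x the exceptional primes are 2 and 3
checks = {p: hasse_holds(p, 3, 0) for p in range(5, 100) if is_prime(p)}
print(all(checks.values()))
```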
The philosophy of the Hasse Principle is that local solutions should
influence global solutions. So a curve is more likely to have many global
solutions if N(p) is relatively large for many p. In Table 1.1, we consider
the equation $y^2 = x^3 + ax$ for several choices of the integer a. For each
a, we form the product
    $\prod'_{p \le R} N(p)/p$
as a function of R. The prime on the product symbol means that the
exceptional primes (2 and those dividing the discriminant) are not to be
included. The table lists the values of this product for some R's in
approximately a geometric progression.
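A product of this kind can be recomputed directly. The Python sketch below is an illustration with hypothetical function names; it assumes the product runs over primes p ≤ R and uses the fact that the discriminant of $x^3 + ax$ is $-4a^3$, so the exceptional primes are 2 and the primes dividing a. With these assumptions it reproduces the entry .469 of Table 1.1 for a = -3 and R = 17.

```python
def count_points(p, a):
    """N(p) for y^2 = x^3 + a x over Z_p, counting the point at infinity."""
    square_roots = {}
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    return 1 + sum(square_roots.get((x**3 + a * x) % p, 0) for x in range(p))

def primes_upto(n):
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]

def partial_product(a, R):
    """Product of N(p)/p over nonexceptional primes p <= R."""
    value = 1.0
    for p in primes_upto(R):
        if p == 2 or a % p == 0:     # exceptional primes are skipped
            continue
        value *= count_points(p, a) / p
    return value

print(round(partial_product(-3, 17), 3))   # 0.469
```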
   a    r    R=17       31       53       97      173      313      557      997
  -3    0    .469     .624     .576     .698    1.030     .870     .824     .550
  -4    0   1.926    1.506    1.533    1.139    1.820    1.433    1.466    1.411
  +4    0   1.540    2.409    1.603    1.649    2.081    2.732    1.833    2.240
  +6    0    .451     .706     .477     .384     .331     .459     .712     .846
  -8    0    .602     .612    1.034    1.584    1.004     .838     .819     .852
  +3    1   5.867    5.964    5.271    8.855   10.131    8.937   10.500   14.159
  -6    1   1.625    1.270    2.807    3.342    5.229    4.891    4.963    5.194
  +8    1   5.416    7.199    7.338    7.600   12.176    9.610    9.569   14.060
 -14    1   1.095    1.456    1.230    1.532    2.221    3.077    3.689    4.204
 +14    2   5.476    5.566   10.491   18.097   24.192   22.829   27.771   29.577
 -82    3   7.823   12.234   22.167   41.142   53.847  106.282  124.512  134.117
Table 1.1. Values of $\prod'_{p \le R} N(p)/p$ for various R for the curve
$y^2 = x^3 + ax$
What we can see from Table 1.1 is that $\prod'_{p \le R} N(p)/p$ is roughly
constant in R for the cases that E(Q) has rank 0 and that it increases for
the cases that E(Q) has rank > 0. Moreover, the rate of increase is
distinctly larger when E(Q) has larger rank. More extensive calculation
with many elliptic curves leads to the following first form of a conjecture.
Conjecture 1.9 (Rough form of Birch and Swinnerton-Dyer
Conjecture). For any elliptic curve E over Q, the rank r of E(Q) is
given by the formula
    $\prod'_{p \le R} \frac{N(p)}{p} \sim (\text{const})(\log R)^r$.
Again N(p) is the number of $Z_p$ solutions (x,y), including the point
at infinity, and the finite set of primes for which the reduction modulo
p is singular is excluded from the product.
To get a formulation of the conjecture that fits better into an
established framework, one works with the L-series of the elliptic curve, which
for current purposes is given by
    $L(s) = \prod'_p \left(1 + \frac{N(p) - p - 1}{p^s} + \frac{p}{p^{2s}}\right)^{-1}$,    (1.8)
the finite set of exceptional primes being excluded from the product.
It follows easily from Theorem 1.8 that the infinite product (1.8) is
convergent for Re s > 3/2. Note that if we put s = 1 in (1.8), then
the product formally becomes the reciprocal of $\prod' N(p)/p$. The second
form of the conjecture is stated in terms of the L-series.
Conjecture 1.10 (Birch and Swinnerton-Dyer). For any elliptic
curve over Q, the L-series L(s) extends to be entire for $s \in C$, and
the order of vanishing of L(s) at s = 1 equals the rank r.
A third, more detailed, form of the Birch and Swinnerton-Dyer
Conjecture includes factors in L(s) for the exceptional primes and
addresses the value of the coefficient of the first nonzero term in the
power series for L(s) about s = 1. We shall not give a formal statement
of this third version of the conjecture.
Conjectures about L(1) do not even make sense without the analytic
continuation in Conjecture 1.10. This analytic continuation is far from
trivial and is itself the subject of the following two conjectures. The
precise statements of these conjectures require some terms that will be
defined later in the book.
Conjecture 1.11 (Hasse-Weil). L(s), modified by elementary factors
to a function Λ(s), extends to be entire, and the extension satisfies the
functional equation Λ(2 - s) = ±Λ(s). If N is the conductor of the
elliptic curve and m is any integer prime to N, then the same kind of
conclusion is true of L(s) when L(s) is twisted by a Dirichlet character
modulo m.
Conjecture 1.12 (Taniyama-Weil). Every elliptic curve over Q is
modular in the sense of being parametrized by modular functions in a
specific way.
It turns out that Conjecture 1.12 is decidable for any particular curve.
Moreover, Conjecture 1.12 implies Conjecture 1.11, and the converse
implication is valid under a minor technical assumption on the curve.
The depth of these conjectures is illustrated by the following theorem.
Theorem 1.13 (Frey-Serre-Ribet). The Taniyama-Weil Conjecture
implies Fermat's Last Theorem.
CHAPTER II
CURVES IN PROJECTIVE SPACE
1. Projective Space
Let k be a field. The projective plane $P_2(k)$ over k is defined as
the quotient of $\{(x,y,w) \in k^3 - \{(0,0,0)\}\}$ by a relation ~, where
$(x',y',w') \sim (x,y,w)$ if $(x',y',w') = \lambda(x,y,w)$ for some $\lambda \in k^\times$. When
there is a need to be careful, we shall write [(x,y,w)] for the member
of $P_2(k)$ corresponding to (x,y,w) in $k^3$. But usually there will not be
such a need, and we shall simply refer to (x,y,w) as a member of $P_2(k)$.
A line defined over k is a nonzero polynomial L = ax + by + cw in
k[x,y,w]. We regard L and a line L' = a'x + b'y + c'w to be the same
line if (a',b',c') is a multiple of (a,b,c). The locus
    $L(k) = \{(x,y,w) \mid ax + by + cw = 0\}$
is well defined in $P_2(k)$ and is called the set of k points or k rational
points of L. It will be handy to refer to L(k) as a line in $P_2(k)$. If
L is defined over k and K is an extension field of k, it is meaningful to
speak of L(K) as a subset of $P_2(K)$.
The affine plane $k^2 = \{(x,y)\}$ has a standard one-one imbedding into
$P_2(k)$. Namely we map (x,y) into [(x,y,1)]. The set that is missed by
the image is the set where w = 0, which is the set of k rational points
of a line called the line at infinity.
The points with w = 0 are called the points at infinity. Except for
the line at infinity, lines in $P_2(k)$ correspond under restriction exactly
to lines in $k^2$.
In certain ways the geometry of $P_2(k)$ is simpler than the geometry
of $k^2$:
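(1) Two distinct lines in $P_2(k)$ intersect in a unique point. In fact,
we set up the system of equations
    $\begin{pmatrix} a & b & c \\ a' & b' & c' \end{pmatrix} \begin{pmatrix} x \\ y \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.
Since the lines are distinct, the coefficient matrix has rank 2.
Thus the kernel has dimension 1, and there is just one point in
the intersection.
(2) Two distinct points in $P_2(k)$ lie on a unique line. In fact, we set
up the system of equations
    $\begin{pmatrix} x & y & w \\ x' & y' & w' \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
and argue in similar fashion.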
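Both rank-2 systems have a one-dimensional kernel, and in three variables a spanning vector of that kernel can be written down with the cross product. The Python sketch below uses that device; it is a standard shortcut rather than the argument of the text, and the function names are hypothetical.

```python
def cross(u, v):
    """A vector orthogonal to both u and v; nonzero when u, v are not proportional."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def meet(line1, line2):
    """Intersection point (x, y, w) of two distinct lines given as (a, b, c)."""
    return cross(line1, line2)

def join(point1, point2):
    """Coefficients (a, b, c) of the line through two distinct points."""
    return cross(point1, point2)

# the affine parallels x = 1 and x = 2 meet at the point at infinity (0, 1, 0)
print(meet((1, 0, -1), (1, 0, -2)))   # (0, 1, 0)
# the line through (0, 0, 1) and (1, 1, 1) is -x + y = 0
print(join((0, 0, 1), (1, 1, 1)))     # (-1, 1, 0)
```

The first example shows the simplification that projective space brings: lines that are parallel in $k^2$ still meet, at a point with w = 0.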
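Let Φ be in GL(3,k). Then Φ maps the set $k^3$ of all column vectors
in one-one fashion onto $k^3$ and passes to a one-one map of $P_2(k)$ onto
$P_2(k)$ called the projective transformation corresponding to Φ. Two
Φ's give the same map of $P_2(k)$ if and only if they are multiples of one
another. The group action of GL(3,k) on $P_2(k)$ is transitive because
GL(3,k) acts transitively on $k^3 - \{(0,0,0)\}$.
If L is the line whose coefficients are given by the row vector (a b c)
and if Φ is a projective transformation, then the row vector
$(a\ b\ c)\Phi^{-1}$ defines a new line $L^\Phi$, and the k rational points of $L^\Phi$
are given by
    $L^\Phi(k) = \Phi(L(k))$.
In fact, let $\begin{pmatrix} x \\ y \\ w \end{pmatrix}$ be in L(k). Then $\Phi\begin{pmatrix} x \\ y \\ w \end{pmatrix}$ is in Φ(L(k))
and satisfies
    $(a\ b\ c)\,\Phi^{-1}\,\Phi\begin{pmatrix} x \\ y \\ w \end{pmatrix} = 0$;
hence it is in $L^\Phi(k)$. Conversely if $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is in $L^\Phi(k)$, then
    $\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \Phi^{-1}\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$
satisfies
    $(a\ b\ c)\begin{pmatrix} x \\ y \\ w \end{pmatrix} = (a\ b\ c)\,\Phi^{-1}\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = 0$,
and thus $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is Φ of something in L(k).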
About any point in P2(k) we can introduce various systems of affine
local coordinates.
Namely fix $[(x_0,y_0,w_0)]$ in $P_2(k)$. Choose (by transitivity) some Φ
in GL(3,k) with $\Phi(x_0,y_0,w_0) = (0,0,1)$. Then we can define local
coordinates on $\Phi^{-1}(k \times k \times \{1\})$ to $k^2$ by the one-one map
    $\varphi(\Phi^{-1}(x,y,1)) = (x,y)$.
This definition generalizes the standard imbedding of the affine plane $k^2$
in $P_2(k)$ earlier; that imbedding was the case Φ = I.
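Example 1. Suppose $(x_0,y_0,w_0) = (x_0,y_0,1)$. We can choose
    $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$.
Then
    $\Phi\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} x - x_0 \\ y - y_0 \\ 1 \end{pmatrix}$.
In this case, the local coordinates are defined on
    $\Phi^{-1}(k \times k \times 1) = k \times k \times 1$
and are given by
    $\varphi(x,y,1) = \varphi(\Phi^{-1}(x - x_0, y - y_0, 1)) = (x - x_0, y - y_0)$.
This Φ is handy for reducing behavior about $(x_0,y_0,1)$ in $P_2(k)$ to
behavior about (0,0) in $k^2$.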
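Example 2. Suppose $(x_0,y_0,w_0) = (0,1,0)$. We can choose
    $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$.
Then
    $\Phi\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \begin{pmatrix} w \\ x \\ y \end{pmatrix}$
and
    $\varphi(x,1,w) = \varphi(\Phi^{-1}(\Phi(x,1,w))) = \varphi(\Phi^{-1}(w,x,1)) = (w,x)$.
This Φ will be handy for studying behavior near the (unique) point
at infinity on a cubic in Weierstrass form.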
Affine local coordinates are useful in studying homogeneous
polynomials in three variables. We say that a nonzero $F \in k[x,y,w]$ is
homogeneous of degree d if every monomial in F has total degree d, and we
write $k[x,y,w]_d$ for the set of such polynomials. Each such F satisfies
    $F(\lambda x, \lambda y, \lambda w) = \lambda^d F(x,y,w)$ for $x, y, w, \lambda \in k$.    (2.1)
Conversely, homogeneous polynomials over an infinite field can be
detected by this property, according to Proposition 2.2 below.
Lemma 2.1. If k is an infinite field, then a nonzero polynomial
$f \in k[x_1,\ldots,x_n]$ is nonzero at some point.
Proof. We induct on n, the result for n = 1 being well known.
Assume the result for n - 1, and suppose $f(c_1,\ldots,c_n) = 0$ for all
$(c_1,\ldots,c_n)$. By induction,
    $f(x_1,\ldots,x_{n-1},c)$ is the 0 polynomial in n - 1 variables,    (2.2)
for each choice of c. Fix a monomial $x_1^{l_1}\cdots x_{n-1}^{l_{n-1}}$ in n - 1 variables.
The monomials in n variables containing this (n - 1)-variable monomial
contribute
    $\sum_j b_j\, x_1^{l_1}\cdots x_{n-1}^{l_{n-1}}\, x_n^j$
to f, and (2.2) says that $\sum_{j \ge 0} b_j c^j = 0$ for all c. Therefore, by the
case n = 1, all the $b_j$ are 0. Repeating this argument for all monomials
in n - 1 variables, we see that f is the 0 polynomial.
Proposition 2.2. If k is an infinite field, then a nonzero polynomial
$F \in k[x,y,w]$ satisfying (2.1) is homogeneous of degree d.
Proof. Write F as the sum of homogeneous terms of different
degrees: $F = \sum_{j=1}^n F_j$ with $F_j$ of degree $d_j$ and with $d_1 = d$. Then (2.1)
gives
    $\lambda^d F(x_0,y_0,w_0) = \lambda^d F_1(x_0,y_0,w_0) + \lambda^{d_2} F_2(x_0,y_0,w_0) + \cdots + \lambda^{d_n} F_n(x_0,y_0,w_0)$
for all λ. Since k is infinite, this equality for all λ implies
$F(x_0,y_0,w_0) = F_1(x_0,y_0,w_0)$ and also $F_j(x_0,y_0,w_0) = 0$ for all j ≥ 2. Letting
$(x_0,y_0,w_0)$ vary and applying Lemma 2.1 to $F - F_1$ and to $F_j$ for j ≥ 2,
we obtain the conclusion of the proposition.
A homogeneous polynomial F ≠ 0 of degree > 0 is not a
function on $P_2(k)$. Nevertheless we can examine the behavior of F near
a point $(x_0,y_0,w_0)$ in $k^3 - \{(0,0,0)\}$ by choosing Φ in GL(3,k) with
$\Phi(x_0,y_0,w_0) = (0,0,1)$ and defining
    $f(x,y) = F(\Phi^{-1}(x,y,1))$.
Example 1. Suppose $(x_0,y_0,w_0) = (x_0,y_0,1)$ and
    $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$.
Then
    $f(x,y) = F(x + x_0, y + y_0, 1)$.
For
    $F(x,y,w) = x^2y + xyw + w^3$,
the corresponding f(x,y) splits into homogeneous terms as
    $f(x,y) = (x_0^2y_0 + x_0y_0 + 1) + (x_0^2y + 2x_0y_0x + x_0y + y_0x) + (y_0x^2 + 2x_0xy + xy) + (x^2y)$.
We shall use this splitting in §2 in examining singularities and tangent
lines.
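Example 2. Suppose $(x_0,y_0,w_0) = (0,1,0)$ and $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$.
Then
    $\Phi^{-1}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} y \\ 1 \\ x \end{pmatrix}$ and $f(x,y) = F(y,1,x)$.
For the same F as in Example 1, namely
    $F(x,y,w) = x^2y + xyw + w^3$,
we obtain
    $f(x,y) = (y^2 + xy) + (x^3)$
with no terms of total order 0 or 1.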
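The two local expressions above can be spot-checked numerically. The Python sketch below is an illustration only: it compares the stated formulas with direct substitution into F on a grid of integer values, which is enough to identify polynomials of these low degrees. The function names are hypothetical.

```python
from itertools import product

def F(x, y, w):
    return x * x * y + x * y * w + w**3

def f_example1(x, y, x0, y0):
    """The stated splitting of F(x + x0, y + y0, 1) by degree in (x, y)."""
    f0 = x0 * x0 * y0 + x0 * y0 + 1
    f1 = x0 * x0 * y + 2 * x0 * y0 * x + x0 * y + y0 * x
    f2 = y0 * x * x + 2 * x0 * x * y + x * y
    f3 = x * x * y
    return f0 + f1 + f2 + f3

def f_example2(x, y):
    """The stated local expression about (0, 1, 0)."""
    return (y * y + x * y) + x**3

grid = range(-3, 4)
ok1 = all(f_example1(x, y, x0, y0) == F(x + x0, y + y0, 1)
          for x, y, x0, y0 in product(grid, repeat=4))
ok2 = all(f_example2(x, y) == F(y, 1, x) for x, y in product(grid, repeat=2))
print(ok1, ok2)
```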
Conversely if d and Φ are given and if f in k[x,y] has degree d, we
can reconstruct F. In Example 2 above with d = 3 and with f as above,
define G(x,y,w) by inserting powers of w to make all terms of degree 3:
    $G(x,y,w) = y^2w + xyw + x^3$.
Then put F = G ∘ Φ and recover $F(x,y,w) = x^2y + xyw + w^3$.
In the special cases that k is R or C, $P_2(R)$ and $P_2(C)$ are smooth
manifolds. It is helpful as motivation to recall the argument for $P_2(R)$.
To do so, we need to give compatible charts (U, φ), where $\varphi : U \to R^2$ is
a homeomorphism onto an open set and where the sets U cover $P_2(R)$.
At points (x,y,w) on the set $U_3 = \{w \ne 0\}$, we can use
    $\varphi_3(x,y,w) = \left(\frac{x}{w}, \frac{y}{w}\right)$.
This is just our affine local coordinate system about
$(x_0,y_0,w_0) = (0,0,1)$ with Φ = I. On the set $U_2 = \{y \ne 0\}$, we can use
    $\varphi_2(x,y,w) = \left(\frac{x}{y}, \frac{w}{y}\right)$.
This corresponds to using $(x_0,y_0,w_0) = (0,1,0)$ and $\Phi = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.
Similarly we can define a chart $(U_1, \varphi_1)$. To have a manifold, we are to
check that functions like $\varphi_2 \circ \varphi_3^{-1}$ are smooth. In fact, these functions
are rational with nonvanishing denominators. For example,
    $\varphi_2 \circ \varphi_3^{-1}(x,y) = \varphi_2(x,y,1) = \left(\frac{x}{y}, \frac{1}{y}\right)$
with domain $\{(x,y) \mid y \ne 0\}$. More generally, all our systems of affine
local coordinates (defined via $(x_0,y_0,w_0)$ and Φ) yield compatible charts.
Once $P_2(R)$ and $P_2(C)$ are smooth manifolds, partial derivatives are
available. We shall see that partial derivatives provide a useful tool even
when the field k is not R or C.
2. Curves and Tangents
A plane curve or projective plane curve defined over k is simply a
nonzero homogeneous polynomial $F \in k[x,y,w]_d$ for some d > 0, except
that we regard two such F's as the same curve if they are multiples
of each other. As we remarked in §1, F is not well defined on $P_2(k)$.
Nevertheless, the zero locus of F yields a set in projective space. Namely,
if K is an extension field of k, then the locus
    $F(K) = \{(x,y,w) \mid F(x,y,w) = 0\}$
is well defined in $P_2(K)$ since F is homogeneous. This locus is called
the set of K points or K rational points of the curve. In the special
cases that d = 1, 2, or 3, the curve is called a line, conic, or cubic,
respectively.
If F is a plane curve and if we have an affine coordinate system given
by Φ with $\Phi(x_0,y_0,w_0) = (0,0,1)$, then the corresponding affine curve
is
    $f(x,y) = F(\Phi^{-1}(x,y,1))$ in k[x,y].
Among the k rational points of a curve, we shall distinguish
between singular points and nonsingular points. Thus let F ≠ 0 be in
$k[x,y,w]_d$, fix $(x_0,y_0,w_0) \in F(k)$, and choose affine local coordinates
about $(x_0,y_0,w_0)$ given by some Φ with $\Phi(x_0,y_0,w_0) = (0,0,1)$. Let
    $f(x,y) = F(\Phi^{-1}(x,y,1)) \in k[x,y]$.
For example, the situation in Example 1 of §1 had $w_0 = 1$ and
    $F(x,y,w) = x^2y + xyw + w^3$,
and we were led to
    $f(x,y) = (x_0^2y_0 + x_0y_0 + 1) + (x_0^2y + 2x_0y_0x + x_0y + y_0x) + (y_0x^2 + 2x_0xy + xy) + (x^2y)$
    $= f_0(x,y) + f_1(x,y) + f_2(x,y) + f_3(x,y)$.
The constant term $f_0$ is 0 since $(x_0,y_0,1)$ is in F(k). In this example
and in general, f is the sum of homogeneous terms of degree 1 through
d, say $f = f_1 + \cdots + f_d$ with $f_1,\ldots,f_d$ depending on $(x_0,y_0,w_0)$ and Φ.
We say $(x_0,y_0,w_0)$ is a nonsingular point if $f_1$ is not the 0 polynomial
in k[x,y]. Otherwise $(x_0,y_0,w_0)$ is a singular point. In our example
with $w_0 = 1$, we are to consider
    $f_1(x,y) = (2x_0y_0 + y_0)x + (x_0^2 + x_0)y$.
The singular points in F(k) are those where the coefficients of x and y
are both 0. The coefficients are 0 when $(x_0,y_0) = (0,0)$ and when
$(x_0,y_0) = (-1,0)$. (Back in $P_2(k)$, these are (0,0,1) and (-1,0,1).) But neither
of these is in F(k). Hence F is nonsingular at every point of F(k) for
which $w_0 = 1$.
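We need to check that nonsingularity does not depend on the choice of
Φ. Thus suppose also that $\Psi(x_0,y_0,w_0) = (0,0,1)$. Then
$\Psi \circ \Phi^{-1} = \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ r & s & 1 \end{pmatrix}$ with $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ invertible. Write
$f(x,y) = F(\Phi^{-1}(x,y,1))$ and $g(x,y) = F(\Psi^{-1}(x,y,1))$. We write heuristically
    $f(x,y) = (F \circ \Psi^{-1})(\Psi \circ \Phi^{-1})(x,y,1)$
    $= (F \circ \Psi^{-1})(ax + by,\ cx + dy,\ rx + sy + 1)$
    $= (rx + sy + 1)^d\,(F \circ \Psi^{-1})\left(\frac{ax + by}{rx + sy + 1},\ \frac{cx + dy}{rx + sy + 1},\ 1\right)$
    $= (rx + sy + 1)^d\, g\left(\frac{ax + by}{rx + sy + 1},\ \frac{cx + dy}{rx + sy + 1}\right)$
    $= (rx + sy + 1)^d\,(g_1 + \cdots + g_d)\left(\frac{ax + by}{rx + sy + 1},\ \frac{cx + dy}{rx + sy + 1}\right)$
    $= (rx + sy + 1)^{d-1} g_1(ax + by, cx + dy) + \cdots + g_d(ax + by, cx + dy)$.    (2.3)
This computation is valid in the quotient field k(x,y). By expanding
the various powers of (rx + sy + 1) and regrouping by homogeneity in
(x,y), we can read off
    $f_1(x,y) = g_1(ax + by, cx + dy)$.    (2.4)
Similarly
    $g_1(x,y) = f_1(\alpha x + \beta y, \gamma x + \delta y)$ with $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}$.
So $f_1$ and $g_1$ are both zero or both nonzero.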
Proposition 2.3. Suppose $F \in k[x,y,w]_m$ and $G \in k[x,y,w]_n$ are
plane curves. If $(x_0,y_0,w_0)$ is in $F(k) \cap G(k)$, then $(x_0,y_0,w_0)$ is a
singular point of FG.
Proof. Choose affine local coordinates with $\Phi(x_0,y_0,w_0) = (0,0,1)$,
and define $f(x,y) = F(\Phi^{-1}(x,y,1))$ and $g(x,y) = G(\Phi^{-1}(x,y,1))$.
Then we can write $f = f_1 + \cdots + f_m$ and $g = g_1 + \cdots + g_n$. Since
    $f(x,y)g(x,y) = (FG)(\Phi^{-1}(x,y,1))$,
and since fg has no first-degree terms, it follows that FG has a singular
point at $(x_0,y_0,w_0)$.
We say that the plane curve F over k is nonsingular (or smooth) if
F is nonsingular at every point of $F(\bar{k})$, where $\bar{k}$ is the algebraic closure
of k. Otherwise we say F is singular. Nonsingularity at all points of
F(k) does not imply F is nonsingular. For example, the curve
    $F(x,y,w) = x^3 - 6xw^2 + 6yw^2 - y^3$
is defined over k = Q, is nonsingular at every point of F(Q), and has a
singular point at $(x,y,w) = (\sqrt{2}, \sqrt{2}, 1)$. So the curve is singular.
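The singular point of this example can be confirmed with exact arithmetic in $Q(\sqrt{2})$. The Python sketch below is an illustration only: it represents $a + b\sqrt{2}$ as the pair (a, b), uses hypothetical helper names, and anticipates the partial-derivative criterion of Proposition 2.6 by evaluating F and its three first partials.

```python
def mul(u, v):
    # (a + b sqrt2)(c + d sqrt2) = (ac + 2bd) + (ad + bc) sqrt2
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def combine(*terms):
    # integer linear combination; terms are (coefficient, element) pairs
    return (sum(c * u[0] for c, u in terms), sum(c * u[1] for c, u in terms))

def values_at(x, y, w):
    """F and its partials Fx, Fy, Fw for F = x^3 - 6xw^2 + 6yw^2 - y^3."""
    x2, y2, w2 = mul(x, x), mul(y, y), mul(w, w)
    F = combine((1, mul(x2, x)), (-6, mul(x, w2)), (6, mul(y, w2)), (-1, mul(y2, y)))
    Fx = combine((3, x2), (-6, w2))
    Fy = combine((6, w2), (-3, y2))
    Fw = combine((-12, mul(x, w)), (12, mul(y, w)))
    return F, Fx, Fy, Fw

sqrt2, one = (0, 1), (1, 0)
print(values_at(sqrt2, sqrt2, one))   # all four values are (0, 0)
print(values_at(one, one, one))       # on the curve, but Fx = (-3, 0) is nonzero
```

The second call looks at the rational point (1,1,1), where F vanishes but a partial derivative does not.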
Nonsingularity of the curve F implies irreducibility over the algebraic
closure $\bar{k}$, according to Corollary 2.5 below. The corollary depends on
Bezout's Theorem, whose proof is deferred to §5.
Theorem 2.4 (Bezout's Theorem). Suppose $F \in k[x,y,w]_m$ and
$G \in k[x,y,w]_n$ are plane curves. Then $F(\bar{k}) \cap G(\bar{k})$ is nonempty. If it
has more than mn points, then F and G have as a common factor some
homogeneous polynomial of degree > 0.
Corollary 2.5. Suppose F is a plane curve and is reducible over $\bar{k}$.
Then the factors are homogeneous polynomials, and F is singular.
Proof. Write $F = F_1F_2$ nontrivially. Let $d_1$ and $e_1$ be the highest
and lowest degrees of terms in $F_1$, and let $d_2$ and $e_2$ be the highest and
lowest degrees of terms in $F_2$. The product of the terms of degree $d_1$ in
$F_1$ and the terms of degree $d_2$ in $F_2$ is nonzero and is the $d_1 + d_2$ degree
part of F. The product of the terms of degree $e_1$ in $F_1$ and the terms of
degree $e_2$ in $F_2$ is nonzero and is the $e_1 + e_2$ degree part of F. Since F is
homogeneous, $d_1 + d_2 = e_1 + e_2$. Since $d_1 \ge e_1$ and $d_2 \ge e_2$, it follows
that $d_1 = e_1$ and $d_2 = e_2$; thus $F_1$ and $F_2$ are homogeneous. Bezout's
Theorem says that $F_1(\bar{k}) \cap F_2(\bar{k})$ is
nonempty, and Proposition 2.3 says that any point in this intersection
is a singular point for F. Hence F is singular.
Suppose that $(x_0,y_0,w_0)$ is a nonsingular point of F(k) and that
$\Phi(x_0,y_0,w_0) = (0,0,1)$. We have written $f(x,y) = F(\Phi^{-1}(x,y,1))$ with
$f = f_1 + \cdots + f_d$. Then $f_1(x,y) = px + qy$ with p and q in k, not both 0.
Consider $f_1$ as a polynomial in three variables $f_1(x,y,w)$ independent
of w. We lift back to a nonzero member $L = f_1 \circ \Phi$ of $k[x,y,w]_1$, and
the result is called the tangent line to F at $(x_0,y_0,w_0)$.
Let us see that the tangent line is independent of the choice of affine
local coordinates. Thus suppose also that $\Psi(x_0,y_0,w_0) = (0,0,1)$. Form
$g(x,y) = F(\Psi^{-1}(x,y,1))$ and $\Psi \circ \Phi^{-1} = \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ r & s & 1 \end{pmatrix}$, and let the
respective tangent lines be $L^\Phi$ and $L^\Psi$. Then we have
    $L^\Phi(x,y,w) = f_1(\Phi(x,y,w))$,    (2.5)
    $f_1(x,y,w) = f_1(x,y) = g_1(ax + by, cx + dy)$
    $= g_1(ax + by, cx + dy, rx + sy + w) = g_1(\Psi \circ \Phi^{-1}(x,y,w))$,
    $L^\Psi(x,y,w) = g_1(\Psi(x,y,w))$
    $= g_1(\Psi \circ \Phi^{-1}(\Phi(x,y,w))) = f_1(\Phi(x,y,w))$.    (2.6)
Comparing (2.5) and (2.6), we see that $L^\Phi(x,y,w) = L^\Psi(x,y,w)$.
If F is a plane curve and Φ is a projective transformation, then
$F^\Phi = F \circ \Phi^{-1}$ is another plane curve, and the k rational points of $F^\Phi$
are given by
    $F^\Phi(k) = \Phi(F(k))$,
by the same kind of reasoning as in §1 for the case that F is a line.
This reasoning shows also that if $(x_0,y_0,w_0)$ is a nonsingular point for
F, then $\Phi(x_0,y_0,w_0)$ is a nonsingular point for $F^\Phi$.
From calculus we expect that nonsingularity can be decided in terms
of partial derivatives when k is R or C. In fact, the argument that
works for R or C works also for general k. All that is needed is a
definition of partial derivatives, along with a few elementary properties.
The definition of a partial derivative of a polynomial is clear, and it is
routine to check that the product rule and the several-variable Chain
Rule are valid in the context of polynomials. We find, at a nonsingular
point, that the tangent line is given by a familiar formula from calculus,
as in the following proposition.
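Proposition 2.6. Let F be a plane curve over k. If $(x_0,y_0,w_0)$ is on
the curve, then $(x_0,y_0,w_0)$ is a nonsingular point if and only if at least
one of $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial w}$ is nonzero at $(x_0,y_0,w_0)$. In this case the tangent
line L to F at $(x_0,y_0,w_0)$ is given by
    $L = \left[\frac{\partial F}{\partial x}\right]_{(x_0,y_0,w_0)} x + \left[\frac{\partial F}{\partial y}\right]_{(x_0,y_0,w_0)} y + \left[\frac{\partial F}{\partial w}\right]_{(x_0,y_0,w_0)} w$.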
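Proof. Choose affine local coordinates with $\Phi(x_0,y_0,w_0) = (0,0,1)$.
Since $(x_0,y_0,w_0)$ is on the curve, $F \circ \Phi^{-1}(x,y,w)$ vanishes upon
substitution of (0,0,1). Hence every monomial in $F \circ \Phi^{-1}$ contains a factor
of x or y, and it follows that
    $\left[\frac{\partial^n}{\partial w^n}(F \circ \Phi^{-1})\right](0,0,1) = 0$ for every n.    (2.7)
We write $f(x,y) = F(\Phi^{-1}(x,y,1))$ with $f = f_1 + \cdots + f_d$. The
coefficients of the linear term $f_1$ are $\frac{\partial f}{\partial x}(0,0)$ and $\frac{\partial f}{\partial y}(0,0)$. Since
    $f = F \circ \Phi^{-1} \circ ((x,y) \mapsto (x,y,1))$,
the Chain Rule gives
    $\left(\frac{\partial f}{\partial x}\ \ \frac{\partial f}{\partial y}\right)_{(0,0)} = \left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{(x_0,y_0,w_0)} \Phi^{-1} \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}$.    (2.8)
Let (x',y',w') be given. Multiplying on the right by $\begin{pmatrix} x' \\ y' \end{pmatrix}$, we obtain
    $f_1(x',y') = \left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{(x_0,y_0,w_0)} \Phi^{-1} \begin{pmatrix} x' \\ y' \\ 0 \end{pmatrix}$.    (2.9)
By (2.7) for n = 1, the third entry of
    $\left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{(x_0,y_0,w_0)} \Phi^{-1}$
is 0. Thus the right side of (2.9) is
    $\left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{(x_0,y_0,w_0)} \Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$.
Putting (x',y',w') = Φ(x,y,w), we obtain
    $L(x,y,w) = f_1 \circ \Phi(x,y,w) = \left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{(x_0,y_0,w_0)} \begin{pmatrix} x \\ y \\ w \end{pmatrix}$.    (2.10)
By (2.8), we have a singular point if all first partials of F are 0.
Otherwise (2.10) shows that L is not 0; hence $f_1$ is not 0 and the point is
nonsingular. In this case, (2.10) gives the tangent line.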
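Proposition 2.6 translates directly into a computation with formal partial derivatives. The Python sketch below is an illustration under stated assumptions: a homogeneous polynomial is stored as a dictionary from exponent triples (for x, y, w) to coefficients, and the function names are hypothetical.

```python
def partial(poly, i):
    """Formal partial derivative in variable i (0 for x, 1 for y, 2 for w)."""
    out = {}
    for exps, c in poly.items():
        if exps[i] > 0:
            lowered = tuple(e - (j == i) for j, e in enumerate(exps))
            out[lowered] = out.get(lowered, 0) + c * exps[i]
    return out

def evaluate(poly, point):
    return sum(c * point[0]**e[0] * point[1]**e[1] * point[2]**e[2]
               for e, c in poly.items())

def tangent_line(poly, point):
    """Coefficients of the tangent line at a point of the curve, or None if singular."""
    if evaluate(poly, point) != 0:
        raise ValueError("point is not on the curve")
    gradient = tuple(evaluate(partial(poly, i), point) for i in range(3))
    return gradient if any(gradient) else None

F = {(2, 1, 0): 1, (1, 1, 1): 1, (0, 0, 3): 1}   # x^2 y + x y w + w^3
cusp = {(0, 2, 1): 1, (3, 0, 0): -1}             # y^2 w - x^3, as in Figure 1.6a

print(tangent_line(F, (2, -1, 2)))     # (-6, 8, 10), i.e. -6x + 8y + 10w
print(tangent_line(cusp, (0, 0, 1)))   # None: all first partials vanish
```

The point (2,-1,2) is the same projective point as (1,-1/2,1), and the line -6x + 8y + 10w is 4 times the line obtained from $f_1$ in Example 1 with $(x_0,y_0) = (1,-1/2)$.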
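The quadratic term $f_2 \circ \Phi(x,y,w)$ can be identified similarly, provided
the characteristic of k is not 2. But the expression is more complicated
and depends on the choice of Φ. It involves also the Hessian matrix
of F, defined as
    $H = H(x_0,y_0,w_0) = \begin{pmatrix} \frac{\partial^2 F}{\partial x^2} & \frac{\partial^2 F}{\partial x\,\partial y} & \frac{\partial^2 F}{\partial x\,\partial w} \\ \frac{\partial^2 F}{\partial x\,\partial y} & \frac{\partial^2 F}{\partial y^2} & \frac{\partial^2 F}{\partial y\,\partial w} \\ \frac{\partial^2 F}{\partial x\,\partial w} & \frac{\partial^2 F}{\partial y\,\partial w} & \frac{\partial^2 F}{\partial w^2} \end{pmatrix}_{(x_0,y_0,w_0)}$.    (2.11)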
Proposition 2.7. Suppose the characteristic of k is not 2. Let F be
a plane curve over k of degree d, let $(x_0,y_0,w_0)$ be a nonsingular point
on the curve, and choose affine local coordinates with
$\Phi(x_0,y_0,w_0) = (0,0,1)$. Put $f(x,y) = F \circ \Phi^{-1}(x,y,1)$, let $f_2(x,y)$ be the quadratic
term in the expansion of f(x,y) about 0, and let $f_2'(x,y,w) = f_2(x,y)$.
Then $Q^\Phi(x,y,w) = f_2' \circ \Phi(x,y,w)$ is given by
    $2Q^\Phi(x,y,w) = (x\ y\ w)\,H \begin{pmatrix} x \\ y \\ w \end{pmatrix} - 2(d-1)\,L'_\Phi(x,y,w)\,L(x,y,w)$,    (2.12)
where H is the Hessian matrix of F at $(x_0,y_0,w_0)$, L(x,y,w) is the
tangent line at $(x_0,y_0,w_0)$, and $L'_\Phi(x,y,w)$ is a line with $L'_\Phi(x_0,y_0,w_0) = 1$.
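Proof. Let $H^\Phi$ be the Hessian matrix of $F^\Phi = F \circ \Phi^{-1}$ at (0,0,1).
We first show that
    $H^\Phi = (\Phi^{-1})^{tr}\,H\,\Phi^{-1}$.    (2.13)
In fact, the Chain Rule gives
    $\left(\frac{\partial F^\Phi}{\partial x}\ \ \frac{\partial F^\Phi}{\partial y}\ \ \frac{\partial F^\Phi}{\partial w}\right)_{(x,y,w)} = \left(\frac{\partial F}{\partial x}\ \ \frac{\partial F}{\partial y}\ \ \frac{\partial F}{\partial w}\right)_{\Phi^{-1}(x,y,w)} \Phi^{-1}$.
Transposing gives
    $\begin{pmatrix} \partial F^\Phi/\partial x \\ \partial F^\Phi/\partial y \\ \partial F^\Phi/\partial w \end{pmatrix}_{(x,y,w)} = (\Phi^{-1})^{tr} \begin{pmatrix} \partial F/\partial x \\ \partial F/\partial y \\ \partial F/\partial w \end{pmatrix}_{\Phi^{-1}(x,y,w)}$.
Now we can apply the Chain Rule to each row of this equation and
evaluate at (x,y,w) = (0,0,1). When the results are assembled, we
obtain (2.13).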
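The matrix of second partials of $f(x,y) = F \circ \Phi^{-1}(x,y,1)$ at (0,0) is
obtained by taking the upper left 2-by-2 block of $H^\Phi$ and putting w = 1.
Thus
    $2f_2(x',y') = (x'\ y') \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x\,\partial y} \\ \frac{\partial^2 f}{\partial x\,\partial y} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix}_{(0,0)} \begin{pmatrix} x' \\ y' \end{pmatrix}$
    $= (x'\ y'\ 0)\,H^\Phi \begin{pmatrix} x' \\ y' \\ 0 \end{pmatrix}$
    $= (x'\ y'\ w')\,H^\Phi \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} - 2w'(0\ 0\ 1)\,H^\Phi \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} + w'^2 \left[\frac{\partial^2 F^\Phi}{\partial w^2}\right]_{(0,0,1)}$.
The last term on the right is 0, by (2.7) for n = 2. Putting
$\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = \Phi \begin{pmatrix} x \\ y \\ w \end{pmatrix}$ and using (2.13), we obtain
    $2Q^\Phi(x,y,w) = (x\ y\ w)\,H \begin{pmatrix} x \\ y \\ w \end{pmatrix} - 2w'(0\ 0\ 1)\,H^\Phi\,\Phi \begin{pmatrix} x \\ y \\ w \end{pmatrix}$
    $= (x\ y\ w)\,H \begin{pmatrix} x \\ y \\ w \end{pmatrix} - 2w' \left( \left[\frac{\partial^2 F^\Phi}{\partial x\,\partial w}\right]_{(0,0,1)} x' + \left[\frac{\partial^2 F^\Phi}{\partial y\,\partial w}\right]_{(0,0,1)} y' \right)$.    (2.14)
With d as the degree of F, the only monomial of $F^\Phi$ that makes a
contribution to $\left[\frac{\partial^2 F^\Phi}{\partial x\,\partial w}\right]_{(0,0,1)}$ is $xw^{d-1}$, and the contribution is d - 1
times what $\left[\frac{\partial F^\Phi}{\partial x}\right]_{(0,0,1)}$ gives. A similar remark applies to $\frac{\partial^2 F^\Phi}{\partial y\,\partial w}$.
Thus the second term on the right of (2.14) is
    $= -2(d-1)\,w' \left( \left[\frac{\partial F^\Phi}{\partial x}\right]_{(0,0,1)} x' + \left[\frac{\partial F^\Phi}{\partial y}\right]_{(0,0,1)} y' \right)$
    $= -2(d-1)\,w' f_1(x',y') = -2(d-1)\,w' f_1(x',y',w')$
    $= -2(d-1)\,w' f_1(\Phi(x,y,w)) = -2(d-1)\,w' L(x,y,w)$.
The value of w' is the third entry of $\Phi \begin{pmatrix} x \\ y \\ w \end{pmatrix}$, and this is a nonzero
linear expression in x, y, w. We define it to be $L'_\Phi(x,y,w)$.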
3. Flexes
In this section we shall define the intersection multiplicity of a line
and a plane curve at a point. The curve may be nonsingular or singular
at the point. The notation for intersection multiplicity will be i(P, L, F),
where $P = (x_0,y_0,w_0)$ is in $F(k) \cap L(k)$, F is in $k[x,y,w]_d$, and L is in
$k[x,y,w]_1$.
We introduce affine local coordinates. Choose Φ with
$\Phi(x_0,y_0,w_0) = (0,0,1)$ and put
    $f(x,y) = F(\Phi^{-1}(x,y,1)) = f_1(x,y) + \cdots + f_d(x,y)$
    $l(x,y) = L(\Phi^{-1}(x,y,1))$.
Since l(0,0) = 0, l(x,y) = bx - ay for some constants a and b not both
0. Then $\varphi(t) = (at, bt)$ parametrizes l(x,y) = 0. The expression f(φ(t))
is a polynomial in t with f(φ(0)) = 0. In fact,
    $f(\varphi(t)) = f_1(at,bt) + f_2(at,bt) + \cdots + f_d(at,bt)$
    $= t f_1(a,b) + t^2 f_2(a,b) + \cdots + t^d f_d(a,b)$.
There are two possibilities. If f ∘ φ is not the 0 polynomial, then f(φ(t))
has a zero of some finite order at t = 0, and this order is defined to be
the intersection multiplicity i(P, L, F). If f ∘ φ is the 0 polynomial, we
say that i(P, L, F) = +∞. It will be convenient to define i(P, L, F) = 0
if P is not in $L(k) \cap F(k)$.
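In affine local coordinates the definition amounts to finding the smallest r with $f_r(a,b) \ne 0$. The Python sketch below is an illustration with hypothetical names: the affine polynomial f is stored as a dictionary from exponent pairs (i, j) to the coefficient of $x^i y^j$, with P at the origin, and the line is l(x,y) = bx - ay.

```python
import math

def intersection_multiplicity(f, a, b):
    """Order of vanishing at t = 0 of f(at, bt); math.inf if it vanishes identically."""
    # the coefficient of t^r in f(at, bt) is f_r(a, b)
    coefficient = {}
    for (i, j), c in f.items():
        coefficient[i + j] = coefficient.get(i + j, 0) + c * a**i * b**j
    orders = [r for r, value in coefficient.items() if value != 0]
    return min(orders) if orders else math.inf

cusp = {(0, 2): 1, (3, 0): -1}                # y^2 - x^3       (Figure 1.6a)
node = {(0, 2): 1, (3, 0): -1, (2, 0): -1}    # y^2 - x^2(x+1)  (Figure 1.6b)

print(intersection_multiplicity(cusp, 1, 0))          # line y = 0: 3
print(intersection_multiplicity(cusp, 0, 1))          # line x = 0: 2
print(intersection_multiplicity(node, 1, 1))          # line x - y = 0: 3
print(intersection_multiplicity({(1, 1): 1}, 0, 1))   # x divides xy: inf
```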
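We need to check that i(P, L, F) does not depend on the choice of Φ.
Thus suppose that $\Psi(x_0,y_0,w_0) = (0,0,1)$. Write
    $\Psi \circ \Phi^{-1} = \begin{pmatrix} \alpha & \beta & 0 \\ \gamma & \delta & 0 \\ r & s & 1 \end{pmatrix}$
    $f'(x,y) = F(\Psi^{-1}(x,y,1)) = f_1'(x,y) + \cdots + f_d'(x,y)$
    $l'(x,y) = L(\Psi^{-1}(x,y,1)) = b'x - a'y$.    (2.15)
From (2.4), we have
    $l(x,y) = l'(\alpha x + \beta y, \gamma x + \delta y)$.
Since l(x,y) = bx - ay, (2.15) gives
    $b = b'\alpha - a'\gamma$ and $-a = b'\beta - a'\delta$.
Putting $\Delta = \alpha\delta - \beta\gamma$, we thus have
    $\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix} \begin{pmatrix} a' \\ b' \end{pmatrix}$ and $\begin{pmatrix} a' \\ b' \end{pmatrix} = \Delta^{-1} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}$.
Now (2.3) gives
    $f(x,y) = (rx + sy + 1)^{d-1} f_1'(\alpha x + \beta y, \gamma x + \delta y) + \cdots + f_d'(\alpha x + \beta y, \gamma x + \delta y)$.
So
    $f(\varphi(t)) = (art + bst + 1)^{d-1} \Delta f_1'(a't, b't) + (art + bst + 1)^{d-2} \Delta^2 f_2'(a't, b't) + \cdots + \Delta^d f_d'(a't, b't)$.
From this equation we see that i(P, L, F) is the same when computed
from f as when computed from f'.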
Proposition 2.8. If a line L and a plane curve F meet at a point P,
then i(P, L, F) = +∞ if and only if L divides F.
Proof. If L divides F, then l(x,y) divides f(x,y). Since l(φ(t)) is
the 0 polynomial, so is f(φ(t)).
Conversely suppose f(φ(t)) is the 0 polynomial, so that $f_r(a,b) = 0$
for all r with 1 ≤ r ≤ d. Without loss of generality, suppose b ≠ 0. The
equality
    $0 = f_r(a,b) = c_0a^r + c_1a^{r-1}b + \cdots + c_rb^r = b^r\left(c_0\left(\frac{a}{b}\right)^r + c_1\left(\frac{a}{b}\right)^{r-1} + \cdots + c_r\right)$
says that $X - \frac{a}{b}$ is a factor of $b^r(c_0X^r + c_1X^{r-1} + \cdots + c_r)$. If we write
    $b^r(c_0X^r + \cdots + c_r) = \left(X - \frac{a}{b}\right)u(X)$,
then
    $b^r f_r(x,y) = y^r b^r\left(c_0\left(\frac{x}{y}\right)^r + \cdots + c_r\right) = y^r\left(\frac{x}{y} - \frac{a}{b}\right)u\left(\frac{x}{y}\right) = b^{-1}(bx - ay)\left(y^{r-1}u\left(\frac{x}{y}\right)\right)$.
Hence l(x,y) divides $f_r(x,y)$. It follows that l(x,y) divides f(x,y) and
then that L divides F.
As was the case with the tangent line, there is an expression for
intersection multiplicity that avoids affine local coordinates. For current
purposes this expression, which we give as Proposition 2.9, is not too
helpful, because it actually represents a disguised use of a particular
system of affine local coordinates that may not be the most convenient
one for applications. We shall not use this proposition until Chapter V.
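Proposition 2.9. Suppose $F \in k[x,y,w]_m$ is a plane curve, $L \in
k[x,y,w]_1$ is a line, and $P = (x_0,y_0,w_0)$ is a point on L. If (x',y',w') is
any point on L with $[(x',y',w')] \ne [(x_0,y_0,w_0)]$, then i(P, L, F) equals
the order of vanishing at t = 0 of
    $\psi(t) = F(x_0 + tx', y_0 + ty', w_0 + tw')$.
Proof. Let (x'',y'',w'') be a point with L(x'',y'',w'') = 1, and
choose affine local coordinates Φ so that
    $\Phi^{-1}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} x'' \\ y'' \\ w'' \end{pmatrix}$, $\Phi^{-1}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$, $\Phi^{-1}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ w_0 \end{pmatrix}$.
This is possible since the above three image vectors for $\Phi^{-1}$ are linearly
independent. The corresponding local expression for L is then simply
    $l(x,y) = L \circ \Phi^{-1}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = x$.
The usual parametrization of l(x,y) = 0 comes from $\varphi(t) = (0, t)$. Thus
we can compute the local expression for F as
    $f(\varphi(t)) = F \circ \Phi^{-1}\begin{pmatrix} 0 \\ t \\ 1 \end{pmatrix} = F(x_0 + tx', y_0 + ty', w_0 + tw')$.
Thus ψ(t) = f(φ(t)), and the proposition follows.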
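Proposition 2.9 gives a coordinate-free recipe that is easy to mechanize: expand ψ(t) and read off the lowest nonvanishing coefficient. The Python sketch below is an illustration with hypothetical names; the curve is stored as a dictionary from exponent triples to coefficients, and the test curve $y^2w = x^3 - xw^2$ is the projective form of the curve of Figure 1.7c.

```python
import math
from itertools import zip_longest

def poly_mul(p, q):
    """Product of two polynomials in t given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def psi(F, P, P1):
    """Coefficients of psi(t) = F(P + t * P1), lowest degree first."""
    total = [0]
    for exps, c in F.items():
        term = [c]
        for e, p0, p1 in zip(exps, P, P1):
            for _ in range(e):
                term = poly_mul(term, [p0, p1])     # multiply by (p0 + p1 t)
        total = [u + v for u, v in zip_longest(total, term, fillvalue=0)]
    return total

def order_at_zero(coefficients):
    for n, c in enumerate(coefficients):
        if c != 0:
            return n
    return math.inf

E = {(0, 2, 1): 1, (3, 0, 0): -1, (1, 0, 2): 1}    # y^2 w - x^3 + x w^2
O = (0, 1, 0)                                      # the point at infinity

print(order_at_zero(psi(E, O, (1, 0, 0))))   # line w = 0: 3
print(order_at_zero(psi(E, O, (0, 0, 1))))   # line x = 0: 1
```

The value 3 for the line w = 0 matches the remark in Chapter I that O = (0,1,0) is an inflection point of a curve in Weierstrass form with tangent line w = 0.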
Suppose that P is in $L(k) \cap F(k)$, that P is a nonsingular point of F,
and that $L_T$ is the tangent line to F at P. In the above notation, the
local expression for $L_T$ is
    $l_T(x,y) = f_1(x,y)$.
Thus
    $i(P,L,F) = 1 \iff f_1(a,b) \ne 0$
    $\iff (a,b) \notin l_T(k)$
    $\iff \text{image } \varphi \not\subseteq l_T(k)$
    $\iff L(k) \not\subseteq L_T(k)$
    $\iff$ L is not the same as $L_T$.    (2.16)
Normally we should expect the tangent line at P to have intersection
multiplicity 2 with F. We say that a nonsingular point P of F is a
flex or inflection point of F(k) if the intersection multiplicity of the
tangent line to F at P is ≥ 3.
Let us see how we would identify a flex from the definition. Suppose
F is given and we want to check $(x_0,y_0,1)$. As usual, we can take
$\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$, and then $f(x,y) = F(\Phi^{-1}(x,y,1))$ is simply given
by $f(x,y) = F(x + x_0, y + y_0, 1)$. We rewrite f as a sum of homogeneous
terms in x and y. Then $(x_0,y_0,1)$ is on the curve exactly if the constant
term is 0, the point is nonsingular if also the first order term $f_1(x,y)$
is not 0, and the point is a flex if also the second order term $f_2(x,y)$
satisfies $f_2(a,b) = 0$, where a and b are defined by $f_1(x,y) = bx - ay$.
Proposition 2.8 may suggest that the condition on $f_2$ is equivalent
with the condition that $f_1(x,y)$ divide $f_2(x,y)$. In fact, we can see this
equivalence algebraically. [If $f_2(x,y) = cx^2 + dxy + ey^2$, suppose
    $cx^2 + dxy + ey^2 = (bx - ay)(rx + sy)$.
Then c = br, d = -ar + bs, e = -as; hence $ca^2 + dab + eb^2 = 0$.
Conversely if $ca^2 + dab + eb^2 = 0$, we define r = c/b and s = -e/a,
checking separately the cases that b = 0 or a = 0.]
Example. Let $F(x,y,w) = x^2y + xyw + w^3$. About $(x_0, y_0, 1)$ we
have seen that
$$\begin{aligned}
f(x,y) = {} & (x_0^2y_0 + x_0y_0 + 1) + ((2x_0y_0 + y_0)x + (x_0^2 + x_0)y) \\
& + (y_0x^2 + (2x_0 + 1)xy) + x^2y.
\end{aligned}$$
Assume that the point is on the curve, i.e., that
$$x_0^2y_0 + x_0y_0 + 1 = 0. \tag{2.17}$$
Then
$$\begin{aligned}
f_1(x,y) &= (2x_0y_0 + y_0)x + (x_0^2 + x_0)y \\
f_2(x,y) &= y_0x^2 + (2x_0 + 1)xy.
\end{aligned}$$
If we put $b = 2x_0y_0 + y_0$ and $a = -(x_0^2 + x_0)$, the condition for a flex is
$$0 = f_2(a,b) = y_0a^2 + (2x_0 + 1)ab,$$
i.e.,
$$y_0(x_0^2 + x_0)^2 - (2x_0 + 1)(x_0^2 + x_0)(2x_0y_0 + y_0) = 0. \tag{2.18}$$
We can check that (2.18) is equivalent with the condition that $f_1(x,y)$
divide $f_2(x,y)$.
In the above example, one way to find all flexes $(x_0, y_0, 1)$ would be
to solve (2.17) and (2.18) simultaneously, discarding any singular points
at the end. However, there is a more elegant way that does not involve
passing to affine local coordinates; the only additional condition is that
the field is not to have characteristic 2. We require two lemmas.
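Lemma 2.10 (Principal Axis Theorem). If char $k \ne 2$ and if $A$ is a
symmetric square matrix over $k$, then there exists a nonsingular square
matrix $\Phi$ over $k$ such that $\Phi^{tr}A\Phi$ is diagonal.
Proof. We induct on the size, the case of size 1-by-1 being trivial.
Suppose the result is known for size $(n-1)$-by-$(n-1)$. If $A$ has size
$n$-by-$n$, write $A = \begin{pmatrix} a & b \\ b^{tr} & d \end{pmatrix}$ with $a$ symmetric of size $(n-1)$-by-$(n-1)$
and with $d$ of size 1-by-1. If $d \ne 0$, let $x = -d^{-1}b$. Then
$$\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ b^{tr} & d \end{pmatrix}\begin{pmatrix} 1 & 0 \\ x^{tr} & 1 \end{pmatrix} = \begin{pmatrix} * & 0 \\ 0 & d \end{pmatrix},$$
and the result is reduced to the $(n-1)$-by-$(n-1)$ case and follows by
induction. If $d = 0$, we may assume inductively that $b \ne 0$, say $b_i \ne 0$.
Let $y$ be an $(n-1)$-dimensional row vector with $i$th entry $\delta$ and with
other entries 0. Then
$$\begin{pmatrix} 1 & 0 \\ y & 1 \end{pmatrix}\begin{pmatrix} a & b \\ b^{tr} & 0 \end{pmatrix}\begin{pmatrix} 1 & y^{tr} \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} * & * \\ * & yay^{tr} + b^{tr}y^{tr} + yb \end{pmatrix} = \begin{pmatrix} * & * \\ * & \delta^2a_{ii} + 2\delta b_i \end{pmatrix}.$$
Since $2b_i \ne 0$, there is some value of $\delta$ for which $\delta^2a_{ii} + 2\delta b_i \ne 0$. Thus
we are reduced to the case $d \ne 0$, which we have already handled.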
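The induction in this proof can be read as an algorithm. The following Python sketch carries it out over the rational numbers with exact arithmetic. The function name `congruence_diagonalize`, the helper `add_multiple`, and the choice of trying $\delta = 1$ and then $\delta = -1$ are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def congruence_diagonalize(A):
    """Return (P, D) with D = P^tr A P diagonal, for a symmetric matrix A over Q.

    Hypothetical helper that follows the induction in the proof of Lemma 2.10.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_multiple(src, dst, c):
        # Replace M by E^tr M E and P by P E, where E adds c*(column src) to column dst.
        for r in range(n):
            M[r][dst] += c * M[r][src]
            P[r][dst] += c * P[r][src]
        for col in range(n):
            M[dst][col] += c * M[src][col]

    for k in range(n - 1, -1, -1):           # k plays the role of the last index
        if M[k][k] == 0:                     # case d = 0
            i = next((i for i in range(k) if M[i][k] != 0), None)
            if i is None:
                continue                     # b = 0: nothing to clear in this column
            delta = Fraction(1)
            if M[i][i] + 2 * M[i][k] == 0:   # delta = 1 fails, so delta = -1 works
                delta = Fraction(-1)
            add_multiple(i, k, delta)        # new d = delta^2 a_ii + 2 delta b_i != 0
        d = M[k][k]
        for i in range(k):                   # case d != 0: clear column k and row k
            if M[i][k] != 0:
                add_multiple(k, i, -M[i][k] / d)
    return P, M
```

Each call to `add_multiple` is a congruence by an elementary matrix of determinant 1, so the returned `P` is nonsingular, as the lemma requires.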
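Lemma 2.11. Let $A = (A_{ij})$ be a nonzero 3-by-3 symmetric matrix
over a field $k$ with char $k \ne 2$. Then the conic
$$C(x,y,w) = \begin{pmatrix} x & y & w \end{pmatrix} A \begin{pmatrix} x \\ y \\ w \end{pmatrix}$$
associated to $A$ is reducible over the algebraic closure $\bar{k}$ if and only if
$\det A = 0$.
Proof. If $C(x,y,w)$ is reducible, Corollary 2.5 shows that the curve
has a singular point $(x_0,y_0,w_0)$. By Proposition 2.6
$$0 = \begin{pmatrix} \partial C/\partial x \\ \partial C/\partial y \\ \partial C/\partial w \end{pmatrix}(x_0,y_0,w_0) = 2A\begin{pmatrix} x_0 \\ y_0 \\ w_0 \end{pmatrix}.$$
Hence $\det A = 0$.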
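Conversely let $\det A = 0$. By Lemma 2.10, choose a nonsingular $\Phi$
such that $A' = (\Phi^{-1})^{tr}A\Phi^{-1}$ is diagonal. Without loss of generality we
may assume that $A'_{33} = 0$. Put $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = \Phi\begin{pmatrix} x \\ y \\ w \end{pmatrix}$. Then
$$C(x,y,w) = \begin{pmatrix} x' & y' & w' \end{pmatrix} A' \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = A'_{11}x'^2 + A'_{22}y'^2.$$
The right side is reducible over $\bar{k}$. Substituting back in terms of $(x,y,w)$,
we see that $C(x,y,w)$ is reducible over $\bar{k}$.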
Proposition 2.12. Suppose the characteristic of $k$ is not 2. Let $F$ be
a plane curve over $k$ of degree $d \ge 2$, and let $(x_0,y_0,w_0)$ be a nonsingular
point on the curve. Then $(x_0,y_0,w_0)$ is a flex if and only if the Hessian
matrix of $F$ satisfies $\det H(x_0,y_0,w_0) = 0$.
Proof. Without loss of generality, we may suppose $k$ is algebraically
closed. Let $L$ be the tangent line at $(x_0,y_0,w_0)$. Choose affine local
coordinates with $\Phi(x_0,y_0,w_0) = (0,0,1)$. Put $f(x,y) = F(\Phi^{-1}(x,y,1))$,
and let $f_1(x,y)$ and $f_2(x,y)$ be the first-order and second-order parts.
The condition that $(x_0,y_0,w_0)$ be a flex is that $f_1(x,y)$ divide $f_2(x,y)$,
hence that $L$ divide the conic $Q^{\Phi}$ of Proposition 2.7.
Suppose that $(x_0,y_0,w_0)$ is a flex. Then $L$ divides $Q^{\Phi}$, and (2.12)
shows that $L$ divides the conic defined by the matrix $H(x_0,y_0,w_0)$. By
Lemma 2.11, $\det H(x_0,y_0,w_0) = 0$.
Conversely suppose $\det H(x_0,y_0,w_0) = 0$. By Lemma 2.11 the conic
$C(x,y,w)$ defined by the matrix $H(x_0,y_0,w_0)$ is reducible. Let us say
that $C = L_1L_2$, where $L_1$ and $L_2$ are lines. We now divide matters into
cases.
Suppose $d > 2$. In affine local coordinates defined by $\Phi$, $Q^{\Phi}$ becomes
homogeneous quadratic in $(x,y)$, and the term
$$-2(d-1){L'}^{\Phi}(x,y,w)L(x,y,w)$$
becomes
$$-2(d-1)f_1(x,y) + \text{higher order}.$$
Therefore $C$ has $2(d-1)L$ as tangent at $(x_0,y_0,w_0)$. But $C = L_1L_2$
implies $C$ has tangent $L_1$ or $L_2$ at $(x_0,y_0,w_0)$, up to a constant factor.
Hence $L$ is a multiple of $L_1$ or $L_2$, and $L$ divides $C$. By (2.12), $L$ divides
$Q^{\Phi}$. Thus $(x_0,y_0,w_0)$ is a flex.
Now suppose $d = 2$. We still have $C = L_1L_2$. The second partials
of $C$ are twice the entries of $H(x_0,y_0,w_0)$. Since $C$ and $F$ are both
conics, $C = 2F$. Then $F = \frac{1}{2}L_1L_2$, and easy computation shows that
the nonsingular points are the points of $F(k)$ not in $L_1(k) \cap L_2(k)$ and
that each such is a flex.
EXAMPLE. Let $F(x,y,w) = x^2y + xyw + w^3$. The Hessian matrix at
$(x_0,y_0,1)$ is
$$\begin{pmatrix} 2y_0 & 2x_0 + 1 & y_0 \\ 2x_0 + 1 & 0 & x_0 \\ y_0 & x_0 & 6 \end{pmatrix}.$$
The condition in Proposition 2.12 that the determinant vanish says that
$$2y_0(x_0^2 + x_0) - 6(2x_0 + 1)^2 = 0. \tag{2.19}$$
This is ostensibly different from (2.18). But use of (2.17) reduces both
equations consistently to
$$3(2x_0 + 1)^2 = -1.$$
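The agreement of the two flex conditions can be checked by brute force over a small prime field. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the field $\mathbb{F}_p$ with $p$ an odd prime is chosen for convenience. It runs through the affine points $(x_0, y_0, 1)$ satisfying (2.17), evaluates (2.18), (2.19), and the reduced condition $3(2x_0+1)^2 = -1$, and confirms that all three agree.

```python
def flex_conditions(x0, y0, p):
    """Truth values of (2.18), (2.19) and 3(2*x0 + 1)^2 = -1, all modulo p."""
    s = (x0 * x0 + x0) % p
    c218 = (y0 * s * s - (2 * x0 + 1) * s * (2 * x0 * y0 + y0)) % p == 0
    c219 = (2 * y0 * s - 6 * (2 * x0 + 1) ** 2) % p == 0
    reduced = (3 * (2 * x0 + 1) ** 2 + 1) % p == 0
    return c218, c219, reduced

def affine_flexes(p):
    """List the flexes (x0, y0, 1) of x^2 y + x y w + w^3 over F_p, p an odd prime."""
    found = []
    for x0 in range(p):
        for y0 in range(p):
            if (x0 * x0 * y0 + x0 * y0 + 1) % p != 0:   # condition (2.17)
                continue
            c218, c219, reduced = flex_conditions(x0, y0, p)
            assert c218 == c219 == reduced               # the three tests agree
            if reduced:
                found.append((x0, y0))
    return found

print(affine_flexes(7))
```

Every affine point of this curve is nonsingular, because (2.17) forces $y_0(x_0^2 + x_0) = -1$ and hence $a = -(x_0^2 + x_0) \ne 0$, so no points need to be discarded at the end.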
Corollary 2.13. Over an algebraically closed field of characteristic
not equal to 2, a nonsingular conic has no flex, while a nonsingular plane
curve of degree > 2 has at least one flex.
Proof. Let the curve be $F$. Since $F$ is nonsingular, Proposition 2.12
says that the flexes are all the common solutions of
$$F(x,y,w) = 0$$
and
$$\det H(x,y,w) = 0.$$
If $\deg F > 2$, Bezout's Theorem says there is a solution. If $\deg F = 2$,
then $H(x,y,w)$ is a constant matrix $H$, and we have to show that $\det H$
is not 0. Let $C(x,y,w)$ be the conic defined by $H$. At the end of the
proof of Proposition 2.12, we saw that $C = 2F$. If $\det H = 0$, Lemma
2.11 says that $C$ is reducible, and thus $F$ is reducible, in contradiction
to Corollary 2.5.
If $F$ has degree $d \ge 2$, then $\det H$ has degree $3(d-2)$. The converse
half of Bezout's Theorem says that an irreducible $F$ of degree $d$ (over a
field of characteristic not 2) has $\le 3d(d-2)$ flexes unless $F$ and $\det H$
have a common factor. One can show that a plane curve $F$ and its $\det H$
have no common factor unless $F$ is a product of lines; in particular, $F$
and $\det H$ have no common factor if $F$ is irreducible.
4. Application to Cubics
The most general cubic is
$$\begin{aligned}
F(x,y,w) = {} & c_{yyy}y^3 + c_{xyy}xy^2 + c_{xxy}x^2y + c_{yyw}y^2w + c_{xyw}xyw \\
& + c_{yww}yw^2 + c_{xxx}x^3 + c_{xxw}x^2w + c_{xww}xw^2 + c_{www}w^3.
\end{aligned} \tag{2.20}$$
We examine the algebraic effect of some geometric conditions that we
might impose on such a curve. We shall make calculations in local
coordinates, despite the availability of Proposition 2.6, because we do
not want to exclude k of characteristic 2 when we deal with flexes.
Condition 1. The curve is to pass through $(x,y,w) = (0,1,0)$.
Equivalently
$$c_{yyy} = 0.$$
Condition 2. In addition, the curve is to have $(x,y,w) = (0,1,0)$
as a nonsingular point. To examine this condition, we choose
$$\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$
as the map that carries $(x_0,y_0,w_0) = (0,1,0)$ to $(0,0,1)$.
Then
$$\begin{aligned}
f(x,y) = F(\Phi^{-1}(x,y,1)) = F(y,1,x) = {} & c_{xyy}y + c_{xxy}y^2 + c_{yyw}x + c_{xyw}xy + c_{yww}x^2 \\
& + c_{xxx}y^3 + c_{xxw}xy^2 + c_{xww}x^2y + c_{www}x^3.
\end{aligned} \tag{2.21}$$
Hence
$$f_1(x,y) = c_{yyw}x + c_{xyy}y.$$
The condition is that
$$c_{yyw} \ne 0 \quad\text{or}\quad c_{xyy} \ne 0.$$
Condition 3. In addition, the curve is to have $L(x,y,w) = w$ as
tangent line at $(0,1,0)$. (Recall that this line is the line at infinity.)
Referring to (2.21), we have
$$f_1(x,y,w) = c_{yyw}x + c_{xyy}y.$$
Hence
$$L(x,y,w) = f_1(\Phi(x,y,w)) = f_1(w,x,y) = c_{yyw}w + c_{xyy}x.$$
The condition is that
$$c_{xyy} = 0.$$
In view of Condition 2, we automatically have
$$c_{yyw} \ne 0,$$
and then the tangent line is the same line as $w$.
Condition 4. In addition, the curve is to have a flex at $(0,1,0)$.
Referring to (2.21), we have
$$f_2(x,y) = c_{yww}x^2 + c_{xyw}xy + c_{xxy}y^2$$
and
$$f_1(x,y) = c_{yyw}x.$$
The condition that $f_1$ divide $f_2$ is that
$$c_{xxy} = 0.$$
The condition that $i((0,1,0), w, F)$ be defined is that $w$ not divide $F$,
hence that
$$c_{xxx} \ne 0.$$
When all four conditions are satisfied, the cubic becomes
$$\begin{aligned}
F(x,y,w) = {} & c_{yyw}y^2w + c_{xyw}xyw + c_{yww}yw^2 \\
& + c_{xxx}x^3 + c_{xxw}x^2w + c_{xww}xw^2 + c_{www}w^3
\end{aligned} \tag{2.22}$$
with $c_{yyw} \ne 0$ and $c_{xxx} \ne 0$.
Proposition 2.14. If $F$ is a cubic over $k$ such that $F(k)$ has a flex
$(x_0,y_0,w_0)$, then there exists a projective transformation $\Phi$ such that
$F^{\Phi}$ is the same curve as
$$y^2w + a_1xyw + a_3yw^2 - x^3 - a_2x^2w - a_4xw^2 - a_6w^3. \tag{2.23}$$
PROOF. Choose $\Phi_1$ so that $\Phi_1(x_0,y_0,w_0) = (0,1,0)$. Then $F^{\Phi_1}$ has
a flex at $(0,1,0)$. Let $L(x,y,w) = \alpha x + \beta w$ be the tangent to $F^{\Phi_1}$ at
$(0,1,0)$, choose a nonsingular matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $\alpha a + \beta c = 0$, and
define
$$\Phi_2^{-1} = \begin{pmatrix} a & 0 & b \\ 0 & 1 & 0 \\ c & 0 & d \end{pmatrix}.$$
Then $\Phi_2$ has the property that $L^{\Phi_2}$ is the same line as $w$. Hence
$(F^{\Phi_1})^{\Phi_2} = F^{\Phi_2\Phi_1}$ has a flex at $(0,1,0)$ and has $w$ as tangent. The
discussion above shows that $F^{\Phi_2\Phi_1}$ has the form (2.22). Put
$$\Phi_3^{-1} = \begin{pmatrix} t & 0 & 0 \\ 0 & t & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
with $t$ to be specified. Then $(F^{\Phi_2\Phi_1})^{\Phi_3} = F^{\Phi_3\Phi_2\Phi_1}$ has
$$\begin{aligned}
F^{\Phi_3\Phi_2\Phi_1}(x,y,w) &= F^{\Phi_2\Phi_1}(tx,ty,w) \\
&= c_{yyw}(t^2y^2w) + \cdots + c_{xxx}(t^3x^3) + \cdots.
\end{aligned}$$
Taking $t = -c_{yyw}/c_{xxx}$, we see that the coefficients of $y^2w$ and $x^3$ in
$F^{\Phi_3\Phi_2\Phi_1}$ are negatives of each other, as in (2.23). Thus
$\Phi = \Phi_3\Phi_2\Phi_1$ is the required transformation.
A cubic of the form (2.23) is said to be in Weierstrass form. The
corresponding affine Weierstrass form, achieved by putting w = 1, is
written as an equation
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6. \tag{2.24}$$
The notation in (2.24) is standard; the subscripts will be seen in Chapter
III to indicate the degree of homogeneity of the corresponding term
under a certain change of variables.
By Corollary 2.13, a nonsingular cubic over an algebraically closed
field of characteristic not equal to 2 necessarily has a flex. Hence it can
be put in Weierstrass form by a projective transformation.
An elliptic curve over $k$ is a nonsingular cubic over $k$ that is in
Weierstrass form. In testing a curve (2.23) for nonsingularity, note that
$(0,1,0)$ is the only point at infinity on the curve, and it is nonsingular
by Condition 2. Hence only points $(x_0,y_0,1)$ need to be checked.
Proposition 2.15. Let $F$ be a nonsingular cubic over $k$, and let $L$
be a line. Then $\sum_P i(P,L,F)$ is 0, 1, or 3.
REMARKS. (1) If $P \ne Q$ are in $F(k)$, let $L$ be the line containing $P$
and $Q$. Then the proposition says there is a third point on $L(k) \cap F(k)$
if we count multiplicities. (This third point may coincide with $P$ or $Q$.)
We define $PQ$ to be the third point.
(2) To define $PP$, let $L$ be the tangent to $F$ at $P$. Then $i(P,L,F) \ge 2$.
The proposition says that there is a third point on $L(k) \cap F(k)$ if we
count multiplicities. This third point will be $P$ if $P$ is a flex of $F$, and it
will be some different point if $P$ is not a flex. In either case, we define $PP$
to be this third point. The rule for determining $PQ$ in this paragraph
or the previous one is called the chord-tangent composition law.
Proof. Let $F$ be as in (2.20). Without loss of generality the sum is
not 0. Thus take $P = (x_0,y_0,w_0)$ to be in $L(k) \cap F(k)$. Since everything
is invariant under projective transformations, the same argument as in
the proof of Proposition 2.14 shows we may assume that $(x_0,y_0,w_0) =
(0,1,0)$ and that the tangent line there is $w$. Conditions 1 and 3 show
that $c_{yyy} = c_{xyy} = 0$. We consider some possibilities for $L$.
(1) $L = w$. Putting $w = 0$ in $F$ gives
$$F(x,y,0) = c_{xxy}x^2y + c_{xxx}x^3. \tag{2.25}$$
(1a) Suppose $c_{xxy} = c_{xxx} = 0$. Then $w$ divides $F$, and Corollary 2.5
shows that $F$ is singular, contradiction.
(1b) Suppose $c_{xxx} \ne 0$ and $c_{xxy} = 0$. Then Condition 4 shows that
$(0,1,0)$ is a flex with $i(P,L,F)$ defined and $\ge 3$. Since $i(P,L,F)$ cannot
be $> 3$ for a cubic, $i(P,L,F) = 3$. The expression (2.25) cannot be 0
without $x = 0$, and so there are no other points in $L(k) \cap F(k)$. So
$L(k) \cap F(k)$ contains $P$ with multiplicity 3, and it contains no other
point.
(1c) Suppose $c_{xxy} \ne 0$. Then Condition 4 shows that $(0,1,0)$ is not
a flex, hence that $i(P,L,F) = 2$. Meanwhile (2.25) produces a unique
additional point $Q$ on $L$ for which (2.25) is 0; we can write $Q = (1,y_1,0)$.
To complete the study of this case, we need to show that $i(Q,L,F) = 1$.
Define $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ -y_1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$, so that $\Phi(1,y_1,0) = (0,0,1)$. Then
$$\begin{aligned}
f(x,y) &= F(\Phi^{-1}(x,y,1)) = F(1, y + y_1, x) \\
l(x,y) &= L(\Phi^{-1}(x,y,1)) = L(1, y + y_1, x) = x.
\end{aligned}$$
Calculating $f(x,y)$, we find that
$$f_1(x,y) = (c_{yyw}y_1^2 + c_{xyw}y_1 + c_{xxw})x + (c_{xxy})y.$$
Since $c_{xxy} \ne 0$, $f_1(x,y)$ is not a multiple of $l(x,y)$. Hence $L$ is not
tangent to $F$ at $Q$, and $i(Q,L,F) = 1$ as required.
(2) $L$ is not the same as $w$. Since $L$ goes through $P = (0,1,0)$ and
is not the tangent, we have $i(P,L,F) = 1$. Assume that $i(Q,L,F) \ge 1$
for some $Q \ne P$. We may write $Q = (x_0,y_0,1)$. Transforming by
$\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$, which has $\Phi(P) = P$ and $\Phi(Q) = (0,0,1)$, we
may assume from the outset that $Q$ is $(0,0,1)$. Then $L$, which passes
through both $P$ and $Q$, is the line $L = x$. Using $\Phi = I$, we form
$$\begin{aligned}
f(x,y) = F(x,y,1) = {} & c_{xxy}x^2y + c_{yyw}y^2 + c_{xyw}xy + c_{yww}y \\
& + c_{xxx}x^3 + c_{xxw}x^2 + c_{xww}x
\end{aligned}$$
and
$$l(x,y) = L(x,y,1) = x.$$
Then $l(x,y) = bx - ay$ with $a = 0$ and $b = 1$. Putting $\varphi(t) = (at, bt)$, we
have
$$\begin{aligned}
f(\varphi(t)) &= t(c_{xww}a + c_{yww}b) + t^2(c_{xxw}a^2 + c_{xyw}ab + c_{yyw}b^2) + t^3(c_{xxx}a^3 + c_{xxy}a^2b) \\
&= tc_{yww} + t^2c_{yyw}.
\end{aligned} \tag{2.26}$$
Condition 3 shows that $c_{yyw} \ne 0$. If $c_{yww} = 0$, then (2.26) shows
that $i(Q,L,F) = 2$ and that $f(x,y)$ vanishes on the locus $l(x,y) = 0$
only at $(x,y) = (0,0)$; hence $P$ and $Q$ are the only points contributing
to the sum in question, and the sum is 3. If $c_{yww} \ne 0$, then (2.26)
shows that $i(Q,L,F) = 1$ and that there is just one more point $R$
where $i(R,L,F) > 0$; interchanging the roles of $Q$ and $R$ shows that
$i(R,L,F) = 1$ and hence that the sum is 3.
5. Bezout's Theorem and Resultants
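Before returning to Bezout's Theorem, we introduce the resultant
$R(f,g)$ of two polynomials. Let $A$ be a unique factorization domain,
e.g., $k[x_1,\ldots,x_r]$. For $f$ and $g$ in $A[X]$ with
$$\begin{aligned}
f(X) &= a_0 + a_1X + \cdots + a_mX^m \\
g(X) &= b_0 + b_1X + \cdots + b_nX^n,
\end{aligned} \tag{2.27}$$
let $[R(f,g)]$ be the $(m+n)$-by-$(m+n)$ matrix
$$\begin{pmatrix}
a_0 & a_1 & \cdots & a_{m-1} & a_m & 0 & 0 & \cdots & 0 \\
0 & a_0 & \cdots & a_{m-2} & a_{m-1} & a_m & 0 & \cdots & 0 \\
\vdots & & \ddots & & & & \ddots & & \vdots \\
0 & \cdots & & a_0 & \cdots & & & & a_m \\
b_0 & b_1 & \cdots & b_{n-1} & b_n & 0 & \cdots & & 0 \\
0 & b_0 & \cdots & b_{n-2} & b_{n-1} & b_n & \cdots & & 0 \\
\vdots & & \ddots & & & & \ddots & & \vdots \\
0 & \cdots & & b_0 & b_1 & \cdots & & & b_n
\end{pmatrix}$$
in which there are $n$ rows above the $b_0$ in the first column and there are
$m$ remaining rows. Let $R(f,g)$ be the determinant
$$R(f,g) = \det[R(f,g)].$$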
Proposition 2.16. With $f$ and $g$ in $A[X]$ of the form (2.27), the
following are equivalent:
(1) $f$ and $g$ have a common factor of degree $> 0$.
(2) $af + bg = 0$ for some nonzero $a$ and $b$ in $A[X]$ with $\deg a < n$ and
$\deg b < m$.
(3) $R(f,g) = 0$.
When $R(f,g) \ne 0$, there exist nonzero $a$ and $b$ in $A[X]$ with $\deg a < n$,
$\deg b < m$, and $a(X)f(X) + b(X)g(X) = R(f,g)$. (Here $R(f,g)$ is to be
regarded as a polynomial of degree 0.)
Proof. (1) $\Rightarrow$ (2). If $u \mid f$ and $u \mid g$, write $f = bu$ and $g = -au$.
Then (2) holds.
(2) $\Rightarrow$ (1). Factor $f$ and $bg$. If (1) fails, then the prime factors of $f$ of
degree $> 0$ must occur in the factorization of $b$, with multiplicities. So
$\deg f \le \deg b$, contradiction.
(2) $\Leftrightarrow$ (3). For any $a$ and $b$ of the form
$$\begin{aligned}
a(X) &= \alpha_0 + \alpha_1X + \cdots + \alpha_{n-1}X^{n-1} \\
b(X) &= \beta_0 + \beta_1X + \cdots + \beta_{m-1}X^{m-1}
\end{aligned} \tag{2.28}$$
with coefficients in the quotient field of $A$, we have
$$\begin{pmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{n-1} & \beta_0 & \cdots & \beta_{m-1} \end{pmatrix}[R(f,g)] = \begin{pmatrix} c_0 & c_1 & \cdots & c_{m+n-1} \end{pmatrix}, \tag{2.29a}$$
where
$$a(X)f(X) + b(X)g(X) = c(X) = c_0 + c_1X + \cdots + c_{m+n-1}X^{m+n-1}. \tag{2.29b}$$
If nontrivial $a(X)$ and $b(X)$ exist with coefficients in $A$ such that $c(X) = 0$,
then $\ker[R(f,g)]^{tr} \ne 0$ and $R(f,g) = 0$. Conversely if $R(f,g) = 0$,
then $\ker[R(f,g)]^{tr} \ne 0$ and there exist nontrivial $a(X)$ and $b(X)$ with
coefficients in the quotient field of $A$ such that $c(X) = 0$. Clearing denominators,
we obtain $a(X)$ and $b(X)$ with coefficients in $A$ such that $c(X) = 0$.
Last statement: If $R(f,g) \ne 0$, then the cofactor formula for the
inverse of $[R(f,g)]$ with entries in $A$ says that
$$[R(f,g)]^{-1} = R(f,g)^{-1}[S(f,g)],$$
where $[S(f,g)]$ has entries in $A$. Thus we can define elements
$$\alpha_0, \ldots, \alpha_{n-1}, \beta_0, \ldots, \beta_{m-1}$$
in $A$ by
$$\begin{pmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{n-1} & \beta_0 & \cdots & \beta_{m-1} \end{pmatrix} = \begin{pmatrix} R(f,g) & 0 & \cdots & 0 \end{pmatrix}[R(f,g)]^{-1}.$$
If we define $a(X)$ and $b(X)$ in $A[X]$ by (2.28), then (2.29) shows that
$$a(X)f(X) + b(X)g(X) = R(f,g).$$
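As an illustration of the definition and of the equivalence of (1) and (3), the following Python sketch builds the matrix $[R(f,g)]$ in the layout described above (coefficients listed from the constant term upward, $n$ rows for $f$ followed by $m$ rows for $g$) and evaluates its determinant exactly. It assumes $A = \mathbb{Z}$ or $\mathbb{Q}$ and nonzero leading coefficients; the function names are hypothetical.

```python
from fractions import Fraction

def resultant_matrix(f, g):
    """Matrix [R(f,g)] for f = [a0, ..., am] and g = [b0, ..., bn] (ascending powers)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):                       # n shifted copies of the a_j
        rows.append([0] * i + list(f) + [0] * (size - m - 1 - i))
    for i in range(m):                       # m shifted copies of the b_j
        rows.append([0] * i + list(g) + [0] * (size - n - 1 - i))
    return rows

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                       # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def resultant(f, g):
    return determinant(resultant_matrix(f, g))

# (X - 1)(X - 2) and (X - 1)(X + 3) share the factor X - 1.
print(resultant([2, -3, 1], [-3, 2, 1]))
# X^2 + 1 and X - 2 have no common factor.
print(resultant([1, 0, 1], [-2, 1]))
```

With this row ordering the sign of the determinant may differ from other conventions for the resultant; only its vanishing matters for Proposition 2.16.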
Proposition 2.17. If $A = k[x_1,\ldots,x_r]$ and if $f$ and $g$ are as in
(2.27) with $a_j$ homogeneous of degree $m - j$ and $b_j$ homogeneous of
degree $n - j$, then $R(f,g)$ is homogeneous of degree $mn$.
PROOF. For an exponent $u$ obtained by factoring powers
$$(t^n, t^{n-1}, \ldots, t^m, t^{m-1}, \ldots)$$
of $t$ from the rows of the matrix displayed below and for $v$ obtained by
factoring powers $(t^{m+n}, t^{m+n-1}, \ldots)$ from the columns, we have
$$t^uR(f,g)(tx_1,\ldots,tx_r) = \det\begin{pmatrix}
t^nt^ma_0 & t^nt^{m-1}a_1 & \cdots & t^na_m & \cdots \\
0 & t^{n-1}t^ma_0 & \cdots & & \\
\vdots & & & & \\
t^mt^nb_0 & t^mt^{n-1}b_1 & \cdots & t^mb_n & \cdots \\
0 & t^{m-1}t^nb_0 & \cdots & &
\end{pmatrix} = t^vR(f,g)(x_1,\ldots,x_r).$$
So $R(f,g)(tx) = t^{v-u}R(f,g)$. Computing $u$ and $v$, we find that $u =
\frac{1}{2}m(m+1) + \frac{1}{2}n(n+1)$ and $v = \frac{1}{2}(m+n)(m+n+1)$, so that $v - u = mn$.
By Proposition 2.2 applied to $\bar{k}$, $R(f,g)$ is homogeneous of degree $mn$.
Let us restate Bezout's Theorem, originally given as Theorem 2.4.
Theorem 2.18 (Bezout's Theorem). Suppose $F \in k[x,y,w]_m$ and
$G \in k[x,y,w]_n$ are plane curves. Then $F(\bar{k}) \cap G(\bar{k})$ is nonempty. If it
has more than $mn$ points, then $F$ and $G$ have as a common factor some
homogeneous polynomial of degree $> 0$.
Proof of first conclusion. Write $F$ and $G$ in the form
$$\begin{aligned}
F(x,y,w) &= a_0 + a_1w + \cdots + a_mw^m \quad\text{with } a_j \in k[x,y]_{m-j} \\
G(x,y,w) &= b_0 + b_1w + \cdots + b_nw^n \quad\text{with } b_j \in k[x,y]_{n-j}.
\end{aligned} \tag{2.30}$$
Regarding $F$ and $G$ as polynomials in $w$, with coefficients in $A = k[x,y]$,
we form $R(F,G)$, which Proposition 2.17 identifies as a member of
$k[x,y]_{mn}$.
Since $R(F,G)(x,y)$ is homogeneous and $\bar{k}$ is algebraically closed, we
can choose a point $(x_0,y_0) \ne (0,0)$ where $R(F,G)(x_0,y_0) = 0$. Then
the resultant of $F(x_0,y_0,w)$ and $G(x_0,y_0,w)$ is 0, and Proposition 2.16
says that these two polynomials in $w$ have a common factor. Since $\bar{k}$ is
algebraically closed, this common factor vanishes at some $w = w_0$, and
then we must have $F(x_0,y_0,w_0) = G(x_0,y_0,w_0) = 0$.
Proof of second conclusion. Suppose $F(\bar{k}) \cap G(\bar{k})$ contains
$mn + 1$ points. Join these points by lines, and pick a point defined
over $\bar{k}$ that is not on any of the lines. Applying a projective
transformation, we may assume the point is $(0,0,1)$. Write $F$ and $G$ in the
form (2.30). Regarding $F$ and $G$ as polynomials in $w$, with coefficients
in $A = \bar{k}[x,y]$, we again form $R(F,G)$, which Proposition 2.17 identifies
as a member of $\bar{k}[x,y]_{mn}$. For fixed $(x_0,y_0)$, Proposition 2.16 says that
$R(F,G)(x_0,y_0) = 0$ if and only if $F(x_0,y_0,w)$ and $G(x_0,y_0,w)$ have a
common factor (hence necessarily some $w - w_0$ factor since $\bar{k}$ is
algebraically closed), if and only if $F(x_0,y_0,w_0) = G(x_0,y_0,w_0) = 0$ for
some $w_0$. So at each of our $mn + 1$ points, say $(x_i,y_i,w_i)$, we have
$R(F,G)(cx_i,cy_i) = 0$ for all $c$. Since $(x_i,y_i) \ne (0,0)$, $R(F,G)$ vanishes
on the line $y_ix - x_iy = 0$. Consequently $y_ix - x_iy$ divides $R(F,G)$.
Suppose $(x_i,y_i) = (x_j,y_j)$. Then $(x_i,y_i,w_i)$ and $(x_j,y_j,w_j)$ both
satisfy $y_ix - x_iy = 0$. Since $(0,0,1)$ satisfies this also and since $(0,0,1)$
is not to be on any of the connecting lines, we obtain a contradiction.
Thus the $mn + 1$ factors $y_ix - x_iy$ are distinct primes in $\bar{k}[x,y]$ dividing
$R(F,G)$. By unique factorization, their product divides $R(F,G)$. Since
$\deg R(F,G) = mn$, we conclude $R(F,G) = 0$. Then Proposition 2.16
shows that $F$ and $G$ have a common factor in $\bar{k}[x,y][w] = \bar{k}[x,y,w]$.
The common factor is homogeneous by the first conclusion of Corollary
2.5 (which does not depend on Bezout's Theorem).
Corollary 2.19. Suppose $F \in k[x,y,w]_d$ is a plane curve and $L$ is a
line. If $F(k) \cap L(k)$ has more than $d$ elements, then $L$ divides $F$. This
condition is satisfied if $F$ vanishes on $L(k)$ and $k$ has $\ge d$ elements.
Proof. The number of points on $L(k)$ is one more than the number
of elements of $k$, hence is $\ge d + 1$. Then $F(k) \cap L(k)$ has $\ge d + 1$ points,
and Bezout's Theorem says $L$ and $F$ have a common factor. Since $L$ is
prime and $k[x,y,w]$ has unique factorization, $L$ divides $F$.
The full-strength version of Bezout's Theorem says that two plane
curves $F$ and $G$ of degrees $m$ and $n$ meet in $\le mn$ points even when
multiplicities are counted, and that the number is = mn if k is
algebraically closed and multiplicities are counted. We do not need the
full-strength version of Bezout's Theorem and have consequently not
defined intersection multiplicity in full generality. The result that we
shall prove instead is the corresponding sharpening of the special case
of Corollary 2.19 in which one of the curves is a line.
Corollary 2.20. If $F \in k[x,y,w]_d$ is a plane curve and $L$ is a line
such that $\sum_P i(P,L,F) > d$, then $L$ divides $F$.
Proof. Without loss of generality, we may assume $k$ is algebraically
closed and is in particular infinite. Suppose $L$ does not divide $F$. Then
Corollary 2.19 shows that $F(k) \cap L(k)$ is finite, and it follows from
Proposition 2.8 that $\sum_P i(P,L,F)$ is finite.
Possibly by applying a projective transformation, we may assume that
the line $L$ is $w = 0$. Then the points $P_j$ with $i(P_j,L,F) > 0$ are of the
form $[(x_j,y_j,0)]$. Possibly by applying a second projective
transformation, one that translates the $y$ variable, we may assume that no $y_j$ is
0. Then we can write $P_j = [(r_j,1,0)]$ with $r_j$ in $k$. Now $F(x,1,0)$ is a
polynomial in $x$ of degree $\le d$. We shall prove that
$$i((r,1,0),w,F) = \text{multiplicity of } r \text{ as a root of } F(x,1,0). \tag{2.31}$$
Since $k$ is infinite, there is some $r$ with $(r,1,0)$ not in $F(k) \cap L(k)$, and this $r$ in
(2.31) shows that $F(x,1,0)$ is not the 0 polynomial. Hence $F(x,1,0)$
has $\le d$ roots, counting multiplicities, and it follows from (2.31) that
$\sum_P i(P,L,F) \le d$, as required.
To prove (2.31), we introduce affine local coordinates about $(r,1,0)$,
using $\Phi^{-1} = \begin{pmatrix} 1 & 0 & r \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$, so that $\Phi(r,1,0) = (0,0,1)$. The local
versions $f$ of $F$ and $l$ of $L = w$ are
$$\begin{aligned}
f(x,y) &= F(\Phi^{-1}(x,y,1)) = F(x + r, 1, y) \\
l(x,y) &= L(\Phi^{-1}(x,y,1)) = y.
\end{aligned}$$
Thus $l(x,y)$ is of the form $bx - ay$ with $a = -1$ and $b = 0$. We can
parametrize $l$ by $\varphi(t) = (at, bt) = (-t, 0)$, and then
$$f(\varphi(t)) = f(-t,0) = F(-t + r, 1, 0).$$
The order of vanishing of $f(\varphi(t))$ at $t = 0$, which is $i((r,1,0),L,F)$,
thus equals the order of the 0 of $F(-t + r, 1, 0)$ at $t = 0$, which equals
the multiplicity of $r$ as a root of $F(x,1,0)$. This proves (2.31), and the
corollary follows.
CHAPTER III
CUBIC CURVES IN WEIERSTRASS FORM
1. Examples
We begin with some examples of plane curves, showing how to bring
them into Weierstrass form and discussing some of their solutions.
A. Diophantus cubic example.
The equation from Chapter 1 in affine form is
y(6-y) = x3-x. (3.1)
Following the procedure in the proof of Proposition 2.14, we replace $y$
by $-y$ and $x$ by $-x$ to obtain
$$y^2 + 6y = x^3 - x.$$
If we complete the square and replace $y + 3$ by $y$, we are led to
$$y^2 = x^3 - x + 9. \tag{3.2}$$
The trivial solutions of (3.1) were $x = 0$ or $\pm 1$ with $y = 0$ or 6. These
correspond in (3.2) to $x = 0$ or $\pm 1$ with $y = \pm 3$. Diophantus's nontrivial
solution of (3.1) began with $P = (-1,0)$ and amounted to using $PP =
(\frac{17}{9}, \frac{26}{27})$; here $PP$ is the result of applying the chord-tangent composition
law (§II.4) to $P$ and $P$. In terms of (3.2) we would take the corresponding
trivial solution $P = (1,3)$ and use $PP = (-\frac{17}{9}, \frac{55}{27})$.
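The computation of $PP$ on (3.2) can be reproduced with exact rational arithmetic. The sketch below is an illustration for curves of the special shape $y^2 = x^3 + Ax + B$ only; the function names are hypothetical, and the slope formulas are the usual ones for a chord and for a tangent, combined with the fact that the three $x$-coordinates on a line of slope $\lambda$ sum to $\lambda^2$.

```python
from fractions import Fraction

def on_curve(P, A, B):
    """Check that P = (x, y) lies on y^2 = x^3 + A*x + B."""
    x, y = P
    return y * y == x ** 3 + A * x + B

def third_point(P, Q, A, B):
    """Third intersection of the line through P and Q (the tangent if P == Q)
    with y^2 = x^3 + A*x + B. Returns None when the line is vertical, in which
    case the third point is the point at infinity."""
    x1, y1 = Fraction(P[0]), Fraction(P[1])
    x2, y2 = Fraction(Q[0]), Fraction(Q[1])
    if (x1, y1) == (x2, y2):
        if y1 == 0:
            return None
        slope = (3 * x1 * x1 + A) / (2 * y1)     # tangent slope at P
    else:
        if x1 == x2:
            return None
        slope = (y2 - y1) / (x2 - x1)            # chord slope
    x3 = slope * slope - x1 - x2                 # x1 + x2 + x3 = slope^2
    y3 = y1 + slope * (x3 - x1)
    return (x3, y3)

P = (1, 3)                                       # trivial solution on (3.2)
print(third_point(P, P, -1, 9))                  # PP for y^2 = x^3 - x + 9
```

Note that this returns the third point on the line itself, which is the composition $PP$ of §II.4, not the sum of $P$ with itself in a group law.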
B. Cubic case of Fermat's Last Theorem.
The equation in projective form is
$$x^3 + y^3 = w^3. \tag{3.3}$$
Since we can always clear denominators, integer solutions and rational
solutions amount to the same thing. With $F(x,y,w) = x^3 + y^3 - w^3$,
we have
$$\frac{\partial F}{\partial x} = 3x^2, \qquad \frac{\partial F}{\partial y} = 3y^2, \qquad \frac{\partial F}{\partial w} = -3w^2.$$
The only place where all first partials are 0 is (0,0,0), which does not
correspond to a point in projective space. By Proposition 2.6, F is
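nonsingular. To look for flexes, we compute the Hessian determinant
$$\det H(x,y,w) = \det\begin{pmatrix} 6x & 0 & 0 \\ 0 & 6y & 0 \\ 0 & 0 & -6w \end{pmatrix} = -6^3xyw.$$
We seek simultaneous solutions of (3.3) and
$$xyw = 0. \tag{3.4}$$
For example, we can use $(x_0,y_0,w_0) = (0,1,1)$. To map this to $(0,1,0)$,
which is the point at infinity of a curve in Weierstrass form, we apply
successively $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}$ to carry $(0,1,1)$ to $(0,0,1)$ and then
$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$ to carry $(0,0,1)$ to $(0,1,0)$. The tangent $y - w$ maps first
to $y$ and then to $w$. Hence
$$\Phi = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix}$$
maps $F$ into Weierstrass form except for scaling. To carry out the
computations, we put $(a\ b\ c)^{tr} = \Phi(x\ y\ w)^{tr}$. Then
$$x = a, \qquad y = b + c, \qquad w = b.$$
Substituting in (3.3), we obtain
$$a^3 + (b + c)^3 = b^3.$$
Passing to affine form by replacing $(a,b,c)$ by $(x,y,1)$, we obtain
$$3y^2 + 3y = -x^3 - 1.$$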
We scale by changing $y$ to $-3y$ and $x$ to $-3x$, and the result is
$$y^2 - \tfrac{1}{3}y = x^3 - \tfrac{1}{27}.$$
We make a further adjustment to clear denominators: We change $y$ to
$y/u^3$ and $x$ to $x/u^2$ with $u = 3$. Then the equation becomes
$$y^2 - 9y = x^3 - 27. \tag{3.5}$$
Fermat showed that the only Q-rational (projective) solutions of (3.3)
are the trivial ones: $(x,y,w)$ must be one of $(1,0,1)$, $(0,1,1)$, and
$(1,-1,0)$. The corresponding solutions of (3.5) are $(3,9)$, $\infty$, and $(3,0)$.
For future reference let us record the composite transformations from
(3.3) to (3.5) and back. If $(X,Y)$ satisfies $X^3 + Y^3 = 1$ and $(x,y)$ satisfies
$y^2 - 9y = x^3 - 27$, the relationships are
$$\left\{\begin{aligned} X &= \frac{3x}{y} \\ Y &= \frac{y - 9}{y} \end{aligned}\right. \qquad\text{and}\qquad \left\{\begin{aligned} x &= \frac{3X}{1 - Y} \\ y &= \frac{9}{1 - Y}. \end{aligned}\right. \tag{3.6}$$
These are the transformations that we used in discussing u3 + v3 = w3
in Chapter I.
Equation (3.5) can be transformed some more. Completing the square
eliminates the y term and leads to
$$y^2 = x^3 - \tfrac{27}{4}. \tag{3.7}$$
Scaling by $(x,y) \to (x/u^2, y/u^3)$ with $u = 2$ leads to
$$y^2 = x^3 - 2^4 \cdot 3^3. \tag{3.8}$$
C. Congruent numbers.
A positive integer n is congruent if n is the area of a rational right
triangle. For example, the Pythagorean triple (3,4,5) has area 6. So
n = 6 is congruent.
Congruent numbers were studied as early as 984 A.D., were taken up
by Fibonacci in 1225, and were later studied by Fermat. The problem
is: Given n, decide whether n is congruent. Results for small square-free
n are as follows:
1 not congruent Fermat
2 not congruent Fermat
3 not congruent Fermat
5 congruent for (3/2, 20/3, 41/6) Fibonacci
6 congruent for (3,4,5).
Part of the connection between the congruent-number problem and
elliptic curves is given in the following proposition. We shall study the
relationship further in Chapter IV, and we shall prove there that n = 1,2,3
are not congruent.
Proposition 3.1. If n is a square-free positive integer, then the
following are equivalent:
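(1) $n$ is congruent: $n = \frac{1}{2}ab$, where $(a,b,c)$ is a rational Pythagorean
triple.
(2) There exist three rational squares in arithmetic progression with
common difference $n$.
Moreover, these conditions imply
(3) There exists a Q-rational point $(x,y)$ on
$$y^2 = x^3 - n^2x \tag{3.9}$$
other than $(-n,0)$, $(0,0)$, $(n,0)$, and $\infty$.
Examples. Arithmetic progressions meeting the second condition for
the congruent numbers 5 and 6 are
$$n = 5: \left(\left(\tfrac{31}{12}\right)^2, \left(\tfrac{41}{12}\right)^2, \left(\tfrac{49}{12}\right)^2\right)$$
$$n = 6: \left(\left(\tfrac{1}{2}\right)^2, \left(\tfrac{5}{2}\right)^2, \left(\tfrac{7}{2}\right)^2\right).$$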
PROOF. (1) $\Rightarrow$ (2). Given $(a,b,c)$, put $x = c^2/4$. Then $(a - b)^2/4 =
x - n$ and $(a + b)^2/4 = x + n$, so that $x - n$ and $x$ and $x + n$ are all
squares of rational numbers.
(2) $\Rightarrow$ (1). Given $x$ such that $x - n$ and $x$ and $x + n$ are all squares,
put
$$\begin{aligned}
a &= (x + n)^{1/2} + (x - n)^{1/2} \\
b &= (x + n)^{1/2} - (x - n)^{1/2} \\
c &= 2x^{1/2}.
\end{aligned}$$
Then $a,b,c$ are rational, $a^2 + b^2 = c^2$, and $\frac{1}{2}ab = n$.
(2) $\Rightarrow$ (3). If $x$ is the middle number in the progression, then the
product of the three is $x^3 - n^2x$ and is a square. Hence (3.9) holds with
$x$ as the middle number of the progression. The progression cannot be
$-n, 0, n$ since $n$ is square-free. Thus the Q-rational point in question
satisfies (3).
Remark. In Corollary 4.3 we shall prove conversely that (3) => (2).
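The chain (1) $\Rightarrow$ (2) $\Rightarrow$ (3) is explicit enough to compute with. The following sketch, with the hypothetical function name `triple_to_point`, takes a rational Pythagorean triple, forms $x = c^2/4$ as in the proof, and returns the area $n$, the arithmetic progression, and a point on (3.9). The choice of sign for $y$ is an arbitrary assumption.

```python
from fractions import Fraction

def triple_to_point(a, b, c):
    """From a rational Pythagorean triple (a, b, c) build the data of Proposition 3.1."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    assert a * a + b * b == c * c, "not a Pythagorean triple"
    n = a * b / 2                          # area of the triangle
    x = c * c / 4                          # middle term of the progression
    progression = (x - n, x, x + n)        # ((a-b)/2)^2, (c/2)^2, ((a+b)/2)^2
    y = (a - b) * (a + b) * c / 8          # a square root of the product of the terms
    return n, progression, (x, y)

for triple in [(3, 4, 5), (Fraction(3, 2), Fraction(20, 3), Fraction(41, 6))]:
    n, progression, (x, y) = triple_to_point(*triple)
    print(n, progression, (x, y), y * y == x ** 3 - n * n * x)
```

For the two triples in the table this recovers the progressions listed in the Examples above.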
D. Quartic case of Fermat's Last Theorem.
Fermat showed that $x^4 + y^4 = z^4$ has no nontrivial integer solutions
by showing that
$$u^4 + v^4 = w^2 \tag{3.10}$$
has no nontrivial integer solutions. We give his proof as Proposition
4.1. Equation (3.10) is not homogeneous, but we can still study it as
$\left(\frac{u}{v}\right)^4 + 1 = \left(\frac{w}{v^2}\right)^2$. Thus the equation to study for Q-rational points
is
$$y^2 = x^4 + 1. \tag{3.11}$$
If we make the nonprojective change of variables $x \to x$ and $y \to y + x^2$,
we are led to the cubic equation
$$\begin{aligned}
y^2 + 2x^2y &= 1 && \text{affine} \\
y^2w + 2x^2y &= w^3 && \text{projective.}
\end{aligned} \tag{3.12}$$
We can verify that (3.12) is nonsingular and can look for Q-rational flexes
in the same way as for (3.3). The only Q-rational flex is at $(x,y,w) =
(1,0,0)$, where the tangent line is $y = 0$. We apply $\Phi_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ in
order to send $(1,0,0)$ to $(0,1,0)$ with the tangent becoming $x = 0$, and
then we apply $\Phi_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$ in order to fix $(0,1,0)$ and to make
the tangent become $w = 0$. The resulting equation is
$$2y^2 = x^3 - x.$$
Scaling by sending $y$ to $2y$ and $x$ to $2x$ changes the equation to
$$y^2 = x^3 - \tfrac{1}{4}x.$$
We can make the coefficients be integers by sending $y$ to $\frac{1}{8}y$ and $x$ to
$\frac{1}{4}x$; the result is the elliptic curve
$$y^2 = x^3 - 4x. \tag{3.13}$$
Since (3.10) has only trivial solutions, the only Q-rational solutions of
(3.13) are oo, (0,0), and (±2,0). We shall prove this result directly in
§IV.7.
For future reference let us record the composite transformations from
(3.11) to (3.13) and back. If $(X,Y)$ satisfies $Y^2 = X^4 + 1$ and $(x,y)$
satisfies $y^2 = x^3 - 4x$, the relationships are
$$\left\{\begin{aligned} X &= \frac{y}{2x} \\ Y &= \frac{y^2 + 8x}{4x^2} \end{aligned}\right. \qquad\text{and}\qquad \left\{\begin{aligned} x &= \frac{2}{Y - X^2} \\ y &= \frac{4X}{Y - X^2}. \end{aligned}\right. \tag{3.14}$$
The quartic Fermat equation $x^4 + y^4 = z^4$ can be treated also as
$u^4 + v^2 = w^4$. This equation becomes $y^2 = x^4 - 1$, and an analysis like
the one above reduces it to the elliptic curve
$$y^2 = x^3 + 4x. \tag{3.15}$$
The Q-rational solutions of (3.15) are $\infty$, $(0,0)$, and $(2, \pm 4)$.
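Since (3.11) has very few rational points, a convenient way to test the transformations between $Y^2 = X^4 + 1$ and $y^2 = x^3 - 4x$ is to run them over a small finite field. The sketch below does this over $\mathbb{F}_p$ for an odd prime $p$. It is only a sanity check under stated assumptions: the function name is hypothetical, the formulas used are $x = 2/(Y - X^2)$, $y = 4X/(Y - X^2)$ and $X = y/(2x)$, $Y = (y^2 + 8x)/(4x^2)$, and the modular inverse via `pow(a, -1, p)` needs Python 3.8 or later.

```python
def check_quartic_to_cubic(p):
    """Map every affine point of Y^2 = X^4 + 1 over F_p to y^2 = x^3 - 4x and back.

    Returns the number of points checked; raises AssertionError on any mismatch.
    """
    def inv(a):
        return pow(a, -1, p)                 # inverse modulo p (a must be nonzero mod p)

    count = 0
    for X in range(p):
        for Y in range(p):
            if (Y * Y - X ** 4 - 1) % p != 0:
                continue
            s = (Y - X * X) % p              # nonzero because (Y - X^2)(Y + X^2) = 1
            x = 2 * inv(s) % p
            y = 4 * X * inv(s) % p
            assert (y * y - x ** 3 + 4 * x) % p == 0            # lands on the cubic
            assert y * inv(2 * x) % p == X                      # inverse map, X
            assert (y * y + 8 * x) * inv(4 * x * x) % p == Y    # inverse map, Y
            count += 1
    return count

print(check_quartic_to_cubic(13))
```

The denominator $Y - X^2$ never vanishes on the quartic, which is why the forward map has no exceptional affine points.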
E. Fermat's problem for Mersenne.
In 1643 Fermat posed to Mersenne the problem of finding a
relatively prime integer-valued Pythagorean triple (X} Y, Z) such that the
hypotenuse Z is a square and the sum of the legs is a square. The answer
that Fermat had in mind is
X = 1061652293520
Y = 4565486027761 (3.16)
Z = 4687298610289,
and this is the smallest nontrivial solution.
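The three defining properties of (3.16) are easy to confirm with integer arithmetic. The short check below uses only the Python standard library; the helper name `is_square` is an illustrative choice.

```python
from math import gcd, isqrt

X = 1061652293520
Y = 4565486027761
Z = 4687298610289

def is_square(n):
    """True if the integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

print(X * X + Y * Y == Z * Z)      # Pythagorean triple
print(gcd(X, Y) == 1)              # relatively prime legs
print(is_square(Z), isqrt(Z))      # hypotenuse is a square
print(is_square(X + Y), isqrt(X + Y))  # sum of the legs is a square
```

The square roots printed here are the integers called $b$ and $a$ in the algebraic reformulation that follows.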
In algebraic terms, we are given $X^2 + Y^2 = Z^2$, $Z = b^2$, $X + Y = a^2$.
If we write $e = X - Y$, then $X$ and $Y$ are $\frac{1}{2}(a^2 \pm e)$. Thus
$$b^4 = Z^2 = X^2 + Y^2 = \tfrac{1}{2}(a^4 + e^2)$$
and
$$e^2 = 2b^4 - a^4. \tag{3.17}$$
Dividing by $a^4$, we are led to the affine curve
$$v^2 = 2u^4 - 1. \tag{3.18}$$
If $(u,v)$ satisfies (3.18) and if $(x,y)$ is defined by
$$\begin{aligned}
x &= \frac{2(v + 2u^2 - 1)}{(u - 1)^2} \\
y &= \frac{4[(2u - 1)v + 2u^3 - 1]}{(u - 1)^3},
\end{aligned} \tag{3.19}$$
then $(x,y)$ satisfies
$$y^2 = x^3 + 8x. \tag{3.20}$$
Conversely if $(x,y)$ satisfies (3.20), then $(u,v)$ may be recovered from
$$\begin{aligned}
u &= \frac{y - 2x - 8}{y - 4x + 8} \\
v &= \frac{y^2 - 24x^2 + 48y - 16x - 64}{(y - 4x + 8)^2}
\end{aligned} \tag{3.21}$$
so as to satisfy (3.18). For example, $(x,y) = (0,0)$, $(1,3)$, and $(1,-3)$
correspond to $(u,v) = (-1,-1)$, $(-1,1)$, and $(-13,-239)$. But there
are some singularities: $(u,v) = (13,239)$ maps to $(x,y) = (8,24)$, but
$(x,y) = (8,24)$ maps to a removable singularity. Also $(x,y) = (8,-24)$
maps to $(u,v) = (1,-1)$, but $(u,v) = (1,-1)$ maps to a removable
singularity. Large points $(x,y)$ correspond to points near $(u,v) = (1,1)$,
and large points $(u,v)$ correspond to points near $x = 4 \pm \sqrt{8}$.
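The correspondences just listed can be replayed with exact rational arithmetic. The sketch below encodes the two changes of variables as the hypothetical functions `to_cubic` (from $v^2 = 2u^4 - 1$ to $y^2 = x^3 + 8x$) and `to_quartic` (the reverse direction), returning `None` where a denominator vanishes. The formulas used are $x = 2(v + 2u^2 - 1)/(u-1)^2$, $y = 4[(2u-1)v + 2u^3 - 1]/(u-1)^3$, $u = (y - 2x - 8)/(y - 4x + 8)$, and $v = (y^2 - 24x^2 + 48y - 16x - 64)/(y - 4x + 8)^2$.

```python
from fractions import Fraction

def to_cubic(u, v):
    """Send (u, v) on v^2 = 2u^4 - 1 to (x, y) on y^2 = x^3 + 8x; None if u = 1."""
    u, v = Fraction(u), Fraction(v)
    if u == 1:
        return None
    x = 2 * (v + 2 * u ** 2 - 1) / (u - 1) ** 2
    y = 4 * ((2 * u - 1) * v + 2 * u ** 3 - 1) / (u - 1) ** 3
    return (x, y)

def to_quartic(x, y):
    """Send (x, y) on y^2 = x^3 + 8x to (u, v) on v^2 = 2u^4 - 1; None if y - 4x + 8 = 0."""
    x, y = Fraction(x), Fraction(y)
    denom = y - 4 * x + 8
    if denom == 0:
        return None
    u = (y - 2 * x - 8) / denom
    v = (y * y - 24 * x * x + 48 * y - 16 * x - 64) / denom ** 2
    return (u, v)

for point in [(0, 0), (1, 3), (1, -3), (8, -24), (8, 24)]:
    print(point, "->", to_quartic(*point))
print((13, 239), "->", to_cubic(13, 239))
```

The two `None` cases correspond to the removable singularities mentioned in the text, at $(x,y) = (8,24)$ and at $(u,v) = (1,-1)$.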
We shall not be concerned at this time with how to obtain the
transformations (3.19) and (3.21). What we shall see, apart from the
singularities, is that a Q-rational point on the elliptic curve (3.20) leads to a
Q-rational solution $(u,v)$ of (3.18), hence to an integral solution $(a,b,e)$
of (3.17) with $a$ and $b$ relatively prime. Then $a$ and $b$ must both be odd,
and so must $e$. We can thus define an integer tuple $(X,Y,Z)$ by
$$X = \tfrac{1}{2}(a^2 + e), \qquad Y = \tfrac{1}{2}(a^2 - e), \qquad Z = b^2, \tag{3.22}$$
and the only question will be whether $X$ and $Y$ are positive.
Unfortunately small solutions like $(a,b,e) = (1,13,239)$ lead to $Y$ negative.
We shall return to Fermat's solution to the problem—and to the
interpretation in terms of elliptic curves—in Chapter IV.
2. Weierstrass Form, Discriminant, j-invariant
A cubic over $k$ in Weierstrass form is given projectively by
$$y^2w + a_1xyw + a_3yw^2 = x^3 + a_2x^2w + a_4xw^2 + a_6w^3, \tag{3.23a}$$
with coefficients in $k$. We know that the only $k$-rational point on the
line at infinity $w = 0$ is $(x,y,w) = (0,1,0)$. Moreover, we saw in §II.4
that $(0,1,0)$ is a nonsingular point and is an inflection point, the tangent
line being the line at infinity. Since the behavior at $(0,1,0)$ is so well
understood, we can study much of the behavior of the curve by working
with the affine form
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6. \tag{3.23b}$$
This form has the advantage that the notation is simpler. The
significance of the subscripts will be explained later in this section.
In the examples in §1, specific plane curves of the form (3.23b)
simplified under changes of variables. Actually these changes of variables
are valid more generally, under only an assumption on the characteristic
of k. The expressions for how the coefficients of (3.23b) are affected by
these changes of variables do not depend on the characteristic and are
as follows. The notation is absolutely standard:
    $b_2 = a_1^2 + 4a_2$
    $b_4 = 2a_4 + a_1a_3$
    $b_6 = a_3^2 + 4a_6$
    $b_8 = a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2$    (3.24)
and
    $c_4 = b_2^2 - 24b_4$
    $c_6 = -b_2^3 + 36b_2b_4 - 216b_6.$    (3.25)
The first simplification of (3.23b) assumes that $\operatorname{char}(k) \neq 2$. We
complete the square in (3.23b), replacing $y + \tfrac{1}{2}(a_1x + a_3)$ by $\tfrac{1}{2}y$, and the
result is
    $y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6$    (3.26)
with $b_2, b_4, b_6$ as in (3.24). (The coefficient $b_8$ will play a role later in
this section and also in §4.)
The second simplification assumes in addition that $\operatorname{char}(k) \neq 3$. We
replace $(x, y)$ in (3.26) by $\left(\frac{x - 3b_2}{36}, \frac{y}{108}\right)$, and the result is
    $y^2 = x^3 - 27c_4x - 54c_6$    (3.27)
with $c_4$ and $c_6$ as in (3.25).
Although our chief interest is in elliptic curves over Q, it is not enough
to consider only equations (3.27). The reason lies in what happens when
we reduce curves over Q modulo a prime p. The following example
illustrates.
EXAMPLE. Cubic case of Fermat's Last Theorem. The Fermat
equation (3.3) led to the Weierstrass form in (3.5):
    $y^2 - 9y = x^3 - 27.$    (3.28)
Further changes of variables then led to the equation
    $y^2 = x^3 - 2^4 3^3,$    (3.29)
which is of the form (3.27). Following Theorem 3.2, we shall see that
(3.28) is nonsingular modulo all primes but 3, while (3.29) is nonsingular
modulo all primes but 2 and 3. Thus the passage from (3.28) to (3.29)
loses information that might be obtained by reduction modulo 2.
For any field $k$, we introduce the discriminant $\Delta$ of the curve (3.23b)
or (3.26) by the formula
    $\Delta = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6$    (3.30)
with $b_2, b_4, b_6, b_8$ as in (3.24). When $\operatorname{char}(k)$ is not 2 or 3, we can solve
for $\Delta$ from the relation
    $1728\Delta = c_4^3 - c_6^2.$    (3.31)
WARNING. If (3.27) is used as our original equation, its $\Delta$ is
$2^6 3^9(c_4^3 - c_6^2)$, and this is off by a factor of $2^{12}3^{12}$ from $\Delta$ in (3.31).
The factor is introduced by the scaling in passing from (3.23) to (3.26)
and then (3.27).
Theorem 3.2. The cubic (3.23) is singular if and only if $\Delta = 0$.
This theorem will be proved later in this section. By way of examples,
note that $\Delta = -3^9$ for (3.28), and $\Delta = -2^{12}3^9$ for (3.29), so that (3.29)
is singular modulo 2 but (3.28) is not. To get at the proof of the theorem,
we introduce the discriminant of a cubic polynomial in one variable.
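The quantities in (3.24), (3.25), and (3.30) are easy to compute mechanically. The following Python sketch does so for the two curves of the example; the function name `weierstrass_invariants` and the layout of its return value are illustrative choices rather than notation from the text, and integer coefficients are assumed.

```python
def weierstrass_invariants(a1, a2, a3, a4, a6):
    """Return (b2, b4, b6, b8, c4, c6, delta) as in (3.24), (3.25), (3.30)."""
    b2 = a1**2 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3**2 + 4*a6
    b8 = a1**2*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3**2 - a4**2
    c4 = b2**2 - 24*b4
    c6 = -b2**3 + 36*b2*b4 - 216*b6
    delta = -b2**2*b8 - 8*b4**3 - 27*b6**2 + 9*b2*b4*b6
    return b2, b4, b6, b8, c4, c6, delta

# (3.28): y^2 - 9y = x^3 - 27, so a3 = -9 and a6 = -27.
curve_328 = weierstrass_invariants(0, 0, -9, 0, -27)
# (3.29): y^2 = x^3 - 2^4 3^3, so a6 = -432.
curve_329 = weierstrass_invariants(0, 0, 0, 0, -432)

for label, inv in (("(3.28)", curve_328), ("(3.29)", curve_329)):
    c4, c6, delta = inv[4], inv[5], inv[6]
    # Relation (3.31) is an identity in the coefficients, so it holds over Z.
    print(label, "Delta =", delta, "| (3.31) holds:", 1728*delta == c4**3 - c6**2)
```

The two discriminants come out as $-3^9$ and $-2^{12}3^9$, in agreement with the values quoted after Theorem 3.2.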
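Let
    $f(x) = x^3 - \alpha x^2 + \beta x - \gamma = (x - r_1)(x - r_2)(x - r_3)$    (3.32)
be a monic cubic polynomial over $k$ with roots in $\bar{k}$. Here $\alpha$, $\beta$, and $\gamma$
are given by the elementary symmetric polynomials
    $\alpha = r_1 + r_2 + r_3, \quad \beta = r_1r_2 + r_1r_3 + r_2r_3, \quad \gamma = r_1r_2r_3.$
We can check that
    $\det\begin{pmatrix} 1 & 1 & 1 \\ r_1 & r_2 & r_3 \\ r_1^2 & r_2^2 & r_3^2 \end{pmatrix} = (r_3 - r_2)(r_3 - r_1)(r_2 - r_1)$    (3.33a)
and
    $\begin{pmatrix} 1 & 1 & 1 \\ r_1 & r_2 & r_3 \\ r_1^2 & r_2^2 & r_3^2 \end{pmatrix}\begin{pmatrix} 1 & r_1 & r_1^2 \\ 1 & r_2 & r_2^2 \\ 1 & r_3 & r_3^2 \end{pmatrix} = \begin{pmatrix} 3 & \sigma_1 & \sigma_2 \\ \sigma_1 & \sigma_2 & \sigma_3 \\ \sigma_2 & \sigma_3 & \sigma_4 \end{pmatrix},$    (3.33b)
where $\sigma_i = r_1^i + r_2^i + r_3^i$ for $1 \le i \le 4$. The discriminant $d$ of $f(x)$ is
given by
    $d = (r_1 - r_2)^2(r_1 - r_3)^2(r_2 - r_3)^2.$    (3.34)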
Proposition 3.3. The discriminant $d$ of the polynomial $f(x)$ in (3.32)
is given by
    $d = \det\begin{pmatrix} 3 & \sigma_1 & \sigma_2 \\ \sigma_1 & \sigma_2 & \sigma_3 \\ \sigma_2 & \sigma_3 & \sigma_4 \end{pmatrix},$
where
    $\sigma_1 = \alpha$
    $\sigma_2 = \alpha^2 - 2\beta$
    $\sigma_3 = \alpha^3 - 3\alpha\beta + 3\gamma$
    $\sigma_4 = \alpha^4 - 4\alpha^2\beta + 2\beta^2 + 4\alpha\gamma.$
Proof. The determinant formula follows from (3.33) and (3.34).
Then $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ are symmetric polynomials in $r_1, r_2, r_3$ and hence are
polynomials in the elementary symmetric polynomials $\alpha, \beta, \gamma$. There is
an algorithm for finding the polynomials in $\alpha, \beta, \gamma$, and application of it
yields the above expressions for $\sigma_1, \sigma_2, \sigma_3, \sigma_4$. These expressions can be
verified by direct computation.
Corollary 3.4. For the cubic polynomial $f(x) = x^3 + px + q$, the
discriminant is $d = -4p^3 - 27q^2$.
Proof. This is the special case of Proposition 3.3 in which $\alpha = 0$,
$\beta = p$, and $\gamma = -q$.
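Proposition 3.3 and Corollary 3.4 can be spot-checked numerically. The sketch below compares the definition (3.34) with the determinant of power sums for a cubic with chosen integer roots; the function names are ours, and the sample roots $(1,2,4)$ and $(1,2,-3)$ are arbitrary illustrations.

```python
def disc_from_roots(r1, r2, r3):
    """Definition (3.34): product of the squared root differences."""
    return ((r1 - r2) * (r1 - r3) * (r2 - r3)) ** 2

def disc_from_power_sums(alpha, beta, gamma):
    """Proposition 3.3 for f(x) = x^3 - alpha*x^2 + beta*x - gamma."""
    s1 = alpha
    s2 = alpha**2 - 2*beta
    s3 = alpha**3 - 3*alpha*beta + 3*gamma
    s4 = alpha**4 - 4*alpha**2*beta + 2*beta**2 + 4*alpha*gamma
    # Cofactor expansion of det [[3, s1, s2], [s1, s2, s3], [s2, s3, s4]].
    return (3 * (s2*s4 - s3*s3)
            - s1 * (s1*s4 - s3*s2)
            + s2 * (s1*s3 - s2*s2))

def disc_depressed(p, q):
    """Corollary 3.4 for f(x) = x^3 + p*x + q."""
    return -4*p**3 - 27*q**2

# Roots 1, 2, 4: alpha = 7, beta = 14, gamma = 8.
print(disc_from_roots(1, 2, 4), disc_from_power_sums(7, 14, 8))
# Roots 1, 2, -3 sum to 0, so f(x) = x^3 - 7x + 6 with p = -7, q = 6.
print(disc_from_roots(1, 2, -3), disc_depressed(-7, 6))
```

In each printed pair the two numbers agree, as the proposition and its corollary predict.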
Example. Cubic polynomial $f(x)$ defined over R. The discriminant
$d$ is 0 if and only if $f$ has a repeated root. If the three roots are real,
then $d \ge 0$. If $f$ has one real root $r_1$ and one pair of complex conjugate
roots $r_2$ and $r_3 = \bar{r}_2$, then $(r_1 - r_2)(r_1 - \bar{r}_2)$ is real and $(r_2 - \bar{r}_2)$ is
imaginary; since $d$ is the square of the product, $d$ is $< 0$.
In the general case, $d = 0$ for the cubic polynomial in (3.32) if and
only if at least two of the roots are equal. For a cubic polynomial that is
not monic, we define $d$ to be the same as for the multiple that is monic.
If we then replace $x$ by $x/C$ in a cubic, the discriminant gets multiplied
by $C^6$ (since each root gets multiplied by $C$).
The relevance of the discriminant $d$ to detecting singularities of cubics
in Weierstrass form is as follows.
Proposition 3.5. If $C$ is a nonzero element of $k$ and if $\operatorname{char}(k) \neq 2$,
then the plane curve
    $y^2 = C(x^3 - \alpha x^2 + \beta x - \gamma)$    (3.35)
is nonsingular if and only if $f(x) = C(x^3 - \alpha x^2 + \beta x - \gamma)$ has distinct
roots in $\bar{k}$.
Proof. Since §II.4 showed that there can be no singularity on the
line at infinity, Proposition 2.6 says that the curve is singular if and
only if there exists a $\bar{k}$-rational point $(x_0, y_0, 1)$ on the curve where the
following three equations are all satisfied:
    $\partial/\partial x$:   $0 = 3x_0^2 - 2\alpha x_0 + \beta$    (3.36a)
    $\partial/\partial y$:   $2y_0 = 0$    (3.36b)
    $\partial/\partial w$:   $y_0^2 = C(-\alpha x_0^2 + 2\beta x_0 - 3\gamma).$    (3.36c)
Equations (3.35), (3.36a), and (3.36b) are equivalent with
    $0 = y_0 = f(x_0) = f'(x_0),$    (3.37)
and (3.36c) is redundant, giving the extra condition $3f(x_0) - x_0f'(x_0) = 0$.
Thus the only candidates for singular points over $\bar{k}$ are $(x_0, 0, 1)$,
where $x_0$ is a root of $f$, and such a candidate $(x_0, 0, 1)$ is singular if and
only if $x_0$ is a multiple root of $f$.
Proposition 3.6. If $\operatorname{char}(k) \neq 2$, let $d_b$ and $d_c$ be the discriminants of
the cubic polynomials on the right sides of (3.26) and (3.27), respectively.
Then
    $d_c = 2^{12}3^{12}d_b$    (3.38a)
and
    $\Delta = 2^4 d_b.$    (3.38b)
Proof if $\operatorname{char}(k) \neq 3$. Apart from translations, (3.27) is obtained
from (3.26) by replacing $x$ by $x/C$ with $C = 6^2$, and we have seen that
the effect on discriminants is to multiply them by $C^6$. Thus (3.38a)
follows. By Corollary 3.4 and (3.31), we have
    $d_c = -4(-27c_4)^3 - 27(-54c_6)^2 = 2^2 \cdot 3^9 \cdot 12^3 \Delta.$
Then (3.38b) follows from (3.38a).
Proof if $\operatorname{char}(k) = 3$. It is immediate from Corollary 3.4 that
$d_c = 0$, and hence (3.38a) is valid. The discriminant $d_b$ of (3.26) is the
same as that of (3.35) with
    $\alpha = -\frac{b_2}{4}, \quad \beta = \frac{b_4}{2}, \quad \gamma = -\frac{b_6}{4}.$
Since $3 = 0$, Proposition 3.3 says that
    $d_b = 2\sigma_1\sigma_2\sigma_3 - \sigma_2^3 - \sigma_1^2\sigma_4 = -\sigma_1\sigma_2\sigma_3 - \sigma_2^3 - \sigma_1^2\sigma_4,$
where
    $\sigma_1 = \alpha$
    $\sigma_2 = \alpha^2 + \beta$
    $\sigma_3 = \alpha^3$
    $\sigma_4 = \alpha^4 - \alpha^2\beta - \beta^2 + \alpha\gamma$
with
    $\alpha = -b_2, \quad \beta = -b_4, \quad \gamma = -b_6.$
Substituting gives
    $d_b = -\beta^3 + \alpha^2\beta^2 - \alpha^3\gamma = b_4^3 + b_2^2b_4^2 - b_2^3b_6.$    (3.39)
Meanwhile, in any characteristic we can check that
    $4b_8 = b_2b_6 - b_4^2,$    (3.40)
an equality that was already used to show that (3.30) implies (3.31).
Therefore in characteristic 3,
    $\Delta = -b_2^2b_8 + b_4^3 = -b_2^3b_6 + b_2^2b_4^2 + b_4^3.$
Comparing this equation with (3.39), we see that $\Delta = d_b$, and (3.38b)
follows.
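The two relations of Proposition 3.6 can be tested in characteristic 0 with exact rational arithmetic. In the sketch below, `cubic_disc` uses the standard closed form for the discriminant of a monic cubic (an assumption brought in from outside the text, though it agrees with Corollary 3.4 when $\alpha = 0$), and the sample curves are arbitrary choices.

```python
from fractions import Fraction

def cubic_disc(alpha, beta, gamma):
    """Discriminant of x^3 - alpha*x^2 + beta*x - gamma (standard closed form)."""
    return (alpha**2 * beta**2 - 4*beta**3 - 4*alpha**3*gamma
            - 27*gamma**2 + 18*alpha*beta*gamma)

def compare_discriminants(a1, a2, a3, a4, a6):
    """Return (Delta, d_b, d_c) for the curve (3.23b) with integer coefficients."""
    b2 = a1**2 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3**2 + 4*a6
    b8 = a1**2*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3**2 - a4**2
    c4 = b2**2 - 24*b4
    c6 = -b2**3 + 36*b2*b4 - 216*b6
    delta = -b2**2*b8 - 8*b4**3 - 27*b6**2 + 9*b2*b4*b6
    # Monic multiple of the right side of (3.26): x^3 + (b2/4)x^2 + (b4/2)x + b6/4.
    d_b = cubic_disc(Fraction(-b2, 4), Fraction(b4, 2), Fraction(-b6, 4))
    # Right side of (3.27): x^3 - 27*c4*x - 54*c6, so beta = -27*c4, gamma = 54*c6.
    d_c = cubic_disc(0, -27*c4, 54*c6)
    return delta, d_b, d_c

for coeffs in [(0, 0, 1, -1, 0), (1, -1, 1, 0, 0), (0, -1, 1, 0, 0)]:
    delta, d_b, d_c = compare_discriminants(*coeffs)
    print(coeffs, "(3.38b):", delta == 2**4 * d_b, "(3.38a):", d_c == 2**12 * 3**12 * d_b)
```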
Proof of Theorem 3.2.
Case 1. Suppose $\operatorname{char}(k) \neq 2$. Then (3.23) is singular if and only if
(3.26) is singular, if and only if the right side of (3.26) has a repeated
root (by Proposition 3.5), if and only if $d_b = 0$, if and only if $\Delta = 0$ (by
(3.38b)).
Case 2. Suppose $\operatorname{char}(k) = 2$. Then $\Delta$ reduces to
    $\Delta = b_2^2b_8 + b_6^2 + b_2b_4b_6$
    $\quad = a_1^6a_6 + a_1^5a_3a_4 + a_1^4a_2a_3^2 + a_1^4a_4^2 + a_3^4 + a_1^3a_3^3.$    (3.41)
Meanwhile, just as in the first paragraph of the proof of Proposition 3.5,
(3.23) can have singularities only at $\bar{k}$-rational points $(x_0, y_0, 1)$ on the
curve, and it has a singularity at such a point if and only if
    $\partial/\partial x$:   $0 = a_1y_0 + x_0^2 + a_4$    (3.42a)
    $\partial/\partial y$:   $0 = a_1x_0 + a_3$    (3.42b)
    $\partial/\partial w$:   $0 = y_0^2 + a_1x_0y_0 + a_2x_0^2 + a_6.$    (3.42c)
Equation (3.42c) is redundant, being the sum of the curve, $x_0$ times
(3.42a), and $y_0$ times (3.42b).
Suppose $a_1 = 0$. Then $\Delta = 0$ if and only if $a_3 = 0$, if and only
if (3.42b) holds. To complete this case, it is enough to show that the
system
    $y_0^2 = x_0^3 + a_2x_0^2 + a_4x_0 + a_6$
    $0 = x_0^2 + a_4$
has a solution in $\bar{k}$. But we have only to choose $x_0 \in \bar{k}$ so that the second
equation holds, substitute into the first equation, and choose $y_0 \in \bar{k}$ so
that the first equation holds.
Suppose $a_1 \neq 0$. Then (3.42b) and (3.42a) successively give
    $x_0 = a_1^{-1}a_3$  and  $y_0 = a_1^{-3}a_3^2 + a_1^{-1}a_4.$
Substitution of these values for $x$ and $y$ in the difference of the two sides
of (3.23b) gives
    $(a_1^{-6}a_3^4 + a_1^{-2}a_4^2) + (a_1^{-3}a_3^3 + a_1^{-1}a_3a_4) + (a_1^{-3}a_3^3 + a_1^{-1}a_3a_4)$
    $\quad + a_1^{-3}a_3^3 + a_1^{-2}a_2a_3^2 + a_1^{-1}a_3a_4 + a_6,$
and (3.41) says this is just $a_1^{-6}\Delta$. Thus $(x_0, y_0)$ satisfies (3.23b), yielding
$(x_0, y_0, 1)$ as a singular point, if and only if $\Delta = 0$. This completes the
proof of the theorem.
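Theorem 3.2 can be illustrated on the curves (3.28) and (3.29) reduced modulo small primes. The sketch below searches by brute force for affine singular points over the prime field $F_p$ and compares the outcome with $\Delta$ modulo $p$. It only sees points with coordinates in $F_p$, which suffices here because Proposition 3.10 shows that over a finite field the singular point is rational. The helper names and the list of primes are our own choices.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Delta as in (3.30), computed over the integers."""
    b2 = a1**2 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3**2 + 4*a6
    b8 = a1**2*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3**2 - a4**2
    return -b2**2*b8 - 8*b4**3 - 27*b6**2 + 9*b2*b4*b6

def affine_singular_points(coeffs, p):
    """Points (x, y) over F_p on the reduced curve where both partials vanish."""
    a1, a2, a3, a4, a6 = (c % p for c in coeffs)
    found = []
    for x in range(p):
        for y in range(p):
            curve = (y*y + a1*x*y + a3*y - x**3 - a2*x*x - a4*x - a6) % p
            d_dx = (a1*y - 3*x*x - 2*a2*x - a4) % p
            d_dy = (2*y + a1*x + a3) % p
            if curve == 0 and d_dx == 0 and d_dy == 0:
                found.append((x, y))
    return found

curves = {"(3.28)": (0, 0, -9, 0, -27), "(3.29)": (0, 0, 0, 0, -432)}
for label, coeffs in curves.items():
    delta = discriminant(*coeffs)
    for p in (2, 3, 5, 7):
        pts = affine_singular_points(coeffs, p)
        print(label, "p =", p, "Delta mod p =", delta % p, "singular points:", pts)
```

A singular point appears exactly when $\Delta \equiv 0 \pmod p$: only at $p = 3$ for (3.28), and at $p = 2$ and $p = 3$ for (3.29).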
An admissible change of variables in a Weierstrass equation
(3.23b) is one of the form
    $x = u^2x' + r$  and  $y = u^3y' + su^2x' + t$    (3.43a)
with $u, r, s, t$ in $k$ and $u \neq 0$. In matrix form it is given by
    $\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \Phi^{-1}\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}$  with  $\Phi^{-1} = \begin{pmatrix} u^2 & 0 & r \\ su^2 & u^3 & t \\ 0 & 0 & 1 \end{pmatrix}.$    (3.43b)
It fixes $[(0,1,0)]$ and carries the tangent $w = 0$ to the same line. The
cubic $F$ in Weierstrass form gets carried to the same curve as a cubic
$F^*$ still in Weierstrass form. Up to a constant, admissible changes of
variables are the most general linear transformations with these
properties. We have already used such changes of variables on several occasions
to normalize Weierstrass equations in various ways. Two elliptic curves
that are related by an admissible change of variables are said to be
isomorphic.
Under the change of variables (3.43) and the normalization that makes
the coefficients of $wy^2$ and $x^3$ be 1, let $F$ get carried to $F^*$. In the
special case that $r = s = t = 0$, this passage $F \to F^*$ multiplies the
coefficients $a_1, a_3, a_2, \ldots, b_2, \ldots, c_4, c_6$ by powers of $u$. The significance
of a subscript $i$ is that that coefficient has been multiplied by $u^{-i}$. We
say that the coefficient has weight $i$.
More generally when $r, s, t$ are not all 0, the above passage $F \to F^*$
maps the coefficients $a_1, a_3, a_2, \ldots, b_2, \ldots, c_4, c_6$ into new coefficients
$a_1', a_3', a_2', \ldots, b_2', \ldots, c_4', c_6'$. Then a primed coefficient is $u^{-i}$ times an
expression in $r, s, t, a_1, \ldots, c_6$ that does not involve $u$. In the case of $c_4$
and $c_6$, we have $c_4' = u^{-4}c_4$ and $c_6' = u^{-6}c_6$ with $r, s, t$ not involved.
When coefficients are multiplied, their weights add. Linear
combinations of terms with the same weight again have that weight. In this
sense the discriminant $\Delta$ has weight 12. Since $\Delta$ can be expressed in
terms of $c_4$ and $c_6$ (at least when $k$ has characteristic not equal to 2 or
3), $\Delta$ is unaffected by $r$, $s$, and $t$.
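The notion of weight can be checked in the special case $r = s = t = 0$, where each $a_i$ is replaced by $u^{-i}a_i$. The sketch below applies such a scaling with an arbitrary rational $u$ and confirms that $c_4$, $c_6$, and $\Delta$ pick up the factors $u^{-4}$, $u^{-6}$, and $u^{-12}$. The helper names, the value of $u$, and the sample coefficients are illustrative assumptions.

```python
from fractions import Fraction

def c4_c6_delta(a1, a2, a3, a4, a6):
    """Return (c4, c6, Delta) from (3.24), (3.25), (3.30)."""
    b2 = a1**2 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3**2 + 4*a6
    b8 = a1**2*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3**2 - a4**2
    c4 = b2**2 - 24*b4
    c6 = -b2**3 + 36*b2*b4 - 216*b6
    delta = -b2**2*b8 - 8*b4**3 - 27*b6**2 + 9*b2*b4*b6
    return c4, c6, delta

def scale(coeffs, u):
    """Coefficients after x = u^2 x', y = u^3 y' (r = s = t = 0): a_i -> u^(-i) a_i."""
    a1, a2, a3, a4, a6 = coeffs
    return (a1 / u, a2 / u**2, a3 / u**3, a4 / u**4, a6 / u**6)

u = Fraction(2, 3)
coeffs = (2, -3, 5, 7, -1)
c4, c6, delta = c4_c6_delta(*coeffs)
c4_new, c6_new, delta_new = c4_c6_delta(*scale(coeffs, u))
print(c4_new == c4 / u**4, c6_new == c6 / u**6, delta_new == delta / u**12)
```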
    Elliptic Curve                            $\Delta$    $c_4$      $c_6$
    $y^2 + y = x^3 - x^2$                        -11       16       -152
    $y^2 + xy = x^3 - 2x^2 + x$                  -15        1       -161
    $y^2 + y = x^3 + x^2 + x$                    -19      -32          8
    $y^2 + xy + y = x^3$                         -26      -23       -181
    $y^2 + y = x^3$                              -27        0       -216
    $y^2 + xy - y = x^3$                         -28       25       -253
    $y^2 + y = x^3 + x^2 - x$                    -35       64       -568
    $y^2 + xy = x^3 + x^2 + x$                   -39      -23        235
    $y^2 + y = x^3 + x^2$                        -43       16       -280
    $y^2 = x^3 - x^2 + x$                        -48      -32       -224
    $y^2 = x^3 + x^2 + x$                        -48      -32        224
    $y^2 + xy + y = x^3 - x^2$                   -53      -15       -297
    $y^2 + 3xy - y = x^3$                        -54      153      -1917
    $y^2 + xy = x^3 - x^2 + x$                   -55      -39       -189
    $y^2 + xy = x^3 - 2x + 1$                    -61       97      -1009
    $y^2 + xy = x^3 + x$                         -63      -47         71
    $y^2 = x^3 + x$                              -64      -48          0
    $y^2 + y = x^3 - 5x^2 - 4x - 1$              -67      592      14408
    $y^2 + 3xy + y = x^3 - x^2$                  -83      -47        199
    $y^2 + xy - y = x^3 + x^2$                   -89       49       -521
    $y^2 + y = x^3 + x$                          -91      -48       -216
    $y^2 + 4xy - y = x^3$                        -91      352      -6616
Table 3.1. Some elliptic curves with small negative discriminant
To get some feeling for elliptic curves (i.e., nonsingular cubics (3.23))
without relying on classical examples or random selection, one can look
for elliptic curves of small discriminant. These, it turns out, are an
infrequent occurrence. They remain nonsingular, when reduced modulo
$p$, for all but a small number of primes $p$, and thus they supply a good
source of examples for many purposes. Tables 3.1 and 3.2 list, up to
admissible changes of variables over Q, all elliptic curves with $|\Delta| < 100$
whose coefficients are all integers $\le 9$ in absolute value. The triple
$(\Delta, c_4, c_6)$ can be used to detect equivalence under admissible changes
of variables over Q: It is always true that two elliptic curves over Q
with the same $(\Delta, c_4, c_6)$ are related by a transformation (3.43), since
the changes of variables that pass from (3.23) to (3.27) can both be
inverted. Conversely if two elliptic curves are related by a transformation
(3.43), their discriminants stand in the ratio of $u^{12}$, and this is possible
with integer coefficients and with $|\Delta| < 100$ only when $u = \pm 1$. Hence
$(\Delta, c_4, c_6)$ is a complete invariant of the equivalence class for curves
limited as in the tables.
    Elliptic Curve                            $\Delta$    $c_4$       $c_6$
    $y^2 + 7xy + 2y = x^3 + 4x^2 + x$             15     3841    -238049
    $y^2 + 3xy = x^3 + x$                         17       33        -81
    $y^2 + y = x^3 - x$                           37       48       -216
    $y^2 + 2xy - 3y = x^3 - 1$                    37      160      -2008
    $y^2 + xy = x^3 - 3x^2 + x$                   57       73        539
    $y^2 + 9xy - 9y = x^3 + 9x^2 - 5x - 3$        62    15873   -1999809
    $y^2 = x^3 - x$                               64       48          0
    $y^2 + xy = x^3 - x$                          65       49        -73
    $y^2 + xy = x^3 - x^2 - x$                    73       57        243
    $y^2 + 3xy - y = x^3 - x^2$                   79       97       -881
    $y^2 = x^3 - 2x + 1$                          80       96       -864
    $y^2 = x^3 - 2x - 1$                          80       96        864
    $y^2 = x^3 + x^2 - x$                         80       64       -352
    $y^2 = x^3 - x^2 - x$                         80       64        352
    $y^2 + xy = x^3 + x^2 - x$                    89       73       -485
    $y^2 + 5xy + y = x^3$                         98      505     -11341
Table 3.2. Some elliptic curves with small positive discriminant
For an elliptic curve the j-invariant is defined by
    $j = c_4^3/\Delta.$    (3.44)
This is well defined by Theorem 3.2, and it has weight 0. It is
unaffected by $r$, $s$, and $t$ and thus is invariant under all admissible changes
of variables.
Examples. Consider $y^2 = x^3 + px + q$ over a field with characteristic
not equal to 2 or 3. We have $c_4 = -48p$ and $c_6 = -864q$. (See the
Warning before Theorem 3.2.) By (3.31),
    $\Delta = -16(4p^3 + 27q^2).$
Thus (3.44) gives
    $j = 1728\,\frac{4p^3}{4p^3 + 27q^2}.$    (3.45)
Thus, for example, every elliptic curve $y^2 = x^3 + ax$ has $j = 1728$; this
includes those in Subsections C, D, and E of §1. Similarly every elliptic
curve $y^2 = x^3 + a$ has $j = 0$; this includes the one in Subsection B of §1.
Finally the Diophantus example in Subsection A of §1 was $y^2 = x^3 - x + 9$.
This has $p = -1$ and $q = 9$. Thus (3.45) gives $j = -\frac{2^8 \cdot 3^3}{2183}$.
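Formula (3.45) lends itself to a direct computation. The following sketch evaluates it with exact rationals for the three kinds of examples just mentioned; the function name `j_invariant` and the particular values $a = 5$ and $a = 7$ are illustrative choices.

```python
from fractions import Fraction

def j_invariant(p, q):
    """Formula (3.45) for y^2 = x^3 + p*x + q in characteristic 0."""
    p, q = Fraction(p), Fraction(q)
    denom = 4*p**3 + 27*q**2
    if denom == 0:
        # Delta = -16(4p^3 + 27q^2) vanishes, so the cubic is singular.
        raise ValueError("singular cubic: 4p^3 + 27q^2 = 0")
    return 1728 * 4 * p**3 / denom

print(j_invariant(5, 0))    # a curve y^2 = x^3 + a*x
print(j_invariant(0, 7))    # a curve y^2 = x^3 + a
print(j_invariant(-1, 9))   # the Diophantus example y^2 = x^3 - x + 9
```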
Proposition 3.7. Suppose that $k$ has characteristic not equal to 2
or 3.
(a) If two elliptic curves are related by an admissible change of
variables, then they have the same j-invariant.
(b) If $j_0 \in k$ is given, then there exists an elliptic curve over $k$ with
j-invariant $j_0$.
(c) If $k$ is algebraically closed and two elliptic curves have the same
j-invariant, then they are related by an admissible change of variables.
PROOF. (a) This was observed above.
(b) The cases $j_0 = 0$ and $j_0 = 1728$ were handled in the examples
above. For other values we can specialize the above example to
    $y^2 = x^3 - \frac{27}{4}\,\frac{j_0}{j_0 - 1728}\,x - \frac{27}{4}\,\frac{j_0}{j_0 - 1728},$
in which $p = q = -\frac{27}{4}\,\frac{j_0}{j_0 - 1728}$, and then we can apply (3.45).
(c) We can normalize the two curves to be $y^2 = x^3 + Ax + B$ and
$y^2 = x^3 + A'x + B'$, by (a). Here $A$ has weight 4 and $B$ has weight 6.
Thus if we put $y = y'/u^3$ and $x = x'/u^2$ in the first equation, we get
    $y'^2 = x'^3 + Au^4x' + Bu^6.$
What we want to arrange is that
    $A' = Au^4$  and  $B' = Bu^6.$    (3.46)
Equality of j-invariants means that
    $\frac{4A^3}{4A^3 + 27B^2} = \frac{4A'^3}{4A'^3 + 27B'^2},$
from which we see that
    $A^3B'^2 = A'^3B^2.$    (3.47)
Now we distinguish cases.
(i) $A = 0$. Then $B \neq 0$ by nonsingularity, and $A' = 0$ by (3.47). If we
take $u = (B'/B)^{1/6}$, then (3.46) holds.
(ii) $B = 0$. Then $A \neq 0$ by nonsingularity, and $B' = 0$ by (3.47). If
we take $u = (A'/A)^{1/4}$, then (3.46) holds.
(iii) $AB \neq 0$. First we take $u = (A'/A)^{1/4}$ to get $A' = Au^4$. Then
(3.47) says $B'^2 = (A'/A)^3B^2 = u^{12}B^2$. Hence $B' = Bu^6$ or $B' = -Bu^6$.
In the first case, (3.46) has been verified. In the second case we replace
$u$ by $\sqrt{-1}\,u$ and recompute to find that (3.46) holds.
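Part (b) of the proposition is constructive, and the construction can be run over Q. The sketch below builds a pair $(p, q)$ for a prescribed $j_0$ as in the proof and then checks the result against (3.45). It also checks that the rescaling (3.46) leaves $j$ unchanged. The function names, the representatives $y^2 = x^3 + 1$ and $y^2 = x^3 + x$ for the two special values, and the test values of $j_0$ and $u$ are our own assumptions.

```python
from fractions import Fraction

def j_invariant(p, q):
    """Formula (3.45) for y^2 = x^3 + p*x + q in characteristic 0."""
    p, q = Fraction(p), Fraction(q)
    return 1728 * 4 * p**3 / (4*p**3 + 27*q**2)

def curve_with_j(j0):
    """Return (p, q) such that y^2 = x^3 + p*x + q has j-invariant j0."""
    j0 = Fraction(j0)
    if j0 == 0:
        return Fraction(0), Fraction(1)   # y^2 = x^3 + 1
    if j0 == 1728:
        return Fraction(1), Fraction(0)   # y^2 = x^3 + x
    c = -Fraction(27, 4) * j0 / (j0 - 1728)
    return c, c                            # p = q, as in the proof of (b)

for j0 in (0, 1728, 1, -5, Fraction(7, 3)):
    p, q = curve_with_j(j0)
    print(j0, (p, q), j_invariant(p, q) == j0)

# (3.46): A -> A*u^4 and B -> B*u^6 do not change j.
u = Fraction(3, 5)
A, B = curve_with_j(-5)
print(j_invariant(A * u**4, B * u**6) == j_invariant(A, B))
```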
3. Group Law
For a nonsingular cubic curve with a specified $k$-rational point $O$,
we noted in Chapter I that the operation $P + Q = O \cdot PQ$, given in
terms of the chord-tangent composition law of §II.4, makes the points
on the curve into an abelian group with $O$ as identity. In particular,
the conclusion applies to elliptic curves, which are nonsingular cubics in
Weierstrass form. In this section, we shall state this result as a theorem
and give a geometric proof.
Theorem 3.8 (Poincaré). Let $F$ be a nonsingular cubic, and let $O$
be in $F(k)$. Then the operation $P + Q = O(PQ)$, given in terms of
the chord-tangent composition law, makes $F(k)$ into an abelian group
with $O$ as identity and with negatives given by $-P = (OO)P$. If $K$
is a field extension of $k$, then the inclusion $F(k) \subseteq F(K)$ is a group
homomorphism. If a different base point $O'$ is chosen, then the two
operations are related by $P +' Q = P + Q - O'$, and the group structures
are isomorphic.
The heart of the matter is to prove associativity, whose essence is
captured by the following lemma. We shall prove the lemma after showing
how the lemma implies the theorem.
Lemma 3.9. For any points $P, Q, P', Q'$ in $F(k)$,
    $(PP')(QQ') = (PQ)(P'Q').$    (3.48)
Proof that Lemma 3.9 implies Theorem 3.8. We saw in Chapter
I that $+$ is commutative, that $O$ is an identity, and that $-P$ is an additive
inverse. For associativity the lemma gives the second equality of
    $P(O(QR)) = (PQ \cdot Q)(O \cdot QR) = (PQ \cdot O)(Q \cdot QR) = (O(PQ))R.$
Thus
    $P(Q + R) = (P + Q)R.$
If we apply $O$ to both sides, we get
    $P + (Q + R) = (P + Q) + R,$
as required. Thus $F(k)$ is an abelian group under $+$. Clearly the
inclusion $F(k) \subseteq F(K)$ is a group homomorphism.
Let $O'$ be given. Then the lemma gives the second equality of
    $(P + Q) - O' = O[(O \cdot PQ)(OO \cdot O')]$
    $\quad = O[(O \cdot OO)(PQ \cdot O')]$
    $\quad = O[O(P +' Q)] = P +' Q.$    (3.49)
Define a map $\varphi: (F(k), +) \to (F(k), +')$ by $\varphi(P) = P - O'$. Then
    $\varphi(P + Q) = (P + Q) - O' = P +' Q$
by (3.49). Thus $\varphi$ is a homomorphism. It is clearly one-one and onto.
Overview of proof of Lemma 3.9. To prove (3.48) and thus
Lemma 3.9, we shall divide matters into a nondegenerate case and a
degenerate case, and we shall use quite different methods for the two
cases. The distinction between "nondegenerate" and "degenerate" will
depend on which, if any, of the 8 points
    $P, \; P', \; Q, \; Q', \; PQ, \; P'Q', \; PP', \; QQ'$
are equal to one another. Specifically we arrange these points as 8 of
the entries of a 3-by-3 matrix
    $\begin{pmatrix} P & P' & PP' \\ Q & Q' & QQ' \\ PQ & P'Q' & \end{pmatrix}.$    (3.50)
We say that we are in the nondegenerate case if none of these points
is equal to a point in a different row and column (e.g., $P$ should not
equal $Q'$, $QQ'$, or $P'Q'$). Otherwise we are in the degenerate case.
The proof in the nondegenerate case simplifies a little if the 8 points are
actually distinct, but it does not simplify much.
Proof of (3.48) in the nondegenerate case. Let $L_1', L_2', L_3'$ be
the lines that meet $F(k)$ at the points of the respective rows of (3.50),
with the usual convention that if a point appears more than once in
a row, then the line is supposed to be tangent to the curve at the
point. Let $L_1'', L_2'', L_3''$ be the similar lines determined by the columns
of (3.50). Figure 3.1 illustrates matters, showing $L_1', L_2', L_3'$ as solid lines
and $L_1'', L_2'', L_3''$ as dashed lines.
Figure 3.1. Configuration for $(PP')(QQ') = (PQ)(P'Q')$
If we are thinking of $L_1', L_2', L_3'$, it is natural to include $(PQ)(P'Q')$ as
the lower right entry of (3.50), while if we are thinking of $L_1'', L_2'', L_3''$, it
is natural to include $(PP')(QQ')$ as the lower right entry. Our objective,
of course, is to prove that these two candidates for the lower right entry
are equal. Thus, in referring to (3.50), we shall regard the lower right
entry as blank.
We claim as a consequence of our nondegeneracy assumption that
    each $L_i'$ is different from each $L_j''$.    (3.51)
Thus $L_i'(k)$ and $L_j''(k)$ meet only in the $(i,j)$th entry of (3.50) unless
$i = j = 3$. To see (3.51), first suppose $i$ and $j$ are not both 3, say
$i \neq 3$. If $L_i' = L_j''$, then the other point(s) of the $j$th column must occur
as other point(s) of the $i$th row, since the $i$th row gives all points of
$F(k) \cap L_i'(k)$ with their multiplicities, and this occurrence of a point in two
such entries of the matrix would contradict nondegeneracy. If $i = j = 3$
and $L_3' = L_3''$, then $PQ$ in $L_3'$ must occur as one of the three points
of $F(k) \cap L_3''(k)$, counting multiplicities. It might be the missing lower
right entry of (3.50), but then $P'Q'$ would have to coincide with $PP'$ or
$QQ'$, and we would again have a contradiction to nondegeneracy.
Let $T$ be the point where $L_3'$ and $L_3''$ meet; this is unique by (3.51).
We shall prove (3.48) by showing that both sides of (3.48) are equal to
$T$. In the proof we may assume that $k$ is algebraically closed. (Actually
we shall use only that $k$ is infinite.)
Let $L'$ and $L''$ be the cubics given by
    $L' = L_1'L_2'L_3'$  and  $L'' = L_1''L_2''L_3''.$    (3.52)
The main step of the proof will be to show that
    $aF + a'L' + a''L'' = 0$    (3.53)
for suitable $a, a', a''$ in $k$ with $a \neq 0$. To define these constants, let $R'$ be
a point of $L_1'(k)$ other than $P$, $P'$, and $PP'$ (possible since $k$ is infinite),
and let $R''$ be a point not in $L'(k)$ (existence by Lemma 2.1). Choose
$a, a', a''$ in $k$ not all 0 such that
    $\begin{pmatrix} F(R') & L'(R') & L''(R') \\ F(R'') & L'(R'') & L''(R'') \end{pmatrix}\begin{pmatrix} a \\ a' \\ a'' \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$    (3.54)
and put
    $F_0 = aF + a'L' + a''L''.$    (3.55)
We shall prove that $F_0 = 0$.
In doing so, we shall make repeated use of two facts about intersection
multiplicities of lines and plane curves. If $L$ is a line, $p$ is a point, and
$G_1$ and $G_2$ are plane curves, then
    $i(p, L, G_1G_2) = i(p, L, G_1) + i(p, L, G_2).$    (3.56)
If, in addition, $G_1$ and $G_2$ have the same degree and if $G_1 + G_2$ is not
0, then
    $i(p, L, G_1 + G_2) \ge \min\{i(p, L, G_1), i(p, L, G_2)\}.$    (3.57)
These are obvious from the definitions.
Thus suppose $F_0 \neq 0$, so that $F_0$ is a cubic. Let $L$ be one of the lines
$L_1', L_2', L_3', L_1'', L_2'', L_3''$, and suppose the point $p$ is repeated $N$ times in
the corresponding row or column of (3.50), with the lower right entry
not included. We shall show that
    $i(p, L, F_0) \ge N.$    (3.58)
In fact, without loss of generality, suppose $L$ is a factor of the cubic $L'$.
We have
    $i(p, L, F) \ge N$    (3.59)
by construction and by definition of the chord-tangent composition law.
In addition, Proposition 2.8 gives
    $i(p, L, L') = +\infty.$    (3.60)
The point $p$ appears in $N$ columns of (3.50) and thus lies on $\ge N$ of the
lines $L_1'', L_2'', L_3''$. By (3.56)
    $i(p, L, L'') \ge N.$    (3.61)
Putting (3.59), (3.60), and (3.61) together with the aid of (3.57), we
obtain (3.58).
Equation (3.54) implies that $F_0(R') = 0$, hence that $i(R', L_1', F_0) \ge 1$.
Adding this inequality to (3.58) for $L = L_1'$ and for $p$ equal to the distinct
points in the first row of (3.50), we obtain $\sum_p i(p, L_1', F_0) \ge 4$. From
Corollary 2.20 we conclude that $L_1'$ divides $F_0$. Let us write $F_0 = L_1'C$
for a conic $C$.
Suppose $p$ appears $N \ge 1$ times in the second row of (3.50) but not
at all in the first row. Then $i(p, L_2', L_1') = 0$ since the first row lists all
points of $F(k) \cap L_1'(k)$. And $i(p, L_2', F_0) \ge N$ by (3.58). Hence (3.56)
gives
    $i(p, L_2', C) \ge N.$    (3.62)
We shall show that (3.62) remains valid even if $p$ does appear in the
first row. In this case, $N = 1$ by the nondegeneracy assumption.
Suppose that $p$ appears $\ge 2$ times in the $j$th column, possibly including
once in the second row. By (3.58), we have $i(p, L_j'', F_0) \ge 2$. Since
$i(p, L_j'', L_1') = 1$ by (3.51), (3.56) gives $i(p, L_j'', C) \ge 1$. That is, $p$ is in
$C(k)$. Hence $i(p, L_2', C) \ge 1$, and (3.62) is proved even when $p$ appears
in the first row.
Summing (3.62) over the distinct points $p$ of the second row of (3.50),
we obtain $\sum_p i(p, L_2', C) \ge 3$. From Corollary 2.20 we conclude that $L_2'$
divides $C$. Thus we can write $F_0 = L_1'L_2'M$ for a line $M$.
Now consider the two points of the third row of (3.50). First
suppose they coincide; call their common value $p$. By nondegeneracy, $p$
cannot occur in the first or second row, so that $i(p, L_3', L_1') = 0$ and
$i(p, L_3', L_2') = 0$. Since (3.58) gives $i(p, L_3', F_0) \ge 2$, we conclude from
(3.56) that $i(p, L_3', M) \ge 2$. By Corollary 2.20, $L_3'$ divides $M$.
Next suppose that the two points of the third row of (3.50) are distinct.
Let $p$ be one of them, say the one in the $j$th column. If $p$ appears $N$
times in the $j$th column, (3.58) gives
    $i(p, L_j'', F_0) \ge N.$
On the other hand,
    $i(p, L_j'', L_1'L_2') = N - 1$
by (3.56). Another application of (3.56) then shows that $i(p, L_j'', M) \ge 1$.
Thus $p$ is in $M(k)$. Since the same argument applies equally to the other
point on the third row of (3.50), we see that $L_3'$ and $M$ have two distinct
points in common. Therefore $L_3'$ divides $M$.
Thus $L_3'$ divides $M$ in either case, and we conclude that
    $F_0 = cL_1'L_2'L_3' = cL'$
for a nonzero constant $c$. But $F_0(R'') = 0$ by (3.54), and $L'(R'') \neq 0$ by
definition of $R''$. We have arrived at a contradiction, and we conclude
that $F_0 = 0$. This proves (3.53).
We still have to show that $a \neq 0$ in (3.53). If $a = 0$, then we may
assume by symmetry that $a' \neq 0$, hence that $L'$ divides $L''$. Thus $L_1'$
divides $L_1''L_2''L_3''$, and unique factorization shows that $L_1'$ is the same line
as some $L_j''$, in contradiction to (3.51). We conclude that $a \neq 0$.
Therefore
    $F = c'L' + c''L''$    (3.63)
for suitable $c'$ and $c''$ in $k$. Recall that $T$ is the common intersection
of $L_3'$ and $L_3''$. It follows from (3.63) that $F(T) = 0$. Since $T$ is in
$F(k) \cap L_3'(k)$,
    $T$ is one of $PQ$, $P'Q'$, $(PQ)(P'Q')$.    (3.64a)
Since $T$ is in $F(k) \cap L_3''(k)$,
    $T$ is one of $PP'$, $QQ'$, $(PP')(QQ')$.    (3.64b)
Since, by nondegeneracy, neither of the first two points of (3.64a) can
equal either of the first two points of (3.64b), we conclude either that
    $(PQ)(P'Q') = T = (PP')(QQ')$    (3.65)
and we are done, or that one of the following four possibilities occurs:
    $PQ = T = (PP')(QQ')$    (3.66)
    $P'Q' = T = (PP')(QQ')$
    $PP' = T = (PQ)(P'Q')$
    $QQ' = T = (PQ)(P'Q').$
These four formulas are symmetric under permutation of notation, and
it is enough to handle the first one.
Thus suppose (3.66) holds. Consider $(PQ)(P'Q')$. This is on $L_3'$,
hence on $L'$. Also it is on $F$. Since $c'' \neq 0$ in (3.63) ($F$ being nonsingular),
$(PQ)(P'Q')$ must be on $L''$. Hence $(PQ)(P'Q')$ is on at least one
of $L_1''$, $L_2''$, and $L_3''$.
Suppose $(PQ)(P'Q')$ is on $L_1''$. Since it is on $L_3'$, we must have
    $(PQ)(P'Q') = PQ.$
Substituting into (3.66), we obtain (3.65), and we are done.
Suppose $(PQ)(P'Q')$ is on $L_2''$. Since it is on $L_3'$, we must have
    $(PQ)(P'Q') = P'Q'.$    (3.67)
If $P'Q' = PQ$, we can substitute into (3.66) and obtain (3.65), and then
we are done. Thus suppose $P'Q' \neq PQ$. From (3.67) it follows that
    $i(P'Q', L_3', F) = 2.$    (3.68)
Now $PQ$ is on $L_3''$ by (3.66), and it is on $L_1''$ automatically. Thus
    $i(PQ, L_3', L'') \ge 2$
by (3.56). Since
    $i(PQ, L_3', L') = +\infty,$
(3.63) and (3.57) give
    $i(PQ, L_3', F) \ge 2.$    (3.69)
Combining (3.68) and (3.69), we obtain $\sum_p i(p, L_3', F) \ge 4$. By Corollary
2.20, we find that $L_3'$ divides $F$, in contradiction to the nonsingularity
of $F$.
We conclude that $(PQ)(P'Q')$ is on $L_3''$. Since it is on both $L_3'$ and
$L_3''$, (3.51) shows it is $T$. In combination with (3.66), this conclusion
yields (3.65). This completes the proof of (3.48) in the nondegenerate
case.
Proof of (3.48) in the degenerate case. The assumption is
that two of the 8 points in (3.50) in different rows and columns are
equal. Up to symmetry the cases we need to consider are at the most
    $P = Q'$
    $P = QQ'$  (and hence $PQ = Q'$)
    $PP' = PQ$  (and hence $P' = Q$).
The identity (3.48) reduces in these cases to
    $(PP')(QP) = (PQ)(P'P)$
    $(PP')P = Q'(P'Q')$
    $(PP')(P'Q') = (PP')(P'Q').$
The first and third of these are trivially valid, and the second is valid
since both sides reduce to $P'$. This completes the proof of (3.48) in the
degenerate case.
4. Computations with the Group Law
Fix an elliptic curve $E$ over $k$, i.e., a nonsingular cubic in Weierstrass
form (3.23). It is customary to choose $O = (0,1,0)$, the point at infinity,
and then Theorem 3.8 says that $E(k)$ becomes an abelian group with
$O$ as identity element. This particular choice of $O$ has some immediate
implications:
(1) $OO = O$ since $O$ is an inflection point.
(2) The additive inverse, usually given by $-P = OO \cdot P$, is now given
by
    $-P = OP$
as a consequence of (1). Then
    $P = (x_0, y_0)$  implies  $-P = (x_0, -y_0 - a_1x_0 - a_3),$    (3.70)
because the line from $O$ to $P$ is $x = x_0$ in affine form (or $x - x_0w = 0$
in projective form).
(3) If $a_1 = a_3 = 0$, so that $E$ is given in affine form by $y^2 = f(x)$ with
$f$ a monic cubic polynomial, then (3.70) specializes to the statement
that
    $P = (x_0, y_0)$  implies  $-P = (x_0, -y_0).$    (3.71)
In this case the elements of order 2, which have $P = -P$, are the points
on the x-axis. There are at most three such points.
Let us derive a formula for the group law for $E(k)$. Let $P_1$ and $P_2$ be
given with $P_2 \neq -P_1$. We shall calculate $P_3 = P_1 + P_2$. The notation
will be
    $P_1 = (x_1, y_1), \quad P_2 = (x_2, y_2), \quad P_3 = (x_3, y_3)$
    $y = mx + b$ is the line through $P_1$ and $P_2$ if $P_1 \neq P_2$ (chord case),
    and the tangent line at $P_1$ if $P_1 = P_2$ (tangent case).
We begin by calculating $m$ and $b$. The result is
    $m = \begin{cases} \dfrac{y_2 - y_1}{x_2 - x_1} & \text{in chord case} \\ \dfrac{3x_1^2 + 2a_2x_1 + a_4 - a_1y_1}{2y_1 + a_1x_1 + a_3} & \text{in tangent case} \end{cases}$    (3.72a)
    $b = \begin{cases} \dfrac{y_1x_2 - y_2x_1}{x_2 - x_1} & \text{in chord case} \\ \dfrac{-x_1^3 + a_4x_1 + 2a_6 - a_3y_1}{2y_1 + a_1x_1 + a_3} & \text{in tangent case.} \end{cases}$    (3.72b)
The formulas for $m$ and $b$ in the chord case are simply formulas for
the line through $P_1$ and $P_2$ and do not involve cubics. To prove the
formulas in the tangent case, we begin by computing $m = \frac{dy}{dx}$ by implicit
differentiation of (3.23b):
    $2yy' + a_1xy' + a_1y + a_3y' = 3x^2 + 2a_2x + a_4.$
Putting $y' = m$ and $(x,y) = (x_1, y_1)$, we obtain (3.72a) in the tangent
case. Along the line $y = mx + b$, we have $y - y_1 = m(x - x_1)$. Thus
$y = mx + (y_1 - mx_1)$, and we obtain $b = y_1 - mx_1$. In other words
    $b = \dfrac{2y_1^2 + a_1x_1y_1 + a_3y_1 - 3x_1^3 - 2a_2x_1^2 - a_4x_1 + a_1x_1y_1}{2y_1 + a_1x_1 + a_3}.$
Substitution for $y_1^2$ from (3.23b) then gives (3.72b) in the tangent case.
Now we can calculate $P_3 = (x_3, y_3) = P_1 + P_2$. Let $F(x,y)$ be the
difference of the right side and the left side of the Weierstrass equation
(3.23b). Along the line $y = mx + b$, this expression becomes $F(x, mx + b)$,
and it is clearly a monic cubic polynomial in $x$. Since $P_1$, $P_2$, and $P_1P_2$
are on this line and are in $E(k)$ and since (3.70) shows that $x_3$ equals
the x coordinate of $P_1P_2$, all of $x_1, x_2, x_3$ are roots of
    $F(x, mx + b) = 0,$
with multiplicities counted. Thus
    $F(x, mx + b) = (x - x_1)(x - x_2)(x - x_3).$
Substitution of the expression for $F$ gives
    $x^3 + a_2x^2 + a_4x + a_6 - (mx + b)^2 - a_1x(mx + b) - a_3(mx + b)$
    $\quad = x^3 - (x_1 + x_2 + x_3)x^2 + \text{lower order}.$
We can equate the coefficients of $x^2$ on the left and right, obtaining
    $a_2 - m^2 - a_1m = -(x_1 + x_2 + x_3),$
and the result is
    $x_3 = m^2 + a_1m - a_2 - x_1 - x_2$
    $y_3 = -(m + a_1)x_3 - b - a_3$    in chord and tangent cases.    (3.73)
A special case of (3.73) is that $P_1 = P_2$, so that we are in the tangent
case. If $P = (x,y)$, then substitution of (3.72a) into (3.73) yields a
formula for the x coordinate $x(2P)$ of $2P$:
    $x(2P) = \dfrac{x^4 - b_4x^2 - 2b_6x - b_8}{4x^3 + b_2x^2 + 2b_4x + b_6},$    (3.74)
where $b_2, b_4, b_6, b_8$ are as in (3.24).
Example 1. Cubic case of Fermat's Last Theorem. This was
discussed in Subsection B of §1. Since Fermat's Last Theorem is valid in
degree 3, (3.3) has only trivial solutions. Transforming to (3.7), we see
that
    $y^2 = x^3 - \frac{27}{4}$
has only $(3, \frac{9}{2})$, $(3, -\frac{9}{2})$, and $\infty$ as its Q solutions. Thus the group of
solutions must be isomorphic to $Z_3$. Let us verify that $P = (3, \frac{9}{2})$ has
order 3. We have
    $a_1 = a_3 = a_2 = a_4 = 0$  and  $a_6 = -\frac{27}{4}.$
Thus
    $b_2 = b_4 = b_8 = 0$  and  $b_6 = -27.$
Hence (3.74) gives
    $x(2P) = \left.\dfrac{x^4 + 54x}{4x^3 - 27}\right|_{x=3} = \dfrac{81 + 162}{108 - 27} = 3.$
So $2P = P$ or $2P = -P$. The conclusion $2P = P$ is false because it
forces $P = O = \infty$. Thus $2P = -P$ and $3P = O$. In other words, $P$
indeed does have order 3.
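The computation in Example 1 can be reproduced mechanically. The sketch below evaluates the duplication formula for the x coordinate, with numerator $x^4 - b_4x^2 - 2b_6x - b_8$ and denominator $4x^3 + b_2x^2 + 2b_4x + b_6$, using exact rationals. The function name `x_of_double` is our own, and the error raised when the denominator vanishes is an implementation choice.

```python
from fractions import Fraction

def x_of_double(x, a1, a2, a3, a4, a6):
    """x coordinate of 2P for P = (x, y) on the curve (3.23b), via (3.74)."""
    x = Fraction(x)
    b2 = a1**2 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3**2 + 4*a6
    b8 = a1**2*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3**2 - a4**2
    num = x**4 - b4*x**2 - 2*b6*x - b8
    den = 4*x**3 + b2*x**2 + 2*b4*x + b6
    if den == 0:
        # The denominator vanishes exactly when 2P is the point at infinity.
        raise ZeroDivisionError("2P is the point at infinity")
    return num / den

# Example 1: y^2 = x^3 - 27/4 and P = (3, 9/2).
print(x_of_double(3, 0, 0, 0, 0, Fraction(-27, 4)))
```

The value returned equals the x coordinate of $P$ itself, which is the step used above to conclude that $2P = -P$.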
EXAMPLE 2. $y^2 + y = x^3 - x$ over Q. This is one of the elliptic curves
in Table 3.2, and it has $\Delta = 37$. The point $P = (0,0)$ is an obvious
solution, and we can calculate from (3.73) that
    $P = (0,0)$           $5P = (\frac{1}{4}, -\frac{5}{8})$
    $2P = (1,0)$          $6P = (6,14)$
    $3P = (-1,-1)$        $7P = (-\frac{5}{9}, \frac{8}{27})$
    $4P = (2,-3)$         $8P = (\frac{21}{25}, -\frac{69}{125}).$    (3.75)
It will follow from Theorem 5.1a that $P$ generates an infinite cyclic
subgroup. (In fact, $P$ generates all solutions.)
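The multiples in Example 2 can be recomputed from (3.70), (3.72), and (3.73). The following is a minimal sketch of the group law over Q with exact rationals. Representing the point at infinity by `None`, the function names, and passing the coefficients as a tuple `(a1, a2, a3, a4, a6)` are all our own conventions, and points are assumed to be given with `Fraction` coordinates.

```python
from fractions import Fraction

O = None  # the point at infinity, used as the identity element

def neg(P, a):
    """Negative of P as in (3.70)."""
    if P is O:
        return O
    a1, a2, a3, a4, a6 = a
    x, y = P
    return (x, -y - a1*x - a3)

def add(P1, P2, a):
    """Sum P1 + P2 on the curve (3.23b), following (3.72) and (3.73)."""
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    if P2 == neg(P1, a):
        return O
    a1, a2, a3, a4, a6 = a
    x1, y1 = P1
    x2, y2 = P2
    if P1 != P2:   # chord case
        m = (y2 - y1) / (x2 - x1)
    else:          # tangent case
        m = (3*x1*x1 + 2*a2*x1 + a4 - a1*y1) / (2*y1 + a1*x1 + a3)
    b = y1 - m*x1
    x3 = m*m + a1*m - a2 - x1 - x2
    y3 = -(m + a1)*x3 - b - a3
    return (x3, y3)

def multiples(P, a, n):
    """Return [P, 2P, ..., nP]."""
    out, Q = [], O
    for _ in range(n):
        Q = add(Q, P, a)
        out.append(Q)
    return out

# Example 2: y^2 + y = x^3 - x, with P = (0, 0).
curve = (0, 0, 1, -1, 0)
P = (Fraction(0), Fraction(0))
for n, Q in enumerate(multiples(P, curve, 8), start=1):
    print(f"{n}P = ({Q[0]}, {Q[1]})")
```

The printed points should be compared with (3.75).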
5. Singular Points
Singular Weierstrass curves arise for certain primes $p$ when a Weierstrass
curve with integral coefficients is considered modulo $p$. Such
curves are easy to analyze, and we shall note some of their features
in this section.
Let $E$ be a singular Weierstrass curve over a field $k$. The point $\infty$
on the curve is nonsingular, and we have only to analyze points $(x,y)$
in the affine plane. Proposition 3.10 will show that there is only one
singularity and that, under a mild restriction on $k$, it occurs at a $k$
rational point $(x_0, y_0)$. Translating $(x_0, y_0)$ to the origin (an admissible
change of variables), we are led to the projective curve (3.23a) with
$a_6 = 0$. The condition that $\partial/\partial x$ give 0 at $(0,0,1)$ means that $a_4 = 0$,
and the condition that $\partial/\partial y$ give 0 at $(0,0,1)$ means that $a_3 = 0$. Thus
$E$ is given in affine form by the equation
    $y^2 + a_1xy = x^3 + a_2x^2.$    (3.76)
We can factor
    $y^2 + a_1xy - a_2x^2$    (3.77)
over $\bar{k}$, obtaining
    $(y - \alpha x)(y - \beta x) = x^3$  with  $\alpha, \beta \in \bar{k}$.    (3.78)
We say that the singular point $(0,0)$ is a cusp if $\alpha = \beta$, or a node if
$\alpha \neq \beta$. Pictures of the two kinds of behavior (with $a_1 = 0$) appear in
Figure 1.6.
Proposition 3.10. For a singular Weierstrass curve $E$ over $k$, there
is only one singular point $(x_0,y_0)$. It is $k$ rational if either
(i) $\mathrm{char}(k) \neq 2$ or
(ii) $\mathrm{char}(k) = 2$ and $k$ is closed under the operation of taking square
roots (as is the case when $k$ is a finite field of characteristic 2).
The point $(x_0,y_0)$ is a cusp if $c_4 = 0$, or a node if $c_4 \neq 0$.
Proof. If $\mathrm{char}(k) \neq 2$, we can apply, without loss of generality, a
projective transformation to eliminate the $a_1$ and $a_3$ terms. By
Proposition 3.5 the curve will be singular if and only if the cubic polynomial $f$
in $x$ has a repeated root. In this case $f$ and $f'$ have a greatest common
divisor $g$ over $k$ with degree $\geq 1$, and the singular points are $(x_0,0)$,
where $x_0$ ranges over the roots of $g$.
If $g$ has degree 1, its unique root $x_0$ is in $k$, and $(x_0,0)$ is the unique
singular point. If $g$ has degree 2, its two roots $x_0$ and $x_0'$ must be equal,
since otherwise $x_0$ and $x_0'$ would both be roots of multiplicity $\geq 2$ for
the cubic polynomial $f$. Since we can conclude $x_0 = x_0'$, $x_0$ is in $k$, and
$(x_0,0)$ is the unique singular point.
If $\mathrm{char}(k) = 2$, the singular points are the points $(x_0,y_0)$ on the affine
curve over $\bar{k}$ satisfying (3.42a) and (3.42b). If $a_1 \neq 0$, (3.42b) uniquely
determines $x_0$, and (3.42a) then uniquely determines $y_0$; the resulting
$(x_0,y_0)$ is $k$ rational. If $a_1 = 0$, (3.42a) forces $x_0^2 + a_4 = 0$. In characteristic
2, square roots are unique, and the assumption (ii) says that $x_0$ is in $k$.
Since (3.42b) shows $a_3 = 0$, $y_0$ is given by
$$y_0^2 = x_0^3 + a_2x_0^2 + a_4x_0 + a_6$$
and again exists in $k$ and is unique in $k$ under our assumption (ii).
Return to general $k$. Under the translation over $k$ that moves $(x_0,y_0)$
to the origin, $c_4$ is unaffected. Thus it is enough to decide cusp vs. node
in (3.76). From (3.24) and (3.25), the value of $c_4$ is
$$c_4 = b_2^2 - 24b_4 = (a_1^2 + 4a_2)^2 - 24(2a_4 + a_1a_3) = (a_1^2 + 4a_2)^2. \tag{3.79}$$
If $\mathrm{char}(k) \neq 2$, the discriminant of (3.77) is $a_1^2 + 4a_2$, which is 0 if and
only if $c_4 = 0$; hence $\alpha = \beta$ (and there is a cusp) if and only if $c_4 = 0$.
If $\mathrm{char}(k) = 2$, then
$$(y - \alpha x)^2 = y^2 + a_1xy - a_2x^2$$
says $a_1 = 0$ and $\alpha^2 = a_2$; hence $\alpha = \beta$ (and there is a cusp) if and only
if $a_1 = 0$, which happens if and only if $c_4 = 0$.
We can parametrize our singular Weierstrass curve (3.76) by the
method that Diophantus used for conics (see Chapter I): We parametrize
lines through the singular point $(0,0)$ by their slopes, and we find that
most lines intersect the curve at one and only one other point. To handle
exceptional lines, it is necessary to distinguish two subcases of nodes.
Let us say that a node is a split case if the members $\alpha$ and $\beta$ of (3.78)
lie in $k$. In the contrary case, $\alpha$ and $\beta$ lie in a nontrivial quadratic
extension of $k$, and we say that the node is a nonsplit case.
Proposition 3.11. For the singular Weierstrass equation $E$ in (3.76),
the map
$$t \mapsto (t^2 + a_1t - a_2,\; t(t^2 + a_1t - a_2))$$
carries $k - \{\alpha,\beta\}$ one-one onto $E(k) - \{\infty\} - \{(0,0)\}$. If $k$ is a finite
field with $|k|$ elements, the nonsingular set $E(k) - \{(0,0)\}$ therefore has
$|k| - 1$ elements if $(0,0)$ is a split-case node
$|k| + 1$ elements if $(0,0)$ is a nonsplit-case node
$|k|$ elements if $(0,0)$ is a cusp.
Proof. In (3.76), $x = 0$ gives only $y = 0$, and $(0,0)$ is singular. Thus
any nonsingular point of $E(k) - \{\infty\}$ has $y = tx$ for a unique member $t$
of $k$. Substituting $tx$ for $y$ in (3.76) and using $x \neq 0$, we are led to
$$t^2 + a_1t - a_2 = x.$$
Then also $y$ is $t$ times this expression for $x$, and $k$ maps onto the affine
solutions of (3.76). To exclude $x = 0$ from the image, we must exclude
the roots of $t^2 + a_1t - a_2$; these are $\alpha$ and $\beta$. Once $x = 0$ is not in the
image, the map is one-one since $t$ is recovered as $y/x$. The numerology
if $k$ is a finite field is then clear.
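The counts in Proposition 3.11 are easy to confirm by brute force over a small prime field. The sketch below assumes $k = \mathbb{F}_p$ with $p$ odd and uses hypothetical helper names `nonsingular_count` and `singularity_type`; it simply enumerates affine solutions of (3.76).

```python
def nonsingular_count(p, a1, a2):
    """Size of E(k) - {(0,0)} for y^2 + a1*x*y = x^3 + a2*x^2 over k = F_p."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y - x**3 - a2 * x * x) % p == 0
    )
    # Remove the singular point (0,0), then add the point at infinity.
    return (affine - 1) + 1

def singularity_type(p, a1, a2):
    """Classify (0,0) through the roots of t^2 + a1*t - a2 in F_p (p odd assumed)."""
    if (a1 * a1 + 4 * a2) % p == 0:
        return "cusp"
    has_root = any((t * t + a1 * t - a2) % p == 0 for t in range(p))
    return "split node" if has_root else "nonsplit node"

# Three curves over F_7 with a1 = 0: y^2 = x^3 + a2*x^2 for a2 = 1, 3, 0
results = {a2: (singularity_type(7, 0, a2), nonsingular_count(7, 0, a2)) for a2 in (1, 3, 0)}
```

For $p = 7$ the three sample curves give one instance of each case, with counts $|k| - 1$, $|k| + 1$, and $|k|$.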
CHAPTER IV
MORDELL'S THEOREM
1. Descent
Mordell's Theorem (Theorem 4.11 below) is the statement that the
group $E(\mathbb{Q})$ is finitely generated if $E$ is an elliptic curve over $\mathbb{Q}$. The
proof is motivated by a close examination of Fermat's method of
descent. To prove that $x^4 + y^4 = z^4$ has no nontrivial integer solutions,
Fermat showed that a nontrivial integer solution of $u^4 + v^4 = w^2$ leads
to a strictly smaller nontrivial integer solution of the same equation, and
he thereby arrived at a contradiction.
In this section we shall look at Fermat's proof in detail, and then we
shall translate it into an argument about the elliptic curve $y^2 = x^3 - 4x$
via the transformations of Subsection D of §III.1. It will turn out that
the descent procedure passes from the point $P$ to a point $\tfrac{1}{2}P$ (or possibly
$-\tfrac{1}{2}P$) on the elliptic curve.
With Mordell's Theorem in hand, we know that the passage $P \to \tfrac{1}{2}P$
cannot go on forever. But, in fact, the argument can be turned
around to give a proof of Mordell's Theorem. The passage $P \to \tfrac{1}{2}P$ has
certain obstructions, which are measured by $E(\mathbb{Q})/2E(\mathbb{Q})$. (In Fermat's
proof these obstructions correspond to an occasional need to adjust the
parameters in apparently trivial ways.) The hard step in the proof of
Mordell's Theorem will be to prove that $E(\mathbb{Q})/2E(\mathbb{Q})$ is finite, so that
the descent passage $P \to \tfrac{1}{2}P$ can proceed for all but a finite number
of situations. The other step in the proof is to codify the notion of
a "smaller" solution of $u^4 + v^4 = w^2$, or of a "simpler" solution of
$y^2 = x^3 - 4x$. This is done by introducing a suitable notion of height
such that only finitely many points on the elliptic curve have height
less than any given constant $C$. For the notion of height $h(P)$ that we
introduce, we shall have $h(\tfrac{1}{2}P) = \tfrac{1}{4}h(P)$, so that the descent process
$P \to \tfrac{1}{2}P$ eventually carries us to height less than $C$. If $C$ is taken larger
than the maximum of $h(P)$ for $P$ in a set of coset representatives for
the finite group $E(\mathbb{Q})/2E(\mathbb{Q})$, then the finite set of points of height less
than $C$ will turn out to be a finite set of generators for $E(\mathbb{Q})$.
Let us begin with Fermat's argument. We use the symbol $\square$ to
denote a square.
Proposition 4.1 (Fermat). $u^4 + v^4 = w^2$ has no nontrivial integer
solutions.
Proof. Arguing by contradiction, suppose we have a nontrivial
integer solution. Without loss of generality, we may assume $\mathrm{GCD}(u,v) = 1$,
$u$ is odd, and $v$ is even. Writing
$$(u^2)^2 + (v^2)^2 = w^2$$
and applying Theorem 1.2, we obtain integers $m$ and $n$ with
$$u^2 = m^2 - n^2, \quad v^2 = 2mn, \quad \mathrm{GCD}(m,n) = 1.$$
Applying Theorem 1.2 to $m^2 = u^2 + n^2$, in which $u$ is odd, we obtain
integers $p$ and $q$ with
$$m = p^2 + q^2, \quad u = p^2 - q^2, \quad n = 2pq, \quad \mathrm{GCD}(p,q) = 1.$$
Since $v$ is even, we can write
$$\left(\tfrac{v}{2}\right)^2 = \square = \tfrac{1}{2}mn = pq(p^2 + q^2).$$
Here $p$, $q$, and $p^2 + q^2$ are relatively prime since $\mathrm{GCD}(p,q) = 1$. Thus
$$p = \square, \quad q = \square, \quad p^2 + q^2 = \square,$$
and we can write
$$p = r^2, \quad q = s^2, \quad r^4 + s^4 = \square.$$
Thus we have been led from the old solution $u^4 + v^4 = \square$ to a new
solution $r^4 + s^4 = \square$. These are related by
$$\left(\tfrac{v}{2}\right)^2 = pq(p^2 + q^2) = r^2s^2(r^4 + s^4),$$
hence by
$$v = 2rs\sqrt{r^4 + s^4}. \tag{4.1}$$
If the new solution were trivial, then $rs$ would be 0, so that $v$ would be
0; this would mean that the old solution was trivial. We conclude that
$rs \neq 0$, and then it is clear from (4.1) that $r < v$ and $s < v$. Hence
$$\max(|r|,|s|) < \max(|u|,|v|).$$
The passage from an old solution to a new solution therefore cannot go
on forever, and we arrive at a contradiction.
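Analysis of proof. Let us go backwards. We have
$$\begin{aligned}
(r,s) &\to (p,q) = (r^2,s^2) \\
&\to (m,n) = (p^2 + q^2,\, 2pq) = (r^4 + s^4,\, 2r^2s^2) \\
&\to (u,v) = (p^2 - q^2,\, v) = (r^4 - s^4,\, 2rs\sqrt{r^4 + s^4}).
\end{aligned} \tag{4.2}$$
Now let us match parameters with what is happening for the
corresponding elliptic curve. Schematically we have
$$\begin{array}{ccc}
(x,y) \text{ on cubic} & \longrightarrow & (u,v) \text{ on quartic} \\
 & & \downarrow \\
(c,d) \text{ on cubic} & \longleftarrow & (r,s) \text{ on quartic}
\end{array} \tag{4.3}$$
The equations that are satisfied by these variables are
$$\begin{array}{ll}
y^2 = x^3 - 4x & \qquad u^4 + v^4 = w^2 \\
d^2 = c^3 - 4c & \qquad r^4 + s^4 = \square
\end{array}$$
We seek an expression for $x$ in terms of $c$. Let $X = u/v$ and $Y = w/v^2$,
so that $Y^2 = X^4 + 1$. Then (3.14) says that the top line of (4.3) is
implemented by
$$\left\{\begin{aligned} X &= \frac{y}{2x} \\ Y &= \frac{y^2 + 8x}{4x^2} \end{aligned}\right\} \quad\text{and}\quad \left\{\begin{aligned} x &= \frac{2}{Y - X^2} \\ y &= \frac{4X}{Y - X^2} \end{aligned}\right\}.$$
Let $R = r/s$, so that
$$R^2 = \left(\frac{d}{2c}\right)^2 = \frac{c^3 - 4c}{4c^2} = \frac{c^2 - 4}{4c}. \tag{4.4}$$
Then
$$Y - X^2 = \frac{w - u^2}{v^2} = \frac{(m^2 + n^2) - (m^2 - n^2)}{2mn} = \frac{n}{m} = \frac{2r^2s^2}{r^4 + s^4} = \frac{2R^2}{R^4 + 1},$$
and hence
$$x = \frac{2}{Y - X^2} = \frac{R^4 + 1}{R^2} = \frac{(c^2 + 4)^2}{4c(c^2 - 4)}. \tag{4.5}$$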
Interpretation of analysis. In the notation of (3.23) and (3.24),
our elliptic curve has
$$a_1 = a_3 = a_2 = a_6 = 0 \quad\text{and}\quad a_4 = -4.$$
Thus
$$b_2 = b_6 = 0, \quad b_4 = -8, \quad b_8 = -16.$$
The point $(c,d)$ is on the elliptic curve, and thus $c = x(P)$ for a point
$P$. Then (3.74) says that
$$x(2P) = \frac{c^4 + 8c^2 + 16}{4c^3 - 16c},$$
and this expression coincides with the expression (4.5) for $x$.
Computation shows also that $y(2P)$ agrees with $y$ except for a factor of $\operatorname{sgn} c$.
Thus Fermat's method of descent amounts to a construction that starts
with $P'$ in $E(\mathbb{Q})$ and constructs $\pm\tfrac{1}{2}P'$ in $E(\mathbb{Q})$.
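The agreement between (4.5) and the duplication formula is an identity of rational functions in $c$, so it can be spot-checked with exact arithmetic at arbitrary rational values of $c$ other than $0, \pm 2$ (the sample values need not come from points on the curve). The following sketch uses our own function names.

```python
from fractions import Fraction as F

def x_from_descent(c):
    """x as produced by the descent: x = (R^4 + 1)/R^2 with R^2 from (4.4)."""
    R2 = (c * c - 4) / (4 * c)
    return R2 + 1 / R2

def x_of_double(c):
    """x(2P) from (3.74) with b2 = b6 = 0, b4 = -8, b8 = -16."""
    return (c**4 + 8 * c**2 + 16) / (4 * c**3 - 16 * c)

samples = [F(3), F(-1, 2), F(7, 5), F(10), F(-13, 4)]
agree = all(x_from_descent(c) == x_of_double(c) for c in samples)
```

Both functions reduce to $(c^2+4)^2 / (4c(c^2-4))$, which is why `agree` comes out true.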
Closer inspection shows that we made certain adjustments to our
variables as we went along, in order to be able to carry out the descent. The
first one was to assume that $u$ is odd and $v$ is even, rather than the
other way around. This is the first adjustment. The next assumption
was that $u$ and $v$ are positive; if $u$ and $v$ are not positive, we adjusted
them to make them positive. We chose $r$ and $s$ to be the positive square
roots of $p$ and $q$, but we could not force $r$ to be odd and $s$ to be even.
Normalization of $r$ and $s$ to be of the same form as $u$ and $v$ was a second
instance of the first adjustment. In the context of the elliptic curve, it
turns out that both adjustments amount to changing the point we are
studying by a coset representative of $E(\mathbb{Q})/2E(\mathbb{Q})$. The fact that at
most two adjustments are needed reflects the fact that $E(\mathbb{Q})/2E(\mathbb{Q})$ is
a sum of only two copies of $\mathbb{Z}_2$ in this example.
Finally we can ask why the descent has to stop. For the quartic
equation this was clear, since the integer entries of the solution were
going down. Let us translate this condition into the context of the
elliptic curve. For a rational number $i/j$ in lowest terms, we define
$$|i/j|_\infty = \log\max(|i|,|j|).$$
Since $r$ and $s$ in (4.2) are relatively prime, we have
$$|R|_\infty = \log\max(|r|,|s|).$$
The construction of $r$ and $s$ had the property that
$$|R|_\infty = \left|\tfrac{r}{s}\right|_\infty < \left|\tfrac{u}{v}\right|_\infty = |X|_\infty.$$
With the quartic equation, the descent has to stop because only finitely
many rationals have $|\cdot|_\infty$ less than a given constant. To work directly
with the elliptic curve, we first re-examine (4.1) and see that in fact
$$|R|_\infty \leq \tfrac{1}{2}|X|_\infty. \tag{4.6}$$
Now $R$ is given in terms of $c$ by (4.4), and a little checking shows that
$$\big|\,|R|_\infty - |c|_\infty\,\big| < M \tag{4.7a}$$
for some computable constant $M$. Similarly
$$\big|\,|X|_\infty - |x|_\infty\,\big| < M. \tag{4.7b}$$
Putting together (4.6) and (4.7), we obtain
$$|c|_\infty < \tfrac{1}{2}|x|_\infty + \tfrac{3}{2}M. \tag{4.8}$$
Thus
$$|c|_\infty < |x|_\infty \quad\text{as long as}\quad 3M < |x|_\infty. \tag{4.9}$$
The values of $|c|_\infty$ and $|x|_\infty$ are a naive version of the notion of height
mentioned at the beginning of the chapter. Condition (4.9) does not
say that the descent must stop, but it does say that it has to lead to
a (nontrivial) solution of the elliptic curve with small numerator and
denominator. Although this conclusion does not rule out all nontrivial
solutions for this example immediately, we shall see that it does give
enough information for the conclusion that $E(\mathbb{Q})$ is finitely generated.
To get a cleaner argument, we ultimately will use a more refined
definition of height, and inequality (4.8) will not have the additive constant
on the right side.
2. Condition for Divisibility by 2
The proof of Mordell's Theorem will involve showing that
$E(\mathbb{Q})/2E(\mathbb{Q})$ is finite and will involve using a suitable notion of height
to prove that the descent must stop. In this section we prove a theorem
that allows us to recognize elements of $2E(\mathbb{Q})$. We continue to use the
symbol $\square$ to denote a square.
Theorem 4.2. Let $E$ be an elliptic curve over a field $k$ of characteristic
not equal to 2 or 3. Suppose $E$ is given by
$$y^2 = (x - \alpha)(x - \beta)(x - \gamma) = x^3 + rx^2 + sx + t$$
with $\alpha, \beta, \gamma$ in $k$. For $(x_2,y_2)$ in $E(k)$, there exists $(x_1,y_1)$ in $E(k)$ with
$2(x_1,y_1) = (x_2,y_2)$ if and only if $x_2 - \alpha$, $x_2 - \beta$, and $x_2 - \gamma$ are squares
in $k$.
Proof of necessity. If $(x_1,y_1)$ exists, let $y = mx + b$ be the tangent
line at $x_1$. Since $(x_1,y_1)$ and $(x_2,-y_2)$ both satisfy
$$(x - \alpha)(x - \beta)(x - \gamma) = y^2 = (mx + b)^2$$
and since $y = mx + b$ is actually tangent at $x_1$, the three roots of
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2$$
are $x_2$, $x_1$, and $x_1$. Thus
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2 = (x - x_2)(x - x_1)^2.$$
Putting $x = \alpha$, we have $-\square = (\alpha - x_2)\square$. Since $x_1$ cannot be $\alpha$, the
square on the right is not 0. We conclude $x_2 - \alpha$ is a square. Similarly
$x_2 - \beta$ and $x_2 - \gamma$ are squares.
Preparation for sufficiency. Reduction of quartics to cubics.
If we have a monic quartic polynomial equation in one variable to
solve, we can translate $x$ to eliminate the $x^3$ term. Then we have
$$x^4 + ax^2 + bx + c = 0. \tag{4.10}$$
With $u$ as an unknown to be specified, (4.10) is equivalent with
$$(x^2 + u)^2 = (2u - a)x^2 - bx + (u^2 - c). \tag{4.11}$$
We choose $u$ so that the right side of (4.11) has a double root in $x$. Then
$$0 = B^2 - 4AC = b^2 - 4(2u - a)(u^2 - c) = b^2 - 8u^3 + 4au^2 + 8cu - 4ac,$$
and hence
$$8u^3 - 4au^2 - 8cu + (4ac - b^2) = 0.$$
If we solve for $u$ and substitute into (4.11), the result is $\square = \square$. We
take square roots and are led to a quadratic equation for $x$.
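A small worked instance may make the reduction concrete. The quartic $x^4 - 3x^2 - 4x - 3$ below is our own illustrative choice (it is not taken from the text); for it the cubic in $u$ has the rational root $u = -1$, and the right side of (4.11) becomes $(x+2)^2$.

```python
import cmath

a, b, c = -3, -4, -3                     # the quartic x^4 - 3x^2 - 4x - 3 = 0
u = -1                                   # a root of 8u^3 - 4a u^2 - 8c u + (4ac - b^2) = 0

resolvent_value = 8 * u**3 - 4 * a * u**2 - 8 * c * u + (4 * a * c - b * b)

# Right side of (4.11): A x^2 + B x + C, here x^2 + 4x + 4 = (x + 2)^2
A, B, C = 2 * u - a, -b, u * u - c
discriminant = B * B - 4 * A * C

def quad_roots(p2, p1, p0):
    """Both complex roots of p2*x^2 + p1*x + p0."""
    s = cmath.sqrt(p1 * p1 - 4 * p2 * p0)
    return [(-p1 + s) / (2 * p2), (-p1 - s) / (2 * p2)]

# (x^2 + u)^2 = (x + 2)^2 splits into x^2 + u = (x + 2) and x^2 + u = -(x + 2)
roots = quad_roots(1, -1, u - 2) + quad_roots(1, 1, u + 2)
residuals = [abs(x**4 + a * x**2 + b * x + c) for x in roots]
```

The two quadratics are $x^2 - x - 3$ and $x^2 + x + 1$, whose product is the original quartic.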
Proof of sufficiency. Changing variables in $x$, we may assume
$x_2 = 0$. We are given
$$y^2 = (x - \alpha)(x - \beta)(x - \gamma) = x^3 + rx^2 + sx + t$$
with
$$y_2^2 = -\alpha\beta\gamma = t \tag{4.12a}$$
and with $-\alpha = \alpha_1^2$, $-\beta = \beta_1^2$, $-\gamma = \gamma_1^2$, and we are to produce $(x_1,y_1)$.
Without loss of generality, we may adjust the signs of $\alpha_1, \beta_1, \gamma_1$ so that
$$\alpha_1\beta_1\gamma_1 = y_2. \tag{4.12b}$$
Take $y = mx + y_2$ as the line through $(0,y_2)$ tangent at the unknown
point $(x_1,y_1)$. Then the three roots of
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + y_2)^2 \tag{4.13}$$
are 0, $x_1$, and $x_1$. So
$$x^{-1}(x^3 + rx^2 + sx - m^2x^2 - 2my_2x) \quad\text{is to be}\quad (x - x_1)^2.$$
Thus the discriminant of
$$x^2 + (r - m^2)x + (s - 2my_2) \tag{4.14}$$
is to be 0:
$$(r - m^2)^2 = 4(s - 2my_2). \tag{4.15}$$
This is a quartic equation in $m$. If it has a root $m_0$ in $k$, then $x_1 =
\tfrac{1}{2}(m_0^2 - r)$ is a double root of (4.14) and hence also of (4.13).
Consequently $2(x_1, m_0x_1 + y_2) = (0,-y_2)$, and $2(x_1, -m_0x_1 - y_2) = (0,y_2)$.
In other words $(x_1,y_1)$ exists with $2(x_1,y_1) = (x_2,y_2)$.
Thus it is enough to show that (4.15) has a root in $k$. As in the
preparation for this part of the proof, let $u$ be an unknown to be specified.
From (4.15) we have
$$(m^2 - r + u)^2 = 2um^2 - 8y_2m + (u^2 - 2ru + 4s). \tag{4.16}$$
We try to find $u$ such that the right side is a square as a function of $m$.
Thus
$$0 = B^2 - 4AC = 64y_2^2 - 8u(u^2 - 2ru + 4s),$$
$$u^3 - 2ru^2 + 4su - 8y_2^2 = 0.$$
The roots of this equation are $u = -2\alpha, -2\beta, -2\gamma$. [In fact, just change
variables with $u = -2v$, use the identity (4.12a), and reduce the
polynomial to $-8(v^3 + rv^2 + sv + t)$.] Let us take $u = -2\alpha$. Then (4.16)
gives
$$(m^2 - r - 2\alpha)^2 = -4\alpha m^2 - 8y_2m + (4\alpha^2 + 4r\alpha + 4s).$$
Substituting for $r$ and $s$ in terms of $\alpha, \beta, \gamma$ and using (4.12b), we obtain
$$\begin{aligned}
(m^2 - \alpha + \beta + \gamma)^2 &= 4(\alpha_1m - \beta_1\gamma_1)^2, \\
m^2 - \alpha + \beta + \gamma &= \pm 2(\alpha_1m - \beta_1\gamma_1), \\
m^2 \mp 2\alpha_1m - \alpha &= -\beta - \gamma \mp 2\beta_1\gamma_1, \\
(m \mp \alpha_1)^2 &= \beta_1^2 \mp 2\beta_1\gamma_1 + \gamma_1^2 = (\beta_1 \mp \gamma_1)^2.
\end{aligned}$$
Taking square roots and denoting an independent sign by $\pm'$, we find
$$m = \pm\alpha_1 \pm' (\beta_1 \mp \gamma_1).$$
We can reverse the steps, and we see that we have found four roots $m$
in $k$ to the equation (4.15). This proves the theorem.
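The proof of sufficiency is constructive, and it can be turned into a short halving routine over $\mathbb{Q}$. The sketch below follows the steps of the proof (translate $x_2$ to 0, take square roots, fix signs as in (4.12b), form the four slopes $m$); the function names and the sample curve $y^2 = x(x-5)(x+5)$ with the point $(1681/144, -62279/1728)$ are our own choices for illustration.

```python
from fractions import Fraction as F
from math import isqrt

def rational_sqrt(q):
    """Nonnegative rational square root of the Fraction q, or None if q is not a square in Q."""
    if q < 0:
        return None
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    return F(n, d) if n * n == q.numerator and d * d == q.denominator else None

def double(P, roots):
    """Double P = (x, y) with y != 0 on y^2 = (x - al)(x - be)(x - ga)."""
    al, be, ga = roots
    x, y = P
    r = -(al + be + ga)
    s = al * be + al * ga + be * ga
    lam = (3 * x * x + 2 * r * x + s) / (2 * y)
    x3 = lam * lam - r - 2 * x
    return (x3, -(y + lam * (x3 - x)))

def halves(P2, roots):
    """All (x1, y1) in E(Q) with 2(x1, y1) = P2, or [] if the square test fails."""
    x2, y2 = P2
    # After translating x2 to 0, each root becomes (root - x2); its negative must be a square.
    sq = [rational_sqrt(x2 - rt) for rt in roots]
    if any(v is None for v in sq):
        return []
    a1, b1, g1 = sq
    if a1 * b1 * g1 != y2:               # arrange (4.12b): a1 * b1 * g1 = y2
        a1 = -a1
    r = a1 * a1 + b1 * b1 + g1 * g1      # x^2 coefficient of the translated cubic
    out = []
    for m in (a1 + b1 - g1, a1 - b1 + g1, -a1 + b1 + g1, -a1 - b1 - g1):
        x1 = (m * m - r) / 2             # double root of (4.14), in translated coordinates
        y1 = -(m * x1 + y2)
        out.append((x1 + x2, y1))        # undo the translation
    return out

roots = (F(0), F(5), F(-5))
P2 = (F(1681, 144), F(-62279, 1728))
found = halves(P2, roots)
```

In this example `found` contains $(-4, 6)$ together with its translates by the three points of order 2, and `halves((F(-4), F(6)), roots)` returns the empty list because $-4$ is not a square.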
Let us digress for a moment and return to the subject of congruent
numbers, first discussed in Subsection C of §III.1. Recall that $n$ is
congruent if $n$ is the area of a rational right triangle. Using the easy half of
Theorem 4.2, we obtain the following corollary, which says that (3) $\Rightarrow$
(2) in Proposition 3.1.
Corollary 4.3. Let $n$ be a square-free positive integer, and suppose
there exists a $\mathbb{Q}$ rational point $(x,y)$ on
$$y^2 = x^3 - n^2x$$
other than $(-n,0)$, $(0,0)$, $(n,0)$, and $\infty$. Then there exist three rational
squares in arithmetic progression with common difference $n$. Hence $n$ is
congruent.
Proof. Let $P = (x_1,y_1)$ be a $\mathbb{Q}$ solution other than the trivial ones.
Then $y_1 \neq 0$ and $P$ does not have order 1 or 2. Hence $2P$ is not $\infty$, say
$2P = (x_2,y_2)$. Then the theorem says that $x_2 + n$, $x_2$, and $x_2 - n$ are all
squares, as required. By (2) $\Rightarrow$ (1) in Proposition 3.1, $n$ is congruent.
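As an illustration of Corollary 4.3, the sketch below doubles a known nontrivial point for $n = 5, 6, 7$ and extracts the three square roots. The starting points $(-4,6)$, $(-3,9)$, $(25,120)$ are our own sample inputs, and the formula for $x(2P)$ is (3.74) specialized to $a_4 = -n^2$.

```python
from fractions import Fraction as F
from math import isqrt

def rational_sqrt(q):
    """Nonnegative rational square root of the Fraction q, or None if q is not a square in Q."""
    if q < 0:
        return None
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    return F(n, d) if n * n == q.numerator and d * d == q.denominator else None

def progression_from_point(n, x, y):
    """For (x, y) on y^2 = x^3 - n^2 x with y != 0, return roots of x2 - n, x2, x2 + n."""
    x, y = F(x), F(y)
    if y == 0 or y * y != x**3 - n * n * x:
        raise ValueError("need a point with y != 0 on the curve")
    # x(2P) = (x^4 + 2 n^2 x^2 + n^4) / (4 (x^3 - n^2 x)) = ((x^2 + n^2) / (2y))^2
    x2 = ((x * x + n * n) / (2 * y)) ** 2
    return tuple(rational_sqrt(x2 + k) for k in (-n, 0, n))

triples = {n: progression_from_point(n, x, y)
           for n, x, y in [(5, -4, 6), (6, -3, 9), (7, 25, 120)]}
```

For $n = 5$ this yields $31/12$, $41/12$, $49/12$, whose squares differ by 5.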
Corollary 4.4 (Fermat). $n = 2$ is not a congruent number.
Proof. If 2 were congruent, (1) $\Rightarrow$ (3) in Proposition 3.1 would say
that $y^2 = x^3 - 4x$ has a nontrivial solution. Transforming via (3.14), we
would obtain a nontrivial solution of the equation $u^4 + v^4 = w^2$. But
this conclusion contradicts Proposition 4.1.
3. $E(\mathbb{Q})/2E(\mathbb{Q})$, Special Case
The main step in Mordell's Theorem is the proof of the following
result.
Theorem 4.5. If $E$ is an elliptic curve over $\mathbb{Q}$, then the abelian group
$E(\mathbb{Q})/2E(\mathbb{Q})$ is finite.
Using an admissible change of variables, we may assume that $E$ is of
the form
$$y^2 = (x - \alpha)(x - \beta)(x - \gamma) \tag{4.17}$$
with integer coefficients. In this section we make the additional
assumption that $\alpha, \beta, \gamma$ are in $\mathbb{Q}$, hence in $\mathbb{Z}$.
Since $\alpha, \beta, \gamma$ are in $\mathbb{Q}$, the subgroup of $E(\mathbb{Q})$ of elements of order $\leq 2$
is of the form $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. The picture of $E(\mathbb{R})$ is qualitatively the same as
in Figure 1.7c. We consider the group $\mathbb{Q}^\times/\mathbb{Q}^{\times 2}$, writing it as
$$\mathbb{Q}^\times/\mathbb{Q}^{\times 2} = \{\pm 2^a3^b5^c7^d\cdots \mid a,b,c,d,\ldots \in \{0,1\}\} = \bigoplus_{\pm,2,3,5,7,\ldots} \mathbb{Z}_2. \tag{4.18}$$
We continue to use the symbol $\square$ to denote a square.
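Proposition 4.6. Let $E$ be the elliptic curve (4.17) over $\mathbb{Q}$ with
integer roots, and define $\varphi : E(\mathbb{Q}) \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2}$ by
$$\varphi(P) = \begin{cases}
(x - \alpha)\mathbb{Q}^{\times 2} & \text{if } P = (x,y) \text{ with } P \neq \infty \text{ and } x \neq \alpha \\
(\alpha - \beta)(\alpha - \gamma)\mathbb{Q}^{\times 2} & \text{if } P = (\alpha,0) \\
\mathbb{Q}^{\times 2} & \text{if } P = \infty.
\end{cases} \tag{4.19}$$
Then $\varphi$ is a group homomorphism.
Remark. Then $\varphi(2P) = \varphi(P)^2 \in 1 \cdot \mathbb{Q}^{\times 2}$, so that $\varphi$ descends to a
homomorphism
$$\varphi_\alpha : E(\mathbb{Q})/2E(\mathbb{Q}) \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2}. \tag{4.20}$$
Proof. Let $P_1 + P_2 = P_3$. We are to show $\varphi(P_1)\varphi(P_2)\varphi(P_3)^{-1}$ is
in $\mathbb{Q}^{\times 2}$. Since $\varphi(P_3) = \varphi(-P_3)$ and $\varphi(P_3) = \varphi(P_3)^{-1}$, it is enough to
show that $P_1 + P_2 + P_3 = 0$ implies $\varphi(P_1)\varphi(P_2)\varphi(P_3)$ is in $\mathbb{Q}^{\times 2}$. If one
of the $P_i$ is $\infty$, this conclusion is trivial. Thus we may assume $P_i$ is of
the form $(x_i,y_i)$ for $i = 1,2,3$.
Case 1: No $(x_i,y_i)$ is $(\alpha,0)$. Let $y = mx + b$ be the line on which
$P_1, P_2, P_3$ lie. Each $P_i = (x_i,y_i)$ satisfies
$$(x - \alpha)(x - \beta)(x - \gamma) = y^2 = (mx + b)^2.$$
Hence $(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2$ is 0 for $x = x_1, x_2, x_3$ with
multiplicities. Thus
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2 = (x - x_1)(x - x_2)(x - x_3).$$
Putting $x = \alpha$ gives $(x_1 - \alpha)(x_2 - \alpha)(x_3 - \alpha) = (m\alpha + b)^2$, and thus
$\varphi(P_1)\varphi(P_2)\varphi(P_3)$ is a square in $\mathbb{Q}^\times$.
Case 2: $(x_1,y_1) = (\alpha,0)$. Then neither $(x_2,y_2)$ nor $(x_3,y_3)$ is $(\alpha,0)$,
since otherwise the other point would be $\infty$. Let $y = mx + b$ be the line
on which $P_1, P_2, P_3$ lie. As before,
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2$$
is 0 for $x = \alpha, x_2, x_3$ with multiplicities, and thus
$$(x - \alpha)(x - \beta)(x - \gamma) - (mx + b)^2 = (x - \alpha)(x - x_2)(x - x_3). \tag{4.21}$$
Now $x - \alpha$ has to divide $(mx + b)^2$, and hence $mx + b = m(x - \alpha)$. So
(4.21) becomes
$$(x - \alpha)(x - \beta)(x - \gamma) - m^2(x - \alpha)^2 = (x - \alpha)(x - x_2)(x - x_3)$$
or
$$(x - \beta)(x - \gamma) - m^2(x - \alpha) = (x - x_2)(x - x_3).$$
Putting $x = \alpha$ gives
$$(\alpha - \beta)(\alpha - \gamma) = (\alpha - x_2)(\alpha - x_3),$$
so that $\varphi(P_1) = \varphi(P_2)\varphi(P_3)$. Thus $\varphi(P_1)\varphi(P_2)\varphi(P_3)$ is in $\mathbb{Q}^{\times 2}$.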
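Corollary 4.7. Let $E$ be the elliptic curve (4.17) over $\mathbb{Q}$ with integer
roots, and define
$$\varphi_\alpha : E(\mathbb{Q})/2E(\mathbb{Q}) \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2}$$
as in (4.19). Define $\varphi_\beta$ similarly. Then the homomorphism
$$\varphi_\alpha \times \varphi_\beta : E(\mathbb{Q})/2E(\mathbb{Q}) \to \mathbb{Q}^\times/\mathbb{Q}^{\times 2} \times \mathbb{Q}^\times/\mathbb{Q}^{\times 2} \tag{4.22}$$
is one-one.
Proof. Suppose $(x,y) \in E(\mathbb{Q})$ maps to $\mathbb{Q}^{\times 2}$ under $\varphi_\alpha$ and $\varphi_\beta$.
Case 1: Suppose $(x,y)$ is not $\infty$ or $(\alpha,0)$ or $(\beta,0)$. Then the condition
is that $x - \alpha$ and $x - \beta$ are $\square$. Since $(x - \alpha)(x - \beta)(x - \gamma) = \square$, $x - \gamma$ is
$\square$. By Theorem 4.2, $(x,y) = 2(x',y')$ for some $(x',y')$ in $E(\mathbb{Q})$. Thus
$(x,y)$ is in $2E(\mathbb{Q})$.
Case 2: Suppose $(x,y) = (\alpha,0)$. Then
$$\square = \varphi_\alpha(\alpha,0) = (\alpha - \beta)(\alpha - \gamma)\mathbb{Q}^{\times 2}$$
$$\square = \varphi_\beta(\alpha,0) = (\alpha - \beta)\mathbb{Q}^{\times 2}.$$
Then $\alpha - \beta$ and $\alpha - \gamma$ are $\square$. Also 0 is a square. By Theorem 4.2,
$(\alpha,0) = 2(x',y')$ for some $(x',y')$, and thus $(\alpha,0)$ is in $2E(\mathbb{Q})$.
Case 3: Suppose $(x,y) = (\beta,0)$. This case follows from Case 2 by
symmetry.
Let us write
$$\mathbb{Q}^\times/\mathbb{Q}^{\times 2} \times \mathbb{Q}^\times/\mathbb{Q}^{\times 2} = \bigoplus_{\pm,2,3,\ldots} (\mathbb{Z}_2 \oplus \mathbb{Z}_2). \tag{4.23}$$
It will turn out that the image of $\varphi_\alpha \times \varphi_\beta$ in $\mathbb{Q}^\times/\mathbb{Q}^{\times 2} \times \mathbb{Q}^\times/\mathbb{Q}^{\times 2}$ is 0 in
almost all coordinates of (4.23). Let $d$ be the discriminant of the cubic
polynomial $(x - \alpha)(x - \beta)(x - \gamma)$, namely
$$d = (\alpha - \beta)^2(\alpha - \gamma)^2(\beta - \gamma)^2. \tag{4.24}$$
(See Proposition 3.3 and the neighboring discussion.) If $p$ is a prime and
$r$ is rational, we write $p^a \,\|\, r$ if $r = p^aq$ and $q \in \mathbb{Q}$ has no factor of $p$ in its
numerator or denominator.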
Proposition 4.8. Let $E$ be the elliptic curve (4.17) over $\mathbb{Q}$ with
integer roots, let $\varphi_\alpha \times \varphi_\beta$ be the homomorphism (4.22), and let $d$ be the
discriminant (4.24). Then the image of $\varphi_\alpha \times \varphi_\beta$ in (4.23) is confined to
the coordinates $\pm$ and primes $p$ dividing $d$. Thus the homomorphism
$$E(\mathbb{Q})/2E(\mathbb{Q}) \to \bigoplus_{\pm,\, p \mid d} (\mathbb{Z}_2 \oplus \mathbb{Z}_2) \tag{4.25}$$
is one-one.
Remark. The group on the right of (4.25) is finite, and thus
Proposition 4.8 completes the proof of Theorem 4.5 under the special
assumption that the roots $\alpha, \beta, \gamma$ in (4.17) are in $\mathbb{Q}$.
Proof. Let $(x,y)$ be a point other than $\infty$ in $E(\mathbb{Q})$. Assume for
the moment that $x \notin \{\alpha, \beta, \gamma\}$. Fix a prime $p = 2, 3, 5, \ldots$, and define
integers $a, b, c$ by
$$p^a \,\|\, (x - \alpha), \quad p^b \,\|\, (x - \beta), \quad p^c \,\|\, (x - \gamma).$$
Since (4.17) says that $(x - \alpha)(x - \beta)(x - \gamma)$ is a square, we have
$$a + b + c \equiv 0 \bmod 2. \tag{4.26}$$
Suppose at least one of $a, b, c$ is $< 0$. Say $a < 0$. Since $\alpha$ is an integer,
$p^{|a|} \,\|\, (\text{denominator of } x)$. Therefore
$$p^a \,\|\, (x - \alpha), \quad p^a \,\|\, (x - \beta), \quad p^a \,\|\, (x - \gamma).$$
In other words, $a = b = c$. From (4.26), it then follows that
$$a \equiv b \equiv c \equiv 0 \bmod 2.$$
Consequently the image of $(x,y)$ in the $p$th coordinate of (4.23) is 0.
Suppose at least one of $a, b, c$ is $> 0$. Say $a > 0$. If $p \nmid d$, then
$p \nmid (\alpha - \beta)$ and hence $p$ cannot occur in the numerator of
$$x - \beta = (x - \alpha) + (\alpha - \beta).$$
In other words, $b = 0$. Similarly, if $p \nmid d$, then $p \nmid (\alpha - \gamma)$, and we
conclude $c = 0$. By (4.26), $a$ is even. Thus again we have
$$a \equiv b \equiv c \equiv 0 \bmod 2,$$
and again the image of $(x,y)$ in the $p$th coordinate of (4.23) is 0.
We have been assuming that $x \notin \{\alpha, \beta, \gamma\}$. When $x$ is in this set,
then $\varphi_\alpha(x)$ and $\varphi_\beta(x)$ are products of $\alpha - \beta$, $\alpha - \gamma$, and $\beta - \gamma$, up to
sign, and these are all prime to $p$ if $p \nmid d$. Hence the $p$th coordinate of
$\varphi_\alpha(x) \times \varphi_\beta(x)$ is $(0,0)$.
Combining these conclusions with the one-oneness from Corollary 4.7,
we obtain the proposition.
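Proposition 4.8 can be observed numerically on a sample curve. The sketch below takes $y^2 = x(x-5)(x+5)$, for which $d = 62500$ and the only primes dividing $d$ are 2 and 5, and computes square-free representatives of $\varphi_\alpha$ and $\varphi_\beta$ (with $\alpha = 0$, $\beta = 5$) at a few rational points. The curve, the points, and the helper names are our own illustrative assumptions.

```python
from fractions import Fraction as F

def squarefree_class(q):
    """Square-free integer representing the class of a nonzero rational q in Q^x / Q^x2."""
    q = F(q)
    n = q.numerator * q.denominator      # q and q * denominator^2 lie in the same class
    sign, n, out, p = (1 if n > 0 else -1), abs(n), 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e % 2:
            out *= p
        p += 1
    return sign * out * n                # any leftover n is 1 or a prime to the first power

def phi(P, root, roots):
    """The map (4.19) attached to the chosen root, for an affine point P = (x, y)."""
    x, _ = P
    if x != root:
        return squarefree_class(x - root)
    others = [rt for rt in roots if rt != root]
    return squarefree_class((root - others[0]) * (root - others[1]))

roots = (0, 5, -5)
points = [(F(-4), F(6)), (F(25, 4), F(75, 8)), (F(45), F(-300)),
          (F(-5, 9), F(-100, 27)), (F(1681, 144), F(-62279, 1728))]
classes = [(phi(P, 0, roots), phi(P, 5, roots)) for P in points]
```

Every representative in `classes` is, up to sign, a product of the primes 2 and 5 only, and the last point, which is the double of $(-4,6)$, maps to the trivial class in both coordinates.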
4. $E(\mathbb{Q})/2E(\mathbb{Q})$, General Case
So far, we have proved Theorem 4.5 in the special case that the roots
$\alpha, \beta, \gamma$ in (4.17) are in $\mathbb{Q}$. In the general case, let $k$ be the splitting field
of the cubic polynomial $(x - \alpha)(x - \beta)(x - \gamma)$ over $\mathbb{Q}$. The rough idea
is to imitate the proof in §3, replacing $\mathbb{Q}$ everywhere by this extension
field $k$. But we encounter three complications:
(1) The modified approach will lead to information about
$E(k)/2E(k)$, and we need to deduce information about
$E(\mathbb{Q})/2E(\mathbb{Q})$.
(2) The decomposition (4.18) of $\mathbb{Q}^\times/\mathbb{Q}^{\times 2}$ uses unique factorization
in $\mathbb{Z}$, and unique factorization may fail in the ring of algebraic
integers in $k$.
(3) The $\pm$ coordinate of $\mathbb{Q}^\times/\mathbb{Q}^{\times 2}$ in (4.18) represents units in $\mathbb{Z}$, and
the units need to be controlled in whatever ring replaces $\mathbb{Z}$ in the
general case.
We can dispose of (1) rather quickly.
Proposition 4.9. Let $E$ be the elliptic curve (4.17) over $\mathbb{Q}$, and let $k$
be the splitting field of $(x - \alpha)(x - \beta)(x - \gamma)$ over $\mathbb{Q}$. Then the canonical
homomorphism
$$E(\mathbb{Q})/2E(\mathbb{Q}) \to E(k)/2E(k) \tag{4.27}$$
has $\leq 2^{2[k:\mathbb{Q}]}$ elements in its kernel. Consequently if $E(k)/2E(k)$ is finite,
then so is $E(\mathbb{Q})/2E(\mathbb{Q})$.
Remark. We shall use the Galois group $\mathrm{Gal}(k/\mathbb{Q})$. Note that if
$Q = (x,y)$ is in $E(k)$ and $\sigma$ is in $\mathrm{Gal}(k/\mathbb{Q})$, then $Q^\sigma = (\sigma(x),\sigma(y))$ is in
$E(k)$ since $E$ is defined over $\mathbb{Q}$.
Proof. Let
$$E[2] = \{Q' \in E(k) \mid 2Q' = 0\}.$$
The points $P$ in $E(\mathbb{Q})$ that map to 0 under the canonical homomorphism
are those in $E(\mathbb{Q}) \cap 2E(k)$. For each such $P$, select $Q_P$ in $E(k)$ such
that $2Q_P = P$. Then we obtain a function
$$\lambda_P : \mathrm{Gal}(k/\mathbb{Q}) \to E[2] \tag{4.28a}$$
by the definition
$$\lambda_P(\sigma) = Q_P^\sigma - Q_P. \tag{4.28b}$$
[In fact, to see that $\lambda_P$ takes its values in $E[2]$, we observe that
$2(Q_P^\sigma - Q_P) = (2Q_P)^\sigma - 2Q_P = P^\sigma - P = 0$.]
Now let us show that $\lambda_P = \lambda_{P'}$ implies $P'$ is in $P + 2E(\mathbb{Q})$. In fact, if
$$Q_P^\sigma - Q_P = \lambda_P(\sigma) = \lambda_{P'}(\sigma) = Q_{P'}^\sigma - Q_{P'}$$
for all $\sigma$, then
$$(Q_{P'} - Q_P)^\sigma = Q_{P'} - Q_P.$$
Since $k$ is a normal extension of $\mathbb{Q}$, $k^{\mathrm{Gal}(k/\mathbb{Q})} = \mathbb{Q}$. Thus $Q_{P'} - Q_P$ is in
$E(\mathbb{Q})$. Hence
$$P' - P = 2(Q_{P'} - Q_P) \in 2E(\mathbb{Q}),$$
as required.
Thus for each member of the kernel of (4.27), we select $P$ in $E(\mathbb{Q})$
mapping into $2E(k)$, and then we can associate $\lambda_P$ as in (4.28). What
we showed above is that if we start with a different member of the
kernel, then the resulting $\lambda_P$ is a different function from $\mathrm{Gal}(k/\mathbb{Q})$ to
$E[2]$. Hence the order of the kernel is $\leq$ the number of functions from
$\mathrm{Gal}(k/\mathbb{Q})$ to $E[2]$, which is $4^{|\mathrm{Gal}(k/\mathbb{Q})|} = 2^{2[k:\mathbb{Q}]}$.
Complications (2) and (3) are more subtle. Let us list some examples.
Example 1. $y^2 = x^3 + x$. The splitting field is $k = \mathbb{Q}(\sqrt{-1})$. The
ring of integers $\mathbb{Z}[\sqrt{-1}]$ is a unique factorization domain, and its group
of units is $\{(\sqrt{-1})^k\}_{k=0}^{3} \cong \mathbb{Z}_4$.
Example 2. $y^2 = x^3 - 2x$. The splitting field is $k = \mathbb{Q}(\sqrt{2})$. The
ring of integers $\mathbb{Z}[\sqrt{2}]$ is a unique factorization domain, and its group of
units is the infinite group $\{\pm(1 \pm \sqrt{2})^k\}_{k=-\infty}^{\infty} \cong \mathbb{Z} \oplus \mathbb{Z}_2$.
Example 3. $y^2 = x^3 + 5x$. The splitting field is $k = \mathbb{Q}(\sqrt{-5})$. The
ring of integers $\mathbb{Z}[\sqrt{-5}]$ is not a unique factorization domain. In fact,
the numbers 2, 3, $1 \pm \sqrt{-5}$ are all prime, and $2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$.
The group of units is $\{\pm 1\} \cong \mathbb{Z}_2$.
In the first two examples, $\mathbb{Z}[\sqrt{-1}]$ and $\mathbb{Z}[\sqrt{2}]$ are unique factorization
domains. So it is meaningful to write
$$\begin{aligned}
k^\times/k^{\times 2} &= \{\text{units}/(\text{squares of units})\} \oplus \{p_1^ap_2^b\cdots \mid a,b,\ldots \in \{0,1\}\} \\
&= \{\text{units}/(\text{squares of units})\} \oplus \bigoplus_{p \text{ prime}} \mathbb{Z}_2;
\end{aligned} \tag{4.29}$$
in the sum we list only one representative from each class
$\{p\varepsilon \mid \varepsilon = \text{unit}\}$ of associate primes. As in (4.19), we get a homomorphism
$\varphi : E(k) \to k^\times/k^{\times 2}$ generically given by $x \to (x - \alpha)k^{\times 2}$. And
as in Corollary 4.7, we obtain a one-one homomorphism
$$\varphi_\alpha \times \varphi_\beta : E(k)/2E(k) \to \{\text{units}/(\text{squares of units})\} \oplus \bigoplus_{p \text{ prime}} (\mathbb{Z}_2 \oplus \mathbb{Z}_2). \tag{4.30}$$
As in Proposition 4.8, we can discard all primes $p$ except those dividing
the integer discriminant $d$ given in (4.24). There are only finitely many
such primes because of unique factorization. Then $E(k)/2E(k)$ has been
mapped one-one into the finite group
$$\{\text{units}/(\text{squares of units})\} \oplus \bigoplus_{p \mid d} (\mathbb{Z}_2 \oplus \mathbb{Z}_2), \tag{4.31}$$
and $E(k)/2E(k)$ is therefore finite.
For the third example, unique factorization fails for $\mathbb{Z}[\sqrt{-5}]$, and
(4.29) is no longer valid. However, if we let $S = \{1, 2, 2^2, 2^3, \ldots\}$ and
define $R = S^{-1}\mathbb{Z}[\sqrt{-5}]$, then $R$ is a ring with $\mathbb{Z}[\sqrt{-5}] \subseteq R \subseteq k$, and it
turns out that $R$ has the following two properties:
(1) $R$ is a principal ideal domain, hence a unique factorization domain.
(2) The group of units in $R$ is finitely generated (with $-1$ and $\tfrac{1}{2}$ as
generators).
Since $R$ contains $\mathbb{Z}[\sqrt{-5}]$, the quotient field of $R$ is still $k$. By unique
factorization, (4.29) is valid if we interpret units and primes as units and
primes in $R$. The argument yielding a one-one homomorphism (4.30)
is then valid, and again we can collapse the image to (4.31). Since the
group of units in $R$ is finitely generated, it is a direct sum of cyclic
groups, and thus
$$\{\text{units}/(\text{squares of units})\}$$
is a finite group. Therefore (4.31) is a finite group, and $E(k)/2E(k)$ has
to be finite.
The same argument as above reduces the finiteness of $E(k)/2E(k)$ in
the general case to the following result.
Theorem 4.10. Let $k$ be a finite extension of $\mathbb{Q}$, and let $\mathcal{O}_k$ be
the ring of algebraic integers in $k$. Then there exists a ring $R$ with
$\mathcal{O}_k \subseteq R \subseteq k$ such that
(1) $R$ is a principal ideal domain, hence a unique factorization
domain.
(2) The group of units in $R$ is finitely generated.
The proof of Theorem 4.10 relies on three famous results in algebraic
number theory. In §9 we shall sketch the relevant background from
algebraic number theory and then give a proof of this theorem.
As we mentioned, the finiteness of $E(k)/2E(k)$ follows from Theorem
4.10. Then Proposition 4.9 implies that $E(\mathbb{Q})/2E(\mathbb{Q})$ is finite. This
completes the proof of Theorem 4.5.
5. Height and Mordell's Theorem
With the hard step finished, we can now address Mordell's Theorem.
In this section we state the theorem, introduce the notions of "naive
height" and "canonical height," and show how Mordell's Theorem
follows from Theorem 4.5.
Theorem 4.11 (Mordell). If $E$ is an elliptic curve over $\mathbb{Q}$, then the
abelian group $E(\mathbb{Q})$ is finitely generated.
Using an admissible change of variables, we may assume that $E$ is of
the form
$$y^2 = x^3 + Ax + B \tag{4.32}$$
with $A$ and $B$ in $\mathbb{Z}$. For $P = (x,y)$ in $E(\mathbb{Q})$, write $x = \tfrac{p}{q}$ with
$\mathrm{GCD}(p,q) = 1$, and define the naive height of $P$ by
$$h_0(P) = \log\max(|p|,|q|) \geq 0. \tag{4.33}$$
By convention we define
$$h_0(\infty) = 0.$$
For any given $C$,
$$\{P \in E(\mathbb{Q}) \mid h_0(P) < C\} \text{ is a finite set.} \tag{4.34}$$
Proposition 4.12. The naive height satisfies
$$h_0(2P) = 4h_0(P) + O(1), \tag{4.35}$$
where $O(1)$ is bounded independently of $P \in E(\mathbb{Q})$.
Proof. If $P^* = 2P = (x^*,y^*) \neq \infty$, we know from (3.74) that
$$x^* = \frac{x^4 - b_4x^2 - 2b_6x - b_8}{4x^3 + b_2x^2 + 2b_4x + b_6},$$
where $b_2 = 0$, $b_4 = 2A$, $b_6 = 4B$, $b_8 = -A^2$. Thus
$$x^* = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4(x^3 + Ax + B)}.$$
If $x = \tfrac{p}{q}$ with $\mathrm{GCD}(p,q) = 1$, then $x^* = \tfrac{p^*}{q^*}$ with
$$p^* = p^4 - 2Ap^2q^2 - 8Bpq^3 + A^2q^4$$
$$q^* = 4q(p^3 + Apq^2 + Bq^3).$$
Let $\delta = \mathrm{GCD}(p^*,q^*)$, $p^{**} = p^*/\delta$, and $q^{**} = q^*/\delta$, so that $x^* = \tfrac{p^{**}}{q^{**}}$ in
lowest terms. Then
$$\begin{aligned}
\max(|p^{**}|,|q^{**}|) &\leq \max(|p^*|,|q^*|) \\
&\leq \max(|p|,|q|)^4 \max(1 + 2|A| + 8|B| + A^2,\; 4(1 + |A| + |B|)) \\
&= C_{A,B}\max(|p|,|q|)^4,
\end{aligned}$$
and
$$h_0(2P) \leq 4h_0(P) + \log C_{A,B}. \tag{4.36}$$
To get an inequality in the reverse direction, we shall bound
$\max(|p|,|q|)$ in terms of $\max(|p^*|,|q^*|)$, and we shall bound
$\max(|p^*|,|q^*|)$ in terms of $\max(|p^{**}|,|q^{**}|)$. Let $d = -4A^3 - 27B^2 \neq 0$
be the discriminant of the cubic polynomial $x^3 + Ax + B$. Then one
can check the following identities:
$$\begin{aligned}
4dq^7 ={}& (3p^3 - 5Apq^2 - 27Bq^3)q^* - 4(3p^2q + 4Aq^3)p^* \\
4dp^7 ={}& -\big(A^2Bp^3 + (5A^4 + 32AB^2)p^2q \\
& \quad + (26A^3B + 192B^3)pq^2 - 3(A^5 + 8A^2B^2)q^3\big)q^* \\
& - 4\big((4A^3 + 27B^2)p^3 - A^2Bp^2q \\
& \quad + (3A^4 + 22AB^2)pq^2 + 3(A^3B + 8B^3)q^3\big)p^*.
\end{aligned} \tag{4.37}$$
[To derive these formulas, one can use two applications of Proposition
2.16 and its proof, with $f = p^*$ and $g = q^*$. In one application, $X = p$
and the domain is $k[q]$; in the other application, $X = q$ and the domain
is $k[p]$. A symbolic manipulation program is a helpful tool.]
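Short of a symbolic computation, the two identities (4.37) can be spot-checked by exact integer evaluation at random values of $A$, $B$, $p$, $q$. The sketch below does only that; agreement on many samples is strong evidence for a polynomial identity of this degree but is not a proof. The function name `check_437` is ours.

```python
import random

def check_437(A, B, p, q):
    """Evaluate both sides of the two identities (4.37) at integers A, B, p, q."""
    d = -4 * A**3 - 27 * B**2
    ps = p**4 - 2 * A * p**2 * q**2 - 8 * B * p * q**3 + A**2 * q**4
    qs = 4 * q * (p**3 + A * p * q**2 + B * q**3)
    first = ((3 * p**3 - 5 * A * p * q**2 - 27 * B * q**3) * qs
             - 4 * (3 * p**2 * q + 4 * A * q**3) * ps)
    second = (-(A**2 * B * p**3 + (5 * A**4 + 32 * A * B**2) * p**2 * q
                + (26 * A**3 * B + 192 * B**3) * p * q**2
                - 3 * (A**5 + 8 * A**2 * B**2) * q**3) * qs
              - 4 * ((4 * A**3 + 27 * B**2) * p**3 - A**2 * B * p**2 * q
                     + (3 * A**4 + 22 * A * B**2) * p * q**2
                     + 3 * (A**3 * B + 8 * B**3) * q**3) * ps)
    return first == 4 * d * q**7 and second == 4 * d * p**7

rng = random.Random(0)                   # fixed seed so the run is reproducible
all_ok = all(
    check_437(*(rng.randint(-50, 50) for _ in range(4)))
    for _ in range(500)
)
```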
From (4.37), we obtain
$$\max(|p|,|q|)^7 \leq C'_{A,B}\max(|p|,|q|)^3\max(|p^*|,|q^*|)$$
and thus
$$\max(|p|,|q|)^4 \leq C'_{A,B}\max(|p^*|,|q^*|). \tag{4.38}$$
This inequality is one of the desired bounds. For the other one, we see
from (4.37) that $\delta \mid 4dq^7$ and $\delta \mid 4dp^7$. Since $\mathrm{GCD}(p,q) = 1$, $\delta \mid 4d$. Hence $\delta$
is bounded, and
$$\max(|p^*|,|q^*|) \leq C''\max(|p^{**}|,|q^{**}|).$$
Combining this inequality with (4.38), we have
$$\max(|p|,|q|)^4 \leq C'_{A,B}C''\max(|p^{**}|,|q^{**}|). \tag{4.39}$$
Inequalities (4.36) and (4.39) together prove the proposition.
Proposition 4.13. There exists a unique function $h : E(\mathbb{Q}) \to \mathbb{R}$
satisfying
(i) $h(P) - h_0(P)$ is bounded
(ii) $h(2P) = 4h(P)$.
The function is given by
$$h(P) = \lim_{n\to\infty} \frac{h_0(2^nP)}{4^n}. \tag{4.40}$$
It has $h(P) \geq 0$ with equality if and only if $P$ has finite order. Also
$\{P \mid h(P) < C\}$ is a finite set for each $C$.
Remark. The function $h$ is called the canonical height.
Proof. We begin with uniqueness. Let $h$ satisfy (i) and (ii), with a
bound $C'$ in (i). Then
$$|4^nh(P) - h_0(2^nP)| = |h(2^nP) - h_0(2^nP)| \leq C'$$
by (ii) and (i). Thus
$$\left|h(P) - \frac{h_0(2^nP)}{4^n}\right| \leq \frac{C'}{4^n},$$
and $h$ must be given by (4.40).
For existence, we show that the sequence $\left\{\dfrac{h_0(2^nP)}{4^n}\right\}$ is Cauchy. By
Proposition 4.12, we can write
$$|h_0(2Q) - 4h_0(Q)| \leq C''$$
with $C''$ independent of $Q$. Then $N \geq M \geq 0$ implies
$$\begin{aligned}
|4^{-N}h_0(2^NP) - 4^{-M}h_0(2^MP)|
&= \left|\sum_{n=M}^{N-1}\big(4^{-n-1}h_0(2^{n+1}P) - 4^{-n}h_0(2^nP)\big)\right| \\
&\leq \sum_{n=M}^{N-1} 4^{-n-1}\,|h_0(2^{n+1}P) - 4h_0(2^nP)| \\
&\leq \sum_{n=M}^{N-1} 4^{-n-1}C'' \leq \frac{4^{-M}C''}{3},
\end{aligned} \tag{4.41}$$
and the right side tends to 0 as $M$ and $N$ tend to $\infty$. Thus $h(P)$ is
defined. Letting $N \to \infty$ in (4.41) and taking $M = 0$ gives
$$|h(P) - h_0(P)| \leq \frac{C''}{3},$$
which proves (i). Result (ii) is clear from (4.40). Also $h(P) \geq 0$.
If $P$ is a torsion point, then $2^nP$ lies in a finite set. So $h(P) = 0$. In
addition, (i) implies $\{P \mid h(P) < C\}$ is finite.
Now suppose $P$ has infinite order. Since $\{P \mid h(P) < 1\}$ is finite, we
must have $h(2^nP) \geq 1$ for some $n$. By (ii), $h(P) \geq 4^{-n} > 0$.
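The limit (4.40) converges quickly in practice, as the Cauchy estimate suggests. The following sketch computes the first few terms $4^{-n}h_0(2^nP)$ with exact rational arithmetic, using the $x$-coordinate duplication formula from the proof of Proposition 4.12. The sample curve $y^2 = x^3 - 25x$ and point $P = (-4, 6)$ are our own choices, and the function names are hypothetical.

```python
from fractions import Fraction as F
from math import log

def double_x(x, A, B):
    """x(2P) in terms of x = x(P) on y^2 = x^3 + A x + B."""
    return (x**4 - 2 * A * x**2 - 8 * B * x + A * A) / (4 * (x**3 + A * x + B))

def naive_height(x):
    """h0 = log max(|p|, |q|) for x = p/q in lowest terms (Fraction keeps lowest terms)."""
    return log(max(abs(x.numerator), abs(x.denominator)))

def height_approximations(x, A, B, steps):
    """Return [h0(2^n P) / 4^n for n = 0, ..., steps]."""
    out = []
    for n in range(steps + 1):
        out.append(naive_height(x) / 4**n)
        if n < steps:
            x = double_x(x, A, B)        # fails only if 2^n P has order 2
    return out

approx = height_approximations(F(-4), -25, 0, 5)
```

The successive entries of `approx` settle down after a few doublings, even though the numerators and denominators of $x(2^nP)$ grow to hundreds of digits.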
Proposition 4.14. The canonical height on $E(\mathbb{Q})$ satisfies
$$h(P + Q) + h(P - Q) = 2h(P) + 2h(Q). \tag{4.42}$$
Proof. We shall prove that
$$h(P + Q) + h(P - Q) \le 2h(P) + 2h(Q). \tag{4.43}$$
Once we have done so, we can apply (4.43) to $P' = P + Q$ and $Q' = P - Q$
to get
$$h(2P) + h(2Q) \le 2h(P + Q) + 2h(P - Q).$$
Dividing by 2 and using property (ii) of $h$, we obtain
$$2h(P) + 2h(Q) \le h(P + Q) + h(P - Q).$$
In combination with (4.43), this inequality proves (4.42).
To prove (4.43), it is enough, by (4.40), to prove that
$$h_0(P + Q) + h_0(P - Q) \le 2h_0(P) + 2h_0(Q) + O(1).$$
If $P$ or $Q$ is $\infty$, this inequality is trivial; if $P + Q$ or $P - Q$ is $\infty$, the
inequality reduces to Proposition 4.12. Thus let us write
$P = (x, y)$ with $x = p/q$ in lowest terms,
$Q = (x', y')$ with $x' = p'/q'$ in lowest terms,
$|x|_\infty = \max(|p|, |q|)$, $|x'|_\infty = \max(|p'|, |q'|)$, $x_\pm = x(P \pm Q)$.
Since $P \ne \pm Q$, (3.72) and (3.73) in the chord case give
$$x_\pm = x(P \pm Q) = \Big(\frac{y' \mp y}{x' - x}\Big)^2 - x - x'.$$
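Thus
$$\begin{aligned}
x_+ + x_- &= 2\,\frac{y'^2 + y^2 - (x' + x)(x' - x)^2}{(x' - x)^2} \\
&= 2\,\frac{(x'^3 + x^3) + A(x' + x) + 2B - (x' + x)(x'^2 - 2xx' + x^2)}{(x' - x)^2} \\
&= 2\,\frac{xx'(x' + x) + A(x' + x) + 2B}{(x' - x)^2} \\
&= 2\,\frac{pp'(p'q + pq') + Aqq'(p'q + pq') + 2Bq^2q'^2}{(p'q - pq')^2}
\end{aligned} \tag{4.44}$$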
and
$$\begin{aligned}
x_+x_- &= \frac{(x' - x)^2(x'^2 + xx' + x^2 + A)^2}{(x' - x)^4}
 - \frac{2(x + x')\big[(x' + x)(x'^2 - xx' + x^2 + A) + 2B\big]}{(x' - x)^2} + (x + x')^2 \\
&= \frac{(x'^2 + xx' + x^2 + A)^2 - 2(x'^2 + 2xx' + x^2)(x'^2 - xx' + x^2 + A) - 4B(x + x') + (x'^2 - x^2)^2}{(x' - x)^2} \\
&= \frac{(xx' - A)^2 - 4B(x + x')}{(x' - x)^2} \qquad \text{after simplification} \\
&= \frac{(pp' - qq'A)^2 - 4qq'B(pq' + p'q)}{(p'q - pq')^2}.
\end{aligned} \tag{4.45}$$
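Examining (4.44) and (4.45), we see that
$$x_+ + x_- = \frac{r}{t} \quad\text{and}\quad x_+x_- = \frac{s}{t}$$
with
$$\max(|r|, |s|, |t|) \le C\,|x|_\infty^2\,|x'|_\infty^2. \tag{4.46}$$
Then $x_+$ and $x_-$ are the two roots of $X^2 - (r/t)X + (s/t) = 0$, namely
$X = \frac{1}{2t}\big(r \pm \sqrt{r^2 - 4st}\big)$, and we see that
$$x_+ \in \tfrac{1}{2t}\mathbb{Z} \quad\text{and}\quad x_- \in \tfrac{1}{2t}\mathbb{Z}. \tag{4.47}$$
Let $x_+ = p_+/q_+$ and $x_- = p_-/q_-$ in lowest terms. By (4.47), $2t = \delta_+q_+ = \delta_-q_-$
for suitable integers $\delta_+$ and $\delta_-$. Then
$$(\delta_+\delta_-)(q_+q_-) = 4t^2. \tag{4.48}$$
From
$$\frac{r}{t} = x_+ + x_- = \frac{p_+}{q_+} + \frac{p_-}{q_-} = \frac{p_+q_- + p_-q_+}{q_+q_-},$$
we obtain
$$(p_+q_- + p_-q_+)\,t = rq_+q_- = \frac{4rt^2}{\delta_+\delta_-}$$
and hence
$$\delta_+\delta_-(p_+q_- + p_-q_+) = 4rt. \tag{4.49}$$
From
$$\frac{s}{t} = x_+x_- = \frac{p_+p_-}{q_+q_-},$$
we obtain
$$(p_+p_-)\,t = sq_+q_- = \frac{4st^2}{\delta_+\delta_-}$$
and hence
$$(\delta_+p_+)(\delta_-p_-) = 4st. \tag{4.50}$$
We shall show that $t \mid \delta_+\delta_-$, and then
$$\begin{aligned}
|p_+q_- + p_-q_+| &\le 4|r| && \text{from (4.49)} \\
|p_+p_-| &\le 4|s| && \text{from (4.50)} \\
|q_+q_-| &\le 4|t| && \text{from (4.48).}
\end{aligned} \tag{4.51}$$
To see that $t \mid \delta_+\delta_-$, first fix a prime $p$. We construct by (4.50) integers
$a$ and $b \ge 0$ such that $p^a \mid \delta_+p_+$, $p^b \mid \delta_-p_-$, and $p^{a+b}$ is the exact power of
$p$ dividing $t$. Since $2t = \delta_+q_+ = \delta_-q_-$, $p^a \mid \delta_+q_+$ and $p^b \mid \delta_-q_-$. Then $p^a$
divides $\mathrm{GCD}(\delta_+p_+, \delta_+q_+) = \delta_+$, and $p^b$ divides $\mathrm{GCD}(\delta_-p_-, \delta_-q_-) = \delta_-$.
So $p^{a+b} \mid \delta_+\delta_-$. Since $p$ is arbitrary, $t \mid \delta_+\delta_-$. This proves (4.51).
We readily check the numerical inequality
$$\max(|p_+|,|q_+|)\max(|p_-|,|q_-|) \le 2\max(|p_+q_- + p_-q_+|,\ |p_+p_-|,\ |q_+q_-|).$$
Substituting in the right side from (4.51) and then using (4.46), we
obtain
$$\max(|p_+|,|q_+|)\max(|p_-|,|q_-|) \le 8\max(|r|, |s|, |t|) \le 8C\,|x|_\infty^2\,|x'|_\infty^2.$$
Consequently
$$h_0(P + Q) + h_0(P - Q) \le 2h_0(P) + 2h_0(Q) + \log 8C,$$
and the proposition follows.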
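The two rational identities behind the proof, the third line of (4.44) and the third line of (4.45), can be spot-checked with exact arithmetic. The sketch below is illustrative only: the curve $y^2 = x^3 + 17$ and the points $(-1, 4)$ and $(2, 5)$ are assumed sample data, and the helper names are hypothetical.

```python
from fractions import Fraction

def chord_x(P, Q):
    """x-coordinate of P + Q by the chord formula, assuming x(P) != x(Q)."""
    (x, y), (x2, y2) = P, Q
    lam = (y2 - y) / (x2 - x)
    return lam * lam - x - x2

def check_identities(P, Q, A, B):
    """Compare x(P+Q) + x(P-Q) and x(P+Q) * x(P-Q) with (4.44) and (4.45)."""
    (x, y), (x2, y2) = P, Q
    x_plus = chord_x(P, Q)
    x_minus = chord_x(P, (x2, -y2))      # P - Q = P + (x', -y')
    sum_formula = 2 * (x * x2 * (x + x2) + A * (x + x2) + 2 * B) / (x2 - x) ** 2
    prod_formula = ((x * x2 - A) ** 2 - 4 * B * (x + x2)) / (x2 - x) ** 2
    return (x_plus + x_minus == sum_formula,
            x_plus * x_minus == prod_formula,
            x_plus, x_minus)

# Sample data (an assumption for illustration): y^2 = x^3 + 17.
P = (Fraction(-1), Fraction(4))
Q = (Fraction(2), Fraction(5))
sum_ok, prod_ok, x_plus, x_minus = check_identities(P, Q, 0, 17)
```

For this sample pair the chord formula gives $x_+ = -8/9$ and $x_- = 8$, and both comparisons hold exactly.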
Proof of Theorem 4.11. Since Theorem 4.5 has shown
$E(\mathbb{Q})/2E(\mathbb{Q})$ to be finite, there exists some sufficiently large $C$ such
that the set
$$S = \{P \in E(\mathbb{Q}) \mid h(P) \le C\}$$
contains a representative for each coset of $E(\mathbb{Q})/2E(\mathbb{Q})$. The set $S$ is
finite. We claim it generates $E(\mathbb{Q})$.
Assume the contrary, and let $P \in E(\mathbb{Q})$ be outside the group
generated by $S$. Since $\{P' \in E(\mathbb{Q}) \mid h(P') \le C'\}$ is finite, we may assume $P$
is chosen with $h(P)$ as small as possible among all points outside the
group generated by $S$. Choose $Q \in S$ with
$$P \equiv Q \bmod 2E(\mathbb{Q}),$$
say $P = Q + 2R$. By Proposition 4.14, we have either
$$h(P + Q) \le h(P) + h(Q)$$
or
$$h(P - Q) \le h(P) + h(Q);$$
fix the sign of $h(P \pm Q)$ so that this happens. From $P + Q = 2(Q + R)$
and $P - Q = 2R$, we have $P \pm Q = 2P'$ for some $P' \in E(\mathbb{Q})$. Then
$$4h(P') = h(2P') = h(P \pm Q) \le h(P) + h(Q) \le h(P) + C < 2h(P) \le 4h(P).$$
So $h(P') < h(P)$. By minimality, $P'$ is in the group generated by $S$.
Since $Q$ has this property and $P \pm Q = 2P'$, $P$ has this property,
contradiction.
6. Geometric Formula for Rank
We have now proved that the Q rational points of an elliptic curve E
over Q form a finitely generated abelian group. Thus
$$E(\mathbb{Q}) \cong \mathbb{Z}^r \oplus F, \tag{4.52}$$
where $F$ is a finite abelian group. The group $F$ is uniquely defined
as the torsion subgroup, and Chapter V will show how to determine it
completely for any particular $E$. The integer $r$ is the rank of $E(\mathbb{Q})$.
In this section we shall give a geometric limit formula for the rank,
under the special assumption that $E$ is of the form
$$y^2 = x^3 + Ax + B, \qquad A, B \in \mathbb{Z}. \tag{4.53}$$
The tool will be the canonical height $h$ defined in §5. We give a complete
proof modulo one result from Euclidean Fourier analysis.
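Proposition 4.15. With $E$ as in (4.53), there exists a unique
$\mathbb{Z}$ bilinear form $\langle P, Q\rangle$ on $E(\mathbb{Q})$ such that $\langle P, P\rangle = h(P)$. This form
descends to $E(\mathbb{Q})/F \cong \mathbb{Z}^r$ and is positive definite there.
Remarks. Let $\{P_i\}$ be a $\mathbb{Z}$ basis of $\mathbb{Z}^r$, and let $c_{ij} = \langle P_i, P_j\rangle$, so that
the form is given by
$$\Big\langle \sum_i m_iP_i,\ \sum_j n_jP_j \Big\rangle = \sum_{i,j} m_i c_{ij} n_j. \tag{4.54}$$
Then the proof will show that the symmetric matrix $(c_{ij})$ is positive
semidefinite and that $\langle \sum_i m_iP_i, \sum_i m_iP_i\rangle = 0$ only when all $m_i$ are 0.
Subsequently we shall see that $(c_{ij})$ is actually a positive definite matrix.
Proof. If the bilinear form exists, it has to be given by
$$\langle P, Q\rangle = \tfrac12\big(h(P + Q) - h(P) - h(Q)\big). \tag{4.55}$$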
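This proves uniqueness. For existence, define $\langle P, Q\rangle$ by (4.55). Then
$\langle P, Q\rangle$ is certainly symmetric. For additivity in the first variable, we use
Proposition 4.14 several times. First we have
$$\begin{aligned}
\langle P, -Q\rangle &= \tfrac12\big(h(P - Q) - h(P) - h(Q)\big) \\
&= -\tfrac12\big(h(P + Q) - h(P) - h(Q)\big) = -\langle P, Q\rangle.
\end{aligned} \tag{4.56}$$
Then we can write
$$\begin{aligned}
\langle P + P', Q\rangle + \langle P - P', Q\rangle
&= \tfrac12\big(h(P + P' + Q) + h(P - P' + Q) \\
&\qquad - h(P + P') - h(P - P') - h(Q) - h(Q)\big) \\
&= \tfrac12\big(2h(P + Q) + 2h(P') - 2h(P) - 2h(P') - 2h(Q)\big) \\
&= 2\langle P, Q\rangle.
\end{aligned} \tag{4.57}$$
Interchanging $P$ and $P'$ and using (4.56) gives
$$\langle P + P', Q\rangle - \langle P - P', Q\rangle = 2\langle P', Q\rangle.$$
Adding this to (4.57), we obtain
$$\langle P + P', Q\rangle = \langle P, Q\rangle + \langle P', Q\rangle.$$
Thus the form is bilinear.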
Next let us observe that
$$|\langle P, Q\rangle|^2 \le h(P)h(Q). \tag{4.58}$$
In fact, the bilinearity proves that
$$0 \le h(nQ - mP) = n^2h(Q) - 2mn\langle P, Q\rangle + m^2h(P).$$
Hence
$$h(Q)\lambda^2 - 2\langle P, Q\rangle\lambda + h(P) \ge 0$$
for all $\lambda \in \mathbb{Q}$, and the inequality extends to all $\lambda \in \mathbb{R}$ by a passage to
the limit. The discriminant must then be $\le 0$, and the result is (4.58).
If $Q$ is a torsion point, then $h(Q) = 0$, and (4.58) shows that $\langle P, Q\rangle = 0$.
Hence
$$0 = \langle P, Q\rangle = \tfrac12\big(h(P + Q) - h(P) - h(Q)\big) = \tfrac12\big(h(P + Q) - h(P)\big),$$
and $h(P + Q) = h(P)$. It follows that the form descends to $E(\mathbb{Q})/F$.
On $\mathbb{Z}^r$ the form is given by (4.54). Since $h \ge 0$, we see that
$$\sum_{i,j} \lambda_i c_{ij} \lambda_j \ge 0$$
for all systems of rationals $\lambda_1, \dots, \lambda_r$. Passing to the limit, we see
that $\sum_{i,j} \lambda_i c_{ij} \lambda_j \ge 0$ for all reals $\lambda_1, \dots, \lambda_r$. Thus $(c_{ij})$ is positive
semidefinite.
Let us now sharpen the conclusion of Proposition 4.15 by showing that
the matrix $(c_{ij})$ is positive definite. We shall use the following celebrated
result of Minkowski.
Lemma 4.16 (Minkowski). If $E$ is a compact convex set in $\mathbb{R}^r$
containing 0 and closed under negatives and having volume $> 4^r$, then $E$
contains a nonzero member of $\mathbb{Z}^r$.
Remark. Fine tuning of the proof would show that the same
conclusion holds when the volume is merely $\ge 2^r$.
Proof. Let $N$ be an integer large enough so that the standard cube
$C$ with center 0 and side $4N$ contains $E$. Suppose $E$ contains no nonzero
member of $\mathbb{Z}^r$. We claim that the sets $\{l + \tfrac12E\}$ for $l \in \mathbb{Z}^r$ are disjoint.
In fact, if $l_1 + \tfrac12e_1 = l_2 + \tfrac12e_2$ with $l_1 \ne l_2$, then $l_1 - l_2 = \tfrac12(e_2 - e_1)$, and
this is in $E$ since $e_2$ and $-e_1$ are in $E$ and $E$ is convex.
If every coordinate of $l$ is $\le N$ in absolute value, then $l + \tfrac12E$ is in $C$.
Hence
$$(4N)^r = \mathrm{vol}(C) \ge \sum_{|\text{coords } l| \le N} \mathrm{vol}(l + \tfrac12E)
\ge (2N)^r\,\mathrm{vol}(\tfrac12E) = 2^{-r}(2N)^r\,\mathrm{vol}(E).$$
Since $\mathrm{vol}(E) > 4^r$ by assumption, this is a contradiction.
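A brute-force search makes the lemma concrete in the plane. The sketch below is only an illustration under assumed data: a thin ellipse with semi-axes 20 and 0.3, rotated by 1 radian, whose area $\pi \cdot 20 \cdot 0.3 \approx 18.85$ exceeds $4^2$; the function name is hypothetical.

```python
from math import cos, sin, pi

def find_lattice_point(a, b, theta):
    """Search for a nonzero integer point in the rotated ellipse
    (v.u1)^2 / a^2 + (v.u2)^2 / b^2 <= 1, where u1, u2 are the rotated axes."""
    u1 = (cos(theta), sin(theta))
    u2 = (-sin(theta), cos(theta))
    bound = int(max(a, b)) + 1          # the ellipse lies in the disk of radius max(a, b)
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if (m, n) == (0, 0):
                continue
            s = m * u1[0] + n * u1[1]   # component along the long axis
            t = m * u2[0] + n * u2[1]   # component along the short axis
            if (s / a) ** 2 + (t / b) ** 2 <= 1:
                return (m, n)
    return None

a, b, theta = 20.0, 0.3, 1.0            # assumed sample ellipse
area = pi * a * b
point = find_lattice_point(a, b, theta)
```

The ellipse is compact, convex, and symmetric about 0, so the lemma predicts that the search cannot come back empty, even though the ellipse is only 0.6 wide.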
Proposition 4.17. With $E$ as in (4.53), the symmetric positive
semidefinite matrix $(c_{ij})$ in (4.54) is actually positive definite.
Proof. Since $(c_{ij})$ is symmetric and positive semidefinite, we can
choose column vectors $v^{(1)}, \dots, v^{(r)}$ that form an orthonormal basis of
eigenvectors of $(c_{ij})$ with respective eigenvalues
$$\lambda^{(1)} \ge \dots \ge \lambda^{(r)} \ge 0.$$
We are to prove that $\lambda^{(r)} > 0$. Arguing by contradiction, suppose $\lambda^{(r)} = 0$.
Let $\{e^{(i)}\}$ be the standard orthonormal basis of column vectors, and
write $v^{(k)} = \sum_i v_i^{(k)} e^{(i)}$. We shall identify $P = \sum_i m_iP_i$ with the column
vector $\sum_i m_i e^{(i)}$, so that the $\mathbb{Z}^r$ quotient of $E(\mathbb{Q})$ gets identified with
the standard lattice of column vectors. The form $\langle\cdot,\cdot\rangle$ then transfers to
column vectors with the definition
$$\Big\langle \sum_i a_i e^{(i)},\ \sum_j b_j e^{(j)} \Big\rangle = \sum_{i,j} a_i c_{ij} b_j.$$
Then
$$\Big\langle \sum_k b_k v^{(k)},\ \sum_k b_k v^{(k)} \Big\rangle = \sum_k \lambda^{(k)} b_k^2 \tag{4.59}$$
since the $v^{(k)}$ are orthonormal.
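Let $\varepsilon$ be the minimum value of $h(P) = \langle P, P\rangle$ for $P \notin F$. This exists
and is $> 0$ since $\{P \mid h(P) \le C\}$ is always finite. Let $E$ be the compact
convex set of column vectors
$$E = \Big\{ \sum_k b_k v^{(k)} \ \Big|\ b_k^2 \le \frac{\varepsilon}{2r\lambda^{(1)}} \text{ for } k < r \text{ and } |b_r| \le M \Big\},$$
with $M$ chosen so large that $\mathrm{vol}(E) > 4^r$. By Lemma 4.16, $E$ contains
a nonzero lattice point $\sum_k b_k v^{(k)} \leftrightarrow P$. But then (4.59) gives
$$\begin{aligned}
\langle P, P\rangle = \Big\langle \sum_k b_k v^{(k)},\ \sum_k b_k v^{(k)} \Big\rangle
&= \sum_{k=1}^{r} \lambda^{(k)} b_k^2 \\
&= \sum_{k=1}^{r-1} \lambda^{(k)} b_k^2 && \text{since } \lambda^{(r)} = 0 \\
&\le \sum_{k=1}^{r-1} \lambda^{(1)}\,\frac{\varepsilon}{2r\lambda^{(1)}} && \text{since } \textstyle\sum_k b_k v^{(k)} \text{ is in } E \\
&< \frac{\varepsilon}{2},
\end{aligned}$$
and we have a contradiction. Thus $\lambda^{(r)} > 0$, and $(c_{ij})$ is definite.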
The matrix $(c_{ij}) = (\langle P_i, P_j\rangle)$ depends on the choice of the basis $\{P_i\}$
of $\mathbb{Z}^r$, but its determinant does not. The elliptic regulator of $E$ over
$\mathbb{Q}$ is the number
$$R_{E/\mathbb{Q}} = \det(\langle P_i, P_j\rangle).$$
Proposition 4.17 shows that $R_{E/\mathbb{Q}} > 0$. This number enters into the
statement of the third form of the Birch and Swinnerton-Dyer
Conjecture discussed in Chapter I.
We can now give the geometric interpretation of rank. We use the
symbol ~ to denote "asymptotic to," in the sense that the ratio tends
to 1.
Proposition 4.18. With $E$ as in (4.53), the following formula is
valid as $T \to \infty$:
$$\#\{(x, y) \in E(\mathbb{Q}) \mid |x|_\infty \le T\}\ \begin{cases}
= \#(F) & \text{if } r = 0 \\[4pt]
\sim \dfrac{\#(F)\,\Omega_r}{\sqrt{R_{E/\mathbb{Q}}}}\,(\log T)^{r/2} & \text{if } r > 0.
\end{cases}$$
Here $\#(\cdot)$ refers to the number of elements in a set, and $\Omega_r$ is the volume
of the unit ball in $\mathbb{R}^r$.
Proof. We may assume $r > 0$. Let $(c_{ij})$ be the positive definite
matrix in (4.54). With $e^{(i)}$ as the standard basis of $\mathbb{R}^r$, we borrow the
following fact from Euclidean Fourier analysis: The number of lattice
points $\sum n_i e^{(i)}$ with $\sum n_i c_{ij} n_j \le t$ is
$$\sim \frac{\Omega_r\,t^{r/2}}{\det(c_{ij})^{1/2}}$$
as $t \to \infty$. Consequently the number of points $\sum n_iP_i + f$ in $E(\mathbb{Q}) = \mathbb{Z}^r \oplus F$
with $h(\sum n_iP_i + f) \le t$ is
$$\sim \frac{\#(F)\,\Omega_r\,t^{r/2}}{\det(c_{ij})^{1/2}}$$
as $t \to \infty$. Since $h - h_0$ is bounded and $r > 0$, the same asymptotic
estimate is valid for $h_0$. Taking $t = \log T$ and using the definition of $h_0$,
we obtain the proposition.
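The lattice-point count borrowed from Fourier analysis is easy to test in rank 2, where $\Omega_2 = \pi$. The following sketch uses an assumed sample Gram matrix with entries $c_{11} = 2$, $c_{12} = 1$, $c_{22} = 3$ (determinant 5); it is not attached to any particular curve, and the function name is hypothetical.

```python
from math import pi, sqrt, isqrt

def count_points(c11, c12, c22, t):
    """Count integer pairs (m, n) with c11*m^2 + 2*c12*m*n + c22*n^2 <= t."""
    det = c11 * c22 - c12 * c12
    # On the region, n^2 <= c11*t/det and m^2 <= c22*t/det (positive definite form).
    n_max = isqrt(c11 * t // det) + 1
    m_max = isqrt(c22 * t // det) + 1
    count = 0
    for m in range(-m_max, m_max + 1):
        for n in range(-n_max, n_max + 1):
            if c11 * m * m + 2 * c12 * m * n + c22 * n * n <= t:
                count += 1
    return count

t = 2000
count = count_points(2, 1, 3, t)
predicted = pi * t / sqrt(5)            # Omega_2 * t^(2/2) / det^(1/2)
ratio = count / predicted
```

For this sample form the ratio is already close to 1 at $t = 2000$, which is the kind of agreement the asymptotic statement describes.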
7. Upper Bound on the Rank
Although Proposition 4.18 does give a limit formula for the rank of
$E(\mathbb{Q})$, the formula is not very practical as a computational tool, even
for getting an idea of what the rank is. In this section we shall take
a different approach to estimating the rank, namely by sharpening the
bound on the order of $E(\mathbb{Q})/2E(\mathbb{Q})$.
For simplicity we return to the situation of §3, where the elliptic curve
$E$ is given by (4.17) and the roots $\alpha, \beta, \gamma$ of the cubic polynomial are
integers. We shall obtain an upper bound for the rank that is sharp in a
number of cases. The same principle applies in the more general setting
of §4 (where $\alpha, \beta, \gamma$ are not necessarily integers), but the upper bound
is inclined to be too big.
In the situation of §3, let us notice an easy upper bound for the rank $r$.
Proposition 4.8 shows that the order of $E(\mathbb{Q})/2E(\mathbb{Q})$ is bounded above
by 2 to the power
$$s = 2 + 2\,\#\{\text{primes } p \mid p \text{ divides } d\},$$
$d$ being the discriminant. On the other hand, the torsion subgroup $F$
of $E(\mathbb{Q})$ contains $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ as a subgroup and, because of the structure
of $E(\mathbb{R})$, contains no more than two cyclic summands of even order. It
is thus the direct sum of a group of odd order and two cyclic groups of
even order. Consequently $F$ contributes $2^2$ to the order of $E(\mathbb{Q})/2E(\mathbb{Q})$.
Meanwhile the free abelian part $\mathbb{Z}^r$ of $E(\mathbb{Q})$ contributes $2^r$ to the order
of $E(\mathbb{Q})/2E(\mathbb{Q})$. Thus
$$r + 2 \le s = 2 + 2\,\#\{\text{primes } p \mid p \text{ divides } d\},$$
and our easy upper bound is
$$r \le 2\,\#\{\text{primes } p \mid p \text{ divides } d\}. \tag{4.60}$$
Our objective is to sharpen (4.60). Let us call
$p$ good if $p \nmid d$
$p$ fairly bad if $p$ divides exactly one of $\alpha - \beta$, $\beta - \gamma$, $\alpha - \gamma$
$p$ very bad if $p$ divides all three of $\alpha - \beta$, $\beta - \gamma$, $\alpha - \gamma$.
These notions are related to the discussion in §III.5, but we omit the
details. Let
$n_1$ = number of fairly bad primes
$n_2$ = number of very bad primes
Our first improvement of (4.60) is contained in the following proposition.
Proposition 4.19. Let $E$ be the elliptic curve (4.17) over $\mathbb{Q}$ with
integer roots $\alpha, \beta, \gamma$, and let $r$ be the rank of $E(\mathbb{Q})$. Then
$$r \le n_1 + 2n_2 - 1. \tag{4.61}$$
Proof. We go over the proof of Proposition 4.8 in order to reduce
the image of
$$E(\mathbb{Q})/2E(\mathbb{Q}) \longrightarrow \sum_{\pm,\ p \mid d} \oplus\,(\mathbb{Z}_2 \oplus \mathbb{Z}_2) \tag{4.62}$$
while keeping the mapping one-one. We shall show that the image in
the $\pm$ coordinate lies in a subgroup $\mathbb{Z}_2$ of $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ and the same is true
of the $p$th coordinate if $p$ is fairly bad. Then the same analysis that gave
(4.60) will now give (4.61).
First consider the $\pm$ coordinate. The roots $\alpha, \beta, \gamma$ are integers and
can be ordered. Let us say $\alpha < \beta < \gamma$, so that
$$x - \alpha > x - \beta > x - \gamma. \tag{4.63}$$
If $x \notin \{\alpha, \beta, \gamma\}$, the possible signs in (4.63) are apparently $+++$,
$++-$, $+--$, and $---$. For a point $(x, y)$ in $E(\mathbb{Q})$, the product
$(x - \alpha)(x - \beta)(x - \gamma)$ equals a square (namely $y^2$), and thus only $+++$
and $+--$ can occur. Therefore the $\pm$ coordinate of the image of $\varphi_\alpha$ in
(4.20) and (4.25) is 0, and the first factor of $\mathbb{Z}_2$ can be dropped. [Note
that if (4.25) used $\varphi_\alpha \times \varphi_\gamma$ or $\varphi_\beta \times \varphi_\gamma$, then the image in the $\pm$
coordinate would still be in a $\mathbb{Z}_2$ subgroup of $\mathbb{Z}_2 \oplus \mathbb{Z}_2$, namely $0 \oplus \mathbb{Z}_2$ and
$\mathrm{diag}(\mathbb{Z}_2)$ in the respective cases.]
Actually the above argument applies only when $x \notin \{\alpha, \beta, \gamma\}$. When
$x = \alpha$, (4.19) shows that the $\pm$ component of $\varphi_\alpha(x)$ is obtained from
$\mathrm{sgn}[(\alpha - \beta)(\alpha - \gamma)] = +1$ and hence is trivial. When $x$ is $\beta$ or $\gamma$, then
the $\pm$ component of $\varphi_\alpha(x)$ is obtained from $\mathrm{sgn}(x - \alpha) = +1$ and hence
is trivial.
Now consider the $p$ coordinate when $p$ is fairly bad. First let us assume
that $p \mid (\alpha - \beta)$. Let $(x, y)$ be a point other than $\infty$ on $E(\mathbb{Q})$, and suppose
temporarily that $x \notin \{\alpha, \beta, \gamma\}$. Define integers $a, b, c$ by
$$p^a \,\|\, (x - \alpha), \qquad p^b \,\|\, (x - \beta), \qquad p^c \,\|\, (x - \gamma).$$
In (4.26) we saw that $a + b + c \equiv 0 \bmod 2$. Also we saw that if any of
$a, b, c$ is $< 0$, then $a \equiv b \equiv c \equiv 0 \bmod 2$, so that the image of $\varphi_\alpha \times \varphi_\beta$
in the $p$th coordinate is $(a \bmod 2,\ b \bmod 2) = (0, 0)$.
Suppose $a > 0$, so that $p \mid (x - \alpha)$ in the sense that the numerator of
$x - \alpha$ has $p$ as a factor. Since $p \nmid (\alpha - \gamma)$, we have $p \nmid (x - \gamma)$. Thus $c = 0$
and $a + b \equiv 0 \bmod 2$, and the image of $\varphi_\alpha \times \varphi_\beta$ in the $p$th coordinate is
contained in $\mathrm{diag}(\mathbb{Z}_2)$. Suppose instead that $b > 0$, so that $p \mid (x - \beta)$.
Since $p \nmid (\beta - \gamma)$, we have $p \nmid (x - \gamma)$. Thus $c = 0$, $a + b \equiv 0 \bmod 2$,
and the image is in $\mathrm{diag}(\mathbb{Z}_2)$. Finally suppose instead that $c > 0$, so that
$p \mid (x - \gamma)$. Since $p \nmid (\gamma - \alpha)$ and $p \nmid (\gamma - \beta)$, we have $p \nmid (x - \alpha)$ and
$p \nmid (x - \beta)$. Thus $a = b = 0$, and the image 0 is in $\mathrm{diag}(\mathbb{Z}_2)$.
Exceptional cases occur when $x \in \{\alpha, \beta, \gamma\}$. If $x = \alpha$, then $\varphi_\alpha(x) =
(\alpha - \beta)(\alpha - \gamma)$ and $\varphi_\beta(x) = \alpha - \beta$. Since $\varphi_\alpha(x)/\varphi_\beta(x)$ is not divisible
by $p$, the image of $\varphi_\alpha(x) \times \varphi_\beta(x)$ in the $p$th coordinate is again
contained in $\mathrm{diag}(\mathbb{Z}_2)$. If $x = \beta$ instead, then $\varphi_\alpha(x) = \beta - \alpha$ and $\varphi_\beta(x) =
(\beta - \alpha)(\beta - \gamma)$. Since $\varphi_\beta(x)/\varphi_\alpha(x)$ is not divisible by $p$, the image of
$\varphi_\alpha(x) \times \varphi_\beta(x)$ in the $p$th coordinate is contained in $\mathrm{diag}(\mathbb{Z}_2)$. If $x = \gamma$
instead, then $\varphi_\alpha(x) = (\gamma - \alpha)$ and $\varphi_\beta(x) = (\gamma - \beta)$ are not divisible by
$p$. So $a = b = 0$, and the image 0 is in $\mathrm{diag}(\mathbb{Z}_2)$.
We have been assuming that $p \mid (\alpha - \beta)$. If $p \mid (\beta - \gamma)$ instead, the
$p$th coordinate of the image of $\varphi_\alpha \times \varphi_\beta$ is in $0 \oplus \mathbb{Z}_2$, while if $p \mid (\alpha - \gamma)$,
the $p$th coordinate of the image of $\varphi_\alpha \times \varphi_\beta$ is in $\mathbb{Z}_2 \oplus 0$. In each case
the image is confined to a $\mathbb{Z}_2$ subgroup of $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. This completes the
proof.
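The bound (4.61) is mechanical to evaluate once the roots are known. The sketch below is illustrative; the function names are hypothetical, and the classification follows the definitions of fairly bad and very bad primes given before Proposition 4.19 (a prime dividing two of the differences necessarily divides the third, so only the counts 1 and 3 arise).

```python
def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n = abs(n)
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def rank_bound(alpha, beta, gamma):
    """Return (n1, n2, n1 + 2*n2 - 1) for distinct integer roots alpha, beta, gamma."""
    diffs = [alpha - beta, beta - gamma, alpha - gamma]
    primes = set().union(*(prime_factors(d) for d in diffs))
    n1 = n2 = 0
    for p in primes:
        hits = sum(1 for d in diffs if d % p == 0)
        if hits == 3:
            n2 += 1      # very bad: p divides all three differences
        else:
            n1 += 1      # fairly bad: p divides exactly one difference
    return n1, n2, n1 + 2 * n2 - 1
```

For example, `rank_bound(-1, 0, 1)` returns `(1, 0, 0)` and `rank_bound(-2, 0, 2)` returns `(0, 1, 1)`.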
Once again let us return to congruent numbers. The integer $n$ is
congruent if $n$ is the area of a rational right triangle. Let us borrow
the following lemma from Theorem 5.2.
Lemma 4.20. If $n$ is a square-free integer and $E$ is the elliptic curve
$$y^2 = x^3 - n^2x,$$
then the torsion subgroup of $E(\mathbb{Q})$ is $\mathbb{Z}_2 \oplus \mathbb{Z}_2$.
Proposition 4.21. A square-free integer $n$ fails to be congruent if
and only if the elliptic curve
$$E : y^2 = x^3 - n^2x$$
has the property that $E(\mathbb{Q})$ has rank 0.
Proof. If $E(\mathbb{Q})$ has rank 0, it is a torsion group, and Lemma 4.20
shows it to be $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. By (1) $\Rightarrow$ (3) in Proposition 3.1, $n$ is not
congruent. Conversely if $n$ is not congruent, Corollary 4.3 shows that $E$
has no $\mathbb{Q}$ rational point other than $(-n, 0)$, $(0, 0)$, $(n, 0)$ and $\infty$. Hence
$E(\mathbb{Q})$ has rank 0.
Corollary 4.22 (Fermat). $n = 1$ is not a congruent number.
Proof. The elliptic curve $E$ with $y^2 = x^3 - x$ is of the form considered
in Proposition 4.19, with $\{\alpha, \beta, \gamma\} = \{-1, 0, 1\}$. All primes are good
except $p = 2$, which is fairly bad. Thus $n_1 = 1$ and $n_2 = 0$. By
Proposition 4.19, $E(\mathbb{Q})$ has rank 0. By Proposition 4.21, $n = 1$ is not
congruent.
We already saw in Corollary 4.4 that $n = 2$ is not a congruent number.
If we attempt to reprove this result from Propositions 4.19 and 4.21, we
run into a problem. The relevant elliptic curve is $y^2 = x^3 - 4x$, with
$\{\alpha, \beta, \gamma\} = \{-2, 0, 2\}$. All primes are good except $p = 2$, which is very
bad. Thus $n_1 = 0$ and $n_2 = 1$, and Proposition 4.19 gives us the estimate
$r \le 1$ for the rank. We do not immediately get $r = 0$.
However, an even more careful analysis of the proof of Proposition
4.8 will give the sharper estimate. We need to study the relationship
between the $\pm$ coordinate of $\varphi_\alpha \times \varphi_\beta$ and the $p = 2$ coordinate. In Table
4.1 below, we list the effect of $\varphi = \varphi_\alpha, \varphi_\beta, \varphi_\gamma$ on an $x(P)$ other than
$\alpha, \beta, \gamma$. Each column for $\pm$ gives a conceivable configuration of signs,
and each column for $p = 2$ gives a conceivable set of residues modulo 2
for the power of 2 in $\varphi(x)$. We keep in mind that $(x + 2)x(x - 2)$ has to
be a square.

Map                 Image mod $\mathbb{Q}^{\times 2}$     $\pm$       2
$\varphi_\alpha$    $x + 2$                               +  +        0  0  1  1
$\varphi_\beta$     $x$                                   +  -        0  1  0  1
$\varphi_\gamma$    $x - 2$                               +  -        0  1  1  0

Table 4.1. Image of $\varphi$ for $y^2 = x^3 - 4x$ with $x \notin \{-2, 0, 2\}$
At all other primes the image of each $\varphi$ is a square. Conceivably all
8 combinations of a column for $\pm$ and a column for $p = 2$ could occur;
actually we shall see that there is a restriction.
The idea is to adjust $P$ by an element of order 2 to make the $p = 2$
column trivial. For $P = (\alpha, 0)$, $(\beta, 0)$, $(\gamma, 0)$ the corresponding information
is assembled in Table 4.2.

Map                 Image of (-2,0)  $\pm$  2     Image of (0,0)  $\pm$  2     Image of (2,0)  $\pm$  2
$\varphi_\alpha$           8           +    1           2           +    1          4           +    0
$\varphi_\beta$           -2           -    1          -4           -    0          2           +    1
$\varphi_\gamma$          -4           -    0          -2           -    1          8           +    1

Table 4.2. Image of $\varphi$ for $y^2 = x^3 - 4x$ with $x \in \{-2, 0, 2\}$
Since all nontrivial columns appear for $p = 2$ and since each $\varphi$ is a
homomorphism, we can adjust any point $P$ under study by adding an
element of $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ so that $\varphi_\alpha$, $\varphi_\beta$, and $\varphi_\gamma$ all map the adjusted point
into $(0, 0)$ in the $p = 2$ coordinate. For the adjusted point, which we
still call $P = (x(P), y(P))$, the image of $\varphi$ is as in Table 4.3.

Map                 Image mod $\mathbb{Q}^{\times 2}$     $\pm$      2      Total
$\varphi_\alpha$    $x + 2$                               +  +       0      $\square$    $\square$
$\varphi_\beta$     $x$                                   +  -       0      $\square$    $-\square$
$\varphi_\gamma$    $x - 2$                               +  -       0      $\square$    $-\square$

Table 4.3. Adjusted image of $\varphi$ for $y^2 = x^3 - 4x$
There are only two possibilities, one of two columns from $\pm$ and the
column for $p = 2$.
We shall show that the second possibility, namely
$$x + 2 = \square, \qquad x = -\square, \qquad x - 2 = -\square, \tag{4.64}$$
does not occur. Then it will follow that $r = 0$. Arguing by contradiction,
suppose (4.64) occurs. Subtracting the first two equations gives
$$2 = \square + \square.$$
If these squares have even denominators, then the power of two is the
same for each and is even; also the numerators must be odd. Clearing
denominators, we obtain
$$0 \equiv \square + \square \bmod 8$$
with both squares odd integers, and this is a contradiction. Thus the
first two squares in (4.64) have odd denominators. Subtracting the last
two equations in (4.64) gives
$$2 = -\square + \square,$$
and the first of these squares has an odd denominator. Therefore so does
the second. Clearing denominators, we obtain
$$2m^2 \equiv -\square + \square \bmod 8 \tag{4.65}$$
with $m$ an odd integer and both squares equal to integers. Then $m^2 \equiv 1$
mod 8, and (4.65) is impossible. Thus (4.64) cannot occur, and $y^2 =
x^3 - 4x$ has rank 0. This argument has given a different proof that $n = 2$
is not congruent.
A similar technique gives refined information about the rank of $E(\mathbb{Q})$
for $y^2 = x^3 - p^2x$, where $p$ is an odd prime. The result is as follows.
Proposition 4.23. Let $p$ be an odd prime, and let $E$ be the elliptic
curve
$$y^2 = x^3 - p^2x.$$
Then the rank $r$ of $E(\mathbb{Q})$ satisfies
$r \le 2$ if $p \equiv 1 \bmod 8$
$r = 0$ if $p \equiv 3 \bmod 8$
$r \le 1$ if $p \equiv 5$ or $7 \bmod 8$.
Consequently any $p \equiv 3 \bmod 8$ is not a congruent number.
REMARK. The proof will use some facts about quadratic residues
from elementary number theory.
Proof. We argue as above, taking $\{\alpha, \beta, \gamma\} = \{-p, 0, p\}$. All primes
are good except 2 and $p$; 2 is fairly bad, and $p$ is very bad. The
information analogous to what is in Table 4.1 is assembled as Table 4.4.

Map                 Image mod $\mathbb{Q}^{\times 2}$     $\pm$      2        $p$
$\varphi_\alpha$    $x + p$                               +  +       0  1     0  0  1  1
$\varphi_\beta$     $x$                                   +  -       0  0     0  1  0  1
$\varphi_\gamma$    $x - p$                               +  -       0  1     0  1  1  0

Table 4.4. Image of $\varphi$ for $y^2 = x^3 - p^2x$ with $x \notin \{-p, 0, p\}$

If we make a table like Table 4.2 that lists what happens for the torsion
points $(-p, 0)$, $(0, 0)$, and $(p, 0)$, we find that the torsion points account
for all nontrivial columns for the prime $p$ in Table 4.4. Since each $\varphi$ is a
homomorphism, we can adjust any point $P$ under study so that $\varphi_\alpha$, $\varphi_\beta$,
and $\varphi_\gamma$ all map the adjusted point into $(0, 0)$ in the $p$ coordinate. For
the adjusted point, which we still call $P = (x(P), y(P))$, Table 4.5 shows
the four possibilities for the combined contribution from a column for
$\pm$ and a column for 2. The point is that certain columns listed under
"Total," depending on $p$, cannot occur. If only two columns can occur,
then $r \le 1$. If only one column can occur, then $r = 0$.

Map                 Image mod $\mathbb{Q}^{\times 2}$     Total
$\varphi_\alpha$    $x + p$                               $\square$    $\square$     $2\square$    $2\square$
$\varphi_\beta$     $x$                                   $\square$    $-\square$    $\square$     $-\square$
$\varphi_\gamma$    $x - p$                               $\square$    $-\square$    $2\square$    $-2\square$

Table 4.5. Adjusted image of $\varphi$ for $y^2 = x^3 - p^2x$
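Suppose the column $\begin{pmatrix} \square \\ -\square \\ -\square \end{pmatrix}$ occurs. Subtracting the expressions for $x + p$ and
$x - p$, we obtain $2p = \square + \square$. If $p$ occurs in the numerator of one
square, then $p^2$ occurs in the numerators of both squares and $p^2 \mid 2p$
gives a contradiction. Thus the numerators of the squares are prime to
$p$. Clearing fractions and reducing modulo $p$, we obtain $0 \equiv \square + \square$
mod $p$, with both squares prime to $p$. Then $-1$ is a square modulo $p$,
and it follows that $p \equiv 1 \bmod 4$. Thus
$$\begin{pmatrix} \square \\ -\square \\ -\square \end{pmatrix} \text{ occurring} \implies p \equiv 1 \text{ or } 5 \bmod 8. \tag{4.66}$$
Suppose the column $\begin{pmatrix} 2\square \\ \square \\ 2\square \end{pmatrix}$ occurs. The first two entries give $p = 2\square - \square$.
Arguing as in the previous paragraph, we are led to $0 \equiv 2\square - \square$
mod $p$, with both squares prime to $p$. Then 2 is a square modulo $p$, and
it follows that $p \equiv \pm 1 \bmod 8$. Thus
$$\begin{pmatrix} 2\square \\ \square \\ 2\square \end{pmatrix} \text{ occurring} \implies p \equiv 1 \text{ or } 7 \bmod 8. \tag{4.67}$$
Suppose the column $\begin{pmatrix} 2\square \\ -\square \\ -2\square \end{pmatrix}$ occurs. The second and third entries give $p =
2\square - \square$, and the previous paragraph again shows that $p \equiv 1$ or 7
mod 8. The first and second entries give $p = 2\square + \square$. Arguing as
above, we conclude that $-2$ is a square modulo $p$, and it follows that
$p \equiv 1$ or 3 mod 8. Thus
$$\begin{pmatrix} 2\square \\ -\square \\ -2\square \end{pmatrix} \text{ occurring} \implies p \equiv 1 \bmod 8. \tag{4.68}$$
In view of how the number of occurring columns affects the rank, the
implications (4.66), (4.67), and (4.68) prove the proposition.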
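The column count in this proof can be tabulated by machine. The sketch below is a hedged illustration: it tests the quadratic-residue conditions from (4.66), (4.67), and (4.68) with Euler's criterion and converts the number of columns that survive into the resulting bound on $r$; the function names are hypothetical.

```python
def is_square_mod(a, p):
    """Euler's criterion for an odd prime p and an integer a not divisible by p."""
    return pow(a % p, (p - 1) // 2, p) == 1

def rank_bound_congruent_prime(p):
    """Upper bound on the rank of y^2 = x^3 - p^2 x for an odd prime p."""
    columns = 1                                   # the trivial column always occurs
    if is_square_mod(-1, p):
        columns += 1                              # condition from (4.66)
    if is_square_mod(2, p):
        columns += 1                              # condition from (4.67)
    if is_square_mod(2, p) and is_square_mod(-2, p):
        columns += 1                              # conditions from (4.68)
    # The surviving count is 1, 2, or 4; the bound is its base-2 logarithm.
    return columns.bit_length() - 1
```

For instance, `rank_bound_congruent_prime(3)` is 0 and `rank_bound_congruent_prime(17)` is 2, matching the cases $p \equiv 3$ and $p \equiv 1 \bmod 8$.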
8. Construction of Points in E(Q)
The refinements to the method of descent in §7 are not good enough
to produce members of $E(\mathbb{Q})$ with certainty, but they are excellent at
cutting down the possibilities enough to make a computer search
manageable. We shall consider two examples in this section. The first
example will go in the converse direction to Proposition 4.23 by showing
that all small primes $p \equiv 5 \bmod 8$ are congruent numbers (i.e., $r = 1$ in
Proposition 4.23). The second example will solve Fermat's problem for
Mersenne that was introduced in Subsection E of §III.1.
A. Congruent numbers
For the first example, fix a prime $p \equiv 5 \bmod 8$. According to (4.66),
if $p$ is congruent, then we can write
$$\begin{pmatrix} x + p \\ x \\ x - p \end{pmatrix} = \begin{pmatrix} \square \\ -\square \\ -\square \end{pmatrix} \tag{4.69}$$
for suitable squares in $\mathbb{Q}$ and for some $x = x(P)$. To quantify matters
write
$$\begin{pmatrix} x + p \\ x \\ x - p \end{pmatrix} = \begin{pmatrix} a^2 \\ -b^2 \\ -c^2 \end{pmatrix}.$$
Then we must have $a^2 - c^2 = -2b^2$, which we write as
$$a^2 = c^2 - 2b^2. \tag{4.70}$$
We parametrize the solutions by the method of Diophantus in Chapter
I: We change the equation from projective to affine form by writing it
as
$$1 = u^2 - 2v^2. \tag{4.71}$$
Then $(u, v) = (1, 0)$ gives one solution. A line through $(1, 0)$ is $u = tv + 1$
for any choice of $t$ in $\mathbb{Q}$. Substituting, we obtain
$$1 = (tv + 1)^2 - 2v^2,$$
which is solved by $v = 0$ and $v = \dfrac{2t}{2 - t^2}$. Thus our parametrization of
solutions of (4.71) is
$$(u, v) = \Big(\frac{2 + t^2}{2 - t^2},\ \frac{2t}{2 - t^2}\Big).$$
Write $t = r/s$ in lowest terms. Then
$$\frac{1}{a}(a, b, c) = (1, v, u) = \frac{1}{2s^2 - r^2}\,(2s^2 - r^2,\ 2rs,\ 2s^2 + r^2),$$
and
$$\begin{aligned}
a &= (2s^2 - r^2)\lambda \\
b &= (2rs)\lambda \\
c &= (2s^2 + r^2)\lambda
\end{aligned} \tag{4.72}$$
for some $\lambda \in \mathbb{Q}$.
We shall show that $\lambda$ is of the form $1/n$. In fact, say $\lambda = m/n$ in lowest
terms. Then (4.72) shows that $m$ divides the integers $na$ and $nb$. Hence
$m$ divides $a$ and $b$. Referring to (4.69), we see that $m^2$ divides $a^2 + b^2 = p$.
Thus $m = \pm 1$ and we may take $m = 1$.
Since $\lambda = 1/n$, the equation
$$p = a^2 + b^2 = (r^4 + 4s^4)\lambda^2$$
shows that
$$r^4 + 4s^4 = pn^2. \tag{4.73}$$
At this stage we have thinned out the possibilities considerably. We
look for a solution $(r, s, n)$ of (4.73), and it determines $(a, b, c)$ in (4.72),
since $\lambda = 1/n$. Then $(a, b, c)$ determines an $x$ of the correct form so that
(4.69) holds, and $x$ is of the form $x(P)$ for a point $P$ in $E(\mathbb{Q})$. By
letting $r$ and $s$ range from 1 to 50 and seeing whether the square root
of $p^{-1}(r^4 + 4s^4)$ is very close to an integer, we are led to all the entries
in Table 4.6 except the one for $p = 53$. Since $r$ and $s$ (and ultimately
$x(P)$) exist for each $p$ in the table, all these numbers $p$ are congruent.
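The search just described fits in a few lines. This is a sketch rather than the author's program: it uses exact integer square roots in place of the "very close to an integer" test, recovers $x(P) = -b^2$ with $b = 2rs/n$ from (4.69) and (4.72), and checks that the resulting $x$ lies on $y^2 = x^3 - p^2x$; all function names are hypothetical.

```python
from fractions import Fraction
from math import gcd, isqrt

def search(p, limit=50):
    """First (r, s, n) with r, s in 1..limit, gcd(r, s) = 1, and r^4 + 4 s^4 = p n^2."""
    for r in range(1, limit + 1):
        for s in range(1, limit + 1):
            if gcd(r, s) != 1:
                continue
            total = r ** 4 + 4 * s ** 4
            if total % p:
                continue
            n = isqrt(total // p)
            if n * n * p == total:
                return r, s, n
    return None

def x_coordinate(r, s, n):
    """x(P) = -b^2 with b = 2rs/n, as in (4.69) and (4.72) with lambda = 1/n."""
    b = Fraction(2 * r * s, n)
    return -b * b

def on_curve(x, p):
    """True if x^3 - p^2 x is the square of a rational number."""
    y2 = x ** 3 - p * p * x
    num, den = y2.numerator, y2.denominator   # Fraction keeps lowest terms
    return num >= 0 and isqrt(num) ** 2 == num and isqrt(den) ** 2 == den
```

For example, `search(13)` returns `(1, 3, 5)`, and `x_coordinate(1, 3, 5)` is $-36/25$, which passes `on_curve(..., 13)`.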

  p      r      s      n        x(P)
  5      1      1      1        -4
  13     1      3      5        -36/25
  29     7      5      13       -4900/169
  37     1      21     145      -1764/21025
  53     286    119    11890    -1158313156/35343025
  61     41     39     445      -10227204/198025
  109    6      7      10       -1764/25
  149    14     17     50       -56644/625

Table 4.6. Some primes $p \equiv 5 \bmod 8$ that are congruent numbers
To discover $(r, s, n)$ for $p = 53$ in the table, we refine the argument
still further. We are trying to solve
$$(r^2)^2 + 4(s^2)^2 = 53n^2.$$
Consider the affine curve
$$j^2 + 4k^2 = 53$$
over $\mathbb{Q}$. One solution is $(j, k) = (7, 1)$, and the method of Diophantus
in Chapter I will allow us to parametrize all solutions. A line through
$(7, 1)$ is $(k - 1) = t(j - 7)$ for any choice of $t$. Substituting for $k$, we
obtain a formula for $j$ in terms of $t$. Then we can recover $k$. The result
is
$$(j, k) = \Big(\frac{28t^2 - 8t - 7}{4t^2 + 1},\ \frac{-4t^2 - 14t + 1}{4t^2 + 1}\Big),$$
which we can reparametrize (replacing $t$ by $t/2$) as
$$(j, k) = \Big(\frac{7t^2 - 4t - 7}{t^2 + 1},\ \frac{-t^2 - 7t + 1}{t^2 + 1}\Big).$$
Putting $t = d/e$ in lowest terms gives
$$(r^2, s^2, n) = \mu\,(7d^2 - 4de - 7e^2,\ -d^2 - 7de + e^2,\ d^2 + e^2). \tag{4.74}$$
From the third entry, we see that $\mu > 0$. The numerator of $\mu$ can be
taken to be 1 since $\mathrm{GCD}(r, s) = 1$. If the denominator is $h$, then $h$
divides
$$7d^2 - 4de - 7e^2 + 7(-d^2 - 7de + e^2) = -53de \quad\text{and}\quad d^2 + e^2.$$
Any prime factor of $h$ must divide $53de$. If it divides $d$ or $e$, then it
divides $d^2 + e^2$ and hence both $d$ and $e$, contradiction. Thus $\mu = \frac{1}{53}$ or
$\mu = 1$.
Handling $\mu = \frac{1}{53}$ does not look very easy. So let us look for a solution
with $\mu = 1$. Our equality is then
$$(r^2, s^2, n) = (7d^2 - 4de - 7e^2,\ -d^2 - 7de + e^2,\ d^2 + e^2). \tag{4.75}$$
We analyze
$$-d^2 - 7de + e^2 = \square \quad (= s^2)$$
by the method of Diophantus. The affine equation is
$$-d'^2 - 7d'e' + e'^2 = 1,$$
and $(d', e') = (0, 1)$ is a solution. The line $e' = d't + 1$ leads to
$$(d', e') = \Big(\frac{2t - 7}{-t^2 + 7t + 1},\ \frac{t^2 + 1}{-t^2 + 7t + 1}\Big).$$
If $t = \xi/\eta$ in lowest terms, then
$$(d, e, s) = \nu\,(2\xi\eta - 7\eta^2,\ \xi^2 + \eta^2,\ -\xi^2 + 7\xi\eta + \eta^2). \tag{4.76}$$
In the same way as for $\mu$ in (4.74), we find $\nu = \frac{1}{53}$ or $\nu = 1$.
We look for a solution with $\nu = 1$. Using a short computer search
with small values of $\xi$ and $\eta$ and with $d$ and $e$ determined by (4.76), we
check whether the first entry $7d^2 - 4de - 7e^2$ of (4.75) is a square, so
that we can take it as $r^2$. For $\xi = 10$ and $\eta = 3$, we do get a square, and
the result is what is listed in Table 4.6.
The reader may wish to try such an analysis for $p = 101$, which is the
only $p \equiv 5 \bmod 8$ that is $< 150$ and is missing from Table 4.6.
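The final step for $p = 53$ can be replayed directly from (4.75) and (4.76) with $\nu = 1$. The sketch below is an illustration with hypothetical names. One caveat that the code makes visible: for $\xi = 10$, $\eta = 3$ the first entry of (4.75) evaluates to $-286^2$, so the sketch compares absolute values; the sign of $j$ is immaterial in $j^2 + 4k^2 = 53$, and the triple still satisfies (4.73).

```python
from math import isqrt

def descend_53(xi, eta):
    """Evaluate (4.76) with nu = 1 and then the three entries of (4.75)."""
    d = 2 * xi * eta - 7 * eta * eta
    e = xi * xi + eta * eta
    s = -xi * xi + 7 * xi * eta + eta * eta
    first = 7 * d * d - 4 * d * e - 7 * e * e     # candidate for r^2, up to sign
    second = -d * d - 7 * d * e + e * e           # equals s^2 by construction
    n = d * d + e * e
    return d, e, s, first, second, n

d, e, s, first, second, n = descend_53(10, 3)
r = isqrt(abs(first))                              # 286 when |first| is a perfect square
```

With these inputs the values are $d = -3$, $e = 109$, $s = 119$, $n = 11890$, and $r = 286$, the row for $p = 53$ in Table 4.6.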
B. Fermat's problem to Mersenne
We seek an integer-valued relatively prime Pythagorean triple
$(X, Y, Z)$ such that the hypotenuse $Z$ is a square and the sum of the
legs is a square. In other words we want
$$X^2 + Y^2 = Z^2, \qquad Z = b^2, \qquad X + Y = a^2.$$
With $e = X - Y$, we obtain $X$ and $Y$ as $\tfrac12(a^2 \pm e)$, and (3.17) showed
the problem comes down to solving
$$2b^4 - a^4 = e^2. \tag{4.77}$$
A solution in the style of Fermat proceeds as follows. We can work a
little sloppily, apparently discarding possibilities, because all we need is
one solution. From $X^2 + Y^2 = Z^2$, we write
$$X = m^2 - n^2, \qquad Y = 2mn, \qquad Z = m^2 + n^2 \tag{4.78}$$
by Theorem 1.2. (Actually this form of a solution assumes that $X$ is
odd; but if it works, it works.) From $b^2 = m^2 + n^2$ we have
$$m = r^2 - s^2, \qquad n = 2rs, \qquad b = r^2 + s^2. \tag{4.79}$$
Meanwhile, the Diophantus method of Chapter I leads from
$$a^2 = X + Y = (m + n)^2 - 2n^2$$
to
$$m + n = t^2 + 2u^2, \qquad n = 2tu, \qquad a = t^2 - 2u^2,$$
hence to
$$m = t^2 - 2tu + 2u^2, \qquad n = 2tu, \qquad a = t^2 - 2u^2. \tag{4.80}$$
Equating the two expressions for $n$ in (4.79) and (4.80), we obtain $rs =
tu$. Thus $\dfrac{r}{t} = \dfrac{u}{s}$, and we can write this as $\dfrac{d}{c}$ in lowest terms. Then
$$r = kd, \qquad t = kc, \qquad u = ld, \qquad s = lc. \tag{4.81}$$
Equating the two expressions for $m$ in (4.79) and (4.80), we obtain
$$r^2 - s^2 = m = t^2 - 2tu + 2u^2.$$
Substitution from (4.81) gives
$$k^2d^2 - l^2c^2 = k^2c^2 - 2klcd + 2l^2d^2$$
and hence
$$0 = (c^2 + 2d^2)\Big(\frac{l}{k}\Big)^2 - 2cd\Big(\frac{l}{k}\Big) + (c^2 - d^2). \tag{4.82}$$
For $l/k$ to be in $\mathbb{Q}$, the discriminant must be a square in $\mathbb{Q}$. Thus
$$c^2d^2 - (c^2 + 2d^2)(c^2 - d^2) = \square,$$
which we rewrite as
$$2d^4 - c^4 = f^2. \tag{4.83}$$
This equation is of the same form as (4.77), but the integers are usually
much smaller here. At first it looks as if the method of descent is leading
us to a proof that there is no solution. But there is a bottom nontrivial
solution with $(c, d) = (1, 1)$. Substituting into (4.82), we obtain $l/k =
0$ or $2/3$. The solution $l/k = 0$ leads back to $(a, b) = (1, 1)$, which is not
interesting. From $l/k = 2/3$, we have $l = 2$, $k = 3$. Going backwards, we
calculate

r = 3      m = 5      X = -119
t = 3      n = 12     Y = 120
u = 2      a = 1      Z = 169
s = 2      b = 13     e = -239

We must discard the result of this trial since $X$ is negative.
But we can continue. The $(a, b)$ from this trial becomes the $(c, d)$ of
our next trial. Putting $(c, d, f) = (1, 13, -239)$, we have
$$\frac{l}{k} = \frac{cd \pm f}{c^2 + 2d^2} = \frac{13 \mp 239}{1 + 2 \cdot 169} = -\frac{2}{3} \text{ and } \frac{84}{113}.$$
From $l/k = -2/3$, we have $l = -2$, $k = 3$. Going backwards, we calculate

r = 39     m = 1517     X = 2276953
t = 3      n = -156     Y = -473304
u = -26    a = -1343    Z = 2325625
s = -2     b = 1525     e = 2750257

This time we have a minus sign in $Y$ and discard the result.
We try instead $\frac{84}{113}$, using $l = 84$, $k = 113$. Then

r = 1469   m = 2150905   X = 4565486027761
t = 113    n = 246792    Y = 1061652293520
u = 1092   a = -2372159  Z = 4687298610289
s = 84     b = 2165017   e = 3503833734241

This is the solution given in (3.16).
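One step of this computation, from $(c, d, f)$ to $(a, b, e)$, can be written out as a short function. The sketch below is illustrative (the name `fermat_step` and the dictionary layout are not from the text); it takes the root $l/k = (cd - f)/(c^2 + 2d^2)$ of (4.82) and then applies (4.81), (4.79), (4.80), and (4.78) in that order.

```python
from fractions import Fraction

def fermat_step(c, d, f):
    """One step (c, d, f) -> (a, b, e), taking l/k = (c*d - f) / (c^2 + 2*d^2)."""
    ratio = Fraction(c * d - f, c * c + 2 * d * d)
    l, k = ratio.numerator, ratio.denominator          # l/k in lowest terms, k > 0
    r, t, u, s = k * d, k * c, l * d, l * c            # (4.81)
    m, n = r * r - s * s, 2 * r * s                    # (4.79)
    a, b = t * t - 2 * u * u, r * r + s * s            # (4.80) and (4.79)
    X, Y, Z = m * m - n * n, 2 * m * n, m * m + n * n  # (4.78)
    return {"a": a, "b": b, "e": X - Y, "X": X, "Y": Y, "Z": Z}

first = fermat_step(1, 1, -1)       # the trial with l/k = 2/3
second = fermat_step(1, 13, -239)   # the trial with l/k = 84/113
```

Here `first` reproduces $(a, b, e) = (1, 13, -239)$, and `second` reproduces the triple $(X, Y, Z)$ of the last trial, with $Z = b^2$ and $X + Y = a^2$.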
To analyze the solution technique, we work backwards, obtaining a
formula for $(a, b)$ in terms of $(c, d)$. Actually we had a choice of a sign
for $l/k$, and we may regard $(a, b, e)$ as depending on $(c, d, f)$. The
correspondence will be nicer if we choose the ambiguous sign in $l/k$ so that
$$\frac{l}{k} = \frac{cd - f}{c^2 + 2d^2}.$$
With this choice made, $(a, b, e)$ becomes a function of $(c, d, f)$.
Meanwhile the tuples $(a, b, e)$ and $(c, d, f)$, which give $\mathbb{Z}$ solutions of (3.17),
then give $\mathbb{Q}$ solutions of (3.18) and, by virtue of (3.19) and its inverse
(3.21), $\mathbb{Q}$ rational points on the elliptic curve $y^2 = x^3 + 8x$.
Table 4.7 shows what the effect of $(c, d, f) \to (a, b, e)$ is, in terms of
the group law for the elliptic curve. The table uses the abbreviations
$P_0 = (8, -24)$ and $T = (0, 0)$. The point $P_0$ generates an infinite cyclic
subgroup of $E(\mathbb{Q})$, and $T$ has order two.

c    d      f       Point        a           b          e                Point
1    1      1       O            1           1          1                O
1    1      -1      P0           1           13         -239             2P0
1    -1     1       P0 + T       1           13         -239             2P0
1    -1     -1      T            1           1          1                O
1    13     239     -P0          -1343       1525       2750257          -2P0
1    13     -239    2P0          -2372159    2165017    3503833734241    4P0
1    -13    239     2P0 + T      -2372159    2165017    3503833734241    4P0
1    -13    -239    -P0 + T      -1343       1525       2750257          -2P0

Table 4.7. Effect on $E : y^2 = x^3 + 8x$ of Fermat's descent for
$2b^4 - a^4 = e^2$

The table shows that Fermat's descent genuinely corresponds to the
passage $P \to \tfrac12P$ for the points in question. The reader may wish to
carry out the symbolic manipulations appropriate for general $P$.
9. Appendix on Algebraic Number Theory
The object of this section is to prove Theorem 4.10, which provides
a tool that was used in the proof of Mordell's Theorem. Theorem 4.10
requires a certain amount of algebraic number theory as preparation,
including three basic theorems. We shall state the necessary preliminary
results with most of the proofs omitted. References are given in the
chapter of "Notes."
We regard $\overline{\mathbb{Q}}$ as a subfield of $\mathbb{C}$, the subfield of elements that are roots
of polynomials $P(X)$ with $\mathbb{Z}$ coefficients. The members of $\overline{\mathbb{Q}}$ are called
algebraic numbers. A number field $K$ is a subfield of $\overline{\mathbb{Q}}$ that is
finite-dimensional over $\mathbb{Q}$.
An algebraic integer is an algebraic number that is the root of a
monic polynomial $P(X)$ over $\mathbb{Z}$. The set of such is denoted $\mathcal{O}$. If $\theta \in \overline{\mathbb{Q}}$
is a root of
$$a_nX^n + \dots + a_1X + a_0,$$
then $a_n\theta$ is a root of
$$X^n + a_{n-1}X^{n-1} + a_{n-2}a_nX^{n-2} + \dots + a_1a_n^{n-2}X + a_0a_n^{n-1}.$$
Hence
$$\theta \in \overline{\mathbb{Q}} \implies k\theta \in \mathcal{O} \text{ for some } k \in \mathbb{Z}. \tag{4.84}$$
One step in the proof of unique factorization for $\mathbb{Z}[X]$ shows that if a
monic polynomial in $\mathbb{Z}[X]$ has a monic factor in $\mathbb{Q}[X]$, then that factor
is actually in $\mathbb{Z}[X]$; consequently the minimal polynomial over $\mathbb{Q}$ of any
$\theta \in \mathcal{O}$ has integer coefficients.
The set O of algebraic integers is actually a ring. The standard proof
of this fact uses the theory of symmetric polynomials. A simpler proof
uses the following lemma, which is handy also for proving (4.88) below.
Lemma 4.24. Let V be the additive group generated by complex
numbers x_1, ..., x_l that are not all 0. If α is a complex number such
that αx ∈ V whenever x ∈ V, then α is an algebraic integer.
Let us write "left-by-α" for the operation of left multiplication by
α. For an example of a group V as in the lemma, let α ∈ O satisfy
α^n + a_{n-1} α^{n-1} + ... + a_0 = 0. Then left-by-α maps the Z span of
α^{n-1}, ..., α, 1 into itself. When there are two such elements in O and
we use the products of their powers to generate V, the result is the
following proposition.
Proposition 4.25. O is a ring.
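
To illustrate how Lemma 4.24 yields Proposition 4.25, here is a small sketch for the sum α = √2 + √3 of two algebraic integers. The group V is taken to be the Z span of 1, √2, √3, √6 (products of powers of the two elements), and left-by-α is written as an integer matrix on that basis. Its characteristic polynomial is monic with integer coefficients, so α is an algebraic integer. The choice of example and the helper names are assumptions made for illustration; the characteristic polynomial is computed by the Faddeev-LeVerrier recursion rather than by any method from the text.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(M):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - M) (Faddeev-LeVerrier)."""
    n = len(M)
    A = [[Fraction(v) for v in row] for row in M]
    coeffs = [Fraction(1)]
    N = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        N = mat_mul(A, N)
        for i in range(n):
            N[i][i] += coeffs[-1]          # N_k = A N_{k-1} + c I
        AN = mat_mul(A, N)
        coeffs.append(-sum(AN[i][i] for i in range(n)) / k)
    return coeffs

# Row j lists the coordinates of alpha * e_j in the basis
# e = (1, sqrt2, sqrt3, sqrt6), where alpha = sqrt2 + sqrt3.
LEFT_BY_ALPHA = [
    [0, 1, 1, 0],   # alpha * 1     = sqrt2 + sqrt3
    [2, 0, 0, 1],   # alpha * sqrt2 = 2 + sqrt6
    [3, 0, 0, 1],   # alpha * sqrt3 = 3 + sqrt6
    [0, 3, 2, 0],   # alpha * sqrt6 = 3 sqrt2 + 2 sqrt3
]
```

Here `char_poly(LEFT_BY_ALPHA)` gives the coefficients of X^4 - 10X^2 + 1, a monic polynomial over Z having α as a root.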
Fix a number field K. It follows from the proposition that O_K = O ∩ K
is a ring, the ring of algebraic integers in K. From (4.84) it follows
that K is the field of fractions of O_K. The fact that the only rational
roots of a monic polynomial in Z[X] are in Z translates into the equality
    O ∩ Q = Z.                                                      (4.85)
Let n = [K : Q]. If {α_1, ..., α_n} is a basis of K over Q and if α is in
K, we can write α α_i = Σ_j a_{ij} α_j with a_{ij} ∈ Q. We define
    N_{K/Q}(α) = det(a_{ij})      (norm of α)
    Tr_{K/Q}(α) = Tr(a_{ij})      (trace of α)
These are functions from K to Q. It is easy to check that
(1) N_{K/Q} and Tr_{K/Q} are independent of basis.
(2) N_{K/Q} is a multiplicative homomorphism from K^× to Q^×, and
Tr_{K/Q} is an additive homomorphism from K to Q.
(3) Tr_{K/Q}(1) = n ≠ 0.
There are exactly n distinct isomorphisms of K into Q̄ fixing Q, and
they are given as follows. The Theorem of the Primitive Element says
that K = Q(α) for some α ∈ K. Let
    P(X) = X^n + c_{n-1} X^{n-1} + ... + c_0
be the minimal polynomial of α over Q, and let α_1 = α, α_2, ..., α_n be
the (necessarily distinct) roots of P in Q̄. Then
    σ_j(c_{n-1} α^{n-1} + ... + c_1 α + c_0) = c_{n-1} α_j^{n-1} + ... + c_1 α_j + c_0.
If β is in K, the elements σ_j(β) (with σ_1(β) = β) are called the
conjugates of β in Q̄. They need not lie in K. However, each σ_j(β) has the
same minimal polynomial as β over Q. Consequently
    σ_j carries O_K into O.                                         (4.86)
To bring the conjugates of β into the theory, one uses the following
proposition.
Proposition 4.26. For β ∈ K let g be the minimal polynomial of β
over Q. Then (deg g) | n. Also left-by-β as a Q linear map from K to
K satisfies
    det(xI - left-by-β) = g(x)^r = ∏_{j=1}^{n} (x - σ_j(β)),
where r = n/(deg g).
Using Proposition 4.26, one can show that
    N_{K/Q}(β) = ∏_{j=1}^{n} σ_j(β)   and   Tr_{K/Q}(β) = Σ_{j=1}^{n} σ_j(β).
By (4.85) and (4.86), N_{K/Q} and Tr_{K/Q} carry O_K into Z.
A unit in O_K is an invertible element. The set of units is denoted
O_K^×. The first of the three basic theorems in beginning algebraic number
theory is the Dirichlet Unit Theorem, which gives the structure of O_K^×.
We shall give a weak form of the theorem as Theorem 4.29.
Lemma 4.27. The units in O_K are the members ε of O_K with
N_{K/Q}(ε) = ±1.
Proof. If ε is a unit, then N_{K/Q}(ε^{-1}) = N_{K/Q}(ε)^{-1} ∈ Z shows that
N_{K/Q}(ε) is an invertible integer, hence is ±1. Conversely we use (4.86),
remembering that σ_1 = 1. Then
    N_{K/Q}(ε) = ε σ_2(ε) ··· σ_n(ε)
shows that
    ε^{-1} = N_{K/Q}(ε)^{-1} σ_2(ε) ··· σ_n(ε).
If N_{K/Q}(ε) = ±1, then the right side is in O.
Lemma 4.28. The torsion subgroup of O_K^× consists of all Nth roots
of unity e^{2πik/N} for some N that is bounded in terms of n = [K : Q].
One proof of Lemma 4.28 uses properties of cyclotomic polynomials
and the formula for the Euler φ function.
The isomorphisms σ_j : K → Q̄ ⊂ C, 1 ≤ j ≤ n, are of two types:
(a) those carrying K into R. Say there are r_1 of these.
(b) those carrying K into C but not R. These come in pairs σ and σ̄,
where the bar refers to complex conjugation. Say there are 2r_2
of these.
Then r_1 + 2r_2 = n. Let σ_1, ..., σ_{r_1} be of the first kind and
    σ_{r_1+1}, σ̄_{r_1+1}, ..., σ_{r_1+r_2}, σ̄_{r_1+r_2}
be of the second kind. We use each σ_j to form an absolute value on K:
    ||x||_j = |σ_j(x)|   for 1 ≤ j ≤ r_1 + r_2.
Then the mapping
    Log : x → (log ||x||_1, ..., log ||x||_{r_1+r_2})
is a group homomorphism of K^× into R^{r_1+r_2}.
Theorem 4.29 (Dirichlet Unit Theorem). On O_K^×, the kernel of Log
is the (finite) torsion subgroup, and the image is a discrete subgroup of
the subspace of R^{r_1+r_2} where
    x_1 + ... + x_{r_1} + 2x_{r_1+1} + ... + 2x_{r_1+r_2} = 0.
Consequently O_K^× is a finitely generated group of rank ≤ r_1 + r_2 - 1.
The finiteness of the torsion subgroup is by Lemma 4.28. The version
of the theorem stated here is not very hard to prove. The full version of
the Dirichlet Unit Theorem says that the rank of O_K^× equals r_1 + r_2 - 1,
and its proof takes considerably more effort. But we do not need this
stronger version.
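
For a concrete picture of the map Log, the sketch below treats K = Q(√2), where r_1 = 2 and r_2 = 0, so that the hyperplane in Theorem 4.29 is x_1 + x_2 = 0. The field, the unit 1 + √2, and the function names are chosen here purely for illustration; the computation is in floating point and is only approximate.

```python
import math

def norm_sqrt2(a, b):
    """Norm from Q(sqrt 2) to Q of a + b*sqrt(2)."""
    return a * a - 2 * b * b

def log_embedding_sqrt2(a, b):
    """Log(a + b*sqrt(2)): logs of the absolute values under both real embeddings."""
    s = math.sqrt(2)
    return (math.log(abs(a + b * s)), math.log(abs(a - b * s)))

# 1 + sqrt(2) has norm -1, hence is a unit by Lemma 4.27.
x1, x2 = log_embedding_sqrt2(1, 1)
```

Up to rounding, x1 + x2 vanishes, so the image of this unit lies on the line x_1 + x_2 = 0, and since x1 is nonzero the unit has infinite order.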
We turn our attention now to the structure of ideals. If α_1, ..., α_n
are in K, their discriminant is the element of Q given by
    Δ_{K/Q}(α_1, ..., α_n) = det(Tr_{K/Q}(α_i α_j)).
If α_1, ..., α_n are in O_K then it is clear that their discriminant is in Z. It
is not too hard to establish the following additional properties of Δ_{K/Q}:
(1) Δ_{K/Q}(α_1, ..., α_n) ≠ 0 if and only if {α_1, ..., α_n} is a basis of K
over Q.
(2) If {α_1, ..., α_n} and {β_1, ..., β_n} are bases of K over Q and if
α_i = Σ_j a_{ij} β_j with all a_{ij} ∈ Z, then
    Δ_{K/Q}(α_1, ..., α_n) = [det(a_{ij})]^2 Δ_{K/Q}(β_1, ..., β_n).
(3) Δ_{K/Q}(α_1, ..., α_n) = [det(σ_i(α_j))]^2.
Proposition 4.30.
(a) Every nonzero ideal I in O_K (including O_K itself) is additively a
free abelian group of rank n whose Q span is K.
(b) If I is a nonzero ideal in O_K, then Δ_{K/Q}(α_1, ..., α_n) is the same
for every Z basis α_1, ..., α_n of I.
(c) Every nonzero ideal in O_K contains a nonzero member of Z.
(d) Every nonzero ideal I in O_K has |O_K/I| < ∞.
(e) O_K is a Noetherian ring, i.e., every ascending sequence of ideals
terminates.
(f) Every nonzero proper prime ideal in O_K is maximal.
We define a product operation on nonzero ideals I and J in O_K by
    IJ = {all sums of products from I and J}.
This product is associative and commutative, and O_K is an identity.
If I and J are nonzero ideals in O_K, we say that I and J are
equivalent if (α)I = (β)J for some nonzero principal ideals (α) and (β) in
O_K. It is easy to check that this notion of "equivalent" has the following
properties:
(1) it is an equivalence relation
(2) the product operation is a class property
(3) the principal ideals form a single equivalence class, which acts as
an identity.
The number h_K of classes of nonzero ideals in O_K is called the class
number of K. The class number equals one if and only if O_K is a
principal ideal domain. The following is the second of the three basic
theorems in beginning algebraic number theory.
Theorem 4.31. The class number of a number field is finite.
The proof requires a lemma in diophantine approximation and an
elementary argument using the notions above. For the case that
[K : Q] = 2, the classes of ideals correspond to certain classes of binary
quadratic forms, and the finiteness of h_K (as well as its value) is more
apparent.
Corollary 4.32. Multiplication of classes of nonzero ideals in O_K is
a group operation in which the identity is the class of principal ideals.
The group in the statement of the corollary is called the ideal class
group; its order is the class number of K. For the proof of the corollary,
one first shows for nonzero ideals I and J in O_K and for α ≠ 0 in O_K
that
    (α)I = JI  ⟹  (α) = J.                                         (4.88)
Lemma 4.24 is a tool in the argument. Then the finiteness of the class
number forces two powers I^i and I^j of an ideal I to be equivalent, and
finally (4.88) allows one to show that I^{|i-j|-1} is an inverse of the class
of I.
Next we give the third of the three basic theorems in beginning
algebraic number theory.
Theorem 4.33 (Unique factorization of ideals). Every proper ideal
I in O_K is uniquely a product I = ∏_{i=1}^{N} P_i^{k_i}, where the P_i are proper
nonzero prime ideals and the k_i are integers > 0. The integers k_i are
characterized by the property: P_i^{k_i} ⊇ I but P_i^{k_i+1} ⊉ I.
The proof of Theorem 4.33 begins with two preliminary facts. The
first says for three nonzero ideals A, B, C of O_K that
    AB = AC  ⟹  B = C.                                             (4.89)
The other preliminary fact says about ideals: To contain is to divide.
Specifically if A and B are nonzero ideals of O_K, then
    B ⊇ A  ⟹  there exists an ideal C with A = BC.                 (4.90)
The existence of factorization of ideals uses (4.90) and parts (d) and (e)
of Proposition 4.30. Uniqueness uses also (4.89), Proposition 4.30f, the
finite class group (Theorem 4.31 and Corollary 4.32), and the inequality
|N_{K/Q}(α)| ≥ 2 for α ∈ O_K that is not a unit (Lemma 4.27).
Proof of Theorem 4.10. Write h = h_K. Let I_1, ..., I_h be
representatives of the finitely many classes of ideals (Theorem 4.31), and let
us say I_1 = (1). Let u_j be a nonzero element of I_j, and put u = u_1 ··· u_h.
Then u is in I_j for 1 ≤ j ≤ h. Define S = {1, u, u^2, ...}. Notice that
(1) 1 ∈ S and 0 ∉ S
(2) S is closed under multiplication.
From these properties it follows that S^{-1}O_K = {s^{-1}a | s ∈ S, a ∈ O_K}
is a ring. We shall prove that S^{-1}O_K is a principal ideal domain and
that its group of units is finitely generated as an abelian group.
If I is an ideal in O_K, then S^{-1}I is clearly an ideal in S^{-1}O_K.
Every ideal I_S of S^{-1}O_K arises in this way. In fact, I = I_S ∩ O_K is
an ideal in O_K and S^{-1}I = S^{-1}(I_S ∩ O_K) coincides with I_S. [To verify
this equality, let i ∈ S^{-1}I. Then i = s^{-1}a with a ∈ I_S ∩ O_K. Since
S^{-1} ⊆ S^{-1}O_K and since I_S is an ideal in S^{-1}O_K, i = s^{-1}a is in
I_S. Conversely if i is in I_S, write i = s^{-1}a with a ∈ O_K. Then
a = si is also in I_S, hence is in I_S ∩ O_K, and i = s^{-1}a exhibits i as in
S^{-1}(I_S ∩ O_K) = S^{-1}I.]
If I_S is an ideal in S^{-1}O_K, we are to show that I_S is principal. Put
I = I_S ∩ O_K. Then I is equivalent with some I_j, 1 ≤ j ≤ h; let us write
    (α)I = (β)I_j.
Since u is in I_j ∩ S, we have S^{-1}I_j = S^{-1}O_K, and thus
    (α)_S I_S = S^{-1}(α) S^{-1}I = S^{-1}(β)I_j = S^{-1}O_K (β) = (β)_S,        (4.91)
where (α)_S and (β)_S denote principal ideals in S^{-1}O_K.
From (4.91) let us see that
    β/α is in S^{-1}O_K and I_S = (β/α)_S.                          (4.92)
In fact, (4.91) allows us to write β = α i_0 for some i_0 ∈ I_S. Hence
β/α = i_0 ∈ I_S ⊆ S^{-1}O_K, and (β/α)_S ⊆ I_S. In the reverse direction,
let i ∈ I_S be given, and use (4.91) to write αi = βx with x ∈ S^{-1}O_K.
Then i = (β/α)x shows i is in (β/α)_S, and we obtain I_S ⊆ (β/α)_S. This
proves (4.92).
Since I_S was an arbitrary ideal in S^{-1}O_K, (4.92) shows that S^{-1}O_K
is a principal ideal domain. This completes the first half of the proof.
The second half of the proof will show that (S^{-1}O_K)^× is finitely
generated. The argument is a little subtle, as more generators than u
and O_K^× are needed.
Let u^{-r}α be in (S^{-1}O_K)^×, and write (u^{-r}α)^{-1} = u^{-s}β. Then αβ =
u^{r+s}. So α is a divisor of a power ≥ 0 of u. We seek a finite set of
generators for the divisors of all powers ≥ 0 of u.
Thus let αβ = u^r with α, β ∈ O_K. Let
    (u) = P_1^{k_1} ··· P_N^{k_N}                                   (4.93)
be the factorization of (u) given by Theorem 4.33. Then we have
    (α)(β) = (u^r) = (u)^r = P_1^{k_1 r} ··· P_N^{k_N r}.
By the uniqueness in Theorem 4.33,
    (α) = P_1^{l_1} ··· P_N^{l_N}   with 0 ≤ l_j ≤ k_j r for 1 ≤ j ≤ N.
By Corollary 4.32 (and Theorem 4.31), P_j^h is principal for each j. Write
P_j^h = (γ_j). For each j, write l_j = q_j h + r_j with 0 ≤ r_j < h, so that
    (α) = (γ_1)^{q_1} ··· (γ_N)^{q_N} P_1^{r_1} ··· P_N^{r_N}.      (4.94)
Then
    α = γ_1^{q_1} ··· γ_N^{q_N} i   for some i ∈ P_1^{r_1} ··· P_N^{r_N}.
Dividing, we see that α/(γ_1^{q_1} ··· γ_N^{q_N}) (= i) is in O_K. Hence we can
rewrite (4.94) as
    (γ_1^{q_1} ··· γ_N^{q_N}) (α/(γ_1^{q_1} ··· γ_N^{q_N}))
        = (γ_1^{q_1} ··· γ_N^{q_N}) P_1^{r_1} ··· P_N^{r_N}.
By (4.89) we conclude that
    P_1^{r_1} ··· P_N^{r_N} = (α/(γ_1^{q_1} ··· γ_N^{q_N})).
In other words, for (4.94) to hold, the ideal P_1^{r_1} ··· P_N^{r_N} has to be
principal, say
    P_1^{r_1} ··· P_N^{r_N} = (δ_{r_1,...,r_N}),
and then (4.94) is the statement that
    α = γ_1^{q_1} ··· γ_N^{q_N} δ_{r_1,...,r_N} ε   with ε in O_K^×.  (4.95)
From (4.95), we see that (S^{-1}O_K)^× is generated by γ_1, ..., γ_N, the finite
number of elements δ_{r_1,...,r_N} (since the r_j's are < h), and O_K^×. Theorem
4.29 shows that O_K^× is finitely generated. Hence (S^{-1}O_K)^× is finitely
generated.
Example. Let K = Q(√-5). Then O_K = Z[√-5]. The theory of
binary quadratic forms shows that h_K = 2. In the above proof we can
take I_1 = (1) and I_2 = (2, 1 + √-5). For u we can use u_1 = 1, u_2 = 2,
u = 2. The factorization (4.93) is
    (2) = P_1^2   with P_1 = (2, 1 + √-5).
Thus N = 1, and we can take γ_1 = 2. The only ideals P_1^{r_1} ··· P_N^{r_N} that
need to be considered are P_1^0 = (1) and P_1^1, which is not principal. Thus
the only δ_{r_1,...,r_N} is 1. The units of S^{-1}O_K are therefore generated by
γ_1 = 2, δ_0 = 1, and (O_K)^× = {±1}.
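
Two of the claims in this example can be checked by brute force with the norm form a^2 + 5b^2 of Z[√-5]. The sketch below lists the elements of a given norm. It relies on the standard fact, not proved in the text, that an ideal P with P^2 = (2) has index 2 in O_K, so that a generator of P would need norm ±2. The function names are illustrative.

```python
from math import isqrt

def norm(a, b):
    """Norm from Q(sqrt(-5)) to Q of the element a + b*sqrt(-5)."""
    return a * a + 5 * b * b

def elements_of_norm(m):
    """All (a, b) in Z^2 with a^2 + 5b^2 = m, for an integer m >= 0."""
    r = isqrt(m)  # any solution has |a| <= r and |b| <= r
    return [(a, b) for a in range(-r, r + 1)
            for b in range(-r, r + 1) if norm(a, b) == m]
```

Since the norm form is positive definite, `elements_of_norm(1)` returns only (±1, 0), in agreement with Lemma 4.27 and (O_K)^× = {±1}. The empty result of `elements_of_norm(2)` is consistent with the ideal (2, 1 + √-5) not being principal.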
CHAPTER V
TORSION SUBGROUP OF E(Q)
1. Overview
Let E be an elliptic curve over Q with notation as in the Weierstrass
form (3.23). As usual, we can make an admissible change of variables
so that all the coefficients are in Z. Let Δ ∈ Z be the discriminant of
E. By Mordell's Theorem, E(Q) is a finitely generated abelian group.
Our objective in this chapter is to study the torsion subgroup E(Q)_tors.
For any particular curve, we shall see that we can determine E(Q)_tors
explicitly.
The main tool in the analysis will be reduction modulo a prime p.
For the curve E itself, with its Z coefficients, reduction modulo p will
amount to considering the curve E_p over Z_p with the Z coefficients taken
modulo p. It is less apparent how to get reduction modulo p to be a
well defined mapping r_p defined from all of E(Q), including points with
factors of p in the denominators of the coordinates, into E_p(Z_p). Once
we have done this in §2, we will find that r_p : E(Q) → E_p(Z_p) is a group
homomorphism if p ∤ Δ (i.e., if E_p is nonsingular). The main theorem is
the following result discovered in the 1930's independently by Lutz and
Nagell.
Theorem 5.1 (Lutz-Nagell). Let E be an elliptic curve (3.23) with
coefficients in Z.
(a) If a_1 = 0 and if P = (x(P), y(P), 1) is in E(Q)_tors, then x(P) and
y(P) are integers.
(b) For any a_1, if P = (x(P), y(P), 1) is in E(Q)_tors, then 4x(P) and
8y(P) are integers.
(c) If p is an odd prime such that p ∤ Δ, then the restriction to
E(Q)_tors of the reduction homomorphism r_p : E(Q) → E_p(Z_p) is
one-one. The same conclusion is valid for p = 2 if 2 ∤ Δ and a_1 = 0.
(d) If a_1 = a_3 = a_2 = 0, so that E is given by
    y^2 = x^3 + Ax + B,                                             (5.1)
and if P = (x(P), y(P), 1) is in E(Q)_tors, then either y(P) = 0 (and P
has order 2) or else y(P) ≠ 0 and y(P)^2 divides d = -4A^3 - 27B^2, the
discriminant of the cubic polynomial on the right side of (5.1).
This theorem will be proved in §4. As was mentioned in Chapter
I, a consequence is that we can determine E(Q)_tors completely for any
particular curve. The algorithm is to put the curve in the form (5.1), to
consider each square divisor y^2 of d to give a candidate for y(P) and to
write down any corresponding integers x such that (x,y) is an integer
solution of (5.1). There can be only finitely many such, and we obtain a
bound on |E(Q)_tors|. For each such integer solution (x,y), we can raise
it to powers up to our bound on the order, effectively checking whether
it is a torsion point.
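
The algorithm just described can be sketched in a few lines for a curve y^2 = x^3 + Ax + B with integer A and B. This is only an illustration under stated assumptions: candidates are found by brute-force search for integer roots, the case y = 0 from Theorem 5.1d is included, and the order test uses the bound 12 that follows from Mazur's Theorem (Theorem 1.7) instead of the cruder bound mentioned above. All function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt

def add(P, Q, A):
    """Group law on y^2 = x^3 + A*x + B; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                        # vertical line, so P + Q = O
    if P == Q:
        m = (3 * x1 * x1 + A) / (2 * y1)   # slope of the tangent
    else:
        m = (y2 - y1) / (x2 - x1)          # slope of the chord
    x3 = m * m - x1 - x2
    return (x3, m * (x1 - x3) - y1)

def is_torsion(P, A, max_order=12):
    """True if n*P is the identity for some 1 <= n <= max_order."""
    Q = P
    for _ in range(max_order):
        if Q is None:
            return True
        Q = add(Q, P, A)
    return False

def integer_roots(A, c):
    """Integer roots of x^3 + A*x + c, searched within the Cauchy root bound."""
    bound = 1 + max(abs(A), abs(c))
    return [x for x in range(-bound, bound + 1) if x ** 3 + A * x + c == 0]

def torsion_points(A, B):
    """Affine torsion points of y^2 = x^3 + A*x + B, via Theorem 5.1d."""
    d = -4 * A ** 3 - 27 * B ** 2
    if d == 0:
        raise ValueError("the cubic has a repeated root")
    found = []
    for y in range(isqrt(abs(d)) + 1):
        if y != 0 and d % (y * y) != 0:
            continue                       # y^2 must divide d unless y = 0
        for x in integer_roots(A, B - y * y):
            for s in {y, -y}:
                P = (Fraction(x), Fraction(s))
                if is_torsion(P, A):
                    found.append(P)
    return sorted(found)
```

For instance, `torsion_points(0, 1)` finds the five affine torsion points of y^2 = x^3 + 1, which together with the point at infinity form a group of order 6.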
As a practical matter, this algorithm is often a little tedious. It is
usually more efficient to use (c) to cut down the possibilities. Some
examples will illustrate.
Example 1. y^2 + y = x^3 - x^2. There are some obvious solutions with
integer coordinates, namely
    (x,y) = (0,0), (0,-1), (1,0), (1,-1).                           (5.2)
Under the doubling formula (3.74), x = 0 doubles to x = 1 and vice
versa. So all four points are torsion points. For this curve, Δ = -11
from Table 3.1. Rather than clear out the y and x^2 terms to be able to
use Theorem 5.1d, we reduce modulo 2, since a_1 = 0 and 2 ∤ Δ. Then
E_2(Z_2) contains the four points (5.2) and also ∞; hence E_2(Z_2) = Z_5.
By Theorem 5.1c, r_2 : E(Q)_tors → E_2(Z_2) = Z_5 is one-one. Since (5.2)
has shown E(Q)_tors to be nontrivial, we conclude that E(Q)_tors ≅ Z_5.
The members of E(Q)_tors are ∞ and the points in (5.2).
Example 2. y^2 + y = x^3 - x. Again there are some apparent solutions
with integer coordinates, namely
    (x,y) = (±1,0), (±1,-1), (2,2), (2,-3), (6,-15).                (5.3)
From Table 3.2, we have Δ = 37. Since a_1 = 0 and 2 ∤ Δ and 3 ∤ Δ,
Theorem 5.1c applies to p = 2 and p = 3. Over Z_2 we find 5 solutions
(counting ∞), while over Z_3 we find 7 solutions. So E(Q)_tors maps
one-one into Z_5 and also one-one into Z_7. Thus E(Q)_tors = 0. None of the
points (5.3) is a torsion point.
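
The point counts used in Examples 1 and 2 are easy to reproduce by exhaustive search. The following sketch, with hypothetical function names, counts the solutions over Z_p of a curve in Weierstrass form y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6 and adds one for the point at infinity. It makes no check that p ∤ Δ; that hypothesis is the caller's responsibility.

```python
def affine_points(a1, a2, a3, a4, a6, p):
    """Solutions (x, y) in Z_p x Z_p of the Weierstrass equation modulo p."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y + a1 * x * y + a3 * y
                - (x ** 3 + a2 * x * x + a4 * x + a6)) % p == 0]

def group_order(a1, a2, a3, a4, a6, p):
    """Number of points of E_p(Z_p), counting the point at infinity."""
    return len(affine_points(a1, a2, a3, a4, a6, p)) + 1
```

For Example 2 the coefficients are (a_1, a_2, a_3, a_4, a_6) = (0, 0, 1, -1, 0), and the orders for p = 2 and p = 3 come out as 5 and 7. Since these are relatively prime, a group that embeds in both must be trivial.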
Example 3. y^2 - xy + 2y = x^3 + 2x^2. For this curve we calculate
that Δ = -2^7·13. We can try to identify E(Q)_tors by using parts (a) and
(d) of Theorem 5.1. To find a full list of candidates for torsion points,
we can rewrite E as
    [y + (1 - x/2)]^2 = x^3 + (9/4)x^2 - x + 1.
Making admissible changes of variables three times leads first to
    y^2 = x^3 + (9/4)x^2 - x + 1,
then to
    y^2 = x^3 + 9x^2 - 16x + 64,
and finally to
    y^2 = x^3 - 43x + 166.
The discriminant of the cubic polynomial on the right side is -2^15·13.
Thus Theorem 5.1d says that y(P)^2 | 2^15·13. So y(P)^2 = 1 or 4 or 16 or
64 or 256 or 1024 or 4096 or 16384. Checking each case, we are led to the
candidates
    (3,±8), (-5,±16), (11,±32), ∞.
Using the doubling formula, we can check that each of these is a torsion
point. We can transform back to obtain the following solutions of the
original equation:
    (0,0), (0,-2), (-2,0), (-2,-4), (2,4), (2,-4), ∞.               (5.4)
The torsion subgroup of the original equation therefore consists of the
points (5.4) and is isomorphic to Z_7.
It is more efficient, however, to use Theorem 5.1c. Since 3 ∤ Δ, we
list the solutions of the original equation modulo 3 as
    (0,0), (0,1), (1,0), (1,-1), (-1,1), (-1,-1), ∞.
By Theorem 5.1c, E(Q)_tors is 0 or Z_7. In Q we quickly see that (0,0),
(0,-2), and (-2,0) are solutions. Using the doubling formula, we check
that the doubled x coordinate of any of these makes a cycle after 3 steps.
Therefore these points are torsion points, and E(Q)_tors ≅ Z_7. We can
use the results of the doubling to generate the list (5.4).
Example 4. y^2 + xy = x^3 + 4x^2 + x. For this curve, Δ = 3^2·5^2. The
point (-1/4, 1/8) lies on the curve, and the tangent line is vertical there
(since the coefficient 2y + x of dy is 0). Therefore (-1/4, 1/8) has order 2
and is a torsion point with nonintegral coordinates.
To limit the size of E(Q)_tors, we reduce modulo 7 and find that the
numbers of y's giving solutions for x = 0, 1, ..., 6 are 1, 2, 0, 1, 0, 1, 2,
respectively. Counting the point ∞, we see that E_7(Z_7) has order 8.
To identify E(Q)_tors, we replace y + (1/2)x by y and then scale. The
transformed equation is y^2 = x^3 + 17x^2 + 16x. In §5 we shall see that an
equation of the type y^2 = x(x + r^2)(x + s^2) has Z_4 ⊕ Z_2 as a subgroup
of its torsion group. So in our case, the torsion group is exactly Z_4 ⊕ Z_2.
Taking the 8 solutions given in §5 and transforming back, we obtain
    (0,0), (-1/4, 1/8), (-1,-1), (-1,2), (1,2), (1,-3), (-4,2), ∞
as the torsion elements for our original equation.
Since 2 ∤ Δ, reduction modulo 2 is a homomorphism. The group
E_2(Z_2) consists of
    (0,0), (1,0), (1,1), ∞,
and r_2 maps E(Q)_tors onto E_2(Z_2) with kernel Z_2 = {(-1/4, 1/8), ∞}.
Table 5.1 gives examples of elliptic curves over Q with 15 different
torsion subgroups. Methods for generating such examples will be indicated
in §5. According to Theorem 1.7 (due to Mazur), these 15 groups are
the only possibilities for a torsion group over Q. The proof of Mazur's
Theorem is well beyond the scope of this book and will not be given
here.
  E                                        E(Q)_tors     Δ
  y^2 = x^3 + 2                            0             -2^6 3^3
  y^2 = x^3 + x                            Z_2           -2^6
  y^2 = x^3 + 4                            Z_3           -2^8 3^3
  y^2 = x^3 + 4x                           Z_4           -2^12
  y^2 + y = x^3 - x^2                      Z_5           -11
  y^2 = x^3 + 1                            Z_6           -2^4 3^3
  y^2 - xy + 2y = x^3 + 2x^2               Z_7           -2^7 13
  y^2 + 7xy - 6y = x^3 - 6x^2              Z_8           2^8 3^4 17
  y^2 + 3xy + 6y = x^3 + 6x^2              Z_9           -2^9 3^5
  y^2 - 7xy - 36y = x^3 - 18x^2            Z_10          -2^5 3^10 11^2
  y^2 + 43xy - 210y = x^3 - 210x^2         Z_12          2^12 3^6 5^3 7^4 13
  y^2 = x^3 - x                            Z_2 ⊕ Z_2     2^6
  y^2 = x^3 + 5x^2 + 4x                    Z_4 ⊕ Z_2     2^8 3^2
  y^2 + 5xy - 6y = x^3 - 3x^2              Z_6 ⊕ Z_2     2^2 3^6 5^2
  y^2 = x^3 + 337x^2 + 20736x              Z_8 ⊕ Z_2     2^20 3^8 5^4 7^2

Table 5.1. Examples of torsion subgroups of E(Q)
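
The discriminants in the last column of Table 5.1 can be recomputed from the Weierstrass coefficients. The sketch below uses the standard expressions for b_2, b_4, b_6, b_8 and Δ for the equation y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6; it is assumed here, not checked against Chapter III, that these agree with the conventions attached to (3.23). The function name is illustrative.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
```

As a usage example, `discriminant(7, -6, -6, 0, 0)` evaluates the entry for the curve with torsion group Z_8, and `discriminant(0, -1, 1, 0, 0)` gives the value -11 used in Example 1.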
There are two situations where we can decide E(Q)_tors for infinitely
many given curves at once (other than the situations in §5 where we
construct some families just for this decidability). The two theorems that
follow use Dirichlet's Theorem on primes in arithmetic progressions and
will be proved in §6. Part of the first theorem was borrowed in Chapter
IV and stated without proof as Lemma 4.20.
Theorem 5.2. Let E be the elliptic curve y^2 = x^3 + Ax with A in Z
and with A assumed fourth-power free. Then
                 Z_2 ⊕ Z_2   if -A is a square in Z
    E(Q)_tors =  Z_4         if A = 4
                 Z_2         otherwise.
Theorem 5.3. Let E be the elliptic curve y^2 = x^3 + B with B in Z
and with B assumed sixth-power free. Then
                 Z_6   if B = 1
    E(Q)_tors =  Z_3   if B = -432 = -2^4 3^3, or if B is a square and B ≠ 1
                 Z_2   if B is a cube and B ≠ 1
                 0     otherwise.
2. Reduction Modulo p
We recall the definition of the p-adic norm on Q, p being a prime.
If r ≠ 0 is in Q, we write r = p^n u/v with u and v in Z and with
GCD(u,p) = GCD(v,p) = 1. The definition of the p-adic norm is
|r|_p = p^{-n}. By convention, we define |0|_p = 0. This notion has the
following properties:
(i) |r + s|_p ≤ max{|r|_p, |s|_p}, with equality if |r|_p ≠ |s|_p,
(ii) |rs|_p = |r|_p |s|_p.
Property (ii) is clear. For (i) we write r = p^n u/v and s = p^{n'} u'/v', with
n ≤ n' without loss of generality. Writing
    r + s = p^n (u/v + p^{n'-n} u'/v') = p^n (uv' + p^{n'-n} u'v)/(vv')
with GCD(vv',p) = 1, we obtain (i) directly.
Property (i) is called the ultrametric inequality. It implies
|r + s|_p ≤ |r|_p + |s|_p. If we define d(x,y) = |x - y|_p, then the latter
inequality implies the triangle inequality for d, and d is a metric on Q.
We say that r ∈ Q is p-integral if |r|_p ≤ 1. By (i) and (ii), the
p-integral elements form a subring of Q containing Z. Those with |r|_p < 1
form an ideal in this subring; they of course have |r|_p ≤ p^{-1}. The
p-integral elements of Q can be reduced modulo p: If r = p^n u/v is
p-integral, i.e., if n ≥ 0, then we define r_p(r) ∈ Z_p by
    r_p(r) = u/v mod p   if n = 0
    r_p(r) = 0           if n > 0.
Then r_p : {p-integral elements} → Z_p is a ring homomorphism.
In preparation for considering plane curves, we can try to use r_p to
get a map of the affine plane over Q to the affine plane over Z_p, but the
best we get is a map defined on
    {(r,s) | r and s are p-integral}
as r_p(r,s) = (r_p(r), r_p(s)). To correct this deficiency we work with
curves projectively, as follows.
To define r_p : P^2(Q) → P^2(Z_p), we let
    r_p(x,y,w) = (r_p(x), r_p(y), r_p(w)),                          (5.5)
where (x,y,w) are coordinates of the point in question chosen so that
x, y, and w all have |·|_p ≤ 1 and at least one of them has |·|_p = 1. Such
a representative of a point in P^2(Q) is said to be p-reduced. Note that
if a general (x,y,w) is given, we can multiply by a suitable p^n to obtain
a p-reduced representative. A p-reduced representative is unique up to
a factor with |·|_p = 1. Therefore r_p is well defined as a map of all of
P^2(Q) into P^2(Z_p).
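
The definitions of this section translate directly into exact rational arithmetic. The sketch below computes the p-adic norm, reduces a p-integral rational modulo p, and applies (5.5) to a projective point after rescaling it to a p-reduced representative. The function names are illustrative, and the modular inverse uses `pow(v, -1, p)`, which requires Python 3.8 or later.

```python
from fractions import Fraction

def valuation(r, p):
    """Exponent n in r = p^n u/v with u, v prime to p, for r != 0."""
    r = Fraction(r)
    if r == 0:
        raise ValueError("0 has no finite valuation")
    n, num, den = 0, r.numerator, r.denominator
    while num % p == 0:
        num //= p
        n += 1
    while den % p == 0:
        den //= p
        n -= 1
    return n

def norm_p(r, p):
    """p-adic norm |r|_p = p^(-n), with |0|_p = 0."""
    return Fraction(0) if r == 0 else Fraction(p) ** (-valuation(r, p))

def reduce_scalar(r, p):
    """r_p(r) for a p-integral rational r, as an integer in range(p)."""
    r = Fraction(r)
    if r != 0 and valuation(r, p) < 0:
        raise ValueError("not p-integral")
    return r.numerator * pow(r.denominator, -1, p) % p

def reduce_point(P, p):
    """r_p of a point of P^2(Q), after passing to a p-reduced representative."""
    P = [Fraction(c) for c in P]
    n = min(valuation(c, p) for c in P if c != 0)
    scaled = [c / Fraction(p) ** n for c in P]   # now p-reduced
    return tuple(reduce_scalar(c, p) for c in scaled)
```

For example, the point (-1/4, 1/8, 1) of Example 4 in §1 has 2-reduced representative (-2, 1, 8), so `reduce_point` sends it to (0, 1, 0), the point at infinity. This matches the statement there that the point lies in the kernel of r_2.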
Using (5.5), we can reduce projective plane curves modulo p. Let F ∈
Q[x,y,w]_m be a plane curve of degree m. Multiplying the coefficients of
F by a constant, we may assume that all the coefficients have |·|_p ≤ 1
and at least one has |·|_p = 1. Then we can reduce the coefficients
modulo p, obtaining a nonzero polynomial F_p ∈ Z_p[x,y,w]_m. Although
F_p is not defined uniquely, it is defined uniquely up to a nonzero scalar.
Therefore its zero locus F_p(Z_p) is well defined.
Proposition 5.4. Let F ∈ Q[x,y,w]_m be a plane curve. Under the
reduction homomorphism r_p : P^2(Q) → P^2(Z_p) given in (5.5), the image
of F(Q) is contained in F_p(Z_p).
Proof. Normalize the coefficients of F as above, and let (x,y,w) be
a p-reduced representative of a point in P^2(Q). Then
    (x,y,w) ∈ F(Q)  ⟺  F(x,y,w) = 0
                    ⟹  r_p(F(x,y,w)) = 0
                    ⟺  F_p(r_p(x), r_p(y), r_p(w)) = 0
                    ⟺  F_p(r_p(x,y,w)) = 0
                    ⟺  r_p(x,y,w) ∈ F_p(Z_p).
Proposition 5.5. Suppose F ∈ Q[x,y,w]_m is a plane curve, L ∈
Q[x,y,w]_1 is a line, and P = (x_0,y_0,w_0) is a point on L. If F_p and L_p
are reductions of F and L modulo p, then the intersection multiplicities
satisfy
    i(P,L,F) ≤ i(r_p(P),L_p,F_p).                                   (5.6)
Proof. Without loss of generality, we may assume that (x_0,y_0,w_0)
is a p-reduced representative and that the coefficients of F and L are
normalized as they are supposed to be. Choose a p-reduced
representative (x',y',w') of a point P' of L with [(x',y',w')] ≠ [(x_0,y_0,w_0)], and
form
    ψ(t) = F(P + tP') = F(x_0 + tx', y_0 + ty', w_0 + tw')
         = t^r F_r + ... + t^m F_m,
with F_r ≠ 0. By Proposition 2.9 the left side of (5.6) equals r.
Recomputing ψ(t) modulo p (i.e., in Z_p[t]) and applying Proposition 2.9 again,
we see that the right side of (5.6) is ≥ r.
Let us apply Propositions 5.4 and 5.5 to elliptic curves E over Q.
For studying E(Q), we may make an admissible change of variables
y → y/u^3, x → x/u^2 to make all coefficients of E be in Z. Then we
can assume E is in Weierstrass form (3.23) with all coefficients in Z. To
apply the above theory, we consider the projective form (3.23a) of E.
The coefficients of E are all in Z and hence are p-integral. Also wy^2
and x^3 have coefficient 1. Thus passage to E_p is given simply by writing
(3.23a) with coefficients considered in Z_p; no preliminary normalization
is needed. The discriminant of E_p is clearly given by
    Δ_p = Δ mod p.
Thus E_p is nonsingular if and only if p ∤ Δ. Our reduction map on E(Q)
is a mapping
    r_p : E(Q) → E_p(Z_p)                                           (5.7)
by Proposition 5.4.
Proposition 5.6. If E_p is nonsingular, then the map r_p in (5.7) is a
group homomorphism.
Proof. Since r_p(0,1,0) = (0,1,0), r_p carries O to O_p. We apply
Proposition 5.5. Since the sum of intersection multiplicities over a line
is ≤ 3 (by nonsingularity of E and E_p), the proposition gives r_p(PQ) =
r_p(P)·r_p(Q). Thus
    r_p(P + Q) = r_p(O·PQ) = r_p(O)·r_p(PQ) = r_p(O)·(r_p(P)·r_p(Q))
               = O_p·(r_p(P)·r_p(Q)) = r_p(P) + r_p(Q),
and r_p is a group homomorphism.
3. p-adic Filtration
Fix an elliptic curve E with Z coefficients. In order to get at the
integrality or almost integrality of torsion points, it is natural to work with
coordinates (x,y,1), clear denominators, and see what happens. The
difficulty with this approach is that it is not so easy to take advantage
of the hypothesis that (x,y,1) is a torsion point.
The clever idea that works is to use coordinates (X,1,W). Proposition
5.6 shows that r_p : E(Q) → E_p(Z_p) is a group homomorphism if p ∤
Δ. In the coordinates (X,1,W), subgroups and the homomorphism
property of r_p play a more visible role, and the finite order of a torsion
point becomes a usable property.
When p ∤ Δ, the kernel of r_p is
    {(x,y,w) ∈ E(Q) | r_p(x,y,w) = (0,1,0)}.
For such a point (x,y,w), we must have y ≠ 0. Normalizing, we may
assume y = 1. The elements (x,1,w) ∈ E(Q) that are in ker(r_p) are
those with |x|_p < 1 and |w|_p < 1. In this section we shall study the
same set
    E^(1)(Q) = {(x,1,w) ∈ E(Q) | |x|_p < 1 and |w|_p < 1},
but without the assumption that p ∤ Δ.
Lemma 5.7. Let (x,1,w) be in E(Q). If |w|_p < 1, then |x|_p < 1 and
|w|_p = |x|_p^3.
Proof. Putting y = 1 in (3.23a), we have
    w = -a_1xw - a_3w^2 + x^3 + a_2x^2w + a_4xw^2 + a_6w^3.         (5.8)
First suppose |x|_p ≥ 1. On the right side of (5.8) the x^3 term is strictly
the largest relative to |·|_p, since
    |-a_1xw|_p  ≤ |xw|_p   < |x|_p   ≤ |x|_p^3 = |x^3|_p
    |-a_3w^2|_p ≤ |w|_p^2  < 1       ≤ |x|_p^3 = |x^3|_p
    |a_2x^2w|_p ≤ |x^2w|_p < |x|_p^2 ≤ |x|_p^3 = |x^3|_p
    |a_4xw^2|_p ≤ |xw^2|_p < |x|_p   ≤ |x|_p^3 = |x^3|_p
    |a_6w^3|_p  ≤ |w|_p^3  < 1       ≤ |x|_p^3 = |x^3|_p.
By the ultrametric inequality, (5.8) gives |w|_p = |x^3|_p. This is a
contradiction, since |w|_p < 1 and |x^3|_p = |x|_p^3 ≥ 1. We conclude that
|x|_p < 1. Now we rewrite (5.8) as
    w + a_1xw + a_3w^2 - a_2x^2w - a_4xw^2 - a_6w^3 = x^3.          (5.9)
If w = 0, then x = 0 and |w|_p = |x|_p^3. If w ≠ 0, then the w term is
strictly the largest on the left side of (5.9), since
    |a_1xw|_p   ≤ |xw|_p   = |x|_p |w|_p   < |w|_p
    |a_3w^2|_p  ≤ |w^2|_p  = |w|_p^2       < |w|_p
    |a_2x^2w|_p ≤ |x^2w|_p = |x|_p^2 |w|_p < |w|_p
    |a_4xw^2|_p ≤ |xw^2|_p = |x|_p |w|_p^2 < |w|_p
    |a_6w^3|_p  ≤ |w^3|_p  = |w|_p^3       < |w|_p.
By the ultrametric inequality, (5.9) gives |w|_p = |x^3|_p = |x|_p^3, and the
lemma follows.
For n ≥ 1, define
    E^(n)(Q) = {(x,1,w) ∈ E(Q) | |w|_p < 1 and |x|_p ≤ p^{-n}}.
The p-adic filtration of E^(1)(Q) is
    E^(1)(Q) ⊇ E^(2)(Q) ⊇ E^(3)(Q) ⊇ ···   with   ∩_{n=1}^{∞} E^(n)(Q) = {(0,1,0)}.
The important control on a point of E^(1)(Q) other than the group
identity is that it lies in E^(n)(Q) - E^(n+1)(Q) for some n. Lemma 5.7 says
that
    E^(n)(Q) = {(x,1,w) ∈ E(Q) | |w|_p ≤ p^{-3n}}.
Let R be the ring of p-integral elements of Q, i.e., members of Q with
no factor of p in the denominator.
Proposition 5.8. The subsets E^(n)(Q) of E(Q) are subgroups. The
function P = (x,1,w) → x(P) = x gives a map E^(n)(Q) → p^nR.
The composition of this map, followed by the quotient map p^nR →
p^nR/p^{2n}R, is a group homomorphism
    E^(n)(Q) → p^nR/p^{2n}R
whose kernel is contained in E^(2n)(Q). Consequently the homomorphism
    E^(n)(Q)/E^(2n)(Q) → p^nR/p^{2n}R
is one-one.
Before coming to the proof, we mention that
    p^nR/p^{2n}R ≅ p^nZ/p^{2n}Z ≅ Z_{p^n}.                          (5.10)
For the second of these isomorphisms, we observe that each group is
cyclic and they both have p^n elements. For the first of the isomorphisms,
we map p^nZ into p^nR/p^{2n}R and obtain a map on the quotient
    p^nZ/p^{2n}Z → p^nR/p^{2n}R.
This map is one-one since p^nZ ∩ p^{2n}R ⊆ p^{2n}Z. To see that it is onto, let
p^n u/v be in p^nR. Choose a and b in Z with av + bp^n = u. Then
    p^n u/v = p^n a + p^{2n} b/v
shows that p^n a maps onto the coset of p^n u/v.
The proof of Proposition 5.8 will be preceded by the statement of a
lemma. The lemma will be proved after the proposition.
Lemma 5.9. Let L be a line that meets E(Q) in three points when
multiplicities are counted, say in P_1 = (x_1,1,w_1), P_2 = (x_2,1,w_2), and
P_3 = (x_3,1,w_3). If P_1 and P_2 are in E^(n)(Q), then P_3 is in E^(n)(Q) and
    |x_1 + x_2 + x_3|_p ≤ p^{-3n}   if a_1 = 0
    |x_1 + x_2 + x_3|_p ≤ p^{-2n}   in any case.                    (5.11)
Proof of Proposition 5.8. If P_1 and P_2 are in E^(n)(Q), then the
lemma says that P_3 = P_1P_2 is in E^(n)(Q). Since O is in E^(n)(Q), so is
O·P_1P_2 = P_1 + P_2. Also O·P_1 = -P_1 is in E^(n)(Q). Hence E^(n)(Q) is
a subgroup of E(Q).
If P is in E^(n)(Q) and x = x(P), then |x|_p ≤ p^{-n} yields
    |p^{-n}x|_p = |p^{-n}|_p |x|_p = p^n |x|_p ≤ 1.
Hence p^{-n}x is in R and x is in p^nR. Thus P → x(P) gives a map of
E^(n)(Q) into p^nR.
Let P_1P_2 = P_3. Then the lemma gives
    x(P_1) + x(P_2) + x(P_3) ∈ p^{2n}R.                             (5.12)
If P_3 = (x_3,1,w_3), then a little computation with (3.70) shows that
    O·P_3 = -P_3 = (-x_3/(1 + a_1x_3 + a_3w_3), 1, -w_3/(1 + a_1x_3 + a_3w_3)).
Hence
    x(P_1 + P_2) = x(O·P_3) = -x_3/(1 + a_1x_3 + a_3w_3)
and
    x(P_1 + P_2) + x(P_3) = -x_3/(1 + a_1x_3 + a_3w_3) + x_3
                          = x_3 (a_1x_3 + a_3w_3)/(1 + a_1x_3 + a_3w_3)
                          ∈ p^{2n}R                                 (5.13)
since |x_3|_p ≤ p^{-n}, |a_1x_3 + a_3w_3|_p ≤ p^{-n}, and
|1 + a_1x_3 + a_3w_3|_p = 1. Combining (5.12) and (5.13), we see that
    x(P_1 + P_2) ≡ x(P_1) + x(P_2) mod p^{2n}R,
and the composition P → x(P) → x(P) + p^{2n}R is a homomorphism.
If P = (x,1,w) in E^(n)(Q) is in the kernel of the resulting homomorphism,
then x(P) = x is in p^{2n}R and |w|_p < 1. Then
|x|_p ≤ p^{-2n}, and P is in E^(2n)(Q). This completes the proof.
Proof of Lemma 5.9. The first part of the proof will show that the
line L is of the form
w = rnx + 6 (5-14)
and will give a bound for the slope m. We begin with the equation
w -f a\wx -f azw2 = x3 -f a2x2w + a^xw2 + aew3 (5.15)
satisfied by the three points.
Case 1. $P_1 \ne P_2$. Here we substitute $(x_1,1,w_1)$ and $(x_2,1,w_2)$ into
(5.15) and subtract the results. Each term of the difference is an integral
multiple of
$$x_1^sw_1^t - x_2^sw_2^t = (x_1^s - x_2^s)w_1^t + x_2^s(w_1^t - w_2^t)$$
$$= (x_1 - x_2)(x_1^{s-1} + \cdots + x_2^{s-1})w_1^t + (w_1 - w_2)(w_1^{t-1} + \cdots + w_2^{t-1})x_2^s. \tag{5.16}$$
Consider all the terms of (5.15) but $w$ and $x^3$, all of which have $s + 3t \ge 4$.
Noting that $P_1$ and $P_2$ are in $E^{(n)}(\mathbb{Q})$, we use
$$|(x_1 - x_2)(x_1^{s-1} + \cdots + x_2^{s-1})w_1^t|_p \le |x_1 - x_2|_p\,p^{-n(s-1)}p^{-3nt} \le p^{-3n}|x_1 - x_2|_p$$
when $s > 0$, and we use
$$|(w_1 - w_2)(w_1^{t-1} + \cdots + w_2^{t-1})x_2^s|_p \le |w_1 - w_2|_p\,p^{-3n(t-1)}p^{-ns} \le p^{-n}|w_1 - w_2|_p$$
when $t > 0$. Moving the $x_1 - x_2$ terms obtained from (5.16) to the right
side and the $w_1 - w_2$ terms to the left side, we can rewrite our difference
of equations as
$$(w_1 - w_2)(1 + u) = (x_1 - x_2)(x_1^2 + x_1x_2 + x_2^2 + v), \tag{5.17}$$
where $|u|_p \le p^{-n}$ and $|v|_p \le p^{-3n}$. Since $|u|_p < 1$, $|1 + u|_p = 1$. In
particular, $x_1 = x_2$ implies $w_1 = w_2$. Since we have assumed $P_1 \ne P_2$,
we conclude $x_1 \ne x_2$. Thus the line through $P_1$ and $P_2$ is of the form
(5.14). In view of (5.17), the slope $m$ is given by
$$m = \frac{w_1 - w_2}{x_1 - x_2} = \frac{x_1^2 + x_1x_2 + x_2^2 + v}{1 + u}.$$
Hence
$$|m|_p = \left|\frac{x_1^2 + x_1x_2 + x_2^2 + v}{1 + u}\right|_p = |x_1^2 + x_1x_2 + x_2^2 + v|_p \le \max\{p^{-2n}, p^{-2n}, p^{-2n}, p^{-3n}\} = p^{-2n}. \tag{5.18}$$
Case 2. $P_1 = P_2$. The line in question is the tangent, which we
know exists. Since $\mathbb{Q} \subseteq \mathbb{R}$, we can compute the tangent by implicit
differentiation of (5.15). Differentiation of (5.15) and substitution of
$(x,1,w) = (x_1,1,w_1)$ gives
$$(1 + a_1x_1 + 2a_3w_1 - a_2x_1^2 - 2a_4x_1w_1 - 3a_6w_1^2)\,dw = (-a_1w_1 + 3x_1^2 + 2a_2x_1w_1 + a_4w_1^2)\,dx.$$
The coefficient of $dw$ is of the form $1 + u'$ with $|u'|_p < 1$. Therefore $\frac{dw}{dx}$
is finite. The tangent line is thus of the form (5.14), and the slope $m$ is
given by
$$m = \frac{-a_1w_1 + 3x_1^2 + 2a_2x_1w_1 + a_4w_1^2}{1 + u'}.$$
Hence
$$|m|_p = |-a_1w_1 + 3x_1^2 + 2a_2x_1w_1 + a_4w_1^2|_p \le \max\{p^{-3n}, p^{-2n}, p^{-4n}, p^{-6n}\} \le p^{-2n}. \tag{5.19}$$
Both cases. Thus in both cases, (5.18) and (5.19) say
$$|m|_p \le p^{-2n}. \tag{5.20}$$
Since $w_1 = mx_1 + b$, we have $b = w_1 - mx_1$, and (5.20) gives
$$|b|_p \le p^{-3n}. \tag{5.21}$$
The three points $P_1, P_2, P_3$ satisfy (5.14) and (5.15) with multiplicities.
Substitution of (5.14) into (5.15) gives a cubic equation
$$(mx + b) + a_1x(mx + b) + a_3(mx + b)^2 = x^3 + a_2x^2(mx + b) + a_4x(mx + b)^2 + a_6(mx + b)^3$$
whose roots, with multiplicities, are $x_1, x_2, x_3$. Then $x_1 + x_2 + x_3$ must
be minus the quotient of the coefficient of $x^2$ by the coefficient of $x^3$,
and we obtain
$$x_1 + x_2 + x_3 = -\frac{-a_1m - a_3m^2 + a_2b + 2a_4mb + 3a_6m^2b}{1 + a_2m + a_4m^2 + a_6m^3}.$$
By (5.20) the denominator has norm 1. Thus (5.20) and (5.21) show
that
$$|x_1 + x_2 + x_3|_p \le \max\{|a_1|_p\,p^{-2n},\ p^{-3n}\} \le \begin{cases} p^{-3n} & \text{if } a_1 = 0 \\ p^{-2n} & \text{in any case,} \end{cases}$$
as required. Since $|x_1|_p \le p^{-n}$ and $|x_2|_p \le p^{-n}$, we deduce that $|x_3|_p \le
p^{-n}$. From $w_3 = mx_3 + b$, we obtain $|w_3|_p \le p^{-3n}$ by using (5.20) and
(5.21). Thus $P_3$ is in $E^{(n)}(\mathbb{Q})$. This proves the lemma.
REMARK. The proof of Proposition 5.8 did not make use of the full
strength of the lemma since it did not isolate the case $a_1 = 0$. When
$a_1 = 0$, one can go over the proof of the proposition to see that $p^{2n}$ can
be replaced by $p^{3n}$ and $E^{(2n)}$ can be replaced by $E^{(3n)}$. In particular,
$a_1 = 0$ implies that there is a well defined map
$$E^{(n)}(\mathbb{Q})/E^{(3n)}(\mathbb{Q}) \to p^nR/p^{3n}R, \tag{5.22}$$
and it is a one-one homomorphism. We shall use this observation
critically in the next proposition.
Proposition 5.10. For each odd prime $p$, $E(\mathbb{Q})_{\mathrm{tors}} \cap E^{(1)}(\mathbb{Q}) = 0$.
This conclusion extends to $p = 2$ if $a_1 = 0$.
Proof. Fix $p$. First assume $a_1 = 0$. If $E(\mathbb{Q})_{\mathrm{tors}} \cap E^{(1)}(\mathbb{Q}) \ne 0$, the
intersection contains a nonzero element $P$ of some prime order $q$. Since
$\bigcap_{n=1}^{\infty} E^{(n)}(\mathbb{Q}) = \{O\}$, we can find $n$ such that $P$ is in $E^{(n)}(\mathbb{Q})$ but not
$E^{(n+1)}(\mathbb{Q})$. From $qP = O$ and the homomorphism property of the map
in Proposition 5.8 (as amended by (5.22)), we see that
$$q\,x(P) \text{ is in } p^{3n}R. \tag{5.23}$$
If $q \ne p$, then $x(P)$ must be in $p^{3n}R \subseteq p^{2n}R$, while if $q = p$, then $x(P)$
must be in $p^{3n-1}R \subseteq p^{2n}R$. In either case the one-oneness in Proposition
5.8 implies
$$P \in E^{(2n)}(\mathbb{Q}) \subseteq E^{(n+1)}(\mathbb{Q}),$$
contradiction.
Now we allow $a_1 \ne 0$ but assume $p$ is odd. Suppose by way of
contradiction that $P = (x,1,w)$ is a nonzero element of $E(\mathbb{Q})_{\mathrm{tors}} \cap E^{(1)}(\mathbb{Q})$
for the equation (5.15). Then $P$ is a torsion point with $|x|_p < 1$ and
$|w|_p < 1$. Also $w \ne 0$. Write
$$[(x,1,w)] = [(\bar x,\bar y,1)]$$
by putting $\bar x = x/w$ and $\bar y = 1/w$. Then $(\bar x,\bar y,1)$ is a torsion point with
$$\bar y^2 + a_1\bar x\bar y + a_3\bar y = \bar x^3 + a_2\bar x^2 + a_4\bar x + a_6. \tag{5.24}$$
We make an admissible change of variables, putting
$$\bar x = \tfrac14\bar x' \quad\text{and}\quad \bar y = \tfrac18(\bar y' - a_1\bar x').$$
Substituting into (5.24), we see that $(\bar x',\bar y',1)$ is a torsion point of
$$\bar y'^2 + 8a_3\bar y' = \bar x'^3 + (4a_2 + a_1^2)\bar x'^2 + (16a_4 + 8a_1a_3)\bar x' + 64a_6.$$
The point $(x',1,w')$ with $[(x',1,w')] = [(\bar x',\bar y',1)]$ has $x' = \bar x'/\bar y'$ and
$w' = 1/\bar y'$, and it is a torsion point of
$$w' + 8a_3w'^2 = x'^3 + (4a_2 + a_1^2)x'^2w' + (16a_4 + 8a_1a_3)x'w'^2 + 64a_6w'^3. \tag{5.25}$$
This equation has integer coefficients, and its $x'w'$ term has coefficient
0. From the previous paragraph it has $E(\mathbb{Q})_{\mathrm{tors}} \cap E^{(1)}(\mathbb{Q}) = 0$. We shall
obtain a contradiction by showing that $(x',1,w')$ is in this intersection.
In fact, we know $(x',1,w')$ is a torsion point. Also
$$w' = \frac{w}{8 + 4a_1x},$$
and $p$ odd and $|x|_p < 1$ imply $|8 + 4a_1x|_p = 1$, so that $|w'|_p = |w|_p \le
p^{-3}$. By Lemma 5.7, $|x'|_p \le p^{-1}$. Thus $(x',1,w')$ is in $E^{(1)}(\mathbb{Q})$. This
contradiction completes the proof.
4. Lutz-Nagell Theorem
In this section we shall prove Theorem 5.1. The notation is as in §1.
Proof of Theorem 5.1a. Under the assumption that $a_1 = 0$ let
$(x,y,1)$ be in $E(\mathbb{Q})_{\mathrm{tors}}$. We are to prove that $x$ and $y$ are in $\mathbb{Z}$. First
we handle $y$. Without loss of generality, we may suppose $y \ne 0$. Then
$[(x,y,1)] = [(X,1,W)]$, where $X = x/y$ and $W = 1/y$. Fix any prime
$p$. By Proposition 5.10, the torsion point $(X,1,W)$ is not in $E^{(1)}(\mathbb{Q})$.
By Lemma 5.7, $|W|_p \ge 1$. Thus $|y|_p = \left|\frac{1}{W}\right|_p \le 1$. Since this condition
holds for all $p$, $y$ is an integer.
Regarding y as an integer constant in (3.23b), we see that x satisfies
a monic cubic polynomial equation with integer coefficients. For such
an equation, the rational roots are integers. Thus x is an integer.
Proof of Theorem 5.1b. Without the assumption that $a_1 = 0$, let
$(x,y,1)$ be in $E(\mathbb{Q})_{\mathrm{tors}}$. Then $(x',y',1) = (4x,8y,1)$ satisfies
$$y'^2 + 2a_1x'y' + 2^3a_3y' = x'^3 + 2^2a_2x'^2 + 2^4a_4x' + 2^6a_6,$$
and $(x'',y'',1) = (x', y' + a_1x', 1)$ satisfies
$$y''^2 + 8a_3y'' = x''^3 + (4a_2 + a_1^2)x''^2 + (16a_4 + 8a_1a_3)x'' + 64a_6.$$
By Theorem 5.1a, $x''$ and $y''$ are integers. Tracing back, we see that $x'$
and $y'$ are integers. Hence $4x$ and $8y$ are integers.
Proof of Theorem 5.1c. When $p \nmid \Delta$, we have $\ker(r_p) = E^{(1)}(\mathbb{Q})$.
Under our hypotheses, Proposition 5.10 says that $E(\mathbb{Q})_{\mathrm{tors}} \cap E^{(1)}(\mathbb{Q}) = 0$.
Thus the restriction of $r_p$ to $E(\mathbb{Q})_{\mathrm{tors}}$ is one-one.
Proof of Theorem 5.1d. Let $E$ be of the form (5.1). The doubling
formula (3.74) gives
$$x(2P) = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4(x^3 + Ax + B)},$$
and we write this as $x(2P) = \dfrac{\nu(x)}{4\delta(x)}$. Since $\delta(x) = y^2$, we have
$$\nu(x) = 4y^2x(2P). \tag{5.26}$$
Now $x$, $x(2P)$, and $y^2$ are integers by Theorem 5.1a, and we see from
(5.26) that $\nu(x)$ is an integer and that $y^2 \mid \nu(x)$. Thus $y^2$ divides both
$\nu(x)$ and $\delta(x)$. Direct calculation gives
$$(3x^2 + 4A)\nu(x) - (3x^3 - 5Ax - 27B)\delta(x) = 4A^3 + 27B^2 = -d.$$
Since $y^2$ divides both $\nu(x)$ and $\delta(x)$, $y^2$ divides $d$.
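The divisibility condition just proved turns the search for torsion points on $y^2 = x^3 + Ax + B$ into a finite computation. The following Python sketch is only an illustration and is not taken from the text: the function names `lutz_nagell_candidates` and `identity_gap` are hypothetical, $A$ and $B$ are assumed to be integers with $4A^3 + 27B^2 \ne 0$, and the returned points merely satisfy the necessary conditions of Theorem 5.1 (integer coordinates, and $y = 0$ or $y^2 \mid 4A^3 + 27B^2$); a candidate need not be a torsion point.

```python
def lutz_nagell_candidates(A, B):
    """Integer points (x, y) on y^2 = x^3 + A*x + B with y = 0 or y^2 | 4A^3 + 27B^2.

    These are only candidates for torsion points (necessary conditions).
    """
    D = abs(4 * A**3 + 27 * B**2)
    if D == 0:
        raise ValueError("singular cubic: 4A^3 + 27B^2 = 0")
    # Admissible values of y >= 0: y = 0, or y^2 dividing D.
    ys = [0] + [y for y in range(1, int(D**0.5) + 2) if D % (y * y) == 0]
    points = []
    for y in ys:
        c = B - y * y
        # Every root of x^3 + A*x + c satisfies |x| <= 1 + max(|A|, |c|) (Cauchy bound).
        bound = 1 + max(abs(A), abs(c))
        for x in range(-bound, bound + 1):
            if x**3 + A * x + c == 0:
                points.append((x, y))
                if y != 0:
                    points.append((x, -y))
    return sorted(points)


def identity_gap(x, A, B):
    """Left side minus right side of the polynomial identity used in the proof."""
    nu = x**4 - 2 * A * x**2 - 8 * B * x + A**2
    delta = x**3 + A * x + B
    return (3 * x**2 + 4 * A) * nu - (3 * x**3 - 5 * A * x - 27 * B) * delta - (4 * A**3 + 27 * B**2)
```

For the curve $y^2 = x^3 + 4x$ the candidates are $(0,0)$ and $(2,\pm 4)$, and `identity_gap` vanishes for every integer input, as the direct calculation in the proof asserts.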
5. Construction of Curves with Prescribed Torsion
In this section we study how to construct elliptic curves $E$ over $\mathbb{Q}$ with
prescribed features in $E(\mathbb{Q})_{\mathrm{tors}}$. The first part of the section shows how
to produce elements of a given order. Then we shall discuss how to make
$\mathbb{Z}_{2n} \oplus \mathbb{Z}_2$ be a subgroup of $E(\mathbb{Q})_{\mathrm{tors}}$. Finally we summarize matters by
saying how Table 5.1 in §1 was constructed.
In studying behavior at a single point, we may translate the curve to
make the special point be P = (0,0) in the affine plane.
After this translation, we have $a_6 = 0$, and the curve is
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x. \tag{5.27}$$
From §III.5 the curve is singular at $(0,0)$ if and only if $a_3 = a_4 = 0$.
Assume that (5.27) is nonsingular. Then $P = (0,0)$ is of order 2 if
and only if the tangent is vertical there. Vertical tangency occurs when,
in the implicitly differentiated form of the curve, the coefficient of $dy$ is
0. In (5.27), $dy$ has coefficient $2y + a_1x + a_3$. Thus $P$ has order 2 if and
only if $a_3 = 0$ (and therefore $a_4 \ne 0$).
Assume next that (5.27) is nonsingular and that $a_3 \ne 0$. If we make
the admissible change of variables
$$(x,y) = (x',\ y' + a_3^{-1}a_4x')$$
in (5.27), then $P$ remains at $(0,0)$ and we are led to
$$y'^2 + (a_1 + 2a_3^{-1}a_4)x'y' + a_3y' = x'^3 + (a_2 - a_1a_3^{-1}a_4 - a_3^{-2}a_4^2)x'^2.$$
Changing notation, we can rewrite this as
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 \tag{5.28}$$
with no $x$ term. With $P = (0,0)$, the numbered formulas of §III.4 give
$$-P = (0,-a_3) \tag{5.29a}$$
and
$$2P = (-a_2,\ a_1a_2 - a_3). \tag{5.29b}$$
Since $3P = O$ if and only if $-P = 2P$, we see that $P = (0,0)$ in (5.28)
has order 3 if and only if $a_2 = 0$.
Now assume that we have arrived at a nonsingular (5.28), i.e., $a_3 \ne 0$,
and suppose $a_2 \ne 0$. Then we can eliminate one of the parameters.
Namely if we make the admissible change of variables
$$(x,y) = (x'/u^2,\ y'/u^3)$$
with $u = a_3^{-1}a_2$, then $P$ remains at $(0,0)$ and we are led to
$$y'^2 + a_3^{-1}a_1a_2x'y' + a_3^{-2}a_2^3y' = x'^3 + a_3^{-2}a_2^3x'^2$$
with equal coefficients for the $y$ and $x^2$ terms. Changing notation, we
can rewrite this as
$$E(b,c) : y^2 + (1-c)xy - by = x^3 - bx^2.$$
This is called the Tate normal form. It is nonsingular if and only if
$\Delta(b,c) \ne 0$, which in particular requires $b \ne 0$. Direct calculation shows that the discriminant is
$$\Delta(b,c) = (1-c)^4b^3 - (1-c)^3b^3 - 8(1-c)^2b^4 + 36(1-c)b^4 - 27b^4 + 16b^5.$$
From (5.29) and (3.70), we obtain
$$P = (0,0), \quad -P = (0,b), \quad 2P = (b,bc), \quad -2P = (b,0). \tag{5.30a}$$
Using (3.72) and (3.73), and then (3.70), we find
$$3P = (c,\ b - c), \quad -3P = (c,\ c^2). \tag{5.30b}$$
From (5.30) we can easily choose $b$ and $c$ to make $P = (0,0)$ have
$4P = O$ or $5P = O$ or $6P = O$. To achieve these three conditions, we
just make $2P = -2P$ or $3P = -2P$ or $3P = -3P$. From (5.30) we
obtain immediately
$$4P = O \iff c = 0$$
$$5P = O \iff b = c \tag{5.31}$$
$$6P = O \iff b = c^2 + c.$$
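The multiples of $P = (0,0)$ on the Tate normal form can be checked mechanically. The sketch below is an illustration under stated assumptions, not the text's own computation: it uses the standard chord-and-tangent formulas for a general Weierstrass equation with coefficients $(a_1, a_2, a_3, a_4, a_6)$, exact rational arithmetic, and `None` for the point $O$; the helper names `tate_coeffs`, `add`, `multiple`, and `order_of` are hypothetical.

```python
from fractions import Fraction


def tate_coeffs(b, c):
    """Coefficients (a1, a2, a3, a4, a6) of E(b,c): y^2 + (1-c)xy - by = x^3 - bx^2."""
    b, c = Fraction(b), Fraction(c)
    return (1 - c, -b, -b, Fraction(0), Fraction(0))


def add(P, Q, a):
    """Sum of two points on the curve; None stands for the point O at infinity."""
    a1, a2, a3, a4, a6 = a
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 + y2 + a1 * x2 + a3 == 0:      # Q = -P
            return None
        # Otherwise P = Q: slope of the tangent line.
        lam = (3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) / (2 * y1 + a1 * x1 + a3)
    else:
        lam = (y2 - y1) / (x2 - x1)          # slope of the chord
    x3 = lam * lam + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - (y1 - lam * x1) - a3
    return (x3, y3)


def multiple(k, P, a):
    """k*P for k >= 0 by repeated addition."""
    R = None
    for _ in range(k):
        R = add(R, P, a)
    return R


def order_of(P, a, limit=12):
    """Smallest k <= limit with k*P = O, or None if there is none."""
    R = P
    for k in range(1, limit + 1):
        if R is None:
            return k
        R = add(R, P, a)
    return None


P = (Fraction(0), Fraction(0))
```

With $b = 2$, $c = 3$ the code reproduces $2P = (b, bc) = (2,6)$ and $3P = (c, b-c) = (3,-1)$, and the parameter choices $c = 0$, $b = c$, $b = c^2 + c$ give points of order 4, 5, 6, in agreement with (5.30) and (5.31).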
Let us turn to the question how to make $\mathbb{Z}_{2n} \oplus \mathbb{Z}_2$ be a subgroup of
$E(\mathbb{Q})_{\mathrm{tors}}$. The tool will be Theorem 4.2. Since $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ will have to be
a subgroup, we can assume from the outset that $E$ is of the form
$$y^2 = (x - \alpha)(x - \beta)(x - \gamma)$$
with $\alpha, \beta, \gamma$ in $\mathbb{Z}$. Translating $(\gamma,0)$ to $(0,0)$ and changing notation, we
can write the curve as
$$y^2 = x(x - \alpha)(x - \beta). \tag{5.32}$$
For $\mathbb{Z}_4 \oplus \mathbb{Z}_2$ to be contained in $E(\mathbb{Q})_{\mathrm{tors}}$, one of the points $(x_0,0)$ with
$x_0 \in \{0,\alpha,\beta\}$ must be the double of something. Suppose $(0,0)$ is the
point that is to be a double. Theorem 4.2 says that the necessary and
sufficient condition is that $0 - 0$, $0 - \alpha$, and $0 - \beta$ are squares. Thus any
curve
$$y^2 = x(x + r^2)(x + s^2)$$
has $\mathbb{Z}_4 \oplus \mathbb{Z}_2 \subseteq E(\mathbb{Q})_{\mathrm{tors}}$. The subgroup $\mathbb{Z}_4 \oplus \mathbb{Z}_2$ consists of the elements
$$(0,0),\ (-r^2,0),\ (-s^2,0),\ (rs, \pm rs(r+s)),\ (-rs, \pm rs(r-s)),\ \infty.$$
This construction was used in connection with Example 4 in §1.
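The list of elements can be verified directly: each of the four points with $y \ne 0$ should lie on the curve and double to $(0,0)$. The following sketch assumes nonzero integers $r \ne \pm s$ and uses the tangent-line doubling formula for $y^2 = x^3 + a_2x^2 + a_4x$ with $a_2 = r^2 + s^2$ and $a_4 = r^2s^2$; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def on_curve(P, r, s):
    """True if P = (x, y) satisfies y^2 = x(x + r^2)(x + s^2)."""
    x, y = P
    return y * y == x * (x + r * r) * (x + s * s)


def double(P, r, s):
    """2P for an affine point P with y != 0 on y^2 = x^3 + a2*x^2 + a4*x."""
    a2, a4 = r * r + s * s, r * r * s * s
    x, y = Fraction(P[0]), Fraction(P[1])
    lam = (3 * x * x + 2 * a2 * x + a4) / (2 * y)   # tangent slope
    x3 = lam * lam - a2 - 2 * x
    y3 = lam * (x - x3) - y
    return (x3, y3)


def order_four_points(r, s):
    """The four listed points with nonzero y coordinate."""
    return [(r * s, r * s * (r + s)), (r * s, -r * s * (r + s)),
            (-r * s, r * s * (r - s)), (-r * s, -r * s * (r - s))]
```

For $r = 1$, $s = 2$ the curve is $y^2 = x(x+1)(x+4)$ and the four points are $(2,\pm 6)$ and $(-2,\pm 2)$.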
The application of Theorem 4.2 can be repeated, this time to be used
on $(rs, rs(r+s))$, and the result is a condition for $\mathbb{Z}_8 \oplus \mathbb{Z}_2$ to be a
subgroup of $E(\mathbb{Q})_{\mathrm{tors}}$.
To summarize matters, let us say how Table 5.1 in §1 was constructed.
For as many cases as possible, the examples are from Theorems 5.2 and
5.3. For $\mathbb{Z}_5$ we used (5.31), and for $\mathbb{Z}_7, \mathbb{Z}_8, \mathbb{Z}_9$ we used extensions of (5.31)
that deal with $7P = O$, $8P = O$, $9P = O$. For $\mathbb{Z}_{10}$, we started with the
standard form with $P = (0,0)$ of order 5 and found a case where $P$
was the output from the doubling formula (3.74). For $\mathbb{Z}_{12}$, the same
technique in principle should work with $P = (0,0)$ of order 6, but we
simply quoted a known example. The examples with $\mathbb{Z}_4 \oplus \mathbb{Z}_2$ and $\mathbb{Z}_8 \oplus \mathbb{Z}_2$
were obtained by the method in the previous two paragraphs, and the
example for $\mathbb{Z}_6 \oplus \mathbb{Z}_2$ was obtained by starting with the form (5.31) for
$6P = O$ and looking for a case that had three elements of order 2.
6. Torsion Groups for Special Curves
Our objective in this section is to prove Theorems 5.2 and 5.3 in §1.
Each theorem requires a lemma about $E_p(\mathbb{Z}_p)$ for certain primes $p$. Also
we shall use the form of the doubling formula (3.74) when specialized to
a curve $y^2 = x^3 + Ax + B$:
$$x(2P) = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4(x^3 + Ax + B)}. \tag{5.33}$$
And we shall use Dirichlet's Theorem that there are infinitely many
(positive) primes $an + b$ if $\mathrm{GCD}(a,b) = 1$; this theorem will be proved
in Chapter VII.
Lemma 5.11. Let $E_p$ be the curve $y^2 = x^3 + Ax$ over $\mathbb{Z}_p$, and assume
that $p \nmid A$, $p \ge 7$, and $p \equiv 3 \mod 4$. Then $E_p(\mathbb{Z}_p)$ has exactly $p + 1$
points.
PROOF. We start from the known result that $p \equiv 3 \mod 4$ implies
that $-1$ is not a square modulo $p$. For $x \ne 0$, consider the pair $\{x, -x\}$.
When these elements are substituted into $E$, we obtain $x^3 + Ax$ and
$-(x^3 + Ax)$. If the answers are 0, each one has a square root, and we
get one solution from each. If they are nonzero, exactly one is a square
(since $-1$ is not a square), and it has two square roots $y$. So in either
case, the pair $\{x, -x\}$ gives us two solutions. Thus the nonzero $x$'s give
us $p - 1$ solutions in all. For $x = 0$, we get one more solution $(0,0)$, and
$\infty$ gives us one additional solution. Thus $E_p(\mathbb{Z}_p)$ has $p + 1$ points.
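The count in Lemma 5.11 is easy to confirm by brute force for small primes. The sketch below is an illustration only; `count_points` is a hypothetical helper that counts solutions of $y^2 = x^3 + Ax + B$ modulo $p$ and adds one for the point $\infty$.

```python
def count_points(A, B, p):
    """Number of points of y^2 = x^3 + A*x + B over Z/pZ, including the point at infinity."""
    # square_roots[v] = number of y in Z/pZ with y^2 = v
    square_roots = {}
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    affine = sum(square_roots.get((x**3 + A * x + B) % p, 0) for x in range(p))
    return affine + 1
```

For instance, `count_points(1, 0, 7)` counts the points of $y^2 = x^3 + x$ over $\mathbb{Z}_7$, and the same count $p + 1$ appears for every $A$ prime to $p$ when $p \equiv 3 \mod 4$.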
Proof of Theorem 5.2. The main step is to show that $|E(\mathbb{Q})_{\mathrm{tors}}|$
divides 4. By Theorem 5.1c, for all sufficiently large primes $p$, $|E(\mathbb{Q})_{\mathrm{tors}}|$
divides $|E_p(\mathbb{Z}_p)|$. By Lemma 5.11, $|E(\mathbb{Q})_{\mathrm{tors}}|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 3 \mod 4$.
Let us see that 8 does not divide $|E(\mathbb{Q})_{\mathrm{tors}}|$. By Dirichlet's Theorem,
we can choose a prime $p$ as in the previous sentence with $p \equiv 3 \mod 8$.
If 8 divides $|E(\mathbb{Q})_{\mathrm{tors}}|$, then $8 \mid (p + 1)$. But $p \equiv 3 \mod 8$ means that
$p + 1 \equiv 4 \mod 8$; so $8 \nmid (p + 1)$, contradiction.
Now let us see that 3 does not divide $|E(\mathbb{Q})_{\mathrm{tors}}|$. By Dirichlet's
Theorem, we can choose $p$ large with $p \equiv 7 \mod 12$. Then $p \equiv 3 \mod 4$.
Thus $3 \mid |E(\mathbb{Q})_{\mathrm{tors}}|$ implies $3 \mid (p + 1)$. But $p + 1 \equiv 8 \mod 12$ implies
$p + 1 \equiv 2 \mod 3$; so $3 \nmid (p + 1)$, contradiction.
Finally let us see that no odd prime $q > 3$ divides $|E(\mathbb{Q})_{\mathrm{tors}}|$. By
Dirichlet's Theorem, we can choose $p$ large with $p \equiv 3 \mod 4q$. Then
$p \equiv 3 \mod 4$. Thus $q \mid |E(\mathbb{Q})_{\mathrm{tors}}|$ implies $q \mid (p + 1)$. But $p + 1 \equiv 4
\mod 4q$ implies $p + 1 \equiv 4 \mod q$; so $q \nmid (p + 1)$, contradiction.
This completes the proof that $|E(\mathbb{Q})_{\mathrm{tors}}|$ divides 4. The torsion group
will then contain $\mathbb{Z}_2 = \{(0,0), \infty\}$ as a subgroup, and it will be $\mathbb{Z}_2 \oplus \mathbb{Z}_2$
if and only if $x^3 + Ax$ splits over $\mathbb{Q}$, i.e., if and only if $-A$ is a square.
Thus the only question is when $(0,0)$ is the double of something (so that
the torsion group is $\mathbb{Z}_4$ rather than $\mathbb{Z}_2$). We can check directly for $A = 4$
that $(2,4)$ doubles to $(0,0)$.
Consider the equation $2(x,y) = (0,0)$ for other $A$. By (5.33), we have
$$0 = x^4 - 2Ax^2 + A^2 = (x^2 - A)^2.$$
Thus $x^2 = A$. Since $A$ is fourth-power free, $x$ is square free. But
$y^2 = x(x^2 + A) = x(x^2 + x^2) = 2x^3$ then shows that no odd prime can
divide $x$. So $x = \pm 1$ or $\pm 2$. Checking the possibilities (note that $y^2 = 2x^3$ forces $x > 0$), we see that
$x = 2$ and $A = 4$.
Lemma 5.12. Let $E_p$ be the curve $y^2 = x^3 + B$ over $\mathbb{Z}_p$, and assume
that $p \nmid \Delta$, $p \ge 5$, and $p \equiv 2 \mod 3$. Then $E_p(\mathbb{Z}_p)$ has exactly $p + 1$
points.
Proof. Let $p = 3n + 2$. The multiplicative group $\mathbb{Z}_p^\times$ has order $p - 1$.
Since $3 \nmid (p - 1)$, no element has order 3. Therefore the homomorphism
$a \to a^3$ on $\mathbb{Z}_p^\times$ is one-one, hence onto. Thus each element of $\mathbb{Z}_p$ has a
unique cube root. For each $y$ in $\mathbb{Z}_p$, the element $y^2 - B$ has a unique
cube root, which we can take as $x$. In this way we obtain $p$ solutions.
Adjoining $\infty$, we see that $E_p(\mathbb{Z}_p)$ has $p + 1$ points.
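The proof is constructive, and the construction can be imitated directly. The sketch below is an illustration with hypothetical function names: `cubing_is_bijective` tests whether $a \to a^3$ permutes $\mathbb{Z}_p$, and `solutions_by_cube_roots` builds the $p$ affine solutions of $y^2 = x^3 + B$ by taking, for each $y$, the unique cube root of $y^2 - B$ (it assumes $p \equiv 2 \mod 3$ and raises an error otherwise).

```python
def cubing_is_bijective(p):
    """True if a -> a^3 is a bijection of Z/pZ."""
    return len({pow(a, 3, p) for a in range(p)}) == p


def solutions_by_cube_roots(B, p):
    """Affine solutions (x, y) of y^2 = x^3 + B mod p, one for each y, as in the proof."""
    if not cubing_is_bijective(p):
        raise ValueError("cube roots are not unique modulo p")
    cube_root = {pow(x, 3, p): x for x in range(p)}
    return [(cube_root[(y * y - B) % p], y) for y in range(p)]
```

Adding the point $\infty$ to the $p$ solutions returned gives the count $p + 1$ of the lemma; for a prime such as $p = 7 \equiv 1 \mod 3$ the cubing map is not a bijection and the argument does not apply.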
Proof of Theorem 5.3. The main step is to show that $|E(\mathbb{Q})_{\mathrm{tors}}|$
divides 6. By Theorem 5.1c, for all sufficiently large primes $p$, $|E(\mathbb{Q})_{\mathrm{tors}}|$
divides $|E_p(\mathbb{Z}_p)|$. By Lemma 5.12, $|E(\mathbb{Q})_{\mathrm{tors}}|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 2 \mod 3$.
Let us see that 4 does not divide $|E(\mathbb{Q})_{\mathrm{tors}}|$. By Dirichlet's Theorem,
we can choose a prime $p$ as in the previous sentence with $p \equiv 5 \mod 12$.
If 4 divides $|E(\mathbb{Q})_{\mathrm{tors}}|$, then $4 \mid (p + 1)$. But $p \equiv 1 \mod 4$ means that
$p + 1 \equiv 2 \mod 4$; so $4 \nmid (p + 1)$, contradiction.
Now let us see that 9 does not divide $|E(\mathbb{Q})_{\mathrm{tors}}|$. By Dirichlet's
Theorem, we can choose $p$ large with $p \equiv 2 \mod 9$. Then $p \equiv 2 \mod 3$.
Thus $9 \mid |E(\mathbb{Q})_{\mathrm{tors}}|$ implies $9 \mid (p + 1)$. But $p + 1 \equiv 3 \mod 9$ implies
$9 \nmid (p + 1)$, contradiction.
Finally let us see that no odd prime $q > 3$ divides $|E(\mathbb{Q})_{\mathrm{tors}}|$. By
Dirichlet's Theorem, we can choose $p$ large with $p \equiv 2 \mod 3q$. Then
$p \equiv 2 \mod 3$. Thus $q \mid |E(\mathbb{Q})_{\mathrm{tors}}|$ implies $q \mid (p + 1)$. But $p + 1 \equiv 3
\mod 3q$ implies $p + 1 \equiv 3 \mod q$; so $q \nmid (p + 1)$, contradiction.
This completes the proof that $|E(\mathbb{Q})_{\mathrm{tors}}|$ divides 6. The torsion group
has an element of order 2 if and only if $x^3 + B$ has a first-degree factor
over $\mathbb{Q}$, i.e., if and only if $B$ is a cube. Thus the only question is when
the torsion group has elements of order 3. Such a point $P = (x,y)$
is characterized by $2P = -P$. Moreover, the $x$ coordinate determines
everything, since $2P = +P$ is impossible for $P \ne O$. By (5.33) the
question is whether
$$\frac{x^4 - 8Bx}{4(x^3 + B)} = x$$
has any rational solutions $x$. Clearing fractions, we have
$$4x^4 + 4Bx = x^4 - 8Bx$$
$$x^4 = -4Bx.$$
One solution is $x = 0$, which gives $y^2 = B$; so $\mathbb{Z}_3$ occurs if $B$ is a square.
The only other possibility is $x^3 = -4B$. Then $y^2 = -3B$. Consequently
$B < 0$. Since $B$ is sixth-power free, the only possible prime divisors of
$B$ are 2 and 3. We readily find $B = -2^43^3$. So $\mathbb{Z}_3$ occurs if and only if
either $B$ is a square or $B = -2^43^3$.
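The two cases can be checked numerically with the tangent-line doubling formula. In the sketch below (an illustration; `double` and `has_order_three` are hypothetical names), a point $P = (x,y)$ with $y \ne 0$ on $y^2 = x^3 + B$ is declared to have order 3 when $2P = -P = (x,-y)$.

```python
from fractions import Fraction


def double(P, A, B):
    """2P for an affine point P with y != 0 on y^2 = x^3 + A*x + B (B is not needed by the formula)."""
    x, y = Fraction(P[0]), Fraction(P[1])
    lam = (3 * x * x + A) / (2 * y)     # tangent slope
    x3 = lam * lam - 2 * x
    return (x3, lam * (x - x3) - y)


def has_order_three(P, B):
    """True if P lies on y^2 = x^3 + B, has y != 0, and satisfies 2P = -P."""
    x, y = P
    return y != 0 and y * y == x**3 + B and double(P, 0, B) == (x, -y)
```

For $B = -2^43^3 = -432$ the equation $x^3 = -4B$ gives $x = 12$ and $y^2 = -3B = 1296$, so the point is $(12, 36)$; for a square such as $B = 9$ the point is $(0, 3)$.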
CHAPTER VI
COMPLEX POINTS
1. Overview
We shall consider meromorphic functions on $\mathbb{C}$ periodic with two
periods that are linearly independent over $\mathbb{R}$. Such functions are called
elliptic functions. We fix the periods. Among these functions, there
is a basic one $\wp(z)$, and $\wp(z)$ satisfies the differential equation
$$\wp'^2 = 4\wp^3 - g_2\wp - g_3$$
for certain constants $g_2$ and $g_3$ depending on the periods. The linear
independence of the two generating periods forces the cubic $4x^3 - g_2x - g_3$
to have nonzero discriminant.
For $t \in \mathbb{C}$ in a parallelogram $\Pi$ generated by the periods, we form the
parametrically defined curve
$$x = \wp(t), \qquad y = \wp'(t).$$
Then we have
$$y^2 = 4x^3 - g_2x - g_3,$$
and the discriminant is nonzero. In other words, $\wp$ provides us with a
parametrization of part of a curve $E$ over $\mathbb{C}$, and $E$ is for all intents and
purposes an elliptic curve. Taking into account poles of $\wp(t)$ and using
the projective plane over $\mathbb{C}$ rather than the affine plane, we shall see that
the parametrization is onto $E(\mathbb{C})$. In fact, the map is biholomorphic.
There is a natural addition on our fundamental parallelogram obtained
from addition in C, and we shall see that this operation corresponds to
the group law for E(C).
Finally we shall see that every elliptic curve over C can be realized by
this construction. This is the inversion problem, and detailed motivation
for its solution appears in §5.
2. Elliptic Functions
A function $f : \mathbb{C} \to \mathbb{C} \cup \{\infty\}$ is doubly periodic with periods $\omega_1$
and $\omega_2$ if $\omega_1$ and $\omega_2$ are linearly independent over $\mathbb{R}$ and if $f(z + \omega_1) =
f(z) = f(z + \omega_2)$ for all $z \in \mathbb{C}$. An elliptic function is a meromorphic
doubly periodic function.
Fix $\omega_1$ and $\omega_2$. The corresponding elliptic functions form a field. The
parallelogram with vertices $0$, $\omega_1$, $\omega_2$, $\omega_1 + \omega_2$ is called the fundamental
parallelogram $\Pi$. To be more precise, we shall insist that $\Pi$ contain the
two sides adjacent to the origin, as well as the origin, but not the other
two sides and three vertices. Any $\mathbb{C}$-translate $\alpha + \Pi$ of $\Pi$ is a period
parallelogram. With our precise definition of period parallelogram,
we can conclude the following: For any period parallelogram $\alpha + \Pi$, any
point in $\mathbb{C}$ is congruent modulo $\mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$ to one and only one point of
$\alpha + \Pi$.
Proposition 6.1 (First Liouville Theorem). There exists no noncon-
stant elliptic function without poles.
PROOF. Such a function would have to be bounded on the closure $\bar\Pi$,
hence entire and bounded on $\mathbb{C}$. By Liouville's theorem for entire functions, it is constant.
Corollary 6.2. If two elliptic functions have the same poles with the
same respective principal parts, then the functions differ by a constant.
Proposition 6.3 (Second Liouville Theorem). If $f(z)$ is an elliptic
function with no poles on the boundary $C$ of a period parallelogram
$\alpha + \Pi$, then the sum of the residues of $f(z)$ in $\alpha + \Pi$ is zero.
PROOF. By the Residue Theorem, the sum in question equals
$$\frac{1}{2\pi i}\oint_C f(z)\,dz.$$
The integrals over opposite sides of $\alpha + \Pi$ cancel (since $f$ is the same at
congruent points and $dz$ introduces a minus sign on one of two matching
sides), and thus the integral is 0.
Remark. If $f$ is elliptic, it has only finitely many poles on a bounded
set, and thus there is always an $\alpha$ for which the proposition applies.
Corollary 6.4. If $f(z)$ is a nonconstant elliptic function, then either
$f$ has more than one pole in $\alpha + \Pi$ or else $f$ has one pole and that pole
has order $> 1$.
Proof. There has to be a pole, and the sum of the residues is 0.
Corollary 6.5 (Third Liouville Theorem). Suppose $f(z)$ is an elliptic
function with no pole or zero on the boundary $C$ of a period
parallelogram $\alpha + \Pi$. Let $\{m_i\}$ be the orders of the various zeros in $\alpha + \Pi$, and
let $\{n_j\}$ be the orders of the various poles. Then $\sum_i m_i = \sum_j n_j$.
Proof. Since $f$ is meromorphic, the residue of $\dfrac{f'(z)}{f(z)}$ at $z_0$ is
$$\begin{cases} n & \text{if } n > 0 \text{ is the order of a zero of } f \text{ at } z_0 \\ 0 & \text{if } f \text{ has no zero or pole at } z_0 \\ -n & \text{if } n > 0 \text{ is the order of a pole of } f \text{ at } z_0. \end{cases}$$
Summing over the points of the period parallelogram and applying
Proposition 6.3 to $\dfrac{f'(z)}{f(z)}$, we obtain the corollary.
The order of an elliptic function is the number of poles in $\alpha + \Pi$,
counting multiplicities. In Corollary 6.5, the order is $\sum_j n_j$.
Consequently the number of zeros of $f(z)$, counting multiplicities, equals the
order of $f(z)$.
Corollary 6.6. Suppose $f(z)$ is an elliptic function of order $m$ and
$\alpha + \Pi$ is a period parallelogram. For each $c \in \mathbb{C}$, $f$ takes on the value $c$
exactly $m$ times, counting multiplicities.
Proof. Apply Corollary 6.5 to $f(z) - c$.
3. Weierstrass $\wp$ Function
Fix $\omega_1$ and $\omega_2$ in $\mathbb{C}$ linearly independent over $\mathbb{R}$, and let $\Pi$ be the
fundamental parallelogram. We shall define a particular nonconstant
elliptic function with periods $\omega_1$ and $\omega_2$, thereby proving existence of
nonconstant elliptic functions. Then we shall develop some properties
of this function.
Write $\Lambda = \{m\omega_1 + n\omega_2 \mid m, n \in \mathbb{Z}\}$; $\Lambda$ is called the period lattice.
The Weierstrass $\wp$ function relative to $\Lambda$ is defined by
$$\wp(z) = \frac{1}{z^2} + \sum_{0 \ne \omega \in \Lambda}\left(\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2}\right). \tag{6.1}$$
We shall make repeated use of the following fact from complex variable
theory: If a sequence of functions analytic in an open set converges
uniformly, then the limit is analytic, and limit and derivative may be
interchanged.
Lemma 6.7. If $s$ is a real number $> 2$, then
$$\sum_{0 \ne \omega \in \Lambda} |\omega|^{-s}$$
converges.
Proof. Let $\Omega$ be the union of the four $\Lambda$ translates of $\Pi$ that surround
the origin:
$$\Omega = \Pi \cup (-\omega_1 + \Pi) \cup (-\omega_2 + \Pi) \cup (-\omega_1 - \omega_2 + \Pi).$$
The boundary of $\Omega$ is compact and does not contain 0, and we can choose
$c > 0$ so that $|z| \ge c$ for all $z$ in the boundary of $\Omega$. If $|m| \ge |n| > 0$, we
then have
$$|m\omega_1 + n\omega_2| = |m|\left|\omega_1 + \frac{n}{m}\omega_2\right| \ge c|m|, \tag{6.2}$$
since $z = \omega_1 + \dfrac{n}{m}\omega_2$ lies in the boundary of $\Omega$.
Now consider $\sum_{\omega \ne 0} |\omega|^{-s}$. If we write $\omega = m\omega_1 + n\omega_2$, the terms
with $n = 0$ contribute $2|\omega_1|^{-s}\sum_{m=1}^{\infty} m^{-s} < \infty$, and the terms with
$m = 0$ contribute $2|\omega_2|^{-s}\sum_{n=1}^{\infty} n^{-s} < \infty$. By (6.2), the terms with
$|m| \ge |n| > 0$ contribute
$$\sum_{|m| \ge |n| > 0} |m\omega_1 + n\omega_2|^{-s} \le c^{-s}\sum_{|m| \ge |n| > 0} |m|^{-s} = 4c^{-s}\sum_{m=1}^{\infty} m^{-s+1} < \infty,$$
and the terms with $|n| \ge |m| > 0$ similarly make a finite contribution.
The lemma follows.
Proposition 6.8. If $F$ is any finite subset of $\Lambda$ and if the terms
corresponding to $F$ are omitted in (6.1), then the resulting series converges
absolutely uniformly on any compact subset of $\mathbb{C} - (\Lambda - F)$.
Consequently $\wp(z)$ is meromorphic in $\mathbb{C}$, its only poles are double poles at the
points of $\Lambda$, and $\wp'(z)$ may be computed term by term.
Proof. We may assume $F$ contains $\omega = 0$. The sum for $\Lambda - F$ is
$$\sum_{\omega \in \Lambda - F}\left(\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2}\right) = \sum_{\omega \in \Lambda - F}\frac{2z - \dfrac{z^2}{\omega}}{\omega(z - \omega)^2},$$
and we have
$$\left|\frac{2z - \dfrac{z^2}{\omega}}{\omega(z - \omega)^2}\right| \le \frac{C}{|\omega|^3}$$
on our compact set. By Lemma 6.7, $\sum_{\omega \in \Lambda - F} |\omega|^{-3} < \infty$. Therefore the
absolute uniform convergence follows by the Weierstrass M-test. The
fact from complex variable theory at the beginning of the section shows
that the limit is meromorphic. The poles are as indicated because of the
convergence, and the same fact from the beginning of the section shows
we can compute $\wp'(z)$ term by term.
Proposition 6.9. The function $\wp(z)$ is an elliptic function with $\omega_1$
and $\omega_2$ as periods, and $\wp(-z) = \wp(z)$. The order of $\wp(z)$ is 2.
PROOF. We have $\wp(-z) = \wp(z)$ since the right side of (6.1) is
unchanged if $z$ is replaced by $-z$ and $\omega$ is replaced by $-\omega$. Differentiation
of (6.1) gives
$$\wp'(z) = -2\sum_{\omega \in \Lambda}\frac{1}{(z - \omega)^3},$$
and this is certainly doubly periodic. Hence $\wp'(z)$ is elliptic.
For $\omega \in \Lambda$, we differentiate $\wp(z + \omega) - \wp(z)$ and obtain $\wp'(z + \omega) -
\wp'(z) = 0$. Therefore $\wp(z + \omega) - \wp(z) = C$. Taking $\omega = \omega_1$ and evaluating at $z = -\tfrac12\omega_1$
(where there is no pole, according to Proposition 6.8), we obtain
$$C = \wp(\tfrac12\omega_1) - \wp(-\tfrac12\omega_1) = \wp(\tfrac12\omega_1) - \wp(\tfrac12\omega_1) = 0.$$
Hence $\wp(z + \omega_1) = \wp(z)$. Similarly $\wp(z + \omega_2) = \wp(z)$. Since $\wp$ is
meromorphic, $\wp$ is elliptic. Since $z = 0$ is the only pole of $\wp(z)$ in $\Pi$, the
order of $\wp(z)$ is 2.
Theorem 6.10. Any even elliptic function (relative to periods $\omega_1$ and
$\omega_2$) is a rational function of $\wp(z)$. Any elliptic function $f$ is of the form
$f(z) = g(\wp(z)) + \wp'(z)h(\wp(z))$ with $g$ and $h$ rational.
Preliminary remarks. The second statement follows from the first
since $f = f_e + f_o$ with $f_e(z) = \tfrac12(f(z) + f(-z))$ even and with $f_o(z) =
\tfrac12(f(z) - f(-z))$ odd, and since $f_o = \wp' \cdot \dfrac{f_o}{\wp'}$ with $\dfrac{f_o}{\wp'}$ even. The proof of
the first statement will be preceded by a lemma.
Lemma 6.11.
(a) For any $u \in \mathbb{C}$, the elliptic function $\wp(z) - u$ has within $\Pi$ either
two simple zeros or one double zero.
(b) The zeros of $\wp'(z)$ within $\Pi$ are $\tfrac12\omega_1$, $\tfrac12\omega_2$, and $\tfrac12(\omega_1 + \omega_2)$, all
simple.
(c) The values $u_1 = \wp(\tfrac12\omega_1)$, $u_2 = \wp(\tfrac12\omega_2)$, and $u_3 = \wp(\tfrac12(\omega_1 + \omega_2))$ are
the exact $u$'s for which $\wp(z) - u$ has a double zero within $\Pi$, and $u_1, u_2, u_3$
are distinct.
Proof.
(a) Since $\wp(z)$ has order 2, we can apply Corollary 6.6.
(b) The function $\wp'(z)$ has order 3. Hence it has at most 3 zeros.
Since $\wp'(z)$ is odd and periodic, we have
$$\wp'(\tfrac12\omega_1) = -\wp'(-\tfrac12\omega_1) = -\wp'(\omega_1 - \tfrac12\omega_1) = -\wp'(\tfrac12\omega_1).$$
Thus $\tfrac12\omega_1$ is a zero. Similarly $\tfrac12\omega_2$ and $\tfrac12(\omega_1 + \omega_2)$ are zeros.
(c) If $\wp(z) - u$ has a zero at $z_0$, the zero is double if and only if $\wp'(z_0) =
0$. By (b), $\wp(z) - u$ can have a double zero only at $\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$,
and in these cases $u$ clearly must be $u_1, u_2, u_3$, respectively. Conversely if
$u = u_1$, then $\wp(z) - u_1$ is 0 at $z = \tfrac12\omega_1$, and also $\wp'(\tfrac12\omega_1) = 0$. So we have
a double 0. Similar remarks apply to $\tfrac12\omega_2$ and to $\tfrac12(\omega_1 + \omega_2)$. Finally
if at least two of $u_1, u_2, u_3$ were equal, say to $u_0$, then $\wp(z) - u_0$ would
have at least two double zeros, in contradiction to (a); we conclude that
$u_1, u_2, u_3$ are distinct.
Proof of Theorem 6.10. We are to prove that any even elliptic
function is a rational function of $\wp(z)$. Let $f(z)$ be even elliptic.
Taking into account the evenness, we shall list "half" the zeros and poles,
temporarily ignoring what happens at $z = 0$.
Let $f(a) = 0$. First suppose $a \in \Pi$ and $a \notin \{0, \tfrac12\omega_1, \tfrac12\omega_2, \tfrac12(\omega_1 + \omega_2)\}$.
Let $a^*$ be the symmetric point, the point in $\Pi$ congruent to $-a$:
$$a^* = \begin{cases} \omega_1 + \omega_2 - a & \text{if } a \text{ is interior (and not } \tfrac12(\omega_1 + \omega_2)) \\ \omega_1 - a & \text{if } a \text{ is on the side to } \omega_1 \text{ (and not } \tfrac12\omega_1) \\ \omega_2 - a & \text{if } a \text{ is on the side to } \omega_2 \text{ (and not } \tfrac12\omega_2). \end{cases} \tag{6.3}$$
If $a$ has order $m$, so does $a^*$: In fact,
$$f(a^* - z) = f(\text{period} - a - z) = f(-a - z) = f(a + z).$$
If $f(a + z) = a_mz^m + \text{higher}$, then
$$f(a^* + z) = f(a - z) = a_m(-z)^m + \text{higher}.$$
Next suppose $a$ is one of $\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$. Say $a = \tfrac12\omega$. Then we
show $a$ has even order. In fact,
$$f(\tfrac12\omega - z) = f(-\tfrac12\omega - z) = f(\tfrac12\omega + z)$$
shows $f$ is even about $a = \tfrac12\omega$. Hence the order is even.
A similar argument applies to poles. If $f$ has a pole at $a \in \Pi$ with
$a \notin \{0, \tfrac12\omega_1, \tfrac12\omega_2, \tfrac12(\omega_1 + \omega_2)\}$, then $f$ has a pole at $a^*$ of the same order.
Also the order of a pole at any of $\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$ is even.
Now we list "half" the zeros and poles of $f(z)$. Let $\{a_i\}$ be a list of
the zeros of $f(z)$ in $\Pi$ other than $0$, $\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$, each taken
with its multiplicity, but with only one taken from each pair $a$, $a^*$. For
any zero among $\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$, we list the point half as often as
its multiplicity. Similarly let $\{b_j\}$ be a list of "half" the poles in $\Pi$ other
than 0.
Since all the $a_i$ and $b_j$ are nonzero, $\wp(a_i)$ and $\wp(b_j)$ are finite for all $i$
and $j$. Thus it makes sense to define
$$g(z) = \frac{\prod_i(\wp(z) - \wp(a_i))}{\prod_j(\wp(z) - \wp(b_j))}.$$
We claim that $g$ has the same zeros and poles as $f$, counting
multiplicities. Since the only poles of the numerator and the denominator are
at $z = 0$, the only other zeros and poles of $g(z)$ come from zeros of the
numerator and the denominator.
Consider a zero $z_0$ of the numerator, a point where $\wp(z_0) = \wp(a_i)$. If
$a_i$ is any of $\tfrac12\omega_1$, $\tfrac12\omega_2$, or $\tfrac12(\omega_1 + \omega_2)$, then Lemma 6.11 says $\wp$ takes on
the value $\wp(a_i)$ twice at that point and nowhere else. So $z_0 = a_i$ and
$\wp(z) - \wp(a_i)$ has a zero of order two at $z_0$. Taking into account repeated
factors for this $a_i$, we see that $f$ and $g$ have a zero of the same order at
$z_0$.
Next suppose $a_i$ is not $\tfrac12\omega_1$, $\tfrac12\omega_2$, or $\tfrac12(\omega_1 + \omega_2)$. We have $\wp(a_i^*) =
\wp(a_i)$, so that $\wp(z) - \wp(a_i)$ has distinct zeros at $a_i$ and $a_i^*$. The lemma
says these zeros are simple. Taking into account repeated factors for
this $a_i$, we see that $f$ and $g$ have a zero of the same order at $z_0$.
Thus $f$ and $g$ have the same zeros (including their orders), apart from
$z = 0$. Similarly they have the same poles (including their orders), apart
from $z = 0$. By Corollary 6.5, they have the same order of zero or pole at
$z = 0$. Consequently $f/g$ is an elliptic function without poles and is therefore constant, by Proposition
6.1. This completes the proof.
Theorem 6.12. The function $w = \wp(z)$ satisfies the differential
equation
$$\left(\frac{dw}{dz}\right)^2 = 4w^3 - g_2w - g_3,$$
where $g_2$ and $g_3$ are constants depending on $\Lambda$ in the following way:
$$G_m = G_m(\Lambda) = \sum_{0 \ne \omega \in \Lambda}\frac{1}{\omega^m} \quad \text{for } m \ge 3 \ (= 0 \text{ for } m \text{ odd}) \tag{6.4}$$
$$g_2 = g_2(\Lambda) = 60G_4, \qquad g_3 = g_3(\Lambda) = 140G_6. \tag{6.5}$$
Remarks. The series for $G_m$ is absolutely convergent by Lemma 6.7.
The theorem will be proved after Lemma 6.13.
Lemma 6.13 (Weierstrass Double Series Theorem). Suppose $f_n(z) =
\sum_{k=0}^{\infty} a_k^{(n)}(z - z_0)^k$ is analytic for $|z - z_0| < r$ and $n \ge 0$, and suppose
that the series
$$F(z) = \sum_{n=0}^{\infty} f_n(z) = [a_0^{(0)} + a_1^{(0)}(z - z_0) + a_2^{(0)}(z - z_0)^2 + \cdots]$$
$$+\ [a_0^{(1)} + a_1^{(1)}(z - z_0) + a_2^{(1)}(z - z_0)^2 + \cdots]$$
$$+ \cdots$$
$$+\ [a_0^{(n)} + a_1^{(n)}(z - z_0) + a_2^{(n)}(z - z_0)^2 + \cdots]$$
$$+ \cdots$$
is uniformly convergent for $|z - z_0| \le \rho$ for each $\rho < r$. Then the
coefficients in any column form a convergent series. Moreover, if
$$A_k = \sum_{n=0}^{\infty} a_k^{(n)}$$
is the sum of the $k$th column coefficients, then $\sum_{k=0}^{\infty} A_k(z - z_0)^k$ is the
Taylor series for $F(z)$ about $z = z_0$, and it converges for $|z - z_0| < r$.
Proof. By the given convergence, $F(z)$ is analytic for $|z - z_0| < r$
and hence is given by a convergent Taylor series about $z_0$. Its $k$th Taylor
coefficient is
$$\frac{1}{k!}F^{(k)}(z_0) = \sum_{n=0}^{\infty}\frac{1}{k!}f_n^{(k)}(z_0) = \sum_{n=0}^{\infty} a_k^{(n)}$$
by the complex variables fact after (6.1), and the lemma follows.
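Proof of Theorem 6.12. We are going to apply Lemma 6.13 to
$$\wp(z) - \frac{1}{z^2} = \sum_{0 \ne \omega \in \Lambda}\left(\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2}\right)$$
in a small disc about $z = 0$. Here
$$\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} = \sum_{k=1}^{\infty}(k+1)\frac{z^k}{\omega^{k+2}},$$
so that
$$\wp(z) - \frac{1}{z^2} = \sum_{0 \ne \omega \in \Lambda}\ \sum_{k=1}^{\infty}(k+1)\frac{z^k}{\omega^{k+2}}.$$
Thus if we define $G_m$ by (6.4) for $m \ge 3$, the lemma says that the series
for $G_m$ is convergent (which we know already) and
$$\wp(z) - \frac{1}{z^2} = \sum_{k=1}^{\infty}(k+1)G_{k+2}z^k.$$
We see directly from the definition (6.4) that $G_m = 0$ for $m$ odd.
Therefore
$$\wp(z) = \frac{1}{z^2} + 3G_4z^2 + 5G_6z^4 + 7G_8z^6 + \cdots.$$
By direct calculation we find
$$\wp'(z) = -\frac{2}{z^3} + 6G_4z + 20G_6z^3 + 42G_8z^5 + O(z^7)$$
$$\wp'(z)^2 = \frac{4}{z^6} - \frac{24G_4}{z^2} - 80G_6 + (36G_4^2 - 168G_8)z^2 + O(z^4)$$
$$\wp(z)^2 = \frac{1}{z^4} + 6G_4 + 10G_6z^2 + O(z^4)$$
$$\wp(z)^3 = \frac{1}{z^6} + \frac{9G_4}{z^2} + 15G_6 + (21G_8 + 27G_4^2)z^2 + O(z^4)$$
$$\wp'(z)^2 - 4\wp(z)^3 + 60G_4\wp(z) + 140G_6 = O(z^2).$$
The left side of the last equation is therefore an elliptic function without poles or constant
term. By Proposition 6.1, it is 0. Defining $g_2$ and $g_3$ by (6.5), we obtain
the theorem.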
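The differential equation can be tested numerically by truncating the lattice sums. The sketch below is only an approximation under stated assumptions: it uses the square lattice with $\omega_1 = 1$, $\omega_2 = i$, keeps the periods $m\omega_1 + n\omega_2$ with $|m|, |n| \le 40$ (a symmetric truncation chosen for this illustration), and evaluates at the arbitrarily chosen point $z = 0.3 + 0.2i$. The function names are hypothetical. Because the sums are truncated, the two sides agree only approximately.

```python
def lattice(w1, w2, N):
    """Nonzero periods m*w1 + n*w2 with |m|, |n| <= N (a symmetric truncation)."""
    return [complex(m * w1 + n * w2)
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]


def wp(z, periods):
    """Truncated version of the series (6.1)."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in periods)


def wp_prime(z, periods):
    """Truncated termwise derivative of (6.1)."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in periods)


def eisenstein(m, periods):
    """Truncated sum G_m of (6.4)."""
    return sum(w**(-m) for w in periods)


periods = lattice(1, 1j, 40)
g2 = 60 * eisenstein(4, periods)
g3 = 140 * eisenstein(6, periods)

z = 0.3 + 0.2j
lhs = wp_prime(z, periods)**2
rhs = 4 * wp(z, periods)**3 - g2 * wp(z, periods) - g3
relative_gap = abs(lhs - rhs) / abs(lhs)   # small, but not zero, because of truncation
```

Enlarging the truncation parameter should shrink `relative_gap`; the symmetric truncation also makes the evenness $\wp(-z) = \wp(z)$ hold up to rounding error.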
Theorem 6.14. The map of $\mathbb{C}/\Lambda$ into $P_2(\mathbb{C})$ given by
$$z \to \begin{cases} (\wp(z), \wp'(z), 1) & \text{for } z \notin \Lambda \\ (0,1,0) & \text{for } z \in \Lambda \end{cases} \tag{6.6}$$
is holomorphic and carries $\mathbb{C}/\Lambda$ one-one onto the projective plane curve
$E(\mathbb{C})$, where $E$ is given in affine form by
$$y^2 = 4x^3 - g_2x - g_3.$$
Remark. Recall from §II.1 that our various systems of affine local
coordinates make $P_2(\mathbb{C})$ into a complex manifold.
PROOF. For $z \ne 0$, the image is $(\wp(z), \wp'(z), 1)$, which becomes
$(\wp(z), \wp'(z))$ in the affine local coordinate system given by $\Phi = I$.
Since each coordinate is analytic in $z$, our map (6.6) is holomorphic
for $z \ne 0$. Near $z = 0$, we use the affine local coordinate system given
by $\Phi = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$, and the map is given by
$$z \to \left(\frac{\wp(z)}{\wp'(z)},\ 1,\ \frac{1}{\wp'(z)}\right) \to \left(\frac{\wp(z)}{\wp'(z)},\ \frac{1}{\wp'(z)}\right) \quad (z \ne 0), \qquad 0 \to (0,1,0) \to (0,0).$$
Each coordinate is analytic in a punctured disc about 0 and is continuous
at 0, hence is analytic in a disc about 0. Thus (6.6) is holomorphic.
It is clear from Theorem 6.12 that the image is contained in $E(\mathbb{C})$.
Suppose $z_1$ and $z_2$ in $\Pi$ map to the same point. Then $\wp(z_1) = \wp(z_2)$ and
$\wp'(z_1) = \wp'(z_2)$ and $z_1 \ne 0$, $z_2 \ne 0$. Since $\wp$ has order 2, $\wp(z_1) = \wp(z_2)$
with $z_1 \ne z_2$ implies $z_2 = z_1^*$, in the notation of (6.3). But then
$$\wp'(z_2) = \wp'(z_1^*) = \wp'(\text{period} - z_1) = \wp'(-z_1) = -\wp'(z_1).$$
Since also $\wp'(z_2) = \wp'(z_1)$, we see that $\wp'(z_1) = \wp'(z_2) = 0$. By Lemma
6.11, $z_1$ and $z_2$ are members of $\{\tfrac12\omega_1, \tfrac12\omega_2, \tfrac12(\omega_1 + \omega_2)\}$. Consequently
$z_1^* = z_1$, $z_2^* = z_2$. Since $z_2 = z_1^*$, we obtain $z_1 = z_2$, contradiction.
We conclude that the map (6.6) is one-one.
Finally we show that the map (6.6) is onto. Let $(x,y,1) \in E(\mathbb{C})$ be
given. Since $\wp$ has order 2, we can find $z$ (clearly $\ne 0$) with $\wp(z) = x$.
Then also $\wp(z^*) = x$. The equation
$$y^2 = 4x^3 - g_2x - g_3 = 4\wp(z)^3 - g_2\wp(z) - g_3 = \wp'(z)^2$$
says either that $y = \wp'(z)$, in which case $z$ maps to our point $(x,y,1)$, or
else $y = -\wp'(z)$, in which case $z$ maps to $(x,-y,1)$. In the latter case,
$\wp'(z^*) = -\wp'(z)$ implies $z^*$ maps to $(x,y,1)$.
Theorem 6.15. The cubic polynomial $4w^3 - g_2 w - g_3$ factors as
\[ 4w^3 - g_2 w - g_3 = 4(w - e_1)(w - e_2)(w - e_3), \tag{6.7} \]
where $e_1 = \wp(\tfrac12\omega_1)$, $e_2 = \wp(\tfrac12\omega_2)$, $e_3 = \wp(\tfrac12(\omega_1 + \omega_2))$. Consequently
the plane curve
\[ E : y^2 = 4x^3 - g_2 x - g_3 \tag{6.8} \]
is nonsingular.
Proof. For the first statement we go over the proof of Theorem 6.10
for the even elliptic function $\wp'(z)^2$. Since the only pole in $\Pi$ is at $z = 0$
and since the only zeros of $\wp'(z)$ are simple zeros at the half-periods
$\tfrac12\omega_1$, $\tfrac12\omega_2$, $\tfrac12(\omega_1 + \omega_2)$, the proof shows that
\[ \wp'(z)^2 = C \prod_{i=1}^{3} (\wp(z) - e_i). \]
Subtracting the expression in Theorem 6.12, we find that every value
$\wp(z)$ of the $\wp$ function satisfies
\[ 4\wp^3 - g_2\wp - g_3 = C \prod_{i=1}^{3} (\wp - e_i). \]
This identity is impossible unless $C = 4$ (since nontrivial cubics have
only 3 roots). Thus the cubics match. This proves the first statement
in the theorem. The second statement follows from Lemma 6.11c and
Proposition 3.5.
The nonsingularity of $E$ in Theorem 6.15 allows us to define a complex
manifold structure on $E(\mathbb{C})$ in a natural way. About $(x,y,1)$ we define
\[ E(x,y) = y^2 - (4x^3 - g_2 x - g_3). \]
We have
\[ \frac{\partial E}{\partial x} = -(12x^2 - g_2), \qquad \frac{\partial E}{\partial y} = 2y. \]
When $y \neq 0$, the Implicit Function Theorem allows us to define $y = f(x)$
implicitly so that the curve is $(x, f(x), 1)$. A chart is given by
$(x, f(x), 1) \to x$.
When $\partial E/\partial x \neq 0$, the Implicit Function Theorem allows us to define
$x = g(y)$ implicitly so that the curve is $(g(y), y, 1)$. A chart is given by
$(g(y), y, 1) \to y$.
Finally we have to consider behavior about $(0,1,0)$. About $(x,1,w)$
we define
\[ e(x,w) = w - (4x^3 - g_2 w^2 x - g_3 w^3). \]
Then $\left.\dfrac{\partial e}{\partial w}\right|_{(0,1,0)} \neq 0$, so that $w = h(x)$ implicitly and the curve is
$(x, 1, h(x))$. A chart is given by $(x, 1, w) \to x$.
We readily check that these charts make $E(\mathbb{C})$ into a complex
manifold such that inclusion into $P^2(\mathbb{C})$ is holomorphic and such that any
holomorphic mapping into $P^2(\mathbb{C})$ with image in $E(\mathbb{C})$ is holomorphic
into $E(\mathbb{C})$. Relative to this complex structure on $E(\mathbb{C})$, we have the
following corollary.
Corollary 6.16. The inverse of the one-one map $\mathbb{C}/\Lambda \to E(\mathbb{C})$ given
in (6.6) is holomorphic.
Proof. About points $(x,y,1)$ where $y \neq 0$, the coordinate function is
$(x,y,1) \to x$. Thus the coordinate version of the map (6.6) is $z \to \wp(z)$
about points $z_0$ where $\wp'(z_0) \neq 0$. The invertibility condition for this
function (by the one-dimensional Inverse Function Theorem over $\mathbb{C}$) is
that $\wp'(z_0) \neq 0$. So we have invertibility.
About points $(x,y,1)$ where $y = 0$ (i.e., $\wp'(z_0) = 0$), we are considering
$z = \tfrac12\omega_1, \tfrac12\omega_2, \tfrac12(\omega_1 + \omega_2)$. The coordinate function is $(x,y,1) \to y$.
Thus the coordinate version of the map (6.6) is $z \to \wp'(z)$ about
$z = \tfrac12\omega_1, \tfrac12\omega_2, \tfrac12(\omega_1 + \omega_2)$. The invertibility condition is that $\wp''(z_0) \neq 0$.
But $\wp''(z_0) = 0$ would mean $\wp'$ has a double zero at $z_0$, in contradiction to
Lemma 6.11b.
About $(0,1,0)$, we are considering $z = 0$. The coordinate function
is $(x,1,w) \to x$. Thus the coordinate version of the map (6.6) is
$z \to \wp(z)/\wp'(z)$. This has a simple zero at $z = 0$, and thus its derivative is
nonzero at $z = 0$. This completes the proof.
4. Effect on Addition
We continue to assume that $\omega_1$ and $\omega_2$ in $\mathbb{C}$ are linearly independent
over $\mathbb{R}$, and we continue with the notation $\Lambda$, $\Pi$, and $E$ as in §3.
We write $\varphi$ for the biholomorphic map $\varphi : \mathbb{C}/\Lambda \to E(\mathbb{C})$ given by
(6.6).
Theorem 6.17. The biholomorphic map $\varphi : \mathbb{C}/\Lambda \to E(\mathbb{C})$ is a group
isomorphism.
REMARKS. The proof will be preceded by Lemma 6.18. Had we not
already proved that the operation $(P_1, P_2) \to P_1 + P_2$ in $E(\mathbb{C})$ is
associative, then Theorem 6.17 would give us a proof. This was Poincaré's
approach, and as a consequence he obtained associativity for the $k$
rational points of elliptic curves defined over $k$, provided the field $k$ is a
subfield of $\mathbb{C}$.
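Lemma 6.18. If $f : \mathbb{C}/\Lambda \times \mathbb{C}/\Lambda \to \mathbb{C}/\Lambda$ is analytic in each
variable and continuous jointly, then there exist $a$, $b$, and $c$ in $\mathbb{C}$ such that
$f(z_1, z_2) = az_1 + bz_2 + c \bmod \Lambda$ for all $z_1, z_2 \in \mathbb{C}$.
Proof. We lift $f$ to $F : \mathbb{C} \times \mathbb{C} \to \mathbb{C}/\Lambda$ with $F$ jointly continuous and
separately analytic. Then we lift $F$ to a function $\tilde F : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ that is
jointly continuous and separately analytic. The equality
\[ \tilde F(z_1 + m\omega_1 + n\omega_2, z_2) = \tilde F(z_1, z_2) \bmod \Lambda \]
says that
\[ \tilde F(z_1 + m\omega_1 + n\omega_2, z_2) = \tilde F(z_1, z_2) + m_1\omega_1 + n_1\omega_2 \]
with $m_1, n_1$ in $\mathbb{Z}$ depending on $z_1, z_2, m, n$. For fixed $m$ and $n$, $m_1$ and
$n_1$ are continuous in $(z_1, z_2)$, hence constant. Differentiation gives
\[
\begin{aligned}
\frac{\partial \tilde F}{\partial z_1}(z_1 + m\omega_1 + n\omega_2, z_2) &= \frac{\partial \tilde F}{\partial z_1}(z_1, z_2) \\
\frac{\partial \tilde F}{\partial z_2}(z_1 + m\omega_1 + n\omega_2, z_2) &= \frac{\partial \tilde F}{\partial z_2}(z_1, z_2).
\end{aligned}
\]
Hence $\partial \tilde F/\partial z_1$ and $\partial \tilde F/\partial z_2$ are elliptic in the first variable and thus constant
in $z_1$. Similarly they are constant in $z_2$ and thus globally constant. Say
$\partial \tilde F/\partial z_1 = a$ and $\partial \tilde F/\partial z_2 = b$. Then we have $\tilde F(z_1, z_2) = az_1 + bz_2 + c$ for a
suitable $c$, and the lemma follows.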
Proof of Theorem 6.17. First we note that $\varphi(0)$ is the identity
element of $E(\mathbb{C})$, translation is analytic on $E(\mathbb{C})$, and addition is
continuous on $E(\mathbb{C})$. (For the last of these, it is enough to prove continuity
at $(0,0)$, where it is clear.)
Form $f : \mathbb{C}/\Lambda \times \mathbb{C}/\Lambda \to \mathbb{C}/\Lambda$ given by
\[ f(z_1, z_2) = \varphi^{-1}(\varphi(z_1) + \varphi(z_2)) \bmod \Lambda. \]
This is analytic in each variable and is jointly continuous. By Lemma
6.18, $f(z_1, z_2) = az_1 + bz_2 + c \bmod \Lambda$. Since $f(0,0) = 0$, $c = 0$. Since
$f(z,0) = z$, $az = z \bmod \Lambda$ for all $z$ and thus $a = 1$. Similarly $b = 1$.
Thus $f(z_1, z_2) = z_1 + z_2$, i.e.,
\[ z_1 + z_2 = \varphi^{-1}(\varphi(z_1) + \varphi(z_2)) \bmod \Lambda. \]
Applying $\varphi$, we obtain
\[ \varphi(z_1 + z_2) = \varphi(z_1) + \varphi(z_2), \]
as required.
An analytic map $h : E(\mathbb{C}) \to E(\mathbb{C})$ fixing 0 is called an isogeny.
(More generally an analytic map $h : E_1(\mathbb{C}) \to E_2(\mathbb{C})$ carrying identity
to identity is an isogeny.) We can reinterpret an isogeny $h$ by means of
$\varphi : \mathbb{C}/\Lambda \to E(\mathbb{C})$, i.e., we can study $\varphi^{-1} \circ h \circ \varphi$. In Lemma 6.18, we put
$f(z_1, z_2) = \varphi^{-1} \circ h \circ \varphi(z_1)$. The lemma gives $\varphi^{-1} \circ h \circ \varphi(z) = az + c$.
Since $h(0) = 0$, $c = 0$. Thus our reinterpretation is
\[ h(\varphi(z)) = \varphi(az). \]
The trivial maps of this sort have $a \in \mathbb{Z}$. If $a$ is in $\mathbb{R}$ but not $\mathbb{Z}$, take
$z = \omega_1$ to see that the map is not well defined. Thus the only nontrivial
isogenies from $E$ to itself have $a \in \mathbb{C}$ with $a$ not real. In this case we
say that $E$ has complex multiplication.
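Proposition 6.19. $E$ over $\mathbb{C}$ as above has complex multiplication if
and only if $\omega_2/\omega_1$ lies in a quadratic extension of $\mathbb{Q}$.
Proof. With $a$ as above, $h$ cannot be well defined unless $a\Lambda \subseteq \Lambda$.
Conversely if $a\Lambda \subseteq \Lambda$, let $z_1 \equiv z_2 \bmod \Lambda$, say $z_1 = z_2 + \omega$. Then
$az_1 = az_2 + a\omega \in az_2 + \Lambda$ shows that $z \to az$ is well defined on $\mathbb{C}/\Lambda$,
i.e., we can define an isogeny $h$.
Now suppose $a\Lambda \subseteq \Lambda$. Then $a\omega_1 = m\omega_1 + n\omega_2$ with $m, n \in \mathbb{Z}$. If
$\tau = \omega_2/\omega_1$, this says $a = m + n\tau$. Now
\[ a\omega_2 = (m + n\tau)\omega_2 \quad \text{equals} \quad m'\omega_1 + n'\omega_2. \]
Dividing by $\omega_1$ gives
\[ (m + n\tau)\tau = m' + n'\tau. \]
Thus $\tau$ satisfies a quadratic equation with integer coefficients.
Conversely if $r\tau^2 + s\tau + t = 0$, then $(r\tau)\tau = -t - s\tau$. Define $a = r\tau \notin \mathbb{R}$.
Then $a\omega_1 = r\omega_2$ and $a\omega_2 = -t\omega_1 - s\omega_2$, so that $a\Lambda \subseteq \Lambda$.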
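The converse direction of Proposition 6.19 can be illustrated numerically. The sketch below makes two assumptions that are not in the text: the lattice is normalized so that $\omega_1 = 1$ and $\omega_2 = \tau$, and the integers $r, s, t$ are chosen with $s^2 - 4rt < 0$ so that $\tau$ is not real. It computes $a = r\tau$ and expresses $a\omega_1$ and $a\omega_2$ in the basis $(\omega_1, \omega_2)$; the function names are hypothetical.

```python
import cmath


def cm_multiplier(r, s, t):
    """Return (tau, a) with r*tau^2 + s*tau + t = 0, Im(tau) > 0, a = r*tau."""
    disc = s * s - 4 * r * t
    if r == 0 or disc >= 0:
        raise ValueError("need r != 0 and s^2 - 4rt < 0")
    tau = (-s + cmath.sqrt(disc)) / (2 * r)
    if tau.imag < 0:
        tau = tau.conjugate()  # the conjugate is the other root
    return tau, r * tau


def lattice_coords(z, tau):
    """Real numbers (m, n) with z = m*1 + n*tau, for the basis (1, tau)."""
    n = z.imag / tau.imag
    m = z.real - n * tau.real
    return m, n
```

With $(r, s, t) = (1, 1, 1)$, the coordinates of $a\omega_1$ come out as $(0, r)$ and those of $a\omega_2$ as $(-t, -s)$ up to rounding, matching the two relations $a\omega_1 = r\omega_2$ and $a\omega_2 = -t\omega_1 - s\omega_2$ in the proof.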
5. Overview of Inversion Problem
In §§1-4 we have seen how to associate to a lattice $\Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$ in
$\mathbb{C}$ a nonsingular curve
\[ E : y^2 = 4x^3 - g_2 x - g_3 \]
with $g_2$ and $g_3$ in $\mathbb{C}$. This association is implemented by the biholomorphic
mapping $\varphi : \mathbb{C}/\Lambda \to E(\mathbb{C})$ given by
\[
\varphi(z) = \begin{cases} (\wp(z), \wp'(z), 1) & \text{for } z \notin \Lambda \\ (0,1,0) & \text{for } z \in \Lambda. \end{cases} \tag{6.9}
\]
In this section we address a converse problem. We know that every
elliptic curve over $\mathbb{C}$, after the linear change of variables that changes
(3.23) into (3.26), is of the form
\[ y^2 = 4(x - a)(x - b)(x - c) \tag{6.10} \]
with $a, b, c$ distinct in $\mathbb{C}$. With another linear change of variables, we
can even make $a + b + c = 0$, and in this case the $x^2$ term drops out from
the right side of (6.10). Our question is: For such a curve $E$, does there
exist a lattice $\Lambda$ such that the above association carries $\Lambda$ to $E$?
The answer is "yes." The positive answer to this inversion problem is
a fairly easy consequence of the Uniformization Theorem in the theory
of one complex variable. However, we prefer to take a more pedestrian
approach, since it gives more information. Essentially we shall produce
concrete formulas for the inverse of $\varphi$ in (6.9).
In terms of parameters, we need to use $E$ to construct a pair of
generating periods $\omega_1$ and $\omega_2$. Then we can form the corresponding $\wp$ function,
and we shall recover $a, b, c$ as the parameters $e_1, e_2, e_3$ attached to this
$\wp$ function.
The construction of the parameters will be a byproduct of the
construction of $\varphi^{-1}$. To see the nature of $\varphi^{-1}$, let us assume that $E$ does
indeed arise from $w = \wp(z)$. Then the results of §3 give
\[
\begin{aligned}
\left(\frac{dw}{dz}\right)^2 &= 4(w - a)(w - b)(w - c) \\
\frac{dw}{2\sqrt{(w - a)(w - b)(w - c)}} &= dz \\
z(w) &= \int^{w} \frac{dw}{2\sqrt{(w - a)(w - b)(w - c)}}.
\end{aligned} \tag{6.11}
\]
This integral is called an elliptic integral of the first kind.
In order to assert that w = p(z) is genuinely the inverse function of
an elliptic integral of the first kind, we need to make sense out of (6.11).
The first step is to make sense out of the double-valued integrand; we
carry out this step in §7 after an essay on analytic continuation in §6.
Then in §8 we clarify what path of integration to use in order to make
sense of (6.11). Different choices of path will lead to different values
of the integral, but the different values will all lie in a translate of a
lattice, namely the lattice we take as $\Lambda$.
Once we have made sense of (6.11), we can show it has a locally defined
inverse, and we can show that the locally defined inverse extends to be
globally meromorphic. The rest will be easy.
In §9 we shall illustrate how to make rapidly convergent numerical
calculations for passing back and forth between $\Lambda$ and $(g_2, g_3)$, particularly
when $g_2$ and $g_3$ are in $\mathbb{R}$.
6. Analytic Continuation
A function element is an analytic function on a disc. If the disc
is centered at z = a, we speak of a function element at a. We shall
define a notion of "direct analytic continuation" of a function element;
this will be another function element of a certain kind. Iteration of this
notion will lead to the definition of "analytic continuation" of a function
element along a path.
To define direct analytic continuation, let the first function
element be
\[ f(z) = a_0 + a_1(z - a) + a_2(z - a)^2 + \cdots \quad \text{for } |z - a| < r(a). \tag{6.12} \]
Suppose $s$ satisfies $|s - a| < r(a)$, so that the disc
\[ |z - s| < r(a) - |s - a| \tag{6.13} \]
lies completely within the disc $|z - a| < r(a)$. Then we can compute the
derivatives of $f(z)$ at $s$ to obtain
\[ f(z) = b_0(s) + b_1(s)(z - s) + b_2(s)(z - s)^2 + \cdots. \tag{6.14} \]
For fixed $s$ satisfying $|s - a| < r(a)$, Taylor's Theorem shows that (6.14)
is convergent for $z$ as in (6.13). But more is true. If we write
\[ (z - a)^n = [(z - s) + (s - a)]^n, \tag{6.15} \]
then we can compute the coefficients $b_0(s), b_1(s), \ldots$ from (6.15). The
result is that $b_0(s), b_1(s), \ldots$ are convergent power series in $s - a$.
Effectively the $b_j(s)$ are obtained by expanding (6.15), substituting into
(6.12), and rearranging. We say that (6.14) arises from (6.12) as a
result of rearranging the series at $s$. A second function element is
a direct analytic continuation of the given function element if its
center $s$ satisfies $|s - a| < r(a)$ and if the series for the second function
element arises from the given function element as a result of rearranging
the series at $s$. It may be that $r(s)$ for the second function element is in
fact larger than the obvious possibility $r(a) - |s - a|$ given in (6.13); in
fact, this will be the case of interest for us.
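Expanding (6.15) by the binomial theorem and collecting powers of $z - s$ gives $b_j(s) = \sum_{n \ge j} \binom{n}{j} a_n (s - a)^{n-j}$. The following sketch applies this formula to a truncated coefficient list; the truncation, the function name, and the test series are illustrative assumptions, not part of the text.

```python
from math import comb


def rearrange(coeffs, a, s, num_out):
    """Rearrange the series sum a_n (z - a)^n at the new center s.

    coeffs is a finite list [a_0, a_1, ...] standing in for the full series;
    the result approximates b_0(s), ..., b_{num_out-1}(s), where
    b_j(s) = sum_{n >= j} C(n, j) * a_n * (s - a)^(n - j).
    """
    h = s - a
    return [
        sum(c * comb(n, j) * h ** (n - j) for n, c in enumerate(coeffs) if n >= j)
        for j in range(num_out)
    ]
```

For the geometric series $1/(1 - z) = \sum z^n$ at $a = 0$ (so $r(a) = 1$), rearranging at $s = -\tfrac12$ gives $b_j = (2/3)^{j+1}$. The new series converges for $|z + \tfrac12| < \tfrac32$, which is larger than the disc of radius $r(a) - |s - a| = \tfrac12$ guaranteed by (6.13).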
Fix $a$ and $b$, and let $C$ be a path from $a$ to $b$ given by
\[ \gamma(t), \quad \alpha \le t \le \beta, \quad \text{with } \gamma(\alpha) = a, \ \gamma(\beta) = b. \]
Let $f$ be a function element at $a$. A function element $g$ at $b$ is said to
be an analytic continuation of $f$ along $C$ if there exist
(1) a partition
\[ \alpha = t_0 < t_1 < \cdots < t_n = \beta, \tag{6.16} \]
(2) discs $D_j$ centered at $\gamma(t_j)$, $0 \le j \le n$, and
(3) function elements $f_j$ at $\gamma(t_j)$, $0 \le j \le n$, defined on $D_j$
such that $f_0 = f$, $f_n = g$, $\gamma[t_{j-1}, t_j] \subseteq D_{j-1}$ for $1 \le j \le n$, and $f_j$ is a
direct analytic continuation of $f_{j-1}$ for $1 \le j \le n$.
Example. Let $C$ be the unit circle $\gamma = e^{it}$, $0 \le t \le 2\pi$. Let
$f(z) = \sqrt{z}$ in $|z - 1| < 1$, with the square root taken to be positive on
the positive reals, and let $g(z) = \sqrt{z}$ in $|z - 1| < 1$, with the square root
taken to be negative on the positive reals. It is easy to check that $g$ is
an analytic continuation of $f$ along $C$. The discs $D_j$ can be taken to be
centered at the points $e^{i\pi j/4}$, $0 \le j \le 8$.
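A numerical stand-in for this example follows. It does not manipulate power series; instead it follows the value of the square root along the circle in small steps, choosing at each step whichever of the two square roots is closer to the previous value. This is a sketch under the assumption that the steps are fine enough for the nearer root to be the continued one; the function name and step count are hypothetical.

```python
import cmath
import math


def continue_sqrt_along_circle(steps=64):
    """Follow sqrt(z) once counterclockwise around |z| = 1, starting from +1."""
    value = 1.0 + 0.0j  # the branch that is positive on the positive reals
    for k in range(1, steps + 1):
        z = cmath.exp(2j * math.pi * k / steps)
        root = cmath.sqrt(z)  # principal root; the other candidate is -root
        # keep whichever candidate is nearer to the value at the previous step
        value = root if abs(root - value) <= abs(root + value) else -root
    return value
```

After one full circuit the returned value is close to $-1$ rather than $+1$, in line with $g$ being the branch that is negative on the positive reals.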
Proposition 6.20. Let $C$ be a path from $a$ to $b$, and let $f$ be a
function element at $a$. If $g_1$ and $g_2$ are function elements at $b$ that are
analytic continuations of $f$ along $C$, then $g_1$ and $g_2$ have the same Taylor
series expansions about $z = b$. (Only the radii of their discs of definition
can be different.)
Proof if $D_0 \supseteq \gamma[\alpha, \beta]$. In terms of the partition (6.16) used to
determine one of $g_1$ and $g_2$, we claim that the Taylor series for $f_n$ is
the result of rearranging the series for $f_0$ at $\gamma(t_n)$. If so, then certainly
$g_1$ and $g_2$ will have the same Taylor series expansions. We proceed
by induction on $n$, the case $n = 1$ being true by definition of direct
analytic continuation. Assuming the result for $n - 1$, we then know that
$f_{n-1}$ is the result of rearranging the series for $f_0$ at $\gamma(t_{n-1})$, so that
$f_0(z) = f_{n-1}(z)$ for $z \in D_0 \cap D_{n-1}$. By definition of direct analytic
continuation, $f_{n-1}(z) = f_n(z)$ for $z \in D_{n-1} \cap D_n$. Thus $f_0(z) = f_n(z)$
for $z \in D_0 \cap D_{n-1} \cap D_n$. This intersection is nonempty since $\gamma(t_n)$ is in
it. Thus $f_0(z) = f_n(z)$ on $D_0 \cap D_n$, and the assertion follows.
Proof in general case. The main step is to see that the partition
(6.16) can be suitably refined without adjusting the series of the initial
or final function element. Recall that $\gamma[t_{j-1}, t_j] \subseteq D_{j-1}$. Then
\[ \delta = \min_{1 \le j \le n} \operatorname{distance}(\gamma[t_{j-1}, t_j], D_{j-1}^c) \]
is $> 0$. Fix any positive number $\rho$ with $\rho < \delta$. By uniform continuity of
$\gamma$ (with $\epsilon = \rho$), we can refine (6.16) to a partition
\[ \alpha = s_0 < \cdots < s_m = \beta \tag{6.17} \]
so that
\[ \gamma[s_{i-1}, s_i] \subseteq \{|z - \gamma(s_{i-1})| < \rho\} \cap \{|z - \gamma(s_i)| < \rho\} \quad \text{for } 1 \le i \le m. \]
Define
\[ \Delta_i = \{|z - \gamma(s_i)| < \rho\} \quad \text{for } 0 \le i \le m. \]
Let
\[ t_{j-1}, t_j \tag{6.18} \]
be two consecutive points of (6.16) and let
\[ s_{k-1} = t_{j-1}, \ s_k, \ \ldots, \ s_l = t_j \tag{6.19} \]
be the corresponding points of (6.17). Now the series for $f_{j-1}$ converges
in $D_{j-1}$, and all the discs $\Delta_{k-1}, \Delta_k, \ldots, \Delta_l$ lie in $D_{j-1}$, by definition
of $\delta$. Hence the special case shows that $f_j$ results whether we use the
partition (6.18) and discs $D_{j-1}, D_j$ or the partition (6.19) and discs
$\Delta_{k-1}, \Delta_k, \ldots, \Delta_l$. Putting these results together for $1 \le j \le n$, we see
that $f_n$ is the result of analytic continuation from $f_0$ using the partition
(6.17) and the discs $\Delta_0, \ldots, \Delta_m$.
Now we can compare $g_1$ and $g_2$. Let us say that the above argument
would work with $\delta = \delta_1$ for $g_1$ and with $\delta = \delta_2$ for $g_2$. By choosing
$\rho < \min\{\delta_1, \delta_2\}$, we can choose a common refinement of both partitions
and see that $g_1$ and $g_2$ are obtained from a common sequence of points
and discs.
Proposition 6.21. Suppose $C$ is a path from $a$ to $b$ given by $\gamma(t)$,
$\alpha \le t \le \beta$. Suppose $f$ is a function element at $a$ and $g$ is a function
element at $b$ obtained by analytic continuation of $f$ along $C$. Then there
exists $\epsilon > 0$ with the following property: If $C'$ is any path from $a$ to $b$
given by $\psi(t)$, $\alpha \le t \le \beta$, that is uniformly within $\epsilon$ of $\gamma(t)$, then $g$ is an
analytic continuation of $f$ along $C'$.
This proposition has the following use: We can always approximate
C uniformly by polygonal paths, and thus polygonal paths give us the
most general analytic continuation. We omit the proof of Proposition
6.21, as it is in much the same spirit as the previous proof.
Proposition 6.22. If $g$ is an analytic continuation of $f$ along a path
$C$, then $f$ is an analytic continuation of $g$ along the reverse path $C^{-1}$.
PROOF. Without loss of generality, let $g$ be a direct analytic
continuation of $f$, so that the given partition is $\alpha = t_0 < t_1 = \beta$. We
apply the construction in the second half of Proposition 6.20, obtaining
a refinement $\alpha = s_0 < \cdots < s_m = \beta$, discs $\Delta_i$ for $0 \le i \le m$, and
function elements $(f_i, \Delta_i)$ such that $\gamma[s_{i-1}, s_i] \subseteq \Delta_{i-1} \cap \Delta_i$, $f_0$ has the
same series as $f$, $f_m$ has the same series as $g$, and $f_i$ is a direct analytic
continuation of $f_{i-1}$. For the path $C^{-1}$, we use the partition
\[ -\beta = r_0 < \cdots < r_m = -\alpha \]
with $r_i = -s_{m-i}$, the discs $D_i = \Delta_{m-i}$, and the function elements
$g_i = f_{m-i}$. Then
\[
\begin{aligned}
\gamma^{-1}[r_{i-1}, r_i] = \gamma^{-1}[-s_{m-i+1}, -s_{m-i}] &= \gamma[s_{m-i}, s_{m-i+1}] \\
&\subseteq \Delta_{m-i} \cap \Delta_{m-i+1} = D_i \cap D_{i-1} \subseteq D_{i-1}.
\end{aligned}
\]
Also the fact that $\Delta_{m-i}$ and $\Delta_{m-i+1}$ have the same radius shows that
$f_{m-i}$ and $f_{m-i+1}$ are direct analytic continuations of each other, hence
that $g_i$ is a direct analytic continuation of $g_{i-1}$.
7. Riemann Surface of the Integrand
Let $a, b, c$ be distinct points in $\mathbb{C}$. In this section we treat the problem
of defining $\sqrt{(z - a)(z - b)(z - c)}$. Fix $z_0 \in \mathbb{C} - \{a, b, c\}$, and fix $q_0$
as one of the two possible values of $\sqrt{(z_0 - a)(z_0 - b)(z_0 - c)}$. Then
$\sqrt{(z - a)(z - b)(z - c)}$ at least extends to be analytic in any disc about
$z_0$ within $\mathbb{C} - \{a, b, c\}$.
We shall define a double covering $\mathcal{R}$ of $\mathbb{C} - \{a, b, c\}$ so that
$\sqrt{(z - a)(z - b)(z - c)}$ extends to be well defined on all of $\mathcal{R}$. Then we
shall complete $\mathcal{R}$ to a space $\mathcal{R}^*$ over $\mathbb{C} \cup \{\infty\}$ that fills in one preimage
above each of $a, b, c, \infty$. The space $\mathcal{R}^*$ will be a one-dimensional complex
manifold (Riemann surface), and it will be compact. (Actually it will
be a torus.)
Consider loops in $\mathbb{C} - \{a, b, c\}$ based at $z_0$. If such a loop $C$ is piecewise
smooth, we define its total winding number about $a$, $b$, and $c$ to be
\[
\begin{aligned}
n(C) &= n(C, a) + n(C, b) + n(C, c) \\
&= \frac{1}{2\pi i}\left( \int_C \frac{dz}{z - a} + \int_C \frac{dz}{z - b} + \int_C \frac{dz}{z - c} \right).
\end{aligned}
\]
If the loop $C$ is not piecewise smooth, all piecewise smooth loops that
are uniformly sufficiently close to $C$ can be seen to have the same total
winding number, and we take this to be $n(C)$.
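For a closed polygonal loop the winding number can be computed without numerical quadrature: on each straight edge from $z_k$ to $z_{k+1}$ that avoids the point $p$, the integral of $dz/(z - p)$ equals the principal logarithm of $(z_{k+1} - p)/(z_k - p)$, and the imaginary parts add up to $2\pi\, n(C, p)$. The sketch below uses this; the helper names and the sample points are illustrative assumptions.

```python
import cmath
import math


def winding_number(points, p):
    """Winding number about p of the closed polygon through points, in order."""
    total = 0.0
    count = len(points)
    for k in range(count):
        z0, z1 = points[k], points[(k + 1) % count]
        # angle swept out, as seen from p, along the edge z0 -> z1
        total += cmath.phase((z1 - p) / (z0 - p))
    return round(total / (2 * math.pi))


def total_winding_number(points, a, b, c):
    """n(C) = n(C, a) + n(C, b) + n(C, c) for a polygonal loop C."""
    return sum(winding_number(points, p) for p in (a, b, c))


def circle(center, radius, count=200):
    """Vertices of a counterclockwise polygon approximating a circle."""
    return [center + radius * cmath.exp(2j * math.pi * k / count) for k in range(count)]
```

With $a = 0$, $b = 3$, $c = 3i$, the unit circle about 0 has total winding number 1, while the circle of radius 2 about $1.5$ encloses $a$ and $b$ and has total winding number 2, so it represents an element of the even subgroup.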
Homotopic loops based at $z_0$ have the same $n(C)$. Then $n(C)$ gives a
group homomorphism from the fundamental group $\pi_1(\mathbb{C} - \{a, b, c\}, z_0)$
into $\mathbb{Z}$. This homomorphism is onto $\mathbb{Z}$ since there exists a loop that
winds once around $a$ and 0 times around $b$ and around $c$. Therefore the
subgroup of even total winding number
\[ H = \{ C \in \pi_1(\mathbb{C} - \{a, b, c\}, z_0) \mid n(C) \in 2\mathbb{Z} \} \]
has index 2. Let $\mathcal{R}$ be the covering space of $\mathbb{C} - \{a, b, c\}$ corresponding
to $H$, let $e : \mathcal{R} \to \mathbb{C} - \{a, b, c\}$ be the covering map, and fix $\zeta_0$ as a base
point in $\mathcal{R}$ with $e(\zeta_0) = z_0$.
For use in §8, we record how $H$ is generated in terms of
$\pi_1(\mathbb{C} - \{a, b, c\}, z_0)$. Let $\Gamma_a, \Gamma_b, \Gamma_c$ be simple piecewise smooth loops
in $\mathbb{C} - \{a, b, c\}$ based at $z_0$ that intersect one another only at $z_0$ and that
have winding numbers
\[
\begin{aligned}
n(\Gamma_a, a) &= 1, & n(\Gamma_a, b) &= n(\Gamma_a, c) = 0 \\
n(\Gamma_b, b) &= 1, & n(\Gamma_b, a) &= n(\Gamma_b, c) = 0 \\
n(\Gamma_c, c) &= 1, & n(\Gamma_c, a) &= n(\Gamma_c, b) = 0.
\end{aligned} \tag{6.20}
\]
Define
\[ \Gamma_1 = \Gamma_a\Gamma_b \quad \text{and} \quad \Gamma_2 = \Gamma_b\Gamma_c. \tag{6.21} \]
Proposition 6.23. The subgroup of $\pi_1(\mathbb{C} - \{a, b, c\}, z_0)$ of even total
winding number is generated by the classes of $\Gamma_1, \Gamma_2, \Gamma_a^2, \Gamma_b^2, \Gamma_c^2$.
PROOF. The loops $\Gamma_a, \Gamma_b, \Gamma_c$ generate $\pi_1(\mathbb{C} - \{a, b, c\}, z_0)$, and each
of the purported generators is in the subgroup of even winding number.
Thus let a word in $\Gamma_a, \Gamma_b, \Gamma_c$ and their inverses be given, and suppose
it has even total winding number. Since $n(\cdot)$ gives $+1$ or $-1$ for each
factor, there must be an even number of factors. Grouping them in
pairs, let us show that each pair is generated in the required fashion. If
the left hand member of a pair is $\Gamma^{-1}$, we replace it by $(\Gamma^2)^{-1}\Gamma$. If the
right hand member of a pair is $\Gamma^{-1}$, we replace it by $\Gamma(\Gamma^2)^{-1}$. In this
way we may assume that the factors of the pair come from $\Gamma_a, \Gamma_b, \Gamma_c$,
not their inverses. We may assume the members of the pair are not the
same, since the squares are in our list of generators. For the remaining
six pairs, $\Gamma_a\Gamma_b$ and $\Gamma_b\Gamma_c$ are generators, and the other pairs are given
by
\[
\begin{aligned}
\Gamma_b\Gamma_a &= \Gamma_b^2 (\Gamma_a\Gamma_b)^{-1} \Gamma_a^2 \\
\Gamma_c\Gamma_b &= \Gamma_c^2 (\Gamma_b\Gamma_c)^{-1} \Gamma_b^2 \\
\Gamma_a\Gamma_c &= (\Gamma_a\Gamma_b)(\Gamma_b^2)^{-1}(\Gamma_b\Gamma_c) \\
\Gamma_c\Gamma_a &= \Gamma_c^2 (\Gamma_b\Gamma_c)^{-1} \Gamma_b^2 (\Gamma_a\Gamma_b)^{-1} \Gamma_a^2.
\end{aligned}
\]
This proves the proposition.
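The four pair identities in this proof are identities in the free group on three generators, so they can be verified by free reduction. The following sketch encodes a word as a list of (generator, exponent) pairs with exponent $\pm 1$; the encoding and names are illustrative, with "a", "b", "c" standing for the three loops about $a$, $b$, $c$.

```python
def reduce_word(word):
    """Freely reduce a word given as a list of (generator, +1 or -1) pairs."""
    stack = []
    for g, e in word:
        if stack and stack[-1] == (g, -e):
            stack.pop()  # cancel an adjacent pair x x^{-1} or x^{-1} x
        else:
            stack.append((g, e))
    return stack


def inv(word):
    """Inverse word: reverse the order and negate every exponent."""
    return [(g, -e) for g, e in reversed(word)]


a, b, c = [("a", 1)], [("b", 1)], [("c", 1)]
G1, G2 = a + b, b + c  # the generators (loop a)(loop b) and (loop b)(loop c)

# each entry: (the pair, its claimed expression in the five generators)
identities = {
    "ba": (b + a, b + b + inv(G1) + a + a),
    "cb": (c + b, c + c + inv(G2) + b + b),
    "ac": (a + c, G1 + inv(b + b) + G2),
    "ca": (c + a, c + c + inv(G2) + b + b + inv(G1) + a + a),
}
```

Reducing both sides of each entry gives the same word, which confirms that the remaining four pairs lie in the subgroup generated by the two products and the three squares.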
The covering space becomes a Riemann surface in a natural way.
Namely let $D$ be a disc about $p$ in $\mathbb{C} - \{a, b, c\}$. Since $D$ is simply
connected, it is evenly covered. Thus $e^{-1}(D) = \Delta_1 \cup \Delta_2$, a disjoint
union of open sets $\Delta_1$ and $\Delta_2$ each homeomorphic with $D$ via $e$. We
can take $e : \zeta \to z \in \mathbb{C}$ as a local coordinate in $\Delta_i$, and then $(e, \Delta_i)$
becomes a chart.
We shall use the convention begun above that Greek letters refer to
points and sets in $\mathcal{R}$ and the corresponding Latin letters refer to their
counterparts in $\mathbb{C} - \{a, b, c\}$. We have already fixed $\zeta_0$ with $e(\zeta_0) = z_0$.
Let $D_0$ be a disc in $\mathbb{C} - \{a, b, c\}$ about $z_0$, and let $\Delta_0$ be the component of
$e^{-1}(D_0)$ to which $\zeta_0$ belongs. Recall that $\sqrt{(z - a)(z - b)(z - c)}$ makes
sense on $D_0$ as an analytic function with $\sqrt{(z_0 - a)(z_0 - b)(z_0 - c)} = q_0$.
Proposition 6.24. There exists a unique analytic function $F : \mathcal{R} \to \mathbb{C}$
satisfying $F(\zeta_0) = q_0$ and having the following property: If $C$ is any
path from $\zeta_0$ to $\zeta_1$ in $\mathcal{R}$, if $D$ is a disc about $z_1 = e(\zeta_1)$ in $\mathbb{C} - \{a, b, c\}$,
and if $e^{-1} : D \to \Delta$ is regarded as a map from $D$ to the component $\Delta$ of
$e^{-1}(D)$ to which $\zeta_1$ belongs, then $(F \circ e^{-1}, D)$ is an analytic continuation
of
\[ \left(\sqrt{(z - a)(z - b)(z - c)}, \ D_0\right) \tag{6.22} \]
along $e(C)$.
Remark. Temporarily we shall write
\[ F(\zeta) = \sqrt{(\zeta - a)(\zeta - b)(\zeta - c)} \]
for the function produced by the proposition. We shall introduce better
notation in (6.23).
Proof. Uniqueness is an immediate consequence of Proposition 6.20
and the path-connectedness of $\mathcal{R}$. For existence, let $f_0(z)$ be the function
element (6.22). It is clear that this $f_0$ continues analytically along any
curve $C$ in $\mathbb{C} - \{a, b, c\}$ and that the result is one of the two values of
the square root, because $f_0(z)^2$ extends to a global function and analytic
continuation respects squaring.
Given $\zeta_1$ in $\mathcal{R}$, let $C$ be a path from $\zeta_0$ to $\zeta_1$ and let $f_1$ be a function
element at $z_1$, with domain $D_1$, that is an analytic continuation of $f_0$
along $e(C)$. Let $\Delta_1$ be the component of $e^{-1}(D_1)$ to which $\zeta_1$ belongs,
and define $e^{-1}$ as a function from $D_1$ to $\Delta_1$. Then we can define an
analytic function $F$ on $\Delta_1$ by $F = f_1 \circ e^{-1}$.
For $F$ to give a globally defined analytic function on $\mathcal{R}$, we need only
see that $F$ is well defined. Thus let $C^{\#}$ be another path from $\zeta_0$ to
$\zeta_1$, and let $f_1^{\#}$ be the corresponding function element at $z_1$ obtained by
continuation along $e(C^{\#})$. We are to show that $f_1$ and $f_1^{\#}$ have the
same Taylor series about $z = z_1$.
Analytic continuation of $f_0$ around the loop $\Gamma_a$ leads to $-f_0$, and
similarly for $\Gamma_b$ and $\Gamma_c$. Thus analytic continuation of $f_0$ around a loop
$\gamma$ in $\mathbb{C} - \{a, b, c\}$ based at $z_0$ leads to $(-1)^{n(\gamma)} f_0$. Since $C C^{\#-1}$ is a loop
in $\mathcal{R}$ based at $\zeta_0$, $e(C C^{\#-1})$ leads to $f_0$.
Since analytic continuation of $f_0$ along $e(C)$ leads to $f_1$, analytic
continuation of $f_1$ along $e(C^{\#-1}) = e(C^{\#})^{-1}$ leads to $f_0$. By Proposition
6.22, analytic continuation of $f_0$ along $e(C^{\#})$ leads to $f_1$. By
Proposition 6.20, $f_1$ and $f_1^{\#}$ have the same Taylor series expansion. Thus $F$ is
well defined, and the proposition follows.
Now we shall complete $\mathcal{R}$ to a space $\mathcal{R}^*$. The set $\mathcal{R}^*$ is to consist of
the points of $\mathcal{R}$ together with four more points $\alpha, \beta, \gamma, \infty$. We define
$e : \mathcal{R}^* \to \mathbb{C} \cup \{\infty\} = P^1(\mathbb{C})$ by $e(\alpha) = a$, $e(\beta) = b$, $e(\gamma) = c$, $e(\infty) = \infty$.
To topologize $\mathcal{R}^*$, we declare that $\mathcal{R}$ is to be open in $\mathcal{R}^*$ and that
basic open neighborhoods of $\alpha, \beta, \gamma, \infty$ are to be defined as follows: A
basic open neighborhood of $\alpha$ is given by $\{\alpha\} \cup e^{-1}(D^\times)$, where $D^\times$ is
any punctured disc centered at $a$ and lying in $\mathbb{C} - \{a, b, c\}$, and basic
open neighborhoods of $\beta, \gamma, \infty$ are defined similarly. The resulting $\mathcal{R}^*$
is easily seen to be a Hausdorff regular space with a countable base,
therefore a separable metric space. It is sequentially compact, therefore
compact. With this notation the function $F(\zeta)$ of Proposition 6.24 is
better written as
\[ F(\zeta) = \sqrt{(\zeta - \alpha)(\zeta - \beta)(\zeta - \gamma)}. \tag{6.23} \]
This is the notation we shall normally use for $F$.
Let us make $\mathcal{R}^*$ into a Riemann surface. We need to define charts
about $\alpha, \beta, \gamma, \infty$. In the case of $\alpha$, let $D^\times$ be a punctured disc about
$a$ lying in $\mathbb{C} - \{a, b, c\}$. Then $e^{-1}(D^\times)$ has two preimages above each
point of $D^\times$. The lift of a loop in $D^\times$ with winding number 1 about
$a$ is not a loop (since $\sqrt{z - a}$ is not well defined on $D^\times$) and therefore
$e^{-1}(D^\times)$ is connected. It follows that $e^{-1}(D^\times)$ is a covering space of
$D^\times$. Arguing as in Proposition 6.24, we can define $\sqrt{\zeta - \alpha}$ uniquely on
$e^{-1}(D^\times)$ in such a way that $\sqrt{\zeta - \alpha}$ takes a prescribed value at some
base point. We put $\sqrt{\zeta - \alpha} = 0$ if $\zeta = \alpha$.
Then $(\sqrt{\zeta - \alpha}, \ e^{-1}(D^\times) \cup \{\alpha\})$ is a continuous map into $\mathbb{C}$. Since it
maps small discs to open sets, it is open. If $\sqrt{\zeta_1 - \alpha} = \sqrt{\zeta_2 - \alpha}$, then
$z_1 - a = z_2 - a$ and $z_1 = z_2$. Hence $\zeta_1$ and $\zeta_2$ lie over the same point
$z$. If $z = a$ then $\zeta_1 = \zeta_2 = \alpha$. Otherwise the square root changes
by a factor of $-1$ between the two points of the same fiber. Thus
$\sqrt{\zeta_1 - \alpha} = \sqrt{\zeta_2 - \alpha}$ implies $\zeta_1 = \zeta_2$, and the map is one-one.
Consequently $(\sqrt{\zeta - \alpha}, \ e^{-1}(D^\times) \cup \{\alpha\})$ is a chart. It is clear that this chart
is compatible with the charts on $\mathcal{R}$.
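We treat $\beta$ and $\gamma$ in the same way, and we are left with handling
the point $\infty$ of $\mathcal{R}^*$. For this we let $D^\times = \{z \mid |z| > R\}$ for
$R > \max\{|a|, |b|, |c|\}$, and we check that $e^{-1}(D^\times)$ is connected. We define
$1/\sqrt{\zeta}$ on $e^{-1}(D^\times)$, fixing the value at some base point and defining the function
to be 0 at $e^{-1}(\infty)$. Then $(1/\sqrt{\zeta}, \ e^{-1}(D^\times \cup \{\infty\}))$ provides a compatible
chart about $\infty$.
The function
\[ F(\zeta)^{-1} = \frac{1}{\sqrt{(\zeta - \alpha)(\zeta - \beta)(\zeta - \gamma)}} \quad \text{with } F(\zeta_0)^{-1} = 1/q_0 \]
is analytic on $\mathcal{R}$ and is continuous from $\mathcal{R}^*$ into $\mathbb{C} \cup \{\infty\}$. By Riemann's
removable singularity theorem, it is a meromorphic function on $\mathcal{R}^*$. Let
us see its behavior near $\alpha, \beta, \gamma, \infty$.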
Near $\alpha$, the local coordinate is $s = \sqrt{\zeta - \alpha}$. To calculate with this
relation, we cannot use the manifold variable $\zeta$ but have to use the
complex variable $z$. Thus we have $s^2 = z - a$, $z = s^2 + a$, and
\[ (z - b)(z - c) = (s^2 + a - b)(s^2 + a - c). \]
Hence
\[ F(\zeta)^{-1} = \frac{1}{s} \cdot \frac{1}{\sqrt{(s^2 + a - b)(s^2 + a - c)}} \tag{6.24} \]
for one of the two determinations of the square root. Since the square
root is analytic and nonvanishing near $s = 0$, $F(\zeta)^{-1}$ has a simple pole
at $\zeta = \alpha$.
Similarly $F(\zeta)^{-1}$ has a simple pole at $\zeta = \beta$ and at $\zeta = \gamma$. Near $\infty$
the local coordinate is $s = 1/\sqrt{\zeta}$. Then $z = s^{-2}$ and
\[
\begin{aligned}
(z - a)(z - b)(z - c) &= z^3\left(1 - \frac{a}{z}\right)\left(1 - \frac{b}{z}\right)\left(1 - \frac{c}{z}\right) \\
&= z^3(1 - as^2)(1 - bs^2)(1 - cs^2).
\end{aligned}
\]
Hence
\[ F(\zeta)^{-1} = s^3 \cdot \frac{1}{\sqrt{(1 - as^2)(1 - bs^2)(1 - cs^2)}} \tag{6.25} \]
for one of the two determinations of the square root. Since the square
root is analytic and nonvanishing near $s = 0$, $F(\zeta)^{-1}$ has a triple zero at
$\zeta = \infty$.
8. An Elliptic Integral
We continue with the notation of §7. Thus $F : \mathcal{R}^* \to \mathbb{C} \cup \{\infty\}$ is as
in (6.23) and later, and $\zeta_0$ is the base point of $\mathcal{R}^*$ with $e(\zeta_0) = z_0$. Our
first objective is to make sense of the expression
\[ w(C) = \int_C \tfrac12 F(\zeta)^{-1}\, d\zeta \tag{6.26} \]
where $C$ is any piecewise smooth path in $\mathcal{R}^*$.
Roughly the idea is that (6.26) is to be $\int_{e(C)} \tfrac12 F(z)^{-1}\, dz$, but the
difficulty is that $F(z)^{-1}$ is not necessarily single-valued on $e(C)$. The
point of the reference to $\mathcal{R}^*$ is to allow us to make a choice between
$F(z)^{-1}$ or $-F(z)^{-1}$ for each point of the curve. To be precise, let $C$ be
given by $C(t)$ for $t \in I$, so that $e(C)$ is the path in $\mathbb{C} \cup \{\infty\}$ given by
$e(C(t))$. Then (6.26) is to be made precise by the definition
\[ w(C) = \int_I \tfrac12 F(C(t))^{-1}\, e(C)'(t)\, dt. \tag{6.27} \]
We need to check that the integral (6.27) is convergent. The only
difficulty is in neighborhoods of $\alpha, \beta, \gamma, \infty$. Thus suppose $C$ is confined
to a basic neighborhood. Near $\alpha$, the local parameter is $s = \sqrt{\zeta - \alpha}$,
and the path $C(t)$ determines a path $s(t)$ in the coordinate patch by
$s(t) = \sqrt{C(t) - \alpha}$. Then $s(t)^2 = e(C(t)) - a$, $e(C(t)) = s(t)^2 + a$, and
$e(C)'(t)\,dt = 2s(t)s'(t)\,dt$. Using (6.24), we substitute into (6.27) and
obtain
\[ w(C) = \int_{t \in J} \frac{s'(t)\, dt}{\sqrt{(s(t)^2 + a - b)(s(t)^2 + a - c)}} \tag{6.28} \]
for a suitably small interval $J$. The denominator is bounded away from
0 for $t \in J$, and the integral is convergent. Similar remarks apply to $\beta$
and $\gamma$.
Near $\infty$, the local parameter is $s = 1/\sqrt{\zeta}$, and we have $s(t) = 1/\sqrt{C(t)}$.
Then
\[
\begin{aligned}
s(t)^2 = 1/e(C(t)), \qquad e(C(t)) &= s(t)^{-2}, \\
e(C)'(t)\, dt &= -2s(t)^{-3} s'(t)\, dt.
\end{aligned}
\]
Using (6.25), we substitute into (6.27) and obtain
\[ w(C) = \int_{t \in J} \frac{-s'(t)\, dt}{\sqrt{(1 - as(t)^2)(1 - bs(t)^2)(1 - cs(t)^2)}} \tag{6.29} \]
for a suitably small interval $J$. Again the denominator is bounded away
from 0 for $t \in J$, and the integral is convergent. Thus (6.27) is always
convergent, and we can use it as a definition of (6.26).
Lemma 6.25. If $C$ is a piecewise smooth path in $\mathcal{R}^*$ lying
completely in a basic disc with local parameter $s$ and extending between
local parameter values $s_1$ and $s_2$, then
\[ w(C) = \int_{s_1}^{s_2} \tfrac12 F(\zeta(s))^{-1} \frac{d\zeta}{ds}\, ds. \tag{6.30} \]
In particular, the integral depends only on the endpoints, not on the
path itself.
PROOF. For a disc in $\mathcal{R}$ with local parameter $z$, the lemma follows by
applying the Cauchy Integral Theorem to a branch of
$1/\sqrt{(z - a)(z - b)(z - c)}$. For a disc centered at $\alpha$, the integral is given
by (6.28), which is the value of the complex line integral
\[ \int \frac{ds}{\sqrt{(s^2 + a - b)(s^2 + a - c)}} \tag{6.31} \]
over the local version $s(t)$ of the path $C(t)$. Since the integrand of
(6.31) is analytic in a disc containing the image of the path $s(t)$, the
line integral depends only on the endpoints and is given as in (6.30), by
the Cauchy Integral Theorem. Similar remarks apply to $\beta$ and $\gamma$. Near
$\infty$, the integral is given by (6.29), which is the value of the complex line
integral
\[ \int \frac{-ds}{\sqrt{(1 - as^2)(1 - bs^2)(1 - cs^2)}}, \tag{6.32} \]
and the same considerations apply.
Lemma 6.26. If $C$ and $C'$ are piecewise smooth paths from $\zeta_1$ to $\zeta_2$
in $\mathcal{R}^*$ that are homotopic in $\mathcal{R}^*$, then $w(C) = w(C')$.
Proof. Let the homotopy be $C_t$, $0 \le t \le 1$, where each $C_t$ has
domain interval $I$ and each $C_t$ extends from $\zeta_1$ to $\zeta_2$. Without loss of
generality we may assume that each $C_t$ is piecewise smooth. It is enough
to prove, for each $t_0$, that $w(C_t)$ is constant for $t$ in a neighborhood of $t_0$.
Changing notation, we see that it is enough to prove that any piecewise
smooth path $C_2$ that is sufficiently close to $C_1$ uniformly for $t \in I$ has
$w(C_2) = w(C_1)$.
About each point of $C_1$, choose a basic disc $\Delta$ and a proper subdisc
$\Delta''$ centered at that point. The discs $\Delta''$ cover the image of $C_1$, and
we extract a finite subcover $\Delta_1'', \ldots, \Delta_n''$. Form the corresponding larger
discs $\Delta_1, \ldots, \Delta_n$ and intermediate discs $\Delta_1', \ldots, \Delta_n'$ with
\[ \Delta_i'' \subseteq \overline{\Delta_i''} \subseteq \Delta_i' \subseteq \overline{\Delta_i'} \subseteq \Delta_i \quad \text{for } 1 \le i \le n. \]
By uniform continuity of $C_1$, there is some $\delta > 0$ so that $C_1(t) \in \Delta_i''$
implies $C_1(t') \in \Delta_i'$ for $|t' - t| < \delta$. Let $t_0 < \cdots < t_m$ be a partition of the
domain interval $I$ of $C_1$ of mesh $< \delta$. For each $j$ with $0 \le j \le m$, we can
choose $i = i(j)$ so that $C_1(t_j)$ is in $\Delta_{i(j)}''$. Then $C_1([t_{j-1}, t_j]) \subseteq \Delta_{i(j-1)}'$
for $1 \le j \le m$.
Choose $\epsilon > 0$ so that $\operatorname{dist}(\Delta_i', \Delta_i^c) > \epsilon$ for $1 \le i \le n$. Suppose $C_2$ is
any path from $\zeta_1$ to $\zeta_2$ that is uniformly within $\epsilon$ of $C_1$. Then we have
$C_2([t_{j-1}, t_j]) \subseteq \Delta_{i(j-1)}$ for $1 \le j \le m$. In particular, $C_1(t_j)$ and $C_2(t_j)$
are in $\Delta_{i(j-1)} \cap \Delta_{i(j)}$, which is connected. Let $S_j$ be a piecewise smooth
path from $C_1(t_j)$ to $C_2(t_j)$ in $\Delta_{i(j-1)} \cap \Delta_{i(j)}$, $1 \le j \le m - 1$. Let $S_0$ be
the constant path at $C_1(t_0)$ and let $S_m$ be the constant path at $C_1(t_m)$.
By considering cancelling integrals, we have
\[ w(C_2) = \sum_{j=1}^{m} w\left(S_{j-1}\, C_2|_{[t_{j-1}, t_j]}\, S_j^{-1}\right). \tag{6.33} \]
But $S_{j-1}\, C_2|_{[t_{j-1}, t_j]}\, S_j^{-1}$ is a path in $\Delta_{i(j-1)}$ with the same endpoints as
$C_1|_{[t_{j-1}, t_j]}$. Thus Lemma 6.25 says that (6.33) is
\[ = \sum_{j=1}^{m} w\left(C_1|_{[t_{j-1}, t_j]}\right) = w(C_1). \]
This proves the lemma.
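Recall from Proposition 6.23 that the classes of the loops $\Gamma_1, \Gamma_2, \Gamma_a^2,
\Gamma_b^2, \Gamma_c^2$ generate the subgroup of $\pi_1(\mathbb{C} - \{a, b, c\}, z_0)$ of even total winding
number. Lift these loops to loops $\tilde\Gamma_1, \tilde\Gamma_2, \Gamma_\alpha, \Gamma_\beta, \Gamma_\gamma$ in $\mathcal{R}$ based at $\zeta_0$.
Lemma 6.27. $\int_{\Gamma_\alpha} \tfrac12 F(\zeta)^{-1}\, d\zeta = 0$, and similarly for $\beta$ and $\gamma$.
Proof. In view of Lemma 6.26 applied just within $\mathcal{R}$, the
computation reduces to an integral over a loop in a basic disc of $\mathcal{R}^*$ about $\alpha$.
Then Lemma 6.25 applies and shows we get 0.
Define
\[ \omega_1 = \int_{\tilde\Gamma_1} \tfrac12 F(\zeta)^{-1}\, d\zeta \quad \text{and} \quad \omega_2 = \int_{\tilde\Gamma_2} \tfrac12 F(\zeta)^{-1}\, d\zeta, \tag{6.34} \]
and let $\Lambda$ be the subset of $\mathbb{C}$ given by
\[ \Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2. \tag{6.35} \]
We give $\mathbb{C}/\Lambda$ the quotient topology, which is not yet known to be
Hausdorff.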
Proposition 6.28. There is a well defined map $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ with
the following property. Whenever $\zeta$ is in $\mathcal{R}^*$ and $C$ is a piecewise smooth
path from $\zeta_0$ to $\zeta$, then
\[ w(\zeta) = w(C) \bmod \Lambda. \]
The map $w$ is continuous and open.
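Proof. Fix $\zeta$ in $\mathcal{R}^*$, let $C$ be a piecewise smooth path from $\zeta_0$ to $\zeta$,
and define
\[ w(\zeta) = w(C) + \Lambda \quad\text{as a member of } \mathbb{C}/\Lambda. \]
To see that $w(\zeta)$ is well defined, let $C'$ be another such path. We
are to show that $w(C') - w(C)$ is in $\Lambda$. Without loss of generality we
may assume the following: If $\zeta \notin \{\alpha,\beta,\gamma,\infty\}$, neither $C$ nor $C'$ meets
$\{\alpha,\beta,\gamma,\infty\}$; if $\zeta \in \{\alpha,\beta,\gamma,\infty\}$, $C$ and $C'$ meet $\{\alpha,\beta,\gamma,\infty\}$ only at
their final points. We have
\[ w(C') - w(C) = w(C'C^{-1}), \]
and $C'C^{-1}$ is a loop based at $\zeta_0$. By Lemma 6.26, we can perturb
$C'C^{-1}$ near $\zeta$, if necessary, so that $C'C^{-1}$ does not meet $\{\alpha,\beta,\gamma,\infty\}$.
Afterward, $w(C'C^{-1})$ depends only on the homotopy class of $C'C^{-1}$
in $\mathcal{R}$, according to Lemma 6.26. Since Proposition 6.23 identifies the
homotopy classes in $\mathcal{R}$, we see that $w(C'C^{-1})$ is a $\mathbb{Z}$ combination of
$w(\tilde\Gamma_1), w(\tilde\Gamma_2), w(\tilde\Gamma_\alpha), w(\tilde\Gamma_\beta), w(\tilde\Gamma_\gamma)$. But $w(\tilde\Gamma_\alpha) = w(\tilde\Gamma_\beta) = w(\tilde\Gamma_\gamma) = 0$
by Lemma 6.27. Thus $w(C') - w(C)$ is a $\mathbb{Z}$ combination of $\omega_1 = w(\tilde\Gamma_1)$
and $\omega_2 = w(\tilde\Gamma_2)$, hence is in $\Lambda$.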
Thus $w$ is well defined. If we fix $\zeta_1$ and a path $C_1$ leading to it, we can
look at how $w$ behaves in a basic disc about $\zeta_1$ by using paths that begin
with $C_1$ and continue completely within the basic disc. Let $\zeta$ correspond
to $C_1C$ in this way, and let $\zeta_1 \leftrightarrow s_1$ and $\zeta \leftrightarrow s$ in the local coordinate.
Then Lemma 6.25 gives
\[ w(\zeta) = w(C_1) + w(C) = w(\zeta_1) + \int_{s_1}^{s} \tfrac12 F(\zeta(s))^{-1}\,\frac{d\zeta}{ds}\,ds. \tag{6.36} \]
This formula shows that $w$ is continuous. Since the integral on the right
is nonconstant analytic in $s$, $w$ is open.
Corollary 6.29. The complex numbers $\omega_1$ and $\omega_2$ in (6.34) are
linearly independent over $\mathbb{R}$. Therefore $\Lambda$ in (6.35) is a lattice, and $\mathbb{C}/\Lambda$
is a Riemann surface.
Proof. If $\omega_1$ and $\omega_2$ are dependent, then $\Lambda \subseteq \mathbb{R}\omega$ for a suitable
complex number $\omega$. The natural map $\mu : \mathbb{C}/\Lambda \to \mathbb{C}/(\mathbb{R}\omega)$ is continuous
and open, with a Hausdorff image. Hence the composition $\mu \circ w : \mathcal{R}^* \to
\mathbb{C}/(\mathbb{R}\omega)$, with $w$ as in Proposition 6.28, is continuous and open. Since
$\mathcal{R}^*$ is compact and $\mathbb{C}/(\mathbb{R}\omega)$ is Hausdorff, the image of $\mu \circ w$ is open,
compact, and closed. Since $\mathbb{C}/(\mathbb{R}\omega)$ is connected, $\mu \circ w$ is onto. But
$\mathbb{C}/(\mathbb{R}\omega)$ is noncompact, and we have a contradiction.
Theorem 6.30. The map $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ of Proposition 6.28 is
one-one onto and is biholomorphic.
Remarks. The map is well defined by Proposition 6.28, and $\mathbb{C}/\Lambda$
has a complex manifold structure as a result of Corollary 6.29. Thus it
is meaningful to ask about holomorphicity of $w$ (and also $w^{-1}$, once it
exists).
Proof. The map $w$ is holomorphic as a result of (6.36). In local
coordinates, (6.36) shows that $w$ has derivative
\[ \tfrac12 F(\zeta(s))^{-1}\,\frac{d\zeta}{ds}. \tag{6.37} \]
At a point $\zeta$ of $\mathcal{R}$, where $s = z$ is a local parameter, (6.37) is just
\[ \frac{1}{2\sqrt{(s-a)(s-b)(s-c)}}, \]
which is nonvanishing. At $\zeta = \alpha$, (6.37) is the integrand of (6.31) and
again is nonzero. Similar remarks apply to $\zeta = \beta$ and $\zeta = \gamma$. Finally
at $\zeta = \infty$, (6.37) is the integrand of (6.32) and again is nonzero. Since
the local derivative is nowhere 0, $w$ is an immersion. An immersion
between compact connected smooth manifolds of the same dimension
is necessarily a covering map, and thus $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ is a covering
map. If we can show $w$ is one-one, then $w^{-1}$ will exist, and $w^{-1}$ will be
holomorphic as a consequence of the one-dimensional complex Inverse
Function Theorem.
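Thus we are to show that $w$ is one-one. Suppose $w(\zeta_1) = w(\zeta_2)$ in
$\mathbb{C}/\Lambda$. Let $C_1$ and $C_2$ be piecewise smooth paths from $\zeta_0$ to $\zeta_1$ and $\zeta_2$,
respectively. By definition of $w(\zeta)$,
\[ w(C_1) = w(C_2) + \omega, \]
for some $\omega = m\omega_1 + n\omega_2$ in $\Lambda$. Since $\tilde\Gamma = \tilde\Gamma_1^m \tilde\Gamma_2^n$ has $w(\tilde\Gamma) = \omega$, the path
$C_3 = \tilde\Gamma C_2$ goes from $\zeta_0$ to $\zeta_2$ and has $w(C_3) = w(C_2) + \omega$. Therefore
\[ w(C_1) = w(C_3). \tag{6.38} \]
Let us say that $C_1$ and $C_3$ have domain $[0,1]$. For $0 \le t \le 1$, we define
\[ C_1^{\#}(t) = w(C_1|_{[0,t]}) \quad\text{and}\quad C_3^{\#}(t) = w(C_3|_{[0,t]}) \]
as piecewise smooth paths in $\mathbb{C}$. Define
\[ C_1^{\#\#} = w \circ C_1 \quad\text{and}\quad C_3^{\#\#} = w \circ C_3 \tag{6.39} \]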
as piecewise smooth paths in $\mathbb{C}/\Lambda$. Then
\[ C_1^{\#}(t) \bmod \Lambda = w(C_1|_{[0,t]}) \bmod \Lambda = w(C_1(t)) \bmod \Lambda \]
and a similar computation for $C_3$ show that $C_1^{\#}$ and $C_3^{\#}$ are the unique
lifts based at 0 of $C_1^{\#\#}$ and $C_3^{\#\#}$ from $\mathbb{C}/\Lambda$ to $\mathbb{C}$. By (6.38), $C_1^{\#}(1) =
C_3^{\#}(1)$. Therefore $C_1^{\#}(C_3^{\#})^{-1}$ is a loop in $\mathbb{C}$, necessarily contractible.
Hence $C_1^{\#\#}(C_3^{\#\#})^{-1}$ is a contractible loop in $\mathbb{C}/\Lambda$. Since $w$ has been
proved to be a covering map, (6.39) shows that $C_1C_3^{-1}$ is a contractible
loop in $\mathcal{R}^*$. Therefore $C_1$ and $C_3$ have the same endpoint in $\mathcal{R}^*$, and
$\zeta_1 = \zeta_2$. We conclude that $w$ is one-one, as required.
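Theorem 6.30 completes the construction of the lattice $\Lambda$ and a
function $w(\zeta)$ that should have something to do with inverting a Weierstrass
$\wp$ function. To make the correspondence complete, we modify $w(\zeta)$ by
translation in $\mathbb{C}/\Lambda$ by the element $\int_{\infty}^{\zeta_0} \tfrac12 F(\zeta)^{-1}\,d\zeta$ of $\mathbb{C}/\Lambda$. (Recall from
(6.27) and Proposition 6.28 that this element is well defined.) Thus our
new definition of $w(\zeta)$ will be
\[ w(\zeta) = \int_{\infty}^{\zeta} \tfrac12 F(\zeta)^{-1}\,d\zeta + \Lambda \quad\text{as a member of } \mathbb{C}/\Lambda. \tag{6.40} \]
The new $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ is still biholomorphic, and we let
\[ w^{-1} : \mathbb{C}/\Lambda \to \mathcal{R}^* \]
be its inverse function. Let
\[ \mu : \mathbb{C} \to \mathbb{C}/\Lambda \]
be the (holomorphic) quotient mapping, and recall that the extended
version of $e$, no longer a covering map, is a function
\[ e : \mathcal{R}^* \to \mathbb{C} \cup \{\infty\} \]
and is meromorphic. Define $P : \mathbb{C} \to \mathbb{C} \cup \{\infty\}$ as the meromorphic
function
\[ P(z) = e \circ w^{-1} \circ \mu(z). \tag{6.41} \]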
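Theorem 6.31. The function $P(z)$ in (6.41) is given by
\[ P(z) = \wp(z) + \tfrac13(a + b + c), \tag{6.42} \]
where $\wp$ is the Weierstrass $\wp$ function for the lattice $\Lambda$ in (6.35).
Proof. The function $P$ is meromorphic, and it is doubly periodic
since it factors through $\mathbb{C}/\Lambda$. Hence it is elliptic. Its only poles are in $\Lambda$
because
\[ P^{-1}(\{\infty\}) = \mu^{-1}\bigl(w(e^{-1}(\{\infty\}))\bigr) = \mu^{-1}\bigl(w(\{\infty\})\bigr) = \mu^{-1}(0 + \Lambda) \ \text{(by (6.40))} = \Lambda. \]
Now we examine the order of the pole of $P(z)$ at $z = 0$. The maps $\mu$
and $w^{-1}$ are locally invertible. Thus the order of the pole of $P$ is the
same as the order of the pole of $e$ at $\infty$ in $\mathcal{R}^*$. The local parameter is
$s = 1/\sqrt{z}$ near $\infty$ in $\mathcal{R}^*$. So $e(\zeta) = z = s^{-2}$. Thus the pole has order 2.
Next let us see that $P$ is an even function. For $z$ in $\mathbb{C}$, put $\zeta =
w^{-1}\mu(z)$. Let $C$ be a piecewise smooth path from $\infty$ to $\zeta$ in $\mathcal{R}^*$. The
Riemann surface $\mathcal{R}^*$ has a holomorphic involution (namely interchange
of the two sheets), and we let $C'$ be the image curve extending from $\infty$
to a point $\zeta'$. Then
\[ w(\zeta') = \int_{\infty\ (\text{along } C')}^{\zeta'} \tfrac12 F(\zeta)^{-1}\,d\zeta \bmod \Lambda \]
\[ \qquad = -\int_{\infty\ (\text{along } C)}^{\zeta} \tfrac12 F(\zeta)^{-1}\,d\zeta \bmod \Lambda \quad\text{since the integrands in (6.27) match} \]
\[ \qquad = -w(\zeta) \bmod \Lambda. \]
Hence
\[ \zeta' = w^{-1}(-w(\zeta) + \Lambda) \]
and
\[ P(z) = e\,w^{-1}\mu(z) = e(\zeta) = e(\zeta') = e\bigl(w^{-1}(-w(\zeta) + \Lambda)\bigr) = e\bigl(w^{-1}(-w(w^{-1}\mu(z)) + \Lambda)\bigr) \]
\[ \qquad = e\bigl(w^{-1}(-\mu(z) + \Lambda)\bigr) = e\bigl(w^{-1}(\mu(-z) + \Lambda)\bigr) = P(-z), \]
as asserted.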
Our classification of poles of $P$ shows that $P$ has order 2. Thus it has
only two zeros, counting multiplicities. The proof of Theorem 6.10 gives
the structure of even elliptic functions and shows that
\[ P(z) = k_1(\wp(z) - k_2). \tag{6.43} \]
Let us see that $P$ satisfies the expected differential equation. Put
$\psi = w^{-1}\mu(z) \in \mathcal{R}^*$ and restrict attention to points $\psi \in \mathcal{R}$. Then
$p = e(\psi)$ is a local parameter in $\mathbb{C}$, and we have
\[ P(z) = e\,w^{-1}\mu(z) = e(\psi) = p. \]
Since $w(\psi) = \mu(z)$, (6.40) gives
\[ z \in \int_{\infty}^{\psi} \tfrac12 F(\zeta)^{-1}\,d\zeta + \Lambda. \]
By (6.30), we have as local expression for a derivative
\[ \frac{dz}{dp} = \frac{1}{\sqrt{4(p-a)(p-b)(p-c)}} \]
for a certain branch of the square root. Consequently
\[ \frac{dp}{dz} = \sqrt{4(p-a)(p-b)(p-c)}. \]
Replacing $p = P(z)$, we see that $P$ satisfies
\[ P'^2 = 4(P - a)(P - b)(P - c). \tag{6.44} \]
Now we evaluate $k_1$ and $k_2$ in (6.43). Since $P$ is even and has a double
pole at $z = 0$, we can write
\[ P(z) = \frac{c_{-2}}{z^2} + c_0 + O(z^2) \quad (\text{as } z \to 0) \]
\[ P'(z)^2 = \frac{4c_{-2}^2}{z^6} + O(z^{-2}) \]
\[ P(z) - a = \frac{c_{-2}}{z^2} + O(1) \]
\[ (P(z) - a)(P(z) - b)(P(z) - c) = \frac{c_{-2}^3}{z^6} + O(z^{-4}). \]
Thus (6.44) implies $\frac{4c_{-2}^2}{z^6} = \frac{4c_{-2}^3}{z^6}$, i.e., $c_{-2} = 1$. Thus $k_1 = 1$ in (6.43),
and $P(z) = \wp(z) + k$. Substituting this formula into (6.44) gives
\[ \wp'^2 = 4(\wp + k - a)(\wp + k - b)(\wp + k - c) = 4\wp^3 + 4(3k - a - b - c)\wp^2 + \text{lower order}. \]
Since also
\[ \wp'^2 = 4\wp^3 - g_2\wp - g_3, \]
we can subtract and conclude from the fact that $\wp(z)$ assumes more than
two values that
\[ 3k - a - b - c = 0. \]
This proves the theorem.
Corollary 6.32. Every nonsingular curve
\[ y^2 = 4x^3 - g_2x - g_3 \]
over $\mathbb{C}$ is of the form $\mathbb{C}/\Lambda$ for some lattice $\Lambda$ and is parametrized
biholomorphically by $z \to (\wp(z), \wp'(z))$, where $\wp(z)$ is the Weierstrass $\wp$
function relative to $\Lambda$.
9. Computability of the Correspondence
The theory of §§1-8 produced a correspondence $\Lambda \leftrightarrow (g_2, g_3)$ of lattices
$\Lambda$ in $\mathbb{C}$ with pairs $(g_2, g_3)$ of complex numbers such that the right side
of
\[ E : y^2 = 4x^3 - g_2x - g_3 \]
has nonzero discriminant. The correspondence was implemented for
each $\Lambda$ by a biholomorphic mapping of $\mathbb{C}/\Lambda$ onto $E(\mathbb{C})$, given in terms
of the Weierstrass $\wp$ function for $\Lambda$, with inverse given in terms of elliptic
integrals. In this section we study the correspondence $\Lambda \leftrightarrow (g_2, g_3)$ a
bit further. In particular, we shall see (at least in cases of interest) that
there are rapidly convergent expressions for the correspondence that are
suitable for (approximate) numerical computations.
Let us recall the exact correspondence of parameters. If $\Lambda$ is given,
then we obtain $E$ by using
\[ g_2 = 60 \sum_{\omega \in \Lambda,\ \omega \ne 0} \frac{1}{\omega^4} \quad\text{and}\quad g_3 = 140 \sum_{\omega \in \Lambda,\ \omega \ne 0} \frac{1}{\omega^6}. \tag{6.45} \]
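If $g_2$ and $g_3$ are given, then we let $a, b, c$ be the roots of the right side of
$E$, and we obtain $\Lambda$ as the lattice generated by
\[ \omega_1 = \int_{\tilde\Gamma_1} \tfrac12 F(\zeta)^{-1}\,d\zeta \quad\text{and}\quad \omega_2 = \int_{\tilde\Gamma_2} \tfrac12 F(\zeta)^{-1}\,d\zeta, \tag{6.46} \]
where
$\alpha, \beta, \gamma$ are points in $\mathcal{R}^*$ covering $a, b, c$,
$F(\zeta) = \sqrt{(\zeta-\alpha)(\zeta-\beta)(\zeta-\gamma)}$,
$\tilde\Gamma_1$ = lift to $\mathcal{R}^*$ of loop $\Gamma_1$ around $a$ and $b$,
$\tilde\Gamma_2$ = lift to $\mathcal{R}^*$ of loop $\Gamma_2$ around $b$ and $c$.
To understand $\Lambda$, we are not concerned about minus signs on $\omega_1$ and
$\omega_2$, and we can thus replace (6.46) by integrals in $\mathbb{C}$:
\[ \omega_1 = \int_{\Gamma_1} \frac{dz}{2\sqrt{(z-a)(z-b)(z-c)}}, \qquad \omega_2 = \int_{\Gamma_2} \frac{dz}{2\sqrt{(z-a)(z-b)(z-c)}}, \tag{6.47} \]
where the square root in each case is a single-valued branch in the $z$
plane slit between the two points in question and slit also between the
third point and $\infty$. From (6.45) and (6.47), we can read off one feature
of the correspondence.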
Proposition 6.33. Under the correspondence $\Lambda \leftrightarrow (g_2, g_3)$, $\Lambda$ is
closed under complex conjugation if and only if $g_2$ and $g_3$ are both in $\mathbb{R}$.
Proof. If $\Lambda$ is closed under conjugation, then (6.45) shows
immediately that $g_2$ and $g_3$ are real. Conversely let $g_2$ and $g_3$ be real. Then
either $a, b, c$ are real, or one of them ($b$, say) is real and the other two
are complex conjugates of each other.
Suppose $a, b, c$ are real. Let us say that $a < b < c$. For $\Gamma_1$ we can use
a curve that extends on the real axis on one side of a cut from $a$ to $b$
and comes back to $a$ on the real axis on the other side of the cut with
the square root having changed by a minus sign. Then
\[ \omega_1 = \int_a^b \frac{dx}{\sqrt{(x-a)(x-b)(x-c)}} \tag{6.48} \]
for one of the two determinations of the square root. The expression
under the square root is positive for $a < x < b$, and therefore $\omega_1$ is real.
Similarly we can take
\[ \omega_2 = \int_b^c \frac{dx}{\sqrt{(x-a)(x-b)(x-c)}} \tag{6.49} \]
for one of the two determinations of the square root. The expression
under the square root is negative for $b < x < c$, and therefore $\omega_2$ is
imaginary. Hence $\bar\omega_1 = \omega_1$ and $\bar\omega_2 = -\omega_2$. Consequently $\Lambda$ is closed
under conjugation.
Now suppose $b$ is real and $\bar a = c$. Whatever path we use for $\Gamma_1$, we can
use the complex conjugate path for $\Gamma_2$, apart from orientation. With
suitable choices of the branch of the square root in each case, we can
arrange that the integrands in (6.47) are pointwise complex conjugates
of each other. Then it follows that $\omega_1$ and $\omega_2$ are complex conjugates
of each other, except possibly for a sign. If indeed $\bar\omega_1 = \omega_2$, then $\Lambda$ is
closed under conjugation. If $\bar\omega_1 = -\omega_2$, then $\bar\omega_1 = -\omega_2$ and $\bar\omega_2 = -\omega_1$
show that $\Lambda$ is closed under conjugation.
Our chief interest is in elliptic curves defined over Q. Thus it is not
unreasonable for us to restrict attention now to lattices and curves meeting
the conditions of Proposition 6.33. There are rapid methods for
calculating the sums in (6.45) and integrals in (6.47), and we shall use them
in examining the correspondence further.
To calculate $g_2$ and $g_3$ rapidly, let us assume that $\omega_1/\omega_2$ is in the lower
half plane. (If it is not, then we have only to interchange $\omega_1$ and $\omega_2$.)
Then we shall see from (8.10) that
(6.50)
The two series in (6.50) are rapidly convergent: The numerator is a
polynomial in n, but the denominator grows exponentially with n.
A general theory for handling (6.47) is available, but we shall
concentrate on the case that $a, b, c$ are real. Thus we want to investigate (6.48)
and (6.49). The tool is the arithmetic-geometric mean of Gauss
defined as follows: If $a$ and $b$ are positive reals with $a \ge b$, we recursively
let
\[ a_0 = a, \quad b_0 = b, \quad a_{n+1} = \tfrac12(a_n + b_n), \quad b_{n+1} = \sqrt{a_n b_n}. \]
Then we can see that
\[ a_0 \ge a_1 \ge a_2 \ge \cdots \ge b_2 \ge b_1 \ge b_0 \tag{6.51} \]
and $\lim a_n = \lim b_n$, the convergence being quite rapid. The common
value of $\lim a_n$ and $\lim b_n$ is the arithmetic-geometric mean of $a$ and $b$
and is denoted $M(a,b)$.
[To see the existence of $M(a,b)$, we note first that (6.51) is clear. Then
\[ (a_i - b_i) - 2(a_{i+1} - b_{i+1}) = (a_i - b_i) - (a_i + b_i - 2\sqrt{a_ib_i}) = -2b_i + 2\sqrt{a_ib_i} \ge 0 \]
shows that $|a_{i+1} - b_{i+1}| \le \tfrac12|a_i - b_i|$. The required convergence follows.]
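As a rough numerical illustration of the recursion and of the halving estimate in the bracketed argument, one can simply iterate the two means. This is only a sketch: the names `agm_sequence` and `agm`, the tolerance, and the iteration cap are choices made for the example and do not come from the text.

```python
import math

def agm_sequence(a, b, steps):
    """Return [(a_0, b_0), ..., (a_steps, b_steps)] for positive reals a >= b."""
    pairs = [(a, b)]
    for _ in range(steps):
        a, b = (a + b) / 2, math.sqrt(a * b)   # arithmetic mean, geometric mean
        pairs.append((a, b))
    return pairs

def agm(a, b, tol=1e-15, max_iter=64):
    """Approximate M(a, b) by iterating until a_n - b_n is negligible."""
    for _ in range(max_iter):
        if a - b <= tol * a:
            break
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2

# Print a_n, b_n and the gap a_n - b_n for a few steps, then one mean.
for a_n, b_n in agm_sequence(3.0, 1.0, 4):
    print(a_n, b_n, a_n - b_n)
print(agm(math.sqrt(2.0), 1.0))
```

In practice the gap shrinks much faster than the factor 1/2 guaranteed by the estimate, which is why a handful of iterations suffices in double precision.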
Proposition 6.34. The integrals in (6.48) and (6.49) with
$a < b < c$ are given in terms of the Gauss arithmetic-geometric mean by
the following expressions, up to minus signs:
\[ \omega_1 = \int_a^b \frac{dx}{\sqrt{(x-a)(x-b)(x-c)}} = \frac{\pi}{M(\sqrt{c-a}, \sqrt{c-b})} \]
\[ \omega_2 = \int_b^c \frac{dx}{\sqrt{(x-a)(x-b)(x-c)}} = \frac{i\pi}{M(\sqrt{c-a}, \sqrt{b-a})}. \tag{6.52} \]
Proof. For $\omega_1$ we make the change of variables $\sqrt{x-a} = \sqrt{b-a}\,\sin\theta$
and obtain
\[ \omega_1 = 2\int_0^{\pi/2} \frac{d\theta}{\sqrt{(c-b)\sin^2\theta + (c-a)\cos^2\theta}}. \]
For $\omega_2$ we make the change of variables $\sqrt{x-b} = \sqrt{c-b}\,\cos\theta$ and obtain
\[ \omega_2 = 2i\int_0^{\pi/2} \frac{d\theta}{\sqrt{(b-a)\sin^2\theta + (c-a)\cos^2\theta}}. \]
Thus the proposition will follow if we prove that $0 < r \le s$ implies that
$I(r,s)$, when defined by
\[ I(r,s) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{r^2\sin^2\theta + s^2\cos^2\theta}}, \tag{6.53} \]
evaluates as
\[ I(r,s) = \frac{\pi}{2M(s,r)}. \tag{6.54} \]
We shall prove that
\[ I(r,s) = I\bigl(\sqrt{rs}, \tfrac12(r+s)\bigr). \tag{6.55} \]
Iterating (6.55) and using the existence of $M(s,r)$ and an easy passage
to the limit, we see that
\[ I(r,s) = I(M(s,r), M(s,r)). \tag{6.56a} \]
But we can evaluate $I(M,M)$ trivially, and the result is
\[ I(M,M) = \frac{\pi}{2M}. \tag{6.56b} \]
The equality (6.56) implies (6.54). Thus it is enough to prove (6.55).
To prove (6.55), regard $0 < r \le s$ as fixed. For $0 \le t \le 1$, the function
$\dfrac{2st}{(s+r) + (s-r)t^2}$ increases from 0 to 1. Therefore
\[ \sin\theta = \frac{2s\sin\varphi}{(s+r) + (s-r)\sin^2\varphi}, \qquad 0 \le \varphi \le \frac{\pi}{2}, \]
is a legitimate change of variables, and $\theta$ extends from 0 to $\frac{\pi}{2}$. We make
this change of variables in (6.53) after rewriting (6.53) as
\[ I(r,s) = \int_0^{\pi/2} \frac{\cos\theta\,d\theta}{\cos^2\theta\,\sqrt{r^2\tan^2\theta + s^2}}. \tag{6.57} \]
We readily compute
\[ \cos\theta\,d\theta = \frac{2s\cos\varphi\,[(s+r) - (s-r)\sin^2\varphi]\,d\varphi}{[(s+r) + (s-r)\sin^2\varphi]^2} \]
\[ \cos^2\theta = \frac{\cos^2\varphi\,[(s+r)^2 - (s-r)^2\sin^2\varphi]}{[(s+r) + (s-r)\sin^2\varphi]^2} \]
\[ \tan^2\theta = \frac{4s^2\sin^2\varphi}{\cos^2\varphi\,[(s+r)^2 - (s-r)^2\sin^2\varphi]}. \]
Substituting these expressions into (6.57) and simplifying, we obtain
\[ I(r,s) = \int_0^{\pi/2} \frac{2\,d\varphi}{\sqrt{4rs\sin^2\varphi + (s+r)^2\cos^2\varphi}}, \]
and (6.55) follows. This proves the proposition.
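The identity (6.55) and the resulting closed form can be checked numerically. The following sketch, which is not part of the text, approximates the integral $I(r,s)$ of (6.53) by the midpoint rule and compares it with one step of (6.55) and with $\pi/(2M(s,r))$. The function names, the sample values of $r$ and $s$, and the number of quadrature nodes are assumptions made for illustration.

```python
import math

def agm(a, b, tol=1e-15, max_iter=64):
    """Approximate the arithmetic-geometric mean M(a, b) for a >= b > 0."""
    for _ in range(max_iter):
        if a - b <= tol * a:
            break
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2

def integral_I(r, s, nodes=2000):
    """Midpoint-rule approximation of the integral over [0, pi/2] of
    1 / sqrt(r^2 sin^2(theta) + s^2 cos^2(theta))."""
    h = (math.pi / 2) / nodes
    total = 0.0
    for k in range(nodes):
        theta = (k + 0.5) * h
        total += 1.0 / math.hypot(r * math.sin(theta), s * math.cos(theta))
    return total * h

r, s = 0.75, 1.4                                        # sample values, 0 < r <= s
direct = integral_I(r, s)                               # I(r, s)
one_step = integral_I(math.sqrt(r * s), (r + s) / 2)    # right side of (6.55)
closed_form = math.pi / (2 * agm(s, r))                 # pi / (2 M(s, r))
print(direct, one_step, closed_form)
```

The midpoint rule is used only because the integrand is smooth and periodic, so a modest number of nodes already gives close agreement; any standard quadrature would serve.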
EXAMPLE. $y(y + 1) = (x + 1)x(x - 1)$. This elliptic curve is listed in
Table 3.2 with $\Delta = 37$. Using the transformation rule that passes from
(3.23) to (3.26), we are led to
\[ E : y^2 = 4x^3 - 4x + 1. \]
This curve has
\[ g_2 = 4, \qquad g_3 = -1. \tag{6.58} \]
Its roots are
\[ a = -1.10716\cdots, \qquad b = .269594\cdots, \qquad c = .837565\cdots \]
The relevant Gauss arithmetic-geometric means are
\[ M(\sqrt{c-a}, \sqrt{c-b}) = M(1.39453\cdots, .753639\cdots) = 1.04949\cdots \]
\[ M(\sqrt{c-a}, \sqrt{b-a}) = M(1.39453\cdots, 1.17335\cdots) = 1.28156\cdots \]
Thus (6.52) gives
\[ \omega_1 = \frac{\pi}{M(\sqrt{c-a}, \sqrt{c-b})} = 2.99346\cdots \]
\[ \omega_2 = \frac{i\pi}{M(\sqrt{c-a}, \sqrt{b-a})} = 2.45139\cdots i. \]
With the aid of (6.50), we can use $\omega_1$ and $\omega_2$ to estimate $g_2$ and $g_3$ as
\[ g_2 = 4.00000\cdots, \qquad g_3 = -.999999\cdots \]
in agreement with (6.58).
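The numbers in this example can be reproduced with a few lines of code. The sketch below is not from the text: it finds the roots by the trigonometric solution of the cubic, computes the periods from the arithmetic-geometric mean as in Proposition 6.34, and then recovers $g_2$ and $g_3$ from the standard $q$-expansions of the Eisenstein series with $q = e^{2\pi i\omega_2/\omega_1}$. That $q$-expansion is assumed here to be equivalent to (6.50), whose exact form is not reproduced; all function names are hypothetical.

```python
import cmath
import math

def agm(a, b, tol=1e-15, max_iter=64):
    """Approximate the arithmetic-geometric mean M(a, b) for a >= b > 0."""
    for _ in range(max_iter):
        if a - b <= tol * a:
            break
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2

def three_real_roots(p, q):
    """Roots of x^3 + p*x + q = 0 in increasing order, assuming three real roots."""
    m = 2 * math.sqrt(-p / 3)
    phi = math.acos(3 * q / (p * m)) / 3
    return sorted(m * math.cos(phi - 2 * math.pi * k / 3) for k in range(3))

def invariants_from_periods(w1, w2, terms=60):
    """g2 and g3 of the lattice Z*w1 + Z*w2, assuming Im(w2/w1) > 0."""
    q = cmath.exp(2j * math.pi * w2 / w1)
    s3 = sum(n**3 * q**n / (1 - q**n) for n in range(1, terms + 1))
    s5 = sum(n**5 * q**n / (1 - q**n) for n in range(1, terms + 1))
    g2 = (2 * math.pi / w1) ** 4 * (1 + 240 * s3) / 12
    g3 = (2 * math.pi / w1) ** 6 * (1 - 504 * s5) / 216
    return g2, g3

# E : y^2 = 4x^3 - 4x + 1, so the roots a < b < c solve x^3 - x + 1/4 = 0.
a, b, c = three_real_roots(-1.0, 0.25)
omega1 = math.pi / agm(math.sqrt(c - a), math.sqrt(c - b))
omega2 = 1j * math.pi / agm(math.sqrt(c - a), math.sqrt(b - a))
g2, g3 = invariants_from_periods(omega1, omega2)
print(a, b, c)
print(omega1, omega2)
print(g2, g3)
```

Because $|q|$ is small for this lattice, a few terms of each series already determine $g_2$ and $g_3$ to many digits; the cutoff of 60 terms is deliberately generous.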
CHAPTER VII
DIRICHLET'S THEOREM
1. Motivation
Dirichlet's Theorem on primes in arithmetic progressions has the
following statement.
Theorem 7.1. If $m$ and $b$ are relatively prime integers with $m > 0$,
then there exist infinitely many primes of the form $km + b$ with $k$ a
positive integer.
This theorem was used in the proof of Theorems 5.2 and 5.3. Its
proof contains the first use of L functions historically and will help as
an introduction to the L functions that are the theme of the rest of this
book.
It is illuminating to begin with some special cases of Theorem 7.1 as
motivation. We first suppose that m = 1. In this case the assertion of
the theorem is just that there are infinitely many primes, and there are,
of course, many proofs of this result. For our purposes the interesting
proof is a relatively complicated argument given by Euler in 1737. We
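introduce the function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \tag{7.1} \]
for real $s > 1$. (This function subsequently was named the Riemann
zeta function and is defined and analytic for complex $s$ with $\operatorname{Re} s > 1$.
We postpone a more serious discussion of $\zeta(s)$ to §2.) By monotone
convergence, for example, we have
\[ \lim_{s \downarrow 1} \zeta(s) = +\infty. \tag{7.2} \]
We can write
\[ \zeta(s) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \frac{1}{p^s}} \tag{7.3} \]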
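for $s > 1$. (In fact, the product for $p \le N$ is
\[ \prod_{p \le N} \frac{1}{1 - \frac{1}{p^s}} = \prod_{p \le N} \Bigl(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots\Bigr) = {\sum}' \frac{1}{n^s} \]
with $\sum'$ taken over all $n$ with all prime factors $\le N$. Letting $N \to \infty$,
we obtain (7.3).)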
If there were only finitely many primes, then (7.3) would yield
\[ \limsup_{s \downarrow 1} \zeta(s) < \infty, \]
in contradiction to (7.2). This contradiction completes Euler's proof
that there are infinitely many primes.
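The comparison between the finite product over primes $p \le N$ and the sum over integers with all prime factors $\le N$ can be seen numerically. The sketch below is an illustration only, not part of Euler's argument: it evaluates the finite Euler product and the partial sum of the series at $s = 2$, where the known value $\zeta(2) = \pi^2/6$ gives a check. The function names and the cutoff $N = 1000$ are assumptions made for the example.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n, for n >= 2."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

def euler_product(s, N):
    """Finite product over primes p <= N of 1 / (1 - p^(-s))."""
    result = 1.0
    for p in primes_up_to(N):
        result /= 1.0 - p ** (-s)
    return result

def zeta_partial_sum(s, N):
    """Partial sum of 1/n^s over 1 <= n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))

s, N = 2.0, 1000
print(zeta_partial_sum(s, N), euler_product(s, N), math.pi ** 2 / 6)
```

Every $n \le N$ has all its prime factors $\le N$, so the partial sum is dominated by the finite product, which in turn is dominated by $\zeta(s)$; the three printed numbers appear in that order.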
To consider further special cases, it is more convenient to work with
\[ \log\zeta(s) = \sum_{p\ \mathrm{prime}} \log\frac{1}{1 - \frac{1}{p^s}} = \sum_{p\ \mathrm{prime}} \Bigl(\frac{1}{p^s} + \frac{1}{2p^{2s}} + \frac{1}{3p^{3s}} + \cdots\Bigr), \tag{7.4a} \]
which we shall see in the course of proving (7.30) is
\[ = \sum_{p\ \mathrm{prime}} \frac{1}{p^s} + g(s) \tag{7.4b} \]
with $g(s)$ bounded as $s \downarrow 1$. Euler's proof amounts to combining (7.2)
and (7.4) to see that $\sum \frac{1}{p}$ diverges, and the corollary is that there are
infinitely many primes.
To handle $m = 4$ in Theorem 7.1, we want to treat primes $4k + 1$
separately from primes $4k + 3$. Dirichlet's idea for this special case
amounts to working with the sum and difference of
\[ \sum_{p \equiv 1 \bmod 4} \frac{1}{p^s} \quad\text{and}\quad \sum_{p \equiv 3 \bmod 4} \frac{1}{p^s} \tag{7.5} \]
rather than the two terms separately. Tracing backwards in (7.4), we
are led to consider the following expressions, which differ from the sum
and difference of (7.5) by bounded terms as $s \downarrow 1$:
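\[ \sum_{n\ \mathrm{odd}} \frac{1}{n^s} = \prod_{p\ \mathrm{odd\ prime}} \frac{1}{1 - \frac{1}{p^s}} \tag{7.6a} \]
\[ \sum_{n\ \mathrm{odd}} \frac{(-1)^{(n-1)/2}}{n^s} = \Bigl(\prod_{p\ \mathrm{prime},\ p = 4k+1} \frac{1}{1 - \frac{1}{p^s}}\Bigr)\Bigl(\prod_{p\ \mathrm{prime},\ p = 4k+3} \frac{1}{1 + \frac{1}{p^s}}\Bigr) \tag{7.6b} \]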
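Let us write
\[ \chi_0(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2 \\ 1 & \text{if } n \equiv 1 \bmod 2 \end{cases} \qquad \chi_1(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2 \\ 1 & \text{if } n \equiv 1 \bmod 4 \\ -1 & \text{if } n \equiv 3 \bmod 4. \end{cases} \tag{7.7} \]
With $\chi$ equal to $\chi_0$ or $\chi_1$, we have
\[ \chi(mn) = \chi(m)\chi(n) \quad\text{for all } m \text{ and } n. \]
Consequently the expressions (7.6) are both of the form
\[ L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \frac{\chi(p)}{p^s}} \tag{7.8} \]
with $\chi$ taken as $\chi_0$ for (7.6a) and as $\chi_1$ for (7.6b). We can write the
analog of (7.4) uniformly as
\[ \log L(s,\chi) = \sum_{p\ \mathrm{prime}} \frac{\chi(p)}{p^s} + g(s,\chi) \]
with $g(s,\chi)$ bounded as $s \downarrow 1$. Therefore
\[ \log\bigl(L(s,\chi_0)L(s,\chi_1)\bigr) = 2\sum_{p\ \mathrm{prime},\ p = 4k+1} \frac{1}{p^s} + \bigl(g(s,\chi_0) + g(s,\chi_1)\bigr) \]
\[ \log\bigl(L(s,\chi_0)L(s,\chi_1)^{-1}\bigr) = 2\sum_{p\ \mathrm{prime},\ p = 4k+3} \frac{1}{p^s} + \bigl(g(s,\chi_0) - g(s,\chi_1)\bigr). \tag{7.9} \]
For $\chi = \chi_0$, comparison of (7.3) and (7.8) shows that
\[ \zeta(s) = L(s,\chi_0)\,\frac{1}{1 - 2^{-s}}. \]
Therefore
\[ \lim_{s \downarrow 1} L(s,\chi_0) = +\infty. \]
Meanwhile the series in (7.6b) is alternating and converges for $s > 0$ by
the Leibniz test. The convergence is uniform on compact sets, and the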
sum $L(s,\chi_1)$ is continuous for $s > 0$. Grouping the terms of this series
in pairs, we see that $L(1,\chi_1) > 0$. (Actually we can even recognize the
value as $\pi/4$ from the Taylor series of $\arctan x$.) Putting these facts
about $L(s,\chi_0)$ and $L(s,\chi_1)$ together, we see that both left sides of (7.9)
tend to $+\infty$ as $s \downarrow 1$. It follows from (7.9) that
\[ \sum_{p\ \mathrm{prime},\ p = 4k+1} \frac{1}{p} \quad\text{and}\quad \sum_{p\ \mathrm{prime},\ p = 4k+3} \frac{1}{p} \]
are both infinite. Hence there are infinitely many primes $4k + 1$, and
there are infinitely many primes $4k + 3$.
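The two characters used in this argument are easy to tabulate, and the value of the alternating series at $s = 1$ can be watched converging to $\pi/4$. The following is a minimal sketch, assuming the definitions of $\chi_0$ and $\chi_1$ in (7.7); the helper names `chi0`, `chi1`, and `L_partial` are made up for the illustration.

```python
import math

def chi0(n):
    """Character chi_0 of (7.7): 1 on odd n, 0 on even n."""
    return n % 2

def chi1(n):
    """Character chi_1 of (7.7): 0, 1, 0, -1 for n congruent to 0, 1, 2, 3 mod 4."""
    return (0, 1, 0, -1)[n % 4]

def L_partial(s, chi, N):
    """Partial sum of chi(n)/n^s over 1 <= n <= N."""
    return sum(chi(n) / n ** s for n in range(1, N + 1))

# Both characters satisfy chi(mn) = chi(m) chi(n) for all m and n in a test range.
assert all(chi(m * n) == chi(m) * chi(n)
           for chi in (chi0, chi1) for m in range(1, 60) for n in range(1, 60))

print(L_partial(1.0, chi1, 200001), math.pi / 4)
```

The partial sums of the series for $L(1,\chi_1)$ converge slowly, with error bounded by the first omitted term, so the two printed numbers agree only to about five decimal places.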
The proof of Dirichlet's Theorem for $km + b$ will proceed in similar
fashion. We return to it in §4 after a brief but systematic investigation of
the kinds of series and products that we have encountered in the present
section.
2. Dirichlet Series and Euler Products
A series $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$ with $a_n$ and $s$ complex is called a Dirichlet series.
The first result shows that the region of convergence and the region of
absolute convergence are each right half planes in $\mathbb{C}$ (unless equal to the
empty set or all of $\mathbb{C}$). These half planes may not be the same: $\sum \frac{(-1)^n}{n^s}$
is convergent for $\operatorname{Re} s > 0$ and absolutely convergent for $\operatorname{Re} s > 1$.
Proposition 7.2. Let $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$ be a Dirichlet series.
(a) If the series is convergent for $s = s_0$, then it is convergent uniformly
on compact sets for $\operatorname{Re} s > \operatorname{Re} s_0$, and the sum of the series is analytic
in this region.
(b) If the series is absolutely convergent for $s = s_0$, then it is uniformly
absolutely convergent for $\operatorname{Re} s \ge \operatorname{Re} s_0$.
(c) If the series is convergent for $s = s_0$, then it is absolutely
convergent for $\operatorname{Re} s > \operatorname{Re} s_0 + 1$.
(d) If the series is convergent at some $s_0$ and sums to 0 in a right half
plane, then all the coefficients are 0.
Remark. The proof of (a) will use the summation by parts
formula. If $\{u_n\}$ and $\{v_n\}$ are sequences and if $U_n = \sum_{k=1}^{n} u_k$ for $n \ge 0$,
then $1 \le M \le N$ implies
\[ \sum_{n=M}^{N} u_nv_n = \sum_{n=M}^{N-1} U_n(v_n - v_{n+1}) + U_Nv_N - U_{M-1}v_M. \tag{7.10} \]
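The summation by parts formula is a purely algebraic identity, so it can be tested exactly on integer data. The sketch below reads (7.10) as $\sum_{n=M}^{N} u_nv_n = \sum_{n=M}^{N-1} U_n(v_n - v_{n+1}) + U_Nv_N - U_{M-1}v_M$ with $U_0 = 0$, and checks it for every pair $1 \le M \le N \le 20$ on random sequences. The function name `both_sides` and the test data are assumptions for illustration only.

```python
import random
from itertools import accumulate

def both_sides(u, v, M, N):
    """Left and right sides of the summation by parts identity.

    u and v are lists indexed from 1 (entry 0 is a placeholder), and
    U[n] = u[1] + ... + u[n] with U[0] = 0.
    """
    U = [0] + list(accumulate(u[1:]))
    left = sum(u[n] * v[n] for n in range(M, N + 1))
    right = (sum(U[n] * (v[n] - v[n + 1]) for n in range(M, N))
             + U[N] * v[N] - U[M - 1] * v[M])
    return left, right

random.seed(0)
size = 20
u = [0] + [random.randint(-9, 9) for _ in range(size)]
v = [0] + [random.randint(-9, 9) for _ in range(size)]
checks = [both_sides(u, v, M, N)
          for M in range(1, size + 1) for N in range(M, size + 1)]
print(all(left == right for left, right in checks))
```

Integer arithmetic is used so that the comparison is exact rather than subject to rounding.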
Proof. For (a), we write $\frac{a_n}{n^s} = \frac{a_n}{n^{s_0}} \cdot \frac{1}{n^{s-s_0}} = u_nv_n$ and then
apply (7.10). The given convergence means that $\{U_n\}$ is convergent, and
certainly $v_n \to 0$ uniformly on any proper half plane of $\operatorname{Re} s > \operatorname{Re} s_0$.
Thus the second and third terms on the right side of (7.10) tend to 0
with the required uniformity as $M$ and $N$ tend to $\infty$. For the first term,
the sequence $\{U_n\}$ is bounded, and we shall show that
\[ \sum_{n=1}^{\infty} |v_n - v_{n+1}| = \sum_{n=1}^{\infty} \Bigl|\frac{1}{n^{s-s_0}} - \frac{1}{(n+1)^{s-s_0}}\Bigr| \]
is convergent uniformly on compact sets where $\operatorname{Re} s > \operatorname{Re} s_0$. Use of
(7.10) and the Cauchy criterion will complete the proof of convergence.
For $n \le t \le n + 1$, we have
\[ |n^{-(s-s_0)} - t^{-(s-s_0)}| \le \sup_{n \le t \le n+1} \Bigl|\frac{d}{dt}\bigl(n^{-(s-s_0)} - t^{-(s-s_0)}\bigr)\Bigr| = \sup_{n \le t \le n+1} \Bigl|\frac{s - s_0}{t^{s-s_0+1}}\Bigr| \le \frac{|s - s_0|}{n^{1+\operatorname{Re}(s-s_0)}}. \tag{7.11} \]
Thus
\[ |v_n - v_{n+1}| = |n^{-(s-s_0)} - (n+1)^{-(s-s_0)}| \le \frac{|s - s_0|}{n^{1+\operatorname{Re}(s-s_0)}}, \]
and the sum is uniformly convergent on compact sets with $\operatorname{Re} s >
\operatorname{Re} s_0$, by the Weierstrass M-test.
Hence the given Dirichlet series is uniformly convergent on compact
sets where $\operatorname{Re} s > \operatorname{Re} s_0$. Since each term is analytic in this region, the
sum is analytic.
For (b), we have
\[ \Bigl|\frac{a_n}{n^s}\Bigr| = \Bigl|\frac{a_n}{n^{s_0}}\Bigr| \Bigl|\frac{1}{n^{s-s_0}}\Bigr| \le \Bigl|\frac{a_n}{n^{s_0}}\Bigr|. \]
Since the sum of the right side is convergent, the desired uniform
convergence follows from the Weierstrass M-test.
For (c), let $\epsilon > 0$ be given. Then
\[ \Bigl|\frac{a_n}{n^{s_0+1+\epsilon}}\Bigr| = \Bigl|\frac{a_n}{n^{s_0}}\Bigr| \cdot \frac{1}{n^{1+\epsilon}} \]
with the first factor on the right bounded and the second factor
contributing to a finite sum. Therefore we have absolute convergence at
$s_0 + 1 + \epsilon$, and (c) follows from (b).
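For (d), we may assume by (c) that there is absolute convergence
at $s_0$. Suppose $a_1 = \cdots = a_{N-1} = 0$. By (b), $\sum_{n=N}^{\infty} \frac{a_n}{n^s} = 0$ for
$\operatorname{Re} s > \operatorname{Re} s_0$. The series
\[ \sum_{n=N}^{\infty} \frac{a_n}{(n/N)^s} \tag{7.12} \]
is by assumption absolutely convergent at $s_0$, and $\operatorname{Re} s > \operatorname{Re} s_0$ implies
\[ \Bigl|\frac{a_n}{(n/N)^s}\Bigr| \le \Bigl|\frac{a_n}{(n/N)^{s_0}}\Bigr|. \]
By dominated convergence we can take the limit of (7.12) term by term
as $s \to +\infty$. The only term that survives is $a_N$. Since (7.12) has sum 0
for all $s$, we conclude $a_N = 0$. This completes the proof.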
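Proposition 7.3. The Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$, initially
defined and analytic for $\operatorname{Re} s > 1$, extends to be meromorphic for
$\operatorname{Re} s > 0$. Its only pole is at $s = 1$, and the pole is simple.
Proof. For $\operatorname{Re} s > 1$, we have
\[ \frac{1}{s-1} = \int_1^{\infty} t^{-s}\,dt = \sum_{n=1}^{\infty} \int_n^{n+1} t^{-s}\,dt. \]
Thus $\operatorname{Re} s > 1$ implies
\[ \zeta(s) = \frac{1}{s-1} + \sum_{n=1}^{\infty} \int_n^{n+1} (n^{-s} - t^{-s})\,dt. \]
It is enough to show that the series on the right side converges uniformly
on compact sets for $\operatorname{Re} s > 0$. Thus suppose $\operatorname{Re} s \ge \sigma > 0$ and $|s| \le C$.
By (7.11) with $s_0 = 0$, we have
\[ \Bigl|\int_n^{n+1} (n^{-s} - t^{-s})\,dt\Bigr| \le \int_n^{n+1} \frac{|s|}{n^{1+\operatorname{Re} s}}\,dt \le \frac{C}{n^{1+\sigma}}. \]
Since $\sum n^{-(1+\sigma)} < \infty$, the desired uniform convergence follows from the
Weierstrass M-test.
Remark. Actually $\zeta(s)$ extends to be meromorphic in $\mathbb{C}$ with no
additional poles. We return to this point in §5.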
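The formula used in this proof, $\zeta(s) = \frac{1}{s-1} + \sum_{n \ge 1} \int_n^{n+1} (n^{-s} - t^{-s})\,dt$, can be evaluated directly for real $s > 0$, $s \ne 1$, since each integral has a closed form. The sketch below is an illustration under that reading and is not taken from the text; the function name and the truncation points are assumptions, and the reference values $\zeta(2) = \pi^2/6$ and $\zeta(1/2) \approx -1.4603545$ are standard facts quoted for comparison.

```python
import math

def zeta_via_continuation(s, N):
    """1/(s-1) plus the first N terms of the series of integrals,
    for real s > 0 with s != 1.

    The integral of t^(-s) over [n, n+1] is ((n+1)^(1-s) - n^(1-s)) / (1-s).
    """
    total = 1.0 / (s - 1.0)
    for n in range(1, N + 1):
        integral_of_power = ((n + 1) ** (1.0 - s) - n ** (1.0 - s)) / (1.0 - s)
        total += n ** (-s) - integral_of_power
    return total

print(zeta_via_continuation(2.0, 10000), math.pi ** 2 / 6)
print(zeta_via_continuation(0.5, 200000))
```

For $s$ close to 0 the terms decay only like $n^{-1-s}$, matching the bound $C/n^{1+\sigma}$ in the proof, so the truncated sum at $s = 1/2$ is accurate only to about three decimal places.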
We shall now examine special features of Dirichlet series that allow
the series to have product expansions like the one (7.3) for $\zeta(s)$. We
begin with some general facts about infinite products.
An infinite product $\prod_{n=1}^{\infty} a_n$ with $a_n \in \mathbb{C}$ and with no factor 0 is said
to converge if the sequence of partial products converges and the limit
is not 0. A necessary condition for convergence is that $a_n \to 1$.
Proposition 7.4. If $|a_n| < 1$ for all $n$, then the following conditions
are equivalent:
(a) $\prod_{n=1}^{\infty} (1 + |a_n|)$ converges
(b) $\sum_{n=1}^{\infty} |a_n|$ converges
(c) $\prod_{n=1}^{\infty} (1 - |a_n|)$ converges.
In this case, $\prod_{n=1}^{\infty} (1 + a_n)$ converges.
Proof. Condition (c) is equivalent with
(c') $\prod_{n=1}^{\infty} (1 - |a_n|)^{-1}$ converges.
For each of (a), (b), and (c'), convergence is equivalent with boundedness
above. Since
\[ 1 + \sum_{n=1}^{N} |a_n| \le \prod_{n=1}^{N} (1 + |a_n|) \le \prod_{n=1}^{N} \frac{1}{1 - |a_n|}, \]
we see that (c') implies (a) and that (a) implies (b). For (b) $\Rightarrow$ (c'), we
may assume, without loss of generality, that $|a_n| \le \frac12$ for all $n$. Since
$0 \le x \le \frac12$ implies
\[ \Bigl|\log\frac{1}{1-x}\Bigr| \le |x| \sup_{0 \le t \le \frac12} \Bigl|\frac{d}{dt}\log\frac{1}{1-t}\Bigr| = |x| \sup_{0 \le t \le \frac12} \frac{1}{1-t} = 2|x|, \]
we have
\[ \log\prod_{n=1}^{N} \frac{1}{1 - |a_n|} = \sum_{n=1}^{N} \log\frac{1}{1 - |a_n|} \le 2\sum_{n=1}^{N} |a_n|. \]
Thus (b) implies (c').
Now suppose (a) holds. To prove that $\prod_{n=1}^{\infty} (1 + a_n)$ converges, it is
enough to show that $\prod_{n=M}^{N} (1 + a_n)$ tends to 1 as $M$ and $N$ tend to $\infty$.
In the expression
\[ \Bigl|\prod_{n=M}^{N} (1 + a_n) - 1\Bigr| \]
we expand out the product, move the absolute values in for each term,
and reassemble the product. The result is the inequality
\[ \Bigl|\prod_{n=M}^{N} (1 + a_n) - 1\Bigr| \le \prod_{n=M}^{N} (1 + |a_n|) - 1. \]
By (a) the right side tends to 0 as $M$ and $N$ tend to $\infty$. Therefore so
does the left side. This proves the proposition.
Consider a formal product
\[ \prod_{p\ \mathrm{prime}} (1 + a_pp^{-s} + \cdots + a_{p^m}p^{-ms} + \cdots). \tag{7.13} \]
If this product is expanded without regard to convergence, the result is
the Dirichlet series $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$, where $a_1 = 1$ and $a_n$ is given by
\[ a_n = a_{p_1^{r_1}} \cdots a_{p_k^{r_k}} \quad\text{if } n = p_1^{r_1} \cdots p_k^{r_k}. \tag{7.14} \]
Suppose that the Dirichlet series is in fact absolutely convergent in some
right half plane. Then every rearrangement is absolutely convergent to
the same sum, and the same conclusion is valid for subseries. If $E$ is
a finite set of primes and $N(E)$ is the set of positive integers requiring
only members of $E$ for their factorization, we have
\[ \prod_{p \in E} (1 + a_pp^{-s} + \cdots + a_{p^m}p^{-ms} + \cdots) = \sum_{n \in N(E)} \frac{a_n}{n^s}. \]
Consequently the infinite product has a limit in the half plane of absolute
convergence of the Dirichlet series, and the limiting product (7.13) equals
the sum of the series. The sum of the series is 0 only if one of the factors
on the left side is 0. In particular the sum of the series cannot be
identically 0, by Proposition 7.2d. Thus (7.13) can equal only this one
Dirichlet series.
Conversely if an absolutely convergent Dirichlet series $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$ has
the property that its coefficients are multiplicative, i.e.,
\[ a_1 = 1 \quad\text{and}\quad a_{mn} = a_ma_n \quad\text{whenever } \operatorname{GCD}(m,n) = 1, \tag{7.15} \]
then we can form the product (7.13) and recover the given series by
expanding (7.13) and using (7.14). In this case we say that the Dirichlet
series has (7.13) as an Euler product. Many functions in elementary
number theory give rise to multiplicative sequences; an example is $a_n =
\varphi(n)$, where $\varphi$ is the Euler $\varphi$ function.
If the coefficients are strictly multiplicative, i.e.,
\[ a_1 = 1 \quad\text{and}\quad a_{mn} = a_ma_n \quad\text{for all } m \text{ and } n, \tag{7.16} \]
then the $p$th factor of (7.13) simplifies to
\[ 1 + a_pp^{-s} + \cdots + (a_pp^{-s})^m + \cdots = \frac{1}{1 - \frac{a_p}{p^s}}. \tag{7.17} \]
In this case our Dirichlet series has a first degree Euler product:
\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \frac{a_p}{p^s}}. \tag{7.18} \]
This is what happens with $\zeta(s)$, where all the coefficients are 1, and with
$a_n = \chi_0(n)$ and $a_n = \chi_1(n)$ as in (7.7). Conversely an Euler product
expansion of the form (7.18) forces the coefficients of the Dirichlet series
to be strictly multiplicative.
A Dirichlet series $\sum_{n=1}^{\infty} \frac{a_n}{n^s}$ with $|a_n| \le n^c$ for some real $c$ is absolutely
convergent for $\operatorname{Re} s > c + 1$. This fact leads us to a convergence criterion
for first degree Euler products.
Proposition 7.5. A first degree Euler product $\prod_p [1 - a_pp^{-s}]^{-1}$ with
$|a_p| \le p^c$ for some real $c$ and all primes $p$ defines an absolutely convergent
Dirichlet series (and hence a valid identity (7.18)) for $\operatorname{Re} s > c + 1$.
Proof. The coefficients $a_n$ are strictly multiplicative, and thus $|a_n| \le
n^c$ for all $n$. The absolute convergence follows.
First degree Euler products are sufficient for Dirichlet's Theorem, but
other Euler products are needed in other applications, such as with
elliptic curves. To isolate the notion of degree of an Euler product, let
us write (7.17) as a formal identity
\[ 1 + a_pX + \cdots + (a_pX)^m + \cdots = \frac{1}{1 - a_pX}. \]
Here the denominator on the right is a polynomial of degree $\le 1$ with
constant term 1, and it is in this sense that the Euler product (7.18) has
degree 1. The expansion (7.13) is called a $k$th degree Euler product
if, for each prime $p$, there is a polynomial $P_p(X) \in \mathbb{C}[X]$ having degree
$\le k$ and zero constant term such that
\[ 1 + a_pX + \cdots + a_{p^m}X^m + \cdots = \frac{1}{1 - P_p(X)} \]
as a formal identity. Let us factor $1 - P_p(X)$ over $\mathbb{C}$ as
\[ 1 - P_p(X) = (1 - r_p^{(1)}X) \cdots (1 - r_p^{(k)}X). \]
We call the complex numbers $r_p^{(j)}$ the reciprocal roots of $1 - P_p(X)$.
Proposition 7.6. A $k$th degree Euler product $\prod_p [1 - P_p(p^{-s})]^{-1}$
whose reciprocal roots satisfy $|r_p^{(j)}| \le p^c$ for some real $c$ and all primes
$p$ defines an absolutely convergent Dirichlet series for $\operatorname{Re} s > c + 1$. For
such $s$ the sum of the Dirichlet series equals the Euler product.
Proof. We apply Proposition 7.5 to $\prod_p [1 - r_p^{(j)}p^{-s}]^{-1}$ for each $j$. The
product of absolutely convergent Dirichlet series can be rearranged
without affecting the sum, and the result is an absolutely convergent Dirichlet
series.
The final thing we need to know about Euler products is how to
recognize when they are of degree $k$. We shall be interested only in the
quadratic case, but it merely obscures the idea to make such a restriction
right away.
Proposition 7.7. Let
\[ 1 + b_1X + b_2X^2 + \cdots + b_mX^m + \cdots \tag{7.19} \]
be given. Then there is a unique polynomial
\[ Q(X) = c_0 + c_1X + \cdots + c_{k-1}X^{k-1} \tag{7.20} \]
of degree $\le k - 1$ such that
\[ \frac{1}{1 - XQ(X)} = 1 + XQ(X) + X^2Q(X)^2 + \cdots \tag{7.21} \]
is congruent to (7.19) modulo $X^{k+1}$. For this polynomial $Q(X)$, the
expressions (7.19) and (7.21) are equal if and only if
\[ b_m = c_0b_{m-1} + c_1b_{m-2} + \cdots + c_{k-1}b_{m-k} \quad\text{for all } m > k. \tag{7.22} \]
Proof. The condition that (7.19) and (7.21) be equal is the condition
that $1 - XQ(X)$ be an inverse to (7.19) in the ring of formal power series
over $\mathbb{C}$. Consider the equation
\[ (1 - c_0X - c_1X^2 - \cdots - c_{k-1}X^k)(1 + b_1X + b_2X^2 + \cdots) = (1 + d_1X + d_2X^2 + \cdots). \]
The conditions $d_1 = \cdots = d_k = 0$ uniquely determine $c_0, \ldots, c_{k-1}$
recursively. Once $c_0, \ldots, c_{k-1}$ are fixed in this fashion, $d_m$ for $m > k$ is
given as the difference of the left side minus the right side in (7.22). The
result follows.
Corollary 7.8. An Euler product (7.13) with $a_1 = 1$ is of degree 2 if
and only if there exists for each prime $p$ a complex number $d_p$ such that
\[ a_{p^m} = a_pa_{p^{m-1}} + d_pa_{p^{m-2}} \tag{7.23} \]
for all $m \ge 2$. In this case the $p$th Euler factor is
\[ \frac{1}{1 - a_pp^{-s} - d_pp^{-2s}}. \tag{7.24} \]
Proof. Let $b_m = a_{p^m}$ in Proposition 7.7. The polynomial $Q(X)$
works out to be $c_0 + c_1X$ with $b_1 = c_0$ and $b_2 = c_0^2 + c_1$. From (7.23)
with $m = 2$, $d_p$ has to be $a_{p^2} - a_p^2$. So we define it this way to make
(7.23) valid for $m = 2$. Now
\[ d_p = a_{p^2} - a_p^2 = b_2 - b_1^2 = b_2 - c_0^2 = c_1. \]
Hence (7.23) for $m > 2$ is the same equation as (7.22) for $m > 2$. Thus
the corollary follows from Proposition 7.7.
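The degree 2 criterion can be illustrated with exact integer arithmetic. The sketch below reads the recursion (7.23) as $a_{p^m} = a_pa_{p^{m-1}} + d_pa_{p^{m-2}}$ with $a_{p^0} = 1$, generates the coefficients of one Euler factor from it, and then multiplies back by $1 - a_pX - d_pX^2$ to confirm that the product is 1 up to the truncation order. The function names and the sample values $a_p = 5$, $d_p = -6$ (reciprocal roots 2 and 3) are assumptions chosen for the example.

```python
def degree_two_coefficients(a_p, d_p, count):
    """Return [a_{p^0}, ..., a_{p^(count-1)}] from the recursion
    a_{p^m} = a_p * a_{p^(m-1)} + d_p * a_{p^(m-2)}, with a_{p^0} = 1."""
    coeffs = [1, a_p]
    for m in range(2, count):
        coeffs.append(a_p * coeffs[m - 1] + d_p * coeffs[m - 2])
    return coeffs[:count]

def times_denominator(coeffs, a_p, d_p):
    """Coefficients of (1 - a_p X - d_p X^2) * (sum of coeffs[m] X^m),
    truncated to the same number of terms as coeffs."""
    out = []
    for m in range(len(coeffs)):
        term = coeffs[m]
        if m >= 1:
            term -= a_p * coeffs[m - 1]
        if m >= 2:
            term -= d_p * coeffs[m - 2]
        out.append(term)
    return out

# Reciprocal roots 2 and 3: 1 - 5X + 6X^2 = (1 - 2X)(1 - 3X), so a_p = 5, d_p = -6.
coeffs = degree_two_coefficients(5, -6, 8)
print(coeffs)
print(times_denominator(coeffs, 5, -6))
```

With reciprocal roots 2 and 3 the coefficients can also be written in closed form as $3^{m+1} - 2^{m+1}$, which gives an independent check on the recursion.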
3. Fourier Analysis on Finite Abelian Groups
In considering primes of the forms $4k + 1$ and $4k + 3$ in §1, we worked
with the sum and difference of the expressions (7.5) and then
reconstructed the individual expressions from the sum and difference. These
steps correspond to doing Fourier analysis on the group $\mathbb{Z}_4^{\times} \cong \mathbb{Z}_2$. In this
section we shall work out the Fourier analysis of $\mathbb{Z}_m^{\times}$ that is appropriate
for handling primes of the form $km + b$.
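Let $G$ be a finite abelian group (such as $\mathbb{Z}_m^{\times}$). A character of $G$ is a
homomorphism of $G$ into $S^1 \subseteq \mathbb{C}^{\times}$. The characters of $G$ form a finite
abelian group $\hat{G}$ under pointwise multiplication:
\[ (\chi\chi')(g) = \chi(g)\chi'(g). \]
Proposition 7.9. Let $G$ be a finite abelian group. With respect to
the Hermitian inner product $\langle F, F' \rangle = \sum_{g \in G} F(g)\overline{F'(g)}$ on the vector
space of all complex-valued functions on $G$, the members of $\hat{G}$ form
an orthogonal basis, each $\chi \in \hat{G}$ satisfying $\|\chi\|^2 = |G|$. Consequently
$|\hat{G}| = |G|$, and any function $F : G \to \mathbb{C}$ is given by the "sum of its
Fourier series":
\[ F(g) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \Bigl(\sum_{h \in G} F(h)\overline{\chi(h)}\Bigr)\chi(g). \tag{7.25} \]
TERMINOLOGY. Equation (7.25) is called the Fourier inversion
formula.
Proof. First we prove that distinct characters $\chi$ and $\chi'$ are
orthogonal. If $\chi$ and $\chi'$ are given with $\chi \ne \chi'$, put $\chi'' = \chi\overline{\chi'}$. Choose $g_0 \in G$
with $\chi''(g_0) \ne 1$. From
\[ \chi''(g_0)\Bigl(\sum_{g \in G} \chi''(g)\Bigr) = \sum_{g \in G} \chi''(g_0g) = \sum_{g \in G} \chi''(g), \]
we see that
\[ [1 - \chi''(g_0)]\sum_{g \in G} \chi''(g) = 0 \quad\text{and hence}\quad \sum_{g \in G} \chi''(g) = 0. \]
Consequently
\[ \langle \chi, \chi' \rangle = \sum_{g \in G} \chi(g)\overline{\chi'(g)} = \sum_{g \in G} \chi''(g) = 0, \]
and the orthogonality is proved.
It follows that the members of $\hat{G}$ are linearly independent and hence
that $|\hat{G}| \le |G|$. It is clear that
\[ \|\chi\|^2 = \sum_{g \in G} |\chi(g)|^2 = \sum_{g \in G} 1 = |G|. \]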
To see that the members of $\widehat{G}$ are a basis, we write $G$ as a direct sum
of cyclic groups. A summand $\mathbb{Z}_N$ of $G$ has at least $N$ distinct characters,
given by $j \bmod N \mapsto e^{2\pi i jr/N}$ for $0 \le r \le N - 1$, and these characters
extend to $G$ as 1 on the other summands of $G$. Taking products of such
characters from different summands of $G$, we see that $|\widehat{G}| \ge |G|$.
Therefore $\widehat{G}$ is an orthogonal basis of the space of all complex-valued
functions on $G$. Formula (7.25) is the usual inner-product-space formula
for expressing an element in terms of an orthogonal basis.
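As an illustration of Proposition 7.9, the following is a minimal numerical sketch, assuming $G$ is presented as a product of cyclic groups $\mathbb{Z}_{N_1} \times \cdots \times \mathbb{Z}_{N_k}$. The helper names (`characters`, `elements`, `inner`, `fourier_inversion`) and the test function `F` are our own and are not part of the text. The sketch builds the characters exactly as in the proof and rebuilds `F` from its Fourier coefficients by (7.25).

```python
import cmath
from itertools import product

def elements(orders):
    """All elements of Z_{N1} x ... x Z_{Nk}, as tuples of residues."""
    return list(product(*(range(N) for N in orders)))

def characters(orders):
    """One character per tuple r with 0 <= r_i < N_i, built as in the proof:
    a product of the characters j -> exp(2*pi*i*j*r_i/N_i) of the summands."""
    def make(r):
        def chi(g):
            phase = sum(gi * ri / Ni for gi, ri, Ni in zip(g, r, orders))
            return cmath.exp(2j * cmath.pi * phase)
        return chi
    return [make(r) for r in product(*(range(N) for N in orders))]

def inner(F1, F2, G):
    """Hermitian inner product <F1, F2> = sum over g of F1(g) * conj(F2(g))."""
    return sum(F1(g) * F2(g).conjugate() for g in G)

def fourier_inversion(F, orders):
    """Return the function g -> (1/|G|) sum_chi (sum_h F(h) conj(chi(h))) chi(g)."""
    G = elements(orders)
    chars = characters(orders)
    coeffs = [inner(F, chi, G) for chi in chars]
    return lambda g: sum(c * chi(g) for c, chi in zip(coeffs, chars)) / len(G)

orders = (2, 6)                       # hypothetical choice: G = Z_2 x Z_6
G = elements(orders)
chars = characters(orders)
F = lambda g: complex(g[0] ** 2 + 3 * g[1], g[0] - g[1])   # arbitrary test function
F_rebuilt = fourier_inversion(F, orders)
```

Up to floating-point rounding, `inner(chi, psi, G)` should vanish for distinct characters and equal $|G|$ on the diagonal, and `F_rebuilt` should agree with `F` at every element of `G`.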
4. Proof of Dirichlet's Theorem
The proof of Dirichlet's Theorem is a direct generalization of the
argument given for $m = 4$ in §1. Fix an integer $m > 1$. A Dirichlet
character modulo $m$ is a function $\chi : \mathbb{Z} \to S^1 \cup \{0\}$ such that
(i) $\chi(j) = 0$ if and only if $\mathrm{GCD}(j, m) > 1$
(ii) $\chi(j)$ depends only on the residue class $j \bmod m$
(iii) when regarded as a function on the residue classes modulo $m$, $\chi$
is a character of $\mathbb{Z}_m^\times$.
In particular, a Dirichlet character modulo $m$ determines a character of
$\mathbb{Z}_m^\times$. Conversely each character of $\mathbb{Z}_m^\times$ defines a unique Dirichlet character
modulo $m$ by lifting the character to $\{j \in \mathbb{Z} \mid \mathrm{GCD}(j,m) = 1\}$ and by
defining a Dirichlet character to be 0 on the rest of $\mathbb{Z}$. It will often be
notationally helpful to use the same symbol for the Dirichlet character
and the character of $\mathbb{Z}_m^\times$. Because of this correspondence, the number of
Dirichlet characters modulo $m$ is $\varphi(m)$, where $\varphi$ is the Euler $\varphi$ function.
The principal Dirichlet character modulo $m$, denoted $\chi_0$, is the one
built from the trivial character of $\mathbb{Z}_m^\times$:
\[ \chi_0(j) = \begin{cases} 1 & \text{if } \mathrm{GCD}(j,m) = 1 \\ 0 & \text{if } \mathrm{GCD}(j,m) > 1. \end{cases} \]
Each Dirichlet character modulo $m$ is strictly multiplicative, in the
sense of (7.16). We assemble each as the coefficients of a Dirichlet series,
the Dirichlet L function, by
\[ L(s,\chi) = \sum_{n=1}^{\infty}\frac{\chi(n)}{n^s}. \tag{7.26} \]
Proposition 7.10.
(a) The Dirichlet series $L(s,\chi)$ is absolutely convergent for Re $s > 1$
and is given in that region by a first degree Euler product
\[ L(s,\chi) = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \dfrac{\chi(p)}{p^s}}. \tag{7.27} \]
(b) If $\chi$ is not principal, then the series for $L(s,\chi)$ is convergent for
Re $s > 0$, and the sum is analytic in that region.
(c) For the principal Dirichlet character $\chi_0$ modulo $m$, $L(s,\chi_0)$
extends to be meromorphic for Re $s > 0$. Its only pole is at $s = 1$, and the
pole is simple. It is given in terms of the Riemann zeta function by
\[ L(s,\chi_0) = \zeta(s)\prod_{p \mid m}\Big(1 - \frac{1}{p^s}\Big). \tag{7.28} \]
Proof. For (a), the boundedness of $\chi$ implies that the series is
absolutely convergent for Re $s > 1$. Since $\chi$ is strictly multiplicative, $L(s,\chi)$
has a first degree Euler product, by (7.18), and the product is convergent
in the same region.
For (b), let us notice that $\chi \neq \chi_0$ implies
\[ \sum_{n=1}^{m}\chi(n + b) = 0 \quad\text{for any } b, \tag{7.29} \]
since the member of $\widehat{\mathbb{Z}_m^\times}$ that corresponds to $\chi$ is orthogonal to the trivial
character, by Proposition 7.9. For $s$ real and positive, let us write
\[ \frac{\chi(n)}{n^s} = u_n v_n \quad\text{with}\quad u_n = \chi(n),\ v_n = \frac{1}{n^s}, \]
in the notation of (7.10), putting $U_n = \sum_{k=1}^{n} u_k$. Equation (7.29) implies
that $\{U_n\}$ is bounded, say $|U_n| \le C$. By summation by parts (7.10),
\[ \Big|\sum_{n=M}^{N}\frac{\chi(n)}{n^s}\Big| \le \sum_{n=M}^{N-1} C\Big(\frac{1}{n^s} - \frac{1}{(n+1)^s}\Big) + \frac{C}{N^s} + \frac{C}{M^s} = \frac{2C}{M^s}. \]
This expression tends to 0 as $M$ and $N$ tend to $\infty$. Therefore the series
(7.26) is convergent for $s$ real and positive. By Proposition 7.2a, the
series is convergent for Re $s > 0$, and the sum is analytic in this region.
For (c), let Re $s > 1$. From (7.27) with $\chi = \chi_0$, we have
\[ L(s,\chi_0) = \prod_{p \nmid m}\frac{1}{1 - \dfrac{1}{p^s}}. \]
Using (7.3), we obtain (7.28). The remaining statements in (c) follow
from Proposition 7.3, since the product over $p$ with $p \mid m$ is a finite
product.
By Proposition 7.10b, $L(s,\chi)$ is well defined at $s = 1$ if $\chi$ is not
principal. The main step in the proof of Dirichlet's Theorem is the
following lemma. We defer the proof of the lemma until we have shown
how the lemma implies the theorem.
Lemma 7.11. $L(1,\chi) \neq 0$ if $\chi$ is not principal.
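Proof of Theorem 7.1. First we show for each Dirichlet character
$\chi$ modulo $m$ that
\[ \log L(s,\chi) = \sum_{p\ \mathrm{prime}}\frac{\chi(p)}{p^s} + g(s,\chi) \tag{7.30} \]
for real $s > 1$, where $g(s,\chi)$ remains bounded as $s \downarrow 1$. In this statement
we have not yet specified a branch of the logarithm, and we shall choose
it presently. Fix $p$ and define, for $s > 1$, a value of the logarithm of the
$p$th factor of (7.27) by
\[ \log\frac{1}{1 - \dfrac{\chi(p)}{p^s}} = \frac{\chi(p)}{p^s} + \frac12\frac{\chi(p^2)}{p^{2s}} + \frac13\frac{\chi(p^3)}{p^{3s}} + \cdots = \frac{\chi(p)}{p^s} + g(s,p,\chi). \tag{7.31} \]
Since $\big|\chi(p)/p^s\big| \le \tfrac12$, this logarithm is controlled by the estimate
\[ \Big|\log\frac{1}{1-z} - z\Big| \le |z|\sup_{|w|\le|z|}\Big|\frac{d}{dw}\Big(\log\frac{1}{1-w} - w\Big)\Big| = |z|\sup_{|w|\le|z|}\Big|\frac{1}{1-w} - 1\Big| \le 2|z|^2 \]
for $|z| \le \tfrac12$. With $z = \chi(p)/p^s$, we therefore have
\[ |g(s,p,\chi)| \le 2\Big|\frac{\chi(p)}{p^s}\Big|^2 \le 2p^{-2}. \]
Since $\sum_{p\ \mathrm{prime}} p^{-2} \le \sum_{n=1}^{\infty} n^{-2} < \infty$, the series $\sum_p g(s,p,\chi)$ is
uniformly convergent for $s > 1$. Let $g(s,\chi)$ be the continuous function
$\sum_p g(s,p,\chi)$. Summing (7.31) over primes $p$, we obtain
\[ \sum_{p}\log\frac{1}{1 - \dfrac{\chi(p)}{p^s}} = \sum_{p}\frac{\chi(p)}{p^s} + g(s,\chi). \]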
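By (7.27) the left side represents a branch of $\log L(s,\chi)$. This proves
(7.30).
Define a function $\delta_b$ on the positive integers by
\[ \delta_b(n) = \begin{cases} 1 & \text{if } n \equiv b \bmod m \\ 0 & \text{otherwise.} \end{cases} \]
Proposition 7.9 gives
\[ \delta_b(n) = \frac{1}{\varphi(m)}\sum_{\chi}\overline{\chi(b)}\,\chi(n). \]
Multiplying (7.30) by $\overline{\chi(b)}$ and summing on $\chi$, we obtain
\[ \varphi(m)\sum_{\substack{p\ \mathrm{prime},\\ p = km+b}}\frac{1}{p^s} = \sum_{\chi}\overline{\chi(b)}\log L(s,\chi) - \sum_{\chi}\overline{\chi(b)}\,g(s,\chi). \tag{7.32} \]
The term $\sum_\chi \overline{\chi(b)}\,g(s,\chi)$ is bounded as $s \downarrow 1$, by (7.30). The term
$\overline{\chi_0(b)}\log L(s,\chi_0)$ is unbounded as $s \downarrow 1$, by Proposition 7.10c. For $\chi$
nonprincipal, the term $\overline{\chi(b)}\log L(s,\chi)$ is bounded as $s \downarrow 1$, by
Proposition 7.10b and Lemma 7.11. Therefore the left side of (7.32) is
unbounded as $s \downarrow 1$. Hence the number of primes contributing to the sum
is infinite.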
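The character expansion of $\delta_b$ used in this proof can be checked numerically. The sketch below is only an illustration under a simplifying assumption that the text does not make: the modulus is an odd prime $p$, so that $\mathbb{Z}_p^\times$ is cyclic and its characters can be written down from a primitive root. The function names are hypothetical.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group Z_p^x, for an odd prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")

def dirichlet_characters(p):
    """The p - 1 Dirichlet characters modulo an odd prime p, as functions on Z."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}   # discrete logarithm table
    def make(r):
        def chi(n):
            n %= p
            if n == 0:
                return 0j                            # chi vanishes when GCD(n, p) > 1
            return cmath.exp(2j * cmath.pi * r * log[n] / (p - 1))
        return chi
    return [make(r) for r in range(p - 1)]

def delta_b(n, b, p, chars):
    """Right side of the expansion: (1/phi(p)) * sum_chi conj(chi(b)) * chi(n)."""
    return sum(chi(b).conjugate() * chi(n) for chi in chars) / (p - 1)

chars7 = dirichlet_characters(7)
```

For $b$ prime to 7, the value `delta_b(n, b, 7, chars7)` should be 1 when $n \equiv b \bmod 7$ and 0 otherwise, up to rounding.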
Proof of Lemma 7.11. Let $Z(s) = \prod_\chi L(s,\chi)$. Exactly one factor
of $Z(s)$ has a pole at $s = 1$, according to Proposition 7.10. If any factor
has a zero at $s = 1$, then $Z(s)$ is analytic for Re $s > 0$. Assuming that
$Z(s)$ is indeed analytic, we shall derive a contradiction.
Being the finite product of absolutely convergent Dirichlet series for
Re $s > 1$, $Z(s)$ is given by an absolutely convergent Dirichlet series. We
shall prove that the coefficients of this series are $\ge 0$. More precisely we
shall prove for Re $s > 1$ that
\[ Z(s) = \prod_{p \nmid m}\frac{1}{\big(1 - p^{-f(p)s}\big)^{g(p)}}, \tag{7.33} \]
where $f(p)$ is the order of $p$ in $\mathbb{Z}_m^\times$ and where $g(p) = \varphi(m)/f(p)$. The
factor $(1 - p^{-f(p)s})^{-1}$ is given by a Dirichlet series with all coefficients
$\ge 0$. Hence so is the $g(p)$th power, and so is the product over $p$ of the
result. Thus (7.33) will prove that all coefficients of $Z(s)$ are $\ge 0$.
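To prove (7.33), we write, for Re $s > 1$,
\[ Z(s) = \prod_{\chi} L(s,\chi) = \prod_{\chi}\prod_{p \nmid m}\frac{1}{1 - \dfrac{\chi(p)}{p^s}} = \prod_{p \nmid m}\Big(\prod_{\chi}\frac{1}{1 - \dfrac{\chi(p)}{p^s}}\Big). \]
Fix $p$ not dividing $m$. We shall show that
\[ \prod_{\chi}\Big(1 - \frac{\chi(p)}{p^s}\Big) = \Big(1 - \frac{1}{p^{fs}}\Big)^{g}, \tag{7.34} \]
where $f$ is the order of $p$ in $\mathbb{Z}_m^\times$ and where $g = \varphi(m)/f$; then (7.33) will
follow.
Now $\chi \mapsto \chi(p)$ is a homomorphism of $\widehat{\mathbb{Z}_m^\times}$ into $S^1$, hence into $\{e^{2\pi ik/f}\}$
and onto some cyclic subgroup $\{e^{2\pi ik/f'}\}$ with $f'$ dividing $f$. We show
$f' = f$. In fact if $f' < f$, then $p^{f'} \not\equiv 1 \bmod m$, while $\chi(p^{f'}) = \chi(p)^{f'} =
1$ for all $\chi$. Since $\chi(p^{f'}) = \chi(1)$ for all $\chi$, the $\chi$'s cannot span all functions
on $\mathbb{Z}_m^\times$, in contradiction to Proposition 7.9.
Thus $\chi \mapsto \chi(p)$ is onto $\{e^{2\pi ik/f}\}$. In other words, $\chi(p)$ takes on all
$f$th roots of unity as values, and the homomorphism property ensures
that each is taken on the same number of times, namely $g = \varphi(m)/f$
times. If $X$ is an indeterminate, we then have
\[ \prod_{\chi}\big(1 - \chi(p)X\big) = \Big(\prod_{k=0}^{f-1}\big(1 - e^{2\pi ik/f}X\big)\Big)^{g} = (1 - X^f)^g. \]
Then (7.34) follows and so does (7.33). Hence all the coefficients of the
Dirichlet series of $Z(s)$ are $\ge 0$.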
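The polynomial identity $\prod_\chi (1 - \chi(p)X) = (1 - X^f)^g$ can be tested coefficient by coefficient. The following sketch assumes, purely for convenience, that the modulus $m$ is an odd prime (so the characters come from a primitive root); all function names are our own.

```python
import cmath
from math import comb

def characters_mod_prime(m):
    """Dirichlet characters modulo an odd prime m, built from a primitive root."""
    g = next(a for a in range(2, m)
             if len({pow(a, k, m) for k in range(1, m)}) == m - 1)
    log = {pow(g, k, m): k for k in range(m - 1)}
    def make(r):
        return lambda n: 0j if n % m == 0 else cmath.exp(
            2j * cmath.pi * r * log[n % m] / (m - 1))
    return [make(r) for r in range(m - 1)]

def order_mod(p, m):
    """Order f of p in Z_m^x (p must be prime to m)."""
    if p % m == 0:
        raise ValueError("p must not be divisible by m")
    f, x = 1, p % m
    while x != 1:
        x = x * p % m
        f += 1
    return f

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def lhs_poly(p, m):
    """Coefficients of the product over chi of (1 - chi(p) X)."""
    poly = [1 + 0j]
    for chi in characters_mod_prime(m):
        poly = poly_mul(poly, [1, -chi(p)])
    return poly

def rhs_poly(p, m):
    """Coefficients of (1 - X^f)^g with f = order of p and g = phi(m)/f."""
    f = order_mod(p, m)
    g = (m - 1) // f
    coeffs = [0] * m                      # degree f*g = m - 1
    for k in range(g + 1):
        coeffs[f * k] = (-1) ** k * comb(g, k)
    return coeffs
```

For example, with $m = 13$ the primes $p = 2, 3, 5$ have orders 12, 3, 4 respectively, and in each case `lhs_poly(p, 13)` should match `rhs_poly(p, 13)` up to rounding.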
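Let us write $Z(s) = \sum_{n=1}^{\infty} a_n/n^s$ and put
\[ s_0 = \inf\Big\{ s \ge 0 \ \Big|\ \sum_{n=1}^{\infty}\frac{a_n}{n^s} \text{ converges}\Big\}. \]
Then $s_0 \le 1$. Suppose $s_0 > 0$. Since $\sum a_n n^{-s}$ converges uniformly on
compact sets for Re $s > s_0$ (by Proposition 7.2a), we can compute its
derivative term by term. Thus
\[ Z^{(N)}(s_0 + 1) = \sum_{n=1}^{\infty}\frac{a_n(-\log n)^N}{n^{s_0+1}}. \tag{7.35} \]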
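The Taylor series of $Z(s)$ about $s_0 + 1$ is
\[ Z(s) = \sum_{N=0}^{\infty}\frac{1}{N!}(s - s_0 - 1)^N Z^{(N)}(s_0 + 1) \]
and is convergent at $s = \tfrac12 s_0$ since $Z(s)$ is analytic in the open disc
centered at $s_0 + 1$ and having radius $s_0 + 1$. Thus
\[ Z(\tfrac12 s_0) = \sum_{N=0}^{\infty}\frac{1}{N!}\big(1 + \tfrac12 s_0\big)^N(-1)^N Z^{(N)}(s_0 + 1), \]
with the series convergent. Substituting from (7.35), we have
\[ Z(\tfrac12 s_0) = \sum_{N=0}^{\infty}\sum_{n=1}^{\infty}\frac{1}{N!}\big(1 + \tfrac12 s_0\big)^N\frac{a_n(\log n)^N}{n^{s_0+1}}. \]
This is a series with nonnegative terms, and Fubini's Theorem allows us to
interchange sums and obtain
\[ Z(\tfrac12 s_0) = \sum_{n=1}^{\infty}\frac{a_n}{n^{s_0+1}}\sum_{N=0}^{\infty}\frac{1}{N!}\Big((\log n)\big(1 + \tfrac12 s_0\big)\Big)^N = \sum_{n=1}^{\infty}\frac{a_n}{n^{s_0+1}}\,e^{(\log n)(1 + \frac12 s_0)} = \sum_{n=1}^{\infty}\frac{a_n}{n^{\frac12 s_0}}. \]
In other words, the assumption $s_0 > 0$ led to a point between 0 and
$s_0$ (namely $\tfrac12 s_0$) where there is convergence. This contradiction proves
that $s_0 = 0$. Therefore $\sum a_n n^{-s}$ converges for Re $s > 0$.
Since the coefficients are nonnegative, the convergence is absolute for $s$
real and positive. By Proposition 7.2b the convergence is absolute for
Re $s > 0$. Therefore the Euler product expansion (7.33) is valid for
Re $s > 0$.
For $p \nmid m$ and for real $s > 0$, we have
\[ \frac{1}{\big(1 - p^{-fs}\big)^{g}} = \big(1 + p^{-fs} + p^{-2fs} + \cdots\big)^{g} \ge 1 + p^{-fgs} + p^{-2fgs} + \cdots = 1 + p^{-\varphi(m)s} + p^{-2\varphi(m)s} + \cdots = \frac{1}{1 - p^{-\varphi(m)s}}. \]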
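Therefore
\[ \prod_{\chi} L(s,\chi)\ \prod_{p \mid m}\frac{1}{1 - p^{-\varphi(m)s}} \ \ge\ \prod_{p\ \mathrm{prime}}\big(1 + p^{-\varphi(m)s} + p^{-2\varphi(m)s} + \cdots\big) = \sum_{n=1}^{\infty}\frac{1}{n^{\varphi(m)s}}. \]
The sum on the right is $+\infty$ for $s = 1/\varphi(m)$, while the left side is finite.
This contradiction completes the proof of the lemma.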
5. Analytic Properties of Dirichlet L Functions
The Riemann zeta function $\zeta(s)$ and the Dirichlet L functions $L(s,\chi)$
encode arithmetic information, and Dirichlet's Theorem comes out as a
consequence of properties of these functions. The properties in question
are the analytic continuation and the behavior at $s = 1$ for $\zeta(s)$ and
each $L(s,\chi)$. Other theorems about primes can be deduced from other
properties of these functions; for example, the Prime Number Theorem
about the asymptotic number of primes $\le N$ is a consequence of the
nonvanishing of $\zeta(s)$ for Re $s = 1$.
In the case of elliptic curves over Q, we shall introduce in a later
chapter a zeta function and an L function that are defined by Euler products
and encode geometric information, and deep properties of the elliptic
curve come out (partly conjecturally) as a consequence of properties of
these functions. This is part of a general pattern in algebraic number
theory and algebraic geometry, that L functions are used to encode
information prime by prime and that properties of these L functions are
expected to yield deep insights into the original problem being studied.
Let us refer generally to such arithmetic or geometric L functions as
motivic, for reasons explained in the preface. The question is how to get
an analytic handle on these functions in order to exploit their properties.
For $\zeta(s)$ and $L(s,\chi)$, we obtained enough information by direct
arguments to prove Theorem 7.1. But for more subtle analytic properties,
one uses an indirect approach. The key idea is to relate such a function
to a suitable analytic function in the upper half plane with certain
transformation properties, a so-called "modular form." Each modular form
has an L function, defined explicitly by a Mellin transform and given
as a Dirichlet series. We shall refer generally to this kind of L function as
automorphic. Automorphic L functions have more manageable
analytic properties, but they initially have little to do with algebraic
number theory or algebraic geometry. The fundamental objective is to
prove that motivic L functions are automorphic.
In the case of $\zeta(s)$ and $L(s,\chi)$, we shall achieve this objective in
this section, using suitable theta functions as the modular forms. The
analytic properties that we shall derive as a consequence are their
meromorphic continuations to all of $\mathbb{C}$, the functional equations that they
satisfy, and the identifications of their poles.
In the case of the L function of an elliptic curve over Q, the
assertion that this L function is always automorphic is substantially the
Taniyama-Weil Conjecture. This theme will occupy us for the remaining
chapters of the book. If the L function is automorphic in the expected
way, it extends to an entire function and satisfies a functional equation.
In particular, this L(s) must extend to be defined in a neighborhood
of s = 1. As mentioned in Chapter I, the Birch and Swinnerton-Dyer
Conjecture relates the behavior of L(s) near s = 1 to the rank of the
group of Q points of the elliptic curve, among other things.
We begin by treating $\zeta(s)$, returning to $L(s,\chi)$ afterward. The theta
function that we shall use is
\[ \theta(\tau) = \sum_{n=-\infty}^{\infty} e^{i\pi n^2\tau} = 1 + 2\sum_{n=1}^{\infty} e^{i\pi n^2\tau}, \tag{7.36} \]
analytic for Im $\tau > 0$. Its role in helping us to understand $\zeta(s)$ comes
from the identity
\[ \int_0^{\infty} e^{-n^2\pi\sigma}\sigma^{\frac12 s - 1}\,d\sigma = \int_0^{\infty} e^{-x}(n^2\pi)^{-(\frac12 s - 1)}x^{\frac12 s - 1}(n^2\pi)^{-1}\,dx = n^{-s}\,\Gamma(\tfrac12 s)\,\pi^{-\frac12 s}. \tag{7.37} \]
Summing on $n$ for $n \ge 1$ and interchanging sum and integral by Fubini's
Theorem, we obtain
\[ \zeta(s)\Gamma(\tfrac12 s)\pi^{-\frac12 s} = \int_0^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12 s - 1}\,d\sigma \tag{7.38} \]
for Re $s > 1$. In the terminology that we shall use in later chapters,
$\zeta(s)$ is, except for harmless factors, the "Mellin transform" of $\theta(i\sigma) - 1$,
and $\theta(\tau)$ is a modular form of "level" 2 and "weight" $\tfrac12$. Except for a
growth condition, these modular form properties are the content of the
following theorem.
Theorem 7.12. The analytic function $\theta(\tau)$, defined for Im $\tau > 0$,
satisfies
(a) $\theta(\tau + 2) = \theta(\tau)$ (level 2)
(b) $\theta(-1/\tau) = (\tau/i)^{1/2}\,\theta(\tau)$, (weight $\tfrac12$)
where the square root is the principal value cut on the negative real axis.
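Both transformation laws are easy to test numerically, since the series (7.36) converges very rapidly when Im $\tau$ is not small. The sketch below is only a sanity check, not a proof; the truncation length `terms` and the sample point `tau` are arbitrary choices of ours. Python's `cmath.sqrt` uses the principal branch with its cut along the negative real axis, which matches the convention in the theorem.

```python
import cmath

def theta(tau, terms=60):
    """Partial sum of theta(tau) = 1 + 2 * sum_{n>=1} exp(i*pi*n^2*tau), Im tau > 0."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    return 1 + 2 * sum(cmath.exp(1j * cmath.pi * n * n * tau)
                       for n in range(1, terms + 1))

tau = 0.3 + 0.8j                           # arbitrary sample point
lhs_b = theta(-1 / tau)                    # theta(-1/tau)
rhs_b = cmath.sqrt(tau / 1j) * theta(tau)  # (tau/i)^{1/2} * theta(tau)
shift_error = abs(theta(tau + 2) - theta(tau))
```

The quantities `abs(lhs_b - rhs_b)` and `shift_error` should both be at the level of floating-point rounding.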
Before proving this theorem, let us derive some analytic
consequences for $\zeta(s)$.
Corollary 7.13. The Riemann zeta function $\zeta(s)$, initially defined
by (7.1) for Re $s > 1$, extends to be meromorphic in $\mathbb{C}$. Its only pole is
at $s = 1$, and that pole is simple. Moreover
\[ \Lambda(s) = \zeta(s)\Gamma(\tfrac12 s)\pi^{-\frac12 s} \tag{7.39} \]
satisfies the functional equation
\[ \Lambda(1 - s) = \Lambda(s). \tag{7.40} \]
Remark. Notice that if $\zeta(s)$ is defined only for Re $s > 1$, then
$\Lambda(1 - s)$ and $\Lambda(s)$ have disjoint domains and (7.40) has no content. It
is necessary first to extend $\zeta(s)$ to Re $s > 0$. Then (7.40) is meaningful
for $0 <$ Re $s < 1$ and extends to an identity on $\mathbb{C}$. The extension of $\zeta(s)$
to Re $s > 0$ has already been carried out in Proposition 7.3, but a little
care is required to interpret (7.38) as an identity in this whole region,
not just in Re $s > 1$.
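Proof. For $\sigma \ge 1$,
\[ \tfrac12[\theta(i\sigma) - 1] = \sum_{n=1}^{\infty} e^{-\pi n^2\sigma} \le \sum_{k=1}^{\infty} e^{-\pi k\sigma} = \frac{e^{-\pi\sigma}}{1 - e^{-\pi\sigma}} \le (1 - e^{-\pi})^{-1}e^{-\pi\sigma}, \]
and thus
\[ \int_1^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12 s - 1}\,d\sigma \tag{7.41} \]
converges for all $s \in \mathbb{C}$ and defines an entire function. For real $s > 1$,
we rewrite (7.38) as
\[ \Lambda(s) = \int_0^1\tfrac12\theta(i\sigma)\sigma^{\frac12 s - 1}\,d\sigma - \tfrac12\int_0^1\sigma^{\frac12 s - 1}\,d\sigma + \int_1^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12 s - 1}\,d\sigma. \]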
On the right side the second term equals $-1/s$. The first term, by
Theorem 7.12b, is
\[ \int_0^1\tfrac12\theta(i/\sigma)\,\sigma^{\frac12 s - \frac12 - 1}\,d\sigma. \]
The change of variables $\sigma \to 1/\sigma$ shows that this is
\[ \int_1^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12(1-s) - 1}\,d\sigma - \frac{1}{1-s}. \]
Therefore
\[ \Lambda(s) = \int_1^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12 s - 1}\,d\sigma + \int_1^{\infty}\tfrac12[\theta(i\sigma) - 1]\sigma^{\frac12(1-s) - 1}\,d\sigma - \frac{1}{1-s} - \frac{1}{s}. \tag{7.42} \]
The first two terms extend to be entire, by (7.41), and the other two
terms extend to be meromorphic in $\mathbb{C}$ with poles at $s = 0$ and $s = 1$,
both simple. Therefore $\Lambda(s)$ extends to be meromorphic in $\mathbb{C}$ with poles
only at $s = 0$ and $s = 1$. Since (7.42) is unchanged under $s \to 1 - s$,
(7.40) follows. From (7.39) and the meromorphic nature of $\Lambda(s)$, we see
that $\zeta(s)$ is meromorphic in $\mathbb{C}$. Since $\Gamma(\tfrac12 s)$ is nowhere vanishing, $\zeta(s)$
can have poles only at $s = 0$ and $s = 1$. Since $\Gamma(\tfrac12 s)$ and $\Lambda(s)$ both have
simple poles at $s = 0$, $\zeta(s)$ is in fact analytic at $s = 0$.
We turn to the proof of Theorem 7.12. The tool is the Poisson
Summation Formula on $\mathbb{R}^1$, which is a result about the Fourier transform.
The Fourier transform on $\mathbb{R}^1$ is the operator on Lebesgue integrable
functions given by
\[ \hat f(u) = \int_{-\infty}^{\infty} e^{-2\pi itu}f(t)\,dt. \]
If also $f$ is continuous and $\hat f$ is integrable, then the Fourier inversion
formula says
\[ f(t) = \int_{-\infty}^{\infty} e^{2\pi itu}\hat f(u)\,du. \]
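For our purposes a useful class of integrable functions is the space of
Schwartz functions
\[ \mathcal{S}(\mathbb{R}^1) = \Big\{ f \in C^{\infty}(\mathbb{R}^1) \ \Big|\ P(t)\frac{d^k f}{dt^k}(t) \text{ is bounded for each integer } k \ge 0 \text{ and each polynomial } P \Big\}. \]
The Fourier transform carries $\mathcal{S}(\mathbb{R}^1)$ one-one onto $\mathcal{S}(\mathbb{R}^1)$.
Theorem 7.14 (Poisson Summation Formula). If $f$ is in $\mathcal{S}(\mathbb{R}^1)$, then
\[ \sum_{n=-\infty}^{\infty} f(x + n) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{2\pi inx}. \]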
Proof. Define $F(x) = \sum_{n=-\infty}^{\infty} f(x + n)$. From the definition of
$\mathcal{S}(\mathbb{R}^1)$, it is easy to check that this series is uniformly convergent and so
is the series of $k$th derivatives, for each $k$. Consequently the function $F$
is well defined and $C^{\infty}$, and it is periodic of period one. Such a function
is the sum of its Fourier series:
\[ F(x) = \sum_{n=-\infty}^{\infty}\Big(\int_0^1 F(t)e^{-2\pi int}\,dt\Big)e^{2\pi inx}. \tag{7.43} \]
The Fourier coefficient in parentheses above is
\[ \begin{aligned} \int_0^1 F(t)e^{-2\pi int}\,dt &= \int_0^1\sum_{k=-\infty}^{\infty} f(t + k)e^{-2\pi int}\,dt \\ &= \sum_{k=-\infty}^{\infty}\int_0^1 f(t + k)e^{-2\pi int}\,dt \\ &= \sum_{k=-\infty}^{\infty}\int_k^{k+1} f(t)e^{-2\pi int}\,dt \\ &= \int_{-\infty}^{\infty} f(t)e^{-2\pi int}\,dt \\ &= \hat f(n). \end{aligned} \tag{7.44} \]
The theorem follows by substituting (7.44) into (7.43).
For Theorem 7.12 the Poisson Summation Formula is to be applied to
$f(t) = e^{-\pi t^2}$ and its dilates. In treating Dirichlet L functions, we shall
apply the formula also to $f(t) = te^{-\pi t^2}$. The relevant Fourier transforms
are given in the next proposition.
Proposition 7.15.
(a) $(e^{-\pi t^2})^{\wedge} = e^{-\pi u^2}$
(b) $(te^{-\pi t^2})^{\wedge} = -iue^{-\pi u^2}$.
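Taking the two transforms of Proposition 7.15 as given, one can check the Poisson Summation Formula (Theorem 7.14) numerically for these two functions. The sketch below is an illustration only; the helper `poisson_sides` and the truncation at 30 terms on each side are our own choices, justified by the Gaussian decay of both series.

```python
import cmath
import math

def poisson_sides(f, f_hat, x, terms=30):
    """Return truncations of sum_n f(x + n) and sum_n f_hat(n) * exp(2*pi*i*n*x)."""
    ns = range(-terms, terms + 1)
    lhs = sum(f(x + n) for n in ns)
    rhs = sum(f_hat(n) * cmath.exp(2j * math.pi * n * x) for n in ns)
    return lhs, rhs

# f(t) = exp(-pi t^2), with transform exp(-pi u^2) by Proposition 7.15a.
gauss = lambda t: math.exp(-math.pi * t * t)
gauss_hat = lambda u: math.exp(-math.pi * u * u)

# f(t) = t exp(-pi t^2), with transform -i u exp(-pi u^2) by Proposition 7.15b.
odd = lambda t: t * math.exp(-math.pi * t * t)
odd_hat = lambda u: -1j * u * math.exp(-math.pi * u * u)

sample_points = (0.0, 0.3, 0.75)          # arbitrary values of x
results = [(poisson_sides(gauss, gauss_hat, x), poisson_sides(odd, odd_hat, x))
           for x in sample_points]
```

For each sample point the two sides of the formula should agree up to rounding, for both the even and the odd function.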
Proof. For (a) we form the line integral over a rectangle in $\mathbb{C}$ of
the entire function $e^{-\pi z^2}$. The rectangle extends from $-N$ to $N$ in the
$x$ direction and from 0 to $u$ in the $y$ direction. The Cauchy Integral
Theorem says that
\[ \int_{-N}^{N} e^{-\pi x^2}\,dx - \int_{-N}^{N} e^{-\pi(x+iu)^2}\,dx + (\text{contribution from each end}) = 0. \]
As $N \to \infty$, the contribution from each end tends to 0, and we obtain
\[ \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = \int_{-\infty}^{\infty} e^{-\pi(x+iu)^2}\,dx. \]
Multiplying by $e^{-\pi u^2}$, we see that
\[ (e^{-\pi t^2})^{\wedge}(u) = e^{-\pi u^2}\int_{-\infty}^{\infty} e^{-\pi x^2}\,dx. \]
The integral on the right side is 1, as we see by evaluating its square
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\pi(x^2+y^2)}\,dx\,dy$ in polar coordinates. This proves (a). Part
(b) follows by differentiating both sides of the identity
\[ \int_{-\infty}^{\infty} e^{-\pi t^2}e^{-2\pi itu}\,dt = e^{-\pi u^2} \]
with respect to $u$.
Proof of Theorem 7.12. Conclusion (a) is immediate from (7.36).
For (b), let $f(t) = e^{-\pi t^2}$ and define $f_r(t) = f(r^{-1}t)$ for $r > 0$. Changing
variables in the definition of Fourier transform, we immediately obtain
\[ \hat f_r(u) = r\hat f(ru). \tag{7.45} \]
For our particular $f$, Proposition 7.15a therefore gives
\[ f_r(t) = e^{-\pi r^{-2}t^2} \quad\text{and}\quad \hat f_r(u) = re^{-\pi r^2u^2}. \]
Applying Theorem 7.14 to $f_r$ with $x = 0$, we find that
\[ \sum_{n=-\infty}^{\infty} e^{-\pi r^{-2}n^2} = r\sum_{n=-\infty}^{\infty} e^{-\pi r^2n^2}. \]
With $r = \sigma^{1/2}$ and $\sigma > 0$, this identity says
\[ \theta(i/\sigma) = \sigma^{1/2}\,\theta(i\sigma). \]
This is conclusion (b) for $\tau = i\sigma$, and (b) follows by analytic
continuation.
The remainder of this section will give a parallel development for
Dirichlet L functions. There is one complication, namely in isolating
what Dirichlet characters to consider. For example, the L function for
the principal Dirichlet character $\chi_0$ modulo $m$ has an Euler product that
is the same as the one for $\zeta(s)$ except that the factors where $p \mid m$ are
dropped. So we should be content with having the functional equation
for $\zeta(s)$ yield a functional equation for $L(s,\chi_0)$. Here are two subtler
examples.
Example 1. Let $m = 8$ and define $\chi(1) = \chi(5) = 1$ and $\chi(3) =
\chi(7) = -1$. The resulting L function coincides with $L(s,\chi')$, where $\chi'$
is given with $m = 4$ by $\chi'(1) = 1$ and $\chi'(3) = -1$.
Example 2. Let $m = 6$ and define $\chi(1) = 1$ and $\chi(5) = -1$. Also let
$m = 3$ and define $\chi'(1) = 1$ and $\chi'(2) = -1$. Then
\[ L(s,\chi) = \big(1 + 2^{-s}\big)L(s,\chi'). \]
Again one should be content with having the functional equation for one
of these L functions yield the functional equation for the other.
We say that two nonprincipal Dirichlet characters $\chi$ modulo $m$ and $\chi'$
modulo $m'$ are associate if $\chi(p) = \chi'(p)$ for all but finitely many primes.
This is an equivalence relation. The conductor of an equivalence
class is the least integer $m''$ such that the class contains a nonprincipal
Dirichlet character modulo $m''$. In Example 1, $\chi$ and $\chi'$ are associate
with conductor 4; in Example 2, $\chi$ and $\chi'$ are associate with conductor
3. A Dirichlet character modulo $m$ is primitive if its conductor is $m$.
If $\chi$ is a Dirichlet character modulo $m$, we can convert $\chi$ into a
Dirichlet character $\chi^{\#}$ modulo $am$ by defining
\[ \chi^{\#}(j) = \begin{cases} \chi(j) & \text{if } \mathrm{GCD}(a,j) = 1 \\ 0 & \text{if } \mathrm{GCD}(a,j) > 1. \end{cases} \]
Let us call $\chi^{\#}$ an extension of $\chi$. If $\chi^{\#}$ is an extension of $\chi$, then $\chi$
and $\chi^{\#}$ are associate.
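The notion of extension can be made concrete on Examples 1 and 2. In the sketch below the characters are represented as ordinary Python functions on the integers; the names `extend`, `chi_mod4`, `chi_mod8`, `chi_mod3`, and `chi_mod6` are ours. The function `extend` implements the displayed definition of the extension from modulus $m$ to modulus $am$.

```python
from math import gcd

def extend(chi, a):
    """Extension of a Dirichlet character chi modulo m to modulus a*m:
    keep chi(j) when GCD(a, j) = 1 and set the value to 0 otherwise."""
    return lambda j: chi(j) if gcd(a, j) == 1 else 0

def chi_mod4(j):
    """The character chi' of Example 1 (modulus 4)."""
    return {1: 1, 3: -1}.get(j % 4, 0)

def chi_mod8(j):
    """The character chi of Example 1 (modulus 8)."""
    return {1: 1, 5: 1, 3: -1, 7: -1}.get(j % 8, 0)

def chi_mod3(j):
    """The character chi' of Example 2 (modulus 3)."""
    return {1: 1, 2: -1}.get(j % 3, 0)

def chi_mod6(j):
    """The character chi of Example 2 (modulus 6)."""
    return {1: 1, 5: -1}.get(j % 6, 0)

ext_4_to_8 = extend(chi_mod4, 2)   # extension of chi' from modulus 4 to modulus 8
ext_3_to_6 = extend(chi_mod3, 2)   # extension of chi' from modulus 3 to modulus 6
```

In both examples the character of larger modulus should coincide with the extension of the character of smaller modulus, which is consistent with the two characters being associate.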
Suppose $\chi$ and $\chi'$ are Dirichlet characters modulo the same $m$ that are
associate. Let us prove that $\chi = \chi'$. In fact, suppose $\mathrm{GCD}(m,b) = 1$.
By Theorem 7.1 there are infinitely many primes $p$ with $p \equiv b \bmod m$.
For all but finitely many of them, hence for at least one of them, we
have $\chi(p) = \chi'(p)$. Therefore $\chi(b) = \chi'(b)$. If $\mathrm{GCD}(m,b) > 1$, then
$\chi(b) = \chi'(b) = 0$. Hence $\chi = \chi'$.
Consequently an equivalence class of associate Dirichlet characters
contains a unique primitive Dirichlet character.
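Proposition 7.16. If $\chi_m$ is a Dirichlet character modulo $m$ and
$\chi_{m'}$ is an associate Dirichlet character modulo $m'$, then there exists a
Dirichlet character $\chi_{\gcd}$ modulo $\mathrm{GCD}(m,m')$ such that $\chi_m$ and $\chi_{m'}$
are extensions of $\chi_{\gcd}$.
Proof. Write $m = \prod p_j^{a_j}$ and $m' = \prod p_j^{a_j'}$. Put $n_j = \min(a_j,a_j')$ and
$N_j = \max(a_j,a_j')$. By the Chinese Remainder Theorem, $\mathbb{Z}_m \cong \bigoplus \mathbb{Z}_{p_j^{a_j}}$
as rings, and thus $\mathbb{Z}_m^\times \cong \bigoplus \mathbb{Z}_{p_j^{a_j}}^\times$. The proof of Proposition 7.9 then
shows that $\widehat{\mathbb{Z}_m^\times} \cong \bigoplus \widehat{\mathbb{Z}_{p_j^{a_j}}^\times}$. Let us write $\chi_m = (\ldots, \chi_{p_j^{a_j}}, \ldots)$. The
extension of $\chi_m$ to $\mathbb{Z}_{\mathrm{LCM}(m,m')}^\times$ is accomplished by extending each $\chi_{p_j^{a_j}}$
to $\mathbb{Z}_{p_j^{N_j}}^\times$. Thus we have
\[ \chi_m^{\#} = (\ldots, \chi_{p_j^{a_j}}^{\#}, \ldots). \]
We argue similarly with $\chi_{m'}$, obtaining a similar formula. Since $\chi_m$
and $\chi_{m'}$ are associate, so are $\chi_m^{\#}$ and $\chi_{m'}^{\#}$. Then $\chi_m^{\#} = \chi_{m'}^{\#}$ since both
Dirichlet characters occur modulo $\mathrm{LCM}(m,m')$. This equality implies
$\chi_{p_j^{a_j}}^{\#} = \chi_{p_j^{a_j'}}^{\#}$ for all $j$. In other words, $\chi_{p_j^{N_j}} = \chi_{p_j^{n_j}}^{\#}$. Putting $\chi_{\gcd} =
(\ldots, \chi_{p_j^{n_j}}, \ldots)$, we see that $\chi_m$ and $\chi_{m'}$ are both extensions of $\chi_{\gcd}$.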
Corollary 7.17. Any Dirichlet character $\chi$ modulo $m$ is an extension
of the unique primitive Dirichlet character $\chi'$ with which it is associate.
Consequently the conductor of $\chi$ is a divisor of $m$.
Proof. We apply Proposition 7.16 to $\chi$ and $\chi'$. Then $\chi$ and $\chi'$ are
both extensions of $\chi_{\gcd}$. Since $\chi'$ is primitive, $\chi' = \chi_{\gcd}$. Thus $\chi$ is
an extension of $\chi_{\gcd} = \chi'$.
It follows that any $L(s,\chi)$ can be obtained by deleting finitely many
factors from the Euler product of some $L(s,\chi')$ with $\chi'$ primitive. For
purposes of deriving functional equations, it is therefore enough to treat
the primitive case. The way that "primitive" will enter the argument is
through the following lemma.
Lemma 7.18. If $\chi$ is a primitive Dirichlet character modulo $m$ and
if $c(m,\chi)$ denotes the Gauss sum
\[ c(m,\chi) = \sum_{k=0}^{m-1} e^{2\pi ik/m}\chi(k), \tag{7.46} \]
then
\[ \sum_{k=0}^{m-1} e^{2\pi ink/m}\chi(k) = \overline{\chi(n)}\,c(m,\chi) \tag{7.47} \]
for every integer $n$.
Remark. It is easy to check that the nonprimitive $\chi$ in Example 2
above does not satisfy (7.47) for $n = 2$.
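The remark, and the lemma itself in small cases, can be checked by direct computation. In the following sketch the characters are hard-coded as small lookup tables; `chi5` is one particular primitive character modulo 5 chosen by us for illustration, `chi4` is the primitive character modulo 4 of Example 1, and `chi6` is the nonprimitive character of Example 2. The function names are hypothetical.

```python
import cmath

def twisted_sum(chi, m, n):
    """Left side of (7.47): sum over k mod m of exp(2*pi*i*n*k/m) * chi(k)."""
    return sum(cmath.exp(2j * cmath.pi * n * k / m) * chi(k) for k in range(m))

def gauss_sum(chi, m):
    """The Gauss sum c(m, chi) of (7.46)."""
    return twisted_sum(chi, m, 1)

def satisfies_7_47(chi, m, tol=1e-9):
    """Check (7.47) for every residue n modulo m."""
    c = gauss_sum(chi, m)
    return all(
        abs(twisted_sum(chi, m, n) - complex(chi(n)).conjugate() * c) < tol
        for n in range(m)
    )

chi5 = lambda k: {1: 1, 2: 1j, 4: -1, 3: -1j}.get(k % 5, 0)   # primitive, modulus 5
chi4 = lambda k: {1: 1, 3: -1}.get(k % 4, 0)                  # primitive, modulus 4
chi6 = lambda k: {1: 1, 5: -1}.get(k % 6, 0)                  # nonprimitive, modulus 6
```

One expects `satisfies_7_47(chi5, 5)` and `satisfies_7_47(chi4, 4)` to hold, while for `chi6` the left side of (7.47) at $n = 2$ is $i\sqrt{3}$ and the right side is 0. The same helpers can be used to observe that $|c(m,\chi)|^2 = m$ in the primitive cases, in line with the remarks following Theorem 7.19.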
Proof. If $\mathrm{GCD}(n,m) = 1$, then
\[ \sum_{k=0}^{m-1} e^{2\pi ink/m}\chi(k)\chi(n) = \sum_{k=0}^{m-1} e^{2\pi ink/m}\chi(kn) = \sum_{k=0}^{m-1} e^{2\pi ik/m}\chi(k) = c(m,\chi). \]
Letting $n^{-1}$ denote an inverse of $n$ in $\mathbb{Z}_m$, we multiply through by
$\chi(n^{-1}) = \overline{\chi(n)}$ and obtain (7.47).
If $d = \mathrm{GCD}(n, m) > 1$, we are to prove that the left side of (7.47) is
0. Let $m' = m/d$. The natural map $\mathbb{Z}_m^\times \to \mathbb{Z}_{m'}^\times$ has kernel
\[ K = \{a \in \mathbb{Z}_m^\times \mid a \equiv 1 \bmod m'\}. \]
If $\chi|_K$ were to equal 1, then $\chi$ would descend to $\mathbb{Z}_{m'}^\times$ and define a
Dirichlet character $\chi'$ modulo $m'$ such that $\chi = \chi'^{\#}$. Since $\chi$ is primitive,
this descent cannot happen, and we conclude that $\chi|_K$ is a nontrivial
character of $K$.
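The left side of (7.47) is
\[ \sum_{k=0}^{m-1} e^{2\pi ink/m}\chi(k) = \sum_{r=0}^{m'-1} e^{2\pi i(n/d)r/m'}\sum_{\substack{0\le k\le m-1\\ k\equiv r \bmod m'}}\chi(k) = \sum_{r=0}^{m'-1} e^{2\pi i(n/d)r/m'}\chi(r)\sum_{a\in K}\chi(a), \]
where in the last expression $\chi(r)$ is understood as the value of $\chi$ at any
member of $\mathbb{Z}_m^\times$ congruent to $r$ modulo $m'$ (and the $r$th term is 0 if no
such member exists).
Since $\chi|_K$ is nontrivial, the inner sum is 0. Thus (7.47) is 0.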
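Theorem 7.19. Let $\chi$ be a primitive Dirichlet character modulo $m$,
and define $\theta(\tau,\chi)$ for Im $\tau > 0$ as
\[ \theta(\tau,\chi) = \begin{cases} \displaystyle\sum_{n=-\infty}^{\infty}\chi(n)e^{i\pi n^2\tau/m} = 2\sum_{n=1}^{\infty}\chi(n)e^{i\pi n^2\tau/m} & \text{if } \chi(-1) = 1 \\[2ex] \displaystyle\sum_{n=-\infty}^{\infty}\chi(n)\,n\,e^{i\pi n^2\tau/m} = 2\sum_{n=1}^{\infty}\chi(n)\,n\,e^{i\pi n^2\tau/m} & \text{if } \chi(-1) = -1. \end{cases} \]
Then
(a) $\theta(\tau + 2m,\chi) = \theta(\tau,\chi)$
(b) $\theta(-1/\tau,\chi) = \dfrac{c(m,\chi)}{\sqrt m}(\tau/i)^{1/2}\,\theta(\tau,\bar\chi)$ if $\chi(-1) = 1$
(b') $\theta(-1/\tau,\chi) = -i\,\dfrac{c(m,\chi)}{\sqrt m}(\tau/i)^{3/2}\,\theta(\tau,\bar\chi)$ if $\chi(-1) = -1$.
The Dirichlet L functions $L(s,\chi)$ and $L(s,\bar\chi)$ extend to be entire in $s$
and have the following corresponding properties:
(c) If $\chi(-1) = 1$, then $\Lambda(s,\chi) = L(s,\chi)\,m^{\frac12 s}\,\Gamma(\tfrac12 s)\,\pi^{-\frac12 s}$ satisfies
\[ \Lambda(s,\chi) = \frac{c(m,\chi)}{\sqrt m}\,\Lambda(1 - s,\bar\chi). \tag{7.48a} \]
(c') If $\chi(-1) = -1$, then $\Lambda(s,\chi) = L(s,\chi)\,m^{\frac12(s+1)}\,\Gamma(\tfrac12(s+1))\,\pi^{-\frac12(s+1)}$
satisfies
\[ \Lambda(s,\chi) = \frac{-i\,c(m,\chi)}{\sqrt m}\,\Lambda(1 - s,\bar\chi). \tag{7.48b} \]
Remarks. Applying (c) or (c') first to $\chi$ and then to $\bar\chi$, we see that
\[ c(m,\chi)\,c(m,\bar\chi) = m\chi(-1). \]
We readily check from the definition that
\[ \overline{c(m,\chi)} = \chi(-1)\,c(m,\bar\chi). \]
Consequently the coefficient $c(m,\chi)/\sqrt m$ in the functional equation
(7.48) has absolute value one.
Proof. Result (a) is clear. For (b), define
\[ f(t) = \begin{cases} e^{-\pi t^2} & \text{if } \chi(-1) = 1 \\ te^{-\pi t^2} & \text{if } \chi(-1) = -1. \end{cases} \tag{7.49} \]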
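The Poisson Summation Formula (Theorem 7.14) applied to $f$ with $x =
k/m$ gives
\[ \sum_{n=-\infty}^{\infty} f\Big(\frac{k}{m} + n\Big) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{2\pi ink/m}. \]
Multiplying by $\chi(k)$, summing on $k$ from 0 to $m - 1$, and using (7.47)
yields
\[ \sum_{n=-\infty}^{\infty}\sum_{k=0}^{m-1}\chi(k)f\Big(\frac{k}{m} + n\Big) = \sum_{n=-\infty}^{\infty}\hat f(n)\sum_{k=0}^{m-1} e^{2\pi ink/m}\chi(k) = c(m,\chi)\sum_{n=-\infty}^{\infty}\hat f(n)\overline{\chi(n)}. \tag{7.50} \]
The left side of (7.50) is
\[ = \sum_{n=-\infty}^{\infty}\sum_{k=0}^{m-1}\chi(k + mn)f\Big(\frac{k + mn}{m}\Big) = \sum_{k=-\infty}^{\infty}\chi(k)f\Big(\frac{k}{m}\Big). \]
Applying (7.45) to the resulting form of (7.50), we have
\[ \sum_{k=-\infty}^{\infty}\chi(k)f\Big(\frac{k}{rm}\Big) = c(m,\chi)\,r\sum_{n=-\infty}^{\infty}\hat f(rn)\overline{\chi(n)}. \tag{7.51} \]
If $\chi(-1) = 1$, we choose $f$ as in (7.49) and put $r = \sqrt{\sigma/m}$; application
of Proposition 7.15a yields
\[ \sum_{k=-\infty}^{\infty}\chi(k)e^{-\pi k^2\sigma^{-1}/m} = \frac{c(m,\chi)}{\sqrt m}\,\sigma^{1/2}\sum_{n=-\infty}^{\infty}\overline{\chi(n)}e^{-\pi n^2\sigma/m}. \]
This is (b) for $\tau = i\sigma$, and (b) for general $\tau$ follows by analytic
continuation.
If $\chi(-1) = -1$, we choose $f$ as in (7.49) and again put $r = \sqrt{\sigma/m}$;
application of Proposition 7.15b yields
\[ \sum_{k=-\infty}^{\infty}\chi(k)\,k\,e^{-\pi k^2\sigma^{-1}/m} = -i\,\frac{c(m,\chi)}{\sqrt m}\,\sigma^{3/2}\sum_{n=-\infty}^{\infty}\overline{\chi(n)}\,n\,e^{-\pi n^2\sigma/m}. \]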
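This is (b') for $\tau = i\sigma$, and (b') for general $\tau$ follows by analytic
continuation.
For (c), suppose $\chi(-1) = 1$. In analogy with (7.37), we have
\[ \begin{aligned} \Lambda(s,\chi) &= \sum_{k=1}^{\infty}\frac{\chi(k)}{k^s}\,m^{\frac12 s}\,\Gamma(\tfrac12 s)\,\pi^{-\frac12 s} \\ &= \int_0^{\infty}\sum_{k=1}^{\infty}\chi(k)e^{-\pi k^2\sigma/m}\sigma^{\frac12 s - 1}\,d\sigma \\ &= \int_0^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma \end{aligned} \tag{7.52} \]
for Re $s > 1$. The formal argument, disregarding convergence, is to
replace $\sigma$ by $\sigma^{-1}$ and then apply (b). The above expression becomes
\[ = \int_0^{\infty}\tfrac12\theta(i/\sigma,\chi)\sigma^{-\frac12 s - 1}\,d\sigma = \frac{c(m,\chi)}{\sqrt m}\,\Lambda(1 - s,\bar\chi) \]
by (7.52). To make this argument precise, we observe by the same
argument as for (7.41) that
\[ \int_1^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma \tag{7.53} \]
converges for all $s \in \mathbb{C}$ and defines an entire function. For Re $s > 1$, we
rewrite (7.52) as
\[ \Lambda(s,\chi) = \int_0^1\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma + \int_1^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma \]
and transform just the first term, replacing $\sigma$ by $\sigma^{-1}$ and applying (b).
The result is
\[ \Lambda(s,\chi) = \frac{c(m,\chi)}{\sqrt m}\int_1^{\infty}\tfrac12\theta(i\sigma,\bar\chi)\sigma^{\frac12(1-s) - 1}\,d\sigma + \int_1^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma. \tag{7.54} \]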
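In view of (7.53), both terms extend to entire functions of $s$, and
therefore $\Lambda(s,\chi)$ extends to be entire in $s$. Using (7.54) with $\bar\chi$, we have
\[ \frac{c(m,\chi)}{\sqrt m}\,\Lambda(1 - s,\bar\chi) = \frac{c(m,\chi)\,c(m,\bar\chi)}{m}\int_1^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12 s - 1}\,d\sigma + \frac{c(m,\chi)}{\sqrt m}\int_1^{\infty}\tfrac12\theta(i\sigma,\bar\chi)\sigma^{\frac12(1-s) - 1}\,d\sigma. \tag{7.55} \]
To prove (c), we need to show that the right sides of (7.54) and (7.55)
are equal, and this comes down to proving that
\[ c(m,\chi)\,c(m,\bar\chi) = m\chi(-1) \tag{7.56} \]
(as was asserted in the remarks before the proof). To prove (7.56), we
write
\[ \begin{aligned} c(m,\chi)\,c(m,\bar\chi) &= \sum_{k=0}^{m-1} e^{2\pi ik/m}\chi(k)\,c(m,\bar\chi) \\ &= \sum_{k=0}^{m-1} e^{2\pi ik/m}\sum_{l=0}^{m-1} e^{2\pi ikl/m}\,\overline{\chi(l)} && \text{by Lemma 7.18} \\ &= \sum_{k=0}^{m-1} e^{2\pi ik/m}\sum_{l=0}^{m-1} e^{-2\pi ikl/m}\,\overline{\chi(-l)} \\ &= \chi(-1)\sum_{l=0}^{m-1}\overline{\chi(l)}\sum_{k=0}^{m-1} e^{2\pi ik(1-l)/m} \\ &= \chi(-1)\sum_{l=0}^{m-1}\overline{\chi(l)}\,m\,\delta_{l,1} && \text{by Proposition 7.9} \\ &= m\chi(-1). \end{aligned} \]
This proves (7.56). The analyticity of $L(s,\chi)$ is obtained from that of
$\Lambda(s,\chi)$ by the same argument used with $\zeta(s)$ and $\Lambda(s)$ in Corollary 7.13,
and (c) follows.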
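For (c'), suppose $\chi(-1) = -1$. In analogy with (7.37) and (7.52), we
have
\[ \begin{aligned} \Lambda(s,\chi) &= \sum_{k=1}^{\infty}\frac{\chi(k)}{k^s}\,m^{\frac12(s+1)}\,\Gamma(\tfrac12(s+1))\,\pi^{-\frac12(s+1)} \\ &= \int_0^{\infty}\sum_{k=1}^{\infty}\chi(k)\,k\,e^{-\pi k^2\sigma/m}\sigma^{\frac12(s+1) - 1}\,d\sigma \\ &= \int_0^{\infty}\tfrac12\theta(i\sigma,\chi)\sigma^{\frac12(s+1) - 1}\,d\sigma \end{aligned} \tag{7.57} \]
for Re $s > 1$. The formal argument, disregarding convergence, is to
replace $\sigma$ by $\sigma^{-1}$ and then apply (b'). The above expression becomes
\[ = \int_0^{\infty}\tfrac12\theta(i/\sigma,\chi)\sigma^{-\frac12(s+1) - 1}\,d\sigma = \frac{-i\,c(m,\chi)}{\sqrt m}\,\Lambda(1 - s,\bar\chi) \]
by (7.57). This argument is made precise in the same way as for (c),
by means of (7.56). The analyticity of $L(s,\chi)$ is obtained by the same
argument used for $\zeta(s)$ in Corollary 7.13, and (c') follows.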
CHAPTER VIII
MODULAR FORMS FOR $SL(2,\mathbb{Z})$
1. Overview
In Chapter VI we established a correspondence $\Lambda \leftrightarrow E$ of lattices
in $\mathbb{C}$ with elliptic curves defined over $\mathbb{C}$. By way of introduction to
the subject of modular forms, we shall now let the lattice vary. Then
$G_4$, $G_6$, and $\Delta$ lead to analytic functions on the upper half plane with
special transformation properties (under the group $SL(2,\mathbb{Z})$) and certain
growth conditions that we list in §2. Analytic functions of this kind will
be defined in §2 to be modular forms. Cusp forms will be modular forms
with an additional vanishing property, and $\Delta$ will be an example.
To each cusp form, we shall associate in §3 an L function by means of
a Mellin transform. This L function will be given by a Dirichlet series
and will extend to be entire on C and to satisfy a functional equation.
The proof will be much like that for the Dirichlet L functions in Theorem
7.19, and the functional equation will be of the type in (7.48).
The remainder of the chapter will introduce and use Hecke operators
as a tool for expanding the L functions of selected cusp forms as Euler
products. In Chapter IX we shall extend the theory of Chapter VIII
to modular forms and cusp forms whose transformation properties are
relative to certain special subgroups of SL(2,Z).
After that point, we return to elliptic curves. For an elliptic curve with
integer coefficients, we count the number of solutions for each finite field
and assemble this information into an L function for the elliptic curve,
which is given as an Euler product and is hard to use. The result is
that we have two kinds of L functions, the kind from cusp forms that
we understand very well and the kind from elliptic curves that contains
a great deal of information.
Eichler-Shimura theory observes that cusp forms with special
properties can be used to parametrize the points of elliptic curves over Q.
Moreover, the L function of the cusp form coincides with the L function
of the elliptic curve. Elliptic curves that can be parametrized in this
way are called modular.
The Taniyama-Weil Conjecture is the assertion that every elliptic
curve is modular. There are many equivalent formulations, and there is
an algorithm for deciding the conjecture for any particular curve.
If the conjecture is true, then restrictions on the nature of modular
forms can be brought to bear on the theory of elliptic curves. A dramatic
example of this process is the theorem that the Taniyama-Weil
Conjecture implies Fermat's Last Theorem. A counterexample to Fermat's
Last Theorem would yield an elliptic curve with remarkable properties,
and one proves with some effort that such an elliptic curve cannot be
modular.
2. Definitions and Examples
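Let us recall the correspondence $\Lambda \leftrightarrow E$ of lattices in $\mathbb{C}$ with elliptic
curves defined over $\mathbb{C}$. In (6.4) and (6.5) we defined
\[ \begin{aligned} G_{2k}(\Lambda) &= \sum_{\substack{\omega\in\Lambda\\ \omega\neq 0}}\frac{1}{\omega^{2k}} \quad\text{for } k \ge 2 \\ g_2(\Lambda) &= 60\,G_4(\Lambda) \\ g_3(\Lambda) &= 140\,G_6(\Lambda). \end{aligned} \tag{8.1} \]
According to (3.30), the curve
\[ y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6 \]
has
\[ \Delta = -b_2^2b_8 - 8b_4^3 - 27b_6^2 + 9b_2b_4b_6. \]
Therefore the curve
\[ y^2 = 4x^3 - g_2(\Lambda)x - g_3(\Lambda) \]
of Chapter VI has
\[ \Delta(\Lambda) = g_2(\Lambda)^3 - 27g_3(\Lambda)^2. \tag{8.2} \]
Similarly the $j$ invariant is given by
\[ j(\Lambda) = 1728\,g_2(\Lambda)^3/\Delta(\Lambda). \tag{8.3} \]
The dependence on $\Lambda$ can be simplified a little. If $\alpha$ is in $\mathbb{C}^\times$, we can
compute
\[ G_{2k}(\alpha\Lambda) = \sum_{\substack{\omega\in\alpha\Lambda\\ \omega\neq 0}}\frac{1}{\omega^{2k}} = \sum_{\substack{\omega'\in\Lambda\\ \omega'\neq 0}}\frac{1}{(\alpha\omega')^{2k}} = \alpha^{-2k}G_{2k}(\Lambda) \]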
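and similarly
\[ \Delta(\alpha\Lambda) = \alpha^{-12}\Delta(\Lambda), \qquad j(\alpha\Lambda) = j(\Lambda). \]
Let $\Lambda$ be generated by $\omega_1$ and $\omega_2$, so that $\Lambda = \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$. Possibly by
interchanging $\omega_1$ and $\omega_2$, we may assume that $\mathrm{Im}(\omega_2/\omega_1) > 0$.
Introducing $\tau = \omega_2/\omega_1$ and $\Lambda_\tau = \mathbb{Z} \oplus \mathbb{Z}\tau$, we have
\[ \Lambda = \omega_1\Big(\mathbb{Z} \oplus \mathbb{Z}\Big(\frac{\omega_2}{\omega_1}\Big)\Big) = \omega_1\Lambda_\tau. \]
Thus $G_{2k}$, $\Delta$, and $j$ are determined by their effects on $\Lambda = \Lambda_\tau$. We
define
\[ G_{2k}(\tau) = G_{2k}(\Lambda_\tau), \qquad \Delta(\tau) = \Delta(\Lambda_\tau), \qquad j(\tau) = j(\Lambda_\tau). \tag{8.4} \]
We can make the same computation with a different basis $\{\omega_1', \omega_2'\}$
for $\Lambda$. We have
\[ \begin{pmatrix}\omega_2' \\ \omega_1'\end{pmatrix} = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\begin{pmatrix}\omega_2 \\ \omega_1\end{pmatrix} \]
with $a, b, c, d$ in $\mathbb{Z}$. Invertibility of this relation over $\mathbb{Z}$ implies $ad - bc =
\pm 1$. Let us see the effect on the associated $\Lambda_\tau$. We may as well start
with $\{\omega_1, \omega_2\} = \{1, \tau\}$. Then $\{\omega_1', \omega_2'\}$ is given by
\[ \begin{pmatrix}\omega_2' \\ \omega_1'\end{pmatrix} = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\begin{pmatrix}\tau \\ 1\end{pmatrix} = \begin{pmatrix} a\tau + b \\ c\tau + d\end{pmatrix}, \]
and the associated $\tau'$ is
\[ \tau' = \frac{a\tau + b}{c\tau + d}. \]
Since Im $\tau > 0$ and Im $\tau' > 0$, we must have $ad - bc = +1$, not $-1$. We
are thus led to consider the action of $SL(2,\mathbb{R})$ on the upper half plane
and the special role of the subgroup $SL(2,\mathbb{Z})$.
The group $SL(2,\mathbb{R})$ of 2-by-2 real matrices of determinant one acts
on the upper half plane $\mathcal{H} = \{\mathrm{Im}\ \tau > 0\}$ in the usual way by linear
fractional transformations, with $g = \begin{pmatrix} a & b \\ c & d\end{pmatrix}$ acting by $g\tau = \dfrac{a\tau + b}{c\tau + d}$.
Let $SL(2,\mathbb{Z})$ be the subgroup with integer entries. What we saw above
for an element $\gamma = \begin{pmatrix} a & b \\ c & d\end{pmatrix}$ of $SL(2,\mathbb{Z})$ was that $\Lambda_\tau \supseteq (c\tau + d)\Lambda_{\tau'}$,
hence that $\Lambda_\tau = (c\tau + d)\Lambda_{\tau'}$. Therefore
\[ \Lambda_{\gamma\tau} = (c\tau + d)^{-1}\Lambda_\tau. \]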
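The resulting transformation laws for the functions we have been
considering are as follows:
\[ \begin{array}{ll} G_{2k}(\tau) = \displaystyle\sum_{(m,n)\neq(0,0)}\frac{1}{(m\tau + n)^{2k}} & G_{2k}(\gamma\tau) = (c\tau + d)^{2k}G_{2k}(\tau) \\[2ex] g_2(\tau) = 60\,G_4(\tau) & g_2(\gamma\tau) = (c\tau + d)^{4}g_2(\tau) \\ g_3(\tau) = 140\,G_6(\tau) & g_3(\gamma\tau) = (c\tau + d)^{6}g_3(\tau) \\ \Delta(\tau) = g_2(\tau)^3 - 27g_3(\tau)^2 & \Delta(\gamma\tau) = (c\tau + d)^{12}\Delta(\tau) \\ j(\tau) = 1728\,g_2(\tau)^3/\Delta(\tau) & j(\gamma\tau) = j(\tau). \end{array} \]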
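The first of these laws can be observed numerically on truncated lattice sums. The sketch below is a rough check under our own choices: the function name `G`, the square truncation $|m|, |n| \le N$, and the sample point `tau` are not from the text. For $\gamma = \begin{pmatrix} 0 & -1 \\ 1 & 0\end{pmatrix}$ the square truncation is carried to itself, so the law holds for the truncated sums up to rounding; for $\gamma = \begin{pmatrix} 1 & 1 \\ 0 & 1\end{pmatrix}$ it holds only up to a truncation error of order $1/N^2$ when $2k = 4$.

```python
def G(weight, tau, N=60):
    """Truncated lattice sum for G_{2k}(tau) with 2k = weight:
    sum over (m, n) != (0, 0), |m|, |n| <= N, of (m*tau + n)^(-weight)."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                total += (m * tau + n) ** (-weight)
    return total

tau = 0.2 + 1.1j                        # arbitrary point of the upper half plane

# gamma = [[0, -1], [1, 0]]: gamma(tau) = -1/tau and c*tau + d = tau.
S_left = G(4, -1 / tau)
S_right = tau ** 4 * G(4, tau)

# gamma = [[1, 1], [0, 1]]: gamma(tau) = tau + 1 and c*tau + d = 1.
T_left = G(4, tau + 1)
T_right = G(4, tau)
```

Here `S_left` and `S_right` should agree essentially to machine precision, while `T_left` and `T_right` should agree only to a few decimal places, with the agreement improving as `N` grows.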
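Fix an integer $k$. An analytic function $f$ on $\mathcal{H}$ that satisfies
\[ f(\gamma\tau) = (c\tau+d)^k f(\tau) \quad\text{for all } \gamma = \begin{pmatrix}a&b\\ c&d\end{pmatrix} \in SL(2,\mathbb{Z}) \tag{8.5} \]
is called an unrestricted modular form of weight $k$ (for the full
modular group $SL(2,\mathbb{Z})$). To have $f \neq 0$, $k$ must be even. We shall
drop the word "unrestricted" if an additional condition below is satisfied.
Each unrestricted modular form $f$ has a "$q$ expansion" as follows.
Taking $\gamma = \begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ in (8.5), we see that $f(\tau) = f(\tau+1)$. Putting
$\tau = \rho + i\sigma$, we expand in Fourier series in the $\rho$ variable. Since $f$ is
smooth,
\[ f(\tau) = \sum_{n=-\infty}^{\infty} a_n(\sigma)e^{2\pi i\rho n} = \sum_{n=-\infty}^{\infty} a_n(\sigma)e^{2\pi n\sigma}e^{2\pi i\tau n}, \]
where
\[ a_n(\sigma) = \int_{-1/2}^{1/2} f(\rho+i\sigma)e^{-2\pi in\rho}\,d\rho. \]
Then
\[ a_n(\sigma)e^{2\pi n\sigma} = \int_{-1/2}^{1/2} f(\rho+i\sigma)e^{-2\pi in(\rho+i\sigma)}\,d\rho. \tag{8.6} \]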
Imagine a rectangle in the $\tau$ plane extending in the horizontal direction
from $\rho = -\frac12$ to $+\frac12$ and in the vertical direction from $\sigma_1 = \sigma$ to some
$\sigma_2$. The integral (8.6) then represents the bottom portion of the line
integral
\[ \oint f(\tau)e^{-2\pi in\tau}\,d\tau \]
over this rectangle. The total line integral is 0, by the Cauchy Integral
Theorem, and the sides cancel because $f(\tau) = f(\tau+1)$. Thus the bottom
portion equals the top portion when they are oriented the same way. In
other words, (8.6) is a constant as a function of $\sigma$, say constantly equal
to $c_n$. We conclude that
\[ f(\tau) = \sum_{n=-\infty}^{\infty} c_nq^n \quad\text{with } q = e^{2\pi i\tau}. \tag{8.7a} \]
Here
\[ c_n = \int_{-1/2}^{1/2} f(\tau)e^{-2\pi in\tau}\,d\rho \quad\text{for any } \sigma > 0. \tag{8.7b} \]
Expression (8.7) is called the $q$ expansion of $f$, or, for reasons discussed
in §3, the expansion of $f$ at $\infty$.
We say that an unrestricted modular form $f$ is holomorphic at $\infty$
and is a modular form if its $q$ expansion has $c_n = 0$ for $n < 0$. If also
$c_0 = 0$, we call $f$ a cusp form.
In §3 we shall give a geometric interpretation for these conditions.
But first we consider our standard examples $G_{2k}$, $\Delta$, and $j$.
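Proposition 8.1. For $k \ge 2$, the $q$ expansion of $G_{2k}(\tau)$ is given by
\[ G_{2k}(\tau) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k-1)!}\sum_{n=1}^{\infty}\sigma_{2k-1}(n)q^n, \]
where $\sigma_l(n) = \sum_{d\mid n} d^{\,l}$. Consequently $G_{2k}(\tau)$ is a modular form of
weight $2k$.
Proof. We take as known the identity
\[ \pi\cot\pi\tau = \frac{1}{\tau} + \sum_{m=1}^{\infty}\left(\frac{1}{\tau+m} + \frac{1}{\tau-m}\right), \tag{8.8} \]
in which the convergence is uniform on compact sets. With $q = e^{2\pi i\tau}$
and $\mathrm{Im}\,\tau > 0$ (so that $|q| < 1$), we have
\[ \pi\cot\pi\tau = \pi\,\frac{\cos\pi\tau}{\sin\pi\tau} = i\pi\,\frac{q+1}{q-1} = i\pi - \frac{2\pi i}{1-q} = i\pi - 2\pi i\sum_{d=0}^{\infty}q^d. \]
Thus (8.8) gives
\[ \frac{1}{\tau} + \sum_{m=1}^{\infty}\left(\frac{1}{\tau+m} + \frac{1}{\tau-m}\right) = i\pi - 2\pi i\sum_{d=0}^{\infty}q^d. \]
Differentiating $2k-1$ times gives
\[ \sum_{m\in\mathbb{Z}}\frac{1}{(\tau+m)^{2k}} = \frac{(2\pi i)^{2k}}{(2k-1)!}\sum_{d=1}^{\infty}d^{\,2k-1}q^d. \tag{8.9} \]
Hence
\[
\begin{aligned}
G_{2k}(\tau) &= \sum_{(m,n)\neq(0,0)}\frac{1}{(n\tau+m)^{2k}}
 = \sum_{m\neq 0}\frac{1}{m^{2k}} + \sum_{n\neq 0}\,\sum_{m\in\mathbb{Z}}\frac{1}{(n\tau+m)^{2k}}\\
 &= 2\zeta(2k) + 2\sum_{n=1}^{\infty}\,\sum_{m\in\mathbb{Z}}\frac{1}{(n\tau+m)^{2k}}.
\end{aligned}
\]
Applying (8.9) with $\tau$ replaced by $n\tau$, we see that this expression is
\[ = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k-1)!}\sum_{n=1}^{\infty}\sum_{d=1}^{\infty}d^{\,2k-1}q^{nd}, \]
and collecting the terms with $nd$ equal to a fixed integer gives the coefficient $\sigma_{2k-1}$, as required.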
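REMARK. The next-to-last line of the proof shows that also
\[ G_{2k}(\tau) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k-1)!}\sum_{d=1}^{\infty}\frac{d^{\,2k-1}q^d}{1-q^d}. \tag{8.10} \]
The series here is rapidly convergent and allows for approximate
numerical calculations. If we replace $\tau$ with $\omega_2/\omega_1$ and sort matters out,
we are led to the formulas (6.50) for $g_2$ and $g_3$ that were used for the
calculations in §VI.9.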
Corollary 8.2. $\Delta(\tau)$ is a cusp form of weight 12 and is nonvanishing
on $\mathcal{H}$. Also $j(\tau)$ is an unrestricted modular form of weight 0 with $q$
expansion
\[ j(\tau) = \frac{1}{q} + 744 + \sum_{n=1}^{\infty}c_nq^n. \]
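Proof. The nonvanishing of $\Delta(\tau)$ follows from Theorem 6.15. From
(8.2) and (8.1), we have
\[ \Delta(\tau) = g_2(\tau)^3 - 27g_3(\tau)^2 = 60^3G_4(\tau)^3 - 27\cdot 140^2\,G_6(\tau)^2. \]
Taking as known the identities
\[ \zeta(4) = \frac{\pi^4}{90} \quad\text{and}\quad \zeta(6) = \frac{\pi^6}{3^3\cdot 5\cdot 7}, \]
we obtain from Proposition 8.1
\[
\begin{aligned}
G_4(\tau) &= \frac{\pi^4}{45} + \frac{(2\pi)^4}{3}\,(q + 9q^2 + 28q^3 + 73q^4 + \cdots)\\
G_6(\tau) &= \frac{2\pi^6}{945} - \frac{(2\pi)^6}{60}\,(q + 33q^2 + 244q^3 + 1057q^4 + \cdots).
\end{aligned}
\]
Therefore
\[ \Delta(\tau) = (2\pi)^{12}(q - 24q^2 + 252q^3 - 1472q^4 + \cdots), \tag{8.11} \]
and $\Delta(\tau)$ is a cusp form. For the $q$ expansion of $j(\tau)$, we have
\[
\begin{aligned}
j(\tau) &= \frac{1728\cdot 60^3\,G_4(\tau)^3}{\Delta(\tau)}
 = \frac{1 + 720q + 179280q^2 + 16954560q^3 + \cdots}{q - 24q^2 + 252q^3 - 1472q^4 + \cdots}\\
 &= \frac{1}{q} + 744 + 196884q + 21493760q^2 + \cdots
\end{aligned}
\tag{8.12}
\]
as required.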
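The expansions (8.11) and (8.12) can be checked with exact integer arithmetic. The following is a minimal sketch, not part of the original text. It assumes the normalized series $E_4 = 1 + 240\sum\sigma_3(n)q^n$ and $E_6 = 1 - 504\sum\sigma_5(n)q^n$, which are just $G_4$ and $G_6$ of Proposition 8.1 divided by their constant terms, so that $\Delta/(2\pi)^{12} = (E_4^3 - E_6^2)/1728$. The names `sigma`, `multiply`, `E4`, `E6`, `delta`, and `j_coeffs` are illustrative.

```python
def sigma(power, n):
    """Divisor power sum: sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, terms):
    """Product of two power series given as coefficient lists, truncated to `terms` terms."""
    out = [0] * terms
    for i, ai in enumerate(a[:terms]):
        for j, bj in enumerate(b[: terms - i]):
            out[i + j] += ai * bj
    return out


TERMS = 6  # keep coefficients of q^0 .. q^5

# Normalized Eisenstein series (assumption: G_4 = (pi^4/45) E4, G_6 = (2 pi^6/945) E6).
E4 = [1] + [240 * sigma(3, n) for n in range(1, TERMS)]
E6 = [1] + [-504 * sigma(5, n) for n in range(1, TERMS)]

E4_cubed = multiply(multiply(E4, E4, TERMS), E4, TERMS)
E6_squared = multiply(E6, E6, TERMS)

# Delta / (2 pi)^12 = (E4^3 - E6^2) / 1728, with integer coefficients.
difference = [x - y for x, y in zip(E4_cubed, E6_squared)]
assert all(c % 1728 == 0 for c in difference)
delta = [c // 1728 for c in difference]

# j = E4^3 / (Delta / (2 pi)^12).  Write Delta/(2 pi)^12 = q * D(q) with D(0) = 1 and
# divide power series; j_coeffs[n] is the coefficient of q^(n-1) in j.
D = delta[1:]
j_coeffs = []
for n in range(len(D)):
    j_coeffs.append(E4_cubed[n] - sum(D[i] * j_coeffs[n - i] for i in range(1, n + 1)))

print(delta)     # coefficients of Delta / (2 pi)^12
print(j_coeffs)  # coefficients of 1/q, 1, q, q^2, ... in j
```

Working with $E_4$ and $E_6$ keeps every intermediate quantity an integer, so no powers of $\pi$ or floating-point rounding enter the check.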
3. Geometry of the q Expansion
Armed with examples, let us discuss the geometry connected with the
$q$ expansion. Let $R$ be the closed subset of $\mathcal{H}$ described in Figure 8.1.
Theorem 8.5 below addresses the fact that $R$ is a fundamental
domain for the action of $SL(2,\mathbb{Z})$ in $\mathcal{H}$. This fact needs a little care in
its formulation, since $\begin{pmatrix}-1&0\\ 0&-1\end{pmatrix}$ acts as the identity on $\mathcal{H}$. The group
$SL(2,\mathbb{R})/\left\{\pm\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\right\}$ acts effectively, and $R$ is really a fundamental
domain for the group
\[ \Gamma = SL(2,\mathbb{Z})/\left\{\pm\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\right\}. \]
Let $T$ and $S$ be the images in $\Gamma$ of the members $\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ and $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$
of $SL(2,\mathbb{Z})$. These elements are given by
\[ T(\tau) = \tau+1 \quad\text{and}\quad S(\tau) = -1/\tau, \tag{8.13} \]
and $S$ has order 2 (not 4). Before proving that $R$ is a fundamental
domain, we examine these elements more closely.
[Figure 8.1. Fundamental domain for $SL(2,\mathbb{Z})$, with vertical sides at $\mathrm{Re}\,\tau = -1/2$ and $\mathrm{Re}\,\tau = 1/2$]
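Proposition 8.3. The elements $\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ and $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ generate
$SL(2,\mathbb{Z})$.
Proof. Assume the contrary. Let $\Gamma'$ be the subgroup of $SL(2,\mathbb{Z})$
generated by $\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ and $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$. Among all members $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$
of $SL(2,\mathbb{Z})$ not in $\Gamma'$, choose one with $\max\{|a|,|c|\}$ as small as possible.
Then with $\max\{|a|,|c|\}$ fixed, we may assume $\min\{|a|,|c|\}$ is as small
as possible. Multiplying if necessary by $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ on the left, we may
assume that $|a| \le |c|$. Now
\[ \begin{pmatrix}0&1\\ -1&0\end{pmatrix}\begin{pmatrix}1&1\\ 0&1\end{pmatrix}^{-1}\begin{pmatrix}0&1\\ -1&0\end{pmatrix}^{-1} = \begin{pmatrix}1&0\\ 1&1\end{pmatrix} \]
shows that $\begin{pmatrix}1&0\\ 1&1\end{pmatrix}$ is in $\Gamma'$. Thus
\[ \begin{pmatrix}1&0\\ \pm 1&1\end{pmatrix}\begin{pmatrix}a&b\\ c&d\end{pmatrix} = \begin{pmatrix}a&b\\ c\pm a&d\pm b\end{pmatrix} \]
is not in $\Gamma'$. Unless $a = 0$, one of $c \pm a$ has $|c \pm a| < |c|$ and contradicts
our construction of $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$. Thus we may assume $a = 0$. Multiplying
on the left by $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ or its inverse $\begin{pmatrix}0&-1\\ 1&0\end{pmatrix}$, we see that $SL(2,\mathbb{Z})$
contains an element $\begin{pmatrix}1&n\\ 0&1\end{pmatrix}$ not in $\Gamma'$. This element is a power of
$\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$, and we have a contradiction.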
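The proof is by contradiction, but the same Euclidean idea gives a constructive procedure. The sketch below is illustrative and is not the argument in the text: it writes a given integer matrix of determinant one as a word in $S = \begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ and $T = \begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ by repeatedly subtracting a multiple of the bottom row from the top row and then applying $S$. The function names are hypothetical.

```python
S = ((0, 1), (-1, 0))   # acts on the upper half plane as tau -> -1/tau
T = ((1, 1), (0, 1))    # acts as tau -> tau + 1
IDENTITY = ((1, 0), (0, 1))


def mat_mul(x, y):
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mat_inv(x):
    """Inverse of a determinant-one matrix."""
    (a, b), (c, d) = x
    return ((d, -b), (-c, a))


def mat_pow(x, k):
    if k < 0:
        x, k = mat_inv(x), -k
    out = IDENTITY
    for _ in range(k):
        out = mat_mul(out, x)
    return out


def word_in_S_and_T(g):
    """Return [(letter, exponent), ...] whose ordered product equals g."""
    (a, b), (c, d) = g
    assert a * d - b * c == 1
    word = []
    while c != 0:
        q = a // c
        a, b = a - q * c, b - q * d      # left-multiply by T^(-q)
        a, b, c, d = c, d, -a, -b        # left-multiply by S; |c| strictly decreases
        word += [("T", q), ("S", -1)]    # undo both steps: g_old = T^q S^(-1) g_new
    # Now c == 0, so a == d == 1 or a == d == -1.
    if a == -1:
        word.append(("S", 2))            # S^2 = -identity
        b = -b
    word.append(("T", b))
    return word


def evaluate(word):
    """Multiply out a word produced by word_in_S_and_T."""
    letters = {"S": S, "T": T}
    out = IDENTITY
    for letter, exponent in word:
        out = mat_mul(out, mat_pow(letters[letter], exponent))
    return out


g = ((7, -3), (-9, 4))
print(word_in_S_and_T(g))
print(evaluate(word_in_S_and_T(g)) == g)
```

The loop terminates because the absolute value of the lower left entry drops at every pass, which is the same quantity the proof minimizes.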
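Since $\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ and $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ are in $SL(2,\mathbb{Z})$, an unrestricted
modular form $f$ of weight $k$ satisfies
\[ f(\tau+1) = f(\tau) \quad\text{and}\quad f(-1/\tau) = (-\tau)^k f(\tau). \tag{8.14} \]
On the other hand, Proposition 8.3 implies the following converse to the
validity of (8.14).
Corollary 8.4. An analytic function $f$ on $\mathcal{H}$ that satisfies (8.14) is
an unrestricted modular form.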
Proof. For $g = \begin{pmatrix}a&b\\ c&d\end{pmatrix}$ in $SL(2,\mathbb{R})$, let
\[ \delta(g,\tau) = c\tau+d. \tag{8.15} \]
Direct computation gives
\[ \delta(g_1g_2,\tau) = \delta(g_1,g_2\tau)\,\delta(g_2,\tau) \tag{8.16a} \]
and
\[ \delta(g^{-1},\tau) = \delta(g,g^{-1}\tau)^{-1}. \tag{8.16b} \]
If $g_1$ and $g_2$ are in $SL(2,\mathbb{Z})$ and are elements $\gamma$ such that
\[ f(\gamma\tau) = \delta(\gamma,\tau)^k f(\tau), \tag{8.17} \]
then (8.16a) says that (8.17) holds also for $\gamma = g_1g_2$. If $g$ is an element
$\gamma$ such that (8.17) holds, then (8.16b) says that (8.17) holds also for
$\gamma = g^{-1}$. Thus the subset of $SL(2,\mathbb{Z})$ for which (8.17) holds is a
subgroup. Equations (8.14) say that this subgroup contains $\begin{pmatrix}1&1\\ 0&1\end{pmatrix}$ and
$\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$.
By Proposition 8.3 the subgroup is all of $SL(2,\mathbb{Z})$.
Theorem 8.5.
(a) Each point of $\mathcal{H}$ can be mapped into $R$ by some element of $\Gamma =
SL(2,\mathbb{Z})/\left\{\pm\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\right\}$.
(b) The only points of $R$ that are equivalent with one another under
$\Gamma$ are the points $\tau$ and $\tau+1$ of the vertical sides, and the points $\tau$ and
$-1/\tau$ of the circular arc.
(c) The only points of $R$ that are fixed by a member $\gamma \neq 1$ of $\Gamma$ are
$\tau = i$ (fixed exactly by the subgroup $\{1, S\}$), $\tau = \rho = e^{2\pi i/3}$ (fixed
exactly by the subgroup $\{1, ST, (ST)^2\}$), and $\tau = -\bar\rho = e^{\pi i/3}$ (fixed
exactly by the subgroup $\{1, TS, (TS)^2\}$).
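Proof. (a) If $\tau = \rho + i\sigma$ is given, then we can compute that
\[ \mathrm{Im}\,\frac{a\tau+b}{c\tau+d} = \frac{\mathrm{Im}\,\tau}{|c\tau+d|^2}. \tag{8.18} \]
Since $c$ and $d$ are integers, some choice of $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ makes $|c\tau+d|^2$ a
minimum. Applying this element, we obtain $\tau'$ such that $\mathrm{Im}\,\tau' \ge \mathrm{Im}\,\gamma\tau'$
for all $\gamma \in \Gamma$. For a suitable choice of $n$, the translate $\tau'' = T^n\tau'$
will have $|\mathrm{Re}\,\tau''| \le \frac12$, and it will still be true that $\mathrm{Im}\,\tau'' \ge \mathrm{Im}\,\gamma\tau''$ for
all $\gamma \in \Gamma$. If $\tau''$ were strictly below the unit circle (at the bottom edge
of $R$), we would have $\mathrm{Im}\,\tau'' < \mathrm{Im}\,S\tau''$, contradiction. We conclude that
$\tau''$ is in $R$.
(b,c) Suppose $\tau$ is in $R$ and $g = \begin{pmatrix}a&b\\ c&d\end{pmatrix}$ is an element of $SL(2,\mathbb{Z})$
with $g\tau$ in $R$. Let $\gamma$ be the image of $g$ in $\Gamma$. Since $\gamma\tau$ and $\gamma^{-1}(\gamma\tau)$ are
in $R$, there is no loss of generality in assuming that $\mathrm{Im}(\gamma\tau) \ge \mathrm{Im}\,\tau$. By
(8.18), $|c\tau+d| \le 1$. Since $\mathrm{Im}\,\tau > \frac12$, we cannot have $|c| \ge 2$. Thus
$c = 0$, $1$, or $-1$. If $c = 0$, then $\gamma$ must be a power of $T$, and the points
in question are $\tau$ and $\tau \pm 1$ as in (b).
Since $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ and $\begin{pmatrix}-a&-b\\ -c&-d\end{pmatrix}$ have the same image in $\Gamma$, it remains
to consider $c = 1$. Then $|\tau+d| \le 1$. Since $|\mathrm{Re}\,\tau| \le \frac12$, we must have
$|d| \le 1$. Thus $d = 0$, $1$, or $-1$. Suppose $d = 0$. Then $|\tau| = 1$. Since
$ad - bc = 1$, $b = -1$. Hence $g = \begin{pmatrix}a&-1\\ 1&0\end{pmatrix}$, and $\gamma\tau = a - \dfrac{1}{\tau}$. This is
an integer translate of the point $-\dfrac{1}{\tau}$, which is on the bottom edge of $R$
since $|\tau| = 1$. The possibilities are that $a = -1$ and $\tau = \rho$ as in (c), or
$a = 1$ and $\tau = -\bar\rho$ as in (c), or $a = 0$ as in (b) and the case $\tau = i$ of (c).
Suppose $d = 1$. Then $\tau \in R$ and $|\tau+1| \le 1$ imply $\tau = \rho$. In this case
$g = \begin{pmatrix}a&b\\ 1&1\end{pmatrix}$ with $a - b = 1$. So
\[ \gamma\rho = \frac{a\rho + a - 1}{\rho+1} = a - \frac{1}{\rho+1} = a + \rho. \]
Hence $a = 0$ or $1$. If $a = 0$, we have $g = \begin{pmatrix}0&-1\\ 1&1\end{pmatrix}$ as in (c). If $a = 1$,
then the points are $\rho$ and $\rho+1$ as in (b).
Suppose $d = -1$. Then similarly $\tau = -\bar\rho$ and $\gamma(-\bar\rho) = a + (-\bar\rho)$ with
$a = 0$ or $-1$. If $a = 0$, we obtain $g = \begin{pmatrix}0&-1\\ 1&-1\end{pmatrix}$ as in (c). If $a = -1$, then
the points are $-\bar\rho$ and $(-\bar\rho) - 1$ as in (b).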
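Part (a) can be illustrated numerically. The sketch below does not follow the minimization argument of the proof; it assumes instead the familiar iterative variant, which alternates a translation by a power of $T$ with an application of $S$ until the point lies in $R$. Floating-point arithmetic is used, so points on the boundary of $R$ are only handled approximately, and the function name is illustrative.

```python
def reduce_to_fundamental_domain(tau, max_steps=10_000):
    """Move tau (with Im tau > 0) into R = {|Re tau| <= 1/2, |tau| >= 1}
    using only tau -> tau + n and tau -> -1/tau."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    for _ in range(max_steps):
        # Apply a power of T so that |Re tau| <= 1/2.
        tau = complex(tau.real - round(tau.real), tau.imag)
        if abs(tau) >= 1.0:
            return tau
        # |tau| < 1, so applying S strictly increases the imaginary part, by (8.18).
        tau = -1 / tau
    raise RuntimeError("no convergence; tau may be too close to the real axis")


point = reduce_to_fundamental_domain(complex(0.37, 0.02))
print(point, abs(point.real) <= 0.5, abs(point) >= 1.0)
```

Each application of $S$ raises the imaginary part, and by (8.18) only finitely many values of $\mathrm{Im}\,\gamma\tau$ exceed a given bound, which is why the loop stops in exact arithmetic.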
The fundamental domain allows us to interpret the $q$ expansion of a
modular form as an expansion at $\infty$. In fact, we see from Figure 8.1 that
the only way to tend to $\infty$ in $R$ is vertically. Thus the point $\tau = \rho + i\sigma$
has $\sigma \to \infty$ while $\rho$ remains bounded. The effect on $q = e^{2\pi i\tau}$ is for $q$ to
tend to 0. So a power series in $q$ may be regarded as an expansion about
$\tau = \infty$. If the $q$ expansion does not have infinitely many negative powers
of $q$, the condition that an unrestricted modular form be modular is a
growth condition: There is to be no pole, and the form is thus to be
bounded as $\sigma \to \infty$. The additional condition for a cusp form is that
the form vanish as $\sigma \to \infty$. This condition can be checked by evaluating
the integral defining $c_0$ in (8.7b).
4. Dimensions of Spaces of Modular Forms
Let $f \neq 0$ be a modular form of weight $k$. For $\tau$ in $\mathcal{H}$, let $v_\tau(f)$ be
the order of vanishing of $f$ at $\tau$. If $\gamma$ is in $SL(2,\mathbb{Z})$, then $v_{\gamma\tau}(f) = v_\tau(f)$
since $f(\gamma\tau) = (c\tau+d)^k f(\tau)$. Let $v_\infty(f)$ be the order of vanishing of the
$q$ expansion of $f$ (the expansion at $\infty$).
Theorem 8.6. If $f \neq 0$ is a modular form of weight $k$, then
\[ v_\infty(f) + \tfrac12 v_i(f) + \tfrac13 v_\rho(f) + {\sum_\tau}' v_\tau(f) = \frac{k}{12}, \]
where $\sum'$ refers to a sum over inequivalent points in the fundamental
domain $R$ other than $i$ and $\rho$.
REMARK. The left side is indeed finite. There can be no zeros of $f$
in a deleted neighborhood of $\infty$, i.e., beyond some height in $R$. The
remainder of $R$ is compact, and $f$ can have only finitely many zeros
there.
Proof. Essentially we shall apply the Argument Principle, computing
$\dfrac{1}{2\pi i}\displaystyle\oint \frac{f'(\tau)}{f(\tau)}\,d\tau$ over the boundary of $R$ with positive orientation.
Actually we have to adjust our contour, introducing detours to avoid
zeros. First assume that there are no zeros on the boundary of $R$ except
possibly at $i$ and $\rho$. We introduce the modified contour in Figure 8.2. It
is assumed that all the zeros of $f$ within $R$, other than those at $i$ and $\rho$,
are within the contour in the figure. The arcs $BB'$, $CC'$, and $DD'$ are
circular.
[Figure 8.2. Contour for calculating number of zeros]
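We shall make repeated use of the following change of variables
formula. If $\tau = \varphi(\tau')$ is an analytic change of variables, then
\[ \frac{(f\circ\varphi)'(\tau')\,d\tau'}{(f\circ\varphi)(\tau')} = \frac{f'(\tau)\,d\tau}{f(\tau)}. \tag{8.19} \]
For a first application of this formula, we take $\tau' = q$ and write $\tilde f(q) =
f(\tau)$. As $\tau$ traverses $EA$, $q$ goes once around a circle with negative
orientation. Hence
\[ \frac{1}{2\pi i}\int_{EA}\frac{f'(\tau)\,d\tau}{f(\tau)} = \frac{1}{2\pi i}\oint\frac{\tilde f'(q)\,dq}{\tilde f(q)} = -v_\infty(f). \tag{8.20} \]
Next the points of $AB$ are paired with those of $ED'$ by $\tau \to \tau+1$,
and $f(\tau+1) = f(\tau)$. Hence
\[ \frac{1}{2\pi i}\int_{AB}\frac{f'(\tau)\,d\tau}{f(\tau)} + \frac{1}{2\pi i}\int_{D'E}\frac{f'(\tau)\,d\tau}{f(\tau)} = 0. \tag{8.21} \]
The arc $B'C$ is paired with the arc $DC'$ by $\tau \to \tau' = -1/\tau$. We take
$\varphi = S$ in (8.19) and obtain
\[
\begin{aligned}
\int_{C'D}\frac{f'(\tau)\,d\tau}{f(\tau)} &= -\int_{DC'}\frac{f'(\tau)\,d\tau}{f(\tau)} = -\int_{B'C}\frac{(f\circ S)'(\tau)\,d\tau}{(f\circ S)(\tau)}\\
 &= -\int_{B'C}\frac{\frac{d}{d\tau}\left[(-\tau)^k f(\tau)\right]d\tau}{(-\tau)^k f(\tau)}\\
 &= -\int_{B'C}\frac{f'(\tau)\,d\tau}{f(\tau)} - k\int_{B'C}\frac{d\tau}{\tau}.
\end{aligned}
\]
Hence
\[ \frac{1}{2\pi i}\int_{B'C}\frac{f'(\tau)\,d\tau}{f(\tau)} + \frac{1}{2\pi i}\int_{C'D}\frac{f'(\tau)\,d\tau}{f(\tau)} = -\frac{k}{2\pi i}\int_{B'C}\frac{d\tau}{\tau} = \frac{k}{12} + o(1), \tag{8.22} \]
where $o(1)$ is a term that tends to 0 as the radii of the arcs $BB'$, $CC'$,
$DD'$ tend to 0.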
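To handle the integrals over the small arcs $BB'$, $CC'$, and $DD'$, we
note the following fact from complex variable theory. Let $h$ be a
meromorphic function with a simple pole at $z_0$ and with residue $h_{-1}$ there.
Over a small positively oriented circular arc centered at $z_0$ and
consuming a fraction $\theta$ of the circumference of the circle, we have
\[ \frac{1}{2\pi i}\int_{\mathrm{arc}} h(z)\,dz = \theta h_{-1} + o(1). \tag{8.23} \]
[In fact, we have only to expand $h(z)$ in Laurent series and compute the
integral term by term.]
Applying (8.23) to the small arcs in Figure 8.2 and taking into account
orientations, we have
\[
\begin{aligned}
\frac{1}{2\pi i}\int_{BB'}\frac{f'(\tau)\,d\tau}{f(\tau)} &= -\tfrac16 v_\rho(f) + o(1)\\
\frac{1}{2\pi i}\int_{CC'}\frac{f'(\tau)\,d\tau}{f(\tau)} &= -\tfrac12 v_i(f) + o(1)\\
\frac{1}{2\pi i}\int_{DD'}\frac{f'(\tau)\,d\tau}{f(\tau)} &= -\tfrac16 v_{-\bar\rho}(f) + o(1).
\end{aligned}
\]
We add these three formulas to the sum of (8.20), (8.21), and (8.22),
and we apply the Argument Principle. The result is
\[ {\sum_\tau}' v_\tau(f) = -v_\infty(f) + 0 + \frac{k}{12} - \tfrac16 v_\rho(f) - \tfrac12 v_i(f) - \tfrac16 v_{-\bar\rho}(f) + o(1). \]
Since $-\bar\rho = \rho + 1$, we have $v_{-\bar\rho}(f) = v_\rho(f)$.
Letting the radii of the small arcs tend to 0, we obtain the desired result.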
We still have to treat the case that / has some zeros on the boundary
of R other than at p and i. In this case we introduce a circular bubble
at each such point, with S or T of that bubble at the congruent point
on the boundary of R. The bubbles on the vertical sides of R do not
affect the above argument, and the bubbles on the circular bottom side
of R introduce error terms that are o(l). This completes the proof.
Let $M_k$ be the vector space of modular forms of weight $k$, and let $S_k$
be the subspace of cusp forms. These spaces are 0 for $k$ odd, since
\[ f(\tau) = f\left(\begin{pmatrix}-1&0\\ 0&-1\end{pmatrix}\tau\right) = (-1)^k f(\tau). \]
For even $k \ge 4$, Proposition 8.1 shows that $G_k$ is in $M_k$ but not $S_k$. On
the other hand, $S_k$ has codimension at most 1 in $M_k$, being given by a
single condition $c_0 = 0$. Thus
\[ M_k = S_k \oplus \mathbb{C}G_k \quad\text{for } k \text{ even} \ge 4. \tag{8.24} \]
Corollary 8.7. Let $k$ be even.
(a) $M_k = 0$ for $k < 0$ and $k = 2$.
(b) $M_0 = \mathbb{C}1$, and $M_k = \mathbb{C}G_k$ for $4 \le k \le 10$.
(c) Multiplication by $\Delta(\tau)$ defines an isomorphism of $M_{k-12}$ onto $S_k$.
Proof. (a) In Theorem 8.6, all terms on the left side are $\ge 0$. Thus
the right side is $\ge 0$, and $k \ge 0$. For $k = 2$, the right side is $\frac16$. No
sum $n_1 + \frac12 n_2 + \frac13 n_3$ with nonnegative integers $n_1, n_2, n_3$ can equal $\frac16$, and so no $f \neq 0$
can exist.
(b) When $k < 12$, the right side is $< 1$. Thus $v_\infty(f) = 0$. Hence
$S_k = 0$. If $4 \le k \le 10$ and $k$ is even, Proposition 8.1 and (8.24) show that
$M_k = \mathbb{C}G_k$. If $k = 0$, 1 is in $M_k$. Since $S_0$ has codimension at most 1 in
$M_0$, $M_0 = \mathbb{C}1$.
(c) Since $\Delta(\tau)$ is a cusp form of weight 12, multiplication by $\Delta$ carries
$M_{k-12}$ into $S_k$. On the other hand, $\Delta(\tau)$ is nonvanishing on $\mathcal{H}$ by
Corollary 8.2 (or Theorem 6.15) and has a simple zero at $\infty$, by (8.11).
Thus multiplication by $\Delta(\tau)^{-1}$ is a well defined linear map of $S_k$ into
$M_{k-12}$.
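Corollary 8.8. If $k \ge 0$ is even, then
\[ \dim M_k = \begin{cases} [k/12] + 1 & \text{if } k \not\equiv 2 \bmod 12\\ [k/12] & \text{if } k \equiv 2 \bmod 12. \end{cases} \]
Also $S_k = 0$ for $k < 12$; if $k \ge 12$ is even, then
\[ \dim S_k = \begin{cases} [k/12] & \text{if } k \not\equiv 2 \bmod 12\\ [k/12] - 1 & \text{if } k \equiv 2 \bmod 12. \end{cases} \]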
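The dimension formulas are easy to tabulate. The following sketch, with illustrative function names, encodes Corollary 8.8 and cross-checks it against Corollary 8.7c, which gives $\dim S_k = \dim M_{k-12}$.

```python
def dim_modular_forms(k):
    """dim M_k for the full modular group, following Corollary 8.8."""
    if k < 0 or k % 2 == 1:
        return 0
    return k // 12 if k % 12 == 2 else k // 12 + 1


def dim_cusp_forms(k):
    """dim S_k; cusp forms have codimension 1 in M_k once k >= 12 is even."""
    if k < 12 or k % 2 == 1:
        return 0
    return dim_modular_forms(k) - 1


for k in range(0, 38, 2):
    print(k, dim_modular_forms(k), dim_cusp_forms(k))
```

For example, the first weight with a nonzero cusp form is 12, and the first weight with a two-dimensional space of cusp forms is 24.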
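Corollary 8.9. The $q$ expansion of the cusp form $\Delta(\tau)$ of weight 12
can be regrouped as
\[ \Delta(\tau) = (2\pi)^{12}\,q\prod_{n=1}^{\infty}(1-q^n)^{24}. \]
More specifically let
\[ \eta(\tau) = e^{\pi i\tau/12}\prod_{n=1}^{\infty}(1-q^n), \]
so that $\Delta(\tau) = (2\pi)^{12}\eta(\tau)^{24}$. Then
\[ \eta(\tau+1) = e^{\pi i/12}\eta(\tau) \quad\text{and}\quad \eta(-1/\tau) = (-i\tau)^{1/2}\eta(\tau). \]
Remark. This formula for $\Delta(\tau)$ lends itself much better to
calculation than the one in (8.2).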
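As an illustration of the remark, the coefficients of $\Delta(\tau)/(2\pi)^{12}$ can be read off from the product with integer arithmetic only. This is a minimal sketch with a hypothetical function name; it multiplies the truncated series by each factor $(1 - q^n)$ twenty-four times.

```python
def delta_coefficients(terms):
    """Coefficients of q * prod_{n>=1} (1 - q^n)^24 for q^0 .. q^(terms-1); needs terms >= 2."""
    series = [0] * terms
    series[1] = 1                      # start from the leading factor q
    for n in range(1, terms):
        for _ in range(24):
            # Multiply in place by (1 - q^n); descending order keeps the old values intact.
            for i in range(terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return series


print(delta_coefficients(6))
```

Factors with $n$ at least `terms` cannot affect the retained coefficients, so the truncated product is exact up to the chosen order.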
Proof. Once we have proved the two transformation laws for $\eta(\tau)$, it
follows that $\eta(\tau)^{24}$ satisfies the transformation laws (8.14) with $k = 12$.
Corollary 8.4 then shows that $\eta(\tau)^{24}$ is an unrestricted modular form of
weight 12, and hence $\eta(\tau)^{24}$ is a cusp form of weight 12. By Corollary
8.8, $\Delta(\tau) = c\,\eta(\tau)^{24}$ for some constant $c$. The coefficient of $q$ for $\eta(\tau)^{24}$
is 1 and for $\Delta(\tau)$ is $(2\pi)^{12}$, by (8.11). Thus $c = (2\pi)^{12}$.
The identity $\eta(\tau+1) = e^{\pi i/12}\eta(\tau)$ being obvious, we are left with
proving
\[ \eta(-1/\tau) = (-i\tau)^{1/2}\eta(\tau). \tag{8.25} \]
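Let log denote the principal branch of the logarithm cut on the negative
reals. Since $|q| < 1$,
\[ -\log(1-q^n) = \sum_{k=1}^{\infty}\frac{q^{nk}}{k}. \]
Then
\[
\begin{aligned}
\eta(\tau)e^{-\pi i\tau/12} &= \prod_{n=1}^{\infty}\exp\log(1-q^n) = \exp\sum_{n=1}^{\infty}\log(1-q^n)\\
 &= \exp\left(-\sum_{k=1}^{\infty}\frac{1}{k}\,\frac{q^k}{1-q^k}\right)
 = \exp\sum_{k=1}^{\infty}-\frac{1}{k}\left(\frac{1}{e^{-2\pi ik\tau}-1}\right).
\end{aligned}
\]
Replacing $\tau$ by $-1/\tau$ gives
\[ \eta(-1/\tau)e^{\pi i/(12\tau)} = \exp\sum_{k=1}^{\infty}-\frac{1}{k}\left(\frac{1}{e^{2\pi ik/\tau}-1}\right). \]
So
\[ \eta(\tau)/\eta(-1/\tau) = e^{\frac{\pi i}{12}(\tau+\tau^{-1})}\exp\left(-\sum_{k=1}^{\infty}\frac{1}{k}\left(\frac{1}{e^{-2\pi ik\tau}-1} - \frac{1}{e^{2\pi ik/\tau}-1}\right)\right). \]
In order to prove (8.25), it is therefore enough to prove that
\[ \tfrac12\log\left(\frac{\tau}{i}\right) = -\frac{\pi i}{12}(\tau+\tau^{-1}) + \sum_{k=1}^{\infty}\frac{1}{k}\left(\frac{1}{e^{-2\pi ik\tau}-1} - \frac{1}{e^{2\pi ik/\tau}-1}\right). \tag{8.26} \]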
Fix $\tau$. Put $f(z) = (\cot z)(\cot\frac{z}{\tau})$ and $\nu = (n+\frac12)\pi$ with $n = 0, 1, 2, \ldots$.
The two factors comprising $f$ are together singular only at $z = 0$. Thus
$z^{-1}f(\nu z)$ has simple poles at $z = \dfrac{\pi k}{\nu}$ and $z = \dfrac{\pi k\tau}{\nu}$ for $k \neq 0$, and the
respective residues are $\dfrac{1}{\pi k}\cot\left(\dfrac{\pi k}{\tau}\right)$ and $\dfrac{1}{\pi k}\cot(\pi k\tau)$.
Also there is a triple pole at $z = 0$, and the residue of $z^{-1}f(\nu z)$ at $z = 0$
can be seen to be $-\frac13(\tau+\tau^{-1})$. Let $C$ be the curve in the $z$ plane in the
shape of a parallelogram, passing from $1$ to $\tau$ to $-1$ to $-\tau$ to $1$. Those
poles mentioned above that lie within the parallelogram are actually on
the diagonals of the parallelogram. Thus the Residue Theorem yields
\[ \oint_C f(\nu z)\,\frac{dz}{z} = -\frac{2\pi i}{3}(\tau+\tau^{-1}) + \frac{2\cdot 2\pi i}{\pi}\sum_{k=1}^{n}\frac{1}{k}\left(\cot\left(\frac{\pi k}{\tau}\right) + \cot(\pi k\tau)\right). \tag{8.27} \]
Let us consider the limiting behavior of $f(\nu z)$ on each edge of $C$ as
$n \to +\infty$. If we parametrize the edge from $1$ to $\tau$ by $z = t\tau + (1-t)$
with $0 \le t \le 1$, then we can write
\[ \cot\nu z = i\left(\frac{e^{2i(n+\frac12)\pi(t\tau+(1-t))} + 1}{e^{2i(n+\frac12)\pi(t\tau+(1-t))} - 1}\right). \tag{8.28} \]
Let $\tau = \rho + i\sigma$. The exponential here has magnitude
\[ e^{-2(n+\frac12)\pi t\sigma} \tag{8.29} \]
and tends to 0 pointwise (except at $t = 0$) as $n \to +\infty$, since $\sigma > 0$.
Thus $\cot\nu z \to -i$ on this edge. Let us show that the convergence occurs
boundedly. Let $\epsilon > 0$ be a number to be specified. When $t \ge \dfrac{\epsilon}{n+\frac12}$,
(8.29) is $\le e^{-2\pi\epsilon\sigma}$, which can be assumed to be $< 1$; then (8.28) is
bounded. When $t \le \dfrac{\epsilon}{n+\frac12}$, the $e^{i\theta}$ part of the exponential is
\[ e^{2i(n+\frac12)\pi(t\rho+(1-t))} = e^{i\pi[(2n+1)+t(2n+1)(\rho-1)]} = -e^{i\pi t(2n+1)(\rho-1)}. \]
The exponent on the right is controlled. For a suitably small $\epsilon$, we can
make it $< \pi/2$ in absolute value, and we see that the denominator of
(8.28) is bounded away from 0. Therefore (8.28) is bounded, and the
convergence occurs boundedly. Similarly we find that $\cot\frac{\nu z}{\tau}$ is bounded
on this edge and tends pointwise to $i$. By dominated convergence
\[ \lim_{n\to\infty}\int_1^{\tau} f(\nu z)\,\frac{dz}{z} = \int_1^{\tau}\frac{dz}{z} = \log\tau. \]
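Similar computations give
\[
\begin{aligned}
\lim_{n\to\infty}\int_{\tau}^{-1} f(\nu z)\,\frac{dz}{z} &= -\int_{\tau}^{-1}\frac{dz}{z} = -\pi i + \log\tau\\
\lim_{n\to\infty}\int_{-1}^{-\tau} f(\nu z)\,\frac{dz}{z} &= \int_{-1}^{-\tau}\frac{dz}{z} = \log(-\tau) + \pi i\\
\lim_{n\to\infty}\int_{-\tau}^{1} f(\nu z)\,\frac{dz}{z} &= -\int_{-\tau}^{1}\frac{dz}{z} = \log(-\tau).
\end{aligned}
\]
Since $\log\tau + \log(-\tau) = 2\log\left(\frac{\tau}{i}\right)$, passage to the limit in (8.27) gives
\[
\begin{aligned}
4\log\left(\frac{\tau}{i}\right) &= -\frac{2\pi i}{3}(\tau+\tau^{-1}) + 4i\sum_{k=1}^{\infty}\frac{1}{k}\left(\cot(\pi k\tau) + \cot\left(\frac{\pi k}{\tau}\right)\right)\\
 &= -\frac{2\pi i}{3}(\tau+\tau^{-1}) + 8\sum_{k=1}^{\infty}\frac{1}{k}\left(\frac{1}{e^{-2\pi ik\tau}-1} - \frac{1}{e^{2\pi ik/\tau}-1}\right).
\end{aligned}
\]
This proves (8.26), and we have seen that (8.25) follows.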
5. L Function of a Cusp Form
In this section we shall associate to each cusp form an $L$ function
given by a Dirichlet series. We shall see that the $L$ function extends
to be entire and satisfies a functional equation. The prototype for this
study is Theorem 7.19, where we passed from certain $\theta$ functions to $L$
functions and proved similar results about the $L$ functions.
Let $f \in S_k$ be a cusp form, and let $f(\tau) = \sum_{n=1}^{\infty}c_nq^n$ be its $q$
expansion. The $L$ function of $f$ is the Dirichlet series
\[ L(s,f) = \sum_{n=1}^{\infty}\frac{c_n}{n^s}. \tag{8.30} \]
The $L$ function can be obtained from $f$ by applying a Mellin transform.
If $F : (0,\infty) \to \mathbb{C}$ is given, the Mellin transform of $F$ is the function
$g(s)$ defined by
\[ g(s) = \int_0^{\infty}F(t)\,t^s\,\frac{dt}{t} \]
for all values of $s$ for which the integral converges. (For $s$ imaginary, the
Mellin transform is the version of the Fourier transform appropriate for
the multiplicative group $\mathbb{R}^+$, but we shall not use this fact.)
For our cusp form $f$, let us write $\tau = \rho + i\sigma$ and compute the Mellin
transform of $f(i\sigma)$, with $\rho$ fixed at 0. We proceed formally for now,
disregarding convergence. We have
\[
\begin{aligned}
g(s) &= \int_0^{\infty}f(i\sigma)\,\sigma^s\,\frac{d\sigma}{\sigma} = \sum_{n=1}^{\infty}c_n\int_0^{\infty}e^{-2\pi n\sigma}\sigma^s\,\frac{d\sigma}{\sigma}\\
 &= (2\pi)^{-s}\Gamma(s)\sum_{n=1}^{\infty}\frac{c_n}{n^s}\\
 &= (2\pi)^{-s}\Gamma(s)L(s,f).
\end{aligned}
\tag{8.31}
\]
We shall show below that the coefficients $c_n$ of a cusp form satisfy
$|c_n| \le Cn^{k/2}$. Then the Dirichlet series (8.30) converges absolutely for
$\mathrm{Re}\,s > \frac{k}{2} + 1$, and $L(s,f)$ is analytic in this region. Going over the steps
of (8.31), we see that our computation was rigorous in this same region.
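Lemma 8.10. Let $f \in S_k$ have $q$ expansion $f(\tau) = \sum_{n=1}^{\infty}c_nq^n$.
Then
(a) the function $\varphi(\tau) = |f(\tau)|\sigma^{k/2}$ is bounded on $\mathcal{H}$ and invariant
under $SL(2,\mathbb{Z})$
(b) $|c_n| \le Cn^{k/2}$.
PROOF. (a) From $f(\tau) = \sum_{n=1}^{\infty}c_nq^n$ for $|q| < 1$, we have $|f(\tau)| \le C|q|$
for $|q| \le \frac12$, i.e.,
\[ |f(\tau)| \le Ce^{-2\pi\sigma} \quad\text{for } \sigma \ge \tfrac{1}{2\pi}\log 2. \tag{8.32} \]
Consequently $\varphi(\tau) = |f(\tau)|\sigma^{k/2}$ tends to 0 as $\tau$ tends to $\infty$ through the
fundamental domain $R$ for $SL(2,\mathbb{Z})$. Since $\varphi$ is continuous and the part
of $R$ with $\sigma \le \frac{1}{2\pi}\log 2$ is compact, $\varphi$ is bounded on $R$. On $\mathcal{H}$, $\varphi$ satisfies
$\varphi(\tau+1) = \varphi(\tau)$ and
\[
\begin{aligned}
\varphi\left(-\frac{1}{\tau}\right) &= \left|f\left(-\frac{1}{\tau}\right)\right|\left(\mathrm{Im}\left(-\frac{1}{\tau}\right)\right)^{k/2} = |\tau|^k|f(\tau)|\left(\sigma|\tau|^{-2}\right)^{k/2}\\
 &= |f(\tau)|\sigma^{k/2} = \varphi(\tau).
\end{aligned}
\]
By Proposition 8.3, $\varphi$ is invariant under $SL(2,\mathbb{Z})$. Being bounded on $R$,
$\varphi$ must be bounded on $\mathcal{H}$.
(b) We have $c_n = \int_{-1/2}^{1/2}f(\tau)e^{-2\pi in\tau}\,d\rho$, and we have just seen that
$|f(\tau)| \le C\sigma^{-k/2}$. Hence
\[ |c_n| \le C\sigma^{-k/2}e^{2\pi n\sigma} \quad\text{for all } \sigma > 0. \]
Taking $\sigma = \frac{1}{n}$, we get $|c_n| \le Ce^{2\pi}n^{k/2}$ as required.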
Remark. For fixed $\sigma$, the function $f(\rho+i\sigma)$ is smooth and periodic
in $\rho$. Thus its Fourier coefficients decrease faster than any negative
power of $n$. The numbers $c_n$ are not the Fourier coefficients of $f(\rho+i\sigma)$,
however, but are $e^{2\pi n\sigma}$ times those Fourier coefficients.
Theorem 8.11 (Hecke). If $f \in S_k$ is a cusp form for $SL(2,\mathbb{Z})$, then
the $L$ function $L(s,f)$, initially defined for $\mathrm{Re}\,s > \frac{k}{2} + 1$, extends to be
entire in $s$. Moreover, the function
\[ \Lambda(s,f) = (2\pi)^{-s}\Gamma(s)L(s,f) \tag{8.33} \]
satisfies the functional equation
\[ \Lambda(s,f) = (-1)^{k/2}\Lambda(k-s,f). \tag{8.34} \]
Proof. The transformation law for $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$ is $f(-1/\tau) = \tau^kf(\tau)$
(recall that $k$ is even when $S_k \neq 0$),
and we specialize this to $\tau = \rho + i\sigma$ with $\rho = 0$ to obtain
\[ f(i/\sigma) = i^k\sigma^kf(i\sigma). \tag{8.35} \]
From (8.31) we have
\[ \Lambda(s,f) = \int_0^{\infty}f(i\sigma)\sigma^{s-1}\,d\sigma \tag{8.36} \]
for $\mathrm{Re}\,s > \frac{k}{2} + 1$. The argument now proceeds in the same way as for
Theorem 7.19c. The formal argument, disregarding convergence, is to
make the change of variables $\sigma \to \sigma^{-1}$ in (8.36) and apply (8.35).
To make this argument precise, we use the estimate (8.32) to see that
\[ \int_1^{\infty}f(i\sigma)\sigma^{s-1}\,d\sigma \]
converges for all $s \in \mathbb{C}$ and defines an entire function. For $\mathrm{Re}\,s > \frac{k}{2} + 1$,
we rewrite (8.36) as
\[ \Lambda(s,f) = \int_0^1f(i\sigma)\sigma^{s-1}\,d\sigma + \int_1^{\infty}f(i\sigma)\sigma^{s-1}\,d\sigma \]
and transform just the first term, replacing $\sigma$ by $\sigma^{-1}$ and applying (8.35).
The result is
\[ \Lambda(s,f) = i^k\int_1^{\infty}f(i\sigma)\sigma^{k-s-1}\,d\sigma + \int_1^{\infty}f(i\sigma)\sigma^{s-1}\,d\sigma. \tag{8.37} \]
The first term is of the same form as the second and thus extends to an
entire function of $s$. Hence $\Lambda(s,f)$ extends to an entire function of $s$.
Since $\Gamma(s)$ is nowhere 0, $L(s,f)$ is entire.
Replacing $s$ by $k - s$ in (8.37) gives, upon multiplication by $i^k$,
\[ i^k\Lambda(k-s,f) = (-1)^k\int_1^{\infty}f(i\sigma)\sigma^{s-1}\,d\sigma + i^k\int_1^{\infty}f(i\sigma)\sigma^{k-s-1}\,d\sigma. \]
The right side here matches the right side of (8.37) since $S_k \neq 0$ only
when $k$ is even, and (8.34) follows.
6. Petersson Inner Product
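In this section we shall introduce a Hermitian inner product on the
vector space $S_k$ of cusp forms for $SL(2,\mathbb{Z})$.
Lemma 8.12. The measure $\dfrac{d\rho\,d\sigma}{\sigma^2}$ on $\mathcal{H}$ is invariant under $SL(2,\mathbb{R})$.
Proof. In terms of $\tau$ and $\bar\tau$, we have $d\rho\wedge d\sigma = \frac{i}{2}(d\tau\wedge d\bar\tau)$. So it is
enough to prove that $\dfrac{d\tau\wedge d\bar\tau}{(\mathrm{Im}\,\tau)^2}$ is an invariant 2-form. For $g = \begin{pmatrix}a&b\\ c&d\end{pmatrix}$,
the Jacobian determinant $\dfrac{\partial(g\tau, g\bar\tau)}{\partial(\tau,\bar\tau)}$ is the determinant of a diagonal
matrix with diagonal entries $(c\tau+d)^{-2}$ and $(c\bar\tau+d)^{-2}$. Thus
\[ d(g\tau)\wedge d(g\bar\tau) = \frac{\partial(g\tau, g\bar\tau)}{\partial(\tau,\bar\tau)}\,d\tau\wedge d\bar\tau = |c\tau+d|^{-4}\,d\tau\wedge d\bar\tau. \]
From (8.18) we have
\[ \mathrm{Im}(g\tau) = \frac{\mathrm{Im}\,\tau}{|c\tau+d|^2}, \]
and the lemma follows.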
For $f$ and $h$ in $S_k$, we define
\[ (f,h) = \int_R f(\tau)\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}, \tag{8.38} \]
where $R$ is the usual fundamental domain in $\mathcal{H}$ for $SL(2,\mathbb{Z})$. By Lemma
8.10, $f(\tau)\overline{h(\tau)}\sigma^k$ is bounded, and therefore the integral is convergent.
The result is a well defined Hermitian inner product called the
Petersson inner product on $S_k$.
By Lemma 8.10a, $|f(\tau)|^2\sigma^k$ is invariant under $SL(2,\mathbb{Z})$. In view
of Lemma 8.12, the measure $|f(\tau)|^2\sigma^k\,\dfrac{d\rho\,d\sigma}{\sigma^2}$ on $\mathcal{H}$ is invariant under
$SL(2,\mathbb{Z})$. It follows that if we cut $R$ into countably many pieces,* map
each piece by a member of $SL(2,\mathbb{Z})$, and reassemble the result as a new
fundamental domain $R'$, then
\[ \int_{R'}|f(\tau)|^2\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2} = \int_R|f(\tau)|^2\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}. \]
Any measurable fundamental domain for $SL(2,\mathbb{Z})$ can be obtained in
this way, and thus $(f,f)$ is independent of the choice of fundamental
domain. By polarization, the same conclusion holds for $(f,h)$. We
summarize as follows.
Proposition 8.13. On $S_k$, the Hermitian inner product
\[ (f,h) = \int_R f(\tau)\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2} \]
is independent of the fundamental domain for $SL(2,\mathbb{Z})$.
7. Hecke Operators
Hecke operators are a certain kind of linear operator from $M_k$ to $M_k$,
with $S_k$ mapping to $S_k$. We shall see that these operators commute and
are self adjoint relative to the Petersson inner product. Consequently
$S_k$ has an orthogonal basis of simultaneous eigenvectors for the Hecke
operators. The end result in §8 will be that the $L$ function of each such
eigenvector (normalized to have $c_1 = 1$) has a quadratic Euler product
expansion.
*Technically we should first remove the part of the boundary of $R$ where the
real part is positive, so that $R$ is an exact fundamental domain.
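A lattice in $\mathbb{C}$ is all integer combinations of a pair of complex
numbers linearly independent over $\mathbb{R}$. Recall from §2 that our first
examples of modular forms came from homogeneous functions of lattices:
$G_{2k}(\Lambda)$, $g_2(\Lambda)$, $g_3(\Lambda)$, $\Delta(\Lambda)$. Effectively what was shown at the
beginning of §2 is the following: Let $f$ be a complex-valued function whose
domain is the set of all lattices and which is homogeneous of degree
$-k$ in the sense that
\[ f(\alpha\Lambda) = \alpha^{-k}f(\Lambda) \quad\text{for } \alpha \in \mathbb{C}^\times. \tag{8.39} \]
Define $f$ on the upper half plane $\mathcal{H}$ by
\[ f(\tau) = f(\Lambda_\tau). \tag{8.40} \]
As in the special cases of §2, $f$ satisfies
\[ f\left(\begin{pmatrix}a&b\\ c&d\end{pmatrix}\tau\right) = (c\tau+d)^kf(\tau) \quad\text{for } \begin{pmatrix}a&b\\ c&d\end{pmatrix} \in SL(2,\mathbb{Z}). \tag{8.41} \]
Conversely to any $f : \mathcal{H} \to \mathbb{C}$ satisfying (8.41), we can associate a
function $f$ of the lattice variable $\Lambda$, homogeneous of degree $-k$, by the
definition
\[ f(\mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2) = \omega_1^{-k}f(\omega_2/\omega_1) \quad\text{if } \mathrm{Im}(\omega_2/\omega_1) > 0. \tag{8.42} \]
Let us check that this function depends only on $\Lambda$, not on the basis. If
we have
\[ \begin{pmatrix}\omega_2'\\ \omega_1'\end{pmatrix} = \begin{pmatrix}a&b\\ c&d\end{pmatrix}\begin{pmatrix}\omega_2\\ \omega_1\end{pmatrix}, \]
then
\[
\begin{aligned}
f(\mathbb{Z}\omega_1' \oplus \mathbb{Z}\omega_2') &= \omega_1'^{-k}f(\omega_2'/\omega_1')\\
 &= (c\omega_2 + d\omega_1)^{-k}\left(c(\omega_2/\omega_1) + d\right)^kf(\omega_2/\omega_1)\\
 &= \omega_1^{-k}f(\omega_2/\omega_1)\\
 &= f(\mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2),
\end{aligned}
\]
and the independence follows. As a consequence of (8.42) and this
independence, $f(\Lambda)$ has the homogeneity property
\[ f(\alpha\Lambda) = \alpha^{-k}f(\Lambda) \quad\text{for } \alpha \in \mathbb{C}^\times. \]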
Our initial definition of Hecke operators will be in terms of
homogeneous functions of lattices. Then we shall translate the definition into
the language of modular forms. Let $\mathcal{L}$ be the free abelian group freely
generated by the lattices in $\mathbb{C}$. To avoid confusion, we shall write $n\cdot\Lambda$ or
$n(\Lambda)$ for $n$ times the generator in $\mathcal{L}$, and $n\Lambda$ or $(n\Lambda)$ for the dilate of $\Lambda$ by
the factor $n$. The Hecke operator $T(n)$ on lattices, for $n = 1, 2, 3, \ldots$,
is defined to be the map $T(n) : \mathcal{L} \to \mathcal{L}$ given by
\[ T(n)\Lambda = \sum_{[\Lambda:\Lambda']=n}\Lambda', \tag{8.43} \]
where $[\Lambda : \Lambda']$ is the index of $\Lambda'$ in $\Lambda$. This is a finite sum, since any
such $\Lambda'$ satisfies $n\Lambda \subseteq \Lambda' \subseteq \Lambda$ and so corresponds to a subgroup of
$\Lambda/n\Lambda \cong \mathbb{Z}_n \oplus \mathbb{Z}_n$.
To define $T_k(n)$ on an element $f$ of the space $M_k$ of modular forms of
weight $k$, we let $f(\Lambda)$ be the function of lattices given by (8.42), so that
$f(\Lambda)$ satisfies (8.39). Then $T_k(n)f$, as a function of lattices, is given by
\[ (T_k(n)f)(\Lambda) = n^{k-1}\sum_{[\Lambda:\Lambda']=n}f(\Lambda'). \tag{8.44} \]
It is clear that $T_k(n)f$ is another function of lattices homogeneous of
degree $-k$. Shortly we shall exhibit the corresponding function of the
complex variable $\tau$, denoting it by $T_k(n)f$, and we shall verify that it is
a modular form. The resulting operator $T_k(n)$ on modular forms is also
called a Hecke operator.
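To compute $T_k(n)$ explicitly on functions on lattices, let $\Lambda = \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$
with $\mathrm{Im}\,\omega_2/\omega_1 > 0$, and let $\Lambda' = \mathbb{Z}\omega_1' \oplus \mathbb{Z}\omega_2'$ with $\mathrm{Im}\,\omega_2'/\omega_1' > 0$ be a
sublattice of index $n$. If we write
\[ \begin{pmatrix}\omega_2'\\ \omega_1'\end{pmatrix} = \begin{pmatrix}a&b\\ c&d\end{pmatrix}\begin{pmatrix}\omega_2\\ \omega_1\end{pmatrix}, \tag{8.45} \]
then $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ will be an integral matrix of determinant $n$. The most
general properly ordered basis of $\Lambda'$ is obtained by left multiplying the
left side of (8.45) by a member of $SL(2,\mathbb{Z})$, and thus the right coset
$SL(2,\mathbb{Z})\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ describes all the matrices leading from $\{\omega_1, \omega_2\}$ to $\Lambda'$.
Let $M(n)$ be the set of all integral matrices of determinant $n$. We
have seen that there are only finitely many lattices $\Lambda'$ of index $n$ in $\Lambda$.
Consequently there are only finitely many right cosets $SL(2,\mathbb{Z})\alpha$ in $M(n)$
with $\det\alpha = n$. Thus we have a (disjoint) right coset decomposition
\[ M(n) = \bigcup_{i=1}^{\nu(n)}SL(2,\mathbb{Z})\alpha_i \tag{8.46} \]
with $\nu(n) < \infty$.
To write $T_k(n)f$ as a function of $\tau$, we make the definition
\[ (f\circ[\alpha]_k)(\tau) = f(\alpha\tau)(c\tau+d)^{-k}(\det\alpha)^{k/2} \tag{8.47} \]
for $\alpha = \begin{pmatrix}a&b\\ c&d\end{pmatrix}$ a matrix of positive determinant (so that $\tau \in \mathcal{H}$ implies
$\alpha\tau \in \mathcal{H}$). For an analytic function $f$ on $\mathcal{H}$, the condition that $f$ be an
unrestricted modular form of weight $k$ is that $f\circ[\gamma]_k = f$ for all $\gamma$ in
$SL(2,\mathbb{Z})$. The formula for $T_k(n)f$ involves this definition for matrices of
determinant $n$.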
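Proposition 8.14. Let $\{\alpha_i\}_{i=1}^{\nu(n)}$ be a complete set of representatives
for the right cosets $SL(2,\mathbb{Z})\alpha$ of $SL(2,\mathbb{Z})$ on $M(n)$. If $f$ is in $M_k$, then
$T_k(n)f$ is given as a function of $\tau$ by
\[ T_k(n)f = n^{(k/2)-1}\sum_{i=1}^{\nu(n)}f\circ[\alpha_i]_k. \tag{8.48} \]
Hence $T_k(n)f$ is an unrestricted modular form of weight $k$.
PROOF. The left side of (8.44) at $\Lambda_\tau$ is just $T_k(n)f(\tau)$. Let $[\Lambda_\tau : \Lambda'] =
n$ and let $\alpha_i = \begin{pmatrix}a&b\\ c&d\end{pmatrix}$ be the coset representative corresponding to $\Lambda'$.
This means that
\[ \begin{pmatrix}\omega_2'\\ \omega_1'\end{pmatrix} = \begin{pmatrix}a&b\\ c&d\end{pmatrix}\begin{pmatrix}\tau\\ 1\end{pmatrix} \]
is a basis of $\Lambda'$. Then
\[
\begin{aligned}
f(\Lambda') &= f\left((c\tau+d)\left(\mathbb{Z} \oplus \frac{a\tau+b}{c\tau+d}\,\mathbb{Z}\right)\right)\\
 &= (c\tau+d)^{-k}f\left(\frac{a\tau+b}{c\tau+d}\right)\\
 &= (\det\alpha_i)^{-k/2}(f\circ[\alpha_i]_k)(\tau)\\
 &= n^{-k/2}(f\circ[\alpha_i]_k)(\tau).
\end{aligned}
\]
Substituting into (8.44), we obtain (8.48). This proves the analyticity
of $(T_k(n)f)(\tau)$ in $\mathcal{H}$, and the transformation law follows from the
homogeneity of $T_k(n)f$ as a function of lattices.
In order to examine the $q$ expansion of $T_k(n)f$, we shall use an explicit
set of coset representatives.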
Lemma 8.15. The matrices $\begin{pmatrix}a&b\\ 0&d\end{pmatrix}$ with $ad = n$, $d > 0$, and
$0 \le b < d$ are a complete set of coset representatives for the right cosets
of $SL(2,\mathbb{Z})$ on $M(n)$.
Proof. Let $\begin{pmatrix}a'&b'\\ c'&d'\end{pmatrix}$ be given. Choose relatively prime integers $x$
and $y$ with $xa' + yc' = 0$, and then choose relatively prime integers $u$
and $v$ with $uy + v(-x) = 1$. Then $\begin{pmatrix}u&v\\ x&y\end{pmatrix}$ is in $SL(2,\mathbb{Z})$, and
\[ \begin{pmatrix}u&v\\ x&y\end{pmatrix}\begin{pmatrix}a'&b'\\ c'&d'\end{pmatrix} \]
has lower left entry 0. Let us call the resulting matrix $\begin{pmatrix}a''&b''\\ 0&d''\end{pmatrix}$.
Possibly left multiplying by $\begin{pmatrix}-1&0\\ 0&-1\end{pmatrix}$, we may assume that $a'' > 0$
and $d'' > 0$. Choose integers $q$ and $r$ with $b'' = d''q + r$ and $0 \le r < d''$.
Then
\[ \begin{pmatrix}1&-q\\ 0&1\end{pmatrix}\begin{pmatrix}a''&b''\\ 0&d''\end{pmatrix} = \begin{pmatrix}a''&r\\ 0&d''\end{pmatrix} \]
is a coset representative of the kind described in the lemma.
Now suppose that two of the elements in the lemma are in the same
right coset. Say
\[ \begin{pmatrix}u&v\\ x&y\end{pmatrix}\begin{pmatrix}a&b\\ 0&d\end{pmatrix} = \begin{pmatrix}a'&b'\\ 0&d'\end{pmatrix} \]
with $uy - vx = 1$. The lower left entry forces $x = 0$. The determinant
forces $uy = 1$. Consideration of signs forces $u = y = 1$. Finally the
inequalities on $b$ and $b'$ force $v = 0$. Thus we have a complete set of
representatives.
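The lemma translates directly into a short computation. The sketch below lists the representatives and reduces an arbitrary integral matrix of determinant $n > 0$ to its representative. The reduction uses Euclidean row operations, a constructive variant of the choice of $x, y, u, v$ in the proof, and every step is a left multiplication by an element of $SL(2,\mathbb{Z})$. Function names are illustrative.

```python
def coset_representatives(n):
    """All matrices ((a, b), (0, d)) with a*d = n, d > 0, 0 <= b < d."""
    return [((n // d, b), (0, d))
            for d in range(1, n + 1) if n % d == 0
            for b in range(d)]


def reduce_to_representative(m):
    """Representative of the right coset SL(2,Z) * m, for m integral with det m > 0."""
    (a, b), (c, d) = m
    while c != 0:
        q = a // c
        a, b = a - q * c, b - q * d      # left-multiply by ((1, -q), (0, 1))
        a, b, c, d = c, d, -a, -b        # left-multiply by ((0, 1), (-1, 0))
    if d < 0:
        a, b, d = -a, -b, -d             # left-multiply by minus the identity
    return ((a, b % d), (0, d))          # left-multiply by a power of ((1, 1), (0, 1))


reps = coset_representatives(6)
print(len(reps))                                   # number of cosets for n = 6
print(reduce_to_representative(((4, 1), (2, 2))))  # a matrix of determinant 6
```

The number of representatives is the sum of the divisors of $n$, since each divisor $d$ contributes $d$ choices of $b$.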
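Proposition 8.16. Let $f$ in $M_k$ have $q$ expansion $f(\tau) = \sum_{n=0}^{\infty}c_nq^n$.
Then $T_k(m)f$ has $q$ expansion
\[ T_k(m)f(\tau) = \sum_{n=0}^{\infty}b_nq^n, \]
where
\[ b_n = \begin{cases} c_0\,\sigma_{k-1}(m) & \text{if } n = 0\\ c_m & \text{if } n = 1\\ \sum_{a\mid\mathrm{GCD}(n,m)}a^{k-1}c_{nm/a^2} & \text{if } n > 1, \end{cases} \]
with $\sigma_{k-1}$ as in Proposition 8.1. Consequently $T_k(m)$ carries $M_k$ to $M_k$
and carries $S_k$ to $S_k$.
REMARK. We shall make serious use of the formula for $b_n$ only in the
case $n = 1$.
Proof. We apply (8.48), using the representatives in Lemma 8.15.
If $\alpha = \begin{pmatrix}a&b\\ 0&d\end{pmatrix}$ is such a representative with $ad = m$, then
\[
\begin{aligned}
f\circ[\alpha]_k(\tau) &= f\left(\frac{a\tau+b}{d}\right)d^{-k}m^{k/2}\\
 &= \sum_{n=0}^{\infty}c_ne^{2\pi in(a\tau+b)/d}\,d^{-k}m^{k/2}.
\end{aligned}
\]
Hence
\[ T_k(m)f(\tau) = m^{k-1}\sum_{n=0}^{\infty}\sum_{a,b,d}d^{-k}c_ne^{2\pi in(a\tau+b)/d} \tag{8.49} \]
with the inner sum over $a, b, d$ as in the lemma. The sum
\[ \sum_{b=0}^{d-1}e^{2\pi inb/d} \]
equals $d$ if $d \mid n$ and equals 0 otherwise. So we can drop all $n$'s except
those of the form $n = ld$, and then (8.49) is
\[ = m^{k-1}\sum_{l=0}^{\infty}\sum_{\substack{ad=m\\ d>0}}d^{-k+1}c_{ld}\,q^{la} = \sum_{l=0}^{\infty}\sum_{\substack{a\mid m\\ a>0}}a^{k-1}c_{lm/a}\,q^{la}. \]
The coefficient of $q^0$ comes from $l = 0$ and is
\[ c_0\sum_{\substack{a\mid m\\ a>0}}a^{k-1} = c_0\,\sigma_{k-1}(m). \]
The coefficient of $q^1$ comes from $a = l = 1$; it is just $c_m$. For $n \ge 2$, the
coefficient of $q^n$ comes from triples $(l, a, d)$ with $la = n$ and $a \mid m$. The
factor $c_{lm/a}$ is $c_{nm/a^2}$ with $a \mid n$ and $a \mid m$. Thus the coefficient of $q^n$ is
\[ \sum_{a\mid\mathrm{GCD}(n,m)}a^{k-1}c_{nm/a^2} \]
as asserted. From the formula for $b_n$, we see that $T_k(m)f$ is a modular
form; hence $T_k(m)$ carries $M_k$ to $M_k$. If $c_0 = 0$, then $b_0 = 0$; hence
$T_k(m)$ carries $S_k$ to $S_k$.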
Corollary 8.17. Let $f$ in $M_k$ have $q$ expansion $f(\tau) = \sum_{n=0}^{\infty}c_nq^n$.
For $p$ prime, $T_k(p)f$ has $q$ expansion $T_k(p)f(\tau) = \sum_{n=0}^{\infty}b_nq^n$ with
\[ b_n = \begin{cases} c_{pn} & \text{if } p \nmid n\\ c_{pn} + p^{k-1}c_{n/p} & \text{if } p \mid n. \end{cases} \]
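The coefficient formula of Proposition 8.16 can be tried on $\Delta/(2\pi)^{12}$. Since $S_{12}$ is one-dimensional by Corollary 8.8 and $T_{12}(m)$ carries $S_{12}$ to itself, $T_{12}(m)$ must multiply $\Delta$ by a scalar, and comparing coefficients of $q$ shows that the scalar is $c_m$. The sketch below checks this numerically. The helper `delta_coefficients` (the product formula of Corollary 8.9) and the function `hecke` are illustrative names, not notation from the text.

```python
from math import gcd


def delta_coefficients(terms):
    """Coefficients of q * prod (1 - q^n)^24 for q^0 .. q^(terms-1); needs terms >= 2."""
    series = [0] * terms
    series[1] = 1
    for n in range(1, terms):
        for _ in range(24):
            for i in range(terms - 1, n - 1, -1):
                series[i] -= series[i - n]   # multiply in place by (1 - q^n)
    return series


def hecke(coeffs, m, k):
    """q expansion coefficients b_n of T_k(m) f, for every n with n*m < len(coeffs)."""
    count = (len(coeffs) - 1) // m + 1
    out = []
    for n in range(count):
        if n == 0:
            # b_0 = c_0 * sigma_{k-1}(m)
            out.append(coeffs[0] * sum(a ** (k - 1) for a in range(1, m + 1) if m % a == 0))
        else:
            g = gcd(n, m)
            out.append(sum(a ** (k - 1) * coeffs[n * m // (a * a)]
                           for a in range(1, g + 1) if g % a == 0))
    return out


tau = delta_coefficients(31)
image = hecke(tau, 2, 12)
print(image[:5])
print([tau[2] * c for c in tau[:5]])
```

The two printed lists agree, which is the eigenvector property for $m = 2$ on the computed range of coefficients.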
Now we examine how the $T_k(n)$ interact with one another. We begin
with the operators $T(n)$ on $\mathcal{L}$. Let $R(n) : \mathcal{L} \to \mathcal{L}$ be given by
\[ R(n)(\Lambda) = (n\Lambda). \]
It is clear that $R(n)T(m) = T(m)R(n)$ for all $n$ and $m$.
Lemma 8.18.
(a) For a prime power pr with r > 1,
7V)T(p) = T^*1) + p • fl^TXp'-1). (8.50)
(b) T(m)T(n) = T(mn) if m and n are relatively prime.
Proof, (a) Both sides associate to A some sublattices A' of index
pr+1, and we have to check that the multiplicity of a sublattice A' is the
same for both sides. Fix such a A'.
First suppose A' C pA. Then R(p)T(pr-1) = T(pr~1)(pA) contains
A' with multiplicity one. Therefore A' occurs on the right side of (8.50)
with multiplicity p + 1. On the left side, T(p)A is the sum of the p + 1
lattices strictly between pA and A, and so A' occurs on the left side with
multiplicity p + 1.
Now suppose A' is not contained in pA. Then it occurs on the right
side with multiplicity one, coming only from T(pr'^1). On the left side
it occurs with multiplicity at least one. If it occurs with multiplicity
greater than one, then there are two distinct sublattices Ai and A2 of
A of index p with A' C Ai and A' C A2. Then A' C Ai fl A2 = pA,
contradiction. This proves (a).
(b) If A' is a sublattice of index mn in A, then A/A' is finite abelian
of order mn. If m and n are relatively prime, we can write
A/A; = Gi © G2
uniquely with |Gi| = n and |G2| = m. If A" denotes the pullback to A
of the members of Gi, then A" is the unique lattice with A' C A" C A
and [A : A"] = m. Hence A' occurs on both sides of T(m)T(n)(A) and
T(mn)(A) with multiplicity one.
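Lemma 8.18(a) is a statement about multisets of sublattices, so it can be checked by brute force for small $p$ and $r$. The following sketch makes several assumptions for illustration: a lattice is stored by an upper triangular integer basis matrix (rows are basis vectors), sublattices of index $n$ are enumerated by the matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$, $0 \le b < d$, and formal sums in $\mathcal{L}$ are stored as a `Counter`. All function names are hypothetical.

```python
from collections import Counter


def reps(n):
    """Matrices (a, b, 0, d) with ad = n, d > 0, 0 <= b < d: one per sublattice of index n."""
    return [(n // d, b, 0, d) for d in range(1, n + 1) if n % d == 0 for b in range(d)]


def mul(x, y):
    """Product of 2-by-2 matrices stored row-major as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def hnf(m):
    """Canonical basis of the lattice spanned by the rows of m, using row operations only."""
    a, b, c, d = m
    while c != 0:                      # Euclidean algorithm on the first column
        q = a // c
        a, b, c, d = c, d, a - q * c, b - q * d
    if a < 0:
        a, b = -a, -b
    d = abs(d)
    return (a, b % d, 0, d)


def T(n, formal_sum):
    """Apply T(n) to a formal sum of lattices {basis: multiplicity}."""
    out = Counter()
    for basis, mult in formal_sum.items():
        for h in reps(n):
            out[hnf(mul(h, basis))] += mult
    return out


def R(n, formal_sum):
    """Apply R(n): replace each lattice by n times itself."""
    return Counter({tuple(n * x for x in basis): mult for basis, mult in formal_sum.items()})


def lemma_8_18a_holds(p, r):
    """Compare both sides of T(p^r)T(p) = T(p^(r+1)) + p R(p) T(p^(r-1)) on the lattice Z^2."""
    start = Counter({(1, 0, 0, 1): 1})
    left = T(p ** r, T(p, start))
    right = T(p ** (r + 1), start)
    for basis, mult in R(p, T(p ** (r - 1), start)).items():
        right[basis] += p * mult
    return left == right
```

For instance `lemma_8_18a_holds(2, 1)` compares nine sublattices of index 4 counted with multiplicity on each side; the lattice $2\mathbb{Z}^2$ appears three times on both.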
We shall translate Lemma 8.18 into a statement about the Hecke
operators on modular forms. The lattices that occur in the definition of
$T(n)(\Lambda)$ in (8.43) have index $n$ in $\Lambda$, and we shall say that $T(n) : \mathcal{L} \to \mathcal{L}$
is an operator of degree $n$. More generally a linear operator $N : \mathcal{L} \to \mathcal{L}$
of the form
\[ N(\Lambda) = \sum_{[\Lambda : \Lambda'] = n} m_{\Lambda'} (\Lambda') \tag{8.51} \]
will be said to be of degree $n$. In this sense, $R(n)$ has degree $n^2$.
If $N : \mathcal{L} \to \mathcal{L}$ has degree $n$ and is given as in (8.51), we define $N_k$ on
functions $f$ on lattices of homogeneity $-k$ by
\[ (N_k f)(\Lambda) = n^{k-1} \sum_{[\Lambda : \Lambda'] = n} m_{\Lambda'} f(\Lambda'). \tag{8.52} \]
This definition is consistent with (8.44), and it satisfies the properties
\[ (S + N)_k = S_k + N_k \tag{8.53} \]
if $S$ and $N$ both have degree $n$, and
\[ (MN)_k = N_k M_k \tag{8.54} \]
if $M$ has degree $m$ and $N$ has degree $n$ (so that $MN$ has degree $mn$).
Property (8.54) argues for putting $N_k$ on the right side of $f$ in (8.52),
but the operators of interest to us will commute, and (8.54) will not be
a problem. Notice that our definition (8.52) makes
\[ (R(n)_k f)(\Lambda) = (n^2)^{k-1} f(n\Lambda) = n^{k-2} f(\Lambda). \tag{8.55} \]
Theorem 8.19 (Hecke). On the space $M_k$, the Hecke operators
satisfy
(a) For a prime power $p^r$ with $r \ge 1$,
\[ T_k(p^r)T_k(p) = T_k(p^{r+1}) + p^{k-1} T_k(p^{r-1}). \]
Hence $T_k(p^r)$ is a polynomial in $T_k(p)$ with integer coefficients.
(b) $T_k(m)T_k(n) = T_k(mn)$ if $m$ and $n$ are relatively prime.
(c) The algebra generated by the $T_k(n)$ for $n = 1, 2, 3, \ldots$ is generated
by the $T_k(p)$ with $p$ prime and is commutative.
Proof. (a) Let $f$ be in $M_k$. By (8.53), (8.54), and (8.55), Lemma
8.18a gives
\[ T_k(p)T_k(p^r)f = T_k(p^{r+1})f + p \cdot p^{k-2}\, T_k(p^{r-1})f. \]
Hence
\[ T_k(p)T_k(p^r)f = T_k(p^{r+1})f + p^{k-1} T_k(p^{r-1})f. \]
Since $T_k(1)$ is the identity, this equation shows recursively that $T_k(p^r)$
is a polynomial in $T_k(p)$ and hence commutes with $T_k(p)$. Then (a)
follows.
(b) This follows from Lemma 8.18b and (8.54).
(c) By (b), the algebra generated by the $T_k(n)$ is the same as that
generated by the $T_k(p^r)$. By (a) it is the same as that generated by the
$T_k(p)$. In turn, these commute by (b).
8. Interaction with Petersson Inner Product
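We have proved that the Hecke operators on $S_k$ commute with one
another, and we shall next prove that they are self adjoint operators
relative to the Petersson inner product. The proof is a little subtle, and
we begin with two lemmas.
For any integer $N \ge 1$, we define the principal congruence
subgroup $\Gamma(N)$ of $SL(2,\mathbb{Z})$ by
\[ \Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) \;\middle|\; \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \bmod N \right\}. \]
The group $\Gamma(N)$ is the kernel of the reduction-modulo-$N$ homomorphism
carrying $SL(2,\mathbb{Z})$ to $SL(2,\mathbb{Z}_N)$. Since $SL(2,\mathbb{Z}_N)$ is finite, $\Gamma(N)$ has finite
index in $SL(2,\mathbb{Z})$. Observe that $\Gamma(1) = SL(2,\mathbb{Z})$.
For $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $M(n)$, we let $\alpha' = n\alpha^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. The
matrix $\alpha'$ is again in $M(n)$.
Lemma 8.20. If $\alpha$ is in $M(n)$, then
\[ \alpha\Gamma(nm)\alpha^{-1} \subseteq \Gamma(m). \]
Consequently
\[ \Gamma(n^2) \subseteq \alpha\Gamma(n)\alpha^{-1} \subseteq SL(2,\mathbb{Z}). \tag{8.56} \]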
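Proof. Let $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and let $\gamma$ be in $\Gamma(nm)$. Then
\[ \alpha\gamma\alpha' \equiv \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} n & 0 \\ 0 & n \end{pmatrix} \bmod nm, \]
and
\[ \alpha\gamma\alpha^{-1} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \bmod m. \]
Taking $m = 1$ gives the right hand inclusion of (8.56). Taking $m = n$
and replacing $\alpha$ by $\alpha'$ gives the left hand inclusion.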
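Lemma 8.21. If $\alpha$ is in $M(n)$, then the group
\[ \Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1} \]
has the same finite index in $\Gamma(n)$ as it does in $\alpha\Gamma(n)\alpha^{-1}$.
REMARK. The index is finite since both groups lie between $\Gamma(n^2)$ and
$SL(2,\mathbb{Z})$, by Lemma 8.20, and since $\Gamma(n^2)$ has finite index in $SL(2,\mathbb{Z})$.
Proof. Let $R$ be a fundamental domain in $\mathcal{H}$ for the action of $\Gamma(n)$
(obtained, for example, as the union of the $[SL(2,\mathbb{Z}) : \Gamma(n)]$ translates
of the standard fundamental domain for $SL(2,\mathbb{Z})$ by left coset representatives
of $SL(2,\mathbb{Z})/\Gamma(n)$). Let $R'$ be a
fundamental domain in $\mathcal{H}$ for the action of $\Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}$, obtained as the
union of
\[ [\Gamma(n) : \Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}] \]
translates of $R$ by elements of $\Gamma(n)$. If $\mu$ is the measure $\dfrac{d\rho\, d\sigma}{\sigma^2}$ in $\mathcal{H}$,
then
\[ \mu(R') = [\Gamma(n) : \Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}]\, \mu(R). \tag{8.57} \]
Now $\alpha R$ is a fundamental domain in $\mathcal{H}$ for the action of $\alpha\Gamma(n)\alpha^{-1}$.
[In fact, if $\tau$ is given, choose $\gamma \in \Gamma(n)$ with $\gamma(\alpha^{-1}\tau) \in R$; then $(\alpha\gamma\alpha^{-1})\tau$
is in $\alpha R$. Uniqueness follows similarly.] Moreover,
\[ \mu(\alpha R) = \mu(R) \tag{8.58} \]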
by Lemma 8.12 applied to the element $n^{-1/2}\alpha$ of $SL(2,\mathbb{R})$. Next let
$(\alpha R)'$ be a fundamental domain in $\mathcal{H}$ for the action of $\Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}$,
obtained as the union of
\[ [\alpha\Gamma(n)\alpha^{-1} : \Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}] \]
translates of $\alpha R$ by elements of $\alpha\Gamma(n)\alpha^{-1}$. Then
\[ \mu((\alpha R)') = [\alpha\Gamma(n)\alpha^{-1} : \Gamma(n) \cap \alpha\Gamma(n)\alpha^{-1}]\, \mu(\alpha R). \tag{8.59} \]
Since $R'$ and $(\alpha R)'$ are fundamental domains for the intersection, the
same argument as at the end of §6 shows that $\mu(R') = \mu((\alpha R)')$.
Comparing (8.57) and (8.59) and using (8.58), we obtain the result of the
lemma.
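Theorem 8.22 (Petersson). The Hecke operators $T_k(n)$ on the space
of cusp forms $S_k$ are self adjoint relative to the Petersson inner product.
Proof. By Theorem 8.19, it is enough to prove the result for $n$ equal
to a prime $p$. We shall use Lemmas 8.20 and 8.21 and the notation in
the proof of Lemma 8.21. Also, as in the proof of Corollary 8.4, we let
$\delta(g, \tau) = c\tau + d$ if $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has determinant one. If $f$ and $h$ are in $S_k$,
we are to prove that
\[ \int_{SL(2,\mathbb{Z})\backslash\mathcal{H}} T_k(p)f(\tau)\,\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \int_{SL(2,\mathbb{Z})\backslash\mathcal{H}} f(\tau)\,\overline{T_k(p)h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}, \tag{8.60} \]
the integrals being taken over a fundamental domain for $SL(2,\mathbb{Z})$.
By Proposition 8.13 it is enough to prove that
\[ \int_{R} T_k(p)f(\tau)\,\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \int_{R} f(\tau)\,\overline{T_k(p)h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}, \tag{8.61} \]
with $R$ a fundamental domain for $\Gamma(p)$,
since each side of (8.61) is the same multiple of the corresponding side
of (8.60). Let $\{\alpha_i\}$ be right coset representatives of $M(p)$ relative to
$SL(2,\mathbb{Z})$, as in (8.46). For each choice of $\alpha_i$ as in Lemma 8.15, the
element $\alpha_i' = p\alpha_i^{-1}$ has $\alpha_i' = \gamma_i\alpha_i\gamma_i'$ for some $\gamma_i$ and $\gamma_i'$ in $SL(2,\mathbb{Z})$.
Namely
\[ \alpha_i = \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix} \quad\text{has}\quad \alpha_i' = \begin{pmatrix} p & -b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} b & 1 \\ -1 & 0 \end{pmatrix} \alpha_i \begin{pmatrix} -b & -1 \\ 1 & 0 \end{pmatrix} \]
and
\[ \alpha_i = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} \quad\text{has}\quad \alpha_i' = \begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \alpha_i \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]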
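Since $f \circ [\gamma_i'^{-1}]_k = f$ and $h \circ [\gamma_i]_k = h$, Lemma 8.12 shows that
\[ \begin{aligned}
\int_R f(\tau)\,\overline{h \circ [\alpha_i']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 &= \int_{\gamma_i' R} f(\tau)\,\overline{h \circ [\gamma_i\alpha_i]_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2} \\
 &= \int_{R} f(\tau)\,\overline{h \circ [\gamma_i\alpha_i]_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2} \\
 &= \int_{R} f(\tau)\,\overline{h \circ [\alpha_i]_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}.
\end{aligned} \]
In view of Proposition 8.14, proving (8.61) therefore reduces to proving
\[ \int_R f \circ [\alpha]_k(\tau)\,\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \int_R f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2} \tag{8.62} \]
for each $\alpha = \alpha_i$.
We shall change variables on the left side of (8.62), replacing $\alpha\tau$ by a
new variable $\tau'$. Note that $\alpha^{\#} = n^{-1/2}\alpha$ (with $n = p$) has determinant 1 and satisfies
\[ f \circ [\alpha]_k = f \circ [\alpha^{\#}]_k. \]
Thus (8.16) and (8.18) give
\[ \begin{aligned}
f \circ [\alpha]_k(\tau)\,\overline{h(\tau)}\,\sigma^k
 &= f \circ [\alpha^{\#}]_k(\tau)\,\overline{h(\tau)}\,(\operatorname{Im}\tau)^k \\
 &= f(\alpha^{\#}\tau)\,\overline{h(\alpha^{\#-1}(\alpha^{\#}\tau))}\,\delta(\alpha^{\#},\tau)^{-k}(\operatorname{Im}\tau)^k \\
 &= f(\alpha^{\#}\tau)\,\overline{h(\alpha^{\#-1}(\alpha^{\#}\tau))}\;\overline{\delta(\alpha^{\#},\tau)}^{\,k}\left(\frac{\operatorname{Im}\tau}{|\delta(\alpha^{\#},\tau)|^2}\right)^{k} \\
 &= f(\alpha^{\#}\tau)\,\overline{h(\alpha^{\#-1}(\alpha^{\#}\tau))\,\delta(\alpha^{\#-1},\alpha^{\#}\tau)^{-k}}\,(\operatorname{Im}\alpha^{\#}\tau)^k \\
 &= f(\tau')\,\overline{h \circ [\alpha']_k(\tau')}\,(\operatorname{Im}\tau')^k.
\end{aligned} \]
Taking Lemma 8.12 into account, we obtain
\[ \int_R f \circ [\alpha]_k(\tau)\,\overline{h(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \int_{\alpha R} f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}. \tag{8.63} \]
Now $f$ satisfies
\[ f(\gamma\tau) = \delta(\gamma,\tau)^k f(\tau) \quad\text{for } \gamma \in \alpha\Gamma(p)\alpha^{-1} \]
since $\alpha\Gamma(p)\alpha^{-1} \subseteq SL(2,\mathbb{Z})$ by Lemma 8.20. Also we readily check that
$h \circ [\alpha']_k$ satisfies
\[ h \circ [\alpha']_k(\gamma\tau) = \delta(\gamma,\tau)^k\, h \circ [\alpha']_k(\tau) \quad\text{for } \gamma \in \alpha\Gamma(p)\alpha^{-1}, \]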
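since $\alpha'(\alpha\Gamma(p)\alpha^{-1})\alpha'^{-1} = \Gamma(p) \subseteq SL(2,\mathbb{Z})$. By (8.18), the integrand
on the right side of (8.63) is invariant under $\alpha\Gamma(p)\alpha^{-1}$. Since $\alpha R$ is a
fundamental domain for $\alpha\Gamma(p)\alpha^{-1}$, the right side of (8.63) is unaffected
by replacing $\alpha R$ by any other such fundamental domain. Thus we have
\[ \int_{\alpha R} f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \frac{1}{[\alpha\Gamma(p)\alpha^{-1} : \Gamma(p) \cap \alpha\Gamma(p)\alpha^{-1}]} \int_{(\alpha R)'} f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}. \tag{8.64} \]
Meanwhile, on the right side of (8.62), $f$ satisfies
\[ f(\gamma\tau) = \delta(\gamma,\tau)^k f(\tau) \quad\text{for } \gamma \in \Gamma(p), \]
and we readily check that $h \circ [\alpha']_k$ satisfies
\[ h \circ [\alpha']_k(\gamma\tau) = \delta(\gamma,\tau)^k\, h \circ [\alpha']_k(\tau) \quad\text{for } \gamma \in \Gamma(p) \]
since $\alpha'\Gamma(p)\alpha'^{-1} \subseteq SL(2,\mathbb{Z})$ by Lemma 8.20. Thus the integrand on the
right side of (8.62) is invariant under $\Gamma(p)$. Since $R$ is a fundamental
domain for $\Gamma(p)$, the right side of (8.62) is unaffected by replacing $R$ by
any other such fundamental domain. Thus we have
\[ \int_{R} f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}
 = \frac{1}{[\Gamma(p) : \Gamma(p) \cap \alpha\Gamma(p)\alpha^{-1}]} \int_{R'} f(\tau)\,\overline{h \circ [\alpha']_k(\tau)}\,\sigma^k\,\frac{d\rho\,d\sigma}{\sigma^2}. \tag{8.65} \]
The numerical coefficients on the right sides of (8.64) and (8.65) are
equal, by Lemma 8.21. Also the common integrand is invariant under
$\Gamma(p) \cap \alpha\Gamma(p)\alpha^{-1}$, and $R'$ and $(\alpha R)'$ are two fundamental domains for
this group. Therefore the two integrals are equal. Tracing back, we see
that (8.62) is a valid equality, and the theorem follows.
We have now shown that the Hecke operators $T_k(n)$ are a commuting
family of self adjoint operators on the space $S_k$ of cusp forms for $SL(2,\mathbb{Z})$
of weight $k$. Consequently $S_k$ has an orthogonal basis of simultaneous
eigenvectors.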
Proposition 8.23. Let $f \in S_k$ be a simultaneous eigenvector of
$T_k(n)$ with $T_k(n)f = \lambda(n)f$. If $f$ has $q$ expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$,
then
\[ c_n = \lambda(n)c_1. \tag{8.66} \]
Consequently
(a) $f \ne 0$ implies $c_1 \ne 0$
(b) the system of eigenvalues $\{\lambda(n)\}$ determines $f$ up to a scalar.
Proof. In the $q$ expansion of $T_k(n)f$, the coefficient of $q$ has to be
$\lambda(n)c_1$. But Proposition 8.16 gives it as $c_n$. Thus (8.66) follows. Then
(a) and (b) are immediate consequences.
Suppose now that $f \in S_k$ is a simultaneous eigenvector of $T_k(n)$.
Proposition 8.23 allows us to normalize $f$ so that the $q$ expansion $f(\tau) =
\sum_{n=1}^{\infty} c_n q^n$ has $c_1 = 1$. Then (8.66) says that $c_n$ is the eigenvalue of
$T_k(n)$. From Theorem 8.19 we see that
\[ c_{p^r} c_p = c_{p^{r+1}} + p^{k-1} c_{p^{r-1}} \quad\text{for } p \text{ prime} \tag{8.67a} \]
\[ c_m c_n = c_{mn} \quad\text{if } \mathrm{GCD}(m, n) = 1. \tag{8.67b} \]
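The relations (8.67) can be observed numerically for $k = 12$. The sketch below assumes, as a standard fact not restated here, that the normalized cusp form of weight 12 has $q$ expansion $q\prod_{n \ge 1}(1 - q^n)^{24}$, so that $S_{12}$ being one-dimensional makes it a simultaneous eigenvector with $c_1 = 1$. The function name is hypothetical, and the product is expanded by naive truncated multiplication.

```python
from math import gcd


def delta_coefficients(limit):
    """Coefficients c_0..c_limit of q * prod_{n>=1} (1 - q^n)^24, truncated at q^limit."""
    series = [0] * (limit + 1)
    series[1] = 1                          # the leading factor q
    for n in range(1, limit):
        for _ in range(24):                # multiply by (1 - q^n), 24 times
            for i in range(limit, n - 1, -1):
                series[i] -= series[i - n]
    return series


tau = delta_coefficients(30)

# (8.67a) with k = 12: c_{p^r} c_p = c_{p^{r+1}} + p^11 c_{p^{r-1}}
for p, r in [(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (5, 1)]:
    assert tau[p ** r] * tau[p] == tau[p ** (r + 1)] + p ** 11 * tau[p ** (r - 1)]

# (8.67b): c_m c_n = c_{mn} for relatively prime m, n
for m in range(1, 31):
    for n in range(1, 30 // m + 1):
        if gcd(m, n) == 1:
            assert tau[m] * tau[n] == tau[m * n]
```

The first few values produced are $1, -24, 252, -1472, 4830, -6048$, and for example $c_4 = c_2^2 - 2^{11}$.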
Let us return to the $L$ function of $f$, defined in §5 by
\[ L(s, f) = \sum_{n=1}^{\infty} \frac{c_n}{n^s}. \]
According to §VII.2, the condition (8.67b) means that $L(s, f)$ has an
Euler product. By Corollary 7.8, (8.67a) means that the Euler product
is quadratic, the $p$th factor being
\[ \frac{1}{1 - c_p p^{-s} + p^{k-1-2s}}. \]
(Here we are taking $d_p = -p^{k-1}$ in (7.23).) We summarize as follows.
Theorem 8.24 (Hecke-Petersson). The space $S_k$ of cusp forms has
an orthogonal basis of simultaneous eigenvectors under the Hecke
operators $T_k(n)$. Each such eigenvector $f$ can be normalized so that its $q$
expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ has $c_1 = 1$. With such a normalization
the coefficients $c_n$ satisfy (8.67). Moreover, the $L$ function $L(s, f)$ has
an Euler product expansion
\[ L(s, f) = \prod_{p \text{ prime}} \frac{1}{1 - c_p p^{-s} + p^{k-1-2s}} \tag{8.68} \]
convergent for $\operatorname{Re} s > \frac{k}{2} + 1$.
CHAPTER IX
MODULAR FORMS FOR HECKE SUBGROUPS
1. Hecke Subgroups
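In §VIII.8 we already defined the principal congruence subgroup
$\Gamma(N)$ of $SL(2,\mathbb{Z})$, for any integer $N \ge 1$, by
\[ \Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) \;\middle|\; \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \bmod N \right\}. \tag{9.1} \]
This is the kernel of the reduction-modulo-$N$ homomorphism carrying
$SL(2,\mathbb{Z})$ to $SL(2,\mathbb{Z}_N)$ and therefore is a normal subgroup of finite index
in $SL(2,\mathbb{Z})$. The Hecke subgroups are
\[ \Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) \;\middle|\; c \equiv 0 \bmod N \right\}. \tag{9.2} \]
These satisfy
\[ \Gamma(N) \subseteq \Gamma_0(N) \subseteq SL(2,\mathbb{Z}) \tag{9.3} \]
and hence have finite index in $SL(2,\mathbb{Z})$.
We can compute the index of $\Gamma_0(N)$ by relating this subgroup to
the set $M(N)$ of integer matrices of determinant $N$. Let $M^*(N)$ be the
subset of primitive such matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, those with $\mathrm{GCD}(a, b, c, d) = 1$.
Lemma 9.1. With $\Gamma = SL(2,\mathbb{Z})$,
\[ M^*(n) = \Gamma \begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix} \Gamma = \bigcup_{\alpha} \Gamma\alpha \tag{9.4} \]
disjointly, where $\alpha$ runs over all integer matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$,
$d > 0$, $0 \le b < d$, and $\mathrm{GCD}(a, b, d) = 1$. The number of such right cosets is
\[ [M^*(n) : \Gamma] = n \prod_{\substack{p \text{ prime} \\ p \mid n}} \left(1 + \frac{1}{p}\right). \tag{9.5} \]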
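Proof. If we multiply a nonprimitive member of $M(n)$ on either side
by a member of $\Gamma$, the result is still nonprimitive. Taking complements,
we see that $M^*(n)$ is preserved under these operations.
The conclusion that $M^*(n) = \Gamma \begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix} \Gamma$ is elementary divisor theory:
In the proof that a doubly generated subgroup of a free abelian group
of rank 2 is free abelian, one does integer row and column operations
on a 2-by-2 integer matrix and arrives at a diagonal matrix. Starting
from $\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$, we can do those operations here and arrive at $\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$
with $\mathrm{GCD}(a, d) = 1$. This answer corresponds to having $\mathbb{Z}_a \oplus \mathbb{Z}_d$ as the
quotient of the free abelian groups. Since $\mathrm{GCD}(a, d) = 1$, $\mathbb{Z}_a \oplus \mathbb{Z}_d \cong \mathbb{Z}_n$,
and the theory shows we could have arrived at $\begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix}$.
For the conclusion $M^*(n) = \bigcup_{\alpha} \Gamma\alpha$, we can apply Lemma 8.15 to
$\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} \in M^*(n)$ and see that $\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$ is in a unique $\Gamma\alpha$ such that
$\alpha = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, $ad = n$, $d > 0$, and $0 \le b < d$. Since $\alpha$ must be primitive,
we have also $\mathrm{GCD}(a, b, d) = 1$. Hence $M^*(n) = \bigcup_{\alpha} \Gamma\alpha$ in the asserted
fashion.
For the index formula (9.5), we first show that the index is
multiplicative. In fact, let $\mathrm{GCD}(n, m) = 1$. Write $M^*(n) = \bigcup_i \Gamma\alpha_i$ and
$M^*(m) = \bigcup_j \Gamma\beta_j$ as in (9.4). It is clear that $M^*(nm) \supseteq \bigcup_{i,j} \Gamma\alpha_i\beta_j$. We
shall show that equality holds and that the $\Gamma\alpha_i\beta_j$ are disjoint. In fact,
if $\delta$ is in $M^*(nm)$, then
\[ \begin{aligned}
\delta &= \gamma_1 \begin{pmatrix} nm & 0 \\ 0 & 1 \end{pmatrix} \gamma_2 && \text{by (9.4)} \\
 &= \gamma_1 \begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix} \gamma_2 \\
 &= \gamma_1 \begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix} \gamma_3 \beta_j && \text{for some } \gamma_3 \in \Gamma \text{ and some } j \\
 &= \gamma_4 \alpha_i \beta_j && \text{for some } \gamma_4 \in \Gamma \text{ and some } i.
\end{aligned} \]
Thus $M^*(nm) = \bigcup_{i,j} \Gamma\alpha_i\beta_j$. If $\Gamma\alpha_i\beta_j = \Gamma\alpha_{i'}\beta_{j'}$, then $\alpha_i\beta_j = \gamma\alpha_{i'}\beta_{j'}$
and $\beta_{j'}\beta_j^{-1} \in \alpha_{i'}^{-1}\Gamma\alpha_i$. From this relation we see that $\beta_{j'}\beta_j^{-1}$ has entries
in $m^{-1}\mathbb{Z}$ and also in $n^{-1}\mathbb{Z}$. Since $\mathrm{GCD}(m, n) = 1$, $\beta_{j'}\beta_j^{-1}$ is in $\Gamma$. Thus
$\beta_j = \beta_{j'}$. Then $\alpha_i = \gamma\alpha_{i'}$ and $\alpha_i = \alpha_{i'}$.
To complete the proof of the index formula, we may take $n = p^r$ with
$p$ prime. We can simply count the number of $\alpha$'s in (9.4). The number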
is
\[ p^r + (p^{r-1} - p^{r-2}) + (p^{r-2} - p^{r-3}) + \cdots + (p - 1) + 1 = p^r + p^{r-1}, \]
as required.
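The count in Lemma 9.1 is easy to confirm by enumeration for small $n$. The following sketch lists the matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$, $d > 0$, $0 \le b < d$, $\mathrm{GCD}(a, b, d) = 1$ and compares their number with $n\prod_{p \mid n}(1 + 1/p)$. The function names are hypothetical.

```python
from math import gcd


def primitive_reps(n):
    """Matrices (a, b, 0, d) with ad = n, d > 0, 0 <= b < d and GCD(a, b, d) = 1."""
    return [(n // d, b, 0, d)
            for d in range(1, n + 1) if n % d == 0
            for b in range(d)
            if gcd(gcd(n // d, b), d) == 1]


def index_formula(n):
    """n * prod over primes p dividing n of (1 + 1/p), in exact integer arithmetic."""
    result, m, p = n, n, 2
    while m > 1:
        if m % p == 0:
            result = result // p * (p + 1)
            while m % p == 0:
                m //= p
        p += 1
    return result


for n in range(1, 61):
    assert len(primitive_reps(n)) == index_formula(n)
```

For $n = 4$ the six representatives are $\begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$, and $\begin{pmatrix} 1 & b \\ 0 & 4 \end{pmatrix}$ for $0 \le b < 4$, in agreement with $p^r + p^{r-1} = 4 + 2$.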
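Lemma 9.2. For $\Gamma = SL(2,\mathbb{Z})$,
\[ \Gamma \cap \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} = \Gamma_0(N). \]
Proof. The left side consists of all members of $\Gamma$ of the form
\[ \begin{pmatrix} N^{-1} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & bN^{-1} \\ cN & d \end{pmatrix} \]
with $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. Hence the left side is contained in the right side. If
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in $\Gamma_0(N)$, then $\begin{pmatrix} a & bN \\ cN^{-1} & d \end{pmatrix}$ is in $\Gamma$ and
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} a & bN \\ cN^{-1} & d \end{pmatrix} \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}. \]
Proposition 9.3. $\Gamma = SL(2,\mathbb{Z})$ acts transitively by right
multiplication on $\Gamma\backslash M^*(N)$ with isotropy subgroup $\Gamma_0(N)$ at the coset $\Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$.
Therefore
\[ [\Gamma : \Gamma_0(N)] = N \prod_{\substack{p \text{ prime} \\ p \mid N}} \left(1 + \frac{1}{p}\right). \tag{9.6} \]
Proof. The action is transitive by (9.4). By Lemma 9.2, we are to
show that the isotropy subgroup at $\Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$ is $\Gamma \cap \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$.
If $\gamma \in \Gamma$ is in the intersection, then $\gamma = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} \gamma' \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$ with $\gamma' \in \Gamma$, and
\[ \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \gamma = \Gamma \gamma' \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} = \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}. \]
Hence the intersection is contained in the isotropy subgroup. In the
reverse direction, if $\gamma$ is in the isotropy subgroup, then $\gamma$ is a member of
$\Gamma$ with
\[ \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \gamma = \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}, \]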
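i.e.,
\[ \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \Gamma. \]
This says $\begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1}$ is in $\Gamma$. So $\gamma$ is in the intersection.
It follows that
\[ |\Gamma\backslash M^*(N)| = |\Gamma_0(N)\backslash\Gamma|, \]
and the index formula (9.6) is then a consequence of (9.5).
Example 1. Let $N = p$ be prime. Then Proposition 9.3 gives
$|\Gamma/\Gamma_0(p)| = p + 1$, and coset representatives for the left coset
decomposition $\Gamma = \bigcup_{j=0}^{p} \beta_j\Gamma_0(p)$ may be taken as
\[ \beta_p = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad \beta_j = \begin{pmatrix} j & 1 \\ -1 & 0 \end{pmatrix} \quad\text{for } 0 \le j < p. \tag{9.7} \]
To see that these elements represent distinct cosets, we have only to
observe that $\beta_p^{-1}\beta_j = \beta_j$ is not in $\Gamma_0(p)$ and that
\[ \beta_j^{-1}\beta_i = \begin{pmatrix} 1 & 0 \\ i - j & 1 \end{pmatrix} \]
is not in $\Gamma_0(p)$ unless $i = j$.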
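The distinctness argument in Example 1 amounts to a finite matrix computation. The sketch below assumes the representatives are $\beta_p = I$ and $\beta_j = \begin{pmatrix} j & 1 \\ -1 & 0 \end{pmatrix}$ for $0 \le j < p$ (a reading of (9.7) that is consistent with the later description $\beta_0^{-1}R = S(R)$, $\beta_1^{-1}R = ST(R)$), and checks that $\beta_j^{-1}\beta_i \notin \Gamma_0(p)$ whenever $i \ne j$. The function names are hypothetical.

```python
def mul(x, y):
    """Product of 2-by-2 matrices stored row-major as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def inverse(m):
    """Inverse of a determinant-one matrix."""
    a, b, c, d = m
    return (d, -b, -c, a)


def in_gamma0(m, N):
    """True if m has determinant one and lower-left entry divisible by N."""
    a, b, c, d = m
    return a * d - b * c == 1 and c % N == 0


def coset_reps(p):
    """Assumed representatives: beta_j = (j 1; -1 0) for 0 <= j < p, then beta_p = identity."""
    return [(j, 1, -1, 0) for j in range(p)] + [(1, 0, 0, 1)]


def distinct_cosets(p):
    """Check beta_j^{-1} beta_i is outside Gamma_0(p) for all i != j."""
    reps = coset_reps(p)
    return all(not in_gamma0(mul(inverse(bj), bi), p)
               for i, bi in enumerate(reps)
               for j, bj in enumerate(reps) if i != j)


for p in (2, 3, 5, 7, 11):
    assert len(coset_reps(p)) == p + 1
    assert distinct_cosets(p)
```

Since there are $p + 1$ representatives in distinct cosets and the index is $p + 1$, they form a full set.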
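Example 2. Let $N = p^r$ with $p$ prime. One easily checks that coset
representatives for the left coset decomposition $\Gamma = \bigcup_j \beta_j\Gamma_0(p^r)$ may be
taken as
\[ \begin{pmatrix} 1 & 0 \\ lp & 1 \end{pmatrix} \quad\text{with } 0 \le l < p^{r-1} \tag{9.8a} \]
and
\[ \begin{pmatrix} j & 1 \\ -1 & 0 \end{pmatrix} \quad\text{with } 0 \le j < p^r. \tag{9.8b} \]
Let $R$ be the usual fundamental domain in $\mathcal{H}$ for $SL(2,\mathbb{Z})$. (If there
is a need to be quite precise, we exclude the part of the boundary where
$\operatorname{Re}\tau > 0$, so that no two distinct points of $R$ are congruent.) Write the
left coset decomposition of $SL(2,\mathbb{Z})/\Gamma_0(N)$ as
\[ SL(2,\mathbb{Z}) = \bigcup_j \beta_j\Gamma_0(N), \]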
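and put
\[ R_N = \bigcup_j \beta_j^{-1}R. \]
Then $R_N$ is a fundamental domain for $\Gamma_0(N)$. [In fact, if $\tau \in \mathcal{H}$ is
given, we choose $\beta \in SL(2,\mathbb{Z})$ with $\beta\tau \in R$. If we write $\beta = \beta_j\gamma$ with
$\gamma \in \Gamma_0(N)$, then $\gamma\tau$ is in $\beta_j^{-1}R \subseteq R_N$. So every point of $\mathcal{H}$ is congruent
to a point of $R_N$. Uniqueness is proved similarly.]
A picture of this fundamental domain for $\Gamma_0(2)$ appears in Figure 9.1.
We use the $\beta_j$'s of Example 1 above, namely
\[ \beta_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \beta_0 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \beta_1 = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}. \]
In terms of $S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $R_2$ is comprised of
\[ R, \quad \beta_0^{-1}R = S(R), \quad \beta_1^{-1}R = ST(R). \]
The image under each nontrivial $\beta_j^{-1}$ of the part of $R$ where $\operatorname{Im}\tau$ is
large contributes to a set in the form of a cusp above the real line. Also
we regard $R$ itself as contributing a cusp. A feature of this diagram is
that $S(R)$ and $ST(R)$ contribute to the same cusp. Thus there are only
two cusps, at $\infty$ and 0, even though $[SL(2,\mathbb{Z}) : \Gamma_0(2)] = 3$.
[Figure 9.1. Fundamental domain for $\Gamma_0(2)$]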
2. Modular and Cusp Forms
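It will help now to make systematic use of the notation
\[ \delta(g, \tau) = c\tau + d \quad\text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ in } SL(2,\mathbb{R}). \tag{9.9} \]
This notation was introduced in the proof of Corollary 8.4. We recall
that $\delta(g, \tau)$ satisfies
\[ \begin{aligned} \delta(g_1 g_2, \tau) &= \delta(g_1, g_2\tau)\,\delta(g_2, \tau) \\ \delta(g, \tau)^{-1} &= \delta(g^{-1}, g\tau). \end{aligned} \tag{9.10} \]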
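Both identities for $\delta(g, \tau) = c\tau + d$ follow from direct expansion, and a quick floating-point check makes the bookkeeping concrete. The sketch below is illustrative only; the sample matrices and the point $\tau$ are arbitrary choices, and the function names are hypothetical.

```python
def delta(g, tau):
    """delta(g, tau) = c*tau + d for g = (a, b, c, d)."""
    a, b, c, d = g
    return c * tau + d


def act(g, tau):
    """Linear fractional action g(tau) = (a*tau + b) / (c*tau + d)."""
    a, b, c, d = g
    return (a * tau + b) / (c * tau + d)


def mul(x, y):
    """Product of 2-by-2 matrices stored row-major as 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def inverse(m):
    """Inverse of a determinant-one matrix."""
    a, b, c, d = m
    return (d, -b, -c, a)


g1 = (2.0, 1.0, 1.0, 1.0)        # determinant 1
g2 = (1.5, 0.5, 1.0, 1.0)        # determinant 1
tau = complex(0.3, 1.7)          # a point of the upper half plane

# delta(g1 g2, tau) = delta(g1, g2 tau) * delta(g2, tau)
assert abs(delta(mul(g1, g2), tau) - delta(g1, act(g2, tau)) * delta(g2, tau)) < 1e-12
# delta(g, tau)^(-1) = delta(g^(-1), g tau)
assert abs(1 / delta(g1, tau) - delta(inverse(g1), act(g1, tau))) < 1e-12
```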
As in (8.47), we shall write
\[ f \circ [\beta]_k(\tau) = \delta(\beta, \tau)^{-k} f(\beta\tau) \quad\text{for } \beta \in SL(2,\mathbb{R}). \]
For a matrix $\alpha$ of positive determinant, we let $\alpha^{\#} = (\det\alpha)^{-1/2}\alpha$. Then
$\alpha^{\#}$ has determinant one, and we define
\[ f \circ [\alpha]_k = f \circ [\alpha^{\#}]_k. \tag{9.11} \]
This construction is a group operation, in that
\[ f \circ [\alpha_1\alpha_2]_k = (f \circ [\alpha_1]_k) \circ [\alpha_2]_k. \tag{9.12} \]
An unrestricted modular form of weight $k \in \mathbb{Z}$ and level $N \ge 1$
is an analytic function $f$ on $\mathcal{H}$ with
\[ f(\gamma\tau) = \delta(\gamma, \tau)^k f(\tau) \quad\text{for all } \gamma \in \Gamma_0(N). \tag{9.13} \]
Since $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ is in $\Gamma_0(N)$, we may as well assume $k$ is an even integer.
Such a function $f$ is a modular form if it is "holomorphic at the cusps,"
and it is a cusp form if it "vanishes at the cusps." We need to explain
these conditions on $f$ at the "cusps."
If $R$ is the standard fundamental domain for $SL(2,\mathbb{Z})$, then we can
regard $\infty$ as a boundary point of $\mathcal{H}$ that is also a limit point of $R$. Each
set $\beta^{-1}R$, with $\beta \in SL(2,\mathbb{Z})$, is also a fundamental domain for $SL(2,\mathbb{Z})$.
If $\beta^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $\beta^{-1}(\infty) = \frac{a}{c}$ is a boundary point of $\mathcal{H}$ and a
limit point of $\beta^{-1}R$. Because of the geometry of $\beta^{-1}R$ (as in Figure
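9.1), we say that $\frac{a}{c}$ is a cusp of $\beta^{-1}R$. To eliminate the dependence on $\beta$
in this definition of "cusp," it is customary to define "cusps" by taking
$\mathbb{Q} \cup \{\infty\}$ and identifying those equivalent under the group in question.
With $SL(2,\mathbb{Z})$, there is only one equivalence class, hence only one cusp.
Now let us pass to $\Gamma_0(N)$. With the above conventions, a cusp is an
equivalence class of $\mathbb{Q} \cup \{\infty\}$ under the action of $\Gamma_0(N)$. If we write the
coset space $SL(2,\mathbb{Z})/\Gamma_0(N)$ as
\[ SL(2,\mathbb{Z}) = \bigcup_j \beta_j\Gamma_0(N), \tag{9.14} \]
then each $\beta_j^{-1}(\infty)$ represents a cusp. [In fact, let $\frac{a}{c}$ be in $\mathbb{Q}$ with
$\mathrm{GCD}(a, c) = 1$. Choose $b$ and $d$ with $ad - bc = 1$, so that $\beta^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$
is in $SL(2,\mathbb{Z})$ and has $\beta^{-1}(\infty) = \frac{a}{c}$. Choose $\gamma \in \Gamma_0(N)$ and $\beta_j$ as
above with $\beta = \beta_j\gamma$. Then $\gamma^{-1}(\beta_j^{-1}(\infty)) = \beta^{-1}(\infty) = \frac{a}{c}$, so that $\frac{a}{c}$ and
$\beta_j^{-1}(\infty)$ represent the same cusp.]
There can be duplication among the cusps $\beta_j^{-1}(\infty)$. This is what
happens in Figure 9.1. The condition that $\beta_i^{-1}(\infty)$ and $\beta_j^{-1}(\infty)$
represent the same cusp is that $\beta_i^{-1}(\infty) = \gamma\beta_j^{-1}(\infty)$ for some $\gamma \in \Gamma_0(N)$.
This means that $\beta_i\gamma\beta_j^{-1}$ fixes $\infty$, hence is of the form $\pm\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}$ for some
sign and some $l$. A sufficient condition is that $\beta_i\beta_j^{-1}$ is of the form
$\pm\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}$. In $\Gamma_0(p)$ this condition is satisfied by all pairs $\beta_0, \ldots, \beta_{p-1}$ in
(9.7). But the condition is not necessary. In $\Gamma_0(8)$, which is discussed in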
Example 2, /?„•=( 9 .1 and 0j = ( I yield the same cusp (take
7 = f q 1 1), but 0i0Jl is not upper triangular.
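Suppose $f$ is an unrestricted modular form of weight $k$ and level $N$.
Let us examine the behavior of $f$ near a cusp $\beta^{-1}(\infty)$. We consider the
function $f \circ [\beta^{-1}]_k(\tau)$, whose behavior near $\tau = \infty$ (i.e., for $\operatorname{Im}\tau$ large)
reflects the behavior of $f$ near $\beta^{-1}(\infty)$. Since
\[ \beta^{-1}\begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}\beta \in \beta^{-1}\Gamma(N)\beta = \Gamma(N) \subseteq \Gamma_0(N), \tag{9.15} \]
we have
\[ \begin{aligned}
f \circ [\beta^{-1}]_k(\tau + N) &= (f \circ [\beta^{-1}]_k) \circ \left[\begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}\right]_k(\tau) \\
 &= \left(f \circ \left[\beta^{-1}\begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}\beta\right]_k\right) \circ [\beta^{-1}]_k(\tau) && \text{by (9.12)} \\
 &= f \circ [\beta^{-1}]_k(\tau) && \text{by (9.13)}.
\end{aligned} \]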
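Arguing as for (8.7), we see that
\[ f \circ [\beta^{-1}]_k(\tau) = \sum_{n=-\infty}^{\infty} c_n^{\beta} q_N^n \quad\text{with } q_N = e^{2\pi i\tau/N}. \tag{9.16} \]
We say that $f$ is holomorphic at the cusp $\beta^{-1}(\infty)$ if $c_n^{\beta} \ne 0$ occurs only
for $n \ge 0$, and $f$ vanishes at the cusp $\beta^{-1}(\infty)$ if $c_n^{\beta} \ne 0$ occurs only for
$n \ge 1$.
If $\beta$ is replaced by a different coset representative $\beta\gamma$ with $\gamma \in \Gamma_0(N)$,
then
\[ f \circ [(\beta\gamma)^{-1}]_k = (f \circ [\gamma^{-1}]_k) \circ [\beta^{-1}]_k = f \circ [\beta^{-1}]_k, \]
so that the expansion (9.16) is completely unchanged. Therefore our
definitions do not depend on the choice of coset representatives.
But more is true. Suppose $\beta_i^{-1}(\infty)$ and $\beta_j^{-1}(\infty)$ represent the same
cusp, i.e., $\beta_i\gamma\beta_j^{-1} = \pm\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}$ for some $\gamma \in \Gamma_0(N)$ and $l \in \mathbb{Z}$. Then we
have
\[ \begin{aligned}
f \circ [\beta_i^{-1}]_k\!\left(\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\tau\right)
 &= (f \circ [\beta_i^{-1}]_k) \circ \left[\pm\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\right]_k(\tau) && \text{since } k \text{ is even} \\
 &= f \circ \left[\pm\beta_i^{-1}\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\right]_k(\tau) \\
 &= f \circ [\gamma\beta_j^{-1}]_k(\tau) \\
 &= (f \circ [\gamma]_k) \circ [\beta_j^{-1}]_k(\tau) \\
 &= f \circ [\beta_j^{-1}]_k(\tau).
\end{aligned} \]
Thus changing from $\beta = \beta_j$ to $\beta = \beta_i$ makes $c_n^{\beta}$ in (9.16) get multiplied
by $e^{2\pi i l n/N}$. The validity of "holomorphic at the cusp" and "vanishes at
the cusp" is unaffected.
The expansion (9.16) involves $q_N = e^{2\pi i\tau/N}$, which is periodic in $\tau$
of period $N$. It can happen that a smaller period will work for all $f$.
Namely if $\beta^{-1}(\infty)$ is a cusp, then
\[ \left\{ l \in \mathbb{Z} \;\middle|\; \beta^{-1}\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\beta \in \Gamma_0(N) \right\} \tag{9.17} \]
is a subgroup of $\mathbb{Z}$ that contains $N$, according to (9.15). Let $h$ be its
positive generator. Going over the above argument shows that
\[ f \circ [\beta^{-1}]_k(\tau + h) = f \circ [\beta^{-1}]_k(\tau). \]
Thus we are led to an expansion in powers of $q_h = e^{2\pi i\tau/h}$. We call $h$
the width of the cusp.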
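Examples.
(1) The width of the cusp at $\infty$ is 1.
(2) For $\Gamma_0(p)$ with $p$ prime, the width of the cusp at 0 is $p$. (See Figure
9.1.)
(3) For $\Gamma_0(4)$, there are three cusps: the one at $\infty$ of width 1
corresponding to one left coset in (9.14), one at 0 of width 4 corresponding
to four left cosets, and one at $-\frac{1}{2}$ of width 1 corresponding to one left
coset.
Proposition 9.4. Let $\beta^{-1}(\infty)$ be a cusp of $\Gamma_0(N)$, where $\beta$ is in
$SL(2,\mathbb{Z})$. Then there exists a unique $\alpha = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ in $M^*(N)$ with
$ad = N$, $d > 0$, $0 \le b < d$, and $\mathrm{GCD}(a, b, d) = 1$ such that
\[ \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}\beta^{-1} = \gamma\alpha \tag{9.18} \]
for some $\gamma$ in $SL(2,\mathbb{Z})$. Moreover, the width of the cusp equals
$d/\mathrm{GCD}(a, d)$.
Proof. The first conclusion follows from Lemma 9.1. For the second
conclusion, we conjugate $\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}$ by (9.18), obtaining
\[ \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}\beta^{-1}\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\beta\begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \gamma\alpha\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\alpha^{-1}\gamma^{-1}. \]
Thus
\[ \beta^{-1}\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}\beta = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1}\gamma\begin{pmatrix} 1 & al/d \\ 0 & 1 \end{pmatrix}\gamma^{-1}\begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}. \]
The left side is in $\Gamma$, and this equality shows it is in
\[ \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}^{-1}\Gamma\begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \]
if and only if $al/d$ is an integer. By Lemma 9.2, it is in $\Gamma_0(N)$ if and only
if $al/d$ is an integer. Hence $l$ is in the subgroup (9.17) defining width if
and only if $al/d$ is an integer, and it follows that the width is $h = d/\mathrm{GCD}(a, d)$.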
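The width formula $d/\mathrm{GCD}(a, d)$ can be tabulated for small levels. The sketch below rests on an assumption that is not proved in this section: that for each factorization $ad = N$ there are $\varphi(\mathrm{GCD}(a, d))$ cusps, which is how the cusp count $\sum_{d \mid N}\varphi(\mathrm{GCD}(d, N/d))$ is usually organized. Under that assumption the widths should add up to the index $[SL(2,\mathbb{Z}) : \Gamma_0(N)]$, since each cusp of width $h$ accounts for $h$ left cosets. The function names are hypothetical.

```python
from math import gcd


def phi(n):
    """Euler phi function by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def cusp_widths(N):
    """Sorted widths d / GCD(a, d) over ad = N, each repeated phi(GCD(a, d)) times (assumed count)."""
    widths = []
    for d in range(1, N + 1):
        if N % d == 0:
            g = gcd(N // d, d)
            widths += [d // g] * phi(g)
    return sorted(widths)


def index_gamma0(N):
    """N * prod over primes p dividing N of (1 + 1/p), in exact integer arithmetic."""
    result, m, p = N, N, 2
    while m > 1:
        if m % p == 0:
            result = result // p * (p + 1)
            while m % p == 0:
                m //= p
        p += 1
    return result


assert cusp_widths(4) == [1, 1, 4]          # matches Example (3): widths 1, 4, 1
for N in range(1, 101):
    assert sum(cusp_widths(N)) == index_gamma0(N)
```

For prime $N = p$ the list is `[1, p]`, matching Examples (1) and (2).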
The spaces of modular and cusp forms of weight $k$ and level $N$ will
be denoted $M_k(\Gamma_0(N))$ and $S_k(\Gamma_0(N))$, respectively. These spaces are
0 for $k$ odd.
3. Examples of Modular Forms
There are several constructions of modular forms in one situation from
modular forms in another. Typically the new transformation law (9.13)
is a consequence of some computation with matrices. Behavior at the
cusps is handled by the following result.
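Proposition 9.5. Suppose that $f$ is analytic in the upper half plane
and, for each $\beta$ in $SL(2,\mathbb{Z})$, $f \circ [\beta^{-1}]_k(\tau)$ has a holomorphic $q_N$ expansion
at $\infty$, where $q_N = e^{2\pi i\tau/N}$. If $\alpha$ is in $M(m)$, then $(f \circ [\alpha]_k) \circ [\beta^{-1}]_k(\tau)$
has a holomorphic $q_{mN}$ expansion at $\infty$ for each $\beta$ in $SL(2,\mathbb{Z})$. If all of
these expansions for $f$ have 0 constant term, then all of these expansions
for $f \circ [\alpha]_k$ have 0 constant term.
Proof. Scaling $\alpha$ so that its entries are relatively prime, we can
apply Lemma 9.1 and rescale to write $\alpha\beta^{-1} = \gamma^{-1}\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $\gamma$
in $SL(2,\mathbb{Z})$ and $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ in $M(m)$. Then
\[ (f \circ [\alpha]_k) \circ [\beta^{-1}]_k(\tau) = (f \circ [\gamma^{-1}]_k) \circ \left[\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\right]_k(\tau). \tag{9.19} \]
By assumption we can write $f \circ [\gamma^{-1}]_k(\tau) = \sum_{n=0}^{\infty} c_n e^{2\pi i n\tau/N}$. Hence
the right side of (9.19) is
\[ \begin{aligned}
 &= (m^{-1/2}d)^{-k}\, f \circ [\gamma^{-1}]_k\!\left(\frac{a\tau + b}{d}\right) \\
 &= (m^{-1/2}d)^{-k} \sum_{n=0}^{\infty} c_n e^{2\pi i n b/(dN)} e^{2\pi i n a^2\tau/(mN)},
\end{aligned} \tag{9.20} \]
and the proposition follows.
The first two examples give methods for obtaining modular forms of
one level from those at a lower level, with the weight remaining the same.
Example 1. If $f$ is in $M_k(\Gamma_0(N))$, then $f$ is in $M_k(\Gamma_0(rN))$ for $r \ge 1$,
and similarly for $S_k(\Gamma_0(N))$ and $S_k(\Gamma_0(rN))$. The new transformation
law (9.13) is valid because a subset of $\gamma$'s is involved, and the cusp
conditions are already built into the properties relative to $\Gamma_0(N)$.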
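Example 2. If $f(\tau)$ is in $M_k(\Gamma_0(N))$, then $f(r\tau)$ is in $M_k(\Gamma_0(rN))$
for $r \ge 1$. The transformation law (9.13) follows from Lemma 8.20:
\[ \begin{aligned}
\left(f \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\right]_k\right) \circ [\gamma]_k(\tau)
 &= \left(f \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\gamma\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}^{-1}\right]_k\right) \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\right]_k(\tau) \\
 &= f \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\right]_k(\tau)
\end{aligned} \]
for $\gamma$ in $\Gamma_0(rN)$. The conditions at the cusps are satisfied because of
Proposition 9.5 with $\alpha = \begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}$.
There is one more simple construction to obtain new modular forms
from old ones. This one changes the weight.
Example 3. The product of two modular forms for $\Gamma_0(N)$, one of
weight $k$ and one of weight $l$, is a modular form of weight $k + l$. If one
is a cusp form, so is the product.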
Now we give two completely different constructions, largely without
proof.
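Example 4. Define
\[ G_2(\tau) = \sum_{n=-\infty}^{\infty} {\sum_{m=-\infty}^{\infty}}' \frac{1}{(m + n\tau)^2} \quad\text{for } \tau \in \mathcal{H}, \tag{9.21} \]
where the inner sum is taken over $m$ such that $(m, n) \ne (0, 0)$. One
can prove that this series converges nicely when summed in the order
indicated. Moreover, we have $G_2(\tau + 1) = G_2(\tau)$, and one can prove
that
\[ G_2(-1/\tau) = \tau^2 G_2(\tau) - 2\pi i\tau. \]
Then it follows that
\[ G_2\!\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^2 G_2(\tau) - 2\pi i c(c\tau + d) \tag{9.22} \]
for $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $SL(2,\mathbb{Z})$. The argument of Proposition 8.1 shows that
\[ G_2(\tau) = 2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \sigma_1(n) q^n. \tag{9.23} \]
Put
\[ G_2^{(2)}(\tau) = G_2(\tau) - 2G_2(2\tau). \tag{9.24} \]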
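Combining (9.22) and the identity
\[ \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} a & 2b \\ c/2 & d \end{pmatrix}, \]
we see that $G_2^{(2)} \circ [\gamma]_2 = G_2^{(2)}$ for all $\gamma \in \Gamma_0(2)$. Also it is not hard to
see from (9.23) that $G_2^{(2)}$ is holomorphic at the cusps. Therefore $G_2^{(2)}$ is
in $M_2(\Gamma_0(2))$.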
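From (9.23), the $q$ expansion of $G_2(\tau) - 2G_2(2\tau)$ at $\infty$ has constant term $-2\zeta(2)$ and $q^n$ coefficient $-8\pi^2\bigl(\sigma_1(n) - 2\sigma_1(n/2)\bigr)$, with $\sigma_1(n/2)$ read as 0 for odd $n$. The sketch below computes the integer factor and checks an elementary observation, added here for illustration, that it equals the sum of the odd divisors of $n$. The function names are hypothetical.

```python
def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def level2_factor(n):
    """sigma_1(n) - 2*sigma_1(n/2), the integer factor of -8*pi^2 in the q^n coefficient."""
    half = sigma1(n // 2) if n % 2 == 0 else 0
    return sigma1(n) - 2 * half


def odd_divisor_sum(n):
    """Sum of the odd positive divisors of n."""
    return sum(d for d in range(1, n + 1, 2) if n % d == 0)


assert [level2_factor(n) for n in range(1, 9)] == [1, 1, 4, 1, 6, 4, 8, 1]
assert all(level2_factor(n) == odd_divisor_sum(n) for n in range(1, 200))
```

In particular every factor is positive, so no coefficient of the expansion at $\infty$ vanishes.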
Example 5 (Hecke). If $p > 3$ is prime and $\eta(\tau)$ is as in Corollary 8.9,
then $(\eta^p(\tau)/\eta(p\tau))^2$ is in $M_{p-1}(\Gamma_0(p))$. The difficult thing to prove is the
transformation law. To handle the behavior at the cusps, we consider
the 12th power, which is $\Delta^p(\tau)/\Delta(p\tau)$.
Since $p$ is prime, the only cusps are at 0 and $\infty$. For $\Delta(\tau)$, the $q$
expansion at $\infty$ begins with a multiple of $q$; the same thing is true for the
$q$ expansion at 0 since $\Delta \circ \left[\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\right]_{12} = \Delta$. For $\Delta(p\tau)$, the $q$ expansion
at $\infty$ clearly begins with a multiple of $q^p$. For the $q$ expansion at 0,
we follow the steps of the proof of Proposition 9.5, using $\alpha = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$,
$\beta^{-1} = \gamma^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}$. Then (9.20) shows that
$(\Delta \circ [\alpha]_k) \circ [\beta^{-1}]_k(\tau)$ begins with $q^{1/p}$. We conclude that $\Delta^p(\tau)/\Delta(p\tau)$
has $q$ expansion at $\infty$ holomorphic nonvanishing, while at 0 the first
term involves $q^{p - 1/p}$.
Hence $(\eta^p(\tau)/\eta(p\tau))^2$ is holomorphic at the cusps. Modifications to it
will yield cusp forms. For example, let $p = 12n - 1$. Then $(\eta(p\tau)/\eta^p(\tau))^2$
has weight $-(p - 1)$, and its $q$ expansions begin with $q^0$ at $\infty$ and
$q^{-\frac{1}{12}(p - \frac{1}{p})}$ at 0. Multiplying by
\[ \Delta^n = \Delta^{(p+1)/12} = \eta^{2(p+1)}, \]
we see that $\eta(p\tau)^2\eta(\tau)^2$ has weight 2 and has $q$ expansions beginning
with $q^n$ at $\infty$ and with $q^{n - \frac{1}{12}(p - \frac{1}{p})} = q^{(p+1)/(12p)}$ at 0. Hence
\[ \eta(p\tau)^2\eta(\tau)^2 \]
is a cusp form of weight 2 when $p \equiv -1 \bmod 12$.
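For $p = 11$ (so $n = 1$) the $q$ expansion at $\infty$ of this cusp form can be computed directly. The sketch below assumes the product formula $\eta(\tau) = q^{1/24}\prod_{m \ge 1}(1 - q^m)$, so that $\eta(p\tau)^2\eta(\tau)^2 = q^{(p+1)/12}\prod_{m \ge 1}(1 - q^m)^2(1 - q^{pm})^2$, and ignores any constant normalizing factor that the text's $\eta$ may carry. The function name is hypothetical.

```python
def eta_product_coefficients(p, limit):
    """Coefficients up to q^limit of q^n * prod (1 - q^m)^2 (1 - q^(p m))^2, with n = (p + 1) / 12."""
    if (p + 1) % 12 != 0:
        raise ValueError("p must be congruent to -1 mod 12")
    n = (p + 1) // 12
    series = [0] * (limit + 1)
    series[n] = 1                               # leading term q^n
    for step in (1, p):                         # factors (1 - q^m) and (1 - q^(p m))
        for m in range(step, limit + 1, step):
            for _ in range(2):                  # each factor appears squared
                for i in range(limit, m - 1, -1):
                    series[i] -= series[i - m]
    return series


coeffs = eta_product_coefficients(11, 20)
# Expansion begins q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - 2q^9 - 2q^10 + q^11 ...
assert coeffs[:12] == [0, 1, -2, -1, 2, 1, 2, -2, 0, -2, -2, 1]
```

The vanishing constant term and the leading $q^1$ are the behavior at $\infty$ described above for $n = 1$.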
4. L Function of a Cusp Form
Let $f \in S_k(\Gamma_0(N))$ be a cusp form, and let $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ be its
$q$ expansion at the cusp $\infty$. The $L$ function of $f$ is the Dirichlet series
\[ L(s, f) = \sum_{n=1}^{\infty} \frac{c_n}{n^s} \tag{9.25} \]
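just as in (8.30), and we have
\[ \int_0^{\infty} f(i\sigma)\sigma^s\,\frac{d\sigma}{\sigma} = (2\pi)^{-s}\Gamma(s)L(s, f) \tag{9.26} \]
for all values of $s$ for which the integral converges.
Lemma 9.6. Let $f \in S_k(\Gamma_0(N))$ have $q$ expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$
at the cusp $\infty$. Then
(a) the function $\varphi(\tau) = |f(\tau)|\sigma^{k/2}$ is bounded on $\mathcal{H}$ and invariant
under $\Gamma_0(N)$
(b) $|c_n| \le Cn^{k/2}$.
Proof. Write $SL(2,\mathbb{Z})/\Gamma_0(N)$ as in (9.14). First we prove that $\varphi$
is bounded on $R_N$. The cusps of $R_N$ are the points $\beta_j^{-1}(\infty)$, and it is
enough to prove boundedness near each, i.e., boundedness of $\varphi(\beta_j^{-1}\tau)$
for $\operatorname{Im}\tau > 2$ and all $j$. Since $f$ vanishes at the cusp $\beta_j^{-1}(\infty)$, we have
\[ f \circ [\beta_j^{-1}]_k(\tau) = \sum_{n=1}^{\infty} c_n^{\beta_j} q_N^n \]
and hence
\[ |f \circ [\beta_j^{-1}]_k(\tau)| \le C_1 e^{-2\pi(\operatorname{Im}\tau)/N} \quad\text{for } \operatorname{Im}\tau > 2. \tag{9.27} \]
Thus
\[ |f \circ [\beta_j^{-1}]_k(\tau)|(\operatorname{Im}\tau)^{k/2} \le C_2 \quad\text{for } \operatorname{Im}\tau > 2. \tag{9.28} \]
Since
\[ \operatorname{Im}(\beta\tau) = \frac{\operatorname{Im}\tau}{|\delta(\beta, \tau)|^2} \quad\text{for } \beta \in SL(2,\mathbb{R}), \tag{9.29} \]
we have
\[ \begin{aligned}
\varphi(\beta_j^{-1}\tau) &= |f(\beta_j^{-1}\tau)|(\operatorname{Im}\beta_j^{-1}\tau)^{k/2} \\
 &= |f \circ [\beta_j^{-1}]_k(\tau)\,\delta(\beta_j^{-1}, \tau)^k|\,(\operatorname{Im}\tau)^{k/2}\,|\delta(\beta_j^{-1}, \tau)|^{-k} \\
 &= |f \circ [\beta_j^{-1}]_k(\tau)|(\operatorname{Im}\tau)^{k/2}.
\end{aligned} \tag{9.30} \]
In combination with (9.28), (9.30) shows that $\varphi$ is bounded on $R_N$.
Replacing $\beta_j^{-1}$ in the calculation (9.30) by a member $\gamma$ of $\Gamma_0(N)$, we see
that $\varphi$ is invariant under $\Gamma_0(N)$. Hence $\varphi$ is bounded on $\mathcal{H}$, and (a) is
proved. To see that (a) implies (b), we argue just as in Lemma 8.10,
starting from (9.28) with $\beta_j = 1$.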
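For $S_k = S_k(\Gamma_0(1))$, we obtained a functional equation for $L(s, f)$
as Theorem 8.11, using the transformation law for $f$ under $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.
For $\Gamma_0(N)$ with $N > 1$, the element $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ is not in $\Gamma_0(N)$, and the
functional equation needs an extra hypothesis. This hypothesis will be
given in terms of the effect of the matrix $\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$.
Let $\alpha_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$, so that $\alpha_N^{\#} = \frac{1}{\sqrt{N}}\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$. If $\gamma =
\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $\alpha_N\gamma\alpha_N^{-1} = \begin{pmatrix} d & -c/N \\ -bN & a \end{pmatrix}$. Therefore
\[ \alpha_N\Gamma_0(N)\alpha_N^{-1} \subseteq \Gamma_0(N). \tag{9.31} \]
For $f \in M_k(\Gamma_0(N))$, we consider the map $w_N f = f \circ [\alpha_N]_k$. For $\gamma \in
\Gamma_0(N)$, we have
\[ (f \circ [\alpha_N]_k) \circ [\gamma]_k = (f \circ [\alpha_N\gamma\alpha_N^{-1}]_k) \circ [\alpha_N]_k = f \circ [\alpha_N]_k, \]
so that $f \circ [\alpha_N]_k$ is an unrestricted modular form of the same weight
and level as $f$.
Let us record the fact that $w_N$ maps $M_k(\Gamma_0(N))$ to itself and
$S_k(\Gamma_0(N))$ to itself. This result is a special case of Proposition 9.5.
Proposition 9.7. The map $w_N$ carries $M_k(\Gamma_0(N))$ to itself and
$S_k(\Gamma_0(N))$ to itself.
Proof. If $f$ is in $M_k(\Gamma_0(N))$, we apply Proposition 9.5 with $m = N$
and $\alpha = \alpha_N$. The proposition says that
\[ (w_N f) \circ [\beta^{-1}]_k(\tau) = \sum_{n=0}^{\infty} c_n e^{2\pi i n\tau/N^2}. \tag{9.32} \]
But $w_N f$ is an unrestricted modular form for $\Gamma_0(N)$, and hence
$(w_N f) \circ [\beta^{-1}]_k(\tau)$ is periodic with period $N$. Thus the coefficients $c_n$
in (9.32) are 0 unless $N \mid n$, and $w_N f$ is holomorphic at the cusp $\beta^{-1}(\infty)$.
Similarly if $f$ is in $S_k(\Gamma_0(N))$, then $w_N f$ is in $S_k(\Gamma_0(N))$.
Taking into account Proposition 9.7 and the identity $(\alpha_N^{\#})^2 = -I$, we
see that $w_N$ is an involution on $M_k(\Gamma_0(N))$ and on $S_k(\Gamma_0(N))$. These
spaces therefore each split as the sum of the eigenspaces for eigenvalues
$+1$ and $-1$. In the case of $S_k(\Gamma_0(N))$, we denote these eigenspaces by
$S_k^{\pm}(\Gamma_0(N))$.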
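Theorem 9.8 (Hecke). Let $f \in S_k(\Gamma_0(N))$ be a cusp form in one of
the eigenspaces $S_k^{\varepsilon}(\Gamma_0(N))$ of $w_N$, where $\varepsilon = \pm$. Then $L(s, f)$ is initially
defined for $\operatorname{Re} s > \frac{k}{2} + 1$ and extends to be entire in $s$. Moreover, the
function
\[ \Lambda(s, f) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(s, f) \tag{9.33a} \]
satisfies the functional equation
\[ \Lambda(s, f) = \varepsilon(-1)^{k/2}\Lambda(k - s, f). \tag{9.33b} \]
Remarks. The proof is similar to that of Theorem 8.11, except that
$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ gets replaced by $\alpha_N$. The integration from 0 to $\infty$ is broken
at $1/\sqrt{N}$ instead of 1 because $i/\sqrt{N}$ is the fixed point of $w_N$ on the
imaginary axis.
Proof. The initial region of convergence for $L(s, f)$ is $\operatorname{Re} s > \frac{k}{2} + 1$,
by Lemma 9.6b. Since $w_N f = \varepsilon f$, the inversion law for $f$ under the
action of $w_N$ is
\[ f\!\left(\frac{i}{N\sigma}\right) = \varepsilon N^{k/2} i^k \sigma^k f(i\sigma). \tag{9.34} \]
By (9.26) we have
\[ \Lambda(s, f) = N^{s/2}\int_0^{\infty} f(i\sigma)\sigma^{s-1}\,d\sigma. \tag{9.35} \]
From (9.27) we see that
\[ \int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{s-1}\,d\sigma \tag{9.36} \]
converges for all $s \in \mathbb{C}$ and defines an entire function. We rewrite (9.35)
for $\operatorname{Re} s > \frac{k}{2} + 1$ as
\[ \Lambda(s, f) = N^{s/2}\int_0^{1/\sqrt{N}} f(i\sigma)\sigma^{s-1}\,d\sigma + N^{s/2}\int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{s-1}\,d\sigma. \]
In the first term, we replace $\sigma$ by $(N\sigma)^{-1}$ and then use (9.34). The
result is
\[ \Lambda(s, f) = \varepsilon N^{(k-s)/2} i^k\int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{k-s-1}\,d\sigma + N^{s/2}\int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{s-1}\,d\sigma. \tag{9.37} \]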
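In view of (9.36) the first term extends to be entire in $s$, and hence so
does $\Lambda(s, f)$ itself. Since $\Gamma(s)$ is nowhere 0, $L(s, f)$ is entire. Replacing
$s$ by $k - s$ in (9.37) and multiplying by $\varepsilon i^k = \varepsilon(-1)^{k/2}$, we obtain
\[ \varepsilon(-1)^{k/2}\Lambda(k - s, f)
 = N^{s/2}\int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{s-1}\,d\sigma + \varepsilon N^{(k-s)/2} i^k\int_{1/\sqrt{N}}^{\infty} f(i\sigma)\sigma^{k-s-1}\,d\sigma. \]
Comparison with (9.37) yields (9.33b).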
5. Dimensions of Spaces of Cusp Forms
The kind of explicit argument used to prove Theorem 8.6 is not
available for Fo(N). But we can get at the finite-dimensionality of Sk(To(N))
rather painlessly, and this finite-dimensionality will be relevant in
obtaining simultaneous eigenvectors for the Hecke operators.
Theorem 9.9. The vector space Mfc(r0(Ar)) is finite-dimensional.
Sketch of proof. (Some of the steps will be treated in more detail
in Chapter XL) The first step is to make K = T0(N)\H into a Riemann
surface by introducing appropriate local parameters at the fixed points
of elements of To(N) of finite order. Then we form a compactification
TV by adjoining one point for each cusp and using local parameters built
from q in the case of the cusp at oo, or a transformed qn (h being the
width) in the case of the other cusps. The result is a compact Riemann
surface.
Let $f$ be in $M_k(\Gamma_0(N))$. Then $f^{12}$ is in $M_{12k}(\Gamma_0(N))$, and so is $\Delta^k$. Hence $F = f^{12}/\Delta^k$ is an analytic function on $\mathcal{H}$ invariant under $\Gamma_0(N)$. The function $F$ descends to a holomorphic function on $\mathcal{R}$ that extends to be meromorphic on $\mathcal{R}^*$. The only possible poles are at worst of order $k$, coming from the zeros of $\Delta^k$. These poles occur at worst at each cusp, and the number of cusps is $\le \mu = [SL(2,\mathbb{Z}) : \Gamma_0(N)]$. Hence the number of poles of $F$, counting multiplicities, is $\le k\mu$.
It follows that the number of zeros of $F$, counting multiplicities, is $\le k\mu$. In fact, the meromorphic differential form $\omega = \dfrac{F'(\tau)\,d\tau}{F(\tau)}$ is globally well defined on $\mathcal{R}^*$, as a consequence of (8.19). Let us triangulate $\mathcal{R}^*$ so that each singularity of $\omega$ is in the interior of one triangle. Let $\Delta_1, \dots, \Delta_l$ be the 2-simplices in the triangulation. By the Argument Principle,
\[ \#\{\text{zeros of } F\} - \#\{\text{poles of } F\} = \frac{1}{2\pi i}\sum_{j=1}^{l}\int_{\partial\Delta_j}\omega, \]
where $\partial\Delta_j$ is the boundary of $\Delta_j$ with positive orientation. In the sum on the right, each 1-simplex appears twice, with opposite orientations, and the right side is therefore 0. We conclude that $F$ has $\le k\mu$ zeros, counting multiplicities.
Now let $f$ and therefore $F$ vary. Select $k\mu + 1$ distinct points $p_1, \dots, p_{k\mu+1}$ in $\mathcal{R}$ and consider the linear map $M_k(\Gamma_0(N)) \to \mathbb{C}^{k\mu+1}$ given by $f \mapsto (F(p_1), \dots, F(p_{k\mu+1}))$. By the above, this map has 0 kernel. Hence
\[ \dim M_k(\Gamma_0(N)) \le k\mu + 1. \]
Our interest ultimately will be in $S_2(\Gamma_0(N))$. With more machinery, one can calculate the dimension of $S_2(\Gamma_0(N))$ exactly. We shall say what is involved without giving the details.
The first step is to compute the genus of the compact Riemann surface $\mathcal{R}^*$ in the proof of Theorem 9.9. This is possible because we have a natural map
\[ \mathcal{R}^* \to (SL(2,\mathbb{Z})\backslash\mathcal{H})^* \tag{9.38} \]
as a consequence of the canonical quotient map
\[ \Gamma_0(N)\backslash SL(2,\mathbb{R}) \to SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R}). \]
Applying the Riemann-Hurwitz formula to this map gives a formula for the genus of $\mathcal{R}^*$. The space $S_2(\Gamma_0(N))$ is easily seen to be isomorphic to the space of holomorphic differentials on $\mathcal{R}^*$, and the latter space can be shown to have dimension equal to the genus of $\mathcal{R}^*$. We summarize as follows.
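Theorem 9.10. The dimension of $S_2(\Gamma_0(N))$ is
\[ 1 + \frac{\mu(N)}{12} - \frac{\nu_2(N)}{4} - \frac{\nu_3(N)}{3} - \frac{\nu_\infty(N)}{2}, \]
where
\[ \mu(N) = [SL(2,\mathbb{Z}) : \Gamma_0(N)] = N\prod_{p \mid N}\Bigl(1 + \frac{1}{p}\Bigr), \]
\[ \nu_2(N) = \begin{cases} 0 & \text{if } 4 \mid N \\ \prod_{p \mid N}\bigl(1 + \bigl(\tfrac{-1}{p}\bigr)\bigr) & \text{otherwise,} \end{cases} \]
\[ \nu_3(N) = \begin{cases} 0 & \text{if } 2 \mid N \text{ or } 9 \mid N \\ \prod_{p \mid N}\bigl(1 + \bigl(\tfrac{-3}{p}\bigr)\bigr) & \text{otherwise,} \end{cases} \]
\[ \nu_\infty(N) = \sum_{d \mid N}\varphi(\mathrm{GCD}(d, N/d)). \]
Here $\bigl(\tfrac{-1}{p}\bigr)$ and $\bigl(\tfrac{-3}{p}\bigr)$ are Legendre symbols (referring to quadratic residues), and $\varphi$ is the Euler $\varphi$ function.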
Corollary 9.11. If $p > 3$ is prime, then
\[ \dim S_2(\Gamma_0(p)) = \begin{cases} n - 1 & \text{if } p = 12n + 1 \\ n & \text{if } p = 12n + 5 \\ n & \text{if } p = 12n + 7 \\ n + 1 & \text{if } p = 12n + 11. \end{cases} \]
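The dimension formula of Theorem 9.10 can be evaluated mechanically. The following is a minimal sketch, not taken from the text: the function names are illustrative, and $\nu_2(N)$ and $\nu_3(N)$ are computed as the numbers of solutions modulo $N$ of $x^2 + 1 \equiv 0$ and $x^2 + x + 1 \equiv 0$, a standard equivalent description that is assumed here so that no convention for the Legendre symbol at $p = 2$ or $p = 3$ is needed.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    """Return the distinct prime divisors of n in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler phi function by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def dim_S2(N):
    """Dimension of S_2(Gamma_0(N)) from the formula of Theorem 9.10."""
    mu = Fraction(N)
    for p in prime_divisors(N):
        mu *= Fraction(p + 1, p)
    # Assumption: nu_2 and nu_3 counted as solutions of congruences mod N.
    nu2 = sum(1 for x in range(N) if (x * x + 1) % N == 0)
    nu3 = sum(1 for x in range(N) if (x * x + x + 1) % N == 0)
    nu_inf = sum(euler_phi(gcd(d, N // d))
                 for d in range(1, N + 1) if N % d == 0)
    genus = (1 + mu / 12 - Fraction(nu2, 4)
             - Fraction(nu3, 3) - Fraction(nu_inf, 2))
    assert genus.denominator == 1
    return int(genus)


def corollary_dim(p):
    """Value predicted by Corollary 9.11 for a prime p > 3."""
    n, rem = divmod(p, 12)
    return {1: n - 1, 5: n, 7: n, 11: n + 1}[rem]
```

For a prime $p > 3$ the two functions should agree, which gives a quick consistency check between the theorem and the corollary.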
6. Hecke Operators
As with $\Gamma_0(1) = SL(2,\mathbb{Z})$, Hecke operators for $\Gamma_0(N)$ are built from lattices. However, additional structure plays a role. Instead of working with lattices $\Lambda$, we shall work with pairs $(\Lambda, C)$, where $\Lambda$ is a lattice in $\mathbb{C}$ and $C$ is a cyclic subgroup of $\mathbb{C}/\Lambda$ of exact order $N$. Such a pair we shall call a modular pair.
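If $\Lambda$ is a lattice, we write $P_\Lambda$ for the quotient map of $\mathbb{C}$ into $\mathbb{C}/\Lambda$. To a modular pair $(\Lambda, C)$ and a nonzero complex number $\alpha$, we can associate another modular pair $(\alpha\Lambda, \alpha C)$ as follows: The lattice $\alpha\Lambda$ is as earlier, and the cyclic subgroup $\alpha C$ is given by
\[ \alpha C = P_{\alpha\Lambda}\bigl(\alpha P_\Lambda^{-1}(C)\bigr). \tag{9.39} \]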
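A complex-valued function $F$ on modular pairs is said to be homogeneous of degree $-k$ if
\[ F(\alpha\Lambda, \alpha C) = \alpha^{-k}F(\Lambda, C) \qquad \text{for } \alpha \in \mathbb{C}^\times. \tag{9.40} \]
To such a function $F$, we can associate a function $f$ on $\mathcal{H}$ by the definition
\[ f(\tau) = F\bigl(\Lambda_\tau, P_{\Lambda_\tau}(\tfrac{1}{N}\mathbb{Z})\bigr). \]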
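Let us check that this function $f$ satisfies
\[ f(\gamma\tau) = \delta(\gamma,\tau)^k f(\tau) \qquad \text{for } \gamma \in \Gamma_0(N). \tag{9.41} \]
In fact, we recall from §VIII.2 that $\Lambda_{\gamma\tau} = \delta(\gamma,\tau)^{-1}\Lambda_\tau$. Thus we have
\[ \begin{aligned} f(\gamma\tau) &= F\bigl(\Lambda_{\gamma\tau}, P_{\Lambda_{\gamma\tau}}(\tfrac{1}{N}\mathbb{Z})\bigr) \\ &= F\bigl(\delta(\gamma,\tau)^{-1}\Lambda_\tau, P_{\delta(\gamma,\tau)^{-1}\Lambda_\tau}(\tfrac{1}{N}\mathbb{Z})\bigr) \\ &= \delta(\gamma,\tau)^k F\bigl(\Lambda_\tau, \delta(\gamma,\tau)P_{\delta(\gamma,\tau)^{-1}\Lambda_\tau}(\tfrac{1}{N}\mathbb{Z})\bigr) && \text{by (9.40)} \\ &= \delta(\gamma,\tau)^k F\bigl(\Lambda_\tau, P_{\Lambda_\tau}\bigl(\delta(\gamma,\tau)P^{-1}_{\delta(\gamma,\tau)^{-1}\Lambda_\tau}P_{\delta(\gamma,\tau)^{-1}\Lambda_\tau}(\tfrac{1}{N}\mathbb{Z})\bigr)\bigr) \end{aligned} \]
by (9.39), and this is
\[ \begin{aligned} &= \delta(\gamma,\tau)^k F\bigl(\Lambda_\tau, P_{\Lambda_\tau}\bigl(\delta(\gamma,\tau)(\tfrac{1}{N}\mathbb{Z} + \delta(\gamma,\tau)^{-1}\Lambda_\tau)\bigr)\bigr) \\ &= \delta(\gamma,\tau)^k F\bigl(\Lambda_\tau, P_{\Lambda_\tau}\bigl(\delta(\gamma,\tau)\tfrac{1}{N}\mathbb{Z} + \Lambda_\tau\bigr)\bigr). \end{aligned} \]
In turn this is
\[ = \delta(\gamma,\tau)^k F\bigl(\Lambda_\tau, P_{\Lambda_\tau}(\tfrac{1}{N}\mathbb{Z})\bigr) = \delta(\gamma,\tau)^k f(\tau), \]
since easy computation shows that
\[ (c\tau + d)\tfrac{1}{N}\mathbb{Z} + \mathbb{Z} + \tau\mathbb{Z} = \tfrac{1}{N}\mathbb{Z} + \mathbb{Z} + \tau\mathbb{Z} \]
if $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in $\Gamma_0(N)$. This proves (9.41).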
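Conversely suppose that $f$ is a complex-valued function on $\mathcal{H}$ satisfying (9.41). We shall define a function $F$ on modular pairs such that (9.40) holds. Let $(\Lambda, C)$ be given. Then $\Lambda$ is a sublattice of index $N$ in $P_\Lambda^{-1}(C)$ with $P_\Lambda^{-1}(C)/\Lambda$ cyclic. It follows that we can find a $\mathbb{Z}$ basis $\{\omega_1, \omega_2\}$ of $\Lambda$ such that $\{\tfrac{1}{N}\omega_1, \omega_2\}$ is a $\mathbb{Z}$ basis of $P_\Lambda^{-1}(C)$. We may assume moreover that $\operatorname{Im}(\omega_2/\omega_1) > 0$. In terms of this basis, we define
\[ F(\Lambda, C) = \omega_1^{-k} f(\omega_2/\omega_1). \tag{9.42} \]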
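To see that $F(\Lambda, C)$ is well defined, let $\{\omega_1', \omega_2'\}$ be another basis of $\Lambda$ such that $\{\tfrac{1}{N}\omega_1', \omega_2'\}$ is a basis of $P_\Lambda^{-1}(C)$ and $\operatorname{Im}(\omega_2'/\omega_1')$ is positive. If we write
\[ \begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix}, \tag{9.43a} \]
then $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in $SL(2,\mathbb{Z})$ since $\{\omega_1, \omega_2\}$ and $\{\omega_1', \omega_2'\}$ are both properly ordered bases for $\Lambda$. Since $\{\tfrac{1}{N}\omega_1, \omega_2\}$ and $\{\tfrac{1}{N}\omega_1', \omega_2'\}$ are both bases for $P_\Lambda^{-1}(C)$, the equality
\[ \begin{pmatrix} \omega_2' \\ \tfrac{1}{N}\omega_1' \end{pmatrix} = \begin{pmatrix} a & bN \\ c/N & d \end{pmatrix}\begin{pmatrix} \omega_2 \\ \tfrac{1}{N}\omega_1 \end{pmatrix} \tag{9.43b} \]
forces the coefficient matrix to be integral. Thus $N \mid c$, and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in $\Gamma_0(N)$. By (9.41) we have
\[ \omega_1'^{-k} f(\omega_2'/\omega_1') = \omega_1'^{-k} f(\gamma(\omega_2/\omega_1)) = \omega_1'^{-k}\bigl(c(\omega_2/\omega_1) + d\bigr)^k f(\omega_2/\omega_1) = \omega_1^{-k} f(\omega_2/\omega_1), \]
and thus (9.42) is well defined.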
Finally we check the homogeneity of $F$. If $\{\omega_1, \omega_2\}$ is a properly ordered basis for $\Lambda$ as above, then $\{\alpha\omega_1, \alpha\omega_2\}$ is a properly ordered basis for $\alpha\Lambda$. Hence
\[ F(\alpha\Lambda, \alpha C) = (\alpha\omega_1)^{-k} f\bigl((\alpha\omega_2)/(\alpha\omega_1)\bigr) = \alpha^{-k}F(\Lambda, C), \]
and $F$ satisfies (9.40).
The two constructions $F \to f$ and $f \to F$ are inverse to one another, and thus we have a one-one correspondence between functions $F$ on modular pairs homogeneous of degree $-k$ and functions $f$ on $\mathcal{H}$ satisfying (9.41).
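Let $\mathcal{L}$ be the free abelian group freely generated by the modular pairs $(\Lambda, C)$. In analogy with (8.43), the Hecke operator $T(n)$ on modular pairs, for $n = 1, 2, 3, \dots$, is defined to be the map $T(n) : \mathcal{L} \to \mathcal{L}$ given by
\[ T(n)(\Lambda, C) = \sum_{\substack{[\Lambda:\Lambda'] = n \\ nC \twoheadrightarrow C'}} (\Lambda', C'). \tag{9.44} \]
The notation $nC \twoheadrightarrow C'$ means the following: We have an inclusion $n\Lambda \subseteq \Lambda'$ and an induced map $\mathbb{C}/(n\Lambda) \to \mathbb{C}/\Lambda'$. Under the induced map the cyclic group $nC$ is to map onto the cyclic group $C'$.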
The sum (9.44) is a finite sum since $n\Lambda \subseteq \Lambda' \subseteq \Lambda$ and since $(\Lambda, C)$ and $\Lambda'$ uniquely determine $C'$. It is easy to check that the condition $nC \twoheadrightarrow C'$ is automatically satisfied if $\mathrm{GCD}(n, N) = 1$. One might expect that $nC \twoheadrightarrow C'$ could be replaced by $C' \twoheadrightarrow C$ and that an equivalent or perhaps better theory would result. But this expectation is misplaced. For one thing, $(\Lambda, C)$ and $\Lambda'$ would no longer determine $C'$; for another, the analog of Proposition 8.14 would break down in a serious way.
We define $T_k(n)$ on the space of functions on modular pairs, homogeneous of degree $-k$, just as in (8.44):
\[ (T_k(n)F)(\Lambda, C) = n^{k-1}\sum_{\substack{[\Lambda:\Lambda'] = n \\ nC \twoheadrightarrow C'}} F(\Lambda', C'). \tag{9.45} \]
It is clear that $T_k(n)F$ is another function on modular pairs, homogeneous of degree $-k$. Shortly we shall exhibit the corresponding operator on functions of the variable $\tau \in \mathcal{H}$, and we shall see that it carries modular forms for $\Gamma_0(N)$ to modular forms for $\Gamma_0(N)$, and cusp forms to cusp forms. This operator too will be denoted $T_k(n)$ and is called a Hecke operator.
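Let us compute the relationship between $(\Lambda, C)$ and $(\Lambda', C')$ in (9.45) in terms of matrices. Choose a basis $\{\omega_1, \omega_2\}$ of $\Lambda$ such that $\{\tfrac{1}{N}\omega_1, \omega_2\}$ is a basis of $P_\Lambda^{-1}(C)$ and $\operatorname{Im}(\omega_2/\omega_1) > 0$, and let $\{\omega_1', \omega_2'\}$ be a similar sort of basis for $\Lambda'$. Then we can write
\[ \begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix} \tag{9.46} \]
with $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $M(n)$. The inverse images of $nC$ and $C'$ are given by
\[ P_{n\Lambda}^{-1}(nC) = \tfrac{n}{N}\mathbb{Z}\omega_1 + n\mathbb{Z}\omega_2 \tag{9.47a} \]
\[ P_{\Lambda'}^{-1}(C') = \tfrac{1}{N}\mathbb{Z}\omega_1' + \mathbb{Z}\omega_2'. \tag{9.47b} \]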
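Since
\[ \tfrac{n}{N}\omega_1 = \tfrac{a}{N}(c\omega_2 + d\omega_1) - \tfrac{c}{N}(a\omega_2 + b\omega_1) = \tfrac{1}{N}(a\omega_1') - \bigl(\tfrac{c}{N}\bigr)\omega_2' \]
and
\[ n\omega_2 = (-b)(c\omega_2 + d\omega_1) + d(a\omega_2 + b\omega_1) = \tfrac{1}{N}(-bN)\omega_1' + d\,\omega_2', \]
(9.47a) is contained in (9.47b) if and only if $\tfrac{c}{N}$ is an integer. Thus $nC$ maps into $C'$ if and only if $\tfrac{c}{N}$ is an integer. Suppose this happens. For $nC$ to map onto $C'$, $\tfrac{n}{N}\omega_1$ must have exact order $N$ relative to $\Lambda'$. The order is the least $m \ge 1$ such that
\[ \tfrac{nm}{N}\omega_1 = r\omega_1' + s\omega_2' \]
for some integers $r$ and $s$. Inversion of (9.46) allows us to substitute $\omega_1 = \tfrac{1}{n}(-c\omega_2' + a\omega_1')$ and rewrite this condition as
\[ \tfrac{m}{N}(-c\omega_2' + a\omega_1') = r\omega_1' + s\omega_2'. \]
Since $N \mid c$, the condition is just that $ma/N = r$. The order is therefore $N/\mathrm{GCD}(a, N)$, and order $N$ occurs if and only if $\mathrm{GCD}(a, N) = 1$. We conclude that $\{\omega_1', \omega_2'\}$ and $\{\omega_1, \omega_2\}$ are related by a matrix in the set
\[ M(n, N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M(n) \;\middle|\; c \equiv 0 \bmod N \text{ and } \mathrm{GCD}(a, N) = 1 \right\}. \]
The most general similar sort of basis for $\Lambda'$ is related to $\{\omega_1', \omega_2'\}$ by a member of $\Gamma_0(N)$, according to (9.43). Thus the modular pairs $(\Lambda', C')$ in the sum (9.44) are parametrized by the right cosets $\Gamma_0(N)\backslash M(n, N)$.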
Proposition 9.12. Let $\{\alpha_i\}$ be a complete set of representatives for the right cosets $\Gamma_0(N)\alpha$ of $\Gamma_0(N)$ on $M(n, N)$. If $f$ is in $M_k(\Gamma_0(N))$, then $T_k(n)f$ is given as a function of $\tau$ by
\[ T_k(n)f = n^{\frac{k}{2}-1}\sum_i f \circ [\alpha_i]_k. \tag{9.48} \]
Hence $T_k(n)f$ is an unrestricted modular form of weight $k$ and level $N$.
Proof. The argument is the same as for Proposition 8.14.
Corollary 9.13. The Hecke operator $T_k(n)$ carries $M_k(\Gamma_0(N))$ to itself and $S_k(\Gamma_0(N))$ to itself.
Proof. If $f$ is in $M_k(\Gamma_0(N))$, we apply Proposition 9.5 with $m = n$, $\alpha = \alpha_i$, and $\beta \in SL(2,\mathbb{Z})$ to see that $(f \circ [\alpha_i]_k) \circ [\beta^{-1}]_k(\tau)$ has a holomorphic $q_{nN}$ expansion at $\infty$. By (9.48) the same thing is true of $(T_k(n)f) \circ [\beta^{-1}]_k(\tau)$. But $T_k(n)f$ is an unrestricted modular form for $\Gamma_0(N)$, and hence $(T_k(n)f) \circ [\beta^{-1}]_k(\tau)$ is periodic with period $N$. Hence the terms in the $q_{nN}$ expansion whose index is not a multiple of $n$ vanish, and $T_k(n)f$ is holomorphic at the cusp $\beta^{-1}(\infty)$. A similar argument works if $f$ is in $S_k(\Gamma_0(N))$.
To see concretely the effect of $T_k(n)$ on $q$ expansions, we use explicit coset representatives $\alpha_i$ in (9.48). Arguing as in the proof of Lemma 8.15 and noting that $N$ will have to divide $x$ at the start of the proof, we arrive at the following.
Lemma 9.14. The matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$, $d > 0$, $\mathrm{GCD}(a, N) = 1$, and $0 \le b < d$ are a complete set of coset representatives for the right cosets of $\Gamma_0(N)$ on $M(n, N)$.
REMARK. The set of representatives here is a subset of the set in Lemma 8.15. The subset is proper if and only if $\mathrm{GCD}(n, N) > 1$.
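The representatives in Lemma 9.14 are easy to enumerate. The sketch below is illustrative only (the function name and the tuple encoding `(a, b, c, d)` of a matrix are choices made here, not notation from the text); it lists the matrices with $ad = n$, $d > 0$, $\mathrm{GCD}(a, N) = 1$, and $0 \le b < d$.

```python
from math import gcd


def hecke_coset_reps(n, N):
    """List the matrices (a, b, 0, d) of Lemma 9.14 as 4-tuples.

    Conditions: a * d == n, d > 0, gcd(a, N) == 1, 0 <= b < d.
    """
    reps = []
    for a in range(1, n + 1):
        if n % a == 0 and gcd(a, N) == 1:
            d = n // a
            reps.extend((a, b, 0, d) for b in range(d))
    return reps
```

For a prime $p$ not dividing $N$ this produces $p + 1$ matrices, while for $p$ dividing $N$ only the $p$ matrices with $a = 1$ and $d = p$ survive, which is the proper-subset phenomenon noted in the Remark.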
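Proposition 9.15. Let $f$ in $M_k(\Gamma_0(N))$ have $q$ expansion $f(\tau) = \sum_{n=0}^{\infty} c_n q^n$. Then $T_k(m)f$ has $q$ expansion
\[ T_k(m)f(\tau) = \sum_{n=0}^{\infty} b_n q^n, \]
where
\[ b_n = \begin{cases} c_0 \displaystyle\sum_{\substack{a \mid m,\ a > 0 \\ \mathrm{GCD}(a,N) = 1}} a^{k-1} & \text{if } n = 0 \\[2ex] c_m & \text{if } n = 1 \\[1ex] \displaystyle\sum_{\substack{a \mid \mathrm{GCD}(n,m) \\ \mathrm{GCD}(a,N) = 1}} a^{k-1} c_{nm/a^2} & \text{if } n > 1. \end{cases} \]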
Proof. The argument is substantially the same as for Proposition
8.16. The condition GCD(a, N) = 1 needs to be carried along
throughout.
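The coefficient formula of Proposition 9.15 translates directly into a short routine. The following sketch is an illustration under stated assumptions: the function name is invented here, the input is a finite list `c` with `c[n]` the coefficient of $q^n$, and only those $b_n$ with $nm$ inside the known range are returned. The case $n = 1$ needs no special treatment because the general sum then reduces to $c_m$.

```python
from math import gcd


def hecke_on_q_expansion(c, m, k, N):
    """Apply T_k(m) on level N to a truncated q expansion.

    c[n] is the coefficient of q^n. Returns [b_0, b_1, ...] for all n
    with n * m <= len(c) - 1, following Proposition 9.15.
    """
    count = (len(c) - 1) // m + 1
    b = []
    for n in range(count):
        if n == 0:
            total = c[0] * sum(a ** (k - 1) for a in range(1, m + 1)
                               if m % a == 0 and gcd(a, N) == 1)
        else:
            g = gcd(n, m)
            total = sum(a ** (k - 1) * c[n * m // (a * a)]
                        for a in range(1, g + 1)
                        if g % a == 0 and gcd(a, N) == 1)
        b.append(total)
    return b
```

As a usage example with the well-known first values of the Ramanujan $\tau$ function (the coefficients of $\Delta$), the level 1 operator $T_{12}(2)$ multiplies the expansion by $\tau(2) = -24$, whereas on level 2 the condition $\mathrm{GCD}(a, N) = 1$ removes the term with $a = 2$ and leaves $b_n = c_{2n}$.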
Next we examine how the $T_k(n)$ interact with one another. First we formalize the definition of dilation on modular pairs made in (9.39). We let
\[ R(n)(\Lambda, C) = (n\Lambda, nC). \]
This definition is consistent with the one in §VIII.7, and $R(n)T(m) = T(m)R(n)$ for all $m$ and $n$. The following lemma generalizes Lemma 8.18.
Lemma 9.16.
(a) For a prime power $p^r$ with $r \ge 1$ such that $p \nmid N$,
\[ T(p^r)T(p) = T(p^{r+1}) + p \cdot R(p)T(p^{r-1}). \tag{9.49} \]
(b) For a prime power $p^r$ with $r \ge 1$ such that $p \mid N$,
\[ T(p^r) = T(p)^r. \tag{9.50} \]
(c) $T(m)T(n) = T(mn)$ if $m$ and $n$ are relatively prime.
Proof. Parts (a) and (c) are proved in the same way as for Lemma 8.18, the cyclic group being of no significance in the argument. For (b), we shall show that
\[ T(p^r) = T(p^{r-1})T(p), \tag{9.51} \]
and then the result follows by induction. If $T(p)(\Lambda, C)$ contains a term $(\Lambda'', C'')$ and $T(p^{r-1})(\Lambda'', C'')$ contains a term $(\Lambda', C')$, then
\[ [\Lambda : \Lambda''] = p, \quad pC \twoheadrightarrow C'', \quad [\Lambda'' : \Lambda'] = p^{r-1}, \quad p^{r-1}C'' \twoheadrightarrow C'. \tag{9.52} \]
Hence $[\Lambda : \Lambda'] = p^r$ and $p^rC \twoheadrightarrow p^{r-1}C'' \twoheadrightarrow C'$. Thus $T(p^r)(\Lambda, C)$ contains the term $(\Lambda', C')$.
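In the reverse direction, if $T(p^r)(\Lambda, C)$ contains $(\Lambda', C')$, it contains it only once. So we want to see that there is only one $\Lambda''$ such that (9.52) holds. Fix such a $\Lambda''$. Choose a basis $\{\omega_1, \omega_2\}$ of $\Lambda$ so that $\{\tfrac{1}{N}\omega_1, \omega_2\}$ is a basis of $P_\Lambda^{-1}(C)$ and $\operatorname{Im}(\omega_2/\omega_1) > 0$. Fix a similar basis $\{\omega_1'', \omega_2''\}$ of $\Lambda''$. Changing basis for $\Lambda''$ by a member of $\Gamma_0(N)$, we may assume that
\[ \begin{pmatrix} \omega_2'' \\ \omega_1'' \end{pmatrix} = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix} \]
with the matrix as in Lemma 9.14. Since $ad = p$ and $\mathrm{GCD}(a, N) = 1$ and $p \mid N$, we must have $a = 1$. Thus
\[ \begin{pmatrix} \omega_2'' \\ \omega_1'' \end{pmatrix} = \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix} \qquad \text{with } 0 \le b < p. \]
Similarly we can find a basis $\{\omega_1', \omega_2'\}$ of $\Lambda'$ such that
\[ \begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix} = \begin{pmatrix} 1 & b' \\ 0 & p^{r-1} \end{pmatrix}\begin{pmatrix} \omega_2'' \\ \omega_1'' \end{pmatrix} \qquad \text{with } 0 \le b' < p^{r-1}. \]
Then
\[ \begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix} = \begin{pmatrix} 1 & b + b'p \\ 0 & p^r \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix}. \]
The uniqueness in Lemma 9.14 shows that $b + b'p$ here is determined by $\Lambda$ and $\Lambda'$. But $b + b'p$ determines $b$ and $b'$. Hence $\Lambda''$ is completely determined by $\Lambda$ and $\Lambda'$. This proves (9.51) and the lemma.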
To get from operators on modular pairs to operators on functions $F$ on modular pairs, we use the definitions of (8.51) and (8.52). Lemma 9.16 then yields the following analog of Theorem 8.19.
Theorem 9.17 (Hecke). On the space $M_k(\Gamma_0(N))$, the Hecke operators satisfy
(a) For a prime power $p^r$ with $r \ge 1$ such that $p \nmid N$,
\[ T_k(p^r)T_k(p) = T_k(p^{r+1}) + p^{k-1}T_k(p^{r-1}). \]
Hence $T_k(p^r)$ is a polynomial in $T_k(p)$ with integer coefficients.
(b) For a prime power $p^r$ with $r \ge 1$ such that $p \mid N$,
\[ T_k(p^r) = T_k(p)^r. \]
(c) $T_k(m)T_k(n) = T_k(mn)$ if $m$ and $n$ are relatively prime.
(d) The algebra generated by the $T_k(n)$ for $n = 1, 2, 3, \dots$ is generated by the $T_k(p)$ with $p$ prime and is commutative.
As a consequence of Lemma 9.6, we can define a Petersson inner product on $S_k(\Gamma_0(N))$. The definition is
\[ \langle f, h\rangle = \int_{R_N} f(\tau)\overline{h(\tau)}\,y^k\,\frac{dx\,dy}{y^2}, \]
where $R_N$ is a fundamental domain for $\Gamma_0(N)$. The inner product does not depend on the choice of fundamental domain. In an effort to prove an analog of Theorem 8.22 (that the $T_k(n)$ are self adjoint), we again take $n$ to be a prime $p$, by Theorem 9.17d. Let $R_S$ be a fundamental domain for $\Gamma(pN)$ obtained as the union of translates of $R_N$. Then the argument in the proof of Theorem 8.22 still shows that
\[ \int_{R_S} (f \circ [\alpha]_k)(\tau)\,\overline{h(\tau)}\,y^k\,\frac{dx\,dy}{y^2} = \int_{R_S} f(\tau)\,\overline{(h \circ [\alpha']_k)(\tau)}\,y^k\,\frac{dx\,dy}{y^2} \tag{9.53} \]
for $\alpha \in M(p)$, just as in (8.62).
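We now assume that $p \nmid N$. For the element $\alpha_i = \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}$, choose integers $x$ and $y$ so that $p^2x - Ny = 1 + pbN$. Then
\[ \begin{pmatrix} p & -b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} px - bN & y \\ N & p \end{pmatrix}\begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}\begin{pmatrix} x & xb + y + pb \\ N & Nb + p^2 \end{pmatrix}^{-1} \]
shows that $\alpha_i' = \gamma_i\alpha_i\tilde{\gamma}_i$ for some $\gamma_i$ and $\tilde{\gamma}_i$ in $\Gamma_0(N)$. Inverting this equation for $b = 0$, we see that we have a similar relation $\alpha_i' = \gamma_i\alpha_i\tilde{\gamma}_i$ for $\alpha_i = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$. Thus the same argument as the one in Theorem 8.22 that obtained that theorem from (8.62) is applicable here, but only under the assumption $p \nmid N$. We summarize as follows.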
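The matrix identity relating $\alpha_i' = \begin{pmatrix} p & -b \\ 0 & 1 \end{pmatrix}$ to $\alpha_i = \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}$ can be checked numerically. The sketch below is only an illustration (the helper names are invented here); it solves $p^2x - Ny = 1 + pbN$ by inverting $p^2$ modulo $N$, which is where the hypothesis $p \nmid N$ enters, and then verifies the identity in the inverse-free form $\alpha_i'\tilde{\gamma} = \gamma\alpha_i$. It assumes $N > 1$ and Python 3.8 or later for the modular inverse via `pow`.

```python
def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def find_xy(p, b, N):
    """Integers x, y with p^2 x - N y = 1 + p b N (needs gcd(p, N) = 1)."""
    target = 1 + p * b * N
    x = (target * pow(p * p, -1, N)) % N
    y = (p * p * x - target) // N
    return x, y


def check_adjoint_relation(p, b, N):
    """Verify alpha' * g2 == g1 * alpha with g1, g2 in Gamma_0(N)."""
    x, y = find_xy(p, b, N)
    alpha = ((1, b), (0, p))
    alpha_prime = ((p, -b), (0, 1))
    g1 = ((p * x - b * N, y), (N, p))
    g2 = ((x, x * b + y + p * b), (N, N * b + p * p))
    in_gamma0 = all(det(g) == 1 and g[1][0] % N == 0 for g in (g1, g2))
    return in_gamma0 and mat_mul(alpha_prime, g2) == mat_mul(g1, alpha)
```

If $p$ divides $N$, the call to `pow` raises an error because $p^2$ is not invertible modulo $N$, mirroring the remark that the argument works only when $p \nmid N$.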
Theorem 9.18 (Petersson). The Hecke operators $T_k(n)$ with $\mathrm{GCD}(n, N) = 1$, on the space of cusp forms $S_k(\Gamma_0(N))$, are self adjoint relative to the Petersson inner product.
It is not always true that the operators $T_k(n)$ with $\mathrm{GCD}(n, N) > 1$ are self adjoint. Thus Theorem 9.18 does not give us quite as good a result as we had with $SL(2,\mathbb{Z})$. Since the Hecke operators commute, we can conclude that $S_k(\Gamma_0(N))$ splits into the orthogonal sum of simultaneous eigenspaces for the operators $T_k(n)$ with $\mathrm{GCD}(n, N) = 1$. An eigenvector cusp form under the $T_k(n)$ with $\mathrm{GCD}(n, N) = 1$ is called an eigenform; eigenforms in the same such eigenspace are said to be equivalent.
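Proposition 9.19. The involution $w_N$ of $S_k(\Gamma_0(N))$ given in §4 is self adjoint and commutes with all $T_k(n)$ such that $\mathrm{GCD}(n, N) = 1$.
Proof. We have $w_N(f) = f \circ [\alpha_N]_k$. The argument of Theorem 8.22 establishes (9.53) for $\alpha = \alpha_N$ if we regard $R_N$ as a fundamental domain for $\Gamma(N^2)$. But $[\alpha_N']_k = [\alpha_N]_k$, and it follows that $w_N$ is self adjoint.
To prove that $w_NT_k(n) = T_k(n)w_N$, we may assume that $n$ is a prime $p$ with $p \nmid N$, by Theorem 9.17. The matrices $\alpha_i$ to use in the formula (9.48) for $T_k(p)$ are
\[ \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}, \quad \begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix} \text{ for } 1 \le b \le p - 1. \tag{9.54} \]
What is to be checked is that
\[ \sum_i f \circ [\alpha_N\alpha_i]_k = \sum_i f \circ [\alpha_i\alpha_N]_k. \tag{9.55} \]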
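The sum of the contributions from the first two matrices in (9.54) to each side of (9.55) is the same since
\[ \alpha_N\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}\alpha_N \qquad \text{and} \qquad \alpha_N\begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix} = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}\alpha_N. \]
We shall show that the sum of the contributions from the other matrices in (9.54) to each side of (9.55) is also the same. Specifically we shall show that the matrix with $b$ contributes to the left side what the matrix with $e$ contributes to the right side, where $1 \le e \le p - 1$ and $e \equiv (-Nb)^{-1} \bmod p$. In fact, we can calculate that
\[ \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}\begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix} = \begin{pmatrix} p & -e \\ -Nb & p^{-1}(1 + ebN) \end{pmatrix}\begin{pmatrix} 1 & e \\ 0 & p \end{pmatrix}\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}. \]
The first matrix $\gamma$ on the right side is in $\Gamma_0(N)$, and $f \circ [\gamma]_k = f$. Thus
\[ f \circ \left[\alpha_N\begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}\right]_k = f \circ \left[\begin{pmatrix} 1 & e \\ 0 & p \end{pmatrix}\alpha_N\right]_k \]
as asserted.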
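The matrix calculation behind this pairing of $b$ with $e$ can be confirmed by machine. The following sketch uses invented helper names and assumes Python 3.8 or later for the modular inverse; it checks that $\gamma$ has integer entries, lies in $\Gamma_0(N)$, and satisfies $\alpha_N\begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix} = \gamma\begin{pmatrix} 1 & e \\ 0 & p \end{pmatrix}\alpha_N$.

```python
def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def partner(p, b, N):
    """The residue e in 1..p-1 with e = (-N b)^(-1) mod p."""
    return pow((-N * b) % p, -1, p)


def check_wN_relation(p, b, N):
    """Verify alpha_N * (1 b; 0 p) == gamma * (1 e; 0 p) * alpha_N."""
    e = partner(p, b, N)
    if (1 + e * b * N) % p != 0:
        return False
    alpha_N = ((0, -1), (N, 0))
    gamma = ((p, -e), (-N * b, (1 + e * b * N) // p))
    lhs = mat_mul(alpha_N, ((1, b), (0, p)))
    rhs = mat_mul(gamma, mat_mul(((1, e), (0, p)), alpha_N))
    return det(gamma) == 1 and gamma[1][0] % N == 0 and lhs == rhs
```

Since $b \mapsto e$ is a bijection of $\{1, \dots, p-1\}$, summing over $b$ on one side of (9.55) matches summing over $e$ on the other side.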
Consequently the decomposition of $S_k(\Gamma_0(N))$ into spaces of equivalent eigenforms is compatible with the decomposition of $S_k(\Gamma_0(N))$ into $S_k^+(\Gamma_0(N))$ and $S_k^-(\Gamma_0(N))$.
The Hecke operators $T_k(n)$ with $\mathrm{GCD}(n, N) \ne 1$ commute with the other $T_k(n)$ and hence map the spaces of equivalent eigenforms into themselves. In each such space there will be at least one eigenvector of all the $T_k(n)$, and its eigenvalues will be as in Proposition 9.20 below. We cannot assert anything at this stage about a relationship between $w_N$ and the operators $T_k(n)$ for $\mathrm{GCD}(n, N) \ne 1$.
Proposition 9.20. Suppose $f \in S_k(\Gamma_0(N))$ is an eigenform that is an eigenvector of all $T_k(n)$, say with $T_k(n)f = \lambda(n)f$. If the $q$ expansion of $f$ at $\infty$ is $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$, then
\[ c_n = \lambda(n)c_1. \tag{9.56} \]
Consequently
(a) $f \ne 0$ implies $c_1 \ne 0$
(b) the system of eigenvalues $\{\lambda(n)\}$ determines $f$ up to a scalar.
Proof. This follows from Proposition 9.15 in the same way that Proposition 8.23 follows from Proposition 8.16.
Under the assumptions of Proposition 9.20, we can normalize $f$ so that the $q$ expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ has $c_1 = 1$. Then (9.56) says that $c_n$ is the eigenvalue of $T_k(n)$. From Theorem 9.17 we see that
\[ c_{p^r}c_p = c_{p^{r+1}} + p^{k-1}c_{p^{r-1}} \qquad \text{for } p \text{ prime},\ p \nmid N \tag{9.57a} \]
\[ c_{p^r} = (c_p)^r \qquad \text{for } p \text{ prime},\ p \mid N \tag{9.57b} \]
\[ c_mc_n = c_{mn} \qquad \text{if } \mathrm{GCD}(m, n) = 1. \tag{9.57c} \]
As in §VIII.8 it follows from (9.57) that the $L$ function $L(s,f) = \sum_{n=1}^{\infty} \dfrac{c_n}{n^s}$ has an Euler product expansion. In this case the Euler factor when $p \nmid N$ is quadratic, as before. But when $p \mid N$, it is first order of the form
\[ \frac{1}{1 - c_pp^{-s}}. \]
We summarize part of this discussion as follows.
Theorem 9.21 (Hecke-Petersson). The whole space $S_k(\Gamma_0(N))$ of cusp forms is the orthogonal sum of the spaces of equivalent eigenforms. Each space of equivalent eigenforms has a member that is an eigenvector for all $T_k(n)$. Any eigenform $f$ in $S_k(\Gamma_0(N))$ that is an eigenvector for all $T_k(n)$ can be normalized so that its $q$ expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ has $c_1 = 1$. With such a normalization the coefficients satisfy (9.57). Moreover, the $L$ function $L(s,f)$ has an Euler product expansion
\[ L(s,f) = \prod_{\substack{p \text{ prime} \\ p \mid N}} \left[\frac{1}{1 - c_pp^{-s}}\right] \prod_{\substack{p \text{ prime} \\ p \nmid N}} \left[\frac{1}{1 - c_pp^{-s} + p^{k-1-2s}}\right], \tag{9.58} \]
convergent for $\operatorname{Re} s > \frac{k}{2} + 1$.
7. Oldforms and Newforms
Theorem 9.21 is not the same kind of definitive result that we obtained in Chapter VIII. Since we were unable in Theorem 9.21 to relate $w_N$ to the $T_k(p)$ for which $p \mid N$, we ended up with no correlation between $L$ functions having Euler products and $L$ functions having functional equations.
It turns out that the difficulty is the presence of cusp forms that come trivially from lower levels. These are the cusp forms considered in Examples 1 and 2 in §3. A specific example is the pair of members $\Delta(\tau)$ and $\Delta(2\tau)$ of $S_{12}(\Gamma_0(2))$. We shall see after (9.61) that both are eigenforms with the same eigenvalues under all $T_{12}(n)$ with $n$ odd, yet they are linearly independent.
More generally the first kind of example was $f(\tau)$ in $S_k(\Gamma_0(N))$ when $f(\tau)$ was given in $S_k(\Gamma_0(N/r))$ and $r \mid N$. When $\mathrm{GCD}(n, N) = 1$, the formula for $T_k(n)f$ is the same relative to $\Gamma_0(N)$ as relative to $\Gamma_0(N/r)$. Hence an eigenform for $\Gamma_0(N/r)$ becomes an eigenform for $\Gamma_0(N)$ with the same eigenvalues.
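The second kind of example was $f(r\tau)$ in $S_k(\Gamma_0(N))$ when $f(\tau)$ was given in $S_k(\Gamma_0(N/r))$ and $r \mid N$. Let $A_k(r)$ be the operator
\[ A_k(r)f = f \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\right]_k. \tag{9.59} \]
Since $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ has
\[ A_k(r)f(\tau) = r^{k/2}f(r\tau), \tag{9.60} \]
we know from §3 that $A_k(r)$ carries $S_k(\Gamma_0(N/r))$ to $S_k(\Gamma_0(N))$. We shall see in Lemma 9.23 below that
\[ A_k(r)T_k(n) = T_k(n)A_k(r) \qquad \text{if } \mathrm{GCD}(n, N) = 1. \tag{9.61} \]
Consequently if $f(\tau)$ is an eigenform for $\Gamma_0(N/r)$, then $f(r\tau)$ is an eigenform for $\Gamma_0(N)$ with the same eigenvalues.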
We can combine the two examples in sequence as follows: If $r_1r_2 \mid N$ and if $f(\tau)$ is an eigenform for $\Gamma_0(N/(r_1r_2))$, then $f(r_2\tau)$ is an eigenform for $\Gamma_0(N)$ with the same eigenvalues. Such an eigenform we call an oldform. The linear span of the oldforms is denoted $S_k^{\mathrm{old}}(\Gamma_0(N))$, and its orthogonal complement is denoted $S_k^{\mathrm{new}}(\Gamma_0(N))$. The eigenforms in $S_k^{\mathrm{new}}(\Gamma_0(N))$ are called newforms for $\Gamma_0(N)$. Since $T_k(n)$ is self adjoint when $\mathrm{GCD}(n, N) = 1$, $S_k^{\mathrm{new}}(\Gamma_0(N))$ is spanned by newforms. The key result, whose proof we omit, is the following Multiplicity One Theorem.
Theorem 9.22 (Atkin-Lehner). If $f \in S_k(\Gamma_0(N))$ is a newform, then its equivalence class is one-dimensional, i.e., consists of the multiples of $f$.
Before deriving some of the consequences for newforms of Theorem
9.22, we supply a proof of (9.61).
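Lemma 9.23. If $\mathrm{GCD}(n, N) = 1$ and $r \mid N$, then
\[ A_k(r)T_k(n) = T_k(n)A_k(r) \]
on $S_k(\Gamma_0(N/r))$.
Proof. Let $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ range over the usual matrices with $ad = n$, $d > 0$, $\mathrm{GCD}(a, N) = 1$, and $b$ in a complete residue system modulo $d$. (Since $\mathrm{GCD}(n, N) = 1$, the condition $\mathrm{GCD}(a, N) = 1$ is vacuous.) By Lemma 9.14 and (9.48)
\[ T_k(n)A_k(r)f = n^{\frac{k}{2}-1}\sum f \circ \left[\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\right]_k = n^{\frac{k}{2}-1}\sum f \circ \left[\begin{pmatrix} a & br \\ 0 & d \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}\right]_k. \]
Since $br$ goes through a complete residue system modulo $d$, the right side is
\[ = A_k(r)T_k(n)f, \]
and the lemma follows.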
We return to the consequences of Theorem 9.22. The operators $w_N$ and $T_k(p)$ with $p \mid N$ commute with the $T_k(p)$ having $p \nmid N$, and thus they map each equivalence class into itself. If $f$ is a newform, then the theorem says that $\mathbb{C}f$ is an equivalence class. Consequently $f$ is an eigenvector for $w_N$ and the $T_k(p)$ with $p \mid N$. The first of these conclusions means that $L(s,f)$ has a functional equation, according to Theorem 9.8. The second of these conclusions means that $L(s,f)$ has an Euler product expansion (after $f$ is normalized), according to Theorem 9.21.
But there are further operators of this kind if $N$ is composite, and they give some insight into the eigenvalue $\pm 1$ of $w_N$. Fix a prime $p$ dividing $N$, and let $Q = p^l$ be the exact power of $p$ dividing $N$. Then we can choose integers $\alpha_0$ and $\gamma_0$ with $Q\alpha_0 - (N/Q)\gamma_0 = 1$, and the matrix
\[ \begin{pmatrix} Q\alpha_0 & 1 \\ N\gamma_0 & Q \end{pmatrix} \]
will have determinant $Q$. More generally we consider any matrix
\[ w(Q) = \begin{pmatrix} Q\alpha & \beta \\ N\gamma & Q\delta \end{pmatrix} \tag{9.62} \]
of determinant $Q$. For $f$ in $S_k(\Gamma_0(N))$, let $w_Qf = f \circ [w(Q)]_k$.
Lemma 9.24. The operator $w_Q$ carries $S_k(\Gamma_0(N))$ into itself. It is independent of the choice of defining matrix $w(Q)$, and $w_Q^2 = 1$. It commutes with all $T_k(n)$ with $\mathrm{GCD}(n, N) = 1$. If $p'$ is another prime dividing $N$ and if $Q'$ is the corresponding $Q$, then $w_Q$ and $w_{Q'}$ commute. As $p$ varies over all primes dividing $N$, the product of the various $w_Q$'s is $w_N$.
Proof. Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be in $\Gamma_0(N)$. Carrying out the multiplications, we see that $w(Q)\begin{pmatrix} a & b \\ c & d \end{pmatrix}w(Q)^{-1}$ is in $\Gamma_0(N)$. The same argument as with $w_N$ in §4 then shows that $w_Q$ carries $S_k(\Gamma_0(N))$ into itself.
If we have two representatives $w(Q)$ and $w'(Q)$, we calculate that $w(Q)w'(Q)^{-1}$ is an element $\gamma'$ of $\Gamma_0(N)$. For $f$ in $S_k(\Gamma_0(N))$, we therefore have
\[ f \circ [w(Q)]_k = (f \circ [\gamma']_k) \circ [w'(Q)]_k = f \circ [w'(Q)]_k. \]
Thus $w_Q$ is independent of the representative. Similarly we calculate that $Q^{-1}w(Q)^2$ is in $\Gamma_0(N)$, and it follows that $w_Q^2 = 1$.
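The two matrix facts used so far in this proof can be tested on examples. The sketch below is a hypothetical illustration (all helper names are invented here): it builds the specific matrix $\begin{pmatrix} Q\alpha_0 & 1 \\ N\gamma_0 & Q \end{pmatrix}$ from the text, assuming $\mathrm{GCD}(Q, N/Q) = 1$, and checks that $Q^{-1}w(Q)^2$ and $w(Q)\gamma w(Q)^{-1}$ lie in $\Gamma_0(N)$ for a given $\gamma$ in $\Gamma_0(N)$. It requires Python 3.8 or later for the modular inverse.

```python
def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def in_gamma0(g, N):
    return det(g) == 1 and g[1][0] % N == 0


def atkin_lehner_matrix(N, Q):
    """Matrix (Q*a0, 1; N*g0, Q) with Q*a0 - (N/Q)*g0 = 1."""
    M = N // Q
    a0 = pow(Q, -1, M) if M > 1 else 1
    g0 = (Q * a0 - 1) // M
    return ((Q * a0, 1), (N * g0, Q))


def divide_by(A, Q):
    """Return A / Q if every entry is divisible by Q, else None."""
    if any(entry % Q for row in A for entry in row):
        return None
    return tuple(tuple(entry // Q for entry in row) for row in A)


def square_check(N, Q):
    w = atkin_lehner_matrix(N, Q)
    g = divide_by(mat_mul(w, w), Q)
    return g is not None and in_gamma0(g, N)


def normalizer_check(N, Q, gamma):
    w = atkin_lehner_matrix(N, Q)
    adj = ((w[1][1], -w[0][1]), (-w[1][0], w[0][0]))   # Q * w^{-1}
    g = divide_by(mat_mul(mat_mul(w, gamma), adj), Q)
    return g is not None and in_gamma0(g, N)
```

For $N = 12$ the admissible prime powers are $Q = 4$ and $Q = 3$, and both checks succeed for sample elements of $\Gamma_0(12)$.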
For the commutativity with $T_k(n)$, it is enough to prove commutativity with $T_k(p')$, where $p'$ is a prime with $p' \nmid N$. Using the facts that $p' \nmid N$ and $p' \ne p$, we shall choose the representative $w(Q)$ in (9.62) to be congruent to $\begin{pmatrix} Q & 0 \\ 0 & 1 \end{pmatrix}$ modulo $p'$. Namely choose an integer $Q_1$ with $QQ_1 \equiv 1 \bmod p'$. Let $Q_0 = Q_1 + mp'$, where $m$ is the product of all primes dividing $N$ but not $Q_1$; then $QQ_0 \equiv 1 \bmod p'$ and $\mathrm{GCD}(Q_0, N) = 1$. Since $\mathrm{GCD}(Q^2Q_0, Np'^2) = Q$, we can choose $\alpha'$ and $\gamma'$ with
\[ Q^2Q_0\alpha' - Np'^2\gamma' = Q. \]
Then
\[ w(Q) = \begin{pmatrix} Q\alpha' & p' \\ Np'\gamma' & QQ_0 \end{pmatrix} \]
has the required properties.
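If $m(b)$ denotes $\begin{pmatrix} 1 & b \\ 0 & p' \end{pmatrix}$, then we can calculate that
\[ m(b)w(Q)m(Q_0b)^{-1} = w'(Q) \]
with $w'(Q)$ of the form (9.62). Hence
\[ \sum_{b \bmod p'} (f \circ [m(b)]_k) \circ [w(Q)]_k = \sum_{b \bmod p'} (w_Qf) \circ [m(Q_0b)]_k = \sum_{b \bmod p'} (w_Qf) \circ [m(b)]_k. \tag{9.63a} \]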
With the same w'{Q) we have
(i !H(S !)"'-"«»•
and with another w"{Q) we have
Thus
(/.[(J J)]>W«U-^/).[(J J)].
(/.[(J ;)]J.M<BU-(^/).[(S ?)],
(9.63b)
(9.63c)
Adding the equations (9.63), we obtain
\[ w_QT_k(p')f = T_k(p')w_Qf \]
as required.
If $p'$ is another prime, we calculate that
\[ (w(Q)w(Q'))(w(Q')w(Q))^{-1} \]
is in $\Gamma_0(N)$. Then it follows that $w_Qw_{Q'} = w_{Q'}w_Q$ on $S_k(\Gamma_0(N))$.
For the product formula for $w_N$, we extend the definition of $w(Q)$ in (9.62) to any divisor $Q$ of $N$ with $\mathrm{GCD}(Q, N/Q) = 1$. If $Q$ and $Q'$ are relatively prime such divisors, we find that $w(Q)w(Q')$ is a version of $w(QQ')$. Hence the product of all $w(Q)$ with $Q$ a prime power is a version of $w(N)$. But $\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$ is another version of $w(N)$. Then it follows that $w_N$ is the product of all $w_Q$ with $Q$ a prime power dividing $N$ such that $\mathrm{GCD}(Q, N/Q) = 1$. This completes the proof of the lemma.
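If $p$ is prime, let $\Gamma_0(r, p)$ be the subgroup of $\Gamma_0(r)$ given by
\[ \Gamma_0(r, p) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(r) \;\middle|\; b \equiv 0 \bmod p \right\}. \]
Lemma 9.25. If $p$ is prime, then a complete set of right coset representatives for $\Gamma_0(r, p)$ in $\Gamma_0(r)$ is
\[ \begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix} \text{ with } 0 \le j \le p - 1 \qquad \text{if } p \mid r \tag{9.64a} \]
\[ \begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix} \text{ with } 0 \le j \le p - 1, \text{ and } \begin{pmatrix} p\alpha & 1 \\ r\gamma & 1 \end{pmatrix} \qquad \text{if } p \nmid r, \tag{9.64b} \]
where $\alpha$ and $\gamma$ are integers such that $p\alpha - r\gamma = 1$.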
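Proof. Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be given in $\Gamma_0(r)$. If $p \nmid a$, we can choose $j$ so that $p \mid (b - aj)$, and then
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ is in } \Gamma_0(r, p)\begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix}. \]
If $p \mid r$, then $p \mid a$ is impossible, and the elements (9.64a) represent all cosets. If $p \nmid r$ and $p \mid a$, then
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ is in } \Gamma_0(r, p)\begin{pmatrix} p\alpha & 1 \\ r\gamma & 1 \end{pmatrix}. \]
Hence the elements (9.64b) represent all cosets in this case. We readily check that the elements in question represent distinct cosets, and the proof is complete.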
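Lemma 9.26. Let $f$ be in $S_k(\Gamma_0(N))$, and let $p$ be a prime dividing $N$. If $p^2 \mid N$, then $T_k(p)f$ is in $S_k(\Gamma_0(N/p))$. If $p \mid N$ and $p^2 \nmid N$, then $T_k(p)f + p^{\frac{k}{2}-1}w_pf$ is in $S_k(\Gamma_0(N/p))$.
Proof. Let $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be in $\Gamma_0(N/p, p)$. Then
\[ \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\gamma\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & b/p \\ pc & d \end{pmatrix} \]
is in $\Gamma_0(N)$. Hence
\[ f \circ \left[\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\gamma\right]_k = f \circ \left[\begin{pmatrix} a & b/p \\ pc & d \end{pmatrix}\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\right]_k = f \circ \left[\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\right]_k, \]
and $f \circ \left[\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\right]_k$ transforms according to $\Gamma_0(N/p, p)$.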
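Let $\{\beta_i\}$ be a system of right coset representatives for $\Gamma_0(N/p, p)$ in $\Gamma_0(N/p)$. If $\gamma$ is in $\Gamma_0(N/p)$, then $\{\beta_i\gamma\}$ is another system of right coset representatives. Hence
\[ \sum_i f \circ \left[\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\beta_i\right]_k = \sum_i f \circ \left[\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\beta_i\gamma\right]_k, \tag{9.65} \]
and the left side of (9.65) transforms according to $\Gamma_0(N/p)$.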
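If we use the representatives in Lemma 9.25, the desired result now follows. In fact,
\[ \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix} = \frac{1}{p}\begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}\begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix} = \frac{1}{p}\begin{pmatrix} 1 & j \\ 0 & p \end{pmatrix} \]
shows that the contribution to the left side of (9.65) from all $\begin{pmatrix} 1 & j \\ 0 & 1 \end{pmatrix}$ is just $p^{-\frac{k}{2}+1}T_k(p)f$. If $p \mid N$ and $p^2 \nmid N$, then $p \nmid (N/p)$, and there is one more coset representative, namely $\begin{pmatrix} p\alpha & 1 \\ N\gamma/p & 1 \end{pmatrix}$. It satisfies
\[ \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}^{-1}\begin{pmatrix} p\alpha & 1 \\ N\gamma/p & 1 \end{pmatrix} = \frac{1}{p}\begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}\begin{pmatrix} p\alpha & 1 \\ N\gamma/p & 1 \end{pmatrix} = \frac{1}{p}\begin{pmatrix} p\alpha & 1 \\ N\gamma & p \end{pmatrix}, \]
where the last matrix is of the form (9.62) with $Q = p$, and the contribution to the left side of (9.65) is $w_pf$.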
Theorem 9.27 (Atkin-Lehner). The space $S_k^{\mathrm{new}}(\Gamma_0(N))$ of newforms is the orthogonal sum of one-dimensional equivalence classes of eigenforms. If $f$ is such an eigenform, then $f$ can be normalized so that its $q$ expansion $f(\tau) = \sum_{n=1}^{\infty} c_n q^n$ has $c_1 = 1$. In this case the eigenform $f$ is an eigenvector of $T_k(n)$ for all $n$, of all $w_Q$ for primes dividing $N$, and of $w_N$, and the eigenvalues are as follows:
(a) $T_k(n)f = c_nf$ for all $n$
(b) $w_Qf = \lambda(Q)f$ with $\lambda(Q) = \pm 1$, if $p \mid N$ and $Q$ corresponds to $p$
(c) $w_Nf = \prod_{p \mid N}\lambda(Q)f$.
Moreover, $c_p = 0$ if $p^2 \mid N$, while $c_p = -p^{\frac{k}{2}-1}\lambda(p)$ if $p \mid N$ and $p^2 \nmid N$. Consequently the $L$ function $L(s,f)$ has an Euler product expansion
\[ L(s,f) = \prod_{\substack{p \text{ prime} \\ p \nmid N}} \left[\frac{1}{1 - c_pp^{-s} + p^{k-1-2s}}\right] \prod_{\substack{p \text{ prime} \\ p \mid N,\ p^2 \nmid N}} \left[\frac{1}{1 + \lambda(p)p^{\frac{k}{2}-1-s}}\right], \tag{9.66} \]
and $L(s,f)$ satisfies a functional equation (9.33) with $\varepsilon = \prod_{p \mid N}\lambda(Q)$.
Proof. The first sentence is by Theorem 9.22 and the self adjointness of the $T_k(n)$ with $\mathrm{GCD}(n, N) = 1$. Since the $T_k(n)$ commute, any such eigenform must then be an eigenvector of all $T_k(n)$. By Theorem 9.21, $c_1 \ne 0$; also if $f$ is normalized to make $c_1 = 1$, then (a) holds. Lemma 9.24 shows that $f$ is an eigenvector of all $w_Q$. Since $w_Q^2 = 1$, (b) holds. Lemma 9.24 also proves (c).
Let $p^2 \mid N$. By Lemma 9.26, $T_k(p)f$ is an oldform equivalent to $f$. Since $f$ is orthogonal to $S_k^{\mathrm{old}}(\Gamma_0(N))$, $T_k(p)f = 0$. By (a), $c_p = 0$.
Let $p \mid N$ but $p^2 \nmid N$. Lemma 9.26 and the same argument show that $T_k(p)f + p^{\frac{k}{2}-1}w_pf = 0$. By (a) and (b),
\[ c_pf = T_k(p)f = -p^{\frac{k}{2}-1}w_pf = -p^{\frac{k}{2}-1}\lambda(p)f. \]
Hence $c_p = -p^{\frac{k}{2}-1}\lambda(p)$.
The Euler product expansion follows from Theorem 9.21 and these evaluations of $c_p$ when $p \mid N$. The functional equation is by Theorem 9.8.
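A concrete instance may help fix ideas. By Corollary 9.11 the space $S_2(\Gamma_0(11))$ is one-dimensional, and since there are no nonzero cusp forms of weight 2 and level 1, its normalized generator is a newform. The sketch below relies on a fact not established in this section, namely the classical description of that generator as the product $q\prod_{n \ge 1}(1 - q^n)^2(1 - q^{11n})^2$; the function name is invented for illustration. It expands the product and lets one read off $c_{11}$, so that $c_{11} = -11^{\frac{k}{2}-1}\lambda(11)$ with $k = 2$ determines $\lambda(11)$.

```python
def level11_newform_coefficients(num_terms):
    """Coefficients c[0..num_terms-1] of q * prod (1-q^n)^2 (1-q^(11n))^2.

    Assumption (not proved here): this product is the normalized
    newform of weight 2 and level 11.
    """
    series = [0] * num_terms
    series[1] = 1

    def times_one_minus_q_power(poly, e):
        # Multiply in place by (1 - q^e), truncating at len(poly).
        for i in range(len(poly) - 1, e - 1, -1):
            poly[i] -= poly[i - e]

    for n in range(1, num_terms):
        for e in (n, 11 * n):
            if e < num_terms:
                times_one_minus_q_power(series, e)
                times_one_minus_q_power(series, e)
    return series
```

With `c = level11_newform_coefficients(130)`, the value `c[11]` is 1, so $\lambda(11) = -1$, and relations such as `c[4] == c[2]**2 - 2` and `c[121] == c[11]**2` illustrate (9.57a) and (9.57b).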
CHAPTER X
L FUNCTION OF AN ELLIPTIC CURVE
1. Global Minimal Weierstrass Equations
Let E be an elliptic curve over Q. The L function of E is a certain
Euler product that takes into account information about the reduction
of E modulo each prime p. This section will deal with some preliminaries
that make the definition invariant under admissible changes of variables
over Q.
From the start we may assume that the equation is as in (3.23) with integer coefficients. The discriminant $\Delta$ will then be an integer, and the $p$-adic norm will satisfy $|\Delta|_p \le 1$ with equality if and only if $p \nmid \Delta$. An equation (3.23) is called minimal for the prime $p$ if the power of $p$ dividing $\Delta$ cannot be decreased by making an admissible change of variables over $\mathbb{Q}$ with the property that the new coefficients are $p$-integral. It is the same to say that $|\Delta|_p$ cannot be increased by such a change of variables. The equation (3.23) is called a global minimal Weierstrass equation if it is minimal for all primes and if its coefficients are integers.
Before considering existence and uniqueness questions for these notions, it will be helpful to have close at hand detailed formulas for an admissible change of variables. Such a change of variables is given as in (3.43a) by
\[ x = u^2x' + r \qquad \text{and} \qquad y = u^3y' + su^2x' + t. \tag{10.1} \]
The effect on the coefficients $a_i$ of the Weierstrass equation (3.23) and of the related coefficients $b_i$, $c_i$, and $\Delta$ is given in Table 10.1. The new coefficients are denoted by primes.
\[ \begin{aligned} ua_1' &= a_1 + 2s \\ u^2a_2' &= a_2 - sa_1 + 3r - s^2 \\ u^3a_3' &= a_3 + ra_1 + 2t \\ u^4a_4' &= a_4 - sa_3 + 2ra_2 - (t + rs)a_1 + 3r^2 - 2st \\ u^6a_6' &= a_6 + ra_4 + r^2a_2 + r^3 - ta_3 - t^2 - rta_1 \\ u^2b_2' &= b_2 + 12r \\ u^4b_4' &= b_4 + rb_2 + 6r^2 \\ u^6b_6' &= b_6 + 2rb_4 + r^2b_2 + 4r^3 \\ u^8b_8' &= b_8 + 3rb_6 + 3r^2b_4 + r^3b_2 + 3r^4 \\ u^4c_4' &= c_4 \\ u^6c_6' &= c_6 \\ u^{12}\Delta' &= \Delta \end{aligned} \]
Table 10.1 Effect of an admissible change of variables
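The entries of Table 10.1 lend themselves to a direct numerical check. The sketch below is illustrative: the function names are invented here, and the formulas for $b_2, b_4, b_6, b_8, c_4, c_6, \Delta$ in terms of the $a_i$ are the standard ones, which are assumed to agree with (3.24) and the following definitions in the text. Exact rational arithmetic is used so that $u$ may be any nonzero rational.

```python
from fractions import Fraction


def invariants(a1, a2, a3, a4, a6):
    """Return (c4, c6, discriminant) of a Weierstrass equation."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, c6, disc


def change_variables(coeffs, u, r, s, t):
    """New coefficients (a1', a2', a3', a4', a6') under (10.1)."""
    a1, a2, a3, a4, a6 = (Fraction(a) for a in coeffs)
    u = Fraction(u)
    return (
        (a1 + 2 * s) / u,
        (a2 - s * a1 + 3 * r - s * s) / u ** 2,
        (a3 + r * a1 + 2 * t) / u ** 3,
        (a4 - s * a3 + 2 * r * a2 - (t + r * s) * a1
         + 3 * r * r - 2 * s * t) / u ** 4,
        (a6 + r * a4 + r * r * a2 + r ** 3
         - t * a3 - t * t - r * t * a1) / u ** 6,
    )
```

For instance, the curve $y^2 + y = x^3 - x^2$ has $c_4 = 16$, $c_6 = -152$, and $\Delta = -11$; after any admissible change of variables the new invariants, multiplied by $u^4$, $u^6$, and $u^{12}$ respectively, return these values.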
Lemma 10.1. Suppose $p$ is a prime and all the coefficients $a_i$ in (3.23) are $p$-integral. If $|\Delta|_p > p^{-12}$ or $|c_4|_p > p^{-4}$ or $|c_6|_p > p^{-6}$, then the equation is minimal for the prime $p$. Conversely if $p > 3$ and $|\Delta|_p \le p^{-12}$ and $|c_4|_p \le p^{-4}$, then the equation is not minimal for the prime $p$.
Remark. The proof will show how constructively to achieve minimality simultaneously for all primes $p > 3$.
Proof. Suppose a change of variables (10.1) leads to a system of $p$-integral coefficients $\{a_i'\}$ with $1 \ge |\Delta'|_p > |\Delta|_p$. Since $u^{12}\Delta' = \Delta$, we have $|u|_p^{12}|\Delta'|_p = |\Delta|_p$, so that $|u|_p < 1$. Then $|u|_p \le p^{-1}$, and
\[ |\Delta|_p = |u|_p^{12}|\Delta'|_p \le p^{-12} \cdot 1 = p^{-12}. \]
The arguments for $c_4$ and $c_6$ are similar.
Conversely let $p > 3$ and $|\Delta|_p \le p^{-12}$ and $|c_4|_p \le p^{-4}$. Then (3.31) gives $1728\Delta = c_4^3 - c_6^2$. Since $|1728|_p = 1$, we see that $|c_6|_p \le p^{-6}$. From §III.2, there is an admissible change of variables leading from (3.23) to
\[ y^2 = x^3 - 27c_4x - 54c_6 \]
with discriminant $\Delta' = 2^{12}3^{12}\Delta$. If we make a change of variables (10.1) with $u = p$ and $r = s = t = 0$, we are led to
\[ y^2 = x^3 - 27(c_4p^{-4})x - 54(c_6p^{-6}). \]
This has $p$-integral coefficients, since $|c_4p^{-4}|_p \le 1$ and $|c_6p^{-6}|_p \le 1$, and the discriminant $\Delta'' = p^{-12}\Delta'$ has $|\Delta''|_p = p^{12}|\Delta'|_p = p^{12}|\Delta|_p$. Hence the given equation was not minimal for the prime $p$.
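For primes $p > 3$ the criterion of Lemma 10.1 depends only on $c_4$ and $c_6$, so it can be applied without reference to the $a_i$. The following sketch is an illustration with invented function names; it assumes integer inputs $c_4, c_6$ coming from an equation with $p$-integral coefficients, uses $1728\Delta = c_4^3 - c_6^2$ together with the fact that 1728 is a $p$-adic unit for $p > 3$, and repeats the rescaling $u = p$ from the proof until the criterion reports minimality.

```python
import math


def valuation(n, p):
    """Exponent of p in the integer n (infinite for n == 0)."""
    if n == 0:
        return math.inf
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def is_minimal_at(c4, c6, p):
    """Criterion of Lemma 10.1, valid only for primes p > 3."""
    if p <= 3:
        raise ValueError("the criterion is stated for p > 3 only")
    disc_times_1728 = c4 ** 3 - c6 ** 2   # same p-adic norm as Delta
    return not (valuation(disc_times_1728, p) >= 12
                and valuation(c4, p) >= 4)


def reduce_at(c4, c6, p):
    """Rescale (c4, c6) by (p^-4, p^-6) until minimal at p."""
    while not is_minimal_at(c4, c6, p):
        c4 //= p ** 4
        c6 //= p ** 6
    return c4, c6
```

The integer divisions are exact, because $p^4 \mid c_4$ and $p^{12} \mid c_4^3 - c_6^2$ force $p^6 \mid c_6$, as in the proof.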
Proposition 10.2. Fix a prime $p$ and an elliptic curve $E$ over $\mathbb{Q}$.
(a) There exists an admissible change of variables for $E$ over $\mathbb{Q}$ such that the resulting equation is minimal for the prime $p$.
(b) If $E$ has $p$-integral coefficients, then the change of variables in (a) has $u, r, s, t$ all $p$-integral.
(c) Two equations that are minimal for the prime $p$ and that come from $E$ are related by an admissible change of variables in which $|u|_p = 1$ and $r, s, t$ are $p$-integral.
Proof. (a) Without loss of generality we may assume $E$ has $p$-integral coefficients (or actually integral coefficients). Then $|\Delta|_p \le 1$. Since the range of $|\cdot|_p$ is discrete away from 0, $|\Delta|_p$ can be increased only finitely many times if we are to maintain $|\Delta|_p \le 1$. Hence in finitely many steps, we can pass to an equation minimal for the prime $p$.
(b) Let $E$ have coefficients $\{a_i\}$, and let the minimal equation have coefficients $\{a_i'\}$. Since $|\Delta'|_p \ge |\Delta|_p$, we must have $|u|_p \le 1$. From (3.24), we see that all $\{b_i\}$ and $\{b_i'\}$ are $p$-integral. Suppose $p \ne 3$. If $|r|_p > 1$, then the equation for $u^8b_8'$ in Table 10.1 has $3r^4$ as strictly the largest term in $p$-norm on the right side, contradiction; we conclude $|r|_p \le 1$. If $p = 3$, we can argue similarly with $u^6b_6'$ and the term $4r^3$ to see that $|r|_p \le 1$. Similar arguments with $u^2a_2'$ and $-s^2$, and then with $u^6a_6'$ and $-t^2$ give $|s|_p \le 1$ and $|t|_p \le 1$.
(c) We apply (b) to the change of variables relating two minimal equations, finding that $|u|_p \le 1$ and that $r, s, t$ are $p$-integral. Applying (b) to the inverse change of variables, which involves $u^{-1}$, we see that $|u^{-1}|_p \le 1$. Thus $|u|_p = 1$.
Theorem 10.3 (Néron). If $E$ is an elliptic curve over $\mathbb{Q}$, then there exists an admissible change of variables over $\mathbb{Q}$ such that the resulting equation is a global minimal Weierstrass equation. Two such resulting global minimal Weierstrass equations are related by an admissible change of variables with $u = \pm 1$ and with $r, s, t$ in $\mathbb{Z}$.
1. GLOBAL MINIMAL WEIERSTRASS EQUATIONS 293
PROOF. The uniqueness is immediate from Proposition 10.2c, and we
are to prove existence. For existence we may assume that E has integer
coefficients $a_i$. For each p dividing $\Delta$, choose an admissible change
of variables $\{u_p, r_p, s_p, t_p\}$ over Q such that the resulting equation has
coefficients $a'_{i,p}$ and is minimal for the prime p. By Proposition 10.2b,
$$\text{the rationals } u_p, r_p, s_p, t_p \text{ are p-integral.} \tag{10.2}$$
If the new discriminant is denoted $\Delta_p$, then Table 10.1 gives
$$|u_p|_p^{12}\,|\Delta_p|_p = |\Delta|_p. \tag{10.3}$$
Let us write
$$u_p = p^{d_p}v_p \quad\text{with } |v_p|_p = 1. \tag{10.4a}$$
Define
$$u = \prod_{p \mid \Delta} p^{d_p}. \tag{10.4b}$$
We shall make an admissible change of variables $\{u, r, s, t\}$ in the
original equation that leads to an equation with integer coefficients $a_i'$ and
discriminant $\Delta'$. Since $u^{12}\Delta' = \Delta$, we have
$$|\Delta'|_p = |u|_p^{-12}|\Delta|_p = |u_p|_p^{-12}|\Delta|_p = |\Delta_p|_p \tag{10.5}$$
by (10.3). Thus the new equation is minimal for all p, hence is globally
minimal.
For each p with $p \mid \Delta$, let us write $r_p = p^{\rho_p}m_p/n_p$ with $m_p$ and $n_p$ in
Z and with $|m_p|_p = |n_p|_p = 1$. Let $n_p^{-1}$ be an inverse to $n_p$ modulo $p^{6d_p}$.
We set up the congruence
$$r \equiv p^{\rho_p}m_pn_p^{-1} \bmod p^{6d_p}. \tag{10.6}$$
By the Chinese Remainder Theorem, we can find an integer r such that
(10.6) is satisfied for all p with $p \mid \Delta$. Then $|n_pr - p^{\rho_p}m_p|_p \le p^{-6d_p}$ and
$$|r - r_p|_p \le p^{-6d_p}$$
for all p. Similarly we can find integers s and t such that
$$|s - s_p|_p \le p^{-6d_p} \quad\text{and}\quad |t - t_p|_p \le p^{-6d_p} \quad\text{for all } p.$$
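The Chinese Remainder step can be sketched in a few lines. The following is a minimal illustration, not part of the original argument: the function name `crt_rational` and the sample moduli and targets are hypothetical, and each target is assumed to be a rational whose denominator is a unit modulo the corresponding prime power.

```python
from fractions import Fraction

def crt_rational(targets):
    """Return an integer r with n*r ≡ m (mod M) for every entry M -> m/n.

    `targets` maps pairwise coprime moduli M (think p**(6*d_p)) to Fractions
    m/n whose denominators are invertible modulo M.
    """
    r, modulus = 0, 1
    for m_p, frac in targets.items():
        # Read the p-integral rational m/n as the residue m * n^{-1} mod m_p.
        residue = frac.numerator * pow(frac.denominator, -1, m_p) % m_p
        # Correct r by a multiple of the moduli that were already handled.
        step = (residue - r) * pow(modulus, -1, m_p) % m_p
        r += modulus * step
        modulus *= m_p
    return r

# Hypothetical data: approximate 1/3 modulo 2^6 and 5/2 modulo 3^6 simultaneously.
targets = {2**6: Fraction(1, 3), 3**6: Fraction(5, 2)}
r = crt_rational(targets)
```

The integer `r` is then p-adically close to each prescribed rational, which is exactly the property used for r, s, and t in the proof.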
Our admissible change of variables $\{u, r, s, t\}$ is now defined, and we
are left with showing that the new coefficients $\{a_i'\}$ are integers. We
check for all primes p that $|a_1'|_p \le 1, \dots, |a_6'|_p \le 1$, using the formulas
of Table 10.1. For $p \nmid \Delta$, there is no problem: Since $|u|_p = 1$ and $r, s, t$
are integers, we have $|a_i'|_p \le 1$. For $p \mid \Delta$, we estimate each $|a_i'|_p$. The
estimates are similar, and we illustrate with $a_2'$ only. We have
$$\begin{aligned}
u^2a_2' &= a_2 - sa_1 + 3r - s^2 \\
&= (a_2 - s_pa_1 + 3r_p - s_p^2) - (s - s_p)a_1 + 3(r - r_p) - (s^2 - s_p^2) \\
&= u_p^2a'_{2,p} - (s - s_p)a_1 + 3(r - r_p) - (s - s_p)(s + s_p).
\end{aligned}$$
Hence
$$\begin{aligned}
|u^2a_2'|_p &\le \max\{|u_p^2a'_{2,p}|_p,\ |(s - s_p)a_1|_p,\ |3(r - r_p)|_p,\ |(s - s_p)(s + s_p)|_p\} \\
&\le \max\{|u_p^2|_p,\ |s - s_p|_p,\ |r - r_p|_p\} && \text{by (10.2)} \\
&\le \max\{|u_p^2|_p,\ p^{-6d_p}\} \le |u_p^2|_p && \text{by (10.4a)}.
\end{aligned}$$
By (10.4), $|u|_p = |u_p|_p$. Thus $|a_2'|_p \le 1$, and the proof is complete.
The argument in Theorem 10.3 is constructive, provided we know
how to produce, for each individual p, an equation that is minimal for
the prime p. The proof of Lemma 10.1 shows how to produce such an
equation for primes > 3, and an algorithm of Tate, which we do not
discuss here, handles the cases p = 2 and p = 3.
2. Zeta Functions and L Functions
To define the L function of an elliptic curve E over Q we assume that
E is given by a globally minimal Weierstrass equation. This condition
is no loss of generality in view of Theorem 10.3.
For each prime p, we consider the reduction $E_p$ of E modulo p. This
curve was introduced in §V.2 and is defined over $\mathbb{Z}_p$. It is singular if and
only if $p \mid \Delta$. The singular cases were discussed in §III.5. In both the
nonsingular and the singular cases we define
$$a_p = p + 1 - \#E_p(\mathbb{Z}_p), \tag{10.7}$$
where $E_p(\mathbb{Z}_p)$ is as usual the set of projective solutions. The local L
factor for the prime p is the formal power series given by
$$L_p(u) = \begin{cases} \dfrac{1}{1 - a_pu + pu^2} & \text{if } p \nmid \Delta \\[2ex] \dfrac{1}{1 - a_pu} & \text{if } p \mid \Delta. \end{cases} \tag{10.8}$$
The L function of E is the product of the local L factors, with u
replaced in the pth factor by $p^{-s}$:
$$L(s, E) = \prod_{p \mid \Delta}\left[\frac{1}{1 - a_pp^{-s}}\right]\ \prod_{p \nmid \Delta}\left[\frac{1}{1 - a_pp^{-s} + p^{1-2s}}\right]. \tag{10.9}$$
An elementary convergence result for this Euler product is given in the
next proposition. This result will be improved in the next section.
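As a concrete illustration of (10.7) and (10.8), the following sketch counts points by brute force. It is only a minimal example under stated assumptions: the helper names are hypothetical, and the sample curve $y^2 + y = x^3 - x^2 - 10x - 20$ with discriminant $-11^5$ is the curve that appears later as (11.13).

```python
def count_points(coeffs, p):
    """Projective points of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 over Z/pZ."""
    a1, a2, a3, a4, a6 = coeffs
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y + a3 * y - (x**3 + a2 * x * x + a4 * x + a6)) % p == 0
    )
    return affine + 1  # the point at infinity

def a_p(coeffs, p):
    """The integer a_p = p + 1 - #E_p(Z_p) of (10.7)."""
    return p + 1 - count_points(coeffs, p)

def local_factor_denominator(coeffs, p, discriminant):
    """Coefficients in u of the denominator of L_p(u) in (10.8), lowest degree first."""
    ap = a_p(coeffs, p)
    return [1, -ap, p] if discriminant % p != 0 else [1, -ap]

E = (0, -1, 1, -10, -20)   # (a1, a2, a3, a4, a6)
DELTA_E = -11**5
```

For instance, `local_factor_denominator(E, 2, DELTA_E)` describes the quadratic $1 + 2u + 2u^2$, while at the bad prime 11 the denominator is linear.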
Proposition 10.4. (a) For every prime p, $|a_p| \le p$.
(b) For $p \nmid \Delta$, the reciprocal roots of $1 - a_pu + pu^2$ are $\le p$ in absolute
value.
(c) The Euler product defining $L(s, E)$ converges for Re s > 2 and is
given there by an absolutely convergent Dirichlet series.
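Proof. The members of $E_p(\mathbb{Z}_p)$ include $\infty$ and cannot consist of more
than two other points for each x in $\mathbb{Z}_p$. Thus $1 \le \#E_p(\mathbb{Z}_p) \le 2p + 1$ and
$|a_p| \le p$. This proves (a). The reciprocal roots are $\frac12(a_p \pm \sqrt{a_p^2 - 4p})$.
When they are real, they are $\le |a_p|$ in absolute value; otherwise they have
absolute value $\sqrt{p} \le p$. Thus (a) implies (b). By Proposition
7.6, (b) implies (c).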
When $p \mid \Delta$, we can calculate $a_p$ exactly. According to §III.5, when
there is a singularity, there is only one, and it is classified as a cusp,
a split case of a node, or a nonsplit case of a node. Proposition 3.11
counted the nonsingular points in $E_p(\mathbb{Z}_p)$ in each case. Adding one for
the singularity, we arrive at the following formula for $a_p$ when $p \mid \Delta$:
$$a_p = \begin{cases} 0 & \text{for the case of a cusp} \\ +1 & \text{for a split case of a node} \\ -1 & \text{for a nonsplit case of a node.} \end{cases} \tag{10.10}$$
Although it is not necessary for the logical development, we shall give
some indication of how the local L factors $L_p(u)$ arise. An arithmetically
defined L function typically is part of a more naturally defined zeta
function, or a variant of such a function. In the case at hand, the zeta
function $Z(u, E_p)$ is a generating function that encodes how many points
are on the curve in each finite extension of $\mathbb{Z}_p$. If $\mathbb{F}_{p^n}$ denotes the field
of $p^n$ elements, the definition is
$$Z(u, E_p) = \exp\left(\sum_{n=1}^{\infty} \#E_p(\mathbb{F}_{p^n})\,\frac{u^n}{n}\right).$$
The definition is arranged so that additive formulas for $\#E_p(\mathbb{F}_{p^n})$ make
multiplicative contributions to $Z(u, E_p)$: Operationally one calculates
with the formula
$$u\,\frac{d}{du}\log Z(u, E_p) = \sum_{n=1}^{\infty} \#E_p(\mathbb{F}_{p^n})\,u^n.$$
For our elliptic curve, calculation of $Z(u, E_p)$ leads to a combination
of three polynomials, two appearing in the denominator and one in the
numerator:
$$Z(u, E_p) = \begin{cases} \dfrac{1 - a_pu + pu^2}{(1 - u)(1 - pu)} & \text{if } p \nmid \Delta \\[2ex] \dfrac{1 - a_pu}{(1 - u)(1 - pu)} & \text{if } p \mid \Delta. \end{cases}$$
These three polynomials have separate significance, and the product over
p of the (reciprocal of) any of them (after substitution of $u = p^{-s}$) is
in principle a meaningful object. However, $\prod_p \frac{1}{1 - p^{-s}}$ and $\prod_p \frac{1}{1 - p\cdot p^{-s}}$
are just $\zeta(s)$ and $\zeta(s - 1)$ and give no useful information about E. The
remaining polynomial leads to $L(s, E)$, which encodes a great deal of
information.
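For $p \nmid \Delta$, taking the logarithmic derivative of the displayed quotient and writing $1 - a_pu + pu^2 = (1 - \alpha u)(1 - \beta u)$ gives $\#E_p(\mathbb{F}_{p^n}) = p^n + 1 - \alpha^n - \beta^n$; for n = 2 this is $p^2 + 1 - (a_p^2 - 2p)$. The sketch below checks that prediction by brute force. It is an illustration under assumptions: the curve is taken in the short form $y^2 = x^3 + ax + b$, and the field of $p^2$ elements is modeled as $\mathbb{Z}_p[i]$ with $i^2 = -1$, which is valid only for $p \equiv 3 \bmod 4$. All function names are hypothetical.

```python
def count_fp(a, b, p):
    """Projective points of y^2 = x^3 + a*x + b over Z/pZ, p an odd prime."""
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    return 1 + sum(squares[(x**3 + a * x + b) % p] for x in range(p))

def count_fp2(a, b, p):
    """Projective points over the field with p^2 elements, modeled as Z/pZ[i]."""
    assert p % 4 == 3, "the model Z/pZ[i] is a field only for p = 3 mod 4"

    def mul(z, w):
        return ((z[0] * w[0] - z[1] * w[1]) % p, (z[0] * w[1] + z[1] * w[0]) % p)

    elements = [(u, v) for u in range(p) for v in range(p)]
    squares = {}
    for y in elements:
        s = mul(y, y)
        squares[s] = squares.get(s, 0) + 1
    total = 1  # the point at infinity
    for x in elements:
        x3 = mul(x, mul(x, x))
        rhs = ((x3[0] + a * x[0] + b) % p, (x3[1] + a * x[1]) % p)
        total += squares.get(rhs, 0)
    return total

def predicted_fp2(a, b, p):
    """Point count over p^2 elements predicted by the zeta function."""
    ap = p + 1 - count_fp(a, b, p)
    return p * p + 1 - (ap * ap - 2 * p)
```

For example, `count_fp2(-1, 1, 3)` and `predicted_fp2(-1, 1, 3)` agree, which is the n = 2 coefficient of the generating function in a small case.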
3. Hasse's Theorem
The goal of this section is to establish the following improvement of
Proposition 10.4a.
Theorem 10.5 (Hasse). Let E be an elliptic curve over Q with integer
coefficients. For each prime $p \nmid \Delta$, let $E_p$ be the reduction modulo p.
Then
$$|p + 1 - \#E_p(\mathbb{Z}_p)| < 2\sqrt{p}. \tag{10.11}$$
Corollary 10.6. The Euler product defining $L(s, E)$ converges for
Re s > 3/2 and is given there by an absolutely convergent Dirichlet series.
Proof of corollary. Let $p \nmid \Delta$. If $a_p = p + 1 - \#E_p(\mathbb{Z}_p)$, the
reciprocal roots r of $1 - a_pu + pu^2$ are $r = \frac12(a_p \pm \sqrt{a_p^2 - 4p})$. By
Theorem 10.5, the square root in this expression is imaginary. Hence
$|r|^2 = \frac14(a_p^2 + (4p - a_p^2)) = p$ and $|r| = \sqrt{p}$. The corollary therefore
follows from Proposition 7.6.
The proof of the theorem that we give is due to Manin. In order to
be able to normalize E, we first dispose of the cases p = 2 and p = 3.
For these values of p, we have $p < 2\sqrt{p}$. In these cases (10.11) therefore
follows from Proposition 10.4a.
For p > 3, we can make an admissible change of variables that does
not affect the condition $p \nmid \Delta$, does not change $\#E_p(\mathbb{Z}_p)$, and brings the
equation of E into the form
$$y^2 = x^3 + ax + b. \tag{10.12}$$
We may therefore assume from the outset that E is given by (10.12).
We shall work with the nonsingular cubic
$$Y^2 = \frac{X^3 + aX + b}{x^3 + ax + b} \tag{10.13}$$
defined over the field $\mathbb{Z}_p(x)$ of rational functions with coefficients in $\mathbb{Z}_p$.
Two solutions of (10.13) are
$$(X, Y) = (x, 1) \quad\text{and}\quad (X, Y) = \left(x^p,\ (x^3 + ax + b)^{\frac12(p-1)}\right).$$
If we specify $\infty$ as identity, the projective solutions of (10.13) over $\mathbb{Z}_p(x)$
form a group, by Theorem 3.8, and we form the group element
$$Z_n = \left(x^p,\ (x^3 + ax + b)^{\frac12(p-1)}\right) + n(x, 1) \tag{10.14}$$
for each integer n, $-\infty < n < \infty$. We define a corresponding sequence
of integers $d_n \ge 0$ as follows: If $Z_n = \infty$, then $d_n = 0$. Otherwise $Z_n$
is of the form $(X_n, Y_n)$; in this case we reduce $X_n$ to lowest terms in
$\mathbb{Z}_p(x)$, and we let $d_n$ be the larger of the degree of the numerator and
the degree of the denominator of $X_n$. Let us isolate the statements of
the two main steps.
Lemma 10.7. $d_{-1} - d_0 - 1 = \#E_p(\mathbb{Z}_p) - p - 1$.
Proof. By (10.14) with n = 0, $d_0 = p$. Let $N_p = \#E_p(\mathbb{Z}_p) - 1$ be
the number of affine solutions. We are to prove that
$$d_{-1} = N_p + 1. \tag{10.15}$$
When the addition formulas of (3.73) are recomputed to take into
account the form (10.13), we obtain
$$X_{-1} = \frac{(x^3 + ax + b)\left[(x^3 + ax + b)^{\frac12(p-1)} + 1\right]^2}{(x - x^p)^2} - (x^p + x). \tag{10.16}$$
Clearing fractions, we see that
$$X_{-1} = \frac{x^{2p+1} + R(x)}{(x - x^p)^2}$$
with $\deg(R) \le 2p$. Therefore the degree of the numerator is 1 greater
than the degree of the denominator, and the degree of the denominator is
$d_{-1} - 1$. We shall compute the degree of the denominator after reducing
$X_{-1}$ as much as possible.
The denominator splits over $\mathbb{Z}_p$ as $\prod_{j \in \mathbb{Z}_p}(x - j)^2$. The linear factors
$x - k$ of the numerator of the fraction in (10.16) are those such that the
numerator vanishes at k. In terms of the Legendre symbol (referring to
quadratic residues), we have
$$(j^3 + aj + b)^{\frac12(p-1)} = \left(\frac{j^3 + aj + b}{p}\right) \quad\text{in } \mathbb{Z}_p.$$
The factor $[1 + (j^3 + aj + b)^{\frac12(p-1)}]$ vanishes if and only if $\left(\frac{j^3 + aj + b}{p}\right) = -1$,
and in this case $x - j$ occurs as a factor of the numerator exactly
twice. The other factor $[j^3 + aj + b]$ vanishes if and only if $x - j$ is a
factor of $x^3 + ax + b$, and in this case $x - j$ occurs as a factor of the
numerator exactly once. The factors that remain in the denominator are
$(x - j)^2$ when $\left(\frac{j^3 + aj + b}{p}\right) = +1$ and $x - j$ when $\left(\frac{j^3 + aj + b}{p}\right) = 0$.
They correspond exactly to the affine $\mathbb{Z}_p$ solutions of $E_p$, the first ones
giving two values of y and the second ones giving just y = 0.
Hence the number of surviving factors in the denominator, which we
saw earlier is $d_{-1} - 1$, is exactly $N_p$. This equality is (10.15), and the
lemma is proved.
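The cancellation argument can be tested numerically with polynomial arithmetic over $\mathbb{Z}_p$. The sketch below is only an illustration: it assumes that $X_{-1}$ has the form $c\,(c^{(p-1)/2} + 1)^2/(x^p - x)^2 - (x^p + x)$ with $c = x^3 + ax + b$, as the addition law for (10.13) gives, and it uses hypothetical helper names. Polynomials are coefficient lists, lowest degree first.

```python
def trim(f):
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g, p):
    n = max(len(f), len(g))
    return trim([((f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)) % p
                 for i in range(n)])

def scale(f, c, p):
    return trim([c * x % p for x in f])

def mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def power(f, n, p):
    out = [1]
    for _ in range(n):
        out = mul(out, f, p)
    return out

def rem(f, g, p):
    """Remainder of f on division by the nonzero polynomial g."""
    f = f[:]
    inv = pow(g[-1], -1, p)
    while len(f) >= len(g):
        c = f[-1] * inv % p
        shift = len(f) - len(g)
        for i, y in enumerate(g):
            f[shift + i] = (f[shift + i] - c * y) % p
        trim(f)
    return f

def gcd(f, g, p):
    while g:
        f, g = g, rem(f, g, p)
    return f

def d_minus_one(a, b, p):
    """Degree of X_{-1} in lowest terms, computed from the assumed formula."""
    c = trim([b % p, a % p, 0, 1])                   # x^3 + a x + b
    x, x_p = [0, 1], [0] * p + [1]                   # x and x^p
    diff = add(x_p, scale(x, -1, p), p)              # x^p - x
    denom = mul(diff, diff, p)
    half = add(power(c, (p - 1) // 2, p), [1], p)    # c^((p-1)/2) + 1
    numer = add(mul(c, mul(half, half, p), p),
                scale(mul(add(x_p, x, p), denom, p), -1, p), p)
    common = gcd(numer, denom, p)
    return max(len(numer), len(denom)) - len(common)

def affine_points(a, b, p):
    return sum(1 for x in range(p) for y in range(p)
               if (y * y - x**3 - a * x - b) % p == 0)
```

For $y^2 = x^3 + x + 1$ and p = 5 there are 8 affine solutions, and `d_minus_one(1, 1, 5)` evaluates to 9, in agreement with (10.15).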
Lemma 10.8. For $-\infty < n < \infty$,
$$d_{n-1} + d_{n+1} = 2d_n + 2. \tag{10.17}$$
Proof. First suppose one of $Z_{n-1}, Z_n, Z_{n+1}$ is $\infty$. Then neither of
the other two is $\infty$, in view of (10.14). Say $Z_n = \infty$. Then $d_n = 0$ and
$$Z_{n+1} = (x, 1), \qquad Z_{n-1} = -(x, 1) = (x, -1).$$
Hence $d_{n+1} = d_{n-1} = 1$, and (10.17) is verified.
Say $Z_{n-1} = \infty$. Then $d_{n-1} = 0$ and
$$Z_n = (x, 1), \qquad Z_{n+1} = \left(\frac{x^4 - 2ax^2 - 8bx + a^2}{4(x^3 + ax + b)},\ Y_{n+1}\right)$$
by a recomputation of (3.74). Hence $d_n = 1$ and $d_{n+1} = 4$, and (10.17)
is verified. The remaining case, with $Z_{n+1} = \infty$, is similar.
Now suppose that none of $Z_{n-1}, Z_n, Z_{n+1}$ is $\infty$. Write $X_{n-1}, X_n,
X_{n+1}$ in lowest terms as
$$X_{n-1} = \frac{A}{B}, \qquad X_n = \frac{P}{Q}, \qquad X_{n+1} = \frac{C}{D}.$$
The addition formulas allow us to express $X_{n-1}$ and $X_{n+1}$ in terms of
$X_n$ as follows:
$$\begin{aligned}
X_{n-1} &= \frac{-(Qx + P)(Qx - P)^2 + (1 + Y_n)^2(x^3 + ax + b)Q^3}{Q(Qx - P)^2} \\
X_{n+1} &= \frac{-(Qx + P)(Qx - P)^2 + (1 - Y_n)^2(x^3 + ax + b)Q^3}{Q(Qx - P)^2}.
\end{aligned} \tag{10.18}$$
Addition of the two formulas (10.18) and use of (10.13) give
$$\begin{aligned}
\tfrac12(X_{n-1} + X_{n+1})
&= \frac{-(Qx + P)(Qx - P)^2 + (x^3 + ax + b)Q^3 + \left((\tfrac{P}{Q})^3 + a(\tfrac{P}{Q}) + b\right)Q^3}{Q(Qx - P)^2} \\
&= \frac{PQx^2 + P^2x + axQ^2 + 2bQ^2 + aPQ}{(Qx - P)^2}.
\end{aligned} \tag{10.19}$$
Multiplication of the two formulas (10.18) and some manipulations give
$$X_{n-1}X_{n+1} = \frac{(Px - aQ)^2 - 4bQ(Qx + P)}{(Qx - P)^2}. \tag{10.20}$$
Now
$$X_{n-1}X_{n+1} = \frac{AC}{BD}$$
also, and the claim is that
$$BD = (Qx - P)^2, \tag{10.21}$$
up to a $\mathbb{Z}_p$ factor. If S is the greatest common divisor of AC and BD,
(10.20) gives
$$AC = S[(Px - aQ)^2 - 4bQ(Qx + P)] \tag{10.22a}$$
$$BD = S(Qx - P)^2, \tag{10.22b}$$
while the numerator of (10.19) gives
$$AD + BC = 2S[PQx^2 + P^2x + axQ^2 + 2bQ^2 + aPQ]. \tag{10.22c}$$
Let F be a prime factor of S. Then (10.22b) shows that $F \mid BD$.
Without loss of generality, suppose $F \mid B$. Then $F \nmid A$ since GCD(A, B) = 1.
Since (10.22a) shows that $F \mid AC$, $F \mid C$. Thus $F \mid BC$. By (10.22c),
$F \mid (AD + BC)$. So $F \mid AD$. Since $F \nmid A$, $F \mid D$. Then $F \mid C$ and
$F \mid D$, in contradiction to GCD(C, D) = 1. We conclude that S is a
scalar, and (10.21) is proved.
The integers in (10.17) are
$$\begin{aligned}
d_{n-1} &= \max\{\deg A, \deg B\} \\
d_{n+1} &= \max\{\deg C, \deg D\} \\
d_n &= \max\{\deg P, \deg Q\}.
\end{aligned}$$
We divide the proof of (10.17) into the following cases:
(a) $d_{n-1} = \deg A$ and $d_{n+1} = \deg C$
(b) $d_{n-1} = \deg B$ and $d_{n+1} = \deg D$
(c) $d_{n-1} = \deg A$ and $d_{n+1} = \deg D$, but not Case (a) or (b)
(d) $d_{n-1} = \deg B$ and $d_{n+1} = \deg C$, but not Case (a) or (b).
Case (a). By (10.21) and (10.22), we have
$$d_{n-1} + d_{n+1} = \deg(AC) = \deg[(Px - aQ)^2 - 4bQ(Qx + P)].$$
If $\deg P \ge \deg Q$, then the unique term on the right of highest degree is
$P^2x^2$; so the right side is $2d_n + 2$, and (10.17) follows. If $\deg P < \deg Q$,
then (10.21) gives $\deg BD = 2\deg Q + 2$. Then
$$\begin{aligned}
\deg(AC) &\le \max\{2\deg P + 2,\ 2\deg Q,\ 2\deg Q + 1,\ \deg P + \deg Q\} \\
&\le 2\deg Q + 1 < \deg(BD).
\end{aligned}$$
But this inequality contradicts the hypothesis of Case (a). Hence $\deg P
< \deg Q$ is impossible.
Case (b). By (10.21), we have
$$d_{n-1} + d_{n+1} = \deg(BD) = \deg[(Qx - P)^2].$$
If $\deg Q \ge \deg P$, then the unique term on the right of highest degree is
$Q^2x^2$; so the right side is $2d_n + 2$, and (10.17) follows. If $\deg Q < \deg P$,
then (10.22a) gives
$$\deg(AC) = \deg(P^2x^2) > \deg(P^2) \ge \deg(BD).$$
But this inequality contradicts the hypothesis of Case (b). Hence $\deg Q
< \deg P$ is impossible.
Case (c). Since we are not in Case (a) or (b), $\deg A > \deg B$ and
$\deg D > \deg C$. Thus
$$\deg AD > \deg AC, \qquad \deg AD > \deg BD, \qquad \deg AD > \deg BC. \tag{10.23}$$
From the third of these and from (10.22c), we have
$$\begin{aligned}
\deg(AD) &= \deg(AD + BC) \\
&= \deg[PQx^2 + P^2x + axQ^2 + 2bQ^2 + aPQ].
\end{aligned} \tag{10.24}$$
If $\deg P \ge \deg Q$, (10.24) is
$$\le \deg(P^2x^2) = \deg(AC),$$
in contradiction to the first inequality of (10.23). If $\deg P < \deg Q$, then
(10.24) is
$$\le \deg(Q^2x^2) = \deg(BD),$$
in contradiction to the second inequality of (10.23). Thus Case (c) is
impossible.
Case (d). This case is symmetric with Case (c) and similarly is
impossible.
This completes the proof of the lemma.
It is an easy matter to combine the two lemmas to prove the theorem.
Induction forwards and backwards by means of Lemma 10.8 gives the
formula
$$d_n = n^2 - (d_{-1} - d_0 - 1)n + d_0. \tag{10.25}$$
(The base cases for the induction are n = 0 and n = -1, for which
(10.25) is trivial.) Substitution from Lemma 10.7 and use of $d_0 = p$
gives
$$d_n = n^2 + a_pn + p,$$
where $a_p = p + 1 - \#E_p(\mathbb{Z}_p)$ as in (10.7). The $d_n$'s are degrees of
polynomials and are therefore $\ge 0$. Moreover, two consecutive $d_n$'s
cannot both be 0. Since $a_p$ is an integer, it follows that $r^2 + a_pr + p \ge 0$
for all real r. The discriminant of this polynomial must be $\le 0$; since
$a_p^2 = 4p$ is impossible for an integer $a_p$ and a prime p, we obtain
$|a_p| < 2\sqrt{p}$. This completes the proof of Theorem 10.5.
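The final step can be replayed numerically. The sketch below is an illustration with hypothetical function names: it computes $a_p$ by counting points on $y^2 = x^3 + ax + b$, generates the $d_n$ from $d_0 = p$ and $d_{-1} = p + 1 - a_p$ by the recurrence (10.17), and compares the result with the closed form $n^2 + a_pn + p$.

```python
def a_p(a, b, p):
    """a_p = p + 1 - #E_p(Z_p) for y^2 = x^3 + a*x + b, p an odd prime."""
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    affine = sum(squares[(x**3 + a * x + b) % p] for x in range(p))
    return p + 1 - (affine + 1)

def degrees(a, b, p, span=10):
    """Return a_p and the values d_n for -span <= n <= span obtained from (10.17)."""
    ap = a_p(a, b, p)
    d = {0: p, -1: p + 1 - ap}
    for n in range(0, span):           # forwards: d_{n+1} = 2 d_n + 2 - d_{n-1}
        d[n + 1] = 2 * d[n] + 2 - d[n - 1]
    for n in range(-1, -span, -1):     # backwards: d_{n-1} = 2 d_n + 2 - d_{n+1}
        d[n - 1] = 2 * d[n] + 2 - d[n + 1]
    return ap, d

def hasse_holds(a, b, p):
    """Check the closed form, nonnegativity of d_n, and a_p^2 < 4p."""
    ap, d = degrees(a, b, p)
    closed_form = all(d[n] == n * n + ap * n + p for n in d)
    nonnegative = all(value >= 0 for value in d.values())
    return closed_form and nonnegative and ap * ap < 4 * p
```

The sample curves in the checks are arbitrary choices that are nonsingular modulo the chosen primes.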
CHAPTER XI
EICHLER-SHIMURA THEORY
1. Overview
We return to the theme announced in the preface and discussed a
little in §§VII.5 and VIII.1. We saw in Chapter VII that the Riemann
zeta function $\zeta(s)$ and the Dirichlet L functions $L(s, \chi)$, which encode
subtle arithmetic information about primes, can be obtained as Mellin
transforms of certain $\theta$ functions that have transformation laws akin
to those of modular forms. In other words, $\zeta(s)$ and $L(s, \chi)$ arise as
automorphic L functions. Consequently they have analytic
continuations and functional equations, and their analytic properties are more
manageable.
Our objective in this chapter and the next is to address the
corresponding problem for the L function $L(s, E)$ of an elliptic curve over
Q. We would like $L(s, E)$ to have an analytic continuation and satisfy a
functional equation, and the likely way to obtain these conclusions is to
identify $L(s, E)$ as an automorphic L function. A special case of a theory
of Eichler and Shimura gives a clue to the nature of the automorphic
objects to use. Their theory takes certain cusp forms f in $S_2(\Gamma_0(N))$
and gives a geometric construction of elliptic curves over Q such that
$L(s, E) = L(s, f)$.
We shall discuss this theory in the present chapter. The Taniyama-
Weil Conjecture anticipates that every elliptic curve over Q is obtained
in this way from $S_2(\Gamma_0(N))$ if the Eichler-Shimura map is followed by a
relatively simple kind of map called an "isogeny." This conjecture will
be the subject of Chapter XII.
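By way of introduction we begin with a construction that passes from
members f of $S_2(\Gamma_0(N))$ to homomorphisms from $\Gamma_0(N)$ into the
additive complex numbers. For f in $S_2(\Gamma_0(N))$, $f(\zeta)\,d\zeta$ is invariant under
$\Gamma_0(N)$. Fix $\tau_0$ in $\mathcal{H}$ and define
$$F(\tau) = \int_{\tau_0}^{\tau} f(\zeta)\,d\zeta. \tag{11.1}$$
Since f is analytic, $F(\tau)$ is independent of the path of the integral. For
$\gamma \in \Gamma_0(N)$, the invariance of $f(\zeta)\,d\zeta$ gives
$$\begin{aligned}
F(\gamma(\tau)) &= \int_{\tau_0}^{\gamma(\tau)} f(\zeta)\,d\zeta = \int_{\gamma(\tau_0)}^{\gamma(\tau)} f(\zeta)\,d\zeta + \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta \\
&= \int_{\tau_0}^{\tau} f(\zeta)\,d\zeta + \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta = F(\tau) + \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta.
\end{aligned} \tag{11.2}$$
Put
$$\Phi_f(\gamma) = \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta.$$
We claim that $\Phi_f(\gamma)$ is independent of $\tau_0$. In fact, if we define $F_1(\tau)$ by
(11.1) with $\tau_1$ in place of $\tau_0$, then (11.2) gives
$$F_1(\gamma(\tau)) = F_1(\tau) + \int_{\tau_1}^{\gamma(\tau_1)} f(\zeta)\,d\zeta. \tag{11.3}$$
Since $F_1 = F + \int_{\tau_1}^{\tau_0} f(\zeta)\,d\zeta$, comparison of (11.2) and (11.3) gives
$$\int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta = \int_{\tau_1}^{\gamma(\tau_1)} f(\zeta)\,d\zeta,$$
as asserted.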
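Proposition 11.1. For f in $S_2(\Gamma_0(N))$, $\Phi_f$ is a homomorphism of
$\Gamma_0(N)$ into the additive complex numbers. If $\gamma_0 \in \Gamma_0(N)$ is elliptic
(i.e., $|\operatorname{Tr}\gamma_0| < 2$) or parabolic (i.e., $|\operatorname{Tr}\gamma_0| = 2$), then $\Phi_f(\gamma_0) = 0$.
Proof. The invariance of $f(\zeta)\,d\zeta$ under $\Gamma_0(N)$ gives
$$\begin{aligned}
\Phi_f(\gamma_1\gamma_2) &= \int_{\tau_0}^{\gamma_1\gamma_2\tau_0} = \int_{\tau_0}^{\gamma_1\tau_0} + \int_{\gamma_1\tau_0}^{\gamma_1\gamma_2\tau_0} \\
&= \int_{\tau_0}^{\gamma_1\tau_0} + \int_{\tau_0}^{\gamma_2\tau_0} = \Phi_f(\gamma_1) + \Phi_f(\gamma_2),
\end{aligned}$$
and $\Phi_f$ is a homomorphism. If $\gamma_0 \in \Gamma_0(N)$ is elliptic, it has finite order
and hence so does its image in C. Thus $\Phi_f(\gamma_0) = 0$.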
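Suppose $\gamma_0 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is parabolic in $\Gamma_0(N)$. Since $|\operatorname{Tr}\gamma_0| = 2$ and
$\det\gamma_0 = 1$, we readily find that the equation $\dfrac{az + b}{cz + d} = z$ has a double
root, namely $z = \frac{a - d}{2c}$. Thus $\gamma_0(\frac{a - d}{2c}) = \frac{a - d}{2c}$. Choose $\beta$ in SL(2, Z) with
$\frac{a - d}{2c} = \beta^{-1}(\infty)$. Then $\gamma_1 = \beta\gamma_0\beta^{-1}$ has
$$\gamma_1(\infty) = \beta\gamma_0\beta^{-1}(\infty) = \beta\gamma_0\left(\tfrac{a - d}{2c}\right) = \beta\left(\tfrac{a - d}{2c}\right) = \infty.$$
Thus $\gamma_1 = \pm\begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix}$ for some l in Z. By (9.17), the integer l is a multiple
of the width h of the cusp $\beta^{-1}(\infty) = \frac{a - d}{2c}$. The Fourier expansion
$$(f \circ [\beta^{-1}]_2)(\tau) = \sum_{n=1}^{\infty} c_ne^{2\pi in\tau/h}$$
has zero constant term since f is a cusp form, and the analog for Chapter
IX of (8.7b) therefore gives
$$\int_{\tau_1}^{\tau_1 + h} (f \circ [\beta^{-1}]_2)(\zeta)\,d\zeta = 0$$
for any $\tau_1$. Hence also
$$\int_{\tau_1}^{\tau_1 + l} (f \circ [\beta^{-1}]_2)(\zeta)\,d\zeta = 0. \tag{11.4}$$
Now the change of variables $\zeta' = \beta^{-1}(\zeta)$ gives
$$\begin{aligned}
\Phi_f(\gamma_0) &= \int_{\tau_0}^{\gamma_0\tau_0} f(\zeta')\,d\zeta' = \int_{\beta\tau_0}^{\beta\gamma_0\tau_0} f(\beta^{-1}\zeta)\,\delta(\beta^{-1}, \zeta)^{-2}\,d\zeta \\
&= \int_{\beta\tau_0}^{\beta\tau_0 + l} (f \circ [\beta^{-1}]_2)(\zeta)\,d\zeta,
\end{aligned}$$
and this is 0 by (11.4). This completes the proof.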
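Let us consider the relatively simple case that N = 11. According to
Theorem 9.10, $S_2(\Gamma_0(11))$ is one-dimensional. A nonzero element is
$$\begin{aligned}
f(\tau) &= \eta(\tau)^2\eta(11\tau)^2 = \sum_{n=1}^{\infty} c_nq^n \\
&= q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 \\
&\quad - 2q^9 - 2q^{10} + q^{11} - 2q^{12} + 4q^{13} + 4q^{14} \\
&\quad - q^{15} - 4q^{16} - 2q^{17} + 4q^{18} + 2q^{20} + \cdots,
\end{aligned} \tag{11.5}$$
according to Example 5 in §IX.3. The group $\Gamma_0(11)$ turns out to be
generated by
$$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad V_4 = \begin{pmatrix} 8 & 1 \\ -33 & -4 \end{pmatrix}, \qquad V_6 = \begin{pmatrix} 9 & 1 \\ -55 & -6 \end{pmatrix}.$$
Since T is parabolic, Proposition 11.1 shows that the image of $\Phi_f$ is
generated by $\Phi_f(V_4)$ and $\Phi_f(V_6)$. We have
$$\begin{aligned}
\Phi_f(\gamma) &= \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta \\
&= \frac{1}{2\pi i}\int_{e^{2\pi i\tau_0}}^{e^{2\pi i\gamma(\tau_0)}} \sum_{n=1}^{\infty} c_nq^{n-1}\,dq = \frac{1}{2\pi i}\left(H(e^{2\pi i\gamma(\tau_0)}) - H(e^{2\pi i\tau_0})\right),
\end{aligned}$$
where
$$H(q) = \sum_{n=1}^{\infty} \frac{c_n}{n}\,q^n.$$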
Judicious choice of $\tau_0$ cuts down considerably on the number of terms
needed. For $\tau_0 = i$, computation with 10000 terms gives 11 digits
accuracy. For $\tau_0 = -.125 + .025i$ with 600 terms, we obtain
$$\begin{aligned}
H(e^{2\pi i\tau_0}) &= .2628106125215867 + .5230455366717368i \\
H(e^{2\pi iV_4(\tau_0)}) &= .897415264661364 - .9357710802667581i \\
H(e^{2\pi iV_6(\tau_0)}) &= 1.532019916801141 + .523045536671738i
\end{aligned}$$
with 14 digits accuracy. (With 300 terms, we would have had 10 digits
accuracy.) Thus
$$\begin{aligned}
\omega_1 &= \Phi_f(V_4) = -.232177875650357 - .101000467297158i \\
\omega_2 &= \Phi_f(V_6) = -.202000934594317i.
\end{aligned} \tag{11.6}$$
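The computation of the two periods can be reproduced along the following lines. This is a sketch under assumptions rather than the author's program: the base point is taken to be $\tau_0 = -.125 + .025i$, the two generators are taken to be the matrices with rows (8, 1), (-33, -4) and (9, 1), (-55, -6), the coefficients $c_n$ are obtained by expanding the product $q\prod(1 - q^n)^2(1 - q^{11n})^2$, and all function names are hypothetical.

```python
import cmath

def eta_product_coeffs(n_terms, level=11):
    """Coefficients c_0..c_{n_terms} of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(level*n))^2."""
    size = n_terms + 1
    series = [0] * size
    series[1] = 1  # the leading factor q
    for n in range(1, size):
        for step in (n, n, level * n, level * n):
            if step >= size:
                continue
            # Multiply in place by (1 - q^step), highest degree first.
            for k in range(size - 1, step - 1, -1):
                series[k] -= series[k - step]
    return series

def moebius(matrix, tau):
    (a, b), (c, d) = matrix
    return (a * tau + b) / (c * tau + d)

def H(q, coeffs):
    """Partial sum of sum_{n>=1} c_n q^n / n."""
    total, power = 0j, 1 + 0j
    for n in range(1, len(coeffs)):
        power *= q
        total += coeffs[n] * power / n
    return total

def period(matrix, tau0, coeffs):
    """(H(q_1) - H(q_0)) / (2 pi i) with q_0, q_1 attached to tau0 and matrix(tau0)."""
    q0 = cmath.exp(2j * cmath.pi * tau0)
    q1 = cmath.exp(2j * cmath.pi * moebius(matrix, tau0))
    return (H(q1, coeffs) - H(q0, coeffs)) / (2j * cmath.pi)

COEFFS = eta_product_coeffs(600)
TAU0 = complex(-0.125, 0.025)       # assumed base point
V4 = ((8, 1), (-33, -4))            # assumed generator
V6 = ((9, 1), (-55, -6))            # assumed generator
omega1 = period(V4, TAU0, COEFFS)
omega2 = period(V6, TAU0, COEFFS)
```

The base point is chosen so that $\tau_0$, $V_4(\tau_0)$, and $V_6(\tau_0)$ all have imaginary part large enough for 600 terms to suffice; other base points give the same periods in principle but converge more slowly.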
The complex numbers $\omega_1$ and $\omega_2$ are independent over R and
therefore define a lattice $\Lambda$ in C. For any $\gamma$ in $\Gamma_0(N)$, (11.2) shows that
$F(\gamma(\tau)) - F(\tau)$ is in $\Lambda$, hence projects to the 0 element of $\mathbb{C}/\Lambda$. Thus
the map F of (11.1), which initially carries $\mathcal{H}$ into C (and hence also
$\mathbb{C}/\Lambda$), descends to a holomorphic map of $\mathcal{R} = \Gamma_0(11)\backslash\mathcal{H}$ into $\mathbb{C}/\Lambda$. In
turn, Theorem 6.14 exhibits a biholomorphic map of $\mathbb{C}/\Lambda$ onto E(C)
for an elliptic curve E. If $X_0(11)$ denotes the compactification of $\mathcal{R}$
discussed briefly in the proof of Theorem 9.9 and called $\mathcal{R}^*$ there, then
F actually yields a holomorphic map of the compact Riemann surface
$X_0(11)$ onto $\mathbb{C}/\Lambda$ and then onto E(C).
Two miracles occur in this construction. The first miracle is that
$X_0(11)$, E, and the mapping can be defined compatibly over Q. We
have not yet defined rational maps and rational projective curves (other
than plane curves), and we shall need to make such definitions in order
to make our assertion precise. We return to a discussion of rationality
shortly.
The second miracle is that the L function of E matches the L function
of the cusp form f in (11.5). This assertion relies on introducing new
interpretations of Hecke operators and their eigenvalues.
Let us give a numerical indication of why E can be defined over Q.
Substituting from (11.6) into (6.50), we find that
$$\begin{aligned}
g_2(\Lambda) &= 64419.8788704867 - .0000000003i \\
g_3(\Lambda) &= -5699399.99557174 + .00000002i.
\end{aligned}$$
Then (8.2) and (8.3) give
$$j(\Lambda) = -757.672637860052 + .000000000008i. \tag{11.7a}$$
The actual value is
$$j(\Lambda) = -\frac{2^{12}\cdot 31^3}{11^5} = -757.672637860057. \tag{11.7b}$$
The significance of $j(\Lambda)$ for the present context is explained in
Proposition 3.7. With the value of $j(\Lambda)$ as in (11.7b), the proposition shows
that E can be defined over Q.
Let us classify the inequivalent Q structures on E and see that there
are infinitely many of them. It will turn out that any two of them are
equivalent over a quadratic extension of Q. If E ultimately is given
in global minimal Weierstrass form, then the integer tuple $(c_4, c_6, \Delta)$
determines the Q structure of E, and the equation of E is unique up to
the kind of admissible change of variables described in Theorem 10.3.
Since
$$j = \frac{c_4^3}{\Delta} = 1728\,\frac{c_4^3}{c_4^3 - c_6^2},$$
we readily find that the only possibilities are
$$\begin{aligned}
c_4 &= 2^4\cdot 31m^2 \\
c_6 &= 2^3\cdot 2501m^3 \\
\Delta &= -11^5m^6
\end{aligned} \tag{11.8}$$
for an integer $m \ne 0$. (With the convention $c_6 = -b_2^3 + 36b_2b_4 - 216b_6$,
this is the sign of $c_6$ that matches the curves written below.) We shall see
that there are some restrictions on m. The curve
$$y^2 = x^3 - 4mx^2 - 160m^2x - 1264m^3 \tag{11.9}$$
has
$$c_4 = 2^4(2^4\cdot 31m^2), \qquad c_6 = 2^6(2^3\cdot 2501m^3), \qquad \Delta = 2^{12}(-11^5m^6)$$
and thus must arise from E by an admissible change of variables with
u = 2. We claim that m has no odd prime square factor $p^2$. In fact,
otherwise we could use u = 1/p in (11.9) and change variables to find
that E was not minimal at the prime p. Similarly 16 does not divide m,
since otherwise we could use u = 1/4 to find that E was not minimal at
the prime 2.
If $m \equiv 1 \bmod 4$, then (11.9) is equivalent with the curve
$$y^2 + y = x^3 - mx^2 - 10m^2x - \tfrac14(79m^3 + 1), \tag{11.10a}$$
which has $(c_4, c_6, \Delta)$ given by (11.8) and is therefore global minimal, by
Lemma 10.1. If $m \equiv 2 \bmod 4$, then similarly (11.9) is equivalent with
$$y^2 = x^3 - mx^2 - 10m^2x - 158(\tfrac12m)^3, \tag{11.10b}$$
which has $(c_4, c_6, \Delta)$ given by (11.8) and hence is global minimal. If
$m \equiv 3 \bmod 4$, then we can attempt a reduction at the prime 2 by
rewriting (11.9) as
$$(y + ax + b)^2 = (x + c)^3 - 4m(x + c)^2 - 160m^2(x + c) - 1264m^3$$
or
$$\begin{aligned}
y^2 + 2axy + 2by = x^3 &+ (3c - 4m - a^2)x^2 \\
&+ (3c^2 - 8mc - 2ab - 160m^2)x \\
&+ (c^3 - 4mc^2 - b^2 - 160m^2c - 1264m^3).
\end{aligned}$$
To have a reduction, we need $2^j$ to divide the usual coefficient $a_j$ of this
equation. From $a_3$ we obtain $4 \mid b$. From $a_4$ we obtain $2 \mid c$. Then
$a_2$ gives $2 \mid a$, and $a_4$ gives $4 \mid c$. Hence $a_6$ gives $64 \mid (b^2 + 1264m^3)$.
Writing $b = 4b'$, we see that $4 \mid (b'^2 + 79m^3)$. But this is impossible since
$79m^3 \equiv 1 \bmod 4$. Thus (11.9) is already global minimal if $m \equiv 3 \bmod 4$.
If $4 \mid m$, write $m = 4m'$ with $4 \nmid m'$. Then we can reduce (11.9) to
$$y^2 = x^3 - 4m'x^2 - 160m'^2x - 1264m'^3, \tag{11.11}$$
and we must have arrived at a global minimal form. Since we have
established that $4 \nmid m'$, the preceding paragraph shows that $m' \equiv 3 \bmod
4$.
Putting everything together, we see that we get a Q structure whose
global minimal equation satisfies (11.8) exactly when m has no odd
square factor and m is congruent modulo 16 to one of
$$1, 2, 5, 6, 9, 10, 12, 13, 14. \tag{11.12}$$
The simplest Q structure comes from m = 1, and the global minimal
equation can be taken as
$$E:\quad y^2 + y = x^3 - x^2 - 10x - 20. \tag{11.13}$$
This is the equation of interest. The Q structures corresponding to other
values of m are called twists of E.
We have obtained a mapping of $X_0(11)$ onto E(C). There are other
elliptic curves that $X_0(11)$ maps onto. Define
$$\begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix}, \tag{11.14}$$
and let $\Lambda' = \mathbb{Z}\omega_1' \oplus \mathbb{Z}\omega_2'$. Calculating as with $\Lambda$, we find that
$$j(\Lambda') = -372.363636363639 - .000000000004i.$$
The actual value is
$$j(\Lambda') = -\frac{16^3}{11} = -372.363636363636.$$
The result is an elliptic curve E' that can be defined over Q. The
simplest Q structure corresponds to the equation
$$E':\quad y^2 + y = x^3 - x^2 \tag{11.15}$$
with $(c_4, c_6, \Delta) = (16, -152, -11)$. The same analysis as with E yields
twists for the same m's as in (11.12), but only (11.15) will be of interest
to us.
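The invariants quoted for E and E' can be checked with exact arithmetic. The sketch below assumes the standard formulas for $b_2, b_4, b_6, b_8, c_4, c_6, \Delta$ attached to a Weierstrass equation with coefficients $(a_1, a_2, a_3, a_4, a_6)$; the function name is hypothetical.

```python
from fractions import Fraction

def invariants(a1, a2, a3, a4, a6):
    """Return (c4, c6, discriminant, j) for y^2 + a1xy + a3y = x^3 + a2x^2 + a4x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, c6, disc, Fraction(c4**3, disc)

E = invariants(0, -1, 1, -10, -20)        # curve (11.13)
E_PRIME = invariants(0, -1, 1, 0, 0)      # curve (11.15)
# The twist (11.10a) with m = 5, where (79 m^3 + 1)/4 = 2469.
TWIST_5 = invariants(0, -5, 1, -250, -2469)
```

The j-invariants come out as exact fractions, so they can be compared directly with (11.7b) and with the value $-16^3/11$.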
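The inclusion $\Lambda' \subseteq \Lambda$ yields a holomorphic homomorphism
$$E'(\mathbb{C}) \cong \mathbb{C}/\Lambda' \longrightarrow \mathbb{C}/\Lambda \cong E(\mathbb{C}). \tag{11.16a}$$
Such a map is called an isogeny. (Cf. §VI.4.) This particular isogeny
can be defined over Q. Namely if (x, y) satisfies (11.15) and if X and Y
are defined by
$$\begin{aligned}
X &= x + \frac{1}{x^2} + \frac{2}{x - 1} + \frac{1}{(x - 1)^2} \\
Y &= y - (2y + 1)\left(\frac{1}{x^3} + \frac{1}{(x - 1)^3} + \frac{1}{(x - 1)^2}\right),
\end{aligned}$$
then (X, Y) satisfies (11.13).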
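An explicit rational isogeny of this kind can be spot-checked modulo small primes. The sketch below assumes the Vélu-type formulas $X = x + x^{-2} + 2(x - 1)^{-1} + (x - 1)^{-2}$ and $Y = y - (2y + 1)(x^{-3} + (x - 1)^{-3} + (x - 1)^{-2})$ attached to the subgroup generated by (0, 0) on $y^2 + y = x^3 - x^2$; the formulas and the function names are assumptions of the illustration.

```python
def isogeny_image(x, y, p):
    """Image of (x, y) under the assumed formulas, with all arithmetic modulo p."""
    def inv(z):
        return pow(z, -1, p)
    X = (x + inv(x * x) + 2 * inv(x - 1) + inv((x - 1) ** 2)) % p
    Y = (y - (2 * y + 1) * (inv(x**3) + inv((x - 1) ** 3) + inv((x - 1) ** 2))) % p
    return X, Y

def on_source(x, y, p):
    """Membership in y^2 + y = x^3 - x^2 modulo p."""
    return (y * y + y - x**3 + x * x) % p == 0

def on_target(x, y, p):
    """Membership in y^2 + y = x^3 - x^2 - 10x - 20 modulo p."""
    return (y * y + y - x**3 + x * x + 10 * x + 20) % p == 0

def check(p):
    """Map every source point with x not 0 or 1 and test the target equation."""
    points = [(x, y) for x in range(2, p) for y in range(p) if on_source(x, y, p)]
    return points, all(on_target(*isogeny_image(x, y, p), p) for x, y in points)
```

Points with x = 0 or x = 1 are excluded because they lie in the kernel and map to the point at infinity. A check of this kind is evidence, not a proof; the identity itself is a polynomial computation.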
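To any isogeny corresponds a dual isogeny in the reverse direction.
Namely (11.14) yields
$$\begin{pmatrix} 5\omega_2 \\ 5\omega_1 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix}
= \begin{pmatrix} 5 & -3 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} \omega_2 \\ \omega_1 \end{pmatrix} = \begin{pmatrix} 5 & -3 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \omega_2' \\ \omega_1' \end{pmatrix}.$$
Since $\mathbb{C}/(5\Lambda) \cong \mathbb{C}/\Lambda$, we obtain a holomorphic homomorphism
$$E(\mathbb{C}) \cong \mathbb{C}/(5\Lambda) \longrightarrow \mathbb{C}/\Lambda' \cong E'(\mathbb{C}). \tag{11.16b}$$
Since (11.16a) is defined over Q, so is (11.16b). But we shall not write
down explicit formulas. In any event, composition of $X_0(11) \to E(\mathbb{C})$
with (11.16b) yields a rational map of $X_0(11)$ onto E'.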
Similarly if we define
«)-(!!)(:)■
and let $\Lambda'' = \mathbb{Z}\omega_1'' \oplus \mathbb{Z}\omega_2''$, then the actual value of j is
$$j(\Lambda'') = -\frac{(375376)^3}{11},$$
and we are led to the equation
$$E'':\quad y^2 + y = x^3 - x^2 - 7820x - 263580, \tag{11.18}$$
as well as various twists. The inclusion $\Lambda'' \subseteq \Lambda$ yields an isogeny
$E''(\mathbb{C}) \to E(\mathbb{C})$, and this isogeny is defined over Q. The dual isogeny
yields by composition a rational map of $X_0(N)$ onto E''.
Thus $X_0(11)$ maps rationally onto the three curves E, E', and E'',
all as a result of the one cusp form f in (11.5). Elliptic curves that are
isogenous over Q have the same L function (Theorem 11.67). By the
Eichler-Shimura theory the common L function for these three curves is
the same as the L function of f. The maps of $X_0(N)$ to E, E', and E''
are called modular parametrizations of these curves. These maps
are determined by giving their x and y coordinates, which are meromorphic
functions on $X_0(N)$ with some additional property to reflect the
rationality. The rationality is important: f gives us holomorphic maps
of $X_0(11)$ onto the twists of E, E', and E''; but we do not get a match
of L functions in these cases.
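The agreement of L functions can be observed numerically at the good primes. The following sketch is an illustration with hypothetical names: it expands $q\prod(1 - q^n)^2(1 - q^{11n})^2$ to obtain the coefficients $c_p$ and compares them with $a_p = p + 1 - \#E_p(\mathbb{Z}_p)$ for the three curves, counting points by brute force.

```python
def eta_product_coeffs(n_terms, level=11):
    """Coefficients c_0..c_{n_terms} of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(level*n))^2."""
    size = n_terms + 1
    series = [0] * size
    series[1] = 1
    for n in range(1, size):
        for step in (n, n, level * n, level * n):
            if step >= size:
                continue
            for k in range(size - 1, step - 1, -1):
                series[k] -= series[k - step]
    return series

def a_p(coeffs, p):
    """p + 1 minus the number of projective points modulo p."""
    a1, a2, a3, a4, a6 = coeffs
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y + a3 * y - (x**3 + a2 * x * x + a4 * x + a6)) % p == 0
    )
    return p + 1 - (affine + 1)

CURVES = {
    "E": (0, -1, 1, -10, -20),
    "E'": (0, -1, 1, 0, 0),
    "E''": (0, -1, 1, -7820, -263580),
}
PRIMES = [2, 3, 5, 7, 13, 17, 19, 23, 29]   # primes below 30 other than 11
c = eta_product_coeffs(30)
form_values = [c[p] for p in PRIMES]
curve_values = {name: [a_p(co, p) for p in PRIMES] for name, co in CURVES.items()}
```

Each list in `curve_values` can be compared with `form_values`; a match at finitely many primes is of course only consistent with the theorem, not a proof of it.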
So far, we have discussed in detail $X_0(11)$ only. The special
feature of this case is that $X_0(11)$ has genus one and that
correspondingly $S_2(\Gamma_0(11))$ has dimension one. For general $\Gamma_0(N)$, Proposition
11.1 says that any cusp form f in $S_2(\Gamma_0(N))$ yields a homomorphism
$\Phi_f : \Gamma_0(N) \to \mathbb{C}$ that annihilates the elliptic and parabolic elements,
but the situation is of interest only when the image of $\Phi_f$ is a lattice.
It turns out that we do get a lattice when f is a newform and all the
eigenvalues of the Hecke operators are integers. Then the resulting
elliptic curve E is defined over Q, and so are $X_0(N)$ and the map from
$X_0(N)$ to E. Moreover the L functions of f and E match.
The approach to these results is a little indirect. The "Jacobian
variety" $J(X_0(N))$ of $X_0(N)$ is a certain complex torus with the universal
mapping property that any map of $X_0(N)$ to an elliptic curve must
factor through $J(X_0(N))$. Because of the universal property, the
construction of the map from $X_0(N)$ to E amounts to obtaining E as a
quotient of $J(X_0(N))$. The construction does not yield an equation for
E directly. Instead, such an equation has to be worked out from side
information; we shall take up this matter in §XII.3.
A certain amount of this chapter is about background material.
Sections 2-4 deal with compact Riemann surfaces in general and with $X_0(N)$
in particular. Some of the results that we quote about compact Riemann
surfaces will not be used in the form as stated but are helpful as
motivation for properties of projective curves later in the chapter. Section
5 applies some of the more elementary properties of compact Riemann
surfaces to derive properties of Hecke operators on $S_2(\Gamma_0(N))$. We shall
see that $S_2(\Gamma_0(N))$ has a basis in which the Hecke operators are given by
integer matrices. Consequently the eigenvalues of the Hecke operators
are algebraic integers. Sections 6-8 deal with general projective curves
and show how $X_0(N)$ can be given a canonical Q structure. Sections 9
and 10 discuss abstract elliptic curves, isogenies, abelian varieties, and
the algebraic Jacobian variety. The main results of Eichler-Shimura
theory are in §§11-12.
2. Riemann Surface $X_0(N)$
In this section we shall give a more careful construction of the
compactification $X_0(N)$ of $\Gamma_0(N)\backslash\mathcal{H}$ than we gave in §IX.5. In §§3-4 we
shall go on to discuss some generalities about Riemann surfaces. We
omit many proofs, giving references in the section of Notes at the end.
Let $C = \mathbb{Q} \cup \{\infty\}$, the set whose equivalence classes form the cusps of
§IX.2. Put $\mathcal{H}^* = \mathcal{H} \cup C$, and topologize $\mathcal{H}^*$ as follows: A basic open set
about a point of $\mathcal{H}$ is an open disc wholly within $\mathcal{H}$, and a basic open
set about the member $\infty$ of C is $\{\operatorname{Im}\tau > r\} \cup \{\infty\}$ for each r > 0. If $x \in \mathbb{Q}$ is
in C, a basic open set about x is of the form $D \cup \{x\}$, where D is an open
disc in $\mathcal{H}$ of radius y > 0 and center x + iy. The resulting topology on
$\mathcal{H}^*$ is Hausdorff, $\mathcal{H}$ is an open subset, and SL(2, Z) acts continuously.
Let $X_0(N) = \Gamma_0(N)\backslash\mathcal{H}^*$. It is easy to see that $X_0(N)$ is compact.
Moreover, with some effort one can show that $X_0(N)$ is Hausdorff. A
key step in the argument is Lemma 11.2 below. For z in $\mathcal{H}^*$, let $\Gamma_z =
\{\gamma \in \Gamma_0(N) \mid \gamma(z) = z\}$.
Lemma 11.2. If $\tau$ is in $\mathcal{H}$, then there exists an open neighborhood
$V_\tau$ of $\tau$ in $\mathcal{H}$ with $V_\tau \cap \gamma(V_\tau) = \emptyset$ for all $\gamma$ in $\Gamma_0(N)$ not in $\Gamma_\tau$. The set
$V_\tau$ may be taken to be stable under $\Gamma_\tau$.
We shall introduce a system of charts
$$\{(U_z, \varphi_z) \mid z \in \mathcal{H}^*\} \quad\text{on } X_0(N) \tag{11.19}$$
that makes the compact Hausdorff space $X_0(N)$ into a Riemann surface.
Let $\pi : \mathcal{H}^* \to X_0(N)$ be the quotient map. If V is open in $\mathcal{H}^*$, then
$\pi^{-1}(\pi(V)) = \bigcup_{\gamma \in \Gamma_0(N)} \gamma(V)$ is open; thus $\pi$ is an open map. In defining
the charts, there are cases and subcases.
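Case 1. $z_0 = \tau$ is in $\mathcal{H}$. We choose $V_\tau$ as in Lemma 11.2 and let
$U_\tau = \pi(V_\tau)$; $U_\tau$ is open since $\pi$ is open.
Case 1a. $\Gamma_\tau = \{\pm I\}$. Then $\pi : V_\tau \to U_\tau$ is a homeomorphism. We let
$\varphi_\tau$ be its inverse, and then $(U_\tau, \varphi_\tau)$ is the required chart.
Case 1b. $z_0 = \tau$ is in $\mathcal{H}$, and $\Gamma_\tau \ne \{\pm I\}$. Since $\Gamma_\tau \supseteq \{\pm I\}$ and
$\Gamma_0(N) \subseteq SL(2, \mathbb{Z})$, the order of $\Gamma_\tau$ is 4 or 6; we write 2n for this number.
Let $\lambda$ be a linear fractional transformation from $\mathcal{H}$ onto the unit disc
carrying $\tau$ to 0, e.g., $\lambda(z) = \dfrac{z - \tau}{z - \bar\tau}$. Then $\varphi_\tau : U_\tau \to \mathbb{C}$ is well defined
by the formula $\varphi_\tau(\pi(z)) = \lambda(z)^n$, and $(U_\tau, \varphi_\tau)$ is the required chart.
Case 2. $z_0 = x$ is in C. Choose $\beta$ in SL(2, Z) with $\beta(x) = \infty$. Then
$\beta\Gamma_x\beta^{-1} = \left\{\pm\begin{pmatrix} 1 & mh \\ 0 & 1 \end{pmatrix} \,\middle|\, m \in \mathbb{Z}\right\}$, where h is the width of the cusp. Let
$V_x = \beta^{-1}(\{\operatorname{Im}\tau > 2\})$. If $\gamma \in \Gamma_0(N)$ has $\gamma(V_x) \cap V_x \ne \emptyset$, then
$$\beta\gamma\beta^{-1}(\{\operatorname{Im}\tau > 2\}) \cap \{\operatorname{Im}\tau > 2\} \ne \emptyset.$$
Writing $\beta\gamma\beta^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and taking $\tau$ in the nonempty intersection, we
have
$$2 < \operatorname{Im}(\beta\gamma\beta^{-1}\tau) = \frac{\operatorname{Im}\tau}{|c\tau + d|^2} \le \frac{1}{c^2\operatorname{Im}\tau} < \frac{1}{2c^2}$$
whenever $c \ne 0$, which is impossible for a nonzero integer c.
Thus c = 0, $\beta\gamma\beta^{-1}$ fixes $\infty$, and $\gamma$ fixes x. In other words, $\gamma$ is in $\Gamma_x$.
Put $U_x = \pi(V_x)$. Then $\varphi_x : U_x \to \mathbb{C}$ is well defined by the formula
$\varphi_x(\pi(z)) = e^{2\pi i\beta(z)/h}$, and $(U_x, \varphi_x)$ is the required chart.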
Proposition 11.3. The charts in (11.19) are compatible, and $X_0(N)$
becomes a compact Riemann surface.
3. Meromorphic Differentials
This section treats meromorphic and holomorphic differentials of
compact (connected) Riemann surfaces, with particular attention to $X_0(N)$
and elliptic curves. We omit proofs of results that do not specifically
address these applications.
Thus let X be a compact Riemann surface, and let $\{(U_i, \varphi_i)\}_{i \in I}$ be an
atlas. The meromorphic functions on X form a field, which we denote by
K(X). A system $\omega = \{\omega_i\}_{i \in I}$ of scalar-valued meromorphic functions
$\omega_i$ on $U_i$ is called a meromorphic differential if
$$\omega_i \circ \varphi_i^{-1} = (\omega_j \circ \varphi_i^{-1})(\varphi_j \circ \varphi_i^{-1})' \quad\text{on } \varphi_i(U_i \cap U_j) \subseteq \mathbb{C} \tag{11.20a}$$
whenever $U_i \cap U_j \ne \emptyset$. The classical notation that is used for $\omega_i \circ \varphi_i^{-1}$
in order to capture the transformation law is $\omega_i(\varphi_i^{-1}(z))\,dz$, where z
is a local parameter. In this notation if $w = \varphi_j \circ \varphi_i^{-1}(z)$, then $dw =
(\varphi_j \circ \varphi_i^{-1})'(z)\,dz$, and (11.20a) says that
$$\omega_i(\varphi_i^{-1}(z))\,dz = \omega_j(\varphi_j^{-1}(w))\,dw. \tag{11.20b}$$
The space of meromorphic differentials is denoted $\Omega(X)$. If F is in
K(X), then $dF = \{(F \circ \varphi_i^{-1})' \circ \varphi_i\}_{i \in I}$ is a meromorphic differential. In
the development of the theory of compact Riemann surfaces, one of the
early steps is to prove the following.
3. MEROMORPHIC DIFFERENTIALS 313
Proposition 11.4. If X is a compact Riemann surface, then the
space Q(X) of meromorphic differentials is nonzero.
If $\omega = \{\omega_i\}_{i\in I}$ is a meromorphic differential, then $\omega$ has a meaning in
every compatible chart. Namely if $(U_0, \varphi_0)$ is a compatible chart, then
we can adjoin $\omega_0 : U_0 \to \mathbb{C}$ to $\{\omega_i\}_{i\in I}$ if $\omega_0$ is defined by
$$\omega_0 = (\omega_i)\bigl((\varphi_i \circ \varphi_0^{-1})' \circ \varphi_0\bigr) \quad \text{on } U_0 \cap U_i.$$
A meromorphic differential is a holomorphic differential if all the
$\omega_i$ are holomorphic functions. The space of holomorphic differentials is
denoted $\Omega_{\mathrm{hol}}(X)$. If $\omega = \{\omega_i\}_{i\in I}$ is a holomorphic differential, then $\omega$
is locally of the form $df$. Namely let $(U_0, \varphi_0)$ be a simply connected
compatible chart. Then $\omega_0 \circ \varphi_0^{-1}$ is analytic on $\varphi_0(U_0)$ and is the
derivative of an analytic function $g$. If we define $f = g \circ \varphi_0$ on $U_0$, then
$(f \circ \varphi_0^{-1})' \circ \varphi_0 = \omega_0$, as asserted. We say that $\omega$ locally has an indefinite
integral.
Let
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$$
be the equation of an elliptic curve $E$ over $\mathbb{C}$. In Corollary 6.32 we saw
that $E(\mathbb{C})$ is a compact Riemann surface of genus 1. Formally we have
$$2y\,dy + a_1x\,dy + a_1y\,dx + a_3\,dy = (3x^2 + 2a_2x + a_4)\,dx$$
and therefore
$$\frac{dx}{2y + a_1x + a_3} = \frac{dy}{3x^2 + 2a_2x + a_4 - a_1y}. \tag{11.21a}$$
At points of $E(\mathbb{C})$ where $\dfrac{\partial f}{\partial y} \neq 0$ (with $f$ the difference of the two sides
of the Weierstrass equation), $x$ is a local coordinate of $E(\mathbb{C})$, and
the left side defines (locally) a holomorphic differential. At points where
$\dfrac{\partial f}{\partial x} \neq 0$, $y$ is a local coordinate, and the right side defines (locally)
a holomorphic differential. Together these sets of points cover $E(\mathbb{C})$
except for the point at $\infty$. In terms of affine local coordinates $(X, 1, W)$
for projective space, we obtain a similar relation from (3.23a):
$$\frac{-dW}{3X^2 + 2a_2XW + a_4W^2 - a_1W} = \frac{dX}{3a_6W^2 + 2a_4XW + a_2X^2 - 2a_3W - a_1X - 1}. \tag{11.21b}$$
The relationship between the two coordinate systems is $[(X, 1, W)] =
[(x, y, 1)]$. Thus $X = \dfrac{x}{y}$ and $W = \dfrac{1}{y}$. In particular, $dW = -y^{-2}\,dy$.
Calculating, we find that the left side of (11.21b) matches the right side
of (11.21a) on the overlap. The common value of (11.21a) and (11.21b)
is therefore a (globally defined) holomorphic differential on $E(\mathbb{C})$. It is
called the invariant differential for $E$, for reasons given below.
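
The relation (11.21a) can be sanity-checked numerically on a real branch of a curve. The sketch below is illustrative only: the coefficients are hypothetical sample values (not taken from the text), and the helper names are ours. It compares the slope $dy/dx$ predicted by (11.21a) with a finite-difference slope.

```python
import math

# Hypothetical sample Weierstrass coefficients (a1, a2, a3, a4, a6); not from the text.
A1, A2, A3, A4, A6 = 1.0, 0.0, 1.0, -1.0, 2.0

def y_on_curve(x):
    """Upper real branch of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b = A1 * x + A3
    rhs = x**3 + A2 * x**2 + A4 * x + A6
    return (-b + math.sqrt(b * b + 4.0 * rhs)) / 2.0

def slope_from_differential(x):
    """dy/dx read off from (11.21a): (3x^2 + 2 a2 x + a4 - a1 y) / (2y + a1 x + a3)."""
    y = y_on_curve(x)
    return (3 * x**2 + 2 * A2 * x + A4 - A1 * y) / (2 * y + A1 * x + A3)

def slope_numeric(x, h=1e-6):
    """Central finite-difference approximation to dy/dx along the same branch."""
    return (y_on_curve(x + h) - y_on_curve(x - h)) / (2 * h)
```

Agreement of the two slopes at sample points is only a consistency check of the formal differentiation, not a proof of holomorphy.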
In the theory of compact Riemann surfaces, one of the preliminary
results used for the Riemann-Roch Theorem is the following.
Proposition 11.5. If $X$ is a compact Riemann surface of genus $g$,
then the vector space $\Omega_{\mathrm{hol}}(X)$ of holomorphic differentials has dimension
$g$.
In the case of $E(\mathbb{C})$, the genus is one, and the invariant differential is
the unique holomorphic differential, up to a scalar. We know from
Corollary 6.32 that $E(\mathbb{C})$ is biholomorphic with $\mathbb{C}/\Lambda$ for a lattice $\Lambda$. Then
$dz$ on $\mathbb{C}$ has a meaning on $\mathbb{C}/\Lambda$ and defines a holomorphic differential
on $\mathbb{C}/\Lambda$. By Proposition 11.5, it agrees with the invariant differential in
(11.21), up to a scalar. The invariance of $dz$ under translations of $\mathbb{C}/\Lambda$,
which we know correspond to group translations in $E(\mathbb{C})$, is the reason
for the name "invariant differential" for (11.21).
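Now consider $X_0(N)$. Let $\pi : \mathcal{H}^* \to X_0(N)$ be the quotient map. If
$\omega = \{\omega_i\}_{i\in I}$ is a holomorphic differential on $X_0(N)$, then we define a
function $f_\omega : \mathcal{H} \to \mathbb{C}$ by
$$f_\omega(\tau) = \omega_i(\pi(\tau))(\varphi_i \circ \pi)'(\tau) \tag{11.22}$$
if $(U_i, \varphi_i)$ is a chart about $\pi(\tau)$.
Proposition 11.6. The map $\omega \to f_\omega$ is well defined and is an
isomorphism of $\Omega_{\mathrm{hol}}(X_0(N))$ onto $S_2(\Gamma_0(N))$.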
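PROOF. To see that $f_\omega$ is well defined, let $(U_i, \varphi_i)$ and $(U_j, \varphi_j)$ be
two charts with $\pi(\tau)$ in $U_i \cap U_j$. Let $t_i = \varphi_i(\pi(\tau))$ and $t_j = \varphi_j(\pi(\tau))$.
Then $\varphi_j \circ \pi = (\varphi_j \circ \varphi_i^{-1}) \circ (\varphi_i \circ \pi)$ implies
$$(\varphi_j \circ \pi)'(\tau) = (\varphi_j \circ \varphi_i^{-1})'(t_i)(\varphi_i \circ \pi)'(\tau).$$
Hence
$$\begin{aligned}
\omega_j(\pi(\tau))(\varphi_j \circ \pi)'(\tau) &= (\omega_j \circ \varphi_i^{-1})(t_i)(\varphi_j \circ \varphi_i^{-1})'(t_i)(\varphi_i \circ \pi)'(\tau) \\
&= (\omega_i \circ \varphi_i^{-1})(t_i)(\varphi_i \circ \pi)'(\tau) \qquad \text{by (11.20a)} \\
&= \omega_i(\pi(\tau))(\varphi_i \circ \pi)'(\tau),
\end{aligned}$$
and $f_\omega$ is well defined. It is clear that $f_\omega$ is analytic.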
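If $\gamma$ is in $\Gamma_0(N)$, then $\pi(\tau) = \pi(\gamma\tau)$, so that
$$\begin{aligned}
f_\omega(\gamma\tau) &= \omega_i(\pi(\gamma\tau))(\varphi_i \circ \pi)'(\gamma\tau) \\
&= \omega_i(\pi(\tau))(\varphi_i \circ \pi \circ \gamma)'(\tau)/\gamma'(\tau) \\
&= \delta(\gamma,\tau)^2\,\omega_i(\pi(\tau))(\varphi_i \circ \pi)'(\tau).
\end{aligned}$$
Hence $f_\omega$ is an unrestricted modular form of weight 2 and level $N$.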
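
The weight-2 factor in the last step comes from the derivative of a linear fractional transformation. The following sketch checks numerically that $\gamma'(\tau) = \delta(\gamma,\tau)^{-2}$, assuming the convention $\delta(\gamma,\tau) = c\tau + d$ for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of determinant 1; the function names are ours and the sample matrix is arbitrary.

```python
def mobius(g, tau):
    """Action of g = ((a, b), (c, d)) on tau by a linear fractional transformation."""
    (a, b), (c, d) = g
    return (a * tau + b) / (c * tau + d)

def delta(g, tau):
    """Automorphy factor c*tau + d (assumed convention for delta(gamma, tau))."""
    (_, _), (c, d) = g
    return c * tau + d

def derivative_numeric(g, tau, h=1e-6):
    """Central finite difference for d/dtau of mobius(g, tau)."""
    return (mobius(g, tau + h) - mobius(g, tau - h)) / (2 * h)
```

With this identity, dividing by $\gamma'(\tau)$ is the same as multiplying by $\delta(\gamma,\tau)^2$, which is the transformation law of weight 2.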
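For the behavior at the cusps, let $x = \beta^{-1}(\infty)$ be a cusp, and let
$(U_x, \varphi_x)$ be the chart about $x$ given by (11.19). Here
$$U_x = \pi(\beta^{-1}\{\operatorname{Im}\tau > 2\})$$
and
$$\varphi_x(\pi(\beta^{-1}\tau)) = e^{2\pi i\tau/h},$$
where $h$ is the width of the cusp. Then we have
$$\begin{aligned}
f_\omega \circ [\beta^{-1}]_2(\tau) &= f_\omega(\beta^{-1}\tau)\,\delta(\beta^{-1},\tau)^{-2} \\
&= \omega_x(\pi(\beta^{-1}\tau))(\varphi_x \circ \pi)'(\beta^{-1}\tau)\,\delta(\beta^{-1},\tau)^{-2} \\
&= \omega_x \circ \varphi_x^{-1}(e^{2\pi i\tau/h})\,\frac{d}{d\tau}\bigl(\varphi_x \circ \pi \circ \beta^{-1}\bigr)(\tau) \\
&= \omega_x \circ \varphi_x^{-1}(e^{2\pi i\tau/h})\,\frac{2\pi i}{h}\,e^{2\pi i\tau/h}.
\end{aligned} \tag{11.23}$$
Since the differential is holomorphic, the first factor is an analytic
function of $q_h = e^{2\pi i\tau/h}$. The remaining factor is a multiple of $q_h$. Therefore
$f_\omega$ is holomorphic at the cusp $\beta^{-1}(\infty)$ and vanishes there.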
Clearly the map $\omega \to f_\omega$ is one-one. To see that it is onto, let $f$ be
in $S_2(\Gamma_0(N))$ and let $x$ be in $\mathcal{H}^*$. Form the chart $(U_x, \varphi_x)$. For $z$ in $U_x$,
choose $\tau$ in $\mathcal{H}^*$ with $z = \pi(\tau)$. Define
$$\omega_x(z) = f(\tau)\bigl((\varphi_x \circ \pi)'(\tau)\bigr)^{-1}$$
if $\tau$ is in $\mathcal{H}$. Modular invariance makes $\omega_x$ well defined. Also $\omega_x$ is
holomorphic everywhere if $x$ is not in $\pi(\mathcal{C})$, and it is holomorphic except
at $x$ if $x$ is in $\pi(\mathcal{C})$. In the latter case, to define $\omega_x$ at $x$ (the only point
where $\tau$ would be forced to be a cusp), we unravel the argument (11.23)
to see that $\omega_x$ is bounded in a neighborhood of $x$. By the Riemann
removable singularity theorem, $\omega_x$ extends to be holomorphic at $x$. Then
it is easy to see that $\omega = \{\omega_x\}_{x\in\mathcal{H}^*}$ is a holomorphic differential that
maps to $f$.
4. Properties of Compact Riemann Surfaces
This section will summarize, without any proofs, some deeper
properties of compact Riemann surfaces. The initial three main theorems about
compact Riemann surfaces are the Riemann-Roch Theorem, Abel's
Theorem, and the Jacobi Inversion Theorem. All are expressed in the
language of divisors.
Let $X$ be a compact Riemann surface. We denote by $\mathrm{Div}(X)$ the free
abelian group on the points of $X$. An element $D$ of $\mathrm{Div}(X)$ is called a
divisor. Such an element is written as
$$D = \sum_{x\in X} \mathrm{ord}_x(D)\,x \quad \text{with } \mathrm{ord}_x(D) \in \mathbb{Z}. \tag{11.24}$$
We write $D_1 \geq D_2$ if $\mathrm{ord}_x(D_1) \geq \mathrm{ord}_x(D_2)$ for all $x$ in $X$.
If $f$ is in $K(X)^\times$ and $x$ is in $X$, choose a chart $(U_i, \varphi_i)$ with $x \in U_i$,
and put
$$\mathrm{ord}_x(f) = \mathrm{ord}_{\varphi_i(x)}(f \circ \varphi_i^{-1}). \tag{11.25}$$
The right side is $> 0$ at a zero and $< 0$ at a pole. The expression (11.25)
does not depend on the choice of $(U_i, \varphi_i)$.
The divisor $[f]$ of $f \in K(X)^\times$ is given by
$$[f] = \sum_{x\in X} \mathrm{ord}_x(f)\,x.$$
Such a divisor is called a principal divisor, and the set of principal
divisors is denoted $\mathrm{Div}_0(X)$. The quotient group
$$\mathrm{Pic}(X) = \mathrm{Div}(X)/\mathrm{Div}_0(X)$$
is called the divisor class group.
By Proposition 11.4, $X$ possesses nonzero meromorphic differentials
$\omega$. Let $[\omega]$ be the divisor of $\omega$, defined in the same way as $[f]$. The
divisor class of $[\omega]$ is independent of the choice of $\omega$. In fact, if $\omega$ and
$\omega'$ are two such, then reference to (11.20a) shows that $\omega = f\omega'$ with $f$
in $K(X)^\times$, and thus the assertion follows. The divisor class of $[\omega]$ in
$\mathrm{Pic}(X)$ is called the canonical class.
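For a divisor $D$, the linear system $L(D)$ is the vector space
$$\begin{aligned}
L(D) &= \{f \in K(X)^\times \mid [f] + D \geq 0\} \cup \{0\} \\
&= \{f \in K(X)^\times \mid \mathrm{ord}_x(f) + \mathrm{ord}_x(D) \geq 0 \text{ for all } x \in X\} \cup \{0\}.
\end{aligned}$$
Then $L(0) = \mathbb{C}$ since the only holomorphic functions on $X$ are the
constants. Also it is clear that
$$D \geq D' \quad \text{implies} \quad L(D) \supseteq L(D'). \tag{11.26}$$
The next result limits how much $L(D')$ can fail to be all of $L(D)$.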
Proposition 11.7. Let $X$ be a compact Riemann surface, let $D'$ be
a divisor, and let $x$ be in $X$. Then
$$\dim L(D' + x) \leq \dim L(D') + 1.$$
Consequently $L(D)$ is finite-dimensional for each $D$ in $\mathrm{Div}(X)$.
The degree $\deg(D)$ of the divisor $D$ in (11.24) is $\sum_{x\in X} \mathrm{ord}_x(D)$. A
proof of the following result was sketched in §IX.5.
Proposition 11.8. Each principal divisor on a compact Riemann
surface has degree 0.
Theorem 11.9 (Riemann-Roch Theorem). Let $X$ be a compact
Riemann surface of genus $g$, let $D$ be a divisor, and let $W$ be in the canonical
class of $\mathrm{Pic}(X)$. Then
$$\dim L(D) = \deg(D) + \dim L(W - D) - g + 1.$$
Corollary 11.10. Let $X$ be a compact Riemann surface of genus $g$,
and let $W$ be in the canonical class of $\mathrm{Pic}(X)$. Then $\deg W = 2g - 2$
and $\dim L(W) = g$.
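
As a worked illustration (our own short derivation, not quoted from the text), Corollary 11.10 follows by applying the Riemann-Roch formula twice, using only $L(0) = \mathbb{C}$, so that $\dim L(0) = 1$:

```latex
\begin{align*}
D = 0:\quad & \dim L(0) = \deg(0) + \dim L(W) - g + 1
  \;\Longrightarrow\; 1 = \dim L(W) - g + 1
  \;\Longrightarrow\; \dim L(W) = g, \\
D = W:\quad & \dim L(W) = \deg(W) + \dim L(0) - g + 1
  \;\Longrightarrow\; g = \deg(W) + 1 - g + 1
  \;\Longrightarrow\; \deg W = 2g - 2.
\end{align*}
```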
Corollary 11.11. Let X be a compact Riemann surface of genus g,
and let D be a divisor with deg(D) > 2g - 2. Then
dimL(D) = deg(D) - g + 1.
Corollary 11.12. Let $X$ be a compact Riemann surface of genus
$g \geq 1$. Then there is no point of $X$ where all holomorphic differentials
vanish.
Until further notice in this section, we assume that our compact
Riemann surface $X$ has genus $g \geq 1$. Since $X$ is a closed orientable surface
of genus $g$, the ordinary homology group $H_1(X,\mathbb{Z})$ is free abelian on $2g$
generators. One usually writes a standard homology basis over $\mathbb{Z}$ in the
form $a_1,\ldots,a_g, b_1,\ldots,b_g$ with special intersection properties. But we
shall be content to fix any $\mathbb{Z}$ basis $c_1,\ldots,c_{2g}$ of $H_1(X,\mathbb{Z})$.
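Proposition 11.13. Let $X$ be a compact Riemann surface of genus
$g \geq 1$, and let $\omega_1,\ldots,\omega_g$ be a basis of $\Omega_{\mathrm{hol}}(X)$ over $\mathbb{C}$. Then the $2g$
vectors
$$\begin{pmatrix} \int_{c_k}\omega_1 \\ \vdots \\ \int_{c_k}\omega_g \end{pmatrix}, \qquad 1 \leq k \leq 2g,$$
in $\mathbb{C}^g$ are linearly independent over $\mathbb{R}$.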
Consequently the vectors in the proposition are a $\mathbb{Z}$ basis for a lattice
$\Lambda(X)$ in $\mathbb{C}^g$. The lattice $\Lambda(X)$ is unchanged if $\{c_k\}$ is replaced by a
different $\mathbb{Z}$ basis of $H_1(X,\mathbb{Z})$. If $\{\omega_j\}$ is replaced by a different basis
of $\Omega_{\mathrm{hol}}(X)$ over $\mathbb{C}$, the effect is to transform $\Lambda(X)$ by a member of $GL(g,\mathbb{C})$.
The Jacobian variety of $X$ is the $g$-dimensional complex torus
$J(X) = \mathbb{C}^g/\Lambda(X)$. We define a map $\Phi : X \to J(X)$ by fixing a point $x_0$
in $X$ and setting
$$\Phi(x) = \left\{\int_{x_0}^{x} \omega_j\right\}_{j=1}^{g}. \tag{11.27}$$
According to the previous paragraph, if the base point is fixed, the pair
$(J(X), \Phi)$ is well defined up to the action of a member of $GL(g,\mathbb{C})$
simultaneously on the $\omega_j$'s, $\mathbb{C}^g$, and $\Lambda(X)$.
Proposition 11.14. For a compact Riemann surface $X$ of genus $\geq 1$,
the map $\Phi$ of $X$ into $J(X)$ is well defined and holomorphic, and its rank
(over $\mathbb{C}$) is everywhere 1.
The first statement of the proposition has already been addressed,
and the second statement follows from Corollary 11.12. In the case of
$X = X_0(N)$, Proposition 11.6 shows that the map $\Phi$ is essentially the
same as the tuple of maps $\Phi_{f_j}$ of §1 if $\{f_j\}$ is a basis of $S_2(\Gamma_0(N))$. This
tuple $\{\Phi_{f_j}\}$ will be used in §§10-11 and will be called $\Phi$ there.
Since $J(X)$ is a group, we can extend $\Phi : X \to J(X)$ to a map
$\Phi : \mathrm{Div}(X) \to J(X)$ in the obvious way: If $D = \sum_{x\in X} \mathrm{ord}_x(D)\,x$, then
$$\Phi(D) = \sum_{x\in X} \mathrm{ord}_x(D)\,\Phi(x).$$
For general $D$, this definition depends on the base point $x_0$ in (11.27),
but it is independent of $x_0$ if $D$ has degree 0.
Recall from Proposition 11.8 that all principal divisors have degree 0.
Theorem 11.15 (Abel's Theorem). Let $X$ be a compact Riemann
surface of genus $\geq 1$, and let $D$ be a divisor. Then $D$ is principal if and
only if $\deg(D) = 0$ and $\Phi(D)$ is the 0 element of $J(X)$.
Corollary 11.16.
(a) If $X$ has genus 1, then $\Phi : X \to J(X)$ is one-one onto and biholomorphic.
(b) If $X$ has genus $> 1$, then $\Phi : X \to J(X)$ is a one-one holomorphic
mapping onto a proper complex submanifold of $J(X)$.
With $g$ equal to the genus of $X$, let
$$\mathrm{Div}^{(g)}(X) = \{D \in \mathrm{Div}(X) \mid D \geq 0 \text{ and } \deg(D) = g\}.$$
Then we can restrict $\Phi$ from $\mathrm{Div}(X)$ to $\mathrm{Div}^{(g)}(X)$ and obtain a map
$\Phi : \mathrm{Div}^{(g)}(X) \to J(X)$. The special case $g = 1$ is what was considered
in Corollary 11.16a.
Theorem 11.17 (Jacobi Inversion Theorem). For a compact Riemann
surface of genus $g \geq 1$, the map $\Phi$ carries $\mathrm{Div}^{(g)}(X)$ onto $J(X)$.
Corollary 11.18. As a group, $J(X)$ is isomorphic to the group of
divisors of degree 0 modulo the subgroup of principal divisors.
Theorem 11.19. If $F : X \to T$ is a holomorphic mapping of a
compact Riemann surface of genus $\geq 1$ into a complex torus, then $F$ factors
through the Jacobian variety: $F = f \circ \Phi$ for some holomorphic
mapping $f : J(X) \to T$ that is the sum of a translation and a holomorphic
homomorphism.
This completes our discussion of the three initial main theorems about
compact Riemann surfaces. The final part of this section addresses the
nature of the function field K(X), indicating part of the connection
between Riemann surface theory and the algebraic geometry of curves.
Theorem 11.20. Let $X$ be a compact Riemann surface of genus
$g \geq 0$, and let $x$ be a nonconstant meromorphic function on $X$ (existence
by Corollary 11.11). Then there exist a nonconstant $y \in K(X)$ and an
irreducible polynomial $P$ in two variables such that $P(x,y) = 0$ and
$$K(X) = \mathbb{C}(x)[y]/(P(x,y)).$$
We should think of Theorem 11.20 as giving us a mapping of $X$ into
$P^2(\mathbb{C})$ by $z \to (x(z), y(z))$, with the image contained in the plane curve
defined by $P$. In §7, we shall briefly discuss this approach to defining
$X_0(N)$ over $\mathbb{Q}$.
Theorem 11.21. Let $X$ and $Y$ be compact Riemann surfaces, and
suppose $F : K(Y) \to K(X)$ is a nonzero $\mathbb{C}$ algebra homomorphism.
Then $F$ is one-one and is implemented by a holomorphic map of $X$ onto
$Y$.
5. Hecke Operators on Integral Homology
We now return to the subject of modular forms. We shall examine
Hecke operators in detail in this section. Hecke operators act on many
things, and they do so consistently. Understanding these consistent
actions is a key to unlocking the power of the Hecke operators.
One such consistent action is on $H_1(X_0(N),\mathbb{Z})$. Using this action, we
shall see that $S_2(\Gamma_0(N))$ has a basis such that all Hecke operators $T_2(n)$
act by integer matrices. One consequence is that their eigenvalues are
algebraic integers. This same realization will allow us also to characterize
the algebra generated by the Hecke operators $T_2(n)$.
Before considering $H_1(X_0(N),\mathbb{Z})$, we consider points and divisors.
Recall from §IX.6 that Hecke operators were introduced as acting on the
free abelian group generated by modular pairs $(\Lambda, C)$, where $\Lambda$ is a
lattice in $\mathbb{C}$ and $C$ is a cyclic subgroup of $\mathbb{C}/\Lambda$ of exact order $N$. The set
of equivalence classes of such pairs, with equivalence defined essentially
by $\mathbb{C}^\times$, is really $X_0(N)$ without the cusps, and thus we should expect
an action of Hecke operators on the free abelian group generated by the
points of $X_0(N)$. This is the divisor group. Rather than try to unwind
our original definition, however, we shall start afresh.
In §IX.6 we defined $M(n, N)$ as the set of 2-by-2 integer matrices of
determinant $n$ whose upper left entry is prime to $N$ and whose lower
left entry is divisible by $N$. As in Proposition 9.12, we write $M(n, N)$
as a finite disjoint union
$$M(n, N) = \bigcup_{i=1}^{K} \Gamma_0(N)\alpha_i. \tag{11.28}$$
If $\tau$ is in $\mathcal{H}^*$, let $[\tau]$ be the corresponding member of $X_0(N)$. For such
$\tau$ we define
$$T(n)(\tau) = \sum_{i=1}^{K} [\alpha_i\tau] \tag{11.29}$$
as a member of $\mathrm{Div}(X_0(N))$. If the $\alpha_i$'s are replaced by different coset
representatives in (11.28), then it is clear that the right side of (11.29)
is unchanged.
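Let us see that $T(n)(\tau)$ depends only on $[\tau]$. In fact, if $\gamma$ is in $\Gamma_0(N)$,
then $M(n, N)\gamma \subseteq M(n, N)$. Hence we can write
$$\alpha_i\gamma = \gamma_i\alpha_{j(i)} \tag{11.30}$$
with $\gamma_i \in \Gamma_0(N)$, and we readily check that
$$i \to j(i) \text{ is a permutation of } \{1,\ldots,K\}. \tag{11.31}$$
Thus
$$\begin{aligned}
T(n)(\gamma\tau) - T(n)(\tau) &= \sum_i [\alpha_i\gamma\tau] - \sum_i [\alpha_i\tau] \\
&= \sum_i [\gamma_i\alpha_{j(i)}\tau] - \sum_i [\alpha_{j(i)}\tau] \\
&= \sum_i \bigl([\gamma_i\alpha_{j(i)}\tau] - [\alpha_{j(i)}\tau]\bigr) \\
&= 0
\end{aligned} \tag{11.32}$$
by (11.31). Equation (11.32) says that (11.29) is a consistent definition
of $T(n)[\tau]$.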
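
The relations (11.30) and (11.31) can be explored by machine. The sketch below assumes, as an illustration, the usual upper triangular coset representatives $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$, $\gcd(a, N) = 1$, $0 \leq b < d$ (our reading of the standard $\alpha$'s; the function names are ours). For a given $\gamma \in \Gamma_0(N)$ it finds $j(i)$ and $\gamma_i$ with $\alpha_i\gamma = \gamma_i\alpha_{j(i)}$.

```python
from math import gcd

def coset_reps(n, N):
    """Assumed representatives ((a, b), (0, d)): ad = n, gcd(a, N) = 1, 0 <= b < d."""
    reps = []
    for a in range(1, n + 1):
        if n % a == 0 and gcd(a, N) == 1:
            d = n // a
            reps.extend(((a, b), (0, d)) for b in range(d))
    return reps

def matmul(x, y):
    """Product of two 2-by-2 integer matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def hecke_permutation(n, N, gamma):
    """Return (perm, gammas) with alpha_i * gamma = gammas[i] * alpha_{perm[i]}."""
    reps = coset_reps(n, N)
    perm, gammas = [], []
    for alpha in reps:
        found = []
        for k, ((a, b), (_, d)) in enumerate(reps):
            adj = ((d, -b), (0, a))  # adjugate of alpha_k; alpha_k^{-1} = adj / n
            m = matmul(matmul(alpha, gamma), adj)
            flat = m[0] + m[1]
            # gamma_i = m / n must be integral with lower left entry divisible by N
            if all(e % n == 0 for e in flat) and (m[1][0] // n) % N == 0:
                found.append((k, tuple(tuple(e // n for e in row) for row in m)))
        if len(found) != 1:
            raise ValueError("expected exactly one matching coset representative")
        perm.append(found[0][0])
        gammas.append(found[0][1])
    return perm, gammas
```

For example, `hecke_permutation(2, 11, ((1, 0), (11, 1)))` returns the index map $i \to j(i)$ together with the matrices $\gamma_i$, and one can confirm that the index map is a permutation as (11.31) asserts.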
Thus $T(n)$ has been defined as a $\mathbb{Z}$ linear map of $\mathrm{Div}(X_0(N))$ to itself.
This suggests a natural definition of $T(n)$ on 1-cycles on $X_0(N)$. If we
think of a 1-cycle as a sum of loops on $X_0(N)$, we just need a definition
of $T(n)$ on a loop. We lift the loop to a path from $\tau_0$ to some $\gamma\tau_0$ in $\mathcal{H}^*$,
writing the path as $[\tau_0, \gamma\tau_0]$, and we try to define
$$T(n)[\tau_0, \gamma\tau_0] = \sum_{i=1}^{K} [\tau_0, \gamma_i\tau_0] \tag{11.33}$$
with $\gamma_i$ as in (11.30). Then we hope that this definition descends to
homology. But there is much to check: that the definition depends only
on the homotopy class of the loop, then that it is independent of the lift,
and finally that it depends only on the homology class. The motivation
for an efficient approach comes from Proposition 11.1.
Proposition 11.22. Let $\Gamma_{ep}$ be the (normal) subgroup of $\Gamma_0(N)$
generated by all elliptic and parabolic elements, and let $(\cdot)^{\mathrm{ab}}$ refer to a
group modulo its commutator subgroup. In terms of a specified point $\tau_0$ in $\mathcal{H}^*$,
there is a canonical isomorphism
$$H_1(X_0(N), \mathbb{Z}) \cong \Gamma_0(N)^{\mathrm{ab}}/\Gamma_{ep}^{\mathrm{ab}}. \tag{11.34}$$
Remark. To simplify matters, we shall always assume that $\tau_0$ is
neither an elliptic fixed point nor a cusp.
Proof. We remove from $X_0(N)$ the finite set of images of elliptic
fixed points and cusps (parabolic fixed points) under $\Gamma_0(N)$; the
remaining set will be called $X_0(N)'$, and its preimage in $\mathcal{H}^*$ will be called
$\mathcal{H}'$. Then $e : \mathcal{H}' \to X_0(N)'$ is a covering map. We shall use $\tau_0$ and
$e(\tau_0)$ as base points for fundamental groups, but we drop them from the
notation.
Let $\bar{\Gamma}_0(N) = \Gamma_0(N)/\{\pm 1\}$. Then $\bar{\Gamma}_0(N)\backslash\mathcal{H}' = X_0(N)'$ shows that
$\bar{\Gamma}_0(N)$ is the group of deck transformations of $\mathcal{H}'$ over $X_0(N)'$ and that
$\bar{\Gamma}_0(N)$ acts transitively on the fibers. Hence $e_*(\pi_1(\mathcal{H}'))$ is normal in
$\pi_1(X_0(N)')$ with quotient $\bar{\Gamma}_0(N)$. Let
$$p : \pi_1(X_0(N)') \to \pi_1(X_0(N)')/e_*(\pi_1(\mathcal{H}')) \cong \bar{\Gamma}_0(N)$$
be the quotient homomorphism.
Let $Y_0(N) = \Gamma_0(N)\backslash\mathcal{H}$, i.e., $Y_0(N)$ is $X_0(N)'$ with the images of the
elliptic fixed points restored. We shall use the Van Kampen Theorem to
show that
$$\pi_1(Y_0(N)) \cong \bar{\Gamma}_0(N)/\bar{\Gamma}_e,$$
where $\bar{\Gamma}_e$ is the (normal) subgroup generated by the elliptic elements.
Visualize restoring one point to $Y_0(N)$ at a time. We apply the Van
Kampen Theorem to a small disc about this point and the partially
restored $Y_0(N)$. Let $l$ be a simple loop about the point within the small
disc. The theorem says that the effect on the fundamental group of
restoring the point is to convert $l$ into a relation, i.e., to factor by the
normal subgroup generated by $l$. If $l_1,\ldots,l_r$ are the loops used as all
the points are restored, the result is that
$$\pi_1(Y_0(N)) \cong \pi_1(X_0(N)')/(l_1,\ldots,l_r)$$
in obvious notation.
Using $p$, we have a natural homomorphism onto:
$$\pi_1(Y_0(N)) \cong \pi_1(X_0(N)')/(l_1,\ldots,l_r) \to \bar{\Gamma}_0(N)/p(l_1,\ldots,l_r). \tag{11.35}$$
Suppose an element $\sigma$ of $\pi_1(X_0(N)')$ maps to the identity coset. Then
$p(\sigma)$ is a product of elliptic elements. Under $p$, however, the elements
$l_1,\ldots,l_r$ map to elliptic elements that generate $\bar{\Gamma}_e$. Hence $p(\sigma) = p(\sigma_0)$
for some $\sigma_0$ in $(l_1,\ldots,l_r)$. Adjusting $\sigma$ by $\sigma_0^{-1}$ and changing notation, we
may assume that $p(\sigma) = 1$. Then $\sigma$ is in $e_*(\pi_1(\mathcal{H}'))$. But the punctures
of $\mathcal{H}'$ get filled in during the passage from $X_0(N)'$ to $Y_0(N)$. Thus $\sigma$ is
contractible in $Y_0(N)$ and must be in $(l_1,\ldots,l_r)$. Thus (11.35) is one-one
and is an isomorphism.
Now we pass from $Y_0(N)$ to $X_0(N)$ in the same way, restoring the
images of the cusps one at a time. At the start we have
$$\pi_1(Y_0(N)) \cong \bar{\Gamma}_0(N)/\bar{\Gamma}_e.$$
Each restoration of a point has the effect of adjoining a new relation
(by the Van Kampen Theorem), and the relations are easily seen to be
parabolic generators corresponding to each cusp. The end result is that
$$\pi_1(X_0(N)) \cong \bar{\Gamma}_0(N)/\bar{\Gamma}_{ep} \cong \Gamma_0(N)/\Gamma_{ep}.$$
Finally we have
$$H_1(X_0(N), \mathbb{Z}) \cong \pi_1(X_0(N))^{\mathrm{ab}} \cong (\Gamma_0(N)/\Gamma_{ep})^{\mathrm{ab}}.$$
We can readily check that
$$(\Gamma_1/\Gamma_2)^{\mathrm{ab}} \cong \Gamma_1^{\mathrm{ab}}/\Gamma_2^{\mathrm{ab}}$$
whenever $\Gamma_2$ is normal in $\Gamma_1$, and then (11.34) follows.
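Now we can make $T(n)$ act on $H_1(X_0(N),\mathbb{Z})$. Guided by (11.34), we
regard $H_1(X_0(N),\mathbb{Z})$ as the free abelian group on symbols $[\gamma]$,
$\gamma \in \Gamma_0(N)$, subject to the rules
$$\begin{aligned}
&[\gamma_1\gamma_2] = [\gamma_1] + [\gamma_2], \\
&[\gamma] = 0 \text{ if } \gamma \text{ is elliptic or parabolic.}
\end{aligned} \tag{11.36}$$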
This correspondence of symbols $[\gamma]$ and cycles $c$ respects integration.
Namely suppose $f \in S_2(\Gamma_0(N))$ corresponds under Proposition 11.6 to
$\omega \in \Omega_{\mathrm{hol}}(X_0(N))$ and $[\gamma]$ corresponds to $c$. Then
$$\int_{\tau_0}^{\gamma(\tau_0)} f(\tau)\,d\tau = \int_c \omega. \tag{11.37}$$
(Recall from (11.13) that the left side is independent of $\tau_0$.) To see this,
let $l$ be a loop in $X_0(N)$ that maps to $c$ in homology, and lift $l$ to a path
$\tilde{l}$ in $\mathcal{H}^*$. Then $\tilde{l}$ goes from $\tau_0$ to some point $\gamma(\tau_0)$ in $\mathcal{H}^*$. For this $\gamma$ and
$c$, the two sides of (11.37) are equal, and $\gamma$ does correspond to $c$ since $\tilde{l}$
projects to the loop $l$ mapping onto $c$ in homology.
Let the compact Riemann surface $X_0(N)$ have genus $g$. We know
that $H_1(X_0(N),\mathbb{Z})$ is free abelian on $2g$ generators, and hence so is the
isomorphic group of $[\gamma]$'s; in particular, it is torsion free.
We can make (11.33) into a rigorous definition by setting
$$T(n)[\gamma] = \sum_{i=1}^{K} [\gamma_i], \tag{11.38}$$
with $\gamma_i$ as in (11.30).
Proposition 11.23. The equation (11.38) makes $T(n)$ into a well
defined $\mathbb{Z}$ linear operator on $H_1(X_0(N), \mathbb{Z})$, independent of the coset
representatives $\alpha_i$ in (11.28). Moreover, if $c$ is a 1-cycle on $X_0(N)$ and $\omega$ is
a holomorphic differential (corresponding to a member $f$ of $S_2(\Gamma_0(N))$),
then
$$\int_{T(n)c} \omega = \int_c T(n)\omega, \tag{11.39a}$$
where $T(n)\omega$ is the holomorphic differential corresponding to $T_2(n)f$.
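Proof. To see that $T(n)$ is well defined on the group of $[\gamma]$'s, we
have to check that $T(n)[\gamma] = 0$ if $\gamma$ is elliptic or parabolic and that
$$T(n)[\gamma_1\gamma_2] = T(n)[\gamma_1] + T(n)[\gamma_2];$$
these conclusions are enough, according to (11.36). First we check this
identity.
Let $\gamma_1$ and $\gamma_2$ be in $\Gamma_0(N)$, and write
$$\alpha_i\gamma_1 = \gamma_i\alpha_{j(i)} \quad \text{and} \quad \alpha_j\gamma_2 = \gamma_j'\alpha_{k(j)}.$$
Then
$$\alpha_i(\gamma_1\gamma_2) = \gamma_i\alpha_{j(i)}\gamma_2 = (\gamma_i\gamma_{j(i)}')\alpha_{k(j(i))},$$
so that (11.38) gives
$$\begin{aligned}
T(n)[\gamma_1\gamma_2] &= \sum_{i=1}^{K} [\gamma_i\gamma_{j(i)}'] = \sum_{i=1}^{K} [\gamma_i] + \sum_{i=1}^{K} [\gamma_{j(i)}'] \\
&= \sum_{i=1}^{K} [\gamma_i] + \sum_{j=1}^{K} [\gamma_j'] = T(n)[\gamma_1] + T(n)[\gamma_2].
\end{aligned}$$
Thus $T(n)$ is compatible with the first relation of (11.36).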
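Suppose $\gamma$ is parabolic with $\alpha_i\gamma = \gamma_i\alpha_{j(i)}$. Let $j^{(l)}(\cdot)$ be the $l$th
iterate of the permutation $j$. Then
$$\begin{aligned}
\alpha_i\gamma^{K!} &= \gamma_i\alpha_{j(i)}\gamma^{K!-1} = \gamma_i\gamma_{j(i)}\alpha_{j^{(2)}(i)}\gamma^{K!-2} \\
&= \cdots = \gamma_i\gamma_{j(i)}\cdots\gamma_{j^{(K!-1)}(i)}\alpha_{j^{(K!)}(i)} \\
&= \gamma_i\gamma_{j(i)}\cdots\gamma_{j^{(K!-1)}(i)}\alpha_i.
\end{aligned}$$
Since $\gamma$ is parabolic, so is
$$\alpha_i\gamma^{K!}\alpha_i^{-1} = \gamma_i\gamma_{j(i)}\cdots\gamma_{j^{(K!-1)}(i)},$$
and the right side shows that this element is in $\Gamma_0(N)$. Since
$$\alpha_i\gamma^{K!} = (\alpha_i\gamma^{K!}\alpha_i^{-1})\alpha_i,$$
(11.38) gives
$$K!\,T(n)[\gamma] = T(n)[\gamma^{K!}] = \sum_{i=1}^{K} [\alpha_i\gamma^{K!}\alpha_i^{-1}] = 0.$$
Since the group of $[\gamma]$'s is torsion free, $T(n)[\gamma] = 0$.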
Suppose $\gamma$ is elliptic with $\gamma^p = 1$. It is clear from (11.38) that
$$T(n)[1] = \sum_{i=1}^{K} [1] = 0.$$
Then
$$p\,T(n)[\gamma] = T(n)[\gamma^p] = T(n)[1] = 0,$$
and $T(n)[\gamma] = 0$ since the group of $[\gamma]$'s is torsion free. Thus $T(n)$ is
well defined on the group of $[\gamma]$'s.
The thrust of Proposition 11.22 is that no further relations need
to be checked in order to transfer $T(n)$ to a well defined operator on
$H_1(X_0(N),\mathbb{Z})$.
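Suppose $\alpha_i$ is changed to $\alpha_i' = \varepsilon_i\alpha_i$ with $\varepsilon_i$ in $\Gamma_0(N)$. From
$\alpha_i\gamma = \gamma_i\alpha_{j(i)}$, we obtain
$$\alpha_i'\gamma = \varepsilon_i\alpha_i\gamma = \varepsilon_i\gamma_i\alpha_{j(i)} = \varepsilon_i\gamma_i\varepsilon_{j(i)}^{-1}\alpha_{j(i)}'.$$
Then (11.36) gives
$$\sum_{i=1}^{K} [\varepsilon_i\gamma_i\varepsilon_{j(i)}^{-1}] = \sum_i [\varepsilon_i] + \sum_i [\gamma_i] - \sum_i [\varepsilon_{j(i)}],$$
and the first and last sums on the right cancel because $j$ is a permutation.
By (11.38), $T(n)$ gives the same result from the $\alpha_i'$'s as from the $\alpha_i$'s.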
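Now let $c$ be a 1-cycle on $X_0(N)$. With $\tau_0$ as base point in $\mathcal{H}$, we can
choose a path $[\tau_0, \gamma\tau_0]$ whose projection to $X_0(N)$ is homologous to $c$.
By (11.38), we can write
$$T(n)[\gamma] = \sum_{i=1}^{K} [\gamma_i]$$
with $\gamma_i$ as in (11.30). By (11.38), equation (11.39a) is the assertion that
$$\sum_i \int_{\tau_0}^{\gamma_i(\tau_0)} f(\tau)\,d\tau = \int_{\tau_0}^{\gamma(\tau_0)} T_2(n)f(\tau)\,d\tau. \tag{11.39b}$$
To prove (11.39b), we calculate the right side as
$$\begin{aligned}
\int_{\tau_0}^{\gamma(\tau_0)} \sum_i (f \circ [\alpha_i]_2)(\tau)\,d\tau & \qquad \text{by Proposition 9.12} \\
&= \sum_i \int_{\tau_0}^{\gamma(\tau_0)} (f \circ [\alpha_i]_2)(\tau)\,d\tau \\
&= \sum_i \int_{\alpha_i(\tau_0)}^{\alpha_i\gamma(\tau_0)} f(\tau')\,d\tau' \qquad \text{under } \tau' = \alpha_i(\tau) \\
&= \sum_i \int_{\alpha_i(\tau_0)}^{\gamma_i\alpha_{j(i)}(\tau_0)} f(\tau)\,d\tau \\
&= \sum_i \int_{\alpha_i(\tau_0)}^{\alpha_{j(i)}(\tau_0)} f(\tau)\,d\tau + \sum_i \int_{\alpha_{j(i)}(\tau_0)}^{\gamma_i\alpha_{j(i)}(\tau_0)} f(\tau)\,d\tau.
\end{aligned} \tag{11.40}$$
Since $j(\cdot)$ is a permutation, the sum of paths
$$\sum_i [\alpha_i(\tau_0), \alpha_{j(i)}(\tau_0)]$$
is a 1-cycle on $\mathcal{H}$. Hence the first term on the right side of (11.40) is 0
by the Cauchy Integral Theorem. The second term is equal to the left
side of (11.39b) since $\Phi_f(\gamma_i)$ in §1 is independent of base point. Thus
(11.39b) is proved, and the proposition follows.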
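In the setting of Proposition 11.23, let us write
$$\langle c, \omega\rangle = \int_c \omega \tag{11.41}$$
for $c$ in $H_1(\mathbb{Z}) = H_1(X_0(N),\mathbb{Z})$, so that the proposition says
$$\langle T(n)c, \omega\rangle = \langle c, T(n)\omega\rangle.$$
We can extend (11.41) by linearity to be defined for $c$ in $H_1(\mathbb{R}) =
H_1(X_0(N),\mathbb{R})$ or even for $c$ in $H_1(\mathbb{C})$, with $\omega$ in $\Omega_{\mathrm{hol}}(X_0(N))$. Let us
observe that
$$\bigl(\langle c, \omega\rangle = 0 \text{ for some } c \text{ in } H_1(\mathbb{R}) \text{ and all } \omega\bigr) \implies (c = 0). \tag{11.42}$$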
In fact, let $c_1,\ldots,c_{2g}$ be a $\mathbb{Z}$ basis of $H_1(\mathbb{Z})$, and write
$$c = \sum_{k=1}^{2g} r_kc_k \quad \text{with } r_k \in \mathbb{R}.$$
If $\omega_1,\ldots,\omega_g$ is a basis of $\Omega_{\mathrm{hol}}(X_0(N))$ over $\mathbb{C}$, then we have
$$\sum_{k=1}^{2g} r_k\langle c_k, \omega_j\rangle = 0$$
for all $j$. By Proposition 11.13, all the $r_k$ are 0. Thus $c = 0$. This proves
(11.42).
We would like to say that the $c$'s and $\omega$'s are in dual vector spaces,
with the $c$'s in $H_1(\mathbb{Z})$ constrained to the integer points. But the complex
vector space of $c$'s, namely $H_1(\mathbb{C})$, is $2g$-dimensional, while the vector
space of $\omega$'s, namely $\Omega_{\mathrm{hol}}(X_0(N))$, is $g$-dimensional. So we shall first
split $H_1(\mathbb{C})$ into two pieces of dimension $g$.
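The mapping $(\cdot)^* : \tau \to -\bar{\tau}$ of $\mathcal{H}$ into itself induces an involution of
$X_0(N)$. Namely if $\tau_2 = \gamma\tau_1$ with $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\Gamma_0(N)$, then the element
$\gamma^* = \begin{pmatrix} a & -b \\ -c & d \end{pmatrix}$ of $\Gamma_0(N)$ has $\tau_2^* = \gamma^*\tau_1^*$. This involution induces an
involution on $H_1(\mathbb{Z})$, which we extend by linearity to $H_1(\mathbb{R})$ and $H_1(\mathbb{C})$.
The vector spaces have eigenspace decompositions under $*$ that we write
as
$$\begin{aligned}
H_1(\mathbb{R}) &= H_1^+(\mathbb{R}) \oplus H_1^-(\mathbb{R}), \\
H_1(\mathbb{C}) &= H_1^+(\mathbb{C}) \oplus H_1^-(\mathbb{C}).
\end{aligned} \tag{11.43}$$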
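
The compatibility $(\gamma\tau)^* = \gamma^*\tau^*$ and the multiplicativity of $(\cdot)^*$ on matrices are easy to test numerically. The sketch below is only an illustration with our own helper names; it takes $\tau^* = -\bar{\tau}$ and $\gamma^*$ to be $\gamma$ with its off-diagonal entries negated, as in the paragraph above.

```python
def act(g, tau):
    """Linear fractional action of g = ((a, b), (c, d)) on a complex number tau."""
    (a, b), (c, d) = g
    return (a * tau + b) / (c * tau + d)

def star_point(tau):
    """The involution tau -> -conj(tau) of the upper half plane."""
    return -tau.conjugate()

def star_matrix(g):
    """The involution on matrices: negate the off-diagonal entries."""
    (a, b), (c, d) = g
    return ((a, -b), (-c, d))

def matmul(x, y):
    """Product of two 2-by-2 matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))
```

Multiplicativity holds because $\gamma^*$ is the conjugate of $\gamma$ by the diagonal matrix with entries $1, -1$.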
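Proposition 11.24.
(a) The spaces $H_1^+(\mathbb{C})$ and $H_1^-(\mathbb{C})$ each have dimension $g$, and the
pairing $\langle c, \omega\rangle$ exhibits the spaces $H_1^+(\mathbb{C})$ and $\Omega_{\mathrm{hol}}(X_0(N))$ as dual to one
another.
(b) The linear extension of $T(n)$ to $H_1(\mathbb{C})$ maps $H_1^+(\mathbb{C})$ into itself,
and the restriction of $T(n)$ to $H_1^+(\mathbb{C})$ is the transpose of $T_2(n)$ on
$\Omega_{\mathrm{hol}}(X_0(N))$.
(c) The intersection $H_1(\mathbb{Z}) \cap H_1^+(\mathbb{R})$ is a lattice in $H_1^+(\mathbb{R})$, and a $\mathbb{Z}$
basis for this intersection is a basis for $H_1^+(\mathbb{C})$.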
Proof. We write the involution on 1-cycles as $c \to c^*$. If we lift the
1-cycles to paths in $\mathcal{H}^*$ without changing their names and if $c$ is given
by $[\tau_0, \tau_1]$, then $c^*$ is given by $[\tau_0^*, \tau_1^*]$. Also if we realize members $\omega$ of
$\Omega_{\mathrm{hol}}(X_0(N))$ as functions in $S_2(\Gamma_0(N))$, then (11.37) allows us to write
the pairing as
$$\langle c, f\rangle = \int_c f(\tau)\,d\tau.$$
The space $S_2(\Gamma_0(N))$ has a conjugate-linear involution $f \to f^*$ given
by
$$f^*(\tau) = -\overline{f(\tau^*)}.$$
We readily check that $f^*$ is again in $S_2(\Gamma_0(N))$ and that
$$\langle c^*, f\rangle = \overline{\langle c, f^*\rangle} \quad \text{for } c \text{ in } H_1(\mathbb{C}).$$
If $f$ has $f = f^*$, then $if$ has $(if)^* = -if$. Hence the real subspace $S_{2,+}$
of $f$'s in $S_2(\Gamma_0(N))$ with $f^* = f$ has real dimension $g$, and so does the
real subspace $S_{2,-}$ with $f^* = -f$. If $c$ is in $H_1^+(\mathbb{R})$, then
$$\langle c, f\rangle = \langle c^*, f\rangle = \overline{\langle c, f^*\rangle}$$
shows that $\langle c, f\rangle$ is real for $f$ in $S_{2,+}$. If $\dim_{\mathbb{R}} H_1^+(\mathbb{R}) > g$, we can
therefore find some $c_0 \neq 0$ such that $\langle c_0, f\rangle = 0$ for all $f \in S_{2,+}$. Since
$\langle\cdot,\cdot\rangle$ is complex bilinear, $\langle c_0, if\rangle = 0$ also. But $if$ is the most general
member of $S_{2,-}$. So $\langle c_0, S_2(\Gamma_0(N))\rangle = 0$. From (11.42) we see that
$c_0 = 0$, contradiction. Thus $\dim_{\mathbb{R}} H_1^+(\mathbb{R}) \leq g$. A similar argument with
$c$ in $H_1^-(\mathbb{R})$ and $f$ in $S_{2,-}$ shows that $\dim_{\mathbb{R}} H_1^-(\mathbb{R}) \leq g$. By (11.43),
equality must hold in both cases.
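(b) In Proposition 11.23 we choose $\tau_0$ imaginary, so that if
$c \leftrightarrow [\tau_0, \gamma\tau_0]$, then $c^* \leftrightarrow [\tau_0, \gamma^*\tau_0]$. The formula for $T(n)$ is
$$T(n)[\gamma] = \sum_i [\gamma_i]$$
with $\alpha_i\gamma = \gamma_i\alpha_{j(i)}$. On matrices the operation $(\cdot)^*$ is multiplicative.
Thus
$$\alpha_i^*\gamma^* = \gamma_i^*\alpha_{j(i)}^*.$$
If we use the standard $\alpha$'s as in Lemma 9.14, then $\alpha_i^*$ and $\alpha_{j(i)}^*$ are
different members of the same set. Therefore
$$T(n)[\gamma^*] = \sum_i [\gamma_i^*] = (T(n)[\gamma])^*.$$
By linearity, $T(n)c^* = (T(n)c)^*$ on $H_1(\mathbb{C})$, and it follows that $T(n)$ maps
$H_1^+(\mathbb{C})$ into itself. By (a) and (11.39a), the action of $T(n)$ on $H_1^+(\mathbb{C})$ is
the transpose of the action of $T_2(n)$ on $\Omega_{\mathrm{hol}}(X_0(N))$.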
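(c) If $c$ is in $H_1(\mathbb{Z})$, then
$$c = \tfrac{1}{2}(c + c^*) + \tfrac{1}{2}(c - c^*) \quad \text{in } H_1^+(\mathbb{R}) \oplus H_1^-(\mathbb{R}).$$
Hence $\Lambda_1 = \{\tfrac{1}{2}(c + c^*) \mid c \in H_1(\mathbb{Z})\}$ spans $H_1^+(\mathbb{R})$. On the other hand,
$\Lambda_2 = \{c + c^* \mid c \in H_1(\mathbb{Z})\}$ is contained in $H_1(\mathbb{Z})$ and hence is discrete.
Since $\Lambda_2$ has finite index in $\Lambda_1$ and since $H_1(\mathbb{Z}) \cap H_1^+(\mathbb{R})$ lies between
$\Lambda_2$ and $\Lambda_1$, $H_1(\mathbb{Z}) \cap H_1^+(\mathbb{R})$ is a lattice in $H_1^+(\mathbb{R})$. This completes the
proof.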
Theorem 11.25. Fix a $\mathbb{Z}$ basis $c_1,\ldots,c_g$ of $H_1(\mathbb{Z}) \cap H_1^+(\mathbb{R})$, so that
$c_1,\ldots,c_g$ is a basis over $\mathbb{C}$ of $H_1^+(\mathbb{C})$. Let $f_1,\ldots,f_g$ be the dual basis
of $S_2(\Gamma_0(N))$. Then each Hecke operator $T_2(n)$ is given in the basis
$f_1,\ldots,f_g$ by a matrix with integer entries.
Remark. The existence of the bases $c_1,\ldots,c_g$ and $f_1,\ldots,f_g$ in the
theorem is assured by Proposition 11.24.
Proof. $T(n)$ acts on $H_1(\mathbb{Z})$, with linear extensions to $H_1(\mathbb{R})$ and
$H_1(\mathbb{C})$. By Proposition 11.24b, it maps $H_1^+(\mathbb{C})$ to itself, hence $H_1^+(\mathbb{R})$
to itself. Thus $T(n)$ maps $H_1(\mathbb{Z}) \cap H_1^+(\mathbb{R})$ to itself. Hence it is given in
the basis $c_1,\ldots,c_g$ by a matrix with integer entries. Proposition 11.24
shows that $T_2(n)$ is the transpose of $T(n)$, and the theorem follows.
Corollary 11.26. The eigenvalues of $T_2(n)$ on $S_2(\Gamma_0(N))$ are algebraic
integers.
Proof. By Theorem 11.25, $T_2(n)$ is given in a suitable basis by an
integer matrix. The characteristic polynomial is then a monic polynomial
with integer coefficients, and the roots must be algebraic integers.
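
The step from "integer matrix" to "monic integer characteristic polynomial" can be made concrete. The sketch below uses the Faddeev-LeVerrier recursion (our choice of method, not the text's) with exact rational arithmetic; the sample matrices in the usage note are arbitrary illustrations, not actual Hecke matrices.

```python
from fractions import Fraction

def char_poly(matrix):
    """Coefficients of det(x*I - A), highest degree first, via Faddeev-LeVerrier."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]

    def mul(p, q):
        # Plain n-by-n matrix product.
        return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    m = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]  # leading coefficient: the polynomial is monic
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c * I, where c is the most recent coefficient
        am = mul(a, m)
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(mul(a, m)[i][i] for i in range(n))
        coeffs.append(-trace / k)
    # For an integer matrix every coefficient is an integer.
    assert all(c.denominator == 1 for c in coeffs)
    return [int(c) for c in coeffs]
```

For instance, `char_poly([[0, 1], [1, 1]])` gives the coefficients of $x^2 - x - 1$, whose roots $(1 \pm \sqrt{5})/2$ are algebraic integers.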
Theorem 11.27. Let $c_1 \in S_2(\Gamma_0(N))'$ be the linear functional that
extracts the first Fourier coefficient, and let $\mathbf{T}$ be the commutative subalgebra
of $\mathrm{End}_{\mathbb{C}}(S_2(\Gamma_0(N)))$ generated by the identity and the Hecke
operators. Then the map $L : \mathbf{T} \to S_2(\Gamma_0(N))'$ given by
$$L(T) = c_1 \circ T \tag{11.44}$$
is one-one onto and exhibits the left regular representation of $\mathbf{T}$ on itself
as equivalent with the representation of $\mathbf{T}$ on $S_2(\Gamma_0(N))' \cong H_1^+(\mathbb{C})$. In
particular, $\dim_{\mathbb{C}} \mathbf{T} = g$.
REMARKS. We continue to write $g$ for the genus of $X_0(N)$. It will
be convenient in this proof to drop the subscript 2 on Hecke operators
$T_2(n)$.
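Proof. We can write $L(T) = \langle c_1 \circ T, \cdot\rangle$, where $\langle\cdot,\cdot\rangle$ is the pairing of
$S_2(\Gamma_0(N))'$ with $S_2(\Gamma_0(N))$. For $T_0$ and $T$ in $\mathbf{T}$, we have
$$\begin{aligned}
T_0(L(T))(f) &= \langle T_0(c_1 \circ T), f\rangle \\
&= \langle (c_1 \circ T), T_0f\rangle \\
&= (c_1 \circ T)(T_0f) \\
&= c_1(TT_0f)
\end{aligned}$$
and
$$L(T_0T)(f) = (c_1 \circ T_0T)(f) = c_1(T_0Tf).$$
These two are equal since $\mathbf{T}$ is commutative. Hence $L$ is equivariant.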
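Before proving that $L$ is one-one onto, we make a construction. Let
$f_1,\ldots,f_g$ be a basis of $S_2(\Gamma_0(N))$, and write
$$\mathbf{f}(\tau) = \begin{pmatrix} f_1(\tau) \\ \vdots \\ f_g(\tau) \end{pmatrix} = \sum_{n=1}^{\infty} c_ne^{2\pi in\tau}$$
with $c_n \in \mathbb{C}^g$. Our first observation is that
$$\{c_n\}_{n=1}^{\infty} \text{ spans } \mathbb{C}^g. \tag{11.45}$$
In fact, let $\ell$ be any linear functional on $\mathbb{C}^g$ with $\ell(c_n) = 0$ for all $n$.
Since $\ell$ is continuous,
$$\ell(\mathbf{f}(\tau)) = 0 \quad \text{for all } \tau. \tag{11.46}$$
Let $\tau_1,\ldots,\tau_r$ be such that the set of $r$ vectors $\mathbf{f}(\tau_i)$ with
$1 \leq i \leq r$ is a maximal linearly independent set among the infinite
set of all $\mathbf{f}(\tau)$. For each $\tau$ we can then find scalars $a_1(\tau),\ldots,a_r(\tau)$
such that
$$\begin{pmatrix} f_1(\tau) \\ \vdots \\ f_g(\tau) \end{pmatrix} = a_1(\tau)\begin{pmatrix} f_1(\tau_1) \\ \vdots \\ f_g(\tau_1) \end{pmatrix} + \cdots + a_r(\tau)\begin{pmatrix} f_1(\tau_r) \\ \vdots \\ f_g(\tau_r) \end{pmatrix}.$$
It follows that the functions $f_1,\ldots,f_g$ lie in the span of $a_1,\ldots,a_r$. Hence
$g \leq r$, and the vectors $\mathbf{f}(\tau)$ span $\mathbb{C}^g$. By (11.46), $\ell(\mathbb{C}^g) = 0$. Thus
$\ell = 0$, and (11.45) follows.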
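For each $T$ in $\mathbf{T}$, we define a $g$-by-$g$ matrix $A(T)$ by
$$\begin{pmatrix} Tf_1 \\ \vdots \\ Tf_g \end{pmatrix} = A(T)\begin{pmatrix} f_1 \\ \vdots \\ f_g \end{pmatrix}. \tag{11.47}$$
If we write
$$\begin{pmatrix} T(m)f_1 \\ \vdots \\ T(m)f_g \end{pmatrix} = \sum_{n=1}^{\infty} b_ne^{2\pi in\tau}$$
with $b_n \in \mathbb{C}^g$, we have
$$\sum_{n=1}^{\infty} b_ne^{2\pi in\tau} = \begin{pmatrix} T(m)f_1 \\ \vdots \\ T(m)f_g \end{pmatrix} = A(T(m))\sum_{n=1}^{\infty} c_ne^{2\pi in\tau}.$$
Equating coefficients, we obtain
$$b_n = A(T(m))c_n.$$
Referring to Proposition 9.15, we see that $b_1 = c_m$. Hence
$$c_m = A(T(m))c_1. \tag{11.48}$$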
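We can now prove that $L$ is one-one onto. Suppose that $L(T) = 0$.
Since $L$ is equivariant and $\mathbf{T}$ is commutative, we have
$$0 = T(m)L(T) = L(T(m)T) = L(TT(m)) = c_1 \circ TT(m) \tag{11.49}$$
for all $m$. Let $e_j$ be the $j$th standard basis vector of $\mathbb{C}^g$, and apply
(11.49) to $f_j$. Then
$$\begin{aligned}
0 = c_1(TT(m)f_j) &= c_1\left(\begin{pmatrix} TT(m)f_1 \\ \vdots \\ TT(m)f_g \end{pmatrix} \cdot e_j\right) \\
&= c_1\left(A(T)A(T(m))\begin{pmatrix} f_1 \\ \vdots \\ f_g \end{pmatrix} \cdot e_j\right) \qquad \text{by (11.47)} \\
&= A(T)A(T(m))\begin{pmatrix} c_1(f_1) \\ \vdots \\ c_1(f_g) \end{pmatrix} \cdot e_j \qquad \text{by linearity of } c_1(\cdot) \\
&= A(T)A(T(m))c_1 \cdot e_j \\
&= A(T)c_m \cdot e_j \qquad \text{by (11.48).}
\end{aligned}$$
Since $j$ is arbitrary, $A(T)c_m = 0$. By (11.45), $A(T) = 0$. Substituting
into (11.47), we have $Tf_1 = \cdots = Tf_g = 0$. Thus $T = 0$, and $L$ is
one-one.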
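Suppose $L(\mathbf{T})$ does not span $S_2(\Gamma_0(N))'$. Choose $f \neq 0$ with
$$L(T)(f) = 0 \quad \text{for all } T \in \mathbf{T}. \tag{11.50}$$
Write $f = \sum_{j=1}^{g} z_jf_j$. We have
$$\begin{aligned}
L(T(m))f_j = c_1(T(m)f_j) &= c_1\left(\begin{pmatrix} T(m)f_1 \\ \vdots \\ T(m)f_g \end{pmatrix} \cdot e_j\right) \\
&= c_1\left(A(T(m))\begin{pmatrix} f_1 \\ \vdots \\ f_g \end{pmatrix} \cdot e_j\right) \qquad \text{by (11.47)} \\
&= A(T(m))\begin{pmatrix} c_1(f_1) \\ \vdots \\ c_1(f_g) \end{pmatrix} \cdot e_j \qquad \text{by linearity of } c_1(\cdot) \\
&= A(T(m))c_1 \cdot e_j \\
&= c_m \cdot e_j \qquad \text{by (11.48).}
\end{aligned}$$
Multiplying by $z_j$ and summing on $j$, we have
$$L(T(m))f = c_m \cdot \begin{pmatrix} z_1 \\ \vdots \\ z_g \end{pmatrix} \tag{11.51}$$
for all $m$. The left side of (11.51) is 0 by (11.50). By (11.45), $\begin{pmatrix} z_1 \\ \vdots \\ z_g \end{pmatrix}$ is
the 0 vector. Thus $f = 0$, contradiction. We conclude that $L$ is onto.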
6. Modular Function $j(\tau)$
Our objective in this section is to address the special properties of
the compact Riemann surface $X_0(N)$ that allow us to realize it in a
particular way as the set of $\mathbb{C}$ points of a projective curve defined over
$\mathbb{Q}$. The function $j(\tau)$ introduced in §VIII.2 will play a critical role in
our construction.
A meromorphic function $f$ on $\mathcal{H}$ is said to be automorphic of weight
$k$ and level $N$ if $f \circ [\gamma]_k = f$ for all $\gamma \in \Gamma_0(N)$ and such that the following
condition holds: For every cusp $\beta^{-1}(\infty)$ relative to $\Gamma_0(N)$, there is some
$C_\beta$ such that $f \circ [\beta^{-1}]_k(\tau)$ is analytic for $\operatorname{Im}\tau > C_\beta$ with an expansion
$$f \circ [\beta^{-1}]_k(\tau) = \sum_{n=-\infty}^{\infty} c_n^{(\beta)}q_h^n$$
that has only finitely many $c_n^{(\beta)} \neq 0$ for $n < 0$. The space of such
functions is denoted $A_k(\Gamma_0(N))$. It is an enlargement of the space
$M_k(\Gamma_0(N))$ of modular forms so as to allow poles in $\mathcal{H}$ and at the cusps.
Our interest will be in the space $A_0(\Gamma_0(N))$, which is a field. This
space corresponds to the field $K(X_0(N))$ of meromorphic functions on
$X_0(N)$ in the same way that $S_2(\Gamma_0(N))$ corresponds to the space of
holomorphic differentials. In more detail, let $\{(U_i, \varphi_i)\}_{i\in I}$ be an atlas
for $X_0(N)$. A system $\{f_i\}_{i\in I}$ of scalar-valued meromorphic functions $f_i$
on $U_i$ yields a meromorphic function on $X_0(N)$ if $f_i \circ \varphi_i^{-1} = f_j \circ \varphi_i^{-1}$
on $\varphi_i(U_i \cap U_j)$ for all $i$ and $j$. Let $\pi : \mathcal{H}^* \to X_0(N)$ be the quotient
map. If $\{f_i\}_{i\in I}$ is a meromorphic function on $X_0(N)$, then we define a
corresponding scalar-valued function $f$ on $\mathcal{H}$ by
$$f(\tau) = f_i(\pi(\tau)) \tag{11.52}$$
if $(U_i, \varphi_i)$ is a chart about $\pi(\tau)$. The same kind of argument as for
Proposition 11.6 yields the following result: The map $\{f_i\} \to f$ of (11.52)
is well defined and is a field isomorphism of $K(X_0(N))$ onto $A_0(\Gamma_0(N))$.
Typically we shall use the same name for a member of $A_0(\Gamma_0(N))$ and
its counterpart in $K(X_0(N))$.
Let us take $N = 1$ for the moment. The function $j(\tau)$ in $A_0(\Gamma_0(N))$
comes from a member of $K(X_0(1))$, according to the proposition, and
we are going to denote this member of $K(X_0(N))$ by $j$ also. For the
case of $\Gamma_0(1) = SL(2,\mathbb{Z})$, there is just one cusp, and $j(\tau)$ has a simple
pole there, according to Proposition 8.2. Moreover, $j$ is analytic on $\mathcal{H}$
by (8.3).
Proposition 11.28. $j : SL(2,\mathbb{Z})\backslash\mathcal{H} \to \mathbb{C}$ is one-one onto.
Proof. Fix $z_0$ in $\mathbb{C}$. As a meromorphic function on $X_0(1)$, $j - z_0$ has
divisor of degree 0, by Proposition 11.8. Since the only pole is at $\pi(\infty)$
and is simple, the divisor must be
    $[j - z_0] = p - \pi(\infty)$
for some $p \neq \pi(\infty)$ in $SL(2,\mathbb{Z})\backslash\mathcal{H}^*$. Then $j$ takes the value $z_0$ at $p \in
SL(2,\mathbb{Z})\backslash\mathcal{H}$ and only there.
Lemma 11.29. Let $f$ in $A_0(\Gamma_0(1))$ be analytic on $\mathcal{H}$ and have $q$
expansion $f(\tau) = \sum_{n=-M}^{\infty} c_n q^n$ at $\infty$. Then $f$ is a polynomial in $j$ with
coefficients in the module over $\mathbb{Z}$ generated by the coefficients $c_n$.
Proof. We induct on $M$ for $M \geq 0$. For $M = 0$, $f$ corresponds to
an analytic function on $X_0(1)$ and must therefore be constant. Hence $f$
is the polynomial $c_0$ in $j$ of degree 0. Assuming the result for $M - 1$,
we observe from Proposition 8.2 that $f - c_{-M} j^M$ has a $q$ expansion
$\sum_{n=-M+1}^{\infty} d_n q^n$. Since the coefficients of $j$ are in $\mathbb{Z}$, each $d_n$ is in $c_n +
c_{-M}\mathbb{Z}$. Hence the lemma follows by induction.
Theorem 11.30. $K(X_0(1)) = \mathbb{C}(j)$.
Proof. If $f$ is in $K(X_0(1))$, then $f$ has only a finite number of poles.
If $f$ has a pole at a point $p \neq \pi(\infty)$, then Proposition 11.28 shows that
$(j - j(p))^{-1}$ has a simple pole at $p$ and no other pole. Arguing as in the
proof of Lemma 11.29, we can subtract from $f$ a linear combination of
powers of $(j - j(p))^{-1}$ and obtain a function with no pole at $p$ and with
no new pole. Continuing in this way, we can reduce to the case that $f$
has its only pole at $\pi(\infty)$. Then Lemma 11.29 completes the proof.
Now we consider general $N$. We have denoted by $M^*(N)$ the set
of 2-by-2 primitive integer matrices of determinant $N$. Let us write
$\Gamma = \Gamma_0(1) = SL(2,\mathbb{Z})$. Lemma 9.1 says that
    $M^*(N) = \Gamma \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} \Gamma = \bigcup_{i=1}^{\psi(N)} \Gamma\alpha_i$    (11.53)
and gives an explicit set of coset representatives. The number $\psi(N)$ is
given by (9.5). The modular polynomial of order $N$ is the polynomial
$\Phi_N(X) = \Phi_N(j,X)$ of degree $\psi(N)$ defined by
    $\Phi_N(X) = \prod_{i=1}^{\psi(N)} (X - j \circ \alpha_i)$.    (11.54)
For $\gamma \in \Gamma$, we have $j \circ (\gamma\alpha_i) = j \circ [\gamma]_0 \circ \alpha_i = j \circ \alpha_i$; hence $\Phi_N(X)$ does
not depend on the choice of coset representatives.
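The size of the set of coset representatives can be checked numerically. The sketch below assumes that the representatives of Lemma 9.1 are the upper triangular matrices with entries $a, b, d$, where $ad = N$, $0 \leq b < d$, and $\gcd(a,b,d) = 1$ (the form used in the proofs later in this section), and that $\psi(N) = N\prod_{p \mid N}(1 + 1/p)$. The function names are illustrative only.

```python
from math import gcd

def coset_representatives(N):
    """Triples (a, b, d) standing for the matrix [[a, b], [0, d]]
    with ad = N, 0 <= b < d, and gcd(a, b, d) = 1 (assumed form of Lemma 9.1)."""
    reps = []
    for d in range(1, N + 1):
        if N % d == 0:
            a = N // d
            for b in range(d):
                if gcd(gcd(a, b), d) == 1:
                    reps.append((a, b, d))
    return reps

def psi(N):
    """psi(N) = N * product over primes p dividing N of (1 + 1/p), in integer arithmetic."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n + 1)
    return result

for N in (1, 2, 4, 6, 12):
    print(N, len(coset_representatives(N)), psi(N))
```

For each $N$ printed, the number of representatives agrees with $\psi(N)$, which is the degree of $\Phi_N$.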
The first thing to notice about $\Phi_N$ is that one of its roots is
$j_N = j \circ \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$. The same argument as with Examples 1 and 2 of §IX.3
shows that both $j$ and $j_N$ are in the field $K(X_0(N))$. In fact, Theorem
11.32 below will show that $j_N$ is algebraic over $\mathbb{C}(j)$ and that $\Phi_N$ is
its minimal polynomial. The theorem says even more, restricting the
nature of the coefficients.
Lemma 11.31. With $\alpha_1, \ldots, \alpha_{\psi(N)}$ as in Lemma 9.1, the functions
$j \circ \alpha_1, \ldots, j \circ \alpha_{\psi(N)}$ are distinct on $\mathcal{H}$.
Proof. Let us write
    $j(\tau) = \frac{1}{q} + P(q)$,    (11.55)
where $P$ is a power series with integer coefficients. If $\alpha_i = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ is
as in Lemma 9.1 and if $q_d = e^{2\pi i\tau/d}$ and $\zeta_d = e^{2\pi i/d}$, then
    $j \circ \alpha_i(\tau) = j\left(\frac{a\tau + b}{d}\right) = \zeta_d^{-b} q_d^{-a} + P(\zeta_d^{b} q_d^{a})$.    (11.56)
Suppose $\alpha_{i'} = \begin{pmatrix} a' & b' \\ 0 & d' \end{pmatrix}$ has $j \circ \alpha_i = j \circ \alpha_{i'}$. Taking the quotient and
letting $q$ tend to 0 (i.e., letting $\mathrm{Im}(\tau)$ tend to $+\infty$), we obtain
    $q_d^{a}\zeta_d^{b} = q_{d'}^{a'}\zeta_{d'}^{b'}$.    (11.57)
It follows first that $\frac{a}{d} = \frac{a'}{d'}$. Since $ad = a'd' = N$ and all are positive,
we obtain $a = a'$ and $d = d'$. Substituting into (11.57), we obtain $b = b'$.
Thus $\alpha_i = \alpha_{i'}$.
Theorem 11.32. The polynomial $\Phi_N(X)$ has coefficients in $\mathbb{Z}[j]$ and
is irreducible over $\mathbb{C}(j)$.
Proof. Any $\gamma$ in $\Gamma$ acts on $j \circ \alpha_i$, sending it to $j \circ \alpha_i \circ \gamma$. By
(11.53) the result is some $j \circ \alpha_k$, and hence the effect of $\gamma$ is to permute
these functions. Since all of the functions contribute once to $\Phi_N$, the
coefficients of $\Phi_N$ are invariant under $\Gamma$. The coefficients are elementary
symmetric polynomials in the functions $j \circ \alpha_i$ and thus are analytic on
$\mathcal{H}$. The same argument as in Proposition 9.5 shows that each $j \circ \alpha_i$ has a
meromorphic $q_N$ expansion at $\infty$, and hence the same thing is true of the
coefficients of $\Phi_N$. It follows from the $\Gamma$ invariance that the $q$ expansions
of the coefficients of $\Phi_N$ are meromorphic at $\infty$. By Theorem 11.30, the
coefficients of $\Phi_N$ are in $\mathbb{C}(j)$.
The polynomial $\Phi_N$ splits in the field $\mathbb{C}(j \circ \alpha_1, \ldots, j \circ \alpha_{\psi(N)})$, and $\Gamma$
acts as automorphisms of this field, fixing $\mathbb{C}(j)$ and permuting the roots
of $\Phi_N$. Since $\alpha_i$ is in $\Gamma\alpha_1\Gamma$, we have $\alpha_i = \gamma\alpha_1\gamma'$ or $\gamma^{-1}\alpha_i = \alpha_1\gamma'$. Thus
$\Gamma$ acts transitively on the roots. By Lemma 11.31 the roots are distinct.
Consequently $\Phi_N$ is irreducible over $\mathbb{C}(j)$.
We are left with showing that the coefficients are actually in $\mathbb{Z}[j]$. Let
us write $j(\tau)$ and $j \circ \alpha_i(\tau)$ as in (11.55) and (11.56). When we form the
elementary symmetric polynomials in the functions (11.56), we know
that the result involves only integral powers of $q$, and we can see that
the coefficients of this resulting series are in $\mathbb{Z}[\zeta_N]$ since $d \mid N$.
Regard the coefficients of the series as in $\mathbb{Q}(\zeta_N)$, and consider the
automorphism of this field given by $\zeta_N \to \zeta_N^r$, where $\mathrm{GCD}(r,N) = 1$.
The expression (11.56) is mapped to
    $\zeta_d^{-b'} q_d^{-a} + P(\zeta_d^{b'} q_d^{a})$,
where $b'$ is the least residue $\geq 0$ of $rb$ modulo $d$. This expression is
just $j \circ \alpha_{i'}(\tau)$, where $\alpha_{i'} = \begin{pmatrix} a & b' \\ 0 & d \end{pmatrix}$. The map $(a,b,d) \to (a,b',d)$
represents a permutation of the representatives $\alpha_i$. Thus the effect of
$\zeta_N \to \zeta_N^r$ on the coefficients of the $q$ series expansion of a coefficient of
$\Phi_N$ is the same as the effect of a certain permutation of the roots, i.e.,
to leave things fixed. Since $\mathbb{Q}(\zeta_N)$ is Galois over $\mathbb{Q}$, the coefficients of
the $q$ series expansion of a coefficient of $\Phi_N$ are in $\mathbb{Q} \cap \mathbb{Z}[\zeta_N] = \mathbb{Z}$. By
Lemma 11.29 the coefficients of $\Phi_N$ are in $\mathbb{Z}[j]$.
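The last step can be made concrete for $N = 2$. The sketch below runs the induction of Lemma 11.29 as an algorithm: it repeatedly subtracts the leading coefficient times a power of $j$. It is applied to the sum of the three roots $j(2\tau) + j(\tau/2) + j((\tau+1)/2)$ of $\Phi_2$, whose $q$ expansion begins $q^{-2} + 0\cdot q^{-1} + 3\cdot 744$. The route used to compute the expansion of $j$ (as $E_4^3$ divided by $q\prod(1-q^k)^{24}$) and all function names are assumptions of this illustration, not taken from the text.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def series_mul(a, b, n):
    """Product of two power series (coefficient lists), truncated mod q^n."""
    c = [0] * n
    for i, x in enumerate(a[:n]):
        for k, y in enumerate(b[:n - i]):
            c[i + k] += x * y
    return c

def qj_series(n):
    """First n coefficients of q*j, computed as E4^3 / prod_k (1 - q^k)^24."""
    e4 = [1] + [240 * sigma3(k) for k in range(1, n)]
    num = series_mul(series_mul(e4, e4, n), e4, n)
    den = [1] + [0] * (n - 1)
    for k in range(1, n):
        for _ in range(24):
            for i in range(n - 1, k - 1, -1):   # multiply in place by (1 - q^k)
                den[i] -= den[i - k]
    quot = [0] * n
    for i in range(n):                          # long division; den[0] == 1
        quot[i] = num[i] - sum(quot[m] * den[i - m] for m in range(i))
    return quot

def polynomial_in_j(principal):
    """principal = [c_{-M}, ..., c_{-1}, c_0]; returns [a_M, ..., a_0]
    with f = sum of a_m j^m, following the induction of Lemma 11.29."""
    M = len(principal) - 1
    qj = qj_series(M + 1)
    rest = list(principal)                      # rest[i] is the coefficient of q^(i - M)
    coeffs = []
    for m in range(M, -1, -1):
        lead = rest[M - m]
        coeffs.append(lead)
        power = [1] + [0] * M                   # (q j)^m mod q^(M + 1)
        for _ in range(m):
            power = series_mul(power, qj, M + 1)
        for i in range(m + 1):                  # subtract lead * j^m, which starts at q^(-m)
            rest[M - m + i] -= lead * power[i]
    return coeffs

c0 = qj_series(2)[1]                            # constant term of j
print(polynomial_in_j([1, 0, 3 * c0]))
```

The result says that the sum of the roots of $\Phi_2$ is $j^2 - 1488j + 162000$, a polynomial in $j$ with integer coefficients, as the theorem predicts.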
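Theorem 11.33. $K(X_0(N)) = \mathbb{C}(j, j_N)$.
The proof will use the following two lemmas.
Lemma 11.34. If $g \in \Gamma$ has $j(N(g\tau)) = j(N\tau)$ for all $\tau \in \mathcal{H}$, then $g$
is in $\Gamma_0(N)$.
Proof. Let $\alpha = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$. Since $N\tau = \alpha(\tau)$, we have
    $j(\alpha g\alpha^{-1}N\tau) = j(N\tau)$
for all $\tau$ in $\mathcal{H}$. Hence
    $j(\alpha g\alpha^{-1}\tau) = j(\tau)$
for all $\tau$ in $\mathcal{H}$. By Proposition 11.28, there exists $\beta_\tau$ in $\Gamma$ with
    $\alpha g\alpha^{-1}(\tau) = \beta_\tau(\tau)$
for all $\tau$ in $\mathcal{H}$. Fix $\tau_0$ in the interior of the standard fundamental domain
$R$ of $\Gamma$. The point $\alpha g\alpha^{-1}(\tau_0)$ is in the interior of $\beta_{\tau_0}(R)$ and hence so
is $\alpha g\alpha^{-1}(\tau)$ for $\tau$ near $\tau_0$. Consequently $\beta_\tau = \pm\beta_{\tau_0}$ for $\tau$ near $\tau_0$. By
analyticity $\alpha g\alpha^{-1} = \pm\beta_{\tau_0}$. Consequently $g$ is in
    $SL(2,\mathbb{Z}) \cap \alpha^{-1}SL(2,\mathbb{Z})\alpha$,
and this is $\Gamma_0(N)$ by Lemma 9.2.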
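The final identification can be seen by a direct matrix computation: for $g$ with rows $(a, b)$ and $(c, d)$, the conjugate $\alpha g\alpha^{-1}$ has rows $(a, Nb)$ and $(c/N, d)$, so it is integral exactly when $N$ divides $c$. A minimal sketch of this check follows; the helper names are illustrative and matrices are stored as tuples $(a, b, c, d)$.

```python
from fractions import Fraction

def conjugate_by_alpha(g, N):
    """Return alpha * g * alpha^{-1} for alpha = [[N, 0], [0, 1]] and g = (a, b, c, d)."""
    a, b, c, d = g
    return (Fraction(a), Fraction(N * b), Fraction(c, N), Fraction(d))

def is_integral(m):
    """True when every entry of the matrix is an integer."""
    return all(x.denominator == 1 for x in m)

def in_gamma0(g, N):
    """Membership in Gamma_0(N) for a matrix g already known to lie in SL(2, Z)."""
    return g[2] % N == 0

N = 6
for g in [(1, 0, 6, 1), (1, 1, 1, 2), (7, 1, 6, 1), (5, 2, 2, 1)]:
    print(g, is_integral(conjugate_by_alpha(g, N)), in_gamma0(g, N))
```

In each case the two tests agree, which is the content of the equality $SL(2,\mathbb{Z}) \cap \alpha^{-1}SL(2,\mathbb{Z})\alpha = \Gamma_0(N)$ on these samples.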
Lemma 11.35. Let $F$ be a field of characteristic 0, and let $G$ be a
finite subgroup of $\mathrm{Aut}_{\mathbb{Q}}(F)$. Then $F$ is a finite Galois extension of
    $F^G = \{x \in F \mid \varphi(x) = x$ for all $\varphi \in G\}$,
and $\mathrm{Aut}_{F^G}(F) = G$.
Proof. For $x \in F$, let $f_x(X)$ be the member of $F[X]$ given by
    $f_x(X) = \prod_{\varphi \in G} (X - \varphi(x))$.    (11.58)
Then $f_x$ is in $F^G[X]$. Since $f_x(x) = 0$, $x$ is algebraic over $F^G$ and
$[F^G(x) : F^G] \leq |G|$. Choose $y$ in $F$ such that $[F^G(y) : F^G]$ is maximal.
For any $z$ in $F$, the Theorem of the Primitive Element yields some $w$ in
$F$ such that $F^G(z,y) = F^G(w)$. By maximality
    $[F^G(z,y) : F^G] = [F^G(w) : F^G] \leq [F^G(y) : F^G]$,
and thus $z$ is in $F^G(y)$. Therefore $F = F^G(y)$ and $[F : F^G] \leq |G|$. In
particular, $F$ is a finite algebraic extension of $F^G$.
Fix an algebraic closure $\bar{F}$ of $F$. If $\psi$ is in $\mathrm{Hom}_{F^G}(F, \bar{F})$, then
$f_y(\psi(y)) = 0$ since $f_y$ has coefficients in $F^G$. Referring to (11.58), we
see that $\psi(y) = \varphi(y)$ for some $\varphi$ in $G$. Since $y$ generates $F$ over $F^G$,
we have $\psi = \varphi$ on $F$. We conclude that every member of $\mathrm{Hom}_{F^G}(F, \bar{F})$
carries $F$ into itself and coincides with a member of $G$. The first of these
conclusions implies that $F$ is a Galois extension of $F^G$, and the second
implies that $G$ is the Galois group.
PROOF OF THEOREM 11.33. Let $\Gamma(N)$ be the principal congruence
subgroup of level $N$. Let $K = A_0(\Gamma(N))$ be the field of meromorphic
functions $f$ on $\mathcal{H}$ such that $f \circ [\gamma]_0 = f$ for all $\gamma \in \Gamma(N)$ and such that
the following condition holds: For each $\beta$ in $\Gamma = SL(2,\mathbb{Z})$, there is some
$C_\beta$ such that $f \circ [\beta^{-1}]_0(\tau)$ is analytic for $\mathrm{Im}\,\tau > C_\beta$ with an expansion
    $f \circ [\beta^{-1}]_0(\tau) = \sum_{n=-\infty}^{\infty} c_n^{(\beta)} q_N^n$
that has only finitely many $c_n^{(\beta)} \neq 0$ for $n < 0$.
Then $A_0(\Gamma_0(N)) \cong K(X_0(N))$ is contained in $K$. The group $\Gamma$ acts
on $K$ by automorphisms under $f \to f \circ \alpha = f \circ [\alpha]_0$, and $\Gamma(N)$ fixes $K$.
Hence $G = \Gamma/\Gamma(N)$ acts on $K$. By Lemma 11.35, $K$ is a finite Galois
extension of $K^G = K^\Gamma$, and the Galois group is $G/G_0$, where $G_0$ is the
subgroup of $G$ that fixes $K$. A member $f$ of $K^\Gamma$ has $f = f \circ [\alpha]_0$ for all
$\alpha \in \Gamma$, hence is in $K(X_0(1))$, which Theorem 11.30 identifies as $\mathbb{C}(j)$.
Thus $K$ is a Galois extension of $\mathbb{C}(j)$ with Galois group $G/G_0$.
Let us determine the Galois group $G'$ of $K$ over the intermediate
field $\mathbb{C}(j,j_N)$. If $g$ is in $G'$, we can regard $g$ as a member of $\Gamma$ satisfying
$j \circ N \circ g = j \circ N$. By Lemma 11.34, $g$ is in $\Gamma_0(N)$. Conversely $\Gamma_0(N)/\Gamma(N)$
does fix $\mathbb{C}(j,j_N)$. Hence $G' = (\Gamma_0(N)/\Gamma(N))/G_0$.
By the Fundamental Theorem of Galois Theory,
    $\mathbb{C}(j,j_N) = K^{\Gamma_0(N)}$.
Since $K(X_0(N)) \subseteq K$, we have $K(X_0(N)) \subseteq K^{\Gamma_0(N)}$. On the other
hand, comparison of the definitions of $K(X_0(N))$ and $K$ shows that
$K(X_0(N)) \supseteq K^{\Gamma_0(N)}$. Thus $\mathbb{C}(j,j_N) = K^{\Gamma_0(N)} = K(X_0(N))$.
The $\mathbb{Q}$ structure on $X_0(N)$ will result from a relationship between
$K(X_0(N)) = \mathbb{C}(j,j_N)$ and $\mathbb{Q}(j,j_N)$ that is given by the equivalent
conditions in the next theorem, in which we eventually will take $x = j$,
$y = j_N$, and $k_0 = \mathbb{Q}$. The conditions are satisfied as a consequence of
Theorem 11.32.
Theorem 11.36. If $\mathbb{C}(x,y)$ is a field with $x$ transcendental over $\mathbb{C}$
and $y$ algebraic over $\mathbb{C}(x)$, then the following conditions on a subfield $k_0$
of $\mathbb{C}$ are equivalent:
(a) $y$ is algebraic over $k_0(x)$, and the minimal polynomial of $y$ over
$k_0(x)$ remains irreducible over $\mathbb{C}(x)$,
(b) $[k_0(x,y) : k_0(x)] = [\mathbb{C}(x,y) : \mathbb{C}(x)]$,
(c) $\mathbb{C} \cap k_0(x,y) = k_0$,
(d) $\bar{k}_0 \cap k_0(x,y) = k_0$.
REMARK. Condition (d) may be restated as: $k_0$ is algebraically closed
in $k_0(x,y)$.
Proof. (a) $\Leftrightarrow$ (b). This is clear.
(b) $\Rightarrow$ (c). Let $\omega$ be in $\mathbb{C}$ and in $k_0(x,y)$. From the inclusions
    $k_0(x,y) \supseteq k_0(x,\omega) \supseteq k_0(x)$
    $\mathbb{C}(x,y) \supseteq \mathbb{C}(x,\omega) \supseteq \mathbb{C}(x)$,
we have
    $[k_0(x,y) : k_0(x)] = [k_0(x,y) : k_0(x,\omega)][k_0(x,\omega) : k_0(x)]$
    $[\mathbb{C}(x,y) : \mathbb{C}(x)] = [\mathbb{C}(x,y) : \mathbb{C}(x,\omega)][\mathbb{C}(x,\omega) : \mathbb{C}(x)]$,    (11.59)
with the left sides equal, by (b). Since $y$ is algebraic over $k_0(x,\omega)$ and
$k_0(x)$, we have
    $[k_0(x,y) : k_0(x,\omega)] \geq [\mathbb{C}(x,y) : \mathbb{C}(x,\omega)]$
    $[k_0(x,\omega) : k_0(x)] \geq [\mathbb{C}(x,\omega) : \mathbb{C}(x)]$.    (11.60)
Comparing (11.59) and (11.60), we see that actually the inequalities
(11.60) are equalities. Since $\omega$ is in $\mathbb{C}$, we have
    $[k_0(x,\omega) : k_0(x)] = [\mathbb{C}(x,\omega) : \mathbb{C}(x)] = 1$,
and we conclude that $\omega$ is in $k_0(x)$. Let $\omega = P_1(x)/P_2(x)$ with $P_1$ and
$P_2$ in $k_0[x]$ and $P_2 \neq 0$. Put $P(X) = P_1(X) - \omega P_2(X)$ in $\mathbb{C}[X]$. If $P$ is
not the 0 polynomial, then $P(x) = 0$ shows that $x$ is algebraic over $\mathbb{C}$,
contradiction. So $P = 0$ and every coefficient of $P_1 - \omega P_2$ is 0. Since $P_1$
and $P_2$ have coefficients in $k_0$ and $P_2$ is not 0, it follows that $\omega$ is in $k_0$.
(c) $\Rightarrow$ (d). This is clear.
(d) $\Rightarrow$ (b). First let us show that (d) implies
    $[k_0(x,y) : k_0(x)] = [\bar{k}_0(x,y) : \bar{k}_0(x)]$.    (11.61)
In fact, otherwise we can find a finite Galois extension $k_1$ of $k_0$ for which
    $[k_0(x,y) : k_0(x)] > [k_1(x,y) : k_1(x)]$.
Hence
    $[k_1(x) : k_0(x)] > [k_1(x,y) : k_0(x,y)]$.    (11.62)
The left side here equals $[k_1 : k_0]$ because any element of $k_1(x)$ can have
its denominator "rationalized" by $\mathrm{Aut}_{k_0}(k_1)$ so as to be written in the
form $P_1(x)/P_2(x)$ with $P_1$ in $k_1[x]$ and $P_2$ in $k_0[x]$. Thus $k_1(x)$ is Galois
over $k_0(x)$, and $k_1(x,y)$ is Galois over $k_0(x,y)$. Since $k_1$ is Galois over
$k_0$, any $k_0$ isomorphism of $k_1$ into a field containing $k_1$ sends $k_1$ into
$k_1$. Hence every member of $\mathrm{Aut}_{k_0(x,y)}(k_1(x,y))$ sends $k_1$ into $k_1$ and
fixes $x$, thus sends $k_1(x)$ into itself while fixing $k_0(x)$. Thus we have a
homomorphism
    $\mathrm{Aut}_{k_0(x,y)}(k_1(x,y)) \to \mathrm{Aut}_{k_0(x)}(k_1(x))$
that is clearly one-one. Let $H$ be the image in $\mathrm{Aut}_{k_0(x)}(k_1(x))$. The
assumed inequality (11.62) means that $H$ is not all of $\mathrm{Aut}_{k_0(x)}(k_1(x))$.
Let $k_1'$ be the field with $k_1 \supseteq k_1' \supseteq k_0$ and $k_1'(x) = k_1(x)^H$; here $k_1'(x) \neq
k_0(x)$. Form $k_1'(x,y)$. Every element of $\mathrm{Aut}_{k_0(x,y)}(k_1(x,y))$ fixes this,
by construction. Hence $k_1'(x,y) = k_0(x,y)$. Thus $k_1' \subseteq k_0(x,y)$. By (d),
$k_1' = k_0$. Since $k_1'(x) \neq k_0(x)$, (11.62) has led to a contradiction. Thus
(11.61) holds.
To complete the proof of (b), we shall show that
    $[\bar{k}_0(x,y) : \bar{k}_0(x)] = [\mathbb{C}(x,y) : \mathbb{C}(x)]$,    (11.63)
even without assuming (d). Let $P(Y) \in \bar{k}_0(x)[Y]$ be an irreducible
polynomial with $P(y) = 0$. Multiplying the coefficients by a suitable member
of $\bar{k}_0(x)$, we may assume that $P(Y)$ is in $\bar{k}_0[x][Y]$ and is primitive over
$\bar{k}_0[x]$. We are to prove that $P(Y)$ remains irreducible over $\mathbb{C}(x)$.
Suppose on the contrary that $P$ factors nontrivially over $\mathbb{C}(x)$. By Gauss's
Lemma it factors over $\mathbb{C}[x]$. We can thus write
    $P(Y) = Q(Y)R(Y)$    (11.64)
with $Q$ and $R$ primitive over $\mathbb{C}[x]$. Let us make $x$ explicit by writing
(11.64) as
    $P(x,Y) = Q(x,Y)R(x,Y)$    (11.65)
with $P(x,Y)$ in $\bar{k}_0[x,Y]$ and with $Q(x,Y)$ and $R(x,Y)$ in $\mathbb{C}[x,Y]$.
Applying the evaluation homomorphism $x \to c \in \bar{k}_0$, we obtain a
factorization of the polynomial $P(c,Y)$ of one variable, and we see that $Q(c,Y)$
and $R(c,Y)$ are in $\bar{k}_0[Y]$.
Write $Q(X,Y) = \sum Q_i(X)Y^i$. We have seen that $Q_i(c)$ is in $\bar{k}_0$ for
every $c$ in $\bar{k}_0$. If $\deg Q_i = n$, we can construct $Q_i^\#(X)$ in $\bar{k}_0[X]$ of degree
$n$ with $Q_i^\#(l) = Q_i(l)$ for $1 \leq l \leq n$. Then it follows that
    $Q_i^\#(X) - Q_i(X) = c\prod_{l=1}^{n} (X - l)$
with $c$ in $\mathbb{C}$. Evaluating at $n + 1$, we see that $c$ is in $\bar{k}_0$. Hence $Q_i$ is
in $\bar{k}_0[X]$, and $Q(X,Y)$ is in $\bar{k}_0[X,Y]$. Similarly $R(X,Y)$ is in $\bar{k}_0[X,Y]$.
Thus (11.65) contradicts the irreducibility of $P$ over $\bar{k}_0(x)$. This proves
(11.63), and (b) follows by combining (11.61) and (11.63).
7. Varieties and Curves
In this section and the next we shall complete the task of realizing
Xo(N) in a particular way as the set of C points of a projective curve
defined over Q. So far our discussion of curves has been limited to plane
curves. In order to treat Xo(N) adequately, we shall need to carry out
two additional steps. First, in this section, we extend our discussion from
plane curves to varieties and curves in higher dimensional projective
spaces, and we define types of mappings between such varieties and
curves. Second, in §8, we characterize curves in terms of their function
fields.
In the case of $X_0(N)$, we can regard the irreducible polynomial $\Phi_N(X)$
of Theorem 11.32 as a polynomial $\Phi_N(X_1, X_2)$ over $\mathbb{Q}$ in two variables
such that $\Phi_N(j,j_N) = 0$. Theorem 11.33 shows that $\tau \to (j(\tau), j_N(\tau))$
gives a one-one parametric mapping of $X_0(N)$ into the $\mathbb{C}$ points of $\Phi_N$.
We could then apply a desingularizing process to obtain a nonsingular
curve over $\mathbb{Q}$ in a higher dimensional projective space whose $\mathbb{C}$ points
are biholomorphic with $X_0(N)$.
But as we shall see, the theory of algebraic curves provides a faster
procedure. By working directly with the function field K(X0(N)) and
using Theorems 11.32, 11.33, and 11.36, we can arrive at an
appropriate nonsingular curve over Q without working further with explicit
equations. The speed of our procedure is achieved at the price of not
easily being able to make explicit computations. This disadvantage was
already noted in §1, where we mentioned that the equations for the
elliptic curves that arise in Eichler-Shimura theory had to be deduced from
side information.
We begin by assembling background information about varieties and
curves in general. We omit many of the proofs. We shall not return to
X0(N) until §8.
Let $k_0$ be a field, and let $k$ be an algebraically closed field with $k_0 \subseteq k$.
We shall assume that $k_0$ has characteristic 0 or is a finite field and that
the fixed field of $\mathrm{Aut}_{k_0}(k)$ is $k_0$. In many situations that will be clear
from the context, a definition or result valid for $k_0$ is also valid for $k$.
To any ideal $I$ in $k[X_1, \ldots, X_n]$, we associate its zero set in $k^n$:
    $V_I = \{x = (x_1, \ldots, x_n) \in k^n \mid f(x) = 0$ for all $f \in I\}$.
Any such set $V_I$ is called an affine algebraic set. If $V \subseteq k^n$ is an affine
algebraic set, its ideal (of polynomials vanishing on it) is
    $I(V) = \{f \in k[X_1, \ldots, X_n] \mid f(x) = 0$ for all $x \in V\}$.
Then $I(V_I) \supseteq I$.
An ideal in a polynomial ring over a field is finitely generated (i.e.,
the ring is Noetherian), by the Hilbert Basis Theorem. We say that
an affine algebraic set $V$ is defined over $k_0$ if $I(V)$ can be generated
by members of $k_0[X_1, \ldots, X_n]$. In this case we often abbreviate "$V$
defined over $k_0$" by $V/k_0$, we define $I(V/k_0) = I(V) \cap k_0[X_1, \ldots, X_n]$,
and we define the set of $k_0$ points or $k_0$ rational points of $V$ to be
$V(k_0) = V \cap k_0^n$.
The group action by $\mathrm{Aut}_{k_0}(k)$ on $k$ extends to $k[X_1, \ldots, X_n]$ with
all $X_j$ fixed by all of $\mathrm{Aut}_{k_0}(k)$. Then $k_0[X_1, \ldots, X_n]$ is the full fixed
ring under $\mathrm{Aut}_{k_0}(k)$. In the case of $V/k_0$, $I(V)$ is mapped into itself
by $\mathrm{Aut}_{k_0}(k)$, and $I(V/k_0)$ is the subset of $I(V)$ fixed by $\mathrm{Aut}_{k_0}(k)$. Also
$V(k_0)$ is the subset of $V$ fixed by $\mathrm{Aut}_{k_0}(k)$.
Theorem 11.37 (Hilbert Nullstellensatz). If $I$ is an ideal in
$k[X_1, \ldots, X_n]$ and $f$ is in $I(V_I)$, then $f^r$ is in $I$ for some integer $r > 0$.
If $I$ is a prime ideal, then it follows from Theorem 11.37 that $I = I(V_I)$,
and the situation is more manageable. Accordingly we shall
systematically work with this situation (even though we did not do so in Chapter
II). An affine algebraic set $V$ is irreducible if it is not the union of
two proper affine algebraic sets, or equivalently if $I(V)$ is prime. An
irreducible affine algebraic set is called an affine variety. For example,
if $P(X,Y)$ is an irreducible polynomial in $k[X,Y]$, then its zero locus in
$k^2$ is an affine variety, being $V_I$ for $I$ equal to the prime ideal $(P)$.
The affine coordinate ring $k[V]$ of an affine variety $V$ is the
Noetherian ring defined by
    $k[V] = k[X_1, \ldots, X_n]/I(V)$.
Since $I(V)$ is prime, $k[V]$ is an integral domain. Its quotient field $k(V)$
is called the function field of $V$; it is finitely generated over $k$ since
$k[V]$ has this same property. In the case of $V/k_0$, the affine coordinate
ring is
    $k_0[V] = k_0[X_1, \ldots, X_n]/I(V/k_0)$.
Since $I(V)$ is prime, so is $I(V/k_0)$, and therefore $k_0[V]$ is an integral
domain. Its quotient field $k_0(V)$ is the function field of $V/k_0$ and is
finitely generated over $k_0$.
Also in the case of $V/k_0$, $\mathrm{Aut}_{k_0}(k)$ acts on $k[X_1, \ldots, X_n]$ and carries
$I(V)$ into itself; therefore it acts on $k[V]$. One can prove that the fixed
ring is naturally identified with $k_0[V]$. Similar remarks apply to $k(V)$
and $k_0(V)$.
The dimension of an affine variety $V$ is the transcendence degree
of $k(V)$. An affine curve is an affine variety of dimension one. In
other words, there is to exist some $f$ in $k(V)$ not in $k$, so that $k(f)$ is
transcendental over $k$, and any $g$ in $k(V)$ is to have the property that
$k(f,g)$ is algebraic over $k(f)$. In the case of characteristic 0, it follows
from the Theorem of the Primitive Element that $k(V) = k(f,h)$ for some
suitable $h$. (Cf. Theorem 11.20.)
Let $V \subseteq k^n$ be an affine variety, and let $f_1, \ldots, f_l$ be a set of
generators for $I(V)$. A point $x$ of $V$ is a nonsingular point if the matrix
$\left(\frac{\partial f_i}{\partial X_j}(x)\right)$ has rank $n - d$, where $d$ is the dimension of $V$. The variety
$V$ is nonsingular if every point of it is a nonsingular point. These
definitions do not depend on the set of generators. There are two different
constructs with which we can see this independence. The first one is the
easier to use in computations. For it, let $x$ be in $V$, and let $m_x$ be the
ideal in $k[V]$ given by
    $m_x = \{f \in k[V] \mid f(x) = 0\}$.
This is a maximal ideal in $k[V]$, since evaluation at $x$ is an isomorphism of
$k[V]/m_x$ with $k$, and the quotient $m_x/m_x^2$ is a finite-dimensional vector
space over $k$. The following proposition shows that the definition of
nonsingularity does not depend on the system of generators.
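The rank condition can be tested directly on small examples. The following sketch computes the rank of the matrix of partial derivatives at a point, using exact rational arithmetic. The two curves are illustrative choices that do not come from the text: the cuspidal cubic $y^2 = x^3$ in $k^2$, and the twisted cubic in $k^3$ with generators $y - x^2$ and $z - x^3$. The partial derivatives are entered by hand.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def jacobian_cusp(x, y):
    """Partial derivatives of f = y^2 - x^3 (a curve in k^2: n = 2, d = 1)."""
    return [[-3 * x**2, 2 * y]]

def jacobian_twisted_cubic(x, y, z):
    """Partial derivatives of f1 = y - x^2, f2 = z - x^3 (n = 3, d = 1)."""
    return [[-2 * x, 1, 0], [-3 * x**2, 0, 1]]

def is_nonsingular(jac, n, d):
    """The criterion of the text: the rank equals n - d."""
    return rank(jac) == n - d

print(is_nonsingular(jacobian_cusp(0, 0), 2, 1))              # the cusp
print(is_nonsingular(jacobian_cusp(1, 1), 2, 1))              # another point of y^2 = x^3
print(is_nonsingular(jacobian_twisted_cubic(2, 4, 8), 3, 1))  # a point (t, t^2, t^3)
```

Only the origin of the cuspidal cubic fails the test, since both partial derivatives vanish there.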
Proposition 11.38. A point $x$ of an affine variety $V$ is nonsingular if
and only if the vector space dimension of $m_x/m_x^2$ equals the dimension
of $V$.
The second construct for seeing that nonsingularity is intrinsic uses
quotients of elements of $k[V]$. Let $k[V]_x$ be the subring of $k(V)$ given
by
    $k[V]_x = \{F \in k(V) \mid F = g/h$, $g$ and $h$ are in $k[V]$, $h(x) \neq 0\}$.
This is a local ring in the sense that it has a unique maximal ideal $M_x$,
namely the members vanishing at $x$. It is called the local ring of $V$ at
$x$. It is Noetherian since $k[V]$ is Noetherian, and it is an integral domain
since it is contained in a field. The members of $k[V]_x$ are the elements of
$k(V)$ with a definite value at $x$, i.e., $F = g/h$ has $F(x) = g(x)/h(x)$ well
defined and finite. Such elements are said to be regular or defined at
$x$. Nonsingularity can be decided by $M_x/M_x^2$, according to Proposition
11.38 and the following result.
Proposition 11.39. The inclusion $m_x \subseteq M_x$ yields an isomorphism
$m_x/m_x^2 \cong M_x/M_x^2$.
Proof. We get a ring homomorphism $m_x/m_x^2 \to M_x/M_x^2$. If $g/h$ is
given in $M_x$, then
    $\frac{g}{h} - h(x)^{-1}g = \frac{g}{h} \cdot \frac{h(x) - h}{h(x)}$
exhibits $h(x)^{-1}g$ in $m_x$ as mapping to $\frac{g}{h} + M_x^2$. Hence the map is onto.
If $g$ in $m_x$ maps to $\sum \left(\frac{g_i}{h_i}\right)\left(\frac{g_i'}{h_i'}\right)$ in $M_x^2$, we can clear fractions and
write $hg = \sum g_i g_i' h_i''$ for an element $h$ of $k[V]$ with $h(x) \neq 0$. Here
$\sum g_i g_i' h_i''$ is in $m_x^2$. The set $\{f \in k[V] \mid fg \in m_x^2\}$ is an ideal in $k[V]$
containing $m_x$ and also $h$, which is not in $m_x$. Since $m_x$ is maximal, 1
is in the set and $g$ is in $m_x^2$. Thus the mapping $m_x/m_x^2 \to M_x/M_x^2$ is
one-one.
Projective $n$-space $P_n(k)$ is defined in the same way as with $P_2(k)$,
namely as the quotient of $\{(x_0, \ldots, x_n) \in k^{n+1} - \{(0, \ldots, 0)\}\}$ by an
equivalence relation, two points being equivalent if one is a $k^\times$ multiple
of the other. Our notation for writing a point of $P_n(k)$ with care is
$[(x_0, \ldots, x_n)]$. We let $P_n(k_0)$ be the subset of $P_n(k)$ for which $x_0, \ldots, x_n$
can be taken in $k_0$.
To any homogeneous ideal $I$ in $k[X_0, \ldots, X_n]$, we associate its zero set
in $P_n(k)$:
    $W_I = \{[(x_0, \ldots, x_n)] \in P_n(k) \mid f(x_0, \ldots, x_n) = 0$ for all $f \in I\}$.
Any such set $W_I$ is called a projective algebraic set. If $W \subseteq P_n(k)$
is a projective algebraic set, its homogeneous ideal (of polynomials
vanishing on it) is the ideal $I(W)$ generated by all homogeneous $f$ in
$k[X_0, \ldots, X_n]$ such that $f(x_0, \ldots, x_n) = 0$ for all $[(x_0, \ldots, x_n)] \in W$. It
is always true that $I(W_I) \supseteq I$.
We say that a projective algebraic set is defined over $k_0$ if $I(W)$ can
be generated by homogeneous members of $k_0[X_0, \ldots, X_n]$. In this case
we abbreviate "$W$ defined over $k_0$" by $W/k_0$, and we define the set of
$k_0$ points or $k_0$ rational points of $W$ to be $W(k_0) = W \cap P_n(k_0)$.
A projective algebraic set $W$ is irreducible if it is not the union of
two proper projective algebraic sets, or equivalently if $I(W)$ is prime.
An irreducible projective algebraic set is called a projective variety.
Projective transformations of $P_n(k)$ onto $P_n(k)$ are defined in
analogy with the case of $n = 2$ that was discussed in §II.1. About
any point $[(x_0, \ldots, x_n)]$ of $P_n(k)$ we can introduce various systems of
affine local coordinates. Namely choose $\Phi$ in $GL(n+1,k)$ with
$\Phi(x_0, \ldots, x_n) = (1,0,\ldots,0)$. (For many purposes we ignore
$(x_0, \ldots, x_n)$, and it suffices to take $\Phi$ to be the identity or a
permutation matrix that interchanges 0 and $i$.) Then we can define affine local
coordinates on $[\Phi^{-1}(k^\times \times k^n)]$ to $k^n$ by the one-one map
    $\varphi([\Phi^{-1}(y_0, \ldots, y_n)]) = \left(\frac{y_1}{y_0}, \ldots, \frac{y_n}{y_0}\right)$.    (11.66)
If $W$ is a projective algebraic set with homogeneous ideal $I(W)$ and
if $\Phi$ is given as above, we shall write $W \cap k^n$ for $\varphi([\Phi^{-1}(k^\times \times k^n)] \cap W)$.
This is an affine algebraic set with ideal $I(W \cap k^n) \subseteq k[X_1, \ldots, X_n]$
given by
    $I(W \cap k^n) = \{f \mid f(X_1, \ldots, X_n) = F(\Phi^{-1}(1, X_1, \ldots, X_n))$ with $F \in I(W)\}$.    (11.67)
Conversely if $f$ is given in $k[X_1, \ldots, X_n]$ of degree $d$ and if $\Phi$ is given as
above, then we can define $f_H$ as a homogeneous element of $k[X_0, \ldots, X_n]$
in two steps by
    $g_H(X_0, \ldots, X_n) = X_0^d f\left(\frac{X_1}{X_0}, \ldots, \frac{X_n}{X_0}\right)$
    $f_H = g_H \circ \Phi$.    (11.68)
If $V$ is an affine algebraic set with ideal $I(V)$, then the projective
closure of $V$ relative to $\Phi$ is the projective algebraic set $W = \bar{V}$ whose
homogeneous ideal is generated by $\{f_H \in k[X_0, \ldots, X_n] \mid f \in I(V)\}$.
If we combine these two constructions, passing from $F$ to $f$ as in
(11.67) and then from $f$ to $f_H$ as in (11.68), we find that
    $F = (X_0 \circ \Phi)^r f_H$    (11.69)
for an integer $r \geq 0$. As a consequence we can prove the following.
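Relation (11.69) can be illustrated with $\Phi$ equal to the identity, so that $X_0 \circ \Phi = X_0$. In the sketch below a polynomial is stored as a dictionary from exponent tuples to coefficients. This representation, the function names, and the sample polynomial $F = X_0^2X_1 + X_0X_2^2$ are choices made for illustration only.

```python
def dehomogenize(F):
    """Set X_0 = 1 in a polynomial {(e0, e1, ..., en): coeff}, as in (11.67) with Phi = 1."""
    f = {}
    for exps, c in F.items():
        key = exps[1:]
        f[key] = f.get(key, 0) + c
    return {k: c for k, c in f.items() if c != 0}

def homogenize(f):
    """Form f_H = X_0^d f(X_1/X_0, ..., X_n/X_0), as in (11.68) with Phi = 1."""
    d = max(sum(e) for e in f)
    return {(d - sum(e),) + e: c for e, c in f.items()}

def times_x0_power(F, r):
    """Multiply a polynomial by X_0^r."""
    return {(e[0] + r,) + e[1:]: c for e, c in F.items()}

def degree(P):
    return max(sum(e) for e in P)

F = {(2, 1, 0): 1, (1, 0, 2): 1}      # F = X_0^2 X_1 + X_0 X_2^2, homogeneous of degree 3
f = dehomogenize(F)                   # f = X_1 + X_2^2
f_H = homogenize(f)                   # f_H = X_0 X_1 + X_2^2
r = degree(F) - degree(f)             # the exponent r of (11.69)
print(f, f_H, r, times_x0_power(f_H, r) == F)
```

Here $r = 1$ because $F$ is divisible by $X_0$ exactly once. For a homogeneous $F$ not divisible by $X_0$ the two constructions are inverse to each other and $r = 0$.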
Proposition 11.40. Fix $\Phi$ in $GL(n+1,k)$ to determine affine local
coordinates.
(a) If $V$ is an affine variety, then $W = \bar{V}$ is a projective variety and
$V = W \cap k^n$.
(b) If $W$ is a projective variety, then $V = W \cap k^n$ is an affine variety,
and $\bar{V} \subseteq W$. The inclusion is strict only if $X_0 \circ \Phi$ is in $I(W)$, and in
this case $V = \bar{V} = \emptyset$.
(c) If $\Phi$ is in $GL(n+1,k_0)$ and $W$ is a projective variety with $W \cap k^n$
nonempty, then $W$ is defined over $k_0$ if and only if $W \cap k^n$ is defined
over $k_0$.
Proof. Let us check that $I(W)$ prime implies $I(W \cap k^n)$ prime. If
homogeneous $F$ and $G$ lead to $f$ and $g$ with $fg$ in $I(W \cap k^n)$, then $FG$
vanishes on $W \cap k^n$ and $(X_0 \circ \Phi)FG$ vanishes on $W$. Hence $(X_0 \circ \Phi)FG$
is in $I(W)$. Then $(X_0 \circ \Phi)F$ or $G$ is in $I(W)$, and it follows that $f$ or $g$
is in $I(W \cap k^n)$.
Next let us check that $I(V)$ prime implies $I(\bar{V})$ prime. If $f$ and $g$ lead
to $f_H$ and $g_H$ with $f_H g_H = h_H$ in $I(\bar{V})$, choose $h$ in $I(V)$ leading to
$h_H$. Then $fg = h$, $f$ or $g$ is in $I(V)$, and $f_H$ or $g_H$ is in $I(\bar{V})$.
Then (a) is clear, and (11.69) implies that $I(W) \subseteq I(\bar{V})$ in (b). Hence
$\bar{V} \subseteq W$. If the inclusion is strict, let $f_H$ be in $I(\bar{V})$ but not in $I(W)$.
Then $f_H$ comes from some $f$ in $I(V)$, which comes from some $F$ in
$I(W)$. By (11.69), $F = (X_0 \circ \Phi)^r f_H$. Since $I(W)$ is prime and $f_H$ is not
in $I(W)$, $X_0 \circ \Phi$ is in $I(W)$. Then 1 is in $I(V)$, and $V = \emptyset$. This proves
(b), and (c) follows from the definitions.
If $W$ is a projective variety, we define the function field $k(W)$ of $W$
to be the field of quotients $F(X) = g(X)/h(X)$ in $k(X_0, \ldots, X_n)$ such
that $g$ and $h$ are polynomials homogeneous of the same degree, $h(X)$ is
not in $I(W)$, and quotients $\frac{g}{h}$ and $\frac{g'}{h'}$ are identified if $gh' - g'h$ is in $I(W)$.
In this way, $k(W)$ has a canonical definition independent of $\Phi$. We can
describe $k(W)$ alternatively using affine local coordinates. Namely (b)
shows that we can choose $\Phi$ in $GL(n+1,k)$ so that $\overline{W \cap k^n} = W$. Fix
such a choice of $\Phi$. Using $\Phi$, we can identify the function field $k(W)$
with $k(W \cap k^n)$. Consequently $k(W)$ is finitely generated over $k$. If $W$
is defined over $k_0$, we can treat $k_0(W)$ similarly, as a consequence of
Proposition 11.40c; here we must take $\Phi$ in $GL(n+1,k_0)$.
The dimension of $W$ is again the transcendence degree of $k(W)$. If $x$
is in $W$, we can choose $\Phi$ so that $x$ is mapped by $\varphi$ in (11.66) to a point
of $W \cap k^n$. Then it is meaningful to speak of the local ring $k[W]_x$ of $W$
at $x$. We regard $k[W]_x$ as a subring of $k(W)$; it is the subring of elements
that are regular or defined at $x$. The ring $k[W]_x$ is Noetherian.
Finally we can consistently define $x$ to be nonsingular if $\dim M_x/M_x^2
= \dim W$, by virtue of Propositions 11.38 and 11.39. The projective
variety $W$ is nonsingular if it is nonsingular at every point.
The group $\mathrm{Aut}_{k_0}(k)$ plays the same role for projective varieties that
it does for affine varieties. We shall not list the details.
We shall encounter products of projective varieties and shall want to
regard them as projective varieties. First we consider the product of
two projective spaces. If $P_m(k)$ has points $[(x_0, \ldots, x_m)]$ and $P_n(k)$ has
points $[(y_0, \ldots, y_n)]$, then the Segre embedding, which maps
    $[(x_0, \ldots, x_m)] \times [(y_0, \ldots, y_n)] \to [(\ldots, x_iy_j, \ldots)] = [(\ldots, z_{ij}, \ldots)]$
in lexicographic order, exhibits $P_m(k) \times P_n(k)$ as contained in $P_M(k)$
with $M = mn + m + n$. The image of the Segre embedding is the
variety of the ideal generated by all $z_{ij}z_{i'j'} - z_{ij'}z_{i'j}$. It is a projective
variety in $P_M(k)$. If $W \subseteq P_m(k)$ and $W' \subseteq P_n(k)$ are projective
varieties defined by homogeneous polynomials $\{f(X)\}$ and $\{g(Y)\}$, then the
product $W \times W' \subseteq P_M(k)$ is the variety of the ideal generated by all
$z_{ij}z_{i'j'} - z_{ij'}z_{i'j}$, all $f(X)Y_j^{\deg f}$, and all $g(Y)X_i^{\deg g}$, in obvious notation.
It too is a projective variety in $P_M(k)$. If $W$ and $W'$ are nonsingular, so
is $W \times W'$. If $W$ and $W'$ are defined over $k_0$, so is $W \times W'$.
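The quadratic relations cutting out the image of the Segre embedding are easy to test numerically. The sketch below builds the coordinates $z_{ij} = x_iy_j$ in lexicographic order and checks all relations $z_{ij}z_{i'j'} = z_{ij'}z_{i'j}$. The function names are illustrative.

```python
from itertools import product

def segre(x, y):
    """Coordinates z_ij = x_i * y_j in lexicographic order of (i, j)."""
    return [xi * yj for xi in x for yj in y]

def satisfies_segre_relations(z, m, n):
    """Check z_ij z_i'j' == z_ij' z_i'j for all index pairs; z has (m+1)(n+1) entries."""
    Z = lambda i, j: z[i * (n + 1) + j]
    return all(Z(i, j) * Z(i2, j2) == Z(i, j2) * Z(i2, j)
               for i, i2 in product(range(m + 1), repeat=2)
               for j, j2 in product(range(n + 1), repeat=2))

m, n = 1, 2
z = segre([1, 2], [3, 5, 7])          # a point of P_1 x P_2 mapped into P_5
print(len(z) - 1 == m * n + m + n)    # the target is P_M with M = mn + m + n
print(satisfies_segre_relations(z, m, n))
print(satisfies_segre_relations([1, 0, 0, 0, 1, 0], m, n))   # a point of P_5 off the image
```

The last point has $z_{00}z_{11} = 1$ but $z_{01}z_{10} = 0$, so it violates a relation and is not in the image.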
Let $V_1 \subseteq P_m(k)$ and $V_2 \subseteq P_n(k)$ be projective varieties. A rational
map $F$ from $V_1$ to $V_2$ is a tuple $F = (f_0, \ldots, f_n)$ of members of $k(V_1)$
such that for every point $x \in V_1$ where $f_0, \ldots, f_n$ are all defined,
    $F(x) = [(f_0(x), \ldots, f_n(x))]$
is in $V_2$. If $V_1$ and $V_2$ are defined over $k_0$ and if $f_0, \ldots, f_n$ can be
multiplied by a common member of $k^\times$ so as to be in $k_0(V_1)$, then $F$ is
said to be defined over $k_0$.
Example 1. Let $V_1$ be $P_1(\mathbb{C}) = \{[(x_0,x_1)]\}$ with $I(V_1) = 0$, and let
$V_2$ be the curve in $P_2(\mathbb{C}) = \{[(y_0,y_1,y_2)]\}$ given by $y_0y_2 = y_1^2$, i.e., having
homogeneous ideal $(y_0y_2 - y_1^2)$ in $\mathbb{C}[y_0,y_1,y_2]$. Then $F = (x_0^2, x_0x_1, x_1^2)$
is a rational map. It is defined over $\mathbb{Q}$.
Let $G : V_2 \to V_1$ be given by $G = \left(1, \frac{y_1}{y_0}\right)$. This too is a rational
map defined over $\mathbb{Q}$.
Example 2. Let $V_1$ be the elliptic curve (11.15) with $I(V_1)$ generated
within $\mathbb{C}[x_0,x_1,x_2]$ (in the current notation) by
    $x_0x_2^2 + x_0^2x_2 - x_1^3 + x_0x_1^2$.
Let $V_2$ be the elliptic curve (11.13) with $I(V_2)$ generated by
    $x_0x_2^2 + x_0^2x_2 - x_1^3 + x_0x_1^2 + 10x_0^2x_1 + 20x_0^3$.
The map (11.16a) is $F = (f_0,f_1,f_2)$, where $f_0 = 1$ and
    $f_1(x_0,x_1,x_2) = \frac{x_1}{x_0} + \frac{x_0^2}{x_1^2} + \frac{2x_0}{x_1 - x_0} + \frac{x_0^2}{(x_1 - x_0)^2}$
    $f_2(x_0,x_1,x_2) = \frac{x_2}{x_0} - \frac{2x_2 + x_0}{x_0}\left(\frac{x_0^3}{x_1^3} + \frac{x_0^3}{(x_1 - x_0)^3} + \frac{x_0^2}{(x_1 - x_0)^2}\right)$.
Then $F$ is a rational map, and it is defined over $\mathbb{Q}$.
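Since $F$ is defined over $\mathbb{Q}$ with integer coefficients, the fact that it carries $V_1$ into $V_2$ can be spot-checked by reducing modulo a prime. The sketch below works in affine coordinates $x = x_1/x_0$, $y = x_2/x_0$, where the curves are $y^2 + y = x^3 - x^2$ and $y^2 + y = x^3 - x^2 - 10x - 20$. The prime $p = 101$ and the function names are arbitrary choices for illustration, and the points with $x = 0$ or $x = 1$, where the formulas have poles, are skipped.

```python
P = 101  # an arbitrary odd prime chosen for the check

def on_v1(x, y):
    """Affine equation of V_1 reduced mod P."""
    return (y * y + y - (x**3 - x * x)) % P == 0

def on_v2(x, y):
    """Affine equation of V_2 reduced mod P."""
    return (y * y + y - (x**3 - x * x - 10 * x - 20)) % P == 0

def inv(a):
    """Inverse modulo P (requires Python 3.8 or later)."""
    return pow(a, -1, P)

def rational_map(x, y):
    """The components f_1, f_2 of Example 2 with x_0 = 1, reduced mod P."""
    i0, i1 = inv(x), inv(x - 1)
    X = (x + i0**2 + 2 * i1 + i1**2) % P
    Y = (y - (2 * y + 1) * (i0**3 + i1**3 + i1**2)) % P
    return X, Y

def check_map():
    """Return (number of points tested, whether every image lies on V_2)."""
    checked, all_ok = 0, True
    for x in range(2, P):          # skip x = 0 and x = 1, where the map has poles
        for y in range(P):
            if on_v1(x, y):
                checked += 1
                all_ok = all_ok and on_v2(*rational_map(x, y))
    return checked, all_ok

print(check_map())
```

A check of this kind only samples points over one finite field. It is a sanity test of the formulas as transcribed here, not a proof that $F$ is a rational map.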
Example 3. The affine curves v2 = 2u4 - 1 and y2 = x3 + 8x are
related by the transformations (3.19) and (3.21). If the curves and the
transformations are recast in projective form, then the transformations
are rational maps.
A rational map $F = (f_0, \ldots, f_n)$ between projective varieties $V_1$ and
$V_2$ is regular or defined at $x$ in $V_1$ if there is some $g$ in $k(V_1)$ such that
each $gf_i$ is regular at $x$ and
    $((gf_0)(x), \ldots, (gf_n)(x))$
is not the 0 tuple. A rational map that is regular at every point of $V_1$ is
a morphism.
Examples. In Example 1 above, both $F$ and $G$ are morphisms. This
is clear for $F$. In the case of $G$, $G$ is given also by $G = \left(\frac{y_0}{y_1}, 1\right)$.
Hence the only question is about points of $V_2$ where $y_0 = y_1 = 0$. But
$y_1/y_0 = y_2/y_1$ on $V_2$, and hence
    $G = \left(\frac{y_1}{y_2}, 1\right)$
is regular at $[(0,0,1)]$.
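The two maps of Example 1 can be checked with exact arithmetic. The sketch below evaluates $F$, confirms that the image lies on $y_0y_2 = y_1^2$, and applies $G$ using the expression $(1, y_1/y_0)$ where $y_0 \neq 0$ and the expression $(y_1/y_2, 1)$ otherwise. The function names and the sample points are illustrative.

```python
from fractions import Fraction

def F(x0, x1):
    """The map F = (x0^2, x0 x1, x1^2) from P_1 to the conic."""
    return (x0 * x0, x0 * x1, x1 * x1)

def on_conic(y0, y1, y2):
    return y0 * y2 == y1 * y1

def G(y0, y1, y2):
    """The inverse map: (1, y1/y0) where y0 != 0, and (y1/y2, 1) at the remaining point."""
    if y0 != 0:
        return (Fraction(1), Fraction(y1, y0))
    return (Fraction(y1, y2), Fraction(1))

def same_projective_point(p, q):
    """Two nonzero pairs represent the same point of P_1 when they are proportional."""
    return p[0] * q[1] == p[1] * q[0]

for point in [(1, 0), (0, 1), (2, 3), (-1, 5)]:
    image = F(*point)
    print(point, image, on_conic(*image), same_projective_point(G(*image), point))
```

At $[(0,1)]$ the image is $[(0,0,1)]$, and the second expression for $G$ returns $[(0,1)]$, so $G \circ F$ is the identity there as well.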
Example 2 is a morphism, as a consequence of Theorem 11.42 below.
It fixes $\infty$ and hence is a homomorphism of the group structures, by
Proposition 11.61 below. Its kernel is the 5-element torsion subgroup of
$V_1/\mathbb{Q}$.
In Example 3, the two rational maps (3.19) and (3.21) are inverse to
each other, and thus we say that they are birational. The first curve
has a singularity at $u = w = 0$ if the curve is written projectively in
variables $(u,v,w)$, while the second curve is nonsingular. It will follow
that the two transformations cannot both be morphisms.
Two projective varieties $V_1$ and $V_2$ are isomorphic if there are
morphisms $F : V_1 \to V_2$ and $G : V_2 \to V_1$ such that $F \circ G$ and $G \circ F$ are
the respective identity maps. The two varieties of Example 1 are
isomorphic. In the case of elliptic curves as in (3.23), Proposition 11.58
will note that an isomorphism must be given by an admissible change
of variables over $k$. Thus the current definition of "isomorphism" for
elliptic curves agrees with the earlier one.
Two projective varieties $V_1/k_0$ and $V_2/k_0$ are isomorphic over $k_0$ if
the above $F$ and $G$ can be defined over $k_0$.
Let us address how maps between varieties affect function fields. If
$F : V_1 \to V_2$ is a rational map between projective varieties, we can try to
define a $k$ algebra map $F^* : k(V_2) \to k(V_1)$ between their function fields
by composition: $F^*(f) = f \circ F$. Some condition is needed, however, for
this formula to be meaningful. For example, the image of $F$ should not
completely be contained in the set where $f$ is not defined. It turns out
that it is enough that $F$ have dense image in a suitable topology.
We shall not pursue this point in full generality, however. Suppose
instead that $F$ is a morphism and carries $V_1$ onto $V_2$. Then $F$ is called
a dominant morphism. In this case $F^*$ is well defined and is one-one.
Moreover, it carries local rings to local rings:
    $F^*(k[V_2]_{F(x)}) \subseteq k[V_1]_x$.
Consequently if $F$ is an isomorphism, then $F^*$ is an isomorphism of
function fields and an isomorphism of local rings at each point. In particular,
$F$ carries nonsingular points to nonsingular points.
8. Canonical Model of Xo(N)
In this section we shall relate curves to their function fields and then
show how to introduce a canonical Q structure on Xq(N). We continue
with fields ko C k and assumptions on them as in §7. By a function
field of dimension 1, we shall mean any finitely generated extension
of k of transcendence degree one.
Let C be a projective curve, and let x be a nonsingular point. The key
to analyzing C about x lies in the fact that every rational function on C
has a well defined order at x. To define the order, we use the following
lemma.
Lemma 11.41. Let R be a Noetherian local ring that is an integral
domain and a k algebra, and let M be the maximal ideal. Suppose that
(a) dim_k(M/M^2) = 1, and
(b) every element of R not in M is a unit.
Then M is principal, being given by M = (t) for any element t of M
not in M^2. Also ⋂_{l=1}^∞ M^l = {0}.
Proof. Let t be in M but not M^2. Since R is Noetherian, M has a
finite set of generators t, s_1, ..., s_r. Without loss of generality, suppose
r is as small as possible. By (a), s_j - c_j t is in M^2 for some c_j ∈ k.
Changing notation, we may assume that our generators t, s_1, ..., s_r of
M have the property that each s_j is in M^2. Since the products of pairs
of generators of M generate M^2, we can write
s_r = a t^2 + \sum_{j} b_j t s_j + \sum_{j \le l} c_{jl} s_j s_l.
Thus
s_r \Bigl(1 - b_r t - \sum_{j} c_{jr} s_j\Bigr) = a t^2 + \sum_{j<r} b_j t s_j + \sum_{j \le l < r} c_{jl} s_j s_l. (11.70)
The right side of (11.70) is in (t, s_1, ..., s_{r-1}), and the coefficient of s_r
on the left side is a unit, by (b). Hence s_r is in (t, s_1, ..., s_{r-1}), in
contradiction to the minimal choice of r (unless r = 0). We conclude
that M = (t).
Let f be in ⋂_{l=1}^∞ M^l = ⋂_{l=1}^∞ (t^l), and write f = t^l f_l. Since f_l = t f_{l+1},
we have (f_1) ⊆ (f_2) ⊆ (f_3) ⊆ ⋯. Since R is Noetherian, (f_m) = (f_{m+1})
for some m. Then f_{m+1} = a f_m = a t f_{m+1}, and at = 1 if f_{m+1} ≠ 0. Since
t is in M, at = 1 is impossible. Thus f_{m+1} = 0 and f = 0.
Let us return to the projective curve C and the nonsingular point x.
The local ring k[C]_x of C at x satisfies the hypotheses of the lemma. If
f is in k[C]_x, we define ord_x(f) to be the least l ≥ 0 such that f is in
M_x^l but not M_x^{l+1}. This is an integer ≥ 0 if f ≠ 0 and is +∞ if f = 0.
Let t be any member of M_x but not M_x^2, so that M_x = (t); t is called
a uniformizer of C at x. If f ≠ 0 and if l = ord_x(f), then f = a t^l for
some unit a in k[C]_x.
The function field k(C) is the quotient field of k[C]_x, and we can
extend ord_x() to all of k(C). Namely any F in k(C) is of the form
F = f/g with f and g in k[C]_x and g ≠ 0. We define
ord_x(F) = ord_x(f) - ord_x(g).
To see that this is well defined, we let l = ord_x(f) and m = ord_x(g).
Then f = a t^l and g = b t^m with a and b units in k[C]_x. Hence F =
a b^{-1} t^{l-m}. So l - m = ord_x(F) is characterized as the unique power of
t such that F/t^{l-m} is a unit in k[C]_x. Then ord_x() has the following
properties:
(a) ord_x(FG) = ord_x(F) + ord_x(G),
(b) ord_x(F + G) ≥ min{ord_x(F), ord_x(G)},
(c) k[C]_x = {F ∈ k(C) | ord_x(F) ≥ 0},
(d) M_x = {F ∈ k(C) | ord_x(F) > 0},
(e) ord_x(F) = +∞ if and only if F = 0.
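As a concrete instance of properties (a) and (b), one may take C to be the affine line with coordinate t and x to be the origin, so that t itself is a uniformizer. The following Python sketch is only an illustration under that assumption. The helper names (poly_ord, ord0, mul, add) are hypothetical, rational functions are stored as pairs of coefficient lists over Q, and the order at the origin is computed from the definition ord_x(F) = ord_x(f) - ord_x(g).

```python
from fractions import Fraction
from math import inf

def poly_ord(coeffs):
    """Order of vanishing at t = 0; coeffs[i] is the coefficient of t**i."""
    for i, c in enumerate(coeffs):
        if c != 0:
            return i
    return inf  # the zero polynomial has order +infinity

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def ord0(F):
    """ord at the origin of F = num/den (den assumed nonzero)."""
    num, den = F
    return poly_ord(num) - poly_ord(den)

def mul(F, G):
    return (poly_mul(F[0], G[0]), poly_mul(F[1], G[1]))

def add(F, G):
    num = poly_add(poly_mul(F[0], G[1]), poly_mul(G[0], F[1]))
    return (num, poly_mul(F[1], G[1]))

F = ([0, 0, 1], [1, 1])      # t^2 / (1 + t)
G = ([1, -1], [0, 0, 0, 1])  # (1 - t) / t^3
H = ([1], [0, 1])            # 1 / t
K = ([-1, 1], [0, 1])        # (t - 1) / t, so that H + K = 1

print(ord0(F), ord0(G), ord0(mul(F, G)), ord0(add(F, G)))
print(ord0(H), ord0(K), ord0(add(H, K)))
```

The pair H, K shows that the inequality in (b) can be strict: both summands have a simple pole at the origin, while their sum is the constant 1.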
The function ord_x() allows us to prove a theorem mentioned in
connection with Example 2 in the preceding section.
Theorem 11.42. Let C be a projective curve, let x be a nonsingular
point of C, let V be a projective variety, and let F : C → V be a rational
map. Then F is regular at x. Consequently if C is nonsingular, then F
is a morphism.
Proof. Let F = (f_0, ..., f_n) with f_j ∈ k(C). Let t be a uniformizer
at x, and let
m = \min_j \{ \mathrm{ord}_x(f_j) \}.
Then ord_x(t^{-m} f_j) ≥ 0 for all j with equality for some j = j_0. By (c),
t^{-m} f_j is regular at x, and (t^{-m} f_{j_0})(x) ≠ 0 by (d). Hence g = t^{-m}
exhibits F as regular at x.
To each nonsingular point x of a projective curve C, we have
associated the local subring k[C]_x ⊆ k(C) and the function ord_x() : k(C) →
Z ∪ {∞} satisfying (a) through (e) above. Let us abstract this situation.
A proper subring R ⊆ k(C) containing k is called a discrete valuation
ring of k(C) over k if there exists a function v : k(C) → Z ∪ {∞} (called
the valuation) such that
(a) v(FG) = v(F) + v(G),
(b) v(F + G) ≥ min{v(F), v(G)},
(c) R = {F ∈ k(C) | v(F) ≥ 0}.
Then R is a local ring with maximal ideal
M_R = {F ∈ k(C) | v(F) > 0}.
For a nonsingular curve C, the next proposition says that this
abstraction captures the functions ordx() completely and gives nothing
additional.
Proposition 11.43. Let C be a nonsingular projective curve, and
let R be a discrete valuation ring of k(C) over k, with valuation v. Then
there exists a point x of C such that v is a positive multiple of ordx().
For this x, R = k[C]x and MR = Mx.
Proof. Let (x_0, ..., x_n) be coordinates for the projective space in
which C lies. The x_j's with x_j ∈ I_C (so that x_j = 0 everywhere on C)
will not play a role, and we discard them. For the remaining indices we
can regard each x_j/x_0 as in k(C). If v(x_j/x_0) ≥ 0 for all j, we use the
standard system of affine local coordinates (with Φ = 1) and pass to
the affine curve C ∩ k^n, which is not empty since x_0 ≠ 0 somewhere on
C. The members of the affine coordinate ring k[C ∩ k^n] are polynomials
in the x_j/x_0, and thus v is ≥ 0 on the whole ring. Hence k[C ∩ k^n] ⊆
R. On the other hand, if v(x_j/x_0) < 0 for some j, let j = j_0 be an
index for which it is smallest. Using affine local coordinates in which Φ
interchanges 0 and j_0, we readily see similarly that C ∩ k^n is nonempty
and that k[C ∩ k^n] ⊆ R.
Consequently I = M_R ∩ k[C ∩ k^n] is a proper ideal in k[C ∩ k^n] =
k[k^n]/I_C. Let Ĩ be the inverse image of I in k[k^n]. By Theorem 11.37,
all members of Ĩ vanish at some x in k^n. Since I_C ⊆ Ĩ, x is in C. Thus
every member f of I has the property that ord_x(f) > 0.
Since C ∩ k^n is nonempty, we can identify k(C) with k(C ∩ k^n) and
k[C]_x with k[C ∩ k^n]_x. Suppose F is in M_R ∩ k[C ∩ k^n]_x. We can
write F = f/g with f and g in k[C ∩ k^n] and ord_x(g) = 0. Since g is
in k[C ∩ k^n] ⊆ R, f = Fg is in I; thus the previous paragraph gives
ord_x(f) > 0. Since ord_x(g) = 0, we have ord_x(F) > 0. This proves that
(v(F) > 0 and ord_x(F) ≥ 0) ⟹ ord_x(F) > 0 (11.71)
for general F in k(C). The contrapositive of (11.71) is
ord_x(F) = 0 ⟹ v(F) ≤ 0.
Applying this to 1/F, we obtain
ord_x(F) = 0 ⟹ v(F) = 0. (11.72)
By Proposition 11.39 we can choose t in k[C ∩ k^n] that is a uniformizing
parameter at x. Since k[C ∩ k^n] ⊆ R, we have v(t) ≥ 0. A general
member F of k(C) is of the form F = t^l F_0 with ord_x(F_0) = 0. By
(11.72) this F has
v(F) = l v(t) + v(F_0) = l v(t) = v(t) ord_x(F).
Now v is not identically 0 since R is a proper subring of k(C). Since
v(t) ≥ 0, we conclude that v(t) > 0 and that v is a positive multiple of
ord_x(). This completes the proof.
Proposition 11.43 gives a clue how to reconstruct a nonsingular
projective curve from its function field. The point is that the same idea can
be used to associate a nonsingular projective curve to any function field
of dimension 1.
Theorem 11.44. Let K be a function field of dimension 1 over k,
and let C_K be the set of discrete valuation rings of K over k. Then there
exists a nonsingular projective curve C such that k(C) = K and such
that C_K gets canonically identified with the set of points of C.
REMARK. The curve C is unique up to isomorphism, by Theorem
11.48 below.
We shall give a brief indication of some of the steps. First, if B is
any integral domain that is finitely generated as a k algebra, then B
is isomorphic with the coordinate ring of an affine variety. In fact, let
x_1, ..., x_n be generators of B over k. The map X_j ↦ x_j gives a k
homomorphism of k[X_1, ..., X_n] onto B. Since B is an integral domain,
the kernel is a prime ideal J. Then V_J is the required affine variety.
Now let us consider the function field K of Theorem 11.44. Let x be a
member of K not in k. We form the polynomial ring k[x]. Then K is a
finite extension of the quotient field k(x). Let B be the integral closure
of k[x] in K. When the construction of the previous paragraph is applied
to this B, the resulting affine variety turns out to be a nonsingular affine
curve.
At this stage we bring in C_K. Let R be a discrete valuation ring of K
over k, and choose x to be in R (but not in k). Then k[x] ⊆ R. One can
show that R is integrally closed in K, and it follows that B ⊆ R. The
intersection N = M_R ∩ B is a maximal ideal of B and thus corresponds
to a unique point of the nonsingular affine curve built from B. In other
words, R has been attached to a unique point of the curve corresponding
to B.
To prove Theorem 11.44, one has to take the product of the projective
closures of finitely many such affine curves, map C_K into this product
diagonally, and take the image of C_K as the desired curve C. We omit
the details.
For our application to X0(N) it will be important to know conditions
under which the curve C in Theorem 11.44 is defined over the subfield
k0. We assume now that k = C. Taking a cue from Theorem 11.36,
suppose K = C(f, g) and K0 = k0(f, g). Given R in the argument
above, we can always choose x to be either f or 1/f, so that x is in K0.
So we need to deal only with two B's. We are allowed to use more B's,
but we insist that they come from x's in K0. The heart of the matter
is to get the affine variety determined by B to be defined over ko. This
step we address in the following lemma.
Lemma 11.45. Let K = C(f, g) be a function field of dimension
1 over C, let k0 be a subfield of C, put K0 = k0(f, g), and suppose
that C ∩ K0 = k0. Let x be in K0 but not k0, let B be the integral
closure of C[x] in K, and let V be a nonsingular affine curve determined
by B. Then V is canonically defined over k0. Under this definition,
k0(V) ≅ K0 and C(V) = K.
Proof. Without loss of generality, f is transcendental over C, and g
is algebraic over C(f). Taking into account (c) ⟹ (a) in Theorem 11.36,
we see that f is transcendental over k0 and g is algebraic over k0(f).
Therefore K0 has transcendence degree 1 over k0.
Since C ∩ K0 = k0, the given element x of K0 is not in C, hence
is transcendental over k0. Consequently {x} is a transcendence basis
of K0 over k0. It follows that f and g are algebraic over k0(x) and
that K0 is a finite algebraic extension of k0(x). By the Theorem of the
Primitive Element, there exists y in K0 such that K0 = k0(x, y). Then
K = C(x, y).
Condition (c) in Theorem 11.36 is independent of the two generators
and hence still holds when we replace (f, g) by (x, y). By (c) ⟹ (a) in
the theorem, we see that
[K0 : k0(x)] = [K : C(x)]. (11.73)
Let n be the common value of the two sides of (11.73).
Since C[x] is a principal ideal domain, there exists a basis {x_1, ..., x_n}
of K over C(x) consisting of members of B such that
B = \sum_{i=1}^{n} C[x]\, x_i. (11.74)
We take x_1, ..., x_n, x as generators of B over C. Then the map
φ_B : C[X_1, ..., X_{n+1}] → B
given by X_j ↦ x_j for j ≤ n and X_{n+1} ↦ x exhibits B as the affine
coordinate ring of the curve V defined by the ideal I_B = ker φ_B. We
are to prove that I_B is finitely generated as an ideal by elements of
k0[X_1, ..., X_{n+1}], and we are to identify the function fields.
Let B_0 be the integral closure of k0[x] in K0. As above, there exists
a basis {y_1, ..., y_n} of K0 over k0(x) consisting of members of B_0 such
that
B_0 = \sum_{i=1}^{n} k_0[x]\, y_i. (11.75)
We form the map
φ_{B_0} : k0[X_1, ..., X_{n+1}] → B_0
given by X_j ↦ y_j for j ≤ n and X_{n+1} ↦ x. Let I_{B_0} = ker φ_{B_0}. By
the Hilbert Basis Theorem, I_{B_0} is finitely generated, say by members
P_1, ..., P_N of k0[X_1, ..., X_{n+1}]. We complexify the exact sequence
0 \to I_{B_0} \to k_0[X_1, \dots, X_{n+1}] \to B_0 \to 0
to a sequence
0 \to I_{B_0} \otimes_{k_0} C \to C[X_1, \dots, X_{n+1}] \to B_0 \otimes_{k_0} C \to 0.
The result is exact since the operation (·) ⊗_{k0} C is an exact functor on
k0 vector spaces. Here I_{B_0} ⊗_{k0} C is generated by P_1, ..., P_N, and
B_0 \otimes_{k_0} C = \sum_{j=1}^{n} C[x]\, y_j.
Let φ_{B_0}^C denote the complexified version of φ_{B_0}.
The elements y_j are linearly independent over C(x) by (a) of Theorem
11.36. Hence both {x_j} and {y_j} are bases of K over C(x). Let ψ be
the C(x) linear map of K into itself given by x_j ↦ y_j for 1 ≤ j ≤ n.
Then it is clear that
ψ ∘ φ_B = φ_{B_0}^C
as C linear maps. Therefore their kernels are equal. But ψ is one-one.
Hence
I_B = I_{B_0} \otimes_{k_0} C. (11.76)
The right side is generated by P_1, ..., P_N, which are in k0[X_1, ..., X_{n+1}].
Thus I_B is finitely generated as an ideal by elements of k0[X_1, ..., X_{n+1}].
The affine coordinate ring of V is B, and the function field is the
field of quotients of B, which is K by (11.74). In the case of V/k0, we
determine the affine coordinate ring by first determining I(V/k0), which
is given by
I(V/k_0) = I(V) \cap k_0[X_1, \dots, X_{n+1}]
= I_B \cap k_0[X_1, \dots, X_{n+1}]
= (I_{B_0} \otimes_{k_0} C) \cap k_0[X_1, \dots, X_{n+1}] by (11.76)
= I_{B_0}.
By definition k0[V] = k0[X_1, ..., X_{n+1}]/I_{B_0}, and this is B_0. The
function field is the field of quotients of B_0, which is K0 by (11.75).
Theorem 11.46. Let k0 be a subfield of C. If C/k0 is a nonsingular
projective curve, then there exist elements f and g in k0(C) such that
k0(C) = k0(f, g) and C(C) = C(f, g). Moreover,
C ∩ k0(C) = k0.
Conversely let K = C(f, g) be a function field of dimension 1 over C,
let k0 be a subfield of C, put K0 = k0(f, g), and suppose that C ∩ K0 =
k0. Then there exists a nonsingular projective curve C/k0 such that
k0(C) ≅ K0 and C(C) = K.
The question of uniqueness of C in Theorem 11.46 is conveniently
addressed in the context of a correspondence between certain maps of
function fields and certain morphisms between curves.
Recall from the end of §7 that a dominant morphism F : C1 → C2
between nonsingular projective curves (over k) induces a one-one k algebra
homomorphism F* : k(C2) → k(C1) by composition: F*(f) = f ∘ F.
Lemma 11.47. A morphism F : C1 → C2 between nonsingular
projective curves over k either is onto or is a constant map.
Theorem 11.48. Let C1/k0 and C2/k0 be nonsingular projective
curves, and let F : C1 → C2 be a dominant morphism defined over k0.
Then F* carries k0(C2) into k0(C1), and k0(C1) is a finite extension of
F*(k0(C2)). Conversely if T : k0(C2) → k0(C1) is a one-one k0 algebra
homomorphism (carrying 1 to 1), then there exists a unique dominant
morphism F : C1 → C2 defined over k0 such that F* = T.
In particular the nonsingular projective curve given by Theorem 11.46
is unique up to isomorphism. Finally we can put into place the canonical
Q structure on the compact Riemann surface X0(N).
Corollary 11.49. There exist a nonsingular projective curve C/Q
and a biholomorphic mapping φ : X0(N) → C(C) such that
φ*(C(C)) = K(X0(N)) = C(j, j_N)
and
φ*(Q(C)) = Q(j, j_N).
The curve C/Q is unique up to isomorphism defined over Q, and φ is
uniquely determined by the isomorphism of Q(C) with Q(j, j_N).
Remark. One says that (φ, C) is a model for X0(N) over Q. Our
practice is to identify C(C) with X0(N) and refer to X0(N)/Q as a Q
structure on X0(N).
Proof. Theorem 11.33 gives K(X0(N)) = C(j, j_N). Theorem 11.32
shows that C(j, j_N) and Q(j, j_N) satisfy (a) of Theorem 11.36 and hence
the other equivalent conditions. Theorem 11.46 produces the desired
curve C/Q, and Theorem 11.21 produces φ. Finally Theorem 11.48
proves the uniqueness of C and φ.
Our occasional observations in §7 about Aut_{k0}(k) apply with k0 = Q
and k = C since the subfield of C fixed by Aut_Q(C) is Q. This fact allows
us to give a reasonable way of deciding what members of K(X0(N)) are
in Q(j, j_N).
Corollary 11.50. A member of C(j, j_N) is in Q(j, j_N) if and only if
its q expansion at ∞ has all coefficients in Q.
Proof. Any element φ of Aut_Q(C) acts on C(j, j_N) and fixes
Q(j, j_N) because these fields are associated with varieties. Let
f(\tau) = \sum_{n} c_n q^n
be in C(j, j_N). If f is written as a rational function in j and j_N, then
f^φ is the same rational function but with each complex coefficient a
replaced by φ(a). The claim is that
f^{\varphi}(\tau) = \sum_{n} \varphi(c_n) q^n, (11.77)
and then the corollary will follow because Corollary 11.49 ensures that
the subfield of C(j, j_N) fixed by Aut_Q(C) is just Q(j, j_N).
By Theorem 11.32,
1=0
Since the coefficients of the q expansion of j_N are integers, it is enough
to prove (11.77) for f in C(j) = C(j^{-1}). Since the coefficients of the q
expansion of j are integers, it is enough to handle
f = (1 - P(j^{-1}))^{-1},
where P is a polynomial without constant term. We have
f = 1 + P(j^{-1}) + P(j^{-1})^2 + \cdots,
and the coefficients of 1, q, q^2, ..., q^M are the same for f as for
1 + P(j^{-1}) + \cdots + P(j^{-1})^M.
Similarly they are the same for f^φ as for
1 + P^{\varphi}(j^{-1}) + \cdots + P^{\varphi}(j^{-1})^M.
Then (11.77) follows.
Example. If M divides N, then j_M is in K(X0(N)) = C(j, j_N).
Since the q expansion of j_M has rational coefficients, j_M is in Q(j, j_N).
The inclusion Γ0(N) ⊆ Γ0(N/M) induces a holomorphic mapping F
of X0(N) onto X0(N/M) whose pullback on K(X0(N/M)) is given by
F*(f)(τ) = f(Mτ). Thus F*(j) = j_M and F*(j_{N/M}) = j_N, and
F*(Q(j, j_{N/M})) ⊆ Q(j, j_N). (11.78)
Comparison of Theorem 11.21 and Corollary 11.49 shows that the
holomorphic mapping F can be regarded as a morphism between complex
varieties. Bringing in (11.78) and looking at Corollary 11.49 again, we
see that F is defined over Q.
We mention one further corollary of our results. The proof is in the
same spirit as the above results and will be omitted.
Corollary 11.51. Let k0 be a subfield of C, and let C/k0 be a nonsingular
projective curve. If K1 is a subfield of k0(C) of finite index
containing k0, then there exist a nonsingular projective curve C'/k0 and a
dominant morphism F : C → C' defined over k0 such that F*(k0(C')) = K1.
The curve C' is unique up to isomorphism defined over k0, and F is
uniquely determined by the isomorphism of k0(C') with K1.
9. Abstract Elliptic Curves and Isogenies
The Eichler-Shimura theory does not work directly with nonsingular
Weierstrass equations. Instead it works with a certain kind of projective
curve that turns out to be isomorphic to a curve defined by a nonsingular
Weierstrass equation. In this section we shall expand our definition of
elliptic curve to include this wider kind of projective curve. We omit
proofs of general results about curves but usually give sketches of proofs
about elliptic curves. Let k0 ⊆ k be fields with the same assumptions as
in §7.
Let C1/k0 and C2/k0 be nonsingular projective curves, and let
F : C1 → C2 be a morphism. If F = 0, we define the degree of F
by deg F = 0. Otherwise F is onto, by Lemma 11.47, and F*(k0(C2))
has finite index in k0(C1), by Theorem 11.48. In this case we define
deg F = [k0(C1) : F*(k0(C2))]. (11.79)
This degree is the same if computed over k.
For the case of characteristic ≠ 0, one must pay attention to
separability of field extensions in the present context. Any algebraic extension
can be obtained in two steps, a separable one followed by a purely
inseparable one. For the field extension indicated in (11.79), we let deg_s F be
the degree of the separable part and deg_i F be the degree of the purely
inseparable part. Then deg F is the product of deg_s F and deg_i F. We
say that F is separable or purely inseparable if deg F = deg_s F or
deg F = deg_i F, respectively.
Recall that the quotient map H → Γ0(N)\H of Riemann surfaces fails
to be a covering map at elliptic fixed points of Γ0(N) in H. We shall
define the counterpart of this kind of ramification for varieties. For F
as above but nonconstant, let x be in C1 and let t_{F(x)} be a uniformizing
parameter at F(x). We define e_F(x) = ord_x(F*(t_{F(x)})). The morphism
F is unramified at x if e_F(x) = 1, ramified if e_F(x) > 1. It is
unramified if it is unramified at every point. One can prove for every
y in C2 that
\sum_{x \in F^{-1}(y)} e_F(x) = \deg F; (11.80)
also for all but finitely many y in C2,
\# F^{-1}(y) = \deg_s F. (11.81)
It follows immediately from (11.80) that F is unramified if and only if
#F^{-1}(y) = deg F for all y in C2.
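Formula (11.80) can be checked by hand in the simplest setting. The sketch below assumes that F is a polynomial map of the affine line to itself over Q, so that a uniformizer at y is Y - y and its pullback is F(X) - y; then e_F(x) is the multiplicity of x as a root of F(X) - y. The polynomial F(X) = X^3 - 3X^2 and the helper names are chosen purely for illustration, and only fibers all of whose points are rational are examined.

```python
def divide_by_linear(coeffs, r):
    """Synthetic division of a polynomial by (X - r).

    coeffs lists coefficients from the highest degree down.
    Returns (quotient coefficients, remainder).
    """
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + r * acc[-1])
    return acc[:-1], acc[-1]

def multiplicity(coeffs, r):
    """Multiplicity of r as a root of the polynomial (0 if r is not a root)."""
    count = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_by_linear(coeffs, r)
        if remainder != 0:
            break
        count += 1
        coeffs = quotient
    return count

F = [1, -3, 0, 0]           # F(X) = X^3 - 3X^2, of degree 3
DEG_F = len(F) - 1

def fiber_ramification(y, points):
    """Ramification indices e_F(x) for the listed points x of the fiber over y."""
    shifted = F[:-1] + [F[-1] - y]          # coefficients of F(X) - y
    return {x: multiplicity(shifted, x) for x in points}

over_0 = fiber_ramification(0, [0, 3])      # F(X) = X^2 (X - 3)
over_m4 = fiber_ramification(-4, [2, -1])   # F(X) + 4 = (X - 2)^2 (X + 1)
print(over_0, sum(over_0.values()))
print(over_m4, sum(over_m4.values()))
```

In both fibers one point is ramified with index 2 and the other is unramified, and the indices add up to deg F = 3, as (11.80) requires on the affine part of the map.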
Suppose C is defined over Z_p with I(C) generated by some
homogeneous polynomials f in Z_p[X_0, ..., X_n]. The Frobenius morphism φ
of C is given by
\varphi[(x_0, \dots, x_n)] = [(x_0^p, \dots, x_n^p)]. (11.82)
Since f(x_0^p, ..., x_n^p) = (f(x_0, ..., x_n))^p, φ really does map C to C and
is defined over Z_p.
More generally let char(k0) = p and q = p^r. If C is defined over k0 with
I(C) generated by some homogeneous polynomials f in k0[X_0, ..., X_n],
we define C^(q) by the ideal generated by all f^(q), where f^(q) is f with
all coefficients raised to the qth power. The qth power Frobenius
morphism φ carries C to C^(q) and is given by (11.82) with p replaced
by q. The situation in the previous paragraph is a special case because
C^(p) = C when C is defined over Z_p. One can prove that
(a) φ*(k0(C^(q))) = {f^q | f ∈ k0(C)}
(b) φ is purely inseparable
(c) deg φ = q.
In characteristic p, our original nonconstant morphism F : C1 → C2
admits a factorization F = F_s ∘ φ, where φ : C1 → C1^(q) is the qth
power Frobenius morphism with q = deg_i F and where F_s : C1^(q) → C2
is separable.
We can define the abelian group Div(C) of divisors of a nonsingular
projective curve C just as we did for a compact Riemann surface in §4.
In the case of f ≠ 0 in k(C), we say f has a zero at x if ord_x(f) > 0 or
a pole if ord_x(f) < 0. The divisor [f] of f makes sense because of the
following theorem.
Theorem 11.52. If C is a nonsingular projective curve and f is in
k(C), then f has only finitely many zeros and finitely many poles.
The divisor of a function is called a principal divisor. The degree
of a divisor Σ_x ord_x(D) x is the integer Σ_x ord_x(D).
Theorem 11.53. Every principal divisor on a nonsingular projective
curve has degree 0.
If F : C1 → C2 is a dominant morphism between nonsingular
projective curves, then F induces maps on divisors in both directions by
F_* : Div(C_1) \to Div(C_2), \quad x \mapsto F(x),
F^* : Div(C_2) \to Div(C_1), \quad y \mapsto \sum_{x \in F^{-1}(y)} e_F(x)\, x. (11.83)
The idea of the map F* was already implicit in our discussion centered
about (11.29).
The theory of divisors on C proceeds as in the case of compact
Riemann surfaces (§4). The set of principal divisors is denoted Div0(C),
and the quotient Pic(C) = Div(C)/Div0(C) is called the divisor class
group. Let Pic^0(C) be the quotient of the divisors of degree 0 by the
principal divisors. Our curve C has an analog of meromorphic
differentials on Riemann surfaces. These always exist, and the divisor class of
such differentials is called the canonical class. Linear systems L(D)
for divisors are defined just as in §4. If D < 0, then L(D) = 0 as a
consequence of Theorem 11.53. By an analog of Proposition 11.7, L(0)
is 1-dimensional.
If k = C, then the C points of C form a compact Riemann surface, and
genus has the customary topological meaning. One can give an abstract
definition of genus even in characteristic ≠ 0, but for our purposes we
may as well define the genus g by its value from the Riemann-Roch
Theorem with D = 0: g = dimL(W) for any W ^ 0 in the canonical
class.
Theorem 11.54 (Riemann-Roch Theorem). Let C be a nonsingular
projective curve of genus g, let D be a divisor, and let W be in the
canonical class. Then
dimL(D) = deg D + dim L(W - D) - g + 1.
Corollary 11.55. Let C be a nonsingular projective curve of genus
g. If D is a divisor with deg D > 2g - 2, then
dim L(D) = deg D - g + 1.
The Jacobian variety Pic^0(C) will be identified as a variety for the
genus 1 case in Corollary 11.60, and for the higher genus case in §10.
If C is defined over k0, the group Aut_{k0}(k) acts on points of C,
fixing exactly those in C(k0). The action on points induces an action on
divisors. We say that a divisor D is defined over k0 if it is fixed by
Aut_{k0}(k). (It is not necessary for each of the points with nonzero
coefficient in D to be fixed by Aut_{k0}(k).) The following result is fundamental.
Proposition 11.56. Let C/k0 be a nonsingular projective curve and
let D be a divisor defined over k0. Then the k vector space L(D) has a
basis defined over k0 (i.e., consisting of members of k0(C)).
We come to our expanded definition of elliptic curve. An elliptic
curve is a pair (E, O), where E is a nonsingular projective curve of
genus 1 and O is a point in E. The elliptic curve is defined over k0 if
E is defined over k0 and O is in E(k0).
If a curve E is given by a nonsingular Weierstrass equation (3.23)
over k0 and if O is taken as the usual point at ∞, then (E, O) is an
elliptic curve in the new sense, and it is defined over k0. To see this, we
need only compute the genus. The expression (11.21) is a member of
the canonical class with neither zeros nor poles, hence with divisor W
of degree 0. Then Theorem 11.54 with D = W gives
g = dim L(W) = deg W + dim L(0) - g + 1 = 0 + 1 - g + 1.
Hence g = 1. The converse is as follows.
Theorem 11.57. If (E, O) is an elliptic curve defined over k0, then
there exist functions x and y in k0(E) such that the map F : E → P_2(k)
with
F = [(x, y, 1)]
has F(O) = [(0, 1, 0)] and is an isomorphism defined over k0 onto a
nonsingular curve given by a Weierstrass equation (3.23) whose coefficients
are in k0.
Some comments about the proof may be helpful. For n ≥ 1, L(n(O))
has dimension n, by Corollary 11.55, and we may choose a basis for it
from k0(E) by Proposition 11.56. We take any x so that {1, x} is a basis
over k0 of L(2(O)) and then any y so that {1, x, y} is a basis over k0 of
L(3(O)). The seven functions 1, x, y, x^2, xy, y^2, x^3 are in L(6(O)), which
has dimension 6, and must be linearly dependent. A nontrivial linear
relation among the seven functions can be adjusted to give the desired
Weierstrass equation. (This argument can be implemented for specific
equations. For example, the change of variables that leads from (3.18)
to (3.19) and (3.20) can be deduced from it after (3.18) is desingularized
at ∞.)
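The counting behind this argument can be made explicit. The sketch below assumes, as the choice of bases forces, that x has a pole of exact order 2 at O and y a pole of exact order 3, so that the monomial x^a y^b has a pole of order 2a + 3b. The function name `monomials` is hypothetical. The code lists the monomials of pole order at most 6 and compares, for n = 1, ..., 6, the number of monomials x^a and x^a y lying in L(n(O)) with the dimension n given by Corollary 11.55.

```python
def monomials(n, max_b):
    """Exponent pairs (a, b) with b <= max_b such that x**a * y**b
    has pole order 2a + 3b <= n at O."""
    return [(a, b)
            for b in range(max_b + 1)
            for a in range(n // 2 + 1)
            if 2 * a + 3 * b <= n]

# The seven functions 1, x, x^2, x^3, y, xy, y^2 of pole order <= 6.
seven = monomials(6, 2)
print(len(seven), seven)

# Monomials x^a and x^a * y have pairwise distinct pole orders, hence are
# linearly independent; there are exactly n of them in L(n(O)) for n = 1..6.
for n in range(1, 7):
    candidates = monomials(n, 1)
    orders = [2 * a + 3 * b for a, b in candidates]
    print(n, len(candidates), sorted(orders))
```

Since six of the seven functions already account for the full dimension of L(6(O)), the remaining one, y^2, must be a linear combination of the others, which is the source of the Weierstrass relation.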
To show that the Weierstrass equation obtained in the proof of
Theorem 11.57 is nonsingular, one proves first that x and y generate k(E)
over k. Then it follows that F has degree 1. If the image is singular,
Proposition 3.11 allows us to compose F with a certain map to P_1(k)
such that the composition has degree 1. Since E and P_1(k) are both
nonsingular, Theorem 11.48 shows they are isomorphic. But they have
different genus, contradiction.
The same kind of argument classifies isomorphisms between elliptic
curves that are given by Weierstrass equations.
Proposition 11.58. If two elliptic curves (E, O) and (E', O') defined
over k0 are given by Weierstrass equations over k0 (with O and O'
corresponding to ∞), and if there exists an isomorphism F : E → E' defined
over k0 with F(∞) = ∞, then E and E' are related by an admissible
change of variables (3.43) with coefficients in k0.
The idea of the proof is as follows: If E ↔ (x, y) and E' ↔ (x', y'),
then {1, x} and {1, x'} are bases of L(2(O)) and L(2(O')), while {1, x, y}
and {1, x', y'} are bases of L(3(O)) and L(3(O')).
As a consequence of the above results, we can transfer the group
operation to a general elliptic curve from any of its associated Weierstrass
equations. We need to know that this group operation fits in with the
kinds of mappings we have been discussing, and we need to know how
to recognize the group operation without passing back and forth to a
Weierstrass equation.
Proposition 11.59.
(a) For an elliptic curve over k0 given by a Weierstrass equation over
k0, addition and negative are morphisms defined over k0.
(b) For an elliptic curve (E, O), a definition of addition that is
consistent with the definition in Weierstrass form is as follows: If D is
any divisor of degree 0, there exists a unique point z on E such that
D - (z - (O)) is a principal divisor. The sum of x and y is the point
corresponding to the divisor x + y - 2(O), the identity is (O), and the
negative of x is the point corresponding to the divisor (O) - x.
Part (a) is tedious to check. Part (b) follows by repeated use of
Theorem 11.54 and little else.
Corollary 11.60. For an elliptic curve (E, O), the mapping x ↦
x - (O) of E into Div(E) yields a group isomorphism of E onto Pic^0(E).
The corollary is just a reworded version of Proposition 11.59b. The
Riemann surface analog is Corollary 11.16a. From now on, we shall
typically say, "Let E be an elliptic curve," dropping reference to the group
identity. The identity will be denoted O and for Weierstrass equations
will be the point at oo.
If E1 and E2 are elliptic curves over k, an isogeny between them is a
morphism F : E1 → E2 with F(O) = O. All nonzero isogenies are onto,
by Lemma 11.47. When k = C, we saw in Chapter VI that isogenies are
automatically homomorphisms. The same thing is true in general.
Proposition 11.61. An isogeny is a group homomorphism.
Sketch of proof. We may assume that the isogeny F : E1 → E2
is not zero. Referring to (11.83) and Corollary 11.60, we write F as the
composition of three homomorphisms E1 ≅ Pic^0(E1), F_* : Pic^0(E1) →
Pic^0(E2), and Pic^0(E2) ≅ E2.
Examples.
1) If E is any elliptic curve, then multiplication by the integer m,
denoted [m], is an isogeny from E to E.
2) In §1 we wrote down an explicit isogeny F : E' → E between two
elliptic curves defined over Q such that j(E') = -16^3/11 and j(E) =
-2^12 31^3/11^5.
3) If E is an elliptic curve over Z_p, then the Frobenius map φ : E → E
given in (11.82) is an isogeny.
If E1 and E2 are elliptic curves defined over k, we let Hom(E1, E2)
be the set of all isogenies from E1 to E2. The pointwise sum F + G
of two members of Hom(E1, E2) is again in Hom(E1, E2), being the
composition of
diagonal : E1 → E1 × E1,
F × G : E1 × E1 → E2 × E2, and
+ : E2 × E2 → E2.
Then Hom(E1, E2) becomes an abelian group. The subset of isogenies
defined over k0 is a subgroup.
We write End(E) in place of Hom(E, E) when the two curves are the
same. The abelian group End(E) has a multiplication given by
composition. The distributive law F(G + H) = FG + FH follows immediately
by applying Proposition 11.61 to F.
Let E be an elliptic curve. For x in E, we define a translation map
T_x : E → E by T_x(y) = x + y. Then T_x^* is an automorphism of k(E);
if E is defined over k0 and x is in E(k0), then T_x^* is an automorphism
of k0(E).
Proposition 11.62. If F : E1 → E2 is a nonconstant isogeny
between elliptic curves defined over k, then ker F is finite, and the map
x ↦ T_x^* induces an isomorphism
\ker F \cong \mathrm{Aut}_{F^*(k(E_2))}(k(E_1)). (11.84)
If F is separable, then F is unramified and
deg F = #ker(F); (11.85)
moreover, k(E1) is a Galois extension of F*(k(E2)).
The proof uses (11.80) and (11.81), as well as a little Galois theory.
We omit the details.
For an elliptic curve over Z_p with k equal to the algebraic closure of
Z_p, Proposition 11.62 gives a way of calculating #E(Z_p). In fact, the
Frobenius morphism φ of (11.82)
fixes exactly those points of E that are in E(Z_p). Thus
x ∈ E(Z_p) ⟺ φ(x) = x ⟺ ([1] - φ)(x) = O ⟺ x ∈ ker([1] - φ)
and
#E(Z_p) = #ker([1] - φ). (11.86)
A fundamental fact, which we shall not prove, is that
1 - φ is a separable isogeny. (11.87)
Combining this fact with (11.85) and (11.86), we obtain
#E(Z_p) = deg([1] - φ). (11.88)
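Formula (11.86) can be tested numerically on a small example. The Python sketch below makes several assumptions that are not in the text: the curve y^2 = x^3 + x + 1 over Z_7 is a hypothetical choice, the field with 49 elements is modeled as Z_7[i] with i^2 = -1 (legitimate because -1 is not a square modulo 7), and only points with coordinates in that field are examined rather than all points over the algebraic closure. Within that range it checks that the Frobenius map of (11.82) carries points of the curve to points of the curve and that its fixed points are exactly the points counted over Z_7.

```python
P = 7          # the prime p; elements of F_49 are pairs (a, b) meaning a + b*i
A, B = 1, 1    # hypothetical curve y^2 = x^3 + A*x + B with coefficients in Z_7

def f2_mul(u, v):
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def f2_add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def f2_pow(u, n):
    result = (1, 0)
    for _ in range(n):
        result = f2_mul(result, u)
    return result

def on_curve(x, y):
    rhs = f2_add(f2_add(f2_pow(x, 3), f2_mul((A, 0), x)), (B, 0))
    return f2_mul(y, y) == rhs

def frobenius(point):
    """The map (x, y) -> (x^p, y^p) of (11.82) in affine coordinates."""
    x, y = point
    return (f2_pow(x, P), f2_pow(y, P))

FIELD = [(a, b) for a in range(P) for b in range(P)]
affine_points = [(x, y) for x in FIELD for y in FIELD if on_curve(x, y)]

# Frobenius maps the curve to itself.
stays_on_curve = all(on_curve(*frobenius(pt)) for pt in affine_points)

# Points fixed by Frobenius, plus the point at infinity (which is also fixed).
count_fixed = 1 + sum(1 for pt in affine_points if frobenius(pt) == pt)

# Direct count of E(Z_p), again including the point at infinity.
count_over_Fp = 1 + sum(1 for x in range(P) for y in range(P)
                        if (y * y - (x ** 3 + A * x + B)) % P == 0)

print(stays_on_curve, count_fixed, count_over_Fp)
```

The two counts agree, in line with (11.86). The sketch does not compute deg([1] - φ) itself, so it illustrates (11.86) only and not (11.88).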
Lemma 11.63. Let F : E1 → E2 and G : E1 → E3 be nonconstant
isogenies of elliptic curves over k. If F is separable and ker F ⊆ ker G,
then there exists a unique isogeny H : E2 → E3 such that G = H ∘ F.
If F and G are defined over k0, so is H.
Proof. The inclusion of kernels and (11.84) give natural
isomorphisms
\mathrm{Aut}_{F^*(k(E_2))}(k(E_1)) \cong \ker F \subseteq \ker G \cong \mathrm{Aut}_{G^*(k(E_3))}(k(E_1)).
Hence every automorphism fixing F*(k(E2)) fixes G*(k(E3)). By
Proposition 11.62, k(E1) is Galois over F*(k(E2)). Hence G*(k(E3)) ⊆
F*(k(E2)). Then Theorem 11.48 gives us a corresponding morphism
H : E2 → E3 with F* ∘ H* = G*. By uniqueness in Theorem 11.48,
G = H ∘ F. Then O = G(O) = H(F(O)) = H(O) shows that H is an
isogeny. If F and G are defined over k0, we can apply Theorem 11.48 to
G*(k0(E3)) ⊆ F*(k0(E2)) to construct H as defined over k0.
Theorem 11.64. Let F : E1 → E2 be a nonconstant isogeny of
degree m between elliptic curves defined over k. Then there exists a
unique isogeny F̂ : E2 → E1 such that F̂ ∘ F = [m]. If F is defined over
k0, so is F̂.
REMARKS. F̂ is called the dual isogeny to F. We say that E1
is isogenous to E2 if there is a nonconstant isogeny from E1 to E2.
As a consequence of this theorem, "is isogenous to" is an equivalence
relation. If we restrict attention to elliptic curves defined over k0, it is
still an equivalence relation.
SKETCH OF PROOF. As a morphism, $F$ factors into $F = F_s \circ \phi$, where
$F_s$ is separable and $\phi$ is a $q$th power Frobenius. Thus it is enough to treat
$F_s$ and $\phi$ individually. We treat only $F_s$, changing notation and writing
$F$ for it. By (11.85), $\#\ker(F) = m$. Thus the map $[m] : E_1 \to E_1$
annihilates $\ker F$, and $\ker F \subseteq \ker[m]$. Application of Lemma 11.63
with $G = [m]$ completes the proof.
Dual isogenies have the following properties, whose proofs we omit.
Proposition 11.65. Let $F : E_1 \to E_2$ be an isogeny of degree $m$.
(a) $\hat{F} \circ F = [m]$ on $E_1$, and $F \circ \hat{F} = [m]$ on $E_2$.
(b) $\deg \hat{F} = \deg F$ and $\hat{\hat{F}} = F$.
(c) $\widehat{[m]} = [m]$ and $\deg [m] = m^2$.
(d) If $G : E_2 \to E_3$ is an isogeny, then $\widehat{G \circ F} = \hat{F} \circ \hat{G}$.
(e) If $H : E_1 \to E_2$ is an isogeny, then $\widehat{F + H} = \hat{F} + \hat{H}$.
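As a small sanity check of $\deg [m] = m^2$ for $m = 2$, one can compare the tangent-line doubling rule with the classical closed formula for the $x$-coordinate of $[2]P$ on $y^2 = x^3 + ax + b$, whose numerator has degree $4 = 2^2$ in $x$. The sketch below is illustrative only; the curve $y^2 = x^3 - x + 1$, the point $(1, 1)$, and the function names are assumptions chosen here, not taken from the text.

```python
from fractions import Fraction

def double_point(x, y, a):
    """Double an affine point with y != 0 on y^2 = x^3 + a*x + b via the tangent line."""
    lam = (3 * x * x + a) / (2 * y)
    x3 = lam * lam - 2 * x
    y3 = lam * (x - x3) - y
    return x3, y3

def double_x_formula(x, a, b):
    """x-coordinate of [2]P as a rational function of x; the numerator has degree 4."""
    numerator = x ** 4 - 2 * a * x ** 2 - 8 * b * x + a * a
    denominator = 4 * (x ** 3 + a * x + b)
    return numerator / denominator

a, b = Fraction(-1), Fraction(1)
P1 = (Fraction(1), Fraction(1))          # on the curve: 1 = 1 - 1 + 1
P2 = double_point(P1[0], P1[1], a)       # [2]P1
P4 = double_point(P2[0], P2[1], a)       # [4]P1

for (x, y) in (P1, P2, P4):
    assert y * y == x ** 3 + a * x + b   # every computed point lies on the curve

print(P2, double_x_formula(P1[0], a, b))
print(P4[0], double_x_formula(P2[0], a, b))
```

Both printed pairs show the same $x$-coordinate from the two methods; the degree-4 numerator is the concrete trace of $\deg [2] = 4$.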
The next result is a converse to Proposition 11.62.
Theorem 11.66. Let $E$ be an elliptic curve over $k$, and let $S$ be
a finite subgroup of $E$. Then there exist an elliptic curve $E'$ and a
separable isogeny $F : E \to E'$ such that $\ker F = S$. The curve $E'$ is
unique up to isomorphism. If $E$ is defined over $k_0$ and if $S$ is stable under
$\mathrm{Aut}_{k_0}(k)$, then $E'$ is defined over $k_0$, and $F$ may be taken to be defined over
$k_0$.
Idea of proof over $k$ if $\mathrm{char}(k) = 0$. The set $\{T_x^* \mid x \in S\}$ is
a finite group of automorphisms of $k(E)$. We apply Lemma 11.35 to
determine a subfield, Theorem 11.44 to define a nonsingular projective
curve $C$, and Theorem 11.48 to define a dominant morphism $F : E \to C$.
One has to prove that $F$ is unramified and then that the genus of $C$ is
one.
Let us return to elliptic curves over Q. The L function of such a curve
was defined in (10.9).
Theorem 11.67. Let $E$ and $E'$ be elliptic curves over $\mathbb{Q}$ that are
isogenous over $\mathbb{Q}$. Then $L(s, E) = L(s, E')$.
SKETCH OF PROOF. The two $L$ functions will be equal factor by
factor. We treat the $p$th factor only for primes $p$ not dividing $\Delta$ or $\Delta'$,
omitting the others. If $F : E \to E'$ is a nonconstant isogeny defined over
$\mathbb{Q}$, one has to check that $F$ induces a nonconstant isogeny $F_p : E_p \to E'_p$
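defined over $\mathbb{Z}_p$. Taking this for granted, let $\phi_1$ and $\phi_2$ be the Frobenius
morphisms of $E_p$ and $E'_p$ defined in (11.82). We can write $F_p = F_s\phi_1$
with $F_s : E_p \to E'_p$ separable and defined over $\mathbb{Z}_p$. So we may as well
assume from the outset that $F_p : E_p \to E'_p$ is a separable isogeny defined
over $\mathbb{Z}_p$.
If we write out the morphism $F_p$ as $[(F_0, F_1, F_2)]$, we have
\[ F_p(\phi_1(x)) = [(F_0(x^p), F_1(x^p), F_2(x^p))] = [(F_0(x)^p, F_1(x)^p, F_2(x)^p)] = \phi_2(F_p(x)). \]
Thus
\[ F_p \circ \phi_1 = \phi_2 \circ F_p. \tag{11.89} \]
If $y$ is in $E'_p$, then $y = F_p(x)$ for $\deg F_p$ values of $x$, by (11.85) since $F_p$
is onto. For such an $x$,
\[ y \in E'_p(\mathbb{Z}_p) \iff \phi_2(y) = y \iff \phi_2 F_p(x) = F_p(x) \iff x \in \ker(([1] - \phi_2)F_p). \]
Thus
\begin{align*}
\#E'_p(\mathbb{Z}_p) &= \frac{\#\ker(([1] - \phi_2)F_p)}{\deg F_p} \\
&= \frac{\#\ker(F_p([1] - \phi_1))}{\deg F_p} && \text{by (11.89)} \\
&= \frac{\deg(F_p([1] - \phi_1))}{\deg F_p} && \text{by (11.87) and (11.85)} \\
&= \frac{\deg(F_p)\,\deg([1] - \phi_1)}{\deg F_p} \\
&= \deg([1] - \phi_1) \\
&= \#E_p(\mathbb{Z}_p) && \text{by (11.88)}.
\end{align*}
Hence the $p$th factors of $L(s, E)$ and $L(s, E')$ match.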
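The equality of point counts for isogenous curves can be observed numerically. The sketch below is an illustration under assumptions not made in the text: it uses the two curves $y^2 + y = x^3 - x^2 - 10x - 20$ and $y^2 + y = x^3 - x^2$ of conductor 11, which are known to be isogenous over $\mathbb{Q}$, and a hypothetical helper `count_points` that counts solutions of a general Weierstrass equation by brute force.

```python
def count_points(ainv, p):
    """Count points (including the point at infinity) on
    y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 over the field with p elements."""
    a1, a2, a3, a4, a6 = ainv
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 + a2 * x * x + a4 * x + a6) % p
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                total += 1
    return total

E1 = (0, -1, 1, -10, -20)   # y^2 + y = x^3 - x^2 - 10x - 20
E2 = (0, -1, 1, 0, 0)       # y^2 + y = x^3 - x^2

good_primes = [2, 3, 5, 7, 13, 17, 19]   # primes other than 11
counts = {p: (count_points(E1, p), count_points(E2, p)) for p in good_primes}
for p, (n1, n2) in counts.items():
    print(p, n1, n2)
```

For each listed prime the two counts coincide, which is the statement that the $p$th Euler factors agree.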
10. Abelian Varieties and Jacobian Variety
In this section our algebraically closed field $k$ will be $\mathbb{C}$, and the
subfield $k_0$ will be $\mathbb{Q}$. An abelian variety $A$ is a nonsingular projective
variety over $\mathbb{C}$ with a distinguished point $O$ and with an abelian group
structure such that $O$ is the identity and the operations of addition and
negative are morphisms. The abelian variety is said to be defined over
$\mathbb{Q}$ if $A$ is defined over $\mathbb{Q}$, $O$ is in $A(\mathbb{Q})$, and addition and negative are
defined over $\mathbb{Q}$. Any elliptic curve (over $\mathbb{C}$ or $\mathbb{Q}$) is an example, by
Proposition 11.59a. Conversely an abelian variety of dimension 1 is an
elliptic curve. (We have only to see that the genus is 1. We can work
with $A(\mathbb{C})$, which is a compact Riemann surface. Translations are
fixed-point free automorphisms homotopic to the identity, and the Lefschetz
Fixed-Point Theorem allows these only in genus 1.)
The goal of this section is to introduce the Jacobian variety J(C) of a
nonsingular projective curve C defined over Q. J(C) is to be an abelian
variety defined over Q and is to play the same role in geometry over Q
that the Jacobian variety of §4 plays in the theory of compact Riemann
surfaces.
Let A and B be abelian varieties. A homomorphism F : A —> B
is a morphism of varieties that is also a homomorphism of groups. We
say that F is an isomorphism if it is an isomorphism of varieties and
an isomorphism of groups. If A and B are defined over Q, we say F is
defined over Q if it is defined over Q as a morphism.
The set $\mathrm{Hom}(A, B)$ of homomorphisms from $A$ to $B$ is an abelian
group under pointwise addition. For the special case that $A = B$, we
write $\mathrm{End}\,A$ in place of $\mathrm{Hom}(A, A)$. This is a ring with composition as
multiplication.
If A is an abelian variety and C is a subvariety (obtained by enlarging
the defining homogeneous ideal of polynomials), then the inclusion map
is automatically a morphism. If $C$ has a group structure compatible with
that of $A$, then inclusion and addition give morphisms $C \times C \to A \times A \to
A$ with image in $C$ that show that addition on $C$ is a morphism. Similarly
negative on C is a morphism. We say that C is an abelian subvariety
of A. The kernel of a homomorphism between abelian varieties is an
example if it is connected in the complex manifold topology. We shall
make use of the following two propositions.
Proposition 11.68. If $F : B \to A$ is a homomorphism between
abelian varieties, then $C = \mathrm{image}\,F$ is an abelian subvariety of $A$. If $A$,
$B$, and $F$ are defined over $\mathbb{Q}$, then $C$ may be taken to be defined over
$\mathbb{Q}$.
Proposition 11.69. Let $A$ be an abelian variety, and let $C$ be an
abelian subvariety. Then $A/C$ may be given the structure of an abelian
variety in the following sense: There exists a pair $(A', F)$ such that $A'$
is an abelian variety, $F : A \to A'$ is a homomorphism onto, $\ker F =$
$C$, and any homomorphism $F'' : A \to A''$ of abelian varieties with
$\ker F'' \supseteq C$ factors through $F$, i.e., $F'' = F' \circ F$ for a homomorphism
$F' : A' \to A''$ of abelian varieties. The pair $(A', F)$ is unique up to
canonical isomorphism. If $A$ and $C$ are defined over $\mathbb{Q}$, then $A'$ and $F$
will be defined over $\mathbb{Q}$; in this case, when $A''$ and $F''$ in the universal
property are defined over $\mathbb{Q}$, $F'$ may be taken to be defined over $\mathbb{Q}$.
The next proposition gives some properties of abelian varieties and
homomorphisms in order to shed some light on the definitions. Although
the proposition helps to motivate parts of §11 and Chapter XII, it will
not be used explicitly.
Proposition 11.70. Let A and B be abelian varieties.
(a) A homomorphism F : A —► B is onto if and only if its kernel is
finite. (In this case, A and B are said to be isogenous.)
(b) Isogeny is symmetric and hence is an equivalence relation.
(c) If V is a nonsingular projective variety and F : V —* A is a rational
map, then F is a morphism.
(d) Any morphism F : A —► B is the composition of a homomorphism
and a translation.
(e) If $B$ is an abelian subvariety of $A$, then there exists an abelian
subvariety $C$ of $A$ such that $B \cap C$ is a finite group and $A = B + C$
as groups. If $A$ and $B$ are defined over $\mathbb{Q}$, then $C$ may be taken to be
defined over $\mathbb{Q}$.
(f) A is isogenous to a product of abelian varieties that are simple (in
the sense of not having any proper nonzero abelian subvarieties), and
the factors of the decomposition are unique up to isogeny.
For $X$ a compact Riemann surface of genus $g$, we defined the Jacobian
variety $J(X) = \mathbb{C}^g/\Lambda(X)$ in §4. With a base point $x_0$ fixed in $X$, there
is a canonical holomorphic mapping $\Phi : X \to J(X)$ with the following
universal property. Whenever $F : X \to T$ is a holomorphic mapping
into a complex torus, then $F$ factors through the Jacobian variety:
\[ F = f \circ \Phi + F(x_0) \]
for some holomorphic homomorphism $f : J(X) \to T$.
When the base point in $X$ is changed, $\Phi$ gets composed with a
translation. The universal property makes $J(X)$ unique up to a biholomorphic
group isomorphism. If $J(X)$ is fixed, $\Phi$ is unique up to composition with
a translation and a biholomorphic automorphism of $J(X)$.
If C is a nonsingular projective curve over C, then C(C) has an
underlying complex manifold structure, hence is a compact Riemann surface.
Thus C has a Jacobian variety in the above sense. The following
theorem solves the problem of putting on the Jacobian variety the structure
of a nonsingular projective variety.
Theorem 11.71. Let $C$ be a nonsingular projective curve over $\mathbb{C}$,
let $J(C)$ be the Jacobian variety of the underlying compact Riemann
surface, and let $\Phi : C \to J(C)$ be the canonical holomorphic mapping
with base point $x_0$. Then $J(C)$ admits the structure of a nonsingular
projective variety in such a way that
(a) its group structure makes it into an abelian variety,
(b) $\Phi$ is a morphism, and
(c) whenever $F : C \to A$ is a morphism into an abelian variety,
then $F$ factors through $J(C)$ as $F = f \circ \Phi + F(x_0)$ for some
homomorphism $f : J(C) \to A$ of abelian varieties.
Moreover, if $C$ is defined over $\mathbb{Q}$, then $J(C)$ can be defined over $\mathbb{Q}$ in
such a way that (a), (b), and (c) are valid with structures defined over
$\mathbb{Q}$.
There is a uniqueness statement for Theorem 11.71, just as in the
setting of compact Riemann surfaces, and it is again easy. Existence,
however, is a deep question. Theorem 11.71 is due to Lefschetz for C and
to Weil and Chow for Q. The following result will be handy in detecting
morphisms that involve J(C).
Theorem 11.72 (Chow). If $V_1$ and $V_2$ are nonsingular projective
varieties over $\mathbb{C}$ and if $F : V_1 \to V_2$ is a holomorphic mapping between
their underlying complex manifolds, then $F$ is rational over $\mathbb{C}$.
We shall apply Theorem 11.71 to $C = X_0(N)$, which is defined over
$\mathbb{Q}$. The ring of holomorphic homomorphisms of $J(X_0(N))$ into itself is
the same as the ring $\mathrm{End}(J(X_0(N)))$ of abelian variety homomorphisms
defined over $\mathbb{C}$, as a consequence of Theorem 11.72. Two key steps in the
Eichler-Shimura theory are to transform the Hecke operators into
members of $\mathrm{End}(J(X_0(N)))$ and to show that the resulting homomorphisms
are defined over $\mathbb{Q}$. We carry out the first of these steps now.
For this purpose we fix notation as follows: Let $X_0(N)$ have genus
$g$, and let $\omega_1, \dots, \omega_g$ be the basis of $\Omega_{\mathrm{hol}}(X_0(N))$ used to define $J =
J(X_0(N))$ concretely as $\mathbb{C}^g/\Lambda(X_0(N))$. (Recall that $\mathbb{C}^g$ and $\Lambda(X_0(N))$
are not affected by changing the $\mathbb{Z}$ basis of cycles.) Let $\pi : \mathcal{H}^* \to
\Gamma_0(N)\backslash\mathcal{H}^*$ be the quotient map and $\tilde{\Phi} : \mathcal{H}^* \to J$ be the composition
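$\tilde{\Phi} = \Phi \circ \pi$. If we put $f_j(\tau)\,d\tau = \pi^*(\omega_j)$, then $f_1, \dots, f_g$ is a basis for
$S_2(\Gamma_0(N))$ and $\tilde{\Phi}$ is given by
\[ \tilde{\Phi}(\tau) = \begin{pmatrix} \int_{\tau_0}^{\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \int_{\tau_0}^{\tau} f_g(\zeta)\,d\zeta \end{pmatrix} \tag{11.90} \]
for any base point $\tau_0 \in \pi^{-1}(x_0)$.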
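Since $J = \mathbb{C}^g/\Lambda(X_0(N))$, we may identify the tangent space $\mathfrak{j}$ to $J$
at the group identity $O$ as the same $\mathbb{C}^g$. Its standard basis will be
denoted $e_1, \dots, e_g$. The group $J$ is a Lie group, and its Lie algebra is
$\mathfrak{j} = \mathbb{C}^g$, column-vector space. Any real analytic homomorphism $t$ of $J$
into itself induces (by passage to the differential $dt$ at $O$) an $\mathbb{R}$ linear
map of $\mathbb{C}^g$ into itself, and distinct homomorphisms yield distinct linear
maps. The members of $\mathrm{End}(J)$ are holomorphic, and their differentials
are consequently $\mathbb{C}$ linear maps of $\mathfrak{j} = \mathbb{C}^g$ into itself. In this way we get
a one-one ring homomorphism of $\mathrm{End}(J)$ into the algebra of all $g$-by-$g$
complex matrices:
\[ \mathrm{End}(J) \to M(g, \mathbb{C}). \tag{11.91} \]
We can use $z_1, \dots, z_g$ as coordinates on $J$, and the space $\Omega_{\mathrm{hol}}(J)$ of
holomorphic 1-forms is the $\mathbb{C}$ linear span of $dz_1, \dots, dz_g$. The spaces
$\Omega_{\mathrm{hol}}(J)$ and $\mathfrak{j}$ are in natural duality, the pairing being given by
\[ \langle dz_i, e_j \rangle = \delta_{ij}. \]
If we identify $\mathfrak{j}$ with the space of invariant vector fields on $J$, we can
regard this formula also as a valid pairing at any point of $J$. For $t$ in
$\mathrm{End}(J)$, we define an endomorphism $\delta t$ of $\Omega_{\mathrm{hol}}(J)$ by
\[ \langle (\delta t)u, v \rangle = \langle u, (dt)v \rangle \quad \text{for } u \in \Omega_{\mathrm{hol}}(J) \text{ and } v \in \mathfrak{j}. \tag{11.92} \]
Let us check that
\[ \tilde{\Phi}^*(dz_i) = f_i(\tau)\,d\tau. \tag{11.93} \]
In fact, at any point $\tau_1$, differentiation of (11.90) at $\tau_1$ gives
\[ d\tilde{\Phi}\Big(\frac{\partial}{\partial\tau}\Big|_{\tau_1}\Big) = \begin{pmatrix} f_1(\tau_1) \\ \vdots \\ f_g(\tau_1) \end{pmatrix}. \]
Since $\big\langle d\tau, \frac{\partial}{\partial\tau}\big|_{\tau_1} \big\rangle = 1$, we obtain (11.93). Since $\tilde{\Phi}^*$ maps basis to basis,
$\tilde{\Phi}^*$ is a vector space isomorphism. Therefore it makes sense to define
$\mu : S_2(\Gamma_0(N)) \to \Omega_{\mathrm{hol}}(J)$ by
\[ \tilde{\Phi}^*(\mu(f)) = f(\tau)\,d\tau \quad \text{for } f \in S_2(\Gamma_0(N)). \tag{11.94} \]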
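Let $M(n, N) = \bigcup_{i=1}^{K} \Gamma_0(N)\alpha_i$ as in (11.28). In (11.29) and (11.32) we
defined consistently a Hecke operator $T(n) : X_0(N) \to \mathrm{Div}(X_0(N))$ by
\[ T(n)(\pi(\tau)) = \sum_{i=1}^{K} (\pi(\alpha_i\tau)). \tag{11.95} \]
The mapping $\Phi : X_0(N) \to J$ extends by additivity to a map
\[ \Phi^\# : \mathrm{Div}(X_0(N)) \to J, \]
and then we can form the composition $T^\#(n) = \Phi^\# \circ T(n)$ given by
\[ T^\#(n)(\pi(\tau)) = \sum_{i} \Phi(\pi(\alpha_i\tau)) = \begin{pmatrix} \sum_i \int_{\tau_0}^{\alpha_i\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \sum_i \int_{\tau_0}^{\alpha_i\tau} f_g(\zeta)\,d\zeta \end{pmatrix}. \tag{11.96} \]
Equation (11.96) exhibits $T^\#(n)$ as holomorphic; by Theorem 11.72 it is
a morphism of varieties over $\mathbb{C}$. In §11 one of the steps will be to prove
that $T^\#(n)$ is defined over $\mathbb{Q}$.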
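To make the finite sum in the Hecke operator concrete, the following sketch enumerates one standard choice of coset representatives $\alpha_i$. It is an assumption of this sketch, not a quotation of (11.28), that the representatives may be taken to be the upper triangular matrices with entries $a, b, 0, d$ where $ad = n$, $\gcd(a, N) = 1$, and $0 \le b < d$; the function name is hypothetical.

```python
from math import gcd

def hecke_representatives(n, N):
    """Upper triangular integer matrices (a, b, 0, d) with a*d = n,
    gcd(a, N) = 1 and 0 <= b < d, listed as 4-tuples (a, b, c, d)."""
    reps = []
    for a in range(1, n + 1):
        if n % a == 0 and gcd(a, N) == 1:
            d = n // a
            reps.extend((a, b, 0, d) for b in range(d))
    return reps

reps_5 = hecke_representatives(5, 11)    # prime not dividing the level
reps_11 = hecke_representatives(11, 11)  # prime dividing the level

print(len(reps_5), reps_5)
print(len(reps_11))
```

Under this assumption a prime $p$ not dividing $N$ gives $K = p + 1$ terms in the sum, while a prime dividing $N$ gives $K = p$ terms.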
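Applying the universal mapping property to $T^\#(n) : X_0(N) \to J$
(with $A = J$), we obtain a member $t(n)$ of $\mathrm{End}\,J$ such that
\[ T^\#(n)(\pi(\tau)) = t(n)(\Phi(\pi(\tau))) + T^\#(n)(\pi(\tau_0)) \]
for all $\tau$. What this equation says is
\[ t(n)\begin{pmatrix} \int_{\tau_0}^{\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \int_{\tau_0}^{\tau} f_g(\zeta)\,d\zeta \end{pmatrix} = \begin{pmatrix} \sum_i \int_{\alpha_i\tau_0}^{\alpha_i\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \sum_i \int_{\alpha_i\tau_0}^{\alpha_i\tau} f_g(\zeta)\,d\zeta \end{pmatrix}. \tag{11.97} \]
Since $t(n)$ is additive, we can evaluate (11.97) at $\tau_1$ and subtract the
result from (11.97) to obtain
\[ t(n)\begin{pmatrix} \int_{\tau_1}^{\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \int_{\tau_1}^{\tau} f_g(\zeta)\,d\zeta \end{pmatrix} = \begin{pmatrix} \sum_i \int_{\alpha_i\tau_1}^{\alpha_i\tau} f_1(\zeta)\,d\zeta \\ \vdots \\ \sum_i \int_{\alpha_i\tau_1}^{\alpha_i\tau} f_g(\zeta)\,d\zeta \end{pmatrix}. \tag{11.98} \]
To compute the differential $dt(n)$, we can differentiate the formula for
$t(n)$ along any curve extending from the $O$ vector. Differentiating (11.98)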
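at $\tau = \tau_1$ and then writing $\tau$ for $\tau_1$ (since $\tau_1$ is arbitrary), we find
\[ dt(n)\begin{pmatrix} f_1(\tau) \\ \vdots \\ f_g(\tau) \end{pmatrix} = \begin{pmatrix} \sum_i f_1 \circ [\alpha_i]_2(\tau) \\ \vdots \\ \sum_i f_g \circ [\alpha_i]_2(\tau) \end{pmatrix} = \begin{pmatrix} T_2(n)f_1(\tau) \\ \vdots \\ T_2(n)f_g(\tau) \end{pmatrix} \tag{11.99} \]
for all $\tau$.
Consequently the matrix of $dt(n)$ is $A(T_2(n))$, in the notation of
(11.47). We shall use the reformulation below of this result.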
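Proposition 11.73 (Shimura-Taniyama). For $f$ in $S_2(\Gamma_0(N))$,
\[ (\delta t(n))(\mu(f)) = \mu(T_2(n)f). \tag{11.100} \]
Proof. Write $f = \sum_k c_k f_k$, and let $e_l$ be a standard basis vector of $\mathfrak{j}$.
Then we have
\begin{align*}
\langle (\delta t(n))(\mu(f)), e_l \rangle &= \langle \mu(f), dt(n)e_l \rangle && \text{by (11.92)} \\
&= \sum_k c_k \langle dz_k, dt(n)(e_l) \rangle \\
&= \sum_k c_k A(T_2(n))_{kl}
\end{align*}
and
\begin{align*}
\langle \mu(T_2(n)f), e_l \rangle &= \sum_k c_k \langle \mu(T_2(n)f_k), e_l \rangle \\
&= \sum_k c_k \Big\langle \mu\Big(\sum_m A(T_2(n))_{km} f_m\Big), e_l \Big\rangle \\
&= \sum_{k,m} c_k A(T_2(n))_{km} \langle dz_m, e_l \rangle \\
&= \sum_k c_k A(T_2(n))_{kl}.
\end{align*}
The above equations show that the two sides of (11.100) are paired
equally with any member of $\mathfrak{j}$ and therefore must be equal.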
11. Elliptic Curves Constructed from $S_2(\Gamma_0(N))$
We come to the main theorem of the Eichler-Shimura theory. We
continue with the notation $X_0(N)$, $g$, $\Lambda(X_0(N))$, $J$, $\Phi$, $\{f_1, \dots, f_g\}$, $\tau_0$,
$\mathfrak{j}$, $\mathrm{End}\,J$, $\Omega_{\mathrm{hol}}(J)$, $d$ and $\delta$, $\mu$, $T(n)$, $T^\#(n)$, $t(n)$, $dt(n)$, and $\delta t(n)$ as in
the latter part of §10.
In order to handle one step in the proof of the main theorem, we
shall assume that the basis $\{f_1, \dots, f_g\}$ of $S_2(\Gamma_0(N))$ over $\mathbb{C}$ has been
constructed as in Theorem 11.25, so that the matrices of the Hecke
operators $T_2(n)$ have integer entries. By (11.99) these integer matrices
are the matrices of $dt(n)$.
We define $\mathrm{End}_{\mathbb{Q}}(J) = \mathrm{End}(J) \otimes_{\mathbb{Z}} \mathbb{Q}$. By (11.91), $\mathrm{End}_{\mathbb{Q}}(J)$ can be
regarded as an algebra over $\mathbb{Q}$ of $g$-by-$g$ complex matrices.
Theorem 11.74 (Eichler-Shimura). Let $f(\tau) = \sum_{n=1}^{\infty} c_n e^{2\pi i n\tau}$ be a
newform in $S_2(\Gamma_0(N))$ normalized to have $c_1 = 1$, and suppose that all
$c_n$ are in $\mathbb{Z}$. Then there exists a pair $(E, \nu)$ such that
(a) $E$ is an elliptic curve defined over $\mathbb{Q}$, and $(E, \nu)$ is a quotient of
$J$ by an abelian subvariety $A$ of $J$ defined over $\mathbb{Q}$,
(b) the members $t(n)$ of $\mathrm{End}(J)$ leave $A$ stable and act on the quotient
$E$ as multiplication by the integers $c_n$,
(c) $\mu(f)$ is a nonzero multiple of $\nu^*(\omega)$, where $\omega$ is the invariant
differential (11.21) of $E$,
(d) if
\[ \Lambda_f = \Big\{ \Phi_f(\gamma) = \int_{\tau_0}^{\gamma(\tau_0)} f(\zeta)\,d\zeta \;\Big|\; \gamma \in \Gamma_0(N) \Big\}, \]
then $\Lambda_f$ is a lattice in $\mathbb{C}$, and $E$ is isomorphic to $\mathbb{C}/\Lambda_f$ over $\mathbb{C}$,
(e) the $L$ functions of $E$ and $f$ coincide as Euler products except
possibly at finitely many primes.
Moreover, properties (a) and (b) characterize $A$ uniquely and therefore
determine $(E, \nu)$ up to isomorphism defined over $\mathbb{Q}$.
Remarks. Composing $\nu$ in (a) with $\Phi : X_0(N) \to J$, we obtain a
morphism $\nu \circ \Phi : X_0(N) \to E$ defined over $\mathbb{Q}$. The proof of the theorem
does not use that $f$ is a newform, only that $f$ is an eigenfunction of
all the Hecke operators $T_2(n)$. Work of Igusa shows that the
Eichler-Shimura argument for (e) is applicable to all primes not dividing $N$. If
$f$ is a newform, then Theorem 12.8 will note that $L(E, s)$ and $L(f, s)$
match exactly.
The proof has a characteristic 0 part and a characteristic p part, the
latter to handle (e). We shall give the full characteristic 0 part in this
section and shall discuss the characteristic p part in §12. We need the
following extension of the standard Wedderburn theorem on semisimple
associative algebras.
Lemma 11.75 (Wedderburn). Let $\mathcal{T}$ be a finite-dimensional
associative algebra with identity defined over a field $k$, and let $\mathcal{R}$ be its
nilradical (largest nilpotent two-sided ideal). Then there exists a semisimple
subalgebra $\mathcal{S}$ of $\mathcal{T}$ such that $\mathcal{T} = \mathcal{S} \oplus \mathcal{R}$ as vector spaces. Moreover, $\mathcal{S}$
is a direct sum of ideals, each of which is a simple algebra isomorphic to
a full matrix algebra over a division algebra over $k$.
REMARK. We need the lemma only in the case that T is commutative,
and then the simple algebras in S are isomorphic to fields that are finite
algebraic extensions of k.
In order to get to the body of the proof quickly, we postpone to the
end of this section the proof of the next lemma.
Lemma 11.76. The members t(n) of End( J) are defined over Q.
Proof of Theorem 11.74 except part (e). Let $\mathcal{T}$ be the
commutative $\mathbb{Q}$ subalgebra of $\mathrm{End}_{\mathbb{Q}}(J)$ generated by all the $t(n)$ in
$\mathrm{End}(J)$. Since each $t(n)$ may be identified with the $g$-by-$g$ matrix of
its differential $dt(n)$, which has integer entries, $\mathcal{T}$ is isomorphic to a
subalgebra of $M(g, \mathbb{Q})$. Therefore $\mathcal{T}$ is finite-dimensional over $\mathbb{Q}$.
We apply Lemma 11.75 to the algebra $\mathcal{T}$ and the field $\mathbb{Q}$, obtaining
\[ \mathcal{T} = \mathcal{S} \oplus \mathcal{R} = (k_1 \oplus \cdots \oplus k_r) \oplus \mathcal{R}, \]
where $\mathcal{R}$ is the nilradical and each $k_j$ is an ideal of $\mathcal{S}$ isomorphic to a
finite algebraic extension of $\mathbb{Q}$. By Proposition 11.73, we have
\[ (\delta t(n))(\mu(f)) = c_n\,\mu(f). \]
Hence there exists a well defined $\mathbb{Q}$ algebra homomorphism $\rho$ of $\mathcal{T}$ into
$\mathbb{Q}$ given by the formula
\[ \rho(t(n)) = c_n. \]
It is clear that $\rho(\mathcal{R}) = 0$. Changing the order of the factors $k_1, \dots, k_r$ if
necessary, we may assume that $\rho(k_1) = \mathbb{Q}$. Since $k_1$ is a field, $k_1 \cong \mathbb{Q}$.
Let $\rho' : \mathbb{Q} \to k_1$ be the inverse of $\rho|_{k_1}$, and define an ideal $\mathcal{U}$ of $\mathcal{T}$ by
\[ \mathcal{U} = (k_2 \oplus \cdots \oplus k_r) \oplus \mathcal{R}. \]
The abelian subvariety $A$ will be the sum of all $\alpha(J)$ for $\alpha$ in
$\mathcal{U} \cap \mathrm{End}(J)$. Let us see that $A$ is indeed a subvariety and is defined
over $\mathbb{Q}$. The members of $\mathrm{End}(J)$ may be viewed via (11.91) as members
of $M(g, \mathbb{C})$ that carry $\Lambda(X_0(N))$ into itself. With this identification the
members of $\mathrm{End}_{\mathbb{Q}}(J)$ are $\mathbb{Q}$ linear combinations of these matrices. Let
$\alpha \in \mathcal{U} \cap \mathrm{End}(J)$ be given. Being in $\mathcal{T}$, $\alpha$ is a polynomial in the $t(n)$'s
with rational coefficients. Hence there is a nonzero integer $m$ such that
$m\alpha$ is an integer polynomial combination of the $t(n)$'s. According to
Lemma 11.76, each $t(n)$ is defined over $\mathbb{Q}$. Therefore $m\alpha$ is defined over
$\mathbb{Q}$. Since $\alpha$ and $m\alpha$ have the same image in $J$, Proposition 11.68 shows
that $\mathrm{image}(\alpha)$ is an abelian subvariety of $J$ defined over $\mathbb{Q}$. Now suppose
we have two abelian subvarieties $A_1$ and $A_2$ of $J$ defined over $\mathbb{Q}$. Their
sum $A_1 + A_2$ is the image of $A_1 \times A_2 \subseteq J \times J \to J$ under addition.
Since $A_1 \times A_2$ and addition are defined over $\mathbb{Q}$, another application of
Proposition 11.68 shows that $A_1 + A_2$ is defined over $\mathbb{Q}$. We now iterate
this construction. With each iteration, either the dimension goes up (of
the abelian subvariety as a connected complex Lie group) or nothing
new happens. We conclude that $A$, as we have defined it, is an abelian
subvariety of $J$ defined over $\mathbb{Q}$.
By Proposition 11.69 we can form the quotient $(E, \nu)$ of $J$ by $A$, and
it can be taken to be defined over $\mathbb{Q}$. We shall prove shortly that $E$
has dimension 1, and then (a) will be proved. In the meantime, each
$\beta \in \mathcal{T} \cap \mathrm{End}(J)$ maps $A$ into $A$. In fact, if $a$ is in $A$, write $a = \sum \alpha_k(z_k)$
with $\alpha_k \in \mathcal{U} \cap \mathrm{End}(J)$ and $z_k \in J$. Since $\mathcal{U}$ is an ideal, each $\beta\alpha_k$ is in
$\mathcal{U} \cap \mathrm{End}(J)$, and we have
\[ \beta(a) = \sum \beta\alpha_k(z_k) \in \sum \beta\alpha_k(J) \subseteq A. \]
Consequently each $\beta \in \mathcal{T} \cap \mathrm{End}(J)$ maps $A$ into $A$. Applying this
conclusion to $t(n)$, we obtain the first conclusion of (b). Since
$t(n)(A) \subseteq A$, we have $\ker(\nu \circ t(n)) \supseteq \ker \nu$. By the universal
mapping property of $(E, \nu)$ in Proposition 11.69, there exists $\bar{t}(n) \in \mathrm{End}(E)$
with
\[ \bar{t}(n) \circ \nu = \nu \circ t(n). \tag{11.101} \]
In other words $t(n)$ acts on $E$ as $\bar{t}(n)$. To calculate what this action is,
we observe that $t(n) - \rho'(c_n)$ is in $\mathcal{U}$ by construction and $\rho'(c_n) - [c_n]$
is in $\mathcal{U}$ also, where $[c_n]$ denotes multiplication by the integer $c_n$. Hence
$t(n) - [c_n]$ is in $\mathcal{U} \cap \mathrm{End}(J)$. In other words, $t(n) - [c_n]$ passes to the
quotient and acts as 0. Thus $\bar{t}(n) = [c_n]$. This proves (b).
To prove that $\dim E > 0$, we are to prove that $A \ne J$. Let $m \ge 0$
be the integer for which $k_1\mathcal{R}^m \ne 0$ and $k_1\mathcal{R}^{m+1} = 0$, and let $\beta$ be a
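nonzero member of $k_1\mathcal{R}^m$. Possibly multiplying $\beta$ by a nonzero integer
(as in an argument earlier in the proof), we may assume $\beta$ is in $\mathrm{End}(J)$,
not just $\mathrm{End}_{\mathbb{Q}}(J)$. If $\alpha$ is in $\mathcal{U}$, then $\beta\alpha = 0$ (because $k_1 k_j = 0$ for
$j > 1$ and $k_1\mathcal{R}^m\mathcal{R} = 0$). If $a$ is in $A$, we can write $a = \sum \alpha_k(z_k)$ with
$\alpha_k \in \mathcal{U} \cap \mathrm{End}(J)$ and $z_k \in J$. Since $\beta\alpha_k = 0$, we have $\beta(a) = 0$.
Therefore $\beta$ is a nonzero member of $\mathrm{End}(J)$ that annihilates $A$. Hence
$A \ne J$.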
We shall now prove (c) and prove that $\dim E \le 1$, thereby completing
the proof of (a). Let $\omega'$ be a nonzero member of $\Omega_{\mathrm{hol}}(E)$ (existence by
previous paragraph), and let $\nu^*$ be the pullback mapping $\nu^* : \Omega_{\mathrm{hol}}(E) \to
\Omega_{\mathrm{hol}}(J)$ induced by $\nu$. Since $\nu$ on the complex manifold level is just
a homomorphism of one complex torus onto another, $\nu^*$ is one-one.
Applying $(\cdot)^*$ to (11.101), we have
\[ \nu^* \circ \delta\bar{t}(n) = \delta t(n) \circ \nu^*. \]
Now $\bar{t}(n) = [c_n]$ implies that $\delta\bar{t}(n) = c_n \cdot 1$. Hence
\[ \delta t(n)(\nu^*(\omega')) = c_n\,\nu^*(\omega'). \]
If we define $f' = \mu^{-1}(\nu^*(\omega'))$, then $\mu(f') = \nu^*(\omega')$, and Proposition
11.73 gives
\[ \mu(T_2(n)f') = c_n\,\mu(f') \]
and hence
\[ T_2(n)f' = c_n f'. \tag{11.102} \]
If $\dim E > 1$, then there exist linearly independent $\omega'$ and $\omega''$ in $\Omega_{\mathrm{hol}}(E)$,
and $\nu^*(\omega')$ and $\nu^*(\omega'')$ will be independent. If we put $f'' = \mu^{-1}(\nu^*(\omega''))$,
then the above argument shows that
\[ T_2(n)f'' = c_n f''. \tag{11.103} \]
Since $f'$ and $f''$ are independent, (11.102) and (11.103) together
contradict Proposition 9.20b. Thus $\dim E = 1$, and the proof of (a) is
complete. We can take $\omega'$ to be the invariant differential (11.21) of $E$ in
the above argument, and (11.102) and Proposition 9.20b force $f'$ and $f$
to be linearly dependent. This proves (c).
Let us prove uniqueness. Suppose $A'$ and $(E', \nu')$ satisfy conclusions
(a) and (b). Let $\omega'$ and $\omega$ be the invariant differentials (11.21) of $E'$ and
$E$, respectively. Then the argument in the previous paragraph shows
that $\nu'^*(\omega')$ is a multiple of $\mu(f)$. Since we already know that $\nu^*(\omega)$ is
a multiple of $\mu(f)$, $\nu'^*(\omega')$ and $\nu^*(\omega)$ are multiples of each other. Hence
$\nu'^*(\omega')$ and $\nu^*(\omega)$ annihilate the same members of $\mathfrak{j}$. The respective
annihilators are the tangent spaces in $\mathfrak{j}$ of $\ker \nu' = A'$ and $\ker \nu = A$.
Since $A'$ and $A$ are connected Lie subgroups of $J$ with the same Lie
subalgebras, we conclude that $A' = A$. Then $(E, \nu)$ is isomorphic to
$(E', \nu')$ over $\mathbb{Q}$ as a consequence of Proposition 11.69.
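Let us prove (d). The natural pairing of $\Omega_{\mathrm{hol}}(J)$ with $\mathfrak{j} = \mathbb{C}^g$ makes
$\mu(f)$ act as a linear functional on $\mathfrak{j}$. We calculate the effect of $\mu(f)$ on
$\Lambda(X_0(N)) = \ker(\mathfrak{j} \to J)$. The lattice $\Lambda(X_0(N))$ has generators
\[ a_k = \begin{pmatrix} \int_{c_k} f_1(\zeta)\,d\zeta \\ \vdots \\ \int_{c_k} f_g(\zeta)\,d\zeta \end{pmatrix}, \]
where $c_1, \dots, c_{2g}$ represent a $\mathbb{Z}$ basis of $H_1(X_0(N), \mathbb{Z})$. Write $f = \sum_j r_j f_j$.
Then
\[ \mu(f)(a_k) = \sum_j r_j \langle dz_j, a_k \rangle = \sum_j r_j \int_{c_k} f_j(\zeta)\,d\zeta = \int_{c_k} f(\zeta)\,d\zeta. \]
By (11.37) and Proposition 11.22, we conclude that
\[ \mu(f)(\Lambda(X_0(N))) = \sum_k \mathbb{Z} \int_{c_k} f(\zeta)\,d\zeta = \Lambda_f. \tag{11.104} \]
Let $\mathfrak{a} \subseteq \mathfrak{j}$ be the Lie algebra of $A$ (the tangent space at the identity).
We shall verify that
\[ \ker \mu(f) = \mathfrak{a}. \tag{11.105} \]
In fact, $\mu(f)$ is a nonzero multiple of $\nu^*(\omega)$, and thus
\begin{align*}
\ker \mu(f) &= \{u \in \mathfrak{j} \mid \langle \nu^*\omega, u \rangle = 0\} \\
&= \{u \in \mathfrak{j} \mid \langle \omega, (d\nu)(u) \rangle = 0\} \\
&= \{u \in \mathfrak{j} \mid d\nu(u) = 0\} && \text{since $\omega$ spans $\Omega_{\mathrm{hol}}(E)$} \\
&= \ker(d\nu) = (\text{Lie algebra of } A) = \mathfrak{a}.
\end{align*}
This proves (11.105).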
The homomorphism $\mathfrak{j} \to J$ with kernel $\Lambda(X_0(N))$ is really the
exponential map of Lie theory and thus must implement $\mathfrak{a} \to A$. Hence the
kernel of $\mathfrak{a} \to A$ is $\mathfrak{a} \cap \Lambda(X_0(N))$. Since $A$ is compact, $\mathfrak{a} \cap \Lambda(X_0(N))$ is a
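lattice in $\mathfrak{a}$, evidently of rank $2g - 2$. Let $x_1, \dots, x_{2g-2}$ be a $\mathbb{Z}$ basis for
it, and adjoin $x_{2g-1}$ and $x_{2g}$ in $\Lambda(X_0(N))$ so that $\Lambda' = \sum_{j=1}^{2g} \mathbb{Z}x_j$ has
rank $2g$. Then $\Lambda'$ has finite index in $\Lambda(X_0(N))$, say $m$, and it follows
that $\Lambda(X_0(N))$ is contained in $m^{-1}\Lambda'$. Now
\[ \mathbb{C} = \mu(f)(\mathfrak{j}) = \mu(f)\Big(\sum_j \mathbb{R}x_j\Big) = \mu(f)(\mathbb{R}x_{2g-1} + \mathbb{R}x_{2g}) \]
shows that
\[ \mu(f)(x_{2g-1}) \text{ and } \mu(f)(x_{2g}) \text{ are linearly independent over } \mathbb{R}. \tag{11.106} \]
On the other hand,
\begin{align*}
\mu(f)(\mathbb{Z}x_{2g-1} + \mathbb{Z}x_{2g}) &= \mu(f)\Big(\sum_j \mathbb{Z}x_j\Big) = \mu(f)(\Lambda') \\
&\subseteq \mu(f)(\Lambda(X_0(N))) \subseteq \mu(f)(m^{-1}\Lambda') \\
&= \mu(f)(m^{-1}\mathbb{Z}x_{2g-1} + m^{-1}\mathbb{Z}x_{2g}). \tag{11.107}
\end{align*}
Combining (11.104) with (11.106) and (11.107), we see that
$\Lambda_f = \mu(f)(\Lambda(X_0(N)))$ is a free abelian subgroup of $\mathbb{C}$ of rank 2 that
spans $\mathbb{C}$ over $\mathbb{R}$. Consequently $\Lambda_f$ is a lattice.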
By Theorem 6.14, $E' = \mathbb{C}/\Lambda_f$ is an elliptic curve over $\mathbb{C}$. Let
$\eta : \mathbb{C} \to \mathbb{C}/\Lambda_f$ be the quotient homomorphism. The composition
$\eta \circ \mu(f) : \mathfrak{j} \to E'$ is given by
\[ \mathfrak{j} \to \mathfrak{j}/\mathfrak{a} \cong \mathbb{C} \to \mathbb{C}/\Lambda_f = E' \]
and has kernel $\mu(f)^{-1}(\Lambda_f)$, which equals $\mathfrak{a} + \Lambda(X_0(N))$ by (11.104)
and (11.105). Since $\mathfrak{j} \to J$ is a covering homomorphism with kernel
$\Lambda(X_0(N))$, $\eta \circ \mu(f)$ factors through $\mathfrak{j} \to J$. Let us say
\[ \eta \circ \mu(f) = \varepsilon \circ (\mathfrak{j} \to J) \]
with $\varepsilon : J \to E'$ a holomorphic homomorphism with kernel the image
of $\mathfrak{a} + \Lambda(X_0(N))$ under $\mathfrak{j} \to J$, namely $A$. By Theorem 11.72, $\varepsilon$ is
a morphism defined over $\mathbb{C}$. Since $\ker \varepsilon = A$, the universal mapping
property of $(E, \nu)$ in Proposition 11.69 implies that $\varepsilon = \varepsilon' \circ \nu$ for a
morphism $\varepsilon' : E \to E'$ defined over $\mathbb{C}$. The kernel of $\varepsilon'$ is trivial.
Combining (11.85) with our results on dual isogenies, we see that $\varepsilon'$ is
an isomorphism over $\mathbb{C}$. This completes the proof of (d) and all of the
characteristic 0 part of Theorem 11.74 except for Lemma 11.76.
PROOF OF LEMMA 11.76. Because of the universal mapping property
of $J$, it is enough to prove that $T^\#(n)$ is defined over $\mathbb{Q}$. Here $T^\#(n)$ is
given by
\[ T^\#(n)(\tau) = \sum_i \int_{\tau_0}^{\alpha_i\tau} \mathbf{f}(\zeta)\,d\zeta, \qquad \tau \in X_0(N), \tag{11.108} \]
where $\mathbf{f} = (f_1, \dots, f_g)^{\mathrm{T}}$ is the column vector of basis elements. (To make sense of the $\alpha_i\tau$'s in this formula, we must
choose a representative of $\tau$ in $\mathcal{H}^*$, compute each $\alpha_i\tau$, and project back
to $X_0(N)$. We always assume this has been done.)
Let $G = \mathrm{Aut}_{\mathbb{Q}}(\mathbb{C})$. This group operates on various things in our
setting. When a member $\varphi$ of $G$ acts on a polynomial $P$ to give $P^\varphi$, $P^\varphi$ is
obtained by having $\varphi$ act on just the coefficients. When the polynomial
gets regarded as a member of a local coordinate ring of a variety, for
example, $\varphi$ acts on the points of the variety and the relationship is
\[ P^\varphi(x) = \varphi(P(\varphi^{-1}x)). \tag{11.109} \]
We shall use this notion in order to localize the proof of rationality.
If $(P_0, \dots, P_m)$ gives the imbedding of $J$ as a nonsingular variety in a
projective space, we may take $P_0, \dots, P_m$ to be defined over $\mathbb{Q}$.* Then
$T^\#(n)$ is certainly defined over $\mathbb{Q}$ if we show that
\[ \tau \mapsto H(\tau) = P\Big(\sum_i \int_{\tau_0}^{\alpha_i\tau} \mathbf{f}(\zeta)\,d\zeta\Big) \quad \text{for } P = P_0, \dots, P_m \tag{11.110} \]
is in $\mathbb{Q}(X_0(N))$, i.e., that $H^\varphi = H$ for all $\varphi \in G$. In view of (11.109) it
is enough to show for all $\tau \in X_0(N)$, $\varphi \in G$, and $F \in \mathbb{Q}(J)$ that
\[ \varphi\Big(F\Big(\sum_i \int_{\tau_0}^{\alpha_i\varphi^{-1}(\tau)} \mathbf{f}(\zeta)\,d\zeta\Big)\Big) = F\Big(\sum_i \int_{\tau_0}^{\alpha_i\tau} \mathbf{f}(\zeta)\,d\zeta\Big) \tag{11.111} \]
whenever $F$ is defined at all points of the relevant finite set.
*As in Theorem 6.14, one version of $P_0, \dots, P_m$ may not handle all points of $J$
simultaneously. We may have to normalize $(P_0, \dots, P_m)$ by one or more members of
$\mathbb{Q}(J)$ to handle all points. This is not a serious matter for the present proof, and we
shall ignore it.
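Since $F$ is defined over $\mathbb{Q}$ and addition within $J$ is defined over $\mathbb{Q}$, we
can move $\varphi$ past $F$ and the summation sign on the left side of (11.111),
and it is enough to prove that
\[ \sum_i \varphi\Big(\int_{\tau_0}^{\alpha_i\varphi^{-1}(\tau)} \mathbf{f}(\zeta)\,d\zeta\Big) = \sum_i \int_{\tau_0}^{\alpha_i\tau} \mathbf{f}(\zeta)\,d\zeta \tag{11.112} \]
for all $\tau$ and $\varphi$. Since $\Phi : X_0(N) \to J$ is defined over $\mathbb{Q}$,
\[ \varphi\Big(\int_{\tau_0}^{\tau'} \mathbf{f}(\zeta)\,d\zeta\Big) = \int_{\tau_0}^{\varphi(\tau')} \mathbf{f}(\zeta)\,d\zeta \tag{11.113} \]
for all $\tau'$. Thus (11.112) comes down to proving that
\[ \sum_i \int_{\tau_0}^{\varphi(\alpha_i\varphi^{-1}(\tau))} \mathbf{f}(\zeta)\,d\zeta = \sum_i \int_{\tau_0}^{\alpha_i\tau} \mathbf{f}(\zeta)\,d\zeta \]
or equivalently (under the change of variables $\tau \to \varphi(\tau)$)
\[ \sum_i \int_{\tau_0}^{\varphi(\alpha_i\tau)} \mathbf{f}(\zeta)\,d\zeta = \sum_i \int_{\tau_0}^{\alpha_i\varphi(\tau)} \mathbf{f}(\zeta)\,d\zeta. \tag{11.114} \]
Fix $\tau \in X_0(N)$ and $\varphi \in G$ (together with representatives of $\tau$ and
$\varphi(\tau)$ in $\mathcal{H}^*$, called by the same names). We shall prove for a suitable
permutation $i \to l(i)$ that
\[ \varphi(\alpha_i\tau) = \alpha_{l(i)}\varphi(\tau) \quad \text{for all } i, \tag{11.115} \]
and this will prove (11.114) and the lemma.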
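Let $h$ be an arbitrary member of $\mathbb{Q}(j, j_N)$ that is defined at every
point of a finite set of points of interest in $X_0(N)$ (including all points
in (11.115)). We form the polynomial
\[ \prod_{i=1}^{K} (X - h \circ \alpha_i) = X^K + h_{K-1}X^{K-1} + \cdots + h_0. \tag{11.116} \]
If $\gamma$ is in $\Gamma_0(N)$, write $\alpha_i\gamma = \gamma_i\alpha_{j(i)}$ for a permutation $i \to j(i)$. Then
\[ \prod_{i=1}^{K} (X - h \circ \alpha_i \circ \gamma) = \prod_{i=1}^{K} (X - h \circ \alpha_{j(i)}) = \prod_{i=1}^{K} (X - h \circ \alpha_i), \]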
from which it follows that $h_{K-1}, \dots, h_0$ in (11.116) are invariant under
$\Gamma_0(N)$. By the same argument as in Proposition 9.5, all of $h_{K-1}, \dots, h_0$
are in $A_0(\Gamma_0(N))$. From §6 we know we can view these functions as in
$K(X_0(N))$, which Theorem 11.33 identifies as $\mathbb{C}(j, j_N)$. We shall prove
they are in $\mathbb{Q}(j, j_N)$.
Let $h(\tau) = \sum_{n=-m}^{\infty} c_n q^n$ be the $q$ expansion of $h$ at $\infty$, with $q = e^{2\pi i\tau}$.
Since $h$ is in $\mathbb{Q}(j, j_N)$, the $c_n$'s are in $\mathbb{Q}$. Let $\alpha_i = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, and put
$q_d = e^{2\pi i\tau/d}$ and $\zeta_d = e^{2\pi i/d}$. Then
\[ h \circ \alpha_i(\tau) = h\Big(\frac{a\tau + b}{d}\Big) = \sum_n c_n \zeta_d^{bn} q_d^{an}. \]
Expanding out the left side of (11.116), we see that the coefficients in the
$q$ expansions of $h_{K-1}, \dots, h_0$ are in $\mathbb{Q}(\zeta_n)$. Now we can argue as in the
last paragraph of the proof of Theorem 11.32 to see that the coefficients
for $h_{K-1}, \dots, h_0$ are in fact in $\mathbb{Q}$. Applying Corollary 11.50, we conclude
that $h_{K-1}, \dots, h_0$ are in $\mathbb{Q}(j, j_N)$.
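Let us evaluate (11.116) at $\tau$:
\[ \prod_{i=1}^{K} (X - h(\alpha_i\tau)) = X^K + h_{K-1}(\tau)X^{K-1} + \cdots + h_0(\tau). \]
Since $h$ is in $\mathbb{Q}(j, j_N)$, we have $h^{\varphi^{-1}} = h$. Thus $h(\tau') = \varphi^{-1}(h(\varphi(\tau')))$,
and
\[ \prod_{i=1}^{K} \big(X - \varphi^{-1}(h(\varphi(\alpha_i\tau)))\big) = X^K + h_{K-1}(\tau)X^{K-1} + \cdots + h_0(\tau). \]
At this stage we are dealing with a polynomial over $\mathbb{C}$, and it makes
sense to apply $\varphi$ to both sides (by applying it to the coefficients). We
obtain
\begin{align*}
\prod_{i=1}^{K} (X - h(\varphi(\alpha_i\tau))) &= X^K + \varphi(h_{K-1}(\tau))X^{K-1} + \cdots + \varphi(h_0(\tau)) \\
&= X^K + h_{K-1}(\varphi(\tau))X^{K-1} + \cdots + h_0(\varphi(\tau)),
\end{align*}
since $h_{K-1}, \dots, h_0$ are in $\mathbb{Q}(j, j_N)$ and hence are fixed by $\varphi$. The above
expression is
\[ = \prod_{i=1}^{K} (X - h(\alpha_i\varphi(\tau))) \]
by (11.116). Consequently
\[ h(\varphi(\alpha_i\tau)) = h(\alpha_{l(i)}\varphi(\tau)) \tag{11.117} \]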
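for a permutation $i \to l(i)$ depending on $h$ (as well as $\tau$ and $\varphi$). We
need to obtain (11.117) with a permutation that is independent of $h$.
For the algebra of $h$'s in $\mathbb{Q}(j, j_N)$ under study, let
\[ S_h = \{\text{permutations } i \to l(i) \mid \text{(11.117) holds for } h\}. \]
If $\{h_1, \dots, h_m\}$ is a finite set of such $h$'s, then for each $c \in \mathbb{Q}$ there is a
permutation that works for
\[ h_1 + ch_2 + c^2h_3 + \cdots + c^{m-1}h_m. \]
Since $\mathbb{Q}$ is infinite, there are $m$ values of $c$, say $c_1, \dots, c_m$, that work
with some common permutation $l(i)$. Then we have
\[ \begin{pmatrix} 1 & c_1 & \cdots & c_1^{m-1} \\ 1 & c_2 & \cdots & c_2^{m-1} \\ \vdots & & & \vdots \\ 1 & c_m & \cdots & c_m^{m-1} \end{pmatrix} \begin{pmatrix} h_1(\varphi(\alpha_i\tau)) - h_1(\alpha_{l(i)}\varphi(\tau)) \\ h_2(\varphi(\alpha_i\tau)) - h_2(\alpha_{l(i)}\varphi(\tau)) \\ \vdots \\ h_m(\varphi(\alpha_i\tau)) - h_m(\alpha_{l(i)}\varphi(\tau)) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \]
Since the Vandermonde matrix on the left is nonsingular, we see that
$l(i)$ is in $S_{h_1} \cap \cdots \cap S_{h_m}$.
Thus for every finite set $\{h_1, \dots, h_m\}$, $S_{h_1} \cap \cdots \cap S_{h_m}$ is nonempty.
Choose a finite set $\{h_1, \dots, h_M\}$ for which $S_{h_1} \cap \cdots \cap S_{h_M}$ has the smallest
possible cardinality. Then we have
\[ \bigcap_{\text{all } h} S_h = S_{h_1} \cap \cdots \cap S_{h_M} \ne \emptyset. \]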
For any member of the left side, (11.115) is valid for all h. This completes
the proof of the lemma.
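The nonsingularity of the Vandermonde matrix used in this proof rests on the product formula $\det = \prod_{i<j}(c_j - c_i)$, which is nonzero exactly when the $c_i$ are distinct. The following sketch checks that formula with exact rational arithmetic; the sample values of $c$ and the helper names are illustrative choices only.

```python
from fractions import Fraction

def vandermonde(cs):
    """Rows (1, c, c^2, ..., c^(m-1)) for each c in cs."""
    return [[Fraction(c) ** k for k in range(len(cs))] for c in cs]

def determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det

def product_formula(cs):
    result = Fraction(1)
    for j in range(len(cs)):
        for i in range(j):
            result *= Fraction(cs[j]) - Fraction(cs[i])
    return result

distinct = [0, 1, 2, Fraction(1, 2)]
repeated = [0, 1, 1, 3]
print(determinant(vandermonde(distinct)), product_formula(distinct))
print(determinant(vandermonde(repeated)))
```

With distinct values the determinant is nonzero, so the only solution of the homogeneous system is the zero vector, which is what forces every difference $h_k(\varphi(\alpha_i\tau)) - h_k(\alpha_{l(i)}\varphi(\tau))$ to vanish.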
12. Match of L Functions
The proof of Theorem 11.74 is now complete except for part (e), which
says that the L functions of E and / coincide as Euler products except
possibly at finitely many primes. This is the characteristic p part of the
proof. Carrying out the details would involve too much further
preparation in algebraic geometry, and we shall settle for some discussion.
If $p$ is one of the primes for which (e) is asserted, the idea is to prove
an identity of the type
\[ T_p = \phi + \hat{\phi} \tag{11.118} \]
in $X_0(N)$, where $T_p$ is a reduction modulo $p$ of the operator $T(p)$ of
(11.95) and where $\phi$ is the Frobenius map.
Before trying to make sense of (11.118) in general, let us consider
the special case where Xo(N) has genus 1. (The first such example is
when N = 11.) Then we can identify XQ(N) with the elliptic curve E
produced by Theorem 11.74. From §V.2 it is meaningful to speak of Epy
the reduction of E modulo p. In EPi (11.118) becomes an identity within
End(Ep)y with Tp acting as [cp] in accordance with (b) of the theorem.
Also within End(2?p) we have
[#£(Zp)] = [deg([l]-4)]
= ([l]-£)° ([!]-<»)
= [l]-{<j> + $) + \p],
the last equality holding by item (c) after (11.82) and by Proposition
11.65 again. Thus
<l> + $=\p+l-#E(lp)) = [ap],
in the notation of (10.7).
In other words, (11.118) as an identity within $\mathrm{End}(E_p)$ says that $[c_p] =
[a_p]$. It can be shown that $\mathrm{End}(E_p)$ has no zero divisors, and hence
$c_p = a_p$. This equality says that the pth factor of L(s, f) coincides with
the pth factor of L(s, E).
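As a hedged numerical illustration of the equality $c_p = a_p$ in the genus 1 case N = 11, the sketch below compares point counts modulo p on the curve $y^2 + y = x^3 - x^2 - 10x - 20$ with the q expansion coefficients of $q\prod_{n \ge 1}(1-q^n)^2(1-q^{11n})^2$. Taking this curve as E and this product as the weight 2 newform of level 11 are assumptions drawn from the standard literature, not from this section, and the function names are hypothetical.

```python
def eta_product_coeffs(M):
    """Coefficients c[0..M] of q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2, truncated."""
    series = [0] * M          # the product without the leading q, degrees 0..M-1
    series[0] = 1

    def times_one_minus_q_power(k):
        # Multiply the truncated series by (1 - q^k) in place.
        for d in range(M - 1, k - 1, -1):
            series[d] -= series[d - k]

    for n in range(1, M):
        for _ in range(2):                    # each factor occurs squared
            times_one_minus_q_power(n)
            times_one_minus_q_power(11 * n)   # no effect once 11n >= M
    return [0] + series       # shift by q: c[m] is the coefficient of q^m


def a_p(p):
    """a_p = p + 1 - #E(Z_p) for y^2 + y = x^3 - x^2 - 10x - 20, by brute force."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x**3 - x * x - 10 * x - 20)) % p == 0
    )
    return p + 1 - (affine + 1)   # the extra 1 is the point at infinity


c = eta_product_coeffs(20)
for p in (2, 3, 5, 7, 13, 17, 19):
    print(p, a_p(p), c[p])
```

The prime p = 11 is deliberately left out of the loop, since it is the one prime at which the comparison of Euler factors needs separate treatment.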
It is not hard to imagine a similar interpretation of (11.118) within
a reduction modulo p of J in the case of general $X_0(N)$. The identity
passes to $E_p$, and we get the same conclusion.
In fact, reduction modulo p of J was too technically difficult to be the
method that was used. Instead (11.118) was proved in $X_0(N)$, and then
some consequences were transferred to J and E; the identity itself was
not transferred. We shall not pursue this transfer of consequences.
The meaning that is customarily attached to (11.118) is as an
identity of "correspondences" on $X_0(N)_p$, the reduction modulo p of $X_0(N)$
(which we have not defined in genus > 1). Intuitively a correspondence
is like the graph of a function, except that the function need not be
single-valued, and one must be careful in counting multiplicities. More
precisely but still not exactly, the building blocks of correspondences in
$X_0(N)_p$ are irreducible subvarieties of dimension 1 in $X_0(N)_p \times X_0(N)_p$
(curves, in other words), and a correspondence is a divisor-like $\mathbb{Z}$
combination of such irreducible curves. Both $T_p$ and $\phi$ are to be regarded
as correspondences, and $\hat{\phi}$ refers to the correspondence $\phi$ with the two
coordinates interchanged. The sum on the right side of (11.118) is the
sum in the sense of divisors. We shall not go into the proof of (11.118)
but shall end our discussion of the characteristic p part of the proof at
this point.
CHAPTER XII
TANIYAMA-WEIL CONJECTURE
1. Relationships among Conjectures
Throughout this chapter, E will denote an elliptic curve defined over
Q. By Corollary 10.6 the L function L(s,E) converges for Re s > 3/2
and is given there by an absolutely convergent Dirichlet series. It is
expected that deep arithmetic information is encoded in the behavior of
L(s, E) beyond the region of convergence. For example, the Birch and
Swinnerton-Dyer Conjecture (Conjecture 1.10) predicts that the rank of
E(Q) is the order of vanishing of L(s, E) at s = 1.
To make sense of such conjectures, we must expect that L(s, E) has an
analytic continuation. In view of the behavior of Dirichlet series that are
better understood, it is natural to expect also that L(s, E) will satisfy a
functional equation. Here is a reasonably precise formulation.
Conjecture 12.1. The function L(s,E) extends to be entire, and
for a suitable sign $\varepsilon$ and a suitable positive integer N, L(s, E) satisfies
a functional equation given in terms of
$$\Lambda(s,E) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(s,E) \tag{12.1}$$
by
$$\Lambda(s,E) = -\varepsilon\Lambda(2-s,E). \tag{12.2}$$
Hasse had the further idea that L(s,E) should have as good
transformation properties as any other reasonable Dirichlet series. In the
case of L(s, f) for a cusp form $f \in S_k(\Gamma_0(N))$ in one of the eigenspaces
$S_k^{\varepsilon}(\Gamma_0(N))$ of $w_N$, there is more that can be said beyond Theorem 9.8.
For one thing, by examining (9.27), (9.36), and (9.37), we see that the
entire function
$$\Lambda(s,f) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(s,f) \tag{12.3}$$
is actually bounded in vertical strips. For another thing, let us consider
some modifications of L(s, f). Write
$$L(s,f) = \sum_{n=1}^{\infty} \frac{c_n}{n^s}, \tag{12.4a}$$
let m be an integer relatively prime to N, let $\chi$ be a primitive Dirichlet
character modulo m, and let
$$L(s,f,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)c_n}{n^s}. \tag{12.4b}$$
Define
$$\Lambda(s,f,\chi) = (m^2N)^{s/2}(2\pi)^{-s}\Gamma(s)L(s,f,\chi). \tag{12.4c}$$
Recall from (7.46) the Gauss sum
$$G(m,\chi) = \sum_{l=0}^{m-1} e^{2\pi i l/m}\chi(l). \tag{12.5}$$
The following theorem is proved by combining the techniques for
Theorem 7.19 and Theorem 9.8.
Theorem 12.2. Let f be in $S_k^{\varepsilon}(\Gamma_0(N))$, let GCD(m,N) = 1, let
$\chi$ be a primitive Dirichlet character modulo m, and let $L(s,f,\chi)$ and
$\Lambda(s,f,\chi)$ be defined by (12.4). Then $L(s,f,\chi)$ is initially defined for
Re s > k/2 + 1 and extends to be entire in s. Moreover, $\Lambda(s,f,\chi)$ is entire
and bounded in vertical strips, and it satisfies the functional equation
$$\Lambda(s,f,\chi) = \varepsilon(-1)^{k/2}\,\frac{G(m,\chi)}{G(m,\bar{\chi})}\,\chi(-N)\,\Lambda(k-s,f,\bar{\chi}),$$
where $G(m,\chi)$ is the Gauss sum (12.5).
The Hasse-Weil Conjecture expects L(s, E) to share all these
properties. It is given in terms of $L(s,E) = \sum_{n=1}^{\infty} a_n/n^s$ and its modifications
$$L(s,E,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)a_n}{n^s}.$$
Conjecture 12.3 (Hasse-Weil). The function L(s,E) extends to be
entire and, for a certain positive integer N, so does $L(s,E,\chi)$ for every
Dirichlet character whose conductor m is prime to N. Moreover, the
modified functions
$$\Lambda(s,E) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(s,E)$$
$$\Lambda(s,E,\chi) = (m^2N)^{s/2}(2\pi)^{-s}\Gamma(s)L(s,E,\chi)$$
extend to be entire and, for a suitable sign $\varepsilon$, satisfy the functional
equations
$$\Lambda(s,E) = -\varepsilon\Lambda(2-s,E)$$
Weil addressed the question how close L(s,E) is to coming from a
modular form if it satisfies functional equations as in Conjecture 12.1
or 12.3. Suppose that L(s,E), adjusted by factors as in (12.1), is really
the Mellin transform of $f(i\sigma)$, where f is an analytic function on the
upper half plane $\mathcal{H}$ transforming under a group $\Gamma$ of linear fractional
transformations. The fact that we are obtaining L(s,E) from a Mellin
transform incorporates a transformation law under the translation
corresponding to $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. The functional equation (12.2) incorporates a
transformation law under some inversion element, such as the one from
$\alpha_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$. Conjecture 12.1 does not give any reason for thinking
that $\Gamma$ is any larger than the group $\Gamma'$ generated by the
transformations corresponding to T and $\alpha_N$. Unfortunately $\Gamma'\backslash\mathcal{H}^*$ is noncompact
and not close to compact. So $\Gamma'$ does not give us much control over an
analytic function on $\mathcal{H}$.
Weil found, however, that the additional functional equations (12.6)
are enough to force matters to be controlled by $\Gamma_0(N)$. The statement
of the Weil Converse Theorem is as follows. We shall not give the proof.
Theorem 12.4 (Weil). Let $L(s) = \sum_{n=1}^{\infty} c_n/n^s$ be a Dirichlet series
with $c_n = O(n^c)$ for some c > 0. Fix a positive integer N, an even
positive integer k, and a sign $\varepsilon$. Suppose that
(a) the function
$$\Lambda(s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(s)$$
is entire, is bounded in every vertical strip, and satisfies
$$\Lambda(s) = \varepsilon(-1)^{k/2}\Lambda(k-s);$$
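(b) for every integer m with GCD(m, N) = 1 and every primitive
Dirichlet character $\chi$ modulo m, the modified function
$$L_\chi(s) = \sum_{n=1}^{\infty} \frac{\chi(n)c_n}{n^s}$$
is such that the function
$$\Lambda_\chi(s) = (m^2N)^{s/2}(2\pi)^{-s}\Gamma(s)L_\chi(s)$$
is entire, is bounded in every vertical strip, and satisfies
the functional equation of Theorem 12.2 with $\Lambda_\chi$ and $\Lambda_{\bar{\chi}}$ in place of
$\Lambda(\cdot,f,\chi)$ and $\Lambda(\cdot,f,\bar{\chi})$;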
(c) the series defining L(s) converges absolutely at $s = k - \delta$ for some
$\delta > 0$.
Then
$$f(\tau) = \sum_{n=1}^{\infty} c_n e^{2\pi i n\tau}$$
is a cusp form in $S_k^{\varepsilon}(\Gamma_0(N))$.
In fact, it is not necessary to assume (b) for quite so many m's, but
this improvement will not concern us. Theorems 12.2 and 12.4 show that
the Hasse-Weil Conjecture is completely equivalent with the following
fundamental conjecture.
Conjecture 12.5. If E is given, then there exist a positive integer
N and a sign $\varepsilon$ such that the function L(s,E) equals L(s,f) for some
eigenform f in $S_2^{\varepsilon}(\Gamma_0(N))$.
The notion of newform was not known at the time of Weil's work,
but we can freely substitute the word "newform" for "eigenform" in
Conjecture 12.5. In fact, if f is an eigenform for $S_2^{\varepsilon}(\Gamma_0(N))$, it comes
from a newform for some $S_2(\Gamma_0(N/M))$ with M dividing N. If $M \neq 1$,
then Theorem 9.8 gives two contradictory functional equations that f
must satisfy. Thus M = 1, and f is a newform.
Conjecture 12.5 is consistent with an important theme that has been
discussed on more than one occasion earlier in this book: the
identification of geometric L functions as automorphic L functions. The
conjecture raises two immediate questions: What are N and e? Is there
a deeper relationship between E and f?
Eichler-Shimura theory gives part of the answer to the second
question. Starting from any normalized newform f in $S_2(\Gamma_0(N))$ whose
Fourier coefficients are integers, the theory provides us with a mapping
from $X_0(N)$ to a canonical elliptic curve over Q. Even before this
theory, Taniyama had suggested looking for such a mapping in connection
with an arbitrary elliptic curve over Q, but Taniyama's suggestion was
apparently initially overlooked. Weil made a similar suggestion as the
Eichler-Shimura theory unfolded.
Conjecture 12.6 (Taniyama-Weil). If E is given, then there exists an
integer N for which there is a nonconstant morphism $F : X_0(N) \to E$.
We say that such an E has a modular parametrization of level N,
and, following custom, we call E a Weil curve. If E is given by the
usual equation (3.23), then it is equivalent to say that there are functions
$\xi(\tau)$ and $\eta(\tau)$ in $A_0(\Gamma_0(N))$ whose q expansions at $\infty$ have rational
coefficients and are such that $(x, y) = (\xi(\tau), \eta(\tau))$ satisfies (3.23b) for all
$\tau$ where neither $\xi$ nor $\eta$ has a singularity. The passage from F to $(\xi,\eta)$
is just a matter of unwinding the definitions; the passage from $(\xi,\eta)$ to
F requires Corollary 11.50.
The modular parametrization in the Eichler-Shimura theory has
additional properties. From a newform f of level N, Theorem 11.74 produces
a morphism from $X_0(N)$ to an elliptic curve E over Q such that the
pullback of the invariant differential is a multiple of $f(\tau)\,d\tau$ and the period
lattice of $\int f(\tau)\,d\tau$ is equivalent with the period lattice of E. If E' is
isogenous to E over Q, one can compose the Eichler-Shimura map with
the isogeny to get a modular parametrization of E', but the match of
the period lattices may be lost.
We saw in an example following Corollary 11.50 that there is a
morphism from $X_0(MN)$ to $X_0(N)$. The Eichler-Shimura mapping to a
curve E can be composed with such maps to produce modular
parametrizations of E of nonminimal level.
Theorem 12.7. If E has a modular parametrization of level N but
of no level M for M < N, then E comes via the Eichler-Shimura theory
from the map defined by a normalized newform in $S_2(\Gamma_0(N))$, followed
by an isogeny defined over Q.
In constructing an elliptic curve E from a normalized newform $f \in
S_2(\Gamma_0(N))$, the Eichler-Shimura theory shows that the pth L factor of E
agrees with the pth L factor of f for all but finitely many p, and Igusa's
work shows that any exceptional p must divide N. There are two loose
ends: to identify N and to handle the exceptional p's.
There was an early suspicion of what N was. The conductor N of
E is an isogeny invariant of E with a cohomological definition, but it
can be described almost completely in a simple way as follows. First let
us adjust E so that it is given by a global minimal Weierstrass equation
(§X.1). Then let $\Delta$ be the discriminant of E. The conductor N divides
$\Delta$ and has the same prime factors as $\Delta$. The power of a prime p dividing
N is 1 if and only if $E_p$ has a node. If p > 3, then the power of p dividing
N is 2 if and only if $E_p$ has a cusp. For the case of a cusp with p = 2 or
3, the conductor can be computed by an algorithm due to Tate. Those
curves in Tables 3.1 and 3.2 whose conductor is not obviously $|\Delta|$ are
listed in Table 12.1.
Elliptic Curve                $\Delta$     N
$y^2 + y = x^3$               -27          27
$y^2 + xy - y = x^3$          -28          14
$y^2 = x^3 - x^2 + x$         -48          24
$y^2 = x^3 + x^2 + x$         -48          48
$y^2 + 3xy - y = x^3$         -54          54
$y^2 + xy = x^3 + x$          -63          21
$y^2 = x^3 + x$               -64          64
$y^2 = x^3 - x$                64          32
$y^2 = x^3 - 2x + 1$           80          40
$y^2 = x^3 - 2x - 1$           80          80
$y^2 = x^3 + x^2 - x$          80          20
$y^2 = x^3 - x^2 - x$          80          80
$y^2 + 5xy + y = x^3$          98          14
TABLE 12.1. Conductors of some elliptic curves
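The $\Delta$ column of Table 12.1 can be rechecked mechanically. The sketch below is only an illustration: it assumes the usual formulas for $b_2, b_4, b_6, b_8$ and $\Delta$ attached to $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$ (presumably those of Chapter III, not restated here), reads the coefficients off the equations in the table, and uses hypothetical names throughout. It does not compute conductors; it only confirms the stated relations that N divides $\Delta$ and has the same prime factors.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def prime_set(n):
    """Set of primes dividing the nonzero integer n."""
    n, primes, p = abs(n), set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


# ((a1, a2, a3, a4, a6), Delta, N) for the rows of Table 12.1, in order
TABLE_12_1 = [
    ((0, 0, 1, 0, 0), -27, 27),
    ((1, 0, -1, 0, 0), -28, 14),
    ((0, -1, 0, 1, 0), -48, 24),
    ((0, 1, 0, 1, 0), -48, 48),
    ((3, 0, -1, 0, 0), -54, 54),
    ((1, 0, 0, 1, 0), -63, 21),
    ((0, 0, 0, 1, 0), -64, 64),
    ((0, 0, 0, -1, 0), 64, 32),
    ((0, 0, 0, -2, 1), 80, 40),
    ((0, 0, 0, -2, -1), 80, 80),
    ((0, 1, 0, -1, 0), 80, 20),
    ((0, -1, 0, -1, 0), 80, 80),
    ((5, 0, 1, 0, 0), 98, 14),
]

for coeffs, delta, N in TABLE_12_1:
    print(coeffs, discriminant(*coeffs) == delta, delta % N == 0)
```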
The following theorem ties up both loose ends in the Eichler-Shimura
theory.
Theorem 12.8 (Carayol). Let $f \in S_2(\Gamma_0(N))$ be a normalized
newform whose Fourier coefficients are integers, and let E be the
elliptic curve over Q associated to / by Eichler-Shimura theory. Then
L(s,E) = L(s,f), and N is the conductor of E.
It follows that the Taniyama-Weil Conjecture implies the Hasse-Weil
Conjecture. In fact, if E is given, we choose N to be the smallest positive
integer such that E has a modular parametrization of level N. Theorem
12.7 says that E comes from the map defined by a normalized newform
f in $S_2(\Gamma_0(N))$, possibly followed by an isogeny over Q. Since isogeny
over Q does not affect the L function (Theorem 11.67), Theorem 12.8
says that L(s,E) = L(s,f). This is the conclusion of Conjecture 12.5,
which we have seen is equivalent with the Hasse-Weil Conjecture.
We can come close to proving that the Hasse-Weil Conjecture implies
the Taniyama-Weil Conjecture. Namely let E be given with conductor
N. By the Hasse-Weil Conjecture in the form of Conjecture 12.5, there
is a normalized newform f in $S_2(\Gamma_0(N'))$ such that L(s,E) = L(s,f).
By the Eichler-Shimura construction and Theorem 12.8, f leads to
an elliptic curve E' of conductor N' with L(s,E') = L(s,f). Since
L(s,E') = L(s,E), a theorem of Serre allows us to conclude that E and
E' are isogenous over Q provided j(E) is not an integer. In this case the
modular parametrization for E' yields one for E by composition.
2. Strong Weil Curves and Twists
Let E and E' be Weil curves, and let $F : X_0(N) \to E$ and
$F' : X_0(N) \to E'$ be modular parametrizations, with N minimal in both
cases. Without loss of generality, we suppose that $\infty$ of $X_0(N)$ maps to
the group identity of E and E'. We say that (E,F)
dominates (E',F') if F' factors through F, i.e., if $F' = \varphi \circ F$ for
an isogeny $\varphi$. A Weil curve E is a strong Weil curve if (E,F) is
maximal in this ordering for some F. (The F yielding maximality will
not be unique, as it can always be composed with -1 in E.)
Proposition 12.9. In every isogeny class of Weil curves, there is
a strong Weil curve and it is unique up to isomorphism. The strong
Weil curve and map (E,F) are characterized within the isogeny class by
either of the following conditions:
(a) the induced map $F_*$ on homology $H_1(X_0(N),\mathbb{Z}) \to H_1(E,\mathbb{Z})$ is
onto.
(b) when F is factored through the Jacobian variety as $F_1 \circ \Phi$, the
kernel of $F_1$ is a (connected) variety.
Example. In §XI.1, $X_0(11)$ led to three Weil curves, of which the
first two were
$$E : y^2 + y = x^3 - x^2 - 10x - 20 \tag{12.7}$$
and
$$E' : y^2 + y = x^3 - x^2 \tag{12.8}$$
defined by respective lattices $\Lambda$ and $\Lambda'$ in C. The lattice $\Lambda$ was defined
as the image of the cycles of $X_0(N)$. Hence $H_1(X_0(N),\mathbb{Z})$ maps onto
$H_1(E,\mathbb{Z})$, and (12.7) is a strong Weil curve. We identified E' by using
the sublattice $\Lambda'$, which does not include the image of the cycles of
$X_0(N)$, and then used a dual isogeny to map E to E'. It would have
amounted to the same thing if we had used $\frac{1}{5}\Lambda'$, which properly contains
$\Lambda$, to define E'. In this case the cycles of $X_0(N)$ do have image in $\frac{1}{5}\Lambda'$,
but the homology map is not onto. Thus E' in (12.8) is not a strong
Weil curve.
The Eichler-Shimura construction uses the full image of cycles of
$X_0(N)$ as the set of periods to define an elliptic curve, by Theorem
11.74d. Thus it always leads directly to a strong Weil curve.
What we see computationally from the Eichler-Shimura construction
is a lattice in C, from which we can compute the j-invariant
approximately, as in §XI.1. Even if we assume that our approximation to the
j-invariant is an approximation to a particular rational number, we have
not completely solved the problem of identifying the strong Weil curve
as a particular curve over Q. There remains the problem of
distinguishing a curve over Q from its twists, as we saw by example in §XI.1. We
now take up the question of twists more generally.
Two elliptic curves E and E' over Q are called twists of each other
if j(E) = j(E'). We shall skip over the cases that the common value of
j is 0 or 1728, which require special treatment. We may assume that E
and E' are in global minimal form (3.23b) with respective coefficients
$a_1,\ldots,a_6$ and $a_1',\ldots,a_6'$ as usual. Since j is not 0 or 1728, none of
$c_4, c_4', c_6, c_6'$ is equal to 0. Put $v = \Delta/\Delta' \in \mathbb{Q}$. Since j(E) = j(E') and
$$j(E) = \frac{c_4^3}{\Delta} = 1728 + \frac{c_6^2}{\Delta},$$
we see that
$$v = \frac{\Delta}{\Delta'} = \frac{c_4^3}{c_4'^3} = \frac{c_6^2}{c_6'^2}.$$
It follows that $v = u^6$ for some $u = r/s$ in Q with GCD(r, s) = 1. Choosing
s > 0 and the sign of r to match the sign of $c_6/c_6'$, we have
$$(s^2c_4,\ s^3c_6,\ s^6\Delta) = (r^2c_4',\ r^3c_6',\ r^6\Delta'). \tag{12.9}$$
By Lemma 10.1 and Theorem 10.3, neither r nor s has any prime square
factor $p^2$ with p > 3. Conversely if E and E' are related as in (12.9),
then j(E) = j(E').
Proposition 12.10. Let E and E' be twists of each other, related
as in (12.9), and let their L functions have respective coefficients $a_p$ and
$a_p'$. If p is a prime not dividing $6\Delta\Delta'$, then
$$a_p' = \left(\frac{rs}{p}\right)a_p,$$
where $\left(\frac{\cdot}{p}\right)$ is the Legendre symbol.
Proof. Let $N_p$ and $N_p'$ be the respective numbers of affine solutions
modulo p. In view of (10.7), we are to show that $N_p = N_p'$ if $\left(\frac{rs}{p}\right) = 1$
and that $N_p + N_p' = 2p$ if $\left(\frac{rs}{p}\right) = -1$. Since p is prime to 6, we may
take the two equations to be
$$y^2 = x^3 - 27c_4x - 54c_6 \tag{12.10a}$$
$$y^2 = x^3 - 27c_4'x - 54c_6'. \tag{12.10b}$$
Suppose $\left(\frac{rs}{p}\right) = 1$. Choose $a \in \mathbb{Z}$ with $ra^2 \equiv s \bmod p$. If $(x_0,y_0)$
solves (12.10a) modulo p, then
$$y_0^2 = x_0^3 - 27\left(\frac{r}{s}\right)^2c_4'x_0 - 54\left(\frac{r}{s}\right)^3c_6'$$
shows that
$$(sar^{-1}y_0)^2 = (sr^{-1}x_0)^3 - 27c_4'(sr^{-1}x_0) - 54c_6',$$
i.e., that $(sr^{-1}x_0, sar^{-1}y_0)$ solves (12.10b) modulo p. Thus $N_p \le N_p'$,
and similarly $N_p' \le N_p$.
Now suppose that $\left(\frac{rs}{p}\right) = -1$. In the same way, we check, for each
fixed $x_0$, that the sum of the number of solutions $(x_0, y)$ of (12.10a) and
the number of solutions $(sr^{-1}x_0, y)$ of (12.10b) is 2. Summing on $x_0$, we
obtain $N_p + N_p' = 2p$.
By quadratic reciprocity, Proposition 12.10 says that L(s,E') in
almost all factors equals $L(s,E,\chi)$ for a quadratic Dirichlet character
modulo 4rs. Taking Theorems 12.2 and 12.4 into account, we see that
Proposition 12.10 almost proves that the question whether E is a Weil
curve depends only on j(E).
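The relation of Proposition 12.10 is easy to test by brute force. The following sketch works in the short form (12.10) and treats only the special case s = 1, r = d of (12.9), so that one curve is $y^2 = x^3 + Ax + B$ and the other is $y^2 = x^3 + Ad^2x + Bd^3$; the sample values of A, B, d and all function names are assumptions made for illustration.

```python
def affine_count(A, B, p):
    """Number of solutions (x, y) mod p of y^2 = x^3 + A*x + B."""
    roots = [0] * p                      # roots[t] = number of y with y^2 = t mod p
    for y in range(p):
        roots[y * y % p] += 1
    return sum(roots[(x**3 + A * x + B) % p] for x in range(p))


def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p, by Euler's criterion."""
    t = pow(d % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def check_twist(A, B, d, primes):
    """Check a_p' = (d/p) a_p for the twist by d, for each odd prime p not dividing d."""
    for p in primes:
        a = p - affine_count(A, B, p)                 # a_p = p - N_p, as in (10.7)
        a_twist = p - affine_count(A * d * d, B * d**3, p)
        if a_twist != legendre(d, p) * a:
            return False
    return True


print(check_twist(-2, 1, 5, [7, 11, 13, 17, 19, 23]))
```

The count is organized exactly as in the proof: for each x the two curves contribute either equal numbers of y values or numbers that sum to 2.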
3. Computation of Equations of Weil Curves
How can we check definitively whether a particular elliptic curve is a
Weil curve? And how can we write down equations for the Weil curves
that arise from Xq(N)? Let us first address these questions for strong
Weil curves.
The classical method of proof is to exhibit $\xi(\tau)$ and $\eta(\tau)$ in $A_0(\Gamma_0(N))$
such that $(\xi(\tau),\eta(\tau))$ parametrizes the curve. This method requires
constructing some cusp forms, especially of weight 2, and manipulating
identities to get the right results. Constructing cusp forms of weight 2
is a problem in its own right that we address below. In any event, the
classical method has been carried out in genus 1 and for some higher
cases, particularly in genus 2. It becomes really tedious as the genus
increases.
g     N
0     1 ≤ N ≤ 10; 12, 13, 16, 18, 25
1     11, 14, 15, 17, 19, 20, 21, 24, 27, 32, 36, 49
2     22, 23, 26, 28, 29, 31, 37, 50
Table 12.2. Curves $X_0(N)$ of low genus
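The entries of Table 12.2 can be recomputed from the standard genus formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2$ and $\nu_3$ count elliptic points, and $\nu_\infty$ counts cusps. The sketch below assumes that formula and the usual expressions for its ingredients rather than quoting them from Chapter IX, and its function names are hypothetical.

```python
from math import gcd


def prime_factors(n):
    """Distinct primes dividing n, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    result = n
    for p in prime_factors(n):
        result -= result // p
    return result


def genus_X0(N):
    """Genus of X_0(N) from the index, elliptic points, and cusps."""
    primes = prime_factors(N)
    mu = N                                    # index N * prod (1 + 1/p)
    for p in primes:
        mu = mu // p * (p + 1)
    nu2 = 0 if N % 4 == 0 else 1              # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1              # elliptic points of order 3
    for p in primes:
        nu2 *= 1 if p == 2 else (2 if p % 4 == 1 else 0)
        nu3 *= 1 if p == 3 else (2 if p % 3 == 1 else 0)
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    return (12 + mu - 3 * nu2 - 4 * nu3 - 6 * cusps) // 12


TABLE_12_2 = {
    0: list(range(1, 11)) + [12, 13, 16, 18, 25],
    1: [11, 14, 15, 17, 19, 20, 21, 24, 27, 32, 36, 49],
    2: [22, 23, 26, 28, 29, 31, 37, 50],
}
for g, levels in TABLE_12_2.items():
    print(g, all(genus_X0(N) == g for N in levels))
```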
In the case that the conductor is of the form $N = 2^a3^b$, one knows
all possible elliptic curves by classifying all relevant integer solutions of
$1728\Delta = c_4^3 - c_6^2$. All such elliptic curves are Weil curves. To prove
this, we start with an elliptic curve of conductor N, we compute many
coefficients $a_p$ from (10.7), and we check that they correspond to the first
part of the q expansion of a newform in $S_2(\Gamma_0(N))$ (see below). Then
using the method of §XI.1, we can compute to several decimal places
the lattice in C generated by images of cycles of $X_0(N)$. Next we can
compute the period lattice of the given elliptic curve as in Chapter VI,
using the Gauss arithmetic-geometric mean, and we can use some $\mathbb{C}^\times$
multiple of it to organize the information about the lattice in C.
Once we know a basis for the images of cycles of $X_0(N)$, we can
compute $g_2$, $g_3$, and j as in §XI.1. If the computed j does not match
j for the elliptic curve we have in mind, then the elliptic curve that we
have in mind is not a strong Weil curve. But it may still be an ordinary
Weil curve, and a method for approaching that question will be discussed
later.
The difficulty now from the point of view of proof is in identifying
the computed j as an exact rational number. We have it only to several
decimal places. But when $N = 2^a3^b$, the classification of curves of
conductor N has bounded the denominator of j, which is $\Delta$. Thus it
tells us the value of j as a rational number, provided we have computed
enough decimal places.
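The step from a decimal approximation to an exact rational number can be sketched with Python's standard `fractions` module. If the denominator of j is known to be at most D and the approximation is within $1/(2D^2)$ of the true value, then the true value is the unique closest fraction with denominator at most D, which is what `Fraction.limit_denominator` returns. The numerical value below is only an illustrative rational with denominator $11^5$, and `recover_rational` is a hypothetical helper name.

```python
from fractions import Fraction


def recover_rational(approx, max_denominator):
    """Closest fraction to approx whose denominator is at most max_denominator."""
    return Fraction(approx).limit_denominator(max_denominator)


true_j = Fraction(-122023936, 161051)    # illustrative value; 161051 = 11**5
approx_j = round(float(true_j), 11)      # pretend only 11 decimal places are known
print(recover_rational(approx_j, 11**5))
```

With fewer decimal places than the bound $1/(2D^2)$ requires, the same call may return a different nearby fraction, which is why the bound on the denominator has to come first.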
In the case of conductor $N = 2^a3^b$ we can therefore tell exactly
whether our computed j matches j for the given elliptic curve. If so, the
curves are twists of one another. Since the discriminant can have only 2
and 3 as prime factors, the set of possible twists is quite limited. Using
Proposition 12.10, we can check for the correct Fourier coefficients.
Recent progress allows use of a similar technique in the case of prime
conductor, and it should be expected that further progress will expand
the method. The following theorem bounds the denominator of j in the
case of prime conductor and allows us to proceed as above.
Theorem 12.11 (Mestre-Oesterlé). If a Weil curve E is in global
minimal form and its conductor N is a prime, then the discriminant $\Delta$
of E divides $N^5$.
The problem of writing down all ordinary Weil curves coming from
$X_0(N)$, with no side information, is a hard one. The problem is that it
is easy to miss a member of an isogeny class. Instead let us address the
question of proving that a particular curve E of conductor N is a Weil
curve. We start as above by computing many coefficients $a_p$ from (10.7)
and checking that they come from an initial segment of a newform in
$S_2(\Gamma_0(N))$. We form the lattice of E and the lattice from $X_0(N)$. Even
if they do not match (after some multiplication by $\mathbb{C}^\times$), it may happen
that the lattice from E corresponds to a sublattice of the images of
cycles of $X_0(N)$. Let n be the index of the lattice from E in the lattice
from $X_0(N)$, and let E' be the strong Weil curve. The goal is to find an
isogeny of degree n from E to E' or from E' to E. We shall not discuss
the details of how to proceed, but there are formulas that help in the
effort.
Success in proving that particular elliptic curves are Weil curves
depends on being able to produce initial segments of q expansions of
newforms. Although the methods of §IX.3 can be a little helpful, ultimately
one has to come to grips with an actual construction of newforms. For
low conductor the standard technique is to compute the action of the
Hecke operators on $H_1(X_0(N),\mathbb{Z})$ and then on $H_1(X_0(N),\mathbb{C})$. (See
§XI.5.) Proposition 11.24b yields matrices representing $T_2(n)$ in some
basis. Computation of a basis of $H_1(X_0(N),\mathbb{Z})$ takes some time but is
manageable.
A second technique is to produce initial segments of q expansions of
eigenforms by using trace formulas for the Hecke operators. Once the
trace is known for T2(n), one can compute the characteristic polynomial
and then the eigenvalues; from the eigenvalues, one can piece together
the q expansions of the eigenforms.
To obtain the characteristic polynomial of $T_2(n)$, we use an
observation from linear algebra. If L is a linear transformation on an r
dimensional complex vector space with characteristic polynomial
$$\det(XI - L) = \prod_{j=1}^{r}(X - \lambda_j) = X^r - c_1X^{r-1} + \cdots + (-1)^rc_r, \tag{12.11}$$
then $c_i$ is the ith elementary symmetric polynomial in $\lambda_1,\ldots,\lambda_r$ and is
a polynomial combination (by Newton's identities) of the symmetric polynomials
$$\lambda_1^j + \lambda_2^j + \cdots + \lambda_r^j = \mathrm{Tr}(L^j), \qquad 1 \le j \le r.$$
For $L = T_2(n)$, Theorem 9.17 allows us to express $T_2(n)^j$ as a linear
combination of the operators $T_2(m)$ for which $m \mid n^j$. Hence (12.11) can
be computed from the traces of the Hecke operators.
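The linear algebra observation can be made concrete with Newton's identities, $k\,c_k = \sum_{i=1}^{k}(-1)^{i-1}c_{k-i}\,\mathrm{Tr}(L^i)$ with $c_0 = 1$. The sketch below is a minimal illustration with a hypothetical function name; it takes the traces as given numbers and says nothing about how trace formulas for Hecke operators would supply them.

```python
from fractions import Fraction


def charpoly_from_traces(traces):
    """Given traces[j-1] = Tr(L^j) for j = 1..r, return [c_1, ..., c_r] where
    det(X*I - L) = X^r - c_1 X^(r-1) + ... + (-1)^r c_r (Newton's identities)."""
    r = len(traces)
    c = [Fraction(1)]                          # c_0 = 1
    for k in range(1, r + 1):
        total = sum(
            (-1) ** (i - 1) * c[k - i] * traces[i - 1] for i in range(1, k + 1)
        )
        c.append(Fraction(total) / k)
    return c[1:]


# Eigenvalues 1, 2, 3 have power sums 6, 14, 36.
print(charpoly_from_traces([6, 14, 36]))
```

For the sample input the result corresponds to $X^3 - 6X^2 + 11X - 6 = (X-1)(X-2)(X-3)$.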
Once we have the eigenvalues, we have to correlate the effects of the
operators $T_2(n)$ as n varies. For example, if the space is 2-dimensional
and $T_2(2)$ has a and b as eigenvalues and $T_2(5)$ has c and d as eigenvalues,
then we can look at $T_2(10)$ to see what the simultaneous eigenvalues are.
If $T_2(10)$ has eigenvalues ac and bd, then the simultaneous eigenvalues
are {a,c} and {b,d}; otherwise they are {a,d} and {b,c}.
Everything is easier if we have a candidate for an initial segment of
an eigenform, such as the corresponding coefficients of an elliptic curve.
Then we just have to check that the coefficients solve (12.11) and satisfy
recursions consistent with Theorem 9.17.
4. Connection with Fermat's Last Theorem
Suppose $\alpha^l + \beta^l = \gamma^l$ is a relatively prime counterexample to Fermat's
Last Theorem, with l prime and $\ge 5$. The elliptic curve E over Q given
by
$$E : y^2 = x(x-\alpha^l)(x-\gamma^l) \tag{12.12}$$
has
$$\Delta = 16\alpha^{2l}\beta^{2l}\gamma^{2l}, \qquad c_4 = 16(\alpha^{2l} - \alpha^l\gamma^l + \gamma^{2l}). \tag{12.13}$$
Let us put E in global minimal form and compute its conductor N. Any
odd prime p dividing $\Delta$ divides $\alpha\beta\gamma$. Since $\alpha, \beta, \gamma$ are relatively prime
in pairs, we see that p does not divide $c_4$. By Lemma 10.1, E is minimal
at p. If $p \mid \alpha\gamma$, we readily check that $E_p$ has a node at (0,0), and if $p \mid \beta$,
we readily check that $E_p$ has a node at $(\alpha^l,0)$. Thus
$$N = 2^{\text{power}} \prod_{p\ \text{odd},\ p \mid \Delta} p. \tag{12.14}$$
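The formulas (12.13) and the statement that no odd prime dividing $\Delta$ divides $c_4$ depend only on the shape $y^2 = x(x-A)(x-C)$ with A, C - A, C pairwise relatively prime, so they can be checked on toy data. The values below are emphatically not a Fermat counterexample; they merely stand in for $\alpha^l$, $\beta^l$, $\gamma^l$, and the function name is hypothetical.

```python
from math import gcd


def frey_invariants(A, C):
    """Delta and c4 of y^2 = x(x - A)(x - C) = x^3 - (A + C)x^2 + A*C*x."""
    a2, a4 = -(A + C), A * C
    b2, b4, b8 = 4 * a2, 2 * a4, -a4 * a4      # b6 = 0 because a3 = a6 = 0
    delta = -b2 * b2 * b8 - 8 * b4**3
    c4 = b2 * b2 - 24 * b4
    return delta, c4


A, B = 3**5, 5**5          # toy pairwise coprime values with A + B = C
C = A + B
delta, c4 = frey_invariants(A, C)

odd_common = gcd(c4, A * B * C)
while odd_common % 2 == 0:                     # strip the power of 2
    odd_common //= 2

print(delta == 16 * A**2 * B**2 * C**2)
print(c4 == 16 * (A**2 - A * C + C**2))
print(odd_common == 1)
```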
We still have to check p = 2. Since $2^4 \,\|\, c_4$, at most one reduction is
possible. We may assume that $\gamma$ is even, so that
$$\gamma^l \equiv 0 \bmod 32. \tag{12.15a}$$
Since $\beta^l \equiv -\alpha^l \bmod 4$ and l is odd, we may assume, possibly by
interchanging $\alpha$ and $\beta$, that
$$\alpha^l \equiv -1 \bmod 4. \tag{12.15b}$$
Putting x = 4X and y = 8Y + 4X, we are led to
$$Y^2 + XY = X^3 - \tfrac{1}{4}(1 + \alpha^l + \gamma^l)X^2 + \tfrac{1}{16}\alpha^l\gamma^lX. \tag{12.16}$$
The congruences (12.15) show that the coefficients here are integers (so
that (12.16) is a global minimal form) and that the equation modulo 2 is
$$Y^2 + XY = X^3 \quad\text{or}\quad Y^2 + XY = X^3 + X^2.$$
The singularity is at (X,Y) = (0,0); since neither $Y^2 + XY$ nor
$Y^2 + XY + X^2$ is a square, the singularity is a node.
Taking (12.14) into account, we see that the conductor is given by
$$N = \prod_{p \mid \alpha\beta\gamma} p. \tag{12.17}$$
We shall use the following theorem.
Theorem 12.12 (Ribet). Suppose E is an elliptic curve over Q given
in global minimal form and having discriminant $\Delta = \prod_{p \mid \Delta} p^{\delta_p}$ and
conductor $N = \prod_{p \mid \Delta} p^{f_p}$. Suppose further that E is a Weil curve with
a modular parametrization of level N given via a normalized newform
$f(\tau) = q + \sum_{n=2}^{\infty} c_nq^n$ in $S_2(\Gamma_0(N))$. Fix a prime l, and let
$$N_l = N \Big/ \prod_{\substack{p \text{ with } f_p = 1 \\ \text{and } l \mid \delta_p}} p. \tag{12.18}$$
Then there exists $f_1$ in $S_2(\Gamma_0(N_l))$ such that $f_1(\tau) = \sum_{n=1}^{\infty} d_nq^n$ with
integral coefficients and with $c_n \equiv d_n \bmod l$ for $1 \le n < \infty$.
Corollary 12.13 (Frey-Serre-Ribet). The Taniyama-Weil Conjecture
implies Fermat's Last Theorem.
Proof. Assuming that Fermat's Last Theorem is false, we apply
Theorem 12.12 to the curve E in (12.12). Comparison of (12.17) and
(12.18) shows that $N_l = 2$. The theorem produces $f_1$ in $S_2(\Gamma_0(2))$, which
is the 0 space. Thus $d_n = 0$ for all n. But $c_1 = 1$. So $c_n \equiv d_n \bmod l$
fails for n = 1, and we have a contradiction.
NOTES
Chapter I
The organization of Chapter I is based on Zagier [1988] and is
influenced also by the Introduction of Husemoller [1987]. References to
papers before 1940, among others, may be found in Shimura [1971a] and
Serre [1973a]. For Theorem 1.1, see Borevich and Shafarevich [1966].
Cubic counterexamples to the Hasse Principle were studied by Selmer
[1951]. Theorem 1.4 is discussed in Chapter III.
What we have called the "Weierstrass form" of a cubic is really Tate's
modification of the Weierstrass form. The first ten chapters of the book
work in the context of plane projective geometry, and our definition of
"elliptic curve" (as a nonsingular plane cubic curve in Weierstrass form)
is appropriate for that context. In Chapters XI and XII, and usually
in algebraic geometry, an elliptic curve is a nonsingular projective curve
of genus one, and one proves that all such curves can be realized in
Weierstrass form. See §XI.9.
Theorem 1.5 is the subject of Chapter IV, and Theorem 1.6 is the
main result of Chapter V. Theorem 1.7 appears in Mazur [1977] and
[1978]. For Theorem 1.8, see Chapter X. Conjectures 1.11 and 1.12, as
well as Theorem 1.13, are discussed in Chapter XII.
Conjecture 1.10 appeared originally in Birch and Swinnerton-Dyer
[1963-65]. For more recent numerical evidence, see Brumer and McGuinness
[1990]. There are several expositions of Conjecture 1.10 in the
literature. See Swinnerton-Dyer and Birch [1975], Zagier [1984], Appendix
C of Silverman [1986], Chapter 17 of Husemoller [1987], and Chapter 20
of Ireland and Rosen [1990].
Chapters II-V
The standard book on elliptic curves from an arithmetic standpoint
is Silverman [1986]. Three other books on this subject are Koblitz
[1984], Husemoller [1987], and Charlap and Robbins [1988]. Koblitz
concentrates on elliptic curves with complex multiplication, especially
the curves y2 = x3 — n2x that arise in connection with congruent
numbers. Some books on curves are Brieskorn and Knorrer [1986], Fulton
[1989], and Walker [1950]. Chapter II is based chiefly on Husemoller
[1987] and Walker [1950].
The examples at the beginning of Chapter III are from Zagier [1988].
The standard notation is as in Tate [1974] and Silverman [1986]. Tables
3.1 and 3.2 are the result of a simple home-computer calculation. The
idea of the proof of Theorem 3.8, as we give it, is a standard one and may
be found, for example, in Husemoller [1987]. Usually the proof in this
form is not carried through to completion. The discussion of singular
points is based on Silverman [1986].
Most of Chapter IV is from Zagier [1988]. The examples were in
Zagier's lectures. Proofs of most of the results in §9 may be found
in Ireland and Rosen [1990]. Theorem 4.11 goes also under the name
Mordell-Weil Theorem. Mordell [1922] proved the theorem as given
here, and Weil [1930] extended the result to elliptic curves defined over
number fields.
In Chapter V, Theorem 5.1 is due in a different form to Nagell [1935]
and in essentially the form here to Elisabeth Lutz [1937]. Our
treatment is similar to that of Lutz, as modified in Chapter 5 of Husemoller
[1987]. The examples are based on Silverman [1986] and Husemoller
[1987]. Theorems 5.2 and 5.3 are due to Nagell [1935] and Fueter [1930],
respectively. Our attention was brought to these results by Lichtenbaum
[1960], and we give Lichtenbaum's proofs. The material of §5 is based on
Husemoller [1987]; Husemoller credits H. P. Kraft, F. Knopp, G. Menzel,
and E. Senn. See also Exercise 8.12 of Silverman [1986].
Chapter VI
The material in §§1-3 is fairly standard and can be found in many
places. For example, there are individual chapters in Lang [1987],
Silverman [1986], and Husemoller [1987] on this topic. Sections 5-8 are
based on Siegel [1969]. The material in §9 is from Zagier [1988]. Gauss's
use of the arithmetic-geometric mean extended to complex numbers
and is applicable in evaluating period integrals when the roots of the
cubic are not all real. See Cox [1984] for an exposition.
Chapter VII
The definitive book about the Riemann zeta function is Titchmarsh
[1951]. Dirichlet series are discussed in many analysis books. For Euler
products, see pp. 217-220 of Husemoller [1987]. The material on
Dirichlet's Theorem is taken from Serre [1973a]. Further motivation in terms
of the zeta function of a number field may be found in Hecke [1981].
For the definition and properties of the gamma function $\Gamma(s)$, see
Ahlfors [1966]. For the elementary properties of the Fourier transform
on the line, see Stein and Weiss [1971]. The results of §5 have several
points of contact with §3.6 of Shimura [1971a]. The passage from the
transformation laws of a function like $\theta(\tau)$ to the functional equation of
an associated Dirichlet series is one of the main themes of Ogg [1969a].
Chapters VIII-IX
Modular forms appear in many books. The definitive book is Shimura
[1971a]. Other full-length books on the subject in the spirit presented
here are Gunning [1962], Ogg [1969a], Lang [1976], and Miyake [1989].
In addition, there are chapters on the subject in Apostol [1976], Koblitz
[1984], and Serre [1973a], and there are summaries in Gelbart [1975],
Silverman [1986], and Husemoller [1987].
Our presentation of much of the material in Chapter VIII is based on
Serre [1973a]. For §5 and §8, see also Ogg [1969a]. Our proof of Corollary
8.9 follows Siegel [1954]; for a different proof that gives different extra
information, see Serre [1973a].
Some authors define weight differently. Also Hecke operators have
more than one standard definition. Hecke operators, when defined as
operators given by one double coset, as in Shimura [1971a], are slightly
different from Hecke operators as we have defined them.
The subgroup $\Gamma_0(N)$ is treated in only a few places. The term "Hecke
subgroup" follows Weil [1971b] and Gelbart [1975]. Our presentation is
a reinterpretation of Ogg [1969a], Lang [1976], and Atkin and Lehner
[1970]. The expositions by Ogg [1973] and Zagier [1990] are helpful also.
For more detail about Example 4 in §IX.3, see Serre [1973] and Ogg
[1973]. Example 5 is discussed in Ogg [1973], and a generalization is in
Newman [1959]. In connection with §VIII.5 and §IX.4, see Ogg [1969a].
The details of the proof of Theorem 9.9 were omitted. For the Riemann
surface structure, see Chapter 1 of Shimura [1971a] or Chapter 1
of Miyake [1989]. For the comparison of zeros and poles of meromorphic
functions, see Proposition II.5.4 of Farkas and Kra [1980]. The
discussion of Theorem 9.10 mentions the Riemann-Hurwitz formula, which is
given as Theorem I.2.7 of Farkas and Kra [1980], and the fact that the
space of holomorphic differentials has dimension g, which is given as
Proposition III.2.7. For Theorem 9.10, see §2.6 of Shimura [1971a].
The material of §IX.7 is taken from Atkin and Lehner [1970]. The
proof of Theorem 9.22 may be found in that paper.
Chapter X
The material in §1 is due to Neron [1964]; see §VII.1 and §VIII.8 of
Silverman [1986]. Our discussion in §2 of zeta functions and L functions
is abbreviated. Silverman [1986] includes more detail in his Chapter
V. The L function of an elliptic curve often has the name Hasse-Weil
attached to it, and its shape is a motivating example for the Weil
conjectures about varieties over finite fields. The final Weil conjecture
generalizes Theorem 10.5, which is due to Hasse [1936]. We have given
a nonstandard elementary proof of this theorem, following Manin [1956];
Birch discovered an error in Manin's proof, and the error is corrected in
Cassels [1957] and here. The standard proof, given in §§V.1 and V.2 of
Silverman [1986], is longer but has the advantage that its methods are
applicable to other situations. For an exposition and history of Hasse's
Theorem and the Weil conjectures and their proofs, see Katz [1976].
Chapter XI
Eichler-Shimura theory refers both to the kinds of results in §§11-12
and to a theory that seeks to generalize the material in §1 from
S_2(Γ_0(N)) to S_k(Γ_0(N)). The main references for the theme that leads
to §§11-12 are Eichler [1954] and Shimura [1958], [1971a], and [1973b].
The article Swinnerton-Dyer and Birch [1975] includes a very helpful
overview. For the generalization of §1, the references are Eichler [1957]
and Shimura [1959]; Chapters V and VI of Lang [1976] give an exposition
for SL(2, Z).
In connection with §1, economical generators of Γ_0(N) are known if
N is prime. This result is due to Rademacher [1930] and is quoted by
Apostol [1990], p. 78. For the equations of E, E', and E" in §1, as well
as the isogeny, see Velu [1971a]. See also Langlands [1990].
For proofs of the results in §2, see §1.5 of Shimura [1971a]. The
material in §§3-4 about general Riemann surfaces is taken from Farkas and
Kra [1980], and detailed proofs may be found there: For our
Proposition 11.4, see Theorem II.5.1 of that book. For our Proposition 11.5, see
Theorem III.2.7. Proposition 11.7 through Corollary 11.12 are in §III.4
of that book, Proposition 11.13 through Theorem 11.19 are in §III.6,
and Theorems 11.20 and 11.21 are in §IV.11.
The invariant differential of an elliptic curve is treated in detail in
Silverman [1986], pp. 52-53 and §III.5.
The idea of the construction in the first part of §5 is in Swinnerton-
Dyer and Birch [1975], and the proof has been assembled from arguments
scattered throughout Shimura [1971a]. For example, our Proposition
11.22 is implicit in the proof of Shimura's Theorem 2.20, and many of
the steps in the proof of Proposition 11.23 appear in §8.3 of Shimura's
book. Shimura gives two proofs that the Hecke operators are given by
integer matrices and thus have algebraic integers as eigenvalues. One
proof uses the generalization of our §1 to weight k and appears on p. 84
and in Chapter 8 of Shimura [1971a]; the other uses what we have called
Proposition 11.73 and is on pp. 171-172 of that book. Theorem 11.27 is
a modification of Shimura's Theorem 3.51.
The material in §6 through Theorem 11.33 and its lemmas is classical.
Our treatment is based on §§3.3 and 5.2 of Lang [1987] and lectures of
H. Matumoto. Theorem 11.36 is implicit in §§6.7 and 6.8 of Shimura
[1971a].
Our treatment of general varieties and curves in §§7-8 is a merger
of Chapter I of Hartshorne [1977] and Chapters I and II of Silverman
[1986], and the reader is referred to those sources for proofs or references
to proofs of results in algebraic geometry. In places where neither a proof
nor a reference appears for such a result in either book, we have tried
to give some indication of proof.
For quoted results in commutative algebra, the reference is Zariski
and Samuel [1958] and [1960]. The Hilbert Basis Theorem is on p. 201
of volume 1, and the Hilbert Nullstellensatz is on p. 164 of volume 2.
Integral closure is discussed in pp. 254-265 of volume 1, and the result
quoted to obtain (11.74) is on p. 265.
Corollary 11.50 is implicit in Chapters 6 and 7 of Shimura [1971a]. The
material in §9 is abstracted from Silverman [1986], largely Chapters II
and III. For more information about isogenies, see Velu [1971b].
Lang [1983] treats abelian varieties in his §§I.1 and II.1, and he
introduces the Jacobian variety in §II.2. Proofs of Propositions 11.68 to 11.70
and of some of the assertions about Jacobian varieties may be found
there. Historically Weil [1946] and [1948, reprinted in 1971a] reworked
the foundations of algebraic geometry in part to deal rigorously with the
Jacobian variety. Page 88 of Weil [1971a] remarks on the understanding
of the Italian school of the relationship between correspondences and the
Jacobian variety; using correspondences involves counting intersections
properly, and this subject had never been pinned down. Lefschetz [1921]
had constructed the Jacobian variety as a projective variety over C, and
Weil's work constructed Jacobian varieties of curves defined over Q as
"abstract varieties." Chow [1954] realized the Jacobian variety of a curve
defined over Q as a projective variety over Q, and his paper contains the
results we quote as Theorem 11.71. For Theorem 11.72, see Theorem
VII of Chow [1949]. For Proposition 11.73, see §2.9, Proposition 9, of
Shimura and Taniyama [1961].
For the theorem of Wedderburn given as Lemma 11.75, see pp. 63, 66,
and 116 of Jacobson [1943].
Concerning Theorem 11.74, Eichler [1954] discovered (11.118) and
used it to relate the L function of X_0(N) to the action of the Hecke
operators. Shimura [1971a] adapted this algebraic argument and added
the idea of looking at a subvariety of J to get a version of Theorem 11.74;
see his Theorems 7.14 and 7.15. The theorem as we have given it is a
sharpened version that appeared in Shimura [1973b]. Much of Shimura's
proof, including the rationality of the Hecke operators, is couched in the
language of adeles in order to gain generality; insight into what Shimura
is doing may be obtained from Chapter 7 of Lang [1987]. Shimura's
proof of the equality of L functions uses his own general theory (Shimura
[1955]) of reduction of varieties modulo p and is valid for all but finitely
many primes. Igusa [1959] proves results about the reduction modulo p
of X_0(N) implying that Shimura's argument is valid for all primes not
dividing N.
Chapter XII
The presentation in this chapter, especially in §1, owes a great deal to
Zagier [1988] and Swinnerton-Dyer and Birch [1975]. For Theorem 12.2,
see §3.6 of Shimura [1971a]. Theorem 12.4 originally appeared in Weil
[1967]; Chapter V of Ogg [1969a] gives an exposition. For the Taniyama-
Weil Conjecture, see Taniyama [1957] and Weil [1967]. Theorem 12.7 is
Theorem 4 of Swinnerton-Dyer and Birch [1975].
The notion of conductor dates to Artin and became manageable as a
result of Ogg [1967]. An actual algorithm for computing it is the subject
of Tate [1975].
Swinnerton-Dyer, Stephens, et al. [1975] assembled extensive tables
of information about elliptic curves over Q with low conductor. Velu
[1976] reported a small number of errors that had been discovered in
the tables, including the omission of a page listing curves of conductors
121 to 124. Our Table 12.1 relies partly on those tables and partly on
computation using Tate's algorithm. Miyake [1989] includes long tables
of information about a few particular curves.
Theorem 12.8 is due to Carayol [1986], p. 411. The arguments relating
the Hasse-Weil Conjecture and the Taniyama-Weil Conjecture are in
Swinnerton-Dyer and Birch [1975]. The end of our §1 alludes to Serre's
Isogeny Theorem, which is on p. IV-14 of Serre [1968].
In §2, Proposition 12.9 is noted by Mazur [1973]. Twists appear in
Atkin and Lehner [1970], Shimura [1973b], and Atkin and Li [1978]. See
also §X.5 of Silverman [1986].
Ligozat [1975] works extensively with X_0(N) of genus 1, and the
classical method of proving that curves are Weil curves can be seen there.
Mazur and Swinnerton-Dyer [1974] show how N = 37 (of genus 2) can
be handled, and Birch [1973] shows how N = 50 (of genus 2) can be
handled. We have taken Table 12.2 from Mazur [1973], but it can readily
be computed directly.
Conductor 2^a 3^b was treated in the unpublished Manchester thesis of F.
B. Coghlan, and the numerical results are in Table 4 of Swinnerton-Dyer,
Stephens, et al. [1975]. Theorem 12.11 is due to Mestre and Oesterle
[1989]. In this connection, see also Theorem 1 of Coates [1970] and its
interpretation in Theorem 8 of Tate [1974].
A subject that this book has not treated is elliptic curves over Q with
complex multiplication. All such curves are Weil curves, by a theorem
of Deuring [1953-57]. The j-invariants of curves over Q with complex
multiplication lie in a finite set.
Trace formulas for Hecke operators are the subject of Chapter 6 of
Miyake [1989]. For original papers, see Eichler [1957], Selberg [1957],
Eichler [1973], and Hijikata [1974]. The argument centered about (12.11)
is taken from Miyake's book.
Frey [1986] studied an equation more general than (12.12) and noted
its relevance to Fermat's Last Theorem. Serre [1987b] studied this
question at length and gave a conjecture whose truth would yield Corollary
12.13. Ribet [1990] proved enough of Serre's conjecture to obtain
Corollary 12.13. We have followed Serre's paper in computing the conductor
of (12.12).
Connection with Representation Theory
Langlands [1990] pulls together L functions, class field theory, elliptic
curves, modular forms, infinite-dimensional representations of the
general linear group, adeles, and automorphic representations, all in the
space of 30 pages. His paper is a good one to start with, and a good one
to return to.
The idea that modular forms are connected with the representations
of GL(2) seems to have originated with Gelfand and Fomin in the 1950's.
Some expositions of the connection are in parts of Gelfand, Graev, and
Pyatetskii-Shapiro [1969], Weil [1971b], Deligne [1973], Gelbart [1975],
and Piatetski-Shapiro [1979]. In part, the connection is that one can
identify cusp forms for Γ_0(N) in two stages with functions on groups.
In the first stage the identification is with certain functions on SL(2, R)
transforming on one side by Γ_0(N) and on the other side by the rotation
subgroup. In the second stage it is with certain functions on GL(2, A),
where A is the group of adeles of Q. The cusp forms are then intimately
connected with the decomposition of the representation of GL(2, A) on
L^2 of the quotient space GL(2, Q)Z_A\GL(2, A), where Z_A is the center
of GL(2,A). Under this correspondence the Hecke operators end up
with a surprisingly simple interpretation.
Once this construction is complete, the objects of interest are pairs
consisting of a certain kind of infinite-dimensional irreducible unitary
representation of GL(2, A) and a finite-dimensional holomorphic
representation of GL(2, C) (initially taken to be the standard
representation). To such a pair the theory attaches an L function. To construct
the L function, one recognizes the infinite-dimensional representation
as a kind of infinite tensor product of representations of GL(2, R) and
GL(2,p-adics) for all p (see Flath [1979]), and one constructs a factor
of the L function for each factor of the representation. The intention
is to be guided by Hecke's classical theory, to obtain an analytic
continuation and functional equation for the resulting L function, and to
see connections with L functions in number theory. This program for
GL(2) is carried out in detail in Jacquet and Langlands [1970]. For an
exposition, see Robert [1973]. In retrospect, Tate's 1950 thesis (Tate
[1967]) did the same thing for GL(1).
This construction has become in part a way of generating automorphic
L functions, and it lends itself to generalization. For generalization to
GL(n), see Godement and Jacquet [1972]. For generalization to other
kinds of groups, several things have to be brought to bear simultaneously.
One can see them all at once in a preliminary form in Langlands [1970],
and all at once in a later form in Langlands [1979]. For more detail
one can consult Borel [1979], Borel and Jacquet [1979], and Tate [1979].
General sources for representation theory of reductive groups are Knapp
[1986] for real groups and Silberger [1979] for p-adic groups. For a status
report on construction of L functions, see Shahidi [1990].
The theory does more, however, than just produce L functions. It
provides tools for working with them and bringing together the fields of
mathematics in which L functions play a role. Three papers that explain
such relationships are Gelbart [1984], Gelbart and Shahidi [1988], and
Clozel [1990]. And, once again, one should look at Langlands [1990].
REFERENCES
Conferences
"Antwerp-Bonn": Modular Forms in One Variable, I-VI, Lecture Notes
in Math. 320, 349, 350, 476, 601, 627 (1973-77), Springer-Verlag,
Heidelberg Berlin New York.
"Corvallis": Automorphic Forms, Representations, and L-functions,
Proc. Symposia in Pure Math. XXXIII, parts 1-2 (1979),
American Mathematical Society.
"Ann Arbor": Clozel, L., and J. S. Milne, Automorphic Forms, Shimura
Varieties, and L-functions I-II: Proceedings of a Conference Held
at the University of Michigan, Ann Arbor, July 6-16, 1988,
Perspectives in Mathematics 10-11 (1990), Academic Press Inc.,
San Diego, Academic Press, Boston.
"Motives": Proc. Symposia in Pure Math., American Mathematical
Society, to appear.
Books and Papers
Ahlfors, L. V., Complex Analysis, McGraw Hill Inc., New York, 2nd
edition, 1966.
Apostol, T. M., Modular Functions and Dirichlet Series in Number
Theory, Springer-Verlag, New York Berlin Heidelberg, 2nd
edition, 1990 (1st edition, 1976).
Atkin, A. O. L., and J. Lehner, Hecke operators on Γ_0(m), Math.
Annalen 185 (1970), 134-160.
Atkin, A. O. L., and W. W. Li, Twists of newforms and pseudo-
eigenvalues of W-operators, Invent. Math. 48 (1978), 221-243.
Birch, B., Some calculations of modular relations, Modular Functions
of One Variable I, Lecture Notes in Math. 320 (1973), Springer-
Verlag, Berlin Heidelberg New York, 175-186.
Birch, B. J., and H. P. F. Swinnerton-Dyer, Notes on elliptic curves I and
II, J. Reine Angew. Math. 212 (1963), 7-25; 218 (1965), 79-108.
Borel, A., Automorphic L-functions, Automorphic Forms,
Representations, and L-functions, Proc. Symposia in Pure Math. XXXIII,
part 2 (1979), American Mathematical Society, 27-61.
Borel, A., and H. Jacquet, Automorphic forms and automorphic
representations, Automorphic Forms, Representations, and
L-functions, Proc. Symposia in Pure Math. XXXIII, part 1
(1979), American Mathematical Society, 189-202.
Borevich, Z. I., and I. R. Shafarevich, Number Theory, Academic Press,
New York, 1966.
Brieskorn, E., and H. Knorrer, Plane Algebraic Curves, Birkhauser
Verlag, Basel Boston Stuttgart, 1986.
Brumer, A., and O. McGuinness, The behavior of the Mordell-Weil
group of elliptic curves, Bull. Amer. Math. Soc. 23 (1990), 375-
382.
Carayol, H., Sur les representations l-adiques associees aux formes
modulaires de Hilbert, Ann. Scient. Ecole Norm. Sup. 19 (1986),
409-468.
Cassels, J. W. S., Review of Yu I. Manin's "On cubic congruences to a
prime modulus," Math. Reviews 18 (1957), 380-381.
Coates, J., An effective p-adic analogue of a theorem of Thue II, Acta
Arithmetica 16 (1970), 399-412.
Charlap, L. S., and D. P. Robbins, An elementary introduction to elliptic
curves, preprint, 1988.
Chow, W., On compact complex analytic varieties, Amer. J. Math. 71
(1949), 893-914.
Chow, W., The Jacobian variety of an algebraic curve, Amer. J. Math.
76 (1954), 453-476.
Clozel, L., Motifs et formes automorphes: applications du principe de
fonctorialite, Automorphic Forms, Shimura Varieties, and
L-functions I: Proceedings of a Conference Held at the University
of Michigan, Ann Arbor, July 6-16, 1988, Perspectives in
Mathematics 10 (1990), Academic Press Inc., San Diego, 77-159.
Cox, D. A., The arithmetic-geometric mean of Gauss, L'Enseignement
Math. 30 (1984), 275-330.
Deligne, P., Formes modulaires et representations de GL(2), Modular
Functions of One Variable II, Lecture Notes in Math. 349 (1973),
Springer-Verlag, Berlin Heidelberg New York, 55-105.
Deligne, P., and M. Rapoport, Les schemas de modules de courbes
elliptiques, Modular Functions of One Variable II, Lecture Notes
in Math. 349 (1973), Springer-Verlag, Berlin Heidelberg New
York, 143-316.
Deuring, M., Die Zetafunktion einer algebraischen Kurve vom
Geschlechte Eins I, II, III, IV, Nachr. Akad. Wiss. Gottingen
(1953) 85-94, (1955) 13-42, (1956) 37-76, (1957) 55-80.
Eichler, M., Quaternare quadratische Formen und die Riemannsche
Vermutung fur die Kongruenzzetafunktionen, Arch. der Math. 5
(1954), 355-366.
Eichler, M., Eine Verallgemeinerung der Abelschen Integrale, Math.
Zeitschr. 67 (1957), 267-298.
Eichler, M., The basis problem for modular forms and the traces of the
Hecke operators, Modular Functions of One Variable I, Lecture
Notes in Math. 320 (1973), Springer-Verlag, Berlin Heidelberg
New York, 75-151.
Farkas, H. M., and I. Kra, Riemann Surfaces, Springer-Verlag, New York
Heidelberg Berlin, 1980.
Flath, D., Decomposition of representations into tensor products,
Automorphic Forms, Representations, and L-functions, Proc.
Symposia in Pure Math. XXXIII, part 1 (1979), American
Mathematical Society, 179-183.
Frey, G., Links between stable elliptic curves and certain Diophantine
equations, Annales Universitatis Saraviensis, Series
Mathematicae, 1 (1986), 1-40.
Fueter, R., Ueber kubische diophantische Gleichungen, Comm. Math.
Helv. 2 (1930), 69-89.
Fulton, W., Algebraic Curves, Addison-Wesley, Redwood City,
California, 1989.
Gelbart, S. S., Automorphic Forms on Adele Groups, Annals of Math.
Studies 83, Princeton University Press, Princeton, 1975.
Gelbart, S. S., An elementary introduction to the Langlands program,
Bull. Amer. Math. Soc. 10 (1984), 177-219.
Gelbart, S., and F. Shahidi, Analytic Properties of Automorphic
L-Functions, Perspectives in Mathematics 6 (1988), Academic
Press Inc., San Diego.
Gelfand, I. M., M. I. Graev, and I. I. Pyatetskii-Shapiro,
Representation Theory and Automorphic Functions, W. B. Saunders Co.,
Philadelphia, 1969.
Godement, R., and H. Jacquet, Zeta Functions of Simple Algebras,
Lecture Notes in Math. 260 (1972), Springer-Verlag, Berlin
Heidelberg New York.
Gunning, R. C., Lectures on Modular Forms, Annals of Math. Studies
48, Princeton University Press, Princeton, 1962.
Hartshorne, R., Algebraic Geometry, Springer-Verlag, New York
Heidelberg Berlin, 1977.
Hasse, H., Zur Theorie der abstrakten elliptischen Funktionkorper I, II,
III, J. Reine Angew. Math. 175 (1936), 55-62, 69-88, 193-208.
Hecke, E., Lectures on the Theory of Algebraic Numbers, Springer-
Verlag, New York Heidelberg Berlin, 1981.
Hijikata, H., Explicit formula of the traces of Hecke operators for Γ_0(N),
J. Math. Soc. Japan 26 (1974), 56-82.
Husemoller, D., Elliptic Curves, Springer-Verlag, New York Berlin
Heidelberg, 1987.
Igusa, J., Kroneckerian models of fields of elliptic modular functions,
Amer. J. Math. 81 (1959), 561-577.
Ireland, K., and M. Rosen, A Classical Introduction to Modern
Number Theory, Springer-Verlag, New York Berlin Heidelberg,
2nd edition, 1990.
Jacobson, N., The Theory of Rings, Mathematical Surveys II (1943),
American Mathematical Society.
Jacquet, H., and R. P. Langlands, Automorphic Forms on GL(2),
Lecture Notes in Math. 114 (1970), Springer-Verlag, Berlin
Heidelberg New York.
Katz, N. M., An overview of Deligne's proof of the Riemann hypothesis
for varieties over finite fields, Mathematical Developments Arising
from Hilbert Problems, Proc. Symposia in Pure Math. XXVIII
(1976), American Mathematical Society, 275-305.
Knapp, A. W., Representation Theory of Semisimple Groups: An
Overview Based on Examples, Princeton University Press,
Princeton, 1986.
Koblitz, N., Introduction to Elliptic Curves and Modular Forms,
Springer-Verlag, New York Berlin Heidelberg, 1984.
Lang, S., Introduction to Modular Forms, Springer-Verlag, Berlin
Heidelberg New York, 1976.
Lang, S., Abelian Varieties, Springer-Verlag, New York Berlin
Heidelberg, 1983.
Lang, S., Elliptic Functions, Springer-Verlag, New York Berlin
Heidelberg, 2nd edition, 1987.
Lang, S., Old and new conjectured Diophantine inequalities, Bull.
Amer. Math. Soc. 23 (1990), 37-75.
Langlands, R. P., Problems in the theory of automorphic forms, Lectures
in Modern Analysis and Applications III, Lecture Notes in Math.
170 (1970), Springer-Verlag, Berlin Heidelberg New York, 18-61.
Langlands, R. P., Modular forms and l-adic representations, Modular
Functions of One Variable II, Lecture Notes in Math. 349 (1973),
Springer-Verlag, Berlin Heidelberg New York, 361-500.
Langlands, R. P., Automorphic representations, Shimura varieties, and
motives. Ein Marchen, Automorphic Forms, Representations, and
L-functions, Proc. Symposia in Pure Math. XXXIII, part 2
(1979), American Mathematical Society, 205-246.
Langlands, R. P., Representation theory: its rise and its role in
number theory, Proceedings of the Gibbs Symposium, Yale
University, May 15-17, 1989, American Mathematical Society,
1990, 181-210.
Lefschetz, S., On certain numerical invariants of algebraic varieties with
application to Abelian varieties, Trans. Amer. Math. Soc. 22
(1921), 327-482.
Li, W. W., Newforms and functional equations, Math. Annalen 212
(1975), 285-315.
Lichtenbaum, S., Torsion Subgroups of Elliptic Curves, Undergraduate
Honors Thesis, Harvard University, Cambridge, Massachusetts,
1960.
Ligozat, G., Courbes modulaires de genre 1, Bull. Soc. Math. France,
Supplement, Memoire 43 (1975).
Lutz, E., Sur l'equation y^2 = x^3 - Ax - B dans les corps p-adiques, J.
Reine Angew. Math. 177 (1937), 238-247.
Manin, Yu I., On cubic congruences to a prime modulus, Izv. Akad. Nauk
SSSR, Ser. Mat. 20 (1956), 673-678 (Russian), Amer. Math. Soc.
Transl. (2) 13 (1960), 1-7.
Mazur, B., Courbes elliptiques et symboles modulaires, Seminaire
Bourbaki, vol. 1971/72, Exposees 400-417, Lecture Notes in
Math. 317 (1973), Springer-Verlag, Berlin Heidelberg New York,
277-294.
Mazur, B., Modular curves and the Eisenstein ideal, I.H.E.S. Publ.
Math. 47 (1977), 33-186.
Mazur, B., Rational isogenies of prime degree, Invent. Math. 44 (1978),
129-162.
Mazur, B., and P. Swinnerton-Dyer, Arithmetic of Weil Curves, Invent.
Math. 25 (1974), 1-61.
Mestre, J.-F., and J. Oesterle, Courbes de Weil semi-stables de
discriminant une puissance m-ieme, J. Reine Angew. Math. 400
(1989), 173-184.
Miyake, T., Modular Forms, Springer-Verlag, Berlin Heidelberg New
York, 1989.
Mordell, L. J., On the rational solutions of the indeterminate equations
of the third and fourth degrees, Proc. Cambridge Philos. Soc. 21
(1922-23), 179-192.
Nagell, T., Solution de quelques problemes dans la theorie arithmetique
des cubiques planes du premier genre, Skrifter Norske
Videnskaps-Akademi i Oslo, 1935, No. 1, pp. 1-25.
Neron, A., Modeles minimaux des varietes abeliennes sur les corps locaux
et globaux, I.H.E.S. Publ. Math. 21 (1964), 5-128.
Newman, M., Construction and application of a class of modular
functions II, Proc. London Math. Soc. 9 (1959), 373-387.
Ogg, A. P., Elliptic curves and wild ramification, Amer. J. Math. 89
(1967), 1-21.
Ogg, A., Modular Forms and Dirichlet Series, W. A. Benjamin Inc.,
New York, 1969a.
Ogg, A., On the eigenvalues of Hecke operators, Math. Annalen, 179
(1969b), 101-108.
Ogg, A., Survey of modular functions of one variable, Modular Functions
of One Variable I, Lecture Notes in Math. 320 (1973), Springer-
Verlag, Berlin Heidelberg New York, 1-35.
Patterson, S. J., Automorphic forms and number theory, Ecole
Europeenne de Theorie des Groupes, C.I.R.M. Marseille-Luminy,
preprint, 1991.
Piatetski-Shapiro, I., Classical and adelic automorphic forms. An
introduction, Automorphic Forms, Representations, and
L-functions, Proc. Symposia in Pure Math. XXXIII, part 1
(1979), American Mathematical Society, 185-188.
Rademacher, H., Uber die Erzeugenden von Kongruenzuntergruppen der
Modulgruppe, Abhandlungen Math. Seminar Hamburg 7 (1930),
134-148.
Ribet, K. A., On modular representations of Gal(Q̄/Q) arising from
modular forms, Invent. Math. 100 (1990), 431-476.
Robert, A., Formes automorphes sur GL2, Seminaire Bourbaki, vol.
1971/72, Exposees 400-417, Lecture Notes in Math. 317 (1973),
Springer-Verlag, Berlin Heidelberg New York, 295-318.
Selberg, A., Automorphic functions and integral operators, Seminars
on analytic functions. Institute for Advanced Study, Princeton,
N. J., and U. S. Air Force Office of Scientific Research, 2 (1957),
152-161.
Selmer, E. S., The diophantine equation ax^3 + by^3 + cz^3 = 0, Acta Math.
85 (1951), 203-362, and 92 (1954), 191-197.
Serre, J.-P., Abelian l-adic Representations and Elliptic Curves, W. A.
Benjamin Inc., New York, 1968.
Serre, J.-P., A Course in Arithmetic, Springer-Verlag, New York
Heidelberg Berlin, 1973a.
Serre, J.-P., Congruences et formes modulaires, Seminaire Bourbaki, vol.
1971/72, Exposees 400-417, Lecture Notes in Math. 317 (1973b),
Springer-Verlag, Berlin Heidelberg New York, 319-338.
Serre, J.-P., Lettre a J.-F. Mestre, Current Trends in Arithmetical
Algebraic Geometry, Contemporary Mathematics Series 67
(1987a), American Mathematical Society, Providence, 263-268.
Serre, J.-P., Sur les representations modulaires de degre 2 de Gal(Q̄/Q),
Duke Math. J. 54 (1987b), 179-230.
Serre, J.-P., Lectures on the Mordell-Weil Theorem, Friedr. Vieweg &
Sohn, Braunschweig, 2nd edition, 1990.
Shahidi, F., Automorphic L-functions: a survey, Automorphic Forms,
Shimura Varieties, and L-functions I: Proceedings of a
Conference Held at the University of Michigan, Ann Arbor, July
6-16, 1988, Perspectives in Mathematics 10 (1990), Academic
Press Inc., San Diego, 415-437.
Shimura, G., Reduction of algebraic varieties with respect to a discrete
valuation of the basic field, Amer. J. Math. 77 (1955), 134-176.
Shimura, G., Correspondances modulaires et les fonctions ζ de courbes
algebriques, J. Math. Soc. Japan 10 (1958), 1-28.
Shimura, G., Sur les integrales attachees aux formes automorphes, J.
Math. Soc. Japan 11 (1959), 291-311.
Shimura, G., Introduction to the Arithmetic Theory of Automorphic
Functions, Princeton University Press, Princeton, 1971a.
Shimura, G., On elliptic curves with complex multiplication as factors
of the jacobians of modular function fields, Nagoya Math. J. 43
(1971b), 199-208.
Shimura, G., Complex multiplication, Modular Functions of One
Variable I, Lecture Notes in Math. 320 (1973a), Springer-Verlag,
Berlin Heidelberg New York, 37-56.
Shimura, G., On the factors of the jacobian variety of a modular function
field, J. Math. Soc. Japan 25 (1973b), 523-544.
Shimura, G., and Y. Taniyama, Complex Multiplication of Abelian
Varieties and Its Applications to Number Theory, Publ. Math.
Soc. Japan, no. 6, 1961, Kenkyusha Printing Co., Tokyo.
Siegel, C. L., A simple proof of η(-1/τ) = η(τ)√(τ/i), Mathematika 1
(1954), p. 4.
Siegel, C. L., Topics in Complex Function Theory, vol. 1, John Wiley &
Sons, New York, 1969.
Silberger, A. J., Introduction to Harmonic Analysis on Reductive p-Adic
Groups, Princeton University Press, Princeton, 1979.
Silverman, J. H., The Arithmetic of Elliptic Curves, Springer-Verlag,
New York Berlin Heidelberg, 1986.
Stein, E. M., and G. Weiss, Introduction to Fourier Analysis on
Euclidean Spaces, Princeton University Press, Princeton, 1971.
Swinnerton-Dyer, H. P. F., and B. J. Birch, Elliptic curves and modular
functions, Modular Functions of One Variable IV, Lecture Notes
in Math. 476 (1975), Springer-Verlag, Berlin Heidelberg New
York, 2-32.
Swinnerton-Dyer, H. P. F., N. M. Stephens, J. Davenport, J. Velu, F.
B. Coghlan, A. O. L. Atkin, and D. J. Tingley, Numerical tables on
elliptic curves, Modular Functions of One Variable IV, Lecture
Notes in Math. 476 (1975), Springer-Verlag, Berlin Heidelberg
New York, 74-144.
Taniyama, Y., L-functions of number fields and zeta functions of abelian
varieties, J. Math. Soc. Japan 9 (1957), 330-366.
Tate, J. T., Fourier analysis in number fields and Hecke's zeta functions,
in J. W. S. Cassels and A. Frohlich, Algebraic Number Theory,
Academic Press, London, 1967, 305-347.
Tate, J. T., The arithmetic of elliptic curves, Invent. Math. 23 (1974),
179-206.
Tate, J., Algorithm for determining the type of a singular fiber in an
elliptic pencil, Modular Functions of One Variable IV, Lecture
Notes in Math. 476 (1975), Springer-Verlag, Berlin Heidelberg
New York, 33-52.
Tate, J., Number theoretic background, Automorphic Forms,
Representations, and L-functions, Proc. Symposia in Pure Math. XXXIII,
part 2 (1979), American Mathematical Society, 3-26.
Titchmarsh, E. C., The Theory of the Riemann Zeta-Function, Oxford
University Press, Oxford, 1951.
Velu, J., Courbes elliptiques sur Q ayant bonne reduction en dehors de
{11}, C. R. Acad. Sci. Paris, Ser. A 273 (1971a), 73-75.
Velu, J., Isogenies entre courbes elliptiques, C. R. Acad. Sci. Paris,
Ser. A 273 (1971b), 238-241.
Velu, J., Review of "Numerical tables on elliptic curves," Math. Reviews
52 (1976), #10557.
Walker, R. J., Algebraic Curves, Dover Publications Inc., New York,
1950.
Weil, A., Sur un theoreme de Mordell, Bull. Sci. Math. 54 (1930), 182-
191.
Weil, A., Foundations of Algebraic Geometry, Amer. Math. Soc.
Colloquium Publications XXIX, American Mathematical Society,
1946; 2nd edition, 1962.
Weil, A., Uber die Bestimmung Dirichletscher Reihen durch
Funktionalgleichungen, Math. Annalen 168 (1967), 149-156.
Weil, A., Courbes Algebriques et Varietes Abeliennes, Hermann, Paris,
1971a. (Reprint of volumes 1041 and 1064 of Actualites
Scientifiques et Industrielles, 1948.)
Weil, A., Dirichlet Series and Automorphic Forms, Lecture Notes in
Math. 189 (1971b), Springer-Verlag, Berlin Heidelberg New York.
Zagier, D. B., L-series of elliptic curves, the Birch-Swinnerton-Dyer
conjecture, and the class number problem of Gauss, Notices
Amer. Math. Soc. 31 (1984), 739-743.
Zagier, D. B., Modular parametrizations of elliptic curves, Canad. Math.
Bull. 28 (1985), 372-384.
Zagier, D., Elliptic Curves, lecture series, Tata Institute for Fundamental
Research, Bombay, January 1988.
Zagier, D., Introduction to modular forms, preprint, 1990.
Zariski, O., and P. Samuel, Commutative Algebra, vol. I, D. Van
Nostrand Company Inc., Princeton, 1958.
Zariski, O., and P. Samuel, Commutative Algebra, vol. II, D. Van
Nostrand Company Inc., Princeton, 1960.
INDEX OF NOTATION
See also the list of Standard Notation on page xv. In the list below,
Latin, German, and script letters appear together and are followed by
Greek symbols and non-letters.
<*i, <*2> 03, 04, ae, 42, 56, 57
ap, 294, 295
A(T), 331
i4*(r), 283
Ak(r0(N)), 333
f>2, &4, h, 681 57
cj,329
C4, c6, 57
c„, 225
c(m,X), 214, 387
C, 273
C(*,y,u»),37
C*,353
d, 59
deg, 317, 359
deg,, deg,, 359
df, 371
dt(n), 372
Biv(X), 316, 360
Divo(X), 316, 361
e, 170
eF(x), 359
E(k), 25
E(Q), 15, 80
E(Q)_tors, 130
E^(n)(Q), 137, 138
End(E), 364, 368
End_Q(J), 374
E_p, 130, 135
/, 210
/, 243
/i(*,y), 26
///, 345
/r, 212
/«., 314
f(r), 330
F, 365
F*, 349, 360
F., 360, 392
F(lfc), 25
F(t), 302
Fp, 135
F*, 28
g, 272, 317, 361
g_2, g_2(Λ), 158, 222
g_3, g_3(Λ), 158, 222
g_2(τ), g_3(τ), G_{2k}(τ), 223
G_2(τ), 266
G^2)(r), 266
G»,Ga*(A), 158, 222
h, 97, 263
h_0, 95
H, 30
H_1(R), H_1(C), 326
H^1(R), H^1(C), 327
H_1(X_0(N), Z), 323
Hom(E_1, E_2), 364, 368
ℋ, 223
ℋ*, 311
i(P, L, F), 32
I, 341
I(V), 342
I(V/k_0), 342
I(W), 344
/(Wnib"), 345
j, 65, 333
j(Λ), 222
j(τ), 223, 333
j_N, 335
j, 371
J, 371
J(X), 318, 369
J(X_0(N)), 310
k[x, y, w]_d, 24
k_0[V], k[V], 342
k_0(V), k(V), 342
k_0(W), k(W), 346
k[V]_x, 343
k[W], 346
k[W]_x, 346
K(X), 312
L, 19
L(k), 19
L(s), 17
L(s, f), 238, 267, 386
L(s, f, χ), 387
L(s, E), 295
L(s, χ), 201
L(D), 316, 361
L(T), 329
£,*, 20
m_x, 343
H.365
M(a, b), 186
M(n), 244
M(n, N), 276, 320
M*(n), 256
M_k, 234
M_k(Γ_0(N)), 265
M_x, 344
Mfl, 351
n(C), 170
n_1, n_2, 108
N, 390
N(p), 16
ord_x(f), 316, 350
ord_x(D), 316, 360
O, 11, 67, 74, 362, 368
℘, 151, 153
-P, 12, 74
P^2(k), 19
P^n(k), 344
PA, 273
Pic(X), 316, 360
Pic^0(C), 361
P + Q, 11, 67
P · Q, 10, 43
PQ, 10, 43
q, 225
q_h, 263
q_N, 263
Q, 285
r, 102
r_p, 130, 135
R, 227
ft, 251
R(f, g), 45
[R(f, g)], 45
R(n), 248, 278
R_{E/Q}, 106
R_N, 260
72, 170, 271, 305, 375
K\ 172, 271, 305
S, 228
S_k, 234
S_k(Γ_0(N)), 265
S_k^±(Γ_0(N)), 270
S_k^old(Γ_0(N)), 283
S_k^new(Γ_0(N)), 283
S, 375
𝒮(R^1), 211
t(n), 372
T, 228, 305
T(n), 244, 275, 320, 321, 323
T#(n), 372
T_k(n), 244, 245, 275, 277
fp, 384
T.375
u, 63, 290, 293
U, 375
v, 351
v_∞, v_i, v_ρ, v_τ, 231
V, 342
V/k_0, 342
V(k_0), 342
V4,V6, 305
V/, 341
w(C), 174
w(Q), 285
w_N, 269
w_Q, 285
W, 344
W(k_0), 344
W_i, 344
X_0(11), 305
X_0(N), 310, 311
q#, 253,261
a,, 244, 277, 320, 372
α_N, 269
[α]_k, 245, 261
[ll 323
Γ(N), 250, 256
Γ_0(N), 256
ro(r.p), 287
Ti, r2, 170
fi.fj, 177
ra, r6, rc, 170
rab, 321
ra, r^, r7,177
δ(g, τ), 229, 261
6f, 371
6<(n), 373
Δ, 58
Δ(Λ), 222
Δ(τ), 223
ζ(s), 189
η(τ), 235
θ(τ), 208
θ(τ, χ), 216
Λ, 153, 223, 371
Λ(s), 18, 209
Λ(s, f), 240, 270, 386
Λ(s, f, χ), 387
Λ(s, E), 386, 387
Λ(s, E, χ), 387
Λ(s, χ), 216
A;, 374
Λ_τ, 223
(Λ, C), 273
μ, 272, 371
/<«, A*2> /*3, 272
ν, 374
π, 370
Π, 153
ρ, 208, 375
σ, 208
σ_k(n), 225
τ = ρ + iσ, 208, 224
τ_0, 302
*, 360, 384
^,384
<p, 272, 357, 380
<p(r), 239, 268
¥>a,90
Φ, 20, 318, 345, 369
$,370
$#, 372
*/(7), 303
Φ_N(X), 335
χ, 201
X*. 213
χ_0, 201
ω, 314, 370, 374
ω_1, ω_2, 153, 223, 274
Ω(X), 312
Ω_hol(X), 313, 370
∞, 79, 231, 261
||,90
(•)", 210, 365
(•)*, 328
(•,•), 242, 280, 326,371
[•], 323, 365
[·]_k, 245, 261
|·|_p, 134
(- J , 272, 298, 393
H .83
INDEX
Abel's Theorem, 318
abelian
subvariety, 368
variety, 367
addition, 162, 363
elliptic curve, 11
formula, 76
adeles, 407
admissible change of variables,
63, 363
affine
algebraic set, 342
coordinate ring, 342
curve, 25, 343
local coordinates, 21, 345
problem, 3
variety, 342
algebraic integer, 122, 123
algebraic numbers, 122
algebraic set
affine, 342
projective, 344
analytic continuation, 166, 167
arithmetic-geometric mean, 185,
402
associate Dirichlet characters,
213
associativity, 12, 67
Atkin-Lehner Theorem, 283, 289
automorphic function, 333
automorphic L function, xi, 208
bad prime, 108
Bezout's Theorem, 27, 47
birational map, 348
Birch and Swinnerton-Dyer
Conjecture, 17, 208, 386
canonical class, 316, 361
canonical height, 95, 97
canonical Q structure, 333,
341, 349, 357
Carayol, 391
change of variables, admissible,
63, 362
character, 200
chord case, 75
chord-tangent composition rule,
11, 43
class group
divisor, 316, 361
ideal, 127
class number, 126
complex multiplication, 164, 407
complex torus, 160, 183, 370
conductor, 213, 390
congruence subgroup, 250, 256
congruent numbers, 52, 110, 112,
115
conic, 25
continuation, analytic, 166, 167
convergent product, 195
coordinate ring, 342
coordinates, affine local, 345
correspondence, 385
cubic, 25, 40, 56
nonsingular, 58
singular, 58
curve
affine, 25, 343
elliptic, 42
nonsingular, 27
projective, 345
projective plane, 24
same, 24
singular, 27
smooth, 27
cusp, 77, 262
cusp form, 225, 261
defined at, 344, 346, 348
defined over, 342, 344, 347, 361,
362, 368
degenerate, 68
degree, 243, 249, 273, 317, 359, 360
descent, 80
differential
holomorphic, 312
invariant, 314
meromorphic, 312
dimension, 343, 346
dimension formula, 235, 272
Diophantus, 3, 6, 7, 10, 50
Diophantus method, 6, 10, 115,
119
Dirichlet character, 201
associate, 213
conductor of, 213
extension, 213
primitive, 213
principal, 201
Dirichlet L function, xii, 201
Dirichlet series, 192
Dirichlet Unit Theorem, 125
Dirichlet's Theorem, xii, 148, 189
discrete valuation ring, 351
discriminant, 15, 58, 59, 125,
226
divisor, 316, 360
divisor class group, 316, 361
divisor, principal, 316, 360
dominant morphism, 349
dominate, 392
Double Series Theorem, 158
doubling formula, 76
doubly periodic, 152
dual isogeny, 309, 365
Eichler-Shimura theory, xii, 302,
374, 389, 390, 404
eigenform, 280
elliptic curve, 13, 42, 160, 183,
362
modular, 221
elliptic element, 303
elliptic function, 152
elliptic integral, 174
elliptic regulator, 106
equivalent eigenform, 280
equivalent ideals, 126
Euler product, 196, 202, 255, 282,
289, 295
first degree, 197
second degree, 199
even total winding number, 170
extension of Dirichlet character,
213
fairly bad prime, 108
Fermat, 52, 110, 112
Fermat's Last Theorem, xii, 3, 18,
50, 54, 58, 81, 397, 399
Fermat's method of descent, 80
Fermat's problem for Mersenne,
55, 119
Fibonacci, 52
first degree Euler product, 197
flex, 35
Fourier inversion formula, 200,
210
Fourier transform, 200, 210
Frey-Serre-Ribet, xii, 18, 399
Frobenius morphism, 360, 384
qth power, 360
function element, 166
function field, 312, 319, 342,
346
of dimension 1, 349
functional equation, 209, 216,
240, 270, 289, 386, 387, 388
fundamental domain, 228, 260
fundamental parallelogram, 152
Gauss, 185, 402
Gauss sum, 214, 387
genus, 272, 317, 361, 395
global minimal, 290
good prime, 108
Hasse Principle, 5, 15, 16
Hasse's Theorem, 16, 296
Hasse-Minkowski Theorem, 5
Hasse-Weil Conjecture, 18, 387
Hasse-Weil L function, 403
Hecke operator, 244, 275, 320,
321, 323, 372
Hecke subgroup, 256
Hecke's Theorem, 249, 279
Hecke-Petersson Theorem, 255, 282
height, 95, 97
Hessian matrix, 30
Hilbert Basis Theorem, 342
Hilbert Nullstellensatz, 342
holomorphic at oo, 225
holomorphic at the cusp, 261, 263
holomorphic differential, 312
homogeneous, 243, 273
homogeneous ideal, 344
homogeneous polynomial, 22
homology, 317, 321, 392
homomorphism, 368
ideal class group, 127
ideal of variety, 342, 344
identity, elliptic curve, 11
Igusa, 374, 390
inflection point, 35
integral, elliptic, 174
intersection multiplicity, 32
invariant differential, 314
inversion, 269, 285
inversion problem, 165
irreducible, 342, 345
isogenous, 365, 369
isogeny, 164, 309, 363, 369
dual, 309, 369
isomorphic elliptic curves, 63
isomorphic over, 349
isomorphic varieties, 349
isomorphism, 368
j-invariant, 65, 226
Jacobi Inversion Theorem, 319
Jacobian variety, 310, 318, 361,
369, 370
L function, xi, 17, 201, 238,
267, 295
automorphic, xi, 208
Dirichlet, xii, 201
motivic, xi, 207
Langlands program, xiii
lattice, 243, 318
Legendre symbol, 272, 298, 393
Legendre's Theorem, 5
level, 261, 333, 390
Lie group, 376
line, 19, 25
at infinity, 19
same, 19
tangent, 27
linear system, 316, 361
Liouville Theorems, 152, 153
local coordinates, affine, 21,
345
local L factor, 294
local ring, 343, 346
Lutz-Nagell Theorem, xi, 15, 130,
144
Mazur's Theorem, 15
Mellin transform, 238
meromorphic differential, 312
Mestre-Oesterle, 396
method of descent, 80
method of Diophantus, 6, 10, 115, 119
minimal Weierstrass equation, 290
at prime p, 290
global, 290
Minkowski, 5, 104
model, 333, 341, 349, 357
modular elliptic curve, 221
modular form, 224, 225, 261
unrestricted, 224, 261
modular pair, 273
modular parametrization, 310, 390
modular polynomial, 335
Mordell's Theorem, xi, 14, 80,
95, 402
Mordell-Weil Theorem, xi, 80, 95,
402
morphism, 348, 356
dominant, 349
Frobenius, 360, 384
motivic L function, xi, 207
multiplicative, 196
strictly, 197
Multiplicity One Theorem, 283
multiplicity, intersection, 32
Nagell, xi, 15, 130, 144
naive height, 95
negative, 12
newform, 283
Newton, 10
node, 77
Noetherian, 342
nondegenerate, 68
nonsingular, 25, 27, 58, 161,
343, 347
nonsplit case, 79
norm, 123
Nullstellensatz, 342
number field, 122
oldform, 283
order, 153
p-adic filtration, 138
p-adic norm, 134
p-integral, 135
p-reduced, 135
parabolic element, 303
parallelogram
fundamental, 152
period, 152
period, 151, 184
period lattice, 153
period parallelogram, 152
Petersson inner product, 242,
252, 280
plane curve, 24
Poincare, 13, 67
point
nonsingular, 25, 343, 347
of inflection, 35
singular, 25, 77
points, 19, 25, 342, 344
at infinity, 19
Poisson Summation Formula, 211
pole, 360
prescribed torsion, 145
primitive, 213
Principal Axis Theorem, 37
principal Dirichlet character,
201
principal congruence subgroup,
250, 256
principal divisor, 316, 360
product
convergent, 195
of ideals, 126
of varieties, 347
projective
algebraic set, 344
closure, 345
curve, 345
n-space, 344
plane, 19
plane curve, 24
problem, 3
transformation, 20, 345
variety, 345
purely inseparable, 359
Pythagorean triple, 7
q-expansion, 225, 263, 265
quadratic residue symbol, 272,
298, 393
ramified, 359
rank, 15, 102, 107
rational map, 347
rational points, 19, 25, 342,
344
reciprocal roots, 198
reduction modulo p, 130, 134,
384
regular, 344, 346, 348
regulator, 106
representation theory, xiii, 407
result of rearranging, 167
resultant, 44
Ribet, xii, 18, 398, 399
Riemann-Roch Theorem, 317, 361
Riemann surface, 170, 311, 312,
316
Riemann zeta function, 189, 194
same curve, 24
same line, 19
Schwartz function, 211
second degree Euler product, 199
Segre embedding, 347
Selmer's example, 8
separable, 359
Serre, xii, 18, 392, 399
Shimura, xii, 302, 374, 389, 390,
390, 404
Shimura-Taniyama, 373
singular, 12, 25, 27, 58, 77
smooth curve, 27
split case, 79
square root, 169
strictly multiplicative, 197
strong Weil curve, 392
subvariety, 368
summation by parts, 192
Swinnerton-Dyer, 17, 208, 386
tangent case, 75
tangent line, 27
Taniyama, 373
Taniyama-Weil Conjecture, xii,
18, 208, 221, 390, 399
Tate normal form, 147
theta function, 209, 216
torsion subgroup, 15, 130
prescribed, 145
torus, complex, 160, 183, 370
total winding number, 170
trace, 123, 396
translation, 364
twist, 308, 393
ultrametric inequality, 134
uniformizer, 350
unique factorization, 92, 127
unit, 92, 124
unramified, 359
unrestricted modular form, 224, 261
valuation, 351
vanishes at the cusp, 263
variety
abelian, 367
affine, 342
projective, 345
very bad prime, 108
Wedderburn's Theorem, 375
Weierstrass
Double Series Theorem, 158
equation, 362
minimal, 290
form, 13, 42, 56, 57, 401
℘ function, 153
weight, 63, 224, 261, 333
Weil, xii, 18, 208, 221, 387,
390, 399
Weil conjectures, 403
Weil Converse Theorem, 388
Weil curve, 390
strong, 392
width, 263
winding number, total, 170
zero, 360
zeta function, xi, 189, 194, 295
MATHEMATICAL NOTES
Edited by Luis A. Caffarelli, John N. Mather, and Elias M. Stein
22. On Uniformization of Complex Manifolds: The Role of Connections, by R. C.
Gunning
23. Introduction to Harmonic Analysis on Reductive P-adic Groups, by Allan J.
Silberger
24. Lectures on Pseudo-Differential Operators: Regularity Theorems and Applications to
Non-Elliptic Problems, by Alexander Nagel and E. M. Stein
25. Non-Abelian Minimal Closed Ideals of Transitive Lie Algebras, by J. F. Conn
26. Exact Sequences in the Algebraic Theory of Surgery, by Andrew Ranicki
27. Existence and Regularity of Minimal Surfaces on Riemannian Manifolds, by Jon T.
Pitts
28. Hardy Spaces on Homogeneous Groups, by G. B. Folland and E. M. Stein
29. Lectures on Exponential Decay of Solutions of Second-Order Elliptic Equations:
Bounds on Eigenfunctions of N-Body Schrodinger Operators, by Shmuel Agmon
30. Formal Knot Theory, by Louis H. Kauffman
31. Cohomology of Quotients in Symplectic and Algebraic Geometry, by Frances
Clare Kirwan
32. Predicative Arithmetic, by Edward Nelson
33. Lectures on Counterexamples in Several Complex Variables, by John Erik Fornæss
and Berit Stensønes
34. Lie Groups, Lie Algebras, and Cohomology, by Anthony W. Knapp
35. Plateau's Problem and the Calculus of Variations, by Michael Struwe
36. Casson's Invariant for Oriented Homology 3-Spheres: An Exposition, by Selman
Akbulut and John D. McCarthy
37. Analytic Pseudodifferential Operators for the Heisenberg Group and Local Solvability,
by Daryl Geller
38. Several Complex Variables: Proceedings of the Mittag-Leffler Institute, 1987-1988,
edited by John E. Fornæss
39. D-modules and Spherical Representations, by Frederic V. Bien
40. Elliptic Curves, by Anthony W. Knapp