Author: Kolchin V.F.
Tags: graph theory mathematical modeling combinatorics mathematical statistics mathematical mathematics
ISBN: 0-521-44081-5
Year: 1999
Random Graphs
The book is devoted to the study of classical combinatorial structures, such as
random graphs, permutations, and systems of random linear equations in finite fields.
The author shows how the application of the generalized scheme of allocation
in the study of random graphs and permutations reduces the combinatorial
problems to classical problems of probability theory on the summation of independent
random variables. He concentrates on recent research by Russian mathematicians,
including a discussion of equations containing an unknown permutation. This is the
first English-language presentation of techniques for analyzing systems of random
linear equations in finite fields.
These results will interest specialists in combinatorics and probability theory
and will also be useful in applied areas of probabilistic combinatorics, such as
communication theory, cryptology, and mathematical genetics.
V. F. Kolchin is a leading researcher at the Steklov Institute and a professor at the
Moscow Institute of Electronics and Mathematics (MIEM). He has written four
books and many papers in the area of probabilistic combinatorics. His papers have
been published mainly in the Russian journals Theory of Probability and Its
Applications, Mathematical Notes, and Discrete Mathematics, and in the international
journal Random Structures and Algorithms.
ENCYCLOPEDIA OF MATHEMATICS AND ITS APPLICATIONS
EDITED BY G.-C. ROTA
Editorial Board
R. Doran, M. Ismail, T.-Y. Lam, E. Lutwak
Volume 53
Random Graphs
6 H. Minc Permanents
18 H. O. Fattorini The Cauchy Problem
19 G. G. Lorentz, K. Jetter, and S. D. Riemenschneider Birkhoff Interpolation
22 J. R. Bastida Field Extensions and Galois Theory
23 J. R. Cannon The One-Dimensional Heat Equation
24 S. Wagon The Banach-Tarski Paradox
25 A. Salomaa Computation and Automata
26 N. White (ed.) Theory of Matroids
27 N. H. Bingham, C. M. Goldie, and J. L. Teugels Regular Variation
28 P. P. Petrushev and V. A. Popov Rational Approximation of Real Functions
29 N. White (ed.) Combinatorial Geometries
30 M. Pohst and H. Zassenhaus Algorithmic Algebraic Number Theory
31 J. Aczel and J. Dhombres Functional Equations in Several Variables
32 M. Kuczma, B. Choczewski, and R. Ger Iterative Functional Equations
33 R. V. Ambartzumian Factorization Calculus and Geometric Probability
34 G. Gripenberg, S.-O. Londen, and O. Staffans Volterra Integral and Functional Equations
35 G. Gasper and M. Rahman Basic Hypergeometric Series
36 E. Torgersen Comparison of Statistical Experiments
37 A. Neumaier Interval Methods for Systems of Equations
38 N. Korneichuk Exact Constants in Approximation Theory
39 R. A. Brualdi and H. J. Ryser Combinatorial Matrix Theory
40 N. White (ed.) Matroid Applications
41 S. Sakai Operator Algebras in Dynamical Systems
42 W. Hodges Basic Model Theory
43 H. Stahl and V. Totik General Orthogonal Polynomials
44 R. Schneider Convex Bodies
45 G. Da Prato and J. Zabczyk Stochastic Equations in Infinite Dimensions
46 A. Bjorner, M. Las Vergnas, B. Sturmfels, N. White, and G. Ziegler Oriented Matroids
47 G. A. Edgar and L. Sucheston Stopping Times and Directed Processes
48 C. Sims Computation with Finitely Presented Groups
49 T. Palmer Banach Algebras and the General Theory of *-Algebras
50 F. Borceux Handbook of Categorical Algebra I
51 F. Borceux Handbook of Categorical Algebra II
52 F. Borceux Handbook of Categorical Algebra III
ENCYCLOPEDIA OF MATHEMATICS AND ITS APPLICATIONS
Random Graphs
V. F. KOLCHIN
Steklov Mathematical Institute, Moscow
Cambridge
UNIVERSITY PRESS
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK http://www.cup.cam.ac.uk
40 West 20th Street, New York, NY 10011-4211, USA http://www.cup.org
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
© Cambridge University Press 1999
This book is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without
the written permission of Cambridge University Press.
First published 1999
Printed in the United States of America
Typeface Times Roman 10/13 pt. System LaTeX [RW]
A catalog record of this book is available from
the British Library
Library of Congress Cataloging in Publication data
Kolchin, V. F. (Valentin Fedorovich)
Random graphs / V. F. Kolchin
p. cm. - (Encyclopedia of mathematics and its applications; v. 53)
Includes bibliographical references and index.
ISBN 0 521 44081 5 hardback
1. Random graphs. I. Title. II. Series.
QA166.17.K65 1999 98-24390
511'.5-dc20 CIP
CONTENTS
Preface
1 The generalized scheme of allocation and the components of random graphs
1.1 The probabilistic approach to enumerative combinatorial problems
1.2 The generalized scheme of allocation
1.3 Connectivity of graphs and the generalized scheme
1.4 Forests of nonrooted trees
1.5 Trees of given sizes in a random forest
1.6 Maximum size of trees in a random forest
1.7 Graphs with unicyclic components
1.8 Graphs with components of two types
1.9 Notes and references
2 Evolution of random graphs
2.1 Subcritical graphs
2.2 Critical graphs
2.3 Random graphs with independent edges
2.4 Nonequiprobable graphs
2.5 Notes and references
3 Systems of random linear equations in GF(2)
3.1 Rank of a matrix and critical sets
3.2 Matrices with independent elements
3.3 Rank of sparse matrices
3.4 Cycles and consistency of systems of random equations
3.5 Hypercycles and consistency of systems of random equations
3.6 Reconstructing the true solution
3.7 Notes and references
4 Random permutations
4.1 Random permutations and the generalized scheme of allocation
4.2 The number of cycles
4.3 Permutations with restrictions on cycle lengths
4.4 Notes and references
5 Equations containing an unknown permutation
5.1 A quadratic equation
5.2 Equations of prime degree
5.3 Equations of compound degree
5.4 Notes and references
Bibliography
Index
PREFACE
Combinatorics played an important role in the development of probability theory
and the two have continued to be closely related. Now probability theory, by
offering new approaches to problems of discrete mathematics, is beginning to
repay its debt to combinatorics. Among these new approaches, the methods of
asymptotic analysis, which have been well developed in probability theory, can be
used to solve certain complicated combinatorial problems.
If the uniform distribution is defined on the set of combinatorial structures in
question, then the numerical characteristics of the structures can be regarded as
random variables and analyzed by probabilistic methods. By using the probabilistic
approach, we restrict our attention to "typical" structures that constitute the bulk
of the set, excluding the small fraction with exceptional properties.
The probabilistic approach that is now widely used in combinatorics was first
formulated by V. L. Goncharov, who applied it to $S_n$, the set of all permutations
of degree $n$, and to the runs in random (0,1)-sequences. S. N. Bernstein,
N. V. Smirnov, and V. E. Stepanov were among those who developed probabilistic
combinatorics in Russia, building on the famous Russian school of probability
founded by A. A. Markov, P. L. Lyapunov, A. Ya. Khinchin, and A. N. Kolmogorov.
This book is based on results obtained primarily by Russian mathematicians and
presents results on random graphs, systems of random linear equations in GF(2),
random permutations, and some simple equations involving permutations.
Selecting material for the book was a difficult job. Of course, this book is not
a complete treatment of the topics mentioned. Some results (and their proofs) did
not seem ready for inclusion in a book, and there may be relevant results that have
escaped the author's attention.
There is a large body of literature on random graphs, and it is not possible to
review it here. Among the probabilistic tools that have been used to analyze random
structures are the method of moments, Poisson and Gaussian approximations,
generating functions using the saddle-point method, Tauberian-type theorems, analysis
of singularities, and martingale theory. In the past two decades, a method called
the generalized scheme of allocation has been widely used in probabilistic
combinatorics. It is so named because of its connection with the problem of assigning
$n$ objects randomly to $N$ cells. Let $\eta_1, \ldots, \eta_N$ be random variables that are, for
example, the sizes of components of a graph. If there are independent random
variables $\xi_1, \ldots, \xi_N$ so that the joint distribution of $\eta_1, \ldots, \eta_N$ for any integers
$k_1, \ldots, k_N$ can be written as
\[ P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = P\{\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\}, \]
where $n$ is a positive integer, then we say that $\eta_1, \ldots, \eta_N$ satisfy the generalized
scheme of allocation with parameters $n$ and $N$ and independent random variables
$\xi_1, \ldots, \xi_N$.
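The classical case behind the name can be checked directly. The sketch below assumes the standard example in which $n$ objects fall independently and uniformly into $N$ cells, so that the cell contents have a multinomial distribution, and takes the independent variables to be Poisson with an arbitrary parameter; the function names and the choice of Poisson summands are illustrative assumptions, not taken from the text.

```python
from itertools import product
from math import exp, factorial, prod


def multinomial_prob(ks, n, N):
    """P{eta_1 = k_1, ..., eta_N = k_N} for n objects dropped uniformly into N cells."""
    coef = factorial(n)
    for k in ks:
        coef //= factorial(k)  # exact: partial quotients of a multinomial coefficient
    return coef / N**n


def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)


def conditioned_poisson_prob(ks, n, N, lam):
    """P{xi_1 = k_1, ..., xi_N = k_N | xi_1 + ... + xi_N = n} for i.i.d. Poisson(lam)."""
    joint = prod(poisson_pmf(k, lam) for k in ks)
    total = poisson_pmf(n, N * lam)  # a sum of N independent Poisson(lam) is Poisson(N*lam)
    return joint / total


def max_discrepancy(n, N, lam):
    """Largest gap between the two sides over all (k_1, ..., k_N) with sum n."""
    worst = 0.0
    for ks in product(range(n + 1), repeat=N):
        if sum(ks) == n:
            gap = abs(multinomial_prob(ks, n, N) - conditioned_poisson_prob(ks, n, N, lam))
            worst = max(worst, gap)
    return worst
```

The discrepancy is at rounding level for any positive `lam`, which reflects the fact that the parameter of the auxiliary variables cancels in the conditional probability.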
Graph evolution is the random process of sequentially adding new edges to a
graph. For many classes of random graphs with $n$ labeled vertices and $T$ edges, the
parameter $\theta = 2T/n$ plays the role of time in the process; various graph properties
often change abruptly at the critical point $\theta = 1$. Graph evolution is the most
fascinating object in the theory of random graphs, and it appears that it is well
suited to the generalized scheme. We will show that applying generalized schemes
makes it possible to analyze random graphs at different stages of their evolution
and to obtain limit distributions in those cases in which only properties similar to
the law of large numbers have been proved.
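The abrupt change near $\theta = 1$ can be seen in a small simulation. This is only a sketch under assumptions of our own: edges are drawn uniformly with replacement (loops are redrawn, repeated edges are allowed), components are tracked with a union-find structure, and the name `largest_component` is hypothetical.

```python
import random


def largest_component(n, theta, seed=0):
    """Size of the largest component after T = theta * n / 2 random edges on n vertices."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(v):
        # Path halving keeps the trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = int(theta * n / 2)
    added = 0
    while added < edges:
        u, v = rng.randrange(n), rng.randrange(n)
        if u == v:
            continue  # redraw loops
        added += 1
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(v)] for v in range(n))
```

Comparing, say, `largest_component(2000, 0.5)` with `largest_component(2000, 2.0)` shows a small largest component below the critical point and one containing a large share of the vertices above it.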
The theory of random equations in finite fields is shared by probability,
combinatorics, and algebra. In this book, we will consider systems of linear equations in
GF(2) with random coefficients. The matrix of such a system corresponds to a
random graph or hypergraph; therefore, results on random graphs help to study these
systems. We are sure that this application alone justifies developing the theory of
random graphs.
The theory of random permutations is a well-developed branch of probabilistic
combinatorics. Although Goncharov has investigated the cycle structure of a
random permutation in great detail, there is still great interest in this area. We will
fully describe the asymptotic behavior of $P\{\nu_n = k\}$ for the total number $\nu_n$ of
cycles in a random permutation for all possible behaviors of the parameters $n$ and
$k = k(n)$ as $n \to \infty$. We will also give some of the asymptotic results for the
number of solutions of the equation $X^d = e$, where $X \in S_n$ is unknown, $d$ is a
fixed positive integer, and $e$ is the identity of the group $S_n$.
Although the generalized scheme of allocation cannot be applied to
nonequiprobable graphs, we present some results in this situation by using the method
of moments. The statistical applications of nonequiprobable graphs call for the
development of regular methods of analyzing these structures.
The book consists of five chapters. Chapter 1 describes the generalized scheme
of allocation and its applications to a random forest of nonrooted trees, a random
graph consisting of unicyclic components, and a random graph with a mixture of
trees and unicyclic components. In Chapter 2, these results are applied to the study
of the evolution of random graphs. Chapter 3 is devoted to systems of random linear
equations in GF(2). Much of this branch of probabilistic combinatorics is the work
of Russian mathematicians; this is the first English-language presentation of many
of the results. Random permutations are considered in Chapter 4, and Chapter 5
contains some results on permutation equations of the form $X^d = e$.
Most results presented in this book derive from work done over the past fifteen
years; notes and references can be found in the last section of each chapter. (It is,
of course, impossible to give a complete list in each particular area.) In addition to
articles used in the text, the summary sections of all chapters include references to
papers on related topics, especially those in which the same results were obtained
by other methods.
We assume that the reader is familiar with basic combinatorics. This book
should be accessible to those who have completed standard courses of mathematical
analysis and probability theory. Section 1.1 includes a list of pertinent results
from probability.
This book continues in the tradition of Random Mappings [78] and differs from
other treatments of random graphs in the systematic use of the generalized scheme
of allocation. We hope that the chapter on systems of random linear equations in
GF(2) will be of interest to a broad audience. I wish to express my sincere
appreciation to G.-C. Rota, who encouraged me to write this book for the Encyclopedia
of Mathematics series, even though there are already several excellent books on
random graphs.
My greatest concern is writing the book in English. I am indebted to the editors
who have brought the text to an acceptable form. It is apparent that no amount of
editing can erase the heavy Russian accent of my written English, so my special
thanks go to those readers who will not be deterred by the language of the book.
I greatly appreciate the support I received from my colleagues at the Steklov
Mathematical Institute while I wrote this book.
1
The generalized scheme of allocation
and the components of
random graphs
1.1. The probabilistic approach to enumerative
combinatorial problems
The solution to enumerative combinatorial problems consists in finding an exact
or approximate expression for the number of combinatorial objects possessing the
property under investigation. In this book, the probabilistic approach to
enumerative combinatorial problems is adopted.
The fundamental notion of probability theory is the probability space $(\Omega, \mathcal{A}, P)$,
where $\Omega$ is a set of arbitrary elements, $\mathcal{A}$ is a set of subsets of $\Omega$ forming a
$\sigma$-algebra of events with the operations of union and intersection of sets, and $P$ is
a nonnegative countably additive function defined for each event $A \in \mathcal{A}$ so that
$P(\Omega) = 1$. The set $\Omega$ is called the space of elementary events and $P$ is a probability.
A random variable is a real-valued measurable function $\xi = \xi(\omega)$ defined for all
$\omega \in \Omega$.
Suppose $\Omega$ consists of finitely many elements. Then the probability $P$ is defined
on all subsets of $\Omega$ if it is defined for each elementary event $\omega \in \Omega$. In this case, any
real-valued function $\xi = \xi(\omega)$ on such a space of elementary events is a random
variable.
Instead of a real-valued function, one may consider a function $f(\omega)$ taking
values from some set $Y$ of arbitrary elements. Such a function $f(\omega)$ may be
considered a generalization of a random variable and is called a random element of
the set $Y$.
In studying combinatorial objects, we consider probability spaces that have a
natural combinatorial interpretation: For the space of elementary events $\Omega$, we take
the set of combinatorial objects under investigation and assign the same probability
to all the elements of the set. In this case, numerical characteristics of combinatorial
objects of $\Omega$ become random variables. The term "random element of the set $\Omega$"
is usually used for the identity function $f(\omega) = \omega$, $\omega \in \Omega$, mapping each element
of the set of combinatorial objects into itself. Since the uniform distribution is
assumed on $\Omega$, the probability that the identity function $f$ takes any fixed value
$\omega$ is the same for all $\omega \in \Omega$. Hence the notion of a random combinatorial object
of $\Omega$, such as the identity function $f(\omega) = \omega$, agrees with the usual notion of a
random element of a set as an element sampled from all elements of the set with
equal probabilities.
Note that a random combinatorial object with the same distribution could also
be defined on larger probability spaces. For our purposes, however, the natural
construction presented here is sufficient for the most part. The exceptions are
those few cases that involve several independent random combinatorial objects
and in which it would be necessary to resort to a richer probability space, such as
the direct product of the natural probability spaces.
Since we use probability spaces with uniform distributions, in spite of the
probabilistic terminology, the problems considered are in essence enumeration problems
of combinatorial analysis. The probabilistic approach furnishes a convenient form
of representation and helps us effectively use the methods of asymptotic analysis
that have been well developed in the theory of probability.
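As a small illustration of this point of view, the sketch below takes $\Omega = S_n$ with the uniform distribution and treats the number of cycles of a permutation as a random variable; its distribution is obtained purely by enumeration. The helper names are hypothetical and the brute-force enumeration is only feasible for small $n$.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple with perm[i] the image of i."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = perm[v]
    return cycles


def cycle_count_distribution(n):
    """Exact distribution of the number of cycles under the uniform distribution on S_n."""
    counts = Counter(cycle_count(p) for p in permutations(range(n)))
    total = sum(counts.values())
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}
```

For $n = 4$ the counts are 6, 11, 6, 1 out of 24 permutations, so every probability here is just a ratio of two enumeration results.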
Thus, in the probabilistic approach, numerical characteristics of a random
combinatorial object are random variables. The main characteristic of a random variable
$\xi$ is its distribution function $F(x)$ defined for any real $x$ as the probability of the
event $\{\xi \le x\}$, that is,
\[ F(x) = P\{\xi \le x\}. \]
The distribution function $F(x)$ defines a probability distribution on the real line
called the distribution of the random variable $\xi$. With respect to this distribution,
given a function $g(x)$, the Lebesgue-Stieltjes integral
\[ \int_{-\infty}^{\infty} g(x)\, dF(x) \]
can be defined. The probabilistic approach has advantages in the asymptotic
investigations of combinatorial problems. As a rule, we have a sequence of random
variables $\xi_n$, $n = 1, 2, \ldots$, each of which describes a characteristic of the random
combinatorial object under consideration, and we are interested in the asymptotic
behavior of the distribution functions $F_n(x) = P\{\xi_n \le x\}$ as $n \to \infty$.
A sequence of distributions with distribution functions $F_n(x)$ converges weakly
to a distribution with the distribution function $F(x)$ if, for any bounded continuous
function $g(x)$,
\[ \int_{-\infty}^{\infty} g(x)\, dF_n(x) \to \int_{-\infty}^{\infty} g(x)\, dF(x) \]
as $n \to \infty$.
The weak convergence of distributions is directly connected with the pointwise
convergence of the distribution functions as follows.
Theorem 1.1.1. A sequence of distribution functions $F_n(x)$ converges to a
distribution function $F(x)$ at all continuity points if and only if the corresponding
sequence of distributions converges weakly to the distribution with distribution
function $F(x)$.
In a sense, the distribution, or the distribution function $F(x)$, characterizes the
random variable $\xi$. The moments of $\xi$ are simple characteristics. If
\[ \int_{-\infty}^{\infty} |x|\, dF(x) \]
exists, then
\[ E\xi = \int_{-\infty}^{\infty} x\, dF(x) \]
is called the mathematical expectation, or mean, of the random variable $\xi$. Further,
\[ m_r = E\xi^r = \int_{-\infty}^{\infty} x^r\, dF(x) \]
is called the $r$th moment, or the moment of $r$th order (if the integral of $|x|^r$ exists).
In probabilistic combinatorics, one usually considers nonnegative integer-
valued random variables. For such a random variable, the factorial moments are
natural characteristics. We denote the $r$th factorial moment by
\[ m_{(r)} = E\,\xi(\xi - 1)\cdots(\xi - r + 1). \]
If a distribution function $F(x)$ can be represented in the form
\[ F(x) = \int_{-\infty}^{x} p(u)\, du, \]
where $p(u) \ge 0$, then we say that the distribution has a density $p(u)$. In addition to
the distribution function, it is convenient to represent the distribution of an
integer-valued random variable $\xi$ by the probabilities of its individual values. For $\xi$, we
will use the notation
\[ p_k = P\{\xi = k\}, \quad k = 0, 1, \ldots, \]
and for integer-valued nonnegative random variables $\xi_n$,
\[ p_k^{(n)} = P\{\xi_n = k\}, \quad k = 0, 1, \ldots. \]
It is clear that
\[ E\xi = \sum_{k=0}^{\infty} k p_k \]
if this series converges.
It is not difficult to see that the following assertion is true.
Theorem 1.1.2. A sequence of distributions $\{p_k^{(n)}\}$, $n = 1, 2, \ldots$, converges
weakly to a distribution $\{p_k\}$ if and only if for every fixed $k = 0, 1, \ldots$,
\[ p_k^{(n)} \to p_k \]
as $n \to \infty$.
If an estimate of the probability $P\{\xi > 0\}$ is needed for a nonnegative
integer-valued random variable $\xi$, then the simple inequality
\[ P\{\xi > 0\} = \sum_{k=1}^{\infty} P\{\xi = k\} \le \sum_{k=1}^{\infty} k p_k = E\xi \tag{1.1.1} \]
can be useful. In particular, for a sequence $\xi_n$, $n = 1, 2, \ldots$, of such random
variables with $E\xi_n \to 0$ as $n \to \infty$, it follows that
\[ P\{\xi_n > 0\} \to 0. \]
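A quick exact check of the inequality, in a case chosen only for illustration: we assume $\xi$ is binomial with $n$ trials and success probability $p$, so that $P\{\xi > 0\} = 1 - (1 - p)^n$ and $E\xi = np$. The function names are hypothetical.

```python
from fractions import Fraction


def prob_positive(n, p):
    """P{xi > 0} for xi binomial with n trials and success probability p."""
    return 1 - (1 - p) ** n


def mean(n, p):
    """E xi for the same binomial variable."""
    return n * p


def bound_holds(n, p):
    """Inequality (1.1.1) for the binomial example: P{xi > 0} <= E xi."""
    return prob_positive(n, p) <= mean(n, p)
```

With $p = 1/n^2$ the mean is $1/n$, so both sides tend to zero as $n$ grows, which is the situation described in the text.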
Since it is generally easier to calculate the moments of a random variable than
the whole distribution, one wants a criterion for the convergence of a sequence of
distributions based on the corresponding moments. But, first, it should be noted
that even if a random variable $\xi$ has moments of all orders, its distribution cannot, in
general, be reconstructed on the basis of these moments, since there exist distinct
distributions that have the same sequences of moments. For example, it is not
difficult to confirm that for any $n = 0, 1, 2, \ldots$,
\[ \int_0^{\infty} x^n e^{-x^{1/4}} \sin x^{1/4}\, dx = 0. \]
Hence, for $-1 \le a \le 1$, the function
\[ p_a(x) = \frac{1}{24}\, e^{-x^{1/4}} \left(1 + a \sin x^{1/4}\right) \]
is the density of a distribution on $[0, \infty)$ whose moments do not depend on $a$.
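The vanishing of these integrals can be confirmed exactly. The sketch below relies on a derivation that is ours rather than the text's: the substitution $x = u^4$ turns the integral into $4\int_0^\infty u^{4n+3} e^{-u}\sin u\, du = 4\,\mathrm{Im}\,[(4n+3)!/(1-i)^{4n+4}]$, and $(1-i)^4 = -4$ is real. The density used is the function $\frac{1}{24} e^{-x^{1/4}}(1 + a\sin x^{1/4})$, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def gaussian_power(re, im, exponent):
    """(re + i*im)**exponent computed with exact integer arithmetic."""
    a, b = 1, 0
    for _ in range(exponent):
        a, b = a * re - b * im, a * im + b * re
    return a, b


def sine_moment(n):
    """Exact value of the integral of x^n exp(-x^(1/4)) sin(x^(1/4)) over [0, oo)."""
    a, b = gaussian_power(1, -1, 4 * n + 4)
    # Im(1 / (a + ib)) = -b / (a^2 + b^2)
    return 4 * factorial(4 * n + 3) * Fraction(-b, a * a + b * b)


def moment(n, a_param):
    """n-th moment of the density (1/24) exp(-x^(1/4)) (1 + a sin x^(1/4))."""
    base = Fraction(factorial(4 * n + 3), 6)  # (1/24) * 4 * (4n + 3)!
    return base + Fraction(a_param) * sine_moment(n) / 24
```

Since `sine_moment(n)` is zero for every $n$, `moment(n, a_param)` equals $(4n+3)!/6$ whatever the value of the parameter, and the case $n = 0$ confirms the normalization.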
Thus the distribution functions with moments of all orders are divided into two
classes: The first class contains the functions that may be uniquely reconstructed
from their moments, and the second class contains the functions that cannot be
reconstructed from their moments. There are several sufficient conditions for the
moment problem to have a unique solution. Let
\[ M_n = \int_{-\infty}^{\infty} |x|^n\, dF(x). \]
A distribution function $F(x)$ is uniquely reconstructed by the sequence $m_r$,
$r = 1, 2, \ldots$, of its moments if there exists $\lambda$ such that
\[ \frac{1}{n} M_n^{1/n} \le \lambda. \tag{1.1.2} \]
The following theorem describing the so-called method of moments is applicable
only to the first class of distribution functions.
Theorem 1.1.3. If distribution functions $F_n(x)$, $n = 1, 2, \ldots$, have the moments
of all orders and for any fixed $r = 1, 2, \ldots$,
\[ m_r^{(n)} = \int_{-\infty}^{\infty} x^r\, dF_n(x) \to m_r, \quad |m_r| < \infty, \]
as $n \to \infty$, then there exists a distribution function $F(x)$ such that for any fixed
$r = 1, 2, \ldots$,
\[ m_r = \int_{-\infty}^{\infty} x^r\, dF(x), \]
and from the sequence $F_n(x)$, $n = 1, 2, \ldots$, it is possible to select a subsequence
$F_{n_k}(x)$, $k = 1, 2, \ldots$, that converges to $F(x)$ as $k \to \infty$ at every continuity point
of $F(x)$.
If the sequence $m_r$, $r = 1, 2, \ldots$, uniquely determines the distribution function
$F(x)$, then $F_n(x) \to F(x)$ as $n \to \infty$ at every continuity point of $F(x)$.
Note that the normal (Gaussian) and Poisson distributions are uniquely
reconstructible by their moments.
To use the method of moments, it is necessary to calculate moments of random
variables. One useful method of calculating moments of integer-valued random
variables is to represent them as sums of random variables that take only the values
0 and 1.
Theorem 1.1.4. If
\[ S_n = \xi_1 + \cdots + \xi_n, \]
and the random variables $\xi_1, \ldots, \xi_n$ take only the values 0 and 1, then for any
$m = 1, 2, \ldots, n$,
\[ S_n(S_n - 1)\cdots(S_n - m + 1) = \sum \xi_{i_1} \cdots \xi_{i_m}, \]
where the summation is taken over all different ordered sets of different indices
$\{i_1, \ldots, i_m\}$, the number of which is equal to $\binom{n}{m} m!$.
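The identity of Theorem 1.1.4 holds for every assignment of the values 0 and 1, so it can be verified by brute force for small $n$. The following sketch does this; the function names are illustrative only.

```python
from itertools import permutations, product
from math import comb, factorial, prod


def falling_factorial(s, m):
    """s (s - 1) ... (s - m + 1)."""
    return prod(s - j for j in range(m))


def ordered_index_sum(xs, m):
    """Sum of xs[i_1] * ... * xs[i_m] over ordered m-tuples of distinct indices."""
    return sum(prod(xs[i] for i in idx) for idx in permutations(range(len(xs)), m))


def identity_holds(n, m):
    """Check the identity for every 0-1 vector (xi_1, ..., xi_n)."""
    return all(
        falling_factorial(sum(xs), m) == ordered_index_sum(xs, m)
        for xs in product((0, 1), repeat=n)
    )
```

Taking expectations of both sides then expresses the $m$th factorial moment of $S_n$ through the probabilities that $m$ given indicators all equal 1.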
Generating functions also provide a useful tool for solving many problems
related to distributions of nonnegative integer-valued random variables. The
complex-valued function
\[ \varphi(z) = \varphi_\xi(z) = \sum_{k=0}^{\infty} p_k z^k = E z^{\xi} \tag{1.1.3} \]
is called the generating function of the distribution of the random variable $\xi$. It
is defined at least for $|z| \le 1$. For example, for the Poisson distribution with
parameter $\lambda$, which is defined by the probabilities
\[ p_k = \frac{\lambda^k}{k!}\, e^{-\lambda}, \quad k = 0, 1, \ldots, \]
the generating function is $e^{\lambda(z - 1)}$.
Relation (1.1.3) determines a one-to-one correspondence between the
generating functions and the distributions of nonnegative integer-valued random variables,
since the distribution can be reconstructed by using the formula
\[ p_k = \frac{1}{k!}\, \varphi^{(k)}(0), \quad k = 0, 1, \ldots. \tag{1.1.4} \]
Generating functions are especially convenient for the investigation of sums of
independent random variables. If $\xi_1, \ldots, \xi_n$ are independent nonnegative
integer-valued random variables and $S_n = \xi_1 + \cdots + \xi_n$, then
\[ \varphi_{S_n}(z) = \varphi_{\xi_1}(z) \cdots \varphi_{\xi_n}(z). \]
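For distributions with finite support the generating functions are polynomials, and the product rule becomes polynomial multiplication of the coefficient lists. A minimal sketch, with hypothetical function names and a distribution represented as the list $[p_0, p_1, \ldots]$:

```python
from fractions import Fraction


def multiply(p, q):
    """Coefficients of the product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sum_distribution(pmfs):
    """Distribution of a sum of independent variables, via the product of generating functions."""
    result = [Fraction(1)]  # generating function of the constant 0
    for pmf in pmfs:
        result = multiply(result, pmf)
    return result
```

For instance, three independent fair 0-1 variables give the coefficients 1/8, 3/8, 3/8, 1/8, the binomial distribution with parameters 3 and 1/2.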
The correspondence between the generating functions and the distributions is
continuous in the following sense.
Theorem 1.1.5. Let $\{p_k^{(n)}\}$, $n = 1, 2, \ldots$, be a sequence of distributions. If for
any $k = 0, 1, \ldots$,
\[ p_k^{(n)} \to p_k \]
as $n \to \infty$, then the sequence of corresponding generating functions $\varphi_n(z)$,
$n = 1, 2, \ldots$, converges to the generating function of the sequence $\{p_k\}$ uniformly in
any circle $|z| \le r < 1$.
In particular, if $\{p_k\}$ is a distribution, then the sequence of corresponding
generating functions converges to the generating function $\varphi(z)$ of the distribution $\{p_k\}$
uniformly in any circle $|z| \le r < 1$.
Theorem 1.1.6. If the sequence of generating functions $\varphi_n(z)$, $n = 1, 2, \ldots$, of
the distributions $\{p_k^{(n)}\}$ converges to a generating function $\varphi(z)$ of a distribution
$\{p_k\}$ on a set $M$ that has a limit point inside of the circle $|z| < 1$, then the
distributions $\{p_k^{(n)}\}$ converge weakly to the distribution $\{p_k\}$.
Since a generating function $\varphi(z) = \sum_{k=0}^{\infty} p_k z^k$ is analytic, its coefficients can
be represented by the Cauchy formula
\[ p_n = \frac{1}{n!}\, \varphi^{(n)}(0) = \frac{1}{2\pi i} \oint_C \frac{\varphi(z)\, dz}{z^{n+1}}, \]
where the integral is over a contour $C$ that lies inside the domain of analyticity of
$\varphi(z)$ and contains the point $z = 0$.
Thus, if we are interested in the behavior of $p_n$ as $n \to \infty$, then we have to be
able to estimate contour integrals of the form
\[ \frac{1}{2\pi i} \int_C g(z)\, e^{\lambda f(z)}\, dz, \]
where $g(z)$ and $f(z)$ are analytic in the neighborhood of the curve of integration
$C$ and $\lambda$ is a real parameter tending to infinity.
The saddle-point method is used to estimate such integrals. The contour of
integration $C$ may be chosen in different ways. The saddle-point method requires
choosing the contour $C$ in such a way that it passes through the point $z_0$, which is
a root of the equation $f'(z) = 0$. Such a point is called the saddle point, since the
function $\operatorname{Re} f(z)$ has a graph similar to a saddle or mountain pass. The saddle-point
method requires choosing the contour of integration such that it crosses the saddle
point $z_0$ in the direction of the steepest descent. However, finding such a contour
and applying it are complicated problems, so for the sake of simplicity one usually
does not choose the best contour, hence losing some accuracy in the remainder
term when estimating the integral.
A parametric representation of the contour transforms the contour integral to
an integral with a real variable of integration. Therefore the following theorem
on estimating integrals with increasing parameters, based on Laplace's method,
sometimes provides an answer to the initial question on estimating integrals.
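Theorem 1.1.7. If the integral
\[ G(\lambda) = \int_{-\infty}^{\infty} g(t)\, e^{\lambda f(t)}\, dt \]
converges absolutely for some $\lambda = \lambda_0$, that is,
\[ \int_{-\infty}^{\infty} |g(t)|\, e^{\lambda_0 f(t)}\, dt \le M; \]
if the function $f(t)$ attains its maximum at a point $t_0$ and in a neighborhood of
this point
\[ f(t) = f(t_0) + a_2 (t - t_0)^2 + a_3 (t - t_0)^3 + \cdots \]
with $a_2 < 0$;
if for an arbitrary small $\delta > 0$, there exists $h = h(\delta) > 0$ such that
\[ f(t_0) - f(t) \ge h \]
for $|t - t_0| \ge \delta$;
and if as $t \to t_0$,
\[ g(t) = c\,(t - t_0)^{2m} \left(1 + O(|t - t_0|)\right), \]
where $c$ is a nonzero constant and $m$ is a nonnegative integer, then, as $\lambda \to \infty$,
\[ G(\lambda) = e^{\lambda f(t_0)}\, \lambda^{-m - 1/2}\, c\, c_1^{2m+1}\, \Gamma(m + 1/2) \left(1 + O(1/\sqrt{\lambda})\right), \]
where $\Gamma(x)$ is the Euler gamma function and
\[ c_1 = \frac{1}{\sqrt{-a_2}}. \]
In particular, if $m = 0$, then $c = g(t_0)$, and as $\lambda \to \infty$,
\[ G(\lambda) = e^{\lambda f(t_0)}\, g(t_0) \sqrt{\frac{\pi}{-a_2 \lambda}} \left(1 + O(1/\sqrt{\lambda})\right). \tag{1.1.5} \]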
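To demonstrate that this rather complicated theorem can really be used, let us
estimate the integral
\[ \Gamma(\lambda + 1) = \int_0^{\infty} x^{\lambda} e^{-x}\, dx \]
as $\lambda \to \infty$, and obtain the Stirling formula. The change of variables $x = \lambda t$ leads
to the equation
\[ \Gamma(\lambda + 1) = \lambda^{\lambda + 1} e^{-\lambda} \int_0^{\infty} e^{-\lambda (t - 1 - \log t)}\, dt. \]
Here $g(t) = 1$, and $f(t) = -(t - 1 - \log t)$, $f(1) = 0$, $f'(1) = 0$, $f''(1) = -1$.
The conditions of the theorem are fulfilled; therefore, by (1.1.5),
\[ G(\lambda) = \int_0^{\infty} e^{\lambda f(t)}\, dt = \sqrt{2\pi/\lambda} \left(1 + O(1/\sqrt{\lambda})\right), \]
and for the Euler gamma function, we obtain the representation
\[ \Gamma(\lambda + 1) = \lambda^{\lambda + 1/2} e^{-\lambda} \sqrt{2\pi} \left(1 + O(1/\sqrt{\lambda})\right) \]
as $\lambda \to \infty$, coinciding with the Stirling formula, except for the remainder term,
which can be improved to $O(1/\lambda)$.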
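The size of the remainder can be examined numerically. The sketch below compares $\log \Gamma(\lambda + 1)$, taken from the standard library, with the logarithm of the Stirling approximation $\lambda^{\lambda + 1/2} e^{-\lambda}\sqrt{2\pi}$; working with logarithms is our choice to avoid overflow, and the function names are hypothetical.

```python
from math import exp, lgamma, log, pi


def stirling_log(lam):
    """Logarithm of lam^(lam + 1/2) * exp(-lam) * sqrt(2 * pi)."""
    return (lam + 0.5) * log(lam) - lam + 0.5 * log(2 * pi)


def relative_error(lam):
    """Gamma(lam + 1) divided by the Stirling approximation, minus 1."""
    return exp(lgamma(lam + 1) - stirling_log(lam)) - 1
```

For $\lambda = 10, 100, 1000$ the relative error is close to $1/(12\lambda)$, in line with the remark that the remainder is of order $1/\lambda$.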
Generating functions are only suited for nonnegative integer-valued random
variables. A more universal method of proving theorems on the convergence of
sequences of random variables is provided by characteristic functions. The
characteristic function of a random variable $\xi$ or the characteristic function of its
distribution is defined as
\[ \varphi(t) = \varphi_\xi(t) = E e^{it\xi} = \int_{-\infty}^{\infty} e^{itx}\, dF(x), \tag{1.1.6} \]
where $-\infty < t < \infty$ and $F(x)$ is the distribution function of $\xi$.
If the $r$th moment $m_r$ exists, then the characteristic function $\varphi(t)$ is $r$ times
differentiable, and
\[ \varphi^{(r)}(0) = i^r m_r. \]
Characteristic functions are convenient for investigating sums of independent
random variables, since if $S_n = \xi_1 + \cdots + \xi_n$, where $\xi_1, \ldots, \xi_n$ are independent
random variables, then
\[ \varphi_{S_n}(t) = \varphi_{\xi_1}(t) \cdots \varphi_{\xi_n}(t). \]
The characteristic function of the normal distribution with parameters $(m, \sigma^2)$ and
density
\[ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - m)^2/(2\sigma^2)} \]
is $e^{imt - \sigma^2 t^2/2}$.
Relation (1.1.6) defines a one-to-one correspondence between characteristic
functions and distributions. There are different inversion formulas that provide a
formal possibility of reconstructing a distribution from its characteristic function,
but they have limited practical applications. We state the simplest version of the
inversion formulas.
Theorem 1.1.8. If a characteristic function $\varphi(t)$ is absolutely integrable, then
the corresponding distribution has the bounded density
\[ p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \varphi(t)\, dt. \]
The correspondence defined by (1.1.6) is continuous in the following sense.
Theorem 1.1.9. A sequence of distributions converges weakly to a limit
distribution if and only if the corresponding sequence of characteristic functions $\varphi_n(t)$
converges to a continuous function $\varphi(t)$ as $n \to \infty$ at every fixed $t$, $-\infty < t < \infty$.
In this case, $\varphi(t)$ is the characteristic function of the limit distribution, and the
convergence $\varphi_n(t) \to \varphi(t)$ is uniform in any finite interval.
For a sequence $\xi_n$ of characteristics of random combinatorial objects, applying
Theorem 1.1.9 gives the limit distribution function. But for integer-valued
characteristics, one would rather have an indication of the local behavior, that is, the
behavior of the probabilities of individual values. To this end the so-called local
limit theorems of probability theory are used.
Let $\xi$ be an integer-valued random variable and $p_n = P\{\xi = n\}$. It is clear that
$P\{\xi \in \Gamma_1\} = 1$, where $\Gamma_1$ is the lattice of all integers. If there exists a lattice $\Gamma_d$
with a span $d$ such that $P\{\xi \in \Gamma_d\} = 1$ and there is no lattice $\Gamma$ with span greater
than $d$ such that $P\{\xi \in \Gamma\} = 1$, then $d$ is called the maximal span of the distribution
of $\xi$. The characteristic function $\varphi(t)$ of the random variable $\xi$ is periodic with
period $2\pi/d$ and $|\varphi(t)| < 1$ for $0 < t < 2\pi/d$.
For integer-valued random variables, the inversion formula has the following
form:
\[ p_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itn} \varphi(t)\, dt. \]
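The inversion formula can be tried out numerically. In the sketch below the integral over $[-\pi, \pi]$ is replaced by an equally spaced Riemann sum, which is our numerical choice and not part of the text; for a distribution concentrated on $0, 1, \ldots, 6$ and 64 nodes the sum reproduces the probabilities up to rounding. The binomial example and the function names are assumptions made for illustration.

```python
import cmath
from math import comb, pi


def char_fn_binomial(t, trials, p):
    """Characteristic function of the binomial distribution: (1 - p + p e^{it})^trials."""
    return (1 - p + p * cmath.exp(1j * t)) ** trials


def invert(phi, n, points=64):
    """Approximate (1 / 2 pi) * integral over [-pi, pi] of e^{-itn} phi(t) dt by a Riemann sum."""
    total = 0j
    for j in range(points):
        t = -pi + 2 * pi * j / points
        total += cmath.exp(-1j * t * n) * phi(t)
    return (total / points).real


def binomial_pmf(n, trials, p):
    return comb(trials, n) * p**n * (1 - p) ** (trials - n)
```

Calling `invert(lambda t: char_fn_binomial(t, 6, 0.3), n)` for $n = 0, \ldots, 6$ returns values that agree with `binomial_pmf(n, 6, 0.3)`.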
Consider the sum $S_N = \xi_1 + \cdots + \xi_N$ of independent identically distributed
integer-valued random variables $\xi_1, \ldots, \xi_N$. When the distributions of the
summands are identical and do not depend on $N$, the problem of estimating the
probabilities $P\{S_N = n\}$, as $N \to \infty$, has been completely solved. If there exist sequences
of centering and normalizing numbers $A_N$ and $B_N$ such that the distributions of the
random variables $(S_N - A_N)/B_N$ converge weakly to some distribution, then the
limit distribution has a density. Moreover, a local limit theorem holds on the lattice
with a span equal to the maximal span of the distribution of the random variable
$\xi_1$. If the maximal span of the distribution of $\xi_1$ is 1, then the local theorem holds
on the lattice of integers.
Theorem 1.1.10. Let $\xi_1, \xi_2, \ldots$ be a sequence of independent identically
distributed integer-valued random variables and let there exist $A_N$ and $B_N$ such that,
as $N \to \infty$ for any fixed $x$,
\[ P\left\{\frac{S_N - A_N}{B_N} \le x\right\} \to \int_{-\infty}^{x} p(u)\, du. \]
Then, if the maximal span of the distribution of $\xi_1$ is 1,
\[ B_N P\{S_N = n\} - p((n - A_N)/B_N) \to 0 \]
uniformly in $n$.
Local limit theorems are of primary importance in what follows. Therefore, let
us prove a local theorem on convergence to the normal distribution as a model
for proofs of local limit theorems in more complex cases, which will be discussed
later in the book.
Theorem 1.1.11. Let the independent identically distributed integer-valued
random variables $\xi_1, \xi_2, \ldots$ have a mathematical expectation $a$ and a positive
variance $\sigma^2$. Then, if the maximal span of the distribution of $\xi_1$ is 1,
\[
\sigma\sqrt{N}\, P\{S_N = n\} - \frac{1}{\sqrt{2\pi}}\, e^{-(n - aN)^2/(2\sigma^2 N)} \to 0
\]
uniformly in $n$ as $N \to \infty$.
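Proof. Let
\[
z = \frac{n - aN}{\sigma\sqrt{N}} \quad \text{and} \quad P_N(n) = P\{\xi_1 + \cdots + \xi_N = n\}.
\]
If $\varphi(t)$ is the characteristic function of the random variable $\xi_1$, then the
characteristic function of the sum $S_N = \xi_1 + \cdots + \xi_N$ is equal to $\varphi^N(t)$, and
\[
\varphi^N(t) = \sum_{n=-\infty}^{\infty} P_N(n)\, e^{itn}.
\]
By the inversion formula,
\[
P_N(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itn} \varphi^N(t)\,dt. \tag{1.1.7}
\]
Let $\varphi^*(t)$ denote the characteristic function of the centered random variable $\xi_1 - a$,
which equals $\varphi(t) \exp\{-ita\}$. Since $n = aN + \sigma z\sqrt{N}$, it follows from (1.1.7) that
\[
P_N(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-it\sigma z\sqrt{N}} (\varphi^*(t))^N\,dt.
\]
After the substitution $x = t\sigma\sqrt{N}$, this equality takes the form
\[
\sigma\sqrt{N}\, P_N(n) = \frac{1}{2\pi} \int_{-\pi\sigma\sqrt{N}}^{\pi\sigma\sqrt{N}} e^{-ixz} (\varphi^*(x/(\sigma\sqrt{N})))^N\,dx. \tag{1.1.8}
\]
By the inversion formula,
\[
\frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ixz - x^2/2}\,dx. \tag{1.1.9}
\]
It follows from (1.1.8) and (1.1.9) that the difference
\[
R_N = 2\pi \left( \sigma\sqrt{N}\, P_N(n) - \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \right) \tag{1.1.10}
\]
can be written as the sum of the following four integrals:
\[
I_1 = \int_{-A}^{A} e^{-ixz} \left[ (\varphi^*(x/(\sigma\sqrt{N})))^N - e^{-x^2/2} \right] dx,
\]
\[
I_2 = -\int_{A < |x|} e^{-ixz - x^2/2}\,dx,
\]
\[
I_3 = \int_{A < |x| < \varepsilon\sigma\sqrt{N}} e^{-ixz} (\varphi^*(x/(\sigma\sqrt{N})))^N\,dx,
\]
\[
I_4 = \int_{\varepsilon\sigma\sqrt{N} < |x| < \pi\sigma\sqrt{N}} e^{-ixz} (\varphi^*(x/(\sigma\sqrt{N})))^N\,dx,
\]
where the constants $A$ and $\varepsilon$ will be chosen later.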
To see that $R_N \to 0$ as $N \to \infty$, we take an arbitrary $\delta > 0$ and show that
$|R_N|$ can be made less than $\delta$ for sufficiently large $N$.
For $I_2$, we have
\[
|I_2| \le \int_{A < |x|} e^{-x^2/2}\,dx,
\]
and $|I_2|$ can be made arbitrarily small by the choice of sufficiently large $A$.
Since $E\xi_1 = a$ and $D\xi_1 = \sigma^2$, for the characteristic function $\varphi^*(t)$ as $t \to 0$,
we have
\[
\varphi^*(t) = 1 - \frac{\sigma^2 t^2}{2} + o(t^2). \tag{1.1.11}
\]
Let $\varphi_N(x)$ denote the characteristic function of $(S_N - aN)/(\sigma\sqrt{N})$, which equals
$(\varphi^*(x/(\sigma\sqrt{N})))^N$. For any fixed $x$ and $N \to \infty$, we obtain from (1.1.11) the
relation
\[
\log \varphi_N(x) = N \log \varphi^*(x/(\sigma\sqrt{N})) = N \log\left(1 - \frac{x^2}{2N} + o\left(\frac{1}{N}\right)\right) = -\frac{x^2}{2} + o(1),
\]
implying that for any fixed $x$ as $N \to \infty$,
\[
\varphi_N(x) \to e^{-x^2/2}. \tag{1.1.12}
\]
Moreover, as seen from (1.1.11), there exists $\varepsilon > 0$ such that, for $|t| \le \varepsilon$,
\[
|\varphi^*(t)| \le 1 - \frac{\sigma^2 t^2}{4} \le e^{-\sigma^2 t^2/4}. \tag{1.1.13}
\]
Using this inequality to estimate $I_3$, we find that
\[
|I_3| \le \int_{A < |x| < \varepsilon\sigma\sqrt{N}} |\varphi^*(x/(\sigma\sqrt{N}))|^N\,dx \le \int_{A < |x| < \varepsilon\sigma\sqrt{N}} e^{-x^2/4}\,dx,
\]
and by the choice of sufficiently large $A$, $|I_3|$ can be made arbitrarily small.
Let $\varepsilon$ be such that (1.1.13) is satisfied and let $A$ be large enough so that $|I_2| < \delta/4$
and $|I_3| < \delta/4$. Let us now estimate the integrals $I_1$ and $I_4$ for fixed $\varepsilon$ and $A$.
Relation (1.1.12) implies that the distribution of $(S_N - aN)/(\sigma\sqrt{N})$ converges weakly,
as $N \to \infty$, to the normal distribution with parameters $(0, 1)$. The convergence
of the characteristic functions $\varphi_N(x)$ to the characteristic function of the normal
law is uniform in any finite interval, and the integral $I_1$ tends to zero as $N \to \infty$.
For $I_4$, we have
\[
|I_4| \le \int_{\varepsilon\sigma\sqrt{N} < |x| < \pi\sigma\sqrt{N}} |\varphi^*(x/(\sigma\sqrt{N}))|^N\,dx = \sigma\sqrt{N} \int_{\varepsilon < |t| < \pi} |\varphi(t)|^N\,dt.
\]
Since the maximal span of the distribution of $\xi_1$ is 1,
\[
\max_{\varepsilon \le |t| \le \pi} |\varphi(t)| = q < 1.
\]
Hence,
\[
|I_4| \le 2\pi\sigma\sqrt{N}\, q^N,
\]
and $I_4 \to 0$ as $N \to \infty$.
The estimates of $I_1$ and $I_4$ show that there exists $N_0$ such that $|I_1| < \delta/4$ and
$|I_4| < \delta/4$ for $N > N_0$.
Thus the difference $R_N$ tends to zero as $N \to \infty$ uniformly for all integers $n$.
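The statement of Theorem 1.1.11 can be observed numerically. The sketch below takes Bernoulli summands, so that $S_N$ is binomial, and computes the largest deviation over all $n$ between $\sigma\sqrt{N}\,P\{S_N = n\}$ and the normal density at $z = (n - aN)/(\sigma\sqrt{N})$. The Bernoulli choice, the value $p = 0.3$, and the helper names are assumptions made for the illustration only.

```python
import math

def log_binomial_pmf(N, n, p):
    """log P{S_N = n} for a sum of N independent Bernoulli(p) variables."""
    return (math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
            + n * math.log(p) + (N - n) * math.log(1 - p))

def local_clt_gap(N, p):
    """max over n of |sigma*sqrt(N)*P{S_N = n} - exp(-z^2/2)/sqrt(2*pi)|."""
    a, sigma = p, math.sqrt(p * (1 - p))
    scale = sigma * math.sqrt(N)
    gap = 0.0
    for n in range(N + 1):
        z = (n - a * N) / scale
        density = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
        gap = max(gap, abs(scale * math.exp(log_binomial_pmf(N, n, p)) - density))
    return gap

# The uniform gap should shrink as N grows.
gaps = [local_clt_gap(N, 0.3) for N in (10, 100, 1000)]
```

Working with logarithms of the probabilities avoids overflow in the binomial coefficients for large $N$.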
In most applications of local theorems in this text, the distribution of the
summands of the sum $S_N = \xi_1 + \cdots + \xi_N$ depends on the number of summands $N$.
In such cases, there is no complete answer to the question of when the local
theorem holds for $S_N$. Even in the case of convergence to the normal law, the known
sufficient conditions for the validity of a local theorem cannot be deemed fully
satisfactory. Hence, for each specific distribution whose parameters depend on the
number of summands in the sum, it is necessary to invoke the classical scheme
given above as a model. In the hope of finding simple sufficient conditions for the
validity of local theorems for integer-valued identically distributed summands, as
in Theorems 1.1.10 and 1.1.11, we will often omit the particularly cumbersome
calculations arising in estimating characteristic functions.
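If $\xi_1, \ldots, \xi_N$ are independent identically distributed random variables such that
\[
P\{\xi_1 = 1\} = p \quad \text{and} \quad P\{\xi_1 = 0\} = q = 1 - p \quad \text{for } 0 < p < 1,
\]
then $S_N = \xi_1 + \cdots + \xi_N$ has the binomial distribution with parameters $(N, p)$,
that is, for any $k = 0, 1, \ldots, N$,
\[
P\{S_N = k\} = \binom{N}{k} p^k q^{N-k}.
\]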
If $Npq \to \infty$, then the binomial distribution is approximated by the normal law.
The following theorem, known as the local de Moivre-Laplace theorem, can be
obtained by a direct analysis of the explicit formula.
Theorem 1.1.12. If $N \to \infty$ and $(1 + u^6)/(Npq) \to 0$, where
\[
u = \frac{k - Np}{\sqrt{Npq}},
\]
then
\[
P\{S_N = k\} = \frac{1}{\sqrt{2\pi Npq}}\, e^{-u^2/2} (1 + o(1)).
\]
Theorem 1.1.12 implies the well-known integral de Moivre-Laplace theorem.
Theorem 1.1.13. If $N \to \infty$ and $(1 + u^6)/(Npq) \to 0$, where
\[
u = \frac{k - Np}{\sqrt{Npq}},
\]
then
\[
P\{S_N \le k\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-x^2/2}\,dx\, (1 + o(1)).
\]
If $p \to 0$, then the binomial distribution is approximated by the Poisson law.
It is well known that if $N \to \infty$ and $Np \to \lambda$, $0 < \lambda < \infty$, then
\[
P\{S_N = k\} = \frac{\lambda^k e^{-\lambda}}{k!} (1 + o(1))
\]
for any fixed $k = 0, 1, \ldots$. The Poisson approximation is also valid if $Np$ tends
to infinity not too quickly.
Theorem 1.1.14. If N —>• oo, Np —>• oo, A + w2)/? -> 0, where
k-Np
u = — ,
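The Poisson distribution converges to the normal law as its parameter tends to
infinity.
Theorem 1.1.15. If $(1 + u^6)/\lambda \to 0$, where $u = (k - \lambda)/\sqrt{\lambda}$, then
\[
\frac{\lambda^k e^{-\lambda}}{k!} = \frac{1}{\sqrt{2\pi\lambda}}\, e^{-u^2/2} (1 + o(1)).
\]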
Sometimes it is necessary to estimate the tails of the binomial distribution in
the form of an inequality with an explicit constant.
Theorem 1.1.16. For any $x > 0$,
\[
P\{S_N - ES_N \ge Nx\} \le e^{-2Nx^2}.
\]
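To see how the explicit bound of Theorem 1.1.16 compares with the true tail, one can compute the binomial tail exactly for small $N$. The sketch below uses exact rational arithmetic for the tail; the parameter triples and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction
import math

def upper_tail(N, p, x):
    """Exact P{S_N - N*p >= N*x} for S_N binomial (N, p); p and x are Fractions."""
    threshold = N * (p + x)
    return sum(
        math.comb(N, k) * p**k * (1 - p) ** (N - k)
        for k in range(N + 1)
        if k >= threshold
    )

def exponential_bound(N, x):
    """Right-hand side exp(-2*N*x^2) of the tail inequality."""
    return math.exp(-2 * N * float(x) ** 2)

cases = [
    (20, Fraction(1, 2), Fraction(1, 10)),
    (50, Fraction(3, 10), Fraction(1, 10)),
    (100, Fraction(1, 10), Fraction(1, 20)),
]
table = [(float(upper_tail(N, p, x)), exponential_bound(N, x)) for N, p, x in cases]
```

Each pair in `table` holds the exact tail probability followed by the bound, so the slack of the inequality can be read off directly.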
1.2. The generalized scheme of allocation
In the past three decades, the so-called generalized scheme of allocation of particles
has been applied to many probabilistic problems of combinatorics, and many of
the results in this text were obtained by reducing combinatorial problems to such
a generalized scheme.
Consider $n$ independent trials, each having $N$ equiprobable outcomes, $1, 2,
\ldots, N$. Let $\eta_i$ denote the number of occurrences of the $i$th outcome in this sequence
of trials, $i = 1, 2, \ldots, N$. The random variables $\eta_1, \ldots, \eta_N$ have the multinomial
distribution: If the nonnegative integers $k_1, \ldots, k_N$ are such that $k_1 + \cdots + k_N = n$,
then
\[
P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = \frac{n!}{k_1! \cdots k_N!\, N^n}. \tag{1.2.1}
\]
The situation in which the multinomial distribution arises can be described in
terms of an equiprobable scheme of allocating particles to cells. If $n$ particles are
independently distributed with equal probabilities into $N$ cells labeled $1, 2, \ldots, N$,
then the contents of cells $\eta_1, \ldots, \eta_N$ have the multinomial distribution (1.2.1).
In the scheme of allocating particles to cells yielding the multinomial
distribution, the contents of cells can be obtained by independent sequential allocation
of particles. If one does not require that the contents of cells can be obtained by
some sequential allocation of particles, with a simple probability law governing
the sequential trials, then any set of integer-valued nonnegative random variables
$\eta_1, \ldots, \eta_N$, such that $\eta_1 + \cdots + \eta_N = n$, can be viewed as a scheme of allocating
$n$ particles to $N$ cells, and one can interpret $\eta_i$ as the number of particles in the
cell with index $i$, $i = 1, 2, \ldots, N$.
Some probabilistic problems of combinatorics can be treated by using
generalized schemes of allocation in which the joint distribution of the contents of cells
$\eta_1, \ldots, \eta_N$ can be represented in the form
\[
P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = P\{\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\}, \tag{1.2.2}
\]
where $\xi_1, \ldots, \xi_N$ are independent identically distributed integer-valued random
variables.
The generalized scheme of allocating particles to cells is given by the parameters
$n$ and $N$ and the distribution of the random variables $\xi_1, \ldots, \xi_N$, which by relation
(1.2.2) determines the joint distribution of the contents of the cells $\eta_1, \ldots, \eta_N$. Set
\[
p_k = P\{\xi_1 = k\}, \quad k = 0, 1, \ldots. \tag{1.2.3}
\]
For the random variables $\eta_1, \ldots, \eta_N$ with the multinomial distribution (1.2.1),
relation (1.2.2) is satisfied if $\xi_1$ has the Poisson distribution with arbitrary
parameter $\lambda$:
\[
P\{\xi_1 = k\} = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, \ldots. \tag{1.2.4}
\]
Therefore the distribution of $\eta_1, \ldots, \eta_N$ satisfying relation (1.2.2) for some
distribution (1.2.3) can be viewed as a generalization of the multinomial
distribution.
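The link between the multinomial distribution and independent Poisson variables conditioned on their sum can be checked directly on a small case. The following sketch relies on the standard fact that a sum of $N$ independent Poisson variables with parameter $\lambda$ is Poisson with parameter $N\lambda$; the function names and the sizes $n = 4$, $N = 3$ are assumptions made for the example.

```python
import math
from itertools import product

def multinomial_prob(ks, N):
    """n! / (k_1! ... k_N! N^n) for the occupancy vector ks, with n = sum(ks)."""
    n = sum(ks)
    denom = math.prod(math.factorial(k) for k in ks)
    return math.factorial(n) / (denom * N**n)

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

def conditional_poisson_prob(ks, lam):
    """P{xi_1 = k_1, ..., xi_N = k_N | xi_1 + ... + xi_N = n} for i.i.d. Poisson(lam)."""
    N, n = len(ks), sum(ks)
    joint = math.prod(poisson_pmf(k, lam) for k in ks)
    # The sum of N independent Poisson(lam) variables is Poisson(N * lam).
    return joint / poisson_pmf(n, N * lam)

n, N = 4, 3
max_diff = max(
    abs(multinomial_prob(ks, N) - conditional_poisson_prob(ks, lam))
    for lam in (0.5, 2.0)
    for ks in product(range(n + 1), repeat=N)
    if sum(ks) == n
)
```

The value of `max_diff` is at rounding level for both values of $\lambda$, reflecting that the parameter cancels in the conditional distribution.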
The term "classical scheme of allocation" has become common for the
equiprobable scheme of allocating particles to cells leading to the multinomial
distribution (1.2.1). The terminology of the classical scheme of allocating particles to
cells proved to be convenient for describing a number of combinatorial problems
where the multinomial distribution appears. Many results pertaining to the
classical scheme of allocation can be obtained by applying relation (1.2.2) between the
multinomial distribution and the Poisson distribution (1.2.4). Introducing
generalized schemes of allocating particles not only broadens the scope of convenient
language for describing combinatorial objects, but also offers the possibility of
applying methods based on relation (1.2.2) that have been developed to analyze
the classical scheme.
Let $\mu_r(n, N)$ denote the number of cells containing exactly $r$ particles in the
generalized scheme of allocation with distributions (1.2.2) and (1.2.3). We show
that the representation (1.2.2) can be used to study this random variable.
Let $\xi_1^{(r)}, \ldots, \xi_N^{(r)}$ be independent identically distributed random variables
whose distribution is linked with the distribution of $\xi_1, \ldots, \xi_N$ as follows:
\[
P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \ne r\}, \quad k = 0, 1, \ldots.
\]
Also let
\[
S_N = \xi_1 + \cdots + \xi_N, \qquad S_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}.
\]
The following lemma expresses the distribution of $\mu_r(n, N)$ in terms of the
probabilities of sums of independent identically distributed random variables.
Lemma 1.2.1.
\[
P\{\mu_r(n, N) = k\} = \binom{N}{k} p_r^k (1 - p_r)^{N-k}\, \frac{P\{S_{N-k}^{(r)} = n - kr\}}{P\{S_N = n\}}.
\]
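Proof. Let $A_k$ be the event that exactly $k$ of the random variables $\xi_1, \ldots, \xi_N$
take the value $r$. By equality (1.2.2),
\[
P\{\mu_r(n, N) = k\} = P\{A_k \mid S_N = n\} = \frac{P\{A_k, S_N = n\}}{P\{S_N = n\}}.
\]
The lemma is derived by obvious manipulations of the numerator: The event $A_k$
can occur for $\binom{N}{k}$ distinct choices of random variables taking the value $r$; therefore
\[
P\{A_k, S_N = n\} = \binom{N}{k} p_r^k (1 - p_r)^{N-k}\, P\{S_N = n \mid \xi_1 \ne r, \ldots, \xi_{N-k} \ne r,\ \xi_{N-k+1} = r, \ldots, \xi_N = r\},
\]
and the conditional probability on the right-hand side equals $P\{S_{N-k}^{(r)} = n - kr\}$.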
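Lemma 1.2.1 can be tested against brute-force enumeration in the classical scheme, where $\xi_1$ is Poisson. The sketch below computes the distribution of $\mu_r(n, N)$ in two ways: from the formula of the lemma, with the sums handled by direct convolution, and by listing all $N^n$ allocations. The function names, the default $\lambda = 1$, and the requirement $0 \le r \le n$ are assumptions of this illustration.

```python
import math
from itertools import product

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

def convolve_power(pmf, M, n):
    """P{sum of M i.i.d. copies = j} for j = 0..n; pmf is listed on 0..n."""
    dist = [1.0] + [0.0] * n                      # law of the empty sum
    for _ in range(M):
        dist = [sum(dist[i] * pmf[j - i] for i in range(j + 1)) for j in range(n + 1)]
    return dist

def mu_r_by_lemma(n, N, r, lam=1.0):
    """Distribution of mu_r(n, N) from Lemma 1.2.1 with Poisson(lam) variables, 0 <= r <= n."""
    p = [poisson_pmf(k, lam) for k in range(n + 1)]
    p_r = p[r]
    # Law of xi^(r): xi conditioned on xi != r.
    p_excl = [0.0 if k == r else p[k] / (1 - p_r) for k in range(n + 1)]
    total = convolve_power(p, N, n)[n]            # P{S_N = n}
    out = []
    for k in range(N + 1):
        rest = n - k * r
        tail = convolve_power(p_excl, N - k, n)[rest] if rest >= 0 else 0.0
        out.append(math.comb(N, k) * p_r**k * (1 - p_r) ** (N - k) * tail / total)
    return out

def mu_r_by_enumeration(n, N, r):
    """Distribution of mu_r(n, N) by listing all N^n equiprobable allocations."""
    counts = [0] * (N + 1)
    for cells in product(range(N), repeat=n):     # cell chosen by each particle
        occupancy = [cells.count(c) for c in range(N)]
        counts[occupancy.count(r)] += 1
    return [c / N**n for c in counts]
```

Truncating the distributions at $n$ loses nothing here, because the summands are nonnegative and only sums not exceeding $n$ are needed.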
In the generalized scheme of allocating particles, there is a rather simple
approach to study the order statistics $\eta_{(1)} \le \eta_{(2)} \le \cdots \le \eta_{(N)}$ constructed for the
random variables $\eta_1, \ldots, \eta_N$ arranged in nondecreasing order.
Let $\xi_1^{(A)}, \ldots, \xi_N^{(A)}$ be independent identically distributed random variables such
that
\[
P\{\xi_1^{(A)} = k\} = P\{\xi_1 = k \mid \xi_1 \in A\}, \quad k = 0, 1, \ldots,
\]
where $A$ is a subset of the set of natural numbers with $P\{\xi_1 \in A\} > 0$. In particular,
if the complement $\bar{A}$ of $A$ consists of one value $r$, then $\xi_1^{(A)} = \xi_1^{(r)}$, where $\xi_1^{(r)}$ is the
random variable defined preceding Lemma 1.2.1. Set
\[
S_N^{(A)} = \xi_1^{(A)} + \cdots + \xi_N^{(A)}.
\]
The following lemma reduces the study of distributions of order statistics to
that of probabilities related to sums of independent random variables.
Lemma 1.2.2. For any positive integer $m$,
\[
P\{\eta_{(N-m+1)} \le r\} = \sum_{k=0}^{m-1} \binom{N}{k} P_r^k (1 - P_r)^{N-k}\, \frac{P\{S_{N-k}^{(A_r)} + S_k^{(\bar{A}_r)} = n\}}{P\{S_N = n\}}, \tag{1.2.7}
\]
where $A_r$ is the set of all nonnegative integers not exceeding $r$, $\bar{A}_r$ is its complement
in the set of all nonnegative integers, and $P_r = P\{\xi_1 > r\}$.
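Proof. Let us prove (1.2.7) for $m = 1$. For the maximal order statistic
$\eta_{(N)} = \max(\eta_1, \ldots, \eta_N)$, by (1.2.2) and the independence of $\xi_1, \ldots, \xi_N$, we have
\[
P\{\eta_{(N)} \le r\} = P\{\eta_1 \le r, \ldots, \eta_N \le r\} = P\{\xi_1 \le r, \ldots, \xi_N \le r \mid S_N = n\}
\]
\[
= (P\{\xi_1 \le r\})^N\, \frac{P\{S_N = n \mid \xi_1 \le r, \ldots, \xi_N \le r\}}{P\{S_N = n\}}.
\]
By using the random variables $\xi_1^{(A_r)}, \ldots, \xi_N^{(A_r)}$, we finally obtain
\[
P\{\eta_{(N)} \le r\} = (1 - P_r)^N\, \frac{P\{S_N^{(A_r)} = n\}}{P\{S_N = n\}}. \tag{1.2.8}
\]
Relations (1.2.6) and (1.2.7) for other values of $m$ are similarly proved.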
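For the joint distribution of the random variables $\mu_{r_1}(n, N), \ldots, \mu_{r_s}(n, N)$, we
can prove the following lemma as we did in Lemma 1.2.1.
Lemma 1.2.3.
\[
P\{\mu_{r_1}(n, N) = k_1, \ldots, \mu_{r_s}(n, N) = k_s\}
= \frac{N!\, p_{r_1}^{k_1} \cdots p_{r_s}^{k_s} (1 - p_{r_1} - \cdots - p_{r_s})^{N - k_1 - \cdots - k_s}}{k_1! \cdots k_s!\, (N - k_1 - \cdots - k_s)!}
\times \frac{P\{S_{N - k_1 - \cdots - k_s}^{(A)} = n - k_1 r_1 - \cdots - k_s r_s\}}{P\{S_N = n\}},
\]
where $s \ge 1$, $k_1, \ldots, k_s, r_1, \ldots, r_s$ are nonnegative integers, $r_1, \ldots, r_s$ are
distinct, $A$ is the set of nonnegative integers other than $r_1, \ldots, r_s$, and $S_M^{(A)}$ is the sum of $M$ independent copies of $\xi_1^{(A)}$.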
Lemmas 1.2.1, 1.2.2, and 1.2.3 express the distributions of the random variables
$\mu_r(n, N)$ and the order statistics $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ in the generalized scheme
of allocating particles in terms of probabilities related to sums of independent
random variables. Obtaining limit distributions for the random variables $\mu_r(n, N)$
and $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ is reduced to applying local limit theorems for sums of
independent identically distributed integer-valued random variables.
We now give some examples of how combinatorial problems can be reduced to
the generalized scheme of allocating particles to cells.
Example 1.2.1. Consider single-valued mappings of the set $X_n = \{1, 2, \ldots, n\}$
into itself. A single-valued mapping $s$ of the set $X_n$ into itself can be represented
as
\[
s = \begin{pmatrix} 1 & 2 & \cdots & n \\ s_1 & s_2 & \cdots & s_n \end{pmatrix},
\]
where $s_k$ denotes the image of $k$, $k = 1, 2, \ldots, n$, under the mapping $s$. The
mapping $s$ may be thought of as an oriented graph $\Gamma_n = \Gamma(X_n, W_n)$ with vertex
set $X_n$ and arcs $W_n = \{(k, s_k),\ k = 1, 2, \ldots, n\}$, where the arc $(k, s_k)$ is directed
from $k$ to $s_k$, $k = 1, 2, \ldots, n$. The number of arcs entering the vertex $k$ in the graph
$\Gamma_n$, which is the number of pre-images of the element $k$ under the mapping $s$, is
called the multiplicity of the vertex $k$.
Let $\Sigma_n$ denote the set of all single-valued mappings of $X_n$ into itself, and
$\Gamma_n$ the set of all graphs of these mappings. The number of elements of $\Sigma_n$ is
obviously equal to $n^n$. If the uniform distribution is defined on the set $\Sigma_n$, then we
obtain a probability space whose set of elementary events $\Omega$ is the set $\Sigma_n$; and the
probability for any subset of $\Sigma_n$ is the number of elements in the subset divided
by $n^n$. The random mapping $\sigma$ is any of the $n^n$ possible mappings with probability
$P\{\sigma = s\} = n^{-n}$, $s \in \Sigma_n$. If
\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix},
\]
where the random variable $\sigma_i$ is the random image of the element $i$, $i = 1, 2, \ldots, n$,
then, for any $s$,
\[
P\{\sigma = s\} = P\{\sigma_1 = s_1, \ldots, \sigma_n = s_n\} = n^{-n}.
\]
Thus the random variables $\sigma_1, \ldots, \sigma_n$ are independent and take the values $1, 2,
\ldots, n$ with equal probabilities.
Let $\eta_r$ denote the multiplicity of the vertex $r$ in the random mapping $\sigma$, $r =
1, 2, \ldots, n$. The quantity $\eta_r$ is equal to the number of random variables $\sigma_1, \ldots, \sigma_n$
taking the value $r$; thus, for nonnegative integers $k_1, \ldots, k_n$ with $k_1 + \cdots + k_n = n$,
the probability $P\{\eta_1 = k_1, \ldots, \eta_n = k_n\}$ is equal to the sum of probabilities
$P\{\sigma_1 = s_1, \ldots, \sigma_n = s_n\} = n^{-n}$, where among $s_1, \ldots, s_n$ there are exactly
$k_r$ values equal to $r$, $r = 1, 2, \ldots, n$. The number of summands in this sum is
obviously $n!/(k_1! \cdots k_n!)$; therefore
\[
P\{\eta_1 = k_1, \ldots, \eta_n = k_n\} = \frac{n!}{k_1! \cdots k_n!\, n^n}.
\]
Thus the joint distribution of the multiplicities of the vertices $\eta_1, \ldots, \eta_n$ of a
random mapping is the multinomial distribution. Taking the vertices as cells and
the arcs going into these vertices as particles, we obtain the classical scheme of
allocating $n$ particles to $n$ cells with multinomial distribution of the contents of the
cells $\eta_1, \ldots, \eta_n$. For the random variables $\eta_1, \ldots, \eta_n$, relation (1.2.2) holds:
\[
P\{\eta_1 = k_1, \ldots, \eta_n = k_n\} = P\{\xi_1 = k_1, \ldots, \xi_n = k_n \mid \xi_1 + \cdots + \xi_n = n\},
\]
in which $\xi_1, \ldots, \xi_n$ are independent and identically Poisson-distributed.
The number of vertices $\mu_r(n)$ in a random mapping with multiplicity $r$
corresponds to the number of cells containing exactly $r$ particles in the classical scheme
of allocating $n$ particles to $n$ cells; to study these variables, as well as the order
statistics made up of the multiplicities of the vertices, one can invoke Lemmas 1.2.1,
1.2.2, and 1.2.3.
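For a small $n$ the claim of Example 1.2.1 can be confirmed by listing all $n^n$ mappings and tabulating the vector of multiplicities. The sketch below does this with exact rational arithmetic; the function names and the choice $n = 4$ are assumptions made for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial, prod

def multiplicity_law(n):
    """Exact joint law of the multiplicities over all n**n mappings of {1, ..., n}."""
    counts = Counter()
    for images in product(range(1, n + 1), repeat=n):   # images[i - 1] is the image of i
        counts[tuple(images.count(v) for v in range(1, n + 1))] += 1
    return {ks: Fraction(c, n**n) for ks, c in counts.items()}

def multinomial(ks):
    """n! / (k_1! ... k_n! n^n) with n = sum(ks) = len(ks)."""
    n = sum(ks)
    return Fraction(factorial(n), prod(factorial(k) for k in ks) * n**n)

law = multiplicity_law(4)
```

Every key of `law` is a vector $(k_1, \ldots, k_n)$ with $k_1 + \cdots + k_n = n$, and its value can be compared with `multinomial(ks)`.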
Example 1.2.2. Consider all distinct partitions of $n$ into $N$ summands not less
than $r > 0$. The number of such partitions is $\binom{n - (r-1)N - 1}{N - 1}$. Let us define the
uniform distribution on the set of these partitions by assigning the probability
$\binom{n - (r-1)N - 1}{N - 1}^{-1}$ to each partition $n = n_1 + \cdots + n_N$, $n_1, \ldots, n_N \ge r$. Then $n$ can
be written in the form
\[
n = \eta_1 + \cdots + \eta_N,
\]
where the summands $\eta_1, \ldots, \eta_N$ are random variables. If $n_1, \ldots, n_N \ge r$ and
$n = n_1 + \cdots + n_N$, then
\[
P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \binom{n - (r-1)N - 1}{N - 1}^{-1}.
\]
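The generalized scheme of allocation corresponding to this combinatorial problem is
obtained if we use the geometric distribution for the distribution of the random
variables $\xi_1, \ldots, \xi_N$:
\[
P\{\xi_1 = k\} = p^{k-r}(1 - p), \quad k = r, r + 1, \ldots, \quad 0 < p < 1.
\]
Indeed, as is easily verified,
\[
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \binom{n - (r-1)N - 1}{N - 1}^{-1},
\]
since, for geometrically distributed summands,
\[
P\{\xi_1 + \cdots + \xi_N = n\} = \binom{n - (r-1)N - 1}{N - 1} p^{n - Nr} (1 - p)^N.
\]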
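The counting claim and the uniformity of the conditional distribution in Example 1.2.2 are easy to confirm by brute force. In the sketch below the summands are treated as ordered, as the formulas of the example require; the function names and the parameter values $n = 10$, $N = 3$, $r = 2$, $p = 2/5$ are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def compositions(n, N, r):
    """All ordered N-tuples of integers >= r with sum n (brute force)."""
    return [c for c in product(range(r, n + 1), repeat=N) if sum(c) == n]

def conditional_geometric_law(n, N, r, p):
    """Law of (xi_1, ..., xi_N) given their sum n, for P{xi = k} = p^(k-r) (1-p), k >= r."""
    weight = {}
    for c in compositions(n, N, r):
        w = Fraction(1)
        for k in c:
            w *= p ** (k - r) * (1 - p)
        weight[c] = w
    total = sum(weight.values())
    return {c: w / total for c, w in weight.items()}

n, N, r = 10, 3, 2
law = conditional_geometric_law(n, N, r, Fraction(2, 5))
expected_count = comb(n - (r - 1) * N - 1, N - 1)
```

Because every admissible tuple has the same weight $p^{n - Nr}(1 - p)^N$, the conditional law does not depend on $p$.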
Example 1.2.3. Note that it is not necessary for the random variables $\xi_1, \ldots, \xi_N$ in
a generalized scheme to be identically distributed. Consider the following example.
Draw $n$ balls at random without replacement from an urn containing $m_i$ balls of
the $i$th color, $i = 1, \ldots, N$. Let $\eta_i$ denote the number of balls drawn of the $i$th
color, $i = 1, \ldots, N$. It is easily seen that for nonnegative integers $n_1, \ldots, n_N$ such
that $n_1 + \cdots + n_N = n$,
\[
P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{\binom{m_1}{n_1} \cdots \binom{m_N}{n_N}}{\binom{m}{n}},
\]
where $m = m_1 + \cdots + m_N$.
If in the generalized scheme of allocation the random variables $\xi_1, \ldots, \xi_N$ have
the binomial distributions
\[
P\{\xi_i = k\} = \binom{m_i}{k} p^k (1 - p)^{m_i - k},
\]
where $0 < p < 1$, $k = 0, 1, \ldots, m_i$, $i = 1, \ldots, N$, then
\[
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \frac{\binom{m_1}{n_1} \cdots \binom{m_N}{n_N}}{\binom{m}{n}},
\]
and the distribution of the random variables $\eta_1, \ldots, \eta_N$ coincides with the
conditional distribution of the independent random variables $\xi_1, \ldots, \xi_N$ under the
condition $\xi_1 + \cdots + \xi_N = n$. Thus $\eta_1, \ldots, \eta_N$ may be viewed as contents of cells
in the generalized scheme of allocation, in which the random variables $\xi_1, \ldots, \xi_N$
have different binomial distributions.
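The coincidence claimed in Example 1.2.3 can be checked exactly for a small urn. The sketch below compares the probabilities for drawing without replacement with the conditional law of independent binomial variables; the urn composition $(3, 2, 4)$, the value $p = 1/3$, and the function names are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod

def urn_probability(ms, ns):
    """P{eta_1 = n_1, ..., eta_N = n_N} when sum(ns) balls are drawn without replacement."""
    return Fraction(prod(comb(m, k) for m, k in zip(ms, ns)), comb(sum(ms), sum(ns)))

def conditional_binomial_probability(ms, ns, p):
    """P{xi_i = n_i for all i | sum of xi_i = n}, with xi_i independent binomial (m_i, p)."""
    def joint(ks):
        return prod(comb(m, k) * p**k * (1 - p) ** (m - k) for m, k in zip(ms, ks))
    n = sum(ns)
    total = sum(joint(ks) for ks in product(*(range(m + 1) for m in ms)) if sum(ks) == n)
    return joint(ns) / total

ms, p, n = (3, 2, 4), Fraction(1, 3), 4
outcomes = [ns for ns in product(*(range(m + 1) for m in ms)) if sum(ns) == n]
```

For each outcome in `outcomes` the two functions return the same rational number, whatever value of $p$ in $(0, 1)$ is used.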
Example 1.2.4. In a sense, the graph $\Gamma_n$ of a random mapping consists of trees.
Indeed, the graph can be naturally decomposed into connected components. Clearly,
each connected component of the graph $\Gamma_n$ contains exactly one cycle. Vertices in
the cycle are called cyclic. If we remove the arcs joining the cyclic vertices, then
the graph turns into a forest, that is, a graph consisting of rooted trees.
Recall that a rooted tree with n + 1 vertices is a connected undirected graph
without cycles, with one special vertex called the root, and with n nonroot labeled
vertices. A rooted tree with n + 1 vertices has n edges. In what follows, we view
all edges of trees as directed away from the root, and the multiplicity of a vertex
of a tree is defined as the number of edges emanating from it.
Let $T_n$ denote the set of all rooted trees with $n + 1$ vertices whose roots are
labeled zero, and the $n$ nonroot vertices are labeled $1, 2, \ldots, n$. The number of
elements of the set $T_n$ is equal to $(n + 1)^{n-1}$.
A forest with $N$ roots and $n$ nonroot vertices is a graph, all of whose components
are trees. The roots of these trees are labeled with $1, \ldots, N$ and the nonroot vertices
with $1, \ldots, n$. We denote the set of all such forests by $T_{n,N}$. The number of elements
in the set $T_{n,N}$ is $N(n + N)^{n-1}$. The number of forests in which the $k$th tree contains
$n_k$ nonroot vertices, $k = 1, 2, \ldots, N$, is
\[
\frac{n!}{n_1! \cdots n_N!}\, (n_1 + 1)^{n_1 - 1} \cdots (n_N + 1)^{n_N - 1},
\]
where the factor $n!/(n_1! \cdots n_N!)$ is the number of partitions of $n$ vertices into $N$
ordered groups, and $(n_k + 1)^{n_k - 1}$ is the number of trees that can be constructed
from the $k$th group of vertices of each partition. Then
\[
N(n + N)^{n-1} = \sum \frac{n!\, (n_1 + 1)^{n_1 - 1} \cdots (n_N + 1)^{n_N - 1}}{n_1! \cdots n_N!}, \tag{1.2.9}
\]
where the summation is taken over nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$.
Next, we define the uniform distribution on $T_{n,N}$. Let $\eta_k$ denote the number of
nonroot vertices in the $k$th tree of a random forest in $T_{n,N}$, $k = 1, \ldots, N$. For the
random variables $\eta_1, \ldots, \eta_N$, we have
\[
P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{n!\, (n_1 + 1)^{n_1} \cdots (n_N + 1)^{n_N}}{N(n + N)^{n-1} (n_1 + 1)! \cdots (n_N + 1)!}, \tag{1.2.10}
\]
where $n_1, \ldots, n_N$ are nonnegative integers and $n_1 + \cdots + n_N = n$.
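Let us consider independent identically distributed random variables $\xi_1, \ldots, \xi_N$
for which
\[
P\{\xi_1 = k\} = \frac{(k + 1)^k x^{k+1}}{(k + 1)!\, \theta(x)}, \quad k = 0, 1, \ldots, \tag{1.2.11}
\]
where the parameter $x$ lies in the interval $0 < x < e^{-1}$ and the function $\theta(x)$ is
defined as
\[
\theta(x) = \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!}\, x^k.
\]
By using (1.2.9), we easily obtain
\[
P\{\xi_1 + \cdots + \xi_N = n\} = \frac{N(n + N)^{n-1} x^{n+N}}{n!\, (\theta(x))^N};
\]
hence, for any $x$, $0 < x < e^{-1}$, and for nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$,
\[
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \frac{n!\, (n_1 + 1)^{n_1} \cdots (n_N + 1)^{n_N}}{N(n + N)^{n-1} (n_1 + 1)! \cdots (n_N + 1)!}. \tag{1.2.12}
\]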
The right-hand sides of (1.2.10) and (1.2.12) are identical, and the joint distribution
of $\eta_1, \ldots, \eta_N$ coincides with the distribution of $\xi_1, \ldots, \xi_N$ under the condition
that $\xi_1 + \cdots + \xi_N = n$. Thus, for the random variables $\eta_1, \ldots, \eta_N$ and $\xi_1, \ldots, \xi_N$,
relation (1.2.2) holds, enabling us to study tree sizes in a random forest by using
the generalized scheme of allocating particles into cells, with the random variables
$\xi_1, \ldots, \xi_N$ having the distribution given by (1.2.11).
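The counting identity behind Example 1.2.4, namely that summing the number of forests with prescribed tree sizes over all size vectors gives $N(n + N)^{n-1}$, can be verified for small parameters. The sketch below evaluates the sum with exact rational arithmetic; the function name and the tested ranges are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def forests_by_sizes(n, N):
    """Sum over (n_1, ..., n_N) >= 0 with total n of n! * prod (n_k + 1)^(n_k - 1) / n_k!."""
    total = Fraction(0)
    for sizes in product(range(n + 1), repeat=N):
        if sum(sizes) != n:
            continue
        term = Fraction(factorial(n))
        for k in sizes:
            # For k = 0 the factor is 1^(-1) / 0! = 1: a tree consisting of the root only.
            term *= Fraction(k + 1) ** (k - 1) / factorial(k)
        total += term
    return total

checks = {(n, N): forests_by_sizes(n, N) for n in range(1, 6) for N in range(1, 4)}
```

For instance, with $n = 2$ and $N = 2$ the size vectors $(2, 0)$, $(0, 2)$, $(1, 1)$ contribute $3 + 3 + 2 = 8 = 2 \cdot 4^{1}$.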
1.3. Connectivity of graphs and the generalized scheme
Not pretending to give an exhaustive solution, let us describe a rather general
model of a random graph by using the generalized scheme of allocation. Consider
the set of all graphs $\Gamma_n(R)$ with $n$ labeled vertices possessing a property $R$. We
assume that connectivity is defined for the graphs from this set and that each graph
is represented as a union of its connected components. In the formal treatment that
follows, it may be helpful to keep in mind the graphs of random mappings or of
random permutations. The former graphs consist of components that are connected
directed graphs with exactly one cycle, whereas the latter graphs consist only of
cycles.
Let $a_n$ denote the number of graphs in the set $\Gamma_n(R)$ and let $b_n$ be the
number of connected graphs in $\Gamma_n(R)$. We denote by $\Gamma_{n,N}(R)$ the subset of graphs
in $\Gamma_n(R)$ with exactly $N$ connected components. Note that the components of a
graph in $\Gamma_{n,N}(R)$ are unordered, and hence we can consider only the symmetric
characteristics that do not depend on the order of the components. To avoid this
restriction, we instead consider the set $\tilde\Gamma_{n,N}(R)$ of combinatorial objects
constructed by means of all possible orderings of the components of each graph from
$\Gamma_{n,N}(R)$. The elements of this set are ordered collections of $N$ components, each
of which is a connected graph possessing the property $R$, and the total number of
vertices in the components is equal to $n$. Since the vertices of a graph in $\Gamma_{n,N}(R)$
are labeled, all the connected components of the graph are distinct; therefore the
number of elements in $\tilde\Gamma_{n,N}(R)$ is equal to $N!\,a_{n,N}$, where $a_{n,N}$ is the number of
elements of the set $\Gamma_{n,N}(R)$ consisting of the unordered collections of components.
Now let us impose a restriction on the property R of graphs. Let a graph possess
the property R if and only if the property holds for each connected component:
The property R is then called decomposable.
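Set $a_0 = 1$, $b_0 = 0$ and introduce the generating functions
\[
A(x) = \sum_{n=0}^{\infty} \frac{a_n x^n}{n!}, \qquad B(x) = \sum_{n=0}^{\infty} \frac{b_n x^n}{n!}.
\]
Lemma 1.3.1. If the property $R$ is decomposable, then
\[
a_{n,N} = \frac{n!}{N!} \sum \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}, \tag{1.3.1}
\]
where the summation is taken over nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$.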
Proof. With $n_1 + \cdots + n_N = n$ and $n_1, \ldots, n_N \ge 1$, let $\tilde a_n(n_1, \ldots, n_N)$ denote
the number of graphs in $\tilde\Gamma_{n,N}(R)$ with ordered components of sizes $n_1, \ldots, n_N$.
To construct all $\tilde a_n(n_1, \ldots, n_N)$ such graphs, we decompose the $n$ labeled vertices
into $N$ groups so that there are $n_i$ vertices in the $i$th group, $i = 1, \ldots, N$. This
can be done in $n!/(n_1! \cdots n_N!)$ ways. From $n_i$ vertices, we construct a connected
graph possessing the property $R$; this can be done in $b_{n_i}$ ways. Thus the number
of ordered sets of connected components of sizes $n_1, \ldots, n_N$ is
\[
\tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}.
\]
Since $N$ components can be ordered in $N!$ ways, the number $a_n(n_1, \ldots, n_N)$ of
unordered sets, or the number of graphs in $\Gamma_{n,N}(R)$ having exactly $N$ components
of sizes $n_1, \ldots, n_N$, is
\[
a_n(n_1, \ldots, n_N) = \frac{1}{N!}\, \tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_{n_1} \cdots b_{n_N}}{N!\, n_1! \cdots n_N!}. \tag{1.3.2}
\]
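Lemma 1.3.2. If the property $R$ is decomposable, then
\[
A(x) = e^{B(x)}.
\]
Proof. As follows from (1.3.1), the number $a_n$ of all graphs in $\Gamma_n(R)$ is
\[
a_n = \sum_{N=1}^{n} a_{n,N} = \sum_{N=1}^{n} \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}. \tag{1.3.3}
\]
By dividing both sides of this equality by $n!$, multiplying by $x^n$, and summing over
$n$, we get the chain of equalities
\[
\sum_{n=1}^{\infty} \frac{a_n x^n}{n!}
= \sum_{n=1}^{\infty} \sum_{N=1}^{n} \frac{1}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} x^{n_1} \cdots b_{n_N} x^{n_N}}{n_1! \cdots n_N!}
= \sum_{N=1}^{\infty} \frac{(B(x))^N}{N!}
= e^{B(x)} - 1,
\]
which proves the lemma.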
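The relation $A(x) = e^{B(x)}$ gives a practical way to compute $a_n$ from $b_n$. Differentiating it yields $A'(x) = B'(x)A(x)$, and comparing coefficients gives the recurrence $a_n = \sum_{k=1}^{n} \binom{n-1}{k-1} b_k a_{n-k}$. The sketch below applies this recurrence, which is a standard consequence of the lemma rather than a formula stated in the text, to two decomposable properties: components that are cycles, with $b_k = (k-1)!$, and components that are blocks of a set partition, with $b_k = 1$. The function name is an assumption.

```python
from math import comb, factorial

def graphs_from_connected(b, n_max):
    """Return [a_0, ..., a_n_max] computed from b(1), ..., b(n_max) via A(x) = exp(B(x)).

    Uses a_n = sum_{k=1}^{n} C(n-1, k-1) * b(k) * a_{n-k}, with a_0 = 1.
    """
    a = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        a[n] = sum(comb(n - 1, k - 1) * b(k) * a[n - k] for k in range(1, n + 1))
    return a

cycle_graphs = graphs_from_connected(lambda k: factorial(k - 1), 8)   # components are cycles
block_graphs = graphs_from_connected(lambda k: 1, 8)                  # components are blocks
```

The first list reproduces the factorials $n!$, and the second reproduces the number of partitions of an $n$-element set.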
Let us define the uniform distribution on the set $\Gamma_n(R)$ and consider the random
variables $\alpha_m$ equal to the number of components of size $m$ in a random graph
from $\Gamma_n(R)$. The total number of components $\nu_n$ of a random graph from $\Gamma_n(R)$
is related to these variables by $\nu_n = \alpha_1 + \cdots + \alpha_n$. Arrange the components in
order of nondecreasing sizes and denote by $\beta_m$ the size of the $m$th component in
the ordered series; if $m > \nu_n$, set $\beta_m = 0$.
We will also consider the random variables defined on the set $\tilde\Gamma_{n,N}(R)$ of ordered
sets of $N$ components. The ordered components labeled with the numbers from 1
to $N$ play the role of cells in the generalized scheme of allocating particles. Define
the uniform distribution on $\tilde\Gamma_{n,N}(R)$ and denote by $\eta_1, \ldots, \eta_N$ the sizes of the
ordered connected components of a random element in $\tilde\Gamma_{n,N}(R)$. It is then clear
that
\[
P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{N!\, a_n(n_1, \ldots, n_N)}{N!\, a_{n,N}} = \frac{a_n(n_1, \ldots, n_N)}{a_{n,N}}. \tag{1.3.4}
\]
Theorem 1.3.1. If the series
\[
B(x) = \sum_{n=0}^{\infty} \frac{b_n x^n}{n!} \tag{1.3.5}
\]
has a nonzero radius of convergence, then the random variables $\eta_1, \ldots, \eta_N$ are the
contents of cells in the generalized scheme of allocation in which the independent
identically distributed random variables $\xi_1, \ldots, \xi_N$ have the distribution
\[
P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)}, \quad k = 1, 2, \ldots, \tag{1.3.6}
\]
where the positive value $x$ from the domain of convergence of (1.3.5) may be taken
arbitrarily.
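Proof. Let us find the conditional joint distribution of the random variables
$\xi_1, \ldots, \xi_N$ with distribution (1.3.6) under the condition $\xi_1 + \cdots + \xi_N = n$. For
such random variables,
\[
P\{\xi_1 + \cdots + \xi_N = n\} = \frac{x^n}{(B(x))^N} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!},
\]
and by virtue of (1.3.1),
\[
P\{\xi_1 + \cdots + \xi_N = n\} = \frac{x^n N!}{n!\, (B(x))^N}\, a_{n,N}. \tag{1.3.8}
\]
Hence, if $n_1, \ldots, n_N \ge 1$ and $n_1 + \cdots + n_N = n$, then
\[
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}
= \frac{b_{n_1} \cdots b_{n_N}\, x^n}{n_1! \cdots n_N!\, (B(x))^N P\{\xi_1 + \cdots + \xi_N = n\}}
= \frac{b_{n_1} \cdots b_{n_N}\, n!}{n_1! \cdots n_N!\, N!\, a_{n,N}},
\]
and according to (1.3.2),
\[
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \frac{a_n(n_1, \ldots, n_N)}{a_{n,N}}. \tag{1.3.9}
\]
From (1.3.4) and (1.3.9), we obtain the relation (1.2.2) between the random
variables $\eta_1, \ldots, \eta_N$ and $\xi_1, \ldots, \xi_N$ in the generalized scheme of allocating particles
to cells.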
In the generalized scheme of allocating particles, we usually study the random
variables $\mu_r(n, N)$ equal to the number of cells containing exactly $r$ particles
and the order statistics $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ obtained by arranging the contents of
cells in nondecreasing order. In this case, $\mu_r(n, N)$ is the number of components
of size $r$, and $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ are the sizes of the components in a random
element from $\tilde\Gamma_{n,N}(R)$ arranged in nondecreasing order. These random variables
help in studying distributions of the random variables $\alpha_1, \ldots, \alpha_n$ and the associated
variables defined on the set $\Gamma_n(R)$ of all graphs possessing the property $R$.
Lemma 1.3.3. For any positive $x$ from the domain of convergence of (1.3.5),
\[
P\{\nu_n = N\} = \frac{n!\, (B(x))^N}{N!\, x^n a_n}\, P\{\xi_1 + \cdots + \xi_N = n\}. \tag{1.3.10}
\]
Proof. Relation (1.3.10) follows from (1.3.8) because $P\{\nu_n = N\} = a_{n,N}/a_n$ by
definition.
It is clear by virtue of (1.3.3) that the number $a_n$ can also be expressed in terms
of probabilities related to $\xi_1, \ldots, \xi_N$:
\[
a_n = \sum_{N=1}^{n} \frac{n!\, (B(x))^N}{N!\, x^n}\, P\{\xi_1 + \cdots + \xi_N = n\}.
\]
Lemma 1.3.4. For any nonnegative integers $N, m_1, \ldots, m_n$,
\[
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\} = P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}.
\]
Proof. The conditional distribution on $\Gamma_n(R)$ under the condition $\nu_n = N$ is
concentrated on the set $\Gamma_{n,N}(R)$ of graphs having exactly $N$ connected components
and is uniform on this set. Hence,
\[
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\} = \frac{c_N(m_1, \ldots, m_n)}{a_{n,N}}, \tag{1.3.12}
\]
where $a_{n,N}$ is the number of elements in $\Gamma_{n,N}(R)$ and $c_N(m_1, \ldots, m_n)$ is the
number of graphs in $\Gamma_{n,N}(R)$ such that the number of components of size $r$ is $m_r$,
$r = 1, 2, \ldots, n$.
Consider the above set $\tilde\Gamma_{n,N}(R)$ composed of ordered sets of $N$ components. Let
$\tilde c_N(m_1, \ldots, m_n)$ denote the number of elements in $\tilde\Gamma_{n,N}(R)$ such that the number
of components of size $r$ is $m_r$, $r = 1, 2, \ldots, n$. It is clear that
\[
P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\} = \frac{\tilde c_N(m_1, \ldots, m_n)}{\tilde a_{n,N}}, \tag{1.3.13}
\]
where $\tilde a_{n,N}$ is the number of elements in $\tilde\Gamma_{n,N}(R)$. The assertion of the lemma
follows from (1.3.12) and (1.3.13) because $\tilde a_{n,N} = N!\, a_{n,N}$ and $\tilde c_N(m_1, \ldots, m_n) =
N!\, c_N(m_1, \ldots, m_n)$.
Thus, if the series (1.3.5) has a nonzero radius of convergence, then all of the
random variables expressed by $\alpha_1, \ldots, \alpha_n$ can be studied by using the generalized
scheme of allocating particles in which the random variables $\xi_1, \ldots, \xi_N$ have the
distribution (1.3.6). Roughly speaking, under the condition that the number $\nu_n$
of connected components of the graph from $\Gamma_n(R)$ is $N$, the sizes of these components
(under a random ordering) have the same joint distribution as the random variables
$\eta_1, \ldots, \eta_N$ in the generalized scheme of allocating particles that are defined by
the independent random variables $\xi_1, \ldots, \xi_N$ with distribution (1.3.6). Thus, for
$\nu_n = N$ the random variables $\beta_1, \ldots, \beta_N$ are expressed in terms of $\alpha_1, \ldots, \alpha_n$
in exactly the same way as the order statistics $\eta_{(1)}, \ldots, \eta_{(N)}$ in the generalized
scheme of allocating particles are expressed in terms of $\mu_1(n, N), \ldots, \mu_n(n, N)$.
Hence, Lemma 1.3.4 implies the following assertion.
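Lemma 1.3.5. For any nonnegative integers $N, k_1, \ldots, k_N$,
\[
P\{\beta_1 = k_1, \ldots, \beta_N = k_N \mid \nu_n = N\} = P\{\eta_{(1)} = k_1, \ldots, \eta_{(N)} = k_N\}. \tag{1.3.14}
\]
We now consider the joint distribution of $\mu_1(n, N), \ldots, \mu_n(n, N)$.
Lemma 1.3.6. For nonnegative integers $m_1, \ldots, m_n$ such that $m_1 + \cdots + m_n = N$
and $m_1 + 2m_2 + \cdots + nm_n = n$,
\[
P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\} = \frac{n!}{a_n P\{\nu_n = N\}} \prod_{r=1}^{n} \frac{b_r^{m_r}}{m_r!\, (r!)^{m_r}}. \tag{1.3.15}
\]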
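Proof. To obtain (1.3.15), it suffices to calculate $\tilde c_N(m_1, \ldots, m_n)$ in (1.3.13). It
is clear that
\[
\tilde c_N(m_1, \ldots, m_n) = \sum \tilde a_n(n_1, \ldots, n_N),
\]
where the summation is taken over all sets $(n_1, \ldots, n_N)$ containing the element
$r$ exactly $m_r$ times, $r = 1, \ldots, n$. The number of such sets is $N!/(m_1! \cdots m_n!)$,
and for each of them, by (1.3.2),
\[
\tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_1^{m_1} \cdots b_n^{m_n}}{(1!)^{m_1} \cdots (n!)^{m_n}}.
\]
Hence,
\[
\tilde c_N(m_1, \ldots, m_n) = \frac{N!\, n!\, b_1^{m_1} \cdots b_n^{m_n}}{m_1! \cdots m_n!\, (1!)^{m_1} \cdots (n!)^{m_n}}.
\]
To obtain formula (1.3.15), it remains to note that
\[
P\{\nu_n = N\} = \frac{a_{n,N}}{a_n} = \frac{\tilde a_{n,N}}{N!\, a_n}.
\]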
Lemmas 1.3.4 and 1.3.6 enable us to express the joint distribution of the random
variables $\alpha_1, \ldots, \alpha_n$ in a random graph from $\Gamma_n(R)$.
Lemma 1.3.7. If $m_1, \ldots, m_n$ are nonnegative integers, then
\[
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} =
\begin{cases}
\dfrac{n!}{a_n} \displaystyle\prod_{r=1}^{n} \dfrac{b_r^{m_r}}{m_r!\, (r!)^{m_r}} & \text{if } m_1 + 2m_2 + \cdots + nm_n = n, \\
0 & \text{otherwise.}
\end{cases}
\]
Proof. By the total probability formula,
\[
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\}
= \sum_{k=1}^{n} P\{\nu_n = k\}\, P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = k\}
= P\{\nu_n = N\}\, P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\},
\]
where $N = m_1 + \cdots + m_n$. By using Lemma 1.3.4, we find that
\[
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} = P\{\nu_n = N\}\, P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}. \tag{1.3.16}
\]
It remains to note that $P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\} = 0$ if $m_1 +
2m_2 + \cdots + nm_n \ne n$ and that equality (1.3.15) from Lemma 1.3.6 holds for the
probability $P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}$ if $m_1 + \cdots + m_n = N$ and
$m_1 + 2m_2 + \cdots + nm_n = n$. The substitution of (1.3.15) into (1.3.16) proves
Lemma 1.3.7.
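For graphs whose components are directed cycles, that is, for permutations with $a_n = n!$ and $b_r = (r-1)!$, the formula of Lemma 1.3.7 can be compared with a direct count of cycle types. The sketch below enumerates all permutations of five elements; the function names and the choice $n = 5$ are assumptions of the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Return (m_1, ..., m_n), where m_r is the number of cycles of length r.

    `perm` lists the images of 0, ..., n-1.
    """
    n, seen, counts = len(perm), set(), [0] * len(perm)
    for start in range(n):
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:          # walk once around the cycle through `start`
            seen.add(v)
            v = perm[v]
            length += 1
        counts[length - 1] += 1
    return tuple(counts)

def lemma_probability(ms, b, a_n):
    """(n!/a_n) * prod_r b(r)^{m_r} / (m_r! (r!)^{m_r}), assuming sum of r*m_r equals n."""
    n = len(ms)
    value = Fraction(factorial(n), a_n)
    for r, m in enumerate(ms, start=1):
        value *= Fraction(b(r) ** m, factorial(m) * factorial(r) ** m)
    return value

n = 5
observed = Counter(cycle_type(p) for p in permutations(range(n)))
```

In this case the product simplifies to $\prod_r 1/(m_r!\, r^{m_r})$, the classical formula for the distribution of the cycle type of a random permutation.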
We now turn to some examples.
Example 1.3.1. The set Sn of one-to-one mappings corresponds to the set Fn (R)
of graphs with n vertices for which we have the property R: Graphs are directed
with exactly one arc entering each vertex and exactly one arc emanating from
each vertex. This property is decomposable. The connected components of such a
graph are (directed) cycles. In this case, $a_n = n!$, $b_n = (n-1)!$, and the generating functions
$$A(x) = \frac{1}{1-x}, \qquad B(x) = -\log(1-x)$$
satisfy the relation of Lemma 1.3.2:
$$A(x) = e^{B(x)}. \tag{1.3.17}$$
To study the lengths of cycles of a random permutation and the associated variables, one can use the generalized scheme of allocating particles in which the random variables $\xi_1, \ldots, \xi_N$ have the distribution
$$P\{\xi_1 = k\} = -\frac{x^k}{k \log(1-x)}, \qquad k = 1, 2, \ldots, \quad 0 < x < 1.$$
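As a small illustration of how the joint distribution of component counts specializes to permutations, the following sketch compares the product formula $\frac{n!}{a_n}\prod_r b_r^{m_r}/(m_r!\,(r!)^{m_r})$, taken with $a_n = n!$ and $b_r = (r-1)!$, against a brute-force count of cycle types. The function names are illustrative only and are not taken from the book.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Return (m_1, ..., m_n), where m_r is the number of cycles of length r."""
    n = len(perm)
    seen = [False] * n
    counts = [0] * n
    for start in range(n):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:          # walk along the cycle containing `start`
                seen[j] = True
                j = perm[j]
                length += 1
            counts[length - 1] += 1
    return tuple(counts)

def lemma_probability(m):
    """Product formula with a_n = n! and b_r = (r-1)! (the permutation case)."""
    n = sum(r * m_r for r, m_r in enumerate(m, start=1))
    a_n = factorial(n)
    prob = Fraction(factorial(n), a_n)
    for r, m_r in enumerate(m, start=1):
        b_r = factorial(r - 1)
        prob *= Fraction(b_r ** m_r, factorial(m_r) * factorial(r) ** m_r)
    return prob

def empirical_distribution(n):
    """Exact distribution of the cycle type of a uniform random permutation."""
    counts = Counter(cycle_type(p) for p in permutations(range(n)))
    return {m: Fraction(c, factorial(n)) for m, c in counts.items()}
```

For permutations the product collapses to $\prod_r 1/(r^{m_r} m_r!)$, so, for example, a uniform permutation of four elements is a single 4-cycle with probability 1/4.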
Example 1.3.2. The set $\Sigma_n$ of all single-valued mappings corresponds to the set $F_n(R)$ of graphs with $n$ vertices with property $R$: The graphs are directed with exactly one arc emanating from each vertex. This property is decomposable. Since the number of elements of $\Sigma_n$ is $n^n$, from relation (1.3.17) for the generating functions we find that
$$B(x) = \log A(x) = \log \sum_{n=0}^{\infty} \frac{n^n x^n}{n!},$$
yielding
$$b_n = (n-1)! \sum_{k=0}^{n-1} \frac{n^k}{k!}.$$
The radius of convergence of $A(x)$ and $B(x)$ is $e^{-1}$, and at the point $x = e^{-1}$, they diverge.
To study the characteristics of a random mapping, we can use the generalized scheme of allocating particles in which the random variables $\xi_1, \ldots, \xi_N$ have the distribution
$$P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)}, \qquad k = 1, 2, \ldots, \quad 0 < x < e^{-1}.$$
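The relation $A(x) = e^{B(x)}$ can be checked on this example with exact integer arithmetic. The sketch below assumes the standard coefficient form of that relation, $a_n = \sum_{k=1}^{n} \binom{n-1}{k-1} b_k a_{n-k}$ with $a_0 = 1$, solves it for $b_n$ given $a_n = n^n$, and compares the result with the closed form $b_n = (n-1)!\sum_{k=0}^{n-1} n^k/k!$. The helper names are hypothetical.

```python
from math import comb, factorial

def connected_from_total(a):
    """Solve a_n = sum_{k=1}^{n} C(n-1, k-1) * b_k * a_{n-k} (with a_0 = 1) for b_1, ..., b_n."""
    n_max = len(a) - 1
    b = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # The k = n term of the sum equals b_n * a_0 = b_n, so isolate it.
        rest = sum(comb(n - 1, k - 1) * b[k] * a[n - k] for k in range(1, n))
        b[n] = a[n] - rest
    return b

def closed_form(n):
    """b_n = (n-1)! * sum_{k=0}^{n-1} n^k / k!, evaluated with integers only."""
    return sum(factorial(n - 1) // factorial(k) * n ** k for k in range(n))
```

With `a = [n ** n for n in range(9)]`, the first values returned are 1, 3, 17, 142, which are the numbers of connected mapping graphs on 1, 2, 3, 4 vertices.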
Example 1.3.3. Consider the set of all unordered partitions of the set $X_n = \{1, 2, \ldots, n\}$ into disjoint subsets, the union of which is $X_n$. The partition of $X_n$ into unordered subsets $Y_1, \ldots, Y_N$ corresponds to the hypergraph of $F_{n,N}(R)$ with $n$ vertices and $N$ hyperedges $Y_1, \ldots, Y_N$. Since all of the $N!$ orderings of the hyperedges $Y_1, \ldots, Y_N$ are distinct, each hypergraph of $F_{n,N}(R)$ gives us $N!$ distinct hypergraphs with $n$ vertices and $N$ ordered hyperedges $A_1, \ldots, A_N$, with the sets of hyperedges being permutations of $Y_1, \ldots, Y_N$. The property $R$ determining this class of graphs requires that a graph be a hypergraph whose distinct hyperedges have no common vertices. Each connected component of such a graph is a hyperedge. Clearly, the number of connected graphs possessing the property $R$ with $n$ vertices is 1, that is, $b_n = 1$, so
$$B(x) = \sum_{n=1}^{\infty} \frac{x^n}{n!} = e^x - 1.$$
Since $R$ is decomposable,
$$A(x) = e^{B(x)} = \exp\{e^x - 1\}.$$
This equality, or (1.3.3), yields
$$a_n = \sum_{N=1}^{n} \frac{1}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{n!}{n_1! \cdots n_N!},$$
where the second summation is over positive integers $n_1, \ldots, n_N$.
Thus, to study random partitions, we can use the generalized scheme of allocation in which the random variables $\xi_1, \ldots, \xi_N$ have the truncated Poisson distribution
$$P\{\xi_1 = k\} = \frac{x^k}{k!\,(e^x - 1)}, \qquad k = 1, 2, \ldots, \quad 0 < x < \infty.$$
Example 1.3.4. A tree is a connected graph without cycles. As the set $F_{n,N}(R)$ let us consider the set $\mathcal{F}_{n,N}$ of all forests consisting of $N$ trees with the total number $n$ of labeled vertices. The trees in a forest are not ordered. The property $R$ determining this class of graphs requires that a graph be undirected without cycles. The property $R$ is decomposable. The number $b_n$ of connected graphs possessing the property $R$ is the number of nonrooted trees with $n$ vertices, and $b_n = n^{n-2}$, so the generating function is
$$B(x) = \sum_{n=1}^{\infty} \frac{n^{n-2} x^n}{n!}, \qquad 0 < x \le e^{-1}.$$
Thus, to study a random forest from $\mathcal{F}_{n,N}$, we can use the generalized scheme of allocation in which the random variables $\xi_1, \ldots, \xi_N$ have the distribution
$$P\{\xi_1 = k\} = \frac{k^{k-2} x^k}{k!\, B(x)}, \qquad k = 1, 2, \ldots, \quad 0 < x \le e^{-1}.$$
1.4. Forests of nonrooted trees
The graphs consisting of nonrooted trees and unicyclic components play the same
role in investigating graphs as the forests of rooted trees do for graphs of mappings.
Hence, the following sections concentrate on these objects, using the generalized
scheme of allocation.
As in Example 1.3.4, let $\mathcal{F}_{n,N}$ be the set of all forests of $N$ nonrooted trees with $n$ vertices. It is known that the number of forests of $N$ ordered rooted trees with total number $n$ of nonroot vertices is $N(N+n)^{n-1}$. In contrast to the forests of rooted trees, there is no simple formula for the number $F_{n,N} = |\mathcal{F}_{n,N}|$ of forests of nonrooted trees. Therefore the first step is to study the asymptotic behavior of $F_{n,N}$.
Denote by $T$ the number of edges in a forest belonging to $\mathcal{F}_{n,N}$. It is easy to see that $T = n - N$. Following the general algorithm for applying the generalized scheme of allocation, let us consider the set of forests with $n$ vertices that consist of $N$ ordered nonrooted trees, and define the uniform distribution on this set. Denote by $\eta_1, \ldots, \eta_N$ the sizes of the ordered trees in a random forest from this set. By Cayley's formula for counting trees, the number $b_n$ of nonrooted trees with $n$ vertices is $n^{n-2}$. Denote by $a_n(n_1, \ldots, n_N)$ the number of ordered forests for which $\{\eta_1 = n_1, \ldots, \eta_N = n_N\}$. It is easy to see that for positive integers $n_1, \ldots, n_N$ such that $n_1 + \cdots + n_N = n$,
$$a_n(n_1, \ldots, n_N) = \frac{n!}{n_1! \cdots n_N!}\, b_{n_1} \cdots b_{n_N}, \tag{1.4.1}$$
and the number of ordered forests is
$$\sum_{n_1 + \cdots + n_N = n} a_n(n_1, \ldots, n_N) = \sum_{n_1 + \cdots + n_N = n} \frac{n!\, b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}.$$
Thus, for the number of forests $F_{n,N}$, we obtain the formula
$$F_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{n_1^{n_1-2} \cdots n_N^{n_N-2}}{n_1! \cdots n_N!}, \tag{1.4.2}$$
where the summation is over positive integers $n_1, \ldots, n_N$ such that $n_1 + \cdots + n_N = n$.
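For small $n$ the sum over compositions can be evaluated directly. The sketch below computes $F_{n,N}$ in two ways: from the composition sum $\frac{n!}{N!}\sum \prod_i n_i^{n_i-2}/n_i!$, and from a recurrence that splits off the tree containing one fixed vertex. The recurrence is a standard counting argument used here only as a cross-check; it is not taken from the text, and the function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

def cayley(k):
    """Number of nonrooted trees on k labeled vertices, k^(k-2)."""
    return 1 if k <= 2 else k ** (k - 2)

def compositions(n, parts):
    """Yield all tuples of `parts` positive integers that sum to n."""
    if parts == 1:
        if n >= 1:
            yield (n,)
        return
    for first in range(1, n - parts + 2):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest

def forests_by_sum(n, N):
    """F_{n,N} as (n!/N!) times the sum over compositions of prod n_i^(n_i-2)/n_i!."""
    total = Fraction(0)
    for sizes in compositions(n, N):
        term = Fraction(1)
        for s in sizes:
            term *= Fraction(cayley(s), factorial(s))
        total += term
    value = Fraction(factorial(n), factorial(N)) * total
    assert value.denominator == 1      # the count must be an integer
    return int(value)

@lru_cache(maxsize=None)
def forests_by_recurrence(n, N):
    """Split off the tree containing vertex n: k vertices, k-1 of them chosen freely."""
    if N == 0:
        return 1 if n == 0 else 0
    if n < N:
        return 0
    return sum(comb(n - 1, k - 1) * cayley(k) * forests_by_recurrence(n - k, N - 1)
               for k in range(1, n - N + 2))
```

For instance, both functions give $F_{4,2} = 15$: any two of the six possible edges on four vertices form a forest.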
Introduce independent identically distributed random variables $\xi_1, \ldots, \xi_N$ for which
$$P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)} = \frac{k^{k-2} x^k}{k!\, B(x)}, \qquad k = 1, 2, \ldots, \tag{1.4.3}$$
where
$$B(x) = \sum_{k=1}^{\infty} \frac{b_k x^k}{k!} = \sum_{k=1}^{\infty} \frac{k^{k-2} x^k}{k!}, \qquad 0 < x \le e^{-1}. \tag{1.4.4}$$
In accordance with the results of the previous section and Example 1.3.4, the generalized scheme of allocation can be applied to investigating random forests of nonrooted trees, that is, relation (1.2.2) is valid: For any integers $n_1, \ldots, n_N$,
$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}.$$
For the number of forests $F_{n,N}$, formula (1.3.8) is valid, which, of course, can be obtained directly from (1.4.2) and (1.4.3):
$$F_{n,N} = \frac{n!\,(B(x))^N}{N!\, x^n}\, P\{\xi_1 + \cdots + \xi_N = n\}, \tag{1.4.5}$$
where $B(x)$ is defined by (1.4.4), and the value of the parameter $x$ in the distribution (1.4.3) of the random variables $\xi_1, \ldots, \xi_N$ can be chosen arbitrarily from the domain of convergence of the series $B(x)$.
Thus, to obtain the asymptotics of $F_{n,N}$, it is sufficient to choose an appropriate value of $x$, $0 < x \le e^{-1}$, and analyze the asymptotic behavior of the probability $P\{\xi_1 + \cdots + \xi_N = n\}$ for the sum of the random variables $\xi_1, \ldots, \xi_N$ that have the distribution (1.4.3) with the chosen value of the parameter $x$.
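The freedom in choosing $x$ can be seen numerically. The sketch below evaluates $\frac{n!\,(B(x))^N}{N!\,x^n} P\{\xi_1 + \cdots + \xi_N = n\}$ in floating point for several parameter values. It assumes the parameterization $x = \theta e^{-\theta}$ with $B(x) = \theta(2-\theta)/2$, which the section derives later, and computes the probability by repeated convolution. The function names are hypothetical.

```python
from math import exp, factorial

def cayley(k):
    """Number of nonrooted trees on k labeled vertices, k^(k-2)."""
    return 1 if k <= 2 else k ** (k - 2)

def forests_via_scheme(n, N, theta):
    """n! B^N / (N! x^n) * P{xi_1 + ... + xi_N = n} with x = theta * exp(-theta)."""
    x = theta * exp(-theta)
    B = theta * (2 - theta) / 2          # assumed closed form of B(x), 0 < theta <= 1
    # p[k] = P{xi = k}; only k <= n can contribute to the event {sum = n}
    p = [0.0] + [cayley(k) * x ** k / (factorial(k) * B) for k in range(1, n + 1)]
    dist = [1.0] + [0.0] * n             # distribution of the empty sum
    for _ in range(N):                   # add one summand at a time
        new = [0.0] * (n + 1)
        for s, ps in enumerate(dist):
            if ps:
                for k in range(1, n - s + 1):
                    new[s + k] += ps * p[k]
        dist = new
    return factorial(n) * B ** N / (factorial(N) * x ** n) * dist[n]
```

Up to rounding, `forests_via_scheme(6, 3, theta)` returns the same value, 435, for every admissible `theta`; this is the number of forests on six labeled vertices with three trees (all 455 triples of edges minus the 20 triangles).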
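The first two moments of the random variable $\xi_1$ have the following expressions:
$$E\xi_1 = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^{k-1} x^k}{k!}, \qquad E\xi_1^2 = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^k x^k}{k!}.$$
Therefore, along with $B(x)$, we consider two functions
$$\theta(x) = \sum_{k=1}^{\infty} \frac{k^{k-1} x^k}{k!}, \qquad a(x) = \sum_{k=1}^{\infty} \frac{k^k x^k}{k!}.$$
The function $\theta(x)$ is the solution of the equation
$$\theta e^{-\theta} = x \tag{1.4.6}$$
if we choose the solution that is less than 1.
The functions $a(x)$ and $B(x)$ can be represented in terms of this function. Differentiating (1.4.6) gives
$$\theta'(x) e^{-\theta(x)} - \theta(x)\theta'(x) e^{-\theta(x)} = 1;$$
hence,
$$\theta'(x) = \frac{\theta(x)}{x(1 - \theta(x))}. \tag{1.4.7}$$
On the other hand,
$$\theta'(x) = \sum_{k=1}^{\infty} \frac{k^k x^{k-1}}{k!} = \frac{a(x)}{x}.$$
Thus
$$a(x) = \frac{\theta(x)}{1 - \theta(x)}. \tag{1.4.8}$$
Slightly more complicated calculations are needed to obtain the relation
$$B(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta(x))^2\bigr). \tag{1.4.9}$$
Consider the function
$$h(x) = (1 - \theta(x))^2.$$
By using (1.4.7), we obtain
$$h'(x) = -2(1 - \theta(x))\theta'(x) = -\frac{2\theta(x)}{x} = -2 \sum_{k=1}^{\infty} \frac{k^{k-1} x^{k-1}}{k!}.$$
When we integrate both sides of this equality, we obtain
$$\int_0^x h'(t)\,dt = h(x) - 1 = -2 \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!} \int_0^x t^{k-1}\,dt = -2 \sum_{k=1}^{\infty} \frac{k^{k-2} x^k}{k!} = -2B(x),$$
which implies equality (1.4.9).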
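Relations (1.4.8) and (1.4.9) allow us to calculate the mean $E\xi_1$ and the variance $D\xi_1$. For $0 < \theta < 1$, we set
$$x = \theta e^{-\theta}.$$
For such a choice of the parameter $x$,
$$\theta(x) = \theta, \qquad a(x) = \frac{\theta}{1-\theta}, \qquad B(x) = \frac{\theta(2-\theta)}{2};$$
therefore
$$m = E\xi_1 = \frac{\theta(x)}{B(x)} = \frac{2}{2-\theta},$$
$$\sigma^2 = D\xi_1 = \frac{a(x)}{B(x)} - \left(\frac{\theta(x)}{B(x)}\right)^2 = \frac{2\theta}{(1-\theta)(2-\theta)^2}.$$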
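If the parameter $\theta$ is fixed, then Theorem 1.1.11 may be applied to the sum $\zeta_N = \xi_1 + \cdots + \xi_N$. In fact, the theorem on local convergence to the normal law is valid in a wider region.
Theorem 1.4.1. If $N \to \infty$ and $\theta = \theta(N)$ varies such that $\theta N \to \infty$ and $(1-\theta)^3 N \to \infty$, then
$$P\{\zeta_N = k\} = \frac{1}{\sigma\sqrt{2\pi N}}\, e^{-u^2/2}(1 + o(1))$$
uniformly in the integers $k$ such that $u = (k - Nm)/(\sigma\sqrt{N})$ lies in any fixed finite interval.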
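A quick numerical experiment illustrates the local normal approximation $P\{\zeta_N = k\} \approx \frac{1}{\sigma\sqrt{2\pi N}} e^{-u^2/2}$. The sketch computes the exact probability by convolving the distribution of $\xi_1 - 1 \ge 0$, which keeps the arrays short, and compares it with the normal density. The parameters $\theta = 0.5$, $N = 300$ are arbitrary illustrative choices, and the function names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def pmf(theta, k_max):
    """P{xi_1 = k}, k = 1..k_max, for x = theta*exp(-theta) and B(x) = theta(2-theta)/2."""
    B = theta * (2 - theta) / 2
    log_x = log(theta) - theta
    return [exp((k - 2) * log(k) + k * log_x - lgamma(k + 1)) / B
            for k in range(1, k_max + 1)]

def prob_sum(theta, N, n):
    """P{xi_1 + ... + xi_N = n}, via the shifted variables xi_i - 1 >= 0."""
    excess = n - N
    q = pmf(theta, excess + 1)           # q[j] = P{xi_1 - 1 = j}
    dist = [1.0] + [0.0] * excess
    for _ in range(N):
        new = [0.0] * (excess + 1)
        for s, ps in enumerate(dist):
            if ps > 0.0:
                for j in range(excess - s + 1):
                    new[s + j] += ps * q[j]
        dist = new
    return dist[excess]

def normal_approximation(theta, N, n):
    """Local normal approximation with m = 2/(2-theta), sigma^2 = 2theta/((1-theta)(2-theta)^2)."""
    m = 2 / (2 - theta)
    sigma = sqrt(2 * theta / ((1 - theta) * (2 - theta) ** 2))
    u = (n - N * m) / (sigma * sqrt(N))
    return exp(-u * u / 2) / (sigma * sqrt(2 * pi * N))
```

With $\theta = 0.5$, $N = 300$ and $n = 400 = Nm$, the ratio `prob_sum(0.5, 300, 400) / normal_approximation(0.5, 300, 400)` should be close to 1, in line with the theorem.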
Proof. First we prove that, under the conditions of the theorem, the distribution of $(\zeta_N - mN)/(\sigma\sqrt{N})$ converges weakly to the normal distribution with parameters $(0, 1)$. According to Theorem 1.1.9, it is sufficient to demonstrate convergence of the corresponding characteristic function $\varphi_N(t)$ to the characteristic function $e^{-t^2/2}$ of the standard normal distribution.
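The characteristic function of $\xi_1$ equals
$$\varphi(t) = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^{k-2} x^k e^{itk}}{k!} = \frac{B(xe^{it})}{B(x)}.$$
By virtue of (1.4.7), (1.4.8), and (1.4.9),
$$B(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta(x))^2\bigr),$$
$$xB'(x) = \theta(x),$$
$$x^2 B''(x) = \theta^2(x)(1 - \theta(x))^{-1},$$
$$x^3 B'''(x) = \theta^3(x)(3 - 2\theta(x))(1 - \theta(x))^{-3}.$$
Therefore
$$\varphi'(t) = \frac{i\theta(xe^{it})}{B(x)},$$
$$\varphi''(t) = -\frac{\theta(xe^{it})}{B(x)(1 - \theta(xe^{it}))}, \tag{1.4.10}$$
$$\varphi'''(t) = -\frac{i\theta(xe^{it})}{B(x)(1 - \theta(xe^{it}))^3}.$$
For $x = \theta e^{-\theta}$,
$$\theta(x) = \theta, \qquad B(x) = \frac{\theta(2-\theta)}{2}.$$
Denote by $\psi(t)$ the characteristic function of the centered random variable $\xi_1 - \theta(x)/B(x)$. Then
$$\psi'(0) = 0, \qquad \psi''(0) = -\sigma^2 = -\frac{2\theta}{(1-\theta)(2-\theta)^2}. \tag{1.4.11}$$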
Let
It is not difficult to check that
= 2i6{xeit)B32(xeit)-e(xeit)-2)
(l-e(xeit)KB-e(xeit)J
Therefore, if x = #e~^, then there exists a constant c such that
and
= exp {-
+ O
e\t\-
A.4.12)
The characteristic function $\varphi_N(t)$ of the random variable $(\zeta_N - mN)/(\sigma\sqrt{N})$ satisfies the equality $\varphi_N(t) = \psi^N(t/(\sigma\sqrt{N}))$; hence, for any fixed $t$, as $N \to \infty$,
,2
= exp ¦
tA
1
A.4.13)
The conditions of the theorem specify that $N\theta \to \infty$, $N(1-\theta)^3 \to \infty$; hence, for any fixed $t$, as $N \to \infty$,
$$\varphi_N(t) \to e^{-t^2/2},$$
and the distribution of $(\zeta_N - mN)/(\sigma\sqrt{N})$ converges weakly to the standard normal law.
To prove the local convergence of these distributions, we need additional estimates of the characteristic function $\varphi(t)$. It is reasonable to assume that the local theorem is valid in the same regions as the integral theorem proved above, but the necessary estimates are complicated to find, and therefore we restrict ourselves to a proof of the local theorem only in the case where $\theta \le \theta_0 < 1$ and $\theta N \to \infty$.
From (1.4.12), it follows that there exists $\varepsilon > 0$ such that for $|t| \le \varepsilon$ and $\theta \le \theta_0 < 1$,
$$|\psi(t)| \le e^{-c\sigma^2 t^2}. \tag{1.4.14}$$
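We now show that for any $\varepsilon$, $0 < \varepsilon < \pi$, there exists a positive constant $c$ such that for $\varepsilon \le |t| \le \pi$ and $0 < \theta \le 1$,
$$|\varphi(t)| \le e^{-c\theta}. \tag{1.4.15}$$
If $\theta \to 0$, then
$$x = \theta e^{-\theta} = \theta - \theta^2 + O(\theta^3),$$
$$\varphi(t) = \frac{B(xe^{it})}{B(x)} = \frac{xe^{it} + x^2 e^{2it}/2 + O(\theta^3)}{\theta(1 - \theta/2)} = e^{it} + (e^{2it} - e^{it})\theta/2 + O(\theta^2).$$
Now
$$|\varphi(t)|^2 = 1 - 2\theta \sin^2(t/2) + O(\theta^2)$$
as $\theta \to 0$ uniformly in $t$; therefore, for $\varepsilon \le |t| \le \pi$ there exist $\delta > 0$ and $c_1 > 0$ such that
$$|\varphi(t)| \le e^{-c_1\theta} \tag{1.4.16}$$
for $\theta \le \delta$.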
For any $\theta$, $0 < \theta \le 1$, the distribution of $\xi_1$ has maximal span 1 and $\varphi(t)$ is continuous in $t$ and $\theta$ in the region
$$B = \{(t, \theta) : \varepsilon \le |t| \le \pi,\ 0 < \delta \le \theta \le 1\}.$$
Therefore
$$q = \sup_{B} |\varphi(t)| < 1,$$
and there exists $c_2 > 0$ such that
$$|\varphi(t)| \le e^{-c_2\theta} \tag{1.4.17}$$
for $(t, \theta) \in B$.
This estimate and (1.4.16) imply (1.4.15).
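Proving the local theorem, we follow the proof of Theorem 1.1.11 as a model for similar proofs. We set
$$u = \frac{k - mN}{\sigma\sqrt{N}}$$
and represent the difference
$$R_N = 2\pi\left(\sigma\sqrt{N}\, P\{\zeta_N = k\} - \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\right)$$
as a sum of the following four integrals:
$$I_1 = \int_{-A}^{A} e^{-itu}\bigl(\psi^N(t/(\sigma\sqrt{N})) - e^{-t^2/2}\bigr)\,dt,$$
$$I_2 = -\int_{A < |t|} e^{-itu - t^2/2}\,dt,$$
$$I_3 = \int_{A < |t| \le \varepsilon\sigma\sqrt{N}} e^{-itu}\, \psi^N(t/(\sigma\sqrt{N}))\,dt,$$
$$I_4 = \int_{\varepsilon\sigma\sqrt{N} < |t| \le \pi\sigma\sqrt{N}} e^{-itu}\, \psi^N(t/(\sigma\sqrt{N}))\,dt,$$
where the constants $A$ and $\varepsilon$ will be chosen later.
To see that $R_N \to 0$ as $N \to \infty$, we show that $R_N$ can be made arbitrarily small by the choice of $\varepsilon$, $A$, and $N$. It is clear that
$$|I_2| \le \int_{A < |t|} e^{-t^2/2}\,dt,$$
and $|I_2|$ can be made arbitrarily small by choosing a sufficiently large $A$.
Choose $\varepsilon > 0$ such that estimate (1.4.14) is fulfilled. Then, for $\theta \le \theta_0 < 1$ and $|t| \le \varepsilon\sigma\sqrt{N}$,
$$|\psi(t/(\sigma\sqrt{N}))| \le e^{-ct^2/N},$$
so that
$$|I_3| \le \int_{A < |t| \le \varepsilon\sigma\sqrt{N}} |\psi(t/(\sigma\sqrt{N}))|^N\,dt \le \int_{A < |t|} e^{-ct^2}\,dt,$$
and $|I_3|$ can be made arbitrarily small by the choice of sufficiently large $A$.
For fixed $A$, the integral $I_1$ tends to zero because $\varphi_N(t) = \psi^N(t/(\sigma\sqrt{N})) \to e^{-t^2/2}$ uniformly with respect to $t$ in any finite interval.
Finally, with the help of estimate (1.4.17), we obtain that as $N \to \infty$,
$$|I_4| \le \int_{\varepsilon\sigma\sqrt{N} < |t| \le \pi\sigma\sqrt{N}} |\psi(t/(\sigma\sqrt{N}))|^N\,dt = \sigma\sqrt{N} \int_{\varepsilon < |t| \le \pi} |\varphi(t)|^N\,dt \le 2\pi\sigma\sqrt{N}\, e^{-c\theta N} \to 0.$$
□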
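Denote by $p(u; \alpha, \beta)$ the density of the stable law with parameters $\alpha$ and $\beta$ in Zolotarev's parameterization (see [60]). If $\alpha \ne 1$, the characteristic function $f(t)$ of this distribution can be represented in the form
$$f(t) = \exp\left\{-|t|^{\alpha} \exp\left\{-i\,\frac{\pi}{2}\, K(\alpha)\,\beta\, \frac{t}{|t|}\right\}\right\},$$
where $K(\alpha) = 1 - |1 - \alpha|$. By the inversion formula,
$$p(u; \alpha, \beta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu} \exp\left\{-|t|^{\alpha} \exp\left\{-i\,\frac{\pi}{2}\, K(\alpha)\,\beta\, \frac{t}{|t|}\right\}\right\}dt. \tag{1.4.18}$$
If $N \to \infty$ and $\theta = 1$, then the distribution of $(\zeta_N - 2N)/(bN^{2/3})$, where $b = 2(2/3)^{2/3}$, is approximated by the stable distribution with parameters $\alpha = 3/2$, $\beta = -1$.
Theorem 1.4.2. If $N \to \infty$, $\theta = 1$, $b = 2(2/3)^{2/3}$, then
$$bN^{2/3}\, P\{\zeta_N = n\} = p(u; 3/2, -1)(1 + o(1))$$
uniformly in the integers $n$ such that $u = (n - 2N)/(bN^{2/3})$ lies in any fixed finite interval.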
Proof. The terms of the sum $\zeta_N = \xi_1 + \cdots + \xi_N$ are independent identically distributed random variables, and for $\theta = 1$,
$$P\{\xi_1 = k\} = \frac{2k^{k-2} e^{-k}}{k!}, \qquad k = 1, 2, \ldots, \tag{1.4.19}$$
and $E\xi_1 = 2$, since $\theta(e^{-1}) = 1$ and $B(e^{-1}) = 1/2$. The maximal span of the distribution is 1; therefore, by Theorem 1.1.10, it suffices to prove that the distribution of $(\zeta_N - 2N)/(bN^{2/3})$ converges weakly to the stable law given in the theorem.
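In addition to $\theta(x)$, $a(x)$, and $B(x)$ defined above, we consider the function
$$C(z) = \sum_{k=1}^{\infty} \frac{k^{k-3} z^k}{k!}.$$
This can be expressed in terms of $\theta(z)$: Let
$$g(z) = (1 - \theta(z))^3.$$
By using the equalities
$$z\theta'(z) = \frac{\theta(z)}{1 - \theta(z)}$$
and
$$\theta^2(z) = 2\theta(z) - 2B(z),$$
we easily obtain
$$zg'(z) = -3\theta(z) + 3\theta^2(z) = 3\theta(z) - 6B(z).$$
Integration then gives
$$\int_0^z g'(u)\,du = 3\int_0^z \frac{\theta(u)}{u}\,du - 6\int_0^z \frac{B(u)}{u}\,du = 3B(z) - 6C(z).$$
Expressing $B(z)$ in terms of $\theta(z)$ demonstrates that, for $|z| \le e^{-1}$,
$$C(z) = \tfrac{5}{12} - \tfrac{1}{4}(1 - \theta(z))^2 - \tfrac{1}{6}(1 - \theta(z))^3.$$
Since $\theta(e^{-1}) = 1$, we find that $C(e^{-1}) = 5/12$.
Set
$$u(z) = 1 - \theta(z), \qquad v(z) = C(z) - C(e^{-1}).$$
We have shown that
$$v(z) = -\tfrac{1}{4}u^2(z) - \tfrac{1}{6}u^3(z).$$
If we invert this expression, we obtain two formal solutions
$$u(z) = \pm 2i\sqrt{v(z)} + \tfrac{4}{3}v(z) + O(|v(z)|^{3/2});$$
since $u(x) > 0$ and $v(x) < 0$ for $0 < x < e^{-1}$, we choose the solution
$$u(z) = -2i\sqrt{v(z)} + \tfrac{4}{3}v(z) + O(|v(z)|^{3/2}). \tag{1.4.20}$$
Hence,
$$(1 - \theta(z))^2 = u^2(z) = -4v(z) - \tfrac{16i}{3}(v(z))^{3/2} + O(|v(z)|^2). \tag{1.4.21}$$
The first two derivatives of $C(z)$ are
$$C'(z) = \sum_{k=1}^{\infty} \frac{k^{k-2} z^{k-1}}{k!} = \frac{B(z)}{z}, \qquad C''(z) = \frac{\theta(z) - B(z)}{z^2} = \frac{\theta^2(z)}{2z^2}.$$
Therefore, for real $t$,
$$C(e^{-1+it}) - C(e^{-1}) = it/2 + O(t^2). \tag{1.4.22}$$
Now we find an expression of the characteristic function $\varphi(t)$ of the random variable $\xi_1$ with distribution (1.4.19). It is clear that
$$\varphi(t) = \frac{B(e^{-1+it})}{B(e^{-1})} = 2B(e^{-1+it}).$$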
From (1.4.20), (1.4.21), and (1.4.22), we find that for $z = e^{-1+it}$,
$$\varphi(t) = 1 - (1 - \theta(z))^2 = 1 + 4v(z) + \tfrac{16i}{3}(v(z))^{3/2} + O(|v(z)|^2)$$
/73/2
/^J +O(t2),
where b = 2B/3J/3. By virtue of the equality
.(it\V1 \int
we can rewrite the last relation as
Since
e~2it = \-
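as $t \to 0$, we find that
$$\psi(t) = e^{-2it}\varphi(t) = 1 - |bt|^{3/2} \exp\left\{\frac{\pi i}{4}\,\frac{t}{|t|}\right\} + O(t^2).$$
The characteristic function of the random variable $(\zeta_N - 2N)/(bN^{2/3})$ is
$$\psi^N\!\left(\frac{t}{bN^{2/3}}\right)$$
and converges to
$$f(t) = \exp\left\{-|t|^{3/2} \exp\left\{\frac{\pi i}{4}\,\frac{t}{|t|}\right\}\right\}$$
at any fixed $t$. The function $f(t)$ is the characteristic function of the stable law $p(u; \alpha, \beta)$ with parameters $\alpha = 3/2$, $\beta = -1$. Therefore, according to Theorem 1.1.10, as $N \to \infty$,
$$bN^{2/3}\, P\{\zeta_N = k\} - p(u; 3/2, -1) \to 0$$
uniformly in $k$, where $u = (k - 2N)/(bN^{2/3})$. The function $p(x; 3/2, -1)$ is positive for any $x$; hence,
$$bN^{2/3}\, P\{\zeta_N = k\} = p(u; 3/2, -1)(1 + o(1))$$
uniformly in $k$ such that $u = (k - 2N)/(bN^{2/3})$ lies in any fixed finite interval.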
We now turn to the estimate of the number of forests $F_{n,N}$ with $n$ vertices, $N$ trees, and $T = n - N$ edges. Theorems 1.4.1 and 1.4.2 allow us to estimate the number of forests.
Theorem 1.4.3. If $n \to \infty$ and $\theta = 2T/n$ varies such that $\theta N \to \infty$ and $N(1-\theta)^3 \to \infty$, then
$$F_{n,N} = \frac{n^{2T}\sqrt{1 - 2T/n}}{2^T\, T!}\,(1 + o(1)). \tag{1.4.23}$$
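Proof. Put
$$\theta = 2T/n, \qquad x = \theta e^{-\theta}. \tag{1.4.24}$$
By virtue of (1.4.5),
$$F_{n,N} = \frac{n!\,(B(x))^N}{N!\, x^n}\, P\{\zeta_N = n\}, \tag{1.4.25}$$
where the parameters are chosen so that
$$B(x) = \frac{\theta(2-\theta)}{2} = \frac{2TN}{n^2}. \tag{1.4.26}$$
Since $m = E\xi_1 = 2/(2-\theta) = n/N$, by Theorem 1.4.1,
$$P\{\zeta_N = n\} = \frac{1}{\sigma\sqrt{2\pi N}}\,(1 + o(1)), \tag{1.4.27}$$
where
$$\sigma^2 = \frac{2\theta}{(1-\theta)(2-\theta)^2} = \frac{Tn}{(1-\theta)N^2}.$$
If we substitute (1.4.24), (1.4.26), and (1.4.27) into (1.4.25) and apply Stirling's formula to the factorials, we can conclude that relation (1.4.23) holds under the conditions of the theorem.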
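The quality of the asymptotic count can be probed on moderate values of $n$. The sketch below takes the asymptotic formula in the form $F_{n,N} \approx n^{2T}\sqrt{1-\theta}/(2^T\,T!)$ with $\theta = 2T/n$ (the form obtained by substituting the normal approximation and Stirling's formula into the expression for $F_{n,N}$) and compares it with exact values from a standard recurrence on the tree containing a fixed vertex. Both the recurrence and the function names are illustrative assumptions, not the book's method.

```python
from functools import lru_cache
from math import comb, factorial, sqrt

def cayley(k):
    """Number of nonrooted trees on k labeled vertices, k^(k-2)."""
    return 1 if k <= 2 else k ** (k - 2)

@lru_cache(maxsize=None)
def forests(n, N):
    """Exact number of forests on n labeled vertices consisting of N nonrooted trees."""
    if N == 0:
        return 1 if n == 0 else 0
    if n < N:
        return 0
    # The tree containing vertex n has k vertices; choose its other k-1 vertices.
    return sum(comb(n - 1, k - 1) * cayley(k) * forests(n - k, N - 1)
               for k in range(1, n - N + 2))

def asymptotic(n, N):
    """n^(2T) * sqrt(1 - theta) / (2^T * T!) with T = n - N and theta = 2T/n < 1."""
    T = n - N
    theta = 2 * T / n
    return n ** (2 * T) * sqrt(1 - theta) / (2 ** T * factorial(T))
```

For example, with $n = 60$ and $N = 48$ (so $T = 12$, $\theta = 0.4$) the ratio `forests(60, 48) / asymptotic(60, 48)` should already be close to 1.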
Theorem 1.4.4. Ifn —> oo and IT'/'n —> 1 so that
A - 2T/n)Nl/3 -> Z?2/3u/2, -oo < u < oo,
AA28)
Proof. Under the conditions of the theorem,
n-2N
u =
thus, by Theorem 1.4.2, continuity, and positivity of the density p(u; 3/2, —1),
= n} = p(-v; 3/2, -1)A + o(l)). A.4.29)
We chose $\theta = 1$ in Theorem 1.4.2; hence, $x = e^{-1}$ and $B(x) = B(e^{-1}) = 1/2$. Having substituted these values and (1.4.29) into (1.4.25), we conclude that, under the conditions of Theorem 1.4.4,
n\
\2NN1/6 B/3J/3
p(-v,3/2, -
Although the density $p(x; 3/2, -1)$ cannot be represented in terms of simple functions, we can use the relation $p(x; \alpha, \beta) = p(-x; \alpha, -\beta)$ and the following series expansion for $x > 0$ and $1 < \alpha < 2$ for our calculations:
1 v^ „ r((« + l)/a) „ Tin ( ( l\2-a
p(x; a,P)=- T(-\)n — ^x" cos — 1 + 1 + - )
7T^Q an\ 2 \ \ n) a
1.5. Trees of given sizes in a random forest
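Let $\mu_r = \mu_r(n, N)$ be the number of trees with $r$ vertices in a random forest with $n$ labeled vertices and $N$ nonrooted trees, $r = 1, 2, \ldots$. Recall that such a forest has $T = n - N$ edges. In this section, we consider the asymptotic behavior of the random variables $\mu_r(n, N)$. Following the approach established in the previous section, we use the generalized scheme of allocation of $n$ particles to $N$ cells determined by identically distributed random variables $\xi_1, \ldots, \xi_N$ such that
$$P\{\xi_1 = k\} = \frac{2k^{k-2}\theta^{k-1}e^{-k\theta}}{k!\,(2-\theta)}, \qquad k = 1, 2, \ldots, \quad 0 < \theta < 2.$$
As we have calculated,
$$m = E\xi_1 = \frac{2}{2-\theta},$$
and for $0 < \theta < 1$,
$$\sigma^2 = D\xi_1 = \frac{2\theta}{(1-\theta)(2-\theta)^2}.$$
We will also use the notation
$$p_r = p_r(\theta) = P\{\xi_1 = r\}, \qquad \sigma_{rr}^2 = p_r\left(1 - p_r - \frac{(m-r)^2}{\sigma^2}\,p_r\right), \qquad r = 1, 2, \ldots.$$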
The random variables $\mu_r$ behave much like the corresponding variables for a random forest of rooted trees. We highlight some of these results; see [30] for a complete description. As before, let $\theta = 2T/n$. Again the value $\theta = 1$ is of particular interest, so we introduce the following notations: For $r = 1, 2, \ldots$,
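$$\pi_r = \pi_r(\theta) = \begin{cases} p_r(\theta), & 0 < \theta < 1, \\ p_r(1), & 1 \le \theta < 2, \end{cases} \qquad \tilde\sigma_{rr}^2 = \tilde\sigma_{rr}^2(\theta) = \begin{cases} \sigma_{rr}^2(\theta), & 0 < \theta < 1, \\ p_r(1)(1 - p_r(1)), & 1 \le \theta < 2. \end{cases}$$
The truncated values $\pi_r(\theta)$ and $\tilde\sigma_{rr}^2(\theta)$ allow us to summarize the rather complicated behavior of $\mu_r$, $r \ge 3$, in the following two theorems.
Theorem 1.5.1. If $n, N \to \infty$ and $r = r(n) \ge 3$ varies such that $N\pi_r(\theta) \to \infty$, then
$$P\{\mu_r = k\} = \frac{1}{\tilde\sigma_{rr}\sqrt{2\pi N}}\, e^{-u^2/2}(1 + o(1))$$
uniformly in the integers $k$ such that
$$u = \frac{k - N\pi_r(\theta)}{\tilde\sigma_{rr}\sqrt{N}}$$
lies in any fixed finite interval.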
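Theorem 1.5.2. If $n, N \to \infty$ and $r = r(n) \ge 3$ varies such that $N\pi_r(\theta) \to \lambda$ for some $\lambda$, $0 < \lambda < \infty$, then for any fixed $k = 0, 1, \ldots$,
$$P\{\mu_r = k\} = \frac{\lambda^k e^{-\lambda}}{k!}\,(1 + o(1)).$$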
The random variables $\mu_1$ and $\mu_2$, like their analogs for forests of rooted trees, have some special properties, but we will not discuss them.
When edges are added sequentially to a forest, then by Theorems 1.5.1 and 1.5.2, the asymptotic behavior of $\mu_r$ does not depend on $\theta$ if $\theta \ge 1$. If $Np_r(1) \to \infty$, then the limit distribution of $\mu_r$, with similar centering and normalizing, is the standard normal distribution for all $\theta$, $1 \le \theta < 2$.
There are similar results for the case $\theta \ge 1$ and $Np_r(1) \to \lambda$ for some $\lambda$, $0 < \lambda < \infty$, with the limit distribution of the $\mu_r$ for all $\theta$, $1 \le \theta < 2$, being the Poisson distribution with parameter $\lambda$. Thus the point $\theta = 1$ can be interpreted as a critical point in the evolution of a random forest.
We now prove Theorems 1.5.1 and 1.5.2.
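Proof of Theorems 1.5.1 and 1.5.2. According to Example 1.3.4 and Lemma 1.2.1,
$$P\{\mu_r = k\} = \binom{N}{k} p_r^k (1 - p_r)^{N-k}\, \frac{P\{\zeta_{N-k}^{(r)} = n - kr\}}{P\{\zeta_N = n\}}, \tag{1.5.1}$$
where $\zeta_N = \xi_1 + \cdots + \xi_N$, $\zeta_{N-k}^{(r)} = \xi_1^{(r)} + \cdots + \xi_{N-k}^{(r)}$, the random variables $\xi_1, \ldots, \xi_N$; $\xi_1^{(r)}, \ldots, \xi_{N-k}^{(r)}$ are independent and identically distributed,
$$p_r = P\{\xi_1 = r\}, \qquad P\{\xi_1 = k\} = \frac{k^{k-2}x^k}{k!\,B(x)}, \qquad P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \ne r\}, \qquad k = 1, 2, \ldots, \tag{1.5.2}$$
and the parameter $x$ of the distribution of $\xi_1, \ldots, \xi_N$ may be taken arbitrarily from the domain of convergence of the series $B(x)$.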
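We set $\theta = 2T/n$. It is convenient to choose $x = \theta e^{-\theta}$ for $0 < \theta < 1$ and $x = e^{-1}$ for $1 \le \theta < 2$. With these choices, (1.5.1) gives
$$P\{\mu_r = k\} = \binom{N}{k} \pi_r^k(\theta)\,(1 - \pi_r(\theta))^{N-k}\, \frac{P\{\zeta_{N-k}^{(r)} = n - kr\}}{P\{\zeta_N = n\}}, \tag{1.5.3}$$
where
$$P\{\xi_1 = k\} = \pi_k(\theta), \qquad k = 1, 2, \ldots,$$
and the distribution of $\xi_1^{(r)}$ is defined by (1.5.2).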
Reasoning by contradiction, we see that it is sufficient to prove Theorems 1.5.1 and 1.5.2 under the assumption that $\theta$ lies in any of the following three domains: first, where $N\theta \to \infty$ and $(1-\theta)^3 N \to \infty$; second, where $(1-\theta)^3 N$ is bounded by an arbitrary constant; and, third, where $(1-\theta)^3 N \to -\infty$. Negating either theorem implies the existence of a subsequence of the parameters $n$, $N$ such that $\theta$ lies in one of these three domains for which the other conditions are satisfied but for which the conclusion is false. Therefore we assume that $n, N \to \infty$ in such a way that $\theta$ lies in one of the domains and prove the assertions of Theorems 1.5.1 and 1.5.2 in the corresponding three cases.
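Consider first Theorem 1.5.1 in the first domain of $\theta$. By the de Moivre-Laplace theorem, the binomial distribution is approximated by a normal or Poisson distribution. More precisely, if $N\pi_r(\theta) \to \infty$, then
$$\binom{N}{k} \pi_r^k(\theta)(1 - \pi_r(\theta))^{N-k} = \frac{1}{\sqrt{2\pi N\pi_r(\theta)(1 - \pi_r(\theta))}}\, \exp\left\{-\frac{(k - N\pi_r(\theta))^2}{2N\pi_r(\theta)(1 - \pi_r(\theta))}\right\}(1 + o(1)) \tag{1.5.4}$$
uniformly in $k$ such that
$$\frac{(k - N\pi_r(\theta))^2}{2N\pi_r(\theta)(1 - \pi_r(\theta))}$$
lies in any fixed finite interval.
The probability $P\{\zeta_N = n\}$ from the denominator of (1.5.3) has been estimated in the previous section. Applying Theorem 1.4.1, we have for $\theta$ in the first domain,
$$P\{\zeta_N = n\} = \frac{1}{\sigma\sqrt{2\pi N}}\,(1 + o(1)), \tag{1.5.5}$$
where
$$\sigma^2 = \frac{2\theta}{(1-\theta)(2-\theta)^2}.$$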
To find the asymptotics of the numerator of A.5.3), we begin by calculating the
first and second moments:
1 1
a} = a}{9) =
where \i = E?, =2/B-9).
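A proof similar to that of Theorem 1.4.1 shows that a normal approximation is valid for the sum $\zeta_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}$. More precisely, if $n, N \to \infty$ such that $N\theta \to \infty$ and $(1-\theta)^3 N \to \infty$, then
$$P\{\zeta_N^{(r)} = s\} = \frac{1}{\sigma_r\sqrt{2\pi N}}\, \exp\left\{-\frac{(s - Nm_r)^2}{2\sigma_r^2 N}\right\}(1 + o(1)) \tag{1.5.6}$$
uniformly in $r \ge 3$ and $s$ such that $(s - Nm_r)/(\sigma_r\sqrt{N})$ lies in any fixed finite interval.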
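We now use (1.5.6) with $s = n - kr$ and $N - k$ summands to obtain an asymptotic expression for $P\{\zeta_{N-k}^{(r)} = n - kr\}$. Since
$$k = Np_r + u_r\sigma_{rr}\sqrt{N},$$
where
$$u_r = \frac{k - Np_r}{\sigma_{rr}\sqrt{N}},$$
we have
$$N - k = N(1 - p_r) - u_r\sigma_{rr}\sqrt{N} = N(1 - p_r)\left(1 - \frac{u_r\sigma_{rr}}{(1 - p_r)\sqrt{N}}\right). \tag{1.5.7}$$
It is easy to see that $\sigma_{rr}/(1 - p_r)$ is bounded, and for $u_r$ lying in any finite interval,
$$N - k = N(1 - p_r)(1 + O(N^{-1/2})). \tag{1.5.8}$$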
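The exponent in (1.5.6) may now be written as
$$-\frac{u^2}{2} = -\frac{(n - kr - (N-k)m_r)^2}{2\sigma_r^2(N - k)}.$$
Taking into account (1.5.7), (1.5.8), and the equalities
$$m_r - \mu = \frac{p_r(\mu - r)}{1 - p_r}, \qquad m_r - r = \frac{\mu - r}{1 - p_r},$$
which hold for $\theta$ in the first domain, we obtain
$$u = \frac{k(m_r - r) - N(m_r - \mu)}{\sigma_r(N - k)^{1/2}} = \frac{p_r^{1/2}(\mu - r)(k - Np_r)}{\sigma\sigma_{rr}(N - k)^{1/2}} = \frac{p_r^{1/2}(\mu - r)}{\sigma(1 - p_r)^{1/2}}\, u_r(1 + o(1)).$$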
Applying A.5.7) gives
1 2 2 2
P|^m_jc = n — kf\ = :=? ' (I t
<yr^/2TcN(\ — Pr)
A.5.9)
When we substitute A.5.4), A.5.5), and A.5.9) into A.5.3), we see that under the
conditions of Theorem 1.5.1 with 6 in the first domain, this expression transforms
into the product of an exponent and a coefficient. The coefficient of the expo-
exponent is
-Pr)
1 1
r(\ - Pr){\ - Pr{n - r
Combining the exponents from (1.5.4) and (1.5.9) yields the resulting exponent
$$-\frac{(k - Np_r)^2}{2Np_r(1 - p_r)} - \frac{p_r(\mu - r)^2(k - Np_r)^2}{2\sigma^2(1 - p_r)\sigma_{rr}^2 N} + o(1) = -\frac{(k - Np_r)^2}{2\sigma_{rr}^2 N} + o(1).$$
Thus Theorem 1.5.1 is proved for $\theta$ varying in the first domain.
Under the conditions of Theorem 1.5.2, k is fixed, and when we apply A.5.5)
and A.5.6) with the corresponding parameters, we obtain the ratio
Therefore the assertion of Theorem 1.5.2 follows from the Poisson approximation
of the first factor in A.5.3).
In the second domain, we choose the parameter of the distribution of $\xi_1, \ldots, \xi_N$ to be 1. If $Np_r(1) \to \infty$, then
$$\binom{N}{k} p_r^k(1)(1 - p_r(1))^{N-k} = \frac{1}{\sqrt{2\pi Np_r(1)(1 - p_r(1))}}\, e^{-z^2/2}(1 + o(1)) \tag{1.5.10}$$
uniformly in $k$ such that
$$z = \frac{k - Np_r(1)}{\sqrt{Np_r(1)(1 - p_r(1))}}$$
lies in any fixed finite interval.
Applying Theorem 1.4.2 gives
$$bN^{2/3}\, P\{\zeta_N = n\} = p(u; 3/2, -1)(1 + o(1)) \tag{1.5.11}$$
uniformly in $n$ such that $u = (n - 2N)/(bN^{2/3})$ lies in any fixed finite interval.
Restricting the random variables ^r\ ..., ^ does not affect their maximum
span and convergence to the stable law with density p{u\ 3/2, —1). The only
difference is that now the mean of a summand is E?j = mr(\) = 2/A - />,-(!)).
Therefore, as j —> oo,
(l-/v(DJ/3
[ (CJ - ymr(D)(l - P.(l)J/3 / - ymr(l))(l - r(l)J/3
{
uniformly in / such that
- jmr{\)){\ - pr(l)J/3
v =
bj2'3
lies in any fixed finite interval.
By substituting $N - k$ for $j$ and $n - kr$ for $l$ and recalling that
$$k = Np_r(1) + z\sqrt{Np_r(1)(1 - p_r(1))},$$
where $z$ is bounded, we have
$$bN^{2/3}\, P\{\zeta_{N-k}^{(r)} = n - kr\} = p(u; 3/2, -1)(1 + o(1)) \tag{1.5.12}$$
uniformly in $r \ge 3$, where, as in (1.5.11), $u = (n - 2N)/(bN^{2/3})$, since
$$v = \frac{n - 2N}{bN^{2/3}} + o(1).$$
Thus the asymptotics of $P\{\zeta_N = n\}$ and $P\{\zeta_{N-k}^{(r)} = n - kr\}$ is the same and their ratio in (1.5.3) tends to 1. Therefore the asymptotics of $P\{\mu_r = k\}$ is determined by the first factor and coincides with the asymptotics of the corresponding binomial probability. Theorems 1.5.1 and 1.5.2 have now been proved in the second domain.
It remains to prove the theorems for the third domain, where $(1 - 2T/n)^3 N \to -\infty$. We choose $\theta = 1$ in the distribution of the random variables $\xi_1, \ldots, \xi_N$ and prove that in (1.5.3) the ratio
$$P\{\zeta_{N-k}^{(r)} = n - kr\}/P\{\zeta_N = n\} \to 1 \tag{1.5.13}$$
uniformly in $r$, and $k = Np_r(1) + z\sqrt{Np_r(1)(1 - p_r(1))}$, where $z$ lies in any fixed finite interval.
In this case, $(1 - 2T/n)^3 N \to -\infty$, so the values $n$ for the sum $\zeta_N$ and the values $n - kr$ for the sum $\zeta_{N-k}^{(r)}$ lie in what is called the region of large deviations. Therefore we need to apply the theorem on large deviations. We will not give the proof, but the main idea is simple: If the distribution of a sum of independent identically distributed integer-valued random variables with zero mean converges to a stable law with parameter $\alpha$, $1 < \alpha < 2$, then the major contribution to a large deviation of the sum is made by only one of the summands (see [137]). Applying this theorem to the sum $\zeta_N$ gives the following result for $\theta$ in the third domain. If $n, N \to \infty$ such that $N(1 - 2T/n)^3 \to -\infty$, then
P{$N = n) = P{t;N - 2N = n - IN)
-2 = n-2N}(\+o(\))
N
A
(n —
The theorem given in [137] cannot be applied to the sum $\zeta_{N-k}^{(r)}$, since its summands become noninteger after centering by the expectation $m_r$. Britikov, using the method given in [137], along with ideas from [58] and [113], proved in [30] that the probability $P\{\zeta_{N-k}^{(r)} = n - kr\}$ has the same asymptotics as $P\{\zeta_N = n\}$. More precisely, if $n, N \to \infty$ such that $N(1 - 2T/n)^3 \to -\infty$, then
N-k = n-Jcr} = P{^ -(N- k)mr = n - kr - (N - k)mr]
= (N - k)P{^r) - mr = n - kr - (N - k)mr]
= (N - k)P{^r) =n-2N+
1/2 N
uniformly in r > 1 and k such that
lies in any fixed finite interval.
Thus the ratio in (1.5.3) tends to 1, and the asymptotics of $P\{\mu_r = k\}$ is determined by the first factor and coincides with the asymptotics of the corresponding binomial probability. This proves Theorems 1.5.1 and 1.5.2 in the third domain.
The proof of Theorems 1.5.1 and 1.5.2 is now complete. □
1.6. Maximum size of trees in a random forest
The results of the previous section give some information on the behavior of the maximum size $\eta_{(N)}$ of trees in a random forest from $\mathcal{F}_{n,N}$ with $T = n - N$ edges. Indeed, if $\theta = 2T/n \to 0$ and there exists $r = r(n, N)$ such that $Np_r(\theta) \to \infty$ and $Np_{r+1}(\theta) \to \lambda$, $0 < \lambda < \infty$, then the distribution of the number $\mu_r$ of trees of size $r$ approaches a normal distribution, and the distribution of $\mu_{r+1}$ approaches the Poisson distribution with parameter $\lambda$. This implies that the limit distribution of the random variable $\eta_{(N)}$ is concentrated on the points $r$ and $r + 1$.
If $\theta = 2T/n \to \gamma$, $\gamma > 0$, then there are infinitely many $r = r(n, N)$ such that the distribution of $\mu_r$ approaches a Poisson distribution; hence, the distribution of $\eta_{(N)}$ is scattered more and more as $\gamma$ increases. If $0 < \gamma < 1$, then the limit distribution is concentrated on a countable set of integers, whereas if $\gamma \ge 1$, then $\eta_{(N)}$ must be normalized to have a limit distribution, and the normalizing values tend to infinity at different rates, depending on the region of $\theta$.
Thus, it should be possible to prove the limit theorems for $\eta_{(N)}$ when $T/n \to 0$ by using results on $\mu_r$ from the previous section. But if $2T/n \to \gamma$ for $\gamma > 0$, this approach may not work, and even if it did, the proofs would not be simple.
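Therefore we choose instead to use the approach based on the generalized scheme of allocation. Let $\xi_1, \ldots, \xi_N$ be random variables with distribution
$$P\{\xi_1 = k\} = \frac{2k^{k-2}\theta^{k-1}e^{-k\theta}}{k!\,(2-\theta)}, \qquad 0 < \theta < 2, \tag{1.6.1}$$
where $k = 1, 2, \ldots$. We choose $\theta = 2T/n$. Then, according to Lemma 1.2.2,
$$P\{\eta_{(N)} \le r\} = (1 - P_r)^N\, \frac{P\{\zeta_N^{(r)} = n\}}{P\{\zeta_N = n\}}, \tag{1.6.2}$$
where $\zeta_N = \xi_1 + \cdots + \xi_N$, $\zeta_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}$, with $\xi_1^{(r)}, \ldots, \xi_N^{(r)}$ being independent identically distributed random variables such that
$$P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \le r\}, \qquad k = 1, \ldots, r, \tag{1.6.3}$$
and
$$P_r = P_r(\theta) = P\{\xi_1 > r\} = \sum_{k=r+1}^{\infty} p_k(\theta). \tag{1.6.4}$$
We now state the theorems that completely describe the behavior of $\eta_{(N)}$, deferring their proofs. Our procedure follows Britikov [28].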
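Theorem 1.6.1. If $n, N \to \infty$, $\theta = 2T/n \to 0$, and the integers $r = r(n, N) > 1$ vary such that $Np_r(\theta) \to \infty$ and $Np_{r+1}(\theta) \to \lambda$ for $0 \le \lambda < \infty$, then
$$P\{\eta_{(N)} = r\} \to e^{-\lambda}, \qquad P\{\eta_{(N)} = r + 1\} \to 1 - e^{-\lambda}.$$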
Note that if $\lambda \ne 0$ in the conditions of the theorem, then $Np_r(\theta) \to \infty$ without any additional requirements. In particular, the conditions of the theorem are fulfilled if
$$T/n^{(r-1)/r} \to \rho, \qquad 0 < \rho < \infty.$$
Under this condition, Theorem 1.6.1 was proved by Erdős and Rényi [37], whose well-known paper provided the only results on the behavior of $\eta_{(N)}$ until Britikov's work seventeen years later [28].
Theorem 1.6.2. Ifn, N -> 00, 6 = 2T/n -> y, 0 < y < 1, then for any fixed
integer k,
(y - 1 -logyM/2
P{r](N) ~ [a] <k}= exp -^ \^=^
>y~ ~ x)V2tt
log n - § log log n
e - 1 - log e '
ant/ [a] and {a} denote, respectively, the integer and fractional parts of a.
Theorem 1.6.3. Ifn, N ^ 00, 0 = 2T/n ->• 1, and N(l - Of ->• 00, r/zen/or
any fixed z,
where
ivgn -
a =
1"^
where P = — log^e1"^) and u is the root of the equation
Theorem 1.6.4. Ifn,N^- 00 such that Nl/3(l - 2T/n) -> v, -00 < v < 00,
then for any fixed positive z,
00
J ( \ f P{y Xl Xs> 3/2' 1} A A
Is{w,y)= / — dx{---dxs,
A = {(*!, xs): Xj >w, j = 1, ...,s},
and p(y; 3/2, — 1) is the density of the stable law with parameters a = 3/2,
Theorem 1.6.5. ffn, N —> oo, N(\ - 2T/nK -> -oo, then for any fixed z,
-oo
We will prove Theorems 1.6.1-1.6.5 with the help of relation (1.6.2). Under the conditions of Theorems 1.6.1-1.6.3,
$$\frac{P\{\zeta_N^{(r)} = n\}}{P\{\zeta_N = n\}} \to 1, \tag{1.6.5}$$
and the limit distribution of $\eta_{(N)}$ is the same as the limit distribution of the maximum of the random variables $\xi_1, \ldots, \xi_N$. Therefore we first obtain some auxiliary results on the asymptotic behavior of
$$P_r = \sum_{k=r+1}^{\infty} p_k(\theta).$$
Lemma 1.6.1. If $n, N \to \infty$, $\theta = 2T/n \to 0$, and the integers $r = r(n, N) > 1$ vary such that $Np_r(\theta) \to \infty$, $Np_{r+1}(\theta) \to \lambda$, $0 \le \lambda < \infty$, then
$$NP_{r-1} \to \infty, \qquad NP_r \to \lambda, \qquad NP_{r+1} \to 0.$$
Proof. Under the conditions of the lemma, x = 0e~e -> 0. It follows from A.6.3)
that
Pr = JTPr+s(e) = Pr+l(f» (l + ? ^l) , A.6.6)
Taking into account the bounds for factorials
we find from A.6.1) that
where ci is a constant. Hence,
Q) K cixe
pr+lF) - I - xe
=
as q _> o. Now by virtue of A.6.6) and A.6.7),
OO, NPr+l ~+ 0.
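Note that if $\lambda \ne 0$, then $Np_r(\theta) \to \infty$ without any additional conditions, so this requirement may be excluded from the conditions of the lemma if $\lambda \ne 0$. Indeed,
$$Np_r(\theta) = Np_{r+1}(\theta)\, \frac{p_r(\theta)}{p_{r+1}(\theta)}.$$
Since $x \to 0$ and
$$\frac{p_r(\theta)}{p_{r+1}(\theta)} = \left(\frac{r}{r+1}\right)^{r-2} \frac{1}{x},$$
there exists a constant $c_2$ such that $Np_r(\theta) \ge c_2 Np_{r+1}(\theta)/x$ and $Np_r(\theta) \to \infty$.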
Lemma 1.6.2. If $n, N \to \infty$, $\theta = 2T/n \to \gamma$, $0 < \gamma < 1$, and $r = r(n, N) \to \infty$, then
$$NP_r = Np_r(\theta)\, c\,(1 - c)^{-1}(1 + o(1)),$$
where $c = \gamma e^{1-\gamma}$.
Proof. It is clear that
oo
Pr(9)
and
pr(9) \r + sj
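Moreover, there exist constants $c_3 > 0$ and $q < 1$ such that
$$p_{r+s}(\theta)/p_r(\theta) \le c_3(xe)^s \le c_3 q^s.$$
Therefore the series $\sum_{s=1}^{\infty} p_{r+s}(\theta)/p_r(\theta)$ converges uniformly and we can pass to the limit under the sum so that
$$\sum_{s=1}^{\infty} \frac{p_{r+s}(\theta)}{p_r(\theta)} \to \sum_{s=1}^{\infty} c^s = \frac{c}{1 - c}.$$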
Lemma 1.6.3. Ifn, N -> oo, 9 = 2T/n -> 1, and N{\ - 0K -* oo, then for
any fixed z,
NPr -> e~\
where r is an integer such that fir = u + z + o(l), fi = — log^e1^), and it is
the root of the equation
/o\l/2
Proof. It is clear under the conditions of the lemma that fi = —\ogFex~e) -> 0
and u -^ oo, since Nfi3!2 ^ oo by virtue of the condition N{\ — 6K -^ oo. We
apply Stirling's formula and obtain
oo , t_? t / « \ 1/2
A:=r+1
The sum
k>r
is an integral sum of the function/(jy) = y 5^2e y with step f3 and is approximated
by the corresponding integral:
OO
y-5/2e~y dy(l
Therefore
j
/ 9 \
NPr = ( - j
A.6.8)
By definition, rfi = u + z + o(l) and
/2\l/2
Substituting these expressions into A.6.8) yields
Now we are ready to prove the theorems of this section.
Proof of Theorems 1.6.1-1.6.3. By applying Lemma 1.6.1, we find that under the conditions of Theorem 1.6.1,
$$(1 - P_{r-1})^N \to 0, \qquad (1 - P_r)^N \to e^{-\lambda}, \qquad (1 - P_{r+1})^N \to 1$$
as $N \to \infty$. These relations, together with (1.6.5), whose proof is pending, imply the assertion of Theorem 1.6.1.
Let
_
\ogn - | log log n
0-l-log0 '
and choose r = [a] + k, where k is a fixed integer. Under the conditions of
Theorem 1.6.2, r = [a] + k ->¦ oo and according to Lemma 1.6.2,
NPr = Npr@)c{\ - c)-\\ +
where c = ye1"^. It is easy to see that
...
r!B-0)
Thus
(/ 1
+0A)),
and consequently,
Under the conditions of Theorem 1.6.3, Lemma 1.6.3 shows that $N P_r \to e^{-z}$ and
$$(1 - P_r)^N \to e^{-e^{-z}}.$$
Thus, to complete the proof of Theorems 1.6.1-1.6.3, it remains to verify A.6.5)
under each set of conditions.
Since $\theta N \to \infty$ and $N(1 - \theta)^3 \to \infty$, by Theorem 1.4.1 the random sum $\zeta_N$
is asymptotically normal, and
where
A-0X2-0)
While estimating the asymptotic behavior of A — Pr)N in Lemmas 1.6.1 -1.6.3,
we determined the choice of r. We now prove the central limit theorem for the sum
lx for these choices of r. Set Bm = a(9)Nx/2.
The characteristic function of the random variable f [ — m{9), where m{9) =
E?i, is
,-itm@) J_ . e-itm(9)
r k=l l~Fr \ k>r
where $\varphi(t)$ is the characteristic function of the random variable $\xi_1$. Hence, the characteristic function $\varphi_r(t, \theta)$ of the random variable $(\zeta_N^{(r)} - N m(\theta))/B_N$ can be written
7-itNm{6)/BN ¦ ' ' N
According to Theorem 1.4.1, the distribution of (?# — Nm(9))/Bn converges to
the standard normal law, and consequently,
It is clear that
J2 Pr@)eitk/BN = Pr + J2 Pr@)(eitk/BN - l) = Pr + O
k>r for
and it is not difficult to prove in each of the three cases that
A.6.10)
k>r
Estimates A.6.9) and A.6.10) imply that for any fixed /,
and the distribution of (?Jy — Nm@))/BN converges to the standard normal
distribution. The local convergence
needed for the proof of A.6.5) can be proved in the standard way.
Thus the ratio in A.6.5) tends to 1, and this, together with the estimates of
A - Pr)N, completes the proof of Theorems 1.6.1-1.6.3. ¦
To prove Theorem 1.6.4, the following lemma is needed.
Lemma 1.6.4. If N —>¦ oo, the parameter 9 in the distribution A.6.1) equals
Arl/3A — IT In) ->¦ v, and r = zN2/3, where z is a positive constant, then
where
C ?-3/21 / °° 1 / 3 \5
Is(z,y)\,
and Is(z, y) is defined in Theorem 1.6.4.
Proof. As yV ->• oo,
r~2e~k /2\1/2
l
2kr~2e~k /2\1/2
Pk = Pkd) = = l-J k~5/2(l + o(l» A.6.11)
uniformly in k > r.
It is clear that
itk i ! ^rtk y5'2
The last sum is an integral sum of the function y~5/2eity with step l/(bN2/3);
hence,
Set
Then
l#(*,z)l < H(O,z) = -^— f y-5l2dy= Z-—=. A.6.13)
Taking into account b = 2B/3J/3, we obtain from A.6.12) and A.6.13) that
^ f itk 1 / 2 \1/2 ^ 1 [ itk \ / 1 \
A:>r
t^. A.6.14)
In particular,
A.6.15)
The characteristic function (prit, 1) of the random variable (?^ — 2N)/ibN ' )
can be written
N
where (pit, 1) is the characteristic function of ?i — E?i. Note that E?i = 2 in this
case. It follows from A.6.13), A.6.14), and Theorem 1.4.2 that
k>r
where fit) is the characteristic function of the stable distribution with density
piy; 3/2, —1). Thus, for any fixed t, as N —> oo,
(prit, 1) -> git, z) = fit) exp{-Hit, z) + H@, z)}.
The function git, z) is continuous; therefore, by Theorem 1.1.9, it is a characteristic
function. Since \g(t, z)\ is integrable, it corresponds to the density
The span of the distribution of fj is 1; therefore, by Theorem 1.1.10, the local
convergence is valid.
Thus it remains to show that /(z, y) has the form given in Theorem 1.6.4.
Representing e~H^^ by its Taylor series gives
Yj^y), A.6.16)
where
1 C°°
fsiz, y) = ~ e-ltyfit)Hsit, z) dt.
2n y_oo
It is easy to see that the function IJnz^l^Hit, z) is the characteristic function of
the distribution with density
Pziy) = |z3/V5/2, y>z. A.6.17)
Therefore the function
is the characteristic function of the sum ft + ft\ H h As of independent random
variables, where ft has the stable law with density p{y; 3/2, — l)andjSi,..., fts are
identically distributed with density p:{y). The density of the sum ft + ft\ -\ h fts
is
where Is(t, y) is defined in Theorem 1.6.4. Thus
1 C°° / 3 V
/ ef(t)H(t,y)dt (=) Is(t,y).
When we substitute this expression into A.6.16), we find that
A.6.18)
Taking into account (1.6.15), Theorem 1.4.2, and (1.6.18), we see that Theorem 1.6.4 follows from (1.6.2).
To prove Theorem 1.6.5 with the help of (1.6.2), we need to know the asymptotic behavior of large deviations of $P\{\zeta_N^{(r)} = n\}$. We give that information without proof (see [28]).
Lemma 1.6.5. Ifn, N —> oo, the parameter 9 in the distribution A.6.1) equals 1,
N{\ — 271/nK —>¦ —oo, and r = n — 2N — bzN2^, where z is a constant, then
-: 3/2, -
A.6.19)
The assertion of Theorem 1.6.5 follows from A.6.19), Theorem 1.4.2, and the
fact that N Pr -^ 0 under the condition of Theorem 1.6.5.
1.7. Graphs with unicyclic components
A graph is called unicyclic if it is connected and contains only one cycle. The number of edges of a unicyclic graph coincides with the number of its vertices. Let $U_n$ denote the set of all graphs with $n$ vertices where every connected component is unicyclic. Any graph from $U_n$ has $n$ edges. In this section, we study the structure of a random graph from $U_n$. We follow the general approach described in Section 1.2. As usual, denote by $u_n$ the number of graphs in $U_n$; we will study $u_n$ as $n \to \infty$.
Let $b_n$ be the number of unicyclic graphs with $n$ vertices, and $b_n^{(r)}$ the number of unicyclic graphs with $n$ vertices, where the cycle has size $r$. The cycle of a unicyclic graph is nondirected; in other aspects, a unicyclic graph is similar to the connected graph of a mapping of a finite set into itself. Let $d_n$ be the number of connected graphs of mappings of a set with $n$ labeled vertices into itself, and $d_n^{(r)}$ the number of such graphs with the cycle of size $r$. It is easy to see that
$$b_n^{(1)} = d_n^{(1)}, \qquad b_n^{(2)} = d_n^{(2)}, \qquad b_n^{(r)} = d_n^{(r)}/2, \quad r \ge 3.$$
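Introduce the generating functions
$$B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!}, \qquad d(x) = \sum_{n=1}^{\infty} \frac{d_n x^n}{n!}.$$
These functions can be represented in terms of the function
$$\theta(x) = \sum_{n=1}^{\infty} \frac{n^{n-1} x^n}{n!},$$
which is the root of the equation $\theta e^{-\theta} = x$ in the interval $[0, 1]$. This function was used in Section 1.4. Taking into account the notation introduced here and using the results of Section 1.4, we see that
$$d(x) = -\log(1 - \theta(x)), \qquad c(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta(x))^2\bigr).$$
Since $b_n = b_n^{(1)} + \cdots + b_n^{(n)}$, we have
$$B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!} = \frac{1}{2} d(x) + \theta(x) - \frac{1}{2} c(x) = -\frac{1}{2}\log(1 - \theta(x)) + \theta(x) - \frac{1}{4}\bigl(1 - (1 - \theta(x))^2\bigr). \quad (1.7.1)$$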
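The closed form (1.7.1) can be checked numerically against the power series. The sketch below is only an illustration under stated assumptions: it takes $d_n^{(r)} = (n-1)!\,n^{n-r}/(n-r)!$ (the formula derived later in the proof of Lemma 1.7.5), the relations $b_n^{(r)} = d_n^{(r)}$ for $r = 1, 2$ and $b_n^{(r)} = d_n^{(r)}/2$ for $r \ge 3$, and it computes $\theta(x)$ by the fixed-point iteration $\theta \leftarrow x e^{\theta}$, which is a choice made here rather than a method from the text. All function names are hypothetical.

```python
import math

def theta(x: float, iters: int = 500) -> float:
    """Root of theta * exp(-theta) = x in [0, 1), for 0 <= x < 1/e.

    Uses the fixed-point iteration theta <- x * exp(theta) started at 0.
    """
    t = 0.0
    for _ in range(iters):
        t = x * math.exp(t)
    return t

def unicyclic_count(n: int) -> int:
    """b_n: connected unicyclic graphs on n labeled vertices.

    Cycles of size 1 (loops) and 2 (double edges) are allowed, as in U_n.
    """
    total = 0
    for r in range(1, n + 1):
        d_nr = math.factorial(n - 1) * n ** (n - r) // math.factorial(n - r)
        total += d_nr if r <= 2 else d_nr // 2   # undirected cycle for r >= 3
    return total

def B_closed(x: float) -> float:
    """Right-hand side of (1.7.1)."""
    th = theta(x)
    return -0.5 * math.log(1.0 - th) + th - 0.25 * (1.0 - (1.0 - th) ** 2)

def B_series(x: float, terms: int = 80) -> float:
    """Partial sum of sum_n b_n x^n / n!."""
    return sum(unicyclic_count(n) * x ** n / math.factorial(n)
               for n in range(1, terms + 1))

if __name__ == "__main__":
    x = 0.2
    print(theta(x), B_closed(x), B_series(x))
```

The series converges only for $x < e^{-1}$, and slowly near that boundary, so the comparison is made at a point well inside the disc of convergence.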
In accordance with the general model of Section 1.4, let us introduce independent identically distributed random variables $\xi_1, \ldots, \xi_N$ for which
$$P\{\xi_1 = k\} = \frac{b_k x^k}{k!\,B(x)}, \qquad k = 1, 2, \ldots. \quad (1.7.2)$$
The number of graphs in $U_n$ with $N$ components can be represented in the form
$$\frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!} = \frac{n!\,(B(x))^N}{N!\,x^n}\,P\{\xi_1 + \cdots + \xi_N = n\}. \quad (1.7.3)$$
In what follows, we choose
$$x = \left(1 - \frac{1}{\sqrt{n}}\right) e^{-1 + 1/\sqrt{n}}.$$
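Theorem 1.7.1. As $n \to \infty$,
$$u_n = \frac{\sqrt{2\pi}\,e^{3/4}}{2^{1/4}\,\Gamma(1/4)}\,n^{n - 1/4}(1 + o(1)),$$
where
$$\Gamma(p) = \int_0^{\infty} x^{p-1} e^{-x}\,dx$$
is the Euler gamma function.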
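For small $n$ the number $u_n$ can be computed exactly and compared with the asymptotic formula. The following sketch rests on assumptions that should be read as such: the constant in Theorem 1.7.1 is taken to be $\sqrt{2\pi}\,e^{3/4}/(2^{1/4}\Gamma(1/4))$, the counts $b_n$ are built from $d_n^{(r)} = (n-1)!\,n^{n-r}/(n-r)!$, and $u_n$ is obtained from the standard recurrence for the exponential of an exponential generating function, $u_n = \sum_{k=1}^{n}\binom{n-1}{k-1} b_k u_{n-k}$. The function names are illustrative.

```python
from math import comb, exp, factorial, lgamma, log, pi

def unicyclic_count(n: int) -> int:
    """b_n: connected unicyclic graphs on n labeled vertices (loops and double edges allowed)."""
    total = 0
    for r in range(1, n + 1):
        d_nr = factorial(n - 1) * n ** (n - r) // factorial(n - r)
        total += d_nr if r <= 2 else d_nr // 2
    return total

def all_unicyclic_counts(n_max: int) -> list:
    """u_0, ..., u_{n_max}: graphs whose components are all unicyclic.

    The component containing vertex n has k vertices; choose the other k-1
    of them, then fill in the remaining n-k vertices recursively.
    """
    b = [0] + [unicyclic_count(k) for k in range(1, n_max + 1)]
    u = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        u[n] = sum(comb(n - 1, k - 1) * b[k] * u[n - k] for k in range(1, n + 1))
    return u

def ratio_to_asymptotic(n: int, u_n: int) -> float:
    """u_n divided by sqrt(2*pi) e^{3/4} n^{n-1/4} / (2^{1/4} Gamma(1/4)), computed via logarithms."""
    log_const = 0.5 * log(2 * pi) + 0.75 - 0.25 * log(2) - lgamma(0.25)
    return exp(log(u_n) - log_const - (n - 0.25) * log(n))

if __name__ == "__main__":
    u = all_unicyclic_counts(200)
    for n in (10, 50, 100, 200):
        print(n, ratio_to_asymptotic(n, u[n]))
```

The ratio is worked out in logarithms because $n^{n}$ overflows floating point long before the exact integers become expensive to handle.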
Before proving Theorem 1.7.1, we will prove some auxiliary results.
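Lemma 1.7.1. For $x = (1 - 1/\sqrt{n})\,e^{-1 + 1/\sqrt{n}}$,
$$(1 - \theta(x e^{it}))^2 = \frac{1}{n} - 2it + \varepsilon_1(t) + \varepsilon_2(t, n),$$
where $\varepsilon_1(t)/t \to 0$ as $t \to 0$ uniformly in $n$ and $|\varepsilon_2(t, n)| \le 2|t|/\sqrt{n}$.
Proof. We found in Section 1.4 that
$$u(w) = (1 - \theta(w))^2 = 1 - 2\sum_{k=1}^{\infty} \frac{k^{k-2} w^k}{k!}, \qquad |w| \le e^{-1}.$$
When we write $u(x e^{it})$ as
$$u(x e^{it}) = \frac{1}{n} + u(e^{-1+it}) + \varepsilon_2(t, n), \quad (1.7.4)$$
it is clear that $\theta(x) = 1 - 1/\sqrt{n}$ and $\theta(e^{-1}) = 1$; therefore $u(e^{-1}) - u(x) = -1/n$. With this equality and the observation that $x < e^{-1}$, we obtain the estimates
$$|\varepsilon_2(t, n)| = |u(x e^{it}) - u(e^{-1+it}) - 1/n| = 2\left|\sum_{k=1}^{\infty} \frac{k^{k-2}(e^{-k} - x^k)(e^{itk} - 1)}{k!}\right| \le 2\sum_{k=1}^{\infty} \frac{k^{k-1}(e^{-k} - x^k)\,|t|}{k!} = 2|t|\,(\theta(e^{-1}) - \theta(x)) = 2|t|/\sqrt{n}. \quad (1.7.5)$$
The function $u(e^{-1+it})$ has the first derivative $-2i$ at the point $t = 0$; thus, as $t \to 0$,
$$u(e^{-1+it}) = -2it + o(t). \quad (1.7.6)$$
The assertion of the lemma follows from (1.7.4), (1.7.5), and (1.7.6). ■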
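Lemma 1.7.2. If $n \to \infty$, $N = \alpha \log n + o(\log n)$, where $\alpha$ is a positive constant, then
$$n P\{\xi_1 + \cdots + \xi_N = k\} = \frac{z^{\alpha - 1} e^{-z/2}}{2^{\alpha}\,\Gamma(\alpha)}(1 + o(1))$$
uniformly in $k$ such that $z = k/n$ lies in any fixed interval of the form $0 < z_0 \le z \le z_1 < \infty$.
Proof. The characteristic function of the sum $(\xi_1 + \cdots + \xi_N)/n$ is equal to $\varphi_N(t) = \varphi^N(t/n)$, where $\varphi(t)$ is the characteristic function of $\xi_1$. It is clear that
$$\varphi(t) = B(x e^{it})/B(x).$$
Lemma 1.7.1 and equation (1.7.1) give
$$4B(x e^{it/n}) = -\log\frac{1 - 2it}{n} + 3 + o(1).$$
Therefore
$$\varphi\left(\frac{t}{n}\right) = \frac{B(x e^{it/n})}{B(x)} = 1 - \frac{\log(1 - 2it)}{\log n} + o\left(\frac{1}{\log n}\right),$$
and if $N = \alpha \log n + o(\log n)$, then for any fixed $t$,
$$\varphi_N(t) = \varphi^N(t/n) = \left(1 - \frac{\log(1 - 2it)}{\log n} + o\left(\frac{1}{\log n}\right)\right)^N \to (1 - 2it)^{-\alpha},$$
and the distribution of $(\xi_1 + \cdots + \xi_N)/n$ converges weakly to the distribution with density
$$\frac{z^{\alpha - 1} e^{-z/2}}{2^{\alpha}\,\Gamma(\alpha)},$$
that is, to the chi-square distribution with $2\alpha$ degrees of freedom, which corresponds to the characteristic function $(1 - 2it)^{-\alpha}$.
The local convergence can be proved in the usual way by using Lemmas 1.12.3-1.12.7 from [78]. ■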
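Let $u_{n,N}$ be the number of graphs in $U_n$ with $N$ components and
$$\lambda_n = \frac{1}{4}\log n, \qquad u = \frac{N - \lambda_n}{\sqrt{\lambda_n}}.$$
Lemma 1.7.3. If $n \to \infty$, then
$$u_{n,N} = \frac{\sqrt{2\pi}\,e^{3/4}}{2^{1/4}\,\Gamma(1/4)}\,n^{n - 1/4}\,\frac{\lambda_n^N e^{-\lambda_n}}{N!}(1 + o(1))$$
uniformly in $N$ such that $|u| \le (\log n)^{1/4}$.
Proof. It is clear that
$$u_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!} = \frac{n!\,(B(x))^N}{N!\,x^n}\,P\{\xi_1 + \cdots + \xi_N = n\}. \quad (1.7.7)$$
By putting $\alpha = 1/4$ in Lemma 1.7.2, we obtain
$$n P\{\xi_1 + \cdots + \xi_N = n\} = \frac{1}{2^{1/4}\,\Gamma(1/4)}\,e^{-1/2}(1 + o(1)) \quad (1.7.8)$$
uniformly in $N$ when $|u| \le (\log n)^{1/4}$.
The assertion of the lemma follows from (1.7.7) and (1.7.8), since
$$B(x) = \frac{1}{4}\log n + \frac{3}{4} + o(1), \qquad x^n = e^{-n - 1/2}(1 + o(1)). \quad (1.7.9)$$
The assertion of Theorem 1.7.1 can be obtained by summing $u_{n,N}$ over $N$. Lemma 1.7.3 estimates $u_{n,N}$ for $N$ close to $\lambda_n$. The following lemmas give estimates of $u_{n,N}$ for the other values of $N$ needed in the proof.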
Lemma 1.7.4. For any fixed $\alpha_0, \alpha_1$, $0 < \alpha_0 < \alpha_1 < \infty$, there exists a constant $c_1$ such that for $\alpha_0 \log n \le N \le \alpha_1 \log n$,
$$u_{n,N} \le c_1 n^{n - 1/4}\,\frac{\lambda_n^N e^{-\lambda_n}}{N!}.$$
Proof. It follows from Lemma 1.7.2 that there exists a constant $A$ such that
$$n P\{\xi_1 + \cdots + \xi_N = n\} \le A \quad (1.7.10)$$
for $\alpha_0 \log n \le N \le \alpha_1 \log n$. Indeed, if (1.7.10) did not hold, then a sequence of the parameters $n \to \infty$, $N = \alpha \log n + o(\log n)$ would exist for which the assertion of Lemma 1.7.2 would not be true. Lemma 1.7.4 then follows from (1.7.7), (1.7.9), and (1.7.10). ■
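Lemma 1.7.5. If $N \le \log n$, then there exists a constant $c_2$ such that
$$n P\{\xi_1 + \cdots + \xi_N = n\} \le c_2 \log n.$$
Proof. It is well known that
$$d_m = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}.$$
Indeed, since the number of forests with $n$ nonroot vertices and $N$ rooted trees labeled $1, \ldots, N$ is $N(n + N)^{n-1}$, the number $d_m^{(r)}$ of connected graphs of mappings of an $m$-set into itself with the cycle of size $r$ can be represented as
$$d_m^{(r)} = \binom{m}{r}(r - 1)!\,r\,m^{m-r-1} = \frac{(m - 1)!\,m^{m-r}}{(m - r)!}.$$
Here $\binom{m}{r}$ is the number of possible choices of $r$ vertices that constitute the cycle; $(r - 1)!$ is the number of cycles that can be constructed from $r$ vertices; and $r m^{m-r-1}$ is the number of forests with $r$ cyclic vertices as the roots. Hence,
$$d_m = \sum_{r=1}^{m} d_m^{(r)} = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}.$$
As $m \to \infty$,
$$d_m = \tfrac{1}{2}(m - 1)!\,e^m (1 + o(1)),$$
and there exists a constant $c_3$ such that
$$b_m \le d_m \le c_3 (m - 1)!\,e^m.$$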
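The counting formula for connected mapping graphs used in this proof can be verified by exhaustive enumeration for small $m$. The sketch below assumes the formula reads $d_m = (m-1)!\sum_{k=0}^{m-1} m^k/k!$ and treats a mapping as connected when its underlying undirected graph is connected; the helper names are illustrative.

```python
from itertools import product
from math import factorial

def is_connected_mapping(f) -> bool:
    """f[i] is the image of i; test connectivity of the undirected graph {i, f[i]}."""
    m = len(f)
    parent = list(range(m))

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for i, j in enumerate(f):
        parent[find(i)] = find(j)           # union the two endpoints
    return len({find(i) for i in range(m)}) == 1

def d_brute(m: int) -> int:
    """Count connected mappings of an m-set into itself by trying all m^m mappings."""
    return sum(is_connected_mapping(f) for f in product(range(m), repeat=m))

def d_formula(m: int) -> int:
    """(m-1)! * sum_{k=0}^{m-1} m^k / k!, evaluated in exact integer arithmetic."""
    return sum(factorial(m - 1) * m ** k // factorial(k) for k in range(m))

if __name__ == "__main__":
    for m in range(1, 7):
        print(m, d_brute(m), d_formula(m))
```

Each term $(m-1)!\,m^k/k!$ is an integer for $k \le m-1$, so integer division loses nothing.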
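Moreover, $B(x) = \log n\,(1 + o(1))/4$ and $x^m \le e^{-m}$ for all $m \ge 0$. Therefore there exists a constant $c_2$ such that
$$P\{\xi_1 = m\} \le \frac{c_2}{m \log n}. \quad (1.7.11)$$
It is clear that
$$\{\xi_1 + \cdots + \xi_N = n\} \subseteq \bigcup_{i=1}^{N} \{\xi_i \ge [n/N]\}.$$
Since $P\{\xi_1 = k\}$ decreases as $k$ increases, we have
$$P\{\xi_1 + \cdots + \xi_N = n\} \le N \sum_{k \ge [n/N]} P\{\xi_1 = k\}\,P\{\xi_2 + \cdots + \xi_N = n - k\} \le N \max_{k \ge [n/N]} P\{\xi_1 = k\} = N P\{\xi_1 = [n/N]\}.$$
The lemma now follows from (1.7.11).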
Lemma 1.7.6. For $N \le \log n$,
$$u_{n,N} \le c_4 n^{n - 1/4} \log n\,\frac{\lambda_n^N e^{-\lambda_n}}{N!},$$
where $c_4$ is a constant.
This lemma follows from (1.7.7), (1.7.9), and Lemma 1.7.5.
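Proof of Theorem 1.7.1. Roughly speaking, $u_{n,N} = c\lambda_n^N e^{-\lambda_n}/N!$, where $c$ does not depend on $N$, and to obtain $u_n$, we sum the Poisson probabilities whose sum is 1. To do this rigorously, we divide the sum
$$u_n = \sum_{N=1}^{\infty} u_{n,N}$$
into four parts. Recall that $u = (N - \lambda_n)/\sqrt{\lambda_n}$. Let
$$u_n = S_1 + S_2 + S_3 + S_4, \qquad S_j = \sum_{N \in A_j} u_{n,N},$$
where
$$A_1 = \{N : |u| \le (\log n)^{1/4}\},$$
$$A_2 = \{N : |u| > (\log n)^{1/4},\ \alpha_0 \log n \le N \le \alpha_1 \log n\},$$
$$A_3 = \{N : N < \alpha_0 \log n\},$$
$$A_4 = \{N : N > \alpha_1 \log n\}.$$
As $n \to \infty$,
$$\sum_{N \in A_1} \frac{\lambda_n^N e^{-\lambda_n}}{N!} \to 1; \quad (1.7.12)$$
therefore it follows from Lemma 1.7.3 that
$$S_1 = \frac{\sqrt{2\pi}\,e^{3/4}}{2^{1/4}\,\Gamma(1/4)}\,n^{n - 1/4}(1 + o(1)).$$
It remains to show that $S_2$, $S_3$, and $S_4$ are $o(n^{n - 1/4})$. Lemma 1.7.4 implies that
$$S_2 \le c_1 n^{n - 1/4} \sum_{N \in A_2} \frac{\lambda_n^N e^{-\lambda_n}}{N!},$$
and it follows from (1.7.12) that $S_2 = o(n^{n - 1/4})$.
To obtain an estimate for $S_3$, we use the inequality
$$\sum_{1 \le N \le m} \frac{\lambda^N e^{-\lambda}}{N!} \le m\,\frac{\lambda^m e^{-\lambda}}{m!},$$
which is true for $m \le \lambda$. Choose $\alpha_0 < 1/4$ such that
$$\alpha_0 - \alpha_0 \log \alpha_0 - \alpha_0 \log 4 < 1/8.$$
Then, for $m = \alpha_0 \log n$,
$$m\,\frac{\lambda_n^m e^{-\lambda_n}}{m!} \le \frac{c_5}{n^{1/8}},$$
where $c_5$ is a constant. By using the estimate from Lemma 1.7.6, we find that
$$S_3 \le c_4 c_5 n^{n - 1/4 - 1/8} \log n.$$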
To obtain an estimate for 54, we use the inequality
where c^ is a constant, which follows from A.7.7) if P{?i + ••• + ?# = n) is
replaced by 1. For m > X,
^ XNe~x Xm
N>m
Choose a\ > 1/4 such that a 1 — a\ logai — a\ log 4 < —2. Then form = a\\ogn
and Xn = (logn)/4, we have the estimate X%/m\ < n~2; thus A.7.13) implies
that 54 < c6nn~5/4.
The assertion of the theorem follows from the estimates obtained for S\, S2, S3,
and 54. ¦
We denote the number of components in a random graph of $U_n$ by $\kappa_n$. The following theorem is a direct corollary of Lemma 1.7.3 and Theorem 1.7.1.
Theorem 1.7.2. As $n \to \infty$,
$$P\{\kappa_n = N\} = \frac{2}{\sqrt{2\pi \log n}}\,e^{-u^2/2}(1 + o(1))$$
uniformly in $N$ for which $u = (N - \frac{1}{4}\log n)/\sqrt{\frac{1}{4}\log n}$ lies in any fixed finite interval.
Indeed, Lemma 1.7.3 and Theorem 1.7.1 imply that
$$P\{\kappa_n = N\} = \frac{u_{n,N}}{u_n} = \frac{\lambda_n^N e^{-\lambda_n}}{N!}(1 + o(1))$$
uniformly in $|u| \le (\log n)^{1/4}$, where $\lambda_n = \frac{1}{4}\log n$.
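We now consider the maximum size $\beta_n$ of the components of a random graph from $U_n$.
Theorem 1.7.3. If $n \to \infty$, then for any fixed $\gamma$, $0 < \gamma < 1$,
$$P\{\beta_n \le \gamma n\} \to \sum_{0 \le s < 1/\gamma} \frac{(-1)^s}{4^s s!}\,W_s(1, \gamma),$$
where $W_0(z, \gamma) = 1$, and for $s = 1, 2, \ldots$,
$$W_s(z, \gamma) = \int_{\{x_i \ge \gamma,\ i = 1, \ldots, s,\ x_1 + \cdots + x_s \le z\}} \frac{dx_1 \cdots dx_s}{x_1 \cdots x_s\,(z - x_1 - \cdots - x_s)^{3/4}}.$$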
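Proof. To study $\beta_n$, we use the general approach of Section 1.2. Let $\eta_1, \ldots, \eta_N$ be random variables with distribution
$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}. \quad (1.7.14)$$
It follows from (1.7.7) that these variables can be interpreted as the sizes of the ordered components of a random graph from $U_n$ (see Section 1.2), in which $\kappa_n$ is $N$. Therefore
$$P\{\beta_n \le \gamma n\} = \sum_{N=1}^{\infty} P\{\kappa_n = N\}\,P\{\eta_{(N)} \le \gamma n\}, \quad (1.7.15)$$
where $0 < \gamma < 1$ and $\eta_{(N)} = \max_{1 \le i \le N} \eta_i$. By Lemma 1.2.2,
$$P\{\eta_{(N)} \le \gamma n\} = (1 - P\{\xi_1 > \gamma n\})^N\,\frac{P\{\tilde\xi_1 + \cdots + \tilde\xi_N = n\}}{P\{\xi_1 + \cdots + \xi_N = n\}}, \quad (1.7.16)$$
where $\tilde\xi_1, \ldots, \tilde\xi_N$ are independent identically distributed random variables for which
$$P\{\tilde\xi_1 = k\} = P\{\xi_1 = k \mid \xi_1 \le \gamma n\},$$
and the random variables $\xi_1, \ldots, \xi_N$ have distribution (1.7.2). We now estimate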
for* = A - \/<s/n)e~[ + l/^. By A.7.1) for any fixed y, 0 < y < 1, asn -> oo,
__ ., 1 v^
tlyn (t) = — /
2 t—1
Let us prove that
where
4
It is easily seen that
1
^2
m=0
uniformly in k > yn. Therefore, as n —>¦ oo,
* it
Hyn(t)= ? - 1 - _ exp
This sum is an integral sum of the function u-le(l~2lt)u/2 wjtj1 step \jn Hence,
I C°°
Hyn(t) = - / u-l
? Jy
In particular, we obtain the following estimate for the tail of the distribution A.7.2):
= 4Hyn@)+o(l) = 4H(y,0)+o(l) ?
log n log n
as n —>¦ oo.
We now find the limit distribution of the sum $(\tilde\xi_1 + \cdots + \tilde\xi_N)/n$. The characteristic function of $\tilde\xi_1/n$ is
= cp(t/n)-HYn(t)/B(x)
l-Hyn@)/B(x) '
Using the estimates
/i) = I -log(l -2iO/log«+o(l/log/i), 4B(x) = log/i+ 0A),
from A.7.16) and A.7.17), as /i -> oo, yields
/ log(l -2it)-4H(y,t) + o(\)\ (
and for any fixed t and N = \ log n + o(log n),
fN{t) - <py(t) = (l-2il/
When we expand e~H{y^ into its Taylor series, as we did in the proof of Lemma
1.6.4, we find that the characteristic function <pY @ corresponds to the density
{_iy
Q<s<\/y
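Thus, for any $\gamma$, $0 < \gamma < 1$, the distribution of $(\tilde\xi_1 + \cdots + \tilde\xi_N)/n$ converges weakly to the distribution whose density is $f_\gamma(z)$ as $n \to \infty$ and $N = \frac{1}{4}\log n + o(\log n)$.
We can show that local convergence of these distributions holds. If $n \to \infty$ and $N = \frac{1}{4}\log n + o(\log n)$ and $0 < \gamma < 1$, then
$$n P\{\tilde\xi_1 + \cdots + \tilde\xi_N = k\} = f_\gamma(z)(1 + o(1)) \quad (1.7.18)$$
holds uniformly in $k$ for which $z = k/n$ lies in any given interval of the form $0 < z_0 \le z \le z_1 < \infty$.
Using (1.7.17), we find that for $n \to \infty$ and $N = \frac{1}{4}\log n + o(\log n)$,
$$(P\{\xi_1 \le \gamma n\})^N = \left(1 - \frac{4H(\gamma, 0) + o(1)}{\log n}\right)^N = e^{-H(\gamma, 0)} + o(1). \quad (1.7.19)$$
Substituting estimates (1.7.19), (1.7.18), and (1.7.8) into (1.7.16) gives
$$P\{\eta_{(N)} \le \gamma n\} = \sum_{0 \le s < 1/\gamma} \frac{(-1)^s}{4^s s!}\,W_s(1, \gamma) + o(1). \quad (1.7.20)$$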
To obtain the distribution of $\beta_n$, we need to average the distribution of $\eta_{(N)}$ with respect to the distribution of $\kappa_n$. By Theorem 1.7.2, the number of components $\kappa_n$ is asymptotically normal with parameters $(\frac{1}{4}\log n, \frac{1}{4}\log n)$, and for $N = \frac{1}{4}\log n + o(\log n)$, the probability $P\{\eta_{(N)} \le \gamma n\}$ is asymptotically constant; therefore the assertion of the theorem follows from (1.7.15). ■
Denote by $U_{n,2}$ and $U_{n,3}$ the sets of all graphs with $n$ labeled vertices consisting of unicyclic components where each cycle has more than one or more than two vertices, respectively. It is not difficult to see that we can treat $U_{n,i}$, $i = 2, 3$, in the same way as $U_n$ (which, following the above notation, we have to denote by $U_{n,1}$). The role of $B(x)$ for $U_{n,i}$, $i = 2, 3$, is played by the generating functions
$$B_i(x) = \sum_{n=1}^{\infty} \frac{b_{n,i}\,x^n}{n!}, \qquad i = 2, 3,$$
where $b_{n,i}$ is the number of unicyclic graphs with $n$ vertices and cycle lengths not less than $i$.
It is clear that
$$B_2(x) = -\frac{1}{2}c(x) + \frac{1}{2}d(x) = -\frac{1}{4}\bigl(1 - (1 - \theta(x))^2\bigr) - \frac{1}{2}\log(1 - \theta(x)),$$
and for $x = (1 - 1/\sqrt{n})\,e^{-1 + 1/\sqrt{n}}$,
$$B_2(x) = \frac{1}{4}\log n - \frac{1}{4} + o(1).$$
Similarly,
$$B_3(x) = -\frac{1}{2}\log(1 - \theta(x)) - \frac{1}{2}\theta(x) - \frac{1}{4}\theta^2(x),$$
and for $x = (1 - 1/\sqrt{n})\,e^{-1 + 1/\sqrt{n}}$,
$$B_3(x) = \frac{1}{4}\log n - \frac{3}{4} + o(1). \quad (1.7.21)$$
Therefore, if $n \to \infty$, then for the numbers $u_n^{(i)}$ of the graphs in $U_{n,i}$ and for the number $u_{n,N}^{(i)}$ of such graphs with $N$ components, we have
$$u_n^{(i)} = A_i n^{n - 1/4}(1 + o(1)), \qquad u_{n,N}^{(i)} = A_i n^{n - 1/4}\,\frac{\lambda_n^N e^{-\lambda_n}}{N!}(1 + o(1))$$
uniformly in the integers $N$ such that $|N - \lambda_n|/\sqrt{\lambda_n}$ lies in any fixed finite interval, where
$$A_1 = \frac{\sqrt{2\pi}\,e^{3/4}}{2^{1/4}\,\Gamma(1/4)}, \qquad A_2 = \frac{\sqrt{2\pi}\,e^{-1/4}}{2^{1/4}\,\Gamma(1/4)}, \qquad A_3 = \frac{\sqrt{2\pi}\,e^{-3/4}}{2^{1/4}\,\Gamma(1/4)}. \quad (1.7.22)$$
Theorems 1.7.2 and 1.7.3 are valid for the random variables $\kappa_n$ and $\beta_n$ in $U_{n,2}$ and $U_{n,3}$.
1.8. Graphs with components of two types
The generalized scheme of allocation can be used in the investigations of random graphs with nonhomogeneous structure. Consider the set $A_{n,T}$ of all graphs with $n$ vertices and $T$ edges where each connected component contains no more than one cycle. As usual, we assign equal probabilities to the elements of $A_{n,T}$ and consider a random graph with values from $A_{n,T}$. Since any graph from the set $A_{n,T}$ consists of trees and unicyclic components, we can use the results of the previous sections to study various characteristics of a random graph from $A_{n,T}$.
Consider first the number of elements in $A_{n,T}$. As in the previous sections, we will denote by $a_n$ the number of graphs under consideration with $n$ vertices and by $b_n$ the numbers of connected graphs under consideration with $n$ vertices.
Instead of $A_{n,T}$, we will use, where necessary, the notation $A_{n,T}^{(1)}$ if cycles of lengths 1 and 2 are allowed; $A_{n,T}^{(2)}$ if cycles of length 1 are forbidden; and $A_{n,T}^{(3)}$ if cycles of lengths 1 and 2 are forbidden. Denote the number of graphs in $A_{n,T}^{(i)}$ by $a_{n,T}^{(i)}$ and preserve the notation $a_{n,T}$ if the specialization is not needed. In accordance with the previous sections, the number of forests with $n$ vertices, $T$ edges, and $N = n - T$ trees is denoted by $F_{n,N}$. We use $u_n^{(i)}$ to denote the number of graphs with $n$ vertices and unicyclic components if they are included in $A_{n,T}^{(i)}$, $i = 1, 2, 3$, and preserve the notation $u_n$ for the number of such graphs in $A_{n,T}$ if the specialization is not important.
It is clear that
$$a_{n,T} = \sum_{m=0}^{T} \binom{n}{m} u_m F_{n-m,N}. \quad (1.8.1)$$
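The decomposition (1.8.1) can be tested by brute force on small graphs. The sketch below is an illustration under explicit assumptions: it reads (1.8.1) as $a_{n,T} = \sum_{m=0}^{T}\binom{n}{m}u_m F_{n-m,N}$ with $u_0 = 1$, it restricts attention to the case $i = 3$ (simple graphs, cycles of lengths 1 and 2 forbidden), and it computes $F_{n,N}$ and $u_m$ by standard recurrences chosen here for convenience. All function names are hypothetical.

```python
from itertools import combinations
from math import comb, factorial

def conn_unicyclic_simple(k: int) -> int:
    """Connected unicyclic graphs on k labeled vertices with cycle length >= 3."""
    return sum(factorial(k - 1) * k ** (k - r) // factorial(k - r)
               for r in range(3, k + 1)) // 2

def trees(k: int) -> int:
    """Cayley's formula k^(k-2) for labeled trees on k vertices."""
    return 1 if k <= 2 else k ** (k - 2)

def forests(n: int, N: int) -> int:
    """F_{n,N}: forests on n labeled vertices with exactly N unrooted trees."""
    table = [[0] * (N + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for v in range(1, n + 1):
        for c in range(1, N + 1):
            # the tree containing vertex v has k vertices
            table[v][c] = sum(comb(v - 1, k - 1) * trees(k) * table[v - k][c - 1]
                              for k in range(1, v + 1))
    return table[n][N]

def all_unicyclic(m: int) -> int:
    """u_m: simple graphs on m labeled vertices whose components are all unicyclic."""
    u = [1] + [0] * m
    for v in range(1, m + 1):
        u[v] = sum(comb(v - 1, k - 1) * conn_unicyclic_simple(k) * u[v - k]
                   for k in range(1, v + 1))
    return u[m]

def a_formula(n: int, T: int) -> int:
    """Right-hand side of (1.8.1) with N = n - T."""
    N = n - T
    return sum(comb(n, m) * all_unicyclic(m) * forests(n - m, N) for m in range(T + 1))

def find(parent: list, a: int) -> int:
    while parent[a] != a:
        a = parent[a]
    return a

def a_brute(n: int, T: int) -> int:
    """Count simple graphs with n vertices and T edges whose components have at most one cycle."""
    count = 0
    for chosen in combinations(list(combinations(range(n), 2)), T):
        parent = list(range(n))
        for i, j in chosen:
            parent[find(parent, i)] = find(parent, j)
        verts, edges = [0] * n, [0] * n
        for v in range(n):
            verts[find(parent, v)] += 1
        for i, _ in chosen:
            edges[find(parent, i)] += 1
        # a component has at most one cycle iff it has no more edges than vertices
        if all(e <= v for e, v in zip(edges, verts)):
            count += 1
    return count

if __name__ == "__main__":
    for n, T in ((5, 4), (6, 5), (6, 6)):
        print(n, T, a_formula(n, T), a_brute(n, T))
```

The brute-force side relies on the fact that a connected graph with $v$ vertices has at most one cycle exactly when it has at most $v$ edges.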
Theorem 1.8.1. If $n, T \to \infty$ such that $T/n \to 0$, then
$$a_{n,T} = F_{n,N}(1 + o(1)) = \frac{n^{2T}}{2^T T!}(1 + o(1)).$$
Proof. It follows from Theorem 1.7.1 that there exists a constant $c_1$ such that
$$u_m \le c_1 m^{m - 1/4}. \quad (1.8.2)$$
Theorem 1.4.3 shows that under the conditions of Theorem 1.8.1,
A.8.3)
The condition T/n -+ 0 implies that (T - m)/(n - m) -> 0 uniformly in m,
0 < m < T. Therefore, under the conditions of Theorem 1.8.1, there exists a
constant ci such that
c2(n -
F <
forallm,0 <m <T.
We obtain from A.8.1), A.8.2), A.8.3), and A.8.4) that
>
m=\
a,,)
This completes the proof because 2Tn/(n - TJ -+ 0.
Let $\omega_{n,T}$ be the number of vertices contained in the unicyclic components of the random graph in $A_{n,T}$. It is easily seen from Theorem 1.8.1 that if $n, T \to \infty$ and $T/n \to 0$, then
$$P\{\omega_{n,T} = 0\} \to 1,$$
and the limit distributions of the number of trees of fixed sizes in a random graph from $A_{n,T}$ coincide with the corresponding limit distributions in a random forest and are described in Theorems 1.5.1 and 1.5.2; the limit distribution of the maximum size of trees in a random graph from $A_{n,T}$ is given in Theorem 1.6.1.
Now let n, T -^ oo such that 0 = 2T/n -+ X, 0 < X < 1. According to
Theorem 1.4.3, under these conditions,
A.8.6)
If n, T -> oo, IT In -> X, 0 < X < 1, and m = o(n), then by Theorem 1.4.3,
(ti -
F
Since 9 = 2T/n -> X, 0 < X < 1, implies 2B" - /n)/(/i -m) <6, there exists a
constant c such that
)!- (L8'8)
In subsequent proofs, we will use a cumbersome technical estimate given in the
following lemma.
Lemma 1.8.1. Let n, T -> oo and let there be constants Xq and X\ such that
0 < Xo < 9 = IT In < X\ < 1. Then
x (\--)...(l-^Zl)< i, A.8.9)
where mo < m < T and mo is sufficiently large.
Proof. Write the logarithm of cnj(m) as
\ogcnJ(m) =
oo . m-\ oo r.rp ,
1 x-^ i- x-^ 2T /m\k
k=\ i=\ k=2
oo n , oo ., m-1
Using
we obtain the estimate
2m /m\^ ^-^ 2T
() 2Z
) 2Ztr (
(m — X)K± -s-^ (m - l)
k{k +l)Tk f^ k{k + \)nk
k=\
]g(^)g(
n '—' V T I \ n
(=1
m-\ , . s.
m) TK \ m)
To prove the assertion of the lemma, we note that for sufficiently large m,
2* / 1
1
\Hk(l) , 1 ^
\ m) 0k \ m)
for all k. Indeed, since 0 < Xq < a < X\ < 1, for sufficiently large m,
and therefore
1
m 6k
ck<2(k+l)-2k,
which implies that ck < 0 for all k > 3 and sufficiently large m. In addition,
1 ~ V m) 0 V m) ~ 0 6m m'
( 1\3 4 / 1\3 4 4 4
C2 = 6-20-1 -T-l <5-20--. + -r- + -,
V m) 62 \ m) ~ 62 62m m
and ci < 0, C2 < 0 for sufficiently large m, since for 0 < Ao < 0 < Ai < 1,
3-0 <0, 5-20 r<0.
6 02
¦
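Let $b_n^{(i)}$ be the number of connected unicyclic graphs with $n$ vertices that belong to $A_{n,T}^{(i)}$, $i = 1, 2, 3$. If this specification is of no significance, we write $b_n$ for the number of connected unicyclic graphs. Let $a_{n,T}(k)$ be the number of graphs in $A_{n,T}$ with exactly $k$ cycles. It is clear that
$$a_{n,T}(k) = \frac{1}{k!} \sum_{m=k}^{T} \binom{n}{m} F_{n-m,N} \sum_{m_1 + \cdots + m_k = m} \frac{m!\,b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!}. \quad (1.8.10)$$
As in Section 1.7, let
$$B_i(x) = \sum_{n=1}^{\infty} \frac{b_n^{(i)} x^n}{n!}, \qquad i = 1, 2, 3, \quad (1.8.11)$$
and set
$$x = \theta e^{-\theta}.$$
For such $x$, according to (1.7.1) and (1.7.2),
$$B_1(x) = -\frac{1}{2}\log(1 - \theta) + \frac{1}{2}\theta + \frac{1}{4}\theta^2,$$
$$B_2(x) = -\frac{1}{2}\log(1 - \theta) - \frac{1}{2}\theta + \frac{1}{4}\theta^2,$$
$$B_3(x) = -\frac{1}{2}\log(1 - \theta) - \frac{1}{2}\theta - \frac{1}{4}\theta^2.$$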
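Theorem 1.8.2. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then for any $i = 1, 2, 3$ and any fixed $k = 0, 1, \ldots$,
$$a_{n,T}^{(i)}(k) = \frac{n^{2T} \Lambda_i^k \sqrt{1 - \lambda}}{2^T T!\,k!}(1 + o(1)),$$
where $a_{n,T}^{(i)}(k)$ is the number of graphs in $A_{n,T}^{(i)}$ with exactly $k$ cycles, and
$$\Lambda_1 = -\frac{1}{2}\log(1 - \lambda) + \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_2 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_3 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4}.$$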
Proof. We partition the first sum of A.8.10) into two parts, Si and S2. We set
M = T1/4 and include in Si the summands with m < M. For any x from the
convergence domain of the series A.8.11), the estimate
' ' '
/w>M »»H \-mic=m m>M/k
holds. As in Section 1.7, let dn be the number of connected graphs of single-valued
mappings of a set with n elements into itself and let
00 j vm
d(x) =
ml
m=l
Since
m-\
bm<dm=(m- 1)! J2 7T - (m ~ l)[
k=o
(see the proof of Lemma 1.7.5), the estimate
"
m>M/k m>M/k
holds. Recall that we chose x = 0e~e. According to the hypothesis of the theorem,
6 = 2T/n -> a,0 < a < 1, and there exists # < 1 suchthatex = 0el~e < q < 1
beginning with some n. Therefore
y b^l<_L.qM/k A8
^—' m! 1 — q
m>M/k ^
Taking into account estimates A.8.8), A.8.9), and Lemma 1.8.1, we find that
n-m,N
TT > > I
m>M m\-\ \-mk=m x
< - Y^ V n\{n-m)^T-m)hmx---bmk
r\}--n ¦¦¦[}-—
+mk=m \ / \ /
/ 1 \ / m — l\ bm, • • -bn
x.
\ n /
Of (
c\n2T
k\2TT\
m>M
(B(x)
,2T
m\+-
/7
V I
m>M/k
m
—0
m
I
~ k\2TT\{\-q)
where c\, c2 are some constants. Thus, under the conditions of the theorem,
We now estimate the sum Si. According to A.8.8),
+
2T-m{T_m)[
n x V i — a
?A
uniformly in m < M =
Therefore, for any fixed k = 1,2
^r
m<M m\-\
m! )Fn-m<N—-
,2T
nZI s/\-X
kl2TTl
M
E E
m\
m\\
\ ¦ ¦ -
m=k m\-i \-mic=m
Taking into account the estimate of S2, we obtain
n
Si =
k\2TT\
00
E
bn
¦¦ h xmk
^m=k
¦ +mi(=m
m\\ ¦ ¦ ¦
2TTlkl
Combining the estimates of Si and S2 yields
n'
anJik) =
mk
under the hypothesis of the theorem. Since x = 0e~e -> Xe~k, we also have
*Ai, B2(x)-+A2, B3(x)-+A3.
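Theorem 1.8.3. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then
$$a_{n,T}^{(i)} = \frac{n^{2T}}{2^T T!}\sqrt{1 - \lambda}\;e^{\Lambda_i}(1 + o(1)), \qquad i = 1, 2, 3,$$
where $\Lambda_i$, $i = 1, 2, 3$, are as in Theorem 1.8.2.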
Proof. To obtain the asymptotics of an j, we have to estimate the sum
00
A.8.14)
*=0
After normalization, we have
,2T \ -1 00 / ^27 \ -]
n
A.8.15)
A:=0
where for any fixed k = 0, 1, ...,
/ M2r \ -1
it!
as n, T -» oo, 2T/n -> X, 0 < A < 1. We can pass to the limit under the sum in
A.8.15) if the series converges uniformly with respect to the parameters n, T. To
see this, it suffices to obtain an estimate
{wf\) a"-T-Ak (L8I6)
such that the series YltLo ^-k converges. Using A.8.8) and A.8.9) and reasoning
as we did in the proof of the estimate of 62 give
m = \ m\-\ \-mk=m x 7
rJlT °° Ymh
cn ^—v ^—v x om]
k\2TT\ ^ ^ mi\---mk\
m=k /»H \-mic=m
_ cn2T(B(k))k
~ 2TT\k\ '
Thus we have an estimate of the form (1.8.16) and can pass to the limit under the sum in (1.8.15) to obtain
$$a_{n,T} = \frac{n^{2T}\sqrt{1 - \lambda}}{2^T T!}\exp\{B(\lambda e^{-\lambda})\}(1 + o(1)).$$
Depending on the set of graphs under consideration, replace $B(x)$ with $B_1(x)$, $B_2(x)$, or $B_3(x)$, and Theorem 1.8.3 is proved. ■
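A random graph from $A_{n,T}$ has exactly $N = n - T$ trees and a random number of unicyclic components. We denote by $\kappa_{n,T}^{(i)}$ the number of unicyclic components in a random graph from $A_{n,T}^{(i)}$, $i = 1, 2, 3$.
Theorem 1.8.4. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then for any $i = 1, 2, 3$ and for any fixed $k = 0, 1, \ldots$,
$$P\{\kappa_{n,T}^{(i)} = k\} = \frac{\Lambda_i^k e^{-\Lambda_i}}{k!}(1 + o(1)),$$
where the $\Lambda_i$ are as in Theorem 1.8.2.
Proof. The assertions of the theorem follow from Theorems 1.8.2 and 1.8.3, since
$$P\{\kappa_{n,T}^{(i)} = k\} = \frac{a_{n,T}^{(i)}(k)}{a_{n,T}^{(i)}}.$$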
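To get a feel for the Poisson limit in Theorem 1.8.4, the parameters can be tabulated for a given $\lambda$. The sketch below assumes that the three parameters are $-\tfrac12\log(1-\lambda) \pm \tfrac{\lambda}{2} \pm \tfrac{\lambda^2}{4}$ with the sign patterns $(+,+)$, $(-,+)$, $(-,-)$ for $i = 1, 2, 3$, and that the limit law is Poisson with that parameter; the function names and the sample value of $\lambda$ are illustrative.

```python
import math

def poisson_parameter(i: int, lam: float) -> float:
    """Limit parameter for graph type i = 1, 2, 3 and 0 < lam < 1 (assumed form)."""
    base = -0.5 * math.log(1.0 - lam)
    corrections = {
        1: lam / 2 + lam ** 2 / 4,    # cycles of lengths 1 and 2 allowed
        2: -lam / 2 + lam ** 2 / 4,   # loops forbidden
        3: -lam / 2 - lam ** 2 / 4,   # loops and double edges forbidden
    }
    return base + corrections[i]

def limit_pmf(i: int, lam: float, k: int) -> float:
    """Limiting probability of exactly k unicyclic components."""
    mean = poisson_parameter(i, lam)
    return mean ** k * math.exp(-mean) / math.factorial(k)

if __name__ == "__main__":
    lam = 0.8                          # illustrative value of the limit of 2T/n
    for i in (1, 2, 3):
        probs = [limit_pmf(i, lam, k) for k in range(4)]
        print(i, poisson_parameter(i, lam), probs)
```

For type 3 the parameter equals $\tfrac12(\lambda^3/3 + \lambda^4/4 + \cdots)$, so it is positive and small for small $\lambda$, which matches the intuition that sparse simple graphs rarely contain cycles.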
Now we consider the case $\theta = 2T/n \to 1$. Let $\omega_{n,T}^{(i)}$ be the number of vertices that lie in the unicyclic components of a random graph from $A_{n,T}^{(i)}$, $i = 1, 2, 3$. It is clear that if we know the distribution of a characteristic of the random graph under the condition $\{\omega_{n,T}^{(i)} = m\}$, the unconditional distribution can be obtained by averaging over the distribution of $\omega_{n,T}^{(i)}$.
Theorem 1.8.5. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any $i = 1, 2, 3$,
$$P\{\omega_{n,T}^{(i)} = m\} = \frac{\varepsilon^2}{2\,\Gamma(1/4)}\,y^{-3/4} e^{-y}(1 + o(1))$$
uniformly with respect to $m$ such that $y = \varepsilon^2 m/2$ lies in any fixed interval of the form $0 < y_0 \le y \le y_1 < \infty$, and there exists a constant $A$ such that, for all $m$,
$$P\{\omega_{n,T}^{(i)} = m\} \le A \varepsilon^2 y^{-3/4} e^{-y}.$$
Proof. We denote the number of graphs in AJ T by a^ T and the number of graphs
for which a>n T = m by a?T m. Clearly,
oo
m=0
A.8.18)
We decompose the sum in A.8.17) into two parts. Let 0 < yo < y\ < oo,
y = ?2m/2, and
oo
\] m=0
By Theorem 1.7.1 and the equalities A.7.21),
%>ml/4 A.8.19)
uniformly in m in the region yo < y < y\, where A,, i = 1, 2, 3, are defined in
A.7.22). There exists a constant c\ such that, for all m,
"i? < c\mm-xl\ A.8.20)
To estimate Fn-m^, it is convenient to use the intermediate formula A.4.25). From
A.4.26) and the equality
6B-6) = 1-e2,
we have
(n-m)!(l -s2)N
F
where, according to Theorem 1.4.1,
= k) =
uniformly in k such that u = (k — N(jl)/(cr «/N) lies in any finite interval,
2 n 2 2A-e)
2-6 N
S(\+SJ'
If e —> 0, e3n —> oo, and m^/s/n —> 0, then for k = n — m,
u —
(k - Nfi)
m
a
a
A+0A)).
Consequently,
P{SN=n-m} =
It follows from A.8.21) and A.8.22) that
(n - m)\ (\ - ?2)Nen{l~s
There exists a constant c such that
aV2jtNP{^N = k] <c;
therefore
Fn-m,N <
_
wA_e
A.8.22)
A.8.23)
A.8.24)
for all m, 0 < m < T. We note that as s -> 0,
(l-e)mesm =e~y(l +
uniformly in m such that yo < y < yi, and for all m,
A - ?re?m < e~y.
Clearly, A.8.23) holds uniformly in m such that y = ?2m/2 lies in the interval
[.yo* y\\- Therefore, if n -> oo, e = 1 — 2T/n -> 0, and e3n -> oo, then
Fn-m,N = fn(n- m)\ e~m-y{\ + o(l)) A.8.25)
uniformly in m such that yo < y < y\, where
fn =
2NNl(\ -e)"
and there exists a constant Aq such that for all m,
Fn-m,N < AoMn-m)\e-m-y. A.8.26)
Therefore, by A.8.18), A.8.19), and A.8.25), we have the equality
m-\/4 -m-y
<7> =n\Aifn- ¦ A+0@) A-8.27)
m\
which holds uniformly in m such that yo < y < y\; and outside of this domain,
by A.8.18), A.8.20), and A.8.26), we have
?2
-y—, A.8.28)
where A is a constant. The sum
1 ?2
~2
is the integral sum of the function (F(l/4))~1z~3/4e~z with step s2/2. Therefore,
by choosing yo small enough and y\ and n large enough, this sum can be made
arbitrarily close to 1, and the sum for remaining values of m can be made arbitrarily
small. Thus
Now it follows from A.8.27) and A.8.28) that
@ 2
n,T,m & —3/4 —
-JT = ^J^y 3/ e
un,T
uniformly in m such that yo < y < yi and that outside this domain,
This completes the proof of the theorem. ¦
When we substitute the exact expressions for A, and fn, we obtain for i =
1,2,3,
a% = Cinl^~E ' e" . (l+o(l)), A-8.29)
where $C_1 = e^{3/4}$, $C_2 = e^{-1/4}$, and $C_3 = e^{-3/4}$. It is easy to confirm that if $\varepsilon = 1 - 2T/n \to 0$, then
n!(l -
2T
2^AHA -e)"v/27rn
Thus, under the conditions of Theorem 1.8.5, the asymptotic formulas
r M2r
(i) W" ,-, ,
*n,
i = 1,2,3,
2TT] ^
are valid.
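Let $\kappa_{n,T}$ denote the number of unicyclic components in a random graph from $A_{n,T}$ and use $\beta_{n,T}$ to denote the number of vertices in the maximal unicyclic component.
Theorem 1.8.6. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any fixed $x$,
$$P\Bigl\{\kappa_{n,T} + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-u^2/2}\,du.$$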
Proof. For any fixed x,
*(*) =
1
<2tx
I
oo
- log e < ^y - 2
00
i
m=0
•>n,T =m}P \Km + Tl0g? <xJ--\0g?
where xm is the number of components in a random graph from Um discussed in
Section 1.7. By Theorem 1.7.2, the random variable
m
is asymptotically normal with parameters @, 1).
Let y = ?2m/2 and 0 < yo < y < y\ < oo. Then logm = logBjF) — 2logs.
Further, since e —> 0,
uniformly in m such that ye [yo, y\] and does not depend on m asymptotically. In
view of Theorem 1.8.5, by choosing yo small enough and y\ and n large enough,
the sum
can be made arbitrarily close to 1. Therefore
for any fixed*. ¦
Consider now the maximum size of the unicyclic components. Recall that in
Section 1.7 we introduced Ws(z, y), setting Wo(z, y) = 1, and
dx\ ¦ ¦ dxs
Ws(z,y)= f
where
Xs(z,y) = {xj >y, i = 1, ..., s, x\ H \-xs <z], s = 1,2
Theorem 1.8.7. Ifn, T -> oo such that ? = 1 — 2T/n -> 0 andean -> oo, then
for any fixed y > 0,
oo
5=0
where
Proof. For any fixed y > 0,
oo
~ >{conj=m}P{e2f3m <y},
m=0
where ^SOT is the maximum size of the components in a random graph from Un
studied in Section 1.7. If y = s2m/2 and y e [yo, y\\, then
By Theorem 1.7.3,
s —\j
It is clear that this holds uniformly in m such that y e [yo, y\\. Choosing a small
enough yo and a large enough y i and averaging over the distribution of conj prove
Theorem 1.8.7. ¦
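The number of trees in any graph of $A_{n,T}$ is $N = n - T$. Let $\eta_{n,T}$ be the maximum size of trees in a random graph from $A_{n,T}$.
Theorem 1.8.8. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
$$P\{\beta\eta_{n,T} - u \le z\} \to e^{-e^{-z}},$$
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation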
mi/2
su. A.8.30)
\n /
Proof. It is clear that
oo
m T m — u < z\ ei83n
m=0
Let v = s3n. It is easily seen that, under the conditions of Theorem 1.8.8, the root
of equation A.8.30) can be written as
-o(l). A.8.32)
Let y = s2m/2 lie in a finite interval 0 < yo < y < y\ < oo. Set
2G--hi) , 2G--hi)
n — m n — m
3
vm = ?^mn, film) = -\og@meOm).
Since sm = e(l + o(l)), it follows from A.8.32) that the root of the equation
can be written as
um = loguOT - §logloguOT -log4^ + 0A) = u +
uniformly in y in any fixed interval [yo, jki ] • Therefore, by applying Theorem 1.6.3,
we obtain
,-e~z
-m,T-m ~ U < z} -+ e~e A.8.33)
uniformly in $y \in [y_0, y_1]$. In the main part of the sum in (1.8.31), this probability does not depend on $m$ asymptotically. Therefore, averaging (1.8.33) over the distribution of $\omega_{n,T}$ proves Theorem 1.8.8. ■
When we compare Theorems 1.8.7 and 1.8.8, we see that the maximum size of trees in a random graph from $A_{n,T}$ is greater than the maximum size of the unicyclic components, since $\beta = \varepsilon^2/2\,(1 + o(1))$ and $u \to \infty$. Let $\alpha_{n,T}$ be the maximum size of components of a random graph from $A_{n,T}$, that is,
$$\alpha_{n,T} = \max(\beta_{n,T}, \eta_{n,T}).$$
Averaging over the distribution of $\omega_{n,T}$ gives the following theorem.
Theorem 1.8.9. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any fixed $z$,
$$P\{\beta\alpha_{n,T} - u \le z\} \to e^{-e^{-z}},$$
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation
1/2
To conclude this section, we consider the case where $n, T \to \infty$ such that $\varepsilon^3 n$
tends to a constant.
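Theorem 1.8.10. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $\varepsilon = 1 - 2T/n$
and $v$ is a constant, then for any $i = 1, 2, 3$,
$$a^{(i)}_{n,T} = c_i\,\frac{n!\,e^{n}}{2^N N!\,\sqrt{N}}\,p(v)(1 + o(1)),$$
where $N = n - T$,
$$c_1 = \frac{\sqrt{3}\,e^{3/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_2 = \frac{\sqrt{3}\,e^{-1/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_3 = \frac{\sqrt{3}\,e^{-3/4}}{2\sqrt{2}\,\Gamma(1/4)},$$
$$p(v) = \int_0^\infty y^{-3/4}\,p(-v - y; 3/2, -1)\,dy,$$
and $p(u; 3/2, -1)$ is the density of the stable law defined by (1.4.18).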
Proof. We again use
$$a_{n,T} = \sum_{m}\binom{n}{m}u_m F_{n-m,N}. \tag{1.8.34}$$
According to Theorem 1.7.1, as $m \to \infty$,
$$u_m = A\,m^{m-1/4}(1 + o(1)), \tag{1.8.35}$$
where the value of the coefficient $A$ depends on the type of the unicyclic
components in $A_{n,T}$, and
$$A_1 = \frac{\sqrt{2\pi}\,e^{3/4}}{2^{1/4}\Gamma(1/4)}, \qquad A_2 = \frac{\sqrt{2\pi}\,e^{-1/4}}{2^{1/4}\Gamma(1/4)}, \qquad A_3 = \frac{\sqrt{2\pi}\,e^{-3/4}}{2^{1/4}\Gamma(1/4)}.$$
To estimate $F_{n-m,N}$ we use formula (1.4.25) with $\theta = 1$. Then
$$F_{n-m,N} = \frac{(n-m)!\,e^{n-m}}{2^N N!}\,P\{\zeta_N = n - m\}, \tag{1.8.36}$$
where $\zeta_N = \xi_1 + \cdots + \xi_N$ is a sum of independent random variables with
distribution (1.4.19):
$$P\{\xi_1 = k\} = \frac{2k^{k-2}e^{-k}}{k!}, \qquad k = 1, 2, \ldots.$$
By Theorem 1.4.2,
$$bN^{2/3}P\{\zeta_N = k\} = p(u; 3/2, -1)(1 + o(1))$$
uniformly in $k$ such that $u = (k - 2N)/(bN^{2/3})$ lies in any fixed finite interval.
Under the conditions of Theorem 1.8.10,
$$(n - 2N)/(bN^{2/3}) \to -v.$$
Let $y = m/(bN^{2/3})$ and $0 < y_0 \le y \le y_1 < \infty$. Then, under the conditions of
the theorem,
$$\frac{n - m - 2N}{bN^{2/3}} = -v - y + o(1).$$
Thus, by (1.8.36),
$$F_{n-m,N} = \frac{(n-m)!\,e^{n-m}}{2^N N!\,bN^{2/3}}\,p(-v - y; 3/2, -1)(1 + o(1)) \tag{1.8.37}$$
uniformly in $m$ such that $y \in [y_0, y_1]$. Since $b = 2(2/3)^{2/3}$, from (1.8.35) and
(1.8.37), we obtain
$$a_{n,T,m} = \binom{n}{m}u_m F_{n-m,N} = \frac{A\,n!\,m^{m-1/4}\,p(-v - y; 3/2, -1)}{m!\,2^N N!\,e^{-n+m}\,bN^{2/3}}(1 + o(1))$$
$$= \frac{A\,n!\,e^{n}}{\sqrt{2\pi}\,b^{7/4}\,2^N N!\,N^{7/6}}\,y^{-3/4}p(-v - y; 3/2, -1)(1 + o(1))$$
uniformly in $m$ such that $y \in [y_0, y_1]$. To obtain $a_{n,T}$, we need to carry out the
summation in (1.8.34). If we choose a small enough $y_0$ and a large enough $y_1$,
substitute the expression of $a_{n,T,m}$ into (1.8.34), note that the obtained sum is the
integral sum of the function $y^{-3/4}p(-v - y; 3/2, -1)$ with step $b^{-1}N^{-2/3}$, and
omit the needed estimation of the tails, we have
$$a_{n,T} = \frac{c\,n!\,e^{n}}{2^N N!\,\sqrt{N}}\int_0^\infty y^{-3/4}p(-v - y; 3/2, -1)\,dy\,(1 + o(1)),$$
where
$$c = \frac{A}{\sqrt{2\pi}\,b^{3/4}} = \frac{\sqrt{3}\,A}{2^{5/4}\sqrt{2\pi}}.$$
Recall our convention that if we consider the set $A^{(i)}_{n,T}$, then $A$ is replaced by $A_i$, $i = 1, 2, 3$.
It follows from Theorem 1.8.10 that the number $\omega_{n,T}$ of the vertices that form
the unicyclic components in a random graph of $A_{n,T}$ has the following limit
distribution:
If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, then
$$bN^{2/3}P\{\omega_{n,T} = m\} = \frac{1}{p(v)}\,y^{-3/4}p(-v - y; 3/2, -1)(1 + o(1))$$
uniformly in $m$ such that $y = m/(bN^{2/3})$ lies in any fixed interval of the form
$0 < y_0 \le y \le y_1 < \infty$, where $p(v)$ is defined in Theorem 1.8.10.
1.9. Notes and references
In this book, we use a probabilistic approach to combinatorial problems. Section 1.1
provides the results from probability theory that suffice for the probabilistic
analysis presented in the book. All of the results in Section 1.1 can be found in standard
treatments of probability theory; however, we follow [76], where these results are
given along with full proofs.
A detailed discussion of the saddle-point method can be found in [42].
Theorem 1.1.7 is a simplified version of the corresponding theorem that gives a full
asymptotic expansion of G(X).
The proof of the local limit theorem (Theorem 1.1.11) was suggested by
B. V. Gnedenko and is contained in the book [49], which remains one of the
best textbooks on the limit theorems of probability theory (see also [43, 122, 60]).
The approximation of the binomial distribution by the normal and Poisson laws
was investigated by Yu. V. Prokhorov [125] (see also [90]). The inequality from
Theorem 1.1.16 was proposed by Hoeffding [59] for sums of bounded random
variables (see also [122]).
Section 1.2 is devoted to a description of the generalized scheme of allocation
of particles, which is a generalization of the multinomial trials. It was introduced in
[69] and now has a significant place in probabilistic combinatorics (see also [78]).
Successful applications of the generalized scheme are mostly limited to the
equiprobable cases; there are only a few examples where a nonequiprobable scheme
has a natural combinatorial interpretation. Along with the nonequiprobable
multinomial distribution, Example 1.2.3 is an example of a nonequiprobable scheme.
Example 1.2.4 concerns random forests with rooted trees and is related to
branching processes. Indeed, the distribution (1.2.11) is that of the total progeny
in the Galton-Watson process $\mu(t, G)$, which begins with one particle and in which
the number of offspring of each particle has a Poisson distribution. Therefore a random forest
with $N$ trees and $n$ nonroot vertices can be represented by the same process that
begins with $N$ particles under the condition that the total progeny is $n + N$. We
describe more precisely the correspondence between random trees and the
branching process $\mu(t, G)$, whose distribution of the number of offspring of one particle
is the Poisson distribution with parameter $\lambda$.
Let $\mu_r(t, G)$ be the number of particles at time $t$ having exactly $r$ direct
descendants, and let $\nu(G)$ be the total progeny over the whole period of evolution of
the process.
Consider the set $T_n$ of all rooted trees whose nonroot vertices are labeled
$1, 2, \ldots, n$, and whose root is labeled by 0. Assigning the probability $(n + 1)^{-n+1}$
to each tree of $T_n$ gives the uniform distribution on $T_n$.
Any vertex of a tree is joined to the root by a unique path, whose number of
edges is called the height of the corresponding vertex. We assume that all the edges
of a tree are directed from the root and call the number of edges emanating from
a vertex the degree of the vertex.
Let $\mu_r(t, T_n)$, $r, t = 0, 1, \ldots, n$, be the number of vertices of height $t$ having
degree $r$. Consider the matrices $\|\mu_r(t, T_n)\|$ and $\|\mu_r(t, G)\|$, $t, r = 0, 1, \ldots, n$,
and a matrix $M = \|m_r(t)\|$ of the same dimension with nonnegative elements.
Kolchin [73] showed that
$$P\{\|\mu_r(t, T_n)\| = M\} = P\{\|\mu_r(t, G)\| = M \mid \nu(G) = n + 1\}.$$
This relation means that the distribution of any random variable that can be
expressed in terms of the random variables $\mu_r(t, T_n)$, $r, t = 0, 1, \ldots, n$, coincides
with the conditional distribution of the corresponding random characteristic of the
branching process under the condition that $\nu(G) = n + 1$.
This scheme has been used widely to obtain a complete description of the
properties of random trees and forests [73, 74, 75, 111, 112, 113, 114, 116]. Recently
Yu. L. Pavlov [118, 119] discovered that the branching process that has a
geometric distribution of the number of offspring corresponds, in the same sense
as discussed above, to a random plane planted tree with unlabeled vertices. This
representation of random plane planted trees is also mentioned in [4, 136, 138].
Note that we are aware of only these two branching processes, with the Poisson
and the geometric distributions of the number of offspring, that lead to sets of
trees with uniform distribution. Results on more general classes of forests with
nonuniform distributions can be found in [120, 121].
The correspondence between random plane planted trees and a branching
process that has a geometric distribution appears to be deep and can be considered
as a correspondence of realizations, that is, there exists a one-to-one
correspondence between the set of such trees and the realizations of the corresponding
branching process. It seems that this fact was first pointed out in an explicit form
by V. A. Vatutin [138].
The general approach to investigating connectivity and the sizes of components
of random graphs of various types is presented in Section 1.3. This general
approach was first outlined by Kolchin [78], but its particular forms had already been
used to investigate other random graphs, such as random permutations, random
mappings, and random forests of rooted trees [71, 72, 73, 74, 75].
Forests of nonrooted trees are investigated in Sections 1.4-1.6. Section 1.4
concerns the number of such forests. The number of forests of $N$ labeled rooted
trees with $n$ nonroot vertices is $N(N + n)^{n-1}$. In contrast to the forests of rooted
trees, the number $F_{n,N}$ of nonrooted forests cannot be expressed by a simple
formula. A complete analysis of the random forests of nonrooted trees was conducted
by V. E. Britikov, who used the generalized scheme of allocation. The
possibility of using such an approach was pointed out in [78, 77]. When Britikov began
investigating $F_{n,N}$, it was known only that for any fixed $N$ as $n \to \infty$,
A complete description of the asymptotic behavior of $F_{n,N}$ can be found in [29].
In particular, [29] generalizes formula (1.9.1) to $N \to \infty$ and proves that if $n \to \infty$
and $(1 - 2T/n)^3 n \to -\infty$, then
The cases in which $(1 - 2T/n)^3 n$ tends to a constant and $(1 - 2T/n)^3 n \to \infty$ are
covered by Theorems 1.4.4 and 1.4.3, respectively.
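The count $N(N + n)^{n-1}$ quoted above for forests of rooted trees can be checked by brute force for small parameters. The Python sketch below is only an illustration: the names `count_rooted_forests` and `reaches_root` and the vertex encoding (roots numbered first, then nonroot vertices) are assumptions of this example, not notation from the book.

```python
from itertools import product


def reaches_root(v, parent, num_roots):
    """Follow parent pointers from v; True if a root is reached without a cycle."""
    seen = set()
    while v >= num_roots:          # vertices 0..num_roots-1 are the roots
        if v in seen:              # a vertex was revisited, so there is a cycle
            return False
        seen.add(v)
        v = parent[v]
    return True


def count_rooted_forests(num_roots, num_nonroots):
    """Count forests with the given labeled roots by trying every parent assignment."""
    total = num_roots + num_nonroots
    nonroots = range(num_roots, total)
    count = 0
    for parents in product(range(total), repeat=num_nonroots):
        parent = dict(zip(nonroots, parents))
        if all(reaches_root(v, parent, num_roots) for v in nonroots):
            count += 1
    return count


for N in range(1, 4):
    for n in range(1, 5):
        print(N, n, count_rooted_forests(N, n), N * (N + n) ** (n - 1))
```

With $N = 1$ the same enumeration counts the rooted trees of the set $T_n$, whose number is $(n + 1)^{n-1}$.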
Section 1.5 deals with the numbers $\mu_r$ of trees with $r$ vertices, $r = 3, 4, \ldots$, in
a random forest. A complete description of the limit distributions of these random
variables was obtained by Britikov [30]. Theorems 1.5.1 and 1.5.2 summarize the
results proved in [30], where, in addition, the behavior of $\mu_1$ and $\mu_2$ is analyzed.
The general approach used to investigate the order statistics in the generalized
scheme was suggested in [70] and is also described in Lemma 1.2.2 in [78]. In
Section 1.6, we apply this approach to the maximum size of trees in random unrooted
forests. The results of this section were obtained by Britikov [28]. Theorems 1.6.1-
1.6.5 cover all possible regular variations of the parameters $n$ and $N$, but not the
case where $N$ is bounded. Clearly, for any fixed $k$, the size of the $k$th largest tree of
the forest can be analyzed in the same way. Łuczak and Pittel [101] realized this
possibility and interpreted the results of their analysis as an evolution of a random
forest (see also [31]).
It is pertinent to note here the results that concern the investigations of the
ordered series of components of wide classes of random graphs [4, 7, 14, 15, 35,
36, 41, 56]. There are two natural ways of labeling the components. One way is to
arrange them in decreasing order; the other is to use a particular random labeling
called the size-biased permutation. For the first type of labeling, let $M_1 \ge M_2 \ge \cdots$
be the sequence of sizes of the components of a graph with $n$ vertices numbered
in decreasing order. For the second, let $C_1$ be the size of the component that contains the vertex
with label 1, let $C_2$ be the size of the component that contains the vertex with the
smallest label among the vertices not included in the first component, and so on.
It is clear that the joint distribution of the random variables $C_1, C_2, \ldots$
normalized by $n$ places unit mass on the set $\Delta$ of infinite sequences of nonnegative
numbers such that
$$\Delta = \{(x_1, x_2, \ldots) : x_1 + x_2 + \cdots = 1\},$$
and the joint distribution of $M_1, M_2, \ldots$ normalized by $n$ is concentrated on the
set
$$\nabla = \{(x_1, x_2, \ldots) \in \Delta : x_1 \ge x_2 \ge \cdots\}.$$
For some classes of graphs, the limit distributions of the sequences $C_1, C_2, \ldots$
and $M_1, M_2, \ldots$ are known. Let us describe a class of the limit distributions.
Let $Z_1, Z_2, \ldots$ be independent identically distributed random variables with
density
$$\theta(1 - z)^{\theta - 1}, \qquad 0 < z < 1, \quad \theta > 0.$$
Let
$$Y_1 = Z_1, \quad Y_2 = Z_2(1 - Z_1), \quad Y_3 = Z_3(1 - Z_1)(1 - Z_2), \quad \ldots$$
and let $Y_{(1)}, Y_{(2)}, \ldots$ be the order statistics constructed from $Y_1, Y_2, \ldots$. The
distribution of $Y_1, Y_2, \ldots$ on $\Delta$ is called the GEM distribution with parameter $\theta$, and
the distribution of $Y_{(1)}, Y_{(2)}, \ldots$ on $\nabla$ is called the Poisson-Dirichlet distribution
with parameter $\theta$.
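The construction of $Y_1, Y_2, \ldots$ above is a stick-breaking scheme, and a short simulation may make it concrete. The sketch below is illustrative only: it uses the fact that the density $\theta(1 - z)^{\theta-1}$ on $(0, 1)$ is the Beta$(1, \theta)$ density, it truncates the infinite sequence after `num_terms` values, and the function names are assumptions of this example.

```python
import random


def gem_sample(theta, num_terms, rng):
    """Return the first num_terms values Y_1, Y_2, ... of a GEM(theta) sequence."""
    remaining = 1.0                          # part of the unit stick not yet used
    ys = []
    for _ in range(num_terms):
        z = rng.betavariate(1.0, theta)      # Z_k with density theta * (1 - z)**(theta - 1)
        ys.append(z * remaining)             # Y_k = Z_k (1 - Z_1) ... (1 - Z_{k-1})
        remaining *= 1.0 - z
    return ys


def poisson_dirichlet_sample(theta, num_terms, rng):
    """Approximate Poisson-Dirichlet sample: the truncated GEM values, sorted decreasingly."""
    return sorted(gem_sample(theta, num_terms, rng), reverse=True)


rng = random.Random(12345)
ys = gem_sample(0.5, 50, rng)
print(sum(ys))                               # close to 1; truncation drops a tiny remainder
print(poisson_dirichlet_sample(1.0, 50, rng)[:3])
```

Sorting a truncated sequence only approximates the order statistics of the infinite one, but the neglected mass decreases geometrically with the number of terms.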
It is known that the distribution of the random variables $M_1, M_2, \ldots$ normalized
by $n$ for the cycle sizes of a random permutation of degree $n$ converges, as $n \to \infty$,
to the Poisson-Dirichlet distribution with parameter $\theta = 1$ and that the random
variable $C_1$ is uniformly distributed on the set $\{1, \ldots, n\}$ (see, for example, [78]).
For random mappings, the distributions of the random variables $C_1, C_2, \ldots$ and
$M_1, M_2, \ldots$ normalized by $n$ converge, respectively, to the GEM distribution and
the Poisson-Dirichlet distribution with parameter $\theta = 1/2$ [3].
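The statement that $C_1$ is uniformly distributed on $\{1, \ldots, n\}$ for a random permutation can be verified exactly for small $n$ by enumeration. In the sketch below, which is only an illustration, permutations act on $\{0, \ldots, n-1\}$, so the vertex with label 1 corresponds to index 0; the function names are assumptions of this example.

```python
from collections import Counter
from itertools import permutations


def cycle_length_of(perm, start=0):
    """Length of the cycle of perm that contains the element start."""
    length, v = 1, perm[start]
    while v != start:
        v = perm[v]
        length += 1
    return length


def first_cycle_size_counts(n):
    """For each k, the number of permutations of degree n with C_1 = k."""
    return Counter(cycle_length_of(p) for p in permutations(range(n)))


print(sorted(first_cycle_size_counts(5).items()))
```

Each size $k = 1, \ldots, n$ is attained by exactly $(n-1)!$ permutations, which is the uniform distribution.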
As usual, let $\alpha_r$ denote the number of components of size $r$ of a random graph
with $n$ vertices. The joint distribution of the random variables $\alpha_1, \ldots, \alpha_n$ of the
form
$$P\{\alpha_1 = a_1, \ldots, \alpha_n = a_n\} = \frac{n!\,\theta^{a_1 + \cdots + a_n}}{\theta(\theta + 1)\cdots(\theta + n - 1)}\prod_{r=1}^{n}\frac{1}{r^{a_r}a_r!},$$
where $a_1, \ldots, a_n$ are nonnegative integers such that $a_1 + 2a_2 + \cdots + na_n = n$, is
similar to the joint distribution of the random variables $\alpha_1, \ldots, \alpha_n$ for a random
permutation (see Lemma 1.3.7). This distribution arises frequently in population
genetics and is known as the Ewens distribution [40, 67].
If the random variables $C_1, C_2, \ldots$ and $M_1, M_2, \ldots$ correspond to a graph
with the Ewens distribution of $\alpha_1, \ldots, \alpha_n$ with parameter $\theta$, then as $n \to \infty$,
the distributions of the normalized random variables converge, respectively, to the
GEM distribution and the Poisson-Dirichlet distribution with the same parameter $\theta$
[67]. See also [139, 140, 141].
Section 1.7 contains the results on unicyclic random graphs obtained in [77]. The
analysis of random graphs with components of two types presented in Section 1.8
is also contained in [77]. The idea of considering a graph as a combination of
connected components of certain types can be attributed to Agadzhanyan [1, 2].
The results of Section 1.8 can be found in [77].
Evolution of random graphs
2.1. Subcritical graphs
This chapter deals with several models of random graphs with $n$ labeled vertices
and $T$ edges as $n, T \to \infty$. The parameter $\theta = 2T/n$ plays a decisive role in the
behavior of random graphs, and it may be interpreted as time in the evolution of the
graphs. It turns out that many of the characteristics change their behavior abruptly
near the point $\theta = 1$. It is convenient to distinguish three domains of the variation
of the parameter $\theta$. We say that a random graph is subcritical if $n, T \to \infty$ in such
a way that $(1 - \theta)^3 n \to \infty$. Thus, for a subcritical graph, $\theta$ may tend to unity, but
not too fast. A critical graph is characterized by the conditions that $n, T \to \infty$ and
$(1 - \theta)^3 n$ tends to a constant. And, finally, a graph is supercritical if $n, T \to \infty$
and $(1 - \theta)^3 n \to -\infty$.
In this section we consider three sets of graphs. Let $\mathcal{G}^{(1)}_{n,T}$ be the set of all graphs
with $n$ labeled vertices and $T$ edges with loops and multiple edges, provided each
vertex may have no more than one loop and each pair of vertices may be connected
by no more than two edges. Let $\mathcal{G}^{(2)}_{n,T}$ be the set of all graphs with $n$ labeled vertices
and $T$ edges that have no loops; however, each edge may occur twice, so that each
pair of vertices may be connected by no more than two edges. And, finally, let
$\mathcal{G}^{(3)}_{n,T}$ be the set of all graphs with $n$ labeled vertices and $T$ edges that have neither
loops nor multiple edges.
Denote the number of graphs in $\mathcal{G}^{(i)}_{n,T}$ by $g^{(i)}_{n,T}$, $i = 1, 2, 3$. We introduce the
uniform distribution on $\mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$, assigning equal probabilities to all elements
of the corresponding set, and denote by $G^{(i)}_{n,T}$ a random graph such that
$$P\{G^{(i)}_{n,T} = G\} = \bigl(g^{(i)}_{n,T}\bigr)^{-1}$$
for any $G \in \mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$.
Recall that in Section 1.8 we considered the sets $A^{(i)}_{n,T}$, $i = 1, 2, 3$, of all graphs
with $n$ labeled vertices and $T$ edges with components of two types: trees and
unicyclic components. In $A^{(3)}_{n,T}$, the unicyclic components have neither loops nor
multiple edges; in $A^{(2)}_{n,T}$, the unicyclic components have no loops, but may contain
cycles of length 2; and in $A^{(1)}_{n,T}$, the unicyclic components may contain loops and
cycles of length 2. Thus,
$$A^{(i)}_{n,T} \subset \mathcal{G}^{(i)}_{n,T}, \qquad i = 1, 2, 3.$$
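The results of Section 1.8 allow us to describe the limit distributions of various
characteristics of subcritical random graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$.
Theorem 2.1.1. If $n, T \to \infty$ such that $(1 - 2T/n)^3 n \to \infty$, then for any
$i = 1, 2, 3$,
$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} \to 1.$$
Proof. It is clear that
$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = \frac{a^{(i)}_{n,T}}{g^{(i)}_{n,T}}.$$
We need to determine the asymptotics of $g^{(i)}_{n,T}$, $i = 1, 2, 3$, under the conditions of
Theorem 2.1.1 to match the results on $a^{(i)}_{n,T}$ from Section 1.8.
Recall that if $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then by Theorems 1.8.1, 1.8.2, and
assertion (1.8.29),
$$a^{(i)}_{n,T} = \frac{n^{2T}c_i(\theta)}{2^T T!}(1 + o(1)) \tag{2.1.1}$$
for any $i = 1, 2, 3$, where
$$c_1(\theta) = e^{\theta/2 + \theta^2/4}, \qquad c_2(\theta) = e^{-\theta/2 + \theta^2/4}, \qquad c_3(\theta) = e^{-\theta/2 - \theta^2/4}.$$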
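If $n, T \to \infty$ and $T^3/n^4 \to 0$, then
$$g^{(3)}_{n,T} = \binom{n(n-1)/2}{T} = \frac{(n(n-1))^T}{2^T T!}\left(1 - \frac{2}{n(n-1)}\right)\cdots\left(1 - \frac{2(T-1)}{n(n-1)}\right)$$
$$= \frac{n^{2T}e^{-T/n - T^2/n^2}}{2^T T!}(1 + o(1)), \tag{2.1.2}$$
and Theorem 2.1.1 is proved for $i = 3$.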
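The quality of the approximation of the binomial coefficient $\binom{n(n-1)/2}{T}$ by $n^{2T}e^{-T/n - T^2/n^2}/(2^T T!)$ can be examined numerically. The following Python sketch is illustrative; it works with logarithms through `math.lgamma` to avoid overflow, and the function names are assumptions of this example.

```python
from math import lgamma, log


def log_binomial(a, b):
    """Natural logarithm of the binomial coefficient C(a, b)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def log_g3_exact(n, T):
    """log of the number of simple graphs with n labeled vertices and T edges."""
    return log_binomial(n * (n - 1) // 2, T)


def log_g3_asymptotic(n, T):
    """log of n^(2T) exp(-T/n - T^2/n^2) / (2^T T!)."""
    return 2 * T * log(n) - T * log(2) - lgamma(T + 1) - T / n - (T / n) ** 2


for n in (100, 1000, 10000):
    T = n // 4                       # a subcritical choice, theta = 2T/n = 1/2
    print(n, log_g3_exact(n, T) - log_g3_asymptotic(n, T))
```

The printed differences of logarithms shrink as $n$ grows, in line with the factor $1 + o(1)$.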
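It is clear that each graph from $\mathcal{G}^{(2)}_{n,T}$ can be obtained by a choice of $T$ edges,
which is equivalent to an allocation of $T$ particles into $\binom{n}{2}$ cells, provided each cell
contains no more than two particles. Therefore
$$g^{(2)}_{n,T} = \sum_{t_1 + 2t_2 = T}\binom{S}{t_1}\binom{S - t_1}{t_2},$$
where $S = \binom{n}{2}$, $t_1$ cells have exactly one particle, and $t_2$ cells have two particles.
Hence,
$$g^{(2)}_{n,T} = \sum_{t_1 + 2t_2 = T}\frac{S!}{t_1!\,t_2!\,(S - t_1 - t_2)!} = \frac{1}{T!}\sum_{t=0}^{\lfloor T/2\rfloor}\frac{1}{t!}\,\frac{T!\,S!}{(T - 2t)!\,(S - T + t)!}.$$
For any fixed $t$,
$$\frac{T!\,S!}{(T - 2t)!\,(S - T + t)!} = \frac{S!}{(S - T)!}\left(\frac{2T^2}{n^2}\right)^{t}(1 + o(1)).$$
Therefore, under the conditions of Theorem 2.1.1,
$$g^{(2)}_{n,T} = \frac{n^{2T}e^{-T/n - T^2/n^2}}{2^T T!}\sum_{t=0}^{\infty}\frac{1}{t!}\left(\frac{2T^2}{n^2}\right)^{t}(1 + o(1)) = \frac{n^{2T}e^{-T/n + T^2/n^2}}{2^T T!}(1 + o(1)). \tag{2.1.3}$$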
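Similarly, each graph from $\mathcal{G}^{(1)}_{n,T}$ can be obtained by a choice of $T$ edges, which is
equivalent to an allocation of $T$ particles into $n + \binom{n}{2}$ cells, provided that no more
than two particles are allocated into each of $\binom{n}{2}$ cells and only one particle may be
put into each of $n$ cells. Therefore, putting $S = \binom{n}{2}$ yields
$$g^{(1)}_{n,T} = \sum_{t_0 + t_1 + 2t_2 = T}\binom{n}{t_0}\frac{S!}{t_1!\,t_2!\,(S - t_1 - t_2)!}.$$
By the same arguments under the conditions of Theorem 2.1.1,
$$g^{(1)}_{n,T} = \frac{n^{2T}e^{T/n + T^2/n^2}}{2^T T!}(1 + o(1)). \tag{2.1.4}$$
Then, by comparing (2.1.1) to (2.1.2), (2.1.3), and (2.1.4), we obtain the assertion
of the theorem. ∎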
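The allocation argument used in the proof for graphs without loops but with edges of multiplicity at most two can be checked on small cases. The sketch below, offered only as an illustration, compares the sum over $t_1 + 2t_2 = T$ of $S!/(t_1!\,t_2!\,(S - t_1 - t_2)!)$, with $S = \binom{n}{2}$, against a direct enumeration of all occupancy vectors with at most two particles per cell; the function names are assumptions of this example.

```python
from itertools import product
from math import comb, factorial


def g2_formula(n, T):
    """Sum of S! / (t1! t2! (S - t1 - t2)!) over t1 + 2 t2 = T, with S = C(n, 2)."""
    S = comb(n, 2)
    total = 0
    for t2 in range(T // 2 + 1):
        t1 = T - 2 * t2
        if t1 + t2 <= S:                      # cannot occupy more than S cells
            total += factorial(S) // (
                factorial(t1) * factorial(t2) * factorial(S - t1 - t2)
            )
    return total


def g2_brute_force(n, T):
    """Count occupancy vectors in {0, 1, 2}^S whose entries sum to T."""
    S = comb(n, 2)
    return sum(1 for occ in product(range(3), repeat=S) if sum(occ) == T)


print(g2_formula(4, 3), g2_brute_force(4, 3))
```

The same comparison can be extended to graphs with loops by adding $n$ cells of capacity one.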
According to Theorem 2.1.1, each of the subcritical graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$,
with probability tending to 1, consists only of trees and unicyclic components and
does not contain more complicated components.
Given a random graph $G$, denote by $\mu_r(G)$ the number of trees of size $r$, by
$\eta(G)$ the maximum size of trees, by $\omega(G)$ the total number of vertices in the
unicyclic components, by $\varkappa(G)$ the number of unicyclic components, by $\beta(G)$ the
maximum size of the unicyclic components, and by $\alpha(G)$ the maximum size of
the components.
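Let $\gamma(G^{(i)}_{n,T})$ be a characteristic of the random graph $G^{(i)}_{n,T}$ and let $\gamma^{(i)}_{n,T}$ be the
corresponding characteristic of the random graph from $A^{(i)}_{n,T}$. Then, by the formula
of total probability,
$$P\{\gamma(G^{(i)}_{n,T}) \le x\} = P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\}\,P\{\gamma^{(i)}_{n,T} \le x\} + P\{G^{(i)}_{n,T} \notin A^{(i)}_{n,T}\}\,P\{\gamma(G^{(i)}_{n,T}) \le x \mid G^{(i)}_{n,T} \notin A^{(i)}_{n,T}\}$$
for any $x$. By Theorem 2.1.1,
$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} \to 1$$
if the graph $G^{(i)}_{n,T}$ is subcritical. Therefore, for any characteristic $\gamma(G^{(i)}_{n,T})$ of the
subcritical graph,
$$P\{\gamma(G^{(i)}_{n,T}) \le x\} = P\{\gamma^{(i)}_{n,T} \le x\}(1 + o(1)) + o(1), \tag{2.1.5}$$
and if $P\{\gamma^{(i)}_{n,T} \le x\}$ tends to a limit, then the probability $P\{\gamma(G^{(i)}_{n,T}) \le x\}$ has the
same limit. Thus, many of the results of Section 1.8 can be reformulated for the
corresponding characteristics of the random graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$. If $\gamma(G^{(i)}_{n,T})$
is an integer-valued characteristic, then for any fixed integer $k$,
$$P\{\gamma(G^{(i)}_{n,T}) = k\} = P\{\gamma^{(i)}_{n,T} = k\}(1 + o(1)) + o(1), \tag{2.1.6}$$
and if $P\{\gamma^{(i)}_{n,T} = k\}$ has a nonzero limit, then relation (2.1.6) allows us to obtain
the limit of the probability $P\{\gamma(G^{(i)}_{n,T}) = k\}$.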
Theorem 2.1.2. If $n, T \to \infty$ such that $T/n \to 0$, then for any $i = 1, 2, 3$,
If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any fixed
$x > 0$ and any $i = 1, 2, 3$,
Proof. The assertions of the theorem follow from (2.1.5), (2.1.6), and
Theorems 1.8.1 and 1.8.5. ∎
Theorem 2.1.3. If the graph $G^{(i)}_{n,T}$ is subcritical, $i = 1, 2, 3$, and $r = r(n, T) \ge 3$
varies such that $Np_r(\theta) \to \infty$, then for any fixed $x$,
- Npr@)
where
$N = n - T$,
$\theta = 2T/n$,
< x
— ('
Pr(d)(l - Pr(d) -{IX- kJPr(d)
Orr\p) = -z
B-0)'
2
cr =
If $r = r(n, T) \ge 3$ varies such that $Np_r(\theta) \to \lambda$, $0 < \lambda < \infty$, then for any fixed
$k = 0, 1, \ldots$,
Proof. In view of (2.1.5) and (2.1.6), the assertion of the theorem follows from
Theorems 1.5.1 and 1.5.2 because, by Theorem 2.1.2, the number $\omega(G^{(i)}_{n,T})$ of
vertices in the unicyclic components for subcritical graphs is small compared with
the total number of vertices; more precisely, $P\{\omega(G^{(i)}_{n,T}) < n^{2/3}\} \to 1$. ∎
Theorem 2.1.4. If $n, T \to \infty$ such that $T/n \to 0$, $r = r(n, T) \ge 1$ and
$Np_r(\theta) \to \infty$, $Np_{r+1}(\theta) \to \lambda$, $0 < \lambda < \infty$, then for any $i = 1, 2, 3$,
Proof. In view of (2.1.5) and (2.1.6), the assertions of the theorem follow from
Theorem 1.6.1. ∎
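Theorem 2.1.5. If $i = 1, 2, 3$ and $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$,
$0 < \lambda < 1$, then for any fixed $k = 0, 1, \ldots$,
$$P\{\varkappa(G^{(i)}_{n,T}) = k\} = \frac{\Lambda_i^k e^{-\Lambda_i}}{k!}(1 + o(1)),$$
where
$$\Lambda_1 = -\tfrac{1}{2}\log(1 - \lambda) + \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_2 = -\tfrac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_3 = -\tfrac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4}.$$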
For any fixed k = 0, ± 1,...,
= exp { -
where
$$a = \frac{\log n - \tfrac{5}{2}\log\log n}{\theta - 1 - \log\theta},$$
and $[a]$ and $\{a\}$ are, respectively, the integer and fractional parts of $a$.
Proof. The assertions of the theorem follow from (2.1.5), (2.1.6), and
Theorems 1.8.4 and 1.6.2. ∎
Theorem 2.1.6. If $i = 1, 2, 3$ and $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and
$\varepsilon^3 n \to \infty$, then for any fixed $x$,
and for any fixed $x > 0$,
oo
is defined in Theorem 1.8.7. Finally, for any fixed z,
- u <
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation
/o\ 1/2
^ u. B.1.7)
Proof. The results of the theorem are the consequences of (2.1.5), (2.1.6), and
Theorems 1.8.6, 1.8.7, 1.8.8, and 1.8.9. ∎
2.2. Critical graphs
Recall that a graph with $n$ vertices and $T$ edges is called critical if $n, T \to \infty$ such
that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n$ tends to a constant. We have seen that many of
the characteristics of the random graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$, change their behavior
if $\theta = 2T/n$ approaches the value 1. For example, the number of cycles, or the
number of unicyclic components $\varkappa(G^{(i)}_{n,T})$, tends to zero in probability if $\theta \to 0$,
has in the limit the Poisson distribution with parameter $\Lambda_i$, $i = 1, 2, 3$, respectively, if $\theta \to \lambda$,
$0 < \lambda < 1$, where
$$\Lambda_1 = -\tfrac{1}{2}\log(1 - \lambda) + \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_2 = -\tfrac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_3 = -\tfrac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4},$$
and is asymptotically normal with parameters $(-\tfrac{1}{2}\log\varepsilon, -\tfrac{1}{2}\log\varepsilon)$ if $\varepsilon \to 0$,
$\varepsilon^3 n \to \infty$. Thus, $\theta = 1$ is a singular point, and one can correctly suppose that
the behavior of the graphs near this point is interesting but difficult to investigate.
Indeed, not much is known about the properties of critical graphs. We present here
only one assertion about this behavior.
Recall that $A^{(i)}_{n,T}$ is the set of graphs with $n$ labeled vertices and $T$ edges that
consist of trees and unicyclic components with neither loops nor multiple edges
for $i = 3$, without loops and with cycles of length 2 allowed for $i = 2$, and with
cycles of lengths 1 and 2 allowed for $i = 1$.
Theorem 2.2.1. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $v$ is a
constant, then for any random graph $G^{(i)}_{n,T}$, $i = 1, 2, 3$,
where
$$p(v) = \int_0^\infty y^{-3/4}p(-v - y; 3/2, -1)\,dy$$
and $p(y; 3/2, -1)$ is the density of the stable law, introduced in Theorem 1.4.2,
with the characteristic function
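Proof. It is clear that
$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = \frac{a^{(i)}_{n,T}}{g^{(i)}_{n,T}},$$
where $a^{(i)}_{n,T}$ is the number of graphs in $A^{(i)}_{n,T}$, and $g^{(i)}_{n,T}$ is the number of graphs in
$\mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$. In accordance with Theorem 1.8.10,
$$a^{(i)}_{n,T} = c_i\,\frac{n!\,e^{n}}{2^N N!\,\sqrt{N}}\,p(v)(1 + o(1)),$$
where $N = n - T$,
$$c_1 = \frac{\sqrt{3}\,e^{3/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_2 = \frac{\sqrt{3}\,e^{-1/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_3 = \frac{\sqrt{3}\,e^{-3/4}}{2\sqrt{2}\,\Gamma(1/4)}.$$
In the previous section, we proved that
$$g^{(i)}_{n,T} = \frac{n^{2T}c_i(1)}{2^T T!}(1 + o(1)),$$
where $c_1(1) = e^{3/4}$, $c_2(1) = e^{-1/4}$, $c_3(1) = e^{-3/4}$.
Since $T = n(1 - \varepsilon)/2$ and $\varepsilon^3 n \to 8v^3/9$, we easily find
and, consequently,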
The function $p(v)$ can be represented by a convergent power series. The function
$$g(v) = p(-v) = \int_0^\infty y^{-3/4}p(v - y; 3/2, -1)\,dy$$
can be thought of as the convolution of the function
$$g_1(y) = \begin{cases} y^{-3/4}, & y > 0,\\ 0, & y \le 0 \end{cases}$$
and the function $g_2(y) = p(y; 3/2, -1)$, so that
$$g(v) = \int_{-\infty}^{\infty} g_1(y)\,g_2(v - y)\,dy.$$
Therefore the Fourier transform $\hat g(t)$ of the function $g(v)$ is the product of the
Fourier transforms of the functions $g_1(y)$ and $g_2(y)$. The Fourier transform $\hat g_1(t)$
of the function $g_1(y)$ is
gi(t) =
and the Fourier transform $\hat g_2(t)$ of the function $g_2(y) = p(y; 3/2, -1)$ is the
characteristic function of this density:
Thus,
By the inversion formula,
'00
,—itv Xi
1 f°
g(v) = — / e-ltvg(t)dt
¦^TT J—oo
= f e\t\e
V2rC/4) y-oo
and therefore, under the hypotheses of Theorem 2.2.1,
v^r(i/4)V2rC/4)
where
h(v) =
—OO
Since $\Gamma(1/4)\,\Gamma(3/4) = \sqrt{2}\,\pi$, we obtain
The function $h(v)$ can be represented by a convergent power series.
Theorem 2.2.2. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $v$ is a
constant, then for any random graph $G^{(i)}_{n,T}$, $i = 1, 2, 3$,
AC — (J
Proof. Let us represent h{v) by a power series in v. Since the left-hand side of
B.2.1) is real,
J—oo
Consider first the integral
roo
h[(v) = / e"vrl'*ein'*exp{-t3/2el"/4\dt.
Jo
By expanding e'tv, we obtain
h, (iO = e1*'* f] ^ r tk~[/4 exp j - W 4}</'-
k=0
After the change of variables t3^2ein^4 = z, we obtain
Therefore
Similarly, for
i-c
-r
Jo
we obtain
The assertion of the theorem follows from B.2.1), B.2.2), and B.2.3). ¦
Theorem 2.2.2 allows us to calculate the limit values of $P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\}$. For
example,
$$P(0) = \sqrt{2/3}.$$
Some values of $P(v)$ are given in Table 2.1.
2.3. Random graphs with independent edges
When we were determining the number of graphs in the classes $\mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$, in
Section 2.1, we associated each of the classes with the corresponding equiprobable
scheme of allocating particles into cells. It is easily seen from these
correspondences that the realizations of each of the random graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$, could
be obtained by a sequential allocation of particles, but these random allocations
are dependent. For example, if a pair of vertices has been connected in the random
graph $G^{(3)}_{n,T}$ after allocating some of the edges, then the outcomes of all subsequent
allocations cannot be the edges connecting these two vertices.

Table 2.1. Values of P(v)

      v     P(v)        v     P(v)        v     P(v)
   -3.0   0.0053     -1.0   0.4919      1.2   0.9563
   -2.8   0.0118     -0.8   0.5727      1.4   0.9653
   -2.6   0.0239     -0.6   0.6470      1.6   0.9722
   -2.4   0.0443     -0.4   0.7128      1.8   0.9776
   -2.2   0.0755     -0.2   0.7693      2.0   0.9819
   -2.0   0.1196      0.2   0.8551      2.2   0.9852
   -1.8   0.1768      0.4   0.8860      2.4   0.9878
   -1.6   0.2461      0.6   0.9105      2.6   0.9899
   -1.4   0.3244      0.8   0.9297      2.8   0.9915
   -1.2   0.4078      1.0   0.9447      3.0   0.9929

The classes of random graphs whose edges are independent seem to be easier
to investigate by using the methods of probability theory. The best-known random
graph with this property is $G_{n,p}$ with $n$ vertices such that each of the $\binom{n}{2}$ possible
edges belongs to the edge set of $G_{n,p}$ with probability $p$ independently of the
behavior of the other edges. This graph has a random number of edges with the
binomial distribution with $\binom{n}{2}$ trials and the probability of success $p$.
In this section, we consider the random graph $G_{n,T}$ with $n$ vertices labeled
$1, \ldots, n$ and $T$ edges that can be obtained by $T$ independent trials. In each trial,
the loop at any vertex $i$ occurs with probability $n^{-2}$ and the edge connecting
the vertices $i$ and $j$, $i \ne j$, occurs with probability $2n^{-2}$. In other words, if
the edge set of $G_{n,T}$ consists of $T$ edges $(i(1), j(1)), \ldots, (i(T), j(T))$, then
$i(1), j(1), \ldots, i(T), j(T)$ are independent identically distributed random
variables taking the values $1, 2, \ldots, n$ with equal probabilities. It is clear that the
realizations of the random graph $G_{n,T}$ are not equiprobable. For example, for $n = 2$
and $T = 1$, the two graphs with a loop and an isolated vertex have probability 1/4
each, and the connected graph has probability 1/2. Nevertheless, this model
has some advantages and is conducive to treatment by probabilistic methods.
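The example with $n = 2$ and $T = 1$ can be reproduced by enumerating all equally likely sequences of endpoint pairs. The Python sketch below is an illustration under the model just described; the function name `graph_distribution` and the representation of a graph as a sorted tuple of unordered edges are assumptions of this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def graph_distribution(n, T):
    """Exact distribution of the edge multiset when 2T endpoints are chosen uniformly."""
    weight = Fraction(1, n * n) ** T                 # probability of one endpoint sequence
    endpoint_pairs = list(product(range(1, n + 1), repeat=2))
    dist = Counter()
    for seq in product(endpoint_pairs, repeat=T):
        # Forget the orientation of each edge and the order of the trials.
        graph = tuple(sorted(tuple(sorted(edge)) for edge in seq))
        dist[graph] += weight
    return dict(dist)


for graph, prob in sorted(graph_distribution(2, 1).items()):
    print(graph, prob)
```

The two loops $(1, 1)$ and $(2, 2)$ each receive probability 1/4 and the edge $(1, 2)$ receives 1/2, so the realizations are indeed not equiprobable.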
Since $i(1), j(1), \ldots, i(T), j(T)$ are independent identically distributed
random variables, we can associate to the random graph $G_{n,T}$ the classical scheme
of allocating particles where $2T$ particles are allocated into $n$ cells such that each
particle falls into any of $n$ cells with probability $1/n$ independently of the
allocations of the other particles. By using this relationship, we can, for example,
easily find the distribution of the number of loops in $G_{n,T}$. Indeed, we have $T$
trials, corresponding to $T$ edges, and in each of these trials a loop appears with
probability $1/n$. Thus, the total number of loops $\alpha_1$ in $G_{n,T}$ has the binomial
distribution with parameters $(T, 1/n)$. The mean number of loops is $E\alpha_1 = T/n$. If
$2T/n \to \lambda$, $0 < \lambda < \infty$, then the Poisson distribution with parameter $\lambda/2$ is the
limit distribution for $\alpha_1$.
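The binomial law of the number of loops can be confirmed exactly for small $n$ and $T$ by listing all endpoint sequences. The following sketch is illustrative only; the function names are assumptions of this example, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb


def loop_count_distribution(n, T):
    """Exact distribution of the number of loops among T independent trials."""
    weight = Fraction(1, n * n) ** T
    pairs = list(product(range(n), repeat=2))        # ordered endpoints (i, j)
    dist = [Fraction(0)] * (T + 1)
    for seq in product(pairs, repeat=T):
        loops = sum(1 for i, j in seq if i == j)
        dist[loops] += weight
    return dist


def binomial_pmf(T, q):
    """Probabilities of the binomial distribution with parameters (T, q)."""
    return [comb(T, k) * q**k * (1 - q) ** (T - k) for k in range(T + 1)]


print(loop_count_distribution(3, 3))
print(binomial_pmf(3, Fraction(1, 3)))
```

Both lines print the same probabilities, and the mean of the distribution equals $T/n$.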
Under the condition $\alpha_1 = m$, the other edges may be considered as the result
of $T - m$ independent allocations into $\binom{n}{2}$ cells corresponding to the $\binom{n}{2}$ possible
edges of the complete graph with $n$ vertices. Therefore, with $\alpha_1 = m$, the number
$\alpha_2$ of cycles of length 2 in $G_{n,T}$ can be thought of as the number of cells with
exactly two particles in the classical (equiprobable) scheme of allocation of $T - m$
particles into $\binom{n}{2}$ cells. The classical scheme of allocation has been well studied. In
particular, if $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < \infty$, then the distribution
of the number of cells occupied by exactly two particles each converges to the
Poisson distribution with parameter $\lambda^2/4$. Since the limit distribution does not
depend on $m$ for $m = o(n)$, averaging over the distribution of $\alpha_1$ shows that $\alpha_1$
and $\alpha_2$ are asymptotically independent and their distributions approach the Poisson
distributions.
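Theorem 2.3.1. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < \infty$, then for any
fixed nonnegative integers $k_1$ and $k_2$,
$$P\{\alpha_1 = k_1, \alpha_2 = k_2\} = \frac{1}{k_1!\,k_2!}\left(\frac{\lambda}{2}\right)^{k_1}\left(\frac{\lambda^2}{4}\right)^{k_2}e^{-\lambda/2 - \lambda^2/4}(1 + o(1)).$$
Because the edges of $G_{n,T}$ are independent, we can apply direct probabilistic
approaches to investigations of the structure of $G_{n,T}$.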
Theorem 2.3.2. If $n, T \to \infty$ such that $T/n \to 0$, then in $G_{n,T}$, with probability
tending to 1, there are no cycles and all the components are trees.
Proof. Denote the number of cycles of length $r$ with $r$ distinct vertices by $\alpha_r$, and
let $\nu(G_{n,T}) = \alpha_1 + \cdots + \alpha_n$ be the total number of cycles considered as induced
subgraphs of $G_{n,T}$. We can represent $\alpha_r$ as a sum of indicators. The edges of $G_{n,T}$
appear sequentially in $T$ trials. We assign the numbers $1, 2, \ldots, T$ to the trials
and arrange (in some order) all $\binom{T}{r}$ possible subsets of cardinality $r$ of the trial
numbers. We define the random variable $\xi_i$ to be equal to 1 if the subset of trial
numbers labeled with $i$ forms a cycle in $G_{n,T}$, and $\xi_i = 0$ otherwise. It is clear that
$$\alpha_r = \xi_1 + \cdots + \xi_{\binom{T}{r}}.$$
In turn, each of the random variables $\xi_1, \ldots, \xi_{\binom{T}{r}}$ can be represented as a sum of
indicators. The cycle corresponding to the subset with label $i$ can be constructed
from $r$ different vertices and $r$ different edges. There exist $\binom{n}{r}$ possibilities to
choose these $r$ vertices and $(r - 1)!/2$ possibilities to construct a cycle from these
$r$ vertices for $r \ge 3$. Each construction fixes $r$ edges that must occur. These $r$ edges
can occur at $r$ fixed places of the subset labeled $i$, and there exist $r!$ possibilities
to assign these $r$ edges to $r$ places. Thus the event $\{\xi_i = 1\}$ can be realized by one
of the $\binom{n}{r}(r - 1)!\,r!/2$ variants.
For $r \ge 3$, each of these variants has the probability $(2/n^2)^r$. Thus,
$$E\alpha_r = \binom{T}{r}\binom{n}{r}\frac{(r - 1)!\,r!}{2}\left(\frac{2}{n^2}\right)^{r}. \tag{2.3.1}$$
It is not difficult to check that this formula is also valid for $r = 1$ and $r = 2$.
It follows from (2.3.1) that
$$E\alpha_r \le \frac{T^r n^r (r - 1)!\,r!}{r!\,r!\,2}\left(\frac{2}{n^2}\right)^{r} = \left(\frac{2T}{n}\right)^{r}\frac{1}{2r}.$$
Therefore,
$$E\nu(G_{n,T}) = \sum_{r=1}^{n}E\alpha_r$$
has the upper bound
$$E\nu(G_{n,T}) \le \sum_{r=1}^{\infty}\frac{1}{2r}\left(\frac{2T}{n}\right)^{r}.$$
Under the conditions of the theorem, $E\nu(G_{n,T})$ tends to zero and the number of
cycles in $G_{n,T}$ is zero with probability approaching 1. ∎
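The counting argument of the proof gives the mean number of cycles of length $r$ as $\binom{T}{r}\binom{n}{r}\frac{(r-1)!\,r!}{2}(2/n^2)^r$, bounded above by $(2T/n)^r/(2r)$. The Python sketch below, offered as an illustration only, evaluates both expressions exactly and compares the case $r = 3$ with a brute-force enumeration over all endpoint sequences; the function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb, factorial


def expected_cycles(n, T, r):
    """C(T, r) C(n, r) (r-1)! r! / 2 * (2 / n^2)^r as an exact fraction."""
    variants = Fraction(comb(n, r) * factorial(r - 1) * factorial(r), 2)
    return comb(T, r) * variants * Fraction(2, n * n) ** r


def upper_bound(n, T, r):
    """The bound (2T / n)^r / (2r)."""
    return Fraction(2 * T, n) ** r / (2 * r)


def expected_triangles_brute_force(n, T):
    """Average number of triples of trials whose edges form a triangle."""
    pairs = list(product(range(n), repeat=2))
    total = 0
    for seq in product(pairs, repeat=T):
        edges = [frozenset(e) for e in seq]          # a loop becomes a one-element set
        for trio in combinations(edges, 3):
            no_loops = all(len(e) == 2 for e in trio)
            vertices = set().union(*trio)
            # Three distinct non-loop edges on exactly three vertices form a triangle.
            if no_loops and len(set(trio)) == 3 and len(vertices) == 3:
                total += 1
    return Fraction(total, len(pairs) ** T)


print(expected_cycles(4, 3, 3), expected_triangles_brute_force(4, 3))
print(expected_cycles(4, 3, 3) <= upper_bound(4, 3, 3))
```

For $r = 1$ the formula reduces to $T/n$, the mean number of loops found earlier.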
We denote by $A_{n,T}$ the set of all graphs with $n$ labeled vertices and $T$ edges
whose components are trees and unicyclic components. Note that loops and cycles
of length 2 are permitted. As before, $\theta = 2T/n$, $\varepsilon = 1 - 2T/n$.
Theorem 2.3.3. If $n, T \to \infty$ such that $\varepsilon^3 n \to \infty$, then
$$P\{G_{n,T} \notin A_{n,T}\} \le \frac{4}{\varepsilon^3 n}.$$
Proof. We have to prove that under the conditions of the theorem, the graph $G_{n,T}$
has a component with more than one cycle with probability less than $4/(\varepsilon^3 n)$. If
in $G_{n,T}$ there exists such a component, then in $G_{n,T}$ there either exists a subgraph
that consists of two cycles connected by a chain (pince-nez) or there exist two
cycles that have a common sequence of edges (a cycle with a bridge). We use $\xi^{(t)}_{r,s}$
to denote the number of subgraphs of $G_{n,T}$ that consist of cycles of lengths $r$ and
$s$ connected by a chain of $t$ edges, and denote by $\zeta^{(t)}_{r}$ the number of subgraphs of
$G_{n,T}$ that consist of a cycle of length $r$ with two vertices connected by a sequence
of $t$ edges. To prove the assertion of the theorem, it is sufficient to show that the
mean number of such subgraphs tends to zero. It is clear that
P{GnJ <?AnJ} = f
r,a,t rj
r,s,t rj
By reasoning in the same way as in the proof of formula B.3.1), we obtain the
estimates
\r + tj \n2) n \n
71 \ / 9 \ r+'s+' o /Tr\r+S+'
X
Thus, the mathematical expectation of the total number of pince-nez and cycles
with a bridge can be estimated as follows:
00 00
\ r+s+t
/2TV
Theorem 2.3.4. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$,
then the distribution of the number of cycles $\nu(G_{n,T})$ in $G_{n,T}$ converges to the Poisson
distribution with parameter $\Lambda = -\frac{1}{2}\ln(1 - \lambda)$.
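Proof. In view of Theorems 2.3.1 and 2.3.3, we can reduce the proof to the
application of Theorem 2.1.5 concerning the random graph $\mathcal{G}_{n,T}$ without loops
and multiple edges. Indeed, by the formula of total probability,
$$\begin{aligned}
P\{\nu(G_{n,T}) = k\} = {} & \sum_{k_1+k_2 \le k} P\{\alpha_1 = k_1,\ \alpha_2 = k_2,\ G_{n,T} \in A_{n,T}\} \\
& \times P\{\nu(G_{n,T}) = k \mid \alpha_1 = k_1,\ \alpha_2 = k_2,\ G_{n,T} \in A_{n,T}\} \\
& + \sum_{k_1+k_2 \le k} P\{\alpha_1 = k_1,\ \alpha_2 = k_2,\ G_{n,T} \notin A_{n,T}\} \\
& \times P\{\nu(G_{n,T}) = k \mid \alpha_1 = k_1,\ \alpha_2 = k_2,\ G_{n,T} \notin A_{n,T}\}.
\end{aligned}$$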
According to Theorem 2.3.3, $P\{G_{n,T} \notin A_{n,T}\} \to 0$, and it is not difficult to see
that
P{GnJ eA\a\ =ku cc2 = k2) =
P[v(Gn,T) = k\ak=k\, ec2= k2, Gnj e An
Thus
{v(GnJ) = k}= J2 Pf"i = *i. «2 = h) B.3.2)
k\+k2<k
According to Theorem 2.1.5, under the conditions of Theorem 2.3.4, for any
fixed $k_1, k_2 = 0, 1, \dots$, and $k \ge k_1 + k_2$,
\k—k\—k2 —A3
where
2 2 4
Now it follows from (2.3.2) and Theorem 2.3.1 that
-ki -k2)\
k\
x g
where
$$\Lambda = \Lambda_3 + \frac{\lambda}{2} + \frac{\lambda^2}{4} = -\frac{1}{2}\ln(1 - \lambda).$$
By reasoning in the same way, we can reformulate the theorems proved for
$\mathcal{G}_{n,T}$ so that they can also be applied to subcritical and critical graphs $G_{n,T}$. As an
example, we give an analogue of Theorem 2.1.6 on the number $\varkappa(G_{n,T})$ and on the
maximum sizes $\eta(G_{n,T})$, $\beta(G_{n,T})$, and $\alpha(G_{n,T})$ of trees, unicyclic components,
and all components in $G_{n,T}$, respectively.
Theorem 2.3.5. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
for any fixed x,
P
x(GnJ) + - logs < xJ-- logs
2 ° ~ V 2
for any fixed x > 0,
f
J -oo
00
where Zs{x) is defined in Theorem 1.8.7; and
a(GnfT) -u<z} = P{/3r}(GnJ) -u<
where E = — log(#e~0), ^ = 2T/n, and u is the root of the equation
/o\ 1/2
rij 2
For the same reasons, Theorem 2.2.2 can be extended to the critical graph $G_{n,T}$.
Theorem 2.3.6. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2 \cdot 3^{-2/3} v$, where v is a
constant, then
For the supercritical case where $n, T \to \infty$ such that $\varepsilon^3 n \to -\infty$, we present
here only the simplest results. In the final section of this chapter, we will give a
short review of what is known about the supercritical graphs.
It is known that if $\theta = 2T/n \to \lambda$, $\lambda > 1$, a giant component appears in the
graph $\mathcal{G}_{n,T}$ and, with probability tending to 1, $\mathcal{G}_{n,T}$ consists of trees, unicyclic
components, and this giant component formed by all the vertices that are not
contained in trees and unicyclic components. As $2T/n$ increases, the size of the
giant component increases and the number of unicyclic components decreases.
If $\theta = 2T/n \to \lambda$, $1 < \lambda < \infty$, then the number of unicyclic components has
a Poisson distribution. For $\theta \to \infty$, we have the following result.
Theorem 2.3.7. If $n, T \to \infty$ such that $\theta = 2T/n \to \infty$, then with probability
tending to 1, there are no unicyclic components in $G_{n,T}$.
Proof. The number of unicyclic graphs with r labeled vertices is not greater than
$c\,r^{r-1/2}$, where c is a constant (see, e.g., [16]). Denote by $\varkappa_r(G_{n,T})$ the number of
unicyclic components of size r in $G_{n,T}$. By reasoning as in the proof of (2.3.1),
we find that
$$E\varkappa_r(G_{n,T}) \le c \binom{n}{r}\binom{T}{r} r^{r-1/2}\, r! \left(\frac{2}{n^2}\right)^r \left(1 - \frac{2r(n-r)}{n^2} - \frac{r(r-1)}{n^2}\right)^{T-r}, \qquad (2.3.3)$$
where the last factor is the probability that the T — r edges, which were not used
for the construction of unicyclic components, neither connect the vertices in the
component with the vertices outside the component nor connect any pair of vertices
in the component.
It is sufficient to prove that
$$\sum_{1 \le r \le n} E\varkappa_r(G_{n,T}) \to 0.$$
With the help of estimate (2.3.3), we find that
Exr(Gn,T)<
For sufficiently large n and 1 < r < n,
e-2r(n-(r+l)/2)(T-r)/n2
and $q = \theta e^{1-\theta/4} < 1$. Therefore
oo
Since $q = \theta e^{1-\theta/4} \to 0$ as $\theta \to \infty$, we conclude that a unicyclic component
exists in $G_{n,T}$ with a probability that tends to zero. ∎
Finally, we consider the behavior of the random graph $G_{n,T}$ near the point where
the graph becomes connected. Denote the number of components in $G_{n,T}$ by $\varkappa_{n,T}$.
Theorem 2.3.8. If $n \to \infty$ and $2T = n\log n + xn + o(n)$, where x is a
constant, then with probability tending to 1, the graph consists of a giant connected
component and isolated vertices. Also, for any fixed integer $k = 0, 1, \dots$,
$$P\{\varkappa_{n,T} - 1 = k\} \to \frac{e^{-kx}}{k!}\, e^{-e^{-x}}.$$
Proof. We have to prove that, with probability tending to 1, $G_{n,T}$ consists of one
giant component and isolated vertices, and that the distribution of the number of
these isolated vertices converges to the Poisson distribution with parameter $e^{-x}$.
The edges of $G_{n,T}$ appear as a result of T independent trials, and these T
trials can be considered as the allocation of 2T particles into n cells such that any
particle is allocated independently of the others and, with equal probabilities, falls
into any of n cells. Therefore the number of isolated vertices in $G_{n,T}$ has the same
distribution as the number $\mu_0(2T, n)$ of empty cells in the well-studied classical
scheme of allocating particles. Under the conditions of the theorem, the distribution
of $\mu_0(2T, n)$ converges to the Poisson distribution with parameter $e^{-x}$.
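As a numerical illustration of this step (a sketch only, not part of the proof), recall that
the mean number of empty cells when 2T particles fall uniformly into n cells is
$n(1 - 1/n)^{2T}$; with $2T = n\log n + xn$ it should be close to $e^{-x}$. The function
name, the sample value x = 0.5, and the reading of log as the natural logarithm are
assumptions made for the example.

```python
import math


def expected_empty_cells(n: int, particles: int) -> float:
    """Mean number of empty cells when `particles` balls fall uniformly into n cells."""
    return n * (1.0 - 1.0 / n) ** particles


if __name__ == "__main__":
    x = 0.5  # illustrative constant from 2T = n log n + x n
    for n in (10**3, 10**5, 10**7):
        two_t = round(n * math.log(n) + x * n)
        # The two printed numbers should approach each other as n grows.
        print(n, expected_empty_cells(n, two_t), math.exp(-x))
```

The comparison concerns the mean only; convergence of the whole distribution to the
Poisson law is the classical result quoted in the text.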
To complete the proof, it suffices to show that, with probability tending to 1,
the remaining vertices form one giant component. If, in addition to the isolated
vertices, there were two other components, then the graph would contain a tree of
size r, $2 \le r \le n/2$, such that any vertex of the tree would not be connected to
any vertices outside the tree. A skeleton of one of the two components could play
the role of such a tree.
By $\xi_r$ we denote the number of trees of size r which are the skeletons of
connected components of $G_{n,T}$. We will show that under the conditions of the
theorem,
$$\sum_{2 \le r \le n/2} E\xi_r \to 0,$$
and consequently, with probability tending to 1, such a tree does not occur in $G_{n,T}$.
We can represent $\xi_r$ as a sum of indicators and find that
v - 1/ vv \n J V n
This formula is similar to (2.3.1): We choose r vertices and $r - 1$ edges that form
the tree, and the last factor is the probability that none of the $T - r + 1$ edges that
remain connects a vertex from the set of r selected vertices with a vertex from the
set of $n - r$ remaining vertices.
By using formula (2.3.4), we can check, for example, that with probability
tending to 1, there are no isolated edges in $G_{n,T}$. Indeed, for r = 2,
T-\
~'~ 1}/, B.3.5)
and the right-hand side of (2.3.5) tends to zero if $n \to \infty$ and $2T = n\log n +
xn + o(n)$.
It follows from (2.3.4) that
Ct < f ^^ r!y .-2r(n-r)(T-r+l)/n2
and for all sufficiently large n,
-1
{2TY'1
Therefore
2 oo
2
3<r<n/2 r=3
~ 27A -
If $n \to \infty$ and $2T = n\log n + xn + o(n)$, then
and for all sufficiently large n,
1-40/9
where c is a constant.
Therefore, under the conditions of the theorem,
Taking into account that $E\xi_2 \to 0$ also, we see that, with probability tending to 1,
the graph $G_{n,T}$ has only one component besides the isolated vertices. ∎
2.4. Nonequiprobable graphs
The model of the random graph $G_{n,T}$ considered in the previous section can be
easily extended to nonequiprobable graphs. However, the approach based on the
generalized scheme of allocation, which reduces the investigations of equiprobable
graphs to some problems concerning sums of independent random variables, does
not apply to nonequiprobable graphs. In this case, few results have been obtained
because of the lack of effective methods to investigate these objects.
In this section, we consider a generalization of the random graph $G_{n,T}$ of the
previous section. We preserve the notation $G_{n,T}$ for this nonequiprobable graph
with n vertices labeled with the numbers $1, 2, \dots, n$ and T edges, which can be
obtained by the following procedure. We consider T independent trials, in each of
which one edge is drawn. The edge connects two different vertices or forms a loop;
the vertices with labels i and j are connected with the probability $2p_ip_j$, and the
loop at vertex i is formed with the probability $p_i^2$; $i, j = 1, \dots, n$, $p_1, \dots, p_n \ge 0$,
$p_1 + \dots + p_n = 1$. Thus, after T trials we have a realization of the random graph
$G_{n,T}$, which may have loops and multiple edges.
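One convenient way to realize a single trial, offered here only as a sketch, is to draw the
two endpoints of the edge independently, each with the probabilities $p_1, \dots, p_n$: an
unordered pair $\{i, j\}$, $i \ne j$, then appears with probability $2p_ip_j$ and a loop at i
with probability $p_i^2$, as in the procedure above. The function names and the seed
handling are hypothetical.

```python
import random


def sample_graph(p, T, seed=0):
    """Draw T edges; the endpoints of each edge are chosen independently with weights p."""
    rng = random.Random(seed)
    vertices = range(1, len(p) + 1)      # vertices are labeled 1, ..., n
    edges = []
    for _ in range(T):
        i, j = rng.choices(vertices, weights=p, k=2)
        edges.append((min(i, j), max(i, j)))   # unordered pair; i == j is a loop
    return edges


def edge_probability(p, i, j):
    """Probability that one trial produces the unordered pair {i, j}."""
    return p[i - 1] ** 2 if i == j else 2 * p[i - 1] * p[j - 1]
```

For example, `sample_graph([0.5, 0.3, 0.2], T=10)` returns a list of ten pairs in which
repeated pairs are multiple edges and pairs of the form (i, i) are loops.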
The main result of this section is the following assertion.
Theorem 2.4.1. Assume that $p_i = a_i/n$, where $a_i = a_i(n)$, $0 < \varepsilon \le a_i \le E$,
$i = 1, \dots, n$, $\varepsilon$ and E are constants, and the limit
$$a^2 = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i^2$$
exists.
Then, if $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda a^2 < 1$, the distribution of the
number of cycles $\nu(G_{n,T})$ in the graph $G_{n,T}$ converges to the Poisson distribution
with parameter $\Lambda = -\frac{1}{2}\ln(1 - \lambda a^2)$.
In proving the theorem, the limit distribution of the random variable $\alpha_r$, the
number of cycles of length r, and the joint limit distribution of $\alpha_{r_1}, \dots, \alpha_{r_s}$ are
obtained.
Theorem 2.4.2. Under the conditions of Theorem 2.4.1, without the requirement
$\lambda a^2 < 1$, the distribution of the random variable $\alpha_r$ for any fixed r tends to the
Poisson distribution with parameter $\lambda_r = \lambda^r a^{2r}/(2r)$.
Theorem 2.4.3. Under the conditions of Theorem 2.4.1, without the requirement
$\lambda a^2 < 1$, the joint distribution of $\alpha_{r_1}, \dots, \alpha_{r_s}$ for any fixed $1 \le r_1 < \dots < r_s$
converges to the distribution of s independent random variables that have the
Poisson distributions with parameters $\lambda_{r_1}, \dots, \lambda_{r_s}$, respectively.
The proof will be accomplished by the method of moments.
A cycle of length r has no self-intersections if it is composed of r vertices
and exactly r edges of $G_{n,T}$. Denote by $\alpha_r$ the number of cycles without
self-intersections of length r, $r \ge 3$, in the random graph $G_{n,T}$. For r distinct vertices
$i_1, \dots, i_r$, let $\xi_{i_1 \dots i_r} = 1$ if in $G_{n,T}$ there exists a cycle composed of these r
vertices containing exactly r edges of $G_{n,T}$; in other cases, we set $\xi_{i_1 \dots i_r} = 0$.
Then
$$\alpha_r = \sum_{\{i_1,\dots,i_r\}} \xi_{i_1 \dots i_r}, \qquad (2.4.1)$$
where the summation is taken over all $\binom{n}{r}$ distinct unordered sets of r distinct
indices. In the complete graph with vertices $i_1, \dots, i_r$, there exist $(r-1)!/2$
distinct cycles containing exactly r edges. We label these cycles in an arbitrary
order with the numbers $j = 1, \dots, (r-1)!/2$ and represent the random variable
$\xi_{i_1 \dots i_r}$ as the sum of indicators:
$$\xi_{i_1 \dots i_r} = \sum_{j=1}^{(r-1)!/2} \xi^{(j)}_{i_1 \dots i_r}, \qquad (2.4.2)$$
where $\xi^{(j)}_{i_1 \dots i_r} = 1$ if the jth cycle exists in $G_{n,T}$, and $\xi^{(j)}_{i_1 \dots i_r} = 0$ otherwise.
We now investigate the behavior of the random variable
$$\nu(G_{n,T}) = \alpha_1 + \dots + \alpha_n,$$
where the variables $\alpha_r$ are defined by (2.4.1) for $r \ge 3$, $\alpha_1$ is the number of loops,
and $\alpha_2$ is the number of pairs of parallel edges in $G_{n,T}$.
Each cycle in the graph $G_{n,T}$ may be thought of as the set of edges that form this
cycle; therefore, the following assertion is needed for evaluating such probabilities
as $P\{\xi^{(j)}_{i_1 \dots i_r} = 1\}$. Let $V_r = \{(i_1, j_1), \dots, (i_r, j_r)\}$ be the set of r distinct pairs
of vertices in the graph $G_{n,T}$, where $i_k \ne j_k$, $k = 1, \dots, r$. Denote by $P(V_r)$ the
probability of the event that all the edges from $V_r$ occur in $G_{n,T}$.
Lemma 2.4.1. If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < \infty$, $0 < \varepsilon \le a_i \le E < \infty$,
$i = 1, \dots, n$, then for arbitrary fixed $\varepsilon$, E, and r,
$$P(V_r) = \frac{\lambda^r}{n^r}\, a_{i_1}a_{j_1} \cdots a_{i_r}a_{j_r}\,(1 + o(1)) \qquad (2.4.3)$$
uniformly with respect to $a_1, \dots, a_n$ and all sets $V_r$.
Moreover, for any $\delta > 0$, there exists a constant c such that, for all r and n,
$$P(V_r) \le c\,\frac{(\lambda + \delta)^r}{n^r}\, a_{i_1}a_{j_1} \cdots a_{i_r}a_{j_r}. \qquad (2.4.4)$$
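Proof. Set $q_k = 2p_{i_k}p_{j_k}$, $k = 1, \dots, r$. Then
$$\begin{aligned}
P(V_r) &= \sum_{m_1,\dots,m_r \ge 1} \frac{T^{[m_1+\dots+m_r]}}{m_1! \cdots m_r!}\, q_1^{m_1} \cdots q_r^{m_r} (1 - q_1 - \dots - q_r)^{T-m_1-\dots-m_r} \\
&= T^{[r]} q_1 \cdots q_r (1 - q_1 - \dots - q_r)^{T-r} \\
&\quad + {\sum}' \frac{T^{[m_1+\dots+m_r]}}{m_1! \cdots m_r!}\, q_1^{m_1} \cdots q_r^{m_r} (1 - q_1 - \dots - q_r)^{T-m_1-\dots-m_r}.
\end{aligned} \qquad (2.4.5)$$
Here $x^{[m]} = x(x-1)\cdots(x-m+1)$; the summation in $\sum'$ is taken over all sets
$\{m_1, \dots, m_r\}$ in which $m_1, \dots, m_r \ge 1$ and there exists i, $1 \le i \le r$, such that
$m_i > 1$. It is clear that
$$(1 - q_1 - \dots - q_r)^{T-r} \le 1,$$
and for an arbitrary fixed r,
$$(1 - q_1 - \dots - q_r)^{T-r} = 1 + O(1/n). \qquad (2.4.6)$$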
In addition,
(T — r\[m\-\ \-mr-r]
( Z r-'^-'a )r-'-"-
B.4.7)
where
G _ r _ l)[m\-\ \-mr-r-l]
, mx\---mr\
m\,...,mr>_\
X
Let $l_i = m_i - 2$, $l_j = m_j - 1$, $j \ne i$ (recall that $m_i > 1$). Then
=
{T ~r ~
x tfj1 • • ¦ q[r{\ - qx qr)T-r-\-h-..-lr = L B 4 8)
Now assertion (2.4.3) follows from (2.4.5)-(2.4.8), and assertion (2.4.4) from
(2.4.5), (2.4.7), and (2.4.8), since
A 2TrE2
(r)
,
()
iq\---qr < —j-ai,ah ¦ ¦ ¦ airajr.
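Corollary 2.4.1. If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < \infty$, $0 < \varepsilon \le a_i \le E < \infty$,
$i = 1, \dots, n$, then for arbitrary fixed $\varepsilon$, E, $\lambda$, and r,
$$P\{\xi^{(j)}_{i_1 \dots i_r} = 1\} = \frac{\lambda^r}{n^r}\, a_{i_1}^2 \cdots a_{i_r}^2\,(1 + o(1))$$
uniformly with respect to j, $1 \le j \le (r-1)!/2$, all sets $\{i_1, \dots, i_r\}$ and $a_1, \dots, a_n$.
Moreover, for any $\delta > 0$, there exists a constant c such that, for all r and n,
$$P\{\xi^{(j)}_{i_1 \dots i_r} = 1\} \le c\,\frac{(\lambda + \delta)^r}{n^r}\, a_{i_1}^2 \cdots a_{i_r}^2.$$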
Proof. The equality $\xi^{(j)}_{i_1 \dots i_r} = 1$ holds if and only if in $G_{n,T}$ there exist r fixed edges,
$\{(k_1, j_1), \dots, (k_r, j_r)\}$, $k_v \ne j_v$, $v = 1, \dots, r$, which form the jth cycle on the
vertices $i_1, \dots, i_r$. For these edges, the sets $\{k_1, \dots, k_r\}$ and $\{j_1, \dots, j_r\}$ coincide
with the set $\{i_1, \dots, i_r\}$. Therefore, the corollary follows from Lemma 2.4.1.
The notation $\{i_1, \dots, i_r\}$ denotes an unordered set of distinct indices $i_1, \dots, i_r$;
the number of such sets is $\binom{n}{r}$. For ordered sets of distinct indices $i_1, \dots, i_r$, we
will use the notation $(i_1, \dots, i_r)$; the number of such sets is $n^{[r]}$. By the symbols
$$\sum_{\{i_1,\dots,i_r\}} \quad \text{and} \quad \sum_{(i_1,\dots,i_r)}$$
we will denote the summations over all distinct unordered and ordered sets of
r distinct indices, respectively. It is clear that the summation over all unordered
sets $\{i_1, \dots, i_r\}$ is well suited to summands $f_{i_1 \dots i_r}$ whose values are invariant with
respect to the permutations of indices. For such summands,
$$\sum_{(i_1,\dots,i_r)} f_{i_1 \dots i_r} = r! \sum_{\{i_1,\dots,i_r\}} f_{i_1 \dots i_r}, \qquad (2.4.9)$$
and, moreover,
/¦(I) ,-A) ,¦(*) Ak) =
I. ...lr ...I, ...If
if the left-hand side summation is taken over all distinct ordered sets of distinct
r-dimensional indices i\ ,..., i? .
Lemma 2.4.2. If $0 < \varepsilon \le a_i \le E < \infty$, $i = 1, \dots, n$, then for any fixed r,
as $n \to \infty$,
$$\Bigl(\sum_{i=1}^{n} a_i^2\Bigr)^r = \sum_{(i_1,\dots,i_r)} a_{i_1}^2 \cdots a_{i_r}^2\,\bigl(1 + O(1/n)\bigr).$$
Proof. The following representation is valid:
$$\Bigl(\sum_{i=1}^{n} a_i^2\Bigr)^r = \sum_{i_1,\dots,i_r=1}^{n} a_{i_1}^2 \cdots a_{i_r}^2 = \sum_{(i_1,\dots,i_r)} a_{i_1}^2 \cdots a_{i_r}^2 + {\sum}^{*} a_{i_1}^2 \cdots a_{i_r}^2,$$
where the summation in the first sum is taken over all distinct ordered sets of
distinct indices, and in the asterisked sum, over all distinct ordered sets, each having
at least two identical indices. The number of summands in the first sum is $n^{[r]}$; the
number of summands in the second sum is equal to $n^r - n^{[r]}$ and does not exceed
$c_r n^{r-1}$, where the constant $c_r$ depends only on r. Therefore
and the proof is complete. ¦
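The comparison in Lemma 2.4.2 between the full power of $\sum a_i^2$ and the sum over
ordered sets of distinct indices can be checked by brute force for small n. The sketch
below is only an illustration under our own assumptions: the weights $a_i$ cycle through
0.5, 1, 1.5, 2 (so that $\varepsilon = 0.5$, $E = 2$), r = 3, and all names are hypothetical.

```python
from itertools import permutations
from math import prod


def full_power(a, r):
    """(a_1^2 + ... + a_n^2)^r."""
    return sum(x * x for x in a) ** r


def distinct_sum(a, r):
    """Sum of a_{i1}^2 ... a_{ir}^2 over ordered r-tuples of distinct indices."""
    squares = [x * x for x in a]
    # permutations() picks r positions without repetition, i.e. distinct indices
    return sum(prod(t) for t in permutations(squares, r))


def ratio(n, r=3):
    a = [0.5 + 0.5 * (i % 4) for i in range(n)]   # bounded weights in [0.5, 2.0]
    return full_power(a, r) / distinct_sum(a, r)
```

With these weights the ratio decreases toward 1 as n grows, which is the behavior the
lemma describes for fixed r.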
Corollary 2.4.2. Under the conditions of Theorem 2.4.2, for any fixed $r \ge 3$,
$$E\alpha_r \to \frac{\lambda^r a^{2r}}{2r}.$$
Moreover, for any 8 > 0, there exists a constant c such that
Ear < c
2r
Proof. Using representations (2.4.1) and (2.4.2), with the aid of (2.4.9),
Corollary 2.4.1, and Lemma 2.4.2, we obtain
(r-l)!r
E«/- = ^ z
(r — \)\Xr ^-^ 9 9 / / 1N
^ a? ---a,2 1 + 0 -
2/1^!
X'a1'
The second assertion follows immediately from the inequality of Corollary 2.4.1.
We now evaluate the factorial moments of $\alpha_r$. If $S_n = \xi_1 + \dots + \xi_n$, where
$\xi_1, \dots, \xi_n$ take the values 0 and 1 only, then according to Theorem 1.1.4,
$$S_n(S_n - 1) \cdots (S_n - m + 1) = \sum_{(k_1,\dots,k_m)} \xi_{k_1} \cdots \xi_{k_m}, \qquad (2.4.12)$$
where the summation is taken over all distinct ordered sets of m distinct indices.
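Identity (2.4.12) is easy to confirm by direct enumeration for a short 0-1 sequence. The
following sketch does this; the function names and the sample sequence are our own
choices for the illustration.

```python
from itertools import permutations
from math import prod


def falling_factorial(s, m):
    """s (s - 1) ... (s - m + 1)."""
    return prod(s - k for k in range(m))


def indicator_sum(xi, m):
    """Sum of xi[k1] * ... * xi[km] over ordered m-tuples of distinct indices."""
    return sum(prod(xi[k] for k in idx)
               for idx in permutations(range(len(xi)), m))


xi = [1, 0, 1, 1, 0, 1]   # illustrative values of the indicators
```

Here $S_n = 4$, so both sides equal 4, 12, 24, 24 for m = 1, 2, 3, 4 and vanish for m = 5.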
In our case, the indices have a composite structure because
$$\alpha_r = \sum_{\{i_1,\dots,i_r\}} \sum_{j=1}^{(r-1)!/2} \xi^{(j)}_{i_1 \dots i_r}.$$
The following representation is analogous to (2.4.12):
$$\alpha_r(\alpha_r - 1) \cdots (\alpha_r - m + 1) = \sum \xi^{(j_1)}_{i_1^{(1)} \dots i_r^{(1)}} \cdots \xi^{(j_m)}_{i_1^{(m)} \dots i_r^{(m)}}, \qquad (2.4.13)$$
where the summation is taken over all distinct ordered sets
$$\bigl((\{i_1^{(1)}, \dots, i_r^{(1)}\}, j_1), \dots, (\{i_1^{(m)}, \dots, i_r^{(m)}\}, j_m)\bigr)$$
of distinct indices of the form $(\{i_1, \dots, i_r\}, j)$; the set $\{i_1, \dots, i_r\}$ in the index is
considered an unordered set of distinct indices, and j indicates the number of the
cycle formed by the vertices $i_1, \dots, i_r$.
We show that under the conditions of Theorem 2.4.2, for any fixed r and any
fixed $m \ge 1$,
$$E\alpha_r^{[m]} \to \Bigl(\frac{\lambda^r a^{2r}}{2r}\Bigr)^m. \qquad (2.4.14)$$
This assertion for m = 1 follows from Corollary 2.4.2.
In order to become accustomed to the more complicated notation, we first
consider the case m = 2. By (2.4.13),
$$E\alpha_r^{[2]} = \sum E\,\xi^{(j_1)}_{i_1^{(1)} \dots i_r^{(1)}}\, \xi^{(j_2)}_{i_1^{(2)} \dots i_r^{(2)}}.$$
Decompose the right-hand side sum into two sums. Let the first sum $\Sigma_1$ include
the summands with nonintersecting sets $\{i_1^{(1)}, \dots, i_r^{(1)}\}$ and $\{i_1^{(2)}, \dots, i_r^{(2)}\}$. When
we take into account that in this case 2r edges must exist to guarantee
,-A) ,-(l) ~ ^ B) .B)
l\ "' 'r 1 r
and by using Lemma 2.4.1, we obtain
fcO"l) _ fc(/2) _ il
,A) ,(D "~ 5,-B) B) — L f
Therefore
f^ /('¦0!\2 2 2 2
X
It is clear by virtue of (2.4.9) and (2.4.10) that
Therefore, by virtue of Lemma 2.4.2,
2 / (r\J \n ^7 V 2r
(2.4.15)
We now show that the remaining sum $\Sigma_2$ tends to zero.
The summation in $\Sigma_2$ is taken over the pairs of composite indices in which the
sets $\{i_1^{(1)}, \dots, i_r^{(1)}\}$ and $\{i_1^{(2)}, \dots, i_r^{(2)}\}$ have at least one common element. Each
composite index $(\{i_1, \dots, i_r\}, j)$ corresponds to a cycle in the complete graph
with n vertices; the cycle consists of r edges and the vertices $i_1, \dots, i_r$. Two
cycles corresponding to the indices $(\{i_1^{(1)}, \dots, i_r^{(1)}\}, j_1)$ and $(\{i_1^{(2)}, \dots, i_r^{(2)}\}, j_2)$
can have $M < 2r$ distinct vertices and L distinct edges. We decompose the sum
$\Sigma_2$ into the sums $\Sigma_{M,L}$ containing summands with fixed values of the parameters
M and L. The number of such sums does not exceed $(2r)^2$; therefore it is sufficient
to prove that any sum $\Sigma_{M,L}$ tends to zero. It is easy to see that in the case $M < 2r$,
the inequality $L \ge M + 1$ is valid. The number of summands in the sum $\Sigma_{M,L}$
does not exceed $n^M$, and the probability that L fixed edges appear in $G_{n,T}$ does
not exceed, by virtue of (2.4.4), the value $cn^{-L}$. This implies
$$\Sigma_{M,L} \le c\,\frac{n^M}{n^L} \le \frac{c}{n}. \qquad (2.4.16)$$
Therefore, as $n \to \infty$,
$$\Sigma_2 \to 0. \qquad (2.4.17)$$
The assertion (2.4.14) for m = 2 follows from (2.4.15) and (2.4.17).
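The two combinatorial facts used in this argument, namely that r vertices carry
$(r-1)!/2$ distinct cycles and that two distinct cycles with a common vertex satisfy
$L \ge M + 1$, can be verified by enumeration on a small complete graph. The sketch
below does so for n = 6; the representation of a cycle as a set of edges and all names are
our own assumptions.

```python
from itertools import combinations, permutations


def cycles_on(vertices):
    """All distinct cycles through every vertex of `vertices`, each as a set of edges."""
    first, rest = vertices[0], vertices[1:]
    found = set()
    for order in permutations(rest):
        path = (first,) + order
        edges = frozenset(
            frozenset((path[k], path[(k + 1) % len(path)]))
            for k in range(len(path))
        )
        found.add(edges)          # both traversal directions give the same edge set
    return found


def check(n=6, r=4):
    """True if every pair of distinct r-cycles sharing a vertex has L >= M + 1."""
    all_cycles = [(set(vs), cyc)
                  for vs in combinations(range(n), r)
                  for cyc in cycles_on(vs)]
    for (v1, c1), (v2, c2) in combinations(all_cycles, 2):
        if v1 & v2:                               # at least one common vertex
            M, L = len(v1 | v2), len(c1 | c2)     # distinct vertices and edges
            if L < M + 1:
                return False
    return True
```

Calling `check(6, 3)` and `check(6, 4)` examines every such pair of triangles and of
4-cycles in the complete graph on six vertices.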
Now let us consider the factorial moment of an arbitrary order m. By (2.4.13),
$$E\alpha_r^{[m]} = \Sigma_1 + \Sigma_2,$$
where the sum $\Sigma_1$ includes only summands that do not have a pair of sets from
$\{i_1^{(1)}, \dots, i_r^{(1)}\}, \dots, \{i_1^{(m)}, \dots, i_r^{(m)}\}$ with common elements. In this case, rm edges
must occur in the graph $G_{n,T}$ to guarantee that the corresponding random variables
equal 1. From this and Lemma 2.4.1, it follows that
tUm) _ 1}
,.(m) Am) — i f
\ mr
a ¦ ¦ ¦ -n2 ¦ ¦ ¦ n2 ¦ ¦ -n2 C1
"•(I) ".A) u.(m) u.(m)\l
and, by (2.4.9) and Lemma 2.4.2,
$$\Sigma_1 \to \Bigl(\frac{\lambda^r a^{2r}}{2r}\Bigr)^m. \qquad (2.4.18)$$
It remains to prove that the sum $\Sigma_2$ taken over the remaining sets of indices
tends to zero. The summation in $\Sigma_2$ is taken over m sets of composite indices that
have at least one common element in at least one pair of the sets $\{i_1^{(p)}, \dots, i_r^{(p)}\}$,
$\{i_1^{(q)}, \dots, i_r^{(q)}\}$, $p \ne q$. Recall that each composite index corresponds to a cycle
in the complete graph with n vertices. The cycles corresponding to m indices can
contain M distinct vertices and L distinct edges. We decompose the sum $\Sigma_2$ into
the sums $\Sigma_{M,L}$ containing summands with fixed values of the parameters M and
L. The number of such sums does not exceed $(rm)^2$; therefore it is sufficient to
prove that any sum $\Sigma_{M,L}$ tends to zero. It is clear that if $M < rm$, then $L \ge M + 1$.
Thus, since the number of summands in the sum $\Sigma_{M,L}$ does not exceed $n^M$, and by
(2.4.3) the probability of L fixed edges occurring in $G_{n,T}$ does not exceed $cn^{-L}$,
$$\Sigma_{M,L} \le c\,\frac{n^M}{n^L} \le \frac{c}{n}.$$
Therefore, as $n \to \infty$,
$$\Sigma_2 \to 0. \qquad (2.4.19)$$
The assertion (2.4.14) follows from (2.4.18) and (2.4.19).
By (2.4.14), the limit distribution for $\alpha_r$, $r \ge 3$, is the Poisson distribution
with parameter $\lambda_r = \lambda^r a^{2r}/(2r)$. It is easy to see that in the current situation the
number of loops $\alpha_1$ and the number of pairs of parallel edges $\alpha_2$ approach the
Poisson distributions with parameters $\lambda_1 = \lambda a^2/2$ and $\lambda_2 = \lambda^2 a^4/4$, respectively.
This proves Theorem 2.4.2.
The more general Theorem 2.4.3 can be proved analogously. It is sufficient to
verify that under the conditions of the theorem,
$$E\,\alpha_{r_1}^{[m_1]} \cdots \alpha_{r_s}^{[m_s]} \to \lambda_{r_1}^{m_1} \cdots \lambda_{r_s}^{m_s}$$
for arbitrary fixed integers $m_1, \dots, m_s$, where
$$\lambda_r = \frac{\lambda^r a^{2r}}{2r}.$$
By (2.4.13),
,(k)\ (j(k) (k)\\
where
j(k) _ i.(l,k) ^(U)! I =\ mk k =
are unordered sets of $r_k$ vertices, and $j_l^{(k)}$, $l = 1, \dots, m_k$, $k = 1, \dots, s$, are the
numbers of cycles of length $r_k$ under the labeling chosen.
Therefore
(,) -I,--- ,§(,,
where
We decompose the sum on the right-hand side of this representation into two
parts; let the sum $\Sigma_1$ include only summands with the distinct elements in all
$I_l^{(k)}$, $l = 1, \dots, m_k$, $k = 1, \dots, s$; and let the sum $\Sigma_2$ include all the remaining
summands. For the summands of the first sum, the corresponding random variables
equal 1 only if there exist $m_1r_1 + \dots + m_sr_s$ fixed edges in $G_{n,T}$. Therefore, by
Lemma 2.4.1,
pko)
and, by (2.4.9), (2.4.10), and Lemma 2.4.2,
2r>\m'
It remains to prove that $\Sigma_2$ tends to zero. The summation in $\Sigma_2$ is taken over sets of
composite indices in which at least one of the elements $1, 2, \dots, n$ is encountered
at least twice. A cycle corresponds to each of the composite indices. The existence
of a common element in the cycles implies that the number M of distinct vertices
contained in the cycles and the number L of distinct edges involved in the cycles
satisfy $L \ge M + 1$. We decompose the sum $\Sigma_2$ into a finite number of sums $\Sigma_{M,L}$
containing summands with fixed values of the parameters M and L. By virtue
of (2.4.3), for each of these sums, the estimate
$$\Sigma_{M,L} \le c\,\frac{n^M}{n^L} \le \frac{c}{n}$$
holds because the number of summands does not exceed $n^M$, and the probability of
L fixed edges occurring in $G_{n,T}$ does not exceed $cn^{-L}$. This proves Theorem 2.4.3.
To prove Theorem 2.4.1, we need the following auxiliary assertion.
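Lemma 2.4.3. Let $\xi_1^{(n)}, \dots, \xi_n^{(n)}$ be nonnegative integer-valued random variables
such that for an arbitrary fixed s and arbitrary nonnegative integers $k_1, \dots, k_s$,
$$P\{\xi_1^{(n)} = k_1, \dots, \xi_s^{(n)} = k_s\} \to \prod_{i=1}^{s} \frac{a_i^{k_i} e^{-a_i}}{k_i!}$$
as $n \to \infty$, where $a_1, a_2, \dots$ is a fixed sequence of nonnegative numbers.
Moreover, suppose
$$P\{\xi_{s+1}^{(n)} + \dots + \xi_n^{(n)} > 0\} \to 0 \qquad (2.4.20)$$
as $s \to \infty$, uniformly in n, and let
$$\sum_{k=1}^{\infty} a_k = \Lambda < \infty.$$
Then the distribution of the random variable $\zeta_n = \xi_1^{(n)} + \dots + \xi_n^{(n)}$ converges
to the Poisson distribution with parameter $\Lambda$.
Proof. We show that for an arbitrary fixed $\varepsilon > 0$ and an arbitrary fixed m,
$$\Bigl| P\{\zeta_n = m\} - \frac{\Lambda^m e^{-\Lambda}}{m!} \Bigr| \le \varepsilon$$
for sufficiently large n. For fixed $\varepsilon$ and m, there exists s such that
$$\Bigl| \frac{\Lambda_s^m e^{-\Lambda_s}}{m!} - \frac{\Lambda^m e^{-\Lambda}}{m!} \Bigr| \le \frac{\varepsilon}{3},$$
where $\Lambda_s = a_1 + \dots + a_s$.
It is not hard to see that
$$\bigl| P\{\zeta_n = m\} - P\{\zeta_s^{(n)} = m\} \bigr| \le P\{\xi_{s+1}^{(n)} + \dots + \xi_n^{(n)} > 0\}.$$
Therefore, by (2.4.20), $|P\{\zeta_n = m\} - P\{\zeta_s^{(n)} = m\}| \le \varepsilon/3$ for sufficiently large
s. Finally, the conditions of the lemma yield the convergence of the distribution of
$\zeta_s^{(n)} = \xi_1^{(n)} + \dots + \xi_s^{(n)}$ (for any fixed s) to the Poisson distribution with parameter
$\Lambda_s = a_1 + \dots + a_s$. Therefore
$$\Bigl| P\{\zeta_s^{(n)} = m\} - \frac{\Lambda_s^m e^{-\Lambda_s}}{m!} \Bigr| \le \frac{\varepsilon}{3}$$
for sufficiently large n. ∎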
Theorem 2.4.1 follows from Theorem 2.4.3 and Lemma 2.4.3, whose conditions
are satisfied when $\lambda a^2 < 1$.
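The parameter of the limiting Poisson law in Theorem 2.4.1 is the sum of the parameters
$\lambda_r = \lambda^r a^{2r}/(2r)$ of Theorem 2.4.2 over all $r \ge 1$, and the series sums to
$-\frac{1}{2}\ln(1 - \lambda a^2)$ when $\lambda a^2 < 1$. A short numerical check, with illustrative
values of $\lambda$ and $a^2$ chosen by us, is sketched below.

```python
import math


def lambda_r(lam, a2, r):
    """Poisson parameter for cycles of length r: (lam * a2)^r / (2r)."""
    return (lam * a2) ** r / (2 * r)


def total_parameter(lam, a2, terms=10_000):
    """Partial sum of lambda_r over r = 1, ..., terms."""
    return sum(lambda_r(lam, a2, r) for r in range(1, terms + 1))


if __name__ == "__main__":
    lam, a2 = 0.6, 1.2          # illustrative values with lam * a2 < 1
    print(total_parameter(lam, a2), -0.5 * math.log(1 - lam * a2))
```

For $\lambda a^2 \ge 1$ the series diverges, which is consistent with the requirement
$\lambda a^2 < 1$ being needed in Theorem 2.4.1 but not in Theorems 2.4.2 and 2.4.3.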
2.5. Notes and references
The investigation of the evolution of random graphs began when P. Erdős and
A. Rényi published the results of their study [37] in 1960. Along with the basic
properties of the random graph $\mathcal{G}_{n,T}$, they discovered the effect known as a phase
transition. At about the same time, V. E. Stepanov studied the graph $G_{n,p}$, as
documented later [133, 134, 135]. Until recently, Stepanov's results had not seemed
to receive wide recognition. In particular, Stepanov proved that if $p = c/n$, where
c is a constant, $c > 1$, then the size of the giant component is asymptotically
normal with mean $n\alpha(c)$ and variance $n\beta(c)$, where
$$\alpha(c) = 1 - \frac{\gamma}{c}, \qquad \beta(c) = \frac{\gamma(1 - \gamma/c)}{c(1 - \gamma)^2},$$
and $\gamma < 1$ is the root of the equation
$$\gamma e^{-\gamma} = c e^{-c}.$$
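To make the constants concrete, the root $\gamma$ can be found numerically. The sketch below
assumes that the defining equation reads $\gamma e^{-\gamma} = c e^{-c}$ with $0 < \gamma < 1 < c$ and
that $\alpha(c) = 1 - \gamma/c$; under these assumptions $\alpha(c)$ also satisfies the familiar
fixed-point relation $\alpha = 1 - e^{-c\alpha}$, which serves as a consistency check. The
function names and the bisection method are our choices.

```python
import math


def gamma_root(c, iters=200):
    """Root gamma in (0, 1) of gamma * exp(-gamma) = c * exp(-c), for c > 1."""
    target = c * math.exp(-c)
    lo, hi = 0.0, 1.0            # t * exp(-t) is increasing on [0, 1]
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def alpha(c):
    """Limiting fraction of vertices in the giant component (assumed form 1 - gamma/c)."""
    return 1 - gamma_root(c) / c
```

For c = 2 this gives $\alpha(2) \approx 0.797$.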
A similar assertion for the graph $\mathcal{G}_{n,T}$ was proved by B. Pittel [123] about twenty
years later. He found that the size of the giant component of $\mathcal{G}_{n,T}$ is asymptotically
normal with parameters $n\alpha(c)$ and $n\beta(c)(1 - 2\gamma + 2\gamma^2/c)$ as $n, T \to \infty$ and
$2T/n \to c > 1$.
Many open questions concerning the evolution of random graphs remain. The
main goal of this chapter is to demonstrate the approach based on the generalized
scheme of allocation in investigations of the evolution of random graphs.
Section 2.1 shows that fine properties of subcritical graphs can be obtained in a rather
simple and natural way, especially as concerns the behavior of subcritical graphs
near the critical point. The transition phenomena for the graph $\mathcal{G}_{n,T}$ were first
considered by B. Bollobás [20]. The results presented in Section 2.1 can be found
in [77]. The approach based on the generalized scheme of allocation allowed us to
prove asymptotic normality of the number of unicyclic components and find the
limit distribution of the maximum sizes of trees and unicyclic components.
Section 2.2 is devoted to critical graphs. The behavior of random graphs near
the critical point, and especially in the critical domain where the giant component
appears, is very complicated and difficult to investigate. The investigations of the
behavior are far from complete, but even now the results obtained could fill another
book. Much information about random graphs can be found in the fundamental
work by Bollobás [21] and in the book [105], which is devoted to the evolution
of random graphs. A detailed investigation of the birth of the giant component
is given in [63]. Supercritical graphs are considered by Łuczak [99], who, in
particular, proved that the right-hand bound of the critical domain is determined
by the conditions $n, T \to \infty$, $(1 - 2T/n)\,n^{1/3} \to -\infty$.
Formally, to analyze supercritical random graphs, we can use the representation
of almost all such graphs as a combination of components of three types: one giant
component, trees, and unicyclic components. However, this approach is hampered
by the absence of a simple formula for the number of connected graphs with n
vertices and T edges with $k = T - n > 0$. Note that $k = T - n$ is equal to the
number of independent cycles in the graph and is called the cyclomatic number
of the graph. Denote by $c(n, k)$ the number of connected graphs with n labeled
vertices and a cyclomatic number k. It is clear that $c(n, -1)$ is the number of trees,
and by the Cayley formula, $c(n, -1) = n^{n-2}$, whereas $c(n, 0)$ is the number $u_n$ of
unicyclic graphs considered in Section 1.7. The numbers $c(n, k)$ were investigated
by Stepanov (see [10, 142, 143]) and E. M. Wright [151, 152] and are known as
the Stepanov-Wright numbers (see [143]). As $n \to \infty$ and $k^3/n \to 0$,
c(n, k) =
where, as it was proved by Meertens, $d = 1/(2\pi)$ (see Bender, Canfield, and
McKay [16]).
We hope that the results of the study by Bender et al. [17], who give the
asymptotics of $c(n, k)$ for all regular variations of the parameters n and k, can be used in
the application of the generalized scheme to random graphs and help to bring the
investigations of supercritical graphs to the level attained for the subcritical case
in Section 2.1. Note that obtaining the limit distributions of numerical
characteristics of supercritical graphs would be merely a problem of averaging if the joint
distribution of the size of the giant component and the number of its edges were
known.
The parameter $\theta = 2T/n$ plays the role of time in the evolution of random
graphs. Therefore, each numerical characteristic of a random graph can be
considered not only as a random variable, but also as a random process with the time
parameter $\theta$. Of significant interest is the approach using the convergence of such
processes. This approach is used in the recent papers [34, 62, 127]. Note that the
investigations of convergence of such random processes in combinatorial problems
were started by B. A. Sevastyanov [132] and Yu. V. Bolotnikov [22, 23, 24].
The random graph $G_{n,T}$ discussed in Section 2.3 was investigated by Kolchin
[79, 83]. This graph provides an appropriate model of the graph corresponding
to the left-hand side of a system of random congruences modulo 2 considered in
the next chapter. An analogue of Theorem 2.3.8 for bipartite graphs was proved by
Saltykov [131].
The nonequiprobable version of the graph $G_{n,T}$ is considered in Section 2.4,
where the results of the papers [88, 66, 65] are presented. Here we use the method of
moments. The lack of regular methods for an asymptotic analysis of
nonequiprobable graphs makes it impossible to carry out anything approaching a complete
investigation of such graphs. It seems to us that developing the methods
appropriate for the analysis of nonequiprobable combinatorial structures is a problem of
great importance.
Systems of random linear equations in GF(2)
3.1. Rank of a matrix and critical sets
In this section, we consider systems of linear equations in GF(2), the field with
elements 0 and 1. Let us begin with two examples where such systems appear.
Consider first a simple classification problem. Suppose we have a set of n objects
of two sorts, for example, of two different weights. We may sequentially sample
pairs of the objects from the set at random, compare the weights of the objects
from the chosen pair, and determine whether the weights are identical or different.
The problem is to identify the objects that have the same weight, or more precisely, to
estimate the probability of finding that solution. For a formal description of the
situation, let $\{1, 2, \dots, n\}$ be the set of objects under consideration and let $x_j$ be
the unknown type of the object j, $j = 1, \dots, n$. We may assume that $x_1, \dots, x_n$
take the values 0 and 1, depending on the class to which the object belongs. We
choose a pair of objects $i(t)$ and $j(t)$ in the trial with number t, $t = 1, \dots, T$,
and let $b_t$ be the result of their comparison: $b_t = 0$ if their weights are identical,
and $b_t = 1$ otherwise. Thus, the results of the comparisons can be written as the
following system of linear equations in GF(2):
$$x_{i(t)} + x_{j(t)} = b_t, \qquad t = 1, \dots, T. \qquad (3.1.1)$$
It is clear that the system can be rewritten in the matrix form
$$AX = B,$$
where $X = (x_1, \dots, x_n)$ and $B = (b_1, \dots, b_T)$ are column-vectors, and the
elements $a_{tj}$ of the matrix $A = \|a_{tj}\|$, $t = 1, \dots, T$, $j = 1, \dots, n$, are random
variables whose distribution is determined by the sampling procedure. It is
convenient to associate the system, or more precisely, the matrix A, with the random
graph $G_{n,T}$ with n vertices that correspond to the variables $x_1, \dots, x_n$. The graph
has T edges $(i(t), j(t))$, $t = 1, \dots, T$. Therefore the graph can have loops and
multiple edges, depending on the sampling procedure.
In this chapter, we consider the characteristics of the graph $G_{n,T}$ that are related
to some of the properties of the system (3.1.1). It is clear that the connectedness of
the graph is an important characteristic for the classification problem. Indeed, in
the case where the graph is connected, we can determine all values of the variables
$x_1, \dots, x_n$ if we set one of them equal to 0 or 1. In both cases, the partitions of the
set are the same, but the system has two different solutions. In the case where the
graph $G_{n,T}$ is disconnected, the system has more than two solutions; therefore a
complete classification is impossible.
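The link between system (3.1.1) and the components of the graph can be made concrete
with a small routine that processes the comparisons one at a time and keeps, for every
object, its type relative to a representative of its component. This is only a sketch under
our own assumptions: the union-find bookkeeping, the function name `classify`, and the
input format (triples i, j, b with 1-based labels) are not taken from the text.

```python
def classify(n, comparisons):
    """comparisons: triples (i, j, b) meaning x_i + x_j = b (mod 2).

    Returns (consistent, components); a consistent system has
    2 ** components solutions.
    """
    parent = list(range(n + 1))
    parity = [0] * (n + 1)       # parity[v] = x_v + x_parent[v] (mod 2)

    def find(v):
        # Returns (root of v, x_v + x_root mod 2), compressing the path.
        if parent[v] == v:
            return v, 0
        root, p = find(parent[v])
        parity[v] ^= p
        parent[v] = root
        return root, parity[v]

    consistent, components = True, n
    for i, j, b in comparisons:
        ri, pi = find(i)
        rj, pj = find(j)
        if ri == rj:
            if (pi ^ pj) != b:   # the new equation contradicts earlier ones
                consistent = False
        else:
            parent[ri] = rj      # merge the two components
            parity[ri] = pi ^ pj ^ b
            components -= 1
    return consistent, components
```

When the graph is connected the routine reports one component, that is, exactly two
solutions, in agreement with the remark above; every further component doubles the
number of solutions.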
Now let the vector B consist of independent random variables that take the
values 0 and 1. If the balance is out of order, the weighings can sometimes be
wrong, and the variables $b_1, \dots, b_T$ can differ from the true values. In this case,
we obtain a system with distorted entries on the right-hand side that sometimes has
no solution. If the balance is completely wrong, we may assume that the variables
$b_1, \dots, b_T$ do not depend on the left-hand side of the system and take the values
0 and 1 with equal probabilities. In this situation, several natural problems arise.
Does the right-hand side $b_1, \dots, b_T$ depend on the left-hand side of the system or
are the sides independent? Can we reconstruct the real values of $x_1, \dots, x_n$ in the
case where the right-hand parts $b_1, \dots, b_T$ are distorted?
Let us turn to the second example. Let a vector $(c_1, \dots, c_n)$ in GF(2) be given.
If we take an initial vector $x_1, \dots, x_n$, then we can develop the recurring sequence
$x_{n+t}$, $t = 1, 2, \dots$, by the following recurrence relation:
$$x_{n+t} = c_1x_t + \dots + c_nx_{n+t-1}, \qquad t = 1, 2, \dots. \qquad (3.1.2)$$
This recurrence relation can be realized with the help of a device called a shift
register, presented in Figure 3.1.1. A shift register consists of n cells or stages with
labels $1, 2, \dots, n$. The n-dimensional (0, 1)-vector of the contents of these stages
is called the state of the shift register. At an initial moment, the state of the shift
register under consideration is the vector $(x_1, \dots, x_n)$. The choice of the vector
$(c_1, \dots, c_n)$ means that we choose the stages with numbers corresponding to the
ones in the sequence $c_1, \dots, c_n$ and form the mod 2 sum $x_{n+1} = c_1x_1 + \dots + c_nx_n$.
At the next moment, the contents of all stages are shifted to the left so that $x_n$
transfers to the stage numbered $n - 1$, $x_{n-1}$ transfers to the stage $n - 2$, and so
on, $x_1$ leaves the register, and the sum $x_{n+1} = c_1x_1 + \dots + c_nx_n$ is placed into the
stage with label n. Thus the state $(x_1, \dots, x_n)$ transfers to the state $(x_2, \dots, x_{n+1})$.
Figure 3.1.1. Shift register
The process is repeated. Thus, if $c_1, \dots, c_n$ are given, then for any initial state
$(x_1, \dots, x_n)$, the recurring sequence (3.1.2) satisfies
\[ x_{n+1} = c_1 x_1 + \dots + c_n x_n, \]
\[ x_{n+2} = c_1 x_2 + \dots + c_n x_{n+1}, \]
\[ \dots \]
\[ x_{n+T} = c_1 x_T + \dots + c_n x_{n+T-1}. \]
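The register dynamics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lfsr_sequence` and the list-based state are hypothetical choices made for the example and do not come from the text.

```python
def lfsr_sequence(c, x, T):
    """Return [x_{n+1}, ..., x_{n+T}] for feedback vector c and initial state x over GF(2)."""
    if len(c) != len(x):
        raise ValueError("c and x must have the same length n")
    state = list(x)
    out = []
    for _ in range(T):
        # mod 2 sum of the stages selected by the ones in c
        new_bit = sum(ci & xi for ci, xi in zip(c, state)) % 2
        out.append(new_bit)
        # shift left: the first stage leaves, the new bit enters stage n
        state = state[1:] + [new_bit]
    return out


print(lfsr_sequence([1, 0, 0, 1], [1, 0, 0, 0], 5))
```

With $c = (1, 0, 0, 1)$ the recurrence reads $x_{4+t} = x_t + x_{t+3}$, so each output bit depends only on the first and last stages of the current state.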
Let us change the notation and put $b_t = x_{n+t}$, $t = 1, 2, \dots, T$, and $a_{11} = c_1, \dots, a_{1n} = c_n$.
Then the first relation becomes
\[ a_{11} x_1 + \dots + a_{1n} x_n = b_1. \]
It is clear that we can substitute $c_1 x_1 + \dots + c_n x_n$ for $x_{n+1}$ in the second relation
and obtain
\[ a_{21} x_1 + \dots + a_{2n} x_n = b_2. \]
In the same way, we obtain
\[ a_{11} x_1 + \dots + a_{1n} x_n = b_1, \]
\[ \dots \tag{3.1.3} \]
\[ a_{T1} x_1 + \dots + a_{Tn} x_n = b_T. \]
Suppose that the initial state $(x_1, \dots, x_n)$ is unknown and we observe the
sequence $b_1, \dots, b_T$. Then we can regard relations (3.1.3) as a system of linear
equations with respect to the unknowns $x_1, \dots, x_n$. A natural question is how many
observations are needed to reconstruct the initial state and to obtain all elements
of the sequence $b_t$, $t = T + 1, T + 2, \dots$.
The other situation concerns the feedback points $c_1, \dots, c_n$. Suppose we
observe the sequence $b_1, \dots, b_T$, but the vector $(c_1, \dots, c_n)$ determining the shift
register is unknown. If the number of 1's in $(c_1, \dots, c_n)$ is $k$, then there are $\binom{n}{k}$
possibilities for this vector. If we use an exhaustive search to find the true vector
that corresponds to the observed sequence, we have the following situation. If the
chosen vector is true, then system (3.1.3) is consistent for any $T$, but if the vector
$(c_1, \dots, c_n)$ is wrong, then the system becomes inconsistent for some $T$.
Therefore the consistency of the system (3.1.3) serves as a test for selecting the true
vector.
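The consistency test can be illustrated with a small sketch. It is an assumption-laden example, not the author's procedure: rows are stored as integer bitmasks (bit $i-1$ stands for $x_i$), and the helper names `lfsr_rows` and `solve_gf2` are hypothetical. The observed sequence `b` is the one produced by the feedback vector $(1, 0, 0, 1)$ from the initial state $(1, 0, 1, 1)$.

```python
def lfsr_rows(c, T):
    """Rows a_1..a_T of system (3.1.3) as bitmasks: bit i-1 is set when x_i occurs."""
    n = len(c)
    coeffs = [1 << i for i in range(n)]  # stage i+1 initially holds x_{i+1}
    rows = []
    for _ in range(T):
        new = 0
        for ci, v in zip(c, coeffs):
            if ci:
                new ^= v                 # addition in GF(2) is XOR
        rows.append(new)
        coeffs = coeffs[1:] + [new]      # same shift as the register itself
    return rows


def solve_gf2(rows, b, n):
    """Gauss-Jordan elimination over GF(2).

    Returns (consistent, rank, solution). Free unknowns are set to 0 and the
    solution is None when the system is inconsistent.
    """
    pivots = {}                          # pivot bit -> (row mask, right-hand side)
    for r, rhs in zip(rows, b):
        for p, (pr, prhs) in pivots.items():
            if (r >> p) & 1:
                r ^= pr
                rhs ^= prhs
        if r == 0:
            if rhs:
                return False, len(pivots), None
            continue
        p = r.bit_length() - 1
        for q, (pr, prhs) in list(pivots.items()):
            if (pr >> p) & 1:            # keep earlier pivot rows fully reduced
                pivots[q] = (pr ^ r, prhs ^ rhs)
        pivots[p] = (r, rhs)
    x = [pivots[i][1] if i in pivots else 0 for i in range(n)]
    return True, len(pivots), x


b = [0, 0, 1, 0, 0, 0, 1, 1]             # observations generated by c = (1, 0, 0, 1)
print(solve_gf2(lfsr_rows([1, 0, 0, 1], 8), b, 4))     # true feedback vector
print(solve_gf2(lfsr_rows([1, 1, 0, 0], 8), b, 4)[0])  # a wrong candidate
```

For the true vector the system is consistent and has full rank, so the initial state is recovered; for the wrong candidate the sixth equation already contradicts the first five.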
Let us introduce the auxiliary notions of a critical set and a hypercycle for
our investigations of systems of linear equations in GF(2). Note that the ordinary
notions of linear algebra, such as the notion of linear independence of vectors, rank
of a matrix, Cramer's rule for finding the solutions of linear systems of equations,
and so on, are extended in the obvious way to the $n$-dimensional vector space over
GF(2). For example, if the rank of a $T \times n$ matrix $A = \|a_{tj}\|$ in GF(2) is $r$, then
the homogeneous system of equations
\[ AX = 0, \]
where $X = (x_1, \dots, x_n)$ is the column-vector of unknowns, has exactly $n - r$
linearly independent solutions.
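Denote by
\[ a_t = (a_{t1}, \dots, a_{tn}), \quad t = 1, \dots, T, \]
the rows of the matrix $A$. If the coordinate-wise sum
\[ a_{t_1} + \dots + a_{t_m} \]
in GF(2) is the zero vector, then the set $C = \{t_1, \dots, t_m\}$ of row indices is called a critical set.
If $C_1$ and $C_2$ are critical sets and $C_1 \ne C_2$, then
\[ C_1 \mathbin{\triangle} C_2 = (C_1 \cup C_2) \setminus (C_1 \cap C_2) \]
is also a critical set.
Let $\varepsilon_1, \dots, \varepsilon_s$ take the values 0 and 1. Critical sets $C_1, \dots, C_s$ are called
independent if
\[ \varepsilon_1 C_1 \mathbin{\triangle} \varepsilon_2 C_2 \mathbin{\triangle} \cdots \mathbin{\triangle} \varepsilon_s C_s = \emptyset \]
if and only if $\varepsilon_1 = \dots = \varepsilon_s = 0$.
Denote by $s(A)$ the maximum number of independent critical sets and by $r(A)$
the rank of the matrix $A$.
Theorem 3.1.1. For any $T \times n$ matrix $A$ in GF(2),
\[ s(A) + r(A) = T. \]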
Proof. We consider the homogeneous system of equations
\[ A'Y = 0 \tag{3.1.4} \]
in GF(2), where $A'$ is the transpose of $A$. There is a one-to-one correspondence
between the solutions of the system (3.1.4) and the critical sets: The solution
$Y_{t_1, \dots, t_m} = (y_1, \dots, y_T)$, whose components $y_{t_1}, \dots, y_{t_m}$ are 1 and the other
components are zero, corresponds to the critical set $C = \{t_1, \dots, t_m\}$. The linear
independence of solutions corresponds to the independence of critical sets.
Therefore the maximum number of independent critical sets $s(A)$ equals the maximum number of
linearly independent solutions of system (3.1.4), which we know is $T - r(A)$.
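Theorem 3.1.1 can be checked by brute force on a small matrix. The sketch below is an illustration only; it assumes rows are encoded as integer bitmasks and uses the fact that the zero-sum subsets of rows, the empty subset included, form a vector space over GF(2), so their number is $2^{s(A)}$. The function names are hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    basis = {}                      # leading bit -> basis vector
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in basis:
                basis[h] = r
                break
            r ^= basis[h]
    return len(basis)


def count_zero_sum_subsets(rows):
    """Number of subsets of rows (the empty one included) whose XOR is zero."""
    T = len(rows)
    count = 0
    for mask in range(1 << T):
        acc = 0
        for t in range(T):
            if (mask >> t) & 1:
                acc ^= rows[t]
        count += acc == 0
    return count


rows = [0b011, 0b101, 0b110, 0b000]                  # T = 4 rows, n = 3 columns
r = gf2_rank(rows)
s = count_zero_sum_subsets(rows).bit_length() - 1    # log2 of a power of two
print(r, s, r + s == len(rows))
```

Here the critical sets are $\{1, 2, 3\}$, $\{4\}$ and $\{1, 2, 3, 4\}$, two of which are independent, and the rank is 2.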
In addition to the critical sets of a $T \times n$ matrix $A = \|a_{tj}\|$, we consider a
hypergraph $G_A$ that is also defined by the matrix $A$. The set of vertices of the
hypergraph $G_A$ is the set $\{1, \dots, n\}$ of column indices and the set of enumerated
hyperedges is the set $\{e_1, \dots, e_T\}$, where
\[ e_t = \{j : a_{tj} = 1\}, \quad t = 1, \dots, T. \]
Thus there exists a correspondence between a row $a_t = (a_{t1}, \dots, a_{tn})$ and
the hyperedge $e_t$, $t = 1, \dots, T$. Note that the empty set corresponds to a row
consisting of zeros.
The multiplicity of a vertex $j$ in a set of hyperedges $C = \{e_{t_1}, \dots, e_{t_m}\}$ is the
number of hyperedges in $C$ that contain this vertex.
A set of hyperedges $C = \{e_{t_1}, \dots, e_{t_m}\}$ is called a hypercycle if each vertex of
the hypergraph $G_A$ has an even multiplicity in $C$, in other words, if the coordinate-wise
sum of rows $a_{t_1} + \dots + a_{t_m}$ in GF(2) equals the zero vector.
If each row of the matrix $A$ contains exactly two 1's, then the hypergraph $G_A$
is an ordinary graph, perhaps with multiple edges, and a hypercycle is an ordinary
cycle or a union of cycles.
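The definitions of hyperedge, multiplicity, and hypercycle translate directly into code. The following sketch is illustrative; the names `hyperedges` and `is_hypercycle` and the list-of-lists matrix format are assumptions made for the example.

```python
from collections import Counter


def hyperedges(A):
    """Hyperedge e_t = {j : a_tj = 1} for each row of a 0/1 matrix (columns numbered from 1)."""
    return [{j + 1 for j, a in enumerate(row) if a} for row in A]


def is_hypercycle(edges, indices):
    """True when every vertex has even multiplicity in the chosen hyperedges (1-based indices)."""
    multiplicity = Counter(v for t in indices for v in edges[t - 1])
    return all(m % 2 == 0 for m in multiplicity.values())


A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
edges = hyperedges(A)
print(edges, is_hypercycle(edges, [1, 2, 3]), is_hypercycle(edges, [1, 2]))
```

Each row of this matrix has exactly two 1's, so the hypergraph is an ordinary triangle: the three edges together form a cycle, while any two of them do not.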
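The set of the indices of hyperedges that form a hypercycle is a critical set for
the matrix $A$. Let $\varepsilon_1, \dots, \varepsilon_s$ take the values 0 and 1. Hypercycles $C_1, \dots, C_s$ are
independent if
\[ \varepsilon_1 C_1 \mathbin{\triangle} \varepsilon_2 C_2 \mathbin{\triangle} \cdots \mathbin{\triangle} \varepsilon_s C_s = \emptyset \]
if and only if $\varepsilon_1 = \dots = \varepsilon_s = 0$. Therefore the maximum number $s(A)$ of independent critical
sets of the matrix $A$ equals the maximum number of independent hypercycles
in the hypergraph $G_A$.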
3.2. Matrices with independent elements
This section deals with random matrices with independent elements. Let $A = \|a_{tj}\|$
be a $T \times n$ matrix whose elements are independent random variables taking the
values 0 and 1 with equal probabilities, and let $\rho_n(T)$ be the rank of the matrix $A$
in GF(2). The following theorem is the main result of this section.
Theorem 3.2.1. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. If $n \to \infty$ and
$T = n + m$, then
\[ P\{\rho_n(T) = n - s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right) \prod_{i=1}^{m+s} \left(1 - \frac{1}{2^i}\right)^{-1}, \]
where the last product equals 1 for $m + s = 0$.
Proof. The limit theorem will be proved by using an explicit formula for
$P\{\rho_n(T) = n - s\}$. Denote by $\rho_n(t)$ the rank of the submatrix of $A$ which consists
of the first $t$ rows of the matrix $A$. We interpret the parameter $t$ as time and consider
the process of sequential growth of the number of rows. Let $\xi_t = 1$ if the rank
$\rho_n(t - 1)$ increases after joining the $t$th row, and $\xi_t = 0$ if the rank preserves the
previous value. It is clear that
\[ \rho_n(T) = \xi_1 + \dots + \xi_T. \]
It is not difficult to describe the probabilistic properties of the random variables
$\xi_1, \dots, \xi_T$. The event $\{\xi_t = 1\}$ means that the $t$th row is linearly independent of
the rows with numbers $1, \dots, t - 1$, and the event $\{\xi_t = 0\}$
means that the row with number $t$ is a linear combination of the preceding rows.
If among the preceding $t - 1$ rows there are exactly $k$ linearly independent
$n$-dimensional vectors, then the linear span of these $k$ vectors contains $2^k$ vectors
(all linear combinations of these $k$ vectors). The matrix $A$ is constructed in such
a way that each row can be obtained by sampling with replacement from a box
containing all $2^n$ distinct $n$-dimensional vectors. In other words, any row of the
matrix $A$ is independent of all other rows and is equal to any $n$-dimensional vector
with probability $2^{-n}$. Therefore
\[ P\{\xi_t = 0 \mid \rho_n(t-1) = k\} = \frac{2^k}{2^n}, \qquad P\{\xi_t = 1 \mid \rho_n(t-1) = k\} = 1 - \frac{2^k}{2^n}. \tag{3.2.1} \]
Thus the process $\rho_n(t)$ is a Markov chain with stationary transition probabilities
that are given by (3.2.1). To find $P\{\rho_n(T) = n - s\}$, we can sum the
probabilities of all trajectories of the Markov chain that lead from the origin to the point
with coordinates $(n + m, n - s)$, that is, the trajectories such that $\rho_n(0) = 0$,
$\rho_n(n + m) = n - s$. If we represent a trajectory as a "broken line" with intervals
of growth and horizontal intervals, we see that any such broken line has exactly
$n + m - (n - s) = m + s$ horizontal intervals corresponding to $m + s$ zeros among
the values of $\xi_1, \dots, \xi_{n+m}$. The graph of the trajectory with $\xi_{t_1} = 0, \dots, \xi_{t_{m+s}} = 0$
is illustrated in Figure 3.2.1.
By using (3.2.1) and Figure 3.2.1, we can easily write an explicit formula for the
probability of a particular trajectory and for the total probability. The derivation
of this probability is quite simple if $m + s = 0$. Indeed, the only trajectory with
$\rho_n(0) = 0$ and $\rho_n(n + m) = n + m$ has no horizontal intervals, and at each interval
the broken line increases; therefore
\[ P\{\rho_n(n+m) = n - s\} = \left(1 - \frac{1}{2^n}\right)\left(1 - \frac{2}{2^n}\right) \cdots \left(1 - \frac{2^{n+m-1}}{2^n}\right) = \prod_{i=-m+1}^{n} \left(1 - \frac{1}{2^i}\right), \]
and in the case $m + s = 0$, as $n \to \infty$,
\[ P\{\rho_n(n+m) = n - s\} \to \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right). \]
This coincides with the assertion of the theorem for $m + s = 0$ because the last
product equals 1.
Figure 3.2.1. Graph of the trajectory with $\xi_{t_1} = \dots = \xi_{t_{m+s}} = 0$
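In the general case, for $m + s > 0$,
\[ P\{\rho_n(n+m) = n - s\} = \prod_{i=s+1}^{n} \left(1 - \frac{1}{2^i}\right) \sum_{1 \le t_1 < \dots < t_{m+s} \le n+m} \frac{2^{(t_1 - 1) + (t_2 - 2) + \dots + (t_{m+s} - m - s)}}{2^{n(m+s)}}. \]
Taking the factor $2^{(n-s)(m+s)}$ out of the sum yields
\[ P\{\rho_n(n+m) = n - s\} = 2^{-s(m+s)} \prod_{i=s+1}^{n} \left(1 - \frac{1}{2^i}\right) \sum_{1 \le t_1 < \dots < t_{m+s} \le n+m} 2^{\sum_{l=1}^{m+s} (t_l - l + s - n)}. \]
As will be seen from the following evaluations, the moments $t_1, \dots, t_{m+s}$ are
concentrated at the end of the trajectory; therefore, in the sum of the formula, it is
convenient to switch to the variables
\[ i_l = -(t_l - l + s - n), \quad l = 1, \dots, m + s. \]
It follows from $1 \le t_1 < \dots < t_{m+s} \le n + m$ that
\[ 0 \le t_1 - 1 \le t_2 - 2 \le \dots \le t_{m+s} - m - s \le n - s, \]
and by subtracting $n - s$ from each term, we obtain
\[ -n + s \le t_1 - 1 - n + s \le \dots \le t_{m+s} - m - s - n + s \le 0. \]
If we change the sign, we see that the domain $1 \le t_1 < \dots < t_{m+s} \le n + m$ in
terms of the new variables is
\[ 0 \le i_{m+s} \le \dots \le i_1 \le n - s. \]
Thus
\[ P\{\rho_n(n+m) = n - s\} = 2^{-s(m+s)} \prod_{i=s+1}^{n} \left(1 - \frac{1}{2^i}\right) \sum_{0 \le i_{m+s} \le \dots \le i_1 \le n-s} 2^{-i_1 - \dots - i_{m+s}}. \tag{3.2.2} \]
It is easily seen that, as $n \to \infty$,
\[ \prod_{i=s+1}^{n} \left(1 - \frac{1}{2^i}\right) \to \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right) \tag{3.2.3} \]
and
\[ \sum_{0 \le i_{m+s} \le \dots \le i_1 \le n-s} 2^{-i_1 - \dots - i_{m+s}} \to \sum_{0 \le i_{m+s} \le \dots \le i_1 < \infty} 2^{-i_1 - \dots - i_{m+s}}. \tag{3.2.4} \]
To complete the proof it remains to transform the right-hand side of (3.2.4). It
is not difficult to see that
\[ \sum_{0 \le i_{m+s} \le \dots \le i_1 < \infty} 2^{-i_1 - \dots - i_{m+s}} = \prod_{i=1}^{m+s} \left(1 - \frac{1}{2^i}\right)^{-1}. \tag{3.2.5} \]
Passing to the limit in (3.2.2) and taking into account (3.2.3), (3.2.4), and (3.2.5)
provide the assertion of the theorem. ∎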
Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent and take the
values 0 and 1 with equal probabilities. We consider the system of equations
\[ AX = 0 \tag{3.2.6} \]
with respect to unknowns $X = (x_1, \dots, x_n)$ in GF(2). Denote by $\nu_{n,T}$ the number
of linearly independent solutions of this system of equations. If the rank $\rho_n(T)$
of the matrix $A$ equals $r$, then $\nu_{n,T} = n - r$. Therefore Theorem 3.2.1 yields the
following assertion.
Theorem 3.2.2. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. If $n \to \infty$ and $T = n + m$, then
\[ P\{\nu_{n,T} = s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right) \prod_{i=1}^{m+s} \left(1 - \frac{1}{2^i}\right)^{-1}, \]
where the last product equals 1 for $m + s = 0$.
In particular, for $m = s = 0$,
\[ P\{\nu_{n,n} = 0\} \to \prod_{i=1}^{\infty} \left(1 - \frac{1}{2^i}\right) = 0.28878809\ldots. \]
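The Markov chain (3.2.1) gives the exact rank distribution for finite $n$, which can be compared numerically with the limit in Theorem 3.2.1. The sketch below is an illustration under assumptions: the function names are hypothetical, exact arithmetic uses Python's `fractions` module, and the infinite product is truncated after a fixed number of factors (the parameter `terms`).

```python
from fractions import Fraction


def rank_distribution(n, T):
    """Exact law of the GF(2) rank of a uniform random T x n 0/1 matrix via chain (3.2.1)."""
    dist = {0: Fraction(1)}
    for _ in range(T):
        new = {}
        for k, p in dist.items():
            stay = Fraction(2**k, 2**n)          # new row falls in the span of the old ones
            new[k] = new.get(k, 0) + p * stay
            if k < n:
                new[k + 1] = new.get(k + 1, 0) + p * (1 - stay)
        dist = new
    return dist


def limit_probability(s, m, terms=200):
    """Limit in Theorem 3.2.1, with the infinite product truncated after `terms` factors."""
    tail = 1.0
    for i in range(s + 1, s + 1 + terms):
        tail *= 1 - 2.0 ** -i
    head = 1.0
    for i in range(1, m + s + 1):
        head *= 1 - 2.0 ** -i
    return 2.0 ** (-s * (m + s)) * tail / head


n = 30
exact = rank_distribution(n, n)
for s in range(3):
    print(s, float(exact[n - s]), limit_probability(s, 0))
```

For a square matrix ($m = 0$) the two columns printed for each $s$ agree to many digits already at $n = 30$, since the finite product differs from the infinite one only by factors with $i > n$.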
The results of Theorems 3.2.1 and 3.2.2 are of special interest because they are
stable in the sense that the limit distribution of the rank of a matrix is invariant
with respect to deviations of the distributions of its elements from the equiprobable
distribution.
Theorem 3.2.3. Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent
and suppose there is a positive constant $\delta$ such that, for the probabilities $p_{tj}^{(n)} =
P\{a_{tj} = 1\}$, the inequalities
\[ \delta \le p_{tj}^{(n)} \le 1 - \delta, \quad t = 1, \dots, T, \quad j = 1, \dots, n, \]
hold. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. Then, as $n \to \infty$,
\[ P\{\rho_n(n+m) = n - s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right) \prod_{i=1}^{m+s} \left(1 - \frac{1}{2^i}\right)^{-1}, \]
where the last product equals 1 for $m + s = 0$.
Because these results are outside of the main combinatorial direction of this
book, we will omit the complicated proof of this theorem (see, e.g., [93]).
We illustrate the situation by proving that, under the conditions of Theorem
3.2.3, the mean value of the number of nontrivial solutions of system (3.2.6) is
invariant to deviations of the distributions of elements of $A$ from the equiprobable
distribution.
Let $\mu_{n,T}$ be the number of nontrivial (i.e., nonzero) solutions of system (3.2.6).
If we associate to the vector $X$ an indicator that is 1 if $X$ satisfies the system, then
\[ E\mu_{n,T} = \sum_{X \ne 0} P\{AX = 0\}. \]
We will evaluate $E\mu_{n,T}$ by using the following lemma on summation of
independent random variables in GF(2).
Lemma 3.2.1. Let $\xi_1, \dots, \xi_n$ be independent random variables that take the
values 0 and 1 with probabilities
\[ P\{\xi_i = 0\} = \frac{1 + \Delta_i}{2}, \quad P\{\xi_i = 1\} = \frac{1 - \Delta_i}{2}, \quad i = 1, \dots, n. \]
Then, in GF(2),
\[ P\{\xi_1 + \dots + \xi_n = 0\} = \frac{1 + \Delta_1 \cdots \Delta_n}{2}. \]
Proof. It is clear that it suffices to prove the assertion of the lemma for $n = 2$. In
that case,
\[ P\{\xi_1 + \xi_2 = 0\} = P\{\xi_1 = 0, \xi_2 = 0\} + P\{\xi_1 = 1, \xi_2 = 1\} = \frac{(1 + \Delta_1)(1 + \Delta_2)}{4} + \frac{(1 - \Delta_1)(1 - \Delta_2)}{4} = \frac{1 + \Delta_1 \Delta_2}{2}. \]
∎
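The formula of Lemma 3.2.1 is easy to confirm numerically by enumerating all outcomes. This sketch is illustrative only; the function name and the sample values of $\Delta_i$ are arbitrary choices, and the convention $P\{\xi_i = 0\} = (1 + \Delta_i)/2$ is assumed.

```python
from itertools import product
from math import prod


def prob_sum_zero(deltas):
    """P{xi_1 + ... + xi_n = 0 in GF(2)} when P{xi_i = 0} = (1 + delta_i) / 2."""
    total = 0.0
    for bits in product((0, 1), repeat=len(deltas)):
        if sum(bits) % 2 == 0:
            p = 1.0
            for b, d in zip(bits, deltas):
                p *= (1 - d) / 2 if b else (1 + d) / 2
            total += p
    return total


deltas = [0.3, -0.5, 0.8]
print(prob_sum_zero(deltas), (1 + prod(deltas)) / 2)
```

Both printed numbers should coincide up to floating-point rounding.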
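If the elements of $A$ are independent and take the values 0 and 1 with equal
probabilities, then by Lemma 3.2.1, for any $X \ne 0$,
\[ P\{AX = 0\} = \left(P\{a_{11} x_1 + \dots + a_{1n} x_n = 0\}\right)^T = 2^{-T}. \]
Therefore $E\mu_{n,T} = (2^n - 1) 2^{-T}$, and for $T = n + m$, where $m$ is a fixed integer,
\[ E\mu_{n,T} = \frac{1}{2^m} - \frac{1}{2^{n+m}}, \]
and as $n \to \infty$,
\[ E\mu_{n,T} \to 2^{-m}. \]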
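For very small $n$ and $T$ the mean $E\mu_{n,T} = (2^n - 1)2^{-T}$ can be verified by enumerating every 0/1 matrix. The sketch is a brute-force illustration with a hypothetical function name; it is feasible only for tiny sizes because it visits all $2^{nT}$ matrices.

```python
from fractions import Fraction
from itertools import product


def mean_nontrivial_solutions(n, T):
    """Average number of nonzero solutions of AX = 0 over all T x n matrices over GF(2)."""
    vectors = list(product((0, 1), repeat=n))
    total = 0
    matrices = 0
    for A in product(vectors, repeat=T):
        matrices += 1
        for X in vectors:
            if any(X) and all(sum(a * x for a, x in zip(row, X)) % 2 == 0 for row in A):
                total += 1
    return Fraction(total, matrices)


print(mean_nontrivial_solutions(3, 2), Fraction(2**3 - 1, 2**2))
```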
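Under some conditions on the nonequiprobable distribution of the matrix $A$,
the last result still holds. Let
\[ p_{tj}^{(n)} = P\{a_{tj} = 1\}, \]
and, as before, denote by $\mu_{n,T}$ the number of nontrivial solutions of system (3.2.6).
Theorem 3.2.4. Under the conditions of Theorem 3.2.3,
\[ E\mu_{n,T} \to 2^{-m}. \]
Proof. By using the indicators as in the calculation of the mean number of
solutions in the equiprobable case, we find that
\[ E\mu_{n,T} = \sum_{k=1}^{n} \sum_{1 \le j_1 < \dots < j_k \le n} P_{j_1 \dots j_k}, \tag{3.2.7} \]
where, for any fixed set $\{j_1, \dots, j_k\}$ from the domain of summation, the term
$P_{j_1 \dots j_k} = P\{AX = 0\}$ corresponds to the vector $X = (x_1, \dots, x_n)$ whose
elements with indices $j_1, \dots, j_k$ are 1 and the remaining elements are zero.
We represent the probabilities $p_{tj}^{(n)}$ as
\[ p_{tj}^{(n)} = \frac{1 - \Delta_{tj}^{(n)}}{2}. \]
According to the conditions of the theorem, there exists $\Delta < 1$ such that $|\Delta_{tj}^{(n)}| \le \Delta$
for all $t$ and $j$. Since the rows of $A$ are independent,
\[ P_{j_1 \dots j_k} = \prod_{t=1}^{T} P_{j_1 \dots j_k}^{(t)}, \]
where
\[ P_{j_1 \dots j_k}^{(t)} = P\{a_{t j_1} + \dots + a_{t j_k} = 0\}. \]
By Lemma 3.2.1,
\[ P\{a_{t j_1} + \dots + a_{t j_k} = 0\} = \frac{1 + \Delta_{t j_1}^{(n)} \cdots \Delta_{t j_k}^{(n)}}{2}, \]
and for all $t$ and $1 \le j_1 < \dots < j_k \le n$,
\[ \frac{1 - \Delta^k}{2} \le P_{j_1 \dots j_k}^{(t)} \le \frac{1 + \Delta^k}{2}. \]
Hence, for $P_{j_1 \dots j_k}$, we obtain the bounds
\[ \left(\frac{1 - \Delta^k}{2}\right)^T \le P_{j_1 \dots j_k} \le \left(\frac{1 + \Delta^k}{2}\right)^T. \]
By using these inequalities, we find from (3.2.7) that
\[ \sum_{k=1}^{n} \binom{n}{k} \left(\frac{1 - \Delta^k}{2}\right)^T \le E\mu_{n,T} \le \sum_{k=1}^{n} \binom{n}{k} \left(\frac{1 + \Delta^k}{2}\right)^T. \tag{3.2.8} \]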
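Now let $T = n + m$, where $m$ is a fixed integer. The left and the right sides of
(3.2.8) can be estimated in the same way. Therefore we obtain only an estimate of
the right-hand side. Let
\[ S(\Delta) = \sum_{k=1}^{n} \binom{n}{k} \left(\frac{1 + \Delta^k}{2}\right)^{n+m} \]
and compare $S(\Delta)$ to
\[ S(0) = \sum_{k=1}^{n} \binom{n}{k} \frac{1}{2^{n+m}}. \]
We have seen that $S(0) \to 2^{-m}$ as $n \to \infty$. We show that for any fixed $\Delta$,
$0 < \Delta < 1$, the difference $S(\Delta) - S(0)$ tends to zero. We divide $S(\Delta)$ into
two parts:
\[ S_1(\Delta) = \sum_{1 \le k \le \varepsilon n} \binom{n}{k} \left(\frac{1 + \Delta^k}{2}\right)^{n+m}, \qquad S_2(\Delta) = \sum_{\varepsilon n < k \le n} \binom{n}{k} \left(\frac{1 + \Delta^k}{2}\right)^{n+m}, \]
where $\varepsilon$, $0 < \varepsilon < 1/2$, will be chosen later. For the sake of simplicity, suppose
that $\varepsilon$ is such that $\varepsilon n$ is an integer; then for any $\varepsilon$ and $\Delta$, $0 < \Delta < 1$,
$0 < \varepsilon < 1/2$,
\[ S_1(\Delta) \le \varepsilon n \binom{n}{\varepsilon n} \left(\frac{1 + \Delta}{2}\right)^{n+m} \le \varepsilon n \, \frac{n^{\varepsilon n}}{(\varepsilon n)!} \left(\frac{1 + \Delta}{2}\right)^{n+m}, \]
and by using the inequality $n! > n^n \sqrt{n}\, e^{-n}$ for $(\varepsilon n)!$, this bound for $S_1(\Delta)$ can be written as
\[ S_1(\Delta) \le \sqrt{\varepsilon n} \left(\frac{1 + \Delta}{2}\right)^{m} \left(\frac{1 + \Delta}{2 \varepsilon^{\varepsilon} e^{-\varepsilon}}\right)^{n}. \]
If we choose a sufficiently small $\varepsilon$, we can make the value $(1 + \Delta)/(2 \varepsilon^{\varepsilon} e^{-\varepsilon})$ less
than 1. For such $\varepsilon$, the bound tends to zero as $n \to \infty$. Thus, there exists a fixed
$\varepsilon$, $0 < \varepsilon < 1/2$, such that the value $S_1(\Delta)$ and, consequently, $S_1(0)$ tend to zero,
and $S_1(\Delta) - S_1(0) \to 0$.
We now estimate the difference $S_2(\Delta) - S_2(0)$. It is clear that
\[ 0 \le S_2(\Delta) - S_2(0) = \sum_{\varepsilon n < k \le n} \binom{n}{k} \frac{1}{2^{n+m}} \left((1 + \Delta^k)^{n+m} - 1\right) \le \left((1 + \Delta^{\varepsilon n})^{n+m} - 1\right) \sum_{\varepsilon n < k \le n} \binom{n}{k} \frac{1}{2^{n+m}}. \]
Since $(1 + \Delta^{\varepsilon n})^{n+m} \to 1$ as $n \to \infty$, it follows from the estimate obtained
above that $S_2(\Delta) - S_2(0) \to 0$. Thus we have shown that $S(\Delta) - S(0) \to 0$ and
$S(0) \to 2^{-m}$; hence, $S(\Delta) \to 2^{-m}$. Theorem 3.2.4 is thus proved. ∎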
We can actually relax the hypotheses of Theorem 3.2.4. The result remains true
if for $t = 1, \dots, T$, $j = 1, \dots, n$,
\[ \frac{\log n + x_n}{n} \le p_{tj}^{(n)} \le 1 - \frac{\log n + x_n}{n}, \]
where $x_n$ tends to infinity arbitrarily slowly (see [93]). These bounds are exact
in a sense because, as we will show in the next section, the limit distribution of
the rank of a matrix $A$ differs from the distribution given in Theorem 3.2.1 if the
probability of 1's does not satisfy these inequalities.
3.3. Rank of sparse matrices
In Section 3.1, we introduced the notion of critical sets of a matrix. Recall that a set
$\{t_1, \dots, t_m\}$ of row indices of a matrix in GF(2) is called critical if the coordinate-wise
sum of rows with indices $t_1, \dots, t_m$ is the zero vector. The notion of
independence of critical sets was also introduced, and $s(A)$ denoted the maximum number
of independent critical sets of a matrix $A$. According to Theorem 3.1.1, the rank
$r(A)$ of a matrix $A$ is related to $s(A)$ by the equality $s(A) + r(A) = T$. Therefore,
instead of the rank of a matrix, we can investigate the maximum number $s(A)$ of
independent critical sets of the matrix.
In this section, critical sets are applied in the analysis of the rank of random
sparse matrices. Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent
random variables such that
\[ P\{a_{tj} = 1\} = \frac{\log n + x}{n}, \quad P\{a_{tj} = 0\} = 1 - \frac{\log n + x}{n}, \tag{3.3.1} \]
where $x$ is a constant, $t = 1, \dots, T$, $j = 1, \dots, n$. We find the limit distribution
of $s(A)$ for such a matrix.
Theorem 3.3.1. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$, and condition
(3.3.1) is valid, then the distribution of the maximum number of independent critical
sets $s(A)$ converges to the Poisson distribution with parameter $\lambda = \alpha e^{-x}$.
We show first that the distribution of the number of critical sets that correspond
to zero rows of the matrix converges to a Poisson distribution. Denote the number
of zero rows of the matrix $A$ by $\xi_{n,T}$.
Lemma 3.3.1. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < \infty$, and condition
(3.3.1) is valid, then for any fixed $k = 0, 1, \dots$,
\[ P\{\xi_{n,T} = k\} \to \frac{\lambda^k e^{-\lambda}}{k!}, \]
where $\lambda = \alpha e^{-x}$.
Proof. The probability $p_n$ that a fixed row consists entirely of zeros is
\[ p_n = \left(1 - \frac{\log n + x}{n}\right)^n, \]
and under the conditions of the lemma,
\[ p_n = \frac{1}{n} e^{-x} (1 + o(1)). \]
The random variable $\xi_{n,T}$ has the binomial distribution with parameters $(T, p_n)$,
where $T$ is the number of trials and $p_n$ is the probability of success. Under the
conditions of the lemma, the mean number of successes $T p_n$ tends to $\alpha e^{-x}$;
hence, the binomial distribution converges to the Poisson distribution with
parameter $\alpha e^{-x}$. ∎
We now prove that if $\alpha < 1$, then with probability tending to 1, all critical sets
consist of only zero rows.
Lemma 3.3.2. If $n, T \to \infty$ such that $T/n \to \alpha$, $\alpha < 1$, and condition (3.3.1) is
valid, then with probability tending to 1, the critical sets of $A$ consist of only zero
rows.
Proof. We consider the total number of critical sets that contain at least
one nonzero row. It is sufficient to prove that the mathematical expectation of this
number tends to zero. Although the proof of this fact is straightforward, it involves
many cumbersome estimations of sums containing the binomial coefficients.
An even number of successes among $k$ independent trials with probability of
success $p$ occurs with probability $(1 + (q - p)^k)/2$, where $q = 1 - p$.
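The remark on the parity of the number of successes can be confirmed against the binomial distribution directly. This is a small numerical check with a hypothetical function name, not part of the original argument.

```python
from math import comb


def prob_even_successes(k, p):
    """P{even number of successes in k Bernoulli(p) trials}, summed from the binomial law."""
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(0, k + 1, 2))


k, p = 7, 0.3
q = 1 - p
print(prob_even_successes(k, p), (1 + (q - p) ** k) / 2)
```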
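Let us find the probability that $k$ fixed rows form a critical set containing a
nonzero row. The indices of these rows form a critical set if each column of the
submatrix formed by these rows contains an even number of 1's. According to the
remark on the probability that the number of successes is even, for a single column this probability
equals
\[ \frac{1}{2} \left(1 + \left(1 - \frac{2(\log n + x)}{n}\right)^k\right). \]
Therefore the probability that these $k$ rows constitute a critical set equals
\[ \frac{1}{2^n} \left(1 + \left(1 - \frac{2(\log n + x)}{n}\right)^k\right)^n. \]
Note that the probability that there is no 1 in all these $k$ rows is equal to
\[ \left(1 - \frac{\log n + x}{n}\right)^{kn}. \]
By using the corresponding indicators to represent the total number of nontrivial
critical sets and the number of the critical sets that consist of zero rows, we obtain
the following expression for the mean number of critical sets that do not consist
of zero rows:
\[ \frac{1}{2^n} \sum_{k=0}^{T} \binom{T}{k} r_k^n - \sum_{k=0}^{T} \binom{T}{k} \left(1 - \frac{\log n + x}{n}\right)^{kn}, \]
where
\[ r_k = 1 + \left(1 - \frac{2(\log n + x)}{n}\right)^k. \]
We include the terms with $k = 0$ into these sums because they cancel each other.
Note first that
\[ \sum_{k=0}^{T} \binom{T}{k} \left(1 - \frac{\log n + x}{n}\right)^{kn} = \left(1 + \left(1 - \frac{\log n + x}{n}\right)^n\right)^T, \]
and under the conditions of the lemma,
\[ \left(1 + \left(1 - \frac{\log n + x}{n}\right)^n\right)^T \to e^{\alpha e^{-x}}. \]
Now consider the sum
\[ \frac{1}{2^n} \sum_{k=0}^{T} \binom{T}{k} r_k^n. \]
Set $a = 1 - 2(\log n + x)/n$ for now. The following equalities hold:
\[ \sum_{k=0}^{T} \binom{T}{k} (1 + a^k)^n = \sum_{k=0}^{T} \binom{T}{k} \sum_{i=0}^{n} \binom{n}{i} a^{ki} = \sum_{i=0}^{n} \binom{n}{i} \sum_{k=0}^{T} \binom{T}{k} a^{ik} = \sum_{i=0}^{n} \binom{n}{i} (1 + a^i)^T. \]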
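The interchange of summations above yields a symmetric identity in $T$ and $n$, namely $\sum_k \binom{T}{k}(1 + a^k)^n = \sum_i \binom{n}{i}(1 + a^i)^T$. It can be checked exactly with rational arithmetic; the sketch below is illustrative and its function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def sum_over_rows(T, n, a):
    """Left-hand side: sum over k of C(T, k) * (1 + a^k)^n."""
    return sum(comb(T, k) * (1 + a**k) ** n for k in range(T + 1))


def sum_over_columns(T, n, a):
    """Right-hand side: sum over i of C(n, i) * (1 + a^i)^T."""
    return sum(comb(n, i) * (1 + a**i) ** T for i in range(n + 1))


a = Fraction(3, 5)
print(sum_over_rows(5, 7, a) == sum_over_columns(5, 7, a))
```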
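Let
\[ a_k = \binom{n}{k} 2^{-n} r_k^T, \]
and divide the sum
\[ S(n, T) = \sum_{k=0}^{n} a_k \]
into five parts so that
\[ S(n, T) = S_1 + S_2 + S_3 + S_4 + S_5, \]
where
\[ S_1 = \sum_{0 \le k \le k_1} a_k, \quad S_2 = \sum_{k_1 < k \le k_2} a_k, \quad S_3 = \sum_{k_2 < k < k_3} a_k, \quad S_4 = \sum_{k_3 \le k \le k_4} a_k, \quad S_5 = \sum_{k_4 < k \le n} a_k, \]
\[ k_1 = \varepsilon n, \quad k_2 = \frac{n(1 - \varepsilon)}{2}, \quad k_3 = \frac{n}{2} - \frac{1}{2} n^{3/5}, \quad k_4 = \frac{n}{2} + \frac{1}{2} n^{3/5}, \]
and the value of $\varepsilon$ will be chosen later. For convenience we present the graphs of
the functions $\binom{n}{k} 2^{-n}$ and $r_k^T = (1 + (1 - 2(\log n + x)/n)^k)^T$ as functions of $k$ in
Figure 3.3.1.
Figure 3.3.1. Graphs of the functions $\binom{n}{k} 2^{-n}$ and $r_k^T$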
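The major contribution to $S(n, T)$ is made by the sum $S_4$. It is clear that
\[ \left(1 - \frac{2(\log n + x)}{n}\right)^k = \frac{e^{-x}}{n} (1 + o(1)) \]
uniformly in the integers $k = n/2 + u\sqrt{n}/2$ such that $|u| \le n^{1/10}$. These $k$ form
the domain of summation of $S_4$, which equals $\{k : |u| \le n^{1/10}\}$. Therefore
\[ r_k^T = \left(1 + \left(1 - \frac{2(\log n + x)}{n}\right)^k\right)^T = \left(1 + \frac{e^{-x}}{n} (1 + o(1))\right)^T = e^{\alpha e^{-x}} (1 + o(1)) \]
uniformly in $k$ in the domain of summation of $S_4$. Thus
\[ S_4 = e^{\alpha e^{-x}} (1 + o(1)) \sum_{k_3 \le k \le k_4} \binom{n}{k} \frac{1}{2^n} = e^{\alpha e^{-x}} (1 + o(1)), \]
since by the de Moivre-Laplace theorem,
\[ \sum_{k_3 \le k \le k_4} \binom{n}{k} \frac{1}{2^n} \to 1. \]
We now have to show that the remaining four sums tend to zero. We begin with
$S_5$. Since $r_k^T$ is monotone, we find that
\[ S_5 \le r_{k_4}^T \sum_{k_4 < k \le n} \binom{n}{k} \frac{1}{2^n}. \]
Under the conditions of the lemma $S_5 \to 0$, since, as was proved, $r_{k_4}^T \to e^{\alpha e^{-x}}$,
and according to the de Moivre-Laplace theorem,
\[ \sum_{k_4 < k \le n} \binom{n}{k} \frac{1}{2^n} \to 0. \]
Let us estimate $S_1$.
By using the monotonicity of $r_k^T$, we find that, for sufficiently small $\varepsilon$ such that $\varepsilon n$
is an integer,
\[ S_1 \le (1 + \varepsilon n) \binom{n}{\varepsilon n} \frac{2^T}{2^n} \le (1 + \varepsilon n) \left(\frac{2^{T/n}}{2 \varepsilon^{\varepsilon} e^{-\varepsilon}}\right)^n. \]
It is clear that $2^{T/n} (2 \varepsilon^{\varepsilon} e^{-\varepsilon})^{-1} \le q < 1$ for sufficiently small $\varepsilon$; therefore $S_1 \to 0$
as $n \to \infty$.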
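It remains to consider $S_2$ and $S_3$. Let us begin with
\[ S_2 = \sum_{\varepsilon n < k \le n(1 - \varepsilon)/2} a_k. \]
We first show that $a_k$ is a monotone increasing function for $k$ such that $\varepsilon n \le k \le
n(1 - \varepsilon)/2$. Indeed,
\[ \frac{a_{k+1}}{a_k} = \frac{\binom{n}{k+1} r_{k+1}^T}{\binom{n}{k} r_k^T} = \frac{n - k}{k + 1} \left(\frac{1 + (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^{k}}\right)^T \]
\[ \ge \frac{n - n(1 - \varepsilon)/2}{n(1 - \varepsilon)/2 + 1} \left(1 - \frac{(1 - 2(\log n + x)/n)^k - (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^k}\right)^T \]
\[ = \frac{1 + \varepsilon}{1 - \varepsilon + 2/n} \left(1 - \frac{(1 - 2(\log n + x)/n)^k - (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^k}\right)^T. \]
Since $1 + (1 - 2(\log n + x)/n)^k > 1$, we obtain
\[ \frac{a_{k+1}}{a_k} \ge \frac{1 + \varepsilon}{1 - \varepsilon + 2/n} \left(1 - \frac{2(\log n + x)}{n} \left(1 - \frac{2(\log n + x)}{n}\right)^k\right)^T. \]
For sufficiently large $n$,
\[ (1 + \varepsilon)/(1 - \varepsilon + 2/n) > 1 + \varepsilon. \]
Moreover, for $k$ satisfying $\varepsilon n \le k \le n(1 - \varepsilon)/2$,
\[ \left(1 - \frac{2(\log n + x)}{n}\right)^k \le e^{-2k(\log n + x)/n} \le c\, n^{-2\varepsilon}, \]
where $c$ is the constant $e^{-2\varepsilon x}$.
Thus, for sufficiently large $n$,
\[ \frac{a_{k+1}}{a_k} \ge (1 + \varepsilon) \left(1 - \frac{2c(\log n + x)}{n^{1 + 2\varepsilon}}\right)^T > 1. \]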
If we estimate S2, we can use the monotonicity of ak to obtain the inequality
S2 <
Let us estimate
(n\[ I (
n
Since a rough estimate is acceptable, we content ourselves with the bound
e~u <2 du{\ + o
Here we used the well-known asymptotics
\[ \int_{-\infty}^{-z} e^{-u^2/2}\,du = \frac{1}{z} e^{-z^2/2} (1 + o(1)) \]
as $z \to \infty$. Thus, there exists a constant $a$ such that
«
Let us estimate the second factor of ak2. It is clear that
1 + (l - 2^n+XAh\ < (! + e-2k2(\ogn+X)/ny
where Z) is a positive constant.
By combining the estimates of the two factors of a^2, we obtain the bound
and $S_2 \to 0$ if we choose $\varepsilon < 1/5$.
It remains to estimate
It is clear that
n
and $S_3 \to 0$ if $\varepsilon < 1/5$. ∎
Proof of Theorem 3.3.1. The assertion of Theorem 3.3.1 follows from
Lemmas 3.3.1 and 3.3.2 because by Lemma 3.3.2,
\[ P\{s(A) = \xi_{n,T}\} \to 1 \]
under the conditions of the theorem. ∎
The following theorem is a corollary to Theorem 3.3.1. Suppose that
\[ \frac{\log T + x}{T} \le p_{tj}^{(n)} \le 1 - \frac{\log T + x}{T}, \tag{3.3.2} \]
where $x$ is a constant and $t = 1, \dots, T$, $j = 1, \dots, n$.
Theorem 3.3.2. If $n, T \to \infty$ such that $T/n \to \alpha$, $1 < \alpha < \infty$, and
condition (3.3.2) is valid, then the distribution of $s(A)$ converges to the Poisson
distribution with parameter $\lambda = e^{-x}/\alpha$.
Proof. Since the rank of a matrix is the maximum number of linearly independent
rows or columns, we apply Theorem 3.3.1 to the transpose matrix and obtain the
assertion of Theorem 3.3.2. ∎
Because we know the limit distribution of the rank of a matrix $A$, we can obtain
some results for the behavior of the solutions to the system of linear equations with
the matrix $A$. Let us consider the system
\[ AX = B, \tag{3.3.3} \]
where the elements of the $T \times n$ matrix $A = \|a_{tj}\|$ are independent, and for
$t = 1, \dots, T$, $j = 1, \dots, n$,
\[ P\{a_{tj} = 1\} = \frac{\log n + x}{n}, \]
where $x$ is a constant, the column-vector $B = (b_1, \dots, b_T)$ is independent of $A$,
and the random variables $b_1, \dots, b_T$ are independent, taking the values 0 and 1
with equal probabilities.
Denote by $\mu_{n,T}$ the number of solutions of the system (3.3.3). The examples
cited in Section 3.1 show that the consistency of linear systems plays a particular
role in some of the problems related to such systems. The probability of consistency
$P_{n,T}$ of system (3.3.3) is the probability that the system has at least one solution:
\[ P_{n,T} = P\{\mu_{n,T} > 0\}. \]
By using Theorem 3.3.1 we can easily prove the following assertion.
Theorem 3.3.3. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$, and condition
(3.3.1) is valid, then
\[ P_{n,T} \to e^{-\alpha e^{-x}/2}. \]
Proof. If the rank $r(A)$ of $A$ equals $r$, then
\[ P\{\mu_{n,T} > 0 \mid r(A) = r\} = 2^{-T+r}. \tag{3.3.4} \]
Indeed, let the linearly independent rows have the indices $1, 2, \dots, r$. Then each
of the rows with indices $r + 1, \dots, T$ is a linear combination of the first $r$ rows,
and for the system to be consistent, each of the right-hand sides $b_{r+1}, \dots, b_T$ must
satisfy a linear relation of the form
\[ \varepsilon_{1t} b_1 + \dots + \varepsilon_{rt} b_r = b_t, \quad t = r + 1, \dots, T, \tag{3.3.5} \]
where $\varepsilon_{1t}, \dots, \varepsilon_{rt}$ are constants taking the values 0 and 1. The probability of the
validity of any of the relations (3.3.5) is equal to 1/2 and, hence, assertion (3.3.4)
is true.
Since $\{r(A) = r\} = \{s(A) = T - r\}$, by the total probability formula,
\[ P_{n,T} = \sum_{r=0}^{T} P\{r(A) = r\}\, 2^{-T+r} = \sum_{s=0}^{T} P\{s(A) = s\}\, 2^{-s}. \tag{3.3.6} \]
The last series from (3.3.6) is majorized by the series $\sum_{s=0}^{\infty} 2^{-s}$ and converges
uniformly. Therefore it is possible to pass to the limit under the sum in (3.3.6).
Passing to the limit with the help of Theorem 3.3.1 yields
\[ P_{n,T} \to \sum_{s=0}^{\infty} \frac{\lambda^s e^{-\lambda}}{2^s s!} = e^{-\lambda/2}, \]
where $\lambda = \alpha e^{-x}$. ∎
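The closing series in this proof is the generating function of the Poisson distribution with parameter $\lambda = \alpha e^{-x}$ evaluated at $1/2$. As a worked step, with the sample values $\alpha = 1/2$, $x = 0$ chosen here purely for illustration:

```latex
\sum_{s=0}^{\infty} \frac{\lambda^{s} e^{-\lambda}}{2^{s}\, s!}
  = e^{-\lambda} \sum_{s=0}^{\infty} \frac{(\lambda/2)^{s}}{s!}
  = e^{-\lambda}\, e^{\lambda/2}
  = e^{-\lambda/2},
\qquad
\alpha = \tfrac12,\ x = 0 \ \Longrightarrow\ \lambda = \tfrac12,\quad
e^{-\lambda/2} = e^{-1/4} \approx 0.7788 .
```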
3.4. Cycles and consistency of systems of random equations
In this section, we consider a system of $T$ equations in GF(2):
\[ x_{i(t)} + x_{j(t)} = \beta_t, \quad t = 1, \dots, T, \tag{3.4.1} \]
where $i(t)$, $j(t)$, $t = 1, \dots, T$, are independent random variables that take the
values $1, \dots, n$ with equal probabilities, and the variables $\beta_1, \dots, \beta_T$ take the
values 0 and 1. We denote by $A_{n,T}$ the matrix of this system. As in Section 3.1,
we associate the matrix $A_{n,T}$ to a graph $G_{n,T}$ with $n$ labeled vertices that
correspond to the variables $x_1, \dots, x_n$. The graph has $T$ edges $(i(t), j(t))$, $t =
1, \dots, T$. Thus the edges of the graph $G_{n,T}$ may be considered an outcome of
$T$ independent trials: In each trial, an edge joins two different vertices $i$ and
$j$ with probability $2n^{-2}$ and forms the loop at a vertex $i$ with probability $n^{-2}$,
$i, j = 1, \dots, n$. Thus the graph $G_{n,T}$ is the same as the graph considered in
Section 2.3.
Denote by $\mu_{n,T}$ the number of solutions of the system (3.4.1) and consider the
probability of consistency
\[ P_{n,T} = P\{\mu_{n,T} > 0\}. \]
We want to express $P_{n,T}$ in terms of the characteristics of $G_{n,T}$. Denote by $\varkappa_{n,T}$
the number of components of the graph $G_{n,T}$.
Theorem 3.4.1. If $\beta_1, \dots, \beta_T$ are independent random variables that take the
values 0 and 1 with equal probabilities and do not depend on $A_{n,T}$, then
\[ P_{n,T} = \sum_{k=1}^{n} \frac{1}{2^{T-n+k}}\, P\{\varkappa_{n,T} = k\}. \]
Proof. We first assume that $G_{n,T}$ is a connected graph. We can then choose a tree
that is a skeleton of the graph. This tree contains $n - 1$ edges that correspond to
a subsystem containing $n - 1$ equations of the system. If we assign a fixed value
to one of the unknowns, then with the help of the corresponding subsystem, we
obtain the values of all other unknowns. Consequently, the right-hand sides of
the remaining $T - n + 1$ equations must each take a fixed value for the system
to be consistent. Since $\beta_1, \dots, \beta_T$ are independent and take the values 0 and
1 with probabilities 1/2, the probability of consistency is $(1/2)^{T-n+1}$ for $G_{n,T}$
connected.
Now assume the graph $G_{n,T}$ consists of $k$ components with $n_1, \dots, n_k$ vertices
and $T_1, \dots, T_k$ edges, respectively. The whole system is consistent if and only
if each of its subsystems is consistent. Under the condition that the number of
components $\varkappa_{n,T} = k$ and, consequently, that the system decomposes into $k$
disjoint subsystems, the probability of consistency is
\[ \frac{1}{2^{T_1 - n_1 + 1}} \cdot \frac{1}{2^{T_2 - n_2 + 1}} \cdots \frac{1}{2^{T_k - n_k + 1}} = \frac{1}{2^{T - n + k}}. \]
When we apply the formula of total probability, we obtain the assertion of the
theorem. ∎
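For a fixed graph, the conditional probability $2^{-(T - n + k)}$ used in this proof can be checked by enumerating all right-hand sides. The sketch below is an illustration with hypothetical helper names: a union-find structure counts the components, and a brute-force search over all $x$ and $\beta$ decides consistency, so it is usable only for very small systems.

```python
from fractions import Fraction
from itertools import product


def components(n, edges):
    """Number of connected components of a graph on vertices 1..n (loops allowed)."""
    parent = list(range(n + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(v) for v in range(1, n + 1)})


def consistent(n, edges, beta):
    """True if x_i + x_j = beta_t (mod 2) for all edges has a solution."""
    return any(all((x[i - 1] ^ x[j - 1]) == b for (i, j), b in zip(edges, beta))
               for x in product((0, 1), repeat=n))


def consistency_probability(n, edges):
    """Fraction of right-hand sides beta for which the system is consistent."""
    T = len(edges)
    good = sum(consistent(n, edges, beta) for beta in product((0, 1), repeat=T))
    return Fraction(good, 2**T)


n, edges = 4, [(1, 2), (2, 3), (3, 1), (4, 4)]   # a triangle plus a loop at vertex 4
k = components(n, edges)
print(consistency_probability(n, edges), Fraction(1, 2 ** (len(edges) - n + k)))
```

In this example the triangle and the loop each impose one parity condition on the right-hand sides, which matches $T - n + k = 4 - 4 + 2 = 2$.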
According to Theorem 3.4.1, the number of components of the graph $G_{n,T}$ can
be used to investigate the system (3.4.1). Likewise, we can consider the maximum
number of independent critical sets $s(A_{n,T})$ introduced in Section 3.1. According
to Theorem 3.1.1, the maximum number of independent critical sets $s(A_{n,T})$ and
the rank $r(A_{n,T})$ of the matrix $A_{n,T}$ are related by the equality
\[ s(A_{n,T}) + r(A_{n,T}) = T. \]
It is not difficult to prove that
\[ \varkappa_{n,T} = n - T + s(A_{n,T}), \]
and the rank $r(A_{n,T}) = n - \varkappa_{n,T}$. Thus, the assertion of Theorem 3.4.1 is equivalent
to relation (3.3.6).
We remarked in Section 3.1 that a critical set of $A_{n,T}$ corresponds to a cycle
or a union of cycles in the graph $G_{n,T}$, and the maximum number of independent critical sets
$s(A_{n,T})$ equals the maximum number of independent cycles.
The graph $G_{n,T}$ was studied in Section 2.3. We have seen that if $n, T \to \infty$
such that $2T/n \to \lambda$, $0 < \lambda < 1$, then with probability tending to 1, the graph
has no components with more than one cycle. Therefore, under these conditions,
all cycles of $G_{n,T}$ are isolated and, consequently, independent. As in Section 3.1,
we denote by $\nu(G_{n,T})$ the number of cycles in $G_{n,T}$.
It was proven (see Theorems 2.3.3 and 2.3.4) that if $2T/n \to \lambda$, $0 < \lambda < 1$,
then
\[ P\{s(A_{n,T}) = \nu(G_{n,T})\} \to 1, \tag{3.4.2} \]
and for any fixed $k = 0, 1, \dots$,
\[ P\{\nu(G_{n,T}) = k\} \to \frac{\Lambda^k e^{-\Lambda}}{k!}, \tag{3.4.3} \]
where
\[ \Lambda = -\frac{1}{2} \log(1 - \lambda). \]
These results allow us to analyze the probability $P_{n,T}$ of consistency of the system
(3.4.1).
Theorem 3.4.2. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, and the right-hand
sides $\beta_1, \dots, \beta_T$ of the system (3.4.1) are independent random variables
that take the values 0 and 1 with probabilities 1/2 and do not depend on $A_{n,T}$,
then
\[ P_{n,T} \to (1 - \lambda)^{1/4}. \]
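Proof. When we use Theorem 3.4.1 or the equivalent formula (3.3.6), we find
that
\[ P_{n,T} = \sum_{k=1}^{n} \frac{1}{2^{T-n+k}}\, P\{\varkappa_{n,T} = k\} = \sum_{r=0}^{T} P\{r(A_{n,T}) = r\}\, 2^{-T+r} = \sum_{s=0}^{T} P\{s(A_{n,T}) = s\}\, 2^{-s}. \]
Taking into account (3.4.2) and (3.4.3) and passing to the limit under the sum
yield
\[ P_{n,T} \to \sum_{s=0}^{\infty} \frac{\Lambda^s e^{-\Lambda}}{2^s s!} = e^{-\Lambda/2} = (1 - \lambda)^{1/4}. \]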
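A Monte Carlo experiment gives a rough feel for this limit. The sketch below is an assumption-laden illustration, not a reproduction of any computation in the text: it samples the random graph, counts components with a union-find structure, and averages the conditional consistency probability $2^{-(T - n + \varkappa)}$ from Theorem 3.4.1. The sizes, the sample count, and the seed are arbitrary choices, and the comparison value is $e^{-\Lambda/2} = (1 - \lambda)^{1/4}$ with $\Lambda = -\frac{1}{2}\log(1 - \lambda)$.

```python
import random


def count_components(n, edges):
    """Number of connected components of a graph on vertices 0..n-1 (loops allowed)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    comps = n
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            comps -= 1
    return comps


def estimate_consistency(n, T, samples, seed=0):
    """Average of 2^-(T - n + kappa) over independently sampled graphs G_{n,T}."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(T)]
        kappa = count_components(n, edges)
        total += 2.0 ** -(T - n + kappa)
    return total / samples


n, T = 2000, 500                       # 2T/n = 0.5
estimate = estimate_consistency(n, T, samples=400)
limit = (1 - 2 * T / n) ** 0.25
print(round(estimate, 3), round(limit, 3))
```

The estimate fluctuates with the seed and the sample size, so only rough agreement with the limiting value should be expected.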
In the same way, we can treat the nonequiprobable case, where the indices
$i(t)$, $j(t)$, $t = 1, \dots, T$, of the variables of system (3.4.1) are independent
identically distributed random variables that take the value $i$ with probability $p_i$,
$i = 1, \dots, n$, $p_1 + \dots + p_n = 1$. As before, let the right-hand sides $\beta_1, \dots, \beta_T$
be independent, take the values 0 and 1 with equal probabilities, and not depend
on $A_{n,T}$. We retain the notation $P_{n,T}$ for the probability of consistency of such a
system.
Theorem 3.4.3. Let $p_i = a_i/n$, where $a_i = a_i(n)$, $0 < \varepsilon_0 \le a_i \le \varepsilon_1 < \infty$,
$i = 1, \dots, n$, $\varepsilon_0$ and $\varepsilon_1$ are constants, and let
\[ a^2 = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i^2. \]
If $n, T \to \infty$ such that $2T/n \to \lambda$ and $a^2 \lambda < 1$, then
\[ P_{n,T} \to (1 - a^2 \lambda)^{1/4}. \]
Proof. In Section 2.4, the nonequiprobable graph $G_{n,T}$ corresponding to the
matrix $A_{n,T}$ was considered. The graph contains $n$ labeled vertices and $T$ edges that
can be obtained by the following $T$ independent trials. In each trial, one edge is
drawn. The edge connects two different vertices $i$ and $j$ with probability $2 p_i p_j$, and
a loop at a vertex $i$ is formed with probability $p_i^2$, $i, j = 1, \dots, n$, $p_1 + \dots + p_n = 1$.
According to Theorem 2.4.1, under the conditions of Theorem 3.4.3 for any
fixed $k = 0, 1, \dots$,
\[ P\{\nu(G_{n,T}) = k\} \to \frac{\Lambda^k e^{-\Lambda}}{k!}, \]
where $\nu(G_{n,T})$ is the number of cycles in $G_{n,T}$, and
\[ \Lambda = -\frac{1}{2} \log(1 - a^2 \lambda). \]
If we reason as we did in the proof of Theorem 3.4.2, we obtain the assertion
of Theorem 3.4.3. ∎
The proofs of Theorems 3.4.2 and 3.4.3 are mainly based on assertion (3.3.4)
that
\[ P\{\mu_{n,T} > 0 \mid r(A_{n,T}) = r\} = 2^{-T+r}. \tag{3.4.4} \]
The proof of this assertion in Section 3.3 used the fact that if $r$ rows are
linearly independent and $r(A_{n,T}) = r$, then each of the remaining rows is a
linear combination of these $r$ rows, and the system is consistent only if the
corresponding right-hand sides satisfy a certain linear relation. If the right-hand sides
$\beta_1, \dots, \beta_T$ are independent, then such a relation is satisfied with probability 1/2,
and the events corresponding to different relations are independent. In other words,
each cycle in $G_{n,T}$ imposes a restriction on the right-hand sides $\beta_1, \dots, \beta_T$,
these restrictions are independent, and each of them is satisfied with
probability 1/2.
If the right-hand sides $\beta_1, \dots, \beta_T$ take the values 0 and 1 with unequal
probabilities, then property (3.4.4) is not valid, and the corresponding formula for the
probability $P_{n,T}$ of the consistency of the system becomes more complicated. In
this section, we prove the following assertions.
Let
\[ P_{n,T}(k) = P\{\mu_{n,T} > 0,\ \nu(G_{n,T}) = k\}, \qquad P_{n,T} = P\{\mu_{n,T} > 0\}. \]
Theorem 3.4.4. Let the right-hand sides $\beta_1, \dots, \beta_T$ of the system (3.4.1) be
independent identically distributed random variables that take the values 0 and 1
with probabilities $1 - p$ and $p$, respectively, $0 < p < 1$, $\Delta = 1 - 2p$.
If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then for any fixed $k =
0, 1, \dots$,
Theorem 3.4.5. Let the right-hand sides $\beta_1, \ldots, \beta_T$ of the system (3.4.1) take
the values 0 and 1, and let $m = m(T)$ be the number of 1's in $\beta_1, \ldots, \beta_T$.
If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, and $m/T \to p$, $0 \le p \le 1$,
then for any fixed $k = 0, 1, \ldots$,
rn,T ^ I ,
V1 "
$\Delta = 1 - 2p$.
Before proceeding to the proof of these theorems, we will establish some
auxiliary results. Let $\beta_1, \ldots, \beta_T$ be independent identically distributed random
variables that take the values 0 and 1 with probabilities $1 - p$ and $p$, respectively; let
$\Delta = 1 - 2p$; and let $E$ be the set of the even numbers. Let $r_0 = 0$ and $r_1, \ldots, r_k$
be positive integers. We consider the random variables
$$\eta_i = \beta_{r_0 + \cdots + r_{i-1} + 1} + \cdots + \beta_{r_0 + \cdots + r_i}, \quad i = 1, \ldots, k.$$
Lemma 3.4.1.
$$\mathbf{P}\{\eta_i \in E,\ i = 1, \ldots, k\} = \frac{1}{2^k}(1 + \Delta^{r_1}) \cdots (1 + \Delta^{r_k}).$$
Proof. It suffices to note that the random variables $\eta_1, \ldots, \eta_k$ are independent
and that the probability of the event of the sum $\beta_1 + \cdots + \beta_r$ being even equals
$(1 + \Delta^r)/2$. ∎
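The parity formula of Lemma 3.4.1 can be checked by direct enumeration. The following is a minimal sketch in exact rational arithmetic; the function names `prob_even_sum` and `lemma_341` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import product


def prob_even_sum(r, p):
    """Exact P{beta_1 + ... + beta_r is even} for i.i.d. summands with P{beta = 1} = p."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=r):
        ones = sum(bits)
        if ones % 2 == 0:
            total += p ** ones * (1 - p) ** (r - ones)
    return total


def lemma_341(rs, p):
    """Right-hand side of Lemma 3.4.1: 2^{-k} * prod_i (1 + Delta^{r_i}), Delta = 1 - 2p."""
    delta = 1 - 2 * p
    value = Fraction(1)
    for r in rs:
        value *= (1 + delta ** r) / 2
    return value


p = Fraction(1, 3)
blocks = (1, 2, 3)          # block lengths r_1, r_2, r_3
brute = Fraction(1)
for r in blocks:
    # the blocks use disjoint summands, so the parity events are independent
    brute *= prob_even_sum(r, p)
print(brute, lemma_341(blocks, p))
```

Both printed values coincide, which is the statement of the lemma for this particular choice of $p$ and block lengths.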
When the variables $\beta_1, \ldots, \beta_T$ are nonrandom, we need a similar assertion for
the following scheme of allocating $m$ particles into $T$ cells. The cells are divided
into $k + 1$ groups of cells containing $r_1, \ldots, r_k$, $T - r_1 - \cdots - r_k$ cells, respectively.
We assume that each cell can contain at most one particle, that $m \le T$, and that all
$\binom{T}{m}$ possible allocations are equiprobable. We introduce the random variables
$\xi_1, \ldots, \xi_T$, setting $\xi_t = 0$ if the cell number $t$ is empty, and $\xi_t = 1$ otherwise,
for $t = 1, \ldots, T$. By analogy with the random variables $\eta_1, \ldots, \eta_k$, we define the
random variables
$$\zeta_i = \xi_{r_0 + \cdots + r_{i-1} + 1} + \cdots + \xi_{r_0 + \cdots + r_i}, \quad i = 1, \ldots, k.$$
It is not difficult to verify the following assertions.
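Lemma 3.4.2. If $r_1, \ldots, r_k$ are fixed, $T \to \infty$, and $m/T \to 0$, then
$$\mathbf{P}\{\zeta_i \in E,\ i = 1, \ldots, k\} \to 1.$$
Lemma 3.4.3. If $r_1, \ldots, r_k$ are fixed, $T \to \infty$, and $m/T \to 1$, then
$$\mathbf{P}\{\zeta_i \in E,\ i = 1, \ldots, k\} \to 1$$
if all $r_1, \ldots, r_k$ are even; and
$$\mathbf{P}\{\zeta_i \in E,\ i = 1, \ldots, k\} \to 0$$
if at least one of $r_1, \ldots, r_k$ is odd.
Lemma 3.4.4. If $r_1, \ldots, r_k$ are fixed, $T \to \infty$, and $m/T \to p$, $0 < p < 1$, then
$$\mathbf{P}\{\zeta_i \in E,\ i = 1, \ldots, k\} \to \frac{1}{2^k}(1 + \Delta^{r_1}) \cdots (1 + \Delta^{r_k}),$$
where $\Delta = 1 - 2p$.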
We now consider the graph $G_{n,T}$ and mark the cycles in the graph by the
following rule. Recall that $\mathcal{A}_{n,T}$ is the set of all graphs with $n$ labeled vertices and
$T$ edges whose components are trees and unicyclic components, allowing cycles
of length 1 and 2. If a realization of the graph $G_{n,T}$ belongs to the set $\mathcal{A}_{n,T}$, then
every cycle of length $r$ is marked with probability $p_r$ independently of the others.
If the graph contains a component with more than one cycle, then no cycle of the
graph is marked. We denote by $p_{n,T}(k)$ the probability of the event that the number
of cycles $\nu(G_{n,T})$ in the graph $G_{n,T}$ is equal to $k$ and all cycles are marked. It is
clear that the probability $p_{n,T}$ of the event that all cycles are marked equals
$$p_{n,T} = \sum_{k=0}^{\infty} p_{n,T}(k).$$
As in Section 1.7, we denote by $d_m$ the number of mappings of the set $\{1, \ldots, m\}$
into itself whose graphs are connected, and by $d_m^{(r)}$ the number of mappings of
the set $\{1, \ldots, m\}$ into itself whose graphs are connected and contain a cycle of
length $r$. Let $F_{n,N}$ denote the number of forests with $n$ labeled vertices and $N$
trees, $T = n - N$.
Explicit expressions for $d_m$ and $d_m^{(r)}$ are well known. By using the formula for
the number of rooted trees, we obtain
$$d_m^{(r)} = \frac{(m-1)!\, m^{m-r}}{(m-r)!};$$
hence,
$$d_m = \sum_{r=1}^{m} d_m^{(r)} = (m-1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}.$$
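The count of connected mappings with a cycle of length $r$ can be checked by brute force for small $m$. The sketch below assumes the closed form $d_m^{(r)} = (m-1)!\,m^{m-r}/(m-r)!$ and also checks numerically the identity $d(x) = -\log(1-a)$ at $x = ae^{-a}$, which is used later in this section. All function names are illustrative.

```python
from itertools import product
from math import exp, factorial, log


def d_formula(m, r):
    """Assumed closed form d_m^{(r)} = (m-1)! * m^(m-r) / (m-r)!."""
    return factorial(m - 1) // factorial(m - r) * m ** (m - r)


def classify(f):
    """Cycle length of the mapping i -> f[i] if its graph is connected, else None."""
    m = len(f)
    adj = [set() for _ in range(m)]
    for i, j in enumerate(f):          # undirected edges {i, f(i)}
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if len(seen) < m:
        return None
    v = 0
    for _ in range(m):                 # after m steps the walk sits on the cycle
        v = f[v]
    length, w = 1, f[v]
    while w != v:
        w = f[w]
        length += 1
    return length


def d_brute(m):
    """Counts of connected mappings of an m-set, grouped by cycle length."""
    counts = {}
    for f in product(range(m), repeat=m):
        r = classify(f)
        if r is not None:
            counts[r] = counts.get(r, 0) + 1
    return counts


def d_series(a, terms=80):
    """Truncated series d(x) = sum_m d_m x^m / m! evaluated at x = a * exp(-a)."""
    x = a * exp(-a)
    total = 0.0
    for m in range(1, terms + 1):
        d_m = sum(d_formula(m, r) for r in range(1, m + 1))
        total += d_m * x ** m / factorial(m)
    return total


print(d_brute(4), d_series(0.3), -log(1 - 0.3))
```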
Lemma 3.4.5. For any integer k, 1 < k < min (n, T),
Pnj(k) = -r-i-j) > FW_W,7V > TJ j—
X 7 OT=1 X 7 7Wi + -+7W*=7W X
where
and for k = 0,
Proof. For $k = 0$, the assertion is obvious. As in Section 1.7, let us denote by $b_n^{(r)}$
the number of connected graphs with $n$ labeled vertices and one cycle of length $r$.
It is clear that
$$b_n^{(1)} = d_n^{(1)}, \quad b_n^{(2)} = d_n^{(2)}, \quad b_n^{(r)} = d_n^{(r)}/2, \quad r \ge 3. \tag{3.4.5}$$
Denote by $C_{n,T}$ the event that the graph $G_{n,T}$ contains no unmarked cycles. We
represent the event
$$\{\nu(G_{n,T}) = k,\ G_{n,T} \in \mathcal{A}_{n,T},\ C_{n,T}\}$$
as a union of the following disjoint events: In a specific order, $T$ trials give $T$ fixed
edges that form a graph consisting of trees and $k$ unicyclic components, including
a marked cycle. It follows from this description that
$$p_{n,T}(k) = \mathbf{P}\{\nu(G_{n,T}) = k,\ G_{n,T} \in \mathcal{A}_{n,T},\ C_{n,T}\}$$
= ?0 ?
ml
, jn) *-* m\\---mk\k\
m=k x ' m\-\ Ymkm
mk
E WPn • • • Eb^
n2 2*i+*2'
where $s_1 = s_1(r_1, \ldots, r_k)$ is the number of 1's among $r_1, \ldots, r_k$, and $s_2 =
s_2(r_1, \ldots, r_k)$ is the number of 2's among $r_1, \ldots, r_k$. The factor $2^{-s_1}$ appears
because the probability $2n^{-2}$ is replaced by $n^{-2}$ in $s_1$ cases. The factor $2^{-s_2}$
reflects the fact that permuting trials in which two identical edges occur results in
the same graph. The lemma follows from the relations (3.4.5). ∎
Theorem 3.4.6. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then for any
fixed $k = 0, 1, \ldots$,
where
oo
m
Proof. The proof is similar to the proof of Theorem 1.8.2. We partition the sum
from Lemma 3.4.5 into two parts. We put
It is clear that for any x in the domain of convergence of the series
D{x) =
oo
''m
D xm
we have
y y
m\\ ¦ ¦ ¦mici *—' , ml
m>M m\-\ \-mk=m m>M/k
C.4.6)
Along with the function D(x), let us introduce the generating function of the
number of connected mappings
, v—v UmX
d(x) =
m\
m—\
The inequality
$$D(x) \le d(x) \tag{3.4.7}$$
holds because
Also,
m
— <(m-l)\em,
which implies
< E
m>M/k m>M/k
Let
$$\theta(x) = \sum_{n=1}^{\infty} \frac{n^{n-1} x^n}{n!}.$$
By Example 1.3.2 and (1.4.8),
$$d(x) = \log a(x), \qquad a(x) = (1 - \theta(x))^{-1}.$$
We put $\alpha = 2T/n$ and $x = \alpha e^{-\alpha}$ for $\alpha < 1$. Then
$$\theta(x) = \alpha.$$
Under the hypothesis of the theorem, $\alpha = 2T/n \to \lambda$, $0 < \lambda < 1$, and for
$x = \alpha e^{-\alpha}$, there exists $q < 1$ such that $ex = \alpha e^{1-\alpha} \le q < 1$ for sufficiently large
$n$. Therefore
Using estimates A.8.8), A.8.9), and C.4.6)-C.4.9) yields
T\ B
m>M m\-\ \-mk=m
m>M m\-\
m\\
C.4.9)
n\ m\Dmr-Dmk
)tn-m,N ;
¦¦¦</
„,,
m>M/k n
where $c_1$, $c_2$ are constants. Thus, under the hypothesis of the theorem, $S_2 \to 0$.
If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < 1$, then by virtue of (1.8.7),
uniformly in $m \le M = T^{1/4}$. Therefore, for any fixed $k = 1, 2, \ldots$,
m<M m\-\ \-mk=m
m=k m\+-
By using the estimate of $S_2$, we obtain
f\^k
~?k\
Combining the estimates of $S_1$ and $S_2$, we obtain, under the hypothesis of the
theorem,
= P{v(GnJ) = k, Gn,T eAn,T, CnJ)
2kk\
C.4.10)
Hence the assertion of Theorem 3.4.6 for $k \ge 1$ follows, since $x = \alpha e^{-\alpha} =
(2T/n)e^{-2T/n} \to \lambda e^{-\lambda} = a$ and $D(x) \to D(a)$. We use (1.8.6) and the
representation from Lemma 3.4.5 and conclude that
Corollary 3.4.1. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then the
probability $p_{n,T}$ of the event that the graph $G_{n,T}$ contains no unmarked cycles
satisfies the relation
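Proof. We denote by $p^{(1)}_{n,T}(k)$ the probability of only marked cycles in the case where
the graph has $k$ unicyclic components and all the probabilities $p_r$ are equal to 1,
$r = 1, 2, \ldots$. In this case, $D(a) = d(a) = 2\Lambda = -\log(1 - \lambda)$, and Theorem
3.4.6 gives
$$p^{(1)}_{n,T}(k) \to \frac{\Lambda^k e^{-\Lambda}}{k!}, \quad k = 0, 1, \ldots.$$
To prove the corollary, it suffices to show that in the sum
$$p_{n,T} = \sum_{k=0}^{\infty} p_{n,T}(k) \tag{3.4.11}$$
one can pass to the limit under the sum. Let us show that for any $\varepsilon > 0$, there exists
$K$ such that
$$\sum_{k=K+1}^{\infty} p_{n,T}(k) < \varepsilon. \tag{3.4.12}$$
We choose $K$ such that
$$\sum_{k=K+1}^{\infty} \frac{\Lambda^k e^{-\Lambda}}{k!} < \frac{\varepsilon}{2},$$
and for fixed $K$, we choose $n_0$ so that for $n > n_0$,
$$\left| \sum_{k=0}^{K} p^{(1)}_{n,T}(k) - \sum_{k=0}^{K} \frac{\Lambda^k e^{-\Lambda}}{k!} \right| < \frac{\varepsilon}{2}.$$
Then, for $n > n_0$,
$$\sum_{k=K+1}^{\infty} p^{(1)}_{n,T}(k) \le 1 - \sum_{k=0}^{K} p^{(1)}_{n,T}(k),$$
and therefore
$$\sum_{k=K+1}^{\infty} p^{(1)}_{n,T}(k) \le \sum_{k=K+1}^{\infty} \frac{\Lambda^k e^{-\Lambda}}{k!} + \left| \sum_{k=0}^{K} \frac{\Lambda^k e^{-\Lambda}}{k!} - \sum_{k=0}^{K} p^{(1)}_{n,T}(k) \right| < \varepsilon.$$
Since $p_{n,T}(k) \le p^{(1)}_{n,T}(k)$, estimate (3.4.12) and the validity of passing to the
limit under the sum are established. ∎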
Proof of Theorems 3.4.4 and 3.4.5. A cycle leads to the inconsistency of system
(3.4.1) if the sum of the right-hand sides of the subsystem corresponding to the
cycle is odd. Let $p_r$ be the probability that this sum is even for a cycle of length
$r$. Then $P_{n,T}(k) = p_{n,T}(k)$ for any $k = 0, 1, \ldots$. Therefore Theorems 3.4.4 and
3.4.5 are direct corollaries to Theorem 3.4.6 and the fact proved above that one
can pass to the limit under the sum in (3.4.11). To prove Theorem 3.4.4, we notice
that in this case, according to Lemma 3.4.1,
$$p_r = (1 + \Delta^r)/2,$$
where $\Delta = 1 - 2p$; therefore
$$D(x) = \sum_{m=1}^{\infty} \frac{x^m}{m!} \sum_{r=1}^{m} d_m^{(r)} p_r = \sum_{m=1}^{\infty} \frac{d_m x^m}{2\, m!} + \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{2\, m!} = \frac{d(x) + d(x, \Delta)}{2}, \tag{3.4.13}$$
where
$$d(x, \Delta) = \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{m!}.$$
For $x = \alpha e^{-\alpha}$, $0 < \alpha < 1$,
$$d(x) = -\log(1 - \alpha),$$
and
$$d(x, \Delta) = -\log(1 - \alpha\Delta).$$
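Indeed,
$$d(x, \Delta) = \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{m!} = \sum_{r=1}^{\infty} \sum_{m=r}^{\infty} \frac{d_m^{(r)} \Delta^r x^m}{m!} = \sum_{r=1}^{\infty} \sum_{m=r}^{\infty} \frac{m^{m-r-1} \Delta^r x^m}{(m-r)!} = \sum_{r=1}^{\infty} \Delta^r x^r \sum_{t=0}^{\infty} \frac{(t+r)^{t-1} x^t}{t!}.$$
By using the well-known equality
$$\sum_{t=0}^{\infty} \frac{r (t+r)^{t-1} x^t}{t!} = e^{\alpha r}, \qquad x = \alpha e^{-\alpha},$$
from [124], Chapter 2, Problem 210 (see also [126]), we obtain
$$d(x, \Delta) = \sum_{r=1}^{\infty} \frac{\Delta^r x^r e^{\alpha r}}{r} = \sum_{r=1}^{\infty} \frac{\Delta^r \alpha^r}{r} = -\log(1 - \alpha\Delta).$$
We conclude by noting that for $\alpha = 2T/n \to \lambda$, $0 < \lambda < 1$,
$$d(x) \to -\log(1 - \lambda), \qquad d(x, \Delta) = -\log(1 - \alpha\Delta) \to -\log(1 - \Delta\lambda).$$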
Let us turn to the proof of Theorem 3.4.5. If m/T -> 0, then for any fixed k,
all the cycles are marked with probability tending to 1. Therefore
In the case where m/T —> 1, we have pr —> 0 for odd r and pr —> 1 for even r
by Lemma 3.4.3. Therefore, in this case,
n _y nB) -
oo
D{x) =
.... m\
m=\ m=\
It is not difficult to see that
In the case where $m/T \to p$, $0 < p < 1$, by Lemma 3.4.4,
$$p_r \to (1 + \Delta^r)/2,$$
and, as in (3.4.13),
$$D(x) \to D(a) = (d(a) + d(a, \Delta))/2 = -\tfrac{1}{2} \log\bigl((1 - \lambda)(1 - \Delta\lambda)\bigr).$$
3.5. Hypercycles and consistency of systems of
random equations
In Section 3.2, we studied the rank of random matrices and found, in particular, that
if the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ are independent identically distributed
random variables taking the values 0 and 1 with equal probabilities, then the rank
$r(A)$ of the matrix $A$ has a threshold property: If $T/n \to \alpha$ and $\alpha < 1$, then
$\mathbf{P}\{r(A) = T\} \to 1$, and if $T/n \to \alpha$ and $\alpha > 1$, then $\mathbf{P}\{r(A) = n\} \to 1$.
In other words, the maximum number of independent critical sets $s(A)$ tends in
probability to zero in the former case and to infinity in the latter case. A similar
property apparently holds for the sparse matrices considered in Section 3.3: We
proved only that if $\alpha < 1$, then $s(A)$ has in the limit a Poisson distribution, and
$\mathbf{E}s(A) \to \infty$ for $\alpha > 1$.
In Section 3.4, we considered systems with at most two unknowns in each
equation. It was shown that if $T/n \to \alpha$, $0 < \alpha < 1/2$, then the distribution of the maximal
number of independent critical sets or independent cycles in the corresponding
graph approaches the Poisson distribution with parameter $\Lambda = -\tfrac{1}{2} \log(1 - 2\alpha)$.
As follows from Theorem 2.1.6, if $\alpha > 1/2$, then $s(A)$ tends in probability to
infinity.
The case of a matrix with independent and identically distributed random
elements taking the values 0 and 1 with probabilities 1/2 and the case of a matrix with
at most two elements in each row studied in Section 3.4 can be considered as the
extreme cases in terms of the behavior of the rank and the maximum number of
independent critical sets. In these cases, the threshold effect appears at the points
$T/n = 1$ and $T/n = 1/2$, respectively.
In this section, we consider an intermediate case and obtain a weaker form of
the threshold effect. We consider the system of random linear equations in GF(2):
$$x_{i_1(t)} + \cdots + x_{i_r(t)} = b_t, \quad t = 1, \ldots, T, \tag{3.5.1}$$
where $i_1(t), \ldots, i_r(t)$, $t = 1, \ldots, T$, are independent identically distributed
random variables taking the values $1, \ldots, n$ with equal probabilities, and the
independent random variables $b_1, \ldots, b_T$ do not depend on the left-hand side of the
system and take the values 0 and 1 with equal probabilities. If $r = 2$, we obtain
the system considered in Section 3.4.
In Section 3.1, we introduced the notions of critical sets for a matrix and
hypercycles for the hypergraph corresponding to a matrix. Denote by $A_{r,n,T}$ the matrix
of system (3.5.1) and by $G_{r,n,T}$ the hypergraph with $n$ vertices and $T$ hyperedges
$e_1, \ldots, e_T$ that corresponds to this matrix. Thus we consider a random hypergraph
$G_{r,n,T}$, whose matrix $A = A_{r,n,T} = \|a_{tj}\|$ has the following structure. The
elements of the matrix $a_{tj}$, $t = 1, \ldots, T$, $j = 1, \ldots, n$, are random variables and the
rows of the matrix are independent. There are $r$ ones allocated to each row: Each
1, independent of the others, is placed in each of $n$ positions with probability $1/n$,
and $a_{tj}$ equals 1 if there are an odd number of 1's in position $j$ of row $t$. Therefore,
there are no more than $r$ ones in each row.
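The construction of a row described above can be sketched as follows. The helper names `random_row`, `random_matrix`, and `gf2_rank` are illustrative. The rank routine is included only as a convenience, under the assumption that the maximum number of independent critical sets of a $T \times n$ matrix equals $T$ minus its rank over GF(2).

```python
import random


def random_row(n, r, rng):
    """Drop r ones independently and uniformly into n positions, then reduce mod 2."""
    row = [0] * n
    for _ in range(r):
        row[rng.randrange(n)] ^= 1   # a position hit twice cancels out
    return row


def random_matrix(T, n, r, seed=0):
    """T independent rows of the matrix of system (3.5.1)."""
    rng = random.Random(seed)
    return [random_row(n, r, rng) for _ in range(T)]


def gf2_rank(rows):
    """Rank over GF(2) of a 0/1 matrix given as a list of rows."""
    basis = {}                       # leading bit position -> reduced row (as int)
    for row in rows:
        v = int("".join(map(str, row)), 2)
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)


A = random_matrix(T=50, n=10, r=3)
print(gf2_rank(A), 50 - gf2_rank(A))   # rank and (assumed) number of independent critical sets
```

Every row has weight at most $r$, and the weight has the same parity as $r$, because each repeated position removes two ones.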
For such regular hypergraphs, the following threshold property holds: If $n, T \to
\infty$ such that $T/n \to \alpha$, then an abrupt change in the behavior of the rank of the
matrix $A_{r,n,T}$ occurs while the parameter $\alpha$ passes the critical value $\alpha_r$. This
property can be expressed in terms of the total number of hypercycles in $G_{r,n,T}$.
Let $s(A_{r,n,T})$ be the maximum number of independent critical sets of $A_{r,n,T}$ or
independent hypercycles of the hypergraph $G_{r,n,T}$. Then
$$S(A_{r,n,T}) = 2^{s(A_{r,n,T})} - 1$$
is the total number of critical sets or hypercycles.
In this section, we prove that the following threshold property is true for $S(A_{r,n,T})$.
Theorem 3.5.1. Let $r \ge 3$ be fixed, $T, n \to \infty$ such that $T/n \to \alpha$. Then there
exists a constant $\alpha_r$ such that $\mathbf{E}S(A_{r,n,T}) \to 0$ for $\alpha < \alpha_r$ and $\mathbf{E}S(A_{r,n,T}) \to \infty$
for $\alpha > \alpha_r$.
The constant $\alpha_r$ is the first component of the vector that is the unique solution
of the system of equations
$$e^{-x} \cosh\lambda \left(\frac{\alpha r}{\alpha r - x}\right)^{\alpha} = 1, \qquad \frac{x}{\lambda} \left(\frac{\alpha r - x}{x}\right)^{1/r} = 1, \qquad \lambda \tanh\lambda = x,$$
with respect to the variables $\alpha$, $x$, and $\lambda$.
The numerical solution of the system of equations gives us the following values
of the critical constants:
$\alpha_3 = 0.8894\ldots$, $\alpha_4 = 0.9671\ldots$, $\alpha_5 = 0.9891\ldots$,
$\alpha_6 = 0.9969\ldots$, $\alpha_7 = 0.9986\ldots$, $\alpha_8 = 0.9995\ldots$.
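The critical constants can be reproduced numerically. The sketch below rests on one reading of the three conditions $f = 1$, $f'_x = 0$, $\lambda\tanh\lambda = x$ from the proof of Theorem 3.5.1. With $t = \tanh\lambda$ and $x = \lambda t$, the stationarity condition gives $\alpha r = x(1 + t^{-r})$, and the condition $f = 1$ becomes $\log\cosh\lambda - x + \alpha\log(1 + t^r) = 0$, a single equation in $\lambda$ that is solved here by bisection. The function names and the bracketing interval are assumptions of this sketch.

```python
from math import cosh, log, log1p, tanh


def residual(lam, r):
    """Return (log f, alpha) along the curve where f'_x = 0 and x = lam * tanh(lam)."""
    t = tanh(lam)
    x = lam * t
    alpha = x * (1.0 + t ** (-r)) / r          # from alpha * r - x = x / t^r
    return log(cosh(lam)) - x + alpha * log1p(t ** r), alpha


def critical_alpha(r, lo=0.1, hi=50.0, iters=200):
    """Bisection in lambda; assumes residual < 0 at lo and residual > 0 at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid, r)[0] < 0.0:
            lo = mid
        else:
            hi = mid
    return residual(0.5 * (lo + hi), r)[1]


for r in (3, 4):
    print(r, round(critical_alpha(r), 4))
```

For $r = 3$ and $r = 4$ the computed values agree with the constants quoted above to the printed precision.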
Expanding the solution of the system into powers of e~r yields
e-r
ar « 1 1 r ,
Iog2 log2V2 Iog2 2)
which gives values close to the exact ones for r > 4.
Let us give some auxiliary results that will be needed for the proof of
Theorem 3.5.1.
The total number of hypercycles $S(A_{r,n,T})$ in the hypergraph $G_{r,n,T}$ with the
matrix $A_{r,n,T}$ can be represented as a sum of indicators. Let $\xi_{t_1, \ldots, t_m} = 1$ if the
hypercycle $C = \{e_{t_1}, \ldots, e_{t_m}\}$ occurs in $G_{r,n,T}$, and $\xi_{t_1, \ldots, t_m} = 0$ otherwise. It is
clear that $\mathbf{P}\{\xi_{t_1, \ldots, t_m} = 1\}$ does not depend on the indices $t_1, \ldots, t_m$. Indeed, from
the definition of the random hypergraph $G_{r,n,T}$, the indicator $\xi_{t_1, \ldots, t_m} = 1$ if and
only if there are an even number of 1's in each column of the submatrix consisting
of the rows with indices $t_1, \ldots, t_m$. The numbers of 1's in the $n$ columns of any $m$ rows,
before these numbers are reduced modulo 2, have the multinomial distribution
with $rm$ trials and $n$ equiprobable outcomes.
Denote by $\eta_1(s, n), \ldots, \eta_n(s, n)$ the contents of the cells in the
equiprobable scheme of allocating $s$ particles into $n$ cells. In these notations, the numbers
of 1's in the columns of any $m$ rows, before those numbers have been reduced
modulo 2, have a distribution that coincides with the distribution of the variables
$\eta_1(rm, n), \ldots, \eta_n(rm, n)$. Therefore
$$\mathbf{P}\{\xi_{t_1, \ldots, t_m} = 1\} = \mathbf{P}\{\eta_1(rm, n) \in E, \ldots, \eta_n(rm, n) \in E\},$$
where $E$ is the set of even numbers, and the average number of hypercycles in
$G_{r,n,T}$ can be written in the following form:
$$\mathbf{E}S(A_{r,n,T}) = \sum_{m=1}^{T} \binom{T}{m} P_E(rm, n), \tag{3.5.3}$$
where
$$P_E(rm, n) = \mathbf{P}\{\eta_1(rm, n) \in E, \ldots, \eta_n(rm, n) \in E\}.$$
Thus, to estimate $\mathbf{E}S(A_{r,n,T})$, we need to know the asymptotic behavior of
$P_E(rm, n)$.
We consider a more general case and obtain the asymptotic behavior of the
probabilities
$$P_R(s, n) = \mathbf{P}\{\eta_1(s, n) \in R, \ldots, \eta_n(s, n) \in R\},$$
where $R$ is a subset of the set of all nonnegative integers.
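The joint distribution of the random variables $\eta_1(s, n), \ldots, \eta_n(s, n)$ can be
expressed as a conditional distribution of independent random variables $\xi_1, \ldots, \xi_n$,
identically distributed by the Poisson law with an arbitrary parameter $\lambda$, in the
following way (see, e.g., [90]). For any nonnegative integers $s_1, \ldots, s_n$ such that
$s_1 + \cdots + s_n = s$,
$$\mathbf{P}\{\eta_1(s, n) = s_1, \ldots, \eta_n(s, n) = s_n\} = \mathbf{P}\{\xi_1 = s_1, \ldots, \xi_n = s_n \mid \xi_1 + \cdots + \xi_n = s\}.$$
Therefore
$$P_R(s, n) = \mathbf{P}\{\xi_1 \in R, \ldots, \xi_n \in R \mid \xi_1 + \cdots + \xi_n = s\} = (\mathbf{P}\{\xi_1 \in R\})^n \, \frac{\mathbf{P}\{\xi_1 + \cdots + \xi_n = s \mid \xi_1 \in R, \ldots, \xi_n \in R\}}{\mathbf{P}\{\xi_1 + \cdots + \xi_n = s\}}.$$
We now introduce independent identically distributed random variables
$\xi_1^{(R)}, \ldots, \xi_n^{(R)}$ with the distribution
$$\mathbf{P}\{\xi_1^{(R)} = k\} = \mathbf{P}\{\xi_1 = k \mid \xi_1 \in R\}, \quad k = 0, 1, \ldots.$$
It is not difficult to see that
$$\mathbf{P}\{\xi_1 + \cdots + \xi_n = s \mid \xi_1 \in R, \ldots, \xi_n \in R\} = \mathbf{P}\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\},$$
and therefore
$$P_R(s, n) = (\mathbf{P}\{\xi_1 \in R\})^n \, \frac{\mathbf{P}\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\}}{\mathbf{P}\{\xi_1 + \cdots + \xi_n = s\}}. \tag{3.5.4}$$
Let $x = s/n$ and choose the parameter $\lambda$ of the Poisson distribution in such a way
that
$$x = \mathbf{E}\xi_1^{(R)} = \sum_{k \in R} \frac{k \lambda^k}{k!} \Big/ \sum_{k \in R} \frac{\lambda^k}{k!}.$$
Let $d$ be the maximum span of the lattice on which the set $R$ is situated and denote
the lattice by $\Gamma_R$.
Theorem 3.5.2. If $s, n \to \infty$ such that $n \in \Gamma_R$, then in any interval of the form
$0 < x_0 \le x \le x_1 < \infty$,
$$P_R(s, n) = \left( \frac{x^x e^{\lambda}}{\lambda^x e^x} \, \mathbf{P}\{\xi_1 \in R\} \right)^n \frac{d \sqrt{x}}{\sigma} \, (1 + o(1))$$
uniformly in $x = s/n$, where the parameter $\lambda$ of the Poisson distribution of the
random variable $\xi_1$ is the root of the equation $x = \mathbf{E}\xi_1^{(R)}$, and $\sigma^2 = \mathbf{D}\xi_1^{(R)}$ (the
variance).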
Proof. The local limit theorem holds for the sum $\xi_1^{(R)} + \cdots + \xi_n^{(R)}$. Following
the classical proof of the local limit theorem of Gnedenko [49], we prove that if
$s, n \to \infty$ such that $n \in \Gamma_R$, then
$$\mathbf{P}\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\} = \frac{d}{\sigma \sqrt{2\pi n}} \, (1 + o(1))$$
uniformly in $x = s/n$ in any interval of the form $0 < x_0 \le x \le x_1 < \infty$, where
$\sigma^2 = \mathbf{D}\xi_1^{(R)}$ and $d$ is the span of the lattice $\Gamma_R$.
When we substitute the expression into (3.5.4) and take into account that the
sum $\xi_1 + \cdots + \xi_n$ is distributed by the Poisson law with parameter $\lambda n$, we obtain
the assertion of the theorem. ∎
Note that (3.5.4) implies the estimate
$$P_R(s, n) \le (\mathbf{P}\{\xi_1 \in R\})^n \, \frac{s! \, e^{\lambda n}}{(\lambda n)^s},$$
where $P_R(s, n)$ does not depend on $\lambda$, and on the right-hand side any positive value
can be assigned to this parameter. Let $E = \{0, 2, \ldots\}$. In this case,
$$\mathbf{P}\{\xi_1 \in E\} = e^{-\lambda} \cosh\lambda,$$
and the estimate takes the form
$$P_E(s, n) \le (\cosh\lambda)^n \, \frac{s!}{\lambda^s n^s}, \tag{3.5.5}$$
where $\lambda > 0$ can be chosen arbitrarily.
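For small parameters, the probability $P_E(s, n)$ can be computed exactly and compared with the bound (3.5.5). The sketch below uses the elementary identity $P_E(s, n) = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} (1 - 2j/n)^s$, which is not stated in the text and is introduced here only as a checking device. It is cross-checked against brute-force enumeration, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb, cosh, factorial


def p_even_exact(s, n):
    """P{all n cell counts are even} for s particles thrown uniformly into n cells."""
    return sum(comb(n, j) * Fraction(n - 2 * j, n) ** s for j in range(n + 1)) / 2 ** n


def p_even_brute(s, n):
    """The same probability by enumerating all n^s allocations."""
    good = 0
    for cells in product(range(n), repeat=s):
        if all(cells.count(c) % 2 == 0 for c in range(n)):
            good += 1
    return Fraction(good, n ** s)


def bound_355(s, n, lam):
    """Right-hand side of (3.5.5): (cosh lam)^n * s! / (lam^s * n^s)."""
    return cosh(lam) ** n * factorial(s) / (lam ** s * n ** s)


def mean_hypercycles(T, n, r):
    """E S = sum_{m=1}^{T} C(T, m) * P_E(rm, n), formula (3.5.3)."""
    return sum(comb(T, m) * p_even_exact(r * m, n) for m in range(1, T + 1))


print(p_even_exact(4, 3), p_even_brute(4, 3))
print(float(p_even_exact(6, 4)), bound_355(6, 4, 1.0))
print(float(mean_hypercycles(T=8, n=10, r=3)))
```

The bound holds for every $\lambda > 0$, so in practice $\lambda$ is chosen to make the right-hand side small, as is done in the lemmas that follow.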
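We now estimate the sum
$$\sum_{m=1}^{T} \binom{T}{m} P_E(rm, n).$$
Lemma 3.5.1. If $r \ge 3$ is fixed, and $T, n \to \infty$ such that $T/n \to \alpha$, then for any
$\varepsilon > 0$, there exists $\delta > 0$ such that
$$\sum_{1 \le m \le \delta T} \binom{T}{m} P_E(rm, n) < \varepsilon.$$
Proof. First we point out that
$$\mathbf{P}\{\xi_1^{(E)} = 2k\} = \mathbf{P}\{\xi_1 = 2k \mid \xi_1 \in E\} = \frac{\lambda^{2k}}{(2k)! \cosh\lambda}, \quad k = 0, 1, \ldots,$$
$$\mathbf{E}\xi_1^{(E)} = \lambda \tanh\lambda.$$
Put $x = rm/n$ and choose the parameter $\lambda$ of the Poisson distribution in such a
way that $x = \lambda \tanh\lambda$. From (3.5.5), it follows that
$$P_E(rm, n) \le (\cosh\lambda)^n \, \frac{(rm)!}{\lambda^{rm} n^{rm}}.$$
Since the value of $x$ becomes small for sufficiently small $\delta$, we can assume that
$\lambda < 1$ in the domain of summation. For such $\lambda$,
$$\lambda^2/4 \le x = \lambda \tanh\lambda \le \lambda^2,$$
and therefore
$$\cosh\lambda \le e^{\lambda^2} \le e^{4x}.$$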
We now estimate the sum. It is easy to see that
\m)')s E
Km<ST V 7 Km<ST
l<m<ST
Tme4xn
Km<ST x 7
m
\<m<ST
r\rl2
-) rr'2-leAr8r'2-1
Since $T/n$ tends to a constant, the last sum can be made arbitrarily small by
choosing a sufficiently small $\delta$. ∎
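Lemma 3.5.2. If $r$ is fixed, and $T, n \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$,
then for any $\varepsilon > 0$, there exists $\delta > 0$ such that
$$\sum_{(1 - \delta)T \le m \le T} \binom{T}{m} P_E(rm, n) < \varepsilon.$$
Proof. Put $\lambda = rm/n$ and let an integer $m_0$ be chosen such that $m_0/T \le \delta$. With
such a choice of $\lambda$, by (3.5.5),
$$\sum_{m = T - m_0}^{T} \binom{T}{m} P_E(rm, n) \le \sum_{m = T - m_0}^{T} \binom{T}{m} (\cosh\lambda)^n \, \frac{(rm)!}{\lambda^{rm} n^{rm}}.$$
Since in the domain of summation, $\lambda$ is greater than some positive constant, there
exists $q < 1$ such that
$$e^{-\lambda} \cosh\lambda = (1 + e^{-2\lambda})/2 \le q.$$
By using the inequality
$$(rm)! \le c (rm)^{rm} e^{-rm} (rT)^{1/2},$$
where $c$ is a constant, we obtain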
]T ( )PE(rm,n) < c(rT){/2 ]T ( T\e~xcoshk)n
m=T-mo m=T-m0 ^ '
x—* I * \Q \i — Q)
n \ II ;
m=T-mo
/ a \"~T
<c{rT)x'2{ mo/in-T) )
Since $q, \alpha < 1$, the value $m_0/(n - T)$ can be made arbitrarily small by choosing
a sufficiently small $\delta$, and therefore the value $q/(1 - q)^{m_0/(n-T)}$ can be made
smaller than some $Q < 1$. Thus, for a sufficiently small $\delta$, the right-hand side
tends to zero under the conditions of the lemma. ∎
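Proof of Theorem 3.5.1. We now estimate the middle part of the sum. As $T/n \to
\alpha$ and $\delta \le m/T \le 1 - \delta$, the values $x = rm/n$ lie in an interval of the form
$0 < x_0 \le x \le x_1 < \infty$. When we apply Theorem 3.5.2, we obtain for even $rm$,
$$P_E(rm, n) = (\mathbf{P}\{\xi_1 \in E\})^n \left( \frac{x^x e^{\lambda}}{\lambda^x e^x} \right)^n \frac{2\sqrt{x}}{\sigma} \, (1 + o(1))$$
uniformly in $x$, $x_0 \le x \le x_1$, where $x = \mathbf{E}\xi_1^{(E)} = \lambda \tanh\lambda$, $\sigma^2 = \mathbf{D}\xi_1^{(E)} =
\lambda^2 + x - x^2$.
From $\mathbf{P}\{\xi_1 \in E\} = e^{-\lambda} \cosh\lambda$, we obtain the final estimate: As $T, n \to \infty$,
$T/n \to \alpha$,
$$P_E(rm, n) = (\cosh\lambda)^n \left( \frac{x}{\lambda e} \right)^{xn} \frac{2\sqrt{x}}{\sigma} \, (1 + o(1)) \tag{3.5.6}$$
uniformly in $m$, $\delta \le m/T \le 1 - \delta$.
Setting $p = m/T$, $q = 1 - p$, and using the normal approximation to the
binomial distribution show that, as $T \to \infty$,
$$\binom{T}{m} = \binom{T}{m} p^m q^{T-m} \left( p^m q^{T-m} \right)^{-1} = \frac{1 + o(1)}{\sqrt{2\pi T p q} \; p^m q^{T-m}}$$
uniformly in $m$, $\delta \le m/T \le 1 - \delta$.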
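Let $\alpha = T/n$ and write $p = m/T$ in terms of $x = rm/n$ and $\alpha$. Then
$$\frac{m}{T} = \frac{x}{\alpha r}, \qquad \frac{T - m}{T} = \frac{\alpha r - x}{\alpha r},$$
and the estimate of $\binom{T}{m}$ takes the following form. As $T \to \infty$, $\delta \le m/T \le 1 - \delta$,
$$\binom{T}{m} = \frac{\alpha r}{\sqrt{2\pi x (\alpha r - x) \alpha n}} \left( \frac{(\alpha r)^{\alpha r}}{x^x (\alpha r - x)^{\alpha r - x}} \right)^{n/r} (1 + o(1)) \tag{3.5.7}$$
uniformly in $m$.
We combine the estimates (3.5.6) and (3.5.7) and obtain
$$\binom{T}{m} P_E(rm, n) = (f(\alpha, x))^n \, \frac{2 \alpha r \sqrt{x}}{\sigma \sqrt{2\pi x (\alpha r - x) \alpha n}} \, (1 + o(1)),$$
where
$$f(\alpha, x) = \cosh\lambda \left( \frac{x}{\lambda e} \right)^{x} \left( \frac{\alpha r - x}{x} \right)^{x/r} \left( \frac{\alpha r}{\alpha r - x} \right)^{\alpha}, \qquad x = \lambda \tanh\lambda.$$
The function $f(\alpha, x)$ increases as $\alpha$ increases,
$$f'_x(\alpha, x) = f(\alpha, x) \log\left( \frac{x}{\lambda} \left( \frac{\alpha r - x}{x} \right)^{1/r} \right) \to -\infty, \quad x \to 0,$$
and the derivative $f'_x(\alpha, x)$ has no more than two zeros. Therefore the system of
equations
$$f(\alpha, x) = 1, \qquad f'_x(\alpha, x) = 0, \qquad \lambda \tanh\lambda = x \tag{3.5.8}$$
has the unique solution $(\alpha_r, x_r, \lambda_r)$; at this point, the function $f(\alpha_r, x)$ as a function
of $x$ attains its maximum, which is equal to 1. Therefore, for all $x$, $0 < x < \alpha_r r$,
$$f(\alpha_r, x) \le f(\alpha_r, x_r) = 1.$$
In addition,
$$f(\alpha, x) < f(\alpha_r, x) \le 1, \quad \alpha < \alpha_r,$$
$$f(\alpha, x_r) > f(\alpha_r, x_r) = 1, \quad \alpha > \alpha_r.$$
This implies that the middle part of the sum tends to zero for $\alpha < \alpha_r$ and tends to
infinity for $\alpha > \alpha_r$.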
If we consider the estimates for the tails of the sum in Lemmas 3.5.1 and 3.5.2,
we obtain the assertion of Theorem 3.5.1 because system (3.5.8) can be easily
transformed to the form mentioned in the statement of the theorem. ∎
It would be interesting to find the limit distribution of the number of hypercycles.
Up to now, no one has succeeded even in proving that $S(A_{r,n,T})$ tends in probability
to infinity as $T, n \to \infty$, $T/n \to \alpha > \alpha_r$.
3.6. Reconstructing the true solution
We consider the system of equations in GF(2):
$$x_{i(t)} + x_{j(t)} = b_t, \quad t = 1, \ldots, T, \tag{3.6.1}$$
where the pairs $(i(t), j(t))$, $t = 1, \ldots, T$, are independent identically distributed
two-dimensional random vectors that take values $(i, j)$, $i < j$, $i, j = 1, \ldots, n$,
with equal probabilities $\binom{n}{2}^{-1}$.
In Section 3.1, we interpreted a system similar to (3.6.1) as the result of $T$ trials
performed with the aim of classifying $n$ objects by random pairwise comparisons,
and we set $b_t = 0$ if the comparison of $x_{i(t)}$ and $x_{j(t)}$ showed that these objects
were from the same class, and $b_t = 1$ otherwise, for $t = 1, \ldots, T$. If the
comparisons are not absolutely right, then the result of a comparison may deviate from
the true value. Suppose that $X^* = (x_1^*, \ldots, x_n^*)$ is the vector of true values of the
unknowns, and the column-vector $B^* = (b_1^*, \ldots, b_T^*)$ is obtained by substituting
$X^*$ into the left-hand side of system (3.6.1):
$$AX^* = B^*, \tag{3.6.2}$$
where $A$ is the matrix of system (3.6.1).
If the measurements are not precise, then it is natural to suppose that
$$b_t = b_t^* + \varepsilon_t, \quad t = 1, \ldots, T,$$
where $\varepsilon_1, \ldots, \varepsilon_T$ are independent identically distributed random variables that
do not depend on $A$ and take the values 0 and 1. These random variables can be
interpreted as errors. Let
$$p = \frac{1 - \Delta}{2} = \mathbf{P}\{\varepsilon_1 = 1\}, \qquad q = \frac{1 + \Delta}{2} = \mathbf{P}\{\varepsilon_1 = 0\}, \tag{3.6.3}$$
where $\Delta$ is called the excess.
The problem is to estimate or reconstruct the vector $X^* = (x_1^*, \ldots, x_n^*)$ on the
basis of the matrix $A$ and the right-hand side $B = (b_1, \ldots, b_T)$ of system (3.6.1).
In a similar situation over the field of real numbers, an estimate of the true
solution of a system of linear equations with perturbed right-hand sides can be
found by the least-squares method. Under some conditions on the matrix and the
errors in the right-hand sides, the least-squares method provides an estimate that
converges to the true solution as the number of equations tends to infinity. In
contrast to the field of real numbers, in GF(2) a good estimate $\hat{X} = (\hat{x}_1, \ldots, \hat{x}_n)$
coincides with the true solution $X^* = (x_1^*, \ldots, x_n^*)$ with probability tending to 1
as $T \to \infty$.
As usual, we associate the graph $\Gamma_{n,T}$ with the left-hand side of system (3.6.1).
The graph $\Gamma_{n,T}$ has $n$ labeled vertices corresponding to the unknowns $x_1, \ldots, x_n$
and $T$ edges $e_t = (i(t), j(t))$, $t = 1, \ldots, T$. The edges $e_1, \ldots, e_T$ are independent
and assume the $n(n-1)/2$ possible values with equal probability. Therefore, the
graph $\Gamma_{n,T}$ may have multiple edges.
It is clear that along with the vector $X^*$, the vector $\bar{X}^* = (\bar{x}_1^*, \ldots, \bar{x}_n^*)$ with
elements $\bar{x}_i^* = x_i^* + 1$, $i = 1, \ldots, n$, satisfies the system (3.6.2). The pair $X^*$,
$\bar{X}^*$ is uniquely determined by the system (3.6.2) if the graph $\Gamma_{n,T}$ is connected, in
other words, if the system (3.6.2) contains all the unknowns and is not decomposed
into subsystems with disjoint sets of unknowns.
Denote by pnj the probability that the graph Tnj is connected. It follows from
Theorem 2.3.8 that if n, T -» oo such that T = /?log/? +an +o(n), where a is a
constant, then
-e~a
Pnj -*- e
Thus, if n, T —> oo and the pair X*, X* is determined by the system C.6.2) with
probability tending to 1, then
T in
n\ogn log/?'
where wn —> oo.
In this section, we present three algorithms for reconstructing the true solution
of system (3.6.1) with perturbed right-hand sides. We first describe the
reconstruction method that can be called the voting algorithm. This algorithm consists of
correcting the right-hand sides $b_1, \ldots, b_T$ of the system (3.6.1) by the majority
rule. Let the system (3.6.1) contain the subsystem with $m_{ij}$, $i < j$, equations:
$$x_i + x_j = a_{ij}^{(1)}, \quad \ldots, \quad x_i + x_j = a_{ij}^{(m_{ij})}. \tag{3.6.4}$$
The true value of $a_{ij}^{(1)}, \ldots, a_{ij}^{(m_{ij})}$ equals $a_{ij}^* = x_i^* + x_j^*$.
We set $\hat{a}_{ij} = 1$ if
$$a_{ij}^{(1)} + \cdots + a_{ij}^{(m_{ij})} > m_{ij}/2,$$
and $\hat{a}_{ij} = 0$ otherwise.
Under some conditions, system (3.6.1) is indecomposable and $\hat{a}_{ij} = a_{ij}^*$ for all
$i, j = 1, \ldots, n$; thus the true solution is reconstructed.
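The voting algorithm can be sketched as a small simulation. The following is an illustration under stated assumptions, not the author's procedure in every detail: ties in the vote are resolved toward 0, the corrected system is solved by a breadth-first pass that fixes $x_1 = 0$ (so the solution is recovered only up to the complement $\bar{X}^*$), and all function names and parameter values are illustrative.

```python
import random
from collections import defaultdict, deque


def simulate_system(n, T, p_err, rng):
    """Random pairs (i, j), i < j, with right-hand sides x_i + x_j + noise (mod 2)."""
    truth = [rng.randrange(2) for _ in range(n)]
    equations = []
    for _ in range(T):
        i, j = sorted(rng.sample(range(n), 2))
        noise = 1 if rng.random() < p_err else 0
        equations.append((i, j, truth[i] ^ truth[j] ^ noise))
    return truth, equations


def vote(equations):
    """Majority rule over all equations that share the same pair (i, j)."""
    ones, total = defaultdict(int), defaultdict(int)
    for i, j, b in equations:
        total[(i, j)] += 1
        ones[(i, j)] += b
    return {pair: int(2 * ones[pair] > total[pair]) for pair in total}


def solve_from_votes(n, a_hat):
    """Propagate x_0 = 0 along the edges; entries stay None outside the component of 0."""
    adj = defaultdict(list)
    for (i, j), a in a_hat.items():
        adj[i].append((j, a))
        adj[j].append((i, a))
    x = [None] * n
    x[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, a in adj[u]:
            if x[v] is None:
                x[v] = x[u] ^ a
                queue.append(v)
    return x


rng = random.Random(1)
truth, equations = simulate_system(n=6, T=3000, p_err=0.2, rng=rng)
a_hat = vote(equations)
estimate = solve_from_votes(6, a_hat)
print(truth, estimate)
```

With these parameters each pair is observed about 200 times, so the majority vote is wrong only with negligible probability, and the estimate coincides with the true vector or with its complement.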
Denote by $P(n, T)$ the probability of reconstructing the true solution of system
(3.6.1) by the voting algorithm, that is,
$$P(n, T) = \mathbf{P}\{\hat{a}_{ij} = a_{ij}^*,\ i, j = 1, \ldots, n\}.$$
Theorem 3.6.1. If $n, T \to \infty$ and $\Delta \to 0$ such that
$$\frac{\Delta^2 T}{n^2 \log n} \to \infty,$$
then $P(n, T) \to 1$.
Proof. Let
$$\mu(n, T) = \min m_{ij},$$
where the minimum is over all subsystems of the form (3.6.4). It is clear that
$$P(n, T) = \mathbf{P}\{\mu(n, T) > m\} P_m(n, T) + \mathbf{P}\{\mu(n, T) \le m\} \bar{P}_m(n, T), \tag{3.6.5}$$
where $P_m(n, T)$ and $\bar{P}_m(n, T)$ are the conditional probabilities of
reconstructing the true solution under the conditions $\{\mu(n, T) > m\}$ and $\{\mu(n, T) \le m\}$,
respectively.
We obtain a rough estimate for the probability $\mathbf{P}\{\mu(n, T) > m\}$. It is clear that
$\mathbf{P}\{\mu(n, T) > m\}$ is the probability that each cell contains more than $m$ particles
in the classical scheme of allocating $T$ particles into $\binom{n}{2}$ cells. Denote by $\eta_i$ the
number of particles in the $i$th cell and put $\xi_i = 1$ if $\eta_i \le m$, and $\xi_i = 0$ if $\eta_i > m$,
$i = 1, \ldots, \binom{n}{2}$. By (1.1.1),
$$\mathbf{P}\{\mu(n, T) \le m\} = \mathbf{P}\{\xi_1 + \cdots + \xi_{\binom{n}{2}} > 0\} \le \binom{n}{2} \mathbf{P}\{\eta_1 \le m\}.$$
The random variable $\eta_1$ has the binomial distribution with $T$ trials and the
probability of success $\binom{n}{2}^{-1}$. Since $a = \mathbf{E}\eta_1 = T/\binom{n}{2} \to \infty$, the normal
approximation is valid for this distribution. We choose $m = a(1 - \Delta)$, assume that
(As/aK/ T -> 0, and estimate the probability P{^i < m}. By taking into account
the choice of m and the equality D771 = a(l + o{\)), we obtain
P{/71 < n] = P{(m - a)/fihH < (m - a)/y
/ e~u '2du{\+o(\)).
J-oq
Hence, there exists a constant $c$ such that
$$\mathbf{P}\{\eta_1 \le m\} \le c e^{-\Delta^2 a/2}.$$
Thus, for $m = a(1 - \Delta)$,
$$\mathbf{P}\{\mu(n, T) \le m\} \to 0 \tag{3.6.6}$$
because $\Delta^2 a/\log n \to \infty$ and $n^2 e^{-\Delta^2 a/2} \to 0$.
Now we have to show that under the conditions of the theorem, $P_m(n, T) \to 1$.
In other words, we have to prove that $\hat{a}_{ij} = a_{ij}^*$ for all $i, j = 1, \ldots, n$ with
probability tending to 1. The additional requirement of the indecomposability of
the system (3.6.1) or of the connectedness of the graph $\Gamma_{n,T}$ is obviously fulfilled.
Recall that $b_t = b_t^* + \varepsilon_t$. We may assume that in the subsystem (3.6.4),
$$a_{ij}^{(l)} = a_{ij}^* + \varepsilon_{ij}^{(l)}, \quad l = 1, \ldots, m_{ij},$$
where the random variables $\varepsilon_{ij}^{(1)}, \ldots, \varepsilon_{ij}^{(m_{ij})}$ are independent and have the same
distribution as $\varepsilon_1, \ldots, \varepsilon_T$ from the right-hand side of (3.6.1). Denote by $\zeta(n, T)$
the number of wrong decisions, that is, the number of realized events $\{\hat{a}_{ij} \ne a_{ij}^*\}$,
$i, j = 1, \ldots, n$. Now let $\zeta_{ij} = 1$ if $\varepsilon_{ij}^{(1)} + \cdots + \varepsilon_{ij}^{(m_{ij})} \ge m_{ij}/2$, and $\zeta_{ij} = 0$
otherwise. It is clear that the number of wrong decisions can be represented in the
form
$$\zeta(n, T) = \sum_{i < j} \zeta_{ij},$$
and
$$1 - P_m(n, T) = \mathbf{P}\{\zeta(n, T) > 0 \mid \mu(n, T) > m\} \le \binom{n}{2} \mathbf{P}\{\zeta_{12} = 1 \mid \mu(n, T) > m\}. \tag{3.6.7}$$
Now we derive estimates for
$$\mathbf{P}\{\zeta_{12} = 1 \mid \mu(n, T) > m\} = \mathbf{P}\{\varepsilon_{12}^{(1)} + \cdots + \varepsilon_{12}^{(m_{12})} \ge m_{12}/2 \mid \mu(n, T) > m\}.$$
The random variables $\varepsilon_{12}^{(1)}, \ldots, \varepsilon_{12}^{(m_{12})}$ are independent and have the same
distribution as the random variables $\varepsilon_1, \ldots, \varepsilon_T$ from the right-hand side of system (3.6.1).
We set $S_k = \varepsilon_1 + \cdots + \varepsilon_k$ and estimate
$$\mathbf{P}\{S_k \ge k/2\} = \mathbf{P}\{S_k - \mathbf{E}S_k \ge k\Delta/2\}.$$
Here, and later in this section, we use the following inequality of exponential type
for the sum $S_k$ that was proposed by Hoeffding [59] and can be found in [122] (see
Theorem 1.1.16). For any positive $\Delta$,
$$\mathbf{P}\{S_k - \mathbf{E}S_k \ge k\Delta/2\} \le e^{-k\Delta^2/2}. \tag{3.6.8}$$
Therefore
$$\mathbf{P}\{\zeta_{12} = 1 \mid \mu(n, T) > m\} \le e^{-m\Delta^2/2},$$
and from (3.6.7), we obtain
$$1 - P_m(n, T) \le \binom{n}{2} e^{-m\Delta^2/2}. \tag{3.6.9}$$
For $m = a(1 - \Delta)$, $a = T/\binom{n}{2}$, under the conditions of Theorem 3.6.1, the
right-hand side of (3.6.9) tends to zero. Thus, the assertion of the theorem now follows
from (3.6.5), (3.6.6), and (3.6.9). ∎
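The exponential inequality (3.6.8) can be compared with the exact binomial tail. The sketch below assumes, as in (3.6.3), that each error equals 1 with probability $p = (1 - \Delta)/2$, so that $\{S_k \ge k/2\}$ is the event that at least half of the $k$ votes are wrong. The function names are illustrative.

```python
from math import comb, exp


def tail_at_least_half(k, delta):
    """Exact P{S_k >= k/2} for S_k binomial with k trials and p = (1 - delta)/2."""
    p = (1 - delta) / 2
    start = (k + 1) // 2             # smallest integer that is >= k/2
    return sum(comb(k, j) * p ** j * (1 - p) ** (k - j) for j in range(start, k + 1))


def hoeffding_bound(k, delta):
    """Right-hand side of (3.6.8): exp(-k * delta^2 / 2)."""
    return exp(-k * delta ** 2 / 2)


for k in (10, 51, 200):
    for delta in (0.1, 0.3, 0.6):
        print(k, delta, tail_at_least_half(k, delta), hoeffding_bound(k, delta))
```

The bound is far from tight for small $k\Delta^2$, but it decays at the rate needed in (3.6.9) once $m\Delta^2$ is large compared with $\log n$.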
We now describe the second algorithm for reconstructing the true solution of
system (3.6.1), which can be called the method of coordinate testing.
We choose a vector $X^{(0)} = (x_1^{(0)}, \ldots, x_n^{(0)})$ by random sampling from the
set of all $n$-dimensional vectors over GF(2). Denote by $B^{(0)} = (b_1^{(0)}, \ldots, b_T^{(0)})$
the column-vector obtained by substituting $X^{(0)}$ for $X$ in the left-hand side of
(3.6.1). Let $\beta(X^{(0)})$ be the number of the coordinates of $B^{(0)}$ that coincide with
the corresponding coordinates of the vector $B = (b_1, \ldots, b_T)$ of the right-hand
sides of system (3.6.1). We construct a vector $X^{(1)} = (x_1^{(1)}, \ldots, x_n^{(1)})$ from $X^{(0)}$
and system (3.6.1) and show that, with probability tending to 1, the vector $X^{(1)}$
coincides with the true solution $X^*$.
Therefore we consider the vectors
$$X_{i,0} = (x_1^{(0)}, \ldots, x_{i-1}^{(0)}, 0, x_{i+1}^{(0)}, \ldots, x_n^{(0)}),$$
$$X_{i,1} = (x_1^{(0)}, \ldots, x_{i-1}^{(0)}, 1, x_{i+1}^{(0)}, \ldots, x_n^{(0)}),$$
and calculate the values $\beta(X_{i,0})$ and $\beta(X_{i,1})$, defined for the vectors $X_{i,0}$ and $X_{i,1}$
in the same way $\beta(X^{(0)})$ was defined for $X^{(0)}$.
Denote by %(X) the number of coordinates of the vectors X and X* that coin-
coincide. The value
where X = {x\,... ,xn) = (x\ + 1,..., xn + 1), is called the number of coinci-
coincidences.
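One step of coordinate testing can be sketched as follows. The update rule used here is an assumption of this sketch: coordinate $i$ of $X^{(1)}$ is set to whichever of the values 0 and 1 gives the larger $\beta$, with every trial vector built from $X^{(0)}$, and the old value is kept on a tie. The starting vector is chosen with most coordinates true instead of at random, so that the outcome does not depend on a lucky draw. All names and parameter values are illustrative.

```python
import random


def noisy_system(truth, T, p_err, rng):
    """Equations (i, j, b) with b = x_i + x_j + noise (mod 2) for random pairs i < j."""
    n = len(truth)
    equations = []
    for _ in range(T):
        i, j = sorted(rng.sample(range(n), 2))
        noise = 1 if rng.random() < p_err else 0
        equations.append((i, j, truth[i] ^ truth[j] ^ noise))
    return equations


def beta(x, equations):
    """Number of equations x_i + x_j = b (mod 2) satisfied by the vector x."""
    return sum((x[i] ^ x[j]) == b for i, j, b in equations)


def coordinate_testing(x0, equations):
    """Assumed rule: x1[i] is the value (0 or 1) with the larger beta(X_{i,value})."""
    x1 = list(x0)
    for i in range(len(x0)):
        scores = []
        for value in (0, 1):
            trial = list(x0)         # every trial vector starts from X^(0)
            trial[i] = value
            scores.append(beta(trial, equations))
        if scores[0] != scores[1]:
            x1[i] = 0 if scores[0] > scores[1] else 1
    return x1


rng = random.Random(7)
truth = [rng.randrange(2) for _ in range(9)]
equations = noisy_system(truth, T=4000, p_err=0.2, rng=rng)
x0 = list(truth)
x0[0] ^= 1                           # two wrong coordinates, seven true ones
x0[1] ^= 1
x1 = coordinate_testing(x0, equations)
print(truth, x0, x1)
```

If the starting vector has more wrong than true coordinates, the same step is expected to move toward the complement $\bar{X}^*$ instead, which is why the analysis below fixes the case $\xi(X^{(0)}) > \xi(\bar{X}^{(0)})$.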
Lemma 3.6.1. If $n \to \infty$, then the distribution of the random variable
$(2\eta(X^{(0)}) - n)/\sqrt{n}$ converges weakly to the distribution of the modulus of the
random variable that has the normal distribution with parameters (0, 1).
Proof. Since the vector $X^{(0)}$ is chosen from the set of all $n$-dimensional vectors
by random sampling with equal probabilities, the random variable $S_n = \xi(X^{(0)})$
has the binomial distribution with parameters $(n, 1/2)$. From the obvious equality
$\xi(X) + \xi(\bar{X}) = n$, the random variable $\eta(X^{(0)})$ is represented in the form
$$\eta(X^{(0)}) = \max(S_n, n - S_n).$$
It is clear that
$$\frac{2\eta(X^{(0)}) - n}{\sqrt{n}} = \max\left( \frac{2S_n - n}{\sqrt{n}}, \frac{n - 2S_n}{\sqrt{n}} \right) = \frac{|2S_n - n|}{\sqrt{n}},$$
and the assertion of Lemma 3.6.1 follows from the convergence of the distribution
of $(2S_n - n)/\sqrt{n}$ to the normal distribution with parameters (0, 1). ∎
We can now prove the following assertion concerning the algorithm of
coordinate testing.
Theorem 3.6.2. If $n, T \to \infty$ and $\Delta \to 0$ such that
$$\frac{\Delta^2 T}{n^2 \log n} \to \infty,$$
then
$$\mathbf{P}\{X^{(1)} = X^*\} \to 1.$$
Proof. For definiteness, assume $\xi(X^{(0)}) > \xi(\bar{X}^{(0)})$.
The coordinates of $X^{(0)}$ that coincide with the corresponding coordinates of
$X^*$ are called true, whereas those that do not coincide are called wrong.
For the algorithm of coordinate testing to lead to the true solution, the following
obvious conditions must be fulfilled. For each coordinate of the vector $X^{(0)}$, the
value of $\beta(X^{(0)})$ must increase if we replace the wrong value of the coordinate by
the true value, and the value of $\beta(X^{(0)})$ must strictly decrease if we replace the
true value by the wrong one.
We separate all the equations of the system (3.6.1) that contain $x_i$, and denote the
number of such equations by $n_i$. Replacing $x_i^{(0)}$ by $\bar{x}_i^{(0)}$ changes the contribution
in $\beta(X^{(0)})$ of these equations only, and each equation containing $x_i$ contributes
1 or $-1$. If $x_i^{(0)}$ is wrong, then the increment of $\beta(X^{(0)})$ due to replacing $x_i^{(0)}$ by
$\bar{x}_i^{(0)}$ is equal to the random variable $\beta_i(X^{(0)})$ such that $(\beta_i(X^{(0)}) + n_i)/2$ has the
binomial distribution with parameters $(n_i, p_i)$, where $p_i$ is the probability that the
coincidence in a fixed equation containing $x_i$ appears after substituting $\bar{x}_i^{(0)}$ for
$x_i^{(0)}$, provided $x_i^{(0)}$ is wrong. It is not difficult to see that
$$p_i = vq + (1 - v)p, \tag{3.6.10}$$
where $q = \mathbf{P}\{b_t = b_t^*\}$, $p = 1 - q$, and $v$ is the probability that the second variable
in the equation has the true value. The second variable takes values from the set
of the remaining unknowns
with equal probabilities. Therefore $v = (k - 1)/(n - 1)$, where $k$ is the
number of true coordinates of $X^{(0)}$, which equals $\xi(X^{(0)})$ under the assumption that
$\xi(X^{(0)}) > \xi(\bar{X}^{(0)})$. It follows from Lemma 3.6.1 and equality (3.6.10) that
which we write as
Pi = - + -—7=r, C.6.11)
2 2v«
where
n-\
By assumption, ?„ is asymptotically normal with parameters @, 1).
Therefore
\[ P\left\{ |\xi_n| > \left( \frac{\Delta^2 T}{n^2 \log n} \right)^{-1/4} \right\} \to 1 \tag{3.6.12} \]
because $\Delta^2 T/(n^2 \log n) \to \infty$.
Next, we find a lower bound for $n_i$, $i = 1, \ldots, n$. To this end, we take into
account only the first variable in each equation. Then we obtain the classical
scheme of equiprobable allocation of $T$ particles into $n$ cells, and by applying the
corresponding results on the distribution of the minimum of contents of cells [90],
we find that
\[ P\Bigl\{ \min_{1 \le i \le n} n_i > T/(2n) \Bigr\} \to 1. \tag{3.6.13} \]
For the increments $\zeta_i(X^{(0)})$, we have
\[ P\{\zeta_i(X^{(0)}) < 0 \mid x_i^{(0)} \text{ is wrong}\}
 = P\{(\zeta_i(X^{(0)}) + n_i)/2 < n_i/2 \mid x_i^{(0)} \text{ is wrong}\}
 = P\{S_{n_i} < n_i/2\}, \]
where $S_{n_i}$ has the binomial distribution with parameters $(n_i, p_i)$. From (3.6.11),
we find that
\[ P\{S_{n_i} < n_i/2\} = P\{S_{n_i} - \mathbf{E} S_{n_i} < -\Delta |\xi_n| n_i/(2\sqrt{n})\}. \]
When we use estimate C.6.8) of the exponential type for the binomial distribution
and take into account C.6.12) and C.6.13), we obtain
x-l/2
P{Snt < m/2} < exp j
n2 \n2 log/i
In a similar way, we obtain the bound
>
0 | jc/0) is true) < exp -
A2T ( A2T
-1/2
n2 \«2log«
Therefore an upper estimate for the probability of at least one wrong decision
while testing all the coordinates of the vector X^ is
i=\
< 2n exp
< 0 | jcP is wrong) +
a 2t / a 2t \ -1/2
> 0
is true})
n2 \n2\ognj
and tends to zero under the conditions of the theorem because
A2T/(n2\ogn) -> oo.
With the help of a preliminary search of the $n$-dimensional vectors, it is possible
to select an initial vector $X^{(0)}$ with a great number $\eta(X^{(0)})$ of coordinates
coinciding with the corresponding coordinates of the true solution $X^*$. If the algorithm
for coordinate testing begins with this initial vector, then a much smaller number
of equations is needed to reconstruct the true solution. This number is comparable
to the number of edges needed for the graph $\Gamma_{n,T}$ to be connected.
Theorem 3.6.3. If $n, T \to \infty$ and $\Delta \to 0$ such that
\[ \frac{\Delta^2 T}{n \log n} \to \infty, \]
then there exists an algorithm that reconstructs the true solution of system (3.6.1)
with probability tending to 1.
Proof. The algorithm, which gives the true solution under the conditions of the
theorem, begins with a preliminary search of an initial vector $X^{(0)}$ with a large
number of coincidences with the true vector $X^*$. The choice of $X^{(0)}$ is determined
by a search of all $n$-dimensional vectors. To this end, we choose the level
$l = Tq - u_T\sqrt{T}$, where $q = P\{b_t = b_t^*\} = (1 + \Delta)/2$ and $u_T = \Delta\sqrt{T}/18$, and select
the vectors $X$ for which $\beta(X) \ge l$. Recall that $\beta(X)$ is the number of coincident
coordinates of the vector $B = (b_1, \ldots, b_T)$ and the vector of the right-hand sides
of system (3.6.1) that are obtained when $X$ is substituted into the left-hand side of
the system.
The vector $X^*$ will be selected with probability tending to 1. Indeed,
\[ P\{\beta(X^*) < Tq - u_T\sqrt{T}\} = P\{S_T - \mathbf{E} S_T < -u_T\sqrt{T}\}, \]
where $S_T$ is the number of successes in $T$ independent trials with the probability
of success equal to $q = (1 + \Delta)/2$. By using estimate (3.6.8), we find that
\[ P\{\beta(X^*) < l\} \le e^{-u_T^2/2}, \]
and the complementary probability $P\{\beta(X^*) \ge l\} \to 1$ because $u_T \to \infty$.
If $\xi(X) = s$, then the probability of the coincidence of a fixed component of
the right-hand sides is
\[ p(s) = \frac{q s(s-1)}{n(n-1)} + \frac{q(n-s)(n-s-1)}{n(n-1)} + \frac{2s(1-q)(n-s)}{n(n-1)}, \]
and, since $q = (1 + \Delta)/2$, we find
\[ p(s) = \frac{1}{2} + \frac{\Delta\bigl((2s - n)^2 - n\bigr)}{2n(n-1)}. \]
For example, let $s < 2n/3$. Then $p(s) \le 1/2 + \Delta/9$, beginning with some $n$, and
for any fixed $X$ with $\xi(X) = s < 2n/3$,
\[ P\{\beta(X) \ge l\} = P\{S_T \ge Tq - u_T\sqrt{T}\}
 \le P\{S_T - \mathbf{E} S_T \ge 7\Delta T/18 - u_T\sqrt{T}\}
 = P\{S_T - \mathbf{E} S_T \ge \Delta T/3\}, \]
where $S_T$ is the number of successes in $T$ independent trials with probability $p(s)$
of success.
By using the inequality (3.6.8) of exponential type, we find
\[ P\{\beta(X) \ge l\} \le e^{-\Delta^2 T/18}. \]
The probability that at least one of the vectors $X$ with $\xi(X) < 2n/3$ will be selected
does not exceed $2^n e^{-\Delta^2 T/18}$, and under the conditions of the theorem this
probability tends to zero. Thus, with the help of the exhaustive search, it is possible
to select, with probability tending to 1, a vector $X^{(0)}$ such that $\xi(X^{(0)}) \ge 2n/3$.
Beginning the algorithm for coordinate testing with this vector $X^{(0)}$, we find, using
the notation introduced in the proof of Theorem 3.6.2, that
\[ P\{\zeta_i(X^{(0)}) < 0 \mid x_i^{(0)} \text{ is wrong}\} = P\{S_{n_i} - \mathbf{E} S_{n_i} < -\Delta |\xi_n| n_i/(2\sqrt{n})\}. \]
Using estimate (3.6.8) and taking into account that with probability tending to 1,
$|\xi_n| \ge \sqrt{n}/3$ for the selected vector and $n_i \ge T/(2n)$, we find the estimate
\[ P\{\zeta_i(X^{(0)}) < 0 \mid x_i^{(0)} \text{ is wrong}\} \le P\{S_{n_i} - \mathbf{E} S_{n_i} < -\Delta n_i/6\} \le e^{-\Delta^2 T/(36n)}. \]
Similarly we obtain
\[ P\{\zeta_i(X^{(0)}) > 0 \mid x_i^{(0)} \text{ is true}\} \le e^{-\Delta^2 T/(36n)}. \]
As in the proof of Theorem 3.6.2, an upper bound for the probability of at least
one wrong decision, while all $n$ coordinates of $X^{(0)}$ are tested, is $2n e^{-\Delta^2 T/(36n)}$
and tends to zero under the conditions of the theorem. ■
Thus, if we use the exhaustive search, then the true solution can be reconstructed
under the condition $\Delta^2 T/(n \log n) \to \infty$. If the number of equations $T$ is such
that $\Delta^2 T/(n^2 \log n) \to \infty$, then the reconstruction can be realized by the voting
algorithm, which is more economical with respect to the number of operations.
Clearly, there is considerable interest in the algorithms that lead to the true solution
with probability tending to 1 under intermediate conditions on the number of
equations $T$ and do not require the exhaustive search of all $2^n$ vectors.
Let us describe an algorithm that will be referred to as $A_2$. Consider all $\binom{T}{2}$
equations obtained as the pairwise sums of the equations of the system (3.6.1).
Among the equations obtained by this operation, there are equations that contain
either four, or two, or zero unknowns each. Denote by $S_2$ the subsystem that
includes all the equations with two unknowns each. The algorithm $A_2$ ends with
the application of the voting algorithm to the subsystem $S_2$. The following theorem
gives the conditions under which the algorithm $A_2$ reconstructs the true solution.
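The first stage of $A_2$ can be sketched as follows, assuming each equation is stored as a triple `(i, j, b)` meaning $x_i + x_j = b$ over GF(2) with $i \ne j$. The function names and the data layout are illustrative assumptions. The sketch forms all pairwise sums, keeps those with exactly two unknowns, and takes a majority vote for each pair; recovering the unknowns themselves from the voted pair sums is left to the voting algorithm described earlier in the section.

```python
from collections import defaultdict
from itertools import combinations


def two_unknown_sums(equations):
    """Group the right-hand sides of pairwise sums that contain exactly two unknowns."""
    votes = defaultdict(list)
    for (i1, j1, b1), (i2, j2, b2) in combinations(equations, 2):
        # Over GF(2) a shared unknown cancels, so the support is the symmetric difference.
        support = {i1, j1} ^ {i2, j2}
        if len(support) == 2:  # sizes 4 and 0 are discarded
            votes[tuple(sorted(support))].append(b1 ^ b2)
    return dict(votes)


def majority(bits):
    """Majority vote; ties are resolved as 0."""
    return 1 if 2 * sum(bits) > len(bits) else 0


if __name__ == "__main__":
    eqs = [(0, 1, 1), (0, 2, 0), (1, 2, 1)]
    sums = two_unknown_sums(eqs)
    print({pair: majority(bits) for pair, bits in sums.items()})
```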
Theorem 3.6.4. If $n, T \to \infty$ and $\Delta \to 0$ such that
\[ \frac{\Delta^4 T^2}{n^3 \log n} \to \infty, \]
then the algorithm $A_2$ reconstructs the true solution with probability tending to 1.
Proof. Let $i$ and $j$ be arbitrary, assume $i < j$, and consider all equations of the
system $S_2$ of the form
\[ x_i + x_j = b_{ij}^{(k)}, \quad k = 1, \ldots, m_{ij}. \tag{3.6.14} \]
The equality $m_{ij} = m$ means that the graph $\Gamma_{n,T}$ corresponding to system
(3.6.1) contains exactly $m$ vertices, say $v_1, \ldots, v_m$, such that the graph $\Gamma_{n,T}$
contains the edges $(v_1, i), (v_1, j), \ldots, (v_m, i), (v_m, j)$. The right-hand sides
$b_{ij}^{(1)}, \ldots, b_{ij}^{(m_{ij})}$ are the pairwise sums of $2m_{ij}$ independent random variables, and
therefore they are independent and, according to Lemma 3.2.1, take the true value
$b_{ij}^* = x_i^* + x_j^*$ with probability $(1 + \Delta^2)/2$ and the wrong value with probability
$(1 - \Delta^2)/2$.
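The value $(1 + \Delta^2)/2$ can be checked directly: the sum of two right-hand sides is true exactly when both are true or both are distorted. The short sketch below performs this computation in exact rational arithmetic; the function name `prob_sum_true` is illustrative.

```python
from fractions import Fraction


def prob_sum_true(delta):
    """P{b + b' is true} when b, b' are independent and each is true with probability (1 + delta)/2."""
    q = (1 + delta) / 2
    return q * q + (1 - q) * (1 - q)  # both true, or both distorted


if __name__ == "__main__":
    d = Fraction(1, 10)
    print(prob_sum_true(d), (1 + d * d) / 2)
```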
Let $b_{ij} = 1$ if $b_{ij}^{(1)} + \cdots + b_{ij}^{(m_{ij})} > m_{ij}/2$, and $b_{ij} = 0$ otherwise. As in
the proof of Theorem 3.6.1, we denote by $\mu(n, T)$ the minimum value of $m_{ij}$
over all subsystems of the form (3.6.14). As in (3.6.5), the probability $P(n, T)$ of
reconstructing the true solution can be represented in the form
\[ P(n, T) = P\{\mu(n, T) \ge m\} P_m(n, T) + P\{\mu(n, T) < m\} \bar P_m(n, T), \tag{3.6.15} \]
where $P_m(n, T)$ and $\bar P_m(n, T)$ are the conditional probabilities of reconstructing
the true solution by the majority method under the condition that $\{\mu(n, T) \ge m\}$
and $\{\mu(n, T) < m\}$, respectively.
As in the proof of Theorem 3.6.1, we need to estimate $P\{\mu(n, T) \ge m\}$, but
here this estimation is more laborious.
Let $\xi_{ij} = 1$ if $m_{ij} < m$, and $\xi_{ij} = 0$ if $m_{ij} \ge m$, $i < j$, $i, j = 1, \ldots, n$. It is
clear that
Let $\mu_i = 1$ if the edges $(i + 2, 1)$ and $(i + 2, 2)$ occur in $\Gamma_{n,T}$, and $\mu_i = 0$
otherwise; and $\nu_i = 1$ if exactly one of the edges $(i + 2, 1)$, $(i + 2, 2)$ occurs in
$\Gamma_{n,T}$, and $\nu_i = 0$ if the edges $(i + 2, 1)$ and $(i + 2, 2)$ do not occur in $\Gamma_{n,T}$. The
random variable $m_{12}$ can be represented as the following sum of indicators:
\[ m_{12} = \mu_1 + \cdots + \mu_{n-2}, \]
and
m
J2 J2 ^i-'*' C-6J7)
where />,-, ...,t is the probability that //,;,,..., mk take the value 1 and all the other
random variables take the value 0.
It is not difficult to see that $(M_t, N_t)$, where
\[ M_t = \mu_1 + \cdots + \mu_t, \qquad N_t = \nu_1 + \cdots + \nu_t, \]
is a Markov chain because $(\mu_{t+1}, \nu_{t+1})$ depends only on the number of edges used
to construct the random variables $\mu_1, \ldots, \mu_t, \nu_1, \ldots, \nu_t$, $t = 1, \ldots, n - 2$. More
precisely, let
p(t | y,_i,Z,_i)
7,_i, Z,_i) = P{m, = 0
By using this notation, we can write the probability pix...ik that ^t,-,,..., ^ take
the value 1 and all the other random variables take the value 0 in the form
Pi\...ik =
M = 0, / # /"i,..., ik I vi = z\,..., vn^2 = zn-2)
= q(\ | Y0,Z0)---q(h-\
where Zo = 7o = 0, Zt = z\ + ¦ • • + zt, and Yt is the number of i\, ..., i^
do not exceed t.
We now estimate the probabilities p{t \ Y, Z) and q{t \ Y,Z). It is clear that
p(t | 7, Z) + #(f | 7, Z) = 1, and the probability p{t \ 7, Z) does not depend on
t and equals the probability pi{s, N) that two fixed places corresponding to the
edge A, t), B, t) will be occupied after allocating s = T — 27 — Z edges into
N = B) — Z places in the classical scheme of allocation of particles. Therefore
s\ ( 2\s~2
^ N
and we have the following estimates:
s(s-\)/ 2V~2
»*.»>
s(s-\) ^ (j-2)! / _ 2_
TV2 /^ k\ll(s-k-l-2)\Nk+l \ N
IV
_ d _ 1
C, V +
V N) JV2
Since
T-3n<s = T-Y-Z<T,
n(n - 3)/2 < N = (j - Z < n(n -
we obtain for all & = 0, 1,... ,n — 2,
n- ¦ < Pknn~k~2
Pi\...ik 5: -r y ,
where
P = max^(/ | Yt-\, Zt-\),
Q = l-min/>(f I ^-l, ^-i).
Therefore it follows from (3.6.16) and (3.6.17) that
\[ P\{m_{12} < m\} \le (P + Q)^{n-2} P\{\xi_1 + \cdots + \xi_{n-2} < m\}, \]
where $\xi_1, \ldots, \xi_{n-2}$ are independent identically distributed random variables,
\[ P\{\xi_1 = 1\} = P/(P + Q), \qquad P\{\xi_1 = 0\} = Q/(P + Q). \]
As«,r^ ooandT/g) -> 0,
P 47
and under the conditions of the theorem,
\[ (P + Q)^{n-2} = 1 + o(1). \tag{3.6.18} \]
The random variable $\zeta_{n-2} = \xi_1 + \cdots + \xi_{n-2}$ has the binomial distribution with
parameters $(n - 2, P/(P + Q))$.
Let $a = \mathbf{E}\zeta_{n-2} = (n - 2)P/(P + Q)$ and $m = a(1 - \Delta^2)$. We assume that $T$
is not too large, so that $\Delta^4 T^2/n^{10/3} \to 0$. Then, for sufficiently large $n$,
,-A2>/2
./—00
'2jt J -00
and there exists a constant $c$ such that
\[ P\{\zeta_{n-2} < m\} \le c e^{-\Delta^4 a/8}. \tag{3.6.19} \]
Thus, by virtue of (3.6.16), (3.6.18), and (3.6.19),
\[ P\{\mu(n, T) < m\} \le c n^2 e^{-\Delta^4 a/8} \to 0 \tag{3.6.20} \]
because, under the conditions of the theorem, $\Delta^4 T^2/(n^3 \log n) \to \infty$, and
consequently, $n^2 e^{-\Delta^4 a/8} \to 0$.
As in the proof of Theorem 3.6.1, we have to show that under the conditions
of Theorem 3.6.4, the system $S_2$ is indecomposable and $P_m(n, T) \to 1$. In other
words, we have to show that $b_{ij} = b_{ij}^*$ for all $i, j = 1, \ldots, n$ with probability
tending to 1.
By the same reasoning as in the proof of Theorem 3.6.1, for m = a{\ — A2),
we obtain
l-Pm(n,T)<("X-m*4'4, C.6.21)
and under the conditions of Theorem 3.6.4, the right-hand side of C.6.21) tends
to zero.
The assertion of the theorem follows from C.6.15), C.6.20), and C.6.21). ¦
3.7. Notes and references
The theory of systems of random equations in finite fields was developed by the
Russian mathematicians V. E. Stepanov, G. V. Balakin, I. N. Kovalenko, A. A. Levitskaya,
and others. The connection between systems of equations in GF(2) and
graphs was first pointed out and used by Stepanov. The notion of a critical set was
introduced in [79] (see also [13] and [85]).
The theory of recurring sequences and shift registers mentioned in Section 3.1
can be found in [50] and [156].
Theorems 3.2.1 and 3.2.2 were proved by Kovalenko in [92]. This brilliant
result initiated a series of investigations of similar problems that were carried out
by Kovalenko and his school. These investigations developed in two directions.
The first direction concerns extensions of Theorems 3.2.1 and 3.2.2 to matrices
over more general algebraic structures. It is not difficult to see that by virtue of
the Markovian character of the process $\rho_n(t)$, a recurrence relation for $p_{n,T}(k) =
P\{\rho_n(T) = k\}$ can be derived and used for the proof of Theorem 3.2.1. In this way,
the extension of the result to a finite field with $q$ elements can be easily obtained
[93]. Let the elements of the $T \times n$ matrix $A = \|a_{tj}\|$ in GF($q$) take the values
$0, 1, \ldots, q - 1$ with equal probabilities; then the $p_{n,T}(k)$ for any $k = 0, 1, \ldots$
satisfy the equation
\[ p_{n,T}(k) = z^n p_{n,T-1}(k) + (1 - z^n) p_{n-1,T-1}(k - 1), \tag{3.7.1} \]
where $z = 1/q$. Indeed, if the first row of $A$ is a zero vector, then $\rho_n(T) =
\rho_n(T - 1)$, and if the row contains at least one nonzero element, then $\rho_n(T) =
\rho_{n-1}(T - 1) + 1$. It follows from (3.7.1) that if $s \ge 0$ and $m$ are fixed integers,
$m + s \ge 0$, $n \to \infty$, and $T = n + m$, then
\[ P\{\rho_n(T) = n - s\} \to q^{-s(m+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{q^i}\right) \prod_{i=1}^{m+s} \left(1 - \frac{1}{q^i}\right)^{-1}. \tag{3.7.2} \]
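A small numerical sketch may help to see how a recurrence of this kind leads to the limit. It assumes the recurrence in the form $p_{n,T}(k) = z^n p_{n,T-1}(k) + (1 - z^n) p_{n-1,T-1}(k-1)$ with $z = 1/q$, boundary values in which an empty matrix has rank 0, and the limit in the form $q^{-s(m+s)} \prod_{i>s}(1 - q^{-i}) / \prod_{i \le m+s}(1 - q^{-i})$. The function names and the truncation of the infinite product are illustrative assumptions.

```python
from functools import lru_cache


def rank_distribution(n, T, q):
    """P{rank = k}, k = 0..min(n, T), for a uniform random T x n matrix over GF(q)."""
    z = 1.0 / q

    @lru_cache(maxsize=None)
    def p(cols, rows, k):
        if k < 0:
            return 0.0
        if cols == 0 or rows == 0:  # empty matrix: rank 0 (assumed boundary condition)
            return 1.0 if k == 0 else 0.0
        zero_row = z ** cols  # probability that the next row is the zero vector
        return zero_row * p(cols, rows - 1, k) + (1 - zero_row) * p(cols - 1, rows - 1, k - 1)

    return [p(n, T, k) for k in range(min(n, T) + 1)]


def limit_probability(s, m, q, terms=200):
    """Limit of P{rank = n - s} for T = n + m, with the infinite product truncated."""
    numerator = 1.0
    for i in range(s + 1, terms):
        numerator *= 1 - q ** (-i)
    denominator = 1.0
    for i in range(1, m + s + 1):
        denominator *= 1 - q ** (-i)
    return q ** (-s * (m + s)) * numerator / denominator


if __name__ == "__main__":
    dist = rank_distribution(30, 30, 2)
    print(dist[30], limit_probability(0, 0, 2))
    print(dist[29], limit_probability(1, 0, 2))
```

For $q = 2$ and square matrices the first printed pair is close to 0.2888, the familiar limiting probability that a large random binary matrix is nonsingular.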
The investigations in the second direction concern the bounds of invariance
of the results of Theorems 3.2.1 and 3.2.2 with respect to the deviations of the
distribution of elements of the matrix A from the equiprobable distribution. The
problem of the invariance and a proof of Theorem 3.2.3 are given in [91, 92]. A
modified proof of Theorem 3.2.3 is contained in [93].
Theorem 3.2.4 can be easily extended to any moment of a fixed order of the
number of solutions, but that is not sufficient for the proof of the invariance property,
since the limit distribution (3.7.2) does not satisfy the sufficient conditions of the
unique reconstruction by its moments; hence, Theorem 1.1.3 cannot be applied.
Levitskaya [96, 97] presents results on the number of solutions of linear random
systems over arbitrary rings and the corresponding results on the invariance of the
moment and the limit distributions. These results are summarized in [93], where, in
particular, the exact bounds for the invariance are given for random linear systems
in arbitrary finite rings. For the system considered in Theorem 3.2.3, the exact
bounds for $p_{ij}$ have the form
\[ \delta_n \le p_{ij} \le 1 - \delta_n, \]
where $\delta_n = (\log n + x_n)/n$ and $x_n \to \infty$ arbitrarily slowly as $n \to \infty$.
Matrices that satisfy condition (3.3.1) were considered by Balakin [12], who
also proved Theorems 3.3.1 and 3.3.2. Closer investigation of the estimates used
in our proof of Theorem 3.3.1 allows us to obtain the following assertions.
Theorem 3.7.1. If $n \to \infty$,
\[ T = n + \beta_n \log n, \]
$\beta_n \to -\infty$, $\beta_n = o(n/\log n)$, and condition (3.3.1) holds, then the distribution
of $s(A)$ converges to the Poisson distribution with parameter $e^{-x}$.
Theorem 3.7.2. Ifn —> oo,
T = n +p\ogn + o(\ogn),
fi is a constant, and condition C.3.1) holds, then the distribution ofs(A) converges
to the Poisson distribution with parameter e~~x if f5 < 0, and with parameter
e~x^ if? > 0.
Theorems 3.3.1, 3.3.2, 3.7.1, and 3.7.2 give a complete description of the
behavior of the rank of such matrices, except for the case $\beta = 0$, where the behavior
is unknown. Note that in [12], the analogues of Theorems 3.3.1, 3.7.1, and 3.7.2
are proved for the systems over GF($q$), $q > 2$ (see also [86]), and the connection
between the rank of a matrix in GF($q$) and other characteristics such as the
permanent rank and rank of lines is considered. The initial results on the ranks of random
matrices are presented in [38] and [11].
Stepanov began investigating systems of linear equations of the form (3.4.1)
with the help of their relations to random graphs. In particular, he proved
Theorems 3.4.1 and 3.4.2. Now the theory of random graphs provides a basis for
obtaining the results on the systems of random equations with coefficients taking
their values with equal probabilities. If the coefficients of a system are essentially
nonequiprobable, then there are no standard approaches to investigating its
properties. Only a few results are known for such systems. We remark that at this time,
graph theory is not sufficiently developed to answer questions about
nonequiprobable cases. Only the method of moments (see Theorem 1.1.3) and the so-called
direct methods are used to solve these problems. Theorem 3.4.3 is a corollary to
Theorem 2.4.1 proved in [88] by the method of moments.
Theorems 3.4.4 and 3.4.5 are proved in [83]. The asymptotics of the
probability of consistency of a system of linear equations in GF(2) (and in more general
algebraic structures) with independent random coefficients that take the values 0
and 1 with equal probabilities have been obtained by Levitskaya [98] (see also
[93]). This probability takes only two values and is the same for all possible
right-hand sides of the system that are not the zero vector. It follows from
Theorems 3.4.4 and 3.4.5 that the probability of consistency of the system (3.4.1)
depends on the number of 1's in the vector of the right-hand sides of the system (see
also [83]).
The results of Section 3.5 on the behavior of the probability of consistency of
the system (3.5.1) can be found in [13] (see also [85]). Theorem 3.5.1 is proved by
the author, but the critical values $\alpha_r$ were first obtained by Balakin under slightly
different assumptions on the matrix $A_{r,n,T}$. These results are extended to GF($q$)
in [89]. The proof of Theorem 3.5.2 is given in [87].
We can consider the probability of the consistency of a system from the point
of view of mathematical statistics. Consider, for example, the system (3.4.1) and
assume the following two hypotheses on the distribution of the right-hand sides of
the system. Let the hypothesis $H_0$ be the existence of a vector $X^* = (x_1^*, \ldots, x_n^*)$,
which is interpreted as the true solution of the system, and $b_t = x^*_{i(t)} + x^*_{j(t)}$,
$t = 1, \ldots, T$. Under hypothesis $H_0$, system (3.4.1) is always consistent. Under
the alternative hypothesis $H_1$, the right-hand sides $b_1, \ldots, b_T$ are independent
random variables that are independent of the left-hand side of the system and take
the values 0 and 1 with equal probabilities. To distinguish between the hypotheses
$H_0$ and $H_1$, we can use the consistency of the system as a test: If the system is
consistent, we accept the hypothesis $H_0$, and we accept $H_1$ otherwise. Therefore
the hypothesis $H_0$ is never rejected if it is true, and the error of the first kind, the
probability of rejecting $H_0$ if it is true, is zero. The error of the second kind, the
probability of accepting $H_0$ if it is wrong, is equal to the probability of consistency
of the system (3.4.1). Thus, the probability of consistency is the main characteristic
in the statistical problem of testing the hypotheses $H_0$ and $H_1$.
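The consistency test itself is easy to carry out for equations of the form $x_i + x_j = b$ over GF(2). The sketch below uses a union-find structure that stores, for every unknown, its parity relative to the representative of its component; this data structure is an implementation choice made here for illustration and is not taken from the text, as are the function name and the triple encoding of equations.

```python
def is_consistent(n, equations):
    """Decide whether the system {x_i + x_j = b (mod 2)} has a solution in GF(2)^n."""
    parent = list(range(n))
    parity = [0] * n  # parity[v] = x_v + x_parent[v] (mod 2)

    def find(v):
        """Return (root of v, x_v + x_root), compressing the path on the way."""
        path = []
        while parent[v] != v:
            path.append(v)
            v = parent[v]
        acc = 0
        for u in reversed(path):  # from the vertex nearest the root outward
            acc ^= parity[u]
            parent[u], parity[u] = v, acc
        return v, acc

    for i, j, b in equations:
        ri, pi = find(i)
        rj, pj = find(j)
        if ri == rj:
            if pi ^ pj != b:  # the equation contradicts the earlier ones
                return False
        else:
            parent[ri] = rj
            parity[ri] = pi ^ pj ^ b
    return True


if __name__ == "__main__":
    print(is_consistent(3, [(0, 1, 1), (1, 2, 1), (0, 2, 0)]))  # accept H_0
    print(is_consistent(3, [(0, 1, 1), (1, 2, 1), (0, 2, 1)]))  # accept H_1
```

In graph terms, the system is inconsistent exactly when some cycle of the corresponding graph carries right-hand sides with an odd sum, which is what the check inside the loop detects.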
Section 3.6 is devoted to the other statistical problems that consist of
reconstructing the true solution on the basis of a system of random equations with
distorted right-hand sides. These results can be found in the paper [84].
Random permutations
4.1. Random permutations and the generalized
scheme of allocation
Denote by $S_n$ the set of all one-to-one mappings of the set $X_n = \{1, 2, \ldots, n\}$ into
itself. This set contains $n!$ elements. We consider a random permutation $\sigma$ that
equals any element of $S_n$ with probability $(n!)^{-1}$.
A permutation $s \in S_n$ can be written as
\[ s = \begin{pmatrix} 1 & 2 & \cdots & n \\ s_1 & s_2 & \cdots & s_n \end{pmatrix}, \]
where $s_k$ is the image of $k$ under the mapping $s$, $k = 1, \ldots, n$. The mapping $s$ can
be represented also by the graph $\Gamma_n^{(s)} = \Gamma(X_n, W_n)$ whose vertex set is $X_n$, and
the edge set $W_n$ consists of the arcs $(k, s_k)$ directed from $k$ to $s_k$, $k = 1, \ldots, n$.
Since exactly one arc enters each vertex and exactly one arc emanates from each
vertex, the graph $\Gamma_n^{(s)}$ consists of the connected components that are cycles, which
are called the cycles of the permutation $s$.
Denote by $\Gamma_n$ the random graph corresponding to the random permutation $\sigma$,
which takes the values $s$ with equal probabilities. It is obvious that $P\{\Gamma_n = \Gamma_n^{(s)}\} =
(n!)^{-1}$.
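The decomposition of the graph of a permutation into cycles can be computed by following the arcs $k \to s_k$ until a vertex repeats. The sketch below does this for a permutation stored as a 0-based Python list, an indexing convention chosen here for convenience; the function name `cycles` is illustrative.

```python
def cycles(s):
    """Cycles of the permutation k -> s[k] on {0, ..., n-1}, each listed from its smallest element."""
    seen = [False] * len(s)
    result = []
    for start in range(len(s)):
        if seen[start]:
            continue
        cycle = []
        k = start
        while not seen[k]:  # follow the arcs until the cycle closes
            seen[k] = True
            cycle.append(k)
            k = s[k]
        result.append(cycle)
    return result


if __name__ == "__main__":
    print(cycles([1, 2, 0, 4, 3]))
```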
In Section 1.3, we showed that the generalized scheme of allocation
introduced in Section 1.2 can be applied to a wide class of problems related to the
behavior of the connected components of random graphs. In Example 1.3.1, we
showed that the generalized scheme can be used in the study of random
permutations. Recall that in the generalized scheme, we separate the subset of graphs
with exactly $N$ components, assign one of the $N!$ possible orders to the set
of these components, and denote by $\eta_1, \ldots, \eta_N$ the sizes of the components.
If there exist nonnegative identically distributed random variables $\xi_1, \ldots, \xi_N$
such that for any integers $k_1, \ldots, k_N$,
\[ P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = P\{\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\}, \tag{4.1.1} \]
we say that the generalized scheme determined by the random variables $\xi_1, \ldots, \xi_N$
is applied to the random graph.
As was shown in Example 1.3.1, the generalized scheme that corresponds to the
random graph $\Gamma_n$ of a random permutation from $S_n$ is determined by the random
variables $\xi_1, \ldots, \xi_N$ with the distribution
\[ P\{\xi_1 = k\} = -\frac{x^k}{k \log(1 - x)}, \quad k = 1, 2, \ldots, \quad 0 < x < 1, \tag{4.1.2} \]
since the number of elements in $S_n$ is $a_n = n!$ and the number of connected
realizations of the random graph $\Gamma_n$ is $b_n = (n - 1)!$.
For the random permutations, the corresponding generating functions have the
form
\[ A(x) = \sum_{n=0}^{\infty} \frac{n!\, x^n}{n!} = \frac{1}{1 - x}, \qquad
   B(x) = \sum_{n=1}^{\infty} \frac{(n-1)!\, x^n}{n!} = -\log(1 - x). \]
Thus the study of various characteristics of random permutations can be
accomplished with the help of the generalized scheme. This is demonstrated for the
most part in [78].
Recall some combinatorial identities that follow from the general results of
Section 1.3.
Let $\nu_n$ be the number of cycles in a random permutation from $S_n$. Lemma 1.3.3
gives the equality
\[ P\{\nu_n = N\} = \frac{(B(x))^N}{N!\, x^n} P\{\xi_1 + \cdots + \xi_N = n\}. \tag{4.1.3} \]
Denote by $\alpha_r$ the number of cycles of length $r$ in a random permutation from
$S_n$, $r = 1, \ldots, n$. According to Lemma 1.3.7, for any nonnegative integers
$m_1, \ldots, m_n$,
\[ P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} = \prod_{r=1}^{n} \frac{1}{r^{m_r} m_r!} \tag{4.1.4} \]
if $m_1 + 2m_2 + \cdots + n m_n = n$, and the probability is zero otherwise.
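The product formula for the joint distribution of the cycle counts, $\prod_{r} 1/(r^{m_r} m_r!)$, can be confirmed for small $n$ by listing all permutations. The sketch below does this in exact arithmetic for $n = 5$; the function names are illustrative, and the brute-force enumeration is only meant as a check, not as a practical method.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_type(s):
    """Tuple (m_1, ..., m_n), where m_r is the number of cycles of length r in s."""
    n = len(s)
    seen = [False] * n
    counts = [0] * n
    for start in range(n):
        length, k = 0, start
        while not seen[k]:
            seen[k] = True
            k = s[k]
            length += 1
        if length:
            counts[length - 1] += 1
    return tuple(counts)


def product_formula(m):
    """prod over r of 1 / (r**m_r * m_r!)."""
    p = Fraction(1)
    for r, m_r in enumerate(m, start=1):
        p /= r ** m_r * factorial(m_r)
    return p


def exact_distribution(n):
    """Exact distribution of the cycle type of a uniform random permutation of degree n."""
    counts = Counter(cycle_type(s) for s in permutations(range(n)))
    return {m: Fraction(c, factorial(n)) for m, c in counts.items()}


if __name__ == "__main__":
    for m, p in sorted(exact_distribution(5).items()):
        print(m, p, product_formula(m))
```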
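Let us introduce the generating function
\[ \varphi_n(t_1, \ldots, t_n) = \sum_{M_n} P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\}\, t_1^{m_1} \cdots t_n^{m_n}
 = \sum_{M_n} \prod_{r=1}^{n} \frac{1}{m_r!} \left( \frac{t_r}{r} \right)^{m_r}, \]
where the summation is over the set of integers
\[ M_n = \{m_i \ge 0,\ i = 1, \ldots, n,\ m_1 + 2m_2 + \cdots + n m_n = n\}. \]
Put $\varphi_0 = 1$. It is not difficult to see that $\varphi_n(t_1, \ldots, t_n)$ is the coefficient of $u^n$ in
the expansion of $\exp\{u t_1 + u^2 t_2/2 + \cdots\}$:
\[ \sum_{n=0}^{\infty} \varphi_n(t_1, \ldots, t_n) u^n = \exp\left\{ \sum_{n=1}^{\infty} \frac{u^n t_n}{n} \right\}. \tag{4.1.5} \]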
The generating function D.1.5) was obtained by Goncharov and was the basis
of his pioneering investigations of random permutations [53]. In [78], the approach
based on the generalized scheme of allocations was used in such investigations. In
the next sections, we will present some examples of how the generalized scheme
of allocation can be applied to random permutations. This will supplement the
investigations presented in [78].
4.2. The number of cycles
It is well known that the number of cycles $\nu_n$ in a random permutation from $S_n$ is
asymptotically normal with parameters $(\log n, \log n)$ as $n \to \infty$. More precisely,
as $n \to \infty$,
\[ P\{\nu_n = N\} = \frac{1}{\sqrt{2\pi \log n}}\, e^{-u^2/2} (1 + o(1)) \tag{4.2.1} \]
uniformly in the integers $N$ such that $u = (N - \log n)/\sqrt{\log n}$ lies in any fixed
finite interval.
The approach based on the generalized scheme of allocation makes it possible
to obtain the asymptotics of the probability $P\{\nu_n = N\}$ for all possible values of
$N = N(n)$ as $n \to \infty$. According to (4.1.3), for any integer $N$,
\[ P\{\nu_n = N\} = \frac{(-\log(1 - x))^N}{N!\, x^n} P\{\xi_1 + \cdots + \xi_N = n\}, \tag{4.2.2} \]
where the parameter $x$ can be taken arbitrarily from the interval $(0, 1)$, and $\xi_1,
\ldots, \xi_N$ are independent identically distributed random variables with distribution
(4.1.2).
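The fact that the parameter $x$ is free can be checked numerically. The sketch below assumes the representation in the form $P\{\nu_n = N\} = (-\log(1-x))^N (N!\, x^n)^{-1} P\{\xi_1 + \cdots + \xi_N = n\}$ with $P\{\xi_1 = k\} = -x^k/(k \log(1-x))$. It computes the left-hand side from the standard recurrence $c(n, N) = c(n-1, N-1) + (n-1)c(n-1, N)$ for the number of permutations of degree $n$ with $N$ cycles, and the right-hand side by direct convolution. All function names are illustrative.

```python
import math


def cycle_count_probs(n):
    """P{nu_n = N} for N = 0..n, from c(m, N) = c(m-1, N-1) + (m-1) * c(m-1, N)."""
    row = [1]  # degree 0: one (empty) permutation with 0 cycles
    for m in range(1, n + 1):
        row = [
            (row[N - 1] if N >= 1 else 0) + (m - 1) * (row[N] if N < len(row) else 0)
            for N in range(m + 1)
        ]
    total = math.factorial(n)
    return [c / total for c in row]


def sum_prob(n, N, x):
    """P{xi_1 + ... + xi_N = n} for i.i.d. logarithmic summands with parameter x."""
    norm = -math.log(1 - x)
    p = [0.0] + [x ** k / (k * norm) for k in range(1, n + 1)]
    dist = [1.0] + [0.0] * n  # distribution of the empty sum
    for _ in range(N):
        new = [0.0] * (n + 1)
        for a in range(n + 1):
            if dist[a]:
                for k in range(1, n - a + 1):
                    new[a + k] += dist[a] * p[k]
        dist = new
    return dist[n]


def representation(n, N, x):
    """Right-hand side of the representation for a given parameter x in (0, 1)."""
    return (-math.log(1 - x)) ** N / (math.factorial(N) * x ** n) * sum_prob(n, N, x)


if __name__ == "__main__":
    n = 8
    exact = cycle_count_probs(n)
    for N in range(1, n + 1):
        print(N, exact[N], representation(n, N, 0.3), representation(n, N, 1 - 1 / n))
```

Both choices of $x$ reproduce the same probabilities up to rounding, which is why $x$ can later be tuned so that the local limit theorem for the sum becomes simple.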
Thus, to study the asymptotic behavior of the distribution of $\nu_n$, it is sufficient
to obtain the corresponding local limit theorems for the sum
\[ \zeta_N = \xi_1 + \cdots + \xi_N, \]
where the parameter $x$ in the distribution of the summands can be chosen so that
obtaining the local theorems becomes simple.
We begin with $x = 1 - 1/n$ and prove a series of limit theorems that make it
possible to describe the behavior of the probability $P\{\nu_n = N\}$ for the values of
$N$ not too far from $\log n$.
Theorem 4.2.1. If $n \to \infty$, $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then
\[ P\{\zeta_N = k\} = \frac{1}{n \Gamma(\gamma)}\, z^{\gamma - 1} e^{-z} (1 + o(1)) \]
uniformly in the integers $k$ such that $z = k/n$ lies in any interval of the form
$0 < z_0 \le z \le z_1$, where $z_0$ and $z_1$ are constants.
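Before proving the theorem, we obtain some auxiliary results. We have chosen
$x = 1 - 1/n$. For such $x$,
\[ P\{\xi_1 = k\} = \frac{(1 - 1/n)^k}{k \log n}, \quad k = 1, 2, \ldots, \tag{4.2.3} \]
and the characteristic function of the random variable $\xi_1$ equals
\[ \varphi_n(t) = -\frac{1}{\log n} \log\left( 1 - e^{it} + \frac{e^{it}}{n} \right). \]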
Represent <pn (t) in the form
(Pn(O = -—i- (log (- - it) + log(l + Vi(O + V2@)) , D-2.4)
where
elt -
1/n-it n(\/n-it)
For ir\(t) and ^2@- the following estimates are valid:
D.2.5)
|el| 1
< -r1 < -• D.2.6)
n t n
By using the explicit form of $\varphi_n(t)$, the representation (4.2.4) and the bounds
(4.2.5) and (4.2.6), we obtain the following estimates of $\varphi_n(t)$.
Lemma 4.2.1. If $n \to \infty$, $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then for any fixed $t$,
\[ \varphi_n^N\left( \frac{t}{n} \right) \to \frac{1}{(1 - it)^{\gamma}}. \]
Lemma 4.2.2. If n ->¦ oo, TV = y\ogn + o(logw), where y is a constant,
0 < y < oo, then there exist positive constants s and c such that for \t/n | < s,
Lemma 4.2.3. If $n \to \infty$, then for $0 < \varepsilon \le |t| \le \pi$, where $\varepsilon$ is an arbitrary
constant, there exists a constant $c$ such that for sufficiently large $n$,
\[ |\varphi_n(t)| \le c/\log n. \]
Lemma 4.2.4. If n ^ oo, then there exists a positive constant s such that for
\t/n\ < s,
2
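As follows from Lemma 4.2.1, as $n \to \infty$ and $N = \gamma \log n + o(\log n)$, where $\gamma$
is a constant, $0 < \gamma < \infty$, the distributions of the normalized sums $\zeta_N/n$ converge
to the gamma distribution with characteristic function $(1 - it)^{-\gamma}$ and density
$z^{\gamma - 1} e^{-z}/\Gamma(\gamma)$, $z > 0$. Actually, as stated in Theorem 4.2.1, these distributions
become close locally.
Proof of Theorem 4.2.1. By the inversion formula, the probability
$P\{\zeta_N/n = z\}$ can be represented in the form
\[ P\{\zeta_N/n = z\} = \frac{1}{2\pi n} \int_{-\pi n}^{\pi n} e^{-itz} \varphi_n^N(t/n)\, dt, \]
and
\[ \frac{1}{\Gamma(\gamma)}\, z^{\gamma - 1} e^{-z} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itz}}{(1 - it)^{\gamma}}\, dt. \]
Hence,
\[ 2\pi n P\{\zeta_N/n = z\} - \frac{2\pi}{\Gamma(\gamma)}\, z^{\gamma - 1} e^{-z} = I_1 + I_2 + I_3 + I_4, \]
where
\[ I_1 = \int_{-A}^{A} e^{-itz} \bigl( \varphi_n^N(t/n) - (1 - it)^{-\gamma} \bigr)\, dt, \qquad
   I_2 = \int_{A < |t| \le \varepsilon n} e^{-itz} \varphi_n^N(t/n)\, dt, \]
\[ I_3 = \int_{\varepsilon n < |t| \le \pi n} e^{-itz} \varphi_n^N(t/n)\, dt, \qquad
   I_4 = -\int_{A < |t|} e^{-itz} (1 - it)^{-\gamma}\, dt, \]
with the constants $\varepsilon$ and $A$ to be chosen later.
By Lemma 4.2.1, $\varphi_n^N(t/n) \to (1 - it)^{-\gamma}$ for any fixed $t$. By Theorem 1.1.9,
this means that the convergence is uniform with respect to $t$ in any finite interval.
Therefore $I_1 \to 0$ for any fixed $A$ as $n \to \infty$.
By Lemma 4.2.3, for sufficiently large $n$,
\[ |I_3| \le 2\pi n (c/\log n)^N \le 2\pi n e^{-2N/\gamma}, \]
and, for $N = \gamma \log n + o(\log n)$, the right-hand side tends to zero as $n \to \infty$.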
To estimate h and I4, we integrate by parts. For I4, this leads to
/•oo -itz °° y r
/ e-ltz(\ - ityy dt = - e + y- \
Ja iz(\ -ity A z jA
dt.
IZ{\ -lt)Y A Z JA
Therefore
dt
<
2y_ r
z JA
2 JA
A U -f
00 dt CA
tr+i - Ay'
where a, is a constant, and I\ can be made arbitrarily small by the choice of
sufficiently large A.
Similarly,
'?-^^dt
p-itz
en
12 \n ) A izn jA
By using the estimates of Lemmas 4.2.2, 4.2.3, and 4.2.4, we obtain
2
\h\ < -
AY
'A
n
+
N
2N rsn
in JA
\\ognj J JA
A
00 dt
v?-1
,n
dt
where c, C2, and C3 are constants.
If we choose sufficiently large A and n, we can make I/2I arbitrarily small. ¦
Now we can prove the following theorem on the behavior of the probability
$P\{\nu_n = N\}$.
Theorem 4.2.2. If $n \to \infty$ and $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then
\[ P\{\nu_n = N\} = \frac{(\log n)^N}{N!\, n\, \Gamma(\gamma)} (1 + o(1)). \]
Proof. For $x = 1 - 1/n$, the representation (4.2.2) takes the form
\[ P\{\nu_n = N\} = \frac{(\log n)^N}{N!\, (1 - 1/n)^n} P\{\zeta_N = n\}, \tag{4.2.7} \]
where $\zeta_N = \xi_1 + \cdots + \xi_N$ is the sum of independent identically distributed random
variables with distribution (4.2.3). By Theorem 4.2.1,
\[ P\{\zeta_N = n\} = \frac{1}{n \Gamma(\gamma)}\, e^{-1} (1 + o(1)). \]
By substituting this expression into (4.2.7), we obtain the assertion of
Theorem 4.2.2. ■
The case where $\gamma = N/\log n \to 0$ is described by the following theorem.
Theorem 4.2.3. If $n \to \infty$ and $\gamma = N/\log n \to 0$, then
\[ P\{\zeta_N = n\} = \frac{e^{-1}}{n \Gamma(\gamma)} (1 + o(1)). \]
Proof. Taking into account that y < 1/2 beginning with some n, we choose the
level n(l — y) and represent the probability P{?w = n} as follows:
D.2.8)
=n) = P{t;N = n, & <n(l-y), i = 1,...,«}
+ NP{rN = n,SN>n(l- y)}.
Since
?i =m} =
A
n log n
uniformly in m, n > m > n{\ - y), we see that
> n{\ - y)} = 2_^ P{%N = m, Kn-\ = n — m)
m>n(\—y)
6 -P{^-i <yn}(\+o(\)). D.2.9)
We now prove
^ 1. D.2.10)
Show that the random variable ^/(y«) converges in probability to zero. By the
representation D.2.4) and the estimates D.2.5) and D.2.6),
logn \ \n ynj
log(y - f Q - log y
|
log« \y« log«/ '
and if y = N/ log n —> 0, then
*
+ 0 f ,.
\yn\ognJJ
Thus, the characteristic function of ^/(yn) converges to the characteristic func-
function of the random variable that assumes the value 0 with probability 1, and we
obtain D.2.10).
With some technical difficulties, it can be proved that under the conditions of
the theorem,
P{Sn=n, & <n{\-y), i = 1, ...,«} = o{\/{n log/i)).
The assertion of the theorem follows from this relation and the relations D.2.8),
D.2.9), and D.2.10). ¦
Theorem 4.2.4. If $n \to \infty$ and $\gamma = N/\log n \to 0$, then
\[ P\{\nu_n = N\} = \frac{N (\log n)^{N-1}}{N!\, n} (1 + o(1)). \]
Proof. The assertion of the theorem follows immediately from Theorem 4.2.3
and representation (4.2.7) if we take into account that the gamma function $\Gamma(\gamma) =
\gamma^{-1}(1 + o(1))$ as $\gamma \to 0$. ■
Now consider the case where $N/\log n \to \infty$. We distinguish four subcases:
$\alpha = n/N \to \infty$, $\alpha \to c > 1$, $\alpha \to 1$ with $m = n - N \to \infty$, and $\alpha \to 1$ with $m$
fixed.
Let $\alpha \to \infty$. We must select the value of the parameter $x$ so that $\mathbf{E}\zeta_N$ is close
to $n$. Since for $\xi_1$ with distribution (4.1.2),
\[ \mathbf{E}\xi_1 = -\frac{x}{(1 - x)\log(1 - x)}, \]
we choose $x$, $0 < x < 1$, such that
\[ -\frac{x}{(1 - x)\log(1 - x)} = \alpha, \tag{4.2.11} \]
where $\alpha = n/N$. This equation is approximately satisfied if we take
\[ x = 1 - \frac{1}{\alpha \log \alpha}. \]
If $N/\log n \to \infty$, then $x = 1 - 1/(\alpha \log \alpha)$ is farther from the singular point
$x = 1$ than $x = 1 - 1/n$, and therefore the normal approximation is valid for the
sum $\zeta_N$.
Theorem 4.2.5. If $n, N \to \infty$ such that $N/\log n \to \infty$, $\alpha = n/N \to \infty$,
the parameter $x = 1 - 1/(\alpha \log \alpha)$ and $\sigma_\alpha = \alpha\sqrt{\log \alpha}$, then
\[ P\{\zeta_N = k\} = \frac{1}{\sigma_\alpha \sqrt{2\pi N}}\, e^{-z^2/2} (1 + o(1)) \]
uniformly in the integers $k$ such that $z = (k - n)/(\sigma_\alpha\sqrt{N})$ lies in any fixed finite
interval.
Proof. The characteristic function of the random variable $\xi_1$ is
\[ \varphi(t) = \frac{\log(1 - x e^{it})}{\log(1 - x)}. \]
It is easy to see that for any fixed t, as N/ log n —> oo and a = n/N —> oo,
Denote by $\psi_N(t)$ the characteristic function of $\zeta_N^* = (\zeta_N - n)/(\sigma_\alpha\sqrt{N})$; then
under the conditions of the theorem for any fixed $t$,
\[ \psi_N(t) \to e^{-t^2/2}, \]
and the distribution of $\zeta_N^*$ converges weakly to the normal distribution with
parameters $(0, 1)$.
The local convergence can be proved by the standard reasoning and we omit
this technical part of the proof of Theorem 4.2.5. ■
From Theorem 4.2.5 and representation (4.2.2), we obtain the following
assertion.
Theorem 4.2.6. Ifn, N -> oo such that N/ \ogn -> oo, a = n/N ->¦ oo, then
where x = 1 — l/(a log a) and aa = as
The following theorem for the case where a tends to a constant greater than 1
can be proved in the same way as Theorem 4.2.5.
Theorem 4.2.7. If n, N —> oo and there exist constants «o and a.\ such that
1 < ao < a < ct\, the parameter x = xa, where xa is the unique solution of
equation D.2.11) in the interval @, 1), and
v I r\rr I 1 "V" I I y ^
ax = ~— = '
then
= k} = ]==e-z2/2{\ + 0A))
V2N
uniformly in the integers Jc such that z = (k — n)/ (ax \j2it N) lies in any fixed
finite interval.
Proof. The proof is similar to the proof of Theorem 4.2.5 and we omit the details.
Note only that a — E?i and a2 = D?i for x = xa. ¦
Using Theorem 4.2.7 and representation (4.2.2), we obtain the following
assertion on the distribution of $\nu_n$.
Theorem 4.2.8. If n, N —> oo and there exist constants ao and a\ such that
1 < ao < a < a\, then
where xa is the unique solution of equation D.2.11) in the interval @, 1), and
>!
-xa)
The asymptotic normality of $\zeta_N$ is preserved if $\alpha = n/N \to 1$ slowly, as
specified below.
Theorem 4.2.9. If $n, N \to \infty$ such that $\alpha = n/N \to 1$ and $m = n - N \to \infty$,
and the parameter $x = x_\alpha$, where $x_\alpha$ is the unique solution of equation (4.2.11) in
the interval (0, 1), then
$$P\{\zeta_N = k\} = \frac{1}{\sqrt{2\pi m}}\,e^{-z^2/2}(1+o(1))$$
uniformly in the integers $k$ such that $z = (k-n)/\sqrt{m}$ lies in any fixed finite
interval.
The proof is similar to the proof of Theorem 4.2.5 and we omit the details.
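From Theorem 4.2.9 and representation (4.2.2), we obtain the following
assertion on the behavior of $P\{\nu_n = N\}$.
Theorem 4.2.10. If $n, N \to \infty$ such that $\alpha = n/N \to 1$ and $m = n - N \to \infty$,
then
$$P\{\nu_n = N\} = \frac{(-\log(1-x_\alpha))^N}{N!\,x_\alpha^n\,\sqrt{2\pi m}}\,(1+o(1)),$$
where $x_\alpha$ is the unique solution of equation (4.2.11).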
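It is not difficult to see that if $m^2/N \to 0$, then
$$x_\alpha = \frac{2m}{N}\Big(1 + O\Big(\frac{m}{N}\Big)\Big),$$
and consequently
$$(-\log(1-x_\alpha))^N = x_\alpha^N\big(1 + x_\alpha/2 + O(x_\alpha^2)\big)^N = x_\alpha^N e^m\big(1 + O(m^2/N)\big),$$
$$x_\alpha^{-m} = \frac{N^m}{(2m)^m}\big(1 + O(m^2/N)\big).$$
Therefore it follows from Theorem 4.2.9 that if $n, N \to \infty$, $\alpha = n/N \to 1$,
$m \to \infty$ and $m^2/N \to 0$, then
$$P\{\nu_n = N\} = \frac{N^m}{N!\,2^m\,m!}\,(1+o(1)).$$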
Finally we consider the case where m is bounded.
Theorem 4.2.11. If $N \to \infty$ and the parameter $x = 1/N$, then for any fixed
$k = 0, 1, \dots$,
$$P\{\zeta_N = N + k\} = \frac{1}{2^k\,k!}\,e^{-1/2}(1+o(1)).$$
Proof. By expanding the characteristic function $\varphi(t)$ of the random variable $\xi_1$
with parameter $x = 1/N$, we obtain for any fixed $t$,
$$\varphi(t) = e^{it}\Big(1 + \frac{e^{it}-1}{2N} + O(N^{-2})\Big).$$
If $x = 1/N$ and $N \to \infty$, then the characteristic function of $\zeta_N - N$ is equal to
$(1 + (e^{it}-1)/(2N) + O(N^{-2}))^N$ and tends to $e^{(e^{it}-1)/2}$. This means that the
distribution of $\zeta_N - N$ converges to the Poisson distribution with parameter 1/2.
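The Poisson limit in Theorem 4.2.11 can be observed numerically. The following is a minimal sketch, assuming the logarithmic-series distribution $P\{\xi_1 = j\} = x^j/(j(-\log(1-x)))$ with $x = 1/N$; the function names `shifted_sum_pmf` and `poisson_half` are illustrative and not part of the text.

```python
import math

def shifted_sum_pmf(N, kmax):
    """Return [P{xi_1 + ... + xi_N = N + k} for k = 0..kmax] with x = 1/N."""
    x = 1.0 / N
    B = -math.log1p(-x)                      # B(x) = -log(1 - x)
    # pmf of xi - 1 on 0..kmax; larger values cannot contribute to totals <= kmax
    q = [x ** (j + 1) / ((j + 1) * B) for j in range(kmax + 1)]
    dist = [1.0] + [0.0] * kmax              # distribution of the empty sum
    for _ in range(N):                       # convolve N times, truncated at kmax
        new = [0.0] * (kmax + 1)
        for a, pa in enumerate(dist):
            for b in range(kmax + 1 - a):
                new[a + b] += pa * q[b]
        dist = new
    return dist

def poisson_half(k):
    """Poisson probability with parameter 1/2."""
    return math.exp(-0.5) / (2 ** k * math.factorial(k))

if __name__ == "__main__":
    for k, p in enumerate(shifted_sum_pmf(400, 4)):
        print(k, round(p, 5), round(poisson_half(k), 5))
```

The truncation at `kmax` is exact for the probabilities returned, because a sum of nonnegative shifts that totals at most `kmax` cannot use any shift larger than `kmax`.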
From this theorem and representation (4.2.2), we obtain the following assertion,
which completes the description of the asymptotic behavior of the distribution
of $\nu_n$.
Theorem 4.2.12. If $n \to \infty$, $n/N \to 1$, and $m = n - N$ is fixed, then
$$P\{\nu_n = N\} = \frac{N^m}{N!\,2^m\,m!}\,(1+o(1)).$$
It is not difficult to see that Theorems 4.2.2, 4.2.4, 4.2.6, 4.2.8, 4.2.10, and
4.2.12 give a complete description of the asymptotic behavior of the distribution
of the number of cycles in a random permutation of degree $n$ as $n \to \infty$.
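For fixed $m = n - N$, the approximation $N^m/(N!\,2^m\,m!)$ can be compared with the exact value $P\{\nu_n = N\} = c(n, N)/n!$, where $c(n, N)$ is the unsigned Stirling number of the first kind. The sketch below assumes this standard identification; `cycle_counts` and `ratio` are hypothetical helper names.

```python
from fractions import Fraction
from math import factorial

def cycle_counts(n):
    """Row n of the unsigned Stirling numbers of the first kind:
    entry k is the number of permutations of degree n with exactly k cycles."""
    row = [1]                                  # degree 0: the empty permutation
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(size + 1):
            if k >= 1:                         # the new element forms its own cycle
                new[k] += row[k - 1]
            if k <= size - 1:                  # or is inserted after an existing element
                new[k] += (size - 1) * row[k]
        row = new
    return row

def ratio(n, m):
    """Exact P{nu_n = N} divided by N**m / (N! 2**m m!), where N = n - m."""
    N = n - m
    exact = Fraction(cycle_counts(n)[N], factorial(n))
    approx = Fraction(N ** m, factorial(N) * 2 ** m * factorial(m))
    return float(exact / approx)

if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n, ratio(n, 1), ratio(n, 2))
```

For $m = 1$ the ratio is identically 1, since $c(n, n-1) = n(n-1)/2$; for $m = 2$ it equals $(3n-1)/(3(n-2))$ and approaches 1 as $n$ grows.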
4.3. Permutations with restrictions on cycle lengths
In this section, we present some results on permutations with restrictions on their
cycle lengths. Let $R$ be a subset of the set of natural numbers. We consider the
set $S_{n,R}$ of all permutations of degree $n$ with cycle lengths from the set $R$. One of
the first questions that arises in this situation concerns the asymptotic behavior of
the number $a_{n,R}$ of elements in $S_{n,R}$. This problem is far from being completely
solved. Here we describe some of the solutions provided by an approach based on
the generalized scheme of allocation.
Let the uniform distribution be defined on $S_{n,R}$ and let $\nu_{n,R}$ be the total number
of cycles in a random permutation from this set. Put $b_{n,R} = (n-1)!$ if $n \in R$, and
$b_{n,R} = 0$ otherwise. It is easy to see that
$$P\{\nu_{n,R} = N\} = \frac{n!}{N!\,a_{n,R}}\sum_{n_1+\dots+n_N=n}\frac{b_{n_1,R}\cdots b_{n_N,R}}{n_1!\cdots n_N!}. \tag{4.3.1}$$
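We introduce independent identically distributed random variables
$\xi_1, \dots, \xi_N$ with distribution
$$P\{\xi_1 = k\} = \frac{b_{k,R}\,x^k}{k!\,B_R(x)}, \quad k = 1, 2, \dots, \tag{4.3.2}$$
where
$$B_R(x) = \sum_{k=1}^{\infty}\frac{b_{k,R}\,x^k}{k!} = \sum_{k\in R}\frac{x^k}{k}, \quad x > 0.$$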
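By using these random variables, we can rewrite (4.3.1) in the form
$$P\{\nu_{n,R} = N\} = \frac{n!\,(B_R(x))^N}{N!\,x^n\,a_{n,R}}\,P\{\xi_1+\dots+\xi_N = n\}. \tag{4.3.3}$$
Hence, summing over $N$, we obtain
$$a_{n,R} = \frac{n!\,e^{B_R(x)}}{x^n}\sum_{N=1}^{\infty}\frac{(B_R(x))^N e^{-B_R(x)}}{N!}\,P\{\xi_1+\dots+\xi_N = n\}. \tag{4.3.4}$$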
It is clear that above we have repeated the general approach of Section 1.3 for
the case of the set $S_{n,R}$, and relations (4.3.1), (4.3.3), and (4.3.4) are the realizations
of the general relations (1.3.1), (1.3.10), and (1.3.11), respectively.
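Summing the probabilities of the number of cycles over $N$ gives a purely combinatorial identity once $B_R(x)$ and $x^n$ cancel: $a_{n,R} = n!\sum_N (1/N!)\sum 1/(k_1\cdots k_N)$, where the inner sum runs over $k_1+\dots+k_N = n$ with all $k_i \in R$. A minimal sketch checking this against a direct recurrence follows; both function names are illustrative, and the recurrence (condition on the length of the cycle containing a fixed element) is a standard device, not taken from the text.

```python
from fractions import Fraction
from math import factorial

def count_via_allocation(n, in_R):
    """a_{n,R} = n! * sum_N (1/N!) * sum over k_1+...+k_N = n, k_i in R, of 1/(k_1...k_N)."""
    base = [Fraction(0)] + [Fraction(1, k) if in_R(k) else Fraction(0)
                            for k in range(1, n + 1)]
    power = [Fraction(1)] + [Fraction(0)] * n   # coefficients of base**0
    total = Fraction(0)
    for N in range(1, n + 1):
        new = [Fraction(0)] * (n + 1)
        for i, pi in enumerate(power):
            if pi:
                for k in range(1, n + 1 - i):
                    new[i + k] += pi * base[k]
        power = new                              # coefficients of base**N, truncated at degree n
        total += power[n] / factorial(N)
    return total * factorial(n)

def count_via_recurrence(n, in_R):
    """a_j = sum over k in R, k <= j, of (j-1)!/(j-k)! * a_{j-k}."""
    a = [1] + [0] * n
    for j in range(1, n + 1):
        a[j] = sum(factorial(j - 1) // factorial(j - k) * a[j - k]
                   for k in range(1, j + 1) if in_R(k))
    return a[n]

if __name__ == "__main__":
    even = lambda k: k % 2 == 0
    print(count_via_allocation(6, even), count_via_recurrence(6, even))
```

Exact rational arithmetic is used so that the two counts can be compared for equality rather than approximately.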
To find the asymptotics of the numbers $a_{n,R}$, it is sufficient to choose an
appropriate value of the parameter $x$, substitute it into the expression of the distribution
(4.3.2), and then prove a local limit theorem for the sum of independent random
variables with this distribution.
We succeed in obtaining results on $a_{n,R}$ only if the structure of $R$ has some
regularity. In the general case, the asymptotics of $a_{n,R}$ is unknown.
To demonstrate the approach, we consider first a simple case where R is the set
E of even numbers.
Theorem 4.3.1. If $n \to \infty$, then
$$a_{n,E} = 2\Big(\frac{n}{e}\Big)^n(1+o(1)) \tag{4.3.5}$$
for even $n$, and $a_{n,E} = 0$ for odd $n$.
Proof. To prove the theorem, we use the representation (4.3.4). We consider
the random variables $\xi'_1, \dots, \xi'_N$ with distribution (4.3.2), where $R = E =
\{2, 4, \dots\}$, and
$$B_R(x) = B_E(x) = \sum_{k\in R}\frac{x^k}{k} = -\frac12\log(1-x^2).$$
The random variables $\xi_i = \xi'_i/2$, $i = 1, \dots, N$, are independent identically
distributed, and
$$P\{\xi_1 = k\} = \frac{x^{2k}}{-k\log(1-x^2)}, \quad k = 1, 2, \dots. \tag{4.3.6}$$
If we choose $x = \sqrt{1 - 1/n}$, then this distribution coincides with distribution
(4.2.3) from the previous section, and according to Theorem 4.2.1, if $n \to \infty$,
$N = \gamma\log n + o(\log n)$, where $\gamma$ is a constant, $0 < \gamma < \infty$, then
$$P\{\xi_1+\dots+\xi_N = k\} = \frac{z^{\gamma-1}e^{-z}}{n\,\Gamma(\gamma)}\,(1+o(1))$$
uniformly in the integers $k$ such that $z = k/n$ lies in any fixed interval of the form
$0 < z_0 \le z \le z_1$, where $z_0$ and $z_1$ are constants.
Since $\Gamma(1/2) = \sqrt{\pi}$,
we obtain that if $n \to \infty$, $N = (\log n)/2 + o(\log n)$, and $n$ is even, then
$$P\{\xi_1+\dots+\xi_N = n/2\} = \frac{\sqrt{2}}{n\sqrt{\pi}}\,e^{-1/2}(1+o(1)). \tag{4.3.7}$$
For odd $n$, this probability equals zero.
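To obtain $a_{n,R}$ with the help of relation (4.3.4), we have to sum the probabilities
$P\{\xi'_1+\dots+\xi'_N = n\}$ with the Poisson coefficients. To this end, we need to estimate these
probabilities for all $N$. We show that for all $N$,
$$P\{\xi'_1+\dots+\xi'_N = n\} \le \frac{2N}{n\log n}. \tag{4.3.8}$$
This bound is a consequence of the following chain of estimates. It follows from
(4.3.2) that
$$P\{\xi'_1+\dots+\xi'_N = n\} = \frac{x^n}{(B_E(x))^N}\sum_{K(n,N)}\frac{1}{k_1\cdots k_N},$$
where
$$K(n, N) = \{k_1, \dots, k_N:\ k_1+\dots+k_N = n,\ k_1, \dots, k_N \in R\}.$$
Hence, since $k_1+\dots+k_N = n$ on $K(n,N)$ and the summands are symmetric in $k_1, \dots, k_N$,
$$P\{\xi'_1+\dots+\xi'_N = n\} = \frac{N}{n\,(B_E(x))^N}\sum_{K(n,N)} x^{k_N}\prod_{i=1}^{N-1}\frac{x^{k_i}}{k_i}
\le \frac{N}{n\,(B_E(x))^N}\Big(\sum_{k\in R}\frac{x^k}{k}\Big)^{N-1} = \frac{N}{n\,B_E(x)}.$$
We obtain relation (4.3.8) because $B = B_E(x) = (\log n)/2$.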
We split the sum
$$\sum_{N=1}^{n/2}\frac{B^N e^{-B}}{N!}\,P\{\xi'_1+\dots+\xi'_N = n\} = S_1 + S_2 + S_3 + S_4$$
into four summands, dividing the domain of summation into four parts:
$$A_1 = \{N:\ 1 \le N < B - B^{3/4}\},$$
$$A_2 = \{N:\ B - B^{3/4} \le N \le B + B^{3/4}\},$$
$$A_3 = \{N:\ B + B^{3/4} < N \le B + B^2\},$$
$$A_4 = \{N:\ B + B^2 < N \le n/2\}.$$
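It is not difficult to see that relation (4.3.7) is satisfied uniformly in $N \in A_2$.
Therefore
$$S_2 = \frac{\sqrt{2}}{n\sqrt{\pi}}\,e^{-1/2}(1+o(1))\sum_{N\in A_2}\frac{B^N e^{-B}}{N!}
= \frac{\sqrt{2}}{n\sqrt{\pi}}\,e^{-1/2}(1+o(1)),$$
since $B = (\log n)/2$, and as $n \to \infty$,
$$\sum_{N\in A_2}\frac{B^N e^{-B}}{N!} \to 1.$$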
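The remaining part of the sum is $o(1/n)$. Indeed, by applying estimate (4.3.8),
we obtain
$$S_1 \le \frac{1}{nB}\sum_{N\in A_1}\frac{N B^N e^{-B}}{N!} = \frac{1}{n}\sum_{N\in A_1}\frac{B^{N-1}e^{-B}}{(N-1)!},$$
and $S_1 = o(1/n)$ because
$$\sum_{N\in A_1}\frac{B^{N-1}e^{-B}}{(N-1)!} \to 0$$
as $n \to \infty$.
It follows from (4.3.8) that
$$S_3 \le \frac{1}{n}\sum_{N\in A_3}\frac{B^{N-1}e^{-B}}{(N-1)!}.$$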
If we use the normal approximation for the Poisson distribution, we find that
dN^-B poo
poo
<c,f
it, N i
where $c_1$ and $c_2$ are constants. Hence, $S_3 = o(1/n)$.
Similarly, by using D.3.8), we obtain
-)V'.
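Hence, $S_4 = o(1/n)$ because $e/B < e^{-1}$ for $n$ sufficiently large.
If we combine the estimates of $S_1$, $S_2$, $S_3$, and $S_4$, we obtain
$$\sum_{N=1}^{n/2}\frac{B^N e^{-B}}{N!}\,P\{\xi'_1+\dots+\xi'_N = n\} = \frac{\sqrt{2}}{n\sqrt{\pi}}\,e^{-1/2}(1+o(1)).$$
Substituting this expression into (4.3.4) and expanding $n!$ by the Stirling formula
give the assertion of the theorem. ∎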
The analogous result is valid for the number of permutations for which R is the
set of odd numbers.
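The asymptotics $a_{n,E} \approx 2(n/e)^n$ for even $n$ can be checked against exact counts. The sketch below assumes the standard recurrence obtained by conditioning on the length $k$ of the cycle containing a fixed element, which contributes the factor $(j-1)(j-2)\cdots(j-k+1)$; the function names are illustrative.

```python
import math

def count_even_cycle_permutations(n):
    """Number of permutations of degree n all of whose cycles have even length."""
    a = [1] + [0] * n
    for j in range(1, n + 1):
        total = 0
        falling = 1                    # (j-1)(j-2)...(j-k+1), updated as k grows
        for k in range(1, j + 1):
            if k % 2 == 0:
                total += falling * a[j - k]
            falling *= (j - k)
        a[j] = total
    return a[n]

def ratio_to_asymptotics(n):
    """a_{n,E} / (2 (n/e)^n) for even n, computed through logarithms."""
    log_exact = math.log(count_even_cycle_permutations(n))
    log_approx = math.log(2.0) + n * (math.log(n) - 1.0)
    return math.exp(log_exact - log_approx)

if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, ratio_to_asymptotics(n))
```

Working with logarithms avoids floating-point overflow, since the exact counts are kept as Python integers and only their logarithms are converted to floats.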
We turn now to the case where the set $R$ is not as regular as $E$. Let $R(k)$ be the
number of elements of $R$ that are not greater than $k$. Set $R(0) = 0$. In the sequel,
we assume that
$$\lim_{k\to\infty} R(k)/k = \rho, \quad 0 < \rho \le 1.$$
In this case, $\rho$ is called the density of $R$ in the set of natural numbers.
We will find the asymptotics of $a_{n,R}$ under the following additional conditions
on the set $R$.
(1) There exists a positive integer $r$ such that, for any nonnegative integer $s$, the
set $R \cap \{s+1, \dots, s+r\}$ cannot be embedded in any integer lattice with a
step not equal to 1.
(2) The generating function $F(z)$ of the set $R$ has a finite number $m$ of poles at
the points $z_l = e^{2\pi i l/m}$, $l = 0, 1, \dots, m-1$, on the unit circle $|z| = 1$; in
other words, it is of the form
$$F(z) = \sum_{k\in R} z^k = \frac{P(z)}{1 - z^m}, \tag{4.3.9}$$
where $P(z)$ is a polynomial.
Note that, since the coefficients of the series $F(z)$ take a finite number of values,
by Szegő's theorem (see, for example, [19]), there are only two possibilities for
$F(z)$: Either $F(z)$ has the form (4.3.9), or the set of singular points of $F(z)$ is dense
everywhere on the unit circle, and therefore $F(z)$ cannot be extended outside the
unit circle. We consider here only the first case. In this case, the coefficients of
$F(z)$, with exception of some initial numbers, form a periodic sequence with
period $m$, and, therefore, the set $R$ has density $\rho = l/m$, where $l$ is the number of
units in the period.
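Consider independent identically distributed random variables $\xi_1, \dots, \xi_N$ with
distribution
$$P\{\xi_1 = k\} = \frac{x^k}{k\,B(x)}, \quad k \in R, \tag{4.3.10}$$
where
$$B(x) = \sum_{k\in R}\frac{x^k}{k}, \quad x = 1 - \frac{1}{n}.$$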
Theorem 4.3.2. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions
(1) and (2), $n \to \infty$, $N = \rho\log n + o(\log n)$. Then
$$P\{\xi_1+\dots+\xi_N = k\} = \frac{y^{\rho-1}e^{-y}}{n\,\Gamma(\rho)}\,(1+o(1))$$
uniformly in the integers $k$ such that $y = k/n$ lies in any fixed interval of the form
$0 < y_0 \le y \le y_1 < \infty$.
With the aid of Theorem 4.3.2 and relation (4.3.9), we prove the following
assertions.
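Theorem 4.3.3. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions
(1) and (2). Then, as $n \to \infty$,
$$a_{n,R} = (n-1)!\,e^{B_{n,R}}/\Gamma(\rho)\,(1+o(1)), \tag{4.3.11}$$
where
$$B_{n,R} = \sum_{k\in R}\frac{1}{k}\Big(1 - \frac{1}{n}\Big)^k.$$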
Since $\sum_{k=1}^{\infty}(1 - 1/n)^k/k = \log n$, the assertion (4.3.11) can be written in the
form
where
Theorem 4.3.4. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions
(1) and (2). Then, as $n \to \infty$,
$$P\{\nu_{n,R} = N\} = \frac{1}{\sqrt{2\pi B_{n,R}}}\exp\Big\{-\frac{(N - B_{n,R})^2}{2B_{n,R}}\Big\}(1+o(1))$$
uniformly in the integers $N$ such that $(N - B_{n,R})/\sqrt{B_{n,R}}$ lies in any fixed finite
interval.
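The asymptotic formula of Theorem 4.3.3, read here as $a_{n,R} \approx (n-1)!\,e^{B_{n,R}}/\Gamma(\rho)$ with $B_{n,R} = \sum_{k\in R}(1-1/n)^k/k$, can be tried on a periodic set. The sketch below uses $R = \{k: 3 \nmid k\}$ with $\rho = 2/3$ as an illustrative choice, which appears to satisfy conditions (1) and (2); the normalized counts $c_j = a_{j,R}/j!$ are computed from the recurrence $j c_j = \sum_{k\in R,\,k\le j} c_{j-k}$, a standard consequence of the exponential generating function and not taken from the text. All function names and the truncation parameter `cutoff` are assumptions of the sketch.

```python
import math

def normalized_counts(n, in_R):
    """c[j] = a_{j,R} / j!, from j*c[j] = sum over k in R, k <= j, of c[j-k]."""
    c = [1.0] + [0.0] * n
    for j in range(1, n + 1):
        c[j] = sum(c[j - k] for k in range(1, j + 1) if in_R(k)) / j
    return c

def B_nR(n, in_R, cutoff=60):
    """Truncated series for B_{n,R}; terms beyond cutoff*n are below exp(-cutoff)."""
    x = 1.0 - 1.0 / n
    return sum(x ** k / k for k in range(1, cutoff * n + 1) if in_R(k))

def ratio_4_3_11(n, in_R, rho):
    """Exact a_{n,R} divided by (n-1)! exp(B_{n,R}) / Gamma(rho)."""
    exact_over_factorial = normalized_counts(n, in_R)[n]
    approx_over_factorial = math.exp(B_nR(n, in_R)) / (n * math.gamma(rho))
    return exact_over_factorial / approx_over_factorial

if __name__ == "__main__":
    not_multiple_of_3 = lambda k: k % 3 != 0
    for n in (30, 100, 300):
        print(n, ratio_4_3_11(n, not_multiple_of_3, 2.0 / 3.0))
```

For $R$ equal to all natural numbers the ratio is 1 up to rounding, because then $a_{n,R} = n!$ and $e^{B_{n,R}} = n$.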
To prove Theorem 4.3.2, we establish some auxiliary results.
The characteristic function of distribution (4.3.10) is
$$\varphi(t) = \sum_{k\in R}\frac{x^k e^{itk}}{k\,B(x)} = \frac{B(xe^{it})}{B(x)}.$$
Lemma 4.3.1. If $R$ has the density $\rho > 0$, then, as $n \to \infty$,
$$\varphi\Big(\frac{t}{n}\Big) = 1 - \frac{\log(1-it)}{\log n} + o\Big(\frac{1}{\log n}\Big)$$
for any fixed $t$.
Proof. We first derive some auxiliary estimates. It is easy to see that
oo
keR k=\
oo oo
k=\ k=\
oo
k=\
Set $\varepsilon = \log n$. For such $\varepsilon$,
xkR(k) < J^ k<\og2n,
l<k<e l<k<e
and, since R has positive density,
k>e k>e k>e
Thus, as n —> oo,
oo
i 2
ten i=i V "
D.3.12)
Similarly we obtain the estimate
$$B(x) = \sum_{k\in R}\frac{x^k}{k} = \rho\log n + o(\log n). \tag{4.3.13}$$
We now write the characteristic function in the form
It is easy to see that
B(xeit/n) - B(x)
oo
= E -rxk(eitk/n ~ l)(R(k) - R(k - D)
k=\
= f; -xkR{k) (eitk'n - 1 - JtLfcW+W - I)
k=\k
00 1 / 1
= Y -xkR{k) (eitk'n(\ - ei[/n) + I(e
j—* k \ ' n
ft== 1
First of all, we estimate the part that does not contribute essentially to the sum.
If t is fixed and n —> oo, then
00 k
x
x R(k) {i
We transform the other parts of the sum as follows:
OO
k=\ k
kRik)
, y> xR(k)citk/n L , ^
tk \
kn *—' k \ n
k=\
oo
D.3.14)
(
K— 1
and
00
k=\
00
oo
k=l
oo
kn
k—\
kn
Similarly,
D.3.15)
00
Set $\varepsilon = \log n$ and $E = n\log n$. Then
k<e
kn
logn
k<e
n
k<e
kn
--I)
f
k<e
n
In exactly the same way,
E
kn
,itk/n _ j\
<-2Tkxk <
n2tE
-i)
V k
< - > jc* <
k>E
It is clear that
$$R(k)/k = \rho + o(1)$$
uniformly in $k$, $\varepsilon \le k \le E$. Hence,
itk/n =
kn
e<k<E e<k<E
= p T
e<k<E
Similarly,
_ j) =
kn , r
e<k<E e<k<E
-l) = p ^ kn
e<k<E " e<k<E
The sums in the right-hand sides of these relations are integral sums of integrable
functions. Therefore, as n —> oo, their limits exist and equal
'OO 1
I 1 .' * \ -r J-
./o
1 - It
"z-')^=r477-1-
$$\int_0^{\infty}\frac{1}{z}\,e^{-z}(e^{itz} - 1)\,dz = -\log(1 - it),$$
respectively. Thus, as $n \to \infty$, for any fixed $t$,
$$B(xe^{it/n}) - B(x) = -\rho\log(1 - it) + o(1),$$
and hence,
$$\varphi\Big(\frac{t}{n}\Big) = 1 - \frac{\log(1-it)}{\log n} + o\Big(\frac{1}{\log n}\Big).$$
Lemma 4.3.1 implies that for any fixed $t$, as $n \to \infty$ and $N = \rho\log n + o(\log n)$,
$$\varphi^N\Big(\frac{t}{n}\Big) \to (1 - it)^{-\rho}, \tag{4.3.16}$$
and for the normalized sum $(\xi_1 + \dots + \xi_N)/n$ the limit distribution is the
distribution with the characteristic function $(1 - it)^{-\rho}$ that has the density $y^{\rho-1}e^{-y}/\Gamma(\rho)$.
To prove the local convergence of the distributions, we have to estimate $\varphi(t/n)$
outside a neighborhood of zero.
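Lemma 4.3.2. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions (1)
and (2). Then, for any $\varepsilon > 0$, there exists $q < 1$ such that for $\varepsilon \le |t| \le \pi$,
$$|\varphi(t)| \le q.$$
Proof. Let $k_1$, $k_2$, and $k_3$ be integers and $a_{k_1}, a_{k_2}, a_{k_3} > 0$. It is easy to verify
that
$$|a_{k_1}e^{itk_1} + a_{k_2}e^{itk_2} + a_{k_3}e^{itk_3}|^2 = (a_{k_1} + a_{k_2} + a_{k_3})^2
- 2a_{k_1}a_{k_2}(1 - \cos t(k_2 - k_1))
- 2a_{k_1}a_{k_3}(1 - \cos t(k_3 - k_1))
- 2a_{k_2}a_{k_3}(1 - \cos t(k_3 - k_2)).$$
For $a > 0$ and $0 < \delta \le a^2$,
$$a - \sqrt{a^2 - \delta} \ge \frac{\delta}{2a}.$$
Therefore
$$a_{k_1} + a_{k_2} + a_{k_3} - |a_{k_1}e^{itk_1} + a_{k_2}e^{itk_2} + a_{k_3}e^{itk_3}|
\ge \frac{a_{k_1}a_{k_2}(1 - \cos t(k_2 - k_1)) + a_{k_1}a_{k_3}(1 - \cos t(k_3 - k_1)) + a_{k_2}a_{k_3}(1 - \cos t(k_3 - k_2))}{a_{k_1} + a_{k_2} + a_{k_3}}. \tag{4.3.17}$$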
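The identity for the squared modulus of a three-term trigonometric sum, and the lower bound for $a_{k_1}+a_{k_2}+a_{k_3}$ minus the modulus that follows from it, are easy to confirm numerically. A minimal sketch, with illustrative helper names and arbitrary sample values for the weights, integers, and $t$:

```python
import cmath
import math

def modulus(a, k, t):
    """|a_1 e^{i t k_1} + a_2 e^{i t k_2} + a_3 e^{i t k_3}|."""
    return abs(sum(aj * cmath.exp(1j * t * kj) for aj, kj in zip(a, k)))

def cosine_terms(a, k, t):
    """Sum over pairs i < j of a_i a_j (1 - cos t(k_j - k_i))."""
    return sum(a[i] * a[j] * (1.0 - math.cos(t * (k[j] - k[i])))
               for i in range(3) for j in range(i + 1, 3))

def check(a, k, t):
    """Return (identity holds, lower bound holds) for the given data."""
    s = sum(a)
    identity_ok = math.isclose(modulus(a, k, t) ** 2,
                               s * s - 2.0 * cosine_terms(a, k, t),
                               rel_tol=1e-9, abs_tol=1e-12)
    bound_ok = s - modulus(a, k, t) >= cosine_terms(a, k, t) / s - 1e-12
    return identity_ok, bound_ok

if __name__ == "__main__":
    x = 0.99
    k = (7, 5, 4)
    a = tuple(x ** kj / kj for kj in k)     # weights a_k = x^k / k as in the proof
    for t in (0.3, 1.0, 2.0, math.pi):
        print(t, check(a, k, t))
```

The small tolerances only absorb floating-point rounding; the inequality itself is the elementary bound $a - \sqrt{a^2 - \delta} \ge \delta/(2a)$ applied with $a$ equal to the sum of the weights.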
Suppose now that, as in condition (1), the integers $k_1$, $k_2$, and $k_3$ do not lie on
any lattice with a step greater than 1 and are contained in an interval of length $r$.
Then, for $\varepsilon \le |t| \le \pi$, the three cosines from the right-hand side of (4.3.17) do
not simultaneously take the value 1. Moreover, since $k_1$, $k_2$, and $k_3$ are contained
in an interval of length $r$, their differences can take only a finite number of values.
Therefore, there exists $\alpha > 0$ such that for $\varepsilon \le |t| \le \pi$,
$$(1 - \cos t(k_2 - k_1)) + (1 - \cos t(k_3 - k_1)) + (1 - \cos t(k_3 - k_2)) \ge 3\alpha \tag{4.3.18}$$
uniformly in all such $k_1$, $k_2$, and $k_3$.
We now let $a_k = x^k/k$, $k = 1, 2, \dots$, and suppose condition (1) holds for
$k_1 > k_2 > k_3$. It follows from (4.3.17) and (4.3.18) that
$$a_{k_1} + a_{k_2} + a_{k_3} - |a_{k_1}e^{itk_1} + a_{k_2}e^{itk_2} + a_{k_3}e^{itk_3}|
\ge \frac{\alpha\,a_{k_1}^2}{a_{k_3}} \ge \alpha\,a_r\,a_{k_1}. \tag{4.3.19}$$
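Write the characteristic function $\varphi(t)$ in the form
$$\varphi(t) = \frac{1}{B(x)}\sum_{l=0}^{\infty}\sum_{k=rl+1}^{rl+r} a_k\,(R(k) - R(k-1))\,e^{itk}.$$
From every set $\{rl+1, \dots, rl+r\}$, select, according to condition (1), three
integers $k_{1l}$, $k_{2l}$, and $k_{3l}$ from $R$ that do not lie on any integer lattice with a step not
equal to 1. Using estimate (4.3.19) gives
$$a_{k_{1l}} + a_{k_{2l}} + a_{k_{3l}} - |a_{k_{1l}}e^{itk_{1l}} + a_{k_{2l}}e^{itk_{2l}} + a_{k_{3l}}e^{itk_{3l}}|
\ge \alpha\,a_r\,a_{k_{1l}} \ge \alpha\,a_r\,a_{rl+r}.$$
Therefore, taking into account that $R(k_{il}) - R(k_{il} - 1) = 1$ for $i = 1, 2, 3$ and
$l = 0, 1, \dots$ yields
$$B(x)|\varphi(t)| \le \sum_{l=0}^{\infty}\sum_{k=rl+1}^{rl+r} a_k\,(R(k) - R(k-1))
- \sum_{l=0}^{\infty}(a_{k_{1l}} + a_{k_{2l}} + a_{k_{3l}})
+ \sum_{l=0}^{\infty}|a_{k_{1l}}e^{itk_{1l}} + a_{k_{2l}}e^{itk_{2l}} + a_{k_{3l}}e^{itk_{3l}}|
\le B(x) - \alpha\,a_r\sum_{l=0}^{\infty} a_{rl+r}. \tag{4.3.20}$$
Inequalities (4.3.20) imply the assertion of Lemma 4.3.2 because $r$ is fixed, $x =
1 - 1/n$, and, as $n \to \infty$,
$$B(x) = \rho\log n + o(\log n), \quad \sum_{l=1}^{\infty}\frac{x^{rl}}{l} = -\log(1 - x^r) = \log n + o(\log n).$$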
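Lemma 4.3.3. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions (1)
and (2). Then there exist $c_1$ and $\varepsilon > 0$ such that for every $l = 0, 1, \dots, m-1$
and $|t/n - 2\pi l/m| \le \varepsilon$, for sufficiently large $n$,
$$\frac{1}{n}\Big|\varphi'\Big(\frac{t}{n}\Big)\Big| \le \frac{c_1}{\log n\,\sqrt{1 + (t - 2\pi ln/m)^2}}.$$
Proof. We start by estimating $\varphi'(t/n)$.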
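By condition (2), there exist $c, \delta > 0$ such that for $|z| < 1$, $|z_l - z| < \delta$, $l =
0, 1, \dots, m-1$,
$$\Big|\sum_{k\in R} z^k\Big| \le \frac{c}{|z_l - z|}, \quad l = 0, 1, \dots, m-1. \tag{4.3.21}$$
Set $z = xe^{it/n}$, where $x = 1 - 1/n$. It is clear that $|z| < 1$ and there exists
$\varepsilon > 0$ such that if $|t/n - 2\pi l/m| \le \varepsilon$, then $|z_l - z| < \delta$ for sufficiently large $n$.
Therefore, (4.3.21) implies that for $|t/n - 2\pi l/m| \le \varepsilon$, $l = 0, 1, \dots, m-1$,
$$\Big|\varphi'\Big(\frac{t}{n}\Big)\Big| = \frac{1}{B(x)}\Big|\sum_{k\in R} z^k\Big| \le \frac{c}{B(x)\,|z_l - z|}
\le \frac{c_0\,n}{B(x)\sqrt{1 + (t - 2\pi ln/m)^2}}$$
with some constant $c_0$.
Since $B(x) = \rho\log n + o(\log n)$, there exists $c_1$ such that for every $l = 0, 1, \dots,
m-1$, if $|t/n - 2\pi l/m| \le \varepsilon$, then
$$\Big|\varphi'\Big(\frac{t}{n}\Big)\Big| \le \frac{c_1\,n}{\log n\,\sqrt{1 + (t - 2\pi ln/m)^2}}$$
for sufficiently large $n$.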
We now proceed to estimate the characteristic function $\varphi(t)$ in the intermediate
range of $t$. Obtaining the estimate involves some technical difficulties. So, for the
sake of greater clarity, we first treat the case $R = \mathbb{N}$. In this case,
$$p_k = P\{\xi_1 = k\} = \frac{x^k}{k\,B(x)}, \quad k = 1, 2, \dots, \quad B(x) = -\log(1 - x).$$
Consider the random variable $\tilde\xi = \xi_1 - \xi_2$. Its distribution is symmetric, and for
$m > 0$,
$$\tilde p_m = P\{\tilde\xi = m\} = \sum_{k=1}^{\infty} p_k\,p_{k+m}.$$
Let
$$\tilde\varphi(t) = \tilde p_0 + 2\sum_{m=1}^{\infty}\tilde p_m\cos tm.$$
It is clear that the characteristic function $\varphi(t)$ of the random variable $\xi_1$ is related to
$\tilde\varphi(t)$ by the equality $\tilde\varphi(t) = |\varphi(t)|^2$. To estimate $\tilde\varphi(t)$, we use a standard inequality
(see, e.g., [49]): For $t > 0$,
$$1 - \tilde\varphi(t) = 2\sum_{m=1}^{\infty}\tilde p_m(1 - \cos tm) \ge 2\sum_{s=0}^{\infty}\sum_{m\in M_s}\tilde p_m, \tag{4.3.22}$$
where
$$M_s = \Big\{m:\ \frac{\pi}{2t} + \frac{2\pi s}{t} \le m \le \frac{3\pi}{2t} + \frac{2\pi s}{t}\Big\}.$$
Lemma 4.3.4. For $m_0 > 0$,
$$2\sum_{m\ge m_0}\tilde p_m \ge \sum_{l>2m_0} p_l\sum_{k=1}^{\infty} p_k.$$
Proof. By using $l = m + k$ as the variable of summation, we obtain
$$2\sum_{m\ge m_0}\sum_{k=1}^{\infty} p_k\,p_{k+m} = \sum_{l\ge m_0+1} p_l\sum_{k=1}^{l-m_0} p_k + \sum_{l=1}^{\infty} p_l\sum_{k=l+m_0}^{\infty} p_k. \tag{4.3.23}$$
The right-hand side of (4.3.23) is estimated from below by the quantity
$$\sum_{l>2m_0} p_l\sum_{k=1}^{\infty} p_k = \sum_{l>2m_0} p_l.$$
To see this, it is sufficient to delete the first terms in the first sum from (4.3.23),
retaining
$$\sum_{l>2m_0} p_l\sum_{k=1}^{l-m_0} p_k,$$
and, in the second sum from (4.3.23), to shift the domain of summation to $2m_0$,
giving
$$\sum_{l>2m_0} p_l\sum_{k=l-m_0+1}^{\infty} p_k,$$
which does not exceed the second sum from (4.3.23) by the monotonicity of the
probabilities. ∎
Lemma 4.3.5. For $0 < t \le \pi$,
$$1 - \tilde\varphi(t) \ge \frac{1}{3}\sum_{k>\pi/t} p_k.$$
Proof. Note that the summation on the right-hand side of (4.3.22) occurs over
integers $m$ from an interval of length $\pi/t$. If we enumerate intervals of such a length
on the positive semi-axis starting at the point $\pi/(2t)$, the domain of summation
will consist of the intervals labeled by odd numbers. Notice that the sequence of
probabilities $p_k$, $k = 1, 2, \dots$, is monotone, and the numbers of integer points
in any two intervals of length $\pi/t$ differ by at most 1. Therefore each interval of
length $\pi/t$ for $0 < t \le \pi$ contained in the right-hand side of the sum (4.3.22)
contributes not less than one-third of the total sum of the two following intervals:
the interval itself and the interval adjoined to it on the right side, which does not
belong to the initial domain of summation. (Note that, as $t \to 0$, the number
of integer points in one interval increases and its contribution to the sum tends to
1/2.) Therefore, (4.3.22) implies
$$2\sum_{s=0}^{\infty}\sum_{m\in M_s}\tilde p_m \ge \frac{2}{3}\sum_{m>\pi/(2t)}\tilde p_m.$$
By applying Lemma 4.3.4, we obtain the assertion of Lemma 4.3.5. ∎
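For $R = \mathbb{N}$ the characteristic function has the closed form $\log(1 - xe^{it})/\log(1-x)$, so the bound of Lemma 4.3.5, read here as $1 - |\varphi(t)|^2 \ge \frac13\sum_{k>\pi/t} p_k$ for $0 < t \le \pi$, can be explored numerically. This is a sketch under that reading of the lemma; the function names and the choice $n = 50$ are illustrative.

```python
import cmath
import math

def phi(t, x):
    """Characteristic function log(1 - x e^{it}) / log(1 - x) for R = N."""
    return cmath.log(1.0 - x * cmath.exp(1j * t)) / math.log(1.0 - x)

def tail(a, x):
    """Sum of p_k over k > a, where p_k = x**k / (k * (-log(1 - x)))."""
    B = -math.log(1.0 - x)
    head = sum(x ** k / (k * B) for k in range(1, math.floor(a) + 1))
    return 1.0 - head

def lemma_gap(t, x):
    """(1 - |phi(t)|^2) - (1/3) * sum_{k > pi/t} p_k; expected to be nonnegative."""
    return (1.0 - abs(phi(t, x)) ** 2) - tail(math.pi / t, x) / 3.0

if __name__ == "__main__":
    x = 1.0 - 1.0 / 50
    for t in (0.01, 0.1, 1.0, math.pi):
        print(t, lemma_gap(t, x))
```

The principal branch of the complex logarithm is the right one here because $1 - xe^{it}$ has positive real part for $0 < x < 1$.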
It remains to estimate the sum of the form Ylk>a Pk from below. If we use the
inequality
we obtain
T-
k>a
co
-y
D.3.24)
where c-x, is a constant.
We use Lemma 4.3.5, set a = nn/\t\ in D.3.24), and obtain for \t\/n <n,
- >
pi >
C3 - log
nn
where a, is a constant. Hence, we go on to estimate <p(t/ri) and find that
ft"
V
n
3logn
<
(
6logn
If N > \ logn, then
N
1
<exp|-—log|f| + ——
[ 12 12 log
} < cskr1/12. D.3.25)
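We now return to the case $R \subset \mathbb{N}$. We retain the notation $\varphi(t)$ and $\tilde\varphi(t)$ for the
characteristic functions and set
$$p_k = P\{\xi_1 = k\} = \frac{a_k x^k}{B(x)}, \quad k \in R, \quad B(x) = \sum_{k\in R} a_k x^k, \quad a_k = \frac{1}{k};$$
$\delta_R(k) = 0$ for $k \notin R$, and $\delta_R(k) = 1$ for $k \in R$.
Lemma 4.3.6. Suppose that $R$ has the density $\rho > 0$ and satisfies conditions (1)
and (2). Then, for $|t|/n \le \pi$ and $N \ge \frac12\rho\log n$,
$$\Big|\varphi\Big(\frac{t}{n}\Big)\Big|^N \le c_6\,|t|^{-1/(12r^2\rho)},$$
where $r$ is defined in condition (1) and $c_6$ is a constant.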
Proof. We revise the arguments leading to estimate D.3.25). Inequality D.3.22)
now takes the following form: For t > 0,
~ oo oo
where
2ns In 2ns 1
\
ITT 2ns In
2^ t ~ ~ 2t t \
We retain only one summand in each interval of length r, replace this summand by
the minimum value over the interval, and use the transition from the sum over one
interval of length r to one-third of the sum over the interval of twice the length.
Then we obtain for t > 0,
oo
J2T,ak+mXk+mSR(k + m)>- ? ak+rlxk+rl.
s=0meMs rl>7t/Bt)
Once again, we preserve only one summand in each interval of length r and get
2 °°
ak+nxk+rl
k=l l>n/Btr)
oo
.rm+rl
urm+rlJ
l>n/Btr)
E
oo x™
3B2(x)r2 *-< ^ m m+l'
l>it/Btr) m — \
The assertion of Lemma 4.3.4 is based on the monotonicity of the
probabilities $p_k$, $k = 1, 2, \dots$. The summands of the last double sum are similar to the
summands of the sum in Lemma 4.3.4, and the values $x^{rk}/k$, $k = 1, 2, \dots$, are
also monotonic. Therefore we may use Lemma 4.3.4 and obtain
c™ log(l -jcO
/
l>7t/Btr) m=\
m
,2 Z_^ /
l>7t/(tr)
For a fixed $r$, the estimate (4.3.24) remains true. Therefore, by taking into account
the asymptotics $B(x) = \rho\log n + o(\log n)$ and $-\log(1 - x^r) = \log n + o(\log n)$,
we find
2
i i \
1 -
t
n
1
3r2p2\ogn
Hence,
'?.
\og\t\
and for $N \ge \frac12\rho\log n$,
$$\Big|\varphi\Big(\frac{t}{n}\Big)\Big|^N \le c_6\,|t|^{-1/(12r^2\rho)},$$
where $c_6$ is a constant.
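Proof of Theorem 4.3.2. Consider the sum $\zeta_N = \xi_1 + \dots + \xi_N$ of independent
identically distributed random variables with distribution (4.3.10). As we have
seen, Lemma 4.3.1 implies that, as $n \to \infty$ and $N = \rho\log n + o(\log n)$, the
distribution of $\zeta_N/n$ converges weakly to the distribution with density
$$\frac{u^{\rho-1}e^{-u}}{\Gamma(\rho)}, \quad u > 0.$$
We now prove the local convergence of these distributions. For an integer $k$, let
$y = k/n$. By the inversion formula,
$$nP\{\zeta_N = k\} = \frac{1}{2\pi}\int_{-\pi n}^{\pi n} e^{-ity}\,\varphi^N\Big(\frac{t}{n}\Big)\,dt,$$
where $\varphi(t)$ is the characteristic function of the distribution (4.3.10). The density
of the limit distribution at a point $u > 0$ can be represented by the integral
$$\frac{u^{\rho-1}e^{-u}}{\Gamma(\rho)} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-itu}}{(1 - it)^{\rho}}\,dt.$$
Hence,
$$2\pi\Big(nP\{\zeta_N = k\} - \frac{y^{\rho-1}e^{-y}}{\Gamma(\rho)}\Big) = I_1 + I_2 + I_3,$$
where
$$I_1 = \int_{-A}^{A} e^{-ity}\Big(\varphi^N\Big(\frac{t}{n}\Big) - (1 - it)^{-\rho}\Big)\,dt,$$
$$I_2 = -\int_{A<|t|}\frac{e^{-ity}}{(1 - it)^{\rho}}\,dt,$$
$$I_3 = \int_{A<|t|\le\pi n} e^{-ity}\,\varphi^N\Big(\frac{t}{n}\Big)\,dt,$$
and the constant $A$ in the integrals is to be chosen later.
By (4.3.16), for any fixed $A$, the integral $I_1$ tends to zero as $n \to \infty$ and
$N = \rho\log n + o(\log n)$.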
To estimate the integrals I2 and I3, we integrate by parts. For I2, this yields
'00 p-ity
(i -
dt = -¦
iy{\ - ity
00
-yh
,-ity
dt.
Hence,
\h\<
dt
and $|I_2|$ can be made arbitrarily small by the choice of $A$.
Similarly,
ryrn
JA
t\ . e~"y N t
- ) dt = <pN -
n) iy V«,
where
_ n rn
iy Ja
t\ 1 ,(C
-)-<p I -
N
N
Therefore
2
When we use the estimates of Lemmas 4.3.2 and 4.3.6, we obtain
\I\-
q< 1;
N
\<p(A/n)\N <
Hence these summands can be made arbitrarily small. It remains to estimate the
integral $I$. Choose $\varepsilon$ such that Lemma 4.3.3 is valid, and represent $I$ as the sum of
three integrals:
$$I = I_1(\varepsilon) + I_2(\varepsilon) + I_3(\varepsilon),$$
where
TV f
ly Ja
A<\t\<en
''-Y-V'('-)"'¦
n I n \n )
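$I_2(\varepsilon)$ is the integral over the union of $\varepsilon$-neighborhoods of the poles of $F(z)$, that is,
over the set
$$\bigcup_{l=1}^{m-1}\Big[-\varepsilon n + \frac{2\pi ln}{m},\ \varepsilon n + \frac{2\pi ln}{m}\Big],$$
and $I_3(\varepsilon)$ is the integral over the remaining set
$$A_\varepsilon = \big\{[-\pi n, -\varepsilon n] \cup [\varepsilon n, \pi n]\big\} \setminus \bigcup_{l=1}^{m-1}\Big[-\varepsilon n + \frac{2\pi ln}{m},\ \varepsilon n + \frac{2\pi ln}{m}\Big].$$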
By using Lemmas 4.3.3 and 4.3.6, we find
2Nc\c6
c_6 r
n Ja
1
A A+-'2I/2
and for y > yo > 0, the value |/i(e)| can be made arbitrarily small by the choice
of a sufficiently large A.
By using Lemma 4.3.3, we find
i pen+27tln/m
If
n J-
—en+2nln/m
A1-
U
dt <
— f
logn J_
en+lxln/m
dt
- 2nln/mJ
logn
ren dt
J-en Vl +t2'
and there exists a constant c-j such that for a fixed e,
dt
'-en Vl + t2
p
J -
< c-j logn.
Therefore, we use the estimate of Lemma 4.3.2 and find that for y > yo > 0,
Nl en dt
\h(e)\ <
ylogn
and under the conditions of Theorem 4.3.2, the right-hand side tends to zero.
For $t \in A_\varepsilon$,
$$\Big|\varphi'\Big(\frac{t}{n}\Big)\Big| \le \frac{c_8}{B(x)},$$
where $c_8$ is a constant that is the upper bound of $|F(z)|$ for $|z| = x$ not in the
neighborhoods of the poles. By using this estimate and the estimate of Lemma
4.3.2, we find
<
N
y
Ja,
<p
N-\
N-\
ynB{xV
n
L
1
n
dt
dt
Under the conditions of the theorem, the last term of this chain of inequalities
tends to zero for $y \ge y_0 > 0$. ∎
It is easy to see that by first choosing a sufficiently large $A$ and then a sufficiently
large $n$, we can make the difference being estimated arbitrarily small. Note that
the difference is bounded uniformly with respect to $N$, and hence, there exists a
constant $c_9$ such that for $y \ge y_0 > 0$ and for all $N$,
$$P\{\zeta_N = k\} \le c_9/n. \tag{4.3.26}$$
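Proof of Theorem 4.3.3. In (4.3.4), divide the domain of summation into two
parts: $\mathcal{N}_1 = \{N: |N - B(x)| \le N^{2/3}\}$ and $\mathcal{N}_2 = \{N: |N - B(x)| > N^{2/3}\}$.
It is not difficult to see that the assertion of Theorem 4.3.2 is fulfilled uniformly
in $N \in \mathcal{N}_1$. Therefore
$$P\{\zeta_N = n\} = \frac{e^{-1}}{n\,\Gamma(\rho)}\,(1+o(1))$$
uniformly in $N \in \mathcal{N}_1$, so
$$\sum_{N\in\mathcal{N}_1}\frac{(B(x))^N e^{-B(x)}}{N!}\,P\{\zeta_N = n\} = \frac{e^{-1}}{n\,\Gamma(\rho)}\,(1+o(1)).$$
We use the estimate (4.3.26) and obtain
$$\sum_{N\in\mathcal{N}_2}\frac{(B(x))^N e^{-B(x)}}{N!}\,P\{\zeta_N = n\} \le \frac{c_9}{n}\sum_{N\in\mathcal{N}_2}\frac{(B(x))^N e^{-B(x)}}{N!}.$$
Since the sum on the right-hand side of this inequality tends to zero, the total sum
in (4.3.4) equals $(en\Gamma(\rho))^{-1}(1+o(1))$. It remains to note that
$$x^n = e^{-1}(1+o(1)), \quad B(x) = B_{n,R} = \sum_{k\in R}\frac{1}{k}\Big(1 - \frac{1}{n}\Big)^k.$$
∎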
Proof of Theorem 4.3.4. According to (4.3.3),
$$P\{\nu_{n,R} = N\} = \frac{n!\,(B(x))^N}{N!\,x^n\,a_{n,R}}\,P\{\zeta_N = n\}.$$
If we substitute the corresponding expressions for $a_{n,R}$ and $P\{\zeta_N = n\}$, we obtain
$$P\{\nu_{n,R} = N\} = \frac{(B(x))^N e^{-B(x)}}{N!}\,(1+o(1))$$
for $N = B(x) + o(B(x))$. We note that $B(x) = B_{n,R}$ and that the expression
obtained above holds uniformly in $N$ such that $(N - B(x))/\sqrt{B(x)}$ lies in any
fixed finite interval; thus, we obtain the assertion of Theorem 4.3.4. ∎
4.4. Notes and references
The probabilistic approach that is now commonly used in combinatorics was first
formulated in an explicit form and applied in the investigations of the symmetric
group $S_n$ by V. L. Goncharov [51, 52, 53]. For the random variables $\alpha_1, \dots, \alpha_n$,
he found the joint distribution (4.1.4) and the generating function (4.1.5). For the
total number of cycles $\nu_n = \alpha_1 + \dots + \alpha_n$, he proved that, as $n \to \infty$,
$$E\nu_n = \log n + \gamma + o(1), \quad D\nu_n = \log n + \gamma - \pi^2/6 + o(1).$$
Goncharov also proved that the distribution of $(\nu_n - \log n)/\sqrt{\log n}$ converges
to the standard normal distribution, and the distribution of $\alpha_r$ converges to the
Poisson distribution with parameter $1/r$.
Let $\beta_{\nu_n}$ be the length of the maximum cycle in a random permutation from $S_n$.
Goncharov [51, 53] showed that
h=0
where
S0(m,n) = l, Sh(m,n) =
Let
I0(x,\-x) = l, Ih(x,l-x)=
I
l" h
dx\
X\-\
X\,...,Xh>X
Goncharov proved that, as $n \to \infty$, the random variable $\beta_{\nu_n}/n$ has the distribution
with the density
^' / 1 \h 1 i
which, as is clear from the preceding formula, is defined by different analytic
expressions on the sequential intervals of the form $[1/(1+k), 1/k]$, where $k$ is an
integer. For example,
x 2
1 1
3 2
Although Goncharov investigated the cycle structure of random permutations
in great detail, these problems continue to be of significant interest to
mathematicians. V. F. Kolchin [71] proposed an approach based on the generalized scheme
of allocation. The results on the asymptotic properties of random permutations
obtained with the help of this approach are presented in [78]. Note that, among
the others, the asymptotic logarithmic normality of the middle terms of the series
of order statistics composed of the lengths of cycles, and the local limit theorem
on the convergence of the distribution of the total number of cycles $\nu_n$ to the
normal distribution were first proved by this method. It is clear that this approach
makes it possible to investigate the asymptotic behavior of the local probabilities
$P\{\nu_n = N\}$ for all possible values of $N = N(n)$ as $n \to \infty$. These
investigations were carried out in [109, 115, 117, 146, 147, 148]. In Section 4.2, the results
of these investigations are presented. Theorems 4.2.1 and 4.2.2 were proved by
Yu. Pavlov in [115, 117]; and Theorems 4.2.5, 4.2.6, 4.2.9, 4.2.10, and 4.2.12 were
proved by L. M. Volynets in [146, 147, 148].
Methods of estimating the rate of convergence in limit theorems for sums of
independent random variables are well developed in the theory of probability.
Therefore the approach that reduces the study of characteristics of random
permutations to problems concerning the sums of independent summands provides
an obvious way to obtain the limit theorems containing estimates of the rate of
convergence. The estimates under the conditions of Theorem 4.2.1 were obtained
by Yu. Pavlov [117] and for $\gamma = 1$ by A. Pavlov [109]. The following result of
Volynets [146] provides a better bound than the one given in [109].
Theorem 4.4.1. If $n \to \infty$, $N = \log n + x\sqrt{\log n}$, $x/\sqrt{\log n} \to 0$, then
Volynets [146] proved this theorem by using the approach based on the
generalized scheme of allocation.
Let $\Sigma_n$ be the set of all single-valued mappings of the set $\{1, \dots, n\}$ into itself.
In particular, $S_n \subset \Sigma_n$. The random mappings from $\Sigma_n$ were first studied by
J. B. Kruskal [94] and B. Harris [57], and many studies have considered subsets
of $\Sigma_n$, which are distinguished from $\Sigma_n$ by various constraints on the mappings.
We mention only the articles by V. N. Sachkov [128, 129, 130], in which the
mappings have the height of less than a fixed number, and cycle lengths are from
a fixed set; the articles by A. A. Grusho [54, 55], which treat the subset $\Sigma_{n,r}$ that
consists of the mappings from $\Sigma_n$ whose vertex degrees are not greater than $r$; the
articles by Yu. Pavlov [114, 115] considering the characteristics of the mappings
with exactly $m$ components (the case $m = 1$ is considered by G. N. Bagaev in
[8, 9]); and the article by J. Arney and E. A. Bender [5], which treats mappings
with constraints on degrees of the vertices. The research in these directions began
in the early seventies and is still ongoing. In our opinion, the most surprising results
concerning mappings with constraints were obtained by I. B. Kalugin [64], which
we summarize.
Let $\Sigma_{n,R}$ be the subset of mappings from $\Sigma_n$ such that the degrees of the vertices
take values only from a set $R$ that contains zero and does not coincide with the set
{0, 1}.
Let $\xi(\lambda)$ be a random variable with the distribution
where $\lambda$ is a positive constant and
There exists $\alpha_R$ such that $E\xi(\alpha_R) = 1$. Denote by $B_R$ the variance of the
random variable $\xi(\alpha_R)$. For the number of cyclic vertices $\lambda_{n,R}$ and the height
$\tau_{n,R}$ of the random mapping from $\Sigma_{n,R}$, the following assertions are well known
[64, 78].
Theorem 4.4.2. If $n \to \infty$, then
$$\sqrt{n/B_R}\;P\{\lambda_{n,R} = k\} = z\,e^{-z^2/2}(1+o(1))$$
uniformly in the integers $k$ such that $z = k\sqrt{B_R/n}$ lies in any interval of the form
$0 < z_0 \le z \le z_1 < \infty$.
Theorem 4.4.3. If $n \to \infty$, then for any fixed $x > 0$,
00
[nxn,R < x] -+
k=—oo
An unexpected result appears if we consider the set $\Sigma^*_{n,R}$ of mappings from $\Sigma_n$
defined as follows. If in the graph of a mapping from $\Sigma_n$ we delete the edges that
connect the cyclic vertices, we obtain a graph consisting of trees. The set $\Sigma^*_{n,R}$
contains the mappings from $\Sigma_n$ such that the degree of any vertex of the trees
takes a value in $R$. Thus the difference in the restrictions on the degrees in $\Sigma^*_{n,R}$
and $\Sigma_{n,R}$ seems to be insignificant because only the restrictions on the degrees of
cyclic vertices differ by 1. But the sets $\Sigma_{n,R}$ and $\Sigma^*_{n,R}$ have a substantial difference
in the structure of their corresponding random graphs.
Let $\lambda^*_{n,R}$ and $\tau^*_{n,R}$, respectively, be the number of cyclic vertices and the height
of a random mapping from the set $\Sigma^*_{n,R}$ with uniform distribution. For the random
variable $\xi(\lambda)$, set
If $R$ does not coincide with the set of all nonnegative integers, then $a_R < 1$.
Theorem 4.4.4. If $n \to \infty$, then
$$P\{\lambda^*_{n,R} = k\} = \frac{1}{b_R\sqrt{2\pi n}}\,e^{-z^2/2}(1+o(1))$$
uniformly in the integers $k$ such that $z = (k - (1 - a_R)n)/(b_R\sqrt{n})$ lies in any fixed
finite interval.
Theorem 4.4.5. If $n \to \infty$ and $t = t(n)$ is such that $n a_R^t \to \beta$, where $\beta$ is a
constant, then for any fixed integer $m$,
<* <t+m} =
where the constant kp depends only on ft and the set R.
Since $t = t(n)$ is of order $\log n$, the random mappings from $\Sigma^*_{n,R}$ have many
cyclic vertices and, as a consequence, have the height of order $\log n$ rather than
$\sqrt{n}$ as in the case for the mappings from $\Sigma_{n,R}$. A satisfactory explanation for this
situation is not known.
In Section 4.3, we considered the set $S_{n,R}$ of all permutations of degree $n$ with
cycle lengths from a fixed set $R$. The interest in such sets may be partly explained
by their connection with the equations involving permutations, which we will look
at in the next section. Another reason for investigating the set $S_{n,R}$ and similar sets
of mappings with various restrictions is the possibility (see [5]) of approximating
more complicated sets of combinatorial objects by such sets with relatively simple
constraints. Partly for these reasons, the asymptotic behavior of the number $a_{n,R}$
of elements in $S_{n,R}$ has been considered in some recent studies [25, 80, 102, 149,
153, 154].
The generating function $f(z)$ for the numbers $a_{n,R}$ of elements in $S_{n,R}$ is
$$f(z) = \sum_{n=0}^{\infty}\frac{a_{n,R}\,z^n}{n!} = \exp\Big\{\sum_{k\in R}\frac{z^k}{k}\Big\}.$$
Therefore it is convenient to apply the saddle-point method to obtain the
asymptotics of $a_{n,R}$. By this method, the cases in which the elements of $R$ form an
arbitrary arithmetic progression are considered in [25, 107]; see also [130].
The application of the Tauberian-type theorems is another approach that has
been used in the investigations of this problem [153, 154, 155]. Let $R(n)$ be the
number of elements of $R$ that are not greater than $n$ and let $|A|$ be the number of
elements in $A$.
Theorem 4.4.6. Let $n \to \infty$,
$$R(n)/n \to \rho, \quad 0 < \rho \le 1, \tag{4.4.1}$$
and for $m \ge n$, $m = O(n)$,
$$\frac{1}{n}\,|\{k:\ k \le n,\ k \in R,\ m - k \in R\}| \to \rho^2. \tag{4.4.2}$$
Then
$$a_{n,R} = (n-1)!\exp\{l_{n,R} - \gamma\rho\}/\Gamma(\rho)\,(1+o(1)), \tag{4.4.3}$$
where
$$l_{n,R} = \sum_{r\in R,\ r\le n}\frac{1}{r},$$
$\gamma$ is the Euler constant, and $\Gamma$ is the Euler gamma function.
Conditions (4.4.1) and (4.4.2) indicate that the set $R$ is similar to a typical
realization of a random set containing each positive integer with probability $\rho$
independent of the other integers.
As examples of the sets $R$ that satisfy conditions (4.4.1) and (4.4.2), we may
take sets of the form
$$R = \{k:\ \{g(k)\} \in A\}, \tag{4.4.4}$$
where $g(t)$ is a real-valued function of $t > 0$, $\{x\}$ is the fractional part of $x$, and
$A$ is an interval or a finite union of intervals from [0, 1] with the Lebesgue
measure $\rho$.
A. L. Yakymiv [154, 155] proved that a set $R$ of the form (4.4.4) satisfies
conditions (4.4.1) and (4.4.2) if
$$g(t) = t^{\alpha}\,l(t),$$
where $\alpha$ is a noninteger positive number, $l(t)$ is a slowly varying function, and as
$t \to \infty$,
Let $\alpha_{r,R}$ be the number of cycles of length $r$ in a random permutation of $S_{n,R}$
and let $\nu_{n,R} = \alpha_{1,R} + \dots + \alpha_{n,R}$ be its total number of cycles. Yakymiv [154,
155] proved the following assertions.
Theorem 4.4.7. Suppose that conditions (4.4.1) and (4.4.2) are satisfied and
$n \to \infty$. Then the distribution of the random variable $(\nu_{n,R} - l_{n,R})/\sqrt{\rho\log n}$
converges weakly to the standard normal distribution, and for any fixed $r \in R$, the
distribution of $\alpha_{r,R}$ converges to the Poisson distribution with parameter $1/r$.
A case of irregular behavior of an,R is considered in [149].
Theorem 4.4.8. If $n \to \infty$ and $R = E \cup M$, where $E$ is the set of all even positive numbers and $M$ is a set of odd numbers such that the series
$$\sum_{m \in M} \frac{1}{m}$$
converges, then
e
for even n, and
,-b
e, , -Of!+*(!))
for odd n.
Volynets [149] proved this theorem with the aid of relation (4.3.4), in which she uses the representation
=n-s}.
s,m
Here the variables ?[ ,..., %? have the parameter x equal to -JT^TJn, v is
the number of these variables taking values in M, rj is the sum of these variables,
and ?j ,..., ?J^ are independent identically distributed random variables with
the distribution
^ l^^l keE.
Note that if $b \to 0$, the result of Theorem 4.4.8 transfers continuously to (4.3.5).
Theorems 4.3.2, 4.3.3, and 4.3.4 are given in [80]. It can be easily shown that the asymptotics (4.3.11) and (4.4.3) are identical. Thus, quite different sets of conditions yield coinciding results. This coincidence shows that there exist weaker conditions sufficient for the validity of the asymptotics (4.3.11). We give the detailed and cumbersome proof of Theorem 4.3.3 because we conjecture that condition (1) from this theorem and the existence of a positive density of $R$ are sufficient for the validity of (4.3.11) and that it may be possible to simplify the proof.
The research on the sets $S_{n,R}$ of permutations with restrictions on the cycle lengths provides an example of a fruitful competition of various analytical methods of asymptotic analysis such as the saddle-point method, the application of Tauberian-type theorems, and the approach based on the generalized scheme of allocation.
Note that it would also be interesting to consider the cases where the density $p = 0$.
Equations containing an unknown permutation
5.1. A quadratic equation
If $g$ and $f$ are permutations of degree $n$, then the result of their sequential action $h = fg$ is a permutation of degree $n$ called the product of $g$ and $f$. The set $S_n$ of all permutations of degree $n$ with this operation is the well-known symmetric group of degree $n$. Therefore we can consider equations of the form
$$X^d = a, \tag{5.1.1}$$
where $d$ is a positive integer, $a \in S_n$, and $X$ is an unknown permutation from $S_n$. In the previous chapter, we considered the set $S_{n,R}$ of all permutations of degree $n$ with cycle lengths from a fixed set $R$ and found the asymptotics for the number of elements in $S_{n,R}$ for some regular sets $R$. The interest in the sets of permutations $S_{n,R}$ may be partly explained by their connection with some equations involving permutations. For example, the set of all solutions of the equation
$$X^p = e \tag{5.1.2}$$
in the symmetric group $S_n$, where $e$ is the identity permutation and $p$ is a prime number, is exactly the set $S_{n,R}$ with $R = \{1, p\}$. Indeed, a permutation $X$ satisfies equation (5.1.2) if and only if its cycles are of the length 1 or $p$. Denote by $T_n^{(p)}$ the number of solutions of equation (5.1.2).
Theorem 5.1.1. If $p$ is a prime number, then
$$T_n^{(p)} = \sum_{0 \le k \le n/p} \frac{n!}{(n - pk)!\,k!\,p^k}.$$
Proof. Let $\sigma$ be a random permutation from $S_n$. It is clear that
$$P\{\sigma^p = e\} = \frac{T_n^{(p)}}{n!},$$
and the study of $T_n^{(p)}$ is equivalent to the study of the probability $P\{\sigma^p = e\}$. Since $T_n^{(p)} = a_{n,R}$, where $R = \{1, p\}$,
$$\{\sigma^p = e\} = \{\alpha_r = 0,\ r \ne 1,\ r \ne p\} = \{\alpha_1 + p\alpha_p = n\},$$
where $\alpha_r$ is the number of cycles of length $r$ in a random permutation from $S_n$. By (4.1.4),
$$P\{\alpha_1 = n - pk,\ \alpha_p = k,\ \alpha_r = 0,\ r \ne 1,\ r \ne p\} = \frac{1}{(n - pk)!\,k!\,p^k}.$$
Summing these probabilities over admissible values of $k$ yields the assertion of the theorem. ■
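The formula of Theorem 5.1.1 is easy to check numerically for small $n$. The following Python sketch evaluates the sum and compares it with a direct count of the permutations $X$ satisfying $X^p = e$. The function names `count_solutions_formula` and `count_solutions_brute` are illustrative choices and do not come from the text.

```python
from itertools import permutations
from math import factorial


def count_solutions_formula(n, p):
    """T_n^{(p)} as the sum over k of n! / ((n - p k)! k! p^k)."""
    total = 0
    for k in range(n // p + 1):
        # Each term counts permutations with k cycles of length p
        # and n - p*k fixed points, so the division is exact.
        total += factorial(n) // (factorial(n - p * k) * factorial(k) * p ** k)
    return total


def count_solutions_brute(n, p):
    """Count permutations X of {0, ..., n-1} with X^p equal to the identity."""
    identity = tuple(range(n))
    count = 0
    for perm in permutations(range(n)):
        power = identity
        for _ in range(p):
            # Compose with perm once more: power becomes perm applied after power.
            power = tuple(perm[i] for i in power)
        if power == identity:
            count += 1
    return count


if __name__ == "__main__":
    for p in (2, 3, 5):
        for n in range(1, 7):
            print(p, n, count_solutions_formula(n, p), count_solutions_brute(n, p))
```

The brute-force count visits all $n!$ permutations, so it is only practical for small $n$; the formula itself costs $O(n/p)$ big-integer operations.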
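Set $a_{0,R} = 1$ and consider the generating function of the sequence $a_{n,R}$,
$$f_R(z) = \sum_{n=0}^{\infty} \frac{a_{n,R}\, z^n}{n!}.$$
Theorem 5.1.2.
$$f_R(z) = \exp\Big\{\sum_{r \in R} \frac{z^r}{r}\Big\}.$$
Proof. According to (4.1.5),
$$\sum_{n=0}^{\infty} u^n \varphi_n(t_1, \ldots, t_n) = \exp\Big\{\sum_{n=1}^{\infty} \frac{u^n t_n}{n}\Big\}, \tag{5.1.3}$$
where
$$\varphi_n(t_1, \ldots, t_n) = \sum_{m_1, \ldots, m_n} P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\}\, t_1^{m_1} \cdots t_n^{m_n},$$
and $\alpha_r$ is the number of cycles of length $r$ in a random permutation from $S_n$.
If we put $t_r = 1$ for $r \in R$ and $t_r = 0$ for $r \notin R$, we find that the corresponding generating function $\varphi_n(t_1, \ldots, t_n)$ is
$$\sum_{M_R} P\{\alpha_r = m_r,\ r \in R,\ \alpha_r = 0,\ r \notin R\},$$
where
$$M_R = \Big\{m_1, \ldots, m_n : \sum_{r \in R} r m_r = n\Big\}.$$
It is easy to see that
$$\sum_{M_R} P\{\alpha_r = m_r,\ r \in R,\ \alpha_r = 0,\ r \notin R\} = P\Big\{\sum_{r \in R} r\alpha_r = n\Big\}.$$
Thus, substituting $t_r = 1$ if $r \in R$ and $t_r = 0$ if $r \notin R$ into (5.1.3) shows that the generating function for
$$P\Big\{\sum_{r \in R} r\alpha_r = n\Big\} = \frac{a_{n,R}}{n!}$$
equals
$$\sum_{n=0}^{\infty} \frac{a_{n,R}\, u^n}{n!} = \exp\Big\{\sum_{r \in R} \frac{u^r}{r}\Big\}. \tag{5.1.4}$$
■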
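Theorem 5.1.2 can be illustrated by expanding $\exp\{\sum_{r \in R} z^r/r\}$ as a power series with exact rational coefficients and comparing $n!$ times the $n$th coefficient with a direct count of the permutations whose cycle lengths all lie in $R$. The sketch below is only an illustration: the helper names are hypothetical, and the series is expanded with the standard identity $n f_n = \sum_{k} k g_k f_{n-k}$ for $f = e^g$, which is a convenience of this example and not a step of the proof above.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def egf_coefficients(n_max, R):
    """Coefficients f_0..f_{n_max} of exp(sum_{r in R} z^r / r), as exact fractions."""
    g = [Fraction(0)] * (n_max + 1)
    for r in R:
        if 1 <= r <= n_max:
            g[r] = Fraction(1, r)
    f = [Fraction(1)] + [Fraction(0)] * n_max
    for n in range(1, n_max + 1):
        # Differentiating f = exp(g) gives f' = g' f, that is,
        # n f_n = sum_{k=1}^{n} k g_k f_{n-k}.
        f[n] = sum(k * g[k] * f[n - k] for k in range(1, n + 1)) / n
    return f


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple with perm[i] the image of i."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def count_by_enumeration(n, R):
    """a_{n,R}: the number of permutations of degree n with all cycle lengths in R."""
    allowed = set(R)
    return sum(
        1
        for perm in permutations(range(n))
        if all(length in allowed for length in cycle_lengths(perm))
    )


if __name__ == "__main__":
    R = {1, 2}
    coefficients = egf_coefficients(6, R)
    for n in range(7):
        print(n, coefficients[n] * factorial(n), count_by_enumeration(n, R))
```

Exact fractions are used so that the comparison with the integer counts is not affected by rounding.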
In view of Theorem 5.1.2, it is convenient to apply the saddle-point method to obtain asymptotics of $T_n^{(p)}$. In the next section, we will use a different approach based on the generalized scheme of allocation; however, for comparison, we now present the derivation of the asymptotics of $T_n^{(2)}$ by applying the saddle-point method.
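Theorem 5.1.3. As $n \to \infty$,
Proof. Since
$$\sum_{n=0}^{\infty} \frac{T_n^{(2)} z^n}{n!} = e^{z + z^2/2},$$
by Cauchy's formula
$$F(n) = \frac{T_n^{(2)}}{n!} = \frac{1}{2\pi i} \oint \frac{e^{z + z^2/2}}{z^{n+1}}\, dz,$$
integrating over an arbitrary contour that goes around the point $z = 0$. We can write
$$F(n) = \frac{1}{2\pi i} \oint e^{f(z)}\, \frac{dz}{z}$$
and choose the contour of integration to be the circle passing through the saddle point $\varrho$, where the derivative of the function
$$f(z) = z + \frac{z^2}{2} - n \log z$$
is zero. From the equation
$$f'(z) = 1 + z - \frac{n}{z} = 0,$$
we find that
$$\varrho = \sqrt{n + 1/4} - 1/2.$$
Thus, setting $z = \varrho e^{i\varphi}$, $-\pi \le \varphi \le \pi$, shows that
$$F(n) = \frac{1}{2\pi i} \oint e^{f(z)}\, \frac{dz}{z} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{f(\varrho e^{i\varphi})}\, d\varphi.$$
For the sake of brevity, we let $a = \varrho \sin\varphi + (\varrho^2 \sin 2\varphi)/2 - n\varphi$ and write the integral in the form
$$F(n) = \frac{e^{\varrho + \varrho^2/2}}{2\pi \varrho^n} \int_{-\pi}^{\pi} e^{ia}\, e^{-\varrho(1 - \cos\varphi) - \varrho^2(1 - \cos 2\varphi)/2}\, d\varphi.$$
Since $F(n)$ is real, we see that
$$F(n) = \frac{e^{\varrho + \varrho^2/2}}{2\pi \varrho^n} \int_{-\pi}^{\pi} \cos a\; e^{-\varrho(1 - \cos\varphi) - \varrho^2(1 - \cos 2\varphi)/2}\, d\varphi.$$
We choose $\varepsilon = \varrho^{-3/4}$ and estimate the integral outside the $\varepsilon$-neighborhood of zero, as $n \to \infty$, taking into account that $\varrho = \sqrt{n + 1/4} - 1/2 \to \infty$. The integrand is even, so we only estimate the integral over $\varphi$, $0 < \varphi < \pi$. It is convenient to consider the graphs of the functions $\cos\varphi$ and $\cos 2\varphi$ included in the exponent. With the help of the graphs presented in Figure 5.1.1, we can easily see that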
< r12 e-QH
Js
Js
since $1 - \cos 2\varepsilon > \varepsilon^2$ for sufficiently small $\varepsilon$.
Similarly,
f /2*
Jn
[
tt/2
Jn/2
<P
= -»-Q
Figure 5.1.1. Graphs of $\cos\varphi$ and $\cos 2\varphi$
Thus
F{n) =
2ttq"
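where $\varepsilon = \varrho^{-3/4}$. Since $\varrho + \varrho^2 - n = 0$, we find that, in a neighborhood of zero,
$$a = \varrho \sin\varphi + \frac{\varrho^2}{2} \sin 2\varphi - n\varphi = \varrho\varphi + \varrho^2\varphi - n\varphi + O(\varrho^2 |\varphi|^3) = O(\varrho^2 |\varphi|^3),$$
and therefore
$$\cos a = 1 + O(a^2) = 1 + O(\varrho^4 \varphi^6).$$
The exponent of the integrand can be represented in the domain of integration as follows:
$$\varrho(1 - \cos\varphi) + \frac{\varrho^2}{2}(1 - \cos 2\varphi) = \frac{1}{2}(\varrho + 2\varrho^2)\varphi^2 + O(\varrho^2 \varphi^4).$$
Thus, for $|\varphi| \le \varepsilon$,
$$e^{-\varrho(1 - \cos\varphi) - \varrho^2(1 - \cos 2\varphi)/2} = e^{-\varphi^2(\varrho + 2\varrho^2)/2}\,(1 + O(\varrho^{-1})).$$
Therefore
$$\int_{-\varepsilon}^{\varepsilon} \cos a\; e^{-\varrho(1 - \cos\varphi) - \varrho^2(1 - \cos 2\varphi)/2}\, d\varphi = \int_{-\varepsilon}^{\varepsilon} e^{-\varphi^2(\varrho + 2\varrho^2)/2}\, d\varphi\; (1 + O(\varrho^{-1/2})).$$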
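The change of variables $\theta = \sqrt{\varrho + 2\varrho^2}\,\varphi$ gives
$$\int_{-\varepsilon}^{\varepsilon} e^{-\varphi^2(\varrho + 2\varrho^2)/2}\, d\varphi = \frac{1}{\sqrt{\varrho + 2\varrho^2}} \int_{-\varepsilon\sqrt{\varrho + 2\varrho^2}}^{\varepsilon\sqrt{\varrho + 2\varrho^2}} e^{-\theta^2/2}\, d\theta = \sqrt{\frac{2\pi}{\varrho + 2\varrho^2}}\,(1 + o(1)),$$
since as $x \to \infty$,
$$\int_x^{\infty} e^{-\theta^2/2}\, d\theta = \frac{1}{x}\, e^{-x^2/2}\,(1 + o(1)).$$
Combining the estimates gives
$$F(n) = \frac{e^{\varrho + \varrho^2/2}}{\sqrt{2\pi}\, \varrho^n \sqrt{\varrho + 2\varrho^2}}\,(1 + O(\varrho^{-1/2})).$$
It remains to substitute $\varrho = \sqrt{n + 1/4} - 1/2$ into this formula.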
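The quality of the saddle-point estimate can be examined numerically. The sketch below compares, on the logarithmic scale, the exact value of $F(n) = T_n^{(2)}/n!$ with the estimate $e^{\varrho + \varrho^2/2}/(\sqrt{2\pi}\,\varrho^n \sqrt{\varrho + 2\varrho^2})$, where $\varrho = \sqrt{n + 1/4} - 1/2$, and with the classical closed form $(n/e)^{n/2} e^{\sqrt{n} - 1/4}/\sqrt{2}$ for $T_n^{(2)}$. Two ingredients are assumptions of this example and not part of the derivation above: the exact values are generated by the standard recurrence $T_n^{(2)} = T_{n-1}^{(2)} + (n-1)T_{n-2}^{(2)}$ for the number of involutions, and the function names are hypothetical.

```python
from math import lgamma, log, pi, sqrt


def involution_number(n):
    """Exact T_n^{(2)} via the recurrence T_n = T_{n-1} + (n - 1) T_{n-2}."""
    previous, current = 1, 1  # T_0 and T_1
    for m in range(2, n + 1):
        previous, current = current, current + (m - 1) * previous
    return current


def log_exact_F(n):
    """log(T_n^{(2)} / n!) computed from the exact integer value."""
    return log(involution_number(n)) - lgamma(n + 1)


def log_saddle_point_F(n):
    """log of exp(rho + rho^2/2) / (sqrt(2 pi) rho^n sqrt(rho + 2 rho^2))."""
    rho = sqrt(n + 0.25) - 0.5
    return rho + rho ** 2 / 2 - n * log(rho) - 0.5 * log(2 * pi * (rho + 2 * rho ** 2))


def log_closed_form_T(n):
    """log of (n/e)^(n/2) exp(sqrt(n) - 1/4) / sqrt(2)."""
    return 0.5 * n * log(n) - 0.5 * n + sqrt(n) - 0.25 - 0.5 * log(2)


if __name__ == "__main__":
    for n in (10, 100, 400, 1000):
        saddle_error = log_exact_F(n) - log_saddle_point_F(n)
        closed_error = log(involution_number(n)) - log_closed_form_T(n)
        print(n, saddle_error, closed_error)
```

Both differences shrink as $n$ grows, in agreement with the remainder terms tending to zero.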
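Since $F(n) = T_n^{(2)}/n!$, we find that
$$\log T_n^{(2)} = \log n! + \varrho + \frac{\varrho^2}{2} - n \log\varrho - \frac{1}{2}\log(\varrho + 2\varrho^2) - \log\sqrt{2\pi} + O(\varrho^{-1/2}). \tag{5.1.5}$$
Replace $\log n!$ by Stirling's formula
$$\log n! = n \log n - n + \frac{1}{2}\log n + \log\sqrt{2\pi} + O(n^{-1}). \tag{5.1.6}$$
It is easily seen that
$$\varrho = \sqrt{n + 1/4} - \frac{1}{2} = \sqrt{n} - \frac{1}{2} + \frac{1}{8\sqrt{n}} + O(n^{-3/2}), \tag{5.1.7}$$
$$\varrho + \frac{\varrho^2}{2} = \frac{n}{2} + \frac{\sqrt{n}}{2} - \frac{1}{4} + O(n^{-1/2}). \tag{5.1.8}$$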
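When we use (5.1.7), we find
$$n \log\varrho = \frac{n}{2}\log n + n \log\Big(1 - \frac{1}{2\sqrt{n}} + \frac{1}{8n} + O\Big(\frac{1}{n^2}\Big)\Big) = \frac{n}{2}\log n - \frac{\sqrt{n}}{2} + O\Big(\frac{1}{\sqrt{n}}\Big). \tag{5.1.9}$$
Finally,
$$\log(\varrho + 2\varrho^2) = \log(2n - \varrho) = \log 2n + O(n^{-1/2}). \tag{5.1.10}$$
By substituting estimates (5.1.6)-(5.1.10) into (5.1.5), we obtain the final formula for $\log T_n^{(2)}$:
$$\log T_n^{(2)} = \frac{n}{2}\log n - \frac{n}{2} + \sqrt{n} - \frac{1}{4} - \log\sqrt{2} + O(n^{-1/4}),$$
which implies the assertion of the theorem. ■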
5.2. Equations of prime degree
According to (4.3.4), the number $a_{n,R}$ of permutations in $S_{n,R}$ can be represented in the form
$$a_{n,R} = \frac{n!\, e^{B_R(x)}}{x^n} \sum_{N=1}^{\infty} \frac{B_R^N(x)\, e^{-B_R(x)}}{N!}\, P\{\xi_1^{(R)} + \cdots + \xi_N^{(R)} = n\}, \tag{5.2.1}$$
where
$$B_R(x) = \sum_{k \in R} \frac{x^k}{k}, \tag{5.2.2}$$
and $\xi_1^{(R)}, \ldots, \xi_N^{(R)}$ are independent identically distributed random variables,
$$P\{\xi_1^{(R)} = k\} = \frac{x^k}{k B_R(x)}, \quad k \in R, \tag{5.2.3}$$
and the positive parameter $x$ can be chosen arbitrarily from the domain of convergence of the series in (5.2.2).
If $p$ is a prime number, then the number $T_n^{(p)}$ of solutions of equation (5.1.2) is $a_{n,R}$, where $R = \{1, p\}$. Therefore
$$B_R(x) = x + \frac{x^p}{p},$$
and by (5.2.1),
$$T_n^{(p)} = \frac{n!\, e^{B(x)}}{x^n} \sum_{N=1}^{\infty} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}, \tag{5.2.4}$$
where $\zeta_N = \xi_1 + \cdots + \xi_N$; $\xi_1, \ldots, \xi_N$ are independent identically distributed random variables and
$$P\{\xi_1 = 1\} = \frac{px}{px + x^p}, \qquad P\{\xi_1 = p\} = \frac{x^p}{px + x^p}. \tag{5.2.5}$$
Thus, to find the asymptotics of $T_n^{(p)}$, it suffices to choose an appropriate value of $x$ and to prove a local limit theorem for the sum $\zeta_N = \xi_1 + \cdots + \xi_N$. The summation of independent random variables taking two values is a simple problem that is solved by the de Moivre-Laplace theorem. Therefore the approach based on the representation (5.2.4) seems more suitable here than the saddle-point method.
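Since the representation of $T_n^{(p)}$ as $n!\,e^{B(x)}x^{-n}$ times a Poisson-weighted sum of the probabilities $P\{\xi_1 + \cdots + \xi_N = n\}$ is an identity for every $x > 0$, it can be verified numerically. In the sketch below, each $\xi_i$ takes the value $p$ with probability $x^p/(px + x^p)$ and the value 1 otherwise, so that the sum equals $N + (p-1)J$ with $J$ binomially distributed. The result is compared with the exact count from Theorem 5.1.1. The function names and the sample values of $x$ are assumptions made for this illustration.

```python
from math import comb, exp, factorial


def exact_count(n, p):
    """T_n^{(p)} from the explicit sum of Theorem 5.1.1."""
    return sum(
        factorial(n) // (factorial(n - p * k) * factorial(k) * p ** k)
        for k in range(n // p + 1)
    )


def count_via_allocation(n, p, x):
    """Evaluate n! e^B / x^n * sum_N (B^N e^{-B} / N!) P{xi_1 + ... + xi_N = n}."""
    B = x + x ** p / p
    q = x ** p / (p * x + x ** p)  # probability that a summand equals p
    total = 0.0
    poisson = exp(-B)  # Poisson weight B^N e^{-B} / N! at N = 0
    for N in range(1, n + 1):  # the sum is at least N, so N > n contributes nothing
        poisson *= B / N
        # The sum equals N + (p - 1) * J, where J counts the summands equal to p.
        if (n - N) % (p - 1) != 0:
            continue
        J = (n - N) // (p - 1)
        if J > N:
            continue
        total += poisson * comb(N, J) * q ** J * (1 - q) ** (N - J)
    return factorial(n) * exp(B) / x ** n * total


if __name__ == "__main__":
    for n, p, x in [(10, 2, 3.0), (12, 3, 1.7), (15, 5, 0.8)]:
        print(n, p, x, exact_count(n, p), count_via_allocation(n, p, x))
```

The freedom in the choice of $x$ is what the proofs exploit: $x$ is chosen so that the main contribution to the sum comes from the region where the normal approximations apply.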
We begin by applying this approach to the proof of Theorem 5.1.3.
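Proof of Theorem 5.1.3. If $R = \{1, 2\}$, then obviously
$$P\{\xi_1 = 1\} = \frac{x}{B(x)} = \frac{2}{2 + x}, \qquad P\{\xi_1 = 2\} = \frac{x^2}{2B(x)} = \frac{x}{2 + x},$$
where $B(x) = B_R(x) = x + x^2/2$, and $E\zeta_N = N E\xi_1 = N(x + x^2)/B(x)$.
In the main part of the sum in (5.2.4), the parameter $N$ takes values close to $B(x)$; therefore we choose $x$ such that
$$x + x^2 = n.$$
Hence,
$$x = \sqrt{n + 1/4} - \frac{1}{2}, \qquad B(x) = x + \frac{x^2}{2} = \frac{n}{2} + \frac{x}{2}, \qquad E\xi_1 = \frac{x + x^2}{B(x)} = \frac{n}{B(x)},$$
and $D\xi_1 = 2n^{-1/2}(1 + o(1))$ as $n \to \infty$ (where $D$ denotes the variance).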
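Let
$$u = \frac{2(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = \sqrt{2 \log n},$$
and divide the sum from (5.2.4) into two parts so that
$$T_n^{(2)} = \frac{n!\, e^{B(x)}}{x^n}\,(S_1 + S_2), \tag{5.2.6}$$
where
$$S_1 = \sum_{N : |u| \le A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}, \qquad S_2 = \sum_{N : |u| > A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}.$$
In the first sum, $N = B(x) + o(\sqrt{B(x)})$, and by using the normal approximation to the Poisson distribution, we obtain, as $n \to \infty$,
$$\frac{B^N(x)\, e^{-B(x)}}{N!} = \frac{1}{\sqrt{2\pi B(x)}}\,(1 + o(1))$$
uniformly in the integers $N$ such that $|u| \le A$.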
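The sum $\zeta_N - N$ has the binomial distribution with $N$ trials and the probability of success $p(x) = x/(2 + x)$. If $|u| \le A$, then $N = B(x)(1 + o(1))$, and
$$N p(x)(1 - p(x)) = \sqrt{n}\,(1 + o(1))$$
as $n \to \infty$. Therefore the normal approximation to the binomial distribution is valid. For $|u| \le A = \sqrt{2 \log n}$,
$$\frac{n - N E\xi_1}{\sqrt{N D\xi_1}} = \frac{n(B(x) - N)}{B(x)\sqrt{N D\xi_1}} = -u(1 + o(1)).$$
Therefore, by the de Moivre-Laplace theorem,
$$P\{\zeta_N = n\} = \frac{1}{\sqrt{2\pi B(x) D\xi_1}}\, e^{-u^2/2}\,(1 + o(1))$$
uniformly in the integers $N$ such that $|u| \le A$.
The behavior of the functions $\varphi_1(N) = B^N(x) e^{-B(x)}/N!$ and $\varphi_2(N) = P\{\zeta_N = n\}$ is represented approximately in Figure 5.2.1.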
The sum S\ can be estimated as follows:
= E
N:\u\<A
= E
N:\u\<A
1
AH
E
Figure 5.2.1. The graphs of $\varphi_1(N)$ and $\varphi_2(N)$
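The last sum is an integral sum of the function $e^{-u^2/2}$ with step $2(B(x) D\xi_1)^{-1/2}$, so as $n \to \infty$,
$$S_1 = \frac{1}{2\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\, du\,(1 + o(1)) = \frac{1}{2\sqrt{2\pi B(x)}}\,(1 + o(1)).$$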
By virtue of monotonicity, for \u\ > A,
P{^N=n}<
and there exists a constant c such that
Therefore
9 — V^
i
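Thus
$$S_1 + S_2 = S_1(1 + o(1)),$$
and by substituting this estimate into (5.2.6), we obtain
$$T_n^{(2)} = \frac{n!\, e^{B(x)}}{2 x^n \sqrt{2\pi B(x)}}\,(1 + o(1)).$$
It remains to substitute
$$x = \sqrt{n + 1/4} - \frac{1}{2}$$
into the formula. It is easily seen that
$$e^{B(x)} = e^{n/2 + \sqrt{n}/2 - 1/4}\,(1 + o(1)),$$
$$x^n = n^{n/2} e^{-\sqrt{n}/2}\,(1 + o(1)).$$
Therefore
$$T_n^{(2)} = \frac{1}{\sqrt{2}} \Big(\frac{n}{e}\Big)^{n/2} e^{\sqrt{n} - 1/4}\,(1 + o(1)),$$
and Theorem 5.1.3 with the remainder term of the form $1 + o(1)$ is proved. ■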
We now turn to the case where $p$ is a fixed prime number, $p \ge 3$, and consider the number $T_n^{(p)}$ of solutions of equation (5.1.2).
Theorem 5.2.1. If $n \to \infty$ and $p$ is prime, $p \ge 3$, then
$$T_n^{(p)} = p^{-1/2} \Big(\frac{n}{e}\Big)^{n(1 - 1/p)} e^{n^{1/p}}\,(1 + o(1)).$$
Proof. The proof is almost the same as the proof of Theorem 5.1.3 given above and is also based on relation (5.2.4). For $R = \{1, p\}$,
$$B(x) = B_R(x) = x + x^p/p,$$
and the independent random variables $\xi_1, \ldots, \xi_N$ in (5.2.4) have the distribution
$$P\{\xi_1 = 1\} = \frac{x}{B(x)} = \frac{px}{px + x^p}, \qquad P\{\xi_1 = p\} = \frac{x^p}{pB(x)} = \frac{x^p}{px + x^p}.$$
We choose the parameter $x$ such that
$$x + x^p = n. \tag{5.2.7}$$
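Then
$$x = n^{1/p} - \frac{1}{p}\, n^{2/p - 1}(1 + o(1)), \qquad B(x) = x + \frac{x^p}{p} = \frac{n}{p} + \frac{(p-1)x}{p},$$
$$p(x) = \frac{x^p}{px + x^p} = 1 - p n^{-1 + 1/p}(1 + o(1)),$$
$$E\xi_1 = n/B(x), \qquad D\xi_1 = (p-1)^2\, p\, n^{-1 + 1/p}(1 + o(1)).$$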
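Let
$$u = \frac{p(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = \sqrt{2 \log n},$$
and divide the sum in (5.2.4) into two parts so that
$$T_n^{(p)} = \frac{n!\, e^{B(x)}}{x^n}\,(S_1 + S_2),$$
where
$$S_1 = \sum_{N : |u| \le A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}, \qquad S_2 = \sum_{N : |u| > A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}.$$
In the first sum, $N = B(x) + o(\sqrt{B(x)})$ and
$$\frac{B^N(x)\, e^{-B(x)}}{N!} = \frac{1}{\sqrt{2\pi B(x)}}\,(1 + o(1))$$
uniformly in the integers $N$ such that $|u| \le A$.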
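Let $\xi_i^* = (\xi_i - 1)/(p - 1)$, $i = 1, \ldots, N$. The sum
$$\zeta_N^* = \xi_1^* + \cdots + \xi_N^*$$
has the binomial distribution with $N$ trials and the probability of success
$$p(x) = x^p/(px + x^p) = 1 - p n^{-1 + 1/p} + O(n^{-2 + 2/p})$$
as $n \to \infty$. It is clear that
$$P\{\zeta_N = n\} = P\{\zeta_N^* = (n - N)/(p - 1)\},$$
and if $(n - N)/(p - 1)$ is not an integer, then $P\{\zeta_N = n\} = 0$. Since $E\xi_1 = n/B(x)$, $B(x) = (n/p)(1 + o(1))$, and
$$\frac{(n - N)/(p - 1) - N E\xi_1^*}{\sqrt{N D\xi_1^*}} = \frac{n - N E\xi_1}{\sqrt{N D\xi_1}} = \frac{n(B(x) - N)}{B(x)\sqrt{N D\xi_1}} = -u(1 + o(1))$$
as $n \to \infty$ and $|u| \le A$, by using the de Moivre-Laplace theorem, we obtain
$$P\{\zeta_N = n\} = P\{\zeta_N^* = (n - N)/(p - 1)\} = \frac{p - 1}{\sqrt{2\pi B(x) D\xi_1}}\, e^{-u^2/2}\,(1 + o(1))$$
uniformly in the integers $N$ such that $(n - N)/(p - 1)$ is an integer and $|u| \le A$.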
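Therefore
$$S_1 = \sum_{N : |u| \le A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\} = \frac{p - 1}{2\pi B(x)\sqrt{D\xi_1}} \sum_{N : |u| \le A} e^{-u^2/2}\,(1 + o(1)),$$
where the summation is over the integers $N$ such that $(n - N)/(p - 1)$ is an integer.
The last sum is an integral sum of the function $e^{-u^2/2}$ with step $p(B(x) D\xi_1)^{-1/2}$. Since the summation is over $N$ such that $(n - N)/(p - 1)$ is an integer, that is, only each $(p - 1)$th term is included in the sum, we obtain
$$\frac{p(p - 1)}{\sqrt{B(x) D\xi_1}} \sum_{N : |u| \le A} e^{-u^2/2} = \int_{-\infty}^{\infty} e^{-u^2/2}\, du\,(1 + o(1)) = \sqrt{2\pi}\,(1 + o(1)).$$
Therefore, as $n \to \infty$,
$$S_1 = \frac{1}{p\sqrt{2\pi B(x)}}\,(1 + o(1)).$$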
For \u\ > A,
p-\
and there exists a constant c such that
and S2 <
S =
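Thus
$$S_1 + S_2 = \frac{1}{p\sqrt{2\pi B(x)}}\,(1 + o(1)),$$
and by substituting this estimate into (5.2.6), we obtain
$$T_n^{(p)} = \frac{n!\, e^{B(x)}}{p\, x^n \sqrt{2\pi B(x)}}\,(1 + o(1)). \tag{5.2.8}$$
It is easily seen that
$$x^n = n^{n/p}\, e^{-n^{1/p}/p}\,(1 + o(1)), \qquad e^{B(x)} = e^{n/p + (p - 1)n^{1/p}/p}\,(1 + o(1)).$$
When we substitute these expressions into (5.2.8), we obtain the assertion of Theorem 5.2.1. ■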
A slight refinement of the estimates used in the proof of Theorem 5.2.1 allows us to show that the assertion of the theorem is valid if $p$ tends to infinity slowly, as specified below, where we prove a more general result.
Theorem 5.2.2. If $p$ is prime and $n, p \to \infty$ in such a way that $p/n \to 0$, then
$$T_n^{(p)} = p^{1/2} \Big(\frac{n}{e}\Big)^{n(1 - 1/p)} \sum_{k=0}^{\infty} \frac{n^{(m + pk)/p}}{(m + pk)!}\,(1 + o(1)); \tag{5.2.9}$$
in particular, if $p^{-2} n^{1/p} \to \infty$, then
$$T_n^{(p)} = p^{-1/2} \Big(\frac{n}{e}\Big)^{n(1 - 1/p)} e^{n^{1/p}}\,(1 + o(1)), \tag{5.2.10}$$
and if $p^{-1} n^{1/p} \to 0$, then
$$T_n^{(p)} = \Big(\frac{n}{e}\Big)^{n(1 - 1/p)} p^{1/2}\, \frac{n^{m/p}}{m!}\,(1 + o(1)), \tag{5.2.11}$$
where $m = n - p[n/p]$, and $[c]$ is the integer part of $c$.
Proof. The proof is similar to the proof of Theorem 5.2.1, but now we need to trace the effect of the parameter $p$ in the remainder terms of the asymptotic formulas and to use a representation in terms of the Poisson probabilities instead of the representation (5.2.4).
It follows from the equation $x + x^p = n$ that under the conditions of the theorem,
$$x = n^{1/p} - \frac{n^{2/p}}{np} + O\Big(\frac{n^{3/p}}{n^2 p^2}\Big), \tag{5.2.12}$$
$$B = B(x) = \frac{n}{p} + \frac{(p - 1)n^{1/p}}{p} + O\Big(\frac{n^{2/p}}{np}\Big). \tag{5.2.13}$$
Therefore it is easy to confirm that
$$p(x) = P\{\xi_1 = p\} = \frac{x^p}{px + x^p} = 1 - p n^{-1 + 1/p} + O(n^{-2 + 2/p}).$$
The random variable $(\zeta_N - N)/(p - 1)$ can be represented in the form
$$\frac{\zeta_N - N}{p - 1} = N - \eta_N,$$
where $\eta_N$ has the binomial distribution with $N$ trials and probability of success
$$q = q(x) = 1 - p(x) = p n^{-1 + 1/p}(1 + O(n^{-1 + 1/p})). \tag{5.2.14}$$
Therefore it is not difficult to see that for $n = m + p[n/p]$, the probability $P\{\zeta_N = n\}$ is nonzero if
$$N = [n/p] + m + k(p - 1), \quad 0 \le k \le [n/p],$$
and for such $N$,
$$P\{\zeta_N = n\} = P\{\eta_N = l\} = \binom{N}{l} q^l (1 - q)^{N - l},$$
where $l = m + kp$. Thus, the representation (5.2.4) takes the form
$$T_n^{(p)} = \frac{n!\, e^{B}}{x^n} \sum_{k=0}^{[n/p]} \frac{B^N e^{-B}}{N!} \binom{N}{l} q^l (1 - q)^{N - l}.$$
This results in the representation
$$T_n^{(p)} = \frac{n!\, e^{B}}{x^n} \sum_{k=0}^{[n/p]} \frac{(Bq)^l e^{-Bq}}{l!} \cdot \frac{(B(1 - q))^{N - l} e^{-B(1 - q)}}{(N - l)!}, \tag{5.2.15}$$
where $l = m + pk$, $N = [n/p] + m + k(p - 1)$, $m = n - p[n/p]$; and to obtain the basic assertion of the theorem, we must sum the products of two Poisson probabilities. Let
s =
v a = (n /P
a = (n /PJp/n
and divide s into two parts,
\m-\-pk
51 =
^ (m + pk)\
k:\(N-B)b-V2\>a
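Note that $a \to 0$ under the conditions of the theorem, and the normal approximation to the second multiplier
$$\frac{(B(1 - q))^{N - l} e^{-B(1 - q)}}{(N - l)!} = \frac{1}{\sqrt{2\pi B}}\,(1 + o(1)) \tag{5.2.16}$$
is valid for all $l$, $N$ such that $|(N - B)B^{-1/2}| \le a$, and outside this region,
$$\frac{(B(1 - q))^{N - l} e^{-B(1 - q)}}{(N - l)!} \le \frac{c}{\sqrt{2\pi B}}, \tag{5.2.17}$$
where $c$ is a constant.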
It remains to show that .s'2 = o(s\) and
ni/P)m+Pk _nl/p
(¦+"(¦))• E-2.18)
For the sake of brevity, we let $b = Bq$. It follows from (5.2.13) and (5.2.14) that under the conditions of the theorem,
$$b = n^{1/p}(1 + O(p n^{-1 + 1/p})). \tag{5.2.19}$$
It is clear that
b[b]+P
51 - W
since at least one of the summands with / from the interval ([ft], [ft] + p) is included
in the sum s\.
On the other hand, the summation over N > B + a*/B is the summation over /,
with/ = m + pk such that / > b + ay/B + o(^/~B ). Let/o = b + a^/B + o(^/B).
Then,
ft ft2 \ bl°e~bl0 cbl°
since ft/ Iq —>• 0. Therefore,
— <
s\ h(h - 1) • • • ([ft] + p + 1)
c
A + (/o - b)/b) •••(! + ([ft] - b + p + l)/ft)
i 2
~ (/o - ftK - (V
where $c_1$, $c_2$, and $c_3$ are constants. By the choice of $a$, the last bound tends to zero.
This estimate, (5.2.16), (5.2.17), and (5.2.19) imply (5.2.18). Assertion (5.2.9) follows from (5.2.15), (5.2.16), (5.2.17), and (5.2.18).
If $p^{-2} n^{1/p} \to \infty$, then by using the normal approximation, we obtain
$$\sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + pk}}{(m + pk)!}\, e^{-n^{1/p}} = \frac{1}{p}\,(1 + o(1)).$$
This yields assertion (5.2.10) of the theorem.
Assertion (5.2.11) follows from the fact that if $p^{-1} n^{1/p} \to 0$, then
$$\sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + pk}}{(m + pk)!} = \frac{n^{m/p}}{m!}\,(1 + o(1)).$$
5.3. Equations of compound degree
In this section, we consider the number $T_n^{(d)}$ of solutions of the equation
$$X^d = e, \tag{5.3.1}$$
where $d$ is a natural number, $e$ is the identity permutation, and $X$ is an unknown element of the symmetric group $S_n$. The cases where $d$ is a prime number were considered in the previous sections. Let $d$ be a compound number and let $1 = d_0 < d_1 < \cdots < d_r = d$ be all different divisors of $d$. A permutation $X$ is a solution of equation (5.3.1) if and only if the lengths of cycles of $X$ belong to the set $\{d_0, \ldots, d_r\}$. Therefore $T_n^{(d)}$ is equal to the number $a_{n,R}$ of permutations in $S_{n,R}$, where $R = \{d_0, \ldots, d_r\}$. The following is a generalization of Theorems 5.1.3 and 5.2.1.
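The characterization of the solutions of $X^d = e$ by their cycle lengths gives a simple way to compute $T_n^{(d)}$ exactly. The sketch below uses the elementary recurrence obtained by fixing the cycle that contains one chosen element: if this cycle has length $r$, its other $r - 1$ elements can be chosen and ordered in $(m-1)!/(m-r)!$ ways. This recurrence and the function names are assumptions of the example, not taken from the text; the result is checked against a direct enumeration.

```python
from itertools import permutations
from math import factorial


def divisors(d):
    """All positive divisors of d in increasing order."""
    return [j for j in range(1, d + 1) if d % j == 0]


def count_solutions(n, d):
    """Number of X in S_n with X^d = e, i.e. with all cycle lengths dividing d."""
    R = divisors(d)
    a = [1] + [0] * n  # a[m] is the number of solutions in S_m, with a[0] = 1
    for m in range(1, n + 1):
        # The cycle through a fixed element has some length r in R; its other
        # r - 1 elements are chosen in order in (m-1)!/(m-r)! ways.
        a[m] = sum(
            factorial(m - 1) // factorial(m - r) * a[m - r] for r in R if r <= m
        )
    return a[n]


def count_solutions_brute(n, d):
    """Direct count of the permutations X of degree n with X^d = e."""
    identity = tuple(range(n))
    count = 0
    for perm in permutations(range(n)):
        power = identity
        for _ in range(d):
            power = tuple(perm[i] for i in power)
        if power == identity:
            count += 1
    return count


if __name__ == "__main__":
    for d in (4, 6):
        print(d, [count_solutions(n, d) for n in range(1, 9)])
```

For example, in $S_5$ with $d = 4$ the solutions are the 26 involutions (including the identity) together with the 30 permutations containing a cycle of length 4, which gives 56.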
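Theorem 5.3.1. If $n \to \infty$ and $d$ is a fixed number, $d \ge 2$, then
$$T_n^{(d)} = d^{-1/2} \Big(\frac{n}{e}\Big)^{n(1 - 1/d)} \exp\Big\{\sum_{j | d,\ j < d} \frac{n^{j/d}}{j}\Big\}(1 + o(1))$$
if $d$ is odd, and
$$T_n^{(d)} = d^{-1/2} \Big(\frac{n}{e}\Big)^{n(1 - 1/d)} \exp\Big\{\sum_{j | d,\ j < d} \frac{n^{j/d}}{j} - \frac{1}{2d}\Big\}(1 + o(1))$$
if $d$ is even.
Note that the summation in the above formulas is over the divisors $j$ of the number $d$, and if we put $d = 2$ and $d = p$, we obtain Theorems 5.1.3 and 5.2.1, respectively.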
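Proof. Let $1 = d_0 < d_1 < \cdots < d_r = d$ be all the divisors of $d$, $R = \{d_0, \ldots, d_r\}$,
$$B(x) = \sum_{k \in R} \frac{x^k}{k},$$
and let $\xi_1, \ldots, \xi_N$ be independent identically distributed random variables,
$$P\{\xi_1 = k\} = \frac{x^k}{kB(x)}, \quad k \in R, \tag{5.3.2}$$
where the positive parameter $x$ can be chosen arbitrarily. Since $d$ is compound, $r \ge 2$.
Put $\zeta_N = \xi_1 + \cdots + \xi_N$. It is clear that
$$E\xi_1 = (x + x^{d_1} + \cdots + x^{d_r})/B(x).$$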
We choose the parameter $x$ such that
$$x + x^{d_1} + \cdots + x^{d_{r-1}} + x^d = n, \tag{5.3.3}$$
and in what follows, we consider the random variables $\xi_1, \ldots, \xi_N$ with distribution (5.3.2), where $x$ is the solution of this equation.
By iteration, it is not difficult to determine that
$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + o(1) \tag{5.3.4}$$
if $d$ is odd, and
$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + 1/2 + o(1) \tag{5.3.5}$$
if $d$ is even.
Since $T_n^{(d)} = a_{n,R}$, where $R = \{1, d_1, \ldots, d_{r-1}, d\}$, we can use the representation (5.2.1) and obtain
$$T_n^{(d)} = \frac{n!\, e^{B(x)}}{x^n} \sum_{N=1}^{\infty} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}. \tag{5.3.6}$$
Therefore, to obtain the assertions of Theorem 5.3.1, it is sufficient to find the asymptotics of $P\{\zeta_N = n\}$.
It is not difficult to see that
$$D\xi_1 = \frac{B(x)(x + d_1 x^{d_1} + \cdots + d x^d) - n^2}{B^2(x)},$$
B{x) = r r 2/
d d M J
where the summation is over the integers $j$, which are the divisors of $d$. In view of (5.3.4) and (5.3.5),
0+0A)), E-3-7)
j\d J
as $n \to \infty$. By estimating the second and third central moments of $\xi_1$ and using the characteristic function of $\zeta_N$, we can prove that the distribution of the random variable $(\zeta_N - N E\xi_1)/\sqrt{N D\xi_1}$ converges to the normal law with parameters $(0, 1)$ as $N D\xi_1 \to \infty$. If $h$ is the maximal step of the lattice containing the set $R$, then the local limit theorem is valid on this lattice. We omit the proof of this local theorem.
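The remaining part of the proof of Theorem 5.3.1 repeats the corresponding part of the proof of Theorem 5.1.3 from Section 5.2. We put
$$u = \frac{n - N E\xi_1}{\sqrt{N D\xi_1}}, \qquad v = \frac{d(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = 2\sqrt{2 \log n},$$
and divide the sum from (5.3.6) into two parts so that
$$T_n^{(d)} = \frac{n!\, e^{B(x)}}{x^n}\,(S_1 + S_2),$$
where
$$S_1 = \sum_{N : |u| \le A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}, \qquad S_2 = \sum_{N : |u| > A} \frac{B^N(x)\, e^{-B(x)}}{N!}\, P\{\zeta_N = n\}.$$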
It is easy to see that $N = B(x)(1 + o(1))$ for $|u| \le A = 2\sqrt{2 \log n}$ and
n(B(x)-N) _ uh,o(-i,2^ E3o,
v — ——-—. t^ , — —w(i + U\n II, p.J.o;
and by the local limit theorem,
uniformly in the integers $N$ such that $|u| \le A$ and $(n - N)/h$ are integers. Recall that $h$ is the maximal span of the distribution of $\xi_1$.
As in the proof of Theorem 5.1.3, Section 5.2, we obtain
S 'L
2 ,™
The last sum is an integral sum of the function e~u /z, with;
and the summation is over N such that (« — N)/h are integers, that is, only each
term is included in the sum. Since h and d are relatively prime, we see that
1 ^ hd ,.2/o 1 f°°
and
1
In estimating $S_2$, it will not be possible now to use the monotonicity of the tails of the function $\varphi_2(N) = P\{\zeta_N = n\}$ as we did in the proof of Theorem 5.1.3 in Section 5.2 (see Figure 5.2.1). By (5.3.8), in the first sum, $|u| < \sqrt{2 \log n}$ for a sufficiently large $n$. Therefore, in the second sum,
n:\v\>.j2\ogn
By the integral limit theorem,
E
P{t;N=n} =
I
oo
e~zl/2dz{\
'2\ogn
and there exists a constant c such that, in the second sum,
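Thus, $S_1 + S_2 = S_1(1 + o(1))$, and we obtain
$$T_n^{(d)} = \frac{n!\, e^{B(x)}}{d\, x^n \sqrt{2\pi B(x)}}\,(1 + o(1)). \tag{5.3.9}$$
This implies the assertions of the theorem because
$$B(x) = \sum_{j | d} \frac{x^j}{j}$$
and $x^n$ can be represented in the cases of odd and even $d$ as follows.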
Let d be odd, then according to E.3.4),
jc =
For 1 < j < d,
and for j = d,
Thus
xj =nj/d
xd = n-ndr-x/d n1/d =o(\).
= exp
and
E
7
When we substitute the last expression into (5.3.9), we obtain the first assertion of the theorem.
If $d$ is even, we note that $2d_{r-1} = d$ and use (5.3.5) to obtain
xn =
For 1 < / < dr-1,
for j = d,
and for j = dr-\,
Thus
= exp
x~neB{x) =x~"/
J = nJ/d
d =n-ndr-xld nl/d
nj/d
j\d
J\d
The substitution of the last expression into (5.3.9) gives us the second assertion of the theorem. ■
5.4. Notes and references
The study of equations of the form $X^d = e$ in the symmetric group $S_n$ is directly related to one of the significant characteristics of the elements of $S_n$: the order of permutations. By the order $O_n(s)$ of a permutation $s \in S_n$, we mean the least positive integer $k$ such that $s^k$ is the identity permutation. The orders of elements in $S_n$ vary from 1 to the maximal value $G(n)$ over all $s \in S_n$. E. Landau [95] shows that
$$\lim_{n \to \infty} \frac{\log G(n)}{\sqrt{n \log n}} = 1.$$
In spite of such a wide range of $\log O_n(s)$, the typical values of $\log O_n(s)$ are considerably less than $\log G(n)$ and are concentrated near $2^{-1} \log^2 n$. Let $O_n$ be the order of a random permutation from $S_n$ with uniform distribution. The following assertion is well known.
Theorem 5.4.1. For any fixed $x$,
$$\lim_{n \to \infty} P\Big\{\frac{\log O_n - 2^{-1} \log^2 n}{\sqrt{3^{-1} \log^3 n}} \le x\Big\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du.$$
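The two quantities discussed here are easy to compute for moderate $n$. The order of a permutation is the least common multiple of its cycle lengths, and the maximal order $G(n)$ is the largest least common multiple over all partitions of $n$. The sketch below computes $G(n)$ by a knapsack-style dynamic program over prime powers; this algorithm is a standard choice made for the illustration and is not taken from the text, and the function names are hypothetical.

```python
from math import gcd, log, sqrt


def permutation_order(perm):
    """Order of a permutation given as a sequence with perm[i] the image of i."""
    seen = [False] * len(perm)
    order = 1
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        order = order * length // gcd(order, length)  # lcm of the cycle lengths
    return order


def landau(n):
    """G(n): the maximal order of an element of the symmetric group S_n."""
    primes = [p for p in range(2, n + 1)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    # best[s] is the largest product of powers of distinct primes with sum <= s.
    best = [1] * (n + 1)
    for p in primes:
        for s in range(n, p - 1, -1):  # descending s: each prime is used at most once
            q = p
            while q <= s:
                best[s] = max(best[s], best[s - q] * q)
                q *= p
    return best[n]


if __name__ == "__main__":
    print([landau(n) for n in range(13)])
    for n in (100, 1000, 5000):
        print(n, log(landau(n)) / sqrt(n * log(n)))
```

The printed ratios $\log G(n)/\sqrt{n \log n}$ approach 1 only slowly, so moderate values of $n$ still show a visible deviation from the limit.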
The asymptotic normality of $\log O_n$ was first proved by P. Erdős and P. Turán [39]. Other proofs of Theorem 5.4.1 can be found in [106, 18, 27]. All the proofs are rather cumbersome and involve many analytical difficulties. From our point of view, the simplest proof, but still not a sufficiently simple one, is suggested in [78], where the approach based on the generalized scheme is used.
It seems to us that investigating the numbers of solutions of equations of the form $X^d = e$ could provide the basis for the study of the local behavior of $O_n$. Indeed, if $p$ is prime, then $T_n^{(p)} - 1$ is just the number of permutations $s \in S_n$ whose order $O_n(s) = p$. Since the leading term of the asymptotics of the number $T_n^{(d)}$ for a compound $d$ is $(n/e)^{n(1 - 1/d)}$, almost all permutations counted by $T_n^{(d)}$ probably have the order $d$. It would be of considerable interest to find the asymptotics of the local probabilities $P\{O_n = d\}$ for $d$ that lie in a neighborhood of $\exp\{2^{-1} \log^2 n\}$ and to see whether the integral limit theorem follows from these results in spite of the fact that the behavior of the probabilities $P\{O_n = d\}$ is likely to be rather complicated. By virtue of the irregularity of the behavior of $P\{O_n = d\}$, this problem is not as trivial as obtaining the integral limit theorem from the local theorem usually is, because now we have to obtain the local theorem for $d$ of a specified form and, in addition, we have to know how many $d$ of such a form exist.
Theorems 5.1.1 and 5.1.2 for $R = \{1, 2\}$ and Theorem 5.1.3 were proved in [32]. Theorem 5.1.2 for $R = \{1, p\}$, $p > 2$, was proved in [61], and for an arbitrary $R$ in [33].
Theorem 5.1.3 was proved in [103], where the result of Theorem 5.2.1 was also presented. Assertion (5.2.9) of Theorem 5.2.2 was proved by the saddle-point method in [144].
Theorem 5.3.1 was proved in [108, 145, 150] independently and almost simultaneously.
The approach based on the generalized scheme of allocation, presented in Chapter 5 of this book, was first published in [82], where the proof of Theorem 5.1.3 was realized with the help of this approach. The proof of Theorem 5.3.1 in Section 5.3 follows A. V. Kolchin [68], who, in addition, extended this theorem to the case $d \to \infty$ such that $d \ln\ln n/\ln n \to 0$.
The general conditions of existence of a solution of the equation $X^d = a$, where $a$ is a fixed permutation and $X$ is an unknown permutation from $S_n$, are given in [102].
The system of equations
Ym\ y»«2 Ymk
where $k \ge 2$, $m_1, \ldots, m_k$ are fixed natural numbers, $X_1, \ldots, X_k \in S_n$, and $e$ is the identity permutation in $S_n$, is considered in [110]. The asymptotic representation of the number of solutions $X = (X_1, \ldots, X_k)$ such that $X_i X_j = X_j X_i$ for all $i \ne j$ is found.
BIBLIOGRAPHY
[1] Sh. M. Agadzhanyan. On a general method of estimating the number of
graphs from given classes. Avtomatika, (1):10-21, 1981. In Russian.
[2] Sh. M. Agadzhanyan. The asymptotic formulae for the number of
m-component graphs. Avtomatika, (4):27-33, 1986. In Russian.
[3] D. J. Aldous. Exchangeability and related topics. Lecture Notes in Math.,
1117:1-198, 1985.
[4] D. J. Aldous. Brownian bridge asymptotics for random mappings. Adv.
Appl. Probab., 24:763-764, 1992.
[5] J. Arney and E. A. Bender. Random mappings with constraints on
coalescence. Pacific J. Math., 103:269-294, 1982.
[6] R. A. Arratia. Independent process approximation for random
combinatorial structures. Adv. Appl. Probab., 24:764-765, 1992.
[7] R. Arratia and S. Tavare. Limit theorems for combinatorial structures
via discrete process approximations. Random Structures and Algorithms,
3:321-345, 1992.
[8] G. N. Bagaev. Distribution of the number of vertices in a component of
an indecomposable mapping. Belorussian Acad. Sci. Dokl., 21(12):1061-1063,
1977. In Russian.
[9] G. N. Bagaev. Limit distributions of metric characteristics of an
indecomposable random mapping. In Combinatorial and Asymptotic Analysis,
pp. 55-61. Krasnoyarsk Univ., Krasnoyarsk, 1977. In Russian.
[10] G. N. Bagaev and E. F. Dmitriev. Enumeration of connected labelled
bipartite graphs. Belorussian Acad. Sci. Dokl., 28:1061-1063, 1984. In Russian.
[11] G. V. Balakin. On random matrices. Theory Probab. Appl., 12:346-353,
1967. In Russian.
[12] G. V. Balakin. The distribution of random matrices over a finite field.
Theory Probab. Appl., 13:631-641, 1968. In Russian.
[13] G. V. Balakin, V. I. Khokhlov, and V. F. Kolchin. Hypercycles in a random
hypergraph. Discrete Math. Appl., 2:563-570, 1992.
[14] A. D. Barbour. Refined approximations for the Ewens sampling formula.
Adv. Appl. Probab., 24:765, 1992.
[15] A. D. Barbour. Refined approximations for the Ewens sampling formula.
Random Structures and Algorithms, 3:267-276, 1992.
[16] E. A. Bender, E. R. Canfield, and B. D. McKay. The asymptotic number
of labeled connected graphs with a given number of vertices and edges.
Random Structures and Algorithms, 1:127-170, 1990.
[17] E. A. Bender, E. R. Canfield, and B. D. McKay. Asymptotic properties of
labeled connected graphs. Random Structures and Algorithms, 3:183-202,
1992.
[18] M. R. Best. The distribution of some variables on a symmetric group.
Nederl. Akad. Wetensch. Indag. Math. Proc., 73:385-402, 1970.
[19] L. Bieberbach. Analytische Fortsetzung. Springer-Verlag, Berlin, 1955.
[20] B. Bollobás. The evolution of random graphs. Trans. Amer. Math. Soc.,
286:257-274, 1984.
[21] B. Bollobás. Random Graphs. Academic Press, London, 1985.
[22] Yu. V. Bolotnikov. Convergence to the Gaussian and Poisson processes of
the variable $\mu_r(n, N)$ in the classical occupancy problem. Theory Probab.
Appl., 13:39-50, 1968. In Russian.
[23] Yu. V. Bolotnikov. Convergence to the Gaussian process of the number of
empty cells in the classical occupancy problem. Math. Notes, 4:97-103,
1968. In Russian.
[24] Yu. V. Bolotnikov. Limit processes in a non-equiprobable scheme of
allocating particles into cells. Theory Probab. Appl., 13:534-542, 1968. In
Russian.
[25] Yu. V. Bolotnikov. On some classes of random variables on cycles of
permutations. Math. USSR Sb., 36:87-99, 1980.
[26] Yu. V. Bolotnikov, V. N. Sachkov, and V. E. Tarakanov. Asymptotic
normality of some variables connected with the cyclic structure of random
permutations. Math. USSR Sb., 28:107-117, 1976.
[27] J. D. Bovey. An approximate probability distribution for the order of
elements of the symmetric group. Bull. London Math. Soc., 12:41-46, 1980.
[28] V. E. Britikov. Limit theorems on the maximum size of trees in a random
forest of non-rooted trees. In Probability Problems of Discrete
Mathematics, pp. 84-91. MIEM, Moscow, 1987. In Russian.
[29] V. E. Britikov. The asymptotic number of forests from unrooted trees.
Math. Notes, 43:387-394, 1988.
[30] V. E. Britikov. The limit behaviour of the number of trees of a given size in a
random forest of nonrooted trees. In Stochastic Processes and Applications,
pp. 36-41. MIEM, Moscow, 1988. In Russian.
[31] I. A. Cheplyukova. Emergence of the giant tree in a random forest. Discrete
Math. Appl., 8(1):17-34, 1998.
[32] S. Chowla, I. N. Herstein, and K. Moore. On recursions connected with
symmetric groups. Canad. J. Math., 3:328-334, 1951.
[33] S. Chowla, I. N. Herstein, and W. R. Scott. The solution of $x^d = 1$ in
symmetric groups. Norske Vid. Selsk., 25:29-31, 1952.
[34] J. M. DeLaurentis and B. G. Pittel. Random permutations and Brownian
motion. Pacific J. Math., 119:287-301, 1985.
[35] P. J. Donnelly. Labellings, size-biased permutations and the GEM
distribution. Adv. Appl. Probab., 24:766, 1992.
[36] P. J. Donnelly, W. J. Ewens, and S. Padmadisastra. Functionals of random
mappings: Exact and asymptotic results. Adv. Appl. Probab., 23:437-455,
1991.
[37] P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math.
Inst. Hungarian Acad. Sci., Ser. A, 5(1-2):17-61, 1960.
[38] P. Erdős and A. Rényi. On random matrices. Magyar Tud. Akad. Mat.
Kutató Int. Közl., 8:455-461, 1963.
[39] P. Erdős and P. Turán. On some problems of statistical group theory, III.
Acta Math. Acad. Hungar., 18(3-4):309-320, 1967.
[40] W. J. Ewens. The sampling theory of selectively neutral alleles. Theoret.
Pop. Biol., 3:87-112, 1972.
[41] W. J. Ewens. Sampling properties of random mappings. Adv. Appl. Probab.,
24:773, 1992.
[42] M. V. Fedoryuk. Saddle Point Method. Nauka, Moscow, 1977. In Russian.
[43] W. Feller. An Introduction to Probability Theory and Its Applications,
vol. 2. Wiley, New York, 1966.
[44] P. Flajolet. The average height of binary trees and other simple trees.
Journal of Computer and System Sciences, 25:171-213, 1982.
[45] P. Flajolet. Random tree models in the analysis of algorithms. In P.-J. Cour-
tois and G. Latouche, editors, Performance'87, pp. 171-187. North-
Holland, Amsterdam, 1988.
[46] P. Flajolet, D. E. Knuth, and B. Pittel. The first cycles in an evolving graph.
Discrete Math., 75:167-215, 1989.
[47] P. Flajolet and A. M. Odlyzko. Random mapping statistics. In J.-J. Quis-
quarter and J. Vandewalle, editors, Advances in Cryptology, Lecture Notes
in Computer Science, Vol. 434, pp. 329-354. Springer-Verlag, Berlin, 1990.
[48] P. Flajolet and M. Soria. Gaussian limiting distributions for the number of
components in combinatorial structures. J. Combinatorial Theory, Series
A, 53:165-182, 1990.
[49] B. V. Gnedenko and A. N. Kolmogorov. Limit Distributions for Sums of
Independent Random Variables. Addison-Wesley, Reading, MA, 1949.
[50] S. W. Golomb. Shift Register Sequences. Aegean Park Press, Laguna Hills,
CA, 1982.
[51] V. L. Goncharov. On the distribution of cycles in permutations. Soviet
Math. Dokl., 35(9):299-301, 1942. In Russian.
[52] V. L. Goncharov. On the alternation of events in a sequence of Bernoulli
trials. Soviet Math. Dokl., 36(9):295-297, 1943. In Russian.
[53] V. L. Goncharov. On the field of combinatorics. Soviet Math. Izv., Ser.
Math., 8:3-48, 1944. In Russian.
[54] A. A. Grusho. Random mappings with bounded multiplicity. Theory
Probab. Appl., 17:416-425, 1972.
[55] A. A. Grusho. Distribution of the height of mappings of bounded
multiplicity. In Asymptotic and Enumerative Problems of Combinatorial Analysis,
pp. 7-18. Krasnoyarsk Univ., Krasnoyarsk, 1976. In Russian.
[56] J. C. Hansen. Order statistics for random combinatorial structures. Adv.
Appl. Probab., 24:774, 1992.
[57] B. Harris. Probability distributions related to random mappings. Ann.
Math. Statist., 31:1045-1062, 1960.
[58] C. C. Heyde. A contribution to the theory of large deviations for sums
of independent random variables. Z. Wahrscheinlichkeitstheorie und verw.
Gebiete, 7:303-308, 1967.
[59] W. Hoeffding. Probability inequalities for sums of bounded random
variables. J. Amer. Statist. Assoc., 58(301):13-30, 1963.
[60] I. A. Ibragimov and Yu. V. Linnik. Independent and Stationary Related
Variables. Nauka, Moscow, 1965. In Russian.
[61] E. Jacobstal. Sur le nombre d'éléments du groupe symétrique S_n dont l'ordre
est un nombre premier. Norske Vid. Selsk., 21:49-51, 1949.
[62] S. Janson. Multicyclic components in a random graph process. Random
Structures and Algorithms, 4:71-84, 1993.
[63] S. Janson, D. E. Knuth, T. Luczak, and B. Pittel. The birth of the giant
component. Random Structures and Algorithms, 4:233-358, 1993.
[64] I. B. Kalugin. The number of cyclic points and the height of a random
mapping with constraints on multiplicities of the vertices. In Abstracts of
the All-Union Conference Probab. Methods in Discrete Math., pp. 35-36.
Karelian Branch of the USSR Acad. Sci., Petrozavodsk, 1983. In Russian.
[65] V. I. Khokhlov. On the structure of a non-uniformly distributed random
graph. Adv. Appl. Probab., 24:775, 1992.
[66] V. I. Khokhlov and V. F. Kolchin. On the structure of a random graph with
nonuniform distribution. In New Trends in Probab. and Statist., pp. 445-
456. VSP/Mokslas, Utrecht, 1991.
[67] J. F. C. Kingman. The population structure associated with the Ewens
sampling formula. Theoret. Pop. Biol., 11:274-284, 1977.
[68] A. V. Kolchin. Equations in unknown permutations. Discrete Math. Appl.,
4:59-71, 1994.
[69] V. F. Kolchin. A class of limit theorems for conditional distributions.
Litovsk. Mat. Sb., 8:53-63, 1968. In Russian.
[70] V. F. Kolchin. On the limiting behavior of extreme order statistics in a
polynomial scheme. Theory Probab. Appl., 14:458-469, 1969.
[71] V. F. Kolchin. A problem of allocating particles into cells and cycles of
random permutations. Theory Probab. Appl., 16:74-90, 1971.
[72] V. F. Kolchin. A problem of the allocation of particles in cells and random
mappings. Theory Probab. Appl., 21:48-63, 1976.
[73] V. F. Kolchin. Branching processes, random trees, and a generalized scheme
of arrangements of particles. Math. Notes, 21:386-394, 1977.
[74] V. F. Kolchin. Moment of degeneration of a branching process. Math.
Notes, 24:954-961, 1978.
[75] V. F. Kolchin. Branching processes and random trees. In Cybernetics,
Combinatorial Analysis and Graph Theory, pp. 85-97. Nauka, Moscow,
1980. In Russian.
[76] V. F. Kolchin. Asymptotic Methods of Probability Theory. MIEM, Moscow,
1984. In Russian.
[77] V. F. Kolchin. On the behavior of a random graph near a critical point.
Theory Probab. Appl., 31:439-451, 1986.
[78] V. F. Kolchin. Random Mappings. Optimization Software, New York,
1986.
[79] V. F. Kolchin. Systems of Random Equations. MIEM, Moscow, 1988. In
Russian.
[80] V. F. Kolchin. On the number of permutations with constraints on their
cycle lengths. Discrete Math. Appl., 1:179-194, 1991.
[81] V. F. Kolchin. Cycles in random graphs and hypergraphs. Adv. Appl.
Probab., 24:768, 1992.
[82] V. F. Kolchin. The number of permutations with cycle lengths from a fixed
set. In Random Graphs'89, pp. 139-149. Wiley, New York, 1992.
[83] V. F. Kolchin. Consistency of a system of random congruences. Discrete
Math. Appl., 3:103-113, 1993.
[84] V. F. Kolchin. A classification problem in the presence of measurement
errors. Discrete Math. Appl., 4:19-30, 1994.
[85] V. F. Kolchin. Random graphs and systems of linear equations in finite
fields. Random Structures and Algorithms, 5:135-146, 1994.
[86] V. F. Kolchin. Systems of random linear equations with small number of
non-zero coefficients in finite fields. In Probabilistic Methods in Discrete
Mathematics, pp. 295-304. VSP, Utrecht, 1997.
[87] V. F. Kolchin and V. I. Khokhlov. An allocation problem and moments of
the binomial distribution. In Probab. Problems of Discrete Math., pp. 16-
21. MIEM, Moscow, 1987. In Russian.
[88] V. F. Kolchin and V. I. Khokhlov. On the number of cycles in a random
non-equiprobable graph. Discrete Math. Appl., 2:109-118, 1992.
[89] V. F. Kolchin and V. I. Khokhlov. A threshold effect for systems of random
equations of a special form. Discrete Math. Appl., 5:425-436, 1995.
[90] V. F. Kolchin, B. A. Sevastyanov, and V. P. Chistyakov. Random Allocations.
Wiley, New York, 1978.
[91] I. N. Kovalenko. A limit theorem for determinants in the class of Boolean
functions. Soviet Math. Dokl., 161:517-519, 1965. In Russian.
[92] I. N. Kovalenko. On the limit distribution of the number of solutions of a
random system of linear equations in the class of Boolean functions. Theory
Probab. Appl., 12:51-61, 1967. In Russian.
[93] I. N. Kovalenko, A. A. Levitskaya, and M. N. Savchuk. Selected Problems
of Probabilistic Combinatorics. Naukova Dumka, Kiev, 1986. In Russian.
[94] J. B. Kruskal. The expected number of components under a random mapping
function. Amer. Math. Monthly, 61:392-397, 1954.
[95] E. Landau. Handbuch der Lehre von der Verteilung der Primzahlen, vol. 1.
Teubner, Berlin, 1909.
[96] A. A. Levitskaya. Theorems on invariance of the limit behaviour of the
number of solutions of a system of random linear equations over a finite
ring. Cybernetics, (2):140-141, 1978. In Russian.
[97] A. A. Levitskaya. Theorems on invariance for the systems of random linear
equations over an arbitrary finite ring. Soviet Math. Dokl., 263:289-291,
1982. In Russian.
[98] A. A. Levitskaya. The probability of consistency of a system of random
linear equations over a finite ring. Theory Probab. Appl., 30:339-350,
1985. In Russian.
[99] T. Luczak. Component behaviour near the critical point of the random
graph process. Random Structures and Algorithms, 1:287-310, 1990.
[100] T. Luczak. Cycles in a random graph near the critical point. Random
Structures and Algorithms, 2:421-439, 1991.
[101] T. Luczak and B. Pittel. Components of random forests. Comb. Probab.
and Comput., 1:35-52, 1992.
[102] M. P. Mineev and A. I. Pavlov. On the number of permutations of a special
form. Math. USSR Sb., 99:468-476, 1976. In Russian.
[103] L. Moser and M. Wyman. On the solution of x^d = 1 in symmetric groups.
Canad. J. Math., 7:159-168, 1955.
[104] L. R. Mutafchiev. Local limit theorems for sums of power series distributed
random variables and for the number of components in labelled relational
structures. Random Structures and Algorithms, 3:403-426, 1992.
[105] E. Palmer. Graphical Evolution. Wiley, New York, 1985.
[106] A. I. Pavlov. On the limit distribution of the number of cycles and the
logarithm of the order of a class of permutations. Math. USSR Sb., 42:539-
567, 1982.
[107] A. I. Pavlov. On the number of cycles and the cycle structure of permutations
from some classes. Math. USSR Sb., 46:536-556, 1984.
[108] A. I. Pavlov. On the permutations with cycle lengths from a fixed set.
Theory Probab. Appl., 31:618-619, 1986. In Russian.
[109] A. I. Pavlov. Local limit theorems for the number of components of random
substitutions and mappings. Theory Probab. Appl., 33:196-200, 1988. In
Russian.
[110] A. I. Pavlov. The number and cycle structure of solutions of a system of
equations in substitutions. Discrete Math. Appl., 1:195-218, 1991.
[111] Yu. L. Pavlov. The asymptotic distribution of maximum tree size in a
random forest. Theory Probab. Appl., 22:509-520, 1977.
[112] Yu. L. Pavlov. Limit theorems for the number of trees of a given size in a
random forest. Math. USSR Sb., 32:335-345, 1977.
[113] Yu. L. Pavlov. A case of limit distribution of the maximum size of a tree in
a random forest. Math. Notes, 25:387-392, 1979.
[114] Yu. L. Pavlov. Limit distributions of some characteristics of random mappings
with a single cycle. In Math. Problems of Modelling Complex Systems,
pp. 48-55. Karelian Branch of the USSR Acad. Sci., Petrozavodsk,
1979. In Russian.
[115] Yu. L. Pavlov. Limit theorems for a characteristic of a random mapping.
Theory Probab. Appl., 27:829-834, 1981.
[116] Yu. L. Pavlov. Limit distributions of the height of a random forest. Theory
Probab. Appl., 28:471-480, 1983.
[117] Yu. L. Pavlov. On the random mappings with constraints on the number of
cycles. In Proc. Steklov Inst. Math., pp. 131-142. Nauka, Moscow, 1986.
[118] Yu. L. Pavlov. Some properties of plane planted trees. In Abstr. All-Union
Conference on Discrete Math. and its Appl. to Modelling of Complex Systems,
p. 14. Irkutsk State Univ., Irkutsk, 1991. In Russian.
[119] Yu. L. Pavlov. Some properties of planar planted trees. Discrete Math.
Appl., 3:97-102, 1993.
[120] Yu. L. Pavlov. The limit distributions of the maximum size of a tree in a
random forest. Discrete Math. Appl., 5:301-316, 1995.
[121] Yu. L. Pavlov. Limit distributions of the number of trees of a given size in
a random forest. Discrete Math. Appl., 6:117-133, 1996.
[122] V. V. Petrov. Sums of Independent Random Variables. Springer-Verlag,
New York, 1975.
[123] B. Pittel. On tree census and the giant component in sparse random graphs.
Random Structures and Algorithms, 1:311-342, 1990.
[124] G. Pólya and G. Szegő. Aufgaben und Lehrsätze aus der Analysis. Springer-Verlag,
Berlin, 1925.
[125] Yu. V. Prokhorov. Asymptotic behaviour of the binomial distribution.
Uspekhi Matem. Nauk, 8(3):135-142, 1953. In Russian.
[126] J. Riordan. Combinatorial Identities. Wiley, New York, 1968.
[127] A. Ruciński and N. C. Wormald. Random graph processes with degree
restrictions. Combinatorics, Probability and Computing, 1:169-180, 1992.
[128] V. N. Sachkov. Mappings of a finite set with restraints on contours and
height. Theory Probab. Appl., 17:640-656, 1972.
[129] V. N. Sachkov. Random mappings with bounded height. Theory Probab.
Appl., 18:120-130, 1973.
[130] V. N. Sachkov. Probability Methods in Combinatorial Analysis. Nauka,
Moscow, 1978. In Russian.
[131] A. I. Saltykov. The number of components in a random bipartite graph.
Discrete Math. Appl., 5:515-523, 1995.
[132] B. A. Sevastyanov. Convergence of the number of empty cells in the classical
allocation problems to Gaussian and Poisson processes. Theory Probab.
Appl., 12:144-154, 1967. In Russian.
[133] V. E. Stepanov. On the probability of connectedness of a random graph
gm(t). Theory Probab. Appl., 15:55-67, 1970.
[134] V. E. Stepanov. Phase transition in random graphs. Theory Probab. Appl.,
15:187-203, 1970.
[135] V. E. Stepanov. Structure of random graphs gn(x | h). Theory Probab.
Appl., 17:227-242, 1972.
[136] L. Takacs. On the height and widths of random rooted trees. Adv. Appl.
Probab., 24:771, 1992.
[137] S. G. Tkachuk. Local limit theorems on large deviations in the case of stable
limit laws. Izvestiya of Uzbek Academy of Sciences, (2):30-33, 1973. In
Russian.
[138] V. A. Vatutin. Branching processes with final types of particles and random
trees. Adv. Appl. Probab., 24:771, 1992.
[139] A. M. Vershik and A. A. Shmidt. Symmetric groups of high degree. Soviet
Math. Dokl., 13:1190-1194, 1972.
[140] A. M. Vershik and A. A. Shmidt. Limit measures arising in the asymptotic
theory of symmetric groups, i. Theory Probab. Appl., 22:78-85, 1977.
[141] A. M. Vershik and A. A. Shmidt. Limit measures arising in the asymptotic
theory of symmetric groups, ii. Theory Probab. Appl., 23:36-49, 1978.
[142] V. A. Voblyi. Asymptotic enumeration of labelled connected sparse graphs
with a given number of planted vertices. Discrete Analysis, 42:3-16, 1985.
In Russian.
[143] V. A. Voblyi. Wright and Stepanov-Wright coefficients. Math. Notes,
42:969-974, 1987.
[144] L. M. Volynets. The number of solutions of an equation in the symmetric
group. In Probab. Processes and Appl., pp. 104-109. MIEM, Moscow,
1985. In Russian.
[145] L. M. Volynets. On the number of solutions of the equation x^s = e in the
symmetric group. Math. Notes, 40:155-160, 1986. In Russian.
[146] L. M. Volynets. An estimate of the rate of convergence to the limit
distribution for the number of cycles in a random substitution. In Probab. Problems
of Discrete Math., pp. 40-46. MIEM, Moscow, 1987. In Russian.
[147] L. M. Volynets. The generalized scheme of allocation and the distribution
of the number of cycles in a random substitution. In Abstracts of the
Second All-Union Conf. Probab. Methods of Discrete Math., pp. 27-28.
Petrozavodsk, 1988. In Russian.
[148] L. M. Volynets. The generalized scheme of allocation and the number of
cycles in a random substitution. In Probab. Problems of Discrete Math.,
pp. 131-136. MIEM, Moscow, 1988. In Russian.
[149] L. M. Volynets. An example of a nonstandard asymptotics of the number
of substitutions with restrictions on the cycle lengths. In Probab. Processes
and Appl., pp. 85-90. MIEM, Moscow, 1989. In Russian.
[150] H. Wilf. The asymptotics of e^{P(z)} and the number of elements of each order
in S_n. Bull. Amer. Math. Soc., 15:228-232, 1986.
[151] E. M. Wright. The number of connected sparsely edged graphs, iii. J. Graph
Theory, 4:393-407, 1980.
[152] E. M. Wright. The number of connected sparsely edged graphs, iv. J. Graph
Theory, 7:219-229, 1983.
[153] A. L. Yakymiv. On the distribution of the number of cycles in random
a-substitutions. In Abstracts of the Second All-Union Conference Probab.
Methods in Discrete Math., p. 111. Karelian Branch of the USSR Acad.
Sci., Petrozavodsk, 1988. In Russian.
[154] A. L. Yakymiv. Substitutions with cycle lengths from a fixed set. Discrete
Math. Appl., 1:105-116, 1991.
[155] A. L. Yakymiv. Some classes of substitutions with cycle lengths from a
given set. Discrete Math. Appl., 3:213-220, 1993.
[156] N. Zierler. Linear recurring sequences. J. Soc. Ind. Appl. Math., 7:31-48,
1959.
INDEX
algorithm A2, 173
characteristic function, 8
classical scheme of allocation, 16
complete description of distribution of
the number of cycles, 192
connected component, 23
connectivity, 22
critical graph, 91
critical set, 125
decomposable property, 23
distribution function, 2
equations involving permutations, 219
equations of compound degree, 235
equations of prime degree, 225
factorial moment, 3
feedback point, 124
forest, 21
forest of nonrooted trees, 30
generalized scheme of allocation, 14
generating function, 6
graphs with components of two
types, 70
homogeneous system of equations, 125
hypercycle, 126
independent critical sets, 125
inversion formula, 10
length of the maximum cycle, 212
limit distribution of the number of
hypercycles, 164
linearly independent solutions, 125
local limit theorem, 10
mathematical expectation, 3
maximal span, 9
maximum number of independent
critical sets, 125, 135
maximum size of components, 66
maximum size of components of a
random graph, 84
maximum size of trees in a random
forest, 48
maximum size of trees in a random
graph, 83
maximum size of trees in a random
graph from An, t, 71
mean, 3
mean number of solutions in the
equiprobable case, 132
method of coordinate testing, 168
multinomial distribution, 15
multiplicity of a vertex in a set of
hyperedges, 126
nonequiprobable graph, 109
normal distribution, 9
number of components, 107, 144
number of components in Un, 65
number of cycles in a random
permutation, 182, 183
number of cycles of length r in a
random permutation, 182
number of forests, 31
number of linearly independent
solutions, 130
number of nontrivial solutions, 131
number of trees of fixed sizes, 71
number of trees with r vertices, 42
number of unicyclic components,
77, 81
number of unicyclic graphs, 58
number of vertices in the maximal
unicyclic component, 81
number of vertices in unicyclic
components, 71, 77
order of random permutation, 239
order statistics, 17
partition, 19
permutations with restrictions on cycle
lengths, 192
Poisson distribution, 6
probability, 1
probability distribution, 2
probability of consistency, 144
probability of reconstruction of the
true solution, 166
probability space, 1
problem of moments, 4
process of sequential growth of the
number of rows, 127
random element, 1
random forest, 21, 30
random graph corresponding to
random permutation, 181
random graph of a random
permutation, 182
random graphs with independent
edges, 100
random matrices with independent
elements, 126
random pairwise comparisons, 164
random partitions, 30
random permutation, 28
random variable, 1
rank of matrix, 124
rank of random sparse matrices, 135
reconstructing the true solution, 165
saddle-point method, 7, 221
set of rooted trees, 21
shift register, 123
simple classification problem, 122
single-valued mapping, 18
statistical problem of testing the
hypotheses H_0 and H_1, 180
subcritical graph, 91
summation of independent random
variables in GF(2), 131
supercritical graph, 91
system of linear equations in
GF(2), 122
system of random equations with
distorted right-hand sides, 180
system with at most two unknowns in
each equation, 156
threshold property, 156
total number of components, 24
total number of critical sets, 157
total number of cycles, 102, 212
total number of hypercycles, 158
unicyclic graph, 58
voting algorithm, 165
weak convergence, 2