                    American Mathematical Society
TRANSLATIONS
Series 2 • Volume 192
Provability,
Complexity,
Grammars
Lev Beklemishev
Mati Pentus
Nikolai Vereshchagin
American Mathematical Society


Providence, Rhode Island
AMS Subcommittee
Robert D. MacPherson
Grigorii A. Margulis
James D. Stasheff (Chair)

ASL Subcommittee
Steffen Lempp (Chair)

IMS Subcommittee
Mark I. Freidlin (Chair)

1991 Mathematics Subject Classification. Primary 68Q15, 68S05, 03B45; Secondary 03B65, 03F40.

Abstract. This book is a collection of three outstanding dissertations in mathematical logic and complexity theory. The study of modal logics axiomatizing provability traces back to Gödel's papers of the early 1930s. Since then several infinite series of provability logics have been found. The Ph.D. dissertation by Lev Beklemishev establishes that no other such logics exist, thus completing their classification. For this paper Dr. Beklemishev received the Moscow Mathematical Society Award in 1994. The Ph.D. dissertation by Mati Pentus proves the Chomsky conjecture that establishes the equivalence of two competing approaches to defining formal languages: the Chomsky hierarchy and the Lambek grammars. For this result and related papers Dr. Pentus won the international research prize from the European Association for Logic, Language and Information in 1994, and the Moscow Mathematical Society Award in 1998. In his Doctor of Sciences dissertation Nikolai Vereshchagin proposes a general framework for the criteria of relativizability in complexity theory. The book is useful for researchers and graduate students working in mathematical logic and complexity theory.

Library of Congress Cataloging-in-Publication Data
Beklemishev, Lev Dmitrievich, 1967-
Provability, complexity, grammars / Lev Beklemishev, Mati Pentus, Nikolai Vereshchagin.
p. cm. -- (American Mathematical Society translations, ISSN 0065-9290 ; ser. 2, v. 192)
Contains three doctoral dissertations in mathematical logic, mathematical linguistics, and complexity theory, translated from the Russian.
Includes bibliographical references.
ISBN 0-8218-1078-2 (hardcover : alk. paper)
1. Modality (Logic) 2. Proof theory. 3. Computational complexity. 4. Mathematical linguistics. I. Pentus, Mati Removich, 1967- . II. Vereshchagin, Nikolai Konstantinovich, 1958- . III. Title. IV. Series.
QA3.A572 ser. 2, vol. 192
[QA9.46]
510 s--dc21
[511.3]
99-20177 CIP

© 1999 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights except those granted to the United States Government.
Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Information on copying and reprinting can be found in the back of this volume.
Visit the AMS home page at URL: http://www.ams.org/
10 9 8 7 6 5 4 3 2 1    04 03 02 01 00 99
Contents

Preface

Classification of Propositional Provability Logics
L. D. Beklemishev
    Introduction
    1. Preliminaries
    2. Semantics for S, D, and A
    3. Trace classification of provability logics
    4. Prime A-models and their characteristic formulas
    5. Provability logics containing D
    6. Provability logics containing A
    7. Main results
    8. Examples, comments, and related results
    References

Lambek Calculus and Formal Grammars
Mati Pentus
    Introduction
    1. Preliminaries
    2. Free group interpretation
    3. Thin sequents
    4. Interpolation
    5. Main theorem
    6. Interpolation in fragments
    7. Construction of a context-free grammar for a product-free Lambek grammar
    8. Conjoinable types in the Lambek calculus
    9. Multiplicative cyclic linear logic
    References

Relativizability in Complexity Theory
Nikolai K. Vereshchagin
    Notation
    1. Introduction
    2. A uniform way to define complexity classes
    3. General criteria
    4. Relativizable inclusions between particular complexity classes
    5. Turing reducibility between particular complexity classes
    6. Complete languages in particular complexity classes
    7. Perceptrons and oracle separation of AM ∩ co-AM from PP
    8. The universum method
    9. Relations between complexity classes relativized with a random oracle
    References
Preface

This book consists of English translations of three outstanding dissertations in mathematical logic, mathematical linguistics, and complexity theory.

The area of modal logics axiomatizing provability traces back to Gödel's discovery in the early 1930s of the incompleteness phenomenon in formal theories. Since then several infinite series of provability logics have been found. The Ph.D. dissertation "Classification of Propositional Provability Logics" by Lev Beklemishev establishes that no other such logics exist. This result has completed the efforts of a number of researchers in a classical area of mathematical logic. For it, Beklemishev received the Moscow Mathematical Society Award in 1994.

The Ph.D. dissertation "Lambek Calculus and Formal Grammars" by Mati Pentus proves the well-known Chomsky conjecture of the early 1960s that all formal languages generated by Lambek grammars are context-free. The foundational significance of this result for mathematical linguistics consists in establishing the equivalence of two competing approaches to defining formal languages: the Chomsky hierarchy and the Lambek categorial grammars. The proof is an elegant combination of algebraic, logical, and combinatorial methods. For this result and related work, Pentus won the award "For the best idea of the year" from the European Association for Logic, Language and Information in 1994, and the Moscow Mathematical Society Award in 1998.

In 1975 Baker, Gill and Solovay showed that the P = NP problem in complexity theory relativized to different oracles has opposite solutions. In his Doctor of Sciences dissertation "Relativizability in Complexity Theory" Nikolai Vereshchagin proposes a general framework for formulating the relativizability criteria and for analyzing their limits. The author finds all relativizable inclusions for some of the known complexity classes.

Lev Beklemishev is now an Alexander von Humboldt Research Fellow at the Institut für Mathematische Logik und Grundlagenforschung, Westfälische Wilhelms-Universität Münster, and a senior research fellow at the Russian Academy of Sciences, Moscow. Mati Pentus is an associate professor at Moscow State University. Nikolai Vereshchagin is a professor at Moscow State University.

Sergei Artemov
Classification of Propositional Provability Logics

L. D. Beklemishev

Introduction

Overview. The idea of an axiomatic approach to the study of provability in formal theories (of sufficient expressive power) goes back to the work of Gödel [17]. Gödel noticed that many natural properties of provability can be formulated in the language of the propositional calculus enriched by a new unary connective $\Box$ (modality), $\Box\varphi$ being understood as the statement "formula $\varphi$ is provable". A precise formulation of this idea leads to the central concept of our work, namely, that of propositional provability logic.

Let $T$ be an axiomatized arithmetical theory, i.e., a recursively enumerable (r.e.) first order theory containing primitive recursive arithmetic PRA, and let $\mathrm{Prov}_T(x)$ be a standard arithmetical formula expressing the predicate "the formula coded by $x$ is provable in $T$". A $T$-interpretation of a modal formula $\varphi$ in the language of arithmetic is the result of substituting in $\varphi$ arbitrary arithmetical sentences for propositional variables and transliterating $\Box$ as $\mathrm{Prov}_T$. The collection of all modal formulas whose $T$-interpretations are all provable in some other (possibly non-r.e.) arithmetical theory $U$ is called the provability logic for $T$ relative to $U$ and is denoted $\mathrm{PL}_T(U)$. Modal logics of the form $\mathrm{PL}_T(U)$ are called provability logics. $\mathrm{PL}_T(\mathrm{TA})$, where TA is the set of all true arithmetical sentences, is the truth provability logic for $T$.

Informally speaking, the modal logic $\mathrm{PL}_T(U)$ exemplifies the collection of all principles of provability in the "inner" theory $T$ that can be verified inside the "outer" (meta)theory $U$. Thus, the truth provability logic of $T$ contains all universally true principles of provability in $T$, and those principles that can be established by means of $T$ itself are formalized by the logic $\mathrm{PL}_T(T)$. The questions of effective description (axiomatization) and decidability of modal logics of the form $\mathrm{PL}_T(U)$ arise naturally.
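As a small illustration of the notion of $T$-interpretation defined above, consider the following sketch. The sentences $A$ and $B$ are arbitrary placeholders (they are not taken from the text), and $\ulcorner A \urcorner$ is assumed to stand for the code of $A$. If $p$ and $q$ are interpreted as $A$ and $B$, two modal formulas are transliterated as follows:

```latex
% Assumption: A, B are arbitrary arithmetical sentences chosen for p, q.
\[
\begin{aligned}
  \Box(p \to q) \to (\Box p \to \Box q)
    &\;\longmapsto\;
    \mathrm{Prov}_T(\ulcorner A \to B \urcorner) \to
    \bigl(\mathrm{Prov}_T(\ulcorner A \urcorner) \to \mathrm{Prov}_T(\ulcorner B \urcorner)\bigr), \\
  \Box p \to \Box\Box p
    &\;\longmapsto\;
    \mathrm{Prov}_T(\ulcorner A \urcorner) \to
    \mathrm{Prov}_T\bigl(\ulcorner \mathrm{Prov}_T(\ulcorner A \urcorner) \urcorner\bigr).
\end{aligned}
\]
```

By the definition, a modal formula belongs to $\mathrm{PL}_T(U)$ exactly when every sentence obtained in this way, for every choice of the substituted sentences, is provable in $U$.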
These questions had attracted particular interest from researchers by the end of the 1960s (cf., e.g., [12]).

The paper is devoted to the problem of classifying the logics of the form $\mathrm{PL}_T(U)$, i.e., their characterization within the class of all modal logics, assuming that $T$ and $U$ vary arbitrarily within the class of arithmetical theories. The general classification theorem for propositional provability logics obtained in this paper is, in fact, an outcome of the work of several authors over the last decade.

1991 Mathematics Subject Classification. Primary 03B45, 03F03; Secondary 03F25, 03F40.
©1999 American Mathematical Society
Final results were obtained by the author in [6]. Thus, natural axiomatizations for all provability logics are now known, and the questions of algorithmic decidability and Kripke-style semantics for these logics are fully investigated.

Before stating our main results we briefly review the history of the question. Hilbert and Bernays [20] formulated certain natural "derivability conditions" on the formalization of the provability predicate for an axiomatized arithmetical theory $T$ sufficient for the validity of Gödel's incompleteness theorems for $T$. Löb [23] reformulated the Hilbert-Bernays conditions in a form that essentially had the character of propositional modal axioms and rules of inference. Together with another important property of provability established by Löb himself, which later became known as "Löb's theorem", these properties constitute an axiomatization of the basic provability logic GL:

Axioms:
1. Propositional tautologies.
2. $\Box(\varphi \to \psi) \to (\Box\varphi \to \Box\psi)$.
3. $\Box\varphi \to \Box\Box\varphi$.
4. $\Box(\Box\varphi \to \varphi) \to \Box\varphi$.

Rules of inference: $\varphi,\ \varphi \to \psi \vdash \psi$ (modus ponens); $\varphi \vdash \Box\varphi$.

The question whether GL is complete as a system of axioms for provability was answered positively in the fundamental work of Solovay [29]. Solovay showed that for sound (that is, true in the standard model) axiomatized theories $T$:
1. $\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{PL}_T(T) = \mathrm{GL}$;
2. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{S} \rightleftharpoons \mathrm{GL}\{\Box\varphi \to \varphi\}$.¹

The results of Solovay stimulated further intensive study of provability logics (cf. [8, 28, 11]). Artëmov [1, 3] suggested a general approach to the problem of classification of provability logics based on a specific notion of trace of a modal logic. He proved that all extensions of GL by letterless modal formulas are provability logics [1], and all such extensions are exhausted by logics of the form

$$\mathrm{GL}_\alpha \rightleftharpoons \mathrm{GL}\{F_n \mid n \in \alpha\}, \quad \alpha \subseteq \omega,$$
$$\mathrm{GL}^-_\beta \rightleftharpoons \mathrm{GL}\Bigl\{\bigvee_{n \notin \beta} \neg F_n\Bigr\}, \quad \beta \subseteq \omega \text{ and } \omega \setminus \beta \text{ is finite},$$

where $F_n \rightleftharpoons (\Box^{n+1}\bot \to \Box^n\bot)$ [3]. Visser [30] established that the provability logics not contained in S are precisely the logics $\mathrm{GL}^-_\beta$.
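The letterless formulas $F_n$ can be explored mechanically. The following is a minimal sketch, not the paper's construction: it assumes the standard Kripke reading of $\Box$ on a finite strict chain (world `w` sees exactly the worlds `v < w`, so `w` has depth `w`), and all names (`box_power`, `F`, `holds`, `failing_depths`) are hypothetical helpers introduced for illustration. On such chains $\Box^k\bot$ holds exactly at worlds of depth less than $k$, so $F_n$ should fail exactly at depth $n$.

```python
from typing import List, Tuple

# Letterless modal formulas as nested tuples (illustrative encoding):
#   ("bot",)         falsum
#   ("imp", a, b)    implication a -> b
#   ("box", a)       box a
Formula = Tuple
BOT: Formula = ("bot",)


def box_power(n: int) -> Formula:
    """Return box^n(bot), the n-fold boxed falsum."""
    f: Formula = BOT
    for _ in range(n):
        f = ("box", f)
    return f


def F(n: int) -> Formula:
    """Return F_n = box^{n+1}(bot) -> box^n(bot)."""
    return ("imp", box_power(n + 1), box_power(n))


def holds(f: Formula, w: int) -> bool:
    """Truth of a letterless formula at world w of a finite strict chain.

    Assumption: worlds are 0, 1, 2, ... and w sees exactly the worlds v < w.
    """
    tag = f[0]
    if tag == "bot":
        return False
    if tag == "imp":
        return (not holds(f[1], w)) or holds(f[2], w)
    if tag == "box":
        return all(holds(f[1], v) for v in range(w))
    raise ValueError(f"unknown connective: {tag!r}")


def failing_depths(n: int, max_depth: int) -> List[int]:
    """Depths d <= max_depth at which F_n is false."""
    return [d for d in range(max_depth + 1) if not holds(F(n), d)]


if __name__ == "__main__":
    for n in range(4):
        print(n, failing_depths(n, 6))
```

In this toy model $\neg F_n$ singles out the worlds of depth exactly $n$, which is one way to read the disjunction $\bigvee_{n \notin \beta} \neg F_n$: it restricts the depth to the finite set $\omega \setminus \beta$.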
He also showed that a (consistent) provability logic of the form $\mathrm{PL}_T(T)$ may coincide only with GL or with one of the logics $\mathrm{GL}^-_\beta$ for $\beta = \omega \setminus \{0, \dots, n\}$, $n \in \omega$. Artëmov [3] reduced the problem of complete classification of provability logics for sound arithmetical theories to the description of such logics in the interval between $\mathrm{GL}_\omega$ and S.² A new provability logic

$$\mathrm{D} \rightleftharpoons \mathrm{GL}\{\neg\Box\bot,\ \Box(\Box\varphi \vee \Box\psi) \to \Box\varphi \vee \Box\psi\}$$

was found by Dzhaparidze [14]. It easily follows that the logics $\mathrm{S}_\beta \rightleftharpoons \mathrm{S} \cap \mathrm{GL}^-_\beta$ and $\mathrm{D}_\beta \rightleftharpoons \mathrm{D} \cap \mathrm{GL}^-_\beta$ ($\beta \subseteq \omega$, $\omega \setminus \beta$ finite) have to be provability logics as well.

Figure 1. The lattice of provability logics. Inclusion of logics corresponds to imaginary movements "to the right" and "up" in the diagram. Vertical slices represent logics with the same trace.

¹ Here and below the expression $\mathrm{GL}\{\dots\}$ denotes the closure under modus ponens of the set of theorems of GL together with the axiom schemes listed within the curly brackets.
² The system $\mathrm{GL}_\omega$ plays a distinguished role in the classification of provability logics. In the present translation it is given the special name A (cf. also the recent survey [13]).
In this paper (cf. also [6]) we show that there are no other provability logics than those mentioned above, that is, any provability logic is contained in one of the following four infinite series (see Figure 1):

(*) $\mathrm{GL}_\alpha$, $\mathrm{GL}^-_\beta$, $\mathrm{S}_\beta$, $\mathrm{D}_\beta$ ($\alpha, \beta \subseteq \omega$ and $\omega \setminus \beta$ is finite).

Besides, we show that truth provability logics are exhausted by the following list:

$\mathrm{S}$, $\mathrm{D}$, $\mathrm{A}$, $\mathrm{GL}^-_{\omega \setminus \{n\}}$ ($n \in \omega$).

We also give a complete description, for any fixed inner theory $T$, of all provability logics from the list (*) that may have the form $\mathrm{PL}_T(U)$, where $U$ varies within the class of all arithmetical theories. These results complete the classification of propositional provability logics.

In the last section of the paper some natural questions related to the classification theorem are considered: effectiveness of this classification; decidability of individual provability logics; dependence of provability logics on the choice of an arithmetical formula $\mathrm{Prov}_T(x)$ representing the provability predicate in an inner theory $T$; characterizations of provability logics for several concrete pairs of arithmetical theories. Finally, we fully investigate Craig's interpolation property for propositional provability logics.

Contents of the work. The paper is divided into an introductory part and eight sections. The first seven sections contain a proof of the classification theorem along with a self-contained exposition of all necessary background results.

In Section 1 basic concepts are introduced. One of the most useful is the notion of rank of an axiomatized theory $T$. For a given $T$, consider an infinite sequence of theories $(T_n)_{n \in \omega}$ defined by the following clauses:

$$T_0 \rightleftharpoons T, \qquad T_{n+1} \rightleftharpoons T + \mathrm{Con}(T_n),$$

where $\mathrm{Con}(T_n)$ denotes the standard Gödel consistency assertion for $T_n$. The rank $\mathrm{rk}(T)$ of a theory $T$ is the least $n$ such that $T_n$ is inconsistent, if such an $n$ exists; otherwise, by definition, $\mathrm{rk}(T) = \infty$. The rank of a theory $T$ is a kind of measure of its proximity to the inconsistent theory.
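To make the definition of rank concrete, the first few theories of the sequence and the resulting low ranks can be unfolded as follows. This is only a worked reading of the clauses above; the one additional fact used is that PRA (hence $T$) proves every true sentence of the form $\neg\mathrm{Con}(T_n)$, which is the provable $\Sigma_1$-completeness quoted later in Section 1.

```latex
\[
\begin{aligned}
  T_0 &= T, \qquad
  T_1 = T + \mathrm{Con}(T), \qquad
  T_2 = T + \mathrm{Con}\bigl(T + \mathrm{Con}(T)\bigr), \\[4pt]
  \mathrm{rk}(T) = 0 &\iff T \text{ is inconsistent}, \\
  \mathrm{rk}(T) = 1 &\iff T \text{ is consistent and } T \vdash \neg\mathrm{Con}(T), \\
  \mathrm{rk}(T) \le n+1 &\iff T_{n+1} \text{ is inconsistent}
                          \iff T \vdash \neg\mathrm{Con}(T_n).
\end{aligned}
\]
```

In other words, a theory of rank $n + 1$ refutes the $n$-times iterated consistency of itself while remaining consistent with all shorter iterations.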
Theories of infinite rank are called strongly consistent. Strongly consistent theories are classified according to the degree of their soundness. An arithmetical theory $U$ is called sound if all theorems of $U$ are valid in the standard model of arithmetic. $U$ is called $\Sigma_n$-sound if all arithmetical $\Sigma_n$-sentences provable in $U$ are true. In Section 1 natural examples showing the nontriviality of the classification of theories by their rank and the degree of soundness are given.

Section 2 deals with Kripke semantics of the three most important (besides GL) provability logics: S, D, and A.³ We prove soundness and completeness of these logics with respect to some special classes of (infinite) Kripke models. These models are called, respectively, S-, D-, and A-models. We also show that the set of theorems of each of the three logics in question is decidable.

Section 3 contains necessary background results from the important papers [29] and [3]. In the first part of this section the so-called Solovay construction is presented [29], and in the second part a brief introduction into the techniques of traces of modal formulas and logics [3] is given. In fact, the results obtained in [3] for sound inner theories are generalized to arbitrary inner theories. In particular, the question of axiomatization of provability logics for strongly consistent theories is reduced to the problem of describing the provability logics in the interval between A and S.

In Section 4 the notions of prime Kripke model and its characteristic formula (cf. [3, 16]) are extended to the class of A-models.

In Section 5, relying upon the techniques of Section 4, we show that any consistent provability logic strictly containing D coincides with S. Thus, in combination with the results of Section 3 the general classification problem is reduced to the description of provability logics within the interval between A and D.

In Section 6 we prove that any provability logic strictly containing A contains D. This result is the central part of our work and completes the classification of provability logics.

The main theorems of the paper are formulated and proved in Section 7. First of all, summing up the results of Sections 2-6, we obtain the classification theorem for propositional provability logics.

Theorem 1. Any provability logic coincides with one of the following modal logics:

(*) $\mathrm{GL}_\alpha$, $\mathrm{GL}^-_\beta$, $\mathrm{S}_\beta$, $\mathrm{D}_\beta$ ($\alpha, \beta \subseteq \omega$, $\omega \setminus \beta$ is finite).

Theorem 2 shows that every logic from the list (*) is realized as a provability logic for any strongly consistent inner theory.

Theorem 2. If an axiomatized theory $T$ is strongly consistent, then
$$\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{PL}_T(T) = \mathrm{GL}, \qquad \mathrm{PL}_T(\mathrm{TA}) = \mathrm{S},\ \mathrm{D},\ \text{or}\ \mathrm{A},$$
and every logic from (*) has the form $\mathrm{PL}_T(U)$ for an appropriate outer theory $U$.

Theorem 3 gives a description of all possible truth provability logics.

Theorem 3. The truth provability logics are precisely the following ones:

(**) $\mathrm{S}$, $\mathrm{D}$, $\mathrm{A}$, and $\mathrm{GL}\{\neg F_n\}$, $n \in \omega$.

Moreover, for any axiomatized theory $T$,
1. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{S}$ iff $T$ is sound;
2. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{D}$ iff $T$ is $\Sigma_1$-sound but not sound;
3. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{A}$ iff $T$ is strongly consistent but not $\Sigma_1$-sound;
4. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}$ iff $\mathrm{rk}(T) = n$ (for $n < \infty$).

³ An appropriate semantics for S has been found in [30] and, in a somewhat different format, in [10].
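The case distinction in Theorem 3 can be written down as a small lookup. The sketch below is only an illustration of how the four clauses partition all axiomatized theories; the function name `truth_provability_logic` and its parameters are hypothetical, the soundness flags are inputs rather than something that could be computed, and `rank=None` is an assumed encoding of $\mathrm{rk}(T) = \infty$.

```python
from typing import Optional


def truth_provability_logic(rank: Optional[int],
                            sigma1_sound: bool,
                            sound: bool) -> str:
    """Name of PL_T(TA) according to the four clauses of Theorem 3.

    rank         -- rk(T) as an integer, or None for rk(T) = infinity
    sigma1_sound -- whether T is Sigma_1-sound
    sound        -- whether T is sound
    """
    # Consistency of the inputs: sound => Sigma_1-sound => strongly consistent.
    if sound and not sigma1_sound:
        raise ValueError("a sound theory is in particular Sigma_1-sound")
    if sigma1_sound and rank is not None:
        raise ValueError("Sigma_1-sound theories have infinite rank")

    if rank is not None:
        if rank < 0:
            raise ValueError("rank must be a natural number")
        return f"GL{{~F_{rank}}}"   # clause 4: rk(T) = n < infinity
    if sound:
        return "S"                  # clause 1
    if sigma1_sound:
        return "D"                  # clause 2
    return "A"                      # clause 3: strongly consistent, not Sigma_1-sound


if __name__ == "__main__":
    print(truth_provability_logic(None, True, True))
    print(truth_provability_logic(2, False, False))
```

The two `ValueError` checks encode the implications stated in the text (soundness implies $\Sigma_1$-soundness, which implies strong consistency), so every admissible input falls under exactly one clause.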
Theorem 4, which complements Theorem 2, describes the set of all possible provability logics for $T$ for any inner theory $T$ of finite rank. It turns out that this collection is uniquely determined by the rank of $T$.

Theorem 4. Let $T$ be an axiomatized theory of rank $n < \infty$. Then
$$\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}, \qquad \mathrm{PL}_T(\mathrm{PRA}) = \mathrm{GL}\{\Box^{n+1}\bot\}, \qquad \mathrm{PL}_T(T) = \mathrm{GL}\{\Box^n\bot\},$$
and the provability logics for $T$ exhaust all extensions of $\mathrm{GL}\{\Box^{n+1}\bot\}$ by letterless modal formulas, that is, the logics $\mathrm{GL}^-_\alpha$, $\omega \setminus \alpha \subseteq \{0, \dots, n\}$.
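As a sanity check of Theorem 4, the smallest case $n = 0$ (an inconsistent inner theory $T$) can be worked out by hand. The sketch below uses $F_0 = (\Box\bot \to \bot)$ and $\mathrm{GL}^-_\alpha = \mathrm{GL}\{\bigvee_{k \notin \alpha} \neg F_k\}$ as in the Introduction; reading the empty disjunction as $\bot$ is an assumption made here for the case $\alpha = \omega$.

```latex
\[
\begin{aligned}
  \neg F_0 &= \neg(\Box\bot \to \bot) \;\leftrightarrow\; \Box\bot, \\
  \mathrm{PL}_T(\mathrm{TA}) &= \mathrm{GL}\{\neg F_0\} = \mathrm{GL}\{\Box\bot\}, \\
  \mathrm{PL}_T(\mathrm{PRA}) &= \mathrm{GL}\{\Box^{1}\bot\} = \mathrm{GL}\{\Box\bot\}, \\
  \mathrm{PL}_T(T) &= \mathrm{GL}\{\Box^{0}\bot\} = \mathrm{GL}\{\bot\}
      \quad\text{(the inconsistent logic)}, \\[4pt]
  \omega \setminus \alpha \subseteq \{0\}:\quad
  \alpha = \omega \setminus \{0\} &\;\Rightarrow\; \mathrm{GL}^-_\alpha = \mathrm{GL}\{\neg F_0\} = \mathrm{GL}\{\Box\bot\}, \\
  \alpha = \omega &\;\Rightarrow\; \mathrm{GL}^-_\alpha = \mathrm{GL}\{\bot\}.
\end{aligned}
\]
```

So for an inconsistent $T$ exactly two provability logics arise, which agrees with the informal picture: a consistent outer theory can only verify that $T$ proves everything, and an inconsistent outer theory verifies every formula.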
In the last Section 8 some natural questions related to the classification problem are considered.

In Section 8.1 the question of effectiveness of this classification is investigated. It is well known and easy to see that, given a provability predicate $\mathrm{Prov}_T(x)$ for an axiomatized theory $T$, one cannot effectively determine if $T$ is consistent. Consequently, neither can one effectively determine the provability logic of $T$. However, in most practical cases provability logics are easily calculated. We present two kinds of partial positive results in this direction.

For a modal formula $\varphi$, let $\{\varphi\}^T$ denote the arithmetical schema consisting of all $T$-interpretations of $\varphi$. The following Proposition 8.1 shows that provability logics can be effectively calculated for an interesting subclass of the class of arithmetical theories.

Proposition 8.1. There is an algorithm which, given a modal formula $\varphi$, determines the provability logic $\mathrm{PL}_{\mathrm{PRA}}(\mathrm{PRA} + \{\varphi\}^{\mathrm{PRA}})$ in the classification.

Section 8.2 considers provability logics for natural, mathematically meaningful, arithmetical theories. A number of examples realizing the provability logics S, D, A and GL are treated.

Section 8.3 investigates the dependence of truth provability logics on the choice of the arithmetical formula $\mathrm{Prov}_T(x)$ representing the provability predicate in the inner theory $T$. Obviously, the conditions of soundness and $\Sigma_1$-soundness of a theory are invariant under such a choice, and therefore, by Theorem 3, the provability logics D and S are invariant as well. However, the notion of rank of a theory is not invariant, as the following Corollary 8.9 shows. An arithmetical theory $T$ is called reflexive if $T$ proves the consistency assertions $\mathrm{Con}(U)$ for all finite subtheories $U$ of $T$. It is well known that theories such as PRA and PA are reflexive.

Corollary 8.9. Let $T$ be a consistent axiomatized reflexive theory that is not $\Sigma_1$-sound. Then for each $0 < n < \infty$ there is an axiomatized theory $T'$ deductively equivalent to $T$ such that $\mathrm{rk}(T') = n$.

It follows that the truth provability logics, except for S and D, are not invariant under the choice of a provability predicate.

Finally, in Section 8.4 we investigate Craig's interpolation property for propositional provability logics. A modal logic $\mathcal{L}$ satisfies the interpolation property if for any two formulas $\varphi$ and $\psi$ such that $\mathcal{L} \vdash \varphi \to \psi$, there is a formula $\theta$ such that the variables of $\theta$ are common to $\varphi$ and $\psi$, and
$$\mathcal{L} \vdash \varphi \to \theta, \qquad \mathcal{L} \vdash \theta \to \psi.$$
Craig's interpolation property can be considered as a standard test for a logic to be "reasonable," and is popular in the study of modal logics. We show that the only provability logics that do not satisfy the interpolation property are those of the family $\mathrm{D}_\beta$ ($\beta \subseteq \omega$ and $\omega \setminus \beta$ is finite). A somewhat surprising example demonstrating that D does not possess the interpolation property is obtained on the basis of Kripke semantics for D developed in Section 2.
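For readers unfamiliar with interpolation, here is a deliberately simple instance of the definition. It is not one of the paper's examples; it uses only propositional tautologies, which every modal logic $\mathcal{L}$ in the sense of this paper contains.

```latex
% Illustrative instance: phi = p /\ q, psi = p \/ r, interpolant theta = p.
\[
  \mathcal{L} \vdash (p \wedge q) \to (p \vee r),
  \qquad
  \theta = p:
  \qquad
  \mathcal{L} \vdash (p \wedge q) \to p,
  \quad
  \mathcal{L} \vdash p \to (p \vee r).
\]
```

The interpolant $\theta = p$ mentions only the variable shared by the two sides. The failure of interpolation for a logic means that for some provable implication no such intermediate formula over the common variables exists.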
1. Preliminaries

1.1. Theories. By the term theory we mean an arbitrary set of formulas closed under derivability in classical predicate logic with equality. All theories under consideration are formulated in the language containing a constant symbol $0$ and function symbols for all primitive recursive functions. Formulas in this language are called arithmetical. The set of all arithmetical sentences, that is, formulas without free variables, is denoted St.

Primitive recursive arithmetic PRA is the theory given by the following non-logical axioms:
1. Defining equations for all primitive recursive functions;
2. The scheme of induction
$$\text{(Ind)} \qquad A(0) \wedge \forall x\,(A(x) \to A(x+1)) \to \forall x\,A(x),$$
for all quantifier-free formulas $A(x)$.

Primitive recursive formulas are formulas in the language of arithmetic all of whose quantifier occurrences are bounded, that is, have the form $\forall x\,(x \le t \to \dots)$ or $\exists x\,(x \le t \wedge \dots)$, where $t$ is a term not involving the variable $x$. It is known (see [28]) that any primitive recursive formula is PRA-equivalent to a formula of the form $f(x_1, \dots, x_n) = 0$, for a suitable function symbol $f$.

The classes of $\Sigma_n$- and $\Pi_n$-formulas are defined inductively as follows: $\Sigma_0$- and $\Pi_0$-formulas are just primitive recursive formulas; $\Sigma_{n+1}$-formulas ($\Pi_{n+1}$-formulas) are those of the form $\exists x_1 \exists x_2 \dots \exists x_m A$ (respectively, $\forall x_1 \forall x_2 \dots \forall x_m A$), where $A$ is a $\Pi_n$-formula (respectively, $\Sigma_n$-formula). The classes of all arithmetical $\Sigma_n$- and $\Pi_n$-sentences are denoted respectively by $\Sigma_n$ and $\Pi_n$. In a formalized context the expression $x \in \Sigma_n$ will also denote a natural primitive recursive formula expressing the predicate "$x$ is the Gödel number of a $\Sigma_n$-sentence".

The theories $I\Sigma_n$ are obtained by adding to the list of axioms of PRA the scheme (Ind) for all $\Sigma_n$-formulas $A(x)$ (see [19]). Peano Arithmetic PA is $\bigcup_{n \ge 1} I\Sigma_n$ or, in other words, can be obtained by adding to PRA the axiom scheme (Ind) for arbitrary arithmetical formulas $A(x)$.
TA denotes the theory generated by the set of all arithmetical sentences true in the standard model of arithmetic $\mathbb{N}$. The set of natural numbers, identified with the set of all finite ordinals, is denoted $\omega$.

1.2. Axiomatized theories. An axiomatized theory $T$ is a theory generated by a primitive recursive set of formulas, called axioms, taken together with a primitive recursive formula $\mathrm{Ax}_T(x)$, called numeration of $T$, which defines the set of Gödel numbers of axioms of $T$ in $\mathbb{N}$ (see [15]). From the formula $\mathrm{Ax}_T(x)$ a primitive recursive formula $\mathrm{Prf}_T(x, y)$ expressing the predicate "$y$ is the Gödel number of a $T$-proof of the formula with the Gödel number $x$" is constructed in a natural way.⁴ The formulas expressing provability of the formula with the Gödel number $x$ in $T$ (provability predicate for $T$) and consistency assertion for $T$ are then defined respectively as follows:
$$\mathrm{Prov}_T(x) \rightleftharpoons \exists y\, \mathrm{Prf}_T(x, y) \qquad \text{and} \qquad \mathrm{Con}(T) \rightleftharpoons \neg\mathrm{Prov}_T(\ulcorner 0 = 1 \urcorner).$$

We shall also consider axiomatized families of theories $(T_n)_{n \in \omega}$. Numerations of such families are primitive recursive formulas $\mathrm{Ax}(n, x)$ with an extra free variable $n$ playing the role of a parameter.

Axiomatized theories $U$ and $V$ are equivalent if they have the same set of theorems (denoted $U = V$). $U$ and $V$ are provably equivalent if
$$\mathrm{PRA} \vdash \forall x\,(\mathrm{Prov}_U(x) \leftrightarrow \mathrm{Prov}_V(x)).$$
It is easy to see that any axiomatized theory is r.e. Conversely, by Craig's well-known trick (see [15]) any r.e. theory can be generated by a primitive recursive set of axioms and thereby is equivalent to some axiomatized theory. On the other hand, equivalent axiomatized theories, in general, do not have to be provably equivalent, even if they share one and the same set of axioms [15].

Natural theories, such as PRA, $I\Sigma_n$, PA, etc., have natural numerations read off from their standard axiomatizations: $\mathrm{Ax}_{\mathrm{PRA}}(x)$, .... We shall consider these natural numerations as being fixed for the rest of this paper.

We shall often quote the following four basic facts about PRA.

Provable $\Sigma_1$-completeness: For any $\Sigma_1$-formula $\sigma(x_1, \dots, x_n)$,
$$\mathrm{PRA} \vdash \forall x_1 \dots \forall x_n\,\bigl(\sigma(x_1, \dots, x_n) \to \mathrm{Prov}_{\mathrm{PRA}}(\ulcorner \sigma(\dot{x}_1, \dots, \dot{x}_n) \urcorner)\bigr).$$
Here and below the expression $\ulcorner \sigma(\dot{x}_1, \dots, \dot{x}_n) \urcorner$ denotes a canonical term for the primitive recursive function mapping the tuple $x_1, \dots, x_n$ to the Gödel number $\ulcorner \sigma(\bar{x}_1, \dots, \bar{x}_n) \urcorner$ of the formula $\sigma(\bar{x}_1, \dots, \bar{x}_n)$, where the $\bar{x}_i$ are the numerals denoting the natural numbers $x_i$ (see [15, 28]).

$\Sigma_n$-truth definition for $\Sigma_n$-formulas: For every $n \ge 1$ there is a $\Sigma_n$-formula $\mathrm{True}_{\Sigma_n}(x)$ such that
$$\mathrm{PRA} \vdash \forall x_1 \dots \forall x_m\,\bigl(A(x_1, \dots, x_m) \leftrightarrow \mathrm{True}_{\Sigma_n}(\ulcorner A(\dot{x}_1, \dots, \dot{x}_m) \urcorner)\bigr),$$
for any $\Sigma_n$-formula $A(x_1, \dots, x_m)$.

Formalized primitive recursion theorem: For any term $F(x_0, \dots, x_n)$ in the language of PRA there is a function symbol $f$ such that
$$\mathrm{PRA} \vdash \forall x_1 \dots \forall x_n\,\bigl(f(x_1, \dots, x_n) = F(\ulcorner f \urcorner, x_1, \dots, x_n)\bigr).$$

⁴ As usual, by "expressing", we mean not only that $\mathrm{Prf}_T(x, y)$ defines this predicate in $\mathbb{N}$, but also that it is formulated in such a way that its basic properties are verifiable in PRA. See [15] for details.
Fixed point lemma: For any arithmetical formula $A(x_0, \dots, x_n)$ there is a formula $B(x_1, \dots, x_n)$ such that
$$\mathrm{PRA} \vdash \forall x_1 \dots \forall x_n\,\bigl(B(x_1, \dots, x_n) \leftrightarrow A(\ulcorner B(x_1, \dots, x_n) \urcorner, x_1, \dots, x_n)\bigr).$$

Proofs of these facts can be found in the standard sources [19, 28, 15].
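The best-known use of the fixed point lemma is the parameter-free case $n = 0$. As an illustration (the choice of $A$ and the name $G$ are ours, not the paper's), taking $A(x_0) := \neg\mathrm{Prov}_T(x_0)$ for an axiomatized theory $T$ yields a Gödel sentence for $T$:

```latex
% Instance of the fixed point lemma with n = 0 and A(x_0) := \neg Prov_T(x_0).
\[
  A(x_0) := \neg\mathrm{Prov}_T(x_0)
  \quad\Longrightarrow\quad
  \text{there is a sentence } G \text{ with }\;
  \mathrm{PRA} \vdash G \leftrightarrow \neg\mathrm{Prov}_T(\ulcorner G \urcorner).
\]
```

The sentence $G$ asserts its own unprovability in $T$. The same lemma, applied with parameters, is what produces self-referential numerations in the later sections.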
1.3. Extensions of theories. An axiomatized theory $U$ is an extension of an axiomatized theory $T$ if
$$\mathrm{PRA} \vdash \forall x\,(\mathrm{Ax}_T(x) \to \mathrm{Ax}_U(x)).$$
Clearly, in this case one also has
$$\mathrm{PRA} \vdash \forall x\,(\mathrm{Prov}_T(x) \to \mathrm{Prov}_U(x)) \qquad \text{and} \qquad \mathrm{PRA} \vdash \mathrm{Con}(U) \to \mathrm{Con}(T).$$
Unless explicitly stated otherwise, we assume that axiomatized theories are extensions of PRA.

$U$ is called a finite (or finitely axiomatized) extension of $T$ if for some arithmetical sentences $A_1, \dots, A_k$,
$$\mathrm{PRA} \vdash \forall x\,\bigl(\mathrm{Ax}_U(x) \leftrightarrow (\mathrm{Ax}_T(x) \vee x = \ulcorner A_1 \urcorner \vee \dots \vee x = \ulcorner A_k \urcorner)\bigr).$$
We denote this by writing $U = T + A_1 + \dots + A_k$. It is known that the theories $I\Sigma_n$ are provably equivalent to some finite extensions of PRA. Finite extensions of predicate logic are simply called finite theories.

Canonical families of finite subtheories of a theory $T$ and of finite extensions of $T$, respectively, are given by the following numerations:
$$\mathrm{Ax}_{T \upharpoonright n}(x) \rightleftharpoons \mathrm{Ax}_T(x) \wedge x \le n, \qquad \mathrm{Ax}_{T+u}(x) \rightleftharpoons \mathrm{Ax}_T(x) \vee x = u.$$

An axiomatized theory $T$ is called reflexive if for all $n \in \omega$, $T \vdash \mathrm{Con}(T \upharpoonright n)$. It is not difficult to see that equivalent finite theories are bound to be provably equivalent. Hence, the property of reflexivity does not depend on the choice of a numeration of a given theory. By Gödel's second incompleteness theorem, no consistent reflexive theory can be finite. Traditional examples of reflexive theories are PRA and PA. It is known that any extension of PA in arithmetical language is reflexive [15]. Similarly, so is any extension of PRA by $\Pi_1$-axioms [25].

1.4. Reflection principles. A theory $U$ is sound if all theorems of $U$ hold in $\mathbb{N}$. $U$ is called $\Sigma_n$-sound if all $\Sigma_n$-theorems of $U$ hold in $\mathbb{N}$. Soundness of an axiomatized theory $T$ is formally expressed by the following local reflection schema:
$$\mathrm{Rfn}(T) \rightleftharpoons \{\mathrm{Prov}_T(\ulcorner A \urcorner) \to A \mid A \in \mathrm{St}\}.$$
$\Sigma_n$-soundness of $T$ can be formally expressed in two different ways.
1. Local $\Sigma_n$-reflection schema: $\mathrm{Rfn}_{\Sigma_n}(T) \rightleftharpoons \{\mathrm{Prov}_T(\ulcorner A \urcorner) \to A \mid A \in \Sigma_n\}$;
2. Global $\Sigma_n$-reflection formula: $\mathrm{RFN}_{\Sigma_n}(T) \rightleftharpoons \forall x \in \Sigma_n\,(\mathrm{Prov}_T(x) \to \mathrm{True}_{\Sigma_n}(x))$.
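A single instance of the local reflection schema already yields consistency. The following short derivation is an illustration added for orientation, using only the definitions of $\mathrm{Rfn}(T)$ and of $\mathrm{Con}(T) \rightleftharpoons \neg\mathrm{Prov}_T(\ulcorner 0 = 1 \urcorner)$ from Section 1.2; the instance chosen is $A := (0 = 1)$.

```latex
\[
\begin{aligned}
  &\mathrm{Prov}_T(\ulcorner 0 = 1 \urcorner) \to 0 = 1
      && \text{(instance of } \mathrm{Rfn}(T) \text{ with } A := (0 = 1)\text{)} \\
  &\mathrm{PRA} \vdash \neg(0 = 1) \\
  &\mathrm{PRA} + \mathrm{Rfn}(T) \vdash \neg\mathrm{Prov}_T(\ulcorner 0 = 1 \urcorner),
      && \text{i.e.}\quad \mathrm{PRA} + \mathrm{Rfn}(T) \vdash \mathrm{Con}(T).
\end{aligned}
\]
```

Reflection is therefore at least as strong as consistency, and the results of this section show that it is in general strictly stronger.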
Obviously,
$$\mathrm{PRA} + \mathrm{RFN}_{\Sigma_n}(T) \vdash \mathrm{Rfn}_{\Sigma_n}(T).$$
The converse, generally, does not hold (see below). In order to exhibit a $\Sigma_n$-sound, but not $\Sigma_{n+1}$-sound, theory we need the following simple lemma [2, 26].

Lemma 1.1. There is no sentence $A \in \Pi_n$ such that the theory $\mathrm{PRA} + A$ is consistent and $\mathrm{PRA} + A \vdash \mathrm{Rfn}_{\Sigma_n}(T)$.

Proof. Let $A$ be such a sentence. Then $\neg A$ is PRA-equivalent to a certain $\Sigma_n$-formula. Since
$$\mathrm{PRA} + A \vdash \mathrm{Prov}_T(\ulcorner \neg A \urcorner) \to \neg A,$$
we also have
$$\mathrm{PRA} \vdash \mathrm{Prov}_T(\ulcorner \neg A \urcorner) \to \neg A,$$
and by Löb's theorem $T \vdash \neg A$. Hence $\mathrm{PRA} \vdash \mathrm{Prov}_T(\ulcorner \neg A \urcorner)$ and $\mathrm{PRA} \vdash \neg A$, so $\mathrm{PRA} + A$ is inconsistent. □

The following corollary of Lemma 1.1 was already known to Kreisel and Levy [21].

Corollary 1.2. There is no sentence $A$ such that $\mathrm{PRA} + A$ is consistent and $\mathrm{PRA} + A \vdash \mathrm{Rfn}(T)$.

Corollary 1.3. If $T$ is a $\Sigma_n$-sound theory, then $T + \neg\mathrm{RFN}_{\Sigma_n}(T)$ is $\Sigma_n$-sound but not $\Sigma_{n+1}$-sound.

Proof. Clearly, the false formula $\neg\mathrm{RFN}_{\Sigma_n}(T)$ is PRA-equivalent to a certain $\Sigma_{n+1}$-sentence. Hence, the theory $T + \neg\mathrm{RFN}_{\Sigma_n}(T)$ is not $\Sigma_{n+1}$-sound. Now let $A$ be an arbitrary $\Sigma_n$-sentence provable in $T + \neg\mathrm{RFN}_{\Sigma_n}(T)$. Then by contraposition
$$T + \neg A \vdash \mathrm{RFN}_{\Sigma_n}(T);$$
therefore by Lemma 1.1 the theory $T + \neg A$ is inconsistent. Thus, $T \vdash A$ and $\mathbb{N} \models A$ by the $\Sigma_n$-soundness of $T$. □

1.5. Iterated consistency assertions and rank of a theory. For a given axiomatized theory $T$ we define an increasing sequence of finite extensions of $T$ by iterated consistency assertions. Parametric numeration of this sequence is a primitive recursive formula $\mathrm{Ax}_T(z, x)$ satisfying
$$\mathrm{PRA} \vdash \mathrm{Ax}_T(z, x) \leftrightarrow \bigl(\mathrm{Ax}_T(x) \vee \exists u < z\;\, x = \ulcorner \mathrm{Con}_T(\dot{u}) \urcorner\bigr).$$
Here $\mathrm{Con}_T(u)$ denotes the consistency assertion defined from the parametric numeration $\mathrm{Ax}_T(u, x)$ itself. Since $\ulcorner \mathrm{Con}_T(\dot{u}) \urcorner$ is recovered primitively recursively from $\ulcorner \mathrm{Ax}_T(\dot{u}, x) \urcorner$, the above equivalence has the form of a fixed point equation. The fixed point lemma guarantees that a solution of the fixed point equation (that is, the required formula $\mathrm{Ax}_T(z, x)$) exists. This formula has to be primitive recursive, because the right hand side of the equivalence is.

For each $n \in \omega$ the formula $\mathrm{Ax}_T(\bar{n}, x)$ numerates a particular theory $T_n$. It is then easily checked that the following equivalences are (provably) satisfied:
$$T_0 = T, \qquad T_{n+1} = T_n + \mathrm{Con}(T_n).$$
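Let $T_\omega$ denote the theory $\bigcup_{n \in \omega} T_n$ numerated by
$$\mathrm{Ax}_{T_\omega}(x) \rightleftharpoons \exists z < x\; \mathrm{Ax}_T(z, x).$$
(Here and below we assume that the coding of the expressions of arithmetical language satisfies the requirement that longer expressions have larger Gödel numbers. So, in particular, $\ulcorner \mathrm{Con}_T(\bar{n}) \urcorner > \ulcorner \bar{n} \urcorner > n$, for every $n \in \omega$. Hence, $\mathrm{Ax}_{T_\omega}(x)$ is a numeration of the theory $T_\omega$.)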
The rank $\mathrm{rk}(T)$ of an axiomatized theory $T$ is the least $n \in \omega$ such that the theory $T_n$ is inconsistent if such an $n$ exists, and $\infty$, otherwise. Theories of infinite rank are also called strongly consistent.

Obviously, a theory $T$ is strongly consistent iff $T_\omega$ is consistent. If $T$ is a $\Sigma_1$-sound theory, then for all $n \in \omega$ the theory $T_n$ is $\Sigma_1$-sound as well; hence all $\Sigma_1$-sound theories are strongly consistent.

Strong consistency of a theory can be formally expressed in the following two ways:
1. Local strong consistency schema: $\mathrm{ConS}(T) \rightleftharpoons \{\mathrm{Con}_T(\bar{n}) \mid n \in \omega\}$;
2. (Global) strong consistency formula: $\mathrm{Con}(T_\omega)$.

We have the following obvious relationships:
$$\mathrm{PRA} + \mathrm{Con}(T_\omega) \vdash \mathrm{ConS}(T),$$
$$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T) \vdash \mathrm{ConS}(T),$$
$$I\Sigma_1 + \mathrm{RFN}_{\Sigma_1}(T) \vdash \mathrm{Con}(T_\omega).$$
On the other hand, if $T$ is strongly consistent, then by Gödel's theorem
$$\mathrm{PRA} + \mathrm{ConS}(T) \nvdash \mathrm{Con}(T_\omega),$$
and by Lemma 1.1
$$\mathrm{PRA} + \mathrm{Con}(T_\omega) \nvdash \mathrm{Rfn}_{\Sigma_1}(T).$$

Now we give some examples of theories of various ranks. Clearly, the inconsistent theory, and only it, has rank 0.

Lemma 1.4. Let $T$ be a strongly consistent theory. Then the theory $U \rightleftharpoons T + \neg\mathrm{Con}(T_n)$ has rank $n + 1$.

Proof. By induction on $m$ we will show that for all $m \le n$,
$$U_m = T_m + \neg\mathrm{Con}(T_n).$$
For $m = 0$ this statement is obvious. For the induction step, by the induction hypothesis we obtain
$$U_{m+1} = U_m + \mathrm{Con}(U_m) = T_m + \neg\mathrm{Con}(T_n) + \mathrm{Con}(T_m + \neg\mathrm{Con}(T_n)).$$
Since for $m \le n$ the theory $T_n$ is an extension of $T_m$, we have
$$\mathrm{PRA} \vdash \neg\mathrm{Con}(T_m) \to \neg\mathrm{Con}(T_n).$$
The formalized Gödel's theorem then implies
$$\mathrm{PRA} \vdash \mathrm{Con}(T_m) \to \mathrm{Con}(T_m + \neg\mathrm{Con}(T_m)) \to \mathrm{Con}(T_m + \neg\mathrm{Con}(T_n)) \to \mathrm{Con}(T_m).$$
It follows that
$$U_{m+1} = T_m + \neg\mathrm{Con}(T_n) + \mathrm{Con}(T_m) = T_{m+1} + \neg\mathrm{Con}(T_n),$$
as required.
Since $T$ is strongly consistent, the theory $U_m$ is consistent for $m \le n$ and inconsistent for $m = n+1$. Thus, $\mathrm{rk}(U) = n+1$. □
The following lemma allows us to exhibit a number of natural strongly consistent theories that are not $\Sigma_1$-sound.
Lemma 1.5. Let $T$ be an axiomatized theory, and let $U$ be a consistent extension of $T_\omega$. Then the theory $V \rightleftharpoons T + \neg\mathrm{Con}(U)$ has infinite rank, but is not $\Sigma_1$-sound.
Proof. Clearly, $V$ is not $\Sigma_1$-sound, because $\mathrm{Con}(U)$ is a true $\Pi_1$-sentence. The strong consistency of $V$ is proved in analogy with Lemma 1.4: by induction on $m$ it is easy to show that, for all $m \in \omega$, $V_m = T_m + \neg\mathrm{Con}(U)$. Since $U$ is consistent and contains $T_m$, by Gödel's theorem we have $T_m \nvdash \mathrm{Con}(U)$. Hence $V_m$ has to be consistent for all $m \in \omega$. □
Corollary 1.6. If $T$ is strongly consistent, then $T + \neg\mathrm{Con}(T_\omega)$ is consistent but not $\Sigma_1$-sound.
Corollary 1.7. The theories $\mathrm{PRA} + \neg\mathrm{Con}(\mathrm{PRA})$, $\mathrm{PA} + \neg\mathrm{Con}(\mathrm{ZF})$, and $I\Sigma_1 + \neg\mathrm{Con}(I\Sigma_2)$ are strongly consistent but not $\Sigma_1$-sound.
1.6. Gödel-Löb logic. The language of modal logic includes propositional letters $p, q, \dots$, boolean connectives $\to$, $\bot$ (falsum) and a unary connective $\Box$. The connectives $\wedge$, $\vee$, $\leftrightarrow$, $\neg$, and $\Diamond$ are treated as abbreviations ($\Diamond \rightleftharpoons \neg\Box\neg$). The logic GL of Gödel and Löb is formulated in this language and has the following axioms:
1. Propositional tautologies;
2. $\Box(p \to q) \to (\Box p \to \Box q)$;
3. $\Box p \to \Box\Box p$;⁵
4. $\Box(\Box p \to p) \to \Box p$ (Löb's axiom).
The inference rules of GL are modus ponens, substitution, and the necessitation rule $\varphi \vdash \Box\varphi$. A modal logic is any set of modal formulas closed under modus ponens and substitution rules and containing all propositional tautologies. For a given modal logic $\ell$ and a set of formulas $X$ we denote by $\ell X$ the closure under modus ponens and substitution of $\ell$ together with the set of axioms $X$. The important provability logics introduced respectively by Solovay [29], Dzhaparidze [14], and Artëmov [1] are defined as follows:
$S \rightleftharpoons \mathrm{GL}\{\Box p \to p\}$,
$D \rightleftharpoons \mathrm{GL}\{\neg\Box\bot,\ \Box(\Box p \vee \Box q) \to (\Box p \vee \Box q)\}$,
$A \rightleftharpoons \mathrm{GL}\{\neg\Box^n\bot \mid n \in \omega\}$.
Here $\Box^n$ denotes the $n$-fold iteration of $\Box$: $\Box^0\varphi \rightleftharpoons \varphi$, $\Box^{n+1}\varphi \rightleftharpoons \Box\Box^n\varphi$.
⁵ This axiom is known to be derivable from axioms 1, 2, and 4. We include it in the list of axioms following a historical tradition and in order to acknowledge the importance of the corresponding derivability condition.
Now we describe the so-called Kripke semantics for GL. A binary relation $\prec$ on a set $K$ is called converse well-founded if there is no infinite chain of elements of $K$ of the form $a_0 \prec a_1 \prec \cdots$. A Kripke model (or simply a model) is a triple $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$, where
• $\prec$ is a converse well-founded tree-like strict partial ordering on $K$. The poset $(K, \prec)$ is called the frame of $\mathcal{K}$. Elements of $K$ are called nodes, and the minimal node is the root of $\mathcal{K}$.
• $\Vdash$ is a forcing relation on $\mathcal{K}$, that is, a relation between the nodes of $K$ and modal formulas, which satisfies the following conditions for any $x \in K$ and any formulas $\varphi$, $\psi$:
1. $x \nVdash \bot$;
2. $x \Vdash \varphi \to \psi \iff (x \nVdash \varphi$ or $x \Vdash \psi)$;
3. $x \Vdash \Box\varphi \iff \forall y \in K\ (x \prec y \Rightarrow y \Vdash \varphi)$.
By conditions 1-3, for a given model $\mathcal{K}$ the forcing relation on $\mathcal{K}$ is uniquely determined by its restriction to propositional variables. We say that a formula $\varphi$ holds or is valid in a model $\mathcal{K}$ if it is forced at the root of $\mathcal{K}$ (denoted $\mathcal{K} \Vdash \varphi$). The following completeness theorem for GL with respect to the class of (finite) Kripke models was established in [24] (see also [28, 11]).
Completeness theorem. 1. All theorems of GL hold in any model. 2. For any modal formula $\varphi$, if $\mathrm{GL} \nvdash \varphi$, then there is a finite model $\mathcal{K}$ such that $\mathcal{K} \nVdash \varphi$.
For this reason finite Kripke models will also be called GL-models. As a trivial consequence of the completeness theorem, the system GL is decidable; that is, there is an algorithm deciding, for a given modal formula $\varphi$, whether it is a theorem of GL or not [28].
1.7. Arithmetical interpretation. An arbitrary mapping of the set of propositional letters to the set of arithmetical sentences is called a realization. Let $T$ be an axiomatized theory.
The $T$-interpretation $f_T(\varphi)$ of a modal formula $\varphi$ under a realization $f$ is defined inductively as follows:
• $f_T(\bot) \rightleftharpoons \bot$;
• $f_T(p) \rightleftharpoons f(p)$, for any propositional letter $p$;
• For any modal formulas $\theta$ and $\psi$, $f_T(\theta \to \psi) \rightleftharpoons f_T(\theta) \to f_T(\psi)$, $f_T(\Box\psi) \rightleftharpoons \mathrm{Prov}_T(\ulcorner f_T(\psi)\urcorner)$.
Let $U$ be an arbitrary theory containing PRA (not necessarily axiomatized or even r.e.). The set $PL_T(U)$ of modal formulas all $T$-interpretations of which are provable in $U$ is called the provability logic for $T$ relative to $U$ or simply the $T$-representation of $U$. A modal logic $\ell$ is called a provability logic if $\ell = PL_T(U)$ for some $T$ and $U$. A somewhat older term for the same notion, introduced by Artëmov [1], is arithmetically complete modal logic. Notice the following simple properties of provability logics.
1. The set $PL_T(U)$ is invariant with respect to the replacement of $T$ by a provably equivalent theory and of $U$ by an equivalent theory.
2. For any $T$ and $U$, the set $PL_T(U)$ is a modal logic containing GL.
3. If $U_1 \subseteq U_2$, then $PL_T(U_1) \subseteq PL_T(U_2)$.
4. For any family of theories $(U_i)_{i \in I}$, $PL_T\bigl(\bigcap_{i\in I} U_i\bigr) = \bigcap_{i\in I} PL_T(U_i)$.
For any set $X$ of modal formulas, let $X^T$ denote the set of all $T$-interpretations of formulas from $X$. If a modal formula $\varphi$ does not contain any occurrences of propositional letters, then $f_T(\varphi)$ actually does not depend on the realization $f$. In this case we also denote $f_T(\varphi)$ by $\varphi^T$. The $T$-completion $[\ell]^T$ of a modal logic $\ell$ is the minimal provability logic for $T$ containing $\ell$. It is easy to see that $[\mathrm{GL}X]^T$ is a $T$-representation of the theory $\mathrm{PRA} + X^T$.
Example 1.8. By induction on $n$ it is easily seen that for all $n \in \omega$ we have $\mathrm{PRA} \vdash (\neg\Box^{n+1}\bot)^T \leftrightarrow \mathrm{Con}_T(n)$. Hence, $\mathrm{PRA} + \{\neg\Box^{n+1}\bot \mid n \in \omega\}^T = \mathrm{PRA} + \mathrm{ConS}(T)$, and therefore $[A]^T = PL_T(\mathrm{PRA} + \{\neg\Box^{n+1}\bot \mid n \in \omega\}^T) = PL_T(\mathrm{PRA} + \mathrm{ConS}(T))$.

2. Semantics for S, D, and A

2.1. Operations on Kripke models.
Submodels. Let $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ be a model and $a$ an element of $K$. The submodel of $\mathcal{K}$ generated by the node $a$ is the model $\mathcal{K}_a \rightleftharpoons (K_a, \prec_a, \Vdash_a)$, where $K_a \rightleftharpoons \{x \in K \mid a \preceq x\}$ and the relations $\prec_a$ and $\Vdash_a$ (on propositional letters) are the restrictions of the relations $\prec$ and $\Vdash$ to the set $K_a$. Then, obviously, the forcing of all formulas at the elements of $K_a$ in the models $\mathcal{K}$ and $\mathcal{K}_a$ will be the same. For the sake of brevity we shall also call such generated submodels cones.
Duplicating a cone. Let $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ be a model, and let the nodes $a, b \in K$ satisfy $a \prec b$. Define a new model $\mathcal{K}_a^b \rightleftharpoons (K_a^b, \prec', \Vdash')$ in the following way: $K_a^b \rightleftharpoons \{(x,0) \mid x \in K\} \cup \{(x,1) \mid x \in K,\ x \succeq b\}$. Set $(x,i) \prec' (y,j)$ if $i = j$ and $x \prec y$, or $i = 0$, $j = 1$ and $x \preceq a$. Finally, for any propositional letter $p$ set $(x,i) \Vdash' p \iff x \Vdash p$. Informally, the model $\mathcal{K}_a^b$ is obtained by adjoining to the model $\mathcal{K}$ above the node $a$ a new cone isomorphic to $\mathcal{K}_b$. We call this operation the duplication of the cone $\mathcal{K}_b$ above $a$. It can be directly checked that, for any formula $\varphi$ and any element $(x,i) \in K_a^b$, $(x,i) \Vdash' \varphi \iff x \Vdash \varphi$.
Deleting a cone.
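Let $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ be a model and $a \in K$. Define a new model $\mathcal{K}^- \rightleftharpoons (K^-, \prec^-, \Vdash^-)$ as follows: $K^- \rightleftharpoons \{x \in K \mid a \not\preceq x\}$; $\prec^-$ and $\Vdash^-$ are the restrictions of $\prec$ and $\Vdash$ to the set $K^-$. The transformation of $\mathcal{K}$ to $\mathcal{K}^-$ is called deleting the cone generated by $a$ in $\mathcal{K}$. It is clear that when we delete a cone generated by $a$ in $\mathcal{K}$ the forcing relation for all modal formulas is preserved at all nodes $x \in K^-$ such that $x \not\prec a$.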
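The forcing conditions and the two cone operations just described can be tried out on small finite models. The following Python sketch is only an illustration and is not part of the original text: the tuple encoding of formulas and the names `Model`, `forces`, `cone`, and `delete_cone` are assumptions made for this example, and the successor relation passed to `Model` is assumed to be already transitive and irreflexive.

```python
# Illustrative sketch (not from the source): finite Kripke models for GL.
# Formulas are nested tuples; every name below is hypothetical.
BOT = ("bot",)


def var(p):
    return ("var", p)


def imp(a, b):
    return ("imp", a, b)


def box(a):
    return ("box", a)


class Model:
    def __init__(self, succ, val):
        # succ[x]: set of all nodes strictly above x (assumed transitive)
        # val[p]:  set of nodes forcing the propositional letter p
        self.succ = {x: set(ys) for x, ys in succ.items()}
        self.val = {p: set(xs) for p, xs in val.items()}

    def forces(self, x, phi):
        tag = phi[0]
        if tag == "bot":   # no node forces falsum
            return False
        if tag == "var":
            return x in self.val.get(phi[1], set())
        if tag == "imp":   # x forces A -> B iff x does not force A or x forces B
            return (not self.forces(x, phi[1])) or self.forces(x, phi[2])
        if tag == "box":   # x forces box A iff every node above x forces A
            return all(self.forces(y, phi[1]) for y in self.succ[x])
        raise ValueError(f"unknown connective: {tag!r}")

    def _restrict(self, keep):
        succ = {x: self.succ[x] & keep for x in keep}
        val = {p: xs & keep for p, xs in self.val.items()}
        return Model(succ, val)

    def cone(self, a):
        """Submodel generated by a: a together with all nodes above it."""
        return self._restrict({a} | self.succ[a])

    def delete_cone(self, a):
        """Model obtained by removing the cone generated by a."""
        return self._restrict(set(self.succ) - ({a} | self.succ[a]))


# Root 0 with two branches: 0 < 1 < 3 and 0 < 2; the letter p holds at 1 and 3.
K = Model({0: {1, 2, 3}, 1: {3}, 2: set(), 3: set()}, {"p": {1, 3}})
p = var("p")
lob = imp(box(imp(box(p), p)), box(p))   # Lob's axiom
refl = imp(box(p), p)                    # reflection: an axiom of S, not of GL
```

In this toy model Löb's axiom is forced at every node, while the reflection formula fails at node 2. Deleting the cone generated by 2 changes the forcing of `box(p)` at the root 0, which lies below 2, but not at the nodes 1 and 3, in line with the preservation claim above.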
Expanding a node. Let $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ be a model, $a \in K$, and $(A, <)$ a well-ordering of type $\alpha$ disjoint with $K$. (In what follows we shall only consider $\alpha$ such that $0 < \alpha \le \omega + 1$.) Define a model $\mathcal{K}' \rightleftharpoons (K', \prec', \Vdash')$ in such a way that $K' \rightleftharpoons K \cup A$ and $x \prec' y$ holds if one of the following conditions is satisfied:
1. $x, y \in K$ and $x \prec y$,
2. $x, y \in A$ and $y < x$,
3. $x \in A$, $y \in K$ and $y \succeq a$,
4. $x \in K$, $y \in A$ and $x \prec a$;
and for all propositional letters $p$, $x \Vdash' p$ if and only if either $x \in K$ and $x \Vdash p$, or $x \in A$ and $a \Vdash p$. We say that the model $\mathcal{K}'$ is obtained from $\mathcal{K}$ by $\alpha$-expanding the node $a$.
Let $\varphi$ be a modal formula. A node $x$ of a model $\mathcal{K}$ is called $\varphi$-reflexive if for each subformula $\Box\psi$ of $\varphi$, $x \Vdash \Box\psi \Rightarrow x \Vdash \psi$. It is easy to see that, whenever $a$ is a $\varphi$-reflexive node of a model $\mathcal{K}$, the forcing of all subformulas of $\varphi$ is preserved under any expansion of $a$. In other words, for any $x \in K'$ and any subformula $\psi$ of $\varphi$,
(1) $x \Vdash' \psi \iff (x \in K$ and $x \Vdash \psi$, or $x \in A$ and $a \Vdash \psi)$.
2.2. Kripke semantics for S. Decidability of the provability logic S was established in [29]. The Kripke-style semantics for S described below was introduced by Visser [30] and in a somewhat different format by Boolos [10]. An S-model is a model $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ such that there is an element $r \in K$ (called a stem node of $\mathcal{K}$) satisfying the following conditions:
1. The submodel of $\mathcal{K}$ generated by $r$ is a finite tree;
2. $\{x \in K \mid x \preceq r\}$ is a linearly ordered chain of order type $(\omega+1)^*$;
3. Any element of $\mathcal{K}$ is comparable with $r$;
4. The forcing of all the propositional variables is the same at all nodes of the chain $\{x \in K \mid x \preceq r\}$.
Here, as usual, for any linear ordering $\alpha$, $\alpha^*$ denotes the set $\alpha$ equipped with the converse ordering relation $x \prec^* y \iff y \prec x$. Obviously, any S-model $\mathcal{K}$ is obtained from the GL-model $\mathcal{K}_r$ by $(\omega+1)$-expanding its root $r$. Also notice that any S-model has infinitely many stem nodes. We obtain the following statements.
Lemma 2.1.
For any modal formula $\varphi$ and any S-model $\mathcal{K}$ with a root $b$, $\mathcal{K} \Vdash \varphi \iff (\exists x \succ b\ \forall y \prec x\ \ y \Vdash \varphi)$.
Proof. An obvious induction on the build-up of the formula $\varphi$. □
Lemma 2.2. For any modal formula $\varphi$, $S \vdash \varphi \iff (\mathcal{K} \Vdash \varphi$ for any S-model $\mathcal{K})$.
Proof. ($\Rightarrow$) Since modus ponens preserves the validity at every node of any Kripke model, it is sufficient to establish that at the root $b$ of $\mathcal{K}$ all formulas of the form $\Box\psi \to \psi$ are forced. Assume, for a contradiction, that $b \nVdash \psi$. Then by Lemma 2.1 for some $x \succ b$, $x \nVdash \psi$. But this implies $b \nVdash \Box\psi$, q.e.d.
($\Leftarrow$) Assume that $S \nvdash \varphi$. Consider the formula $S(\varphi) \rightleftharpoons \bigwedge_{i=1}^{n} (\Box\varphi_i \to \varphi_i)$, where $\Box\varphi_1, \dots, \Box\varphi_n$ enumerate all subformulas of $\varphi$ of the form $\Box\psi$. Obviously, $S \vdash S(\varphi)$. Hence, $\mathrm{GL} \nvdash S(\varphi) \to \varphi$. By the completeness theorem there is a GL-model $\mathcal{K}$ with a root $r$ such that $\mathcal{K} \Vdash S(\varphi)$ and $\mathcal{K} \nVdash \varphi$. By the definition of the formula $S(\varphi)$ the node $r$ is $\varphi$-reflexive. Now $(\omega+1)$-expand the node $r$ of $\mathcal{K}$ and denote the resulting S-model by $\mathcal{K}'$. By (1), for any subformula $\psi$ of $\varphi$ we have $\mathcal{K} \Vdash \psi \iff \mathcal{K}' \Vdash \psi$. Thus, $\mathcal{K}' \nVdash \varphi$, q.e.d. □
From the proof of the above lemma we obtain the following results.
Corollary 2.3. For any modal formula $\varphi$, $S \vdash \varphi \iff \mathrm{GL} \vdash S(\varphi) \to \varphi$.
Corollary 2.4. The logic S is decidable.
Proof. This follows from the previous corollary by the decidability of GL. □
Corollary 2.5. For any modal formula $\varphi$, $S \vdash \Box\varphi \iff \mathrm{GL} \vdash \varphi$.
Proof. If $\mathrm{GL} \nvdash \varphi$, then there is a GL-model $\mathcal{K}$ such that $\mathcal{K} \nVdash \varphi$. $(\omega+1)$-expanding the root of $\mathcal{K}$, we obtain an S-model that falsifies $\Box\varphi$. □
2.3. Kripke semantics for D. Decidability of the provability logic D was proved by Dzhaparidze [14]. An adequate Kripke-style semantics for D was introduced in [6]. We say that $a$ is an accumulating node of a Kripke model $\mathcal{K}$ iff $\{x \in K \mid x \succ a\}$ is not empty and $\forall x, y \succ a\ \exists z \succ a\ (z \prec x$ and $z \prec y)$.
Lemma 2.6. Let $a$ be an accumulating node in $\mathcal{K}$. Then $a$ forces all theorems of D.
Proof. As in the proof of Lemma 2.2, it is sufficient to prove that $a$ forces $\neg\Box\bot$ as well as all the formulas of the form $\Box(\Box\varphi \vee \Box\psi) \to (\Box\varphi \vee \Box\psi)$. The requirement $a \Vdash \neg\Box\bot$ follows from the nonemptiness of the set $\{x \in K \mid x \succ a\}$. Assume that $a \nVdash \Box\varphi \vee \Box\psi$ for some formulas $\varphi$ and $\psi$. Then there exist two nodes $x, y \succ a$ such that $x \nVdash \varphi$ and $y \nVdash \psi$. Since $a$ is an accumulating node, there is a $z \succ a$ such that $z \prec x$ and $z \prec y$. Obviously, $z \nVdash \Box\varphi \vee \Box\psi$, and therefore $a \nVdash \Box(\Box\varphi \vee \Box\psi)$. □
We shall see below that D is complete with respect to the class of models with the accumulating root.
However, it will be essential that, in fact, a stronger version of this completeness result holds. A D-model is a Kripke model satisfying conditions 1-3 of the definition of S-model and the following condition: 
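5. The forcing of all propositional letters is the same at every node of the set $\{x \in K \mid b \prec x \preceq r\}$, where $b$ is the root of $\mathcal{K}$.
Thus, D-models differ from S-models only in that there is no restriction on the forcing of propositional letters at the root.
Lemma 2.7. For any modal formula $\varphi$, $D \vdash \varphi \iff (\mathcal{K} \Vdash \varphi$ for any D-model $\mathcal{K})$.
Proof. The implication ($\Rightarrow$) follows from the fact that the root of any D-model is accumulating. For the converse implication we first establish the following lemma.
Lemma 2.7.1. Let $\{\varphi_1, \dots, \varphi_n\}$ be a finite (possibly empty) set of formulas. Then $D \vdash \Box\bigl(\bigvee_{i=1}^{n} \Box\varphi_i\bigr) \to \bigvee_{i=1}^{n} \Box\varphi_i$.
Proof. We argue by induction on $n$. For $n = 0$ the statement of the lemma amounts to the derivability of the formula $\Box\bot \to \bot$, which is an axiom. Assume that the statement holds for $n$. Obviously, using axiom 3 of GL, $\mathrm{GL} \vdash \bigvee_{i=1}^{n} \Box\varphi_i \to \Box\bigl(\bigvee_{i=1}^{n} \Box\varphi_i\bigr)$. Hence $\mathrm{GL} \vdash \bigvee_{i=1}^{n+1} \Box\varphi_i \to \Box\bigl(\bigvee_{i=1}^{n} \Box\varphi_i\bigr) \vee \Box\varphi_{n+1}$ and $\mathrm{GL} \vdash \Box\bigl(\bigvee_{i=1}^{n+1} \Box\varphi_i\bigr) \to \Box\bigl(\Box\bigl(\bigvee_{i=1}^{n} \Box\varphi_i\bigr) \vee \Box\varphi_{n+1}\bigr)$. By the main axiom of D and the induction hypothesis we obtain $D \vdash \Box\bigl(\bigvee_{i=1}^{n+1} \Box\varphi_i\bigr) \to \Box\bigl(\bigvee_{i=1}^{n} \Box\varphi_i\bigr) \vee \Box\varphi_{n+1} \to \bigvee_{i=1}^{n+1} \Box\varphi_i$. □
Turning to the proof of the implication ($\Leftarrow$) in Lemma 2.7, assume, for a contradiction, that $D \nvdash \varphi$. Let $\Box\varphi_1, \dots, \Box\varphi_n$ enumerate all subformulas of $\varphi$ of the form $\Box\theta$. For any subset $I \subseteq \{1, \dots, n\}$ let $D_I(\varphi)$ denote the formula $\Box\bigl(\bigvee_{i\in I} \Box\varphi_i\bigr) \to \bigvee_{i\in I} \Box\varphi_i$; in particular, $D_\emptyset(\varphi) \rightleftharpoons \Box\bot \to \bot$. Define $D(\varphi) \rightleftharpoons \bigwedge_{I \subseteq \{1,\dots,n\}} D_I(\varphi)$.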
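By Lemma 2.7.1 $D \vdash D(\varphi)$, and hence $\mathrm{GL} \nvdash D(\varphi) \to \varphi$. By the completeness theorem for GL we obtain a GL-model $\mathcal{K}$ with a root $b$ such that $\mathcal{K} \Vdash D(\varphi)$ and $\mathcal{K} \nVdash \varphi$. Take $I \rightleftharpoons \{i \in \{1,\dots,n\} \mid \mathcal{K} \nVdash \Box\varphi_i\}$. Since $\mathcal{K} \Vdash D_I(\varphi)$ and $\mathcal{K} \nVdash \bigvee_{i\in I} \Box\varphi_i$, we have $\mathcal{K} \nVdash \Box\bigl(\bigvee_{i\in I} \Box\varphi_i\bigr)$. Therefore, there is a node $x \in K$ such that $x \succ b$ and, for all $i \in I$, $x \nVdash \Box\varphi_i$. Notice that the node $x$ is $\varphi$-reflexive, because for all $i \in \{1,\dots,n\}$, $x \Vdash \Box\varphi_i$ implies $\mathcal{K} \Vdash \Box\varphi_i$, and hence $x \Vdash \varphi_i$.
Now let a model $\mathcal{K}' \rightleftharpoons (K', \prec', \Vdash')$ be obtained from $\mathcal{K}$ by duplicating the cone $\mathcal{K}_x$ over $b$, and then by deleting all elements of the 'old' model $\mathcal{K}$ except for its root $(b,0) \in K_b^x$. (The latter operation amounts to deleting all submodels of $\mathcal{K}_b^x$ generated by nodes of the form $(a,0)$, where $a \in K$, $a \succ b$, and there is no $z \in K$ such that $b \prec z \prec a$.) Let $r$ denote the node $(x,1) \in K'$. Since the forcing of all formulas is preserved under the duplication of cones, the node $r$, as well as the node $x \in K$, is $\varphi$-reflexive.
Lemma 2.7.2. For all subformulas $\psi$ of the formula $\varphi$, $\mathcal{K}' \Vdash \psi \iff \mathcal{K} \Vdash \psi$.
Proof. Induction on the build-up of $\psi$. We shall only treat the central case when $\psi$ has the form $\Box\theta$. Assume $\mathcal{K} \Vdash \Box\theta$. Then for all elements $y \in K_x$, $y \Vdash \theta$. Since the forcing of all formulas is preserved under the duplication of cones, $\forall z \in K'\ (z \succ' (b,0) \Rightarrow z \Vdash' \theta)$, that is, $\mathcal{K}' \Vdash \Box\theta$. If $\mathcal{K} \nVdash \Box\theta$, then by the construction of $x$, $x \nVdash \Box\theta$. Hence there is an element $y \in K_x$ such that $y \nVdash \theta$. From this we infer $(y,1) \nVdash' \theta$ and $\mathcal{K}' \nVdash \Box\theta$. □
Now we complete the proof of Lemma 2.7. Let $\mathcal{K}''$ be the result of $\omega$-expanding the node $r$ of the model $\mathcal{K}'$. Obviously, $\mathcal{K}''$ is a D-model, and since $r$ is a $\varphi$-reflexive node in $\mathcal{K}'$, $\mathcal{K}'' \Vdash \psi \iff \mathcal{K}' \Vdash \psi$ for all subformulas $\psi$ of $\varphi$. By Lemma 2.7.2, $\mathcal{K}' \nVdash \varphi$; hence $\mathcal{K}'' \nVdash \varphi$. □
From the proof of Lemma 2.7 we obtain the following corollary.
Corollary 2.8. For any modal formula $\varphi$, $D \vdash \varphi \iff \mathrm{GL} \vdash D(\varphi) \to \varphi$.
Corollary 2.9. The logic D is decidable.
Every S-model is a D-model.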
On the other hand, with every D-model $\mathcal{K}$ one can associate a unique (up to isomorphism) S-model $\mathcal{K}^\circ$ that differs from $\mathcal{K}$, if at all, only by the forcing of propositional letters at the root. We call $\mathcal{K}^\circ$ a stabilization of $\mathcal{K}$. Obviously, for any S-model $\mathcal{K}$, $\mathcal{K}^\circ \cong \mathcal{K}$. A formula $\varphi$ is modalized iff it has the form of a boolean combination of formulas of the form $\Box\psi$. From Lemma 2.1 we immediately obtain the following.
Lemma 2.10. If $\varphi$ is a modalized formula, then for any D-model $\mathcal{K}$, $\mathcal{K} \Vdash \varphi \iff \mathcal{K}^\circ \Vdash \varphi$.
Corollary 2.11. For all modalized formulas $\varphi$, $D \vdash \varphi \iff S \vdash \varphi$.
Proof. Since any S-model has the form $\mathcal{K}^\circ$, for any modalized formula $\varphi$ we have
$D \vdash \varphi \iff$ (for all D-models $\mathcal{K}$, $\mathcal{K} \Vdash \varphi$) $\iff$ (for all D-models $\mathcal{K}$, $\mathcal{K}^\circ \Vdash \varphi$) $\iff$ (for all S-models $\mathcal{K}$, $\mathcal{K} \Vdash \varphi$) $\iff S \vdash \varphi$. □
2.4. Kripke semantics for A. Decidability of A is proved in [1]. Kripke semantics for A is introduced in [6]. The depth function on a model $\mathcal{K}$ is a mapping $d$ of the set $K$ to the class of all ordinals uniquely defined by the following condition: $\forall x \in K\ \ d(x) = \sup\{d(y) + 1 \mid y \succ x\}$, where we assume $\sup\emptyset = 0$. (Recall that all models are converse well-founded.) The height $h(\mathcal{K})$ of a model $\mathcal{K}$ is the depth of its root.
Lemma 2.12. Let $x$ be a node of a model $\mathcal{K}$ such that $d(x)$ is infinite. Then $x$ forces all theorems of A.
Proof. It is sufficient to check that $x$ forces all formulas of the form $\neg\Box^n\bot$ for $n \in \omega$. By induction on $n$ it is easy to show that for all elements $z$ of the model $\mathcal{K}$, $z \Vdash \neg\Box^n\bot \iff d(z) \ge n$. Since $d(x) \ge \omega$, it follows that $x \Vdash \neg\Box^n\bot$ for all $n$. □
Below we shall prove that A is complete with respect to the class of models of infinite height. But, as before, we need a stronger version of the completeness theorem. An A-model is a model $(K, \prec, \Vdash)$ with a root $b$ and a stem element $r$ satisfying conditions 1, 2, and 5 of the definition of S- and D-models as well as the following ones:
6. If $b \prec x \prec r$ and a node $y$ is comparable with $x$, then $y$ is comparable with $r$.
7. The restriction of the ordering $\prec$ to the set $K \setminus \{x \in K \mid x \ne b,\ x$ and $r$ are comparable$\}$ is a finite tree (possibly consisting of the single node $b$).
We say that a node $y$ covers a node $x$ in a model $\mathcal{K}$ iff $y \succ x$ and there is no $z \in K$ such that $x \prec z \prec y$. It is easy to see that in any A-model $\mathcal{K}$ there exist only a finite number of nodes $r_1, \dots, r_n$ covering the root. The corresponding submodels $\mathcal{K}_{r_i}$, $1 \le i \le n$,
are called side cones. What remains after deleting all the side cones from an A-model is essentially a D-model. Also notice that any A-model can be obtained from a suitable GL-model by $\omega$-expansion of one of its nodes covering the root.
Lemma 2.13. For any modal formula $\varphi$, $A \vdash \varphi \iff (\mathcal{K} \Vdash \varphi$ for all A-models $\mathcal{K})$.
Proof. The implication ($\Rightarrow$) follows from the fact that all A-models have infinite height. For the opposite implication we first establish the following useful lemma (cf. [2]). Let $\varphi_1, \dots, \varphi_n$ be any modal formulas, and let $R(\varphi_1, \dots, \varphi_n)$ denote the formula $\bigwedge_{i=1}^{n} (\Box\varphi_i \to \varphi_i)$.
Lemma 2.14. Let $\mathcal{K}$ be any model, and let $a_0 \prec a_1 \prec \cdots \prec a_n$ be an increasing sequence of elements of $\mathcal{K}$. Then there is an $i \le n$ such that $a_i \Vdash R(\varphi_1, \dots, \varphi_n)$.
Proof. Notice that for any formula $\psi$, if $a_i \nVdash \Box\psi \to \psi$, then $\forall j > i\ \ a_j \Vdash \psi$ and $\forall j < i\ \ a_j \nVdash \Box\psi$. Therefore, a formula of the form $\Box\psi \to \psi$ can be false at no more than one node of the chain $a_0 \prec a_1 \prec \cdots \prec a_n$. Since the chain has $n+1$ elements, and the formula $R(\varphi_1, \dots, \varphi_n)$ consists of $n$ conjuncts of the above form, the claim follows by the pigeon-hole principle. □
Corollary 2.15. For any formulas $\varphi_1, \dots, \varphi_n$, $\mathrm{GL} \vdash \neg\Box^{n+1}\bot \to \Diamond R(\varphi_1, \dots, \varphi_n)$.
Proof. Consider an arbitrary GL-model $\mathcal{K}$ that forces the formula $\neg\Box^{n+1}\bot$. Since $h(\mathcal{K}) \ge n+1$, a linear chain of $n+2$ elements of the form $b \prec a_0 \prec a_1 \prec \cdots \prec a_n$, where $b$ is the root, must exist in $\mathcal{K}$. The previous lemma implies that $\exists i \le n\ \ a_i \Vdash R(\varphi_1, \dots, \varphi_n)$, and therefore $\mathcal{K} \Vdash \Diamond R(\varphi_1, \dots, \varphi_n)$. □
Corollary 2.16. For any formulas $\varphi_1, \dots, \varphi_n$, $A \vdash \Diamond R(\varphi_1, \dots, \varphi_n)$.
Returning to the proof of the implication ($\Leftarrow$) of Lemma 2.13, assume, for a contradiction, that $A \nvdash \varphi$. Let $\Box\varphi_1, \dots, \Box\varphi_n$ exhaust all subformulas of $\varphi$ of the form $\Box\psi$. By Corollary 2.16, $\mathrm{GL} \nvdash \Diamond R(\varphi_1, \dots, \varphi_n) \to \varphi$; hence there is a GL-model $\mathcal{K}$ with the root $b$ such that $\mathcal{K} \nVdash \varphi$ and $\exists r \in K\ (r \succ b$ and $r \Vdash R(\varphi_1, \dots, \varphi_n))$. By the definition of the formula $R(\varphi_1, \dots, \varphi_n)$ the node $r$ is $\varphi$-reflexive.
Now we successively apply to the model $\mathcal{K}$ two operations:
1. Duplicating the cone $\mathcal{K}_r$ over $b$;
2. $\omega$-expanding the node $(r,1) \in K_b^r$.
Let $\mathcal{K}'$ denote the resulting model. Obviously, in $\mathcal{K}_b^r$ the node $(r,1)$ is $\varphi$-reflexive and covers the root $(b,0)$. Hence, $\mathcal{K}'$ is an A-model validating the same subformulas of $\varphi$ as $\mathcal{K}$. In particular, $\mathcal{K}' \nVdash \varphi$. □
From the proof of Lemma 2.13 we immediately obtain the following result.
Corollary 2.17. Let $n$ be the number of all subformulas of $\varphi$ of the form $\Box\psi$. Then $A \vdash \varphi \iff \mathrm{GL} \vdash \neg\Box^{n+1}\bot \to \varphi$.
Corollary 2.18. The logic A is decidable.

3. Trace classification of provability logics

In this section we present some techniques introduced in [29, 3] that will be essential for us later. For the reader's convenience we include short proofs of some key lemmas. Furthermore, we recapitulate in a somewhat more general setup the results on the classification of provability logics obtained in [30, 3]. In particular, in proving Artëmov's theorems we avoid the additional assumption of soundness of the inner theory.
3.1. Solovay construction. Solovay [29] proved the following two important theorems.
1. If an axiomatized theory $T$ is $\Sigma_1$-sound, then $PL_T(\mathrm{PRA}) = PL_T(T) = \mathrm{GL}$.
2. If an axiomatized theory $T$ is sound, then $PL_T(\mathrm{TA}) = S$.
(It follows from the results of Visser [30] that in Solovay's theorem 1 the requirement of $\Sigma_1$-soundness can be weakened to that of strong consistency of $T$. On the other hand, it is easy to see that the requirement of soundness in Solovay's theorem 2 is a necessary condition.)
For the proof of these theorems Solovay [29] applied the technique of "embedding" Kripke models into arithmetic, which is now called the Solovay construction. We shall describe this construction below.
Let $T$ be an axiomatized theory and $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ a finite GL-model. We assume without loss of generality that $K = \{0, \dots, n\}$ and 0 is the root of $\mathcal{K}$. A primitive recursive function $h(x)$ is defined with the aid of the formalized primitive recursion theorem as follows: $h(0) = 0$;
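$h(m+1) = \begin{cases} z & \text{if } z \in K,\ h(m) \prec z \text{ and } \mathrm{Prf}_T(\ulcorner \ell \ne \bar z\urcorner, m), \\ h(m) & \text{otherwise.} \end{cases}$
Here $\ell = z$ denotes the arithmetical formula $\exists m\, \forall n \ge m\ h(n) = z$ informally expressing that $\lim_{n\to\infty} h(n) = z$, and $\ell \ne z \rightleftharpoons \neg(\ell = z)$.
Lemma 3.1. The following statements are provable in PRA:
1. $\bigvee_{z \in K} \ell = z$;
2. $\forall u, v\ (\ell = u \wedge \ell = v \to u = v)$;
3. $\ell = z \to \mathrm{Prov}_T(\ulcorner \bigvee_{w \succ z} \ell = \bar w\urcorner)$, if $z \in K$ and $z \succ 0$;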
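4. $\ell = z \to \neg\mathrm{Prov}_T(\ulcorner \ell \ne \bar u\urcorner)$, if $z, u \in K$ and $u \succ z$.
Proof. Statements 1 and 2 follow from the fact that (provably in PRA) values of the function $h$ belong to $K$ and $h$ is weakly increasing in the ordering $\prec$. To prove Statement 3 we reason informally within PRA as follows: If $\ell = z$, then for some $m$, $h(m) = z$. By $\Sigma_1$-completeness $\mathrm{PRA} \vdash \exists m\ h(m) = \bar z$; hence $T \vdash \exists m\ h(m) = \bar z$, because $T$ extends PRA. Since $h$ is provably monotone, it follows that $T \vdash \exists m\, \forall n \ge m\ h(n) \succeq \bar z$ and $T \vdash \bigvee_{w \succeq z} \ell = \bar w$. On the other hand, $\ell = z$ and $z \succ 0$ imply $T \vdash \ell \ne \bar z$, because, taking $m$ to be the least number such that $h(m+1) = z$, one necessarily has $\mathrm{Prf}_T(\ulcorner \ell \ne \bar z\urcorner, m)$ by the definition of $h$. Thus, $T \vdash \bigvee_{w \succ z} \ell = \bar w$, q.e.d.
To prove Statement 4 we formalize the following argument in PRA: If $\ell = z$ and $T \vdash \ell \ne \bar u$, where $u \succ z$, then for a sufficiently large $m$ we have $\forall k \ge m\ h(k) = z$ and $\mathrm{Prf}_T(\ulcorner \ell \ne \bar u\urcorner, m)$. But then, by the definition of $h$, one has $h(m+1) = u$, and since $h$ is weakly increasing this implies $\ell \ne z$, a contradiction. This completes the proof of Lemma 3.1. □
We call the following function the Solovay realization: $f(p) \rightleftharpoons \bigvee_{z \in K,\ z \Vdash p} \ell = \bar z$.
Lemma 3.2. For all formulas $\varphi$ and for all $z \in K$, $z \succ 0$,
1. If $z \Vdash \varphi$, then $\mathrm{PRA} \vdash \ell = z \to f_T(\varphi)$;
2. If $z \nVdash \varphi$, then $\mathrm{PRA} \vdash \ell = z \to \neg f_T(\varphi)$.
Proof. Statements 1 and 2 are proved simultaneously by induction on the build-up of $\varphi$. We consider the only nontrivial case, when $\varphi$ has the form $\Box\psi$.
1. If $z \Vdash \varphi$, then $\forall u \succ z\ \ u \Vdash \psi$. Hence by the induction hypothesis $\mathrm{PRA} \vdash \bigvee_{u \succ z} \ell = u \to f_T(\psi)$. Using statement 3 of Lemma 3.1, we then obtain $\mathrm{PRA} \vdash \ell = z \to \mathrm{Prov}_T(\ulcorner f_T(\psi)\urcorner) \to f_T(\Box\psi)$.
2. If $z \nVdash \varphi$, then $\exists u \succ z\ \ u \nVdash \psi$. By the induction hypothesis $\mathrm{PRA} \vdash \ell = u \to \neg f_T(\psi)$, whence $\mathrm{PRA} \vdash \neg\mathrm{Prov}_T(\ulcorner \ell \ne \bar u\urcorner) \to \neg\mathrm{Prov}_T(\ulcorner f_T(\psi)\urcorner)$. Using statement 4 of Lemma 3.1, we obtain $\mathrm{PRA} \vdash \ell = z \to \neg\mathrm{Prov}_T(\ulcorner \ell \ne \bar u\urcorner) \to \neg\mathrm{Prov}_T(\ulcorner f_T(\psi)\urcorner) \to \neg f_T(\Box\psi)$. □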
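The shape of the recursion defining the function h in the Solovay construction can be seen in a small simulation. The Python sketch below is an illustration under strong assumptions, not the arithmetical construction itself: the real h refers to proofs in T of sentences about its own limit, whereas here the proof predicate is replaced by a hypothetical, externally supplied list `proof_stream`. The names `solovay_walk` and `proof_stream` are invented for this example.

```python
# Illustrative sketch (not from the source): the recursion for h with the
# proof predicate replaced by a stub. All names are hypothetical.
def solovay_walk(succ, proof_stream, root=0):
    """Return the list [h(0), h(1), ..., h(len(proof_stream))].

    succ[x] is the set of nodes strictly above x in the frame.
    proof_stream[m] == z stands for the assumption that m codes a proof
    of the sentence "the limit of h is not z"; None means that m codes
    no proof of this kind.
    """
    h = [root]
    for z in proof_stream:
        cur = h[-1]
        # h(m+1) = z if h(m) lies strictly below z and m "proves" that the
        # limit differs from z; otherwise the function stays where it is.
        h.append(z if z is not None and z in succ[cur] else cur)
    return h


# Frame with root 0 and two branches: 0 < 1 < 3 and 0 < 2.
frame = {0: {1, 2, 3}, 1: {3}, 2: set(), 3: set()}
walk = solovay_walk(frame, [None, 2, 1, 3, None])
```

Whatever stream is supplied, each step either stays put or moves strictly upward in the frame, which is the weak monotonicity of h used in the proof of Lemma 3.1. On a finite frame the walk must therefore stabilize.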
Lemma 3.3. If 0 is a $\varphi$-reflexive node of the model $\mathcal{K}$, then for any subformula $\psi$ of the formula $\varphi$ the following hold:
1. If $0 \Vdash \psi$, then $\mathrm{PRA} \vdash \ell = 0 \to f_T(\psi)$;
2. If $0 \nVdash \psi$, then $\mathrm{PRA} \vdash \ell = 0 \to \neg f_T(\psi)$.
Proof. We argue by simultaneous induction on the build-up of $\psi$. Statement 2 is proved similarly to statement 2 of the previous lemma. For the proof of statement 1 notice that, if $\psi$ has the form $\Box\theta$ and $0 \Vdash \psi$, then $\forall x \in K\ \ x \Vdash \theta$ by the $\varphi$-reflexivity of the node 0. Hence, by the induction hypothesis, $\mathrm{PRA} \vdash \bigl(\bigvee_{u \in K} \ell = u\bigr) \to f_T(\theta)$. Hence, by statement 1 of Lemma 3.1, $\mathrm{PRA} \vdash f_T(\theta)$, and therefore $\mathrm{PRA} \vdash \ell = 0 \to \mathrm{Prov}_T(\ulcorner f_T(\theta)\urcorner)$. □
3.2. Traces of modal formulas and logics. Artëmov [1, 3] suggested a general approach to the problem of classifying provability logics. This approach is based on the concept of trace of a modal logic, which we introduce below.
The trace $\mathrm{tr}(\varphi)$ of a modal formula $\varphi$ is the set of heights of all GL-models that falsify $\varphi$ (at the root).
Lemma 3.4 ([3]). The trace of any modal formula is either a finite or a cofinite subset of $\omega$.
Proof. Let $\varphi$ be a modal formula with an infinite trace, and let $\Box\varphi_1, \dots, \Box\varphi_n$ enumerate all subformulas of $\varphi$ of the form $\Box\psi$. Further, let $\mathcal{K} \rightleftharpoons (K, \prec, \Vdash)$ be a GL-model of height $h(\mathcal{K}) > n$ such that $\mathcal{K} \nVdash \varphi$. Since $\mathcal{K} \Vdash \neg\Box^{n+1}\bot$, by Lemma 2.16 we obtain $\mathcal{K} \Vdash \Diamond R(\varphi_1, \dots, \varphi_n)$. Hence $\mathcal{K}$ must contain a $\varphi$-reflexive node, say $r$. For an arbitrary finite $m$ we can $m$-expand the node $r$, obtaining a GL-model $\mathcal{K}_m$ such that $\mathcal{K}_m \nVdash \varphi$ and $h(\mathcal{K}_m) = h(\mathcal{K}) + m$. It follows that $\omega \setminus \mathrm{tr}(\varphi) \subseteq \{0, \dots, h(\mathcal{K}) - 1\}$, that is, $\mathrm{tr}(\varphi)$ is cofinite. □
Modal formulas built up from the symbols $\bot$, $\to$, and $\Box$ are called letterless. The forcing of a letterless formula at a node $x$ of a GL-model $\mathcal{K}$ does not depend on the forcing of propositional letters in the model and thereby is uniquely determined by the depth of the node $x$ in $\mathcal{K}$. From this observation, using the completeness theorem for GL, we immediately obtain
Lemma 3.5.
If $F$ and $\varphi$ are modal formulas such that $\mathrm{tr}(\varphi) \subseteq \mathrm{tr}(F)$ and $F$ is letterless, then $\mathrm{GL} \vdash F \to \varphi$.
Corollary 3.6. Letterless modal formulas are determined by their trace up to provable equivalence in GL.
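Since the forcing of a letterless formula depends only on the depth of the node, an initial segment of its trace can be computed without building any models. The Python sketch below is an illustration only: the tuple encoding and the names `holds_at_depth`, `trace_upto`, and `box_n` are assumptions made for this example. It relies on the fact that in a finite GL-model the nodes above a node of depth d have exactly the depths 0, ..., d-1, so a boxed formula holds at depth d iff its body holds at every smaller depth.

```python
# Illustrative sketch (not from the source): letterless formulas evaluated
# by depth. Formulas are nested tuples; all names are hypothetical.
BOT = ("bot",)


def imp(a, b):
    return ("imp", a, b)


def box(a):
    return ("box", a)


def neg(a):
    return imp(a, BOT)


def box_n(n, a=BOT):
    """The n-fold boxed formula: box applied n times to a (falsum by default)."""
    for _ in range(n):
        a = box(a)
    return a


def holds_at_depth(phi, d):
    """Truth value of a letterless formula at any node of finite depth d."""
    tag = phi[0]
    if tag == "bot":
        return False
    if tag == "imp":
        return (not holds_at_depth(phi[1], d)) or holds_at_depth(phi[2], d)
    if tag == "box":
        # The nodes above a node of depth d realize exactly the depths 0..d-1.
        return all(holds_at_depth(phi[1], e) for e in range(d))
    raise ValueError(f"not a letterless formula: {tag!r}")


def trace_upto(phi, bound):
    """The part of the trace of phi lying below bound: depths where phi fails."""
    return {d for d in range(bound) if not holds_at_depth(phi, d)}


example = imp(box_n(3), box_n(2))   # box^3 falsum -> box^2 falsum
```

For instance, `trace_upto(example, 8)` contains only the depth 2, while the negation of the threefold boxed falsum fails exactly at the depths 0, 1, and 2. A bounded computation like this can only illustrate the finite-or-cofinite dichotomy of Lemma 3.4; it does not prove it.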
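Set $F_n \rightleftharpoons (\Box^{n+1}\bot \to \Box^n\bot)$. Obviously, $\mathrm{tr}(F_n) = \{n\}$. The trace $\mathrm{tr}(\ell)$ of a modal logic $\ell$ is the union of traces of all theorems of $\ell$. Logics of the form $\mathrm{GL}X$, for a set of letterless formulas $X$, will be called Turing logics. We have the following restricted variant of the deduction theorem, whose proof is standard.
Lemma 3.7. Let $\ell$ be a modal logic and $X$ a set of letterless modal formulas. Then for any formula $\varphi$, $\ell X \vdash \varphi \iff$ there exist formulas $F_1, \dots, F_n \in X$ such that $\ell \vdash \bigwedge_{i=1}^{n} F_i \to \varphi$.
For a given $\alpha \subseteq \omega$ let $\mathrm{GL}_\alpha$ denote the Turing logic $\mathrm{GL}\{F_n \mid n \in \alpha\}$, and let $\mathrm{GL}_\alpha^-$ denote $\mathrm{GL}\{\bigvee_{n \notin \alpha} \neg F_n\}$ (in the latter case we assume $\alpha$ cofinite). Notice that Lemma 3.7 implies $\mathrm{tr}(\mathrm{GL}_\alpha) = \mathrm{tr}(\mathrm{GL}_\alpha^-) = \alpha$. Also notice that $\mathrm{GL}_\emptyset = \mathrm{GL}$, $\mathrm{GL}_\omega = A$, and $\mathrm{GL}_\omega^-$ is the inconsistent logic.
Lemma 3.8. If $\alpha$ is coinfinite, then $\mathrm{GL}_\alpha$ is the strongest logic with trace $\alpha$. If $\alpha$ is cofinite, then the strongest logic with trace $\alpha$ is $\mathrm{GL}_\alpha^-$.
Proof. Let $\ell$ be a logic, $\mathrm{tr}(\ell) = \alpha$, and $\alpha$ coinfinite. If $\ell \vdash \varphi$, then $\mathrm{tr}(\varphi) \subseteq \alpha$, and therefore by Lemma 3.4 $\mathrm{tr}(\varphi)$ is finite. By Lemma 3.5 $\mathrm{GL} \vdash \bigwedge_{n \in \mathrm{tr}(\varphi)} F_n \to \varphi$, whence $\mathrm{GL}_\alpha \vdash \varphi$. So we have proved that $\ell \subseteq \mathrm{GL}_\alpha$. Now assume that $\alpha$ is cofinite. Then $\ell \subseteq \mathrm{GL}_\alpha^-$, because by Lemma 3.5 the letterless formula $\bigvee_{n \notin \alpha} \neg F_n$, whose trace is $\alpha$, implies all theorems of $\ell$. □
Corollary 3.9. For any modal formula $\varphi$, ($\mathrm{tr}(\varphi)$ is finite) $\iff A \vdash \varphi$.
Proof. The proof of the implication ($\Leftarrow$) is literally the same as that of Lemma 3.4. The implication ($\Rightarrow$) follows from the previous lemma, for $\mathrm{GL}\{\varphi\} \subseteq \mathrm{GL}_{\mathrm{tr}(\varphi)} \subseteq \mathrm{GL}_\omega = A$. □
Corollary 3.10. The logics $\mathrm{GL}_\alpha$ and $\mathrm{GL}_\beta^-$ ($\alpha, \beta \subseteq \omega$, $\beta$ cofinite) exhaust all Turing logics.
Proof. Let $\ell = \mathrm{GL}X$ be a Turing logic with $\mathrm{tr}(\ell) = \alpha$ and $X$ a set of letterless formulas. In the case when $\alpha$ is coinfinite it is sufficient to show that $\ell \supseteq \mathrm{GL}_\alpha$, that is, for all $n \in \alpha$, $\ell \vdash F_n$. By Lemma 3.7 $\mathrm{tr}(\ell) = \bigcup_{F \in X} \mathrm{tr}(F)$. Hence, $\forall n \in \alpha\ \exists F \in X\ \ n \in \mathrm{tr}(F)$. But then by Lemma 3.5 $\mathrm{GL} \vdash F \to F_n$, which yields $\ell \vdash F_n$ as required. Now assume that $\alpha$ is cofinite. Consider two cases:
Case 1.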
The trace of any formula in $X$ is finite. Then, obviously, all such formulas are provable in $\mathrm{GL}_\alpha$, and we obtain $\ell = \mathrm{GL}_\alpha$.
Case 2. There is a formula $F \in X$ such that $\mathrm{tr}(F)$ is cofinite. Then, since $\forall n \in \alpha\ \ \ell \vdash F_n$, we conclude that the formula $F \wedge \bigwedge_{n \in \alpha \setminus \mathrm{tr}(F)} F_n$ is derivable in
$\ell$. But by Corollary 3.6 the latter formula implies $\bigvee_{n \notin \alpha} \neg F_n$ in GL. Therefore $\ell \supseteq \mathrm{GL}_\alpha^-$, and thus $\ell = \mathrm{GL}_\alpha^-$ by Lemma 3.8. By Lemma 3.4 the two cases exhaust all possibilities. □
Below we shall show that, if $\alpha \subseteq \omega$ is coinfinite, then the only provability logic with trace $\alpha$ is $\mathrm{GL}_\alpha$. If $\alpha$ is cofinite, then any provability logic with trace $\alpha$ either coincides with $\mathrm{GL}_\alpha^-$, or is contained in the interval between $\mathrm{GL}_\alpha$ and $S_\alpha \rightleftharpoons S \cap \mathrm{GL}_\alpha^-$. Besides, we shall show that, for any strongly consistent theory $T$ and any cofinite $\alpha$, the classes of provability logics for $T$ with trace $\alpha$ ordered by inclusion are isomorphic to each other. Thereby the problem of a complete classification of provability logics will be reduced to the description of provability logics in the interval between A and S. All these results were obtained in [3] for the case of a sound theory $T$.
Lemma 3.11. Let $T$ be an axiomatized theory and $\varphi$ an arbitrary modal formula. Then, if $n \in \mathrm{tr}(\varphi)$, there exists a realization $f$ such that $\mathrm{PRA} \vdash f_T(\varphi) \to (F_n)^T$. If, in addition, $T$ is strongly consistent, the existence of such a realization $f$ also implies $n \in \mathrm{tr}(\varphi)$.
Proof. Let $n \in \mathrm{tr}(\varphi)$, and let $\mathcal{K}$ be a GL-model of height $n$ falsifying $\varphi$ at the root $b$. Consider a GL-model $\mathcal{K}'$ obtained by 1-expanding the root $b$, and let 0 be the root of $\mathcal{K}'$. Obviously, in the model $\mathcal{K}'$ we have $d(b) = n$, $d(0) = n+1$, and $b \nVdash \varphi$. Apply the Solovay construction to $\mathcal{K}'$, and let $f$ be the corresponding Solovay realization. By Lemma 3.2 $\mathrm{PRA} \vdash \ell = b \to \neg f_T(\varphi)$. Hence, it is sufficient to prove that $\mathrm{PRA} \vdash (\neg F_n)^T \to \ell = b$.
By Lemma 3.2, for all $z \in K'$, $z \succ b$, we have $\mathrm{PRA} \vdash \ell = z \to (\Box^n\bot)^T$, since $d(z) < d(b) = n$. Therefore, $\mathrm{PRA} \vdash \bigl(\bigvee_{z \succ b} \ell = z\bigr) \to (\Box^n\bot)^T$ and $\mathrm{PRA} \vdash (\neg\Box^n\bot)^T \to \bigwedge_{z \succ b} \ell \ne z$. Since $d(b) = n$, by Lemma 3.2 we obtain $\mathrm{PRA} \vdash \ell = b \to \neg(\Box^n\bot)^T$. Hence, by Statement 4 of Lemma 3.1, $\mathrm{PRA} \vdash (\Box^{n+1}\bot)^T \to \mathrm{Prov}_T(\ulcorner (\Box^n\bot)^T\urcorner) \to \mathrm{Prov}_T(\ulcorner \ell \ne \bar b\urcorner) \to \ell \ne 0$.
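Therefore, $\mathrm{PRA} \vdash (\Box^{n+1}\bot \wedge \neg\Box^n\bot)^T \to \bigl(\ell \ne 0 \wedge \bigwedge_{z \succ b} \ell \ne z\bigr) \to \ell = b$. The first claim of the lemma follows.
To prove the second claim assume, reasoning towards a contradiction, that $n \notin \mathrm{tr}(\varphi)$. Then, according to Lemma 3.5, $\mathrm{GL} \vdash \neg F_n \to \varphi$. Therefore, for any realization $f$, $\mathrm{PRA} \vdash (\neg F_n)^T \to f_T(\varphi)$. If there is a realization $f$ such that $\mathrm{PRA} \vdash f_T(\varphi) \to (F_n)^T$, then $\mathrm{PRA} \vdash (\neg F_n)^T \to (F_n)^T$, whence $\mathrm{PRA} \vdash (F_n)^T$. By Löb's theorem we obtain $\mathrm{PRA} \vdash (\Box^n\bot)^T$, which means that the theory $T_n$ is inconsistent, contradicting our assumption about $T$. □
Corollary 3.12. If $\ell$ is a provability logic and $n \in \mathrm{tr}(\ell)$, then $\ell \vdash F_n$.
Corollary 3.13. If $\ell$ is a provability logic, $\mathrm{tr}(\ell) = \alpha$, and $\alpha$ is coinfinite, then $\ell = \mathrm{GL}_\alpha$.
Proof. $\ell \subseteq \mathrm{GL}_\alpha$ by Lemma 3.8; $\mathrm{GL}_\alpha \subseteq \ell$ by Corollary 3.12. □
Corollary 3.14. If a theory $T$ is strongly consistent, then for any modal logic $\ell$ we have $\mathrm{tr}([\ell]^T) = \mathrm{tr}(\ell)$.
Proof. Since $[\ell]^T \supseteq \ell$, and so $\mathrm{tr}([\ell]^T) \supseteq \mathrm{tr}(\ell)$, it is sufficient to establish that $\mathrm{tr}([\ell]^T) \subseteq \mathrm{tr}(\ell)$. Assume $n \in \mathrm{tr}([\ell]^T)$. By Corollary 3.12 $[\ell]^T \vdash F_n$, and since $[\ell]^T = PL_T(\mathrm{PRA} + \ell^T)$, there exist theorems $\varphi_1, \dots, \varphi_m$ of $\ell$ and realizations $f^1, \dots, f^m$ such that $\mathrm{PRA} \vdash \bigwedge_{i=1}^{m} f^i_T(\varphi_i) \to F_n^T$. Since $\ell$ is closed under substitution we may assume, without loss of generality, that the formulas $\varphi_i$ have pairwise disjoint sets of propositional letters. Therefore, there is a single realization $f$ such that $\mathrm{PRA} \vdash f_T\bigl(\bigwedge_{i=1}^{m} \varphi_i\bigr) \to F_n^T$. Using Lemma 3.11 we conclude that $n \in \mathrm{tr}\bigl(\bigwedge_{i=1}^{m} \varphi_i\bigr)$, that is, $n \in \mathrm{tr}(\ell)$. □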
CLASSIFICATION OF PROPOSITIONAL PROVABILITY LOGICS 27 A modal logic £ is called regular if £ C S, and singular if this is not the case. Visser [30] proved that the singular provability logics are exhausted by (singular) Turing logics, that is, those of the form GL^, /3 C uj cofinite. Lemma 3.11 allows us to simplify the proof of this result. Lemma 3.15. Let £ be a singular provability logic with tr(£) = a. Then a is cofinite and £ = GL~. Proof. If a is coinfinite, then GLa is the strongest logic with trace a. But GLa C S, and therefore £ cannot be singular. Hence, a is cofinite and £ C GL~. Let us prove that £ 2 GL“. Since £ is singular, there is a formula <p such that £ b <£, but S Y- p. Clearly, in this case GL Y- R(<p) —> <p (cf. Corollary 2.3); hence there is a GL-model /С with the root 0 such that 0 ¥ p and the node 0 is (^-reflexive. By Corollary 3.12, £ b гр. Apply Solovay’s construction to the model /С, and let / be the corresponding Solovay realization. We claim that Set n<h{K), n£a (2) For the proof of (2), notice that by the construction of Vz G /С (d(z) G а =Ф z IF ф) and 0 IF гр. By Lemma 3.2, if z >- 0 and d(z) £ a, then Thus, Since 0 is a (^-reflexive node, by Lemma 3.3 we obtain PRA b £ = 0 -> -/t(^) Besides, by Lemma 3.2 
28 L. D. BEKLEMISHEV It follows that PRAh/T(V0- I^OA Д 1фг\ \ si ( -y \ {Z ' 2^0. d(z)£a - v * = * 2^0, d(z)£a This proves claim (2). Completing the proof of Lemma 3.15, assume that £ = PLr{U) for some theories T and U. Since £\~ ф, for any realization / we have U b /т(Ф)- In particular, this holds for the Solovay realization. Then by (2) we obtain U h f \/ -.Ffc whence £ h V k^Oc □ Corollary 3.16. Let T be a strongly consistent axiomatized theory. Then A C PLT(ТА) C S. Proof. Let £ denote PLt(ТА). The inclusion A C £ directly follows from the definition of strong consistency. If £ is not included in S, then £ is a singular provability logic. By Lemma 3.15 £ = GL~, where a — tr(L). Clearly, a — lu, because £ D A. Hence £ = GLJ, that is, £ coincides with the inconsistent logic. But, obviously, a truth provability logic cannot be inconsistent. □ From this we easily obtain the following strengthening, due to A. Visser [30], of the first Solovay theorem. Corollary 3.17. IfT is a strongly consistent theory, then PLT(PRA) = PLT(T) - GL. Proof. Clearly, GL C PLT(PRA) C PLr(T) C {<p | PLr(ТА) b П<р} . By Corollaries 3.16 and 2.5 | PLT (ТА) b Dip} C {ip\ S b Dip} C GL. □ Corollary 3.18. Let T be a strongly consistent theory. Then for any cofinite a C uj, GLa is a provability logic for T. Proof. Denote 
CLASSIFICATION OF PROPOSITIONAL PROVABILITY LOGICS 29 We prove that PLT(U) = GL-. By the deduction theorem, using Corollary 3.17, we obtain PLT(U) hp <=> V/ С/ b /тО) V/ PRAb fjy -iFn-^<p PLT(PRA) h I у -,F„ xn^a ч> GL h [ V '-n d r\j ' xn^a GL~ b p. □ Lemma 3.19. Let T be a strongly consistent theory. Then the classes CP^ of provability logics for T with trace a (ordered by inclusion) are isomorphic for all со finite a C uj. Proof. We establish the isomorphism CPl^CPl. Let £ G CP[v0. Define Vaity ^ Z П GLa. By Corollary 3.18, GL~ is a provability logic for T. Hence, so is the logic rja(£). Besides, by Corollary 3.12, tr(rja(£)) = ol. Let £ G СРта. Define Ш^£ {Fn\n?a}. We prove that £a(£) £ CPIndeed, if £ = PLT(U), then .a ш = PLt(U + ( Д F, xn^a because &•(*) £\- (/\Fn Xn^Q: VfU\-fT(/\ Fn^<p\ 'п(£сх. ' ^ \/fU+l/\Fn\ \-fT(<p) <=► PLT(u+(^ Д F^j ) Clearly, the mappings rja and fQ preserve the inclusion relation. We prove that rja and are mutually inverse. 
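Let $\mathcal{L}$ be a provability logic for $T$ with $\mathrm{tr}(\mathcal{L}) = \omega$. By the definition of $\eta_\alpha$ and $\xi_\alpha$ we have
$$\xi_\alpha(\eta_\alpha(\mathcal{L})) = \xi_\alpha(\mathcal{L} \cap \mathrm{GL}^-_\alpha) = (\mathcal{L} \cap \mathrm{GL}^-_\alpha)\{F_n \mid n \notin \alpha\}.$$
By Lemma 3.7, for any formula $\varphi$,
$$(\mathcal{L} \cap \mathrm{GL}^-_\alpha)\{F_n \mid n \notin \alpha\} \vdash \varphi \iff \mathcal{L} \cap \mathrm{GL}^-_\alpha \vdash \varphi \vee \bigvee_{n \notin \alpha} \neg F_n \iff \mathcal{L} \vdash \varphi \vee \bigvee_{n \notin \alpha} \neg F_n.$$
On the other hand, by Corollary 3.12
$$\mathcal{L} \vdash \bigwedge_{n \notin \alpha} F_n,$$
because $\mathrm{tr}(\mathcal{L}) = \omega$ and $\mathcal{L}$ is a provability logic. Therefore,
$$\mathcal{L} \vdash \varphi \vee \bigvee_{n \notin \alpha} \neg F_n \iff \mathcal{L} \vdash \varphi.$$
So, we have proved that
$$\xi_\alpha(\eta_\alpha(\mathcal{L})) = (\mathcal{L} \cap \mathrm{GL}^-_\alpha)\{F_n \mid n \notin \alpha\} = \mathcal{L}.$$
Now we prove that, for any logic $\mathcal{L}$ with a cofinite trace $\alpha$, $\eta_\alpha(\xi_\alpha(\mathcal{L})) = \mathcal{L}$. Indeed, for any modal formula $\varphi$ we have
$$\eta_\alpha(\xi_\alpha(\mathcal{L})) \vdash \varphi \iff \mathrm{GL}^-_\alpha \cap \mathcal{L}\{F_n \mid n \notin \alpha\} \vdash \varphi \iff \mathrm{GL}^-_\alpha \vdash \varphi \text{ and } \mathcal{L} \vdash \bigwedge_{n \notin \alpha} F_n \to \varphi \iff \mathcal{L} \vdash \varphi,$$
because $\mathrm{GL}^-_\alpha$ is the strongest logic with trace $\alpha$. □

4. Prime A-models and their characteristic formulas

In this section we develop for the class of A-models a minitheory similar to that of prime GL-models and their defining formulas of Artemov [4]. In turn, this technique is analogous to the technique of characters of Fine [16]. We show that to a certain extent the results on finite GL-models can be generalized to (infinite) A-models.

Let $\vec p = (p_1, \dots, p_n)$ be a string of propositional letters. A $\vec p$-isomorphism of models $\mathcal{K}_1 = (K_1, \prec_1, \Vdash_1)$ and $\mathcal{K}_2 = (K_2, \prec_2, \Vdash_2)$ is an isomorphism of the frames $(K_1, \prec_1)$ and $(K_2, \prec_2)$ preserving the forcing of the letters from $\vec p$ at every node (denoted $\mathcal{K}_1 \cong_{\vec p} \mathcal{K}_2$). Obviously, the forcing of any modal formula whose alphabet is included in $\vec p$ is preserved under $\vec p$-isomorphisms.

A node $x$ of a model $\mathcal{K} = (K, \prec, \Vdash)$ is called $\vec p$-duplicating if $x$ is not the root and
$$\forall y \prec x \; \exists z \succ y \; (z \neq x \text{ and } \mathcal{K}_z \cong_{\vec p} \mathcal{K}_x).$$
Obviously, deleting the cone generated by a duplicating node $x \in \mathcal{K}$ preserves the forcing relation on the remaining elements of $\mathcal{K}$ for formulas in the variables $\vec p$.

A GL-model $\mathcal{K}$ is called $\vec p$-prime, for a given $\vec p$, if $\mathcal{K}$ does not contain any $\vec p$-duplicating nodes.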
A formula $\varphi$ in the variables $\vec p$ is called $\vec p$-characteristic for a $\vec p$-prime model $\mathcal{K}$ if $\mathcal{K}$ is the only (up to $\vec p$-isomorphism) $\vec p$-prime model validating $\varphi$. The following two lemmas are established in [4].

Lemma 4.1. Let $\mathcal{K}$ be a GL-model and $\vec p$ a string of propositional letters. Then there is a $\vec p$-prime GL-model $\mathcal{K}'$, called a $\vec p$-simplification of $\mathcal{K}$, such that for all formulas $\varphi$ in the variables $\vec p$,
$$\mathcal{K} \Vdash \varphi \iff \mathcal{K}' \Vdash \varphi.$$

Proof. Successively delete the cones generated by $\vec p$-duplicating nodes in the model $\mathcal{K}$. The process will eventually terminate, because $\mathcal{K}$ is finite. □

Lemma 4.2. For any string $\vec p$, any $\vec p$-prime GL-model has a $\vec p$-characteristic formula.

Proof. We only sketch a proof. Let $\mathcal{K}$ be a given GL-model and $b$ its root. For any node $x \in \mathcal{K}$ let $\vec p^{(x)}$ denote the formula
$$\bigwedge_{x \Vdash p_i} p_i \wedge \bigwedge_{x \nVdash p_i} \neg p_i,$$
where $\vec p = (p_1, \dots, p_n)$ is a given string of propositional letters. Clearly, in any Kripke model
$$y \Vdash \vec p^{(x)} \iff \text{(the forcing of the variables } \vec p \text{ at the nodes } x \text{ and } y \text{ is the same)}.$$
We argue by induction on the height of $\mathcal{K}$. If $\mathcal{K}$ is a height 0 one-element model, then the formula $\vec p^{(b)} \wedge \Box\bot$ will be $\vec p$-characteristic for $\mathcal{K}$. If the height of $\mathcal{K}$ is $> 0$, let $r_1, \dots, r_n$ enumerate all the nodes covering $b$. Notice that the cones $\mathcal{K}_{r_1}, \dots, \mathcal{K}_{r_n}$ are pairwise non-$\vec p$-isomorphic, because $\mathcal{K}$ is $\vec p$-prime. For any element $x \in \mathcal{K}$ such that $x \succ b$, let $\varphi_x$ denote a $\vec p$-characteristic formula for the submodel $\mathcal{K}_x$ of $\mathcal{K}$. Such a formula exists by the induction hypothesis, for all proper cones of $\mathcal{K}$ are also $\vec p$-prime. Consider the following formula:
$$\Phi \rightleftharpoons \vec p^{(b)} \wedge \bigwedge_{i=1}^{n} \Diamond\varphi_{r_i} \wedge \Box\Bigl(\bigvee_{x \in K,\ x \succ b} \varphi_x\Bigr).$$
It is clear that $\Phi$ holds in $\mathcal{K}$. But it is also not difficult to see that, in fact, $\Phi$ will be $\vec p$-characteristic for $\mathcal{K}$: The first conjunct fixes the forcing of propositional variables at the root. By the induction hypothesis, the second conjunct guarantees that in any $\vec p$-prime GL-model $\mathcal{M}$ validating $\Phi$ there will be proper cones isomorphic to each of $\mathcal{K}_{r_1}, \dots, \mathcal{K}_{r_n}$.
The third conjunct shows (by $\vec p$-primeness of $\mathcal{M}$) that any proper cone in $\mathcal{M}$ will occur in one of the proper cones corresponding to $\mathcal{K}_{r_1}, \dots, \mathcal{K}_{r_n}$. This establishes the required $\vec p$-isomorphism. □

It follows from Lemma 4.1 that a characteristic formula for a given $\vec p$-prime GL-model $\mathcal{K}$ is unique up to provable equivalence in GL. Lemma 4.2 implies that a $\vec p$-simplification of a given GL-model is unique up to $\vec p$-isomorphism.
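The deletion process from the proof of Lemma 4.1 can be sketched in Python for finite tree-like models. This is a minimal illustration under assumptions that are not in the text: the model is given by its covering relation `children` and a valuation `val`, formulas are nested tuples, and two cones are treated as $\vec p$-isomorphic when their canonical codes (`shape`) coincide. In a tree it suffices to test the duplicating condition at the node that $x$ covers, which is what `find_duplicating` does.

```python
def cone(children, x):
    """Nodes of the cone generated by x (x itself included)."""
    out, stack = [], [x]
    while stack:
        y = stack.pop()
        out.append(y)
        stack.extend(children.get(y, []))
    return out

def shape(children, val, x, letters):
    """Canonical code of the cone at x, invariant under p-isomorphism."""
    label = tuple(l for l in letters if l in val.get(x, set()))
    kids = sorted(shape(children, val, c, letters) for c in children.get(x, []))
    return (label, tuple(kids))

def find_duplicating(children, val, letters):
    """Return some p-duplicating node, or None if the model is p-prime."""
    for y, kids in children.items():
        higher = [z for c in kids for z in cone(children, c)]
        code = {z: shape(children, val, z, letters) for z in higher}
        for x in kids:
            if any(z != x and code[z] == code[x] for z in higher):
                return x
    return None

def simplify(children, val, letters):
    """Delete cones of p-duplicating nodes until none is left."""
    children = {x: list(cs) for x, cs in children.items()}
    val = dict(val)
    while (x := find_duplicating(children, val, letters)) is not None:
        doomed = set(cone(children, x))
        children = {y: [c for c in cs if c != x]
                    for y, cs in children.items() if y not in doomed}
        val = {y: v for y, v in val.items() if y not in doomed}
    return children, val

def forces(children, val, x, f):
    """Forcing of a formula given as nested tuples, e.g. ('box', ('var', 'p'))."""
    op = f[0]
    if op == 'var':
        return f[1] in val.get(x, set())
    if op == 'not':
        return not forces(children, val, x, f[1])
    if op == 'and':
        return forces(children, val, x, f[1]) and forces(children, val, x, f[2])
    if op == 'box':
        return all(forces(children, val, z, f[1])
                   for c in children.get(x, []) for z in cone(children, c))
    raise ValueError(op)

# Root 0 with two p-isomorphic cones {1, 3} and {2, 4}; p holds at the leaves.
children = {0: [1, 2], 1: [3], 2: [4]}
val = {3: {'p'}, 4: {'p'}}
prime_children, prime_val = simplify(children, val, ('p',))
```

Here one of the two isomorphic cones is removed, and the forcing of formulas in the letter `p` at the root is unchanged, in line with Lemma 4.1.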
The definitions of $\vec p$-prime GL-model and $\vec p$-characteristic formula are directly generalized to A-models, if one replaces everywhere the word “GL-model” by “A-model”. It turns out that the following analogs of Lemmas 4.1 and 4.2 hold.

Lemma 4.3. For any A-model $\mathcal{K}$ and any string $\vec p$ there is a $\vec p$-prime A-model $\mathcal{K}'$ such that for every formula $\varphi$ in the variables $\vec p$,
$$\mathcal{K} \Vdash \varphi \iff \mathcal{K}' \Vdash \varphi.$$

Proof. It is sufficient to show that the process of deleting duplicating cones in $\mathcal{K}$ terminates. Notice that for any A-model the nodes $x$ satisfying $b \prec x \prec r$ cannot be duplicating. Indeed, if $b \prec x \prec r$, by requirement 6 of the definition of A-models $x$ is the only node covering a certain node $y$. But then, none of the cones occurring above $y$ can be isomorphic to $\mathcal{K}_x$, except for $\mathcal{K}_x$ itself. This means that during the process of deleting duplicating cones in $\mathcal{K}$ the nodes below $r$ are not being cut off. However, there are only a finite number of other nodes in $\mathcal{K}$. □

Lemma 4.4. For any string of propositional letters $\vec p$ and any $\vec p$-prime A-model $\mathcal{K}$ there is a $\vec p$-characteristic formula for $\mathcal{K}$.

Proof. Let $\mathcal{K}$ be a given A-model, $r_0$ being its stem node and $b$ its root. Without loss of generality we assume that the depth $N$ of $r_0$ is greater than the height of all side cones of $\mathcal{K}$. The nodes covering $b$ will again be denoted $r_1, \dots, r_n$. For any element $x \in \mathcal{K}$ such that $x \succ b$, let $\varphi_x$ denote a $\vec p$-characteristic formula for the submodel $\mathcal{K}_x$ of $\mathcal{K}$. Such a formula exists by Lemma 4.2. Set
$$\Phi_0 \rightleftharpoons \Box\Bigl(\neg\Box^{N+1}\bot \to \bigl(\Diamond\varphi_{r_0} \wedge \vec p^{(r_0)} \wedge \Box\bigl(\Box^{N+1}\bot \to \bigvee_{z \succeq r_0} \varphi_z\bigr)\bigr)\Bigr) \wedge \Box\Bigl(\Box^{N+1}\bot \to \bigvee_{i=0}^{n} \bigvee_{z \succeq r_i} \varphi_z\Bigr) \wedge \bigwedge_{i=1}^{n} \bigl(\Diamond\varphi_{r_i} \wedge \Box(\neg\Diamond\varphi_{r_i})\bigr),$$
$$\Phi \rightleftharpoons \Phi_0 \wedge \vec p^{(b)}.$$
We are going to show that $\Phi$ is a $\vec p$-characteristic formula for the A-model $\mathcal{K}$.

First, we check that $\mathcal{K} \Vdash \Phi$. It is sufficient to convince ourselves that $\mathcal{K} \Vdash \Phi_0$. Assume $x \succ b$. Notice that
$$x \Vdash \neg\Box^{N+1}\bot \iff d(x) \geq N + 1 \iff x \prec r_0,$$
because $d(r_i) < N$ for $1 \leq i \leq n$.
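Yet, for $x \prec r_0$ we have $x \Vdash \Diamond\varphi_{r_0}$, and by property 5 of A-models $x \Vdash \vec p^{(r_0)}$. If, besides, $y \succ x$ and $d(y) \leq N$, then $r_0 \preceq y$ by property 6, and therefore $y \Vdash \bigvee_{z \succeq r_0} \varphi_z$. It follows that the first conjunct of $\Phi_0$ holds in $\mathcal{K}$.

Clearly, for any $i \in \{1, \dots, n\}$, $\mathcal{K} \Vdash \Diamond\varphi_{r_i}$. Since $\mathcal{K}$ is $\vec p$-prime, we also have $\mathcal{K} \Vdash \Box\neg\Diamond\varphi_{r_i}$. Indeed, if $\mathcal{K} \nVdash \Box\neg\Diamond\varphi_{r_i}$, then there is an $x \succ b$ such that $x \Vdash \Diamond\varphi_{r_i}$, and hence a $y \succ x$ can be found such that $y \Vdash \varphi_{r_i}$. Since $\varphi_{r_i}$ is a $\vec p$-characteristic formula for a $\vec p$-prime cone $\mathcal{K}_{r_i}$, we obtain $\mathcal{K}_{r_i} \cong_{\vec p} \mathcal{K}_y$, and since the node $r_i$, unlike $y$, covers the node $b$, it follows that $r_i$ is $\vec p$-duplicating. This contradicts the assumption that $\mathcal{K}$ is $\vec p$-prime. Consequently, we have
$$\mathcal{K} \Vdash \bigwedge_{i=1}^{n} (\Box\neg\Diamond\varphi_{r_i} \wedge \Diamond\varphi_{r_i}).$$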
CLASSIFICATION OF PROPOSITIONAL PROVABILITY LOGICS 33 Now let x У b and d(x) < N. Obviously, x E 1Cr% for some i < n. Hence, the formula (px occurs in the disjunction V V^- г-0 гУгг and therefore n x ib v V v*- 2 — 0 Z>_T% Since this is true for any node x of depth < AT, К, II- □^□iV+1-L —> \j \J Л ' 2=0 г>гг ' Thus, we have checked that /С lb Ф. Now let 1C ^ (K', -<', lb') be any p-prime А-model with a root b' and a stem element r' such that 1C lb Ф. We have to prove that 1C 1C. To this end, we shall need the following two auxiliary lemmas. Lemma 4.4.1. Let x e K', x y' b', and d(x) > N + 1. Then every node yy'x of depth < N belongs to a cone K!z such that x -C z and 1CZ /Cro. Proof. Let zo be a maximal node among the nodes of the set A ?=* {t e K' \ t у and d(t) > TV } , and let z < у cover zq. Obviously, x -<r zo <r z ■<' у and d(z) < N; hence z lb' and z0 lb' -|Плг+1_1. Since /С' lb' Ф, we have ^ ttro ' and thus ^ IF \J ft- t>zr о On the other hand, since /С lb' we have z0 lb' -i<)<pt, for all 1 < t < n. Therefore z lb' iprQ and K!z ~p /Cro. □ We say that a node гг of a model /С is branching if it has at least two different covering nodes. Lemma 4.4.2. All branching nodes of the model ICf, except for the root, have depth < N. Proof. Suppose this is not the case. Let x yf b' be a branching node such that d(x) > N and all branching nodes above x have depth < N. Further, let уi,..., yk be all the nodes covering x. Without loss of generality we assume that y\ has the minimal depth among all the yf s. We shall prove that y\ is p-duplicating. It is sufficient to establish that for all i the cone K!y. is p-isomorphic to a GL-model obtained by m-expansion of the root of /Cro, for some finite m. Evidently, every node yi has depth > AT, for otherwise by Lemma 4.4.1 ^ could not have covered x. Hence, there is a node Zi E 1Суг such that d(zi) = N. Any node 
34 L. D. BEKLEMISHEV of К,'Уг is comparable with Zi, because otherwise there would have been a branching at a depth > N above x. By Lemma 4.4.1, /Cro z±p KJ Besides, V£ G )С'Уг (t Zi=>t\\-' p(ro)), for 1C'lb' Ф and d(zi) = N. Hence, the forcing of the letters from p at the nodes of )С'Уг occurring below Zi coincides with the forcing at the node 7*0 of the model /С. Thus, the model 1С'Уг is p-isomorphic to an 777-expansion of the root of /Cro, for some finite m. □ We continue the proof of Lemma 4.4. Using Lemma 4.4.1 we obtain a node r*o G 1C' such that /С^ ~p /Cro and r'0 is comparable with the stem node r' of 1C'. Since 7*0 and r' are comparable, the subset {t G K' \ b' -<' t r'0 } has order type aSince there are no branching nodes below t*q, by Lemma 4.4.2, the set {teK'\ty' b', t and t*q are comparable} coincides with {t G К' 11 y' b', t and r are comparable} , and thus its complement is a finite tree. Besides, the forcing of the letters from p is identical on the sets {t G К \ b -< t -< ro} and {t G К' I b' -<' ^ -<' Tq }. Hence, in order to show the p-isomorphism of the models /С and 1C' it is sufficient to establish that the GL-models M^K\{teK\tyb, t and ro are comparable } and M' ^ K' \{t G K' \t y' b', t and Tq are comparable} are p-isomorphic (ordering and forcing in M and M' are inherited from 1C and /С', respectively). Since Vz 1C' Ih' ()(fri, there are nodes r'x,...,>-' b' such that Vz K!r, К,Гг. It follows from the condition Vz 1C' Ih' СН()^рГг that the nodes r[ cover b'. It remains for us to show that there are no other nodes covering b' in the model 1C'. Let z be an arbitrary node in 1C' covering b'. We distinguish two cases. Case 1. d{z) > ЛГ + 1. Then by Lemma 4.4.2 the cone 1C'Z is p-isomorphic to 7n-expansion of the node ro in /Cro, for some finite m. Hence, 2 p-duplicates a node t satisfying b' -< t -< Tq, and this contradicts the primeness of 1C'. Case 2. d(z) < N. 
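In this case we have $z \Vdash' \Box^{N+1}\bot$, and hence
$$z \Vdash' \bigvee_{i=0}^{n} \bigvee_{x \succeq r_i} \varphi_x.$$
This means, in turn, that $\mathcal{K}'_z \cong_{\vec p} \mathcal{K}_x$ for some $x \succ b$. Hence the node $z$ $\vec p$-duplicates the node corresponding to $x$ under the $\vec p$-isomorphism $\mathcal{K}'_{r'_i} \cong_{\vec p} \mathcal{K}_{r_i}$. Again, this contradicts the fact that $\mathcal{K}'$ is prime.

Thus, we have proved that all the nodes covering $b'$ in $\mathcal{K}'$ are exhausted by the nodes $r'_1, \dots, r'_n$. It follows that the GL-models $\mathcal{M}$ and $\mathcal{M}'$, and therefore the A-models $\mathcal{K}$ and $\mathcal{K}'$, are $\vec p$-isomorphic. □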
Remark 4.5. If the original A-model $\mathcal{K}$ is, in fact, a D-model, then its $\vec p$-characteristic formula attains a simpler form:
$$\mathrm{GL} \vdash \Phi \leftrightarrow \vec p^{(b)} \wedge \Box\bigl(\neg\Box^{N+1}\bot \to (\Diamond\varphi_{r_0} \wedge \vec p^{(r_0)})\bigr) \wedge \Box\Bigl(\Box^{N+1}\bot \to \bigvee_{z \succeq r_0} \varphi_z\Bigr).$$
Let $\Phi_0$ be a formula defined from a $\vec p$-prime D-model $\mathcal{K}$ as in the proof of Lemma 4.4 (or 4.5). Then
1. The formula $\Phi_0$ is modalized, and its variables are contained in the list $\vec p$;
2. Any $\vec p$-prime A-model $\mathcal{K}'$ validating $\Phi_0$ is a D-model;
3. The stabilizations of $\mathcal{K}$ and $\mathcal{K}'$ are $\vec p$-isomorphic.
A formula $\Phi_0$ satisfying these three conditions will be called almost characteristic for a D-model $\mathcal{K}$.

5. Provability logics containing D

In this section, on the basis of the techniques introduced in Section 4, we show that the only consistent provability logic properly containing D is S. Thus, by the results of Section 3 the general classification problem for provability logics will be reduced to a description of provability logics in the interval between A and D.

Lemma 5.1. Let $\varphi$ be a modal formula whose propositional variables are in $\vec q = (q_1, \dots, q_n)$, such that $\mathrm{D} \nvdash \varphi$. Then there is a formula $\psi$ of the same variables such that $\mathrm{S} \nvdash \psi$ and
$$\mathrm{A}\{\varphi\} \vdash \psi \vee (\Box p \to p),$$
where $p$ is a propositional letter not occurring in $\vec q$.

Proof. Since $\mathrm{D} \nvdash \varphi$, by Lemma 2.7 there is a D-model $\mathcal{K}'$ such that $\mathcal{K}' \nVdash \varphi$. Let $\mathcal{K}$ be a $\vec q$-prime D-model falsifying $\varphi$ (such a model exists by Lemma 4.3). Define $\psi \rightleftharpoons \neg\Phi_0$, where $\Phi_0$ is a formula almost $\vec q$-characteristic for $\mathcal{K}$. Since $\Phi_0$ is modalized and $\mathcal{K} \Vdash \Phi_0$, by Lemma 2.10 $\mathcal{K}^{\circ} \nVdash \psi$. But $\mathcal{K}^{\circ}$ is an S-model; hence $\mathrm{S} \nvdash \psi$. It remains for us to show that $\mathrm{A}\{\varphi\} \vdash \psi \vee (\Box p \to p)$.

For any binary string $\alpha = (\alpha_1, \dots, \alpha_n)$ of length $n$ let $\theta^{\alpha}$ denote the result of substitution in a formula $\theta$ of the formula $p \leftrightarrow q_i$ for the variable $q_i$, for all $i$ such that $\alpha_i = 1$. Define $\Delta \rightleftharpoons \bigwedge_{\alpha} \varphi^{\alpha}$, where $\alpha$ ranges over the set of all binary strings of length $n$. Evidently, $\mathrm{A}\{\varphi\} \vdash \Delta$. We prove that
$$\mathrm{A} \vdash \Delta \to \psi \vee (\Box p \to p).$$
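Suppose, for a contradiction, that this is not the case. Then there is a $(\vec q, p)$-prime A-model $\mathcal{M}$ such that $\mathcal{M} \Vdash \Delta$, $\mathcal{M} \nVdash \psi$ and $\mathcal{M} \Vdash \Box p \wedge \neg p$. Since $\mathcal{M} \Vdash \Box p$, the letter $p$ is forced at every node of every proper cone in $\mathcal{M}$. Hence, any $\vec q$-duplicating node in $\mathcal{M}$ would simultaneously be $(\vec q, p)$-duplicating. Therefore, $\mathcal{M}$ is, in fact, a $\vec q$-prime A-model. Since $\mathcal{M} \Vdash \Phi_0$ and $\Phi_0$ is almost characteristic for the $\vec q$-prime D-model $\mathcal{K}$, we conclude that $\mathcal{M}$ has to be a D-model and $\mathcal{M}^{\circ} \cong_{\vec q} \mathcal{K}^{\circ}$.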
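For any $x \in \mathcal{K}$ let $x'$ denote the node of $\mathcal{M}$ corresponding to $x$ under this $\vec q$-isomorphism. Consider a binary string $\gamma = (\gamma_1, \dots, \gamma_n)$ such that
$$\gamma_i = 0 \iff \text{(the forcing of } q_i \text{ at the roots of models } \mathcal{K} \text{ and } \mathcal{M} \text{ is the same)}.$$

Lemma 5.1.1. For any formula $\theta$ of the variables $\vec q$ and any node $x \in \mathcal{K}$,
$$x \Vdash \theta \iff x' \Vdash \theta^{\gamma}.$$

Proof. Induction on the build-up of $\theta$. The induction step is easy to perform, because the frames of the models $\mathcal{K}$ and $\mathcal{M}$ are isomorphic. Thus, we only have to consider the case when $\theta$ is a letter $q_i$ from $\vec q$. We distinguish two subcases.

Case 1: $\gamma_i = 0$. Then $\theta^{\gamma} = q_i$. If $x \in \mathcal{K}$ is not the root, then $x \Vdash q_i \iff x' \Vdash q_i$, because $x'$ corresponds to $x$ under $\vec q$-isomorphism. If $x$ is the root of $\mathcal{K}$, then $x \Vdash q_i \iff x' \Vdash q_i$, because $\gamma_i = 0$.

Case 2: $\gamma_i = 1$. Then $\theta^{\gamma} = (p \leftrightarrow q_i)$. If $x \in \mathcal{K}$ is not the root, then $x' \Vdash p$, because $\mathcal{M} \Vdash \Box p$. Hence,
$$x' \Vdash (p \leftrightarrow q_i) \iff x' \Vdash q_i \iff x \Vdash q_i.$$
If $x$ is the root of $\mathcal{K}$, then $x' \nVdash p$ and therefore
$$x' \Vdash (p \leftrightarrow q_i) \iff x' \nVdash q_i \iff x \Vdash q_i,$$
because $\gamma_i = 1$. □

Now we complete the proof of Lemma 5.1. We know that $\mathcal{K} \nVdash \varphi$. Therefore, by Lemma 5.1.1 we obtain $\mathcal{M} \nVdash \varphi^{\gamma}$. But $\varphi^{\gamma}$ occurs in the conjunction $\bigwedge_{\alpha} \varphi^{\alpha} = \Delta$; hence $\mathcal{M} \nVdash \Delta$, a contradiction. Thus, we have shown that $\mathrm{A} \vdash \Delta \to \psi \vee (\Box p \to p)$, and so $\mathrm{A}\{\varphi\} \vdash \psi \vee (\Box p \to p)$. □

Lemma 5.2. Let $T$ be an axiomatized theory and $\varphi$ a modal formula such that $\mathrm{D} \nvdash \varphi$. Then the $T$-completion of the logic $\mathrm{A}\{\varphi\}$ contains S.

Proof. Since $\mathrm{D} \nvdash \varphi$, by Lemma 5.1 we obtain a formula $\psi$ of the same variables as $\varphi$ such that $\mathrm{S} \nvdash \psi$ and
$$\mathrm{A}\{\varphi\} \vdash \psi \vee (\Box p \to p).$$
(The variable $p$ does not occur in $\varphi$ and $\psi$.) Consider the $T$-completion $\mathcal{L} \rightleftharpoons [\mathrm{A}\{\psi\}]^T$ of the logic $\mathrm{A}\{\psi\}$. We know (see Section 1.7) that $\mathcal{L}$ is a $T$-representation of the theory $U \rightleftharpoons \mathrm{PRA} + \{\psi\}^T$. By Lemma 3.15, since $\mathrm{S} \nvdash \psi$, $\mathcal{L}$ is a singular Turing logic and hence coincides with $\mathrm{GL}^-_\beta$ for some cofinite $\beta \subseteq \omega$. Let $F$ denote the formula $\bigvee_{n \notin \beta} \neg F_n$. Obviously,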
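$\mathcal{L} \vdash F$ and $\mathrm{A} \vdash \neg F$. Since $\mathcal{L} \vdash F$ and $\mathcal{L} = \mathrm{PL}_T(U) = \mathrm{PL}_T(\mathrm{PRA} + \{\psi\}^T)$, by the deduction theorem we obtain realizations $f^1, \dots, f^n$ such that
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} f^i_T(\psi) \to F^T.$$
Let $A$ be an arbitrary arithmetical sentence. Let the realizations $g^1, \dots, g^n$ be defined as follows:
• $g^i(p) \rightleftharpoons A$;
• $g^i(q) \rightleftharpoons f^i(q)$ for all propositional letters $q$ other than $p$.
Since $p$ does not occur in $\psi$, we have
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} g^i_T(\psi) \to F^T.$$
Besides, for any $i \in \{1, \dots, n\}$,
$$\mathrm{PRA} \vdash g^i_T(\Box p \to p) \leftrightarrow (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A).$$
Hence,
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} g^i_T(\psi) \vee (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A) \to (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A) \vee F^T,$$
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} \bigl[g^i_T(\psi) \vee (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A)\bigr] \to (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A) \vee F^T,$$
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} \bigl[g^i_T(\psi) \vee g^i_T(\Box p \to p)\bigr] \to (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A) \vee F^T,$$
$$\mathrm{PRA} \vdash \bigwedge_{i=1}^{n} g^i_T\bigl(\psi \vee (\Box p \to p)\bigr) \to (\mathrm{Prov}_T(\ulcorner A \urcorner) \to A) \vee F^T.$$
Now consider the $T$-completion $\mathcal{L}_1$ of the logic $\mathrm{A}\{\varphi\}$. We know that $\mathcal{L}_1 = \mathrm{PL}_T(U_1)$, where
$$U_1 = \mathrm{PRA} + \mathrm{Cons}(T) + \{\varphi\}^T.$$
Since $\mathrm{A}\{\varphi\} \vdash \psi \vee (\Box p \to p)$, for any realization $g$ we have $U_1 \vdash g_T(\psi \vee (\Box p \to p))$. In particular, for each $i$,
$$U_1 \vdash g^i_T(\psi \vee (\Box p \to p)).$$
Besides, since $\mathrm{A} \vdash \neg F$, we also have $\mathrm{PRA} + \mathrm{Cons}(T) \vdash \neg F^T$, and therefore $U_1 \vdash \neg F^T$. It follows that, for any arithmetical sentence $A$,
$$U_1 \vdash \mathrm{Prov}_T(\ulcorner A \urcorner) \to A,$$
that is,
$$\mathrm{PL}_T(U_1) \vdash \Box p \to p,$$
and $\mathcal{L}_1 \supseteq \mathrm{S}$. □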
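The substitution instances $\varphi^{\alpha}$ and the conjunction $\Delta$ used in the proof of Lemma 5.1 can be generated mechanically. The Python sketch below is only an illustration: the encoding of formulas as nested tuples and the names `substitute`, `instance` and `delta` are assumptions made here, not notation from the text. For each binary string $\alpha$ it replaces $q_i$ by $p \leftrightarrow q_i$ whenever $\alpha_i = 1$.

```python
from itertools import product

def substitute(f, sigma):
    """Replace every letter q occurring in f by sigma[q], if sigma defines it."""
    op = f[0]
    if op == 'var':
        return sigma.get(f[1], f)
    if op == 'bot':
        return f
    return (op,) + tuple(substitute(g, sigma) for g in f[1:])

def instance(phi, qs, alpha, p='p'):
    """phi^alpha: substitute (p <-> q_i) for q_i whenever alpha_i = 1."""
    sigma = {q: ('iff', ('var', p), ('var', q))
             for q, a in zip(qs, alpha) if a == 1}
    return substitute(phi, sigma)

def delta(phi, qs, p='p'):
    """The conjuncts of Delta: phi^alpha for all binary strings alpha."""
    return [instance(phi, qs, alpha, p)
            for alpha in product((0, 1), repeat=len(qs))]

# Example: phi = box(q1 or q2) over the string (q1, q2).
phi = ('box', ('or', ('var', 'q1'), ('var', 'q2')))
conjuncts = delta(phi, ['q1', 'q2'])
```

For a formula in $n$ letters the list has $2^n$ entries; the entry for the all-zero string is $\varphi$ itself, which is why $\Delta$ is derivable in any logic containing $\varphi$ and closed under substitution.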
Corollary 5.3. Let $\mathcal{L}$ be a provability logic with trace $\omega$. Then, if $\mathcal{L}$ is not contained in D, $\mathcal{L} \supseteq \mathrm{S}$.

Proof. By Corollary 3.12, $\mathcal{L} \supseteq \mathrm{A}$. If $\mathcal{L}$ is not contained in D, then there is a formula $\varphi$ such that $\mathcal{L} \vdash \varphi$ and $\mathrm{D} \nvdash \varphi$. Obviously, then $\mathcal{L} \supseteq \mathrm{A}\{\varphi\}$, and since $\mathcal{L}$ is a provability logic, also $\mathcal{L} \supseteq [\mathrm{A}\{\varphi\}]^T$. Hence, by Lemma 5.2, $\mathcal{L} \supseteq \mathrm{S}$. □

From Lemma 3.15 we also obtain the following result.

Corollary 5.4. Let $\mathcal{L}$ be a consistent provability logic with trace $\omega$ such that $\mathcal{L} \nsubseteq \mathrm{D}$. Then $\mathcal{L} = \mathrm{S}$.

6. Provability logics containing A

Here we shall prove that any provability logic strictly containing A also contains D. This result completes the classification of provability logics with trace $\omega$ and thereby, in view of Lemma 3.19, the general classification of provability logics.

Lemma 6.1. Let $\mathrm{A} \nvdash \varphi$ and $T$ be an axiomatized theory. Then for any arithmetical $\Sigma_1$-sentence $\sigma$ there is a realization $f$ such that
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash f_T(\varphi) \to (\mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \sigma).$$

Proof. We apply a modification of the Solovay construction (see Section 3.1). Assume $\mathrm{A} \nvdash \varphi$. Then by Corollary 2.16 $\mathrm{GL} \nvdash \Diamond R(\varphi) \to \varphi$. Hence, there is a GL-model $\mathcal{K}'$ with a root $b$ and a $\varphi$-reflexive node $r \succ b$ such that $\mathcal{K}' \nVdash \varphi$. Consider a model $\mathcal{K} = (K, \prec, \Vdash)$ obtained from $\mathcal{K}'$ by 1-expansion of the node $b$. We assume without loss of generality that $K = \{0, \dots, k\}$ and 0 is the root of $\mathcal{K}$.

Let $\sigma$ be an arbitrary $\Sigma_1$-sentence. We may assume that $\sigma$ has the form $\exists x\, B(x)$, where $B$ is a primitive recursive formula. Using the formalized primitive recursion theorem, define a primitive recursive function $h$ as follows:
$$h(0) = 0; \qquad h(m+1) = \begin{cases} z & \text{if } \mathrm{Prf}_T(\ulcorner \ell \neq \bar z \urcorner, m),\ z \succ h(m) \text{ and } z \neq r; \\ r & \text{if } h(m) = b \text{ and } \exists x < m\ B(x); \\ h(m) & \text{otherwise.} \end{cases}$$
Here $\ell = \bar z$ denotes, as usual, the formula expressing $\lim_{m \to \infty} h(m) = z$. Now we establish several useful properties of $h$.

Lemma 6.2. The following statements are provable in PRA:
1. $\forall x, y\ (x \leq y \to h(x) \preceq h(y))$;
2. $\bigvee_{z \in K} (\ell = \bar z)$;
3. $\forall u, v\ (\ell = u \wedge \ell = v \to u = v)$;
4. $\ell = \bar z \to \neg\mathrm{Prov}_T(\ulcorner \ell \neq \bar u \urcorner)$, if $z, u \in K$, $u \neq r$ and $u \succ z$;
5.
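$\ell = \bar r \to \mathrm{Prov}_T(\ulcorner \bigvee_{u \succeq r} \ell = \bar u \urcorner)$;
6. $\ell = \bar z \to \mathrm{Prov}_T(\ulcorner \bigvee_{u \succ z} \ell = \bar u \urcorner)$, if $z \in K \setminus \{r, 0\}$;
7. $\mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \ell \neq \bar 0$;
8. $\neg\sigma \to \ell \neq \bar r$.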
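The step-by-step behaviour of the function $h$ can be illustrated by a toy simulation. The real $h$ refers to proofs in $T$ of statements about its own limit and is obtained through the formalized recursion theorem, which cannot be reproduced in a few lines. The Python sketch below therefore replaces the proof predicate by a hypothetical finite table `proofs` (step number to refuted node) and $B$ by an ordinary Python predicate; these stand-ins, the bound `x < m`, and all names are assumptions for illustration only.

```python
def run_h(strictly_above, proofs, B, b, r, steps):
    """Iterate the three-case definition of h for `steps` steps.

    strictly_above[x] -- set of nodes strictly above x in the model
    proofs[m]         -- node z if step m "proves" l != z (hypothetical table)
    B                 -- decidable predicate; sigma is "exists x with B(x)"
    """
    h = [0]
    for m in range(steps):
        cur = h[-1]
        z = proofs.get(m)
        if z is not None and z in strictly_above[cur] and z != r:
            h.append(z)            # first case: jump to the refuted node z
        elif cur == b and any(B(x) for x in range(m)):
            h.append(r)            # second case: jump from b to r once sigma is witnessed
        else:
            h.append(cur)          # third case: stay in place
    return h

# Model with root 0, b = 1, reflexive node r = 2 and one more node 3 above b.
strictly_above = {0: {1, 2, 3}, 1: {2, 3}, 2: set(), 3: set()}
values = run_h(strictly_above, proofs={2: 1}, B=lambda x: x == 4, b=1, r=2, steps=8)
```

In this run `h` stays at 0 until the table supplies a "proof" of $\ell \neq \bar 1$, then moves to $b$, and jumps to $r$ only after a witness for $B$ has appeared below the current step; the sequence is monotone with respect to the order of the model, as statement 1 of Lemma 6.2 requires.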
Proof. Statements 1–4 and 6 are proved similarly to Lemma 3.1. Statement 5 follows from provable monotonicity of $h$ in the sense of $\prec$.

We prove 7. First of all, notice that
$$\mathrm{PRA} \vdash \ell = \bar b \wedge \sigma \to \ell \neq \bar b,$$
because the following reasoning can be formalized in PRA: If $m$ is the minimal number such that $h(m) = b$ and $\exists x < m\ B(x)$, then by the definition of $h$ we have $h(m+1) = r$. Hence, by monotonicity of $h$, $\forall k > m\ h(k) \neq b$ and therefore $\ell \neq b$. So we obtain
$$\mathrm{PRA} \vdash \sigma \to \ell \neq \bar b,$$
and by statement 4
$$\mathrm{PRA} \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \mathrm{Prov}_T(\ulcorner \ell \neq \bar b \urcorner) \to \ell \neq \bar 0.$$
For the proof of statement 8 we reason inside PRA as follows: If $m$ is the minimal number such that $h(m+1) = r$, then $\exists x < m\ B(x)$, that is, $\sigma$ holds. Consequently, $\neg\sigma$ implies $\forall m\ h(m) \neq r$, and hence $\ell \neq r$. □

Lemma 6.3. In the theory $\mathrm{PRA} + \mathrm{Cons}(T)$ the following statements are provable:
1. $\exists m\ h(m) = \bar z \to \ell \notin \{0, b, r\}$, if $z \in K \setminus \{0, b, r\}$;
2. $\ell \in \{0, b, r\}$;
3. $\mathrm{Prov}_T(\ulcorner \sigma \urcorner) \wedge \neg\sigma \to \ell = \bar b$.

Proof. 1. The following reasoning can be formalized in PRA: Suppose $h(m) = z$ and $z \notin \{0, b, r\}$. Then by monotonicity of $h$ we have $\forall n \geq m\ h(n) \succeq z \succ b \succ 0$, that is, $\ell \notin \{0, b\}$. On the other hand, notice that in order to get to the node $r$ the function $h$ has to make a jump from $b$. By monotonicity, though, we have $\forall n \geq m\ h(n) \neq b$. Therefore, $\forall n \geq m\ h(n) \neq r$, whence $\ell \neq r$, q.e.d.

2. First of all, by induction on the depth of the node $z$ we prove that, if $z \notin \{0, b, r\}$, then
$$\mathrm{PRA} \vdash \ell = \bar z \to (\Box^{d(z)+1}\bot)^T. \qquad (3)$$
Indeed, if $d(z) = 0$, then by statement 6 of Lemma 6.2
$$\mathrm{PRA} \vdash \ell = \bar z \to \neg\mathrm{Con}(T) \to (\Box\bot)^T.$$
If $d(z) > 0$, then by statement 1
$$\mathrm{PRA} \vdash \ell = \bar z \to \exists m\ h(m) = \bar z \to \mathrm{Prov}_T(\ulcorner \exists m\ h(m) = \bar z \urcorner) \to \mathrm{Prov}_T(\ulcorner \ell \notin \{0, b, r\} \urcorner).$$
On the other hand, by statement 6 of Lemma 6.2
$$\mathrm{PRA} \vdash \ell = \bar z \to \mathrm{Prov}_T\Bigl(\ulcorner \bigvee_{u \succ z} \ell = \bar u \urcorner\Bigr).$$
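From this we obtain
$$\mathrm{PRA} \vdash \ell = \bar z \to \mathrm{Prov}_T\Bigl(\ulcorner \bigvee_{u \succ z,\ u \notin \{0, b, r\}} \ell = \bar u \urcorner\Bigr).$$
But, if $u \notin \{0, b, r\}$ and $u \succ z$, then $d(u) < d(z)$. Hence, by the induction hypothesis, for all such $u$,
$$\mathrm{PRA} \vdash \ell = \bar u \to (\Box^{d(u)+1}\bot)^T \to (\Box^{d(z)}\bot)^T.$$
Therefore
$$\mathrm{PRA} \vdash \ell = \bar z \to \mathrm{Prov}_T(\ulcorner (\Box^{d(z)}\bot)^T \urcorner) \to (\Box^{d(z)+1}\bot)^T.$$
So, we have proved (3) for all $z \notin \{0, b, r\}$.

Now choose a number $n$ so large that the depth of any node in $\mathcal{K}$, except for 0, $b$, and $r$, is strictly less than $n$ (e.g., it is sufficient to take $n \rightleftharpoons h(\mathcal{K})$). Then for all $z \notin \{0, b, r\}$ we have
$$\mathrm{PRA} \vdash \ell = \bar z \to (\Box^{d(z)+1}\bot)^T \to (\Box^{n}\bot)^T,$$
that is,
$$\mathrm{PRA} \vdash (\neg\Box^{n}\bot)^T \to \bigwedge_{z \notin \{0, b, r\}} \ell \neq \bar z \to \ell \in \{0, b, r\}.$$
The second claim of Lemma 6.3 follows.

3. By statement 2
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \ell = \bar 0 \vee \ell = \bar b \vee \ell = \bar r.$$
By statement 8 of Lemma 6.2
$$\mathrm{PRA} \vdash \neg\sigma \to \ell \neq \bar r.$$
By statement 7 of Lemma 6.2
$$\mathrm{PRA} \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \ell \neq \bar 0,$$
whence
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \wedge \neg\sigma \to \ell = \bar b. \ □$$

We define a realization $f$ as in the Solovay construction in the following way:
$$f(p) \rightleftharpoons \bigvee_{z \in K,\ z \Vdash p} \ell = \bar z$$
for any propositional letter $p$.

Lemma 6.4. For all $z \in K$, $z \succ 0$, and all subformulas $\psi$ of the formula $\varphi$:
1. If $z \Vdash \psi$, then $\mathrm{PRA} \vdash \ell = \bar z \to f_T(\psi)$;
2. If $z \nVdash \psi$, then $\mathrm{PRA} \vdash \ell = \bar z \to \neg f_T(\psi)$.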
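Proof. Induction on the build-up of $\psi$. We shall only consider the crucial case when $\psi$ has the form $\Box\theta$, and $z$ is $b$ or $r$. All other cases are treated similarly to Lemma 3.2 using statements 1–6 of Lemma 6.2.

Case 1: $z = r$.
1. Let $r \Vdash \Box\theta$. Then $\forall u \succ r\ u \Vdash \theta$. Since $\Box\theta$ is a subformula of $\varphi$ and $r$ is a $\varphi$-reflexive node, we also obtain $r \Vdash \theta$. Hence, $\forall u \succeq r\ u \Vdash \theta$. By the induction hypothesis
$$\forall u \succeq r \quad \mathrm{PRA} \vdash \ell = \bar u \to f_T(\theta),$$
whence
$$\mathrm{PRA} \vdash \Bigl(\bigvee_{u \succeq r} \ell = \bar u\Bigr) \to f_T(\theta)$$
and
$$\mathrm{PRA} \vdash \mathrm{Prov}_T\Bigl(\ulcorner \bigvee_{u \succeq r} \ell = \bar u \urcorner\Bigr) \to \mathrm{Prov}_T(\ulcorner f_T(\theta) \urcorner).$$
Hence, by statement 5 of Lemma 6.2,
$$\mathrm{PRA} \vdash \ell = \bar r \to \mathrm{Prov}_T\Bigl(\ulcorner \bigvee_{u \succeq r} \ell = \bar u \urcorner\Bigr) \to f_T(\Box\theta).$$
This proves statement 1 for the case $z = r$.
2. Let $r \nVdash \Box\theta$. Then $\exists u \succ r\ u \nVdash \theta$. By the induction hypothesis
$$\mathrm{PRA} \vdash \ell = \bar u \to \neg f_T(\theta).$$
Hence
$$\mathrm{PRA} \vdash \neg\mathrm{Prov}_T(\ulcorner \ell \neq \bar u \urcorner) \to \neg\mathrm{Prov}_T(\ulcorner f_T(\theta) \urcorner),$$
and by statement 4 of Lemma 6.2
$$\mathrm{PRA} \vdash \ell = \bar r \to \neg\mathrm{Prov}_T(\ulcorner \ell \neq \bar u \urcorner) \to \neg f_T(\Box\theta).$$

Case 2: $z = b$.
1. Let $b \Vdash \Box\theta$. Then $\forall u \succ b\ u \Vdash \theta$, and hence
$$\mathrm{PRA} \vdash \bigvee_{u \succ b} \ell = \bar u \to f_T(\theta).$$
By statement 6 of Lemma 6.2 we obtain
$$\mathrm{PRA} \vdash \ell = \bar b \to \mathrm{Prov}_T\Bigl(\ulcorner \bigvee_{u \succ b} \ell = \bar u \urcorner\Bigr) \to \mathrm{Prov}_T(\ulcorner f_T(\theta) \urcorner) \to f_T(\Box\theta).$$
2. Let $b \nVdash \Box\theta$. Then there is a node $u \succ b$ such that $u \nVdash \theta$. Pick a maximal such $u$ in the sense of $\prec$. We claim that $u \neq r$. Indeed, by maximality $u \nVdash \theta$ and $\forall w \succ u\ w \Vdash \theta$. Hence $u \Vdash \Box\theta$, and thus $u$ cannot be a $\varphi$-reflexive node, whereas the node $r$ is. Now we can reason in the usual manner, exploiting statement 4 of Lemma 6.2. By the induction hypothesis
$$\mathrm{PRA} \vdash \ell = \bar u \to \neg f_T(\theta),$$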
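whence
$$\mathrm{PRA} \vdash \ell = \bar b \to \neg\mathrm{Prov}_T(\ulcorner \ell \neq \bar u \urcorner) \to \neg\mathrm{Prov}_T(\ulcorner f_T(\theta) \urcorner) \to \neg f_T(\Box\theta).$$
This completes the proof of Lemma 6.4. □

Now we finish the proof of Lemma 6.1. By Lemma 6.4, since $b \nVdash \varphi$, we obtain
$$\mathrm{PRA} \vdash \ell = \bar b \to \neg f_T(\varphi).$$
By statement 3 of Lemma 6.3
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \wedge \neg\sigma \to \ell = \bar b.$$
Hence,
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \wedge \neg\sigma \to \neg f_T(\varphi),$$
that is,
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash f_T(\varphi) \to (\mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \sigma). \ □$$

Corollary 6.5. If $\varphi$ is a modal formula and $\mathrm{A} \nvdash \varphi$, then
$$\mathrm{PRA} + \mathrm{Cons}(T) + \{\varphi\}^T \vdash \mathrm{Rfn}_{\Sigma_1}(T).$$

Proof. By definition $\{\varphi\}^T$ is the set of all arithmetical interpretations $f_T(\varphi)$ of $\varphi$. By Lemma 6.1, for any $\Sigma_1$-sentence $\sigma$ there is a realization $f$ such that
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash f_T(\varphi) \to (\mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \sigma).$$
Hence,
$$\mathrm{PRA} + \mathrm{Cons}(T) + \{\varphi\}^T \vdash \mathrm{Prov}_T(\ulcorner \sigma \urcorner) \to \sigma. \ □$$

Corollary 6.6. Let $\mathcal{L}$ be a provability logic with trace $\omega$. Then $\mathcal{L}$ coincides with one of the following four logics: A, D, S, or $\mathrm{GL}\{\bot\}$.

Proof. By Corollary 3.12, $\mathcal{L} \supseteq \mathrm{A}$. If $\mathcal{L} \neq \mathrm{A}$, then there is a formula $\varphi$ such that $\mathrm{A} \nvdash \varphi$ and $\mathcal{L} \vdash \varphi$. By Corollary 6.5,
$$\mathrm{PRA} + \mathrm{Cons}(T) + \{\varphi\}^T \supseteq \mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T),$$
and hence
$$\mathcal{L} \supseteq [\mathrm{A}\{\varphi\}]^T = \mathrm{PL}_T(\mathrm{PRA} + \mathrm{Cons}(T) + \{\varphi\}^T) \supseteq \mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)) \supseteq \mathrm{D}.$$
There exist two possibilities: either $\mathcal{L} = \mathrm{D}$, or $\mathcal{L} \nsubseteq \mathrm{D}$. In the latter case by Corollary 5.4 two further possibilities remain: either $\mathcal{L}$ is inconsistent, or $\mathcal{L}$ coincides with S. □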
7. Main results

Summing up the results of Sections 2–6, we obtain the main theorem of this paper.

Theorem 1. Any provability logic coincides with one of the following modal logics:
$$(*) \qquad \mathrm{GL}_\alpha,\ \mathrm{GL}^-_\beta,\ \mathrm{D}_\beta,\ \mathrm{S}_\beta \qquad (\alpha, \beta \subseteq \omega,\ \beta \text{ is cofinite}).$$

Proof. Let $\mathcal{L}$ be a provability logic and $\mathrm{tr}(\mathcal{L}) = \alpha$. If $\omega \setminus \alpha$ is infinite, then by Corollary 3.13 $\mathcal{L} = \mathrm{GL}_\alpha$. If $\omega \setminus \alpha$ is finite, then by the proof of Lemma 3.19 $\mathcal{L} = \mathcal{L}' \cap \mathrm{GL}^-_\alpha$, for some provability logic $\mathcal{L}'$ with trace $\omega$. By Corollary 6.6, $\mathcal{L}'$ coincides with one of the logics S, D, A, or $\mathrm{GL}\{\bot\}$. If $\mathcal{L}'$ is S or D, then by definition $\mathcal{L}$ equals $\mathrm{S}_\alpha$ or $\mathrm{D}_\alpha$, respectively. If $\mathcal{L}'$ is inconsistent, then, obviously, $\mathcal{L} = \mathrm{GL}^-_\alpha$. Finally, if $\mathcal{L}' = \mathrm{A}$, then $\mathcal{L} = \mathcal{L}' \cap \mathrm{GL}^-_\alpha$ is a regular Turing logic with trace $\alpha$. By Corollary 3.10 it must coincide with $\mathrm{GL}_\alpha$. □

In order to show that all logics from the list (*) are realized as provability logics for any strongly consistent theory $T$ we first gather some additional information about local reflection schemata.

Let $U$ be an extension of an axiomatized theory $T$. We say that $U$ is $\Pi_n$-axiomatized over $T$ if all axioms of $U$ that are not axioms of $T$ are arithmetical $\Pi_n$-sentences.

Lemma 7.1. Let $U$ be a consistent $\Pi_n$-axiomatized extension of $T$. Then there is a sentence $A \in \Pi_n$ such that the theory $T + A$ is consistent and contains $U$.

Proof. Let $I(m, u)$ denote the primitive recursive formula
$$\forall y < m\ \neg\mathrm{Prf}_{T+u}(\ulcorner \bot \urcorner, y).$$
Using the fixed point lemma, we find a formula $A$ such that
$$\mathrm{PRA} \vdash A \leftrightarrow \forall x\ \bigl(\mathrm{Ax}_U(x) \wedge x \in \Pi_n \wedge I(x, \ulcorner A \urcorner) \to \mathrm{True}_{\Pi_n}(x)\bigr).$$
It is easy to see that $A$ is equivalent in PRA to a $\Pi_n$-sentence.

We prove that $T + A$ is a consistent theory. Indeed, if $T + A \vdash \bot$, then for some $m \in \omega$,
$$\mathrm{PRA} \vdash \neg I(\bar m, \ulcorner A \urcorner) \wedge \forall x < \bar m\ I(x, \ulcorner A \urcorner),$$
because $I(x, y)$ is a primitive recursive formula. Therefore,
$$\mathrm{PRA} \vdash A \leftrightarrow \forall x < \bar m\ \bigl(\mathrm{Ax}_U(x) \wedge x \in \Pi_n \to \mathrm{True}_{\Pi_n}(x)\bigr) \leftrightarrow \bigwedge_i \mathrm{True}_{\Pi_n}(\ulcorner A_i \urcorner),$$
where the $A_i$'s exhaust all $\Pi_n$-axioms of $U$ with Gödel numbers $< m$. Since, obviously, $U \vdash \bigwedge_i A_i$, it follows that $U \vdash A$.
Hence, $U$ extends $T + A$ and has to be inconsistent, contradicting one of the assumptions. So, we have proved that $T + A$ is a consistent theory.

Now we prove that $T + A$ contains $U$. Let $B$ be any axiom of $U$. Since $U$ is a $\Pi_n$-axiomatized extension of $T$, either $B \in \Pi_n$, or $B$ is an axiom of $T$. It is sufficient to show that $T + A \vdash B$ when $B \in \Pi_n$.
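Since the theory $T + A$ is consistent, the formula $I(m, \ulcorner A \urcorner)$ is true for all $m \in \omega$. Take $\ulcorner B \urcorner$ for $m$. Then, obviously,
$$\mathrm{PRA} \vdash \mathrm{Ax}_U(\ulcorner B \urcorner) \wedge \ulcorner B \urcorner \in \Pi_n \wedge I(\ulcorner B \urcorner, \ulcorner A \urcorner).$$
From the definition of $A$ it follows that
$$\mathrm{PRA} \vdash A \to \mathrm{True}_{\Pi_n}(\ulcorner B \urcorner) \to B.$$
Hence $T + A$ contains $U$. □

Corollary 7.2. If $T + \mathrm{Rfn}_{\Sigma_n}(T)$ is consistent, then
$$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_n}(T) \nvdash \mathrm{Rfn}_{\Sigma_{n+1}}(T).$$

Proof. Evidently, $\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_n}(T)$ is a consistent $\Pi_{n+1}$-axiomatized extension of PRA. Hence, by Lemma 7.1 there is a consistent formula $A \in \Pi_{n+1}$ such that
$$\mathrm{PRA} + A \vdash \mathrm{Rfn}_{\Sigma_n}(T).$$
On the other hand, Lemma 1.1 asserts that for any such formula $A$,
$$\mathrm{PRA} + A \nvdash \mathrm{Rfn}_{\Sigma_{n+1}}(T). \ □$$

Now, using a result from [18], we obtain a necessary and sufficient condition for the consistency of local reflection schemata.

Lemma 7.3. A theory $T + \mathrm{Rfn}(T)$ is consistent iff $T$ is strongly consistent.

Proof. Obviously, strong consistency of $T$ follows from the ordinary consistency of $T + \mathrm{Rfn}(T)$, because $T_\omega \subseteq T + \mathrm{Rfn}(T)$. For a proof of the opposite implication assume, for a contradiction, that $T + \mathrm{Rfn}(T)$ is inconsistent. Then there are arithmetical sentences $A_1, \dots, A_n$ such that
$$T \vdash \neg\bigwedge_{i=1}^{n} (\mathrm{Prov}_T(\ulcorner A_i \urcorner) \to A_i),$$
and therefore
$$\mathrm{PRA} \vdash \mathrm{Prov}_T\Bigl(\ulcorner \neg\bigwedge_{i=1}^{n} (\mathrm{Prov}_T(\ulcorner A_i \urcorner) \to A_i) \urcorner\Bigr).$$
On the other hand, by Corollary 2.16
$$\mathrm{A} \vdash \Diamond R(p_1, \dots, p_n),$$
where $p_1, \dots, p_n$ are distinct propositional letters. Hence, for any realization $f$,
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \neg\mathrm{Prov}_T\Bigl(\ulcorner \neg\bigwedge_{i=1}^{n} (\mathrm{Prov}_T(\ulcorner f(p_i) \urcorner) \to f(p_i)) \urcorner\Bigr).$$
Now define a realization $f$ in such a way that, for all $i$, $f(p_i) = A_i$. Then we obtain
$$\mathrm{PRA} + \mathrm{Cons}(T) \vdash \neg\mathrm{Prov}_T\Bigl(\ulcorner \neg\bigwedge_{i=1}^{n} (\mathrm{Prov}_T(\ulcorner A_i \urcorner) \to A_i) \urcorner\Bigr),$$
that is, the theory $\mathrm{PRA} + \mathrm{Cons}(T)$ is inconsistent. □

Putting together Corollary 7.2 and Lemma 7.3, we immediately obtain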
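Corollary 7.4. If a theory $T$ is strongly consistent, then for any $n \geq 1$,
$$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_n}(T) \nvdash \mathrm{Rfn}_{\Sigma_{n+1}}(T).$$

The following theorem shows that all possible provability logics can be realized as provability logics for any fixed strongly consistent theory $T$.

Theorem 2. If an axiomatized theory $T$ is strongly consistent, then $\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{PL}_T(T) = \mathrm{GL}$, $\mathrm{PL}_T(\mathrm{TA})$ is S, D, or A, and every logic from (*) has the form $\mathrm{PL}_T(U)$ for an appropriate outer theory $U$.

Proof. We have already established in Corollary 3.17 that, if $T$ is strongly consistent, then $\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{PL}_T(T) = \mathrm{GL}$. Corollary 3.16 together with Corollary 6.6 shows that $\mathrm{PL}_T(\mathrm{TA})$ coincides with one of the logics S, D, or A. Besides, by Corollary 3.18 all singular Turing logics $\mathrm{GL}^-_\beta$, for $\beta \subseteq \omega$, $\beta$ cofinite, are provability logics for $T$.

We prove that $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}(T)) = \mathrm{S}$. Obviously, $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}(T))$ contains S. On the other hand, by Lemma 7.3 the provability logic $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}(T))$ cannot be inconsistent. Hence, by Corollary 6.6, $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}(T)) = \mathrm{S}$. It follows that all logics of the form $\mathrm{S}_\beta = \mathrm{S} \cap \mathrm{GL}^-_\beta$ are provability logics for $T$.

Next, we show that $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)) = \mathrm{D}$. Obviously,
$$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T) \vdash \{\neg\Box\bot,\ \Box(\Box p \vee \Box q) \to \Box p \vee \Box q\}^T,$$
whence
$$\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)) \supseteq \mathrm{D}.$$
On the other hand, if
$$\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)) \neq \mathrm{D},$$
then by Corollary 6.6
$$\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)) \supseteq \mathrm{S},$$
and so
$$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T) \vdash \mathrm{Rfn}(T),$$
which contradicts Corollary 7.4. Therefore D, together with all logics of the series $\mathrm{D}_\beta$ ($\beta \subseteq \omega$, $\beta$ cofinite), is a provability logic for $T$.

Now we prove that $\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Cons}(T)) = \mathrm{A}$. We already know that
$$\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Cons}(T)) \supseteq \mathrm{A}.$$
If
$$\mathrm{PL}_T(\mathrm{PRA} + \mathrm{Cons}(T)) \neq \mathrm{A},$$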
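then by Corollary 6.5 $\mathrm{PRA} + \mathrm{Cons}(T) \vdash \mathrm{Rfn}_{\Sigma_1}(T)$, which is impossible in view of Lemmas 7.1 and 1.1. (The consistent theory $\mathrm{PRA} + \mathrm{Cons}(T)$ can be majorized by a single consistent $\Pi_1$-sentence, whereas $\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(T)$ cannot.) Thus, the logics $\mathrm{GL}_\alpha$, for all cofinite $\alpha$, are provability logics for $T$.

It remains for us to prove the same fact for the logics $\mathrm{GL}_\alpha$, $\alpha$ coinfinite. Consider the $T$-completion $[\mathrm{GL}_\alpha]^T$ of any such logic. By Corollary 3.14, $\mathrm{tr}([\mathrm{GL}_\alpha]^T) = \mathrm{tr}(\mathrm{GL}_\alpha) = \alpha$. However, Corollary 3.13 tells us that $\mathrm{GL}_\alpha$ is the only provability logic with trace $\alpha$. Hence, $[\mathrm{GL}_\alpha]^T = \mathrm{GL}_\alpha$, and so $\mathrm{GL}_\alpha$ is the $T$-representation of the theory $\mathrm{PRA} + \{F_n \mid n \in \alpha\}^T$. □

The following theorem gives a complete description of all truth provability logics.

Theorem 3. The truth provability logics are precisely the following ones:
(**) S, D, A, and $\mathrm{GL}\{\neg F_n\}$, $n \in \omega$.
Moreover, for any axiomatized theory $T$,
1. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{S}$ if and only if $T$ is sound;
2. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{D}$ if and only if $T$ is $\Sigma_1$-sound but not sound;
3. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{A}$ if and only if $T$ is strongly consistent but not $\Sigma_1$-sound;
4. $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}$ if and only if $\mathrm{rk}(T) = n$ (for $n < \infty$).

Proof. Let $\mathcal{L} = \mathrm{PL}_T(\mathrm{TA})$ and $\alpha = \mathrm{tr}(\mathcal{L})$. First, we shall prove that $\mathcal{L}$ is one of the logics (**). If $\alpha = \omega$, in view of Corollary 6.6 it is sufficient to show that $\mathcal{L}$ is consistent, but a truth provability logic cannot be inconsistent. Suppose $\alpha \neq \omega$. If $n \notin \alpha$, then $\mathcal{L} \nvdash F_n$ and thus $\mathbb{N} \nvDash (F_n)^T$. Hence $\mathbb{N} \vDash (\neg F_n)^T$ and $\mathcal{L} \vdash \neg F_n$, that is, $\alpha = \mathrm{tr}(\mathcal{L}) \supseteq \mathrm{tr}(\neg F_n) = \omega \setminus \{n\}$. Since we assume $\alpha \neq \omega$, it follows that $\alpha = \omega \setminus \{n\}$ and therefore $\mathcal{L} \supseteq \mathrm{GL}\{\neg F_n\} = \mathrm{GL}^-_\alpha$. But $\mathrm{GL}^-_\alpha$ is the strongest logic with trace $\alpha$, and hence $\mathcal{L} = \mathrm{GL}\{\neg F_n\}$.

Now we prove statements 1-4.

1. This statement is the content of Solovay’s second theorem [29]; see Section 3.1. With our present knowledge we can give the following simple argument. If $T$ is sound, then $\mathbb{N} \vDash \mathrm{Rfn}(T)$ and so $\mathrm{PL}_T(\mathrm{TA}) \supseteq \mathrm{S}$. But no logic from the list (**) contains S, except for S itself. Hence $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{S}$.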
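Conversely, if $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{S}$, then $\mathbb{N} \vDash \mathrm{Rfn}(T)$, which means that $T$ is a sound theory.

2. Let $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{D}$. Then $\mathbb{N} \vDash \{\neg\Box\bot,\ \Box(\Box p \vee \Box q) \to \Box p \vee \Box q\}^T$. Since $\mathrm{A} \nvdash \neg\Box\bot \wedge (\Box(\Box p \vee \Box q) \to \Box p \vee \Box q)$, by Corollary 6.5 we obtain
\[ \mathrm{PRA} + \{\neg\Box\bot,\ \Box(\Box p \vee \Box q) \to \Box p \vee \Box q\}^T \vdash \mathrm{Rfn}_{\Sigma_1}(T). \]
Hence, $\mathbb{N} \vDash \mathrm{Rfn}_{\Sigma_1}(T)$, that is, the theory $T$ is $\Sigma_1$-sound. Nonsoundness of $T$ follows from statement 1, for $\mathrm{D} \neq \mathrm{S}$.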
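Now suppose that $T$ is $\Sigma_1$-sound but not sound. Then, obviously, $\mathrm{PL}_T(\mathrm{TA}) \supseteq \mathrm{D}$, whereas $\mathrm{PL}_T(\mathrm{TA}) \neq \mathrm{S}$ follows from the fact that $T$ is not sound. Hence, $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{D}$, because none of the other logics from (**) contains D.

3. Let $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{A}$. Then $\mathbb{N} \vDash \{(\neg\Box^n\bot)^T \mid n \in \omega\}$, that is, $\mathbb{N} \vDash \mathrm{Cons}(T)$. Hence, $T$ is strongly consistent. If $T$ were $\Sigma_1$-sound, we would get $\mathrm{PL}_T(\mathrm{TA}) \supseteq \mathrm{D}$, but A does not contain D.

Let $T$ be a strongly consistent theory. Then $\mathrm{PL}_T(\mathrm{TA}) \supseteq \mathrm{A}$, and therefore $\mathrm{PL}_T(\mathrm{TA})$ does not have the form $\mathrm{GL}\{\neg F_n\}$. But, if $T$ is not $\Sigma_1$-sound, then $\mathrm{PL}_T(\mathrm{TA})$ does not contain D. Hence, $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{A}$.

4. Assume $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}$. Then $\mathbb{N} \vDash (\neg F_n)^T$, that is, $\mathbb{N} \vDash (\Box^{n+1}\bot)^T$ and $\mathbb{N} \vDash (\neg\Box^n\bot)^T$. But the formula $(\neg\Box^{n+1}\bot)^T$ expresses the consistency of $T_n$ (see Example 1.8). Hence, $n$ is the least $k$ such that $T_k$ is inconsistent, that is, $n = \mathrm{rk}(T)$.

Conversely, if $n = \mathrm{rk}(T)$, we have $\mathbb{N} \vDash (\neg F_n)^T$, that is, $\mathrm{PL}_T(\mathrm{TA}) \supseteq \mathrm{GL}\{\neg F_n\}$. But $\neg F_n$ is not provable in any of the regular logics A, D, and S, nor in $\mathrm{GL}\{\neg F_k\}$, for $k \neq n$. Therefore, $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}$.

So, we have proved statements 1-4. Together with 1.3, 1.4, and 1.7 this shows that all logics from the list (**) are realizable as truth provability logics. □

The following theorem gives an exhaustive description of all provability logics for theories of finite rank.

Theorem 4. Let $T$ be an axiomatized theory of rank $n < \infty$. Then
\[ \mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}, \qquad \mathrm{PL}_T(\mathrm{PRA}) = \mathrm{GL}\{\Box^{n+1}\bot\}, \qquad \mathrm{PL}_T(T) = \mathrm{GL}\{\Box^{n}\bot\}, \]
and the provability logics for $T$ exhaust all Turing extensions of $\mathrm{GL}\{\Box^{n+1}\bot\}$, that is, the logics $\mathrm{GL}^-_\alpha$ for $\omega \setminus \alpha \subseteq \{0, \dots, n\}$.

Proof. In the previous theorem we have already established that $\mathrm{PL}_T(\mathrm{TA}) = \mathrm{GL}\{\neg F_n\}$. Since $\mathrm{GL}\{\neg F_n\} \vdash \Box^{n+1}\bot$, $(\Box^{n+1}\bot)^T$ is a true $\Sigma_1$-formula. Hence, $\mathrm{PRA} \vdash (\Box^{n+1}\bot)^T$, that is, $\mathrm{PL}_T(\mathrm{PRA}) \supseteq \mathrm{GL}\{\Box^{n+1}\bot\}$. It follows that $\mathrm{PL}_T(\mathrm{PRA})$ is singular and, by Lemma 3.15, has to be a Turing logic.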
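Furthermore, it is closed under the necessitation rule $\varphi \vdash \Box\varphi$, and therefore is bound to have the form $\mathrm{GL}\{\Box^{k}\bot\}$, for some $k \in \omega$. But $\mathrm{PL}_T(\mathrm{PRA}) \nvdash \Box^n\bot$, because $(\Box^n\bot)^T$ is false. Consequently, $\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{GL}\{\Box^{n+1}\bot\}$.

Now we prove that $\mathrm{PL}_T(T) = \mathrm{GL}\{\Box^n\bot\}$. First of all, notice that
\[ \mathrm{PL}_T(T) \vdash \varphi \iff \mathrm{PL}_T(\mathrm{TA}) \vdash \Box\varphi \iff \mathrm{GL}\{\neg F_n\} \vdash \Box\varphi. \]
Obviously, $\mathrm{GL}\{\neg F_n\} \vdash \Box^{n+1}\bot$, and hence $\mathrm{PL}_T(T) \vdash \Box^n\bot$. We show that $\mathrm{PL}_T(T) \subseteq \mathrm{GL}\{\Box^n\bot\}$.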
Let $\mathrm{GL}\{\Box^n\bot\} \nvdash \varphi$. Then, obviously, $\mathrm{GL} \nvdash \Box^n\bot \to \varphi$. Consequently, there is a GL-model $\mathcal{K}$ of height $< n$ such that $\mathcal{K} \nVdash \varphi$. Consider any GL-model $\mathcal{K}'$ of height $n$ containing a cone isomorphic to $\mathcal{K}$. Clearly, $\mathcal{K}' \Vdash \neg F_n$ and $\mathcal{K}' \nVdash \Box\varphi$. It follows that $\mathrm{GL} \nvdash \neg F_n \to \Box\varphi$, and by the deduction theorem $\mathrm{GL}\{\neg F_n\} \nvdash \Box\varphi$, that is, $\mathrm{PL}_T(T) \nvdash \varphi$. Thus, we have shown that $\mathrm{PL}_T(T) \subseteq \mathrm{GL}\{\Box^n\bot\}$, and therefore $\mathrm{PL}_T(T) = \mathrm{GL}\{\Box^n\bot\}$.

By Lemma 3.15 all provability logics for $T$ happen to be Turing ones and contain $\mathrm{PL}_T(\mathrm{PRA}) = \mathrm{GL}\{\Box^{n+1}\bot\}$. We prove that any Turing logic containing $\mathrm{GL}\{\Box^{n+1}\bot\}$ is, in fact, a $T$-representation of a suitable extension $U \supseteq \mathrm{PRA}$. Let $\alpha \subseteq \omega$ and $\omega \setminus \alpha \subseteq \{0, \dots, n\}$. Take
\[ U = \mathrm{PRA} + \{F_i \mid i \in \beta\}^T, \quad \text{where } \beta = \alpha \cap \{0, \dots, n\}. \]
We show that $\mathrm{PL}_T(U) = \mathrm{GL}^-_\alpha$. Indeed, for any modal formula $\varphi$ we obtain
\[
\begin{aligned}
\mathrm{PL}_T(U) \vdash \varphi
&\iff \forall f\ \ U \vdash f_T(\varphi) \\
&\iff \forall f\ \ \mathrm{PRA} \vdash \bigwedge_{i \in \beta} (F_i)^T \to f_T(\varphi) \\
&\iff \forall f\ \ \mathrm{PRA} \vdash f_T\Bigl(\bigwedge_{i \in \beta} F_i \to \varphi\Bigr) \\
&\iff \mathrm{PL}_T(\mathrm{PRA}) \vdash \bigwedge_{i \in \beta} F_i \to \varphi \\
&\iff \mathrm{GL}\{\Box^{n+1}\bot,\ F_i \mid i \in \beta\} \vdash \varphi \\
&\iff \mathrm{GL}^-_\alpha \vdash \varphi.
\end{aligned}
\]
This completes the proof of Theorem 4. □

8. Examples, comments, and related results

In view of the classification theorem for provability logics a natural question arises: to what extent is this classification effective?

8.1. Effectiveness of the classification of provability logics. Since the provability predicate for PRA is undecidable, there is no effective algorithm telling, given an arbitrary numeration (provability predicate), if the corresponding theory is consistent. Accordingly, the more general problem "Given a numeration of a theory $T$, find the truth provability logic for $T$", and other problems formulated in a similar way, are hopelessly undecidable. On the other hand, provability logics for natural arithmetical theories in every particular case are usually easy to calculate (see Section 8.2 below). This provides (informal) evidence for the claim that our classification is sufficiently ‘simple’ and ‘decidable’ from a practical point of view.
These considerations naturally lead to an attempt to prove the effectiveness of the classification of provability logics for some (necessarily, rather restricted) subclasses of the class of all theories. A simple example of such a class is given by the theories axiomatized over PRA by arithmetical interpretations of finite modal schemata, that is, theories of the form $\mathrm{PRA} + \{\varphi\}^{\mathrm{PRA}}$, for some modal formula $\varphi$. As we have seen, this class includes such theories as $\mathrm{PRA} + \mathrm{Con}(\mathrm{PRA})$, $\mathrm{PRA} + \mathrm{Rfn}(\mathrm{PRA})$,
$\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_1}(\mathrm{PRA})$, but does not include such theories as PA, $\mathrm{PRA} + \mathrm{RFN}_{\Sigma_1}(\mathrm{PRA})$, or $\mathrm{PRA} + \mathrm{Rfn}_{\Sigma_2}(\mathrm{PRA})$.

Proposition 8.1. There is an algorithm which, for a given modal formula $\varphi$, calculates the provability logic for PRA relative to $\mathrm{PRA} + \{\varphi\}^{\mathrm{PRA}}$.

Before giving a proof of this proposition, we shall consider the question of decidability of individual provability logics. A complete answer to this question is given by the following results [29, 1, 3, 14]:
• Logics of the form $\mathrm{GL}^-_\beta$, $\mathrm{D}_\beta$, and $\mathrm{S}_\beta$, for any cofinite $\beta \subseteq \omega$, are decidable (cf. 3.7, 2.9, 2.4).
• Logics of the form $\mathrm{GL}_\alpha$, $\alpha \subseteq \omega$, are decidable iff $\alpha$ is a decidable subset of $\omega$.
The second statement is a corollary of the following simple lemma [3].

Lemma 8.2. The trace $\mathrm{tr}(\varphi)$ of a modal formula $\varphi$ can be found effectively from $\varphi$.

Proof. By 3.4 and 2.18 we can effectively recognize whether $\mathrm{tr}(\varphi)$ is finite or cofinite. Moreover, Corollary 2.17 allows us to effectively pinpoint an interval $\{0, \dots, n\}$ that contains $\mathrm{tr}(\varphi)$ in case $\mathrm{tr}(\varphi)$ is finite, or $\omega \setminus \mathrm{tr}(\varphi)$ in case $\mathrm{tr}(\varphi)$ is cofinite. Hence, in order to exactly characterize the trace of $\varphi$ it is sufficient to check, finitely many times, if $m \in \mathrm{tr}(\varphi)$, for all elements $m$ of this interval. But
\[ m \in \mathrm{tr}(\varphi) \iff \mathrm{GL} \nvdash \varphi \vee F_m, \]
which is decidable. □

Notice that for any modal formula $\varphi$ and any $\alpha \subseteq \omega$,
\[ \mathrm{GL}_\alpha \vdash \varphi \iff (\mathrm{A} \vdash \varphi \text{ and } \mathrm{tr}(\varphi) \subseteq \alpha). \]
One can effectively check the inclusion $\mathrm{tr}(\varphi) \subseteq \alpha$ iff $\alpha$ is decidable. Hence, the logic $\mathrm{GL}_\alpha$ is decidable iff $\alpha$ is.

Now we are ready for the proof of Proposition 8.1.

Proof. Recall (Section 1.7) that the provability logic for PRA relative to $\mathrm{PRA} + \{\varphi\}^{\mathrm{PRA}}$ coincides with $[\mathrm{GL}\{\varphi\}]^{\mathrm{PRA}}$. In order to effectively find the place of this logic in the list (*), first of all, compute the trace of $\varphi$ using Lemma 8.2. Let $\mathrm{tr}(\varphi) = \alpha$. In case $\alpha$ is finite, $[\mathrm{GL}\{\varphi\}]^{\mathrm{PRA}}$ has to coincide with $\mathrm{GL}_\alpha$, the only provability logic with trace $\alpha$ (Corollary 3.14). If $\alpha$ is cofinite, then $[\mathrm{GL}\{\varphi\}]^{\mathrm{PRA}}$ coincides with
1. $\mathrm{GL}^-_\alpha$, if $\mathrm{S}_\alpha \nvdash \varphi$;
2.
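$\mathrm{S}_\alpha$, if $\mathrm{S}_\alpha \vdash \varphi$, but $\mathrm{D}_\alpha \nvdash \varphi$;
3. $\mathrm{D}_\alpha$, otherwise.

(Notice that, if $\alpha$ is cofinite, then the equality $[\mathrm{GL}\{\varphi\}]^{\mathrm{PRA}} = \mathrm{GL}_\alpha$ is impossible for any $\varphi$: otherwise $\mathrm{tr}(\varphi)$ would simultaneously be finite and cofinite.) All derivability relations in 1-3 are decidable, so we can effectively recognize which of the cases holds. □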
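The letterless formulas $F_m$ that drive the trace computations above can be explored on small finite models. The following is a minimal sketch, not part of the original text: it assumes the reading $F_n = \Box^{n+1}\bot \to \Box^n\bot$ used in the proof of Theorem 3, represents formulas as nested tuples, and uses linear chains as sample GL-models. The names `forces`, `iter_box`, `F`, and `chain` are hypothetical helpers introduced only for this illustration.

```python
# Formulas are nested tuples: ("bot",), ("var", name), ("imp", a, b), ("box", a).
BOT = ("bot",)

def box(f):
    return ("box", f)

def imp(a, b):
    return ("imp", a, b)

def forces(model, x, f):
    """Forcing at node x of a finite model (succ, val).

    succ[x] is the set of nodes strictly above x (a transitive, irreflexive
    relation); val[x] is the set of letters true at x.
    """
    succ, val = model
    tag = f[0]
    if tag == "bot":
        return False
    if tag == "var":
        return f[1] in val.get(x, set())
    if tag == "imp":
        return (not forces(model, x, f[1])) or forces(model, x, f[2])
    if tag == "box":
        return all(forces(model, y, f[1]) for y in succ[x])
    raise ValueError(tag)

def iter_box(n):
    """The formula box^n bot."""
    f = BOT
    for _ in range(n):
        f = box(f)
    return f

def F(n):
    """F_n = box^(n+1) bot -> box^n bot (assumed reading)."""
    return imp(iter_box(n + 1), iter_box(n))

def chain(h):
    """Linear model 0 < 1 < ... < h with root 0, so the root has height h."""
    succ = {x: set(range(x + 1, h + 1)) for x in range(h + 1)}
    return succ, {}

# On a chain of height h, F_n fails at the root exactly when h == n.
fails = {(h, n) for h in range(5) for n in range(5)
         if not forces(chain(h), 0, F(n))}
```

On these sample models `fails` contains exactly the pairs with `h == n`, which matches the observation in the proof of Theorem 4 that a model of height $n$ forces $\neg F_n$.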
8.2. Provability logics for natural theories. By ‘natural theories’ we mean explicitly given mathematically meaningful theories (a priori not related to provability logics). Typical examples of such theories are Peano arithmetic PA, Zermelo-Fraenkel set theory ZF, the theories $I\Sigma_n$ mentioned in Section 1.1, and many others. Of course, in general, the notion of ‘natural theory’ is informal.

Theorems 1 and 2 show that a provability logic $\mathrm{PL}_T(U)$ is essentially determined by the amount of reflection for $T$ which is provable in $U$. To find, for any fixed theories $T$ and $U$, how strong reflection principles for $T$ are provable in $U$ can be rather difficult. However, for many natural $T$ and $U$ this question, as a rule, has already been investigated using traditional proof-theoretic methods.

Example 8.3. By a well-known theorem of Kreisel and Levy [21], $\mathrm{PA} \vdash \mathrm{Rfn}(\mathrm{PRA})$ and $\mathrm{ZF} \vdash \mathrm{Rfn}(\mathrm{PA})$. It follows that
\[ \mathrm{PL}_{\mathrm{PA}}(\mathrm{ZF}) = \mathrm{PL}_{\mathrm{PRA}}(\mathrm{PA}) = \mathrm{S}. \]

Example 8.4. A theorem of Leivant [22, 19] states that for all $n \ge 1$,
\[ I\Sigma_{n+1} \vdash \mathrm{RFN}_{\Sigma_{n+2}}(I\Sigma_n). \]
On the other hand, by Lemma 1.2 $I\Sigma_{n+1} \nvdash \mathrm{Rfn}(I\Sigma_n)$, because $I\Sigma_{n+1}$ is a finitely axiomatizable extension of $I\Sigma_n$. It follows that for $1 \le m < n$,
\[ \mathrm{PL}_{I\Sigma_m}(I\Sigma_n) = \mathrm{D}. \]

Example 8.5. Obviously, if an axiomatized theory $U$ contains the local $\Sigma_1$-reflection schema for $T$, then $T + \mathrm{Con}(U) \supseteq T_\omega$. Hence, by Lemma 1.1
\[ \mathrm{PL}_{\mathrm{PA}}(\mathrm{PA} + \mathrm{Con}(\mathrm{ZF})) = \mathrm{PL}_{I\Sigma_1}(I\Sigma_1 + \mathrm{Con}(I\Sigma_2)) = \mathrm{A}. \]

Example 8.6. By a well-known theorem of Parsons (see [19]) the set of $\Pi_2$-consequences of $I\Sigma_1$ coincides with that of PRA. It immediately follows that
\[ \mathrm{PL}_{\mathrm{PRA}}(I\Sigma_1) = \mathrm{GL}. \]
Similarly, it is known that Gödel-Bernays set theory GB is a conservative extension of ZF; hence
\[ \mathrm{PL}_{\mathrm{ZF}}(\mathrm{GB}) = \mathrm{GL}. \]
(Here we assume that the definitions from Section 1.7 are extended to the class of theories formulated in the language of GB.)

8.3. Invariance of provability logics. Now we consider the question of dependence of truth provability logics on the choice of a numeration of an axiomatized theory $T$ (cf. Section 1.2).
Statements 1 and 2 of Theorem 3 show that in the case when a theory $T$ is $\Sigma_1$-sound, the logic $\mathrm{PL}_T(\mathrm{TA})$ does not depend on the choice of a numeration of $T$ (in this case $\mathrm{PL}_T(\mathrm{TA})$ coincides with D or S, depending on the soundness of $T$). If a theory $T$ is not $\Sigma_1$-sound, then $\mathrm{PL}_T(\mathrm{TA})$ is completely determined by the rank of $T$. Thus, we have to find out what values the rank of a given $\Sigma_1$-nonsound theory $T$ can take for various numerations of $T$. It turns out that for finitely axiomatized and reflexive theories the answers to this question are different (cf. Section 1.3).
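Lemma 8.7. Let $T$ be a consistent axiomatized theory which is not $\Sigma_1$-sound. Then for every $n < \mathrm{rk}(T)$ there is an axiomatized theory $T'$ such that $T \equiv T'$ and $\mathrm{rk}(T') = n + 1$.

Proof. Fix a false but provable in $T$ formula $\exists y\, P(y)$, where $P$ is primitive recursive. For $n < \mathrm{rk}(T)$ define
\[ \mathrm{Ax}_{T'}(x) \rightleftharpoons \mathrm{Ax}_T(x) \vee \bigl(\exists y \le x\ P(y) \wedge \exists z \le x\ x = \ulcorner(\Box^n\bot)^T \wedge (\bar z = \bar z)\urcorner\bigr). \]
It is easy to see that
\[ \mathrm{PRA} \vdash \forall y\, \neg P(y) \to \forall x\, (\mathrm{Ax}_T(x) \leftrightarrow \mathrm{Ax}_{T'}(x)) \]
and
\[ \mathrm{PRA} \vdash \exists y\, P(y) \to \forall x\, (\mathrm{Prov}_{T'}(x) \leftrightarrow \mathrm{Prov}_T(\ulcorner(\Box^n\bot)^T\urcorner \mathbin{\dot\to} x)). \]
(Here $\dot\to$ denotes a natural p.r. term for the function mapping $\ulcorner\varphi\urcorner$ and $\ulcorner\psi\urcorner$ to $\ulcorner\varphi \to \psi\urcorner$.) Using these properties, it is not difficult to show by induction on $k$ (compare with 1.4) that for all $k \le n$,
\[ \mathrm{PRA} \vdash (\Box^k\bot)^{T'} \leftrightarrow (\Box^k\bot)^T. \]
It follows that $\forall k < n\ \ \mathbb{N} \vDash \mathrm{Con}_{T'}(k)$. For $k = n$ we thus obtain
\[
\begin{aligned}
\mathrm{PRA} \vdash \forall y\, \neg P(y) &\to (\mathrm{Prov}_{T'}(\ulcorner(\Box^n\bot)^{T'}\urcorner) \leftrightarrow \mathrm{Prov}_T(\ulcorner(\Box^n\bot)^T\urcorner)) \\
&\to (\mathrm{Con}_{T'}(n) \leftrightarrow \mathrm{Con}_T(n)).
\end{aligned}
\]
But the formula $\forall y\, \neg P(y)$ is true and $n < \mathrm{rk}(T)$, whence $\mathbb{N} \vDash \mathrm{Con}_{T'}(n)$, that is, $\mathrm{rk}(T') \ge n + 1$. On the other hand, PRA proves
\[ \exists y\, P(y) \to (\mathrm{Prov}_{T'}(\ulcorner(\Box^n\bot)^{T'}\urcorner) \leftrightarrow \mathrm{Prov}_T(\ulcorner(\Box^n\bot)^T \to (\Box^n\bot)^{T'}\urcorner)), \]
whence
\[ \mathrm{PRA} \vdash \exists y\, P(y) \to \mathrm{Prov}_{T'}(\ulcorner(\Box^n\bot)^{T'}\urcorner). \]
Since $T \vdash \exists y\, P(y)$, it follows that $T \vdash \neg\mathrm{Con}_{T'}(n)$, that is, $\mathrm{rk}(T') = n + 1$. □

Lemma 8.7 shows that the rank of axiomatizations, for any consistent but not $\Sigma_1$-sound theory, varies over a closed interval of the form $[1, \alpha]$, where $1 \le \alpha \le \infty$. For finitely axiomatizable theories this result is optimal in the sense that for any interval of the above form there is a theory with the given range of the rank function. For a proof it is sufficient to notice that, if $T'$ is a theory given by the numeration
\[ \mathrm{Ax}_{T'}(x) \rightleftharpoons \mathrm{Ax}_{\mathrm{PRA}}(x) \vee x = \ulcorner A\urcorner, \]
where $A$ is the only nonlogical axiom of $T$, then $T \equiv T'$ and $\mathrm{PRA} \vdash \mathrm{Con}(T) \to \mathrm{Con}(T')$. Consequently, $\mathrm{rk}(T') \ge \mathrm{rk}(T)$, that is, the maximal possible rank of a finitely axiomatizable theory corresponds to its natural numeration. But in 1.1 and 1.3 we gave examples of finitely axiomatized theories of ranks $1, 2, \dots, \infty$.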
Now we consider reflexive theories. 
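Lemma 8.8. Let $T$ be a consistent reflexive theory. Then there is a theory $T' \equiv T$ that has infinite rank.

Proof. We use a modification of a construction from Feferman [15]. Let $\mathrm{Ax}_T(x)$ be an arbitrary numeration of $T$. With the aid of the fixed point lemma we define a formula $B(y)$ such that
\[ \mathrm{PRA} \vdash B(y) \leftrightarrow \forall z\, (\mathrm{Prf}_T(\ulcorner B(\dot y)\urcorner, z) \to \neg\mathrm{Con}_{T\restriction z}(y)) \]
(see 1.3). Set
\[ \mathrm{Ax}_{T'}(x) \rightleftharpoons \mathrm{Ax}_T(x) \wedge \forall z \le x\ \forall y \le z\ \neg\mathrm{Prf}_T(\ulcorner B(\dot y)\urcorner, z). \]
If $T \vdash B(n)$ for some $n \in \omega$, then $\mathrm{PRA} \vdash \mathrm{Prf}_T(\ulcorner B(\bar n)\urcorner, \bar m)$, for some $m \in \omega$. Hence, $T \vdash \neg\mathrm{Con}_{T\restriction m}(n)$, by definition of $B(y)$. An obvious induction on $n$ shows that, for any reflexive theory $T$, all extensions of the finite subtheory $T\restriction m$ by iterated consistency assertions are (finite) subtheories of $T$, as well. Thus, for all $n \in \omega$, $T \vdash \mathrm{Con}_{T\restriction m}(n)$. It follows that $T$ is inconsistent, contradicting our assumption.

So, we have proved that $T \nvdash B(n)$ for all $n \in \omega$. Consequently, $\mathbb{N} \vDash \forall x\, (\mathrm{Ax}_{T'}(x) \leftrightarrow \mathrm{Ax}_T(x))$, that is, $\mathrm{Ax}_{T'}(x)$ numerates $T$.

Now we show that for all $n \in \omega$, $\mathrm{PRA} \vdash \neg B(n) \to \mathrm{Con}_{T'}(n)$. Indeed, reasoning inside PRA, from the assumption $\neg B(n)$ we can successively derive
\[
\begin{gathered}
\exists z\, (\mathrm{Con}_{T\restriction z}(n) \wedge \mathrm{Prf}_T(\ulcorner B(\bar n)\urcorner, z) \wedge \forall y < z\ \neg\mathrm{Prf}_T(\ulcorner B(\bar n)\urcorner, y)), \\
\exists z\, (\mathrm{Con}_{T\restriction z}(n) \wedge \forall x\, (\mathrm{Ax}_{T'}(x) \to (\mathrm{Ax}_T(x) \wedge x < z))), \\
\mathrm{Con}_{T'}(n).
\end{gathered}
\]
If, for some $n \in \omega$, $T \vdash \neg\mathrm{Con}_{T'}(n)$, then $T \vdash B(n)$, and we get a contradiction. So, $\mathrm{rk}(T') = \infty$. □

Corollary 8.9. Let $T$ be a consistent reflexive theory. Then, if $T$ is not $\Sigma_1$-sound, there exist axiomatized theories equivalent to $T$ of any ranks $\ge 1$.

8.4. Craig’s interpolation property for provability logics. Recall that a propositional logic $\mathcal{L}$ satisfies Craig’s interpolation property if for all formulas $\varphi$ and $\psi$ such that $\mathcal{L} \vdash \varphi \to \psi$ there is a formula $\theta$ such that $\mathcal{L} \vdash \varphi \to \theta$, $\mathcal{L} \vdash \theta \to \psi$, and any propositional variable occurring in $\theta$ occurs simultaneously in $\varphi$ and $\psi$. The formula $\theta$ will be called an interpolant of the implication $\varphi \to \psi$ in $\mathcal{L}$. Craig’s interpolation property for GL was established independently in [8] and [27].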
Here we consider Craig’s interpolation property for other provability logics. First of all, notice that the deduction lemma 3.7 shows that all Turing extensions of a logic $\mathcal{L}$ (that is, logics of the form $\mathcal{L}X$, for $X$ a set of letterless formulas)
satisfy the interpolation property iff $\mathcal{L}$ itself does. In particular, all Turing logics $\mathrm{GL}_\alpha$ and $\mathrm{GL}^-_\beta$ satisfy Craig’s interpolation property. The interpolation property for S was established in [9]. We shall prove a somewhat more general lemma.

Lemma 8.10. For any cofinite $\beta \subseteq \omega$, $\mathrm{S}_\beta$ satisfies Craig’s interpolation property.

Proof. For a cofinite $\beta \subseteq \omega$ let $F_\beta$ denote the formula $\bigvee_{n \notin \beta} \neg F_n$. Obviously, for any modal formula $\varphi$,
\[ \mathrm{S}_\beta \vdash \varphi \iff (\mathrm{S} \vdash \varphi \text{ and } \mathrm{GL} \vdash F_\beta \to \varphi) \iff \mathrm{GL} \vdash (R(\varphi) \vee F_\beta) \to \varphi, \]
by Corollary 2.3. This allows one to easily reduce the lemma to Craig’s interpolation property for GL. Indeed, assume that for some formulas $\varphi$ and $\psi$, $\mathrm{S}_\beta \vdash \varphi \to \psi$. Then
\[ \mathrm{GL} \vdash ((R(\varphi) \wedge R(\psi)) \vee F_\beta) \to (\varphi \to \psi), \]
because, trivially, $\mathrm{GL} \vdash R(\varphi \to \psi) \leftrightarrow (R(\varphi) \wedge R(\psi))$ (see Lemma 2.2). So we obtain
\[ \mathrm{GL} \vdash [(R(\varphi) \vee F_\beta) \wedge \varphi] \to [\psi \vee \neg(R(\psi) \vee F_\beta)]. \]
Since $F_\beta$ is letterless and $R(\varphi)$ is built up from subformulas of $\varphi$, by the interpolation property for GL we obtain a formula $\theta$ such that
\[ \mathrm{GL} \vdash (R(\varphi) \vee F_\beta) \wedge \varphi \to \theta, \qquad \mathrm{GL} \vdash \theta \to \psi \vee \neg(R(\psi) \vee F_\beta), \]
and all the variables of $\theta$ simultaneously occur in $\varphi$ and $\psi$. It follows directly that $\mathrm{S}_\beta \vdash \varphi \to \theta$ and $\mathrm{S}_\beta \vdash \theta \to \psi$. □

Despite the existence of the characterization 2.8 similar to Lemma 2.3, which allowed us to quickly reduce the construction of an interpolant in S to that of one in GL, it surprisingly turns out that D does not possess Craig’s interpolation property. The rest of this section is devoted to a proof of this fact.

Define the formulas $\varphi \rightleftharpoons \Box(\neg p \to \Box q_1) \to \Box q_1$ and $\psi \rightleftharpoons \Box(p \to \Box q_2) \to \Box q_2$.

Lemma 8.11. $\mathrm{D} \vdash \neg\varphi \to \psi$.

Proof. Obviously,
\[ \mathrm{GL} \vdash (\Box(\neg p \to \Box q_1) \wedge \Box(p \to \Box q_2)) \to \Box(\Box q_1 \vee \Box q_2), \]
whence
\[ \mathrm{D} \vdash (\Box(\neg p \to \Box q_1) \wedge \Box(p \to \Box q_2)) \to (\Box q_1 \vee \Box q_2), \]
but this formula is logically equivalent to $\neg\varphi \to \psi$. □

We are going to show that the implication $\neg\varphi \to \psi$ has no interpolant in D.
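The last step of the proof of Lemma 8.11 is a purely propositional equivalence, so it can be checked by a truth table. The sketch below is an illustration under stated assumptions, not part of the original argument: it reads $\varphi$ as $\Box(\neg p \to \Box q_1) \to \Box q_1$ and $\psi$ as $\Box(p \to \Box q_2) \to \Box q_2$, and treats the four boxed subformulas as independent atoms `A`, `B`, `X`, `Y` (hypothetical names).

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# Atoms standing for boxed subformulas (an assumption of this sketch):
#   A = box(not p -> box q1),  B = box(p -> box q2),
#   X = box q1,                Y = box q2.
def not_phi_implies_psi(A, B, X, Y):
    phi = implies(A, X)   # phi = A -> X
    psi = implies(B, Y)   # psi = B -> Y
    return implies(not phi, psi)

def conjunction_implies_disjunction(A, B, X, Y):
    return implies(A and B, X or Y)

# Compare the two formulas on all 16 valuations of the four atoms.
equivalent = all(
    not_phi_implies_psi(*v) == conjunction_implies_disjunction(*v)
    for v in product([False, True], repeat=4)
)
```

Both sides reduce to $\neg A \vee \neg B \vee X \vee Y$, so `equivalent` evaluates to `True`.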
Lemma 8.12. Let $\theta(p)$ be an interpolant for the implication $\neg\varphi \to \psi$ in D. Then for every D-model $\mathcal{K}$,
\[ \mathcal{K} \Vdash \theta(p) \iff \mathcal{K}^\circ \Vdash p. \]

Proof. Assume this is not the case. Then there is a D-model $\mathcal{K} = (K, \prec, \Vdash)$ such that one of the following conditions holds:
1. $\mathcal{K} \nVdash \theta(p)$ and $\mathcal{K}^\circ \Vdash p$;
2. $\mathcal{K} \Vdash \theta(p)$ and $\mathcal{K}^\circ \nVdash p$.
Cases 1 and 2 are symmetrical with respect to changing $\neg\varphi \to \psi$ to the opposite implication, so we shall only treat Case 1. We shall construct a D-model $\mathcal{K}'$, whose frame is isomorphic to that of $\mathcal{K}$, such that $\mathcal{K}' \nVdash \neg\varphi \to \theta(p)$. Then, by the completeness theorem for D, $\mathrm{D} \nvdash \neg\varphi \to \theta(p)$, which contradicts the definition of interpolant.

Let $b$ be the root of $\mathcal{K}$, and $r$ its stem element. For any $x \in K$ set
\[ x \Vdash' q_1 \iff x \neq r, \qquad x \Vdash' p \iff x \Vdash p. \]
The forcing of all other propositional letters is arbitrary. For any element $x \in K$ and any formula $\xi(p)$ in which only the letter $p$ occurs, we have
\[ x \Vdash \xi(p) \iff x \Vdash' \xi(p). \]
Hence, for the interpolant $\theta(p)$ we have $\mathcal{K}' \nVdash \theta(p)$.

To prove that $\mathcal{K}' \nVdash \varphi$, reason as follows. Since $r \succ b$ and $r \nVdash' q_1$, $\mathcal{K}' \nVdash \Box q_1$. We only need to check that $\mathcal{K}' \Vdash \Box(\neg p \to \Box q_1)$. Consider an arbitrary element $x \succ b$ of the model $\mathcal{K}'$. If $x \prec r$, then $x \Vdash' p$, because $\mathcal{K}^\circ \Vdash p$ and the forcing of $p$ at all the nodes $z \in K$ satisfying $b \prec z \prec r$ is the same. If $x \succeq r$, then $x \Vdash' \Box q_1$. So, in each case, $x \Vdash' \neg p \to \Box q_1$. □

Proposition 8.13. There is no interpolant in D for the implication $\neg\varphi \to \psi$.

Proof. Assume, for a contradiction, that $\theta(p)$ interpolates $\neg\varphi \to \psi$. Consider a formula $\tilde\theta(p)$ obtained from $\theta(p)$ by substituting $\bot$ for all occurrences of $p$ in $\theta$ which are not in the scope of any occurrence of $\Box$. Clearly, $\tilde\theta(p)$ is a modalized formula. We claim that
\[ \mathrm{D} \vdash \tilde\theta(p) \leftrightarrow \theta(p). \]
Let $\mathcal{K}$ be an arbitrary D-model, and let a model $\mathcal{K}'$ be obtained from $\mathcal{K}$ by stipulating that $\mathcal{K}' \nVdash p$ and preserving the forcing of $p$ at all nodes of $\mathcal{K}$ above the root. Then, obviously, by the definition of $\tilde\theta(p)$,
\[ \mathcal{K} \Vdash \tilde\theta(p) \iff \mathcal{K}' \Vdash \tilde\theta(p) \iff \mathcal{K}' \Vdash \theta(p). \]
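By Lemma 8.12,
\[ \mathcal{K}' \Vdash \theta(p) \iff \mathcal{K}'^\circ \Vdash p \iff \mathcal{K}^\circ \Vdash p \iff \mathcal{K} \Vdash \theta(p), \]
since $\theta(p)$ is an interpolant. Thus, we have proved that the formulas $\theta(p)$ and $\tilde\theta(p)$ are valid in the same D-models, that is, equivalent in D.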
As a corollary we conclude that $\tilde\theta(p)$ is an interpolant of $\neg\varphi \to \psi$ in D, as well. Now notice that no modalized formula $\eta(p)$ can satisfy the property
\[ (\forall\ \text{D-model } \mathcal{K})\ (\mathcal{K} \Vdash \eta(p) \iff \mathcal{K}^\circ \Vdash p). \]
Indeed, if this is the case, by Lemma 2.10 we obtain $\mathrm{S} \vdash \eta(p) \leftrightarrow p$, but for any modalized $\eta(p)$ this is impossible. (For example, otherwise the arithmetical interpretation of $\eta$ would provide a global truth definition for the standard model of arithmetic.) So, applying Lemma 8.12 once more, we conclude that $\tilde\theta(p)$, and thereby $\theta(p)$, cannot be the required interpolants. This proves Proposition 8.13. □

Corollary 8.14. The logics $\mathrm{D}_\beta$ ($\beta \subseteq \omega$, $\beta$ cofinite) do not satisfy Craig’s interpolation property.

Proof. It follows from the proof of Lemma 3.19 that D is a Turing extension of any logic of type $\mathrm{D}_\beta$. □

References

[1] S. N. Artëmov, Arithmetically complete modal theories, Semiotika i Informatika 14 (1980), 115-133; English transl., Amer. Math. Soc. Transl. (2) 135 (1987), 39-54.
[2] S. N. Artëmov, Applications of modal logic in proof theory, Voprosy Kibernetiki: Nonclassical Logics and Their Applications, “Nauka”, Moscow, 1982, pp. 3-20. (Russian)
[3] S. N. Artëmov, On modal logics axiomatizing provability, Izv. Akad. Nauk SSSR Ser. Mat. 49 (1985), 1123-1154; English transl. in Math. USSR-Izv. 27 (1985).
[4] S. N. Artëmov, Locally tabular propositional provability logics, Logical Methods of Constructing Efficient Algorithms, Kalinin. Gos. Univ., Kalinin, 1986, pp. 9-12. (Russian)
[5] L. D. Beklemishev, Normalization of proofs and interpolation for some provability logics, Uspekhi Mat. Nauk 42 (1987), no. 6, 179-180; English transl., Russian Math. Surveys 42 (1987), no. 6, 223-224.
[6] L. D. Beklemishev, On the classification of propositional provability logics, Izv. Akad. Nauk SSSR Ser. Mat. 53 (1989), 915-943; English transl., Math. USSR-Izv. 35 (1990), 247-275.
[7] L. D. Beklemishev, A provability logic without Craig’s interpolation property, Mat. Zametki 45 (1989), no. 6, 12-22; English transl.
in Math. Notes 45 (1989).
[8] G. Boolos, The unprovability of consistency; an essay in modal logic, Cambridge Univ. Press, Cambridge, 1979.
[9] G. Boolos, On systems of modal logic with provability interpretations, Theoria 46 (1980), 7-18.
[10] G. Boolos, Provability, truth and modal logic, Journal of Philosophical Logic 9 (1981), 1-7.
[11] G. Boolos, The logic of provability, Cambridge Univ. Press, Cambridge, 1993.
[12] G. Boolos and G. Sambin, Provability: the emergence of a mathematical modality, Studia Logica 50 (1991), 1-23.
[13] D. de Jongh and G. Japaridze, The logic of provability, Handbook of Proof Theory (S. R. Buss, ed.), Studies in Logic and the Foundations of Mathematics, vol. 137, Elsevier, Amsterdam, 1998, pp. 475-546.
[14] G. K. Dzhaparidze, Modal logical means of investigating provability, Thesis in Philosophy, Moscow State University, 1986. (Russian)
[15] S. Feferman, Arithmetization of metamathematics in a general setting, Fundamenta Mathematicae 49 (1960), 35-92.
[16] K. Fine, Logics containing K4. Part I, J. Symbolic Logic 39 (1974), 31-42.
[17] K. Gödel, Eine Interpretation des intuitionistischen Aussagenkalküls, Ergebnisse Math. Kolloq. 4 (1933), 39-40; English transl., Kurt Gödel Collected Works, Vol. 1, Oxford University Press, 1986, p. 301.
[18] S. Goryachev, On interpretability of some extensions of arithmetic, Mat. Zametki 40 (1986), 561-572; English transl. in Math. Notes 40 (1986).
[19] P. Hájek and P. Pudlák, Metamathematics of first order arithmetic, Springer-Verlag, Berlin, 1993.
[20] D. Hilbert and P. Bernays, Grundlagen der Mathematik, Vols. I, II, 2nd ed., Springer-Verlag, Berlin, 1968.
[21] G. Kreisel and A. Levy, Reflection principles and their use for establishing the complexity of axiomatic systems, Z. Math. Logik Grundlagen Math. 14 (1968), 97-142.
[22] D. Leivant, The optimality of induction as an axiomatization of arithmetic, J. Symbolic Logic 48 (1983), 182-184.
[23] M. H. Löb, Solution of a problem of Leon Henkin, J. Symbolic Logic 20 (1955), 115-118.
[24] K. Segerberg, An essay in classical modal logic, Filosofiska Föreningen och Filosofiska Inst. Uppsala Univ., University of Uppsala, Uppsala, 1971.
[25] W. Sieg, Fragments of arithmetic, Ann. Pure and Applied Logic 28 (1985), 33-71.
[26] C. Smoryński, The incompleteness theorems, Handbook of Mathematical Logic (J. Barwise, ed.), North Holland, Amsterdam, 1977, pp. 821-865.
[27] C. Smoryński, Beth’s theorem and self-referential sentences, Logic Colloquium ’77 (A. Macintyre et al., eds.), North Holland, Amsterdam, 1978, pp. 253-261.
[28] C. Smoryński, Self-reference and modal logic, Springer-Verlag, Berlin, 1985.
[29] R. M. Solovay, Provability interpretations of modal logic, Israel J. Math. 25 (1976), 287-304.
[30] A. Visser, The provability logics of recursively enumerable theories extending Peano Arithmetic at arbitrary theories extending Peano Arithmetic, J. Philosophical Logic 13 (1984), 97-113.

Steklov Mathematical Institute
Gubkina 8, 117966 Moscow, Russia

Translated by the author
Amer. Math. Soc. Transl. (2) Vol. 192, 1999

Lambek Calculus and Formal Grammars

Mati Pentus

Introduction

The question about the position of categorial grammars in the Chomsky hierarchy arose in the late 1950s and early 1960s. In 1960 Bar-Hillel, Gaifman, and Shamir [1] proved that a formal language can be generated by some basic categorial grammar if and only if the language is context-free. They conjectured (see also [7]) that the same holds for Lambek grammars, i.e., for categorial grammars based on a syntactic calculus introduced in 1958 by J. Lambek [10] (this calculus operates with three connectives: multiplication or concatenation of languages, left division, and right division).

The proof of one half of this conjecture (namely, that every context-free language can be generated by some Lambek grammar) in fact coincides with the proof for the case of basic categorial grammars. The converse remained an open problem for several years. A proof was proposed in [8], but it contains an error (this was pointed out in [3]). W. Buszkowski [3, 4, 5] obtained partial results for the fragment without one division and for a product-free fragment with a restriction on division nesting. In [2] J. van Benthem mentioned the conjecture as an open problem of contemporary mathematical linguistics.

From the logical point of view the Lambek calculus is more interesting than the calculus behind basic categorial grammars. In particular, the rule of equivalent type substitution is admissible in the Lambek calculus. It is known that the Lambek calculus can be embedded into certain fragments of noncommutative linear logic and cyclic linear logic.

Our main aim is to prove the conjecture about context-freeness of all languages generated by Lambek grammars.

1991 Mathematics Subject Classification. Primary 68S05; Secondary 03B20, 03D05, 68Q45, 68Q50.
Key words and phrases. Lambek calculus, categorial grammar, context-free grammar, interpolation property.
The research described in this publication was made possible in part by the Russian Foundation for Basic Research (projects 96-01-01395 and 98-01-00249).
©1999 American Mathematical Society
This is achieved using a free group interpretation of noncommutative linear logic, a modification of the Craig interpolation property proof by Maehara and Schütte, and combinatorial techniques.

The main results are the following.
1. We prove context-freeness of languages generated by categorial grammars based on any of the following calculi:
• the Lambek calculus,
• the Lambek calculus allowing empty premises,
• the Lambek calculus with the unit,
• the multiplicative fragment of cyclic linear logic.
2. We prove that all elementary fragments of the Lambek calculus have the Craig interpolation property.
3. We prove that the conjoinability relation (on syntactic types) is decidable and that it is complete with respect to the free group interpretation.

Section 1 defines the main notions. Context-free grammars and languages are defined in 1.1. Subsections 1.2, 1.3, and 1.4 contain definitions related to the Lambek syntactic calculus. This calculus deals with syntactic types (we shall call them simply types for brevity) which are built from primitive types using three binary connectives: multiplication, left division, and right division. Lambek categorial grammars, which are based on the Lambek syntactic calculus, are defined in 1.5.

In Section 2 the free group interpretation of the Lambek calculus is studied. In 2.1 we define this interpretation as the natural translation of the three connectives of the Lambek calculus into multiplication, left division, and right division in a free group. In 2.2 correctness of the Lambek calculus with respect to this interpretation is proved. Note that completeness with respect to this interpretation does not hold. The exact relation between Lambek calculus derivability and equality of images of types in the free group will be elucidated later, in Section 8. In 2.3 we establish a fact about free groups. This fact will be needed to prove the main lemma in 5.3.
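The free group interpretation described for Section 2 can be illustrated with a small sketch. It is an assumption of this illustration, not a quotation of the definition in 2.1, that a primitive type is sent to the corresponding generator, $A \cdot B$ to $[A][B]$, $A \backslash B$ to $[A]^{-1}[B]$, and $A / B$ to $[A][B]^{-1}$. Types are encoded as nested tuples and group elements as reduced words; all names (`reduce_word`, `inverse`, `interpret`, `interpret_sequence`) are hypothetical.

```python
def reduce_word(word):
    """Cancel adjacent pairs g g^-1 in a sequence of (generator, +1/-1) pairs."""
    out = []
    for g, e in word:
        if out and out[-1] == (g, -e):
            out.pop()
        else:
            out.append((g, e))
    return tuple(out)

def inverse(word):
    """Inverse of a free-group word: reverse it and flip every exponent."""
    return tuple((g, -e) for g, e in reversed(word))

def interpret(t):
    """Map a type to a reduced word.

    A type is a primitive type (a string) or a triple (op, a, b), where
    op is "*" for a*b, "\\" for a\\b (left division), "/" for a/b.
    """
    if isinstance(t, str):
        return ((t, 1),)
    op, a, b = t
    x, y = interpret(a), interpret(b)
    if op == "*":
        return reduce_word(x + y)
    if op == "\\":
        return reduce_word(inverse(x) + y)
    if op == "/":
        return reduce_word(x + inverse(y))
    raise ValueError(op)

def interpret_sequence(types):
    """Interpret the antecedent of a sequent as the product of its types."""
    w = ()
    for t in types:
        w = reduce_word(w + interpret(t))
    return w

# The familiar sequent (p/q), q -> p: both sides get the same image.
left = interpret_sequence([("/", "p", "q"), "q"])
right = interpret("p")
```

Here `left == right`, in line with correctness of the calculus with respect to the interpretation. The converse fails in general: equal images do not guarantee derivability, which is the incompleteness noted above.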
Section 3 introduces the notion of a thin sequent (a sequent in which every primitive type involved in it occurs precisely once positively and once negatively). It is proved that every Lambek calculus derivation can be obtained via substitution from a derivation containing only thin sequents (the same holds for multiplicative fragments of linear logic systems, both commutative and noncommutative).

Section 4 contains the proof of the Craig interpolation theorem for the Lambek calculus (4.1). This proof, based on the technique of Maehara and Schütte, is a simple modification of D. Roorda’s proof for a variant of the Lambek calculus allowing empty premises. In 4.2 we prove that in the case of a thin sequent the length of the interpolant constructed according to the technique of Maehara and Schütte is equal to the length of the reduced word that represents the interpolated part of the original sequent in the free group interpretation. This fact will play an essential role in 5.3.

Section 5 is devoted to the proof of the main result: all languages generated by Lambek grammars are context-free. In 5.1 the construction of a context-free
grammar corresponding to a given Lambek grammar is given. A finite set of Lambek calculus types is used as the non-terminal alphabet of the context-free grammar. The context-free productions are based on derivable Lambek calculus sequents of bounded length. The trivial natural relation between context-free grammars and calculi based on the cut rule is formalized in 5.2. In 5.3 we prove the main lemma, which states that the Lambek calculus is conservative over a calculus corresponding to the context-free grammar constructed in 5.1. The theorem about context-freeness of all languages generated by Lambek grammars is proved in 5.4. (We give an improved exposition of the proof published in [12, 14, 16].)

Section 6 deals with the Craig interpolation property in elementary fragments of the Lambek calculus. We prove that the fragments $L(\backslash, /)$, $L(\backslash)$, and $L(/)$ have the interpolation property (6.1). The same result for the other elementary fragments is due to D. Roorda [18, 19]. In addition, we introduce the notion of generalized interpolation property, which is of interest in fragments without multiplication. It is proved that the fragments $L(\backslash)$ and $L(/)$ have the generalized interpolation property (6.3), whereas $L(\backslash, /)$ (the product-free Lambek calculus) does not (6.2). These results were first published in [16].

In Section 7 an analog of the main theorem from Section 5 is proved for the product-free Lambek calculus. Here the non-terminal alphabet of the obtained context-free grammar is a finite set of product-free types. The proof is essentially the same as in [17]. In [6] W. Buszkowski presented a similar proof for the case where the designated type of a Lambek grammar is primitive.

In Section 8 we give the definition of conjoinable syntactic types (from [2]) and prove that two types are conjoinable if and only if their free group interpretations are equal (this result was published in [11, 13, 15]).
This yields a positive answer to the decidability problem for the conjoinability relation. (The problem was formulated in [2].)

Section 9 deals with the multiplicative fragment of cyclic linear logic. Here all results and proofs are analogous to those concerning the Lambek calculus. The multiplicative fragment of cyclic linear logic is defined in 9.1. Next, in 9.2 we define grammars based on this fragment. Correctness of the multiplicative cyclic linear logic with respect to the free group interpretation is established in 9.3. Thin sequents for the multiplicative cyclic linear logic are defined in 9.4. The interpolation theorem for this fragment is proved in 9.5. Finally, in 9.6 we formulate the following theorem: the class of languages generated by grammars based on the multiplicative fragment of the cyclic linear logic coincides with the class of all context-free languages.

The author would like to express his most sincere gratitude to his thesis advisor, Professor S. N. Artemov, for formulating the task and for invaluable assistance throughout the research and manuscript preparation, to Professor M. I. Kanovich for helpful discussions at his lectures on formal grammars in 1992/1993 at Moscow State University, and to Professors V. A. Uspensky, S. I. Adian, and J. van Benthem for their attention to this work.
1. Preliminaries

By $\mathbb{N}$ we denote the set of all natural numbers including 0. By $\mathbb{Z}$ we denote the set of all integers.

Let $\mathcal{M}$ be any non-empty set, called an alphabet. We shall call its elements letters. We define a word over the alphabet $\mathcal{M}$ as a finite (possibly empty) sequence $t_1 t_2 \dots t_n$ of elements of $\mathcal{M}$. Two words $t_1 t_2 \dots t_n$ and $s_1 s_2 \dots s_m$ are equal if and only if they coincide as sequences, i.e., if $n = m$ and $t_1 = s_1$, $t_2 = s_2$, ..., $t_n = s_n$. The empty word will be denoted by $\varepsilon$. Let $\mathcal{M}^*$ stand for the set of all words over the alphabet $\mathcal{M}$. The set of all non-empty words over $\mathcal{M}$ will be denoted by $\mathcal{M}^+$. We call a language any set of words. The length of a word is defined in the natural way: $|t_1 t_2 \dots t_n| \rightleftharpoons n$.

1.1. Context-free grammars.

Definition 1.1. A context-free grammar is a quadruple $(T, W, \sigma, \mathcal{R})$, where $T$ and $W$ are two disjoint finite sets, $\sigma$ is an element of $W$, and $\mathcal{R}$ is a finite set of context-free productions of the form $\alpha \Rightarrow u$, where $\alpha \in W$ and $u \in (T \cup W)^+$. The set $T$ is called the alphabet of terminal symbols, and $W$ is called the alphabet of non-terminal symbols. The symbol $\sigma$ is called the start symbol.

A word $w'$ is directly derivable from a word $w$ in a grammar $(T, W, \sigma, \mathcal{R})$ iff $w = v_1 \alpha v_2$, $w' = v_1 u v_2$ for some $v_1, v_2 \in (T \cup W)^*$, and $\alpha \Rightarrow u$ is a rule from $\mathcal{R}$. We say that $w'$ is derivable from $w$ in $(T, W, \sigma, \mathcal{R})$ iff there exists a sequence of words $w_0, w_1, \dots, w_n$ such that $w_i \in (T \cup W)^*$, $w = w_0$, $w' = w_n$, and for every $i \le n - 1$ the word $w_{i+1}$ is directly derivable from $w_i$.

The language generated by the context-free grammar $(T, W, \sigma, \mathcal{R})$ (denoted by $\mathcal{G}(T, W, \sigma, \mathcal{R})$) is defined as the set of all words over the alphabet $T$ that are derivable in this grammar from the one-letter word $\sigma$.

Remark 1.2. Many authors allow one to use productions of the form $\alpha \Rightarrow \varepsilon$ in context-free grammars. It is well known that the difference is inessential.
Namely, for every context-free grammar that involves rules of the form $\alpha \Rightarrow \varepsilon$ one can effectively construct a context-free grammar in our sense so that the difference of the languages generated by these two grammars is either empty or contains only the empty word (cf. [9]).

Definition 1.3. A language is called context-free (or algebraic) iff there exists a context-free grammar that generates the given language.

1.2. Lambek calculus. We consider the syntactic calculus introduced in [10]. We shall denote it by $L$ and call it the Lambek calculus. This calculus occupies a central position in modern research in categorial grammar (cf. [2, p. 31]).

Assume that a countable set $\mathrm{Var} = \{p_1, p_2, p_3, \dots\}$ is given. The elements of this set will be referred to as primitive types. The Lambek calculus involves three binary connectives $\bullet$, $\backslash$, $/$ that are called multiplication, left division, and right division, respectively. Let $\mathrm{Tp}$ be the smallest set satisfying the following two conditions:
• $\mathrm{Var} \subseteq \mathrm{Tp}$;
• if $A \in \mathrm{Tp}$ and $B \in \mathrm{Tp}$, then $(A \bullet B) \in \mathrm{Tp}$, $(A \backslash B) \in \mathrm{Tp}$, and $(A / B) \in \mathrm{Tp}$.

The elements of $\mathrm{Tp}$ will be called syntactic types or simply types. In some cases we shall omit parentheses with the convention that
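• $A \bullet B \bullet C$ stands for $(A \bullet B) \bullet C$;
• the connective $\bullet$ has higher precedence than $\backslash$ and $/$.

Capital letters $A$, $B$, ... range over types. Capital Greek letters range over finite (possibly empty) sequences of types. The empty sequence of types is denoted by $\Lambda$. The letters $p$ and $q$ range over primitive types.

Sequents of the Lambek calculus are of the form $\Gamma \to A$, where $A$ is a type and $\Gamma$ is a non-empty sequence of types. The left-hand side of a sequent is called its antecedent and the right-hand side is called its succedent.

The axioms of the Lambek calculus are all sequents of the form $p_i \to p_i$, where $p_i \in \mathrm{Var}$. The derivation rules of the Lambek calculus are the following:

$$\frac{\Gamma \to A \quad \Delta \to B}{\Gamma\Delta \to A \bullet B}\ (\to\bullet), \qquad \frac{\Gamma A B \Delta \to C}{\Gamma (A \bullet B) \Delta \to C}\ (\bullet\to),$$

$$\frac{A\Pi \to B}{\Pi \to A \backslash B}\ (\to\backslash), \text{ where } \Pi \ne \Lambda, \qquad \frac{\Pi \to A \quad \Gamma B \Delta \to C}{\Gamma \Pi (A \backslash B) \Delta \to C}\ (\backslash\to),$$

$$\frac{\Pi A \to B}{\Pi \to B / A}\ (\to/), \text{ where } \Pi \ne \Lambda, \qquad \frac{\Pi \to A \quad \Gamma B \Delta \to C}{\Gamma (B / A) \Pi \Delta \to C}\ (/\to),$$

$$\frac{\Pi \to B \quad \Gamma B \Delta \to A}{\Gamma \Pi \Delta \to A}\ (\mathrm{cut}).$$

The cut-elimination theorem for this calculus is proved in [10].

We write $\mathcal{C} \vdash \Gamma \to A$ if the sequent $\Gamma \to A$ is derivable in the calculus $\mathcal{C}$. In particular, $L \vdash \Gamma \to A$ means that the sequent $\Gamma \to A$ is derivable in the Lambek calculus.

1.3. Auxiliary notions.

Definition 1.4. The length $\|A\|$ of a type $A$ is defined as the total number of primitive type occurrences in $A$:

$$\|p_i\| \rightleftharpoons 1, \qquad \|A \bullet B\| = \|A \backslash B\| = \|A / B\| \rightleftharpoons \|A\| + \|B\|.$$

The length of a sequence of types is defined in the natural way:

$$\|A_1 \dots A_n\| \rightleftharpoons \|A_1\| + \dots + \|A_n\|.$$

Definition 1.5. The set of all primitive types occurring in a type $A$ is denoted by $\mathrm{Var}(A)$.

Definition 1.6. For every primitive type $p \in \mathrm{Var}$ we define two functions $\#_p^+$ and $\#_p^-$ from the set $\mathrm{Tp}$ into $\mathbb{N}$, and also a function $\#_p$ from $\mathrm{Tp}$ into $\mathbb{Z}$:

$$\begin{aligned}
\#_p^+(q) &\rightleftharpoons \begin{cases} 1, & \text{if } p = q, \\ 0, & \text{if } q \in \mathrm{Var} \text{ and } p \ne q, \end{cases} \\
\#_p^-(q) &\rightleftharpoons 0, \quad \text{if } q \in \mathrm{Var}, \\
\#_p^+(A \bullet B) &\rightleftharpoons \#_p^+(A) + \#_p^+(B), \\
\#_p^-(A \bullet B) &\rightleftharpoons \#_p^-(A) + \#_p^-(B), \\
\#_p^+(A \backslash B) &\rightleftharpoons \#_p^-(A) + \#_p^+(B), \\
\#_p^-(A \backslash B) &\rightleftharpoons \#_p^+(A) + \#_p^-(B), \\
\#_p^+(A / B) &\rightleftharpoons \#_p^+(A) + \#_p^-(B), \\
\#_p^-(A / B) &\rightleftharpoons \#_p^-(A) + \#_p^+(B), \\
\#_p(A) &\rightleftharpoons \#_p^+(A) - \#_p^-(A).
\end{aligned}$$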
These definitions are extended to sequences of types and to sequents as follows:

$$\begin{aligned}
\#_p^+(A_1 \dots A_n) &\rightleftharpoons \#_p^+(A_1) + \dots + \#_p^+(A_n), \\
\#_p^-(A_1 \dots A_n) &\rightleftharpoons \#_p^-(A_1) + \dots + \#_p^-(A_n), \\
\#_p(A_1 \dots A_n) &\rightleftharpoons \#_p(A_1) + \dots + \#_p(A_n), \\
\#_p^+(\Pi \to A) &\rightleftharpoons \#_p^-(\Pi) + \#_p^+(A), \\
\#_p^-(\Pi \to A) &\rightleftharpoons \#_p^+(\Pi) + \#_p^-(A), \\
\#_p(\Pi \to A) &\rightleftharpoons \#_p(A) - \#_p(\Pi).
\end{aligned}$$

1.4. Variants of the Lambek calculus. Let a signature $\Sigma \subseteq \{\backslash, /, \bullet\}$ be given. We denote by $\mathrm{Tp}(\Sigma)$ the set of all types containing only connectives from the given signature $\Sigma$. The elementary fragment of the calculus $L$ corresponding to a signature $\Sigma$ is the calculus obtained from $L$ by removing all types that do not belong to $\mathrm{Tp}(\Sigma)$. We denote this elementary fragment by $L(\Sigma)$. The calculus $L(\backslash, /)$ is called the product-free Lambek calculus.

Definition 1.7. The calculus $L^*$ (cf. [2]) is obtained from the original Lambek calculus $L$ by allowing antecedents to be empty and dropping the condition $\Pi \ne \Lambda$ in the rules $(\to\backslash)$ and $(\to/)$.

Next we define the calculus $L_1$ (the Lambek calculus with the unit).

Definition 1.8. Let $\mathrm{Tp}_1$ be the smallest set satisfying the following conditions:
• $\mathbf{1} \in \mathrm{Tp}_1$;
• $\mathrm{Var} \subseteq \mathrm{Tp}_1$;
• if $A \in \mathrm{Tp}_1$ and $B \in \mathrm{Tp}_1$, then $(A \bullet B) \in \mathrm{Tp}_1$, $(A \backslash B) \in \mathrm{Tp}_1$, and $(A / B) \in \mathrm{Tp}_1$.

The sequents of the calculus $L_1$ are of the form $\Gamma \to A$, where $A \in \mathrm{Tp}_1$ and $\Gamma \in (\mathrm{Tp}_1)^*$. The axioms of $L_1$ are all sequents of the form $p_i \to p_i$, where $p_i \in \mathrm{Var}$, as well as the sequent $\to \mathbf{1}$. The calculus $L_1$ has all the derivation rules of $L^*$ and, in addition, the rule

$$\frac{\Gamma\Delta \to C}{\Gamma \mathbf{1} \Delta \to C}\ (\mathbf{1}\to).$$

1.5. Lambek grammars.

Definition 1.9. A Lambek grammar is a triple $(T, H, \triangleright)$, where $T$ is a finite set (the alphabet), $H$ is a type of the Lambek calculus, and $\triangleright$ is a finite binary relation $\triangleright \subseteq \mathrm{Tp} \times T$. The language generated by the Lambek grammar $(T, H, \triangleright)$ is defined as the set of all strings $t_1 \dots t_n$ over the alphabet $T$ for which there exists a derivable (in $L$) sequent $B_1 \dots B_n \to H$ such that $B_i \triangleright t_i$ for all $i \le n$. We shall denote this language by $\mathcal{L}_L(T, H, \triangleright)$.

Categorial grammars based on other sequent calculi are defined similarly.
For the sake of uniformity of the definitions we stipulate that the empty word is not included in the languages generated by grammars based on $L^*$ and $L_1$.
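The bookkeeping of Definitions 1.4 and 1.6 can be sketched in Python. The encoding used here (a string for a primitive type, a tuple `(op, A, B)` for a compound type) and all function names are assumptions made for illustration; they are not part of the text.

```python
from collections import Counter

# Hypothetical encoding (not from the text): a primitive type is a string,
# a compound type is a tuple (op, A, B) with op in {"*", "\\", "/"}.

def length(a):
    """||A||: the number of primitive type occurrences in A (Definition 1.4)."""
    if isinstance(a, str):
        return 1
    _, left, right = a
    return length(left) + length(right)

def counts(a):
    """Return (pos, neg), Counters holding #_p^+(A) and #_p^-(A) for every p."""
    if isinstance(a, str):
        return Counter({a: 1}), Counter()
    op, left, right = a
    lp, ln = counts(left)
    rp, rn = counts(right)
    if op == "*":    # A * B keeps both polarities
        return lp + rp, ln + rn
    if op == "\\":   # A \ B flips the polarity of A
        return ln + rp, lp + rn
    if op == "/":    # A / B flips the polarity of B
        return lp + rn, ln + rp
    raise ValueError(f"unknown connective: {op!r}")

def sequent_counts(antecedent, succedent):
    """Positive and negative counts of a sequent: antecedent types are flipped."""
    pos, neg = Counter(), Counter()
    for a in antecedent:
        ap, an = counts(a)
        pos += an
        neg += ap
    sp, sn = counts(succedent)
    return pos + sp, neg + sn

# The sequent (p/p) p -> p/(p\p)
antecedent = [("/", "p", "p"), "p"]
succedent = ("/", "p", ("\\", "p", "p"))
pos, neg = sequent_counts(antecedent, succedent)
print(pos["p"], neg["p"])  # prints "3 3"
```

In this sequent the primitive type p occurs three times positively and three times negatively, so the positive and negative counts balance.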
2. Free group interpretation

2.1. Definition of free group interpretation. Let $F(\mathrm{Var})$ stand for the free group generated by the enumerable set of all primitive types $\mathrm{Var} = \{p_1, p_2, p_3, \dots\}$.

By free group we mean the following particular representation. We introduce the extended alphabet $\mathrm{Var}'$, obtained by adding to the set $\mathrm{Var}$ a new symbol $p_i^{-1}$ for each $p_i \in \mathrm{Var}$. We shall consider reduced words over this extended alphabet. A word $u$ over the alphabet $\mathrm{Var}'$ is said to be reduced if it does not contain adjacent occurrences of $p_i$ and $p_i^{-1}$. The empty word, denoted by $\varepsilon$, is also reduced. The set $F(\mathrm{Var})$ consists of all reduced words. Multiplication on this set is defined by induction on word length.
• If $u = u' p_i$ and $v = p_i^{-1} v'$ for some $i$, then $uv \rightleftharpoons u'v'$.
• If $u = u' p_i^{-1}$ and $v = p_i v'$ for some $i$, then $uv \rightleftharpoons u'v'$.
• Otherwise $uv$ is obtained simply by juxtaposition.

It is obvious that the product of any two reduced words is reduced. The identity element of the free group $F(\mathrm{Var})$ is the empty word $\varepsilon$. For any element $u \in F(\mathrm{Var})$, we define $|u|$ as the length of the reduced word $u$.

Definition 2.1. The free group interpretation (written as $[\![\ ]\!]$) is the following mapping of types and finite sequences of types into $F(\mathrm{Var})$:

$$\begin{aligned}
[\![p_i]\!] &\rightleftharpoons p_i, \\
[\![A \bullet B]\!] &\rightleftharpoons [\![A]\!] [\![B]\!], \\
[\![A \backslash B]\!] &\rightleftharpoons [\![A]\!]^{-1} [\![B]\!], \\
[\![A / B]\!] &\rightleftharpoons [\![A]\!] [\![B]\!]^{-1}, \\
[\![A_1 \dots A_n]\!] &\rightleftharpoons [\![A_1]\!] \dots [\![A_n]\!].
\end{aligned}$$

Lemma 2.2. For any type $A$, $|[\![A]\!]| \le \|A\|$.

Proof. By induction on the construction of $A$. □

2.2. Soundness.

Lemma 2.3. If a sequent $\Gamma \to C$ is derivable in the Lambek calculus, then $[\![\Gamma]\!] = [\![C]\!]$.

D. Roorda obtained this result in terms of "atomic markings" and "balance". We present an immediate proof (from [11]) in terms of free groups.

Proof. Induction on derivations.

Case 1. Axiom. Trivial.

Case 2.
$$\frac{\Gamma \to A \quad \Delta \to B}{\Gamma\Delta \to A \bullet B}\ (\to\bullet).$$
By the induction hypothesis $[\![\Gamma]\!] = [\![A]\!]$ and $[\![\Delta]\!] = [\![B]\!]$. Consequently $[\![\Gamma\Delta]\!] = [\![A]\!] [\![B]\!] = [\![A \bullet B]\!]$.

Case 3.
$$\frac{\Gamma A B \Delta \to C}{\Gamma (A \bullet B) \Delta \to C}\ (\bullet\to).$$
Obvious.
Case 4.
$$\frac{A\Pi \to B}{\Pi \to A \backslash B}\ (\to\backslash).$$
Multiplying the equality $[\![A]\!] [\![\Pi]\!] = [\![B]\!]$ by $[\![A]\!]^{-1}$ on the left, one obtains $[\![\Pi]\!] = [\![A]\!]^{-1} [\![B]\!]$. Thus $[\![\Pi]\!] = [\![A \backslash B]\!]$.

Case 5. $(\to/)$ Similar to the previous case.

Case 6.
$$\frac{\Pi \to A \quad \Gamma B \Delta \to C}{\Gamma \Pi (A \backslash B) \Delta \to C}\ (\backslash\to).$$
If $[\![\Pi]\!] = [\![A]\!]$, then $[\![\Pi]\!] [\![A]\!]^{-1} = \varepsilon$. On the other hand, $[\![\Gamma]\!] [\![B]\!] [\![\Delta]\!] = [\![C]\!]$. Thus $[\![\Gamma]\!] [\![\Pi]\!] [\![A]\!]^{-1} [\![B]\!] [\![\Delta]\!] = [\![C]\!]$, which yields $[\![\Gamma]\!] [\![\Pi]\!] [\![A \backslash B]\!] [\![\Delta]\!] = [\![C]\!]$.

Case 7. $(/\to)$ Similar to the previous case. □

2.3. A property of free groups. The following lemma demonstrates that juxtaposed reduced words can reduce to the empty word only if at least one of the given words "loses" at least half of its symbols during reduction with one of its immediate neighbors.

Lemma 2.4. If $u_1, \dots, u_n \in F(\mathrm{Var})$, $n > 1$, and $u_1 \dots u_n = \varepsilon$, then there exists an index $k < n$ such that $|u_k u_{k+1}| \le \max(|u_k|, |u_{k+1}|)$.

Proof. For any two elements $u_i$ and $u_{i+1}$ in $F(\mathrm{Var})$ there exist three reduced words $x_i$, $y_{i,i+1}$, and $z_{i+1}$ in $F(\mathrm{Var})$ such that $u_i = x_i y_{i,i+1}$, $u_{i+1} = y_{i,i+1}^{-1} z_{i+1}$, $u_i u_{i+1} = x_i z_{i+1}$, and the words $x_i y_{i,i+1}$, $y_{i,i+1}^{-1} z_{i+1}$, $x_i z_{i+1}$ are reduced. Evidently, $|u_i| = |x_i| + |y_{i,i+1}|$, $|u_{i+1}| = |y_{i,i+1}^{-1}| + |z_{i+1}| = |y_{i,i+1}| + |z_{i+1}|$, and $|u_i u_{i+1}| = |x_i| + |z_{i+1}|$.

Assume for a contradiction that the inequalities $|u_i u_{i+1}| > |u_i|$ and $|u_i u_{i+1}| > |u_{i+1}|$ hold for every index $i < n$.

From $|u_i u_{i+1}| > |u_i|$ we obtain $|x_i| + |y_{i,i+1}| < |x_i| + |z_{i+1}|$, whence $|y_{i,i+1}| < |z_{i+1}|$, and consequently $|y_{i,i+1}| < \frac{1}{2} |u_{i+1}|$. Similarly, from $|u_i u_{i+1}| > |u_{i+1}|$ we derive $|y_{i,i+1}| + |z_{i+1}| < |x_i| + |z_{i+1}|$, whence $|y_{i,i+1}| < |x_i|$, which in turn yields $|y_{i,i+1}| < \frac{1}{2} |u_i|$.

Now let us consider an arbitrary index $i$ such that $1 < i < n$. Recall that $u_i = y_{i-1,i}^{-1} z_i$ and, on the other hand, $u_i = x_i y_{i,i+1}$. Both words $y_{i-1,i}^{-1} z_i$ and $x_i y_{i,i+1}$ are reduced, and thus they coincide. In view of $|y_{i-1,i}^{-1}| < \frac{1}{2} |u_i|$ and $|y_{i,i+1}| < \frac{1}{2} |u_i|$, we have $u_i = y_{i-1,i}^{-1} w_i y_{i,i+1}$, $x_i = y_{i-1,i}^{-1} w_i$, and $z_i = w_i y_{i,i+1}$ for a suitable reduced word $w_i$. Note that both $y_{i-1,i}^{-1} w_i$ and $w_i y_{i,i+1}$ are reduced.
Substituting in $u_1 \dots u_n = \varepsilon$ the word $x_1 y_{1,2}$ for $u_1$, $y_{n-1,n}^{-1} z_n$ for $u_n$, and $y_{i-1,i}^{-1} w_i y_{i,i+1}$ for $u_i$ (where $1 < i < n$), we obtain $x_1 w_2 w_3 \dots w_{n-1} z_n = \varepsilon$.

Now we can check that the word $x_1 w_2 w_3 \dots w_{n-1} z_n$ is reduced. Note that $x_1 w_2$ is reduced, since $x_1 z_2 = x_1 (w_2 y_{2,3})$ is reduced. Similarly, $w_{n-1} z_n$ is reduced, since $x_{n-1} z_n = (y_{n-2,n-1}^{-1} w_{n-1}) z_n$ is reduced. Finally, for every index $i$ with $1 < i < n - 1$ the word $w_i w_{i+1}$ is reduced, for $x_i z_{i+1} = (y_{i-1,i}^{-1} w_i)(w_{i+1} y_{i+1,i+2})$ is reduced.

Thus we have established that the word $x_1 w_2 w_3 \dots w_{n-1} z_n$ is reduced and $x_1 w_2 w_3 \dots w_{n-1} z_n = \varepsilon$. Consequently each of the words $x_1$, $w_2$, $w_3$, ..., $w_{n-1}$, and $z_n$ is empty. But they all must be non-empty, because $|y_{i-1,i}^{-1}| < \frac{1}{2} |u_i|$ and $|y_{i,i+1}| < \frac{1}{2} |u_i|$. Contradiction. □
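The free group interpretation of Definition 2.1 and the statements of Lemmas 2.3 and 2.4 can be tried out on small examples with the following Python sketch. The representation (a letter is a pair `(name, exponent)`, a word is a tuple of letters, a type is a string or a tuple `(op, A, B)`) and the function names are assumptions chosen for illustration only. Reduction is done with a stack, which yields the reduced word.

```python
# Hypothetical encoding (not from the text): a letter of the extended alphabet
# is a pair (name, exponent) with exponent +1 or -1; a word is a tuple of letters.

def reduce_word(letters):
    """Cancel adjacent pairs p, p^-1 using a stack; returns the reduced word."""
    out = []
    for name, exp in letters:
        if out and out[-1] == (name, -exp):
            out.pop()
        else:
            out.append((name, exp))
    return tuple(out)

def mul(u, v):
    """Product of two reduced words in the free group."""
    return reduce_word(u + v)

def inv(u):
    """Inverse of a reduced word."""
    return tuple((name, -exp) for name, exp in reversed(u))

def interp(a):
    """[[A]] for a type A given as a string or a tuple (op, A, B)."""
    if isinstance(a, str):
        return ((a, 1),)
    op, left, right = a
    l, r = interp(left), interp(right)
    if op == "*":
        return mul(l, r)
    if op == "\\":
        return mul(inv(l), r)
    if op == "/":
        return mul(l, inv(r))
    raise ValueError(f"unknown connective: {op!r}")

def interp_seq(types):
    """[[A_1 ... A_n]] as the product of the interpretations of the A_i."""
    word = ()
    for a in types:
        word = mul(word, interp(a))
    return word

def shrinking_index(words):
    """First k (1-based) with |u_k u_{k+1}| <= max(|u_k|, |u_{k+1}|), else None."""
    for k in range(len(words) - 1):
        u, v = words[k], words[k + 1]
        if len(mul(u, v)) <= max(len(u), len(v)):
            return k + 1
    return None

# Derivable sequent (q3/q2) q1 -> q3/(q1\q2): both sides get the same word.
antecedent = [("/", "q3", "q2"), "q1"]
succedent = ("/", "q3", ("\\", "q1", "q2"))
same = interp_seq(antecedent) == interp(succedent)

# u_1 = [[q3/q2]], u_2 = [[q1]], u_3 = [[succedent]]^-1 multiply to the empty word.
words = [interp(antecedent[0]), interp(antecedent[1]), inv(interp(succedent))]
k = shrinking_index(words)
```

Here `same` is `True`, in agreement with Lemma 2.3, and `k` is an index of the kind whose existence Lemma 2.4 guarantees.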
3. Thin sequents

In this section we introduce the notion of "thin" sequents and show that every sequent derivable in the Lambek calculus may be obtained from some thin sequent via substitution.

Definition 3.1. A sequent $\Pi \to A$ is thin iff the inequalities $\#_p^+(\Pi \to A) \le 1$ and $\#_p^-(\Pi \to A) \le 1$ hold for every $p \in \mathrm{Var}$ (i.e., each primitive type occurs in the sequent positively at most once and negatively at most once).

Definition 3.2. We shall refer to any function from $\mathrm{Var}$ to $\mathrm{Var}$ as a primitive type substitution. Every primitive type substitution $\phi$ induces a function from $\mathrm{Tp}$ to $\mathrm{Tp}$ (also denoted by $\phi$):

$$\phi(E \bullet F) \rightleftharpoons \phi(E) \bullet \phi(F); \qquad \phi(E \backslash F) \rightleftharpoons \phi(E) \backslash \phi(F); \qquad \phi(E / F) \rightleftharpoons \phi(E) / \phi(F).$$

We also extend the function $\phi$ to sequents:

$$\phi(E_1 \dots E_m \to F) \rightleftharpoons \phi(E_1) \dots \phi(E_m) \to \phi(F).$$

Lemma 3.3. Let $\phi$ be a primitive type substitution. If in any Lambek calculus derivation we replace every sequent $\Gamma \to C$ by $\phi(\Gamma \to C)$, then the resulting tree is a correct derivation in the Lambek calculus.

Proof. Induction on derivation length. □

Theorem 3.4. A sequent $\Pi \to A$ is derivable in the Lambek calculus if and only if there exist a thin derivable sequent $\Theta \to B$ and a primitive type substitution $\phi$ such that $\Pi \to A = \phi(\Theta \to B)$.

Example 3.5. Consider the sequent $(p/p)\, p \to p/(p \backslash p)$ in the role of $\Pi \to A$. This sequent can be derived in the Lambek calculus as follows:

$$\frac{p \to p \quad p \to p}{(p/p)\, p \to p}\ (/\to), \qquad \frac{p \to p \quad (p/p)\, p \to p}{(p/p)\, p\, (p \backslash p) \to p}\ (\backslash\to), \qquad \frac{(p/p)\, p\, (p \backslash p) \to p}{(p/p)\, p \to p/(p \backslash p)}\ (\to/).$$

Take $(q_3/q_2)\, q_1 \to q_3/(q_1 \backslash q_2)$ as $\Theta \to B$. Let $\phi(q_1) = p$, $\phi(q_2) = p$, $\phi(q_3) = p$. Then

$$\frac{q_2 \to q_2 \quad q_3 \to q_3}{(q_3/q_2)\, q_2 \to q_3}\ (/\to), \qquad \frac{q_1 \to q_1 \quad (q_3/q_2)\, q_2 \to q_3}{(q_3/q_2)\, q_1\, (q_1 \backslash q_2) \to q_3}\ (\backslash\to), \qquad \frac{(q_3/q_2)\, q_1\, (q_1 \backslash q_2) \to q_3}{(q_3/q_2)\, q_1 \to q_3/(q_1 \backslash q_2)}\ (\to/).$$

Proof of Theorem 3.4. The 'if' part is an immediate consequence of Lemma 3.3.

To prove the 'only if' part we consider an arbitrary cut-free derivation of $\Pi \to A$. Let $n$ be the number of axiom instances in the derivation. (Evidently, $\|\Pi\| + \|A\| = 2n$.) We introduce $n$ new primitive types $q_1$, ..., $q_n$ and assume that a one-to-one correspondence between axiom instances and new primitive types is given. The substitution $\phi$ is defined as follows. If a new primitive type $q_i$ corresponds to an axiom instance $p_j \to p_j$, then $\phi(q_i) \rightleftharpoons p_j$.
Now we turn the given cut-free derivation of $\Pi \to A$ into a derivation of $\Gamma \to C$ so that $\phi(\Gamma \to C) = \Pi \to A$ and the derivation structure remains the same. First we replace each axiom instance by an axiom instance containing the corresponding new primitive type. Next we spread this replacement down along the derivation tree. This is possible due to the fact that in all derivation rules except the cut rule every primitive type occurrence in the conclusion has exactly one predecessor in the premises of the rule. □

4. Interpolation

In 1991 D. Roorda [18] proved (using the method of Maehara and Schütte [20]) that the calculus $L^*$ has the Craig interpolation property. In the paper [19] he remarked that the proof handles also the case of $L$.

In Section 4.1 we present a proof of the interpolation theorem for $L$. Essentially this proof copies D. Roorda's proof for $L^*$. The interpolation property in elementary fragments of the Lambek calculus will be studied in Section 6.

4.1. Interpolation in $L(\backslash, /, \bullet)$.

Lemma 4.1. Let $L \vdash \Phi\Theta\Psi \to C$, where $\Phi \in \mathrm{Tp}^*$, $\Theta \in \mathrm{Tp}^+$, $\Psi \in \mathrm{Tp}^*$, and $C \in \mathrm{Tp}$. Then there exists a type $E$ such that
(i) $L \vdash \Theta \to E$;
(ii) $L \vdash \Phi E \Psi \to C$;
(iii) the inequality $\#_p^+(E) \le \min(\#_p^+(\Theta), \#_p^+(C) + \#_p^-(\Phi\Psi))$ holds for every primitive type $p$;
(iv) the inequality $\#_p^-(E) \le \min(\#_p^-(\Theta), \#_p^-(C) + \#_p^+(\Phi\Psi))$ holds for every primitive type $p$.

We shall write $\Phi[\Theta]\Psi \to C$ instead of $\Phi\Theta\Psi \to C$ in order to show the selected part of the antecedent. Every type $E$ that satisfies clauses (i)-(iv) is referred to as an interpolant for $\Theta$ in the sequent $\Phi[\Theta]\Psi \to C$.

Proof of Lemma 4.1. Induction on the length of a cut-free derivation.

Case 1. Let $\Phi\Theta\Psi \to C$ be an axiom, i.e., $C = \Phi\Theta\Psi$. From $\Theta \in \mathrm{Tp}^+$ it follows that $\Theta = C$ and $\Phi = \Psi = \Lambda$. We put $E = C$.

In all the following cases the given partition of the conclusion of a rule induces partitions of premises. By the induction hypothesis one can find interpolants for the premises.

Case 2. Consider the rule $(\to\backslash)$.
$$\frac{A\Phi[\Theta]\Psi \to B}{\Phi[\Theta]\Psi \to A \backslash B}\ (\to\backslash).$$
By the induction hypothesis there is a type $E$ such that
$$L \vdash \Theta \to E, \qquad L \vdash A\Phi E\Psi \to B,$$
$$\#_p^+(E) \le \min(\#_p^+(\Theta), \#_p^+(B) + \#_p^-(A\Phi\Psi)), \qquad \#_p^-(E) \le \min(\#_p^-(\Theta), \#_p^-(B) + \#_p^+(A\Phi\Psi))$$
for every $p \in \mathrm{Var}$.
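We verify that (i), (ii), (iii), and (iv) hold for the conclusion of the rule $(\to\backslash)$ with the same interpolant $E$ as for the premise. The clause (i) is evident from the induction hypothesis. The derivation
$$\frac{A\Phi E\Psi \to B}{\Phi E\Psi \to A \backslash B}\ (\to\backslash)$$
establishes (ii). The clauses (iii) and (iv) follow from the induction hypothesis and the definition of $\#_p^+$ and $\#_p^-$.

Case 3. The rule $(\to/)$ is handled similarly.

Case 4. For the rule $(\backslash\to)$ we consider six subcases.

Case 4a.
$$\frac{\Pi'[\Pi'']\Pi''' \to A \quad \Gamma B \Delta \to C}{\Gamma\Pi'[\Pi'']\Pi'''(A \backslash B)\Delta \to C}\ (\backslash\to).$$
Similar to case 2.

Case 4b.
$$\frac{\Pi \to A \quad \Gamma'[\Gamma'']\Gamma''' B \Delta \to C}{\Gamma'[\Gamma'']\Gamma'''\Pi(A \backslash B)\Delta \to C}\ (\backslash\to).$$
Similar to case 2.

Case 4c.
$$\frac{\Pi \to A \quad \Gamma B \Delta'[\Delta'']\Delta''' \to C}{\Gamma\Pi(A \backslash B)\Delta'[\Delta'']\Delta''' \to C}\ (\backslash\to).$$
Similar to case 2.

Case 4d.
$$\frac{[\Pi']\Pi'' \to A \quad \Gamma'[\Gamma'']B\Delta \to C}{\Gamma'[\Gamma''\Pi']\Pi''(A \backslash B)\Delta \to C}\ (\backslash\to).$$
Let $E$ be an interpolant for the left premise and $F$ be an interpolant for the right premise. It is easy to verify that $F \bullet E$ is an interpolant for the conclusion of the rule $(\backslash\to)$. The clause (i) is proved by the derivation
$$\frac{\Gamma'' \to F \quad \Pi' \to E}{\Gamma''\Pi' \to F \bullet E}\ (\to\bullet).$$

Case 4e.
$$\frac{\Pi \to A \quad \Gamma'[\Gamma'' B \Delta']\Delta'' \to C}{\Gamma'[\Gamma''\Pi(A \backslash B)\Delta']\Delta'' \to C}\ (\backslash\to).$$
Let $E$ be an interpolant for the right premise. We prove that $E$ is also an interpolant for the conclusion. The proof of (ii) is obvious. Using the induction hypothesis (i) we obtain
$$\frac{\Pi \to A \quad \Gamma'' B \Delta' \to E}{\Gamma''\Pi(A \backslash B)\Delta' \to E}\ (\backslash\to).$$
This establishes (i). To prove (iii) observe that $\#_p^+(B) + \#_p^+(\Gamma''\Delta') \le \#_p^+(A \backslash B) + \#_p^+(\Gamma''\Pi\Delta')$. The clause (iv) is verified similarly.

Case 4f.
$$\frac{[\Pi']\Pi'' \to A \quad \Gamma[B\Delta']\Delta'' \to C}{\Gamma\Pi'[\Pi''(A \backslash B)\Delta']\Delta'' \to C}\ (\backslash\to).$$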
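Let $E$ be an interpolant for the right premise and $F$ be an interpolant for the left premise. We show that the type $F \backslash E$ is an interpolant for the conclusion. First we verify (iii):

$$\begin{aligned}
\#_p^+(F \backslash E) &= \#_p^+(E) + \#_p^-(F) \\
&\le \min(\#_p^+(B\Delta'), \#_p^+(C) + \#_p^-(\Gamma\Delta'')) + \min(\#_p^-(A) + \#_p^+(\Pi''), \#_p^-(\Pi')) \\
&\le \min(\#_p^+(B\Delta') + \#_p^-(A) + \#_p^+(\Pi''), \#_p^+(C) + \#_p^-(\Gamma\Delta'') + \#_p^-(\Pi')) \\
&= \min(\#_p^+(\Pi''(A \backslash B)\Delta'), \#_p^+(C) + \#_p^-(\Gamma\Pi'\Delta'')).
\end{aligned}$$

The clause (iv) is verified similarly. Next we derive (i):
$$\frac{F\Pi'' \to A \quad B\Delta' \to E}{F\Pi''(A \backslash B)\Delta' \to E}\ (\backslash\to), \qquad \frac{F\Pi''(A \backslash B)\Delta' \to E}{\Pi''(A \backslash B)\Delta' \to F \backslash E}\ (\to\backslash).$$
Finally, (ii) is proved by
$$\frac{\Pi' \to F \quad \Gamma E \Delta'' \to C}{\Gamma\Pi'(F \backslash E)\Delta'' \to C}\ (\backslash\to).$$

Case 5. The rule $(/\to)$ is handled similarly to case 4.

Case 6. For the rule $(\to\bullet)$ three subcases arise.

Case 6a.
$$\frac{\Gamma'[\Gamma'']\Gamma''' \to A \quad \Delta \to B}{\Gamma'[\Gamma'']\Gamma'''\Delta \to A \bullet B}\ (\to\bullet).$$
Similar to case 2.

Case 6b.
$$\frac{\Gamma'[\Gamma''] \to A \quad [\Delta']\Delta'' \to B}{\Gamma'[\Gamma''\Delta']\Delta'' \to A \bullet B}\ (\to\bullet).$$
Similar to case 4d.

Case 6c.
$$\frac{\Gamma \to A \quad \Delta'[\Delta'']\Delta''' \to B}{\Gamma\Delta'[\Delta'']\Delta''' \to A \bullet B}\ (\to\bullet).$$
Similar to case 2.

Case 7. For the rule $(\bullet\to)$ we consider three subcases, all of which are handled similarly to case 2.

Case 7a.
$$\frac{\Gamma'[\Gamma'']\Gamma''' A B \Delta \to C}{\Gamma'[\Gamma'']\Gamma'''(A \bullet B)\Delta \to C}\ (\bullet\to).$$

Case 7b.
$$\frac{\Gamma'[\Gamma'' A B \Delta']\Delta'' \to C}{\Gamma'[\Gamma''(A \bullet B)\Delta']\Delta'' \to C}\ (\bullet\to).$$

Case 7c.
$$\frac{\Gamma A B \Delta'[\Delta'']\Delta''' \to C}{\Gamma(A \bullet B)\Delta'[\Delta'']\Delta''' \to C}\ (\bullet\to).$$
□

Remark 4.2. The clauses (iii) and (iv) imply that
$$\#_p^-(E) + \#_p^+(E) \le \min(\#_p^-(\Theta) + \#_p^+(\Theta), \#_p^-(\Phi\Psi C) + \#_p^+(\Phi\Psi C)).$$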
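Corollary 4.3. Let $L \vdash A \to C$, where $A \in \mathrm{Tp}$ and $C \in \mathrm{Tp}$. Then there is a type $E$ such that
(i) $L \vdash A \to E$;
(ii) $L \vdash E \to C$;
(iii) $\mathrm{Var}(E) \subseteq \mathrm{Var}(A) \cap \mathrm{Var}(C)$.

Proof. We apply Lemma 4.1 with $\Phi = \Lambda$, $\Theta = A$, and $\Psi = \Lambda$. Consider an arbitrary primitive type $p$ that occurs in the interpolant $E$. Remark 4.2 shows that
$$\min(\#_p^-(A) + \#_p^+(A), \#_p^-(C) + \#_p^+(C)) \ge \#_p^-(E) + \#_p^+(E) \ge 1,$$
whence $\#_p^-(A) + \#_p^+(A) \ge 1$ and $\#_p^-(C) + \#_p^+(C) \ge 1$, i.e., $p \in \mathrm{Var}(A)$ and $p \in \mathrm{Var}(C)$. □

Example 4.4. Consider the derivable sequent $p_1 \bullet (p_1 \backslash p_2) \to (p_3/p_2) \backslash p_3$. Here
$$\mathrm{Var}(p_1 \bullet (p_1 \backslash p_2)) \cap \mathrm{Var}((p_3/p_2) \backslash p_3) = \{p_1, p_2\} \cap \{p_2, p_3\} = \{p_2\}.$$
The type $E = p_2$ is an interpolant for this sequent. Indeed, $L \vdash p_1 \bullet (p_1 \backslash p_2) \to p_2$ and $L \vdash p_2 \to (p_3/p_2) \backslash p_3$.

4.2. Interpolation property for thin sequents.

Lemma 4.5. Let $L \vdash \Phi\Theta\Psi \to C$, where $\Phi \in \mathrm{Tp}^*$, $\Theta \in \mathrm{Tp}^+$, $\Psi \in \mathrm{Tp}^*$, $C \in \mathrm{Tp}$, and the sequent $\Phi\Theta\Psi \to C$ is thin. Then there is a type $E$ such that
(i) $L \vdash \Theta \to E$;
(ii) $L \vdash \Phi E \Psi \to C$;
(iii) the sequent $\Theta \to E$ is thin;
(iv) the sequent $\Phi E \Psi \to C$ is thin;
(v) $\|E\| = |[\![\Theta]\!]|$.

Proof. According to Lemma 4.1 there is a type $E$ satisfying (i) and (ii). It remains to prove (iii), (iv), and (v).

Consider an arbitrary primitive type $p$. In view of $\#_p^+(E) \le \#_p^+(C) + \#_p^-(\Phi\Psi)$, we have $\#_p^+(E) + \#_p^-(\Theta) \le \#_p^+(C) + \#_p^-(\Phi\Theta\Psi) \le 1$ (the last inequality follows from the original sequent $\Phi\Theta\Psi \to C$ being thin). Similarly, $\#_p^-(E) + \#_p^+(\Theta) \le \#_p^-(C) + \#_p^+(\Phi\Theta\Psi) \le 1$. This proves (iii). The clause (iv) can be verified in an analogous manner.

Since $[\![\Theta]\!] = [\![E]\!]$ by (i) and Lemma 2.3, to establish (v) it is sufficient to verify that $\|E\| = |[\![E]\!]|$. This is obvious, since no primitive type occurs in $E$ more than once. □

5. Main theorem

5.1. Construction.

Definition 5.1. For every natural number $m$ we define a set of bounded types $\mathrm{Tp}(m)$ and a set of bounded type sequences $\mathrm{Ls}(m)$:
$$\mathrm{Tp}(m) \rightleftharpoons \{A \in \mathrm{Tp} \mid \|A\| \le m\}; \qquad \mathrm{Ls}(m) \rightleftharpoons \{\Pi \in \mathrm{Tp}(m)^+ \mid \|\Pi\| \le 2m\}.$$

Definition 5.2. For every pair of natural numbers $m$ and $s$ we define a finite set of types $\mathrm{Tp}(m, s)$ and a finite set of type sequences $\mathrm{Ls}(m, s)$:
$$\mathrm{Tp}(m, s) \rightleftharpoons \{A \in \mathrm{Tp} \mid \mathrm{Var}(A) \subseteq \{p_1, \dots, p_s\} \text{ and } \|A\| \le m\}; \qquad \mathrm{Ls}(m, s) \rightleftharpoons \{\Pi \in \mathrm{Tp}(m, s)^+ \mid \|\Pi\| \le 2m\}.$$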
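That the sets of Definition 5.2 are finite can be seen by enumerating them directly. The Python sketch below does this under an assumed encoding (strings `"p1"`, ..., `"ps"` for primitive types and tuples `(op, A, B)` for compound types); the function names `types_of_length`, `tp`, and `ls` are hypothetical and only serve the illustration.

```python
from functools import lru_cache

OPS = ("*", "\\", "/")  # multiplication, left division, right division

@lru_cache(maxsize=None)
def types_of_length(n, s):
    """All types A with ||A|| == n built from the primitive types p1..ps (n >= 1)."""
    if n == 1:
        return tuple(f"p{i}" for i in range(1, s + 1))
    result = []
    for k in range(1, n):  # split the n occurrences between the two arguments
        for left in types_of_length(k, s):
            for right in types_of_length(n - k, s):
                for op in OPS:
                    result.append((op, left, right))
    return tuple(result)

def tp(m, s):
    """Tp(m, s): all types of length at most m over p1..ps."""
    return [a for n in range(1, m + 1) for a in types_of_length(n, s)]

def ls(m, s):
    """Ls(m, s): non-empty sequences over Tp(m, s) of total length at most 2m."""
    pool = [(a, n) for n in range(1, m + 1) for a in types_of_length(n, s)]
    out = []

    def extend(prefix, budget):
        for a, n in pool:
            if n <= budget:
                seq = prefix + (a,)
                out.append(seq)
                extend(seq, budget - n)

    extend((), 2 * m)
    return out

print(len(tp(2, 2)), len(ls(1, 2)))  # prints "14 6"
```

The sizes grow quickly with $m$ and $s$, so this brute-force enumeration is only practical for very small parameters; it is meant to make the definitions concrete, not to be used in the construction itself.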
Consider an arbitrary Lambek grammar $(T, H, \triangleright)$. Only a finite number of types are relevant in the definition of the language generated by this Lambek grammar. Thus there are positive integers $m$ and $s$ such that $H \in \mathrm{Tp}(m, s)$ and if $B \triangleright t$ for some $t \in T$, then $B \in \mathrm{Tp}(m, s)$. There is no loss of generality in assuming that the sets $T$ and $\mathrm{Tp}(m, s)$ do not intersect.

Now we construct the desired context-free grammar $(T, W, \sigma, \mathcal{R})$:
$$W \rightleftharpoons \mathrm{Tp}(m, s), \qquad \sigma \rightleftharpoons H,$$
$$\mathcal{R} \rightleftharpoons \{B \Rightarrow t \mid t \in T \text{ and } B \triangleright t\} \cup \{A \Rightarrow \Gamma \mid A \in \mathrm{Tp}(m, s),\ \Gamma \in \mathrm{Ls}(m, s), \text{ and } L \vdash \Gamma \to A\}.$$

The aim of this section is to prove that $\mathcal{L}_L(T, H, \triangleright) = \mathcal{G}(T, W, \sigma, \mathcal{R})$.

Lemma 5.3. Let $t_1, \dots, t_n \in T$. Then the word $t_1 \dots t_n$ is in $\mathcal{G}(T, W, \sigma, \mathcal{R})$ if and only if there are symbols $\alpha_1, \dots, \alpha_n \in W$ such that the word $\alpha_1 \dots \alpha_n$ is derivable from $\sigma$, and $(\alpha_i \Rightarrow t_i) \in \mathcal{R}$ for every $i \le n$.

Proof. Observe that every derivation of $t_1 \dots t_n$ from $\sigma$ in the constructed context-free grammar $(T, W, \sigma, \mathcal{R})$ can be reorganized so that all occurrences of productions $B \Rightarrow t$, where $t \in T$, appear after all occurrences of productions $A \Rightarrow \Gamma$, where $\Gamma \in \mathrm{Ls}(m, s)$. □

5.2. Calculus representation of context-free grammars.

Definition 5.4. Given a context-free grammar $(T, W, \sigma, \mathcal{R})$, we construct a calculus $\mathcal{C}_1(W, \sigma, \mathcal{R})$, derivable objects of which are sequents of the form $w \to \sigma$, where $w \in W^+$.
• The only axiom of $\mathcal{C}_1(W, \sigma, \mathcal{R})$ is $\sigma \to \sigma$.
• If $(\alpha \Rightarrow u) \in \mathcal{R}$, $\alpha \in W$, and $u \in W^+$, then the calculus $\mathcal{C}_1(W, \sigma, \mathcal{R})$ contains the rule
$$\frac{v_1 \alpha v_2 \to \sigma}{v_1 u v_2 \to \sigma}.$$

Lemma 5.5. Let $w \in W^+$. The sequent $w \to \sigma$ is derivable in the calculus $\mathcal{C}_1(W, \sigma, \mathcal{R})$ if and only if the word $w$ is derivable from $\sigma$ in the context-free grammar $(T, W, \sigma, \mathcal{R})$.

Proof. The 'if' part is proved by induction on derivation length in the context-free grammar $(T, W, \sigma, \mathcal{R})$.

Induction base: $w = \sigma$. Obvious.

Induction step: Let $w = v_1 u v_2$, $(\alpha \Rightarrow u) \in \mathcal{R}$, and let $v_1 \alpha v_2$ be derivable from $\sigma$ in $(T, W, \sigma, \mathcal{R})$.
We apply the rule
$$\frac{v_1 \alpha v_2 \to \sigma}{v_1 u v_2 \to \sigma}.$$

The 'only if' part is proved similarly by induction on derivation length in the calculus $\mathcal{C}_1(W, \sigma, \mathcal{R})$. □

Definition 5.6. Given a context-free grammar $(T, W, \sigma, \mathcal{R})$, we construct a calculus $\mathcal{C}_2(W, \mathcal{R})$, derivable objects of which are sequents of the form $w \to \alpha$, where $w \in W^+$ and $\alpha \in W$.
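• The calculus $\mathcal{C}_2(W, \mathcal{R})$ contains an axiom $\alpha \to \alpha$ for every symbol $\alpha \in W$.
• If $(\alpha \Rightarrow u) \in \mathcal{R}$, $\alpha \in W$, and $u \in W^+$, then the calculus $\mathcal{C}_2(W, \mathcal{R})$ contains the axiom $u \to \alpha$.
• The only rule of the calculus $\mathcal{C}_2(W, \mathcal{R})$ is the cut rule
$$\frac{u \to \alpha \quad v_1 \alpha v_2 \to \beta}{v_1 u v_2 \to \beta}.$$

Lemma 5.7. Let $w \in W^+$. A sequent $w \to \sigma$ is derivable in the calculus $\mathcal{C}_2(W, \mathcal{R})$ if and only if the word $w$ is derivable from $\sigma$ in the context-free grammar $(T, W, \sigma, \mathcal{R})$.

Proof. In view of Lemma 5.5, it suffices to prove that a sequent $w \to \sigma$ is derivable in the calculus $\mathcal{C}_2(W, \mathcal{R})$ if and only if it is derivable in the calculus $\mathcal{C}_1(W, \sigma, \mathcal{R})$.

That every sequent derivable in $\mathcal{C}_1(W, \sigma, \mathcal{R})$ is derivable in $\mathcal{C}_2(W, \mathcal{R})$ is easy to verify, since each rule of $\mathcal{C}_1(W, \sigma, \mathcal{R})$ is a cut whose left premise is an axiom $u \to \alpha$. To prove the converse we define the rank of a cut as the number of sequents in the derivation of its left premise and proceed by induction on the total of ranks of all cuts in a given derivation. A derivation fragment
$$\frac{\dfrac{u \to \alpha \quad v_1 \alpha v_2 \to \beta}{v_1 u v_2 \to \beta} \quad w_1 \beta w_2 \to \gamma}{w_1 v_1 u v_2 w_2 \to \gamma}$$
will be replaced by
$$\frac{u \to \alpha \quad \dfrac{v_1 \alpha v_2 \to \beta \quad w_1 \beta w_2 \to \gamma}{w_1 v_1 \alpha v_2 w_2 \to \gamma}}{w_1 v_1 u v_2 w_2 \to \gamma}.$$
□

5.3. Main lemma. In this section we establish a correspondence between the Lambek calculus and the calculus $\mathcal{C}_2(W, \mathcal{R})$ representing the context-free grammar constructed in Section 5.1.

For every natural number $m$ we introduce an auxiliary calculus $\mathrm{Lcut}_m$, which in some sense takes the intermediate position between the calculi $L$ and $\mathcal{C}_2(W, \mathcal{R})$. Namely, the calculus $\mathcal{C}_2(W, \mathcal{R})$ uses formulas from $\mathrm{Tp}(m, s)$, the calculus $\mathrm{Lcut}_m$ uses $\mathrm{Tp}(m)$, and $L$ uses formulas from $\mathrm{Tp}$.

Definition 5.8. A sequent $\Gamma \to A$ is an axiom of $\mathrm{Lcut}_m$ iff $A \in \mathrm{Tp}(m)$, $\Gamma \in \mathrm{Ls}(m)$, and the sequent $\Gamma \to A$ is derivable in the Lambek calculus. The only rule of $\mathrm{Lcut}_m$ is (cut).

Lemma 5.9. Let $L \vdash \Pi \to C$, where $\Pi \in \mathrm{Tp}(m)^+$, $C \in \mathrm{Tp}(m)$, and the sequent $\Pi \to C$ is thin. Then $\mathrm{Lcut}_m \vdash \Pi \to C$.

Proof. Induction on $\|\Pi\|$. If $\|\Pi\| \le 2m$, then $\Pi \to C$ is an axiom of $\mathrm{Lcut}_m$. Assume that $\|\Pi\| > 2m$. The sequence $\Pi$ can be represented as a concatenation $\Pi = \Pi_1 \dots \Pi_l$, where
• $0 < \|\Pi_i\| \le m$ for every $i \le l$,
• $\|\Pi_i\| + \|\Pi_{i+1}\| > m$ for every $i < l$.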
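Note that $[\![\Pi]\!] = [\![C]\!]$ according to Lemma 2.3. Now let $u_1 \rightleftharpoons [\![\Pi_1]\!]$, ..., $u_l \rightleftharpoons [\![\Pi_l]\!]$, and $u_{l+1} \rightleftharpoons [\![C]\!]^{-1}$. Evidently $u_1 \dots u_l u_{l+1} = \varepsilon$. Applying Lemma 2.4, we find a positive integer $k \le l$ such that $|u_k u_{k+1}| \le \max(|u_k|, |u_{k+1}|)$. According to Lemma 2.2 the inequality $|u_i| \le m$ holds for every $i \le l + 1$. Thus $|u_k u_{k+1}| \le m$. The following two cases arise.

Case 1. Let $k < l$. Then $|[\![\Pi_k \Pi_{k+1}]\!]| \le m$ for this particular $k$. Applying Lemma 4.5 for
$$\underbrace{\Pi_1 \dots \Pi_{k-1}}_{\Phi}\ \underbrace{\Pi_k \Pi_{k+1}}_{\Theta}\ \underbrace{\Pi_{k+2} \dots \Pi_l}_{\Psi} \to C,$$
we find an interpolant $E$ for $\Pi_k \Pi_{k+1}$ in $\Pi_1 \dots \Pi_l \to C$. This means that $\|E\| \le m$ and the sequents $\Pi_k \Pi_{k+1} \to E$ and $\Pi_1 \dots \Pi_{k-1} E \Pi_{k+2} \dots \Pi_l \to C$ are thin and derivable.

Note that $\|E\| \le m$, but $\|\Pi_k \Pi_{k+1}\| > m$. Thus $\|\Pi_1 \dots \Pi_{k-1} E \Pi_{k+2} \dots \Pi_l\| < \|\Pi_1 \dots \Pi_l\|$, and we can apply the induction hypothesis for the thin derivable sequent $\Pi_1 \dots \Pi_{k-1} E \Pi_{k+2} \dots \Pi_l \to C$. On the other hand, $\Pi_k \Pi_{k+1} \to E$ is an axiom of $\mathrm{Lcut}_m$, since $\|E\| \le m$ and $\|\Pi_k \Pi_{k+1}\| \le 2m$. Thus we have demonstrated that $\mathrm{Lcut}_m \vdash \Pi_1 \dots \Pi_{k-1} E \Pi_{k+2} \dots \Pi_l \to C$ and $\mathrm{Lcut}_m \vdash \Pi_k \Pi_{k+1} \to E$. Now $\mathrm{Lcut}_m \vdash \Pi_1 \dots \Pi_{k-1} \Pi_k \Pi_{k+1} \Pi_{k+2} \dots \Pi_l \to C$ is obtained by applying the cut rule. We have proved that $\mathrm{Lcut}_m \vdash \Pi \to C$.

Case 2. Let $k = l$. Then $|[\![\Pi_l]\!] [\![C]\!]^{-1}| \le m$. Applying Lemma 4.5 for
$$\underbrace{\Pi_1 \dots \Pi_{l-1}}_{\Theta}\ \underbrace{\Pi_l}_{\Psi} \to C,$$
we find an interpolant $E$ for $\Pi_1 \dots \Pi_{l-1}$ in $\Pi_1 \dots \Pi_l \to C$. This means that $\|E\| = |[\![\Pi_1 \dots \Pi_{l-1}]\!]|$ and the sequents $\Pi_1 \dots \Pi_{l-1} \to E$ and $E \Pi_l \to C$ are thin and derivable.

Recall that $[\![\Pi_1 \dots \Pi_{l-1} \Pi_l]\!] = [\![C]\!]$, whence $[\![\Pi_1 \dots \Pi_{l-1}]\!] = [\![C]\!] [\![\Pi_l]\!]^{-1} = ([\![\Pi_l]\!] [\![C]\!]^{-1})^{-1}$ and, further, $|[\![\Pi_1 \dots \Pi_{l-1}]\!]| = |([\![\Pi_l]\!] [\![C]\!]^{-1})^{-1}| = |[\![\Pi_l]\!] [\![C]\!]^{-1}| \le m$. Thus $\|E\| = |[\![\Pi_1 \dots \Pi_{l-1}]\!]| \le m$. It follows that $E \Pi_l \in \mathrm{Ls}(m)$, and consequently $E \Pi_l \to C$ is an axiom of $\mathrm{Lcut}_m$. On the other hand, $\|\Pi_1 \dots \Pi_{l-1}\| < \|\Pi_1 \dots \Pi_l\|$, and we can apply the induction hypothesis for the thin derivable sequent $\Pi_1 \dots \Pi_{l-1} \to E$. Applying the cut rule, we obtain $\mathrm{Lcut}_m \vdash \Pi_1 \dots \Pi_{l-1} \Pi_l \to C$. In other words, $\mathrm{Lcut}_m \vdash \Pi \to C$. □

Lemma 5.10. Let $(T, H, \triangleright)$ be a Lambek grammar and $(T, W, \sigma, \mathcal{R})$ the corresponding context-free grammar (constructed in Section 5.1).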
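Let $\Gamma \in W^+$ and $A \in W$. Then the following three assertions are equivalent.
(i) $\mathcal{C}_2(W, \mathcal{R}) \vdash \Gamma \to A$.
(ii) $\mathrm{Lcut}_m \vdash \Gamma \to A$.
(iii) $L \vdash \Gamma \to A$.

Proof. The implication (i) → (iii) is easily verified by induction on derivation length in $\mathcal{C}_2(W, \mathcal{R})$.

To prove (ii) → (i) we consider the following primitive type substitution:
$$\phi_s(p_i) \rightleftharpoons \begin{cases} p_i, & \text{if } i \le s, \\ p_1, & \text{if } i > s. \end{cases}$$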
Note that $\phi_s$ maps $\mathrm{Tp}(m)$ to $\mathrm{Tp}(m, s) = W$. The substitution $\phi_s$ is applied to all sequents in a derivation of the given sequent $\Gamma \to A$ in $\mathrm{Lcut}_m$. The resulting tree is a derivation in $\mathcal{C}_2(W, \mathcal{R})$.

It remains to establish (iii) → (ii). Let $L \vdash \Gamma \to A$, where $\Gamma \in W^+$. According to Theorem 3.4 there exist a thin derivable sequent $\Pi \to C$ and a primitive type substitution $\phi$ such that $\Gamma \to A = \phi(\Pi \to C)$. According to Lemma 5.9 we have $\mathrm{Lcut}_m \vdash \Pi \to C$. Consequently also the sequent $\Gamma \to A$ is derivable in the calculus $\mathrm{Lcut}_m$. □

5.4. Proof of the main theorem.

Theorem 5.11. Let $(T, H, \triangleright)$ be a Lambek grammar. Then the language $\mathcal{L}_L(T, H, \triangleright)$ is context-free.

Proof. We prove that $\mathcal{L}_L(T, H, \triangleright) = \mathcal{G}(T, W, \sigma, \mathcal{R})$, where $(T, W, \sigma, \mathcal{R})$ is the context-free grammar constructed in Section 5.1. Given a word $t_1 \dots t_n$, consider the following chain of equivalent assertions.
(1) The word $t_1 \dots t_n$ is in the language $\mathcal{G}(T, W, \sigma, \mathcal{R})$.
(2) There are types $B_1$, ..., $B_n$ such that $B_1 \dots B_n$ is derivable from $\sigma$ in $(T, W, \sigma, \mathcal{R})$, and $(B_i \Rightarrow t_i) \in \mathcal{R}$ for every $i \le n$.
(3) There are types $B_1$, ..., $B_n$ such that $\mathcal{C}_2(W, \mathcal{R}) \vdash B_1 \dots B_n \to \sigma$ and $B_i \triangleright t_i$ holds for every $i \le n$.
(4) There are types $B_1$, ..., $B_n$ such that $L \vdash B_1 \dots B_n \to \sigma$ and $B_i \triangleright t_i$ holds for every $i \le n$.
(5) The word $t_1 \dots t_n$ is in the language $\mathcal{L}_L(T, H, \triangleright)$.

The equivalence of (1) and (2) is established by Lemma 5.3. Further, (2) and (3) are equivalent according to Lemma 5.7 and the construction of the set $\mathcal{R}$. The equivalence of (3) and (4) follows from Lemma 5.10. Finally, (4) and (5) are equivalent due to the definition of the language generated by a Lambek grammar. □

Corollary 5.12. A language is context-free if and only if it is generated by some Lambek grammar.

Remark 5.13. All the arguments above hold also for the Lambek calculus with the unit and for the calculus $L^*$.
Consequently, the class of languages generated by categorial grammars based on any of these calculi coincides with the class of all context-free languages.

6. Interpolation in fragments

In this section we introduce the generalized interpolation property and study both ordinary and generalized interpolation in all elementary fragments of the calculi $L$ and $L^*$. In particular, we prove a "weak interpolation theorem" for the product-free fragment of the Lambek calculus. This is used in the proof of existence of a "natural" context-free grammar for every categorial grammar based on this important fragment of the Lambek calculus.

Definition 6.1. Let $\Sigma \subseteq \{\backslash, /, \bullet\}$, and let $\mathcal{C}$ be either $L(\Sigma)$ or $L^*(\Sigma)$. We say that the calculus $\mathcal{C}$ has the interpolation property if for every sequent of the form $A \to C$ derivable in $\mathcal{C}$, where $A \in \mathrm{Tp}(\Sigma)$ and $C \in \mathrm{Tp}(\Sigma)$, there exists a type $B \in \mathrm{Tp}(\Sigma)$ such that
(i) $\mathcal{C} \vdash A \to B$;
(ii) $\mathcal{C} \vdash B \to C$;
(iii) $\mathrm{Var}(B) \subseteq \mathrm{Var}(A) \cap \mathrm{Var}(C)$.

Definition 6.2. Let $\Sigma \subseteq \{\backslash, /, \bullet\}$, and let $\mathcal{C}$ be either $L(\Sigma)$ or $L^*(\Sigma)$. We say that the calculus $\mathcal{C}$ has the generalized interpolation property if for every sequent of the form $\Pi \to C$ derivable in $\mathcal{C}$, where $\Pi \in \mathrm{Tp}(\Sigma)^*$ and $C \in \mathrm{Tp}(\Sigma)$, there exists a type $B \in \mathrm{Tp}(\Sigma)$ such that
(i) $\mathcal{C} \vdash \Pi \to B$;
(ii) $\mathcal{C} \vdash B \to C$;
(iii) $\mathrm{Var}(B) \subseteq \mathrm{Var}(\Pi) \cap \mathrm{Var}(C)$.

Remark 6.3. Evidently, if multiplication is in the signature of $\mathcal{C}$, then $\mathcal{C}$ has the generalized interpolation property if and only if it has the ordinary interpolation property (since $\mathcal{C} \vdash A_1 \dots A_n \to C$ if and only if $\mathcal{C} \vdash (A_1 \bullet \dots \bullet A_n) \to C$).

Remark 6.4. In view of duality of signatures $\{\backslash\}$ and $\{/\}$, as well as duality of $\{\backslash, \bullet\}$ and $\{/, \bullet\}$, it is sufficient to study the fragments based on the signatures $\{\backslash\}$, $\{\backslash, /\}$, $\{\bullet\}$, $\{\backslash, \bullet\}$, and $\{\backslash, /, \bullet\}$.

In 1991 D. Roorda proved that the calculi $L^*$ and $L^*(\backslash, \bullet)$ have the interpolation property [18, 19]. The same question for the fragments $L(\backslash)$, $L(\backslash, /)$, $L^*(\backslash)$, and $L^*(\backslash, /)$ was mentioned as an open problem [19, p. 440]. Note that the interpolation property for fragments $L^*(\bullet)$, $L(\bullet)$, $L$, and $L(\backslash, \bullet)$ is easily obtained from Roorda's proofs.

In this section we prove the following results.
• The fragments $L(\backslash)$ and $L^*(\backslash)$ have both the ordinary and the generalized interpolation property.
• The fragments $L(\backslash, /)$ and $L^*(\backslash, /)$ have the ordinary interpolation property, but do not have the generalized interpolation property.
• The fragments $L(\backslash, /)$ and $L^*(\backslash, /)$ satisfy a certain weak version of the generalized interpolation property.

We give only the proofs for fragments of $L$, since the proofs for fragments of $L^*$ are analogous.

6.1. "Weak" generalized interpolation.

Lemma 6.5. Let $\Phi \in \mathrm{Tp}(\backslash, /)^*$, $\Theta \in \mathrm{Tp}(\backslash, /)^*$, $\Psi \in \mathrm{Tp}(\backslash, /)^*$, $C \in \mathrm{Tp}(\backslash, /)$, and $L \vdash \Phi\Theta\Psi \to C$. Then there is a natural number $r \ge 0$, there are sequences of types $\Theta_1, \dots, \Theta_r \in \mathrm{Tp}(\backslash, /)^+$, and there are types $E_1, \dots, E_r \in \mathrm{Tp}(\backslash, /)$ such that
(i) $\Theta_1 \dots \Theta_r = \Theta$, i.e., the sequence $\Theta$ is divided into $r$ non-empty continuous subsequences (if $\Theta = \Lambda$, then $r = 0$);
(ii) $L \vdash \Theta_j \to E_j$ for every $j \le r$;
(iii) $L \vdash \Phi E_1 \dots E_r \Psi \to C$;
(iv) $\#_p^+(E_1 \dots E_r) \le \min(\#_p^+(\Theta), \#_p^+(C) + \#_p^-(\Phi\Psi))$ and $\#_p^-(E_1 \dots E_r) \le \min(\#_p^-(\Theta), \#_p^-(C) + \#_p^+(\Phi\Psi))$ for every $p \in \mathrm{Var}$.

We say that the sequence $E_1 \dots E_r$ is an interpolant for $\Theta$ in the sequent $\Phi\Theta\Psi \to C$.

Example 6.6. Consider the derivable sequent $[p_1 (p_1 \backslash p_2) p_3](p_3 \backslash (p_2 \backslash p_4)) \to p_4$. Applying Lemma 6.5 we obtain a division of the selected subsequence $p_1 (p_1 \backslash p_2) p_3$ =
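$\Theta_1\Theta_2$, where $\Theta_1 = p_1(p_1\backslash p_2)$ and $\Theta_2 = p_3$ (here $r = 2$). The corresponding interpolant is $p_2 p_3$, i.e., $E_1 = p_2$ and $E_2 = p_3$. In fact, $\mathrm{L} \vdash p_1(p_1\backslash p_2) \to p_2$, $\mathrm{L} \vdash p_3 \to p_3$, and $\mathrm{L} \vdash p_2 p_3 (p_3\backslash(p_2\backslash p_4)) \to p_4$. Note that no single product-free formula is an interpolant for $p_1(p_1\backslash p_2)p_3$ in $[p_1(p_1\backslash p_2)p_3](p_3\backslash(p_2\backslash p_4)) \to p_4$.

Proof of Lemma 6.5. Induction on the length of a cut-free derivation.

Case 1. Let $\Phi\Theta\Psi \to C$ be an axiom, i.e., $C = \Phi\Theta\Psi$. Three subcases arise from different partitions of the antecedent between $\Phi$, $\Theta$, and $\Psi$.
Case 1a. Let $[C] \to C$. We put $r = 1$, $\Theta_1 = C$, $E_1 = C$.
Case 1b. Let $[\ ]C \to C$. We put $r = 0$.
Case 1c. Let $C[\ ] \to C$. We put $r = 0$.

In all the following cases we shall consider the partition of premises induced by the given partition of the conclusion of a rule. By the induction hypothesis there exist interpolants for the premises.

Case 2. Consider the rule
\[ \frac{A\Phi[\Theta]\Psi \to B}{\Phi[\Theta]\Psi \to A\backslash B}\;(\to\backslash). \]
By the induction hypothesis we find $\Theta_1, \ldots, \Theta_r$, $E_1, \ldots, E_r$ such that $\Theta_1 \ldots \Theta_r = \Theta$, $\mathrm{L} \vdash \Theta_j \to E_j$ for every $j \le r$, $\mathrm{L} \vdash A\Phi E_1 \ldots E_r \Psi \to B$,
\[ \#_p^+(E_1 \ldots E_r) \le \min(\#_p^+(\Theta),\ \#_p^+(B) + \#_p^-(A\Phi\Psi)) \]
and
\[ \#_p^-(E_1 \ldots E_r) \le \min(\#_p^-(\Theta),\ \#_p^-(B) + \#_p^+(A\Phi\Psi)) \]
for every $p \in \mathrm{Var}$. We verify that (i), (ii), (iii), and (iv) hold for the conclusion of the rule $(\to\backslash)$ with the same $\Theta_1, \ldots, \Theta_r$, $E_1, \ldots, E_r$ as for the premise. The clauses (i) and (ii) are evident from the induction hypothesis. The derivation
\[ \frac{A\Phi E_1 \ldots E_r \Psi \to B}{\Phi E_1 \ldots E_r \Psi \to A\backslash B}\;(\to\backslash) \]
establishes (iii). The clause (iv) follows from the induction hypothesis and the definition of $\#_p^-$ and $\#_p^+$.

Case 3. The rule $(\to /)$ is treated similarly.

Case 4. Consider the rule $(\backslash\to)$. Six subcases arise.

Case 4a.
\[ \frac{\Pi'[\Pi'']\Pi''' \to A \qquad \Gamma B \Delta \to C}{\Gamma\Pi'[\Pi'']\Pi'''(A\backslash B)\Delta \to C}\;(\backslash\to) \]
Similar to case 2.

Case 4b.
\[ \frac{\Pi \to A \qquad \Gamma'[\Gamma'']\Gamma''' B \Delta \to C}{\Gamma'[\Gamma'']\Gamma'''\Pi(A\backslash B)\Delta \to C}\;(\backslash\to) \]
Similar to case 2.

Case 4c.
\[ \frac{\Pi \to A \qquad \Gamma B \Delta'[\Delta'']\Delta''' \to C}{\Gamma\Pi(A\backslash B)\Delta'[\Delta'']\Delta''' \to C}\;(\backslash\to) \]
Similar to case 2.

Case 4d.
\[ \frac{[\Pi']\Pi'' \to A \qquad \Gamma'[\Gamma'']B\Delta \to C}{\Gamma'[\Gamma''\Pi']\Pi''(A\backslash B)\Delta \to C}\;(\backslash\to) \]
Let $E_1 \ldots E_r$ and $F_1 \ldots F_m$ be the interpolants for the left and right premises respectively. It is easy to verify that $F_1 \ldots F_m E_1 \ldots E_r$ is an interpolant for the conclusion of the rule $(\backslash\to)$.

Case 4e.
\[ \frac{\Pi \to A \qquad \Gamma'[\Gamma'' B \Delta']\Delta'' \to C}{\Gamma'[\Gamma''\Pi(A\backslash B)\Delta']\Delta'' \to C}\;(\backslash\to) \]
Let $E_1 \ldots E_r$ be the interpolant for the right premise. We prove that it is also an interpolant for the conclusion. The clauses (i) and (iii) are obvious. By the induction hypothesis, $\Gamma'' B \Delta' = \Theta_1 \ldots \Theta_r$. Let the particular type occurrence $B$ be in the sequence $\Theta_k$. Then $\Theta_k = \Xi B \Upsilon$ for some sequences $\Xi$ and $\Upsilon$. We put $\Theta'_k = \Xi\Pi(A\backslash B)\Upsilon$ and $\Theta'_j = \Theta_j$ for every $j \ne k$. Evidently $\Gamma''\Pi(A\backslash B)\Delta' = \Theta'_1 \ldots \Theta'_r$. Using the induction hypothesis (ii), we obtain
\[ \frac{\Pi \to A \qquad \Xi B \Upsilon \to E_k}{\Xi\Pi(A\backslash B)\Upsilon \to E_k}\;(\backslash\to) \]
and $\mathrm{L} \vdash \Theta'_j \to E_j$ for every $j \ne k$. This proves (ii). To prove (iv), it is sufficient to observe that $\#_p^+(\Gamma'' B \Delta') \le \#_p^+(\Gamma''\Pi(A\backslash B)\Delta')$ and $\#_p^-(\Gamma'' B \Delta') \le \#_p^-(\Gamma''\Pi(A\backslash B)\Delta')$.

Case 4f.
\[ \frac{[\Pi']\Pi'' \to A \qquad \Gamma[B\Delta']\Delta'' \to C}{\Gamma\Pi'[\Pi''(A\backslash B)\Delta']\Delta'' \to C}\;(\backslash\to) \]
Let $E_1 \ldots E_r$ be an interpolant for the right premise, corresponding to the partition $B\Delta' = \Theta_1 \ldots \Theta_r$. Let $F_1 \ldots F_m$ be an interpolant for the left premise, corresponding to the partition $\Pi' = \Xi_1 \ldots \Xi_m$. Then, for a suitable sequence $\Upsilon$,
(1) $\Theta_1 = B\Upsilon$;
(2) $\Delta' = \Upsilon\Theta_2 \ldots \Theta_r$;
(3) $\mathrm{L} \vdash B\Upsilon \to E_1$;
(4) $\mathrm{L} \vdash \Theta_j \to E_j$ for every $j \ne 1$;
(5) $\mathrm{L} \vdash \Gamma E_1 \ldots E_r \Delta'' \to C$;
(6) $\#_p^+(E_1 \ldots E_r) \le \min(\#_p^+(B\Delta'),\ \#_p^+(C) + \#_p^-(\Gamma\Delta''))$ for every $p \in \mathrm{Var}$;
(7) $\#_p^-(E_1 \ldots E_r) \le \min(\#_p^-(B\Delta'),\ \#_p^-(C) + \#_p^+(\Gamma\Delta''))$ for every $p \in \mathrm{Var}$;
(8) $\Pi' = \Xi_1 \ldots \Xi_m$;
(9) $\mathrm{L} \vdash \Xi_j \to F_j$ for every $j \le m$;
(10) $\mathrm{L} \vdash F_1 \ldots F_m \Pi'' \to A$;
(11) $\#_p^+(F_1 \ldots F_m) \le \min(\#_p^+(\Pi'),\ \#_p^+(A) + \#_p^-(\Pi''))$ for every $p \in \mathrm{Var}$;
(12) $\#_p^-(F_1 \ldots F_m) \le \min(\#_p^-(\Pi'),\ \#_p^-(A) + \#_p^+(\Pi''))$ for every $p \in \mathrm{Var}$.

We show that $(F_m\backslash(\ldots\backslash(F_1\backslash E_1)\ldots))\,E_2 \ldots E_r$ is an interpolant for the conclusion, corresponding to the partition $\Pi''(A\backslash B)\Delta' = \Theta'_1 \ldots \Theta'_r$, where $\Theta'_1 = \Pi''(A\backslash B)\Upsilon$ and $\Theta'_j = \Theta_j$ for every $j \ne 1$.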
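First, we prove (iv):
\[
\begin{aligned}
\#_p^+\bigl((F_m\backslash(\ldots\backslash(F_1\backslash E_1)\ldots))\,E_2 \ldots E_r\bigr)
&= \#_p^+(E_1 \ldots E_r) + \#_p^-(F_1 \ldots F_m) \\
&\le \min\bigl(\#_p^+(B\Delta'),\ \#_p^+(C) + \#_p^-(\Gamma\Delta'')\bigr) + \min\bigl(\#_p^-(A) + \#_p^+(\Pi''),\ \#_p^-(\Pi')\bigr) \\
&\le \min\bigl(\#_p^+(B\Delta') + \#_p^-(A) + \#_p^+(\Pi''),\ \#_p^+(C) + \#_p^-(\Gamma\Delta'') + \#_p^-(\Pi')\bigr) \\
&= \min\bigl(\#_p^+(\Pi''(A\backslash B)\Delta'),\ \#_p^+(C) + \#_p^-(\Gamma\Pi'\Delta'')\bigr).
\end{aligned}
\]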
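Evidently, (i) holds, since $\Pi''(A\backslash B)\Delta' = \Theta'_1 \ldots \Theta'_r$. To prove (ii) we only need to verify that $\mathrm{L} \vdash \Theta'_1 \to (F_m\backslash(\ldots\backslash(F_1\backslash E_1)\ldots))$. Indeed,
\[
\dfrac{\dfrac{\dfrac{F_1 \ldots F_m \Pi'' \to A \qquad B\Upsilon \to E_1}{F_1 \ldots F_m \Pi''(A\backslash B)\Upsilon \to E_1}\;(\backslash\to)}{F_2 \ldots F_m \Pi''(A\backslash B)\Upsilon \to F_1\backslash E_1}\;(\to\backslash)}{\begin{array}{c}\vdots\\ \Pi''(A\backslash B)\Upsilon \to (F_m\backslash(\ldots\backslash(F_1\backslash E_1)\ldots))\end{array}}\;(\to\backslash)
\]
Finally, we prove (iii):
\[
\dfrac{\Xi_m \to F_m \qquad \dfrac{\Xi_1 \to F_1 \qquad \Gamma E_1 E_2 \ldots E_r \Delta'' \to C}{\begin{array}{c}\Gamma\Xi_1(F_1\backslash E_1)E_2 \ldots E_r \Delta'' \to C\\ \vdots\\ \Gamma\Xi_1 \ldots \Xi_{m-1}(F_{m-1}\backslash(\ldots\backslash(F_1\backslash E_1)\ldots))E_2 \ldots E_r \Delta'' \to C\end{array}}\;(\backslash\to)}{\Gamma\Xi_1 \ldots \Xi_{m-1}\Xi_m(F_m\backslash(F_{m-1}\backslash(\ldots\backslash(F_1\backslash E_1)\ldots)))E_2 \ldots E_r \Delta'' \to C}\;(\backslash\to)
\]

Case 5. The rule $(/\to)$ is handled similarly to case 4. □

Lemma 6.7. Let $\Phi \in \mathrm{Tp}(\backslash)^*$, $\Theta \in \mathrm{Tp}(\backslash)^*$, $\Psi \in \mathrm{Tp}(\backslash)^*$, $C \in \mathrm{Tp}(\backslash)$, and $\mathrm{L} \vdash \Phi\Theta\Psi \to C$. Then there is a natural number $r \ge 0$, there are sequences of types $\Theta_1, \ldots, \Theta_r \in \mathrm{Tp}(\backslash)^+$, and there are types $E_1, \ldots, E_r \in \mathrm{Tp}(\backslash)$ such that
(i) $\Theta_1 \ldots \Theta_r = \Theta$, i.e., the sequence $\Theta$ is divided into $r$ non-empty contiguous subsequences (if $\Theta = \Lambda$, then $r = 0$);
(ii) $\mathrm{L} \vdash \Theta_j \to E_j$ for every $j \le r$;
(iii) $\mathrm{L} \vdash \Phi E_1 \ldots E_r \Psi \to C$;
(iv) the inequalities $\#_p^+(E_1 \ldots E_r) \le \min(\#_p^+(\Theta),\ \#_p^+(C) + \#_p^-(\Phi\Psi))$ and $\#_p^-(E_1 \ldots E_r) \le \min(\#_p^-(\Theta),\ \#_p^-(C) + \#_p^+(\Phi\Psi))$ hold for every $p \in \mathrm{Var}$.

Proof. Induction on the length of a cut-free derivation. It suffices to repeat cases 1, 2, and 4 from the proof of Lemma 6.5. □

6.2. The elementary fragment $\{\backslash, /\}$.

Lemma 6.8. The calculus $\mathrm{L}(\backslash, /)$ has the interpolation property.

Proof. Let $\mathrm{L}(\backslash, /) \vdash A \to C$. According to Lemma 6.5 there is an interpolant $E_1 \ldots E_r$ for $A$ in the sequent $A \to C$. Obviously $r = 1$. We put $B \rightleftharpoons E_1$. □

Lemma 6.9. The calculus $\mathrm{L}(\backslash, /)$ does not have the generalized interpolation property.

Proof. Consider the sequent $p_1 p_2 \to p_3/(p_2\backslash(p_1\backslash p_3))$. It can be derived as follows:
\[
\dfrac{\dfrac{p_2 \to p_2 \qquad \dfrac{p_1 \to p_1 \qquad p_3 \to p_3}{p_1(p_1\backslash p_3) \to p_3}\;(\backslash\to)}{p_1 p_2 (p_2\backslash(p_1\backslash p_3)) \to p_3}\;(\backslash\to)}{p_1 p_2 \to p_3/(p_2\backslash(p_1\backslash p_3))}\;(\to /)
\]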
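We prove that there is no single-type interpolant for $p_1 p_2$ in this sequent. Assume for a contradiction that there exists a type $E \in \mathrm{Tp}(\backslash, /)$ such that $\mathrm{L} \vdash p_1 p_2 \to E$, $\mathrm{L} \vdash E \to p_3/(p_2\backslash(p_1\backslash p_3))$, and $\mathrm{Var}(E) \subseteq \{p_1, p_2\}$. We need the following translation $(\ )^{cl}$ that maps Lambek calculus types to propositional logic formulas:
\[
p^{cl} \rightleftharpoons p \text{ if } p \in \mathrm{Var}, \qquad
(A\backslash B)^{cl} \rightleftharpoons A^{cl} \supset B^{cl}, \qquad
(A/B)^{cl} \rightleftharpoons B^{cl} \supset A^{cl}, \qquad
(A \bullet B)^{cl} \rightleftharpoons A^{cl} \,\&\, B^{cl}.
\]
This translation is extended to sequents as follows:
\[ (A_1 \ldots A_n \to B)^{cl} \rightleftharpoons (A_1^{cl} \,\&\, \ldots \,\&\, A_n^{cl}) \supset B^{cl}. \]
It is routine to verify that if $\mathrm{L} \vdash \Pi \to C$, then the formula $(\Pi \to C)^{cl}$ is true in classical propositional logic. In particular, the formulas $(p_1 \,\&\, p_2) \supset E^{cl}$ and $E^{cl} \supset ((p_2 \supset (p_1 \supset p_3)) \supset p_3)$ are true. Substituting $\bot$ for $p_3$ in the latter formula (which leaves $E^{cl}$ unchanged, since $p_3 \notin \mathrm{Var}(E)$), we obtain $E^{cl} \supset (p_1 \,\&\, p_2)$. Thus the pure implicative formula $E^{cl}$ is classically equivalent to the formula $(p_1 \,\&\, p_2)$. Contradiction. □

6.3. The elementary fragment $\{\backslash\}$.

Lemma 6.10. Let $\Phi \in \mathrm{Tp}(\backslash)^*$, $\Theta \in \mathrm{Tp}(\backslash)^+$, $C \in \mathrm{Tp}(\backslash)$, and $\mathrm{L} \vdash \Phi\Theta \to C$. Then there is a type $E \in \mathrm{Tp}(\backslash)$ such that
(i) $\mathrm{L} \vdash \Theta \to E$;
(ii) $\mathrm{L} \vdash \Phi E \to C$;
(iii) the inequalities $\#_p^+(E) \le \min(\#_p^+(\Theta),\ \#_p^+(C) + \#_p^-(\Phi))$ and $\#_p^-(E) \le \min(\#_p^-(\Theta),\ \#_p^-(C) + \#_p^+(\Phi))$ hold for every $p \in \mathrm{Var}$.

Proof. Induction on the length of a cut-free derivation. All cases, except the following two, are trivial.

Case 1.
\[ \frac{\Pi \to A \qquad \Gamma'[\Gamma'' B \Delta] \to C}{\Gamma'[\Gamma''\Pi(A\backslash B)\Delta] \to C}\;(\backslash\to) \]
Following case 4e from the proof of Lemma 6.5, it is easy to verify that the interpolant for the right premise is also an interpolant for the conclusion.

Case 2.
\[ \frac{[\Pi']\Pi'' \to A \qquad \Gamma[B\Delta] \to C}{\Gamma\Pi'[\Pi''(A\backslash B)\Delta] \to C}\;(\backslash\to) \]
Let $E$ be an interpolant for the right premise. Applying Lemma 6.7 to the left premise, we obtain an interpolant $F_1 \ldots F_m$. Now, following case 4f from the proof of Lemma 6.5, one can verify that $(F_m\backslash(\ldots\backslash(F_1\backslash E)\ldots))$ is the desired interpolant. □

Corollary 6.11. The calculus $\mathrm{L}(\backslash)$ has the generalized interpolation property.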
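The last step of the proof of Lemma 6.9 (no purely implicational formula over $p_1, p_2$ is classically equivalent to $p_1 \,\&\, p_2$) can be checked mechanically. The following is only an illustrative sketch, not part of the original argument: it represents formulas by their truth tables and closes $\{p_1, p_2\}$ under implication. The names `table`, `implies`, and `implicational_tables` are hypothetical helpers introduced here.

```python
from itertools import product

# The four valuations of (p1, p2); a truth table is a tuple of four booleans.
VALUATIONS = list(product([False, True], repeat=2))


def table(fn):
    """Truth table of a Boolean function of (p1, p2)."""
    return tuple(fn(a, b) for a, b in VALUATIONS)


def implies(s, t):
    """Pointwise classical implication of two truth tables."""
    return tuple((not x) or y for x, y in zip(s, t))


def implicational_tables():
    """Close {p1, p2} under implication, working with truth tables only."""
    known = {table(lambda a, b: a), table(lambda a, b: b)}
    while True:
        new = {implies(s, t) for s in known for t in known} - known
        if not new:
            return known
        known |= new


TABLES = implicational_tables()
CONJUNCTION = table(lambda a, b: a and b)

# The conjunction never appears among the implicational truth tables.
print(CONJUNCTION in TABLES)
```

The closure is finite because there are only sixteen truth tables on two variables, so the loop terminates; the check confirms the contradiction used in the proof for this particular pair of variables.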
7. Construction of a context-free grammar for a product-free Lambek grammar

In this section we consider categorial grammars based on the product-free fragment of the Lambek calculus.

Definition 7.1. A product-free Lambek grammar is a triple $(T, H, \triangleright)$, where $T$ is a finite set (the alphabet), $H \in \mathrm{Tp}(\backslash, /)$, and $\triangleright$ is a finite binary relation $\triangleright \subseteq \mathrm{Tp}(\backslash, /) \times T$. The language generated by the product-free Lambek grammar $(T, H, \triangleright)$ is defined as the set of all strings $t_1 \ldots t_n$ over the alphabet $T$ for which there exist types $B_1, \ldots, B_n \in \mathrm{Tp}(\backslash, /)$ such that $\mathrm{L}(\backslash, /) \vdash B_1 \ldots B_n \to H$ and $B_i \triangleright t_i$ for all $i \le n$.

The cut-elimination property of the Lambek calculus entails its conservativity over its elementary fragments. Thus Theorem 5.11 implies that all languages generated by product-free Lambek grammars are context-free. However, in general the construction used in the proof of Theorem 5.11 involves types with product (as non-terminal symbols). The following question arises. Is it possible to construct, for an arbitrary product-free Lambek grammar, a corresponding "natural" context-free grammar with a finite subset of $\mathrm{Tp}(\backslash, /)$ as the alphabet of non-terminal symbols? The positive answer to this question is given by Theorem 7.3.

Lemma 7.2. Let a thin sequent $\Phi\Theta\Psi \to C$ be derivable in $\mathrm{L}(\backslash, /)$, and let the sequence $E_1 \ldots E_r \in \mathrm{Tp}(\backslash, /)^*$ be an interpolant corresponding to a partition $\Theta = \Theta_1 \ldots \Theta_r$ of the sequence $\Theta$ in the sequent $\Phi[\Theta]\Psi \to C$. Then
(i) for every $i \le r$ the sequent $\Theta_i \to E_i$ is thin;
(ii) the sequent $\Phi E_1 \ldots E_r \Psi \to C$ is thin;
(iii) $\|E_1 \ldots E_r\| = |[\![\Theta]\!]|$.

Proof. Similar to Lemma 4.5. □

Theorem 7.3. Let $(T, H, \triangleright)$ be a product-free Lambek grammar. We put
\[
\begin{aligned}
\mathcal{U} &\rightleftharpoons \{H\} \cup \{B \mid \text{there is } t \in T \text{ such that } B \triangleright t\};\\
m &\rightleftharpoons \max\{\|A\| \mid A \in \mathcal{U}\};\\
s &\rightleftharpoons \max\{i \in \mathbb{N} \mid \text{there is } A \in \mathcal{U} \text{ such that } p_i \in \mathrm{Var}(A)\};\\
W &\rightleftharpoons \{A \in \mathrm{Tp}(\backslash, /) \mid \mathrm{Var}(A) \subseteq \{p_1, \ldots, p_s\} \text{ and } \|A\| \le m\};\\
\sigma &\rightleftharpoons H;\\
\mathcal{R} &\rightleftharpoons \{B \Rightarrow t \mid t \in T \text{ and } B \triangleright t\} \cup \{A \Rightarrow \Gamma \mid \mathrm{L} \vdash \Gamma \to A,\ A \in W,\ \Gamma \in W^+,\ \text{and } \|\Gamma\| \le 2m\}.
\end{aligned}
\]
Then the context-free grammar $(T, W, \sigma, \mathcal{R})$ and the given Lambek grammar $(T, H, \triangleright)$ generate the same language.

Proof. One can repeat the argument from Section 5. The only non-trivial part is the proof of Lemma 5.9. We use Lemma 7.2 instead of Lemma 4.5 and obtain a type sequence $E_1 \ldots E_r$ as an interpolant. After this the cut rule needs to be applied $r$ times. □
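To make the parameters of Theorem 7.3 concrete, here is a small sketch that computes the set of types occurring in a grammar together with the bounds $m$ and $s$. It is an illustration only and rests on two assumptions: that $\|A\|$ counts occurrences of primitive types in $A$ (the definition is given earlier in the book, outside this section), and a made-up three-word lexicon. Primitive types $p_i$ are encoded as integers $i$, and `('\\', A, B)`, `('/', A, B)` stand for $A\backslash B$ and $A/B$; this encoding and the function names are hypothetical.

```python
def length(a):
    """Number of primitive type occurrences in a type (assumed meaning of ||A||)."""
    if isinstance(a, int):
        return 1
    _, left, right = a
    return length(left) + length(right)


def variables(a):
    """Indices i of the primitive types p_i occurring in a type."""
    if isinstance(a, int):
        return {a}
    _, left, right = a
    return variables(left) | variables(right)


def parameters(target, lexicon):
    """Return (U, m, s) for a grammar with target type H and a finite lexicon."""
    u = {target} | {b for b, _ in lexicon}
    m = max(length(a) for a in u)
    s = max(i for a in u for i in variables(a))
    return u, m, s


# Hypothetical toy grammar: p1 is the sentence type, p2 the noun-phrase type.
NP, S = 2, 1
LEXICON = [
    (NP, "john"),
    (('\\', NP, S), "sleeps"),             # p2 \ p1
    (('/', ('\\', NP, S), NP), "likes"),   # (p2 \ p1) / p2
]

U, M, S_BOUND = parameters(S, LEXICON)
print(len(U), M, S_BOUND)
```

For this toy lexicon the non-terminal alphabet $W$ would consist of all product-free types over $p_1, p_2$ with at most three primitive type occurrences.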
8. Conjoinable types in the Lambek calculus

The notion of conjoinability for the Lambek calculus is defined in [2, p. 76] as follows.

Definition 8.1. Two types $A$ and $B$ are said to be conjoinable iff there exists some type $C$ such that $\mathrm{L} \vdash A \to C$ and $\mathrm{L} \vdash B \to C$.

In this section it will be proved that two types $A$ and $B$ are conjoinable if and only if $[\![A]\!] = [\![B]\!]$.

Definition 8.2. We say that two types $A$ and $B$ are interchangeable if and only if $\mathrm{L} \vdash A \to B$ and $\mathrm{L} \vdash B \to A$.

Lemma 8.3. The interchangeability relation is a congruence on types.

Proof. Immediate from admissibility of the following rules:
\[
\frac{A \to B}{A \bullet C \to B \bullet C}, \qquad
\frac{A \to B}{B\backslash C \to A\backslash C}, \qquad
\frac{A \to B}{A/C \to B/C},
\]
\[
\frac{A \to B}{C \bullet A \to C \bullet B}, \qquad
\frac{A \to B}{C\backslash A \to C\backslash B}, \qquad
\frac{A \to B}{C/B \to C/A}. \qquad □
\]

Lemma 8.4. (i) The types $(A\backslash B)/C$ and $A\backslash(B/C)$ are interchangeable.
(ii) The types $(A \bullet B) \bullet C$ and $A \bullet (B \bullet C)$ are interchangeable.

We shall write $A\backslash B/C$ and $A \bullet B \bullet C$ (omitting unnecessary parentheses).

Lemma 8.5. Let $A$ and $B$ be any two types of the Lambek calculus. The following three assertions are equivalent.
(i) There exists a type $C$ such that $\mathrm{L} \vdash A \to C$ and $\mathrm{L} \vdash B \to C$.
(ii) There exists a type $D$ such that $\mathrm{L} \vdash D \to A$ and $\mathrm{L} \vdash D \to B$.
(iii) There exist types $C_0, \ldots, C_n$ such that $A = C_0$, $B = C_n$, and for every $i < n$, $\mathrm{L} \vdash C_i \to C_{i+1}$ or $\mathrm{L} \vdash C_{i+1} \to C_i$.

Proof. This lemma is proved by J. Lambek in [10]. To derive (ii) from (i), put $D \rightleftharpoons (A/C) \bullet C \bullet (C\backslash B)$. For the converse one can use $C \rightleftharpoons (D/A)\backslash D/(B\backslash D)$. □

Lemma 8.6. The conjoinability relation is a congruence on types.

Proof. Similar to the proof of Lemma 8.3. □

Lemma 8.7. (i) The types $A\backslash A$ and $B/B$ are conjoinable.
(ii) The types $A\backslash A$ and $B\backslash B$ are conjoinable.
(iii) The types $B$ and $B/(A\backslash A)$ are conjoinable.
(iv) The types $B$ and $B \bullet (A\backslash A)$ are conjoinable.
(v) The types $B/A$ and $B \bullet (A\backslash A/A)$ are conjoinable.
(vi) The types $A$ and $(A\backslash A/A)\backslash(A\backslash A/A)/(A\backslash A/A)$ are conjoinable.

Proof. (i) Note that $\mathrm{L} \vdash A\backslash A \to A\backslash(A \bullet B)/B$ and $\mathrm{L} \vdash B/B \to A\backslash(A \bullet B)/B$.
(ii) follows from (i) by transitivity.
(iii) According to (ii), $B/(B\backslash B)$ and $B/(A\backslash A)$ are conjoinable.
On the other hand, $\mathrm{L} \vdash B \to B/(B\backslash B)$.
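(iv) According to (ii), $B \bullet (B\backslash B)$ and $B \bullet (A\backslash A)$ are conjoinable. On the other hand, $\mathrm{L} \vdash B \bullet (B\backslash B) \to B$.
(v) From (iv) we conclude that $B/A$ and $(B \bullet (A\backslash A))/A$ are conjoinable. On the other hand, $\mathrm{L} \vdash B \bullet (A\backslash A/A) \to (B \bullet (A\backslash A))/A$.
(vi) $\mathrm{L} \vdash A \to (A\backslash A/A)\backslash(A\backslash A/A)/(A\backslash A/A)$. □

We introduce the auxiliary notion of simple product and construct a function from $F(\mathrm{Var})$ into the set of simple products.

Definition 8.8. A simple product is any type which is a product of factors of the form $p$ and $(p\backslash p)/p$, where $p \in \mathrm{Var}$. The set of all simple products will be denoted by $SP$.

Definition 8.9. We define the function $\mathrm{sp}\colon F(\mathrm{Var}) \to SP$ as follows:
\[
\begin{aligned}
\mathrm{sp}(\varepsilon) &\rightleftharpoons ((p_1\backslash p_1)/p_1) \bullet p_1,\\
\mathrm{sp}(p) &\rightleftharpoons p,\\
\mathrm{sp}(p^{-1}) &\rightleftharpoons (p\backslash p)/p,\\
\mathrm{sp}(uv) &\rightleftharpoons \mathrm{sp}(u) \bullet \mathrm{sp}(v) \quad \text{if } |u| = 1 \text{ and } |v| \ge 1.
\end{aligned}
\]

Lemma 8.10. The following claims hold for any $u \in F(\mathrm{Var})$ and for arbitrary types $A$ and $B$.
(i) The types $B/\mathrm{sp}(u)$ and $B \bullet \mathrm{sp}(u^{-1})$ are conjoinable.
(ii) The types $\mathrm{sp}(u)\backslash B$ and $\mathrm{sp}(u^{-1}) \bullet B$ are conjoinable.
(iii) The types $A$ and $\mathrm{sp}([\![A]\!])$ are conjoinable.

Proof. First we prove (i) by induction on the length of $u$. The induction base consists of three cases.

Case $u = \varepsilon$. Note that $\mathrm{sp}(\varepsilon)$ and $p_1\backslash p_1$ are conjoinable. It remains to prove that $B/(p_1\backslash p_1)$ and $B \bullet (p_1\backslash p_1)$ are conjoinable. This is immediate from Lemma 8.7 (iii) and (iv).

Case $u = p$ follows from Lemma 8.7 (v).

Case $u = p^{-1}$. From Lemma 8.7 (v) we conclude that $B/(p\backslash p/p)$ and $B \bullet ((p\backslash p/p)\backslash(p\backslash p/p)/(p\backslash p/p))$ are conjoinable. In view of Lemma 8.7 (vi), $B/(p\backslash p/p)$ and $B \bullet p$ are conjoinable.

Induction step. Let $u = vw$, where $|w| \ge 1$ and $v$ is either $p$ or $p^{-1}$. We must prove that $B/(\mathrm{sp}(v) \bullet \mathrm{sp}(w))$ and $B \bullet \mathrm{sp}(w^{-1}) \bullet \mathrm{sp}(v^{-1})$ are conjoinable. Note that $B/(\mathrm{sp}(v) \bullet \mathrm{sp}(w))$ and $(B/\mathrm{sp}(w))/\mathrm{sp}(v)$ are conjoinable (they are even interchangeable). According to the induction hypothesis, $B/\mathrm{sp}(w)$ and $B \bullet \mathrm{sp}(w^{-1})$ are conjoinable. It remains to establish that $(B \bullet \mathrm{sp}(w^{-1}))/\mathrm{sp}(v)$ and $B \bullet \mathrm{sp}(w^{-1}) \bullet \mathrm{sp}(v^{-1})$ are conjoinable. But this follows from the induction base for $B' = B \bullet \mathrm{sp}(w^{-1})$.

One can prove (ii) dually.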
The clause (iii) is proved by induction on the length of $A$, using (i) and (ii). □

Theorem 8.11. Two types $A$ and $B$ are conjoinable if and only if $[\![A]\!] = [\![B]\!]$.

Proof. If $\mathrm{L} \vdash A \to C$ and $\mathrm{L} \vdash B \to C$, then $[\![A]\!] = [\![B]\!]$ in view of Lemma 2.3. The converse implication follows from Lemma 8.10 (iii). □

9. Multiplicative cyclic linear logic

All results in this section are proved similarly to the corresponding Lambek calculus results from the preceding sections.
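As a concrete reference point for the Lambek calculus side, Theorem 8.11 turns conjoinability into a word problem in a free group, which is easy to compute. The sketch below is illustrative only. It assumes the free group interpretation defined earlier in the book (outside this section) is the usual one: a primitive type is sent to its generator, $A\backslash B$ to the inverse of the value of $A$ times the value of $B$, $B/A$ to the value of $B$ times the inverse of the value of $A$, and a product to the product of the values. Types are encoded as strings (primitive types) or triples `(op, left, right)`; all function names are hypothetical.

```python
def reduce_word(word):
    """Freely reduce a sequence of (generator, +1 or -1) letters."""
    out = []
    for var, exp in word:
        if out and out[-1] == (var, -exp):
            out.pop()          # cancel x x^-1 or x^-1 x
        else:
            out.append((var, exp))
    return tuple(out)


def inverse(word):
    """Inverse of a word: reverse it and flip every exponent."""
    return tuple((var, -exp) for var, exp in reversed(word))


def interpret(a):
    """Assumed free group interpretation of a Lambek type."""
    if isinstance(a, str):
        return ((a, 1),)
    op, left, right = a
    x, y = interpret(left), interpret(right)
    if op == '\\':                       # left \ right
        return reduce_word(inverse(x) + y)
    if op == '/':                        # left / right
        return reduce_word(x + inverse(y))
    if op == '*':                        # product
        return reduce_word(x + y)
    raise ValueError(op)


def sp(word, p1='p1'):
    """Simple product of Definition 8.9 for a freely reduced word."""
    def letter(var, exp):
        return var if exp == 1 else ('/', ('\\', var, var), var)
    if not word:
        return ('*', letter(p1, -1), p1)
    head, rest = word[0], word[1:]
    if not rest:
        return letter(*head)
    return ('*', letter(*head), sp(rest, p1))


def conjoinable(a, b):
    """Conjoinability test via Theorem 8.11 (equality in the free group)."""
    return interpret(a) == interpret(b)


# Lemma 8.7 (i): A\A and B/B are conjoinable.
print(conjoinable(('\\', 'a', 'a'), ('/', 'b', 'b')))
# a\b and b/a are not: their values differ in the (non-commutative) free group.
print(conjoinable(('\\', 'a', 'b'), ('/', 'b', 'a')))
```

Under the same assumption, `interpret(sp(w))` returns `w` for every freely reduced word `w`, which is the computational content behind Lemma 8.10 (iii).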
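9.1. The calculus CLL.

The cyclic linear logic was introduced in [21]. Here we consider its multiplicative fragment and denote it by CLL. A countable set $\mathrm{Var} = \{p_1, p_2, p_3, \ldots\}$ is assumed to be given. In the linear logic setting we shall call the elements of this set atomic formulas. They play precisely the same role as primitive types in the Lambek calculus.

The set of formulas $\mathrm{Fm}(\bullet, \parr, 1, \bot)$ of the calculus CLL is defined as the smallest set satisfying the following conditions:
• $1 \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$ and $\bot \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$;
• if $p_i \in \mathrm{Var}$, then $p_i \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$ and $p_i^{\bot} \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$;
• if $A \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$ and $B \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$, then $(A \bullet B) \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$ and $(A \parr B) \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$.

The sequents of the calculus CLL are of the form $\to \Gamma$, where $\Gamma \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$.

We need an operation $(\ )^{\bot}\colon \mathrm{Fm}(\bullet, \parr, 1, \bot) \to \mathrm{Fm}(\bullet, \parr, 1, \bot)$ defined on the set $\mathrm{Fm}(\bullet, \parr, 1, \bot)$. It maps each formula to its negation:
\[
\begin{aligned}
(1)^{\bot} &\rightleftharpoons \bot, & (\bot)^{\bot} &\rightleftharpoons 1,\\
(p_i)^{\bot} &\rightleftharpoons p_i^{\bot}, & (p_i^{\bot})^{\bot} &\rightleftharpoons p_i,\\
(A \bullet B)^{\bot} &\rightleftharpoons ((B)^{\bot} \parr (A)^{\bot}), & (A \parr B)^{\bot} &\rightleftharpoons ((B)^{\bot} \bullet (A)^{\bot}).
\end{aligned}
\]

We shall write $\mathrm{CLL} \vdash \Gamma$ iff the sequent $\to \Gamma$ is derivable in CLL. In this section $A_1 \ldots A_n \to B$ will stand for $\to (A_n)^{\bot} \ldots (A_1)^{\bot} B$.

The axioms of the calculus CLL are all sequents of the form $\to (p_i)^{\bot}\, p_i$, where $p_i \in \mathrm{Var}$, as well as the sequent $\to 1$. The calculus CLL has the following derivation rules:
\[
\frac{\to \Gamma A B \Delta}{\to \Gamma (A \parr B) \Delta}\;(\to\parr), \qquad
\frac{\to \Gamma A \qquad \to B \Delta}{\to \Gamma (A \bullet B) \Delta}\;(\to\bullet), \qquad
\frac{\to \Gamma\Delta}{\to \Gamma \bot \Delta}\;(\to\bot),
\]
\[
\frac{\to \Gamma\Delta}{\to \Delta\Gamma}\;(\text{rotate}), \qquad
\frac{\to \Gamma A \qquad \to (A)^{\bot} \Delta}{\to \Gamma\Delta}\;(\text{cut}).
\]

Remark 9.1. The calculus CLL is conservative over the calculus $\mathrm{L}^*$ if we translate $A\backslash B$ as $(A)^{\bot} \parr B$ and $B/A$ as $B \parr (A)^{\bot}$. If we constrain the derivation rule $(\to\parr)$ by requiring that $\Gamma\Delta \ne \Lambda$ and omit the axiom $\to 1$, then we obtain a variant of the cyclic linear logic which is conservative over the Lambek calculus $\mathrm{L}$.

9.2. Free group interpretation of CLL-formulas.

Definition 9.2. The free group interpretation of CLL-formulas (denoted by $[\![\ ]\!]$) is the following natural mapping of formulas and their finite sequences into the group $F(\mathrm{Var})$: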
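\[
\begin{aligned}
[\![1]\!] &\rightleftharpoons \varepsilon, & [\![\bot]\!] &\rightleftharpoons \varepsilon,\\
[\![p_i]\!] &\rightleftharpoons p_i, & [\![p_i^{\bot}]\!] &\rightleftharpoons p_i^{-1},\\
[\![A \bullet B]\!] &\rightleftharpoons [\![A]\!][\![B]\!], & [\![A \parr B]\!] &\rightleftharpoons [\![A]\!][\![B]\!],\\
[\![A_1 \ldots A_n]\!] &\rightleftharpoons [\![A_1]\!] \ldots [\![A_n]\!].
\end{aligned}
\]

Lemma 9.3. For any formula $A \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$, $|[\![A]\!]| \le \|A\|$.

Proof. By induction on the construction of $A$. □

Lemma 9.4. If a sequent $\to \Gamma$ is derivable in the calculus CLL, then $[\![\Gamma]\!] = \varepsilon$.

Proof. By induction on derivations, similarly to the proof of Lemma 2.3. □

9.3. Thin sequents in CLL.

Definition 9.5. The length $\|A\|$ of a formula $A$ is defined as the total number of atomic formula occurrences in $A$; in particular,
\[ \|A \bullet B\| \rightleftharpoons \|A\| + \|B\|, \qquad \|A \parr B\| \rightleftharpoons \|A\| + \|B\|. \]
The length of a finite sequence of formulas is defined in the natural way:
\[ \|A_1 \ldots A_n\| \rightleftharpoons \|A_1\| + \ldots + \|A_n\|. \]

Definition 9.6. The set of all atomic formulas occurring in a formula $A$ will be denoted by $\mathrm{Var}(A)$.

Definition 9.7. For every atomic formula $p \in \mathrm{Var}$ we define two mappings $\#_p^+$ and $\#_p^-$ from the set $\mathrm{Fm}(\bullet, \parr, 1, \bot)$ into $\mathbb{N}$:
\[
\begin{aligned}
\#_p^+(q) &\rightleftharpoons \begin{cases} 1, & \text{if } q = p,\\ 0, & \text{if } q \in \mathrm{Var} \text{ and } q \ne p, \end{cases}
& \#_p^-(q) &\rightleftharpoons 0, \text{ if } q \in \mathrm{Var},\\
\#_p^+(q^{\bot}) &\rightleftharpoons 0, \text{ if } q \in \mathrm{Var},
& \#_p^-(q^{\bot}) &\rightleftharpoons \begin{cases} 1, & \text{if } q = p,\\ 0, & \text{if } q \in \mathrm{Var} \text{ and } q \ne p, \end{cases}\\
\#_p^+(A \bullet B) &\rightleftharpoons \#_p^+(A) + \#_p^+(B),
& \#_p^-(A \bullet B) &\rightleftharpoons \#_p^-(A) + \#_p^-(B),\\
\#_p^+(A \parr B) &\rightleftharpoons \#_p^+(A) + \#_p^+(B),
& \#_p^-(A \parr B) &\rightleftharpoons \#_p^-(A) + \#_p^-(B).
\end{aligned}
\]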
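These mappings are extended to finite sequences of formulas:
\[
\#_p^+(A_1 \ldots A_n) \rightleftharpoons \#_p^+(A_1) + \ldots + \#_p^+(A_n), \qquad
\#_p^-(A_1 \ldots A_n) \rightleftharpoons \#_p^-(A_1) + \ldots + \#_p^-(A_n).
\]

Definition 9.8. A sequent $\to \Pi$ is thin iff $\#_p^+(\Pi) \le 1$ and $\#_p^-(\Pi) \le 1$ for every $p \in \mathrm{Var}$.

Lemma 9.9. Let $\theta$ be an atomic formula substitution. If one replaces every sequent $\to \Gamma$ by $\to \theta(\Gamma)$ in a CLL-derivation, then the resulting tree is a legal CLL-derivation.

Theorem 9.10. A sequent $\to \Pi$ is derivable in CLL if and only if there exist a thin sequent $\to \Theta$ derivable in CLL and an atomic formula substitution $\theta$ such that $\Pi = \theta(\Theta)$.

9.4. Interpolation in CLL.

Lemma 9.11. Let $\mathrm{CLL} \vdash \Gamma\Pi\Delta$, $\Gamma \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $\Pi \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, and $\Delta \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$. Then there exists a formula $E$ such that
(i) $\mathrm{CLL} \vdash (E)^{\bot}\, \Pi$;
(ii) $\mathrm{CLL} \vdash \Gamma E \Delta$;
(iii) the inequality $\#_p^+(E) \le \min(\#_p^+(\Pi),\ \#_p^-(\Gamma\Delta))$ holds for every atomic formula $p$;
(iv) the inequality $\#_p^-(E) \le \min(\#_p^-(\Pi),\ \#_p^+(\Gamma\Delta))$ holds for every atomic formula $p$.

Lemma 9.12. Let $\mathrm{CLL} \vdash \Gamma\Pi\Delta$, $\Gamma \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $\Pi \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $\Delta \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$. Let the sequent $\to \Gamma\Pi\Delta$ be thin. Then there exists a formula $E$ such that
(i) $\mathrm{CLL} \vdash (E)^{\bot}\, \Pi$;
(ii) $\mathrm{CLL} \vdash \Gamma E \Delta$;
(iii) the sequent $\to (E)^{\bot}\, \Pi$ is thin;
(iv) the sequent $\to \Gamma E \Delta$ is thin;
(v) $\|E\| = |[\![\Pi]\!]|$.

Lemma 9.13. Let $\mathrm{CLL} \vdash \Phi\Theta\Psi \to C$, $\Phi \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $\Theta \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $\Psi \in \mathrm{Fm}(\bullet, \parr, 1, \bot)^*$, $C \in \mathrm{Fm}(\bullet, \parr, 1, \bot)$. Let the sequent $\Phi\Theta\Psi \to C$ be thin. Then there exists a formula $E$ such that
(i) $\mathrm{CLL} \vdash \Theta \to E$;
(ii) $\mathrm{CLL} \vdash \Phi E \Psi \to C$;
(iii) the sequent $\Theta \to E$ is thin;
(iv) the sequent $\Phi E \Psi \to C$ is thin;
(v) $\|E\| = |[\![\Theta]\!]|$.

Proof. Given $\mathrm{CLL} \vdash (\Psi)^{\bot} (\Theta)^{\bot} (\Phi)^{\bot} C$, it remains to apply Lemma 9.12 with $\Gamma \rightleftharpoons (\Psi)^{\bot}$, $\Pi \rightleftharpoons (\Theta)^{\bot}$, $\Delta \rightleftharpoons (\Phi)^{\bot} C$. □

9.5. Grammars based on CLL.

Definition 9.14. A categorial grammar based on the calculus CLL (or a CLL-grammar) is a triple $(T, H, \triangleright)$, where $T$ is a finite set (the alphabet), $H$ is a formula, and $\triangleright$ is a finite binary relation $\triangleright \subseteq \mathrm{Fm}(\bullet, \parr, 1, \bot) \times T$.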
The language generated by the grammar $(T, H, \triangleright)$ is defined as the set of all non-empty strings $t_1 \ldots t_n$ over the alphabet $T$ for which there exists a sequent $B_1 \ldots B_n \to H$, derivable in CLL, such that $B_i \triangleright t_i$ for all $i \le n$. We shall denote this language by $\mathcal{L}_{\mathrm{CLL}}(T, H, \triangleright)$.

Remark 9.15. It is possible that the sequent $\to H$ is derivable in CLL. Nevertheless the empty word is not included in the language generated by the grammar. This ensures compatibility with our definition of a context-free grammar in subsection 1.1, where we banned productions of the form $\alpha \Rightarrow \varepsilon$ and thus excluded the possibility that the empty word would occur in the generated language.

9.6. Context-freeness of CLL-grammars.

Definition 9.16. We introduce two families of sets of CLL-formulas:
\[
\mathrm{Fm}(m) \rightleftharpoons \{A \in \mathrm{Fm}(\bullet, \parr, 1, \bot) \mid \|A\| \le m\}, \qquad
\mathrm{Fm}(m, s) \rightleftharpoons \{A \in \mathrm{Fm}(m) \mid \mathrm{Var}(A) \subseteq \{p_1, \ldots, p_s\}\}.
\]

Consider an arbitrary CLL-grammar $(T, H, \triangleright)$. Only a finite number of types are relevant in the definition of the language generated by this grammar. Thus there are positive integers $m$ and $s$ such that $H \in \mathrm{Fm}(m, s)$ and, if $B \triangleright t$ for some $t \in T$, then $B \in \mathrm{Fm}(m, s)$. There is no loss of generality in assuming that the sets $T$ and $\mathrm{Fm}(m, s)$ do not intersect. Now we construct the desired context-free grammar $(T, W, \sigma, \mathcal{R})$:
\[
\begin{aligned}
W &\rightleftharpoons \mathrm{Fm}(m, s),\\
\sigma &\rightleftharpoons H,\\
\mathcal{R} &\rightleftharpoons \{B \Rightarrow t \mid t \in T \text{ and } B \triangleright t\} \cup \{A \Rightarrow \Gamma \mid A \in \mathrm{Fm}(m, s),\ \Gamma \in \mathrm{Fm}(m, s)^*,\ \|\Gamma\| \le 2m, \text{ and } \mathrm{CLL} \vdash \Gamma \to A\}.
\end{aligned}
\]

We define an auxiliary calculus $\mathrm{CLLcut}_m$.

Definition 9.17. A sequent $\Gamma \to A$ is an axiom of the calculus $\mathrm{CLLcut}_m$ if and only if $\mathrm{CLL} \vdash \Gamma \to A$, $A \in \mathrm{Fm}(m)$, $\Gamma \in \mathrm{Fm}(m)^*$, and $\|\Gamma\| \le 2m$. The only derivation rule of the calculus $\mathrm{CLLcut}_m$ is the cut rule
\[ \frac{\Pi \to B \qquad \Gamma B \Delta \to A}{\Gamma\Pi\Delta \to A}\;(\text{cut}). \]

Lemma 9.18. Let $(T, H, \triangleright)$ be a CLL-grammar and $(T, W, \sigma, \mathcal{R})$ the corresponding context-free grammar. Let $\Gamma \in \mathrm{Fm}(m, s)^*$ and $A \in \mathrm{Fm}(m, s)$. Then the following three assertions are equivalent:
(i) $\mathcal{G}(W, \mathcal{R}) \vdash \Gamma \to A$;
(ii) $\mathrm{CLLcut}_m \vdash \Gamma \to A$;
(iii) $\mathrm{CLL} \vdash \Gamma \to A$.

Theorem 9.19.
Let $(T, H, \triangleright)$ be an arbitrary CLL-grammar. Then the language $\mathcal{L}_{\mathrm{CLL}}(T, H, \triangleright)$ is context-free.

Remark 9.20. The converse is true in view of conservativity of CLL over $\mathrm{L}^*$. Consequently the class of languages generated by grammars based on the multiplicative cyclic linear logic coincides with the class of all context-free languages.
References

[1] Y. Bar-Hillel, C. Gaifman, and E. Shamir, On categorial and phrase structure grammars, Bull. Res. Council Israel F 9 (1960), 1-16.
[2] J. van Benthem, Language in action, North-Holland, Amsterdam, 1991.
[3] W. Buszkowski, The equivalence of unidirectional Lambek categorial grammars and context-free grammars, Z. Math. Logik Grundlagen Math. 31 (1985), 369-384.
[4] W. Buszkowski, Generative power of categorial grammars, Categorial Grammars and Natural Language Structures (R. T. Oehrle, E. Bach, and D. Wheeler, eds.), Reidel, Dordrecht, 1988, pp. 69-94.
[5] W. Buszkowski, On generative capacity of the Lambek calculus, Logics in AI (J. van Eijck, ed.), Springer, 1991, pp. 139-152.
[6] W. Buszkowski, On the equivalence of Lambek categorial grammars and basic categorial grammars, ILLC Prepublication Series LP-93-07, Institute for Logic, Language and Computation, University of Amsterdam, 1993.
[7] N. Chomsky, Formal properties of grammars, Handbook of Mathematical Psychology (R. D. Luce et al., eds.), vol. 2, John Wiley and Sons, New York, 1963, pp. 323-418.
[8] J. M. Cohen, The equivalence of two concepts of categorial grammar, Inform. and Control 10 (1967), 475-484.
[9] G. Lallement, Semigroups and combinatorial applications, John Wiley and Sons, New York, 1979.
[10] J. Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), no. 3, 154-170.
[11] M. Pentus, Equivalent types in Lambek calculus and linear logic, Prepublication Series in Logic and Computer Science LCS-92-2, Steklov Mathematical Institute, Moscow, April 1992.
[12] M. Pentus, Lambek grammars are context free, Prepublication Series in Logic and Computer Science LCS-92-8, Steklov Mathematical Institute, Moscow, December 1992.
[13] M. Pentus, The conjoinability relation in Lambek calculus and linear logic, ILLC Prepublication Series ML-93-03, Institute for Logic, Language and Computation, University of Amsterdam, 1993.
[14] M. Pentus, Lambek grammars are context free, Proceedings of the 8th Annual IEEE Symposium on Logic in Computer Science, IEEE Computer Society Press, 1993, pp. 429-433.
[15] M. Pentus, The conjoinability relation in Lambek calculus and linear logic, J. Logic Lang. Inform. 3 (1994), no. 2, 121-140.
[16] M. Pentus, Lambek calculus and formal grammars, Fundamentalnaya i Prikladnaya Matematika 1 (1995), 729-751 (Russian).
[17] M. Pentus, Product-free Lambek calculus and context-free grammars, J. Symbolic Logic 62 (1997), no. 2, 648-660.
[18] D. Roorda, Resource logics: proof-theoretical investigations, Ph.D. thesis, University of Amsterdam, 1991.
[19] D. Roorda, Interpolation in fragments of classical linear logic, J. Symbolic Logic 59 (1994), no. 2, 419-444.
[20] K. Schütte, Der Interpolationssatz der intuitionistischen Prädikatenlogik, Math. Ann. 148 (1962), 192-200.
[21] D. N. Yetter, Quantales and noncommutative linear logic, J. Symbolic Logic 55 (1990), no. 1, 41-64.

Department of Mathematical Logic and Theory of Algorithms, Faculty of Mechanics and Mathematics, Moscow State University
E-mail address: pentus@lpcs.math.msu.ru
URL: http://markov.math.msu.ru/~pentus/

Translated by the author
Amer. Math. Soc. Transl. (2) Vol. 192, 1999

Relativizability in Complexity Theory

Nikolai K. Vereshchagin

Abstract. Starting with a paper of Baker, Gill and Solovay in complexity theory, many results have been proved which separate certain relativized complexity classes or show that they have no complete language. All results of this kind were, in fact, based on lower bounds for Boolean decision trees or circuits of a certain type, or for machines with polylogarithmic restrictions on time. The following question arises: Are these methods of proving "relativized" results universal? We propose a general framework in which assertions of universality of this kind may be formulated and proved as convenient criteria. Using these criteria we obtain some new "relativized" results and new proofs of some known results. For example, for many of the complexity classes studied in the literature all relativizable inclusions between the classes are found.

1991 Mathematics Subject Classification. Primary 68Q15, 03D15.
©1999 American Mathematical Society

Notation

$\mathbb{B}$: the set $\{0, 1\}$.
$\mathbb{B}^*$: the set of all words over the alphabet $\mathbb{B}$ (= binary words).
$\mathbb{B}^n$: the set of all binary words of length $n$.
$\mathbb{B}^{\le n}$: the set of all binary words of length at most $n$.
$|x|$: the length of the word $x$.
$xy$: the concatenation of words $x$ and $y$.
$B \oplus C$: $= \{0u \mid u \in B\} \cup \{1v \mid v \in C\}$, where $B$, $C$ are languages.
$|M|$: the number of elements in a set $M$.
$\mathbb{N}$: the set of natural numbers.
$\Omega$: the set of functions from $\mathbb{B}^*$ to $\mathbb{B}$.
$\mathrm{Dom}\, f$: the domain of a function $f$.
$f|M$: the restriction of a function $f$ to a set $M$.
$\mathrm{Prob}[M]$: the probability of the event $M$.
$\le_m^P$: polynomial-time m-reducibility (Karp reducibility).
$\le_T^P$: polynomial-time Turing reducibility (Cook reducibility).

1. Introduction

The majority of theorems in recursion theory are known to be relativizable. This means that for any language $A$ a theorem remains true if we take machines
supplied with oracle $A$ as the model of computation. This is not true in complexity theory. In 1975 in the paper [5], oracles $A$ and $B$ were constructed such that $P^A \ne NP^A$ and $P^B = NP^B$. This means that although we do not know which of the two assertions $P = NP$ and $P \ne NP$ is true, neither of them is relativizable. After [5], many theorems of the following kind have been proved (for pairs of specific complexity classes $K_1$, $K_2$): there are oracles $A$ and $B$ such that $K_1^A \ne K_2^A$ and $K_1^B = K_2^B$. Since many interesting complexity classes lie between P and PSPACE, for such classes one can always take the oracle $B$ constructed in [5] as the second oracle, because in fact $P^B = PSPACE^B$ for that oracle. In 1989 the first non-relativizable theorem in complexity theory appeared. In [36] it was shown that $PH \subseteq IP$. Earlier, in [18], it was proved that $\exists A\ \text{co-}NP^A \not\subseteq IP^A$. However, only a few non-relativizable results in complexity theory are known.

All known proofs of results of the form $\exists A\ K_1^A \ne K_2^A$ (that is, $\exists A\ K_1^A \not\subseteq K_2^A$ or the converse) consist of two parts: the "diagonal" part (constructing an oracle step by step), which is the same in all proofs, and the specific "combinatorial" part, in which it is proved that every step can be made. Our first result is a formalization of this statement. The proof of Theorem 1 in Section 3.1 is a general formulation of the diagonal part of such proofs. Corollary 3 shows what combinatorial assertion is to be proved in every specific case.

Theorems of the following form have also appeared in the literature: There exists an oracle $A$ for which the class $K^A$ has no Karp complete (or Cook complete) language. (For the definition of Karp and Cook reductions, see Sections 3.2 and 3.3.)
For example, in [47], it is proved that there is an oracle $A$ for which the class $NP^A \cap \text{co-}NP^A$ has no Karp complete language (more precisely, no language complete under polynomial many-one reductions relative to $A$), and there is an oracle $A$ for which the class $R^A$ has no Karp complete language.

All we have said about proofs of theorems of the form $\exists A\ K_1^A \not\subseteq K_2^A$ is true for proofs of non-existence of complete languages in complexity classes. Theorem 4 in Section 3.2 provides the diagonal part of such proofs in a general form.

Both Theorem 1 and Theorem 4 give criteria. Theorem 1 is a criterion for whether
(1) $\forall A\ K_1^A \subseteq K_2^A$,
while Theorem 4 is a criterion for whether
(2) $\forall A$ ($K_2^A$ has a Karp complete problem for the class $K_1^A$).

Roughly speaking, the criteria are as follows. Let $K$ be a complexity class. Let us replace all polynomial restrictions in the definition of the class $K$ by polylogarithmic ones and replace decision problems (i.e., languages) by separation problems. Let $K\mathrm{LOGS}$ denote the resulting "counterpart" of the class $K$. Then the assertion (1) is equivalent to the absolute inclusion $K_1\mathrm{LOGS} \subseteq K_2\mathrm{LOGS}$, and the assertion (2) is true if and only if the class $K_2\mathrm{LOGS}$ has a language complete for the class $K_1\mathrm{LOGS}$. Analysis of proofs of relativizable assertions of the form (1) (for example, $BPP \subseteq \Sigma_2 \cap \Pi_2$ from [48]) shows that the more natural formulations of such assertions have the form $K_1\mathrm{LOGS} \subseteq K_2\mathrm{LOGS}$.

To formulate these criteria in a rigorous form we present in Section 2 a uniform way to define complexity classes. The same approach was proposed independently in [10].
Similar criteria exist also for theorems of the following two forms:
(3) $\forall A$ (the class $K_2^A$ has a Cook complete language for the class $K_1^A$)
and
(4) $\forall A$ ($\forall L_1 \in K_1^A\ \exists L_2 \in K_2^A$ : $L_1$ is Cook reducible to $L_2$), that is, "$K_1^A$ is Cook reducible to $K_2^A$".
These criteria are formulated in Sections 3.3 and 3.4. They simplify solving problems of the forms (1)-(4) both psychologically and technically.

In Sections 4, 5 and 6 we ascertain, for several known classes $K_1$, $K_2$ between P and PSPACE, whether assertion (1) is true, its negation is true, or the answer is unknown. We do the same thing also for assertions of the forms (2), (3) and (4). Some new positive and negative results of this type are proved. To obtain one of them, namely, to build an oracle under which AM is not included in PP, we prove a new lower bound for perceptron complexity (Section 7), which is interesting in its own right. Some problems of this kind remain open.

In Section 8 we present more involved oracle constructions. Namely, we build oracles under which some classes coincide and some other classes do not coincide (say, an oracle under which $P = R \ne BPP$). The hardest of these results is the existence of an oracle $A$ such that $P^A \ne NP^A$ but both co-$NP^A$-sets and $NP^A$-sets are $P^A$-separable and in addition $P^A = BPP^A$. All these constructions are done using the same method. We formalize the method and exhibit two theorems that cannot be proved by this method.

Since the relations between relativized complexity classes depend on the oracle, it is natural to ask what happens for a "typical" oracle. A possible refinement of the notion of typicality is randomness with respect to the uniform measure. In Section 9, we study the relation between the classes $NP^A$, co-$NP^A$ and $P^A$ for a random oracle $A$. More precisely, we say that assertion $S(A)$ holds for a random $A$, or for almost all $A$, if the uniform measure of the set $\{A \mid S(A)\}$ is equal to 1.
We consider only properties $S(A)$ satisfying the following two conditions: the set $\{A \mid S(A)\}$ is measurable, and $S(A)$ is stable with respect to any change in the values of $A$ on a finite number of arguments. By the 0-1 law of A. N. Kolmogorov, in this case either $S(A)$ holds for a random $A$, or $\neg S(A)$ holds for a random $A$.

2. A uniform way to define complexity classes

Consider the definitions of two popular complexity classes, NP and BPP, in a convenient form.

Definition 1. $L \in \mathrm{NP}$ if there exist a polynomial-time function $s : B^* \to N$ and a polynomial-time predicate $P(x,i)$ such that $x \in L \iff \exists i < s(x)\ P(x,i)$.

Definition 2. $L \in \mathrm{BPP}$ if there exist a polynomial-time function $s : B^* \to N$ and a polynomial-time predicate $P(x,i)$ such that for any $x \in L$ the ratio $|\{i \in N \mid 1 \le i \le s(x),\ P(x,i)\}| / s(x)$ is greater than 2/3 and for any $x \notin L$ this ratio is less than 1/3.

Let $f(x)$ denote in both definitions the sequence of values of the predicate $P(x,i)$ for $i < s(x)$. Then the membership of $x$ in $L$ is defined in terms of the word $f(x)$. Any bit of the word $f(x)$ can be computed in time polynomial in $|x|$, given its number. Now we come to the following definition.

Definition 3. A function $f$ is bit-computable in time $t$ if
1. the function $x \mapsto |f(x)|$ is computable in time $t(|x|)$,¹
2. the partial binary predicate $P(x,i)$ = (the $i$th bit of the word $f(x)$) can be computed by a machine $M$ that works in time $t(|x|)$ on all $x \in B^*$ and all $i < |f(x)|$.

Functions that are bit-computable in time $\mathrm{poly}(n)$ (where $\mathrm{poly}(n)$ is a polynomial) are called polynomial-time bit-computable. For example, the function $f(x) = 0^{2^{|x|}}$ is polynomial-time bit-computable.

A separation problem is a function from the set $B^*$ to the set $\{0,1,*\}$. The meaning of this definition is that we have to separate the set $\{x \mid F(x) = 1\}$ from the set $\{x \mid F(x) = 0\}$. We identify a language $L \subseteq B^*$ with its characteristic function, denoted by the same letter:

$$L(x) = \begin{cases} 1 & \text{if } x \in L, \\ 0 & \text{if } x \notin L. \end{cases}$$

Thus any language can be considered as a separation problem. Let us define a partial ordering on the set $\{0,1,*\}$ assuming that $* < 0$ and $* < 1$. A separation problem $F_1$ is easier than a separation problem $F_2$ (notation: $F_1 \le F_2$) if $F_1(x) \le F_2(x)$ for all $x \in B^*$. In other words, $F_2(x) = F_1(x)$ for all $x$ such that $F_1(x) \ne *$. If $L$ is a language, $F$ a separation problem, and $F \le L$, then we say that $L$ is a solution to $F$.

Both Definitions 1 and 2 have the following form. For a fixed separation problem $F$ we declare that a language $L$ is in the class if there exists a polynomial-time bit-computable function $f$ such that $L(x) = F(f(x))$ for all $x \in B^*$. Let $\mathrm{POLY}(F)$ denote the class defined in this way by means of a separation problem $F$. We say that a class $K$ is represented by a separation problem $F$ if $K = \mathrm{POLY}(F)$. For example, the class NP is represented by the following separation problem:

$$F_{\mathrm{NP}}(\alpha) = \begin{cases} 1 & \text{if } \exists i < |\alpha|\ \ \alpha(i) = 1, \\ 0 & \text{otherwise.} \end{cases}$$

To represent the class BPP we can take as $F$ the separation problem

$$F_{\mathrm{BPP}}(\alpha) = \begin{cases} 1 & \text{if } \#_1(\alpha) > \tfrac{2}{3}|\alpha|, \\ 0 & \text{if } \#_1(\alpha) < \tfrac{1}{3}|\alpha|, \\ * & \text{otherwise,} \end{cases}$$

where $\#_1(x)$ denotes the number of 1's in the binary word $x$.
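The notions just introduced can be made concrete with a minimal sketch. The names `f_np`, `f_bpp`, `leq`, `is_solution`, `majority` and `all_words` are illustrative assumptions, not notation from the text; words are Python strings over "0" and "1", and the undefined value is the string "*". The check that a language solves a separation problem is only carried out over words of bounded length.

```python
from itertools import product

STAR = "*"  # the third value of a separation problem ("no requirement")

def f_np(alpha):
    """F_NP: 1 if some bit of alpha is 1, and 0 otherwise (a language)."""
    return 1 if "1" in alpha else 0

def f_bpp(alpha):
    """F_BPP: 1 if more than 2/3 of the bits are 1, 0 if fewer than 1/3, else *."""
    ones = alpha.count("1")
    if 3 * ones > 2 * len(alpha):
        return 1
    if 3 * ones < len(alpha):
        return 0
    return STAR

def leq(a, b):
    """Partial order on {0, 1, *}: * lies below 0 and 1, which are incomparable."""
    return a == STAR or a == b

def all_words(max_len):
    """All binary words of length at most max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def is_solution(language, problem, max_len):
    """Check problem <= language on all words of length <= max_len."""
    return all(leq(problem(w), language(w)) for w in all_words(max_len))

def majority(alpha):
    """An illustrative language: strictly more than half of the bits are 1."""
    return 1 if 2 * alpha.count("1") > len(alpha) else 0
```

Under these assumptions, `majority` agrees with `f_bpp` wherever `f_bpp` is defined, so it is a solution to `f_bpp` on the words tested, while `f_np` is not (it disagrees on the word `1000`).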
In Section 4, we shall give the definitions in the form $\mathrm{POLY}(F)$ of the following classes: R, UP, FewP, Few, $\Sigma_k$, $\oplus$P, PP, PSPACE, MA, AM, IP. Let us now give such a definition for the simplest class, P. Recall that a language $L$ belongs to the class P if there is a polynomial-time Turing machine $M$ that recognizes $L$. Let $F_{\mathrm{P}}(\alpha)$ = (the first bit of $\alpha$). It is easy to see that $\mathrm{P} = \mathrm{POLY}(F_{\mathrm{P}})$.

The definition of the class $\mathrm{POLY}(F)$ easily relativizes as follows. An oracle is any language. An oracle machine is a Turing machine having an extra tape called the oracle tape; this tape has a read/write head. That head can write only zeros and ones. To run an oracle machine on an input we must supply it with an oracle.

¹ Convention: we assume that natural numbers are represented in binary. Moreover, we identify natural numbers and binary words: a natural number $n$ is identified with the binary notation of the number $n + 1$ without the leading 1.
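The convention of footnote 1 is a bijection between natural numbers and binary words, and it can be sketched in a few lines. The function names `nat_to_word` and `word_to_nat` are illustrative assumptions.

```python
def nat_to_word(n):
    """Binary notation of n + 1 with the leading 1 removed."""
    return bin(n + 1)[3:]  # bin() yields '0b1...'; drop '0b' and the leading 1

def word_to_nat(w):
    """Inverse map: put the leading 1 back, read as binary, subtract 1."""
    return int("1" + w, 2) - 1
```

With this identification the numbers 0, 1, 2, 3, 4 correspond to the empty word, `0`, `1`, `00`, `01`, and every binary word (including words with leading zeros) denotes exactly one number.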
Let $A$ be an oracle. Then the machine works as a usual two-tape Turing machine with one exception. If the oracle machine gets into a certain state, then the word $u$ written on the oracle tape (starting from the first cell up to the cell where the head is now) is considered as a question to the oracle. In this case the oracle provides its answer $A(u)$ in the cell viewed by the head. The time needed for the oracle to provide its answer is assumed to be 1.

Let $M$ be an oracle machine and $A$ an oracle. Then $M^A(x)$ denotes the output produced by $M$ with oracle $A$ on input $x$, and $t_{M^A}(x)$ the running time necessary to provide this output. We say that an oracle machine $M$ is polynomial [exponential] if there exists a polynomial $q(n)$ [a constant $c$] such that $t_{M^A}(x) \le q(|x|)$ [$t_{M^A}(x) \le 2^{c|x|+c}$] for all $x \in B^*$ and all $A \subseteq B^*$. A function $f$ is called polynomial [exponential] relative to $A$ if there exists a polynomial [exponential] oracle machine $M$ such that $f(x) = M^A(x)$ for all $x$ (that is, $M^A$ computes $f$).

Now we want to define the notion of bit-computability relative to an oracle. To this end we just allow the machine $M$ in the definition of bit-computability to query the oracle $A$, and we allow the function $|f(x)|$ to be computable in time $t(|x|)$ by a machine with oracle $A$.

Definition 4. $\mathrm{POLY}^A(F)$ is the class of languages $L$ such that $L(x) = F(f(x))$ for all $x \in B^*$ for some function $f$ that is polynomial-time bit-computable relative to $A$.

Let $\log n$ denote $[\log_2(n + 1)]$. Functions of the form $p(\log n)$, where $p$ is a polynomial, will be called polylogarithms. The expression $\mathrm{polylog}(n)$ will denote a polylogarithm.

We shall study Turing machines whose running time is bounded by a polylogarithm in the length of the input. An ordinary Turing machine in polylog time can read only a prefix of the input word having polylogarithmic length. We shall use, therefore, a model of Turing machines which is commonly used when time restrictions are so stringent.
The input word is given as an oracle in this model. More specifically, in addition to the work tape, the machine has an additional tape, called the input tape, on which at the beginning of a computation the length of the input word $x$ is written. The machine may at any moment of a computation ask a question of the form "$x(i) = ?$", that is, it can write down on the input tape the number $i < |x|$ and then receive the $i$th symbol of $x$, $x(i)$, written on the input tape. The time to write down $i$ is added to the total time, but then the "oracle" immediately supplies $x(i)$. (We could consider another model in which the machine does not obtain the length of the input word, and when it asks "$x(i) = ?$" with $i \ge |x|$ it receives the answer "undefined"; evidently, every machine working in time $t(|x|)$ can be simulated by a machine of this new type in time $t(|x|) + (\log |x|)^{O(1)}$.) If time restrictions are polynomial, then our model is equivalent to ordinary Turing machines.

Functions that are bit-computable in time $\mathrm{polylog}(n)$ [$2^{O(n)}$] are called polylog-time bit-computable [exponential-time bit-computable, respectively]. For example, the function $f(x) = x$ is polylog-time bit-computable. Note that if both $f$ and $g$ are polylog-time bit-computable, then so is their composition $f(g(x))$. If $f$ is polylog-time bit-computable and $g$ is polynomial-time bit-computable, then $f(g(x))$ is polynomial-time bit-computable. Similarly, if $f$ is polylog-time bit-computable and $g$ is exponential-time bit-computable, then $f(g(x))$ is exponential-time bit-computable.
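The closure under composition can be illustrated with a small sketch of the input-as-oracle model. This is a toy model under stated assumptions, not the machine model itself: a bit-computable function is represented by a pair of Python callables (one for the length, one for a single bit), running time is replaced by a count of bit queries to the real input, and all class and function names (`ExplicitWord`, `VirtualWord`, `BitComputable`, `compose`, `materialize`) are hypothetical.

```python
class ExplicitWord:
    """An input word that can only be accessed through bit queries."""
    def __init__(self, bits):
        self._bits = bits
        self.queries = 0            # number of bit queries asked so far

    def length(self):
        return len(self._bits)      # the length is given for free in this model

    def bit(self, i):
        self.queries += 1
        return self._bits[i]

class BitComputable:
    """A function f given by x -> |f(x)| and (x, i) -> i-th bit of f(x)."""
    def __init__(self, length, bit):
        self.length = length
        self.bit = bit

class VirtualWord:
    """The word g(x), never written down; its bits are computed on demand."""
    def __init__(self, fn, arg):
        self._fn, self._arg = fn, arg

    def length(self):
        return self._fn.length(self._arg)

    def bit(self, i):
        return self._fn.bit(self._arg, i)

def compose(f, g):
    """Bit-computable description of x -> f(g(x))."""
    return BitComputable(
        length=lambda x: f.length(VirtualWord(g, x)),
        bit=lambda x, i: f.bit(VirtualWord(g, x), i),
    )

# Two simple examples: the identity function and word reversal.
identity = BitComputable(lambda x: x.length(), lambda x, i: x.bit(i))
reverse = BitComputable(lambda x: x.length(),
                        lambda x, i: x.bit(x.length() - 1 - i))

def materialize(fn, bits):
    """Write out fn(bits) in full (for testing only; this takes linear time)."""
    x = ExplicitWord(bits)
    return "".join(fn.bit(x, i) for i in range(fn.length(x)))
```

The point of the design is that `compose` never builds the intermediate word $g(x)$: each bit of $f(g(x))$ triggers only as many queries to $x$ as the two bit procedures need, which is why polylogarithmic bounds survive composition.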
Definition 5. We say that a separation problem $G$ polylog-time bit-reduces to a separation problem $F$, $G \le_m^{l} F$, if $G(\alpha) \le F(f(\alpha))$ for some polylog-time bit-computable function $f$ and for all $\alpha$. $\mathrm{LOGS}(F)$ is the class of all separation problems $G$ that polylog-time bit-reduce to $F$. $\mathrm{LOG}(F)$ is the class of all the languages in $\mathrm{LOGS}(F)$.

It is easy to see that the relation $\le_m^{l}$ is reflexive and transitive.

The class $\mathrm{LOGS}(F)$ is called the polylog counterpart of the class $\mathrm{POLY}(F)$. More precisely, the separation problem $F$ defines a pair: the class $\mathrm{POLY}(F)$ and its polylog counterpart $\mathrm{LOGS}(F)$ (as we shall see later, the class $\mathrm{LOGS}(F)$ is not uniquely determined by the class $\mathrm{POLY}(F)$). If $K$ denotes a complexity class, then $K\mathrm{LOGS}$ will denote the polylog counterpart of this class, for example, $\mathrm{PLOGS} \stackrel{\mathrm{def}}{=} \mathrm{LOGS}(F_{\mathrm{P}})$, $\mathrm{NPLOGS} \stackrel{\mathrm{def}}{=} \mathrm{LOGS}(F_{\mathrm{NP}})$ and $\mathrm{BPPLOGS} \stackrel{\mathrm{def}}{=} \mathrm{LOGS}(F_{\mathrm{BPP}})$.

All three classes (PLOGS, NPLOGS and BPPLOGS) can be defined in a standard vein using polylog-time machines. Let $D(F)$ denote the set $\{x \in B^* \mid F(x) \ne *\}$. It is easy to verify that the following assertions are true.

• $F \in \mathrm{PLOGS}$ if and only if there is a deterministic polylog-time Turing machine $M$ such that $M(\alpha) = F(\alpha)$ for all $\alpha \in D(F)$.
• $F \in \mathrm{NPLOGS}$ if and only if there is a polylog-time nondeterministic machine $M$ such that if $F(\alpha) = 1$, then $M$ accepts $\alpha$, and if $F(\alpha) = 0$, then $M$ rejects $\alpha$. By a polylog-time nondeterministic machine we mean a nondeterministic Turing machine all of whose computations on input $\alpha$ have no more than $\mathrm{polylog}(|\alpha|)$ steps.
• $F \in \mathrm{BPPLOGS}$ if and only if there is a polylog-time probabilistic machine $M$ such that if $F(\alpha) = 1$, then $\mathrm{Prob}[M(\alpha) = 1] > 2/3$, and if $F(\alpha) = 0$, then $\mathrm{Prob}[M(\alpha) = 1] < 1/3$ (if $F(\alpha) = *$, then this probability can be arbitrary). By a probabilistic polylog-time machine we mean a probabilistic Turing machine $M$ whose computation time on input $\alpha$ is bounded by $\mathrm{polylog}(|\alpha|)$ (for all outcomes of coin tossing).

3. General criteria

3.1.
A criterion for relativizable inclusion. In this section we prove that a complexity class $K_1^A$ is included in a complexity class $K_2^A$ for all oracles $A$ if and only if the (absolute: no oracles) inclusion between their polylog counterparts holds. This is true for all classes of the form $\mathrm{POLY}(F)$ whenever the separation problem $F$ is nondegenerate in the following sense:

(5)  there exists a polynomial-time bit-computable function $f : N \to B^*$ such that $|f(n)| = n$ and $F(f(n)) \ne *$ for all $n \in N$;

(6)  there are two words (denote them $\mathrm{zero}_F$ and $\mathrm{one}_F$) such that $F(\mathrm{zero}_F) = 0$, $F(\mathrm{one}_F) = 1$.

All the problems defining the complexity classes mentioned above are nondegenerate.

Theorem 1 ([10, 51]). Assume that a separation problem $F$ satisfies the condition (5) and a separation problem $G$ satisfies the condition (6). Then the following
conditions are equivalent:

(7)  $\mathrm{LOGS}(F) \subseteq \mathrm{LOGS}(G)$,
(8)  $F \in \mathrm{LOGS}(G)$,
(9)  $\mathrm{POLY}^A(F) \subseteq \mathrm{POLY}^A(G)$ for all $A$.

If $F$ is a language (that is, $D(F) = B^*$), then all these conditions are equivalent to the following condition:

(10)  $\mathrm{LOG}(F) \subseteq \mathrm{LOG}(G)$.

Proof. Obviously, (7) implies (8). Let us prove that (8) implies (7). Assume that $F$ belongs to the class $\mathrm{LOGS}(G)$ and let $g$ be a polylog-time bit-computable function such that $F(\alpha) \le G(g(\alpha))$. Let us prove that $\mathrm{LOGS}(F) \subseteq \mathrm{LOGS}(G)$. Assume that $H$ is in $\mathrm{LOGS}(F)$ and that $f$ is a polylog-time bit-computable function such that $H(\alpha) \le F(f(\alpha))$. Then $H(\alpha) \le G(g(f(\alpha)))$ for all $\alpha \in B^*$. Since $g(f(\alpha))$ is polylog-time bit-computable, it follows that $H$ belongs to $\mathrm{LOGS}(G)$.

Obviously, (7) implies (10), and if $F$ is a language, then (10) implies (8).

Let us prove that (8) implies (9). Assume that $f$ is a polylog-time bit-computable function such that $F(\alpha) \le G(f(\alpha))$. Assume that $L \in \mathrm{POLY}^A(F)$, that is, there is a function $g$ that is polynomial-time bit-computable relative to $A$ such that $L(x) = F(g(x))$. Consequently, $L(x) = G(f(g(x)))$. As the function $f(g(x))$ is polynomial-time bit-computable relative to $A$, it follows that $L \in \mathrm{POLY}^A(G)$.

Let us prove that if (8) is not true, then neither is (9). Assume that $F$ is not in $\mathrm{LOGS}(G)$. This means that for any separation problem $H \in \mathrm{LOGS}(G)$, there is an $\alpha \in B^*$ such that $F(\alpha) \not\le H(\alpha)$. We claim that in this case for any separation problem $H \in \mathrm{LOGS}(G)$, there are infinitely many $\alpha \in B^*$ such that $F(\alpha) \not\le H(\alpha)$. Indeed, assume this is not true, that is, there are a number $n$ and a polylog-time bit-computable function $f$ such that $F(\alpha) \le G(f(\alpha))$ for all $\alpha \in B^*$ of length greater than $n$. Then the function

$$f_1(\alpha) = \begin{cases} f(\alpha) & \text{if } |\alpha| > n, \\ \mathrm{zero}_G & \text{if } |\alpha| \le n,\ F(\alpha) = 0, \\ \mathrm{one}_G & \text{otherwise} \end{cases}$$

is polylog-time bit-computable and $F(\alpha) \le G(f_1(\alpha))$ for all $\alpha \in B^*$.

Let us choose a function encoding pairs of words by words. Assume that $x$ is in $B^*$.
Let us double all bits of $x$ and add the word "01" to the end of the resulting word. Let $\bar{x}$ stand for the resulting word (for example, $\overline{001} = 00001101$). The word $\bar{x}y$ will be considered as the code of the pair $\langle x, y \rangle$. Obviously, for a given $\bar{x}y$ we can find $x$ and $y$ in polynomial time, and for a given word $u$ we can decide in polynomial time whether $u$ has the form $\bar{x}y$.

For an oracle $A$ and $n \in N$, let $[A]_n$ denote the word of length $n$ whose $i$th bit is equal to $A(\bar{n}i)$.² We shall construct an oracle $A$ such that the language $L^A = \{n \mid F([A]_n) = 1\}$ belongs to the set $\mathrm{POLY}^A(F) \setminus \mathrm{POLY}^A(G)$. The assertion $L^A \in \mathrm{POLY}^A(F)$ will follow from the following general statement:

(11)  $\forall n \in N\ \ F([A]_n) \ne *$.

² Recall that we identify natural numbers with binary words.
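The pairing code and the words $[A]_n$ can be sketched as follows. The helper names (`bar`, `pair`, `unpair`, `block`) are illustrative assumptions, an oracle is modelled as a finite Python set of words, the positions of $[A]_n$ are assumed to be indexed by $i = 0, \dots, n-1$, and numbers are turned into words by the convention of footnote 1 (repeated here as `nat_to_word` so that the block is self-contained).

```python
def nat_to_word(n):
    """Footnote 1 convention: binary notation of n + 1 without the leading 1."""
    return bin(n + 1)[3:]

def bar(x):
    """Double every bit of x and append the terminator 01."""
    return "".join(b + b for b in x) + "01"

def pair(x, y):
    """Code of the pair <x, y>."""
    return bar(x) + y

def unpair(u):
    """Return (x, y) if u has the form bar(x) + y, and None otherwise."""
    i = 0
    while i + 1 < len(u):
        chunk = u[i:i + 2]
        if chunk == "01":                 # terminator found
            return u[:i:2], u[i + 2:]     # every second bit of the doubled part
        if chunk not in ("00", "11"):     # "10" cannot occur before the terminator
            return None
        i += 2
    return None

def block(oracle, n):
    """The word [A]_n: bit i is 1 exactly when pair(n, i) belongs to the oracle."""
    n_word = nat_to_word(n)
    return "".join(
        "1" if pair(n_word, nat_to_word(i)) in oracle else "0"
        for i in range(n)
    )
```

Because distinct $n$ give disjoint sets of code words, changing $[A]_n$ for one $n$ leaves every other block of the oracle untouched, which is what the diagonal construction relies on.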
If (11) is true, then $L^A(n) = F([A]_n)$ for all $n$. Since the function $h(n) = [A]_n$ is polynomial-time bit-computable relative to $A$, the assertion (11) implies that the language $L^A$ is in $\mathrm{POLY}^A(F)$.

Let us enumerate all functions that are polynomial-time bit-computable relative to oracles. This means that we enumerate pairs of oracle machines involved in the definition of polynomial bit-computability relative to an oracle. Let $f_i^A(x)$ denote the $i$th function ($A$ is considered as the second argument of the function).

Choose a polynomial-time decidable language $E$ such that $F([E]_n) \ne *$ for all $n \in N$. Such a language exists because $F$ satisfies the condition (5). We start with $A = E$ to satisfy condition (11). Then we make a countable number of steps. At the $i$th step we change the value of $A$ on a finite number of words to satisfy the following local condition:

(12)  $\exists n \in N\ \ F([A]_n) \ne G(f_i^A(n))$,

ensuring that condition (11) remains true. Then we freeze all the values of $A$ needed to ensure the truth of the assertion (12) and also all the values of $A$ that were changed. This is to be understood as follows. There is a finite set $U$ of words such that (12) is true for $A'$ whenever $A'$ has the same values as $A$ on all the elements of $U$. We find such a $U$ and "label" all its elements and all the elements on which the values of $A$ were changed. The values of $A$ on labeled words are called "frozen" and cannot be changed later. After an infinite number of steps, we shall obtain an oracle $A$ such that both (11) and (12) are true for all $i \in N$. This implies that $L^A \in \mathrm{POLY}^A(F) \setminus \mathrm{POLY}^A(G)$.

Now we describe the $i$th step. Let $A$ be the oracle constructed at the $(i-1)$st step (with some frozen values). For $\alpha \in B^n$, let $A[\alpha]$ stand for the oracle $A$ with $[A]_n$ replaced by $\alpha$, that is,

$$A[\alpha](u) = \begin{cases} A(u) & \text{if } u \text{ does not have the form } \bar{n}i,\ i < n, \\ \alpha(i) & \text{if } u = \bar{n}i, \text{ where } i < n. \end{cases}$$

Let $H(\alpha) = G(f_i^{A[\alpha]}(|\alpha|))$.
Since $A$ is polynomial-time decidable ($A$ is obtained from $E$ by a finite number of changes), the function $\alpha \mapsto f_i^{A[\alpha]}(|\alpha|)$ is polylog-time bit-computable. It follows that $H \in \mathrm{LOGS}(G)$. Consequently, there are infinitely many $\alpha \in B^*$ such that $F(\alpha) \not\le H(\alpha)$. We conclude that there is an $\alpha \in B^*$ such that $F(\alpha) \not\le H(\alpha)$ and no value of $A$ on a word of the form $\overline{|\alpha|}i$, $i < |\alpha|$, is frozen. Choose such an $\alpha$ and replace $A$ by $A[\alpha]$. Now the assertion (12) is true for $n = |\alpha|$, because

$$F([A]_n) = F(\alpha) \not\le H(\alpha) = G(f_i^A(n)).$$

Freeze a finite number of values of $A$ ensuring the truth of condition (12). Note that the assertion (11) is not violated, because $F([A]_n) = F(\alpha) \ne *$ (since $F(\alpha) \not\le H(\alpha)$ and $*$ is the least element in the set $\{0,1,*\}$). The implication (9)$\Rightarrow$(8) is proved. □

Remark 1. All complexity classes studied in the literature are represented by separation problems with the following property. When we add to the definition of the class $\mathrm{POLY}(F)$ the requirement $|f(x)| = 2^{\mathrm{poly}(|x|)}$ (the definition of polynomial bit-computability implies only that $|f(x)| \le 2^{\mathrm{poly}(|x|)}$), the class $\mathrm{POLY}(F)$ does not change. Moreover, the separation problems representing the known complexity
classes have the following property:

(13)  $F \in \mathrm{LOGS}(\tilde{F})$, where
$$\tilde{F}(\alpha) = \begin{cases} F(\alpha) & \text{if } |\alpha| \text{ has the form } 2^k,\ k \in N, \\ 0 & \text{otherwise.} \end{cases}$$

Note that (13) implies that $\mathrm{POLY}^A(F) = \mathrm{POLY}^A(\tilde{F})$ for all $A$ (by Theorem 1). If a separation problem $F$ has property (13), then conditions (7), (8), and (9) are equivalent to the condition

(14)  $\mathrm{EXP}^A(F) \subseteq \mathrm{EXP}^A(G)$ for all $A$,

where $\mathrm{EXP}^A(H)$ is the class containing all the languages $L$ such that $L(x) = H(g(x))$ for some function $g$ that is exponential-time bit-computable relative to $A$.

Indeed, the implication (8)$\Rightarrow$(14) is true because if $f(\alpha)$ is polylog-time bit-computable and $g(x)$ is exponential-time bit-computable relative to $A$, then the function $f(g(x))$ is exponential-time bit-computable relative to $A$ (because $\mathrm{polylog}(2^{2^{O(n)}}) = \mathrm{poly}(2^{O(n)}) = 2^{O(n)}$). Conversely, let us prove the implication (14)$\Rightarrow$(8). Assume that $F$ has the property (13) and assume that (8) is false. Then we see that $\tilde{F} \notin \mathrm{LOGS}(G)$. Applying the same arguments as in the proof that $\neg$(8)$\Rightarrow$$\neg$(9), we can construct an oracle $A$ such that the language $L^A = \{n \mid F([A]_{2^n}) = 1\}$ is in $\mathrm{EXP}^A(F) \setminus \mathrm{EXP}^A(G)$. □

For a family $\mathcal{F}$ of separation problems, let $\mathrm{POLY}^A(\mathcal{F}) = \bigcup_{F \in \mathcal{F}} \mathrm{POLY}^A(F)$. It is easy to see that for countable families $\mathcal{F}$, Theorem 1 generalizes to classes of the form $\mathrm{POLY}^A(\mathcal{F})$.

Theorem 2 ([10, 51]). Assume that all the elements of a countable family $\mathcal{F}$ of separation problems have the property (5) and all the elements of a countable family $\mathcal{G}$ of separation problems have the property (6). Then the following assertions are equivalent:

(15)  $\mathrm{LOGS}(\mathcal{F}) \subseteq \mathrm{LOGS}(\mathcal{G})$,
(16)  $\mathrm{POLY}^A(\mathcal{F}) \subseteq \mathrm{POLY}^A(\mathcal{G})$ for all $A$.

Any mapping from the set of all oracles into the set of families of languages is called a manifold. A manifold is called representable [$\aleph_0$-representable] if it has the form $A \mapsto \mathrm{POLY}^A(F)$ for some nondegenerate $F$ [$A \mapsto \mathrm{POLY}^A(\mathcal{F})$ for some countable family $\mathcal{F}$ containing nondegenerate separation problems].
Theorem 2 implies that an $\aleph_0$-representable manifold determines the defining family $\mathcal{F}$ uniquely up to polylog equivalence, that is,

$$(\forall A\ \ \mathrm{POLY}^A(\mathcal{F}) = \mathrm{POLY}^A(\mathcal{G})) \Rightarrow \mathrm{LOGS}(\mathcal{F}) = \mathrm{LOGS}(\mathcal{G}).$$

This is not true for absolute classes: there are separation problems $F_1$ and $F_2$ such that $\mathrm{POLY}(F_1) = \mathrm{POLY}(F_2)$ and $\mathrm{LOGS}(F_1) \ne \mathrm{LOGS}(F_2)$. In other words, there exists a non-relativizable assertion of the form $\mathrm{POLY}(F_1) = \mathrm{POLY}(F_2)$, namely the equality $\mathrm{IP} = \mathrm{PSPACE}$ proved by Shamir in [45]. Both classes IP and PSPACE can be defined in our framework as shown in Section 4.

Consider the following application of Theorem 1. We want to prove the theorem from [5] stating that there is an oracle $A$ such that $\mathrm{P}^A \ne \mathrm{NP}^A$. By Theorem 1, it suffices to prove that $F_{\mathrm{NP}}$ is not in PLOG. In other words, we have to prove
that no machine can recognize in polylog time whether a 1 occurs in a given word. Assume that a polylog-time machine $M$ does this job. Run $M$ on a sufficiently long input word $\alpha$ containing only 0's (the length $n$ of the input should be greater than the running time of $M$ on it; such an $n$ does exist because $n - \mathrm{polylog}(n) \to +\infty$). The output of $M$ should be 0. But since $M$ has not queried at least one bit of $\alpha$, we can fool it by changing that bit of $\alpha$ to 1.

We have used in this proof only the fact that the number of bits queried by $M$ in a run on an input $\alpha$ is bounded by a polylogarithm of $|\alpha|$, and the running time can be arbitrary. This is true for all the known proofs of the theorems of the form $\exists A\ K_1^A \not\subseteq K_2^A$. More precisely, in Definition 3 replace the restrictions on time by restrictions on the number of queried bits of $x$. The resulting notion is called bit-computability in $t(n)$ queries. Let $\mathrm{n.u.LOGS}(G)$ stand for the class of separation problems $F$ such that $F(\alpha) \le G(f(\alpha))$ for some function $f$ bit-computable in $\mathrm{polylog}(n)$ queries. Then to prove that $\exists A\ \mathrm{POLY}^A(F) \not\subseteq \mathrm{POLY}^A(G)$ it is sufficient to prove that $F$ is not in $\mathrm{n.u.LOGS}(G)$, because $\mathrm{LOGS}(G) \subseteq \mathrm{n.u.LOGS}(G)$. Assertions concerned with the number of queries can usually be proved by counting arguments.

A formal definition of a function bit-computable in $t(n)$ queries can be given using decision trees. Let $x_1, \dots, x_n$ be Boolean variables and $M$ a set. An $(M, x_1, \dots, x_n)$-tree is a finite binary rooted tree whose leaves are labeled by elements of $M$ and internal vertices by variables from the set $\{x_1, \dots, x_n\}$. An $(M, x_1, \dots, x_n)$-tree $T$ computes the function $f : B^n \to M$ defined as follows. Let $b_1 \dots b_n$ be an assignment of Boolean values to $x_1, \dots, x_n$. Let $v_1, v_2, \dots, v_k$ be the path in $T$ such that (i) $v_1$ is the root of $T$, (ii) for every $i < k$, $v_i$ is an internal vertex and $v_{i+1}$ is the left son of $v_i$ if the value of the variable labeling $v_i$ is 0 and the right son of $v_i$ otherwise, and (iii) $v_k$ is a leaf.
The value $f(b_1 \dots b_n)$ is defined as the label of $v_k$. Let $T(x_1 \dots x_n)$ denote the function computed by the tree $T$. The complexity of a tree is measured by its height. A partial function $f : B^n \to M$ is computable in $t$ queries if there exists an $(M, x_1, \dots, x_n)$-tree $T$ of height at most $t$ such that the function $T(x_1, \dots, x_n)$ extends the function $f(x_1 \dots x_n)$. Replace in Definition 3 the notion of computability in time $t(|x|)$ by the notion of computability in $t(|x|)$ queries. The resulting notion is called the nonuniform bit-computability in time $t(n)$, or bit-computability in $t(n)$ queries.

Definition 6. $\mathrm{n.u.LOGS}(G)$ is the class of all the separation problems $F$ such that $F(\alpha) \le G(f(\alpha))$ for some nonuniformly polylog-time bit-computable function $f$ and for all $\alpha \in B^*$. $\mathrm{n.u.LOG}(G)$ is the class of all languages from $\mathrm{n.u.LOGS}(G)$.

Obviously, $\mathrm{LOGS}(G) \subseteq \mathrm{n.u.LOGS}(G)$, and we obtain an easy corollary from Theorem 1.

Corollary 3. If

(17)  $F \notin \mathrm{n.u.LOGS}(G)$,

then the negation of (9) is true.

It is the assertion (17) that is proved by counting arguments in all the known proofs of theorems of the form $\exists A\ \mathrm{POLY}^A(F) \not\subseteq \mathrm{POLY}^A(G)$.
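The decision-tree definition, together with the "fooling" argument used above for $F_{\mathrm{NP}}$, can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Leaf`, `Node`, `evaluate`, `height` and `fooling_input` are hypothetical, variables are indexed from 0 rather than from 1, and the input is a Python string of bits.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    label: int                # element of the set M (here: 0 or 1)

@dataclass
class Node:
    var: int                  # index of the queried variable (0-based)
    left: "Tree"              # followed when the queried bit is 0
    right: "Tree"             # followed when the queried bit is 1

Tree = Union[Leaf, Node]

def evaluate(tree, bits):
    """Follow the path determined by bits; return (leaf label, queried indices)."""
    queried = []
    while isinstance(tree, Node):
        queried.append(tree.var)
        tree = tree.right if bits[tree.var] == "1" else tree.left
    return tree.label, queried

def height(tree):
    """Height of the tree, i.e. the maximal number of queries on any path."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(height(tree.left), height(tree.right))

def fooling_input(tree, n):
    """A word of length n on which the tree fails to compute F_NP,
    or None if the tree queries every position of the all-zero word."""
    zeros = "0" * n
    label, queried = evaluate(tree, zeros)
    if label != 0:
        return zeros                      # wrong already on the all-zero word
    free = [i for i in range(n) if i not in queried]
    if not free:
        return None
    i = free[0]                           # flip a bit the tree never looked at
    return zeros[:i] + "1" + zeros[i + 1:]
```

Any tree of height less than $n$ leaves some position unqueried on the all-zero word, so `fooling_input` always succeeds for such trees; this is the counting argument behind the claim that $F_{\mathrm{NP}}$ is not computable in polylogarithmically many queries.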
3.2. A criterion for relativizable existence of an m-complete language in a complexity class.

Definition 7. A language $L_1$ is polynomial-time many-one reducible (= Karp reducible) to a language $L_2$ (and we write $L_1 \le_m^{p} L_2$) if there exists a polynomial-time computable function $f$ such that $x \in L_1 \iff f(x) \in L_2$. If we allow the function $f$ to be computable by a polynomial-time machine with an oracle $A$, then the resulting reducibility is denoted by $\le_m^{p,A}$.

Let $\le$ stand for a reducibility on separation problems. We say that a separation problem $H$ is $\le$-hard for a class $K$ of separation problems if every separation problem in $K$ is $\le$-reducible to $H$. If $H$ is $\le$-hard for $K$ and $H$ is in $K$, then we say that $H$ is $\le$-complete in $K$. We call a class $K_1$ of separation problems $\le$-hard for a class $K_2$ of separation problems if $K_1$ has a problem that is $\le$-hard for $K_2$.

The following theorem gives a criterion for whether the class $\mathrm{POLY}^A(G)$ is $\le_m^{p,A}$-hard for the class $\mathrm{POLY}^A(F)$ for all oracles $A$.

Theorem 4 ([10, 51]). Assume that a separation problem $F$ satisfies the condition (5) and a separation problem $G$ satisfies the condition (6). Then the following conditions are equivalent:

(18)  $\mathrm{LOG}(G)$ is $\le_m^{l}$-hard for $\mathrm{LOGS}(F)$,
(19)  $F$ has a solution in $\mathrm{LOG}(G)$,
(20)  the class $\mathrm{POLY}^A(G)$ is $\le_m^{p,A}$-hard for the class $\mathrm{POLY}^A(F)$ for any oracle $A$.

If $F$ is a language, then all these assertions are equivalent to the assertion

(21)  the class $\mathrm{LOG}(G)$ is $\le_m^{l}$-hard for the class $\mathrm{LOG}(F)$.

Proof. Let us prove the implication (18)$\Rightarrow$(19). Assume that (18) is true, that is, there is a language $H \in \mathrm{LOG}(G)$ such that every separation problem in the class $\mathrm{LOGS}(F)$ is $\le_m^{l}$-reducible to $H$. Then $F \le_m^{l} H$. Let $g : B^* \to B^*$ be a polylog-time bit-computable function reducing $F$ to $H$. Then the language $H(g(\alpha))$ is a solution to $F$ and belongs to $\mathrm{LOG}(G)$.

Let us prove the implication (19)$\Rightarrow$(18). Assume that a language $H \in \mathrm{LOG}(G)$ solves $F$. Then $H$ is $\le_m^{l}$-hard for the class $\mathrm{LOGS}(F)$ because the problem $F$ is $\le_m^{l}$-complete in $\mathrm{LOGS}(F)$.
Obviously, (18) implies (21). The implication (21)$\Rightarrow$(19) in the case when $F$ is a language can be proved in precisely the same way as the implication (18)$\Rightarrow$(19), because $F \in \mathrm{LOG}(F)$ in this case.

Let us prove the implication (19)$\Rightarrow$(20). Assume that $F$ has a solution $H \in \mathrm{LOG}(G)$. Theorem 1 implies that $\mathrm{POLY}^A(F) \subseteq \mathrm{POLY}^A(H) \subseteq \mathrm{POLY}^A(G)$ (note that in the proof of the implication (8)$\Rightarrow$(9) we have not used conditions (5) and (6)). It suffices to prove, therefore, that the class $\mathrm{POLY}^A(H)$ is $\le_m^{p,A}$-hard for the class $\mathrm{POLY}^A(F)$. Actually, we shall prove that the class $\mathrm{POLY}^A(H)$ has a $\le_m^{p}$-complete language.

Let $g_0^A, g_1^A, g_2^A, \dots$ be an enumeration of all the functions polynomial-time bit-computable relative to $A$. Set $L_i^A(x) = H(g_i^A(x))$. By definition, $\mathrm{POLY}^A(H) = \{L_i^A \mid i \in N\}$. Let $p_i(|x|)$ be a polynomial upper bound for the time of bit-computation of the function $g_i^A(x)$ for a given $\bar{i}x$. We shall prove that there is a function $f^A$ polynomial-time bit-computable relative to $A$ such that $f^A(\bar{i}x0^{p_i(|x|)}) = g_i^A(x)$ for all $i \in N$
and for all $x \in B^*$. Suppose that we have already proved the existence of such a function $f^A$. Then let $L^A(u) = H(f^A(u))$. We obtain that $L^A \in \mathrm{POLY}^A(H)$. On the other hand, $L^A$ is $\le_m^{p}$-complete in the class $\mathrm{POLY}^A(H)$ because the function $x \mapsto \bar{i}x0^{p_i(|x|)}$ is polynomial-time computable and reduces $L_i^A$ to $L^A$ for any $i \in N$.

It remains to prove the existence of $f^A$. Let $M^A$ be a machine that computes in time $p_i(|x|)$ the length of the word $g_i^A(x)$ for any given $\bar{i}x$, and let $N^A$ be a machine that computes the $j$th bit of the word $g_i^A(x)$ in time $p_i(|x|)$ for any given $\bar{i}\bar{x}j$. Then the length of the word $f^A(w)$ can be computed by the following machine $\widetilde{M}^A$: for a given word $w$, check first whether $w$ has the form $\bar{i}x0^t$, and if not, output 0. Otherwise find $i$, $x$, and $t$ and run $M^A$ on $\bar{i}x$. If the machine $M^A$ produces a result within time $t$, then output this result, otherwise output 0. The following machine $\widetilde{N}^A$ outputs the $j$th bit of the word $f^A(w)$ for any given $\langle w, j \rangle$: first run $\widetilde{M}^A$ on $w$, and let $n$ stand for the result produced by $\widetilde{M}^A$. If $n = 0$, then output 0. Otherwise find $i$, $x$, and $t$ such that $w = \bar{i}x0^t$ and run $N^A$ on $\bar{i}\bar{x}j$. If the machine $N^A$ produces a result within time $t$, then output that result. Otherwise output 0.

Let us prove that if (19) is false, then (20) is false. Assume that $F$ has no solutions in the class $\mathrm{LOG}(G)$. Let us construct an oracle $A$ such that the class $\mathrm{POLY}^A(G)$ has no $\le_m^{p,A}$-hard language for the class $\mathrm{POLY}^A(F)$. Let $f_0^A, f_1^A, \dots, f_i^A, \dots$ be an enumeration of all the functions that are polynomial-time bit-computable relative to an oracle $A$, and $m_0^A, m_1^A, \dots, m_j^A, \dots$ an enumeration of all the $\le_m^{p,A}$-reducing functions (that is, all the functions of the type $B^* \to B^*$ that are polynomial-time computable relative to $A$). The oracle $A$ is considered as the second argument of both functions $f_i^A(x)$ and $m_j^A(x)$. Without loss of generality we may assume that the polynomials bounding the computation times of $f_i^A(x)$ and $m_j^A(x)$ do not depend on $A$.

Assume that $A \subseteq B^*$.
The language $A^i = \{x \mid \bar{i}x \in A\}$ is called the $i$th component of $A$, and $L_i(A)$ will denote the language $\{n \mid F([A^i]_n) = 1\}$. Recall that for $C \subseteq B^*$, $[C]_n$ stands for the word of length $n$ whose $j$th bit is equal to $C(\bar{n}j)$. It clearly suffices to construct an oracle $A$ such that for all $i \in N$, at least one of the following two assertions is true:

(22)  $G(f_i^A(y)) = *$ for some $y \in B^*$,

or

(23)  the language $L_i(A)$ is in $\mathrm{POLY}^A(F)$ and is not $\le_m^{p,A}$-reducible to the separation problem $G(f_i^A(y))$.

To make the condition (23) true it suffices to satisfy one global condition

(24)  $F([A^i]_n) \ne *$ for all $n \in N$

and the following countable family of local conditions:

(25)  $\exists n \in N\ \ F([A^i]_n) \ne G(f_i^A(m_j^A(n)))$, $j \in N$.

Thus it suffices to construct an oracle $A$ such that for all pairs $(i,j) \in N^2$ at least one of the two assertions (22) and (24)&(25) is true.

Let us start with the oracle $A$ being a polynomial-time decidable language such that for all $i$ the assertion (24) is true. Fix an enumeration of the set $N^2$. We make a countable number of steps indexed by pairs $(i,j)$. During the step $(i,j)$ we redefine the $i$th component of $A$ on a finite number of words to make the assertion (22) or
the assertion (25) true. Evidently, if for some $i$ there exists $j$ such that we have satisfied the condition (22) on the step $(i,j)$, then we can skip the remaining steps $(i,j')$. On each step we will freeze the value of $A$ on some words.

Let us explain what is done at step number $(i,j)$. Let $A$ be the oracle we have after the previous step (with a finite set of frozen values). Consider two cases:

Case 1: it is possible to change nonfrozen values of the $i$th component of $A$ to make (22) true. Evidently, in this case it is enough to redefine only a finite number of nonfrozen values of $A^i$. Make those changes of $A^i$ and freeze a finite number of values of $A$ to guarantee the truth of (22). Since $A^{i'}$ is not changed for all $i' \ne i$, the assertions (24) for all $i' \ne i$ remain true.

Case 2: for any changes of nonfrozen values of $A^i$ the assertion (22) remains false. Assume that $\alpha \in B^*$. Let $A[\alpha,i]$ stand for the oracle $B$ such that $B^{i'} = A^{i'}$ for all $i' \ne i$ and $B^i = (A^i)[\alpha]$ (we recall that the notation $C[\alpha]$ is defined in the proof of Theorem 1). Consider the language

$$H = \{\alpha \in B^* \mid G(f_i^{A[\alpha,i]}(m_j^{A[\alpha,i]}(|\alpha|))) = 1\}.$$

Let us prove that $H \in \mathrm{LOG}(G)$. We say that $\alpha \in B^*$ is free if no value of the $i$th component $A^i$ on any word of the form $\overline{|\alpha|}k$, $k < |\alpha|$, is frozen (that is, we can replace $A$ by $A[\alpha,i]$ without changing frozen values). Note that the set of nonfree words is finite. For all the free $\alpha$ we have $G(f_i^{A[\alpha,i]}(y)) \ne *$ for all $y \in B^*$. In particular, $G(f_i^{A[\alpha,i]}(m_j^{A[\alpha,i]}(|\alpha|))) \ne *$ for any free $\alpha$. The function $\alpha \mapsto f_i^{A[\alpha,i]}(m_j^{A[\alpha,i]}(|\alpha|))$ is polylog-time bit-computable (because $A$ is obtained from a polynomial-time decidable language by changing a finite number of values). Therefore the function

$$g(\alpha) = \begin{cases} f_i^{A[\alpha,i]}(m_j^{A[\alpha,i]}(|\alpha|)) & \text{if } \alpha \text{ is free}, \\ \mathrm{one}_G & \text{if } \alpha \text{ is not free and } \alpha \in H, \\ \mathrm{zero}_G & \text{if } \alpha \text{ is not free and } \alpha \notin H \end{cases}$$

is polylog-time bit-computable, and $H(\alpha) = G(g(\alpha))$ for all $\alpha \in B^*$. Hence $H \in \mathrm{LOG}(G)$. Thus, there are infinitely many $\alpha$ such that $F(\alpha) \not\le H(\alpha)$. Pick a free $\alpha$ such that $F(\alpha) \not\le H(\alpha)$.
Then for $n = |\alpha|$ we have

$$F([A[\alpha,i]^i]_n) = F(\alpha) \not\le H(\alpha) = G(f_i^{A[\alpha,i]}(m_j^{A[\alpha,i]}(n))).$$

Replace $A$ by $A[\alpha,i]$ and freeze all the values of $A$ on which the value of $f_i^A(m_j^A(n))$ depends, as well as the values of $A^i$ on all the words of the form $\bar{n}k$, $k < n$. Thus we have made the assertion (25) true. The assertion (24) was not affected because $F(\alpha) \ne *$. Since we have redefined only the $i$th component of $A$, all conditions (24) for $i' \ne i$ were not affected either. The implication (20)$\Rightarrow$(19) is proved. □

Corollary 5. If $F$ is a language, then the class $\mathrm{POLY}^A(F)$ has a $\le_m^{p}$-complete language.

Remark 2. It is clear from the proof of Theorem 4 that in the condition (20) we can replace $\le_m^{p,A}$-reducibility by $\le_m^{p}$-reducibility.

Remark 3. It is clear from the proof of Theorem 4 that for any sequence $\{(F_i, G_i)\}$, $i = 0, 1, 2, \dots$ of pairs of separation problems such that $F_i$ has no solution in $\mathrm{LOG}(G_i)$, we can construct an oracle $A$ such that the class $\mathrm{POLY}^A(G_i)$
is not $\le_m^{p,A}$-hard for the class $\mathrm{POLY}^A(F_i)$ for all $i$. To do so we have to consider for all $i$ a countable number of components $A^{i,j} = \{x \in B^* \mid \bar{i}\bar{j}x \in A\}$, $j \in N$. The same is true for Theorem 1, and for Theorems 7 and 8 below. We can also construct an oracle relative to which negative assertions of different types are true simultaneously. For example, if for all $i$ there exists an oracle $A_i$ such that $\mathrm{POLY}^{A_i}(F_i) \not\subseteq \mathrm{POLY}^{A_i}(G_i)$ and for all $j$ there exists an oracle $B_j$ such that the class $\mathrm{POLY}^{B_j}(H_j)$ is not $\le_m^{p,B_j}$-hard for the class $\mathrm{POLY}^{B_j}(J_j)$, then there exists a single oracle $A$ relative to which all these assertions are true.

Corollary 6. If for nondegenerate separation problems $F$ and $G$ the assertion

(26)  $F$ has no solution in the class $\mathrm{n.u.LOG}(G)$

is true, then there exists an oracle $A$ such that the class $\mathrm{POLY}^A(G)$ has no $\le_m^{p,A}$-hard language for the class $\mathrm{POLY}^A(F)$.

The assertion (26) is the assertion usually proved by counting arguments when one proves that there exists $A$ such that the class $\mathrm{POLY}^A(G)$ is not $\le_m^{p,A}$-hard for the class $\mathrm{POLY}^A(F)$.

Example. In [40], it was proved that $\mathrm{n.u.BPPLOG} = \mathrm{n.u.PLOG}$. Obviously, the separation problem $F_{\mathrm{BPP}}$ defining the class BPP has no solution in the class $\mathrm{n.u.PLOG}$. Consequently, there exists an oracle $A$ such that the class $\mathrm{BPP}^A$ has no $\le_m^{p,A}$-hard language for the class $\mathrm{R}^A$.

Remark 4. If in Theorem 4 we replace the separation problems $F$ and $G$ by countable classes $\mathcal{F}$ and $\mathcal{G}$ of separation problems, the implication (20)$\Rightarrow$(19) remains true. To keep the implication (19)$\Rightarrow$(20) true, we have to strengthen the condition (19) as follows. There exist a language $H$ in $\mathrm{LOG}(\mathcal{G})$ and a computable function $f(i,\alpha)$ such that for any fixed $i$, the function $\alpha \mapsto f(i,\alpha)$ is polylog-time bit-computable and reduces the $i$th separation problem in $\mathcal{F}$ to $H$.

3.3. A criterion for whether a complexity class is Turing reducible to another complexity class.

Definition 8.
A language $L_1$ is polynomial-time Turing reducible (= Cook reducible) to a language $L_2$ (and we write $L_1 \le^{p}_T L_2$) if there is a polynomial-time Turing machine $M$ with oracle $L_2$ recognizing $L_1$. A language $L_1$ is polynomial-time Turing reducible to a language $L_2$ relative to $A$ (and we write $L_1 \le^{p,A}_T L_2$) if there is a polynomial-time Turing machine $M$ with two oracles $A$ and $L_2$ recognizing $L_1$. Let $\le$ stand for $\le^{p}_T$ or $\le^{p,A}_T$. A class $K_1$ is $\le$-reducible to a class $K_2$ (notation: $K_1 \le K_2$) if $\forall L_1 \in K_1\ \exists L_2 \in K_2\ L_1 \le L_2$.

To formulate a theorem giving a criterion of whether $K_1 \le^{p,A}_T K_2$ for all $A$, we define a polylog-time version of polynomial-time Turing reducibility that is more flexible than polylog-time many-one reducibility. A separation problem $F$ is polylog-time T-reducible to a separation problem $G$ ($F \le^{pl}_T G$ in symbols) if there are a polynomial-time Turing oracle machine $M$ and a function $f : B^* \times B^* \to B^*$ such that 1) the value $f(y,a)$ can be bit-computed in time $\mathrm{poly}(|y| + \log|a|)$ for given $y$ and $a$, and 2) for all $a \in D(F)$ the following two assertions are true:

(27) $G(f(y,a)) \ne *$ for all $y \in B^*$,

(28) $F(a) = M^{G(f(\cdot,a))}(|a|)$,
where $G(f(\cdot,a))$ stands for the language $\{y \in B^* \mid G(f(y,a)) = 1\}$. We call $(M,f)$ a pair reducing $F$ to $G$. Note that in this definition it suffices to require that (27) and (28) are true for all but finitely many $a \in D(F)$. Let $(M,f)^G(a)$ denote the output of $M$ on input $|a|$ with oracle $G(f(\cdot,a))$. Obviously, the binary relation $\le^{pl}_T$ is reflexive and transitive. It is clear that $F \le^{pl}_m G \Rightarrow F \le^{pl}_T G$.

Theorem 7 ([51]). If a separation problem $F$ satisfies condition (5) and a separation problem $G$ satisfies condition (6), then the following assertions are equivalent:

(29) $\mathrm{LOGS}(F) \le^{pl}_T \mathrm{LOGS}(G)$,

(30) $F \le^{pl}_T G$,

(31) $\mathrm{POLY}^A(F) \le^{p,A}_T \mathrm{POLY}^A(G)$ for all oracles $A$.

If $F$ is a language, then all three assertions are equivalent to the assertion

(32) $\mathrm{LOG}(F) \le^{pl}_T \mathrm{LOG}(G)$.

Proof. Evidently, (29) and (30) are equivalent. Assume that $F$ is a language. Then the implication (32) $\Rightarrow$ (30) is true. On the other hand, assume that (30) is true. Let $(M,f)$ be a pair reducing $F$ to $G$. Let $l(n)$ be a polylogarithmic upper bound for the length of queries to the oracle made by $M$ on the input $n \in \mathbb{N}$. Consider the language $H = \{xa \mid |x| \le l(|a|),\ G(f(x,a)) = 1\}$. Let us prove that $H$ belongs to $\mathrm{LOG}(G)$. Since $D(F) = B^*$, we have $G(f(x,a)) \ne *$ for all $x, a \in B^*$. Therefore, we have $H(\beta) = G(h(\beta))$, where

$$h(\beta) = \begin{cases} f(x,a) & \text{if } \beta = xa,\ |x| \le l(|a|),\\ \mathrm{zero}_G & \text{if } \beta \text{ is not of the form } xa, \text{ where } |x| \le l(|a|). \end{cases}$$

For a given $\beta$ we can decide in time $\mathrm{polylog}(|\beta|)$ whether $\beta$ has the form $xa$, $|x| \le l(|a|)$. Consequently, $h$ is a polylog-time bit-computable function; hence, we have $H \in \mathrm{LOG}(G)$. Set $g(x,a) = xa$. It is obvious that $g(x,a)$ can be bit-computed in time $\mathrm{poly}(|x| + \log|a|)$. The pair $(M,g)$ reduces $F$ to $H$; therefore $\{F\} \le^{pl}_T \mathrm{LOG}(G)$. As $F$ is $\le^{pl}_m$-complete in $\mathrm{LOG}(F)$, we obtain $\mathrm{LOG}(F) \le^{pl}_T \mathrm{LOG}(G)$.

Let us prove that (30) implies (31). Assume that $F \le^{pl}_T G$. Let $(M,f)$ be a pair reducing $F$ to $G$, $A$ an oracle, and $L$ a language in the class $\mathrm{POLY}^A(F)$.
Let $g$ be a function polynomial-time bit-computable relative to $A$ such that $L(x) = F(g(x))$. Then $L(x) = M^{G(f(\cdot,g(x)))}(|g(x)|)$ for all $x \in B^*$. Since the function $|g(x)|$ is polynomial-time computable relative to $A$, the language $L$ is $\le^{p,A}_T$-reducible to the language $\{yx \mid G(f(y,g(x))) = 1\}$. The latter language is in $\mathrm{POLY}^A(G)$, because $G(f(y,g(x))) \ne *$ for all $x, y \in B^*$ and the function $yx \mapsto f(y,g(x))$ is polynomial-time bit-computable relative to $A$.

Let us prove the implication $\neg$(30) $\Rightarrow$ $\neg$(31). Assume that $F \not\le^{pl}_T G$. Let us prove that (31) is false. Note that in (31) the $\le^{p,A}_T$-reducibility can be replaced by $\le^{p}_T$-reducibility. Indeed, if a language $L_1$ is $\le^{p,A}_T$-reducible to a language $L$ in $\mathrm{POLY}^A(G)$, then $L_1$ is $\le^{p}_T$-reducible to the language $L \oplus A = \{0x \mid x \in L\} \cup \{1x \mid x \in A\}$, which is in $\mathrm{POLY}^A(G)$ (because $A \in \mathrm{POLY}^A(G)$, provided $G$ satisfies (6), and the class $\mathrm{POLY}^A(G)$ is closed under the operation $\oplus$ for any $A$ and $G$).
It suffices to construct an oracle $A$ such that the following two conditions hold:

(33) $[A]_n \in D(F)$ for all $n$, and

(34) the language $\{n \mid F([A]_n) = 1\}$ is not $\le^{p}_T$-reducible to any language in $\mathrm{POLY}^A(G)$.

Let $M_0, M_1, \ldots, M_i, \ldots$ be an enumeration of all the polynomial-time oracle Turing machines. Let $f_0^A(x), \ldots, f_j^A(x), \ldots$ be an enumeration of all the functions that are polynomial-time bit-computable relative to $A$ ($A$ is considered as the second argument). We want to construct an oracle $A$ such that

(35) $\exists n \in \mathbb{N}\ F([A]_n) \ne M_i^{G(f_j^A(\cdot))}(n)$ or $\exists y\ G(f_j^A(y)) = *$

for all $i, j \in \mathbb{N}$.

First, let $A$ be equal to a polynomial-time decidable language satisfying the condition (33). We make a countable number of steps indexed by pairs $(i,j) \in \mathbb{N}^2$.

Step $(i,j)$. Let $A$ be the oracle (frozen values included) we have after the previous step. We say that $a \in B^*$ is free if no value of $A$ on a word of the form $|a|k$, $k < |a|$, is frozen. Consider two cases.

Case 1: there are a free $a \in D(F)$ and a $y \in B^*$ such that $G(f_j^{A[a]}(y)) = *$. Then replace $A$ by $A[a]$ and freeze finitely many values of $A$ to guarantee the validity of (35). Note that the condition (33) has not been affected.

Case 2: $G(f_j^{A[a]}(y)) \ne *$ for all $y \in B^*$ and for all free $a \in D(F)$. We claim that there is a free $a \in D(F)$ such that $F(a) \ne M_i^{G(f_j^{A[a]}(\cdot))}(|a|)$. Indeed, otherwise $F(a) = M_i^{G(f_j^{A[a]}(\cdot))}(|a|)$ for all free $a \in D(F)$. Then the function $g(y,a) = f_j^{A[a]}(y)$ is bit-computable in time $\mathrm{poly}(|y| + \log|a|)$, and for the pair $(M_i, g)$ the conditions (27) and (28) are fulfilled for all free $a \in D(F)$. Therefore, $F \le^{pl}_T G$, and we get a contradiction.

After that the proof goes similarly to the proof of Theorem 1. □

3.4. A criterion for whether a complexity class has a Turing hard language for another complexity class.

Theorem 8 ([10, 51]).
If a separation problem $F$ satisfies condition (5) and a separation problem $G$ satisfies condition (6), then the following assertions are equivalent:

(36) the class $\mathrm{LOG}(G)$ is $\le^{pl}_T$-hard for the class $\mathrm{LOGS}(F)$,

(37) $\{F\} \le^{pl}_T \mathrm{LOG}(G)$,

(38) the class $\mathrm{POLY}^A(G)$ is $\le^{p,A}_T$-hard for the class $\mathrm{POLY}^A(F)$ for all $A$.

If $F$ is a language, then all three assertions are equivalent to the assertion

(39) $\mathrm{LOG}(G)$ is $\le^{pl}_T$-hard for $\mathrm{LOG}(F)$.

Proof. Evidently, (36) and (37) are equivalent, and if $F$ is a language, then they both are equivalent to (39).
Let us prove the implication (37) $\Rightarrow$ (38). Assume that $F \le^{pl}_T H \in \mathrm{LOG}(G)$. If $H$ does not satisfy condition (6), then $F \in \mathrm{PLOG}$ and therefore (38) is true. Otherwise, Theorem 4 implies that $\mathrm{POLY}^A(H)$ has a $\le^{p}_m$-complete language. By Theorem 7 we have $\mathrm{POLY}^A(F) \le^{p,A}_T \mathrm{POLY}^A(H)$; consequently, the class $\mathrm{POLY}^A(G)$ is $\le^{p,A}_T$-hard for the class $\mathrm{POLY}^A(F)$.

Let us prove that (38) implies (37). Similarly to Theorem 7, we may replace $\le^{p,A}_T$-reducibility by $\le^{p}_T$-reducibility in (38). Assume that (37) is false, that is, $F$ is $\le^{pl}_T$-reducible to no language in the class $\mathrm{LOG}(G)$. We have to construct an oracle $A$ such that the class $\mathrm{POLY}^A(G)$ has no language that is $\le^{p}_T$-hard for the class $\mathrm{POLY}^A(F)$. Let $f_0^A(y), f_1^A(y), \ldots, f_i^A(y), \ldots$ be an enumeration of all the functions that are polynomial-time bit-computable relative to $A$. Split $A$ into components $A^i = \{x \mid ix \in A\}$. It suffices to find an $A$ such that for any $i \in \mathbb{N}$ at least one of the following two assertions holds:

(40) $G(f_i^A(y)) = *$ for some $y \in B^*$,

(41) the language $L_i(A) = \{n \mid F([A^i]_n) = 1\}$ is in the class $\mathrm{POLY}^A(F)$ and is not $\le^{p}_T$-reducible to the separation problem $G(f_i^A(y))$.

Let $M_0, M_1, \ldots, M_j, \ldots$ be an enumeration of all the polynomial-time oracle Turing machines. To make the assertion (41) true it suffices to satisfy the following requirements:

(42) $F([A^i]_n) \ne *$ for all $n$,

(43) $\exists n \in \mathbb{N}\ F([A^i]_n) \ne M_j^{G(f_i^A(\cdot))}(n)$,

for all $j \in \mathbb{N}$. To construct an oracle $A$ satisfying (40) or (42)&(43) for all pairs $(i,j)$ we can follow the proof of Theorem 4. The only difference appears in the second case when the step $(i,j)$ is described. Recall that in the second case $G(f_i^A(y)) \ne *$ for all $y \in B^*$ and for all variations of nonfrozen values of $A^i$. We call a word $a \in B^*$ free if no value of $A^i$ on a word of the form $|a|j$, $j < |a|$, is frozen. We have to prove that there is a free $a \in D(F)$ such that $F(a) \ne M_j^{G(f_i^{A[a,i]}(\cdot))}(|a|)$. Assume that there is no such $a$.
Let $l(n)$ denote a polylogarithmic upper bound for the length of queries made by the machine $M_j$ on input $n$. Consider the language $H = \{ya : |y| \le l(|a|),\ G(f_i^{A[a,i]}(y)) = 1\}$ and the function $g(y,a) = ya$. Since $G(f_i^{A[a,i]}(y)) \ne *$ for all free $a$ and for all $y \in B^*$, the language $H$ is in $\mathrm{LOG}(G)$. Then for the pair $(M_j, g)$, the assertions (27) and (28) are true for all free $a \in B^*$. Therefore, $F \le^{pl}_T H$. This contradiction completes the proof. □

Corollary 9. If $F \not\le^{pl}_T \mathrm{n.u.LOG}(G)$, then there exists an oracle $A$ such that the class $\mathrm{POLY}^A(G)$ is not $\le^{p,A}_T$-hard for the class $\mathrm{POLY}^A(F)$.

Remark 5. Let $K_1$, $K_2$ be classes of languages and $A$ an oracle. In [3] it is noted that if the class $K_2$ is downward closed under $\le^{p,A}_T$-reductions, then the class $K_2$ is $\le^{p,A}_T$-hard for a class $K_1$ if and only if $K_2$ is $\le^{p}_m$-hard for $K_1$. Indeed, suppose that $L$ is a language in $K_2$ to which all the languages from $K_1$ are $\le^{p,A}_T$-reducible. Then consider the language

$$L_1 = \{ix0^t \mid M_i^{A,L} \text{ on input } x \text{ outputs 1 in } \le t \text{ steps}\},$$
where $M_0, M_1, \ldots$ is a numeration of polynomial-time Turing machines having two oracles. All the languages in the class $K_1$ are $\le^{p}_m$-reducible to $L_1$. On the other hand, $L_1 \le^{p,A}_T L$; hence, $L_1 \in K_2$.

4. Relativizable inclusions between particular complexity classes

In this section we consider many of the representable classes lying between P and PSPACE. As mentioned in Remark 1, all the particular complexity classes studied in the literature can be represented by means of separation problems that are non-zero only on the words of length $2^n$, $n \in \mathbb{N}$. To simplify the notation, we consider in the sequel only separation problems satisfying this requirement. Let $F_n$ denote $B^{2^n}$ and let $F$ denote $\bigcup_{n} F_n$. We enumerate the bits of a word $a \in F_n$ either by binary words of length $n$, or by numbers from 0 to $2^n - 1$. For a word $a$ in $F$, its norm $\|a\|$ is defined as $\log_2 |a|$. While defining particular separation problems we keep the following agreement: if the problem under consideration is defined only on a set $M \subset B^*$, then its value on all the words from $B^* \setminus M$ is equal to 0 (that is, the default value is 0).

We consider the following relativized complexity classes: $\mathrm{P}^A$, $\mathrm{UP}^A$, $\text{co-UP}^A$, $\mathrm{UP}^A \cap \text{co-UP}^A$, $\mathrm{FewP}^A$, $\text{co-FewP}^A$, $\mathrm{FewP}^A \cap \text{co-FewP}^A$, $\mathrm{Few}^A$, $\oplus\mathrm{P}^A$, $\mathrm{R}^A$, $\text{co-R}^A$, $\mathrm{R}^A \cap \text{co-R}^A$, $\mathrm{NP}^A$, $\text{co-NP}^A$, $\mathrm{NP}^A \cap \text{co-NP}^A$, $\mathrm{BPP}^A$, $\mathrm{MA}^A$, $\text{co-MA}^A$, $\mathrm{MA}^A \cap \text{co-MA}^A$, $\mathrm{AM}^A$, $\text{co-AM}^A$, $\mathrm{AM}^A \cap \text{co-AM}^A$, $\mathrm{PP}^A$, $\Sigma_k^A$, $\Pi_k^A$, $\Sigma_k^A \cap \Pi_k^A$ ($k \ge 2$), $\mathrm{IP}^A$, $\text{co-IP}^A$, $\mathrm{IP}^A \cap \text{co-IP}^A$, $\mathrm{PH}^A$, $\mathrm{PSPACE}^A$. Below we recall the definitions of complexity classes from this list and give some comments.

1. $\mathrm{R}^A \stackrel{\mathrm{def}}{=} \mathrm{POLY}^A(F_{\mathrm{R}})$, where

$$F_{\mathrm{R}}(a) = \begin{cases} 1 & \text{if } \#_1(a) > \tfrac{2}{3}|a|,\\ 0 & \text{if } \#_1(a) = 0,\\ * & \text{otherwise.} \end{cases}$$

2. $\mathrm{UP}^A \stackrel{\mathrm{def}}{=} \mathrm{POLY}^A(F_{\mathrm{UP}})$, where

$$F_{\mathrm{UP}}(a) = \begin{cases} 1 & \text{if } \#_1(a) = 1,\\ 0 & \text{if } \#_1(a) = 0,\\ * & \text{otherwise.} \end{cases}$$

3. The definition of the class $\mathrm{FewP}^A$ is as follows. A language $L$ is in $\mathrm{FewP}^A$ if there are a polynomial $q$ and a function $f$ polynomial-time bit-computable relative to $A$ such that (i) $\#_1(f(x)) \le q(|x|)$ and (ii) $x \in L \Leftrightarrow \#_1(f(x)) > 0$ for all $x$.
It is easy to verify that $\mathrm{FewP}^A = \mathrm{POLY}^A(F)$, where

$$F(a) = \begin{cases} 1 & \text{if } 0 < \#_1(a) \le \|a\|,\\ 0 & \text{if } \#_1(a) = 0,\\ * & \text{otherwise.} \end{cases}$$

4. $\mathrm{Few}^A$ is the class defined in [11] as follows: a language $L$ is in $\mathrm{Few}^A$ if there exist a function $f$ polynomial-time bit-computable relative to $A$, a polynomial $q$ and a predicate $R$ defined on the set $B^* \times \mathbb{N}$ and polynomial-time computable relative to $A$ such that (i) $L(x) = R(x, \#_1(f(x)))$ and (ii) $\#_1(f(x)) \le q(|x|)$ for all $x \in B^*$.
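The representation of $\mathrm{Few}^A$ in the form $\mathrm{POLY}^A(F)$ is not so natural as for other classes. Assume that $|a| = 2^{n+1}$ and $a = \beta\gamma$, where $|\beta| = |\gamma| = 2^n$. Let

$$F(a) = \begin{cases} * & \text{if } \#_1(\beta) > n,\\ \gamma(\#_1(\beta)) & \text{otherwise.} \end{cases}$$

It is easy to verify that $\mathrm{Few}^A = \mathrm{POLY}^A(F)$.

5. $\oplus\mathrm{P}^A \stackrel{\mathrm{def}}{=} \mathrm{POLY}^A(\mathrm{PARITY})$, where

$$\mathrm{PARITY}(a) = \begin{cases} 0 & \text{if } \#_1(a) \text{ is even},\\ 1 & \text{otherwise.} \end{cases}$$

6. $\mathrm{PP}^A \stackrel{\mathrm{def}}{=} \mathrm{POLY}^A(\mathrm{MAJORITY})$, where

$$\mathrm{MAJORITY}(a) = \begin{cases} 0 & \text{if } \#_1(a) < \tfrac{1}{2}|a|,\\ 1 & \text{otherwise.} \end{cases}$$

7. $\mathrm{AM}^A$ is the abbreviation for the class $\mathrm{AM}[2]^A$. The class $\mathrm{AM}^A$ is represented by the following separation problem $F_{\mathrm{AM}}$. Let the notation $\mathsf{M}_d\, x \in M\ P(x)$ mean that $|\{x \in M : P(x)\}| > d \cdot |M|$. Then for $a \in F_{2n}$,

$$F_{\mathrm{AM}}(a) = \begin{cases} 1 & \text{if } \mathsf{M}_{2/3}\, u \in B^n\ \exists v \in B^n\ a(uv) = 1,\\ 0 & \text{if } \mathsf{M}_{2/3}\, u \in B^n\ \forall v \in B^n\ a(uv) = 0,\\ * & \text{otherwise,} \end{cases}$$

where $uv$ stands for the concatenation of words $u$ and $v$.

8. $\mathrm{MA}^A$ is the class represented by the separation problem

$$F_{\mathrm{MA}}(a) = \begin{cases} 1 & \text{if } \exists u \in B^n\ \mathsf{M}_{2/3}\, v \in B^n\ a(uv) = 1,\\ 0 & \text{if } \forall u \in B^n\ \mathsf{M}_{2/3}\, v \in B^n\ a(uv) = 0,\\ * & \text{otherwise,} \end{cases}$$

where $a \in F_{2n}$.

9. $\Sigma_k^A \stackrel{\mathrm{def}}{=} \mathrm{POLY}^A(F_k)$, where

$$F_k(a) = \begin{cases} 1 & \text{if } \exists y_1 \forall y_2 \ldots Q y_k\ a(y_1 y_2 \ldots y_k) = 1 \text{ and } \|a\| \text{ is a multiple of } k,\\ 0 & \text{otherwise,} \end{cases}$$

where $Q$ stands for $\exists$ if $k$ is odd and for $\forall$ otherwise, and all $y_1, \ldots, y_k$ range over $B^{\|a\|/k}$. Note that $\Sigma_1^A = \mathrm{NP}^A$.

10. $\Pi_k^A \stackrel{\mathrm{def}}{=} \text{co-}\Sigma_k^A$. $\mathrm{PH}^A = \bigcup_k \Sigma_k^A$. The manifold $\mathrm{PH}^A$, as observed by Silvestri [46], is an example of an $\omega$-representable manifold that is not representable. Indeed, assume $\mathrm{PH}^A = \mathrm{POLY}^A(F)$ for some $F$ and for all $A$. On the other hand, we have $\mathrm{PH}^A = \mathrm{POLY}^A(\{F_k \mid k = 1, 2, \ldots\})$, where $F_k$ is defined in the previous item. Theorem 2 implies that $F \in \mathrm{LOGS}(\{F_k \mid k = 1, 2, \ldots\})$; hence $F \in \mathrm{LOGS}(F_k)$ for some particular $k$. This implies that $\mathrm{PH}^A = \Sigma_k^A$ for all $A$. However, for any $k$ there is an oracle separating PH from $\Sigma_k$ [25] (see item 12 below).

11. $\mathrm{PSPACE}^A$ is the class of languages recognized by polynomial-space Turing machines with oracle $A$. Let us prove that the manifold $\mathrm{PSPACE}^A$ has the form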
$\mathrm{POLY}^A(F)$ for some $F$. It is well known that any language $L$ in $\mathrm{PSPACE}^A$ can be represented as follows:

$$L = \{x \mid \exists y_1 \in B^n\ \forall y_2 \in B^n \ldots Q y_n \in B^n\ P^A(x, y_1 y_2 \ldots y_n), \text{ where } n = p(|x|)\},$$

where $P^A(x,u)$ is a predicate that is polynomial-time computable relative to $A$ and $p(m)$ is a polynomial. The converse is true, too. Therefore, we can take the separation problem

$$F_{\mathrm{PSPACE}}(a) = \begin{cases} 1 & \text{if there is an } n \in \mathbb{N} \text{ such that } \|a\| = n^2 \text{ and } \exists y_1 \in B^n\ \forall y_2 \in B^n \ldots Q y_n \in B^n\ a(y_1 y_2 \ldots y_n) = 1,\\ 0 & \text{otherwise.} \end{cases}$$

It is clear that $\mathrm{POLY}^A(F_{\mathrm{PSPACE}}) = \mathrm{PSPACE}^A$ and $\mathrm{LOG}(F_{\mathrm{PSPACE}})$ is the class of languages that can be recognized within polylogarithmic space.

12. To define the class $\mathrm{IP}^A$ take the following separation problem $F_{\mathrm{IP}}$. On words $a \in F_{2n^2}$, it is defined as follows:

$$F_{\mathrm{IP}}(a) = \begin{cases} 1 & \text{if } \exists P : B^* \to B^n\ \ \mathrm{Prob}[a(r_1 r_2 \ldots r_n P(r_1) P(r_1 r_2) \ldots P(r_1 r_2 \ldots r_n)) = 1] > \tfrac{2}{3},\\ 0 & \text{if } \forall P : B^* \to B^n\ \ \mathrm{Prob}[a(r_1 r_2 \ldots r_n P(r_1) P(r_1 r_2) \ldots P(r_1 r_2 \ldots r_n)) = 1] < \tfrac{1}{3},\\ * & \text{otherwise,} \end{cases}$$

where the probability is considered with respect to the uniform distribution in $r_1 \ldots r_n$. Then $\mathrm{POLY}^A(F_{\mathrm{IP}}) \stackrel{\mathrm{def}}{=} \mathrm{IP}^A$.

To explain the intuitive meaning of the definition of $F_{\mathrm{IP}}$, we recall the definition of the class $\mathrm{IP}^A$ according to [4] and convert it to a convenient form. Consider a game between two players called Verifier, V, and Prover, P. P tries to convince V that an input string is in a language $L$. The Verifier is bounded in time by a polynomial of the length of the input; he also has access to oracle $A$ and can toss a fair coin. P has unbounded computational resources. The convincing procedure is as follows. For a given input $x$, V tosses the coin several times and then asks P a question depending on $x$ and the outcome of the tossing. Then P answers the question. Again V tosses the coin several times and then asks P another question depending on $x$, both outcomes of tossing and the previous answer of P. This procedure is repeated a polynomial number of times. Then V either accepts or rejects. The outcomes of tossing are known also to P.
The action of V is governed by a polynomial-time Turing machine $M$. We say that $M$ is good for $L$ if (i) for any $x$ in $L$, there is a strategy for P such that V will accept with probability greater than 2/3 and (ii) for any $x$ outside $L$ and for all strategies of P, Verifier will reject with probability greater than 2/3. There are no restrictions on the strategy of P; his strategy may even be uncomputable. However, it is easy to see that one can restrict P's strategies to those computable in exponential time (and even within polynomial space). It is also easy to see that to give answers to V's questions P does not need the questions themselves, as they can be computed from $x$ and the outcomes of the tossings.

In a formal setting, a Verifier is a pair $V = (q, Q)$, where $Q$ is a polynomial-time computable predicate on $B^* \times B^* \times B^*$ and $q : \mathbb{N} \to \mathbb{N}$ is a polynomial. Any function $P : B^* \to B^*$ is called a Prover's strategy, or briefly a Prover. Assume that $x \in B^*$,
$|x| = m$. Assume that $r_1, \ldots, r_{q(m)}$ is a sequence of $q(m)$ binary words of length $q(m)$ (the outcomes of tossing). For all $i \le q(m)$, let $p_i = P(r_1 \ldots r_i)$ (the $i$th answer of the Prover). We say that the result of $(P,V)$ on input $x$ and random inputs $r_1, \ldots, r_{q(m)}$ is equal to 1 if all the words $p_i$ have length $q(m)$ and $Q(x, r_1 \ldots r_{q(m)}, p_1 \ldots p_{q(m)}) = 1$; otherwise the result is equal to 0. Let $(P,V)(x)_{r_1 \ldots r_{q(m)}}$ denote the result of $(P,V)$ on input $x$ and random inputs $r_1, \ldots, r_{q(m)}$. We say that a language $L$ belongs to IP if there is a Verifier $V$ such that the following two assertions are true:

(44) $\forall x \in L\ \exists P\ \ \mathrm{Prob}[(P,V)(x)_{r_1 \ldots r_{q(|x|)}} = 1] > 2/3$,

(45) $\forall x \notin L\ \forall P\ \ \mathrm{Prob}[(P,V)(x)_{r_1 \ldots r_{q(|x|)}} = 0] > 2/3$.

If we allow Verifier to query the oracle $A$, then the resulting class is denoted by $\mathrm{IP}^A$. The alternative definition of the class IP according to [21] (when the Prover does not see the outcomes of tossing) also fits into our framework, though in a tricky way. As proved in [22], these two definitions are equivalent and the proof of the equivalence relativizes.

It is easy to see that a language $L$ is in $\mathrm{LOG}(F_{\mathrm{IP}})$ if there is a polylog-time Verifier (in the above formal definition, $Q$ is polylog-time computable and $q$ is a polylogarithm) for which (44) and (45) hold. Let IPLOG denote $\mathrm{LOG}(F_{\mathrm{IP}})$.

13. For any class $K^A$, the class $\text{co-}K^A$ is defined as $\{L \mid B^* \setminus L \in K^A\}$. Note that if the manifold $K^A$ is representable [$\omega$-representable], then the manifold $\text{co-}K^A = \{B^* \setminus L \mid L \in K^A\}$ is representable [$\omega$-representable]. If $K_1^A$, $K_2^A$ are representable, say $K_i^A = \mathrm{POLY}^A(F_i)$, $i = 1, 2$, then the manifold $K_1^A \cap K_2^A$ is also representable. Indeed, take the following separation problem $F$:

$$F(a) = \begin{cases} 1 & \text{if } a \text{ encodes a pair of words } (a_1, a_2), \text{ where } F_1(a_1) = F_2(a_2) = 1,\\ 0 & \text{if } a \text{ encodes a pair of words } (a_1, a_2), \text{ where } F_1(a_1) = F_2(a_2) = 0,\\ * & \text{otherwise.} \end{cases}$$

Obviously, this separation problem $F$ also satisfies the following equations: $\mathrm{LOG}(F) = \mathrm{LOG}(F_1) \cap \mathrm{LOG}(F_2)$, $\mathrm{LOGS}(F) = \mathrm{LOGS}(F_1) \cap \mathrm{LOGS}(F_2)$, $\mathrm{EXP}^A(F) = \mathrm{EXP}^A(F_1) \cap \mathrm{EXP}^A(F_2)$.
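Several of the separation problems in items 1, 2 and 5 to 8 above can be written out as small partial functions, which may help when checking the piecewise definitions on concrete words. The following is a minimal sketch under stated assumptions, not part of the original text: words are Python strings over "0" and "1", the undefined value $*$ is modelled by `None`, and all function names (`f_r`, `f_up`, `parity`, `majority`, `f_am`, `f_ma`) are hypothetical. The threshold in `f_r` uses a strict inequality; for words of length $2^n$ this makes no difference, since $\tfrac{2}{3} \cdot 2^n$ is never an integer. For `f_am` and `f_ma` the default-value agreement is applied to words whose length is not $2^{2n}$; the other functions ignore the length restriction.

```python
from typing import List, Optional


def ones(a: str) -> int:
    """#_1(a): the number of 1s in the binary word a."""
    return a.count("1")


def most(count: int, total: int) -> bool:
    """The quantifier M_{2/3}: more than 2/3 of `total` elements qualify."""
    return 3 * count > 2 * total


def f_r(a: str) -> Optional[int]:
    """F_R: 1 if more than 2/3 of the bits are 1, 0 if no bit is 1, else *."""
    if most(ones(a), len(a)):
        return 1
    return 0 if ones(a) == 0 else None


def f_up(a: str) -> Optional[int]:
    """F_UP: 1 if exactly one bit is 1, 0 if no bit is 1, else *."""
    if ones(a) == 1:
        return 1
    return 0 if ones(a) == 0 else None


def parity(a: str) -> int:
    """PARITY: 0 if #_1(a) is even, 1 otherwise."""
    return ones(a) % 2


def majority(a: str) -> int:
    """MAJORITY: 0 if #_1(a) < |a|/2, 1 otherwise."""
    return 0 if 2 * ones(a) < len(a) else 1


def _rows(a: str) -> Optional[List[str]]:
    """Split a word of length 2^(2n) into 2^n rows; row u lists a(uv) over v."""
    size = len(a)
    m = size.bit_length() - 1
    if size == 0 or size != 1 << m or m % 2:
        return None  # length is not of the form 2^(2n)
    side = 1 << (m // 2)
    return [a[u * side:(u + 1) * side] for u in range(side)]


def f_am(a: str) -> Optional[int]:
    """F_AM: 'for most u there is v' versus 'for most u, for all v'."""
    rows = _rows(a)
    if rows is None:
        return 0  # default value outside the domain
    if most(sum("1" in r for r in rows), len(rows)):
        return 1
    if most(sum("1" not in r for r in rows), len(rows)):
        return 0
    return None


def f_ma(a: str) -> Optional[int]:
    """F_MA: 'there is u such that for most v' versus 'for all u, for most v'."""
    rows = _rows(a)
    if rows is None:
        return 0  # default value outside the domain
    side = len(rows)
    if any(most(ones(r), side) for r in rows):
        return 1
    if all(most(side - ones(r), side) for r in rows):
        return 0
    return None
```

On the word `"0101"` (two rows `"01"`, `"01"`) the sketch gives `f_am` equal to 1 while `f_ma` is undefined, which reflects the different order of the quantifiers in the two definitions.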
All the known relativizable inclusions between the classes under consideration are shown in Figure 1. A line segment connects a class $K_1^A$ with a class $K_2^A$ if $K_1^A$ is included in $K_2^A$, and $K_2^A$ is positioned higher than $K_1^A$.

4.1. Historical references. The nontrivial inclusions shown in Figure 1 were proved by the following authors.

1. The assertion $\mathrm{MA}^A \subseteq \Sigma_2^A \cap \Pi_2^A$ follows from the Gács result (published in [48]) stating that $\mathrm{BPP}^A \subseteq \Sigma_2^A \cap \Pi_2^A$. Namely, in [48] a separation problem $G$ is constructed such that $G$ is a solution of $F_{\mathrm{BPP}}$ and

(46) $G(a) = 1 \iff \forall y \in B^{p(\|a\|)}\ \exists z \in B^{p(\|a\|)}\ Q(a,y,z)$,

where $p$ is a polynomial and $Q$ is a polylog-time predicate (that is, $G \in \Pi_2\mathrm{LOG}$).
Figure 1. Relativizable inclusions between complexity classes.

2. The assertion $\mathrm{AM}^A \subseteq \Pi_2^A$ follows from the above result by Gács. However, for this assertion, it is important that in (46) the predicate $Q(a,y,z)$ is monotone in $a$ (that is, if $a'$ can be obtained from $a$ by replacing some 0's by 1's, then $Q(a,y,z) \Rightarrow Q(a',y,z)$).

3. The assertion $\mathrm{MA}^A \subseteq \mathrm{AM}^A$ was proved in [4].

4. $\mathrm{Few}^A \subseteq \oplus\mathrm{P}^A$ was proved in [11].

5. The assertion $\mathrm{MA}^A \subseteq \mathrm{PP}^A$ is proved as follows. By Theorem 1 it suffices to prove that $\mathrm{MALOGS} \subseteq \mathrm{PPLOGS}$.
Assume that $F \in \mathrm{MALOGS}$. Then there are a polynomial $p$ and a polylog-time predicate $Q$ such that

$$F(a) = 1 \Rightarrow \exists u \in B^{p(l)}\ \mathsf{M}_{2/3}\, v \in B^{p(l)}\ Q(a,u,v) = 1,$$
$$F(a) = 0 \Rightarrow \forall u \in B^{p(l)}\ \mathsf{M}_{2/3}\, v \in B^{p(l)}\ Q(a,u,v) = 0,$$

where $l$ stands for $\log|a|$. Using amplification, we can construct a polynomial $p_1$ and a polylog-time predicate $Q_1$ such that

(47) $F(a) = 1 \Rightarrow \exists u \in B^{p(l)}\ \mathrm{Prob}[Q_1(a,u,r) = 1] > 1 - 4^{-p(l)}$, $\quad F(a) = 0 \Rightarrow \forall u \in B^{p(l)}\ \mathrm{Prob}[Q_1(a,u,r) = 1] < 4^{-p(l)}$,

where probability is considered with respect to the uniform distribution in $r \in B^{p_1(l)}$. Indeed, let $p_1(l) = C \cdot p(l)^2$, where $C$ is a constant to be specified later. We view an $r \in B^{p_1(l)}$ as the concatenation of $Cp(l)$ strings $v_1, \ldots, v_{Cp(l)}$ of length $p(l)$. Let $Q_1(a,u,r) = \mathrm{MAJORITY}(Q(a,u,v_1), \ldots, Q(a,u,v_{Cp(l)}))$. Let us make use of the Chernoff bound [13].

Theorem 10. Let $\xi_1, \ldots, \xi_n$ be independent random variables in the set $\{0,1\}$ such that $\mathrm{Prob}[\xi_i = 1] = p$ for all $i$. Then, for any $\delta \in (0; p(1-p)]$,

$$\mathrm{Prob}\left[\left|\frac{1}{n}\sum_{i=1}^{n} \xi_i - p\right| > \delta\right] < 2e^{-\frac{\delta^2 n}{2p(1-p)}}.$$

By the Chernoff bound, for some positive constant $c$, $\mathrm{Prob}[Q_1(a,u,r) = 1] > 1 - 2^{-c \cdot C \cdot p(l)}$ whenever $\mathsf{M}_{2/3}\, v \in B^{p(l)}\ Q(a,u,v) = 1$, and $\mathrm{Prob}[Q_1(a,u,r) = 1] < 2^{-c \cdot C \cdot p(l)}$ whenever $\mathsf{M}_{2/3}\, v \in B^{p(l)}\ Q(a,u,v) = 0$. Let $C = \lceil 2/c \rceil$.

From (47) we conclude that

$$F(a) = 1 \Rightarrow \mathrm{Prob}[Q_1(a,u,r) = 1] > 2^{-p(l)}(1 - 4^{-p(l)}) > 4^{-p(l)},$$
$$F(a) = 0 \Rightarrow \mathrm{Prob}[Q_1(a,u,r) = 1] < 4^{-p(l)},$$

with respect to the uniform distribution in pairs $(u,r) \in B^{p(l)+p_1(l)}$.

We shall now define a function $f : B^n \to B^k$, where $n = 2^l = |a|$ and $k = 2^{p_1(l)+p(l)+1} - 2^{p_1(l)-p(l)+1}$. We index the first $2^{p_1(l)+p(l)}$ bits of $f(a)$ by pairs $(u,r) \in B^{p(l)+p_1(l)}$. The bit of $f(a)$ number $(u,r)$ is equal to $Q_1(a,u,r)$. The remaining $2^{p_1(l)+p(l)} - 2^{p_1(l)-p(l)+1}$ bits of $f(a)$ are 1's. Obviously $f(a)$ is polylog-time bit-computable. We claim that $f$ reduces $F$ to MAJORITY. If $F(a) = 1$, then more than

$$2^{p_1(l)+p(l)} \cdot 2^{-2p(l)} + 2^{p_1(l)+p(l)} - 2^{p_1(l)-p(l)+1} = |f(a)|/2$$

bits of $f(a)$ are 1's. If $F(a) = 0$, then fewer than $|f(a)|/2$ bits of $f(a)$ are 1's. Hence $F \in \mathrm{PPLOGS}$. □

6.
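The assertion $\mathrm{Few}^A \subseteq \Sigma_2^A \cap \Pi_2^A$ follows from $\mathrm{Few}^A \le^{p}_T \mathrm{NP}^A$; the latter assertion is easy and well known. Later we will need the following stronger assertion.

Lemma 1. $\mathrm{Few}^A \le^{p}_T \mathrm{FewP}^A$ for all $A$.

Proof. Fix $A \subseteq B^*$. Assume that $L \in \mathrm{Few}^A$ and that $L$ is defined by the polynomials $p$, $q$ and polynomial-time predicates $R^A$, $Q^A$, that is,

$$L(x) = R^A(x, |\{y \in B^{p(|x|)} \mid Q^A(x,y)\}|), \qquad |\{y \in B^{p(|x|)} \mid Q^A(x,y)\}| \le q(|x|).$$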
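Let $K = A \oplus \{xz \mid \exists y \in B^{p(|x|)}\ (z \text{ is a prefix of } y \text{ and } Q^A(x,y))\}$. Obviously, $K$ is in $\mathrm{FewP}^A$. Using the binary search and querying oracle $K$, we find in polynomial time for any given $x$ all $y \in B^{p(|x|)}$ such that $Q^A(x,y)$. Then we compute $L(x) = R^A(x, |\{y \in B^{p(|x|)} \mid Q^A(x,y)\}|)$. □

7. The assertion $\mathrm{Few}^A \subseteq \mathrm{PP}^A$ was proved in [31].

4.2. Proving the completeness of Figure 1. We claim that all true relativizable inclusions are shown in Figure 1. This follows from the twelve assertions listed below, namely, all the assertions $\exists A\ K_1^A \not\subseteq K_2^A$ such that $K_1 \not\le K_2$ and

$$\forall K_1'\,(K_1' < K_1 \Rightarrow K_1' \le K_2), \qquad \forall K_2'\,(K_2 < K_2' \Rightarrow K_1 \le K_2'),$$

where $K_1 \le K_2$ means that there exists a directed path from the class $K_1$ to the class $K_2$ in the directed graph shown in Figure 1. Here is the list:

1. $\exists A\ \mathrm{UP}^A \cap \text{co-UP}^A \not\subseteq \mathrm{BPP}^A$,
2. $\exists A\ \mathrm{R}^A \cap \text{co-R}^A \not\subseteq \oplus\mathrm{P}^A$,
3. $\exists A\ \text{co-UP}^A \not\subseteq \mathrm{IP}^A$,
4. $\exists A\ \mathrm{FewP}^A \cap \text{co-FewP}^A \not\subseteq \mathrm{UP}^A$,
5. $\exists A\ \text{co-R}^A \not\subseteq \mathrm{NP}^A$,
6. $\exists A\ \mathrm{IP}^A \cap \text{co-IP}^A \not\subseteq \mathrm{PH}^A$,
7. $\exists A\ \mathrm{AM}^A \cap \text{co-AM}^A \not\subseteq \mathrm{PP}^A$,
8. $\exists A\ \mathrm{AM}^A \not\subseteq \Sigma_2^A$,
9. $\exists A\ \mathrm{PP}^A \not\subseteq \mathrm{PH}^A$,
10. $\exists A\ \oplus\mathrm{P}^A \not\subseteq \mathrm{PH}^A$,
11. $\exists A\ \oplus\mathrm{P}^A \not\subseteq \mathrm{PP}^A$,
12. $\exists A\ \Pi_k^A \not\subseteq \Sigma_k^A$ for $k \ge 3$.

We give the proofs of all the assertions in the above list whose proofs do not require much space, and give references for all other assertions.

1. $\exists A\ \mathrm{UP}^A \cap \text{co-UP}^A \not\subseteq \mathrm{BPP}^A$.

Theorem 11 ([51]). $\exists A\ \mathrm{UP}^A \cap \text{co-UP}^A \not\subseteq \mathrm{BPP}^A$.

Proof. Let us fix a convenient terminology (also used in other proofs). All the specific separation problems $G$ used in the sequel satisfy the following property: for each $F \in \mathrm{LOGS}(G)$ there exists a polylog-time bit-computable function $f$ such that $F(a) \le G(f(a))$ and for all $a$ the norm of $f(a)$ depends only on $\|a\|$ and is equal to a polynomial of $\|a\|$.

Assume that $F \in \mathrm{LOGS}(G)$ and let $f$ be a polylog-time bit-computable function such that $F(a) = G(f(a))$ and $\|f(a)\| = p(\|a\|)$ for all $a \in D(F)$, where $p$ is a polynomial. Then all the words $r$ contained in the set $B^{p(\|a\|)}$ are called experts (for $f$ and $\|a\|$), and the $r$th bit of $f(a)$ is called the opinion of $r$ about $a$.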
Fix a polylog-time machine $M$ that computes the $r$th bit of the word $f(a)$ for a given $a$ and $r \in B^{p(\|a\|)}$. We say that expert $r$ queries $a(u)$ (where $u \in B^{\|a\|}$) if $M$ queries the $u$th bit of $a$ during the run on the input $(a,r)$. It is clear that for all $a$ and all $r \in B^{p(\|a\|)}$ there are at most $\mathrm{poly}(\|a\|)$ different $u \in B^{\|a\|}$ such that $r$ queries $a(u)$. The fraction

$$\frac{|\{r \in B^{p(\|a\|)} \mid r \text{ queries } a(u)\}|}{2^{p(\|a\|)}}$$

is called the weight of $u$ relative to $a$, in symbols $w_a(u)$. If $M$ and $p$ are not determined by the context, we say "the weight of $u$ relative to $a$ for $M$, $p$". It is easy to prove the following general fact: $\sum_{u} w_a(u) \le q(\|a\|)$, where $q$ is the polynomial bounding the number of queries of every expert $r \in B^{p(\|a\|)}$.
Now let us start with the proof of Theorem 11. By Theorem 1, it suffices to prove that the separation problem

$$F(a) = \begin{cases} 1 & \text{if } a = \beta\gamma,\ \|\beta\| = \|\gamma\|,\ \#_1(\beta) = 1,\ \#_1(\gamma) = 0,\\ 0 & \text{if } a = \beta\gamma,\ \|\beta\| = \|\gamma\|,\ \#_1(\beta) = 0,\ \#_1(\gamma) = 1,\\ * & \text{otherwise,} \end{cases}$$

does not belong to BPPLOGS (evidently, $\mathrm{POLY}^A(F) = \mathrm{UP}^A \cap \text{co-UP}^A$). Assume the contrary: suppose there are a polynomial $p$ and a polylog-time predicate $P$ such that

$$\#_1(\beta) = 1,\ \#_1(\gamma) = 0 \Rightarrow \mathsf{M}_{2/3}\, r \in B^{p(n)}\ P(\beta\gamma, r) = 1,$$
$$\#_1(\beta) = 0,\ \#_1(\gamma) = 1 \Rightarrow \mathsf{M}_{2/3}\, r \in B^{p(n)}\ P(\beta\gamma, r) = 0,$$

for all $n$ and all $\beta, \gamma \in F_n$. Fix $n$. Let $\beta_0 \in F_n$, $\gamma_0 \in F_n$ be the words containing only zeros. Without loss of generality we may assume that the fraction

$$\frac{|\{r \in B^{p(n)} \mid P(\beta_0\gamma_0, r) = 1\}|}{2^{p(n)}} \ge \frac{1}{2}.$$

We shall enumerate bits in the first half $\beta$ of the word $\beta\gamma$ (where $\beta, \gamma \in F_n$) by the words of the form $0u$, $u \in B^n$, and bits of the second half $\gamma$ by the words of the form $1u$. (We follow this rule in the sequel, too.) Let the number of queries of experts to $\beta_0\gamma_0$ be bounded by $k = \mathrm{poly}(n)$. Then $\sum_{u \in B^n} w_{\beta_0\gamma_0}(1u) \le k$; therefore, there is a $u_0 \in B^n$ such that $w_{\beta_0\gamma_0}(1u_0) \le k/2^n < 1/6$ (if $n$ is large enough). Let $\gamma_1$ denote the word whose $u_0$th bit is 1 and whose other bits are 0. Replace the word $\beta_0\gamma_0$ by the word $\beta_0\gamma_1$. After this replacement at most 1/6 of the experts change their opinions; hence

$$\frac{|\{r \in B^{p(n)} \mid P(\beta_0\gamma_1, r) = 1\}|}{2^{p(n)}} \ge \frac{1}{3}.$$

As $F(\beta_0\gamma_1) = 0$, we get a contradiction. □

2. $\exists A\ \mathrm{R}^A \cap \text{co-R}^A \not\subseteq \oplus\mathrm{P}^A$.

Theorem 12 ([51]). $\exists A\ \mathrm{R}^A \cap \text{co-R}^A \not\subseteq \oplus\mathrm{P}^A$.

Proof. Evidently, the manifold $\mathrm{R}^A \cap \text{co-R}^A$ can be represented by the following separation problem $F$. If $\gamma \in F_0$, then $F(\gamma) = 0$. For $\gamma \in F_{n+1}$, let $\alpha$ stand for the first half of $\gamma$ and $\beta$ for the second half. Then

$$F(\gamma) = \begin{cases} 0 & \text{if } \#_1(\alpha) = 0,\ \#_1(\beta) > \tfrac{2}{3}|\beta|,\\ 1 & \text{if } \#_1(\alpha) > \tfrac{2}{3}|\alpha|,\ \#_1(\beta) = 0,\\ * & \text{otherwise.} \end{cases}$$

By Theorem 4, it suffices to prove that $F \not\le^{pl}_m \mathrm{PARITY}$. Assume the contrary: suppose there exist a polynomial $p$ and a polylog-time predicate $P$ such that

$$\forall n\ \forall \gamma \in F_{n+1}\quad F(\gamma) \le \sum_{r \in B^{p(n)}} P(\gamma, r).$$

The signs $\sum$ and $+$ in this proof denote addition modulo 2.
Fix a polylog-time machine $M$ computing the predicate $P$ and a large $n$. Let the number of queries to the word $\gamma$ made by $M$ on inputs of the form $(\gamma, r)$, $r \in B^{p(n)}$, be bounded by $k = \mathrm{poly}(n)$. Let us prove that for any fixed $r \in B^{p(n)}$ the function $P(\gamma, r)$ is a polynomial of degree $\le k$ (in the field of residues modulo 2) of variables $\gamma(v)$, $v \in B^{n+1}$. Indeed,

$$P(\gamma, r) = \sum \prod_{i=1}^{k} \bigl(\gamma(v(b_1 \ldots b_{i-1}, r)) + b_i + 1\bigr),$$

where the sum ranges over all the tuples $(b_1, \ldots, b_k) \in B^k$ such that $M$ outputs 1 if it receives the answers $b_1, \ldots, b_k$ to the queries made to $\gamma$, and where $v(b_1 \ldots b_{i-1}, r) \in B^{n+1}$ is the index of the bit in $\gamma$ queried by $M$ if it receives the answers $b_1, \ldots, b_{i-1}$ to the previous queries to $\gamma$. Therefore, the function $\sum_{r \in B^{p(n)}} P(\gamma, r)$ is a polynomial $Q$ of degree at most $k$ in the variables $\gamma(v)$.

Divide the variables $\gamma(v)$, $v \in B^{n+1}$, into two groups $\alpha(u)$, $u \in B^n$, and $\beta(u)$, $u \in B^n$, where $\alpha(u) = \gamma(0u)$ and $\beta(u) = \gamma(1u)$. Consider two cases. In the first case, the constant term in $Q$ is equal to zero. Set $\beta(u) = 0$ for all $u \in B^n$. Let $R$ denote the resulting polynomial of degree at most $k = \mathrm{poly}(n)$. The polynomial $R$ has $2^n$ variables, has zero constant term, and is equal to 1 if more than $\tfrac{2}{3}2^n$ variables are equal to 1. Let us derive a contradiction from the existence of such a polynomial. Let $d$ be the degree of $R$. Obviously $d > 0$. Pick a monomial $t$ in $R$ of degree $d$. Set all the variables outside $t$ equal to one. The resulting polynomial is not a constant; hence it has a root. Thus there is an assignment having at least $2^n - d > \tfrac{2}{3}2^n$ ones on which $R$ is zero.

The second case (the constant term in $Q$ is equal to 1) can be reduced to the first case by adding 1 to $Q$. □

3. $\exists A\ \text{co-UP}^A \not\subseteq \mathrm{IP}^A$. This assertion was in fact proved in [18] (technically speaking, a slightly weaker assertion $\exists A\ \text{co-NP}^A \not\subseteq \mathrm{IP}^A$ was proved in that paper). As the proof is very simple, we present it.

Theorem 13 ([18]). $\exists A\ \text{co-UP}^A \not\subseteq \mathrm{IP}^A$.

Proof. By Theorem 1, it suffices to prove that the separation problem

$$F_{\text{co-UP}}(a) = \begin{cases} 1 & \text{if } \#_1(a) = 0,\\ 0 & \text{if } \#_1(a) = 1,\\ * & \text{otherwise,} \end{cases}$$

is not in IPLOG.
Assume the contrary: suppose there exists a polylog-time verifier $V$ such that

$$\#_1(a) = 0 \Rightarrow \exists P\ \mathrm{Prob}[(P,V)(a) = 1] > 2/3,$$
$$\#_1(a) = 1 \Rightarrow \forall P\ \mathrm{Prob}[(P,V)(a) = 1] < 1/3,$$

where $(P,V)(a)$ stands for the result output by $V$ after the dialogue with $P$ on input $a$. Take a large $n$ and set $a_0 = 0^{2^n}$. Then there exists a $P$ such that $\mathrm{Prob}[(P,V)(a_0) = 1] > 2/3$.

Consider the dialogue of $P$ and $V$ on input $a_0$. This dialogue depends on the outcome of coin tossing made by the verifier. We call different outcomes of coin tossing experts, and we call the queries to $a_0$ made by the verifier during the dialogue with $P$ on input $a_0$ and outcome $r$ the queries of the expert $r$ to $a_0$. For a given $u \in B^n$ we call the fraction

$$\frac{|\{r \in B^{p(n)} \mid r \text{ makes the query `} a_0(u) = ?\text{'}\}|}{2^{p(n)}}$$

the weight of $u$. Obviously, if $n$ is large enough, then there exists $u$ of weight less than 1/3. Change the $u$th bit in $a_0$; let $a_1$ denote the resulting word. Since $\mathrm{Prob}[(P,V)(a_0) = 1] > 2/3$, we obtain

$$\mathrm{Prob}[(P,V)(a_1) = 1] > 2/3 - 1/3 = 1/3.$$

On the other hand, this probability should be less than 1/3. This contradiction completes the proof. □

4. $\exists A\ \mathrm{FewP}^A \cap \text{co-FewP}^A \not\subseteq \mathrm{UP}^A$. We will prove in the next section the stronger statement $\exists A\ \mathrm{FewP}^A \cap \text{co-FewP}^A \not\le^{p,A}_T \mathrm{UP}^A$.

5. $\exists A\ \text{co-R}^A \not\subseteq \mathrm{NP}^A$.

Theorem 14 ([51]). $\exists A\ \text{co-R}^A \not\subseteq \mathrm{NP}^A$.

Proof. Assume the contrary: suppose there exist a polynomial $p$ and a polylog-time predicate $P(a,r)$ such that

$$\#_1(a) = 0 \Rightarrow \exists r \in B^{p(\|a\|)}\ P(a,r) = 1,$$
$$\#_1(a) > \tfrac{2}{3}|a| \Rightarrow \forall r \in B^{p(\|a\|)}\ P(a,r) = 0,$$

for all $a \in F$. We shall find an $a$ such that $\#_1(a) > \tfrac{2}{3}|a|$ and such that $P(a,r) = 1$ for some $r \in B^{p(\|a\|)}$. Take $a_0 = 0^{2^n}$, where $n$ is large. Then there is an $r_0 \in B^{p(n)}$ such that $P(a_0, r_0) = 1$. Change the value of $a_0$ on all $u$ such that the polylog-time machine computing $P(a_0, r_0)$ does not query '$a_0(u) = ?$'. The resulting word $a$ satisfies the desired conditions. □

6. $\exists A\ \mathrm{IP}^A \cap \text{co-IP}^A \not\subseteq \mathrm{PH}^A$. In [1] it was proved that $\exists A\ \mathrm{IP}^A \not\subseteq \mathrm{PH}^A$. Some minor changes in that proof allow us to prove that there exists an oracle $A$ such that $\mathrm{IP}^A \cap \text{co-IP}^A \not\subseteq \mathrm{PH}^A$.

7. $\exists A\ \mathrm{AM}^A \cap \text{co-AM}^A \not\subseteq \mathrm{PP}^A$. This assertion will be proved in Section 7.

8. $\exists A\ \mathrm{AM}^A \not\subseteq \Sigma_2^A$. This assertion is proved in [44].

9. $\exists A\ \mathrm{PP}^A \not\subseteq \mathrm{PH}^A$. This follows from the fact that there is no $k \in \mathbb{N}$ such that the function $\mathrm{MAJORITY}(x_1, \ldots, x_n)$ can be represented in the form

$$\bigvee_{i_1=1}^{2^{\mathrm{polylog}(n)}} \bigwedge_{i_2=1}^{2^{\mathrm{polylog}(n)}} \cdots \bigvee_{i_{2k-1}=1}^{2^{\mathrm{polylog}(n)}} \bigwedge_{i_{2k}=1}^{2^{\mathrm{polylog}(n)}} f_{i_1 \ldots i_{2k}}(x_1, \ldots, x_n),$$

where each $f_{i_1 \ldots i_{2k}}(x_1, \ldots, x_n)$ is a variable or the negation of a variable ([20], [2], [55], [25]).

10. $\exists A\ \oplus\mathrm{P}^A \not\subseteq \mathrm{PH}^A$. This assertion is proved in [20], [2], [55], [25].

11. $\exists A\ \oplus\mathrm{P}^A \not\subseteq \mathrm{PP}^A$.
This assertion is proved in [8]. In fact, this theorem easily follows from the assertion $\mathrm{PARITY} \not\le^{pl}_m \mathrm{MAJORITY}$ proved in [37].

12. $\forall k \ge 3\ \exists A\ \Pi_k^A \not\subseteq \Sigma_k^A$. Superpolynomial lower bounds for the size of $\Sigma_k$-circuits necessary for the computation of $\Pi_k$-functions were first obtained by M. Sipser. We need a lower bound $2^{f(n)}$, where $f$ grows faster than any polylogarithm. Such a bound is obtained in [25].
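The averaging step shared by the proofs of Theorems 11 and 13 in this section (every expert queries at most $k$ positions, so the weights sum to at most $k$ and some position has weight at most $k$ divided by the number of positions) can be checked on small data. The following is a minimal sketch under stated assumptions, not part of the original text: experts are modelled as dictionary keys, the set of positions each expert queries is given explicitly, and the names `weights` and `lightest_position` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List, Set, Tuple


def weights(queries: Dict[str, Set[int]], positions: int) -> List[Fraction]:
    """Weight of each position: the fraction of experts that query it."""
    experts = len(queries)
    w = [Fraction(0)] * positions
    for asked in queries.values():
        for u in asked:
            w[u] += Fraction(1, experts)
    return w


def lightest_position(queries: Dict[str, Set[int]],
                      positions: int) -> Tuple[int, Fraction]:
    """Return a position of minimal weight together with its weight."""
    w = weights(queries, positions)
    u = min(range(positions), key=w.__getitem__)
    return u, w[u]


# Four experts, eight positions, each expert queries at most k = 2 positions.
example = {"00": {0, 1}, "01": {0, 2}, "10": {0, 3}, "11": {1}}
k = 2
total = sum(weights(example, 8))     # equals the average number of queries
u0, w0 = lightest_position(example, 8)
assert total <= k                    # the weights sum to at most k
assert w0 <= Fraction(k, 8)          # hence some position is this light
```

Flipping the bit at such a position changes the opinion of at most a `w0` fraction of the experts, which is the quantity the two proofs bound by 1/6 and 1/3 respectively.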
5. Turing reducibility between particular complexity classes

In this section we present all the known relativizable assertions of the form $K_1 \le^{p}_T K_2$. Obviously, if $K_1 \subseteq K_2$, then $K_1 \le^{p}_T K_2$. Therefore all the inclusions in Figure 1 yield the assertions on Turing reducibility. Let us list all other known relativizable theorems of the form $K_1 \le^{p}_T K_2$.

(1) The class $K$ is $\le^{p}_T$-reducible to the class $\text{co-}K$ for any $K$, and vice versa.
(2) $\oplus\mathrm{P}^A \le^{p}_T \mathrm{PP}^A$.
(3) $\mathrm{Few}^A \le^{p}_T \mathrm{FewP}^A$.
(4) $\mathrm{PH}^A \le^{p}_T \mathrm{PP}^A$.

The assertion (1) is evident. Both assertions (2) and (3) are simple; (3) was proved in the previous section (Lemma 1), and (2) will be proved right now. The assertion (4) was proved in [49].

Theorem 15 ([51]). $\oplus\mathrm{P}^A \le^{p}_T \mathrm{PP}^A$ for any oracle $A$.

Proof. By Theorem 7 it suffices to prove that the language PARITY is $\le^{pl}_T$-reducible to the language MAJORITY. When we prove that a problem $F$ is or is not $\le^{pl}_T$-reducible to a problem $G$, it is convenient to think that the reducing pair $(M,f)$ is a machine that works on the input $a$ just as the machine $M$ works on $|a|$ and queries the oracle $G$ instead of the oracle $G(f(\cdot,a))$ (when $M$ queries the value of the oracle $G(f(\cdot,a))$ on a word $y$, we think that the new machine queries the value of $G$ on the word $f(y,a)$). Let us define the pair $(M,f)$ reducing the function PARITY to the function MAJORITY in terms of the work of this new machine.

Having MAJORITY as oracle, we can find $\#_1(a)$ in time $\mathrm{polylog}(|a|)$ as follows. Assume that $|a| = 2^k$. Ask the oracle MAJORITY whether $\#_1(a) \ge \tfrac{1}{2}|a|$ is true. Assume that the answer is "yes". Then check whether $\#_1(a) \ge \tfrac{3}{4}|a|$. For that purpose take a word $\beta$ consisting of $\tfrac{1}{2}|a|$ zeros and query the oracle whether $\#_1(a\beta) \ge \tfrac{1}{2}|a\beta|$. It is easy to verify that this inequality is equivalent to the inequality $\#_1(a) \ge \tfrac{3}{4}|a|$ (indeed, $\#_1(a\beta) = \#_1(a)$ and $\tfrac{1}{2}|a\beta| = \tfrac{3}{4}|a|$). Repeating this process $k$ times we find $\#_1(a)$. Output 1 if $\#_1(a)$ is odd and 0 otherwise. □

All known relativizable assertions of the form $K_1 \le^{p}_T K_2$ are shown in Figure 2.

5.1. On completeness of Figure 2.
It is unknown whether Figure 2 is complete, that is, whether all true relativizable assertions of the form $\mathcal{K}_1 \le_T^{P} \mathcal{K}_2$ are shown in Figure 2. Let us go through the following 15 assertions which must be proved to verify that Figure 2 is complete.
1. $\exists A\ \mathrm{R}^A \cap \text{co-}\mathrm{R}^A \not\le_T^{P,A} \oplus\mathrm{P}^A$. This assertion is true and follows from the fact that the class $\oplus\mathrm{P}^A$ is downward closed under $\le_T^{P,A}$-reductions and from the theorem $\exists A\ \mathrm{R}^A \cap \text{co-}\mathrm{R}^A \not\subseteq \oplus\mathrm{P}^A$. The fact that $\oplus\mathrm{P}^A$ is closed under $\le_T^{P,A}$-reductions was proved in [49]; the second theorem was proved in the previous section.
2. $\exists A\ \mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A \not\le_T^{P,A} \mathrm{BPP}^A$. This follows from the fact that the class $\mathrm{BPP}^A$ is downward closed under $\le_T^{P,A}$-reductions. Indeed, in the previous section it was proved that there exists an oracle $A$ such that $\mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A \not\subseteq \mathrm{BPP}^A$.
3. $\exists A\ \mathrm{FewP}^A \cap \text{co-}\mathrm{FewP}^A \not\le_T^{P,A} \mathrm{UP}^A$. This is true and is proved in this section.
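Figure 2. Turing reducibility between complexity classes

4. $\exists A\ \mathrm{R}^A \not\le_T^{P,A} \mathrm{NP}^A \cap \text{co-}\mathrm{NP}^A$. This follows from the fact that the class $\mathrm{NP}^A \cap \text{co-}\mathrm{NP}^A$ is downward closed under $\le_T^{P,A}$-reductions and from the fact that $\exists A\ \mathrm{R}^A \not\subseteq \text{co-}\mathrm{NP}^A$ (it was proved in the previous section).
5. $\exists A\ \mathrm{UP}^A \not\le_T^{P,A} \mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A$. This follows from the fact that the class $\mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A$ is downward closed under $\le_T^{P,A}$-reductions and from the fact $\exists A\ \mathrm{UP}^A \not\subseteq \text{co-}\mathrm{IP}^A$, proved in the previous section.
6. $\exists A\ \Sigma_2^A \cap \Pi_2^A \not\le_T^{P,A} \mathrm{IP}^A$. This is true and is proved in Section 6.
7. $\exists A\ \mathrm{BPP}^A \not\le_T^{P,A} \mathrm{NP}^A$. This is proved in Section 6.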
116 NIKOLAI К. VERESHCHAGIN 8. ЗА 0 Рл ^л РНЛ. This follows from the fact that the class РНЛ is downward closed under <^A-reductions (the closure of the class £& is included in the class D/c+i) and from the fact that ЗА 0 Рл <2 РНЛ. 9. ЗА АМЛ ^T/A П П2 • This follows from the fact that the class £2 G П2 is downward closed under <^'A-reductions and from the fact that ЗА АМЛ £ ^2 • 10. ЗА АМЛ П со-АМл ^?гл МАЛ. This is true, and the proof is presented in this section. 11. ЗА 0 PA 1РЛ. This assertion is true, and is proved in Section 6. 12. ЗА IPA Псо-1Рл A РРЛ. Unknown. 13. ЗА T>A П ^3A_i (k > 3). Unknown (for all к > 3). 14. ЗА T,A jCj,A T,A П (k > 3). This follows from the fact that the class Ed П lid is downward closed under <%A-reductions and from the fact that ЗА 15. ЗА РНЛ ^^A T,A (k > 1). This is true, and follows from 14. 5.2. Theorems. We now prove assertions 3 and 10. Theorem 16 (Joint work with An. A. Muchnik, see [51]). There is an oracle A such that АМЛ П со-АМл ^ТА МАЛ. Proof. Consider the following separation problem F. Let a = /?7, where /?,7 G F2n, n e N. Then F(a) 1 if М2/з x e Bn 3y e Bn 0(xy) = 1, M2/3 x G Bn Vy G Bn 7(xy) = 0, < 0 if М2/з x G Bn Vy G Bn /3(xy) = 0, М2/з x G Bn 3y G Bn j(xy) = 1, * otherwise. It is straightforward that F G AMLOGS П co-AMLOGS. By Theorem 7 it suffices to prove that F is not 7: ^-reducible to the problem Fma- Recall that Fma(P) Ф 0 only if the norm of /3 satisfies \\/3\\ = 2k, and in this case Fma(0) = < 1 0 * if 3r e Mk M2/3 seMk 0(rs) = 1, if Vr e Mk M2/3 seMk 0{rs) = 0, otherwise. The following property holds for the separation problem Fma as well as for all other particular problems G considered in the present paper. For any separation problem F, if H <lT G, then there exists a pair (M, /) reducing H to G such that the following two assertions hold: (1) The number of queries made by M for input \a\ does not depend on the answers of the oracle and is equal to a polylogarithm of \a\. 
(2) For all queries '$B(u) = ?$' made by $M$ to its oracle $B$ during the work on the input $|\alpha|$, the length of the word $f(u, \alpha)$ is the same and depends only on $|\alpha|$. That is, if we consider the pair $(M, f)$ as a single machine, then all its queries to the oracle $G$ during work on the input $\alpha$ have the same length, which depends only on $|\alpha|$.
REL ATI VIZ ABILITY IN COMPLEXITY THEORY 117 We assume that all pairs (M, /) considered in the sequel satisfy both (1) and (2). Assume that F <lT Fma via a pair (M, h). Let us fix a large n (at the end of the proof we will see how large it should be). Let (p be a function from Bn into Bn. Denote by (p the word of length 22n encoding the graph of p. That is, for all x,y E Bn, ф{ху) is equal to 1 if у — <p(x), and is equal to 0 otherwise. We will take words of the form <рф, where ip and ф are partial functions from Bn into Bn, as arguments of F. Let m be the number of queries to the oracle Fma made by M for the input of this form. As \(рф\ = 22n+1, we see that m = poly(n). We define a binary sequence partial functions (р,ф : Bn —> Bn, and total functions /о? go : Bn —> Bn such that the sequence of oracle answers to the queries made by (M, h) to the oracle Fma during the work on both inputs /0ф and <pg0 is equal to bi,..., bm. The cardinalities of domains of the functions ip and ф will be bounded by a polynomial of n; therefore, for n large enough we shall get | Dom(</?)|, |Dom(/0)| < \2n. Obviously this is a contradiction, because (M,h) reduces F to Fma and F(/0^) = 1, F(<pg0) = 0. Let 2к denote the norm of queries made by the pair (M, h) to the oracle Fma (that is, the norm of a’s such that (M,h) queries ‘Fma(&) = ?’) during work on inputs of norm 2n + 1 (obviously, к < poly(n)). Define the following auxiliary separation problem on words of norm 2k: G(0) = l1 if3r^B"Mi/2 seMk I3(rs) = l, 10 otherwise. Obviously, G is a solution to Fma* Take arbitrary functions /, g : Bn —> Bn. Run the machine M on the input 22n+1 with the oracle G(/i(*, fg))- Let e(/, g) denote the sequence of oracle answers. Since the length of the word e(/, g) is equal to m, there exists a word eo of length m such that \{(f,9) I e(/,ff) = e0}| > J_ 22n(2n) - 2171' Let /С denote the set {(/,<?) | e(f,g) = eo}. 
Obviously, for all pairs (f,g) E /С the queries to the oracle G(/i(-, fg)) made by M are the same. Let tq,..., vm denote those queries (that is, the queries are ‘G(/i(tq, fg)) = lG(h(vm, fg)) = ?’). Let P(a, v, u) denote the ^th symbol of the word h(v, a) (a E F2n+i, и £ B2/c). Let bi,..., bm denote the bits of the word eo- Let I stand for the set {г | г < m, bi = 1}. We know that for any i E I and for all (f,g) E /С, there exists r* E B^ such that M^sE Bfc P(fg,Vi,ris) = 1. Again, we can find a set /С' С /С such that, for any г E I and for all (f,g) E /С', this гг is the same and such that > ^тт* Evidently, Д2")' > 2fcri+m. Let £ denote the number 2fc J+m . We consider the set /С' as a planar set of area at least e. Obviously, there exist a vertical section of /С' of length at least e and a horizontal section of JCf of length at least e. That is, there exist functions /o,#o &nd families of functions T' and Qf such that |F'| > е2п'2П, \Q'\ > <s2n‘2n, {/0} x Q' С /С', T* x {g0} G /С'. Now define a partial function p : Bn —> Bn and a family T consisting of (total) functions from Bn into Bn. Assume that x, у are in Bn. Denote by popularity^(x, y) the fraction \{f E T \ f(x) — y}\/\F\. First set p — 0, T = T'. Then, while there exists a pair (x,y) E (Bn \ Dom(p)) x Bn such that popularity^(x, y) > 2_n+1, we choose such a pair (x,y), extend the partial function p to x by setting ip(x) = у, and delete from T all the functions / such that f(x) Ф y. 
118 NIKOLAI К. VERESHCHAGIN We claim that the resulting </?, T have the following properties: (1) T cr, (2) all the functions from the set T extend ip, (3) popularityy) < 2~n+l for all (x,y) G (Bn \Dom(<^)) x Bn, (4) | Dom(<^)| < — log2(|J^/|/2rx(2")) < km + m = poly(n). The properties (l)-(3) are evident. Let us prove (4). Let Ti, pu xii and У г denote the values of T’, p, x, and у after the ith iteration of the while-loop. Then \3+i\ > Ш |{/ : Bn —> Bn | / extends ^i+i}| — \{f • Bn —> Bn | / extends рг}\ ’ because l^i+il > 2-n+1|^| and |{/ : Bn —> Bn | / extends Pi+i}\ = 2~n\{/ : Bn —> Bn | / extends Pi}\. Since |^+i|/|{/ ; Bn Bn | / extends ^+1}| < 1 for all i, the number of iterations of the while-loop is at most — log.i(|.F,|/!2n'2 )). Apply the same procedure to the family Q', and let denote the result. Let us prove that Fma(M^? Ф9о)) = h for all i < m. Take an arbitrary i < m. Consider two cases. First case: b{ = 1. Then we know that (48) Mi/2 s € Bfe P(fgo, Vi, rts) = 1 for all the / G T. By the definition of ^-reducibility, Fma(^(^? <P9o)) ^ * (if n is so large that | Dom(<^)| < |2n). Assume that FuA(h(vi, <pgo)) = 0- Then (49) М2/з s еШк Р{фдо, ns) = 0. Let N be the machine that for any given a G F2n+i, ^ G B* and и G B2/c computes P(a, и, и) in time poly(|u| +ri). If a has the form f)6, where ту, 0 are partial functions from Bn into B, then the queries made by iV to a have one of the two following forms: lrj(x) = yV and l0(x) = i/?’, where ж, у G Bn. For ж, т/ G Bn, let w^gQ(x,y) denote the fraction |{s G Bn | N on the input ({pg0,Vi,ris) queries ‘(f(x) = 2/?’}|/2n. Obviously, ye®71 W(pgo(x^y) < poly(n). Then for any / Gf the assertions (48) and (49) imply that Y Wvgo(x,f(x))> x£®n\Dom (<£>) 1 6’ therefore, W\ E х£Шп\Dom((/?) w, <P9o(X:f(X)) > g- 
REL ATI VIZ ABILITY IN COMPLEXITY THEORY 119 Let us rewrite the left hand side of the last inequality as follows: Щ WV9o(xJ(X)) /GJF, a:GBn\Dom((^) = Tj wv9o(x^y) ■ popularity^,?/) a:GBn\Dom((^), yGBn < 2~n+1 wvgo(x,y)<2-n+1po\y(n). iGBn\Dom((/?), yG®n If n is large enough, we get the contradiction 2-n+1poly(n) > Second case: bi = 0. We know that |{s E B^ | P(fgo,Vi,rs) = l}|/2fc is at most 1/2 for all r and for all / £ T. Assume that Fma(M^? Ф&о)) = 1, that is, there exists r E B^ such that М2/з s eMk P(<pg0, Vi, rs) = 1. Then, just as in the first case, we can get a contradiction. In the same way we can prove that Fma(%,M)) = bi for all i < m. □ Theorem 17 ([51]). There is an oracle A such that FewPA П co-FewPA ^A UPA. Proof. To demonstrate the idea, let us prove first that there exists an oracle A such that FewPA П co-FewP"4 % UPA. Define the following separation problem F- If ll/?ll = ILII, then F(07)= < ifl<#i(/?)<2, #1(7) =0, if 1< #1(7) <2, #i(/3) = 0, otherwise. It is straightforward that F E FewPLOGS П co-FewPLOGS. By Theorem 1, it is sufficient to prove that F 0 UPLOGS. Assume the contrary: suppose there exist a polynomial p and a polylog-time predicate P such that F((37) = 1 => 3!r G Bp(ll/3||) P(/?7,r) = 1, F(f37) = 0 => Vr g 1^1/311) P(f37, r) = 0. Take /?o = 70 = 02", where n is large. Consider two cases. First case: 3r e Bp-n) Р(/?о7(), r) = 1. Pick an expert ro such that P(/?o7o, гч) = 1. If n is large enough, then there exists и E Bn such that 7*0 does not query ‘70(77) = ?’. Set 70(77) = 1, and get a contradiction. Second case: Vr P(/?o7o> 0 = 0. Let us prove that if n is large enough, then there exists (3 E Fn such that #i(/?) = 2 and |{r E Bp(n) : P(/370, r) = 1}| > 2. For 77 E Bn, let /3^ denote the word whose 77th bit is 1 and whose other bits are 0. For all 77 we have F((3iJo) = 1; therefore, for all и E Bn there is a unique r = ru E Bp(n) such that P(/?]^7o,r) = 1. 
The set of all $v \in \mathbb{B}^n$ such that the expert $r_u$ queries '$\beta^1_u(v) = ?$' is called the 1-base of $u$, and the set of all $v \in \mathbb{B}^n$ such that the expert $r_u$ queries '$\beta_0(v) = ?$' is called the 0-base of $u$. Let $B_1(u)$ and $B_0(u)$ denote the 1-base and 0-base of $u$, respectively.
120 NIKOLAI К. VERESHCHAGIN Let us prove that if n is large enough, then there exist u\,u2 E ®n such that u\ 0 BQ(u2) U Bi(u2), and u2 0 Bi(ui). Indeed, the numbers of elements in all bases are bounded by a polynomial of n, say q(n). Take random u\, u2 (independent and uniformly distributed). We have Prob[ui e B0(u2)\ < Prob[ui € Bi(u2)] < Prob[u2 e Bi(ui)] < Therefore, with probability close to 1, none of the three events occurs. Fix u\ and u2 such that u\ 0 Bo(u2)UBi(u2) and u2 0 Define the word /32 as follows: (32{ui) — 02{u2) — 1 and /?2(t?) = 0 for v 7^ u\,u2. Then /?2у0 E D(F) and P{02lo,rUl) = РЦЗ\^о,ги2) = 1 (since u2 £ Bi(iti), u\ g Bi(u2)). We have ru! ф rU2, because P{(3l'^Q,rUl) = 1 and P(0Q1'yo,rU2) = 0 (since щ g B0(u2)). The contradiction shows that F is not in UPLOGS. Let us prove now that F is not ^-reducible to Fpp. Recall that F\jp(a) = < 1 0 if #i(a) = l, if #i(a) = 0, otherwise. Assume that F is ^-reducible to Fpp via the pair (M, /). Then, by the definition of ^-reducibility we have (50) Va E D(F) Ve E ®* #i(/(e,a)) E {0,1}. Fix n E N and set a0 = 02™+1. Let D\ denote the set {о; E Fn+i : #1 (a) = 1}. Evidently, D\ C D(F). We construct a set U C Bn+1 having at most poly(n) elements such that for all a in D\ that are equal to zero on all the elements of С/, the sequence of answers to queries to the oracle Fpp made by (M, /) during work on the input a is the same. Let m denote the number of queries made by M to the oracle during work on the input 2n+1. Define the binary sequence iq,..., bm and the sequence tq,..., уш of binary words by induction as follows. Let be the word such that the machine M asks lFup(/(iq, a)) = V during the work on input 2n+1 after getting the answers bi,..., bi-i to the previous questions to the oracle, and let 1 if #i(/(«i,ao)) > 1, 0 otherwise. For any i let us find a set U% such that F\jp(f(vi,a)) = bi for all a E D\ that are equal to zero on all the elements of U{. Then we shall set U = ur=iG. 
Fix an arbitrary i not exceeding m, and construct U%. By the definition of ^-reducibility, there exists a machine N that for any given (a,tq,r) (where |r| = \\f(vi,a)\\) produces the rth bit of the word /(tq,a) in time polylogarithmic in \a\. Consider two cases. First case: bi = 1, that is, #i(/(ui,ao)) > L Pick a word r such that f(Vi,ao)(r) = 1. Include in U% all the words и E ®n such that N asks lao(u) = ?’ during the computation on the input (a0,iq,r). Then #i(f(vi,a)) > 1 for all 
RELATIVIZABILITY IN COMPLEXITY THEORY 121 a E Fn+1 that are equal to zero on all the elements of U%. By (50), this means that #1 (f(vu a)) = 1 for all a E D\ that are equal to zero on all the elements of Ux. Second case: #i (f(vu a0)) = 0. Let /?0 = 7o = 02". We use the notation introduced in the proof of the first part. Let us prove that the set V = {и E Bn | #i(f(vt, Pi 7o)) = 1} has no more than poly(n) elements. Namely, we claim that \V\ < 3g(n), where q(n) is a polynomial upper bound for the number of queries of the form ‘ат(и) = V made by N during the computation on any input (<To,Ti,r) (where |r| = ||/(тг, ao)||). Assume the contrary: suppose that \V\ > 3q(n). For а и E F, let ru denote the word r such that the rth bit of the word /(тг,/?^7о) is 1. Let Bq(u) [Bi(u)\ denote the set of all v such that N queries ‘<yq(u) = ?’ [(3^70(t) = ?’] at some moment during the computation on the input (aihvuru) [(/^70, тг, ru)j. Then \B0(u)l\Bi(u)\ < q(n) for all и E V. Take random independent u\,u2 uniformly distributed in V. The probability of the event “u\ 0 Bo(u2) U Bi(u2), u2 0 Bi(iq)” is at least 1 - 3q(n)/\V\ > 0. Just as in the proof of the first part, we can construct a word (32 E D(F) such that #1 (f(vu /^270)) > 2, which contradicts (50). Similarly we can construct a set V' having poly(n) elements such that #1(/Сг,Д)7П) = 1 for all и € Bn \ V. Set U = V U V'. If n is so large that 2n > |[/|, then there exist aq, a2 E D1 such that F(ai) = 1, F(a2) = 0 and both aq and a2 are equal to zero on all the elements of U. We have (M, /)Fup(aj) = (M, /)/'UP(a2). This contradiction proves the theorem. □ 6. Complete languages in particular complexity classes It is well known that the classes (51) PA, NPa, co-NPy SN Щ?, PSPACE*4, 0РЛ, PPA have <^-complete languages (also called m-complete languages). 
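All known theorems of the form "$\mathcal{K}_2^A$ is $\le_m^{P,A}$-hard (or $\le_T^{P,A}$-hard) for the class $\mathcal{K}_1^A$ for all $A$" can be obtained using the following two rules:
(1) a class $\mathcal{K}_2^A$ is $\le_m^{P,A}$-hard for the class $\mathcal{K}_1^A$ if there is a class $\mathcal{K}_3^A$ in the list (51) such that $\mathcal{K}_1^A \subseteq \mathcal{K}_3^A \subseteq \mathcal{K}_2^A$;
(2) a class $\mathcal{K}_2^A$ is $\le_T^{P,A}$-hard for the class $\mathcal{K}_1^A$ if there exists a class $\mathcal{K}_3^A$ in the list (51) such that $\mathcal{K}_1^A \le_T^{P,A} \mathcal{K}_3^A \le_T^{P,A} \mathcal{K}_2^A$.
It is unknown whether all true assertions of the form "$\mathcal{K}_2^A$ is $\le_m^{P,A}$-hard [$\le_T^{P,A}$-hard] for the class $\mathcal{K}_1^A$ for all $A$", where $\mathcal{K}_1^A$ and $\mathcal{K}_2^A$ are classes shown in Figure 1, can be obtained using rules (1) and (2). We have already proved some assertions which are necessary to get a positive answer to the above question. Indeed, if $\mathcal{K}_2^A$ is $\le_m^{P,A}$-hard for $\mathcal{K}_1^A$, then $\mathcal{K}_1^A \subseteq \mathcal{K}_2^A$ (since all the classes under consideration are downward closed under $\le_m^{P,A}$-reductions). Therefore, if we have proved that $\mathcal{K}_1^A \not\subseteq \mathcal{K}_2^A$ for some $A$, then we have also proved that $\mathcal{K}_2^A$ is not $\le_m^{P,A}$-hard for the class $\mathcal{K}_1^A$ for some $A$. Similarly, if $\mathcal{K}_1^A \not\le_T^{P,A} \mathcal{K}_2^A$ for some $A$, then $\mathcal{K}_2^A$ is not $\le_T^{P,A}$-hard for $\mathcal{K}_1^A$ for the same $A$. Let us go through the remaining assertions which must be proved to obtain the positive answer to the above question.
1. $\exists A$: $\Sigma_2^A \cap \Pi_2^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{Few}^A$. It is unknown whether this is true or not (and for $\le_m^{P,A}$-reductions too).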
2. $\exists A$: $\mathrm{IP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{BPP}^A$. This was proved by An. A. Muchnik together with the author. The proof is presented in this section.
3. $\exists A$: $\mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{R}^A \cap \text{co-}\mathrm{R}^A$. This assertion is true and was proved in [28].
4. $\exists A$: $\mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A$. This is true. The proof is presented in this section.
5. $\exists A$: $\Sigma_2^A \cap \Pi_2^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{BPP}^A$. It is unknown whether this is true or not (and for $\le_m^{P,A}$-reductions too).
6. $\exists A$: $\mathrm{Few}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A$. This assertion is true. In [28], it was proved that there exists an oracle $A$ such that the class $\mathrm{FewP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A$. Since $\mathrm{Few}^A \le_T^{P,A} \mathrm{FewP}^A$, this implies that $\mathrm{Few}^A$ is also not $\le_T^{P,A}$-hard for the class $\mathrm{UP}^A \cap \text{co-}\mathrm{UP}^A$ for some $A$.
7. $\exists A$: $\Sigma_k^A \cap \Pi_k^A$ is not $\le_T^{P,A}$-hard for the class $\Sigma_k^A \cap \Pi_k^A$ ($k \ge 3$). It is unknown whether this is true or not. (For $k = 1, 2$ this follows from 4 and 1, respectively.)
8. $\exists A$: $\mathrm{PP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A$. It is unknown whether this is true. Since $\mathrm{PP}^A$ has a $\le_m^{P,A}$-complete language, this assertion is equivalent to the assertion $\exists A\ \mathrm{IP}^A \cap \text{co-}\mathrm{IP}^A \not\le_T^{P,A} \mathrm{PP}^A$.

Let us turn to the proofs. We use the following lemma.

Lemma 2. If $F$ and $G$ are nondegenerate separation problems such that $F \notin$ n.u.PLOGS and n.u.LOG($G$) = n.u.PLOG, then there exists an oracle $A$ such that the class $\mathrm{POLY}^A(G)$ is not $\le_T^{P,A}$-hard for the class $\mathrm{POLY}^A(F)$.

Proof. By Theorem 8, it suffices to prove that the separation problem $F$ is $\le_T^{l}$-reducible to no language in the class LOG($G$). Assume that there is a language $H \in$ LOGS($G$) such that $F \le_T^{l} H$. Then $H$ is in n.u.LOGS($G$) = n.u.PLOG $\subseteq$ n.u.PLOGS. Therefore $F$ is in n.u.PLOGS because the class n.u.PLOGS is downward closed under $\le_T^{l}$-reductions. □

Assertions 3 and 4 can be easily derived from Lemma 2, Theorem 7, and the following theorem.

Theorem 18. n.u.IPLOG $\cap$ co-n.u.IPLOG = n.u.PLOG.
We omit the proof of Theorem 18, because it is an easy generalization of the proof of Nisan's result (see [40]) n.u.BPPLOG = n.u.PLOG. Assertion 6 can be proved in a similar way.

Theorem 19 ([51]). n.u.FewLOG = n.u.PLOG.

Proof. Assume $F$ is in n.u.FewLOG. Then there are predicates $P$ and $R$, computable in a polylogarithmic number of queries, and polylogarithms $p(n)$ and $q(n)$ such that
$$|\{r \in \mathbb{B}^{p(|\alpha|)} : P(\alpha, r) = 1\}| \le q(|\alpha|), \qquad F(\alpha) = R\bigl(\alpha, |\{r \in \mathbb{B}^{p(|\alpha|)} : P(\alpha, r) = 1\}|\bigr)$$
for all $\alpha \in D(F)$. Let $n = |\alpha|$. The words in the set $\mathbb{B}^{p(n)}$ are called experts. We say that an expert $r$ accepts $\alpha$ if $P(\alpha, r) = 1$. We claim that by probing at most $\mathrm{polylog}(n)$ bits of $\alpha$ we can find all the experts accepting $\alpha$. This easily implies that $F \in$ n.u.PLOG.
There is a polylogarithm $s(n)$ such that for any expert $r$ there is a Boolean decision tree $T_r$ of height $s(n)$ computing $P(\alpha, r)$ for all $\alpha \in F_n$. Fix $T_r$ for all $r$. Any partial function $\varphi : \mathbb{B}^n \to \mathbb{B}$ is called a segment. Two segments are consistent if they have a common extension. Any decision tree $T_r$ queries the value of $\alpha$ on $k = s(n)$ arguments, say $u_1, \dots, u_k$. The segment $\{(u_i, \alpha(u_i)) \mid i \le k\}$ is called the information of $r$ about $\alpha$. The information of $r$ about any $\alpha$ accepted by $r$ is called a certificate of the expert $r$. A certificate is a certificate of some expert.

We find all experts accepting $\alpha$ for any given $\alpha$ as follows. For any subset $U$ of $\mathbb{B}^n$ let $\Phi_U(\alpha)$ denote the set of all certificates having the same value on elements of $U$ as $\alpha$. Our goal is to construct a set $U$ such that $\Phi_U(\alpha)$ is the set of all certificates consistent with $\alpha$. Let us start with $U = \emptyset$. Repeat the following loop $k$ times.

Take any maximal (with respect to inclusion) subset $\Phi = \{\varphi_1, \dots, \varphi_j\}$ of $\Phi_U(\alpha)$ such that the sets $\mathrm{Dom}(\varphi_1) \setminus U, \dots, \mathrm{Dom}(\varphi_j) \setminus U$ are pairwise disjoint. Then $j \le q(n)$, because there exists a $\beta \in F_n$ that is consistent with all certificates from $\Phi$, and $\varphi_1, \dots, \varphi_j$ are certificates of different experts (because certificates of any expert are pairwise inconsistent). Ask the value of $\alpha$ on all the elements of the set $V = (\mathrm{Dom}(\varphi_1) \cup \dots \cup \mathrm{Dom}(\varphi_j)) \setminus U$. Since $\Phi$ is maximal, the domain of any certificate $\varphi \in \Phi_U(\alpha) \setminus \Phi$ intersects with $V$. Set $U = U \cup V$. Note that $|\mathrm{Dom}(\varphi) \setminus U|$ has decreased for any certificate $\varphi \in \Phi_U(\alpha) \setminus \Phi$ and $\mathrm{Dom}(\varphi) \setminus U$ has become empty for any certificate $\varphi \in \Phi$. The loop is completed.

The value $\max\{|\mathrm{Dom}(\varphi) \setminus U| : \varphi \in \Phi_U(\alpha)\}$ decreases or remains zero after each iteration of the above loop. Therefore, $\mathrm{Dom}(\varphi) \subseteq U$ for any $\varphi \in \Phi_U(\alpha)$ after $k$ iterations of the loop. This means that $\Phi_U(\alpha)$ is the set of all certificates consistent with $\alpha$. Obviously, an expert accepts $\alpha$ if and only if some certificate of that expert is consistent with $\alpha$.
Hence we know all the experts accepting $\alpha$. It remains to note that during each iteration of the loop we make at most $q(n) \cdot k$ queries to $\alpha$. □

Assertion 2 cannot be derived from Lemma 2, since n.u.PLOG $\subset$ n.u.NPLOG $\subset$ n.u.IPLOG.

Theorem 20 (Joint work with An. A. Muchnik, see [51]). There is an oracle $A$ such that $\mathrm{IP}^A$ is not $\le_T^{P,A}$-hard for the class $\mathrm{BPP}^A$.

We prove this theorem together with the other unproved theorems of the previous section.

Theorem 21 ([51]). $\exists A\ \mathrm{BPP}^A \not\le_T^{P,A} \mathrm{NP}^A$.

Theorem 22 ([51]). $\exists A\ \oplus\mathrm{P}^A \not\le_T^{P,A} \mathrm{IP}^A$.

Theorem 23 ([51]). $\exists A\ \Sigma_2^A \cap \Pi_2^A \not\le_T^{P,A} \mathrm{IP}^A$.

Proof of Theorems 20-23. In fact, Theorem 21 follows from Theorem 20 because the class $\mathrm{NP}^A$ has a $\le_m^{P,A}$-complete language and $\mathrm{NP}^A \subseteq \mathrm{IP}^A$. Nevertheless we prove Theorem 21 first. By Theorem 8 it suffices to prove that $F_{\mathrm{BPP}} \not\le_T^{l} F_{\mathrm{NP}}$. Assume that $F_{\mathrm{BPP}} \le_T^{l} F_{\mathrm{NP}}$. Let $(M, f)$ be a reducing pair. Fix a large integer $n$. Let $m$ denote the number of queries made by $M$ to the oracle during the work on input $2^n$. Obviously, $m \le \mathrm{poly}(n)$. Assume that $\alpha$ is in $F_n$. Run the machine $M$ with the oracle $F_{\mathrm{NP}}(f(\cdot, \alpha))$ on the input $2^n$. Let $e(\alpha)$ denote the sequence of oracle answers received by $M$ in that computation ($e(\alpha) \in \mathbb{B}^m$). Let $\alpha_0$ be an $\alpha \in F_n$ with lexicographically greatest $e(\alpha)$. Let $e_0 = b_1^0 \dots b_m^0$ stand for $e(\alpha_0)$ and $v_1, \dots, v_m$ for the queries of $M$ to the oracle $F_{\mathrm{NP}}(f(\cdot, \alpha_0))$ (more precisely, the
124 NIKOLAI К. VERESHCHAGIN queries are ‘FnP(/(^, a0)) = ?’). Let I be the set of all the indices i < к such that Enp (/(иг, ao)) = 1, that is, > 0. For each i e I fix a word ti such that f (vi,ao)(tt) = 1. Let q(n) be a polynomial bounding the time of bit-computation of the function /(u7,a) for a G Fn, i < m. Obviously, for any г G I there is a set Щ C Bn having at most q(n) elements such that /(иг,а)(£г) = 1 for all a having the same values on all the elements of U% as ao has. Set U = |J2G/ Evidently, |U\ < mq(n) = poly(n). We have FNP(f(vt,a)) = 1 for all i < m such that 6/ = 1 and for all a G Fn having the same values on all the words in U as ao has. We claim that, moreover, e(a) = e(ao) for all a G Fn taking the same values on all the words in U as ao does. Assume the contrary. Let a be a counterexample. Let b\ ... Ьш be the bits of e(a). Let i be the smallest number such that bi ф b®. Then, since eo is the lexicographically greatest word among the word of the form e(a), a G Fn, we have = 0, 6/ = 1. As a and ao have the same values on all the words in /7, we have FNP(/(u2, a)) = 1. On the other hand, b\...b{)l_l = b\ .. .bj-1; therefore the zth query to the oracle made by M during the computation on the input 2n with the oracle FNP(/(-,a)) is ‘ENP(/(^, a)) = ?’. Consequently, F^p(f(vi,a)) = bi = 0. The contradiction proves the claim. The equality e(a) = e(ao) implies that (M, f)FNP(a) = (M, /)Fnp (a0). Without loss of generality we may assume that (M, f)pNP(a q) = 0. Take n so large that | U | < |2n. Let a be equal to ao on all the elements of U and to 1 on the remaining words. We have 1 = FBPp(a) g (M, /)Fnp (a0) = (M, /)FNP(a) = 0. Theorem 21 is proved. Let us prove Theorem 22. Since PARITY is a language, by Theorem 7, it suffices to prove that PARITY ф1т IPLOG. Assume that PARITY is ^-reducible to a language F in the class IPLOG via a pair (M, f). Define ao, m, q(ri), v\,..., um, and eo just as in the previous proof. 
Since F is in IPLOG, there exists a poly logtime verifier V for F. For each i < m such that (>• = 1, fix a prover Pi such that Prob[(P*, V)(f(vi, a0)) = 1] > 2/3. Let N be a machine that computes the tth bit of the word /(u,a) within time poly(||a|| + \v\) for any given (a,u,t), where \t\ = ||/(u,a)||. Let r = poly(n) be an upper bound for the number of queries of the form 4ao(x) = ?’, where x is in Bn, made by N in computations on inputs of the form (ao,Vi,t), where \t\ = ||/(u7, ao)||. Let /3J denote /(u7,a0). Let s = poly(n) be an upper bound for the number of queries of the form L/3l(t) = ?’, where \t\ = ||/?q||, made by V in the dialogue with Pi on the input Д/ Let x be in Bn. Let wlao(x) denote the probability of the event “there exists t G B^oH such that V queries ‘/?o(t) = ?’ in the dialogue with P1 on the input ao, and N queries cao(x) = ?’ during the computation on the input (a0,u?, t)”. Then Xa-6°=i Sxe3n wa0(x) — msr:> therefore, there exists xq G B77 such that X]i-6°=i wa0(xo) ^ msr/2n <1/3 (if n is sufficiently large). Change the xoth bit of ao and let a denote the resulting word. Let us prove that e(a) = e(ao), and therefore (M,f)F(a) = (M,f)F(ao). Assume that e(a) Ф e(ao). Let b\ ... bm denote the bits of e(a). Take the least i such that bi ф b®. Then bi = 0 and 6/ = 1. Therefore, F(f(v г, a)) = 0, and consequently, РгоЬ[(Рг, V)(/(u,,a)) = 1] < 1/3. On the other hand, Pmb[(PuV)(f(vua0)) = l} >2/3. 
REL ATI VIZ ABILITY IN COMPLEXITY THEORY 125 Hence, wlaQ(xo) > 1/3, because a and ao have different values only on xq. This contradiction shows that e(a) = e(ao) and (M,/)F(a) = (M,f)F(aq). Since PARITY(a) ф PARITY(ao), the theorem is proved. Let us prove Theorem 20. We have to prove that the separation problem Fbpp is ^-reducible to no language F in the class IPLOG. Assume the contrary: Fbpp dir F ^ IPLOG. We use the notation from the previous proof. Without loss of generality we may assume that (M, f)F (аф) = 1. Let or be a word in the set {a E Fn | e(a) = e(ao)} having the least number of ones. Without loss of generality we may assume that ol\ — ao- If #i(ao) < |2n, then the contradiction is already derived. If #i(<to) > |2n, then there exists x0 E Bn such that ao(^o) = 1 and Xlr60=i wa0(x°) — (i/3^n < 1/3. Define the word a as follows: a(xo) = 0, a(x) = ao(x) for all x Ф xq. Then #i(a) < #i(ao). Just as in the previous proof, we can prove that e(a) = e(ao). This contradicts the choice of ao- Let us prove Theorem 23. Let a be a partial function from Bn into Bn. Let d denote the word encoding the graph of a, that is, a(xy) — l a(x) — у for all x,y G Bn. Consider the separation problem F( 7) f l if 3n G N : 7 = d/5, where a and (3 are partial functions from Bn into Bn such that a is total and f3 is defined on all the arguments but one, < 0 if 3n G N : 7 = d/3, where a and (3 are partial functions from Bn into Bn such that (3 is total and a is defined on all the arguments but one, , * otherwise. Obviously. F e E2LOGSnn2LOGS. Let En denote the set {7 E F2n+i | ^(7) ф *}. By Theorem 4, it suffices to prove that there exists no G E IPLOGS such that F ф1т G. Assume that such a problem G exists. Let (M, /) be a pair reducing F to G. Choose a large n. We use all the notation from the previous proofs. Take a word 7 E En having the lexicographically greatest e(7). 
Let c*o,/?o be the partial functions such that 7 = do До- Without loss of generality we may assume that F(ao(3o) = 1, that is, ao is total. Let (3o be undefined on the word x\. Fix a verifier for solving the problem G. We enumerate bits of 7 in such a way that for x,y E Bn we have 7(0xy) = a0(xy), 7(1 xy) = (3o(xy). For any i such that 6/ = 1, define the weight wlaof3o(u) of the word и E B2n+1 as follows: wlao3o(u) is equal to the probability of the event “there exists t E В^^г,а°^1 such that V queries ‘/(vb^o)(0 = ?’ in the dialogue with Pi on input f(vi,a0), and N queries ‘ao(и) = V during the work on input (ao,Vi,t)”. If n is large enough, we can find xq E Bn such that J2i-b°=i waQf3Q(^xoao(^o)) <1/6, and we can find y\ E Bn such that < V6- Define the partial functions a, (3 as follows: a(x) а0(ж), Ххфх0, undefined, if x = xq, 0(x) = Po(x), if ж ^ xi, yu ifx = xi. Then e(a/3) — e(ao/?o) and F(a/3) = 0. This contradiction proves the theorem. □ 
7. Perceptrons and oracle separation of AM $\cap$ co-AM from PP

7.1. Perceptrons. In this section we prove that AM $\cap$ co-AM $\not\subseteq$ PP under some oracle. To prove this we need some lower bounds for perceptrons extending the well-known "one-in-a-box" theorem of Minsky and Papert [37].

Definition 9. A perceptron is a depth-2 circuit having a threshold gate at the bottom and AND-gates at the remaining level. Inputs of AND-gates are either Boolean variables or their negations. Each AND-gate is labeled by a natural number called the weight of the AND-gate. The total weight of a perceptron is the sum of the absolute values of the weights on all its AND-gates. The order of a perceptron is the maximal arity of its AND-gates.

Let $P$ be a perceptron, and $\varphi$ an assignment of values to its variables. The weight of $\varphi$, written $W_P(\varphi)$, is the sum of the weights on all AND-gates which are true on $\varphi$. The perceptron outputs 1 on input $\varphi$ if $W_P(\varphi)$ is greater than the threshold value of its threshold gate, and 0 otherwise. Let $P(\varphi)$ denote the output value.

7.2. An extension of the "one-in-a-box" theorem. Perceptrons have been studied by Minsky and Papert in [37]. We single out two of their results: (i) any perceptron computing the parity function of $n$ variables must have order at least $n$, and (ii) (the "one-in-a-box" theorem) any perceptron recognizing whether each row in a given Boolean matrix of size $n \times 4n^2$ contains at least one 1 has order at least $n$.

Beigel [6] constructed a Boolean function of $n$ variables that is computable by a perceptron having exponential total weight and order 1 but is not computable by perceptrons having quasipolynomial ($2^{\mathrm{polylog}(n)}$) total weight and polylogarithmic order. To be more precise, he proved the lower bound $d^2 \log w = \Omega(n)$, where $d$ and $w$ denote the order and the total weight, respectively, of a perceptron computing that function.

We extend Minsky and Papert's one-in-a-box theorem in the following direction.
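Before the extension is stated, Definition 9 can be made concrete with a small Python sketch. It is an illustration only, not part of the original text: the class name Perceptron, the representation of an AND-gate as a pair (weight, list of literals), and the helper names first_row_has_one, as_assignment and is_good are all hypothetical choices. The example perceptron has order 1 and simply tests whether the first row of a Boolean matrix contains a 1.

```python
class Perceptron:
    """Depth-2 circuit: weighted AND-gates feeding one threshold gate."""

    def __init__(self, gates, threshold):
        # gates: list of (weight, literals); each literal is (variable, wanted_bit),
        # so (v, 1) is the variable v and (v, 0) is its negation.
        self.gates = gates
        self.threshold = threshold

    def order(self):
        """Maximal arity of the AND-gates."""
        return max((len(literals) for _, literals in self.gates), default=0)

    def total_weight(self):
        """Sum of the absolute values of the gate weights."""
        return sum(abs(weight) for weight, _ in self.gates)

    def weight_on(self, assignment):
        """W_P(phi): sum of the weights of the AND-gates that are true on phi."""
        return sum(weight for weight, literals in self.gates
                   if all(assignment[v] == bit for v, bit in literals))

    def output(self, assignment):
        """P(phi): 1 iff W_P(phi) is greater than the threshold value."""
        return 1 if self.weight_on(assignment) > self.threshold else 0


def first_row_has_one(n_cols):
    """Order-1 perceptron over variables (i, j) testing 'row 0 contains a 1'."""
    gates = [(1, [((0, j), 1)]) for j in range(n_cols)]
    return Perceptron(gates, threshold=0)


def as_assignment(matrix):
    """Turn a list-of-rows Boolean matrix into an assignment {(i, j): bit}."""
    return {(i, j): bit for i, row in enumerate(matrix) for j, bit in enumerate(row)}


def is_good(matrix):
    """A matrix is good if every row of it contains a 1."""
    return all(any(row) for row in matrix)
```

On every good matrix this perceptron outputs 1, and on the all-zero matrix it outputs 0; it says nothing useful about matrices whose first row is nonzero but whose other rows are empty, which is exactly why it is too weak for the separation problem studied below.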
Let $\Pi$ stand for the following separation problem: to separate Boolean matrices in which every row contains a 1 from matrices in which many rows (for example, 99% of them) contain zeros only. Obviously, any perceptron recognizing whether each row of a given matrix has a 1 also solves $\Pi$. Our theorem states that $\Pi$ is not solvable by perceptrons of order $o(\sqrt{m})$ and of total weight $2^{o(n)}$, where $n$ is the number of rows and $m$ is the number of columns (Theorem 24). This implies that perceptrons of polylogarithmic order and quasipolynomial total weight cannot solve $\Pi$.

Let us proceed to precise definitions. Let $M$ be a Boolean matrix with $n$ rows and $m$ columns. Then $M$ can be defined in the usual way by means of $mn$ Boolean values. When we say that a perceptron $P$ has $M$ as input we mean that these Boolean values are assigned to its input variables. In this case $P(M)$ denotes the output of $P$. A matrix is called good if every row of it contains a 1. A matrix is called bad if it is not good. Let $0 < q \le 1$. A matrix is called $q$-bad if the ratio of the number of rows with no 1's to the total number of rows is at least $q$. The "one-in-a-box" theorem of Minsky and Papert states that the order of a perceptron deciding whether an input Boolean matrix of size $n \times 4n^2$ is good must be at least $n$.

We say that a perceptron $P$ separates good matrices from $q$-bad matrices of size $n \times m$ if $P(M) = 1$ for every good matrix of size $n \times m$ and $P(M) = 0$ for any
$q$-bad matrix of size $n \times m$. Note that for any $m$, $n$ there is a perceptron of order $m$ and total weight $m$ separating good matrices from 1-bad matrices of size $n \times m$ (this perceptron decides whether the first row contains a 1).

Theorem 24 ([53]). Let $0 < \varepsilon < 1/2$. Suppose that there is a perceptron of order $d$ and of total weight $w$ that separates good matrices from $(1-\varepsilon)$-bad matrices of size $n \times m$. Then $d \ge \sqrt{(6/13)\varepsilon m}$ or $w \ge 0.5\,e^{(2/15)\varepsilon n}$.

Proof. Let $m$, $n$ be integers. Let $\mathcal{M}$ denote the set of Boolean matrices having $n$ rows and $m$ columns. Let $M_{ij}$ denote the element of the matrix $M$ standing in the $i$th row and in the $j$th column. Let $\mu$ be a probability distribution on the set $\mathcal{M}$. For a property $S$ of matrices in $\mathcal{M}$ let $\mathrm{Prob}_\mu[S(M)]$ denote the probability (with respect to $\mu$) that a random matrix $M$ satisfies $S$. Let $d$ be an integer and $\mu$, $\nu$ probability distributions on $\mathcal{M}$. We say that $\mu$ and $\nu$ are $d$-indistinguishable if

$$\mathrm{Prob}_\mu[M_{i_1 j_1} = b_1, \dots, M_{i_u j_u} = b_u] = \mathrm{Prob}_\nu[M_{i_1 j_1} = b_1, \dots, M_{i_u j_u} = b_u]$$

for any sequence $(i_1, j_1), \dots, (i_u, j_u)$ of indices such that $u \le d$ and for any sequence $b_1, \dots, b_u$ of bits. The theorem is an easy corollary of the following two lemmas.

Lemma 3. Suppose there are $d$-indistinguishable probability distributions $\mu$ and $\nu$ on $\mathcal{M}$ such that a random matrix is good with probability 1 with respect to $\mu$ and $q$-bad with probability at least $1 - p$ with respect to $\nu$. Then a perceptron of order $d$ separating good matrices from $q$-bad matrices has total weight at least $p^{-1}$.

Lemma 4. If $d < \sqrt{(6/13)\varepsilon m}$ and $0 < \varepsilon < 1/2$, then there are $d$-indistinguishable probability distributions $\mu$ and $\nu$ on $\mathcal{M}$ such that a random matrix is good with probability 1 with respect to $\mu$ and a random matrix is $(1-\varepsilon)$-bad with probability at least $1 - 2e^{-(2/15)\varepsilon n}$ with respect to $\nu$.

Proof of Lemma 3. Let $d$, $\mu$, $\nu$, $q$, $p$ be as in Lemma 3. Let $P$ be a perceptron of order $d$ and of total weight $w$ separating good matrices from $q$-bad matrices in $\mathcal{M}$.
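The definition of $d$-indistinguishability above can be checked by brute force on very small examples. The sketch below is only an illustration under assumptions not made in the text: distributions are given explicitly as dictionaries mapping a matrix (a tuple of row tuples) to its probability, and the function names are hypothetical. It compares the joint marginals on every set of at most $d$ distinct cells; sequences with repeated cells add nothing new, since they are either redundant or have probability 0 under both distributions.

```python
from itertools import combinations, product


def marginal(dist, cells, bits):
    """Probability that the listed cells of a random matrix carry the listed bits."""
    return sum(
        prob
        for matrix, prob in dist.items()
        if all(matrix[i][j] == b for (i, j), b in zip(cells, bits))
    )


def d_indistinguishable(mu, nu, n, m, d, tol=1e-12):
    """Check that mu and nu agree on all joint marginals of at most d cells."""
    all_cells = [(i, j) for i in range(n) for j in range(m)]
    for u in range(1, d + 1):
        for cells in combinations(all_cells, u):
            for bits in product((0, 1), repeat=u):
                gap = marginal(mu, cells, bits) - marginal(nu, cells, bits)
                if abs(gap) > tol:
                    return False
    return True
```

On $1 \times 2$ matrices, the uniform distribution on the rows 00 and 11 and the uniform distribution on the rows 01 and 10 are 1-indistinguishable (every single cell is a fair bit under both) but not 2-indistinguishable.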
Let $E_\mu$ and $E_\nu$ stand for the average with respect to $\mu$ and $\nu$, respectively. We claim that

(52) $E_\mu W_P(M) = E_\nu W_P(M)$.

Indeed, let $C(M)$ denote the Boolean function computed by an AND-gate $C$ in $P$. Let $l$ be the total number of AND-gates in $P$, $C_i$ the $i$th gate, and $w_i$ the weight of $C_i$. Then

$$E_\mu W_P(M) = \sum_{i=1}^{l} w_i\, E_\mu C_i(M) = \sum_{i=1}^{l} w_i\, \mathrm{Prob}_\mu[C_i(M) = 1],$$

and similarly for $\nu$. Therefore, it suffices to prove that $\mathrm{Prob}_\mu[C(M) = 1] = \mathrm{Prob}_\nu[C(M) = 1]$ for any AND-gate $C$ in $P$. Represent $C$ as a conjunction $\bigwedge_{s=1}^{u} (M_{i_s j_s} = b_s)$, where every $b_s$ is either 0 or 1 and $u \le d$ because $P$ has order $d$. Then $\mathrm{Prob}_\mu[C(M) = 1] = \mathrm{Prob}_\mu[M_{i_1 j_1} = b_1, \dots, M_{i_u j_u} = b_u]$. Thus the $d$-indistinguishability of $\mu$ and $\nu$ implies (52).

Let $t$ be the threshold value of the threshold gate. Since a random matrix is good with probability 1 with respect to $\mu$, we have $E_\mu W_P(M) \ge t + 1$. On the other hand, since a random matrix is $q$-bad with probability at least $1 - p$ with respect to $\nu$, we have $E_\nu W_P(M) \le (1 - p)t + pw$. Therefore, $t + 1 \le (1 - p)t + pw$. Thus, $t + 1 \le t + pw$, so that $1 \le pw$. □

In the proof of Lemma 4, we will use the following lemma of Farkas, which is a version of the duality theorem of linear programming (see, for example, [41]).
Lemma 5. Let

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1t}x_t &= b_1,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2t}x_t &= b_2,\\ &\;\;\vdots\\ a_{s1}x_1 + a_{s2}x_2 + \cdots + a_{st}x_t &= b_s \end{aligned}$$

be a system of linear equations in real non-negative $x_1, x_2, \dots, x_t$. It has a solution if and only if there are no real $y_1, y_2, \dots, y_s$ such that

$$\begin{aligned} a_{11}y_1 + a_{21}y_2 + \cdots + a_{s1}y_s &\ge 0,\\ a_{12}y_1 + a_{22}y_2 + \cdots + a_{s2}y_s &\ge 0,\\ &\;\;\vdots\\ a_{1t}y_1 + a_{2t}y_2 + \cdots + a_{st}y_s &\ge 0,\\ b_1y_1 + b_2y_2 + \cdots + b_sy_s &< 0. \end{aligned}$$

Proof of Lemma 4. Let $\sigma$ be a probability distribution on the segment $\{1, 2, \dots, m\}$ and $\tau$ a probability distribution on the segment $\{0, 1, 2, \dots, m\}$. The distributions $\sigma$ and $\tau$ will be specified later. For $i \le n$ and $M \in \mathcal{M}$, let $k_i$ denote the number of 1's in the $i$th row of $M$. We define $\mu$ by the condition that $k_1, \dots, k_n$ are independent random variables distributed according to $\sigma$, with values in $\{1, 2, \dots, m\}$. To choose a matrix at random with respect to $\mu$, pick random $k_1, \dots, k_n$ independently with respect to $\sigma$. Then for each $i \le n$ take a random string with exactly $k_i$ ones as the $i$th row of the matrix. The distribution $\nu$ is defined in the same way, but with $\sigma$ replaced by $\tau$. The distribution $\tau$ is in turn obtained from a distribution $\rho$ (to be specified later) on the segment $\{1, 2, \dots, m\}$ by means of the following transformation:

$$\mathrm{Prob}_\tau[k = a] = \begin{cases} 0.6\varepsilon\, \mathrm{Prob}_\rho[k = a], & \text{if } a \in \{1, 2, \dots, m\},\\ 1 - 0.6\varepsilon, & \text{if } a = 0. \end{cases}$$

If $\nu$ is obtained from $\rho$ in the way described above, then a random matrix is $(1-\varepsilon)$-bad with probability at least $1 - 2e^{-(2/15)\varepsilon n}$ with respect to $\nu$. This is a direct corollary of the Chernoff inequality. Indeed, a row of a random matrix has only 0's with probability $p = 1 - 0.6\varepsilon$ with respect to $\nu$. Take $\delta = 0.4\varepsilon$. Since we assume that $\varepsilon < 1/2$, $\delta$ and $p$ satisfy the conditions of Theorem 10, whence we conclude that a random matrix has less than $(1-\varepsilon)n$ zero rows with probability at most

$$2e^{-\frac{\delta^2 n}{2p(1-p)}} = 2e^{-\frac{(0.4\varepsilon)^2 n}{2 \cdot 0.6\varepsilon(1 - 0.6\varepsilon)}} = 2e^{-\frac{0.16\,\varepsilon n}{1.2(1 - 0.6\varepsilon)}} \le 2e^{-\frac{0.16\,\varepsilon n}{1.2}} = 2e^{-(2/15)\varepsilon n}.$$

We now specify the probability distributions $\sigma$ and $\rho$ on $\{1, 2, \dots, m\}$.
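The sampling procedure just described, and the constant $2/15$ in the Chernoff estimate, can be illustrated with a short sketch. Everything named here (`sample_row`, `sample_matrix`, `tau_from_rho`, and the dictionary representation of a distribution on row counts) is an assumption made for illustration and is not part of the text.

```python
import random
from fractions import Fraction


def sample_row(m, k, rng):
    """A uniformly random 0/1 row of length m with exactly k ones."""
    ones = set(rng.sample(range(m), k))
    return [1 if j in ones else 0 for j in range(m)]


def sample_matrix(n, m, count_dist, rng):
    """Sample n independent rows; count_dist maps a count k to its probability."""
    counts = list(count_dist)
    weights = [count_dist[k] for k in counts]
    return [
        sample_row(m, rng.choices(counts, weights)[0], rng)
        for _ in range(n)
    ]


def tau_from_rho(rho, eps):
    """The transformation from rho (on 1..m) to tau (on 0..m) described above."""
    tau = {k: 0.6 * eps * prob for k, prob in rho.items()}
    tau[0] = 1 - 0.6 * eps
    return tau


# The exponent in the Chernoff estimate: 0.16 / 1.2 equals 2/15 exactly.
CHERNOFF_CONSTANT = Fraction(16, 100) / Fraction(12, 10)
```

Sampling with a distribution supported on $\{1, \dots, m\}$ always yields a good matrix, while sampling with `tau_from_rho(rho, eps)` makes each row all-zero with probability $1 - 0.6\varepsilon$.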
We want to do this so that the distributions $\sigma$ and $\tau$ will have the same first $d$ moments, that is,

(53) $E_\sigma k^i = E_\tau k^i$ for all $1 \le i \le d$.

Let us prove that this implies the $d$-indistinguishability of $\mu$ and $\nu$. Indeed, we claim that the value of $\mathrm{Prob}_\mu[M_{i_1 j_1} = b_1, \dots, M_{i_u j_u} = b_u]$ is a polynomial in $E_\sigma k, E_\sigma k^2, \dots, E_\sigma k^d$ for any sequence $(i_1, j_1), \dots, (i_u, j_u)$ of indices of length at most $d$ and for any sequence $b_1, \dots, b_u$ of bits. Let us prove this claim. For any $i \le n$, let $c_i$ stand for the number of $l \in \{1, \dots, u\}$ such that $i_l = i$ and $b_l = 1$, and let $e_i$ stand for the number of $l \in \{1, \dots, u\}$ such that $i_l = i$ and $b_l = 0$. Then

$$\mathrm{Prob}_\mu[M_{i_1 j_1} = b_1, \dots, M_{i_u j_u} = b_u] = \prod_{i=1}^{n} E_\sigma \frac{k(k-1)\cdots(k-c_i+1)\,(m-k)(m-k-1)\cdots(m-k-e_i+1)}{m(m-1)\cdots(m-c_i-e_i+1)}.$$

Evidently, for any $i$,

$$E_\sigma \frac{k(k-1)\cdots(k-c_i+1)\,(m-k)(m-k-1)\cdots(m-k-e_i+1)}{m(m-1)\cdots(m-c_i-e_i+1)}$$

is a linear combination of the $E_\sigma k^r$, $r = 0, 1, \dots, c_i + e_i$. As $c_i + e_i \le d$, the claim is proved. Recall that $\nu$ is obtained from $\tau$ in the same way as $\mu$ is obtained from $\sigma$. Therefore, (53) implies the $d$-indistinguishability of $\mu$ and $\nu$.

It is easy to see that $E_\tau k^i = 0.6\varepsilon\, E_\rho k^i$ for any $i \ge 1$. Thus we have to prove that there are probability distributions $\sigma$ and $\rho$ on $\{1, 2, \dots, m\}$ satisfying

(54) $E_\sigma k^i = 0.6\varepsilon\, E_\rho k^i$ for $i = 1, 2, \dots, d$.

Here an application of the Farkas lemma is appropriate because the dual problem is easier than the original one.

Lemma 6. The following conditions are equivalent:
(1) There are probability distributions $\sigma$ and $\rho$ on the segment $\{1, 2, \dots, m\}$ satisfying (54).
(2) There are no polynomials $p(x)$ of degree at most $d$ such that $-p(0)\frac{1 - 0.6\varepsilon}{0.6\varepsilon} < p(j) \le 0$ for $j = 1, 2, \dots, m$.

Proof. Condition (1) is equivalent to the existence of a non-negative solution of the system

$$\begin{aligned} 0.6\varepsilon(1^1 x_1 + 2^1 x_2 + \cdots + m^1 x_m) - (1^1 x_{m+1} + 2^1 x_{m+2} + \cdots + m^1 x_{2m}) &= 0,\\ 0.6\varepsilon(1^2 x_1 + 2^2 x_2 + \cdots + m^2 x_m) - (1^2 x_{m+1} + 2^2 x_{m+2} + \cdots + m^2 x_{2m}) &= 0,\\ &\;\;\vdots\\ 0.6\varepsilon(1^d x_1 + 2^d x_2 + \cdots + m^d x_m) - (1^d x_{m+1} + 2^d x_{m+2} + \cdots + m^d x_{2m}) &= 0,\\ x_1 + x_2 + \cdots + x_m &= 1,\\ x_{m+1} + x_{m+2} + \cdots + x_{2m} &= 1. \end{aligned}$$

By the Farkas lemma this means that there are no real $p_1, p_2, \dots, p_d, y, z$ such that

(55) $0.6\varepsilon \sum_{i=1}^{d} p_i j^i + y \ge 0$ for $j = 1, 2, \dots, m$,

(56) $-\sum_{i=1}^{d} p_i j^i + z \ge 0$ for $j = 1, 2, \dots, m$,

(57) $y + z < 0$.
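The falling-factorial expression inside the expectation in the claim proved above is the probability that a uniformly random row with exactly $k$ ones has 1's in $c$ prescribed positions and 0's in $e$ other prescribed positions. As a sanity check, the sketch below compares that closed form with direct enumeration for small parameters; the function names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations


def formula(m, k, c, e):
    """k(k-1)...(k-c+1) (m-k)...(m-k-e+1) / (m(m-1)...(m-c-e+1))."""
    numerator = 1
    for t in range(c):
        numerator *= k - t
    for t in range(e):
        numerator *= m - k - t
    denominator = 1
    for t in range(c + e):
        denominator *= m - t
    return Fraction(numerator, denominator)


def brute_force(m, k, c, e):
    """Fraction of rows with exactly k ones that have ones in positions
    0..c-1 and zeros in positions c..c+e-1."""
    must_be_one = set(range(c))
    must_be_zero = set(range(c, c + e))
    total = hits = 0
    for ones in combinations(range(m), k):
        chosen = set(ones)
        total += 1
        if must_be_one <= chosen and not (must_be_zero & chosen):
            hits += 1
    return Fraction(hits, total)
```

For example, with $m = 4$, $k = 2$, $c = 1$, $e = 1$ both functions give $1/3$. Since the numerator is a polynomial in $k$ of degree $c + e$, its expectation is a linear combination of the first $c + e$ moments, which is the step used in the claim.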
We interpret the numbers $-z, p_1, p_2, \dots, p_d$ as coefficients of a polynomial $p(u) = -z + p_1 u + \cdots + p_d u^d$. Then (56) means that $p(j) \le 0$ for all $j = 1, 2, \dots, m$. The existence of $y$ satisfying (55) and (57) means that $0.6\varepsilon(p(j) - p(0)) + p(0) > 0$ for all $j = 1, 2, \dots, m$. The last inequality can be rewritten as $-p(0)\frac{1 - 0.6\varepsilon}{0.6\varepsilon} < p(j)$, so (1) and (2) are equivalent. □

It remains to prove that there is no polynomial $p(x)$ of degree at most $d$ such that $-p(0)\frac{1 - 0.6\varepsilon}{0.6\varepsilon} < p(j) \le 0$ for $j = 1, 2, \dots, m$. We will use the following theorem of Markov (see, for example, [12, 35, 42]): $\|P'\| \le d^2\|P\|$, where $P$ is a polynomial of degree $d$ and $\|P\|$ denotes the maximum of $|P(x)|$ over all $x \in [-1; 1]$. An easy corollary of this theorem is the following simplified version of the Ehlich and Zeller lemma [14]. Let $Y = \{-1, -1 + 2/m, -1 + 4/m, \dots, 1 - 2/m, 1\}$ and let $\|P_Y\|$ denote the maximum of $|P(x)|$ over all $x \in Y$. Then

(58) $\|P\|\left(1 - \frac{d^2}{m}\right) \le \|P_Y\|$.

Indeed, every point of $[-1; 1]$ lies within distance $1/m$ of a point of $Y$, so by the Lagrange theorem and by Markov's theorem we have

$$\|P\| - \|P_Y\| \le \|P'\|\frac{1}{m} \le \|P\|\frac{d^2}{m}.$$

This implies (58).

Suppose now that there is a polynomial $p(x)$ of degree $d$ such that $-p(0)\frac{1 - 0.6\varepsilon}{0.6\varepsilon} < p(j) \le 0$ for all $j = 1, 2, \dots, m$. We have to prove that $d \ge \sqrt{(6/13)\varepsilon m}$. Observe that $p(0) > 0$. Let $a$ denote $\frac{1 - 0.6\varepsilon}{0.6\varepsilon}$. Let $P(x) = p\left(\tfrac{1}{2}m(x + 1)\right) + \frac{a - 1}{2}p(0)$. Then $\|P_Y\| = \frac{a + 1}{2}p(0)$, and therefore

(59) $(1 - d^2/m)\|P\| \le \frac{a + 1}{2}p(0)$.

On the other hand, since $|P(-1 + 2/m) - P(-1)| = |p(1) - p(0)| \ge p(0)$, the Lagrange theorem implies that $\|P'\| \ge p(0)m/2$. Therefore, by Markov's theorem,

(60) $\|P\| \ge p(0)\frac{m}{2d^2}$.

Combining (59) and (60), we obtain

$$\left(1 - \frac{d^2}{m}\right)\frac{m}{2d^2} \le \frac{a + 1}{2}, \quad\text{that is,}\quad \frac{m}{d^2} \le a + 2 = \frac{1 + 0.6\varepsilon}{0.6\varepsilon}.$$

Therefore, since $\varepsilon < 1/2$,

$$d^2 \ge \frac{0.6\,\varepsilon m}{1 + 0.6\varepsilon} > \frac{0.6\,\varepsilon m}{1.3} = \frac{6}{13}\varepsilon m.$$

□

7.3. Oracle separation of AM ∩ co-AM from PP. The lower bound in Theorem 24 is sufficient to construct an oracle under which $\mathrm{AM} \not\subseteq \mathrm{PP}$. To construct an oracle under which $\mathrm{AM} \cap \mathrm{co\text{-}AM} \not\subseteq \mathrm{PP}$ we need a lower bound for perceptrons solving another separation problem. Let us describe this problem. Let $\mathcal{M}_n$ stand for the family of Boolean matrices of size $n \times n$ and let $\mathcal{N}_n = \mathcal{M}_n \times \mathcal{M}_n$. Let $D = (M_0, M_1)$ be a pair of matrices in $\mathcal{N}_n$. We say that $D$ is of type 0 [type 1] if every row in $M_0$ [$M_1$] contains a 1 and at least 2/3 of the rows in $M_1$ [$M_0$] contain no 1's.

Theorem 25. There is $\delta > 0$ such that the following holds for sufficiently large $n$.
If there is a perceptron of order $d$ and of total weight $w$ separating elements of type 0 in $\mathcal{N}_n$ from elements of type 1 in $\mathcal{N}_n$, then either $d \ge \delta n^{1/2}$ or $w \ge 2^{\delta n}$.
Proof. Let $n$ be an integer. Let $P$ be a perceptron of order $d$ and of total weight $w$ separating elements of type 0 in $\mathcal{N}_n$ from elements of type 1 in $\mathcal{N}_n$. Put $\varepsilon = 1/3$, $\delta = 0.01$ and $m = n$. Suppose that $d < \delta n^{1/2}$. Then the conditions of Lemma 4 are satisfied. Therefore, there are probability distributions $\mu$ and $\nu$ on $\mathcal{M}_n$ such that:
1) a random matrix $M$ is good with probability 1 with respect to $\mu$;
2) a random matrix $M$ is not 2/3-bad with probability at most $2e^{-(2/15)(1/3)n}$ with respect to $\nu$;
3) $\mu$ and $\nu$ are $d$-indistinguishable.

Let $p$ denote $2e^{-(2/15)(1/3)n}$. Consider the probability distributions $\kappa = \mu \times \nu$ and $\lambda = \nu \times \mu$ on $\mathcal{N}_n$. Then

$$\mathrm{Prob}_\kappa[D \text{ has type } 0] \ge 1 - 2p, \qquad \mathrm{Prob}_\lambda[D \text{ has type } 1] \ge 1 - 2p.$$

As we have seen above, 3) implies that $E_\kappa W_P(D) = E_\lambda W_P(D)$. Let $t$ be the threshold value of the threshold gate. Obviously we can assume that $|t| \le w$. We have $E_\kappa W_P(D) \ge (1 - 2p)(t + 1) - 2pw$ and $E_\lambda W_P(D) \le (1 - 2p)t + 2pw$. Therefore, $(1 - 2p)(t + 1) - 2pw \le (1 - 2p)t + 2pw$, which implies the inequality $w \ge 1/(6p) = (1/12)e^{(2/15)(1/3)n} > e^{\delta n}$ for sufficiently large $n$. □

Theorem 26 ([50]). There is an oracle $A$ such that $\mathrm{AM}^A \cap \mathrm{co\text{-}AM}^A \not\subseteq \mathrm{PP}^A$.

Proof. By Theorem 1 it is sufficient to prove that $\mathrm{AMLOGS} \cap \mathrm{co\text{-}AMLOGS} \not\subseteq \mathrm{PPLOGS}$. Let $a$ be a binary word of length $2n^2$. We will view $a$ as a pair of Boolean matrices of size $n \times n$. Consider the separation problem $F$ with $F(a) = 1$ if $a$ has type 1 and $F(a) = 0$ if $a$ has type 0 (the remaining case, where $a$ has neither type 0 nor type 1 or $|a| \ne 2n^2$, plays no role below). Obviously, $F \in \mathrm{AMLOGS} \cap \mathrm{co\text{-}AMLOGS}$. So it is sufficient to prove that $F \notin \mathrm{PPLOGS}$. Assume that there are a polylog-time predicate $Q(a, r)$ and a polylogarithm $l(n)$ such that

$$F(a) = 1 \Rightarrow \mathrm{M}^{1/2} r \in \mathbb{B}^{l(n)}\; Q(a, r) = 1, \qquad F(a) = 0 \Rightarrow \neg\mathrm{M}^{1/2} r \in \mathbb{B}^{l(n)}\; Q(a, r) = 1.$$

Assume that the computation time of $Q$ on inputs $a$, $r$ with $|a| = 2n^2$, $|r| = l(n)$ is bounded by the polylogarithm $d(n)$. For every $n$, let us construct a perceptron $P$ of order $d = d(n)$ and of total weight $w = 2^{d(n) + l(n)}$ such that

(61) $P(a) = 1 \iff \mathrm{M}^{1/2} r \in \mathbb{B}^{l(n)}\; Q(a, r) = 1$ for all $a \in \mathbb{B}^{2n^2}$.
Let $r$ be a random string of length $l(n)$. The value of $Q(a, r)$ depends on $d$ bits of $a$ learned in the computation of $Q$ on $(a, r)$. Let $v = v(1)v(2)\dots v(d)$ be a binary string of length $d$ such that $Q(a, r) = 1$ whenever the $d$ bits of $a$ asked in the computation of $Q$ on $(a, r)$ are $v(1), v(2), \dots, v(d)$, respectively. We associate the following AND-gate $C$ with the pair $(v, r)$. Let $u_1, \dots, u_d$ be the indices of bits of $a$ asked in the computation of $Q$ on $a$, $r$. On an assignment $a$, the gate $C$ produces 1 when $a(u_k) = v(k)$ for all $k \in \{1, 2, \dots, d\}$. Declare the weight on $C$ to be 1, and the threshold value of the perceptron $P$ to be $2^{l(n) - 1}$. It is easy to verify that $W_P(a) = |\{r \in \mathbb{B}^{l(n)} \mid Q(a, r) = 1\}|$. This implies (61). Obviously, the order of $P$ is $d(n) = \mathrm{polylog}(n)$, and the total weight of $P$ is $2^{d(n) + l(n)} = 2^{\mathrm{polylog}(n)}$. Theorem 25 shows that for every sufficiently large $n$, $P$ cannot separate pairs of type 0 from pairs of type 1 in $\mathcal{N}_n$, and we are done. □

7.4. Conclusion. Theorem 24 states that a perceptron of small total weight separating good matrices from $q$-bad ones has large order. This leaves the possibility that perceptrons of small order and arbitrary total weight can separate good matrices from $q$-bad ones (for some $q < 1$). Since the one-in-a-box theorem involves restrictions on the order only, a theorem stating that perceptrons of small order and arbitrary total weight cannot do the job would be a better extension of the one-in-a-box theorem. Recently, R. Beigel obtained such a lower bound (personal communication). He proved that perceptrons separating good matrices of size $n \times n$ from $q$-bad ones must have superpolylogarithmic order in $n$ (for any fixed $q < 1$). The problem of whether perceptrons of order $n^{o(1)}$ can do the job remains open. Note that Beigel's bound also suffices to separate AM from PP via oracles.

8. The universum method

So far we have not yet constructed an oracle under which some positive and negative assertions hold simultaneously, say $\mathrm{P} = \mathrm{R} \ne \mathrm{BPP}$. Many results of this sort (when an oracle is constructed under which some Boolean combination of complexity assertions is true) have appeared in the literature. The following results among them deal with the classes considered here. Rackoff in [43] constructed oracles $A$ and $B$ such that $\mathrm{P}^A = \mathrm{R}^A \ne \mathrm{NP}^A$ and $\mathrm{P}^B \ne \mathrm{R}^B = \mathrm{NP}^B$. In [5], it was proved that $\mathrm{P} = \mathrm{NP} \cap \mathrm{co\text{-}NP} \ne \mathrm{NP}$ under some oracle.
Homer and Selman in [29] showed that there is an oracle under which $\mathrm{P} \ne \mathrm{NP}$ but NP-sets are separable. This implies that the reliability of all the cryptographic schemes based on the existence of one-way functions cannot be derived from $\mathrm{P} \ne \mathrm{NP}$ by relativizable arguments (since one-way functions do not exist if NP-sets are separable). We show that one cannot prove, using relativizable arguments, that NP-sets are inseparable even under the hypotheses that co-NP-sets are inseparable and $\mathrm{P} \ne \mathrm{R}$. The strongest of our results states that there is an oracle under which $\mathrm{P} \ne \mathrm{NP}$ but NP-sets are separable, co-NP-sets are separable, and $\mathrm{P} = \mathrm{BPP}$. In other words, it is impossible to prove by relativizable arguments even the disjunction "NP-sets are inseparable or co-NP-sets are inseparable or $\mathrm{P} \ne \mathrm{BPP}$" under the $\mathrm{P} \ne \mathrm{NP}$ hypothesis.

Our method goes back to [5]. We call it "the universum method". We refine that method and apply it to prove the existence of oracles relative to which certain Boolean combinations of the assertions $\mathrm{P} = \mathrm{NP}$, $\mathrm{P} = \mathrm{R}$, $\mathrm{P} = \mathrm{BPP}$, $\mathrm{P} = \mathrm{NP} \cap \mathrm{co\text{-}NP}$, $\mathrm{P} = \mathrm{R} \cap \mathrm{co\text{-}R}$, "NP-sets are P-separable", and "co-NP-sets are P-separable" hold (we are successful in constructing oracles for 13 of 17 possible combinations; thus four problems of this kind remain unsolved).

Roughly speaking, the method works as follows. Suppose we want to prove that there exists an oracle $A$ such that $\mathrm{P}^A \ne \mathrm{BPP}^A$ and $\mathrm{P}^A = \mathrm{R}^A$. First, we define a subset $V$ (called the universum) of the set of all oracles. Second, we choose a
sufficiently powerful oracle $H$ (in all known applications we can take any PSPACE-complete set as $H$). Third, we consider machines having two oracles: the oracle $H$ and a varying oracle $B$ ranging over $V$. (Thus, every machine of this type accepts a subset of $\mathbb{B}^* \times V$.) Finally, we prove that there exists a BPP-machine of this type which recognizes a subset of $\mathbb{B}^* \times V$ recognizable by no P-machine of this type, and prove that for any R-machine of this type there exists a P-machine of this type recognizing the same subset of $\mathbb{B}^* \times V$.

Another general method close to ours was presented in [15]. An extension of that method was applied by Fortnow and Rogers in [17] to prove the existence of oracles relative to which certain Boolean combinations of the assertions $\mathrm{P} = \mathrm{NP}$, $\mathrm{P} = \mathrm{UP}$, $\mathrm{P} = \mathrm{NP} \cap \mathrm{co\text{-}NP}$, "NP-sets are P-separable", and "co-NP-sets are P-separable" hold. They succeeded in constructing oracles for all possible combinations.

In a sense our method (as well as the method of [15]) is a special kind of forcing method. In Section 8.4, we prove two results that can be interpreted as saying that our method fails to prove the following two theorems: $\mathrm{P} \ne \mathrm{R} = \mathrm{PSPACE}$ under some oracle [43], and $\mathrm{P} = \mathrm{NP} \ne \mathrm{PSPACE}$ under some oracle [32]. We start with a sample application of the method.

8.1. A sample application. In the sequel, we will use the following notation. For a finite set $M \subset \mathbb{B}^*$, let $\mathrm{maxlength}(M)$ denote $\max_{y \in M} |y|$. Assume that $P$ is a deterministic oracle machine. Let $\mathrm{Query}_P(x, B \oplus H)$ denote the set of all $y \in \mathbb{B}^*$ such that $P$ asks '$B(y) = ?$' during the computation on input $x$ with oracle $B \oplus H$. Assume that $N$ is a nondeterministic oracle machine and $c$ is one of its computations with oracle $B \oplus H$ on some input. Then $\mathrm{Query}_N(c, B \oplus H)$ denotes the set of all $y \in \mathbb{B}^*$ such that $N$ asks '$B(y) = ?$' during the computation $c$. By a P- [NP-, BPP-] machine we mean a polynomial-time deterministic [nondeterministic, probabilistic] oracle machine.

Definition 10.
Let $L_1$, $L_2$ and $L$ be languages. We say that $L$ separates $L_1$ from $L_2$ if $L_1 \subset L$ and $L_2 \subset \mathbb{B}^* \setminus L$. Let $C$ and $C'$ be families of languages. A $C$-set is a language from $C$. We say that $C$-sets are $C'$-separable if for any two disjoint languages $L_1$ and $L_2$ in $C$, there exists a language $L$ in $C'$ which separates $L_1$ from $L_2$. We say that $\mathrm{NP}^A$-sets [$\mathrm{co\text{-}NP}^A$-sets] are separable if $\mathrm{NP}^A$-sets [$\mathrm{co\text{-}NP}^A$-sets] are $\mathrm{P}^A$-separable. If this is not the case, then we say that $\mathrm{NP}^A$-sets [$\mathrm{co\text{-}NP}^A$-sets] are inseparable.

Theorem 27 ([17, 39]). There exists an oracle $A$ such that $\mathrm{NP}^A$-sets are inseparable and $\mathrm{co\text{-}NP}^A$-sets are separable.

Proof. The proof of this theorem is very close to the proof of a theorem from [5] stating that $\mathrm{P}^A = \mathrm{NP}^A \cap \mathrm{co\text{-}NP}^A \ne \mathrm{NP}^A$ for some oracle $A$. Define the sequence of integers $n_i$ by induction: $n_0 = 1$, $n_{i+1} = 2^{2^{n_i}}$. Let $S = \{n_i \mid i \in \mathbb{N}\}$. Consider the following set of oracles:

$V = \{B \in \Omega \mid$ for all $n \in S$ there exists at most one $y \in \mathbb{B}^n$ such that $B(y) = 1$, and for all $n \in \mathbb{N} \setminus S$ there exist no $y \in \mathbb{B}^n$ such that $B(y) = 1\}$.

Let $H$ be a PSPACE-complete language. The oracle $A$ will have the form $B \oplus H$, where $B$ is in $V$. Thus, we have to define the oracle $B$.
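To make the universum $V$ concrete, the following sketch computes the initial elements of the sequence $n_0 = 1$, $n_{i+1} = 2^{2^{n_i}}$ and tests whether a finite oracle belongs to $V$. It rests on assumptions made only for illustration: an oracle is represented by the finite set of binary strings on which it equals 1, and the function names are hypothetical.

```python
from collections import Counter


def lengths_in_S(limit):
    """All n_i <= limit, where n_0 = 1 and n_{i+1} = 2 ** (2 ** n_i)."""
    result, n = [], 1
    while n <= limit:
        result.append(n)
        # The next element 2**(2**n) exceeds limit exactly when
        # 2**n >= limit.bit_length(); stop before building that huge number.
        if 2 ** n >= limit.bit_length():
            break
        n = 2 ** (2 ** n)
    return result


def in_universum(ones):
    """ones: finite set of binary strings y with B(y) = 1 (B is 0 elsewhere).

    B is in V when it has at most one 1 on each length in S
    and no 1 on any other length."""
    if not ones:
        return True
    per_length = Counter(len(y) for y in ones)
    allowed = set(lengths_in_S(max(per_length)))
    return all(n in allowed and count <= 1 for n, count in per_length.items())
```

The sequence starts 1, 4, 65536, so for example the oracle that is 1 only on the word 0110 lies in $V$, while an oracle that is 1 on two words of length 4, or on any word of length 2, does not.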
We construct $B$ in such a way that the sets

$L_0^B = \{1^n \mid n \in S$ and there exists $y \in \mathbb{B}^{n-1}$ such that $B(0y) = 1\}$,
$L_1^B = \{1^n \mid n \in S$ and there exists $y \in \mathbb{B}^{n-1}$ such that $B(1y) = 1\}$

are $\mathrm{P}^A$-inseparable. Obviously, both $L_0^B$ and $L_1^B$ belong to $\mathrm{NP}^{B \oplus H}$ and are disjoint for any $B \in V$. So we have to construct an oracle $B \in V$ such that (1) $L_0^B$ and $L_1^B$ are separable by no $\mathrm{P}^{B \oplus H}$-set, and (2) any two disjoint $\mathrm{co\text{-}NP}^{B \oplus H}$-sets are separable by some $\mathrm{P}^{B \oplus H}$-set.

Let $M$ be a deterministic or nondeterministic machine. Write $M^A(x) = 1$ if $M$ with oracle $A$ accepts $x$, and write $M^A(x) = 0$ otherwise. We say that a pair $(N_0, N_1)$ of NP-machines is correct on $A$ if the languages $\{x \mid N_0^A(x) = 0\}$ and $\{x \mid N_1^A(x) = 0\}$ are disjoint. The assertion (1) means that for any P-machine $P$ the language $\{x \mid P^{B \oplus H}(x) = 1\}$ does not separate $L_0^B$ from $L_1^B$. The assertion (2) means that for any pair $(N_0, N_1)$ of NP-machines correct on $B \oplus H$ there exists a set in $\mathrm{P}^{B \oplus H}$ separating $\{x \mid N_0^{B \oplus H}(x) = 0\}$ from $\{x \mid N_1^{B \oplus H}(x) = 0\}$.

Let $P_0, P_1, \dots, P_i, \dots$ be an enumeration of P-machines and $(N_{00}, N_{01}), (N_{10}, N_{11}), \dots, (N_{i0}, N_{i1}), \dots$ an enumeration of pairs of NP-machines. We make a countable number of steps. On any step we define some oracle values and do some freezing. In other words, we construct a sequence of intervals $\Gamma_1 \supseteq \Gamma_2 \supseteq \Gamma_3 \supseteq \cdots$ such that each $\Gamma_i$ intersects with $V$. On step $i = 2k + 1$ we find $\Gamma_i$ such that the language $\{x \mid P_k^{B \oplus H}(x) = 1\}$ does not separate $L_0^B$ from $L_1^B$ for any $B \in \Gamma_i \cap V$. On step $i = 2k + 2$ we find $\Gamma_i$ such that either the pair $(N_{k0}, N_{k1})$ is not correct on $B \oplus H$ for any $B \in \Gamma_i \cap V$, or the languages $\{x \mid N_{k0}^{B \oplus H}(x) = 0\}$ and $\{x \mid N_{k1}^{B \oplus H}(x) = 0\}$ are separable by a set in $\mathrm{P}^{B \oplus H}$ for any $B \in \Gamma_i \cap V$. Obviously, for any oracle $B$ in the set $V \cap \bigcap_{i=1}^{\infty} \Gamma_i$ the assertions (1) and (2) will hold.

We start with $\Gamma_0 = \Omega$. Let us explain what to do at each step. Let $\Gamma_{i-1} = \Gamma(\varphi) = \{B \in \Omega \mid B|_{\mathrm{Dom}(\varphi)} = \varphi\}$ be the interval constructed at the $(i-1)$th step. At the $i$th step we do the following. Consider two cases.
First case: i = 2k + 1. Pick n £ S greater than maxlength Dom(ф) and so large that P& on input ln makes less than 2n~1 queries to oracle. Let C be the oracle in Гг-i that is equal to zero on all the words not in Dom(ф). Without loss of generality we may assume that Pfe//(ln) = 0 (the other case is entirely similar). We know that | Query pfe(ln, C 0 H)\ is less than the number of words of length n — 1. Pick a word z of the form 1 и in the set Bn \ Query pfe(ln, C0 H). Note that z is not in Dom(</>), since n > maxlength (Dom(ф)). Let Г, = {Ве Г4_! I B{z) = 1, B(y) = C(y) for all у e Query cPk{ln,C 0 Я)}. Then рЯфН(Г) = Р£ФН(Г) = 0 and Lf (ln) = 1 for any В e Гь and Г* П V is non-empty since CUf} is in ГгПР The reader can see that, in fact, we have proved the following lemma, whose analog will be used in all other proofs. Lemma 7. If an interval Г intersects with V then there exists no P-machine P such that РЯфЯ separates Lf from Lf for any В £ Г П V. Second case: г = 2k + 2. Consider two subcases. 
REL ATI VIZ ABILITY IN COMPLEXITY THEORY 135 First subcase: There exists an oracle C £ Ti-\C\V such that the pair (A^o, N^i) is not correct опСфЯ. Then pick an x £ B* such that (x) = N^®H(x) = 0. Let L = {в € rv, I B(y) = C(y) for all у e (J Query %ko(c0, С ® Я) U (J Query %kl(ci,C ® #) j, Co Cl where the unions are over all the computations of N^o and AT^i, respectively, on input ln with oracle C 0 H. Second subcase: the pair (A^o, A^i) is correct on C 0 H for any C £ T^-i П V. Then let Г* = Г^-ь We have to prove that the sets {x \ N^H(x) = 0} and {x | NB®H (x) = 0} are separable by a set from РЯфЯ for any В £ ГгПЬ. This assertion easily follows from the following two lemmas. Definition 11. A polynomial machine is a deterministic oracle Turing machine that works within polynomial-space and makes at most poly(|x|) queries to its oracle (x is an input). Lemma 8. Leet Г be an interval and (Nq,Ni) a pair of NP-machines that is correct on C 0 H for any С £ Г П V. Then there exists a polynomial machine P such that PB(x) is equal to a j £ {0,1} for which NB®H(x) — 1 for any x and any В evn Г. Lemma 9. Let P be a polynomial machine. Then there exists a P-machine M such that PB(x) = МЯфЯ (x) for all x £ В*, В £ Pi (recall that H is a PSPACE- complete set). Proof of Lemma 8. We describe the work of P on input x with oracle В in the case В £ Г П L The reader can easily modify the program of P to handle the general case. The machine P with oracle В on input x works as follows. First find n = щ £ § such that log2 n < |x| < 2n. Let m be so large that N3 on inputs of length greater than m cannot query oracle values on words of length n2+i or greater (j = 0,1). If |x| < m, then compute NB®H(x) directly, and return 0 if NB®H (x) = 1 and 1 otherwise. If |x| > m, proceed as follows. Query the value of В on all the words of length at most Щ-\. The number of such queries is less than 2Пг~1+1 < 2\x\. 
We know the value of В on all words on which both values NB®H (x) and NB^H (x) depend, except for values on words of length n. Let 0 otherwise. Note that C £ V П Г; therefore А^фЯ(х) = 1 or (x) = 1. Find an l £ {0,1} such that А^СфЯ(х) = 1, and find an accepting computation c of Ni with oracle C 0 H on input x. This can be done within polynomial-space by checking all the computations of N$ and N\ with oracle C0 H on input x. All the queries made to H in those computations can be answered within polynomial-space because their lengths are bounded by poly(|x|) and H £ PSPACE. Set W = Query %i (с, С0Я)П Bn. Query lB(y) = ?’ for all у £ W. If B(y) = 0 for all у eW, then АГЯфЯ(ж) = 1; in this case return l. Otherwise we have found the unique word of length n on which В is equal to 1, and therefore can find both NB®H (x) and А/'ЯфЯ(х) within 
polynomial-space without making extra queries to $B$. Obviously, we have made $\mathrm{poly}(|x|)$ queries. □

Proof of Lemma 9. Let $P$ be a polynomial machine. Define the functions $\mathrm{question}(x, w)$ and $\mathrm{result}(x, w)$ as follows. Let $w$ be a binary word of length $n$. Run the machine $P$ on input $x$ and give the answer $w(1)$ to the first query, the answer $w(2)$ to the second query, and so on. There are three possibilities:
1) $P$ makes exactly $n$ queries and then returns a result, say $r$; in this case set $\mathrm{question}(x, w) = \$$, $\mathrm{result}(x, w) = r$;
2) $P$ makes $n$ queries and then makes an $(n+1)$st query, say '$B(y) = ?$'; in this case set $\mathrm{question}(x, w) = y$, $\mathrm{result}(x, w) = \$$;
3) $P$ makes less than $n$ queries; in this case set $\mathrm{question}(x, w) = \mathrm{result}(x, w) = \$$.
Obviously, both functions question and result are computable within polynomial-space. Therefore, they can be computed by a polynomial-time machine with oracle $H$. Let the machine $M$ work according to the program shown in Figure 3. □

    begin
      w := Λ (the empty word);
      while result(x, w) = $
        commentary: result(x, w) is computed in time poly(|x|, |w|) by querying H;
      do
        y := question(x, w);
        commentary: question(x, w) is computed in time poly(|x|, |w|) by querying H;
        b := B(y);
        w := wb;
      od
      return result(x, w)
    end

Figure 3

The proof of Theorem 27 is finished. □

All other theorems are proved according to the above scheme. Namely, first a set $V$ of oracles is defined (which is called the universum). The oracle under which the desired Boolean combination of complexity assertions holds always has the form $B \oplus H$, where $H$ is a PSPACE-complete set. The desired properties of $B$ are represented as a countable family of requirements on $B$, and then a diagonal construction is used to satisfy all the requirements. At the $i$th step, an interval $\Gamma_i$ is constructed such that the $i$th requirement holds for any $B \in \Gamma_i \cap V$.
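The loop of Figure 3 can be sketched in executable form. This is only an illustration under explicit assumptions: the polynomial machine is modeled as a Python generator that yields its oracle queries and receives the answers, and `question` and `result` are obtained by replaying that generator directly, whereas in the proof they are computed by a polynomial-time machine with oracle $H$. All names (`replay`, `simulate`, `example_machine`) are hypothetical.

```python
PENDING = "$"  # the marker $ used in the definitions of question and result


def replay(machine, x, w):
    """Run machine on x, answering its first len(w) queries with the bits of w."""
    run = machine(x)
    asked = 0
    try:
        query = next(run)
        while True:
            if asked == len(w):
                return ("question", query)   # case 2: an (n+1)st query appears
            answer = w[asked]
            asked += 1
            query = run.send(answer)
    except StopIteration as stop:
        if asked == len(w):
            return ("result", stop.value)    # case 1: exactly n queries made
        return ("short", None)               # case 3: fewer than n queries


def question(machine, x, w):
    kind, value = replay(machine, x, w)
    return value if kind == "question" else PENDING


def result(machine, x, w):
    kind, value = replay(machine, x, w)
    return value if kind == "result" else PENDING


def simulate(machine, x, B):
    """The program of Figure 3: extend w by one oracle answer per iteration."""
    w = []
    while result(machine, x, w) == PENDING:
        y = question(machine, x, w)
        w.append(B(y))
    return result(machine, x, w)


def example_machine(x):
    # Hypothetical machine: asks the oracle about x and about x + "1",
    # then returns the XOR of the two answers.
    first = yield x
    second = yield x + "1"
    return first ^ second
```

With the oracle that equals 1 only on the word 01, `simulate(example_machine, "0", B)` asks about 0 and 01 and returns 1. The number of loop iterations equals the number of queries the simulated machine makes, which is the point of the construction.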
The requirements are of two types: "negative" ones and "positive" ones (in the above example the requirements satisfied on odd steps are negative ones and the requirements satisfied on even steps are positive ones). Negative requirements are satisfied by using an appropriate analog of Lemma 7. Its proof is always easy; therefore we will only present the analog of the languages $L_0^B$ and $L_1^B$. The positive requirements will be satisfied by
trying first to make the current pair of machines (or a single machine in the case of classes BPP and R) incorrect. The notion of correctness of course will be specific in each case. If this fails, then we use an analog of Lemma 8, which combined with Lemma 9 (common for all the applications of the method) will complete the proof. Thus, the proof of any specific theorem in the sequel will consist of the definition of the appropriate universum, the definition of analogs of the languages $L_0^B$ and $L_1^B$, and the proof(s) of the appropriate analog(s) of Lemma 8.

[Figure 4: a directed graph whose nodes are labeled by assertions, among them $\mathrm{P}^A = \mathrm{NP}^A$ and $\mathrm{P}^A = \mathrm{R}^A \cap \mathrm{co\text{-}R}^A$.]

Figure 4. The edges of the drawn directed graph represent relativizable implications. For example, the implication $\mathrm{P}^A = \mathrm{NP}^A \Rightarrow \mathrm{P}^A = \mathrm{BPP}^A$ is true since $\mathrm{BPP}^A \subset \Sigma_2^A$.

8.2. Other applications of the universum method. The assertions on complexity classes in which we are interested are shown in Figure 4. We prove the existence of oracles under which one or another combination of assertions that label the nodes of the shown graph holds. There are 17 possible combinations of those assertions. They are listed in Table 1. We are able to prove the existence of oracles under which the combinations of all the lines except for lines number 3, 4, 9 and 10 are true. In fact, we do not know the answer to the following question.

Question. Is there an oracle under which co-NP-sets are separable and $\mathrm{P} \ne \mathrm{BPP}$?

We shall use only the universums of the form

$$V = V(Z) = \{B \in \Omega \mid \forall n \notin S\;\; B|_{\mathbb{B}^n} = 0,\;\; \forall n \in S\;\; B|_{\mathbb{B}^n} \in Z\},$$

where $Z$ is a subfamily of $F$ (recall that $F$ denotes $\bigcup_{n=0}^{\infty} F_n$, where $F_n$ is the set of all functions from $\mathbb{B}^n$ to $\mathbb{B}$), and 0 is an identically zero function. The set $Z$ is called the base of $V(Z)$. We will use the following five standard bases:
• $Z(\le 1) = \{\alpha \in F \mid \#_1\alpha \le 1\}$ (this base was already used in the proof of Theorem 27),
• $Z(=1) = \{\alpha \in F \mid \#_1\alpha = 1\}$,
• $Z(\ge 1) = \{\alpha \in F \mid \#_1\alpha \ge 1\}$,
138 NIKOLAI К. VERESHCHAGIN Table 1. The signs “+” and “0” in a line of the table indicate that the corresponding assertion is true. The signs ” and “0” indicate that the corresponding assertion is false. The difference between “+” and “0” is that the truth of assertions labeled by “+” follows from the truth of assertions labeled by “0” but the truth of any assertion labeled by “0” does not follow from the truth of the other assertions. The difference between ” and “0” is the same. The comment ending each line tells where the combination given in the line is proved. P = NP NP-sets separable co-NP-sets separable P = NP Dco-NP P = BPP II P = R flco-R Comment 1 0 0 0 0 0 0 0 [5] 2 © 0 0 0 0 0 0 Theorem 39 3 - 0 0 0 © 0 0 Unknown 4 0 0 0 - © 0 Unknown 5 - 0 © 0 0 0 0 Theorem 30 6 - 0 © 0 © 0 0 Theorem 33 7 - 0 © 0 - © 0 Theorem 34 8 - © 0 0 0 0 0 Theorem 28 9 - © 0 0 © 0 0 Unknown 10 - © 0 0 - © 0 Unknown 11 - © © 0 0 0 0 Theorem 32 12 - © © 0 © 0 0 Theorem 35 13 - © © 0 0 © 0 Theorem 36 14 - - — © 0 0 0 Theorem 29 15 — - - © © 0 0 Theorem 37 16 - - - © - © 0 Theorem 38 17 - - - - - - © Well known . Z(BPP) = \Jn&{a € Fn I #ltt/2" 2 [1/3; 2/3]}, • Z(R) = (Jn&{a e Fn | #xa/2" £ (0;2/3]}. Other bases will be built from the standard bases by the following operation 0 on bases: Z' 0 Z" = {ct 0 Fnt | i is even and a 0 Z'} U {ct 0 Fnt | i is odd and a 0 Z"}. Now, we formulate five analogs of Lemma 7, which will be used to satisfy negative requirements in the proofs of the next theorems. Their proofs are straightforward, and therefore we omit them. Lemma 10. Let Z' be any base and let an interval Г intersect with the uni- versum V = V(Z(<1 )+Z'). Then there exists no P-machine M such that MB®H separates the language {ln | n = щ 0 S, i is even and there exists у 0 Bn_1 such that B(0y) = 1} from the language {ln | n — щ 0 S, i is odd and there exists у 0 Bn_1 such that B(ly) = 1} for any В 0 Г C\V. These languages are in NPB0// and are disjoint for any В 0 V. Lemma 11. 
Let Z' be any base and let an interval Г intersect with the uni- versum V = V(Z(>l)-\-Z'). Then there exists no В-machine M such that МБфЯ 
RELATI VIZ ABILITY IN COMPLEXITY THEORY 139 separates the language {ln | n = пг £ §, i is even and B(0y) = 0 for all у £ Bn_1} from the language {ln | n = щ £ §, г zs odd and B(ly) = 0 for all у £ Bn_1} /or any В £ ГПУ. These languages are in co-NPB0// and are disjoint for any В £ У. Lemma 12. Le£ Z' be any base and let an interval Г intersect with the uni- versum V = y(Z(=l)+Z'). Then there exists no P-machine M such that МЯфЯ recognizes the language {ln | n — щ £ §, i is even and there exists у £ Bn_1 such that B(0y) = 1} for any В £ Г П У. This language is in NPB0// П co-NPB0// for any В £ У. Lemma 13. Let Z' be any base and let an interval Г intersect with the univer- sum V = y(Z(BPP)+Z'). Then there exists no P-machine M such that МЯфЯ recognizes the language {ln | n = пг £ §, i is even and #i(L?|Bn) > |2n} for any В £ Г П У. This language is in ВРРЯфЯ for any В £ У. Lemma 14. Let Z' be any base and let an interval Г intersect with the uni- versum V — y(Z(R)+Z'). Then there exists no ¥-machine M such that МЯфЯ recognizes the language {ln | n = пг £ §, i is even and #i(B\Bn) > |2n} for any В £ Г П У. This language is in RB®H for any В £ У. Now, we are going to consider all the lines in Table 1 except lines 3, 4, 9 and 10. The existence of oracles under which the combinations in the first and in the last lines hold is well known, so we skip those lines. Theorem 28 ([39]). NP-sets are inseparable, co-NP-sets are separable, and P = BPP, under some oracle (line 8 in the table). PROOF. This theorem strengthens Theorem 27, and its proof uses the same universum У = y(Z(<l)). All we have to do is to prove the analog of Lemma 8 for BPP-machines. We say that a BPP-machine M is correct on an oracle A if MA accepts each input with a probability lying outside the segment [l/3;2/3]. Lemma 15. Assume that Г is an interval and M is a BPP-machine that is correct on C 0 H for any С £ Г П У. 
Then there exists a polynomial machine $P$ that recognizes with oracle $B$ the same language as $M$ does with oracle $B \oplus H$ for any $B \in V \cap \Gamma$.

Proof. Let us construct $P$. Let $x$ be an input to $P$. In fact, the beginning of the proof of all analogs of Lemma 8 is common. We first find an $n = n_i \in \mathbb{S}$ such that $\log_2 n < |x| < 2^n$, query $B$'s values on words of length at most $n_{i-1}$, then compute the value $\mathrm{Prob}[M^{B\oplus H}(x) = 1]$ directly if $|x|$ is so small that $M^{B\oplus H}(x)$ may depend on $B|\mathbb{B}^{\ge n_{i+1}}$. It remains to construct a polynomial machine $P'$ that on input $(x, B|\mathbb{B}^{\le n_{i-1}})$, where $B \in V \cap \Gamma$, decides whether $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$
provided $x$ is so long that $M^{B\oplus H}$ on input $x$ cannot query $B$'s value on words of length $n_{i+1}$ or greater.

Let $P'$ work as follows. First find the probability $p$ of the event "$M^{C\oplus H}(x) = 1$", where $C$ is the oracle that is equal to $B$ on words of length different from $n$ and to zero on all other words. Note that we know all the values of $C$ needed to find $p$. Without loss of generality we may assume that $p > 1/2$ (the case $p \le 1/2$ is entirely similar).³ For an oracle $D$, let $w_D(y)$ denote the probability of the event "$M^{D\oplus H}$ at some moment in the computation on input $x$ queries '$D(y) = ?$'". Then $\sum_y w_D(y) \le \mathrm{poly}(|x|)$ for any $D$. Let $W$ denote the set of all $y \in \mathbb{B}^n$ such that $w_C(y) \ge 1/6$. Obviously, $|W| \le \mathrm{poly}(|x|)$. Find $W$ and query '$B(y) = ?$' for all $y \in W$. Consider two cases.

First case: $B(y) = 0$ for all $y \in W$. Let us prove that then $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$. Since $M$ is correct on $B \oplus H$, the probability $q$ of the event "$M^{B\oplus H}(x) = 1$" is either greater than 2/3 or less than 1/3. We claim that the first alternative holds. Indeed, if $B|\mathbb{B}^n = C|\mathbb{B}^n$, then $q = p > 1/2$, therefore $q > 2/3$. Otherwise denote by $y$ the unique word of length $n$ such that $B(y) = 1$. Then $w_C(y) < 1/6$ because $y \notin W$. Therefore, $|\mathrm{Prob}[M^{B\oplus H}(x) = 1] - \mathrm{Prob}[M^{C\oplus H}(x) = 1]| \le w_C(y) < 1/6$, and so $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 1/2 - 1/6 = 1/3$. Return 1 in the first case.

Second case: $B(y) = 1$ for some $y \in W$. In this case we know all the values of $B$ needed to compute $\mathrm{Prob}[M^{B\oplus H}(x) = 1]$. □

Theorem 29 ([39]). There exists an oracle $A$ such that NP$^A$ $\cap$ co-NP$^A$ $\ne$ P$^A$ and BPP$^A$ = P$^A$ (line 14).

Proof. Take the base $Z = Z(=1)$. The analog of Lemma 8 for BPP-machines is already proved (see the proof of Lemma 15 and footnote 3). □

Theorem 30 ([39]). There exists an oracle $A$ such that NP$^A$-sets are separable, co-NP$^A$-sets are inseparable and BPP$^A$ = P$^A$ (line 5).

Proof. Let $V = V(Z(\ge 1))$. A pair $(N_0, N_1)$ of NP-machines is said to be correct on $A$ if $N_0^A(x) = 0$ or $N_1^A(x) = 0$ for all $x$.
To ensure the separability of NP$^A$-sets we shall prove the following analog of Lemma 8.

Lemma 16. Let $(N_0, N_1)$ be a pair of NP-machines that is correct on $B \oplus H$ for any $B \in V \cap \Gamma$. Then there exists a good machine $P$ that for all $B \in V \cap \Gamma$ on input $x$ with oracle $B$ finds an $l \in \{0,1\}$ for which $N_l^{B\oplus H}(x) = 0$.

Proof. Let $x$ be an input word. Let $n = n_i$ be defined as in the proof of Lemma 8. Assume that the length of $x$ is so large that both machines $N_0, N_1$ on input $x$ cannot query oracle values on words of length $\ge n_{i+1}$, and also so long

³ Since $M$ is correct on $C \oplus H$, we know that in this case $p > 2/3$. However, we shall not use this fact because we want the proof to be valid for the case $V = V(Z(=1))$, and in this case $p$ can belong to the segment $[1/3; 2/3]$.
that $n$ is greater than the lengths of the words defining $\Gamma$. Assume that we already know $B|\mathbb{B}^{\le n_{i-1}}$. We have to find an $l$ such that $N_l^{B\oplus H}(x) = 0$. To this end we shall use the technique from [9].

Let $l = 0, 1$. An $l$-certificate is a function having the form $C|\mathrm{Query}(c, C \oplus H)$, where $C$ is an oracle agreeing with $B$ on all words of length different from $n$, $N_l^{C\oplus H}(x) = 1$, and $c$ is an accepting computation of $N_l^{C\oplus H}$ on $x$. Note that if $\gamma$ is an $l$-certificate and $C$ continues $\gamma$, then $N_l^{C\oplus H}(x) = 1$. Obviously, the number of elements in the domain of any $l$-certificate is polynomial in $|x|$; let $p(|x|)$ denote this polynomial.

Assume that $x$ is so long that $2^n > 2p(|x|)$. We claim that then any 0-certificate $\varphi$ is inconsistent with any 1-certificate $\psi$. Indeed, assume that a 0-certificate $\varphi$ and a 1-certificate $\psi$ are consistent. Then there exists an oracle $C$ agreeing with $B$ on words of length different from $n$ that continues both $\varphi$ and $\psi$. As $|\mathrm{Dom}(\varphi)| + |\mathrm{Dom}(\psi)| < 2^n$, we may assume that there exists $y \in \mathbb{B}^n$ such that $C(y) = 1$, that is, $C$ is in $V$.⁴ Since $C$ continues both $\varphi$ and $\psi$, we have $N_0^{C\oplus H}(x) = N_1^{C\oplus H}(x) = 1$. Thus the pair $N_0, N_1$ is incorrect on $C \oplus H$ and $C$ is in $V \cap \Gamma$. The contradiction proves the claim. □

Let $\mathcal{C}_0$ [$\mathcal{C}_1$] be the set of all 0-certificates [1-certificates]. Let $U = \emptyset$. Repeat $p(|x|)$ times the following loop. Pick a 0-certificate $\varphi$ in $\mathcal{C}_0$ (if $\mathcal{C}_0$ is empty then return 0 and halt). Query '$B(y) = ?$' for all $y \in \mathrm{Dom}(\varphi)$, and remove from $\mathcal{C}_0$ and $\mathcal{C}_1$ all certificates that are inconsistent with $B|\mathrm{Dom}(\varphi)$. Include in $U$ all the elements of $\mathrm{Dom}(\varphi)$. (We will explain below how to perform the program within polynomial-space.)

Before and after each iteration of the loop, all the certificates in $\mathcal{C}_0 \cup \mathcal{C}_1$ agree with each other on $U$. On the other hand, in each iteration, any certificate $\psi$ in $\mathcal{C}_1$ is inconsistent with the picked 0-certificate $\varphi$; therefore its domain intersects with $\mathrm{Dom}(\varphi) \setminus U$.
Hence the number of elements of the set $\mathrm{Dom}(\psi) \setminus U$ decreases after each iteration of the loop for any 1-certificate $\psi$ in $\mathcal{C}_1$. Thus, after $p(|x|)$ iterations, $U$ includes the domains of all the certificates in $\mathcal{C}_1$. If $\mathcal{C}_1$ becomes empty, then $N_1^{B\oplus H}(x) = 0$. Otherwise $\mathcal{C}_0$ becomes empty, and so $N_0^{B\oplus H}(x) = 0$. Obviously, we have made at most $p(|x|)^2$ queries to $B$.

Let us prove now that the described program can be run within polynomial-space. We do not need to write $\mathcal{C}_0$ or $\mathcal{C}_1$. It suffices to write the set $U$ and $B$'s value on elements of $U$. For given $U$ and $B|U$ we can decide whether there is a 0-certificate [1-certificate] consistent with $B|U$ by checking all the computations of $N_0$ [$N_1$] on input $x$. If a query '$B(y) = ?$' is made during one of the computations, we answer '$B(y)$' if $|y| \le n_{i-1}$ or $y$ is in $U$ (note that we know $B$'s value on such words), '0' if $n_{i-1} < |y| < n$ or $n < |y|$, and we try all the answers otherwise. As the number of queries does not exceed $\mathrm{poly}(|x|)$, the amount of written information is $\mathrm{poly}(|x|)$. □

To ensure the equality BPP$^A$ = P$^A$ we shall prove the following analog of Lemma 8.

⁴ In the next theorem we shall need this lemma for $V = V(Z(\mathrm{BPP}))$. In this case we need the inequality $|\mathrm{Dom}(\varphi)| + |\mathrm{Dom}(\psi)| < (1/3)2^n$. Assuming this inequality, we can find an oracle $C$ consistent with both $\varphi$ and $\psi$, agreeing with $B$ on words of length different from $n$ and such that the number of words of length $n$ in $C$ is greater than $(2/3)2^n$; that is, we can find $C \in V \cap \Gamma$ continuing both $\varphi$ and $\psi$.
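The elimination loop from the proof of Lemma 16 can be illustrated by a small sketch. It is only an illustration under simplifying assumptions: the two certificate sets are given explicitly as lists of partial functions (Python dicts), whereas the proof never writes them down and works within polynomial space; the loop stops as soon as one list is empty instead of running a fixed number of iterations; and the names `consistent`, `find_rejecting_index` and `query` are hypothetical and do not come from the text.

```python
def consistent(cert, known):
    """True if the partial function `cert` agrees with every value in `known`."""
    return all(known.get(y, v) == v for y, v in cert.items())


def find_rejecting_index(certs0, certs1, query):
    """Return (l, known) such that no l-certificate agrees with the oracle.

    certs0, certs1: lists of dicts mapping words to bits; every element of
    certs0 is assumed to be inconsistent with every element of certs1.
    query: callable returning the oracle bit B(y) for a word y.
    known: the oracle values that were actually queried.
    """
    known = {}
    c0, c1 = list(certs0), list(certs1)
    while c0 and c1:
        phi = c0[0]                      # pick a 0-certificate
        for y in phi:                    # query B on Dom(phi)
            if y not in known:
                known[y] = query(y)
        # drop every certificate contradicted by the queried values
        c0 = [c for c in c0 if consistent(c, known)]
        c1 = [c for c in c1 if consistent(c, known)]
        if c0 and c1 and c0[0] is phi:
            # phi survived together with some 1-certificate, so the
            # pairwise-inconsistency assumption was violated
            raise ValueError("certificates are not pairwise inconsistent")
    return (0 if not c0 else 1), known
```

If the picked 0-certificate agrees with $B$ on its whole domain, every 1-certificate is contradicted at once and the answer is 1; otherwise the picked certificate itself is discarded and the loop continues with the remaining ones.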
Lemma 17. Let $M$ be a BPP-machine that is correct on $B \oplus H$ for any $B \in V \cap \Gamma$. Then there exists a polynomial machine $P$ such that $P^B$ recognizes the same language as $M^{B\oplus H}$ does for any $B \in V \cap \Gamma$.

Proof. We use some ideas from [30] and [40]. By Lemma 16 it suffices to construct a pair $(N_0, N_1)$ of NP-machines such that

$$\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3 \;\Rightarrow\; N_1^{B\oplus H}(x) = 1,\ N_0^{B\oplus H}(x) = 0,$$
$$\mathrm{Prob}[M^{B\oplus H}(x) = 1] < 1/3 \;\Rightarrow\; N_1^{B\oplus H}(x) = 0,\ N_0^{B\oplus H}(x) = 1,$$

for any $x$ and any $B \in V \cap \Gamma$. We shall construct the machine $N_1$ ($N_0$ can be constructed in a similar way). Let $x$ be an input and let $B$ be in $V \cap \Gamma$. Let $n$ be defined as usual, and let $x$ be so long that $M$ on input $x$ cannot query $B$'s value on words of length $n_{i+1}$ or greater, and also so long that $n$ is greater than the lengths of the words defining $\Gamma$. In the rest of the proof of the lemma we shall consider only oracles that agree with $B$ on all the words of length different from $n$. Note that an oracle $C$ with this property is in $V \cap \Gamma$ if and only if $\#_1(C|\mathbb{B}^n) > 0$.

Let $k$ denote the maximal number of queries that $M$ can make on input $x$. Obviously, $k \le \mathrm{poly}(|x|)$. For a $y \in \mathbb{B}^n$ and an oracle $D$, let $w_D(y)$ denote the probability of the event "$M^{D\oplus H}$ on input $x$ queries '$D(y) = ?$'". Let $W = \{y \in \mathbb{B}^n \mid w_B(y) \ge 1/(9k+3)\}$. Since $\sum_y w_B(y) \le k$, the set $W$ has at most $(9k+3)k$ elements.

Claim. If $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$, then $\mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3$ for any oracle $C$ that agrees with $B$ on $W$.

Proof. Assume the contrary: there exists $C$ agreeing with $B$ on $W$ and such that $\mathrm{Prob}[M^{C\oplus H}(x) = 1] < 1/3$. Choose a $C$ satisfying these conditions and differing from $B$ on the least number of words. Let $U = \{y \in \mathbb{B}^n \mid B(y) \ne C(y)\}$. Let us prove that $w_C(y) > 1/3$ for any $y$ in $U$ save perhaps one. Let $y$ be an element of $U$. Let $C_y$ denote the oracle obtained from $C$ by changing the value on $y$. We distinguish two cases.

First case: $C_y \in V$. As $C_y$ differs from $B$ on fewer arguments than $C$ does and $C_y \in V \cap \Gamma$, we have $\mathrm{Prob}[M^{C_y\oplus H}(x) = 1] > 2/3$.
Therefore, $w_C(y) \ge \mathrm{Prob}[M^{C_y\oplus H}(x) = 1] - \mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3 - 1/3 = 1/3$.

Second case: $C_y \notin V$, that is, $\#_1(C_y|\mathbb{B}^n) = 0$. This may happen only if $\#_1(C|\mathbb{B}^n) = 1$, and therefore this case can occur for the unique $y$.

As $\sum_y w_C(y) \le k$, we have $|U| \le 3k + 1$. Since $U \cap W = \emptyset$, we have $w_B(y) < 1/(9k+3)$ for any $y \in U$. Hence

$$\sum_{y \in U} w_B(y) < \frac{1}{9k+3}(3k+1) = \frac{1}{3}.$$

On the other hand,

$$\sum_{y \in U} w_B(y) \ge \mathrm{Prob}[M^{B\oplus H}(x) = 1] - \mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3 - 1/3 = 1/3.$$

The contradiction proves the claim. □

In a similar way we can prove that if $\mathrm{Prob}[M^{B\oplus H}(x) = 1] < 1/3$, then $\mathrm{Prob}[M^{C\oplus H}(x) = 1] < 1/3$ for any oracle $C \in V$ agreeing with $B$ on $W$.

Thus there is a nondeterministic oracle machine that accepts $x$ with all oracles $B \oplus H$ such that $B \in V$ and $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$, and rejects $x$ with oracles
$B \oplus H$ such that $B \in V$ and $\mathrm{Prob}[M^{B\oplus H}(x) = 1] < 1/3$, and that makes $\mathrm{poly}(|x|)$ queries to the oracle. This machine guesses a set $W$ having at most $k(9k+3)$ elements, learns $B$'s value on $W$, and then accepts if and only if $\mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3$ for all oracles $C \in V$ that agree with $B$ on $W$. However, this machine is not a polynomial one, as it does not work within polynomial-space. To convert this machine into a polynomial-space machine we do the following. Assume that $2(9k+3)k < 2^n$.

Claim. $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$ if and only if

(*) there is $U$ such that $|U| \le (9k+3)k$ and $\mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3$ for any $C$ agreeing with $B$ on $U$, and equal to zero on at most $2(9k+3)k$ words of length $n$.

Proof. The implication from left to right follows from the above claim, since we can take $W$ as $U$. Let us prove the implication from right to left. Assume that $\mathrm{Prob}[M^{B\oplus H}(x) = 1] < 1/3$ but there is $U$ such that $|U| \le (9k+3)k$ and $\mathrm{Prob}[M^{C\oplus H}(x) = 1] > 2/3$ for any $C$ agreeing with $B$ on $U$ and equal to zero on at most $2(9k+3)k$ words of length $n$. Choose such a $U$. Let $D$ denote the oracle agreeing with $B$ on $U \cup W$ and equal to 1 on all the words from $\mathbb{B}^n \setminus (U \cup W)$. As $D$ agrees with $B$ on $W$, we have $\mathrm{Prob}[M^{D\oplus H}(x) = 1] < 1/3$. On the other hand, $D$ agrees with $B$ on $U$ and is equal to zero on at most $2(9k+3)k$ words of length $n$. We conclude that $\mathrm{Prob}[M^{D\oplus H}(x) = 1] > 2/3$. This contradiction shows that $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$ if and only if (*) is true. □

Assume that a subset $U$ of $\mathbb{B}^n$ has at most $(9k+3)k$ elements. The values of an oracle $C$ that is equal to zero on at most $2(9k+3)k$ words of length $n$ can be identified by means of a polynomial amount of information. Thus for given $U$, $B|\mathbb{B}^{\le n_{i-1}}$ and $B|U$ we can decide within polynomial-space whether (*) is true. Therefore we can decide this in polynomial-time using the oracle $H$.

The machine $N_1^{B\oplus H}$ on input $x$ works as follows. Query the value of $B$ on all the words of length at most $n_{i-1}$.
Then guess a set $U \subseteq \mathbb{B}^n$ having $(9k+3)k$ elements, and accept if (*) is true. □

Theorem 31 ([39]). There exists an oracle $A$ such that NP$^A$-sets are separable, BPP$^A$ $\ne$ P$^A$, and R$^A$ = P$^A$.

Proof. Let $V = V(Z(\mathrm{BPP}))$. To ensure separability of NP$^A$-sets we need the following analog of Lemma 8. A pair $(N_0, N_1)$ of NP-machines is said to be correct on $A$ if $N_0^A(x) = 0$ or $N_1^A(x) = 0$ for all $x$.

Lemma 18. Let $(N_0, N_1)$ be a pair of NP-machines that is correct on $B \oplus H$ for any $B \in V \cap \Gamma$. Then there is a polynomial machine $P$ that for any $B \in V \cap \Gamma$ on input $x$ with oracle $B$ finds an $l \in \{0,1\}$ for which $N_l^{B\oplus H}(x) = 0$.

Proof. This lemma can be proved just as Lemma 16. The only difference is that we have to take $x$ so large that $2^n/3 > 2p(|x|)$ (and not $2^n > 2p(|x|)$ as in that proof). □

Let us say that a probabilistic oracle Turing machine $M$ is correct on $A$ if $\mathrm{Prob}[M^A(x) = 1]$ either is equal to 0 or is greater than 2/3 for any $x$. We need the following analog of Lemma 8.
Lemma 19. Assume that $M$ is a probabilistic polynomial-time oracle machine correct on the oracle $B \oplus H$ for any $B \in V \cap \Gamma$. Then there exists a polynomial machine $P$ that with any oracle $B \in V \cap \Gamma$ recognizes the same language as $M$ does with the oracle $B \oplus H$.

Proof. By Lemma 18 it suffices to construct a pair $(N_0, N_1)$ of NP-machines such that

$$\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3 \;\Rightarrow\; N_1^{B\oplus H}(x) = 1,\ N_0^{B\oplus H}(x) = 0,$$
$$\mathrm{Prob}[M^{B\oplus H}(x) = 1] = 0 \;\Rightarrow\; N_1^{B\oplus H}(x) = 0,\ N_0^{B\oplus H}(x) = 1,$$

for any $x$ and any $B \in \Gamma \cap V$. It is obvious that there exists an NP-machine $N_1$ satisfying this requirement. Thus, we have to construct $N_0$. Let $x$ be an input to $N_0$ and let $B$ be in $V \cap \Gamma$. Let $n = n_i$ be defined as earlier. Assume that $M$ on input $x$ cannot query oracle values on words of length $\ge n_{i+1}$. In the rest of the proof of the lemma we shall consider only oracles that agree with $B$ on all the words of length different from $n$.

Let $k = \mathrm{poly}(|x|)$ be the maximal number of queries to $B$ which machine $M$ can make during the work on input $x$. Denote by $w_C(y)$ the probability of the event "$M^{C\oplus H}$ on input $x$ queries '$C(y) = ?$'". Let $W = \{y \in \mathbb{B}^n \mid w_B(y) \ge 1/(4k)\}$. Note that $|W| \le 4k^2$.

Claim. Assume that $\mathrm{Prob}[M^{B\oplus H}(x) = 1] = 0$. Then $\mathrm{Prob}[M^{C\oplus H}(x) = 1] = 0$ for any $C$ agreeing with $B$ on $W$.

Proof. Assume the contrary, and let $C$ be a counterexample. Then for at least one random string, $M^{C\oplus H}(x) = 1$. Denote by $U$ the set of all the $y \in \mathbb{B}^n$ such that the query '$C(y) = ?$' is made during the computation of $M^{C\oplus H}$ on $x$ for that random string. Obviously, $|U| \le k$. Let $D$ be the oracle agreeing with $C$ on $U$ and with $B$ on the remaining words. If $n$ is large enough, then $D$ is in $\Gamma$. If $D$ belonged to $V$ we would obtain a contradiction: we know that the probability of the event "$M^{D\oplus H}(x) = 1$" is positive; hence, this probability would be greater than 2/3. Therefore, $\sum_{y:\, D(y) \ne B(y)} w_B(y)$ would be greater than 2/3.
On the other hand, $\{y \mid D(y) \ne B(y)\} \subseteq U \setminus W$; consequently,

$$\sum_{y:\, D(y) \ne B(y)} w_B(y) < |U| \cdot \frac{1}{4k} \le \frac{1}{4}.$$

Now we have to explain what to do if $D \notin V$, that is, if $\#_1(D|\mathbb{B}^n) \in [(1/3)2^n, (2/3)2^n]$. We know that $B \in V$, that is, $\#_1(B|\mathbb{B}^n) \notin [(1/3)2^n, (2/3)2^n]$. Without loss of generality we may assume that $\#_1(B|\mathbb{B}^n) > (2/3)2^n$. Then $\#_1(D|\mathbb{B}^n) > (2/3)2^n - |U|$. We have $|U| \le k = \mathrm{poly}(|x|)$. Therefore, we may assume that $2|U| + |W| < (1/3)2^n$. As $(1/3)2^n \le \#_0(D|\mathbb{B}^n)$, there exists a set $T \subseteq \mathbb{B}^n$ having exactly $|U|$ elements, disjoint from $U \cup W$ and such that $D(y) = 0$ for any $y \in T$. Choose such a $T$ and change the value of $D$ on all the words from $T$. Now we have $\#_1(D|\mathbb{B}^n) > (2/3)2^n$; therefore $D$ is in $V$. As $D(y) = C(y)$ for any $y \in U$, we have $\mathrm{Prob}[M^{D\oplus H}(x) = 1] > 0$. Therefore, $\mathrm{Prob}[M^{D\oplus H}(x) = 1] > 2/3$.
TABLE 2. The sign "+" in a line of the table indicates that the analog of Lemma 8 is true for the corresponding universum. The sign "-" in a line of the table indicates that the analog of Lemma 7 is true for the corresponding universum. The letter "o" means that the proof was omitted (because we do not need the corresponding assertion).

Universum      NP-sep.    co-NP-sep.    NP ∩ co-NP    BPP      R
V(Z(≤1))       -          +             +             +        +
V(Z(=1))       -          -             -             +        +
V(Z(≥1))       +          -             +             +        +
V(Z(BPP))      +          - (o)         +             -        +
V(Z(R))        +          - (o)         +             -        -
V(F)           + (o)      - (o)         + (o)         + (o)    + (o)

Recall that $\mathrm{Prob}[M^{B\oplus H}(x) = 1] = 0$. Hence,

$$\sum_{y:\, D(y) \ne B(y)} w_B(y) > 2/3.$$

On the other hand,

$$\sum_{y:\, D(y) \ne B(y)} w_B(y) < (|U| + |T|) \cdot \frac{1}{4k} \le \frac{1}{2}.$$

This contradiction proves the claim. □

The rest is as in the proof of Lemma 17. Assume that $|x|$ is so large that $4k^2 + k < \tfrac{1}{3}2^n$.

Claim. $\mathrm{Prob}[M^{B\oplus H}(x) = 1] = 0$ if and only if

(*) there is $U$ such that $|U| \le 4k^2$ and $\mathrm{Prob}[M^{C\oplus H}(x) = 1] = 0$ for any $C$ that agrees with $B$ on $U$ and vanishes on at most $4k^2 + k$ words of length $n$.

Proof. The implication from left to right follows from the above claim. Let us prove the implication from right to left. Assume that $\mathrm{Prob}[M^{B\oplus H}(x) = 1] > 2/3$ but there is $U$ such that $|U| \le 4k^2$ and $\mathrm{Prob}[M^{C\oplus H}(x) = 1] = 0$ for any $C$ agreeing with $B$ on $U$ and equal to zero on at most $4k^2 + k$ words of length $n$. Choose such a $U$. Fix any random string for which $M^{B\oplus H}(x) = 1$, and let $R$ denote the set of $y \in \mathbb{B}^n$ such that the query '$B(y) = ?$' was made during the computation of $M^{B\oplus H}$ on input $x$ for that string. Let $D$ denote the oracle agreeing with $B$ on $U \cup R$ and equal to 1 on all the words in $\mathbb{B}^n \setminus (U \cup R)$. Then $\mathrm{Prob}[M^{D\oplus H}(x) = 1] > 0$, since $D$ agrees with $B$ on $R$. On the other hand, $D$ agrees with $B$ on $U$ and is equal to zero on at most $4k^2 + k$ words of length $n$. Hence $\mathrm{Prob}[M^{D\oplus H}(x) = 1] = 0$. This contradiction proves the claim. □

So the machine $N_0^{B\oplus H}$ works on the input $x$ as follows. Query the value of $B$ on all the words of length at most $n_{i-1}$. Then guess a set $U \subseteq \mathbb{B}^n$ having $4k^2$ elements, and accept if (*) is true.
□ Thus the theorem is proved. □

The above facts on the five standard universums are shown in Table 2. The last line in the table contains unproved facts about the universum $V(F)$ presented for the sake of completeness. They all are straightforward, except inseparability of co-NP-sets. The latter fact was proved by An. Muchnik (personal communication). We present a sketch of his proof. Consider a binary string $f$ of length $k(2^k + 1)$. We
view $f$ as a function from the set $M = \{1, \ldots, 2^k + 1\}$ into $\mathbb{B}^k$. By the Pigeon Hole Principle there are different $i, j \in M$ such that $f(i) = f(j)$. Let the first co-NP-set consist of those $f$ for which there are no different $i, j \in M$ such that $f(i) = f(j)$ and the first symbol of $f(i)$ is 0. Let the second co-NP-set consist of those $f$ for which there are no different $i, j \in M$ such that $f(i) = f(j)$ and the first symbol of $f(i)$ is 1. It is clear that to separate these sets we need to learn at least $2^k + 1$ bits of $f$.

In the proofs of the following theorems we use bases obtained by addition from the five standard bases.

Theorem 32 ([39]). There exists an oracle $A$ such that NP$^A$-sets and co-NP$^A$-sets are inseparable, NP$^A$ $\cap$ co-NP$^A$ = P$^A$ and BPP$^A$ = P$^A$ (line 11 in Table 1).

Proof. Let $V = V(Z(\le 1)+Z(\ge 1))$. Let us first prove the analog of Lemma 8 for NP $\cap$ co-NP-machines.

Lemma 20. Let $(N_0, N_1)$ be a pair of NP-machines such that the languages accepted by $N_0^{B\oplus H}$ and $N_1^{B\oplus H}$ are complementary for any $B \in V \cap \Gamma$. Then there exists a polynomial machine $P$ that with any oracle $B \in V \cap \Gamma$ accepts the same language as $N_0$ does with oracle $B \oplus H$.

Proof. Machine $P$ works as follows. Let $x$ be an input. Let $n = n_i \in \mathbb{S}$ be defined by the inequalities $\log_2 n < |x| < 2^n$. If $i$ is even, then by definition $\alpha \in Z \iff \#_1\alpha \le 1$ for any $\alpha \in F_n$. In this case we consider $(N_0, N_1)$ as a pair defining a problem of separation of co-NP-sets, and argue as in the proof of Lemma 8. If $i$ is odd, then by definition $\alpha \in Z \iff \#_1\alpha \ge 1$ for any $\alpha \in F_n$. In this case we consider $(N_0, N_1)$ as a pair defining a problem of separation of NP-sets and reason as in the proof of Lemma 16. □

The analog of Lemma 8 for BPP-machines can be proved similarly. □

To prove Theorems 33-38 we do not need any new ideas. Therefore we shall only present the bases used in their proofs.

Theorem 33 ([39]). There exists an oracle $A$ such that NP$^A$-sets are separable, co-NP$^A$-sets are inseparable, BPP$^A$ $\ne$ P$^A$, R$^A$ = P$^A$ (line 6).
Proof. Take the base $Z = Z(\mathrm{BPP}) + Z(\ge 1)$. □

Theorem 34 ([39]). There exists an oracle $A$ such that NP$^A$-sets are separable, co-NP$^A$-sets are inseparable, and P$^A$ $\ne$ R$^A$ (line 7).

Proof. Take the base $Z = Z(\mathrm{R}) + Z(\ge 1)$. □

Theorem 35 ([39]). There exists an oracle $A$ such that NP$^A$-sets are inseparable, co-NP$^A$-sets are inseparable, BPP$^A$ $\ne$ P$^A$, NP$^A$ $\cap$ co-NP$^A$ = P$^A$, and R$^A$ = P$^A$ (line 12).

Proof. Take the base $Z = Z(\le 1) + Z(\ge 1) + Z(\mathrm{BPP})$. □
Theorem 36 ([39]). There exists an oracle $A$ such that NP$^A$-sets are inseparable, co-NP$^A$-sets are inseparable, R$^A$ $\ne$ P$^A$, and NP$^A$ $\cap$ co-NP$^A$ = P$^A$ (line 13).

Proof. Take the base $Z = Z(\le 1) + Z(\ge 1) + Z(\mathrm{R})$. □

Theorem 37 ([39]). There exists an oracle $A$ such that NP$^A$ $\cap$ co-NP$^A$ $\ne$ P$^A$, BPP$^A$ $\ne$ P$^A$, and R$^A$ = P$^A$ (line 15).

Proof. Take the base $Z = Z(=1) + Z(\mathrm{BPP})$. □

Theorem 38 ([39]). There exists an oracle $A$ such that NP$^A$ $\cap$ co-NP$^A$ $\ne$ P$^A$, R$^A$ $\ne$ P$^A$, and R$^A$ $\cap$ co-R$^A$ = P$^A$ (line 16).

Proof. Take the base $Z = Z(=1) + Z(\mathrm{R})$. □

The next theorem completes the theorems shown in Table 1. To prove it we need many universums.

Theorem 39 ([39]). There exists an oracle $A$ such that P$^A$ $\ne$ NP$^A$, NP$^A$-sets are separable, co-NP$^A$-sets are separable, and BPP$^A$ = P$^A$ (line 2 in Table 1).

Proof. We use a diagonal construction as in the proof of Theorem 27, but instead of a chain of intervals we construct a chain of universums $V_0 \supseteq V_1 \supseteq \cdots$. Any $V_i$ will be specified by an interval $\Gamma = \Gamma(\alpha)$ and a positive integer $j$, and will consist of all oracles $B \in \Gamma$ such that $B|\mathbb{B}^n = 0$ for all $n \notin \mathbb{S}$ and $\#_1(B|\mathbb{B}^n) \le n/j$ for all $n$ in $\mathbb{S}$ greater than $\mathrm{maxlength}(\mathrm{Dom}\,\alpha)$. Let $V(\alpha, j)$ stand for the set of $B$ specified in this way by $\alpha, j$. Obviously, $V(\alpha, j) = \emptyset$ if $\alpha(y) \ne 0$ for some $y \in \mathrm{Dom}\,\alpha$ such that $|y| \notin \mathbb{S}$. So we shall assume that this is not the case.

Since any set $V(\alpha, j)$ is closed in Baire's topology, the intersection $\bigcap_{i=0}^{\infty} V_i$ will be non-empty provided all $V_i$'s are non-empty. (We recall that Baire's topology is the topology whose base consists of intervals.)

The oracle $A$ will have the form $B \oplus H$, where $H$ is a PSPACE-complete set. The set in NP$^A$ $\setminus$ P$^A$ will be

$$L^B = \{1^n \mid n \in \mathbb{S},\ \exists u \in \mathbb{B}^n\ B(u) = 1\}.$$

We do not present the whole diagonal construction, indicating instead only specific points. The steps at which we satisfy the requirement $L^B \notin$ P$^{B\oplus H}$ are made as in all previous proofs.
At those steps we do some freezing; thus the current universum $V(\alpha, j)$ is replaced by universum $V(\alpha', j)$, where $\alpha'$ extends $\alpha$. On steps at which we satisfy the requirement of separability of NP-sets we use the following analog of Lemma 8.

Lemma 21. Assume that $N_0, N_1$ are NP-machines such that the languages accepted by $N_0^{B\oplus H}$ and $N_1^{B\oplus H}$ are disjoint for any $B \in V(\alpha, j)$. Then there exists a polynomial machine separating those languages for any $B \in V(\alpha, 2j)$.

Proof. The polynomial machine separating $\{x \mid N_0^{B\oplus H}(x) = 1\}$ from $\{x \mid N_1^{B\oplus H}(x) = 1\}$ works on input $x$ as follows. Let $n = n_i$ be defined as in previous proofs. An $l$-certificate ($l = 0, 1$) is a function of the form $C|(\mathrm{Query}(c, C \oplus H) \cap \mathbb{B}^n)$, where $C$ is an oracle in $V(\alpha, 2j)$ agreeing with $B$ on words of length different from $n$, $N_l^{C\oplus H}(x) = 1$ and $c$ is an accepting computation of $N_l^{C\oplus H}$ on input $x$.
Let us prove that every 0-certificate is inconsistent with every 1-certificate. Assume the contrary: some 0-certificate $\varphi$ is consistent with some 1-certificate $\psi$. Let

$$C(y) = \begin{cases} \varphi(y) & \text{if } y \in \mathrm{Dom}(\varphi); \\ \psi(y) & \text{if } y \in \mathrm{Dom}(\psi); \\ B(y) & \text{if } |y| \ne n; \\ 0 & \text{otherwise.} \end{cases}$$

As $\#_1\varphi \le \frac{n}{2j}$ and $\#_1\psi \le \frac{n}{2j}$, we have $\#_1(C|\mathbb{B}^n) \le \frac{n}{2j} + \frac{n}{2j} = \frac{n}{j}$; hence $C \in V(\alpha, j)$. On the other hand, $N_0^{C\oplus H}(x) = N_1^{C\oplus H}(x) = 1$. This contradiction shows that each 0-certificate is inconsistent with each 1-certificate. From here we can reason just as in the proof of Lemma 16. □

Lemma 22. Assume that $N_0, N_1$ are NP-machines such that the languages accepted by $N_0^{B\oplus H}$ and $N_1^{B\oplus H}$ span $\mathbb{B}^*$ for all $B \in V(\alpha, j)$. Then there exists a polynomial machine $P$ that on input $x$ with any oracle $B \in V(\alpha, j)$ finds an $l \in \{0,1\}$ such that $N_l^{B\oplus H}(x) = 1$.

Proof. Let the machine $P$ work on input $x$ as follows. Start with the oracle $C$ that is equal to $B$ on words of length different from $n$ and to zero on the remaining words. Find an $l$ such that $N_l^{C\oplus H}(x) = 1$ (such an $l$ does exist since $C$ is in $V(\alpha, j)$). Then we either discover that $N_l^{B\oplus H}(x) = N_l^{C\oplus H}(x)$, or find a $u \in \mathbb{B}^n$ such that $B(u) = 1$. In the latter case we include $u$ in $C$ and repeat the process. After at most $k = \lfloor n/j \rfloor + 1$ iterations we will halt, since $\#_1(B|\mathbb{B}^n) < k$. □

We need also the analog of Lemma 8 for BPP-machines:

Lemma 23. Let $M$ be a BPP-machine that is correct on $B \oplus H$ for any $B \in V(\alpha, j)$. Then there exists a polynomial machine $P$ such that $P^B$ recognizes the same language as $M^{B\oplus H}$ for any $B \in V(\alpha, j)$.

Proof. Let the machine $P$ work as follows. Let $x$ be the input. Let $n$ be defined as in Lemma 8. Let the oracle $C$ be equal to $B$ on words of length different from $n$ and to zero on the remaining words. Let $U = \{u \in \mathbb{B}^n \mid w_C(u) \ge 1/(3k)\}$, where $k$ stands for a polynomial upper bound for the number of queries made by $M$ on input $x$ and $w_C(u)$ denotes the probability of the event "$M^{C\oplus H}$ on input $x$ queries '$C(u) = ?$'". Query '$B(u) = ?$' for all $u \in U$.
If there is no $u \in U$ such that $B(u) = 1$, then

$$|\mathrm{Prob}[M^{B\oplus H}(x) = 1] - \mathrm{Prob}[M^{C\oplus H}(x) = 1]| \le \sum_{y:\, C(y) \ne B(y)} w_C(y) < k \cdot \frac{1}{3k} = 1/3.$$

Therefore, in this case $M^{B\oplus H}$ accepts $x$ if and only if $M^{C\oplus H}$ accepts $x$. Otherwise, include in $C$ all those $u \in U$ which are in $B$, and repeat the process. After at most $l = \lfloor n/j \rfloor + 1$ iterations we will halt, since $\#_1(B|\mathbb{B}^n) < l$.

Let us present one more application of the universum method, consisting in a new proof of a known theorem.

Theorem 40 ([27]). There exists an oracle $A$ such that P$^A$ $\ne$ NP$^A$ $\cap$ co-NP$^A$ $\ne$ NP$^A$ and the class NP$^A$ $\cap$ co-NP$^A$ has an m-complete language.
Proof. It is sufficient to construct an oracle $A$ such that NP$^A$ $\cap$ co-NP$^A$ $\ne$ P$^A$, NP$^A$ $\not\subseteq$ co-NP$^A$ and the class NP$^A$ $\cap$ co-NP$^A$ has an m-complete language. The oracle $A$ will have the form $B \oplus H$, where $H$ is a PSPACE-complete set. Thus, we have to construct the oracle $B$. Take the universum

$$V = \{A \in \Omega \mid \#_1(A|\mathbb{B}^n) = 1 \text{ for any even } n \text{ and } \#_1(A|\mathbb{B}^n) \le 1 \text{ for any odd } n\}.$$

The language in NP$^A$ $\setminus$ co-NP$^A$ will be

$$L_1^B = \{1^n \mid n \text{ is odd and } \exists u \in \mathbb{B}^n\ B(u) = 1\}.$$

The language in NP$^A$ $\cap$ co-NP$^A$ $\setminus$ P$^A$ will be

$$L_2^B = \{1^n \mid n \text{ is even and } \exists u \in \mathbb{B}^{n-1}\ B(1u) = 1\}.$$

Obviously, $L_1^B \in$ NP$^A$ and $L_2^B \in$ NP$^A$ $\cap$ co-NP$^A$ for any $B \in V$ (recall that $A = B \oplus H$). Thus, we have to construct an oracle $B \in V$ such that

(1) $L_1^B \notin$ co-NP$^A$;
(2) $L_2^B \notin$ P$^A$;
(3) NP$^A$ $\cap$ co-NP$^A$ has an m-complete language.

To this end let us enumerate all the polynomial-time deterministic and nondeterministic oracle machines and all the pairs of nondeterministic polynomial-time oracle machines. It is easy to see that there is a chain $\Gamma_0 \supseteq \Gamma_1 \supseteq \Gamma_2 \supseteq \Gamma_3 \supseteq \ldots$ of intervals such that each $\Gamma_i$ intersects with $V$ and the following holds. If $i = 3k$, the $k$th nondeterministic machine does not accept the language $\{0,1\}^* \setminus L_1^B$ for any $B \in \Gamma_i \cap V$. If $i = 3k+1$, the $k$th deterministic machine does not recognize the set $L_2^B$ for any $B \in \Gamma_i \cap V$. And if $i = 3k+2$, then either the languages accepted by the nondeterministic machines in the $k$th pair are complementary for any $B \in \Gamma_i \cap V$, or those languages are not complementary for any $B \in \Gamma_i \cap V$. Take any oracle $B$ in $\bigcap_i \Gamma_i \cap V$. Assertions (1) and (2) are true. It remains to prove that NP$^A$ $\cap$ co-NP$^A$ has a complete language.

Denote by $N_j$ the $j$th nondeterministic polynomial-time Turing machine and by $p_j(|x|)$ a polynomial bounding its running time. For a $C \in \Omega$ let $C^{\le n}$ denote the word of length $2^{n+1} - 1$ encoding the values of $C$ on words of length at most $n$ in lexicographic order.
Let us note that a pair $(N_j, N_k)$ of NP-machines defines a language in NP$^{B\oplus H}$ $\cap$ co-NP$^{B\oplus H}$ if and only if $N_j^{B\oplus H}(x) + N_k^{B\oplus H}(x) = 1$ for any $x$. As a complete language we take the following language:

$$L^B = \{\langle j, k, B^{\le n}, x, 0^{p_j(|x|)+p_k(|x|)}\rangle \mid j, k, n \in \mathbb{N},\ N_j(x, B \oplus H) = 1 \text{ and } N_j(x, C \oplus H) + N_k(x, C \oplus H) = 1 \text{ for any } C \in V \cap \Gamma(B|\mathbb{B}^{\le n})\}.$$

Let us prove that $L^B$ is in NP$^A$ $\cap$ co-NP$^A$. To this end let us prove that $L^B$ is in NP$^A$ (the remaining part $L^B \in$ co-NP$^A$ can be proved quite similarly). Let us first construct a nondeterministic polynomial-space oracle machine that accepts $L^B$ and makes a polynomial number of queries. Let $w$ be an input word. Decide first whether $w$ has the form $\langle j, k, D^{\le n}, x, 0^{p_j(|x|)+p_k(|x|)}\rangle$
for some $D \in V$ and some $j, k, n$. Then decide whether $B^{\le n} = D^{\le n}$. This can be done within space $|D^{\le n}| \le |w|$ and requires $|D^{\le n}| \le |w|$ queries to $B$. Then decide whether $N_j(x, C \oplus H) + N_k(x, C \oplus H) = 1$ for all $C \in V$ such that $C^{\le n} = D^{\le n}$. This can be done within polynomial-space, since both values $N_j(x, C \oplus H)$ and $N_k(x, C \oplus H)$ depend only on the values of $C$ on words of length at most $p_j(|x|) + p_k(|x|)$ and $B|\mathbb{B}^i$ can be described by at most $i$ bits for any $B \in V$; therefore all the needed information about $C$ can be written using polynomial-space. If this is not the case, then reject. Otherwise run $N_j$ on input $x$ with oracle $B \oplus H$ and accept if $N_j(x, B \oplus H) = 1$.

As in the proof of Lemma 9, we can convert the constructed nondeterministic polynomial-space machine into a nondeterministic polynomial-time machine with oracle $H$.

Thus, it remains to prove that $L^B$ is complete in NP$^A$ $\cap$ co-NP$^A$. Let a language $L$ be in NP$^A$ $\cap$ co-NP$^A$. Let $(N_j, N_k)$ be a pair of nondeterministic polynomial-time oracle machines such that $L(x) = N_j(x, B \oplus H) = 1 - N_k(x, B \oplus H)$ for any $x$. The construction of the oracle ensures that there exists $n$ such that $N_j(x, C \oplus H) + N_k(x, C \oplus H) = 1$ for any $C \in V \cap \Gamma(B|\mathbb{B}^{\le n})$. Let us fix such an $n$. The map $x \mapsto \langle j, k, B^{\le n}, x, 0^{p_j(|x|)+p_k(|x|)}\rangle$ reduces $L$ to $L^B$. □

Remark 6. In a similar way we could prove all the theorems from this section in a stronger form: we could add the assertion that all the classes involved have m-complete problems.

8.3. General theorems. In this section we formalize the method applied in the previous section. We assume that our goal is to construct an oracle under which a certain Boolean combination of assertions of the form $K_1 \subseteq K_2$ or "$K_1$-sets are $K_2$-separable" holds. As $K_1, K_2$ we will consider classes of the form POLY($F$). We need some new definitions.

Definition 12. A description is a mapping from the set $\mathbb{B}^* \times \Omega$ into the set $\{0, 1, \#, *\}$. A description $D$ is called an oracle machine if $D(x, A) \ne *$ for all $x, A$.
A description $D$ is called correct on an oracle $A$ if $D(x, A) \ne \#$ for all $x \in \mathbb{B}^*$. For a description $D$ correct on an oracle $A$, let $D^A$ denote the separation problem $x \mapsto D(x, A)$. For a given class $\mathcal{D}$ of descriptions and an oracle $A$, let $\mathcal{D}^A$ stand for the set $\{D^A \mid D \in \mathcal{D} \text{ and } D \text{ is correct on } A\}$. For classes $K_1$ and $K_2$ of separation problems, we write $K_1 \le K_2$ if for each $P_1 \in K_1$ there exists $P_2 \in K_2$ such that $P_1 \le P_2$. Note that if $K_1$ is a class of languages, then $K_1 \le K_2$ means the same as $K_1 \subseteq K_2$.

Let $F$ be a separation problem. Any function $f$ polynomial-time bit-computable relative to an oracle defines the map $(x, A) \mapsto F(f^A(x))$. Let POLY($F$) stand for the class of all oracle machines of this form. It is easy to see that POLY$^A$($F$) = POLY($F$)$^A$ for any $A$.

Let $F$ be a separation problem. Any pair $(f^A, g^A)$ of functions polynomial-time bit-computable relative to $A$ defines the description

$$(62)\qquad D(x, A) = \begin{cases} 1 & \text{if } F(f^A(x)) = 1,\ F(g^A(x)) = 0, \\ 0 & \text{if } F(f^A(x)) = 0,\ F(g^A(x)) = 1, \\ * & \text{if } F(f^A(x)) = 0,\ F(g^A(x)) = 0, \\ \# & \text{otherwise.} \end{cases}$$
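As a minimal sketch, the four cases of (62) can be written as a small function. The name `separation_description` and the toy problem `F` used in the usage note are hypothetical; the functions `f` and `g` are assumed to be ordinary callables with the oracle $A$ already fixed inside them.

```python
def separation_description(F, f, g, x):
    """Value in {1, 0, '*', '#'} of the description (62) on input x.

    F: callable giving the answer of the separation problem on a word.
    f, g: callables playing the roles of f^A and g^A (oracle already fixed).
    """
    a, b = F(f(x)), F(g(x))
    if (a, b) == (1, 0):
        return 1
    if (a, b) == (0, 1):
        return 0
    if (a, b) == (0, 0):
        return "*"      # x lies in neither set, so any answer is acceptable
    return "#"          # the pair is not correct on this oracle
```

For instance, with the toy choice `F = lambda w: int("1" in w)`, `f` returning the first half of `x` and `g` the second half, the input `"1000"` gives 1, `"0001"` gives 0, `"0000"` gives `"*"`, and `"1001"` gives `"#"`.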
Let POLY($F$)-separation denote the class of descriptions $D$ of this form. It is easy to see that POLY$^A$($F$)-sets are POLY$^A$($G$)-separable if (POLY($F$)-separation)$^A$ $\le$ POLY$^A$($G$).

From now on we shall consider only classes of descriptions having either the form POLY($F$) or the form POLY($F$)-separation.

Let $\mathcal{K}_1, \ldots, \mathcal{K}_n$, $\mathcal{L}_1, \ldots, \mathcal{L}_n$, $\mathcal{M}_1, \ldots, \mathcal{M}_m$, $\mathcal{N}_1, \ldots, \mathcal{N}_m$ be classes of descriptions. We want to prove that there exists an oracle $A$ such that

$$\mathcal{K}_i^A \not\le \mathcal{L}_i^A \text{ for } i = 1, \ldots, n, \qquad \mathcal{M}_j^A \le \mathcal{N}_j^A \text{ for } j = 1, \ldots, m.$$

We shall consider for simplicity of notation the case $m = n = 1$. By a universum we will mean any nonempty subset $V$ of $\Omega$.

Definition 13. A superuniversum is any countable family $\mathcal{V}$ of universums having a largest universum up to inclusion and such that the following two assertions hold:

1. For any $V \in \mathcal{V}$ and for any interval $\Gamma$ intersecting with $V$ there exists $V' \in \mathcal{V}$ such that $V' \subseteq V \cap \Gamma$.
2. For any countable chain $V_1 \supseteq V_2 \supseteq V_3 \supseteq \ldots$ of elements of $\mathcal{V}$ the intersection $\bigcap_{i=1}^{\infty} V_i$ is non-empty.

The reader can see that in all the theorems proved in the previous section the elements of $\mathcal{V}$ have the form $V \cap \Gamma$, where $\Gamma$ is an interval and $V$ is closed in Baire's topology. Any set of this form is closed in Baire's topology; this implies condition 2 because the space $\Omega$ is compact. For example, if $V$ is closed in Baire's topology, then the family $\mathcal{V}(V) = \{V \cap \Gamma \mid \Gamma \text{ is an interval intersecting with } V\}$ is a superuniversum. Such superuniversums were used in the proofs of Theorems 27-38 and 40. The only exception was Theorem 39.

Thus, we wish to prove that there exists an oracle $A$ for which

$$\mathcal{K}^A \not\le \mathcal{L}^A, \qquad \mathcal{M}^A \le \mathcal{N}^A.$$

Notation. Let $H$ be an oracle and $\mathcal{D}$ a class of descriptions. Then $\mathcal{D}_H$ is the class of descriptions $\{(x, A) \mapsto D(x, A \oplus H) \mid D \in \mathcal{D}\}$.

For example, POLY($F_{\mathrm{BPP}}$)$_H$ is the class of descriptions of the form

$$M(x, A) = \begin{cases} 1 & \text{if } \mathrm{Prob}[M^A(x) = 1] > 2/3, \\ 0 & \text{if } \mathrm{Prob}[M^A(x) = 1] < 1/3, \\ \# & \text{otherwise,} \end{cases}$$

where $M$ is a polynomial-time probabilistic oracle machine having an extra oracle $H$.
In general, if $\mathcal{K}$ is a class of oracle machines of a certain type, then $\mathcal{K}_H$ is the class of machines of that type having the extra oracle $H$.

Let $1(\mathcal{V})$ stand for the largest universum in $\mathcal{V}$. Assume that for a superuniversum $\mathcal{V}$ and an oracle $H$ the following two assertions are true:

(a) There is a description $K$ in $\mathcal{K}_H$ that is correct on any oracle in $1(\mathcal{V})$ and such that there are no $L \in \mathcal{L}_H$ and $V \in \mathcal{V}$ such that $K(x,A) \le L(x,A)$ for any $x$ and any $A \in V$.
(b) For any $V \in \mathcal{V}$ and any $M \in \mathcal{M}_H$ correct on any oracle in $V$, there are $N \in \mathcal{N}_H$ and $V' \in \mathcal{V}$ such that $V' \subseteq V$ and $M(x,A) \le N(x,A)$ for any $x$ and any $A \in V'$.

We claim that in this case there is an oracle $A$ such that $\mathcal{K}^A \not\le \mathcal{L}^A$ and $\mathcal{M}^A \le \mathcal{N}^A$.

Theorem 41 ([38]). Assume that there are $H$ and $\mathcal{V}$ such that (a) and (b) are true. Then there is an oracle $A$ such that $\mathcal{K}^A \not\le \mathcal{L}^A$ and $\mathcal{M}^A \le \mathcal{N}^A$.

Proof. Let $K$ satisfy (a). We shall construct an oracle $B \in 1(\mathcal{V})$ such that the separation problem $K^B$ is not easier than any problem in $(\mathcal{L}_H)^B$ and such that $(\mathcal{M}_H)^B \le (\mathcal{N}_H)^B$. Then for the oracle $A = B \oplus H$ the assertions $\mathcal{K}^A \not\le \mathcal{L}^A$ and $\mathcal{M}^A \le \mathcal{N}^A$ will be true.

Let $M_0, M_1, \dots, M_i, \dots$ be an enumeration of $\mathcal{M}_H$ and let $L_0, L_1, \dots, L_i, \dots$ be an enumeration of $\mathcal{L}_H$. We have to satisfy countably many requirements of two types: for each $i \in \mathbb{N}$ we have to satisfy the requirement $K^B \not\le L_i^B$, and for each $i \in \mathbb{N}$ the requirement
\[
M_i \text{ is incorrect on } B \text{ or } M_i^B \le N^B \text{ for some } N \in \mathcal{N}_H. \tag{63}
\]
We make a countable number of steps. At step $j$ we find a universum $V_j$ so that $V_1 \supseteq V_2 \supseteq V_3 \supseteq \dots$. More exactly, at step $j = 2i + 1$ we construct a universum $V_j$ such that $K^B \not\le L_i^B$ for any $B \in V_j$. At step $j = 2i + 2$ we ensure the $i$th condition of the form (63), that is, we construct a universum $V_j$ such that $M_i$ is incorrect on $B$ or $M_i^B \le N^B$ for some $N \in \mathcal{N}_H$ and for all $B \in V_j$. As $B$ we take any oracle from the set $\bigcap_{j=1}^{\infty} V_j$.

Set $V_0 = 1(\mathcal{V})$. It remains to describe each step. Let $j$ be the number of the current step. Consider two cases.

First case: $j = 2i + 1$. Since $K$ satisfies (a), there are $x \in \mathbb{B}^*$ and $C \in V_{j-1}$ such that $K(x,C) \not\le L_i(x,C)$. Choose $x$ and $C$ satisfying this inequality. There is an interval $\Gamma$ containing $C$ such that $K(x,B) = K(x,C)$ and $L_i(x,B) = L_i(x,C)$ for all $B \in \Gamma$. We have, therefore, $K(x,B) \not\le L_i(x,B)$ for all $B \in \Gamma$. By condition 1 in the definition of a superuniversum, there is a universum $V \in \mathcal{V}$ such that $V \subseteq V_{j-1} \cap \Gamma$. Let $V_j = V$. Obviously, $K^B \not\le L_i^B$ for all $B \in V_j$.

Second case: $j = 2i + 2$. Assume first that $M_i$ is correct on $V_{j-1}$.
By condition (b) there are a description $N \in \mathcal{N}_H$ and a universum $V \subseteq V_{j-1}$, $V \in \mathcal{V}$, such that $M_i(x,B) \le N(x,B)$ for all $x \in \mathbb{B}^*$ and all $B \in V$. Then we can set $V_j = V$. Obviously, for all $B \in V_j$ the assertion (63) holds. Otherwise (when $M_i$ is incorrect on $V_{j-1}$) we can reason as in the first case, because "to be incorrect" is a local property. □

Thus, to construct an oracle under which $\mathcal{K} \not\le \mathcal{L}$ and $\mathcal{M} \le \mathcal{N}$, it suffices to find a superuniversum $\mathcal{V}$ and an oracle $H$ such that conditions (a) and (b) above are fulfilled. In this form, the method is universal. Indeed, if $\mathcal{K}^A \not\le \mathcal{L}^A$ and $\mathcal{M}^A \le \mathcal{N}^A$, then (a) and (b) are fulfilled for $\mathcal{V} = \{\{A\}\}$, $H = \emptyset$ as well as for $\mathcal{V} = \{\{\emptyset\}\}$, $H = A$. Thus Theorem 41 does not capture the essence of the method. Roughly speaking, to find a specific oracle we need to find a universum (or a family of universums) with specific inner structure. And the oracle $H$ should not be specific. Both the universums $\{A\}$, $\{\emptyset\}$ do not satisfy this requirement, as they have very simple
inner structure: they consist of a single oracle. We shall present another general theorem, which captures the essence of the universum method. Given a class $\mathcal{D}$ of descriptions, we define a nonuniform counterpart of the class $\mathcal{D}$ as follows.

Definition 14. The nonuniform counterpart of a class $\mathcal{D}$ of descriptions is the class
\[
\mathrm{n.u.}\mathcal{D} = \bigcup_{C \in \Omega} \mathcal{D}_C.
\]

Remark 7. If $\mathcal{K} = \mathrm{POLY}(F)$, then there is an equivalent definition of $\mathrm{n.u.}\mathcal{K}$ using decision trees. Recall that $A^{\le n}$ denotes the binary word of length $2^{n+1} - 1$ encoding the values of the oracle $A$ on the words of length at most $n$. We say that a function $f^A(x)$ of $A$ and $x$ is bit-computable in a polynomial number of queries if there are two families, $T_x$ and $S_{x,i}$, of decision trees such that (i) the height of both $T_x$ and $S_{x,i}$ is polynomial in $|x|$, and (ii) $T_x(A^{\le p(|x|)}) = |f^A(x)| \le p(|x|)$ and $S_{x,i}(A^{\le p(|x|)}) = (i\text{th bit of } f^A(x))$ for some polynomial $p$ and for any $x$ and $i \le |f^A(x)|$. We claim that $\mathrm{n.u.POLY}(F)$ is the class of descriptions of the form $(x,A) \mapsto F(f^A(x))$, where $f^A(x)$ is a function bit-computable in a polynomial number of queries.

Let us verify this for $F = F_{\mathrm{P}}$. We have to prove that a description $D$ is in $\mathrm{n.u.P}$ if and only if there are a family $\{T_x \mid x \in \mathbb{B}^*\}$ of Boolean decision trees and a polynomial $p$ such that $T_x(A^{\le p(|x|)}) = D(x,A)$ and the height of $T_x$ does not exceed $p(|x|)$ for all $x$. Assume that $D$ is in $\mathrm{n.u.P}$, say $D(x,A) = M(x, A \oplus C)$, where $M$ is a polynomial-time oracle machine and $C$ is an oracle. It is clear that for any $x$ the value $M(x, A \oplus C)$ can be computed by a Boolean decision tree of height $p(|x|)$ in the variables $A(y)$, $|y| \le \mathrm{poly}(|x|)$. Conversely, assume that a description $D$ is computable by a family of decision trees, $D(x,A) = T_x(A^{\le p(|x|)})$, where $p$ is a polynomial and the height of $T_x$ is at most $p(|x|)$. We may assume that the nodes of $T_x$ are binary words (the root is the empty word, and $v0$ and $v1$ are the left and the right sons, respectively, of the node $v$).
Then take as $C$ the oracle encoding, for all $x$ and all vertices $v$ of $T_x$, the label of $v$ and the indication whether $v$ is a leaf in $T_x$ or not. Then there is a polynomial-time machine computing $D(x,A)$, given $x$ as input and $A \oplus C$ as oracle.

Similarly, $\mathrm{n.u.POLY}(F)$-separation consists of descriptions of the form (62), where both $f^A(x)$ and $g^A(x)$ are bit-computable in a polynomial number of queries.

Now we are able to present the second general theorem, which is the essence of the universum method. Consider the following two conditions:

(a') There is a description $K \in \mathrm{n.u.}\mathcal{K}$ that is correct on any oracle in $1(\mathcal{V})$ and such that there are no $L \in \mathrm{n.u.}\mathcal{L}$ and $V \in \mathcal{V}$ such that $K(x,A) \le L(x,A)$ for any $x$ and any $A \in V$.

(b') For any $V \in \mathcal{V}$ and any description $M \in \mathrm{n.u.}\mathcal{M}$ correct on any oracle in $V$ there are $N \in \mathrm{n.u.}\mathcal{N}$ and $V' \in \mathcal{V}$ such that $V' \subseteq V$ and $M(x,A) \le N(x,A)$ for any $x$ and any $A \in V'$.

Note that (a') and (b') are obtained from (a) and (b), respectively, by replacing uniform classes relativized with $H$ by the corresponding nonuniform classes.

Theorem 42 ([38]). Assume that (a') and (b') are true. Then there is an oracle $H$ such that (a) and (b) hold.
Proof. Assume that both (a') and (b') are true. Let $D$ be an oracle such that $K \in \mathcal{K}_D$. Then (a) holds for all oracles $H$ to which $D$ is polynomial-time Turing reducible. Thus, it suffices to construct an oracle $H$ such that (b) is true and such that $D$ is reducible to $H$.

We shall assume that the classes $\mathcal{M}$, $\mathcal{N}$ have the form $\mathrm{POLY}(F)$: $\mathcal{M} = \mathrm{POLY}(F)$ and $\mathcal{N} = \mathrm{POLY}(G)$. Other cases are entirely similar. Let $f_j^A$, $j = 0, 1, 2, \dots$, be an enumeration of functions that are bit-computable in a polynomial number of queries. Then
\[
\mathcal{M}^A = \{F(f_j^A(x)) \mid F(f_j^A(x)) \text{ is correct on } A\}, \qquad
\mathcal{N}^A = \{G(f_k^A(x)) \mid G(f_k^A(x)) \text{ is correct on } A\}.
\]
Let $\mathcal{V} = \{V_0, V_1, V_2, \dots\}$. Let $j, l \in \mathbb{N}$. There is a polynomial $p_j$ such that $F(f_j^A(x))$ depends only on $x$ and $A|\mathbb{B}^{\le p_j(|x|)}$. Let $x$ be in $\mathbb{B}^*$ and $E$ in $\Omega$. Consider the set $U = U(j, l, x, E^{\le p_j(|x|)})$ consisting of all pairs $(k, B)$, $k \in \mathbb{N}$, $B \in \Omega$, such that
\[
F(f_j^{A \oplus E}(x)) \le G(f_k^{A \oplus B}(x))
\]
for all $A \in V_l$. Take a pair $(k, B)$ from $U$ having the minimal sum $k + (\text{time of bit-computation of } f_k \text{ on input } x)$. Let $C$ be an oracle encoding in a natural way the map
\[
(j, l, x, E^{\le p_j(|x|)}) \mapsto (k, B).
\]
More exactly, with oracle $C$ it is possible to compute $k$ in time $\mathrm{poly}(j + l + |x| + 2^{p_j(|x|)} + k)$ for any given $j, l, x, E^{\le p_j(|x|)}$, and it is possible to compute $B(y)$ in time $\mathrm{poly}(j + l + |x| + 2^{p_j(|x|)} + |y|)$ for any given $j, l, x, E^{\le p_j(|x|)}, y$.

Let $H$ be a language that is complete in $\mathrm{EXP}^{C \oplus D}$. As $D$ is reducible to $H$, the condition (a) is true. Let us prove (b). Fix $i, j$ such that the description $F(f_j^{A \oplus H}(x))$ is correct on any oracle $A \in V_i$. We have to prove that there exist a universum $V_l \subseteq V_i$ and $k'$ such that
\[
F(f_j^{A \oplus H}(x)) \le G(f_{k'}^{A \oplus H}(x))
\]
for all $x \in \mathbb{B}^*$ and all $A \in V_l$. By (b') there exist $k'$, $l$, and an oracle $B'$ such that $V_l \subseteq V_i$ and
\[
F(f_j^{A \oplus H}(x)) \le G(f_{k'}^{A \oplus B'}(x))
\]
for all $A \in V_l$ and all $x$. Let us choose such $k'$, $l$, $B'$. Fixing $j$, $l$ and $E = H$ in the map $(j, l, x, E^{\le p_j(|x|)}) \mapsto (k, B)$, we obtain two functions: $k(x)$ and $B(x, y)$ (the value of $B$ on $y$). Thus, we have
\[
F(f_j^{A \oplus H}(x)) \le G(f_{k(x)}^{A \oplus B(x,\cdot)}(x))
\]
for all $A \in V_l$ and all $x$.

Lemma 24.
Both functions $k(x)$ and $B(x,y)$ are polynomial-time computable relative to $H$.

Proof. Since the pair $(k', B')$ belongs to $U$ and
\[
k' + (\text{time of bit-computation of } f_{k'} \text{ on input } x) \le \mathrm{poly}(|x|),
\]
we can conclude that
\[
k(x) + (\text{time of bit-computation of } f_{k(x)} \text{ on input } x) \le \mathrm{poly}(|x|). \tag{64}
\]
Let us prove first that $k(x)$ is polynomial-time computable (with oracle $H$). By (64), we have $k(x) \le \mathrm{poly}(|x|)$. Therefore it suffices to prove that $k(x)$ can be computed in time $2^{\mathrm{poly}(|x|)}$ with oracle $C \oplus D$. This can be done as follows. First find $H^{\le p_j(|x|)}$. To this end, compute $H(z)$ for all $z$ with length at most $p_j(|x|)$ using an exponential machine with oracle $C \oplus D$ recognizing $H$. (As $|z| \le p_j(|x|)$, each $H(z)$ can be computed in time $2^{\mathrm{poly}(|x|)}$.) Then query $C$ to find $k(x)$. The queries to $C$ have length at most $2^{\mathrm{poly}(|x|)}$.

The polynomial-time computability of the function $B(x,y)$ can be proved similarly. □

We may assume that the enumeration $\{f_n\}$ is such that the polynomial-time computability with oracle $H$ of the functions $k(x)$ and $B(x,y)$ implies that
\[
f_{k(x)}^{A \oplus B(x,\cdot)}(x) = f_m^{A \oplus H}(x)
\]
for some $m$ and for all $x$, $A$. Consequently, we have
\[
F(f_j^{A \oplus H}(x)) \le G(f_m^{A \oplus H}(x))
\]
for all $A \in V_l$ and all $x$. □

8.4. When the universum method cannot be used.

We say that the universum method can be applied to prove that there is an oracle $A$ such that $\mathcal{K}^A \not\le \mathcal{L}^A$ and $\mathcal{M}^A \le \mathcal{N}^A$ if there is a superuniversum $\mathcal{V}$ such that (a') and (b') are true. In this section we present two theorems that cannot be proved by the universum method.

Theorem 43 ([32]). There exists an oracle $A$ such that $\mathrm{P}^A = \mathrm{NP}^A \ne \mathrm{PSPACE}^A$.

Theorem 44 ([43]). There exists an oracle $A$ such that $\mathrm{P}^A \ne \mathrm{R}^A = \mathrm{PSPACE}^A$.

In the proofs of Theorems 43 and 44, some difficult-to-compute information is encoded via oracle values, to ensure the truth of the positive assertion ($\mathcal{M}^A \le \mathcal{N}^A$). To prove Theorem 43, one needs a lower bound of [25, 55] for the complexity of computing PARITY by means of AND,OR-circuits of bounded depth, which has a rather complicated proof. Theorem 44 was proved in [43] (in fact, the weaker assertion that $\mathrm{P} \ne \mathrm{R} = \mathrm{NP}$ under some oracle was proved there, but the proof also works in our case).

Actually, we can prove that the following corollaries of Theorems 43 and 44 cannot be proved by the universum method.
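Corollary 45. There exists an oracle $A$ such that $\mathrm{NP}^A = \text{co-NP}^A \ne \mathrm{PSPACE}^A$.

Corollary 46. There exists an oracle $A$ such that $\mathrm{P}^A \ne \mathrm{PSPACE}^A$ and $\text{co-NP}^A \subseteq \mathrm{R}^A$.

Let P, R, NP, co-NP, and PSPACE denote $\mathrm{POLY}(F_{\mathrm{P}})$, $\mathrm{POLY}(F_{\mathrm{R}})$, $\mathrm{POLY}(F_{\mathrm{NP}})$, $\mathrm{POLY}(F_{\text{co-NP}})$, and $\mathrm{POLY}(F_{\mathrm{PSPACE}})$, respectively.

Theorem 47 ([38]). The universum method cannot be applied to prove Corollary 45. That is, there is no superuniversum $\mathcal{V}$ such that (a') and (b') are true for $\mathcal{K} = \mathrm{PSPACE}$, $\mathcal{L} = \mathrm{NP}$ and $\mathcal{M} = \text{co-NP}$, $\mathcal{N} = \mathrm{NP}$.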
Proof. Let $V$ be a universum. Say that an interval $\Gamma$ $n$-isolates an oracle $A$ in $V$ if $A \in \Gamma$ and $A|\mathbb{B}^n = B|\mathbb{B}^n$ for any $B \in V \cap \Gamma$. A set $W \subseteq \mathbb{B}^*$ $n$-isolates an oracle $A$ in $V$ if the interval $\Gamma(A|W)$ $n$-isolates $A$ in $V$. We define the size of a finite set of words as the sum of the lengths of all its elements. The size of an interval $\Gamma = \Gamma(\phi)$ is defined as the size of $\mathrm{Dom}(\phi)$. We say that $V$ is thin if there exists a polynomial $p(n)$ such that for all $A \in V$ and all $n \in \mathbb{N}$, there exists a set $W \subseteq \mathbb{B}^*$ of size $p(n)$ that $n$-isolates $A$ in $V$.

Consider two cases.

First case: there exists a universum $V$ in $\mathcal{V}$ that is thin. Let $p(n)$ be the corresponding polynomial. Let us prove that (a') is false in this case. Indeed, let $K$ be an oracle machine from n.u.PSPACE. Let us construct a machine $L$ in n.u.NP such that $K^A = L^A$ for all $A \in V$.

The machine $L$ on input $(x, A)$ works as follows. Let the length of queries of $K$ to the oracle on input $x$ be bounded by the polynomial $q(|x|)$. For every $i \le q(|x|)$ guess a set $W_i \subseteq \mathbb{B}^*$ such that $\mathrm{size}(W_i) \le p(i)$. Ask "$A(y) = ?$" for all $y$ in $W_i$. If there is no $B \in V$ such that the interval $\Gamma(A|W_i)$ $i$-isolates $B$ in $V$ for all $i \le q(|x|)$, then reject. Otherwise, choose such $B$. Note that if $A$ is in $V$, then $A|\mathbb{B}^{\le q(|x|)} = B|\mathbb{B}^{\le q(|x|)}$, and therefore $K(x,A) = K(x,B)$. Then accept if and only if $K(x,B) = 1$.

The total number of queries to $A$ is $\sum_{i=0}^{q(|x|)} p(i) = \mathrm{poly}(|x|)$. The maximal length of a query is $\max_{i \le q(|x|)} p(i) = \mathrm{poly}(|x|)$. Thus, $L \in \mathrm{n.u.NP}$.

Second case: all the universums in $\mathcal{V}$ are not thin. We claim that in this case (b') is false for $\mathcal{M} = \text{co-NP}$, $\mathcal{N} = \mathrm{NP}$.

Let $V_0, V_1, V_2, \dots, V_i, \dots$ be an enumeration of universums in $\mathcal{V}$. For an oracle $A$, let $m^i_A(n)$ denote the minimal size of a set $n$-isolating $A$ in $V_i$. For all $i \in \mathbb{N}$, fix a sequence $\{B^i_n\}$, $n \in \mathbb{N}$, of oracles such that $B^i_n \in V_i$ and such that for any $i$, $m^i_{B^i_n}(n)$ is not bounded by any polynomial of $n$. The set of natural numbers can be partitioned into subsets $Q_0, Q_1, Q_2, \dots, Q_i, \dots$
such that for any $i \in \mathbb{N}$, $m^i_{B^i_n}(n)$ is not bounded by any polynomial of $n$ when $n$ ranges over $Q_i$. Obviously, there is a description $M \in \text{n.u.co-NP}$ such that
\[
M(1^n, A) = \begin{cases}
1 & \text{if } \forall y \in \mathbb{B}^n\ A(y) = B^i_n(y), \text{ where } i \text{ is the number such that } n \in Q_i;\\
0 & \text{otherwise.}
\end{cases}
\]
Let us prove that there are no $N \in \mathrm{n.u.NP}$ and $i \in \mathbb{N}$ such that $M^A = N^A$ for all $A \in V_i$. Suppose the contrary: such $N$, $i$ exist. Let $p(n)$ denote the polynomial bounding the size of the set $\mathrm{Query}_N(1^n, A)$. Then for all $A \in V_i$ and all $n \in Q_i$,
\[
A|\mathbb{B}^n = B^i_n|\mathbb{B}^n \iff M(1^n, A) = 1 \iff N(1^n, A) = 1.
\]
In particular, $N(1^n, B^i_n) = 1$. Pick an accepting computation $c$ of $N$ on input $(1^n, B^i_n)$ and set $W = \mathrm{Query}_N(c, B^i_n)$. Then $\mathrm{size}(W) \le p(n)$ and $N(1^n, A) = 1$ for all $A \in \Gamma(B^i_n|W)$. Therefore, $A|\mathbb{B}^n = B^i_n|\mathbb{B}^n$ for all $A \in V_i \cap \Gamma(B^i_n|W)$, that is, $W$ $n$-isolates $B^i_n$ in $V_i$. Consequently, $m^i_{B^i_n}(n) \le p(n)$ for all $n \in Q_i$. This contradiction proves the theorem. □

Theorem 48 ([38]). Corollary 46 cannot be proved by the universum method. That is, there exists no superuniversum $\mathcal{V}$ such that (a') and (b') hold for $\mathcal{K} = \mathrm{PSPACE}$, $\mathcal{L} = \mathrm{P}$ and $\mathcal{M} = \text{co-NP}$, $\mathcal{N} = \mathrm{R}$.

Proof. Assume that $T$ is a decision tree in variables $A(x)$, $x \in \mathbb{B}^{\le m}$, for some $m$. Let $A \in V$. We say that $T$ $n$-identifies $A$ in $V$ if the set of values of $A$ learned by $T$ in the computation on input $A^{\le m}$, $\mathrm{Query}_T(A^{\le m})$, $n$-isolates $A$ in $V$. A
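universum $V$ is called identifiable if there are a family $T_n$, $n \in \mathbb{N}$, of decision trees and a polynomial $p$ such that (i) $T_n$ is a tree in variables $A(x)$, $|x| \le p(n)$, (ii) the height of $T_n$ is at most $p(n)$, and (iii) $T_n$ $n$-identifies $A$ in $V$ for all $n \in \mathbb{N}$ and all $A \in V$.

We say that a universum $V$ is randomly identifiable if there are a family $T_{n,r}$, $n \in \mathbb{N}$, $r \in \mathbb{B}^{\mathrm{poly}(n)}$, of decision trees and a polynomial $p$ such that (i) $T_{n,r}$ is a decision tree in variables $A(x)$, $x \in \mathbb{B}^{\le p(n)}$, (ii) the height of $T_{n,r}$ is at most $p(n)$, and (iii) $\mathrm{Prob}[T_{n,r}\ n\text{-identifies } A \text{ in } V] \ge 1/2$ (with respect to the uniform distribution in $r$'s) for all $n \in \mathbb{N}$ and all $A \in V$.

Obviously ($V$ is identifiable) $\Rightarrow$ ($V$ is randomly identifiable) $\Rightarrow$ ($V$ is thin).

Lemma 25. Any randomly identifiable universum is identifiable.

Proof. Assume that $V$ is randomly identifiable. Let $T_{n,r}$ be a family of decision trees and $p$ a polynomial satisfying (i), (ii) and (iii) above.

For an $n \in \mathbb{N}$, let $Z_n$ denote the set $\{B|\mathbb{B}^{\le n} \mid B \in V\}$. We claim that $|Z_n| \le 2^{\mathrm{poly}(n)}$. Indeed, for all $n \in \mathbb{N}$ and all $B \in V$, there is a set $W \subseteq \mathbb{B}^{\le p(n)}$ having at most $p(n)$ elements and such that $C|\mathbb{B}^{\le n} = B|\mathbb{B}^{\le n}$ for any $C \in V \cap \Gamma(B|W)$. Thus, $|Z_n|$ is not greater than the number of elements in the set
\[
\{B|W \mid W \subseteq \mathbb{B}^{\le p(n)},\ |W| \le p(n),\ B \in \Omega\}.
\]
The number of such sets $W$ does not exceed $(2^{p(n)+1})^{p(n)}$, and each $W$ admits at most $2^{p(n)}$ restrictions $B|W$. Thus,
\[
|Z_n| \le (2^{p(n)+1})^{p(n)}\, 2^{p(n)} = 2^{\mathrm{poly}(n)}.
\]
For a partial function $\alpha\colon \mathbb{B}^* \to \mathbb{B}$, let $\bar{\alpha}$ denote the oracle
\[
\bar{\alpha}(z) = \begin{cases}
\alpha(z) & \text{if } z \in \mathrm{Dom}(\alpha),\\
0 & \text{if } z \in \mathbb{B}^* \setminus \mathrm{Dom}(\alpha).
\end{cases}
\]
Obviously, for any $A$, $n$ and $r$,
\[
T_{n,r}\ n\text{-identifies } A \text{ in } V \iff T_{n,r}\ n\text{-identifies } \overline{A|\mathbb{B}^{\le p(n)}} \text{ in } V. \tag{65}
\]
Therefore, we have
\[
\mathrm{Prob}[T_{n,r}\ n\text{-identifies } \bar{\alpha} \text{ in } V] \ge 1/2
\]
for any $\alpha \in Z_{p(n)}$. We conclude that
\[
\mathrm{Prob}[T_{n,r}\ n\text{-identifies } \bar{\alpha} \text{ in } V] \ge 1/2
\]
with respect to the uniform distribution in $r$'s and in $\alpha \in Z_{p(n)}$. Therefore, there exists $r_1$ such that
\[
\mathrm{Prob}[T_{n,r_1}\ n\text{-identifies } \bar{\alpha} \text{ in } V] \ge 1/2
\]
with respect to the uniform distribution in $\alpha \in Z_{p(n)}$. Let
\[
Z' = \{\alpha \in Z_{p(n)} \mid T_{n,r_1} \text{ does not } n\text{-identify } \bar{\alpha} \text{ in } V\}.
\]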
Reasoning in a similar way, we can prove that there exists $r_2$ such that $\mathrm{Prob}[T_{n,r_2}\ n\text{-identifies } \bar{\alpha} \text{ in } V] \ge 1/2$
with respect to the uniform distribution in $\alpha \in Z'$. Let
\[
Z'' = \{\alpha \in Z' \mid T_{n,r_2} \text{ does not } n\text{-identify } \bar{\alpha} \text{ in } V\},
\]
and so on. We thus define the words $r_1, r_2, \dots, r_i, \dots$ and the sets $Z', Z'', \dots, Z^{(i)}, \dots$. Evidently, $|Z^{(i+1)}| \le (1/2)|Z^{(i)}|$ for all $i$; therefore for some polynomial $t(n)$ we get $Z^{(t(n))} = \emptyset$. Thus for any $\alpha \in Z_{p(n)}$ there exists an $i \le t(n)$ such that $T_{n,r_i}$ $n$-identifies $\bar{\alpha}$ in $V$. By (65), this means that for any $A \in V$ there exists $i \le t(n)$ such that $T_{n,r_i}$ $n$-identifies $A$ in $V$.

The decision tree $T_n$ that $n$-identifies $A \in V$ in $V$ works as follows. It just executes $T_{n,r_1}, \dots, T_{n,r_{t(n)}}$. The height of $T_n$ is $t(n)p(n)$, and its input variables are $A(x)$, $|x| \le p(n)$. □

Let us continue the proof of the theorem. Consider two cases.

First case: there exists a randomly identifiable universum $V \in \mathcal{V}$. Then by the above lemma the universum $V$ is identifiable. Therefore, for any $K \in \mathrm{n.u.PSPACE}$ there is an $L \in \mathrm{n.u.P}$ such that $K(x,A) = L(x,A)$ for all $x$ and all $A \in V$ (first identify the oracle $A$ in $V$, and then find $K(x,A)$). Therefore, (a') is false for $\mathcal{K} = \mathrm{PSPACE}$, $\mathcal{L} = \mathrm{P}$.

Second case: all universums $V \in \mathcal{V}$ are not randomly identifiable. Let us prove that (b') is false for $\mathcal{M} = \text{co-NP}$, $\mathcal{N} = \mathrm{R}$. Let $V_0, V_1, \dots, V_i, \dots$ be an enumeration of $\mathcal{V}$. Let $m^i_B(n)$ denote the least $s \in \mathbb{N}$ such that there exists a family $T_r$, $r \in \mathbb{B}^s$, of decision trees of height at most $s$ in variables $A(x)$, $|x| \le s$, such that
\[
\mathrm{Prob}[T_r\ n\text{-identifies } B \text{ in } V_i] \ge 1/2.
\]
We know that for any $i$ there exists a sequence $\{B^i_n\}$, $n \in \mathbb{N}$, of oracles in $V_i$ such that the function $n \mapsto m^i_{B^i_n}(n)$ is not bounded by any polynomial of $n$. Obviously, the set $\mathbb{N}$ can be partitioned into subsets $Q_0, Q_1, \dots, Q_i, \dots$ such that for each $i \in \mathbb{N}$ the function $m^i_{B^i_n}(n)$ is not bounded by any polynomial of $n$ when $n$ ranges over $Q_i$.

Obviously, there exists a description $M \in \text{n.u.co-NP}$ such that
\[
M(1^n, A) = 1 \iff A|\mathbb{B}^n = B^i_n|\mathbb{B}^n
\]
for any $i \in \mathbb{N}$, any $n \in Q_i$ and any $A$.
Assume that there are a description $N$ in n.u.R and $i \in \mathbb{N}$ such that $M(x,A) = N(x,A)$ for all $x$ and all $A \in V_i$. Let $q(n)$ be the length of random strings used by $N$ on inputs of the form $(1^n, A)$ ($A \in \Omega$). For any $r \in \mathbb{B}^{q(n)}$, let $N(1^n, r, A)$ denote the output of $N$ on input $(1^n, A)$ and random input $r$, and $\mathrm{Query}_N(1^n, r, A)$ the set of queries to $A$ made by $N$ during the work on the input $(1^n, A)$ and the random input $r$. Then
\[
A|\mathbb{B}^n \ne B^i_n|\mathbb{B}^n \Rightarrow M(1^n, A) = 0 \Rightarrow \mathrm{Prob}_{r \in \mathbb{B}^{q(n)}}[N(1^n, r, A) = 1] = 0,
\]
\[
A|\mathbb{B}^n = B^i_n|\mathbb{B}^n \Rightarrow M(1^n, A) = 1 \Rightarrow \mathrm{Prob}_{r \in \mathbb{B}^{q(n)}}[N(1^n, r, A) = 1] \ge 1/2,
\]
for any $A \in V_i$ and any $n \in Q_i$. In particular,
\[
\mathrm{Prob}_{r \in \mathbb{B}^{q(n)}}[N(1^n, r, B^i_n) = 1] \ge 1/2
\]
for all $n \in Q_i$. The first of the two implications above shows that if $A \in V_i$ and $A|\mathbb{B}^n \ne B^i_n|\mathbb{B}^n$, then $N(1^n, r, A) = 0$ for all $r \in \mathbb{B}^{q(n)}$. Thus, the set $\mathrm{Query}_N(1^n, r, B^i_n)$ $n$-isolates $B^i_n$ in $V_i$ for any $r \in \mathbb{B}^{q(n)}$ such that $N(1^n, r, B^i_n) = 1$. Therefore $m^i_{B^i_n}(n)$ grows polynomially when $n$ ranges over $Q_i$. This contradiction completes the proof. □
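The halving argument in the proof of Lemma 25 can be tried out on a finite toy instance. The sketch below is an illustration only: the names `choose_seeds` and `toy_identifies` are hypothetical, "seeds" play the role of the random strings $r$, "items" play the role of the elements of $Z_{p(n)}$, and the toy predicate is an assumption chosen so that every item is identified by at least half of the seeds. Under that hypothesis an averaging argument gives a seed that identifies at least half of the remaining items, so the loop stops after at most $\lfloor \log_2 |Z| \rfloor + 1$ rounds.

```python
def choose_seeds(items, seeds, identifies):
    """Greedy version of the halving argument.

    Repeatedly pick a seed identifying the largest number of the items that
    are still unidentified, and remove those items.
    """
    remaining = set(items)
    chosen = []
    while remaining:
        best = max(seeds,
                   key=lambda r: sum(1 for a in remaining if identifies(r, a)))
        covered = {a for a in remaining if identifies(best, a)}
        if not covered:
            raise ValueError("no seed identifies any remaining item")
        chosen.append(best)
        remaining -= covered
    return chosen


def toy_identifies(r, a):
    """Toy predicate (an assumption): seed r identifies item a when a == 0
    or the bitwise AND of r and a has odd parity. Every item is then
    identified by at least half of the 16 seeds."""
    return a == 0 or bin(r & a).count("1") % 2 == 1


items = range(16)
seeds = range(16)
chosen = choose_seeds(items, seeds, toy_identifies)
```

Running the seeds in `chosen` one after another corresponds to the tree $T_n$ that executes $T_{n,r_1}, \dots, T_{n,r_{t(n)}}$; here 16 items require at most 5 rounds.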
9. Relations between complexity classes relativized with a random oracle

The study of complexity theory relative to a random oracle was initiated in [8]. It was proved that $\mathrm{P}^A \ne \mathrm{NP}^A \ne \text{co-NP}^A$ for a random $A$. Another result of [8] states that, for a random $A$, there exists an infinite $\mathrm{P}^A$-immune $\mathrm{NP}^A$-set. (The relevant notions are defined below; see Definition 15.)

Let us look at these results from the point of view of the analogy between recursion theory and complexity theory. According to this analogy, P-sets correspond to decidable sets, NP-sets to recursively enumerable sets, and co-NP-sets to complements of recursively enumerable sets. More precisely, we shall consider complexity theory relative to a random oracle. Then decidable sets correspond to $\mathrm{P}^A$-sets, recursively enumerable sets to $\mathrm{NP}^A$-sets and complements of recursively enumerable sets to $\text{co-NP}^A$-sets, where $A$ is a random oracle. Thus, analogs of the following theorems are true in complexity theory: the theorem on the existence of a recursively enumerable undecidable set and the theorem on the existence of a recursively enumerable set whose complement is not recursively enumerable. The analog of the theorem that any infinite recursively enumerable set has an infinite decidable subset is false.

In this section we show that analogs of the following theorems are true: the theorem on the existence of recursively enumerable inseparable sets and the theorem on the existence of a simple set (a recursively enumerable set whose complement is infinite but has no infinite recursively enumerable subsets). We show also that the analog of the theorem on separability of sets having recursively enumerable complements is false.

It is not known whether $\mathrm{NP}^A \cap \text{co-NP}^A = \mathrm{P}^A$ is true for a random $A$ (this is analogous to Post's theorem, which asserts that a set $X$ is decidable whenever both $X$ and its complement are recursively enumerable).
According to [9], a positive answer would imply that $\mathrm{AM} \cap \text{co-AM} = \mathrm{BPP}$, and therefore that the problem of graph isomorphism is in BPP. Thus, there is no hope of proving that $\mathrm{NP}^A \cap \text{co-NP}^A = \mathrm{P}^A$ for a random $A$. On the other hand, no absolute corollary of the assertion "$\mathrm{NP}^A \cap \text{co-NP}^A \ne \mathrm{P}^A$ for a random $A$" is known. Thus one may hope to prove that $\mathrm{NP}^A \cap \text{co-NP}^A \ne \mathrm{P}^A$ for a random $A$.

All the results cited above are presented in Table 3. The analogous results on generic oracles are also shown there for comparison.

Definition 15. A language $L$ is called $\mathcal{C}$-immune if $L$ is infinite but no infinite subset of $L$ belongs to $\mathcal{C}$.

Let $S(A)$ be a property of an oracle $A$. We say that $S(A)$ holds for a random $A$ (or for almost all $A$) if the uniform measure of the set $\{A \mid S(A)\}$ is equal to 1.

Theorem 49 ([54]). Relative to a random oracle, there are $L \in \mathrm{P}$ and an NP-set $L_1 \subseteq L$ such that both $L_1$ and $L \setminus L_1$ are infinite, $L_1$ has no infinite co-NP-subsets and $L \setminus L_1$ has no infinite NP-subsets.

Proof. Define a sequence $\{t_i\}$ of integers by induction: $t_0 = 1$, $t_{i+1} = 2^{t_i}$. Let $i$ be a natural number. An $i$-block is a set of the form
\[
B_w = \{wv \mid v \in \mathbb{B}^*,\ |v| = \log_2 t_i\},
\]
where $w$ is a binary word of length $t_i$. Thus, every $i$-block consists of $t_i$ words (of length $t_i + \log_2 t_i$).

Table 3

                                   Relative to      Relative to      Recursion
                                   random oracle    generic oracle   theory
  P ≠ NP                                 +                +              +
  NP ≠ co-NP                             +                +              +
  NP-sets are P-inseparable              +                ?              +
  co-NP-sets are P-inseparable           +                +              -
  P = NP ∩ co-NP                         ?                ?              +
  NP has P-immune sets                   +                -              -
  NP has co-NP-immune sets               +                -              -
  co-NP has NP-immune sets               +                +              +

Let $L = \{1^{t_i} \mid i \in \mathbb{N}\}$. Let $B$ be a block. We say that an oracle $A$ is identically zero in $B$ if $\forall u \in B\ A(u) = 0$. For any oracle $A$ let
\[
L_1^A = \{1^{t_i} \mid i \in \mathbb{N} \text{ and } A \text{ is identically zero in some } i\text{-block}\}, \qquad L_0^A = L \setminus L_1^A.
\]
Obviously, $L$ is in $\mathrm{P}^A$ and $L_1^A$ is in $\mathrm{NP}^A$ for all $A$. Both languages $L_0^A$ and $L_1^A$ are infinite for almost all $A$, because
\[
\mathrm{Prob}[1^{t_i} \in L_0^A] = 1 - \mathrm{Prob}[1^{t_i} \in L_1^A] = (1 - 2^{-t_i})^{2^{t_i}} \to e^{-1}
\]
as $i \to \infty$. Let us prove that
(1) $L_0^A$ has no infinite $\mathrm{NP}^A$-subsets for almost all $A$, and
(2) $L_1^A$ has no infinite $\text{co-NP}^A$-subsets for almost all $A$.

We claim that instead of (1) it suffices to prove the assertion
(1') there exists $c < 1$ such that for any nondeterministic machine $N$, the probability of the event "$L_N^A$ is infinite and $L_N^A \subseteq L_0^A$" is less than $c$,
and that, instead of (2), it suffices to prove
(2') there exists $c < 1$ such that for any nondeterministic machine $N$, the probability of the event "$\mathbb{B}^* \setminus L_N^A$ is infinite and $(\mathbb{B}^* \setminus L_N^A) \subseteq L_1^A$" is less than $c$.

We first prove the implication (1') $\Rightarrow$ (1). Assume that (1') is true but the probability of the event "$L_0^A$ has an infinite $\mathrm{NP}^A$-subset" is positive. Then there exists a nondeterministic machine $N$ such that the probability of the event "$L_N^A$ is infinite and $L_N^A \subseteq L_0^A$" is positive. A simple theorem from measure theory states that if a set $S$ of oracles has positive measure and $c < 1$, then there exists an interval $\Gamma$ such that
\[
\mathrm{Prob}[A \in S \mid A \in \Gamma] > c. \tag{66}
\]
Applying this theorem to the set
\[
S = \{A \mid L_N^A \text{ is infinite and } L_N^A \subseteq L_0^A\},
\]
we find an interval $\Gamma$ such that (66) is true. Choose $w_1, \dots, w_j \in \mathbb{B}^*$, $b_1, \dots, b_j \in \mathbb{B}$ such that
\[
\Gamma = \{A \mid A(w_1) = b_1, \dots, A(w_j) = b_j\}.
\]
Given an oracle $A$, let us define a new oracle $A'$ as follows:
\[
A'(u) = \begin{cases}
b_l & \text{if } u = w_l, \text{ where } l \le j,\\
A(u) & \text{if } u \notin \{w_1, \dots, w_j\}.
\end{cases}
\]
It is easy to see that
\[
\mathrm{Prob}[A' \in S] = \mathrm{Prob}[A \in S \mid A \in \Gamma] > c.
\]
Let $k = \max_{m \le j} |w_m|$. We can easily construct a nondeterministic machine $N_1$ such that
\[
N_1^A(x) = \begin{cases}
0 & \text{if } |x| + \log_2 |x| \le k,\\
N^{A'}(x) & \text{otherwise.}
\end{cases}
\]
We claim that if $A' \in S$, then $L_{N_1}^A$ is infinite and $L_{N_1}^A \subseteq L_0^A$. Indeed, suppose that $A'$ is in $S$, in other words, that $L_N^{A'}$ is infinite and $L_N^{A'} \subseteq L_0^{A'}$. Then, obviously, $L_{N_1}^A$ is infinite. To prove the inclusion $L_{N_1}^A \subseteq L_0^A$, suppose that $N_1^A(x) = 1$. Then $|x| + \log_2 |x| > k$ and $N^{A'}(x) = 1$. It follows that $x \in L_0^{A'}$. Consequently, $x \in L_0^A$ (because $|x| + \log_2 |x| > k$), and our claim is proved. We have, therefore,
\[
\mathrm{Prob}[L_{N_1}^A \text{ is infinite and } L_{N_1}^A \subseteq L_0^A] \ge \mathrm{Prob}[A' \in S] > c.
\]
This contradiction proves the implication (1') $\Rightarrow$ (1).

The implication (2') $\Rightarrow$ (2) is proved in a similar way.

We now have to prove (1') and (2'). Let $N$ be a nondeterministic machine. We claim that the probability of either of the two events
\[
\text{"} L_N^A \text{ is infinite and } L_N^A \subseteq L_0^A \text{"}, \tag{67}
\]
\[
\text{"} \mathbb{B}^* \setminus L_N^A \text{ is infinite and } \mathbb{B}^* \setminus L_N^A \subseteq L_1^A \text{"} \tag{68}
\]
does not exceed 0.9.

Obviously, if $L_N^A$ is infinite and $L_N^A \subseteq L_0^A$, then there are infinitely many $i$ such that $N^A(1^{t_i}) = 1$. Therefore, to prove that $\mathrm{Prob}[(67)] \le 0.9$ it suffices to prove the implication
\[
\mathrm{Prob}[\exists^\infty i\ N^A(1^{t_i}) = 1] > 0.9 \tag{69}
\]
\[
\Rightarrow\ \mathrm{Prob}[\exists i\ (N^A(1^{t_i}) = 1\ \&\ 1^{t_i} \in L_1^A)] > 0.1, \tag{70}
\]
where $\exists^\infty i$ means "there exist infinitely many $i$ such that". Similarly, to prove that $\mathrm{Prob}[(68)] \le 0.9$ it suffices to prove the implication
\[
\mathrm{Prob}[\exists^\infty i\ N^A(1^{t_i}) = 0] > 0.9 \tag{71}
\]
\[
\Rightarrow\ \mathrm{Prob}[\exists i\ (N^A(1^{t_i}) = 0\ \&\ 1^{t_i} \in L_0^A)] > 0.1. \tag{72}
\]
We shall prove these two implications in a unified way. Namely, we prove that for any $\delta \in \{0, 1\}$,
\[
\mathrm{Prob}[\exists^\infty i\ N^A(1^{t_i}) = \delta] > 0.9 \tag{73}
\]
\[
\Rightarrow\ \mathrm{Prob}[\exists i\ (N^A(1^{t_i}) = \delta\ \&\ 1^{t_i} \in L_\delta^A)] > 0.1. \tag{74}
\]
Indeed, the assumption (73) implies that
\[
\sum_{i=k}^{\infty} \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ N^A(1^{t_{i-1}}) \ne \delta,\ N^A(1^{t_{i-2}}) \ne \delta, \dots, N^A(1^{t_k}) \ne \delta] > 0.9 \tag{75}
\]
for all $k \in \mathbb{N}$. Let $P_{ik\delta}(A)$ denote the event
\[
N^A(1^{t_{i-1}}) \ne \delta,\ N^A(1^{t_{i-2}}) \ne \delta, \dots, N^A(1^{t_k}) \ne \delta.
\]

Lemma 26. For sufficiently large $k$ and for all $i \ge k$, we have
\[
\mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] \ge (1/3)\, \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] - \epsilon_i, \tag{76}
\]
where $\epsilon_i$ is a sequence such that $\sum_{i=1}^{\infty} \epsilon_i$ converges.

We shall prove the lemma later. Let us now continue the proof of the theorem. Summing (76) over $i \ge k$, we get
\[
\sum_{i=k}^{\infty} \mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] \ge \frac{1}{3} \sum_{i=k}^{\infty} \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] - \sum_{i=k}^{\infty} \epsilon_i. \tag{77}
\]
Combining (75) and (77), we get
\[
\sum_{i=k}^{\infty} \mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] > 0.3 - \sum_{i=k}^{\infty} \epsilon_i. \tag{78}
\]
The events whose probabilities are summed on the left-hand side of (78) are pairwise disjoint and are included in the event
\[
\exists i\ (N^A(1^{t_i}) = \delta\ \&\ 1^{t_i} \in L_\delta^A). \tag{79}
\]
Therefore, the probability of (79) is greater than $0.3 - \sum_{i=k}^{\infty} \epsilon_i$. Taking $k$ so large that $\sum_{i=k}^{\infty} \epsilon_i < 0.2$, we get (74). Thus the proof of the theorem is complete. However, it remains to prove Lemma 26. □

Proof of Lemma 26. Fix a large number $k$ (to be specified later). Let $i \ge k$. Let $D_i$ denote the set of all binary words of length less than $t_i + \log_2 t_i$, and $F_i$ the set of all functions from $D_i$ into $\mathbb{B}$. For every $j$, the event $N^A(1^{t_j}) \ne \delta$ depends only on the values of $A$ on words of length bounded by a polynomial in $t_j$ (because the length of the questions put by $N$ to the oracle on input $1^{t_j}$ is bounded by a polynomial in $t_j$). As $t_i = 2^{t_{i-1}}$, it follows that we can choose $k$ so that the event $P_{ik\delta}(A)$ depends only on $A|D_i$ for all $i \ge k$. This is the choice of $k$ we referred to. ($f|S$ denotes the restriction of a function $f$ to the set $S$.)

We claim that, for all $f \in F_i$,
\[
\mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta \mid A|D_i = f] \ge \frac{1}{3}\, \mathrm{Prob}[N^A(1^{t_i}) = \delta \mid A|D_i = f] - \epsilon_i, \tag{80}
\]
where $\{\epsilon_i\}$ is a sequence such that $\sum_i \epsilon_i$ converges.

Let us first prove that this implies (76). Assume that (80) is true for any $f \in F_i$. Then, multiplying (80) by $\mathrm{Prob}[A|D_i = f]$, we obtain
\[
\mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta\ \&\ A|D_i = f] \ge \frac{1}{3}\, \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ A|D_i = f] - \epsilon_i\, \mathrm{Prob}[A|D_i = f]. \tag{81}
\]
Summing (81) over all $f \in F_i$ such that $P_{ik\delta}(A)$ is true whenever $A|D_i = f$, we obtain
\[
\mathrm{Prob}[1^{t_i} \in L_\delta^A\ \&\ N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)]
\ge \frac{1}{3}\, \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] - \epsilon_i\, \mathrm{Prob}[P_{ik\delta}(A)]
\ge \frac{1}{3}\, \mathrm{Prob}[N^A(1^{t_i}) = \delta\ \&\ P_{ik\delta}(A)] - \epsilon_i.
\]
Thus we get (76).

It remains to prove that (80) is true for all $f \in F_i$. Fix an $f \in F_i$. Obviously, the number of queries put by $N$ to the oracle during possible computations on input $1^{t_i}$ is bounded by a polynomial in $t_i$. Let $q(t_i)$ denote this polynomial. To proceed further we have to handle the cases $\delta = 0$ and $\delta = 1$ separately.

Lemma 27.
\[
\mathrm{Prob}[1^{t_i} \in L_1^A\ \&\ N^A(1^{t_i}) = 1 \mid A|D_i = f] \ge \bigl(1 - (1 - 2^{-t_i})^{2^{t_i} - q(t_i)}\bigr)\, \mathrm{Prob}[N^A(1^{t_i}) = 1 \mid A|D_i = f]. \tag{82}
\]

Proof. Let $r = q(t_i)$. An $r$-neighborhood is a set of oracles of the form
\[
\{A \mid A|D_i = f,\ A|B_1 = f_1,\ A|B_2 = f_2, \dots, A|B_r = f_r,\ A|E = g\}, \tag{83}
\]
where $B_1, \dots, B_r$ are $i$-blocks, $f_1, \dots, f_r$ are functions from $B_1, \dots, B_r$, respectively, into $\mathbb{B}$, $E$ is a finite set disjoint from $D_i$ and all the $i$-blocks, and $g$ is a function from $E$ into $\mathbb{B}$.

Let $W = \{A \mid N^A(1^{t_i}) = 1,\ A|D_i = f\}$.

Lemma 28. The set $W$ can be represented as a finite union of $r$-neighborhoods.

Proof. First we prove that the set $W$ can be represented as a finite union of intervals of the form
\[
\{A \mid A(w_1) = b_1, \dots, A(w_r) = b_r,\ A|D_i = f\}.
\]
Let $A$ be an oracle such that $A|D_i = f$. Assume that $N^A(1^{t_i}) = 1$. Fix an accepting computation of $N^A$ on input $1^{t_i}$. Let the oracle be queried about the value on the words $w_1, \dots, w_r$. Then the set $W$ contains the whole interval
\[
\{B \mid B(w_1) = A(w_1), \dots, B(w_r) = A(w_r),\ B|D_i = f\}.
\]
Since the number of possible computations of $N$ on input $1^{t_i}$ with all possible oracles is finite, we are done.
Thus, the set $W$ can be represented as a finite union of intervals of the form $\{A \mid A(w_1) = b_1, \dots, A(w_r) = b_r,\ A|D_i = f\}$. Obviously any interval of this form is a finite union of $r$-neighborhoods. □

Let $\Gamma_1, \dots, \Gamma_n$ be $r$-neighborhoods such that $W = \Gamma_1 \cup \dots \cup \Gamma_n$. We have to prove that
\[
\mathrm{Prob}[1^{t_i} \in L_1^A \mid A \in \Gamma_1 \cup \dots \cup \Gamma_n] \ge 1 - (1 - 2^{-t_i})^{2^{t_i} - r}. \tag{84}
\]
An $r$-neighborhood (83) is said to be bad if $f_l$ is identically zero for some $l \le r$. Every oracle $A$ from a bad $r$-neighborhood satisfies $1^{t_i} \in L_1^A$. Therefore, if we remove all bad $r$-neighborhoods from $\Gamma_1 \cup \dots \cup \Gamma_n$, then the probability
\[
\mathrm{Prob}[1^{t_i} \in L_1^A \mid A \in \Gamma_1 \cup \dots \cup \Gamma_n]
\]
will not increase. Hence, it suffices to prove (84) in the case when all $\Gamma_1, \dots, \Gamma_n$ are not bad.

Clearly, it suffices to prove that
\[
\mathrm{Prob}[A \text{ is identically zero in some } i\text{-block} \mid A \in \Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1})] \ge 1 - (1 - 2^{-t_i})^{2^{t_i} - r} \tag{85}
\]
for all $m \le n$ such that $\Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1})$ is not empty. Fix $m \le n$. Represent $\Gamma_m$ in the form (83). Let $C_1, \dots, C_{2^{t_i} - r}$ denote all the $i$-blocks that do not belong to the set $\{B_1, \dots, B_r\}$. For any $l \le 2^{t_i} - r$ let
\[
p_l = \mathrm{Prob}[A \text{ is identically zero in } C_l \mid A \in \Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1}) \text{ and } A \text{ is not identically zero in any of the blocks } C_1, \dots, C_{l-1}]. \tag{86}
\]
Of course, it may happen that some of the $p_l$'s are undefined, because the set
\[
\{A \in \Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1}) \mid A \text{ is not identically zero in any of the blocks } C_1, \dots, C_{l-1}\}
\]
is empty. This means that every $A \in \Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1})$ is identically zero in some of the blocks $C_1, \dots, C_{l-1}$. Then the probability on the left-hand side of (85) is equal to 1, and the inequality (85) is automatic. In general, the probability on the left-hand side of (85) is equal to
\[
1 - (1 - p_1)(1 - p_2) \cdots (1 - p_{2^{t_i} - r}).
\]
Thus to prove Lemma 27 it suffices to prove the following.

Lemma 29. $p_l \ge 2^{-t_i}$ for any $l \le 2^{t_i} - r$ such that $p_l$ is defined.

Proof. Fix $l \le 2^{t_i} - r$ such that $p_l$ is defined. For any oracle $A$, define a new oracle $A'$ as follows:
\[
A'(u) = \begin{cases}
0 & \text{if } u \in C_l,\\
A(u) & \text{otherwise.}
\end{cases}
\]
A set of oracles $U$ is said to be monotone if $A \in U$ implies $A' \in U$. We claim that the set
\[
U = \{A \mid A \in \Gamma_m \setminus (\Gamma_1 \cup \dots \cup \Gamma_{m-1}) \text{ and } A \text{ is not identically zero in any of the blocks } C_1, \dots, C_{l-1}\}
\]
is monotone. Recall that
\[
\Gamma_m = \{A \mid A|D_i = f,\ A|B_1 = f_1, \dots, A|B_r = f_r,\ A|E = g\}
\]
and that $C_l$ does not belong to the set $\{B_1, \dots, B_r\}$. It follows that $\Gamma_m$ is monotone.
Since none of the sets $\Gamma_1, \dots, \Gamma_{m-1}$ is bad, the complement of the set $\Gamma_1 \cup \dots \cup \Gamma_{m-1}$ is also monotone. Obviously, the set $\{A \mid A \text{ is identically zero in none of the blocks } C_1, \dots, C_{l-1}\}$ is monotone. Therefore, $U$ is monotone, being an intersection of monotone sets. We see that Lemma 29 is easily implied by the following result.
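Lemma 30. Any monotone set $V$ of positive measure satisfies

$$\mathrm{Prob}[A \text{ is identically zero in } C_l \mid A \in V] \ge 2^{-t_i}.$$

Proof. Let $V$ satisfy the assumptions of Lemma 30. Let $H$ denote the set of all functions from $C_l$ into $\mathbb{B}$. Let $\mathbf{0}$ stand for the identically zero function in $H$. Let $h$ be an arbitrary function in $H$. We claim that there exists a measure-preserving one-to-one function from the set $\mathcal{B} = \{A \in V \mid A|C_l = h\}$ into the set $\mathcal{D} = \{A \in V \mid A|C_l = \mathbf{0}\}$. Indeed, let $\pi$ be a permutation of $H$ such that $\pi(h) = \mathbf{0}$. Then the map $A \mapsto A''$, where

$$A''(u) = \begin{cases} \pi(A|C_l)(u) & \text{if } u \in C_l, \\ A(u) & \text{otherwise,} \end{cases}$$

takes $\mathcal{B}$ into $\mathcal{D}$, because $A'' = A'$ for $A \in \mathcal{B}$ and $V$ is monotone. Thus

$$\mathrm{Prob}[A \in V \mathbin{\&} A|C_l = h] \le \mathrm{Prob}[A \in V \mathbin{\&} A|C_l = \mathbf{0}]$$

for all $h$. Therefore,

$$\mathrm{Prob}[A \in V] = \sum_{h \in H} \mathrm{Prob}[A \in V \mathbin{\&} A|C_l = h] \le 2^{t_i}\, \mathrm{Prob}[A \in V \mathbin{\&} A|C_l = \mathbf{0}]. \qquad \Box$$

This proves Lemma 29 and Lemma 27. □

It is easy to verify that

$$\lim_{i \to \infty} \left(1 - (1 - 2^{-t_i})^{2^{t_i} - q(t_i)}\right) = 1 - e^{-1} > 1/3.$$

Therefore Lemma 27 implies Lemma 26 in the case $b = 1$ (when we set $\varepsilon_i = 0$). It remains to consider the case $b = 0$. Let $a_i = (1 - 2^{-t_i})^{2^{t_i}} = \mathrm{Prob}[1^{t_i} \in L_0^A]$ and $\varepsilon_i = (1 - 2^{-t_i})^{2^{t_i} - q(t_i)} - a_i$. Obviously, $\varepsilon_i = O(q(t_i) 2^{-t_i})$. Thus $\sum_i \varepsilon_i$ converges. The case $b = 0$ of Lemma 26 is an immediate corollary of the following.

Lemma 31.

$$\mathrm{Prob}[1^{t_i} \in L_0^A \mathbin{\&} N^A(1^{t_i}) = 0 \mid A|D_i = f] \ge a_i\, \mathrm{Prob}[N^A(1^{t_i}) = 0 \mid A|D_i = f] - \varepsilon_i. \tag{87}$$

Proof. By Lemma 27,

$$\mathrm{Prob}[1^{t_i} \in L_1^A,\ N^A(1^{t_i}) = 1 \mid A|D_i = f] \ge (1 - a_i - \varepsilon_i)\, \mathrm{Prob}[N^A(1^{t_i}) = 1 \mid A|D_i = f]. \tag{88}$$

We omit the condition $A|D_i = f$ in the following computations to make them readable. We have

$$\begin{aligned}
\mathrm{Prob}[1^{t_i} \in L_1^A,\ N^A(1^{t_i}) = 0] &= \mathrm{Prob}[1^{t_i} \in L_1^A] - \mathrm{Prob}[1^{t_i} \in L_1^A,\ N^A(1^{t_i}) = 1] \\
&\le 1 - a_i - (1 - a_i - \varepsilon_i)\, \mathrm{Prob}[N^A(1^{t_i}) = 1] \\
&= (1 - a_i)\, \mathrm{Prob}[N^A(1^{t_i}) = 0] + \varepsilon_i\, \mathrm{Prob}[N^A(1^{t_i}) = 1] \\
&\le (1 - a_i)\, \mathrm{Prob}[N^A(1^{t_i}) = 0] + \varepsilon_i, \\
\mathrm{Prob}[1^{t_i} \in L_0^A,\ N^A(1^{t_i}) = 0] &= \mathrm{Prob}[N^A(1^{t_i}) = 0] - \mathrm{Prob}[1^{t_i} \in L_1^A,\ N^A(1^{t_i}) = 0] \\
&\ge \mathrm{Prob}[N^A(1^{t_i}) = 0] - (1 - a_i)\, \mathrm{Prob}[N^A(1^{t_i}) = 0] - \varepsilon_i \\
&= a_i\, \mathrm{Prob}[N^A(1^{t_i}) = 0] - \varepsilon_i.
\end{aligned}$$

Lemma 31 is proved. □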
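The limit $1 - e^{-1}$ used above for the case $b = 1$ can be checked numerically. The following is only a sanity-check sketch: the polynomial $q(t) = t^2$ and the function name `block_hit_probability` are illustrative assumptions, and $t$ is treated as a free parameter rather than as a member of the sequence $\{t_i\}$.

```python
import math


def block_hit_probability(t: int, excluded: int) -> float:
    """Return 1 - (1 - 2**-t) ** (2**t - excluded), computed stably."""
    exponent = 2**t - excluded
    # log1p keeps precision when 2**-t is tiny;
    # expm1 avoids cancellation in 1 - exp(...).
    return -math.expm1(exponent * math.log1p(-(2.0 ** -t)))


limit = 1 - math.exp(-1)
for t in (5, 10, 20, 40):
    q = t * t  # hypothetical polynomial bound q(t) = t^2 (an assumption)
    print(t, block_hit_probability(t, q), limit)
```

For small $t$ the value is far from the limit because $q(t)$ is not yet negligible compared with $2^t$; the bound $> 1/3$ is only claimed in the limit.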
Corollary 50 ([52]). For a random $A$, there is an infinite $NP^A$-set with no infinite co-$NP^A$-subsets.

Corollary 51 ([54]). For a random $A$, there exists an infinite co-$NP^A$-set with no infinite $NP^A$-subsets.

Theorem 52 ([52]). $NP^A$-languages are not $P^A$-separable for a random $A$.

Proof. We use the sequence $\{t_i\}$ defined in the proof of the previous theorem. An $i$-block is a set of the form $B_w = \{wv \mid v \in \mathbb{B}^*,\ |v| = \log_2 t_i\}$, where $w$ is a binary word of length $2t_i$. Each $B_w$ consists of $t_i$ binary words of length $2t_i + \log_2 t_i$. Arrange the $B_w$'s according to the lexicographical order on the $w$'s. Choose a sequence $\{s_i\}$ of natural numbers (to be specified later). The first $s_i$ $i$-blocks are called $i,0$-blocks and the following $s_i$ blocks are called $i,1$-blocks ($s_i$ will be chosen to satisfy the inequality $2s_i \le 2^{2t_i}$). For an oracle $A$, we define the following $NP^A$-languages:

$$L_0^A = \{1^{t_i} \mid i \in \mathbb{N} \text{ and } A \text{ is identically zero in some } i,0\text{-block}\},$$
$$L_1^A = \{1^{t_i} \mid i \in \mathbb{N} \text{ and } A \text{ is identically zero in some } i,1\text{-block}\}.$$

We shall now specify the sequence $\{s_i\}$. We want to do this so that the probability of the event "$1^{t_i} \in L_0^A$" is about $1/i$. Let us find the probability of this event. For any $i$-block $B$, the probability of the event "$A$ is identically zero in $B$" is equal to $2^{-t_i}$. Therefore, we have

$$\mathrm{Prob}[1^{t_i} \in L_0^A] = \mathrm{Prob}[1^{t_i} \in L_1^A] = 1 - (1 - 2^{-t_i})^{s_i}.$$

Set $s_i = \lfloor 2^{t_i}/i \rfloor$. Obviously, we have $1 - (1 - 2^{-t_i})^{s_i} = 1 - e^{-2^{-t_i} s_i (1 + o(1))}$ as $i \to \infty$. Moreover, $2^{-t_i} s_i = 2^{-t_i} \lfloor 2^{t_i}/i \rfloor = \frac{1}{i}(1 + o(1))$. Therefore

$$1 - (1 - 2^{-t_i})^{s_i} = 1 - e^{-\frac{1}{i}(1 + o(1))} = 1 - (1 - 1/i + o(1/i)) = 1/i + o(1/i).$$

We claim that the set $L_0^A \cap L_1^A$ is finite with probability 1. Indeed, the events $1^{t_i} \in L_0^A$ and $1^{t_i} \in L_1^A$ are mutually independent for all $i \in \mathbb{N}$. Therefore

$$\mathrm{Prob}[1^{t_i} \in L_0^A \cap L_1^A] = (1/i + o(1/i))^2 = 1/i^2 + o(1/i^2)$$

for all $i \in \mathbb{N}$. The series $\sum_i (1/i^2 + o(1/i^2))$ converges. By the Borel-Cantelli lemma, with probability 1 there are only finitely many $i$ such that $1^{t_i} \in L_0^A \cap L_1^A$. Let $C^A = L_1^A \setminus L_0^A$.
Then $C^A \in NP^A$ with probability 1 (as $C^A$ is different from the language $L_1^A \in NP^A$ only on a finite number of words for almost all $A$). In addition, $C^A$ is disjoint from $L_0^A$ for all $A$. Therefore, it suffices to prove that $C^A$ and $L_0^A$ are not $P^A$-separable for almost all $A$. To prove this it is sufficient to prove that for every deterministic polynomial-time oracle machine $M$ the probability of the event

$$\exists x \in \mathbb{B}^*\ \left(M^A(x) = 1 \mathbin{\&} x \in L_0^A \ \text{ or } \ M^A(x) = 0 \mathbin{\&} x \in C^A\right)$$

is equal to 1. Fix a deterministic polynomial-time oracle machine $M$. It is sufficient to prove that for almost all $A$ there are infinitely many $i$ such that

$$M^A(1^{t_i}) = 1 \mathbin{\&} 1^{t_i} \in L_0^A \ \text{ or } \ M^A(1^{t_i}) = 0 \mathbin{\&} 1^{t_i} \in L_1^A.$$

Let $P_i(A)$ stand for the displayed event. Note that the events $P_i(A)$ can be dependent for $i \in \mathbb{N}$.
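The estimate $\mathrm{Prob}[1^{t_i} \in L_0^A] = 1/i + o(1/i)$ obtained in this proof for $s_i = \lfloor 2^{t_i}/i \rfloor$, and the convergence of the series of squared probabilities behind the Borel-Cantelli step, can be illustrated numerically. This is a sketch under stated assumptions: the function name `prob_some_block_zero` is hypothetical, and a single fixed value `t = 60` stands in for $t_i$, which is legitimate only because the estimate requires no more than $2^{t}$ being much larger than $i$.

```python
import math


def prob_some_block_zero(t: int, i: int) -> float:
    """1 - (1 - 2**-t) ** s with s = floor(2**t / i), computed stably."""
    s = (2**t) // i
    return -math.expm1(s * math.log1p(-(2.0 ** -t)))


t = 60  # stand-in for t_i (an assumption; the real t_i grows much faster)

# Ratio of the probability to 1/i; it should approach 1 as i grows.
for i in (10, 100, 1000):
    print(i, prob_some_block_zero(t, i) * i)

# Partial sum of the squared probabilities, i.e. of Prob[1^{t_i} in both languages].
partial = sum(prob_some_block_zero(t, i) ** 2 for i in range(1, 10001))
print(partial, math.pi ** 2 / 6)
```

The partial sums stay below $\sum_i 1/i^2 = \pi^2/6$, which is the convergence that the Borel-Cantelli lemma needs.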
We claim that the series

$$\sum_{i=k+1}^{\infty} \mathrm{Prob}[P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \neg P_{i-2}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_{k+1}(A) \mathbin{\&} \neg P_k(A)] \tag{89}$$

diverges for all $k \in \mathbb{N}$. Assume this claim is true. Then for almost all $A$, there are infinitely many $i$ such that $P_i(A)$. Indeed,

$$\begin{aligned}
\mathrm{Prob}[\exists i \ge k\ P_i(A)] &= 1 - \mathrm{Prob}[\forall i \ge k\ \neg P_i(A)] \\
&= 1 - \mathrm{Prob}[\neg P_k(A)] \prod_{i=k+1}^{\infty} \mathrm{Prob}[\neg P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)] \\
&= 1 - \mathrm{Prob}[\neg P_k(A)] \prod_{i=k+1}^{\infty} \left(1 - \mathrm{Prob}[P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)]\right)
\end{aligned}$$

for all $k \in \mathbb{N}$. As (89) diverges, we see that the last infinite product is equal to 0. Therefore $\mathrm{Prob}[\exists i \ge k\ P_i(A)] = 1$ for all $k$. Consequently, $\mathrm{Prob}[\forall k\ \exists i \ge k\ P_i(A)] = 1$ (the intersection of a countable family of sets of measure 1 has measure 1).

To prove that the series (89) diverges, we shall prove that for all sufficiently large $i$,

$$\mathrm{Prob}[P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)] > \frac{1}{2i}.$$

Let $D_i$ denote the set of all binary words of length less than $2t_i + \log_2 t_i$, and $F_i$ the set of all functions from $D_i$ into $\mathbb{B}$. For every $j$, the event $P_j(A)$ depends on the values of $A$ on the words of length bounded by a polynomial in $t_j$. As $t_i = 2^{t_{i-1}}$, it follows that for sufficiently large $i$, the event $\neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)$ depends only on $A|D_i$. It suffices to prove, therefore, that for sufficiently large $i$ and for all $f \in F_i$, the conditional probability $\mathrm{Prob}[P_i(A) \mid A|D_i = f]$ is greater than $1/(2i)$.

Fix an $i \in \mathbb{N}$ and an $f \in F_i$. In the sequel, we shall consider only oracles $A$ such that $A|D_i = f$. Run $M$ with oracle $A$ on the input $1^{t_i}$. Assume that the number of queries to the oracle made in this computation is $l$ and the $k$-th query is "$A(u_k) = ?$". Delete from the sequence $u_1, \dots, u_l$ all words of length less than $2t_i + \log_2 t_i$. Let $w_1, \dots, w_j$ denote the resulting sequence. We call the sequence of pairs $\langle w_1, A(w_1) \rangle, \dots, \langle w_j, A(w_j) \rangle$ the computational protocol on $A$, written $C(A)$. Let $\Pi$ be the set of all computational protocols, that is,

$$\Pi = \{C(A) \mid A|D_i = f\}. \tag{90}$$

Assume $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ is a protocol from $\Pi$, where $w_1, \dots, w_j \in \mathbb{B}^*$, $b_1, \dots, b_j \in \mathbb{B}$.
We call an oracle $A$ consistent with $Z$ if $A(w_1) = b_1, \dots, A(w_j) = b_j$. Obviously, $Z$ is a computational protocol on $A$ if and only if $A$ is consistent with $Z$. Consequently, the family of sets

$$\{\, \{A \mid A|D_i = f \text{ and } A \text{ is consistent with } Z\} \mid Z \in \Pi \,\}$$

is a partition of the set $\{A \mid A|D_i = f\}$. It is sufficient to prove, therefore, that

$$\mathrm{Prob}[P_i(A) \mid A \text{ is consistent with } Z,\ A|D_i = f] > \frac{1}{2i} \tag{91}$$

for all $Z \in \Pi$. Fix a $Z \in \Pi$. Obviously, if both $A'$ and $A''$ are consistent with $Z$, then $M^{A'}(1^{t_i}) = M^{A''}(1^{t_i})$. Without loss of generality we may assume that $M^A(1^{t_i}) = 1$ whenever $A$ is consistent with $Z$. Then we have

$$\begin{aligned}
\mathrm{Prob}[P_i(A) \mid A \text{ is consistent with } Z,\ A|D_i = f] &= \mathrm{Prob}[1^{t_i} \in L_0^A \mid A \text{ is consistent with } Z,\ A|D_i = f] \\
&= \mathrm{Prob}[1^{t_i} \in L_0^A \mid A \text{ is consistent with } Z].
\end{aligned}$$

Assume that $Z$ consists of $j$ pairs. Then we have

$$\mathrm{Prob}[1^{t_i} \in L_0^A \mid A \text{ is consistent with } Z] \ge 1 - (1 - 2^{-t_i})^{s_i - j}.$$

(Recall that $1^{t_i} \in L_0^A$ means that $A$ is identically zero in some $i,0$-block and the number of $i,0$-blocks is $s_i$.) As $j \le \mathrm{poly}(t_i)$ and $s_i = \lfloor 2^{t_i}/i \rfloor \ge \lfloor 2^{t_i}/t_i \rfloor$, it follows that

$$1 - (1 - 2^{-t_i})^{s_i - j} = \frac{1}{i}(1 + o(1))$$

as $i \to \infty$. Consequently, for sufficiently large $i$, we have

$$\mathrm{Prob}[P_i(A) \mid A \text{ is consistent with } Z,\ A|D_i = f] > \frac{1}{2i}. \qquad \Box$$

Theorem 53 ([52]). co-$NP^A$-languages are not $P^A$-separable for a random $A$.

Proof. We shall use the notions and notation from the previous proof, with the following changes. We now set $s_i = \lfloor c\, 2^{t_i} \log i \rfloor$, where $c$ is a rational constant (to be specified later). Let

$$L_0^A = \{1^{t_i} \mid i \in \mathbb{N} \text{ and } A \text{ is identically zero in no } i,0\text{-block}\},$$
$$L_1^A = \{1^{t_i} \mid i \in \mathbb{N} \text{ and } A \text{ is identically zero in no } i,1\text{-block}\}.$$

Obviously, $s_i$ can be computed in time $\mathrm{poly}(t_i)$. It follows that $L_0^A \in \text{co-}NP^A$ and $L_1^A \in \text{co-}NP^A$. It is easy to see that

$$\mathrm{Prob}[1^{t_i} \in L_0^A] = \mathrm{Prob}[1^{t_i} \in L_1^A] = (1 - 2^{-t_i})^{s_i} = e^{-2^{-t_i} s_i (1 + o(1))} = e^{-c \log i\, (1 + o(1))} = i^{-c \log e\, (1 + o(1))}.$$

Choose $c$ so that $6/8 < c \log e < 7/8$. Then we have

$$\mathrm{Prob}[1^{t_i} \in L_0^A] = \mathrm{Prob}[1^{t_i} \in L_1^A] < i^{-5/8}$$

for all sufficiently large $i$. Therefore, we obtain

$$\mathrm{Prob}[1^{t_i} \in L_0^A \cap L_1^A] < i^{-5/4}$$

for all sufficiently large $i$. As the series $\sum_i i^{-5/4}$ converges, we see that the set $L_0^A \cap L_1^A$ is finite for almost all $A$. Let $C^A = L_1^A \setminus L_0^A$. We prove that $C^A$ and $L_0^A$ are not $P^A$-separable for almost all $A$. To this end it is sufficient to prove that for every deterministic polynomial-time oracle machine $M$, the probability of the event

$$\exists x \in \mathbb{B}^*\ \left(M^A(x) = 1 \mathbin{\&} x \in L_0^A \ \text{ or } \ M^A(x) = 0 \mathbin{\&} x \in C^A\right)$$

is equal to 1. Fix a deterministic polynomial-time oracle machine $M$. It suffices to prove that for almost all $A$, there are infinitely many $i$ such that

$$M^A(1^{t_i}) = 1 \mathbin{\&} 1^{t_i} \in L_0^A \ \text{ or } \ M^A(1^{t_i}) = 0 \mathbin{\&} 1^{t_i} \in L_1^A.$$
Let $P_i(A)$ stand for the displayed event. It suffices to prove that the series

$$\sum_{i=k+1}^{\infty} \mathrm{Prob}[P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \neg P_{i-2}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_{k+1}(A) \mathbin{\&} \neg P_k(A)] \tag{92}$$

diverges for all $k \in \mathbb{N}$. To prove that (92) diverges we prove that

$$\mathrm{Prob}[P_i(A) \mid \neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)] \ge i^{-7/8} - 2^{-t_i}\,\mathrm{poly}(t_i).$$

Let $D_i$ denote the set of all binary words of length less than $2t_i + \log_2 t_i$ and $F_i$ the set of all functions from $D_i$ into $\mathbb{B}$. For all sufficiently large $i$, the event $\neg P_{i-1}(A) \mathbin{\&} \cdots \mathbin{\&} \neg P_k(A)$ depends only on $A|D_i$. It suffices to show, therefore, that the conditional probability $\mathrm{Prob}[P_i(A) \mid A|D_i = f]$ is greater than $i^{-1} - 2^{-t_i}\,\mathrm{poly}(t_i)$ for all $f \in F_i$.

Fix an $i \in \mathbb{N}$ and an $f \in F_i$. In the sequel, we consider only oracles $A$ such that $A|D_i = f$. We have to prove that

$$\mathrm{Prob}[P_i(A) \mid A|D_i = f] > i^{-1} - 2^{-t_i}\,\mathrm{poly}(t_i).$$

More precisely, we prove the inequality

$$\mathrm{Prob}[\neg P_i(A) \mid A|D_i = f] < 1 - i^{-1} + 2^{-t_i}\,\mathrm{poly}(t_i).$$

The event $\neg P_i(A)$ means that $A$ is identically zero in an $i,(1 - M^A(1^{t_i}))$-block. Let $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ be a computational protocol on $A$. We call an $i$-block $B$ $A$-free [$A$-occupied] if $B$ is an $i,(1 - M^A(1^{t_i}))$-block and $B$ is disjoint from the set $\{w_1, \dots, w_j\}$ [$B$ intersects the set $\{w_1, \dots, w_j\}$, respectively]. Obviously, $\neg P_i(A)$ holds if and only if $A$ is identically zero either in an $A$-free block or in an $A$-occupied block. Let $Q'(A)$ denote the event "$A$ is identically zero in an $A$-free block" and $Q''(A)$ the event "$A$ is identically zero in an $A$-occupied block". We prove that

$$\mathrm{Prob}[Q'(A) \mid A|D_i = f] < 1 - i^{-1}, \tag{93}$$
$$\mathrm{Prob}[Q''(A) \mid A|D_i = f] < 2^{-t_i}\,\mathrm{poly}(t_i) \tag{94}$$

for all sufficiently large $i$.

First we prove inequality (93). It suffices to prove that

$$\mathrm{Prob}[Q'(A) \mid A \text{ is consistent with } Z,\ A|D_i = f] < 1 - i^{-1}$$

for all sufficiently large $i$ and all $Z \in \Pi$. We claim that

$$\mathrm{Prob}[Q'(A) \mid A \text{ is consistent with } Z,\ A|D_i = f] \le 1 - (1 - 2^{-t_i})^{s_i}$$

for all $Z \in \Pi$. Indeed, fix a protocol $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ from $\Pi$. (Recall that the set $\Pi$ is defined by (90).) Without loss of generality we may assume that $M^A(1^{t_i}) = 1$ for $A$ consistent with $Z$. Then $Q'(A)$ means that $A$ is identically zero in an $i,0$-block disjoint from the set $\{w_1, \dots, w_j\}$.
Let $k$ stand for the number of $i,0$-blocks disjoint from the set $\{w_1, \dots, w_j\}$. Then we have

$$\mathrm{Prob}[Q'(A) \mid A \text{ is consistent with } Z] = 1 - (1 - 2^{-t_i})^{k}.$$

As $k \le s_i$, the claim is proved. The definition of $s_i$ implies that $1 - (1 - 2^{-t_i})^{s_i} < 1 - i^{-1}$ for all sufficiently large $i$. The inequality (93) is proved.

Let us prove the inequality (94). (Recall that $Q''(A)$ means that $A$ is identically zero in some $A$-occupied block.) If a word $u$ belongs to an $i$-block, let $B(u)$ denote the $i$-block $u$ belongs to. Let $R(A)$ stand for the event "there are a protocol $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ from $\Pi$ and an $m \le j$ such that the following assertions hold:
(a) $w_m$ belongs to an $i$-block and $A$ is identically zero in $B(w_m)$;
(b) $\forall l < m\ \ w_l \notin B(w_m)$;
(c) $\forall l < m\ \ A(w_l) = b_l$".
The assertion $Q''(A)$ implies $R(A)$. Indeed, assume that $Q''(A)$ holds. Let $Z$ be the computational protocol on $A$ and $m$ the least $m'$ such that $A$ is identically zero in $B(w_{m'})$. All the three assertions (a), (b) and (c) hold for these $Z$, $m$.

Let us prove that $\mathrm{Prob}[R(A) \mid A|D_i = f] \le 2^{-t_i}\,\mathrm{poly}(t_i)$. Fix a protocol $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ and an $m \le j$. We shall prove that

$$\mathrm{Prob}[\text{(a)} \mathbin{\&} \text{(b)} \mathbin{\&} \text{(c)}] \le 2^{-m+1} 2^{-t_i}. \tag{95}$$

Note that the truth value of (b) depends only on $m$ and $Z$. If (b) is false then we are done, because the probability on the left-hand side of (95) is equal to zero. If $w_m$ belongs to no $i$-block, then we are also done. Assume that (b) is true and that $w_m$ belongs to an $i$-block. Then (a) and (c) are independent (we assume that $w_1, \dots, w_j$ are distinct). Therefore, we have

$$\mathrm{Prob}[\text{(a)} \mathbin{\&} \text{(b)} \mathbin{\&} \text{(c)}] = \mathrm{Prob}[\text{(a)}]\, \mathrm{Prob}[\text{(c)}] = 2^{-t_i} 2^{-m+1}.$$

Fix an arbitrary $m$. Let us prove that the probability of the event "there is a protocol $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ such that $m \le j$ and (a) & (b) & (c)" is at most $2^{-t_i}$. Let $R_m(A)$ denote this event. If $Z = \langle \langle w_1, b_1 \rangle, \dots, \langle w_j, b_j \rangle \rangle$ is a protocol such that $j \ge m$, then we call the sequence $Z_m = \langle \langle w_1, b_1 \rangle, \dots, \langle w_{m-1}, b_{m-1} \rangle, w_m \rangle$ the $m$-prefix of $Z$. The set $\mathcal{Z}_m = \{Z_m \mid Z \in \Pi\}$ has at most $2^{m-1}$ elements, since $Z_m$ is defined completely by the tuple $\langle b_1, \dots, b_{m-1} \rangle$. The truth values of the assertions (a), (b), and (c) depend only on $A$ and $Z_m$. Therefore,

$$\begin{aligned}
\mathrm{Prob}[R_m(A) \mid A|D_i = f] &\le \sum_{Y \in \mathcal{Z}_m} \mathrm{Prob}[\text{(a), (b), and (c) hold for } Y]\ \mathrm{Prob}[C(A)_m = Y \mid A|D_i = f] \\
&\le 2^{m-1}\left(2^{-t_i} 2^{-m+1}\right) = 2^{-t_i}.
\end{aligned}$$

Let $k$ be the maximal length of a protocol from $\Pi$. Obviously, $k \le \mathrm{poly}(t_i)$. We have, therefore,

$$\mathrm{Prob}[R(A) \mid A|D_i = f] \le \sum_{m=1}^{k} \mathrm{Prob}[R_m(A) \mid A|D_i = f] \le 2^{-t_i} k \le 2^{-t_i}\,\mathrm{poly}(t_i). \qquad \Box$$

References

[1] W. Aiello, S. Goldwasser, and J. Håstad. On the power of interaction. Proc. 27th Ann. IEEE Symp. Foundations of Computer Science, pp. 368-379, 1986.
[2] M. Ajtai. $\Sigma_1^1$-formulae on finite structures. Ann. Pure and Applied Logic, 24 (1983), 1-48.
[3] K. Ambos-Spies. A note on complete problems for complexity classes. Inf. Proc. Lett., 23 (1986), 227-230.
[4] L. Babai. Trading group theory for randomness. Proc. 17th Ann. ACM Symp. Theory of Computing, pp. 421-429, 1985.
[5] T. Baker, J. Gill, and R. Solovay. Relativization of P =? NP question. SIAM J. Computing, 4 (1975), no. 4, 431-442.
[6] R. Beigel. Perceptrons, PP, and the polynomial time hierarchy. Proc. 7th Ann. IEEE Conf. Structure in Complexity Theory, pp. 14-19, 1992.
[7] R. Beigel, N. Reingold, and D. Spielman. PP is closed under intersection. Proc. 23rd Ann. ACM Symp. Theory of Computing, pp. 1-9, 1991.
[8] C. H. Bennett and J. Gill. Relative to a random oracle P ≠ NP ≠ coNP with probability 1. SIAM J. Computing, 10 (1981), 96-113.
[9] M. Blum and R. Impagliazzo. Generic oracles and oracle classes. Proc. 28th Ann. IEEE Symp. Foundations of Computer Science, pp. 118-126, 1987.
[10] D. P. Bovet, P. Crescenzi, and R. Silvestri. A uniform approach to define complexity classes. Theor. Comp. Sci., 104 (1992), 263-283.
[11] J. Cai and L. Hemachandra. On the power of parity polynomial time. Math. Syst. Theory, 23 (1990), no. 2, 95-106.
[12] E. W. Cheney. Approximation Theory. AMS/Chelsea, Providence, RI, 1998.
[13] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. of Math. Statistics, 23 (1952), 493-509.
[14] H. Ehlich and K. Zeller. Schwankung von Polynomen zwischen Gitterpunkten. Math. Z., 86 (1964), 41-44.
[15] S. Fenner, L. Fortnow, S. A. Kurtz, and L. Li. An oracle builder's toolkit. Proc. 8th Ann. IEEE Conf. Structure in Complexity Theory, pp. 120-131, 1993.
[16] L. Fortnow and N. Reingold. PP is closed under truth table reductions. Proc. 6th Ann. Conf. Structure in Complexity Theory, pp. 13-15, 1991.
[17] L. Fortnow and J. Rogers. Separability and one-way functions. Algorithms and Computation (Beijing, 1994), Lecture Notes in Computer Sci., vol. 834, Springer-Verlag, Berlin, 1994, pp. 396-404.
[18] L. Fortnow and M. Sipser. Are there interactive protocols for coNP languages? Inf. Proc. Lett., 28 (1988), 249-251.
[19] B. Fu. Separating PH from PP by relativization. Acta Math. Sinica (N.S.), 8 (1992), 329-336.
[20] M. Furst, J. Saxe, and M. Sipser. Parity, circuits and the polynomial time hierarchy. Math. Syst. Theory, 17 (1984), 13-27.
[21] S. Goldwasser, S. Micali, and C. Rackoff. The knowledge complexity of interactive proof-systems. SIAM J. Computing, 18 (1989), 186-208.
[22] S. Goldwasser and M. Sipser. Private coins versus public coins in interactive proof systems. Proc. 18th Ann. ACM Symp. Theory of Computing, pp. 59-68, 1986.
[23] J. Grollman and A. L. Selman. Complexity measures for public-key cryptosystems. SIAM J. Computing, 17 (1988), no. 2, 309-335.
[24] Y. Gurevich. Algebras of feasible functions. Proc. 24th Ann. IEEE Symp. Foundations of Computer Science, pp. 210-214, 1983.
[25] J. Håstad. Almost optimal lower bounds for small depth circuits. Proc. 18th Ann. ACM Symp. Theory of Computing, pp. 6-20, 1986.
[26] J. Hartmanis and L. Hemachandra. Complexity classes without machines: On complete languages for UP. Theor. Comp. Sci., 58 (1988), 129-142.
[27] J. Hartmanis and N. Immerman. On complete problems for NP ∩ coNP. Proc. Intern. Colloq. Automata, Languages and Programming, 1985. Lecture Notes Comp. Sci., vol. 194, Springer-Verlag, Berlin-Heidelberg, 1985, pp. 250-259.
[28] L. Hemaspaandra, S. Jain, and N. Vereshchagin. Banishing robust Turing completeness. Intern. J. Found. Comp. Sci., 4 (1993), no. 3, 245-265.
[29] S. Homer and A. L. Selman. Oracles for structural properties: The isomorphism problem and public-key cryptography. J. Comp. and Syst. Sci., 44 (1992), no. 2, 287-301.
[30] R. Impagliazzo and M. Naor. Decision trees and downward closures. Proc. 3rd Ann. IEEE Conf. Structure in Complexity Theory, pp. 29-38, 1988.
[31] J. Köbler, U. Schöning, S. Toda, and J. Torán. Turing machines with few accepting computations and low sets for PP. Proc. 4th Ann. IEEE Conf. Structure in Complexity Theory, 1989, pp. 208-215.
[32] K.-I. Ko. Relativized polynomial-time hierarchies having exactly k levels. SIAM J. Computing, 18 (1989), no. 2, 392-408.
[33] S. Kurtz, S. Mahaney, and J. Royer. Average Dependence and Random Oracle. Proc. 7th Ann. IEEE Conf. Structure in Complexity Theory, pp. 306-317, 1992.
[34] C. Lautemann. BPP and the polynomial hierarchy. Inf. Proc. Lett., 17 (1983), no. 4, 215-217.
[35] G. G. Lorentz. Approximation of Functions. Holt, Rinehart and Winston, New York, 1966.
[36] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. The polynomial time hierarchy has interactive proofs. Proc. 31st Ann. IEEE Symp. Foundations of Computer Science, pp. 2-10, 1990.
[37] M. Minsky and S. Papert. Perceptrons. MIT Press, Cambridge, MA, 1988. (Expanded edition; first edition appeared in 1967.)
[38] A. A. Muchnik and N. K. Vereshchagin. A general method to construct oracles realizing given relationships between complexity classes. TR 500, Comp. Sci. Dept., Univ. of Rochester, 1994.
[39] A. A. Muchnik and N. K. Vereshchagin. A General Method to Construct Oracles Realizing Given Relationships between Complexity Classes. Theor. Comp. Sci., 157 (1996), 227-258.
[40] N. Nisan. Probabilistic versus deterministic decision trees and CREW PRAM complexity. Proc. 21st Ann. ACM Symp. Theory of Computing, pp. 327-335, 1989.
[41] C. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cliffs, NJ, 1982.
[42] G. Pólya and G. Szegő. Problems and Theorems in Analysis. Springer-Verlag, 1972.
[43] C. Rackoff. Relativized questions involving probabilistic algorithms. Proc. 10th Ann. ACM Symp. Theory of Computing, pp. 338-342, 1978.
[44] M. Santha. Relativized Arthur-Merlin versus Merlin-Arthur games. Inform. and Computation, 80 (1989), 44-49.
[45] A. Shamir. IP=PSPACE. Proc. 31st Ann. IEEE Symp. Foundations of Computer Science, pp. 11-15, 1990.
[46] R. Silvestri. Complexity classes and relativizations. Ph.D. thesis. Dipartimento di Matematica, Università degli studi di Roma "La Sapienza", 1992.
[47] M. Sipser. On relativizations and the existence of complete sets. Proc. Intern. Colloq. Automata, Languages and Programming, 1982. Lect. Notes Comp. Sci., 140, pp. 523-531, 1982.
[48] M. Sipser. A complexity theoretic approach for randomness. Proc. 15th Ann. ACM Symp. Theory of Computing, pp. 330-335, 1983.
[49] S. Toda. On the computational power of PP and ⊕P. Proc. 30th Ann. IEEE Symp. Foundations of Computer Science, pp. 514-519, 1989.
[50] N. K. Vereshchagin. On the power of PP. Proc. 7th IEEE Conf. Structure in Complexity Theory, pp. 138-143, 1992.
[51] N. K. Vereshchagin. Relativizable and Non-Relativizable theorems in Polynomial Theory of Algorithms. Izv. RAN, Ser. Mat., 57 (1993), no. 2, 51-90; English transl., Russian Acad. Sci. Izv. Math., 42 (1994), no. 2, 261-298.
[52] N. K. Vereshchagin. Relationships between NP-sets, Co-NP-sets and P-sets relative to random oracles. Izv. Vyssh. Uchebn. Zaved. Mat., 1993, no. 3, 31-39; English transl., Russian Math. (Iz. VUZ) 37 (1993), no. 3, 29-37; Preliminary version, Proc. 8th Annual IEEE Conf. Structure in Complexity Theory, 1993, pp. 132-138.
[53] N. K. Vereshchagin. Lower bounds for perceptrons solving some separation problems and oracle separation of AM from PP. Proc. 3rd Israel Symp. Theory of Computing and Systems, pp. 46-51, 1995.
[54] N. K. Vereshchagin. NP-sets are Co-NP-immune relative to a random oracle. Proc. 3rd Israel Symp. Theory of Computing and Systems, pp. 40-45, 1995.
[55] A. Yao. Separating the polynomial hierarchy by oracles. Proc. 26th Ann. IEEE Symp. Foundations of Computer Science, pp. 1-10, 1985.

Dept. of Mathematical Logic and Theory of Algorithms, Moscow State University, Vorobjevy Gory, Moscow 119899, Russia
E-mail address: ver@mech.math.msu.su

Translated by the author