Author: Claudia Maienborn, Klaus von Heusinger, Paul Portner

Tags: semantics   linguistics  

ISBN: 978-3-11-018470-9

Year: 2011

Semantics
An International Handbook of Natural Language Meaning
Volume 1
Edited by
Claudia Maienborn
Klaus von Heusinger
Paul Portner
Handbooks of Linguistics and Communication Science 33.1
Handbücher zur Sprach- und Kommunikationswissenschaft 33.1
DE GRUYTER MOUTON


ISBN 978-3-11-018470-9
e-ISBN 978-3-11-022661-4
ISSN 1861-5090

Library of Congress Cataloging-in-Publication Data
Semantics : an international handbook of natural language meaning / edited by Claudia Maienborn, Klaus von Heusinger, Paul Portner.
p. cm. - (Handbooks of linguistics and communication science; 33.1)
Includes bibliographical references and index.
ISBN 978-3-11-018470-9 (alk. paper)
1. Semantics--Handbooks, manuals, etc. I. Maienborn, Claudia. II. Heusinger, Klaus von. III. Portner, Paul.
P325.S3799 2011
401'.43-dc22
2011013836

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Cover design: Martin Zech, Bremen
Typesetting: RefineCatch Ltd, Bungay, Suffolk
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper
Printed in Germany
www.degruyter.com
Contents

Volume 1

I. Foundations of semantics
1. Meaning in linguistics • Claudia Maienborn, Klaus von Heusinger and Paul Portner 1
2. Meaning, intentionality and communication • Pierre Jacob 11
3. (Frege on) Sense and reference • Mark Textor 25
4. Reference: Foundational issues • Barbara Abbott 49
5. Meaning in language use • Georgia M. Green 74
6. Compositionality • Peter Pagin and Dag Westerståhl 96
7. Lexical decomposition: Foundational issues • Stefan Engelberg 124

II. History of semantics
8. Meaning in pre-19th century thought • Stephan Meier-Oeser 145
9. The emergence of linguistic semantics in the 19th and early 20th century • Brigitte Nerlich 172
10. The influence of logic on semantics • Albert Newen and Bernhard Schröder 191
11. Formal semantics and representationalism • Ruth Kempson 216

III. Methods in semantic research
12. Varieties of semantic evidence • Manfred Krifka 242
13. Methods in cross-linguistic semantics • Lisa Matthewson 268
14. Formal methods in semantics • Alice G.B. ter Meulen 285
15. The application of experimental methods in semantics • Oliver Bott, Sam Featherston, Janina Radó and Britta Stolterfoht 305

IV. Lexical semantics
16. Semantic features and primes • Manfred Bierwisch 322
17. Frameworks of lexical decomposition of verbs • Stefan Engelberg 358
18. Thematic roles • Anthony R. Davis 399
19. Lexical Conceptual Structure • Beth Levin and Malka Rappaport Hovav 420
20. Idioms and collocations • Christiane Fellbaum 441
21. Sense relations • Ronnie Cann 456
22. Dual oppositions in lexical meaning • Sebastian Löbner 479
V. Ambiguity and vagueness
23. Ambiguity and vagueness: An overview • Christopher Kennedy 507
24. Semantic underspecification • Markus Egg 535
25. Mismatches and coercion • Henriette de Swart 574
26. Metaphors and metonymies • Andrea Tyler and Hiroshi Takahashi 597

VI. Cognitively oriented approaches to semantics
27. Cognitive Semantics: An overview • Leonard Talmy 622
28. Prototype theory • John R. Taylor 643
29. Frame Semantics • Jean Mark Gawron 664
30. Conceptual Semantics • Ray Jackendoff 688
31. Two-level Semantics: Semantic Form and Conceptual Structure • Ewald Lang and Claudia Maienborn 709
32. Word meaning and world knowledge • Jerry R. Hobbs 740

VII. Theories of sentence semantics
33. Model-theoretic semantics • Thomas Ede Zimmermann 762
34. Event semantics • Claudia Maienborn 802
35. Situation Semantics and the ontology of natural language • Jonathan Ginzburg 830

VIII. Theories of discourse semantics
36. Situation Semantics: From indexicality to metacommunicative interaction • Jonathan Ginzburg 852
37. Discourse Representation Theory • Hans Kamp and Uwe Reyle 872
38. Dynamic semantics • Paul Dekker 923
39. Rhetorical relations • Henk Zeevat 946

Volume 2

IX. Noun phrase semantics
40. Pronouns • Daniel Büring
41. Definiteness and indefiniteness • Irene Heim
42. Specificity • Klaus von Heusinger
43. Quantifiers • Edward Keenan
44. Bare noun phrases • Veneeta Dayal
45. Possessives and relational nouns • Chris Barker
46. Mass nouns and plurals • Peter Lasersohn
47. Genericity • Gregory Carlson
X. Verb phrase semantics
48. Aspectual class and Aktionsart • Hana Filip
49. Perfect and progressive • Paul Portner
50. Verbal mood • Paul Portner
51. Deverbal nominalization • Jane Grimshaw

XI. Semantics of adjectives and adverb(ial)s
52. Adjectives • Violeta Demonte
53. Comparison constructions • Sigrid Beck
54. Adverbs and adverbials • Claudia Maienborn and Martin Schäfer
55. Adverbial clauses • Kjell Johan Sæbø
56. Secondary predicates • Susan Rothstein

XII. Semantics of intensional contexts
57. Tense • Toshiyuki Ogihara
58. Modality • Valentine Hacquard
59. Conditionals • Kai von Fintel
60. Propositional attitudes • Eric Swanson
61. Indexicality and de se reports • Philippe Schlenker

XIII. Scope, negation, and conjunction
62. Scope and binding • Anna Szabolcsi
63. Negation • Elena Herburger
64. Negative and positive polarity items • Anastasia Giannakidou
65. Coordination • Roberto Zamparelli

XIV. Sentence types
66. Questions • Manfred Krifka
67. Imperatives • Chung-hye Han
68. Copular clauses • Line Mikkelsen
69. Existential sentences • Louise McNally
70. Ellipsis • Ingo Reich

XV. Information structure
71. Information structure and truth-conditional semantics • Stefan Hinterwimmer
72. Topics • Craige Roberts
73. Discourse effects of word order variation • Gregory Ward and Betty J. Birner
74. Cohesion and coherence • Andrew Kehler
75. Discourse anaphora, accessibility, and modal subordination • Bart Geurts
76. Discourse particles • Malte Zimmermann
Volume 3

XVI. The interface of semantics with phonology and morphology
77. Semantics of intonation • Hubert Truckenbrodt
78. Semantics of inflection • Paul Kiparsky and Judith Tonhauser
79. Semantics of derivational morphology • Rochelle Lieber
80. Semantics of compounds • Susan Olsen
81. Semantics in Distributed Morphology • Heidi Harley

XVII. The syntax-semantics interface
82. Syntax and semantics: An overview • Arnim von Stechow
83. Argument structure • David Pesetsky
84. Operations on argument structure • Dieter Wunderlich
85. Type shifting • Helen de Hoop
86. Constructional meaning and compositionality • Paul Kay and Laura A. Michaelis
87. The grammatical view of scalar implicatures and the relationship between semantics and pragmatics • Gennaro Chierchia, Danny Fox and Benjamin Spector

XVIII. The semantics-pragmatics interface
88. Semantics/pragmatics boundary disputes • Katarzyna M. Jaszczolt
89. Context dependency • Thomas Ede Zimmermann
90. Deixis and demonstratives • Holger Diessel
91. Presupposition • David Beaver and Bart Geurts
92. Implicature • Mandy Simons
93. Game theory in semantics and pragmatics • Gerhard Jäger
94. Conventional implicature and expressive content • Christopher Potts

XIX. Typology and crosslinguistic semantics
95. Semantic types across languages • Emmon Bach and Wynn Chao
96. Count/mass distinctions across languages • Jenny Doetjes
97. Tense and aspect: Time across languages • Carlota S. Smith
98. The expression of space across languages • Eric Pederson

XX. Diachronic semantics
99. Theories of meaning change: An overview • Gerd Fritz
100. Cognitive approaches to diachronic semantics • Dirk Geeraerts
101. Grammaticalization and semantic reanalysis • Regine Eckardt

XXI. Processing and learning meaning
102. Meaning in psycholinguistics • Lyn Frazier
103. Meaning in first language acquisition • Stephen Crain
104. Meaning in second language acquisition • Roumyana Slabakova
105. Meaning in neurolinguistics • Roumyana Pancheva
106. Conceptual knowledge, categorization and meaning • Stephanie Kelter and Barbara Kaup
107. Space in semantics and cognition • Barbara Landau

XXII. Semantics and computer science
108. Semantic research in computational linguistics • Manfred Pinkal and Alexander Koller
109. Semantics in corpus linguistics • Graham Katz
110. Semantics in computational lexicons • Anette Frank and Sebastian Padó
111. Web Semantics • Paul Buitelaar
112. Semantic issues in machine translation • Kurt Eberle
I. Foundations of semantics

1. Meaning in linguistics

1. Introduction
2. Truth
3. Compositionality
4. Context and discourse
5. Meaning in contemporary semantics
6. References

Abstract

The article provides an introduction to the study of meaning in modern semantics. Major tenets, tools, and goals of semantic theorizing are illustrated by discussing typical approaches to three central characteristics of natural language meaning: truth conditions, compositionality, and context and discourse.

1. Introduction

Meaning is a key concept of cognition, communication and culture, and there is a diversity of ways to understand it, reflecting the many uses to which the concept can be put. In the following we take the perspective on meaning developed within linguistics, in particular modern semantics, and we aim to explain the ways in which semanticists approach, describe, test and analyze meaning. The fact that semantics is a component of linguistic theory is what distinguishes it from approaches to meaning in other fields like philosophy, psychology, semiotics or cultural studies. As part of linguistic theory, semantics is characterized by at least the following features:

1. Empirical coverage: It strives to account for meaning in all of the world's languages.
2. Linguistic interfaces: It operates as a subtheory of the broader linguistic system, interacting with other subtheories such as syntax, pragmatics, phonology and morphology.
3. Formal explicitness: It is laid out in an explicit and precise way, allowing the community of semanticists to jointly test it, improve it, and apply it to new theoretical problems and practical goals.
4. Scientific paradigm: It is judged on the same criteria as other scientific theories, viz. coherence, conceptual simplicity, and its ability to unify our understanding of diverse phenomena (within or across languages), to raise new questions and to open up new horizons for research.

In the following we exemplify these four features with three central issues in modern semantic theory that define our understanding of meaning: truth conditions, compositionality, and context and discourse.
2. Truth

If one is to develop an explicit and precise scientific theory of meaning, the first thing one needs to do is to identify some of the data which the theory will respond to, and there is one type of data which virtually all work in semantics takes as fundamental: truth conditions. At an intuitive level, truth conditions are merely the most obvious way of understanding the meaning of a declarative sentence. If I say It is raining outside, I have described the world in a certain way. I may have described it correctly, in which case what I said is true, or I may have described it incorrectly, in which case it is false. Any competent speaker knows to a high degree of precision what the weather must be like for my sentence to count as true (a correct description) or false (an incorrect description). In other words, such a speaker knows the truth conditions of my sentence. This knowledge of truth conditions is extremely robust - by and large, English speakers make agreeing judgments about what would make my sentence true or false - and as a result, we can see the truth conditions themselves as a reliable fact about language which can serve as part of the basis for semantic theory.

While truth conditions constitute some of the most basic data for semantics, different approaches to semantics reckon with them in different ways. Some theories treat truth conditions not merely as the data which semantics is to deal with, but more than this as the very model of sentential meaning. This perspective can be summarized with the slogan "meaning is truth conditions", and within this tradition, we find statements like the following:

(1) [[It is raining outside]]^{t,s} = TRUE iff it is raining outside of the building where the speaker s is located at time t, and = FALSE otherwise.

The double brackets [[X]] around an expression X name the semantic value of X in the terms of the theory in question. Thus, (1) indicates a theory which takes the semantic value of a sentence to be its truth value, TRUE or FALSE. The meaning of the sentence, according to the truth conditional theory, is then captured by the entire statement (1). Although (1) represents a truth conditional theory according to which semantic value and meaning (i.e., the truth conditions) are distinct (the semantic value is a crucial component in giving the meaning), other truth conditional theories use techniques which allow meaning to be reified, and thus identified with semantic value, in a certain sense. The most well-known and important such approach is based on possible worlds:

(2) a. [[It is raining outside]]^{w,t,s} = TRUE iff it is raining outside of the building where the speaker s is located at time t in world w, and = FALSE otherwise.
    b. [[It is raining outside]]^{t,s} = the set of worlds {w : it is raining outside of the building where the speaker s is located at time t in world w}

A possible world is a complete way the world could be. (Other theories use constructs similar to possible worlds, such as situations.) The statement in (2a) says virtually the same thing as (1), making explicit only that the meaning of It is raining outside depends not merely on the actual weather outside, but on whatever the weather may turn out to be. Crucially, by allowing the possible world to be treated as an arbitrary point of evaluation, as in (2a), we are able to identify the truth conditions with the set of all such points, as in (2b). In (2), we have two different kinds of semantic value: the one in (2a), relativized to world, time, and speaker, corresponds to (1), and is often called the extension or reference. That in (2b), where the world point of evaluation has been transferred into the semantic value itself, is then called the intension or sense. The sense of a full sentence, for example given as a set of possible worlds as in (2b), is called a proposition. Specific theories differ in the precise nature of the extension and intension: the intension may involve more or different parameters than w, t, s, and several of these may be gathered into a set (along with the world) to form the intension. For example, in tense semantics, we often see intensions treated as sets of pairs of a world and a time.
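To make the set-of-worlds picture concrete, here is a minimal sketch in Python (our own illustrative addition, not part of the handbook text; the world labels and the stipulated facts are invented). A proposition is a set of worlds as in (2b), the extension at a world is just membership as in (2a), and entailment (cf. (5) below) comes out as the subset relation.

```python
# Illustrative sketch (not from the handbook): propositions as sets of
# possible worlds, in the spirit of (2b). World names are arbitrary labels.

worlds = {"w1", "w2", "w3", "w4"}      # a toy space of possible worlds

# Stipulated facts about the toy worlds:
raining = {"w1", "w2"}                 # [[It is raining outside]] as a proposition
kids_playing = {"w2", "w3"}            # [[The kids are playing with the hose]]

def extension_at(proposition, w):
    """The extension (truth value) of a proposition at world w, cf. (2a)."""
    return w in proposition

def entails(p, q):
    """p entails q iff q is true in every world in which p is true."""
    return p <= q                      # the subset relation on sets of worlds

disjunction = raining | kids_playing   # a disjunction denotes the union

print(extension_at(raining, "w1"))     # True: it is raining in w1
print(entails(raining, disjunction))   # True: cf. the entailment in (5) below
print(entails(disjunction, raining))   # False: the converse does not hold
```

On this reification, relations like entailment and contradiction become simple set-theoretic tests, which is one reason the possible-worlds format has proven so useful.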
The majority of work in semantics follows the truth conditional approach to the extent of making statements like those in (1)-(2) the fundamental fabric of the theory. Scholars often produce explicit fragments, i.e. mini-theories which cover a subset of a language and which are actually functions from expressions of a language to semantic values, with the semantic values of sentences being truth conditional in the vein of (1)-(2). But not all semantic research is truth conditional in this explicit way. Descriptive linguistics, functional linguistics, typological linguistics and cognitive linguistics frequently make important claims about meaning (in a particular language, or crosslinguistically). For example, Wolfart (1973: 25), a descriptive study of Plains Cree, states: "Semantically, direction serves to specify actor and goal. In sentence (3), for instance, the direct theme sign /ā/ indicates the noun atim as goal, whereas the inverse theme sign /ekw/ in (4) marks the same noun as actor."

(3) nisekihanan atim
    scare(1p-3) dog(3)
    'We scare the dog.'

(4) nisekihikonan atim
    scare(3-1p) dog(3)
    'The dog scares us.'

Despite not being framed as such, this passage is implicitly truth conditional. Wolfart is stating a difference in truth conditions which depends on the grammatical category of direction, using the descriptions "actor" and "goal" and the translations of the cited examples. This example serves to illustrate the centrality of truth conditions to any attempt to think about the nature of linguistic meaning.

As a corollary to the focus on truth conditions, semantic theories typically take relations like entailment, synonymy, and contradiction to provide crucial data as well. Thus, the example sentence It is raining outside entails (5), and this fact is known to any competent speaker.

(5) It is raining outside or the kids are playing with the water hose.

Obviously, this entailment can be understood in terms of truth conditions (the truth of the one sentence guarantees the truth of the other), a fact which supports the idea that the analysis of truth conditions should be a central goal of semantics. It is less satisfying to describe synonymy in terms of truth conditions, as identity of truth conditions doesn't in most cases make for absolute sameness of meaning, in an intuitive sense - consider
Mary hit John and John was hit by Mary. Nevertheless, a truth conditional definition of synonymy allows for at least a useful concept of synonymy, since people can indeed judge whether two sentences would accurately describe the same circumstances, whereas it is not obvious that complete intuitive synonymy is even a useful concept, insofar as it may never occur in natural language.

The truth conditional perspective on meaning is intuitive and powerful where it applies, but in and of itself, it is only a foundation. It doesn't, at first glance, say anything about the meanings of subsentential constituents, the meanings or functions of non-declarative sentences, or non-literal meaning, for example. Semantic theory is responsible for the proper analysis of each of these features of language as well, and we will see in many of the articles in this handbook how it has been able to rise to these challenges, and many others.

3. Compositionality

A crucial aspect of natural language meaning is that speakers are able to determine the truth conditions for infinitely many distinct sentences, including sentences they have never encountered before. This shows that the truth conditions for sentences (or whatever turns out to be their psychological correlate) cannot be memorized. Speakers do not associate truth conditions such as the ones given in (1) or (2) holistically with their respective sentences. Rather, there must be some principled way to compute the meaning of a sentence from smaller units. In other words, natural language meaning is essentially combinatorial: the meaning of a complex expression is construed by combining the meanings of its parts in a certain way. Obviously, syntax plays a significant role in this process. The two sentences in (6), for instance, are made up of the same lexical material. It is only the different word order that is responsible for the different sentence meanings of (6a) and (6b).

(6) a. Caroline kissed a boy.
    b. A boy kissed Caroline.

In a similar vein, the ambiguity of a sentence like (7) is rooted in syntax. The two readings paraphrased in (7a) and (7b) correspond to different syntactic structures, with the PP being adjoined either to the verbal phrase or to the direct object NP.

(7) Caroline observed the boy with the telescope.
    a. Caroline observed the boy with the help of the telescope.
    b. Caroline observed the boy who had a telescope.

Examples such as (6) and (7) illustrate that the semantic combinatorial machinery takes the syntactic structure into account in a fairly direct way. This basic insight led to the formulation of the so-called "principle of compositionality", attributed to Gottlob Frege (1892), which is usually formulated along the following lines:

(8) Principle of compositionality: The meaning of a complex expression is a function of the meanings of its parts and the way they are syntactically combined.
According to (8), the meaning of, e.g., Caroline sleeps is a function of the meanings of Caroline and sleeps and the fact that the former is the syntactic subject of the latter. There are stronger and weaker versions of the principle of compositionality, depending on what counts as "parts" and how exactly the semantic combinatorics is determined by the syntax. For instance, adherents of a stronger version of the principle of compositionality typically assume that the parts that constitute the meaning of a complex expression are only its immediate constituents. According to this view, only the NP [Caroline] and the VP [kissed a boy] would count as parts when computing the sentence meaning for (6a), but not (directly) [kissed] or [a boy].

Modern semantics explores many different ways of implementing the notion of compositionality formally. One particularly useful framework is based on the mathematical concept of a function. It takes the meaning of any complex expression to be the result of applying the meaning of one of its immediate parts (= the functor) to the meaning of its other immediate part (= the argument). With functional application as the basic semantic operation, applied stepwise and mirroring the binary branching of syntax, the function-argument approach allows for a straightforward syntax-semantics mapping.
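As a minimal illustration of this function-argument regime, consider the following Python sketch (our own addition, not part of the handbook text; the two-word lexicon is invented). The intransitive verb denotes a characteristic function from individuals to truth values, and the sentence meaning results from one step of functional application.

```python
# Illustrative sketch (not from the handbook): composition by functional
# application over a toy model with two individuals.

CAROLINE, JOHN = "Caroline", "John"

# Lexicon: a name denotes an individual; an intransitive verb denotes the
# characteristic function of the set of sleepers in our toy model.
lexicon = {
    "Caroline": CAROLINE,
    "John": JOHN,
    "sleeps": lambda x: x in {CAROLINE},    # only Caroline sleeps here
}

def apply_fn(functor, argument):
    """Functional application: one binary-branching node of composition."""
    return functor(argument)

# [[Caroline sleeps]] = [[sleeps]]([[Caroline]])
print(apply_fn(lexicon["sleeps"], lexicon["Caroline"]))   # True
print(apply_fn(lexicon["sleeps"], lexicon["John"]))       # False
```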
Although there is wide agreement among semanticists that, given the combinatorial nature of linguistic meaning, some version of the principle of compositionality must certainly hold, it is also clear that, when taking into account the whole complexity and richness of natural language meaning, compositional semantics is faced with a series of challenges. As a response to these challenges, semanticists have come up with several solutions and amendments. These relate basically to (A) the syntax-semantics interface, (B) the relationship between semantics and ontology, and (C) the semantics-pragmatics interface.

A Syntax-Semantics Interface

One way to cope with challenges to compositionality is to adjust the syntax properly. This could be done, e.g., by introducing possibly mute, i.e. phonetically empty, functional heads into the syntactic tree that nevertheless carry semantic content, or by relating the semantic composition to a more abstract level of syntactic derivation - Logical Form - that may differ from surface structure due to invisible movement. That is, the syntactic structure on which the semantic composition is based may be more or less directly linked to surface syntax, such that it fits the demands of compositional semantics. Of course, any such move should be independently motivated.

B Semantics - Ontology

Another direction that might be explored in order to reconcile syntax and semantics is to reconsider the inventory of primitive semantic objects the semantic fabric is assumed to be composed of. A famous case in point is Davidson's (1967) plea for an ontological category of events. A crucial motivation for this move was that the standard treatment of adverbial modifiers at that time was insufficient insofar as it failed to account properly for the combinatorial behavior and entailments of adverbial expressions. By positing an additional event argument introduced by the verb, Davidson laid the grounds for a theory of adverbial modification that would overcome these shortcomings. Under this assumption, Davidson's famous sentence (9a) takes a semantic representation along the lines of (9b):

(9) a. Jones buttered the toast in the bathroom with the knife at midnight.
    b. ∃e [butter(jones, the toast, e) & in(e, the bathroom) & instr(e, the knife) & at(e, midnight)]

According to (9b), there was an event e of Jones buttering the toast, and this event was located in the bathroom. In addition, it was performed by using a knife as an instrument, and it took place at midnight. That is, Davidson's move enabled standard adverbial modifiers to be treated as simple first-order predicates that add information about the verb's hidden event argument. The major merits of such a Davidsonian analysis are, first, that it accounts for the typical entailment patterns of adverbial modifiers directly on the basis of their semantic representation. That is, the entailments in (10) follow from (9b) simply by virtue of the logical rule of simplification (see the sketch below).

(10) a. Jones buttered the toast in the bathroom at midnight.
     b. Jones buttered the toast in the bathroom.
     c. Jones buttered the toast at midnight.
     d. Jones buttered the toast.

And, secondly, Davidson paved the way for treating adverbial modifiers on a par with adnominal modifiers. In the meantime, researchers working within the Davidsonian paradigm have discovered more and more fundamental analogies between the verbal and the nominal domain, attesting to the fruitfulness of Davidson's move. In short, by enriching the semantic universe with a new ontological category of events, Davidson solved the compositionality puzzle of adverbials and arrived at a semantic theory superior to its competitors in both conceptual simplicity and empirical coverage. Of course, once again, such a solution does not come without costs. With Quine's (1958) dictum "No entity without identity!" in mind, any ontological category a semantic theory makes use of requires a proper ontological characterization and legitimization. In the case of events, this is still the subject of ongoing debates among semanticists.
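To see in miniature why the entailments in (10) come for free, here is a Python sketch (our own addition, not from the handbook; the predicate spellings are invented notation). If (9b) is a conjunction of predications over the event variable, each sentence in (10) corresponds to a subset of the conjuncts, and the rule of simplification is just a subset test.

```python
# Illustrative sketch (not from the handbook): the Davidsonian conjuncts of
# (9b) and the entailments in (10) as conjunction elimination.

# (9b) as a set of atomic conjuncts about the event variable e
sentence_9b = {
    "butter(jones, the_toast, e)",
    "in(e, the_bathroom)",
    "instr(e, the_knife)",
    "at(e, midnight)",
}

# (10b) "Jones buttered the toast in the bathroom"
sentence_10b = {"butter(jones, the_toast, e)", "in(e, the_bathroom)"}

# (10d) "Jones buttered the toast"
sentence_10d = {"butter(jones, the_toast, e)"}

def entails_by_simplification(premise, conclusion):
    """Dropping conjuncts under the existential quantifier preserves truth."""
    return conclusion <= premise

print(entails_by_simplification(sentence_9b, sentence_10b))   # True
print(entails_by_simplification(sentence_9b, sentence_10d))   # True
print(entails_by_simplification(sentence_10d, sentence_9b))   # False
```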
C Semantics-Pragmatics Interface

Finally, challenges to compositionality might also be taken as an invitation to reconsider the relationship between semantics and pragmatics by asking how far the composition of sentential meaning goes, and what the principles of pragmatic enrichment and pragmatic licensing are. One notorious case in point is the adequate delineation of linguistic knowledge and world knowledge. To give an example, when considering the sentences in (11), we know that each of them refers to a very different kind of opening event. Obviously, the actions underlying, for instance, the opening of a can differ substantially from those of opening one's eyes or opening a file on a computer.

(11) a. She opened the can.
     b. She opened her eyes.
     c. She opened the electronic file.

To a certain extent, this knowledge is of linguistic significance, as can be seen when taking into account the combinatorial behavior of certain modifiers:

(11) a. She opened the can {with a knife, *abruptly, *with a double click}.
     b. She opened her eyes {*with a knife, abruptly, *with a double click}.
     c. She opened the electronic file {*with a knife, *abruptly, with a double click}.

A comprehensive theory of natural language meaning should therefore strive to account for these observations. Nevertheless, incorporating this kind of world knowledge into compositional semantics would be neither feasible nor desirable. A possible solution to this dilemma lies in the notion of semantic underspecification. Several proposals have been developed which take the lexical meaning that is fed into semantic composition to be of an abstract, context-neutral nature. In the case of to open in (11), for instance, this common meaning skeleton would roughly say that some action of an agent x on an object y causes a change of state such that y is accessible afterwards. This would be the verb's constant meaning contribution that can be found in all sentences in (11a-c) and which is also present, e.g., in (11d), where we don't have such clear intuitions about how x acted upon y, and which is therefore more liberal as to adverbial modification.

(11) d. She opened the gift {with a knife, abruptly, with a double click}.

That is, underspecification accounts would typically take neither the type of action performed by x nor the exact sense of accessibility of y as part of the verb's lexical meaning. To account for this part of the meaning, compositional semantics is complemented by a procedure of pragmatic enrichment, by which the compositionally derived meaning skeleton is pragmatically specified according to the contextually available world knowledge. Semantic underspecification/pragmatic enrichment accounts thus provide a means for further specifying a compositionally well-formed, underspecified meaning representation.

A different stance towards the semantics-pragmatics interface is taken by so-called "coercion" approaches. These deal typically with the interpretation of sentences that are strictly speaking ungrammatical but might be "rescued" in a certain way. An example is given in (12).

(12) The alarm clock stood intentionally on the table.

The sentence in (12) does not offer a regular integration for the subject-oriented adverbial intentionally, i.e., the subject NP the alarm clock does not fulfill the adverbial's request for an intentional subject. Hence, a compositional clash results and the sentence is ungrammatical. Nevertheless, although deviant, there seems to be a way to rescue the sentence so that it becomes acceptable and interpretable anyway. In the case of (12), a possible repair strategy would be to introduce an actor who is responsible for the fact that the alarm clock stands on the table. This move would provide a suitable anchor for the adverbial's semantic contribution. Thus, we understand (12) as saying that someone put the alarm clock on the table on purpose. That is, in case of a combinatorial clash, there seems to be a certain leeway for non-compositional adjustments of the compositionally derived meaning: the defective part is "coerced" into the right format. The exact mechanism of coercion and its grammatical and pragmatic licensing conditions are still poorly understood.
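The following Python sketch shows the general shape of such a coercion account (our own illustration; the sort labels and the repair operator are expository inventions, not a mechanism proposed in the article). Composition checks a selectional requirement, and a coercion operator repairs a mismatch by introducing an implicit agent.

```python
# Illustrative sketch (not from the handbook): a sortal clash between the
# subject-oriented adverbial and a non-agentive subject, repaired by coercion.

AGENTIVE = "agentive"
INANIMATE = "inanimate"

def intentionally(subject_sort, content):
    """'Intentionally' selects for an agentive (intentional) subject."""
    if subject_sort != AGENTIVE:
        raise TypeError("sortal clash: 'intentionally' needs an agent")
    return f"intentionally({content})"

def coerce_with_implicit_agent(content):
    """Repair: introduce an implicit actor responsible for the state."""
    return AGENTIVE, f"cause(someone, {content})"

subject_sort, content = INANIMATE, "stand(alarm_clock, on_table)"

try:
    meaning = intentionally(subject_sort, content)        # direct composition fails
except TypeError:
    subject_sort, content = coerce_with_implicit_agent(content)
    meaning = intentionally(subject_sort, content)        # succeeds after coercion

print(meaning)   # intentionally(cause(someone, stand(alarm_clock, on_table)))
```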
In current semantic research many quite different directions are being explored with respect to the issues A-C. What version of the principle of compositionality ultimately turns out to be the right one, and how compositional semantics interacts with syntax, ontology, and pragmatics, is, in the end, an empirical question. Yet, the results and insights obtained so far in this endeavor already demonstrate the fruitfulness of reckoning with compositionality as a driving force in the constitution of natural language meaning.

4. Context and discourse

Speakers do not use sentences in isolation, but in the context of an utterance situation and as part of a longer discourse. The meaning of a sentence depends on the particular circumstances of its utterance, but also on the discourse context in which it is uttered. At the same time, the meaning of a linguistic expression changes the context, e.g., the information available to speaker and hearer. The analysis of the interaction of context, discourse and meaning adds new and challenging issues to the research agenda of the semantics-pragmatics interface described in the last section. In the following we focus on two aspects of these issues to illustrate how the concept of meaning described above can be developed further by theorizing about the interaction between sentence meaning, contextual parameters and discourse structure.

So far we have characterized the meaning of a sentence by its truth conditions and, as a result, we have "considered semantics to be the study of propositions" (Stalnaker 1970: 273). This is justified by the very clear idea that meaning describes "how the world is". However, linguistic expressions often need additional information to form propositions, as sentences contain indexical elements, such as I, you, she, here, there, now and the tenses of verbs. Indexical expressions cannot be interpreted relative to possible worlds alone, i.e. relative to how things might be; they are interpreted according to the actual utterance situation. Intensive research into this kind of context dependency led to the conclusion that the proposition itself depends on contextual parameters like speaker, addressee, location, time etc. This dependency is most prominently expressed in Kaplan's (1977) notion of character for the meaning of linguistic expressions. The character of an expression is a function from the context of utterance c, which includes the values for the speaker, the hearer, the time, the location etc., to the proposition. Other expressions such as local, different, a certain, enemy, neighbor may contain "hidden" indexical parameters: they express their content dependent on one or more reference points given by the context. Thus meaning is understood as an abstract concept, a function from contexts to propositions, and propositions themselves are described as functions from possible worlds to truth values.
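A minimal Python rendering of this two-stage picture may help (our own sketch, not from the handbook; the sentence, worlds and contexts are invented). The character of I am in Paris maps a context fixing the speaker to a proposition, and the proposition maps worlds to truth values.

```python
# Illustrative sketch (not from the handbook): character as a function from
# contexts to propositions; propositions as functions from worlds to truth values.

# Each toy world stipulates where each person is.
worlds = {
    "w1": {"ann": "paris", "bob": "berlin"},
    "w2": {"ann": "berlin", "bob": "berlin"},
}

def character_i_am_in_paris(context):
    """Character of 'I am in Paris': context -> proposition."""
    speaker = context["speaker"]
    def proposition(w):
        return worlds[w][speaker] == "paris"
    return proposition

ctx_ann = {"speaker": "ann", "time": "t0"}   # the context fixes the indexical 'I'
ctx_bob = {"speaker": "bob", "time": "t0"}

p_ann = character_i_am_in_paris(ctx_ann)     # the proposition Ann expresses
p_bob = character_i_am_in_paris(ctx_bob)     # a different proposition for Bob

print(p_ann("w1"), p_ann("w2"))   # True False
print(p_bob("w1"), p_bob("w2"))   # False False
```

The same sentence thus expresses different propositions in different contexts, which is exactly what distinguishes character from content.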
The meaning of a linguistic expression is influenced not only by such relatively concrete aspects of the situation of use as speaker and addressee, but also by intentional factors like the assumptions of the speaker and hearer about the world, their beliefs and their goals. This type of context is continuously updated by the information provided by each sentence in a discourse. We see that linguistic expressions are not only "context-consumers", but also "context-shifters". This can be illustrated by examples from anaphora, presuppositions and various discourse relations.

(13) a. A man walks in the park. He smokes.
     b. #He smokes. A man walks in the park.

(14) a. Rebecca married Thomas. She regrets that she married him.
     b. Rebecca regrets that she married Thomas. #She married him.

(15) a. John left. Ann started to cry.
     b. Ann started to cry. John left.

In (13) the anaphoric pronoun needs an antecedent; in other words, it is a context-consumer, as it takes the information provided in the context for fixing its meaning. The indefinite noun phrase a man, however, is a context-shifter: it changes the context by introducing a discourse referent into the discourse structure such that the pronoun can be linked to it. In (13a) the indefinite introduces the referent and the anaphoric pronoun can be linked to it; in (13b) the pronoun in the first sentence has no antecedent, and if the indefinite noun phrase in the second clause is meant to refer to the same discourse referent, it must not be indefinite. In (14) we see the contribution of presupposition to the context. (14b) is odd, since one can regret only something that is known to have happened; to assert this again makes the contribution of the second sentence superfluous and the small discourse incoherent. (15) provides evidence that we always assume some relation between sentences over and above a simple conjunction of two propositions. The relation could be a sequence of events or a causal relation between the two events, and this induces different meanings on the two small discourses as a whole. These and many more examples have led to the development of dynamic semantics, i.e. the view that meaning consists in shifting a given information state to a new one.

There are different ways to model the context dependency of linguistic expressions, and the choice among them is still an unresolved issue and a topic of considerable contemporary interest. We illustrate this by presenting one example from the literature. Stalnaker proposes to represent the context as a set of possible worlds shared by speaker and hearer, his "common ground". A new sentence is interpreted with respect to the common ground, i.e. with respect to a set of possible worlds. The interpretation of the sentence changes the common ground (given that the hearer does not reject the content of the sentence), and the updated common ground is the new context for the next sentence. Kamp (1988) challenges this view as problematic, since possible worlds do not provide enough linguistically relevant information, as the following example illustrates (due to Barbara Partee, first discussed in Heim 1982: 21).

(16) Exactly one of the ten balls is not in the bag. It is under the sofa.

(17) Exactly nine of the ten balls are in the bag. #It is under the sofa.

The first sentences of (16) and (17) have the same truth conditions, i.e. in exactly those possible circumstances in which (16) is true, (17) is true too; still, the continuation with the second sentence is only felicitous in (16), not in (17). (16) explicitly introduces an antecedent in the first sentence, and the pronoun in the second sentence can be anaphorically linked to it. In (17), however, no explicit antecedent is introduced, and therefore we cannot resolve the anaphoric reference of the pronoun. Extensive research on these issues has proven very fruitful for the continuing development of our methodological tools and for our understanding of natural language meaning in context and its function for discourse structure.
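As a closing illustration, here is a Python sketch of the update picture (our own addition, not from the handbook; the worlds and the referent label are invented). The common ground is a set of worlds and assertion intersects it with the asserted proposition; recording discourse referents separately mirrors why (16) and (17) diverge despite identical truth conditions.

```python
# Illustrative sketch (not from the handbook): Stalnakerian update by
# intersection, plus a record of discourse referents to mirror (16)/(17).

worlds = {"w1", "w2", "w3"}

# Truth-conditionally equivalent first sentences of (16) and (17):
one_ball_out = {"w1", "w2"}    # [[Exactly one of the ten balls is not in the bag]]
nine_balls_in = {"w1", "w2"}   # [[Exactly nine of the ten balls are in the bag]]

def assert_sentence(context, proposition, new_referents=()):
    """Update: intersect the common ground and add any discourse referents."""
    common_ground, referents = context
    return common_ground & proposition, referents | set(new_referents)

ctx0 = (worlds, frozenset())

# (16) introduces an antecedent for 'it'; (17) does not.
ctx16 = assert_sentence(ctx0, one_ball_out, new_referents={"x_missing_ball"})
ctx17 = assert_sentence(ctx0, nine_balls_in)

print(ctx16[0] == ctx17[0])           # True: the same worlds survive the update
print("x_missing_ball" in ctx16[1])   # True: the pronoun can be resolved
print("x_missing_ball" in ctx17[1])   # False: anaphora fails, as in (17)
```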
5. Meaning in contemporary semantics

Meaning is a notion investigated by a number of disciplines, including linguistics, philosophy, psychology, artificial intelligence and semiotics, as well as many others. The definitions of meaning are as manifold and plentiful as the different theories and perspectives that arise from these disciplines. We have argued here that in order to use meaning as a well-defined object of investigation, we must perceive facts to be explained and have tests to expose the underlying phenomena, and we must have a well-defined scientific apparatus which allows us to describe, analyze and model these phenomena. This scientific apparatus is contemporary semantics: It possesses a clearly defined terminology, it provides abstract representations, and it allows for formal modeling that adheres to scientific standards and renders predictions that can be verified or falsified. We have illustrated the tenets, tools and goals of contemporary semantics by discussing typical approaches to three central characteristics of meaning: truth conditionality, compositionality, and context and discourse.

Recent times have witnessed an increased interest of semanticists in developing their theories on a broader basis of empirical evidence, taking into account crosslinguistic data, diachronic data, psycho- and neurolinguistic studies as well as corpus linguistic and computational linguistic resources. As a result of these efforts, contemporary semantics is characterized by continuous explanatory progress, an increased awareness of and proficiency in methodological issues, and the emergence of new opportunities for interdisciplinary cooperation. Along these lines, the articles of this handbook develop an integral, many-faceted and yet well-rounded picture of this joint endeavour in the linguistic study of natural language meaning.

6. References

Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81-95. Reprinted in: D. Davidson. Essays on Actions and Events. Oxford: Clarendon Press, 1980, 105-148.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56-78.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kamp, Hans 1988. Belief attribution and context & comments on Stalnaker. In: R. H. Grimm & D. D. Merrill (eds.). Contents of Thought. Tucson, AZ: The University of Arizona Press, 156-181, 203-206.
Kaplan, David 1977/1989. Demonstratives. An Essay on the Semantics, Logic, Metaphysics, and Epistemology of Demonstratives and Other Indexicals. Ms. Los Angeles, CA: University of California. Printed in: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 1989, 481-563.
Quine, Willard van Orman 1958. Speaking of objects. Proceedings and Addresses of the American Philosophical Association 31, 5-22.
Stalnaker, Robert 1970. Pragmatics. Synthese 22, 272-289.
Wolfart, H. Christoph 1973. Plains Cree: A Grammatical Study. Transactions of the American Philosophical Society, New Series, vol. 63, pt. 5. Philadelphia, PA: The American Philosophical Society.
Claudia Maienborn, Tübingen (Germany)
Klaus von Heusinger, Stuttgart (Germany)
Paul Portner, Washington, DC (USA)
2. Meaning, intentionality and communication

1. Introduction
2. Intentionality: Brentano's legacy
3. Early pragmatics: ordinary language philosophy and speech act theory
4. Grice on speaker's meaning and implicatures
5. The emergence of truth-conditional pragmatics
6. Concluding remarks: pragmatics and cognitive science
7. References

Abstract

This article probes the connections between the metaphysics of meaning and the investigation of human communication. It first argues that contemporary philosophy of mind has inherited most of its metaphysical questions from Brentano's puzzling definition of intentionality. Then it examines how intentionality came to occupy the forefront of pragmatics in three steps. (1) By investigating speech acts, Austin and ordinary language philosophers pioneered the study of intentional actions performed by uttering sentences of natural languages. (2) Based on his novel concept of speaker's meaning and his inferential view of human communication as a cooperative and rational activity, Grice developed a three-tiered model of the meaning of utterances: (i) the linguistic meaning of the uttered sentence; (ii) the explicit truth-conditional content of the utterance; (iii) the implicit content conveyed by the utterance. (3) Finally, the newly emerging truth-conditional trend in pragmatics urges that not only the implicit content conveyed by an utterance but its explicit content as well depends on the speaker's communicative intention.

1. Introduction

This article lies at the interface between the scientific investigation of human verbal communication and metaphysical questions about the nature of meaning. Words and sentences of natural languages have meaning (or semantic properties) and they are used by humans in tasks of verbal communication. Much of twentieth-century philosophy of mind has been concerned with metaphysical questions raised by the perplexing nature of meaning. For example, what is it about the meaning of the English word "dog" that enables a particular token used in the USA in 2008 to latch onto hairy barking creatures that lived in Egypt four thousand years earlier (cf. Horwich 2005)? Meanwhile, the study of human communication in the twentieth century can be seen as a competition between two models, which Sperber & Wilson (1986) call the "code model" and the "inferential model." A decoding process maps a signal onto a message associated with the signal by an underlying code (i.e., a system of rules or conventions). An inferential process maps premises onto a conclusion, which is warranted by the premises. When an addressee understands a speaker's utterance, how much of the content of the utterance has been coded into, and can be decoded from, the linguistic meaning of the utterance? How much content does the addressee retrieve by his ability to infer the speaker's communicative intention? These are the basic scientific questions in the investigation of human verbal communication.
Much philosophy of mind in the twentieth century devoted to the metaphysics of meaning sprang from Brentano's puzzling definition of the medieval word "intentionality" (section 2). Austin, one of the leading ordinary language philosophers, emphasized the fact that by uttering sentences of some natural language, a speaker may perform an action, i.e., a speech act (section 3). But he espoused a social conventionalist view of speech acts, which later pragmatics rejected in favor of an inferential approach. Grice instead developed an inferential model of verbal communication based on his concept of speaker's meaning and his view that communication is a cooperative and rational activity (section 4). However, many of Grice's insights have been further developed into a non-Gricean truth-conditional pragmatics (section 5). Finally, the "relevance-theoretic" approach pioneered by Sperber & Wilson (1986) fills part of the gap between the study of meaning and the cognitive sciences (section 6).

2. Intentionality: Brentano's legacy

Brentano (1874) made a twofold contribution to the philosophy of mind: he provided a puzzling definition of intentionality and he put forward the thesis that intentionality is "the mark of the mental." Intentionality is the power of minds to be about things, properties, events and states of affairs. As the meaning of its Latin root (tendere) indicates, "intentionality" denotes the mental tension whereby the human mind aims at so-called "intentional objects." The concept of intentionality should not be confused with the concept of intention. Intentions are special psychological states involved in the planning and execution of actions. But on Brentano's view, intentionality is a property of all psychological phenomena. Nor should "intentional" and "intentionality" be confused with the predicates "intensional" and "intensionality," which mean "non-extensional" and "non-extensionality": they refer to logical features of sentences and utterances, some of which may describe (or report) an individual's psychological states. "Creature with a heart" and "creature with a kidney" have the same extension: all creatures with a heart have a kidney and conversely (cf. Quine 1948). But they have different intensions because having a heart and having a kidney are different properties. This distinction mirrors Frege's (1892) distinction between sense and reference (cf. article 3 (Textor) Sense and reference and article 4 (Abbott) Reference). In general, a linguistic context is non-extensional (or intensional) if it fails to license both the substitution of coreferential terms salva veritate and the application of the rule of existential generalization.

As Brentano defined it, intentionality is what enables a psychological state or act to represent a state of affairs, or be directed upon what he called an "intentional object." Intentional objects exemplify the property which Brentano called "intentional inexistence" or "immanent objectivity," by which he meant that the mind may aim at targets that do not exist in space and time or represent states of affairs that fail to obtain or even to be possible. For example, unicorns do not exist in space and time and round squares are not possible geometrical objects. Nonetheless thinking about either a unicorn or a round square is not thinking about nothing.
To admire Sherlock Holmes or to love Anna Karenina is to admire or love something, i.e., some intentional object. Thus, Brentano's characterization of intentionality gave rise to a gap in twentieth-century philosophical logic between intentional-objects theorists (Meinong 1904; Parsons 1980; Zalta 1988), who claimed that there must be things that do not exist, and their opponents (Russell 1905;
Quine 1948), who denied it and rejected the distinction between being and existence. (For further discussion, cf. Jacob 2003.)

Brentano (1874) also held the thesis that intentionality is constitutive of the mental: all and only psychological phenomena exhibit intentionality. Brentano's second thesis that only psychological (or mental) phenomena possess intentionality led him to embrace a version of the Cartesian ontological dualism between mental and physical things. Chisholm (1957) offered a linguistic version of Brentano's second thesis, according to which the intensionality of a linguistic report is a criterion of the intentionality of the reported psychological state (cf. Jacob 2003). He further argued that the contents of sentences describing an agent's psychological states cannot be successfully paraphrased into the behaviorist idiom of sentences describing the agent's bodily movements and behavior. Quine (1960) accepted Chisholm's (1957) linguistic version of Brentano's second thesis, which he used as a premise for an influential dilemma: if the intentional idiom is not reducible to the behaviorist idiom, then the intentional idiom cannot be part of the vocabulary of the natural sciences and intentionality cannot be "naturalized." Quine's dilemma was that one must choose between a physicalist ontology and intentional realism, i.e., the view that intentionality is a real phenomenon. Unlike Brentano, Quine endorsed physicalism and rejected intentional realism.

Some of the physicalists who accept Quine's dilemma (e.g., Churchland 1989) have embraced eliminative materialism and denied the reality of beliefs and desires. The short answer to this proposal is that it is difficult to make sense of the belief that there are no beliefs. Others (such as Dennett 1987) have taken the "instrumentalist" view that, although the intentional idiom is a useful stance for predicting a complex physical system's behavior, it lacks an explanatory value. But the question arises how the intentional idiom could make useful predictions if it fails to describe and explain anything (cf. Jacob 1997, 2003 and Rey 1997). As a result of the difficulties inherent to both eliminative materialism and interpretive instrumentalism, several physicalists have chosen to deny Brentano's thesis that only non-physical things exhibit intentionality, and to challenge Quine's dilemma according to which intentional realism is not compatible with physicalism. Their project is to "naturalize" intentionality and account for the puzzling features of intentionality (e.g., the fact that the mind may aim at non-existing objects and represent non-actual states of affairs), using only concepts recognizable by natural scientists (cf. section 3 on Grice's notion of non-natural meaning).

In recent philosophy of mind, the most influential proposals for naturalizing intentionality have been versions of the so-called "teleosemantic" approach championed by Millikan (1984), Dretske (1995) and others, which is based on the notion of biological function (or purpose). Teleosemantic theories are so called because they posit an underlying connection between teleology (design or function) and content (or intentionality): a representational device is endowed with a function (or purpose). Something whose function is to indicate the presence of some property may fail to fulfill its function. If and when it does, then it may generate a false representation or represent something that fails to exist.
Brentano's thesis that only mental phenomena exhibit intentionality also seems open to the challenge that expressions of natural languages, which are not mental things, have intentionality in virtue of which they too can represent things, properties, events and
states of affairs. In response, many philosophers of mind, such as Grice (1957, 1968), Fodor (1987), Haugeland (1981) and Searle (1983, 1992), have endorsed the distinction between the underived intentionality of a speaker's psychological states and the derived intentionality (i.e., the conventional meaning) of the sentences by the utterance of which she expresses her mental states. On their view, sentences of natural languages would lack meaning unless humans used them for some purpose. (But for dissent, see Dennett 1987.) Some philosophers go one step further and posit the existence of an internal "language of thought": thinking, having a thought or a propositional attitude is to entertain a token of a mental formula realized in one's brain. On this view, like sentences of natural languages, mental sentences possess syntactic and semantic properties. But, unlike sentences of natural languages, they lack phonological properties. Thus, the semantic properties of a complex mental sentence systematically depend upon the meanings of its constituents and their syntactic combination. The strongest arguments for the existence of a language of thought are based on the productivity and systematicity of thoughts, i.e., the facts that there is no upper limit on the complexity of thoughts and that a creature able to form certain thoughts must be able to form other related thoughts. On this view, the intentionality of an individual's thoughts and propositional attitudes derives from the meanings of symbols in the language of thought (cf. Fodor 1975, 1987).

3. Early pragmatics: ordinary language philosophy and speech act theory

Unlike sentences of natural languages, utterances are created by speakers, at particular places and times, for various purposes, including verbal communication. Not all communication, however, need be verbal. Nor do people use language solely for the purpose of communication; one can use language for clarifying one's thoughts, reasoning and making calculations. Utterances, not sentences, can be shouted in a hoarse voice and tape-recorded. Similarly, the full meaning of an utterance goes beyond the linguistic meaning of the uttered sentence in two distinct respects: both its representational content and its so-called "illocutionary force" (i.e., whether the utterance is meant as a prediction, a threat or an assertion) are underdetermined by the linguistic meaning of the uttered sentence.

Prior to the cognitive revolution of the 1950's, the philosophy of language was divided into two opposing approaches: so-called "ideal language" philosophy (in the tradition of Frege, Russell, Carnap and Tarski) and so-called "ordinary language" philosophy (in the tradition of Wittgenstein, Austin, Strawson and later Searle). The word "pragmatics," which derives from the Greek word praxis (which means action), was first introduced by ideal language philosophers as part of a threefold distinction between syntax, semantics and pragmatics (cf. Morris 1938 and Carnap 1942). Syntax was defined as the study of internal relations among symbols of a language. Semantics was defined as the study of the relations between symbols and their denotations (or designata). Pragmatics was defined as the study of the relations between symbols and their users (cf. article 88 (Jaszczolt) Semantics and pragmatics).

Ideal language philosophers were interested in the semantic structures of sentences of formal languages designed for capturing mathematical truths. The syntactic structure
of any "well-formed formula" (i.e., sentence) of a formal language is defined by arbitrary rules of formation and derivation. Semantic values are assigned to simple symbols of the language by stipulation, and the truth-conditions of a sentence can be mechanically determined from the semantic values of its constituents by the syntactic rules of composition. From the perspective of ideal language philosophers, such features of natural languages as their context-dependency appeared as a defect. For example, unlike formal languages, natural languages contain indexical expressions (e.g., "now", "here" or "I") whose references can change with the context of utterance.

By contrast, ordinary language philosophers were concerned with the distinctive features of the meanings of expressions of natural languages and the variety of their uses. In sharp opposition to ideal language philosophers, ordinary language philosophers stressed two main points, which paved the way for later work in pragmatics. First, they emphasized the context-dependency of the descriptive content expressed by utterances of sentences of natural languages (see section 4). Austin (1962a: 110-111) denied that a sentence as such could ever be ascribed truth-conditions and a truth-value: "the question of truth and falsehood does not turn only on what a sentence is, nor yet on what it means, but on, speaking very broadly, the circumstances in which it is uttered." Secondly, they criticized what Austin (1962b) called the "descriptive fallacy," according to which the sole point of using language is to state facts or describe the world (cf. article 5 (Green) Meaning in language use). As indicated by the title of Austin's (1962b) book, How to Do Things with Words, they argued that by uttering sentences of some natural language, a speaker performs an action, i.e., a speech act: she performs an "illocutionary act" with a particular illocutionary force. A speaker may give an order, ask a question, make a threat, a promise, an entreaty, an apology, an assertion and so on. Austin (1962b) sketched a new framework for the description and classification of speech acts. As Green (2007) notes, speech acts are not to be confused with acts of speech: "one can perform an act of speech, say by uttering words in order to test a microphone, without performing a speech act." Conversely, one can issue a warning without saying anything, by producing a gesture or a "minatory facial expression."

Austin (1962b) identified three distinct levels of action in the performance of a speech act: the "locutionary act," the "illocutionary act," and the "perlocutionary act," which stand to one another in the following hierarchical structure. By uttering a sentence, a speaker performs the locutionary act of saying something, by virtue of which she performs an illocutionary act with a given illocutionary force (e.g., giving an order). Finally, by performing an illocutionary act endowed with a specific illocutionary force, the speaker performs a perlocutionary act, whereby she achieves some psychological or behavioral effect upon her audience, such as frightening him or convincing him.

Before he considered this threefold distinction within the structure of speech acts, Austin had made a distinction between so-called "constative" and "performative" utterances. The former is supposed to describe some state of affairs and is true or false according to whether the described state of affairs obtains or not.
Instead of being a (true or false) description of some independent state of affairs, the latter is supposed to constitute (or create) a state of affairs of its own. Clearly, the utterance of a sentence in either the imperative mood ("Leave this room immediately!") or the interrogative mood ("What time is it right now?") is performative in this sense: far from purporting to register any pre-existing state of affairs, the speaker either gives an order or asks a question.
By drawing the distinction between constative and performative utterances, Austin was able to criticize the descriptive fallacy and emphasize the fact that many utterances of declarative sentences are performative (not constative) utterances. In particular, Austin was interested in explicit performative utterances ("I promise I'll come," "I order you to leave" or "I apologize"), which include a main verb that denotes the very speech act that the utterance performs. Austin's attention was drawn towards explicit performatives whose performance is governed, not merely by linguistic rules, but also by social conventions and by what Searle (1969: 51) called "institutional facts" (as in "I hereby pronounce you husband and wife"), i.e., facts that (unlike "brute facts") presuppose the existence of human institutions. Specific bodily movements count as a move in a game, as an act of e.g., betting, or as part of a marriage ceremony only if they conform to some conventions that are part of some social institutions. For a performative speech act to count as an act of baptism, of marriage, or an oath, the utterance must meet some social constraints, which Austin calls "felicity" conditions. Purported speech acts of baptism, oath or marriage can fail some of their felicity conditions and thereby "misfire" if either the speaker lacks the proper authority or the addressee fails to respond with an appropriate uptake - in response to e.g., an attempted bet sincerely made by the speaker. If a speaker makes an insincere promise, then he is guilty of an "abuse."

Austin came to abandon his former distinction between constative and performative utterances when he came to realize that some explicit performatives can be used to make true or false assertions or predictions. One can make an explicit promise or an explicit request by uttering a sentence prefixed by either "I promise" or "I request." One can also make an assertion or a prediction by uttering a sentence prefixed by either "I assert" or "I predict." Furthermore, two of his assumptions led Austin to embrace a social conventionalist view of illocutionary acts. First, Austin took explicit performatives as a general model for illocutionary acts. Secondly, he took explicit performatives whose felicity conditions include the satisfaction of social conventions as a paradigm of all explicit performatives. Thus Austin (1962b: 103) was led to embrace a social conventionalist view of illocutionary acts according to which the illocutionary force of a speech act is "conventional in the sense that it could be made explicit by the performative formula."

Austin's social conventionalist view of illocutionary force was challenged by Strawson (1964: 153-154), who pointed out that the assumption that no illocutionary act could be performed unless it conformed to some social convention would be "like supposing that there could not be love affairs which did not proceed on lines laid down in the Roman de la Rose." Instead, Strawson argued, what confers on a speech act its illocutionary force is that the speaker intends it to be so taken by her audience. By uttering "You will leave," the speaker may make a prediction, a bet or order the addressee to leave. Only the context, not some socially established convention, may help the audience determine the particular illocutionary force of the utterance. Also, as noted by Searle (1975) and by Bach & Harnish (1979), speech acts may be performed indirectly.
For example, by uttering "I would like you to leave," a speaker directly expresses her desire that her addressee leave. But in so doing, she may indirectly ask or request her addressee to do so. By uttering "Can you pass the salt?" - which is a direct question about her addressee's ability -, the speaker may indirectly request him to pass the salt. As Recanati (1987: 92-93) argues, when a speaker utters an explicit performative such as "I order you to leave," her utterance has the direct illocutionary force of a statement. But it may also have the indirect force of an order. There need be no socially
established convention whereby a speaker orders her audience to leave by means of an utterance with a verb that denotes the act performed by the speaker.

4. Grice on speaker's meaning and implicatures

In his seminal 1957 paper, Grice did three things: he drew a contrast between "natural" and "non-natural" meaning; he offered a definition of the novel concept of speaker's meaning; and he sketched a framework within which human communication is seen as a cooperative and rational activity (the addressee's task being to infer the speaker's meaning on the basis of her utterance, in accordance with a few principles of rational cooperation). In so doing, Grice took a major step towards an "inferential model" of human communication, and away from the "code model" (cf. article 88 (Jaszczolt) Semantics and pragmatics).

As Grice (1957) emphasized, smoke is a natural sign of fire: the former naturally means the latter in the sense that not unless there was a fire would there be any smoke. By contrast, the English word "fire" (or the French word "feu") non-naturally means fire: if a person erroneously believes that there is a fire (or wants to intentionally mislead another into wrongly thinking that there is a fire) when there is none, then she can produce a token of the word "fire" in the absence of a fire. (Thus, the notion of non-natural meaning is Grice's counterpart of Brentano's intentionality.) Grice (1957, 1968, 1969) further introduced the concept of speaker's meaning, i.e., of someone meaning something by exhibiting some piece of behavior that can, but need not, be verbal. For a speaker S to mean something by producing some utterance x is for S to intend the utterance of x to produce some effect (or response r) in an audience A by means of A's recognition of this very intention. Hence, the speaker's meaning is a communicative intention, with the peculiar feature of being reflexive in the sense that part of its content is that an audience recognize it.

Strawson (1964) turned to Grice's concept of speaker's meaning as an intentionalist alternative to Austin's social conventionalist account of illocutionary acts (section 3). Strawson (1964) also pointed out that Grice's complex analysis of speaker's meaning or communicative intention requires the distinction between three complementary levels of intention. For S to mean something by an utterance x is for S to intend: (i) S's utterance of x to produce a response r in audience A; (ii) A to recognize S's intention (i); (iii) A's recognition of S's intention (i) to function at least as part of A's reason for A's response r.

This analysis raises two opposite problems: it is both overly restrictive and insufficiently so. First, as Strawson's reformulation shows, Grice's condition (i) corresponds to S's intention to perform what Austin (1962b) called a perlocutionary act. But for S to successfully communicate with A, it is not necessary that S's intention to perform her perlocutionary act be fulfilled (cf. Searle 1969: 46-48). Suppose that S utters "It is raining," intending (i) to produce in A the belief that it is raining. A may recognize S's intention (i); but, for some reason, A may mistrust S and fail to acquire the belief that it is raining. S would have failed to convince A (that it is raining), but S would nonetheless have successfully communicated what she meant to A. Thus, fulfillment of S's intention (i) is not
necessary for successful communication. Nor is the fulfillment of S's intention (iii), which presupposes the fulfillment of S's intention (i). All that is required for S to communicate what she meant to A is A's recognition of S's intention (ii), i.e., of S's higher-order intention to inform A of her first-order intention to inform A of something.

Secondly, Strawson (1964) pointed out that his reformulation of Grice's definition of speaker's meaning is insufficiently restrictive. Following Sperber & Wilson (1986: 30), suppose that S intends A to believe that she needs his help to fix her hair-dryer, but she is reluctant to ask him openly to do so. S ostensively offers A evidence that she is trying to fix her hair-dryer, thereby intending A to believe that she needs his help. S intends A to recognize her intention to inform him that she needs his help. However, S does not want A to know that she knows that he is watching her. Since S is not openly asking A to help her, she is not communicating with A. Although S has the second-order intention that A recognize her first-order intention to inform him that she needs his help, she does not want A to recognize her second-order intention. To deal with such a case, Strawson (1964) suggested that the analysis of Grice's speaker's meaning include S's third-order intention to have her second-order intention recognized by her audience. But as Schiffer (1972) pointed out, this opens the way to an infinity of higher-order intentions. Instead, Schiffer (1972) argued that for S to have a communicative intention, S's intention to inform A must be mutually known to S and A. But as pointed out by Sperber & Wilson (1986: 18-19), people who share mutual knowledge know that they do. So the question arises: how do speaker and hearer know that they do? (We shall come back to this issue in the concluding remarks.)

Grice (1968) thought of his concept of speaker's meaning as a basis for a reductive analysis of semantic notions such as sentence- or word-meaning. But most linguists and philosophers have expressed skepticism about this aspect of Grice's program (cf. Chomsky 1975, 1980). By contrast, many assume that some amended version of Grice's concept of speaker's meaning can serve as a basis for an inferential model of human communication. In his 1967 William James Lectures, Grice argued that what enables the hearer to infer the speaker's meaning on the basis of her utterance is that he rationally expects all utterances to meet the "Cooperative Principle" and a set of nine maxims or norms organized into four main categories which, by reference to Kant, he labeled maxims of Quantity (informativeness), Quality (truthfulness), Relation (relevance) and Manner (clarity).

As ordinary language philosophers emphasized, in addition to what is being said by an assertion - what makes the assertion true or false -, the very performance of an illocutionary act with the force of an assertion has pragmatic implications. For example, consider Moore's paradox: by uttering "It is raining but I do not believe it," the speaker is not expressing a logical contradiction, as there is no logical contradiction between the fact that it is raining and the fact that the speaker fails to believe it. Nonetheless, the utterance is pragmatically paradoxical because by asserting that it is raining, the speaker thereby expresses (or displays) her belief that it is raining, while her utterance explicitly denies that she believes it.
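The structure of the paradox can be displayed schematically. The following rendering is added here only for illustration and is not Grice's or the ordinary language philosophers' own notation; let p abbreviate "it is raining" and B_s "the speaker believes that":

```latex
% Moore's paradox, schematically (illustrative notation, not Grice's):
\begin{align*}
\text{asserted content:}   \quad & p \land \lnot B_s p
   && \text{(satisfiable, hence no logical contradiction)}\\
\text{expressed attitude:} \quad & B_s p
   && \text{(asserting } p \text{ displays belief that } p\text{)}\\
\text{pragmatic clash:}    \quad & B_s p \ \text{vs.}\ \lnot B_s p
   && \text{(what the act displays against what it asserts)}
\end{align*}
```

On this rendering the defect lies in the relation between the act of asserting and its content, not within the content itself, which is why the paradox is pragmatic rather than logical.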
Grice's (1967/1975) third main contribution to an inferential model of communication was his concept of conversational implicature, which he introduced as "a term of art" (cf. Grice 1989: 24). Suppose that Bill asks Jill whether she is going out and Jill replies: "It's raining." For Jill's utterance about the weather to constitute a response to Bill's question,
additional assumptions are required, such as, for example, that Jill does not like rain (i.e., that if it is raining, then Jill is not going out) which, together with Jill's response, may entail that she is not going out. Grice's approach to communication, based on the Cooperative Principle and the maxims, offers a framework for explaining how, from Jill's utterance, Bill can retrieve an implicit answer to his question by supplying some additional assumptions. Bill must be aware that Jill's utterance is not a direct answer to his question. Assuming that Jill does not violate (or "flout") the maxim of relevance, she must have intended Bill to supply the assumption that e.g., she does not enjoy rain, and to infer that she is not going out from her explicit utterance. Grice (1967/1975) called the additional assumption and the conclusion "conversational" implicatures. In other words, Grice's conversational implicatures enable a hearer to reconcile a speaker's utterance with his assumption that the speaker conforms to the Principle of Cooperation. Grice (1989: 31) insisted that "the presence of a conversational implicature must be capable of being worked out; for even if it can in fact be intuitively grasped, unless the intuition is replaceable by an argument, the implicature (if present at all) will not count as a conversational implicature." (Instead, it would count as a so-called "conventional" implicature, i.e., a conventional aspect of meaning that makes no contribution to the truth-conditions of the utterance.) Grice further distinguished "generalized" conversational implicatures, which are generated so to speak "by default," from "particularized" conversational implicatures, whose generation depends on special features of the context of utterance.

Grice's application of his cooperative framework to human communication and his elaboration of the concept of (generalized) conversational implicature were motivated by his concern to block certain moves made by ordinary language philosophers. One such move was exemplified by Strawson's (1952) claim that, unlike the truth-functional conjunction of propositional calculus, the English word "and" makes different contributions to the full meanings of the utterances of pairs of conjoined sentences. For example, by uttering "John took off his boots and got into bed" the speaker may mean that the event first described took place first. In response, Grice (1967/1975, 1981) argued that, in accordance with the truth-table of the logical conjunction of propositional calculus, the utterance of any pair of sentences conjoined by "and" is true if and only if both conjuncts are true, and false otherwise. He took the view that the temporal ordering of the sequence of events described by such an utterance need not be part of the semantic content (or truth-conditions) of the utterance. Instead, it arises as a conversational implicature retrieved by the hearer through an inferential process guided by his expectation that the speaker is following the Cooperative Principle and the maxims, e.g., the sub-maxim of orderliness (one of the sub-maxims of the maxim of Manner), according to which there is some reason why the speaker chose to utter the first conjunct first. Also, under the influence of Wittgenstein, some ordinary language philosophers claimed that unless there are reasons to doubt whether some thing is really red, it is illegitimate to say "It looks red to me" (as opposed to "It is red").
In response, Grice (1967/1975) argued that whether an utterance is true or false is one thing; whether it is odd or misleading is another (cf. Carston 2002a: 103; but see Travis 1991 for dissent).
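Grice's truth-functional treatment of "and" can be summarized in the standard truth table of propositional conjunction, which on his view exhausts the word's contribution to truth-conditions; the table is elementary propositional logic, reproduced here only for illustration:

```latex
% Truth conditions of "p and q" on Grice's analysis: those of p AND q.
% The "and then" effect is a conversational implicature (orderliness),
% not part of these truth conditions.
\begin{array}{cc|c}
p & q & p \wedge q \\ \hline
T & T & T \\
T & F & F \\
F & T & F \\
F & F & F
\end{array}
```

On this analysis, "John took off his boots and got into bed" and "John got into bed and took off his boots" have exactly the same truth conditions and differ only in what they implicate.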
5. The emergence of truth-conditional pragmatics

Grice's seminal work made it clear that verbal communication involves three layers of meaning: (i) the linguistic (conventional) meaning of the sentence uttered, (ii) the explicit content expressed (i.e., "what is said") by the utterance, and (iii) the implicit content of the utterance (its conversational implicatures). Work in speech act theory further suggests that each layer of meaning also exhibits a descriptive dimension (e.g., the truth conditions of an utterance) and a pragmatic dimension (e.g., the fact that a speech act is an assertion). Restricting itself to the descriptive dimension of meaning, the rest of this section discusses the emergence of a new truth-conditional pragmatic approach, whose core thesis is that what is said (not just the conversational implicatures of an utterance) depends on the speaker's meaning. By further extending the inferentialist model of communication, this pragmatic approach to what is said contravenes two deeply entrenched principles in the philosophy of language: literalism and minimalism.

Ideal language philosophers thought of indexicality and other context-sensitive phenomena as defective features of natural languages. Quine (1960: 193) introduced the concept of an eternal sentence as one devoid of any context-sensitive or ambiguous constituent so that its "truth-value stays fixed through time and from speaker to speaker." An instance of an eternal sentence might be: "Three plus two equals five." Following Quine, many philosophers (see e.g., Katz 1981) subsequently accepted literalism, i.e., the view that for any statement made in some natural language, using a context-sensitive sentence in a given context, there is some eternal sentence in the same language that can be used to make the same statement in any context. Few linguists and philosophers nowadays subscribe to literalism because they recognize that indexicality is an ineliminable feature of natural languages. However, many subscribe to minimalism. Grice urged an inferential model of the pragmatic process whereby a hearer infers the conversational implicatures of an utterance from what is said. But he embraced the minimalist view that what is said departs from the linguistic meaning of the uttered sentence only as far as is necessary for the utterance to be truth-evaluable (cf. Grice 1989: 25). If a sentence contains an ambiguous phrase (e.g., "He is in the grip of a vice"), then it must be disambiguated. If it contains an indexical, then it cannot be assigned its proper semantic value except by relying on contextual information. But according to minimalism, appeal to contextual information is always mandated by some linguistic constituent (e.g., an indexical) within the sentence. In order to determine what is said by the utterance of a sentence containing e.g., the indexical pronoun "I," the hearer relies on the rule according to which any token of "I" refers to the speaker who used that token. As Stanley (2000: 391) puts it, "all truth-conditional effects of extra-linguistic context can be traced to logical form" (i.e., the semantic information that is grammatically encoded). Unlike the reference of a pure indexical like "I," however, the reference of a demonstratively used pronoun (e.g., "he") can only be determined by representing the speaker's meaning, not by a semantic rule. The same goes for determining the semantic value of "here" or "now".
A person may use a token of "here" to refer to a room, a street, a city, a country, the Earth, and so forth. Similarly, a person may use a token of "now" to refer to a millisecond, an hour, a day, a year, a century, and so forth. One cannot determine the semantic value of a token of either "here" or "now" without representing the speaker's meaning.
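The contrast between rule-governed and intention-dependent context sensitivity can be made vivid with a small illustration. The following sketch is purely expository (the names and data structures are mine, not drawn from the literature): the reference of "I" is computable from a context parameter by a fixed rule, while resolving "here" requires a further ingredient, the grain the speaker has in mind, which no linguistic rule supplies.

```python
# Illustrative sketch (hypothetical names): rule-based vs. intention-based
# resolution of indexicals, in the spirit of the minimalism debate above.
from dataclasses import dataclass, field

@dataclass
class Context:
    speaker: str
    # Candidate referents for "here", ordered by spatial grain.
    places: dict = field(default_factory=dict)
    # What the speaker means by "here"; nothing linguistic fixes this value.
    intended_grain: str = ""

def ref_i(ctx: Context) -> str:
    """Pure indexical: a token of "I" refers to whoever produced it."""
    return ctx.speaker

def ref_here(ctx: Context) -> str:
    """No semantic rule selects the grain; the hearer must represent
    the speaker's meaning to pick one of the candidates."""
    return ctx.places[ctx.intended_grain]

ctx = Context(speaker="Jill",
              places={"room": "Room 12", "city": "Paris", "planet": "Earth"},
              intended_grain="city")
print(ref_i(ctx))     # Jill  -- determined by the semantic rule alone
print(ref_here(ctx))  # Paris -- determined only given the speaker's intention
```

Treating the grain as just another context parameter, as the sketch does, merely relocates the problem: something must fix the parameter's value, and on the truth-conditional pragmatic view that something is the speaker's meaning.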
According to truth-conditional pragmatics, what is said by an utterance is determined by pragmatic processes, which are not necessarily triggered by some syntactic constituent of the uttered sentence (e.g., an indexical). By contrast, minimalists reject truth-conditional pragmatics and postulate, in the logical form of the sentence uttered, the existence of hidden variables whose semantic values must be contextually determined for the utterance to be truth-evaluable (see the controversy between Stanley 2000 and Recanati 2004 over whether the logical form of an utterance of "It's raining" contains a free variable for locations).

The rise of truth-conditional pragmatics may be interpreted (cf. Travis 1991) as vindicating the view that an utterance's truth-conditions depend on what Searle (1978, 1983) calls "the Background," i.e., a network of practices and unarticulated assumptions (but see Stalnaker 1999 and cf. article 38 (Dekker) Dynamic semantics for a semantic approach). Although the verb "to cut" is unambiguous, what counts as cutting grass differs from what counts as cutting a cake. Only against alternative background assumptions will one be able to discriminate the truth-conditions of "John cut the grass" and of "John cut the cake." However, advocates of minimalism argue that if, instead of using a lawn mower, John took out his pocket-knife and cut each blade lengthwise, then by uttering "John cut the grass" the speaker would speak the truth (cf. Cappelen & Lepore 2005).

Three pragmatic processes involved in determining what is said by an utterance have been particularly investigated by advocates of truth-conditional pragmatics: free enrichment, loosening and transfer.

5.1. Free enrichment

Grice (1967/1975, 1981) offered a pragmatic account according to which the temporal or causal ordering between the events described by the utterance of a conjunction is conveyed as a conversational implicature. But consider Carston's (1988) example: "Bob gave Mary his key and she opened the door." Carston (1988) argues that part of what is said is that "she" refers to whoever "Mary" refers to and that Mary opened the door with the key Bob gave her. If so, then the fact that Bob gave Mary his key before Mary opened the door is also part of what is said. Following Sperber & Wilson (1986: 189), suppose a speaker utters "I have had breakfast," as an indirect way of declining an offer of food. By minimalist standards, what the speaker said was that she has had breakfast at least once in her life prior to her utterance. According to Grice, the hearer must be able to infer a conversational implicature from what the speaker said. However, the hearer could not conclude that the speaker does not wish any food from the truism that she has had breakfast at least once in her life before her utterance. Instead, for the hearer to infer that the speaker does not wish to have food in response to his question, what the speaker must have said is that she has had breakfast just prior to the time of utterance.

5.2. Loosening

Cases of free enrichment are instances of strengthening the concept linguistically encoded by the meaning of the sentence - for example, strengthening of the concept encoded by "the key" into the concept expressible by "the key Bob gave to Mary". However,
not all pragmatic processes underlying the generation of what is said from the linguistic meaning of the sentence are processes of conceptual strengthening or narrowing. Some are processes of conceptual loosening or broadening. For example, imagine a speaker's utterance in a restaurant of "My steak is raw" whereby what she says is not that her steak is literally uncooked but rather that it is undercooked.

5.3. Transfer

Strengthening and loosening are cases of modification of a concept linguistically encoded by the meaning of a word. Transfer is a process whereby a concept encoded by the meaning of a word is mapped onto a related but different concept. Transfer is illustrated by examples from Nunberg (1979, 1995): "The ham sandwich left without paying" and "I am parked out back." In the first example, the property expressed by the predicate "left without paying" is being ascribed to the person who ordered the sandwich, not to the sandwich itself. In the second example, the property of being parked out back is ascribed not to the speaker, but to the car that stands in the ownership relation to her.

The gist of truth-conditional pragmatics is that speaker's meaning is involved in determining both the conversational implicatures of an utterance and what is said. As the following example shows, however, it is not always easy to decide whether a particular assumption is a conversational implicature of an utterance or part of what is said. Consider "The picnic was awful. The beer was warm." For the second sentence to offer a justification (or explanation) of the truth expressed by the first, the assumption must be made that the beer was part of the picnic. According to Carston (2002b), the assumption that the beer was part of the picnic is a conversational implicature (an implicated premise) of the utterance. According to Recanati (2004), the concept linguistically encoded by "the beer" is strengthened into the concept expressible by "the beer that was part of the picnic" and is part of what is said.

6. Concluding remarks: pragmatics and cognitive science

Sperber & Wilson's (1986) relevance-theoretic approach squarely belongs to truth-conditional pragmatics: it makes three contributions towards bridging the gap between pragmatics and the cognitive sciences. First, it offers a novel account of speaker's meaning. As Schiffer (1972) pointed out, not unless S's intention to inform A is mutually known to S and A could S's intention count as a genuine communicative intention (cf. section 4). But how could S and A know that they mutually know S's intention to inform A of something? Sperber & Wilson (1986) argue that they cannot, and urge that the mutual knowledge requirement be replaced by the idea of mutual manifestness. An assumption is manifest to S at t if and only if S is capable of representing and accepting it as true at t. A speaker's informative intention is an intention to make (more) manifest to an audience a set of assumptions {I}. A speaker's communicative intention is her intention to make it mutually manifest that she has the above informative intention. Hence, a communicative intention is a second-order informative intention. Secondly, relevance theory is so called because Sperber & Wilson (1986) accept a Cognitive principle of relevance according to which human cognition is geared towards the maximization of relevance.
Relevance is a property of an input for an individual at t: it depends on both the set of contextual effects and the cost of processing, where the
contextual effect of an input might be the set of assumptions derivable from processing the input in a given context. Other things being equal, the greater the set of contextual effects achieved by processing an input, the more relevant the input. The greater the effort required by the processing, the lower the relevance of the input. They further accept a Communicative principle of relevance according to which every ostensively produced stimulus conveys a presumption of its own relevance: an ostensive stimulus is optimally relevant if and only if it is relevant enough to be worth the audience's processing effort and it is the most relevant stimulus compatible with the communicator's abilities and preferences.

Finally, the relevance-theoretic approach squarely anchors pragmatics into what cognitive psychologists call "third-person mindreading," i.e., the ability to represent others' psychological states (cf. Leslie 2000). In particular, it emphasizes the specificity of the task of representing an agent's communicative intention underlying her (communicative) ostensive behavior. The observer of some non-ostensive intentional behavior (e.g., hunting) can plausibly ascribe an intention to the agent on the basis of the desirable outcome of the latter's behavior, which can be identified (e.g., hitting his target), whether or not the behavior is successful. However, the desirable outcome of a piece of communicative behavior (i.e., the addressee's recognition of the agent's communicative intention) cannot be identified unless the communicative behavior succeeds (cf. Sperber 2000; Origgi & Sperber 2000 and Wilson & Sperber 2002). Thus, the development of pragmatics takes us from the metaphysical issues about meaning and intentionality inherited from Brentano to the cognitive scientific investigation of the human mindreading capacity to metarepresent others' mental representations.

Thanks to Neftali Villanueva Fernandez, Paul Horwich and the editors for comments on this article.

7. References

Austin, John L. 1962a. Sense and Sensibilia. Oxford: Oxford University Press.
Austin, John L. 1962b. How to Do Things with Words. Oxford: Clarendon Press.
Bach, Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge, MA: The MIT Press.
Brentano, Franz 1874/1911/1973. Psychology from an Empirical Standpoint. London: Routledge & Kegan Paul.
Cappelen, Herman & Ernie Lepore 2005. Insensitive Semantics. Oxford: Blackwell.
Carnap, Rudolf 1942. Introduction to Semantics. Chicago, IL: The University of Chicago Press.
Carston, Robyn 1988. Implicature, explicature and truth-theoretic semantics. In: R. Kempson (ed.). Mental Representations: The Interface between Language and Reality. Cambridge: Cambridge University Press, 155-181.
Carston, Robyn 2002a. Thoughts and Utterances. The Pragmatics of Explicit Communication. Oxford: Blackwell.
Carston, Robyn 2002b. Linguistic meaning, communicated meaning and cognitive pragmatics. Mind & Language 17, 127-148.
Chisholm, Robert M. 1957. Perceiving: A Philosophical Study. Ithaca, NY: Cornell University Press.
Chomsky, Noam 1975. Reflections on Language. New York: Pantheon Books.
Chomsky, Noam 1980. Rules and Representations. New York: Columbia University Press.
Churchland, Paul M. 1989. A Neurocomputational Perspective: The Nature of Mind and the Structure of Science. Cambridge, MA: The MIT Press.
Dennett, Daniel C. 1987. The Intentional Stance. Cambridge, MA: The MIT Press.
Dretske, Fred 1995. Naturalizing the Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry A. 1975. The Language of Thought. New York: Crowell.
Fodor, Jerry A. 1987. Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: The MIT Press.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56-78.
Green, Mitchell 2007. Speech acts. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/speech-acts, July 20, 2008.
Grice, H. Paul 1957. Meaning. The Philosophical Review 66, 377-388. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 212-223.
Grice, H. Paul 1967/1975. Logic and conversation. In: P. Cole & J. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41-58. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 22-40.
Grice, H. Paul 1968. Utterer's meaning, sentence meaning and word meaning. Foundations of Language 4, 225-242. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 117-137.
Grice, H. Paul 1969. Utterer's meaning and intentions. Philosophical Review 78, 147-177. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 86-116.
Grice, H. Paul 1978. Further notes on logic and conversation. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 113-128. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 41-57.
Grice, H. Paul 1981. Presuppositions and conversational implicatures. In: P. Cole (ed.). Radical Pragmatics. New York: Academic Press, 183-198. Reprinted in: H. P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989, 269-282.
Grice, H. Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Haugeland, John 1981. Semantic engines: An introduction to mind design. In: J. Haugeland (ed.). Mind Design: Philosophy, Psychology, Artificial Intelligence. Cambridge, MA: The MIT Press, 1-34.
Horwich, Paul 2005. Reflections on Meaning. Oxford: Oxford University Press.
Jacob, Pierre 1997. What Minds Can Do. Cambridge: Cambridge University Press.
Jacob, Pierre 2003. Intentionality. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/intentionality/, June 15, 2006.
Katz, Jerrold J. 1981. Language and Other Abstract Objects. Oxford: Blackwell.
Leslie, Alan 2000. 'Theory of Mind' as a mechanism of selective attention. In: M. Gazzaniga (ed.). The New Cognitive Neurosciences. Cambridge, MA: The MIT Press, 1235-1247.
Meinong, Alexius 1904. Über Gegenstandstheorie. In: A. Meinong (ed.). Untersuchungen zur Gegenstandstheorie und Psychologie. Leipzig: Barth, 1-50. English translation in: R. M. Chisholm (ed.). Realism and the Background of Phenomenology. Glencoe: The Free Press, 1960, 76-117.
Millikan, Ruth G. 1984. Language, Thought and Other Biological Categories. Cambridge, MA: The MIT Press.
Morris, Charles 1938. Foundations of the Theory of Signs. Chicago, IL: The University of Chicago Press.
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics & Philosophy 3, 143-184.
Nunberg, Geoffrey 1995. Transfers of meaning. Journal of Semantics 12, 109-132.
Origgi, Gloria & Dan Sperber 2000. Evolution, communication and the proper function of language. In: P. Carruthers & A. Chamberlain (eds.). Evolution and the Human Mind: Language, Modularity and Social Cognition. Cambridge: Cambridge University Press, 140-169.
Parsons, Terence 1980. Nonexistent Objects. New Haven, CT: Yale University Press.
Quine, Willard van Orman 1948. On what there is. Reprinted in: W. V. O. Quine. From a Logical Point of View. Cambridge, MA: Harvard University Press, 1953, 1-19.
Quine, Willard van Orman 1960. Word and Object. Cambridge, MA: The MIT Press.
Recanati, François 1987. Meaning and Force. Cambridge: Cambridge University Press.
Recanati, François 2004. Literal Meaning. Cambridge: Cambridge University Press.
Rey, Georges 1997. Contemporary Philosophy of Mind: A Contentiously Classical Approach. Oxford: Blackwell.
Russell, Bertrand 1905. On denoting. Mind 14, 479-493. Reprinted in: R. C. Marsh (ed.). Bertrand Russell, Logic and Knowledge, Essays 1901-1950. New York: Capricorn Books, 1956, 41-56.
Schiffer, Stephen 1972. Meaning. Oxford: Oxford University Press.
Searle, John R. 1969. Speech Acts. Cambridge: Cambridge University Press.
Searle, John R. 1975. Indirect speech acts. In: P. Cole & J. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 59-82.
Searle, John R. 1978. Literal meaning. Erkenntnis 13, 207-224.
Searle, John R. 1983. Intentionality. Cambridge: Cambridge University Press.
Searle, John R. 1992. The Rediscovery of the Mind. Cambridge, MA: The MIT Press.
Sperber, Dan 2000. Metarepresentations in an evolutionary perspective. In: D. Sperber (ed.). Metarepresentations: A Multidisciplinary Perspective. Oxford: Oxford University Press, 117-137.
Sperber, Dan & Deirdre Wilson 1986. Relevance: Communication and Cognition. Cambridge, MA: Harvard University Press.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press.
Stanley, Jason 2000. Context and logical form. Linguistics & Philosophy 23, 391-434.
Strawson, Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Strawson, Peter F. 1964. Intention and convention in speech acts. The Philosophical Review 73, 439-460. Reprinted in: P. F. Strawson. Logico-linguistic Papers. London: Methuen, 1971, 149-169.
Travis, Charles 1991. Annals of analysis: Studies in the Way of Words, by H. P. Grice. Mind 100, 237-264.
Wilson, Deirdre & Dan Sperber 2002. Truthfulness and relevance. Mind 111, 583-632.
Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intentionality. Cambridge, MA: The MIT Press.

Pierre Jacob, Paris (France)

3. (Frege on) Sense and reference

1. Sense and reference: a short overview
2. Introducing sense and reference
3. Extending and exploring sense and reference
4. Criticising sense and reference
5. Summary
6. References

Abstract

Gottlob Frege argues in "Über Sinn und Bedeutung" that every intuitively meaningful expression has a sense. He characterises the main ingredient of sense as a 'mode of presentation' of at most one thing, namely the referent. Theoretical development of sense
and reference has been a fundamental task for the philosophy of language. In this article I will reconstruct Frege's motivation for the distinction between sense and reference (section 2). I will then go on to discuss how the distinction can be applied to predicates, sentences and context-dependent expressions (section 3). The final section 4 shows how discussions of Frege's theory lead to important proposals in semantics.

1. Sense and reference: a short overview

Gottlob Frege argues in Über Sinn und Bedeutung that every intuitively meaningful expression has a sense ("Sinn"). The main ingredient of sense is suggestively characterised as a 'mode of presentation' of at most one thing, namely the referent ("Bedeutung"). Frege's theory of sense and reference has been at the centre of the philosophy of language in the 20th century. The discussion about it is driven by the question whether an adequate semantic theory needs to or even can coherently ascribe sense and reference to natural language expressions.

On the side of Frege's critics are, among others, Russell and Kripke. Russell (1905) argues that the introduction of sense creates more problems than it solves. If there are senses, one should be able to speak about them. But according to Russell, every attempt to do so leads to an 'inextricable tangle'. Russell goes on to develop his theory of definite descriptions to avoid the problems Frege's theory seems to create. How and whether Russell's argument works is still a matter of dispute. (For recent discussions see Levine 2004 and Makin 2000.) More recently, Kripke (1972) has marshalled modal and epistemic arguments against what he called "the Frege-Russell picture". "Neo-Millians" or "Neo-Russellians" are currently busy closing the holes in Kripke's arguments against Frege.

Frege's friends have tried to develop and defend the distinction between sense and reference. In Meaning and Necessity Carnap (1956) takes Frege to task for multiplying senses beyond necessity and therefore proposes to replace sense and reference with intension and extension. I will return to Carnap's view in section 4.1. Quine (1951) has criticised notions like sense and intension as not being definable in scientifically acceptable terms. Davidson has tried to preserve Frege's insights in view of Quine's criticism. According to Davidson, "a theory of truth patterned after a Tarski-type truth definition tells us all we need to know about sense. Counting truth in the domain of reference, as Frege did, the study of sense thus comes down to the study of reference" (Davidson 1984: 109). And reference is, even by Quinean standards, a scientifically acceptable concept. Davidson's proposal to use theories of truth as theories of sense has been pursued in recent years and has led to fruitful research into the semantics of proper names. (See McDowell 1977 and Sainsbury 2005. See also Wiggins 1997.) Dummett has criticised Davidson's proposal (see Dummett 1993, essays 1 and 2). A theory of sense should be graspable by someone who does not yet master a language, and a theory of truth does not satisfy this constraint. A theory of meaning (sense) should be a theory of understanding, and a theory of understanding should take understanding to consist in the possession of an ability, for example, the ability to recognise the bearer of a name.

The jury is still out on the question whether the distinction between sense and reference should be preserved or abandoned. The debate has shed light on further
notions like compositionality, co-reference and knowledge of meaning. In this article I want to introduce the basic notions, sense and reference, and the problems surrounding them.

A note of caution. Frege's theory is not clearly a descriptive theory about natural language, but applies primarily to an ideal language like the Begriffsschrift (see Frege, Philosophical and Mathematical Correspondence (PMC): 101). If a philosopher of language seeks inspiration in the work of Frege, he needs to consider the question whether the view under consideration is about natural language or a language for inferential thought.

2. Introducing sense and reference

2.1. Conceptual content and cognitive value

Frege starts his logico-philosophical investigations with judgement and inference. (See Burge 2005: 14f.) An inference is a judgement made "because we are cognizant of other truths as providing a justification for it" (Posthumous Writings (PW): 3). The premises of an inference are not sentences, but acknowledged truths. In line with this methodology, Frege introduces in BS the conceptual content of a judgement as an abstraction from the role the judgement plays in inference:

Let me observe that there are two ways in which the contents of two judgements can differ: it may, or it may not, be the case that all inferences that can be drawn from the first judgement when combined with certain other ones can always be drawn from the second when combined with the same judgements. [...] Now I call the part of the content that is the same in both the conceptual content. (BS, §3: 2-3)

Frege's 'official' criterion for sameness of conceptual content is: s has the same conceptual content as s* iff given truths t1 to tn as premises, the same consequences can be inferred from s and s* together with t1 to tn. (See BS, §3.)

Frege identifies the conceptual content of a sentence with a complex of the things for which the sentence constituents stand (BS §8). This raises problems for the conceptual content of identity sentences. In identity sentences the sign of identity is flanked by what Frege calls 'proper names'. What is a proper name? For Frege, every expression that can be substituted for a genuine proper name salva grammaticale and salva veritate qualifies as a proper name. Fregean proper names include genuine proper names, complex demonstratives ('That man') when completed by contextual features and definite descriptions ('the negation of the thought that 2 = 3'). Following Russell (1905), philosophers and linguists have argued that definite descriptions do not belong on this list. Definite descriptions do not stand for objects, but for properties of properties. For example, 'The first man on the moon is American' asserts that (the property) being a first man on the moon is uniquely instantiated and that whatever uniquely has this property has also the property of
being American. The discussion about the semantics of definite descriptions is ongoing, but we can sidestep this issue by focusing on genuine proper names. (For an overview of the recent discussion about definite descriptions see Bezuidenhout & Reimer 2004.)

Equations figure as premises in inferences that extend our knowledge. How can this be so? In BS Frege struggles to give a convincing answer. If one holds that "=" stands for the relation of identity, the sentences

(S1) The evening star = the evening star

and

(S2) The evening star = the morning star,

have the same conceptual content CC:

CC(S1): <Venus, Identity, Venus> = CC(S2): <Venus, Identity, Venus>.

But although (S1) and (S2) stand for the same complex, they differ in inferential potential. For example, if we combine the truth that (S1) expresses with the truth that the evening star is a planet we can derive nothing new, but if we combine the latter truth with the truth that (S2) expresses, we can derive the truth that the morning star is a planet.

Prima facie, (i) Frege's sameness criteria for conceptual content are in conflict with (ii) his claim that expressions are everywhere merely placeholders for objects. In BS Frege responds by restricting (ii): in a sentence of the form "a = b" the signs "a" and "b" stand for themselves; the sentence says that "a" and "b" have the same conceptual content. The idea that designations refer to themselves in some contexts helps to explain the difference in inferential potential. It can be news to learn that "Dr. Jekyll" stands for the same person as "Mr. Hyde". Although promising, we will see in a moment that this move alone does not distinguish (S1) and (S2) in the way required.

Treating identity statements as the exception to the rule that every expression just stands for an object allows Frege to keep his identification of the conceptual content of a sentence with a complex of objects and properties. The price is that the reference of a non-ambiguous and non-indexical term will shift from one sentence to another. Consider a quantified statement like:

(∀x)(∀y)((x = y & Fx) → Fy)

If signs stand for themselves in identity-statements, one can neither coherently suppose that the variables in the formula range over signs, nor over particulars. (See Furth 1964: xix, and Mendelsohn 1982: 297f.)

These considerations prime us for the argument that motivates Frege's introduction of the distinction between sense and reference. Frege's new argument uses the notion of cognitive value ("Erkenntniswert"). For the purposes of Frege's argument it is not important to answer the question "What is cognitive value?": more important is the
question "When do two sentences s and s* differ in cognitive value?" Answer: If the justified acceptance of s puts one in a position to come to know something that one neither knew before nor could infer from what one knew before, while the justified acceptance of s* does not do so (or the other way around).

Frege hints at this notion of cognitive value in a piece of self-criticism in The Foundations of Arithmetic (FA). If a true equation connected two terms that refer to the same thing in the same way,

[a]ll equations would then come down to this, that whatever is given to us in the same way is to be recognized as the same. But this is so self-evident and so unfruitful that it is not worth stating. Indeed, no conclusion could ever be drawn here that was different from any of the premises. (FA, §67: 79. My emphasis)

If a rational and minimally logically competent person accepts what "The evening star is a planet" says, and she comes also to accept the content of "The evening star = the morning star", she is in a position to acquire the knowledge that the morning star is a planet. This new knowledge was not inferable from what she already knew. Things are different if she has only reason to accept the content of "The evening star = the evening star". Exercising logical knowledge won't take her from this premise to a conclusion that she was not already in a position to know on the basis of the other premise or her general logical knowledge.

On the basis of this understanding of cognitive value, Frege's argument can now be rendered in the following form:

(P1) (S1) "The evening star = the evening star" differs in cognitive value from (S2) "The evening star = the morning star".
(P2) The difference in cognitive value between (S1) and (S2) cannot be constituted by a difference in reference of the signs composing the sentence.
(P3) "If the sign 'a' is distinguished from the sign 'b' only as object (here by means of its shape), not as sign (i.e., not by the manner in which it designates something) the cognitive value of a = a becomes essentially equal to the cognitive value of a = b."
(P4) "A difference can only arise if the difference between the signs corresponds to a difference between the mode of presentation of that which is designated."
(C1) The difference in cognitive value between (S1) and (S2) is constituted by the fact that different signs compose (S1) and (S2) AND that the different signs designate something in different ways.

We have now explained (P1) and Frege has already given convincing reasons for holding (P2) in BS. Let us look more closely at the other premises.

2.2. Sense, sign and logical form

In On Sense and Reference (S&R) Frege gives several reasons for the rejection of the Begriffsschrift view of identity statements. The most compelling one is (P3). Frege held in BS that (S1) has a different cognitive value from (S2) because (i) in (S2) different
singular terms flank the identity sign that (ii) stand for themselves. If this difference is to ground the difference in cognitive value, Frege must provide an answer to the question "When do different signs and when does the same sign (tokens of the same sign) flank the identity sign?" that is adequate for this purpose. (P3) says that the individuation of signs in terms of their form alone is not adequate. Frege calls signs that are individuated by their form 'figures'. The distinctive properties of a figure are geometrical and physical properties. There are identity-statements that contain two tokens of the same figure ("Paderewski = Paderewski") which have the same cognitive value as identity statements that contain two tokens of different figures ("Paderewski = the prime minister of Poland between 1917 and 1919"). There are also identity-statements that contain tokens of different figures ("Germany's oldest bachelor is Germany's oldest unmarried eligible male") that have the same cognitive value as identity statements that contain two tokens of the same figure ("Germany's oldest bachelor is Germany's oldest bachelor"). If sameness (difference) of figure does not track sameness (difference) of cognitive value, then the Begriffsschrift view needs to be supplemented by a method of sign individuation that is not merely form-based. Frege's constructive suggestion is (P4).

Let us expand on (P4). I will assume that I have frequently seen what I take to be the brightest star in the evening sky. I have given the star the name "the evening star". I will apply "the evening star" to an object if and only if it is the brightest star in the evening sky. I am also fascinated by what I take to be the brightest star in the morning sky. I have given this star the name "the morning star" and I will apply "the morning star" to an object if and only if it is the brightest star in the morning sky. Now if the difference in form between "the evening star" and "the morning star" indicates a difference in mode of designation, one can explain the difference in cognitive value between (S1) and (S2): (S1) and (S2) contain different figures AND the different figures refer to something in a different way (they are connected in my idiolect to different conditions of correct reference).

Frege assumes also that a difference in cognitive value can only arise if different terms designate in different ways. But (S1) and (S2) would be translated differently into the language of first-order logic with identity: (S1) as "a = a", (S2) as "a = b". Why not hold with Putnam (1954) that the difference in logical form grounds the difference in cognitive value? This challenge misses the point of (P3). How can we determine the logical form of a statement independently of a method of individuating signs? As we have already seen, if "b" is a mere stylistic alternative for "a" (the difference in form indicates no difference in mode of presentation), the logical form of "a = b" is a = a. If difference of cognitive value is to depend on differences of logical form, logical form cannot merely be determined by the form of the signs contained in the sentence. Logical form depends on sense. The appeal to logical form cannot make the notion of sense superfluous.

If Frege's argument is convincing, it establishes the conclusion that the sense of an expression is distinct from its reference. The argument tells us what sense is not; it does not tell us what it is.
The premises of the argument are compatible with every assumption about the nature of sense that allows the sense of an expression to differ from its reference.
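What the argument leaves open can nonetheless be illustrated with one common later regimentation, on which the modes of presentation just discussed are rendered as conditions of correct reference by means of the description operator ι ('the unique x such that'); the notation is illustrative and not Frege's own:

```latex
% Modes of presentation as reference conditions (illustrative regimentation):
\mathrm{ref}(\text{``the evening star''})
   = \iota x.\, x \text{ is the brightest star in the evening sky}
\qquad
\mathrm{ref}(\text{``the morning star''})
   = \iota x.\, x \text{ is the brightest star in the morning sky}
```

Both conditions are in fact satisfied by Venus, so (S1) and (S2) agree in reference; but the conditions themselves differ, and on this regimentation it is that difference which grounds the difference in cognitive value.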
3. Extending and exploring sense and reference

3.1. The sense and reference of predicates and sentences

Not only proper names have sense and reference. Frege extends the distinction to concept-words (predicates) and sentences. A letter to Husserl contains a diagram that outlines the complete view (PMC: 63):

    Proper Name   Sentence      Concept-word
    Sense         Thought       Sense of the concept-word
    Referent      Truth-Value   Concept → Object that falls under it

Fig. 3.1: Frege's diagram

The sense of a complete assertoric sentence is a thought. The composition of the sentence out of sentence-parts mirrors the composition of the thought it expresses. If (i) a sentence S is built up from a finite list of simple parts e1 to en according to a finite number of modes of combination, (ii) e1 to en express senses, and (iii) the thought expressed by S contains the senses of e1 to en arranged in a way that corresponds to the arrangement of e1 to en in S, we can express new thoughts by re-combining sentence-parts we already understand. Since we can express new thoughts, we should accept (i) to (iii). (See Frege, Collected Papers (CP): 390 and PMC: 79.)

Compositionality is the property of a language L that the meaning of its complex expressions is determined by the meaning of their parts and their mode of combination. (See article 6 (Pagin & Westerståhl) Compositionality.) According to Frege, natural language and his formal language have a property that is like compositionality, but not the same. First, the claim that the composition of a sentence mirrors the composition of the thought expressed implies that the sense of the parts and their mode of combination determine the thought expressed, but the determination thesis does not imply the mirroring thesis. Second, Fregean thoughts are not meanings. The meaning of an indexical sentence ("I am hungry") does not vary with the context of utterance; the thought expressed does. Third, the assertoric sentence that expresses a thought often contains constituents that are not words, while compositionality is usually defined in terms of expression parts that are themselves expressions. For example, Frege takes pointings, glances and the time of utterance to be part of the sign that expresses a thought. (See T, CP: 358 (64). For discussion see Künne 1992 and Textor 2007.)

If the thought expressed by a sentence s consists of the senses expressed by the parts of s in an order, a sentence s expresses a different thought from a sentence s* if s and s* are not composed from parts with the same sense in the same order. (See Dummett 1981a: 378-379.) Hence, the Mirror Thesis implies that only isomorphic sentences can express the same thought. This conclusion contradicts Frege's view that the same thought
can be decomposed differently into different parts, and that no decomposition can be selected on independent grounds as identifying the thought. Frege argues, for example, that 'There is at least one square root of 4' and 'The number 4 has the property that there is something of which it is the square' express the same thought, but the wording of the sentence suggests different decompositions of the thought expressed. (See CP: 188 (199-200).) Progress in logic often consists in discovering that different sentences express the same thought. (See PW: 153-154.) Hence, there is not the structure of a thought. A fortiori, the sentence structure cannot be isomorphic with the structure of the thought expressed. The Mirror and Multiple Decomposition Thesis are in conflict. Dummett (1989) and Rumfitt (1994) want to preserve the Mirror View that has it that thoughts are complex entities; Geach (1975), Bell (1987) and Kemmerling (1990) argue against it. Textor (2009) tries to integrate both.

Does an assertoric sentence have a referent, and, if it has one, what is it? Frege introduces the concept of a function into semantic analysis. Functional signs are incomplete. In '1 + ξ' the Greek letter simply marks an argument place. If we complete the functional expression '1 + ξ' with '1', we 'generate' a complex designator for the value of the function for the argument 1, the number 2. The analysis of the designator '1 + 1' into functional expressions and completing expressions mirrors the determination of its referent. Frege extends the analysis from designators like '1 + 1' to equations like '2² = 4'. In order to do so, he assumes (i) that the equation can be decomposed into functional expressions (for example 'ξ² = 4') and non-functional expressions (for example '2'). The functional expression stands for a function, the non-functional one for an argument. Like '1 + 1', the equation '2² = 4' shall stand for the value the concept x² = 4 assumes for the argument 2. Frege identifies the values of these functions as truth-values: the True and the False (CP: 144 (13)).

To generalise: An assertoric sentence s expresses a thought and refers to a truth-value. It can be decomposed into functional expressions (concept-words) that stand for functions from arguments to truth-values (concepts). The truth-value of s is the value which the concept referred to by the concept-word distinguishable in s returns for the argument referred to by the proper name(s) in s. In this way the semantics of sentences and their constituents interlock. The plausibility of the given description depends crucially on the assumption that talk of truth-values is more than a stylistic variant of saying that what a sentence says is true (false). Frege's arguments for this assumption are still under scrutiny. (See Burge 2005, essay 3 and Ricketts 2003.)

3.2. Sense determines reference

The following argument is valid: Hesperus is a planet; Hesperus shines brightly. Therefore, something is a planet and shines brightly. By contrast, this argument isn't: Hesperus is a planet; Phosphorus shines brightly. Therefore, something is a planet and shines brightly. Why? A formally valid argument is an argument whose logical form guarantees its validity. All arguments of the form "Fa, Ga, Therefore: (∃x)(Fx & Gx)" are valid, i.e. if the premises are true, the conclusion must also be true. If the first argument is formally valid, the "Hesperus" tokens must have the same sense and reference.
If the sense were different, the argument would be a fallacy of equivocation. If the reference could be different, although the sense was the same, the argument could take us from true premises to a false conclusion, because the same sense could determine different objects in the premises and the conclusion. This consideration makes a thesis plausible that is frequently attributed to Frege, although never explicitly endorsed by him:

Sense-Determines-Reference: Necessarily, if α and β have the same sense, α and β have the same referent.

Sense-Determines-Reference is central to Frege's theory. But isn't it refuted by Putnam's (1975) twin earth case? My twin and I may connect the same mode of presentation with the syntactically individuated word "water", but the watery stuff in my environment is H₂O, in his XYZ. While I refer to H₂O with "water", he refers to XYZ with "water". It is far from clear that this is a convincing counterexample to Sense-Determines-Reference. If we consider context-dependent expressions, it will be necessary to complicate Sense-Determines-Reference to Sense-Determines-Reference-in-a-context-of-utterance. But this complication does not touch upon the principle. Among other things, Putnam may be taken to have shown that "water" is a context-dependent expression.

3.3. Transparency and homogeneity

In the previous section we were concerned with the truth-preserving character of formally valid arguments. In this section the knowledge-transmitting character of such arguments will become important: a formally valid argument enables us to come to know the conclusion on the basis of our knowledge of the premises and the logical laws. (See Campbell 1994, chap. 3.1 and 3.2.) Formally valid arguments can only have this epistemic virtue if sameness of sense is transparent in the following way:

Transparency-of-Sense-Sameness: Necessarily, if α and β have the same sense, everyone who grasps the sense of α and β thereby knows that α and β have the same sense, provided that there is no difficulty in grasping the senses involved.

Assume for reductio that sameness of sense is not transparent in the first argument in section 3.2. Then you might understand and accept "Hesperus is a planet" (first premise) and understand and accept "Hesperus shines brightly" (second premise), but fail to recognise that the two "Hesperus" tokens have the same sense. Hence, your understanding of the premises does not entitle you to give them the form "Fa" and "Ga", and hence you cannot discern the logical form that ensures the formal validity of the argument. Consequently, you are not in a position to come to know the conclusion on the basis of your knowledge of the premises alone. The argument would be formally valid, but one could not come to know its conclusion without adding the premise that Hesperus mentioned in premise 1 is the same thing as Hesperus mentioned in premise 2. Since we take some arguments like the one above to be complete and knowledge-transmitting, sameness of sense must be transparent.
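The epistemic role of transparent sense-sameness can be illustrated with a minimal computational sketch (my illustration, not Frege's formalism; the sense labels are invented): name tokens carry senses, and the schematic form "Fa, Ga" may only be discerned when the subject tokens of the two premises share one sense.

```python
# A minimal sketch (not Frege's own formalism): tokens of a name carry
# a sense; the form "Fa, Ga, therefore (Ex)(Fx & Gx)" is only
# discernible when the subject tokens of the two premises share one
# sense. Spellings and sense labels below are invented.

def same_schematic_subject(premise1, premise2):
    """Each premise is a (predicate, name_token) pair; a name token is
    a (spelling, sense) pair. Shared spelling does not suffice: only
    sameness of sense licenses giving both premises the form Fa, Ga."""
    return premise1[1][1] == premise2[1][1]

hesperus_token = ("Hesperus", "the-evening-star-mode")
phosphorus_token = ("Phosphorus", "the-morning-star-mode")

# First argument: both subject tokens express the same sense -> Fa, Ga.
print(same_schematic_subject(("is a planet", hesperus_token),
                             ("shines brightly", hesperus_token)))   # True

# Second argument: same referent (Venus), different senses -> Fa, Gb,
# from which (Ex)(Fx & Gx) does not follow formally.
print(same_schematic_subject(("is a planet", hesperus_token),
                             ("shines brightly", phosphorus_token)))  # False
```

The point of the design is that validity-detection is keyed to sense identity rather than to spelling or to reference, which is exactly what the equivocation consideration above demands.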
If we assume, in addition to Transparency-of-Sense-Sameness, that competent speakers know that synonymous expressions co-refer, if they refer at all, we arrive at a strengthened principle:

Necessarily, if α and β have the same sense, and α refers to a and β refers to b, everyone who grasps the sense of α and β knows that a = b.

Frege makes use of this strengthened principle in 'Thoughts' to argue for the view that an indexical and a proper name, although co-referential, differ in sense.

Transparency-of-Sense-Sameness has further theoretical ramifications. Russell wrote to Frege:

I believe that in spite of all its snowfields Mont Blanc itself is a component part of what is actually asserted in the proposition 'Mont Blanc is more than 4000 meters high'. (PMC: 169)

If Mont Blanc is part of the thought expressed by uttering "Mont Blanc is a mountain", every piece of rock of Mont Blanc is part of this thought. To Frege, this conclusion seems absurd (PMC: 79). Why can a part of Mont Blanc that is unknown to me not be part of a sense I apprehend? Because if there were unknown constituents of the sense of "Mont Blanc", I could not know whether the thought expressed by one utterance of "Mont Blanc is a mountain" is the same as that expressed by another merely by grasping the thought. Hence, Frege does not allow Mont Blanc to be a constituent of the thought that Mont Blanc is a mountain. He endorses the Homogeneity Principle that a sense can only have other senses as constituents (PMC: 127).

How plausible is Transparency-of-Sense-Sameness? Dummett (1975: 131) takes it to be an 'undeniable feature of the notion of meaning'. Frege's proviso to Transparency-of-Sense-Sameness shows that he is aware of difficulties. Many factors (the actual wording, pragmatic implicatures, background information) can bring it about that one assents to s while doubting s*, although these sentences express the same thought. Frege has not worked out a reply to these difficulties. However, he suggests that two sentences express the same thought if one can explain away appearances to the contrary by appeal to pragmatics etc.

Recently, some authors have argued that it is not immediate knowledge of co-reference, but the entitlement to 'trade on identity' which is the basic notion in a theory of sense (see Campbell 1994: 88). For example, if I use the demonstrative pronoun repeatedly to talk about an object that I visually track ("That bird is a hawk. Look, now that bird pursues the other bird"), I am usually not in a position to know immediately that I am referring to the same thing. But the fact that I am tracking the bird gives me a defeasible right to presuppose that the same bird is in question. There is no longer any suggestion that the grasp of sense gives one privileged access to sameness of reference.

3.4. Sense without reference

According to Frege, the conditions for having a sense and the conditions for having a referent are different. Mere grammaticality is sufficient for an expression that can stand in for (genuine) proper names ("the King of France") to have a sense, but it is not sufficient for it to have a referent (S&R: 159 (28)). What about genuine proper names? The name 'Nausikaa' in Homer's Odyssey is probably empty. "But it behaves as if it names a girl, and it is thus assured of a sense." (PW: 122). Frege's message is: If a is a well-formed expression that can take the place of a proper name, or a is a proper name that purports to refer to something, a has a sense. An expression purports to refer if it has some of the properties of an expression that does refer, and the possession of these properties entitles the uninformed speaker to treat it like a referring expression. For instance, in understanding "Nausikaa" I will bring information that I have collected under this name to bear upon the utterance.

It is very plausible to treat (complex) expressions and genuine proper names differently when it comes to significance. We can ask a question like "Does a exist?" or "Is a so-and-so?", even if a does not behave as if it refers to something. "The number which exceeds itself" does not behave as if it names something. Yet, it seems implausible to say that "The number which exceeds itself cannot exist" does not express a thought. What secures a sense for complex expressions is that they are composed out of significant expressions in a grammatically correct way. Complex expressions need not behave as if they name something; simple ones must. Since satisfaction of the sufficient conditions for sense-possession does not require satisfaction of the sufficient conditions for referentiality, there can be sense without reference:

Sense-without-Reference: If the expression a is of a type whose tokens can stand for an object, the sense of a determines at most one referent.

If Sense-without-Reference is true, the mode of presentation metaphor is misleading. For there can be no mode of presentation without something that is presented. Some Neo-Fregeans take the mode of presentation metaphor to be so central that they reject Sense-without-Reference (see Evans 1982: 26). But one feels that what has to give here is the mode of presentation metaphor, not the idea of empty but significant terms.

Are there more convincing reasons to deny that empty singular terms are significant? The arguments pro and contra Sense-without-Reference are based on existence conditions for thoughts. According to Evans, a thought must be either true or false; it cannot lack a truth-value (Evans 1982: 25). Frege is less demanding:

The being of a thought may also be taken to lie in the possibility of different thinkers grasping the same thought as one and the same thought. (CP: 376 (146))

A thought is the sense of a complete propositional question. One grasps the thought expressed by a propositional question iff one knows when the question deserves the answer "Yes" and when it deserves the answer "No". Frege's remarks point to the following existence condition:

(E!) The thought that p exists if the propositional question "p?" can be raised and addressed by different thinkers in a rational way.
There is a mere illusion of a thought if different thinkers cannot rationally engage with the same question. On the basis of (E!), Frege rejects the view that false thoughts aren't thoughts. It was rational to ask whether the circle can be squared, and different thinkers could rationally engage with this question. Hence, there is a thought and not merely a thought illusion, although the thought is necessarily false.

The opposition to Sense-without-Reference is also fuelled by the idea (i) that a thought expressed by a sentence s is what one grasps when one understands (an utterance of) s and (ii) that one cannot understand (an utterance of) a sentence with an empty singular term. A representative example of an argument against Sense-without-Reference that goes through understanding is Evans' argument from diversity (Evans 1982: 336). He argues that the satisfaction of the conditions for understanding a sentence s with a proper name n requires the existence of the referent of n. Someone who holds Sense-without-Reference can provide a communication-allowing relation in the empty case: speaker and audience grasp the same mode of presentation. But Evans rightly insists that this proposal is ad hoc. For usually we don't require exact match of mode of presentation. The Fregean is challenged to provide a common communication-allowing relation that is present both in the case where the singular term is empty and in the case where it is referential. Sainsbury (2005, chapter 3.6) proposes that the common communication-allowing relation is causal: exact match of mode of presentation is not required, only that there is a potentially knowledge-transmitting relation between the speaker's use and the audience's episode of understanding the term.

3.5. Indirect sense and reference

If we assume that the reference of a sentence is determined by the reference of its parts, the substitution of co-referential terms of the same grammatical category should leave the reference of the sentence (e.g. its truth-value) unchanged. However, Frege points out that there are exceptions to this rule:

Gottlob said (believes) that the evening star shines brightly in the evening sky.
The evening star = the morning star.
Gottlob said (believes) that the morning star shines brightly in the evening sky.

Here the exchange of co-referential terms changes the truth-value of our original statement. (In order to keep the discussion simple I will ignore propositional attitude ascriptions in which the singular terms are not within the scope of 'S believes that ...', see Burge (2005: 198). For a more general discussion of these issues see article 60 (Swanson) Propositional attitudes.) Frege takes these cases to be exceptions to the rule that the reference of a sentence is determined by the referents of its constituents. What the exceptions have in common is that they involve a shift of reference:

In reported speech one talks about the sense, e.g. of another person's remarks. It is quite clear that in this way of speaking words do not have their customary reference but designate what is usually their sense. In order to have a short expression, we will say: In reported speech, words are used indirectly or have their indirect reference. We distinguish accordingly the customary sense from its indirect sense. (S&R: 159 (28))
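Frege's reference shift lends itself to a toy model (a sketch under invented labels, not Frege's own apparatus): every expression is paired with a customary sense and a customary referent, and embedding under an indirect-speech operator shifts the expression's referent to its customary sense. The indexing scheme introduced below generalises this to iterated embeddings.

```python
# A toy model of Frege's reference shift (illustrative only). Each
# expression has a customary sense and a customary referent; under an
# indirect-speech operator its referent becomes its customary sense.
# The sense labels are invented placeholders.

CUSTOMARY = {
    "the evening star": {"sense": "mode-of-presentation:evening-visible",
                         "referent": "Venus"},
    "the morning star": {"sense": "mode-of-presentation:morning-visible",
                         "referent": "Venus"},
}

def referent(expr, embedded=False):
    """Referent of expr: unembedded, the customary referent; under one
    indirect-speech operator, the customary sense (Frege's indirect
    referent)."""
    if embedded:
        return CUSTOMARY[expr]["sense"]
    return CUSTOMARY[expr]["referent"]

# Unembedded: co-referential, so substitution preserves reference.
print(referent("the evening star") == referent("the morning star"))  # True

# Under "Gottlob said that ...": the referents are the (distinct)
# customary senses, so substitution no longer preserves reference.
print(referent("the evening star", embedded=True)
      == referent("the morning star", embedded=True))                # False
```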
Indirect speech is a defect of natural language. In a language that can be used for scientific purposes the same sign should everywhere have the same referent. But in indirect discourse the same words differ in sense and reference from normal discourse. In a letter to Russell, Frege proposes a small linguistic reform to bring natural language closer to his ideal: "we ought really to have special signs in indirect speech, though their connection with the corresponding signs in direct speech should be easy to recognize" (PMC: 153).

Let us implement Frege's reform of indirect speech by introducing indices that indicate sense types. The index 0 stands for the customary sense expressed by an unembedded occurrence of a sign, 1 for the sense the sign expresses when embedded under one indirect speech operator like "S said that ...". By iterating indirect speech operators we can generate an infinite hierarchy of senses:

Direct speech: The-evening-star₀ shines₀ brightly₀ in₀ the-evening-sky₀.
Indirect speech: Gottlob₀ believes₀ that the-evening-star₁ shines₁ brightly₁ in₁ the-evening-sky₁.
Doubly indirect speech: Bertrand₀ believes₀ that Gottlob₁ believes₁ that the-evening-star₂ shines₂ brightly₂ in₂ the-evening-sky₂.
Triply indirect speech: Ludwig₀ believes₀ that Bertrand₁ believes₁ that Gottlob₂ believes₂ that the-evening-star₃ shines₃ brightly₃ in₃ the-evening-sky₃.

"The evening star₀" refers to the planet Venus and expresses a mode of presentation of it; "the evening star₁" refers to the customary sense of "the evening star₀", a mode of presentation of the planet Venus. "The evening star₂" refers to a mode of presentation of the mode of presentation of Venus and expresses a mode of presentation of a mode of presentation of a mode of presentation of Venus.

Let us first attend to an often made point that requires a conservative modification of Frege's theory. It seems natural to say that the name of a thought ("that the evening star shines brightly in the evening sky") is composed of the nominaliser "that" and the names of the thought constituents in an order ("that" + "the evening star" + "shines" ...). If "the evening star" names a sense in "Gottlob said that the evening star shines brightly in the evening sky", then "Gottlob said that the evening star shines brightly in the evening sky and it does shine brightly in the evening sky" should be false (the sense of "the evening star" does not shine brightly). But the sentence is true! Does this refute Frege's reference shift thesis? No, for can't one designator refer to two things? The designator "the evening star₁" refers to the sense of "the evening star₀" AND to the evening star. Since such anaphoric back reference seems always possible, we have no reference shift, but a systematic increase of referents. Fine (1989: 267f) gives independent examples of terms with multiple referents. Frege himself should be sympathetic to this suggestion. In S&R he gives the following example:

John is under the false impression that London is the biggest city in the world.

According to Frege, the italicised sentence 'counts double' ("ist doppelt zu nehmen", S&R: 175 (48)). Whatever the double count exactly is, the sentence above will be true iff John is under the impression that London is the biggest city in the world AND it is not the case that London is the biggest city in the world. Instead of doubly counting the same 'that'-designator, we can let it stand for a truth-value and the thought that determines the truth-value. Reference increase seems more adequate than reference shift.

Philosophers have been skeptical of Frege's sense hierarchy. What is so bad about an infinite hierarchy of senses and references? If there is such a hierarchy, the language of propositional attitude ascriptions is unlearnable, argues Davidson. One cannot provide a finitely axiomatised theory of truth for propositional attitude ascriptions if a "that"-clause is infinitely ambiguous (Davidson 1984: 208). The potentially infinitely many meanings of "that p" would have to be learned one by one. Hence, we could not understand propositional attitude ascriptions we have not heard before, although they are composed of words we already master. This argument loses its force if the sense of a "that"-designator is determined by the senses of the parts of the sense named and their mode of combination (see Burge 2005: 172). One must learn that "that" is a name-forming operator that takes a sentence s and returns a designator of the thought expressed by s. Since one can iterate the name-forming operator, one can understand infinitely many designations for thoughts on the basis of finite knowledge.

But is there really an infinite hierarchy of senses and references? We get an infinite hierarchy of senses and references on the following assumptions:

(P1) Indirect reference of w in n embeddings = sense of w in n−1 embeddings.
(P2) Indirect sense of w in n embeddings ≠ sense of w in n embeddings.
(P3) Indirect sense of w in n embeddings ≠ indirect sense of w in m embeddings, if n ≠ m.

Dummett has argued against (P2) that "the replacements of an expression in double oratio obliqua which will leave the truth-value of the whole sentence unaltered are - just as in single oratio obliqua - those which have the same sense" (Dummett 1981a: 269). Assume for illustration that "bachelor" is synonymous with "unmarried eligible male". Dummett is right that we can replace "bachelor" salva veritate with "unmarried eligible male" in (i) "John believes that all bachelors are dirty" and (ii) "John believes that Peter believes that all bachelors are dirty". But this does not show that "bachelor" retains its customary sense in multiple embeddings. For Frege argues that the sense of "unmarried eligible male" also shifts under the embedding. No wonder the two expressions can be replaced salva veritate. However, if singly and doubly embedded words differ in sense and reference, one should expect that words with the same sense cannot be substituted salva veritate in different embeddings. The problem is that it is difficult to check in a convincing way for such failures. If we stick to the natural language sentences, substitution of one token of the same word will not change anything; if we switch to the revised language with indices, we can no longer draw on intuitions about truth-value difference. In view of a lack of direct arguments one should remain agnostic about the Fregean hierarchy. I side here with Parsons that the choice between a theory of sense and reference with or without the hierarchy "must be a matter of taste and elegance" (Parsons 1997: 408). The reader should consult the Appendix of Burge's (2005) essay 4 for attempts to give direct arguments for the hierarchy, Peacocke (1999: 245) for arguments against, and Parsons (1997: 402ff) for a diagnosis of the failure of a possible Fregean argument for the hierarchy.

4. Criticising sense and reference

4.1. Carnap's alternative: extension and intension

Carnap's Meaning and Necessity plays an important role in the reception of Frege's ideas. However, as Carnap himself emphasizes, he does not have the same explanatory aims as Frege. Carnap aims to explicate and extend the distinction between denotation and connotation, while he ascribes to Frege the goal of explicating the distinction between the object named by a name and its meaning (Carnap 1956: 127). This description of Frege does not bear closer investigation. Frege argues that predicates ("ξ > 1") don't name their referents, yet they have a sense that determines a referent. Predicates refer, although they are not names.

Carnap charges Frege with multiplying senses without good reason (Carnap 1956: 157 and 137). If every name has exactly one referent and one sense, and the referent of a name must shift in propositional attitude contexts, Frege must have names for senses which in turn must have new senses that determine senses, and so on. Whether this criticism is convincing is, as we have seen in the previous section, still an open issue. Carnap lays the blame for the problem at the door of the notion of naming. Carnap's method of intension and extension avoids using this notion. Carnap takes equivalence relations between sentences and subsentential expressions as given. In a second step he assigns to all equivalent expressions an object. For example, if s and s* are true in exactly the same state descriptions, they have the same intension. If s and s* have the same truth-value, they have the same extension. Expressions have intension and extension, but they don't name either. It seems, however, unclear how Carnap can avoid appeal to naming when it comes to equivalence relations between names. 'Hesperus' and 'Phosphorus' have the same extension because they name the same object (see Davidson 1962: 326f).

An intension is a function from a possible world w (a state description) to an object. The intension of a sentence s is, for example, a function that maps a possible world w onto the truth-value of s in w; the intension of a singular term is a function that maps a possible world w onto an object in w. Carnap states that the intension of every expression is the same as its (direct) sense (Carnap 1956: 126). Now, intensions are more coarsely individuated than Fregean senses. The thought that 2 + 2 = 4 is different from the thought that 123 + 23 = 146, but the intension expressed by "2 + 2 = 4" and "123 + 23 = 146" is the same. Every true mathematical statement has the same intension: the function that maps every possible world to the True.

Although Carnap's criticism does not hit its target, his method of extension and intension has been widely used. Lewis (1970), Kaplan (1977/1989) and Montague (1973) have refined and developed it to assign intensions and extensions to a variety of natural language expressions.
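Carnap's method can be put into a few lines of code (a sketch; the worlds and denotations are invented for illustration): intensions are functions from possible worlds to extensions, and the coarseness of intensions relative to Fregean thoughts falls out immediately.

```python
# A toy model of Carnap's method of extension and intension: an
# intension is a function from possible worlds (state descriptions)
# to extensions. The worlds and non-mathematical facts are invented.

WORLDS = ["w_actual", "w_other"]

# Intensions of singular terms: world -> object.
eight = lambda w: 8
number_of_planets = lambda w: 8 if w == "w_actual" else 9

# Intensions of sentences: world -> truth-value.
two_plus_two_is_four = lambda w: 2 + 2 == 4
longer_sum = lambda w: 123 + 23 == 146

def same_extension(f, g, w="w_actual"):
    return f(w) == g(w)

def same_intension(f, g):
    return all(f(w) == g(w) for w in WORLDS)

# Same extension at the actual world, but different intensions:
print(same_extension(eight, number_of_planets))   # True
print(same_intension(eight, number_of_planets))   # False

# Carnap's coarseness: all mathematical truths share one intension
# (the function mapping every world to the True), although the
# Fregean thoughts expressed differ.
print(same_intension(two_plus_two_is_four, longer_sum))  # True
```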
4.2. Rigidity and sense

What, then, is the sense of a proper name? Frege suggests, but does not clearly endorse, the following answer: The sense of a proper name (in the idiolect of a speaker) is given by a definite description that the speaker would use to distinguish the proper name's bearer from all other things. For example, my sense of "Aristotle" might be given by the definite description "the inventor of formal logic". I will now look at two of Kripke's objections against this idea that have figured prominently in recent discussion.

First, the modal objection (for a development of this argument see Soames 1998):

1. Proper names are rigid designators.
2. The descriptions commonly associated with proper names by speakers are non-rigid.
3. A rigid designator and a non-rigid one do not have the same sense.
Therefore:
4. No proper name has the same sense as a definite description speakers associate with it.

A singular term a is a rigid designator if, and only if, a refers to x in every possible world in which x exists (Kripke 1980: 48-49). One can make sense of rigid designation without invoking a plurality of possible worlds. If the sentence which results from uniform substitution of a singular term for the dots in the schema

... might not have been ...

expresses a proposition which we intuitively take to be false, then the substituted term is a rigid designator; if not, not.

Why accept (3)? If "Aristotle" had the same sense as "the inventor of formal logic", we should be able to substitute "the inventor of formal logic" for "Aristotle" in all sentences without altering their truth-value (salva veritate). But consider:

Aristotle might not have been the inventor of formal logic. (Yes!)

and

Aristotle might not have been Aristotle. (No!)

In making this substitution we go from a true sentence to a false one. How can this be? Kripke's answer is: Even if everyone agreed that Aristotle is the inventor of formal logic, the description "the inventor of formal logic" is not synonymous with "Aristotle". The description is only used to fix the reference of the proper name. If someone answers the question "Who is Aristotle?" by using the description, he gives no information about the sense of "Aristotle". He just gives you advice on how to pick out the right object.

The modal argument is controversial for several reasons. Can the difference between sentences with (non-rigid) descriptions and proper names not be explained without positing a difference in sense? Dummett gives a positive answer that explains the difference in terms of a difference in scope conventions between proper names and definite descriptions (see Dummett 1981a: 127ff; Kripke 1980, Introduction: 11ff replies; Dummett 1981b, Appendix 3 replies to the reply).
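Premises (1)-(3) can be illustrated with a toy possible-worlds model (the worlds and the counterfactual facts are invented): a name keeps its referent across worlds, a description does not, and substituting one for the other flips the truth-value of the modal sentence.

```python
# A toy possible-worlds model of Kripke's rigid/non-rigid contrast.
# The worlds and the counterfactual facts are invented.

WORLDS = ["w1", "w2"]
INVENTED_FORMAL_LOGIC = {"w1": "Aristotle", "w2": "Leibniz"}

def ref_aristotle(w):
    # A proper name: the same referent in every world (rigid).
    return "Aristotle"

def ref_inventor_of_formal_logic(w):
    # A definite description: its referent varies with the facts
    # of the world (non-rigid).
    return INVENTED_FORMAL_LOGIC[w]

def might_not_have_been(term1, term2):
    """'term1 might not have been term2' is true iff the two
    designators come apart in some possible world."""
    return any(term1(w) != term2(w) for w in WORLDS)

# "Aristotle might not have been the inventor of formal logic." (Yes!)
print(might_not_have_been(ref_aristotle, ref_inventor_of_formal_logic))  # True
# "Aristotle might not have been Aristotle." (No!)
print(might_not_have_been(ref_aristotle, ref_aristotle))                 # False
```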
Other authors accept the difference, but argue that there are rigid definite descriptions that are associated with proper names. (See, for example, Jackson 2005.) The debate about this question is ongoing.

Second, the epistemic objection: Most people associate with proper names only definite descriptions that are not satisfied by exactly one thing ("Aristotle is the great Greek philosopher") or that are satisfied by the wrong object ("Peano is the mathematician who first proposed the axioms for arithmetic"). The 'Peano' axioms were first proposed by Dedekind (see Kripke 1972: 84).

These objections do not refute Frege's conclusion that co-referential proper names can differ in something that is more finely individuated than their reference. What the objections show to be implausible is the idea that the difference between co-referential proper names consists in a difference of associated definite descriptions. But arguments that show that a proper name does not have the same sense as a particular definite description do not show the more general thesis, that a proper name has a sense that is distinct from its referent, to be false. In order to refute the general thesis an additional argument is needed. The constraint that the sense of an expression must be distinct from its reference leaves Frege ample room for manoeuvre. Consider the following two options:

The sense of "Aristotle" might be conceptually primitive: it is not possible to say what this sense is without using the same proper name again. This is not implausible; we usually don't define a proper name. This move of course raises the question of how the sense of proper names should be explained if one does not want to make it ineffable. More on this soon.

Alternatively, the sense of "Aristotle" in my mouth is identical with a definite description, but not with a famous-deeds description like "the inventor of bi-focals". Neo-Descriptivists offer a variety of meta-linguistic definite descriptions ("the bearer of 'N'") to specify the sense of the name (see, for example, Forbes 1990: 536ff and Jackson 2005).

4.3. Too far from the ideal?

Frege has provided a good reason to distinguish sense from reference. The expressions composing an argument must be distinguished by form and mode of presentation, if the form of an argument is to reveal whether it is formally valid or not. If figures that differ in form also differ in sense, the following relations obtain between sign, sense and reference:

The regular connection between a sign, its sense, and its reference is of such a kind that to the sign there corresponds a definite sense and to that in turn a definite referent, while to a given referent (an object) there does not belong only a single sign. (S&R: 159 (27))

The "no more than one" requirement is, according to Frege, the most important rule that logic imposes on language. (For discussion see May 2006.) A language that can be used to conduct proofs must be unambiguous. A proof is a series of judgements that terminates in the logically justified acknowledgement of the truth of a thought. If "2 + 2 = 4" expressed different thoughts, the designator "that 2 + 2 = 4" would not identify the thought one is entitled to judge on the basis of the proof (CP: 316, fn. 3).
Frege is under no illusion: there is, in general, no such regular connection between sign, sense and reference in natural language:

To every expression belonging to a complete totality of signs [Ganzes von Bezeichnungen], there should certainly correspond a definite sense; but natural languages often do not satisfy this condition, and one must be content if the same word has the same sense in the same context [Zusammenhange]. (S&R: 159 (27-28))

In natural language, the same shape-individuated sign may have more than one sense ("Bank"), and different occurrences of the same sign with the same reference may vary in sense ("Aristotle" in my mouth and in your mouth). Similar things hold for concept-words. In natural language some concept-words are incompletely defined, others are vague. The concept-word "is a natural number" is only defined for numbers; the sense of the word does not determine its application to flowers. "Is bald" is vague: it is neither true nor false of me. Hence, some sentences containing such concept-words violate the law of excluded middle: they are neither true nor false. To prevent truth-value gaps Frege bans vague and incompletely defined concept-words from the language of inference. Even the language of mathematics contains complex signs with more than one referent ("√2").

Natural language doesn't comply with the rule that every expression has in all contexts exactly one sense. Different speakers associate different senses with the figure "Aristotle" referring to the same person at the same time, and/or the same speaker associates different senses with the same proper name at different times. Frege points out that different speakers can correctly use the same proper name for the same bearer on the basis of different definite descriptions (S&R: 27, fn. 2). More importantly, such a principle seems superfluous, since differences in proper name sense don't prevent speakers of natural language from understanding each other's utterances. Hence, we seem to be driven to the consequence that differences in sense between proper names and other expressions don't matter; what matters is that one talks about the same thing:

So long as the reference remains the same, such variations of sense may be tolerated, although they are to be avoided in the theoretical structure of a demonstrative science ("beweisende Wissenschaft") and ought not to occur in a perfect language. (S&R: 158, fn. 4 (27))

Why is variation of sense tolerable outside the demonstrative sciences? Frege answers:

The task of vernacular languages is essentially fulfilled if people engaged in communication with one another connect the same thought, or approximately the same thought, with the same sentence. For this it is not at all necessary that the individual words should have a sense and reference of their own, provided that only the whole sentence has a sense. (PMC: 115. In part my translation)

Frege makes several interesting points here, but let us focus on the main one: if you and I connect approximately the same thought with "Aristotle was born in Stagira", we have communicated successfully. What does 'approximately the same' amount to? What is shared when you understand my question "Is Aristotle a student of Plato?" is not the thought I express with "Aristotle is a student of Plato". If one wants to wax metaphysical, what is shared is a complex consisting of Aristotle (the philosopher) and Plato (the philosopher) standing in the relation of being a student, connected in such a way that the complex is true iff Aristotle is a student of Plato. There are no conventional, community-wide senses for ordinary proper names; there is only a conventional, community-wide reference. Russell sums this up nicely when he writes to Frege:

In the case of a simple proper name like "Socrates" I cannot distinguish between sense and reference: I only see the idea, which is something psychological, and the object. To put it better: I don't acknowledge the sense, only the idea and the reference. (Letter to Frege 12.12.1904, PMC: 169)

If we go along with Frege's use of "sign" for a symbol individuated in terms of the mode of presentation expressed, we must say that often we don't speak the same Fregean language, but that this does not matter for communication. If we reject it, we will speak the same language, but proper names turn out to be ambiguous. (Variations of this argument can be found in Russell 1910/11: 206-207; Kripke 1979: 108; Evans 1982: 399f and Sainsbury 2005: 12ff.)

This line of argument makes the Hybrid View plausible (see Heck 1995: 79). The Hybrid View takes Frege to be right about the content of beliefs expressed by sentences containing proper names, but wrong about what atomic sentences literally say. "Hesperus is a planet" and "Phosphorus is a planet" have the same content, because the proper name senses are too idiosyncratic to contribute to what one literally says with an utterance containing the corresponding name. Grasping what an assertoric utterance of an atomic sentence literally says is, in the basic case, latching on to the right particulars and properties combined in the right way. The mode in which they are presented does not matter. By contrast, "S believes that Phosphorus is a planet" and "S believes that Hesperus is a planet" attribute different beliefs to S.

The argument for the Hybrid View requires the Fregean to answer the question "Why is it profitable to think of natural language in terms of the Fregean ideal in which every expression has one determinate sense?" (see Dummett 1981a: 585). Dummett himself answers that we will gradually approximate the Fregean ideal because we can only rationally decide controversies involving proper names ("Did Napoleon really exist?") when we agree about the sense of these names (Dummett 1981a: 100f). Evans' reply is spot on: "[I]t is the actual practice of using the name 'a', not some ideal substitute, that interests us [...]" (Evans 1982: 40). The semanticist studies English, not a future language that will be closer to the Fregean ideal.

Heck has given another answer that is immune to the objection that the Fregean theory is not a theory for a language anyone (already) speaks. Proponents of the Hybrid View assume (i) that one has to know to which thing "Hesperus" refers in order to understand it and (ii) that there is no constraint on the ways or methods by which one can come to know this. But if understanding your utterance of "George Orwell wrote 1984" consists at least in part in coming to know of George Orwell that the utterance is about him, one cannot employ just any mode of presentation of George Orwell. The speaker will assume that his audience can come to know what he is talking about on the basis of his utterance and features of the context. If the audience's method of finding out who is talked about does not draw on these reasons, they still might get it right. But it might easily have been the case that the belief they did actually acquire was false.
In this situation they would not know who I am talking about with "George Orwell". Hence, the idea that in understanding an assertoric utterance one acquires knowledge limits the ways in which one may think of something in order to understand an utterance about it (Heck 1995: 102).

If this argument for the application of the sense/reference distinction to natural language is along the right lines, the bearers of sense and reference are no longer form-individuated signs. The argument shows at best that the constraints on understanding an utterance are more demanding than getting the references of the uttered words and their mode of combination right. This allows different utterances of the same form-individuated sentence to have different senses. There is no requirement that the audience and the speaker grasp the same sense in understanding (making) the utterance. The important requirement is that they all know what they are referring to.

There is another line of argument for the application of the sense/reference distinction to natural language. Frege often seems to argue that one only needs sense AND reference in languages 'designed' for inferential thinking. But of course we also make inferences in natural languages. Communication in natural language is often joint reasoning. Take the following argument:

You: Hegel was drunk.
Me: And Hegel is married.
We both: Hegel was drunk and is married.

Neither you nor I know both premises independently of each other; each person knows one premise and transmits knowledge of this premise to the other person via testimony. Together we can come to know the conclusion by deduction from the premises. But the argument above can only be valid and knowledge-transferring if you and I are entitled to take for granted, without further justification, that "Hegel" in the first premise names the same person as "Hegel" in the second premise. Otherwise the argument would be incomplete; its rational force would rest on implicit background premises. According to Frege, whenever coincidence in reference is obvious, we have sameness of sense (PMC: 234). Every theory that wants to account for inference must acknowledge that sometimes we are entitled to take co-reference for granted. Hence, we have a further reason to assume that utterances of natural language sentences have senses.

4.4. Sense and reference for context-dependent expressions

Natural language contains unambiguous signs, tokens of which stand for different things in different utterances because the sign means what it does. Among such context-dependent expressions are:

- personal pronouns: 'I', 'you', 'my', 'he', 'she', 'it'
- demonstrative pronouns: 'that', 'this', 'these' and 'those'
- adverbs: 'here', 'now', 'today', 'yesterday', 'tomorrow'
- adjectives: 'actual', 'present' (a rather controversial entry)

An expression has a use as a context-dependent expression if it is used in a way such that its reference in that use can vary with the context of utterance while its linguistic meaning stays the same.
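The combination of a stable linguistic meaning with a varying reference can be pictured in a small sketch (my illustration, anticipating the character/content distinction discussed in the next section): the meaning of an indexical is modeled as a function from contexts of utterance to referents. The contexts below are invented.

```python
# A sketch of context-dependence: the stable linguistic meaning of an
# indexical is modeled as a function from contexts of utterance to
# referents. The example contexts are invented.

from dataclasses import dataclass

@dataclass
class Context:
    speaker: str
    time: str
    place: str

# One meaning rule per indexical; the rule is constant across uses ...
MEANING = {
    "I": lambda c: c.speaker,
    "now": lambda c: c.time,
    "here": lambda c: c.place,
}

c1 = Context(speaker="Alice", time="2011-05-01T09:00", place="Berlin")
c2 = Context(speaker="Bert", time="2011-05-02T17:00", place="Boston")

# ... while the reference varies with the context of utterance.
print(MEANING["now"](c1))  # 2011-05-01T09:00
print(MEANING["now"](c2))  # 2011-05-02T17:00
print(MEANING["I"](c1))    # Alice
```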
Context-dependent expressions are supposed to pose a major problem for Frege. Perry (1977) started a fruitful discussion about Frege's view on context-dependent expressions. Let us use the indexical 'now' to make clear what the problem is supposed to be:

1. If tokens of the English word 'now' are produced at different times, the tokens refer to different times.
2. If two signs differ in reference, they differ in sense.
Hence,
3. Tokens of the English word 'now' that are produced at different times differ in sense.
4. It is not possible that two tokens of 'now' co-refer if they are produced at different times.
Hence,
5. It is not possible that two tokens of 'now' that are produced at different times have the same sense.

Every token of 'now' that is produced at time t differs in sense from all other tokens of 'now' not produced at t. Now one will ask: what is the particular sense of a token of 'now' at t? Perry calls this 'the completion problem'. Is the sense of a particular token of 'now' the same as the sense of a definite description of the time t at which 'now' is uttered? No: take any definite description d of the time t that does not itself contain 'now' or a synonym of 'now'. No statement of the form 'd is now' is trivial. Take as a representative example 'The start of my graduation ceremony is now'. Surely, it is not self-evident that the start of my graduation ceremony is now. Hence, Perry takes Frege to be saddled with the unattractive conclusion that for each time t there is a primitive and particular way in which t is presented to us at t, which gives rise to thoughts accessible only at t, and expressible then with 'now' (Perry 1977: 491).

Perry (1977) and Kaplan (1977/1989) have argued that one should, for this and further reasons, replace Fregean thoughts with two successor notions: character and content. The character of an indexical is, roughly, a rule that fixes the referent of the context-dependent expression in a context; the content is, roughly, the referent that has been so fixed. This revision of Frege has now become the orthodox theory.

4.5. The mode of presentation problem

Frege's theory of sense and reference raises the question "What are modes of presentation?" If modes of presentation are not the meanings of definite descriptions, what are they? This question can be understood as a request to reduce modes of presentation to scientifically more respectable things. I have no argument that one cannot reduce sense to something more fundamental and scientifically acceptable, but there is inductive evidence that there is no such reduction: many people have tried very hard for a long time, and none of them has succeeded (see Schiffer 1990 and 2003). Modes of presentation may simply be modes of presentation and not other things (see Peacocke 1992: 121).
Dummett has tried to work around this problem by making a theory of sense a theory of understanding:

[T]here is no obstacle to our saying what it is that someone can do when he grasps that sense; and that is all that we need the notion of sense for. (Dummett 1981a: 227)

If this is true, we can capture the interesting part of the notion of sense by explaining what knowledge of sense consists in. Does knowledge of sense reduce to something scientifically respectable? Dummett proposes the following condition for knowing the sense of a proper name:

S knows the sense of a proper name N iff S has a criterion for recognising, for any given object, whether it is the bearer of N. (Dummett 1981a: 229)

Does your understanding of "Gottlob Frege" consist in a criterion for recognising him when he is presented to you? (See Evans 1982, sec. 4.2.) No: he can no longer be presented to you in the way he could be presented to his contemporaries. Did the sense of "Gottlob Frege" change when he died?

The real trouble for modes of presentation seems not to be their irreducibility, but the problem of saying, in general, what the difference between two modes of presentation consists in. Consider Fine's (2007: 36) example. You live in a symmetrical universe. At a certain point you are introduced to two identical twins. Perversely, you give them simultaneously the same name 'Bruce'. Even in this situation it is rational to assert the sentence "Bruce is not the same person as Bruce". The Fregean description of this case is that the figure "Bruce" expresses in one idiolect two modes of presentation. But what is the difference between these modes of presentation? The difference cannot be specified in purely qualitative terms: the twins have all their qualitative properties in common. Can the difference be specified in indexical terms? After all, you originally saw one Bruce over there and the other over here. This solution seems ad hoc. For, in general, one can forget the features that distinguished the bearer of a name when one introduced the name and yet continue to use the name. Can the difference in mode of presentation be specified in terms of the actual positions of the two Bruces? Maybe, but this difference cannot ground the continued use of the name, and it raises the question of what makes the current distinction sustain the use of the name originally introduced on the basis of other features. As long as these and related questions are not answered, alternatives to sense and reference merit a fair hearing.

5. Summary

Frege's work on sense and reference has set the agenda for over a century of research. The main challenge for Frege's friends is to find a plausible way to apply his theoretical apparatus to natural languages. Whether one believes that the challenge can be met or not, Frege has pointed us to a pre-philosophical datum, the fact that true identity statements about the same object can differ in cognitive value, which is a crucial touchstone for philosophical semantics.

I want to thank Sarah-Jane Conrad, Silvan Imhof, Laura Mercolli, Christian Nimtz, the Editors and an anonymous referee for helpful suggestions.

6. References

Works by Frege:
Begriffsschrift (BS, 1879). Reprinted: Darmstadt: Wissenschaftliche Buchgesellschaft, 1977.
The Foundations of Arithmetic (FA, 1884). Oxford: Blackwell, 1974. (C. Thiel (ed.). Die Grundlagen der Arithmetik. Hamburg: Meiner, 1988.)
Nachgelassene Schriften (NS), edited by Hans Hermes, Friedrich Kambartel & Friedrich Kaulbach. 2nd rev. edn. Hamburg: Meiner Verlag, 1983. 1st edn. translated by Peter Long & Roger White as Posthumous Writings (PW). Oxford: Basil Blackwell, 1979.
Wissenschaftlicher Briefwechsel (BW), edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel & Albert Veraart. Hamburg: Meiner Verlag, 1976. Abridged by Brian McGuinness and translated by Hans Kaal as Philosophical and Mathematical Correspondence (PMC). Oxford: Basil Blackwell, 1980.
McGuinness, Brian (ed.) 1984. Gottlob Frege: Collected Papers on Mathematics, Logic and Philosophy (CP). Oxford: Basil Blackwell.
'Über Sinn und Bedeutung'. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. Translated as 'On Sense and Meaning' (S&R), in: McGuinness 1984, 157-177.
'Der Gedanke'. Beiträge zur Philosophie des deutschen Idealismus 1, 58-77. Translated as 'Thoughts' (T), in: McGuinness 1984, 351-373.

Other Works:
Bell, David 1987. Thoughts. Notre Dame Journal of Formal Logic 28, 36-51.
Bezuidenhout, Anne & Marga Reimer (eds.) 2004. Descriptions and Beyond. Oxford: Oxford University Press.
Burge, Tyler 2005. Truth, Thought, Reason: Essays on Frege. Oxford: Clarendon Press.
Campbell, John 1994. Past, Space and Self. Cambridge, MA: The MIT Press.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL: The University of Chicago Press.
Church, Alonzo 1951. A formulation of the logic of sense and denotation. In: P. Henle, H. M. Kallen & S. K. Langer (eds.). Structure, Method and Meaning. Essays in Honour of Henry M. Sheffer. New York: Liberal Arts Press, 3-24.
Church, Alonzo 1973. Outline of a revised formulation of the logic of sense and denotation (Part I). Noûs 7, 24-33.
Church, Alonzo 1974. Outline of a revised formulation of the logic of sense and denotation (Part II). Noûs 8, 135-156.
Davidson, Donald 1962. The method of extension and intension. In: P. A. Schilpp (ed.). The Philosophy of Rudolf Carnap. La Salle, IL: Open Court, 311-351.
Davidson, Donald 1984. Inquiries into Truth and Interpretation. Oxford: Oxford University Press.
Dummett, Michael 1975. Frege's distinction between sense and reference [Spanish translation under the title 'Frege']. Teorema V, 149-188. Reprinted in: M. Dummett. Truth and Other Enigmas. London: Duckworth, 1978, 116-145.
Dummett, Michael 1981a. Frege: Philosophy of Language. 2nd edn. London: Duckworth.
Dummett, Michael 1981b. The Interpretation of Frege's Philosophy. London: Duckworth.
Dummett, Michael 1989. More about thoughts. Notre Dame Journal of Formal Logic 30, 1-19.
Dummett, Michael 1993. The Seas of Language. Oxford: Oxford University Press.
Evans, Gareth 1982. Varieties of Reference. Oxford: Oxford University Press.
Fine, Kit 1989. The problem of de re modality. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 197-272.
Fine, Kit 2007. Semantic Relationism. Oxford: Blackwell.
Forbes, Graeme 1990. The indispensability of Sinn. The Philosophical Review 99, 535-563.
Furth, Montgomery 1964. Introduction. In: M. Furth (ed.). The Basic Laws of Arithmetic: Exposition of the System. Berkeley, CA: University of California Press, v-lix.
Geach, Peter T. 1975. Names and identity. In: S. Guttenplan (ed.). Mind and Language. Oxford: Clarendon Press, 139-158.
Heck, Richard 1995. The sense of communication. Mind 104, 79-106.
Jackson, Frank 2005. What are proper names for? In: J. C. Marek & M. E. Reicher (eds.). Experience and Analysis. Proceedings of the 27th International Wittgenstein Symposium. Vienna: hpt-obv, 257-269.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481-563.
Kemmerling, Andreas 1990. Gedanken und ihre Teile. Grazer Philosophische Studien 37, 1-30.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 239-283. Reprinted in: N. Salmon & S. Soames (eds.). Propositions and Attitudes. Oxford: Oxford University Press, 1989, 102-148.
Kripke, Saul 1980. Naming and Necessity. Cambridge, MA: Harvard University Press. (Separate reissue of: S. Kripke. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 1972, 253-355 and 763-769.)
Künne, Wolfgang 1992. Hybrid proper names. Mind 101, 721-731.
Levine, James 2004. On the 'Gray's Elegy' argument and its bearing on Frege's theory of sense. Philosophy and Phenomenological Research 64, 251-295.
Lewis, David 1970. General semantics. Synthese 22, 18-67.
Makin, Gideon 2000. The Metaphysicians of Meaning. London: Routledge.
May, Robert 2006. The invariance of sense. The Journal of Philosophy 103, 111-144.
McDowell, John 1977. On the sense and reference of a proper name. Mind 86, 159-185.
Mendelsohn, Richard L. 1982. Frege's Begriffsschrift theory of identity. Journal of the History of Philosophy 20, 279-299.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221-242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247-270.
Parsons, Terence 1997. Fregean theories of truth and meaning. In: M. Schirn (ed.). Frege: Importance and Legacy. Berlin: de Gruyter, 371-409.
Peacocke, Christopher 1992. A Study of Concepts. Cambridge, MA: The MIT Press.
Peacocke, Christopher 1999. Being Known. Oxford: Oxford University Press.
Perry, John 1977. Frege on demonstratives. The Philosophical Review 86, 474-497.
Putnam, Hilary 1954. Synonymy and the analysis of belief sentences. Analysis 14, 114-122.
Putnam, Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam. Mind, Language and Reality: Philosophical Papers, Volume 2. Cambridge: Cambridge University Press, 1975, 215-271.
Quine, Willard van Orman 1951. Two dogmas of empiricism. The Philosophical Review 60, 20-43. Reprinted in: W. V. O. Quine. From a Logical Point of View. Cambridge, MA: Harvard University Press, 1953, 20-47.
Ricketts, Thomas 2003. Quantification, sentences, and truth-values. Manuscrito: Revista Internacional de Filosofia 26, 389-424.
Rumfitt, Ian 1994. Frege's theory of predication: An elaboration and defense, with some new applications. The Philosophical Review 103, 599-637.
Russell, Bertrand 1905. On denoting. Mind 14, 479-493.
Russell, Bertrand 1910/11. Knowledge by acquaintance and knowledge by description. Proceedings of the Aristotelian Society New Series 11, 108-128. Reprinted in: B. Russell. Mysticism and Logic. London: Unwin, 1986, 201-221.
Sainsbury, Mark R. 2005. Reference without Referents. Oxford: Oxford University Press.
Schiffer, Stephen 1990. The mode of presentation problem. In: A. C. Anderson & J. Owens (eds.). Propositional Attitudes. Stanford, CA: CSLI Publications, 249-269.
Schiffer, Stephen 2003. The Things We Mean. Oxford: Oxford University Press.
Soames, Scott 1998. The modal argument: Wide scope and rigidified descriptions. Noûs 32, 1-22.
Textor, Mark 2007. Frege's theory of hybrid proper names developed and defended. Mind 116, 947-982.
Textor, Mark 2009. A repair of Frege's theory of thoughts. Synthese 167, 105-123.
Wiggins, David 1997. Meaning and truth conditions: From Frege's grand design to Davidson's. In: B. Hale & C. Wright (eds.). A Companion to the Philosophy of Language. Oxford: Blackwell, 3-29.

Mark Textor, London (United Kingdom)

4. Reference: Foundational issues

1. Introduction
2. Direct reference
3. Frege's theory of sense and reference
4. Reference vs. quantification and Russell's theory of descriptions
5. Strawson's objections to Russell
6. Donnellan's attributive-referential distinction
7. Kaplan's theory of indexicality
8. Proper names and Kripke's return to Millian nondescriptionality
9. Propositional attitude contexts
10. Indefinite descriptions
11. Summary
12. References

Abstract

This chapter reviews issues surrounding theories of reference. The simplest theory is the "Fido"-Fido theory - that reference is all that an NP has to contribute to the meaning of phrases and sentences in which it occurs. Two big problems for this theory are coreferential NPs that do not behave as though they were semantically equivalent and meaningful NPs without a referent. These problems are especially acute in sentences about beliefs and desires - propositional attitudes. Although Frege's theory of sense, and Russell's quantificational analysis, seem to solve these problems for definite descriptions, they do not work well for proper names, as Kripke shows. And Donnellan and Strawson have other objections to Russell's theory. Indexical expressions like "I" and "here" create their own issues; we look at Kaplan's theory of indexicality, and several solutions to the problem indexicals create in propositional attitude contexts. The final section looks at indefinite descriptions, and some more recent theories that make them appear more similar to definite descriptions than was previously thought.
50 I. Foundations of semantics 1. Introduction Reference, it seems, is what allows us to use language to talk about things and thus vital to the functioning of human language. That being said there remain several parameters to be fixed in order to determine a coherent field of study. One important one is whether reference is best viewed as a semantic phenomenon - a relation between linguistic expressions and the objects referred to, or whether it is best viewed as pragmatic - a three-place relation among language users, linguistic expressions, and things. The ordinary everyday meaning of words like "refer" and "reference" would incline us toward the pragmatic view (we say, e.g., Who were you referring to?, not Who was your phrase referring to?). However there is a strong tradition, stemming from work in logic and early modern philosophy of language, of viewing reference as a semantic relation, so that will be the main focus of our attention at the outset, although we will turn before long to pragmatic views. 1.1. Reference vs. predication Another parameter to be determined is what range of expressions (can be used to) refer.The traditional model sentence, e.g. Socrates rims, consists of a simple noun phrase (NP), a proper name in this case, and a simple verb phrase (VP).The semantic function of the NP is to pick out (i.e. refer to) some entity, and the function of the VP is to predicate a property of that entity. The sentence is true if the entity actually has that property, and false otherwise. Seen in this light, reference and predication are quite different operations. Of course many natural language sentences do not fit this simple model. In a semantics which is designed to treat the full panoply of expression types and to provide truth conditions for sentences containing them, the distinction between reference and predication may not be so clear or important. In classical Montague Grammar, for example, expressions of all categories (except determiners and conjunctions) are assigned an extension, where extension is a formal counterpart of reference (Montague 1973;see also article 11 (Kempson) Formal semantics and representationalism). (In Montague Grammar expressions are also assigned an intension, which corresponds to a Fre- gean sense - see below, and cf. article 3 (Textor) Sense and reference.) An expression's extension may be an ordinary type of entity, or, more typically, it may be a complex function of some kind. Thus the traditional bifurcation between reference and predication is not straightforwardly preserved in this approach. However for our purposes we will assume this traditional bifurcation, or at least that there is a difference between NPs and other types of expressions, and wc will consider reference only for NPs. Furthermore our primary focus will be on definite NPs, which are the most likely candidates for referring expressions, though they may have other, non-referring uses too (see the articles in Chapter IX, Noun phrase semantics). The category of definite NPs includes proper names (e.g. Amelia Earhart), definite descriptions (e.g. the book Sally is reading), demonstrative descriptions (e.g. this house), and pronouns (e.g. you, that). We will have only a limited amount to say about pronouns; for the full story, sec article 40 (Biiring) Pronouns. (Another important category comprises generic NPs; for these, see article 47 (Carlson) Genericity.) Whether other types of NP, such as indefinite descriptions (e.g. 
a letter from my mother), can be properly said to be referring expressions is an issue of some dispute - see below, section 10.
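Before moving on, the traditional reference-plus-predication picture can be made concrete with a toy computational rendering. The following Python sketch is purely illustrative: the model, the entities, and the extensions assigned to the lexical items are invented for the example and form no part of any theory discussed here.

# A toy model of the traditional picture: reference maps NPs to entities,
# and predication maps VPs to extensions (sets of entities). A sentence
# [NP VP] is true just in case the NP's referent is in the VP's extension.

reference = {"Socrates": "socrates", "Plato": "plato"}   # NP -> entity
extension = {"runs": {"socrates"}, "sits": {"plato"}}    # VP -> set of entities

def sentence_true(np, vp):
    return reference[np] in extension[vp]

print(sentence_true("Socrates", "runs"))   # True
print(sentence_true("Plato", "runs"))      # False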
1.2. The metaphysical problem of reference

Philosophers have explored the question of what it is, in virtue of which an expression has a reference - what links an expression to a reference, and how did it come to do that? Frege (1892) argued that expressions express a sense, which is best thought of as a collection of properties. The reference of an expression is that entity which possesses exactly the properties contained in the sense. Definite descriptions are the clearest examples for this theory; the inventor of bifocals specifies a complex property of having been the first to think up and create a special type of spectacles, and that NP refers to Benjamin Franklin because he is the one who had that property. One problem with this answer to the question of what determines reference is the mysterious nature of senses. Frege insisted they were not to be thought of as mental entities, but he did not say much positive about what they are, and that makes them philosophically suspect. (See article 3 (Textor) Sense and reference.) Another answer, following Kripke (1972), is what is usually referred to as a "causal (or historical) chain". The model in this case is proper names, and the idea is that there is some initial kind of naming event whereupon an entity is bestowed with a name, and then that name is passed down through the speech community as a name of that entity. In this article we will not be so much concerned with the metaphysical problem of what determines reference, but instead with the linguistic problem of reference - determining what it is that referring expressions contribute semantically to the phrases and sentences in which they occur.

1.3. The linguistic problem of reference

The two answers to the metaphysical problem of reference correlate with two answers to the linguistic question of what it is that NPs contribute to the semantic content of phrases and sentences in which they appear. Frege's answer to the linguistic question is that expressions contribute their reference to the reference of the phrases in which they occur, and they contribute their sense to the sense of those phrases. But there are complications to this simple answer that we will review below in section 3. The other answer is that expressions contribute only their reference to the semantic content of the phrases and sentences in which they occur. Since this is a simpler answer, we will begin by looking at it in some more detail, in section 2, in order to be able to understand why Frege put forward his more complex theory.

2. Direct reference

The theory according to which an NP contributes only its reference to the phrases and sentences in which it occurs is currently called the "direct reference" theory (the term was coined by Kaplan 1989). It is also sometimes called the "Fido"-Fido theory, the idea being that you have the name Fido and its reference is the dog, Fido, and that's all there is to reference and all there is to the meaning of such phrases. One big advantage of this simple theory is that it does not result in the postulation of any suspect entities. However there are two serious problems for this simple theory: one is the failure of coreferential NPs to be fully equivalent semantically, and the other is presented by seemingly meaningful NPs which do not have a reference - so called "empty NPs". We will look more closely at each of these problems in turn.
2.1. Failure of substitutivity

According to the direct reference theory, coreferential NPs - NPs which refer to the same thing - should be able to be substituted for each other in any sentence without a change in the semantics of the sentence or its truth value. (This generalization about intersubstitutivity is sometimes referred to as "Leibniz' Law".) If all that an NP has to contribute semantically is its reference, then it should not matter how that reference is contributed. However almost any two coreferential NPs will not seem to be intersubstitutable - they will seem to be semantically different. Frege (1892) worried in particular about two different kinds of sentence that showed this failure of substitutivity.

2.1.1. Identity sentences

The first kind are identity sentences - sentences of the form a = b, or (a little more colloquially) a is (the same entity as) b. If such a sentence is true, then the NPs a and b are coreferential so, according to the direct reference theory, it shouldn't matter which NP you use, including in identity sentences themselves! That is, the two sentences in (1) should be semantically equivalent (these are Frege's examples).

(1) a. The morning star is the morning star.
    b. The morning star is the evening star.

However, as Frege noted, the two sentences are very different in their cognitive impact. (1a) is a trivial sentence, whose truth is known to anyone who understands English (it is analytic). (1b) on the other hand gives the results of a major astronomical finding. Thus even though the only difference between (1a) and (1b) is that we have substituted coreferential NPs (the evening star for the morning star), there is still a semantic difference between them.

2.1.2. Propositional attitude sentences

The other kind of sentence that Frege worried about was sentences about propositional attitudes - the attitudes of sentient beings about situations or states of affairs. Such sentences will have a propositional attitude verb like believe, hope, know, doubt, want, etc. as their main verb, plus a sentential complement saying what the subject of the verb believes, hopes, wants, etc. Just as with identity statements, coreferential NPs fail to be intersubstitutable in the complements of such sentences. However the failure is more serious in this case. Intersubstitution of coreferential NPs in identity sentences always preserves truth value, but in propositional attitude sentences the truth value may change. Thus (2a) could be true while (2b) was false.

(2) a. Mary knows that the morning star is a planet.
    b. Mary knows that the evening star is a planet.

We can easily imagine that Mary has learned the truth about the morning star, but not about the evening star.
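The difficulty can be brought out with a toy implementation of the direct reference theory, in the same spirit as the illustrative sketch in section 1.1; again the model is invented purely for the example.

# Under direct reference, an NP's semantic value is exhausted by its referent.
referent = {"the morning star": "venus", "the evening star": "venus"}

def identity_true(np1, np2):
    # "NP1 is NP2" is true iff the two referents coincide.
    return referent[np1] == referent[np2]

# (1a) and (1b) receive exactly the same analysis and the same value:
print(identity_true("the morning star", "the morning star"))  # True
print(identity_true("the morning star", "the evening star"))  # True

# Since the two NPs make identical semantic contributions, nothing in the
# theory distinguishes the trivial (1a) from the informative (1b), and
# swapping them inside "Mary knows that ..." could never affect truth
# value - contrary to the judgments about (2a)/(2b).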
2.2. Empty NPs

The other major problem for the direct reference theory is presented by NPs that do not refer to anything - NPs like the golden mountain or the round square. The direct reference theory seems to make the prediction that sentences containing such NPs should be semantically defective, since they contain a part which has no reference. Yet sentences like those in (3) do not seem defective at all.

(3) a. Lee is looking for the golden mountain.
    b. The philosopher's stone turns base metals into gold.

Sentences about existence, especially those that deny existence, pose special problems here. Consider (4):

(4) The round square does not exist.

Not only is (4) not semantically defective, it is even true! So this is the other big problem for the Fido-Fido direct reference theory.

3. Frege's theory of sense and reference

As noted above, Frege proposed that referring expressions have semantic values on two levels, reference and sense. Frege was anticipated in this by Mill (1843), who had proposed a similar distinction between denotation (reference) and connotation (sense). (Mill's use of the word "connotation" must be kept distinct from its current use to mean hints or associations connected with a word or phrase. Mill's connotations functioned like Frege's senses - to determine reference (denotation).) One important aspect of Frege's work was his elegant arguments. He assumed two fundamental principles of compositionality. At the level of senses he assumed that the sense of a complex expression is determined by the senses of its component parts plus their syntactic mode of combination. Similarly at the level of reference, the reference of a complex expression is determined by the references of its parts plus their mode of combination. (Following Frege, it is commonly assumed today that meanings, whatever they are, are compositional. That is thought to be the only possible explanation of our ability to understand novel utterances. See article 6 (Pagin & Westerståhl) Compositionality.) Using these two principles Frege argued further that the reference of a complete sentence is its truth value, while the sense of a sentence is the proposition it expresses.

3.1. Solution to the problem of substitutivity

Armed with senses and the principles of compositionality, plus an additional assumption that we will get to shortly, Frege was able to solve most, though not all, of the problems pointed out above.
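One common way of making the two levels concrete - anticipating the possible worlds formalization discussed in section 3.3.3 below - is to model a sense as a function from ways the world might be to referents, with reference being the value of that function at the actual world. The following Python sketch is again only illustrative, and the two toy worlds in it are invented for the example.

# Two possible worlds: in w1 (taken to be actual) the brightest object seen
# in the morning sky and the one seen in the evening sky are both Venus;
# in w2 they are distinct bodies.
worlds = {
    "w1": {"morning": "venus", "evening": "venus"},
    "w2": {"morning": "mars",  "evening": "venus"},
}

# A sense is modeled as a function from worlds to referents.
def sense_morning_star(w):
    return worlds[w]["morning"]

def sense_evening_star(w):
    return worlds[w]["evening"]

actual = "w1"
# Same reference at the actual world ...
print(sense_morning_star(actual) == sense_evening_star(actual))  # True
# ... but different senses: the functions disagree at w2. This is why
# (1b) can be informative while (1a) is trivial.
print(sense_morning_star("w2") == sense_evening_star("w2"))      # False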
3.1.1. Identity sentences

First, for the problem of failure of substitutivity of coreferential NPs in identity statements Frege presents a simple solution. Recall example (1), repeated here.

(1) a. The morning star is the morning star.
    b. The morning star is the evening star.

Although the morning star and the evening star have the same reference (making (1b) a true sentence), the two NPs differ in sense. Thus (1b) has a different sense from (1a), and so we can account for the difference in cognitive impact.

3.1.2. Propositional attitude sentences

Turning to the problem of substitutivity in propositional attitude contexts, Frege again offers us a solution, although the story is a bit more complicated in this case. Here are the examples from (2) above.

(2) a. Mary knows that the morning star is a planet.
    b. Mary knows that the evening star is a planet.

Simply observing that the evening star has a different sense from the morning star will account for why (2b) has a different sense from (2a), but by itself does not yet account for the possible change in truth value. This is where the extra piece of machinery mentioned above comes in. Frege pointed out that expressions can sometimes shift their reference in particular contexts. When we quote expressions, for example, those expressions no longer have their customary reference, but instead refer to themselves. Consider the example in (5).

(5) "The evening star" is a definite description.

The phrase the evening star as it occurs in (5) does not refer to the planet Venus any more, but instead refers to itself. Frege argued that a similar phenomenon occurs in propositional attitude contexts. In such contexts, Frege argued, expressions also shift their reference, but here they refer to their customary sense. This means that the reference of (2a) (its truth value) involves the customary sense of the phrase the morning star rather than its customary reference, while the reference/truth value of (2b) involves instead the customary sense of the phrase the evening star. Since we have two different components, it is not unexpected that we could have two different references - truth values - for the two sentences.

3.2. Empty NPs

NPs that have no reference were the other main problem area for the direct reference theory. One problem was the apparent meaningfulness of sentences containing such NPs. It is easy to see how Frege's theory solved this problem. As long as such NPs have a sense, the sentences containing them can have a well-formed sense as well, so their
meaningfulness is not a problem. We should note, though, that Frege's theory predicts that such sentences will not have a truth value. That is because the truth value, as we have noted, is determined by the references of the constituent expressions in a sentence, and if one of those expressions doesn't have a reference then the whole sentence will not have one either. This means that true negative existential sentences, such as (4), repeated here:

(4) The round square does not exist.

remain a problem for Frege. Since the subject does not have a reference, the whole sentence should not have a truth value, but it does.

3.3. Further comments on Frege's work

Several further points concerning Frege's work will be relevant in what follows.

3.3.1. Presupposition

As we have just observed, Frege's principle of compositionality at the level of reference, together with his conclusion that the reference of a sentence is its truth value, means that a sentence containing an empty NP will fail to have a truth value. Frege held that the use of an NP involves a presupposition, rather than an assertion, that the NP in question has a reference. (See article 91 (Beaver & Geurts) Presupposition.) Concerning example (6)

(6) The one who discovered the elliptical shape of the planetary orbits died in misery.

Frege said that if one were to hold that part of what one asserts in the use of (6) is that there is a person who discovered the elliptical shape of the planetary orbits, then one would have to say that the denial of (6) is (7).

(7) Either the one who discovered the elliptical shape of the planetary orbits did not die in misery, or no one discovered the elliptical shape of the planetary orbits.

But the denial of (6) is not (7) but rather simply (8).

(8) The one who discovered the elliptical shape of the planetary orbits did not die in misery.

Instead, both (6) and (8) presuppose that the definite description the one who discovered the elliptical shape of the planetary orbits has a reference, and if the NP were not to have a reference, then neither sentence would have a truth value.

3.3.2. Proper names

It was mentioned briefly above that Mill's views were very similar to Frege's in holding that referring expressions have two kinds of semantic significance - both sense and
reference, or in Mill's terms, connotation and denotation. Mill, however, made an exception for proper names, which he believed did not have connotation but only denotation. Frege appeared to differ from Mill on that point. We can see that, given Frege's principle of compositionality of sense, it would be important for him to hold that proper names do have a sense, since sentences containing them can clearly have a sense, i.e. express a proposition. His most famous comments on the subject occur in a footnote to "On sense and reference" in which he appeared to suggest that proper names have a sense which is similar to the sense which a definite description might have, but which might vary from person to person. The name Aristotle, he seemed to suggest, could mean 'the pupil of Plato and teacher of Alexander the Great' for one person, but 'the teacher of Alexander the Great who was born in Stagira' for another person (Frege 1892, fn. 2).

3.3.3. Propositions

According to Frege, the sense of a sentence is the proposition it expresses, but it has been very difficult to determine what propositions are. Frege used the German word "Gedanke" ('thought'), and as we have seen, for Frege (as for many others) propositions are not only what sentences express, they are also the objects of propositional attitudes. Subsequent formalizations of Frege's ideas have used the concept of possible worlds to analyze them. Possible worlds are simply alternative ways things (in the broadest sense) might have been. E.g. I am sitting in my office at this moment, but I might instead have gone out for a walk; there might have been only 7 planets in our solar system, instead of 8 or 9. Using this notion, propositions were analyzed as functions from possible worlds to truth values (or equivalently, as sets of possible worlds). This meshes nicely with the idea that the sense of a sentence combined with facts about the way things are (a possible world) determines a truth value. However there are problems with this view; for example, all mathematical truths are necessarily true, and thus true in every possible world, but the sentences expressing them do not seem to have the same meaning (in some pre-theoretic sense of meaning), and it seems that one can know the truth of one without knowing the truth of all of them - e.g. someone could know that two plus two is four, but not know that there are an infinite number of primes. (For continued defense of the possible worlds view of the objects of thought, see Stalnaker 1984, 1999.) Following Carnap (1956), David Lewis (1972) suggested that sentence meanings are best viewed as entities with syntactic structure, whose elements are the senses (intensions) of the constituent expressions. We will see an additional proposal concerning what at least some propositions are shortly, in section 4 on Russell.

3.4. Summary comments on Frege

Frege's work was neglected for some time, both within and outside Germany. Eventually it received the attention it deserved, especially during the development of formal semantic treatments of natural language by Carnap, Kripke, Montague, and others (see article 10 (Newen & Schröder) Logic and semantics, article 11 (Kempson) Formal semantics and representationalism). Although the distinction between sense and reference (or intension and extension) is commonly accepted, Frege's analysis of propositional attitude contexts has fallen out of favor; Donald Davidson has famously declared Frege's
theory of a shift of reference in propositional attitude contexts to be "plainly incredible" (Davidson 1968/1984: 108), and many others seem to have come to the same conclusion.

4. Reference vs. quantification and Russell's theory of descriptions

In his classic 1905 paper "On denoting", Bertrand Russell proposed an alternative to Frege's theory of sense and reference. To understand Russell's work it helps to know that he was fundamentally concerned with knowledge. He distinguished knowledge by acquaintance, which is knowledge we gain directly via perception, from knowledge by description (cf. Russell 1917), and he sought to analyze the latter in terms of the former. Russell was a direct reference theorist, and rejected Frege's postulation of senses (though he did accept properties, or universals, as the semantic content of predicative expressions). The only genuine referring expressions, for Russell, were those that could guarantee a referent, and the propositions expressed by sentences containing such expressions are singular propositions, which contain actual entities. If I were to point at Mary, and utter (9a), I would be expressing the singular proposition represented in (9b).

(9) a. She is happy.
    b. <Mary, happiness>

Note that the first element of (9b) is not the name Mary, but Mary herself. Any NP which is unable to guarantee a referent cannot contribute an entity to a singular proposition. Russell's achievement in "On denoting" was to show how such NPs could be analyzed away into the expression of general, quantificational propositions.

4.1. Quantification

In traditional predicate logic, overtly quantificational NPs like every book, no chair do not have an analysis per se, but only in the context of a complete sentence. (In logics with generalized quantifiers, developed more recently, this is not the case; see article 43 (Keenan) Quantifiers.) Look at the examples in (10) and (11).

(10) a. Every book is blue.
     b. ∀x[book(x) ⊃ blue(x)]

(11) a. No table is sturdy.
     b. ~∃x[table(x) & sturdy(x)]

(10b), the traditional logical analysis of (10a), says (when translated back into English) For every x, if x is a book then x is blue. We can see that every, the quantificational element in (10a), has been elevated to the sentence level, in effect, so that it expresses a relationship between two properties - the property of being a book and the property of being blue. Similarly (11b) says, roughly, It is not the case that there is an x such that x is a table and x is sturdy. It can be seen that this has the same truth conditions as No table is sturdy, and once again the quantificational element (no in this case) has been analyzed as expressing a relation between two properties, in this case the properties of being a table and being sturdy.
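Truth conditions of this kind are easy to compute over a finite model. The Python sketch below is offered only as an illustration: it treats every and no as relations between two sets, in the spirit of generalized quantifier theory, and adds for comparison the analysis of the that Russell proposes in the next subsection. The model and all the extensions in it are invented for the example.

# Determiners as relations between a restrictor set and a scope set.
def every(restrictor, scope):
    return restrictor <= scope                 # subset: all As are Bs

def no(restrictor, scope):
    return not (restrictor & scope)            # empty intersection

# Russell's analysis of "the": existence, uniqueness, and predication.
def the(restrictor, scope):
    return len(restrictor) == 1 and restrictor <= scope

book, blue     = {"b1", "b2"}, {"b1", "b2", "c1"}
table, sturdy  = {"t1"}, set()
king_of_france = set()                         # an empty NP
bald           = {"b1"}

print(every(book, blue))           # True:  (10a)
print(no(table, sturdy))           # True:  (11a)
print(the(king_of_france, bald))   # False: the existence clause fails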
4.2. Russell's analysis of definite descriptions

Russell's analysis of definite descriptions was called "the paradigm of philosophy" (by Frank Ramsey), and if analysis is the heart of philosophy then indeed it is that. One of the examples Russell took to illustrate his method is given in (12a), and its analysis is in (12b).

(12) a. The present king of France is bald.
     b. ∃x[king-of-France(x) & ∀y[king-of-France(y) ⊃ y=x] & bald(x)]

The analysis in (12b) translates loosely into the three propositions expressed by the sentences in (13).

(13) a. There is a king of France.
     b. There is at most one king of France.
     c. He is bald.

(He, in (13c), must be understood as bound by the initial There is a... in (13a).) As can be seen, the analysis in (12b) contains no constituent that corresponds to the present king of France. Instead the is analyzed as expressing a complex relation between the properties of being king of France and being bald. Let us look now at how Russell's analysis solves the problems for the direct reference theory.

4.3. Failure of substitutivity

4.3.1. Identity sentences

Although Russell was a direct reference theorist, we can see that, under his analysis, English definite descriptions have more to contribute to the sentences in which they occur than simply their reference. In fact they no longer contribute their reference at all (because they are no longer referring expressions, and do not have a reference). Instead they contribute the properties expressed by each of the predicates occurring in the description. It follows that two different definite descriptions, such as the morning star and the evening star, will make two different contributions to their containing sentences. And thus it is no mystery why the two identity sentences from (1) above, repeated here in (14), have different cognitive impact.

(14) a. The morning star is the morning star.
     b. The morning star is the evening star.

The meaning of the second sentence involves the property of being seen in the evening as well as that of being seen in the morning.

4.3.2. Propositional attitude sentences

When we come to propositional attitude sentences the story is a little more complicated. Recall that Russell's analysis does not apply to a definite description by itself, but only in the context of a sentence. It follows that when a definite description occurs in an
embedded sentence, as in the case of propositional attitude sentences, there will be two ways to unpack it according to the analysis. Thus Russell predicts that such sentences are ambiguous. Consider our example from above, repeated here as (15).

(15) Mary knows that the morning star is a planet.

According to Russell's analysis we may unpack the phrase the morning star with respect to either the morning star is a planet or Mary knows that the morning star is a planet. The respective results are given in (16).

(16) a. Mary knows that ∃x[morning star(x) & ∀y[morning star(y) ⊃ y=x] & planet(x)]
     b. ∃x[morning star(x) & ∀y[morning star(y) ⊃ y=x] & Mary knows that planet(x)]

The unpacking in (16a) is what is called the narrow scope or de dicto (roughly, about the words) interpretation of (15). The proposition that Mary is said to know involves the semantic content that the object in question is the star seen in the morning. The unpacking in (16b) is called the wide scope or de re (roughly, about the thing) interpretation of (15). It attributes to Mary knowledge concerning a certain entity, but not under any particular description of that entity.

The short answer to the question of how Russell's analysis solves the problem of failure of substitutivity in propositional attitude contexts is that, since there are no referring constituents in the sentence after its analysis, there is nothing to substitute anything for. However Russell acknowledged that one could, in English, make a verbal substitution of one definite description for a coreferential one, but only on the wide scope, or de re interpretation.

If we consider a slightly more dramatic example than (15), we can see that there seems to be some foundation for Russell's prediction of ambiguity for propositional attitude sentences. Observe (17):

(17) Oedipus wanted to marry his mother.

Our first reaction to this sentence is probably to think that it is false - after all, when Oedipus found out that he had married his mother, he was very upset. This reaction is to the narrow scope, or de dicto reading of the sentence, which attributes to Oedipus a desire which involves being married to specifically his mother and which is false. However there is another way to take the sentence according to which it seems to be true: there was a woman, Jocasta, who happened to be Oedipus's mother and whom he wanted to marry. This second interpretation is the wide scope, or de re, reading of (17), according to which Oedipus has a desire concerning a particular individual, but where the individual is not identified for the purposes of the desire itself by any description.

4.4. Empty NPs

Recall our initial illustration of Russell's analysis of definite descriptions, repeated here.

(18) a. The present king of France is bald.
     b. ∃x[king-of-France(x) & ∀y[king-of-France(y) ⊃ y=x] & bald(x)]
The example shows Russell's solution to the problem of empty NPs. While for Frege such sentences have a failed presupposition and lack a truth value, under Russell's analysis they assert the existence of the entity in question, and are therefore simply false. Furthermore Russell's analysis of definite descriptions solves the more pressing problem of empty NPs in existence sentences. Under his analysis The round square does not exist would be analyzed as in (19)

(19) ~∃x[round(x) & square(x) & ∀y[[round(y) & square(y)] ⊃ y=x]]

which is meaningful and true.

4.5. Proper names

Russell's view of proper names was very similar to Frege's view; he held that they are abbreviations for definite descriptions (which might vary from person to person) and thus that they have semantic content in addition to, or more properly in lieu of, a reference (cf. Russell 1917).

4.6. Referring and denoting

As we have seen, for Russell, definite descriptions are not referring expressions, though he did describe them as denoting. For Russell almost any NP is a denoting phrase, including, e.g. every hat and nobody. One might ask what, if any, expressions were genuine referring expressions for Russell. Ultimately he held that only a demonstrative like this, used demonstratively, would meet the criterion, since only such an expression could guarantee a referent. These were the only true proper names, in his view. (See Russell 1917: 216 and fn. 5). It seems clear that we often use language to convey information about individual entities; if Russell's analysis is correct, it means that the propositions containing that information must almost always be inferred rather than being directly expressed. Russell's analysis of definite descriptions (though not of proper names) has been defended at length by Neale (1990).

5. Strawson's objections to Russell

Russell's paper "On denoting" stood without opposition for close to 50 years, but in 1950 P. F. Strawson's classic reply "On referring" appeared. Strawson had two major objections to Russell's analysis - his neglect of the indexicality of definite descriptions like the king of France, and his claim that sentences with definite descriptions in them were used to assert the existence of a reference for the description. Let us look at each of these more closely.

5.1. Indexicality

Indexical expressions are those whose reference depends in part on aspects of the utterance context, and thus may vary depending on context. Obvious examples are pronouns like I and you, and adverbs like here and yesterday. Such expressions make vivid the
difference between a sentence and the use of a sentence to make a statement - a difference which may be ignored for logical purposes (given that mathematical truths are non-indexical) but whose importance in natural language was stressed by Strawson. Strawson pointed out that a definite description like the king of France could have been used at different past times to refer to different people - Louis XV in 1750, but Louis XVI in 1790, for example. Hence he held that it was a mistake to speak of expressions as referring; instead we can only speak of using an expression to refer on a particular occasion. This is the pragmatic view of reference that was mentioned at the outset of this article. Russell lived long enough to publish a rather tart response, "Mr. Strawson on referring", in which he pointed out that the problem of indexicality was independent of the problems of reference which were his main concern in "On denoting". However indexicality does raise interesting and relevant issues, and we return to it below, in section 7. (See also article 61 (Schlenker) Indexicality and de se.)

5.2. Empty NPs and presupposition

The remainder of Strawson's paper was primarily concerned with arguing that Russell's analysis of definite descriptions was wrong in its implication that sentences containing them would be used to assert the existence (and uniqueness) of entities meeting their descriptive content. Instead, he said, a person using such a sentence would only imply "in a special sense of 'imply'" (Strawson 1950: 330) that such an entity exists. (Two years later he introduced the term presuppose for this special sense of imply, Strawson 1952: 175.) And in cases where there is no such entity - that is for sentences with empty NPs, like The king of France is bald as uttered in 1950 - one could not make either a true or a false statement. The question of truth or falsity, in such cases, simply does not arise. (See article 91 (Beaver & Geurts) Presupposition.) We can see that Strawson's position on empty NPs is very much the same as Frege's, although Strawson did not appear to be familiar with Frege's work on the subject.

6. Donnellan's attributive-referential distinction

In 1966 Keith Donnellan challenged Russell's theory of definite descriptions, as well as Strawson's commentary on that theory. He argued that both Russell and Strawson had failed to notice that there are two distinct uses of definite descriptions.

6.1. The basic distinction

When one uses a description in the attributive way in an assertion, one "states something about whoever or whatever is the so-and-so" (Donnellan 1966: 285). This use corresponds pretty well to Russell's theory, and in this case, the description is an essential part of the thought being expressed. The main novelty was Donnellan's claim of a distinct referential use of definite descriptions. Here one "uses the description to enable his audience to pick out whom or what he is talking about and states something about that person or thing" (Donnellan 1966: 285). In this case the description used is simply a device for getting one's addressee to recognize whom or what one is talking about, and
is not an essential part of the utterance. Donnellan used the example in (20) to illustrate his distinction.

(20) Smith's murderer is insane.

For an example of the attributive use, imagine the police detective at a gruesome crime scene, thinking that whoever could have murdered dear old Smith in such a brutal way would have to have been insane. For a referential use, we might imagine that Jones has been charged with the murder and that everybody is pretty sure he is the guilty party. He behaves very strangely during the trial, and an onlooker utters (20) by way of predicating insanity of Jones. The two uses involve different presuppositions: on the attributive use there is a presupposition that the description has a reference (or denotation in Russell's terms), but on the referential use the speaker presupposes more specifically of a particular entity (Jones, in our example) that it is the one meeting the description. Note, though, that a speaker can know who or what a definite description denotes and still use that description attributively. For example I might be well acquainted with the dean of my college, but when I advise my student, who has a grievance against the chair of the department, by asserting (21),

(21) Take this issue to the dean of the college.

I use the phrase the dean of the college attributively. I mean to convey the thought that the dean's office is the one appropriate for the issue, regardless of who happens to be dean at the current time.

6.2. Contentious issues

While it is generally agreed that an attributive-referential distinction exists, there have been several points of dispute. The most crucial one is the status of the distinction - whether it is semantic or pragmatic (something about which Donnellan himself seemed unsure).

6.2.1. A pragmatic analysis

Donnellan had claimed that, on the referential use, a speaker can succeed in referring to an entity which does not meet the description used, and can make a true statement in so doing. The speaker who used (20) referentially to make a claim about Jones, for instance, would have said something true if Jones was indeed insane whether or not he murdered Smith. Kripke (1977) used this aspect to argue that Donnellan's distinction is nothing more than the difference between speaker's reference (the referential use) and semantic reference (the attributive use). He pointed out that similar kinds of misuses or errors can arise with proper names, for which Donnellan's distinction, if viewed as semantic, could not be invoked (see below, section 8). Kripke argued further that since the same kind of attributive-referential difference in use of definite descriptions would arise in a language stipulated to be Russellian - that is, in which the only interpretation for definite descriptions was that proposed by Russell in "On denoting" - the fact that it occurs in English
does not argue that English is not Russellian, and thus does not argue that the distinction is semantic. (See Reimer 1998 for a reply.)

6.2.2. A semantic analysis

On the other hand David Kaplan (1978) (among others, but cf. Salmon 2004) noted the similarity of Donnellan's distinction to the de dicto/de re ambiguity which occurs in propositional attitude sentences. Kaplan suggested an analysis on which referential uses are involved in the expression of Russellian singular propositions. Suppose Jones is Smith's murderer; then the referential use of (20) expresses the singular proposition consisting of Jones himself plus the property of being insane. (In suggesting this analysis Kaplan was rejecting Donnellan's claim about the possibility of making true statements about misdescribed entities. Others have also adopted this revised view of the referential use, e.g. Wettstein 1983, Reimer 1998.) Kaplan's analysis of the referential understanding is similar to the analysis of the complement of a propositional attitude verb when it is interpreted de re, while the ordinary Russellian analysis seems to match the analysis of the complement interpreted de dicto. However the two understandings of a sentence like (20), without a propositional attitude predicate, always have the same truth value while, as we have seen, in the context of a sentence about someone's propositional attitude, the two interpretations can result in a difference in truth value. In explaining his analysis, Kaplan likened the referential use of definite descriptions to demonstrative NPs. This brings us back to the topic of indexicality.

7. Kaplan's theory of indexicality

A major contribution to our understanding of reference and indexicality came with Kaplan's (1989) classic, but mistitled, article "Demonstratives". The title should have been "Indexicals". Acknowledging the error, Kaplan distinguished pure indexicals like I and tomorrow, which do not require an accompanying indication of the intended reference, from demonstrative indexicals, e.g. this, that book, whose uses do require such an indication (or demonstration, as Kaplan dubbed it). The paper itself was equally concerned with both subcategories.

7.1. Content vs. character

The most important contribution of Kaplan's paper was his distinction between two elements of meaning - the content of an expression and its character.

7.1.1. Content

The content of the utterance of a sentence is the proposition it expresses. Assuming compositionality, this content is determined by the contents of the expressions which go to make up the uttered sentence. The existence of indexicals means that the content of an utterance is not determined simply by the expressions in it, but also by the context of utterance. Thus different utterances of, e.g., (22)
(22) I like that.

will determine different propositions depending on who is speaking, the time they are speaking, and what they are pointing at or otherwise indicating, and would vary depending on these parameters. In each case the proposition involved would be, on Kaplan's view (following Russell), a singular proposition.

7.1.2. Character

Although on Kaplan's view the contribution of indexicals to propositional content is limited to their reference, they do have a meaning: I, for example, has a meaning involving the concept of being the speaker. These latter types of linguistically encoded meaning are what Kaplan referred to as "character". In general, the character of an expression is a function which, given a context of utterance, returns the content of that expression in that context. In a way Kaplan is showing that Frege's concept of sense actually needs to be subdivided into these two elements of character and content. (Cf. Kaplan 1989, fn. 26.)

7.2. An application of the distinction

Indexicals, both pure and demonstrative, have a variable character - their character determines different contents in different contexts of utterance. However the content so determined is constant (an actual entity, on Kaplan's view). Using the distinction between character and content, Kaplan is able to explain why (23)

(23) I am here now.

is in a sense necessary, but in another sense contingent. Its character is such that anyone uttering (23) would be making a true statement, but the content determined on any such occasion would be a contingent proposition. For instance if I were to utter (23) now, my utterance would determine the singular proposition containing me; my office; 2:50 pm on January 18, 2007; and the relation of being which relates an entity, a place, and a time. That proposition is true at the actual world, but false in many others.

7.3. The problem of the essential indexical

Perry (1979), following Castañeda (1968), pointed out that indexicality seems to pose a special problem in propositional attitude sentences. Suppose Mary, a ballet dancer, has an accident resulting in amnesia. She sees a film of herself dancing, but does not recognize herself. As a result of seeing the film she comes to believe, de re, of the person in the film (i.e. herself) that that person is a good dancer. Still she lacks the knowledge that it is she herself who is a good dancer. As of now we have no way of representing this missing piece of knowledge. Perry proposed recognizing belief states, in addition to the propositions which are the objects of belief, in order to solve this problem; Mary grasps the proposition, but does not grasp it in the first person way. Lewis (1979) proposed instead viewing belief as attribution to oneself of a property; he termed this "belief de se". Belief concerning a nonindexical proposition would then be self-attribution of the property of
belonging to a possible world where that proposition was true. Mary has the latter kind of belief with respect to the proposition that she is a good dancer, but does not (yet) attribute to herself good dancing capability.

7.4. Other kinds of NP

As we have seen, Strawson pointed out that some definite descriptions which would not ordinarily be thought of as indexical can have an element of indexicality to them. An indexical definite description like the (present) king of France would, on Kaplan's analysis, have both variable character and variable content. As uttered in 1790, for example, it would yield a function whose value in any possible world is whoever is king of France in 1790. That would be Louis XVI in the actual world, but other individuals in other worlds depending on contingent facts about French history. As uttered in 1950 the definite description has no reference in the actual world, but does in other possible worlds (since it is not a necessary fact that France is a republic and not a monarchy in 1950 - the French Revolution might have failed). A nonindexical definite description like the inventor of bifocals has a constant character but variable content. That is, in any context of utterance its content is the function from possible worlds that picks out whoever it is who invented bifocals in that world. For examples of NPs with constant character and constant content, we must turn to the category of proper names.

8. Proper names and Kripke's return to Millian nondescriptionality

It may be said that in asserting that proper names have denotation without connotation, Mill captured our ordinary pre-theoretic intuition. That is, it seems intuitively clear that proper names do not incorporate or express any properties like having taught Alexander the Great or having invented bifocals. On the other hand we can also understand why both Frege and Russell would be driven to the view that, despite this intuition, they do express some kind of property. That is because the alternative would be the direct reference, or Fido-Fido view, and the two kinds of problems that we saw arising for that view arise for proper names as well as definite descriptions. Thus identity sentences of the form a = b are informative with proper names as they are with definite descriptions, as exemplified in (24).

(24) a. Mark Twain is Mark Twain.
     b. Samuel Clemens is Mark Twain.

Intuitively (24b) conveys information over and above that conveyed by (24a). Similarly exchanging coreferential proper names in propositional attitude sentences can seem to change truth value.

(25) a. Mary knows that Mark Twain wrote Tom Sawyer.
     b. Mary knows that Samuel Clemens wrote Tom Sawyer.

We can well imagine someone named Mary for whom (25a) would be true yet for whom (25b) would seem false. Furthermore there are many proper names which are
non-referential, and for which negative existential sentences like (26) would seem true and not meaningless.

(26) Santa Claus does not exist.

If proper names have a sense, or are otherwise equivalent to definite descriptions, then some or all of these problems are solved. Thus it was an important development when Kripke argued for a return to Mill's view on proper names. But before we get to that, we should briefly review a kind of weakened description view of proper names.

8.1. The 'cluster' view

Both Wittgenstein (1953) and Searle (1958) argued for a view of proper names according to which they are associated semantically with a cluster of descriptions - something like a disjunction of properties commonly associated with the bearer of the name. Wittgenstein's example used the name Moses, and he suggested that there is a variety of descriptions, such as "the man who led the Israelites through the wilderness", "the man who as a child was taken out of the Nile by Pharaoh's daughter", which may give meaning to the name or support its use (Wittgenstein 1953, §79). No single description is assumed to give the meaning of the name. However, as Searle noted, on this view it would be necessarily true that Moses had at least one of the properties commonly attributed to him (Searle 1958: 172).

8.2. The return to Mill's view

In January of 1970 Saul Kripke gave an important series of lectures titled "Naming and necessity" which were published in an anthology in 1972 and in 1980 reissued as a book. In these lectures Kripke argued against both the Russell-Frege view of proper names as abbreviated definite descriptions and the Wittgenstein-Searle view of proper names as associated semantically with a cluster of descriptions, and in favor of a return to Mill's nondescriptional view of proper names. Others had come to the same conclusion (e.g. Marcus 1961, Donnellan 1972), but Kripke's defense of the nondescriptional view was the most thorough and influential. The heart of Kripke's argument depends on intuitions about the reference of expressions in alternative possible worlds. These intuitions indicate a clear difference in behavior between proper names and definite descriptions. A definite description like the student of Plato who taught Alexander the Great refers to Aristotle in the actual world, but had circumstances been different - had Xenocrates rather than Aristotle taught Alexander the Great - then the student of Plato who taught Alexander the Great would refer to Xenocrates, and not to Aristotle. Proper names, on the other hand, do not vary their reference from world to world. Kripke dubbed them "rigid designators". Thus sentences like (27) seem true to us.

(27) Aristotle might not have taught Alexander the Great.

Furthermore, Kripke pointed out, a sentence like (27) would be true no matter what contingent property description is substituted for the predicate. In fact something like (28) seems to be true:
(28) Aristotle might have had none of the properties commonly attributed to him.

But the truth of (28) seems inconsistent with both the Frege-Russell definite description view of proper names and the Wittgenstein-Searle cluster view. On the other hand a sentence like (29) seems false.

(29) Aristotle might not have been Aristotle.

This supports Kripke's claim of rigid designation for proper names; since the name Aristotle must designate the same individual in any possible world, there is no possible world in which that individual is not Aristotle. And thus, to put things in Kaplan's terms, proper names have both constant character and constant content.

8.3. Natural kind terms

Although it goes beyond our focus on NPs, it is worth mentioning that Kripke extended his theory of nondescriptionality to at least some common nouns - those naming species of plants or animals, like elm and tiger, as well as those for well-defined naturally occurring substances or phenomena, such as gold and heat, and some adjectives like loud and red. In this Kripke's views differed from Mill's, but were quite similar to those put forward by Putnam (1975). Putnam's most famous thought experiment involved imagining a "twin earth" which is identical to our earth except that the clear, colorless, odorless substance which falls from the sky as rain and fills the lakes and rivers, and which is called water by twin-English speaking twin earthlings, is not H2O but instead a complex compound whose chemical formula Putnam abbreviates XYZ. Putnam argues that although Oscar1 on earth and Oscar2 on twin earth are exactly the same mentally when they think "I would like a glass of water", nevertheless the contents of their thoughts are different. His famous conclusion: "'Meanings' just ain't in the head" (Putnam 1975: 227; see Segal 2000 for an opposing view).

8.4. Summary

Let us take stock of the situation. We saw that the simplest theory of reference, the Fido-Fido or direct reference theory, had problems with accounting for the apparent semantic inequivalence of coreferential NPs - the fact that true identity statements could be informative, and that exchanging coreferential NPs in propositional attitude contexts could even result in a change in truth value. This theory also had a problem with non-referring or empty NPs, a problem which became particularly acute in the case of true negative existence statements. Frege's theory of sense seemed to solve most of these problems, and Russell's analysis of definite descriptions seemed to solve all of them. However, though the theories of Frege and Russell are plausible for definite descriptions, as Kripke made clear they do not seem to work well for proper names, for which the direct reference theory is much more plausible. But the same two groups of problems - those involving co-referential NPs and those involving empty NPs - arise for proper names just as they do for definite descriptions. Of these problems, the one involving substituting coreferential NPs in propositional attitude
contexts has attracted the most attention. (See also article 60 (Swanson) Propositional attitudes.)

9. Propositional attitude contexts

Kripke's arguments for a return to Mill's view of proper names have generally been found to be convincing (although exceptions will be noted below). This appears to leave us with the failure of substitutivity of coreferential names in propositional attitude contexts. However Kripke (1979) argued that the problem was not actually one of substitutivity, but a more fundamental problem in the attribution of propositional attitudes.

9.1. The Pierre and Peter puzzles

Kripke's initial example involved a Frenchman, Pierre, who when young came to believe on the basis of postcards and other indirect evidence that London was a beautiful city. He would sincerely assert (30) whenever asked.

(30) Londres est jolie.

Eventually, however, he was kidnapped and transported to a very bad section of London, and learned English by the direct method. His circumstances did not allow him to explore the city (which he did not associate with the city he knew as Londres), and thus, based on his part of town, he would assert (31).

(31) London is not pretty.

The question Kripke presses us to answer is that posed in (32):

(32) Does Pierre, or does he not, believe that London is pretty?

An alternative, monolingual, version of the puzzle involves Peter, who has heard of Paderewski the pianist, and Paderewski the Polish statesman, but who does not know that they were the same person and who is furthermore inclined to believe that anyone musically inclined would never go into politics. The question is (33):

(33) Does Peter, or does he not, believe that Paderewski had musical talent?

Kripke seems to indicate that these questions do not have answers: "...our normal practices of interpretation and attribution of belief are subjected to the greatest possible strain, perhaps to the point of breakdown. So is the notion of the content of someone's assertion, the proposition it expresses" (Kripke 1979: 269; italics in original). Others, however, have not been deterred from answering Kripke's questions in (32) and (33).

9.2. Proposed solutions

Many solutions to the problem of propositional attitude attribution have been proposed. We will look here at several of the more common kinds of approaches.
9.2.1. Metalinguistic approaches

Metalinguistic approaches to the problem involve linguistic expressions as components of belief in one way or another. Quine (1956) had suggested the possibility of viewing propositional attitudes as relations to sentences rather than propositions. This would solve the problem of Pierre, but would seem to leave Peter's problem, given that we have a single name Paderewski in English (but see Fiengo & May 1998). Others (Bach 1987, Katz 2001) have put forward metalinguistic theories of proper names, rejecting Kripke's arguments for their nondescriptionality. The idea here is that a name N means something like "the bearer of N". This again would seem to solve the Pierre puzzle (Pierre believes that the bearer of Londres is pretty, but not the bearer of London), but not Peter's Paderewski problem. Bach argues that (33) would need contextual supplementation to be a complete question about Peter's beliefs (see Bach 1987: 165ff).

9.2.2. Hidden indexical theories

The remaining two groups of theories are consistent with Kripke's nondescriptional analysis of proper names. Hidden indexical theories involve postulating an unmentioned (or hidden) element in belief attributions, which is "a mode of presentation" of the proposition, belief in which is being attributed. (Cf. Schiffer 1992, Crimmins & Perry 1989.) Thus belief is viewed as a three-place relation, involving a believer, a proposition believed, and a mode of presentation of that proposition. Furthermore these modes of presentation are like indexicals in that different ones may be invoked in different contexts of utterance. The answer to (32) or (33) could be either Yes or No, depending upon which kind of mode of presentation was understood. The approach of Richard (1990) is similar, except that the third element is intended as a translation of a mental representation of the subject of the propositional attitude verb.

9.2.3. Pragmatic theories

Our third kind of approach is similar to the hidden indexical theories in recognizing modes of presentation. However the verb believe (like other propositional attitude verbs) is seen as expressing a two-place relation between a believer and a proposition, and no particular mode of presentation is entailed. Instead, this relation is defined in such a way as to entail only that there is at least one mode of presentation under which the proposition in question is believed. (Cf. Salmon 1986.) This kind of theory would answer either (32) or (33) with a simple Yes, since there is at least one mode of presentation under which Pierre believes that London is pretty, and at least one under which Peter believes that Paderewski had musical talent. A pragmatic explanation is offered for our tendency to answer No to (32) on the basis of Pierre's English assertion London is not pretty.

10. Indefinite descriptions

We turn now to indefinite descriptions - NPs which in English begin with the indefinite article a/an.
10.1. Indefinite descriptions are not referring expressions

As we saw above, Russell did not view definite descriptions as referring expressions, so it will come as no surprise that he was even more emphatic about indefinite descriptions. He had several arguments for this view (cf. Russell 1919: 167ff). Consider his example in (34).

(34) I met a man.

Suppose that the speaker of (34) had met Mr. Jones, and that that meeting constituted her grounds for uttering (34). In that case, were a man referential, it would have to refer to Mr. Jones. Nevertheless someone who did not know Jones at all could easily have a full understanding of (34). And were the speaker of (34) to add (35) to her utterance

(35) ...but it wasn't Jones.

she would not be contradicting herself (though of course she would be lying, under the circumstances). On the other hand if it should turn out that the speaker of (34) did not meet Jones after all, but did meet some other man, it would be very hard to regard (34) as false. Russell's arguments have been reiterated and augmented by Ludlow & Neale (1991).

10.2. Indefinite descriptions are referring expressions

Since Russell's time others (e.g. Strawson 1952) have argued that indefinite descriptions do indeed have referring uses. The clearest kinds of cases are ones in which chains of reference occur, as in (36) (from Chastain 1975: 202).

(36) A man was sitting underneath a tree eating peanuts. A squirrel came by, and the man fed it some peanuts.

Both a man and a squirrel in (36) seem to be coreferential with subsequent expressions that many people would consider to be referring - if not the man, then at least it. Chastain argues that there is no reason to deny referentiality to the indefinite NPs which initiate such chains of reference, and that indeed, that is where the subsequent expressions acquired their referents. It should be noted, though, that if serving as antecedent for one or more pronouns is considered adequate evidence for referentiality, then overtly quantificational NPs should also be considered referential, as shown in (37).

(37) a. Everybody who came to my party had a good time. They all thanked me afterward.
     b. Most people don't like apples. They only eat them for their health.

10.3. Parallels between indefinite and definite descriptions

Another relevant consideration is the fact that indefinite descriptions seem to parallel definite descriptions in several ways. They show an ambiguity similar to the de dicto-de re ambiguity in propositional attitude contexts, as shown in (38).
(38) Mary wants to interview a diplomat.

(38) could mean either that there is a particular diplomat whom Mary is planning to interview (where a diplomat has wide scope corresponding to the de re reading for definite descriptions), or that she wants to interview some diplomat or other - say to boost the prestige of her newspaper. (This reading, where a diplomat has narrow scope with respect to the verb wants, corresponds to the de dicto reading of definite descriptions.) Neither of these readings entails the other - either could be true while the other is false. Furthermore indefinite descriptions participate in a duality of usage, the specific-nonspecific ambiguity (see article 42 (von Heusinger) Specificity), which is very similar to Donnellan's referential-attributive ambiguity for definite descriptions. Thus while the indefinites in Chastain's example above are most naturally taken specifically, the indefinite in (39) must be taken nonspecifically unless something further is added (since otherwise the request would be infelicitous).

(39) Please hand me a pencil.

(See also Fodor & Sag 1982.) In casual speech specific uses of indefinite descriptions can be unambiguously paraphrased using non-demonstrative this, as in (40).

(40) This man was sitting underneath a tree eating peanuts.

However non-demonstrative this cannot be substituted for the non-specific a in (39) without causing anomaly. So if at least some occurrences of definite descriptions are viewed as referential, these parallels provide an argument that the corresponding occurrences of indefinite descriptions should also be so viewed. (Devitt 2004 argues in favor of this conclusion.)

10.4. Discourse semantics

More recently approaches to semantics have been developed which provide an interpretation for sentences in succession, or discourses. Initially developed independently by Heim (1982) and Kamp (1981), these approaches treat both definite and indefinite descriptions as similar in some ways to quantificational terms and in some ways to referring expressions such as proper names. Indefinite descriptions introduce new discourse entities, and subsequent references, whether achieved with pronouns or with definite descriptions, add information about those entities. (See article 37 (Kamp & Reyle) Discourse Representation Theory, article 38 (Dekker) Dynamic semantics.)

10.5. Another puzzle about belief

We noted above that indefinite descriptions participate in scope ambiguities in propositional attitude contexts. The example below in (41) was introduced by Geach (1967), who argued that it raises a new problem of interpretation.

(41) Hob thinks a witch has blighted Bob's mare, and Nob wonders whether she (the same witch) killed Cob's sow.
Neither the ordinary wide scope nor the narrow scope interpretation is correct for (41). The wide scope interpretation (there is a witch such that ...) would entail the existence of a witch, which does not seem to be required for the truth of (41). On the other hand, the narrow scope interpretation (Hob thinks that there is a witch such that ...) would fail to capture the identity between Hob's witch and Nob's witch. This problem, which Geach referred to as one of "intentional identity", like many others, has remained unsolved.

11. Summary

As we have seen, opinions concerning referentiality vary widely, from Russell's position, on which almost no NPs are referential, to a view on which almost any NP has at least some referential uses. The differences may seem inconsequential, but they are central to issues surrounding the relations among language, thought, and communication – issues such as the extent to which we can represent the propositional attitudes of others in our speech, and even the extent to which our own thoughts are encoded in the sentences we utter, as opposed to being inferred from hints provided by our utterances.

12. References

Bach, Kent 1987. Thought and Reference. Oxford: Oxford University Press.
Carnap, Rudolf 1956. Meaning and Necessity: A Study in Semantics and Modal Logic. 2nd edn. Chicago, IL: The University of Chicago Press.
Castañeda, Hector-Neri 1968. On the logic of attributions of self-knowledge to others. Journal of Philosophy 65, 439–456.
Chastain, Charles 1975. Reference and context. In: K. Gunderson (ed.). Minnesota Studies in the Philosophy of Science, vol. 7: Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press, 194–269.
Crimmins, Mark & John Perry 1989. The prince and the phone booth. Journal of Philosophy 86, 685–711.
Davidson, Donald 1968. On saying that. Synthese 19, 130–146. Reprinted in: D. Davidson. Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984, 93–108.
Devitt, Michael 2004. The case for referential descriptions. In: M. Reimer & A. Bezuidenhout (eds.). Descriptions and Beyond. Oxford: Clarendon Press, 280–305.
Donnellan, Keith S. 1966. Reference and definite descriptions. Philosophical Review 75, 281–304.
Donnellan, Keith S. 1972. Proper names and identifying descriptions. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 356–379.
Fiengo, Robert & Robert May 1998. Names and expressions. Journal of Philosophy 95, 377–409.
Fodor, Janet D. & Ivan Sag 1982. Referential and quantificational indefinites. Linguistics & Philosophy 5, 355–398.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25–50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56–78.
Geach, Peter T. 1967. Intentional identity. Journal of Philosophy 64, 627–632.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T.M.V. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277–322. Reprinted in: J. Groenendijk, T.M.V. Janssen & M. Stokhof (eds.). Truth, Interpretation and Information: Selected Papers from the Third Amsterdam Colloquium. Dordrecht: Foris, 1984, 1–41.
Kaplan, David 1978. Dthat. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 221–243.
Kaplan, David 1989. Demonstratives: An essay on the semantics, logic, metaphysics, and epistemology of demonstratives and other indexicals. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481–563.
Katz, Jerrold J. 2001. The end of Millianism: Multiple bearers, improper names, and compositional meaning. Journal of Philosophy 98, 137–166.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253–355 and 763–769. Reissued separately with preface: Cambridge, MA: Harvard University Press, 1980.
Kripke, Saul 1977. Speaker's reference and semantic reference. In: P. A. French, T. E. Uehling, Jr. & H. Wettstein (eds.). Midwest Studies in Philosophy, vol. II: Studies in the Philosophy of Language. Morris, MN: University of Minnesota, 255–276.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 139–183.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169–218.
Lewis, David 1979. Attitudes de dicto and de se. Philosophical Review 88, 513–543.
Ludlow, Peter & Stephen Neale 1991. Indefinite descriptions. Linguistics & Philosophy 14, 171–202.
Marcus, Ruth Barcan 1961. Modalities and intensional languages. Synthese 13, 303–322.
Mill, John Stuart 1843. A System of Logic, Ratiocinative and Inductive, Being a Connected View of the Principles of Evidence, and the Methods of Scientific Investigation. London: John W. Parker.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language: Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221–242. Reprinted in: R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247–270.
Neale, Stephen 1990. Descriptions. Cambridge, MA: The MIT Press.
Perry, John 1979. The problem of the essential indexical. Noûs 13, 3–21.
Putnam, Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind and Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam. Philosophical Papers, vol. 2: Mind, Language and Reality. Cambridge: Cambridge University Press, 1975, 215–271.
Quine, Willard van Orman 1956. Quantifiers and propositional attitudes. Journal of Philosophy 53, 177–187.
Reimer, Marga 1998. Donnellan's distinction/Kripke's test. Analysis 58, 89–100.
Richard, Mark 1990. Propositional Attitudes. Cambridge: Cambridge University Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479–493.
Russell, Bertrand 1917. Knowledge by acquaintance and knowledge by description. Reprinted in: B. Russell. Mysticism and Logic. Paperback edn. Garden City, NY: Doubleday, 1957, 202–224.
Russell, Bertrand 1919. Introduction to Mathematical Philosophy. London: Allen & Unwin. Reissued n.d., New York: Touchstone Books.
Russell, Bertrand 1957. Mr. Strawson on referring. Mind 66, 385–389.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA: The MIT Press.
Salmon, Nathan 2004. The good, the bad, and the ugly. In: M. Reimer & A. Bezuidenhout (eds.). Descriptions and Beyond. Oxford: Clarendon Press, 230–260.
Schiffer, Stephen 1992. Belief ascription. Journal of Philosophy 89, 499–521.
Searle, John R. 1958. Proper names. Mind 67, 166–173.
Segal, Gabriel M. 2000. A Slim Book about Narrow Content. Cambridge, MA: The MIT Press.
Stalnaker, Robert C. 1984. Inquiry. Cambridge, MA: The MIT Press.
Stalnaker, Robert C. 1999. Context and Content. Oxford: Oxford University Press.
Strawson, Peter F. 1950. On referring. Mind 59, 320–344.
Strawson, Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Wettstein, Howard K. 1983. The semantic significance of the referential-attributive distinction. Philosophical Studies 44, 187–196.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated into English by G.E.M. Anscombe. Oxford: Blackwell.

Barbara Abbott, Lake Leelanau, MI (USA)

5. Meaning in language use

1. Overview
2. Deixis
3. The Gricean revolution: The relation of utterance meaning to sentence meaning
4. Implications for word meaning
5. The nature of context and the relationship of pragmatics to semantics
6. Summary
7. References

Abstract

In a speech community, meaning attaches to linguistic forms through the ways in which speakers use those forms, intending and expecting to communicate with their interlocutors. Grice's (1957) insight was that conventional linguistic meaning amounts to the expectation by members of a speech community that hearers will recognize speakers' intentions in saying what they say the way they say it. This enabled him to sketch how this insight related to lexical meaning and presupposition, and (in more detail) to implied meaning. The first substantive section of this article briefly recapitulates the work of Bar-Hillel (1954) on indexicals, leading to the conclusion that even definite descriptions have an indexical component. Section 3 describes Grice's account of the relation of intention to intensions and takes up the notion of illocutionary force. Section 4 explores the implications of the meaning-use relationship for the determination of word meanings. Section 5 touches briefly on the consequences of the centrality of communication for the nature of context and the relation of context and pragmatic considerations to formal semantic accounts.

1. Overview

At the beginning of the 20th century a tension existed between those who took languages to be representable as formal systems with complete truth-conditional semantics (Carnap, Russell, Frege, Tarski) – and who viewed natural language as rather defective in that regard – and those who took the contextual use of language to be determinative of the meanings of its forms (Austin, Strawson, the later Wittgenstein). (See also article 3 (Textor) Sense and reference.) For a very readable account of this intellectual battleground, see Recanati (2004).
The debate has been largely resolved with acceptance (not always explicitly recognized) of the views of Bar-Hillel and Grice on the character of the human use of language. In a speech community, meaning attaches to linguistic forms through the ways in which speakers use those forms, intending and expecting to communicate with their interlocutors. This is immediately obvious in the case of the reference of indexical terms like I and you, and the determination of the illocutionary force of utterances. It extends also to matters of reference generally, implicature, coherence of texts and discourse understanding, as well as to the role of linguistic forms in maintaining social relations (politeness). Grice's (1957) insight that conventional linguistic meaning amounts to the expectation by members of a speech community that hearers will recognize speakers' intentions in saying what they say the way they say it enabled him to sketch how this related to lexical meaning and presupposition, and (in more detail) implied meaning. While the view presented here has its origins in Grice's insight, it is considerably informed by elaborations and extensions of it over the past 40 years.

2. Deixis

Bar-Hillel (1954) demonstrated that it is not linguistic forms that carry pragmatic information, but the facts of their utterance, and this notion was elaborated in Stalnaker (1972). Bar-Hillel claimed that indexicality is an inherent and unavoidable aspect of natural language, speculating that more than 90% of the declarative sentences humans utter have use-dependent meanings in that they involve implicit references to the speaker, the addressee and/or the speech time. The interpretation of first and second person pronouns, tenses, and deictic adverbials is only the tip of the iceberg. A whole host of other relational terms (not to mention deictic and anaphoric third-person references, and even illocutionary intentions) require an understanding of the speaker's frame of reference for interpretation. To demonstrate, it is an elementary observation that utterances of sentences like those in (1) require an indication of when and where the sentence was uttered to be understood enough to judge whether they are true or false.

(1) a. I am hungry.
    b. It's raining.

In fact, strictly speaking, the speaker's intention is just as important as the location in space-time of the speech act. Thus, (1b) can be intended to refer to the location of the speaker at speech-time, or to some location (like the location of a sports event being viewed on television) that the speaker believes to be salient in the mind of the addressee, and will be evaluated as true or false accordingly. Of course, the reference of past and future tenses is also a function of the context in which they are uttered, referring to times that are a function of the time of utterance. Thus, if I utter (2a) at t0, I mean that I am hungry at t0; if I utter (2b) at t0, I mean that I was hungry at some point before t0; and if I say (2c) at t0, I mean that I will be hungry at some point after t0.

(2) a. I am hungry.
    b. I was hungry.
    c. I will be hungry.
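The time-dependence can be made explicit in a simple temporal notation (a rough sketch in my own notation, not Bar-Hillel's formalism), with t0 the time of utterance:

(i)   for (2a): hungry(speaker, t0)
(ii)  for (2b): ∃t[t < t0 ∧ hungry(speaker, t)]
(iii) for (2c): ∃t[t0 < t ∧ hungry(speaker, t)]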
Although (2a) indicates a unique time, (2b) and (2c) refer very vaguely to some time before or after t0; nothing in the utterance specifies whether it is on the order of minutes, days, weeks, years, decades, or millennia distant. Furthermore, it is not clear whether the time indicated by (2a) is a moment or an interval of indefinite duration which includes the moment of utterance. See McCawley (1971), Dowty (1979), Partee (1973), and Hinrichs (1986) for discussion of some issues and ramifications of the alternatives.

Although Bar-Hillel never refers to the notion of intention, he seems to have recognized in discussing the deictic this that indexicals are multiply indeterminate, despite being linked to the context of utterance: "'This' is used to call attention to something in the centre of the field of vision of its producer, but, of course, also to something in his spatial neighborhood, even if not in his centre of vision or not in his field of vision at all, or to some thing or some event or some situation, etc., mentioned by himself or by somebody else in utterances preceding his utterance, and in many more ways" (Bar-Hillel 1954: 373). The indexical this can thus be intended (a) deictically, to refer to something gesturally indicated; (b) anaphorically, to refer to a just completed bit of discourse; (c) cataphorically, to refer to a bit of discourse that will follow directly; or (d) figuratively, to refer to something evoked by whatever is indicated and deictically referred to (e.g., to evoke the content of a book by indicating an image of its dust jacket), as in (3), where this refers not to a part of a photograph that is gesturally indicated, or to the dust jacket on which it appears, but to the text of the book the dust jacket was designed to protect.

(3) Oh, I've read this!

Similarly, there is an indexical component to the interpretation of connectives and relational adverbials, in that their contribution to the meaning of an utterance involves determining what bit of preceding discourse they are intended to connect whatever follows them to, as illustrated in (4).

(4) a. Therefore, Socrates is mortal.
    b. For that reason, we oppose this legislation.

In addition to such primary indexicals (linguistic elements whose interpretation is bound to the circumstances of their utterance), there are several classes of expressions whose interpretation is directly bound to the interpretation of such primary indexicals. Partee (1989) discussed covert pronominals, for example local, as in (5).

(5) Dan Rather went to a local bar.

Is the bar local relative to Rather's characteristic location? To his location at speech-time? To the location of the speaker at speech time? To the characteristic location of the speaker? In addition, as has long been acknowledged in references to "the universe of discourse," interpretation of the definite article invokes indexical reference; the uniqueness presupposition associated with use of the definite article amounts to a belief by the speaker that the intended referent of the definite NP is salient to the addressee.
Indeed, insofar as the interpretation of ordinary kind names varies with the expectations about the universe of discourse that the speaker imputes to the addressee (Nunberg 1978), this larger class of indexicals encompasses the immense class of descriptive terms, including cat, mat, window, hamburger, red, and the like, as in (6).

(6) a. The blond hamburger spilled her coffee.
    b. The cat shattered next to the mat.

(Some of the conclusions of Nunberg (1978) are summarized in Nunberg (1979), but the latter work does not give an adequate representation of Nunberg's explanation of how the use of words by speakers in contexts facilitates reference. For example, Nunberg (1979) does not contain accounts of either the unextractability of interpretation from context (and the non-existence of null contexts), or the notion of a system of normal beliefs in a speech community. Yet both notions are central to understanding Nunberg's arguments for his conclusions about polysemy and interpretation.)

The important point here is that with these secondary indexicals (and in fact, even with the so-called primary indexicals, Nunberg 1993), interpretation is not a simple matter of observing properties of the speech situation (source of the speech sound, calendric time, etc.), but involves making judgements about possible mappings from signal to referent (essentially the same inferential processes as are involved in disambiguation – cf. Green 1995). Bar-Hillel (1954) and Morgan (1978) pointed out that the interpretation of demonstratives involves choosing from among many (perhaps indefinitely many) objects the speaker may have been pointing at, as well as what distinct class or individual the speaker may have intended to refer to by indicating that object (as pointed out by Nunberg 1978). Thus, the interpretation of indexicals involves choices from the elements of a set (possibly an infinite set) that is limited differently in each speech situation. For further discussion, see also article 90 (Diessel) Deixis and demonstratives.

Bar-Hillel's observations on the nature of indexicals form the background for the development of context-dependent theories of semantics by Stalnaker (1970), Kamp (1981), Heim (1982) and others. See also articles 37 (Kamp & Reyle) Discourse Representation Theory and 38 (Dekker) Dynamic semantics. Starting from a very different set of considerations, Grice (1957) outlined how a speaker's act of using a linguistic form to communicate so-called literal meaning makes critical reference to speaker and hearer, with far-reaching consequences. Section 3 takes up this account in detail.

3. The Gricean revolution: The relation of utterance meaning to sentence meaning

3.1. Meaning according to Grice

Grice (1957) explored the fact that we use the same word (mean) for what an event entails, what a speaker intends to communicate by uttering something, and what a linguistic expression denotes. How these three kinds of meaning are related was illuminated by his later (and regrettably widely misunderstood) work on implicature (see Neale 1992 for discussion). Grice (1957) defined natural meaning (meaningN) as natural entailment: the subject of mean refers to an event or state, as in (7).

(7) Red spots on your body means you have measles.
He reserved the notion non-natural meaning (meaningNN) for signification by convention: the subject refers to an agent or a linguistic instrument, as in (8).

(8) a. By "my better half", John meant his wife.
    b. In thieves' cant, "trouble and strife" means 'wife'.

Thus, Grice treated the linguistic meaning of expressions as strictly conventional. He declined to use the word sign for it, as he equated sign with symptom, which characterizes natural meaning. He showed that non-natural meaning cannot be reduced via a behaviorist causal strategy to 'has a tendency to produce in an audience a certain attitude, dependent on conditioning', as that would not distinguish non-natural meaning from natural meaning, and in addition would fail to distinguish connotation from denotation. He argued that the causal account can characterize so-called "standard" meanings, but not meaning on an occasion of use, although meaning on an occasion of use is just what should explain standard meaning, especially on a causal account.

Grice also rejected the notion that X meantNN Y is equivalent to 'the speaker S intended the expression X to make the addressee H believe Y', because that would include, in addition to the conventional meaning of a linguistic expression, an agent's utterly nonconventional manipulation of states and events. Finally, he ruled out interpretations of X meantNN Y as 'the speaker intended X to make H believe Y, and to recognize S's intention', because it still includes in the same class as conventional linguistic meaning any intentional, nonconventional acts whose intention is intended to be recognized – such as presenting the head of St. John the Baptist on a platter.

Grice supported instead a theory where A meantNN something by X entails that (a) an agent intended the utterance of X to induce a belief or intention in H, and (b) the agent intended X to be recognized as intended to induce that belief/intention. This formulation implies that A does not believe (a) will succeed without recognition of A's intention. Grice's argument begins by showing that in directive cases such as getting someone to leave, or getting a driver to stop a car (examples which do not crucially involve language), the intended effect must be the sort of thing that is within the control of the addressee. Thus, X meansNN something amounts to 'people intend X to induce some belief/intention P by virtue of H taking his recognition of the intention to produce effect P as a reason to believe/intend P'. Grice pointed out that only primary intentions need to be recognized: the utterer/agent is held to intend what is normally conveyed by the utterance or a consequence of the act, and ambiguous cases are resolved with evidence from the context that bears on identifying a plausible intention.
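Pulling together the conditions just described, the resulting definition can be set out schematically (the arrangement, though not the content, is mine):

S meantNN p by uttering U iff S uttered U intending
(i)   that H come to believe (or intend) p;
(ii)  that H recognize intention (i); and
(iii) that H's recognition of intention (i) function as at least part of H's reason for believing (intending) p.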
Grice's (1975) account of communicated meaning is a natural extension of his (1957) account of meaning in general (cf. also Green 1990, Neale 1992), in that it characterizes meaning as inherently intentional: recognizing an agent's intention is essential to recognizing what act she is performing (i.e., what she meantNN by her act). Grice's reference to the accepted purpose or direction of the talk exchange (Grice 1975: 45) in the characterization of the Cooperative Principle implies that speaker and hearer are constantly involved (usually not consciously) in interpreting what each other's goals must be in saying what they say.

Disambiguating structurally or lexically ambiguous expressions like old men and women, or ear (i.e., of corn, or to hear with), inferring what referent a speaker intends to be picked out from her use of a definite noun phrase like the coffee place, and inferring what a speaker meant to implicate by an utterance that might seem unnecessary or irrelevant all depend equally on the assumptions that the speaker did intend something to be conveyed by her utterance that was sufficiently specific for the goal of the utterance, that she intended the addressee to recognize this intention, and, by means of recognizing the intention, to recognize what the speaker intended to be conveyed.
In semantics, as intended, the idea that the act of saying something communicates more than just what is said allowed researchers to distinguish constant, truth-conditional meanings that are associated by arbitrary convention with linguistic forms from aspects of understanding that are a function of a meaning being conveyed in a particular context (by whatever means). This in turn enabled syntacticians to abandon the hopeless quest for hidden structures whose analysis would predict non-truth-conditional meanings, and to concentrate on articulating syntactic theories that were compatible with theories of compositional semantics, articulating the details of the relation between form and conventional, truth-conditional meaning. Finally, it inspired a prodigious amount of research in language behavior (e.g., studies of rhetoric and politeness), where it has, unfortunately, been widely misconstrued.

The domain of the principles described in Grice (1975) is actually much broader than usually understood: all intentional use of language, whether literal or not, and regardless of purpose. That is, Grice intended a broad rather than a narrow interpretation of the term conversation. Grice's view of the overarching Cooperative Principle – "Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged" (Grice 1975: 45) – is in fact that it is just the linguistic reflex of a more general principle which governs, in fact defines, rational behavior: behavior intended to accomplish or enable the achievement of some purpose or goal. Insofar as these are notions universally attributable to human beings, such a principle should be universally applicable with regard to language use. Supposed counterexamples have not held up; for discussion, see Keenan (1976), Prince (1983), Green (1990). Thus, the Cooperative Principle will figure in the interpretation of language generally, not just clever talk, and in fact will figure in the interpretation of behavior generally, whether communicative or not. Additional facets of this perspective are discussed in articles 91 (Beaver & Geurts) Presupposition and 92 (Simons) Implicature.

3.2. The Cooperative Principle

Since Grice was explicit about the Cooperative Principle not being restricted to linguistic acts, and because the imperative formulation has led to so much misunderstanding, it is useful to rephrase it more generally, and declaratively, where it amounts to this: Individuals act in accordance with their goals. Grice described four categories (Quantity, Quality, Relevance and Manner) of special cases of this principle, that is, applications of it to particular kinds of requirements, and gave examples of their application in both linguistic and non-linguistic domains. It is instructive to translate these into general, declarative formulations as well:

An agent will do as much as is required for the achievement of the current goal. (QUANTITY I)
An agent will not do more than is required. (QUANTITY II)

Agents will not deceive co-agents. (QUALITY) Consequently, an agent will try to make any assertion one that is true.
(I) An agent will not say what she believes to be false.
(II) An agent will not say that for which she lacks adequate evidence.

An agent's action will be relevant to and relative to an intention of the agent. (RELATION)

An agent will make her actions perspicuous to others who share a joint intention. (MANNER)
(I) Agents will not disguise actions from co-agents. Consequently, agents will not speak obscurely in attempting to communicate.
(II) Agents will act so that intentions they intend to communicate are unambiguously reconstructible.
(III) Agents will spend no more energy on actions than is necessary.
(IV) Agents will execute sub-parts of a plan in an order that will maximize the perceived likelihood of achieving the goal.

The status of the maxims as just special cases of the Cooperative Principle implies that they are not (contra Lycan 1984: 75) logical consequences (corollaries), because they do not follow as necessary consequences in all possible worlds. Nor are they additional stipulations, or an exhaustive list of special cases. Likewise, the maxims do not constitute the Cooperative Principle, as some writers have thought (e.g., Sperber & Wilson 1986: 36). On the contrary, the Cooperative Principle is a very general principle which determines, depending on the values shared by participants, any number of maxims instantiating ways of conforming to it. The maxims are not rules or norms that are taught or learned, as some writers would have it (e.g., Pratt 1981: 11, Brown & Yule 1983: 32, Blum-Kulka & Olshtain 1986: 175, Allwood, Anderson & Dahl 1977: 37, Ruhl 1989: 96, Sperber & Wilson 1986: 162); see Green (1990) for discussion. Rather, they are just particular ways of acting in accordance with one's goals; all other things being equal, conforming to the Cooperative Principle involves conforming to all of them. When you can't conform to all of them, as Grice discusses, you do the best you can.

The premise of the Cooperative Principle, that individuals act in accordance with their goals, is what allows Grice to refer to the Cooperative Principle as a definition of what it means to be rational (Grice 1975: 45, 47, 48–49), and to the maxims as principles that willy-nilly govern interpersonal human behavior. If X's goal is to get Y to do some act A, or believe some proposition P, it follows that X will want to speak in such a way that A or P is clearly identifiable (Maxims of Quality, Quantity, Manner), and X will not say things that will distract Y from getting the point (Maxim of Relevance). Furthermore, most likely, X will not want to antagonize Y (everybody's Maxim of Politeness). Cohen & Levesque (1991) suggest that their analysis of joint intention enables one to understand "the social contract implicit in engaging in a dialogue in terms of the conversants' jointly intending to make themselves understood, and to understand the other" (Cohen & Levesque 1991: 509), observing that this would predict the back-channel comprehension checks that pervade dialogue, as "means to attain the states of mutual belief that discharge this joint intention of understanding" (Cohen & Levesque 1991: 509).
The characterization of the Cooperative Principle as the assumption that speakers act rationally (i.e., in accordance with their goals) makes a variety of predictions about the interpretation of behavior. First of all, it predicts that people will try to interpret weird, surprising, or unanticipated behavior as serving some unexpected goal before they discount it as irrational. The tenacity with which we assume that speakers observe the Cooperative Principle, and in particular the maxim of relevance, was illustrated in Green (1990). That is, speakers assume that other speakers do what they do, say what they say, on purpose, intentionally, and for a reason (cf. Brown & Levinson 1978: 63). In other words, they assume that speech behavior, and indeed all behavior that isn't involuntary, is goal-directed. Speakers "know" the maxims as strategies and tactics for efficiently achieving goals, especially through speech. A person's behavior will be interpreted as conforming to the maxims, even when it appears not to, because of the assumption of rationality (goal-directedness). Hearers will make inferences about the world or the speaker, or both, whenever novel (that is, previously unassumed) propositions have to be introduced into the context to make the assumption of goal-directedness and the assumption of knowledge of the strategies consistent with the behavior. Implicatures arise when the hearer additionally infers that the speaker intended those inferences to be made, and intended that intention to be recognized (cf. article 92 (Simons) Implicature). If no such propositions can be imagined, the speaker will be judged irrational, but irrationality will consist in believing the unbelievable, or believing something unfathomable, not in having imperfect knowledge of the maxims, or in not observing the maxims. Only our imagination limits the goals, and beliefs about the addressee's goals, that we might attribute to the speaker. If we reject the assumption that the speaker is irrational, then we must at the very least assume that there is some goal to which his utterance is relevant in the given context, even if we can't imagine what it is.

Another illustration of the persistence of the assumption that utterances are acts executed in accordance with a plan to achieve a goal: even if we know that a sequence of sentences was produced by a computer, making choices entirely at random within the constraints of some grammar provided to it, if the sentences can be construed as connected and produced in the service of a single goal, it is hard not to understand them that way. That is why output like (9) from random sentence generators frequently produces giggles.

(9) a. Sandy called the dog.
    b. Sandy touched the dog.
    c. Sandy wanted the dog.
    d. The dog arrived.
    e. The dog asked for Kim.
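To make concrete what such a generator amounts to, here is a minimal sketch in Python (the grammar, vocabulary, and code are invented for illustration; they are not from Green or from any cited system):

```python
import random

# A toy context-free grammar of the sort described above; expansion
# choices are made entirely at random, with no goal or discourse plan.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Sandy"], ["Kim"], ["the dog"]],
    "VP": [["called", "NP"], ["touched", "NP"], ["wanted", "NP"],
           ["arrived"], ["asked for", "NP"]],
}

def generate(symbol="S"):
    """Expand a symbol by choosing one of its rules at random."""
    if symbol not in GRAMMAR:          # terminal: an actual word or phrase
        return [symbol]
    words = []
    for part in random.choice(GRAMMAR[symbol]):
        words.extend(generate(part))
    return words

# Five random "sentences": unconnected by design, yet a reader will be
# tempted to construe them as a connected, goal-directed discourse.
for _ in range(5):
    print(" ".join(generate()) + ".")
```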
One further point requires discussion here. Researchers eager to challenge or to apply a Gricean perspective have often failed to appreciate how important it is that discourse is considered purposive behavior. Grice presumes that participants have goals in participating (apparently since otherwise they wouldn't be participating). This is the gist of his remark that "each participant recognizes in [talk exchanges], to some extent, a common purpose or set of purposes, or at least a mutually accepted direction" (Grice 1975: 45).

This is perhaps the most misunderstood passage in "Logic and Conversation". Grice is very vague about these purposes: how many there are, how shared they have to be. With decades of hindsight, we can see that the purposes are first of all not unique. Conversants typically have hierarchically embedded goals. Second, goals are not so much shared or mutual as they are mutually modelled (Cohen & Perrault 1979, Cohen & Levesque 1980, Green 1982, Appelt 1985, Cohen & Levesque 1990, Perrault 1990): for George to understand Martha's utterance of "X" to George, George must have beliefs about Martha which include Martha's purpose in uttering "X" to George, which in turn subsumes Martha's model of George, including George's model of Martha, etc. Grice's assertion (1975: 48) that "in characteristic talk-exchanges, there is a common aim even if [...] it is a second-order one, namely, that each party should, for the time being, identify himself with the transitory conversational interests of the other" is an underappreciated expression of this view. The idea that participants will at least temporarily identify with each other's interests, i.e., infer what each other is trying to do, is what allows quarrels, monologues, and the like to be included in the talk exchanges that the Cooperative Principle governs. Grice actually cited quarrels and letter-writing as instances that did not fit an interpretation of the Cooperative Principle that he rejected (Grice 1975: 48), vitiating critiques by Pratt (1981) and Schauber & Spolsky (1986). The participants may have different values and agendas, but given Grice's (1957) characterization of conventional meaning, for any communication to occur, each must make assumptions about the other's goals, at least the low-level communicative goals. This is the sense in which participants recognize a "common goal". When the assumptions participants make about each other's goals are incorrect, and this affects non-trivial beliefs about each other, we say they are "talking at cross-purposes", precisely because of the mismatch between actual goals and beliefs, and attributed goals and beliefs.

Interpreting the communicative behavior of other human beings as intentional, and as relevant, necessary, and sufficient for the achievement of some presumed goal, seems to be unavoidable. As Gould (1991: 60) noted, in quite a different context, "humans are pattern-seeking animals. We must find cause and meaning in all events." This is, of course, equally true of behavior that isn't intended as communicative. Crucially, we do not seem to entertain the idea that someone might be acting for no reason. That alternative, along with the possibility that he isn't even doing what we perceive him to be doing, that it's just happening to him, is one that we seem reluctant to accept without any independent support, such as knowledge that people do that sort of thing as a nervous habit, like playing with their hair.

Thus, "cooperative" in the sense of the Cooperative Principle does not entail each party accepting all of their interlocutor's goals as their own and helping to achieve them. Rather, it is most usefully understood as meaning no more – and no less – than trying to understand the interaction from the other participants' point of view, i.e., trying to understand what their goals and assumptions must be. When Grice refers to the Cooperative Principle as "rational", it is just this assumption that actions are undertaken to accomplish goals that he has in mind.
3.3. Illocutionary intentions

The second half of the 20th century saw focused investigation of speech acts in general, and illocutionary aspects of meaning in particular, preeminently in the work of Austin (1962), Searle (1969) and Bach & Harnish (1979).
Early on, the view within linguistics was that (i) illocutionary force was indicated by performative verbs (Austin 1962) or other Illocutionary-Force-Indicating-Devices (IFIDs, Stampe 1975) such as intonation or "markers" like preverbal please, and that (ii) where illocutionary force was ambiguous or unclear, that was because the performative clause prefixed to the sentence (Lakoff 1968, Ross 1970, Fraser 1974, Sadock 1974) was "abstract" and invisible. This issue has not been much discussed since the mid-1970s, and Dennis Stampe's (1975) conclusion that so-called performative verbs are really constative, so that the appearance of performativity is an inference from the act of utterance, may now be the default assumption. It makes performativity more a matter of the illocutionary intentions of the speaker than of the classification of visible or invisible markers of illocutionary force.

The term "illocutionary intentions" is meant to include the relatively large but limited set of illocutionary forces described and classified by Austin (1962) and Searle (1969) and many others (stating, requesting, promising and the like). So, expositives are utterances which a speaker (S) makes with the intention that the addressee (A) recognize S's intention that A believe that S believes their content. Promises are made with the intention that A recognize S's intention that A believe that S will be responsible for making their content true. Interrogatives are uttered with the belief that A will recognize S's intention that A provide information which S indicates. From the hearer's point of view, such illocutionary intentions do not seem significantly different from intentions about the essentially unlimited class of possible perlocutionary effects that a speaker might intend an addressee to recognize as intended on an occasion of use. So, reminders are expositives uttered with the intention that A will recognize that S believes A has believed what S wants A to believe. Warnings are utterances the uttering of which is intended to be recognized as intended to inform A that some imminent or contingent state of affairs will be bad for A. An insult is uttered with the intention that A recognize S's intent to convey S's view of A as having properties normally believed to be bad.

Both kinds of intentions figure in conditions on the use of linguistic expressions of various sorts. For example, there is a whole host of phrasal verbs (many meaning roughly 'go away'), such as butt out, that can be described as directive-polarity expressions (instantiating, reporting, or evoking directives), as illustrated in (10), where the asterisk prefixed to an expression indicates that it is ungrammatical.

(10) a. Butt out!
     b. They may butt out.
     c. *Why do they butt out?
     d. They were asked to butt out.
     e. They refused to butt out.
     f. Why don't you butt out?

But there are also expressions whose distribution depends on mental states of the speaker regarding the addressee that are not matters of illocutionary force. Often it is an intention to produce a particular perlocutionary effect, as with threat-polarity expressions like Or else!, exemplified in (11).

(11) a. Get out, or else!
     b. *I want to get out, or else!
     c. *Who's on first, or else?
     d. They said we had to get out, or else.
     e. They knew they had to get out, or else.
     f. They were surprised that we had to get out, or else.

Sometimes, however, the relevant mental state of the speaker does not involve intentions at all. This is the case with ignorance-polarity constructions and idioms, which are acceptable only in contexts where a relevant being (typically the speaker) is ignorant of the answer to an indicated question, as in (12) (cf. Horn 1972, 1978).

(12) a. Where the hell is it?
     b. I couldn't figure out where the hell it was.
     c. *We went back to the place where the hell we left it.
     d. *I knew where the hell it would be found.

It is possible to represent this kind of selection formally, say, in terms of conjoined and/or nested propositions representing speaker intentions and speaker and addressee beliefs about intentions, actions, existence, knowledge and values. But it is not the linguistic sign which is the indicator of illocutionary intentions; it is the act of uttering it. An utterance is a warning just in case S intends A to recognize the uttering of it as intended to cause A to recognize that some situation may result in a state which A would consider bad. In principle, such conditions could be referenced as constraints in the lexical entries for restricted forms, as sketched below. A threat-polarity item like or else! presupposes a warning context, with the additional information that the speaker intends the addressee to recognize (that the speaker intends the addressee to recognize) that if the warning is not heeded, some individual or situation will be responsible for some state of affairs that endangers the referent of the subject of the clause to which or else is appended.
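To give a rough idea of what such a nested-proposition condition might look like (the notation and predicate labels here are mine, offered only as an illustrative sketch, not a proposal from the literature), the warning condition just stated could be rendered as:

an utterance u of S to A is a warning iff:
  intend(S,
    recognize(A,
      intend(S,
        cause(utter(S, u),
          recognize(A, may-result(σ, s) ∧ bad-for(A, s))))))

where σ is the relevant imminent or contingent situation and s the state it may lead to. The lexical entry for a threat-polarity item like or else! could then simply require that its context of utterance satisfy this condition, together with the further condition about unheeded warnings described above.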
Illocutionary force, like deixis, is an aspect of meaning that is fairly directly derivative of the use of a linguistic expression. At the other end of the spectrum, lexical meaning, which appears much more concrete and fixed, has also been argued to depend on the sort of social contract that the Cooperative Principle engenders, as explicated by Nunberg (1978, 2004). This is the topic of Section 4.

4. Implications for word meaning

In general, to the extent that we are able to understand each other, it is because we all use language in accordance with the Cooperative Principle. This entails (cf. Grice 1957) that we will only use a referential term when we believe that our addressee will be able to identify our intended referent from our reference to it by that term in that context, and will believe that we intended him to do so. But the bottom line is that the task of the addressee is to deduce what sense the speaker most likely intended, and he will use all available clues to do so, without regard to whether they come from within or outside the immediate sentence or bit of discourse at hand. This means that even so-called literal meanings have an indexical character, in depending on the speaker's ability to infer correctly what an addressee will assume a term is intended to refer to on an occasion of use (cf. Nunberg 1978). Even the sense of predicative lexical items in an utterance is not fixed by or in a linguistic system, but can only be deduced in connection with assumptions about the speaker's beliefs about the knowledge and expectations of the addressee.

To illustrate: as has often been noted (cf. Ruhl 1989, Green 1998), practically any word can be used to denote an almost limitless variety of kinds of objects or functions. In addition to referring to a fruit, or a defective automobile, lemon might refer to the wood of the lemon tree, as in (13a), to the flavor of the juice of the fruit (13b), to the oil from the peel of the fruit (13c), to an object which has the color of the fruit (13d), to something the size of the fruit (13e), and to a substance with the flavor of the fruit (13f). These are only the most obvious uses from an apparently indefinitely large set.

(13) a. Lemon has an attractive grain, much finer than beech or cherry.
     b. I prefer the '74 because the '73 has a lemon aftertaste.
     c. Lemon will not penetrate as fast as linseed.
     d. The lemon is too stretchy, but the coral has a snag in it.
     e. Shape the dough into little lemons, and let rise.
     f. Two scoops of lemon, please, and one of Rocky Road.

The idea that what a word can be used to refer to might vary indefinitely is clearly unsettling. It makes the fact that we (seem to) understand each other most of the time something of a miracle, and it makes the familiar, comfortable Conduit Theory of communication (critiqued in Reddy 1979), according to which speakers encode ideas in words and sentences and send them to addressees to decode, quite irrational. But the conclusion that as language users we are free to use any word to refer to anything at all, any time we want, is unwarranted. Lexical choice is always subject to the pragmatic constraint that we have to consider how likely it is that our intended audience will be able to correctly identify our intended referent from our use of the expression we choose. What would really be irrational would be using a word to refer to anything other than what we estimate our addressee is likely to take it to refer to, because it would be self-defeating. Thus, in spite of all the apparent freedom afforded to language users, rationality severely limits what a speaker is likely to use a term to refer to in a given context. Since people assume that people's actions are goal-directed (so that any act will be assumed to have been performed for a reason), a speaker must be assumed to believe that, all things considered, the word she chooses is the best word to further her goals in its context and with respect to her addressee. Speakers frequently exploit the freedom they have, within the bounds of this constraint, referring to movies as turkeys, cars as lemons, and individuals in terms of objects associated with them, as when we say that the flute had to leave to attend his son's soccer game, or that the corned beef spilled his beer.

If this freedom makes communication sound very difficult to effect, and very fragile, it is important to keep in mind that we are probably less successful at it than we think we are, and generally oblivious of the work that is required as well. But it is probably not really that fragile. Believing as an operational principle in the convenient fiction that words have fixed meanings is what makes using them to communicate appear to require no effort. If we were aware of how much interpretation we depended on each other to do to understand us, we might hesitate to speak.
Instead, we all act as if we believe, and believe that everyone else believes, that the denotation an individual word may have on an occasion of use is limited, somewhat arbitrarily, as a matter of linguistic convention.
Nunberg (1978), extending observations made by Lewis (1969), called this sort of belief a normal belief, defined so that the relation normally-believe holds of a speech community and a proposition P when people in that community believe that it is normal (i.e., unremarkable, to be expected) in that community to believe P, and believe that everyone in that community believes that it is normal in that community to believe P. See also Stalnaker (1974) and Atlas (2004). (The term speech community, following Nunberg (1978), is not limited to geographical or political units, or even institutionalized social units, but encompasses any group of individuals with common interests. Thus, we all belong simultaneously to a number of speech communities, depending on our interests and backgrounds; we might be women and mothers and Lutherans and lawyers and football fans and racketball players, who knit and surf the internet, and are members of countless speech communities besides these.)

Illustrating the relevance of the notion of normal belief to traditional semantic concerns: it is normal beliefs about cars and trucks, and about what properties of them good old boys might find relevant, that would lead someone to understand the coordination in (14a) with narrow adjectival scope and that in (14b) with wider scope: for example, because of the aerodynamic properties of trucks relative to cars, fast truck is almost an oxymoron.

(14) a. The good ol' boys there drive fast cars and trucks.
     b. The good ol' boys there drive red cars and trucks.

This technical use of normal belief should not be confused with other notions that may have the same name. A normal belief in the sense intended is only remotely related to an individual's belief about how things normally are, and only remotely related (in a different direction) to a judgement that it is unremarkable to hold such a belief. The beliefs that are normal within a community are those that "constitute the background against which all utterances in that community are rationally made" (Nunberg 1978: 94–95).

Addressing the issue of using words to refer to things, properties, and events: what it is considered normal to use a word like tack or host or rock or metal to refer to varies with the community. These are social facts, facts about societies, and only incidentally and contingently and secondarily facts about words. More precisely, they are facts about what speakers believe other speakers believe about conventions for using words. Thus, it is normal among field archaeologists to use mesh bound in frames to sift through excavated matter for remnants of material culture, and it is normally believed among them that this is normal, and that it is normal to refer to the sieves as screens. Likewise, among users of personal computers, it is normally believed that the contents of a data file may be inspected by projecting representations of portions of it on an electronic display, and it is normally believed that this belief is normally held, and that it is normal to refer to the display as a screen. Whether screen is (intended to be) understood as (normally) referring to a sort of sieve or to a video display depends on assumptions made by speaker and hearer about the assumptions each makes about the other's beliefs, including beliefs about what is normal in a situation of the sort being described, and about what sort of situation (each believes the other believes) is being discussed at the moment of utterance.
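The definition of normally-believe given above can be set out schematically as follows (the notation is mine; Bel(x, P) abbreviates 'x believes P', and normal-in(C, P) 'it is normal in community C to believe P'):

normally-believe(C, P) iff for every member x of C:
  Bel(x, normal-in(C, P)) ∧ Bel(x, for every member y of C: Bel(y, normal-in(C, P)))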
This is what makes word meaning irreducibly a matter of language use. Although different senses of a word may sometimes have different syntactic distributions (so-called selectional restrictions), McCawley (1968) showed that this is not so much a function of the words themselves as it is a function of properties that language users attribute to the presumed intended referents of the words.
Normal use is defined in terms of normal belief, and normal belief is an intensional concept. If everybody believes that everybody believes that it is normal to believe P, then belief in P is a normal belief, even if nobody actually believes P. In light of this, we are led to a view of word usage in which, when a speaker rationally uses a word w to indicate some intended individual or class a, she must assume that the addressee will consider it rational to use w to indicate a in that context. She must assume that if she and her addressee do not in fact have the same assumptions about what beliefs are normal in the community-at-large, and in every relevant subgroup, at least the addressee will be able to infer what relevant beliefs the speaker imputes to the addressee, or expects the addressee to impute to the speaker, and so on, in order to infer the intended referent.

If we define an additional, recursive relation mutually-believe as holding among two sentient beings A and B and a proposition when A believes the proposition, believes that B believes the proposition, believes that B believes that A believes the proposition, and so on (cf. Cohen & Levesque 1990), then we can articulate the notion normal meaning (not to be confused with 'normal referent out of context'; cf. Green 1995: 14f, Green 1996: 59 for discussion): some set (or property) m is a normal meaning (or denotation) of an expression w insofar as it is normally believed that w is used to indicate m. A meaning m for an expression w is normal in a context insofar as speaker and addressee mutually believe that it is normally believed that w is used to indicate m in that context. We can then say that 'member of the species Canis familiaris' is a normal meaning for the word dog insofar as speaker and addressee mutually believe that it is normally believed in their community that such entities are called dogs.
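Schematically (again in notation of my own devising, with Bel as above), the recursive relation and the derived notion can be displayed as:

mutually-believe(A, B, p) iff Bel(A, p) ∧ Bel(A, Bel(B, p)) ∧ Bel(A, Bel(B, Bel(A, p))) ∧ ...

m is a normal meaning of w in context c iff mutually-believe(S, H, normally-believe(C, 'w is used to indicate m in c'))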
Some uses of referential expressions like those exemplified in (13) are not so much abnormal, or less normal than others, as they are normal in a more narrowly defined community. In cases of systematic polysemy, all the use-types (or senses), whether normal in broadly defined or very narrowly exclusive communities, are relatable to one another in terms of functions like 'source of', 'product of', 'part of', 'mass of', which Nunberg (1978) characterized as referring functions (for discussion, see Nunberg 1978, 2004, Green 1996, 1998, Pelletier & Schubert 1986, Nunberg & Zaenen 1992, Copestake & Briscoe 1995, Helmreich 1994). For example, using the word milkshake as in (15) to refer to someone who orders a milkshake exploits the referring function 'purchaser of', and presumes a mutual belief that it is normal for restaurant personnel to use the name of a menu item to refer to a purchaser of that item, or more generally, for sales agents to use a description of a purchase to refer to the purchaser.

(15) The milkshake claims you kicked her purse.

This is in addition, of course, to the mutual belief it presumes about what the larger class of English speakers normally use milkshake to refer to, and the mutual belief that the person identified as the claimant ordered a milkshake.

The assumption that people's actions are purposeful, so that any act will be assumed to have been performed for a reason, is a universal normal belief: everyone believes it and believes that everyone believes it (cf. Green 1993). The consequence of this for communicative acts is that people intend and expect that interpreters will attribute particular intentions to them, so consideration of just what intention will be attributed to speech actions must enter into rational utterance planning (cf. Green 1993, also Sperber & Wilson 1986). This is the Gricean foundation of this theory (cf. also Neale 1992).

If the number of meanings for a given lexical term is truly indefinitely extendable (as it appears to be), or even merely very large, it is impractical in the extreme to try to list them. But the usual solution to the problem of representing an infinite class in a finite (logical) space is as available here as anywhere else, at least to the extent that potential denotations can be described in terms of composable functions on other denotations, and typically this is the case (Nunberg 1978: 29–62). It is enough to know, Nunberg argues, that if a term can be used to refer to some class X, then it can be used, given appropriate context, to refer to objects describable by a recognizable function on X. This principle can be invoked recursively, and applies to functions composed of other functions, and to expressions composed of other expressions, enabling diverse uses like those for lemon in (13) to be predicted in a principled manner, as the sketch below illustrates.
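By way of illustration only, here is a toy rendering of composable referring functions in Python (the function names, glosses, and starting denotation are mine, not Nunberg's):

```python
from functools import reduce

# Toy referring functions: each maps a description of one (kind of)
# referent to a description of a functionally related referent.
def wood_of(x):   return f"wood of the {x} tree"     # cf. use (13a)
def flavor_of(x): return f"flavor of the {x}"        # cf. use (13b)
def mass_with(x): return f"substance with the {x}"   # cf. use (13f)

def derived_use(base, *fns):
    """Compose referring functions over a basic denotation, recursively."""
    return reduce(lambda denotation, f: f(denotation), fns, base)

print(derived_use("lemon", wood_of))               # 'wood of the lemon tree'
print(derived_use("lemon", flavor_of, mass_with))  # composition yields (13f):
                                                   # 'substance with the flavor
                                                   #  of the lemon'
```

The point of the sketch is simply that a finite stock of such functions, applied recursively, characterizes an indefinitely large space of potential uses without listing them.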
Consequently, even under the best of circumstances, whatever proposition George interprets Martha's utterance to be expressing may not exactly match the proposition she intended him to understand. The difference, which often is not detected, may or may not matter in the grand scheme of things. Since acts are interpreted at multiple levels of granularity, this holds for the interpretation of acts involved in choosing words and construction types, as well as for acts of uttering sentences containing or instantiating them. From this it follows that the distinction between pragmatic effects called "intrusive pragmatics" (Levinson 2000) or explicature (Sperber & Wilson 1986) and those described as implicature proper has little necessary significance for an information-based account of language structure. Because the computation of pragmatic effects by whatever name involves analysis of what is underspecified in the actual utterance, and how what is uttered compares to what might reasonably have been expected, it involves importing propositions, a process beyond the bounds of computation within a finite domain, even if supplemented by a finite set of functions.

When a reader or hearer recognizes that an utterance is ambiguous, resolving that ambiguity amounts to determining which interpretation was intended. When recognizing an ambiguity affects parsing, resolving it may involve grammar and lexicon, as for example, when it involves a form which could be construed as belonging to different syntactic categories (e.g., The subdivision houses most of the officers vs. The subdivision houses are very similar, or Visiting relatives is a lot of fun vs. Visiting relatives are a lot of fun). But grammar and lexicon may not be enough to resolve such ambiguities, as in the case of familiar examples like (16).

(16) a. I saw her duck.
     b. Visiting relatives can be a lot of fun.

They will rarely suffice to resolve polysemies or attachment ambiguities like I returned the key to the library. In all of these cases, it is necessary to reconstruct what it would be reasonable for the speaker to have intended, given what is known or believed about the beliefs and goals of the speaker, exactly as when seeking to understand the relevance of an unambiguous utterance in a conversation - that is, to understand why the speaker bothered to utter it, or to say it the way she said it. The literature on natural language understanding contains numerous demonstrations that the determination of how an ambiguous or vague term is intended to be understood depends on identifying the most likely model of the speaker's beliefs and intentions about the interaction. Nunberg (1978: 84-87) discusses the beliefs and goals that have to be attributed to him in order for his uncle to understand what he means by jazz when he asks him if he likes jazz. Crain & Steedman (1985), and Altmann & Steedman (1988) offer evidence that experimentally controllable aspects of context that reflect speakers' beliefs about situations affect processing in more predictable ways than mechanistic parsing strategies. Green (1996: 119-122) describes what is involved in identifying what is meant by IBM, at, and seventy-one in Sandy bought IBM at 71. At the other end of the continuum of grammatical and pragmatic uncertainty is Sperber & Wilson's (1986: 239-241) discussion of the process of understanding irony and sarcasm, as when one says I love people who don't signal, intending to convey 'I hate people who don't signal.'
(A further step in the process is required to interpret I love people who signal! as intended to convey the same thing; the difference is subtle, because the contextual conditions likely to provoke the two utterances are in a subset relation. One might say I love people who don't signal to inform an interlocutor of one's annoyance at someone who the speaker noticed did not signal, but I love people who signal is likely to be used sarcastically only when the speaker believes that it is obvious to the addressee that someone should have signaled and did not.)

Nonetheless, it may be instructive here to examine the resolution of a salient lexical ambiguity. Understanding the officials' statement in example (17) involves comparing how different assumptions about mutual beliefs about the situation are compatible with different interpretations, in order to determine whether plant refers to a vegetable organism or to the production facility of a business (or perhaps just to the apparatus for controlling the climate within it, or maybe to the associated grounds, offices and equipment generally), or even to some sort of a decoy.

(17) Officials at International Seed Co. beefed up security at their South Carolina facility in the face of rumors that competitors would stop at nothing to get specimens of a newly-engineered variety, saying, "That plant is worth $5 million."

If the interpreter supposes that what the company fears is simply theft of samples of the organism, she will take the official as intending plant to refer to the (type of the) variety: being able to market tokens of that type represents a potential income of $5 million. On the other hand, if the interpreter supposes that the company fears damage to their production facility or the property surrounding it - say, because she knows that samples of the organism are not even located at the production facility any more, and/or that extortionists have threatened to vandalize company property if samples of the organism are not handed over - she is likely to take plant as intended to refer to buildings and grounds or equipment. Believing that the company believes that potential income from marketing the variety is many times greater than $5 million would have the same effect. If the interpreter believes that the statement was made in the course of an interview where a company spokesperson discussed the cost of efforts to protect against industrial espionage, and mentioned how an elaborate decoy system had alerted them to a threat to steal plant specimens, she might even take plant as intended to refer to a decoy. The belief that officials believe that everyone relevant believes that both the earnings potential of the organism, and the value of relevant structures and infrastructure are orders of magnitude more or less than $5 million would contribute to this conclusion, and might even suffice to induce it on its own.
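The comparison just walked through - different attributed beliefs supporting different referents for plant - can be caricatured in a few lines of code. The sketch below is ours, not a model from the article or the natural language understanding literature it cites; the candidate senses and the compatibility table are hypothetical stand-ins for the interpreter's model of the speaker's beliefs.

```python
# Toy sketch (ours): resolving 'plant' in (17) by scoring candidate senses
# against propositions the interpreter attributes to the speaker.
# Senses and the compatibility table are hypothetical stand-ins.

SENSES = ["organism", "facility", "decoy"]

COMPATIBLE = {                         # attributed belief -> supported senses
    "company fears theft of samples":          {"organism"},
    "samples no longer kept at the facility":  {"facility", "decoy"},
    "extortionists threaten company property": {"facility"},
    "spokesperson discussed a decoy system":   {"decoy"},
}

def resolve(attributed_beliefs):
    """Pick the sense most consistent with the beliefs attributed to the speaker."""
    score = {s: sum(s in COMPATIBLE.get(b, set()) for b in attributed_beliefs)
             for s in SENSES}
    return max(score, key=score.get)

print(resolve(["company fears theft of samples"]))            # organism
print(resolve(["samples no longer kept at the facility",
               "extortionists threaten company property"]))   # facility
```

Note that nothing in such a table is linguistic knowledge; as the text goes on to stress, it is encyclopedic knowledge about the relevant individuals and situation.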
Two points are relevant here. First, depending on how much of the relevant information is salient in the context in which the utterance is being interpreted, the sentence might not even be recognized as ambiguous. This is equally true in the case of determining discourse intents. For example, identifying sarcastic intent is similar to understanding the reference of an expression in that it depends on attributing to the speaker intent to be sarcastic. (Such an inference is supported by finding a literal meaning to be in conflict with propositions assumed to be mutually believed, but this is neither necessary nor sufficient for interpreting an utterance as sarcastic.) Being misled in the attribution of intent is a common sort of misunderstanding, indeed, a sort that is likely to go undetected. These issues are discussed at length in articles 23 (Kennedy) Ambiguity and vagueness, 38 (Dekker) Dynamic semantics, 88 (Jaszczolt) Semantics and pragmatics, 89 (Zimmermann) Context dependency and 94 (Potts) Conventional implicature.

Second, there is nothing linguistic about the resolution of the lexical ambiguity in (17). All of the knowledge that contributes to the identification of a likely intended referent is encyclopedic or contextual knowledge of (or beliefs about) the relevant aspects of the world, including the beliefs of relevant individuals in it (e.g., the speaker, the (presumed) addressees of the quoted speech, the reporter of the quoted speech, and the (presumed) addressees of the report). That disambiguation of an utterance in its context may require encyclopedic knowledge of a presumed universe of discourse is hardly a new observation; it has been a commonplace in the linguistics and Artificial Intelligence literature for decades. Its pervasiveness and its significance sometimes seem to be surprisingly underappreciated. A similar demonstration could be made for many structural ambiguities, including some of the ones mentioned at the beginning of this section.

Insofar as language users resolve ambiguities that they recognize by choosing the interpretation most consistent with their model of the speaker and of the speaker's model of the world, modelling this ability of theirs by means of probability-based grammars and lexicons (e.g., Copestake & Briscoe 1995) is likely to provide an arbitrarily limited solution. When language users fail to recognize ambiguities in the first place, it is surely because beliefs about the speaker's beliefs and intentions in the context at hand which would support alternative interpretations are not salient to them.

This view treats disambiguation, at all levels, as indistinct from interpretation, insofar as both involve comparing an interpretation of a speaker's utterance with goals and beliefs attributed to that speaker, and rejecting interpretations which in the context are not plausibly relevant to the assumed joint goal for the discourse. This is a conclusion that is unequivocally based in Grice's seminal work (Grice 1957, 1975), and yet it is one that he surely did not anticipate. See also Atlas (2004).

5.2. Computational modelling

Morgan (1973) showed that determining the presuppositions of an utterance depends on beliefs attributed to relevant agents (e.g., the speaker, and agents and experiencers of propositional attitude verbs) and is not a strictly linguistic matter. Morgan's account, and Gazdar's (1979) formalization of it, show that the presuppositions associated with lexical items are filtered in being projected as presuppositions of the sentence of which they form a part, by conversational implicature, among other things. (See also articles 37 (Kamp & Reyle) Discourse Representation Theory, 91 (Beaver & Geurts) Presupposition and 92 (Simons) Implicature.) Conversational implicature, as discussed in section 3 of this article, is a function of a theory of human behavior generally, not something specifically linguistic, because it is based on inference of intentions for actions generally, not on properties of the artifacts that are the result of linguistic actions: conversational implicatures arise from the assumption that it is reasonable (under the particular circumstances of the speech event in question) to expect the addressee A to infer that the speaker S intended A to recognize S's intention from the fact that the speaker uttered whatever she uttered.
It would be naive to expect that the filtering in the projection of presuppositions could be represented as a constraint or set of constraints on values of any discrete linguistic feature, precisely because conversational implicature is inherently indeterminate (Grice 1975, Morgan 1973, Gazdar 1979).

Despite the limitations on the computation and attribution of assumptions, much progress has been made in recent years on simultaneous disambiguation and parsing of unrestricted text. To take only the example that I am most familiar with, Russell (1993) describes a system which unifies syntactic and semantic information from partial parses to postulate (partial) syntactic and semantic information for unfamiliar words. This progress suggests that a promising approach to the problem of understanding unrestricted texts would be to expand the technique to systematically include information about contexts of utterance, especially since it can be taken for granted that words will be encountered which are being used in unfamiliar ways. The chief requirement for such an enterprise is to reject (following Reddy 1979) the simplistic view of linguistic expressions as simple conduits for thoughts, and model natural language use as action of rational agents who treat the exchange of ideas as a joint goal, as Grice, Cohen, Perrault, and Levesque have suggested in the articles cited. This is a nontrivial task, and if it does not offer an immediate payoff in computational efficiency, ultimately it will surely pay off in increased accuracy, not to mention in understanding the subtlety of communicative and interpretive techniques.

6. Summary

Some of the conclusions outlined here are surely far beyond what Bar-Hillel articulated in 1954, and what Grice may have had in mind in 1957 or 1967, the date of the William James Lectures in which the Cooperative Principle was articulated. Nonetheless, our present understanding rests on the shoulders of their work. Bar-Hillel (1954) demonstrated that it is not linguistic forms that carry pragmatic information, but the facts of their utterance. The interpretation of first and second person pronouns, tenses, and deictic adverbials are only the most superficial aspect of this. Bar-Hillel's observations on the nature of indexicals form the background for the development of context-dependent theories of semantics by Stalnaker (1970), Kamp (1981), Heim (1982) and others. Starting from a very different set of considerations, Grice (1957) outlined how a speaker's act of using a linguistic form to communicate so-called literal meaning makes critical reference to speaker and hearer, with far-reaching consequences. Grice's (1957) insight that conventional linguistic meaning depends on members of a speech community recognizing speakers' intentions in saying what they say the way they say it enabled him to sketch how this related to lexical meaning and presupposition, and (in more detail) implied meaning. While the view presented here has its origins in Grice's insight, it is considerably informed by elaborations and extensions of it over the past 40 years. The distinction between natural meaning (entailment) and non-natural meaning (conventional meaning) provided the background against which his theory of the social character of communication was developed in the articulation of the Cooperative Principle. While the interpersonal character of illocutionary acts was evident from the first discussions, it was less obvious that lexical meaning, which appears much more concrete and fixed, could also be argued to depend on the sort of social contract that the Cooperative Principle engenders.
Subsequent explorations into the cooperative aspects of action generally make it reasonable to anticipate a much more integrated understanding of the interrelations of content and context, of meaning and use, than seemed likely forty years ago.
7. References

Allwood, Jens, Lars-Gunnar Andersson & Östen Dahl 1977. Logic in Linguistics. Cambridge: Cambridge University Press.
Altmann, Gerry & Mark Steedman 1988. Interaction with context during sentence processing. Cognition 30, 191-238.
Appelt, Douglas E. 1985. Planning English Sentences. Cambridge: Cambridge University Press.
Atlas, Jay David 2004. Presupposition. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 29-52.
Austin, John L. 1962. How to Do Things with Words. Cambridge, MA: Harvard University Press.
Bach, Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge, MA: The MIT Press.
Bar-Hillel, Yehoshua 1954. Indexical expressions. Mind 63, 359-379.
Blum-Kulka, Shoshana & Elite Olshtain 1986. Too many words: Length of utterance and pragmatic failure. Studies in Second Language Acquisition 8, 165-180.
Brown, Gillian & George Yule 1983. Discourse Analysis. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson 1978. Universals in language usage: Politeness phenomena. In: E. Goody (ed.). Questions and Politeness: Strategies in Social Interaction. Cambridge: Cambridge University Press, 56-311. (Expanded version published 1987 in book form as Politeness, by Cambridge University Press.)
Cohen, Philip R. & Hector J. Levesque 1980. Speech acts and the recognition of shared plans. Proceedings of the Third National Conference of the Canadian Society for Computational Studies of Intelligence. Victoria, BC: University of Victoria, 263-271.
Cohen, Philip R. & Hector J. Levesque 1990. Rational interaction as the basis for communication. In: P. Cohen, J. Morgan & M. Pollack (eds.). Intentions in Communication. Cambridge, MA: The MIT Press, 221-256.
Cohen, Philip R. & Hector J. Levesque 1991. Teamwork. Noûs 25, 487-512.
Cohen, Philip R. & C. Ray Perrault 1979. Elements of a plan-based theory of speech acts. Cognitive Science 3, 177-212.
Copestake, Ann & Ted Briscoe 1995. Semi-productive polysemy and sense extension. Journal of Semantics 12, 15-67.
Crain, Stephen & Mark Steedman 1985. On not being led up the garden path: The use of context by the psychological parser. In: D. R. Dowty, L. Karttunen & A. M. Zwicky (eds.). Natural Language Parsing. Cambridge: Cambridge University Press, 320-358.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fraser, Bruce 1974. An examination of the performative analysis. Papers in Linguistics 7, 1-40.
Gazdar, Gerald 1979. Pragmatics: Implicature, Presupposition, and Logical Form. New York: Academic Press.
Gould, Stephen J. 1991. Bully for Brontosaurus. New York: W. W. Norton.
Green, Georgia M. 1982. Linguistics and the pragmatics of language use. Poetics 11, 45-76.
Green, Georgia M. 1990. On the universality of Gricean interpretation. In: K. Hall et al. (eds.). Proceedings of the 16th Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 411-428.
Green, Georgia M. 1993. Rationality and Gricean Inference. Cognitive Science Technical Report UIUC-BI-CS-93-09. Urbana, IL: University of Illinois.
Green, Georgia M. 1995. Ambiguity resolution and discourse interpretation. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 1-26.
Green, Georgia M. 1996. Pragmatics and Natural Language Understanding. 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Green, Georgia M. 1998. Natural kind terms and a theory of the lexicon. In: E. Antonsen (ed.). Studies in the Linguistic Sciences 28. Urbana, IL: Department of Linguistics, University of Illinois, 1-26.
Grice, H. Paul 1957. Meaning. Philosophical Review 66, 377-388.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41-58.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Helmreich, Stephen 1994. Pragmatic Referring Functions in Montague Grammar. Ph.D. dissertation. University of Illinois, Urbana, IL.
Hinrichs, Erhard 1986. Temporal anaphora in discourses of English. Linguistics & Philosophy 9, 63-82.
Horn, Laurence R. 1972. On the Semantic Properties of Logical Operators in English. Ph.D. dissertation. University of California, Los Angeles, CA. Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1976.
Horn, Laurence R. 1978. Some aspects of negation. In: J. Greenberg, C. Ferguson & E. Moravcsik (eds.). Universals of Human Language, vol. 4: Syntax. Stanford, CA: Stanford University Press, 127-210.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. M. V. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277-321.
Keenan, Elinor O. 1976. The universality of conversational implicature. Language in Society 5, 67-80.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253-355 and 763-769.
Lakoff, Robin L. 1968. Abstract Syntax and Latin Complementation. Cambridge, MA: The MIT Press.
Levinson, Stephen C. 2000. Presumptive Meanings: The Theory of Generalized Conversational Implicature. Cambridge, MA: The MIT Press.
Lewis, David 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lycan, William G. 1984. Logical Form in Natural Language. Cambridge, MA: The MIT Press.
McCawley, James D. 1968. The role of semantics in a grammar. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 124-169.
McCawley, James D. 1971. Tense and time reference in English. In: C. Fillmore & D. T. Langendoen (eds.). Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 97-113.
Morgan, Jerry L. 1973. Presupposition and the Representation of Meaning: Prolegomena. Ph.D. dissertation. University of Chicago, Chicago, IL.
Morgan, Jerry L. 1978. Two types of convention in indirect speech acts. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York: Academic Press, 261-280.
Neale, Stephen 1992. Paul Grice and the philosophy of language. Linguistics & Philosophy 15, 509-559.
Nunberg, Geoffrey 1978. The Pragmatics of Reference. Ph.D. dissertation. CUNY, New York. Reprinted: Bloomington, IN: Indiana University Linguistics Club. (Also published by Garland Publishing, Inc.)
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics & Philosophy 3, 145-185.
Nunberg, Geoffrey 1993. Indexicality and deixis. Linguistics & Philosophy 16, 1-44.
Nunberg, Geoffrey 2004. The pragmatics of deferred interpretation. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 344-364.
Nunberg, Geoffrey & Annie Zaenen 1992. Systematic polysemy in lexicology and lexicography. In: H. Tommola, K. Varantola & J. Schopp (eds.). Proceedings of Euralex 92, Part II. Tampere: The University of Tampere, 387-398.
Partee, Barbara 1973. Some structural analogues between tenses and pronouns in English. Journal of Philosophy 70, 601-609.
Partee, Barbara 1989. Binding implicit variables in quantified contexts. In: C. Wiltshire, B. Music & R. Graczyk (eds.). Papers from the 25th Regional Meeting of the Chicago Linguistic Society. Chicago, IL: Chicago Linguistic Society, 342-365.
Pelletier, Francis J. & Lenhart K. Schubert 1986. Mass expressions. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. 4. Dordrecht: Reidel, 327-407.
Perrault, Ray 1990. An application of default logic to speech act theory. In: P. Cohen, J. Morgan & M. Pollack (eds.). Intentions in Communication. Cambridge, MA: The MIT Press, 161-186.
Pratt, Mary L. 1981. The ideology of speech-act theory. Centrum (N.S.) 1:1, 5-18.
Prince, Ellen 1983. Grice and Universality. Ms. Philadelphia, PA: University of Pennsylvania. http://www.ling.upenn.edu/~ellen/grice.ps. August 3, 2008.
Putnam, Hilary 1975. The meaning of 'meaning.' In: K. Gunderson (ed.). Language, Mind, and Knowledge. Minneapolis, MN: University of Minnesota Press, 131-193.
Recanati, François 2004. Pragmatics and semantics. In: L. R. Horn & G. Ward (eds.). The Handbook of Pragmatics. Oxford: Blackwell, 442-462.
Reddy, Michael 1979. The conduit metaphor - a case of frame conflict in our language about language. In: A. Ortony (ed.). Metaphor and Thought. Cambridge: Cambridge University Press, 284-324.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 222-272.
Ruhl, Charles 1989. On Monosemy. Albany, NY: SUNY Press.
Russell, Dale W. 1993. Language Acquisition in a Unification-Based Grammar Processing System Using a Real-World Knowledge Base. Ph.D. dissertation. University of Illinois, Urbana, IL.
Sadock, Jerrold M. 1974. Toward a Linguistic Theory of Speech Acts. New York: Academic Press.
Schauber, Ellen & Ellen Spolsky 1986. The Bounds of Interpretation. Stanford, CA: Stanford University Press.
Searle, John 1969. Speech Acts. Cambridge: Cambridge University Press.
Sperber, Dan & Deirdre Wilson 1986. Relevance: Communication and Cognition. Cambridge, MA: Harvard University Press.
Stalnaker, Robert C. 1970. Pragmatics. Synthese 22, 272-289.
Stalnaker, Robert C. 1972. Pragmatics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 380-397.
Stalnaker, Robert C. 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197-214.
Stampe, Dennis 1975. Meaning and truth in the theory of speech acts. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 1-40.

Georgia M. Green, Urbana, IL (USA)
6. Compositionality

1. Background
2. Grammars and semantics
3. Variants and properties of compositionality
4. Arguments in favor of compositionality
5. Arguments against compositionality
6. Problem cases
7. References

Abstract

This article is concerned with the principle of compositionality, i.e. the principle that the meaning of a complex expression is a function of the meanings of its parts and its mode of composition. After a brief historical background, a formal algebraic framework for syntax and semantics is presented. In this framework, both syntactic operations and semantic functions are (normally) partial. Using the framework, the basic idea of compositionality is given a precise statement, and several variants, both weaker and stronger, as well as related properties, are distinguished. Several arguments for compositionality are discussed, and the standard arguments are found inconclusive. Also, several arguments against compositionality, and for the claim that it is a trivial property, are discussed, and are found to be flawed. Finally, a number of real or apparent problems for compositionality are considered, and some solutions are proposed.

1. Background

Compositionality is a property that a language may have and may lack, namely the property that the meaning of any complex expression is determined by the meanings of its parts and the way they are put together. The language can be natural or formal, but it has to be interpreted. That is, meanings, or more generally, semantic values of some sort must be assigned to linguistic expressions, and compositionality concerns precisely the distribution of these values.

Particular semantic analyses that are in fact compositional were given already in antiquity, but apparently without any corresponding general conception. For instance, in Sophist, chapters 24-26, Plato discusses subject-predicate sentences, and suggests (pretty much) that such a sentence is true [false] if the predicate (verb) attributes to what the subject (noun) signifies things that are [are not].

Notions that approximate the modern concept of compositionality did emerge in medieval times. In the Indian tradition, in the 4th or 5th century CE, Śabara says that

The meaning of a sentence is based on the meaning of the words.

and this is proposed as the right interpretation of a sūtra by Jaimini from sometime in the 3rd-6th century BCE (cf. Houben 1997: 75-76). The first to propose a general principle of this
nature in the Western tradition seems to have been Peter Abelard (2008, 3.00.8) in the first half of the 12th century, saying that

Just as a sentence materially consists in a noun and a verb, so too the understanding of it is put together from the understandings of its parts.

(Translation by and information from Peter King 2007: 8.) Abelard's principle directly concerns only subject-predicate sentences, it concerns the understanding process rather than meaning itself, and he is unspecific about the nature of the putting-together operation. The high scholastic conception is different in all three respects. In the early middle 14th century John Buridan (1998, 2.3, Soph. 2 Thesis 5, QM 5.14, fol. 23vb) states what has become known as the additive principle:

The signification of a complex expression is the sum of the signification of its non-logical terms.

(Translation by and information from Peter King 2001: 4.) The additive principle, with or without the restriction to non-logical terms, appears to have become standard during the late middle ages (for instance, in 1372, Peter of Ailly refers to the common view that it 'belongs to the [very] notion of an expression that every expression has parts each one of which, when separated, signifies something of what is signified by the whole'; 1980: 30). The medieval theorists apparently did not possess the general concept of a function, and instead proposed a particular function, that of summing (collecting). Mere collecting is inadequate, however, since the sentences All A's are B's and All B's are A's have the same parts, hence the same collection of part-meanings, and hence by the additive principle have the same meaning.

With the development of mathematics and concern with its foundations came a renewed interest in semantics. Gottlob Frege is generally taken to be the first person to have formulated explicitly the notion of compositionality and to claim that it is an essential feature of human language (although some writers have doubted that Frege really expressed, or really believed in, compositionality; e.g. Pelletier 2001 and Janssen 2001). In "Über Sinn und Bedeutung", 1892, he writes:

Let us assume for the time being that the sentence has a reference. If we now replace one word of the sentence by another having the same reference, this can have no bearing upon the reference of the sentence. (Frege 1892: 62)

This is (a special case of) the substitution version of the idea of semantic values being determined; if you replace parts by others with the same value, the value of the whole doesn't change. Note that the values here are Bedeutungen (referents), such as truth values (for sentences) and individual objects (for individual-denoting terms). Both the substitution version and the function version (see below) were explicitly stated by Rudolf Carnap in (1956) (for both extension and intension), and collectively labeled 'Frege's Principle'.

The term 'compositional' was introduced by Hilary Putnam (a former student of Carnap's) in Putnam (1975a: 77), read in Oxford in 1960 but not published until it appeared in the
collection Putnam (1975b). Putnam says "[...] the concept of a compositional mapping should be so defined that the range of a complex sentence should depend on the ranges of sentences of the kinds occurring in the 'derivational history' of the complex sentence." The first use of the term in print seems to be due to Jerry Fodor (a former student of Putnam's) and Jerrold Katz (1964), to characterize meaning and understanding in a similar sense.

Today, compositionality is a key notion in linguistics, philosophy of language, logic, and computer science, but there are divergent views about its exact formulation, methodological status, and empirical significance. To begin to clarify some of these views we need a framework for talking about compositionality that is sufficiently general to be independent of particular theories of syntax or semantics and yet allows us to capture the core idea behind compositionality.

2. Grammars and semantics

The function version and the substitution version of compositionality are two sides of the same coin: that the meaning (value) of a compound expression is a function of certain other things (other meanings (values) and a 'mode of composition'). To formulate these versions, two things are needed: a set of structured expressions and a semantics for them.

Structure is readily taken as algebraic structure, so that the set E of linguistic expressions is a domain over which certain syntactic operations or rules are defined, and moreover E is generated by these operations from a subset A of atoms (e.g. words). In the literature there are essentially two ways of fleshing out this idea. One, which originates with Montague (see 1974a), takes as primitive the fact that linguistic expressions are grouped into categories or sorts, so that a syntactic rule comes with a specification of the sorts of each argument as well as of the value. This use of many-sorted algebra as an abstract linguistic framework is described in Janssen (1986) and Hendriks (2001). The other approach, first made precise in Hodges (2001), is one-sorted but uses partial algebras instead, so that rather than requiring the arguments of an operation to be of certain sorts, the operation is simply undefined for unwanted arguments. (A many-sorted algebra can in a straightforward way be turned into a one-sorted partial one (but not always vice versa), and under a natural condition the sorts can be recovered in the partial algebra; see Westerståhl 2004 for further details and discussion. Some theorists combine partiality with primitive sorts; for example, Keenan & Stabler 2004 and Kracht 2007.) The partial approach is in a sense simpler and more general than the many-sorted one, and we follow it here.

Thus, let a grammar E = (E, A, Σ) be a partial algebra, where E and A are as above and Σ is a set of partial functions over E of finite arity which generate all expressions in E from A. To illustrate, the familiar rules

NP → Det N   (NP rule)
S → NP VP   (S rule)

correspond to binary partial functions, say α, β ∈ Σ, such that, if most, dog, and bark are atoms in A, one derives as usual the sentence Most dogs bark in E, by first applying α to most and dog, and then applying β to the result of that and bark. These functions are necessarily partial; for example, β is undefined whenever its second argument is dog.
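To make the algebraic picture concrete, here is a minimal sketch (ours, not from the article) of such a partial algebra in code. The category bookkeeping is just a stand-in for whatever makes the operations partial, returning None models undefinedness, and we use the plural surface form dogs as the atom, sidestepping the morphology that the article's α would handle.

```python
# Minimal sketch (ours) of a grammar E = (E, A, Sigma) as a partial
# algebra: alpha and beta are partial functions, with None modeling
# undefinedness for unwanted arguments.

ATOM_CATS = {"most": "Det", "dogs": "N", "bark": "VP"}
DERIVED_CATS = {}                       # categories of derived expressions

def cat(expr):
    return DERIVED_CATS.get(expr, ATOM_CATS.get(expr))

def alpha(det, n):                      # NP rule: NP -> Det N
    if cat(det) == "Det" and cat(n) == "N":
        np = f"{det} {n}"
        DERIVED_CATS[np] = "NP"
        return np
    return None                         # partial: undefined otherwise

def beta(np, vp):                       # S rule: S -> NP VP
    if cat(np) == "NP" and cat(vp) == "VP":
        s = f"{np} {vp}"
        DERIVED_CATS[s] = "S"
        return s
    return None

print(beta(alpha("most", "dogs"), "bark"))   # 'most dogs bark'
print(beta(alpha("most", "dogs"), "dogs"))   # None: beta undefined here
```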
It may happen that one and the same expression can be generated in more than one way, i.e. the grammar may allow structural ambiguity. So it is not really the expressions in E but rather their derivation histories, or analysis trees, that should be assigned semantic values. These derivation histories can be represented as terms in a (partial) term algebra corresponding to E, and a valuation function is then defined from terms to surface expressions (usually finite strings of symbols). However, to save space we shall ignore this complication here, and formulate our definitions as if semantic values were assigned directly to expressions. More precisely, the simplifying assumption is that each expression is generated in a unique way from the atoms by the rules. One consequence is that the notion of a subexpression is well-defined: the subexpressions of t are t itself and all expressions used in the generation of t from atoms (it is fairly straightforward to lift the uniqueness assumption, and reformulate the definitions given here so that they apply to terms in the term algebra instead; see e.g. Westerståhl 2004 for details).

The second thing needed to talk about compositionality is a semantics for E. We take this simply to be a function μ from a subset of E to some set M of semantic values ('meanings'). In the term algebra case, μ takes grammatical terms as arguments. Alternatively, one may take disambiguated expressions such as phrase structure markings by means of labeled brackets. Yet another option is to have an extra syntactic level, like Logical Form, as the semantic function domain. The choice between such alternatives is largely irrelevant from the point of view of compositionality.

The semantic function μ is also allowed to be partial. For example, it may represent our partial understanding of some language, or our attempts at a semantics for a fragment of a language. Further, even a complete semantics will be partial if one wants to maintain a distinction between meaningfulness (being in the domain of μ) and grammaticality (being derivable by the grammar rules).

No assumption is made about meanings. What matters for the abstract notion of compositionality is not meanings as such, but synonymy, i.e. the partial equivalence relation ≡_μ on E defined by:

u ≡_μ t iff μ(u), μ(t) are both defined and μ(u) = μ(t).

(We use s, t, u, with or without subscripts, for arbitrary members of E.)

3. Variants and properties of compositionality

3.1. Basic compositionality

Both the function version and the substitution version of compositionality can now be easily formulated, given a grammar E and a semantics μ as above.

Funct(μ) For every rule α ∈ Σ there is a meaning operation r_α such that if α(u₁, ..., uₙ) is meaningful, then μ(α(u₁, ..., uₙ)) = r_α(μ(u₁), ..., μ(uₙ)).

Note that Funct(μ) presupposes the Domain Principle (DP): subexpressions of meaningful expressions are also meaningful.
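Continuing the toy grammar sketched above, the following lines illustrate what Funct(μ) demands: one meaning operation per syntactic rule, with μ computed bottom-up from the meanings of the parts alone. The meanings used here are our own simplified stand-ins (sets for N and VP, a relation between sets for the determiner), not the article's official semantics.

```python
# Sketch of Funct(mu) for the toy grammar above, with stand-in meanings.

DOGS, BARKERS = {"d1", "d2", "d3"}, {"d1", "d2"}

mu_atoms = {
    "dogs": DOGS,
    "bark": BARKERS,
    "most": lambda A, B: len(A & B) > len(A - B),  # generalized quantifier
}

def r_alpha(m_det, m_n):
    # Meaning operation for the NP rule: an NP meaning maps a VP meaning
    # to a truth value.
    return lambda m_vp: m_det(m_n, m_vp)

def r_beta(m_np, m_vp):
    # Meaning operation for the S rule: apply NP meaning to VP meaning.
    return m_np(m_vp)

# mu('most dogs bark'), computed compositionally:
print(r_beta(r_alpha(mu_atoms["most"], mu_atoms["dogs"]),
             mu_atoms["bark"]))                    # True
```

The point of Funct(μ) is exactly this division of labor: r_alpha and r_beta operate on meanings alone, never looking back at the expressions the meanings came from.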
The substitution version of compositionality is given by

Subst(≡_μ) If s[u₁, ..., uₙ] and s[t₁, ..., tₙ] are both meaningful expressions, and if uᵢ ≡_μ tᵢ for 1 ≤ i ≤ n, then s[u₁, ..., uₙ] ≡_μ s[t₁, ..., tₙ].

The notation s[u₁, ..., uₙ] indicates that s contains (not necessarily immediate) disjoint occurrences of subexpressions among u₁, ..., uₙ, and s[t₁, ..., tₙ] results from replacing each uᵢ by tᵢ. Restricted to immediate subexpressions, Subst(≡_μ) says that ≡_μ is a partial congruence relation:

If α(u₁, ..., uₙ) and α(t₁, ..., tₙ) are both meaningful and uᵢ ≡_μ tᵢ for 1 ≤ i ≤ n, then α(u₁, ..., uₙ) ≡_μ α(t₁, ..., tₙ).

Under DP, this is equivalent to the unrestricted version. Subst(≡_μ) does not presuppose DP, and one can easily think of semantics for which DP fails. However, a first observation is:

(1) Under DP, Funct(μ) and Subst(≡_μ) are equivalent.

That Funct(μ) implies Subst(≡_μ) is obvious when Subst(≡_μ) is restricted to immediate subexpressions, and otherwise proved by induction over the generation complexity of expressions. In the other direction, the operations r_α must be found. For m₁, ..., mₙ ∈ M, let r_α(m₁, ..., mₙ) = μ(α(u₁, ..., uₙ)) if there are expressions uᵢ such that μ(uᵢ) = mᵢ, 1 ≤ i ≤ n, and μ(α(u₁, ..., uₙ)) is defined. Otherwise, r_α(m₁, ..., mₙ) can be undefined (or arbitrary). This is enough, as long as we can be certain that the definition is independent of the choice of the uᵢ, but that is precisely what Subst(≡_μ) says.

The requirements of basic compositionality are in some respects not so strong, as can be seen from the following observations:

(2) If μ gives the same meaning to all expressions, then Funct(μ) holds.

(3) If μ gives different meanings to all expressions, then Funct(μ) holds.

(2) is of course trivial. For (3), consider Subst(≡_μ) and observe that if no two expressions have the same meaning, then uᵢ ≡_μ tᵢ entails uᵢ = tᵢ, so Subst(≡_μ), and therefore Funct(μ), holds trivially.
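The construction of r_α used in the proof of (1) can be rendered schematically; the sketch below is ours and purely illustrative. It makes visible why Subst(≡_μ) is exactly what is needed: the definition picks arbitrary representative expressions for the input meanings, and Subst(≡_μ) guarantees that every successful choice yields the same value, so returning the first success is safe.

```python
# Schematic sketch (ours) of the r_alpha construction in the proof of (1).
# mu maps expressions to meanings (None = undefined); alpha is a partial
# syntactic rule (None = undefined); expressions is (a finite part of) E.
from itertools import product

def make_r_alpha(mu, alpha, expressions):
    def r_alpha(*meanings):
        # Candidate representatives: tuples (u_1,...,u_n) with mu(u_i) = m_i.
        pools = [[u for u in expressions if mu(u) == m] for m in meanings]
        for candidate in product(*pools):
            whole = alpha(*candidate)
            if whole is not None and mu(whole) is not None:
                return mu(whole)      # = mu(alpha(u_1, ..., u_n))
        return None                   # undefined (or arbitrary) otherwise
    return r_alpha
```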
3.2. Recursive semantics

The function version of compositional semantics is given by recursion over syntax, but that does not imply that the meaning operations are defined by recursion over meaning, in which case we have recursive semantics. Standard semantic theories are typically both recursive and compositional, but the two notions are mutually independent. In the recursive case we have:

Rec(μ) There is a function b and for every α ∈ Σ an operation r_α such that for every meaningful expression s,

μ(s) = b(s) if s is atomic
μ(s) = r_α(μ(u₁), ..., μ(uₙ), u₁, ..., uₙ) if s = α(u₁, ..., uₙ)

For μ to be recursive, the basic function b and the meaning composition operations r_α must themselves be recursive, but this is not required in the function version of compositionality. In the other direction, the presence of the expressions u₁, ..., uₙ themselves as arguments to r_α has the effect that the compositional substitution laws need not hold (cf. Janssen 1997). If we drop the recursiveness requirement on b and r_α, Rec(μ) becomes vacuous. This is because r_α(m₁, ..., mₙ, u₁, ..., uₙ) can simply be defined to be μ(α(u₁, ..., uₙ)) whenever mᵢ = μ(uᵢ) for all i and α(u₁, ..., uₙ) is meaningful (and undefined otherwise). Since intersubstitution of synonymous but distinct expressions changes at least one argument of r_α, no counterexample is possible.

3.3. Weaker versions

Basic (first-level) compositionality takes the meaning of a complex expression to be determined by the meanings of the immediate subexpressions and the top-level syntactic operation. We get a weaker version - second-level compositionality - if we require only that the operations of the two highest levels, together with the meanings of expressions at the second level, determine the meaning of the whole complex expression. A possible example comes from constructions with quantified noun phrases where the meanings of both the determiner and the restricting noun - i.e. two levels below the head of the construction in question - are needed for semantic composition, a situation that may occur with possessives and some reciprocals. In Peters & Westerståhl (2006, ch. 7) and in Westerståhl (2008) it is argued that, in general, the corresponding semantics is second-level but not (first-level) compositional. Third-level compositionality is defined analogously, and is weaker still.

In the extreme case we have bottom-level, or weak functional compositionality, if the meaning of the complex term is determined only by the meanings of its atomic constituents and the entire syntactic construction (i.e. the derived operation that is extracted from a complex expression by knocking out the atomic constituents). A function version of this becomes somewhat cumbersome (but see Hodges 2001, sect. 5), whereas the substitution version becomes simply:

AtSubst(≡_μ) Just like Subst(≡_μ) except that the uᵢ and tᵢ are all atomic.

Although weak compositionality is not completely trivial (a language could lack the property), it does not serve the language users very well: the meaning operation r_α that corresponds to a complex syntactic operation α cannot be predicted from its build-up out of simpler syntactic operations and their corresponding meaning operations. Hence, there will be infinitely many complex syntactic operations whose semantic significance must be learned one by one.

It may be noted here that terminology concerning compositionality is somewhat fluctuating. David Dowty (2007) calls (an approximate version of) weak functional compositionality Frege's Principle, and refers to Funct(μ) as homomorphism compositionality, or strictly local compositionality, or context-free semantics. In Larson & Segal (1995), this
is called strong compositionality. The labels second-level compositionality, third-level, etc. are not standard in the literature but seem appropriate.

3.4. Stronger versions

We get stronger versions of compositionality by enlarging the domain of the semantic function, or by placing additional restrictions on meaningfulness or on meaning composition operations.

An example of the first is Zoltán Szabó's (2000) idea that the same meaning operations define semantic functions in all possible human languages, not just for all sentences in each language taken by itself. That is, whenever two languages have the same syntactic operation, they also associate the same meaning operation with it.

An example of the second option is what Wilfrid Hodges has called the Husserl property (going back to ideas in Husserl 1900):

(Huss) Synonymous expressions belong to the same (semantic) category.

Here the notion of category is defined in terms of substitution; say that u ∼_μ t if, for every s in E, s[u] ∈ dom(μ) iff s[t] ∈ dom(μ). So (Huss) says that synonymous terms can be intersubstituted without loss of meaningfulness. This is often a reasonable requirement (though Hodges 2001 mentions some putative counter-examples). (Huss) also has the consequence that Subst(≡_μ) can be simplified to Subst₁(≡_μ), which only deals with replacing one subexpression by another. Then one can replace n subexpressions by applying Subst₁(≡_μ) n times; (Huss) guarantees that all the 'intermediate' expressions are meaningful.

An example of the third kind is that of requiring the meaning composition operations to be computable. To make this more precise we need to impose more order on the meaning domain, viewing meanings too as given by an algebra M = (M, B, O), where B ⊆ M is a finite set of basic meanings, O is a finite set of elementary operations from n-tuples of meanings to meanings, and M is generated from B by means of the operations in O. This allows the definition of meaning operations by recursion over M. The semantic function μ is then defined simultaneously by recursion over syntax and by recursion over the meaning domain. Assuming that the elementary meaning operations are computable in a sense relevant to cognition, the semantic function itself is computable. A further step in this direction is to require that the meaning operations be easy to compute, thereby reducing or minimizing the complexity of semantic interpretation. For instance, meaning operations that are either elementary or else formed from elementary operations by function composition and function application would be of this kind (cf. Pagin 2011 for work in this direction).

Another strengthening, also introduced in Hodges (2001), concerns Frege's so-called Context Principle. A famous but cryptic saying by Frege (1884, x) is: "Never ask for the meaning of a word in isolation, but only in the context of a sentence". This principle has been much discussed in the literature (for example, Dummett 1973, Dummett 1981, Janssen 2001, Pelletier 2001), and sometimes taken to conflict with compositionality. However, if not seen as saying that words somehow lose their meaning in isolation, it can be taken as a constraint on meanings, in the form of what we might call the Contribution Principle:
(CP) The meaning of an expression is the contribution it makes to the meanings of complex expressions of which it is a part.

This is vague, but Hodges notes that it can be made precise with an additional requirement on the synonymy ≡_μ. Assume (Huss), and consider:

InvSubst₁(≡_μ) If u ≢_μ t, there is an expression s such that either exactly one of s[u] and s[t] is meaningful, or both are and s[u] ≢_μ s[t].

So if two expressions of the same category are such that no complex expression of which the first is a part changes meaning when the first is replaced by the second, they are synonymous. That is, if they make the same contribution to all such complex expressions, their meanings cannot be distinguished. This can be taken as one half of (CP), and compositionality in the form of Subst₁(≡_μ) as the other.

Remark: Hodges' main application of these notions is to what has become known as the extension problem: given a partial compositional semantics μ, under what circumstances can μ be extended to a larger fragment of the language? Here (CP) can be used as a requirement, so that the meaning of a new word w, say, must respect the (old) meanings of complex expressions of which w is a part. This is especially suited to situations when all new items are parts of expressions that already have meanings (cofinality). Hodges defines a corresponding notion of fregean extension of μ, and shows that in the situation just mentioned, and given that μ satisfies (Huss), a unique fregean extension always exists. Another version of the extension problem is solved in Westerståhl (2004). An abstract account of compositional extension issues is given in Fernando (2005). End of remark.

We can take a step further in this direction by requiring that replacement of expressions by expressions with different meanings always changes meaning:

InvSubst(≡_μ) If for some i, 1 ≤ i ≤ n, uᵢ ≢_μ tᵢ, then for every expression s, either exactly one of s[u₁, ..., uₙ] and s[t₁, ..., tₙ] is meaningful, or both are and s[u₁, ..., uₙ] ≢_μ s[t₁, ..., tₙ].

This disallows synonymy between complex expressions transformable into each other by substitution of constituents at least some of which are non-synonymous, but it does allow synonymous expressions with different structure. Carnap's principle of synonymy as intensional isomorphism forbids this, too. With the concept of intension from possible-worlds semantics it can be stated as

(RC) t ≡_μ u iff (i) t, u are atomic and co-intensional, or (ii) for some α, t = α(t₁, ..., tₙ), u = α(u₁, ..., uₙ), and tᵢ ≡_μ uᵢ, 1 ≤ i ≤ n.

(RC) entails both Subst(≡_μ) and InvSubst(≡_μ), but is very restrictive. It disallows synonymy between brother and male sibling as well as between John loves Susan and Susan is loved by John, and allows different expressions to be synonymous only if they differ at most in being transformed from each other by substitution of synonymous atomic expressions.
(RC) seems too strong. We get an intermediate requirement as follows. First, define μ-congruence, ≅_μ, in the following way:

(≅_μ) t ≅_μ u iff (i) t or u is atomic, t ≡_μ u, and neither is a constituent of the other, or (ii) t = α(t₁, ..., tₙ), u = β(u₁, ..., uₙ), tᵢ ≅_μ uᵢ, 1 ≤ i ≤ n, and for all s₁, ..., sₙ, α(s₁, ..., sₙ) ≡_μ β(s₁, ..., sₙ), if either is defined.

Then require synonymous expressions to be congruent:

(Cong) If t ≡_μ u, then t ≅_μ u.

By (Cong), synonymous expressions cannot differ much syntactically, but they may differ in the two crucial respects forbidden by (RC). (Cong) does not hold for natural language if logically equivalent sentences are taken as synonymous. That it holds otherwise remains a conjecture (but see Johnson 2006).

It follows from (Cong) that meanings are (or can be represented as) structured entities: entities uniquely determined by how they are built, i.e. entities from which constituents can be extracted. We then have projection operations:

(Rev) For every meaning operation r there are projection operations s_{r,i} such that s_{r,i}(r(m₁, ..., mₙ)) = mᵢ.

Together with the fact that the operations r are meaning operations for a compositional semantic function μ, (Rev) has semantic consequences, the main one being a kind of inverse functional compositionality:

InvFunct(μ) The syntactic expression of a complex meaning m is determined, up to μ-congruence, by the composition of m and the syntactic expressions of its parts.

For the philosophical significance of inverse compositionality, see sections 4.6 and 5.2 below. For (≅_μ), (Cong), InvFunct(μ), and a proof that (Rev) is a consequence of (Cong) (really of the equivalent statement that the meaning algebra is a free algebra), see Pagin (2003a). (Rev) seems to be what Jerry Fodor understands by 'reverse compositionality' in e.g. Fodor (2000: 371).

3.5. Direct and indirect compositionality

Pauline Jacobson (2002) distinguishes between direct and indirect compositionality, as well as between strong direct and weak direct compositionality. This concerns how the analysis tree of an expression maps onto the expression itself, an issue we have avoided here, for simplicity. Informally, in strong direct compositionality, a complex expression t is built up from sub-expressions (corresponding to subtrees of the analysis tree for t) simply by means of concatenation. In weak direct compositionality, one expression may
wrap around another (as call up wraps around him in call him up). In indirect compositionality, there is no such simple correspondence between the composition of analysis trees and elementary operations on strings. Even under our assumption that each expression has a unique analysis, our notion of compositionality here is indirect in the above sense: syntactic operations may delete strings, reorder strings, make substitutions and add new elements.

Strictly speaking, however, the direct/indirect distinction is not a distinction between kinds of semantics, but between kinds of syntax. Still, discussion of it tends to focus on the role of compositionality in linguistics, e.g. whether to let the choice of syntactic theory be guided by compositionality (cf. Dowty 2007 and Kracht 2007; for discussion of the general significance of the distinction, see Barker & Jacobson 2007).

3.6. Compositionality for "interpreted languages"

Some linguists, among them Jacobson, tend to think of grammar rules as applying to signs, where a sign is a triple (e, k, m) consisting of a string, a syntactic category, and a meaning. This is formalized by Marcus Kracht (see 2003, 2007), who defines an interpreted language to be a set L of signs in this sense, and a grammar G as a set of partial functions from signs to signs, such that L is generated by the functions in G from a subset of atomic (lexical) signs. Thus, a meaning assignment is built into the language, and grammar rules are taken to apply to meanings as well. This looks like a potential strengthening of our notion of grammar, but is not really used that way, partly because the grammar is taken to operate independently (though in parallel) at each of the three levels.

Let p₁, p₂, and p₃ be the projection functions on triples yielding their first, second, and third elements, respectively. Kracht calls a grammar compositional if for each n-ary grammar rule α there are three operations r_{α,1}, r_{α,2}, and r_{α,3} such that for all signs σ₁, ..., σₙ for which α is defined,

α(σ₁, ..., σₙ) = ⟨r_{α,1}(p₁(σ₁), ..., p₁(σₙ)), r_{α,2}(p₂(σ₁), ..., p₂(σₙ)), r_{α,3}(p₃(σ₁), ..., p₃(σₙ))⟩

and moreover α(σ₁, ..., σₙ) is defined if and only if each r_{α,i} is defined for the corresponding projections.

In a sense, however, this is not really a variant of compositionality but rather another way to organize grammars and semantics. This is indicated by (4) and (5) below, which are not hard to verify. First, call G strict if α(σ₁, ..., σₙ) defined and p₁(σᵢ) = p₁(τᵢ) for 1 ≤ i ≤ n entails α(τ₁, ..., τₙ) defined, and similarly for the other projections. All compositional grammars are strict.

(4) Every grammar G in Kracht's sense for an interpreted language L is a grammar (E, A, Σ) in the sense of section 2 (with E = L, A = the set of atomic signs in L, and Σ = the set of partial functions of G). Provided G is strict, G is compositional (in Kracht's sense) iff each of p₁, p₂, and p₃, seen as assignments of values to signs (so p₃ is the meaning assignment), is compositional (in our sense).

(5) Conversely, if E = (E, A, Σ) is a grammar and μ a semantics for E, let L = {⟨u, u, μ(u)⟩ : u ∈ dom(μ)}. Define a grammar G for L (with the obvious atomic signs) by letting α(⟨u₁, u₁, μ(u₁)⟩, ..., ⟨uₙ, uₙ, μ(uₙ)⟩) = ⟨α(u₁, ..., uₙ), α(u₁, ..., uₙ), μ(α(u₁, ..., uₙ))⟩ whenever α ∈ Σ is defined for u₁, ..., uₙ and α(u₁, ..., uₙ) ∈ dom(μ) (undefined otherwise). Provided μ is closed under subexpressions and has the Husserl property, μ is compositional iff G is compositional.
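A minimal sketch (ours, with made-up component operations) of a Kracht-style compositional rule may help: the rule acts on sign triples componentwise, and undefinedness at a component makes the whole rule undefined. For brevity, only the category component can actually fail here.

```python
# Minimal sketch (ours) of a Kracht-style compositional sign grammar:
# signs are triples (string, category, meaning); a rule alpha acts
# componentwise via r_a1 (strings), r_a2 (categories), r_a3 (meanings).

def r_a1(e1, e2):                     # string component: concatenation
    return f"{e1} {e2}"

def r_a2(k1, k2):                     # category component: partial!
    return "S" if (k1, k2) == ("NP", "VP") else None

def r_a3(m1, m2):                     # meaning component: application
    return m1(m2)

def alpha(sign1, sign2):
    (e1, k1, m1), (e2, k2, m2) = sign1, sign2
    k = r_a2(k1, k2)
    if k is None:                     # alpha undefined if a component is
        return None
    return (r_a1(e1, e2), k, r_a3(m1, m2))

np = ("most dogs", "NP", lambda bark_set: len(bark_set) >= 2)
vp = ("bark", "VP", {"d1", "d2"})
print(alpha(np, vp))                  # ('most dogs bark', 'S', True)
```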
3.7. Context dependence

In standard possible-worlds semantics the role of meanings is served by intensions: functions from possible worlds to extensions. For instance, the intension of a sentence returns a truth value, when the argument is a world for which the function is defined. Montague (1968) extended this idea to include not just worlds but arbitrary indices i from some set I, as ordered tuples of contextual factors relevant to semantic evaluation. Speaker, time, and place of utterance are typical elements in such indices. The semantic function μ then assigns a meaning μ(t) to an expression t, which is itself a function such that for an index i ∈ I, μ(t)(i) gives an extension as value. Kaplan's (1989) two-level version of this first assigns a function (character) to t taking certain parts of the index (the context, typically including the speaker) to a content, which is in turn a function from selected parts of the index to extensions. In both versions, the usual concept of compositionality straightforwardly applies.

The situation gets more complicated when semantic functions themselves take contextual arguments, e.g. if a meaning-in-context for an expression t in context c is given as μ(t, c). The reason for such a change might be the view that the contextual meanings are contents in their own right, not just extensional fall-outs of the standing, context-independent meaning. But with context as an additional argument we have a new source of variation. The most natural extension of compositionality to this format is given by

C-Funct(μ) For every rule α ∈ Σ there is a meaning operation r_α such that for every context c, if α(u₁, ..., uₙ) has meaning in c, then μ(α(u₁, ..., uₙ), c) = r_α(μ(u₁, c), ..., μ(uₙ, c)).

C-Funct(μ) seems like a straightforward extension of compositionality to a contextual semantics, but it can fail in a way non-contextual semantics cannot, by a context-shift failure. For we can suppose that although μ(uᵢ, c) = μ(uᵢ, c′), 1 ≤ i ≤ n, we still have μ(α(u₁, ..., uₙ), c) ≠ μ(α(u₁, ..., uₙ), c′). One might see this as a possible result of so-called unarticulated constituents. Maybe the meaning of the sentence

(6) It rains.

is sensitive to the location of utterance, while none of the constituents of that sentence (say, it and rains) is sensitive to location. Then the contextual meaning of the sentence at a location l is different from the contextual meaning of the sentence at another location l′, even though there is no such difference in contextual meaning for any of the parts. This may hold even if substitution of expressions is compositional.
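The shape of such a context-shift failure, and the way the weaker principle introduced next repairs it, can be seen in a small sketch (ours; the meanings are crude stand-ins). The whole sentence's contextual meaning carries a location drawn from c, though no part's meaning does, so no operation on part-meanings alone can compute it; an operation that also sees c can.

```python
# Sketch (ours) of a context-shift failure for C-Funct(mu), using (6).
# Contexts are dicts; only the whole sentence is location-sensitive.

def mu(expr, c):
    if expr == "it":
        return "EXPLETIVE"                 # same in every context
    if expr == "rains":
        return "RAIN"                      # same in every context
    if expr == "it rains":
        # Location enters as an "unarticulated constituent":
        return ("RAIN", c["location"])
    return None

c1, c2 = {"location": "Paris"}, {"location": "Oslo"}

# The parts agree across c1 and c2 ...
assert mu("it", c1) == mu("it", c2) and mu("rains", c1) == mu("rains", c2)
# ... but the whole does not, so C-Funct(mu) fails:
assert mu("it rains", c1) != mu("it rains", c2)

# An operation with the extra context argument (the variant defined
# next in the text) accommodates the example:
def r_sigma(m_it, m_rains, c):             # hypothetical name
    return (m_rains, c["location"])

assert r_sigma(mu("it", c1), mu("rains", c1), c1) == mu("it rains", c1)
print("context-shift example checks out")
```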
There is therefore room for a weaker principle that cannot fail in this way, where the meaning operation itself takes a context argument:

C-Funct(μ)_r: For every rule α ∈ Σ there is a meaning operation r_α such that for every context c, if α(u₁, …, uₙ) has meaning in c, then μ(α(u₁, …, uₙ), c) = r_α(μ(u₁, c), …, μ(uₙ, c), c).

The only difference is the last argument of r_α. Because of this argument, C-Funct(μ)_r is not sensitive to the counterexample above, and is more similar to non-contextual compositionality in this respect. This kind of semantic framework is discussed in Pagin (2005); a general format, and properties of the various notions of compositionality that arise, are presented in Westerståhl (2011). For example, it can be shown that (weak) compositionality for contextual meaning entails compositionality for the corresponding standing meaning, but the converse does not hold.

So far, we have dealt with extra-linguistic context, but one can also extend compositional semantics to dependence on linguistic context. The semantic value of some particular occurrence of an expression may then depend on whether it is an occurrence in, say, an extensional context, or an intensional context, or a hyperintensional context, a quotation context, or yet something else. A framework for such a semantics needs a set C of context types, an initial null context type 0 ∈ C for unembedded occurrences, and a binary function ψ from context types and syntactic operators to context types. If α(t₁, …, tₙ) occurs in context type c, then t₁, …, tₙ will occur in context type ψ(c, α). The context type for a particular occurrence of an expression u in a host expression t is then determined by its immediately embedding operator, by that operator's immediately embedding operator, and so on up to the topmost operator occurrence. The semantic function μ takes an expression t and a context type c into a semantic value. The only thing that will differ for linguistic context from C-Funct(μ)_r above is that the context of the subexpressions may be different (according to the function ψ) from the context of the containing expression:

LC-Funct(μ)_r: For every α ∈ Σ there is an operation r_α such that for every context c, if α(u₁, …, uₙ) has meaning in c, then μ(α(u₁, …, uₙ), c) = r_α(μ(u₁, c′), …, μ(uₙ, c′), c), where c′ = ψ(c, α).

4. Arguments in favor of compositionality

4.1. Learnability

Perhaps the most common argument for compositionality is the argument from learnability: A natural language has infinitely many meaningful sentences. It is impossible for a human speaker to learn the meaning of each sentence one by one. Rather, it must be
possible for a speaker to learn the entire language by learning the meaning of a finite number of expressions, and a finite number of construction forms. For this to be possible, the language must have a compositional semantics.

The argument was to some extent anticipated already in Sanskrit philosophy of language. During the first or second century BCE Patanjali writes:

... Brhaspati addressed Indra during a thousand divine years going over the grammatical expressions by speaking each particular word, and still he did not attain the end. ... But then how are grammatical expressions understood? Some work containing general and particular rules has to be composed ... (Cf. Staal 1969: 501-502. Thanks to Brendan Gillon for the reference.)

A modern classical passage plausibly interpreted along these lines is due to Donald Davidson:

It is conceded by most philosophers of language, and recently by some linguists, that a satisfactory theory of meaning must give an account of how the meanings of sentences depend upon the meanings of words. Unless such an account could be supplied for a particular language, it is argued, there would be no explaining the fact that we can learn the language: no explaining the fact that, on mastering a finite vocabulary and a finite set of rules, we are prepared to produce and understand any of a potential infinitude of sentences. I do not dispute these vague claims, in which I sense more than a kernel of truth. Instead I want to ask what it is for a theory to give an account of the kind adumbrated. (Davidson 1967: 17)

Properly spelled out, the problem is not that of learning the meaning of infinitely many meaningful sentences (given that one has command of a syntax), for if I learn that they all mean that snow is white, I have already accomplished the task. Rather, the problem is that there are infinitely many propositions that are each expressed by some sentence in the language (with contextual parameters fixed), and hence infinitely many equivalence classes of synonymous sentences.

Still, as an argument for compositionality, the learnability argument has two main weaknesses. First, the premise that there are infinitely many sentences that have a determinate meaning although they have never been used by any speaker is a very strong premise, in need of justification. That is, at a given time t₀, it may be that the speaker or speakers employ a semantic function μ defined for infinitely many sentences, or it may be that they employ an alternative function μ₀ which agrees with μ on all sentences that have in fact been used but is simply undefined for all that have not been used. On the alternative hypothesis, when using a new sentence s, the speaker or the community gives some meaning to s, thereby extending μ₀ to μ₁, and so on. Phenomenologically, of course, the new sentence seemed to the speakers to come already equipped with meaning, but that was just an illusion. On this alternative hypothesis, there is no infinite semantics to be learned. To argue that there is a learnability problem, we must first justify the premise that we employ an infinite semantic function. This cannot be justified by induction, for we cannot infer from finding sentences meaningful that they were meaningful before we found them, and exactly that would have to be the induction base.
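The alternative hypothesis can be pictured mechanically: a finite meaning assignment that is extended only at the moment a new sentence is actually used. A toy Python sketch (entirely our own, purely illustrative):

```python
# A finite semantics extended on use: new sentences *get* a meaning when
# used, rather than having one tracked in advance.

mu = {"snow is white": "SNOW-IS-WHITE"}      # meanings assigned so far (mu0)

def interpret(sentence, stipulate):
    """On the alternative hypothesis, a new sentence is assigned a meaning,
    extending mu0 to mu1, and so on."""
    if sentence not in mu:
        mu[sentence] = stipulate(sentence)
    return mu[sentence]

print(interpret("grass is green", lambda s: s.upper().replace(" ", "-")))
# GRASS-IS-GREEN: assigned at the moment of use; no infinite semantics learned
```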
The second weakness is that even with the infinity premise in place, the conclusion of the argument would be that the semantics must be computable, but computability does not entail compositionality, as we have seen.

4.2. Novelty

Closely related to the learnability argument is the argument from novelty: speakers are able to understand sentences they have never heard before, which is possible only if the language is compositional. When the argument is interpreted so that, as in the learnability argument, we need to explain how speakers reliably track the semantics, i.e. assign to new sentences the meaning that they independently have, then the argument from novelty shares the two main weaknesses with the learnability argument.

4.3. Productivity

According to the pure argument from productivity, we need an explanation of why we are able to produce infinitely many meaningful sentences, and compositionality offers the best explanation. Classically, productivity is appealed to by Noam Chomsky as an argument for generative grammar. One of the passages runs:

The most striking aspect of linguistic competence is what we may call the 'creativity of language', that is, the speaker's ability to produce new sentences that are immediately understood by other speakers although they bear no physical resemblance to sentences that are 'familiar'. The fundamental importance of this creative aspect of normal language use has been recognized since the seventeenth century at least, and it was the core of Humboldtian general linguistics. (Chomsky 1971: 74)

This passage does not appeal to pure productivity, since it makes an appeal to the understanding by other speakers (cf. Chomsky 1980: 76-78). The pure productivity aspect has been emphasized by Fodor (e.g. 1987: 147-148), i.e. that natural language can express an open-ended set of propositions.

However, the pure productivity argument is very weak. On the premise that a human speaker can think indefinitely many propositions, all that is needed is to assign those propositions to sentences. The assignment does not have to be systematic in any way, and all the syntax that is needed for the infinity itself is simple concatenation. Unless the assignment is to meet certain conditions, productivity requires nothing more than the combination of infinitely many propositions and infinitely many expressions.

4.4. Systematicity

A related argument by Fodor (1987: 147-150) is that of systematicity. It can be stated either as a property of speaker understanding or as an expressive property of a language. Fodor tends to favor the former (since he is ultimately concerned with the mental). In the simplest case, Fodor points out that if a language user understands a sentence of the
form tRu, she will also understand the corresponding sentence uRt, and argues that this is best explained by appeal to compositionality. Formally, the argument is to be generalized to cover the understanding of any new sentence that is formed by recombination of constituents that occur, and construction forms that are used, in sentences already understood. Hence, in this form it reduces to one of three different arguments: either to the argument from novelty, or to the productivity argument, or finally, to the argument from intersubjectivity (below), and only spells out a bit the already familiar idea of old parts in new combinations.

It might be taken to add an element, for it not only aims at explaining the understanding of new sentences that is in fact manifested, but also predicts what new sentences will be understood. However, Fodor himself points out the problem with this aspect, for if there is a sentence s formed by a recombination that we do not find meaningful, we will not take it as a limitation of the systematicity of our understanding, but as revealing that the sentence s is not in fact meaningful, and hence that there is nothing to understand. Hence, we cannot come to any other conclusion than that the systematicity of our understanding is maximal.

The systematicity argument can alternatively be understood as concerning natural language itself, namely as the argument that sentences formed by grammatical recombination are meaningful. It is debatable to what extent this really holds, and sentences (or so-called sentences) like Chomsky's Colorless green ideas sleep furiously have been used to argue that not all grammatical sentences are meaningful. But even if we were to find meaningful all sentences that we find grammatical, this does not in itself show that compositionality, or any kind of systematic semantics, is needed for explaining it. If it is only a matter of assigning some meaning or other, without any further condition, it would be enough that we can think new thoughts and have a disposition to assign them to new sentences.

4.5. Induction on synonymy

We can observe that our synonymy intuitions conform to Subst(=). In case after case, we find the result of substitution synonymous with the original expression, if the new part is taken as synonymous with the old. This forms the basis of an inductive generalization that such substitutions are always meaning preserving. In contrast to the argument from novelty, where the idea of tracking the semantics is central, this induction argument may concern our habits of assigning meaning to, or reading meaning into, new sentences: we tend to do it compositionally.

There is nothing wrong with this argument, as far as it goes, beyond what is in general problematic with induction. It should only be noted that the conclusion is weak. Typically, arguments for compositionality aim at the conclusion that there is a systematic pattern to the assignment of meaning to new sentences, and that the meaning of new sentences can be computed somehow. This is not the case in the induction argument, for the conclusion is compatible with the possibility that substitutivity is the only systematic feature of the semantics. That is, assignment of meaning to new sentences may be completely random, except for respecting substitutivity.
If the substitutivity version of compositionality holds, then (under DP) so does the function version, but the semantic function need not be computable, and need not even be finitely specifiable. So, although the argument may
be empirically sound, it does not establish what arguments for compositionality usually aim at.

4.6. Intersubjectivity and communication

The problems with the idea of tracking semantics when interpreting new sentences can be eliminated by bringing in intersubjective agreement in interpretation. For by our common sense standards of judging whether we understand sentences the same way or not, there is overwhelming evidence (e.g. from discussing broadcast news reports) that in an overwhelming proportion of cases, speakers of the same language interpret new sentences similarly. This convergence of interpretation, far above chance, does not presuppose that the sentences heard were meaningful before they were used. The phenomenon needs an explanation, and it is reasonable to suppose that the explanation involves the hypothesis that the meanings of the sentences are computable, and so it isn't left to guesswork or mere intuition what the new sentences mean.

The appeal to intersubjectivity disposes of an unjustified presupposition about semantics, but two problems remain. First, when encountering new sentences, these are almost invariably produced by a speaker, and the speaker has intended to convey something by the sentence; but the speaker hasn't interpreted the sentence, but fitted it to an antecedent thought. Secondly, we have an argument for computability, but not for compositionality.

The first observation indicates that it is at bottom the success rate of linguistic communication with new sentences that gives us a reason for believing that sentences are systematically mapped on meanings. This was the point of view in Frege's famous passage from the opening of 'Compound Thoughts':

It is astonishing what language can do. With a few syllables it can express an incalculable number of thoughts, so that even a thought grasped by a terrestrial being for the very first time can be put into a form of words which will be understood by someone to whom the thought is entirely new. This would be impossible, were we not able to distinguish parts in the thoughts corresponding to the parts of a sentence, so that the structure of the sentence serves as the image of the structure of the thought. (Frege 1923: 55)

As Frege depicts it here, the speaker is first entertaining a new thought, or proposition, finds a sentence for conveying that proposition to a hearer, and by means of that sentence the hearer comes to entertain the same proposition as the speaker started out with. Frege appeals to semantic structure for explaining how this is possible. He claims that the proposition has a structure that mirrors the structure of the sentence (so that the semantic relation may be an isomorphism), and goes on to claim that without this structural correspondence, communicative success with new propositions would not be possible.

It is natural to interpret Frege as expressing a view that entails that compositionality holds as a consequence of the isomorphism idea. The reason Frege went beyond compositionality (or homomorphism, which does not require a one-one relation) seems to be an intuitive appeal to symmetry: the speaker moves from proposition to sentence, while the hearer moves from sentence to proposition. An isomorphism is a one-one relation, so that each relatum uniquely determines the other.
Because of synonymy, a sentence that expresses a proposition in a particular language is typically not uniquely determined within that language by the proposition expressed. Still, we might want the speaker to be able to work out what expression to use, rather than searching around for suitable sentences by interpreting candidates one after the other. The inverse functional compositionality principle, InvFunct(μ), of section 3.4, offers such a method. Inverse compositionality is also connected with the idea of structured meanings, or thoughts, while compositionality by itself isn't, and so in this respect Frege is vindicated (these ideas are developed in Pagin 2003a).

4.7. Summing up

Although many share the feeling that there is "more than a kernel of truth" (cf. section 4.1) in the usual arguments for compositionality, some care is required to formulate and evaluate them. One must avoid question-begging presuppositions; for example, if a presupposition is that there is an infinity of propositions, the argument for that had better not be that standardly conceived natural or mental languages allow the generation of such an infinite set. Properly understood, the arguments can be seen as inferences to the best explanation, which is a respectable but somewhat problematic methodology. (One usually hasn't really tried many other explanations than the proposed one.)

Another important (and related) point is that virtually all arguments so far only justify the principle that the meaning is computable or recursive, and the principle that, up to certain syntactic variation, an expression of a proposition is computable from that proposition. Why should the semantics also be compositional, and possibly inversely compositional? One reason could be that compositional semantics, or at least certain simple forms of compositional semantics, is very simple, in the sense that a minimal number of processing steps are needed by the hearer for arriving at a full interpretation (or, for the speaker, a full expression, cf. Pagin 2011), but these issues of complexity need to be further explored.

5. Arguments against compositionality

Arguments against compositionality of natural language can be divided into four main categories:

a) arguments that certain constructions are counterexamples and make the principle false,
b) arguments that compositionality is an empirically vacuous, or alternatively trivially correct, principle,
c) arguments that compositional semantics is not needed to account for actual linguistic communication,
d) arguments that actual linguistic communication is not suited for compositional semantics.

The first category, that of counterexamples, will be treated in a separate section dealing with a number of problem cases. Here we shall discuss arguments in the last three categories.
5.1. Vacuity and triviality arguments

Vacuity. Some claims about the vacuity of compositionality in the literature are based on mathematical arguments. For example, Zadrozny (1994) shows that for every semantics μ there is a compositional semantics ν such that ν(t)(t) = μ(t) for every expression t, and uses this fact to draw a conclusion of that kind. But note that the mathematical fact is itself trivial: let ν(t) = μ for each t and the result is immediate from (2) in section 3.1 above (other parts of Zadrozny's results use non-wellfounded sets and are less trivial). Claims like these tend to have the form: for any semantics μ there is a compositional semantics ν from which μ can be easily recovered. But this too is completely trivial as it stands: if we let ν(t) = ⟨μ(t), t⟩, ν is 1-1, hence compositional by (3) in section 3.1, and μ is clearly recoverable from ν. In general, it is not enough that the old semantics can be computed from the new compositional semantics: for the new semantics to have any interest it must agree with the old one in some suitable sense. As far as we know there are no mathematical results showing that such a compositional alternative can always be found (see Westerståhl 1998 for further discussion).

Triviality. Paul Horwich (e.g. in 1998) has argued that compositionality is not a substantial property of a semantics, but is trivially true. He exemplifies with the sentence dogs bark, and says (1998: 156-157) that the meaning property

(7) x means DOGS BARK

consists in the so-called construction property

(8) x results from putting expressions whose meanings are DOG and BARK, in that order, into a schema whose meaning is NS V.

As far as it goes, the compositionality of the resulting semantics is a trivial consequence of Horwich's conception of meaning properties. Horwich's view here is equivalent to Carnap's conception of synonymy as intensional isomorphism. Neither allows that an expression with different structure or composed from parts with different meanings could be synonymous with an expression that means DOGS BARK. However, for supporting the conclusion that compositionality is trivial, these synonymy conditions must themselves hold trivially, and that is simply not the case.
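The pairing construction mentioned under Vacuity is easy to write out, which is itself a way of seeing how little it shows. A minimal Python sketch (toy semantics, our own encoding):

```python
# Trivial 'compositionalization': pairing each expression with its old
# meaning yields an injective, hence compositional, semantics from which
# the old one is trivially recoverable.

mu = {"dog": "DOG", "cur": "DOG", "dogs bark": "DOGS BARK"}  # old semantics

def nu(t):
    return (mu[t], t)        # 1-1: distinct expressions get distinct values

def recover(value):
    meaning, _expr = value
    return meaning           # the old semantics, read off the first component

assert nu("dog") != nu("cur")                     # injective despite synonymy in mu
assert recover(nu("dogs bark")) == mu["dogs bark"]
```

The new semantics ν is compositional only because it is injective; it agrees with μ in no interesting sense, which is exactly the point made above.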
5.2. Superfluity arguments

Mental processing. Stephen Schiffer (1987) has argued that compositional semantics, and public language semantics altogether, is superfluous in the account of linguistic communication. All that is needed is to account for how the hearer maps his mental representation of an uttered sentence on a mental representation of meaning, and that is a matter of a syntactic transformation, i.e. a translation, rather than interpretation. In Schiffer's example (1987: 192-200), the hearer Harvey is to infer from his belief that

(9) Carmen uttered the sentence 'Some snow is white',

the conclusion that

(10) Carmen said that some snow is white.

Schiffer argues that this can be achieved by means of transformations between sentences in Harvey's neural language M. M contains a counterpart α to (9), such that α gets tokened in Harvey's so-called belief box when he has the belief expressed by (9). By an inner mechanism the tokening of α leads to the tokening of β, which is Harvey's M counterpart to (10). For this to be possible for any sentence of the language in question, Harvey needs a translation mechanism that implements a recursive translation function f from sentence representations to meaning representations. Once such a mechanism is in place, we have all we need for the account, according to Schiffer.

The problem with the argument is that the translation function f by itself tells us nothing about communicative success. By itself it just correlates neural sentences of which we know nothing except for their internal correlation. We need another recursive function g that maps the uttered sentence Some snow is white on α, and a third recursive function h that maps β on the proposition that some snow is white, in order to have a complete account. But then the composed function h(f(g(...))) seems to be a recursive function that maps sentences on meanings (cf. Pagin 2003b).

Pragmatic composition. According to François Recanati (2004), word meanings are put together in a process of pragmatic composition. That is, the hearer takes word meanings, syntax and contextual features as his input, and forms the interpretation that best corresponds to them. As a consequence, semantic compositionality is not needed for interpretation to take place.

A main motivation for Recanati's view is the ubiquity of those pragmatic operations that Recanati calls modulations, and which intuitively contribute to "what is said", i.e. to communicated content before any conversational implicatures. (Under varying terms and conceptions, these phenomena have been described e.g. by Sperber & Wilson 1995, Bach 1994, Carston 2002 and by Recanati himself.) To take an example from Recanati, in reply to an offer of something to eat, the speaker says

(11) I have had breakfast.

thereby saying that she has had breakfast in the morning of the day of utterance, which involves a modulation of the more specific kind Recanati calls free enrichment, and implicating by means of what she says that she is not hungry.

On Recanati's view, communicated contents are always or virtually always pragmatically modulated. Moreover, modulations in general do not operate on a complete semantically derived proposition, but on conceptual constituents. For instance, in (11) it is the property of having breakfast that is modulated into having breakfast this day, not the proposition as a whole or even the property of having had breakfast. Hence, it seems that what the semantics delivers does not feed into the pragmatics.

However, if meanings, i.e. the outputs of the semantic function, are structured entities, in the sense specified by (Rev) and InvFunct(μ) of section 3.4, then the last objection is met, for then semantics is able to deliver the arguments to the pragmatic
operations, e.g. properties associated with VPs. Moreover, the modulations that are in fact made appear to be controlled by a given semantic structure: as in (11), the modulated part is of the same category and occupies the same slot in the overall structure as the semantically given argument that it replaces. This provides a reason for thinking that modulations operate on a given (syntactically induced) semantic structure, rather than on pragmatically composed material (this line of reasoning is elaborated in Pagin & Pelletier 2007).

5.3. Unsuitability arguments

According to a view that has come to be called radical contextualism, truth evaluable content is radically underdetermined by semantics, i.e. by literal meaning. That is, no matter how much a sentence is elaborated, something needs to be added to its semantic content in order to get a proposition that can be evaluated as true or false. Since there will always be indefinitely many different ways of adding, the proposition expressed by means of the sentence will vary from context to context. Well-known proponents of radical contextualism include John Searle (e.g. 1978), Charles Travis (e.g. 1985), and Sperber & Wilson (1995). A characteristic example from Charles Travis (1985: 197) is the sentence

(12) Smith weighs 80 kg.

Although it sounds determinate enough at first blush, Travis points out that it can be taken as true or as false in various contexts, depending on what counts as important in those contexts. For example, it can be further interpreted as being true in case Smith weighs

(12') a. 80 kg when stripped in the morning.
b. 80 kg when dressed normally after lunch.
c. 80 kg after being force fed 4 liters of water.
d. 80 kg four hours after having ingested a powerful diuretic.
e. 80 kg after lunch adorned in heavy outer clothing.

Although the importance of such examples is not to be denied, their significance for semantics is less clear. It is in the spirit of radical contextualism to minimize the contribution of semantics (literal meaning) for determining expressed content, and thereby the importance of compositionality. However, strictly speaking, the truth or falsity of the compositionality principle for natural language is orthogonal to the truth or falsity of radical contextualism. For whether the meaning of a sentence s is a proposition or not is irrelevant to the question whether that meaning is determined by the meanings of the constituents of s and their mode of composition. The meaning of s may be unimportant but still compositionally determined.

In an even more extreme version, the (semantic) meaning of a sentence s in a context c is what the speaker uses s to express in c. In that case meaning itself varies from context to context, and there is no such thing as an invariant literal meaning. Not even the extreme version need be in conflict with compositionality (extended to context dependence), since the substitution properties may hold within each context by itself. Context
shift failure, in the sense of section 3.7, may occur, if e.g. word meanings are invariant but the meanings of complex expressions vary between contexts.

It is a further question whether radical contextualism itself, in either version, is a plausible view. It appears that the examples of contextualism can be handled by other methods, e.g. by appeal to the pragmatic modulations mentioned in section 5.2 (cf. Pagin & Pelletier 2007), which does allow propositions to be semantically expressed. Hence, the case for radical contextualism is not as strong as it may prima facie appear. On top, radical contextualism tends to make a mystery out of communicative success.

6. Problem cases

A number of natural language constructions present apparent problems for compositional semantics. In this concluding section we shall briefly discuss a few of them, and mention some others.

6.1. Belief sentences

Belief sentences offer difficulties for compositional semantics, both real and merely apparent. At first blush, the case for a counterexample against compositionality seems very strong. For in the pair

(13) a. John believes that Fred is a child doctor.
b. John believes that Fred is a pediatrician.

(13a) may be true and (13b) false, despite the fact that child doctor and pediatrician are synonymous. If truth value is taken to depend only on meaning and on extra-semantic facts, and the extra-semantic facts as well as the meanings of the parts and the modes of composition are the same between the sentences, then the meaning of the sentences must nonetheless be different, and hence compositionality fails. This conclusion has been drawn by Jeff Pelletier (1994).

What would be the reason for this difference in truth value? When cases such as these come up, the reason is usually that there is some kind of discrepancy in the understanding of the attributee (John) between synonyms. John may e.g. erroneously believe that pediatrician only denotes a special kind of child doctor, and so would be disposed to assent to (13a) but dissent from (13b) (cf. Mates 1950 and Burge 1978; Mates took such cases as a reason to be skeptical about synonymy). This is not a decisive reason, however, since it is what the words mean in the sentences, e.g. depending on what the speaker means, that is relevant, not what the attributee means by those words. The speaker contributes the words and their meanings, and the attributee contributes his belief contents. If John's belief content matches the meaning of the embedded sentence Fred is a pediatrician, then (13b) is true as well, and the problem for compositionality is disposed of.

A problem still arises, however, if belief contents are more fine-grained than sentence meanings, and words in belief contexts are somehow tied to these finer differences in grain. For instance, as a number of authors have suggested, perhaps belief contents are propositions under modes of presentation (see e.g. Burdick 1982, Salmon 1986. Salmon, however, existentially quantifies over modes of presentation, which preserves substitutivity).
It may then be that different but synonymous expressions are associated with different modes of presentation. In our example, John may believe a certain proposition under a mode of presentation associated with child doctor but not under any mode of presentation associated with pediatrician, and that accounts for the change in truth value. In that case, however, there is good reason to say that the underlying form of a belief sentence such as (13a) is something like

(14) Bel(John, the proposition that Fred is a child doctor, M('Fred is a child doctor'))

where M(·) is a function from a sentence to a mode of presentation or a set of modes of presentation. In this form, the sentence Fred is a child doctor occurs both used and mentioned (quoted), and in its used occurrence, child doctor may be replaced by pediatrician without change of truth value. Failure of substitutivity is explained by the fact that the surface form fuses a used and a mentioned occurrence. In the underlying form, there is no problem for compositionality, unless caused by quotation. Of course, this analysis is not obviously the right one, but it is enough to show that the claim that compositionality fails for belief sentences is not so easy to establish.

6.2. Quotation

Often quotation is set aside for special treatment as an exception to ordinary semantics, which is supposed to concern used occurrences of expressions rather than mentioned ones. Sometimes, this is regarded as cheating, and quotation is proposed as a clear counterexample to compositionality: brother and male sibling are synonymous, but 'brother' and 'male sibling' are not (i.e. the expressions that include the opening and closing quotes). Since enclosing an expression in quotes is a syntactic operation, we have a counterexample.

If quoting is a genuine syntactic operation, the syntactic rules include a total unary operator κ such that, for any simple or complex expression t,

(15) κ(t) = 't'

The semantics of quoted expressions is given simply by

(Q) μ(κ(t)) = t

Then, since μ(t) = μ(u) does not imply t = u, substitution of u for t in κ(t) may violate compositionality.

However, such a non-compositional semantics for quotation can be transformed into a compositional one, by adapting Frege's view in (1892) that quotation provides a special context type in which expressions refer to themselves, and using the notion of linguistically context-dependent compositionality from section 3.7 above. We shall not give the details here, only indicate the main steps.

Start with a grammar E = (E, A, Σ) (for a fragment of English, say) and a compositional semantics μ for E. First, extend E to a grammar containing the quotation operator
κ, allowing not only quote-strings of the form 'John', 'likes', 'Mary', etc., but also things like John likes 'Mary' (meaning that he likes the name), whereas we disallow things like John 'likes' Mary or 'John likes' Mary as ungrammatical. Let E′ be the closure of E under the thus extended operations and κ, and let Σ′ = {α′ : α ∈ Σ} ∪ {κ}. Then we have a new grammar E′ = (E′, A, Σ′) that incorporates quotation.

Next, extend μ to a semantics μ′ for E′, using the semantic composition operations that exist by Funct(μ), and letting (Q) above take care of κ. As indicated, the semantics μ′ is not compositional: even if Mary is the same person as Sue, John likes 'Mary' doesn't mean the same as John likes 'Sue'. However, we can extend μ′ to a semantics μ″ for E′ which is compositional in the sense of LC-Funct(μ)_r in section 3.7.

In the simplest case, there are two context types: c_u, the use context type, which is the default type (the null context), and the quotation context type c_q. The function ψ from context types and operators to context types is given by

ψ(c, β) = c_q if β = κ, and ψ(c, β) = c if β ≠ κ,

for β ∈ Σ′ and c equal to c_u or c_q. μ″ is obtained by redefining the given composition operations in a fairly straightforward way, so that LC-Funct(μ″)_r is automatically ensured. μ″ then extends μ in the sense that if t ∈ E is meaningful, μ″(t, c_u) = μ(t), and furthermore μ″(κ(t), c_u) = μ″(t, c_q) = t. So μ″ is compositional in the contextually extended sense. That μ(t) = μ(u) holds does not license substitution of u for t in κ(t), since t there occurs in a quotation context, and we may have μ″(t, c_q) ≠ μ″(u, c_q). This approach is further developed in Pagin & Westerståhl (2009).

6.3. Idioms

Idioms are almost universally thought to constitute a problem for compositionality. For example, the VP kick the bucket can also mean 'die', but the semantic operation corresponding to the standard syntax of, say, fetch the bucket, giving its meaning in terms of the meanings of its immediate constituents fetch and the bucket, cannot be applied to give the idiomatic meaning of kick the bucket. This is no doubt a problem of some sort, but not necessarily for compositionality. First, that a particular semantic operation fails doesn't mean that no other operation works. Second, note that kick the bucket is ambiguous between its literal and its idiomatic meaning, but compositionality presupposes non-ambiguous meaning bearers. Unless we take the ambiguity itself to be a problem for compositionality (see the next subsection), we should first find a suitable way to disambiguate the phrase, and only then raise the issue of compositionality.

Such disambiguation may be achieved in various ways, as the sketch below illustrates. We could treat the whole phrase as a lexical item (an atom), in view of the fact that its meaning has to be learnt separately. Or, given that it does seem to have syntactic structure, we could treat it as formed by a different rule than the usual one. In neither case is it clear that compositionality would be a problem.
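Here is a minimal Python sketch of the first disambiguation strategy (toy lexicon and rule, entirely our own): the idiom is listed whole as an atomic entry, distinct from the homophonous VP built by the ordinary rule, so each meaning bearer is unambiguous:

```python
# Disambiguation by listing the idiom as an atom: the literal VP is built
# compositionally, the idiomatic one is a separate lexical entry.

LEXICON = {
    "kick": lambda obj: ("KICK", obj),       # ordinary transitive verb meaning
    "the bucket": "THE-BUCKET",
    "kick the bucket[idiom]": "DIE",         # listed whole, learnt separately
}

def vp(verb, obj):
    """Ordinary V + NP composition: apply the verb meaning to the object."""
    return LEXICON[verb](LEXICON[obj])

print(vp("kick", "the bucket"))              # ('KICK', 'THE-BUCKET'): literal
print(LEXICON["kick the bucket[idiom]"])     # 'DIE': idiomatic, atomic entry
```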
To see what idioms really have to do with compositionality, think of the following situation. Given a grammar and a compositional semantics for it, suppose we decide to give some already meaningful phrase a non-standard, idiomatic meaning. Can we then extend the given syntax (in particular, to disambiguate) and semantics in a natural way that preserves compositionality? Note that it is not just a matter of accounting for one particular phrase, but rather for all the phrases in which the idiom may occur. This requires an account of how the syntactic rules apply to the idiom, and to its parts if it has structure, as well as a corresponding semantic account.

But not all idioms behave the same. While the idiomatic kick the bucket is fine in John kicked the bucket yesterday, or Everyone kicks the bucket at some point, it is not good in

(16) The bucket was kicked by John yesterday.
(17) Andrew kicked the bucket a week ago, and two days later, Jane kicked it too.

By contrast, pull strings preserves its idiomatic meaning in passive form, and strings is available for anaphoric reference with the same meaning:

(18) Strings were pulled to secure Henry his position.
(19) Kim's family pulled some strings on her behalf, but they weren't enough to get her the job.

This suggests that these two idioms should be analyzed differently; indeed the latter kind is called "compositional" in Nunberg, Sag & Wasow (1994) (from which (19) is taken), and is analyzed there using the ordinary syntactic and semantic rules for phrases of this form but introducing instead idiomatic meanings of its parts (pull and strings), whereas kick the bucket is called "non-compositional". In principle, nothing prevents a semantics that deals differently with the two kinds of idioms from being compositional in our sense.

Incorporating idioms in syntax and semantics is an interesting task. For example, in addition to explaining the facts noted above one has to prevent kick the pail from meaning 'die' even if bucket and pail are synonymous, and likewise to prevent the idiomatic versions of pull and strings from combining illegitimately with other phrases. For an overview of the semantics of idioms, see Nunberg, Sag & Wasow (1994). Westerståhl (2002) is an abstract discussion of various ways to incorporate idioms while preserving compositionality.
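The contrast between the two kinds of idioms can be mimicked in a toy fragment (our own encoding, loosely in the style of Nunberg, Sag & Wasow's analysis): pull strings is built by the ordinary rule from idiomatic senses of its parts, which is what leaves strings available for passivization and anaphora:

```python
# 'Compositional' idioms: ordinary composition, idiomatic senses for the
# parts. (Toy senses; the encoding is our own.)

SENSES = {
    "pull":    {"literal": "PULL",    "idiomatic": "EXPLOIT"},
    "strings": {"literal": "STRINGS", "idiomatic": "CONNECTIONS"},
}

def vp(verb, obj, reading):
    # The same syntactic rule serves both readings; only the senses differ.
    return (SENSES[verb][reading], SENSES[obj][reading])

print(vp("pull", "strings", "literal"))    # ('PULL', 'STRINGS')
print(vp("pull", "strings", "idiomatic"))  # ('EXPLOIT', 'CONNECTIONS')
# Since 'strings' has a sense of its own on the idiomatic reading, it stays
# available for passives like (18) and anaphora like (19); the atomic
# treatment of 'kick the bucket' predicts neither.
```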
6.4. Ambiguity

Even though the usual formulation of compositionality requires non-ambiguous meaning bearers, the occurrence of ambiguity in language is usually not seen as a problem for compositionality. This is because lexical ambiguity seems easily dealt with by introducing different lexical items for different meanings of the same word, whereas structural ambiguity corresponds to different analyses of the same surface string.

However, it is possible to argue that even though there are clear cases of structural ambiguity in language, as in Old men and women were released first from the occupied building, in other cases the additional structure is just an ad hoc way to avoid ambiguity. In particular, quantifier scope ambiguities could be taken to be of this kind. For example, while semanticists since Montague have had no trouble inventing different underlying structures to account for the two readings of

(20) Every critic reviewed four films.

it may be argued that this sentence in fact has just one structural analysis, a simple constituent structure tree, and that meaning should be assigned to that one structure. A consequence is that meaning assignment is no longer functional, but relational, and hence compositionality either fails or is just not applicable. Pelletier (1999) draws precisely this conclusion.

But even if one agrees with such an account of the syntax of (20), abandonment of compositionality is not the only option. One possibility is to give up the idea that the meaning of (20) is a proposition, i.e. something with a truth value (in the actual world), and opt instead for underspecified meanings of some kind. Such meanings can be uniquely, and perhaps compositionally, assigned to simple structures like constituent structure trees, and one can suppose that some further process of interpretation of particular utterances leads to one of the possible specifications, depending on various circumstantial facts. This is a form of context-dependence, and we saw in section 3.7 how similar phenomena can be dealt with compositionally. What was there called standing meaning is one kind of underspecified meaning, represented as a function from indices to 'ordinary' meanings. In the present case, where several meanings are available, one might try to use the set of those meanings instead. A similar but more sophisticated way of dealing with quantifier scope is so-called Cooper storage (see Cooper 1983). It should be noted, however, that while such strategies restore a functional meaning assignment, the compositionality of the resulting semantics is by no means automatic; it is an issue that has to be addressed anew.

Another option might be to accept that meaning assignment becomes relational and attempt instead to reformulate compositionality for such semantics. Although this line has hardly been tried in the literature, it may be an option worth exploring (for some first attempts in this direction, see Westerståhl 2007).

6.5. Other problems

Other problems than those above, some with proposed solutions, include possessives (cf. Partee 1997; Peters & Westerståhl 2006), the context sensitive use of adjectives (cf. Lahav 1989; Szabó 2001; Reimer 2002), noun-noun compounds (cf. Weiskopf 2007), unless+quantifiers (cf. Higginbotham 1986; Pelletier 1994), any-embeddings (cf. Hintikka 1984), and indicative conditionals (e.g. Lewis 1976).

All in all, it seems that the issue of compositionality in natural language will remain live, important and controversial for a long time to come.
7. References

Abelard, Peter 2008. Logica 'Ingredientibus'. 3. Commentary on Aristotle's De Interpretatione. Corpus Christianorum Continuatio Mediaevalis. Turnhout: Brepols Publishers.
Ailly, Peter of 1980. Concepts and Insolubles. Dordrecht: Reidel. Originally published as Conceptus et Insolubilia, Paris, ca. 1500.
Bach, Kent 1994. Conversational impliciture. Mind & Language 9, 124-162.
Barker, Chris & Pauline Jacobson (eds.) 2007. Direct Compositionality. Oxford: Oxford University Press.
Burdick, Howard 1982. A logical form for the propositional attitudes. Synthese 52, 185-230.
Burge, Tyler 1978. Belief and synonymy. Journal of Philosophy 75, 119-138.
Buridan, John 1998. Summulae de Dialectica 4. Summulae de Suppositionibus, volume 10-4 of Artistarium. Nijmegen: Ingenium.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL: The University of Chicago Press.
Carston, Robyn 2002. Thoughts and Utterances. The Pragmatics of Explicit Communication. Oxford: Oxford University Press.
Chomsky, Noam 1971. Topics in the theory of Generative Grammar. In: J. Searle (ed.). Philosophy of Language. Oxford: Oxford University Press, 71-100.
Chomsky, Noam 1980. Rules and Representations. Oxford: Blackwell.
Cooper, Robin 1983. Quantification and Syntactic Theory. Dordrecht: Reidel.
Davidson, Donald 1967. Truth and meaning. Synthese 17, 304-323. Reprinted in: D. Davidson. Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984, 17-36. Page reference to the reprint.
Dowty, David 2007. Compositionality as an empirical problem. In: C. Barker & P. Jacobson (eds.). Direct Compositionality. Oxford: Oxford University Press, 23-101.
Dummett, Michael 1973. Frege. Philosophy of Language. London: Duckworth.
Dummett, Michael 1981. The Interpretation of Frege's Philosophy. London: Duckworth.
Fernando, Tim 2005. Compositionality inductively, co-inductively and contextually. In: E. Machery, M. Werning & G. Schurz (eds.). The Compositionality of Meaning and Content: Foundational Issues, vol. I. Frankfurt/M.: Ontos, 87-96.
Fodor, Jerry 1987. Psychosemantics. Cambridge, MA: The MIT Press.
Fodor, Jerry 2000. Reply to critics. Mind & Language 15, 350-374.
Fodor, Jerry & Jerrold Katz 1964. The structure of a semantic theory. In: J. Fodor & J. Katz (eds.). The Structure of Language. Englewood Cliffs, NJ: Prentice Hall, 479-518.
Frege, Gottlob 1884. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: W. Koebner. English translation in: J. Austin. The Foundations of Arithmetic: A logico-mathematical enquiry into the concept of number. 1st edn. Oxford: Blackwell, 1950.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56-78.
Frege, Gottlob 1923. Logische Untersuchungen. Dritter Teil: Gedankengefüge. Beiträge zur Philosophie des deutschen Idealismus III (1923-1926), 36-51. English translation in: P. Geach (ed.). Logical Investigations. Oxford: Blackwell, 1977, 55-77.
Hendriks, Herman 2001. Compositionality and model-theoretic interpretation. Journal of Logic, Language and Information 10, 29-48.
Higginbotham, James 1986. Linguistic theory and Davidson's program in semantics. In: E. Lepore (ed.). Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson. Oxford: Blackwell, 29-48.
Hintikka, Jaakko 1984. A hundred years later: The rise and fall of Frege's influence in language theory. Synthese 59, 27-49.
Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7-28.
Horwich, Paul 1998. Meaning. Oxford: Oxford University Press.
Houben, Jan 1997. The Sanskrit tradition. In: W. van Bekkum et al. (eds.). The Emergence of Semantics in Four Linguistic Traditions. Amsterdam: Benjamins, 49-145.
Husserl, Edmund 1900. Logische Untersuchungen II/1. Translated by J. N. Findlay as Logical Investigations. London: Routledge & Kegan Paul, 1970.
Jacobson, Pauline 2002. The (dis)organisation of the grammar: 25 years. Linguistics & Philosophy 25, 601-626.
Janssen, Theo 1986. Foundations and Applications of Montague Grammar. Amsterdam: CWI Tracts 19 and 28.
Janssen, Theo 1997. Compositionality. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 417-473.
Janssen, Theo 2001. Frege, contextuality and compositionality. Journal of Logic, Language and Information 10, 115-136.
Johnson, Kent 2006. On the nature of reverse compositionality. Erkenntnis 64, 37-60.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481-563.
Keenan, Edward & Edward Stabler 2004. Bare Grammar: A Study of Language Invariants. Stanford, CA: CSLI Publications.
King, Peter 2001. Between logic and psychology. John Buridan on mental language. Paper presented at the conference John Buridan and Beyond, Copenhagen, September 2001.
King, Peter 2007. Abelard on mental language. American Catholic Philosophical Quarterly 81, 169-187.
Kracht, Marcus 2003. The Mathematics of Language. Berlin: Mouton de Gruyter.
Kracht, Marcus 2007. The emergence of syntactic structure. Linguistics & Philosophy 30, 47-95.
Lahav, Ran 1989. Against compositionality: The case of adjectives. Philosophical Studies 57, 261-279.
Larson, Richard & Gabriel Segal 1995. Knowledge of Meaning. An Introduction to Semantic Theory. Cambridge, MA: The MIT Press.
Lewis, David 1976. Probabilities of conditionals and conditional probabilities. The Philosophical Review 85, 297-315.
Mates, Benson 1950. Synonymity. University of California Publications in Philosophy 25, 201-226. Reprinted in: L. Linsky (ed.). Semantics and the Philosophy of Language. Urbana, IL: University of Illinois Press, 1952, 111-136.
Montague, Richard 1968. Pragmatics. In: R. Klibansky (ed.). Contemporary Philosophy: A Survey. I: Logic and Foundations of Mathematics. Florence: La Nuova Italia Editrice, 102-122. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 95-118.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70, 491-538.
Pagin, Peter 2003a. Communication and strong compositionality. Journal of Philosophical Logic 32, 287-322.
Pagin, Peter 2003b. Schiffer on communication. Facta Philosophica 5, 25-48.
Pagin, Peter 2005. Compositionality and context. In: G. Preyer & G. Peter (eds.). Contextualism in Philosophy. Oxford: Oxford University Press, 303-348.
Pagin, Peter 2011. Communication and the complexity of semantics. In: W. Hinzen, E. Machery & M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University Press.
Pagin, Peter & Francis J. Pelletier 2007. Content, context and communication. In: G. Preyer & G. Peter (eds.). Context-Sensitivity and Semantic Minimalism. New Essays on Semantics and Pragmatics. Oxford: Oxford University Press, 25-62.
Pagin, Peter & Dag Westerståhl 2009. Pure quotation, compositionality, and the semantics of linguistic context. Submitted.
Partee, Barbara H. 1997. The genitive. A case study. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 464-470.
Pelletier, Francis J. 1994. The principle of semantic compositionality. Topoi 13, 11-24. Expanded version reprinted in: S. Davis & B. Gillon (eds.). Semantics: A Reader. Oxford: Oxford University Press, 2004, 133-157.
Pelletier, Francis J. 1999. Semantic compositionality: Free algebras and the argument from ambiguity. In: M. Faller, S. Kaufmann & M. Pauly (eds.). Proceedings of the Seventh CSLI Workshop on Logic, Language and Computation. Stanford, CA: CSLI Publications, 207-218.
Pelletier, Francis J. 2001. Did Frege believe Frege's Principle? Journal of Logic, Language and Information 10, 87-114.
Peters, Stanley & Dag Westerståhl 2006. Quantifiers in Language and Logic. Oxford: Oxford University Press.
Putnam, Hilary 1975a. Do true assertions correspond to reality? In: H. Putnam. Mind, Language and Reality. Philosophical Papers, vol. 2. Cambridge: Cambridge University Press, 70-84.
Putnam, Hilary 1975b. Mind, Language and Reality. Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Recanati, François 2004. Literal Meaning. Cambridge: Cambridge University Press.
Reimer, Marga 2002. Do adjectives conform to compositionality? Noûs 16, 183-198.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA: The MIT Press.
Schiffer, Stephen 1987. Remnants of Meaning. Cambridge, MA: The MIT Press.
Searle, John 1978. Literal meaning. Erkenntnis 13, 207-224.
Sperber, Dan & Deirdre Wilson 1995. Relevance. Communication & Cognition. 2nd edn. Oxford: Blackwell.
Staal, J. F. 1969. Sanskrit philosophy of language. In: T. A. Sebeok (ed.). Linguistics in South Asia. The Hague: Mouton, 499-531.
Szabó, Zoltán 2000. Compositionality as supervenience. Linguistics & Philosophy 23, 475-505.
Szabó, Zoltán 2001. Adjectives in context. In: R. M. Harnish & I. Kenesei (eds.). Perspectives on Semantics, Pragmatics, and Discourse. A Festschrift for Ferenc Kiefer. Amsterdam: Benjamins, 119-146.
Travis, Charles 1985. On what is strictly speaking true. Canadian Journal of Philosophy 15, 187-229.
Weiskopf, Daniel 2007. Compound nominals, context, and compositionality. Synthese 156, 161-204.
Westerståhl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics & Philosophy 21, 635-643.
Westerståhl, Dag 2002. On the compositionality of idioms: An abstract approach. In: J. van Benthem, D. I. Beaver & D. Barker-Plummer (eds.). Words, Proofs, and Diagrams. Stanford, CA: CSLI Publications, 241-271.
Westerståhl, Dag 2004. On the compositional extension problem. Journal of Philosophical Logic 33, 549-582.
Westerståhl, Dag 2007. Remarks on scope ambiguity. In: E. Ahlsén et al. (eds.). Communication - Action - Meaning. A Festschrift to Jens Allwood. Göteborg: Department of Linguistics, University of Gothenburg, 43-55.
Westerståhl, Dag 2008. Decomposing generalized quantifiers. Review of Symbolic Logic 1, 355-371.
Westerståhl, Dag 2011. Compositionality in Kaplan style semantics. In: W. Hinzen, E. Machery & M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University Press.
Zadrozny, Wlodek 1994. From compositional to systematic semantics. Linguistics & Philosophy 17, 329-342.

Peter Pagin, Stockholm (Sweden)
Dag Westerståhl, Gothenburg (Sweden)
7. Lexical decomposition: Foundational issues

1. The purpose of lexical decomposition
2. The early history of lexical decomposition
3. Theoretical aspects of decomposition
4. Conclusion
5. References

Abstract

Theories of lexical decomposition assume that lexical meanings are complex. This complexity is expressed in structured meaning representations that usually consist of predicates, arguments, operators, and other elements of propositional and predicate logic. Lexical decomposition has been used to explain phenomena such as argument linking, selectional restrictions, lexical-semantic relations, scope ambiguities, and the inference behavior of lexical items. The article sketches the early theoretical development from noun-oriented semantic feature theories to verb-oriented complex decompositions. It also deals with a number of theoretical issues, including the controversy between decompositional and atomistic approaches to meaning, the search for semantic primitives, the function of decompositions as definitions, problems concerning the interpretability of decompositions, and the debate about the cognitive status of decompositions.

1. The purpose of lexical decomposition

1.1. Composition and decomposition

The idea that the meaning of single lexical units is represented in the form of lexical decompositions is based on the assumption that lexical meanings are complex. This complexity is expressed as a structured representation often involving predicates, arguments, operators, and other elements known from propositional and predicate logic. For example, the noun woman is represented as a predicate that involves the conjunction of the properties of being human, female, and adult, whereas the verb empty can be thought of as expressing a causal relation between x and the becoming empty of y.

(1) a. woman: λx[HUMAN(x) & FEMALE(x) & ADULT(x)]
b. to empty: λyλx[CAUSE(x, BECOME(EMPTY(y)))]

The structures involved in lexical decompositions resemble semantic structures on the phrasal and sentential level. There is of course an important difference between semantic decomposition and semantic composition: semantic complexity on the phrasal and sentential level mirrors the syntactic complexity of the expression, while the assumed semantic complexity on the lexical level - at least as far as non-derived words are concerned - need not correspond to any formal complexity of the lexical expression.

Next, we give an overview of the main linguistic phenomena treated within decompositional approaches (section 1.2). Section 2 looks at the origins of the idea of lexical decomposition (section 2.1) and sketches some early formal theories on the lexical decomposition of nouns (sections 2.2, 2.3) and verbs (section 2.4). Section 3 is devoted
to a discussion of some long-standing theoretical issues of lexical decomposition: the controversy between decompositional and non-decompositional approaches to lexical meaning (section 3.1), the location of decompositions within a language theory (section 3.2), the status of semantic primitives (section 3.3), the putative role of decompositions as definitions (section 3.4), the semantic interpretation of decompositions (section 3.5), and their cognitive plausibility (section 3.6). The discussion relies heavily on the overview of frameworks of lexical decomposition of verbs given in article 17 (Engelberg) Frameworks of lexical decomposition of verbs, which can be consulted for a more detailed description of the theories mentioned in the present article.

1.2. The empirical coverage of lexical decompositions

Which phenomena lexical decompositions are supposed to explain varies from approach to approach. The following have been tackled fairly often in decompositional theories:

(i) Argument linking: One of the main purposes for decomposing verbs has been the attempt to form generalizations about the relationship between semantic arguments and their syntactic realization. In causal structures like the one given for empty in (1b), the first argument of a CAUSE relation becomes the subject of the sentence and is marked with nominative in nominative-accusative languages or absolutive in ergative-absolutive languages. Depending on the linking theory pursued, this can be expressed in different kinds of generalizations, for example, by simply claiming that the first argument of CAUSE always becomes the subject in active sentences or - more generally - that the least deeply embedded argument of the decomposition is associated with the highest function in a corresponding syntactic hierarchy.

(ii) Selectional restrictions: Lexical decompositions account for semantic co-occurrence restrictions. The arguments selected by a lexical item are usually restricted to particular semantic classes. If the item filling the argument position is not of the required class, the resulting expression is semantically deviant. For instance, the verb preach selects an argument filler denoting a human being for its first argument slot. The decompositional features of woman (1a) account for the fact that the woman preached is semantically unobtrusive while the hope / chair / tree preached is not.

(iii) Ambiguity resolution: Adverbs often lead to a kind of sentential ambiguity that has been attempted to be resolved by reference to lexical decompositions. In a scenario where Rebecca is pointing a gun at Jamaal, sentence (2a) may describe three possible outcomes.

(2) a. Rebecca almost killed Jamaal.
b. kill: λyλx[DO(x, CAUSE(BECOME(DEAD(y))))]

Assuming a lexical decomposition for kill as in (2b), ambiguity resolution is achieved by attaching almost to different predicates within the decomposition, yielding a scenario where Rebecca almost pulled the trigger (almost DO ...), a scenario where she pulled the trigger but missed Jamaal (almost CAUSE ...), and a scenario where she pulled the trigger, hit him but did not wound him fatally (almost BECOME DEAD ...); a toy enumeration of these attachment sites is sketched below.
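The following Python sketch enumerates the three attachment sites for almost in the decomposition (2b). The operator names follow the article; the tuple encoding and the function are our own illustration:

```python
# Enumerate the operator positions in (2b) where ALMOST could attach,
# each corresponding to one reading of 'Rebecca almost killed Jamaal'.

KILL = ("DO", "x", ("CAUSE", ("BECOME", ("DEAD", "y"))))
OPERATORS = {"DO", "CAUSE", "BECOME"}

def attachment_sites(term, path=()):
    """Yield the chain of operators down to each possible ALMOST attachment."""
    if isinstance(term, tuple) and term[0] in OPERATORS:
        yield path + (term[0],)
        for sub in term[1:]:
            yield from attachment_sites(sub, path + (term[0],))

for site in attachment_sites(KILL):
    print("ALMOST at:", " > ".join(site))
# ALMOST at: DO                    (she almost pulled the trigger)
# ALMOST at: DO > CAUSE            (she pulled the trigger but caused no change)
# ALMOST at: DO > CAUSE > BECOME   (a change began but DEAD was not reached)
```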
(iv) Lexical relations: Lexical decompositions have also been employed in the analysis of semantic relations like hyperonymy, complementarity, synonymy, etc. (cf. Bierwisch 1970: 170). For example, assuming that a lexeme A is a hyperonym of a lexeme B iff the set of properties conjoined in the lexical decomposition of lexeme A is a proper part of the set of properties conjoined in the lexical decomposition of lexeme B, we derive that child (3a) is a hyperonym of girl (3b).

(3) a. child: λx[HUMAN(x) & ¬ADULT(x)]
    b. girl: λx[HUMAN(x) & ¬ADULT(x) & FEMALE(x)]

(v) Lexical field structure: Additionally, lexical decompositions have been used in order to uncover the structure of lexical fields (cf. section 2.2).

(vi) Inferences: Furthermore, lexical decompositions allow for semantic inferences that can be derived from the semantics of primitive predicates. For example, a predicate like BECOME, properly defined, allows for the inference that Jamaal in (2a) was not dead immediately before the event.
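The linking generalization in (i) and the scope account in (iii) both treat a decomposition as a nested term over which structural notions such as embedding depth and attachment site can be defined. The following sketch is purely illustrative and not part of any framework discussed here; the Pred class and the helpers depths and sites are hypothetical devices for making those notions concrete:

```python
# Illustrative sketch only: a decomposition as a nested term, with embedding
# depth (for argument linking) and attachment sites (for the scope of 'almost').
from dataclasses import dataclass

@dataclass
class Pred:
    name: str        # decompositional predicate, e.g. CAUSE
    args: tuple      # variables (str) or embedded Preds

# kill: λyλx[DO(x, CAUSE(BECOME(DEAD(y))))], cf. (2b)
kill = Pred("DO", ("x", Pred("CAUSE", (Pred("BECOME", (Pred("DEAD", ("y",)),)),))))

def depths(term, d=0, out=None):
    """Record the embedding depth of each variable argument."""
    out = {} if out is None else out
    for arg in term.args:
        if isinstance(arg, Pred):
            depths(arg, d + 1, out)
        else:
            out.setdefault(arg, d)
    return out

def sites(term):
    """Yield every predicate in the term; each is a possible scope for 'almost'."""
    yield term.name
    for arg in term.args:
        if isinstance(arg, Pred):
            yield from sites(arg)

d = depths(kill)
print(min(d, key=d.get))   # 'x' -- the least deeply embedded argument, i.e. the subject
print(list(sites(kill)))   # ['DO', 'CAUSE', 'BECOME', 'DEAD']
```

Note that the enumeration yields four syntactic attachment sites, while the text above distinguishes three readings; which sites collapse into a single reading is a matter of semantic interpretation, not of the data structure.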
2. The early history of lexical decomposition

2.1. The roots of lexical decomposition

The idea that the meaning of a word can be explained by identifying it with the meaning of a more complex expression is deeply rooted not only in linguistics but also in our common sense understanding of language. When asked to explain to a non-native speaker what the German word Junggeselle means, one would probably say that a Junggeselle is an unmarried man. A decompositional way of meaning explanation is also at the core of the Aristotelian conception of word meaning, in which the meaning of a noun is sufficiently explained by its genus proximum (here man) and its differentia specifica (here unmarried). Like the decompositions in (1), this conception attempts to define the meaning of a word. However, the distinction between genus proximum and differentia specifica is not explicitly expressed in lexical decompositions: From a logical point of view, each predicate in a conjunction as in (1a) qualifies as a genus proximum.

The Aristotelian distinction is also an important device in lexicographic meaning explanations as in (4a), where the next superordinate concept (donkey) of the lexical item in question (jackass) and one or more distinctive features (male) are given (cf. e.g., Svensén 1993: 120ff). Interestingly, meaning explanations based on genus proximum and differentia specifica have provoked some criticism within lexicography (Wiegand 1989) as well, and a closer look into standard monolingual dictionaries reveals that many meaning explanations are not of the Aristotelian kind represented in (4a): They involve near-synonyms (4b), integration of encyclopaedic (4c) and pragmatic information (4d), extensional listings of members of the class denoted by the lexeme (4e), pictorial illustrations (cf. numerous examples, e.g., in Harris 1923), or any combinations thereof.

(4) a. jackass [...] 1. male donkey [...] (Thorndike 1941: 501)
    b. grumpy [...] surly; ill-humoured; gruff. [...] (Thorndike 1941: 413)
    c. scimitar [...] A saber with a much-curved blade with the edge on the convex side, used chiefly by Mohammedans, esp. Arabs and Persians. [...] (Webster's, Harris 1923: 1895)
    d. Majesty [...] title used in speaking to or of a king, queen, emperor, empress, etc.; as, Your Majesty, His Majesty, Her Majesty. [...] (Thorndike 1941: 562)
    e. cat [...] 2. Any species of the family Felidae, of which the domestic cat is the type, including the lion, tiger, leopard, puma, and various species of tiger cats, and lynxes, also the cheetah. [...] (Webster's, Harris 1923: 343)

This foreshadows some persistent problems of later approaches to lexical decomposition.

2.2. Semantic feature theories and the semantic structure of nouns

As we have seen, the concept of some kind of decomposition has been around ever since people began to systematically think about word meanings. Yet, it was not until the advent of Structural Semantics that lexical decompositions became part of more restrictive semantic theories. Structural Semantics emerged in the late 1920s as a reaction to the semantic mainstream, which, at the time, was oriented towards psychological explanations of idiolectal variation and the diachronic change of single word meanings. It conceived of lexical semantics as a discipline that revealed the synchronic structure of the lexicon from a non-psychological perspective. The main tenet was that the meaning of a word can only be captured in its relation to the meaning of other words.

Within Structural Semantics, lexical decompositions developed in the form of breaking down word meanings into semantic features (depending on the particular approach also called 'semantic components', 'semantic markers', or 'sememes'). An early analysis of this sort can be found in Hjelmslev's Prolegomena from 1943 (Hjelmslev 1963: 70), who observed that systematic semantic relationships can be traced back to shared semantic components (cf. Tab. 7.1). He favored a strict decompositional approach in that (i) he explicitly conceived of decompositions like the ones in Tab. 7.1 as definitions of words and (ii) assumed that content-entities like 'ram', 'woman', 'boy' have to be eliminated from the inventory of content-entities if they can be defined by decompositions (Hjelmslev 1963: 72ff).

Tab. 7.1: Semantic components (after Hjelmslev 1963: 70)

          'sheep'   'human being'   'child'   'horse'
  'he'    'ram'     'man'           'boy'     'stallion'
  'she'   'ewe'     'woman'         'girl'    'mare'

Following the Prague School's feature-based approach to phonology, it was later assumed that semantic analyses should be based on a set of functional oppositions like [±human], [±male], etc. (cf. also article 16 (Bierwisch) Semantic features and primes). Semantic feature theories developed along two major lines. In Europe, structuralists like Pottier (1963, 1964), Coseriu (1964), and Greimas (1966) employed semantic features to reveal
the semantic structure of lexical fields. A typical example of a semantic feature analysis in the European structuralist tradition is Pottier's (1963) analysis of the lexical field of sitting furniture with legs (French siège), which consists of the lexemes chaise, fauteuil, tabouret, canapé, and pouf (cf. Tab. 7.2). Six binary features serve to define and structure the field: s1 = avec dossier 'with back', s2 = sur pied 'on feet', s3 = pour 1 personne 'for one person', s4 = pour s'asseoir 'for sitting', s5 = avec bras 'with armrest', and s6 = avec matériau rigide 'with rigid material'.

Tab. 7.2: Semantic feature analysis of the lexical field siège ('seat with legs') in French (Pottier 1963: 16)

              s1    s2    s3    s4    s5    s6
  chaise      +     +     +     +     -     +
  fauteuil    +     +     +     +     +     +
  tabouret    -     +     +     +     -     +
  canapé      +     +     -     +     +     +
  pouf        -     +     +     +     -     -

In North America, Katz, Fodor, and others tried to develop a formal theory of the lexicon as a part of so-called Interpretive Semantics, which constituted the semantic component of the Standard Theory of Generative Grammar (Chomsky 1965). In this tradition, semantic features served, for example, as targets for selectional restrictions (Katz & Fodor 1963). The semantic description of lexical items consists of two types of features, 'semantic markers' and 'distinguishers', by which the meaning of a lexeme is decomposed exhaustively into its atomic concepts: "The semantic markers assigned to a lexical item in a dictionary entry are intended to reflect whatever systematic semantic relations hold between that item and the rest of the vocabulary of the language. On the other hand, the distinguishers assigned to a lexical item are intended to reflect what is idiosyncratic about its meaning." (Katz & Fodor 1963: 187). An example entry is given in Tab. 7.3.

Tab. 7.3: Readings of the English noun bachelor distinguished by semantic markers (in parentheses) and distinguishers (in square brackets) (Katz, after Fodor 1977: 65)

  bachelor, [+N, ...],
  - (Human), (Male), [who has never married]
  - (Human), (Male), [young knight serving under the standard of another knight]
  - (Human), [who has the first or lowest academic degree]
  - (Animal), (Male), [young fur seal when without a mate during the breeding time]

Besides this feature-based specification of the meaning of lexical items, Interpretive Semantics assumed recursive rules that operate over syntactic deep structures and build up meaning specifications for phrases and sentences out of lexical meaning specifications (Katz & Postal 1964).

As we have seen, semantic feature theories make it possible to tackle phenomena in the area of lexical fields and selectional restrictions (cf. also article 21 (Cann) Sense relations). They can also be used in formal accounts of lexical-semantic relations. For example, expression A is incompatible with expression B iff A and B have different values for at least one of their semantic features: boy [+human, -adult, -female] is incompatible with woman [+human, +adult, +female]. Expression A is complementary to expression B iff A and B have different values for exactly one of their semantic features: for instance, girl [+human, -adult, +female] is complementary to boy. Expression A is hyperonymous to expression B iff the set of feature-value assignments for A is included in the set of feature-value assignments for B: thus, child [+human, -adult] is hyperonymous to boy.
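Since these definitions are simple set-theoretic conditions on feature-value assignments, they can be checked mechanically. The following sketch is purely illustrative (the feature inventory is just the one used in the examples above, not a proposal of any particular theory):

```python
# Illustrative sketch: the lexical relations defined above, computed over
# feature-value assignments (+1 for [+f], -1 for [-f]).
FEATURES = {
    "woman": {"human": +1, "adult": +1, "female": +1},
    "boy":   {"human": +1, "adult": -1, "female": -1},
    "girl":  {"human": +1, "adult": -1, "female": +1},
    "child": {"human": +1, "adult": -1},
}

def differing(a, b):
    """Shared features on which a and b carry different values."""
    shared = FEATURES[a].keys() & FEATURES[b].keys()
    return [f for f in shared if FEATURES[a][f] != FEATURES[b][f]]

def incompatible(a, b):      # different values on at least one shared feature
    return len(differing(a, b)) >= 1

def complementary(a, b):     # different values on exactly one shared feature
    return len(differing(a, b)) == 1

def hyperonymous(a, b):      # a's feature-value assignments are included in b's
    return all(FEATURES[b].get(f) == v for f, v in FEATURES[a].items())

assert incompatible("boy", "woman")
assert complementary("girl", "boy")
assert hyperonymous("child", "boy")
```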
In European structuralism, the status of semantic features was a matter of debate. They were usually conceived of as part of a descriptive, language-independent semantic metalanguage, but were also treated as cognitive entities. In Generative Grammar, Katz conceived of semantic features as derived from a universal conceptual structure: "Semantic markers must [...] be thought of as theoretical constructs introduced into semantic theory to designate language invariant but language linked components of a conceptual system that is part of the cognitive structure of the human mind." (Katz 1967: 129). In a similar vein, Bierwisch (1970: 183) assumed that the basic semantic components "are not learned in any reasonable sense of the term, but are rather an innate predisposition for language acquisition." Thus, language-specific semantic structures come about by the particular combination of semantic features that yield a lexical item.

2.3. Some inadequacies of semantic feature analyses

Semantic feature theories considerably stimulated research in lexical semantics. Beyond that, semantic features had found their way into contemporary generative syntax as a target for selectional restrictions (cf. Chomsky 1965). Yet, a number of empirical weaknesses quickly became evident.

(i) Relational lexemes: The typical cases of semantic feature analyses seem to presuppose that all predicates are one-place predicates. Associating woman with the feature bundle [+human, +adult, +female] means that the referent of the sole argument of woman(x) has the three properties of being human, adult, and female. Simple feature bundles cannot account for relational predicates like mother(x,y) or devour(x,y), because the argument to which a semantic feature attaches has to be specified. With mother(x,y), the feature [+female] applies to the first, but not to the second argument.

(ii) Structure of verb decompositions: Semantic feature analyses usually represent word meanings as unordered sets of features. However, in particular with verbal predicates, the decomposition cannot be adequately formulated as a flat structure (cf. section 2.4).

(iii) Undecomposable words: It has been criticized that cohyponyms in larger taxonomies, such as lion, tiger, puma, etc. as cohyponyms of cat (cf. 4e) or rose, tulip, daffodil, carnation, etc. as cohyponyms of flower, cannot be differentiated by semantic features in a non-trivial way. If one of the semantic features of rose is [+flower], then what is its distinguishing feature? This feature should abstract from a rose being a flower since [+flower] has already been added to the feature list. Moreover, it should not be unique to the entry of rose, or else the number of features
threatens to surpass the number of lexical entries. In other words, there does not seem to be any plausible candidate for P that would make ∀x[rose(x) ↔ (P(x) & flower(x))] a valid definition (cf. Fodor 1977: 150; Roelofs 1997: 46ff for arguments of this sort). Besides cohyponymy in large taxonomies, there are other lexical relations as well that cannot be adequately captured by binary feature descriptions, for instance the scalar nature of antonymy and lexical rows like hot > warm > tepid > cool > cold. In general, it is simply unclear for many lexical items what features might be used to distinguish them from near-synonyms (cf. grumpy in (4b)).

(iv) Exhaustiveness: For most lexical items, it seems to be impossible to give an exhaustive lexical analysis, that is, one that provides the features that are necessary and sufficient to distinguish the item from all other lexical items of the language without introducing features that are used solely for the description of this one particular item. Katz's distinction between markers and distinguishers does not solve the problem. Apart from the fact that the difference between 'markers' as targets for selectional restrictions and 'distinguishers' as lexeme-specific idiosyncratic features is not supported by the data (cf. Bierwisch 1969: 177ff; Fodor 1977: 144ff), this concession to semantic diversity weakens the explanatory value of semantic feature theories considerably since no restrictions are provided for what can occur as a distinguisher.

(v) Finiteness: Only rarely have large inventories of features been assembled (e.g., by Lorenz & Wotjak 1977). Moreover, semantic feature theory has not succeeded in developing operational procedures by which semantic features can be discovered. Thus, it has not become evident that there is a finite set of semantic features that allows a description of the entire vocabulary of a language, in particular that this set is smaller than the set of lexical items.

(vi) Universality: Another point of criticism has been that the alleged universality of a set of features has not been convincingly demonstrated. As Lyons (1968: 473) stated, cross-linguistic comparisons of semantic structures rather point to the contrary. However, the search for a universal semantic metalanguage has continued and has become a major topic in particular among the proponents of Natural Semantic Metalanguage (cf. article 17 (Engelberg) Frameworks of decomposition, section 8).

(vii) Theoretical status: The often unclear theoretical status of semantic features has drawn some criticism as well. Among other things, it has been argued that in order to express that a mare is a female horse it is not necessary to enrich the metalanguage by numerous features. The relation can equally well be expressed on the level of the object language by assuming a meaning postulate in the form of a biconditional: □∀x[mare(x) ↔ (horse(x) & female(x))].

2.4. Lexical decomposition and the semantic structure of verbs

The rather complex semantic structure of many verbs could not be adequately captured by semantic feature approaches for two reasons: They focused on one-place lexemes, and they expressed lexical meaning in flat structures, that is, by simply conjoining semantic features. Instead, hierarchical structures were needed. Katz (1971) tackled this problem in the form of decompositions that also included aspectually relevant features such as 'activity' (cf. Tab. 7.4).
Tab. 7.4: Decomposition of chase (after Katz 1971: 304)

  chase → Verb, Verb transitive, ...; (((Activity) (Nature: (Physical)) of X), ((Movement) (Rate: (Fast)) (Character: (Following))), (Intention of X: (Trying to catch ((Y) ((Movement) (Rate: (Fast))))))); (SR).

While this form of decomposition never caught on, other early verb decompositions (cf. Bendix 1966, Fillmore 1968, Bierwisch 1970) are more similar to the semantic representations still employed in many theories of verb semantics:

(5) a. give(x,y,z): x CAUSE (y HAVE z) (after Bendix 1966: 69)
    b. persuade(x,y,z): x CAUSE (y BELIEVE z) (after Fillmore 1968: 377)

It was the rise of Generative Semantics in the late 1960s that caused a shift in interest from decompositional structures of nouns to lexical decompositions of verbs. The history of lexical decompositions of verbs that emerged from these early approaches is reviewed in article 17 (Engelberg) Frameworks of decomposition. Starting from early generative approaches, verb decompositions have been employed in theories as different as Conceptual Semantics (cf. also article 30 (Jackendoff) Conceptual Semantics), Natural Semantic Metalanguage, and Distributed Morphology (cf. also article 81 (Harley) Semantics in Distributed Morphology).

3. Theoretical aspects of decomposition

3.1. Decompositionalism versus atomism

Directly opposed to decompositional approaches to lexical meaning stands a theory of lexical meaning that is known as lexical atomism or holism and whose main proponent is Jerry A. Fodor (1970, 1998, Fodor et al. 1980). According to a decompositional concept of word meaning, knowing the meaning of a word involves knowing its decomposition, that is, the linguistic or conceptual entities and relations it consists of. Fodor's atomistic conception of word meaning rejects this view and assumes instead that there is a direct correspondence between a word and the mental particular it stands for. A lexical meaning does not have constituents, and - in a strict formulation of atomism - knowing it does not involve knowing the meaning of other lexical units.

Fodor (1998: 45) observes in favor of his atomistic, anti-definitional approach that there are practically no words whose definition is generally agreed upon - an argument that cannot be easily dismissed. Atomists are also skeptical about the claim that decompositions/definitions are simpler than the words they are attached to: "Does anybody present really think that thinking bachelor is harder than thinking unmarried? Or that thinking father is harder than thinking parent?" (Fodor 1998: 46). Discussing Jackendoff's (1992) decompositional representation of keep, Fodor (1998: 55) comments on the relation that is expressed in examples as different as someone kept the money and someone kept the crowd happy: "I would have thought, saying what relation they both instance is precisely what the word 'keep' is for; why on earth do you suppose that you can say it 'in other words'?" And he adds: "I can't think of a better way to say what 'keep' means than to say that it means keep. If, as I suppose, the concept keep is
an atom, it's hardly surprising that there's no better way to say what 'keep' means than to say that it means keep." More detailed arguments for and against atomistic positions will appear throughout this article.

The controversy between decompositionalists and atomists is often connected to the question whether decompositions or meaning postulates should be employed to characterize lexical meaning. Meaning postulates are used to express analytic knowledge concerning particular semantic expressions (Carnap 1952: 67). Lexical meaning postulates are necessarily true. They consist of entailments where the antecedent is an open lexical proposition (6a).

(6) a. □∀x[bachelor(x) → man(x)]
       □∀x[bachelor(x) → ¬married(x)]
    b. □∀x[bachelor(x) ↔ (man(x) & ¬married(x))]
    c. bachelor: λx[man(x) & unmarried(x)]

Meaning postulates can also express bidirectional entailments as in (6b), where the biconditional expresses a definition-like equivalence between a word and its decomposition. I will assume that, in the typical case, on a purely lexical level of meaning description, decompositional approaches like (6c) conceive of word meanings as bidirectional entailments as in (6b), while atomistic approaches involve monodirectional entailments as in (6a) (cf. similarly Chierchia & McConnell-Ginet 1990: 360ff). Thus, meaning postulates do not per se characterize atomistic approaches to meaning; it is rather the kind of meaning postulate that serves to distinguish the two basic stances on word meaning. Informally, one might say that bidirectional meaning postulates provide definitions, while monodirectional ones provide single aspects of word meaning in the form of relations to other semantic elements.

Three caveats are in order here: (i) Semantic reconstruction on the basis of meaning postulates is not uniformly accepted either in the decompositional or in the atomistic camp. Some proponents of decompositional approaches do not adhere to a definitional view of decompositions; they claim that their decompositions do not cover the whole meaning of the lexical item (cf. section 3.4). At the same time, some radical approaches to atomism reject lexical meaning postulates completely (Fodor 1998). (ii) Decompositions and bidirectional meaning postulates are only equivalent on the level of meaning explanation (cf. Chierchia & McConnell-Ginet 1990: 362). They differ, however, in that in decompositional approaches the word defined (bachelor in (6c)) is not accessible within the semantic derivation while the elements of the decomposition (man(x) & unmarried(x)) are. This can have an effect, for example, on the explanation of scope phenomena. (iii) Furthermore, decompositions as in (6c) and bidirectional meaning postulates as in (6b) can give rise to different predictions with respect to language processing (cf. section 3.6).
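The difference between the monodirectional postulates in (6a) and the definitional biconditional in (6b) can be made concrete with a toy entailment check. The following sketch is our illustration only - no theory discussed here is committed to such an implementation - and it treats each postulate as an inference rule over atomic predications of a single individual:

```python
# Illustrative sketch: (6a)-style vs. (6b)-style meaning postulates as
# inference rules over atomic predications of one individual.
MONO = [({"bachelor"}, "man"), ({"bachelor"}, "unmarried")]   # (6a)
BI = MONO + [({"man", "unmarried"}, "bachelor")]              # (6b): adds the reverse direction

def closure(facts, rules):
    """Derive everything the rules entail from the given facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Under (6a), being a man and unmarried does not suffice to derive 'bachelor' ...
print(closure({"man", "unmarried"}, MONO))   # {'man', 'unmarried'}
# ... under the definitional (6b), it does:
print(closure({"man", "unmarried"}, BI))     # {'man', 'unmarried', 'bachelor'}
```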
3.2. Decompositions and the lexicon

One of the most interesting differences in the way verb decompositions are used in different language theories concerns their location within the theory. Some approaches locate decompositions and the principles and rules that build them up in syntax (e.g., Generative Semantics, Distributed Morphology), some in semantics (e.g., Dowty's Montague-based theory, Lexical Decomposition Grammar, Natural Semantic Metalanguage), and others in conceptual structure (e.g., Conceptual Semantics) (cf. article 17 (Engelberg) Frameworks of decomposition).

Decompositions are sometimes constrained by interface conditions as well. These interface relations between linguistic levels of representation are specified to a different degree in different theories. Lexical Decomposition Grammar has put some effort into establishing interface conditions between syntactic, semantic, and conceptual structure. In syntactic approaches to decomposition (Lexical Relational Structures, Distributed Morphology), however, the relation between syntactic decompositions and semantic representations often remains obscure - one of the few exceptions being von Stechow's (1995) analysis of the scope properties of German wieder 'again' in syntactic decompositions (cf. article 17 (Engelberg) Frameworks of decomposition).

The way decompositions are conceived has an impact on the structure of the lexicon and its role within the architecture of the language theory pursued. While some approaches advocate rather rich meaning representations (e.g., Natural Semantic Metalanguage, Conceptual Semantics), others downplay semantic representation and reduce it to a rather unstructured domain of encyclopaedic knowledge (Distributed Morphology) (cf. the overview in Ramchand 2008). Meaning representation itself can occur on more than one level. Sometimes the distinction is between semantics proper and some sort of conceptual representation (e.g., Lexical Decomposition Grammar); sometimes different levels of conceptual representation are distinguished, such as Jackendoff's (2002) Conceptual Structure and Spatial Structure. Theories also differ in how much the lexicon is structured by rules and principles. While syntactic approaches often conceive of the lexicon as a mere inventory, other approaches (e.g. Levin & Rappaport Hovav's Lexical Conceptual Structures, Wunderlich's Lexical Decomposition Grammar, Pustejovsky's Event Structures) assume different kinds of linking principles, interface conditions and structure rules for decompositions (for references, cf. article 17 (Engelberg) Frameworks of decomposition).

3.3. Decompositions and primitives

Decompositional approaches to lexical meaning usually claim that all lexical items can be completely reduced to their component parts; that is, they can be defined. This requires certain conditions on decompositions in order to avoid infinite regress: (i) The predicates used in the decompositions are either semantic primitives or can be reduced to semantic primitives by definitions. It is necessary that the primitives are not reduced to other elements within the vocabulary, but are grounded elsewhere. (ii) Another condition is that the set of primitives be notably smaller than the lexicon (cf. Fodor et al. 1980: 268). (iii) Apart from their primitivity, it is often required that predicates within decompositions be general, that is, distinctive for a large number of lexemes, and universal, that is, relevant to the description of lexemes in all or most languages (cf. Löbner 2002: 132ff) (cf. also the discussion in article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure). The status of these primitives has been a constant topic within decompositional semantics.
Depending on the particular theory, the vocabulary of semantic primitives is located on different levels of linguistic structure. Theories differ as to whether these predicates are elements of the object language (e.g., Natural Semantic Metalanguage),
of a semantic metalanguage (e.g., Montague Semantics), or of some set of conceptual entities (e.g., Lexical Conceptual Semantics). A finite set of primitives is rarely given, the notable exception being Natural Semantic Metalanguage. Most theories obviously assume a core of the decompositional vocabulary, including such items as CAUSE, BECOME, DO, etc., but they also include many other predicates like ALIVE, BELIEVE, IN, MOUTH, WRITE. Since they are typically not concerned with all subtleties of meaning, most theories do not bother much about the status of these elements. They might be conceived of as definable or not. While in Dowty's (1979) approach CAUSE gets a counterfactual interpretation in the vein of Lewis (1973) (cf. the schematic rendering at the end of this section), Lakoff (1972: 615f) treats CAUSE as a primitive universal and, similarly, in Natural Semantic Metalanguage because is taken as a primitive.

However, no matter how many primitives a theory assumes, something has to be said about how these elements can be grounded. Among the possible answers are the following: (i) Semantic primitives are innate (or acquired before language acquisition) (e.g., Bierwisch 1970). To my knowledge, no evidence from psycholinguistics or neurolinguistics has been obtained for this claim. (ii) Semantic primitives can be reduced to perceptual features. Considering the abstract nature of some semantic features, a complete reduction to perception seems unlikely (cf. Jackendoff 2002: 339). (iii) Semantic primitives are conceptually grounded (e.g., Natural Semantic Metalanguage, Conceptual Semantics). This is often claimed but rarely pursued empirically (but cf. Jackendoff 1983, Engelberg 2006).
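To indicate what a counterfactual grounding of CAUSE amounts to, the following schema compresses the Lewis-style analysis on which Dowty's (1979) treatment builds into its core case; it simplifies away Dowty's chaining of causal factors and should be read as an approximation, not as Dowty's full definition:

\[
[\varphi\ \textsc{cause}\ \psi]\ \text{is true iff}\ \varphi\ \text{and}\ \psi\ \text{are true, and}\ \neg\varphi \mathrel{\Box\!\!\rightarrow} \neg\psi
\]

where φ □→ ψ is the counterfactual conditional 'if φ were the case, ψ would be the case'.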
3.4. Decompositions and definitions

If lexical decompositions are semantically identified with biconditional meaning postulates, they can be regarded as definitions: They provide necessary and sufficient conditions. It has been questioned whether word meaning can be captured this way. The non-equivalence of kill and cause to die had been a major argument against Generative Semantics (cf. article 17 (Engelberg) Frameworks of decomposition, section 2). It has been observed that the definitional approach simply fails on a word-by-word basis: "There are practically no defensible examples of definitions; for all the examples we've got, practically all words (/concepts) are undefinable. And, of course, if a word (/concept) doesn't have a definition, then its definition can't be its meaning." (Fodor 1998: 45) Given what was said about lexicography in section 2.1, we must concede that even on a less strict view of semantic equivalence, the definitional approach can only be applied to a subgroup of lexical items. Atomistic and prototype-based approaches to lexical meaning thrive on these observations.

Decompositionalist approaches to word meaning have reacted differently to these problems. Some have just denied them: Natural Semantic Metalanguage claims that, based on a limited set of semantic primitives and their combinatorial potential, a complete decomposition is possible. Other approaches, in particular Conceptual Semantics, point to the particular descriptive level of their decompositions. They claim that CAUSE does not have the same meaning as cause, the former being a conceptual entity. This allows Jackendoff (2002: 335f) to state that decompositions are not definitions since no equation between a word and a synonymous phrasal expression is attempted. However, the meanings of the decompositional predicates employed are then all the more in need of explanation since our intuitions about the meaning of natural language lexemes no longer account for them. As Pulman (2005) puts it in a discussion of Jackendoff's approach: "[...] if your intuition is that part of the meaning
of 'drink' is that liquid should enter a mouth, then unless there is some explicit connection between the construct MOUTH and the meaning of the English word 'mouth', that intuition is not accounted for." Finally, some researchers assume that decompositions only capture the core meaning of a word (e.g., Kornfilt & Correa 1993: 83). Thus, decompositions do not exhaust a word's meaning, and they are not definitions. This, of course, poses the questions of what they are decompositions of and by what semantic criteria core aspects of lexical meaning can be identified. In any case, a conception of decompositions as incomplete meaning descriptions weakens the approach considerably. "It is, after all, not in dispute that some aspects of lexical meanings can be represented in quite an exiguous vocabulary; some aspects of anything can be represented in quite an exiguous vocabulary," Fodor (1998: 48) remarks and adds: "It is supposed to be the main virtue of definitions that, in all sorts of cases, they reduce problems about the defined concepts to corresponding problems about its primitive parts. But that won't happen unless each definition has the very same content as the concept it defines." (Fodor 1998: 49)

In some approaches the partial specification of meaning in semantic decompositions results from a distinction between semantic and conceptual representation (e.g., Bierwisch 1997). Semantic decompositions are underspecified and are supplemented on the utterance level with information from a conceptual representation. The completeness question is sometimes tied to the attempt to distinguish those aspects of meaning that are grammatically relevant from those that are not. This is done in different ways. Some assume that decompositions are incomplete and represent only what is grammatically relevant. Others differentiate between levels of representation; Lexical Decomposition Grammar distinguishes Semantic Form, which includes the grammatically relevant information, from Conceptual Structure. Finally, some assume one level of representation in which only particular parts are grammatically relevant; in Rappaport Hovav & Levin (1998), general decompositional templates contain the grammatically relevant information whereas idiosyncratic aspects are reflected by lexeme-specific constants that are inserted into these templates.

However, the distinction between grammatically relevant and irrelevant properties within a semantic representation raises a serious theoretical question. It is a truism that not all subtleties of lexical meaning show grammatical effects. With to eat, the implied aspect of intentional agentivity is grammatically relevant in determining which argument becomes the subject while the implied aspect of biological food processing is not. However, distinguishing the grammatically relevant from the irrelevant by assigning them different locations in representations is no more than a descriptive convention unless one is able to show that grammatically relevant meaning is a particular type of meaning that can be distinguished on semantic grounds from grammatically irrelevant meaning. But do all the semantic properties that have grammatical effects (intentional agentivity and the like) form one natural semantic class and those that do not (biological food processing and the like) another? As it stands, it seems doubtful that such classes will emerge. As Jackendoff (2002: 290) notes, features with grammatical effects form a heterogeneous set.
They include well-known distinctions such as those between agent and experiencer or between causation and non-causation, but also many idiosyncratic properties, such as the distinction between sound emission verbs whose sound can be associated with an action of moving, and which therefore allow a motion construction (the car squealed around the corner), and those where this is not the case (*the car honked around the corner) (cf. Levin & Rappaport Hovav 1996).
3.5. Decompositions and their interpretation

It is evident that in order to give empirical content to theoretical claims on the basis of lexical semantic representations, the meaning of the entities and configurations of entities in these semantic representations must be clear. This is a major problem not only for decompositional approaches to word meaning. To give an example from Distributed Morphology (DM), Harley & Noyer (2000: 368) notice that cheese is a mass noun and sentences like I had three cheeses for breakfast are unacceptable. The way DM is set up requires deriving the mass noun restriction from encyclopaedic knowledge (cf. article 17 (Engelberg) Frameworks of decomposition, section 10). Thus, it is listed in the encyclopaedia that "cheese does not typically come in discrete countable chunks or types". However, to provide a claim like that with any empirical content, we need a methodology for determining what the encyclopaedic knowledge for cheese looks like.

As we will see, lexical-semantic representations often exhibit a conflict when it comes to determining what a word actually means. If we use its syntactic behaviour as a guideline for its meaning, the semantic representation is not independently motivated but circularly determined by the very structures it purports to determine. If we use our naive everyday conception of what a word means, we lack an objective methodology of determining lexical meaning. Moreover, the two paths lead to different results. If we take a naive, syntax-independent look at the encyclopaedic semantics of cheese, a visit to the next supermarket will tell us that - contrary to what Harley & Noyer claim - all cheese comes in chunks or slices, as does all sausage and some of the fruit. Thus, it seems that the encyclopaedic knowledge in DM is not arrived at by naively watching the world around you. If it were, the whole architecture of Distributed Morphology would probably break down since, as corpus evidence shows, the German words for 'cheese' Käse, 'sausage' Wurst, and 'fruit' Obst differ with respect to the count/mass distinction although their supermarket appearance with respect to chunks and slices is very similar: Wurst can be freely used as a count and a mass noun; Käse is a mass noun that can be used as a count noun, especially, but not only, when referring to types of cheese; and Obst is obligatorily a mass noun. Thus, the encyclopaedic knowledge about cheese in Distributed Morphology seems to be forced by the fact that cheese is grammatically a mass noun. This two-way dependency between grammar and encyclopaedia immunizes DM against falsification and, thus, renders a central claim of DM empirically void.

The neglect of semantic methodology and theory, particularly in those approaches that advocate a semantically constrained but otherwise free syntactic generation of argument structures, has of course been pointed out before: "Although lip service is often paid to the idea that a verb's meaning must be compatible with syntactically determined meaning [...], it is the free projection of arguments that is stressed and put to work, while the explication of compatibility is taken to be trivial." (Rappaport Hovav & Levin 2005: 275).
It has to be emphasized that it is one thing to observe differences in the syntactic behaviour of words in order to build assumptions about hitherto unnoticed semantic differences between them, but quite another matter to justify the existence of a particular semantic property of a word on the basis of a syntactic construction it occurs in and then use this property to predict its occurrence in this construction. The former is a useful heuristic method for tracing semantic properties that would then need to be justified independently; the latter is a circular construction of explanations that can rob a theory of most of its empirical value (cf. Engelberg 2006). This becomes particularly
obvious in approaches that distinguish grammatically relevant from grammatically irrelevant meaning. For example, Grimshaw (2005: 75f) claims that "some meaning components have a grammatical life" ('semantic structure') while "some are linguistically inert" ('semantic content'). Only the former have to be linguistically represented. She then discusses Jackendoff's (1990: 253) representation of eat as a causative verb, which according to Jackendoff means that x causes y to go into x's mouth. While she agrees that Jackendoff's representation captures what eat "pretheoretically" means, she does not consider causation as part of the representation of eat since eat differs from other causatives (e.g., melt) in lacking an inchoative variant (Grimshaw 2005: 85f). Thus, we are confronted with a concept of grammatically relevant causation and a concept of grammatically irrelevant causation, which apart from their grammatical relevance seem to be identical. The result is a theory that purports to explain syntactic phenomena on the basis of lexical meaning but actually just maps syntactic distinctions onto distinctions on a putatively semantic level, which, however, is not semantically motivated.

Further problems arise if decompositions are adapted to syntactic structures: There is consensus among semanticists and philosophers that causation is a binary relation with both relata belonging to the same type. Depending on the theory, the relation holds either between events or between proposition-like entities. In decompositional approaches, the causing argument is often represented as an individual argument, CAUSE(x,p), since causative verbs purportedly only allow agent-denoting NPs in subject position. When this conflict is discussed, it is usually suggested that the first argument of CAUSE is reinterpreted as 'x does something' or 'that x does something'. Although such a reinterpretation can be formally implemented, it raises the question why decomposition is done at all. One could just as well stay with simple predicate-argument structures like dry(x,y) and reinterpret them as 'something that x does causes y to become dry'. Furthermore, the decision to represent the first argument of CAUSE as an individual argument is not motivated by the meaning of the lexical item but, in a circular way, by the same syntactic structure that it claims to explain. Finally, the assumption that all causative verbs require an agentive NP in subject position is wrong. While it holds for verbs like German trocknen 'to dry', it does not hold for verbs like vergrößern 'to enlarge' that allow sentential subjects. In any case, the asymmetrical representation of causation raises the problem that two CAUSE predicates need to be introduced where one is reinterpreted in terms of the other. The problem is even bigger in approaches like Distributed Morphology that have done away with a mediating lexicon. If we assume that the bi-propositional (or bi-eventive) nature of causation is part of the encyclopaedic knowledge of vocabulary items that express causation, then causative verbs that only allow agentive subjects should be excluded from transitive verb frames completely.

Even if the predicates used in decompositions are characterized in semantic terms, the criteria are not always sufficiently precise to decide whether or not a verb has the property expressed.
Levin & Rappaport Hovav (1996) observe that there are counterexamples to the wide-spread assumption that telic intransitive verbs are unaccusative and agentive intransitive verbs are unergative, namely, verbs of sound, which are uner- gative but not necessarily agentive nor telic {beep, buzz, creak, gurgle). Therefore they propose that verbs that refer to "internally caused eventualities" arc unergative, which is the case if "[...] some property of the entity denoted by the argument of the verb is responsible for the eventuality" (Levin & Rappaport Hovav 1996: 501). If we want to apply this idea to the unaccusative German zerbrechen 'break' and the unergative
knacken 'creak', which they do not discuss, we have to check whether it is true that some property of the twig is responsible for the creaking in der Zweig hat geknackt 'the twig creaked' while there is no property of the twig that is responsible for the breaking in der Zweig ist zerbrochen 'the twig broke'. In order to do that, we must know what 'internal causation' is; that is, we have to answer questions like: What is 'causation'? What is 'responsibility'? What is 'eventuality'? Is 'responsibility', contrary to all assumptions of theories of action, a predicate that applies to properties of twigs? What property of twigs are we talking about? Is (internal) 'causation', contrary to all theories of causation, a relation between properties and eventualities? As long as these questions are not answered, proponents of the theory will agree that the creaking of the twig but not the breaking is internally caused while opponents will deny it. And there is no way to resolve this (cf. Engelberg 2001).

Similar circularities have been noticed by others. Fodor (1998: 51, 60ff), citing work by Pinker (1989) and Higginbotham (1994), criticizes that claims about linking based on interminably vague semantic properties elude any kind of evaluation. This is all the worse since the predicates involved in decompositions are not only expressions in the linguist's metalanguage but are concepts that are attributed to the speaker and his or her knowledge about language (Fodor 1998: 59).

Stipulations about the structure of decompositions can diminish the empirical value of the theory, too. Structural aspects of decompositions concern the way embedded predicates are combined as well as the argument structure of these predicates. For example, predicates like CAUSE or POSS are binary because we conceive of causation and possession as binary relations. Their binarity is a structural aspect that is deeply rooted in our understanding of these concepts. However, it is just by convention that most approaches represent the causing entity and the possessor as the first argument of CAUSE and POSS, respectively. Which argument of a multi-place predicate stands for which entity is determined by the truth conditions for this predicate. Thus, the difference between POSS(x_POSSESSOR, y_POSSESSED) and POSS(x_POSSESSED, y_POSSESSOR) is just a notational one. Whenever explanations rely on this difference, they are not grounded in semantics but in notational conventions. This is, for example, the case in Lexical Decomposition Grammar, where the first argument of POSS falls out as the higher one - with all its consequences for Theta Structure and linking principles (Wunderlich 1997: 39).

In summary, the problems with interpreting the predicates used in decompositions and structural stipulations severely limit the empirical content of predictions based on these decompositions. Two ways out of this situation have been pursued only rarely: Decompositional predicates can be given precise truth conditions, as is done in Dowty (1979), or they can be linked to cognitive concepts that are independently motivated (e.g., Jackendoff 1983, Engelberg 2006).

3.6. Decompositions and cognition

Starting from the 1970s, psycholinguistic evidence has been used to argue for or against lexical decompositions. While some approaches to decompositions accepted psycholinguistic data as relevant evidence, proponents of particular lexical theories denied that their theories are about lexical processing at all.
Dowty (1979: 391) emphasized that what the main decompositional operators determine is not what the speaker/listener
must compute but what he can infer. Similarly, Goddard (1998: 135) states that "there is no claim that people, in the normal course of linguistic thinking, compose their thoughts directly in terms of semantic primitives; or, conversely, that normal processes of comprehension involve real-time decomposition down to the level of semantic primitives." It has also been suggested that in theorizing about decompositional versus atomistic theories one should distinguish whether lexical concepts are definitionally primitive, computationally primitive (pertaining to language processing), and/or developmentally primitive (pertaining to language acquisition) (cf. Fodor et al. 1980: 313; Carey 1982: 350f).

When lexical decompositions are interpreted in psycholinguistic terms, the typical assumption is that the components of a decomposition are processed each time the lexical item is processed. Lexical processing effort should then emerge as a function from the complexity of the lexical decomposition to processing time: The more complex the decomposition, the longer the processing time. Most early psycholinguistic studies did not produce evidence for lexical decomposition (cf. Fodor et al. 1980; Johnson-Laird 1983). Employing a forced choice task and a rating test, Fodor et al. (1980) failed to find processing differences between causative verbs like kill, which are putatively decompositionally complex, and non-causative verbs like bite. Fodor, Fodor & Garrett (1975: 522) reported a significant difference between explicit negatives (e.g., not married) and putatively implicit negatives (e.g., an UNMARRIED feature in bachelor) in complex conditional sentences like (7).

(7) a. If practically all men in the room are not married, then few of the men in the room have wives.
    b. If practically all men in the room are bachelors, then few of the men in the room have wives.

Sentences like (7a) that contain explicit negatives gave rise to longer processing times, thus suggesting that bachelor does not contain hidden negatives. Measuring fixation time during reading, Rayner & Duffy (1986) did not find any differences between putatively complex words like causatives and non-causatives. Similarly, Roelofs (1997: 48ff) discussed several models for word retrieval and argued for a non-decompositional spreading-activation model.

More recent studies display a more varied picture. Gennari & Poeppel (2003) compared eventive verbs like build, distort, show, which denote causally structured events, with stative verbs like resemble, lack, love, which do not involve complex CAUSE/BECOME structures. Controlling for differences in argument structure and thematic roles, they carried out a self-paced reading study and a visual lexical decision task. They found that semantic complexity was reflected in processing time and that elements of decomposition-like structures were activated during processing. McKoon & MacFarland (2002) adopted Rappaport Hovav & Levin's (1998) template-based approach to decomposition and their distinction between verbs denoting internal causation (bloom) and external causation (break) (Levin & Rappaport Hovav 1995, 1996). They reported longer processing times for break-type verbs than for bloom-type verbs in grammaticality judgments, reading time experiments, and lexical decision tasks. They interpreted the results as confirmation that break-type verbs involve more complex decompositions than bloom-type verbs.
Different conclusions were drawn from other experiments. Applying a "release from proactive interference" technique, Mobayyen & de Almeida (2005) investigated the processing times for lexical causatives (bend, crack, grow), morphological causatives (thicken, darken, fertilize), perception verbs (see, hear, smell), and repetitive perception verbs with morphological markers (e.g., re-smell). If verbs were represented in the form of decompositions, the semantically more complex lexical and morphological causatives should pattern together and evoke longer processing times than perception verbs. However, this did not turn out to be the case. Morphological causatives and the morphologically complex re-verbs required longer processing than lexical causatives and perception verbs. Similar results have been obtained in action-naming tasks carried out with Alzheimer patients (cf. de Almeida 2007). That led Mobayyen and de Almeida to the conclusion that the latter two verb types are both semantically simple and refer to non-complex mental particulars.

Another line of psycholinguistic/neurolinguistic research concerns speakers with category-specific semantic deficits due to brain damage. Data obtained from these speakers have been used to argue for semantic feature approaches as well as for approaches employing meaning postulates in the nominal domain (cf. the discussion in de Almeida 1999). However, de Almeida (2001: 483) emphasizes that so far no one has found evidence for category-specific verb concept deficits, for example, deficits concerning features like CAUSE or GO.

Evidence for decomposition theories has also been sought in data from language acquisition. If the meaning of a word is its decomposition, then learning a decomposed word means learning its decomposition. The most explicit early theory of decomposition-based learning is Clark's (1973) semantic-feature based theory of word-learning. In her view, only some of the features that make up a lexical representation are present when a word is first acquired whereas the other features are learned while the word is already used. The assumption that these features are acquired only successively predicts that children overgeneralize heavily when acquiring a new word. In subsequent research, it turned out that Clark's theory did not conform to the data: (i) Overgeneralization does not occur as often as predicted; (ii) with recently acquired words, undergeneralization is more typical than overgeneralization; and (iii) at some stages of acquisition, the referents a word is applied to do not have any features in common (Barrett 1995: 375ff, cf. also the review in Carey 1982: 361ff). It has been repeatedly argued that meaning postulates are better suited to explain acquisition processes (cf. Chierchia & McConnell-Ginet 1990: 363f, Bartsch & Vennemann 1972: 22). However, even if Clark's theory of successive feature acquisition is not tenable, related data are cited in favor of decompositions to show that some kind of access to semantic features is involved in acquisition. For example, it has been argued that a meaning component CAUSE is extracted from verbs by children and used in overgeneralizations like he falled it (cf. the overview in Clark 2003: 233ff).
Some research on the acquisition of argument structure alternations has been used to argue for particular decompositional approaches, such as Pinker (1989) for Lexical Conceptual Structures in the vein of Levin & Rappaport Hovav, and Brinkmann (1997) for Lexical Decomposition Grammar.

In summary, the question whether decompositions are involved in language processing or language acquisition remains open. Although processing differences for different classes of verbs have to be acknowledged, it is often difficult to conclude from these data which forms of lexical representation they are compatible with.
4. Conclusion

From a heuristic and descriptive point of view, lexical decomposition has proven to be a very successful device that has made it possible to discover and to tackle numerous lexical phenomena, in particular at the syntax-semantics interface. Yet, from a theoretical point of view, lexical decompositions have remained a problematic concept that is not always well grounded in theories of semantics and cognition:

- The basic predicates within decompositions are often elusive and lack truth conditions, definitions, or an empirically grounded link to basic cognitive concepts.
- The lack of semantic grounding of decompositions often leads to circular argumentation in linking theories.
- The cognitive status of decompositions is by and large unclear; it is not known whether and how decompositions are involved in lexical processing and language acquisition.

Thus, decompositions still raise many questions: "But even if the ultimate answers are not in sight, there is certainly a sense of progress since the primitive approaches of the 1960s." (Jackendoff 2002: 377)

5. References

de Almeida, Roberto G. 1999. What do category-specific semantic deficits tell us about the representation of lexical concepts? Brain & Language 68, 241-248.
de Almeida, Roberto G. 2001. Conceptual deficits without features: A view from atomism. Behavioral and Brain Sciences 24, 482-483.
de Almeida, Roberto G. 2007. Cognitive science as paradigm of interdisciplinarity: The case of lexical concepts. In: J. L. Audy & M. Morosini (eds.). Interdisciplinarity in Science and at the University. Porto Alegre: EdiPUCRS, 221-276.
Barrett, Martyn 1995. Early lexical development. In: P. Fletcher & B. MacWhinney (eds.). The Handbook of Child Language. Oxford: Blackwell, 362-392.
Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between Semantics and Syntax. Frankfurt/M.: Athenäum.
Bendix, Edward H. 1966. Componential Analysis of General Vocabulary. The Semantic Structure of a Set of Verbs in English, Hindi, and Japanese. Bloomington, IN: Indiana University.
Bierwisch, Manfred 1969. Certain problems of semantic representations. Foundations of Language 5, 153-184.
Bierwisch, Manfred 1970. Semantics. In: J. Lyons (ed.). New Horizons in Linguistics. Harmondsworth: Penguin, 166-184.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227-266.
Brinkmann, Ursula 1997. The Locative Alternation in German. Its Structure and Acquisition. Amsterdam: Benjamins.
Carey, Susan 1982. Semantic development: The state of the art. In: E. Wanner & L. R. Gleitman (eds.). Language Acquisition. The State of the Art. Cambridge: Cambridge University Press, 347-389.
Carnap, Rudolf 1952. Meaning postulates. Philosophical Studies 3, 65-73.
Chierchia, Gennaro & Sally McConnell-Ginet 1990. Meaning and Grammar. An Introduction to Semantics. Cambridge, MA: The MIT Press.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Clark, Eve V. 1973. What's in a word? On the child's acquisition of semantics in his first language. In: T. E. Moore (ed.). Cognitive Development and the Acquisition of Language. New York: Academic Press, 65-110.
Clark, Eve V. 2003. First Language Acquisition. Cambridge: Cambridge University Press.
Coseriu, Eugenio 1964. Pour une sémantique diachronique structurale. Travaux de Linguistique et de Littérature 2, 139-186.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Engelberg, Stefan 2001. Immunisierungsstrategien in der lexikalischen Ereignissemantik. In: J. Dölling & T. Zybatow (eds.). Ereignisstrukturen. Leipzig: Institut für Linguistik der Universität Leipzig, 9-33.
Engelberg, Stefan 2006. A theory of lexical event structures and its cognitive motivation. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 235-285.
Fillmore, Charles J. 1968. Lexical entries for verbs. Foundations of Language 4, 373-393.
Fodor, Janet D. 1977. Semantics: Theories of Meaning in Generative Grammar. New York: Crowell.
Fodor, Janet D., Jerry A. Fodor & Merrill F. Garrett 1975. The psychological unreality of semantic representations. Linguistic Inquiry 6, 515-531.
Fodor, Jerry A. 1970. Three reasons for not deriving 'kill' from 'cause to die'. Linguistic Inquiry 1, 429-438.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Clarendon Press.
Fodor, Jerry A., Merrill F. Garrett, Edward C. T. Walker & Cornelia H. Parkes 1980. Against definitions. Cognition 8, 263-367.
Gennari, Silvia & David Poeppel 2003. Processing correlates of lexical semantic complexity. Cognition 89, B27-B41.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press.
Greimas, Algirdas Julien 1966. Sémantique structurale. Paris: Larousse.
Grimshaw, Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence from nominalizations. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 349-375.
Harris, William T. (ed.) 1923. Webster's New International Dictionary of the English Language. Springfield, MA: Merriam.
Higginbotham, James 1994. Priorities of thought. Supplementary Proceedings of the Aristotelian Society 68, 85-106.
Hjelmslev, Louis 1963. Omkring sprogteoriens grundlæggelse. Festskrift udgivet af Københavns Universitet i anledning af Universitetets Aarsfest. København: E. Munksgaard, 1943, 3-113. English translation: Prolegomena to a Theory of Language. Madison, WI: The University of Wisconsin Press, 1963.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1992. Languages of the Mind. Essays on Mental Representation. Cambridge, MA: The MIT Press.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Johnson-Laird, Philip N. 1983. Mental Models. Towards a Cognitive Science of Language, Inference, and Consciousness. Cambridge, MA: Harvard University Press.
Katz, Jerrold J. 1967. Recent issues in semantic theory. Foundations of Language 3, 124-194.
Katz, Jerrold J. 1971. Semantic theory. In: D. D. Steinberg & L. A. Jakobovits (eds.).
Semantics. An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge: Cambridge University Press, 297-307.
Katz, Jerrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39, 170-210.
Katz, Jerrold J. & Paul M. Postal 1964. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: The MIT Press.
Kornfilt, Jaklin & Nelson Correa 1993. Conceptual structure and its relation to the structure of lexical entries. In: E. Reuland & W. Abraham (eds.). Knowledge and Language, vol. II: Lexical and Conceptual Structure. Dordrecht: Kluwer, 79-118.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 545-655.
Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago, IL: The University of Chicago Press.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics Interface. Cambridge, MA: The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1996. Lexical semantics and syntactic structure. In: S. Lappin (ed.). The Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 487-507.
Lewis, David 1973. Causation. The Journal of Philosophy 70, 556-567.
Löbner, Sebastian 2002. Understanding Semantics. London: Arnold.
Lorenz, Wolfgang & Gerd Wotjak 1977. Zum Verhältnis von Abbild und Bedeutung. Überlegungen im Grenzfeld zwischen Erkenntnistheorie und Semantik. Berlin: Akademie Verlag.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
McKoon, Gail & Talke MacFarland 2002. Event templates in the lexical representations of verbs. Cognitive Psychology 45, 1-44.
Mobayyen, Forouzan & Roberto G. de Almeida 2005. The influence of semantic and morphological complexity of verbs on sentence recall: Implications for the nature of conceptual representation and category-specific deficits. Brain and Cognition 57, 168-175.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, MA: The MIT Press.
Pottier, Bernard 1963. Recherches sur l'analyse sémantique en linguistique et en traduction mécanique. Nancy: Publications Linguistiques de la Faculté de Lettres.
Pottier, Bernard 1964. Vers une sémantique moderne. Travaux de Linguistique et de Littérature 2, 107-137.
Pulman, Stephen G. 2005. Lexical decomposition: For and against. In: J. I. Tait (ed.). Charting a New Course: Natural Language Processing and Information Retrieval. Essays in Honour of Karen Spärck Jones. Dordrecht: Kluwer, 155-174.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press.
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 97-134.
Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274-286.
Rayner, Keith & Susan A. Duffy 1986. Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition 14, 191-201.
Roelofs, Ardi 1997. A case for nondecomposition in conceptually driven word retrieval. Journal of Psycholinguistic Research 26, 33-67.
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli et al. (eds.). Lexical Knowledge in the Organisation of Language.
Amsterdam: Benjamins, 81-177.
Svensén, Bo 1993. Practical Lexicography. Principles and Methods of Dictionary-Making. Oxford: Oxford University Press.
Thorndike, Edward L. 1941. Thorndike Century Senior Dictionary. Chicago, IL: Scott, Foresman and Company.
Wiegand, Herbert E. 1989. Die lexikographische Definition im allgemeinen einsprachigen Wörterbuch. In: F. J. Hausmann et al. (eds.). Wörterbücher. Dictionaries. Dictionnaires. Ein internationales Handbuch zur Lexikographie. An International Encyclopedia of Lexicography. Encyclopédie internationale de lexicographie (HSK 5.1). Berlin: de Gruyter, 530-588.
Wunderlich, Dieter 1997. Cause and the structure of verbs. Linguistic Inquiry 28, 27-68.

Stefan Engelberg, Mannheim (Germany)
II. History of semantics

8. Meaning in pre-19th century thought

1. Introduction
2. Semantic theories in classical antiquity
3. Hellenistic theories of meaning (ca. 300 B.C.-200 A.D.)
4. Late classical sources of medieval semantics
5. Concepts of meaning in the scholastic tradition
6. Concepts of meaning in modern philosophy
7. References

Abstract

The article provides a broad survey of the historical development of western theories of meaning from antiquity to the late 18th century. Although it is chronological and structured by the names of the most important authors, schools, and traditions, the focus is mainly on the theoretical content relevant to the issue of linguistic meaning, or on doctrines that are directly related to it. I attempt to show that the history of semantic thought does not have the structure of a continuous ascent or progress; it is rather a complex multilayer process characterized by several ruptures, such as the decline of ancient learning or the expulsion of medieval learning by Renaissance Humanism, each connected with substantial losses. Quite a number of the discoveries of modern semantics are therefore in fact rediscoveries of much older insights.

1. Introduction

Although it is commonly agreed that semantics as a discipline emerged in the 19th and 20th centuries, the history of semantic theories is both long and rich. In fact semantics started with a rather narrow thematic scope, since it focused, like Reisig, on the "development of the meaning of certain words, as well as the study of their use", or, like Bréal, on the "laws which govern changes in meaning, the choice of new expressions, the birth and death of phrases" (see article 9 (Nerlich) Emergence of semantics), but many of the theoretical issues that were opened up by this flourishing discipline during the 20th century were already traditional issues of philosophy, especially of logic.

The article aims to give an overview of the historical development of western theories of meaning from antiquity to the late 18th century. The attempt to condense more than 2000 years of intense intellectual labor into 25 pages necessarily leads to omissions and selections which are, of course, to a certain degree subjective. The article should not therefore be read with the expectation that it will tell the whole story. Quite a lot of what could or should have been told is actually not even mentioned. Still it is, I think, a fair sketch of the overall process of western semantic thought.

The oldest known philosophical texts explicitly concerned with the issue of linguistic meaning evolved out of more ancient reflexions on naming and on the question whether, and in what sense, the relation between words and things is conventional or natural. Plato
and Aristotle were fairly aware that word meaning needs to be discussed together with sentence meaning (see sections 2.2., 2.3.). The Stoic logicians later on were even more aware of this (see section 3.), but Aristotle's approach in his book Peri hermeneias proved more influential than theirs. His account of the relation between linguistic expressions (written and spoken), thoughts and things, combining linguistics (avant la lettre), epistemology, and ontology (see section 4.2.), brought about a long-lasting tendency within the scholastic tradition of Aristotle commentaries to focus on word meaning and to center semantics on the question whether words signify mental concepts or things. The numerous theories that have been designed to answer this question (see section 5.2.) have strengthened the insight, shared by modern conceptual semantics, that linguistic signs do not refer to the world per se, but rather to the world as conceptualized by language users (see article 30 (Jackendoff) Conceptual Semantics). The way in which the question was put and answered in the scholastic tradition, however, in some respects fell short of its own standards. For ever since Abelard (see section 5.3.) there existed a propositional approach to meaning in scholastic logic, fleshed out especially in the terminist logic (see section 5.4.), whose elaborate theories of syncategorematic terms and supposition suggested that by no means every word is related to external things, and that the reference of terms is essentially determined by the propositional context. Medieval philosophy gave birth to a number of innovative doctrines such as Roger Bacon's 'use theory' of meaning (see section 5.5.), the speculative grammar, functionally distinguishing the word classes according to their different modes of signifying (see section 5.6.), or the late medieval mentalist approach, exploring the relations between spoken utterances and the underlying mental structures of thought (see section 5.7.).

Compared to the abundance of semantic theories produced particularly within the framework of scholastic logic, the early modern contribution to semantics in the narrow sense is fairly modest (see section 6.1.). The philosophy of the 17th and 18th centuries, however, disconnecting semantics from logic and centering language theory on epistemology, opened up important new areas of interest, and reopened old ones. So, the idea of a universal grammar underlying all natural languages was forcefully reintroduced by the rationalists in the second half of the 17th century independently from the medieval grammatica speculativa (see section 6.4.). One of the new issues, or issues that were discussed in a new way, was the question of the function of language, or of signs in general, in the process of thinking. The fundamental influence of language on thought and knowledge became widely accepted, particularly, though not exclusively, within the empiricist movement (see sections 6.2., 6.3., 6.5., 6.6.). Another innovation: from the late 17th century the philosophy of language showed a growing consciousness of and interest in the historical dimension of language (see section 6.7.).

2. Semantic theories in classical antiquity

2.1. The preplatonic view of language

Due to the fragmentary survival of textual relics from the preplatonic period, the very beginnings of semantic thought are hidden from our sight.
Though some passages in the preserved texts as well as in several later sources might prompt speculations about the earliest semantic reflexions, the textual material available seems to be insufficient for a reliable reconstruction of a presocratic semantic theory in the proper sense. The case
of the so-called Sophists (literally: 'wise-makers') of the 5th century B.C. is somewhat different: they earned their living mainly by teaching knowledge and skills that were thought to be helpful for gaining private and political success. Their teachings largely consisted of grammar and rhetoric as resources for the command of language and argumentation. One of their standard topics was the 'correctness of names' (orthotes onomaton), which was seemingly treated by all the main sophists in separate books, such as Protagoras (ca. 490-420 B.C.), Hippias (ca. 435 B.C.), and Prodicus (ca. 465-415 B.C.), the temporary teacher of Socrates (Schmitter 2001).

2.2. Plato (427-347 B.C.)

The problem of the correctness of names, spelled out as the question whether the relation between names and things is natural or conventional, also makes up the main topic of Plato's dialogue Cratylus (ca. 388 B.C.), the earliest known philosophical text on language theory. The position of naturalism is represented by Cratylus, claiming that "everything has a right name of its own, which comes by nature, and that a name is not whatever people call a thing by agreement, ... but that there is a kind of inherent correctness in names, which is the same for all men, both Greeks and barbarians." (Cratylus 383a-b). The opposite view of conventionalism is advanced by Hermogenes, who denies that "there is any correctness of name other than convention (syntheke) and agreement (homologia)" and holds that "no name has been generated by nature for any particular thing, but rather by the custom and usage (thesei) of those who use the name and call things by it" (384c-d). Democritus (ca. 420 B.C.) had previously argued against naturalism by pointing to semantic phenomena such as homonymy, synonymy, and name changes. The position represented by Hermogenes, however, goes beyond the conventionalism of Democritus insofar as he does not make any difference between the various natural languages and an arbitrary individual name-giving (Cratylus 385d-e) resulting in autonomous idiolects or private language.

Plato's dialogue expounds with great subtlety the problems connected to both positions. Plato advocates an organon theory of language, according to which language is a praxis and each name functions as "an instrument (organon) of teaching and of separating (i.e. classifying) reality" (388b-c). He is therefore on the one hand showing, contrary to Hermogenes's conventionalism, that names, insofar as they stand for concepts of things, have to be well defined and to take into account the nature of things in order to carve reality at the joints. On the other hand he demonstrates against Cratylus's overdrawn naturalism the absurdities one can encounter in trying to substantiate the correctness of names etymologically, by tracing all words to a core set of primordial or 'first names' (prota onomata) linked to the objects they denote on grounds of their phonological qualities. Even if Plato's own position is more on the side of a refined naturalism (Sedley 2003), the dialogue, marking the strengths and shortcomings of both positions, leaves the question undecided. Its positive results rather consist (1) in underlining the difference between names and things and (2) in the language-critical warning that "no man of sense should put himself or the education of his mind in the power of names" (440c).
Plato in his Cratylus (421d-e, 424e, 431b-c) draws, for the first time, a distinction between the word-classes of noun (onoma) and verb (rhema). In his later dialogue Sophistes he
shows that the essential function of language does not consist in naming things but rather in forming meaningful propositions endowed with a truth-value (Soph. 261c-262c; Baxter 1992). The question of truth is thus moved from the level of individual words and their relation to things to the more adequate level of propositions. No less important, however, is the fact that Plato uses this as the basis for what can be seen as the first account of propositional attitudes (see article 60 (Swanson) Propositional attitudes). For in order to claim that "thought, opinion, and imagination ... exist in our minds both as true and false" he considers it necessary to ascribe a propositional structure to them. Hence Plato defines "forming opinion as talking and opinion as talk which has been held, not with someone else, nor yet aloud, but in silence with oneself" (Theaetetus 190a), and claims, more generally, that "thought and speech (are) the same, with this exception, that what is called thought is the unuttered conversation of the soul with herself", whereas "the stream of thought which flows through the lips and is audible is called speech" (Soph. 263e). Even if this view is suggested already by the Greek language itself, in which logos means (inter alia) both speech or discourse and thought, it is in the Sophistes where the idea that thought is a kind of internal speech is explicitly stated for the first time. Thus, the Sophistes marks the point of departure of the long tradition of theories on the issue of mental speech or language of thought.

2.3. Aristotle (384-322 B.C.)

Aristotle's book Peri hermeneias (De interpretatione, On Interpretation), especially the short introductory remarks (De int. 16a 3-8), can be seen as "the most influential text in the history of semantics" (Kretzmann 1974: 3). In order to prepare the ground for his attempt to explain the logico-semantic core notions of name, verb, negation, affirmation, statement and sentence, Aristotle delineates the basic coordinate system of semantics, comprising the four elements he considers to be relevant for giving a full account of linguistic signification: i.e. (1) written marks, (2) spoken words, (3) mental concepts, and (4) things. This system, which under labels like 'order of speech' (ordo orandi) or 'order of signification' (ordo significationis, see 5.2.) became the basic framework for most later semantic theories (see Tab. 8.1.), does not only determine the interrelation of these elements but also gives some hints about the connection of semantics and epistemology. According to Aristotle (De int. 16a 3-8), spoken words (ta en te phone; literally: 'those which are in the voice') are symbols (symbola) of affections (pathemata) in the soul (i.e. mental concepts (noemata) according to De int. 16a 10), and written marks symbols of spoken words. And just as written marks are not the same for all men, neither are spoken sounds. But what these are in the first place signs (semeia) of - affections of the soul - are the same for all; and what these affections are likenesses (homoiomata) of - actual things - are also the same.
Though the interpretation of this passage is highly controversial, the most commonly accepted view is that Aristotle seems to claim that the four elements mentioned are interrelated such that spoken words, which are signified by written symbols, are signs of mental concepts, which in turn are likenesses (or images) of things (Weidemann 1982). This passage presents at least four assumptions that are fundamental to Aristotle's semantic theory:

1. Mental concepts are natural and therefore intersubjectively the same for all human beings.
2. Written or spoken words are, in contrast, conventional signs, i.e. sounds significant by agreement (kata syntheken, De int. 16a 19-27).
3. Mental concepts which are both likenesses (homoiomata) and natural effects of external things are directly related to reality and essentially independent from language.
4. Words and speech are at the same time sharply distinguished from the mental concepts and closely related to them, insofar as they refer to external things only through the mediation of concepts.

Aristotle's description of the relation between words, thoughts, and things can therefore be seen as an anticipation of the so-called 'semantic triangle' (Lieb 1981). All four tenets played a prominent role in philosophy of language for a long time, and yet each of them, at one time or another, came under severe attack. The tenet that vocal expressions signify things through the mediation of concepts, which the late ancient commentators saw as the core thesis of Aristotle's semantics (Ebbesen 1983), leads, if spelled out, to the distinction of intension and extension (Graeser 1996: 39). For it seems to be just an abbreviated form of expressing that for each word to signify something (semainei ti) depends upon its being connected to an understanding or a formula (logos) which can be a definition, a description or a finite set of descriptions (Metaphysics 1006a 32-1006b 5) picking out a certain thing or a certain kind of things by determining what it is to be such and such a thing (Met. 1030a 14-17).

Due to the prominence of the order of signification it may seem as if Aristotle were focusing on word semantics. There is, however, good reason to maintain that Aristotle's primary intention was a semantics of propositions. Sedley (1996: 87) has made the case that for Aristotle, just as for other teleologists like Plato and the Stoics, "who regard the whole as ontologically prior to the part ... the primary signifier is the sentence, and individual words are considered only secondarily, in so far as they contribute to the sentence's function." In the face of this he characterizes De interpretatione, which is usually seen as a specimen of word semantics, as "the most seriously misunderstood text in ancient semantics" (Sedley 1996: 88). This may hold for most of the modern interpretations but not so for the medieval commentators. For quite a number of them were well aware that the somewhat cryptic formulation "those which are in the voice" does not necessarily stand for nouns and verbs exclusively but was likely meant to include also propositions and speech in general. That the treatment of words is tailored to that of propositions is indicated already by the way in which Aristotle distinguishes onomata (nouns) and rhemata (predicative expressions) not as word classes as such but rather as word classes regarding their function in sentences. For when, as Aristotle remarks, predicative expressions "are spoken alone, as such, they are nouns (onomata) that signify something - for the one who utters them forms the notion [or: arrests the thought], and the hearer pauses" (De int. 16a 19-21). The two classes of nouns and predicative expressions, the members of which are described as the smallest conventionally significant units, are complemented by the 'conjunctions' (syndesmoi), which include conjunctions, the article, and pronouns, and are the direct ancestors of Priscian's syncategoremata that became a major topic in medieval logical semantics (see section 5.4.).
Although the natural/conventional distinction separates mental concepts from spoken language, Aristotle describes the performance of thinking, just like Plato, in terms of an
internal speaking. He distinguishes between an externally spoken logos and a logos or "speech in the soul" (Met. 1009a 20), the latter of which provides the foundation of signification, logical argumentation and demonstration (Posterior analytics 76b 24; Panaccio 1999: 34-41), and is at the same time the place where truth and falsehood are properly located (Meier-Oeser 2004: 314). Thus, when in De interpretatione (17a 2-24) it is said that the fundamental logical function of declarative sentences to assert or deny something of something presupposes a combination of a name and a verb or an inflection of a verb, this interconnection of truth or falsity and propositionality only seemingly refers primarily to spoken language. For in other passages, as Met. 1027b 25-30, or in the last chapter of De interpretatione, Aristotle emphasizes that combination and separation and thus truth "are in thought" properly and primarily. How this internal discourse, which should be 'the same for all', precisely relates to spoken language remains unclear in Aristotle.

The close connection Aristotle has seen between linguistic signification and thought becomes evident in his insistence on the importance of an unambiguous and well-defined use of language, the neglect of which must result in a destruction of both communication and rational discourse; for, as he claimed (Met. 1006b 7-12): not to signify one thing is to signify nothing, and if words do not signify anything there is an end of discourse with others, and even, strictly speaking, with oneself; because it is impossible to think of anything if we do not think of one thing.

Noticing that the normal way of speaking does not conform to the ideal of unambiguity, Aristotle is well aware of the importance of a detailed analysis of the semantic issues of homonymy, synonymy, paronymy etc. (Categories 1a; Sophistical refutations 165f) and devotes the first two books of his Topics to the presentation of rules for detecting, and strategies for avoiding, ambiguity.

3. Hellenistic theories of meaning (ca. 300 B.C.-200 A.D.)

It is uncontroversial that a fully fledged theory of language, considering all the different phonetic, syntactic, semantic, and pragmatic aspects of language as its essential subject matter, is the invention of the Stoic philosophers (Pohlenz 1939). However, since hardly any authentic Stoic text has come down to us, it is still true that "the nature of the Stoics' philosophy of language is the most tantalizing problem in the history of semantics" (Kretzmann 1967: 363). After a long period of neglect and contempt, the Stoic account of semantics is today generally seen as superior even to Aristotle's, which was historically more influential (Graeser 1978: 77). For it has become a current conviction that central points of Stoic semantics are astonishingly close to some fundamental tenets that have paved the way for modern analytical philosophy.

According to Sextus Empiricus (Adv. math. 8.11-12; Long & Sedley 1987 [= L&S] 33B) the principal elements of Stoic semantics are:

1. the semainon, i.e. that which signifies, or the signifier, which is a phoneme or grapheme, i.e. the material configuration that makes up a spoken word or rather - because Stoic semantics is primarily a sentence semantics - a spoken or written sentence;
2. the tynchanon (or: 'name-bearer'), i.e. the external material object or event referred to; and
3. the semainomenon, i.e. that which is signified. This is tantamount to the core concept and the most characteristic feature of the Stoic propositionalist semantics, the lekton (which can be translated both as 'that which is said' and 'that which can be said', i.e. the 'sayable'). The lekton proper or the 'complete lekton' (lekton autoteles) is the meaning of a sentence, whereas the meaning of a word is characterized as an 'incomplete lekton' (lekton ellipes). Even though questions and demands may have a certain kind of lekton, the prototype of the Stoic lekton corresponds to what in modern terminology would be classified as the propositional content of a declarative sentence.

Whereas the semainon and the tynchanon are corporeal things or events, the lekton is held to be incorporeal. This puts it in an exceptional position within the materialist Stoic ontology, which considered almost everything - even god, soul, wisdom, truth, or thought - as material entities. Hence the lekton "is not to be identified with any thoughts or with any of what goes on in one's head when one says something" (Annas 1992: 76). This is evident also from the fact that the lekton corresponds to the Stoic notion of pragma which, however, in Stoic terminology does not stand for an external thing, as it does for Aristotle, but rather for a fact or something which is the case. The Stoic complement of the Aristotelian notion of mental concept (noema) is the material phantasia logike, i.e. a rational or linguistically expressible presentation in the soul. The phantasia logike is a key feature in the explanation of how incorporeal meaning is connected to the physical world, since the Stoics maintained "that a lekton is what subsists in accordance with a phantasia logike" (Adv. math. 8.70; L&S 33C). It should be clearly distinguished from the lekton or meaning as such. The phantasia logike makes up the 'internal discourse' (logos endiathetos) and is thus part of the subjective act of thinking, in contrast to the lekton, which is the "objective content of acts of thinking (noeseis)" (Long 1971: 82). The semainon and the tynchanon more or less correspond to the written or spoken signs and the things in the Aristotelian order of signification, but the latter has no equivalent of the lekton (see Tab. 8.1.). Meaning, as the Stoics take it, is neither some thing nor something in the head. Nor is the Stoic lekton a quasi-Platonic entity that would exist in some sense independent of whether or not one thinks of it, unlike Frege's notion of Gedanke (thought) (Graeser 1978: 95; Barnes 1993: 61). Within the context of logical inference and demonstration a lekton may function as a sign (semeion, not to be confused with semainon!), i.e. as a "leading proposition in a sound conditional, revelatory of the consequent" (Sextus Empiricus, Outlines of Pyrrhonism, L&S 35C).

Besides the notion of propositional content, one of the most interesting and innovative elements of Stoic semantics is the discovery that taking into account propositional content alone might not in every case be sufficient in order to make explicit what a sentence means. This discovery, pointing in the direction of modern speech act theory, is indicated when Plutarch (with critical intention) reports that the Stoics maintain "that those who forbid say one thing, forbid another, and command yet another.
For he who says 'do not steal' says just this,'do not steal', forbids stealing and commands not stealing" (Plutarch, On Stoic self-contradictions 1037d-e). In this case there are three different lekta associated to a single sentence, corresponding to the distinct linguistic functions this sentence can perform.Thus, the Stoics seem to have been well aware that the explication of meaning "involves not only the things we talk about and the thoughts we express but also the jobs we do by means of language alone" (Kretzmann 1967:365a).
152 II. History of semantics A semantic theory which is significantly different from both the Aristotelian and the Stoic is to be found in the Epicurean philosophers who, as Plutarch (Against Colotes; L&S 19K) reports,"completely abolish the class of sayables ..., leaving only words and name-bearers" (i.e. external things), so that words refer directly to the objects of sensory perception.This referential relation still presupposes a mental'preconception' (pro- lepsis) or a schematic presentation (typos) of the thing signified, for we would not "have named something if we had not previously learnt its delineation (typos) by means of preconception (prolepsis)" (Diogenes Laertius, Lives and Opinions of eminent Philosophers L&S 17E), but signification itself remains formally a two-term relation. It is true, the semantic relation between words and things is founded on a third; but this intermediate third, the prolepsis or typos, unlike the passiones or concepts in the Aristotelian account, does not enter the class of what is signified (see Tab. 8.1.). Tab. 8.1: The ordo significationis Labeled grey fields stand for elements (I, II, III) being signifiers and/or signified. Labeled white fields stand for semantic relations (1, 2,3) characterizing the element on the left in regard to the one on the right. Labeled light grey fields stand for elements involved in the process of signification though neither being signifiers nor signified. 4. Late classical sources of medieval semantics 4.1. Augustinus (354-430) Whereas Aristotelian logic and semantics was transmitted to the Middle Ages via Boethius, the relevant works of Augustinus provide a complex compound of modified Stoic, Skeptic and Neoplatonic elements, together with some genuinely new ideas. Probably his most important and influential contribution to the history of semantics and semiotics consists of (1) the explicit definition of words as signs, and (2) his definition of the sign which, for the first time, included both the natural indexical sign and the conventional linguistic sign as species of an all-embracing generic notion of sign. Augustinus
thus opened a long tradition in which the theory of language is viewed as a special branch of the more comprehensive theory of signs. The sign, in general, is defined as "something which, offering itself to the senses, conveys something other to the intellect" (Augustinus 1963: 33). This triadic sign conception provides the general basis for Augustinus's communicative approach to language, which holds that a word is a "sign of something, which can be understood by the hearer when pronounced by the speaker" (Augustinus 1975: 86). In contrast to natural signs which, "apart from any intention or desire of using them as signs, do yet lead to the knowledge of something else", words are conventional signs "living beings mutually exchange in order to show ... the feelings of their minds, or their perceptions, or their thoughts" (Augustinus 1963: 34). The preeminent position of spoken words among the different sorts of signs used in human communication, some of which "relate to the sense of sight, some to that of hearing, a very few to the other senses", does not result from their quantitative preponderance but rather from their significative universality, i.e. from the fact that, as Augustinus sees it, everything which can be indicated by nonverbal signs could be put into words but not vice versa (Augustinus 1963: 35).

The full account of linguistic meaning entails four elements (Augustinus 1975: 88f): (1) the word itself, as an articulate vocal sound, (2) the dicibile, i.e. the sayable or "whatever is sensed in the word by the mind rather than by the ear and is retained in the mind", (3) the dictio, i.e. the word in its ordinary significative use in contrast to the same word just being mentioned; it "involves both the word itself and that which occurs in a mind as the result of the word" (i.e. the dicibile or meaning), and (4) the thing signified (res) in the broadest sense comprising anything "understood, or sensed, or inapprehensible". Even if the notion of dicibile obviously refers to the Stoic lekton, Augustinus has either missed or modified the essential point. For describing it as "what happens in the mind by means of the word" implies nothing less than the rejection of the Stoic expulsion of meaning (lekton) from the mind. It is as if he mixed up the lekton with the phantasia logike, which was characterized as a representation whose content "can be expressed in language" (Sextus Empiricus, Adv. math. 8.70; L&S 33C).

Augustinus's emphasis on the communicative and pragmatic aspects of language is manifest in his concept of the 'force of the word' (vis verbi), which he describes as the "efficacy to the extent of which it can affect the hearer" (Augustinus 1975: 100). The import spoken words have on the hearer is not confined to their bare signification but includes a certain value and some emotive moments resulting from their sound, their familiarity to the hearer or their common use in language - aspects which Frege called Färbungen (colorations). In his later theory of the verbum mentis (mental word), especially in De trinitate, Augustinus advocates the devaluation of the spoken word against the internal sphere of mental cognition. It is now the mental or 'interior word' (verbum interius), i.e. the mental concept, that is considered as word in its most proper sense, whereas the spoken word appears as a mere sign or voice of the word (signum verbi, vox verbi; Augustinus 1968: 486).
In line with the old concept of an internal speech, Augustinus claims that thoughts (cogitationes) are performed in mental words. The verbum mentis, however, corresponding to what was later called the conceptus mentis or intellectus, is by no means a linguistic entity in the proper sense, for it is "nullius linguae", i.e. it does not belong to any spoken language like Latin or Greek.
Between mental and spoken words there is a further level of speech, consisting of imaginative representations of spoken words, or, as he calls them, imagines sonorum (images of sounds), closely corresponding to Saussure's notion of image acoustique. In Augustinus's theory of language this internalized version of uttered words does not seem to play a major role, but it will gain importance in late medieval and early modern reflections on the influence of language on thought.

4.2. Boethius (480-528)

Boethius' translations of and comments on parts of the Aristotelian Organon (especially De interpretatione) were for a long time the medieval world's only available source texts for the semantics of Aristotle and his late ancient Neoplatonic commentators. The medieval philosophers thus first viewed Aristotle's logic through the eyes of Boethius, who made some influential decisions on semantic terminology, as well as on the interpretation of the Aristotelian text. What they learned through his writings were inter alia the conventional character of language, the view that meaning is established by an act of 'imposition', i.e. name-giving or reference-setting, and the influential idea that to 'signify' (significare) is to "establish an understanding" (intellectum constituere). Especially in his more elaborate second commentary on De interpretatione, Boethius discusses at length Aristotle's four elements of linguistic semeiosis (scripta, voces, intellectus, res), which he calls the 'order of speaking' (ordo orandi) (Magee 1989: 64-92). The ordo orandi determines the direction of linguistic signification: written characters signify spoken words, whereas spoken words primarily signify mental concepts and, by means of the latter, secondarily denote the things, or, in short: words signify things by means of concepts (Boethius 1880: 24, 33). The first step of the process of homogenizing or systematizing the semantic relations between these four elements, later continued in the Middle Ages (see 5.2.), is to be seen in Boethius's translation of Aristotle's De interpretatione (De int. 16a 3-8) (Meier-Oeser 2009). For whereas Aristotle characterizes these relations with the three terms of symbola, semeia, and homoiomata, Boethius translates both symbola and semeia as notae (signs; see Tab. 8.1.).

Boethius adds a further distinction found in the work of the late classical Aristotle commentators: he distinguishes three levels of speech. Besides - or rather, at the basis of - written and spoken discourse there is a mental speech (oratio mentis) in which thinking is performed. This mental speech is, just like Augustinus's mental word, not made up of words of any national language but rather of transidiomatic or even non-linguistic mental concepts (Boethius 1880: 36) which are, as Aristotle had claimed, "the same for all".

5. Concepts of meaning in the scholastic tradition

The view that semantic issues are not only a subject matter of logic but its primary and most fundamental one is characteristic of the scholastic tradition. This tradition was not confined to the Middle Ages but continued, after a partial interruption in the mid-16th century, during the 17th and 18th centuries. Because it is the largest and most elaborate tradition in the history of semantics, it seems advisable to begin with an overview of some basic aspects of medieval semantics, such as the use and the definition of
the pivotal term of significatio (see 5.1.), and the most fundamental medieval debate on that subject (see 5.2.).

5.1. The use and definition of significatio in the scholastic tradition

The problematic vagueness or ambiguity that has frequently been pointed out regarding the notion of meaning holds, at least partly, for the Latin term significatio as well. There is, however, evidence that some medieval authors were aware of this terminological difficulty. Robert Kilwardby (1215-1279), for instance, noted that significatio can designate either the 'act or form of the signifier' (actus et forma significantis), the 'signified' (significatum), or the relation between the two (comparatio signi ad significatum; Lewry 1981: 379). A look at the common scholastic use of the term significatio does not only verify this diagnosis but reveals a number of further variants (Meier-Oeser 1996: 763-765) which mostly resulted from debates on the appropriate ontological description of significatio as some sort of quality, form, relation, act etc. The scholastic definitions of significatio and significare (to signify), however, primarily took up the question what signification is in the sense of 'what it is for a word or sign to signify'. There is a current misconception: the medieval notions of sign and signification are often described in terms of what Bühler (1934: 40) called the "famous formula of aliquid stat pro aliquo" (something stands for something). This description, confusing signification with supposition (see 5.4.), falls short by reducing signification to a two-term relation between a sign and its significate, whereas in scholastic logic the relation of a sign to a cognitive faculty of a sign recipient is always a constitutive element of signification. In this vein, the most widely spread scholastic definition (Meier-Oeser 1996: 765-768), based on Boethius's translation of Aristotle's De interpretatione (De int. 16b 20), characterizes signification or the act of signifying as "to establish an understanding" (constituere intellectum) of some thing, or "to evoke a concept in the intellect of the hearer". The relation to the sign recipient remains pivotal when in the later Middle Ages the act of signifying is primarily defined as "to represent something to an intellect" (aliquid intellectui repraesentare). The common trait of the definitions mentioned (and all the others not mentioned) is (1) to describe signification - in contrast to meaning - not as something a word has, but rather as an act of signifying which is - in contrast to the stare pro - involved in a triadic relation including both the significate and a cognitive faculty.

5.2. The order of signification and the great altercation about whether words are signs of things or concepts

The introductory remarks of Aristotle's De interpretatione have been described as "the common starting point for virtually all medieval theories of semantics" (Magee 1989: 8). They did at least play a most important role in medieval semantics. In the late 13th and early 14th centuries the order of the four principal elements of linguistic signification (i.e. written and spoken words, mental concepts, and things) is characterized as ordo significationis (order of signification; Aquinas 1989: 9a), or ordo in significando (order in signifying; Ockham 1978: 347). The coinage of these expressions is the terminological outcome of the second step in the process mentioned above of homogenizing
the relations between these four elements. It took place in the mid-13th century, when mental concepts began to be described as signs. The Boethian pair of notae and similitudines was thus further reduced to the single notion of sign (signum, see Tab. 8.1.), so that the entire order of signification was then uniformly described in terms of sign relations, or, as Antonius Andreas (ca. 1280-1320) said: "written words, vocal expressions, concepts in the soul and things are coordinated according to the notion of sign and significate" (Andreas 1508: fol. 63va).

From the second half of the 13th century on, most scholastic logicians shared the view that mental concepts were signs, which provided new options for solving the "difficult question of whether a spoken word signifies the mental concept or the thing" (Roger Bacon 1978: 132). This question made up the subject matter of the most fundamental scholastic debate on semantic issues. John Duns Scotus (1265/66-1308) labeled it "the great altercation" (magna altercatio). The simple alternative offered in the formulation of the question is, however, by no means exhaustive, but only marks the extreme positions within a highly differentiated spectrum of possible answers (Ashworth 1987; Meier-Oeser 1996: 770-777). The various theories of word meaning turn out to be most inventive in producing variants of the coordination of linguistic signs, concepts and things. For example, (1) while especially some early authors held the mental concept to be the only proper significate of a spoken word, (2) Roger Bacon (ca. 1214 or 1220 - ca. 1292) as well as most of the so-called late medieval nominalists favored an extensionalist reference semantics according to which words signify things. (3) Especially Thomist authors took up the formula that words signify things by the mediation of concepts (voces significant res mediantibus conceptibus) and answered the question along the lines of the semantic triangle (Aquinas 1989: 11a). Some later authors put it the other way round and maintained (4) that words signify concepts only by the mediation of their signification of things (voces significant conceptus mediante significatione rerum). For "if I do not know which things the words signify I shall never learn by them which concepts the speaker has in his mind" (Smiglecius 1634: 437). Sharing the view that concepts were signs of things, (5) Scotus referred to the principle of semantic transitivity, claiming that "the sign of a sign is [also] the sign of the significate" (signum signi est signum signati), and held that words signify both concepts and things by one and the same act of signifying (1891: 451f). Others, in contrast, maintained (6) that there had to be two simultaneous but distinguishable acts of signification (Conimbricenses 1607: 2.39f). And still others tried to solve the problem by introducing further differentiations. Some of them were related to the mediantibus conceptibus-formula by the claim (7) that this formula does not imply that concepts were the immediate significates of words but rather that they were a prerequisite condition for words to signify things (Henry of Ghent 1520: 2.272v). Others were related to the notion of things, taking (8) the 'thing conceived' (res concepta; res ut intelligitur) as the proper significate of words (Scotus 1891: 543).
Further approaches tried to decide the question either (9) by distinguishing senses of the term significare (significare suppositive - manifestative), claiming that spoken words stand for the thing but manifest the concepts (Rubius 1605: 21), or (10) by differentiating between things being signified and thoughts being expressed (Soto 1554: fol. 3rb-va), or again (11) by taking into account the different roles of the discourse participants, so that words signify concepts for the speaker and things for the hearer (Versor 1572: fol. 8r), or lastly (12) by distinguishing between different types of discourse, maintaining that in familiar speech words primarily
refer to concepts whereas in doctrinal discourse to things (Nicolaus a S. Iohanne Baptista 1687: 40.43ff).

No matter how subtle the semantic doctrines behind these positions may have been in detail, it is still true that most of the contributions to the 'great altercation' were focusing primarily on word semantics. Scholastic semantics, however, was by no means confined to this approach.

5.3. Peter Abelard (1079-1142) and the meaning of the proposition

As early as Peter Abelard a shift in the primary interest of scholastic logic became apparent. His treatment of logic and semantics is determined by a decidedly propositional approach; all distinctions he draws and all discussions he conducts are guided by his concentration on propositions (Jacobi 1983: 91). In a conceptual move comparable to the one in Frege's famous Über Sinn und Bedeutung, Abelard transposes the distinction between the signification of things (which is akin to Frege's Bedeutung) and concepts (Frege's Sinn) to the level of propositions. On the one hand, and in line with Frege's principle of compositionality, the signification of a proposition is the complex comprehension of the sentence as it is integrated from the meanings of its components (see article 6 (Pagin & Westerståhl) Compositionality). On the other hand, it corresponds to Frege's Gedanke, i.e. to the propositional content of a sentence. Abelard calls this, in accordance with the Stoic lekton, dictum propositionis ('what is said by the proposition') or res propositionis ('thing of the proposition', i.e. the Stoic pragma). These similarities do not concern only terminology, but also the ontological interpretation. For the res propositionis is characterized as being essentially nothing (nullae penitus essentiae; Abelard 1927: 332, 24) or as entirely nothing (nil omnino; 366, 1). And yet the truth value or the modal state of a proposition depends on the res propositionis being either true or false, necessary or possible etc. (367, 9-16).

In the logical textbooks of the late 12th and 13th centuries Abelard's notion of dictum propositionis is present under the name of enuntiabile ('what can be stated') (de Rijk 1967: 2/2.208, 15ff). In the 14th century it has its analog in the theory of the 'complexly signifiable' (complexe significabile) developed by Adam Wodeham (ca. 1295-1358), Gregory of Rimini (ca. 1300-1358), and others in the context of intense discussions on the immediate object of knowledge and belief (Tachau 1987). These conceptions of the significate of propositions correspond in important points to Frege's notion of 'thought' (Gedanke) or Bolzano's 'sentences as such' (Sätze an sich), and, of course, to the Stoic lekton. In contrast, Walter Burley (ca. 1275-1344) answered the question about the ultimate significate of vocal and mental propositions by advocating the notion of a propositio in re (proposition in reality) or a proposition composed of things, which points more in the direction of Wittgenstein's notion of Sachverhalte (cases) or Tatsachen (facts; or better: states of affairs) as described in his Tractatus logico-philosophicus ("1. The world is all that is the case. 1.1 The world is the totality of facts, not of things.").
Whereas the advocates of the complexe significabile project propositionality onto a Fregean-like 'third realm' of propositional content, Burley and some other authors project it onto the real world, maintaining that what makes our propositions true (or false) are not the things as such but rather the states of affairs, i.e. the things relating (or not relating) to each other in the way our propositions say they do (Meier-Oeser 2009: 503f).
5.4. The theory of supposition and the propositional approach to meaning

The propositional approach to meaning is also characteristic of the so-called 'terminist' or 'modern logic' (logica moderna), emerging in the late 12th and 13th centuries with a rich and increasingly sophisticated continuation from the 14th to the early 16th century. Most of what is genuinely novel in medieval logic and semantics is to be found in this tradition, whose two most important theoretical contributions are (1) the theory of syncategorematic terms and (2) the theory of the properties of terms (proprietates terminorum).

1. The theory of syncategorematic terms is concerned with the semantic and logical functions of those parts of speech that were left out of the ordo significationis since they are neither nouns nor verbs and thus have neither a proper meaning nor a direct relation to any part of reality (Meier-Oeser 1998). Even if syncategorematic terms (i.e. quantifiers, prepositions, adverbs, conjunctions etc. like 'some', 'every', 'besides', 'necessarily', or the copula 'est') do not signify 'something' (aliquid) but only, as was later said, 'in some way' (aliqualiter), they perform semantic functions that are determinative for the meaning and the truth-value of propositions. From the late 12th century on, the syncategorematic terms became the subject matter of a special genre of logical textbooks, the syncategoremata tracts. They also played an important role in the vast literature on sophismata, i.e. on propositions like "every man is every man" or "Socrates twice sees every man besides Plato", which, due to a syncategorematic term contained in them, need further analysis in order to make explicit their underlying logical form as well as the conditions under which they can be called true or false (Kretzmann 1982).

2. The second branch of terminist semantics was concerned with those properties of terms that are relevant for explaining truth, inference and fallacy. While signification was seen as the most fundamental property of terms, the one to which the terminists devoted most attention was suppositio. Whereas any term, due to its imposition, has signification or lexical meaning on its own, it is only within the context of a proposition that it achieves the property of supposition, or the function of standing for (supponere pro) a certain object or a certain number or kind of objects. Thus it is the propositional context that determines the reference of terms. The main feature of supposition theory is the distinction of different kinds of supposition. The major distinction is that between 'material supposition' (suppositio materialis, when a term stands for itself as a token, e.g. 'I write donkey', or as a type, e.g. 'donkey is a noun'), 'simple supposition' (s. simplex, when a term stands for the universal form or a concept, e.g. 'donkey is a species'), and 'personal supposition' (s. personalis, when a term stands for ordinary objects, e.g. 'some donkey is running'). This last and most important type of supposition is further divided and subdivided depending on whether the truth conditions of the proposition in which the term appears require a particular quantification of the term (all x, every x, this x, some x, a certain x, etc.).
While supposition theory in general provides a set of rules to determine how the terms in a given propositional context have to be understood in order to render the proposition true or an inference valid, the treatment of suppositio personalis and its subclasses, which are at the center of this logico-semantic approach, focuses on the extension of the terms in a given proposition.
The characteristic feature of terminist logic, as it is exemplified both in the theory of syncategorematic terms and in the theory of supposition, is commonly described as a contextual approach (de Rijk 1967: 123-125), or, more precisely, as a propositional approach to meaning (de Rijk 1967: 552).

5.5. Roger Bacon's theory of the foundation and the change of reference

Roger Bacon's theory of linguistic meaning is structured around two dominant features: (1) his semiotic approach, according to which linguistic signification is considered in connection to both conventional and natural sign processes, and (2) his original and inventive interpretation of the doctrine of the 'imposition of names' (impositio nominum) as the basis of word meaning. Bacon accentuates the arbitrariness of meaning (Fredborg 1981: 87ff). But even though the first name-giver is free to impose a word or sign on anything whatsoever, he or she performs the act of imposition according to the paradigm of baptizing a child, so that Bacon in this respect might be seen as an early advocate of what today is known as the causal theory of reference (see article 4 (Abbott) Reference). This approach, if taken seriously, has important consequences for the concept of signification. For: "all names which we impose on things we impose inasmuch as they are present to us, as in the case of names of people in baptism" (Bacon 1988: 90). Contrary to the tradition of Aristotelian or Boethian semantics (Ebbesen 1983), Bacon favors the view that words according to their imposition immediately and properly signify things rather than mental concepts of things. Thus, his account of linguistic signification abandons the model of the semantic triangle and marks an important turning point on the way from the traditional intensionalist semantics to the extensionalist reference semantics that became increasingly accepted in the 14th century (Pinborg 1972: 58f). With regard to mental concepts, spoken words function just as natural signs: their use indicates that the speaker possesses some concept of the object the word refers to, for this is a prerequisite for any meaningful use of language (Bacon 1978: 85f, 1988: 64).

When Bacon treats the issue of linguistic meaning as a special case of sign relations, he considers the sign relation after the model of real relations, presupposing both the distinction of the terms related (so that nothing can be a sign of itself) and their actual existence (so that there can be no relation to a non-existent object). As a consequence of this account, words lose their meaning, or, as Bacon says, "fall away from their signification" (cadunt a significatione) if their significate ceases to exist (1978: 128). But even if the disappearance of the thing signified annihilates the sign relation and therefore must result in the corruption of the sign itself, Bacon is well aware that the use of names and words in general is not restricted to the meaning endowed during the first act of imposition (the term homo does not only denote those men who were present when the original act of its imposition took place); nor do words cease to be used when their name-bearers no longer physically exist (Bacon 1978: 128). As a theoretical device for solving the resulting difficulties regarding the continuity of reference, Bacon introduced a distinction of two modes of imposition that can be seen as "his most original contribution to grammar and semantics" (Fredborg 1981: 168).
Besides the 'formal mode of imposition', conducted by an illocutionary expression like "I call this ..." (modus imponendi sub forma impositionis vocaliter expressa), there is a kind of 'secondary imposition', taking place tacitly (sine forma imponendi vocaliter expressa) whenever a term is applied (transumitur) to any object other than that which the first
name-giver 'baptized' (Bacon 1978: 130). Whereas the formal mode of imposition refers to acts of explicitly coining a new word, the second mode describes what happens in the everyday use of language. Bacon (1978: 130) states:

We notice that infinitely many expressions are transposed in this way; for when a man is seeing for the first time the image of a depicted man he does not say that this image shall be called 'man' in the way names are given to children, he rather transposes the name of man to the picture. In the same way he who for the first time says that god is just, does not say beforehand 'the divine essence shall be called justice', but transposes the name of human justice to the divine one because of the similitude. In this way we act the whole day long and renew the things signified by vocal expressions without an explicit formula of vocal imposition.

In fact this modification of the meaning of words is constantly taking place even without the speaker or anyone else being actually aware of it. For by simply using language we "all day long impose names without being conscious of when and how" (nos tota die imponimus nomina et non advertimus quando et quomodo; Bacon 1978: 100, 130f). Thus, according to Roger Bacon, who in this respect is playing off a use theory of meaning against the causal approach, the common mode of language use is too complex and irregular to be sufficiently described solely by the two features of a primary reference-setting and a subsequent causal chain of reference-borrowing.

5.6. Speculative grammar and its critics

The idea, fundamental already for Bacon, that grammar is a formal science rather than a propaedeutic art, is shared by the school of the so-called 'modist' grammarians (modistae) emerging around 1270 in the faculty of arts of the university of Paris and culminating in the Grammatica Speculativa of Thomas of Erfurt (ca. 1300). The members of this school, who took it for granted that the objective of any formal science was to explain the facts by giving reasons for them rather than to simply describe them, made it their business to deduce the 'modes of signifying' (modi significandi), i.e. grammatical features common to all languages, from universal 'modes of being' (modi essendi) by means of corresponding 'modes of understanding' (modi intelligendi). Thus the tradition of 'speculative grammar' (grammatica speculativa) adopted Aristotle's commonly accepted claim that mental concepts, just as things, are the same for all men, and developed it further into the thesis of a universal grammar based on the structural analogy between the 'modes of being' (modi essendi), the 'modes of understanding' (modi intelligendi), and the 'modes of signifying' (modi significandi) that are the same for all languages (Bursill-Hall 1971). Thus Boethius Dacus (1969: 12), one of the most important theoreticians of speculative grammar, states that

... all national languages are grammatically identical. The reason for this is that the whole grammar is borrowed from the things ... and just as the natures of things are similar for those who speak different languages, so are the modes of being and the modes of understanding; and consequently the modes of signifying are similar, whence, so are the modes of grammatical construction or speech. And therefore the whole grammar which is in one language is similar to the one which is in another language.
Even though the words are arbitrarily imposed (whence arise the differences between all languages), the modes of signifying are uniformly related to the modes of being by means of the modes of understanding (whence arise the grammatical similarities among all languages). Soon after 1300 the modistic approach came under substantial criticism. The main point that critics like Ockham oppose is not the assumption of a basic universal grammar, for such a claim is implied in Ockham's concept of mental grammar too, but rather two other aspects of modism: (1) the assertion of a close structural analogy between spoken or mental language and external reality (William of Ockham 1978: 158), and (2) the inadmissible reification of the modus significandi, which is involved in its description as some quality or form added to the articulate voice (dictioni superadditum) through the act of imposition. To say that vocal expressions 'have' different modes of signifying is, as Ockham points out, just a metaphorical manner of speaking; for what is meant is simply the fact that different words signify whatever they signify in different ways (Ockham 1974: 798). According to John Aurifaber (ca. 1330), a vocal term is significative, or is a sign, solely by being used significatively, not on grounds of something inherent in the sound. In order to assign signification a proper place in reality, it must be ascribed to the intellect rather than to the vocal sound (Pinborg 1967: 226). This criticism of modist grammar is connected to a process that might be described as a progressive 'mentalization' of signification.

5.7. The late medieval mentalist approach to signification

The idea behind this process is the contention that without some sort of 'intentionality' the phenomena of sign, signification, and semiosis in general must remain inconceivable. The tendency to relocate the notions of sign and signification from the sphere of spoken words to the sphere of the mind is characteristic of the mentalist logic, emerging in the early 14th century and remaining dominant throughout the later Middle Ages. The signification of spoken words and external signs in general is founded on the natural signification instantiated in the mental concepts. The cognitive mental act, as that which makes any signification possible, is now conceived as a sign or an act of signification in its most proper sense. The introduction of the notion of formal signification (significatio formalis), identical with the mental concept (Raulin 1500: d3vb), is the result of a fundamental change in the conception of signification. The mental concept does not have but rather is signification. This, however, does not imply that it is the signified of a spoken word but, quite the contrary: the mental concept, as Ockham (1974: 7f) claimed, is the primary signifier, in subordination to which a spoken word (to which again is subordinate the corresponding written term) takes up the character of a sign (see Tab. 8.1.). Thus the significative force of mental concepts is seen as the point at which the analysis of signification necessarily must find its end. It is an ultimate fact for which no further rationale can be given (Meier-Oeser 1997: 141-143).
6. Concepts of meaning in modern philosophy

Whereas in Western Europe, under the growing influence of humanism, the scholastic tradition of terminist logic and semantics came to an end in the third decade of the 16th century, it continued vigorously on the Iberian Peninsula until the 18th century. It was
then reimported from there into central Europe in the late 16th and early 17th century and dominated, though in a somewhat simplified form, academic teaching in Catholic areas for more than a century. In what is commonly labeled 'modern philosophy', however, logic, the former center of semantic theory, lost many of its medieval attainments and subsided into inactivity until the middle of the 19th century. In early modern philosophy of language the logico-semantic approach of the scholastic tradition is displaced by an epistemological approach, so that in this context the focus is not on the issue of meaning but rather on the cognitive function of language.

6.1. The modern redefinition of meaning

In early modern philosophy (outside the scholastic discourse) the "difficult question of whether a spoken word signifies the mental concept or the thing" (see 5.2.), which had once stimulated a rich variety of distinctly elaborated semantic theories, was unanimously considered as definitively answered. Due to the prevalent persuasion that the primary function of speech was to express one's thoughts, most of the non-scholastic early modern authors took up the view that words signify concepts rather than things, a view which, from a scholastic point of view, had been classified as the "more antiquated" one (Rubius 1605: 2.18). Given, however, that concepts, ideas, or thoughts are the primary meaning of words, the thesis that language has a formative influence on thought, which became increasingly widely accepted during the 18th century, turns out to be a thesis of fundamental importance to semantics.

6.2. The influence of conventional language on thought processes

Peter of Ailly (1330-1421) claimed that there is such a habitually close connection between the concept of the thing and the concept of its verbal expression that by stimulating one of these concepts the other is always immediately stimulated as well (Kaczmarek 1988: 403ff). Still closer is the correlation of language and thought in Giovanni Battista Giattini's (1601-1672) account of language acquisition. Upon hearing certain words frequently and in combination with the sensory perception of their significata, a 'complex species' is generated, and this species comprises, just like the Saussurean sign, the sound-image as well as the concept of its correlate object ("... generantur ... species complexae talium vocum simul et talium obiectorum ex ipsa consuetudine"; Giattini 1651: 431). In this vein, Jean de Raey (1622-1702) sees the "union of the external vocal sound and the inner sense" as the "immediate fundament of signification" and holds that sound (sonus) and meaning or sense (sensus) make up "one and the same thing rather than two things" (Raey 1692: 29). Thinking, therefore, "seems to be just some kind of internal speech or logos endiathetos without which there would be no reasoning" (30). Even if de Raey refers to the old tradition of describing the process of thinking in terms of internal speech (see 2.2.), a fundamental difference becomes apparent when he claims that "both the speaker and the hearer have in mind primarily the sound rather than the meaning and often the sound without meaning but never the meaning without sound" (29). Until then, internal speech had generally been conceived as being performed in what Augustinus had called verba nullius linguae (see 4.1.), but from de Raey (and other authors of that time) onwards, inner speech is clearly intimately linked to spoken language.
The habitual connection of language and thought was the theoretical foundation for the thesis of an influence of language on thinking in the 17th century. As the introduction to the Port-Royal logic notes: "this custom is so strong, that even when we think alone, things present themselves to our minds only in connection with the words to which we have been accustomed to recourse in speaking to others" (Arnauld & Nicole 1662: 30). Because most 17th century authors adhered to the priority of thought over language, they considered this custom just a bad habit. While this critical attitude remained a constant factor in the philosophical view of language during the following centuries, a new and alternative perspective, taking into account also its positive effects, was opened with Hobbes, Locke and Leibniz.

6.3. Thomas Hobbes (1588-1679)

In his Logic (Computatio sive logica), which makes up the first part (De Corpore) of his Elementa Philosophiae (1655, English 1656), Thomas Hobbes draws a parallel between reasoning and a mathematical calculus, the basic operations of which can be described as an addition or subtraction of ideas, thoughts, or concepts (Hobbes 1839a: 3-5). Because thoughts are volatile and fleeting (fluxae et caducae) they have to be fixed by means of marks (notae) which, in principle, everyone can arbitrarily choose for himself (1839a: 11f). Because the progress of science, however, can be obtained only in the form of a collective accumulation of knowledge, it is necessary that "the same notes be made common to many" (Hobbes 1839b: 14). So, in addition to these marks, signs (signa) as a means of communication are required. In fact, both functions are obtained by words: "The nature of a name consists principally in this, that it is a mark taken for memory's sake; but it serves also by accident to signify and make known to others what we remember ourselves" (Hobbes 1839b: 15). Hobbes adopts the scholastic practice of viewing linguistic signs in light of the general notion of sign. His own concept of sign, however, according to which signs in general can be described as "the antecedents of their consequents, and the consequents of their antecedents", is from the outset confined to the class of indexical signs (Hobbes 1839b: 14). Words and names ordered in speech must therefore be indexical signs rather than expressions of conceptions, just as they cannot be "signs of the things themselves; for that the sound of this word stone should be the sign of a stone, cannot be understood in any sense but this, that he that hears it collects that he that pronounces it thinks of a stone" (Hobbes 1839b: 15). It is true that from Roger Bacon onwards we find in scholastic logic the position that words are indexical signs of the speaker's concepts. Connected to this, however, was always the assumption, explicitly denied by Hobbes, that the proper significate of words are the things talked about. Names, according to Hobbes, "though standing singly by themselves, are marks, because they serve to recall our own thoughts to mind, ... cannot be signs, otherwise than by being disposed and ordered in speech as parts of the same" (Hobbes 1839b: 15). As a result of the distinction between marks and signs, any significative function of words can be realized only in the framework of communication. These rudiments of a sentence semantics (Hungerland & Vick 1973), however, were not elaborated any further by Hobbes.
6.4. The 'Port-Royal Logic' and 'Port-Royal Grammar'

The so-called Port-Royal Logic (Logique ou l'art de penser), published in 1662 by Antoine Arnauld (1612-1694) and Pierre Nicole (1625-1695), is not only one of the most influential early modern books on the issue of language but in some respects also the most symptomatic one. For "it marks, better than any other, the abandonment of the medieval doctrine of an essential connection between logic and semantics", and treats the "most fundamental questions ... with the kind of inattention to detail that came to characterize most of the many semantic theories of the Enlightenment" (Kretzmann 1967: 378a). The most influential, though actually quite modest, semantic doctrine of this text is the distinction between comprehension and extension, which is commonly seen as a direct ancestor of the later distinction between intension and extension. And yet it is different: whereas the comprehension of an idea is tantamount to "the attributes it comprises in itself that cannot be removed from it without destroying it", extension is described as "the subjects with which that idea agrees, which are also called the inferiors of the general term, which in relation to them, is called superior; as the idea of triangle is extended to all the various species of triangle" (Arnauld & Nicole 1662: 61f). It is manifest that Arnauld in this passage unfolds his doctrine along the lines of the relation of genus and species, so that the idea of a certain species would be part of the extension of the idea of the superior genus, which does not match the current use of the intension/extension distinction. The extension of a universal idea, however, does not consist of species alone; for Arnauld also notices that the extension of a universal idea can be restricted in two different modes: either (1) "by joining another distinct or determinate idea to it" (e.g. "right-angled" to "triangle"), which makes it the idea of a certain subclass of the universal idea (= extension 1), or (2) by "joining to it merely an indistinct and indeterminate idea of a part" (e.g. the quantifying syncategoreme "some"), which makes it the idea of an undetermined number of individuals (= extension 2) (Arnauld & Nicole 1662: 62). What Arnauld intends to convey is simply that the restriction of a general idea can be achieved either by specification or by quantification - which, however, result in two different and hardly combinable notions of extension. While empiricism, according to which sense experience is the ultimate source of all our concepts and knowledge, was prominently represented by Hobbes, the Port-Royal Logic, taking a distinctly Cartesian approach, is part of the rationalist philosophy acknowledging the existence of innate ideas or, at least, of predetermined structures of rational thought. This also holds for the Port-Royal Grammar (1660) (Arnauld & Lancelot 1667/1966) that opened the modern tradition of universal grammar which dominated linguistic studies in the 17th and 18th centuries (Schmitter 1996). The universal grammarians aimed to reduce, in a Chomskian-style analysis, the fundamental structures of language to universally predetermined mental structures. The distinction of deep and surface structure seems to be prefigured when Nicolas Beauzée (1717-1789) claims that since all languages are founded on an identical "mécanisme intellectuel" the "différences qui se trouvent d'une langue à l'autre ne sont, pour ainsi dire, que superficielles."
(Beauzée 1767: viiif). Whereas the rationalist grammarians took language as "the exposition of the analysis of thought" (Beauzée 1767: xxxiiif) and thus as a means of detecting the rules of thought, empiricists like Locke or Condillac saw language as a means of forming and analyzing
complex ideas, thus showing a pronounced tendency to ascribe a certain influence of language on thought.

6.5. John Locke (1632-1704)

Locke's Essay Concerning Human Understanding (1690/1975) is the most influential early modern text on language, even if the third book, which is devoted to the issue "of words", hardly offers more than a semantics of names, differentiating between names of simple ideas, of mixed modes, and of natural substances. In this context Locke focuses on the question of how names and the ideas they stand for are related to external reality. With regard to simple ideas the answer is simple as well. For their causation by external objects shows such a degree of regularity that our simple ideas can be considered as "very near and undiscernably alike" (389). Error and dissent, therefore, turn out to be primarily the result of inconsiderate use of language: "Men, who well examine the Ideas of their own Minds, cannot much differ in thinking; however, they may perplex themselves with words" (180). Locke places emphasis on the priority of ideas over words in some passages (437; 689) and distinguishes between a language-independent mental discourse and its subsequent expression in words (574ff). However, the thesis that thought is at least in principle independent of language is counterbalanced by Locke's account of individual language acquisition and the actual use of language. Even if the idea is logically prior to the corresponding word, this relation is inverted in language learning. For in most cases the meaning of words is socially imparted (Lenz 2010), so that we learn a word before being acquainted with the idea customarily connected to it (Locke 1975: 437). This habitual connection of ideas with words does not only effect an excitation of ideas by words but quite often a substitution of the former by the latter (408). Thus, the mental proposition made up of ideas actually turns out to be a marginal case, for "most Men, if not all, in their Thinking and Reasoning within themselves, made use of Words instead of Ideas" (575). Even if clear and distinct knowledge were best achieved by "examining and judging of Ideas by themselves their Names being quite laid aside", it is, as Locke concedes, "through the prevailing custom of using Sounds for Ideas ... very seldom practised" (579). Locke's theory of meaning is often characterized as vague or even incoherent (Kretzmann 1968; Landesman 1976). For, on the one hand, Locke states that "Words in their primary or immediate Signification, stand for nothing, but the Ideas in the Mind of him that uses them" (Locke 1975: 405f, 159, 378, 402, 420, 422), so that words "can be Signs of nothing else" (408). On the other hand, like most scholastic authors and unlike the contemporary trend of the 17th and 18th centuries, he considers ideas as signs of things, and advocates the view that words "ultimately ... represent Things" (Locke 1975: 520) in accordance with the scholastic mediantibus conceptibus-thesis (see 5.2.; Ashworth 1984). Whereas in the late 19th century it was considered "one of the glories of Locke's philosophy that he established the fact that names are not the signs of things but in their origins always the signs of concepts" (Müller 1887: 77), it is precisely for this view that Locke's semantic theory is often criticized as a paradigm case of private-language philosophy and semantic subjectivism.
But if there is something like semantic subjectivism in Locke, then it is more in the sense of a problem that he is pointing to than something that his theory tends to result in. For one of his main points regarding our use of language
is that we should keep it consistent with the use of others (Locke 1975: 471), since words are "no Man's private possession, but the common measure of Commerce and Communication" (514). Locke saw sensualism as supported by an interesting observation regarding the meaning change of words (see article 99 (Fritz) Theories of meaning change). Many words, he noticed, "which are made use of to stand for actions and notions quite removed from sense, have their rise from ... obvious sensible ideas and are transferred to more abstruse significations" (Locke 1975: 403). Therefore large parts of our vocabulary are "metaphorical" concepts in the sense in which metaphor is defined by the modern cognitive account (see article 26 (Tyler & Takahashi) Metaphors and metonymies), that is, thinking about a concept from one knowledge domain in terms of another domain, as is exemplified by terms like "imagine, apprehend, comprehend, adhere, conceive, instil, disgust, disturbance, tranquillity". This view was substantiated by Leibniz's comments on the epistemic function of the "analogie des choses sensibles et insensibles", as it becomes manifest in language. It would be worthwhile, Leibniz maintained, to consider "l'usage des prépositions, comme à, avec, de, devant, en, hors, par, pour, sur, vers, qui sont toutes prises du lieu, de la distance, et du mouvement, et transférées depuis à toute sorte de changemens, ordres, suites, différences, convenances" (Leibniz 1875-1890: 5.256). Kant, too, agreed in the famous §59 of his Critique of Judgement that this symbolic function of language would be "worthy of a deeper study". For "... the words ground (support, basis), to depend (to be held up from above), to flow from (instead of to follow), substance ... and numberless others, are ... symbolic hypotyposes, and express concepts without employing a direct intuition for the purpose, but only drawing upon an analogy with one, i.e., transferring the reflection upon an object of intuition to quite a new concept, and one with which perhaps no intuition could ever directly correspond." Thus, according to Locke, Leibniz and Kant, metaphor is not simply a case of deviant meaning but rather, as modern semantics has found out anew, a ubiquitous feature of language and thought.

6.6. G. W. Leibniz (1646-1716) and the tradition of symbolic knowledge

While Hobbes and Locke, at least in principle and to a certain degree, still acknowledged the possibility of a non-linguistic mental discourse or mental proposition, Gottfried Wilhelm Leibniz emphasized the dependency of thinking on the use of signs: "thinking can take place without words ... but not without other signs" (1875-1890: 7.191). For "all our reasoning is nothing but a process of connecting and substituting characters which may be words or other signs or images" (7.31). This view became explicit and later extremely influential under the label of cognitio symbolica (symbolic knowledge, symbolic cognition), a term Leibniz coined in his Meditationes de cognitione, veritate et ideis (1684; 1875-1890: 4.423). Symbolic knowledge is opposed to intuitive knowledge (cognitio intuitiva), which is defined as a direct and simultaneous conception of a complex notion together with all its partial notions.
Because the limited human intellect cannot comprehend complex concepts other than successively, the complex concept of the thing itself must be substituted by a sensible sign in the process of reasoning, always supposing that a detailed explication of its meaning could be given if needed. Leibniz, therefore, maintains that the knowledge or
cognition of complex objects or notions is always symbolic, i.e. performed in the medium of signs (1875-1890: 4.422f). The possibility and validity of symbolic knowledge is based on the principle of proportionality, according to which the basic signs used in symbolic knowledge may be chosen arbitrarily, provided that the internal relations between the signs are analogous to the relations between the things signified (7.264). In his Dialogue (1677) Leibniz remarks that "even if the characters are arbitrary, still the use and interconnection of them has something that is not arbitrary - viz. a certain proportionality between the characters and the things, and the relations among different characters expressing the same things. This proportion or relation is the foundation of truth." (Leibniz 1875-1890: 7.192). The notion of cognitio symbolica provides the epistemological foundation of both his project of a characteristica universalis or universal language of science and his philosophy of language. For the basic principle of analogy is also realized in natural languages to a certain extent. Leibniz therefore holds that "languages are the best mirror of the human mind and that an exact analysis of the signification of words would make known the operations of the understanding better than would anything else" (5.313). Especially through its reception by Christian Wolff (1679-1754) and his school, the doctrine of symbolic knowledge became one of the main epistemological issues of the German Enlightenment. In this tradition it is a common conviction that the use of signs in general and of language in particular provides an indispensable function for any higher intellectual operation. Hence Leibniz's proposal of a characteristica universalis was taken up widely, even though mostly in a somewhat altered form. For what the 18th century authors are generally aiming at is not the invention of a sign system for obtaining general knowledge, but rather a general science of sign systems. The general doctrine of signs, referred to with names like Characteristica, Semiotica, or Semiologia, was considered a most important desideratum. In 1724 Wolff's disciple Georg Bernhard Bilfinger (1693-1750) suggested the name Ars semantica for this, as he saw it, until then neglected discipline, the subject matter of which would be the knowledge of all sorts of signs in general as well as the theory of the invention, right use, and assessment of linguistic signs in particular (Bilfinger 1724: 298f). The first extended attempt to fill this gap was made by Johann Heinrich Lambert (1728-1777) with his Semiotik, oder die Lehre von der Bezeichnung der Gedanken und Dinge (Semiotics, or the doctrine of the signification of thoughts and things), published as the second part of his Neues Organon (1764). The leading idea of this work is Leibniz's principle of proportionality, which guarantees, as Lambert claims, the interchangeability of "the theory of the signs" and "the theory of the objects" signified (Lambert 1764: 3.23-24). Besides the theory of sign invention, the hermeneutica, the theory of sign interpretation, made up an essential part of the characteristica. Within the framework of 'general hermeneutics' (hermeneutica generalis), originally designed by Johann Conrad Dannhauer (1603-1666) as a complement to Aristotelian logic, the reflections on linguistic meaning focused on sentence meaning.
Due to Dannhauer's influential Idea boni interpretis (Presentation of the good interpreter, 1630), some relics of scholastic semantics were taken up by hermeneutical theory, for which particularly the theory of supposition (see 5.4.), which provided the "knowledge about the modes in which the signification of words may vary according to their connection to others" (Reusch 1734: 266), had to be of pivotal interest. The developing discipline of hermeneutics as "the science
of the very rules in compliance with which the meanings can be recognized from their signs" (Meier 1757: 1) shows a growing awareness of several important semantic doctrines: for instance, the methodological necessity of the principle of charity in the form of a presumption of consistency and rationality (Scholz 1999: 35-64); the distinction between language meaning and contextual meaning, as it is present in Christian August Crusius's (1747: 1080) distinction between grammatical meaning (grammatischer Verstand), i.e. "the totality of meanings a word may ever have in one language", and logical meaning, i.e. the "totality of what a word can mean at a certain place and in a certain context"; or the principle of compositionality (see article 6 (Pagin & Westerstahl) Compositionality), as it appears in Georg Friedrich Meier's claim that "the sense of a speech is the sum total of the meanings of the words that make up this speech and which are connected to and determining each other" (Meier 1757: 57).

6.7. Condillac (1714-1780)

One of the most decisive features of 18th century science is its historical approach. In linguistics this resulted in a great number of works on the origin and development of language (Gessinger & von Rahden 1989). The way in which the historic-genetic point of view opened new perspectives on the relation between language and thought is most clearly reflected in Étienne Bonnot de Condillac's Essai sur l'origine des connaissances humaines (1746). Already in the early 18th century it was widely accepted that linguistic signs provide the basis for virtually any intellectual knowledge. Christian Thomasius (1655-1728) had argued that the use of language plays a decisive role in the ontogenetic development of each individual's cognitive faculties (Meier-Oeser 2008). Condillac goes even further and argues that the same holds for the phylogenetic development of mankind as well. Language, therefore, is not only essentially involved in the formation of thoughts or abstract ideas but also in the formation of the subject of thought, viz. of man as an intellectual being. According to Condillac all higher cognitive operations are nothing but 'transformed sensation' (sensation transformée). The formative principle that effects this transformation is language or, more generally, the use of signs (l'usage des signes). Condillac reconstructs the historical process of language development as a process leading from a primordial natural language of actions and gestures (langage d'action), viewed as a language of simultaneous ideas (langage des idées simultanées), to the language of articulate voice (langage des sons articulés), viewed as a language of successive ideas (langage des idées successives). We are so deeply accustomed to spoken language with its sequential catenation of articulate sounds, he notices, that we believe our ideas come to our mind by nature one after another, just as we speak words one after another. In fact, however, the discursive structure of thinking is not naturally given but rather the result of our use of linguistic signs. The main effect of articulate language consists in the gradual analysis of complex and indistinct sensations into abstract ideas that are connected to and make up the meaning of words. Since language is a necessary condition of thought and knowledge, Locke was mistaken to claim that the primary purpose of language is to communicate knowledge (Condillac 1947-1951: 1.442a): the primary purpose of language is to analyze thought.
In fact we cannot exhibit the ideas that coexist in our mind successively to others except in so far as we know how to exhibit
them successively to ourselves. That is to say, we know how to speak to others only in so far as we know how to speak to ourselves. It was therefore Locke's adherence to the scholastic idea of mental propositions that prevented him from realizing "how necessary the signs are for the operations of the soul" (1.738a). Every language, Condillac claims, "is an analytic method". Due to his comprehensive notion of language this also holds vice versa: "every analytic method is a language" (2.119a), so that Condillac, maintaining that sciences are by essence analytical methods, comes to his provocative and controversial thesis that "all sciences are nothing but well-made languages" (2.419a). Condillac's theory of language and its epistemic function became the core topic of the so-called school of 'ideology' (idéologie) that dominated the French scene in the early 19th century. Although most authors of this school rejected the absolute necessity of signs and language for thinking, they adhered to the subjectivistic consequences of sensualism and considered it impossible "that one and the same sign should have the same value for all of those who use it and even for each of them at different moments of time" (Destutt de Tracy 1803: 405).

7. References

7.1. Primary Sources

Abelard, Peter 1927. Logica ingredientibus, Glossae super Peri ermenias. Ed. B. Geyer. Münster: Aschendorff.
Andreas, Antonius 1508. Scriptum in arte veteri. Venice.
Aquinas, Thomas 1989. Expositio libri peryermenias. In: Opera omnia I*1. Ed. Commissio Leonina. 2nd edn. Rome.
Arnauld, Antoine & Pierre Nicole 1662/1965. La logique ou l'art de penser. Paris.
Arnauld, Antoine & Claude Lancelot 1667/1966. Grammaire générale et raisonnée ou La Grammaire de Port-Royal. Ed. H. E. Brekle, t. 1, reprint of the 3rd edn. Stuttgart-Bad Cannstatt: Frommann, 1966.
Augustinus 1963. De doctrina Christiana. In: W. M. Green (ed.). Sancti Augustini Opera (CSEL 80). Vienna: Österreichische Akademie der Wissenschaften.
Augustinus 1968. De trinitate. In: W. J. Mountain & F. Glorie (eds.). Aurelii Augustini Opera (CCSL 50). Turnhout: Brepols.
Augustinus 1975. De dialectica. Ed. Jan Pinborg, translation with introduction and notes by B. Darrell Jackson. Dordrecht: Reidel.
Bacon, Roger 1978. De signis. Ed. K. M. Fredborg, L. Nielsen & J. Pinborg. Traditio 34, 75-136.
Bacon, Roger 1988. Compendium Studii Theologiae. Ed. Th. S. Maloney. Leiden: Brill.
Beauzée, Nicolas 1767. Grammaire générale. Paris.
Bilfinger, Georg Bernhard 1724. De Sermone Sinico. In: Specimen doctrinae veterum Sinarum moralis et politicae. Frankfurt/M.
Boethius, Anicius M. T. S. 1880. In Perihermeneias editio secunda. Ed. C. Meiser. Leipzig: Teubner.
Boethius Dacus 1969. Modi significandi. Ed. J. Pinborg & H. Roos. Copenhagen: C. E. C. Gad.
Condillac, Étienne Bonnot de 1947-1951. Oeuvres philosophiques. Ed. G. Le Roy. Paris: PUF.
Conimbricenses 1607. Commentaria in universam Aristotelis dialectica. Cologne.
Crusius, Christian August 1747. Weg zur Gewißheit und Zuverlässigkeit der menschlichen Erkenntnis. Leipzig.
Dannhauer, Johann Conrad 1630. Idea boni interpretis. Strasburg.
Destutt de Tracy, Antoine 1803. Éléments d'idéologie. Vol. 2. Paris.
Giattini, Johannes Baptista 1651. Logica. Rome.
Henry of Ghent 1520. Summa quaestionum ordinarium. Paris.
Hobbes, Thomas 1839a. Opera philosophica quae scripsit omnia. Ed. W. Molesworth, vol. 1. London.
Hobbes, Thomas 1839b. The English Works. Ed. W. Molesworth, vol. 1. London.
Hoffbauer, Johannes Christoph 1789. Tentamina semiologica sive quaedam generalem theoriam signorum spectantia. Halle.
Lambert, Johann Heinrich 1764. Neues Organon oder Gedanken über die Erforschung und Bezeichnung des Wahren. Vol. 2. Leipzig.
Leibniz, Gottfried Wilhelm 1875-1890. Die philosophischen Schriften. Ed. C. I. Gerhardt. Berlin. Reprinted: Hildesheim: Olms, 1965.
Locke, John 1975. An Essay Concerning Human Understanding. Ed. P. H. Nidditch. Oxford: Clarendon Press.
Meier, Georg Friedrich 1757. Versuch einer allgemeinen Auslegungskunst. Halle.
Nicolaus a S. Iohanne Baptista 1687. Philosophia augustiniana. Genova.
Ockham, William of 1974. Summa logicae. Ed. Ph. Boehner, G. Gál & S. Brown. In: Opera philosophica 1. St. Bonaventure, NY: The Franciscan Institute.
Ockham, William of 1978. Expositio in librum Perihermeneias Aristotelis. Ed. A. Gambatese & S. Brown. In: Opera philosophica 2. St. Bonaventure, NY: The Franciscan Institute.
Raey, Jean de 1692. Cogitata de interpretatione. Amsterdam.
Raulin, John 1500. In logicam Aristotelis commentarium. Paris.
Reusch, Johann Peter 1734. Systema logicum. Jena.
Rubius, Antonius 1605. Logica mexicana. Cologne.
Scotus, John Duns 1891. In primum librum perihermenias quaestiones. In: Opera Omnia. Ed. L. Vivès, vol. 1. Paris.
Smiglecius, Martin 1634. Logica. Oxford.
Soto, Domingo de 1554. Summulae. Salamanca.
Versor, Johannes 1572. Summulae logicales. Venice.

7.2. Secondary Sources

Annas, Julia E. 1992. Hellenistic Philosophy of Mind. Berkeley, CA: University of California Press.
Ashworth, E. Jennifer 1984. Locke on language. Canadian Journal of Philosophy 14, 45-74.
Ashworth, E. Jennifer 1987. Jacobus Naveros on the question: 'Do spoken words signify concepts or things?'. In: L. M. de Rijk & H. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Ingenium Publishers, 189-214.
Barnes, Jonathan 1993. Meaning, saying and thinking. In: K. Döring & T. Ebert (eds.). Dialektiker und Stoiker. Zur Logik der Stoa und ihrer Vorläufer. Stuttgart: Steiner, 47-61.
Baxter, Timothy M. S. 1992. The Cratylus: Plato's Critique of Naming. Leiden: Brill.
Bühler, Karl 1934. Sprachtheorie. Jena: Fischer.
Bursill-Hall, G. E. 1971. Speculative Grammars of the Middle Ages. The Hague: Mouton.
Ebbesen, Sten 1983. The odyssey of semantics from the Stoa to Buridan. In: A. Eschbach & J. Trabant (eds.). History of Semiotics. Amsterdam: Benjamins, 67-85.
Fredborg, Karen Margareta 1981. Roger Bacon on 'Impositio vocis ad significandum'. In: H. A. G. Braakhuis, C. H. Kneepkens & L. M. de Rijk (eds.). English Logic and Semantics. Nijmegen: Ingenium Publishers, 167-191.
Gessinger, Joachim & Wolfert von Rahden (eds.) 1989. Theorien vom Ursprung der Sprache. Berlin: de Gruyter.
Graeser, Andreas 1978. The Stoic theory of meaning. In: J. M. Rist (ed.). The Stoics. Berkeley, CA: University of California Press, 77-100.
Graeser, Andreas 1996. Aristoteles. In: T. Borsche (ed.). Klassiker der Sprachphilosophie. München: C. H. Beck, 33-47.
Hungerland, Isabel C. & George R. Vick 1973. Hobbes's theory of signification. Journal of the History of Philosophy 11, 459-482.
Jacobi, Klaus 1983. Abelard and Frege. The semantics of words and propositions. In: V. M. Abrusci, E. Casari & M. Mugnai (eds.). Atti del Convegno Internazionale di Storia della Logica, San Gimignano 1982. Bologna: CLUEB, 81-96.
Kaczmarek, Ludger 1988. 'Notitia' bei Peter von Ailly. In: O. Pluta (ed.). Die Philosophie im 14. und 15. Jahrhundert. Amsterdam: Grüner, 385-420.
Kretzmann, Norman 1967. Semantics, history of. In: P. Edwards (ed.). The Encyclopedia of Philosophy. Vol. 7. New York: The Macmillan Company, 358b-406a.
Kretzmann, Norman 1968. The main thesis of Locke's semantic theory. Philosophical Review 78, 175-196.
Kretzmann, Norman 1974. Aristotle on spoken sound significant by convention. In: J. Corcoran (ed.). Ancient Logic and its Modern Interpretation. Dordrecht: Kluwer, 3-21.
Kretzmann, Norman 1982. Syncategoremata, exponibilia, sophismata. In: N. Kretzmann, A. Kenny & J. Pinborg (eds.). The Cambridge History of Later Medieval Philosophy. Cambridge: Cambridge University Press, 211-245.
Landesman, Charles 1976. Locke's theory of meaning. Journal of the History of Philosophy 14, 23-35.
Lenz, Martin 2010. Lockes Sprachkonzeption. Berlin: de Gruyter.
Lewry, Osmund 1981. Robert Kilwardby on meaning. A Parisian course on the logica vetus. Miscellanea mediaevalia 13, 376-384.
Lieb, Hans-Heinrich 1981. Das 'semiotische Dreieck' bei Ogden und Richards: Eine Neuformulierung des Zeichenmodells von Aristoteles. In: H. Geckeler et al. (eds.). Logos Semantikos. Studia Linguistica in Honorem Eugenio Coseriu, vol. 1. Berlin: de Gruyter/Madrid: Gredos, 137-156.
Long, Anthony A. 1971. Language and thought in Stoicism. In: A. A. Long (ed.). Problems in Stoicism. London: The Athlone Press, 75-113.
Long, Anthony A. & David N. Sedley 1987. The Hellenistic Philosophers. Cambridge: Cambridge University Press.
Magee, John 1989. Boethius on Signification and Mind. Leiden: Brill.
Meier-Oeser, Stephan 1996. Signification. In: J. Ritter & K. Gründer (eds.). Historisches Wörterbuch der Philosophie 9. Basel: Schwabe, 759-795.
Meier-Oeser, Stephan 1997. Die Spur des Zeichens. Das Zeichen und seine Funktion in der Philosophie des Mittelalters und der frühen Neuzeit. Berlin: de Gruyter.
Meier-Oeser, Stephan 1998. Synkategorem. In: J. Ritter & K. Gründer (eds.). Historisches Wörterbuch der Philosophie 10. Basel: Schwabe, 787-799.
Meier-Oeser, Stephan 2004. Sprache und Bilder im Geist. Philosophisches Jahrbuch 111, 312-342.
Meier-Oeser, Stephan 2008. Das Ende der Metapher von der 'inneren Rede'. Zum Verhältnis von Sprache und Denken in der deutschen Frühaufklärung. In: H. E. Bödeker (ed.). Strukturen der deutschen Frühaufklärung 1680-1720. Göttingen: Vandenhoeck & Ruprecht, 195-223.
Meier-Oeser, Stephan 2009. Walter Burley's propositio in re and the systematization of the ordo significationis. In: S. F. Brown, Th. Dewender & Th. Kobusch (eds.). Philosophical Debates at Paris in the Early Fourteenth Century. Leiden: Brill, 483-506.
Müller, Max 1887. The Science of Thought. London: Longmans, Green & Co.
Panaccio, Claude 1999. Le Discours intérieur de Platon à Guillaume d'Ockham. Paris: Seuil.
Pinborg, Jan 1967. Die Entwicklung der Sprachtheorie im Mittelalter. Münster: Aschendorff.
Pinborg, Jan 1972. Logik und Semantik im Mittelalter. Ein Überblick. Ed. H. Kohlenberger. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Pohlenz, Max 1939. Die Begründung der abendländischen Sprachlehre durch die Stoa.
(Nachrichten von der Gesellschaft der Wissenschaften, Göttingen. Philologisch-historische Klasse, N.F. 3). Göttingen: Vandenhoeck & Ruprecht, 151-198.
Rijk, Lambertus Maria de 1967. Logica modernorum, vol. II/1. Assen: Van Gorcum.
Schmitter, Peter (ed.) 1996. Sprachtheorien der Neuzeit II. Von der Grammaire de Port-Royal zur Konstitution moderner linguistischer Disziplinen. Tübingen: Narr.
Schmitter, Peter 2001. The emergence of philosophical semantics in early Greek antiquity. Logos and Language 2, 45-56.
Scholz, Oliver R. 1999. Verstehen und Rationalität. Frankfurt/M.: Klostermann.
Sedley, David 1996. Aristotle's de interpretatione and ancient semantics. In: G. Manetti (ed.). Knowledge through Signs. Turnhout: Brepols, 87-108.
Sedley, David 2003. Plato's Cratylus. Cambridge: Cambridge University Press.
Tachau, Katherine H. 1987. Wodeham, Crathorn and Holcot: The development of the complexe significabile. In: L. M. de Rijk & H. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Ingenium Publishers, 161-189.
Weidemann, Hermann 1982. Ansätze zu einer semantischen Theorie bei Aristoteles. Zeitschrift für Semiotik 4, 241-257.

Stephan Meier-Oeser, Berlin (Germany)

9. The emergence of linguistic semantics in the 19th and early 20th century

1. Introduction: The emergence of linguistic semantics in the context of interdisciplinary research in the 19th century
2. Linguistic semantics in Germany
3. Linguistic semantics in France
4. Linguistic semantics in Britain
5. Conclusion
6. References

Abstract

This chapter deals with the 19th-century roots of current cognitive and pragmatic approaches to the study of meaning and meaning change. It demonstrates that 19th-century linguistic semantics has more to offer than the atomistic historicism for which 19th-century linguistics became known and for which it was often criticised. By contrast, semanticists in Germany, France and Britain in particular sought to study meaning and change of meaning from a much more holistic point of view, seeking inspiration from philosophy, biology, geology, psychology, and sociology to study how meaning is 'made' in the context of social interaction and how it changes over time under pressure from changing linguistic, societal and cognitive needs and influences.
The subject in which I invite the reader to follow me is so new in kind that it has not even been given a name. The fact is that most linguists have directed their attention to the forms of words: the laws which govern changes in meaning, the choice of new expressions, the birth and death of phrases, have been left behind or have been noticed only in passing. Since this subject deserves a name as much as does phonetics or morphology, I shall call it semantics [...], the science of meaning. (Bréal 1883/1991: 137)

In memory of Peter Schmitter who first helped me to explore the history of semantics

1. Introduction: The emergence of linguistic semantics in the context of interdisciplinary research in the 19th century

The history of semantics as a reflection on meaning is potentially infinite, starting probably with Greek or Hindu philosophers of language and embracing more than two thousand years of the history of mankind. It is spread over numerous disciplines, from ancient philosophy (Schmitter 2001a) to modern cognitive science. The history of semantics as a linguistic discipline is somewhat shorter and has been well explored (Nerlich 1992b, 1996a, 1996b). This chapter therefore summarizes results from research dealing with the history of semantics from the 1950s onwards, and I shall follow the standard view that semantics as a linguistic discipline began with Christian Karl Reisig's lectures on Latin semasiology or Bedeutungslehre, given in the 1820s (Schmitter 1990, 2004). As one can see from the motto cited above, another central figure in the history of linguistic semantics in the 19th century is undoubtedly Michel Bréal, the author of the famous Essai de sémantique, published in 1897, the cumulative product of work started as early as 1866 (Bréal 1883/1991). When this seminal book was translated into English in 1900, the name of the discipline that studied linguistic meaning and changes of meaning became 'semantics', and other terms, such as semasiology or sematology, were sidelined in the 20th century. Although the focus here is on linguistic semantics, it should not be forgotten that the study of 'meaning' also preoccupied philosophers and semioticians. Two seminal figures in the 19th century that should be mentioned in this context are Charles Sanders Peirce in the United States, who worked in the tradition of American pragmatism (Nerlich & Clarke 1996), and Gottlob Frege, who helped found mathematical logic and analytic philosophy (see article 10 (Newen & Schröder) Logic and semantics and article 3 (Textor) Sense and reference). Although Peirce exerted enormous influence on the developments of semantics, pragmatics and semiotics in the late 19th and early 20th century, his main interest can be said to have been in epistemology. Frege too exerted great influence on the development of formal semantics (see article 11 (Kempson) Formal semantics and representationalism and article 14 (ter Meulen) Formal methods), truth-conditional semantics, feature semantics and so on, especially through his distinction between Sinn and Bedeutung or sense and reference. However, his main interest lay in logic, arithmetic and number theory. Neither Frege nor Peirce was widely discussed in the treatises on linguistic semantics which will be presented below, except for the philosophical and psychological reflections on meaning around the tradition of 'significs'.
Frege was a logician, not a linguist, and he himself pointed out that "[t]o a mind concerned with the beauties of language, what is trivial to the logician may seem to be just what is important" (Frege 1977: 10). The linguists discussed below were all fascinated with the beauty of language.
This chapter focuses on linguistic semantics in Germany, France, and Britain, thus leaving aside Eastern Europe, Russia, Scandinavia, and our closer neighbours, Italy, Spain, the Benelux countries and many more. However, the work carried out in these countries was, as far as I know, strongly influenced by, if not dependent on, the theories developed in Germany, France, and Britain. Terminologically, German linguists initially wrote about Semasiologie, French ones about la sémantique, and some English ones about sematology. In the end the term semantics was universally adopted. In general terms one can say that linguistic semantics emerged from a dissatisfaction with traditional grammar on the one hand, which could not deal adequately with questions of meaning, and with traditional lexicography and etymology on the other, which did not give a satisfactory account of the evolution of meaning, listing the meanings of words in a rather arbitrary fashion instead of looking for a logical, natural or inner order in the succession of meanings. To redefine grammar, scholars looked for inspiration in the available traditions of philosophy; to redefine lexicography they grasped the tools provided by rhetoric, that is, the figures of speech, especially metaphor and metonymy (Nerlich 1998). The development of the field was however not only influenced by internal factors relating to the study of language but also by developments in other fields such as geology and biology, from which semantics imported concepts such as 'uniformitarianism', 'transformation', 'evolution', 'organism' and 'growth'. After initial enthusiasm about ways to give semantics 'scientific' credibility in this way, a debate about whether framing the development and change of meaning in such terms was legitimate would preoccupy semanticists in the latter half of the 19th century. The different traditions in the field of semantics were influenced by different philosophical traditions on the one hand and by different sciences on the other. In the case of German semasiology, the heritage of Kantian philosophy, idealism, the romantic movement, and the new type of philology, or, to quote some names, the works of Immanuel Kant, especially his focus on the 'active subject', Wilhelm von Humboldt and his concepts of 'ergon' and 'energeia' (Schmitter 2001b), and Franz Bopp and his research into comparative grammar, were of seminal importance. German semasiology after Reisig was very much attached to the predominant paradigm in linguistic science, that is, to historical-comparative philology. This might be the reason why the term 'semasiology', as designating one branch of a prospering and internationally respected discipline, was at first so successful in English-speaking countries, where an autonomous approach to semantics was missing. This was not the case in France, where Bréal, from 1866 onwards, used semantic research as a way to challenge the German supremacy in linguistics. Later on, however, German Semasiologie, just like the French tradition of la sémantique, began to be influenced by psychology, and thus these two traditions moved closer together. French semantics, especially the Bréalian version, was influenced by the French philosophy of language, which was rooted in the work of Étienne Bonnot de Condillac and his followers, the Idéologues, on words as signs. But Bréal was also an admirer of Bopp and of Humboldt.
Bréal first expressed his conviction that semantics should be a psychological and historical science in his review of the seminal book La Vie des mots, written in 1887 by Arsène Darmesteter (first published in English, in 1886, based on lectures given in London), and advocated caution in adopting terms and concepts from biology, such as organism and transformation (Bréal 1887). Darmesteter had derived his theory of semantic change from biological models, such as Charles Darwin's theory of evolution and August Schleicher's model of language as an evolving organism and of languages as organised into family trees, transforming
themselves independently from the speakers of the language. Darmesteter applied this conception to words themselves. It is therefore not astonishing to find that Darmesteter's booklet contains a host of biological metaphors about the birth, life and death of words, their struggle for survival, etc. This metaphorical basis of Darmesteter's theory was noted with skepticism by his colleagues, who agreed, however, that Darmesteter's book was the first really sound attempt at analysing how and why words change their meanings. To integrate Darmesteter's insights into his own theoretical framework, Bréal had only to replace the picture of the autonomous change of a language by the axiom that words change their meaning because the speakers and hearers use them in different ways, in different situations (Delesalle 1987, Nerlich 1990). In parallel with Reisig, Darmesteter used figures of speech, such as metaphor, metonymy and synecdoche, to describe the transitions between the meanings of words (Darmesteter 1887), being inspired by the achievements of French rhetoric, especially the work of César Chesneau Du Marsais (1757) on tropes as used in ordinary language. The English tradition of semantics emerged from philosophical discussions about language and mind in the 17th and 18th centuries (John Locke), and about etymology and the search for 'true' and original meaning (John Horne Tooke). Philosophical influences here were utilitarianism and a certain type of materialism. Semantics in its broadest sense was also used at first to underpin religious arguments about the divine origin of language. The most famous figure in what one might call religious semantics was Richard Chenevix Trench, an Anglican ecclesiastic, Dean of Westminster and later Archbishop of Dublin, who wrote numerous books on the history of the English language and the history of English words. His new ideas about dictionaries led the Philological Society in London to the creation of the New English Dictionary, later called the Oxford English Dictionary, which is nowadays the richest source-book for those who want to study semantic change in the English language. After the turn of the 19th to the 20th century one can observe in Britain a rapid increase in books on 'words' - the trivial literature of semantics, so to speak (Nerlich 1992a) - but also a more thoroughly philosophical reflection on meaning, as well as the start of a new tradition of contextualism in the work of (Sir) Alan Henderson Gardiner and John Rupert Firth, mainly influenced by the German linguist Philipp Wegener and the anthropologist Bronislaw Malinowski (see Nerlich & Clarke 1996). Many of those interested in semantics tried to establish classifications of types or causes of semantic change, something that became in fact one of the main preoccupations of 19th-century semanticists. The classifications of types of semantic change were mostly based on logical principles, that is, a typology according to the figures of speech, such as metaphor, metonymy, and extension and restriction, specifying the type of relation or transition between the meanings of a word; the typologies of causes of semantic change were mostly based on psychological principles, specifying human drives and instincts; and finally some classifications applied historical or cultural principles; but most frequently these enterprises used a mixture of all three approaches.
Later on in the century, when the issue of phonetic laws reverberated through linguistics, semanticists tried not only to classify semantic changes, but to find laws of semantic change that would be as strict as the sound laws were then believed to be; in doing so, some believed they would turn linguistics into a 'science' in the sense of natural science.

After this preliminary sketch I shall now deal with the three traditions of semantics one by one: the German, the French, and the British one. However, the reader of the following sections has to keep in mind that the three traditions of semantics are not as
strictly separable as it might appear. There were numerous links of imitation, influence, cross-fertilization, and collaboration.

2. Linguistic semantics in Germany

It has become customary to distinguish two main periods in German semantics: (1) a logico-historical or historico-descriptive one, and (2) a psychologico-explanatory one. The father of the first tradition is Reisig, the father of the second Steinthal.

Christian Karl Reisig (Schmitter 1987, 2004) was, like many other early semanticists, a classical philologist. In his lectures on Latin grammar, given in the 1820s and first published by his student Friedrich Haase in 1839 (2nd edition 1881-1890), he tried to reform the standard view of traditional grammar by adding to it a new discipline: semasiology or Bedeutungslehre, that is, the study of meaning in language. Grammar was normally considered to consist of the study of syntax and etymology (which then meant approximately the same as morphology or Formenlehre). Reisig claims that the word should not only be studied with regard to its form (etymology) and in its relation to other words (syntax), but as having a certain meaning. He points out that there are words whose meaning is determined neither by their form alone nor by their place in the sentence, and that the meaning of these words has to be studied by semasiology. More specifically, semasiology is the study of the development of the meaning of certain words, as well as the study of their use, both phenomena that were covered by neither etymology nor syntax.

Reisig puts forward thirteen principles of semantics, according to which language precedes grammar, languages are the products of nations, not of single human beings, and language comes into being through imagination and enthusiasm in the social interaction of people. We shall see that this dynamic view of language and semantic change became central to historical semantics at the end of the 19th century, when it was also linked to a more contextual approach. According to Reisig the evolution of a particular language is determined by free language-use within the limits set by the general laws of language. These general laws of language are Kant's laws of pure intuition (space and time), and his categories. This means that language and language change are brought about by a dynamic interplay of several forces which Reisig derived from his knowledge of German idealism on the one side and the romantic movement on the other. In line with German idealist philosophy, he wanted to find the general principles of semantic change, assumed to be based on general laws of the human mind. In tune with the romantic movement, he saw that every language nevertheless has its individuality, based on the history of the nation, and he recognized that speakers too have certain degrees of freedom in their creative use of language. This freedom is, however, constrained by certain habitual associations between ideas which had already been discussed in rhetoric, but which semasiology should, he claims, take account of, namely synecdoche, metonymy and metaphor. Whereas rhetoric focuses on their aesthetic function, semasiology focuses on the way these figures have come to structure language use in a particular language. The recognition of rhetorical figures, such as metaphor and metonymy, as habitual cognitive associations and also as procedures of semantic innovation and change was a very important step in the constitution of semantics as an autonomous discipline.
The figures of speech were reinterpreted in two ways and thus emancipated from their definitions in philosophical and literary discourse: they were no longer regarded as mere abuses of language, but as necessary for the life of language, that is, its continuous use; and they were not mere ornaments of speech, but essential instruments of mind and language to cope with ever new communicative needs - an insight deepened in various contributions to the 19th and early 20th-century philosophy of metaphor and rediscovered in modern cognitive semantics (Nerlich & Clarke 2000; see also article 100 (Geeraerts) Cognitive approaches to diachronic semantics).

To summarize: Reisig's approach to semantics is philosophical with regard to the general laws of the mind, as inherited from Kant, but it is also historical, because Reisig stressed the necessity of always studying the Latin texts very closely. One can also claim that his semantics is to some degree influenced by psychology, inasmuch as Reisig adds to Kant's purely philosophical principles of intuition and reason a third source of human language: sensations or feelings. It is also rhetorical and stylistic, a perspective later rejected by his followers Heerdegen and Hey. The theory of the sign that underlies his semantics is very traditional, that is, representational: the sign signifies an idea/concept (Begriff) or a feeling (Empfindung); language represents thought, a thought that itself represents the external world. For Reisig thoughts and feelings exist independently of the language that represents them, a view that Humboldt tried to destroy in his conception of language as constitutive of thought. The study of semantic change can therefore only be the study of the development of ideas or thoughts as reflected in the words that signify them. This development of thought can run along the following ('logical') lines, called metaphor (the interchange of ideas, II: 6), metonymy (the interchange of representations, II: 4), or synecdoche (II: 4). To these he adds the use of originally local prepositions to designate temporal relations, based on the interchange of the forms of intuition, time and space, again an insight that was rediscovered in modern cognitive semantics. Semasiology has to show how the different meanings of a word have emerged from the first meaning (logically and historically, II: 2), that is, how thought has unfolded itself in the meaning of words. This kind of 'Vorstellungssemantik' (Knobloch 1988: 271) would dominate German semasiology until approximately 1930. It was then challenged and eventually overthrown by Leo Weisgerber in linguistics and by Karl Bühler in psychology; 19th-century diachronic and psychological semantics was replaced by 20th-century synchronic and structural semantics.

However, Reisig does not leave it at the narrow view of semantics sketched above. He points out that the meaning of a word is not only constituted by its function of representing ideas, but is determined as well by the state of the language in general and by the use of a word according to a certain style or register (Schmitter 1987: 126). In fact, in dealing with the 'stylistic' problem of choosing a word from a series of words with 'the same' meaning, he reshapes his unidimensional definition of the sign as signifying one concept. Reisig deals here with synonyms, either complete or quasi-synonyms; he even indicates the importance of a new kind of study: synonymology. This rather broad conception of semantics, including word semantics but also stylistics and synonymology, is very similar to the one later advocated by Bréal.
However, Reisig's immediate followers in Germany gradually changed his initial conception of semasiology in the following ways: they dropped the philosophical underpinnings and narrowed the scope of Reisig's semasiology by abandoning the study of words in their stylistic context, reducing semasiology more and more to a purely atomistic study of changes in word-meaning. In this new shape and form, semasiology flowered in Germany, especially after the new edition of Reisig's work in the 1880s. Up to the end of the century a host of treatises on the subject were published by philologists, but also by a number of 'schoolmen' (Nerlich 1992a).
A new impetus to the study of meaning came from the rise of psychological thought in Germany, especially under the influence of Johann Friedrich Herbart, Heymann Steinthal, Moritz Lazarus, and Wilhelm Wundt, the latter three fostering a return to Humboldt's philosophy of language. At the time when Steinthal tried to reform linguistic thought through the application of psychological principles, even the most hard-nosed linguists, the neogrammarians themselves, were forced to turn to psychology. This was due to the introduction of analogy as the second most important principle of language change, apart from sound laws.

As early as 1855 Steinthal had written a book in which he tried to refute the belief held by many of his fellow linguists that language is based on logical principles and that grammar is based on logic. According to Steinthal, language is plainly based on psychological principles, and these principles are largely of a semantic nature. Criticizing the view inherited from Reisig that grammar has three parts, etymology, semasiology, and syntax, he claims that there is meaning (what he calls, after Humboldt, an 'inner form') in etymology as well as syntax. In short, semasiology should be part of etymology and syntax, not separated from them (1855: xxi-xxii). Using the Humboldtian dichotomy of inner and outer form, he wants to study grammar (etymology and syntax) from two points of view: semasiology and phonetics. For him language is 'significant sound', that is, sound and meaning cannot be artificially separated.

In 1871 Steinthal wrote his Abriss der Sprachwissenschaft. Volume I was intended to be an introduction to psychology and linguistics. In this work, he wants to explain the origin of language as meaningful sound. His theory of the emergence of language can be compared to modern symbolic interactionism (Nerlich & Clarke 1998). The principal axiom is that when we emit a sound which is understood by the other in a certain way, we understand not only the sound we made, but we understand ourselves, attain consciousness. The origin of language and of consciousness thus lies in understanding. This principle became very important to Philipp Wegener, who fostered a new approach to semantics, no longer a mere word-semantics, but a semantics of communication and understanding (Wegener 1885/1991). Steinthal's conception of psychology was the basis of an influential book on semantic change in the Greek language by Max Hecht, which appeared in 1888 and was extensively quoted by the classical philologists among the semanticists.

Hermann Paul is often regarded as one of the leading figures in the neogrammarian movement, and his book, the Prinzipien der Sprachgeschichte, first published in 1880, is regarded by some as the bible of the neogrammarians. It is true that Paul intended his book at first to be just that. But already in the second edition (1886) he extensively elaborated his at first rather patchy thoughts on semantic topics, such that Bréal - normally rather critical of neogrammarian thought, especially its phonetic bias - could say in his review of the second edition (Bréal 1887) that Paul's book constituted a major contribution to semantics (Bréal 1897: 307). How had this change of emphasis from sound change to semantic change come about?
In 1885 Wegener, like Paul a follower of Steinthal, had published his Untersuchungen über die Grundfragen des Sprachlebens (Wegener 1885/1991), in which he had devoted a long chapter to semantic change, especially its origin in human communication and interaction. Paul had read (and reviewed) this book (just as Wegener had read and reviewed Paul's). In doing so, Paul must have discovered many affinities between his ideas and those of Wegener, and he must have been inspired to devote more thought to semantic questions.

What were the affinities? The most direct resemblance was their insistence on the interaction between speaker and hearer; here again their debt to Steinthal is clear, as clear as their opposition to another very influential psychologist of language, namely Wundt. Paul's intention was to get rid of abstractions or 'hypostasiations' such as the soul of a people or a language, ghosts that Wundt, and even Steinthal, still tried to catch. These entities, if indeed they are entities, escape, according to Paul, the grasp of any science that wants to be empirical. What can be observed, from a psychological and historical perspective, are only the psychological activities of individuals, but of individuals that interact with others. This social and psychological interaction is a mediated one; it is mediated by physiological factors: the production and reception of sounds. Historical linguistics (and all linguistics should be historical in Paul's eyes) as a science based on principles is therefore closely related to two other disciplines: physiology and the psychology of the individual. From this perspective, language use does not change autonomously, as had been believed by a previous generation of linguists, but neither can it be changed by an individual act of the will. It evolves through the cumulative changes occurring in the speech activity of individuals. This speech activity normally proceeds unconsciously - we are only conscious of what we want to say, not of how we say it or how we change what we use in our speech activity: the sounds and the meanings. Accordingly, Paul devotes one chapter to sound change, one to semantic change, and one to analogy (the latter more concerned with changes in word-forms).

The most important dichotomy that Paul introduced into the study of semantics is that of usual and occasional meaning (usuelle und okkasionelle Bedeutung) (Paul 1920/1975: 75), a distinction that exerted lasting influence on semantics and also on the psychology and philosophy of meaning (Stout 1891, for example). The usual signification is the accumulated sedimentation of occasional significations; the occasional signification, based on the usual one, is imbued with the intention of the speaker and reshaped by being embedded in the situation of discourse. This context-dependency of the occasional signification can have three forms: it depends on a certain perceptual background shared by speaker and hearer; it depends on what has preceded the word in the discourse; and finally it depends on the shared location of discourse. "Put the plates in the kitchen" is understood because we know that we are speaking about that kitchen here and now and no other. These contextual clues facilitate especially the understanding of words which are ambiguous or polysemous in their usual signification (but Paul does not use the term 'polysemy', which had been introduced by Bréal in 1887; Nerlich & Clarke 1997, 2003). A much more radical view of the situation as a factor in meaning construction was put forward by Wegener in 1885 (Nerlich 1990).
Like so many linguists of the 19th century, Paul tries to state the main types of changes of meaning, but he insists that they correspond to the possibilities we have to modify the meaning of words on the level of occasional signification. The first type is the specialization of meaning (what Reisig and Darmesteter would have called 'synecdoche'), which he defines as the restriction of the extension of a word (the set of all actual things the word describes) and the enrichment of its intension (the set of all possible things a word or phrase could describe). This type of semantic change is very common. Paul gives the example of German Schirm, which can be used to designate any object employed as a 'screen'. In its occasional usage it may signify a 'fire-screen', a 'lamp-screen', a 'screen' for the eyes, an 'umbrella', a 'parasol', etc. But normally, on hearing the word Schirm we think of a 'Regenschirm', an umbrella - what cognitive semanticists would now call its prototypical meaning. This meaning has somehow separated itself from the general meaning of 'screen' and become independent. A second basic means to extend (and restrict) word meaning is metaphor. A third type of semantic change is the transfer of meaning to that which is connected with the usual meaning in space, time or causally (what would later be called transfer via 'contiguity'). Astonishingly, Paul does not use the term 'metonymy' to refer to this type of semantic change.

One of the most important contributions to linguistics in general and semantics in particular was made by Wegener in his Untersuchungen über die Grundfragen des Sprachlebens, published in 1885. Although Wegener can to some extent be called a neogrammarian, he never accepted their strict distinction between physiology and psychology, as advocated for example by Hermann Osthoff. For Wegener language is a phenomenon based on the whole human being, psyche and body, of a human being who is an integral part of a communicative situation. Paul was influenced by Wegener, and so was Gardiner, the Egyptologist and general linguist who dedicated his book The Theory of Speech and Language (Gardiner 1932/1951) to Wegener.

So much for some major contributions to German semantics. As one could easily devote an entire book to each of the three national trends in semantic thought, of which the German tradition was by far the most prolific, I can only indicate very briefly the major lines of development that semantics took in Germany after Paul. Many classical philologists continued the tradition started by Reisig. Others took Paul's achievements as a starting point for treatises on semantic change that sought to illustrate Paul's main types of semantic change with more and more examples. Others still, such as Johann Stöcklein, tried to develop Paul's core theory further by stressing, for instance, the importance of the context of the sentence for semantic change (Stöcklein 1898). But most importantly, the influence of psychology on semantics increased strongly. Apart from Herbart's psychology of mechanical association, which had had a certain influence on Steinthal and hence on Paul and Wegener, and apart from some more incidental influences such as that of Sigmund Freud and Carl Gustav Jung on Hans Sperber, for example, the most important development in the field of psychology was Wundt's Völkerpsychologie. Two volumes of his monumental work on the psychology of such collective phenomena as language, myth and custom were devoted to language (Wundt 1900), and of these a considerable part was concerned with semantic change (on the psychology of language in Germany see Knobloch 1988). Wundt distinguished between regular semantic change, based on social processes or, as he said, the psyche of the people, and singular semantic change, based on the psyche of the individual. He divided the first class into assimilative change and complicative change, the second into name-making according to individual (or singular) associations, individual (or singular) transfer of names, and metaphorically used words. In short, the different types of semantic change were mainly based on different types of association processes (similar to Reisig in this way). However, Wundt's work attracted a substantial body of criticism, especially from the psychologist Karl Bühler (see Nerlich & Clarke 1998) and from the philosopher and general linguist Anton Marty, who developed a descriptive semasiology in opposition to Wundt's historical approach (Marty 1908), in this comparable to Raoul de La Grasserie in France (de La Grasserie 1908).
Two other developments in German linguistics must at least be mentioned: the new focus on words and things, that is, on designation (Bezeichnung) rather than on meaning (Bedeutung), and the new focus on lexical and semantic fields instead of single words. After the introduction of the first new perspective, the term 'semasiology' itself changed its meaning, standing now in opposition to 'onomasiology'. The second new perspective led to the flourishing new field of field semantics (Nerlich & Clarke 2000).

3. Linguistic semantics in France

After this sketch of the evolution of semasiology in Germany we now turn to France, where a rather different doctrine was being developed by Michel Bréal, the most famous of French semanticists. But Bréal was by no means the only one interested in semantic questions. Lexicographers, such as Émile Littré (1880) and later Darmesteter and Adolphe Hatzfeld, contributed to the discussion of semantic questions from a 'naturalist' point of view, applying insights of Darwin's theory of evolution, of Lamarckian transformationism and, in the case of Littré, of Auguste Comte's positivism to the problems of etymology. Littré was one of the first to advocate uniformitarian principles in linguistics, which became so important for Darmesteter and Bréal in France and William Dwight Whitney in the United States (Nerlich 1990). According to the uniformitarian view, inherited from geology (Lyell 1830-1833), the laws of language change now in operation, which can therefore be 'observed' in living languages, were the same that structured language change in the past. Hence, one can explain past changes by laws now in operation. Darmesteter was indeed the first to put forward a program for semantics which resembles in its broad scope that of Reisig before him and Bréal after him, and in its emphasis on history that of Paul. He contended that the philosophy of language should focus on the history of languages, the transformations of syntax, grammatical forms and word meanings, as a contribution to the history of the human mind, and he also claimed that the figures of speech structure changes in grammatical forms and syntactic constructions as well.

However, as early as the 1840s, before Littré and Darmesteter, the immediate predecessors of Bréal, another group of linguists had started to do 'semantics' under the heading of idéologie, or, as one of its members later called it, fonctologie, a term obviously influenced by Schleicher's distinction between form, function and relation (Schleicher 1860). French semantics of this type focused, like the later German semasiology, on the isolated word, but even more on the idea it incarnates, and excluded from its investigation the sentential or other contexts. Later on de La Grasserie (1908) proposed an 'integral semantics' based on this framework. He was (with Marty) the first to point out the difference between synchronic and diachronic semantics, or, as he called it, 'la sémantique statique' and 'la sémantique dynamique'. As we shall see in part 4 of this chapter, as early as 1831 the English philosopher Benjamin Humphrey Smart was aware of the dangers of that type of 'ideology' and wanted to replace it by his type of 'sematology', stressing heavily the importance of the context in the production and understanding of meaning - that is, replacing mental association by situational embeddedness (Smart 1831: 252).

However, the real winner in this 'struggle for survival' between opposing approaches to semantics was the new school surrounding Bréal. Language was no longer regarded as an organism, nor did words 'live and die'. The focus was now on the language users, their psychological make-up and the process of mutual understanding. It was in this process that 'words changed their meanings'.
Hence the laws of semantic change were no longer regarded as 'natural' or 'logical' laws, but as intellectual laws (Bréal 1883), or what one would nowadays call cognitive laws. This new psychological approach to semantic problems resembled that advocated in Germany by Steinthal, Paul and Wegener. Paul, Wegener, Darmesteter, and Bréal all stressed that the meaning of a word is not so much determined by its etymological ancestry as by the value it has in current usage, a point of view that moved 19th-century semantics slowly from a purely diachronic to a more synchronic and functional perspective.

Although Bréal and Wegener seem not to have known each other's work, their conceptions of language and of semantics are in some ways astonishingly similar (they also overlap with theories of language developed by Whitney in the United States and Johan Nicolai Madvig (1875) in Denmark; see Hauger 1994). Brought up in the framework of traditional German comparative linguistics, both objected to the reification of language as an autonomous self-evolving system; both saw in psychology a valuable help to gain insight into how people speak and understand each other and change the language as they go along; both made fruitful use of the form-function distinction, where the function drives the evolution of the linguistic form; both assumed that to understand a sentence the hearer has much more to do than to decode it word by word - s/he has to draw inferences from the context of the sentence as a whole, as well as from the context of its use or its function in discourse; and, finally, they both had a much broader conception of historical semantics than their contemporaries, especially some of the semasiologists in Germany and the 'ideologists' in France. In their eyes semantic change is a phenomenon not only of the word or idea, but must be observed at the morphological and syntactic level, too. The evolution of grammar or of syntax is thus an integral part of semantics.

Bréal's thoughts on semantics, gathered and condensed in the Essai de sémantique (Science des significations) (1897), had evolved over many years. The stages in the maturation of his semantic theory were, briefly stated, the following: 1866 - lecture on the form and function of words; 1868 - lecture on latent ideas; 1883 - introduction of the term sémantique for the study of semantic change and more particularly for the search for the intellectual laws of language change in general; 1887 - review of Darmesteter's book on the life of words, a review called quite intentionally 'history of words'. Bréal rejected all talk about the life of words. For him words do not live and die; they change according to the use speakers make of them. But postulating the importance of the speaker was not enough for him; he went so far as to proclaim that the will or consciousness of the speaker are the ultimate forces of language change. This made him very unpopular among those French linguists who still adhered to the biological paradigm of language change, based on Schleicher's views on the transformation of language. But Bréal was also criticised by some of his friends, such as Antoine Meillet. Meillet stressed the role of collective forces, such as social groups, over and above the individual will of the speaker, and as such became important for a new trend in 20th-century French semantics: sociosemantics (Meillet 1904-1905). As mentioned before, Bréal was not the only one who wrote a review of Darmesteter's book. Two of his friends and colleagues had done the same: Gaston Paris and Victor Henry, and they had basically adopted the same stand as Bréal.
Henry should be remembered for his criticism of Bréal's insistence on consciousness, or at least certain degrees of consciousness, as a factor in language change. Henry held the view that all changes in language are the result of unconsciously applied procedures, a view he defended in his booklet on linguistic antinomies (1896) and in his study of a case of glossolalia (1901).

How was Bréal received in Germany, a country where a long tradition of 'semasiology' already existed? It is not astonishing to find that the psychologically oriented Bréal was warmly applauded by Steinthal in his 1868 review of Bréal's lecture on latent ideas. He was also mentioned approvingly by Paul (1920/1975: 78 fn. 2). Bréal's division of linguistics into phonetics and semantics as the study of meaning at the level of the lexicon, morphology and syntax also corresponds to some extent to Steinthal's conception outlined above. It did, however, disturb those who, after Ferdinand Heerdegen's narrowing of the field of semasiology, practiced the study of semantic change almost exclusively on the level of the word, excluding morphology and syntax. This difference between French semantics and German semasiology was noted by Oskar Hey in his review of the Essai (1898: 551). Hey comes to the conclusion that if Bréal had not entirely ignored the German achievements in the field of semasiology, he would have noticed that everything he has to say about semantic change had already been said. He concedes, however, that etymologists, morphologists and syntacticians may have a different view on some parts of Bréal's work than he has as a classical philologist and semasiologist (see p. 555). From this it is clear that Hey had not really grasped the implications of Bréal's novel approach to semantics. Bréal tried to open up the field of historical semantics from a narrow study of changes in word-meaning to the analysis of language change in general, based on the assumption that meaning and change of meaning are a function of discourse.

If one had to answer the question where Bréal's thoughts on semantics came from, if not from German semasiology (Bréal had, however, read Reisig, whom he mentions in the context of a discussion of pronouns, see 1897: 207 fn. 1), one would have to look more closely at 18th-century philosophy of language, specifically the work of the philosopher Étienne Bonnot de Condillac on words as signs and on the progress of knowledge going hand in hand with the progress of language. From this point of view words are not 'living beings' and the fate of language is not mere decay. However, Bréal did not accept Condillac's use of etymology as the instrument to find the real, original meanings of words and to gain insights into the constitution of the human mind (a view also espoused in Britain by Tooke, see below). For Bréal the progress of language is linked to the progressive forgetting of the original etymological meaning; it is based on the liberation of the mind from its etymological burden.

The red thread that runs through Bréal's semantic work is the following insight: to understand the evolution and the structure of languages we should not focus so much on the forms and sounds but on the functions and meanings of words and constructions, used and built up by human beings under the influence of their will and intelligence, on the one hand, and the influence of the society they live and talk in, on the other. Unlike some of his contemporaries, Bréal therefore looked at how ideas, how our knowledge of a language and our knowledge of the world, shape the words we use. However, he was also acutely aware of the fact that this semantic and cognitive side of language studies was not yet on a par with the advances made in the study of phonetics, the more physiological side of language, and had much to learn from the emerging experimental sciences of the human mind (Bréal 1883/1991: 151).
Bréal's most famous contribution to semantics as a discipline was probably his discussion of polysemy, a term he invented in 1887. For him, as for Elizabeth Closs Traugott today, all semantic change arises by polysemy, i.e., new meanings coexist with earlier ones, typically in restricted contexts (Traugott 2005).

French semantics had its peak between 1870 and 1900. Ironically, when the term 'semantics' was created through the translation of Bréal's Essai, interest in semantics faded slightly in France. However, there were some continuations of 19th-century semantics, as for example in the work of Meillet, who focused on the social aspects of semantic change. There was also, just as in Germany, a trend to study affective and emotional meaning, a trend particularly well illustrated by the work of the Swiss linguist and student of Ferdinand de Saussure, Charles Bally, on 'stylistics' (1951), followed finally by a period of syntheses, of which the work of the Belgian writer Albert Joseph Carnoy is the best example (1927).

4. Linguistic semantics in Britain

In Britain the study of semantic change was linked for a long time to a kind of etymology that had also prevailed in 18th-century France, that is, the use of etymology as the instrument to find the real, original meanings of words and so to gain insights into the constitution of the human mind. Genuinely philological considerations only came to dominate the scene by the middle of the century with the creation of the Philological Society in 1842 and its project to create a New English Dictionary.

The influence of Locke's Essay Concerning Human Understanding (1689) on English thinking had been immense, strengthened by John Horne Tooke's widely read Diversions of Purley (Tooke 1786-1805). Tooke's theory of meaning can be summarised in the slogan "one word - one meaning". Etymology has to find this meaning, and any use that deviates from it is regarded as 'wrong' - linguistically and morally; this also has religious implications. Up to the 1830s Tooke was much in vogue. His doctrine was, however, challenged by two philosophers: the Scottish common sense philosopher Dugald Stewart and, following him to some extent, Benjamin Humphrey Smart. In his 1810 essay "On the Tendency of some Late Philological Speculations", "Stewart attacked", as Aarsleff points out, "what might be called the atomistic theory of meaning, the notion that each single word has a precise idea affixed to it and that the total meaning of a sentence is, so to speak, the sum of these meanings." He "went to the heart of the matter, asserting that words gain meaning only in context, that many have none apart from it" (Aarsleff 1967/1983: 103). According to Stewart, words which have multiple meanings in the dictionary, or, as Bréal would say, polysemous words, are easily understood in context.

This contextual view of meaning is endorsed by Smart in his anonymously published book entitled An Outline of Sematology or an Essay towards establishing a new theory of grammar, logic and rhetoric (1831), which was followed by a sequel published in 1851, and finally by his 1855 book on thought and language. In both his 1831 and 1855 books Smart quotes the following lines from Stewart:

(...) our words, when examined separately, are often as completely insignificant as the letters of which they are composed, deriving their meaning solely from the connection or relation in which they stand to others. (Stewart 1810: 208-209)

The Outline is based on Locke's Essay, but goes far beyond it. Smart takes up Locke's threefold division of knowledge into (1) physicology or the study of nature, (2) practology or the study of human action, and (3) sematology, the study of the use of signs for our knowledge, or, in short, the doctrine of signs (Smart 1831: 1-2). This study deals with signs "which the mind invents and uses to carry on a train of reasoning independently of actual existences" (1831: 2, note carried over from p. 1).
In the first chapter of his book, which is devoted to grammar, Smart tries "to imagine the progress of speech upwards as from its first invention" (Smart 1831: 3). It starts with natural cries which have 'the same' meaning as a 'real' sentence composed of different parts, and this because "if equally understood for the actual purpose, [it] is, for this purpose, quite adequate to the artificially compounded sign", the sentence (Smart 1831: 8). But as it is impossible to have a (natural) sign for every occasion or for every purpose (to signify a perception or conception), it was necessary to find an expedient. This expedient was to put together several signs, which had each served a particular purpose, in such a way that they would modify each other and could, united, serve the new purpose, signify something new (see Smart 1831: 9-10). From these rudest beginnings language developed gradually as an artificial instrument of communication.

I cannot go into Smart's presentation of the evolution of the different parts of speech, but it is important to point out that Smart, like Stewart, rejected the notion that words have meaning in isolation. Words have meaning only in the sentence, the sentence only inside a paragraph, the paragraph only inside a text (see Smart 1831: 54-55). Signs in isolation signify notions, or what the mind knows on the level of abstraction. Signs in combination signify perceptions, conceptions and passions (see Smart 1831: 10-12). Words thus have a double force "by which they signify at the same time the actual thought, and refer to knowledge necessary perhaps to come at it" (Smart 1831: 16). This knowledge is not God-given. "It is by frequently hearing the same word in context with others, that a full knowledge of its meaning is at length obtained; but this implies that the several occasions on which it is used, are observed and compared; it implies in short, a constant enlargement of our knowledge by the use of language as an instrument to attain it" (Smart 1831: 18-19). And thus, as only words give access to ideas, ideas do not exist antecedently to language. As language does not represent notions, the understanding of language is not as simple as one might think; it cannot be used to transfer notions from the head of the speaker to the head of the hearer. Instead we use words in such a way that we adapt them to what the hearer already knows (see Smart 1831: 191).

It is therefore not astonishing to find a praise of tropes and figures of speech in the third chapter of the Outline, devoted to rhetoric. Smart claims that they are "essential parts of the original structure of language; and however they may sometimes serve the purpose of falsehood, they are, on most occasions, indispensable to the effective communication of truth. It is only by [these] expedients that mind can unfold itself to mind; - language is made up of them; there is no such thing as an express and direct image of thought." (Smart 1831: 210). Tropes and figures of speech "are the original texture of language and that from which whatever is now plain at first arose. All words are originally tropes; that is expressions turned (...) from their first purpose, and extended to others." (Smart 1831: 214; Nerlich & Clarke 2000).

In his 1855 book Smart wants to correct and extend Locke's thought even further, in particular to get rid of the mistake according to which there is a one-to-one relationship between ideas and words. According to Smart, we do not add meaning to meaning to make sense of a sentence or a text; on the contrary: we subtract (Smart 1855: 139). As an example Smart gives the syntagm old men.
Just as with the French vieillards, the English old men does not mean the same thing as vieux added to hommes. To understand a whole sentence, we step down from what he calls premise to premise until we reach the conclusion.

Unfortunately, Smart's conception of the construction of meaning seems to have had little influence on English linguistic thought in the 19th century. He left, however, an impression on philosophers and psychologists of language, such as Victoria Lady Welby (1911) and George Frederick Stout, who had also read Paul's work, for example. Stout picked up Paul's distinction between usual and occasional meaning, but points out that it "must be noticed, however, that the usual signification is, in a certain sense, a fiction" (Stout 1891: 194) and that: "Permanent change of meaning arises from the gradual shifting of the limits circumscribing the general significations. This shifting is due to the frequent repetition of the same kind of occasional application" (Stout 1891: 196).

What James A.H. Murray later called 'sematology' (in a different sense to Smart's use of the term), that is, the use of semantics in lexicography, received its impulses from the progress in philology and dictionary writing in Germany and from the dissatisfaction with English dictionary writing. This dissatisfaction was first expressed most strongly by Richard Garnett in 1835, when he attacked English dictionaries for overlooking the achievements of historical-comparative philology. Garnett even went as far as to call some of his countrymen's lexicographical attempts "etymological trash" (Garnett 1835: 306). The next to point out certain deficiencies in dictionaries was the man who became much more popular for his views on semantic matters: Trench. He used his knowledge of language to argue against the 'Utilitarians', against the biological transformationists and those who held 'uniformitarian' or evolutionary views of language change. For him language is a divine gift, insofar as God has given us the power of reason, and thus the power to name things. His most popular book was On the Study of Words (1851), used here in its 21st edition of 1890. The difference between Trench and the hitherto prevailing philosophy of language is summarized by Aarsleff in the following way:

(...) by making the substance of language - the words - the object of inquiry, Trench placed himself firmly in the English tradition, which had its beginning in Locke. There was one important difference, however. Trench shared with the Lockean school, Tooke and the Utilitarians, the belief that words contained information about thought, feeling, and experience, but unlike them he did not use this information to seek knowledge of the original, philosophical constitution of the mind, but only as evidence of what had been present to the conscious awareness of the users of words within recent centuries; this interest was not in etymological metaphysics, not in conjectural history, not in material philosophy, but in the spiritual and moral life of the speakers of English. (Aarsleff 1967/1983: 238)

He studied semantic changes as historical records and, at one and the same time, as lessons in changing morals and history. This is best expressed in the chapter title "On the Morality of Words"; the chapter contains 'records of sin' and 'records of good and evil in language'. Despite these moralizing overtones, Trench's purely historical approach to etymology slowly became dominant in Britain, and it found its ultimate expression in the New English Dictionary. However, Trench's book contains some important insights into the nature of language and semantic change which would later be treated more fully by Darmesteter and Bréal. It is also surprising to find that language is for Trench, as it was for Smart, "a collection of faded metaphors" (Trench 1851/1890: 48), that words are for him fossilized poetry (for very similar views, see Jean Paul 1962-1977).
Trench also writes about what we would nowadays call the amelioration or pejoration of word-meaning, about changes in meaning due to politics, commerce and the influence of the church, about the rise of new words according to the needs and thoughts of the speakers, and finally we find a chapter "On the Distinction of Words", which deals with a phenomenon called by Hey, Bréal and Paul the differentiation of synonyms. The study of synonyms deals with the essential (but not entire) resemblance between word-meanings (Trench 1851/1890: 248-249). For Trench there can never be perfect synonyms, and this for the following reason:

Men feel, and rightly, that with a boundless world lying around them and demanding to be catalogued and named [...], it is a wanton extravagance to expend two or more signs on that which could adequately be set forth by one - an extravagance in one part of their expenditure, which will be almost sure to issue in, and to be punished by, a corresponding scantness and straitness in another. Some thought or feeling or fact will wholly want one adequate sign, because another has two. Hereupon that which has been well called the process of 'desynonymizing' begins - that is, of gradually discriminating in use between words which have hitherto been accounted perfectly equivalent, and, as such, indifferently employed. (...) This may seem at first sight only as a better regulation of old territory: for all practical purposes it is the acquisition of new. (Trench 1851/1890: 258-259)

Trench's books, which became highly popular, must have sharpened every educated Englishman's and Englishwoman's awareness of all kinds and sorts of semantic change. On a more scientific level Trench's influence was even more profound. As Murray wrote in the Preface to the first volume of the New English Dictionary, the "scheme [for the NED] originated in a resolution of the Philological Society, passed in 1857, at the suggestion of the late Archbishop Trench, then Dean of Westminster" (Murray 1884: v). In this dictionary the new historical method in philology was for the first time applied to the "life and use of words" (ibid.). The aim was "to furnish an adequate account of the meaning, origin, and history of English words now in general use, or known to have been in use at any time during the last seven hundred years" (ibid.). The dictionary endeavoured

(1) to show, with regard to each individual word, when, how, in what shape, and with what signification, it became English; what development of form and meaning it has since received; which of its uses have, in the course of time, become obsolete, and which still survive; what new uses have since arisen, by what processes, and when: (2) to illustrate these facts by a series of quotations ranging from the first known occurrence of the word to the latest, or down to the present day; the word being thus made to exhibit its own history and meaning: and (3) to treat the etymology of each word strictly on the basis of historical fact, and in accordance with the methods and results of modern philological science. (Murray 1884: vi)

Etymology was no longer seen as an instrument used to gain insights into the workings of the human mind, as philosophers in France and Britain had believed at the end of the 18th century, or to discover the truly original meaning of a word. It was now put on a purely scientific, i.e. historical, footing. But, as we have seen, by then a new approach to semantics, fostered by Steinthal and Bréal under the influence of psychological thought, brought back considerations of the human mind, of the speaker, and of communication, opening up semantics from the study of the history of words in and for themselves to the study of semantic change in the context of psychology and sociology.
One Cambridge philosopher who knew Scottish common sense philosophy just as well as Kantianism, and who studied meaning in the context of communication, was John Grote. In the 1860s, he developed a theory of meaning as use and of thinking as a social activity based on communication (Gibbins 2007). His focus on 'living meaning' as opposed to 'fossilised' or historical meaning had parallels with the theory of meaning developed by Bréal at the same time and can be regarded as a direct precursor of ordinary language philosophy.

Influenced by Wegener, by Ferdinand de Saussure (and his distinction between speech and language) and by the anthropologist Malinowski (and his claim that language can only be studied as part of action), Gardiner and then Firth tried to develop a new contextualist approach to semantics and laid the foundation for the London School of Linguistics. However, by the 1960s Britain, like the rest of Europe, began to feel the influence of structuralism (Pottier 1964). Gustav Stern (1931) in Sweden tried to synthesise achievements in semantics, especially with reference to the English language. Stephen Ullmann in Oxford attempted to bridge the gap between the old (French and German) type of historical-psychological semantics and the new type of structural semantics in his most influential books on semantics, written in 1951 and 1962. Although brought up in the tradition of Gardiner and Firth, Sir John Lyons finally swept away the old type of semantics and advocated the new 'structural semantics' in 1963 (Lyons 1963). The focus was now both on how meanings are shaped by their mutual relations in a system, rather than by their evolution over time, and on the way meanings are constituted internally by semantic features, which were by some thought to be invariant (see Matthews 2001; for more information see article 16 (Bierwisch) Semantic features and primes). Variation and change, discourse and society, mind and metaphor, which had so fascinated earlier linguists, were sidelined but later rediscovered inside frame semantics (see article 29 (Gawron) Frame Semantics), prototype semantics (see article 28 (Taylor) Prototype theory), cognitive semantics (see article 27 (Talmy) Cognitive Semantics) and studies of grammaticalisation (see article 101 (Eckhardt) Grammaticalization and semantic reanalysis).

5. Conclusion

One of the pioneers in the history of semantics, Dirk Geeraerts, has in the past distinguished between five stages in the history of lexical semantics, namely pre-structuralist diachronic semantics, structuralist semantics, lexical semantics as practised in the context of generative grammar, logical semantics, and cognitive semantics (Geeraerts 1997). Geeraerts himself has now provided a masterly overview of the history of semantics from its earliest beginnings up to the present (Geeraerts 2010), with chapters on historical-philological semantics, structuralist semantics, generativist semantics, neostructuralist semantics and cognitive semantics. His first chapter is at the same time broader and narrower than the history of early (pre-structuralist) semantics provided here. It provides an overview of speculative etymology in antiquity as well as of the rhetorical tradition, which I do not cover in this chapter; and when Geeraerts focuses on developments between 1830 and 1930 he mainly focuses on Bréal and Paul. So I hope that, by examining discussions of what words mean and how meaning is achieved as a process between speaker and hearer, author and reader and so on in three national traditions of early semantics, the French, the British and the German, between around 1830 and 1930, and by focusing more on context than cognition, I supplement Geeraerts' work to some extent.

6. References

Aarsleff, Hans 1967/1983. The Study of Language in Britain 1780-1860. 2nd edn. London: Athlone Press, 1983.
Bally, Charles 1951. Traité de Stylistique Française. 3rd edn. Genève: George Klincksieck.
Bréal, Michel 1883. Les lois intellectuelles du langage. Fragment de sémantique. Annuaire de l'Association pour l'encouragement des études grecques en France 17, 132-142.
Bréal, Michel 1883/1991. The Beginnings of Semantics: Essays, Lectures and Reviews. Translated and introduced by George Wolf. Reprinted: Stanford, CA: Stanford University Press, 1991.
Bréal, Michel 1887. L'histoire des mots. Revue des Deux Mondes 1, 187-212.
Bréal, Michel 1897. Essai de Sémantique (Science des Significations). Paris: Hachette.
Carnoy, Albert J. 1927. La Science du Mot. Traité de Sémantique. Louvain: Éditions 'Universitas'.
Darmesteter, Arsène 1887. La Vie des Mots Étudiée d'après leurs Significations. Paris: Delagrave.
Delesalle, Simone 1987. Vie des mots et science des significations: Arsène Darmesteter et Michel Bréal. DRLAV. Revue de Linguistique 36-37, 265-314.
Firth, John R. 1935/1957. The technique of semantics. Transactions of the Philological Society for 1935, 36-72. Reprinted in: J.R. Firth, Papers in Linguistics 1934-1951. London: Oxford University Press, 1957, 7-33.
Frege, Gottlob 1977. Thoughts. In: P. Geach (ed.). Logical Investigations. Gottlob Frege. New Haven, CT: Yale University Press, 1-30.
Gardiner, Alan H. 1932/1951. The Theory of Speech and Language. 2nd edn., with additions. Oxford: Clarendon Press, 1951.
Garnett, Richard 1835. English lexicography. Quarterly Review 54, 294-330.
Geeraerts, Dirk 1997. Diachronic Prototype Semantics. A Contribution to Historical Lexicology. Oxford: Oxford University Press.
Geeraerts, Dirk 2010. Theories of Lexical Semantics: A Cognitive Perspective. Oxford: Oxford University Press.
Gibbins, John R. 2007. John Grote, Cambridge University and the Development of Victorian Thought. Exeter: Imprint Academic.
de la Grasserie, Raoul 1908. Essai d'une Sémantique Intégrale. Paris: Leroux.
Hauger, Brigitte 1994. Johan Nicolai Madvig. The Language Theory of a Classical Philologist. Münster: Nodus.
Hecht, Max 1888. Die griechische Bedeutungslehre: Eine Aufgabe der klassischen Philologie. Leipzig: Teubner.
Henry, Victor 1896. Antinomies Linguistiques. Paris: Alcan.
Henry, Victor 1901. Le Langage Martien. Paris: Maisonneuve.
Hey, Oskar 1898. Review of Bréal (1897). Archiv für Lateinische Lexicographie und Grammatik 10, 551-555.
Jean Paul [Johann Paul Friedrich Richter] 1962-1977 [1804]. Vorschule der Ästhetik [Introduction to Aesthetics]. In: Werke [Works]. Edited by Norbert Miller. Abt. 1, Vol. 5. Darmstadt: Wissenschaftliche Buchgesellschaft, 7-330.
Knobloch, Clemens 1988. Geschichte der psychologischen Sprachauffassung in Deutschland von 1850 bis 1920. Tübingen: Niemeyer.
Littré, Émile 1880. Pathologie verbale ou lésion de certains mots dans le cours de l'usage. In: Études et Glanures pour Faire Suite à l'Histoire de la Langue Française. Paris: Didier, 1-68.
Locke, John 1975/1689. Essay on Human Understanding. Oxford: Oxford University Press. 1st edn. 1689.
Lyell, Charles 1830-1833. Principles of Geology, Being an Attempt to Explain the Former Changes of the Earth's Surface by Reference to Causes now in Operation. London: Murray.
Lyons, John 1963. Structural Semantics. Oxford: Blackwell.
Madvig, Johan N. 1875. Kleine philologische Schriften. Leipzig: Teubner. Reprinted: Hildesheim: Olms, 1966.
Du Marsais, César Chesneau 1757. Des Tropes ou des Différens Sens dans lesquels on Peut Prendre un Même Mot dans une Même Langue. Paris: chez David.
Matthews, Peter 2001. A Short History of Structural Linguistics. Cambridge: Cambridge University Press.
Steinthal, Heymann 1855. Grammatik, Logik und Psychologie: Ihre Prinzipien und ihr Verhältnis zueinander. Berlin: Dümmler.
Steinthal, Heymann 1871. Abriss der Sprachwissenschaft 1. Berlin: Dümmler.
Stern, Gustav 1931. Meaning and Change of Meaning. With Special Reference to the English Language. Göteborg: Elanders Boktryckeri Aktiebolag. Reprinted: Bloomington, IN: Indiana University Press, 1968.
Stewart, Dugald 1810. Philosophical Essays. Edinburgh: Creech.
Stöcklein, Johann 1898. Bedeutungswandel der Wörter. Seine Entstehung und Entwicklung. München: Lindauer.
Stout, George Frederick 1891. Thought and language. Mind 16, 181-197.
Tooke, John H. 1786-1805. Epea pteroenta; or, the Diversions of Purley. 2nd edn. London: printed for the author.
Traugott, Elizabeth C. 2005. Semantic change. In: K. Brown (ed.). Encyclopedia of Language and Linguistics. Amsterdam: Elsevier.
Trench, Richard C. 1851/1890. On the Study of Words. 21st edn. London: Kegan Paul, Trench, Trübner & Co., 1890.
Ullmann, Stephen 1951. The Principles of Semantics. A Linguistic Approach to Meaning. 2nd edn. Glasgow: Jackson, 1957.
Ullmann, Stephen 1962. Semantics. Oxford: Blackwell.
Wegener, Philipp 1885/1991. Untersuchungen über die Grundfragen des Sprachlebens. Halle/Saale: Niemeyer.
Welby, Victoria Lady 1911. Significs and Language. The Articulate Form of our Expressive and Interpretative Resources. London: Macmillan & Co.
Wundt, Wilhelm 1900. Völkerpsychologie I: Die Sprache. Leipzig: Engelmann.

Brigitte Nerlich, Nottingham (United Kingdom)

10. The influence of logic on semantics

1. Overview
2. Pre-Fregean logic
3. Gottlob Frege's progress
4. Bertrand Russell's criticism and his theory of definite descriptions
5. Rudolf Carnap's theory of extension and intension: Relying on possible worlds
6. Willard V. O. Quine: Logic, existence and propositional attitudes
7. Necessity and direct reference: The two-dimensional semantics
8. Montague-Semantics: Compositionality revisited
9. Generalized quantifiers
10. Intensional theory of types
11. Dynamic logic
12. References

Abstract

The aim of this contribution is to investigate the influence of logical tools on the development of semantic theories and vice versa. Pre-19th-century logic was limited to a few sentence forms and their logical interrelations. Modern predicate logic and later type logic, both inspired by investigating the meaning of mathematical sentences, widened
the view for new sentence forms and thereby made logic relevant for a wider range of expressions in natural language. In a parallel course of developments, problems of different levels of meaning, like sense and reference or intension and extension, were studied and initiated a shift to modal contexts in natural language. Montague bundled a great part of these developments in a unified intensional type-theoretical framework which had a strong impact on formal approaches in natural language semantics. While the logical developments mentioned so far can be seen as direct answers to natural language phenomena, the first approaches to dynamic logic did not get their motivation from natural language but from the semantics of computer programming. Here, a logical toolset was adapted to specific problems of natural language semantics.

1. Overview

The aim of this contribution is to investigate the influence of logical tools on the development of semantic theories and vice versa. The article starts with an example of pre-Fregean logic, i.e. Aristotelian syllogistics. This traditional frame of logic has severe limits as a theory of meaning (e.g. no possibility of multiple quantification, no variety of scope). These limitations were overcome by Frege's predicate logic, which is the root of standard modern logic and the basis for the first developments in a formal philosophy of language: We will present Frege's analysis of mathematical sentences and his transfer of it to the analysis of natural language sentences by introducing a theory of sense and reference. The insufficient treatment of singular terms in Frege's logic is one reason for Russell's theory of definite descriptions. Furthermore, Russell tries to organize semantics without a difference between sense and reference. Carnap introduces the first logical framework for a semantics of possible worlds and shows how one can keep the Fregean semantic intuitions without using the problematic tool of "senses". Carnap develops a systematic theory of intension and extension, defining the intension of an expression as a function from possible worlds to the relevant extension. An important formal step was then the invention of modal logic by Kripke. This is the framework for the theory of direct reference of proper names and for the so-called two-dimensional semantics, which is needed to arrive at an adequate treatment of names, definite descriptions, and especially indexicals. Tarski's formal theory of truth is used by Davidson to argue that truth-conditions are the adequate tool to characterize the meaning of assertions. Although the idea of a truth-conditional semantics has been in the background since Frege, with Davidson's work it became the leading idea of modern semantics.

The second part of the article (starting with section 8) concentrates on important progress made within this overall framework of truth-conditional semantics. Montague presented a compositional formal semantics including quantifiers, intensional contexts and the phenomenon of deixis. His ideal was to offer an absolute truth-condition for any sentence. In the next step, new formal tools were invented to account not only for the extralinguistic environment but also for the discourse as a core feature of the meaning of utterances. Context-dependency in this sense is considered in approaches of dynamic semantics.
Formal tools are nowadays not only used to answer the leading question "What is the meaning of a natural language expression?" In recent developments new logical formalisms are used to answer questions like "How is a convention established?" and "How can we account for the pragmatics of the utterance?" It has become clear that truth-conditional semantics has to be completed by aspects of the environment, social conventions and speakers' intentions in order to arrive at an adequate account of meaning. Therefore the latest trend is to offer new formal tools that can account for these features.

2. Pre-Fregean logic

The first system of logic was founded by Aristotle (see Aristotle 1992). He organized inferences according to a syllogistic schema which consists of two premises and a conclusion. Each sentence of such a syllogistic schema contains two predicates (F, G) and a quantifier in front of the first predicate (some, every); a negation (not) may be added in front of the second predicate. Each syllogistic sentence thus has the structure "Some/every F is/is not G". We then receive four possible types of sentences: A sentence is universal if it starts with "every" and particular if it starts with "some". A sentence is affirmative if it does not contain a negation in front of the second predicate, otherwise it is negative. We receive Tab. 10.1.

Tab. 10.1: Syllogistic types of sentences

NAME   FORM               TITLE
a      Every F is G       Universal Affirmative
i      Some F is G        Particular Affirmative
e      Every F is not G   Universal Negative
o      Some F is not G    Particular Negative

The name of the affirmative syllogistic sentences is due to the Latin word "affirmo": the first vowel represents the universal sentence while the second vowel represents the particular one. The name of the negative syllogistic sentences is due to the Latin word "nego", again with the same convention concerning the use of the first and second vowel. As we will see, the sequence of vowels is also used to represent the syllogistic inferences. If we introduce the further quantifier "no" we can find equivalent representations but no new propositions: The sentence "No F is G" is equivalent to (e) "Every F is not G", and "No F is not G" is equivalent to (a) "Every F is G". The proposition (e) is intuitively easier to grasp in the form "No F is G", while proposition (a) is more understandable in the original format "Every F is G". So we continue with these formulations. On the basis of these sentences we can systematically arrange the typical syllogistic inferences, e.g. the inference called "barbara" because it contains three sentences of the form (a).

Tab. 10.2: Barbara

Premise 1 (a):   Every G is H.   abbreviation: GaH
Premise 2 (a):   Every F is G.   abbreviation: FaG
Conclusion (a):  Every F is H.   abbreviation: FaH

Now we can start with systematic variations of the four types of sentences (a, i, e, o). The aim of Aristotle was to select all and only those inferences which are valid. Given the same structure of the predicates in the premises and the conclusion, and varying only the kind of sentence, we receive e.g. the valid inferences of Tab. 10.3.
Tab. 10.3: Same predicate structure, varying types of sentences

Barbara:   Every M is H. Every F is M. Therefore: Every F is H.
Darii:     Every M is H. Some F are M. Therefore: Some F are H.
Ferio:     No M is H. Some F is M. Therefore: Some F is not H.
Celarent:  No M is H. Every F is M. Therefore: No F is H.

To present the complete list of possible syllogistic inferences we have to account for different kinds of predicate positions in the inference. We can distinguish four general schemata, including the one we have already presented. Our first schema has the general structure (I); the other structures (II to IV) are also given in Tab. 10.4.

Tab. 10.4: Predicate structures

I.     II.    III.   IV.
M H    H M    M H    H M
F M    F M    M F    M F
F H    F H    F H    F H

For each general format we can vary the kinds of sentences that are involved in the way presented above. This leads to all possible syllogistic inferences in Aristotelian logic. In making this claim we are ignoring the fact that Aristotle already worked out a modal logic, cf. Nortmann (1996). Concentrating on nonmodal logic, we have presented the core of the Aristotelian system. Although it was an ingenious discovery in ancient times, the Aristotelian system has strong limitations: Strictly speaking, there is no space in syllogistic inferences for (a) singular terms or (b) existence claims (like "Trees exist"), and there are only very limited possibilities for quantification (see the section on Frege's progress). In particular, there is no possibility for multiple uses of quantifiers in one sentence. This is the most important progress due to the predicate logic essentially developed by Gottlob Frege. Before we present this radical step into modern logic, we briefly describe some core ideas of G.W. Leibniz, who already invented some influential ideas on the way to modern logic. Leibniz is well known for introducing the idea of a calculus of logical inferences. He introduced the idea that the syntax of sentences mirrors the logical structure of the thoughts expressed and that a purely syntactic procedure of proving a sentence can be defined. This leads to the modern understanding of a syntactic notion of proof, which ideally allows one to decide for all sentences, simply on the basis of syntactic transformations, whether they are provable or not. The logic systems developed by Leibniz are essentially more advanced than Aristotelian syllogistics. It has been shown that his logic is equivalent to Boolean logic, i.e. monadic predicate logic, see Lenzen (1990). Furthermore, Leibniz introduced a calculus of concepts defining concept identity, inclusion, containment and addition, see Zalta (2000) and Lenzen (2000). He reserved a special place for individual concepts. Since his work had almost no influence on the general development of logic, the main ideas are only mentioned here.

Ignoring a lot of interesting developments (e.g. modal systems), we can characterize a great deal of the logical systems from Aristotle until the 19th century by the square of opposition (cf. article 8 (Meier-Oeser) Meaning in pre-19th century thought). The square of opposition (see Fig. 10.1) already involves the essential distinction between different understandings of "oppositions": A contradiction of a sentence is an external negation of the sentence ("It is not the case that ..."), while the contrary involves
an "internal" negation. What is meant by an "internal" negation can only be illustrated by transforming the syllogistic sentences into modern predicate logic.

Fig. 10.1: Square of oppositions (a diagram relating the corners "Every S is P", "No S is P", "Some S is P" and "Some S is not P")

The most important general features in the traditional understanding are the following: (i) Of two contradictory sentences one must be false and one true. (ii) Two contrary sentences cannot both be true (but they can both be false). (iii) Two subcontrary sentences cannot both be false (but they can both be true). A central problem pointed out by Abelard in the Dialectica (1956) is the presupposition of existential import: According to one understanding of the traditional square of opposition, it presupposes that sentences like "Every F is G" or "Some F is G" imply "There is at least one thing which is F". This is the so-called existential import condition, which leads into trouble. The modern transformation of the syllogistic sentences into predicate logic does not involve the existential presupposition. Let us use an example: On the one hand, "Some man is black" implies that at least one thing is a man, namely the man who has to be black if "Some man is black" is true. On the other hand, "Some man is not black" also implies that something is a man, namely the man who is not black if "Some man is not black" is true. But these are two subcontrary sentences, i.e. according to the traditional view they cannot both be false; one has to be true. Therefore (since both imply that there is a thing which is a man) it follows that men exist. In such a logic the use of a predicate F in a sentence "Some F ..." presupposes that F is non-empty (simply given the meaning of F and the traditional square of opposition), i.e. there are no empty predicates. But of course (as Abelard points out) men might not exist. This observation leads to the modern square of opposition, which uses the predicate logic reading of the sentences, leaves out the relations of contraries and subcontraries, and keeps only the relation of the contradictories. The leading intuition behind avoiding the problematic consequence mentioned above is the claim that meaningful universal sentences like "Every man is mortal" do not imply that men exist. The existential import is denied for both predicates in universal sentences. Relying on the interpretation of particular sentences like "Some men are mortal" according to modern predicate logic, the truth of a particular sentence just means that there is at least one man who is mortal. If we want to allow empty predicates, we thus receive the modern square of opposition.
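These definitions can be made fully explicit in a small program. The following minimal Haskell sketch (a toy illustration with an invented finite domain, not part of the traditional material) interprets the four sentence forms over a finite domain and shows that the modern reading carries no existential import: with an empty predicate F, "Every F is G" comes out true while "Some F is G" comes out false.

  type Pred a = a -> Bool

  aForm, iForm, eForm, oForm :: [a] -> Pred a -> Pred a -> Bool
  aForm dom f g = all (\x -> not (f x) || g x) dom        -- Every F is G
  iForm dom f g = any (\x -> f x && g x) dom              -- Some F is G
  eForm dom f g = all (\x -> not (f x) || not (g x)) dom  -- No F is G
  oForm dom f g = any (\x -> f x && not (g x)) dom        -- Some F is not G

  -- barbara on a given model: if both premises hold, so does the conclusion
  barbara :: [a] -> Pred a -> Pred a -> Pred a -> Bool
  barbara dom f g h = not (aForm dom g h && aForm dom f g) || aForm dom f h

  main :: IO ()
  main = do
    let dom = [1 .. 10] :: [Int]
        f _ = False                    -- an empty predicate
    print (aForm dom f even)           -- True: no existential import
    print (iForm dom f even)           -- False
    print (barbara dom odd even even)  -- True, as on every model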
3. Gottlob Frege's progress

Frege was a main figure from two points of view: He introduced modern predicate logic, and he also developed the first systematic modern philosophy of language by transferring his insights in logic and mathematics into a philosophy of language. His logic was first developed in the Begriffsschrift (1879, in the following shortly: BS), see Frege (1977). He used a special notation to characterize the content of a sentence by a horizontal line (the content line) and the act of judging the content by a vertical line (the judgement line):

⊢ A

This distinction is a clear precursor of the distinction between illocution (the type of speech act) and proposition (the content of the speech act) in Searle's speech act theory. Frege's main aim was to clarify the status of arithmetical sentences. Dealing with a mathematical expression like "3²", Frege analyzes it into the functional expression "( )²" and the argument expression "3". The functional expression refers to the function of squaring something, while the argument expression refers to the number 3. An essential feature of the functional expression is its being unsaturated, i.e. it has an empty space that needs to be filled by an argument expression to constitute a complete sentence. The argument expression is saturated, i.e. it has no space for any addition. Frege transferred this observation into the philosophy of language: Predicates are typical expressions which are unsaturated, while proper names and definite descriptions are typical examples of saturated expressions. Predicates refer to concepts, while proper names and definite descriptions refer to objects. Since predicates are unsaturated expressions which need to be completed by a saturated expression, Frege defines concepts (as the reference of predicates) as functions which - completed by objects (as the referents of proper names and other singular terms) - always have a truth value (the True or the False) as result. The truth value is the reference of the sentence composed of the proper name and the predicate. By analogy with mathematical sentences, Frege starts to analyze sentences of natural language and develops a systematic theory of meaning. Before outlining some basic aspects of this project, we first introduce the idea of a modern system of logic. Frege developed the following propositional calculus (for a detailed reconstruction of Frege's logic see von Kutschera 1989, ch. 3):

Axioms:

(1) a. A → (B → A)
    b. (C → (B → A)) → ((C → B) → (C → A))
    c. (D → (B → A)) → (B → (D → A))
    d. (B → A) → (¬A → ¬B)
    e. ¬¬A → A
    f. A → ¬¬A

A rule of inference, which allows one to derive theorems by starting with two axioms or theorems:

(2) A → B, A ⊢ B
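For illustration, the theorem A → A can be derived in this calculus using only axioms (1a) and (1b) and rule (2); the derivation is the standard one for such calculi and not Frege's own example:

(i) A → ((A → A) → A)  [instance of (1a)]
(ii) (A → ((A → A) → A)) → ((A → (A → A)) → (A → A))  [instance of (1b), with C := A and B := A → A]
(iii) (A → (A → A)) → (A → A)  [from (ii) and (i) by (2)]
(iv) A → (A → A)  [instance of (1a)]
(v) A → A  [from (iii) and (iv) by (2)]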
If we add an axiom and a rule we receive a system of predicate logic that is complete and consistent. Frege suggested the following axiom:

(3) ∀xA[x] → A[a] (BS: 51).

The additionally relevant rule of inference was not explicitly marked by Frege but presupposed implicitly:

(4) A → B[a] ⊢ A → ∀xB[x], if "a" does not occur in the conclusion (BS: 21).

Frege tried to show the semantic consistency of the predicate calculus (which was then intensely debated) but he did not try to prove its completeness, since he lacked the notion of interpretation needed to develop such a proof (von Kutschera 1989: 34). A formal system of axioms and rules is complete for first-order predicate logic (FOPL) if all sentences logically valid in FOPL are derivable in the formal system. The completeness proof was first worked out by Kurt Gödel (1930). Frege already included second-order predicates in his system of logic. The interesting fact that second-order predicate logic is incomplete was first shown by Kurt Gödel (1931).

One of the central advantages of modern predicate logic for the development of semantics is the fact that we can now use as many quantifiers in sequence as we want. We have, of course, to take care of the meaning of the quantifiers given their sequence. The following sentences, which cannot be expressed in the system of Aristotelian syllogistics, can be nicely expressed in modern predicate logic, using "L" as a shorthand for the two-place predicate "( ) loves ( )".

(5) Everyone loves everyone: ∀x∀y L(x, y)

Using mixed quantifiers, their sequence becomes relevant:

(6) a. Someone loves everyone: ∃x∀y L(x, y)
    b. Everyone loves someone: ∀x∃y L(x, y) [This can be a different person for everyone]
    c. Someone is loved by everyone: ∃y∀x L(x, y) [There is (at least) one specific human being who is loved by all human beings]
    d. Everyone is loved by someone: ∀y∃x L(x, y)
    e. Someone loves someone: ∃x∃y L(x, y) [There is (at least) one human being who loves (at least) one human being]
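The scope contrasts in (6) can be checked mechanically over a finite model. A minimal Haskell sketch (with an invented toy interpretation of L):

  dom :: [Int]
  dom = [1, 2, 3]

  loves :: Int -> Int -> Bool     -- L(x,y): "x loves y"
  loves x y = x == y              -- everyone loves exactly him- or herself

  main :: IO ()
  main = do
    print (any (\x -> all (\y -> loves x y) dom) dom)  -- (6a) ∃x∀y: False
    print (all (\x -> any (\y -> loves x y) dom) dom)  -- (6b) ∀x∃y: True

The two readings come apart in this model, which is exactly the point of admitting sequences of mixed quantifiers.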
Frege's philosophy of language is based on a principle of compositionality (cf. article 6 (Pagin & Westerstahl) Compositionality), i.e. the principle that the value of a complex expression is determined by the values of its parts plus their composition. He developed a systematic theory of sense and reference (cf. article 3 (Textor) Sense and reference). The reference of a proper name is the designated object and the reference of a predicate is a concept, while both determine the reference of the sentence, i.e. its truth value. Given this framework of reference it follows that the sentences

(7) The morning star is identical with the morning star.

and

(8) The morning star is identical with the evening star.

have the same reference, i.e. the same truth value: The truth value is determined by the reference of the name and the predicate. Each token of the predicate refers to the same concept and the two names refer to the same object, the planet Venus. But sentence (7) is uninformative while sentence (8) is informative. Therefore, we need a new aspect of meaning to account for the informativity: the sense of an expression. The sense of a proper name is a mode of presentation of the designated object, e.g. "the evening star" expresses the mode of presentation characterized as the brightest star in the evening sky. Furthermore, the sense of a sentence is a thought. The latter (in the case of simple sentences like "Socrates is a philosopher") is constituted by the sense of the predicate "( ) is a philosopher" and the sense of the proper name "Socrates". Frege defines the sense of an expression in general as the mode of presentation of the reference. To develop a consistent theory of sense and reference, Frege introduced different senses for one and the same expression in different linguistic contexts; e.g. indirect speech (propositional attitude ascriptions) and quotation are contexts in which the sense of an expression changes. Frege's philosophy of language has at least two major problems: (1) the necessity of an infinite hierarchy of senses to account for recursive syntactic structures (John believes that Mary believes that Karl believes ...) and (2) the problem of indexical expressions (which is accounted for in two-dimensional semantics and dynamic semantics, see below).

4. Bertrand Russell's criticism and his theory of definite descriptions

Russell (1903, partly in cooperation with Whitehead (Russell & Whitehead 1910-1913)) also developed both a system of logic and a philosophy of language, in contrast to Frege, such that we nowadays speak of Neo-Fregean and Neo-Russellian theories of meaning. Let us first have a look at Russell's logical considerations. Russell developed his famous paradox, which was a serious problem for Frege because Frege presupposed in his system that sets of sets can be produced in an unconstrained manner. But if there are no constraints we run into Russell's paradox: Let R be the set of all sets which are not members of themselves. Then R is neither a member of itself nor not a member of itself. Symbolically, let R := {x : x ∉ x}. Then R ∈ R iff R ∉ R. To illustrate the consideration: If R is a member of itself it must fulfill the definition of its members, i.e. it must not be a member of itself. If R is not a member of itself then it should not fulfill the definition of its members, i.e. it must be a member of itself. When Russell communicated his discovery in a letter to Frege, who was just completing the Grundgesetze der Arithmetik, Frege was in despair because the foundations of his system were undermined. Russell himself developed a solution by introducing a theory of types (1908). The leading idea is that we always have to clarify the objects to which a function will apply before the function can be defined exactly. This leads to a strict distinction between object language and meta-language: We can avoid the paradox by avoiding self-reference, and this can be done by arranging all sentences (or, equivalently, all propositional functions) into a hierarchy. The lowest level of this hierarchy will consist of sentences about individuals.
The next lowest level will consist of sentences about sets of individuals. The next lowest level will consist of sentences about sets of sets of individuals, and so on. It is then possible to refer to all objects for which a given condition (or predicate) holds only if they are all at the same level, i.e. of the same type. The theory of types is a central element in the modern theory of truth and thereby also for semantic theories.

Russell's contribution to the philosophy of language is essentially connected with his analysis of definite descriptions (Russell 1905). The meaning of the sentence "The present King of France is bald" is analyzed as follows:

1. there is an x such that x is the present King of France: ∃x(Fx);
2. for every x that is the present King of France and every y that is the present King of France, x equals y (i.e. there is at most one present King of France): ∀x(Fx → ∀y(Fy → y = x));
3. for every x that is the present King of France, x is bald: ∀x(Fx → Bx).

Taken together, the three clauses amount to the single formula ∃x(Fx ∧ ∀y(Fy → y = x) ∧ Bx). Since France is no longer a kingdom, assertion 1. is plainly false; and since our statement is the conjunction of all three assertions, our statement is false. Russell's analysis of definite descriptions involves a strategy to develop a purely extensional semantics, i.e. a semantic theory that can characterize the meaning of sentences without introducing the distinction between sense and reference or any related distinction between intensional and extensional meanings. Definite descriptions are analyzed such that no singular term remains in the reformulation, and ordinary proper names are, according to Russell's theory, hidden definite descriptions. His strategy eliminates singular terms with only one exception: He needs the basic singular term "this/that" to account for our speech about sense-data (Russell 1910). Since he takes an acquaintance relation with sense-data (and also with universals), including a sense-data ontology, as a basic presupposition of his specific semantic approach, the only core idea that survived in modern semantics is his logical analysis of definite descriptions.
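Over a finite model, the three clauses can be bundled into a single test. A minimal Haskell sketch (an invented illustration; on Russell's analysis the sentence is false, not truth-valueless, whenever clause 1 or clause 2 fails):

  type E = Int

  -- "The F is B" à la Russell: exactly one F, and it is B
  -- (assuming the domain lists each individual once)
  theFisB :: [E] -> (E -> Bool) -> (E -> Bool) -> Bool
  theFisB dom f b = case filter f dom of
    [x] -> b x          -- clauses 1 and 2 hold; clause 3 decides
    _   -> False        -- no F, or more than one F

  main :: IO ()
  main = print (theFisB [1 .. 10] (const False) even)
  -- False: with an empty predicate F ("present King of France"),
  -- clause 1 fails, so the whole statement is false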
5. Rudolf Carnap's theory of extension and intension: Relying on possible worlds

Since Russell's project was idiosyncratically connected with a sense-data theory, it was not acceptable to the great majority of scientists as a purely extensional project. Extensional semantics had to wait until Davidson used Tarski's theory of truth as a framework to characterize a new extensional semantics. Meanwhile it was Rudolf Carnap who introduced the logic of extensional and intensional meanings to modernize Frege's twofold distinction of semantics. The central progress was made by introducing the idea of possible worlds into logic and semantics: The actual world is constituted by a combination of states of affairs, which are constituted by objects (properties, relations etc.). If at least one state of affairs is changed we speak of a new possible world. If the world consists of basic elements which constitute states of affairs, then the possible combinations of these elements allow us to characterize all possible states of affairs. Thereby we can characterize all possible worlds, since a possible world can be characterized by a class of states of affairs that is realized in this world. Using this new instrument of possible worlds Carnap introduces a systematic theory of intension. His notion of intension is intended to substitute Frege's notion of sense and thereby account for the informational content of a sentence. His notion of extension is closely connected to Frege's notion of reference: The extension of a singular term is the object referred to by the use of the term, the extension of a predicate is the property referred to, and the extension of a complete assertive sentence is its truth value. The intension, which has to account for the informational content, is characterized as a function from possible worlds to the relevant extensions. In the case of a singular term the intension is a function from possible worlds (p.w.) to the object referred to in the relevant possible world. Along the same lines one receives the intensions of predicates (as functions from p.w. to sets or n-tuples) and of sentences (as functions from p.w. to truth values), as shown in Tab. 10.5.

Tab. 10.5: Carnap's semantics of possible worlds

                 Extension                                  Intension
singular terms   objects                                    individual concepts
predicates       sets of objects and n-tuples of objects    properties
sentences        truth values                               propositions

A principal limitation of a semantics of possible worlds is that one cannot account for so-called hyperintensional phenomena, i.e. one cannot distinguish the meanings of two sentences which are necessarily true (e.g. two different mathematical claims) because they are simply characterized by the same intension (mapping each p.w. onto the value "true").
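A minimal Haskell rendering of Tab. 10.5 (a toy model; the non-rigid behavior of the two terms is invented purely to show how intensions can differ while extensions coincide at one world):

  data World = W0 | W1

  type Intension a = World -> a   -- a function from possible worlds to extensions

  -- two singular terms: same extension in W0, different intensions
  termA, termB :: Intension String
  termA W0 = "Venus"
  termA W1 = "Mars"
  termB _  = "Venus"

  -- a sentence intension, i.e. a proposition: World -> Bool
  identity :: Intension Bool
  identity w = termA w == termB w

  main :: IO ()
  main = print (identity W0, identity W1)   -- (True,False)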
6. Willard V. O. Quine: Logic, existence and propositional attitudes

How should we relate quantifiers to our ontology? Quine (1953) is famous for his slogan "To be is to be the value of a bound variable". The quantifiers "there is (at least) an x (∃x)" and "for all x (∀x)" are the heart of modern predicate logic, which was introduced by Frege (see above). For Quine the structure of the language determines the structure of the world: If the language which is necessary to receive the best available complete description of the world contains several existential and universal quantifications, then these quantifications at the same time determine the objects, properties etc. we have to presuppose. Logic, language and ontology are essentially connected according to this view. Although Quine's specific views about connecting logic, language and world are very controversial nowadays, the core of the idea of combining quantificational and ontological claims is widely accepted.

Another problem that is essentially inspired by the development of logic is the analysis of propositional attitude ascriptions. Quine established the following standard story:

There is a certain man in a brown hat whom Ralph has glimpsed several times under questionable circumstances on which we need not enter here; suffice it to say that Ralph suspects he is a spy. Also there is a gray-haired man, vaguely known to Ralph as rather a pillar of the community, whom Ralph is not aware of having seen except once at the beach. Now Ralph does not know it, but the men are one and the same. Can we say of this man (Bernard J. Ortcutt, to give him a name) that Ralph believes him to be a spy? If so, we find ourselves accepting a conjunction of the type:

(9) w sincerely denies '...'. w believes that ...

as true, with one and the same sentence in both blanks. For, Ralph is ready enough to say, in all sincerity, 'Bernard J. Ortcutt is no spy.' If, on the other hand, with a view to disallowing situations of the type (9), we claim simultaneously that

(10) Ralph believes that the man in the brown hat is a spy.
(11) Ralph does not believe that the man seen at the beach is a spy.

then we cease to affirm any relationship between Ralph and any man at all. [...] 'believes that' becomes, in a word, referentially opaque. (Quine 1956: 179, examples renumbered)

In line with Russell, Quine starts to analyze the cognitive situation of Ralph by distinguishing two readings of the sentence

(12) Ralph believes that someone is a spy.

namely:

(13) Ralph believes [∃x(x is a spy)]
(14) ∃x(Ralph believes [x is a spy])

Quine calls (13) the notional and (14) the relational reading of the original sentence, which is at first glance parallel to the traditional distinction between the de dicto (13) and de re (14) readings. But he shows that the difference between these two readings is not sufficient to account for Ralph's epistemic situation as characterized by sentences (10) and (11). Intuitively Ralph has a de re belief in both cases, one about the man on the beach and the other about a person wearing a brown hat. The transformation into de re readings leads to:

(15) ∃x(Ralph believes [x is a spy]) (out of (10))
(16) ∃x(Ralph does not believe [x is a spy]) (out of (11))

Since both extensional quantifications are about the same object, we receive the combined sentence which explicitly attributes contradictory beliefs to Ralph:

(17) ∃x(Ralph believes [x is a spy] ∧ Ralph does not believe [x is a spy])

To avoid this unacceptable consequence, Quine suggests that the ambiguity of the belief sentences cannot be accounted for by a distinction in the scope of the quantifier (leading to de re and de dicto readings) but by a systematic ambiguity of the belief predicate: He suggests distinguishing a two-place predicate "believe₂ (subject, proposition)" and a three-place predicate "believe₃ (subject, object-of-belief, property)".

(18) believe₂ (Ralph, that the man with the brown hat is a spy)
(19) believe₃ (Ralph, the man with the brown hat, being a spy)
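The contrast between (13) and (14) can be made computational in a toy doxastic model (our own construction, not Quine's): belief is modeled as truth in all worlds compatible with Ralph's beliefs, and the two readings differ in whether the existential quantifier takes scope inside or outside the belief operator.

  data World = V1 | V2            -- Ralph's two belief worlds

  dom :: [String]
  dom = ["Ortcutt", "OtherMan"]

  spy :: World -> String -> Bool
  spy V1 x = x == "Ortcutt"       -- in one belief world Ortcutt is the spy,
  spy V2 x = x == "OtherMan"      -- in the other someone else is

  believes :: (World -> Bool) -> Bool
  believes p = all p [V1, V2]     -- true in all of Ralph's belief worlds

  notional, relational :: Bool
  notional   = believes (\w -> any (spy w) dom)           -- (13)
  relational = any (\d -> believes (\w -> spy w d)) dom   -- (14)

  main :: IO ()
  main = print (notional, relational)   -- (True,False)

In this model the notional reading is true while the relational one is false: Ralph believes that there is a spy without there being a particular individual whom he believes to be a spy.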
This distinction is the basis for Quine's further famous claim: We are not allowed to quantify into propositional attitudes (i.e. to infer (14) from (18)): if we have interpreted a sentence such that the 'believe' predicate is used intensionally (as a two-place predicate) then we cannot ignore that, and we are not allowed to change the reading of the sentence into one using a three-place predicate. We are not allowed to change from a notional reading (18) into a relational reading (19) and vice versa. This line of strategy was further improved e.g. by Kaplan (1969) and Loar (1972). It definitely made clear that we cannot always understand the belief expressed by a belief sentence simply as a relation between a subject and a proposition. Sometimes it has to be understood differently. Quine's consequence is a systematic ambiguity of the predicate "believe". This is problematic since it also leads to four-place, five-place predicates etc. (Haas-Spohn 1989: 66): for each singular term which is used in the scope of the belief ascription we have to distinguish a notional and a relational reading. Therefore Cresswell & von Stechow (1982) suggested an alternative view which only needs to presuppose a two-place predicate "believe" but changes the representation of a proposition: A proposition is not completely characterized by a set of possible worlds (those according to which the relevant state of affairs is true) but in addition by a structure of the components of the proposition. Structured propositions are the alternative to a simple possible world semantics for the analysis of propositional attitude ascriptions.

7. Necessity and direct reference: The two-dimensional semantics

The development of modal logic, essentially put forward by Saul A. Kripke, strongly influenced semantic theories. The basic intuition modal logic started with is rather straightforward: Each entity is necessarily identical with itself (and necessarily different from anything else). Kripke (1972) shows that there are sentences which express a necessary truth but nevertheless are a posteriori: "Mark Twain is identical with Samuel Clemens". Since "Samuel Clemens" is the civil name of Mark Twain, the sentence expresses a self-identity, but it is not known a priori, since knowing that both names refer to the same object is not part of standard linguistic knowledge. There are also sentences which express contingent facts but which can be known to be true a priori, e.g. "I am speaking now". If I utter the sentence it is a priori graspable that it is true, but it is not a necessary truth, since otherwise I would be a necessary speaker at this point in time (but of course I could have been silent). To account for the new distinction between the epistemic dimension of a priori/a posteriori and the metaphysical dimension of necessary/contingent, Kripke introduced the theory of direct reference of proper names and Kaplan (1979/1989) introduced the two-dimensional semantics. Since Kaplan's theory of characters is nowadays a standard framework to account for names and indexicals, we shortly introduce the core idea: We have to distinguish the utterance context, which determines the proposition expressed by uttering a sentence, and the circumstance of evaluation, which is the relevant possible world according to which this proposition is evaluated as true or false. We can illustrate this two-step approach as in Fig. 10.2:
character of a sentence
  + utterance context (1st step)
    → intension = truth-condition (= proposition)
  + circumstance of evaluation (2nd step)
    → extension = truth-value

Fig. 10.2: Two-dimensional semantics

A character of a sentence is a function from possible utterance contexts to truth-conditions (propositions), and these truth-conditions are then evaluated relative to the circumstances of evaluation. The two-dimensional semantics can be demonstrated especially well with indexicals: Let us investigate the sentence "I am a philosopher" relative to three different utterance contexts w₀, w₁ and w₂, such that in each world there is a different speaker: in w₀ Cicero is the speaker of the utterance, in w₁ Caesar and in w₂ Augustus. Furthermore, in w₀ it is the case that Cicero is a philosopher while Caesar and Augustus are not. In w₁ Caesar and Augustus are philosophers while Cicero isn't. In w₂ no one is a philosopher (poor world!). The three worlds function both as utterance contexts and as circumstances of evaluation, but of course different facts are relevant in the two different functions. The utterance contexts are represented vertically while the circumstances of evaluation are represented horizontally. Then the sentence "I am a philosopher" receives the character shown in Tab. 10.6.

Tab. 10.6: Character of "I am a philosopher"

Utterance contexts   Circumstances of evaluation   Truth conditions
                     w₀    w₁    w₂
w₀                   t     f     f                 (Cicero: being a philosopher)
w₁                   f     t     f                 (Caesar: being a philosopher)
w₂                   f     t     f                 (Augustus: being a philosopher)

Each line represents the proposition that is determined by the sentence relative to the utterance context, and this proposition receives a truth value for each circumstance of evaluation. The character of the sentence has in principle to be represented for all possible worlds, not only for the three selected above. This instrument of a "character" is available for all expressions of a natural language. Thus it is a principled improvement and enlargement of formal semantics that contains Carnap's theory of intensions as a special case.
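A minimal Haskell implementation of Tab. 10.6 (encoding only the toy scenario above): the character maps each utterance context to a proposition, i.e. to a function from circumstances of evaluation to truth values.

  data World = W0 | W1 | W2

  speaker :: World -> String               -- fixed by the utterance context
  speaker W0 = "Cicero"
  speaker W1 = "Caesar"
  speaker W2 = "Augustus"

  philosopher :: World -> String -> Bool   -- facts at the circumstance
  philosopher W0 x = x == "Cicero"
  philosopher W1 x = x == "Caesar" || x == "Augustus"
  philosopher W2 _ = False

  -- character: utterance context -> proposition (circumstance -> Bool)
  character :: World -> (World -> Bool)
  character c = \w -> philosopher w (speaker c)

  main :: IO ()
  main = mapM_ (\c -> print [character c w | w <- [W0, W1, W2]])
               [W0, W1, W2]
  -- [True,False,False], [False,True,False], [False,True,False]: Tab. 10.6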
8. Montague Semantics: Compositionality revisited

Frege claimed in his principle of compositionality that the meaning of a complex expression is a function of the meanings of its parts and their way of composition (see section 3). He regarded every complex expression as composed of a saturated part and a non-saturated part. The semantic counterparts are objects and functions. Frege transfers the syntactic notion of an expression which needs some complement to be a complete and well-formed expression to the semantic realm: Functions are regarded as incomplete and non-saturated. This leads him to the ontological view that functions are not "objects" ("Gegenstände"). This is Frege's term for any entity that can be a value of a meaning function mapping expressions to their extensions. Functions, however, can be reified as "Werthverläufe" (courses-of-values), e.g. by combining their expressions with the expression the function; but in Frege's ontology the "Werthverläufe" are distinct from the functions themselves. Successors of Frege did not follow him in this ontological aspect of his semantics. In Tarskian semantics one-place predicates are usually mapped to sets of individuals, two-place predicates to sets of pairs of individuals etc. N-place functions are regarded as special kinds of (n+1)-place relations. This approach allowed making explicit the meaning of Frege's non-saturated expressions and giving a precise compositional account of the meanings of expressions of predicate logic in the form of a recursive definition. But for every type of composition - like connecting formulae, applying a predicate to its arguments, or prefixing a quantifier to a formula - a distinct form of meaning composition was needed.

The notion of compositionality could be radically simplified by two developments: the application of type theory and lambda abstraction. Type theory goes back to Russell and was developed to avoid paradoxical notions such as sets not containing themselves, see sec. 4 above. There are a lot of versions of type theory, but in natural language semantics usually a variant is used which starts with a number of basic types, e.g. the type of objects (entities) e and the type of truth values t for an extensional language, and provides a type of functions (T₁, T₂) for every type T₁ and every type T₂, i.e. the type of functions from entities of type T₁ to entities of type T₂. Sometimes the set of types is extended to types composed of more than two types. But any such system can easily be reduced to this binary system. Predicates can now be viewed as expressions of type (e, t), i.e. as functions from objects to truth values, because they yield a true sentence if an argument expression referring to an instance of the predicate is filled in, otherwise a false sentence. That just means that we take the characteristic function of the predicate extension as its meaning. A characteristic function of a set maps its members to the truth value true, its non-members to false. In the same sense the negation operator is of type (t, t), i.e. a function from a truth value to a truth value (namely to the opposite one); binary sentence connectives are of type (t, (t, t)), i.e. functions taking the first argument and yielding a function which takes the second argument and then results in a truth value. Frege had already recognized that first-order quantifiers (like the existential quantifier something) are just second-order predicates, i.e. predicates applicable to first-order predicates. The application of an existential quantifier to a one-place first-order predicate is true if the predicate is non-empty.
Therefore the existential quantifier can be regarded as a second-order predicate which has the non-empty first-order predicates as instances; it is of type ((e, t), t). The semantics of the universal quantifier is analogous: It yields the truth value true for a first-order predicate which has the whole universe of discourse as its extension. A type problem arises with this view for predicates of arity greater than one: Their type does not fit the quantifier. The predicate love, e.g., is of type (e, (e, t)), because if we add one object as an argument we receive a one-place predicate as introduced above.
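Before turning to this type clash, a minimal Haskell sketch of the binary type system just described (an invented two-element domain; type e is modeled by a type of entities, type t by Bool, and a type (e, t) predicate by the characteristic function of its extension):

  type E = String                -- type e
  type T = Bool                  -- type t

  dom :: [E]
  dom = ["ann", "bob"]

  sleep :: E -> T                -- type (e,t)
  sleep x = x == "ann"

  neg :: T -> T                  -- negation, type (t,t)
  neg = not

  conj :: T -> (T -> T)          -- a binary connective, type (t,(t,t))
  conj = (&&)

  something, everything :: (E -> T) -> T   -- quantifiers, type ((e,t),t)
  something  p = any p dom       -- true iff p is non-empty
  everything p = all p dom       -- true iff p holds of the whole domain

  main :: IO ()
  main = print (something sleep, everything sleep, neg (everything sleep))
  -- (True,False,True)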
In the sentence

(20) Everybody loves someone.

one of the two quantifiers has to be composed with love in a compositional approach. Let us assume someone is this quantifier. Then its meaning of type ((e, t), t) has to be applied to the meaning of love, leading to a type clash, because the quantifier needs a type (e, t) as its argument while the predicate love is of a different type. Analogous problems arise for complex formulae. How can the meaning of the two-place predicate love be transformed into the type required? Type theory needs an extension by the introduction of a lambda operator, borrowed from Alonzo Church's lambda calculus, which was developed in Church (1936). The lambda operator is used to transform a type t expression into an expression of type (T₁, t), depending on the type T₁ of the variable. Let e.g. x be a variable of type e and P be of type t; then λx[P] has type (e, t), i.e. is a one-place first-order predicate. If we consider love as a two-place first-order predicate and love(x, y) as an expression of type t with two free variables, then λx[love(x, y)] is an expression of type (e, t). This is the type required by the quantifier. The variable x is bound by the lambda operator and y is free in this expression. If we now take someone as a first-order quantifier, which has type ((e, t), t), then someone(λx[love(x, y)]) is an expression of type t, again with free variable y. This can be made a predicate of type (e, t) by using the same procedure again: λy[someone(λx[love(x, y)])], which can be used as an argument of a further quantifier everybody. We receive the following new analysis of (20):

(21) everybody(λy[someone(λx[love(x, y)])])

The semantics of a lambda expression λx[P] is defined as the characteristic function which yields true for all arguments which would make P true if they were taken as assignments to x, false for the others. With this semantics we get the following logical equivalences, which we used implicitly in the above formalizations:

α-conversion

(22) λx[P] = λy[P[y/x]]

where P[y/x] is the same as P except that all occurrences of x which are free in P are changed into y. α-conversion is just a formal renaming of variables.

β-reduction

(23) λx[P](a) ≡ P[a/x]

where P[a/x] is the same as P except that all occurrences of x which are free in P are changed into a. This, however, may fail in some non-extensional contexts, if a is a non-rigid designator.
If we take

(24) Ralph believes that x is a spy.

as P, then the de re reading of (10) is λx[P](a): a is interpreted outside the non-extensional context of P and its factual denotation is taken as its extension. P[a/x], however, is the de dicto reading, because a is interpreted within the belief context. For intensional contexts these differences are treated in the intensional theory of types, see section 10 below.

η-conversion

(25) λx[P(x)] = P

where x does not occur as a free variable in P. η-conversion is needed for the conversion between atomic predicates and λ-expressions.

In this type-theoretical view, a lot of other linguistic expression types can easily be integrated. Adjectives, which are applied to nouns of type (e, t), can be seen as of type ((e, t), (e, t)); e.g. tasty is a modifier which takes predicates expressed by nouns, like apple, and yields new predicates, like tasty apple. Modifiers in general, like adverbs, prepositional phrases and relative and adverbial clauses, are regarded as of type (T₁, T₁), because they modify the meaning of their argument, but this results in an expression of the same type. This also mirrors the fact that modifiers can be applied in an iterated manner, as in red tasty apple, where red further modifies the predicate tasty apple.

The compositionality of meaning got a very strict interpretation in Montague's work. The type-theoretic semantics was accompanied by a syntactic formalism whose expression categories could be directly mapped onto semantic types, called categorial grammar, which was based on ideas by Kazimierz Ajdukiewicz in the mid-1930s and Yehoshua Bar-Hillel (1953). Complex semantic types, i.e. semantic types needing some complementation, like (T₁, T₂) for some types T₁ and T₂, have their counterpart in complex syntactic categories like S₂/S₁ and S₂\S₁, which need a complementation by S₁ to yield the syntactic category S₂. Given a syntactic category S₂/S₁, a complement of category S₁ has to be added to the right; in the case of S₂\S₁, to the left. Let e.g. N be the syntactic category of a noun and DP be the category of a determiner phrase; then DP/N is the category of the determiner in the examples above. The interaction of Montague's syntactic and semantic conception results in the requirement that the meaning of a complex expression can be recursively decomposed into function-argument pairs which are always expressed by syntactically immediately adjacent constituents. Categorial grammars describe the same class of languages as context-free grammars, and they are subject to the same problems when applied to natural languages. Although most phenomena in natural languages can in principle be represented by a context-free grammar, and therefore by a categorial grammar too, both formalisms lead to quite unintuitive descriptions when applied to a realistic fragment of natural language. Especially with regard to semantic compositionality, discontinuous constituents require a quite unintuitive multiplication of categories. The type-theoretic view of nouns and noun phrases also has consequences for the semantic concept of quantifying expressions. Determiners like all or two as parts of determiner phrases will have to be assigned a suitable type of meaning.
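Returning to (21), the composition can be replayed in Haskell, where lambda abstraction corresponds to anonymous functions (the model and the interpretation of love are invented for illustration):

  type E = String
  type T = Bool

  dom :: [E]
  dom = ["ann", "bob", "carl"]

  love :: E -> (E -> T)          -- type (e,(e,t)); love x y: "x loves y"
  love x _ = x == "ann"          -- in this model Ann loves everybody

  someone, everybody :: (E -> T) -> T   -- type ((e,t),t)
  someone   p = any p dom
  everybody p = all p dom

  -- someone love  -- would be a type error: love is not of type (e,t)

  -- (21): everybody(λy[someone(λx[love(x,y)])])
  sentence :: T
  sentence = everybody (\y -> someone (\x -> love x y))

  main :: IO ()
  main = print sentence          -- True in this model

The commented-out application someone love reproduces the type clash described above; the two lambda abstractions repair it exactly as in the derivation of (21).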
9. Generalized quantifiers

As mentioned above, Frege already regarded quantifiers as second- or higher-order predicates. But Frege himself did not take the step from this insight to the consideration of quantifiers other than the universal and the existential one. Without being aware of Frege's concept of quantifiers, the generalized view on quantifiers of Mostowski (1957) and Lindström (1966) paved the way to a generalized theory of quantifiers in linguistics around 1980. This allowed for a proper treatment of syntactically first-order quantifying expressions whose semantics is not expressible in first-order logic, e.g. most. Furthermore, noun phrases which had no mere anaphoric function could now be interpreted as a quantifier, consisting of the determiner, e.g. all, which specifies the basic quantifier semantics, and the noun, e.g. women, or a noun-like expression, which restricts the quantification. The determiner is a function of type ((e, t), ((e, t), t)) taking the restrictor as argument and resulting in a generalized quantifier, e.g. all(woman) for the noun phrase all women. This generalized quantifier is (the characteristic function of) a second-order predicate which has as its instances all (characteristic functions of) first-order predicates which are true of all women. As the determiner designates the principal function in such a noun phrase, some linguists prefer the term determiner phrase.

Generalized quantifiers can be studied with respect to their monotonicity properties. Let Q be a generalized quantifier and Q(P) be true. Then it can be the case - depending on Q's semantics - that Q(P') is always true if (A) the extension of P' is a subset of the extension of P or (B) the extension of P is a subset of the extension of P'. In the first case we call Q monotone decreasing or downward entailing, in case (B) monotone increasing or upward entailing. An example of the first quantifier type is no women, an example of the second type (at least) two women; cf. the entailment relations between the following sentences: (26a) entails (26b), and (27a) entails (27b).

(26) a. At least two women worked as extremely successful CEOs.
     b. At least two women worked as CEOs.

(27) a. No women worked as CEOs.
     b. No women worked as extremely successful CEOs.

While quantifiers that are also expressible in first-order logic, like those in the examples above, always show one of the monotonicity properties, there are other generalized quantifiers which do not. Numerical expressions providing a lower and an upper bound for a quantity, like exact numbers, are examples of non-monotonic quantifiers. If we replace at least two women with exactly two women in the examples above, the entailment relations between the sentences disappear. Monotonicity of quantifiers seems to play an important role in natural language, although not all quantifiers are monotonic themselves. But it is claimed that all simple quantifiers, i.e. one-word quantifiers or quantifiers of the form determiner + noun, are expressible as conjunctions of monotonic quantifiers. E.g. exactly three Ps is equivalent to the conjunction at least three Ps and no more than three Ps, the first conjunct (at least three Ps) being upward monotonic and the latter (no more than three Ps) being downward monotonic. There are, of course, quantifiers not bearing this property, and they are expressible in natural language, cf. an even number of Ps, but there does not seem to be any natural language which reserves a simple lexical item for such a purpose.
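A minimal Haskell sketch of determiners as two-place relations of type ((e, t), ((e, t), t)) (a toy model over numbers rather than CEOs), checking the monotonicity contrasts just described and, in the last line, the conservativity condition discussed below:

  type E = Int
  type Pred = E -> Bool

  dom :: [E]
  dom = [1 .. 10]

  count :: Pred -> Int
  count p = length (filter p dom)

  atLeast2, exactly2, noD, allD :: Pred -> Pred -> Bool
  atLeast2 r p = count (\x -> r x && p x) >= 2
  exactly2 r p = count (\x -> r x && p x) == 2
  noD      r p = count (\x -> r x && p x) == 0
  allD     r p = all (\x -> not (r x) || p x) dom

  main :: IO ()
  main = do
    let r = even                                 -- restrictor
    -- upward entailing: (> 6) is a subset of (> 2)
    print (atLeast2 r (> 6), atLeast2 r (> 2))   -- (True,True)
    -- downward entailing: (> 12) is a subset of (> 10)
    print (noD r (> 10), noD r (> 12))           -- (True,True)
    -- non-monotonic: true of the subset, false of the superset
    print (exactly2 r (> 6), exactly2 r (> 2))   -- (True,False)
    -- conservativity: D(R, P) = D(R, P ∧ R)
    print (allD r (> 2) == allD r (\x -> x > 2 && even x))  -- True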
The theory of generalized quantifiers therefore raises empirical questions about language universals.

Similar considerations on entailment conditions can be applied to the first argument of the determiner. For some determiners D it might be the case that D(R)(P) entails D(R')(P) always if (A') the extension of R' is a subset of the extension of R or (B') the extension of R is a subset of the extension of R'. In the second case the determiner is called persistent, while in the first it is called anti-persistent. Consider the entailment relations between the sentences below. Here, (28a) entails (28b), and (29a) entails (29b).

(28) a. Some extremely successful female CEOs smoke.
     b. Some female CEOs smoke.

(29) a. All female CEOs smoke.
     b. All extremely successful female CEOs smoke.

It is easy to see that some is persistent while all is anti-persistent. All combinations of monotonicity and persistence/anti-persistence are realized in natural languages. Tab. 10.7 shows some examples. Note that the quantifiers of the square of opposition are part of this scheme.

Tab. 10.7: Monotonicity and (anti-)persistence of quantifiers

                 upward monotonic          downward monotonic
anti-persistent  all, every                no, at most three
persistent       some, (at least) three    not all

And the relations of being contradictory and (sub-)contrary (see sec. 2) in the square of opposition are mirrored by negations of the whole determiner-governed sentence or of the second argument: ¬D(R, P) is contradictory to D(R, P), while D(R, ¬P) is (sub-)contrary to D(R, P) (we use ¬P as short for λx[¬P(x)]).

Analyzing quantified expressions as structures consisting of a determiner, a restrictor and a quantifier scope provides us with a relational view of quantifiers. The determiner type ((e, t), ((e, t), t)) is that of a second-order two-place relation. Besides the logical properties of the argument positions discussed above, there are a number of interesting properties regarding the relation of the two arguments. One of them is conservativity. A determiner is conservative iff it is always the case that D(R, P) = D(R, P ∧ R) (where we use P ∧ R as the conjunction of the predicates P and R, more precisely λx[P(x) ∧ R(x)]). It is quite evident that determiners in general fulfill this condition. E.g. from

(30) Most CEOs are incompetent.
follows

(31) Most CEOs are incompetent CEOs.

But are really all determiners conservative? Only is an apparent counterexample:

(32) Only CEOs are incompetent.

cannot be paraphrased as

(33) Only CEOs are incompetent CEOs.

(32) being contingent, (33) tautological. But besides this observation there are syntactic reasons to doubt the classification of only as a determiner. Other quantifying items whose conservativity is questioned are e.g. many and few. The foundation for the generalization of quantifiers was in principle laid in Montague's work, but he himself did not refer to quantifiers other than the classical existential and universal ones. The theory of generalized quantifiers was recognized in linguistics in the early 1980s, cf. Barwise & Cooper (1981) and article 43 (Keenan) Quantifiers.

10. Intensional theory of types

Montague developed his semantics as an intensional semantics, taking into account the non-extensional aspects of natural languages such as alethic-modal, temporal and deontic operators. The intensional theory of types builds on Carnap's concept of intensions as functions from possible worlds to extensions. These functions are built into the type system by adding a new functional type from possible worlds s to the other types. Type s differs from the other types insofar as there are no expressions - constants or variables - directly denoting objects of this type, i.e. no specific possible worlds besides the contextually given current one can be addressed. The difference between an extensional sentential operator like negation and an intensional one like necessarily is reflected by their respective types: The meaning of the negation operator has type (t, t) while necessarily needs ((s, t), t), because not only the truth value of an argument p in the current world has an impact on the truth value of necessarily p, but also the truth values in the alternative worlds. In Montague (1973) he does not directly define a model-theoretic mapping for expressions of English, although this should be feasible in principle, but gives a translation of English expressions into a logical language. Besides the usual ingredients of modal predicate logic, the lambda operator as well as variables and constants of the various types, he introduces the intensor ^ and the extensor ˇ, of type (T, (s, T)) and ((s, T), T) respectively, for arbitrary types T. ^a denotes the function from possible worlds to a's extensions in these worlds. In contrast, if b denotes an intension, then ˇb refers to the extension in the current world. The intensor is used if a usually extensionally interpreted expression is used in an intensional context. E.g. a unicorn and a centaur denote generalized quantifiers, say Q₁ and Q₂, which extensionally are false for any argument. This may be different for other possible worlds.
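A minimal Haskell sketch of such world-indexed meanings (a toy model; for simplicity unicorn and centaur are modeled as predicate intensions rather than generalized quantifiers, and the test function is of course not a semantics of seek, only a demonstration that a function taking intensions can separate ^Q₁ from ^Q₂ although the extensions coincide in the actual world):

  data World = Actual | Other

  type E = String
  type Intension a = World -> a

  dom :: [E]
  dom = ["Silverhoof", "Chiron"]

  unicorn, centaur :: Intension (E -> Bool)
  unicorn Actual _ = False                -- empty in the actual world
  unicorn Other  x = x == "Silverhoof"    -- but not everywhere
  centaur _      _ = False                -- empty in every world

  possiblyNonEmpty :: Intension (E -> Bool) -> Bool
  possiblyNonEmpty i = or [ i w x | w <- [Actual, Other], x <- dom ]

  main :: IO ()
  main = do
    -- extensions coincide in the actual world ...
    print (map (unicorn Actual) dom == map (centaur Actual) dom)  -- True
    -- ... but the intensions can be told apart
    print (possiblyNonEmpty unicorn, possiblyNonEmpty centaur)    -- (True,False)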
Therefore the intensions ^Q₁ and ^Q₂ may differ. This accounts for the fact that e.g. the intensional verb seek applied to ^Q₁ and ^Q₂ may yield different values, as

(34) John seeks a unicorn.

may be true while

(35) John seeks a centaur.

may be false at the same time. With the intensional extension of type logic, natural language semantics gets a powerful tool to account for the interaction of intensions and extensions in compositional semantics. The same machinery which is applicable in alethic modal logic - i.e. the logic of possibility and necessity - is transferable to other branches of intensional semantics, e.g. the semantics of tense and temporal expressions, cf. article 57 (Ogihara) Tense.

11. Dynamic logic

Natural language expressions not only refer to time-dependent situations, but their interpretation is also dependent on time-dependent contexts. A preceding sentence may introduce referents for later anaphoric expressions (cf. article 38 (Dekker) Dynamic semantics and article 89 (Zimmermann) Context dependency).

(36) A man walks in the park. He whistles.

Among the anaphoric expressions are - of course - nominal anaphora, like pronouns and definite descriptions. Antecedents are typically indefinite noun phrases. But possible antecedents can also be introduced in a less obvious way, e.g. by propositions expressing events. The event time can then be referenced by a temporal anaphor like at the same time.

(37) The CEO lighted her cigarette. At the same time the health manager came in.

The anaphor at the same time refers to the time when the event described in the first proposition happens. Anaphoric relations are not necessarily realized by overt expressions; they can be implicit, too, or they may be indicated by morphological means, e.g. by the choice of a grammatical tense. Anaphora pose a problem for compositional approaches to semantics based on predicate or type logic. (36) can be formalized by an expression headed by an existential quantifier like

(38) ∃x[man(x) ∧ w-i-t-p(x) ∧ whistle(x)]

But the man mentioned here can be referred to anywhere in the following discourse. Therefore the quantifier scope cannot be closed at any particular position in the discourse.
11. Dynamic logic

Natural language expressions not only refer to time-dependent situations; their interpretation is also dependent on time-dependent contexts. A preceding sentence may introduce referents for later anaphoric expressions (cf. article 38 (Dekker) Dynamic semantics and article 89 (Zimmermann) Context dependency).

(36) A man walks in the park. He whistles.

Among the anaphoric expressions are - of course - nominal anaphora, like pronouns and definite descriptions. Antecedents are typically indefinite noun phrases. But possible antecedents can also be introduced in a less obvious way, e.g. by propositions expressing events. The event time can then be referenced by a temporal anaphor like at the same time.

(37) The CEO lighted her cigarette. At the same time the health manager came in.

The anaphor at the same time refers to the time when the event described in the first proposition happens. Anaphoric relations are not necessarily realized by overt expressions; they can be implicit, too, or they may be indicated by morphological means, e.g. by the choice of a grammatical tense.

Anaphora pose a problem for compositional approaches to semantics based on predicate or type logic. (36) can be formalized by an expression headed by an existential quantifier like

(38) ∃x[man(x) ∧ w-i-t-p(x) ∧ whistle(x)]

But the man mentioned here can be referred to anywhere in the following discourse. Therefore the quantifier scope cannot be closed at any particular position in the discourse.

The semantics of anaphora, however, is not just a matter of the scope of existential quantifiers. Expressions like indefinite noun phrases, which usually denote an existential quantifier, can in certain contexts introduce discourse referents with a universal reading. This fact was described by Geach (1962).

(39) If a farmer owns a donkey he feeds it.

means

(40) ∀x[∀y[farmer(x) ∧ donkey(y) ∧ own(x,y) → feed(x,y)]]

This kind of anaphora is a further challenge for compositional semantics, as it has to deal with the fact that an expression which is usually interpreted existentially gets a universal reading here. The challenge is addressed by dynamic semantics. Pre-compositional versions were developed independently by Hans Kamp and Irene Heim as Discourse Representation Theory and File Change Semantics respectively (cf. article 37 (Kamp & Reyle) Discourse Representation Theory). In both approaches, structures representing the truth-conditional content of the parts of a discourse as well as the entities which are addressable anaphorically are manipulated by the meaning of discourse constituents. The answer to the question how the meaning of a discourse constituent is to be construed is simply: as a function from given discourse-representing structures to new structures of the same type, cf. Muskens (1996).

This view is made explicit in dynamic logics. This kind of logic was developed in the 1970s by David Harel and others, cf. Harel (2000), and has primarily been used for the formal interpretation of procedural programming languages.

(41) ⟨a⟩q

means that statement a possibly leads to a state where q is true, while

(42) [a]q

means that statement a necessarily leads to a state where q is true. Regarding states as possible worlds, we arrive at a modal logic with as many modalities as there are (equivalence classes of) statements a - for a recursive language usually infinitely many. For many purposes the consideration can be constrained to such modalities where from each state exactly one successor state is accessible. For such modalities with a functional accessibility relation, the weak operator (⟨...⟩) and the strong operator ([...]) collapse semantically into one operator.
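The collapse of the weak and the strong operator under a functional accessibility relation can be checked on a toy transition system (a hypothetical sketch; the states, the relation and the valuation are invented for illustration):

    # A transition relation for statement a, and a valuation for q.
    trans = {("s1", "s2"), ("s2", "s3"), ("s3", "s3")}
    q_true = {"s2"}

    def successors(s):
        return [t for (u, t) in trans if u == s]

    def diamond(s):   # <a>q: statement a possibly leads to a q-state
        return any(t in q_true for t in successors(s))

    def box(s):       # [a]q: statement a necessarily leads to a q-state
        return all(t in q_true for t in successors(s))

    # Every state here has exactly one a-successor, i.e. the accessibility
    # relation is functional, so the weak and the strong operator coincide:
    print(all(diamond(s) == box(s) for s in ["s1", "s2", "s3"]))   # True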
If we further agree that

(43) s1 ⟦[a]⟧ s2

can be rewritten as

(44) s1[a]s2

then this notation has the intuitive reading that state s1 is mapped by the meaning of a into s2. A simplistic application is the following: assume that s1 is a characterization of the knowledge state of a recipient before receiving the information given by assertion a. Then s2 is a characterization of the knowledge state of the recipient after being informed. If we identify knowledge states with those sets of possible worlds which are consistent with the current knowledge, and if we consider s1 and s2 as descriptions of sets W1 and W2 of possible worlds, then they differ in exactly this respect: W2 is the intersection of W1 and the set Wa of possible worlds in which a is true, i.e. W2 = W1 ∩ Wa. The notion of an informative utterance a can then be defined by the condition that W2 ≠ W1. And in order for a to be consistent with the previous context, it must hold that W2 ≠ ∅.

The treatment of discourse states or contexts in dynamic logics is not limited to truth-conditionally characterizable knowledge. In principle, any kind of linguistic context parameters can be part of the states, among these the anaphorically accessible antecedents of a sentence. In their Dynamic Predicate Logic, as developed in Groenendijk & Stokhof (1991), Groenendijk and Stokhof model the anaphoric phenomena accounted for in Kamp's Discourse Representation Theory and Heim's File Change Semantics in a fully compositional fashion. This is mainly achieved by a dynamic interpretation of the existential quantifier:

(45) ∃x[P(x)]

is semantically characterized by the usual truth conditions but has the additional effect that free occurrences of the variable x have to be kept assigned to the same object in subsequent expressions which are connected appropriately. The dynamic effect is limited by the scopes of certain operators like the universal quantifier, negation, disjunction, and implication. So x is bound to the same object outside the syntactic scope of the existential quantifier, as in

(46) ∃x[man(x) ∧ w-i-t-p(x)] ∧ whistle(x)

although the syntactic scope of the existential quantifier ends after w-i-t-p(x). This proposition is true only if there is an object x which verifies all three predicates man, w-i-t-p, and whistle. With this dynamic dimension, the sentences (36) and (39) can be formalized in Dynamic Predicate Logic as

(47) ∃x[man(x) ∧ w-i-t-p(x)] ∧ whistle(x)

and

(48) ∃x[farmer(x) ∧ ∃y[donkey(y) ∧ own(x, y)]] → feed(x, y)

respectively.
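The dynamic binding in (46)/(47) can be made concrete by implementing formulas as relations on assignments (a schematic sketch in the spirit of Groenendijk & Stokhof 1991, with an invented two-individual model; it is not their official definition, which also covers negation, implication and the universal quantifier):

    # A tiny model; "witp" abbreviates walks-in-the-park.
    D = {"john", "mary"}
    ext = {"man": {"john"}, "witp": {"john"}, "whistle": {"john"}}

    # A DPL formula maps an input assignment to a set of output assignments.
    def pred(p, var):
        return lambda g: [g] if g.get(var) in ext[p] else []

    def exists(var):             # random assignment: the dynamic quantifier
        return lambda g: [{**g, var: d} for d in D]

    def conj(phi, psi):          # dynamic conjunction: relational composition
        return lambda g: [h for f in phi(g) for h in psi(f)]

    # (47): the last conjunct lies outside the brackets, yet x is still
    # bound, because the assignment to x survives the bracket boundary.
    f47 = conj(conj(exists("x"), conj(pred("man", "x"), pred("witp", "x"))),
               pred("whistle", "x"))
    print(f47({}))               # [{'x': 'john'}]: true, with x anchored to john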
It can easily be seen how the usual meanings of the discourse sentences

(49) A man walks in the park.

and

(50) A farmer has a donkey.

enter into the composed meaning of the discourse without any changes. In order to get the intended truth conditions for implications, it is required as a truth condition that the second clause can be verified for any assignment to x verifying the first clause.

Putting together the filtering effect of propositions on possible worlds and the modifying effect on assignment functions, we can consider propositions in Dynamic Predicate Logic as functions on sets of world-assignment pairs. The empty context can be characterized by the Cartesian product of the set of possible worlds and the set of assignments. Each proposition of a discourse filters out certain world-assignment pairs. In some respects Dynamic Predicate Logic deviates from standard dynamic logic approaches, as Groenendijk & Stokhof (1991, sec. 4.3) point out, but it can still be seen as a special case of a logic in this framework.

The dynamic view of semantics can be used to model other contextual dependencies than just anaphora. Groenendijk (1999) shows another application in the Logic of Interrogation. Questions add felicity conditions for a subsequent answer to the discourse context. To a great extent, these conditions can be characterized semantically. In the Logic of Interrogation the effect of a question is understood as a partitioning of the current set of possible worlds. Each cell of the partition stands for some alternative exhaustive answer; e.g. a yes-no question partitions the set of possible worlds into one subset consistent with the positive answer and a complementary subset compatible with the negative answer. Let us e.g. take the question (51).

(51) Does a man walk in the park?

According to the formalism of Groenendijk (1999), (51) can be formalized as (52).

(52) ?∃x[man(x) ∧ w-i-t-p(x)]

(52) partitions the set of possible worlds into two subsets W+ and W−, such that (53) is true for every world in W+ and false for every world in its complement subset W−. An appropriate answer selects exactly one of these subsets.

(53) ∃x[man(x) ∧ w-i-t-p(x)]

Wh-questions like (54) partition the set of possible worlds into more cells than yes-no questions.

(54) Who walks in the park?

Each cell corresponds to an exhaustive answer, which provides the full information who walks in the park and who does not. An assertion answers the question partially if it eliminates at least one cell. If it furthermore eliminates all cells but one, it answers the question exhaustively.
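The partition induced by a question can be computed on a finite set of worlds (a hypothetical illustration of the underlying idea, not Groenendijk's formalism; for simplicity, every individual in the model counts as a man):

    # Worlds are identified with the set of individuals who walk in the park.
    worlds = [frozenset(), frozenset({"john"}), frozenset({"mary"}),
              frozenset({"john", "mary"})]

    def partition(worlds, answer):
        blocks = {}
        for w in worlds:
            blocks.setdefault(answer(w), []).append(w)
        return list(blocks.values())

    # (51) groups worlds by the polar answer: the two cells W+ and W-.
    polar = partition(worlds, lambda w: len(w) > 0)
    # (54) groups worlds by the full extension of the walkers:
    # one cell per exhaustive answer.
    wh = partition(worlds, lambda w: w)

    print(len(polar), len(wh))   # 2 4

A complete answer to (54) settles (51) as well, since the wh-partition refines the polar one; this is the semantic sense in which wh-questions ask for more information than the corresponding yes-no questions.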
This chapter has given a short overview of how logic has provided the tools for treating linguistic phenomena. Each logical system has its characteristic strengths and its limits. The limits of a logical system sometimes inspired the development of new formal tools (e.g. the step from Aristotelian syllogistic to modern predicate logic), which in turn inspired a new semantics. Sometimes a change in the focus on linguistic phenomena inspired a systematic search for new logical tools (e.g. modal logic) or a reinterpretation of already available logical tools (e.g. dynamic logic). We hope to have illustrated the main developments of the bi-directional influences between logical systems and semantics.

12. References

Abaelardus, Petrus 1956. Dialectica. Ed. L.M. de Rijk. Assen: van Gorcum.
Almog, Joseph, John Perry & Howard Wettstein (eds.) 1989. Themes from Kaplan. Oxford: Oxford University Press.
Aristoteles 1992. Analytica priora. Die Lehre vom Schluß oder erste Analytik. Ed. E. Rolfes. Hamburg: Meiner.
Bar-Hillel, Yehoshua 1953. A quasi-arithmetical notation for syntactic description. Language 29, 47-58.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159-219.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Church, Alonzo 1936. An unsolvable problem of elementary number theory. American Journal of Mathematics 58, 345-363.
Cresswell, Maxwell & Arnim von Stechow 1982. De re belief generalized. Linguistics & Philosophy 5, 503-535.
Davidson, Donald 1967. Truth and meaning. Synthese 17, 304-323.
Frege, Gottlob 1966. Grundgesetze der Arithmetik. Hildesheim: Olms.
Frege, Gottlob 1977. Begriffsschrift und andere Aufsätze. Ed. I. Angelelli. Darmstadt: Wissenschaftliche Buchgesellschaft.
Frege, Gottlob 1988. Die Grundlagen der Arithmetik. Eine logisch-mathematische Untersuchung über den Begriff der Zahl. Ed. Chr. Thiel. Hamburg: Meiner.
Frege, Gottlob 1994. Funktion, Begriff, Bedeutung. Fünf logische Studien. Göttingen: Vandenhoeck & Ruprecht.
Geach, Peter 1962. Reference and Generality: An Examination of Some Medieval and Modern Theories. Ithaca, NY: Cornell University Press.
Gödel, Kurt 1930. Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik 37, 349-360.
Gödel, Kurt 1931. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme. Monatshefte für Mathematik und Physik 38, 173-198.
Groenendijk, Jeroen 1999. The Logic of Interrogation. Research report. Amsterdam: ILLC.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy 14, 39-100.
Haas-Spohn, Ulrike 1989. Zur Interpretation der Einstellungszuschreibungen. In: E. Falkenberg (ed.). Wissen, Wahrnehmen, Glauben. Epistemische Ausdrücke und propositionale Einstellungen. Tübingen: Niemeyer, 50-94.
Harel, David 2000. Dynamic logic. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. II: Extensions of Classical Logic, chap. II.10. Dordrecht: Reidel, 497-604.
Hodges, Wilfrid 1991. Logic. London: Penguin.
Kaplan, David 1969. Quantifying in. In: D. Davidson & J. Hintikka (eds.). Words and Objections. Essays on the Work of W.V.O. Quine. Dordrecht: Reidel, 206-242.
Kaplan, David 1979. On the logic of demonstratives. Journal of Philosophical Logic 8, 81-98.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from Kaplan. Oxford: Oxford University Press, 481-563.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253-355 and 763-769.
von Kutschera, Franz 1989. Gottlob Frege. Eine Einführung in sein Werk. Berlin: de Gruyter.
Leibniz, Gottfried Wilhelm 1992. Schriften zur Logik und zur philosophischen Grundlegung von Mathematik und Naturwissenschaft. Ed. H. Herring. Darmstadt: Wissenschaftliche Buchgesellschaft.
Lenzen, Wolfgang 1990. Das System der Leibnizschen Logik. Berlin: de Gruyter.
Lenzen, Wolfgang 2000. Guilielmi Pacidii Non plus ultra oder: Eine Rekonstruktion des Leibnizschen Plus-Minus-Kalküls. Philosophiegeschichte und logische Analyse 3, 71-118.
Lindström, Per 1966. First order predicate logic with generalized quantifiers. Theoria 32, 186-195.
Loar, Brian 1972. Reference and propositional attitudes. The Philosophical Review 81, 43-62.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221-242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 247-270.
Mostowski, Andrzej 1957. On a generalization of quantifiers. Fundamenta Mathematicae 44, 12-36.
Muskens, Reinhard 1996. Combining Montague Semantics and Discourse Representation. Linguistics & Philosophy 19, 143-186.
Nortmann, Ulrich 1996. Modale Syllogismen, mögliche Welten, Essentialismus. Eine Analyse der aristotelischen Modallogik. Berlin: de Gruyter.
Quine, Willard van Orman 1953. From a Logical Point of View. Cambridge, MA: Harvard University Press.
Quine, Willard van Orman 1956. Quantifiers and propositional attitudes. The Journal of Philosophy 53, 177-187.
Russell, Bertrand 1903. The Principles of Mathematics. Cambridge: Cambridge University Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479-493.
Russell, Bertrand 1908. Mathematical logic as based on the theory of types. American Journal of Mathematics 30, 222-262.
Russell, Bertrand 1910. Knowledge by acquaintance and knowledge by description. Proceedings of the Aristotelian Society 11, 108-128.
Russell, Bertrand & Alfred N. Whitehead 1910-1913. Principia Mathematica. Cambridge: Cambridge University Press.
Tarski, Alfred 1935. Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1, 261-405.
Zalta, Edward 2000. A (Leibnizian) theory of concepts. Philosophiegeschichte und logische Analyse 3, 137-183.

Albert Newen, Bochum (Germany)
Bernhard Schröder, Essen (Germany)
11. Formal semantics and representationalism

1. Logic, formal languages and the grounding of linguistic methodologies
2. Natural languages as formal languages: Formal semantics
3. The challenge of context-dependence
4. Dynamic Semantics
5. The shift towards proof-theoretic perspectives
6. Ellipsis: A window on context
7. Summary
8. References

Abstract

This paper shows how formal semantics emerged shortly after the explosion of interest in formally characterising natural language in the fifties, swiftly replacing all advocacy of semantic representations within explanations of natural-language meaning. It then charts how advocacy of such representations has progressively re-emerged in formal semantic characterisations through the need to model the systemic dependency on context of natural language construal. First, the logic concepts on which subsequent debates depend are introduced, as is the formal-semantics (model-theoretic) framework in which meaning in natural language is defined as a reflection of a direct language-world correspondence. The problem of context dependence, which has been the primary motivation for introducing semantic representations, is then set out, with sketched accounts of pronoun construal relative to context. It is also shown how, in addition to such arguments, proof-theoretic (hence structural) concepts have been increasingly used in semantic modelling of natural languages. Finally, ellipsis is introduced as a novel window on context, providing additional evidence for the need to advocate representations in semantic explanation. The paper concludes with reflections on how the goal of modelling the incremental dynamics of natural language interpretation forges a much closer link between competence and performance models than has hitherto been envisaged.

1. Logic, formal languages, and the grounding of linguistic methodologies

The meaning of natural-language (NL) expressions, of necessity, is wholly invisible; and one of the few supposedly reliable ways to establish the interpretation of expressions is through patterns of inference. Inference is the relation between two pieces of information such that one can be wholly inferred from the other (if for example I assert that I discussed my analysis with at least two other linguists, then I imply that I have an analysis, that I have discussed it with more than one linguist, and that I am myself a linguist). Inference alone is not however sufficient to pinpoint the interpretation potential of NL expressions. Essential to NL interpretation is the pervasive dependence of expressions on context for how they are to be understood, a problem which has been a driving force in the development of formal semantics. This article surveys the emergence of formal semantics following growth of interest in the formal characterisation of NL grammars (Chomsky
1955, Lambek 1958). It introduces logic concepts on which subsequent debates depend and the formal-semantics framework (the formal articulation of what has been called truth-conditional semantics), in which NL is claimed to be a logic. It then sets out the problem of context dependence with its apparent need to add a level of semantic representation over and above whatever is needed for syntax, and shows how semanticists have increasingly turned to tools of proof theory (the syntactic mechanisms for defining inference in logic) to project compositionality of content in natural language. Ellipsis data are introduced as an additional basis for evaluating the status of representations specific to characterising NL construal; and the paper concludes with reflections on how modelling the incremental dynamics of the step-by-step way in which information is built up in discourse forges a much closer link between competence and performance models than has hitherto been envisaged.

During the sixties, with the emergence of the new Chomskian framework (Chomsky 1965), inquiry into the status of NL semantics within the grammar of a language was inevitable. There were two independent developments: the articulation of semantics as part of the broadly Chomskian philosophy (Katz & Fodor 1963, Katz 1972), and Montague's extension of formal-language semantic tools to NL (Thomason (ed.) 1974). The point of departure for both the Chomskian and formal semantics paradigms was the inspiration provided by the formal languages of logic, though, as we shall see, they make rather different use of this background.

Logics are defined for the formal study of inference irrespective of subject matter, with individual formal languages defined to reflect specific forms of reasoning: modal logic to reflect modal reasoning, temporal logic to reflect temporal reasoning, etc. Predicate logic, as its name implies, is defined to reflect forms of reasoning that turn on subsentential structure, involving quantification, names, and predicates: it is the logic arguably closest to natural languages. In predicate logic, the grammar defines a system for inducing an infinite set of propositional formulae with internal predicate-argument structure over which semantic operations can be defined to yield a compositional account of meaning for these formulae. Syntactic rules involve mappings from (sub)formulae to (sub)formulae making essential reference to structural properties; semantic rules assign interpretations to elementary parts of such formulae and then compute interpretations by mapping interpretations onto interpretations from bottom to top ('bottom-up') as dictated by the structures syntactically defined. With co-articulation of syntax and semantics, inference, as necessary truth-dependence, is then defined syntactically and semantically, the former making reference solely to properties of structure of the formulae in question, the latter solely to truth-values assigned to such formulae (relative to some so-called model). The syntactic characterisation of inference is defined by rules which map one propositional structure into another, the interaction of this small set of rules (the proof rules) predicting all and only the infinite set of valid inferences.

We can use this pattern to define what we mean by representationalism, as follows. A representationalist account is one that involves essential attribution of structure in the characterisation of the phenomenon under investigation.
Any account of natural language which invokes syntactic structure of natural language strings is providing a representationalist account of language. More controversially, representationalist accounts of meaning are those in which the articulation of structure is an integral part of the account of NL interpretation in addition to whatever characterisation is provided of syntactic properties of sentence-strings.
1.1. The Chomskian methodology

In the Chomskian development of a linguistic philosophy for NL grammar, it was the methodology of grammar-writing for these familiar logics which was adapted to the NL case. By analogy, NL grammars were defined as a small number of rules inducing an infinite set of strings, success in characterising a language residing in whether all and only the wellformed sentences of the language are characterised by the given rule set. Grammars were to be evaluated not by data of language use or corpus analysis, but by whether the grammar induces the set of strings judged by a speaker to be grammatical. In this, Chomsky was universally followed: linguists generally agreed that there should be no grounding of grammars directly in evidence from what is involved in producing or parsing a linguistic string. Models of language were logically prior to consideration of performance factors; so data relevant to grammar construction had to be intuitions of grammaticality as made by individuals with capacity in the language. This commitment to complete separation of competence-based grammars from all performance considerations has been the underpinning of almost all linguistic theorising since then, though, as we shall see, this assumption is being called into question.

Following the Chomskian methodology, Katz and colleagues set out analogous criteria of adequacy for semantic theories of NL: that they should predict relations between word meaning and sentence meaning as judged by speakers of the language; synonymy for all expressions having the same meaning; entailment (equivalently, inference) for all clausal expressions displaying a (possibly asymmetric) dependence of meaning; ambiguity for expressions with more than one interpretation. The goal was to devise rule specifications that yield these results, with candidate theories evaluated solely by relative success in yielding the requisite set of semantic relations/properties. Much of the focus was on exploring appropriate semantic representations in some internalised language of thought to assign to words to secure a basis for predicting such entailment relations as John killed Bill / Bill died, or John is a bachelor / John is an unmarried man (Katz 1972, Fodor 1981, 1983, 1998, Pustejovsky 1995, Jackendoff 2002). There was no detailed mapping defined from such constructs onto the objects/events which the natural language expression might be presumed to depict.

1.2. Language as logic: The formal-semantic methodology

It was Montague and the program of formal semantics that defined a truth-theoretic grounding for natural language interpretation. Montague took as his point of departure both the methodology and formal tools of logic (cf. article 10 (Newen & Schröder) Logic and semantics). He argued that by extending the syntactic and semantic systems of modal/temporal predicate logic with techniques defined in the lambda calculus, the extra flexibility of natural language could be directly captured, with each individual natural language defined to be a formal language no different in kind from a suitably enriched variant of predicate logic (see Montague 1970, reprinted in Thomason (ed.) 1974).
In predicate logic, inference is definable from the strings of the language, with the invocation of syntax as an independent level of representation essentially eliminable, being no more than a way of describing semantic combinatorics; and, in applying this concept to natural languages, many formal semanticists adopt similar assumptions (in particular categorial grammar: Morrill 1994). Their theoretical assumptions are thus unlike Chomskian NL
grammars in which syntax, hence representations of structure, is central. Since these two paradigms made such distinct use of these formal languages in grounding their NL grammars, an essential background to appreciating the debate is a grasp of predicate logic as a language.

1.3. Predicate logic: Syntax, semantics and proof theory

The remit of predicate logic is to express inference relations that make essential reference to sub-propositional elements such as quantifiers and names, extending propositional logic with its connectives ∧ ('and'), → ('if-then', the conditional connective), ∨ ('or'), ¬ ('not'). The language has a lexicon, syntax and semantics (Gamut 1991). There is a finite stock of primitive expressions, and a small set of operators licensed by the grammar: the propositional connectives and the quantifiers ∀ (universal) and ∃ (existential). Syntactic rules define the properties of these operators, mapping primitive expressions onto progressively more complex expressions; and for each such step, there is a corresponding semantic rule, so that meanings of individual expressions can be defined as recursively combining to yield a formal specification of necessary and sufficient truth-conditions for the propositional formulae in which they are contained. There are a number of equivalent ways of defining such semantics. In the Montague system, the semantics is defined with respect to a model defined as (i) a set of stipulated individuals as the domain of discourse, (ii) an appropriate assignment of a denotation (equivalently, extension) for the primitive expressions from that set: individuals from the domain of discourse for names, sets of individuals for one-place predicate expressions, and so on. Semantic rules map these assignments onto denotations for composite expressions. Such denotations are based exclusively on the assignments given to their parts and their mode of combination as defined by the syntax, yielding a truth value (True, False) with respect to the model for each propositional formula. There is a restriction on the remit of such semantics. Logics are defined to provide the formal vehicle over which inference independent of subject matter can be defined. Accordingly, model-theoretic semantics for the expressions of predicate logic takes the denotation of terminal expressions as a primitive, hence without explanation: all it provides is a formal way of expressing compositionality for the language, given an assumption of a stipulated language-denotation relation for the elementary expressions.

With such semantics, relationships of inference between propositional formulae are definable. There are two co-extensive characterisations of inference. One, more familiar to linguists, is a semantic characterisation (entailment). A proposition φ entails a distinct proposition ψ if in all models in which φ is true, ψ is true (explicitly a characterisation in terms of truth-dependence). Synonymy, or equivalence, ≡, is when this relation is two-way. The other characterisation of inference is syntactic, i.e. proof-theoretic, defined as the deducibility of one propositional formula from another using proof rules: all such derivations (proofs) involve individual steps that apply strictly in virtue of structural properties of the formulae.
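To make the model-theoretic side of this picture concrete, here is a minimal sketch (ours, with an invented three-element domain) of a model and a bottom-up evaluation of a quantified formula over it:

    # A model: a domain of discourse plus denotations for primitive expressions.
    domain = {"a", "b", "c"}
    F = {"a", "b"}           # a one-place predicate denotes a set of individuals
    G = {"a", "b", "c"}      # another one-place predicate

    # Semantic clauses, computed bottom-up over the syntactic structure:
    def implies(p, q):
        return (not p) or q

    def forall(p):
        return all(p(d) for d in domain)

    # The premise formula of the proof discussed below, ∀x(F(x) → G(x)):
    print(forall(lambda d: implies(d in F, d in G)))   # True, since F ⊆ G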
Bringing syntax and semantics together, a logic is sound if the proof rules derive only inferences that are true in the intended models, and complete if it can derive all the true statements according to the models. Thus, in logic, syntax and semantics are necessarily in a tight systematic relation. These proof rules constitute some minimal set, and it is the interaction between them which determines all and only the correct inferences expressible in the language. In natural deduction systems (Fitch 1951, Prawitz 1965), defined to reflect individual local steps
in any such derivation, each operator has an associated introduction and elimination rule. Elimination rules map complex formulae onto simpler formulae; introduction rules map simpler formulae onto a more complex formula. For example, there is Conditional Elimination (Modus Ponendo Ponens), which given premises of the form φ and φ → ψ licenses the deduction of ψ; there is Conditional Introduction, which, conversely, from the demonstration of a proof of ψ on the basis of some assumption φ enables the assumption of φ to be removed, and a weaker conclusion φ → ψ to be derived. Of the predicate-logic rules, Universal Elimination licenses the inference from ∀xF(x) to F(a), simplifying the formula by removing the quantifying operator and replacing its variable with a constructed arbitrary name. Universal Introduction enables the universal quantifier to be re-introduced into a formula, replacing a corresponding formula containing such a name, subject to certain restrictions. In rather different spirit, Existential Elimination involves a move from ∃xF(x) by assumption to F(a) (a an arbitrary name) to derive some conclusion φ that crucially does not depend on any properties associated with the particular name a. A sample proof, with annotations as metalevel commentary detailing the rules used, illustrates a characteristic proof pattern. Early steps of the proof involve eliminating the quantificational operators and the structure they impose, revealing the propositional structure simpliciter with names in place of variables; central steps of inference (here just one) involve rules of propositional calculus; late steps of the proof re-introduce the quantificational structure with suitable quantifier-variable binding. (⊢ is the proof sign for valid inference.)

∀x(F(x) → G(x)), ∀xF(x) ⊢ ∀xG(x)

1. ∀x(F(x) → G(x))    Assumption
2. ∀xF(x)             Assumption
3. F(a) → G(a)        Universal-Elim, 1
4. F(a)               Universal-Elim, 2
5. G(a)               Modus Ponens, 3, 4
6. ∀xG(x)             Universal-Intro, 5

Fig. 11.1: Sample proof by universal elimination/introduction

These rules provide a full characterisation of inference, in that their interaction yields all and only the requisite inference relations as valid proofs of the system, with semantic definitions grounding the syntactic rules appropriately. In sum, inference in logical systems is characterisable both semantically, in terms of denotations, and syntactically, in proof-theoretic terms; and, in these defined languages, the semantic and syntactic characterisations are co-extensive, defined to yield the same results.

1.4. Predicate logic and natural language

Bringing back into the picture the relation between such formal languages and natural languages, what first strikes a student of NL is that predicate logic is really rather UNlike natural languages. In NL, quantifying expressions occur in just the same places as other noun phrases, and not, as in predicate logic, in some position adjoined to fully defined sentential units (see ∀x(F(x) → G(x)) in Fig. 11.1). Nonetheless, there is parallelism between predicate logic and NL structure in the mechanisms defined for such quantified
formulae. The arbitrary names involved in proofs for quantified formulae display a pattern similar to NL quantifying expressions. In some sense then, NL quantifying expressions can be seen as closer to the constructs used to model the dynamics of inferential action than to the predicate-logic language itself. Confirming this, there is tantalising parallelism between NL quantifiers and terms of the epsilon calculus (Hilbert & Bernays 1939), the logic defining the properties of these arbitrary names. In this logic, the epsilon term that corresponds to an arbitrary name carries a record of the mode of combination of the propositional formula within which it occurs:

(1) ∃xF(x) ≡ F(εxF(x))

The formula on the right hand side of the equivalence sign is a predicate-argument sequence, and within the argument of this sequence there is a required second token of the predicate F as the restrictor for that argument term (ε is the variable-binding term operator that is the analogue of the existential quantifier). The effect is that the term itself replicates the content of the overall formula. As we shall see, this internal complexity of epsilon terms corresponds directly to so-called E-type pronouns (Geach 1972, Evans 1980), in which the pronoun appears to reconstruct the whole of some previous propositional formula, despite only coreferring with an indefinite term. In (2), that is, it is the woman that was sitting on the steps that the pronoun she is used to refer to:

(2) A woman was sitting on the steps. She was sobbing.

So there is interest in exploring links between the construal of NL quantifiers and epsilon terms (see von Heusinger 1997, Kempson, Meyer-Viol & Gabbay 2001, von Heusinger & Kempson (eds.) 2004). There is also, more generally, as we shall see, growth of interest in exploring links between NL interpretation processes and proof-theoretic characterisations of inference.
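As a worked rendering of how (1) applies to (2) (our illustration; the predicate names are chosen for the example), the first sentence licenses an epsilon term which the E-type pronoun can then pick up:

\[
\exists x\,(\mathit{woman}(x) \wedge \mathit{sitting}(x)) \;\equiv\; \mathit{woman}(t) \wedge \mathit{sitting}(t),
\qquad \text{where } t = \varepsilon x\,(\mathit{woman}(x) \wedge \mathit{sitting}(x))
\]

The pronoun she in the second sentence of (2) is then construed as this term t, so that the second sentence is rendered \(\mathit{sobbing}(t)\): the term itself carries the information that its witness is a woman who was sitting on the steps.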
In the meantime, what should be remembered is that predicate logic, with its associated syntax-semantics correspondence, is taken as the starting point for all formal-semantic modelling of a truth-conditional semantics for NL; and this leads to very different assumptions held in the conflicting Chomskian and formal-semantic paradigms about the status of representations within the overall NL grammar. For Chomskians, with the parallelism with logic largely residing in the methodology of defining a grammar yielding predictions that are consistent and complete for the phenomenon being modelled, the ontology for NL grammars is essentially representationalist. The grammar is said to comprise rules, an encapsulated body of knowledge acquired by a child (in part innate, and encapsulated from other devices controlled by the cognitive system). Whether or not semantics might depend on some additional system of representations is seen as an empirical matter, and not of great import. To the contrary, in the Montague paradigm, no claims about the relation between language and the mind are made, and in particular there is no invocation of any mind-internal language of thought. Language is seen as an observable system of patterns similar in kind to formal languages of logic (cf. article 10 (Newen & Schröder) Logic and semantics). From this perspective, syntax is only a vehicle over which semantic (model-theoretic) rules project interpretations for NL strings. If natural languages are indeed to be seen as formal languages, as claimed, the syntax will do no more in the grammar than yield the appropriate pairing of phonological sequences and denotational contents, so is not the core defining property of a grammar: it is rather the pairing of NL strings and truth-conditionally defined content that is its core. This is the stance of categorial grammar: Lambek (1958), Morrill (1994). Not all grammars incorporating formal semantic insights are this stringent: Montague's defined grammar for English as a formal language had low-level rules ensuring appropriate morphological forms of strings licensed by the grammar rules. But even though formal semanticists might grant that structural properties of language require an independent system of structure-inducing rules said to constitute NL syntax, the positing of an additional mentalistic level of representation internal to the projection of denotational content for the structured strings of the language is debarred in principle. The core formal-semantics claim is that interpretation for NL strings is definable over the interpretation of the terminal elements of the language and their mode of combination as dictated by the syntax and nothing else: this is the compositionality of meaning principle. Positing any level of representation intermediate between the system projecting syntactic structure of strings and the projection of content for those strings is tantamount to abandoning this claim. The move to postulate a level of semantic representation as part of some supposed semantic component of NL grammar is thus hotly contested. In short, though the separation of competence from performance considerations as a methodological assumption is shared by all, there is not a great deal else for advocates of the Chomskian and Montague paradigms to agree about.

2. Natural languages as formal languages: Formal semantics

The early advocacy of semantic representations as put forward by Katz and colleagues (cf. Katz & Fodor 1963, Katz 1972) was unsuccessful (see the devastating critique of Lewis 1970); and from then on, research in NL semantics has been largely driven by the Montague program. Inevitably, not all linguists followed this direction. Those concerned with lexical specifications tended to resist formal-semantic assumptions, retaining articulations of representationalist forms of analysis despite the lack of formal-semantic underpinnings, relying on linguistic, computational, or psycho-linguistic forms of justification (Pustejovsky 1995, Fodor 1998, Jackendoff 2002). However, there were in any case what were taken at the time to be good additional reasons for not seeking to develop a representational alternative to the model-theoretic program following the pattern of the syntactic characterisation of inference for predicate logic provided by the rules of proof (though see Hintikka 1974). For any one semantic characterisation, there are a large number of alternative proof systems for predicate and propositional calculus, with no possible means of choosing between them, the only unifying factor being their shared semantic characterisation. Hence it would seem that, if a single choice for explanation has to be made, it has to be a semantic one. Moreover, because of the notorious problems in characterising generalized quantifiers such as most (Barwise & Cooper 1981), it is only model-theoretic characterisations of NL content that have any realistic chance of adequate coverage. In such debates, the putative relevance of processing considerations was not even envisaged.
Yet amongst many variant proof-theoretic methods for predicate-logic proof systems, Fitch-style natural deduction is invariably cited as the closest to the observable procedural nature of natural language reasoning (Fitch 1951, Prawitz 1965). If consideration of external factors such as psycholinguistic plausibility had been taken
as a legitimate criterion for determining selection between alternative candidate proof systems, this scepticism about the feasibility of selecting from amongst various proof-theoretic methodologies to construct a representationalist (proof-theoretic) model of NL inference might not have been so widespread. However, the inclusion of performance-related considerations was, and largely still is, deemed to be illegitimate; and a broad range of subsequently established empirical results appear to confirm the decision to retain the NL-as-formal-language methodology. A corollary has been that vocabularies for syntactic and semantic generalisations had to be disjoint, with only the syntactic component of the grammar involving representations, semantics notably involving no more than a bottom-up characterisation of denotational content defined model-theoretically over the structures determined by the syntax, hence with a non-representationalist account of NL semantics. Nevertheless, as we shall see, representationalist assumptions within semantics have been progressively re-emerging as these formal-semantic assumptions have led to ever increasing postulations of unwarranted ambiguity, suggesting that something is amiss in the assumption that the semantics of NL expressions is directly encapsulated in their assigned denotational content.

2.1. The Lambda calculus and NL semantics

There was one critical tool which Montague utilised to substantiate the claim that natural languages can be treated as having denotational semantics read off their syntactic structure: the lambda calculus (cf. also article 33 (Zimmermann) Model-theoretic semantics). The lambda calculus is a formal language with a function operator λ which binds variables in some open formula to yield an expression that denotes a function from the type of the variable onto the type of the formula. For example F(x), an open predicate-logic formula, can be used as the basis for constructing the expression λx[F(x)], where the lambda-term is identical in content to the one-place predicate expression F (square brackets for lambda-binding visually distinguish lambda binding and quantifier binding). Thus the formula λx[F(x)] makes explicit the functional nature of the predicate term F, as does its logical type (e, t) (equivalently e → t): any such expression is a predicate that explicitly encodes its denotational type, mapping individual-denoting expressions onto propositional formulae. All that is needed to be able to define truth conditions over a syntactically motivated structure is to take the predicate-logic analogue for any NL quantification-containing sentence, and define whatever processes of abstraction are needed over the predicate-expressions in the agreed predicate-logic representation of content to yield a match with requirements independently needed by the NL expressions making up that sentence. For example, on the assumption that ∀x(Student(x) → Smoke(x)) is an appropriate point of departure for formulating the semantic content of Every student smokes, two steps of abstraction can be applied to that predicate-logic formula, replacing the two predicate constants with appropriately typed variables and lambda operators to yield the term λPλQ[∀x(P(x) → Q(x))] as specifying the lexical content of every.
This term can then combine first with the term Student (to form a noun-phrase meaning) and then with the term Smoke (as a verb-phrase meaning) to yield back the predicate-logic formula ∀x(Student(x) → Smoke(x)). This derivation can be represented as a tree structure with parallel syntactic and semantic labelling:
(3)  ∀x(Student(x) → Smoke(x)) : S
      ├── λQ[∀x(Student(x) → Q(x))] : NP
      │     ├── λPλQ[∀x(P(x) → Q(x))] : DET   (every)
      │     └── Student : N                    (student)
      └── Smoke : VP                           (smokes)

The consequence is that noun phrases are not analysed as individual-denoting expressions of type e (for individual), but as higher-type expressions ((e → t) → t). Since the higher-typed formula is deducible from the lower-typed formula, such a lifting was taken to be fully justified and was applied to all noun-phrase contents, thereby in addition reinstating syntactic and semantic parallelism for these expressions. This gave rise to the generalised quantifier theory of natural language quantification (Barwise & Cooper 1981, article 43 (Keenan) Quantifiers). The surprising result is the required assumption that the content attributable to the VP is semantically the argument (despite whatever linguistic arguments there might be that the verb is the syntactic head of its containing phrase), and the subject expresses the functor that applies to it, mapping it into a propositional content; so semantic and syntactic considerations appear no longer to coincide.

This methodology of defining suitable lambda terms that express the same content as some appropriate predicate-logic formula was extended across the broad array of NL structures (with the addition of possible-world and temporal indices in what was called intensional semantics). Hence the demonstrable claim that the formal semantic method captures a concept of compositionality for natural language sentences while retaining predicate-logic insights into the content to be ascribed, a formal analysis which also provides a basis for characterisations of entailment, synonymy, etc. With syntactic and semantic characterisations of formal languages defined in strictly separate vocabulary, albeit in tandem, the Montague methodology for natural languages imposes a separation of syntactic and semantic characterisations of natural language strings, the latter being defined exclusively in terms of combinatorial operations on denotational contents, with any intermediate form of representation being for convenience of exegesis only. Montague indeed explicitly demonstrated that the mapping onto intermediate (intensional) logical forms in articulating model-theoretic meanings was eliminable. Following predicate-logic semantics methodology, there was little concern with the concept of meaning for elementary expressions. However, relationships between word meanings were defined by imposing constraints on possible denotations as meaning postulates (following Carnap 1947). For example, the be of identity was defined so that the extensions of its arguments were required to be co-extensive across all possible worlds; conversely, the verbs look for and find were defined to ensure that they did not have equivalent denotational contents.
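Spelled out, the derivation in (3) is two steps of function application with β-reduction (our rendering of the standard computation):

\[
\lambda P\lambda Q[\forall x(P(x) \to Q(x))](\mathit{Student})
\;\leadsto_{\beta}\;
\lambda Q[\forall x(\mathit{Student}(x) \to Q(x))]
\]
\[
\lambda Q[\forall x(\mathit{Student}(x) \to Q(x))](\mathit{Smoke})
\;\leadsto_{\beta}\;
\forall x(\mathit{Student}(x) \to \mathit{Smoke}(x))
\]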
3. The challenge of context-dependence

Despite the dismissal by formal semanticists in the seventies of any form of representation within a semantic characterisation of interpretation, it was known right away that the model-theoretic stance as a program for natural language semantics within the grammar is not problem-free. One major problem is the inability to distinguish synonymous expressions within propositional attitude reports (John believes groundhogs are woodchucks vs. John believes groundhogs are groundhogs), indeed any necessary truths, a problem which led to the invocation of structured meanings (Cresswell 1985, Lappin & Fox 2005). Another major problem is the context-dependency of NL construal, a truth-theoretic semantics for NL expressions having to be defined relative to some concept of context (cf. article 89 (Zimmermann) Context dependency). This is a foundational issue which has been a recurrent concern amongst philosophers over many centuries (cf. article 9 (Nerlich) Emergence of semantics for a larger perspective on the same problem). The pervasiveness of the context-dependence of NL interpretation was not taken to be of great significance by some; in Lewis (1970) the only provision is the stipulation of an addition to the model of an open-ended set of indices indicating objects in the utterance context (speaker, hearer, and some finite list of individuals). Nevertheless, the problem posed by context was recognised as a challenge; and an early attempt to meet it, sustaining a core Montagovian conception of denotational semantics while nevertheless disagreeing profoundly over details, was proposed by Barwise & Perry (1983). Their proposed enriched semantic ontology, Situation Semantics, included situations and an array of partial semantic constructs (resource situations, infons, etc.), with sentence meanings requiring anchoring in such situations in order to constitute contents with context-determined values. Inference relations were then defined in terms of relations between situations, with speakers being attuned to such relations between situations (see the subsequent exchange between Fodor and Barwise on such direct interpretation of NL strings: Fodor 1988, Barwise 1989).

3.1. Anaphoric dependencies

Recognition of the extent of this problem emerged in attempts to provide principled explanations of how pronouns are understood. Early on, Partee (1973) had pointed out that pronouns can be interpreted anaphorically, indexically or as bound variables. In (4), the pronoun is subject to apparent indexical construal (functioning like a name in referring to some intended object from the context); but in (5) it is interpreted like a predicate-logic variable with its value determined by some antecedent quantifying expression, hence not from the larger context:

(4) She is tired.

(5) Every woman student is panicking that she is inadequate.

Yet, as Kamp and others showed (Evans 1980, Kamp 1981, Kamp & Reyle 1993), this is just the tip of the iceberg, in that the phenomenon of anaphoric dependence is not restrictable to the domain provided by any one NL sentence in any obvious analogue of predicate-logic semantics. There are for example E-type uses of that same pronoun in which it picks up its interpretation from a quantified expression across a sentential boundary:

(6) A woman student left. She had been panicking about whether she was going to pass.

If natural languages matched predicate-logic patterns, not only would we apparently be forced to posit ambiguity as between indexical and bound-variable uses of pronouns, but
one would be confronted with puzzles that do not fall into either classification: this pronoun appears to necessitate positing a term denoting some arbitrary witness of the preceding propositional formula, in (6) a name arbitrarily denoting some randomly picked individual having the properties of being a student, female, and having left. Such NL construal is directly redolent of the epsilon terms underpinning arbitrary names in natural deduction proofs, carrying a history of the compilation of content from the sentence providing the antecedent (von Heusinger 1997). But this just adds to the problem, for it seems there are three different interpretations for one pronoun, hence ambiguity. Moreover, as Partee had pointed out, tense specifications display all the hallmarks of anaphoric construal, able to be interpreted either anaphorically, indexically or as a bound variable, indeed with E-type effects as well, so this threat of proliferating ambiguities is not specific to pronouns.

4. Dynamic Semantics

The responses to this challenge can all be labelled dynamic semantics (Muskens, van Benthem & Visser 1997, Dekker 2000).

4.1. Discourse Representation Theory

Discourse Representation Theory (DRT: Kamp 1981, Kamp & Reyle 1993) was the first formal articulation of a response to the challenge of modelling anaphoric dependence in a way that enables its various uses to be integrated. Sentences of natural language were said to be interpreted by a construction algorithm for interpretation which takes the syntactic structure of a string as input and maps this by successive constructional steps onto a structured representation called a Discourse Representation Structure (DRS), which was defined to correspond to a partial model for the interpretation of the NL string. A DRS contains named entities (discourse referents) introduced from NL expressions, with predicates taking these as arguments, the sentence relative to which such a partial model is constructed being defined to be true as long as there is at least one embedding of the DRS into the overall model. For example, for a simple sentence-sequence such as (7), the construction algorithm for building discourse representation structure induces a DRS for the interpretation of the first sentence in which one discourse referent is entered into the DRS corresponding to the name and one for the quantifying expression, together with a set of predicates corresponding to the verb and nouns.

(7) John loves a woman. She is French.

(8)
     x  y
    -------------
    John = x
    loves(y)(x)
    woman(y)

The DRS in (8) might then be extended, continuing the construal process for the overall discourse by applying the construction algorithm to the second sentence to yield the expanded DRS:
(9)
     x  y  z
    -------------
    John = x
    loves(y)(x)
    woman(y)
    z = y
    French(y)

To participate in such a process, indefinite NPs are defined as introducing a new discourse referent into the DRS, definite NPs and pronouns require that the referent entered into the DRS be identical to some discourse referent already introduced, and names require a direct embedding into the model providing the interpretation. Once constructed, the DRS is evaluated by its embeddability into the model. Any such resulting DRS is true in a model if and only if there is at least one embedding of it within the overall model. Even without investigating further complexities that license the embeddability of one DRS within another and the famous characterisation of If a man owns a donkey, he beats it (cf. article 37 (Kamp & Reyle) Discourse Representation Theory), an immediate bonus of this approach is apparent. The core cases of the so-called E-type pronouns fall under the same characterisation as more obvious cases of co-reference: all that is revised is the domain across which some associated quantifying expression can be seen to bind. It is notable in this account that there is no structural reflex of the syntactic properties of the individual quantifying determiner: indeed this formalism was among the first to come to grips with the name-like properties of such quantified formulae (cf. Fine 1984). It might of course seem that such a construction process is obliterating the difference between names, quantifying expressions, and anaphoric expressions, since all lead to the construction of discourse referents in a DRS. But, as we have seen, these expressions are distinguished by differences in the construction process. The burden of explanation for NL expressions is thus split: some aspect of their content is characterised by the mode of construction of the intervening DRS, some of it by the embeddability conditions of that structure into the overall model.

The particular significance of DRT lies in the Janus-faced properties of the DRSs defined. On the one hand, a DRS corresponds to a partial model (or, more weakly, is a set of constraints on a model), defined as true if and only if it is embeddable in the overall model (hence is the same type of construct). On the other hand, specific structural properties of the DRS may be invoked in defining antecedent-pronoun relations, hence such a level is an essential intermediary between the NL string and the denotations assigned to its expressions. Nonetheless, this level has a fully defined semantics constituted by its embeddability into an overall model, so its properties are explicitly defined. There is a second sense in which DRT departs from previous theories. In providing a formal articulation of the incremental process of how interpretation is built up relative to some previously established context, there is implicit rejection of the methodology disallowing reference to performance in articulations of NL competence. Indeed, the DRS construction algorithm is a formal reflection of the sentence-by-sentence accumulation of content in a discourse (hence the term Discourse Representation Theory). So DRT not only offers a representationalist account of NL meaning, but one reflecting the incrementality of utterance processing.
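The embeddability condition can itself be made concrete: a DRS is true in a model iff some assignment of model individuals to its discourse referents verifies all its conditions (a schematic sketch with an invented mini-model, not Kamp & Reyle's official definitions):

    from itertools import product

    # A mini-model for (7).
    individuals = {"john", "marie"}
    model = {"John": "john", "woman": {"marie"}, "French": {"marie"},
             "loves": {("john", "marie")}}

    # DRS (9): a universe of referents plus (predicate, arguments) conditions.
    universe = ["x", "y", "z"]
    conditions = [("name", ("x", "John")), ("loves", ("x", "y")),
                  ("woman", ("y",)), ("eq", ("z", "y")), ("French", ("y",))]

    def verifies(f, cond):
        p, args = cond
        if p == "name":                      # John = x
            return f[args[0]] == model[args[1]]
        if p == "eq":                        # z = y
            return f[args[0]] == f[args[1]]
        vals = tuple(f[a] for a in args)
        return (vals[0] if len(vals) == 1 else vals) in model[p]

    def true_in(universe, conditions):
        # truth = at least one embedding of the DRS into the model
        return any(all(verifies(dict(zip(universe, vs)), c) for c in conditions)
                   for vs in product(individuals, repeat=len(universe)))

    print(true_in(universe, conditions))     # True: x = john, y = z = marie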
4.2. Dynamic Predicate Logic

This account of anaphoric resolution sparked an immediate response from proponents of the model-theoretic tradition. For example, Groenendijk & Stokhof (1991) argued that the intervening construct of DRT was both unnecessary and illicit in making compositionality of NL expressions definable not directly over the NL string but only via this intermediate structure. Part of their riposte to Kamp involved positing Dynamic Predicate Logic (DPL) with two variables for each quantifier and a new attendant semantics, so that once one of these variables gets closed off in ways familiar from predicate-logic binding, the second remains open, bindable by a quantifying mechanism introduced as part of the semantic combinatorics associated with some preceding string, hence obtaining cross-sentential anaphoric binding without any ancillary level of representation as invoked in DRT (cf. article 38 (Dekker) Dynamic semantics). Both the logic and its attendant semantics were new. Nevertheless, such a view is directly commensurate with the stringently model-theoretic view of context-dependent interpretation for natural language sentences provided by e.g. Stalnaker (1970, 1999): in these systems, progressive accumulation of interpretation across sequences of sentences in a discourse is seen exclusively in terms of intersections of sets of possible worlds progressively established, or rather, to reflect the additional complexity of formulae containing unbound variables, intersection of sets of pairs of worlds and assignments of values to variables (see Heim 1982, where this is set out in detail).

In setting out DPL as a putative competitor to DRT in characterising the same data without any level of representation, there was no attempt to address the challenge which Kamp had brought to the fore in articulating DRT, that of characterising how anaphoric expressions contribute to the progressive accumulation of interpretation: on the DPL account, the pronoun in question was simply presumed to be coindexed with its antecedent. Notwithstanding this lack of take-up of the challenge which DRT was addressing, there has been continuing debate since then as to whether any intervening level of representation is justified over and above whatever syntactic levels are posited to explain syntactic properties of natural language expressions. Examples such as (10)-(11) have been central to the debate (Kamp 1996, Dekker 2000):

(10) Nine of the ten marbles are in the bag. It is under the sofa.

(11) One of the ten marbles isn't in the bag. It is under the sofa.

According to the DRT account, the reason why the pronoun it cannot successfully be used in (10) with the interpretation that it picks up on the one marble not in the bag is that such an entity is only inferable from information given by expressions in the previous sentence: no representation of any term denoting such an entity in (10) has been made available by the construction process projecting a discourse representation structure on the basis of which the truth conditions of the previous sentence are compiled. So though in all models validating the truth of (10) there must be a marble not in the bag described, there cannot be a successful act of reference to such an individual in using the pronoun. By way of contrast, in (11), despite its being true in all the same models in which (10) is true, it is because the term denoting the marble not in the bag is
specifically introduced that anaphoric resolution is successful. Hence, it is argued, the presence of an intermediate level of representation is essential.

4.3. The pervasiveness of context-dependence

Despite Kamp's early insight (Kamp 1981) that anaphora resolution was part of the construction process for building up interpretation, this formulation of anaphoric dependence was set aside in the face of the charge that DRT did not provide a compositional account of natural language semantics, and alternative generalised-quantifier accounts of natural language quantification were provided, reinstating compositionality of content over the NL string within the DRT framework even though positing an intermediate level of representation (van Eijck & Kamp 1997). Despite the great advances made by DRT, the issue of how to model context dependence continues to raise serious challenges for model-theoretic accounts of NL content. The different modes of interpretation available for pronouns extend far beyond a single syntactic category. Definite NPs, demonstrative NPs, and tense all present a three-way ambiguity between indexical, bound-variable and E-type forms of construal; and this ambiguity, if not reducible to some general principle, requires that the language be presumed to contain more than one such expression, with many discrete expression-denotation pairings. Such ambiguity is a direct consequence of the assumption that an articulation of the meaning of an expression has to be in terms of its systematic contribution to the truth conditions of the sentences in which it occurs. Such distinct uses of pronouns do indeed need to be expressed as contributing different truth conditions, whether as variable, as name, or as some analogue of an epsilon term; but the very fact that this ambiguity occurs in all context-dependent expressions in all languages indicates that something systematic is going on: this pattern is wholly unlike the accidental homonymy typical of lexical ambiguity. Furthermore, it is the non-existence of context-dependence in the interpretation of classical logic which lies at the heart of the difference between the two types of system. In predicate logic, by definition, there is no articulation of context or of how interpretation is built up relative to it. The phenomenon under study is that of inference, and the formal language is defined to match such patterns (and not the way in which the formulae in question might themselves have been established). Natural languages are however not purpose-built systems; and context-dependence is essential to their success as an economical vehicle for expressing arbitrarily rich pieces of information relative to arbitrarily varying contexts. This perspective is buttressed by work in the neighbouring disciplines of philosophy of language and pragmatics: the gap between the intrinsic content of words and their interpretation in use had been emphasised by the later Wittgenstein (1953), Austin (papers collected in 1961), Grice (papers collected in 1989), Sperber & Wilson (1986/1995) and Carston (2002). The fact that context is essential to NL construal imposes an additional condition of adequacy on accounts of NL content: a formal characterisation of the meaning of NL expressions needs to define both the input which an individual NL expression provides to the interpretation process and the nature of the contexts with which such input interacts.
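This adequacy condition can be given a schematic rendering. In the following minimal sketch, which is an illustrative invention rather than a proposal from the literature (the type Context, its fields and the resolution policy are all assumptions), a context-dependent expression denotes a function from contexts to contents, so that both the expression's demand on the context and the structure of contexts themselves have to be made explicit:

-- Contexts made explicit: a record of the parameters an expression
-- may consult. The fields and the resolution policy are invented.
type Entity = String

data Context = Context { speaker :: Entity      -- fixed by the utterance situation
                       , salient :: [Entity] }  -- antecedents, most salient first

-- A context-dependent expression denotes a (partial) function from
-- contexts to contents; here the content of an NP is just an Entity.
pronounHe :: Context -> Maybe Entity
pronounHe ctx = case salient ctx of
                  (e:_) -> Just e     -- anaphoric: resolved to a salient referent
                  []    -> Nothing    -- no antecedent: interpretation fails

pronounI :: Context -> Maybe Entity
pronounI = Just . speaker             -- indexical: supplied by the utterance situation

Even this toy fragment locates the systematicity noted above in the resolution procedure rather than in an ambiguity of the lexical item itself.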
Answers to the problem of context formulation cannot, however, be expected to come from the semantics of logics. Rather, we need some basis for formulating specifications that underdetermine any assigned content. Given the
needed emphasis on underspecification, on what it means to be part-way through a process whereby some content is specifiable only as output, it is natural to think in terms of representations, or, at least, in terms of constraints on the assignment of content. This is now becoming commonplace amongst linguists (Underspecified Discourse Representation Theory (UDRT), Reyle 1993; van Leusen & Muskens 2003). Indeed Hamm, Kamp & van Lambalgen (2006) have argued explicitly that such linguistically motivated semantic representations have to be construed within a broadly computational, hence representationalist, theory of mind.

5. The shift towards proof-theoretic perspectives

It might seem as though, with representationalism in semantics such a threat to the fruitfulness of the Montague paradigm, any such claim would be deluged with counterarguments from the formal-semantics community (Dekker 2000, Muskens 1996). But, to the contrary, use of representationalist tools is continuing apace in both orthodox and less orthodox frameworks; and the shift towards more representational modes of explanation is not restricted to anaphora resolution.

5.1. The Curry-Howard isomorphism

An important proof-theoretic result pertaining to the compositionality of NL content came from the proof that the lambda calculus and type deduction in intuitionistic logic are isomorphic (intuitionistic logic is weaker than classical logic in that several classical tautologies do not hold). This is the so-called Curry-Howard isomorphism. Its relevance to linguists, which I define ostensively by illustration, is that the fine structure of how compositionality of content for NL expressions is built up can be represented proof-theoretically, making use of this isomorphism (Morrill 1994, Carpenter 1997). The isomorphism is displayed in proofs of type deduction in which propositions are types, with a label demonstrating how that proof type as conclusion was derived. The language of the labels is none other than the lambda calculus, with functional application in the labels corresponding to type deduction on the formula side. So in the label we might have a lambda term, e.g. λx[Sneeze(x)], and in the formula its corresponding type e → t. The compositionality of content expressible through functional application defined over lambda terms can thus be represented as a step of natural deduction over labelled propositional formulae, with functional application on the labels and modus ponens on the typed formula. For example, a two-place predicate representable as λxλy[See(x)(y)] can be stated as a label to a typed formula:

(12) λxλy[See(x)(y)] : e → (e → t)

This, when combined with a formula

(13) Mary : e

yields as output:

(14) λy[See(Mary)(y)] : e → t

And this in its turn, when paired with

(15) John : e

yields as output, by one further step of simultaneous functional application and Conditional Elimination:

(16) See(Mary)(John) : t

So compilation of content for the string John sees Mary can be expressed as a labelled deduction proof (12)-(16), reflecting the bottom-up compilation of content for the NL expression.
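Under the Curry-Howard correspondence, these labelled deduction steps are exactly what the type checker of a typed functional language performs. The following minimal sketch mirrors (12)-(16); the encoding of individuals as strings and the one-fact extension of see are toy assumptions made purely for illustration:

-- Types e and t rendered as Haskell types.
type E = String
type T = Bool

-- λxλy[See(x)(y)] : e → (e → t), with the object as first argument,
-- following (12); the toy extension makes "John sees Mary" true.
see :: E -> E -> T
see x y = (x, y) == ("Mary", "John")

mary, john :: E        -- Mary : e and John : e, as in (13) and (15)
mary = "Mary"
john = "John"

seeMary :: E -> T      -- (14): λy[See(Mary)(y)] : e → t
seeMary = see mary

sentence :: T          -- (16): See(Mary)(John) : t
sentence = seeMary john

Each application is licensed only if the types fit, so the compiler's type checking re-enacts the modus ponens steps on the formula side of (12)-(16).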
5.2. Proof theory as syntax: Type Logical Grammar

This method for using the fine structure of natural deduction as a means of representing NL compositionality has had very wide applicability, in categorial grammar and elsewhere (Dalrymple, Shieber & Pereira 1991, Dalrymple (ed.) 1999). In categorial grammar in particular (following Lambek 1958, who defined a simple extension of the lambda calculus with the incorporation of two order-sensitive operators indicating functional application with respect to some left/right placed argument), the Curry-Howard isomorphism is central, and with the later postulation of modal operators to define syntactic domains (Morrill 1990, 1994), such systems have been shown to have expressive power sufficient to match a broad array of variation expressible in natural languages (leaving on one side the issue of context dependence). Hence the categorial-grammar claim that NL grammars are logics with attendant proof systems. These are characteristically presented either in natural-deduction format or in its meta-level analogue, the sequent calculus. The advantage of such systems is the fine level of granularity they provide for reflecting bottom-up compositionality of content, given suitable (higher) type assignments to NL expressions. In being logics, natural languages are presumed to have syntax and semantics defined in tandem, with the proof display being no more than an elegant display of how prosodic sequences (words) are projected, by steps of labelled type deduction, onto denotational contents. In particular, no representationalist assumptions are made vis-à-vis either syntax or logical representations (see Morrill 1994 for a particularly clear statement of this strictly denotationalist commitment). Nonetheless, in more recent work, Morrill departs from at least one aspect of this stringent categorial grammar proof-theoretic ideology. In all categorial grammar formalisms, as indeed in many other frameworks, there is strict separation of competence and performance considerations, with grammar formalisms only evaluated in terms of their empirical predictive success; yet Morrill & Gavarró (2004) and Morrill (2010) argue that an advantage of the particular categorial-grammar characterisation adopted (with linear-logic proof derivations) is the step-wise correspondence of individual steps in the linear-logic derivation to measures of complexity in processing the NL string, this being a bonus for the account. Thus representational properties of grammar-defined derivations would seem at least evidenced by performance data, even if such proof-theoretically defined derivations are taken to be eliminable as a core property of the grammar formalism itself.
5.3. Type Theory with Records

A distinct proof-theoretic equivalent of Montague's PTQ grammar was defined by Ranta (1994), who adopted as a basis for his framework the methodology of Martin-Löf's (1984) proof theory for intuitionistic type theory. By way of introducing the Martin-Löf ontology, remember how in natural deduction proofs there are metalevel annotations (section 1.4), but these are only partially explicit: many do not record dependencies between arbitrary names as they are set up. The Martin-Löf methodology, to the contrary, requires that all such dependencies are recorded, and duly labelled; and Ranta used this rich attendant labelling system to formulate analyses of anaphora resolution and quantification, and from these he established a structural concept of context, notably going beyond what is achievable in categorial grammar formalisms. Furthermore, such explicit representations of dependency have been used to establish a fully explicit proof-theoretic grounding for generalized quantifiers, from which an account of anaphora resolution follows incorporating E-type pronouns as a subtype (Ranta 1994, Piwek 1998, Fernando 2002). In Cooper (2006), the Ranta framework is taken a step further, yielding the Type Theory with Records framework (TTR). Cooper uses a concept of record and record-type to set out a general framework for modelling both context-dependent interpretation and the intrinsic underspecification that NL expressions themselves contribute to the interpretation process. Echoing the DRT formulation, the interpretation to be assigned to the sentence A man owns a donkey is set out as taking the form of the record-type (Cooper 2005) (the variables in these formulations, as labels to proof terms, are like arbitrary names, expressing dependencies between one term and another):

(17)
    x  : Ind
    c1 : man(x)
    y  : Ind
    c2 : donkey(y)
    c3 : own(y)(x)

x, y are variables of individual type, c1 is of the type of proof that x is a man (hence a proof that is dependent on x), c2 is of the type of proof that y is a donkey, and so on. A record of that record type would be some instantiation of the variables, e.g.:

(18)
    x  = a
    c1 = p1
    y  = b
    c2 = p2
    c3 = p3

p1 a proof of 'man(a)', p2 a proof of 'donkey(b)', and so on. This is a proof-theoretic reformulation of the situation-theory concepts of infon (= situation-type) and situation. A record represents some situation that provides values that make some record-type true. The concept of record-type corresponds to sentence meanings in abstraction from any given context/record: a sentence meaning is a mapping from records to record-types. It is no coincidence that such dependency labelling has properties like those of epsilon terms, since, like them, the terms that constitute the labels to the derived types express the history of the mode of combination. The difference between this and DRT lies primarily in the grounding of records and record-types in proof-theoretic rather than model-theoretic underpinnings (Cooper 2006).
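The record/record-type distinction in (17)-(18) can be approximated in an ordinary typed language. The sketch below is illustrative only (the names ManOwnsDonkey, ManProof etc. are invented), and, lacking dependent types, Haskell cannot enforce that c1 really is a proof about the very individual stored in x - the feature for which Martin-Löf-style type theory is invoked in the first place:

-- A toy domain of individuals, and placeholder "proof" types for the
-- conditions man(x), donkey(y) and own(y)(x) of (17).
data Ind    = A | B deriving Show
data Man    = ManProof Ind
data Donkey = DonkeyProof Ind
data Own    = OwnProof Ind Ind

-- The record type (17). Nothing here forces c1 to concern the
-- individual in x; genuine TTR expresses exactly that dependency.
data ManOwnsDonkey = ManOwnsDonkey
  { x  :: Ind
  , c1 :: Man
  , y  :: Ind
  , c2 :: Donkey
  , c3 :: Own }

-- A record of that record type, as in (18): one instantiation of
-- the variables, i.e. one situation verifying the type.
r :: ManOwnsDonkey
r = ManOwnsDonkey { x = A, c1 = ManProof A
                  , y = B, c2 = DonkeyProof B
                  , c3 = OwnProof B A }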
Like DRT, this articulation of record theory with types as a basis for NL semantics leaves open the link with syntax. However, in recent work, Ginzburg and Cooper have extended the concept of records and record-types yet further to incorporate full details of linguistic signs (with phonological, syntactic and semantic information - a multi-level representational system; Ginzburg & Cooper 2004, cf. article 36 (Ginzburg) Situation Semantics).

6. Ellipsis: A window on context

So far, the main focus has been anaphora construal, but ellipsis presents another window on context. Elliptical fragments are those where there is only a fragment of a clause: words and whole phrases can simply be omitted when the context fully determines what is needed to complete the interpretation. The intrinsic interest of such elliptical fragments is that they provide evidence that very considerable richness of labelling is required in modelling NL construal. Like pronominal anaphora, ellipsis displays a huge diversity. Superficially, the phenomenon might seem to be amenable to some more sophisticated variant of anaphora construal. There are, that is, what look like indexical construals, and also analogues of coreference and bound-variable anaphora:

(19) (Mother to a toddler stretching up to the stove above their head) Johnny, don't.
(20) John stopped in time but I didn't.
(21) Everyone who submitted their thesis without checking it wished that the others had too.
To capture the array of effects illustrated in (20)-(21), there is a model-theoretic account of ellipsis (Dalrymple, Shieber & Pereira 1991) whose starting point is to take the model-theoretic methodology and, from the propositional content provided by the antecedent clause, to define some lambda term to isolate an appropriate predicate for combining with the NP term provided by the fragment. For (20), one might define the term λx[stop-in-time(x)]; for (21) the term λx[∃y(thesis(y) ∧ submit(y)(x) ∧ ¬check(y)(x))]. However, even allowing for complexities of possible sequences of quantificational dependencies replaced at the ellipsis site as in (21), the rebinding mechanism has to be yet more complex, as what is rebound may be not merely some subject value and whatever quantified expressions are dependent on that, but cases where the subject expression is itself dependent on some expression within its restrictor, so that an even higher-order form of abstraction is required:

(22) The man who arrested Joe failed to read him his rights, as did the man who arrested Sue.

This might seem to be an echo of the debate between DRT and DPL, with the higher-order account having the formal tools to express the parallelism of construal carried over from the antecedent clause to the ellipsis site as in (20)-(21), without requiring any invocation of a representation of content. However, there are many cases of ellipsis, as syntacticians have demonstrated, where only an account which invokes details of some structure assigned to the string can explain the available interpretations (Fiengo & May 1994, Merchant 2004). For example, there are morphological idiosyncrasies displayed by individual languages that surface as restrictions on licensed elliptical forms. In a case-rich language such as German, for example, the fragment has to occur with the case form it would be expected to have in a fully explicit follow-on to the antecedent clause (Greek displays exactly the same type of requirement). English, where case is very atrophied, imposes no such requirement (Morgan 1973, Ginzburg & Cooper 2004):

(23) Hans will nach London gehen. Ich/*mich auch.
(24) Hans wants to go to London. Me too.

This is not the kind of information which a higher-order unification account can express, as its operations are defined on the model-theoretic construal of the antecedent conjunct, not over morphological sequences. There are many more such structure-particular variations, indicating that ellipsis would seem to have to be defined over representations of structure, hence to be analysed as a syntactic phenomenon, in each case defined over some suitable approximate correspondent to the surface strings (there has to be this caveat, as there are well-known cases of vehicle-change where what is reconstructed is not the linguistic form, Fiengo & May 1994):

(25) John has checked his thesis notes carefully, but I haven't.
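The abstraction step at the heart of the higher-order unification account of (20) can be sketched in a toy term language. This is a deliberately minimal rendering (the function abstractOver and the constant names are invented, and genuine higher-order unification must in addition choose among competing abstractions, which is where both the account's power and its problems lie):

-- A toy term language for contents.
data Term = Var String
          | Const String
          | App Term Term
          | Lam String Term
          deriving (Eq, Show)

-- The abstraction step: given the parallel element a and the content
-- of the antecedent clause, return λx. content[a := x].
abstractOver :: Term -> Term -> Term
abstractOver a body = Lam "x" (replace body)
  where
    replace t
      | t == a        = Var "x"
    replace (App f u) = App (replace f) (replace u)
    replace (Lam v u) = Lam v (replace u)
    replace t         = t

-- (20): antecedent content stop-in-time(john); fragment subject "i".
antecedent :: Term
antecedent = App (Const "stop-in-time") (Const "john")

-- The ellipsis site: (λx. stop-in-time(x)) applied to "i".
ellipsisSite :: Term
ellipsisSite = App (abstractOver (Const "john") antecedent) (Const "i")

Note that nothing in this computation ever sees the morphological case of the fragment, which is exactly what (23) turns on.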
However, there are puzzles for both types of account. Fragments may occur in dialogue for which arguably only a pragmatic enrichment process can capture the effects (not based either on structured strings or on their model-theoretic contents, Stainton 2006):

(26) A (leaving hotel): The station?
     B (receptionist): Left out of the door. Then second on the right.

For reasons such as this, ellipsis continues to receive a great deal of attention, with no single account apparently able single-handedly to match the fine-grained nature of the cross-conjunct/speaker patterning that ellipsis construal can achieve. The conclusion generally drawn is that the folk intuition about ellipsis is simply not expressible. Indeed the added richness in TTR of combining Ranta's type-logical formalism with the feature-matrix vocabulary of Head-Driven Phrase Structure Grammar (HPSG: Sag, Wasow & Bender 2002) to obtain a system combining independent semantic, syntactic and morphological paradigms was at least in part driven by the observed 'fractal heterogeneity' of ellipsis (Ginzburg & Cooper 2004). However, there is one last chance to capture such effects in an integrated way, and this is to posit a system of labelling recording individual steps of information build-up with whatever level of granularity is needed to preserve the idiosyncrasies in the individual steps. Ellipsis construal can then be defined by making reference to such records. This type of account is provided in Dynamic Syntax (Kempson, Meyer-Viol & Gabbay 2001, Cann, Kempson & Marten 2005, Purver, Cann & Kempson 2006, Cann, Kempson & Purver 2007).

6.1. Dynamic Syntax: A reflection of parsing dynamics as a basis for syntax

Dynamic Syntax (DS) models the dynamics of how interpretation is incrementally built up following a parsing dynamics, with progressively richer representations of content constructed as words are processed relative to context. This articulation of the stepwise way in which interpretation is built up is claimed to be all that is needed for explaining NL syntax: the representations constructed are simultaneously the vehicle for explaining syntactic distributions and a vehicle for representing how interpretation is built up. The methodology adopts a representationalist stance vis-à-vis content (Fodor 1983). Predicate-argument structures are represented in a tree format, with the assumption of progressive update of partial tree-representations of content. Context, too, is represented in the same terms, evolving in tandem with each update. This concept of structural growth totally replaces the semantically blind syntactic specifications characteristic of such formalisms as Minimalism (Chomsky 1995) and HPSG. Parsing is seen as starting from an initial one-node tree stating the goal of the interpretation process to establish some propositional formula (the tree representation to the left of the ↦ in (27)). Then, using both parse input and information from context, some propositional formula is progressively built up (the tree representation to the right of the ↦ in (27)).

(27) Parsing John upset Mary

     ?t, ◊   ↦         Upset'(Mary')(John') : t, ◊
                        /                        \
                 John' : e               Upset'(Mary') : e → t
                                          /                  \
                                   Mary' : e         Upset' : e → (e → t)
The output is a fully decorated tree whose topnode is a representation of some proposition expressed, with its associated type specification, and each dominated node has a concept formula, e.g. John' representing some individual John, and an indication of what semantic type that concept is. The primitive types are types e and t as in formal semantics, but construed syntactically as in the Curry-Howard isomorphism, labelled type deduction determining the decorations on non-terminal nodes once all terminal nodes are fixed and suitably decorated with formula values. There is invariably one node under development in any partial tree, as indicated by the pointer ◊. So a parse process for (27) would constitute a transition across partial trees, the substance of this transition turning on how the growth relation ↦ is to be determined by the word sequence. The concept of requirement ?X for any decoration X is central. Decorations on nodes such as ?t, ?e, ?(e → t), etc. express requirements to construct formulae of the appropriate type on the nodes so decorated, and these requirements drive the subsequent tree-construction process. A string can then be said to be wellformed if and only if there is at least one derivation involving monotonic growth of partial trees, licensed by computational (general), lexical and pragmatic actions following the sequence of words, that yields a complete tree with no requirements outstanding. Just as the concept of tree growth is central, so too is the concept of procedure for mapping one partial tree to another. Individual transitions from partial tree to partial tree are all defined as procedures for tree growth. The formal system underpinning the partial trees that are constructed is a logic of finite trees (LOFT: Blackburn & Meyer-Viol 1994). There are two basic modalities, ⟨↓⟩ and ⟨↑⟩, and Kleene * operators defined over these relations, e.g. ⟨↑*⟩Tn(a) indicating that somewhere dominating this node is the tree-node Tn(a) (a standard tree-theoretic characterisation of 'dominate'). The procedures in terms of which the tree growth processes are defined then involve such actions as make(⟨↓⟩), go(⟨↓⟩), make(⟨↓*⟩), put(X) (for any decoration X), etc. This applies both to general constraints on tree growth (hence the syntactic rules of the system) and to specific tree update actions constituting the lexical content of words. So the contribution which a word makes to utterance interpretation is expressed in the same vocabulary as general structural growth processes, and is invariably more than just a concept specification: it is a sequence of actions developing a sub-part of a tree, possibly building new nodes, and assigning them decorations such as formula and type specifications. Of the various concepts of underspecification, two are central. On the one hand, there is underspecification of conceptual content, with anaphoric expressions being defined as adding to a node in a tree a place-holding metavariable of a given type as a provisional formula value, to be replaced by some fixed value which the immediate context makes available. This is a relatively uncontroversial approach to anaphora construal, equivalent to formulations in many other frameworks (notably DRT and TTR).
On the other hand, there is underspecification and update of structural relations, in particular replacing all movement or feature-passing accounts of discontinuity effects with the introduction of an unfixed node, one whose structural relation is not specified at the point of construction, and whose value must be provided from the construction process. Formally, the construction of a new node within a partial tree is licensed from some node requiring a propositional type, with that relation being characterised only as that of domination (weakly specified tree relations are indicated by a dashed line, with Tn(0) as the rootnode): this is step (i) of (28). The update to this relatively weak tree-relation for English, lacking as it does any case specifications, becomes possible only once the verb has been parsed. This is the unification step (ii) of (28), an action which satisfies both type and structure update requirements:

(28) Parsing Mary, John upset
     step (i):  a tree decorated ?t, Tn(0), with an unfixed node Mary' : e,
                ⟨↑*⟩Tn(0), ◊, linked to the rootnode only by domination
     step (ii): once the verb is parsed, the unfixed node unifies with the
                object node ?e of the predicate subtree, whose remaining nodes
                are decorated John' : e, ?(e → t) and Upset' : e → (e → t)

This process, like the substitution operation associated with pronoun construal, feeds into the ongoing process of creating a completed tree, in this case by steps of labelled type deduction.
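The notions of requirement, decoration and tree-growth action at work here can be given a schematic rendering. The sketch below is a drastically simplified toy (real DS trees are defined over the tree logic LOFT, and lexical actions are far richer); it shows only the shape of the idea: nodes carry requirements ?X or formula decorations, and both grammatical rules and words denote functions from partial trees to partial trees:

-- Semantic types, node decorations (a requirement ?X or a formula of
-- type X), and partial trees.
data Ty    = E | T | Arrow Ty Ty deriving (Eq, Show)
data Node  = Req Ty               -- e.g. ?t, ?e, ?(e -> t)
           | Dec String Ty        -- e.g. John' : e
           deriving Show
data PTree = Leaf Node
           | Branch Node PTree PTree
           deriving Show

-- Tree-growth actions: both general rules and lexical entries map
-- partial trees to (more developed) partial trees.
type Action = PTree -> PTree

-- The starting point of every parse: a one-node tree requiring ?t.
axiom :: PTree
axiom = Leaf (Req T)

-- One (grossly simplified) general action: expand a ?t node into
-- subject (?e) and predicate (?(e -> t)) daughters.
introSubjPred :: Action
introSubjPred (Leaf (Req T)) =
  Branch (Req T) (Leaf (Req E)) (Leaf (Req (Arrow E T)))
introSubjPred t = t               -- inapplicable elsewhere: no growth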
It might seem that all such talk of (partial) trees as representations of content could not in principle simultaneously serve as both a syntactic explanation and a basis for semantic interpretation, because of the problems posed by quantification, known to necessitate a globally defined process expressing scope dependencies between quantifying expressions, and generally agreed to involve mismatch between logical and syntactic category (see section 2.1). However, quantified expressions are taken to map onto epsilon terms, hence of type e (Kempson, Meyer-Viol & Gabbay 2001, ch. 7); and in having a restrictor which reflects the content of the proposition in which they are contained, the defining property of epsilon terms is that they grow incrementally, as additional predicates are added to the term under construction (see section 1.4). And these epsilon terms, metavariables and tree relations, all under development, all interact to yield the complex interactions more familiarly seen as scope and anaphoric sensitivities to long-distance and other discontinuity effects. The bonus of this framework for modelling ellipsis is that by taking not merely structured representations of content as first-class citizens but also the procedures used to incrementally build up such structure, ellipsis construal is expressible in an integrated way: it is indeed definable as determined directly by context. What context comprises is the richer notion that includes some sequence of words, their assigned content, and its attendant propositional tree structure, plus the sequence of actions whereby that structure and its content were established. Any one of these may be used to build up interpretation of the fragment, and together they determine the range of interpretations available: recovery of some predicate content from context (19)-(20), recovery of some actions in order to create a parallel but distinct construal (21)-(22). This approach to ellipsis as building up structure by reiterating content or actions can thus replace the higher-order unification account with a more general, intuitive, and yet essentially representationalist account. A unique property of the DS grammar mechanisms is that production is expressed in the same terms as parsing, as progressive build-up of semantic structure, differing from parsing only in having also a richer representation of what the speaker is trying to express as a filter on whether appropriate semantic structure is being induced by the selected words. This provides a basis from which to expect fluent exchange of speaker/
hearer roles in dialogue, in which a speaker sets out a partial structure which their interlocutor, parsing what they say, takes over:

(29) Q: Who did John upset? A: Himself.
(30) A: I saw John. B: With his mother? A: Yes, with Sue. In the park.

Since both lexical and syntactic actions are defined as tree-growth processes, construal of elliptical fragments both within and across speakers is expected to allow replication of any such actions following their use to build interpretation for some antecedent, predicting the mixture of semantic and syntactic factors in ellipsis. Scope dependencies are unproblematic. These are not expressed on the trees, but formulated as an incrementally collected set of constraints on the evaluation of the emergent epsilon terms. These are applied once the propositional formula is constructed, determining the resulting epsilon term. So despite the apparently global nature of scope, parallel scope dependencies can be expressed as the re-use of a sequence of actions that had earlier been used in the construal of the antecedent string, replicating scope actions but relative to the terms constructed in interpreting the fragment. Even case, defined as an output filter on the resultant tree, is unproblematic, unlike for semantic characterisations of ellipsis, as in (23). So, it is claimed, the interaction of morphology, syntax, and even pragmatics in ellipsis construal is predictable given DS assumptions. To flesh this out would involve a full account of how DS mechanisms interact to yield appropriate results. All this sketch provides is a point of departure for addressing ellipsis in an integrated manner (see also a variant of DRT: Asher & Lascarides 2002). More significantly, the DS account of context and the emergent concept of content intrinsic to NL words themselves are essentially representationalist. The ellipsis account succeeds in virtue of the fact that lexical specifications are procedures that interact with context to induce the building of representations of denotational content. Natural languages are accordingly denotationally interpretable only via a mapping onto an intermediate logical system. And, since compositionality of denotational content is defined over the resulting trees, it is the incrementality of projection of word meaning and the attendant monotonicity of the tree growth process which constitutes compositionality definable over the words making up sentences.

7. Summary

Inevitably, issues raised by anaphora and ellipsis remain open. But, even without resolving these, research goals have strikingly shifted since early work in formal semantics and the representationalism debate it engendered. The dispute remains; but answers to the questions are very different. On the view espoused within TTR and DS formulations, a natural language is not definable as a logic in the mould of predicate logic. Rather, it is a set of mechanisms out of which truth-denoting objects can be built relative to what is available in context. To use Cooper's apt turn of phrase, 'natural languages are tools for formal language construction' (Cooper & Ranta 2008): so semantic representations are central to the form of explanation.
Whatever answers individual researchers might reach to individual questions, one thing is certain: new puzzles are taking centre-stage. The data of semantic investigation are not now restricted to judgements of entailment relations between sentences. The remit includes modelling the human capacity to interpret fragments in context in conjunction with other participants in dialogue, and defining appropriate concepts of information update. Each of these challenges involves an assumption that the human capacity for natural language is a capacity for language processing in context. With this narrowing of the gap between competence and performance considerations, assumptions about the nature of such semantic representations of content can be re-evaluated. We can again pose the question of the status of representations in semantic modelling; but we can now pose this as a question of the relation of such representations to those required for modelling cognitive inference more broadly. Furthermore, such questions can be posed from a number of frameworks as starting point: categorial grammar, Type Theory with Records, DRT and its variants, Dynamic Syntax, to name but a few. And it is these new avenues of research which are creating new ways of understanding the nature of natural language and linguistic competence.

Eleni Gregoromichelaki, Ronnie Cann and two readers provided most helpful comments leading to significant improvement of this paper. However, normal disclaimers apply.

8. References

Asher, Nicholas & Alex Lascarides 2002. Logics of Conversation. Cambridge, MA: The MIT Press.
Austin, John 1961. Philosophical Papers. Oxford: Clarendon Press.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159-219.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Blackburn, Patrick & Wilfried Meyer-Viol 1994. Linguistics, logic and finite trees. Bulletin of the Interest Group of Pure and Applied Logics 2, 3-29.
Cann, Ronnie, Ruth Kempson & Lutz Marten 2005. Dynamics of Language. Amsterdam: Elsevier.
Cann, Ronnie, Ruth Kempson & Matthew Purver 2007. Context, wellformedness and the dynamics of dialogue. Research on Language and Computation 5, 333-358.
Carnap, Rudolf 1947. Meaning and Necessity. A Study in Semantics and Modal Logic. Chicago, IL: The University of Chicago Press.
Carpenter, Robert 1997. Type-Logical Semantics. Cambridge, MA: The MIT Press.
Carston, Robyn 2002. Thoughts and Utterances: The Pragmatics of Explicit Communication. Oxford: Blackwell.
Chomsky, Noam 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Cooper, Robin 2005. Austinian truth, attitudes and type theory. Research on Language and Computation 3, 333-362.
Cooper, Robin 2006. Records and record types in semantic theory. Journal of Logic and Computation 15, 99-112.
Cooper, Robin & Aarne Ranta 2008. Natural language as collections of resources. In: R. Cooper & R. Kempson (eds.). Language in Flux: Dialogue Coordination, Language Variation, Change, and Evolution. London: College Publications, 109-120.
Cresswell, Max 1985. Structured Meanings. Cambridge, MA: The MIT Press.
Dalrymple, Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higher-order unification. Linguistics & Philosophy 14, 399-452.
Dalrymple, Mary (ed.) 1999. Semantics and Syntax in Lexical Functional Grammar: The Resource Logic Approach. Cambridge, MA: The MIT Press.
Dekker, Paul 2000. Coreference and representationalism. In: K. von Heusinger & U. Egli (eds.). Reference and Anaphoric Relations. Dordrecht: Kluwer, 287-310.
Evans, Gareth 1980. Pronouns. Linguistic Inquiry 11, 337-362.
Fernando, Timothy 2002. Three processes in natural language interpretation. In: W. Sieg, R. Sommer & C. Talcott (eds.). Reflections on the Foundations of Mathematics: Essays in Honor of Solomon Feferman. Natick, MA: Association for Symbolic Logic, 208-227.
Fiengo, Robert & Robert May 1994. Indices and Identity. Cambridge, MA: The MIT Press.
Fine, Kit 1984. Reasoning with Arbitrary Objects. Oxford: Blackwell.
Fitch, Frederic 1951. Symbolic Logic. New York: The Ronald Press Company.
Fodor, Jerry 1981. Re-Presentations. Cambridge, MA: The MIT Press.
Fodor, Jerry 1983. Modularity of Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry 1988. A situated grandmother. Mind & Language 2, 64-81.
Fodor, Jerry 1998. Concepts. Oxford: Oxford University Press.
Gamut, L.T.F. 1991. Logic, Language and Meaning. Chicago, IL: The University of Chicago Press.
Geach, Peter 1972. Logic Matters. Oxford: Oxford University Press.
Ginzburg, Jonathan & Robin Cooper 2004. Clarification, ellipsis and the nature of contextual updates. Linguistics & Philosophy 27, 297-365.
Grice, Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic predicate logic. Linguistics & Philosophy 14, 39-100.
Hamm, Fritz, Hans Kamp & Michiel van Lambalgen 2006. There is no opposition between formal and cognitive semantics. Theoretical Linguistics 32, 1-40.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hilbert, David & Paul Bernays 1939. Grundlagen der Mathematik. Vol. II. 2nd edn. Berlin: Springer.
Hintikka, Jaakko 1974. Quantifiers vs. quantification theory. Linguistic Inquiry 5, 153-177.
Jackendoff, Ray 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277-322.
Kamp, Hans 1996. Discourse Representation Theory and Dynamic Semantics: Representational and nonrepresentational accounts of anaphora. French translation in: F. Corblin & C. Gardent (eds.). Interpréter en contexte. Paris: Hermès, 2005.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jerrold 1972. Semantic Theory. New York: Harper & Row.
Katz, Jerrold & Jerry Fodor 1963. The structure of a semantic theory. Language 39, 170-210.
Kempson, Ruth, Wilfried Meyer-Viol & Dov Gabbay 2001. Dynamic Syntax: The Flow of Language Understanding. Oxford: Blackwell.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly 65, 154-170.
Lappin, Shalom & Christopher Fox 2005. Foundations of Intensional Semantics. Oxford: Blackwell.
van Leusen, Noor & Reinhard Muskens 2003. Construction by description in discourse representation. In: J. Peregrin (ed.). Meaning: The Dynamic Turn. Amsterdam: Elsevier, 33-65.
Lewis, David 1970. General semantics. Synthese 22, 18-67. Reprinted in: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 1972, 169-218.
Martin-Löf, Per 1984. Intuitionistic Type Theory. Naples: Bibliopolis.
Merchant, Jason 2004. Fragments and ellipsis. Linguistics & Philosophy 27, 661-738.
Montague, Richard 1970. English as a formal language. In: Bruno Visentini et al. (eds.). Linguaggi nella Società e nella Tecnica. Milan: Edizioni di Comunità, 189-224. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press, 1974, 188-221.
Morgan, Jerry 1973. Sentence fragments and the notion 'sentence'. In: H. R. Kahane, R. Kahane & B. Kachru (eds.). Issues in Linguistics. Urbana, IL: University of Illinois Press, 719-751.
Morrill, Glyn 1990. Intensionality and boundedness. Linguistics & Philosophy 13, 699-726.
Morrill, Glyn 1994. Type Logical Grammar. Dordrecht: Kluwer.
Morrill, Glyn 2010. Categorial Grammar: Logical Syntax, Semantics, and Processing. Oxford: Oxford University Press.
Morrill, Glyn & Anna Gavarró 2004. On aphasic comprehension and working memory load. In: Proceedings of Categorial Grammars: An Efficient Tool for Natural Language Processing. Montpellier, 259-287.
Muskens, Reinhard 1996. Combining Montague semantics and Discourse Representation. Linguistics & Philosophy 19, 143-186.
Muskens, Reinhard, Johan van Benthem & Albert Visser 1997. Dynamics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587-648.
Partee, Barbara 1973. Some structural analogies between tenses and pronouns in English. The Journal of Philosophy 70, 601-609.
Piwek, Paul 1998. Logic, Information and Conversation. Ph.D. dissertation. University of Technology, Eindhoven.
Prawitz, Dag 1965. Natural Deduction: A Proof-Theoretical Study. Uppsala: Almqvist & Wiksell.
Purver, Matthew, Ronnie Cann & Ruth Kempson 2006. Grammars as parsers: The dialogue challenge. Research on Language and Computation 4, 289-326.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Ranta, Aarne 1994. Type-Theoretical Grammar. Oxford: Clarendon Press.
Reyle, Uwe 1993. Dealing with ambiguities by underspecification: Construction, representation and deduction. Journal of Semantics 10, 123-179.
Sag, Ivan, Tom Wasow & Emily Bender 2002. Head-Driven Phrase Structure Grammar: An Introduction. 2nd edn. Stanford, CA: CSLI Publications.
Sperber, Dan & Deirdre Wilson 1986/1995. Relevance: Communication and Cognition. Oxford: Blackwell.
Stainton, Robert 2006. Words and Thoughts. Oxford: Oxford University Press.
Stalnaker, Robert 1970. Pragmatics. Synthese 22, 272-289.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press.
Thomason, Richmond (ed.) 1974. Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale University Press.
van Eijck, Jan & Hans Kamp 1997. Representing discourse in context. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 179-239.
von Heusinger, Klaus 1997. Definite descriptions and choice functions. In: S. Akama (ed.). Logic, Language and Computation. Dordrecht: Kluwer, 61-92.
von Heusinger, Klaus & Ruth Kempson (eds.) 2004. Choice Functions in Linguistic Theory. Special issue of Research on Language and Computation 2.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by G.E.M. Anscombe. 3rd edn. Oxford: Blackwell, 1967.

Ruth Kempson, London (United Kingdom)
III. Methods in semantic research

12. Varieties of semantic evidence

1. Introduction: Aspects of meaning and possible sources of evidence
2. Fieldwork techniques in semantics
3. Communicative behavior
4. Behavioral effects of semantic processing
5. Physiological effects of semantic processing
6. Corpus-linguistic methods
7. Conclusion
8. References

Abstract

Meanings are the most elusive objects of linguistic research. The article summarizes the types of evidence we have for them: various types of metalinguistic activities like paraphrasing and translating, the ability to name entities and judge sentences true or false, as well as various behavioral and physiological measures such as reaction time studies, eye tracking, and electromagnetic brain potentials. It furthermore discusses the specific types of evidence we have for different kinds of meanings, such as truth-conditional aspects, presuppositions, implicatures, and connotations.

1. Introduction: Aspects of meaning and possible sources of evidence

1.1. Why meaning is a special research topic

If we ask an astronomer for evidence for phosphorus on Sirius, she will point out that spectral analysis of the light from this star reveals bands that are characteristic of this element, as they also show up when phosphorus is burned in the lab. If we ask a linguist the more pedestrian question what evidence there is that a certain linguistic expression - say, the sentence The quick brown fox jumps over the lazy dog - has meaning, answers are probably less straightforward and predictable. He might point out that speakers of English generally agree that it has meaning - but how do they know? So it is perhaps not an accident that the study of meaning is the subfield of linguistics that developed only very late in the 2500-year history of linguistics, in the 19th century (cf. article 9 (Nerlich) Emergence of semantics). The reason why it is difficult to imagine what evidence for meaning could be is that it is difficult to say what meaning is. According to a common assumption, communication consists in putting meaning into a form, a form that is then sent from the speaker to the addressee (the conduit metaphor of communication, see Lakoff & Johnson 1980). Aspects that are concerned with the form of linguistic expressions and their material realization are studied in syntax, morphology, phonology and phonetics; they are generally
more tangible than aspects concerned with their content. But semanticists in general hold that semantics, the study of linguistic meaning, indeed has an object to study that is related to, but distinct from, the forms in which it is encoded, from the communicative intentions of the speaker and from the resulting understanding of the addressee.

1.2. Aspects of meaning

The English noun meaning is multiply ambiguous, and there are several readings that are relevant for semantics. One branch of investigation starts out with meaning as a notion rooted in communication. Grice (1957) has pointed out that we can ask what a speaker meant by uttering something, and what the utterance means that the speaker uttered. Take John F. Kennedy's utterance of the sentence Ich bin ein Berliner on June 26, 1963. What JFK meant was that in spite of the cold war, the USA would not surrender West Berlin - which was probably true. What the utterance meant was that JFK is a citizen of Berlin, which was clearly false. Obviously, the speaker's meaning is derived from the utterance meaning and the communicative situation in which it was uttered. The way this is derived, however, is less obvious - cf. article 2 (Jacob) Meaning, intentionality and communication, especially on particularized conversational implicatures. A complementary approach is concerned with the meaning of linguistic forms, sometimes called literal meanings, like the meaning of the German sentence Ich bin ein Berliner which was uttered by JFK to convey the intended utterance meaning. With forms, one can distinguish the following aspects of meaning (cf. also article 3 (Textor) Sense and reference, and article 4 (Abbott) Reference). The character is the meaning independent from the situation of utterance (like speaker, addressee, time and location - see Kaplan 1978). The character of the sentence used by JFK is that the speaker of the utterance is a citizen of Berlin at the time of utterance. If we find a sticky note in a garbage can, reading I am back in five minutes - where we don't know the speaker, the time, or the location of the utterance - we just know the character. A character, supplied with the situation of utterance, gets us the content or intension (Frege's Sinn) of a linguistic form. In the proper historical context, JFK's utterance has the content that JFK is a citizen of Berlin on June 26, 1963. (We gloss over the fact here that this first has to be decoded as a particular speech act, like an assertion.) This is a proposition, which can be true or false in particular circumstances. The extension or reference of an expression (Frege's Bedeutung) is its content when applied to the situation of utterance. In the case of a proposition, this is a truth value; in the case of a name or a referring expression, this is an entity. Sometimes meaning is used in a narrower sense, as opposed to reference; here I have used meaning in an encompassing way. Arguably, the communicative notion of meaning is the primary one. Meaning is rooted in the intention to communicate. But human communication crucially relies on linguistic forms, which are endowed with meaning as outlined, and for which speakers can construct meanings in a compositional way (see article 6 (Pagin & Westerstahl) Compositionality). Semantics is concerned with the meaning of linguistic forms, a secondary and derived notion. But the use of these forms in communication is crucial data to re-engineer the underlying meaning of the forms.
The ways in which literal meanings are used in acts of communication, and their effects on the participants, are in general part of pragmatics (cf. article 88 (Jaszczolt) Semantics and pragmatics).
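The character/content/extension layering just described lends itself to a functional rendering. The following minimal sketch is purely illustrative (the Utterance record and the single toy "fact" making someone a citizen of Berlin in 1963 are invented): a character maps an utterance situation to a content, and the content maps circumstances of evaluation to an extension, here a truth value:

-- Character, content and extension as nested functions.
type Entity = String
type World  = Int                        -- circumstances of evaluation

data Utterance = Utterance { speaker :: Entity, time :: Int }

type Content   = World -> Bool           -- a proposition
type Character = Utterance -> Content

-- An invented toy extension for 'is a citizen of Berlin at w'.
citizenOfBerlin :: Entity -> World -> Bool
citizenOfBerlin p w = (p, w) `elem` [("Willy Brandt", 1963)]

-- "Ich bin ein Berliner": the character fixes the speaker parameter;
-- what remains is a proposition, true or false at a circumstance.
ichBinEinBerliner :: Character
ichBinEinBerliner u = citizenOfBerlin (speaker u)

Evaluating ichBinEinBerliner at an utterance situation fixes the speaker (the character stage); the resulting proposition can then be checked against different circumstances (the extension stage).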
1.3. Types of access to meaning

Grounding meaning of linguistic expressions in communication suggests that there are various kinds of empirical evidence for meaning. First, we can observe the external behavior of the participants in, before and after the act of communication. Some kinds of behavior can be more directly related to linguistic meaning than others, and hence will play a more central role in discovering underlying meaning. For example, commands often lead to a visible non-linguistic reaction, and simple yes/no-questions will lead to linguistic reactions that are easily decodable. Secondly, we can measure aspects of the external behavior in detail, like the reaction times to questions, or the speed with which passages of text are read (cf. article 102 (Frazier) Meaning in psycholinguistics). Third, we can observe physiological reactions of participants in communication, like the changing size of their pupils, the saccades of the eyes reading a text, the eye gaze when presented with a visual input together with a spoken comment, or the electromagnetic field generated by their cortex (cf. article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods, and article 105 (Pancheva) Meaning in neurolinguistics). Fourth, we can test hypotheses concerning meaning on the output of linguistic forms itself, using statistical techniques applied to corpora (cf. article 109 (Katz) Semantics in corpus linguistics).

1.4. Is semantics possible?

The reader should be warned that correlations between meanings and observable phenomena like non-linguistic patterns of behavior or brain scans do not guarantee that the study of meaning can be carried out successfully. Leonard Bloomfield, a behavioralist, considered the observable effects so complex and interwoven with other causal chains that the science of semantics is impossible:

We have defined the meaning of a linguistic form as the situation in which the speaker utters it and the response which it calls forth in the hearer. [...] In order to give a scientifically accurate definition of meaning for every form of a language, we should have to have a scientifically accurate knowledge of everything in the speaker's world. The actual extent of human knowledge is very small compared to this. [...] The statement of meanings is therefore the weak point in language-study, and will remain so until human knowledge advances very far beyond its present state. (Bloomfield 1933: 139f)

We could imagine similar skepticism concerning the science of semantics from a neuroscientist believing that meanings are activation patterns in our heads. The huge number of such patterns, and their variation across individuals that we certainly have to expect, seem to preclude that they will provide the foundation for the study of meaning. Despite Bloomfield's qualms, the field of semantics has flourished. Where he went wrong was in believing that we have to consider the whole world of the speaker, or the speaker's whole brain. There are ways to cut out phenomena that stand in relation to, and bear evidence for, meanings in much more specific ways. For example, we can investigate whether a specific sentence in a particular context and describing a particular situation is considered true or false, and derive from that hypotheses about the meaning of the
sentence and the meaning of the words involved in that sentence. The usual methods of science - forming hypotheses and models, deriving predictions, making observations and constructing experiments that support or falsify the hypotheses - have turned out to be applicable to linguistic semantics as well.

1.5. Native semantic activities

There are many native activities that directly address aspects of meaning. When Adam named the animals of paradise, he assigned expressions to meanings, as we do today when naming things or persons or defining technical terms. We explain the meaning of words or idioms by paraphrasing them - that is, by offering different expressions with the same or at least similar meanings. We can refer to aspects of meaning: we say that one expression means the same as another one, or its opposite; we say that one expression refers to a subcase of another. As for speaker's meanings, we can elaborate on what someone meant by such-and-such words, and can point out differences between that and what the words actually meant or how they were understood by the addressee. Furthermore, for human communication to work it is crucial that untruthful use of language can be detected, and liars can be identified and punished. For this, a notion of what it means for a sentence or text to be true or false is crucial. Giving a statement in court means knowing what it means to speak the truth, and the whole truth. Hence, it seems that meanings are firmly established in the pre-scientific ways we talk about language. We can translate, that is, rephrase an expression in one language by an expression in another while keeping the meaning largely constant. We can teach the meaning of words or expressions to second language learners or to children acquiring their native language - even though both groups, in particular children in first language acquisition, will acquire meanings to a large part implicitly, by contextual clues. The sheer possibility of translation has been enormously important for the development of humankind. We find records of associated practices, like the making of dictionaries, dating back to Sumerian-Akkadian glossaries of 2300 BC. These linguistic activities show that meaning is a natural notion, not a theoretical concept. They also provide an important source of evidence for meaning. For example, it would be nearly impossible to construct a dictionary in linguistic fieldwork without being able to ask what a particular word means, or how a particular object is called. As another example, it would be foolish to dismiss the monumental achievements of the art of dictionary writing as evidence for the meaning of words. But there are problems with this kind of evidence that one must be aware of. Take dictionary writing. Traditional dictionaries are often unsystematic and imprecise in their description of meaning. They do not distinguish systematically between contextual (or "occasional") meaning and systematic meaning, nor do they keep ambiguity and polysemy apart in a rigorous way. They often do not distinguish between linguistic aspects and more general cultural aspects of the meaning and use of words. Weinreich (1964) famously criticized the 115 meanings of the verb to turn that can be found in Webster's Third Dictionary. Lexicography has greatly improved since then, with efforts to define lexical entries by a set of basic words and by recognizing regularities like systematic variations between word meanings (e.g.
the intransitive use of transitive verbs, or the polysemy triggered in particular contexts of use).
1.6. Talking about meanings

Pre-scientific ways to address meanings rely on an important feature of human language, its self-reference - we can use language to talk about language. This feature is so entrenched in language that it went unnoticed until logicians like Frege, Russell and Tarski, working with much more restricted languages, pointed out the importance of the metalanguage / object language distinction. It is only quite recently that we distinguish between regular language, reference to expressions, and reference to meanings by typographical conventions, and write things like "XXX means 'YYY'". The possibility to describe meanings may be considered circular - as when Tarski states that 'Snow is white' is true if and only if snow is white. However, it does work under certain conditions. First, the meaning of an unknown word can be described, or at least delimited, by an expression that uses known words; this is the classical case of a definition. If we had only this procedure available as evidence for meaning, things would be hopeless, because we have to start somewhere with a few expressions whose meanings are known; but once we have those, they can act as bootstraps for the whole lexicon of a language. The theory of Natural Semantic Metalanguage even claims that a small set of concepts (around 200) and a few modes of combining them are sufficient to achieve access to the meanings of all words of a language (Goddard 1998). Second, the meanings of an ambiguous word or expression can be paraphrased by expressions that have only one or the other meaning. This is common practice in linguistic semantics, e.g. when describing the meanings of He saw that gasoline can explode as (a) 'He saw an explosion of a can of gasoline' and (b) 'He recognized the fact that gasoline is explosive'. Speakers will generally agree that the original sentence has the two meanings teased apart by the paraphrases. There are variations on this access to meaning. For example, we might consider a sentence in different linguistic contexts and observe differences in the meaning of the sentence by recognizing that it has to be paraphrased differently. For the paraphrases, we can use a language that has specific devices that help to clarify meanings, like variables. For example, we can state that a sentence like Every man likes a woman that likes him has a reading 'Every man x likes a woman y that likes x', but not 'There is a woman y that every man x likes and that likes x'. The disadvantage of this is that the paraphrases cannot be easily grasped by naive native speakers. In the extreme case, we can use a fully specified formal language to specify such meanings, such as first-order predicate logic; the existing reading of our example then could be specified as ∀x[man(x) → ∃y[woman(y) ∧ likes(x, y) ∧ likes(y, x)]]. Talking about meanings is a very important source of evidence for meanings. However, it is limited not only by the problem mentioned above, that it describes meanings with the help of other meanings. There are many cases where speakers cannot describe the meanings of expressions because this task is too complex - think of children acquiring their first language, or aphasics losing the capacity of language. And there are cases in which the description of meanings would be too complex for the linguist. We may think of first fieldwork sessions in a research project on an unknown language.
Somewhat closer to home, we may also think of the astonishingly complex meanings of natural language determiners such as a, some, a certain, a particular, a given or indefinite this in there was this man standing at the door, whose meanings had to be teased apart by careful considerations of their acceptability in particular contexts.
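To make the scopal contrast fully explicit, the two candidate readings of Every man likes a woman that likes him can be rendered in standard first-order notation (the predicate names are merely mnemonic):

\[ \text{(i)}\quad \forall x\,[\mathrm{man}(x) \rightarrow \exists y\,[\mathrm{woman}(y) \wedge \mathrm{likes}(x,y) \wedge \mathrm{likes}(y,x)]] \]

\[ \text{(ii)}\quad \exists y\,[\mathrm{woman}(y) \wedge \forall x\,[\mathrm{man}(x) \rightarrow \mathrm{likes}(x,y) \wedge \mathrm{likes}(y,x)]] \]

The formulas differ only in the relative scope of the two quantifiers; the observation that speakers accept (i) but not (ii) as a reading of the sentence is itself a semantic datum of the kind at issue in this article.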
2. Fieldwork techniques in semantics

In this section we will discuss various techniques that have been used in linguistic fieldwork, understood in a wide sense so as to include work on one's own language and on language acquisition, for example. There are a variety of sources that reflect on possible procedures; for example, the authors in McDaniel, McKee & Cairns (eds.) (1996) discuss techniques for the investigation of syntax in child language, many of which also apply to semantic investigations, and Matthewson (2004) is concerned with techniques for semantic research on American languages, which, of course, are applicable to work on other languages as well (cf. also article 13 (Matthewson) Methods in cross-linguistic semantics, and article 103 (Crain) Meaning in first language acquisition).

2.1. Observation, transcription and translation

The classical linguistic fieldwork method is to record conversations and texts in natural settings, transcribe them, and assign translations, ideally with the help of speakers who are competent in a language that they share with the investigator. In classical American structuralism, this was the method de rigueur, and it is certainly of great importance when we want to investigate natural use of language. However, this technique is also severely limited. First, even large text collections may not provide the evidence that distinguishes between different hypotheses. Consider superlatives in English: is John is the tallest student true if John and Mary are both students of the same height who are taller than any other student? Competent English speakers say no, superlatives must be unique - but it might be impossible to find this out on the basis of a corpus of non-elicited text. Secondly, there is the problem of translation. Even when we grant that the translation is competent according to usual standards, it is not clear how we should deal with distinctions in the object language that are not easily made in the metalanguage. For example, Matthewson (2004) shows that in Menominee (Algonquian, Northern Central United States), inalienable nouns can have a prefix me- indicating an arbitrary owner, as contrasted with a prefix o- indicating a specific 3rd person owner. This difference could not be derived from simple translations of Menominee texts into English, as English does not make this distinction. There is also the opposite problem of distinctions that are forced on us by the metalanguage; for example, pronouns in English referring to humans distinguish two genders, which may not be a feature of the object language. Hence, as Matthewson puts it, translations should be seen as clues for semantic analysis, rather than as its result. Translations, or more generally paraphrases, are problematic as evidence for meaning for more fundamental reasons, as they explain the meaning of an expression α by way of the meaning of an expression β; hence they presuppose the existence and knowledge of meanings, and a judgment of similarity of meaning. However, it appears that without accepting this type of hermeneutic circle the study of semantics could not get off the ground. But there are methods to test, by independent means, hypotheses that have first been generated with the help of translations and paraphrases.

2.2. Pointing

Pointing is a universal non-linguistic human behavior that aligns with aspects of the meanings of certain types of linguistic expressions (cf. also article 90 (Diessel) Deixis and
demonstratives). Actually, pointing may be as characteristic of humans as language, as humans appear to be the only apes that point (cf. Tomasello 2008). Pointing is most relevant for referring expressions, with names as the prototypical example (cf. article 4 (Abbott) Reference). These expressions denote a particular entity that is also identified by the pointing gesture, and hence pointing is independent evidence for the meaning of such expressions. For example, if in a linguistic fieldwork situation an informant points to a person and utters Max, this might be taken to be the name of that person. We can conclude that Max denotes that person; in other words, that the meaning of Max is the person pointed at. Simple as this scenario is, there are certain prerequisites for it to work. For example, the pointing gesture must be recognized as such; in different cultures, the index finger, the stretched-out hand, or an upward movement of the chin may be used, and in some cultures there may be a taboo against pointing gestures directed at humans. Furthermore, there must be one most salient object in the pointing cone (cf. Kranstedt et al. 2006) that will then be identified. This presupposes a pre-linguistic notion of objects, and of saliency. This might work well when persons or animals are pointed at, as they are cognitively highly salient. But mistakes can occur when there is more than one equally salient object in the pointing cone. When Captain Cook on his second voyage visited an island in the New Hebrides with friendly natives and tried to communicate with them, he pointed to the ground. What he heard was tanna, which he took to be the name of the island, which is still known under this name. Yet the meaning of tanna in all Melanesian languages is simply "earth". The native name for Tanna is reported to be parei (Gregory 2003); it is not in use anymore.

Pointing gestures may also help to identify the meaning of common nouns, adjectives, or verbs - expressions that denote sets of entities or events. The pointing is directed towards a specimen, but reference is to entities of the same type as the one pointed at. There is an added source of ambiguity or vagueness here: What is "the same type as"? On his first voyage, Captain Cook made landfall in Australia and observed creatures with rabbit-like ears hopping on their hind legs. When the naturalist Joseph Banks asked the local Guugu Yimidhirr people what they were called, presumably with the help of some pointing gesture, he was the first to record the word kangaroo. But the word gangurru actually just refers to a large species of black kangaroo, not to the marsupial family in general (cf. Haviland 1974).

Quine (1960: ch. II), in an argument to discount the possibility of true translation, famously described the problems that even a simple act like pointing and naming might involve. Assume a linguist points to a white rabbit and gets the response gavagai. Quine asks whether this may mean 'rabbit', or perhaps 'animal', or perhaps 'white', or perhaps even 'non-detached rabbit parts'. It might also mean 'rabbit stage', in which case repeated pointing will identify different reference objects. All these options are theoretical possibilities under the assumption that words can refer to arbitrary aspects of reality. However, it is now commonly assumed that language is built on broad cognitive commonalities about entities and classes.
There is evidence that pre-linguistic babies and higher animals have concepts of objects (as contrasted with substances) and animals (as contrasted with lifeless things) that preclude a conceptualization of a rabbit as a set of rabbit legs, a rabbit body, a rabbit head and a pair of rabbit ears moving in unison. Furthermore, there is evidence that objects are named with terms from a middle layer of a
taxonomic hierarchy, the so-called "generic level", avoiding terms that are too general or too specific (cf. Berlin, Breedlove & Raven 1973). Hence a rabbit will not be called thing in English, or animal, and it will not be called English angora either, except perhaps by rabbit breeders, who work with a different taxonomy. This was the reason for Captain Cook's misunderstanding of gangurru; the native Guugu Yimidhirr people had a different, and more refined, taxonomic hierarchy for Australian animals, where species of kangaroo formed the generic level; for the British visitors the family itself belonged to that level.

Pointing, or related gestures, have been used to identify the meaning of words. For example, in the original study of Berlin & Kay (1969) on color terms, subjects were presented with a two-dimensional chart of 320 colors varying according to spectral color and saturation. The task was to identify the best specimen for a particular color word (the focal color) and the extent to which colors fall under a particular color word. Similar techniques have been used for other lexical fields, for example for the classification of vessels using terms like cup, mug or pitcher (cf. Kempton 1981; see Fig. 12.1).

[Fig. 12.1, a two-dimensional array of schematic vessel drawings, is not reproduced here.]
Fig. 12.1: Vessel categories after Kempton (1981: 103). Bold lines: Identification with >80% agreement between subjects for mug and coffee cup. Dotted lines: Hypothetical concept that would violate connectedness and convexity (see below)

Tests of this type have been carried out in two ways: Either subjects were presented with a field of reference objects ordered along certain dimensions; e.g. Berlin & Kay (1969) presented colors ordered by their wavelength (the order in which they appear in a rainbow) and by their saturation (with white and black as the extremes). Kempton's
vessels were presented as varying in two dimensions: the relation between the upper and lower diameters, and the relation between height and width. When judging whether certain items fall under a term or not, the neighboring items that have already been classified might influence the decision. Another technique, which was carried out in the World Color Survey (see Kay et al. 2008), presented color chips in random order to avoid this kind of influence.

The pointing test can be used in two ways: Either we point at an entity in order to get the term that is applicable to that entity, or we have a term and point to various objects to find out whether the term is applicable. The first approach asks an onomasiological question; it is concerned with the question: What is this thing called? The second approach asks the complementary semasiological question: What does this expression mean?

Within a Fregean theory of meaning, a distinction is made between reference and sense (cf. article 3 (Textor) Sense and reference). With pointing to concrete entities we gain access to the reference of expressions, and not to the sense, the concept that allows us to identify the reference. But by varying potential reference objects we can form hypotheses about the underlying concept, even though we can never be certain that by a variation of reference objects we will uncover all aspects of the underlying concept. Goodman (1955) illustrated this with the hypothetical adjective grue that, say, refers to green objects when used before the year 2100 and to blue objects when used after that time; no pointing experiment executed before 2100 could differentiate grue from green. Meaning shifts like that do happen historically: The term Scotia referred to Ireland before the 11th century, and afterwards to Scotland; the German term gelb 'yellow' was reduced in extension when the term orange entered the language (cf. the traditional regional term Gelbe Rüben 'yellow turnips' for carrots). But these are language changes, not meanings of items within a language. A meaning like the hypothetical grue appears as strange as reference to non-detached rabbit parts. We work under the hypothesis that the meanings of lexical items are restricted by general principles of uniformity over time. There are other such principles that restrict possible meanings, for example connectedness and, more specifically, convexity (Gärdenfors 2000). In the vessel example above, where potential reference objects were presented along certain dimensions, we expect that concepts do not apply to discontinuous areas and have the general property that when x is an a and y is an a, then everything in between x and y is an a as well. The dotted lines in Fig. 12.1 represent an extension of a concept that would violate connectedness and convexity (a small computational illustration is given at the end of this subsection).

In spite of all its problems, pointing is the most elementary kind of evidence for meaning, without which linguistic fieldwork, everyday communication and language acquisition would be impossible. Yet it seems that little research has been done on pointing and language acquisition, be it first or second. Its importance, however, was recognized as early as St. Augustine's Confessions (ca. 400 AD), where he writes about his own learning of language:

When they [the elders] called some thing by name and pointed it out while they spoke, I saw it and realized that the thing they wished to indicate was called by the name they then uttered.
And what they meant was made plain by the gestures of their bodies, by a kind of natural language, common to all nations [...] (Confessions, Book 1:8)
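The convexity constraint can be made concrete with a small computational sketch. The following Python fragment is a minimal illustration, not taken from the literature; the discrete grid and the example categories are invented. A concept is modeled as a set of points in a two-dimensional feature space, such as Kempton's vessel dimensions:

# A concept is modeled as a set of points in a discrete 2-D feature space,
# e.g. relative diameter vs. relative height of a vessel. Convexity in
# Gärdenfors' sense: if x and y fall under the concept, so does everything
# on the line segment between them.

def is_convex(concept, steps=20):
    """Approximate convexity check: sample points on the segment between
    every pair of members and test whether their nearest grid point is
    also a member of the concept."""
    pts = list(concept)
    for (x1, y1) in pts:
        for (x2, y2) in pts:
            for i in range(steps + 1):
                t = i / steps
                # a point on the segment between the two members
                px, py = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
                if (round(px), round(py)) not in concept:
                    return False
    return True

# A compact, convex region (like the bold-lined 'mug' area in Fig. 12.1) ...
mug = {(x, y) for x in range(2, 5) for y in range(2, 5)}
# ... versus a discontinuous region like the dotted lines in Fig. 12.1.
gappy = mug | {(8, 8)}

print(is_convex(mug))    # True
print(is_convex(gappy))  # False

Such a check obviously presupposes that the relevant dimensions of the conceptual space have been fixed in advance, which is itself a substantive hypothesis.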
2.3. Truth value judgments (TVJ)

Truth value judgments do the same job for the meaning of sentences as pointing does for referring expressions. In the classical setup, a situation is presented by non-linguistic means together with a declarative sentence, and the speaker has to indicate whether this sentence is true or false with respect to the situation. This judgement is a linguistic act itself, so it can be doubted that this provides a way to base the study of meaning wholly outside of language. But arguably, agreeing or disagreeing are more primitive linguistic acts that may even rely on simple gestures, just as in the case of pointing.

The similarity between referential expressions - which identify objects - and declarative sentences - which identify states of affairs in which they are true - is related to Frege's identification of the reference of sentences with their truth value with respect to a particular situation (even though this was not the original motivation for this identification, cf. Frege 1892). This is reflected in the two basic extensional types assumed in sentence semantics: type e for entities referred to by names, and type t for truth values referred to by sentences. But there is an important difference here: There are many distinct objects - De, the universe of discourse, is typically large - but there are just two (basic) truth values - Dt, the set of truth values, is standardly {0, 1}, falsity and truth. Hence we can distinguish referring expressions by their reference more easily than we can distinguish declarative sentences. One consequence of this is that onomasiological tests do not work: We cannot present a "truth value" and expect a declarative sentence that is true. Also, on presenting a situation in a picture or a little movie we cannot expect that the linguistic reactions are as uniform as when we, say, present the picture of an apple. But the semasiological direction works fine: We can present speakers with a declarative sentence and a situation or a set of situations and ask whether the sentence is true in those situations.

Truth values are not just an ingenious idea of language philosophers to reduce the meaning of declarative sentences to judgments whether a sentence is true or false in given situations. They are used pre-theoretically, e.g. in court procedures. Within linguistics, they are used to investigate the meaning of sentences in experiments and in linguistic fieldwork. They have been particularly popular in the study of language acquisition because they require a rather simple reaction from the child that can be expected even from two-year-olds.

The TVJ task comes in two flavors. In both, the subjects are presented with a sentence and a situation, specified by a picture, an acted-out scene with hand puppets or a movie, or by the actual world, provided that the subjects have the necessary information about it. In the first version, the subjects simply state whether the sentence is true or false. This can be done by a linguistic reaction, by a gesture, by pressing one of two buttons, or by ticking off one of two boxes. We may also record the speed of these reactions in order to get data about the processing of expressions. In the second version, there is a character, e.g. a hand puppet, that utters the sentence in question, and the subjects should reward or punish the character depending on whether the sentence is true or false with respect to the situation presented (see e.g. Crain 1991).
A reward could be, for example, feeding the hand puppet a cookie. Interestingly, the second procedure taps into cognitive resources of children that are otherwise not as easily accessible. Gordon (1998), in a description of TVJ in language acquisition, points out that this task is quite natural and easy. This is presumably so because truth value judgment is an elementary linguistic activity, in contrast to, say, grammaticality judgments. TVJ also puts
fewer demands on answers than wh-questions (e.g. Who chased the zebra? vs. Did the lion chase the zebra?). This makes it the test of choice for children and for language-impaired persons. But there are potential problems in carrying out TVJ tasks. For example, Crain et al. (1998) have investigated the phenomenon that children seem to consider a sentence like Every farmer is feeding a donkey false if there is a donkey that is not fed by a farmer. They argue that children are confused by the extra donkey and try to reinterpret the sentence in a way that seems to make sense. A setup in which attention is not drawn to a single object might be better; even adding a second unfed donkey makes the judgments more adult-like. Also, children respond better to scenes that are acted out than to static pictures. In designing TVJ experiments, one should consider the fact that positive answers are given more quickly and easily than negative ones. Furthermore, one should be aware that unconscious reactions of the experimenter may provide subtle clues for the "right" answer (the "Clever Hans" effect, named after the horse that supposedly could solve arithmetic problems). For example, when acting out and describing a scene, the experimenter may be more hesitant when uttering a false statement.

2.4. TVJ and presuppositions/implicatures

There are different aspects of meaning beyond the literal meaning, such as presuppositions, conventional implicatures, conversational implicatures and the like, and it would be interesting to know how such meaning components fare in TVJ tasks. Take presuppositions (cf. also article 91 (Beaver & Geurts) Presupposition). Theories such as Stalnaker (1974) that treat them as preconditions of interpretation predict that sentences cannot be interpreted with respect to situations that violate their presuppositions. The TVJ test does not seem to support this view. The sentence The dog is eating the bone will most likely be judged true with respect to a picture showing two dogs, where one of the dogs is eating a bone. This may be considered evidence for accommodation, which here consists in restricting the context to the one dog that is eating a bone. Including a third option or truth value like "don't know" might reveal the specific meaning contribution of presuppositions.

As for conversational implicatures (cf. article 92 (Simons) Implicature), we appear to get the opposite picture. TVJ tests have been used to check the relevance of scalar implicatures. For example, Noveck (2001), building on work by Smith (1980), argued that children are "more logical" than adults because they can dissociate literal meanings from scalar implicatures. Children up to 11 years react to statements like some giraffes have long necks (where the picture shows that all giraffes have long necks) with an affirmative answer, while most adults find them inappropriate.

2.5. TVJ variants: Picture selection and acting out

The picture selection task has been applied for a variety of purposes beyond truth values (cf. Gerken & Shady 1998). But for the purpose of investigating sentence meanings, it can be seen as a variant of the TVJ task: The subject is exposed to a declarative sentence and two or more pictures and has to identify the picture for which the sentence is true. It is good to include irrelevant pictures as filler items, which can test the attention of
the subjects. The task can be used to identify situations that fit a sentence best. For example, for sentences with presuppositions it is expected that a picture will be chosen that satisfies not only the assertion but also the presupposition. So, if the sentence is The dog is eating a bone, and if one picture shows a single dog and another shows two dogs, then presumably the picture with one dog will be preferred. Also, sentences whose scalar implicature is satisfied will be preferred over those for which this is not the case. For example, if the sentence is some giraffes have long necks, a picture in which some but not all giraffes have long necks will be preferred over a picture in which all giraffes are long-necked. Another relative of the TVJ task is the Act Out task, in which the subject has to "act out" a sentence with a scene such that the sentence is true. Again, we should expect that sentences are acted out in a way that satisfies all meaning components of the sentence - assertion, presupposition, and implicature.

2.6. Restrictions of the TVJ methodology

One restriction of the various TVJ methodologies appears to be that they only target expressions that have a truth value, that is, sentences. However, they allow us to investigate the meaning of subsentential expressions, under the assumption that the meaning of sentences is computed in a compositional way from the meanings of their syntactic parts (cf. article 6 (Pagin & Westerståhl) Compositionality). For example, the meaning of spatial prepositions like on, on top of, above or over can be investigated with scenes in which objects are arranged in particular ways.

Another potential restriction of TVJ as discussed so far is that we assumed that the situations are presented by pictures. Language is not restricted to encoding information that can be represented by visual stimuli. But we can also present sounds, movie scenes or comic strips that represent temporal developments, or even olfactory and tactile stimuli, to judge the range of meanings of words (cf. e.g. Majid et al. 2006 for verbs of cutting and breaking). TVJ is also difficult to apply when deictic expressions are involved, as they often require reference to the speaker, who is typically not part of the picture. For example, in English the sentence The ball is in front of the tree means that the ball is between the speaker facing the tree and the tree; the superficially corresponding sentence in Hausa means that the ball is behind the tree (cf. Hill 1982). In English, the tree is seen as facing the speaker, whereas in Hausa the speaker aligns with the tree (cf. article 98 (Pederson) The expression of space). Such differences are not normally represented in pictures, but it can be done. One could either represent the picture from a particular angle, or represent a speaker with a particular position and orientation in the picture itself and ask the subject to identify with that figure.

The TVJ technique is systematically limited for sentences that do not have truth values, such as questions, commands, or exclamatives. But we can generalize it to a judgment of appropriateness of sentences given a situation, which is sometimes done to investigate politeness phenomena and the like. There are also subtypes of declarative sentences that are difficult to investigate with TVJ, namely modal statements, e.g. Mary must be at home, or habituals and generics that allow for exceptions, like Delmer walks to school, or Birds fly (cf.
article 47 (Carlson) Genericity). This is arguably so because such sentences require the consideration of different possible worlds, which cannot easily be represented graphically.
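The difficulty can be stated formally. On the standard possible-worlds analysis, sketched here in its simplest textbook form, an epistemic modal statement quantifies over worlds rather than describing the world of evaluation alone:

\[ [\![\textit{Mary must be at home}]\!]^{w} = 1 \quad \text{iff} \quad \forall w'\,[R(w,w') \rightarrow \mathrm{at\text{-}home}(\mathrm{mary})(w')] \]

where R is the contextually relevant accessibility relation. A picture can supply at most one world; whether the sentence is true depends on all the accessible alternatives, which is why no single depicted situation settles the judgment.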
2.7. TVJ with linguistic presentation of the situation

The TVJ technique can be applied to modal or generic statements if we present the situation linguistically, by describing it. For example, we could ask whether Delmer walks to school is true if Delmer walks every day except Fridays, when his father gives him a ride. Of course, this kind of linguistic elicitation technique can be used in nearly all the cases described so far. It has clear advantages: Linguistic descriptions are easy and cheap to produce and can focus the attention of the subject on aspects that are of particular relevance for the task. For this reason it is very popular for quick elicitations of whether a sentence can mean such-and-such. Matthewson (2004) argues that elicitation is virtually the only way to get at more subtle semantic phenomena. She also argues that it can be combined with other techniques, like TVJ and grammaticality judgments. For example, in investigating aspect marking in St'át'imcets (Salishan, Southwestern Canada), the sentence Have you been to Seattle? is translated using an adverb lan that otherwise occurs with the meaning 'already'; a follow-up question could be whether it is possible to drop lan in this context, retaining roughly the same meaning. The linguistic presentation of scenes comes with its own limitations. There is the foundational problem that we get at the meaning of an expression α by way of the meaning of an expression β. And it cannot be applied in case of insufficient linguistic competence, as with young children or language-impaired persons.

2.8. Acceptability tests

In this type of test, speakers are given an expression and a linguistic context and/or a description of an extralinguistic situation, and are asked whether the expression is acceptable with respect to this context or situation. With it, we can explore the felicity conditions of an expression, which are often closely related to certain aspects of its meaning. Acceptability tests are the natural way to investigate presuppositions and conventional implicatures of expressions. For example, additive focus particles like also presuppose that the predication holds for an alternative to the focus item. Hence in a context like John went to Paris, the sentence John also went to PRAGUE is felicitous, but the sentence Mary also went to PRAGUE is not. Acceptability tests can also be used to investigate information-structural distinctions. For example, in English, different accent patterns indicate different focus structures; this can be seen when judging sentences like JOHN went to Paris vs. John went to PARIS in the context of questions like Who went to Paris? and John went where? (cf. article 66 (Krifka) Questions). As another example, Portner & Yabushita (1998) discussed the acceptability of sentences with a topic-comment structure in Japanese where the topic was identified by a noun phrase with a restrictive relative clause, and found that such structures are better if the relative clause corresponds to a comment on the topic in the preceding discourse. Acceptability tests can also be used to test the appropriateness of terms with honorific meaning, or various shades of expressive meaning, which have been analyzed as conventional implicatures by Potts (2005). When applying acceptability judgments, it is natural to present the context first, to preclude that the subject first comes up with other contexts which may influence
the interpretation. Another issue is whether the contexts should be specified in the object language, or can also be given in the metalanguage that is used to carry out the investigation. Matthewson (2004) discusses the various advantages and disadvantages - especially if the investigator has a less-than-perfect command of the object language - and argues that using a metalanguage is acceptable, as language informants generally can resist the possible influence of the metalanguage on their responses.

2.9. Elicited production

We can turn the TVJ test on its head and ask subjects to describe given situations in their own words. In language acquisition research, this technique is known as "elicited production", and it encompasses all linguistic reactions to planned stimuli (cf. Thornton 1998). In this technique the presumed meaning is fixed and controls the linguistic production; we can hypothesize about how this meaning can be represented in language. The best-known example is probably the retelling of a little movie called the Pear Story, which has unearthed interesting differences in the use of tense and aspect distinctions in different languages (cf. Chafe 1980 for the original publication). Another example, which allows one to study the use of meanings in interaction, is the "map task", where one person explains the configuration of objects or a route on a map to another person without visual contact. The main problem of elicited production is that the number of possible reactions by speakers is, in principle, unlimited. It might well be that the type of utterances one expects does not occur at all. For example, we could set up a situation in which person A thinks that person B thinks that person C thinks that it is raining, to test the recursivity of propositional attitude expressions, but we would have to wait long until such utterances are actually produced. So it is crucial to select cues that constrain the linguistic production in a way that ensures that the expected utterances will indeed occur.

2.10. From sentence meanings to word meanings

The TVJ technique and its variants test the meaning of sentences, not of words or subsentential expressions. Also, with elicitation techniques we will often get sentence-like reactions. With elicited translations, it is also advisable to use whole sentences instead of single words or simpler expressions, as Matthewson (2004) argues. It is possible to elicit the basic meaning of nouns or certain verbs directly, but this is impossible for many other words. The ten most frequent words in English are often cited as being the, of, and, a, to, in, is, you, that, it; it would be impossible to ask a naive speaker of English what they mean, or to discover their meanings in other more direct ways, with the possible exception of you. We can derive hypotheses about the meaning of such words by using them in sentences and judging the truth value of the sentences with respect to certain situations, and their acceptability in certain contexts. For example, we can unearth the basic uses of the definite article by presenting pictures containing one or two barking dogs and asking subjects to pick out the best picture for the dog is barking. The underlying idea is that the assignment of meanings to expressions is compositional, that is, that the meaning of a complex expression is a result of the meanings of its parts and the way they are combined.
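To illustrate, here is a minimal model-theoretic sketch in the spirit of the type-e/type-t distinction introduced in section 2.3; the toy situation, the lexicon and all names in it are invented for the illustration:

# A toy situation: the entities it contains and the properties they have.
domain = {'rex', 'tweety'}
dogs = {'rex'}                 # exactly one dog in this situation
barking = {'rex', 'tweety'}

# Type <e,t>: common nouns and verbs denote functions from entities
# (type e, here plain strings) to truth values (type t).
def dog(x):
    return x in dogs

def barks(x):
    return x in barking

# The definite article: denotes the unique entity satisfying its restrictor;
# if there is no unique such entity, its presupposition fails.
def the(pred):
    candidates = [x for x in domain if pred(x)]
    if len(candidates) != 1:
        raise ValueError('presupposition failure: no unique referent')
    return candidates[0]

# "The dog is barking", evaluated compositionally: the subject's referent
# (type e) is computed first and then fed to the predicate (yielding type t).
print(barks(the(dog)))   # True in this situation

In a two-dog situation, the same evaluation raises a presupposition failure rather than returning a plain truth value, which mirrors the "don't know" option discussed in section 2.4.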
3. Communicative behavior

Perhaps the most important function of language is to communicate, that is, to transfer meanings from one mind to another. So we should be able to find evidence for meaning by investigating communicative acts. This is obvious in a trivial sense: If A tells B something, B will often act in certain ways that betray that B understood what A meant. More specifically, we can investigate particular aspects of communication and relate them to particular aspects of meaning. We will look at three examples here: presuppositions, conversational implicatures and focus-induced alternatives.

Presuppositions (cf. article 91 (Beaver & Geurts) Presupposition) are meaning components that are taken for granted, and hence appear to be downtoned. This shows up in possible communicative reactions. For example, consider the following dialogues:

A: Unfortunately, it is raining.
B: No, it isn't.

Here, B denies that it is raining; the meaning component of unfortunately expressing regret by the speaker is presupposed or conventionally implicated.

A: It is unfortunate that it is raining.
B: No, it isn't.

Here, B presupposes that it is raining, and denies that this is unfortunate. In order to deny the presupposed part, other conversational reactions are necessary, like But that's not unfortunate, or But it doesn't rain. Simple and more elaborate denials are a fairly consistent test to distinguish between presupposed and proffered content (cf. van der Sandt 1988).

For conversational implicatures (cf. article 92 (Simons) Implicature) the most distinctive property is that they are cancelable without leading to contradiction. For example, John has three children triggers the scalar implicature that John has exactly three children. But this meaning component can be explicitly suspended: John has three children, if not more. It can be explicitly cancelled: John has three children, in fact he has four. And it does not arise in particular contexts, e.g. in the context of People get a tax reduction if they have three children. This distinguishes conversational implicatures from presuppositions and semantic entailments: John has three children, {if not two / in fact, two} is judged contradictory.

Our last example concerns the introduction of alternatives that are indicated by focus, which in turn can be marked in various ways, e.g. by sentence accent. A typical procedure to investigate the role of focus is the question-answer test (cf. article 66 (Krifka) Questions). Of the following four potential question-answer pairs, (A1-B1) and (A2-B2) are well-formed, but (A1-B2) and (A2-B1) are odd.

A1: Who ate the cake?          A2: What did Mary eat?
B1: MARY ate the cake.         B2: Mary ate the CAKE.
This has been interpreted as saying that the alternatives of the answer have to correspond to the alternatives of the question.

To sum up, using communicative behavior as evidence for meaning consists in evaluating the appropriateness of certain conversational interactions. Competent speakers generally agree on such judgments. The technique has been used in particular to identify, and differentiate between, different meaning components having to do with the presentation of meanings, in particular with information structure.

4. Behavioral effects of semantic processing

When discussing evidence for the meaning of expressions, we have focused so far on the meanings themselves. We can also investigate how semantic information is processed, and thereby get a handle on how the human mind computes meanings. To get information on semantic processing, judgment tasks are often not helpful, and might even be deceiving. We need other types of evidence that arguably stand in a more direct relation to semantic processing. It is customary to distinguish between behavioral data on the one hand, and neurophysiological data that directly reflect brain phenomena on the other. In this section we will focus on behavioral approaches (cf. also article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods).

4.1. Reaction times

The judgment tasks for meanings described so far can also tap into the processing of semantic information if the timing of judgments is considered. The basic assumption is that longer reaction times, everything else being equal, are a sign of semantic processing load. For example, Clark & Lucy (1975) have shown that indirect speech acts take longer to process than direct ones, and attribute this to the additional inferences that they require. Noveck (2004) has shown that the computation of scalar implicatures takes time; people who reacted to sentences like Some elephants are mammals with a denial (because all elephants, and not just some, are) took considerably longer. Kim (2008) has investigated the processing of only-sentences, showing that the affirmative content is evaluated first, and the presupposition is taken into account only afterwards.

Reaction times are relevant for many other psycholinguistic paradigms beyond tasks like TVJ, and can provide hints about semantic processing. One notable example is the semantic phenomenon of coercion, changes of meanings that are triggered by the particular context in which meaning-bearing expressions occur (cf. article 25 (de Swart) Mismatches and coercion). One well-known example is aspectual coercion: Temporal adverbials of the type until dawn select for atelic verbal predicates, hence The horse slept until dawn is fine. But The horse jumped until dawn is acceptable as well, under an iterative interpretation of jump that is not reflected overtly. This adaptation of the basic meaning to fit the requirements of the context should be cognitively costly, and there is indeed evidence for the additional semantic processing involved. Piñango et al. (2006) report on various studies and their own experiments that made use of the dual task interference paradigm: Subjects listen to sentences and, at particular points, deal with an unrelated written lexical decision task. They were significantly slower in deciding this
task just after an expression that triggered coercion (e.g. until in the second example, as compared to the first). This can be taken as evidence for the cognitive effort involved in coercion; notice that there is no syntactic difference between the sentences to which such a reaction time difference could be attributed.

4.2. Reading process: Self-paced reading and eye tracking

Another window into semantic processing is the observation of the reading process. Two techniques have been used: (i) Self-paced reading, where subjects are presented with a text in a word-by-word or phrase-by-phrase fashion; the subject has control over the speed of presentation, which is recorded. (ii) Eye tracking, where the reading movements of the subject are recorded by cameras and matched with the text being read. While self-paced reading is easier to handle as a research paradigm, it has the disadvantage that it might not give fine-grained data, as subjects tend to fall into a rhythmical tapping habit. Investigations of reading have provided many insights into semantic processing; however, it should be kept in mind that by their nature they only help to investigate one particular aspect of language use, one that lacks many features of spoken language.

For example, reading speed has been used to determine how speakers deal with semantic ambiguity: Do they try to resolve it early on, which would mean that they slow down when reading triggers of ambiguity, or do they entertain an underspecified interpretation? Frazier & Rayner (1990) have shown that reading slows down after ambiguous words, as e.g. in The records were carefully guarded {after they were scratched / after the political takeover}, providing evidence for an early commitment to a particular reading. However, with polysemous words, no such slowing could be detected; an example is Unfortunately the newspaper was destroyed, {lying in the rain / managing advertising so poorly}. The newspaper example is a case of coercion, which shows effects of semantic processing under the dual task paradigm (see the discussion of Piñango et al. 2006 above). Indeed, Pickering, McElree & Frisson (2006) have shown that the aspectual coercion cases do not result in increased reading times; thus different kinds of tests seem to differ in their sensitivity. Another area for which reading behavior has been investigated is the time course of pronoun resolution: Are pronouns resolved as early as possible, at the place where they occur, or does the semantic processor procrastinate this decision? According to Ehrlich & Rayner (1983), the latter is the case. They manipulated the distance between an antecedent and its pronoun and showed that distance had an effect on reading times, but only well after the pronoun itself was encountered.
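The inferential logic of such latency comparisons can be illustrated with a small sketch; the numbers are invented, scipy is assumed to be available, and real studies involve many subjects and items and typically mixed-effects models rather than a bare t-test:

from statistics import mean
from scipy import stats

# Invented per-trial reading times (ms) on the region after the critical word.
coercion_rts = [412, 455, 430, 478, 441, 465, 420, 490]   # "jumped until dawn"
control_rts  = [380, 362, 395, 370, 388, 401, 365, 379]   # "slept until dawn"

print(f"coercion mean: {mean(coercion_rts):.0f} ms")
print(f"control mean:  {mean(control_rts):.0f} ms")

# A longer mean latency in the coercion condition, if statistically reliable,
# is taken as a sign of extra semantic processing load, all else being equal.
t, p = stats.ttest_ind(coercion_rts, control_rts)
print(f"t = {t:.2f}, p = {p:.4f}")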
4.3. Preferential looking and the visual world paradigm

Visual gaze and eye movements can be used in other ways as windows on meaning and semantic processing. One technique to investigate language understanding is the preferential looking paradigm, a version of the picture selection task that can be administered to young infants. Preferential looking has been used for the investigation of stimulus discrimination, as infants look at new stimuli longer than at stimuli they are already accustomed to. For the investigation of semantic abilities, so-called "Intermodal Preferential Looking" is used: Infants hear an expression and are presented at the same time with two pictures or movie scenes side by side; they preferentially look at the one that fits the description best. Hirsh-Pasek & Golinkoff (1996) have used this technique to investigate the understanding of sentences by young children who produce only single-word utterances.

A second procedure that uses eye gaze is known as the "Visual World Paradigm". The general setup is as follows: Subjects are presented with a scene and a sentence or text, and have to judge whether the sentence is true with respect to the scene. In order to perform this verification, subjects have to glance at particular aspects of the scene, which yields clues about the way the sentence is verified or falsified, that is, how it is semantically processed. In an early study, Eberhard et al. (1995) showed that eye gaze tracks semantic interpretation quite closely. Listeners use information on a word-by-word basis to reduce the set of possible visual referents to the intended one. For example, when instructed to Touch the starred yellow square, subjects were quick to look at the target in the left-hand situation of Fig. 12.2, slower in the middle situation, and slowest in the right-hand situation. Sedivy et al. (1999) have shown that there are similar effects of incremental interpretation even with non-intersective adjectives, like tall.

[Fig. 12.2, three displays of colored shapes that differ in how early the spoken instruction disambiguates the target, is not reproduced here.]
Fig. 12.2: Stimulus of eye gaze test (from Eberhard et al. 1995)

Altmann & Kamide (1999) have shown that eye gaze is not just cotemporaneous with interpretation, but may jump ahead; subjects listening to The boy will eat the... looked preferentially at the picture of a cake rather than at the picture of something non-edible. In a number of studies, including Weber, Braun & Crocker (2006), the effect of contrastive accent has been studied. When listeners had already fixated one object - say, the purple scissors - and are now asked to touch the RED scissors (where there is a competing red vase), they gaze at the red scissors more quickly, presumably because the scissors property is given. This effect is also present, though weaker, without contrastive accent, presumably because the use of modifying adjectives is inherently contrastive. For another example of this technique, see article 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods.
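A common way to analyze such data is to compute, for each time bin relative to the onset of the critical word, the proportion of trials on which gaze is on the target region. A minimal sketch with invented fixation records (real analyses use much finer time bins and correct for saccade latency):

# Each trial is a list of (timestamp_ms, region) samples from the eye tracker,
# time-locked to the onset of the critical word (invented toy data).
trials = [
    [(0, 'distractor'), (200, 'distractor'), (400, 'target'), (600, 'target')],
    [(0, 'target'),     (200, 'distractor'), (400, 'target'), (600, 'target')],
    [(0, 'distractor'), (200, 'target'),     (400, 'target'), (600, 'target')],
]

def target_proportion(trials, time):
    """Proportion of trials whose gaze is on the target at a given sample time."""
    looks = [dict(trial)[time] for trial in trials]
    return looks.count('target') / len(looks)

# Rising proportions after word onset are taken as a sign of incremental
# interpretation: listeners home in on the referent as the word unfolds.
for t in (0, 200, 400, 600):
    print(f"{t:4d} ms: {target_proportion(trials, t):.2f}")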
5. Physiological effects of semantic processing

There is no clear-cut way to distinguish physiological effects from behavioral effects. With the physiological phenomena discussed in this section, however, it is evident that they are truly beyond conscious control, and thus may provide more immediate access to semantic processing. Physiological evidence can be gained in a number of ways: from lesions of the brain and how they affect linguistic performance, from excitations of brain areas during surgery, from the observable metabolic processes related to brain activities, and from the electromagnetic brain potentials that accompany the firing of bundles of neurons. There are other techniques that have been used occasionally, such as pupillary dilation, which correlates with cognitive load. For example, Krüger, Nuthmann & van der Meer (2001) show with this measure that representations of event sequences following their natural order are cognitively less demanding than those that do not follow the time line.

5.1. Brain lesions and stimulations

Since the early discoveries of Broca and Wernicke, it has been assumed that specific brain lesions affect the relation between expressions and meanings. The classical picture of Broca's area as responsible for production and Wernicke's area as responsible for comprehension is now known to be incomplete (cf. Damasio et al. 2004), but it is still assumed that Broca's aphasia impedes the ability to use complex syntactic forms to encode and also to decode meanings. From lesion studies it became clear that areas outside the classical Broca/Wernicke areas and the connecting Geschwind area are relevant for language production and understanding. Brain regions have been identified where lesions lead to semantic dementia (also known as anomic aphasia) that selectively affects the recognition of names of persons, nouns for manipulable objects such as tools, or nouns for natural objects such as animals. These regions are typically situated in the left temporal lobe, but the studies reported by Damasio et al. also indicate that regions of the right hemisphere play an important role. It remains unclear, however, whether these lesions affect particular linguistic abilities or reflect more general problems with the pre-linguistic categorization of objects. A serious problem with the use of brain lesions as a source of evidence is that they are often not sufficiently locally constrained to allow for specific inferences. Stimulation techniques allow for more directed manipulations, and hence for more specific testing of hypotheses. There are deep stimulation techniques that can be applied during brain surgery. There is also a newer technique, Transcranial Magnetic Stimulation (TMS), which affects the functioning of particular brain regions by electromagnetic fields applied from outside the skull.

5.2. Brain imaging of metabolic effects

The last decades have seen a lively development of methods that help to locate brain activity by identifying correlated metabolic effects. Neuronal activity in certain brain regions stimulates the flow of oxygen-rich blood, which in turn can be localized by various means. While early methods like PET (Positron Emission Tomography) required the use of radioactive markers, the method of fMRI (functional Magnetic Resonance Imaging) is less invasive; it is based on measuring the electromagnetic fields of water molecules excited by strong magnetic fields. A more recent method, NIRS (Near Infrared Spectroscopy), applies low-frequency laser light from outside the skull; it is currently the least invasive technique. All the procedures mentioned have a low temporal resolution, as metabolic changes are slow, within the range of a second or so. However, their spatial resolution is quite acute, especially for fMRI using strong magnetic fields.
Results of metabolic brain-imaging techniques often support and refine findings derived from brain lesions (cf. Damasio et al. 2004). As an example of a recent study, Tyler, Randall & Stamatakis (2008) challenge the view that nouns and verbs are represented in different brain regions; they argue instead that inflected nouns and verbs and minimal noun phrases and minimal verb phrases, that is, specific syntactic uses of nouns and verbs, are spatially differentiated. An ongoing discussion is how general the findings about localizations of brain activities are, given the enormous plasticity of the brain.

5.3. Event-related potentials

This family of procedures investigates the electromagnetic fields generated by cortical activity. They are observed by sensors placed on the scalp that track either minute variations of the electric field (EEG) or of the magnetic field (MEG). The limitations of this technique are that only fields generated by the neocortex directly under the cranium can be detected. As the neocortex is deeply folded, this applies only to a small part of it. Furthermore, the number of electrodes that can be applied to the scalp is limited (typically 16 to 64, sometimes up to 256), hence the spatial resolution is weak even for the accessible parts of the cortex. Spatial resolution is better for MEG, but the required techniques are considerably more complex and expensive. On the positive side, the temporal resolution of the technique is very high, as it does not measure slow metabolic effects of brain activity, but the electric fields generated by the neurons themselves (more specifically, the action potentials that cause neurotransmitter release at the synapses). EEG electrodes can record these fields if they are generated by a large number of neurons in the pyramidal bundles in which the cortex is organized, in the magnitude of at least 1000 neurons.

ERPs (event-related potentials), the correlation of EEG signals with stimulus events, have been used for thirty years in psycholinguistic research, and specifically for semantic processing since the discovery by Kutas & Hillyard (1980) of a specific brain potential, the N400. This is a frequently observed change in the potential leading to higher negativity roughly 400 ms after the onset of a relevant stimulus. See Kutas, van Petten & Kluender (2006) for a review of the vast literature, and Lau, Phillips & Poeppel (2008) for a partially critical view of standard interpretations. The N400 effect is seen when subjects are presented in an incremental way with sentences like I like my coffee with cream and {sugar / socks}, and the EEG signals of the first and the second variant are compared. In the second variant, with a semantically incongruous word, a negativity around 400 ms after the onset of the anomalous word (here: socks) appears when the brain potential development is averaged over a number of trials.

[Fig. 12.3, a plot of the averaged potential on a µV scale against time, is not reproduced here.]
Fig. 12.3: Averaged EEG over sentences with no semantic violation (solid line) and with semantic violation (dotted line); vertical axis at the onset of the anomalous word (from Lau, Phillips & Poeppel 2008)
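The averaging step behind such figures is simple enough to sketch. The following fragment uses invented toy data; real pipelines additionally involve filtering, artifact rejection and baseline correction:

import numpy as np

# Invented toy data: EEG epochs from one electrode, time-locked to word onset.
# Sampling at 250 Hz, i.e. one sample every 4 ms; epochs of 1000 ms.
rng = np.random.default_rng(0)
n_trials, n_samples = 40, 250
times = np.arange(n_samples) * 4          # ms after word onset

# Simulated N400: a negative deflection peaking around 400 ms,
# present only in the incongruous condition.
n400 = -4.0 * np.exp(-((times - 400) / 80.0) ** 2)

congruous   = rng.normal(0.0, 5.0, (n_trials, n_samples))          # "... sugar"
incongruous = rng.normal(0.0, 5.0, (n_trials, n_samples)) + n400   # "... socks"

# Averaging across trials cancels the (zero-mean) noise and leaves the
# event-related potential visible in the condition difference.
erp_diff = incongruous.mean(axis=0) - congruous.mean(axis=0)
print(f"peak difference: {erp_diff.min():.1f} µV "
      f"at {times[erp_diff.argmin()]} ms after word onset")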
There are at least two interpretations of the N400 effect: Most researchers see it as a reflex of the attempt to integrate the meaning of a subexpression into the meaning of the larger expression, as constructed so far. With incongruous words, this task is hard or even fails, which is reflected by a stronger N400. The alternative view is that the N400 reflects the effort of lexical access. This is facilitated when the word is predictable from the context, but also when the word is frequent in general. There is evidence that highly frequent words lead to a smaller N400 effect. Also, the N400 can be triggered by simple word priming tasks; e.g. in coffee - {tea / chair}, the non-primed word chair leads to an N400. See Lau, Phillips & Poeppel (2008) for consequences of the integration view and the lexical access view of the N400. The spatial location of the N400 is also a matter of dispute. While Kutas, van Petten & Kluender (2006) claim that its origins are in the left temporal lobe, and hence can be related to established language areas, the main electromagnetic field can be observed rather in the centroparietal region, and often on the right hemisphere. Lau, Phillips & Poeppel (2008) discuss various possible interpretations of these findings.

There are a number of other reproducible electrophysiological effects that point to additional aspects of language processing. In particular, Early Left Anterior Negativity (ELAN) has been implicated in phrase structure violations (150 ms), Left Anterior Negativity (LAN) appears with morphosyntactic agreement violations (300-500 ms), and the P600, a positivity after 600 ms, has been seen as evidence for difficulties of syntactic integration, perhaps as evidence for attempts at syntactic restructuring. It is being discussed how specific the N400 is to semantics; while it is triggered by phenomena that are clearly related to the meaning aspects of language, it can also be found when subjects perform certain non-linguistic tasks, as in melody recognition. Interestingly, the N400 can be masked by syntactic inappropriateness, as Hahne & Friederici (2002) have shown. This can be explained by the plausible assumption that structures first have to make syntactic sense before semantic integration can even start to take place.

There are a number of interesting specific findings around the N400 and related brain potentials (cf. Kutas, van Petten & Kluender 2006 for an overview). Closed-class words generally trigger smaller N400 effects than open-class words, and the shape of their negativity is different as well - it is more drawn out, extending up to about 700 ms. As already mentioned, low-frequency words trigger greater N400 effects, which may be seen as a point in favor of the lexical access theory; however, we can also assume that low frequency is a general factor that impedes semantic integration. It has been observed that the N400 is greater for inappropriate concrete nouns than for inappropriate abstract nouns. With auditory presentation of linguistic structures, it was surprising to learn that N400 effects can appear even before the end of the triggering word; this is evidence that word recognition and semantic integration set in very early, after the first phonemes of a word. The larger context of an expression can modulate the N400 effect, that is, the preceding text of a sentence can determine whether a particular word fits and is easy to integrate, or does not fit and leads to integration problems.
For example, in a context in which piercing was mentioned, earring triggers a smaller N400 than necklace. This has been seen as evidence that semantic integration does not differentiate between lexical access, the local syntactic fit and the more global semantic plausibility; rather, all factors play a role at roughly the same time.

The N400 has also been used as evidence for semantic features. For example, in the triple The pizza was too hot to {eat / drink / kill}, the item drink elicits a smaller N400 than kill, which
can be interpreted as showing that the expected item eat and the test item drink have semantic features in common (ingestion), in contrast to eat and kill. Brain potentials have also been used to investigate the semantic processing of negative polarity items (cf. article 64 (Giannakidou) Polarity items). Saddy, Drenhaus & Frisch (2004) and Drenhaus et al. (2006) observe that negative polarity items in inappropriate contexts trigger an N400 effect (as in {A / no} man was ever happy). With negative and with positive polarity items, a P600 could be observed as well, which is indicative of an attempt to arrive at a syntactic structure in which there is a suitable licensing operator in the right syntactic configuration. Incidentally, these findings favor the semantic integration view of the N400 over the lexical access view.

There are text types that require special efforts of semantic integration - riddles and jokes. With jokes based on the reinterpretation of words, it has been found that better comprehenders of jokes show a slightly higher N400 effect on critical words, and a larger P600 effect for overall integration. Additional effort for semantic integration has also been shown for metaphorical interpretations. A negativity around 320 ms has been identified by Fischler et al. (1985) for statements known to the subjects to be false, even if they were not asked to judge the truth value. But semantic anomaly clearly overrides false statements; as Kounios & Holcomb (1992) have shown, in examples like No dogs are {animals / fruits}, the latter triggers an N400 effect.

More recent experiments using MEG have discovered a brain potential called the AMF (Anterior Midline Field), situated in the frontal lobe, an area that is not normally implicated in language understanding. The effect shows up with coercion phenomena (cf. article 25 (de Swart) Mismatches and coercion). Coercion does not lead to an N400 effect; there is no anomaly with John began the book (which has to be coerced to reading or writing the book). But Pylkkänen & McElree (2007) found an AMF effect about 350 ms after onset. This effect is absent with semantically incongruous words, as well as with words that do not require coercion. Interestingly, the same brain area has been implicated in the understanding of sarcastic and ironic language in lesion studies (Shamay-Tsoory et al. 2005).

6. Corpus-linguistic methods

Linguistic corpora, the records of past linguistic production, are a valuable source of evidence for linguistic phenomena in general, and in the case of extinct languages, the only kind of source (cf. article 109 (Katz) Semantics in corpus linguistics). This includes the study of semantic phenomena. For the case of extinct languages we would like to mention, in particular, the task of deciphering, which consists in finding a mapping between expressions and meanings. Linguistic corpora can provide evidence of meaning in many different ways. An important philosophical research tradition is hermeneutics, originally the art of understanding sacred texts. Perhaps the most important concept in the modern hermeneutic tradition is the explication of the so-called hermeneutic circle (cf. Gadamer 1960): The interpreter necessarily approaches the text with a certain kind of knowledge that is necessary for an initial understanding, but the understanding of the text in a first reading will influence and deepen the understanding in subsequent readings.
With large corpora that are available electronically, new statistical techniques have been developed that can tap into aspects of meaning that might otherwise be difficult
to recognize. In linguistic corpora, the analysis of word co-occurrences, and in particular of collocations, can yield evidence for meaning relations between words. For example, large corpora have been investigated for verb-NP collocations using the so-called Expectation Maximization (EM) algorithm (Rooth et al. 1999). This algorithm leads to a classification of verbs and nouns into clusters such that verbs of class X frequently occur with nouns of class Y (a schematic implementation of such an EM procedure is sketched below). The initial part of one such cluster, derived from the British National Corpus, is shown in Fig. 12.4. The verbs can be characterized as verbs that involve scalar changes, and the nouns as denoting entities that can move along such scales.

[Fig. 12.4: Clustering analysis of nouns and verbs; dots represent pairs that occur in the corpus. The cluster shown pairs verbs such as increase, fall, pay, reduce, rise, exceed, affect and grow with nouns such as number, rate, price, cost, level, amount, sale, value, interest and demand. "as:s" stands for subjects of intransitive verbs, "aso:s" and "aso:o" for subjects and objects of transitive verbs, respectively (from Rooth et al. 1999).]

We can also look at the frequency of particular collocations within this cluster, as illustrated in the following table for the verb increase.

Tab. 12.1: Frequency of nouns occurring with increase (from Rooth et al. 1999)
number       134.147     proportion    23.8699
demand        30.7322    size          22.8108
pressure      30.5844    rate          20.9593
temperature   25.9691    level         20.7651
cost          23.9431    price         17.9996

While purely statistical approaches such as Rooth et al. (1999) are of considerable interest, most applications of large-scale corpus-based research are based on a mix of hand-coding and automated procedures. The best-known project that has turned into an important application is WordNet (Fellbaum (ed.) 1998). A good example of the mixed procedure is Gildea & Jurafsky (2002), a project that attempted semi-automatic assignment of thematic roles. In a first step, thematic roles were hand-coded for a large number of verbs, where a large corpus provided a wide variety of examples. These initial examples, together with the coding, were used to train an automatic syntactic parser, which then was able to assign thematic roles to new instances of known predicates, and even to new, unseen predicates, with reasonable accuracy.
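To make the clustering idea concrete, the following is a minimal sketch of the kind of latent-class model that Rooth et al. (1999) estimate with EM: observed verb-noun pair frequencies are modeled as p(v,n) = Σ_c p(c)·p(v|c)·p(n|c), and the E- and M-steps re-estimate these distributions until convergence. The sketch is written in Python, with toy counts invented purely for illustration; it illustrates the general technique and is not a reproduction of Rooth et al.'s implementation.

import numpy as np

rng = np.random.default_rng(0)

# Toy verb-object co-occurrence counts (rows: verbs, columns: nouns);
# the numbers are invented for illustration.
verbs = ["increase", "reduce", "eat", "drink"]
nouns = ["price", "rate", "pizza", "water"]
counts = np.array([[30., 25., 0., 1.],
                   [20., 15., 1., 0.],
                   [0., 0., 40., 2.],
                   [1., 0., 2., 35.]])

K = 2                                # number of latent classes c
V, N = counts.shape
pc = np.full(K, 1.0 / K)             # p(c)
pv = rng.dirichlet(np.ones(V), K)    # p(v|c), shape (K, V)
pn = rng.dirichlet(np.ones(N), K)    # p(n|c), shape (K, N)

for _ in range(100):
    # E-step: posterior p(c|v,n) for every pair, shape (K, V, N).
    joint = pc[:, None, None] * pv[:, :, None] * pn[:, None, :]
    post = joint / joint.sum(axis=0, keepdims=True)
    # M-step: re-estimate p(c), p(v|c) and p(n|c) from expected counts.
    expected = post * counts[None, :, :]
    mass = expected.sum(axis=(1, 2))
    pc = mass / mass.sum()
    pv = expected.sum(axis=2) / mass[:, None]
    pn = expected.sum(axis=1) / mass[:, None]

# Report the two most probable verbs and nouns per class.
for c in range(K):
    top_v = [verbs[i] for i in np.argsort(-pv[c])[:2]]
    top_n = [nouns[i] for i in np.argsort(-pn[c])[:2]]
    print(c, top_v, top_n)

On counts of this shape, the two latent classes should separate the scalar-change verbs (increase, reduce) together with their scale-denoting objects (price, rate) from the ingestion verbs (eat, drink) and theirs (pizza, water), mirroring on a toy scale the clusters of Fig. 12.4.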
Yet another application of corpus-linguistic methods involves parallel corpora, collections of texts and their translations into one or more other languages. It is presupposed that the meanings of the texts are reasonably similar (but recall the problems with translations mentioned above). Refined statistical methods can be used to train automatic translation systems on a given corpus; these systems can then be applied to new texts, which are translated automatically, a method known as example-based machine translation.

For linguistic research, parallel corpora have been used in other ways as well. If a language α marks a certain distinction overtly and regularly, whereas language β marks that distinction only rarely and in irregular ways, good translation pairs of texts from α into β can be used to investigate the ways in which, and the frequency with which, the distinction is marked in β. This method is used, for example, in von Heusinger (2002) for specificity, using Umberto Eco's Il nome della rosa, and in Behrens (2005) for genericity, using Saint-Exupéry's Le petit prince. The articles in Cysouw & Wälchli (2007) discuss the potential of the technique, and its problems, for typological research.

7. Conclusion

This article, hopefully, has shown that the elusive concept of meaning has many reflexes that we can observe, and that semantics actually stands on as firm ground as the other disciplines of linguistics. The kinds of evidence for semantic phenomena are very diverse, and not always as convergent as semanticists might wish them to be. But they provide a very rich and interconnected area of study that has shown considerable development since the first edition of the handbook Semantics in 1991. In particular, a wide variety of experimental evidence on the processing of meaning has been adduced. It is to be hoped that the next edition will show an even richer and, hopefully, more convergent picture.

8. References

Altmann, Gerry & Yuki Kamide 1999. Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition 73, 247-264.
Behrens, Leila 2005. Genericity from a cross-linguistic perspective. Linguistics 43, 257-344.
Berlin, Brent, Dennis E. Breedlove & Peter H. Raven 1973. General principles of classification and nomenclature in folk biology. American Anthropologist 75, 214-242.
Berlin, Brent & Paul Kay 1969. Basic Color Terms: Their Universality and Evolution. Berkeley, CA: University of California Press.
Bloomfield, Leonard 1933. Language. New York: Henry Holt and Co.
Chafe, Wallace (ed.) 1980. The Pear Stories: Cognitive, Cultural, and Linguistic Aspects of Narrative Production. Norwood, NJ: Ablex.
Clark, Herbert H. & Peter Lucy 1975. Understanding what is meant from what is said: A study in conversationally conveyed requests. Journal of Verbal Learning and Verbal Behavior 14, 56-72.
Crain, Stephen 1991. Language acquisition in the absence of experience. Behavioral and Brain Sciences 14, 597-650.
Crain, Stephen, Rosalind Thornton, Laura Conway & Diane Lillo-Martin 1998. Quantification without qualification. Language Acquisition 5, 83-153.
Cysouw, Michael & Bernhard Wälchli 2007. Parallel texts: Using translational equivalents in linguistic typology. Special issue of Sprachtypologie und Universalienforschung 60, 95-99.
Damasio, Hanna, Daniel Tranel, Thomas Grabowski, Ralph Adolphs & Antonio Damasio 2004. Neural systems behind word and concept retrieval. Cognition 92, 179-229.
Drenhaus, Heiner, Peter beim Graben, Doug Saddy & Stefan Frisch 2006. Diagnosis and repair of negative polarity constructions in the light of symbolic resonance analysis. Brain & Language 96, 255-268.
Eberhard, Kathleen, Michael J. Spivey-Knowlton, Judy Sedivy & Michael Tanenhaus 1995. Eye movement as a window into real-time spoken language comprehension in a natural context. Journal of Psycholinguistic Research 24, 409-436.
Ehrlich, Kate & Keith Rayner 1983. Pronoun assignment and semantic integration during reading: Eye movement and immediacy of processing. Journal of Verbal Learning and Verbal Behavior 22, 75-87.
Fellbaum, Christiane (ed.) 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fischler, Ira, Donald G. Childers, Teera Achariyapaopan & Nathan W. Perry 1985. Brain potentials during sentence verification: Automatic aspects of comprehension. Biological Psychology 21, 83-105.
Frazier, Lyn & Keith Rayner 1990. Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29, 181-200.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50.
Gadamer, Hans-Georg 1960. Wahrheit und Methode. Grundzüge einer philosophischen Hermeneutik. Tübingen: Mohr.
Gärdenfors, Peter 2000. Conceptual Spaces: The Geometry of Thought. Cambridge, MA: The MIT Press.
Gerken, LouAnn & Michele E. Shady 1998. The picture selection task. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 125-146.
Gildea, Daniel & Daniel Jurafsky 2002. Automatic labeling of semantic roles. Computational Linguistics 28, 245-288.
Goddard, Cliff 1998. Semantic Analysis: A Practical Introduction. Oxford: Oxford University Press.
Goodman, Nelson 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press.
Gordon, Peter 1998. The truth-value judgment task. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 211-235.
Gregory, Robert J. 2003. An early history of land on Tanna, Vanuatu. The Anthropologist 5, 67-74.
Grice, Paul 1957. Meaning. The Philosophical Review 66, 377-388.
Hahne, Anja & Angela D. Friederici 2002. Differential task effects on semantic and syntactic processes are revealed by ERPs. Cognitive Brain Research 13, 339-356.
Haviland, John B. 1974. A last look at Cook's Guugu-Yimidhirr wordlist. Oceania 44, 216-232.
von Heusinger, Klaus 2002. Specificity and definiteness in sentence and discourse structure. Journal of Semantics 19, 245-274.
Hill, Clifford 1982. Up/down, front/back, left/right: A contrastive study of Hausa and English. In: J. Weissenborn & W. Klein (eds.). Here and There: Cross-Linguistic Studies on Deixis and Demonstration. Amsterdam: Benjamins, 11-42.
Hirsh-Pasek, Kathryn & Roberta M. Golinkoff 1996. The Origins of Grammar: Evidence from Early Language Comprehension. Cambridge, MA: The MIT Press.
Kaplan, David 1978. On the logic of demonstratives. Journal of Philosophical Logic VIII, 81-98. Reprinted in: P. French, T. Uehling & H. K. Wettstein (eds.). Contemporary Perspectives in the Philosophy of Language. Minneapolis, MN: University of Minnesota Press, 1979, 401-412.
Kay, Paul, Brent Berlin, Luisa Maffi & William R. Merrifield 2008. The World Color Survey.
Stanford, CA: CSLI Publications.
Kempton, William 1981. The Folk Classification of Ceramics. New York: Academic Press.
Kim, Christina 2008. Processing presupposition: Verifying sentences with 'only'. In: J. Tauberer, A. Eilam & L. MacKenzie (eds.). Proceedings of the 31st Penn Linguistics Colloquium. Philadelphia, PA: University of Pennsylvania.
Kounios, John & Phillip J. Holcomb 1992. Structure and process in semantic memory: Evidence from event-related brain potentials and reaction times. Journal of Experimental Psychology: General 121, 459-479.
Kranstedt, Alfred, Andy Lücking, Thies Pfeiffer & Hannes Rieser 2006. Deixis: How to determine demonstrated objects using a pointing cone. In: S. Gibet, N. Courty & J.-F. Kamp (eds.). Gesture in Human-Computer Interaction and Simulation. Berlin: Springer, 300-311.
Krüger, Frank, Antje Nuthmann & Elke van der Meer 2001. Pupillometric indices of the temporal order representation in semantic memory. Zeitschrift für Psychologie 209, 402-415.
Kutas, Marta & Steven A. Hillyard 1980. Reading senseless sentences: Brain potentials reflect semantic incongruity. Science 207, 203-205.
Kutas, Marta, Cyma K. van Petten & Robert Kluender 2006. Psycholinguistics electrified II: 1994-2005. In: M. A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. New York: Elsevier, 83-143.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of Chicago Press.
Lau, Ellen F., Colin Phillips & David Poeppel 2008. A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience 9, 920-933.
McDaniel, Dana, Cecile McKee & Helen Smith Cairns (eds.) 1996. Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press.
Majid, Asifa, Melissa Bowerman, Miriam van Staden & James Boster 2006. The semantic categories of cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics 18, 133-152.
Matthewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of American Linguistics 70, 369-415.
Noveck, Ira 2001. When children are more logical than adults: Experimental investigations of scalar implicatures. Cognition 78, 165-188.
Noveck, Ira 2004. Pragmatic inferences linked to logical terms. In: I. Noveck & D. Sperber (eds.). Experimental Pragmatics. Basingstoke: Palgrave Macmillan, 301-321.
Pickering, Martin J., Brian McElree & Steven Frisson 2006. Underspecification and aspectual coercion. Discourse Processes 42, 131-155.
Piñango, Maria M., Aaron Winnick, Rashad Ullah & Edgar Zurif 2006. Time course of semantic composition: The case of aspectual coercion. Journal of Psycholinguistic Research 35, 233-244.
Portner, Paul & Katsuhiko Yabushita 1998. The semantics and pragmatics of topic phrases. Linguistics & Philosophy 21, 117-157.
Potts, Christopher 2005. The Logic of Conventional Implicatures. Oxford: Oxford University Press.
Pylkkänen, Liina & Brian McElree 2007. An MEG study of silent meaning. Journal of Cognitive Neuroscience 19, 1905-1921.
Quine, Willard Van Orman 1960. Word and Object. Cambridge, MA: The MIT Press, 26-79.
Rayner, Keith 1998. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin 124, 372-422.
Rooth, Mats, Stefan Riezler, Detlef Prescher & Glenn Carroll 1999. Inducing a semantically annotated lexicon via EM-based clustering. In: R. Dale & K. Church (eds.). Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. College Park, MD: ACL, 104-111.
Saddy, Douglas, Heiner Drenhaus & Stefan Frisch 2004. Processing polarity items: Contrastive licensing costs.
Brain & Language 90, 495-502.
van der Sandt, Rob A. 1988. Context and Presupposition. London: Croom Helm.
Sedivy, Julie C., Michael K. Tanenhaus, Craig C. Chambers & Gregory N. Carlson 1999. Achieving incremental semantic interpretation through contextual representation. Cognition 71, 109-147.
Shamay-Tsoory, Simone, Rachel Tomer & Judith Aharon-Peretz 2005. The neuroanatomical basis of understanding sarcasm and its relationship to social cognition. Neuropsychology 19, 288-300.
Smith, Carol L. 1980. Quantifiers and question answering in young children. Journal of Experimental Child Psychology 30, 191-205.
Stalnaker, Robert 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197-214.
Thornton, Rosalind 1998. Elicited production. In: D. McDaniel, C. McKee & H. S. Cairns (eds.). Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 77-95.
Tomasello, Michael 2008. Why don't apes point? In: R. Eckardt, G. Jäger & T. Veenstra (eds.). Variation, Selection, Development: Probing the Evolutionary Model of Language Change. Berlin: Mouton de Gruyter, 375-394.
Tyler, Lorraine K., Billi Randall & Emmanuel A. Stamatakis 2008. Cortical differentiation for nouns and verbs depends on grammatical markers. Journal of Cognitive Neuroscience 20, 1381-1389.
Weber, Andrea, Bettina Braun & Matthew W. Crocker 2006. Finding referents in time: Eye-tracking evidence for the role of contrastive accents. Language and Speech 49, 367-392.
Weinreich, Uriel 1964. Webster's Third: A critique of its semantics. International Journal of American Linguistics 30, 405-409.

Manfred Krifka, Berlin (Germany)

13. Methods in cross-linguistic semantics
1. Introduction
2. The need for cross-linguistic semantics
3. On semantic fieldwork methodology
4. What counts as a universal?
5. How do we find universals?
6. Semantic universals in the literature
7. Variation
8. References

Abstract
This article outlines methodologies for conducting research in cross-linguistic semantics, with an eye to uncovering semantic universals. Topics covered include fieldwork methodology, types of evidence for semantic universals, different types of semantic universals (with examples from the literature), and semantic parameters.

1. Introduction

This article outlines methodologies for conducting research in cross-linguistic semantics, with an eye to uncovering semantic universals. Section 2 briefly motivates the need for cross-linguistic semantic research, and section 3 outlines fieldwork methodologies for work on semantics. Section 4 addresses the issue of what counts as a universal, and section 5 discusses how one finds universals. Section 6 provides examples of semantic universals, and section 7 discusses variation. The reader is also referred to articles 12 (Krifka) Varieties of semantic evidence and 95 (Bach & Chao) Semantic types across languages.
2. The need for cross-linguistic semantics

It is by now probably uncontroversial that our field's empirical base needs to be as broad as possible. Typologists have always known this; generativists have also mostly advanced beyond the view that we can uncover Universal Grammar by doing in-depth study of a single language. Although semantics was the last sub-field of formal linguistics to undertake widespread cross-linguistic investigation, such work has been on the rise ever since the synthesis of generative syntax and Montague semantics in the 1980s. In the 20 years since Comrie (1989: 4) wrote that for a 'Chomskyist' for whom universals are abstract principles, "there is no way in which the analysis of concrete data from a wide range of languages would provide any relevant information", we have seen many instances where empirical cross-linguistic work has falsified abstract universal principles. Among many others, Maria Bittner's pioneering work on Kalaallisut (Eskimo-Aleut, Greenland) is worth mentioning here (e.g., Bittner 1987, 1994, 2005, 2008), as well as the papers in Bach et al.'s (1995) quantification volume, and the body of research responding to Chierchia's (1998) Nominal Mapping Parameter. It is clear that we will only come close to discovering semantic universals by looking at data from a wide range of languages.

3. On semantic fieldwork methodology

The first issue for a cross-linguistic semantic researcher is how to obtain usable data in the field. The material presented here draws on Matthewson (2004); the reader is also referred to article 12 (Krifka) Varieties of semantic evidence. The techniques described here are theory-neutral; I assume that regardless of one's theoretical assumptions, similar data-collection processes are appropriate.

3.1. In support of direct elicitation

It was once widely believed that direct elicitation is an illegitimate methodology; see for example Harris & Voegelin (1953: 59). Even in this century we find statements like the following:

The referential meaning of nouns (in terms of definiteness and specificity) is an intricate topic that is extremely hard to investigate on the basis of elicitation. In the end it is texts or connected discourse in general in the language under investigation which provide the most important clues for analysis of these grammatical domains. (Dimmendaal 2001: 69)

It is true that connected discourse is an indispensable part of any investigation into noun phrase semantics. However, this methodology alone is insufficient. Within semantics, there are two main reasons why textual materials are insufficient as an empirical base. The first is shared with syntactic research, namely that one cannot obtain negative evidence from texts. One cannot claim with any certainty that a structure is impossible if one only investigates spontaneously produced data. The second reason is specific to semantics, namely that texts provide little direct evidence about meaning. A syntactician working with a textual corpus has at least a number of sentences which may be assumed to be grammatical. But a text is paired with at most a translation - an incomplete representation of meaning. Direct elicitation results in more detailed,
targeted information than a translation. (This is not to deny that examination of texts is useful; texts are good sources of information about topic-tracking devices or reference time maintenance, for example.)

3.2. Eliciting semantic judgments

The semantics of utterances or parts of utterances is not consciously accessible to native speakers. Comments, paraphrases and translations offered by consultants are all useful clues for the fieldworker (see article 12 (Krifka) Varieties of semantic evidence). However, just as a syntactician does not ask a native speaker to provide a tree structure (but rather asks for grammaticality judgments of sentences which are designed to test for constituency, etc.), so a semanticist does not ask a native speaker to conduct semantic analysis. This includes generally avoiding questions such as 'what does this word mean?', 'when is it okay to use this word?' and 'does this sentence have two meanings?' Instead, we mainly proceed by requesting judgments on the acceptability of sentences in particular discourse contexts.

A judgment is an acceptance or rejection of some linguistic object, and ideally reflects the speaker's native competence. (However, see Carden 1970, Schütze 1996 on the problem of speaker variability in judgments.) I assume there is only one kind of judgment within semantics, that of whether a grammatical sentence (or string of sentences) is acceptable in a given discourse context. Native speakers are not qualified to give 'judgments' about technical concepts such as entailment, tautology, ambiguity or vagueness (although of course consultants' comments can give valuable clues about these issues). All of these concepts can and should be tested for via acceptability judgment tasks. The acceptability judgment task is similar to the truth value judgment task used in language acquisition research (cf. Crain & Thornton 1998 and much other work; see article 103 (Crain) Meaning in first language acquisition).

The way to elicit an acceptability judgment is to first provide a discourse context to the consultant, then present an utterance, and ask whether the utterance is acceptable in that context. The consultant's answer potentially gives us information about the truth value of the sentence in that context, and about the pragmatic felicity of the sentence in that context (e.g., whether its presuppositions are satisfied). The relation of the acceptability judgment task to truth values derives from Grice's (1975) Maxim of Quality: we assume that a speaker will only accept a sentence S in a discourse context C if S is true in C. Conversely, if S is false in C, a speaker will reject S in C. From this it follows that acceptance of a sentence gives positive information about truth value (the sentence is true in this discourse context), but rejection of a sentence gives only partial information: the sentence may be false, but it may also be rejected on other grounds. Article 103 (Crain) Meaning in first language acquisition offers useful discussion of the influence of the Cooperative Principle on responses to acceptability judgment tasks. And see von Fintel (2004) for the claim that speakers will assign the truth value 'false' to sentences which involve presupposition failure, and which are therefore actually infelicitous (and analyzed as having no truth value).
When conducting an acceptability judgment task, it is important to present the consultant with the discourse context before presenting the object language sentence(s). If the consultant hears the sentence in the absence of a context, s/he will spontaneously think of a suitable context or range of contexts for the sentence. If the speaker's imagined
context differs from the one the researcher is interested in, false negatives can arise, particularly if the researcher is interested in a dispreferred or non-obvious reading.

The other important question is how to present the discourse context. In section 3.4 I will discuss the use of non-verbal stimuli when presenting contexts. Concentrating for now on verbal strategies, we have two choices: explain the context in the object language, or in a meta-language (a language the researcher is fluent in). Either method can work, and I believe that there is little danger in using a meta-language to describe a context. The context is merely background information; its linguistic features are not relevant. In fact, presenting the context in a meta-language has the advantage that the consultant is unlikely to copy structural features from the context description to the sentence being judged. (In Matthewson 2004, I argue that even in a translation task, the influence of meta-language syntax and semantics is less than has sometimes been feared. Based on examples drawn from Salish, I argue that consultants will only copy structures up to the point allowed by their native grammar, and that any problems of cross-language influence are easily solvable by means of follow-up elicitation.) See article 12 (Krifka) Varieties of semantic evidence for further discussion of issues in presenting discourse contexts.

3.3. Eliciting translations

Translations have the same status as paraphrases or additional comments about meaning offered by the consultant: they are a clue to meaning, and should be taken seriously, but they are not primary data. A translation represents the consultant's best effort to express the same truth conditions in another language - but often, there is no way to express exactly the same truth conditions in two different languages. Even if we assume effability, i.e. that every proposition expressible in one language is expressible in any language (Katz 1976), a translation task will usually fall short of pairing a proposition in one language with a truth-conditionally identical counterpart in the other language. The reasons for this include the fact that what is easily expressible in one language may be expressible only with difficulty in another, or that what is expressed by an unambiguous proposition in one language is best rendered in another language by an utterance which is ambiguous or vague. Furthermore, what serves as a good translation of a sentence in one discourse context may be a poor or inappropriate translation of the same sentence in a different discourse context. As above, the only direct evidence about truth conditions comes from acceptability judgments in particular contexts.

The problem of ambiguity is acute when dealing with a translation or paraphrase task. If an object-language sentence is ambiguous, its translation or paraphrase often only reflects the preferred reading. And a consultant will often want to explicitly disambiguate; after all, Grice's Manner Maxim exhorts a cooperative speaker to avoid ambiguity. If one suspects that a sentence may be ambiguous, one should not simply give the sentence and ask what it means. Instead, present a discourse context, and elicit an acceptability judgment. Eliciting the dispreferred reading first is a good idea (or, if the status of the readings is not known, using different elicitation orders on different days).
It is also a good idea to construct discourse contexts which pragmatically favour the dispreferred reading. Manipulating pragmatic plausibility can also establish the absence of ambiguity: if a situation pragmatically favours a certain possible reading, but the consultant rejects the sentence in that context, one can be pretty sure that the reading is absent.
When dealing with presuppositions, or other aspects of meaning which affect felicity conditions, translations provide notoriously poor information. Felicity conditions are usually ignored in the translation process; the reader is referred to Matthewson (2004) for examples of this.

When eliciting translations, the basic rule is only to ask for translations of complete, grammatical sentences. (See Nida 1947: 140 for this point; Harris & Voegelin 1953: 70-71, on the contrary, advise that one begin by asking for translations of single morphemes.) Generally speaking, any sub-sentential string will probably not be translatable with any accuracy by a native speaker. A sub-sentential translation task rests on the faulty assumption that the meaning which is expressed by a certain constituent in one language is expressible by a constituent in another language. Even if the syntactic structures are roughly parallel in the two languages, the meaning of a sub-sentential string may depend on its environment (cf. for example languages where bare noun phrases are interpreted as specific or definite only in certain positions).

The other factor to be aware of when eliciting translations is that a sentence which would result in ungrammaticality if it were translated in a structure-preserving way will be restructured. In Matthewson (2004) I provide an example of this involving testing for Condition C violations in Salish. Due to pro-drop and free word order, the St'át'imcets (Lillooet Salish, British Columbia) sentence which translates Mary saw her mother is structurally ambiguous with She saw Mary's mother. The wrong way to test for potential Condition C-violating structures (which has been tried!) is to ask for translations of the potentially ambiguous sentence into English. Even if the object language does allow Condition C violations (i.e., does allow the structure She_i saw Mary_i's mother), the consultant will never translate the sentence into English as She saw Mary's mother. (See Davis 2009 for discussion of strategies for dealing with the particular elicitation problems of Condition C.)

3.4. Elicitation using non-verbal stimuli

It was once widely believed that elicitation should proceed by means of non-verbal stimuli, which are used to generate spontaneous speech in the object language, and that a metalanguage should be entirely avoided (see for example Hayes 1954, Yegerlehner 1955, or Aitken 1955). It will already be clear that I reject this view, as do most modern researchers. Elicitation using only visual stimuli cannot provide negative evidence, for example.

Visual stimuli are routinely used in language acquisition experiments; see Crain & Thornton (1998), among others. The methodology is equally applicable to fieldwork with adults. The stimuli can be created using computer technologies, but they do not need to be high-tech; they can involve puppets, small toys, or line drawings. Usually, these methodologies are not intended to replace verbal elicitation, but serve as a support and enhancement for it. For example, a picture or a video will be shown, and may be used to elicit spontaneous discourse, but is also usually followed up with judgment questions.

The advantage of visual aids is obvious: the methodology avoids potential interference from linguistic features of the context description, and it can potentially more closely approximate a real-life discourse context.
This is particularly relevant for pragmatically sensitive questions, where it is almost impossible to hold cues in the metalanguage (e.g., intonation) constant. This methodology also allows standardization across different elicitation sessions, different fieldworkers, and different languages. When researchers share
their video clips, for example, we can be certain that the same contexts are being tested, and can compare cross-linguistic results with more confidence.

Another advantage of visual stimuli concerns felicity conditions. It is quite challenging to elicit information about presupposition failure using only verbal descriptions of discourse contexts, as the necessary elements of the context include information about the interlocutors' belief or knowledge states. However, video, animation or play-acting give a potential way to represent a character's knowledge state, and if carefully constructed, the stimulus can rule out potential presupposition accommodation. The consultant can then judge the acceptability of an utterance by one of the characters in the video/play which potentially contains presupposition failure.

The main disadvantage of these methodologies is logistics: compared to verbally describing a context, visually representing it takes more time and effort. Many hours can go into planning (let alone filming or animating) a video for just one discourse context. In many cases, verbal strategies are perfectly adequate and more efficient. For example, when trying to establish whether a particular morpheme encodes past tense, it is relatively simple to describe a battery of discourse contexts for a single sentence and obtain judgments for each context. One can quickly and unambiguously explain that an event took place yesterday, is taking place at the utterance time, or will take place tomorrow, for example. In sum, verbal elicitation remains an indispensable and versatile methodology.

The next section turns to semantic universals. We first discuss what semantic universals look like, and then how one gets from cross-linguistic data to the postulation of universals.

4. What counts as a universal?

Universals come in two main flavours: absolute/unconditional ('all languages have x'), and implicational ('if a language has x, it has y'). Examples of each type are given in (1) and (2); all of these except (1b) can be found in the Konstanz Universals Archive.

(1) Absolute/unconditional universals:
a. Every language has an existential element such as a verb or particle (Ferguson 1972: 78-79).
b. Every language has expressions which are interpreted as denoting individuals (article 95 (Bach & Chao) Semantic types across languages).
c. Every language has N's and NPs of the type e → t (Partee 2000).
d. The simple NPs of any natural language express monotone quantifiers or conjunctions of monotone quantifiers (Barwise & Cooper 1981: 187).

(2) Implicational universals:
a. If a language has adjectives for shape, it has adjectives for colour and size (Dixon 1977).
b. A language distinguishes between count and mass nouns iff it possesses configurational NPs (Gil 1987).
c. There is a simple NP which expresses the [monotone decreasing] quantifier ~Q if and only if there is a simple NP with a weak non-cardinal determiner which expresses the [monotone increasing] quantifier Q (Barwise & Cooper 1981: 186).
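For readers unfamiliar with the terminology in (1d) and (2c), the notion of monotonicity can be spelled out as follows; the formulation given here is the standard one of Barwise & Cooper (1981), slightly simplified. A generalized quantifier Q over a domain E is a set of subsets of E. Q is monotone increasing iff, whenever X ∈ Q and X ⊆ Y ⊆ E, also Y ∈ Q; Q is monotone decreasing iff, whenever X ∈ Q and Y ⊆ X, also Y ∈ Q. For instance, the denotation of every student, {X ⊆ E : STUDENT ⊆ X}, is monotone increasing (Every student ran fast entails Every student ran), while the denotation of no student, {X ⊆ E : STUDENT ∩ X = ∅}, is monotone decreasing (No student ran entails No student ran fast).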
Within the typological literature, 'universals' are often actually tendencies ('statistical universals'). Many of the entries in the Konstanz Universals Archive contain phrases like 'with more than chance frequency', 'more likely to', 'most often', 'typically' or 'tend to'. For obvious reasons, one does not find much attention paid to statistical universals in the generativist literature.

Another difference between typological and formal research relates to whether more importance is attached to implicational or unconditional universals. Typological research finds the implicational kind more interesting, and there is a strand of belief according to which it is difficult to distinguish unconditional universals from the preconditions of one's theory. For example, take (1c) above. Is this a semantic universal, or a theoretical decision under which we describe the semantic behaviour of languages? (Thanks to a reviewer for discussion of this point.)

There are certainly unconditional universals which seem to be empirically unfalsifiable. One example is semantic compositionality, which is assumed by many formal semanticists to hold universally. As discussed in von Fintel & Matthewson (2008) and references therein, compositionality is a methodological axiom which is almost impossible to falsify empirically. On the other hand, many unconditional universals are falsifiable; for example, Barwise and Cooper's NP-Quantifier Universal, introduced in section 5 below. Even (1c) makes a substantive claim, although worded in theoretical terms. It asserts that all languages share certain meanings; it is in a sense irrelevant if one doesn't agree that we should analyze natural language using the types e and t.

If typologists are suspicious of unconditional universals, formal semanticists may tend to regard implicational universals as merely a stepping stone to a deeper discovery. Many of the implications as they are stated are probably not primitively part of Universal Grammar. They may either have a functional explanation, or they may be descriptive generalizations which should be derivable from the theory rather than forming part of it. For example, take the semantic implicational universal in (3) (provided by a reviewer):

(3) If a language has a definite article, then a type shift from <e,t> to <<e,t>,t> is not freely available (without using the article).

(3) is ideally merely one specific instance of a general ban on type-shifting when there is an overt element which does the job (one standard instance of such a shift would map a predicate P of type <e,t> onto the existential quantifier λQ.∃x[P(x) ∧ Q(x)], of type <<e,t>,t>). This in turn may follow from some general economy principle, and need not be stated separately.

A final point of debate relates to the level of abstractness. Formal semanticists often propose constraints which are only statable at a high level of abstractness, but some researchers believe that universals should always be surface-testable. For example, Comrie (1989) argues that abstractly stated universals have the drawback that they rely on analyses which may be faulty. And Levinson (2003: 320) argues for surface-testability because "highly abstract generalizations ... are impractical to test against a reasonable sample of languages." Abstractness is required in the field of semantics because superficial statements almost never reveal what is really going on. This becomes clear if one considers how much reliable information about meaning can be gleaned from sources like descriptive grammars.
In fact, the inadequacy of superficial studies is illustrated by the very examples cited by Levinson (2003) in support of his claim that "Whorf's (1956: 218) emphasis on the 'incredible degree of linguistic diversity of linguistic systems over the globe' looks considerably better informed than the opinions of many contemporary thinkers." Levinson's
examples include variation in lexical categories, and the fact that some languages have no tense (Levinson 2003: 328). However, these are areas where examination of superficial data can lead to hasty conclusions about variation, and where deeper investigation often reveals underlying similarities.

With respect to lexical categories, Jelinek (1995) famously advanced the strong and interesting hypothesis that Straits Salish (Washington and British Columbia) lacks categorial distinctions. The claim was based on facts such as that any open-class lexical item can function as a predicate in Straits, without a copular verb. However, subsequent research has found, among other things, that the head of a relative clause cannot be occupied by just any open-class lexical item. Instead, this position is restricted to elements which correspond to nouns in languages like English (Demirdache & Matthewson 1995, Davis & Matthewson 1999, Davis 2002; see also Baker 2003). This is evidence for categorial distinctions, and Jelinek herself has since rejected the category-neutral view of Salish (orally, at the 2002 International Conference on Salish and Neighbouring Languages; see also Montler 2003).

As for tense, while tenseless languages may exist, we cannot determine whether a language is tensed or not based on superficial evidence. I will illustrate this based on St'át'imcets; see Matthewson (2006b) for details. St'át'imcets looks like a tenseless language on the surface, as there is no obligatory morphological encoding of the distinction between present and past. (4) illustrates this for an activity predicate; the same holds for all aspectual classes:

(4) mets-cál=lhkan
    write-ACT=1SG.SUBJ
    'I wrote / I am writing.'

St'át'imcets does have obligatory marking for future interpretations; (4) cannot be interpreted as future. The main way of marking the future is with the future modal kelh.

(5) mets-cál=lhkan=kelh
    write-ACT=1SG.SUBJ=FUT
    'I will write.'

The analytical problem is non-trivial here: the issue is whether there is a phonologically unpronounced tense in (4). The assumption of a null tense would make St'át'imcets similar to English, and therefore would be the null hypothesis (see section 5.1). The hypothesis is empirically testable language-internally, but only by looking beyond obvious data such as that in (4)-(5).

For example, it turns out that St'át'imcets and English behave in a strikingly parallel manner with respect to the temporal shifting properties of embedded predicates. In English, when a stative past-tense predicate is embedded under another past tense, a simultaneous reading is possible. The same is true in St'át'imcets, as shown in (6).

(6) tsut=tu7 s=Pauline [kw=s=guy't-á7men=s=tu7]
    say=then NOM=Pauline [DET=NOM=sleep-want=3POSS=then]
    'Pauline said that she was tired.'
    OK: Pauline said at a past time t that she was tired at t.
On the other hand, in English a future embedded under another future does not allow a simultaneous reading, but only allows a forward-shifted reading (Enç 1996, Abusch 1998, among others). The same is true of St'át'imcets, as shown in (7). Just as in English, the time of Pauline's predicted tiredness must be later than the time at which Pauline will speak.

(7) tsut=kelh s=Pauline [kw=s=guy't-á7men=s=kelh]
    say=FUT NOM=Pauline [DET=NOM=sleep-want=3POSS=FUT]
    'Pauline will say that she will be tired.'
    *Pauline will say at a future time t that she is tired at t.
    OK: Pauline will say at a future time t that she will be tired at t' after t.

The parallel behaviour of the two languages can be explained if we assume that the St'át'imcets matrix clauses in (6)-(7) contain a phonologically covert non-future tense morpheme (Matthewson 2006b). The embedding data can then be afforded the same explanation as the corresponding facts in English (see e.g., Abusch 1997, 1998, Ogihara 1996 for suggestions).

We can see that there is good reason to believe that St'át'imcets is tensed, but the evidence for the tense morpheme is not surface-obvious. Of course, there are other languages which have been argued to be tenseless (see e.g., Bohnemeyer 2002, Bittner 2005, Lin 2006, Ritter & Wiltschko 2005, and article 97 (Smith) Tense and aspect). However, I hope to have shown that it is useless to confine one's examination of temporal systems to superficial data. Doing so could lead to the premature rejection of core universal properties of temporal systems in human language. This in turn suggests that we cannot require that semantic universals be surface-testable.

5. How do we find universals?

Given that we must postulate universals based on insufficient data (as data is never available for all human languages), we employ a range of strategies to get from data to universals. I will not discuss sources of evidence such as language change or cognitive-psychological factors; the reader is referred to article 99 (Fritz) Theories of meaning change, article 100 (Geeraerts) Cognitive approaches to diachronic semantics, and article 27 (Talmy) Cognitive Semantics for discussion. Article 12 (Krifka) Varieties of semantic evidence is also relevant here.

5.1. Assume universality

One approach to establishing universals is simply to assume them, based on study of however many languages one has data from. This is actually what all universals research does, as no phenomenon has ever been tested in all of the world's languages. A point of potential debate is the number of languages which need to be examined before a universal claim should be made. This issue arises for semanticists because the nature of the fieldwork means that testing semantic universals necessarily proceeds quite slowly.
13. Methods in cross-linguistic semantics 277 My own belief is that one should always assume universality in the absence of evidence to the contrary - and then go and look for evidence to the contrary. In earlier work I dubbed this strategy the 'No variation null hypothesis' (Matthewson 2001); see also article 95 (Bach & Chao) Semantic types across languages. The assumption of universality is a methodological strategy which tells us to begin with the strongest empirically falsifiable hypothesis. It does not entail that there is an absence of cross-linguistic variation in the semantics, and it does not mean that our analyses of unfamiliar languages must look parallel to those of English. On the contrary, work which adopts this strategy is highly likely to uncover semantic variation, and to help establish the parameters within which natural language semantics may vary. The null hypothesis of universality is assumed by much work within semantics. In fact, some of the most influential semantic universals to have been proposed, those of Barwise & Cooper (1981),advance data only from English in support. (See also Keenan & Stavi 1986 for related work, and article 95 (Bach & Chao) Semantic types across languages for further discussion.) We saw two of Barwise and Cooper's universals above; here are two more for illustration. Ul is a universal semantic definition of the noun phrase (DP, in modern terminology): Ul. NP-Quantifier universal (Barwise & Cooper 1981: 177) Every natural language has syntactic constituents (called noun-phrases) whose semantic function is to express generalized quantifiers over the domain of discourse. U3 contains the well-known conservativity constraint, as well as asserting that there are no languages which lack determiners in the semantic sense. U3. Determiner universal (Barwise & Cooper 1981: 179) Every natural language contains basic expressions, (called determiners) whose semantic function is to assign to common count noun denotations (i.e., sets) A a quantifier that lives on A. Barwise and Cooper do not offer data from any non-English language, yet postulate universal constraints.This methodology was clearly justified, and the fact that not all of their proposed universals have stood the test of time is largely irrelevant. By proposing explicit and empirically testable constraints, Barwise and Cooper's work inspired a large amount of subsequent cross-linguistic research, which has expanded our knowledge of quantifica- tional systems in languages of the world. For example, several of the papers in the Bach et al. volume (1995) specifically address the NP-Quantifier Universal; see also Bittner & Trondhjem (2008) for a recent criticism of standard analyses of generalized quantifiers. It is sometimes asserted that the strategy of postulating universals as soon as one can, or even of claiming that an unfamiliar language shares the same analysis as English, is euro-centric (see e.g., Gil 2001). But this is not correct. Since English is a well-studied language, it is legitimate to compare lesser-studied languages to analyses of English. It is our job to see whether we can find evidence for similarities between English and unfamiliar languages, especially when the evidence may not be superficially obvious.The important point here is that the null hypothesis of universality does not assign analyses of any language any priority over language-faithful analyses of any other languages. 
The null hypothesis is that all languages are the same - not that all languages are like English. Thus, it is just as likely that study of a non-Indo-European language will cause us
to review and revise previous analyses of English. One example of this is Bittner's (2008) analysis of temporal anaphora. Bittner proposes a universal system of temporal anaphora which "instead of attempting to extend an English-based theory to a typologically distant language ... proceeds in the opposite direction - extending a Kalaallisut-based theory to English" (Bittner 2008: 384). Another example is the analysis of quantification presented in Matthewson (2001). I observe there that the St'át'imcets data are incompatible with the standard theory of generalized quantifiers, because St'át'imcets possesses no determiners which create a generalized quantifier by operating on a common noun denotation, as English every or most do. (The insight that Salish languages differ from English in their quantificational structures is originally due to Jelinek 1995.) I propose that the analysis suggested by the St'át'imcets data may be applicable also to English. In a similar vein, see Bar-el (2005) for the claim that the aspectual system of Skwxwu7mesh invites us to reanalyze the aspectual system of English.

The null hypothesis of universality guides empirical testing in a fruitful way; it gives us a starting hypothesis to test, and it allows us to use knowledge gained from the study of one language in the study of another. It inspires us to look beyond the superficial for underlying similarities, yet it does not force us to assume similarity where none exists. It also explicitly encodes the belief that there are limits on cross-linguistic variation. In contrast, an extreme non-universalist position (cf. Gil 2001) would essentially deny the idea that we have any reason to expect similarity between languages. This can lead to premature acceptance of exotic analyses.

5.2. Language acquisition and learnability

Generative linguists are interested in universals which are likely to have been provided by Universal Grammar, and clues about this can come from acquisition research and learnability theory. As argued by Crain & Pietroski (2001: 150) among others, the features which are most likely to be innately specified by UG are those which are shared across languages but which are not accessible to a child in the Primary Linguistic Data (the linguistic input available to the learner). Crain and Pietroski offer the examples of coreference possibilities for pronouns and strong crossover; the restrictions found in these areas are by hypothesis not learnable based on experience. The discussion of tense above is another case in point, as the data required to establish the existence of the null tense morpheme are not likely to be often presented to children in their first few years of life.

5.3. A typological survey

It may seem that a typological survey is at the opposite end of the spectrum from the research approaches discussed so far. However, even nativists should not discount the benefits of typological research. Such studies can help us gain perspective on which research questions are of central interest from a cross-linguistic point of view.

One example of this involves determiner semantics. A preliminary study by Matthewson (2008) (based on grammars of 33 languages from 25 language families) suggests that English is in many ways typologically unusual with respect to the syntax and semantics of determiners. For example, cross-linguistically, articles are the exception rather than the rule. Demonstratives are present in almost all languages, but rarely seem
13. Methods in cross-linguistic semantics 279 to occupy determiner position as they do in English; strong quantifiers often appear to occupy a different position again. In many languages, the only strong quantifiers are universals. While distributive universal quantifiers are frequently reduplications or affixes, non-distributive ones are not. What we see is that a broad study can lead us to pose questions that would perhaps not otherwise be addressed, such as what the reason is for the apparent correlation between the syntax of a universal quantifier and its semantics. There are many examples of typological studies which have inspired formal semantic analyses (e.g.,Cusic 1981,Comrie 1985,Corbett 2000, to name a few, and see also Haspelmath et al.2001 for a comprehensive overview of typology and universals). 6. Semantic universals in the literature It is occasionally implied that since there is no agreed-upon set of semantic universals, they must not exist. For example, Levinson (2003:315) claims: "There are in fact very few hypotheses about semantic universals that have any serious, cross-linguistic backing." However, I believe that the reason there is not a large number of agreed-upon semantic universals is simply that semantic universals have so far only infrequently been explicitly proposed and subjected to cross-linguistic testing. And the situation is changing: there is a growing body of work which addresses semantic variation and its limits, and the literature has reached a level of maturity where the relevant questions can be posed in an interesting way. (Some recent dissertations which employ semantic fieldwork are Faller 2002, Wilhelm 2003, Bar-el 2005, Gillon 2006,Tonhauser 2006, Deal 2010, Murray 2010, Peterson 2010, among others.) Many more semantic universals will be proposed as the field continues to conduct in-depth, detailed study of a wide range of languages. In this section I provide a very brief overview of some universals within semantics, illustrating universals from a range of different domains. In the area of lexical semantics we have for example Levinson's (2003:315) proposal restricting elements encoding spatial relations. He argues that universally, there are at most three frames of reference upon which languages draw, each of which has precise characteristics that could have been otherwise. Levinson's proposals arc discussed further in article 107 (Landau) Space in semantics and cognition; see also article 27 (Talmy) Cognitive Semantics for discussion of location. A proposal which restricts composition methods and semantic rules is that of Bittner (1994). Bittner claims that there are no language-specific or construction- specific semantic rules. She proposes a small set of universal operations which essentially consist of functional application, type-lifting, lambda-abstraction, and a rule interpreting empty nodes. Although her proposal may seem to be worded in a theory- dependent way, claims about possible composition rules can be, and have been, challenged on at least partly empirical grounds; sec for example Hcim & Kratzcr (1998), Chung & Ladusaw (2003) for challenges to the idea that the only compositional rule is functional application. Another proposal of Bittner's involves a set of universals of temporal anaphora, based on comparison of English with Kalaallisut. 
These include statements about the location of states, events, processes and habits relative to topical instants or periods, and restrictions on the default topic time for different event types. The constraints are explicitly formulated independently of syntax: "Instead of aligning LF structures, this strategy aligns communicative functions" (Bittner 2008: 383).
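For concreteness, the first two operations in the universal inventory mentioned above can be stated schematically. The formulations given here are the standard textbook ones (cf. Heim & Kratzer 1998), not Bittner's own definitions:

Functional Application: if a branching node α has daughters β and γ, where β denotes a function of type <a,b> and γ denotes an object of type a, then α denotes the result of applying β's denotation to γ's denotation.

Type-Lifting: an expression denoting an individual x (type e) may be lifted to denote the generalized quantifier λP.P(x), of type <<e,t>,t>.

On Bittner's proposal, compositional interpretation in any language is accomplished entirely by operations from such a small universal inventory.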
A relatively common type of semantic universal involves constraints on semantic types. For example, Landman (2006) proposes that universally, traces and pro-forms can only be of type e. This is a purely semantic constraint: it is independent of whether the syntax of a language allows elements occupying the relevant positions to be of different categories. A universal constraint on the syntax-semantics mapping is proposed by Gillon (2006). Gillon argues that universally, all and only elements which occupy D(eterminer) position introduce a contextual domain restriction variable (von Fintel 1994). (Gillon has to claim, contra Chierchia 1998, that languages like Mandarin have null determiners.) An interesting constructional universal is suggested by Faller (2007). She argues that the semantics of reciprocal constructions (involving plurality, distinctness of co-arguments, universal quantification and reflexivity) may be cross-linguistically uniform.

7. Variation

The search for universals is a search for limits on variation. If we assume a null hypothesis of universality, any observed variation necessitates a rejection of our null hypothesis and leads us to postulate restrictions on the limits of variation. Restrictions on variation are known in the formal literature as 'parameters', but the relation between parameters and universals is so tight that restricted sets of options from which languages choose can be classified as a type of universal (cf. article 95 (Bach & Chao) Semantic types across languages).

One prime source of semantic variation is independent syntactic variation. Given compositionality, the absence of certain structures in a language may result in the absence of certain semantic phenomena. Within the semantics proper, one well-known parameter is Chierchia's (1998) Nominal Mapping Parameter, which states that languages vary in the denotation of their NPs. A language may allow NPs to be mapped into predicates, into arguments (i.e., kinds), or both (the three settings are summarized below). These choices correlate with a cluster of co-varying properties across languages. The Nominal Mapping Parameter represents a weakening of the null hypothesis that the denotations of NPs are constant across languages. Whether or not it is correct is an empirical matter, and Chierchia's work has inspired a large amount of discussion (e.g., Cheng & Sybesma 1999, Schmitt & Munn 1999, Chung 2000, Longobardi 2001, 2005, Doron 2004, Dayal 2004, Krifka 2004, among others). As with Barwise and Cooper's universals, Chierchia's parameter has done a great deal to advance our understanding of the denotations of NPs across languages. The Nominal Mapping Parameter also illustrates the restricted nature of the permitted variation; there are only three kinds of languages. Relevant work on restricted variation within semantics is discussed in article 96 (Doetjes) Count/mass distinction, article 97 (Smith) Tense and aspect, and article 98 (Pederson) The expression of space.

There is often resistance to the postulation of semantic parameters. One possible ground for this resistance is the belief that semantics should be more cross-linguistically invariant than other areas of the grammar. For example, Chomsky (2000: 185) argues that "there are empirical grounds for believing that variety is more limited for semantic than for phonetic aspects of language." It is not clear to me, however, that we have empirical grounds for believing this.
7. Variation

The search for universals is a search for limits on variation. If we assume a null hypothesis of universality, any observed variation necessitates rejecting our null hypothesis and postulating restrictions on the limits of variation. Restrictions on variation are known in the formal literature as 'parameters', but the relation between parameters and universals is so tight that restricted sets of options from which languages choose can be classified as a type of universal (cf. article 95 (Bach & Chao) Semantic types across languages).

One prime source of semantic variation is independent syntactic variation. Given compositionality, the absence of certain structures in a language may result in the absence of certain semantic phenomena.

Within the semantics proper, one well-known parameter is Chierchia's (1998) Nominal Mapping Parameter, which states that languages vary in the denotation of their NPs. A language may allow NPs to be mapped into predicates, into arguments (i.e., kinds), or both. These choices correlate with a cluster of co-varying properties across languages. The Nominal Mapping Parameter represents a weakening of the null hypothesis that the denotations of NPs will be constant across languages. Whether or not it is correct is an empirical matter, and Chierchia's work has inspired a large amount of discussion (e.g., Cheng & Sybesma 1999, Schmitt & Munn 1999, Chung 2000, Longobardi 2001, 2005, Doron 2004, Dayal 2004, Krifka 2004, among others). As with Barwise and Cooper's universals, Chierchia's parameter has done a great deal to advance our understanding of the denotations of NPs across languages. The Nominal Mapping Parameter also illustrates the restricted nature of the permitted variation: there are only three kinds of languages. Relevant work on restricted variation within semantics is discussed in article 96 (Doetjes) Count/mass distinction, article 97 (Smith) Tense and aspect, and article 98 (Pederson) The expression of space.

There is often resistance to the postulation of semantic parameters. One possible ground for this resistance is the belief that semantics should be more cross-linguistically invariant than other areas of the grammar. For example, Chomsky (2000: 185) argues that "there are empirical grounds for believing that variety is more limited for semantic than for phonetic aspects of language." It is not clear to me, however, that we have empirical grounds for believing this. We are still in the early stages of research into semantic variation; we probably do not know enough yet to say whether semantics varies less than other areas of the grammar.
Another possible ground for skepticism about semantic parameters is the belief that semantic variation is not learnable. However, there is no reason why this should be so. Chierchia, for example, outlines a plausible mechanism by which the Nominal Mapping Parameter is learnable, with Chinese representing the default setting in line with Manzini & Wexler's (1987) Subset Principle (Chierchia 1998: 400-401). Chierchia argues that his parameter "is learned in the same manner in which every other structural difference is learned: through its overt morphosyntactic manifestations. It thus meets fully the reasonable challenge that all parameters must concern live morphemes."

A relatively radical semantic parameter is the proposal that some languages lack pragmatic presuppositions in the sense of Stalnaker (1974) (Matthewson 2006a, based on data from St'át'imcets). This parameter fails to be tied to specific lexical items, as it is stated globally. However, the presupposition parameter places languages in a subset-superset relation, and as such is potentially learnable, as long as the initial setting is the St'át'imcets one. A learner who assumes that her language lacks pragmatic presuppositions can learn that English possesses such presuppositions on the basis of overt evidence (such as 'Hey, wait a minute!' responses to presupposition failure, cf. von Fintel 2004).

One common way in which languages vary in their semantics is in degree of underspecification. For example, I argued above that St'át'imcets shares basic tense semantics with English. The languages differ in that the St'át'imcets tense morpheme is semantically underspecified, failing to distinguish past from present reference times. Another case of variation in underspecification involves modality. According to Rullmann, Matthewson & Davis (2008), St'át'imcets modals allow similar interpretations to English modal auxiliaries. However, the interpretations are distributed differently across the lexicon: while in English, a single modal allows a range of different conversational backgrounds (giving rise to deontic, circumstantial or epistemic interpretations, Kratzer 1991), in St'át'imcets, the conversational background is lexically encoded in the choice of modal. Conversely, while in English, the quantificational force of a modal is lexically encoded, in St'át'imcets it is not; each modal is compatible with both existential and universal interpretations.
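This cross-cutting pattern can be summarized schematically (my arrangement of the claims of Rullmann, Matthewson & Davis 2008):

                               English modals         St'át'imcets modals
    conversational background  supplied by context    lexically encoded
    quantificational force     lexically encoded      underspecified (existential or universal)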
One strong hypothesis would be that all languages possess the same functional categories, but that languages differ in the levels of underspecification in the lexical entries of the various morphemes, and in the way in which they 'bundle' the various aspects of meaning into morphemes. As an example of the latter case, Lin's (2006) analysis of Chinese involves partially standard semantics for viewpoint aspect and for tense, but a different bundling of information from that found in English (for example, the perfective aspect in Chinese includes a restriction that the reference time precedes the utterance time).

In terms of lexical aspect, there appears to be cross-linguistic variation in the semantics of basic aspectual classes. Bar-el (2005) argues that stative predicates in Skwxwú7mesh Salish include an initial change-of-state transition. Bar-el implies that while the basic building blocks of lexical entries are provided (e.g., that events can either begin or end with BECOME transitions, cf. Dowty 1979, Rothstein 2004), languages combine these in different ways to give different semantics for the various aspectual classes. Thus, the semantics we traditionally attribute to 'accomplishments' or to 'states' are not primitives of the grammar. While the Salish/English aspectual differences appear not to be reducible to underspecification or bundling, further decompositional analysis may reveal that they are.
I would like to thank Henry Davis for helpful discussion during the writing of this article, and Klaus von Heusinger and Noor van Leusen for helpful feedback on the first draft.

8. References

Abusch, Dorit 1997. Sequence of tense and temporal de re. Linguistics & Philosophy 20, 1-50.
Abusch, Dorit 1998. Generalizing tense semantics for future contexts. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 13-33.
Aitken, Barbara 1955. A note on eliciting. International Journal of American Linguistics 21, 83.
Bach, Emmon et al. (eds.) 1995. Quantification in Natural Languages. Dordrecht: Kluwer.
Baker, Mark 2003. Lexical Categories: Verbs, Nouns and Adjectives. Cambridge: Cambridge University Press.
Bar-el, Leora 2005. Aspectual Distinctions in Skwxwú7mesh. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159-219.
Bittner, Maria 1987. On the semantics of the Greenlandic antipassive and related constructions. International Journal of American Linguistics 53, 194-231.
Bittner, Maria 1994. Cross-linguistic semantics. Linguistics & Philosophy 17, 53-108.
Bittner, Maria 2005. Future discourse in a tenseless language. Journal of Semantics 22, 339-388.
Bittner, Maria 2008. Aspectual universals of temporal anaphora. In: S. Rothstein (ed.). Theoretical and Crosslinguistic Approaches to the Semantics of Aspect. Amsterdam: Benjamins, 349-385.
Bittner, Maria & Naja Trondhjem 2008. Quantification as reference: Kalaallisut Q-verbs. In: L. Matthewson (ed.). Quantification: A Cross-Linguistic Perspective. Amsterdam: Emerald, 7-66.
Bohnemeyer, Jürgen 2002. The Grammar of Time Reference in Yukatek Maya. Munich: Lincom Europa.
Carden, Guy 1970. A note on conflicting idiolects. Linguistic Inquiry 1, 281-290.
Cheng, Lisa & Rint Sybesma 1999. Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30, 509-542.
Chierchia, Gennaro 1998. Reference to kinds across languages. Natural Language Semantics 6, 339-405.
Chomsky, Noam 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chung, Sandra 2000. On reference to kinds in Indonesian. Natural Language Semantics 8, 157-171.
Chung, Sandra & William Ladusaw 2003. Restriction and Saturation. Cambridge, MA: The MIT Press.
Comrie, Bernard 1985. Tense. Cambridge: Cambridge University Press.
Comrie, Bernard 1989. Language Universals and Linguistic Typology: Syntax and Morphology. Oxford: Blackwell.
Corbett, Greville 2000. Number. Cambridge: Cambridge University Press.
Crain, Stephen & Paul Pietroski 2001. Nature, nurture, and universal grammar. Linguistics & Philosophy 24, 139-186.
Crain, Stephen & Rosalind Thornton 1998. Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax. Cambridge, MA: The MIT Press.
Cusic, David 1981. Verbal Plurality and Aspect. Ph.D. dissertation. Stanford University, Stanford, CA.
Davis, Henry 2002. Categorial restrictions in St'át'imcets (Lillooet) relative clauses. In: C. Gillon, N. Sawai & R. Wojdak (eds.). Papers for the 37th International Conference on Salish and Neighbouring Languages. Vancouver, BC: University of British Columbia, 61-75.
Davis, Henry 2009. Cross-linguistic variation in anaphoric dependencies: Evidence from the Pacific Northwest. Natural Language and Linguistic Theory 27, 1-43.
Davis, Henry & Lisa Matthewson 1999. On the functional determination of lexical categories. Revue Québécoise de Linguistique 27, 27-67.
Dayal, Veneeta 2004. Number marking and (in)definiteness in kind terms. Linguistics & Philosophy 27, 393-450.
Deal, Amy Rose 2010. Topics in the Nez Perce Verb. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Demirdache, Hamida & Lisa Matthewson 1995. On the universality of syntactic categories. In: J. Beckman (ed.). Proceedings of the North Eastern Linguistic Society (= NELS) 25. Amherst, MA: GLSA, 79-93.
Dimmendaal, Gerrit 2001. Places and people: Field sites and informants. In: P. Newman & M. Ratliff (eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press, 55-75.
Dixon, Robert M.W. 1977. Where have all the adjectives gone? Studies in Language 1, 19-80.
Doron, Edit 2004. Bare singular reference to kinds. In: R. Young & Y. Zhou (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) XIII. Ithaca, NY: Cornell University, 73-90.
Dowty, David 1979. Word Meaning in Montague Grammar. Dordrecht: Reidel.
Enç, Mürvet 1996. Tense and modality. In: S. Lappin (ed.). The Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 345-358.
Faller, Martina 2002. Semantics and Pragmatics of Evidentials in Cuzco Quechua. Ph.D. dissertation. Stanford University, Stanford, CA.
Faller, Martina 2007. The ingredients of reciprocity in Cuzco Quechua. Journal of Semantics 24, 255-288.
Ferguson, Charles 1972. Verbs of 'being' in Bengali, with a note on Amharic. In: J. Verhaar (ed.). The Verb 'Be' and its Synonyms: Philosophical and Grammatical Studies, 5. Dordrecht: Reidel, 74-114.
von Fintel, Kai 1994. Restrictions on Quantifier Domains. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
von Fintel, Kai 2004. Would you believe it? The king of France is back! Presuppositions and truth value intuitions. In: A. Bezuidenhout & M. Reimer (eds.). Descriptions and Beyond. Oxford: Oxford University Press, 261-296.
von Fintel, Kai & Lisa Matthewson 2008. Universals in semantics. The Linguistic Review 25, 49-111.
Gil, David 1987. Definiteness, noun phrase configurationality, and the count-mass distinction. In: E. Reuland & A. ter Meulen (eds.). The Representation of (In)Definiteness. Cambridge, MA: The MIT Press, 254-269.
Gil, David 2001. Escaping Euro-centrism: Fieldwork as a process of unlearning. In: P. Newman & M. Ratliff (eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press, 102-132.
Gillon, Carrie 2006. The Semantics of Determiners: Domain Restriction in Skwxwú7mesh. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J.L. Morgan (eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 41-58. Reprinted in: S. Davis (ed.). Pragmatics. Oxford: Oxford University Press, 1991, 305-315.
Harris, Zellig S. & Charles F. Voegelin 1953. Eliciting in linguistics. Southwestern Journal of Anthropology 9, 59-75.
Haspelmath, Martin et al. (eds.) 2001. Language Typology and Language Universals: An International Handbook (HSK 20.1-2). Berlin: de Gruyter.
Hayes, Alfred 1954. Field procedures while working with Diegueño. International Journal of American Linguistics 20, 185-194.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Jelinek, Eloise 1995. Quantification in Straits Salish. In: E. Bach et al. (eds.). Quantification in Natural Languages. Dordrecht: Kluwer, 487-540.
Katz, Jerrold 1976. A hypothesis about the uniqueness of human language. In: S. Harnad, H. Steklis & J. Lancaster (eds.). Origins and Evolution of Language and Speech. New York: New York Academy of Sciences, 33-41.
Keenan, Edward & Jonathan Stavi 1986. A semantic characterization of natural language determiners. Linguistics & Philosophy 9, 253-326.
Konstanz Universals Archiv 2002-. The Universals Archive, maintained by the University of Konstanz, Department of Linguistics. http://typo.uni-konstanz.de/archive/intro/, March 5, 2009.
Kratzer, Angelika 1991. Modality. In: D. Wunderlich & A. von Stechow (eds.). Semantics: An International Handbook of Contemporary Research (HSK 6). Berlin: de Gruyter, 639-650.
Krifka, Manfred 2004. Bare NPs: Kind-referring, indefinites, both, or neither? Empirical issues in syntax and semantics. In: R. Young & Y. Zhou (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) XIII. Ithaca, NY: Cornell University, 180-203.
Landman, Meredith 2006. Variables in Natural Language. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Levinson, Stephen 2003. Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
Lin, Jo-Wang 2006. Time in a language without tense: The case of Chinese. Journal of Semantics 23, 1-53.
Longobardi, Giuseppe 2001. How comparative is semantics? A unified parametric theory of bare nouns and proper names. Natural Language Semantics 9, 335-361.
Longobardi, Giuseppe 2005. A minimalist program for parametric linguistics? In: H. Broekhuis et al. (eds.). Organizing Grammar: Linguistic Studies for Henk van Riemsdijk. Berlin: de Gruyter, 407-414.
Manzini, M. Rita & Kenneth Wexler 1987. Parameters, binding theory, and learnability. Linguistic Inquiry 18, 413-444.
Matthewson, Lisa 1998. Determiner Systems and Quantificational Strategies: Evidence from Salish. The Hague: Holland Academic Graphics.
Matthewson, Lisa 2001. Quantification and the nature of cross-linguistic variation. Natural Language Semantics 9, 145-189.
Matthewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of American Linguistics 70, 369-415.
Matthewson, Lisa 2006a. Presuppositions and cross-linguistic variation. In: C. Davis, A. Deal & Y. Zabbal (eds.). Proceedings of the North Eastern Linguistic Society (= NELS) 36. Amherst, MA: GLSA, 63-76.
Matthewson, Lisa 2006b. Temporal semantics in a superficially tenseless language. Linguistics & Philosophy 29, 673-713.
Matthewson, Lisa 2008. Strategies of quantification in St'át'imcets and the rest of the world. To appear in: Strategies of Quantification. Oxford: Oxford University Press.
Montler, Timothy 2003. Auxiliaries and other categories in Straits Salishan. International Journal of American Linguistics 69, 103-134.
Murray, Sarah 2010. Evidentiality and the Structure of Speech Acts. Ph.D. dissertation. Rutgers University, New Brunswick, NJ.
Nida, Eugene 1947. Field techniques in descriptive linguistics. International Journal of American Linguistics 13, 138-146.
Ogihara, Toshiyuki 1996. Tense, Attitudes and Scope. Dordrecht: Kluwer.
Partee, Barbara 2000. 'Null' determiners vs. no determiners. Lecture read at the 2nd Winter Typology School, Moscow.
Peterson, Tyler 2010. Epistemic Modality and Evidentiality in Gitksan at the Semantics-Pragmatics Interface. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
Ritter, Elizabeth & Martina Wiltschko 2005. Anchoring events to utterances without tense. In: J. Alderete, C.-h. Han & A. Kochetov (eds.). Proceedings of the 24th West Coast Conference on Formal Linguistics (= WCCFL). Somerville, MA: Cascadilla Proceedings Project, 343-351.
Rothstein, Susan 2004. Structuring Events. Oxford: Blackwell.
Rullmann, Hotze, Lisa Matthewson & Henry Davis 2008. Modals as distributive indefinites. Natural Language Semantics 16, 317-357.
Schmitt, Cristina & Alan Munn 1999. Against the nominal mapping parameter: Bare nouns in Brazilian Portuguese. In: P. Tamanji, M. Hirotani & N. Hall (eds.). Proceedings of the North Eastern Linguistic Society (= NELS) 29. Amherst, MA: GLSA, 339-353.
Schütze, Carson 1996. The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic Methodology. Chicago, IL: The University of Chicago Press.
Stalnaker, Robert 1974. Pragmatic presuppositions. In: M.K. Munitz & P.K. Unger (eds.). Semantics and Philosophy. New York: New York University Press, 197-213.
Tonhauser, Judith 2006. The Temporal Semantics of Noun Phrases: Evidence from Guarani. Ph.D. dissertation. Stanford University, Stanford, CA.
Whorf, Benjamin 1956. Language, Thought and Reality: Selected Writings of Benjamin Lee Whorf. Edited by J. Carroll. Cambridge, MA: The MIT Press.
Wilhelm, Andrea 2003. Telicity and Durativity: A Study of Aspect in Dene Suline (Chipewyan) and German. Ph.D. dissertation. University of Calgary, Calgary, AB.
Yegerlehner, John 1955. A note on eliciting techniques. International Journal of American Linguistics 21, 286-288.

Lisa Matthewson, Vancouver (Canada)

14. Formal methods in semantics

1. Introduction
2. First order logic and natural language
3. Formal systems, proofs and decidability
4. Semantic models, validity and completeness
5. Formalizing linguistic methods
6. Linguistic applications of syntactic methods
7. Linguistic applications of semantic methods
8. Conclusions
9. References

Abstract

Covering almost an entire century, this article reviews in general and non-technical terms how formal, logical methods have been applied to the meaning and interpretation of natural language. This paradigm of research in natural language semantics produced important new linguistic results and insights, but logic also profited from such innovative applications. Semantic explanation requires properly formalized concepts, but only provides genuine insight when it accounts for linguistic intuitions on meaning and interpretation or the results of empirical investigations in an insightful way. The creative tension between the linguistic demand for cognitively realistic models of human linguistic competence and the logicians' demand for a proper and explicit account of all and only the
valid reasoning patterns initially led to an interesting divergence of methods and associated research agendas. With the maturing of natural language semantics as a branch of cognitive science, an increasing number of logicians trained in linguistics and linguists apt in using formal methods are developing more convergent empirical issues in interdisciplinary research programs.

1. Introduction

The scientific analysis of patterns of human reasoning properly belongs to the ancient discipline of logic, bridging more than twenty centuries from its earliest roots in the ancient Greek philosophical treatises of Plato and Aristotle on syllogisms to its contemporary developments in connecting dynamic reasoning in context to underlying neurobiological and cognitive processes. In reasoning with information from various sources available to us, we systematically exploit (i) the meaning of the words, (ii) the way they are put together in clauses, as well as (iii) the relations between these clauses and (iv) the circumstances in which we received the information in order to arrive at a conclusion (Frege 1892; Tarski 1956). The formal analysis of reasoning patterns not only offers an important window on the meaning and interpretation of logical languages, but also of ordinary, acquired, i.e. natural languages. It constitutes a core component of cognitive science, providing the proper scientific methods to model human information processing as constitutive structure and form. Any explanatory scientific theory of the meaning and interpretation of natural language must at some level aim to characterize all and only those patterns of reasoning that guarantee to preserve in one way or another the assumed truth of the information on which the conclusions are based. In analyzing patterns of inference, content and specific aspects of the interpretation can only be taken into consideration if they can be expressed in syntactic form and constitutive structure or in the semantic meta-language. The contemporary research program of natural language semantics has significantly expanded the expressions of natural language to be subjected to such formal methods of logical analysis to cover virtually all syntactic categories, as well as relations between sentences and larger sections of discourse or text, mapping syntactic, configurational structures to sophisticated notions of semantic content or information structure (Barwise & Perry 1983; Chierchia 1995; Cresswell 1985; Davidson & Harman 1972; Dowty, Wall & Peters 1981; Gallin 1975; Partee, ter Meulen & Wall 1990).

The classical division of labor between syntactic and semantic theories of reasoning is inherited from the mathematical logical theories developed in the early twentieth century, when logical, i.e. unnatural and purposefully designed, languages were the primary subject of investigation. Syntactic theories of reasoning exploit as explanatory tools merely constitutive, configurational methods: structural, i.e. formal operations such as substitution and pure symbol manipulation of the associated proof theories.
The semantic theories of reasoning require an interpretation of such constitutive, formal structure in models to characterize truth-conditions and validity of reasoning as systematic interaction between form and meaning (Boolos & Jeffrey 1980). The syntactic, proof-theoretic strategy, with its customary disregard for meaning as intangible, was originally pursued most vigorously for natural language in the research paradigm of generative grammar. Its earliest mathematical foundational research in automata theory,
formal grammars and their associated design languages regarded structural operations as the only acceptable formal methods. Reasoning or inference was as such not the target of their investigations, as grammars were principally limited to characterizing sentence-internal properties and their computational complexity (Chomsky 1957; Chomsky 1959; Chomsky & Miller 1958, 1963; Hopcroft & Ullman 1979; Gross & Lentin 1970). The semantic, model-theoretic strategy of investigating human reasoning has been pursued most vigorously in natural language semantics, Lambek and Montague grammars and game theoretic semantics, and their 21st century successors, the various dynamic theories of meaning and interpretation founded on developments in intensional and epistemic logics of (sharing) belief and knowledge (Barwise 1989; van Benthem 1986; van Benthem & ter Meulen 1997; Cresswell 1985; Davidson & Harman 1972; Kamp 1981; Kamp & Reyle 1993; Lambek 1958; Lewis 1972, 1983; Montague 1974; Stalnaker 1999).

Formal methods deriving from logic are also applied in what has come to be known as formal pragmatics, where parameters other than worlds or situations, such as context or speaker/hearer, time of utterance or other situational elements, may serve in the models to determine meaning and situated inference. This article will address formal methods in semantics as its main topic, as formal pragmatics may be considered a further generalization of these methods to serve wider linguistic applications, but as such does not in any intrinsic way differ from the formal methods used in semantics.

In both logical and natural languages, valid forms of reasoning in their most general characteristics exploit the information available in the premises, assumed to be true in an arbitrary given model, to draw a conclusion, guaranteed to be also true in that model. Preserving this assumed truth of the premises is a complex process that may be modeled in various formal systems, but if their admitted inference rules are somehow violated in the process, the conclusion of the inference is not guaranteed to be true. This common deductive approach accounts for validity in forms or patterns of reasoning based on the stable meaning of the logical vocabulary, regardless of the class of models under consideration. It has constituted the methodological cornerstone of the research program of natural language semantics, where ordinary language expressions are translated into logical expressions, their 'logical form', to determine their truth conditions in models as a function of their form and subsequently characterize their valid forms of reasoning (May 1985; Montague 1974; Dowty, Wall & Peters 1981). Some important formal methods of major schools in this research program are reviewed, mostly to present their conceptual foundations and discuss their impact on linguistic insights, while referring the reader for more technical expositions and formal details to the relevant current literature and articles 10 (Newen & Schröder) Logic and semantics, 11 (Kempson) Formal semantics and representationalism, 33 (Zimmermann) Model-theoretic semantics, 43 (Keenan) Quantifiers, 37 (Kamp & Reyle) Discourse Representation Theory and 38 (Dekker) Dynamic semantics (see also van Benthem & ter Meulen 1997; Gabbay & Guenthner 1983; Partee, ter Meulen & Wall 1990).
Excluded from consideration as formal methods in the sense intended here are other mathematical methods, such as statistical, inductive inference systems, where the assumptions and conclusions of inferences are considered more or less likely, or various quantitative approaches to meaning based on empirical studies, or optimality systems, which ordinarily do not account for inference patterns, but rank possible interpretations according to a given set of constraints of diverse kinds. In such systems form does not directly and functionally determine meaning, and hence the role of inference patterns, if
any, is quite different from the core role they play in the logical, deductive systems of reasoning which constitute our topic. Of course, as in any well defined domain of scientific investigation, such systems too may be further formalized and perhaps eventually even axiomatized as a logical, formal system. But in the current state of linguistics their methods are often informal, appealing to semantic notions such as meaning only implicitly, resulting in fragmented theories, which, however unripe for formalization, may still provide genuinely interesting and novel linguistic results.

In section 2 of this article the best known logical system, first order logic, is discussed. It served as the point of departure of natural language semantics, in spite of its apparent limitations and idealizations. In section 3 the classical definition of a formal system is specified with its associated notions of proof and theorem, and these are related to its early applications in linguistic grammars. The hierarchy of structural complexity generated by the various kinds of formal grammars is still seen to direct the current quest in natural language semantics for proper characterizations of cognitive complexity. Meta-logical properties such as decidability of the set of theorems are introduced. In section 4 the general notion of a semantic model with its definition of truth conditions is presented, without formalization in set-theoretic notation, serving primarily to distinguish conceptually contingent truth (at a world) in a model from logical truth in any model whatsoever, which represents valid forms of reasoning. Completeness is presented as a desirable, but perhaps not always feasible, property of logical systems that have attained a perfect harmony between their syntactic and semantic sides. Section 5 discusses how the syntactic and semantic properties of pronouns first provided a strong impetus for using formal methods in separate linguistic research programs, which have later converged on a more integrated account of their behavior, currently still very much under investigation. In section 6 we review a variety of proof theoretic methods that were developed in logic, to understand how each of them later had an impact on formal linguistic theories. This is where constraints on derivational complexity and resource-bounded generation of expressions are seen to have their proper place, issues that have only gained in importance in contemporary linguistic research. Section 7 presents an overview of the semantic methods that have been developed in logic over the past century, to see how they have influenced the development of natural language semantics as a flourishing branch of cognitive science, where formal methods provide an important contribution to its scientific methods. The final section 8 contains the conclusion of this article, stating that the application of formal methods deriving from logical theories has greatly contributed to the development of linguistics as an independent academic discipline with an integrated, interdisciplinary research agenda. Seeking a continued convergence of syntactic and semantic methods in linguistic applications will serve to develop new insights into the cognitive capacity underlying much of human information processing.

2. First order logic and natural language

Many well known systems of first order logic (FOL), in which quantifiers may only range over individuals, i.e.
not over properties or sets of individuals, have been designed to study inference patterns deriving from the meaning of the classical Boolean connectives of conjunction (... and ...), disjunction (... or ...), negation (not ...), conditionals (if ... then ...) and biconditionals (... if and only if ...), besides the universal (every N) and existential (some N) quantifiers. Syntactically these FOL systems differed in the number
of axioms, logical vocabulary or inference rules they admitted; some were optimal for simplicity of the proofs, others more congenial to the novel user, relying on the intuitive meaning of the logical vocabulary (Boolos & Jeffrey 1980; Keenan & Faltz 1985; Link 1991).

The strongly reformist attitudes of the early 20th century logicians Bertrand Russell, Gottlob Frege, and Alfred Tarski, today considered the great-grandfathers of modern logic, initially steered FOL developments away from natural language, as it was regarded as too ambiguous, hopelessly vague, or content- and context-dependent. Natural language was even considered prone to paradox, since you can explicitly state that something is or is not true. This is what is meant when natural language is accused of "containing its own truth-predicate", for the semantics of the predicate "is true" cannot be formulated in a non-circular manner for a perfectly grammatical, but self-referential sentence like This statement is false, which is clearly true just in case it is false. Similarly treacherous forms of self-referential acts with circular truth-conditions are found in simple statements such as I am lying, resembling the ancient Cretan Liar Paradox, and in the syntactically overtly self-referential, but completely comprehensible The claim that this sentence contains eleven words is false, which led later model theoretic logicians to develop innovative models of self-reference and logically sound forms of circularity, abandoning the need for rock-bottom atomic elements of sets, as classical set theory had always required (Barwise & Etchemendy 1987).

Initially FOL was advocated as a 'good housekeeping' act in understanding elementary forms of reasoning in natural language, although little serious attention was paid to just how natural language expressions should be systematically translated into FOL, given their syntactic constituent structure. This raised objections of what linguists often called 'miraculous translation', appealing to the implicit intuitions on truth-functional meaning only trained logicians apparently had easy access to. Although this limited logical language of FOL was never intended to come even close to modeling the wealth of expressive power in natural languages, it was considered the basic Boolean algebraic core of the logical inference engine. The descriptive power of FOL systems was subsequently importantly enriched to facilitate the Fregean adage of compositional translation by admitting lambda abstraction, representing the denotation of a predicate as a set A by a characteristic function f_A that tells you for each element d in the underlying domain D whether or not d is an element of that set A, i.e. f_A(d) is true if and only if d is an element of the set A (Partee, ter Meulen & Wall 1990). Higher order quantification with quantifiers ranging over sets of properties, or sets of those ad infinitum, required a type structure, derived from the abstract lambda calculus, a universal theory of functional structure. Typing formal languages resembled in some respects Bertrand Russell's solution to avoid vicious circularity in logic (Barendregt 1984; Carpenter 1997; Montague 1974; Link 1991). More complex connectives or other truth functional expressions and operators were added to FOL, besides situations or worlds to the models, to analyze modal, temporal or epistemic concepts (van Benthem 1983; Carnap 1947; Kripke 1972; Lewis 1983; McCawley 1981).
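To make the characteristic-function idea concrete, here is a minimal executable sketch (my illustration, not the article's; the domain and predicates are invented):

    # Representing the denotation of the predicate 'student' as a set A over a
    # small domain D, and as its characteristic function f_A.
    D = {"john", "mary", "peter"}        # underlying domain of individuals
    A = {"john", "mary"}                 # the set denoted by 'student'

    f_A = lambda d: d in A               # f_A(d) is true iff d is an element of A

    assert f_A("john") and not f_A("peter")

    # Lambda abstraction builds such functions compositionally, e.g. the
    # characteristic function of an intersective modification 'student who sings':
    sings = {"mary", "peter"}
    student_who_sings = lambda d: f_A(d) and d in sings
    assert student_who_sings("mary") and not student_who_sings("john")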
The formal methods of FOL semantics varied from classical Boolean full bi-valuation with total functions that ultimately take only true and false as values, to three- or more-valued models in which formulas may not always have a determined truth value, initially proposed to account for presupposition failure of definite descriptions that did not have a referent in the domain of the intended model. Weaker logical systems were also proposed, admitting fewer inference rules. The intuitionistic logics are perhaps the best known of these, rejecting for philosophical and perhaps conceptual reasons the classical law of double negation and the correlated rule of inference that allowed you to infer a conclusion if its negation had been shown to lead to contradictions. In classical FOL the truth functional definition of the meaning of negation as set-theoretic complement made double negation logically equivalent to no negation at all; e.g. It is not the case that every student did not hand in a paper should at least truth-functionally mean the same as Some student handed in a paper (Partee, ter Meulen & Wall 1990).
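In FOL notation (a standard rendering, mine; hand_in abbreviates the predicate), the classical equivalence behind this example is

\[
\neg \forall x\, \neg\, \mathit{hand\_in}(x) \;\equiv\; \exists x\, \mathit{hand\_in}(x),
\]

an equivalence that intuitionistic logic does not validate in full generality.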
Admitting partial functions that allowed quantifiers to range over possible extensions of their already fixed, given range ventured into higher order methods that were innovative for logic as well, driven by linguistic considerations of pronoun resolution in discourse and various sophisticated forms of quantification found in natural language, to which we return below (Chierchia 1995; Gallin 1975; Groenendijk & Stokhof 1991; Kamp & Reyle 1993).

3. Formal systems, proofs and decidability

Whatever its exact language and forms of reasoning characterized as valid, any particular logical system must adhere to some very specific general requirements, if it is to count as a formal system. A formal system must consist of four components: (i) a lexicon specifying the terminal expressions or words, and a set of non-terminal symbols or categories, (ii) a set of production rules which determine how the well formed expressions of any category of the formal language may be generated, (iii) a set of axioms or expressions of the lexicon that are considered primitive, and (iv) a set of inference rules, determining how expressions may be manipulated. A formal system may be formulated purely abstractly without being intended as a representation of anything, or it may be designed to serve as a description or simulation of some domain of real phenomena or, as intended in linguistics, to model aspects of empirical, linguistic data. A formal proof is the product of a formal system, consisting of (i) axioms, expressions of the language that serve as intuitively obvious or in any case unquestionable first principles, assumed to be true or taken for granted no matter what, and (ii) applications of the rules of inference that generate sequences of steps in the proof, resulting in its conclusion, the final step, also called a theorem.

The grammar of a language, whether logical or natural, is a system of rules that generates all and only all the grammatical or well-formed sentences of the language. But this does not mean that we can always get a definite answer to the general question whether an arbitrary string belongs to a particular language, something a child learning its first language may actually need. There is no general decision procedure determining for any arbitrary given expression whether it is or is not derivable in any particular formal system (Arbib 1969; Davis 1965; Savitch, Bach & Marsh 1987). However, this question is provably decidable for sizable fragments of natural language, even if some form of higher order quantification is permitted (Nishihara, Morita & Iwata 1990; Pratt-Hartmann 2003; Pratt-Hartmann & Third 2006).
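As a toy instance of such a formal system (my illustration; the fragment is invented), the lexicon and production rules of a small context-free grammar suffice to generate the well-formed sentences of a miniature fragment of English:

    # A minimal sketch: a context-free grammar as a formal system in the sense
    # of components (i)-(ii) above; the start symbol S plays the role of axiom,
    # and rewriting a non-terminal is the single rule of inference.
    RULES = {
        "S":  [["NP", "VP"]],
        "NP": [["every", "N"], ["some", "N"]],
        "VP": [["sings"], ["V", "NP"]],
        "N":  [["student"], ["farmer"]],
        "V":  [["knows"]],
    }

    def generate(symbol, depth=0):
        """Yield terminal strings derivable from `symbol` (depth-bounded)."""
        if symbol not in RULES:             # a terminal word of the lexicon
            yield [symbol]
            return
        if depth > 3:                       # crude resource bound on recursion
            return
        for rhs in RULES[symbol]:
            # expand each right-hand-side symbol and combine the results
            expansions = [[]]
            for sym in rhs:
                expansions = [e + rest
                              for e in expansions
                              for rest in generate(sym, depth + 1)]
            yield from expansions

    for sentence in list(generate("S"))[:5]:
        print(" ".join(sentence))           # e.g. 'every student sings'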
Current research on generative complexity is focused on expanding the fragments of natural language that are known to be decidable, so as to capture realistic limitations on search complexity, in an attempt to characterize formal counterparts to experimentally obtained results on human limitations of cognitive resources, such as memory or processing time. Automated theorem provers certainly assist people in detecting proofs for complex theorems or huge domains. People actually use smart, but still little understood, heuristics in finding derivations, trimming down the search space of alternative variable assignments, for example, by marked prosody and intonational meaning. Much of the contemporary research in applying formal methods to natural language is motivated by the attempt to restrain the complexity or computational power of formal systems to less powerful, learnable and decidable fragments. In such cognitively realistic systems the inference rules constrain the search space of valuation functions or exploit limited resources in linguistically interesting and insightful ways.

The general research program of determining the generative or computational complexity of natural languages first produced, in the late 1950s, the well known Chomsky Hierarchy of formal language theory, which classifies formal languages, their corresponding automata and the phrase-structure grammars that generate the languages as regular, context-free, context-sensitive or unrestricted rewrite systems (Arbib 1969; Gross & Lentin 1970; Hopcroft & Ullman 1979). Initially, Chomsky (1957, 1959) claimed that the rich structure of natural languages with its complex forms of agreement required the strength of unrestricted rewrite systems, or Turing machines. Linguists quickly realized that for grammars to be learnable and to claim to model in any cognitively realistic way our human linguistic competence, whether or not innate, their generative power should be substantially restricted (Peters & Ritchie 1973). Much of the contemporary research on resource-bounded categorial grammars and the economy of derivation in minimalist generative grammar, comparing the formal complexity of derivations, is still seeking to distill universal principles of natural languages, considered cognitive constants of human information processing, linguistically expressed in structurally very diverse phenomena (Ades & Steedman 1982; Moortgat 1996; Morrill 1994, 1995; Pentus 2006; Stabler 1997).

4. Semantic models, validity and completeness

On the semantic side of formal methods, the notion of a model M plays a crucial role in the definition of truth-conditions for formulas generated by the syntax. The meaning of the Boolean connectives is considered not to vary from one model to the next, as it is separated in the vocabulary as the closed class of expressions of the logical vocabulary. The conjunction (... and ...) is true just in case each of the conjuncts is; the disjunction (... or ...) is false only in case neither disjunct is true. The conditional is only false in case the antecedent (if ... clause) is true, but the consequent (then ... clause) is false. The bi-conditional (... if and only if ...) is true just in case the two parts have the same truth value. Negation (not ...) simply reverses the truth value of the expression it applies to. For the interpretation of the quantifiers an additional tool is required, a variable assignment function g, which assigns to each variable x in the vocabulary of variables a referent d, an element of the domain D of the model M. Proper names and the descriptive vocabulary containing all kinds of predicates, i.e.
adjectives, nouns, and verbs, are interpreted by a function P, given with the model M, that specifies who was the bearer of the name, or who had a certain property corresponding to a one-place predicate, or stood in a certain relation for relational predicates in the given model M.
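The fixed truth functions of the connectives can be written out directly (a minimal sketch of my own, not the article's notation):

    # Truth functions of the Boolean connectives; these do not vary across models.
    def conj(p, q):   return p and q       # '... and ...'
    def disj(p, q):   return p or q        # '... or ...'
    def cond(p, q):   return (not p) or q  # 'if ... then ...': false only if p true, q false
    def bicond(p, q): return p == q        # '... if and only if ...'
    def neg(p):       return not p         # 'not ...'

    assert cond(False, True) and not cond(True, False)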
Linguists were quick to point out that in natural language at least conjunctions and disjunctions connect not only full clauses, but also noun phrases, which is not always reducible to a sentential connective, e.g. John and Mary met ≠ John met and Mary met. This kind of linguistic criticism quickly led to generalizations of the Boolean operations in a logical system, where variable assignment functions are generalized and given the flexibility to assign referents of appropriate complex types in a richer higher order logic, but such complexities need not concern us here.

A formal semantic model M hence consists of: (i) a domain D of semantic objects, sometimes classified into types or given a particular internal structure, and (ii) a function P that assigns appropriate denotations to the descriptive vocabulary of the language interpreted. If the language also contains quantifiers, the model M comes equipped with a given variable assignment function g, often considered arbitrary, to provide a referent, which is an element of D, for all free variables. The given variable assignment function is sometimes considered to represent the current context in some contemporary systems that investigate context dependencies and indexicals. In FOL the universal quantifier every x [N(x)] is interpreted as true in the model M if all possible alternative assignment functions g' to the variable x that it binds provide a referent d in the denotation of N. So not only the given variable assignment function g provides a d in the denotation of N, but all alternative assignments g' that may provide another referent, but could equally well have been considered the given one, also do. For the existential quantifier some x [N(x)] only one such variable assignment, the given one or another alternative variable assignment function, suffices to interpret the quantifier as true in the model M. An easy way to understand the effect of the interpretation of quantifiers in clauses is to see that for a universal NP the set denoted by the N should be a subset of the set denoted by the VP, e.g. every student sings requires that the singers include all the students. Similarly, the existential quantifier requires that the intersection of the denotation of the N and the denotation of the VP is not empty, e.g. some student sings means that among the singers there is at least one student.

Intensional models generalize this elementary model theory for extensional FOL to characterize truth in a model relative to a possible world, situation or some other index, representing, for instance, temporal or epistemic variability. Intensional operators typically require a clause in their scope to be true at all or at only some such indices, mirroring strong, universal and existential quantifiers respectively. The domain of intensional models must be enriched with the interpretation of the variables referring to such indices, if such meta-variables are included in the language to be interpreted. Otherwise they are considered to be external to the model, added as a set of parameters indexing the function interpreting the language. Domains may also be structured as (semi)lattices or other partial orders, which has proven useful for the semantics of plurals, mass terms and temporal reference to events, but such further variations on FOL must remain outside the scope of the elementary exposition in this article.
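The subset and intersection conditions for every and some can likewise be spelled out over a toy model M = (D, P) (my illustration; domain and predicates invented):

    # A toy model: domain D and interpretation function P.
    D = {"al", "bo", "cy"}
    P = {"student": {"al", "bo"}, "sing": {"al", "bo", "cy"}}

    def every(n, vp):
        # 'every N VP': the N-set must be a subset of the VP-set
        return P[n] <= P[vp]

    def some(n, vp):
        # 'some N VP': the N-set and the VP-set must intersect
        return bool(P[n] & P[vp])

    assert every("student", "sing")   # all students are among the singers
    assert some("student", "sing")    # at least one student sings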
To characterize logical truths, reflecting the valid reasoning patterns, one simply generalizes over all formulas true in all logically possible models, to obtain those formulas that must be true as a matter of necessity or form only, due to the meaning assigned to their logical vocabulary, irrespective of what is actually considered to be factually the case in the models. By writing out syntactic proofs with all their premises conjoined as antecedents of a conditional of which the conclusion is the consequent, one may test in a semantic way the validity of the inference. If such a conditional cannot be falsified, the proof is valid and vice versa.
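Schematically (standard notation, not the article's): a proof with premises P_1, ..., P_n and conclusion C is semantically valid just in case the corresponding conditional is true in every model, i.e.

\[
P_1, \ldots, P_n \models C \quad \text{iff} \quad \models (P_1 \wedge \cdots \wedge P_n) \rightarrow C.
\]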
A semantic interpretation of a language, natural or otherwise, is considered formal only if it provides such precise logical models in which the language can be systematically interpreted. Accordingly, contingent truths, which depend for their truth on what happens to be the case in the given model, are properly distinguished from logical truths, which can never be false in any possible model, because the meaning of the logical vocabulary is fixed outside the class of models and hence remains invariable.

A logical system in which every proof of a theorem can be proven to correspond to a semantically valid inference pattern and vice versa is called a complete system, as FOL is. If a formal system contains statements that are true in every possible model but cannot be proven as theorems within that system by the admitted rules of inference and the given axiom base, the system is considered incomplete. Familiar systems of arithmetic have been proven to be incomplete, but that does not disqualify them from their sound use in practice and they definitely still serve as a valuable tool in applications.

5. Formalizing linguistic methods

Given the fundamental distinction between syntactic, proof-theoretic and semantic, model-theoretic characterizations of reasoning in theories that purport to model meaning and interpretation, the general question arises what (dis)advantages these two methodologically distinct approaches may respectively have for linguistic applications. Although in its generality this question may not be answerable in a satisfactory way, it is clear that at least at the outset of linguistics as its own, independent scientific discipline, quite different and mostly disconnected, if not antagonistic, research communities were associated with the two strategies. This separation of minds was only too familiar to logicians from the early days of modern logic, where proof theorists and model theoretic semanticists often drew blood in their disputes on the priority, conceptual or otherwise, of their respective methods. Currently seeking convergence of research issues in syntax and semantics is much more en vogue, and an easy go-between in syntactic and semantic methods has already proven to pay off in obtaining the best linguistic explanations and new insights.

One of the best examples of how linguistic questions could fruitfully be addressed both by syntactic and semantic formal methods, at first separately, but later in tandem, is the thoroughly studied topic of binding pronouns and its associated concept of quantifier scope. Syntacticians focused primarily on the clear configurational differences between free (1a) and bound pronouns (1b), and reflexive pronouns (1c), which all depend in different ways on the subject noun phrase that precedes and commands them in the same clause. Co-indexing was their primary method of indicating binding, though no interpretive procedure was specified with it, as meaning was considered elusive (Reinhart 1983a, 1983b).

(1) a. [Every student]_i who knows [John/a professor]_j loves [him]_{#i/j}.
    b. [Every student]_i loves [[his]_i teacher]_j.
    c. [Every student]_i who knows [John/a professor]_j loves [himself]_{i/#j}.
This syntactic perspective on the binding behavior of pronouns within clauses had deep relations to constraints on movement as a transformation on a string, to which we return below. It limited its consideration of data to pronominal dependencies among clauses within sentences, disregarding the fact that singular universal quantifiers cannot bind pronouns across sentential boundaries (2a,b), but proper names, plurals and indefinite noun phrases typically do (2c).

(2) a. [Every student]_i handed in a paper. [He]_{#i} passed the exam.
    b. [Every student]_i who handed in a paper [t]_i passed the exam.
    c. [John/A student/All students]_i handed in a paper. [He/they]_i passed the exam.

Semanticists had to understand how pronominal binding in natural language was in some respects similar to, but in other respects quite different from, the ordinary variable binding of FOL. From a semantic perspective the first puzzle was how ordinary proper names and other referring expressions, including freely referring pronouns, that were considered to have no logical scope and hence could not enter into scope ambiguities, could still force pronouns to corefer (3a), even in intensional contexts (3b).

(3) a. John/he loves his mother.
    b. John believes that Peter loves his mother.
    c. Every student believes that Peter loves his mother.

In (3a) the reference of the proper name John or the contextually referring free pronoun he fixes the reference of the possessive pronoun his. In (3b) the possessive pronoun his in the subordinate clause could be interpreted as dependent on Peter, but equally easily as dependent upon John in the main clause. If FOL taught semanticists to identify bound variables with those variables that were syntactically within the scope of an existential or universal quantifier, proper names had to be reconsidered as quantifiers having scope, yet referring rigidly to the same individual, even across intensional contexts. By considering quantifying in as a primary semantic technique to bind variables simultaneously, first introduced in Montague (1974), semanticists fell into the logical trap of identifying binding with configurational notions of linear scope. This fundamental connection ultimately had to be abandoned, when the infamous Geach sentence (4)

(4) Every farmer who owns a donkey beats it.

made syntacticians as well as semanticists realize that existential noun phrases in restrictive relative clauses of universal subject noun phrases could inherit, as it were, their universal force, creating for the interpretation of (4) cases of farmers and donkeys over which the quantifier was supposed to range. This is called unselective binding, introduced first by David Lewis (Lewis 1972, 1983), but brought to the front of research in semantic circles in Kamp (1981).
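A standard first-order rendering of the unselectively bound reading of (4) (a textbook formalization, not the article's own notation) quantifies universally over farmer-donkey cases:

\[
\forall x \forall y\, [(\mathit{farmer}(x) \wedge \mathit{donkey}(y) \wedge \mathit{own}(x,y)) \rightarrow \mathit{beat}(x,y)].
\]

Note that the indefinite a donkey here contributes no existential quantifier of its own; its variable is bound by the same universal that binds the farmer variable.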
Generalizing quantifying in as a systematic procedure to account also for intersentential binding of pronouns, as in (2), was soon also realized to produce counterintuitive results. This motivated an entirely new development of dynamic semantics, where the interpretation of a given sentence would partially determine the interpretation of the next sentence in a text, and pronominal binding was conceptually once and for all separated from the logical or configurational notion of scope (Groenendijk & Stokhof 1991; Kamp & Reyle 1993). The interested reader is referred to article 38 (Dekker) Dynamic semantics and article 40 (Büring) Pronouns for further discussion and details of the resulting account.

6. Linguistic applications of syntactic methods

In logic itself quite a few different flavors of formal systems had been developed based on formal proof-theoretic (syntactic) or model-theoretic (semantic) methods. The best known on the proof-theoretic side are axiomatic proof theory (Jeffrey 1967), Gentzen sequent calculus (Gentzen 1934), combinatorial logic (Curry 1961), and natural deduction (Jeffrey 1967; Partee, ter Meulen & Wall 1990), each of which has made distinct and significant contributions to various developments in semantics. Of the more semantically flavored developments the most familiar are Tarski's classical notion of satisfaction in models (Tarski 1956), the Lambek and other categorial grammars (Ajdukiewicz 1935; Bar-Hillel 1964; van Benthem 1987, 1988; Buszkowski 1988; Buszkowski & Marciszewski 1988; Lambek 1958; Oehrle, Bach & Wheeler 1988), tightly connecting syntax and semantics via type theory (Barendregt 1984; van Benthem 1991; Carpenter 1997; Morrill 1994), Beth's tableaux method (Beth 1970), game theoretic semantics (Hintikka & Kulas 1985), besides various intensional logical systems, enriched with indices representing possible worlds or other modal notions, which have each also led to distinctive semantic applications (Asher 1993; van Benthem 1983; Cresswell 1985; Montague 1974). The higher order enrichment of FOL with quantification over sets, properties of individuals or properties of properties in Montague Grammar (Barwise & Cooper 1981; van Benthem & ter Meulen 1985; Montague 1974) at least initially directed linguists' attention away from the global linguistic research program of seeking to constrain the generative capacity and computational complexity of the formal methods in order to model realistically the cognitive capacities of human language users, as it focused everyone's attention on compositionality, type theory and type shifting principles, and adopted a fully generalized functional structure. The next two sections selectively review some of the formal methods of these logical systems that have led to important semantic applications and innovative developments in linguistic theory.

An axiomatic characterization of a formal system is the best way to investigate its logic, as it matches its semantics, allowing one to prove straightforwardly its soundness and completeness, i.e. to demonstrate that every provable theorem is true in all models (semantically valid) and vice versa. For FOL a finite, in fact small, number of logically true expressions suffices to derive all and only all valid expressions, i.e. FOL is provably complete. But in actually constructing new proofs an axiomatic characterization is much less useful, for it does not offer any reliable heuristics as guidance for an effective proof search. The main culprit is the rule of inference guaranteeing the transitivity of composition: to derive A → C from A → B and B → C requires finding an expression B which has no trace in the conclusion A → C.
Since there are infinitely many possible such Bs, you cannot exhaustively search for it. The Gentzen sequent calculus (Gentzen 1934) is known to be equivalent to the axiomatic characterization of FOL, and it proved that any proof of a theorem using the transitivity of composition, or its equivalent, the so-called Cut inference of the Gentzen sequent calculus, may be transformed into a proof that avoids using this rule.
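In sequent notation (a standard formulation, not quoted from the article), the Cut rule is

\[
\frac{\Gamma \Rightarrow A \qquad A, \Delta \Rightarrow B}{\Gamma, \Delta \Rightarrow B}\ (\text{Cut}),
\]

where the cut formula A occurs in the premises but leaves no trace in the conclusion; this is exactly the property that forces unrestricted proof search to guess it.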
Therefore, transitivity of composition is considered 'logically harmless', since the Gentzen sequent calculus effectively limits the search for a proof of any theorem to the expressions constituting the theorem you want to derive, the so-called subformula property. The inference rules in Gentzen sequent calculus are hence guaranteed to decompose the complexity of the expressions in a derivation, making the question whether an expression is a theorem decidable. But from the point of view of linguistic applications the original Gentzen calculus harbored another drawback as a generative system. It allowed structural inference rules that permuted the order of premises in a proof or rebracketed any triple (associativity), making it hard to capture linguistically core notions of dominance, governance or precedence between constituents of a sentence to be proven grammatical. To restore constitutive order to the premises, the premises had to be regarded as creating a linearly ordered sequence or n-tuple, or a multiset (in mathematics, a multiset (or bag) is a generalization of a set: a member of a multiset can have more than one instance, while each member of a set has only one, unique instance). This is now customary within the current categorial grammars deriving from Gentzen's system, reviewed below among the semantic methods that affected developments in natural language semantics (Moortgat 1997).

Perhaps the most familiar proof theoretic characterization of FOL is Natural Deduction, in which rules of inference systematically introduce or eliminate the connectives and quantifiers (Jeffrey 1967; Partee, ter Meulen & Wall 1990). This style of constructing proofs is often taught in introductory logic classes, as it does provide a certain intuitive heuristics in constructing proofs and closely follows the truth-conditional meaning given to the logical vocabulary, while systematically decomposing the conclusion and given assumptions until atomic conditions are obtained. Proving a theorem with natural deduction rules still requires a certain amount of ingenuity and insight, which may be trained by practice. But human beings will never be perfected to attain logical omniscience, i.e. the power to find each and every possible proof of a theorem. No actual success in finding a proof may mean either that you have to work harder at finding it or that the expression is not a theorem, but you never know for sure which situation you are in (Boolos & Jeffrey 1980; Stalnaker 1999).

The fundamental demand that grammars of natural languages must realistically model the human cognitive capacities to produce and understand language has led to a wealth of developments in searching for ways to cut down on the generative power of formal grammars and their corresponding automata. Early in the development of generative grammar, the unrestricted deletion transformation was quickly considered the most dangerously powerful operation in an unrestricted rewrite system or Turing machine, as it permitted the deletion of any arbitrary expressions that were redundant in generating the required surface expression (Peters & Ritchie 1973). Although deletion as such has still not been eliminated altogether as a possible effect of movement, it is now always constrained to leave a trace or some other formal expression, making deleted material recoverable. Hence no expressions may simply disappear in the context of a derivation.
The Empty Category Principle (ECP) substantiated this recoverability requirement further, stating that all traces of moved noun phrases and variables must be properly governed (Chomsky 1981; Haegeman 1991; van Riemsdijk & Williams 1986). This amounts to requiring them to be c-commanded by the noun phrase interpreted as binding them (c-command is a binary relation between nodes in a tree structure defined as follows: node A c-commands node B iff A ≠ B, A does not dominate B and B does
not dominate A, and every node that dominates A also dominates B). Extractions of an adjunct phrase out of a wh-island as in (5), or moving wh-expressions out of a that-clause as in (6), are clearly ungrammatical, because they violate this ECP condition, as the traces t_i are co-indexed with, and hence intended to be interpreted as bound by, expressions outside their proper governance domain.

(5) *How_i did Mary ask whether someone had fixed the car t_i?

(6) *Who_i does Mary believe that t_i will fix the car?

The classical logical notion of quantifier scope is much less restricted, as ordinarily quantifiers may bind variables in intensional contexts without raising any semantic problems of interpretation, as we saw above in (3c) (Dowty, Wall & Peters 1981; Partee, ter Meulen & Wall 1990). For instance, in intensional semantics the sentence (7)

(7) Mary believes that someone will fix the car.

has at least one so-called 'de re' interpretation in which someone is 'quantified in' and assigned a referent in the actual world, of whom Mary believes that he will fix the car in a future world, where Mary's beliefs have come true. Such wide scope interpretations of noun phrases occurring inside a complementizer that-clause are considered in generative syntax a form of opaque or hidden movement at LF, regarded as a matter of semantic interpretation, and hence external to grammar, i.e. not a question of the derivation of syntactic surface word order (May 1985). Surface wide scope wh-quantifiers in such 'de re' constructions binding overt pronouns occurring within the intensional context, i.e. within the that-clause, are perfectly acceptable, as in (8).

(8) Of whom_i does Mary believe that he_i will fix the car?

Anyone intending to convey that Mary's belief regarded a particular person, rather than someone hypothetically assumed to exist, would be wise to use such an overt wide scope clause as in (8), according to a background pragmatic view that speakers should select the optimal syntactic form to express their thought and to avoid any possible misunderstanding.

This theoretical division of labor between tangible, syntactic movement to generate proper surface word order and intangible movement to allow disambiguation of the semantic scope of quantifiers has perhaps been the core bone of contention over many years between, on the one hand, the generative formal methods, in which semantic ambiguity does not have to be syntactically derived, and, on the other hand, categorial grammar and its later developments in Montague grammar, which required full compositionality, i.e. syntactic derivation must determine semantic interpretation, disambiguating quantifier scope by syntactic derivation. In generative syntax every derivational difference had to be meaningful, however implicit this core notion remained, but in categorial grammars certain derivational differences could be provably semantically equivalent and hence meaningless, often denigratingly called the problem of spurious ambiguities. To
characterize the logical equivalence of syntactically distinguished derivations required an independent semantic characterization of their truth conditions, considered a suitable task of logic, falling outside the scope of linguistic grammar proper, according to most generativists. This problem of characterizing which expressions with different derivational histories would be true in exactly the same models, hence would be logically equivalent, simply does not arise in generative syntax, as it hides semantic ambiguities as LF movement not reflected in surface structure, avoiding syntactic disambiguation of semantic ambiguities.

A much stronger requirement on grammatical derivations is to demand methodologically that all and only the constitutive expressions of the derived expression must be used in a derivation, often called surface compositionality (Cresswell 1985; Partee 1979). This research program aims to eliminate from linguistic theory anything that is not absolutely necessary. Chomsky (1995) claimed that both deep structure, completely determined by lexical information, and surface structure, derived from it by transformations, may be dispensed with. Given that a language consists of expressions which match sound structure to representations of their meaning, Universal Grammar should consist merely of a set of phonological, semantic, and syntactic features, together with an algorithm to assemble features into lexical expressions and a small set of operations, including move and merge, that constitute syntactic objects, the computational system of human languages. The central thesis of this minimalist framework is that the computational system is the optimal, most simple solution to legibility conditions at the phonological and semantic interfaces. The goal is to explain all the observed properties of languages in terms of these legibility conditions and properties of the computational system. Often advocated since the early 1990s is the Lexicalist Hypothesis, requiring that syntactic transformations may operate only on syntactic constituents, and can only insert or delete designated elements, but cannot be used to insert, delete, permute, or substitute parts of words. This Lexicalist Hypothesis, which is certainly not unchallenged even among generativists, comes in two versions: (a) a weak one, prohibiting transformations from being used in derivational morphology, and (b) a strong version, prohibiting the use of transformations in inflection. It constitutes the most fundamentally challenging attempt from a syntactic perspective to approach surface compositionality as seen in the well-known theories of natural language semantics to which we now turn.

7. Linguistic applications of semantic methods

The original insight of the Polish logician Alfred Tarski was that the truth-conditional semantics of any language must be stated recursively in a distinct meta-language in terms of satisfaction of formulas consisting of predicates and free variables, to avoid the paradoxical forms of self-reference alluded to above (Tarski 1956; Barwise & Etchemendy 1987). By defining satisfaction directly, and deriving truth conditions from it, a proper recursive definition could be formulated for the semantics of any complex expression of the language. For instance, in FOL an assignment satisfies the complex sentence S and S′ if and only if it satisfies S and it also satisfies S′.
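This recursion on syntactic structure can be rendered directly as an executable definition. The following is a minimal sketch for a toy first-order fragment over a finite domain (a hedged illustration of our own; the datatype, the model and all names are invented here and are not drawn from Tarski's or any cited formulation). The clause for the universal quantifier anticipates the prose that follows: it checks satisfaction against every x-variant of the given assignment.

```haskell
-- Tarskian satisfaction by assignments for a toy FOL fragment.
import qualified Data.Map as Map

type Entity     = String
type Assignment = Map.Map String Entity       -- variables to entities

data Formula = Pred String String             -- e.g. Pred "sings" "x"
             | And Formula Formula
             | Forall String Formula
             | Exists String Formula

-- A model: a finite domain plus an extension for each one-place predicate.
data Model = Model { domain    :: [Entity]
                   , extension :: String -> [Entity] }

satisfies :: Model -> Assignment -> Formula -> Bool
satisfies m g (Pred p x) =
  Map.findWithDefault (error ("unassigned variable " ++ x)) x g
    `elem` extension m p
satisfies m g (And phi psi) =
  satisfies m g phi && satisfies m g psi
satisfies m g (Forall x phi) =
  -- g satisfies (every x)phi iff every x-variant of g satisfies phi
  all (\d -> satisfies m (Map.insert x d g) phi) (domain m)
satisfies m g (Exists x phi) =
  any (\d -> satisfies m (Map.insert x d g) phi) (domain m)

-- A toy model in which both individuals sing.
m1 :: Model
m1 = Model { domain    = ["ann", "bo"]
           , extension = \p -> if p == "sings" then ["ann", "bo"] else [] }

main :: IO ()
main = print (satisfies m1 Map.empty (Forall "x" (Pred "sings" "x")))  -- True
```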
For universal quantification, the definition required an assignment f to satisfy the sentence 'Every x sings' if and only if, for every individual that some other assignment f′ assigns to the variable x, while assigning the same things as f to the other variables, f′ satisfies sings(x), i.e. the value of every such f′(x) is an element of the set of singers. Tarski's definition of
satisfaction is compositional, since whether an assignment satisfies a complex expression depends only on the syntactic composition of its constituents and their semantics, as Gottlob Frege had originally required (Frege 1892). Truth conditions can subsequently be stated relative to a model and an arbitrary given assignment, assigning all free variables their reference. Truth cannot be compositionally defined directly for 'Every x sings' in terms of the truth of sings(x), because sings(x) has a free variable x, so its truth depends on which assignment happens to be the given one. The Tarskian truth-conditional semantics of FOL also provided the foundation for natural language semantics, limited to fragments that do not contain any truth or falsity predicate, nor verbs like to lie, nor other expressions directly concerned with veridicality. The developments of File Change Semantics (Heim 1982), Discourse Representation Theory (Kamp 1981; Kamp & Reyle 1993), Situation Theory (Barwise & Perry 1983; Seligman & Moss 1997) and dynamic Montague Grammar (Chierchia 1995; Groenendijk & Stokhof 1991), which all allowed free variables or reference markers representing certain uses of pronouns to be interpreted as if bound by a widest scope existential quantifier, even if they occurred in different sentences, fully exploit this fundamental Tarskian approach to compositional semantics by satisfaction.

Other formal semantic methods for FOL were subsequently developed in the second half of the 20th century as alternatives to Tarskian truth-conditional semantics. Beth (1970) designed a tableaux method in which a systematic search for counterexamples to the assumed validity of a reasoning pattern, seeking to verify the premises but falsify its conclusion, leads in a finite number of decompositional steps either to such a counterexample, if one exists, or to closure, tantamount to a proof that no such counterexample exists (Beth 1970; Partee, ter Meulen & Wall 1990). This semantic tableaux method provided a procedure to enumerate the valid theorems of FOL, because it only required a finite number of substitutions in deriving a theorem: (i) the expression itself, (ii) all of its constituent expressions, and (iii) certain simple combinations of the constituents depending on the premises. Hence any tableau for a valid theorem eventually closes, and the method produces a positive answer. It does not, however, constitute a decision procedure for testing the validity of any derivation, since it does not enumerate the set of expressions that are not theorems of FOL.

Game-theoretic semantics characterizes the semantics of FOL and richer, intensional logics in terms of rules for playing a verification game between a truth-seeking player and a falsification-seeking, omniscient Nature (Hintikka & Kulas 1985; Hintikka & Sandu 1997; Hodges 1985). Its interactive and epistemic flavor made it especially suitable for the semantics of interrogatives, in which requests for information are acts of inquiry resolved by the answerer providing the solicited information (Hintikka 1976). Such information-theoretic methods are currently further explored in the generalized context of dynamic epistemic logic, where communicating agents each have access to partial, private and publicly shared information and seek to share or hide information they may have depending on their communicative needs and intentions (van Ditmarsch, van der Hoek & Kooi 2007).
Linguistic applications to the semantics of dialogue or multi-agent conversations in natural language already seem promising (Ginzburg & Sag 2000).

It was first shown in Skolem (1920) how second-order methods could provide novel tools for logical analysis by rewriting any linear FOL formula with an existential quantifier in the scope of a universal quantifier into a formula with a quantification prefix consisting of existential quantifiers ranging over assignment functions, followed by
only monadic (one-place) universal quantifiers binding individual variables. The dependent first-order existential quantifier is eliminated by allowing such quantification over second-order choice functions that assign the value of the existentially quantified dependent variable as a function of the referent assigned to the monadic, universally quantified individual variable preceding it. Linguistic applications using such Skolem functions have been given in the semantics of questions (Engdahl 1986) and the resolution of functional pronouns (Winter 1997). The general strategy to liberate FOL from the linear dependencies of quantifiers by allowing higher-order quantification or partially ordered, i.e. branching, quantifier prefixes was linguistically exploited in the semantic research on branching quantifiers (Hintikka & Sandu 1997; Barwise 1979). From a linguistic point of view the identification of linear quantifier scope with bound occurrences of variables in their bracketed ranges never really seemed justified, since informational dependencies such as coreference of pronouns bound by an indefinite noun phrase readily cross sentential boundaries, as we saw in (2c). Furthermore, retaining perfect information on the referents already assigned to all preceding pronouns smacks of unrealistic logical omniscience, where human memory limitations and contextual constraints are disregarded. It is obviously much too strong an epistemic requirement on ordinary people sharing their necessarily always limited, partial information (Seligman & Moss 1997). Game-theoretic semantics rightly insisted that a proper understanding of the logic of information independence, and hence of the lack of information, was just as much needed for natural language applications as the logic of binding and other informational dependencies. Such strategic reconsiderations of the limitations of foundational assumptions of logical systems have prompted innovative logical research programs, considerably expanding the formal methods available in natural language semantics (Muskens, van Benthem & Visser 1997).

By exploiting the full-scale higher-order quantification of the type-theoretic categorial grammars, Montague Grammar first provided a fully compositional account of the translation of syntactically disambiguated natural language expressions to logical expressions by treating referential noun phrases semantically on a par with quantificational ones, as generalized quantifiers denoting properties of sets of individuals. This was accomplished obviously at the cost of generating spurious ambiguities ad libitum, giving up on the program of modeling linguistic competence realistically (van Benthem & ter Meulen 1985; Keenan & Westerståhl 1997; Link 1991; Montague 1974; Partee 1979). Its type theory, based only on two primitive types, e for individual-denoting expressions and t for truth-value-denoting expressions, forged a perfect fit between the syntactic categories and the function-argument structure of their semantics. For instance, all nouns are considered syntactic objects that require a determiner on their left side to produce a noun phrase, and semantically denote a set of entities, of type <e, t>, which is an element in the generalized quantifier of type <<e, t>, t> denoted by the entire noun phrase. Proper names, freely referring pronouns, universal and existential NPs are hence treated semantically on a par as denoting a set of sets of individuals.
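To make this type-theoretic fit concrete, here is a minimal sketch in Haskell (a toy illustration of our own; the domain, the lexical entries and the lift function are invented for the example and are not Montague's own notation). Entities instantiate type e, truth values type t, and a noun phrase denotes a generalized quantifier of type <<e,t>,t>, i.e. a function from sets of individuals to truth values.

```haskell
-- Montague-style types and generalized quantifiers over a finite domain.
type E  = String          -- type e:  individuals
type T  = Bool            -- type t:  truth values
type ET = E -> T          -- type <e,t>:  sets of individuals
type GQ = ET -> T         -- type <<e,t>,t>:  generalized quantifiers

dom :: [E]
dom = ["ann", "bo", "carl"]

-- Determiners map a noun denotation of type <e,t> to a generalized quantifier.
every, some :: ET -> GQ
every noun = \vp -> all (\x -> not (noun x) || vp x) dom
some  noun = \vp -> any (\x -> noun x && vp x) dom

-- A proper name is lifted to the same type: the set of all sets that
-- contain its bearer, so names and quantified NPs combine with a
-- predicate in exactly the same way.
lift :: E -> GQ
lift x = \vp -> vp x

student, sings :: ET
student = (`elem` ["ann", "bo"])
sings   = (`elem` ["ann", "bo", "carl"])

main :: IO ()
main = do
  print (every student sings)   -- True: every student sings
  print (some  student sings)   -- True: some student sings
  print (lift "ann" sings)      -- True: Ann, lifted to a GQ, sings
```

The lift clause is the uniformity the passage describes: a name denotes the set of all sets containing its bearer, and so can occupy any position a quantified NP can.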
This fruitful strategy led to a significant expansion of the fragments of natural languages that were provided with a compositional model-theoretic semantics, including many kinds of adverbial phrases, degree and measurement expressions, unusual and complex quantifier phrases, presuppositions, questions, imperatives, causal and temporal expressions, but also lexical relations that affected reasoning patterns (Chierchia 1995; Krifka 1989). Logical properties of generalized quantifiers prove to be very useful in explaining, for instance, not only which noun
phrases are acceptable in pleonastic or existential contexts, but also why the processing time of noun phrases may vary in a given experimental situation and how their semantic complexity may also constrain their learnability. In pressing on for a proper, linguistically adequate account of pronouns in discourse, and for a cognitively realistic logic of information sharing in changing contexts, new tools that allowed for non-linear structures to represent information content play an important, conceptually clarifying role in separating quantifier scope from the occurrence of variables in the linear or partial order of formulas of a logical language, while retaining the core model-theoretic insights in modeling inference as a concept based on Tarskian satisfaction conditions.

8. Conclusions

The development of formal methods in logic has contributed essentially to the emancipation of linguistic research into an academic community where formal methods were given their proper place as an explanatory tool in scientific theories of meaning and interpretation. Although logical languages are often designed with a particular purpose in mind, they reflect certain interesting computational or semantic properties also exhibited, though sometimes implicitly, in natural languages. The properties of natural languages that lend themselves to analysis and explanation by formal methods have increased steadily over the past century, as the formal tools of logical systems were more finely chiseled to fit the purpose of linguistic explanation better. Even more properties will most likely become accessible to linguistic explanation by formal methods over the next century. The issues of cognitive complexity, characterized at many different levels, from the neurobiological, molecular structure detected in neuro-imaging to interactive behavioral studies and experimental investigations of processing time, provide a new set of empirical considerations in the application of formal methods to natural language. They drive experimental innovations and require an interdisciplinary research agenda to integrate the various modes of explanation into a coherent model of human language use and communication of information. The current developments in dynamic natural language semantics constitute major improvements in expanding linguistic application to a wider range of discourse phenomena. The forms of reasoning in which context-dependent expressions may change their reference during the processing of the premises are now considered to be interesting aspects of natural languages that logical systems are challenged to simulate, rather than avoid, as our great-grandfathers' advice originally directed us to do. There is renewed attention to limiting, in a principled and empirically justified way, the search space to decidable fragments of FOL and to restricting higher-order methods, in order to reduce complexity and model cognitively realistic human processing power. Such developments in natural language semantics will eventually converge with the syntactic research programs focusing on universals of language as constants of human linguistic competence.

9. References

Ades, Anthony E. & Mark J. Steedman 1982. On the order of words. Linguistics & Philosophy 4, 517-558.
Ajdukiewicz, Kazimierz 1935. Die syntaktische Konnexität. Studia Philosophica 1, 1-27. English translation in: S. McCall (ed.). Polish Logic. Oxford: Clarendon Press, 1967.
Arbib, Michael A. 1969. Theories of Abstract Automata. Englewood Cliffs, NJ: Prentice Hall.
Asher, Nicholas 1993. Reference to Abstract Objects in Discourse. Dordrecht: Kluwer.
Bar-Hillel, Yehoshua 1964. Language and Information: Selected Essays on their Theory and Application. Reading, MA: Addison-Wesley.
Barendregt, Hendrik P. 1984. The Lambda Calculus. Amsterdam: North-Holland.
Barwise, Jon 1979. On branching quantifiers in English. Journal of Philosophical Logic 8, 47-80.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics & Philosophy 4, 159-219.
Barwise, Jon & John Etchemendy 1987. The Liar. An Essay on Truth and Circularity. Oxford: Oxford University Press.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
van Benthem, Johan 1983. The Logic of Time: A Model-Theoretic Investigation into the Varieties of Temporal Ontology and Temporal Discourse. Dordrecht: Reidel.
van Benthem, Johan 1986. Essays in Logical Semantics. Dordrecht: Reidel.
van Benthem, Johan 1987. Categorial grammar and lambda calculus. In: D. Skordev (ed.). Mathematical Logic and its Applications. New York: Plenum, 39-60.
van Benthem, Johan 1988. The Lambek calculus. In: R.T. Oehrle, E. Bach & D. Wheeler (eds.). Categorial Grammars and Natural Language Structures. Dordrecht: Reidel, 35-68.
van Benthem, Johan 1991. Language in Action: Categories, Lambdas and Dynamic Logic. Amsterdam: North-Holland.
van Benthem, Johan & Alice ter Meulen 1985. Generalized Quantifiers in Natural Language. Dordrecht: Foris.
van Benthem, Johan & Alice ter Meulen 1997. Handbook of Logic and Language. Amsterdam: Elsevier.
Beth, Evert 1970. Aspects of Modern Logic. Dordrecht: Reidel.
Boolos, George & Richard Jeffrey 1980. Computability and Logic. Cambridge: Cambridge University Press.
Buszkowski, Wojciech 1988. Generative power of categorial grammars. In: R.T. Oehrle, E. Bach & D. Wheeler (eds.). Categorial Grammars and Natural Language Structures. Dordrecht: Reidel, 69-94.
Buszkowski, Wojciech & Witold Marciszewski 1988. Categorial Grammar. Amsterdam: Benjamins.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Carpenter, Bob 1997. Type-Logical Semantics. Cambridge, MA: The MIT Press.
Chierchia, Gennaro 1995. Dynamics of Meaning: Anaphora, Presupposition, and the Theory of Grammar. Chicago, IL: The University of Chicago Press.
Chomsky, Noam 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam 1959. On certain formal properties of grammars. Information and Control 2, 137-167.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, Noam & George A. Miller 1958. Finite-state languages. Information and Control 1, 91-112.
Chomsky, Noam & George A. Miller 1963. Introduction to the formal analysis of natural languages. In: R.D. Luce, R. Bush & E. Galanter (eds.). Handbook of Mathematical Psychology, vol. 2. New York: Wiley, 269-321.
Cresswell, Max J. 1985. Structured Meanings: The Semantics of Propositional Attitudes. Cambridge, MA: The MIT Press.
Curry, Haskell B. 1961. Some logical aspects of grammatical structure. In: R. Jakobson (ed.). Structure of Language and its Mathematical Aspects. Providence, RI: American Mathematical Society, 56-68.
Davidson, Donald & Gilbert Harman 1972. Semantics of Natural Language. Dordrecht: Reidel.
Davis, Martin 1965. The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems and Computable Functions. Hewlett, NY: Raven Press.
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi 2007. Dynamic Epistemic Logic. Dordrecht: Springer.
Dowty, David, Robert Wall & Stanley Peters 1981. Introduction to Montague Semantics. Dordrecht: Reidel.
Engdahl, Elisabeth 1986. Constituent Questions. Dordrecht: Reidel.
Frege, Gottlob 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. Reprinted in: G. Patzig (ed.). Funktion, Begriff, Bedeutung. Fünf logische Studien. 3rd edn. Göttingen: Vandenhoeck & Ruprecht, 1969, 40-65. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56-78.
Gabbay, Dov & Franz Guenthner 1983. Handbook of Philosophical Logic, vol. 1-4. Dordrecht: Reidel.
Gallin, Daniel 1975. Intensional and Higher-Order Modal Logic. With Applications to Montague Semantics. Amsterdam: North-Holland.
Gentzen, Gerhard 1934. Untersuchungen über das logische Schließen I & II. Mathematische Zeitschrift 39, 176-210, 405-431.
Ginzburg, Jonathan & Ivan Sag 2000. Interrogative Investigations: The Form, Meaning and Use of English Interrogatives. Stanford, CA: CSLI Publications.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy 14, 39-100.
Gross, Maurice & André Lentin 1970. Introduction to Formal Grammars. Berlin: Springer.
Haegeman, Liliane 1991. Introduction to Government and Binding Theory. Oxford: Blackwell.
Heim, Irene R. 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hintikka, Jaakko 1976. The Semantics of Questions and the Questions of Semantics: Case Studies in the Interrelations of Logic, Semantics, and Syntax. Amsterdam: North-Holland.
Hintikka, Jaakko & Jack Kulas 1985. Anaphora and Definite Descriptions: Two Applications of Game-Theoretical Semantics. Dordrecht: Reidel.
Hintikka, Jaakko & Gabriel Sandu 1997. Game-theoretical semantics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 361-410.
Hodges, Wilfrid 1985. Building Models by Games. Cambridge: Cambridge University Press.
Hopcroft, John E. & Jeffrey D. Ullman 1979. Introduction to Automata Theory, Languages and Computation. Reading, MA: Addison-Wesley.
Jeffrey, Richard 1967. Formal Logic: Its Scope and Limits. New York: McGraw-Hill.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk (ed.). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre, 277-322.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Keenan, Edward L. & Leonard M. Faltz 1985. Boolean Semantics for Natural Language. Dordrecht: Reidel.
Keenan, Edward & Dag Westerståhl 1997. Generalized quantifiers in linguistics and logic. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 837-893.
Krifka, Manfred 1989. Nominal reference, temporal constitution and quantification in event semantics. In: R. Bartsch, J. van Benthem & P. van Emde Boas (eds.). Semantics and Contextual Expressions. Dordrecht: Foris, 75-115.
Kripke, Saul A. 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 253-355 and 763-769.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly 65, 154-170.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169-218.
Lewis, David 1983. Philosophical Papers, vol. I. Oxford: Oxford University Press.
Link, Godehard 1991. Formale Methoden in der Semantik. In: A. von Stechow & D. Wunderlich (eds.). Semantik. Ein internationales Handbuch der zeitgenössischen Forschung (HSK 6). Berlin: de Gruyter, 835-860.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT Press.
McCawley, James D. 1981. Everything that Linguists Have Always Wanted to Know About Logic But Were Ashamed to Ask. Chicago, IL: The University of Chicago Press.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richard H. Thomason. New Haven, CT: Yale University Press.
Moortgat, Michael 1996. Multimodal linguistic inference. Journal of Logic, Language and Information 5, 349-385.
Moortgat, Michael 1997. Categorial type logics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 93-179.
Morrill, Glyn 1994. Type Logical Grammar. Categorial Logic of Signs. Dordrecht: Kluwer.
Morrill, Glyn 1995. Discontinuity in categorial grammar. Linguistics & Philosophy 18, 175-219.
Muskens, Reinhard, Johan van Benthem & Albert Visser 1997. Dynamics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587-648.
Nishihara, Noritaka, Kenichi Morita & Shigenori Iwata 1990. An extended syllogistic system with verbs and proper nouns, and its completeness proof. Systems and Computers in Japan 21, 96-111.
Oehrle, Richard T., Emmon Bach & Deirdre Wheeler 1988. Categorial Grammars and Natural Language Structures. Dordrecht: Reidel.
Partee, Barbara 1979. Semantics - mathematics or psychology? In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer.
Partee, Barbara, Alice ter Meulen & Robert Wall 1990. Mathematical Methods in Linguistics. Dordrecht: Kluwer.
Pentus, Mati 2006. Lambek calculus is NP-complete. Theoretical Computer Science 357, 186-201.
Peters, Stanley & Richard Ritchie 1973. On the generative power of transformational grammars. Information Sciences 6, 49-83.
Pratt-Hartmann, Ian 2003. A two-variable fragment of English. Journal of Logic, Language & Information 12, 13-45.
Pratt-Hartmann, Ian & Allan Third 2006. More fragments of language. Notre Dame Journal of Formal Logic 47, 151-177.
Reinhart, Tanya 1983a. Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya 1983b. Coreference and bound anaphora: A restatement of the anaphora question. Linguistics & Philosophy 6, 47-88.
van Riemsdijk, Henk & Edwin Williams 1986. Introduction to the Theory of Grammar. Cambridge, MA: The MIT Press.
Savitch, Walter J., Emmon Bach & William Marsh 1987. The Formal Complexity of Natural Language. Dordrecht: Reidel.
Seligman, Jerry & Lawrence S. Moss 1997. Situation theory. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 239-280.
Skolem, Thoralf 1920. Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theorem über dichte Mengen. Videnskapsselskapets Skrifter I, Matem.-naturv. Kl. 4, 1-36.
Stabler, Edward 1997. Derivational minimalism. In: C. Retoré (ed.). Logical Aspects of Computational Linguistics. Berlin: Springer, 68-95.
Stalnaker, Robert 1999. Context and Content: Essays on Intentionality in Speech and Thought. Oxford: Oxford University Press.
Tarski, Alfred 1956. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Oxford: Clarendon Press.
Winter, Yoad 1997. Choice functions and the scopal semantics of indefinites. Linguistics & Philosophy 20, 399-467.

Alice G.B. ter Meulen, Geneva (Switzerland)

15. The application of experimental methods in semantics

1. Introduction
2. The stumbling blocks
3. Off-line evidence for scope interpretation
4. Underspecification vs. full interpretation
5. On-line evidence for representation of scope
6. Conclusions
7. References

Abstract

The purpose of this paper is twofold. On the methodological side, we shall attempt to show that even relatively simple and accessible experimental methods can yield significant insights into semantic issues. At the same time, we argue that experimental evidence, both the type collected in simple questionnaires and measures of on-line processing, can inform semantic theories. The specific case that we address here concerns the investigation of quantifier scope. In this area, where judgements are often subtle and controversial, the gradient data that psycholinguistic experiments provide can be a useful tool to distinguish between competing approaches, as we demonstrate with a case study. Furthermore, we describe how a modification of existing experimental methods can be used to test predictions of underspecification theories. The program of research we outline here is not intended to be a prescriptive set of instructions for researchers, telling them what they should do; rather it is intended to illustrate some problems an experimental semanticist may encounter, but also the profit of this enterprise.

1. Introduction

A wide range of data types and sources are used in the field of semantics, as is demonstrated by the related article 12 (Krifka) Varieties of semantic evidence in this volume. The aim of this article is to show with an example series of research studies what sort of questions can be addressed with experimental tools, and to suggest that these methods can deliver valuable data which is relevant to basic assumptions in semantics. This text also attempts to address the constraints on and limits to such an approach. These are both methodological and theoretical: it has long been recognized that links between empirical measures and theoretical constructs require careful argumentation to establish.
The authors therefore have two aims: one related to experimental methodologies and the other to do with the value of processing data. First, they seek to show that even relatively simple and accessible experimental methods can yield significant insights into semantic issues. Second, they wish to illustrate that experimental evidence such as that gathered in their eye-tracking study has the potential to inform semantic theory.

Semanticists have of course always sought confirmatory evidence to support their analyses. There is, on the one hand, fairly extensive use of computational techniques and corpus data in the field, and a growing body of experimental work on semantic processing, language acquisition, and pragmatics, but in the area of theoretical and formal semantics experimental methods are less frequently employed. Now there are good reasons for this. There are inherent factors, related to the accessibility of the relevant measures, why controlled data-gathering techniques are still somewhat less frequent in this field than in some others. We shall discuss what these reasons are and demonstrate with a case study what constraints they place on empirical studies, particularly experimental studies. The example research program that we shall report is thus not simply a recipe for others for what should be done; rather, it is an illustration of the difficulties involved, which aims to explore some of the boundaries of what is accessible to experimental studies.

The specific case that we address here concerns the investigation of quantifier scope, a perennial issue in semantics. Previous attempts to account for the complex data patterns to be found in natural languages have met with the difficulty that the causal factors and preferences need first to be identified before a realistic model can be developed. This requires as an initial step the capture and measurement of the relevant effects and their interactions, which is no trivial task.

The next section lays out a range of reasons why semanticists do not routinely seek to test the empirical bases of their theories with simple experiments. Section 3 reports the series of empirical investigations on quantifier scope carried out by Bott and Radó in ongoing research. Section 4 lays out some of the theoretical background and importance of these studies for current theory (the underspecification debate). The final section takes as a starting point Bott and Radó (2009) to suggest how some of the problems noted in section 3 may be overcome with a more sophisticated experimental procedure.

2. The stumbling blocks

As Manfred Krifka notes in his neighbouring article 12 (Krifka) Varieties of semantic evidence, a major problem with investigating meaning is that we cannot yet fully define what it is. This is indeed a root cause of difficulty, but here we shall attempt to illustrate in more practical detail what effects this has on attempts to conduct experiments in this field.

2.1. Specifying meaning without using language

The essential feature distinguishing experimental procedure is control. In language experiments we may distinguish three (sets of) variables: linguistic form, context, and meaning. In the typical experiment we will keep two of them constant and systematically vary the other. Much semantic research concerns the systematic interdependence of form, context, and meaning. These issues can be investigated for example by:
a) keeping form and context constant, manipulating meaning systematically, and measuring the felicity of the outcome (in judgements, or reaction times, or processing effort), or
b) manipulating (at least one of) form and context, and measuring perceived meaning.

The first requires the experimenter to manipulate meaning as a variable, which entails expressing meaning in a form other than language (pictures, situation descriptions, etc.); the second requires the experimenter to measure perceived meaning, which again normally demands reference to meanings captured in non-linguistic form. But precisely this expression of tightly constrained meaning in non-linguistic form is very difficult. To show how this factor affects studies in semantics disproportionately, it is worth noting how this makes controlled studies in semantics more challenging than in syntax. Work in experimental syntax is often interested in addressing precisely those effects of form change which are independent of meaning. The variable meaning can thus be held constant, but this does not require it to be exactly specified. Thus only the syntactic analysis need be controlled, which makes empirical studies in syntax a whole parameter less difficult than those in semantics.

2.2. The boundaries of form, context, and meaning

A further problem of exact studies concerning meaning is that the three variables are not always clearly distinguished, in part because they systematically covary, but also in part because linguists do not always agree about the boundaries. This is particularly visible when we seek to identify where an anomaly lies. Views have changed over time in linguistics about the nature and location of ill-formedness (e.g. the discussion of the status of I am lurking in a culvert in Ross 1970), but the fundamental ambiguity is still with us. For example, Weskott & Fanselow (2009) give the following examples and judgements of syntactic and semantic well-formedness: (1a) is syntactically ill-formed (*), (1b) is semantically ill-formed (#), and (1c) is ill-formed on both accounts (*#).

(1) a. *Die Suppe wurde gegen versalzen.
the soup was against oversalted
b. #Der Zug wurde gekaut.
the train was chewed
c. *#Das Eis wurde seit entzündet.
the ice was since inflamed

Our own judgements suggest that the structures in (1a) and (1c) have no acceptable syntactic analysis, and therefore no semantic analysis can be constructed - they are thus both syntactically and semantically ill-formed. Crucially, the semantic anomaly is dependent upon the syntactic problem; the lack of a recognizable compositional interpretation is a result of the lack of a possible structural analysis. We would therefore regard these examples as primarily syntactically unacceptable. This contrasts with (1b), which we regard as well-formed on both parameters, being merely implausible, except in a small child's playroom, where a train being chewed is an entirely normal situation (cf. Hahne & Friederici 2002).
2.3. Plausibility

Such examples highlight another problem in manipulating meaning as an experimental variable: the human demand to make sense of linguistic forms. We associate possible meanings with things that we can accept as being true or plausible. So 'the third-floor apartment reappeared today', which is both syntactically and semantically flawless, will cause irrelevant experimental effects, since subjects will find it difficult to fit the meaning into their mental model of the world. Zhou & Gao (2009), for example, argue that participants interpret Every robber robbed a bank in the surface scope reading because it is more plausible that each robber robbed a different bank. This links in to a wider discussion of the role of plausibility as a factor in semantic processing and as a filter on possible readings. Zhou & Gao (2009) claim that such doubly quantified sentences are ambiguous in Mandarin, since their experimental evidence suggests that both interpretations are built up in parallel, but one reading is subsequently filtered out by plausibility, which accounts for the contrary judgements in work on semantic theory (e.g. Huang 1982, Aoun & Li 1989).

2.4. Meaning as a complex measure

The meaning of a structure is not fixed or unique, even when linguistic, social, and discourse context are fixed. First, a single expression may have multiple readings, which compete for dominance. Often a specific relevant reading of a structure needs to be forced in an experiment. Some readings of theoretical interest may be quite inaccessible, though nevertheless real. This raises the issue of expert knowledge, which again contrasts with the situation in syntax. Syntactic well-formedness judgements are generally available and accessible to any native speaker and require no expertise. On the other hand, it can require specialist knowledge to 'get' some readings, since the access to variant readings is usually via different analyses. This is a crucial point in semantics, since it reduces the likelihood that the intuitions of the naive native speaker can be the final arbiter in this field, as they can reasonably be argued to be in syntax (Chomsky 1965). A fine example of this is from Hobbs & Shieber (1987):

(2) Two representatives of three companies saw most samples.

They claim that this sentence is five-ways ambiguous. Park (1995) however denies the existence of one of these readings (three > most > two). It is doubtful whether this question is solvable by asking naive informants.

Even within a given analysis of a construction, the meaning may not be fully determined. Aspects of meaning are left unspecified, which means that two different perceivers can interpret a single structure in different ways. This too requires great care and attention to detail when designing experiments which aim to be exact.

2.5. The observer's paradox

A frequent aim in semantic experiments is to discover how subjects interpret linguistic input under normal conditions. A constant problem is how experimenters can access this information, because whatever additional task we instruct the subjects to carry out renders the conditions abnormal. For example, if we ask them to choose which one of a
pair of pictures illustrates the interpretation that they have gathered, or even if we just observe their eye movements, the very presence of two pictures is likely to make them more aware that more than one interpretation is possible, thus biasing the results. Even a single picture can alter or trigger the accessibility of a reading.

2.6. Inherent meaning and inferred meaning

One last linguistic distinction which we should note here is that between the inherent meaning of an expression ("what is said") and the inferred meaning of a given utterance of an expression. This distinction is fundamental in the division of research into meaning into separate fields, but it is in practice very difficult to apply in experimental work, since naive informants do not naturally differentiate the two. The recent 'literal Lucy' approach of Larson et al. (2010) is a promising solution to this problem; in this paradigm participants must report how 'literal Lucy', who only ever perceives the narrowly inherent meaning of utterances and makes no inferences, would understand example sentences.

This distinction is particularly important when an experimental design requires a disambiguation, and extreme care must be taken that its content is not only inferred. For example, in (3), it is implicated that every rugby player broke one of their own fingers, but this is not necessarily the case. This example thus cannot offer watertight disambiguation.

(3) Every rugby player broke a finger.
Implication: Every rugby player broke one of their own fingers.

2.7. Experimental measures and the object of theory

As a rule, semantic theory makes no predictions about semantic processing. Instead it concerns itself with the final stable interpretation which is achieved after a whole linguistic expression, usually at the sentence level, has been processed and all reanalyses, for example as a result of garden paths, have been resolved. It fundamentally concerns the stative, holistic result of the processing of an expression; indeed many theoretical approaches regard meaning as only coming about in a full sentence (cf. article 8 (Meier-Oeser) Meaning in pre-19th century thought). But the processing of a sentence is made up of many steps which are incremental and which interact strongly with each other, partly predicting, partly parsing input as it arrives, partly confirming or revising previous analyses. Much of the experimental evidence available to us provides direct evidence only of these processing steps. It thus follows that for many semantics practitioners much of the empirical evidence which we can gather concerns at best our predictions about what the sentence is going to mean, not really aspects of its actual meaning. The time course of our arriving at a particular reading, whether it be remote or readily accessible, has no direct implications for the theory, since this makes no predictions about processing speed (cf. Phillips & Wagers 2007). One aim of this article is to show that experimental techniques can deliver data which can contribute to theory building.

2.8. Categorical predictions and gradient data

Predictions of semantic theories typically concern the availability of particular interpretations. Experiments deliver more fine-grained data that reflect the relative preferences
among the interpretations. Mapping these gradient data onto the categorical predictions, that is, drawing the line between still available and impossible readings, is a non-trivial task. At the same time, the ability to distinguish preferences among the "intermediate" interpretations may be highly relevant for testing predictions concerning readings that fall between the clearly available and the clearly impossible.

2.9. Outlook

In the remainder of this paper we will discuss two ways in which systematically collected experimental data can contribute to semantic theorizing. We will use quantifier scope as an example of a phenomenon where results of psycholinguistic experiments can make significant contributions to the theoretical discussions. We will not attempt to review here the considerable psycholinguistic literature on the processing of quantifiers (for a comprehensive survey cf. article 102 (Frazier) Meaning in psycholinguistics). Instead we will concentrate on a small set of studies that show the usefulness of end-of-sentence judgements in establishing the available interpretations of quantified sentences. Then we will sketch an experiment to address aspects of the unfolding interpretation of quantifier scope which are of interest to theoretical semanticists as well.

3. Off-line evidence for scope interpretation

Semantic theories are typically based on introspective judgements of a handful of theoreticians. The judgements concern available readings of a sentence, possibly ranked as to how easily available these readings are. Not surprisingly, judgements of this sort are subtle and often controversial. For instance, the sentence Everyone loves someone has been alternately considered to only allow the wide-scope universal reading (e.g. Hornstein 1995; Beghelli & Stowell 1997) or to be fully ambiguous (May 1977, 1985; Hornstein 1984; Higginbotham 1985). Example (2) above illustrates the same point. Park (1995) and Hobbs & Shieber (1987) disagree about the number of available readings.

The data problem has been known for a long time. Studies as early as Ioup (1975) and VanLehn (1978) have used the intuitions of naive speakers in developing an empirically motivated theory. However, it has been clear from the beginning that "obvious" tasks such as paraphrasing a presumably ambiguous doubly-quantified sentence or asking informants to choose a (preferred) paraphrase are rather complex, and that linguistically untrained participants may not be able to carry them out reliably.

Another purely linguistic task has been problematic for a different reason. Researchers have tried to combine the quantified sentence with a disambiguating continuation, as in (4).

(4) Every kid climbed a tree.
(a) The tree was full of apples.
(b) The trees were full of apples.

Disambiguation of this type was used by Gillen (1991), Kurtzman & MacDonald (1993), Tunstall (1998) and Filik, Paterson & Liversedge (2004), for instance. Here the plural continuation is only acceptable if multiple trees are instantiated, that is, the wide-scope universal interpretation (every kid > a tree) is chosen, whereas the singular continuation
is intended to only fit the wide-scope existential interpretation (a tree > every kid). Unfortunately the singular continuation fails to disambiguate the sentence, as Tunstall (1998) points out: the tree in (4a) can easily be taken to mean the tree the kid climbed, thus making it compatible with the wide-scope universal interpretation as well (see also Bott & Radó 2007 and article 102 (Frazier) Meaning in psycholinguistics).

Problems of these kinds have prompted researchers to look for non-linguistic methods of disambiguation. Gillen (1991) used, among other methods, simple pictures resembling set diagrams. In her experiments subjects either drew diagrams to represent the meaning of quantified sentences, chose the diagram that corresponded to the (preferred) reading, or judged how well the situation depicted in the diagram fitted the sentence. Bott & Radó (2007) tested a somewhat modified form of the last of these methods, using diagrams like those in Fig. 15.1, to see whether they constitute a reliable mode of disambiguation that naive informants can use easily. They found that participants consistently delivered the expected judgements both for scopally unambiguous quantified sentences (i.e. sentences where one scope reading was excluded due to an intervening clause boundary) and for ambiguous quantified sentences where expected preferences could be determined based on theoretical considerations and corpus studies. These results show that there is no a priori reason to exclude the judgements of non-linguist informants from consideration.

Fig. 15.1: Disambiguating diagrams for the sentence "Exactly one novel was read by each student"; panel A depicts the reading exactly one > each, panel B the reading each > exactly one, each diagram relating the set of students to the set of novels.

For informative experiments, however, we need to be able to derive testable hypotheses based on existing semantic proposals. Although semantic theories are not formulated to make predictions about processing, it is still possible to identify areas where different approaches lead to different predictions concerning the judgement of particular constructions. The interpretation of quantifiers provides an example here as well. One possible way of classifying theories of quantifier scope has to do with the way different factors are supposed to affect the scope properties of quantifiers. In configurational models such as Reinhart (1976, 1978, 1983, 1995) and Beghelli & Stowell (1997), quantifiers move to/are interpreted in different structural positions. A quantifier higher in the (syntactic) tree will always outscope lower ones. The absolute position in the tree is irrelevant; what matters is the position relative to the other quantifier(s). While earlier proposals only considered syntactic properties of quantifiers, Beghelli and Stowell also include semantic factors in the hierarchy of quantifier positions. Taking distributivity as an example, assuming that a +dist quantifier is interpreted in Spec,QP, which is the
highest position available for quantifiers, Q1 will outscope Q2 if only Q1 is +dist, regardless of what other properties Q1 or Q2 may have. An effect of other factors will only become apparent if neither of the quantifiers is +dist. By contrast, the basic assumption in multi-factor theories of quantifier scope is that each factor has a certain amount of influence on quantifier scope regardless of the presence or absence of other factors (cf. Ioup 1975; Kurtzman & MacDonald 1993; Kuno 1991 and Pafel 2005). The effects of different factors can be combined, resulting in greater or lesser preference for a particular interpretation. Theories differ in whether one of the readings disappears when it is below some threshold, or whether sentences with multiple quantifiers are always necessarily ambiguous.

Let us assume that the two scope-relevant factors we are interested in are distributivity and discourse-binding, the latter indicated by the partitive NP one of these N, see (6). Crossing these factors yields four possible combinations: +dist/+d-bound, +dist/-d-bound, -dist/+d-bound, and -dist/-d-bound. In a configurational theory there will presumably be a structural position reserved for discourse-bound phrases. Let us consider the case where this position is lower than that for +dist, but higher than the lowest scope position available for quantifiers. Thus Q1 should outscope Q2 in the first two configurations, Q2 should outscope Q1 in the third, and the last one may in fact be fully scope ambiguous unless some additional factors are at play as well. Moreover, as configurational theories of scope have no mechanism to predict the relative strength of scope preference, the first two configurations should show the same size preference for a wide-scope interpretation of Q1. In statistical terms, we expect an interaction: d-binding should have an effect when Q1 is -dist, but not when it is +dist. In multi-factor theories, on the other hand, the prediction would usually be that the effects of the different factors should add up. That is, the difference in scope bias between a d-bound and a non-d-bound +dist quantifier should be the same as between a d-bound and a non-d-bound -dist quantifier. A given factor should be able to exert its influence regardless of the other factors present.

Bott and Radó have been testing these predictions in on-going work. In two questionnaire studies subjects read doubly-quantified German sentences and used magnitude estimation to indicate how well disambiguating set diagrams fitted the interpretation of the sentence. Experiment 1 manipulated distributivity and linear order and used materials like (5). Experiment 2 tested the factors distributivity and d-binding using sentences like (6).

(5) a. Genau einen dieser Professoren haben alle Studentinnen verehrt.
exactly one these professors have all female students adored
'All female students adored exactly one of these professors.'
b. Genau einen dieser Professoren hat jede Studentin verehrt.
exactly one these professors has each female student adored
'Each female student adored exactly one of these professors.'
c. Alle Studentinnen haben genau einen dieser Professoren verehrt.
all female students have exactly one these professors adored
'All female students adored exactly one of these professors.'
d. Jede Studentin hat genau einen dieser Professoren verehrt.
each female student has exactly one these professors adored
'Each female student adored exactly one of these professors.'

(6) a. Genau einen Professor haben alle diese Studentinnen verehrt.
exactly one professor have all these female students adored
'All of these female students adored exactly one professor.'
b. Genau einen dieser Professoren haben alle Studentinnen verehrt.
exactly one these professors have all female students adored
'All female students adored exactly one of these professors.'
c. Genau einen Professor hat jede dieser Studentinnen verehrt.
exactly one professor has each these female students adored
'Each of these female students adored exactly one professor.'
d. Genau einen dieser Professoren hat jede Studentin verehrt.
exactly one these professors has each female student adored
'Each female student adored exactly one of these professors.'

Bott and Radó found clear evidence for the influence of all three factors. The distributive quantifier jeder took scope more easily than alle; d-binding of a quantifier and linear precedence both resulted in a greater tendency to take wide scope. Crucially, the effects were additive, which is compatible with the predictions of multi-factor theories but unexpected under configurational approaches.

These results show that even simple questionnaire studies can deliver theoretically highly relevant data. This is particularly important in an area like quantifier scope, where the judgements are typically subtle and not always accessible to introspection. Of course the study reported here cannot address all possible questions concerning the interpretation of quantified sentences like those in (5)-(6). It cannot for example clarify whether the processor initially constructs a fully specified representation of quantifier scope or whether it first builds only an underspecified structure which is compatible with both possible readings, an outstanding question of much current interest in semantics. The data that we have presented so far is off-line, in that it measures preferences only at the end of the sentence, when its content has been disambiguated. In section 5 we present an experimental design which allows investigation of the on-going (on-line) processing of scope ambiguities. In the next section we relate the semantic issue of underspecification to experimental data and predictions for on-line processing.

4. Underspecification vs. full interpretation

It is generally agreed that syntactic processing is incremental in nature (e.g. van Gompel & Pickering 2007), i.e. a full-fledged syntactic representation is assigned to every incoming word. Whether semantic processing is incremental in the strict sense is far from being
4. Underspecification vs. full interpretation

It is generally agreed that syntactic processing is incremental in nature (e.g. van Gompel & Pickering 2007), i.e. a full-fledged syntactic representation is assigned to every incoming word. Whether semantic processing is incremental in the strict sense is far from settled and remains an empirical question. To formulate hypotheses about the time-course of semantic processing, we will now look at the on-going debate in semantic theory on underspecification in semantic representations.

Underspecified semantic representations are a tool intended to handle the problem of ambiguity. The omission of parts of the semantic information allows one single representation to be compatible with a whole set of different meanings (for an overview of underspecification approaches, see e.g. Pinkal 1999; articles 24 (Egg) Semantic underspecification and 108 (Pinkal & Koller) Semantics in computational linguistics). It is thus an economical method of dealing with ambiguity in that it avoids costly reanalysis, used above all in computational applications.

Taking the psycholinguistic perspective, one would predict that constructing underspecified representations in semantically ambiguous regions of a sentence avoids processing difficulties both in the ambiguous regions themselves and at the point of disambiguation (Frazier & Rayner 1990). Underspecification can be contrasted with an approach that assumes strict incrementality and thus immediate full interpretation even in ambiguous regions. This would predict processing difficulties in cases of disambiguation to non-preferred readings. A candidate for a semantic processing principle guiding the choice of one specified semantic representation would be a complexity-sensitive one (for example "Avoid quantifier raising", captured in Tunstall's (1998) Principle of Scope Interpretation and Anderson's (2004) Processing Scope Economy).

In the psycholinguistic investigation of coercion phenomena, the experimental evidence is interpreted along these lines. Processing difficulties at the point of disambiguation are taken as evidence for full semantic interpretation (see e.g. Piñango, Zurif & Jackendoff 1999; Todorova, Straub, Badecker & Frank 2000), whereas the lack of measurable effects is seen as support for an underspecified semantic representation (see e.g. Pylkkänen & McElree 2006; Pickering, McElree, Frisson, Chen & Traxler 2006). Analogously, in the processing of quantifier scope ambiguities, experimental evidence for processing difficulties at the point of disambiguation will be interpreted as support for full interpretation.

However, this need not be taken as final. If we look at underspecification approaches in semantics, non-semantic factors are mentioned which might explain (and predict) difficulties in processing local scope ambiguities (see article 24 (Egg) Semantic underspecification, section 6.4.1). And these are exactly the factors which are assumed by multi-factor theories to have an impact on quantifier scope: syntactic structure and function, context, and type of quantifier. The relative weighting and interaction of these factors are not made fully explicit, however. For the full picture, it would be necessary to examine not only the point of disambiguation but also the ambiguous part of the input, for it is there that the effects of these factors might be identified. Underspecification is normally only temporary, however, and a full interpretation will presumably be constructed at some stage (but see Sanford & Sturt 2002). This might be recognizable for example in behavioral measures, but the precise predictions of underspecification theory are not always clear.
For example, it might be assumed that even representations which are never fully specified by the input signal (or context) do receive more specific interpretations at some later stage. This of course raises the question of what domains of interpretation are relevant here (sentence boundary, utterance, ...). In the next section we present experimental work which may offer a starting point for the empirical investigation of such issues.
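To fix ideas before turning to that work, the following sketch contrasts the two processing hypotheses in Python. The encoding is deliberately crude and our own: an underspecified scope representation is modelled simply as the set of admissible scope orders, whereas genuine underspecification formalisms (cf. article 24 (Egg) Semantic underspecification) use constraint-based descriptions instead.

```python
# Underspecification vs. immediate full interpretation for a clause with
# two quantifiers. A "reading" is a scope order: a tuple of quantifier
# labels from widest to narrowest scope.

from itertools import permutations

def underspecified(quantifiers):
    """Keep every combinatorially possible scope order in play."""
    return set(permutations(quantifiers))

def fully_interpreted(quantifiers, prefer):
    """Commit at once to a single preferred scope order."""
    return {prefer(quantifiers)}

def disambiguate(readings, required):
    """Return the surviving reading and whether reanalysis was needed."""
    needs_reanalysis = required not in readings
    return {required}, needs_reanalysis

qs = ("Q1", "Q2")
required = ("Q2", "Q1")  # suppose the continuation forces inverse scope

surface_order = lambda quantifiers: quantifiers  # a simple preference rule

print(disambiguate(underspecified(qs), required))
# ({('Q2', 'Q1')}, False) -- no difficulty predicted at disambiguation
print(disambiguate(fully_interpreted(qs, surface_order), required))
# ({('Q2', 'Q1')}, True)  -- reanalysis cost predicted at disambiguation
```

On the full-interpretation view the second call models the processing difficulty expected when the disambiguation is incompatible with the single reading entertained so far; on the underspecification view no such cost arises, since enriching the representation excludes readings rather than revising a commitment.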
5. On-line evidence for representation of scope

The underspecification view would predict that relative scope should remain underspecified as long as neither interpretation is forced. Indeed there should not even be any preference for one reading. The results of the questionnaire studies reported in section 3 already indicate that this view cannot be right: a particular combination of factors was found to systematically support a certain reading. Furthermore it is unlikely that the task itself introduced a preference towards one interpretation - although the diagram representing the wide-scope existential reading was somewhat more complex, this did not seem to interfere with participants' performance. The observed preferences must thus be due to the experimental manipulation. That is, even if all possible interpretations are available up to the point where disambiguating information arrives, there must be some inherent ranking of the various scope-determining factors that results in certain interpretations being more activated than others. Off-line results such as those discussed above are thus equally compatible with two different explanations: one where quantifier scope is fully determined (at least) by the end of the sentence, and another one where several (presumably all combinatorially possible) interpretations are available but weighted differently. A different methodology is needed to find out whether there is any psycholinguistic support for an underspecified view of quantifier scope.

As it turns out, the currently existing results of on-line studies are no more able to distinguish the two alternatives than are off-line studies. In on-line experiments a scope-ambiguous initial clause is followed by a second one that is only compatible with one scope reading. An indication of difficulty during the processing of the second clause is typically taken as evidence that the disambiguation is incompatible with the (sole) interpretation that had been entertained up to that point. However, there is another way to look at such effects. When the disambiguation is encountered, the underspecified representation needs to be enriched to allow only one reading and exclude all others. It is conceivable that updating the representation may require more or less effort depending on the ultimate interpretation that is required.

This situation poses a dilemma for researchers investigating the interpretation of quantifier scope. If explicit disambiguation is provided we can only test how easily the required reading is available - the results don't tell us what other reading(s) may have been constructed. Without explicit disambiguation, however, reading time (or other) data cannot be interpreted, since we do not know what reading(s) the participants had in mind. Bott & Radó (2009) approached this problem by tracking the eye-movements of participants while they read ambiguous sentences and then asking them to report the interpretation they computed. Although their results are only partly relevant for the underspecification debate, we will describe the experiment in some detail, since it provides a good starting point for a more conclusive investigation. We will then sketch a modification of the method that makes it possible to avoid some problems with the original study.

The scope-ambiguous sentences in Bott and Radó's study were instructions like those in (7):

(7) a. Genau ein Tier auf jedem Bild sollst du nennen!
       Exactly one animal on each picture should you name!
       Name exactly one animal from each picture!
    b. Genau ein Tier auf allen Bildern sollst du nennen!
       Exactly one animal on all pictures should you name!
       Name exactly one animal from all pictures!

[Fig. 15.2: Display following inverse linking constructions]

The first quantifier (Q1) was always the indefinite genau ein "exactly one". The second (Q2) was either distributive (jeder) or not (alle). In one set of control conditions Q1 was replaced by a definite NP (das Tier "the animal"). In another set of control conditions the two possible interpretations of (7) (one animal that is present in all fields vs. a possibly different animal from each field on a display) were expressed by scope-unambiguous quantified sentences, as in (8).

(8) a. Name exactly one animal that is found on all pictures.
    b. From each picture name exactly one animal.

In each experimental trial participants first read one of these instruction sentences and their eye-movements were monitored. Then the instruction sentence disappeared and a picture display as in Fig. 15.2 replaced it. Participants inspected this and had to provide an answer within four seconds. The displays were constructed to be compatible with both possible readings: the wide-scope universal reading, where the participant should select any one animal per field, but also the wide-scope existential reading, where the one element common to all fields must be named (e.g. the monkey in Fig. 15.2). To make the quantifier exactly one felicitous, the critical displays always allowed two potential answers for the wide-scope existential interpretation.

The scope-ambiguous instructions were so-called inverse linking constructions, in which the two quantifiers are contained within one NP. It has been assumed (e.g. May &
Bale 2006) that in inverse linking constructions the linearly second quantifier preferentially takes scope over the first. The purpose of the study was to test this prediction and to investigate to what extent the distributivity manipulation is able to modulate it. Based on earlier results (Bott & Radó 2007) it was assumed that jeder would prefer wide scope, which should further enhance the preference for the inverse reading. When alle occurred as Q2, there should be a conflict between the preferences inherent to the construction and those arising from the particular quantifiers.

The experimental setup made it possible to look both at the process of computing the relative scope of the quantifiers (eye-movement behavior while reading the instructions) and at the final interpretation (the answer participants gave) without providing any disambiguation. Thus the answers could be taken to reflect the scope preferences at the end of the sentence, whereas processing difficulty during reading would serve as an indication that scope preferences are computed at a point where no decision is yet required.

The off-line answers showed the expected effects. There was an overall preference for the inverse scope reading, which was significantly stronger with jeder than with alle. Crucially, the reading time data showed clear evidence of a conflict between the scope factors: there was a significant slow-down at the second quantifier in (7b). The effect was present already in first-pass reading times, suggesting that scope preferences were computed immediately.

Bott and Radó interpret these results as a strong indication that readers regularly disambiguate sentences during normal reading. However, this conclusion may be too strong. In Bott and Radó's experiment participants had to choose a particular interpretation in order to carry out the instructions (i.e. name an animal). Although they did not have to settle on that interpretation while they were reading the instruction, they had to make a decision as to the preferred reading immediately after the end of the sentence. This may have caused them to disambiguate constructions that are typically left ambiguous during normal interpretation. Moreover, the instructions used in the experiment were highly predictable in structure: they always contained a complex NP with two quantifiers (experimental items), a definite NP1 followed by a quantified NP2 (fillers A), or else an unambiguous sentence with two quantifiers. Although the content of NP1 (animal, vehicle, flag) and the distributivity of Q2 were varied, the rest of the instruction was the same: sollst du nennen 'you should name'. This pattern was easy to recognize and may have resulted in a strategy of starting to compute the scope preferences as soon as the second NP had been received. To rule out this explanation Bott and Radó compared responses provided in the first and the last third of each experimental session and failed to find any indication of strategic behavior. Still, the possibility remains that consistent early disambiguation in the experiment resulted from the task of having to choose a reading quickly in order to provide an answer. The ultimate test of underspecification would have to avoid such pressure to disambiguate fast. We are currently conducting a modification of Bott and Radó's experiment that may not only avoid this pressure but actually encourage participants to delay disambiguation.
In this experiment participants have to judge the accuracy of sentences like those in (9):

(9) a. Genau eine geometrische Form auf allen Bildern ist rechteckig.
       Exactly one geometrical shape on all pictures is rectangular.
       Exactly one geometrical shape on all pictures is rectangular.
    b. Genau eine geometrische Form auf jedem Bild ist rechteckig.
       Exactly one geometrical shape on each picture is rectangular.
       Exactly one geometrical shape on each picture is rectangular.

[Fig. 15.3: Disambiguating displays in the proposed experiment. A) wide-scope existential disambiguation; B) wide-scope universal disambiguation]

The experimental procedure is as before. The sentences will be paired with unambiguous displays supporting either the wide-scope universal or the wide-scope existential reading (Fig. 15.3). In (9) full processing of the semantic content is not possible until the critical information (rechteckig) has been received. Since the display following the sentence is only compatible with one reading, which the participant cannot anticipate, they are better off waiting to see which interpretation will be required for the answer. If underspecification is indeed the preferred strategy, there should be no difference in reading times across the different conditions, nor should there be any difficulty in judging any kind of sentence-display pair. Assuming immediate full specification of scope, however, we would expect the same pattern of results as in Bott and Radó's study: slower reading times in (9a) than in (9b) at the second quantifier, as well as slower responses to displays requiring the wide-scope existential interpretation, the latter presumably modulated by the distributivity of Q2.

This experiment should be able to distinguish intermediate positions between the two extremes of complete underspecification and immediate full interpretation. It is conceivable, for instance, that scope interpretation is only initiated when the perceiver can be reasonably sure that they have received all (or at least sufficient) information. This would correspond to the same reading time effects (and the same answering behavior) as predicted under immediate full interpretation, but the effects would be somewhat delayed. Another possibility is an initial underspecification of scope, but the construction of a fully specified interpretation at the boundary of some interpretation domain such as the clause boundary. That would predict a complete lack of reading time effects, but answer times showing the same incompatibility effects as under versions of the full interpretation approach.

It is worth emphasizing how this design differs from existing studies. First, it looks at the ambiguous region and not just the disambiguation point. Second, it differs from
Filik, Paterson & Liversedge (2004), who also measured reading times in the ambiguous region, but who used the kind of disambiguation that we criticized in section 3.

6. Conclusions

In this article we have attempted to show that experimentally obtained data can, in spite of certain complicating and confounding factors, be of relevance to semantic theory and provide both support for and in some cases falsification of its assumptions and constructs. In section 2 we noted that the field of theoretical semantics has made less use of experimental verification of its analyses and assumptions. We have seen that there are some quite good reasons for this and laid out what some of the problematic factors are. While some of these are shared to a greater or lesser degree with other branches of linguistics, some of them are peculiar to semantics or are especially severe in this case.

The main part of our paper reports a research program addressing the issue of relative scope in doubly quantified sentences. We present this work as an example of the ways in which experimental approaches can contribute to the development of theory. These studies also illustrate some of the practical constraints upon such work. For example, we have seen that clear disambiguation is not always easy to achieve; in particular, it is difficult to achieve without biasing the interpretational choices of the experiment participant. The use of eye-tracking and fully ambiguous picture displays is a real advance on previous practice (Bott & Radó 2009).

Section 3 shows how experimental procedures which are simple enough for non-specialist experimenters can nevertheless yield evidence of value for the development of semantic theories: a carefully constructed and counter-balanced design can produce data of sufficient quality to answer outstanding questions with some degree of finality. In this particular case the configurational account of scope can be seen as failing to account for data that the multi-factor account succeeds in capturing. The unsupported account is demonstrated to need adaptation or development. Experimentation can make the field of theory more dynamic and adaptive; an account which repeatedly fails to capture evidence gathered in controlled studies and which cannot economically be extended to do so will eventually need to be reconsidered.

In section 5 we describe an experiment designed to provide evidence which distinguishes between two accounts (section 4) of the way that perceivers deal with ambiguity in the input signal: underspecification vs. full interpretation. This is an example of how processing data can under certain circumstances provide decisive evidence which distinguishes between theoretical accounts. It is of course often the case that theory does not make any direct predictions about psycholinguistically testable measures of processing. The collaboration of psycholinguists and semanticists may yet reveal testable predictions more often than has sometimes been assumed.

We therefore argue for experimental linguists and semanticists to cooperate more and take more notice of each other's work, for their mutual benefit. Semanticists will gain additional ways to falsify theoretical analyses or aspects of them, which can deliver a boost to theory development. This will be possible because experimenters can tailor experimental methods, tasks, and designs to their specific requirements.
Experimenters for their part will benefit by having the questioning eye of the semanticist look over their experimental materials, which will surely prevent many experiments from
being carried out whose materials fail to uniquely fulfill the requirements of the design. An example of this is the mode of disambiguation which we discussed in section 3. Further to this, experimenters will doubtless be able to derive more testable predictions from semantic theories if they discuss the finer workings of these with specialist semanticists. We might mention here the example of semantic underspecification: can we find evidence for its psychological reality? Further questions might be: if some feature of an expression remains underdetermined by the input, how long can the representation remain underspecified? Is it possible for a final representation of a discourse to have unspecified features and nevertheless be fully meaningful?

We conclude, therefore, that controlled experimentation can provide a further source of evidence for semantics. This data can under certain circumstances give a more detailed picture of the states of affairs which theories aim to account for. This additional evidence could be the catalyst for some advances in semantic theory and explanation, in the same way that it has been in syntactic theory.

7. References

Anderson, Catherine 2004. The Structure and Real-time Comprehension of Quantifier Scope Ambiguity. Ph.D. dissertation. Northwestern University, Evanston, IL.
Aoun, Joseph & Yen-hui Audrey Li 1989. Scope and constituency. Linguistic Inquiry 16, 623-637.
Beghelli, Filippo & Tim Stowell 1997. Distributivity and negation: The syntax of each and every. In: A. Szabolcsi (ed.). Ways of Scope Taking. Dordrecht: Kluwer, 71-107.
Bott, Oliver & Janina Radó 2007. Quantifying quantifier scope. In: S. Featherston & W. Sternefeld (eds.). Roots. Linguistics in Search of its Evidential Base. Berlin: Mouton de Gruyter, 53-74.
Bott, Oliver & Janina Radó 2009. How to provide exactly one interpretation for every sentence, or what eye movements reveal about quantifier scope. In: S. Winkler & S. Featherston (eds.). The Fruits of Empirical Linguistics, Volume I: Process. Berlin: de Gruyter, 25-46.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Filik, Ruth, Kevin B. Paterson & Simon P. Liversedge 2004. Processing doubly quantified sentences: Evidence from eye movements. Psychonomic Bulletin & Review 11, 953-959.
Frazier, Lyn & Keith Rayner 1990. Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language 29, 181-200.
Gillen, Kathryn 1991. The Comprehension of Doubly Quantified Sentences. Ph.D. dissertation. Durham University.
van Gompel, Roger P.G. & Martin J. Pickering 2007. Syntactic parsing. In: G. Gaskell (ed.). The Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press, 455-504.
Hahne, Anja & Angela D. Friederici 2002. Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research 13, 339-356.
Higginbotham, James 1985. On semantics. Linguistic Inquiry 16, 547-594.
Hobbs, Jerry & Stuart M. Shieber 1987. An algorithm for generating quantifier scopings. Computational Linguistics 13, 47-63.
Hornstein, Norbert 1984. Logic as Grammar. Cambridge, MA: The MIT Press.
Hornstein, Norbert 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell.
Huang, Cheng-Teh James 1982. Logical Relations in Chinese and the Theory of Grammar. Ph.D. dissertation. MIT, Cambridge, MA.
Ioup, Georgette 1975. The Treatment of Quantifier Scope in Transformational Grammar. Ph.D. dissertation.
City University of New York, New York.
Kuno, Susumu 1991. Remarks on quantifier scope. In: H. Nakajima (ed.). Current English Linguistics in Japan. Berlin: Mouton de Gruyter, 261-287.
Kurtzman, Howard S. & Maryellen C. MacDonald 1993. Resolution of quantifier scope ambiguities. Cognition 48, 243-279.
Larson, Meredith, Ryan Doran, Yaron McNabb, Rachel Baker, Matthew Berends, Alex Djalali & Gregory Ward 2010. Distinguishing the said from the implicated using a novel experimental paradigm. In: U. Sauerland & K. Yatsushiro (eds.). Semantics and Pragmatics. From Experiment to Theory. Houndmills: Palgrave Macmillan.
May, Robert 1977. The Grammar of Quantification. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1982.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT Press.
May, Robert & Alan Bale 2006. Inverse linking. In: M. Everaert & H. van Riemsdijk (eds.). Blackwell Companion to Syntax. Oxford: Blackwell, 639-667.
Pafel, Jürgen 2005. Quantifier Scope in German (Linguistics Today 84). Amsterdam: Benjamins.
Park, Jong C. 1995. Quantifier scope and constituency. In: H. Uszkoreit (ed.). Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Boston, MA: Morgan Kaufmann, 205-212.
Phillips, Colin & Matthew Wagers 2007. Relating structure and time in linguistics and psycholinguistics. In: M.G. Gaskell (ed.). The Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press, 739-756.
Pickering, Martin J., Brian McElree, Steven Frisson, Lilian Chen & Matthew J. Traxler 2006. Underspecification and aspectual coercion. Discourse Processes 42, 131-155.
Piñango, Maria, Edgar Zurif & Ray Jackendoff 1999. Real-time processing implications of enriched composition at the syntax-semantics interface. Journal of Psycholinguistic Research 28, 395-414.
Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.). Computing Meaning. Dordrecht: Kluwer, 33-55.
Pylkkänen, Liina & Brian McElree 2006. The syntax-semantics interface: On-line composition of meaning. In: M. A. Gernsbacher & M. Traxler (eds.). Handbook of Psycholinguistics. 2nd edn. New York: Elsevier, 537-577.
Reinhart, Tanya 1976. The Syntactic Domain of Anaphora. Ph.D. dissertation. MIT, Cambridge, MA.
Reinhart, Tanya 1978. Syntactic domains for semantic rules. In: F. Guenthner & S.J. Schmidt (eds.). Formal Semantics and Pragmatics for Natural Language. Dordrecht: Reidel, 107-130.
Reinhart, Tanya 1983. Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya 1995. Interface Strategies (OTS Working Papers in Linguistics). Utrecht: Utrecht University.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 222-272.
Sanford, Anthony J. & Patrick Sturt 2002. Depth of processing in language comprehension: Not noticing the difference. Trends in Cognitive Sciences 6, 382-386.
Todorova, Marina, Kathy Straub, William Badecker & Robert Frank 2000. Aspectual coercion and the online computation of sentential aspect. In: L. R. Gleitman & A. K. Joshi (eds.). Proceedings of the 22nd Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates, 3-8.
Tunstall, Susanne L. 1998. The Interpretation of Quantifiers: Semantics and Processing. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
VanLehn, Kurt A. 1978. Determining the Scope of English Quantifiers. Technical Report (AI-TR 483). Cambridge, MA: Artificial Intelligence Laboratory, MIT.
Weskott, Thomas & Gisbert Fanselow 2009.
Scaling issues in the measurement of linguistic acceptability. In: S. Winkler & S. Featherston (eds.). The Fruits of Empirical Linguistics, Volume I: Process. Berlin: de Gruyter, 229-246.
Zhou, Peng & Liqun Gao 2009. Scope processing in Chinese. Journal of Psycholinguistic Research 38, 11-24.

Oliver Bott, Sam Featherston, Janina Radó, and Britta Stolterfoht, Tübingen (Germany)
IV. Lexical semantics

16. Semantic features and primes

1. Introduction
2. Background assumptions
3. The form of semantic features
4. Semantic aspects of morpho-syntactic features
5. Interpretation of semantic features
6. Kinds of elements
7. Primes and universals
8. References

Abstract

Semantic features or primes, like phonetic and morpho-syntactic features, are usually considered as basic elements from which the structure of linguistic expressions is built up. Only a small set of semantic features seems to be uncontroversial, however. In this article, semantic primes are considered as basic elements of Semantic Form, the interface level between linguistic expressions and the full range of mental structures representing the content to be expressed. Semantic primes are not just features, but elements of a functor-argument structure, on which the internal organization of lexical items and their combinatorial properties, including their Thematic Roles, is based. Three types of semantic primes are distinguished: systematic elements, which are related to morpho-syntactic conditions like tense or causativity; idiosyncratic features, which do not correspond to grammatical distinctions but are likely to manifest primitive conceptual conditions like color or taste; and a large range of elements called dossiers, which correspond to hybrid mental configurations, integrating varying modalities but providing unified conceptual entities. (Idiosyncratic features and dossiers together account for what is sometimes called distinguishers or completers.) A restricted subsystem of semantic primes can reasonably be assumed to be directly fixed by Universal Grammar, while the majority of semantic primes is presumably due to general principles of mental organization and triggered by experience.

1. Introduction

The concept of semantic features, although frequently used in pertinent discussion, is actually in need of clarification with respect to both of its components. The term feature, referring in ordinary discourse to a prominent or distinctive aspect, quality or characteristic of something, became a technical term in the structural linguistics of the 1930s, primarily in phonology, where it identified the linguistically relevant properties as opposed to other aspects of sound shape. The systematic extension of the concept from phonology to other components of linguistic structure led to a wide variety of terms with similar but not generally identical interpretation, including component, category, atom, feature value, attribute, primitive element and a number of others. The qualification semantic, which competes with a smaller number of alternatives, conceptual being one of them, is
no less in need of elucidation, though, depending among other things on the delimitation of the relevant domains, including syntactic, semantic, conceptual, discourse and pragmatic structure, and the intricate distinction between linguistic and encyclopedic knowledge. In what follows, I will use the term semantic feature in the sense of basic component of linguistic meaning, adding amendments and changes where necessary.

Different approaches and theoretical frameworks of linguistic analysis recognized the need for an overall concept of basic elements in terms of which linguistic phenomena are to be analyzed. Hjelmslev (1938) for instance based his theory of Glossematics on the assumption that linguistic expressions are made up of ultimate, irreducible invariants called glossemes. Following the Saussurean view of content and expression as the interdependent planes of linguistic structure, he furthermore distinguished kenemes and pleremes as the glossemes of the expression and the content plane, respectively. A recent and rather different case in point is the Minimalist Program discussed in Chomsky (1995, 2000), where Universal Grammar is assumed to make available "a set F of features (linguistic properties) and operations C_HL (the computational procedure for human language) that access F to generate expressions", which are pairs (PF, LF) of Phonetic Form and Logical Form, determining the sound shape and the meaning of linguistic expressions.

The general notion of features as primitive components constituting the structure of language must not obscure the fundamental differences between basic elements of the phonetic, morphological, syntactic and semantic aspect of linguistic expressions. On the phonetic side, the nature of distinctive features as properties of segments and perhaps syllables is fairly clear in principle and subject to dispute only with respect to interesting detail. The nature and role of primitive elements on the semantic side, however, are subject to problems that are unsolved in crucial respects. A number of questions immediately arise:

(1) a. What is the formal character of basic semantic elements?
    b. Can linguistic meaning be exhaustively reduced to semantic features?
    c. Is there a fixed set of semantic features?
    d. What is the origin and interpretation of semantic features?

Any attempt to deal with these questions must obviously rely on general assumptions about the framework which at least makes it possible to formulate the problems.

2. Background assumptions

An indispensable assumption of all approaches concerns the fact that a natural language L provides a sound-meaning correspondence, relating an unlimited range of signals to an equally unlimited range of objects, situations, and conditions the expressions are about. What linguistics is concerned with, however, is neither the set of signals nor the range of things they are about, but rather the invariant patterns on which the production and recognition of signals is based and the distinctions in terms of which things and situations are experienced, organized or imagined and related to the linguistic expressions. Schematically, these general assumptions can be represented as in (2), where PF and SF (for Phonetic Form and Semantic Form, respectively) indicate the structure of the sound shape and meaning, with A-P (for Articulation and Perception) and C-I (for Conceptualization and Intention) abbreviating the complex mental systems by which linguistic
expressions are realized and related to their intended interpretations. Although I will comply with general terminology as far as possible, tension cannot always be avoided. Thus, the Semantic Form SF corresponds in many respects to the Logical Form LF of Chomsky (1986 and subsequent work), to the Conceptual Structure CS of Jackendoff (1984 and later work), and to the Discourse Representation Structure DRS of Kamp and Reyle (1993), to mention just a few comparable approaches (cf. also article 30 (Jackendoff) Conceptual Semantics, article 31 (Lang & Maienborn) Two-level Semantics and article 37 (Kamp & Reyle) Discourse Representation Theory).

(2)  Signal <==A-P==> PF <--- Morpho-Syntax ---> SF <==C-I==> External and Internal Environment
                      |___________ L ____________|
             |___________________ Mind/Brain ___________________|

This schema is a simplification in several respects, but it enables us to fix a number of relevant points. First, PF and SF represent the conditions that L imposes on or extracts from extra-linguistic systems. Hence PF and SF are what is often called interfaces, by which language interacts with other mental systems, abridged here as A-P and C-I. Semantic and phonetic features can now be identified as the primitive elements of SF and PF, respectively. Hence their formal and substantive nature is determined by the character and role of these interfaces.

Second, the function of language, to systematically assign meaning to signals, is accomplished by the component indicated here as Morpho-Syntax, which establishes the connection between PF and SF. For obvious reasons, the rules and principles of Morpho-Syntax are the primary concern of all pertinent theories, which in spite of differences that will not be dealt with here agree on the observation that the relation between PF and SF depends on the lexical system LS of L and is organized by rules and principles which make up the Grammar G of L. The content of Morpho-Syntax is relevant in the present context to the extent to which it relies on morphological and syntactic features that relate to features of SF.

Third, according to (2), PF and SF and their primes are abstract in the sense that they do not reflect properties of the signal or the external world directly, but represent the units and configurations in terms of which the mental mechanisms of A-P and C-I perceive or organize external phenomena. Features of PF should according to Halle (1983) most plausibly be construed as instructions for vocal gestures in terms of which speech sounds are articulated and perceived. In a similar vein, features of SF must be construed not as elements of the external reality, but as conditions according to which elements, properties, and situations are experienced or construed. Hence semantic features as components of SF require an essentially mentalistic approach to meaning. This is clearly at variance with various theories of meaning, among them in particular versions of formal or model-theoretic semantics like, e.g., Lewis (1972) or Montague (1974), which consider semantics as the relation of linguistic expressions to strictly non-mental, external entities and conditions. It should be emphasized, though, that the mentalistic view underlying (2) does not deny external objects and their properties, corresponding to structures in SF under appropriate conditions. The crucial point is that it is the mental organization of C-I which provides the correspondence to the external reality - if the correspondence
actually obtains. Analogous considerations apply, by the way, to PF and its relation to the external signal. For further discussion of these matters and the present position see, e.g., Jackendoff (2002, chapter 10).

It must finally be noted that the symmetrical status of PF and SF suggested in (2) is in need of modification in order to account for the fundamental differences between the two interfaces. While the mechanisms of A-P, which PF has access to, constitute a complex but highly specialized mental system, the range of capacities abbreviated as C-I is not restricted in any comparable way. As a matter of fact, SF has access to the whole range of perceptual modalities, spatial orientation, motor control, conceptual organization, and social interaction, in short: to all aspects of experience. This is not merely a quantitative difference in diversity, size, and complexity of the domains covered; it raises problems for the very status of SF as interface, as we will see. In any case, essential differences concerning the nature of basic elements and their combination in PF and SF have to be recognized.

Two remarks must be added about the items of the lexical system LS. First, whether elements like killed, left, rose, or went are registered in LS as possibly complex but nevertheless fixed lexical items or are derived by morphological rules of G from the underlying elements kill, leave, rise, go plus abstract morphemes for Person, Number, and Tense is a matter of dispute with different answers in different theories. In any case, regular components like kill and -ed as well as idiosyncratic conditions holding for left, rose, and went need to be captured, and it must be acknowledged that the items in question consist of component parts. This is related to, but different from, the question whether basic lexical items like kill, leave, or go are themselves complex structures. As to PF, the analysis of words into segments and features is obvious, but for SF their decomposition is a matter of debate, to which we will return in more detail, claiming that lexical items are semantically complex in ways that clearly bear on the nature of semantic features. The second remark concerns the question whether and to what extent rules and principles of G determine the internal structure of lexical items and of the complex expressions made up of them. Elements of LS are stored in long-term memory and thus plausibly assumed to be subject to conditions that differ from those of complex expressions. Surprisingly, though, the character of features and their combination at least seems to be the same within and across lexical items, as we will see.

As to notation, I will adopt the following standard conventions: features of PF will be enclosed in square brackets like [+nasal], [-voiced], etc.; morphological and syntactic features are enclosed in square brackets and marked by initial capitals like [+Past], [-Plural], [+Nominal], etc.; and semantic features are given in small capitals, like [HUMAN], [MALE], [CAUSE].

3. The form of semantic features

3.1. Features in phonology and morpho-syntax

It is useful to first consider the primitive elements of the phonetic side, where the conditions determining their formal character are fairly clear. PF is based on sequentially organized segments, which represent abstract time slots corresponding to the temporal structure of the phonetic signal. Now features are properties of segments, i.e.
one-place predicates specifying articulatory (and perceptual) conditions on temporal units. The
predicates either do or do not apply to a given segment, whence basic elements of PF are mostly conceived as binary features. Thus, a feature combination like [+nasal, +labial, +voiced] would specify a segment by means of the simultaneous articulatory conditions indicated as nasal, labial, and voiced. Systematic consequences and extensions of these conditions need to be added, such as the difference between presence and absence of a property, marked by the plus- and minus-value of the features, leading to the markedness-asymmetry in Chomsky & Halle (1968). Further extensions include relations between features in the feature geometry of Clements (1985), the predictability of unspecified features, the supra-segmental properties of stress and intonation in Halle & Vergnaud (1987), and the arrangement of segmental properties along different tiers, leading to three-dimensional representations, to mention the more obvious extensions of PF. These extensions, however, do not affect the basic conditions on phonetic features, which can be summarized as follows:

(3) Phonetic features are
    a. binary one-place predicates, represented as [±F];
    b. simultaneous conditions on sequentially connected segments (or syllables);
    c. interpreted as instructions on the mechanisms in A-P.

Some of the considerations supporting distinctive features in phonology have also been applied to syntax and morphology. Thus in Jakobson (1936), Hjelmslev (1938), or Bierwisch (1967) categories like Case, Number, Gender, or Person are characterized in terms of binary features. Similarly, Chomsky (1970) reduced categories like Noun, Verb, Adjective, and their projections Noun Phrase etc. to binary syntactic features. While features of this kind are one-place predicates and thus in line with condition (3a), they clearly don't meet condition (3b): morphological and syntactic features are not conditions on segments of PF connected by sequential ordering, but on lexical and syntactic units, which are related by syntactic conditions like dominance or constituency according to the morpho-syntactic rules of G.

The most controversial aspect of morpho-syntactic features is their interpretation analogously to (3c). As these features do not belong to PF or SF directly, but are essentially elements of the mediation between PF and SF, they can only indirectly participate in the interface structures and their interpretation. But they do affect PF as well as SF, although in rather different ways. As to PF, morphological features determine the choice of segments or features by means of inflectional rules like (4), where /-d/ abbreviates the features identifying the past tense suffix d in cases like climbed, rolled, etc.

(4) [+Past] → /-d/

Actual systems of rules are more complex, taking into account intricate dependencies between different categories, features, and syntactic conditions. For systematic accounts of these matters see e.g., Bierwisch (1967), Halle & Marantz (1993), or Wunderlich (1997a). In any case, morphological or syntactic features are not part of PF, but can only influence its content via particular grammatical rules.

As to SF, it is often assumed that features of morphological categories like tense or number are subject to direct conceptual interpretation, alongside other semantic elements, and are therefore considered as a particular type of semantic features. This
was the position of Jakobson (1936), Hjelmslev (1938), and related work of early structuralism, where in fact no principled distinction between morphological and semantic features was made, supporting the notion of semantic primes as binary features. This view cannot be generally upheld for various reasons, however. To be sure, a feature like [+Past] has a more stable and motivated relation to the temporal condition attached to it than to its various phonetic realizations, e.g. in rolled, left, went, or was; still there is no reasonable way to treat morphological or syntactic features as elements of SF, as will be discussed in section 4.

3.2. Features and types

Turning to the actual elements of SF, we notice that they differ from the features of PF in crucial respects. To begin with, condition (3a), according to which features are binary one-place predicates, seems to be satisfied by apparently well-established ordinary predicates like [MALE], [ALIVE] or [OPEN]. There are, however, equally well-established elements like [PARENT-OF], [PART-OF], [PERCEIVE], [BEFORE], or [CAUSE], representing relations or functions of a different type, which clearly violate this condition. More generally, SF must be based on an overall system of types integrating the different kinds of properties, relations, and operators.

This leads directly to condition (3b), concerning the overall organization of PF in terms of sequentially ordered segments. It should be obvious that SF cannot rely on this sort of organization: it neither consists of segments nor exhibits linear ordering. Whatever components one might identify in the meaning of e.g., nobody arrived in time, or he needs to know it, or any other expression, there is no temporal or other sequential ordering among them, neither for the meaning of words nor their components. Although semantic processing in comprehension and production does of course have temporal characteristics, they do not belong to the resulting structure of meaning, in contrast to the linear structure related to phonetic processing. See Bierwisch & Schreuder (1992) for further discussion of this point. Hence SF is organized according to other conditions, which naturally follow from the type system just mentioned: elements of SF, including in particular its primitive elements, are connected by the functor-argument relation based on the type-structure of the elements being combined. Notice that this is the minimal assumption to be made, just like the linear organization of PF, with no arbitrary stipulations. We now have the following conditions on semantic features, corresponding to the conditions (3a) and (3b) on the phonetic side:

(5) Semantic elements, including irreducible features, are
    a. members of types determining their combinatorial conditions;
    b. participants of type-based hierarchical functor-argument structures.

These are just two interdependent aspects of the general organization of linguistic meaning, which is, by the way, the counterpart of linear concatenation of segments in PF. There are various notational proposals to make these conditions explicit, usually extending or adapting basic conventions of predicate logic. The following considerations seem to be indispensable in order to meet the requirements in (5). First, each element of SF must belong to some specific type; second, if the element is a functor, its type determines two things: (a) the type of the elements it combines with, and (b) the type the
resulting combination belongs to. Thus functor-types must be of the general form (α,β), where α is the type of the required argument, and β that of the resulting complex. This kind of type-structure was introduced in Ajdukiewicz (1935) in a related but different context, and has since been adapted in various notational ways by Lewis (1972), Cresswell (1973), Montague (1974), to mention just a few; cf. also article 85 (de Hoop) Type shifting. Following standard assumptions, at least two basic types are to be recognized: e (for entity, i.e. objects in the widest sense) and t (for truth-value bearer, roughly propositions or situations). From these basic types, functor types like (e,t) and (e,(e,t)) for one- and two-place predicates, (t,t) and (t,(t,t)) for one- and two-place propositional functions, etc. are built up. To summarize:

(6) Elements of SF belong to types, where
    a. e and t are basic types;
    b. if α and β are types, then (α,β) is a type.

A crucial point to be added is the possibility of empty slots, i.e. positions to be filled in by way of the syntactic combination of lexical items with the complements they admit or require. Empty slots are plausibly represented as variables, which are assigned to types like elements of SF in general. The kind of resulting structure is illustrated in (7) by an example adapted from Jackendoff (1990: 46), representing the SF of the verb enter in three notational variants, indicating the relevant hierarchy of types by a labeled tree (7a), and the corresponding labeled bracketing in (7b) and (7c):

(7) a. (labeled tree over x GO TO IN y, with the type assignments x: e, GO: (e,(e,t)), TO: (e,e), IN: (e,e), y: e)
    b. [t [e x] [(e,t) [(e,(e,t)) GO] [e [(e,e) TO] [e [(e,e) IN] [e y]]]]]
    c. [t [(e,t) [(e,(e,t)) GO] [e [(e,e) TO] [e [(e,e) IN] [e y]]]] [e x]]

While (7b) simply projects the tree branching of (7a) into the equivalent labeled bracketing, (7c) follows the so-called Polish notation, where functors systematically precede their arguments. This becomes more obvious if the type-indices are dropped, as in: [GO [TO [IN y]] x]. Notice, however, that in all three cases the linear ordering is theoretically irrelevant. What counts is only the functor-argument relation. For the sake of illustration, GO, TO, and IN are assumed to be primitive constants of SF with more or less obvious interpretation: IN is a function that picks up the interior region of its argument, TO turns its argument into the end of a path, and GO specifies the motion of its "higher" or external argument along the path indicated by its "lower" or internal argument.
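As a way of making the combinatorics in (6) and (7) concrete, here is a minimal sketch in Python. It is an illustration of our own, not part of the frameworks under discussion: types are encoded as strings and pairs, and apply() enforces the functor-argument combinatorics that, in SF, take the place of the linear ordering found in PF.

```python
# Types as in (6): "e" and "t" are basic; a tuple (a, b) is the functor
# type taking an argument of type a to a result of type b.

def apply(functor_type, argument_type):
    """Combine a functor with one argument and return the result type."""
    if not isinstance(functor_type, tuple):
        raise TypeError(f"{functor_type!r} is not a functor type")
    expected, result = functor_type
    if expected != argument_type:
        raise TypeError(f"expected argument of type {expected!r}, "
                        f"got {argument_type!r}")
    return result

# Type assignments of the primitives in (7):
GO = ("e", ("e", "t"))  # motion along a path-denoting entity
TO = ("e", "e")         # turns an entity into the end of a path
IN = ("e", "e")         # picks up the interior region of its argument
x = y = "e"             # variables for the external and internal argument

# Building [GO [TO [IN y]] x] from the inside out:
path = apply(TO, apply(IN, y))    # [TO [IN y]] is of type e
print(apply(apply(GO, path), x))  # prints 't': a complete proposition
```

Trying to apply GO directly to TO, for instance, raises a type error, which is exactly the sense in which the type of a functor determines what it can combine with.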
Jackendoff's actual treatment of this example is given in (8a), with the equivalent tree representation (8b).

(8) a. [Event GO ([Thing ]i, [Path TO ([Place IN ([Thing ]j)])])]
    b. (labeled tree in which the Event-Function GO combines a Thing with a Path, the Path-Function TO takes a Place, and the Place-Function IN takes a Thing)

Putting minor notational details aside (such as Jackendoff's notation [x ([y])] to indicate that a component [y] is the argument of a functor x, or his use of indices i and j instead of variables x and y to identify the relevant slots), the main difference between (7) and (8) consists in the general attachment of substantive content to type nodes in (8), in contrast to the minimal conditions assumed in (7), where types merely determine the combinatorial aspect of their elements. Thus [IN [y]] in (7) is of the same type e as the variable y, while its counterpart [IN ([j])] in (8) is assigned to the type Place, differing from the type Thing of the variable [j]. It is not obvious whether the differences between (7) and (8) have any empirical consequences, after stripping away notational redundancies by which e.g., the functor Path-Function generates a complex of type Path, or the Place-Function a Place, without specifying substantial information. With respect to the form of semantic features, however, the two systems can be taken as largely equivalent versions, together with a fair number of further alternatives, including Miller & Johnson-Laird (1976), Kamp & Reyle (1993), and to some extent Dowty (1979), to mention some well-known proposals. Katz (1972) pursues similar goals by means of a somewhat clumsy and mixed notational system.

3.3. SF and argument structure

One of the central consequences emerging from the systems exemplified in (7) and (8) is the account of combinatorial properties of lexical items, especially the selection restrictions they give rise to. Although a systematic survey of these matters would go far beyond the concern for the nature of semantic features, at least the following points must be made. First, as already mentioned, variables like x and y in (7) determine the conditions syntactic complements have to meet semantically, as their SF has to be compatible with the position of the corresponding variables. Second, the position the variables occupy within the hierarchy of a lexical item's SF determines to a large extent the syntactic conditions that the complements in question must meet. What is at issue here is the semantic underpinning of what is usually called the argument structure of a lexical item. As Jackendoff (1990: 48) puts it, "argument structure can be thought of as an abbreviation for the part of conceptual structure that is 'visible' to syntax." There are various ways in which parts of SF can be made accessible to syntax. Jackendoff (1990) relies
on coindexing of semantic slots with syntactic subcategorization; Bierwisch (1996) and Wunderlich (1997b) extend the type system underlying (7) by standard lambda operators, as illustrated in (9) for open, where the features [+V(erbal), -N(ominal)] identify the entry's morpho-syntactic categorization, and ACT, CAUSE, BECOME, and OPEN make up its Semantic Form:

(9) /open/; [+V, -N]; λy λx [t [t [(e,t) ACT] [e x]] [(t,(t,t)) CAUSE] [t [(t,t) BECOME] [t [(e,t) OPEN] [e y]]]]

λy and λx mark the argument positions for object and subject, respectively, defining in this way the semantic selection restrictions the complements are subject to. It is worth noting, incidentally, that these operators in some way correspond to the functional categories AgrO and AgrS assumed in Chomsky (1995 and related work) to relate the verb to its complements. In spite of relevant similarities, though, there are important differences that cannot be pursued here. Representations like (9) also provide a plausible account for the way in which the transitive and intransitive variants of open and other "(un)ergative" verbs like close, change, break etc. are related: the intransitive version is derived by dropping the causative elements [ACT x] and [CAUSE] and the operator λx, turning λy into the subject position:

(10) /open/; [+V, -N]; λy [t [(t,t) BECOME] [t [(e,t) OPEN] [e y]]]

These observations elucidate how syntactic properties of lexical items are rooted in configurations of their SF, determining thereby the semantic effect of the syntactic head-complement construction.

3.4. Extensions

The principles exemplified in (7) to (10) provide the skeleton of SF, determining the form of semantic features and the structure of meaning of lexical items and complex expressions. They do not account for all aspects of linguistically determined meaning, though, but must be augmented with respect to phenomena like shift or continuation of reference, topic-focus-articulation, presupposition, illocutionary force, and discourse relations. To capture these facts, Jackendoff (1990, 2002) enriched semantic representations by distinguishing a referential tier and information structure from the descriptive tier. Similar ideas are pursued in Kamp & Reyle (1993), where a referential "universe of discourse" is distinguished from the proper semantic representation, which in Kamp (2001) is furthermore split up into a descriptive and a presuppositional component. Distinctions of this type do not affect the nature of basic semantic components, which they necessarily rely on. Hence we need not go into the details of these extensions. We only mention that components like BECOME involve presuppositions, as observed in Bierwisch (2010), or that the extensively discussed definiteness operator DEF depends on the universe of discourse.

Two further points must be added to this exposition, both of which show up in ordinary nouns like bottle, characterized by the following highly provisional lexical entry:

(11) /bottle/; [-V, +N]; λx [[PHYSICAL x] [ARTIFACT x] [CONTAINER x] [BOTTLE x]]
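Before turning to these two points, here is a minimal sketch in Python of how the lambda prefix in entries like (9) and (10) works. The encoding of SF as nested tuples is our own, chosen only to make the functor-argument hierarchy visible; it is not Bierwisch's or Wunderlich's notation.

```python
# Lexical entries as curried functions: the lambda prefix fixes the order
# in which syntactic complements saturate the argument positions of SF.
# SF itself is encoded here as nested tuples of constants and arguments.

def open_transitive(y):
    """(9): lambda-y lambda-x [[ACT x] CAUSE [BECOME [OPEN y]]]"""
    return lambda x: (("ACT", x), "CAUSE", ("BECOME", ("OPEN", y)))

def open_intransitive(y):
    """(10): lambda-y [BECOME [OPEN y]] -- the causative layer
    ([ACT x], [CAUSE], and the operator lambda-x) has been dropped."""
    return ("BECOME", ("OPEN", y))

# Saturating the object position first, then the subject position:
print(open_transitive("the door")("Eve"))
# (('ACT', 'Eve'), 'CAUSE', ('BECOME', ('OPEN', 'the door')))

print(open_intransitive("the door"))
# ('BECOME', ('OPEN', 'the door'))
```

The point of the currying is that the object variable y sits lower in the SF hierarchy than the subject variable x, which is one way of cashing out the claim that a verb's syntactic argument structure is read off the positions of the variables in its Semantic Form.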
The first point concerns the role of standard classificatory features like [PHYSICAL], [OBJECT], [ARTIFACT], [FURNITURE], [CONTAINER], etc., which are actually the root of the notion feature, descending from the venerable tradition of analyzing concepts in terms of genus proximum and differentia specifica. Although natural concepts of objects or any other sort of entities obviously do not comply with orderly hierarchies of classification, the logical relations and the (at least partial) taxonomies that classificatory features give rise to clearly play a fundamental role in conceptual structure and lexical meaning. Entries like (11) indicate how these conditions are reflected in SF: features like [PERSON], [ARTIFACT], [PHYSICAL] etc. are elements of type (e,t), i.e. properties that classify the argument they apply to. Different conditions that jointly characterize an object x, e.g. as a bottle, must according to the general structure of SF make up an integrated unit of a resulting common type. Intuitively, the four conditions in (11), which are all propositions of type t, are connected by logical conjunction, yielding a complex proposition, again of type t. This can be made explicit by means of an (independently needed) propositional functor & of type (t,(t,t)), which turns two propositions into simultaneous conditions of one complex proposition. It might be noted that & is asymmetrical, in line with the general definition of functor types in (6b), according to which a functor takes exactly one argument, such that two-place functors combine with their arguments in two hierarchical steps. This is due to the lack of linear order in SF, by which the relation between elements becomes structurally asymmetrical. In the case of conjunction, which is generally assumed to be symmetrical with regard to the conjuncts, this asymmetry might to some extent correspond to the specificity of conditions, such that the SF of (11) would come out as (12):

(12) λx [[PHYSICAL x] [& [ARTIFACT x] [& [CONTAINER x] [& [BOTTLE x]]]]]

Thus [& [CONTAINER x] [& [BOTTLE x]]] is the more specific condition added to the condition [ARTIFACT x]. Whether this is in fact the appropriate way to reflect hierarchical classification must be left open, especially as there are innumerable cases of joint conditions that do not make up hierarchies but reflect properties of cross-classification or just incidental combinations of conditions, differences the asymmetry of & cannot reflect. It might be added that various proposals have been made to formally represent joint conditions within lexical items as well as in standard types of adverbial or adnominal modification like little, green bottle or walk slowly through the garden.

The second point to be added concerns elements like [BOTTLE], whose status is problematic and characteristic in various respects. On the one hand, standard dictionary definitions of bottle like with a narrow neck, usually made of glass suggest that [BOTTLE] is a heuristic abbreviation to be replaced by a more systematic analysis, turning it into a complex condition consisting of proper elementary features.
The second point to be added concerns elements like [ bottle ], whose status is problematic and characteristic in various respects. On the one hand, standard dictionary definitions of bottle like "with a narrow neck, usually made of glass" suggest that [ bottle ] is a heuristic abbreviation to be replaced by a more systematic analysis, turning it into a complex condition consisting of proper elementary features. On the other hand, any further analysis might run into problems that cannot reasonably be captured in SF. Thus even if the conditions indicated by narrow neck could be accounted for by appropriate basic elements, it is unclear whether the features of neck, designating the body-part connecting head and shoulders, can be used in bottle without creating a vicious circle, as bottle would be required to identify the right sort of neck. What is important here is not a matter of particular detail, but a kind of problem which shows up in innumerable places and in fact marks the limits of semantic structure. The issue has been treated in different ways. Laurence & Margolis (1999) appropriately called it "the problem of Completers",
dealing with the residue of systematic analysis. It will be taken up below; cf. also article 32 (Hobbs) Word meaning and world knowledge.

Features of PF and SF were compared in §3.2 with respect to form and combination, but not with respect to their interpretation, which for PF according to condition (3c) consists in instructions on mechanisms in A-P. A straightforward counterpart for SF would consider its basic elements as conditions on mechanisms of C-I. This analogy, apparently suggested in schema (1), would be misleading for various reasons, though. The size of the repertoire, the diversity of the involved components of C-I, and the status of SF as interface all raise problems wildly differing from those of PF. The subsequent sections deal with the conditions on the interpretation of semantic features in more detail. Distinctions between at least three kinds of basic elements will have to be made, not only because of the ambivalent status of the completers just mentioned.

4. Semantic aspects of morpho-syntactic features

4.1. Interpretability of features and feature instances

The semantic aspect of morphological and syntactic categories is a matter of continuous debate. As already mentioned, morphological features specifying categories like Case, Number, Gender, Tense etc. were considered by Jakobson, Hjelmslev, and others as grammatical elements with essentially semantic content, independently of the PF-realization assigned to them by rules like (4). We cannot deal here with the interesting results of these approaches in any detail. It must be noted, however, that the insights into conceptual aspects of morphological categories were never incorporated into systematic and coherent semantic representations; their integration was left to common-sense understanding, mainly because appropriate frameworks of semantic representation were not available.

To account for conceptual aspects of morpho-syntactic features, two important distinctions must be recognized. First, two different roles of feature instances are to be distinguished, which might be called "dominant" and "subdominant", for lack of better terms. Dominant instances are essentially elements that categorize the expression they belong to; subdominant instances are conditions on participants of various syntactic relations, including selection of complements, agreement, and concord. Thus the pronoun her is categorized by the dominant features [+Feminine, -Plural, +Accusative], among others, while prepositions like with or verbs like know specify their selection restriction by subdominant instances of the feature [+Accusative]. Similarly, the proper name John is categorized by the dominant features [+N, +3.Person, -Plural], while the (inflected) verb knows has the subdominant features [+3.Person, -Plural]. The intricate details of the relevant morpho-syntactic relations and the technicalities of their formal treatment are notoriously complex and cannot be dealt with here. We merely note that different feature instances are decisive in mediating the relation between PF and SF. The crucial point is that subdominant feature instances have no semantic interpretation, neither in selection restrictions nor as conditions in agreement or concord. Thus, in a case like these boys knew her, the feature [+Plural] of boys, the feature [+Past] in knew, and the feature [+Feminine] in her can participate in semantic interpretation, but the feature [+Plural] of these or the Number- and Person-features of the verb cannot.
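The division of labor between the two roles can be rendered schematically as follows; the data structure and the toy lexicon are assumptions made purely for illustration, and whether a dominant instance is actually interpreted depends on the further distinction drawn below:

from dataclasses import dataclass

@dataclass
class FeatureInstance:
    feature: str       # e.g. "+Plural", "+Past", "+Feminine"
    dominant: bool     # True if it categorizes the expression itself

# Feature instances in "these boys knew her" (simplified)
sentence = [
    ("these", FeatureInstance("+Plural", dominant=False)),    # concord with 'boys'
    ("boys",  FeatureInstance("+Plural", dominant=True)),
    ("knew",  FeatureInstance("+Past", dominant=True)),
    ("knew",  FeatureInstance("+Plural", dominant=False)),    # subject agreement
    ("knew",  FeatureInstance("+3.Person", dominant=False)),  # subject agreement
    ("her",   FeatureInstance("+Feminine", dominant=True)),
]

def candidates_for_interpretation(instances):
    # Only dominant instances may participate in semantic interpretation;
    # subdominant ones merely regulate selection, agreement, and concord.
    return [(word, fi.feature) for word, fi in instances if fi.dominant]

print(candidates_for_interpretation(sentence))
# [('boys', '+Plural'), ('knew', '+Past'), ('her', '+Feminine')]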
The second distinction to be made concerns the interpretability of dominant feature instances. Three possibilities have to be recognized: first, features with a constant interpretation, all dominant instances determining a fixed semantic effect; second, features with a conditional interpretation, subject to different interpretations under different circumstances; and third, features with no interpretation, whether dominant or not. Clear cases of the first type are Tense and Person, whose interpretation is invariant. Gender and Number in many languages are examples of the second type. Thus [+Plural] has a clear conceptual effect in nouns like boys, nails, clouds, or women, which it doesn't have in scissors, glasses or trousers, and [+Feminine] is semantically vacuous in the case of nouns designating ships. The paradigm case of the third type is structural Case. Lack of conceptual content holds even for instances where a clear semantic effect seems to be related to Case-distinctions, such as the contrast between German Dative and Accusative in locative auf dem Dach (on the roof) vs. directional auf das Dach (onto the roof), because the contrast is actually due to the SF of the preposition, not the dominant Case of its object, as shown in Bierwisch (1997). We will return to abstract Case shortly; see also article 78 (Kiparsky & Tonhauser) Semantics of inflection.

This three-way distinction is realized quite differently in different languages. A particularly intriguing case in point is Gender in German, which has dominant instances in the categorization of all nouns. Now, the feature [+Feminine] corresponds to the concept female in cases like Frau (woman) or Schwester (sister), but has no interpretation in the majority of inanimate nouns like Zeit (time), Brücke (bridge), Wut (rage), many of which inherit the feature [+Feminine] from derivational suffixes like -t, -ung, -heit, as in Fahr-t (drive), Wirk-ung (effect), Dumm-heit (stupidity), etc. Moreover, some nouns like Weib (woman) or Mädchen (girl) are categorized as [-Feminine], irrespective of their SF. Even more intricate are [+Feminine] cases like Katze (cat), Ratte (rat), which designate either the species (without specified sex) or the female animal. Further complications come with nouns like Person (person) with the feature [+Feminine], which are animate and may or may not be female.

Three conclusions follow from these observations: First, morpho-syntactic features must be distinguished from their possible semantic impact, because these features sometimes do, sometimes do not have a conceptual interpretation. Second, the relation between morpho-syntactic features and elements of SF, which mediate the conceptual content, must be determined by conditions comparable to those sketched in (4) relating morphological features to PF, even if the source and the content of the conditions is remarkably different. Third, the relation of morphological features to SF must be conditional, applying to dominant instances depending on sometimes rather special morphological or semantic circumstances.

4.2. The interpretation of morphological features

Morpho-syntactic features are binary conditions of the computational system that accounts for the combinatorial matching of PF and SF. The semantic value corresponding to these features consists of elements of the functor-argument structure of SF.
Hence the conditions that capture this correspondence must correlate elements and configurations of two systems, both of which are subject to their respective principles. (13) illustrates the point for the feature [+Past] occurring, e.g., in the door opened, indicating that an
utterance u of this sentence denotes an event s the time T of which precedes the time T of u, where T is a functor of type (e,e) and BEFORE a relation of type (e,(e,t)):

(13) [+Past] ↔ [[T s] [BEFORE [T u]]]

This is a simplification which does not reflect the complex details discussed in the huge literature on temporal relations, but (13) should indicate in principle how the feature [+Past], syntactically categorizing the verb (or whatever one's syntactic theory requires), can contribute to meaning. More specifically, most semantic analyses agree that tense information applies to events instantiating in one way or another the propositional condition specified by the verb and its complements. With this proviso, we get (14) as the SF of the door opened, if we expand (10) by the event instantiation expressed by the operator inst of type (t,(e,t)):

(14) [[[T s] [BEFORE [T u]]] & [s [INST [BECOME [OPEN [DEF x [DOOR x]]]]]]]

As [+Past] has a constant interpretation in English, (13) needs no restricting condition: instances of [+Past] are always subject to (13). It might be considered as a kind of lexical entry for [+Past] which applies to all occurrences of the feature, whether its morpho-phonological realization is regular or idiosyncratic. There are further non-trivial questions related to conditions like (13). One concerns the appropriate choice of the event argument s, which obviously depends on the "scope" of [+Past], i.e. the constituent it categorizes. Another one is the question what (13) means for the interpretation of [-Past], i.e. morphological present tense. One option is to take [-Past] to determine the negation NOT [[T s] [BEFORE [T u]]]. A more general strategy would consider [-Past] as the unmarked case, which is semantically unspecified and simply doesn't determine a temporal position of the event s. These issues concerning the conceptual effect of morphological categories are highly controversial and need further clarification.

Conditional interpretation of morphological features is exemplified by [+Plural] in (15), where [ collection ] of type (e,t) must be construed as imposing the condition that its argument consists of elements of equal kind, rather than being a set in the sense of set theory, since the denotation of a plural NP like these students is of the same type as that of the singular the student. See, e.g., Kamp & Reyle (1993) for discussion of these matters and some consequences.

(15) [+Plural] ↔ [ collection x ]

Like (13), condition (15) might be construed as a kind of lexical information about the feature [+Plural], which would have the effect that for a plurale tantum like people the regular collective reading follows from the idiosyncratic categorization by [+Plural]. Now, the conditional character of (15) would have to be captured by lexically blocking nouns like glasses or measles from the application of (15). Thus measles would be marked by two exceptions: idiosyncratic categorization as [+Plural], but at the same time exemption from the feature's interpretation. This again raises the question of interpreting [-Plural], and again, the strategy to leave the unmarked value of the feature [±Plural] without semantic effect seems to be promising, assuming that the default type of reference is just neutral with respect to the condition represented as [ collection ]. A revealing illustration of lexical idiosyncrasies is the contrast between German Ferien (vacation) and its near synonym Urlaub, where Ferien is a plurale tantum with idiosyncratic singular reading, much like measles, while Urlaub allows for [+Plural] and [-Plural], with (15) applying in the former case.
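The interplay of rule (15) with idiosyncratic categorization and blocking can be sketched as follows; the lexicon entries and the function are hypothetical illustrations, not a proposal:

# Each noun: (idiosyncratically categorized as [+Plural]?, blocked for (15)?)
LEXICON = {
    "boys":    (False, False),
    "people":  (True,  False),   # plurale tantum; collective reading via (15)
    "measles": (True,  True),    # plurale tantum; exempt from (15)
    "glasses": (True,  True),
    "Ferien":  (True,  True),    # idiosyncratic singular reading
    "Urlaub":  (False, False),   # (15) applies when [+Plural] is chosen
}

def interpret_plural(noun, plus_plural):
    # Returns the SF condition contributed by [+Plural], or None.
    # [-Plural] is treated as the unmarked value: no semantic contribution.
    idiosyncratic, blocked = LEXICON[noun]
    if (plus_plural or idiosyncratic) and not blocked:
        return ("COLLECTION", "x")
    return None

print(interpret_plural("boys", True))       # ('COLLECTION', 'x')
print(interpret_plural("people", True))     # ('COLLECTION', 'x')
print(interpret_plural("measles", True))    # None: doubly exceptional
print(interpret_plural("Urlaub", False))    # None: unmarked [-Plural]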
Still more complex conditions obtain, as already noted, for the category Gender in German. The two features [±Masculine] and [±Feminine] identify the three Genders as follows:

(16) a. Masculine: [+Masculine]; Mann (man), Löffel (spoon)
     b. Feminine: [+Feminine]; Frau (woman), Gabel (fork)
     c. Neuter: [-Masculine, -Feminine]; Kind (child), Messer (knife)

Only plus values of Gender features are related to semantic conditions, and only for animate nouns. This can be expressed as follows:

(17) a. [+Masculine] ↔ [ male x ] / [ animate x ]
     b. [+Feminine] ↔ [ female x ] / [ animate x ]

The correspondence expressed in (17) holds only for cases where the semantic condition [ animate x ] is present; hence the Gender features of nouns like Löffel, Gabel have no semantic impact. According to the strategy that leaves minus-valued features semantically uninterpreted, nouns like Kind and Messer would both be equally unaffected by (17), while derivational affixes like -ung, -heit, -t, -schaft project their inherent Gender specification even when it is not interpretable by (17). Conditions like (17) and (15) might reasonably be considered as capturing lexical redundancy, such that the categorization as [+Masculine] is predictable for nouns like Vater (father), the SF of which contains [[animate x] & [male x]]. Idiosyncratic specifications would then be involved in various cases of blocking (17). Thus Weib (woman) is [ female x ] in SF, but categorized as [-Masculine, -Feminine]. On the other hand, Zeuge (witness) is categorized as [+Masculine], but blocked for (17), although marked as [ animate x ]. Similarly Person (person) is [+Feminine], but still not [ female x ], as eine männliche Person (a male person) is not contradictory.
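The contextually restricted correspondences in (17), together with the idiosyncratic blockings just listed, can be pictured in the same schematic fashion; all entries and names are, again, illustrative assumptions:

# Each noun: (Gender feature, SF contains [animate x]?, blocked for (17)?)
LEXICON = {
    "Frau":   ("+Feminine",  True,  False),
    "Vater":  ("+Masculine", True,  False),
    "Gabel":  ("+Feminine",  False, False),   # inanimate: (17) inapplicable
    "Zeuge":  ("+Masculine", True,  True),    # animate, but exempt from (17)
    "Person": ("+Feminine",  True,  True),    # may or may not be female
}

GENDER_RULES = {"+Masculine": ("MALE", "x"), "+Feminine": ("FEMALE", "x")}

def interpret_gender(noun):
    # Apply (17a)/(17b) only if the context condition [animate x] holds
    # and the entry does not idiosyncratically block the rule.
    feature, animate, blocked = LEXICON[noun]
    if animate and not blocked:
        return GENDER_RULES.get(feature)   # minus values: no contribution
    return None

print(interpret_gender("Frau"))    # ('FEMALE', 'x')
print(interpret_gender("Gabel"))   # None: not animate
print(interpret_gender("Zeuge"))   # None: idiosyncratically blocked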
Finally, a short remark is in order about structural or abstract Case, the paradigm case of features devoid of SF-interpretation. Two points have to be made. First, structural Case must be distinguished from semantic or lexical Case, e.g., in Finno-Ugric and Caucasian languages, representing locative and directional relations of the sort expressed by prepositions in Indo-European languages. Semantic Cases like Adessive, Inessive, etc. correspond to in, on, at, etc. Although the borderline between abstract and semantic Case is not always clear-cut, raising non-trivial questions of detail, the role of abstract Case we are concerned with is clear enough in principle. There is, of course, a natural temptation to extend successful analyses of semantic Cases to the Case system in general, relying on increasingly abstract characterizations, which are compatible with practically all concrete conditions. Hjelmslev (1935) is an impressive example of an ingenious strategy of this sort; cf. also article 78 (Kiparsky & Tonhauser) Semantics of inflection.

Second, the semantic aspect of morphological features must not be confused with their participation in expressions whose semantic structure differs for independent reasons, as already noted with regard to the locative/directional distinction of the German prepositions in, an, auf, etc. The most plausible candidates for possible semantic effects of abstract Case are thematic roles in constructions like he hit him, where Nominative and Accusative relate to Agent and Patient. Identity or difference of meaning is not due to semantic features assigned to abstract Case, however, as can be seen from constructions like (18), where the verb anziehen (put on, dress) assigns the same role Recipient alternatively by means of Dative, Accusative, Nominative, and Genitive in the four cases in (18a-d), while in (18b) and (18e) the different roles Recipient and Theme are marked by the same Case, Accusative.

(18) a. er zieht dem Patienten (Dat) den Mantel (Acc) an (he puts the coat on the patient)
     b. er zieht den Patienten (Acc) an (he dresses the patient)
     c. der Patient (Nom) wird angezogen (the patient is dressed)
     d. das Anziehen des Patienten (Gen) (the dressing of the patient)
     e. er zieht den Mantel (Acc) an (he puts on the coat)

Hence the roles of complements cannot derive from Case features, but must be determined by the SF of the verb in the way indicated in (9) and (10). The selection restrictions the verb imposes on its complements are inherent lexical conditions of the verb, modulated by passivization in (18c) and nominalization in (18d). See Wunderlich (1996b) for a discussion of further interactions between grammatical and semantic conditions; cf. also article 84 (Wunderlich) Operations on argument structure.

To sum up, morphological features must clearly be distinguished from elements of SF, but may correspond to them by systematic conditions; and they may participate in making semantic distinctions explicit, even if they don't have any semantic purport at all. This point is worth emphasizing in view of considerations like those in Svenonius (2007), who adumbrates conceptual content for morphological features in general, including Gender and abstract Case, not distinguishing instances with and without semantic purport (as in Gender), or semantic purport and the grammatical distinction it depends on (as in structural Case).

4.3. Syntactic categories

While categories like Noun, Verb, Preposition, Determiner, etc. are generally assumed to determine the combinatorial properties of lexical items and complex expressions, the identification of their features is still a matter of debate. Incidentally, the terminology vacillates between syntactic and lexical categories, depending on whether functional categories like Complementizer, Determiner, Tense, etc. and phrasal categories like NP, VP, DP, etc. are included. For the sake of clarity, I will talk about syntactic categories and their features. Alternatives to the initial proposal [±Verbal, ±Nominal] generally recognize the determination of argument structure as the major effect of the features in question. One point to be noted is that nouns and adjectives do not have strong argument positions, i.e. they allow their complements to be dropped, in contrast to verbs and prepositions, which normally require their argument positions to be syntactically realized. Hence the feature [+Nominal] could be construed as indicating the argument structure to be weak in this sense. This is more appropriately expressed by an inverse feature like [-Strong Arguments]. Furthermore, verbs and nouns are recognized to require functional heads
(roughly Determiner for nouns, and Tense and Complementizer for verbs), ultimately supporting the referential properties of their constituents, a condition that does not apply to adjectives and prepositions. This can be expressed by a feature [+Referential], which replaces the earlier feature [Verbal]. A similar proposal is advocated in Wunderlich (1996a). It might be noted that the feature [+Referential] - or whatever counterpart one might prefer - has a direct bearing on the categorization of the functional categories Determiner, Tense, and Complementizer, and the constituents they dominate. These matters go beyond the present topic, though. We would thus get the following still provisional proposal to capture the distinctions just noted:

(19)              Verb   Noun   Adjective   Preposition
     Strong Arg    +      -        -            +
     Referential   +      +        -            -

Further systematic effects these features impose on the argument structure and selection restrictions depend on general principles of Universal Grammar and on language-particular morphological categories. Thus in English and German, Nominative- and Accusative-complements of verbs must be realized as Genitive-attributes of corresponding nouns, as shown by well-known examples like John discovered the solution vs. John's discovery of the solution. On the whole, then, the features in (19) have syntactic consequences and regulate morphological conditions, but don't seem to call for any semantic interpretation.
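Read as a decision table, (19) amounts to the following trivial mapping; the boolean encoding and the dictionary are assumptions made for illustration only:

CATEGORIES = {
    # ([+/-Strong Arg], [+/-Referential]) -> category
    (True,  True):  "Verb",          # strong arguments; needs Tense/Complementizer
    (False, True):  "Noun",          # droppable complements; needs Determiner
    (False, False): "Adjective",
    (True,  False): "Preposition",
}

assert CATEGORIES[(True, True)] == "Verb"
assert CATEGORIES[(True, False)] == "Preposition"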
There is, however, a traditional, if not very clear, view according to which syntactic categories have an essentially semantic or conceptual aspect, where nouns, verbs, adjectives, and prepositions denote, roughly speaking, things, events, properties, and relations, respectively. Even Hale & Keyser (1993) adopt within the "Minimalist Program" a notional view of syntactic categories, assuming verbs to have the type "(dynamic) event" e, prepositions the type "interrelation" r, adjectives the type "state" s, with only the type n of nouns left without further specification. There are, however, various kinds of counterexamples, such as adjectives denoting relations like similar or dynamic aspects of events like sudden. Primarily, though, all these conditions are neatly accounted for by the SF and the argument structure of the items in question, as shown by a rough illustration like (20). According to (20a), a transitive verb like open denotes an event s which instantiates an activity relating x to y; (20b) shows that the relational noun father denotes a person x related to the (syntactically optional) individual y; (20c) indicates that the adjective open denotes a property of the entity y; and (20d) sketches the preposition in, which denotes a relation locating the entity x in the interior of y.

(20) a. /open/ λy λx λs [s inst [[act x] [cause [become [open y]]]]]
     b. /father/ λy λx [[person x] & [[male x] & [x [parent-of y]]]]
     c. /open/ λy [open y]
     d. /in/ λy λx [x [loc [interior y]]]

Clearly enough, the relevant sortal or ontological aspects of the items are taken care of without further ado. This is not the whole story, though. Even if the semantic neutrality of the features in (19) might be taken for granted, one is left with the well-known but intriguing asymmetry between nouns and verbs, both marked [+Referential], but subject to different constraints: while nouns have access to practically all sortal or ontological domains, verbs are restricted to events, processes, and states. The asymmetry is highlighted by basic lexical items like (21), which can show up as verbs as well as nouns, but with different semantic consequences:

(21) a. run, walk, jump, rise, sleep, use, ...
     b. bottle, box, shelf, fence, house, saddle, ...

Items like (21a) have essentially the same SF as nouns and verbs, differing only by their morpho-syntactic realization, as in Eve runs vs. Eve's run. By contrast, items like (21b) occurring as verbs differ semantically from the corresponding noun, as illustrated by Eve bottles the wine, Max saddles the horse, etc. In other words, nouns allow for object- as well as event-reference, while verbs cannot refer to objects. Hence items like run need only allow for alternative values of the feature [±Strong Arg] to yield nominal or verbal realizations, while nouns like bottle with object denotation achieve the necessary event reference only through additional semantic components like cause, become, and loc. These conditions can be made explicit in various ways. What we are left with in any case is the observation that the event reference of verbs seems to be constrained by syntactic features. Event reference is used here (as before) in the general sense of eventuality discussed in Bach (1986), including events, processes, states, activities - in short, entities that instantiate propositions and are subject to temporal identification as assumed in (13) above for Tense features. Two points should be made in this respect: First, the constraint is unidirectional, as verbs require event reference, but event reference is not restricted to verbs, in view of nouns like decision, event, etc. Second, as it concerns verbs but none of the other categories, it cannot be due to one feature alone, whether (19) or any other choice is adopted. Using the present notation, the constraint could be expressed as follows:

(22) [+Referential, +Strong Arg] → λs [s inst p]

This condition singles out verbs, as desired, and it determines the relevant notional condition on their referential capacity, if we consider inst as imposing the sortal requirement of eventualities on its first argument. It also has the correct automatic result of fixing the event reference for verbs, which is the basis for the semantic interpretation of Tense, Aspect, and Mood. Something like (22) might, in fact, not only be part of the (perhaps universal) conditions on syntactic features in general and those of verbs in particular, but can also be considered as a redundancy condition on lexical entries, providing verbs with the event instantiation as their general semantic property. If so, the lexical information of entries like (20a) could be reduced to (9), with the event reference being supplied automatically.
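Understood as a redundancy condition, (22) can be sketched as an operation over stored entries; the encoding is hypothetical, and the verb's argument operators λy λx are omitted for brevity:

def add_event_instantiation(features, sf_core):
    # (22): entries marked [+Referential, +Strong Arg], i.e. verbs, receive
    # the event-binding layer lambda-s [s INST p] automatically.
    if features.get("Referential") and features.get("StrongArg"):
        return ("lam", "s", ("INST", "s", sf_core))
    return sf_core

# the stored, reduced SF core of 'open' as in (9)
open_core = ("CAUSE", ("ACT", "x"), ("BECOME", ("OPEN", "y")))

print(add_event_instantiation({"Referential": True, "StrongArg": True}, open_core))
# ('lam', 's', ('INST', 's', ('CAUSE', ('ACT', 'x'), ('BECOME', ('OPEN', 'y')))))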
5. Interpretation of semantic features

5.1. Conditions on interpretation

Basic elements of SF, including those corresponding to morpho-syntactic features, are eventually related to or interpreted by the structures of C-I. As already mentioned, this interpretation, its conditions, and consequences are analogous to the interpretation of phonetic features in A-P, yet different in fundamental respects. Analogies and differences can be characterized by four major points.

First, phonetic and semantic features alike are primitive elements of the representational systems they belong to; they cannot be reduced to smaller elements. Structural differences of PF cannot be due to distinctions within elements like [-voiced] or [+nasal], just as distinct representations of SF cannot be due to distinctions within components like [ animate ], [ cause ], or [ interior ], although acoustic properties of an utterance (say, because of different speakers), or variants of animacy or causation, might well be at issue (e.g., due to differences in intentionality). In other words, representations of PF and SF and their correspondence via morpho-syntax cannot affect or depend on internal structures of the primitive elements of PF and SF.

Second, the interpretation of basic elements is nevertheless likely to be complex, and in any case subject to additional, rather different principles and relations. Basically, primitive elements of PF and SF recruit patterns based on different conditions of mental organization for the structural building blocks of language: the correlates of phonetic features in A-P integrate conditions and mechanisms of articulation and auditory perception, whereas correlates of semantic features in C-I involve patterns of various domains of perception, intention, and complex structural dependencies, as shown by cases like [ animate ] or [ artifact ], combining sensory with abstract, purpose-oriented conditions. This is the very gist of positing interface elements, which deal with items of two currencies, so to speak, valid under intra- and extra-linguistic conditions, albeit in different ways. As the present article necessarily focuses on the interpretation of basic elements, it has to leave aside encroaching conditions involving larger structures, where compositional principles of PF and SF are partially suspended. With respect to PF, overlay effects can be seen, e.g., in whispering, where suspending [±voiced] has consequences for various other features. With respect to SF, the wide range of metaphor, for instance, is based on more or less systematic shifts within conceptually coherent areas. Projecting, e.g., conditions of animacy into inanimate domains, as is frequently done in talk about computers or other technology, obviously alters the interpretation of features beyond [ animate ]. Phenomena of this sort may partially suspend the compositionality of interpretation, still presupposing the baseline of compositional matching, though.

Third, PF interacts with one complex but unified domain, viz. production and processing of acoustic signals, and it shares with A-P the linear, time-dependent structure as the basic principle of organization. SF, on the other hand, deals with practically all dimensions of experience, motivating the abstract functor-argument structure sketched above. It cannot possibly match the different organizational principles inherent in domains as diverse as visual perception, social relations, emotional values or practical intentions; hence it must be neutral with respect to all of them. In other words, the relation of SF to the diverse subsystems of C-I cannot mean that it shares the specific structure of different domains of interpretation.
To mention just one case in point: Color perception, though highly integrated with other aspects of vision, is structured by autonomous principles, which are completely different from the distinctions of color categories and their relations in SF, as discussed in Berlin & Kay (1969) and subsequent work. It follows from these observations that the interpretation of SF must be compatible with conditions obtaining in disparate domains - not only with regard to the compositionality of complex configurations, but also with regard to the nature and content of its primitive elements.
Fourth, while for good reasons PF is assumed to be the interface with both articulation and perception, possibly even granting their integration, it is unclear whether in the same way SF could be the interface integrating all the subsystems and modules of C-I. Notice that this is not the same issue as the fact that two or more systems organized along different principles might well have a common interface. Singing, for instance, integrates language and music, and visual perception integrates shape and color, which originate from quite different neuronal mechanisms. On the one hand, for reasons of mental economy one would expect language, which forces SF to interface with the whole range of different modules of mental organization anyway, to be the designated representational system unifying the systems of C-I, setting the stage for coherent experience and mental operations like planning, reasoning, and orientation. The central system posited in Fodor (1983) as the global domain the modular input/output systems interface with might be taken to stipulate this kind of overall interface. On the other hand, there are well-known systems integrating different modules of C-I independently of SF and according to autonomous, different principles of organization. The most extensively studied case in point is spatial orientation, which integrates various modes of perception and locomotion and is fundamental for cognition in general, as it also provides a representational format for various other domains, including patterns of social relations or abstract values, as documented in the vast literature about many aspects of spatial orientation, its neurophysiological basis, its computational realization and psychological representation, including the particular conditions of spatial problem solving. An instructive survey is found in Jackendoff (2002) and the contributions in Bloom et al. (1996), which deal in particular with the representation of space in language; see also article 107 (Landau) Space in semantics and cognition. Similar considerations hold for other domains, like music, emotion, or plans of action. The question whether SF serves as the central instance, providing the common interface that directly integrates the different modules and subsystems of C-I, or whether there are separate interfaces mediating among parts of C-I independently of language, must be left open here. In any case, SF and its basic elements must eventually interface with all aspects of experience we can talk about. In other words, color, music, emotions, other people's minds, and theories about everything must all in some way participate in interpreting the basic elements of SF.

5.2. Remarks on Conceptual Structure

One of the major problems originating from these considerations is the difficulty of specifying at least tentatively the format of the interpretations which elements of SF receive in C-I and its subsystems. To be sure, different branches of psychology and the cognitive sciences provide important and elaborate theories for particular mental domains, such as Marr's (1982) seminal theory of vision. But because of their genuine task and orientation, they deal with specific domains within the mental world. Thus they are far from covering the whole range of areas that elements of SF have to deal with, and have little to say about the interpretation of linguistic expressions.
Proposals seriously dealing with the question of how extra-linguistic structures correspond to linguistic elements and configurations are largely linguo-centric, i.e. they approach the problem via insights and hypotheses developed and motivated by the analysis of linguistic expressions. An impressive large-scale approach of this sort is Miller & Johnson-Laird (1976), providing an extensive survey of perceptual and conceptual structures that interpret
linguistic expressions. The approach systematically distinguishes the "Sensory Field" and the representations of the "Perceptual World" it gives rise to from the semantic structure of linguistic expressions, the format of which is very similar to that of SF as discussed here. (23) illustrates elements of the perceptual world, (24) the semantic elements based on them.

(23) a. Cause (e, e')   Event e causes event e'
     b. Event (e)       e is an event

(24) a. happen (S): An event x characterized by S happens at time t if:
        (i) Event (x)
     b. cause (S, S'): Something characterized by S causes something S' if:
        (i) happen (S)
        (ii) happen (S')
        (iii) Cause ((i), (ii))

These examples cannot do justice to the general approach, but they correctly show that perceptual and semantic representations have essentially the same format, suggesting that semantic structures and their interpretation are of roughly the same sort and are related by conditions as in (24). (This is surprising in view of the fact that Johnson-Laird (1983) explores the mental structure of spatial representations and the inferences based on them, showing that they are systematically different from the semantic structure of verbal expressions describing the same spatial situations, thus supporting different inferences.) In other words, perceptual interpretation could be (mis)construed as turning the organization of SF into principles of perception, something Miller and Johnson-Laird clearly do not suggest.
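The flavor of definitions like (24) can be conveyed by a schematic sketch in which the semantic predicates are checked directly against perceptual-world relations; the encoding, which collapses the descriptions S, S' into the events they characterize, is a simplifying assumption and no part of Miller & Johnson-Laird's system:

perceived_events = {"e1", "e2"}         # elements for which Event holds
perceived_causation = {("e1", "e2")}    # pairs for which Cause holds

def happen(e):
    # (24a): something characterized by S happens if Event holds of it
    return e in perceived_events

def cause(e, e_prime):
    # (24b): S causes S' if both happen and the perceptual relation
    # Cause connects them
    return happen(e) and happen(e_prime) and (e, e_prime) in perceived_causation

print(cause("e1", "e2"))   # True
print(cause("e2", "e1"))   # False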
In a different but comparable approach, Jackendoff (1984, 2002) assumes representations of the sort illustrated in (8), which he calls Conceptual Structures, to cover semantic representations and much of their perceptual interpretation. Thus, like Miller & Johnson-Laird, he takes the format of semantic representations to be the model of mental systems at large - with one important exception. In Jackendoff (1996, 2002) a system of Spatial Representation SR, which Conceptual Structure CS interfaces with, is explicitly assumed as a system based on separate principles, some of which, like deictic frameworks, identification of axes, or orientation of objects, are made explicit. The systematic interaction of CS and SR, however, is left implicit, except that its roots are fixed by means of hybrid combinations inside lexical items, as indicated by the following entry for dog, adapted from Jackendoff (1996: 11).

(25) PF: /dog/
     Categorization: [+N, -V, +Count, ...]
     CS: [ANIMAL x & [[CARNIVORE x] & [POSSIBLE [PET x]]]]
     SR: [3-D model with motion affordances]
     Auditory: [sound of barking]

The notion of a 3-D model invoked here is taken from the theory of vision in Marr (1982), extended by a mental representation of motion types Marr's theory does not deal with. The actual model must be construed as the prototype of dogs in the sense of Rosch & Mervis (1975) and related work. Jackendoff's inclusion of auditory information points (without further comment) to the natural condition that the concept of dogs includes the characteristic voice, the mental representation of which requires access to the organization of auditory perception. Besides this purely sensory quality, however, barking is a linguistically classified aspect of dogs (and foxes) and as such belongs to the SF-information of the entry of dog. In general, then, what kind of representations SF must serve as an interface for is anything but obvious and clear. Principles of propositional structure, on which SF is based, are certainly not excluded from mental systems outside of language, but it is more than unlikely that they can all be couched in this format, as appears to be suggested in Miller & Johnson-Laird (1976), Jackendoff (2002) and (at least implicitly) many others. Spatial reasoning, to mention just one domain, does not rely on descriptions using the principles of predicate logic, as shown in Johnson-Laird (1983).

5.3. The inventory

The diversity of domains SF has to cope with leads to difficulties in identifying the repertoire of its primes - a problem that does not arise in PF, where the repertoire of primes is uncontroversial, at least in principle. For SF, two contrary tendencies can be recognized, which might be called minimal and maximal decomposition. Fodor et al. (1980) defend the view that concepts (which are roughly the meanings of simple lexical items) are essentially basic, unstructured elements. On this account, the inventory of semantic primes (for which the term features ceases to be appropriate) is by and large identical to the repertoire of semantically distinct lexical entries. Although Fodor (1981) admits a certain amount of lexical decomposition, recognizing some indisputable structure within lexical items, for the majority of cases he considers word meanings as primes. As a matter of fact, the problem boils down to the role of Completers, noted above - a point to which we will return. The opposite view requires the SF of all lexical items to be reducible to a fixed repertoire of basic elements, corresponding to the features of PF. There are various versions of this orientation, but explicit reflections on its feasibility are rare. Jackendoff (2002: 334ff) at least sketches its consequences, even contemplating the possibility of combinatorial principles inside lexical items that differ from lexicon-external combination, a view that is clearly at variance with detailed proposals he pursues elsewhere. On closer inspection, the claims of minimal and maximal decomposition are not really incompatible, though, allowing for intermediate positions with more or less extensive decomposition. As a matter of fact, the contrary perspectives end up, albeit in rather different ways, with the need to account for the nature of irreducible elements. Two points are to be made, however, before these issues are taken up. First, with regard to the size of the repertoire, there is no reliable estimate on the market. It is clear that the number of PF-features is on the order of ten or so, but for SF-features the order of magnitude is not even a matter of serious debate.
If decomposition is denied, the repertoire of basic elements is on the order of the number of lexical items; but the decompositional view per se does not lead to interesting estimates either, as there is no direct relation between the number of lexical items and the number of primes. The combinatorial principles of SF would allow for any number of different lexical items on
the basis of whatever number of primes is proposed. Hence for standard methodological reasons, parsimonious assumptions should be expected. However, since reasonable assumptions must be empirically motivated, they would have to rely on systematically analyzed, representative sets of lexical items. But so far, available results have not led to any converging guesses. To give at least an idea about possible estimates, the comprehensive analysis of larger domains of linguistic expressions in Miller & Johnson-Laird (1976) can be taken - with all necessary caveats - to deal with remarkably more than a hundred basic elements of the sort illustrated in (24). This repertoire is due to well-motivated considerations of the authors, leading to interesting results with respect to central cognitive domains but with no general implications, except that huge parts of the English vocabulary could not even be touched. One completely different attempt to set up and motivate a general repertoire of basic elements should at least be mentioned: Wierzbicka (1996) has a growing list of 55 primes on the basis of what is called Natural Semantic Metalanguage, a rather idiosyncratic framework whose more detailed presentation would by far exceed the present limits, but cf. article 17 (Engelberg) Frameworks of decomposition. Among the peculiar assumptions of the approach is the tenet that semantic primes are the meanings of designated, irreducible lexical items, like I, YOU, THIS, NOT, CAN, GOOD, BAD, DO, HAPPEN, WHERE, etc.

Second, as to the content, the repertoire of primes has to respond not only to the richness of the vocabulary of different languages, but first of all to the diversity and nature of the mental domains to be covered. Distinctions must be captured and categorized not only for different perceptual modalities, but also for "higher-order", more central domains like social dependencies, goals of action, generating explanations, etc. There is, after all, no simple and unconditional pattern to order sets of primes.

Two perspectives emerge from the contrary views about decomposition. On the one hand, even strong opponents of decomposition grant a certain amount of formal analysis, which requires a limited, not strictly closed core of basic semantic elements. These elements show up in different guises in practically all pertinent approaches. Besides components related to morpho-syntactic features as noted above, there is a small set of primes like cause, become, and a few others that have been treated as direct syntactic elements in different ways. The "light verbs" in Hale & Keyser (1993) and much related work around the Minimalist Program, for instance, are just syntactically realized semantic components. These different perspectives, emphasizing either the semantic or the syntactic role of Logical Form, are the reason for the terminological tensions mentioned at the outset. Three decades earlier, an even stronger syntactification of semantic elements had been proposed in Generative Semantics by McCawley (1968) and subsequent work, also relying on cause, become, not, and the like. In spite of deep differences between the alternative frameworks, there are clearly recurring reasons to identify central semantic elements and the structures they give rise to; cf. also article 7 (Engelberg) Lexical decomposition, article 17 (Engelberg) Frameworks of decomposition, and article 81 (Harley) Semantics in Distributed Morphology.
On the other hand, even if one keeps to the program according to which lexical items are completely decomposed into primitive elements, the problem of idiosyncratic residues, identified as Distinguishers in Katz & Fodor (1963) and Katz (1972) or as Completers in Laurence & Margolis (1999), must be taken seriously. One might consider elements like [ bottle ] in (11), or taxonomic specifications like [ canine ] for dog or [ feline ] for cat, or distinguishers like [ knight serving under the standard of another
knight ] in Katz & Fodor (1963) as provisional candidates, to be replaced by more systematic items. But there are different domains where one cannot get rid of elements of this sort for quite principled reasons. Taxonomies of animals, plants, or artifacts are by their very nature based on elements that must be identified by idiosyncratic conditions. For different reasons, distinctions in perceptual dimensions like color, heat, pressure, etc. and other mental domains must be identified by primitive elements. In short, there is a complex variety of possibly systematic sources for Completers, such that an apparently undetermined set of basic elements of this sort must be acknowledged. For two reasons, these elements cannot be considered as a side issue, to be dismissed in view of the more principled, linguistically motivated semantic elements. First, they make up the majority of the inventory of basic semantic elements, which is, moreover, not closed on principled grounds, but open to amendments within systematic limits. Second, they are of remarkably different character with respect to their interpretation and their position within SF and the role of SF as the interface with C-I.

To sum up, the overall repertoire that SF-representations are made up of comprises elements of different kinds. They can reasonably be grouped according to two conditions into (a) elements that are related to language-internal systematic conditions of L as opposed to ones that are not, and (b) elements with homogeneous as opposed to potentially heterogeneous interpretation - an important aspect that will be made explicit below. The emerging kinds of elements can be characterized as follows:

(26)                Systematic Features   Idiosyncratic Features   Dossiers
     L-systematic           +                       -                 -
     Homogeneous            +                       +                 -

The distinctions between these kinds of elements are based on systematic conditions with principled theoretical foundations, although the resulting differences need not always be clear-cut, as is usual with empirical phenomena. As closer inspection will show, however, the classification reasonably reflects the fact that linguistically entrenched elements exhibit increasingly homogeneous and systematic properties.

6. Kinds of elements

6.1. Systematic semantic features

The distinction between systematic elements and the rest seems reminiscent of that between semantic markers and distinguishers in Katz & Fodor (1963), but it differs fundamentally, both in principle and in empirical detail. There are at least two ways in which semantic primes can be systematic in the sense of bearing on morpho-syntactic distinctions. First, there are elements that participate in the interpretation of morphological and syntactic features as discussed in §4, including both categorization and the various types of selection. These include properties like [ female ], [ animate ], [ physical ], relations like [ before ], [ location ], or functions like [ surface ], [ proximate ], [ vertical ], etc. Second, there are components on which much of the argument structure of expressions, and hence their syntactic behavior, depends. Besides well-known and widely discussed elements like [ cause ], [ become ], [ act ], features of relational nouns like [ parent ], [ color ], [ extension ] and others also belong to this group. For general
reasons, negation, conjunction, possibility, and a number of other connectives and logical operators must be subsumed here, as they are clearly involved in the way in which SF accounts for the logical organization of cognitive structures. It must be noted, however, that logical constants, explored in pertinent theories of logic, are not necessarily identical with their natural counterparts in SF. The functor & assumed in (12), for example, differs from standard conjunction at least by its formal asymmetry. Further differences between logical and linguistic perspectives have been recognized with regard to quantification and modality. In any case, the interpretation of these elements consists in conceptually basic, integrated conditions, which may or may not be related to particular sensory domains. Causality, for instance, is a fundamental dimension of experience which relies on integrated perceptual conditions, incorporating information from various domains like change and position, as discussed, e.g., in Miller & Johnson-Laird (1976). More specifically, it is not extracted from, but imposed on perceptual information. Dowty (1979) furthermore shows that the conceptual conditions of causality and change have straightforward external, model-theoretic correlates. Similarly, shape, location, relative size, part-whole relation, or possession are integrated and often fairly abstract conditions organizing perception and action. For instance, the relation [ loc ] or functions like [ interior ], [ surface ], etc. cover conditions of rather different kinds, depending on the sortal aspect of their arguments. Thus a page in the novel, a chapter in the novel, an error in the novel, or a scene in the novel pick out different aspects by means of the same semantic conditions represented by the features assumed for in in (20d). Other and perhaps still more complex and integrated aspects of experience are involved in the recognition of animacy or the identification of male and female sex. Even if some ingredients of these complex patterns can be sorted out and verbalized separately (identifying, e.g., physiological properties or behavioral characteristics), the features are holistic components with no internal structure within SF. This corresponds to the fact that the content of PF-features is sometimes accessible to conscious control and verbalization: the articulatory gestures of features like [voiced] or [labial], for instance, may well be reflected on without their encapsulated character being undercut. In general, systematic features are to be construed as the stabilized, linguistic reflex of mechanisms involved in organizing and conceptualizing experience. Fundamental conditions of this sort seem to provide a core of possibilities the language faculty can draw on. Which of these options are eventually implemented in a given language obviously depends to some extent on its morphological and lexical regularities.

6.2. Idiosyncratic semantic features

Two factors converge in this kind of feature. First, there is the fundamental observation, granted even under a radical denial of lexical decomposition, that the meanings of linguistic expressions respond to distinctions within basic sensory domains, from which more complex concepts are constructed. An extensively debated and explored domain in this respect is color perception.
Items like yellow, green, blue differ semantically on the basis of perceptual conditions, but with no other linguistic consequences, except common properties of the group as a whole, which are captured by the common, systematic feature [ color ], while the different chromatic categories are represented by idiosyncratic features like [ red ], [ green ], [ black ], etc. It might be noted in this respect that color terms -
unlike those for size, value, speed, etc. - are adjectives that can also occur as nouns, as shown by the red of the car versus *the huge of the car, due to the component [ color ] as opposed to the idiosyncratic values [ red ], [ blue ], which still are by no means arbitrary, as Berlin & Kay (1969) and subsequent work have shown. Various other domains categorizing sensory distinctions have attracted different degrees of attention, with limited systematic insight. Intriguing problems arise, e.g., with respect to haptic qualities, concerning texture and plasticity of surfaces or even the perceptual classification of temperature. It might be added that besides perceptual domains, there are biologically determined motor patterns, as in grasp or the distinction between walk and run, which also provide the interpretation of basic elements of SF.

The second factor supporting idiosyncratic features is the obvious need to account for differences of meaning that have no systematic status in morpho-syntax or more general semantic patterns. Whether or not the distinctions between break, shatter, smash can be traced back to homogeneous sensory modalities need not be a primary concern, since in any case, besides basic modes of perception and motor control, more complex conditions like those involved in the processes denoted by cough, breathe, or even sleep, laugh, cry and smile are likely to require semantic features that are motivated by nothing else than the distinctions assigned to them in C-I. These distinctions might well be based on processes that are physiologically or physically complex - they still lead to self-contained features, as long as their interpretation is experientially homogeneous. One easily realizes that lexical knowledge abounds with elements like shatter, splinter, smash, and plenty of other sets of items that cannot get along without features of this idiosyncratic kind. They seem to be the paradigm cases demonstrating the need for Distinguishers or Completers. This is only half of the truth, though, because distinguishing elements like [ canine ], [ feline ] etc. are usually assumed to be equally typical for Completers as are [ green ] or [ smooth ] and the like. Characteristic elements identifying the peculiarities of a particular species or a specific artifact are, of course, plausibly considered as Completers or Distinguishers. But they do have an essentially different character, as their interpretation can be heterogeneous for principled reasons, combining information from separate mental modalities. Similar considerations apply to what Pustejovsky (1995) calls the Qualia structure of lexical items, which he supposes to characterize their particular distinctive properties, using the term Qualia, however, in a fairly different sense than the original notion of subjective (usually monomodal) percepts, e.g., in Goodman (1951). The integration of different types of information acknowledged in these considerations might be a matter of degree, which yields a fuzzy boundary with respect to the kind of elements to be considered next.

6.3. Dossiers

The notion of Dossiers, proposed in Bierwisch (2007), takes up a problem that is present but not made explicit in much of the literature that deals with the phenomena in question.
The point is illustrated by the entry (25) proposed by Jackendoff for dog, the semantically relevant parts of which are repeated here as (27):

(27) CS: [ANIMAL x & [[CARNIVORE x] & [POSSIBLE [PET x]]]]
     SR: [3-D model with motion affordances]
     Auditory: [sound of barking]
The elements in CS (Jackendoff's counterpart of SF) are fairly abstract conceptual conditions, and in fact plausible candidates for systematic features classifying animals. But instead of the distinguisher [ canine ] that would have to identify dogs, we have SR and Auditory information to account for their specificity. Now, SR would either require the domain of 3-D models to be subdivided into prototypes of dogs, cats, horses, spiders, etc. - comparable to the division of the color space by cardinal categories like red, green, yellow, etc. - clearly at variance with the theory of 3-D representations, or it would leave the information about shape and movement of dogs outside the range of the features of SF. Similar remarks apply to the auditory information. The essential point is, however, that 3-D models (of any kind of object, not just of dogs or animals) are Spatial Representations, which are subject to completely different principles of organization than SF (or CS, for that matter). SR-representations by their very nature preserve three-dimensional relations, allowing for spatial properties and inferences, as discussed in Johnson-Laird (1983) and elsewhere, conditions that SF is incapable of supporting. Moreover, besides shape, movement, and acoustic characteristics, the distinguishing information about dogs includes a variety of other aspects of behavior, capacities, possible functions, etc., some of which might require propositional specifications that can be represented in the format of SF. In other words, the replacement or interpretation of [ canine ] does not only concern SR and Auditory in (27), but involves a presumably extendible, heterogeneous cluster of conditions, i.e. a dossier of different conditions, which is nevertheless a unified element within SF, as the abbreviation [ canine ] in the SF of dog would correctly indicate.

The comments made on the SF of dog are easily extended to a large range of lexical items. Take, for the sake of illustration, nouns like ski or bike, which would be classified by systematic features like [ artifact ] and [ for motion ], while the more-or-less complex specificities are indicated by dossiers like [ ski ] or [ bicycle ], respectively. Again, what is involved are different domains of C-I, indicating, e.g., conditions on substance, bits and pieces, and technology, in addition to spatial models, and interpreting integrated elements, which, incidentally, enter into the operations of verbalization noted with respect to (21b), such that ski and bike become verbs denoting movement by means of the specified objects. This in turn shows that dossiers are by no means restricted to the SF of nouns.
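The double status of such elements can be summarized in a schematic sketch: within SF a dossier occurs only via its handle, on a par with other primes, while its interpretation bundles pointers into disparate modules of C-I. All structure shown here is an illustrative assumption, not a worked-out proposal:

from dataclasses import dataclass

@dataclass(frozen=True)
class Dossier:
    name: str              # the handle occurring inside SF, e.g. CANINE
    spatial: str           # pointer into Spatial Representation
    auditory: str          # pointer into auditory patterns
    propositional: tuple   # SF-format conditions, available for verbalization

CANINE = Dossier(
    name="CANINE",
    spatial="3-D model of dogs with motion affordances",
    auditory="sound of barking",
    propositional=(("POSSIBLE", ("PET", "x")),),
)

# Within SF, only the handle appears, combined like any other prime:
dog_sf = ("&", ("ANIMAL", "x"), (CANINE.name, "x"))
print(dog_sf)   # ('&', ('ANIMAL', 'x'), ('CANINE', 'x'))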
In other words, lexical entries are connected to C-I not only by means of the general, systematic primes, but also by means of idiosyncratic files calling up different modules. One might observe that entries like (25) reflect this situation by making the extra-linguistic (e.g., spatial and auditory) information part of the lexical items directly. That way one would avoid ambivalent primes of the sort called dossiers, but the Lexical System would become a hybrid combination of linguistic and other knowledge - which perhaps it is. On that view, dossiers would behave as unified basic elements, a point that will be taken up below. Under the opposite view, dossiers might be construed to a large extent as
the meaning of lexical entries without decomposition. And that is in fact what dossiers are - precisely to the extent to which primes resist language-internal decomposition.

Second, for this very reason dossiers are crucial joints for the role of SF as the interface with the different modules of C-I. To the extent to which they include, e.g., 3-D models that allow for spatial characteristics and spatial similarity judgments, dossiers situate the elements they specify in SF with respect to spatial possibilities, whose actual conditions, however, are given by the system of Spatial Representation. Similar considerations apply to auditory or interpersonal information. In general then, insofar as dossiers address different mental subsystems, they directly mediate between language and the different modes of experience.

Third, as already noted, dossiers may integrate propositional information, in particular if they are enriched through individual experience, such that, e.g., the file [ frog ] includes conditions of procreation and development or usual habitats besides the characteristic visual and auditory information. Much of this information might be of propositional character, sharing in principle the format of SF. With respect to these components, dossiers are transparent in the sense that on demand their content is available for verbalization. In a way, dossiers might thus be means of sluicing information into the proper lexical representations.

It is worth noting in this respect that Fodor (1981) supports the claim that lexical meanings are basic, indefinable elements by explaining them as a kind of Janus-headed entity, which is logically primitive but may nevertheless depend on and integrate other (basic or complex) elements. The decisive point is that the dependence in question must not be construed as a logical combination of the integrated elements, since this would lead to analyzable complex items, but as something which Fodor calls mental chemistry in the sense of John Stuart Mill (1967), who suggests that by way of mental chemistry simple ideas generate, rather than compose, the complex ones. In other words, the way in which the prerequisites of such a double-faced element are involved is not the kind of combination on which SF is based, but a type of integration by which the mental architecture supports hybrid elements that are primitive in one respect while still recruiting resources that are complex in other respects. This is exactly what Fodor's lexical meanings share with dossiers or completers. Mill's and Fodor's mental chemistry, the obviously necessary amalgamation of different mental resources and modalities, is a permanent challenge for the cognitive sciences dealing with perception and conceptual organization. The notion of frames introduced, e.g., in Fillmore (1985) and systematically developed in Barsalou (1992, 1999) is one such proposal to relate the meaning of linguistic expressions to the different dimensions of their interpretation, where frames are assumed to integrate and organize the different dimensions of experience; cf. also article 29 (Gawron) Frame Semantics.

In conclusion, in view of the controversial properties of (in)decomposable complex semantic elements it might be unavoidable to recognize ambivalent elements, which are at the same time primitive features of the internal architecture of L and heterogeneous elements at the border of SF.
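The division of labor between systematic SF features and a heterogeneous dossier can be made concrete with a toy data structure. The following Python sketch is purely expository - the class and field names are invented for illustration and represent neither Bierwisch's nor Jackendoff's formalism; it merely shows how an entry like (25)/(27) separates combinatorially visible primes from module-addressed dossier content:

from dataclasses import dataclass, field

@dataclass
class Dossier:
    # Heterogeneous, module-addressed conditions; opaque to SF combinatorics.
    spatial_model: str | None = None      # handle into Spatial Representation
    auditory_pattern: str | None = None   # handle into the auditory module
    propositional: list[str] = field(default_factory=list)  # verbalizable parts

@dataclass
class LexicalEntry:
    form: str
    sf_features: list[str]  # systematic primes, visible to the SF type structure
    dossier: Dossier        # a unified basic element at the border of SF

dog = LexicalEntry(
    form="dog",
    sf_features=["animal", "carnivore", "possible-pet"],
    dossier=Dossier(spatial_model="3-D model with motion affordances",
                    auditory_pattern="sound of barking",
                    propositional=["conditions on behavior, functions, ..."]),
)

On this toy rendering, only sf_features enter the combinatorial structure of SF, while the dossier remains a single unanalyzed pointer whose content is interpreted by other mental modules - the situation the preceding paragraphs describe.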
6.4. Combinatorial effects

The phenomena just noted are related to a number of widely discussed problems. One of them concerns the variation on the basis of the so-called literal meaning, in particular the
combinatorial effect that SF-features may exert on their interpretation. (28) is a familiar exemplification of one type of interaction:

(28) a. Tom opened the door.
     b. Sally opened her eyes.
     c. The carpenters opened the wall.
     d. Sam opened his book to page 37.
     e. The surgeon opened the wound.
     f. The chairman opened the meeting.
     g. Bill opened a restaurant.

Searle (1983) remarks about these examples that open has the same literal meaning in (28a-e), adumbrating a somewhat different meaning for (28f) and (28g), as incidentally suggested by the German glosses öffnen for (28a-e) but eröffnen for (28f/g), although he takes it as obvious that the verb is understood differently in all cases. Insofar as this is correct, the differences are connected to the objects the verb combines with, inducing different acts and processes of opening. If the SF of open given in (20a), repeated as (29), represents the invariant meaning of the occurrences in question, then the differences noted by Searle must arise from the feature-interpretation in C-I:

(29) /open/ λy λx λs [ s inst [ [ act x ] [ cause [ become [ open y ] ] ] ] ]

To sort out the relevant points, we first notice that the different types of transition towards the resulting state of y and their causation by the act of x are intimately related to the nature of the resulting state and the sort of activity that brings it about. In other words, the time course covered by [ become ] and the act causing the change are different if the eventual state is that of the door, the eye, a book, a bottle, or a meeting. Similarly, the character of s instantiating the transition is determined by the nature of x and y and their involvement. Hence [ s inst ... ] and [ cause [ become ... ] ] do not contribute to the variants of interpretation independently of [ act x ] and [ open y ]. As noted earlier, [ become p ] moreover involves a presupposition: it requires the preceding state to be [ not p ], which in turn depends on the interpretation of p, i.e. [ open y ] in the present case. Hence the actual choice in making sense of open depends - not surprisingly - on the values of x and y, which are provided by the subject and object of the verb. Now, the subject in all cases at hand has the feature [ human x ], allowing for intentional action and thus imposing no relevant differences on the interpretation of x. Variation in the interpretation of [ act x ] is therefore not due to the value of x, but to the interpretation of [ open y ], which in turn depends on y and the property abbreviated by open. This property is less clear-cut than it appears, however. What it requires is that y does not preclude access, where y is either the container whose interior is at issue, as in he opened the bottle, or its boundary or possible barrier, as in he opened the door, while in cases like she opened her eyes or he opened the hand the alternative seems altogether inappropriate. Bowerman (2005) shows that differences of this sort may lead to different lexical items in different languages according to conditions on the specification of y. How these and further variations are to be reflected in SF must be left open here. As a first approximation, [ open y ] could be replaced by something like [ y allow-for [ access-to [ interior ] ] ], leaving undecided how the interior relates to y, i.e. to the container or its boundary. In any case, the interpretation of
[ open y ] depends on how the object of open, delivering the value of y, is to be interpreted in C-I in all relevant respects. This in turn modulates the interpretation of [ act x ], fostering the specific activity by which the resulting state can be brought about. Without going through further details involved in cases similar to (28), it should be clear that the SF-features are not interpreted in isolation, but only with regard to connected scenarios C-I must provide on the basis of experience from different modules. Searle (1983) calls these conditions the "Background" of meaning, without which literal meaning could not be understood. The Background cannot itself be part of the meaning, i.e. of the semantic structure, without leading to an infinite regress, as the background elements would in turn bring in their background. It can be verbalized, however, to the extent to which it is accessible to propositional representation. Roughly the same distinction is made in Bierwisch & Lang (1989) and Bierwisch (1996) between Semantic Form and Conceptual Structure; cf. also article 31 (Lang & Maienborn) Two-level Semantics.

A different conclusion with respect to the same phenomena is drawn by Jackendoff (1996, 2002, and related work, cf. article 30 (Jackendoff) Conceptual Semantics), who proposes Conceptual Structure, i.e. the representation of literal meaning, to include spatial as well as any other sensory and motor representation. Carefully comparing various versions to relate semantic, spatial, and other conceptual information, Jackendoff (2002) ends up with the assumption that no principled separation of linguistic meaning from other aspects of meaning is warranted. But besides the fact that there are different types of reasoning based, e.g., on spatial and propositional representations, as Jackendoff is well aware, the problem remains whether and how to capture the different interpretations related to the same lexical meaning, as illustrated by cases like (28) and plenty of others, without corresponding differences in Conceptual Structure. In any case, if one does not assume the verb open to be indefinitely ambiguous, with equally many different representations in SF (or CS), it is indispensable to have some way to account for the unified literal meaning as opposed to the multitude of its interpretations. This problem has many facets and consequences, one of which is exemplified by the following contrast:

(30) a. Mary left the institute two hours ago.
     b. Mary left the institute two years ago.

Like open in (28), leave has the same literal meaning in (30a) and (30b), which can roughly be indicated as follows, assuming that [ x at y ] provisionally represents the condition that x is in some way connected to y.

(31) /leave/ λy λx λs [ s inst [ [ act x ] [ cause [ become [ not [ x at y ] ] ] ] ] ]

Under the preferred interpretation, (30a) denotes a change of place, while (30b) denotes a change of affiliation. The alternatives rely on the interpretation of institute as a building or as a social institution. Pustejovsky (1995) calls items of this sort dot-objects. A dot-object connects ontologically heterogeneous conditions by means of a particular type of combination which makes them compatible with alternative qualifications. The entry for institute has the feature [ building x ] in "dotted" combination (hence the name) with [ institution x ], which are picked up alternatively under appropriate combinatorial conditions, as shown in (32).
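How a dot-object offers alternative facets to its combinatorial environment can be sketched in a few lines of Python. The sketch below is a deliberately crude illustration under invented names (select_reading, the sort labels, and the facet dictionary are all hypothetical), not Pustejovsky's formal apparatus:

# Toy dot-object: institute carries two sortally incompatible feature bundles,
# and the combinatorial context selects the compatible one.
DOT_INSTITUTE = {"building": {"physical"}, "institution": {"social"}}

def select_reading(dot_object: dict, required_sort: str) -> str:
    """Return the facet of the dot-object compatible with the required sort."""
    for facet, sorts in dot_object.items():
        if required_sort in sorts:
            return facet
    raise ValueError(f"no facet matches sort {required_sort!r}")

# "has six floors" demands a physical argument; "has three directors" a social one:
print(select_reading(DOT_INSTITUTE, "physical"))  # building    (cf. (32a))
print(select_reading(DOT_INSTITUTE, "social"))    # institution (cf. (32b))

The point of the "dotted" combination is precisely that both facets coexist in a single entry, so neither reading requires a separate lexical item.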
Whether [ building ] and [ institution ] are actually primes
or rather configurations of SF built on more basic elements such as [ physical ] and [ social ] etc. can be left open. The point is that they have complementary conditions with incompatible sortal properties.

(32) a. The institute has six floors and an elevator.
     b. The institute has three directors and twenty permanent employees.

The cascade of dependencies in (30) turns on the different values for y in (31) - the "dotted" institute - which induces a different interpretation of the relation at. This in turn implies a different type of change, caused by different sorts of activity, involving crucially different aspects of the actor Mary, who must either cause a change of her location or of her social relations, participating as a physical or a social, i.e. intentional, agent. More generally, the feature [ human x ] must be construed as a "dotted" combination tantamount to the very basis of the mind-body problem. Now, the choice between the physical and the social interpretation is triggered by the time adverbials two hours ago vs. two years ago, which set the limits for the event e, which in turn affects the interpretation of the time course covered by [ change ... ]. The intriguing point is that the contrasting elements hour vs. year have a fixed temporal interpretation with no dotted properties. Hence they must trigger the relevant cascades via general, extra-linguistic background knowledge.

An intriguing consequence of these observations is that there are clear violations of the principle of compositionality, according to which the interpretation of a complex expression derives from the interpretation of its constituent parts. Simple and straightforward cases like (31) illustrate the point: the interpretation of Mary, institute and leave - or rather of the components [ person ], [ act ], [ building ] etc. they contain - depends on background- or context-information not part of SF at all, and definitely outside the SF of the respective constituents. Hence even if SF is not supposed to be systematically separate from conceptual structure and background information, its compositional aspect, following the logic of its type-structure, must be distinguished from differently organized aspects of knowledge.

7. Primes and universals

As noted at the beginning, there are strong and plausible tendencies to consider the primitive elements that make up linguistic expressions as substantive universals, provided by the language faculty, the formal organization of which is characterized by Universal Grammar. In other words, UG is assumed to contain a universal repertoire of basic possibilities, which are activated or triggered by individual experience. Thus individual, ontogenetic processes select the actual distinctions from the general repertoire, which is part of the language capacity as such. This means that the distinctions indicated by features like [ tense ] or [ round ] may or may not appear in the system of PF-features in a given system L, but if the features appear, they just realize options UG provides as one of the prerequisites for the acquisition and use of language. If these considerations are extended to features of SF, two aspects must be distinguished, which at PF are often considered as essentially one phenomenon, namely features and their interpretation. For articulation this identification is a plausible abbreviation, but it does not hold on the conceptual side.
For SF, the mental systems that provide the interpretation of linguistic expressions must be conceived as having a rich and
to a large extent biologically determined structure of their own, independent of (or in parallel to) the language capacity. Spatial structure is the most obvious, but by no means the only case in point. General conditions of experience such as three-dimensionality of spatial orientation, identification of verticality, dimensions and distinctions of color perception, and many other domains and distinctions may directly correspond to possible features in SF, very much like parameters of articulation correspond to possible features of PF. Candidates in this respect are systematic features like [ vertical ] or [ before ] and their interpretation in C-I, as discussed earlier. Similarly, idiosyncratic features which correspond to biologically determined conditions of perception, motor control, or emotion might be candidates of this sort. To what extent observations about the role of focal colors or the body schema and natural patterns of motor activity like walking, grasping, chewing, swallowing, etc. can be considered as options recruited as features of SF, similar to articulatory conditions for features of PF, must be left open. Notice, however, that we are talking about features of SF predetermined by UG, not merely about perceptual or motor correlates in C-I.

In any case, because of the number and diversity of phenomena to be taken into account, there is a problem of principle, which precludes extending to SF in general the notion of universal options predetermined to be triggered by experience. The problem comes from the wide variety of basic elements that must be taken to be available in principle, but cannot be conceived without distinct experience. It primarily concerns dossiers, but also a fair range of idiosyncratic features, and it arises independently of the question whether one believes in decomposition or takes, like Fodor (1981), all concepts or possible word meanings to be innate. It is most easily demonstrated with regard to 3-D models (or visual prototypes) that must belong to many dossiers. We easily identify cats, dogs, birds, chairs, trumpets, trees, flowers, etc. on the basis of limited experience, but it does not make sense to stipulate that the characteristic prototypes are all biologically fixed, ready to be triggered like, for instance, the prototype of the human face, which is known to unfold along a fixed maturational path. The same holds for the wide variety of auditory patterns (beyond those recruited for features of PF), by which we distinguish frogs, dogs, nightingales, or flutes and trombones, cars, bikes, and trains. Without adding further aspects and details, it should be obvious that whatever enters the interpretation of these kinds of basic elements, it cannot belong to conditions that are given prior to experience and need only be activated. At the same time, the range of biologically predisposed capacities, by means of which features can be interpreted in C-I and distinguished in SF, is by no means arbitrary, chaotic, and unstructured, but clearly determined by principles organizing experience.

Taken together, these considerations suggest a modified perspective on the nature of primitive elements. Instead of taking features to be generally part of UG, triggered by experience, one might look at UG as affording principles, patterns, or guidelines along which features or primes are constructed or extracted, if experience provides the relevant information.
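The difference between a stored repertoire and generating principles can be given a minimal computational illustration, anticipating the arithmetic analogy drawn in the following paragraphs. The Python sketch below is only a loose analogy of this point, not a model of UG: nothing prime-like is listed in advance, yet any prime can be identified on demand from two general principles.

# Toy analogy: primes are never stored as a list; they are identified on demand
# from the successor operation and multiplication (divisibility testing).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:    # candidate divisors, built up by the successor operation
        if n % d == 0:   # divisibility, grounded in multiplication
            return False
        d += 1
    return True

# Any portion of the unlimited repertoire is available without prior listing:
print([n for n in range(2, 30) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]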
The envisaged distinction between primes and principles is not merely a matter of terminology. It corresponds, metaphorically speaking, to the difference between locations fixed on a map and the principles from which a map indicating the locations would emerge on the basis of exploration. More formally, the difference corresponds to that between the actual set of natural numbers and its construction from the initial element by means of the successor operation. Less metaphorically, the substantial content of the distinction can be explained by analogy with face recognition. Human beings can
normally distinguish and recognize a large number of faces on the basis of limited and often short exposure. The resulting knowledge is due to a particular, presumably innate disposition, but it cannot be assumed to result from triggering individual faces that were innately known, just waiting for their eventual activation. In other words, the acquisition of faces depends on incidental, personal information, processed by the biologically determined capacity to identify and recognize faces, a capacity that ontogenetically emerges along a biologically determined schema and a fixed developmental path.

With these considerations in mind, UG need not be construed as containing a fixed and finite system of semantic features, but as providing conditions and principles according to which distinctions in C-I can make up elements of SF. The origin and nature of these conditions are twofold, corresponding to the interface character of the emerging elements. On the one hand, the conditions and principles must plausibly be assumed to rely on characteristic structures of the various interpretive domains which are independently given by the modules of C-I. Fundamentals of spatial and temporal orientation, causal explanation, principles of good form discovered in Gestalt psychology, focal colors, principles of pertinence and possession, or the body schema and its functional determinants are likely candidates. On the other hand, conditions on primes are given by the organization of linguistic expressions, i.e. the format of representations and the principles by which they are built up and mapped onto each other. In this respect, even effects of morpho-syntax come into play through constraints on government, agreement, or selection restrictions, which depend on features at least partially interpreted in C-I, as discussed in § 4. In general, then, a more or less complex configuration of conditions originating from principles and distinctions within C-I serves as a feature of SF just in case this element plays a proper systematic role within the combinatorial structure of linguistic expressions.

Under this perspective, semantic primes would emerge from the interaction of principles of strictly linguistic structure with those of sensory-motor and various other domains of mental organization - just as one should expect from their role as components of an interface level. The effect of this interaction, although resulting from universal principles of linguistic structure and general cognition, is determined (to different degrees) by particular languages and their lexical inventory, as convincingly demonstrated by Bowerman (2000), Slobin et al. (2008) and much related work. This applies already to fairly elementary completers like those indicated by [ interior ] in (20d), or [ at ] in (31). Thus Bowerman (2005) shows that conditions like containment, (vertical) support, tight fit, or flexibility can make up configurations matched differently by (presumably basic) elements in different languages, marking conditions that children are aware of at the age of 2. (33) is a rather incomplete illustration of distinctions that figure in the resulting situation of actions placing x in/on y.

(33)  x      ->  y            English  Dutch  German  Korean
      block  ->  pan          in       in     in      nehta
      book   ->  fitted case  in       in     in      kkita
      ring   ->  finger       on       om     an      kkita
      Lego   ->  Lego stack   on       op     auf     kkita
      cup    ->  table        on       op     auf     nohta
      hat    ->  head         on       op     auf     ssuta
      towel  ->  hook         on       aan    an      kelta
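The cross-cutting classifications in (33) can be recoded as a small lookup table. The Python sketch below merely restates the data (the encoding and names are illustrative, not a claim about mental representation), but it makes visible how, e.g., English collapses all non-containment cases into on, while Korean kkita cross-cuts the in/on split by tight fit:

from collections import defaultdict

# Data as in (33): the same placement scene is marked differently per language.
PLACEMENT = {  # (figure, ground): (English, Dutch, German, Korean)
    ("block", "pan"):         ("in", "in",  "in",  "nehta"),
    ("book",  "fitted case"): ("in", "in",  "in",  "kkita"),
    ("ring",  "finger"):      ("on", "om",  "an",  "kkita"),
    ("Lego",  "Lego stack"):  ("on", "op",  "auf", "kkita"),
    ("cup",   "table"):       ("on", "op",  "auf", "nohta"),
    ("hat",   "head"):        ("on", "op",  "auf", "ssuta"),
    ("towel", "hook"):        ("on", "aan", "an",  "kelta"),
}

def classes(language_index: int) -> dict:
    """Group the placement scenes by the marker a given language assigns to them."""
    grouped = defaultdict(list)
    for scene, markers in PLACEMENT.items():
        grouped[markers[language_index]].append(scene)
    return dict(grouped)

print(classes(0))  # English: two classes (in / on)
print(classes(3))  # Korean: five verb classes, with kkita grouping the tight-fit scenes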
In a similar vein, joints of distinctions and generalizations are involved in elements like [ open ] in (20c) and (29), or [ animal ] and [ pet ] in (25), but also [ artifact ], [ container ], and [ bottle ] in (12) and many others. Further complexities along the same lines must be involved in completers or dossiers specifying concepts like car, desk, bike, dog, spider, eagle, murky, clever, computer, equation, exaggerate, occupy and all the rest. In this sense, the majority of lexical items is likely to consist of elements that comprise a basic categorization in terms of systematic features and a dossier indicating its specificity, similarly to proper names, which combine - as argued in Bierwisch (2007) - a categorization of their referent with an individuating dossier. Remember that these specifying conditions are based on cross-modal principles in terms of which experience is organized, integrated by what is metaphorically called mental chemistry, such that dossiers are not in general logically combined necessary and sufficient criteria of classification.

The upshot of these considerations is that primes of SF are clearly determined by UG, as they emerge from principles of the faculty of language and are triggered by conditions of mental organization in general. They are not, however, necessarily elements fixed in UG prior to experience. They are not learned in the sense of inductive learning, but induced or triggered by structures of experience, which linguistic expressions interface with. This may lead, among other things, to cross-linguistic differences within the actual repertoire of primes, as suggested in (33). It does not exclude, however, that certain core elements like [ cause ], [ become ], [ not ], [ loc ] and quite a few others are explicitly fixed and pre-established in UG, providing the initial foundation to interface linguistic expressions with experience.

This view leads to a less paradoxical reading of Fodor's (1981) claim that lexical concepts must all be innate, including not only nose and triangle, but also elephant, electron or grandmother, because they cannot logically be decomposed into basic sensory features. Since Fodor concedes a (very restricted) amount of lexical decomposition, his claim concerns essentially the status of dossiers. Now, these elements can well be considered as innate if at least the principles by which they originate are innate - either via UG or by conditions of mental organization at large. Under this construal, the specific dossier of, e.g., horse or electron can be genetically determined without being actually represented prior to triggering experience, just as, e.g., the knowledge of all prime numbers could be considered as innate if the successor operation and multiplication are innate, identifying any prime number on demand, without an infinite set of them being actually represented. As to the general principles of possible concepts, which under this view are the innate aspect of possible entities (analogous to prime numbers fixed by their indivisibility), Fodor assumes a hierarchy of triggering configurations, with a privileged range of basic-level concepts in the sense of Rosch (1977) as the elements most directly triggered. Their particular status is due to two general conditions, ostensibility and accessibility, both presupposing conceptual configurations to depend on each other along the triggering hierarchy (or perhaps network).
Ostensibility requires a configuration to be triggered by way of ostension or a direct presentation of exemplars, given the elements triggered so far. Accessibility relates to dependence on prior (ontogenetic or intellectual) acquisition of other configurations. Thus the basic-level concept tree is prior to the superordinate plant, but also to the subordinates oak or elm, etc. Fodor is well aware that ostension and accessibility are plausible, descriptive constraints on the underlying principles, the actual specification of which is still a research program rather than a simple result. What needs to be clarified are at least three problems: first, the functional
integration of conditions from different domains (the mental chemistry) within complex but unified configurations of CS; second, the abstraction or filtering by which these configurations are matched with proper, basic elements of SF; and third, the dependency of clusters or configurations in CS along the triggering hierarchy, which cannot generally be reduced to logical entailment. Some of these dependencies are characteristically reflected within SF by means of ordinary semantic elements and relations, while much of cross-linguistic variation, metaphor and other aspects of reasoning are a matter of essentially extralinguistic principles.

It might finally be added that for quite different reasons a similar orientation seems to be indicated even with respect to features of PF. As pointed out in Bierwisch (2001), if the discoveries about sign language are taken seriously, the language capacity cannot be restricted to articulation by means of the vocal tract. As has been shown by Klima & Bellugi (1979) and subsequent work, sign languages are organized by the same general principles and with the same expressive power as spoken languages. Now, if the faculty of language includes the option of sign languages as a possibility activated under particular conditions, the principles of PF could still be generally valid, using time slots and articulatory properties, but the choice and interpretation of features can no longer be based on elements like [ nasal ], [ voiced ], [ palatal ] etc. Hence even for PF, what UG is likely to provide are principles that create appropriate structural elements from the available input information, rather than a fixed repertoire of articulatory features to be triggered. Logically, of course, UG can be assumed to contain a full set of features for spoken and signed languages, of which normally only those of PF are triggered, but this is not a very plausible scenario.

To sum up, semantic primitives are based on principles of Universal Grammar, even if they do not make up a finite list of fixed elements. UG is assumed to provide the principles that determine both the mapping of an unlimited set of structured signals to an equally infinite array of meanings and the format of representations this mapping requires. These principles specify in particular the formal conditions for possible elements with interpretation provided by different aspects of experience, such as spatial orientation, motor control, social interaction, etc. Although this view does not exclude UG from supporting fixed designated features for certain aspects of linguistic structure, it allows for the repertoire of primitive semantic elements to be extended on demand.

8. References

Ajdukiewicz, Kazimierz 1935. Über die syntaktische Konnexität. Studia Philosophica 1, 1-27.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 1-16.
Barsalou, Lawrence W. 1992. Frames, concepts, and conceptual fields. In: A. Lehrer & E. Kittay (eds.). Frames, Fields, and Contrasts. Hillsdale, NJ: Erlbaum, 21-74.
Barsalou, Lawrence W. 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22, 577-660.
Berlin, Brent & Paul Kay 1969. Basic Color Terms. Berkeley, CA: University of California Press.
Bierwisch, Manfred 1967. Syntactic features in morphology: General problems of so-called pronominal inflection in German. In: To Honor Roman Jakobson, vol. I. The Hague: Mouton, 239-270.
Bierwisch, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds.).
Language and Space. Cambridge, MA: The MIT Press, 31-76.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227-266.
Bierwisch, Manfred 2001. Repertoires of primitive elements - prerequisite or result of acquisition? In: J. Weissenborn & B. Höhle (eds.). Approaches to Bootstrapping, vol. II. Amsterdam: Benjamins, 281-307.
Bierwisch, Manfred 2007. Semantic form as interface. In: A. Späth (ed.). Interfaces and Interface Conditions. Berlin: de Gruyter, 1-32.
Bierwisch, Manfred 2010. become and its presuppositions. In: R. Bäuerle, U. Reyle & T. E. Zimmermann (eds.). Presupposition and Discourse. Bingley: Emerald Group Publishing, 189-234.
Bierwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Berlin: Springer.
Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items. Cognition 42, 23-60.
Bloom, Paul, Mary A. Peterson, Lynn Nadel & Merrill F. Garrett (eds.) 1996. Language and Space. Cambridge, MA: The MIT Press.
Bowerman, Melissa 2000. Where do children's meanings come from? Rethinking the role of cognition in early semantic development. In: P. Nucci, G. Saxe & E. Turiel (eds.). Culture, Thought, and Development. Mahwah, NJ: Erlbaum, 199-230.
Bowerman, Melissa 2005. Why can't you 'open' a nut or 'break' a cooked noodle? Learning covert object categories in action word meanings. In: L. Gershkoff-Stowe & D. H. Rakison (eds.). Building Object Categories in Developmental Time. Mahwah, NJ: Erlbaum, 209-243.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1970. Remarks on nominalization. In: R. A. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. The Hague: Mouton, 11-61.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1985. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, Noam 2000. Minimalist inquiries. In: R. Martin, D. Michaels & J. Uriagereka (eds.). Step by Step. Cambridge, MA: The MIT Press, 89-155.
Chomsky, Noam & Morris Halle 1968. The Sound Pattern of English. New York: Harper & Row.
Clements, George N. 1985. The geometry of phonological features. Phonology Yearbook 2, 225-252.
Cresswell, Max 1973. Logics and Languages. London: Methuen.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6, 222-255.
Fodor, Jerry A. 1981. Representations. Cambridge, MA: The MIT Press.
Fodor, Jerry A. 1983. The Modularity of Mind. Cambridge, MA: The MIT Press.
Fodor, Jerry A., Merrill F. Garrett, Edward Walker & Cornelia Parkes 1980. Against definitions. Cognition 8, 263-367.
Goodman, Nelson 1951. The Structure of Appearance. Cambridge, MA: Harvard University Press.
Hale, Kenneth & Samuel J. Keyser 1993. Argument structure and the lexical expression of syntactic relations. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Cambridge, MA: The MIT Press, 53-109.
Halle, Morris 1983. On distinctive features and their articulatory implementation. Natural Language and Linguistic Theory 1, 91-105.
Halle, Morris & Alec Marantz 1993. Distributed morphology and the pieces of inflection. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Cambridge, MA: The MIT Press, 111-176.
Halle, Morris & Jean-Roger Vergnaud 1987. An Essay on Stress. Cambridge, MA: The MIT Press.
Hjelmslev, Louis 1935. La Catégorie des Cas I. Aarhus: Universitetsforlaget.
Hjelmslev, Louis 1938. Essai d'une théorie des morphèmes. In: Actes du IVe Congrès international de linguistes 1936. Kopenhagen, 140-151. Reprinted in: L. Hjelmslev. Essais linguistiques. Copenhague: Nordisk Sprog- og Kulturforlag, 1959, 152-164.
Hjelmslev, Louis 1953. Prolegomena to a Theory of Language. Madison, WI: The University of Wisconsin Press.
Jackendoff, Ray 1984. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1996. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 1-30.
Jackendoff, Ray 2002. Foundations of Language. Oxford: Oxford University Press.
Jakobson, Roman 1936. Beitrag zur allgemeinen Kasuslehre. Travaux du Cercle Linguistique de Prague 6, 240-288.
Johnson-Laird, Philip N. 1983. Mental Models: Toward a Cognitive Science of Language, Inference and Consciousness. Cambridge, MA: Harvard University Press.
Kamp, Hans 2001. The importance of presupposition. In: C. Rohrer, A. Roßdeutscher & H. Kamp (eds.). Linguistic Form and its Computation. Stanford, CA: CSLI Publications, 207-254.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jerrold J. 1972. Semantic Theory. New York: Harper & Row.
Katz, Jerrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39, 170-210.
Klima, Edward S. & Ursula Bellugi 1979. The Signs of Language. Cambridge, MA: Harvard University Press.
Laurence, Stephen & Eric Margolis 1999. Concepts and cognitive science. In: E. Margolis & S. Laurence (eds.). Concepts: Core Readings. Cambridge, MA: The MIT Press, 3-81.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 169-218.
Marr, David 1982. Vision. San Francisco, CA: Freeman.
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Regional Meeting of the Chicago Linguistic Society (= CLS) 4. Chicago, IL: Chicago Linguistic Society, 71-80.
Mill, John S. 1967. System of Logic. Collected Works, vol. III. Toronto, ON: Toronto University Press.
Miller, George & Philip Johnson-Laird 1976. Language and Perception. Cambridge, MA: Harvard University Press.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richmond H. Thomason. New Haven, CT: Yale University Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Rosch, Eleanor 1977. Classification of real-world objects: Origins and representations in cognition. In: P. N. Johnson-Laird & P. C. Wason (eds.). Thinking: Readings in Cognitive Science. Cambridge: Cambridge University Press, 212-222.
Rosch, Eleanor & Carolyn Mervis 1975. Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7, 573-605.
Searle, John R. 1983. Intentionality. Cambridge: Cambridge University Press.
Slobin, Dan I., Melissa Bowerman, Penelope Brown, Sonja Eisenbeiß & Bhuvana Narasimhan 2008. Putting things in places: Developmental consequences of linguistic typology. In: J. Bohnemeyer & E. Pederson (eds.). Event Representation. Cambridge: Cambridge University Press, 1-28.
Svenonius, Peter 2007. Interpreting uninterpretable features. Linguistic Analysis 33, 375-413.
Wierzbicka, Anna 1996. Semantics: Primes and Universals. Oxford: Oxford University Press.
Wunderlich, Dieter 1996a. Lexical categories. Theoretical Linguistics 22, 1-48.
Wunderlich, Dieter 1996b. Dem Freund die Hand auf die Schulter legen. In: G. Harras & M. Bierwisch (eds.). Wenn die Semantik arbeitet. Tübingen: Niemeyer, 331-360.
Wunderlich, Dieter 1997a. A minimalist model of inflectional morphology. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 267-298.
Wunderlich, Dieter 1997b. Cause and the structure of verbs. Linguistic Inquiry 28, 27-68.

Manfred Bierwisch, Berlin (Germany)
17. Frameworks of lexical decomposition of verbs

1. Introduction
2. Generative Semantics
3. Lexical decomposition in Montague Semantics
4. Conceptual Semantics
5. LCS decompositions and the MIT Lexicon Project
6. Event Structure Theory
7. Two-level Semantics and Lexical Decomposition Grammar
8. Natural Semantic Metalanguage
9. Lexical Relational Structures
10. Distributed Morphology
11. Outlook
12. References

Abstract

Starting from early approaches within Generative Grammar in the late 1960s, the article describes and discusses the development of different theoretical frameworks of lexical decomposition of verbs. It presents the major subsequent conceptions of lexical decompositions, namely, Dowty's approach to lexical decomposition within Montague Semantics, Jackendoff's Conceptual Semantics, the LCS decompositions emerging from the MIT Lexicon Project, Pustejovsky's Event Structure Theory, Wierzbicka's Natural Semantic Metalanguage, Wunderlich's Lexical Decomposition Grammar, Hale and Keyser's Lexical Relational Structures, and Distributed Morphology. For each of these approaches, (i) it sketches their origins and motivation, (ii) it describes the general structure of decompositions and their location within the theory, (iii) it explores their explanative value for major phenomena of verb semantics and syntax, and (iv) it briefly evaluates the impact of the theory. Referring to discussions in article 7 (Engelberg) Lexical decomposition, a number of theoretical topics are taken up throughout the paper concerning the interpretation of decompositions, the basic inventory of decompositional predicates, the location of decompositions on the different levels of linguistic representation (syntactic, semantic, conceptual), and the role they play for the interfaces between these levels.

1. Introduction

The idea that word meanings are complex has been present ever since people have tried to explain and define the meaning of words. When asked the meaning of the verb persuade, a competent speaker of the language would probably say something like (1a). This is not far from what semanticists put into a structured lexical decomposition as in (1b):

(1) a. You persuade somebody if you make somebody believe or do something.
    b. persuade(x, y, z): x cause (y believe z) (after Fillmore 1968a: 377)

However, it was not until the mid-1960s that intuitions about the complexity of verb meanings led to formal theories of their lexical decomposition. This article will review the history of lexical decomposition of verbs from that time on. For some general discussion of
the concept of decomposition and earlier decompositional approaches, cf. article 7 (Engelberg) Lexical decomposition.

The first theoretical framework to systematically develop decompositional representations of verb meanings was Generative Semantics (section 2), where decompositions were representations of syntactic deep structure. Later theories did not locate lexical decompositions on a syntactic level but employed them as representations on a lexical-semantic level, as in Dowty's Montague-based approach (section 3) and the decompositional approaches emerging from the MIT Lexicon Project (section 5), or on a conceptual level, as in Jackendoff's Conceptual Semantics (section 4). Other lexical approaches were characterized by the integration of decompositions into an Event Structure Theory (section 6), the conception of a comprehensive Natural Semantic Metalanguage (section 8), and the development of a systematic structure-based linking mechanism as in Lexical Decomposition Grammar (section 7). Parallel to these developments, new syntactic approaches to decompositions emerged, such as Hale and Keyser's Lexical Relational Structures (section 9) and Distributed Morphology (section 10).

Throughout the paper a number of theoretical topics will be touched upon that are discussed in more detail in article 7 (Engelberg) Lexical decomposition. Of particular interest will be the questions on which level of linguistic representation (syntactic, semantic, conceptual) decompositions are located, how the interfaces to other levels of linguistic representation are designed, what evidence for the complexity of word meaning is assumed, how decompositions are semantically interpreted, what role the formal structure of decompositions plays in explanations of linguistic phenomena, and what the basic inventory of decompositional predicates is.

In the following sections, the major theoretical approaches involving lexical decompositions will be presented. Each approach will be described in four subsections that (i) sketch the historical development that led to the theory under discussion, (ii) describe the place lexical decompositions take in these theories and their structural characteristics, (iii) present the phenomena that are explained on the basis of decompositions, and (iv) give a short evaluation of the impact of the theory.

2. Generative Semantics

2.1. Origins and motivation

Generative Semantics was a school of syntactic and semantic research that opposed certain established views within the community of Generative Grammar. It was active from the mid-1960s through the mid-1970s. Its major proponents were George Lakoff, James D. McCawley, Paul M. Postal, and John Robert Ross. (For the history of Generative Semantics, cf. Binnick 1972; McCawley 1994.) At that time, the majority view held within Generative Grammar was that there is a single, basic structural level on which generative rules operate and to which all other structural levels are related by interpretive rules. This particular structural level was syntactic deep structure, from which semantic interpretations were derived. It was this view of 'interpretive semantics' that was not shared by the proponents of Generative Semantics. Although there never was a "standard theory" of Generative Semantics, a number of assumptions can be identified that were widespread in the GS-community (cf. Lakoff 1970; McCawley 1968; Binnick 1972): (i) Deep structures are more abstract than Chomsky (1965) assumed.
In particular, they are semantic representations of sentences. (ii) Syntactic and semantic representations have the same formal status: They are
structured trees. (iii) There is one system of rules that relates semantic representations and surface structures via intermediary representations. Some more specific assumptions that are important when it comes to lexical decomposition were the following: (iv) In semantic deep structure, lexical items occur as decompositions where the semantic elements of the decomposition are distributed over the structured tree (cf. e.g., Lakoff 1970; McCawley 1968; Postal 1971). (v) Some transformations take place before lexical insertion (prelexical transformations, McCawley 1968). (vi) Semantic deep structure allows only three categories: V (corresponding to predicates), NP (corresponding to arguments), and S (corresponding to propositions). Thus, for example, verbs, adjectives, quantifiers, negation, etc. are all assigned the category V in semantic deep structure (cf. Lakoff 1970: 115ff; Bach 1968; Postal 1971). (vii) Transformations can change syntactic relations: Since Floyd broke the glass contains a structure expressing the glass broke as part of its semantic deep structure, the glass occurs as subject in semantic deep structure and as object in surface structure.

2.2. Structure and location of decompositions

In Generative Semantics, syntactic and semantic structures do not constitute different levels or modules of linguistic theory. They are related by syntactically motivated transformations; that is, although semantic by nature, the lexical decompositions occurring in semantic deep structure and intermediate levels of sentence derivation must be considered as parts of syntactic structure. In contrast to the "Aspects"-model of Generative Syntax (Chomsky 1965), the terminal constituents of semantic deep structure are semantic and not morphological entities. Particularly interesting for the development of theories of lexical decomposition is the fact that semantic deep structure contained abstract verbs like CAUSE or CHANGE (Fig. 17.1) that were sublexical in the sense that they were part of a lexical decomposition. Moreover, it was assumed that all predicates that appear in semantic deep structure are abstract predicates. A basic abstract predicate like BELIEVE resembles the actual word believe in its meaning and its argument-taking properties, but unlike actual words, it is considered to be unambiguous. These abstract entities attach to the terminal nodes in semantic deep structure (cf. Fig. 17.1 and, for structures and derivations of this sort, Lakoff 1965; McCawley 1968; Binnick 1972).

[S [NP david] [V CAUSE] [S [V BECOME] [S [V NOT] [S [V ALIVE] [NP goliath]]]]]

Fig. 17.1: Semantic deep structure for David killed Goliath
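Since the categories reduce to V, NP, and S, such a deep structure is essentially nested predicate-argument application and can be encoded directly. The Python fragment below is an expository encoding only (the tuple format is invented for illustration, not Generative Semantics notation):

# Each clause is a tuple (PREDICATE, argument, ...); arguments are NPs (strings)
# or embedded clauses (tuples). This encodes the tree in Fig. 17.1.
KILL_DS = ("CAUSE", "david", ("BECOME", ("NOT", ("ALIVE", "goliath"))))

def paraphrase(node) -> str:
    """Spell the structure out without predicate raising."""
    if isinstance(node, str):
        return node
    pred, *args = node
    return f"{pred}({', '.join(paraphrase(a) for a in args)})"

print(paraphrase(KILL_DS))  # CAUSE(david, BECOME(NOT(ALIVE(goliath))))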
Abstract predicates can be moved by a transformation called 'predicate raising', a form of Chomsky adjunction that has the effect of fusing abstract predicates into predicate complexes (cf. Fig. 17.2).

[V [V NOT] [V ALIVE]]

Fig. 17.2: Adjunction of ALIVE to NOT by predicate raising

Performing a series of transformations of predicate raising, the tree in Fig. 17.2 is transformed into the tree in Fig. 17.4 via the tree in Fig. 17.3.

[V [V BECOME] [V [V NOT] [V ALIVE]]]

Fig. 17.3: Adjunction of NOT ALIVE to BECOME by predicate raising

[V [V CAUSE] [V [V BECOME] [V [V NOT] [V ALIVE]]]]

Fig. 17.4: Adjunction of BECOME NOT ALIVE to CAUSE by predicate raising
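The cumulative effect of these raisings can be sketched computationally. The following Python function, operating on the toy encoding introduced above, is a simplification for exposition: it flattens the whole clausal spine at once and ignores the locality restriction on raising discussed in section 2.3 below.

# Toy predicate raising: collect the predicates along the clausal spine into one
# fused complex and keep the NPs as the arguments the complex will take.
def raise_predicates(node):
    if isinstance(node, str):          # an NP: no predicate to raise
        return [], [node]
    pred, *args = node
    complex_pred, nps = [pred], []
    for arg in args:
        sub_pred, sub_nps = raise_predicates(arg)
        complex_pred += sub_pred       # adjunction fuses the predicates
        nps += sub_nps
    return complex_pred, nps

complex_pred, nps = raise_predicates(
    ("CAUSE", "david", ("BECOME", ("NOT", ("ALIVE", "goliath")))))
print(complex_pred)  # ['CAUSE', 'BECOME', 'NOT', 'ALIVE'] -- target of lexical insertion
print(nps)           # ['david', 'goliath']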
Finally, the complex of abstract predicates gets replaced by a lexical item. The lexical insertion transformation [ CAUSE [ BECOME [ NOT [ ALIVE ] ] ] ] → kill yields the tree in Fig. 17.5.

[S [NP david] [V kill] [NP goliath]]

Fig. 17.5: Lexical insertion of kill, replacing [ CAUSE [ BECOME [ NOT ALIVE ] ] ]

According to McCawley (1968), predicate raising is optional. If it does not take place, the basic semantic components do not fuse. Thus, kill is only one option of expressing the semantic deep structure in Fig. 17.1, besides cause to die, cause to become dead, cause to become not alive, etc.

2.3. Linguistic phenomena

Since Generative Semantics considered itself a general theory of syntax and semantics, a large array of phenomena were examined within this school, in particular quantification, auxiliaries, tense, speech acts, etc. (cf. Immler 1974; McCawley 1994; Binnick 1972). In the following, a number of phenomena will be illustrated whose explanation is closely related to lexical decomposition.

(i) Possible words: Predicate raising operates locally. A predicate can only be adjoined to an adjacent higher predicate. For a sentence like (2a), Ross (1972: 109ff) assumes the semantic deep structure in (2b). The local nature of predicate raising predicts that the decompositional structure can be realized as try to find or, in case of adjunction of find to try, as look for. A verb conveying the meaning of 'try-entertain', on the other hand, is universally prohibited since entertain cannot adjoin to try.

(2) a. Fritz looked for entertainment.
    b. [try fritz [find fritz [entertain someone fritz]]]

(ii) Lexical gaps: Languages have lexical gaps in the sense that not all abstract predicate complexes can be replaced by a lexical item. For example, while the three admitted structures cause become red (redden), become red (redden), and red (red) can be replaced by lexical items, the corresponding structures for blue show accidental gaps: cause become blue (no lexical item), become blue (no lexical item), and blue (blue) (cf. McCawley 1968). Lexical items missing in English may exist in other languages, for example, French bleuir. Since the transformation of predicate raising is restricted in the way described above, Generative Semantics can distinguish between those non-existing lexical items that are ruled out in principle, namely by the restrictions on predicate raising, and those that are just accidentally missing.

(iii) Related sentences: Lexical decompositions allow us to capture identical sentence-internal relations across a number of related sentences. The relation between Goliath and not alive in the three sentences David kills Goliath (cf. Fig. 17.1), Goliath dies,
and Goliath is dead is captured by assigning an identical subtree expressing this relation to the semantic deep structures of all three sentences. Lakoff (1970: 33ff) provides an analysis of the adjective thick, the intransitive verb thicken, and the transitive verb thicken that systematically bridges the syntactic differences between the three items by exploring their semantic relatedness through decompositions.

(iv) Cross-categorical transfer of polysemy: The fact that the liquid cooled has two readings ('the liquid became cool' and 'the liquid became cooler') is explained by inserting into the decomposition the adjective from which the verb is derived, where the adjective can assume the positive or the comparative form (Lakoff 1970).

(v) Selectional restrictions: Generative Semantics is capable of stating generalizations over selectional restrictions. The fact that the object of kill and the subjects of die and dead share their selectional restrictions is due to the fact that all three contain the abstract predicate not alive in their decomposition (cf. Postal 1971: 204ff).

(vi) Derivational morphology: Terminal nodes in lexical decompositions can be associated with derivational morphemes. McCawley (1968) suggests a treatment of lexical causatives like redden in which the causative morpheme -en is inserted under the node become in the decomposition cause become red.

(vii) Reference of pronouns: Particular properties of the reference of pronouns are explained by relating pronouns to subtrees within lexical decompositions.

(3) a. Floyd melted the glass though it surprised me that he would do so.
    b. Floyd melted the glass though it surprised me that it would do so.

In (3b), in contrast to (3a), the pronoun picks up the decompositional subtree the-glass become melted within the semantic structure [Floyd cause [the-glass become melted]] (cf. Lakoff 1970; Lakoff & Ross 1972).

(viii) Semantics of adverbials: Adverbials often show scopal ambiguities (cf. Morgan 1969). Sentences like Rebecca almost killed Jamaal can have several readings depending on the scope of almost (cf. article 7 (Engelberg) Lexical decomposition, section 1.2). Assuming the semantic deep structure [do [Rebecca cause [become [Jamaal not alive]]]], the three readings can be represented by attaching the adverb above either do, cause, or not alive. Similar analyses have been proposed for adverbs like again (Morgan 1969), temporal adverbials (cf. McCawley 1971), and durative adverbials like for four years in the sheriff of Nottingham jailed Robin Hood for four years, where the adverbial only modifies the resultative substructure indicating that Robin Hood was in jail (Binnick 1968).

2.4. Evaluation

Generative Semantics has considerably widened the domain of phenomena that syntactic and semantic theories have to account for. It stimulated research not only in syntax but particularly in lexical semantics, where structures similar to the lexical decompositions proposed by Generative Semantics are still being used. Yet, Generative Semantics experienced quite vigorous opposition, in particular from formal semantics, psycholinguistics, and, of course, proponents of interpretive semantics within Generative Grammar
(e.g., Chomsky 1970). Some of the critical points pertaining to decompositions were the following:

(i) Generative Semantics was criticized for its semantic representations not conforming to standards of formal semantic theories. According to Bartsch & Vennemann (1972: 10ff), the semantic representations of Generative Semantics are not logical forms: The formation rules for semantic deep structures are uninterpreted; operations like argument deletion lead to representations that are not well-formed; and the treatment of quantifiers, negation, and some adverbials as predicates instead of operators is inadequate. Dowty (1972) emphasizes that Generative Semantics lacks a theory of reference.

(ii) While the rules defining deep structure and the number of categories were considerably reduced by Generative Semantics, the analyses were very complex, and semantic deep structure differed extremely from surface structure (Binnick 1972: 14).

(iii) It was never even approximately established how many and what transformations would be necessary to account for all sentential structures of a language (Immler 1974: 121).

(iv) It often remained unclear how the reduction to more primitive predicates should proceed, that is, what criteria allow one to decide whether dead is decomposed as NOT ALIVE or alive as NOT DEAD (Bartsch & Vennemann 1972: 22) - an objection that also applies to most other approaches to decomposition.

(v) In most cases, a lexical item is not completely equivalent to its decomposition. De Rijk has shown that while forget and its presumed decomposition 'cease to know' are alike with respect to presuppositions, there are cases where argument-taking properties and pragmatic behaviour are not interchangeable. If somebody has friends in Chicago who suddenly move to Australia, it is appropriate to say I have ceased to know where to turn for help in Chicago but not I have forgotten where to turn for help in Chicago (de Rijk, after Morgan 1969: 57ff). More arguments of this sort can be found in Fodor's (1970) famous article Three Reasons for Not Deriving "Kill" from "Cause to Die", which McCawley (1994) could partly rebut by reference to pragmatic principles.

(vi) A lot of the phenomena that Generative Semantics tried to explain by regular transformations on a decomposed semantic structure exhibited lexical idiosyncrasies and were less regular than would be expected under a syntactic approach. Here are some examples. Pronouns are sometimes able to refer to substructures in decompositions. Unlike example (3) above, in sentences with to kill they cannot pick up the corresponding substructure x become dead (Fodor 1970: 429ff); monomorphemic lexical items often seem to be anaphoric islands:

(4) a. John killed Mary and it surprised me that he did so.
    b. John killed Mary and it surprised me *that she did so.

While to cool shows an ambiguity related to the positive and the comparative form of the adjective (cf. section 2.3), to open only relates to the positive form of the adjective (Immler 1974: 143f). Sometimes selectional restrictions carry over to related sentences displaying the same decompositional substructure; in other cases they do not.
While the child grew is possible, a decompositionally related structure does not allow child as the corresponding argument of grow: *the parents grew the child (Kandiah 1968). Adverbials give rise to structurally ambiguous sentence meanings, but they usually cannot attach to all predicates in a decomposition (Shibatani 1976: 11; Fodor et al. 1980: 286ff). In particular, Dowty (1979) showed that Generative Semantics overpredicted adverbial scope, quantifier scope, and syntactic interactions with cyclic transformations. He concluded that rules of semantic interpretation of lexical items are different from syntactic transformations (Dowty 1979: 284).

(vii) Furthermore, Generative Semantics was confronted with arguments derived from psycholinguistic evidence (cf. article 7 (Engelberg) Lexical decomposition, section 3.6).

3. Lexical decomposition in Montague Semantics

3.1. Origins and motivation

The interesting phenomena that emerged from the work of Generative Semanticists, on the one hand, and the criticism of the syntactic treatment of decompositional structures, on the other, led to new approaches to word-internal semantic structure. Dowty's (1972; 1976; 1979) goal was to combine the methods and results of Generative Semantics with Montague's (1973) rigorously formalized framework of syntax and semantics, in which truth and denotation with respect to a model were considered the central notions of semantics. Dowty refuted the view that all interesting semantic problems concern only the so-called logical words and compositional semantics. Instead, he assumed that compositional semantics crucially depends on an adequate approach to lexical meaning. While he acknowledged the value of lexical decompositions in this respect, he considered decompositions incomplete unless they come with "an account of what meanings really are." Dowty (1979: v, 21) believed that the essential features of Generative Semantics can all be accommodated within Montague Semantics, where the logical structures of Generative Semantics receive a model-theoretic interpretation and the weaknesses of the syntactic approaches, in particular overgeneration, can be overcome. In Montague Semantics, sentences are not interpreted directly but are first translated into expressions of intensional logic. These translations are considered the semantic representations that correspond to the logical structure (i.e., the semantic deep structure) of Generative Semantics (Dowty 1979: 22). Two differences between Dowty's approach and classical Generative Semantics have to be noted: Firstly, the directionality is inverse. While Generative Semantics maps semantic structures onto syntactic surface structures, syntactic structures are mapped onto semantic representations in Dowty's theory. Secondly, in Generative Semantics, but not in Dowty's theory, derivations can have multiple stages (Dowty 1979: 24ff).

3.2. Structure and location of decompositions

Lexical decompositions are used in Dowty (1979) mainly in order to reveal the different logical structures of verbs belonging to the several so-called Vendler classes. Vendler
(1957) classified verbs (and verb phrases) into states, activities, accomplishments, and achievements according to their behaviour with respect to the progressive aspect and temporal-aspectual adverbials (cf. also article 48 (Filip) Aspectual class and Aktionsart). Dowty (1979: 52ff) extends the list of phenomena associated with these classes, relates verbs of different classes to decompositional representations as in (5a-c), and also distinguishes further subtypes of these classes as in (5d-f) (cf. Dowty 1979: 123ff).

(5) a. simple statives: πn(α1, ..., αn)
       John knows the answer.
    b. simple activities: DO(α1, [πn(α1, ..., αn)])
       John is walking.
    c. simple achievements: BECOME [πn(α1, ..., αn)]
       John discovered the solution.
    d. non-intentional agentive accomplishments: [[DO(α1, [πn(α1, ..., αn)])] CAUSE [BECOME [ρm(β1, ..., βm)]]]
       John broke the window.
    e. agentive accomplishments with secondary agent: [[DO(α1, [πn(α1, ..., αn)])] CAUSE [DO(β1, [ρm(β1, ..., βm)])]]
       John forced Bill to speak.
    f. intentional agentive accomplishments: DO(α1, [DO(α1, [πn(α1, ..., αn)]) CAUSE φ])
       John murdered Bill.

The different classes are built up out of stative predicates (πn, ρm) and a small set of operators (DO, BECOME, CAUSE). Within an aspect calculus, the operators involved in the decompositions are given model-theoretic interpretations, and the stative predicates are treated as predicate constants. The interpretation of BECOME (6a) is based on von Wright's (1963) logic of change; the semantics of CAUSE (6b) as a bisentential operator (cf. Dowty 1972) is mainly derived from Lewis' (1973) counterfactual analysis of causality, and the less formalized analysis of DO (6c) relates to considerations about will and intentionality in Ross (1972).

(6) a. [BECOME φ] is true at I if there is an interval J containing the initial bound of I such that ¬φ is true at J, and there is an interval K containing the final bound of I such that φ is true at K (Dowty 1979: 140).
    b. [φ CAUSE ψ] is true if and only if (i) φ is a causal factor for ψ, and (ii) for all other φ' such that φ' is also a causal factor for ψ, some ¬φ-world is as similar or more similar to the actual world than any ¬φ'-world is.
       φ is a causal factor for ψ if and only if there is a series of sentences φ, φ1, ..., φn, ψ (for n ≥ 0) such that each member of the series depends causally on the previous member.
       ψ depends causally on φ if and only if φ, ψ and ¬φ □→ ¬ψ are all true (Dowty 1979: 108f).
    c. □[DO(α, φ) ↔ [φ ∧ UNDER-THE-UNMEDIATED-CONTROL-OF-THE-AGENT(φ)]] (Dowty 1979: 118)
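To illustrate how the truth condition in (6a) works, consider a schematic evaluation of an inchoative, taking cool'(the soup) as the embedded stative (cf. (7c) below); the interval labels are ours and serve expository purposes only:

    [BECOME cool'(s)] is true at an interval I iff
      (i)  there is an interval J containing the initial bound of I at which ¬cool'(s) is true, and
      (ii) there is an interval K containing the final bound of I at which cool'(s) is true.

I thus spans the minimal change: the soup is not yet cool at the beginning of I and cool at its end.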
By integrating lexical decompositions into Montague Semantics, Dowty wants to expand the treatment of the class of entailments that hold between English sentences (Dowty 1979: 31); he aims to show how logical words interact with non-logical words (e.g., words from the domain of tense, aspect, and mood with Vendler classes), and he expects that lexical decompositions help to narrow down the range of possible lexical meanings (Dowty 1979: 34f, 125ff).

With respect to the semantic status of lexical decompositions, Dowty (1976: 209ff) explores two options; namely, that the lexical expression itself is decomposed into a complex predicate, or that it is related to a predicate constant via a meaning postulate (cf. article 7 (Engelberg) Lexical decomposition, section 3.1). While he mentions cases where a strong equivalence between predicate and decomposition provides evidence for the first option, he also acknowledges that the second option might often be empirically more adequate since it allows the weakening of the relation between predicate and decomposition from a biconditional to a conditional. This would account for the observation that the complex phrase cause to die has a wider extension than kill.

3.3. Linguistic phenomena

Dowty provides explanations for a wide range of phenomena related to Vendler classes. A few examples are the influence of mass nouns and indefinites on the membership of expressions in Vendler classes (Dowty 1979: 78ff), the interaction of derivational morphology with sublexical structures of meaning (Dowty 1979: 32, 206f, 256ff), explanations of adverbial and quantifier scope (Dowty 1976: 213ff), the imperfective paradox (Dowty 1979: 133), the progressive aspect (Dowty 1979: 145ff), resultative constructions (Dowty 1979: 219ff), and temporal-aspectual adverbials (for an hour, in an hour) (Dowty 1979: 332ff).

Dowty also addresses aspectual composition. Since he conceives of Vendler classes in terms of lexical decomposition, he shows that decompositional structures involving CAUSE, BECOME, and DO are not only introduced via semantically complex verbs but also via syntactic and morphological processes. For example, accomplishments that arise when verbs are combined with prepositional phrases get their BECOME operator from the preposition (7a), where the preposition itself can undergo a process that adds an additional CAUSE component (7b) (Dowty 1979: 211f). In (7c), the morphological process forming deadjectival inchoatives is associated with a BECOME proposition (Dowty 1979: 206) that can be expanded with a CAUSE operator in the case of deadjectival causatives (Dowty 1979: 307f).

(7) a. John walks to Chicago.
       walk'(John) ∧ BECOME be-at'(John, Chicago)
    b. John pushes a rock to the fence.
       ∃x[rock'(x) ∧ ∃y[∀z[fence'(z) ↔ y = z] ∧ [push'(John, x) CAUSE BECOME be-at'(x, y)]]]
    c. The soup cooled.
       ∃x[∀y[soup'(y) ↔ x = y] ∧ BECOME cool'(x)]
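On this pattern, the causative counterpart of (7c), John cooled the soup, would receive a representation in which the inchoative structure is embedded under CAUSE. The following is our own reconstruction along the lines of the schema in (5d), with the unspecified causing activity abbreviated informally:

    John cooled the soup.
    ∃x[∀y[soup'(y) ↔ x = y] ∧ [[DO(John, [π(John)])] CAUSE [BECOME cool'(x)]]]

Here π stands in for whatever unspecified activity of John's brings the change about.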
3.4. Evaluation

Of the pre-80s work on lexical decomposition, besides the work of Jackendoff (cf. section 4), it is probably Dowty's "Word Meaning and Montague Grammar" that still exerts the most influence on semantic studies. It has initiated a long period of research dominated by approaches that located lexical decompositions in lexical semantics instead of syntax. It must be considered a major advancement that Dowty was committed to providing formal truth-conditions for the operators involved in decompositions. Many of the approaches preceding and following Dowty (1979) lack this degree of explicitness. His account of the compositional nature of many accomplishments, the interaction of aspectual adverbials with Vendler classes, and many other phenomena mentioned in section 3.3 served as a basis for discussion for the approaches to follow. Among the approaches particularly influenced by Dowty (1979) are van Valin's (1993) decompositions within Role and Reference Grammar, Levin and Rappaport Hovav's Lexical Conceptual Structures (cf. section 5), and Pustejovsky's Event Structures (cf. section 6).

4. Conceptual Semantics

4.1. Origins and motivation

Along with lexical decompositions in Generative Semantics, another line of research emerged in which the semantic arguments of predicates were associated with the roles they play in the events denoted by verbs (Gruber 1965; Fillmore 1968b). Thematic roles like agent, patient, goal, etc. were used to explain how semantic arguments are mapped onto syntactic structures (cf. also article 18 (Davis) Thematic roles). Early approaches to thematic roles assumed that there is a small set of unanalyzable roles that are semantically stable across the verbal lexicon. Thematic role approaches were confronted with a number of problems concerning the often vague semantic content of roles, their coarse-grained nature as a descriptive tool, the lack of reliable diagnostics for them, and empirically inadequate syntactic generalizations (cf. the overview in Levin & Rappaport Hovav 2005: 35ff; Dowty 1991: 553ff). As a consequence, thematic role theories developed in different ways: by decomposing thematic roles into features (Rozwadowska 1988), by reducing thematic roles to just two generalized macroroles (van Valin 1993) or to proto-roles within a prototype approach based on lexical entailments (Dowty 1991), by combining them with event structure representations (Grimshaw 1990; Reinhart 2002), and, in particular, by conceiving of them as notions derived from lexical decompositions (van Valin 1993). This last approach was pursued by Jackendoff (1972; 1976) and was one of the foundations of a semantic theory that came to be known as Conceptual Semantics, which, over the years, has approached a large variety of phenomena beyond thematic roles (cf. also article 30 (Jackendoff) Conceptual Semantics).

According to Jackendoff's (1983; 1990; 2002) Conceptual Semantics, meanings are essentially conceptual entities, and semantics is "the organization of those thoughts that language can express" (Jackendoff 2002: 123). Meanings are represented on an autonomous level of cognitive representation called "conceptual structure" that is related to
syntactic and phonological structure, on the one side, and to non-linguistic cognitive levels like the visual, the auditory, and the motor system, on the other side. Conceptual structure is conceived of as a universal model of the mind's construal of the world (Jackendoff 1983: 18ff). Thus, Conceptual Semantics differs from formal, model-theoretic semantics in locating meanings not in the world but in the minds of speakers and hearers. Therefore, notions like truth and reference do not play the role they play in formal semantics but are relativized to the speaker's conceptualizations of the world (cf. Jackendoff 2002: 294ff). Jackendoff's approach also differs from many others in not assuming a strict division between semantic and encyclopaedic meaning or between grammatically relevant and irrelevant aspects of meaning (Jackendoff 2002: 267ff).

4.2. Structure and location of decompositions

Conceptual structure involves a decomposition of meaning into conceptual primitives (Jackendoff 1983: 57ff). Since Jackendoff (1990: 10f) assumes that there is an indefinitely large number of possible lexical concepts, conceptual primitives must be combined by generative principles to determine the set of lexical concepts. Thus, most lexical concepts are considered to be conceptually complex. Decompositions in Conceptual Semantics differ considerably in content and structure from lexical structures in Generative Semantics and Montague-based approaches to the lexicon. The sentence in (8a) would yield the meaning representation in (8b):

(8) a. John entered the room.
    b. [Event GO ([Thing JOHN], [Path TO ([Place IN ([Thing ROOM])])])]

Each pair of square brackets encloses a conceptual constituent, where the capitalized items denote the conceptual content, which is assigned to a major conceptual category like Thing, Place, Event, State, Path, Amount, etc. (Jackendoff 1983: 52ff). There are a number of possibilities for mapping these basic ontological categories onto functor-argument structures (Jackendoff 1990: 43). For example, the category Event can be elaborated into two-place functions like GO ([Thing], [Path]) or CAUSE ([Thing/Event], [Event]). Furthermore, each syntactic constituent maps into a conceptual constituent, and partly language-specific correspondence rules relate syntactic categories to the particular conceptual categories they can express. In later versions of Conceptual Semantics, these kinds of structures are enriched by referential features and modifiers, and the propositional structure is accompanied by a second tier encoding elements of information structure (cf. Jackendoff 1990: 55f; Culicover & Jackendoff 2005: 154f).

A lexical entry consists of a phonological, syntactic, and conceptual representation. As can be seen in Fig. 17.6, most of the conceptual structure in (8b) is projected from the conceptual structure of the verb, which provides a number of open argument slots (Jackendoff 1990: 46).

    enter
    V
    <NPj>
    [Event GO ([Thing]i, [Path TO ([Place IN ([Thing]j)])])]

Fig. 17.6: Lexical entry for enter
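A minimal sketch of how the entry in Fig. 17.6 yields (8b): the indexed slots of the verb's conceptual structure are filled by the conceptual structures of its syntactic arguments (the step-by-step rendering is ours):

    [Thing JOHN]  →  fills the i-indexed slot (the subject NP)
    [Thing ROOM]  →  fills the j-indexed slot (the object NP, via <NPj>)

    Result: [Event GO ([Thing JOHN], [Path TO ([Place IN ([Thing ROOM])])])]   (= 8b)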
The conceptual structure is supplemented with a spatial structure that captures finer distinctions between lexical items in a way closely related to non-linguistic cognitive modules (Jackendoff 1990: 32ff; 2002: 345ff).

The same structure can also come about in a compositional way. While enter already includes the concept of a particular path (in), in the following example this information is contributed by a preposition:

(9) a. John ran into the room.
    b. [Event GO ([Thing JOHN], [Path TO ([Place IN ([Thing ROOM])])])]

The corresponding lexical entries for the verb and the preposition (Fig. 17.7) account for the structure in (9b) (Jackendoff 1990: 45).

    run
    V
    <PPj>
    [Event GO ([Thing]i, [Path]j)]

    into
    P
    <NPj>
    [Path TO ([Place IN ([Thing]j)])]

Fig. 17.7: Lexical entries for run and into

An important feature of Jackendoff's (1990: 25ff; 2002: 356ff) decompositions is the use of abstract location and motion predicates in order to represent the meaning of words outside the local domain. For example, a change-of-state verb like melt is rendered as 'to go from solid to liquid'. Building on Gruber's (1965) earlier work, Jackendoff thereby relates different classes of verbs by analogy. In Jackendoff (1983: 188), he states that in any semantic field in the domain of events and states "the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion". To make these analogies work, predicates are related to certain fields that determine the character of the arguments and the sort of inferences. For example, if the two-place function BE(x,y) is supplemented by the field feature spatial, it indicates that x is an object and y is its location, while the field feature possession indicates that x is an object and y the person who owns it. Only the latter involves inferences about the rights of y to use x (Jackendoff 2002: 359ff).

Jackendoff (2002: 335f) dissociates himself from approaches that compare lexical decompositions with dictionary definitions; the major difference is that the basic elements of decompositions need not be words themselves, just as phonological features, the basic components of phonological elements, are not sounds. He also claims that the speaker does not have conscious access to the decompositional structure of lexical items; it can only be revealed by linguistic analysis.

The question of what elements should be used in decompositions is answered to the effect that if the meaning of lexeme A entails the meaning of lexeme B, the decomposition of A includes that of B (Jackendoff 1976). Jackendoff (2002: 336f) admits that it is hard to tell how the lower bound of decomposition can be determined. For example,
some approaches consider CAUSE to be a primitive; others conceive of it as a family of concepts related through feature decomposition. In contrast to approaches that are only interested in finding those components that are relevant for the syntax-semantics interface, he argues that from the point of view of learnability the search for conceptual primitives has to be taken seriously beyond what is needed for the syntax. He takes the stance that it is just a matter of further and more detailed research before the basic components are uncovered.

4.3. Linguistic phenomena

Lexical decompositions within Conceptual Semantics serve much wider purposes than in many other approaches. First of all, they are considered a meaning representation in their own right, that is, not primarily driven by the need to explain linking and other linguistic interface phenomena. Moreover, as part of conceptual structure, they are linked not only to linguistic but also to non-linguistic cognitive domains.

As a semantic theory, Conceptual Semantics has to account for inferences. With respect to decompositions, this is done by postulating inference rules that link decompositional predicates. For example, from any decompositional structure involving GO(x,y,z), we can infer that BE(x,y) holds before the GO event and BE(x,z) after it. This is captured by inference rules as in (10a) that resemble meaning postulates in truth-conditional semantics (Jackendoff 1976: 114). Thus, we can infer from the train went from Kankakee to Mattoon that the train was in Kankakee before and in Mattoon after the event. Since other sentences involving a GO-type predicate like the road reached from Altoona to Johnstown do not share this inference, the predicates are subtyped to the particular fields transitional versus extensional. The extensional version of GO is associated with the inference that one part of x in GO(x,y,z) is located in y and the other in z (10b) (Jackendoff 1976: 139):

(10) a. GO_Trans(x,y,z) at t1 ⇒ for some times t2 and t3 such that t2 < t1 < t3,
        BE_Trans(x,y) at t2 and BE_Trans(x,z) at t3
     b. GO_Ext(x,y,z) ⇒ for some v and w such that v ⊂ x and w ⊂ x,
        BE_Ext(v,y) and BE_Ext(w,z)

One of Jackendoff's main concerns is the mapping between conceptual and syntactic structure. Part of this mapping is the linking of semantic arguments to syntactic structures. In Conceptual Semantics this is done via thematic roles. Thematic roles are not primitives in Conceptual Semantics but can be defined on the basis of decompositions (Jackendoff 1972; 1987). They "are nothing but particular structural configurations in conceptual structure" (Jackendoff 1990: 47). For example, Jackendoff (1972: 39) identifies the first argument of the CAUSE relation with the agent role. In later versions of his theory, Jackendoff (e.g., 1990: 125ff) expands the conceptual structure of verbs by adding an action tier to the representation. While the original concept of decomposition (the 'thematic tier') is couched in terms of location and motion, thereby rendering thematic roles like theme, goal, or source, the action tier expresses how objects are affected and accounts for roles like actor and patient. Thus, hit as in the car hit the tree provides a theme (the car, the "thing in motion" in the example sentence) and a goal (the tree) on
the thematic tier, and an actor (the car) and a patient (the tree) on the action tier. These roles can be derived from the representation in Fig. 17.8 (after Jackendoff 1990: 125ff).

    [Event INCH [BE ([CAR], [AT [TREE]])]]
    [AFF ([CAR], [TREE])]

Fig. 17.8: Thematic tier and action tier for the car hit the tree

The thematic roles derived from the thematic and the action tiers are ordered within a thematic hierarchy. This hierarchy is mapped onto a hierarchy of syntactic functions such that arguments are linked to syntactic functions according to the rank of their thematic role in the thematic hierarchy (Jackendoff 1990: 258, 268f; 2002: 143). Strict subcategorization can thus largely be dispensed with. However, Jackendoff (1990: 255ff; 2002: 140f) still acknowledges subcategorizational idiosyncrasies.

Among the many other phenomena treated within Conceptual Semantics and related to lexical decompositions are argument structure alternations (Jackendoff 1990: 71ff), aspectual-temporal adverbials and their relation to the boundedness of events (Jackendoff 1990: 27ff), the semantics of causation (Jackendoff 1990: 130ff), and phenomena at the border between adjuncts and arguments (Jackendoff 1990: 155ff).

4.4. Evaluation

Jackendoff's approach to lexical decomposition has been a cornerstone in the development of lexical representations since it covers a wide domain of different classes of lexical items and phenomena associated with these classes. The wide coverage forced Jackendoff to expand the structures admitted in his decompositions. This in turn evoked the criticism that his theory lacks sufficient restrictiveness. Furthermore, it has been criticized that the locational approach to decomposition needs to be stretched too far in order to make it convincing that it covers all classes of verbs (Levin 1995: 84). Wunderlich (1996: 171) considers Jackendoff's linking principle problematic since it cannot easily be applied to languages with case systems. In general, Jackendoff's rather heterogeneous set of correspondence rules has attracted criticism because it involves a considerable weakening of the idea of semantic compositionality (cf. article 30 (Jackendoff) Conceptual Semantics for Jackendoff's position).

5. LCS decompositions and the MIT Lexicon Project

5.1. Origins and motivation

An important contribution to the development of decompositional theories of lexical meaning originated in the MIT Lexicon Project in the mid-eighties. Its main proponents are Beth Levin and Malka Rappaport Hovav. Their approach is mainly concerned with the relation between semantic properties of lexical items and their syntactic behaviour. Thus, it aims at "developing a representation of those aspects of the meaning of a lexical item which characterize a native speaker's knowledge of its argument structure and determine the syntactic expression of its arguments" (Levin 1985: 4). The meaning representations were supposed to lead to definitions of semantic classes that show a uniform syntactic behaviour:
(1) All arguments bearing a particular semantic relation are systematically expressed in certain ways.
(2) Predicates fall into classes according to the arguments they select and the syntactic expression of these arguments.
(3) Adjuncts are systematically expressed in the same way(s), and their distribution often seems to be limited to semantically coherent classes of predicates.
(4) There are regular extended uses of predicates that are correlated with semantic class.
(5) Predicates belonging to certain semantic classes display regular alternations in the expression of their arguments. (Levin 1985: 47)

Proponents of the approach criticized accounts that were based solely on thematic roles, as they were incapable of explaining diathesis alternations (Levin 1985: 49ff; 1995: 76ff). Instead, they proposed decompositions on the basis of Jackendoff's (1972; 1976) conceptual structures, Generative Semantics, earlier work by Carter (1976) and Joshi (1974), and ideas from Hale & Keyser (1987).

5.2. Structure and location of decompositions

The main idea in the early MIT Lexicon Project Working Papers was that two levels of lexical representation have to be distinguished, a lexical-semantic and a lexical-syntactic one (Rappaport, Laughren & Levin 1987, later published as Rappaport, Levin & Laughren 1993; Rappaport & Levin 1988). The lexical-syntactic representation, PAS ("predicate argument structure"), "distinguishes among the arguments of a predicator only according to how they combine with the predicator in a sentence". PAS, which is subject to the projection principle (Rappaport & Levin 1988: 16), expresses whether the role of an NP-argument is assigned (i) by the verb ("direct argument"), (ii) by a different theta role assigner like a preposition ("indirect argument"), or (iii) by the VP via predication ("external argument") (Rappaport, Laughren & Levin 1987: 3). These three modes of assignment are illustrated in the PAS for the verb put:

(11) a. put, PAS: x <y, P_loc z>
     b. put, LCS: [ x cause [ y come to be at z ]]

The lexical-semantic basis of PAS is a lexical decomposition, LCS ("Lexical Conceptual Structure") (Rappaport, Laughren & Levin 1987: 8). The main task for an LCS-based approach to lexical semantics is to find the mapping principles between LCS and PAS and between PAS and syntactic structure (Rappaport 1985: 146f).

Not all researchers associated with the MIT Lexicon Project distinguished two levels of representation. Carter (1988) refers directly to argument positions of predicates within decompositions in order to explain linking phenomena. In later work, Levin and Rappaport do not make reference to PAS as a level of representation anymore. The distinction between grammatically relevant and irrelevant lexical information is now reflected in a distinction between primitive predicates that are embedded in semantic templates, which are claimed to be part of Universal Grammar, and predicate constants, which reflect the idiosyncratic part of lexical meaning. The templates pick up distinctions known from Vendler classes (Vendler 1957) and are referred to as event structure representations. For example, (12a) is a template for an activity, and (12b) is a template for a particular kind of accomplishment (Rappaport Hovav & Levin 1998: 108):
(12) a. [x ACT<MANNER>]
     b. [[x ACT<MANNER>] CAUSE [BECOME [y <STATE>]]]

The templates in (12) illustrate two main characteristics of this approach: templates can embed other templates, and constants can function as modifiers of predicates (e.g., <MANNER> with respect to ACT) or as arguments of predicates (e.g., <STATE> with respect to BECOME). The representations are augmented by well-formedness conditions which require that each subevent in a template be represented by a lexical head in syntax and that all participants in lexical structure and all argument XPs in syntax be mapped onto each other. The principle of Template Augmentation makes it possible to build up complex lexical representations from simple ones, such that the variants of sweep reflected in (13a-13c) are represented by the decompositions in (14a-14c):

(13) a. Phil swept the floor.
     b. Phil swept the floor clean.
     c. Phil swept the crumbs onto the floor.

(14) a. [x ACT<SWEEP> y]
     b. [[x ACT<SWEEP> y] CAUSE [BECOME [y <STATE>]]]
     c. [[x ACT<SWEEP> y] CAUSE [BECOME [z <PLACE>]]]

It is assumed that the nature of the constant can determine the range of templates that can be associated with it (cf. article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure).

It should be noted that the approach based on LCS-type decompositions aims primarily at explaining the regularities of argument realization. Particularly in its later versions, it is not intended to capture different kinds of entailments, aspectual behaviour, or restrictions on adverbial modification. It is assumed that all and only those meaning components that are relevant to grammar can be isolated and represented as LCS templates.

5.3. Linguistic phenomena

Within this approach, most research was focused on the argument structure alternations that verbs may undergo.

(15) a. Bill loaded cartons onto the truck.
     b. Bill loaded the truck with cartons.

With respect to alternations as in (15), Rappaport & Levin (1988: 19ff) argued that a lexical meaning representation has to account for (i) the near-paraphrase relation between the two variants, (ii) the different linking behaviour of the variants, and (iii) the interpretation of the goal argument in (15b) as completely affected by the action. They argued that a theta role approach could not fulfill these requirements: If the roles for load were considered identical for both variants, for example, <agent, locatum, goal>, requirement (i) is met but (ii) and (iii) are not, since the different argument realizations cannot follow from identical theta role assignments and the affectedness component in (15b) is not
expressed. If the roles for the two variants are considered different, for example, <agent, theme, goal> for (15a) versus <agent, locatum, goal> for (15b), the near-paraphrase relation gets lost, and the completeness interpretation of (15b) still needs stipulative interpretation rules.

Within an LCS approach, linking rules make reference not to theta roles but to substructures of decompositions, for example: "When the LCS of a verb includes one of the substructures in [16], link the variable represented by x in either substructure to the direct argument variable in the verb's PAS.

(16) a. ... [ x come to be at LOCATION ] ...
     b. ... [ x come to be in STATE ] ..." (Rappaport & Levin 1988: 25)

With respect to load, it is assumed that the variant in (17b), with its additional meaning component of completion, entails the variant in (17a), giving rise to the following representations:

(17) a. load: [ x cause [ y to come to be at z ] /LOAD ]
     b. load: [[ x cause [ z to come to be in STATE ]] BY MEANS OF [ x cause [ y to come to be at z ]] /LOAD ] (Rappaport & Levin 1988: 26)

Assuming that the linking rules apply to the main clause within the decomposition, the two decompositions lead to different PAS representations, in which the direct argument is associated with the theme in (18a) and the goal in (18b):

(18) a. load: x <y, P_loc z>
     b. load: x <z, P_with y>

Thus the three observations, (i) the near-paraphrase relation, (ii) the different linking behaviour, and (iii) the complete affectedness of the theme in one variant, are accounted for. Among the contemporary studies that proceeded in a similar vein are Hale & Keyser's (1987) work on the middle construction and Guerssel et al.'s (1985) cross-linguistic studies on causative, middle, and conative alternations.

It is particularly noteworthy that Levin and Rappaport have greatly expanded the range of phenomena in the domain of argument structure alternations that a lexical semantic theory has to cover. Their empirical work on verb classes determined by the range of argument structure alternations they allow is documented in Levin (1993): About 80 argument structure alternations in English lead to the definition of almost 200 verb classes. The theoretical work represented by the template approach to LCS focuses on finding the appropriate constraints that guide the extension of verb meanings and explain the variance in argument structure alternations.

5.4. Evaluation

The work of Levin, Rappaport Hovav, and other researchers working with LCS-like structures has had a large influence on later work on the syntax-semantics interface. By uncovering the richness of the domain of argument structure alternations, they defined what theories at the lexical syntax-semantics interface have to account for today. Among
the work inspired by Levin and Rappaport Hovav's theory are approaches whose goal is to establish linking regularities on more abstract, structural properties of decompositions (e.g., Lexical Decomposition Grammar, cf. section 7) and attempts to integrate elements of lexical decompositions into syntactic structure (cf. section 9).

Levin and Rappaport Hovav's work is also typical of a large amount of lexical semantic research in the 1980s and 90s that has largely given up the semantic rigorousness characteristic of approaches based on formal semantics like Dowty (1979). Less rigorous semantic relations make theories more susceptible to circular argumentation when semantic representations are mapped onto syntactic ones (cf. article 7 (Engelberg) Lexical decomposition, section 3.5). It has also been questioned whether Levin and Rappaport Hovav's approach allows for a principled account of cross-linguistic variation and universals (Croft 1998: 26; Zubizarreta & Oh 2007: 8).

6. Event Structure Theory

6.1. Origins and motivation

In the late 1980s, much research in the domain of aspect and Aktionsart came to be inspired by two papers approaching verb semantics from a philosophical point of view, namely, Vendler's (1957) classification of expressions based on predicational aspect and Davidson's (1967) suggestion to reify events in order to explain adverbial modification. In connection with Dowty's (1979) work on decompositions within Montague Semantics, the intensification of research on grammatical aspect, predicational aspect, and Aktionsarten also stimulated event-based research in lexical semantics. In particular, Pustejovsky's (1988; 1991a; 1991b) idea of conceiving of verbs as referring to structured events added a new dimension to decompositional approaches to verb semantics.

6.2. Structure and location of decompositions

According to Pustejovsky (1988; 1991a; 1991b), each verb refers to an event that can consist of subevents of different types, where 'processes' (P) and 'states' (S) are simple types that can combine to yield the complex type 'transition' [P S]T via event composition. A process is conceived of as "a sequence of events identifying the same semantic expression", a state as "a single event, which is evaluated relative to no other event", and a transition as "an event identifying a semantic expression, which is evaluated relative to its opposition" (Pustejovsky 1991a: 56). In addition to this event structure (ES), Pustejovsky assumes a level LCS', where each subevent is related to a decomposition. Out of this, a third level of Lexical Conceptual Structure (LCS) can be derived, which contains a single lexical decomposition. The following examples illustrate how the meaning of sentences is based on these representational levels:

(19) a. Mary ran.
     b. Mary ran to the store.
     c. The door is closed.
     d. The door closed.
     e. John closed the door.
    ES:   P
    LCS': [ run(mary) ]
    LCS:  run(mary)

Fig. 17.9: Representation of Mary ran

    ES:   T
    LCS': [ run(mary) ] [ at(mary, the-store) ]
    LCS:  cause(act(mary), become(at(mary, the-store)) by run)

Fig. 17.10: Representation of Mary ran to the store

    ES:   S
    LCS': [ closed(the-door) ]
    LCS:  closed(the-door)

Fig. 17.11: Representation of the door is closed

    ES:   T
    LCS': [ ¬closed(the-door) ] [ closed(the-door) ]
    LCS:  become(closed(the-door))

Fig. 17.12: Representation of the door closed

    ES:   T
    LCS': [ act(john, the-door) & ¬closed(the-door) ] [ closed(the-door) ]
    LCS:  cause(act(john, the-door), become(closed(the-door)))

Fig. 17.13: Representation of John closed the door
In terms of Vendler classes, Fig. 17.9 describes an activity, Fig. 17.11 a state, Figs. 17.10 and 17.13 accomplishments, and Fig. 17.12 an achievement. According to Pustejovsky, achievements and accomplishments have in common that they lead to a result state and are distinguished in that achievements do not involve an act-predicate at LCS'. As in many other decompositional theories (Jackendoff 1972; van Valin 1993), thematic roles are considered epiphenomenal and can be derived from the structured lexical representations (Pustejovsky 1988: 27).

Pustejovsky's event structure theory is part of his attempt to construct a theory of the Generative Lexicon (Pustejovsky 1995) that, besides Event Structure, also comprises Qualia Structure, Argument Structure, and Inheritance Structure (Pustejovsky 1991b; 1995). He criticises contemporary theories for focussing too much on the search for a finite set of semantic primitives:

    Rather than assuming a fixed set of primitives, let us assume a fixed number of generative devices that can be seen as constructing semantic expressions. Just as a formal language is described in terms of the productions in the grammar rather than its accompanying vocabulary, a semantic language should be defined by the rules generating the structures for expressions rather than the vocabulary of primitives itself. (Pustejovsky 1991a: 54)

6.3. Linguistic phenomena

The empirical coverage of Pustejovsky's theory is wider than that of many other decompositional theories: (i) The ambiguity of adverbials as in Lisa rudely departed is explained by attaching the adverb either to the whole transition T ('It was rude of Lisa to depart') or to the embedded process P ('Lisa departed in a rude manner') (Pustejovsky 1988: 31f). (ii) The mapping of Vendler classes onto structural event representations allows for a formulation of the restrictions on temporal-aspectual adverbials (in five minutes, for five minutes, etc.) (Pustejovsky 1991a: 73). (iii) The linking behaviour of verbs is related to LCS' components; for example, the difference between unaccusatives and unergatives is accounted for by postulating that a participant involved in a predicate opposition (as in Fig. 17.12) is mapped onto the internal argument position in syntax, while the agentive participant in an initial subevent (as in Fig. 17.13) is realized as the external argument (Pustejovsky 1991a: 75). Furthermore, on the basis of Event Structure and Qualia Structure, a theory of aspectual coercion has been developed (Pustejovsky & Bouillon 1995), as well as an account of the lexicalization of causal relations (Pustejovsky 1995).

6.4. Evaluation

Pustejovsky's concept of event structures has been taken up by many other lexical semanticists. Some theories included event structures as an additional level of representation. Grimshaw (1990) proposed a linking theory that combined a thematic hierarchy and an aspectual hierarchy of arguments based on the involvement of event participants in Pustejovsky-style subevents. In Lexical Decomposition Grammar, event structures were introduced as a level expressing sortal restrictions on events in order to explain the distribution and semantics of adverbials (Wunderlich 1996). It has sometimes been criticised that Pustejovsky's event structures are not fine-grained enough to explain
adverbial modification. Consequently, suggestions have been made as to how event structures could be modified and extended (e.g., Wunderlich 1996; cf. also Engelberg 2006).

Apart from those studies and theories that make explicit reference to Pustejovsky's event structures, a number of other approaches emerged in which phasal or mereological properties of events are embedded in lexical semantic representations, among them work by Tenny (1987; 1988), van Voorst (1988), Croft (1998), and some of the syntactic approaches to be discussed in section 9. Even standard lexical decompositions are often conceived of as event descriptions and referred to as 'event structures', for example, the LCS structures in Rappaport Hovav & Levin (1998).

Event structures by themselves can of course not be considered full decompositions that exhaust the meaning of a lexical item. As we have seen above, they are always combined with other lexical information, for example, LCS-style decompositions or thematic role representations. Depending on the kind of representation they are attached to, it is not quite clear whether they constitute an independent level of representation. In Pustejovsky's approach, event structures are probably by and large derivable from the LCS structures they are linked to.

7. Two-level Semantics and Lexical Decomposition Grammar

7.1. Origins and motivation

Two-level Semantics originated in the 1980s, with Manfred Bierwisch, Ewald Lang and Dieter Wunderlich being its main proponents (Bierwisch 1982; 1989; 1997; Bierwisch & Lang 1989; Wunderlich 1991; 1997a; cf. article 31 (Lang & Maienborn) Two-level Semantics). In particular, Bierwisch's contribution is remarkable for his attempt to define the role of the lexicon within Generative Grammar. Lexical Decomposition Grammar (LDG) emerged out of Two-level Semantics in the early 1990s. LDG has been particularly concerned with the lexical decomposition of verbs and the relation between semantic, conceptual, and syntactic structure. It has been developed by Dieter Wunderlich (1991; 1997a; 1997b; 2000; 2006) and other linguists stemming from the Düsseldorf Institute for General Linguistics (Stiebels 1996; 1998; 2006; Kaufmann 1995a; 1995b; 1995c; Joppen & Wunderlich 1995; Gamerschlag 2005) - with contributions by Paul Kiparsky (cf. article 78 (Kiparsky & Tonhauser) Semantics of inflection).

7.2. Structure and location of decompositions

Two-level Semantics argues for separating semantic representations (semantic form, SF), which are part of the linguistic system, from conceptual representations, which are part of the conceptual system (CS). Only SF is seen as a part of grammar that is integrated into its computational mechanisms, while conceptual structure is a level of reasoning that builds on more general mental operations. How the interplay between SF and CS can be spelled out is shown in Maienborn (2003). She argues that spatial PPs can either function as event-external modifiers, locating the event as a whole as in (20a), or as event-internal modifiers, specifying a spatial relation that holds within the event as in (20b). While external event location is semantically straightforward, internal event location is subject to conceptual knowledge. Not only does the local relation expressed in (20b) require world knowledge about spatial relations in bike riding events, it is also re-interpreted as
an instrumental relation that is not lexically provided by the verb or the preposition. Furthermore, in sentences like (20c), the external argument of the preposition - the woman's hand or some instrument the woman uses - is not even mentioned in the sentence but has to be supplied by conceptual knowledge.

(20) a. Der Bankräuber ist auf der Insel geflohen.
        the bank robber has on the island escaped
     b. Der Bankräuber ist auf dem Fahrrad geflohen.
        the bank robber has on the bicycle escaped
     c. Maria zog Paul an den Haaren aus dem Zimmer.
        Maria pulled Paul at the hair out of the room

Several tests show that event-internal modifiers attach to the edge of V and event-external modifiers to the edge of VP. Maienborn (2003: 487) suggests that the two syntactic positions trigger slightly different modification processes at the level of SF. While in both cases the lexical entries entering the semantic composition have the same decompositional representation (21a, b), the process of external modification identifies the external argument of the preposition with the event argument of the verb (21c; λQ applying to the PP, λP to the verb), whereas the process of internal modification turns the external argument of the preposition into a free variable, a so-called SF parameter (the variable v in 21d), that is specified as a constituent part (PART-OF) of what will later be instantiated with the event variable.

(21) a. [P auf]: λy λx [LOC (x, ON (y))]
     b. [V fliehen]: λx λe [ESCAPE (e) & THEME (e, x)]
     c. MOD: λQ λP λx [P(x) & Q(x)]
     d. MOD': λQ λP λx [P(x) & PART-OF (x, v) & Q(v)]

The compositional processes yield the representation in (22a) for the sentence (20b), the variable v being uninstantiated. This representation will be enriched at the level of CS, which falls back on a large base of shared conceptual knowledge. The utterance meaning is achieved via abduction processes, which lead to the most economical explanation that is consistent with what is in the knowledge base. Spelling out the relevant part of the knowledge base, i.e. knowledge about spatial relations, about event types in terms of participants serving particular functions, and about the part-whole organization of physical objects, Maienborn shows how the CS representation for (20b), given in (22b), can be formally derived (for details cf. Maienborn 2003: 492ff).

(22) a. SF: ∃e [ESCAPE (e) & THEME (e, r) & BANK-ROBBER (r) & PART-OF (e, v) & LOC (v, ON (b)) & BIKE (b)]
     b. CS: ∃e [EXTR-MOVE (e) & ESCAPE (e) & THEME (e, r) & BANK-ROBBER (r) & INSTR (e, b) & VEHICLE (b) & BIKE (b) & SUPPORT (b, r) & LOC (r, ON (b))]
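As a sketch of the composition underlying (22a) - our own step-by-step rendering of Maienborn's analysis, with intermediate terms simplified:

    Step 1: [PP auf dem Fahrrad]  ≈  λx [LOC (x, ON (b)) & BIKE (b)]          (from (21a))
    Step 2: MOD' applied to the PP and to fliehen (21b, d):
            λe [ESCAPE (e) & THEME (e, r) & PART-OF (e, v) & LOC (v, ON (b)) & BIKE (b)]
    Step 3: filling in the subject (r, the bank robber) and existentially binding e yields (22a);
            v remains free and is resolved only at CS, as in (22b).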
The emergence of Lexical Decomposition Grammar out of Two-level Semantics is particularly interesting for the development of decompositional approaches to verb meaning. LDG locates decompositional representations in semantics and rejects syntactic approaches to decomposition, arguing that they have failed to provide logically equivalent paraphrases and an adequate account of the scopal properties of adverbials (Wunderlich 1997a: 28f). It assumes four levels of representation: conceptual structure (CS), semantic form (SF), theta structure (TS), and morphological/syntactic structure (MS; in earlier versions of LDG also called phrase structure, PS). SF is a decomposition based on type logic that is related to CS by restrictive lexicalization principles; TS is derived from SF by lambda abstraction and encodes the argument hierarchy. TS in turn is mapped onto MS by linking principles (Wunderlich 1997a: 32; 2000: 249ff). The four levels are illustrated in Fig. 17.14 with respect to the German verb geben 'give' in (23).

(23) a. (als) [der Torwart [dem Jungen [den Ball gab]]]
        (when) the goalkeeper the boy the ball gave
     b. [DPx/nom [DPy/dat [DPz/acc geb-AGR]]]

    TS:  λz [+hr, −lr]   λy [+hr, +lr]   λx [−hr, +lr]   λs
          |               |               |               |
    MS:  ACC             DAT             NOM             AGR

    SF:  {ACT(x) & BECOME POSS(y, z)}(s)

    CS:  x = Agent or Controller, y = Recipient, z = Patient or Affected
         causal event: ACT(x)(s1), result state: POSS(y, z)(s2)

Fig. 17.14: The representational levels of LDG (Wunderlich 2000: 250)

Semantic form does not provide a complete characterization of a word's meaning. It serves to represent those properties of predicate-argument structures that make it possible to account for their grammatical properties (Wunderlich 1996: 170). This level of representation must be finite and not subject to contingent knowledge. In contrast to semantic form, conceptual structures draw on an infinite set of properties and can be subject to contingent knowledge (Wunderlich 1997a: 29). Since SF decompositions consist of hierarchically ordered binary structures - assuming that a & b branches as [a [& b]] - arguments can be ranked according to how deeply they are embedded within this structure. TS in turn preserves the SF hierarchy of arguments in inverse order, so that arguments can be discharged by functional application (Wunderlich 1997a: 44). Each argument role in TS is characterized as to whether there is a higher or a lower role. Besides their thematic arguments, nouns and verbs also have referential arguments, which do not undergo linking. Referential arguments are subject to sortal restrictions that are represented as a structured index on the referential argument (Wunderlich 1997a: 34). With verbs, this sortal index consists of an event structure, similar in form to Pustejovsky's event structures but differing slightly with respect to the distinctions expressed (Wunderlich 1996: 175ff).
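As a sketch of how TS is obtained from SF by lambda abstraction, consider the geben case of Fig. 17.14 (the step-by-step rendering is ours):

    SF:  {ACT(x) & BECOME POSS(y, z)}(s)
    TS:  λz λy λx λs {ACT(x) & BECOME POSS(y, z)}(s)

z, the most deeply embedded argument in SF, is abstracted first; since roles for y and x sit above it, it is [+hr, −lr]. x, abstracted last, is the highest role, [−hr, +lr], and y, with roles both above and below it, is [+hr, +lr]. Functional application then discharges the arguments in the order z, y, x, matching the order of the DPs in (23).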
The relation between the different levels is mediated by a number of principles. For example, Argument Hierarchy regulates the inverse hierarchical mapping from SF to TS, and Coherence requires that the subevents corresponding to SF predicates be interpreted as contemporaneous or causally related (Kaufmann 1995c). Thus, the causal interpretation of geben in (23) is not explicitly given in SF but left to Coherence as a general CS principle of interpretation.

7.3. Linguistic phenomena

The main concern of LDG is argument linking, and its basic assumption is that syntactic properties of arguments follow from hierarchical structures within semantic form. Structural linking is based on the assignment of two binary features to the arguments in TS, [± hr] 'there is a / no higher role' and [± lr] 'there is a / no lower role'. The syntactic case features are associated with these two binary features, dative with [+hr, +lr], accusative with [+hr], ergative with [+lr], and nominative/absolutive with [ ] (cf. Fig. 17.14). All and only the structural arguments have to be matched with a structural linker. Besides structural linking, it is taken into account that arguments can be suppressed or realized by oblique markers. This also motivates the distinction between SF and TS (Wunderlich 1997b: 47ff; 2000: 252). The following examples show how structural arguments are matched with structural linkers in nominative-accusative (NA) and absolutive-ergative (AE) languages (Wunderlich 1997a: 49):

    intransitive verbs:   λx [−hr, −lr]     NA: nom    AE: abs

    transitive verbs:     λy [+hr, −lr]     NA: acc    AE: abs
                          λx [−hr, +lr]     NA: nom    AE: erg

    ditransitive verbs:   λz [+hr, −lr]     NA: acc    AE: abs
                          λy [+hr, +lr]     NA: dat    AE: dat
                          λx [−hr, +lr]     NA: nom    AE: erg

LDG pursues a strictly lexical account of argument extensions such as possessors, beneficiaries, or arguments introduced by word formation processes or resultative formation. These argument extensions are all handled within SF formation by adding predicates to an existing SF (Stiebels 1996; Wunderlich 2000). Thus, the complex verb in (25a) is represented as the complex SF (25c) on the basis of (25b) and an argument extension principle.

(25) a. Sie erschrieb sich den Pulitzer-Preis.
        she "er"-wrote herself the Pulitzer Prize
        'She won the Pulitzer Prize by her writing.'
     b. schreib- 'write': λy λx λs WRITE(x, y)(s)
     c. erschreib-: λv λu λx λs ∃y {WRITE(x, y)(s) & BECOME POSS(u, v)}(s)
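Applying the linking features introduced above to the extended verb in (25c) - a sketch added for illustration - derives exactly the case pattern of (25a):

    TS of erschreib-: λv λu λx λs ...
      v: [+hr, −lr] → accusative    (den Pulitzer-Preis)
      u: [+hr, +lr] → dative        (sich)
      x: [−hr, +lr] → nominative    (sie)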
These processes are restricted by two constraints on possible verbs, Coherence and Connexion, the latter requiring that each predicate in SF share at least one, possibly implicit, argument with another predicate in SF (Kaufmann 1995c). In (25c), Coherence guarantees the causal interpretation, and Connexion accounts for the identification of the agent of writing with the possessor of the prize. The resulting SF is then subject to Argument Hierarchy and the usual linking principles. As we have seen in (25c), the morphological operation adds semantic content to SF, as it does with other operations like resultative formation. In other cases, morphology operates on TS in order to change linking conditions (e.g., passive) (Wunderlich 1997a: 52f).

During the last 20 years, LDG has produced numerous studies on phenomena in a number of typologically diverse languages, dealing with agreement (Wunderlich 1994), word formation of verbs (Wunderlich 1997b; Stiebels 1996; Gamerschlag 2005), locative verbs (Kaufmann 1995a), causatives and resultatives (Wunderlich 1997a; Kaufmann 1995a), dative possessors (Wunderlich 2000), ergative case systems (Joppen & Wunderlich 1995), and nominal linking (Stiebels 2006).

7.4. Evaluation

In contrast to some other decompositional approaches, LDG adheres to a compositional approach to meaning and tries to define its relation to current syntactic theories. In more recent publications (Stiebels 2002; Gamerschlag 2005), LDG has been reformulated within an optimality-theoretic framework. Lexical Decomposition Grammar is criticized by Taylor (2000), in particular for its division between semantic and conceptual knowledge. LDG, based on Two-level Semantics, accounts for the different readings of a lexical item within conceptual structure, leaving lexical entries largely monosemous. Taylor argues that lexical usage is to a large degree conventionalized and that the particular readings a word does or does not have cannot be construed entirely from conceptual knowledge. Bierwisch (2002) presents a number of arguments against the removal of CAUSE from SF decompositions. Further problems, emerging from structural stipulations, are discussed in article 7 (Engelberg) Lexical decomposition, section 3.5.

8. Natural Semantic Metalanguage

8.1. Origins and motivation

The theory of Natural Semantic Metalanguage (NSM) originated in the early seventies. Its main proponents have been Anna Wierzbicka (1972; 1980; 1985; 1992; 1996) and Cliff Goddard (1998; 2006; 2008a; Goddard & Wierzbicka 2002). NSM theory has been developed as an attempt to construct a semantic metalanguage (i) that is expressive enough to cover all the word meanings in natural languages, (ii) that allows non-circular reductive paraphrases, (iii) that avoids metalinguistic elements that are not part of the natural language it describes, (iv) that is not ethnocentric, and (v) that makes it possible to uncover the universal properties of word meanings (cf. for an overview Goddard 2002a; Durst 2003). In order to achieve this, Wierzbicka suggested that the lexicon of a language can be divided into a small set of indefinable words (semantic primes) and a large set of words that can be defined in terms of these indefinables.
8.2. Structure and location of decompositions

The term Natural Semantic Metalanguage is intended to reflect that the semantic primes used as a metalanguage are actual words of the object language. The indefinables constitute a finite set and, although they are language-specific, each language-specific set "realizes, in its own way, the same universal and innate alphabet of human thought" (Wierzbicka 1992: 209). More precisely, this implies that the set of semantic primes of a particular language and their combinatorial potential have the expressive power of a full natural language and that the sets of semantic primes of all languages are isomorphic to each other. The set of semantic primes consists of 60 or so elements, including such words as you, this, two, good, know, see, word, happen, die, after, near, if, very, kind of, and like, each disambiguated by a canonical context (cf. Goddard 2008b). These primes are claimed to be indefinable and indispensable (cf. Goddard & Wierzbicka 1994b; Wierzbicka 1996; Goddard 2002b; Wierzbicka 2009). Meaning descriptions within NSM theory look like the following (Wierzbicka 1992: 133):

(26) a. (X is embarrassed)
     b. X thinks something like this:
          something happened to me now
          because of this, people here are thinking about me
          I don't want this
          because of this, I would want to do something
          I don't know what I can do
          I don't want to be here now
        because of this, X feels something bad

It is required for the relationship between the defining decomposition and the defined term that they be identical in meaning. This is connected to substitutability; the definiens and the definiendum are supposed to be replaceable by each other without change of meaning (Wierzbicka 1988: 12).

8.3. Linguistic phenomena

More than any other decompositional theory, NSM theory resembles basic lexicographic approaches to meaning, in particular, those traditions of English learner lexicography in which definitions of word meanings are restricted to the non-circular use of a limited "controlled" defining vocabulary (e.g., Summers 1995). Thus, it is not surprising that NSM theory tackles word meanings in many semantic fields that have not been at the centre of attention within other decompositional approaches, for example, pragmatically complex domains like speech act verbs (Wierzbicka 1987). Other investigations focus on the cultural differences reflected in words and their alleged equivalents in other languages, for example, Wierzbicka's (1999) study on emotion words. NSM theory also claims to be able to render the meaning of syntactic constructions and grammatical categories by decompositions. An example is given in (27).

(27) a. ['first person plural exclusive']
     b. I'm thinking of some people
        I am one of these people
        you are not one of these people
        (Goddard 1998: 299)

The claim of NSM theory to be particularly apt as a means to detect subtle cross-linguistic differences is reflected in Goddard & Wierzbicka (1994a), where studies on a fairly large number of typologically and genetically diverse languages are presented.

8.4. Evaluation
While many of its critics acknowledge that NSM theory has provided many insights into particular lexical phenomena, its basic theoretical assumptions have often been subject to criticism. It has been called into question whether the emphasis on giving dictionary-style explanations of word meanings is identical to uncovering the native speaker's knowledge about word meaning. NSM theory has also been criticized for not putting much effort into providing a foundation for the theory on basic semantic concepts (cf. Riemer 2006: 352). The lack of a theory of truth, reference, and compositionality within NSM theory has raised severe doubts about whether it can adequately deal with phenomena like quantification, anaphora, proper names, and presuppositions (Geurts 2003; Matthewson 2003; Barker 2003). This criticism also affects the claim of the theory to be able to cover the semantics of the entire lexicon of a language.

9. Lexical Relational Structures
9.1. Origins and motivation
With the decline of Generative Semantics in the 1970s, lexical approaches to decomposition began to dominate the field. These approaches enriched our understanding of the complexity of lexical meaning as well as the possibility of generalizations across verb classes. Then in the late 1980s syntactic developments within the Principles & Parameters framework suggested more complex structures within the VP. The assumption of VP-internal subjects and, in particular, Larson's (1988) theory of VP-shells as layered VP-internal structures suggested the possibility of aligning certain bits of verb-internal semantic structure with structural positions in layered VPs. With these developments underway, the time was ripe for new syntactic approaches to decomposition (cf. also the summary in Levin & Rappaport Hovav 2005: 131ff).

9.2. Structure and location of decompositions
On the basis of data from binding, quantification, and conjunction with respect to double object constructions, Larson (1988: 381) argues for the Single Argument Hypothesis, according to which a head can only have one argument. This forces a layered structure with multiple heads within VP. Fig. 17.15 exhibits the structure of Mary gave a box to Tom within this VP-shell. The verb moves to the higher V node by head-movement. The mapping of a verb's arguments onto the nodes within the VP-shell is determined by a theta hierarchy 'agent > theme > goal > obliques' such that the lowest role of a verb is assigned to the lowest argument position, the next lowest role to the next lowest argument position, and so on (Larson 1988: 382). Thus, there is a weak correspondence between structural positions and verb semantics in the sense that high argument positions are associated with a comparatively high thematic value. However, structural positions within VP shells are not linked to any stable semantic interpretation (a schematic sketch of this bottom-up mapping follows Fig. 17.15 below).

Fig. 17.15: VP-shell and theta role assignment [tree diagram not reproduced; the thematic hierarchy assigns Agent, Theme, and Goal to successively lower positions in the shell headed by give]
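The bottom-up mapping just described lends itself to a compact illustration. The following fragment is a toy rendering in Python, not anything Larson proposes; the hierarchy labels follow the text, while the position labels, variable names, and function name are our own:

    # A schematic sketch of Larson's bottom-up mapping: the lowest role
    # on the theta hierarchy is paired with the lowest argument position
    # in the VP-shell, the next lowest role with the next lowest position.

    THETA_HIERARCHY = ["agent", "theme", "goal", "oblique"]  # high > low

    def map_to_shell(roles, positions):
        # Rank the verb's roles from high to low on the hierarchy,
        # then pair them with the shell positions from the bottom up.
        ranked = sorted(roles, key=THETA_HIERARCHY.index)
        return dict(zip(reversed(ranked), reversed(positions)))

    # 'Mary gave a box to Tom'; positions are ordered highest first.
    print(map_to_shell(["goal", "agent", "theme"],
                       ["Spec,VP1", "Spec,VP2", "Comp,VP2"]))
    # -> {'goal': 'Comp,VP2', 'theme': 'Spec,VP2', 'agent': 'Spec,VP1'}

The sketch makes the weak correspondence visible: the mapping fixes only relative height, not any stable semantic interpretation of the individual positions.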
Larsonian shells inspired research on the syntactic representation of argument structure. Particularly influential was the approach pursued by Hale & Keyser (1993; 1997; 2002). They assume that argument structure is handled within a lexicon component called l-syntax, which is an integral part of syntax as it obeys syntactic principles. The basic assumption is that argument structure, also called "lexical relational structure", is defined in reference to two possible relations between a head and its arguments, namely, the head-complement and the head-specifier relation (Hale & Keyser 1999: 454). Each verb projects an unambiguous structure in l-syntax. In Fig. 17.16, the lexical relational structure of put as in put the books on the shelf is illustrated.

Fig. 17.16: Lexical relational structure of put and head movement of the verb [tree diagram not reproduced; the verb raises by head movement from the lowest shell, above the PP on the shelf]
Primary evidence for this approach is taken from verbs that are regarded as denominal. Locational verbs of this sort such as to shelve, to box, or to saddle receive a similar representation as to put. The lexical relational structure of shelve consists of Larsonian VP-shells with the noun shelf as complement of the embedded prepositional head. From there, the noun incorporates into an abstract V head by head movement (cf. Fig. 17.17) (Hale & Keyser 1993: 55ff); a toy sketch of this derivation follows at the end of this subsection.

Fig. 17.17: Lexical relational structure of shelve and incorporation of the noun [tree diagram not reproduced; shelf raises from the complement of the embedded P into the empty V head]

In a similar way, unergative verbs like sneeze or dance (28b), which are assumed to have a structure parallel to expressions like make trouble or have puppies (28a), are derived by incorporation of a noun into a V head (Hale & Keyser 1993: 54f).

(28) a. [V' [V have] [NP [N puppies]]]
     b. [V' [V sneeze_i] [NP [N t_i]]]

Hale & Keyser (1993: 68) assume that "elementary semantic relations" are "associated" with these syntactic structures: The agent occurs in a Spec position above VP, the theme in a Spec position of a V that takes a PP/AP complement, and so forth. The argument structure of shelve is thus related to semantic relations as exhibited in Fig. 17.18.

Fig. 17.18: Semantic relations associated with lexical relational structures (after Hale & Keyser 1993: 76ff) [tree diagram not reproduced]

It is important to keep in mind that Hale and Keyser do not claim that argument structures are derived from semantics. On the contrary, they assume that "certain meanings can be assigned to certain structures" in the sense that they are fully determined by l-syntactic configurations (Hale & Keyser 1993: 68; 1999: 463). Fig. 17.18 also reflects two important points of Hale and Keyser's theory. Firstly, they assume a central distinction between verb classes: Contrary to the early exploration of VP structures as in Fig. 17.17, unergatives and transitives in contrast to unaccusatives are assumed not to have a subject as part of their argument structure; their subjects are assigned in s-syntax (Hale & Keyser 1993: 76ff). Secondly, Hale and Keyser emphasize that their approach explains why the number of theta roles is (allegedly) so small, namely, because there is only a very restricted set of syntactic configurations with which they can be associated.
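The incorporation step deriving shelve can be pictured as a simple tree transformation. The following is a toy sketch only; the nested-list encoding and all names are ours, whereas Hale and Keyser state incorporation as head movement in l-syntax, not as a procedure:

    # A toy sketch of noun incorporation, assuming a lexical relational
    # structure [V [P N]] encoded as a nested list with an empty V head.

    def incorporate(lrs):
        # The noun raises into the empty V head, leaving a trace 't'
        # in its base position inside the PP.
        v_head, (p_head, noun) = lrs
        assert v_head is None, "incorporation targets an empty V head"
        return [noun, [p_head, "t"]]

    shelve_lrs = [None, ["on", "shelf"]]   # abstract V over P 'on' and N 'shelf'
    print(incorporate(shelve_lrs))          # ['shelf', ['on', 't']] -> 'shelve'

Nothing in the sketch itself blocks incorporation from a specifier position; on Hale and Keyser's account, that is the work of the Empty Category Principle, discussed next.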
9.3. Linguistic phenomena
Hale and Keyser's approach aims to explain why certain argument structures are possible while others are not. For example, it is argued that sentences like (29a) are ungrammatical because incorporation of a subject argument violates the Empty Category Principle (Hale & Keyser 1993: 60). The ungrammaticality of (29b) is accounted for by the assumption that unergatives as in (28) do not project a specifier that would allow a transitivity alternation (Hale & Keyser 1999: 455). (29c) is argued to be ungrammatical because the Lexical Relational Structure would have to be parallel to she gave a church her money, in which church occupies Spec,VP, the "subject" position of the inner VP. Incorporation from this position violates the Empty Category Principle. By the same reasoning, she flattened the metal is well-formed, incorporating flat from an AP in complement position, while (29d) is not since metal would have to incorporate from an inner subject position.

(29) a. *It cowed a calf. (with the meaning 'a cow calved' and it as expletive)
     b. *An injection calved the cow early.
     c. *She churched the money.
     d. *She metalled flat.

This incorporation approach allows Hale and Keyser to explore the parallels in syntactic behaviour between expressions like give a laugh and laugh which, besides being
near-synonymous, both fail to transitivize, as well as the differences between expressions like make trouble and thicken soups where only the latter allows middles and inchoatives.

9.4. Evaluation
Hale and Keyser's work has stimulated a growing body of research aiming at a syntactification of notions of thematic roles, decompositional and aspectual structures. Some prominent examples are Mateu (2001), Alexiadou & Anagnostopoulou (2004), Erteschik-Shir & Rapoport (2005; 2007), Zubizarreta & Oh (2007), and Ramchand's (2008) first phase syntax. Some of this work places a strong emphasis on aspectual structure, for example, Ritter & Rosen (1998) and Travis (2000). As Travis (2000: 181f) argues, the new syntactic approaches to decompositions avoid many of the pitfalls of Generative Semantics, which is certainly due to a better understanding of restrictive principles in recent syntactic theories. However, Hale and Keyser's work has also attracted heavy criticism from proponents of Lexical Conceptual Structure (e.g., Culicover & Jackendoff 2005; Rappaport Hovav & Levin 2005), Two-level Semantics (e.g., Bierwisch 1997; Kiparsky 1997) as well as from anti-decompositionalist positions (e.g., Fodor & Lepore 1999).
The analyses themselves raise many questions. For example, it remains unexplained which principles exclude a lexical structure of a putative verb to church (29c) along the lines of she gave the money to the church, that is, a structure parallel to the one suggested for to shelve (cf. also Kiparsky 1997: 481). It has also been observed that the position allegedly vacated by the noun in structures as in Fig. 17.18 can actually show lexical material, as in Joe buttered the toast with rancid butter (Culicover & Jackendoff 2005: 102). Many other analyses and assumptions have also been under attack, among them assumptions about which verbs are denominal and how their meanings come about (Kiparsky 1997: 485ff; Culicover & Jackendoff 2005: 55) as well as predictions about possible transitivity alternations (Kiparsky 1997: 491). Furthermore, one can of course doubt that the number of different theta roles is as small as Hale and Keyser assume. More empirically oriented approaches to verb semantics come to dramatically different conclusions (cf. Kiparsky 1997: 478 or work on Frame Semantics like Ruppenhofer et al. 2006). Even some problems from Generative Semantics reemerge, such as that expressions like put on a shelf and shelve are not synonymous, the latter being more specific (Bierwisch 1997: 260). Overgeneralization is not accounted for, either. The fact that there is a verb to shelve but no semantically corresponding verb to basket points to a location of decomposition in the lexicon (Bierwisch 1997: 232f). Reacting to some of the criticism, Hale & Keyser (2005) later modified some assumptions of their theory; for example, they abandoned the idea of incorporation in favour of a locally operating selection mechanism.

10. Distributed Morphology
10.1. Origins and motivation
Hale & Keyser (1993) and much research inspired by them have attempted to reduce the role of the lexicon in favour of syntactic representations. An even more radical anti-
lexicalist approach is pursued by Distributed Morphology (DM), which started out in Halle & Marantz (1993) and since then has been elaborated by a number of DM proponents (Marantz 1997; Harley 2002; Harley & Noyer 1999; 2000; Embick 2004; Embick & Noyer 2007) (cf. also article 81 (Harley) Semantics in Distributed Morphology).

10.2. Structure and location of decompositions
According to Distributed Morphology, syntax does not combine words but generates structures by combining morphosyntactic features. Terminal nodes, so-called "morphemes", are bundles of these morphosyntactic features. DM distinguishes f-nodes from l-nodes. F-nodes correspond to what is traditionally known as functional, closed-class categories; their insertion at spell-out is deterministic. L-nodes correspond to lexical, open-class categories; their insertion is not deterministic. Vocabulary items are only inserted at spell-out. These vocabulary items are minimally specified in that they only consist of a phonological string and some information where this string can be inserted (cf. Fig. 17.19).

Fig. 17.19: The architecture of Distributed Morphology (cf. Harley & Noyer 2000: 352; Embick & Noyer 2007) [diagram not reproduced; syntax generates an abstract syntactic representation, the vocabulary is accessed at spell-out to yield the linguistic expression, and the encyclopaedia comes into play at the conceptual interface during interpretation]

Neither a syntactic category nor any kind of argument structure representation is included in vocabulary entries, as can be seen in example (30a) from Harley & Noyer (1999: 3). The distribution information in the vocabulary item replaces what is usually done by theta-roles and selection. In addition to the Vocabulary, there is a component called Encyclopaedia where vocabulary items are linked to those aspects of meaning that are not completely predictable from morphosyntactic structure (30b).

(30) a. Vocabulary item:
        /dog/: [Root] [+count] [+animate] ...
     b. Encyclopaedia item:
        dog: four legs, canine, pet, sometimes bites, etc. ... chases balls, in environment "let sleeping ___s lie", refers to discourse entity who is better left alone ...

While the formal information in vocabulary items in part determines grammatical well-formedness, the encyclopaedic information guides the appropriate use of expressions. For example, the oddness of (31a) is attributed to encyclopaedic knowledge. The sentence is pragmatically anomalous but interpretable: It could refer to some unusual telepathic transportation event. (31b), on the other hand, is considered ungrammatical because put is
not properly licensed and, therefore, uninterpretable under any circumstances (Harley & Noyer 2000: 354).

(31) a. Chris thought the book to Mary.
     b. *James put yesterday.

Part of speech is reflected in DM by the constellation in which a root morpheme occurs. For example, a root is a noun if its nearest c-commanding f-node is a determiner and a verb if its nearest c-commanding f-nodes are v, aspect, and tense. Not only are lexical entries more reduced than in approaches based on Hale & Keyser (1993), there is also no particular part of syntax corresponding to l-syntax (cf. for this overview Harley & Noyer 1999; 2000; Embick & Noyer 2007).

10.3. Linguistic phenomena
Distributed Morphology has been applied to all kinds of phenomena in the domain of inflectional and derivational morphology. Some work has also been done with respect to the argument structure of verbs and nominalizations. One of the main topics in this area is the explanation of the range of possible argument structure alternations. A typical set of data is given in (32) and (33) (taken from Harley & Noyer 2000: 362).

(32) a. John grows tomatoes.
     b. Tomatoes grow.
     c. The insects destroyed the crop.
     d. *The crops destroyed.

(33) a. the growth of the tomatoes
     b. the tomatoes' growth
     c. *John's growth of the tomatoes
     d. the crop's destruction
     e. the insects' destruction of the crop

It has to be explained why grow, but not destroy, has an intransitive variant and why destruction, but not growth, allows the realization of the causer argument in Spec,DP (for the following, cf. Harley & Noyer 2000: 356ff). Syntactic structures are based on VP-shells. For each node in these structures, there is a set of possible items that can fill this position (with LP corresponding approximately to VP):

(34)     node        possible filling
     a.  Spec,vP     DP, Ø
     b.  v head      happen/become, cause, be
     c.  Spec,LP     DP, Ø
     d.  L head      l-node
     e.  Comp,LP     DP, Ø

Picking from this menu, a number of different syntactic configurations can be created:
(35)     Spec,vP   v        Spec,LP   Comp,LP   (example)
     a.  DP        CAUSE    Ø         DP        grow (tr.)
     b.  Ø         BECOME   Ø         DP        grow (itr.)
     c.  DP        CAUSE    Ø         DP        destroy
     d.  DP        CAUSE    DP        DP        give
     e.  Ø         BECOME   Ø         DP        arrive
     f.  Ø         BE       DP        DP        know

(the L head is filled by the respective l-node in each configuration)

The items filling the v head are the only ones conceived of as having selectional properties: cause, but not become or be, selects an external argument. The grammaticality of the configurations in (35) is also determined by the licensing environment specified in the Vocabulary (cf. Fig. 17.20).

Fig. 17.20: Vocabulary and encyclopaedic entries of verbs (after Harley & Noyer 2000: 361) [table not reproduced; vocabulary entries pair a phonological string with a licensing environment, while encyclopaedia entries supply the associated conceptual knowledge]

Thus, the transitive and intransitive uses of the roots in (32a) through (32c) are reflected in the syntactic structures in (36), where cause and become are realized as zero morphemes (cf. article 81 (Harley) Semantics in Distributed Morphology) and the LP head is assumed to denote a resulting state:

(36) a. [vP [DP John] [v' cause [LP grown [DP tomatoes]]]]
     b. [v' become [LP grown [DP tomatoes]]]
     c. [vP [DP the insects] [v' cause [LP destroyed [DP the crop]]]]

The fact that destroy does not allow an intransitive variant is due to the fact that its licensing environment requires embedding under cause while grow is underspecified in this respect. The explanation for the nominalization data in (33) relies on the assumption that Spec,DP is not as semantically loaded as Spec,vP. It is further assumed that by encyclopaedic knowledge destroy always requires external causation while grow refers inherently to an internally caused spontaneous activity, which is optionally facilitated by some agent. Since cause is only implied with destroy but not with grow, only the causer of destroy can be interpreted in a semantically underspecified Spec,DP position. The fact that some verbs like explode behave partly like grow, in allowing the transitive-intransitive alternation, and partly like destroy, in allowing the realization of the causer in nominalizations, is explained by the assumption that the events denoted by such roots can occur spontaneously (internal causation) but can also be directly brought about by some agent (external causation) (cf. also Marantz 1997). In summary, the phenomena in (32) are traced back to syntactic regularities, those in (33) to encyclopaedic, that is, pragmatic conditions (a schematic sketch of the interplay between the configuration menu and licensing follows below).
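The division of labour just described can be made vivid with a small sketch: the menu in (34)/(35) generates configurations freely, and a root's licensing environment filters them. The Python encoding, the LICENSING table, and all names below are our own simplification of the discussion around Fig. 17.20, not DM machinery:

    # A schematic sketch: enumerate menu combinations, then keep only
    # those the root's (hypothetical) licensing environment tolerates.

    from itertools import product

    V_HEADS = ["cause", "become", "be"]
    SPECS = ["DP", None]                   # Spec fillers: DP or empty

    LICENSING = {
        "grow": {"cause", "become"},       # underspecified: both variants
        "destroy": {"cause"},              # must be embedded under cause
    }

    def licensed_frames(root):
        for spec_vp, v, spec_lp in product(SPECS, V_HEADS, SPECS):
            # Only cause selects an external argument in Spec,vP.
            if (spec_vp is not None) != (v == "cause"):
                continue
            if v in LICENSING[root]:
                yield (spec_vp, v, spec_lp, root, "DP")

    print(list(licensed_frames("destroy")))   # only cause-frames survive
    print(any(v == "become"                   # grow also has a become-frame
              for _, v, _, _, _ in licensed_frames("grow")))  # True

The sketch also makes the locus of the German puzzle discussed below easy to state: nothing in the structural menu itself distinguishes grow from wachsen; the difference would have to be stipulated in the licensing table or in encyclopaedic knowledge.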
10.4. Evaluation
While some other approaches to argument structure share a number of assumptions with DM - for example, they also operate on category-neutral roots (e.g., Borer 2005; Arad 2002) - of course the radical theses of Distributed Morphology have also drawn some criticism. It has been doubted that all the differences in the syntactic behaviour of verbs can be accounted for with a syntax-free lexicon (cf. e.g., Ramchand 2008). Cross-linguistic differences might also pose some problems. For example, it is assumed that verbs allowing the unaccusative-transitive alternation are distinguished on the basis of encyclopaedic semantic knowledge from those that do not (Embick 2004: 139). The causative variant of the showcase example grow in (32a) is grammatically licensed and pragmatically acceptable because of encyclopaedic knowledge. However, it is not clear why German wachsen 'grow' does not have a causative variant. The alternation would be expected since wachsen does not seem to differ from grow in its encyclopaedic properties. Moreover, many other verbs like trocknen 'to dry' (37) demonstrate that German does not show any kind of structural aversion to alternations of this sort.

(37) a. Der Salat trocknet / wächst.
        'The lettuce dries / grows.'
     b. Peter trocknet Salat / *wächst Salat.
        'Peter dries lettuce / grows lettuce.'

The way the line is drawn between grammatical and pragmatic (un)acceptability also poses some problems. If the use of put with only one argument is considered ungrammatical, then how can similar uses of three-place verbs like German stellen 'put (in upright position)' and geben 'give' be explained (38)?

(38) a. Er gibt. 'He deals (in a card game).'
     b. Sie stellt. 'She plays a volleyball such that somebody can smash it.'

Since they are ruled out by grammar, encyclopaedic knowledge cannot save these examples by assigning them an idiomatic meaning. Thus, it might turn out that sometimes argument-structure flexibility is not as general as DM's encyclopaedia suggests, and grammatical restrictions are not as strict as syntax and DM's vocabulary predict.

11. Outlook
The overview has shown that stances on lexical decomposition still differ widely, in particular with respect to the questions of where to locate lexical decompositions, how to interpret them, and how to justify them. It has to be noted that most work on lexical decompositions has not been accompanied by extensive empirical research. With the rise of new methods in the domain of corpus analysis, grammaticality judgements, and psycholinguistics (cf. article 7 (Engelberg) Lexical decomposition, section 3.6), the empirical basis for further decompositional theories will alter dramatically. It remains to be seen how theories of the sort presented here will cope with the empirical turn in contemporary linguistics.
12. References
Alexiadou, Artemis & Elena Anagnostopoulou 2004. Voice morphology in the causative-inchoative alternation: Evidence for a non-unified structural analysis of unaccusatives. In: A. Alexiadou, E. Anagnostopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the Syntax-Lexicon Interface. Oxford: Oxford University Press, 114-136.
Arad, Maya 2002. Universal features and language-particular morphemes. In: A. Alexiadou (ed.). Theoretical Approaches to Universals. Amsterdam: Benjamins, 15-29.
Bach, Emmon 1968. Nouns and noun phrases. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 90-122.
Barker, Chris 2003. Paraphrase is not enough. Theoretical Linguistics 29, 201-209.
Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between Semantics and Syntax. Frankfurt/M.: Athenäum.
Bierwisch, Manfred 1982. Formal and lexical semantics. Linguistische Berichte 30, 3-17.
Bierwisch, Manfred 1989. Event nominalizations: Proposals and problems. In: W. Motsch (ed.). Wortstruktur und Satzstruktur. Berlin: Akademie Verlag, 1-73.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 227-266.
Bierwisch, Manfred 2002. A case for cause. In: I. Kaufmann & B. Stiebels (eds.). More than Words: A Festschrift for Dieter Wunderlich. Berlin: Akademie Verlag, 327-353.
Bierwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Grammatical Structure and Conceptual Interpretation. Berlin: Springer.
Binnick, Robert I. 1968. On the nature of the 'lexical item'. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society (= CLS). Chicago, IL: Chicago Linguistic Society, 1-13.
Binnick, Robert I. 1972. Zur Entwicklung der generativen Semantik. In: W. Abraham & R. I. Binnick (eds.). Generative Semantik. Frankfurt/M.: Athenäum, 1-48.
Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford University Press.
Carter, Richard J. 1976. Some constraints on possible words. Semantikos 1, 27-66.
Carter, Richard 1988. Arguing for semantic representations. In: B. Levin & C. Tenny (eds.). On Linking: Papers by Richard Carter. Lexicon Project Working Papers 25. Cambridge, MA: Center for Cognitive Science, MIT, 139-166.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1970. Some empirical issues in the theory of transformational grammar. In: P. S. Peters (ed.). Goals of Linguistic Theory. Englewood Cliffs, NJ: Prentice Hall, 63-130.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 21-63.
Culicover, Peter W. & Ray Jackendoff 2005. Simpler Syntax. Oxford: Oxford University Press.
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81-95.
Dowty, David R. 1972. Studies in the Logic of Verb Aspect and Time Reference in English. Ph.D. dissertation. University of Texas, Austin, TX.
Dowty, David R. 1976. Montague Grammar and the lexical decomposition of causative verbs. In: B. Partee (ed.). Montague Grammar. New York: Academic Press, 201-246.
Dowty, David R. 1979.
Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547-619.
Durst, Uwe 2003. The Natural Semantic Metalanguage approach to linguistic meaning. Theoretical Linguistics 29, 157-200.
Embick, David 2004. Unaccusative syntax and verbal alternations. In: A. Alexiadou, E. Anagnostopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the Syntax-Lexicon Interface. Oxford: Oxford University Press, 137-158.
Embick, David & Rolf Noyer 2007. Distributed Morphology and the syntax/morphology interface. In: G. Ramchand & C. Reiss (eds.). The Oxford Handbook of Linguistic Interfaces. Oxford: Oxford University Press, 289-324.
Engelberg, Stefan 2006. A theory of lexical event structures and its cognitive motivation. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 235-285.
Erteschik-Shir, Nomi & Tova Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 65-86.
Erteschik-Shir, Nomi & Tova Rapoport 2007. Projecting argument structure. The grammar of hitting and breaking revisited. In: E. Reuland, T. Bhattacharya & G. Spathas (eds.). Argument Structure. Amsterdam: Benjamins, 17-35.
Fillmore, Charles J. 1968a. Lexical entries for verbs. Foundations of Language 4, 373-393.
Fillmore, Charles J. 1968b. The case for case. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 1-88.
Fodor, Jerry A. 1970. Three reasons for not deriving 'kill' from 'cause to die'. Linguistic Inquiry 1, 429-438.
Fodor, Jerry A., Merrill F. Garrett, Edward C. T. Walker & Cornelia H. Parkes 1980. Against definitions. Cognition 8, 263-367.
Fodor, Jerry A. & Ernie Lepore 1999. Impossible words? Linguistic Inquiry 30, 445-453.
Gamerschlag, Thomas 2005. Komposition und Argumentstruktur komplexer Verben. Eine lexikalische Analyse von Verb-Verb-Komposita und Serialverbkonstruktionen. Berlin: Akademie Verlag.
Geurts, Bart 2003. Semantics as lexicography. Theoretical Linguistics 29, 223-226.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press.
Goddard, Cliff 2002a. Lexical decomposition II: Conceptual axiology. In: A. D. Cruse et al. (eds.). Lexikologie - Lexicology. Ein internationales Handbuch zur Natur und Struktur von Wörtern und Wortschätzen - An International Handbook on the Nature and Structure of Words and Vocabularies. (HSK 21.1). Berlin: de Gruyter, 256-268.
Goddard, Cliff 2002b. The search for the shared semantic core of all languages. In: C. Goddard & A. Wierzbicka (eds.). Meaning and Universal Grammar, vol. I: Theory and Empirical Findings. Amsterdam: Benjamins, 5-40.
Goddard, Cliff 2006. Ethnopragmatics: A new paradigm. In: C. Goddard (ed.). Ethnopragmatics. Understanding Discourse in Cultural Context. Berlin: de Gruyter, 1-30.
Goddard, Cliff 2008a. Natural Semantic Metalanguage: The state of the art. In: C. Goddard (ed.). Cross-Linguistic Semantics. Amsterdam: Benjamins, 1-34.
Goddard, Cliff 2008b. Towards a systematic table of semantic elements. In: C. Goddard (ed.). Cross-Linguistic Semantics. Amsterdam: Benjamins, 59-81.
Goddard, Cliff & Anna Wierzbicka (eds.) 1994a. Semantic and Lexical Universals - Theory and Empirical Findings. Amsterdam: Benjamins.
Goddard, Cliff & Anna Wierzbicka 1994b. Introducing lexical primitives. In: C. Goddard & A. Wierzbicka (eds.). Semantic and Lexical Universals - Theory and Empirical Findings. Amsterdam: Benjamins, 31-54.
Goddard, Cliff & Anna Wierzbicka 2002. Semantic primes and universal grammar. In: C. Goddard & A. Wierzbicka (eds.).
Meaning and Universal Grammar, vol. I: Theory and Empirical Findings. Amsterdam: Benjamins, 41-85.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA.
Guerssel, Mohamed, Kenneth Hale, Mary Laughren, Beth Levin & Josie White Eagle 1985. A cross-linguistic study of transitivity alternations. In: W. H. Eilfort, P. D. Kroeber & K. L. Peterson
(eds.). Papers from the Parasession on Causatives and Agentivity at the 21st Regional Meeting of the Chicago Linguistic Society (= CLS), April 1985. Chicago, IL: Chicago Linguistic Society, 48-63.
Hale, Ken & Samuel Jay Keyser 1987. A View from the Middle. Lexicon Project Working Papers 10. Cambridge, MA: Center for Cognitive Science, MIT.
Hale, Ken & Samuel Jay Keyser 1993. On argument structure and the syntactic expression of lexical relations. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 53-109.
Hale, Ken & Samuel Jay Keyser 1997. On the complex nature of simple predicators. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29-65.
Hale, Ken & Samuel Jay Keyser 1999. A response to Fodor and Lepore, "Impossible words?" Linguistic Inquiry 30, 453-466.
Hale, Ken & Samuel Jay Keyser 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: The MIT Press.
Hale, Ken & Samuel Jay Keyser 2005. Aspect and the syntax of argument structure. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 11-41.
Halle, Morris & Alec Marantz 1993. Distributed Morphology and the pieces of inflection. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 111-176.
Harley, Heidi 2002. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.). Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31-70.
Harley, Heidi & Rolf Noyer 1999. Distributed Morphology. GLOT International 4, 3-9.
Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence from nominalizations. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 349-375.
Immler, Manfred 1974. Generative Syntax - Generative Semantik. München: Fink.
Jackendoff, Ray 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1976. Toward an explanatory semantic representation. Linguistic Inquiry 7, 89-150.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369-411.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Joppen, Sandra & Dieter Wunderlich 1995. Argument linking in Basque. Lingua 97, 123-169.
Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidrich (ed.). Semantics and Communication. Amsterdam: North-Holland, 251-283.
Kandiah, Thiru 1968. Transformational Grammar and the layering of structure in Tamil. Journal of Linguistics 4, 217-254.
Kaufmann, Ingrid 1995a. Konzeptuelle Grundlagen semantischer Dekompositionsstrukturen. Die Kombinatorik lokaler Verben und prädikativer Komplemente. Tübingen: Niemeyer.
Kaufmann, Ingrid 1995b. O- and D-predicates: A semantic approach to the unaccusative-unergative distinction. Journal of Semantics 12, 377-427.
Kaufmann, Ingrid 1995c. What is an (im-)possible verb? Restrictions on semantic form and their consequences for argument structure. Folia Linguistica 29, 67-103.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford,
CA: CSLI Publications, 473-499.
Lakoff, George 1965. Irregularity in Syntax. Ph.D. dissertation. Indiana University, Bloomington, IN. Reprinted: New York: Holt, Rinehart & Winston, 1970.
Lakoff, George & John Robert Ross 1972. A note on anaphoric islands and causatives. Linguistic Inquiry 3, 121-125.
Larson, Richard K. 1988. On the double object construction. Linguistic Inquiry 19, 335-391.
Levin, Beth 1985. Lexical semantics in review: An introduction. In: B. Levin (ed.). Lexical Semantics in Review. Cambridge, MA: The MIT Press, 1-62.
Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago, IL: The University of Chicago Press.
Levin, Beth 1995. Approaches to lexical semantic representation. In: D. E. Walker, A. Zampolli & N. Calzolari (eds.). Automating the Lexicon: Research and Practice in a Multilingual Environment. Oxford: Oxford University Press, 53-91.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press.
Lewis, David 1973. Causation. The Journal of Philosophy 70, 556-567.
Maienborn, Claudia 2003. Event-internal modifiers: Semantic underspecification and conceptual interpretation. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: Mouton de Gruyter, 475-509.
Marantz, Alec 1997. No escape from syntax: Don't try morphological analysis in the privacy of your own lexicon. In: A. Dimitriadis, L. Siegel, C. Surek-Clark & A. Williams (eds.). Proceedings of the 21st Annual Penn Linguistics Colloquium. Philadelphia, PA: University of Pennsylvania Press, 201-225.
Mateu, Jaume 2001. Unselected objects. In: N. Dehé & A. Wanner (eds.). Structural Aspects of Semantically Complex Verbs. Frankfurt/M.: Lang, 83-104.
Matthewson, Lisa 2003. Is the meta-language really natural? Theoretical Linguistics 29, 263-274.
McCawley, James D. 1968. Lexical insertion in a Transformational Grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society. Chicago, IL: Chicago Linguistic Society, 71-80.
McCawley, James D. 1971. Prelexical syntax. In: R. J. O'Brien (ed.). Report of the 22nd Annual Round Table Meeting on Linguistics and Language Studies. Washington, DC: Georgetown University Press, 19-33.
McCawley, James D. 1994. Generative Semantics. In: R. E. Asher & J. M. Y. Simpson (eds.). The Encyclopedia of Language and Linguistics. Oxford: Pergamon, 1398-1403.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In: J. Hintikka, J. M. E. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Proceedings of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221-242.
Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1, 49-70.
Postal, Paul M. 1971. On the surface verb 'remind'. In: C. J. Fillmore & D. T. Langendoen (eds.). Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 180-270.
Pustejovsky, James 1988. The geometry of events. In: C. Tenny (ed.). Studies in Generative Approaches to Aspect. Cambridge, MA: The MIT Press, 19-39.
Pustejovsky, James 1991a. The syntax of event structure. Cognition 41, 47-81.
Pustejovsky, James 1991b. The generative lexicon. Computational Linguistics 17, 409-441.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pustejovsky, James & Pierrette Bouillon 1995. Aspectual coercion and logical polysemy. Journal of Semantics 12, 133-162.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press.
Rappaport, Malka 1985. Review of Joshi's "Factorization of verbs". In: B. Levin (ed.). Lexical Semantics in Review. Cambridge, MA: The MIT Press, 137-148.
Rappaport, Malka, Mary Laughren & Beth Levin 1987. Levels of Lexical Representation. Lexicon Project Working Papers 20. Cambridge, MA: MIT.
Rappaport, Malka & Beth Levin 1988. What to do with θ-roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 7-36.
Rappaport, Malka, Beth Levin & Mary Laughren 1993. Levels of lexical representation. In: J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht: Kluwer, 37-54.
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 97-134.
Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274-286.
Reinhart, Tanya 2002. The theta system - an overview. Theoretical Linguistics 28, 229-290.
Riemer, Nick 2006. Reductive paraphrase and meaning: A critique of Wierzbickian semantics. Linguistics & Philosophy 29, 347-379.
Ritter, Elizabeth & Sara Thomas Rosen 1998. Delimiting events in syntax. In: M. Butt & W. Geuder (eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 135-164.
Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 70-126.
Rozwadowska, Bożena 1988. Thematic restrictions on derived nominals. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 147-165.
Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson & Jan Scheffczyk 2006. FrameNet II: Extended Theory and Practice. http://framenet.icsi.berkeley.edu/index.php?option=com_wrapper&Itemid=126. December 30, 2006.
Shibatani, Masayoshi 1976. The grammar of causative constructions: A conspectus. In: M. Shibatani (ed.). The Grammar of Causative Constructions. New York: Academic Press, 1-40.
Stiebels, Barbara 1996. Lexikalische Argumente und Adjunkte. Zum semantischen Beitrag von verbalen Präfixen und Partikeln. Berlin: Akademie Verlag.
Stiebels, Barbara 1998. Complex denominal verbs in German and the morphology-semantics interface. In: G. Booij & J. van Marle (eds.). Yearbook of Morphology 1997. Dordrecht: Kluwer, 265-302.
Stiebels, Barbara 2002. Typologie des Argumentlinkings. Ökonomie und Expressivität. Berlin: Akademie Verlag.
Stiebels, Barbara 2006. From rags to riches. Nominal linking in contrast to verbal linking. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 167-234.
Summers, Della (ed.) 1995. Longman Dictionary of Contemporary English. München: Langenscheidt-Longman.
Taylor, John R. 2000. The network model and the two-level model in comparison. In: B. Peeters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 115-141.
Tenny, Carol 1987. Grammaticalizing Aspect and Affectedness. Ph.D. dissertation. MIT, Cambridge, MA.
Tenny, Carol 1988. The aspectual interface hypothesis: The connection between syntax and lexical semantics. In: C. Tenny (ed.). Studies in Generative Approaches to Aspect. Cambridge, MA: The MIT Press, 1-18.
Travis, Lisa 2000. Event structure in syntax. In: C. Tenny (ed.). Events as Grammatical Objects. The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications, 145-185.
van Valin, Robert D. Jr. 1993. A synopsis of Role and Reference Grammar. In: R. D. van Valin (ed.). Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1-164.
Vendler, Zeno 1957. Verbs and times. The Philosophical Review LXVI, 143-160.
van Voorst, Jan 1988. Event Structure. Amsterdam: Benjamins.
Wierzbicka, Anna 1972. Semantic Primitives. Frankfurt/M.: Athenäum.
Wierzbicka, Anna 1980. Lingua Mentalis. The Semantics of Natural Language. Sydney: Academic Press.
Wierzbicka, Anna 1985.
Lexicography and Conceptual Analysis. Ann Arbor, MI: Karoma.
Wierzbicka, Anna 1987. English Speech Act Verbs. A Semantic Dictionary. Sydney: Academic Press.
Wierzbicka, Anna 1988. The Semantics of Grammar. Amsterdam: Benjamins.
Wierzbicka, Anna 1992. Semantics, Culture, and Cognition. Universal Human Concepts in Culture-Specific Configurations. Oxford: Oxford University Press.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wierzbicka, Anna 1999. Emotions across Languages and Cultures. Cambridge: Cambridge University Press.
Wierzbicka, Anna 2009. The theory of the mental lexicon. In: S. Kempgen et al. (eds.). Die slavischen Sprachen - The Slavic Languages. Ein internationales Handbuch zu ihrer Geschichte, ihrer Struktur und ihrer Erforschung - An International Handbook of their History, their Structure and their Investigation. (HSK 32.1). Berlin: de Gruyter, 848-863.
von Wright, Georg Henrik 1963. Norm and Action. London: Routledge & Kegan Paul.
Wunderlich, Dieter 1991. How do prepositional phrases fit into compositional syntax and semantics? Linguistics 25, 283-331.
Wunderlich, Dieter 1994. Towards a lexicon-based theory of agreement. Theoretical Linguistics 20, 1-35.
Wunderlich, Dieter 1996. Models of lexical decomposition. In: E. Weigand & F. Hundsnurscher (eds.). Lexical Structures and Language Use. Proceedings of the International Conference on Lexicology and Lexical Semantics, Münster, September 13-15, 1994. Vol. 1: Plenary Lectures and Session Papers. Tübingen: Niemeyer, 169-183.
Wunderlich, Dieter 1997a. cause and the structure of verbs. Linguistic Inquiry 28, 27-68.
Wunderlich, Dieter 1997b. Argument extension by lexical adjunction. Journal of Semantics 14, 95-142.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options - a study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich (eds.). Lexicon in Focus. Berlin: Akademie Verlag, 247-270.
Wunderlich, Dieter 2006. Towards a structural typology of verb classes. In: D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 57-166.
Zubizarreta, Maria Luisa & Eunjeong Oh 2007. On the Syntactic Composition of Manner and Motion. Cambridge, MA: The MIT Press.

Stefan Engelberg, Mannheim (Germany)

18. Thematic roles
1. Introduction
2. Historical and terminological remarks
3. The nature of thematic roles
4. Thematic role systems
5. Concluding remarks
6. References

Abstract
Thematic roles provide one way of relating situations to their participants. Thematic roles have been widely invoked both within lexical semantics and in the syntax-semantics interface, in accounts of a wide range of phenomena, most notably the mapping between semantic and syntactic arguments (argument realization). This article addresses two sets of issues. The first concerns the nature of thematic roles in semantic theories: what are thematic roles, are they specific to individual predicates or more general, how do they
figure in semantic representations, and what interactions do they have with event and object individuation and with the semantics of plurality and aspect? The second concerns properties of systems of thematic roles: what is the inventory of thematic roles, and what relationships, such as an ordering of roles in a thematic hierarchy, or the consistency of roles across semantic domains posited by the thematic relations hypothesis, exist among roles? Various applications of thematic roles will be noted throughout these two sections, in some cases briefly mentioning alternative accounts that do not rely on them. The conclusion notes some skepticism about the necessity for thematic roles in linguistic theory.

1. Introduction
Thematic roles provide one way of relating situations to their participants. Somewhat informally, we can paraphrase this by saying that participant x plays role R in situation e. Still more informally, the linguistic expression denoting the participant is said to play that role. Thematic roles have been widely invoked both within lexical semantics and in the syntax-semantics interface, in accounts of a wide range of phenomena, including the mapping between semantic and syntactic arguments (argument realization), controller choice of infinitival complements and anaphors, constraints on relativization, constraints on morphological or syntactic phenomena such as passivization, determinants of telicity and distributivity, patterns of idiom frequencies, and generalizations about the lexicon and lexical acquisition. Automatic role labeling is now an active area of investigation in computational linguistics as well (Màrquez et al. 2008).
After a brief terminological discussion, this article addresses two sets of issues. The first concerns the nature of thematic roles in semantic theories: what are thematic roles, are they specific to individual predicates or more general, how do they figure in semantic representations, and what interactions do they have with event and object individuation and with the semantics of plurality and aspect? The second concerns properties of systems of thematic roles: what is the inventory of thematic roles, and what relationships, such as an ordering of roles in a thematic hierarchy, or the consistency of roles across semantic domains posited by the thematic relations hypothesis, exist among roles? Various applications of thematic roles will be noted throughout these two sections, in some cases briefly mentioning alternative accounts that do not rely on them. The conclusion notes some skepticism about the necessity for thematic roles in linguistic theory.

2. Historical and terminological remarks
Pāṇini's kārakas are frequently noted as forerunners of thematic roles in modern linguistics. Gruber (1965) and Fillmore (1968) are widely credited with initiating the discourse on thematic roles within generative grammar, and research relating to thematic roles has blossomed since the 1980s in conjunction with growing interest in the lexicon and in semantics.
A variety of terms appear in the literature to refer to essentially the same notion of thematic role: thematic relation, theta-role, (deep) case role, and participant role. A distinction between "broad", general roles and "narrow", predicate-specific roles is worth noting as well.
General roles (also termed absolute roles (Schein 2002) or thematic role types (Dowty 1989)) apply to a wide range of predicates or events; they include the well-known Agent, Patient, Theme, Goal, and so on. Predicate-specific roles (also
termed relativized (Schein 2002), individual thematic roles (Dowty 1989), or "relation-specific roles") apply only to a specific event, situation, or predicate type; such roles as Devourer or Explainer are examples. There need not be a sharp distinction between broad and narrow roles; indeed, roles of various degrees of specificity, situated in a subsumption hierarchy, can be posited. In this article the term thematic role will cover both broad and narrow roles, though some authors restrict its use to broad roles.

3. The nature of thematic roles
Thematic roles have been formally defined as relations, functions, and sets of entailments or properties. As Rappaport & Levin (1988: 17) state: "Theta-roles are inherently relational notions; they label relations of arguments to predicators and therefore have no existence independent of predicators." Within model-theoretic semantics, an informal definition like the one that begins this article needs to be made explicit in several respects. First, what are the entities to be related? Are thematic roles present in the semantic representations of verbs and other predicators (i.e., lexical items that take arguments), or added compositionally? Are they best viewed as relations or as more complex constructs such as sets of entailments, bundles of features, or merely epiphenomena defined in terms of other linguistic representations that need not be reified at all?

3.1. Defining thematic roles in model-theoretic semantics
Attempts to characterize thematic roles explicitly within model-theoretic semantics begin with works such as Chierchia (1984) and Carlson (1984). Both Chierchia and Carlson treat thematic roles as relations between an event and a participant in it. Dowty (1989: 80) provides the following version of Chierchia's formulation. Events are regarded as tuples of individuals, and thematic roles are therefore defined thus:

(1) A θ-role θ is a partial function from the set of events into the set of individuals such that for any event k, if θ(k) is defined, then θ(k) ∈ k.

For example, given (2a), an event tuple in which ∧kill′ is the intension of the verb kill and x is the killer of y, the functions Agent and Patient yield the values in (2b) and (2c):

(2) a. ⟨∧kill′, x, y⟩
    b. Agent(⟨∧kill′, x, y⟩) = x
    c. Patient(⟨∧kill′, x, y⟩) = y

This shows how thematic roles can be defined within an ordered argument system for representing event types. The roles are distinguished by the order of arguments of the predicate, though there is no expectation that the same roles appear in the same order for all predicates (some predicates may not have an Agent role at all, for example).
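The partiality of the functions in (1) can be pictured with a small sketch. The following is our own Python encoding of the idea, not Chierchia's or Dowty's formalism; events are tuples and a role simply picks out one (predicate-relative) coordinate when it exists:

    # A minimal sketch of roles as partial functions over event tuples,
    # following (1)-(2); the encoding and names below are ours.

    def make_role(position):
        def role(event):
            # Partiality: undefined (None) if the event lacks the coordinate.
            return event[position] if len(event) > position else None
        return role

    # For kill-events <kill', x, y>: Agent is the second coordinate,
    # Patient the third. The choice of coordinate is predicate-relative.
    agent, patient = make_role(1), make_role(2)

    e = ("kill'", "x", "y")
    print(agent(e), patient(e))          # x y  -- cf. (2b) and (2c)
    print(patient(("arrive'", "x")))     # None: no such participant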
Another representation, extending Davidson's (1967) insights on the representation of events, is the neo-Davidsonian one, in which thematic roles appear explicitly as relations between events and participants; see article 34 (Maienborn) Event semantics. The roles are labeled, but not ordered with respect to one another. In one form of such a representation, the equivalent of (2) would be (3):

(3) ∃e [killing(e) & Agent(e, x) & Patient(e, y)]

Here, thematic roles are binary predicates, taking an eventuality (event or state) as first argument and a participant in that eventuality as second argument. For a discussion of other possibilities for the type of the first argument, see Bayer (1997: 24-28). Furthermore, as Bayer (1997: 5) notes, there are two options for indexing arguments in a neo-Davidsonian representation: lexical and compositional. In the former, used, e.g., in Landman (2000), the lexical entry for a verb includes the thematic roles that connect the verb and its arguments. This is also effectively the analysis implicit in structured lexical semantic representations outside model-theoretic semantics, such as Jackendoff (1987, 1990), Rappaport & Levin (1988), Foley & van Valin (1984), and van Valin (2004). But it is also possible to pursue a compositional approach, in which the lexical entry contains only the event type, with thematic roles linking the arguments through some other process, as does Krifka (1992, 1998), who integrates the thematic role assignments of the verb's arguments through its subcategorization. Bayer (1997: 127-132) points out some difficulties with Krifka's system, including coordination of VPs that assign different roles to a shared NP.
A position intermediate between the lexical and compositional views is advocated by Kratzer (1996), who claims that what has been regarded as a verb's external argument is in fact not an argument at all, although its remaining arguments are present in its lexical entry. Thus the representation of kill in (3) would lack the clause Agent(e, x), this role being assigned by a VoiceP (or "little v") above VP. This, Kratzer argues, accounts for subject/object asymmetries in idiom frequencies and the lack of a true overt agent argument in gerunds. Svenonius (2007) extends Kratzer's analysis to adpositions. However, Wechsler (2005) casts doubt on Kratzer's prediction of idiom asymmetries and notes that some mechanism must still select which role is external to the verb's lexical entry, particularly in cases where there is more than one agentive participant, such as a commercial transaction involving both a buyer and a seller.
Characterizing a thematic role as a partial function leaves open the question of how to determine the function's domain and range; that is, what events is a role defined for, and which participant does the role pick out? In practice, this problem of determining which roles are appropriate for a predicate plagues every system of broad thematic roles, but it is less serious for roles defined on individual predicates. Dowty (1989: 76) defines the latter in terms of a set of entailments, as in (4):

(4) Given an n-place predicate δ and a particular argument xᵢ, the individual thematic role ⟨δ, i⟩ is the set of all properties α such that the entailment □[δ(x₁, ..., xᵢ, ..., xₙ) → α(xᵢ)] holds.

What counts as a "property" must be specified, of course. And as Bayer (1997: 119-120) points out, nothing in this kind of definition tells us which of these properties are important for, say, argument realization. Formulating such cross-lexicon generalizations demands properties or relations that are shared across predicates. Dowty defines cross-predicate roles, or thematic role types, as he terms them, as "the intersection of all the individual thematic roles" (Dowty 1989: 77).
Therefore, as stated by Dowty (1991: 552): "From the semantic point of view, the most general notion of thematic role (type) is A SET OF ENTAILMENTS OF A GROUP OF PREDICATES WITH RESPECT TO ONE OF THE ARGUMENTS OF EACH. (Thus a thematic role type is a kind of second-order
property, a property of multiplace predicates indexed by their argument positions.)." Thematic role types are numerous; however, he argues, linguists will generally be interested in identifying a fairly small set of these that play some vital role in linguistic theory.
This definition of thematic role types imposes no restrictions on whether an argument can bear multiple roles to a predicate, whether roles can share entailments in their defining sets and thus overlap or subsume one another, whether two arguments of a predicate can bear the same role (though this is ruled out by the functional requirement in (1)), and whether every argument must be assigned a role. These constraints, if desired, must be independently specified (Dowty 1989: 78-79). As the discussion of thematic role systems below indicates, various models answer these questions differently. Whether suitable entailments for linguistically significant broad thematic roles can be found at all is a further issue that leads Rappaport & Levin (1988, 2005), Dowty (1991), Wechsler (1995), Croft (1991, 1998), and many others to doubt the utility of positing such roles at all.
Finally, note that definitions such as (1) and (4) above, or Dowty's thematic role types, make no reference to morphological, lexical, or syntactic notions. Thus they are agnostic as to which morphemes, words, or constituents have thematic roles associated with them; that depends on the semantics assigned to these linguistic entities and whether roles are defined only for event types or over a broader range of situation types. Verbs are the prototypical bearers of thematic roles, but nominalizations are typically viewed as role-bearing, and other nouns, adjectives, and adpositions (Gawron 1983, Wechsler 1995, Svenonius 2007), as predicators denoting event or situation types, will have roles associated with them, too.

3.2. Thematic role uniqueness
Many researchers invoking thematic roles have explicitly or implicitly adopted a criterion of thematic role uniqueness. Informally, this means that only one participant in a situation bears a given role. Carlson (1984: 271) states this constraint at a lexical level: "one of the more fundamental constraints is that of 'thematic uniqueness' - that no verb seems to be able to assign the same thematic role to two or more of its arguments." This echoes the θ-criterion of Chomsky (1981), discussed below. Parsons (1990: 74) defines thematic uniqueness with respect to events and their participants: "No event stands in one of these relations to more than one thing." Note that successfully connecting the lexical-level constraint and the event-level constraints requires the Davidsonian assumption that there is a single event variable in the semantic representation of a predicator, as Carlson (1998: 40) points out. In some models of lexical representation (e.g., Jackendoff's lexical decomposition analyses and related work), this is not the case, as various subevents are represented, each of which can have a set of roles associated with it.
The motivations for role uniqueness are varied. One is that it simplifies accounts of mapping from thematic roles to syntactic arguments of predicators (which are also typically regarded as unique). It is implicit in hypotheses such as the Universal Alignment Hypothesis (Rosen 1984) and the Uniformity of Theta Assignment Hypothesis (Baker 1988, 1997), described in greater detail in the section on argument realization below.
Role uniqueness also provides a tool to distinguish situations - if two different individuals appear to bear the same role, then there must be two distinct situations involved (Landman 2000: 39). A further motivation, emphasized in the work of Krifka (1992, 1998), is that role uniqueness and related conditions are crucial in accounting for the semantics of events in which a participant is incrementally consumed, created, or traversed.

Role uniqueness does not apply straightforwardly to all types of situations. Krifka (1998: 209) points out that a simple definition of uniqueness - "it should not be the case that one and the same event has different participants" - is "problematic for see and touch". If one sees or touches an orange, for example, then one also typically sees or touches the peel of the orange. But the orange and its peel are distinct; thus the seeing or touching event appears to have multiple entities bearing the same role.

A corollary of role uniqueness is exhaustivity, the requirement that whatever bears a given thematic role to a situation is the only thing bearing that role. Thus if some group is designated as the Agent of an event, then there can be no larger group that is also the Agent of that same event. Exhaustivity is equivalent to role uniqueness under the condition that only one role may be assigned to an individual; as Schein (2002: 272) notes, however, if thematic roles are allowed to be "complex" - that is, if multiple roles can be assigned conjunctively to individuals - then exhaustivity is a weaker constraint than uniqueness.
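The uniqueness condition just discussed can be stated as a check over a toy event model. The sketch below is an invented illustration (the events, roles, and participants are not from the literature): role assignments are a relation between events, roles, and participants, in the spirit of Parsons' (1990) event-level formulation, and the see/touch problem shows up as a uniqueness failure.

```python
# A toy model of role assignments: a set of (event, role, participant)
# triples, with a check for Parsons-style role uniqueness. Data invented.

ASSIGNMENTS = {
    ("e1", "Agent", "kim"),
    ("e1", "Patient", "orange"),
    ("e1", "Patient", "orange_peel"),   # the Krifka see/touch problem
}

def bearers(event, role):
    """All participants bearing `role` in `event`."""
    return {x for (e, r, x) in ASSIGNMENTS if e == event and r == role}

def role_unique(event, role):
    """Parsons (1990: 74): no event stands in one of these relations
    to more than one thing."""
    return len(bearers(event, role)) <= 1

print(role_unique("e1", "Agent"))    # True
print(role_unique("e1", "Patient"))  # False: the orange and its peel both qualify
```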
3.3. How fine-grained should thematic roles be?

As noted in the introduction, thematic roles in the broad sense are frequently postulated as crucial mechanisms in accounts of phenomena at the syntax-semantics interface, such as argument realization, anaphoric binding, and controller choice. For thematic roles to serve many of these uses, they must be sufficiently broad, in the sense noted above, that they can be used in stating generalizations covering classes of lexical items or constructions. To be useful in this sense, therefore, a thematic role should be definable over a broad class of situation types. Bach (1989: 111) articulates this view: "Thematic roles seem to represent generalizations that we make across different kinds of happenings in the world about the participation of individuals in the eventualities that the various sentences are about." Schein (2002: 265) notes that if "syntactic positions [of arguments given their thematic roles] are predictable, we can explain the course of acquisition and our understanding of novel verbs and of familiar verbs in novel contexts."

Within model-theoretic semantics, two types of critiques have been directed at the plausibility of broad thematic roles. One is based on the difficulty of formulating definitional criteria for such roles, arguing that after years of effort, no rigorous definitions have emerged. The other critique examines the logical implications of positing such roles, bringing out difficulties for semantic representations relying on broad thematic roles given basic assumptions, such as role uniqueness.

It is clear that the Runner role of run and the Jogger role of jog share significant entailments (legs in motion, body capable of moving along a path, and so on). Indeed, to maintain that these two roles are distinct merely because there are two distinct verbs run and jog in English seemingly determines a semantic issue on the basis of a language-particular lexical accident. But no attempt to provide necessary, sufficient, and comprehensive criteria for classifying most or all of the roles of individual predicates into a few broad roles has yet met with consensus. This problem has been noted by numerous researchers; see, for example, Dowty (1991: 553-555), Croft (1991: 155-158), Wechsler (1995: 9-11), and Levin & Rappaport Hovav (2005: 38-41) for discussion. Theme in particular is notoriously defined in vague and differing ways, but it is not an isolated case; many partially overlapping criteria for Agenthood have been proposed, including volitionality, causal involvement, and control or initiation of an event; see Kiparsky (1997: 476) for a remark on cross-linguistic variation in this regard and Wechsler (2005: 187-193) for a proposal on how to represent various kinds of agentive involvement.

Moreover, there are arguments against the possibility of broad roles even for situations that seem semantically quite close, in cases where two predicators would appear to denote the same situation type. The classic case of this, examined from various perspectives by numerous authors, involves the verbs buy and sell. As Parsons (1990: 84) argues, given the two descriptions of a commercial transaction in (5):

(5) a. Kim bought a tricycle from Sheehan.
    b. Sheehan sold a tricycle to Kim.

and the assumptions that "Kim is the Agent of the buying, and Sheehan is the Agent of the selling", then to "insist that the buying and the selling are one and the same event, differently described" entails that "Kim sold a tricycle to Kim, and Sheehan bought a tricycle from Sheehan." Parsons concludes that the two sentences must therefore describe different, though "intimately related", events; see article 29 (Gawron) Frame Semantics. Landman (2000: 31-33) and Schein (2002) make similar arguments, the import of which is that one must either individuate fine-grained event types or distinguish thematic roles in a fine-grained fashion.

Schein (2002) suggests, as a possible alternative to fine-grained event distinctions, a ternary view of thematic roles, in which the third argument is essentially an index to the predicate, as in (6), rather than the dyadic thematic roles in, e.g., (3):

(6) Agent(e, x, kill)

The Agent role of kill is thereby distinguished from the Agent of murder, throw, explain, and so on. This strategy, applied to buy and sell, blocks the invalid inference above. However, this would allow the valid inference from one sentence in (5) to the other only if we separately ensure that the Agent of buy corresponds to the Recipient of sell, and vice versa.

Moreover, other problems remain even under this relativized view of thematic roles, in particular concerning symmetric predicates and the involvement of parts of individuals or of groups of individuals in events: whether they are to be assigned thematic roles, and whether faulty inferences would result if they are. Assuming role uniqueness (or exhaustivity), and drawing a distinction between an individual car and the group of individuals constituting its parts, one is seemingly forced to conclude that (7a) and (7b) describe different events (Carlson 1984, Schein 2002). The issue is even plainer in (8), as the skin of an apple is not the same as the entire apple.

(7) a. I weighed the Volvo.
    b. I weighed (all) the parts of the Volvo.

(8) a. Kim washed the apple.
    b. Kim washed the skin of the apple.
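The effect of Schein's predicate-indexed roles in (6) on the buy/sell inference can be illustrated with a small sketch (all names invented; this is one way of rendering the idea, not Schein's own formalism). With dyadic roles, identifying the buying and the selling as one event licenses the unwanted conclusion; adding the predicate index keeps the two Agent assignments apart.

```python
# Dyadic roles: Agent(e, x). If the buying and selling are the same event e,
# both Kim and Sheehan end up as Agents of e, licensing "Sheehan bought ..."
dyadic = {("e", "Agent", "kim"), ("e", "Agent", "sheehan")}

# Ternary, predicate-indexed roles as in (6): Agent(e, x, V).
ternary = {("e", "Agent", "kim", "buy"), ("e", "Agent", "sheehan", "sell")}

def dyadic_agents(event, facts):
    return {x for (e, r, x) in facts if e == event and r == "Agent"}

def ternary_agents(event, verb, facts):
    return {x for (e, r, x, v) in facts
            if e == event and r == "Agent" and v == verb}

print(dyadic_agents("e", dyadic))           # {'kim', 'sheehan'}: the bad inference
print(ternary_agents("e", "buy", ternary))  # {'kim'}: indexing blocks it
print(ternary_agents("e", "sell", ternary)) # {'sheehan'}
```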
Similar problems arise with symmetric predicators such as face or border, as illustrated in (9) (based on Schein 2002) and (10).

(9) a. The Carnegie Deli faces Carnegie Hall.
    b. Carnegie Hall faces the Carnegie Deli.

(10) a. Rwanda borders Burundi.
     b. Burundi borders Rwanda.

If two distinct roles are ascribed to the arguments in these sentences, and the mapping of roles to syntactic positions is consistent, then the two sentences in each pair must describe distinct situations. Schein remarks that the strategy of employing fine-grained roles indexed to a predicator fails for cases like (7) through (10), since the verb in each pair is the same. Noting that fine-grained events seem to be required regardless of the availability of fine-grained roles, Schein (2002) addresses these issues by introducing additional machinery into semantic representations, including fine-grained scenes as perspectives on events, to preserve absolute (broad) thematic roles. Each sentence in (9) or (10) involves a different scene, even though the situation described by the two sentences is the same. "In short, scenes are fine-grained, events are coarser, and sentences rely on (thematic) relations to scenes to convey what they have to say about events." (Schein 2002: 279). Similarly, the difficulties posed by (7) and (8) are dealt with through "a notion of resolution to distinguish a scene fine-grained enough to resolve the Volvo's parts from one which only resolves the whole Volvo." (Schein 2002: 279).

3.4. Thematic roles and plurality

Researchers with an interest in the semantics of plurals have studied the interaction of plurality and thematic roles. Although Carlson (1984: 275) claims that "it appears to be necessary to countenance groups or sets as being able to play thematic roles", he does not explore the issues in depth. Landman (2000: 167), contrasting collective and distributive uses of plural subjects, argues that "distributive predication is not an instance of thematic predication." Thus, in the distributive reading of a sentence like The boys sing., the semantic properties of Agents hold only of the individual boys, so "no thematic implication concerning the sum of the boys itself follows." For "on the distributive interpretation, not a single property that you might want to single out as part of agenthood is predicated of the denotation of the boys" (Landman 2000: 169). Therefore, Landman argues, "the subject the boys does not fill a thematic role" of sing, but rather a "non-thematic role" that is derived from the role that sing assigns to an individual subject. He then develops a theory of plural roles, derived from singular roles (which are defined only on atomic events), as follows (Landman 2000: 184), where E is the domain of events (both singular and plural):

(11) If e is an event in E, and for every atomic part a of e, thematic role R is defined for a, then plural role *R is defined for e, and maps e onto the sum of the R-values of the atomic parts of e.

Thus *R subsumes R (if e is itself atomic then R is defined for e).
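Landman's definition in (11) lends itself to a direct sketch. In the fragment below (an invented illustration; the particular events and values are not from Landman), a plural event is encoded as a set of atomic events and a sum of individuals as a frozenset, so the plural role *R maps a plural event onto the sum of the R-values of its atomic parts.

```python
# A sketch of Landman's (2000: 184) plural roles. A plural event is modeled
# as a frozenset of atomic events; a sum of individuals as a frozenset of
# individuals. All particular events and values are invented.

# Singular role R: defined only on atomic events.
R = {"sing1": "john", "sing2": "bill", "sing3": "tom"}

def star_R(plural_event):
    """Plural role *R: defined for e iff R is defined for every atomic
    part of e; maps e onto the sum of the R-values of those parts."""
    if not all(a in R for a in plural_event):
        return None                           # *R undefined for this event
    return frozenset(R[a] for a in plural_event)

e = frozenset({"sing1", "sing2", "sing3"})    # the plural singing event
print(star_R(e))                              # frozenset({'john', 'bill', 'tom'})
print(star_R(frozenset({"sing1"})))           # atomic case: *R subsumes R
```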
Although Landman does not directly address the issue, his treatment of distributive readings and plural roles seems to imply that a role-based theory of argument realization - indeed, any theory of argument realization based on entailments of singular arguments - must be modified to extend to distributive plural readings.

3.5. Thematic roles and aspectual phenomena

Krifka (1992, 1998) explores the ways in which entailments associated with participants can account for aspectual phenomena. In particular, he aims to make precise the notion of incremental theme and of the relationships of incremental participants to events and subevents, formalizing properties similar to some of the proto-role properties of Dowty (1991). Krifka (1998: 211-212) defines some of these properties as follows, where ⊕P and ⊕E denote mereological sums of objects and of events, ≤P and ≤E the corresponding part and subevent relations, and ∃!x means that there exists a unique entity x of which the property following x holds:

(12) a. A role θ shows uniqueness of participants, UP(θ), iff:
        θ(x, e) & θ(y, e) → x = y
     b. A role θ is cumulative, CUM(θ), iff:
        θ(x, e) & θ(y, e') → θ(x ⊕P y, e ⊕E e')
     c. A role θ shows uniqueness of events, UE(θ), iff:
        θ(x, e) & y ≤P x → ∃!e'[e' ≤E e & θ(y, e')]
     d. A role θ shows uniqueness of objects, UO(θ), iff:
        θ(x, e) & e' ≤E e → ∃!y[y ≤P x & θ(y, e')]

The first of these is a statement of role uniqueness, and the second is a weak property that holds of a broad range of participant roles and somewhat resembles Landman's definition of plural roles in (11). The remaining two are ingredients of incremental participant roles, including those borne by entities gradually consumed or created in an event. Krifka's analysis treats thematic roles as the interface between aspectual characteristics of events, such as telicity, and the mereological structure of entities (parts and plurals). In the case of event types with incremental roles, the role's properties establish a homomorphism between parts of the participant and subevents of the event. Slightly different properties are required for a parallel analysis of objects in motion and the paths they traverse.
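Krifka's properties in (12) can be checked mechanically over a finite model. The sketch below is an invented illustration: parts of objects and events are modeled as frozensets of atoms (so mereological sum is union and the part relation is subset), a role is a set of (object, event) pairs, and the functions test UP and CUM directly against the definitions; UE and UO could be checked analogously.

```python
from itertools import product

# Finite-model sketch of Krifka's (1998) UP and CUM from (12).
# Objects and events are frozensets of atoms, so:
#   sum (⊕) is union, and part-of (≤) is the subset relation.
# The role THETA is a set of (object, event) pairs. All data invented.

apple_half1, apple_half2 = frozenset({"a1"}), frozenset({"a2"})
eat_sub1, eat_sub2 = frozenset({"e1"}), frozenset({"e2"})

THETA = {
    (apple_half1, eat_sub1),
    (apple_half2, eat_sub2),
    (apple_half1 | apple_half2, eat_sub1 | eat_sub2),  # closure under sums
}

def UP(theta):
    """(12a): θ(x,e) & θ(y,e) → x = y."""
    return all(x == y
               for (x, e), (y, e2) in product(theta, theta) if e == e2)

def CUM(theta):
    """(12b): θ(x,e) & θ(y,e') → θ(x ⊕ y, e ⊕ e')."""
    return all((x | y, e | e2) in theta
               for (x, e), (y, e2) in product(theta, theta))

print(UP(THETA))   # True: each subevent has a unique participant
print(CUM(THETA))  # True: the role is closed under sums
```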
4. Thematic role systems

This section examines some proposed systems of thematic roles; that is, inventories of roles and relationships among them, if any. One simple version of such a system is an unorganized set of broad roles, such as Agent, Instrument, Experiencer, Theme, Patient, etc., each assumed to be atomic, primitive, and independent of the others. In other systems, roles may be treated as non-atomic, being defined in terms of more basic features, derived from positional criteria within structured semantic representations, or situated within a hierarchy of roles and subroles (for example, the Location role might have Source and Goal subroles). Dependencies among roles are also posited; it is reasonable to claim that predicates allowing an Instrument role should also require an Agent role, or that Source or Goal are meaningless without a Theme in motion from one to the other. Another type of dependency among roles is a thematic hierarchy, a global ordering of roles in terms of their prominence, as reflected in, for instance, argument realization or anaphoric binding phenomena. These applications of thematic roles at the syntax-semantics interface are examined at the end of this section.

4.1. Lists of primitive thematic roles

Fillmore (1968) was one of the earliest in the generative tradition to present a system of thematic roles (which he terms "deep cases"). This foreshadows the hypotheses of later researchers about argument realization, such as the Universal Alignment Hypothesis (Perlmutter & Postal 1984, Rosen 1984) and the Uniformity of Theta Assignment Hypothesis (Baker 1988, 1997), discussed in the section on argument realization below. Fillmore's cases are intended as an account of argument realization at deep structure, and are, at least implicitly, ranked. A version of role uniqueness is also assumed, and roles are treated as atomic and independent of one another.

A simple use of thematic roles that employs a similar conception of them is to ensure a one-to-one mapping from semantic roles of a predicate to its syntactic arguments. This is exemplified by the θ-criterion, a statement of thematic uniqueness formulated by Chomsky (1981: 36), some version of which is assumed in most syntactic research in Government and Binding, Principles and Parameters, early Lexical-Functional Grammar, and related frameworks.

(13) Each argument bears one and only one θ-role, and each θ-role is assigned to one and only one argument.

This use of thematic roles as "OK marks", in Carlson's (1984) words, makes no commitments as to the semantic content of θ-roles, nor to the ultimate syntactic effects. As Ladusaw & Dowty (1988: 62) remark, "the θ-criterion and θ-roles are a principally diacritic theory: what is crucial in their use in the core of GB is whether an argument is assigned a θ-role or not, which limits possible structures and thereby constrains the applications of rules." The θ-criterion as stated above is typically understood to apply to coreference chains; thus "each chain is assigned a θ-role" and "the θ-criterion must be reformulated in the obvious way in terms of chains and their members" (Chomsky 1982: 5-6).

Other researchers have continued along the lines suggested by Fillmore (1968), furnishing each lexical item with a list of labeled thematic roles. These representations are typically abstracted away from the issue of whether roles should be represented in a Davidsonian fashion; an example would look like (14), where the verb cut is represented as requiring the two roles Agent and Patient and allowing an optional Instrument:

(14) cut (Agent, Patient, (Instrument))

This is one version of the "θ-grid" in the GB/P&P framework. Although the representation in (14) uses broad roles, the granularity of roles is an independent issue. For an extensive discussion of such models, see chapter 2 of Levin & Rappaport Hovav (2005).
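The θ-criterion in (13) amounts to requiring a bijection between a predicate's θ-grid and its syntactic arguments. A minimal sketch (the example data are invented, not from any of the cited works) might check this as follows.

```python
# A sketch of the θ-criterion (13): every argument bears exactly one θ-role,
# and every θ-role of the grid goes to exactly one argument, i.e. the
# assignment is a bijection between arguments and roles. Data invented.

def satisfies_theta_criterion(theta_grid, assignment):
    """assignment: dict mapping syntactic arguments to θ-roles."""
    roles_used = set(assignment.values())
    each_role_one_arg = len(roles_used) == len(assignment)
    all_roles_assigned = roles_used == set(theta_grid)
    return each_role_one_arg and all_roles_assigned

grid = ["Agent", "Patient"]                      # cut, without the Instrument
ok = {"NP_subj": "Agent", "NP_obj": "Patient"}
bad = {"NP_subj": "Agent", "NP_obj": "Agent"}    # one role, two arguments

print(satisfies_theta_criterion(grid, ok))   # True
print(satisfies_theta_criterion(grid, bad))  # False
```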
They note several problems with thematic list approaches: difficulties in determining which roles to assign to "symmetric" predicates like face and border, whose two arguments seemingly bear the same role; "the assumption that semantic roles are taken to be discrete and unanalyzable" (Levin & Rappaport Hovav 2005: 42), rather than exhibiting relations and dependencies amongst themselves; and the failure to account for restrictions on the range of possible case frames (Davis & Koenig 2000: 59-60).

An additional feature of some thematic list representations is the designation of one argument as the external argument. Belletti & Rizzi (1988: 344), for example, annotate this external argument, if present, by italicizing it; in (15a) the Experiencer is the external argument, while in (15b) lexically specified case determines argument realization and there is no external argument.

(15) a. temere ('fear') (Experiencer, Theme)
     b. preoccupare ('worry') (Experiencer, Theme)

This leaves open the question of how the external argument is to be selected; Belletti & Rizzi address this issue only in passing, suggesting the possibility that a thematic hierarchy is involved.

4.2. Thematic hierarchies

It is not an accident that thematic list representations, though ostensibly consisting of unordered roles, typically list the Agent role first if it is present. Apart from the intuitive sense of Agents being the most "prominent", the widespread use of thematic role lists in argument realization leads naturally to a ranking of thematic roles in a thematic hierarchy, in which prominence on the hierarchy corresponds to syntactic prominence, whether configurationally or in terms of grammatical functions.

As Levin & Rappaport Hovav (2005: chapters 5 and 6) make clear, there are several distinct bases on which a thematic hierarchy might be motivated, independently of its usefulness in argument realization: prominence in lexical semantic representations (Jackendoff 1987, 1990), event structure, especially causal structure (Croft 1991, Wunderlich 1997), and topicality or salience (Fillmore 1977). However, many authors motivate a hierarchy primarily by argument realization, adopting some version of a correspondence principle to ensure that the thematic roles defined for a predicate are mapped to syntactic positions (or grammatical functions) in prominence order.

The canonical thematic hierarchy is a total ordering of all the thematic roles in a theory's inventory (many hierarchies do not distinguish an ordering among Source, Goal, and Location, however). While numerous variants have been proposed - Levin & Rappaport Hovav (2005: 162-163) list over a dozen - they agree on ranking Agent/Actor topmost, Theme and/or Patient near the bottom, and Instrument between them. As with any model invoking broad thematic roles, thematic hierarchy approaches face the difficulty of defining the roles they use, and of addressing the classification of roles of individual verbs that do not fit well. A thematic hierarchy of fine-grained roles would face the twin drawbacks of a large number of possible orderings and a lack of evidence for establishing a relative ranking of many roles.

Some researchers emphasize that the sole function of the thematic hierarchy is to order the roles; the role labels themselves are invisible to syntax and morphology, which have access only to the ordered list of arguments. Thus Grimshaw (1990: 10) states that though she will "use thematic role labels to identify arguments ... the theory gives no status to this information."
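A correspondence principle of the kind just described can be sketched directly: rank a predicate's role list by a thematic hierarchy and map the ranked roles to syntactic positions in prominence order. The hierarchy below is an arbitrary illustrative choice among the many variants mentioned above; nothing hangs on its details.

```python
# A sketch of hierarchy-driven argument realization: roles are ranked by a
# simplified thematic hierarchy and mapped to syntactic positions in
# prominence order. The particular hierarchy is an arbitrary illustrative
# choice among the many variants in the literature.

HIERARCHY = ["Agent", "Experiencer", "Instrument", "Theme", "Patient"]
POSITIONS = ["subject", "object", "oblique"]

def realize(role_list):
    """Map a predicate's roles to positions in prominence order."""
    ranked = sorted(role_list, key=HIERARCHY.index)
    return dict(zip(POSITIONS, ranked))

print(realize(["Patient", "Agent"]))
# {'subject': 'Agent', 'object': 'Patient'}    e.g. Kim broke the lock
print(realize(["Instrument", "Patient"]))
# {'subject': 'Instrument', 'object': 'Patient'}    e.g. The key opened the lock
```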
Williams (1994) advocates a similar view of the visibility of roles at the syntactic level. Wunderlich (1997) develops an approach that is similar in this regard, where depth of embedding in a lexical semantic structure determines the prominence of semantic arguments.

It is worth noting that these kinds of rankings can be achieved by means other than a thematic hierarchy, or even thematic roles altogether. Rappaport & Levin's (1988) predicate argument structures consist of an ordered list of argument variables derived from lexical conceptual structures like those in (17) below, but again no role information is present. Fillmore (1977) provides a saliency hierarchy of criteria for ascertaining the relative rank of two arguments; these include active elements outranking inactive ones, causal arguments outranking noncausal ones, and changed arguments outranking unchanged ones. Gawron (1983) also makes use of this system in his model of argument realization. Wechsler (1995) similarly relies on entailments between pairs of participants, such as one having a notion of the other, to determine a partial ordering amongst arguments. Dowty's (1991) widely cited system of comparing numbers of proto-agent and proto-patient entailments is a related though distinct approach, discussed in greater detail below. Finally, Grimshaw (1990) posits an aspectual hierarchy in addition to a thematic hierarchy, on which causers outrank other elements. Li (1995) likewise argues for a causation-based hierarchy independent of a thematic hierarchy, based on argument realization in Mandarin Chinese resultative compounds. Primus (1999, 2006) presents a system of two role hierarchies, corresponding to involvement and causal dependency: "Morphosyntactic linking, i.e., case in the broader sense, corresponds primarily to the degree and kind of involvement ... while structural linking responds to semantic dependency" (Primus 2006: 54). These multiple-hierarchy systems bear an affinity to systems that assign multiple roles to participants and to the structured semantic representations of Jackendoff, discussed in the following section.

4.3. Multiple role assignment

Thematic role lists and hierarchies generally assume some principle of thematic uniqueness. However, there are various arguments for assigning multiple roles to a single argument of a predicator, as well as cases where it is hard to distinguish the roles of two arguments. Symmetric predicates such as those in (9) and (10) exemplify the latter situation. As for the former, Jackendoff (1987: 381-382) suggests buy, sell, and chase as verbs with arguments bearing more than one role, and any language with morphologically productive causative verbs furnishes examples such as 'cause to laugh', in which the laugher exhibits both Agent and Patient characteristics. Williams (1994) argues that the subject of a small clause construction like John arrived sad. is best analyzed as bearing two roles, one from each predicate.

Broadwell (1988: 123) offers another type of evidence for multiple role assignment in Choctaw (a Muskogean language of the southeastern U.S.). Some verbs have suppletive forms for certain persons and numbers, and "I [= subject] agreement is tied to the θ-roles Agent, Effector, Experiencer, and Source/Goal. Since the Choctaw verb 'arrive' triggers I agreement, its subject must bear one of these θ-roles. But I have also argued that suppletion is tied to the Theme, and so the subject must bear the role Theme." Similarly, one auxiliary is selected if the subject is a Theme, another otherwise, and arrive in Choctaw selects the Theme-subject auxiliary. Cases like these have led many researchers to pursue representations in which roles are derived, not primitive.
Defining roles in terms of features, or in terms of positions in lexical representations based on semantic decomposition, can more readily accommodate those predicators that appear to violate role uniqueness within a system of broad thematic roles.

4.4. Structural and featural analyses of thematic roles

Many researchers, perhaps dissatisfied with the seemingly arbitrary set of descriptions of the meanings of thematic roles such as Agent, Patient, Goal, and, notoriously, Theme, have sought organizing principles that would characterize the range of broad thematic roles. These efforts can be loosely divided into two somewhat overlapping types. The first might be called structural or relational, situating roles within structures typically representing lexical entries, including properties of the event type they denote. The second approach is featural, analyzing roles in terms of another set of more fundamental features that provides some structure for the set of possible roles. Both approaches are compatible with viewing thematic roles as sets of entailments. Under the structural approach, the entailments are associated with positions in a lexical or event structure, while under the featural approach, the entailments can be regarded as features. Both approaches are also compatible with thematic hierarchies or other prominence schemes. However, a notion of prominence within event structures can do the work of a thematic hierarchy in argument selection within a structural approach, and prominence relations between features can do the same within a featural approach.

While Fillmore's cases are presented as an unstructured list, Gruber (1965) and Jackendoff (1983) develop a model in which the relationships between entities in motion and location situations provide an inventory of roles: Theme, Source, Goal, and Location. Through analogy, these extend to a wide range of semantic domains, as phrased by Jackendoff (1983: 188):

(16) Thematic Relations Hypothesis
In any semantic field of [events] and [states], the principal event-, state-, path-, and place-functions are a subset of those used for the analysis of spatial location and motion. Fields differ in only three possible ways:
a. what sorts of entities may appear as theme;
b. what sorts of entities may appear as reference objects;
c. what kind of relation assumes the role played by location in the field of spatial expressions.

This illustrates one form of lexical decomposition (see article 17 (Engelberg) Frameworks of decomposition), allowing thematic roles to be defined in terms of their positions within the structures representing the semantics of lexical items. Such an approach is compatible with the entailment-based treatments of thematic roles discussed above, but developing a model-theoretic interpretation of these structures that would facilitate this has not been a priority for those advocating this kind of analysis. However, lexical decomposition does reflect the internal complexity of natural language predicators. For example, causative verbs are plausibly analyzed as denoting two situation types standing in a causal relationship to one another, and transactional verbs such as buy, sell, and rent as denoting two oppositely directed transfers. The Lexical Conceptual Structures of Rappaport & Levin (1988) (see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) illustrate decomposition into multiple subevents for the two alternants of "spray/load" verbs; the LCS for the Theme-object alternant is shown in (17a) and that of the Location-object alternant in (17b), in which the LCS of the former is embedded as a substructure:

(17) a. [x cause [y to come to be at z]]
     b. [x cause [z to come to be in STATE] BY MEANS OF [x cause [y to come to be at z]]]

As noted above, in such representations, thematic roles are not primitive, but derived notions. And because LCSs - or their counterparts in other models - can be embedded as in (17b), there may be multiple occurrences of the same role type within the representation of a single predicate. A causative or transactional verb can then be regarded as having two Agents, one in each subevent. Furthermore, participants in more than one subevent can accordingly be assigned more than one role; thus the buyer and seller in a transaction are at once Agents and Recipients (or Goals). This leads to a view of thematic roles that departs from the formulations of role uniqueness developed within an analysis of predicators as denoting a unitary, undecomposed event, though uniqueness can still be postulated for each subevent in a decompositional representation. It also can capture some of the dependencies among roles; for example, Jackendoff (1987: 398-402) analyzes the Instrument role in terms of conceptual structures representing intermediate causation, and this role exists only in relation to others in the chain of causation.

Croft (1991, 1998) has taken causation as the fundamental framework for defining relationships between participants in events, with the roles more closely matching the Agent, Patient, and Instrument of Fillmore (1968). The causal ordering is plainly correlated with the ordering of roles found in thematic hierarchies and with the Actor-Undergoer cline of Role and Reference Grammar (Foley & van Valin 1984, van Valin 2004), which orders thematic roles according to positions in a structure intended to represent causal and aspectual characteristics of situations. In Jackendoff (1987, 1990) the causal and the motion/location models are combined in representations of event structures, some quite detailed and elaborate. Thematic roles within these systems are also derived notions, defined in terms of positions within these event representations. As the systems become more elaborate, one crucial issue is: how can we tell what the correct representation should be? Both types of systems exploit metaphorical extensions of space, motion, and causation to other, more abstract domains, and it is often unclear what the "correct" application of the metaphors should be. The implication for thematic roles defined within such systems is that their semantic foundations are not always sound.
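The idea that roles are positions in a decomposition, as in (17), can be sketched with a small recursive structure. The fragment below is an invented illustration (the constructors and role-extraction rules are simplifications, not Rappaport & Levin's formalism): each CAUSE subevent contributes an Agent, each "come to be at" subevent a Theme and a Location, so embedded structures like (17b) naturally yield multiple occurrences of the same role type.

```python
from dataclasses import dataclass
from typing import Union

# A sketch of roles as derived positions in an LCS-style decomposition,
# loosely modeled on (17). Constructors and extraction rules are invented
# simplifications, not Rappaport & Levin's (1988) actual formalism.

@dataclass
class BecomeAt:          # [y come to be at z]
    theme: str
    location: str

@dataclass
class Cause:             # [x cause SUBEVENT (BY MEANS OF SUBEVENT)]
    causer: str
    effect: "Event"
    means: "Event" = None

Event = Union[BecomeAt, Cause]

def roles(ev, path="e"):
    """Read role assignments off structural positions, subevent by subevent."""
    if isinstance(ev, BecomeAt):
        return [(path, "Theme", ev.theme), (path, "Location", ev.location)]
    out = [(path, "Agent", ev.causer)] + roles(ev.effect, path + ".1")
    if ev.means is not None:
        out += roles(ev.means, path + ".2")
    return out

# (17b), the Location-object alternant of a spray/load verb
lcs_b = Cause("x", BecomeAt("z", "STATE"),
              means=Cause("x", BecomeAt("y", "z")))
for r in roles(lcs_b):
    print(r)   # x is Agent of two subevents; z is both Theme and Location
```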
Notable featural analyses of thematic roles include Ostler (1979), whose system uses eight binary features that characterize 48 roles. His features are further specifications of the four generalized roles Theme, Source, Goal, and Path from the Thematic Relations Hypothesis, and include some that resemble entailments (volitional) and some that specify a semantic domain (positional, cognitive). Somers (1987) puts forward similar systems, again based on four broad roles that appear in various semantic domains, and Sowa (2000) takes this as a point of departure for a hierarchy of role types within a knowledge representation system. Sowa decomposes the four types of Somers into two pairs of roles characterized by features or entailments: Source ("present at the beginning of the process") and Goal ("present at the end of the process"), and Determinant ("determines the direction of the process") and Immanent ("present throughout the process", but "does not actively control what happens"). Sowa argues that this system allows for useful underspecification; in the dog broke the window, the dog might be involved volitionally or nonvolitionally as initiator, or be used as an instrument by some other initiator, but Source covers all of these possibilities. This illustrates another characteristic of Sowa's system; he envisions an indefinitely large number of roles, of varying degrees of specificity, but all are subtypes of one of these four. In this kind of system, the set of features will also grow indefinitely, so that it is more naturally viewed as a hierarchy of roles induced from the hierarchy of event types, at least below the most general roles. Similar ideas have been pursued in Lehmann (1996) and the FrameNet project (http://framenet.icsi.berkeley.edu); see article 29 (Gawron) Frame Semantics.

Rozwadowska (1988) pursues a somewhat different analysis, with three binary features - ±sentient, ±cause, and ±change - intended to characterize broad thematic roles. This system permits natural classes of roles to be defined, which Rozwadowska argues are useful in describing restrictions on the interpretation of English and Polish specifiers of nominalizations. For example, the distinction between *the movie's shock of the audience vs. the audience's shock at the movie is accounted for by a requirement that the specifier's role not be Neutral; that is, not have negative values of all three features. Thus either an Agent or an Experiencer NP (both marked as +change) can appear in specifier position, but not a Neutral one.

Featural analyses of thematic roles are close in spirit to entailment-based frameworks that dispense with reified roles altogether. If the features can be defined with sufficient clarity and consistency, then a natural question to ask is whether the work assigned to thematic roles can be accomplished simply by direct reference to these definitions of participant properties. The most notable exponent of this approach is Dowty (1991), with his two sets of proto-Agent and proto-Patient entailments. Wechsler (1995) is similar in spirit but employs entailments as a means of partially ordering a predicator's arguments. Davis & Koenig (2000) and Davis (2001) combine elements of these with a limited amount of lexical decomposition. While these authors eschew the term "thematic role" for their characterizations of a predicate's arguments, such definitions do conform to Dowty's (1989) definition of thematic role types noted above, though they do not impose any thematic uniqueness requirements.
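Rozwadowska's three-feature system above can be rendered as a small sketch. The restriction follows the text's description (Neutral = negative values of all three features; specifiers must not be Neutral); the particular role-to-feature table is a simplified reconstruction for illustration, not Rozwadowska's full system.

```python
# A sketch of Rozwadowska's (1988) three binary features and the restriction
# that a nominalization's specifier must not be Neutral (all features
# negative). The feature table is a simplified reconstruction.

FEATURES = {
    "Agent":       {"sentient": True,  "cause": True,  "change": True},
    "Experiencer": {"sentient": True,  "cause": False, "change": True},
    "Neutral":     {"sentient": False, "cause": False, "change": False},
}

def is_neutral(role):
    return not any(FEATURES[role].values())

def ok_as_specifier(role):
    """Specifiers of nominalizations must not be Neutral."""
    return not is_neutral(role)

print(ok_as_specifier("Experiencer"))  # True:  the audience's shock at the movie
print(ok_as_specifier("Neutral"))      # False: *the movie's shock of the audience
```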
4.5. Thematic roles and argument realization

Argument realization, the means by which semantic roles of predicates are realized as syntactic arguments of verbs, nominalizations, or other predicators, has been a consistent focus of linguists seeking to demonstrate the utility of thematic roles. This section examines some approaches to argument realization, and the following one briefly notes other syntactic and morphological phenomena in which thematic roles have been claimed to play a part. Levin & Rappaport Hovav (2005) provide an extensive discussion of diverse approaches to argument realization, including those in which thematic roles play a crucial part.

Such accounts can involve a fixed, "absolute" rule, such as "map the Agent to the subject (or external argument)", or relative rules, which establish a correspondence between an ordering (possibly partial) among a predicate's semantic roles and an ordering among grammatical relations or configurationally defined syntactic positions. The role ordering may correspond to a global ordering of roles, such as a thematic hierarchy, or be derived from the relative depth of semantic roles or from relationships, such as entailments, holding among sets of roles. One example is the Universal Alignment Hypothesis developed within Relational Grammar; the version here is from Rosen (1984: 40):

(18) There exists some set of universal principles on the basis of which, given the semantic representation of a clause, one can predict which initial grammatical relation each nominal bears.

Another widely employed principle of this type is the Uniformity of Theta Assignment Hypothesis (UTAH) of Baker (1988: 46), which assumes a structural conception of thematic roles:

(19) Identical thematic relationships between items are represented by identical structural relationships between those items at the level of D-structure.

Baker (1997) examines a relativized version of this principle, under which the ordering of a predicator's roles, according to the thematic hierarchy, must correspond to their relative depth in D-structure. In Lexical Mapping Theory (Bresnan & Kanerva 1989, Bresnan & Moshi 1990, Alsina 1992), the correspondence is more complex, because grammatical relations are decomposed into features, which in turn interface with the thematic hierarchy. A role's realization is thus underspecified in some cases until default assignments fill in the remaining feature. For example, an Agent receives an intrinsic classification of -O(bjective), and the highest of a predicate's roles is assigned the feature -R(estricted) by default, with the result that Agents are by default realized as subjects.
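The Lexical Mapping Theory classification just described lends itself to a small worked example. The fragment below is a deliberately reduced rendering of the system (the intrinsic classifications and defaults are pared down to the cases mentioned in the text; the full theory is considerably richer): grammatical functions are decomposed into the features r(estricted) and o(bjective), Agents are intrinsically -o, Patient/Theme-like roles -r, the highest role receives -r by default, and [-r, -o] is the subject.

```python
# A reduced sketch of Lexical Mapping Theory feature assignment (after
# Bresnan & Kanerva 1989), simplified for illustration. GFs as feature
# bundles: SUBJ=[-r,-o], OBJ=[-r,+o], OBL=[+r,-o], OBJth=[+r,+o].

GFS = [("SUBJ", {"r": "-", "o": "-"}), ("OBJ", {"r": "-", "o": "+"}),
       ("OBL",  {"r": "+", "o": "-"}), ("OBJth", {"r": "+", "o": "+"})]

INTRINSIC = {"Agent": {"o": "-"}, "Theme": {"r": "-"}, "Patient": {"r": "-"}}

def map_roles(roles):
    """roles: the predicate's roles in thematic prominence order."""
    feats = [dict(INTRINSIC.get(r, {})) for r in roles]
    feats[0].setdefault("r", "-")          # default: highest role gets -r
    out, used = {}, set()
    for role, f in zip(roles, feats):
        for gf, spec in GFS:               # most prominent compatible GF wins
            if gf not in used and all(spec[k] == v for k, v in f.items()):
                out[role], used = gf, used | {gf}
                break
    return out

print(map_roles(["Agent", "Patient"]))  # {'Agent': 'SUBJ', 'Patient': 'OBJ'}
print(map_roles(["Theme"]))             # {'Theme': 'SUBJ'}: theme-only verbs
```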
These general principles typically run afoul of the complexities of argument realization, including cross-linguistic and within-language variation among predicators and diathesis alternations. A very brief mention of some of the difficulties follows. First, to avoid an account that is essentially stipulative, the inventory of roles cannot be too large or too specific; once again this leads to the problem of assigning roles to the large range of verbs whose arguments appear to fit poorly in any of the roles. Second, cross-linguistic variation in realization patterns poses problems for principles like (18) and (19) that claim universally consistent mapping; some languages permit a wide range of ditransitive constructions, for example, while others entirely lack them (Gerdts 1992). Third, there are some cases where argument realization appears to involve information outside what would normally be ascribed to lexical semantic representations; some semantically similar verbs require different subcategorizations (wish for vs. desire, look at vs. watch, appeal to vs. please) or display differing diathesis alternations (hide and dress permit an intransitive alternant with reflexive meaning, while conceal and clothe are only transitive) (Jackendoff 1987: 405-406, Davis 2001: 171-173). Fourth, there are apparent cases of the same roles being mapped differently, as in the Italian verbs temere and preoccupare in (15) above; these cases prove problematic particularly for a model with a thematic hierarchy and a role list for each predicate. These difficulties have been addressed in large degree through entailment-based models, described in the following section, and through models that employ more elaborate semantic decomposition than what is assumed by principles such as (18) and (19) (Jackendoff 1987, 1990, Alsina 1992, Croft 1991, 1998).

4.6. Lexical entailments as alternatives to thematic roles

Dowty (1991) is an influential proposal for argument realization that avoids reified thematic roles in semantic representations. The central idea is that subject and object selection relies on a set of proto-role entailments, grouped into two sets, proto-agent properties and proto-patient properties, as follows (Dowty 1991: 572):

(20) Contributing properties for the Agent Proto-Role
a. volitional involvement in the event or state
b. sentience (and/or perception)
c. causing an event or change of state in another participant
d. movement (relative to the position of another participant)
e. (exists independently of the event named by the verb)

(21) Contributing properties for the Patient Proto-Role
a. undergoes change of state
b. incremental theme
c. causally affected by another participant
d. stationary relative to the movement of another participant
e. (does not exist independently of the event, or not at all)

Each of these properties may or may not be entailed of a participant in a given type of event or state. Dowty's Argument Selection Principle, in (22), characterizes how transitive verbs may realize their semantic arguments (Dowty 1991: 576):

(22) In predicates with grammatical subject and object, the argument for which the predicate entails the greatest number of Proto-Agent properties will be lexicalized as the subject of the predicate; the argument having the greatest number of Proto-Patient properties will be lexicalized as the direct object.

This numerical comparison across semantic roles accounts well for the range of attested transitive verbs in English, but fares less well with other kinds of data (Primus 1999, Davis & Koenig 2000, Davis 2001, Levin & Rappaport Hovav 2005). In the Finnish causative example in (23), for example (Davis 2001: 69), the subject argument bears fewer proto-agent entailments than the direct object.

(23) Uutinen puhu-tt-i nais-i-a pitkään.
news-item talk-CAUS-PAST woman-PL-PART long-ILL
'The news made the women talk for a long time.'

Causation appears to override the other entailments in (23) and similar cases.
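The selection principle in (22) is essentially a counting procedure, and can be sketched as such (the Finnish example above shows where the raw count can be overridden). The entailment annotations below are a plausible but unofficial encoding of a verb like break, invented for illustration rather than taken from Dowty's own classifications.

```python
# A sketch of Dowty's (1991) Argument Selection Principle (22): the argument
# with the most proto-agent entailments becomes the subject, the one with the
# most proto-patient entailments the object. Entailment annotations invented.

PROTO_AGENT = {"volitional", "sentient", "causes", "moves", "independent"}
PROTO_PATIENT = {"change_of_state", "incremental", "affected",
                 "stationary", "dependent"}

def select(args):
    """args: dict mapping argument names to their entailed properties."""
    subj = max(args, key=lambda a: len(args[a] & PROTO_AGENT))
    obj = max(args, key=lambda a: len(args[a] & PROTO_PATIENT))
    return subj, obj

break_args = {
    "x": {"volitional", "sentient", "causes", "moves", "independent"},
    "y": {"change_of_state", "affected", "independent"},
}
print(select(break_args))   # ('x', 'y'): x -> subject, y -> object
```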
In addition, (22) makes no prediction regarding predicators other than transitive verbs, but their argument realization is not unconstrained; as with transitive verbs, a causally affecting argument or an argument bearing a larger number of proto-agent entailments is realized as the subject of verbs such as English prevail (on), rely (on), hope (for), apply (to), and many more.

Wechsler (1995) presents a model of argument realization that resembles Dowty's in eschewing thematic roles and hierarchies in favor of a few relational entailments amongst participants in a situation, such as whether one participant necessarily has a notion of another, or is part of another. Davis & Koenig (2000) and Davis (2001) borrow elements of Dowty's and Wechsler's work but posit reified "proto-role attributes" in semantic representations reflecting the flow of causation, in response to the difficulties noted above.

4.7. Other applications of thematic roles in syntax and morphology

Thematic roles and thematic hierarchies have been invoked in accounts of various other syntactic phenomena, and the following provides only a sample.

Some accounts of anaphoric binding make explicit reference to thematic roles, as opposed to structural prominence in lexical semantic representations. Typically, such accounts involve a condition that the antecedent of an anaphor must outrank it on the thematic hierarchy; Wilkins (1988) is one example of this approach. Williams (1994) advocates recasting the principles of binding theory in terms of role lists defined on predicators (including many nouns and adjectives). This allows for "implicit", syntactically unrealized arguments to participate in binding conditions. For example, the contrast between admiration of him (admirer and admiree must be distinct) and admiration of himself (admirer and admiree must be identical) suggests that even arguments that are not necessarily syntactically realized play a crucial role in binding. Thematic role labels, however, play no part in this system; rather, the arguments of a predicator are simply in an ordered list, as in the argument structures of Rappaport & Levin (1988) and Grimshaw (1990), though ordering on the list might be determined by a thematic hierarchy. And Jackendoff (1987) suggests that indexing argument positions within semantically decomposed lexical representations can address these binding facts, without reference to thematic hierarchies. Everaert & Anagnostopoulou (1997) argue that local anaphors in Modern Greek display a dependence on the thematic hierarchy: a Goal or Experiencer antecedent can bind a Theme, for example, but not the reverse. This holds even when the lower thematic role is realized as the subject, resulting in a subject anaphor.

Nishigauchi (1984) argues for a thematic role-based effect in controller selection for infinitival complements and purpose clauses, a view defended by Jones (1988). For example, the controller of a purpose clause is said to be a Goal. Ladusaw & Dowty (1988) counter that the data are better handled by verbal entailments and by general principles of world knowledge about human action and responsibility.

Donohue (1996) presents data on relativization in Tukang Besi (an Austronesian language of Indonesia) that suggest a distinct relativization strategy for Instruments, regardless of their grammatical relation. Mithun (1984) proposes an account of noun incorporation in which Patient is preferred over other roles for incorporation, though in some languages arguments bearing other roles (Instrument or Location) may incorporate as well. But alternatives based on underlying syntactic structure (Baker 1988) and depth of embedding in lexical semantic representations (Kiparsky 1997) have also been advanced.
Evans (1997) examines noun incorporation in Mayali (a non-Pama-Nyungan language of northern Australia) and finds a thematic role-based account inadequate to deal with the range of incorporated nominals. He instead suggests that constraints based on animacy and prototypicality in the denoted event are crucial in selecting the incorporating argument.

5. Concluding remarks

Dowty (1989: 108-109) contrasts two positions on the utility of (broad) thematic roles. Those advocating that thematic roles are crucially involved in lexical, morphological, and syntactic phenomena have consequently tried to define thematic roles and develop thematic role systems. But, even 20 years later, the position Dowty states in (24) can also be defended:

(24) Thematic roles per se have no priviledged [sic] status in the conditioning of syntactic processes by lexical meaning, except insofar as certain semantic distinctions happen to occur more frequently than others among natural languages.

Given the range of alternative accounts of argument realization, lexical acquisition, and other phenomena for which the "traditional", broad thematic roles have sometimes been considered necessary, and the additional devices required even in those approaches that do employ them, it is unclear how much is gained by introducing them as reified elements of linguistic theory. There do appear to be some niche cases of phenomena that depend on such notions, some of which are noted above, and there are stronger motivations for entailment-based approaches to argument realization, diathesis alternations, aspect, and complement control. Such entailments can certainly be viewed as thematic roles, some even as roles in a broad sense that apply to a large class of predicates. But the overall picture is not one that lends support to the "traditional" notion of a small inventory of broad roles, with each of a predicate's arguments uniquely assigned one of them.

Fine-grained roles serve a somewhat different function in model-theoretic semantics, one not dependent on the properties of a thematic role system but on the use of roles in individuating events and relating them to their participants. But in this realm, too, they do not come without costs; as Landman and Schein have argued, dyadic thematic roles, coupled with principles of role uniqueness, lead both to unwelcome inferences that can be blocked only with additional mechanisms and to requiring events that are intuitively too fine-grained. These difficulties arise in connection with symmetric predicators, transactional verbs, and other complex event types that may warrant a more elaborated treatment than thematic roles defined on a unitary predicate can offer.

The author gratefully acknowledges detailed comments on this article from Cleo Condoravdi and from the editors.

6. References

Alsina, Alex 1992. On the argument structure of causatives. Linguistic Inquiry 23, 517-555.
Bach, Emmon 1989. Informal Lectures on Formal Semantics. Albany, NY: State University of New York Press.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL: The University of Chicago Press.
Baker, Mark C. 1997. Thematic roles and syntactic structure. In: L. Haegeman (ed.). Elements of Grammar. Dordrecht: Kluwer, 73-137.
Bayer, Samuel L. 1997. Confessions of a Lapsed Neo-Davidsonian. Events and Arguments in Compositional Semantics. New York: Garland.
Belletti, Adriana & Luigi Rizzi 1988. Psych-verbs and θ-theory. Natural Language and Linguistic Theory 6, 291-352.
Bresnan, Joan & Jonni Kanerva 1989. Locative inversion in Chichewa. A case study of factorization in grammar. Linguistic Inquiry 20, 1-50.
Bresnan, Joan & Lioba Moshi 1990. Object asymmetries in comparative Bantu syntax. Linguistic Inquiry 21, 147-185.
Broadwell, George A. 1988. Multiple θ-role assignment in Choctaw. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 113-127.
Carlson, Greg 1984. On the role of thematic roles in linguistic theory. Linguistics 22, 259-279.
Carlson, Greg 1998. Thematic roles and the individuation of events. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 35-51.
Chierchia, Gennaro 1984. Topics in the Syntax and Semantics of Infinitives and Gerunds. Ph.D. dissertation. University of Massachusetts, Amherst, MA.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1982. Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: The MIT Press.
Croft, William 1991. Syntactic Categories and Grammatical Relations: The Cognitive Organization of Information. Chicago, IL: The University of Chicago Press.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 21-63.
Davidson, Donald 1967. The logical form of action sentences. In: N. Rescher (ed.). The Logic of Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81-95.
Davis, Anthony R. 2001. Linking by Types in the Hierarchical Lexicon. Stanford, CA: CSLI Publications.
Davis, Anthony R. & Jean-Pierre Koenig 2000. Linking as constraints on word classes in a hierarchical lexicon. Language 76, 56-91.
Donohue, Mark 1996. Relative clauses in Tukang Besi. Grammatical functions and thematic roles. Linguistic Analysis 26, 159-173.
Dowty, David 1989. On the semantic content of the notion of 'thematic role'. In: G. Chierchia, B. Partee & R. Turner (eds.). Properties, Types, and Meaning, vol. 2. Semantic Issues. Dordrecht: Kluwer, 69-129.
Dowty, David 1991. Thematic proto-roles and argument selection. Language 67, 547-619.
Everaert, Martin & Elena Anagnostopoulou 1997. Thematic hierarchies and Binding Theory. Evidence from Greek. In: F. Corblin, D. Godard & J.-M. Marandin (eds.). Empirical Issues in Formal Syntax and Semantics. Selected Papers from the Colloque de Syntaxe et de Sémantique de Paris (CSSP 1995). Bern: Lang, 43-59.
Evans, Nick 1997. Role or cast. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 397-430.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R. T. Harms (eds.). Universals of Linguistic Theory. New York: Holt, Rinehart & Winston, 1-88.
Fillmore, Charles J. 1977. Topics in lexical semantics. In: R. W. Cole (ed.). Current Issues in Linguistic Theory. Bloomington, IN: Indiana University Press, 76-138.
Foley, William A. & Robert D. van Valin, Jr. 1984. Functional Syntax and Universal Grammar. Cambridge: Cambridge University Press.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. Ph.D. dissertation. University of California, Berkeley, CA.
Gerdts, Donna B. 1992. Morphologically-mediated relational profiles. In: L. A. Buszard-Welcher, L. Wee & W. Weigel (eds.). Proceedings of the 18th Annual Meeting of the Berkeley Linguistics Society (= BLS). Berkeley, CA: Berkeley Linguistics Society, 322-337.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369-411.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jones, Charles 1988. Thematic relations in control. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 75-89.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 473-499.
Kratzer, Angelika 1996. Severing the external argument from its verb. In: J. Rooryck & L. Zaring (eds.). Phrase Structure and the Lexicon. Dordrecht: Kluwer, 109-137.
Krifka, Manfred 1992. Thematic relations as links between nominal reference and temporal constitution. In: I. A. Sag & A. Szabolcsi (eds.). Lexical Matters. Stanford, CA: CSLI Publications, 29-53.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 197-235.
Ladusaw, William A. & David R. Dowty 1988. Towards a nongrammatical account of thematic roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 61-73.
Landman, Fred 2000. Events and Plurality. The Jerusalem Lectures. Dordrecht: Kluwer.
Lehmann, Fritz 1996. Big posets of participatings and thematic roles. In: P. W. Eklund, G. Ellis & G. Mann (eds.). Proceedings of the 4th International Conference on Conceptual Structures. Knowledge Representation as Interlingua. Berlin: Springer, 50-74.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press.
Li, Yafei 1995. The thematic hierarchy and causativity. Natural Language and Linguistic Theory 13, 255-282.
Màrquez, Lluís et al. 2008. Semantic role labeling. An introduction to the special issue. Computational Linguistics 34, 145-159.
Mithun, Marianne 1984. The evolution of noun incorporation. Language 60, 847-894.
Nishigauchi, Taisuke 1984. Control and the thematic domain. Language 60, 215-250.
Ostler, Nicholas 1979. Case Linking. A Theory of Case and Verb Diathesis Applied to Classical Sanskrit. Ph.D. dissertation. MIT, Cambridge, MA.
Parsons, Terence 1990. Events in the Semantics of English. A Study in Subatomic Semantics. Cambridge, MA: The MIT Press.
Perlmutter, David M. & Paul Postal 1984. The 1-Advancement Exclusiveness Law. In: D. M. Perlmutter & C. Rosen (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL: The University of Chicago Press, 81-125.
Primus, Beatrice 1999. Cases and Thematic Roles. Ergative, Accusative and Active. Tübingen: Niemeyer.
Primus, Beatrice 2006. Mismatches in semantic-role hierarchies and the dimensions of Role Semantics. In: I. Bornkessel et al. (eds.). Semantic Role Universals and Argument Linking. Theoretical, Typological, and Psycholinguistic Perspectives. Berlin: Mouton de Gruyter, 53-88.
Rappaport, Malka & Beth Levin 1988. What to do with θ-roles. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 7-36.
Rosen, Carol 1984. The interface between semantic roles and initial grammatical relations. In: D. M. Perlmutter & C. Rosen (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL: The University of Chicago Press, 38-77.
Rozwadowska, Bozena 1988. Thematic restrictions on derived nominals. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 147-165.
Schein, Barry 2002. Events and the semantic content of thematic relations. In: G. Preyer & G. Peter (eds.). Logical Form and Language. Oxford: Oxford University Press, 263-344.
Somers, Harold L. 1987. Valency and Case in Computational Linguistics. Edinburgh: Edinburgh University Press.
Sowa, John F. 2000. Knowledge Representation. Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks Cole Publishing Co.
Svenonius, Peter 2007. Adpositions, particles and the arguments they introduce. In: E. Reuland, T. Bhattacharya & G. Spathas (eds.). Argument Structure. Amsterdam: Benjamins, 63-103.
van Valin, Robert D., Jr. 2004. Semantic macroroles in Role and Reference Grammar. In: R. Kailuweit & M. Hummel (eds.). Semantische Rollen. Tübingen: Narr, 62-82.
Wechsler, Stephen 1995. The Semantic Basis of Argument Structure. Stanford, CA: CSLI Publications.
Wechsler, Stephen 2005. What is right and wrong about little v. In: M. Vulchanova & T. A. Åfarli (eds.). Grammar and Beyond. Essays in Honour of Lars Hellan. Oslo: Novus Press, 179-195.
Wilkins, Wendy 1988. Thematic structure and reflexivization. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 191-213.
Williams, Edwin 1994. Thematic Structure in Syntax. Cambridge, MA: The MIT Press.
Wunderlich, Dieter 1997. CAUSE and the structure of verbs. Linguistic Inquiry 28, 27-68.

Anthony R. Davis, Washington, DC (USA)

19. Lexical Conceptual Structure

1. Introduction
2. The introduction of LCSs into linguistic theory
3. Components of LCSs
4. Choosing primitive predicates
5. Subeventual analysis
6. LCSs and syntax
7. Conclusion
8. References

Abstract

The term "Lexical Conceptual Structure" was introduced in the 1980s to refer to a structured lexical representation of verb meaning designed to capture those meaning components which determine grammatical behavior, particularly with respect to argument realization. Although the term is no longer much used, representations of verb meaning which share many of the properties of LCSs are still proposed in theories which maintain many of the aims and assumptions associated with the original work on LCSs. As LCSs and the representations that are their descendants take the form of predicate decompositions, the article reviews criteria for positing the primitive predicates that make up LCSs. Following an overview of the original work on LCS, the article traces the developments in the representation of verb meaning that characterize the descendants of the early LCSs. The more recent work exploits the distinction between root and event structure implicit in even the earliest LCS in the determination of grammatical behavior. This work
also capitalizes on the assumption that predicate decompositions incorporate a subeventual analysis which defines hierarchical relations among arguments, allowing argument realization rules to be formulated in terms of the geometry of the decomposition.

1. Introduction
Lexical Conceptual Structure (LCS) is a term that was used in the 1980s and 1990s to refer to a structured lexical representation of verb meaning. Although the term "LCS" is no longer widely used, structured representations of verb meaning which share many of the properties of LCSs are still often proposed, in theories which maintain many of the aims and assumptions associated with those originally involving an LCS. These descendants of the original LCSs go by various names, including lexical relational structures (Hale & Keyser 1992, 1993), event structures (Rappaport Hovav & Levin 1998a, Levin & Rappaport Hovav 2005), semantic structures (Pinker 1989), L-syntax (Mateu 2001a, Travis 2000), l-structure (Zubizarreta & Oh 2007) and first phase syntax (Ramchand 2008); representations called Semantic Forms (Wunderlich 1997a, 1997b) and Semantic Representations (van Valin 1993, 2005, van Valin & LaPolla 1997) are also close in spirit to LCSs. Here we provide an overview of work that uses a construct called LCS, and we then trace the developments which have taken place in the representation of verb meaning in descendants of this work. We stress, however, that we are not presenting a single coherent or unified theory, but rather a synthetic perspective on a collection of related theories.

2. The introduction of LCSs into linguistic theory
In the early 1980s, the idea emerged that major facets of the syntax of a sentence are projected from the lexical properties of the words in it (e.g. Chomsky 1981, Farmer 1984, Pesetsky 1982, Stowell 1981; see Fillmore 1968 for an earlier proposal of this sort), and over the course of that decade its consequences were explored. Much of this work assumes that verbs are associated with predicate-argument structures (e.g. Bresnan 1982, Grimshaw 1990), often called theta-grids (Stowell 1981, Williams 1981). The central idea is that the syntactic structure that a verb appears in is projected from its predicate-argument structure, which indicates the number of syntactic arguments a verb has, and some information about how the arguments are projected onto syntax, for example, as internal or external arguments (Marantz 1984, Williams 1981). One insight arising from the closer scrutiny of the relationship between the lexical properties of verbs and the syntactic environments in which they appear is that a great many verbs display a range of what have been called argument - or diathesis - alternations, in which the same verb appears with more than one set of morphosyntactic realization options for its arguments, as in the causative and dative alternations, in (1) and (2), respectively.

(1) a. Pat dried the clothes.
    b. The clothes dried.

(2) a. Pat sold the rare book to Terry.
    b. Pat sold Terry the rare book.

Some argument alternations seem to involve two alternate realizations of the same set of arguments (e.g. the dative alternation), while others seem to involve real changes in
the meaning of the verb (e.g. the causative alternation) (Rappaport Hovav & Levin 1998b). Researchers who developed theories of LCS assumed that in addition to a verb's argument structure, it is possible to isolate a small set of recurring meaning components which determine the range of argument alternations a particular verb can participate in. These meaning components are embodied in the primitive predicates of predicate decompositions such as LCSs. Thus, LCSs are used both to represent systematic alternations in a verb's meaning and to define the set of verbs which undergo alternate mappings to syntax, as we now illustrate.

A case study which illustrates this line of investigation is presented by Guerssel et al. (1985). Their study attempts to isolate those facets of meaning which determine a verb's participation in several transitivity alternations in four languages: Berber, English, Warlpiri, and Winnebago. Guerssel et al. compare the behavior of verbs corresponding to English break (as a representative of the class of change of state verbs) and cut (as a representative of the class of motion-contact-effect verbs) in several alternations, including the causative and conative alternations in these languages (cf. Fillmore 1970). They suggest that participation in the causative alternation is contingent on the LCS of a verb containing a constituent of the form '[come to be STATE]' (represented via the predicate BECOME or CHANGE in some other work), while participation in the conative alternation requires an LCS with components of contact and effect. The LCSs suggested for the intransitive and transitive uses of break, which together make up the causative alternation, are given in (3), and the LCSs for the transitive and intransitive uses of cut, which together make up the conative alternation illustrated in (4), are presented in (5).

(3) a. break: y come to be BROKEN (Guerssel et al. 1985: 54, ex. (19))
    b. break: x cause (y come to be BROKEN) (Guerssel et al. 1985: 55, ex. (21))

(4) a. I cut the rope around his wrists.
    b. I cut at the rope around his wrists.

(5) a. cut: x produce CUT in y, by sharp edge coming into contact with y (Guerssel et al. 1985: 51, ex. (11))
    b. cut Conative LCS: x causes sharp edge to move along path toward y, in order to produce CUT on y, by sharp edge coming into contact with y (Guerssel et al. 1985: 59, ex. (34))

We cite semantic representations in the forms given in the source, even though this leads to inconsistencies in notation; where we formulate representations for the purposes of this article, we adopt the representations used by Rappaport Hovav & Levin (1998a) and subsequent work. Although the LCSs for cut in (5) include semantic notions not usually encountered in predicate decompositions, central to them are the notions 'move' and 'produce', which have more common analogues: go and the combination of cause and become, respectively.

The verb cut does not have an intransitive noncausative use, as in (6), since its LCS does not have an isolatable constituent of the form '[come to be in STATE]', while the verb break lacks a conative variant, as in (7), because its LCS does not include a contact
component. Finally, verbs like touch, whose meaning does not involve a change of state and simply involves contact with no necessary effect, display neither alternation, as in (8).

(6) *The bread cut.

(7) *We broke at the box.

(8) a. We touched the wall.
    b. *The wall touched.
    c. *We touched at the wall.

For other studies along these lines, see Hale & Keyser (1987), Laughren (1988), and Rappaport, Levin & Laughren (1988).
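Guerssel et al.'s licensing conditions lend themselves to a mechanical statement. The following is a minimal sketch in our own encoding, not theirs: LCSs are rendered as nested tuples, the helper names are invented for illustration, and the components 'contact' and 'effect' are reduced to lexical flags.

```python
# A minimal sketch (our own encoding, not Guerssel et al.'s notation) of how
# constituents of an LCS can license argument alternations.

def has_become_constituent(lcs):
    """True if the LCS contains an isolatable [come to be STATE] constituent."""
    if isinstance(lcs, tuple):
        if lcs[0] == "become":
            return True
        return any(has_become_constituent(part) for part in lcs[1:])
    return False

# Hypothetical toy lexicon pairing LCSs with the contact/effect components.
LEXICON = {
    "break": {"lcs": ("cause", "x", ("become", ("y", "BROKEN"))),
              "contact": False, "effect": True},
    "cut":   {"lcs": ("produce", "x", ("y", "CUT")),  # no become-constituent
              "contact": True,  "effect": True},
    "touch": {"lcs": ("contact", "x", "y"),
              "contact": True,  "effect": False},
}

def licenses_causative(entry):   # causative alternation, cf. (3) and (6)
    return has_become_constituent(entry["lcs"])

def licenses_conative(entry):    # conative alternation, cf. (4) and (7)-(8)
    return entry["contact"] and entry["effect"]

for verb, entry in LEXICON.items():
    print(verb, licenses_causative(entry), licenses_conative(entry))
# break True  False  -- The vase broke. / *We broke at the box.
# cut   False True   -- *The bread cut. / I cut at the rope.
# touch False False  -- *The wall touched. / *We touched at the wall.
```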
Clearly, the noncausative and causative uses of a verb satisfy different truth conditions, as do the conative and nonconative uses of a verb. As we have just illustrated, LCSs can capture these modulations in the meaning of a verb which, in turn, have an effect on the way a verb's arguments are morphosyntactically realized. As we discuss in sections 5 and 6, subsequent work tries to derive a verb's argument realization properties in a principled way from the structure of its LCS. However, as mentioned above, verbs with certain LCSs may also simply allow more than one syntactic realization of their arguments without any change in meaning. Rappaport Hovav & Levin (2008) argue that this possibility is instantiated by the English dative alternation as manifested by verbs that inherently lexicalize caused possession such as give, rent, and sell. They propose that these verbs have a single LCS representing the causation of possession, as in (9), but differ from each other with respect to the specific type of possession involved. The verb give lexicalizes nothing more than caused possession, while other verbs add further details about the event: it involves the exchange of money for sell and is temporary and contractual for rent.

(9) Caused possession LCS: [ x cause [ y have z ] ]

According to Rappaport Hovav & Levin (2008), the dative alternation arises with these verbs because the caused possession LCS has two syntactic realizations. (See Harley 2003 and Goldberg 1995 for an alternative view which takes the dative alternation to be a consequence of attributing both caused motion and caused possession LCSs to all alternating verbs; Rappaport Hovav & Levin only attribute both LCSs to verbs such as send and throw, which are not inherently caused possession verbs.)

As these case studies illustrate, the predicate decompositions that fall under the rubric "LCS" are primarily designed to capture those facets of meaning which determine grammatical facets of behavior, including argument alternations. This motivation sets LCSs apart from other predicate decompositions, which are primarily posited on the basis of other forms of evidence, such as the ability to capture various entailment relations between sets of sentences containing morphologically related words and the ability to account for interactions between event types and various tense operators and temporal adverbials; cf. article 17 (Engelberg) Frameworks of decomposition. To give one example, it has been suggested that verbs which pass tests for telicity all have a state predicate in their predicate decomposition (Dowty 1979, Parsons 1990). Nevertheless, LCS representations share many of the properties of other predicate decompositions used as explications of lexical meaning, including those proposed by Dowty (1979), Jackendoff (1976, 1983, 1990), and more recently in Role and Reference Grammar, especially in van Valin & LaPolla (1997), based in large part on the work of generative semanticists such as Lakoff (1968, 1970), McCawley (1968, 1971), and Ross (1972). These similarities, of course, raise the question of whether the same representation can be the basis for capturing both kinds of generalizations.

LCSs, however, are not intended to provide an exhaustive representation of a verb's meaning, as mentioned above. Positing an LCS presupposes that it is possible to distinguish those facets of meaning that are grammatically relevant from those which are not; this assumption is not uncontroversial: see, for example, the debate between Taylor (1996) and Jackendoff (1996a). In addition, the methodology and aims of this form of "componential" analysis of verb meaning differ in fundamental ways from the type of componential analysis proposed by the structuralists (e.g. Nida 1975). For the structuralists, meaning components were isolatable insofar as they were implicated in semantic contrasts within a lexical field (e.g. "adult" to distinguish parent from child); the aim of a componential analysis, then, was to provide a feature analysis of the words in a particular semantic field that distinguishes every word in that field from every other. In contrast, the goal of the work assuming LCS is not to provide an exhaustive semantic analysis, but rather to isolate only those facets of meaning which recur in significant classes of verbs and determine key facets of the linguistic behavior of verbs. This approach makes the crucial assumption that verbs may be different in significant respects, while still having almost identical LCSs; for example, freeze and melt denote "inverse" changes of state, yet they would both share the LCS of change of state verbs.

Although all these works assume the value of positing predicate decompositions (thus differing radically from the work of Fodor & Lepore 1999), the nature of the predicate decomposition and its place in grammar and syntactic structure varies quite radically from theory to theory. Here we review the work which takes the structured lexical representation to be a specifically linguistic representation and, thus, to be distinct from a general conceptual structure which interfaces with other cognitive domains. Furthermore, this work assumes that the information encoded in the LCSs is a small subset of the information encoded in a fully articulated explication of lexical meaning. In this respect, this work is different from the work of Jackendoff (1983, 1990), who assumes that there is a single conceptual representation, used for linguistic and nonlinguistic purposes; cf. article 30 (Jackendoff) Conceptual Semantics.

3. Components of LCSs
Since verbs individuate and name events, LCS-style representations are taken to specify the limited inventory of basic event types made available by language for describing happenings in the world. Thus, our use of the term "event" includes all situation types, including states, similar to the notion of "eventuality" in some work on event semantics (Bach 1986). For this reason, such representations are often currently referred to as "event structures".
In this section, we provide an overview of the representations of the lexical meaning of verbs which are collectively called event structures and identify the properties which are common to the various instantiations of these representations.
In section 6, we review theories which differ in terms of how these representations are related to syntactic structures.

All theories of event structure, either implicitly or explicitly, recognize a distinction between the primitive predicates which define the range of event types available and a component which represents what is idiosyncratic in a verb's meaning. For example, all noncausative verbs of change of state have a predicate decomposition including a predicate representing the notion of change of state, as in (10); however, these verbs differ from one another with respect to an attribute of an entity whose value is specified as changing: the attribute relevant to cool involves temperature, while that relevant to widen involves a dimension. One way to represent these components of meaning is to allow the predicate representing the change to take an argument which represents the attribute, and this argument position can then be associated with distinct attributes. This idea is instantiated in the representations for the three change of state verbs in (11) by indicating the attribute relevant to each verb in capital italics placed within angle brackets.

(10) [ become [ y <RES-STATE> ] ]

(11) a. dry: [ become [ y <DRY> ] ]
     b. widen: [ become [ y <WIDE> ] ]
     c. dim: [ become [ y <DIM> ] ]

As this example shows, LCSs are constructed so that common substructures in the representations of verb meanings can be taken to define grammatically relevant classes of verbs, such as those associated with particular argument alternations. Thus, the structure in (10), which is shared by all change of state verbs, can then be associated with displaying the causative alternation. Being associated with this LCS substructure is a necessary, but not sufficient, condition for participating in the causative alternation, since only some change of state verbs alternate in English. The precise conditions for licensing the alternation require further investigation, as does the question of why languages vary somewhat in their alternating verbs; see Alexiadou, Anagnostopoulou & Schäfer (2006), Doron (2003), Haspelmath (1993), Koontz-Garboden (2007), and Levin & Rappaport Hovav (2005) for discussion.

The idiosyncratic component of a verb's meaning has received several names, including "constant", "root", and even "verb". We use the term "root" (Pesetsky 1995) in the remainder of this article, although we stress that it should be kept distinct from the notion of root used in morphology (e.g. Aronoff 1993). Roots may be integrated into LCSs in two ways: a root may fill an argument position of a primitive predicate, as in the change of state example (10), or it may serve as a modifier of a predicate, as with various types of activity verbs, as in (12) and (13). (Modifier status is indicated by subscripting the root to the predicate being modified.)

(12) Casey ran. [ x act<RUN> ]

(13) Tracy wiped the table. [ x act<WIPE> y ]
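The two modes of root association can be made concrete with a small sketch. The encoding below is our own illustration (the function names are invented), showing a root filling an argument slot, as in (10)-(11), versus a root subscripted to a predicate as a modifier, as in (12)-(13).

```python
# A minimal sketch of the two ways roots enter LCSs; encoding and names ours.

def change_of_state(root):
    """Root fills an argument position of become, cf. (10)-(11)."""
    return ("become", ("y", root))

def activity(root, transitive=False):
    """Root serves as a modifier (subscript) of act, cf. (12)-(13)."""
    return (("act", root), "x", "y") if transitive else (("act", root), "x")

print(change_of_state("DRY"))            # dry:   ('become', ('y', 'DRY'))
print(change_of_state("WIDE"))           # widen: ('become', ('y', 'WIDE'))
print(activity("RUN"))                   # run:   (('act', 'RUN'), 'x')
print(activity("WIPE", transitive=True)) # wipe:  (('act', 'WIPE'), 'x', 'y')
```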
Although early work on the structured representation of verb meaning paid little attention to the nature and contribution of the root (the exception being Grimshaw 2005), more recent work has taken seriously the idea that the elements of meaning lexicalized in the root determine the range of event structures that a root can be associated with (e.g. Erteschik-Shir & Rapoport 2004, 2005, Harley 2005, Ramchand 2008, Rappaport Hovav 2008, Rappaport Hovav & Levin 1998a, Zubizarreta & Oh 2007).

Thus, Rappaport Hovav & Levin (1998a) propose that roots are of different ontological types, with the type determining the associated event structures. Two of the major ontological types of roots are manner and result (Levin & Rappaport Hovav 1991, 1995, Rappaport Hovav & Levin 2010; see also Talmy 1975, 1985, 2000). These two types of roots are best introduced through an examination of verbs apparently in the same semantic field which differ as to the nature of their root: the causative change of state verb clean, for example, has a result root that specifies a state that often results from some activity, as in (14), while the verb scrub has a manner root that specifies an activity, as in (15); in this and many other instances, the activity is one conventionally carried out to achieve a particular result. With scrub the result is "cleanness", which explains the intuition of relatedness between the manner verb scrub and the result verb clean.

(14) [ [ x act<MANNER> ] cause [ become [ y <CLEAN> ] ] ]

(15) [ x act<SCRUB> ]

Result verbs specify the bringing about of a result state - a state that is the result of some sort of activity; it is this state which is lexicalized in the root. Thus, the verbs clean and empty describe two different result states that are often brought about by removing material from a place; neither verb is specific about how the relevant result state comes about. Result verbs denote externally caused eventualities in the sense of Levin & Rappaport Hovav (1995). Thus, while a cave can be empty without having been emptied, something usually becomes empty as a result of some causing event. Result verbs, then, are associated with a causative change of state LCS; see also Hale & Keyser (2002) and Koontz-Garboden (2007) and for slightly different views Alexiadou, Anagnostopoulou & Schäfer (2006) and Doron (2003).

A manner root is associated with an activity LCS; such roots describe actions, which are identified by some sort of means, manner, or instrument. Thus, the manner verbs scrub and wipe both describe actions that involve making contact with a surface, but differ in the way the hand or some implement is moved against the surface and the degree of force and intensity of this movement. Often such activities are characterized by the instrument used in performing them and the verbs themselves take their names from the instruments. Again among verbs describing making contact with a surface, there are the verbs rake and shovel, which involve different instruments, designed for different purposes and, thus, manipulated in somewhat different ways. Despite the differences in the way the instruments are used, linguistically all these verbs have a basic activity LCS. In fact, all instrument verbs have this LCS even though there is apparent diversity among them. Thus, the verb sponge might be used in the description of removing events (e.g. Tyler sponged the stain off the fabric) and the verb whisk in the description of adding events (e.g.
Cameron whisked the sugar into the eggs), while the verbs rake and shovel might be used for either (e.g. Kelly shoveled the snow into the truck, Kelly shoveled the snow off the drive).

According to Rappaport Hovav & Levin (1998b), this diversity has a unified source: English allows the LCSs of all activity verbs to be "augmented" by the addition of a result state, giving rise to causative LCSs, such as those involved in the description of adding and removing events, via a process they call Template Augmentation. This process resembles Wunderlich's (1997a, 2000) notion of argument extension; cf. article 84 (Wunderlich) Operations on argument structure; see also Rothstein (2003) and Ramchand (2008). Whether an augmented instrument verb receives an adding or removing interpretation depends on whether the instrument involved is typically used to add or remove stuff.
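Template Augmentation can be pictured as an operation over such representations. The sketch below is our own schematic rendering of the idea (the helper names and the sample sentence are invented): an activity LCS is embedded under cause together with an added result state.

```python
# A schematic sketch of Template Augmentation; encoding and names are ours.
# [x act<ROOT> y]  ==>  [[x act<ROOT> y] cause [become [y <STATE>]]]

def activity(root):
    return (("act", root), "x", "y")

def augment(activity_lcs, result_state):
    """Embed an activity LCS in a causative LCS with an added result state."""
    return (activity_lcs, "cause", ("become", ("y", result_state)))

sweep = activity("SWEEP")
print(augment(sweep, "CLEAN"))
# ((('act', 'SWEEP'), 'x', 'y'), 'cause', ('become', ('y', 'CLEAN')))
# cf. an augmented use along the lines of 'Kelly swept the floor clean'
# (our example, not one from the text).
```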
In recent work, Rappaport Hovav & Levin (2010) suggest an independent characterization of manner and result roots by appealing to the notions of scalar and nonscalar change - notions which have their origins in Dowty (1979, 1991) and McClure (1994), as well as the considerable work on the role of scales in determining telicity (e.g. Beavers 2008, Borer 2005, Hay, Kennedy & Levin 1999, Jackendoff 1996b, Kennedy & Levin 2008, Krifka 1998, Ramchand 1997, Tenny 1994). As dynamic verbs, manner and result verbs all involve change, though crucially not the same type of change: result roots specify scalar changes, while manner roots do not. Verbs denoting events of scalar change in one argument lexically entail a scale: a set of degrees - points or intervals indicating measurement values - ordered on a particular dimension representing an attribute of an argument (e.g. height, temperature, cost) (Bartsch & Vennemann 1972, Kennedy 1999, 2001); the degrees indicate the possible values of this attribute. A scalar change in an entity involves a change in the value of the relevant attribute in a particular direction along the associated scale. The change of state verb widen is associated with a scale of increasing values on a dimension of width, and a widening event necessarily involves an entity showing an increase in the value along this dimension. A nonscalar change is any change that cannot be characterized in terms of a scale; such changes are typically complex, involving a combination of many changes at once. They are characteristic of manner verbs. For example, the verb sweep involves a specific movement of a broom against a surface that is repeated an indefinite number of times. See Rappaport Hovav (2008) for extensive illustration of the grammatical reflexes of the scalar/nonscalar change distinction.
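The scalar/nonscalar distinction can be given a compact formal rendering. In the sketch below (our own, with hypothetical names), a scale is a direction of change on a single attribute, and a scalar change is a transition whose before and after values are ordered accordingly; a nonscalar change has no single attribute for which such a check can be stated.

```python
# A minimal sketch of scalar change; encoding and names are ours.
from dataclasses import dataclass

@dataclass
class Scale:
    attribute: str           # the dimension, e.g. 'width' for widen
    increasing: bool = True  # direction of change lexicalized by the verb

def is_scalar_change(before: float, after: float, scale: Scale) -> bool:
    """A scalar change moves the attribute's value in the scale's direction."""
    return after > before if scale.increasing else after < before

width = Scale("width")
print(is_scalar_change(2.0, 3.5, width))  # True: a genuine widening event
print(is_scalar_change(3.5, 2.0, width))  # False: the value decreased
# A manner verb like sweep entails change, but no single ordered attribute,
# so no analogous check can be formulated for it.
```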
As this section makes clear, roots indirectly influence argument realization as their ontological type determines their association with a particular event structure. We leave open the question of whether roots can more directly influence argument realization. For example, the LCS proposed for cut in (5) includes elements of meaning which are normally associated with the root, since "contact" or a similar concept has not figured among proposals for the set of primitive predicates constituting an LCS. Yet, this element of meaning is implicated by Guerssel et al. in the conative alternation. (In contrast, the notion "effect" more or less reduces to a change of state of some type.)

4. Choosing primitive predicates
LCSs share with other forms of predicate decomposition the properties that are said to make such representations an improvement over lists of semantic roles, whether Fillmore's (1968) cases or Gruber (1965/1976) and Jackendoff's (1972) thematic relations, as structured representations of verb meaning. There is considerable discussion of the problems with providing independent, necessary and sufficient definitions of semantic roles (see e.g. Dowty 1991 and article 18 (Davis) Thematic roles), and one suggestion for dealing with this problem is the suggestion first found in Jackendoff (1972) that semantic
roles can be identified with particular open positions in predicate decompositions. For example, the semantic role "agent" might be identified with the first argument of a primitive predicate cause. There is a perception that the set of primitive predicates used in a verb's LCS or event structure is better motivated than the set of semantic role labels for its arguments, and for this reason predicate decompositions might appear to be superior to a list of semantic role labels as a structured representation of a verb's meaning. However, there is surprisingly little discussion of the explicit criteria for positing a particular primitive predicate, although see the discussion in Carter (1978), Jackendoff (1983: 203-204), and Joshi (1974). The primitive predicates which surface repeatedly in studies using LCSs or other forms of predicate decomposition are act or do, be, become or change, cause, and go, although the predicates have, move, stay, and, more recently, result are also proposed. Jackendoff (1990) posits a significantly greater number of predicates than in his previous work, introducing the predicates configure, extend, exchange, form, inch(oative), orient, and react. Article 32 (Hobbs) Word meaning and world knowledge discusses how some of these predicates may be grounded in an axiomatic semantics.

Once predicates begin to proliferate, theories of predicate decomposition face many of the well-known problems facing theories of semantic roles (cf. Dowty 1991). The question is whether it is possible to identify a small, comprehensive, universal, and well-motivated set of predicates accepted by all. It is worthwhile, therefore, to scrutinize the motivation for proposing a predicate in the first place and to try to make explicit when the introduction of a new predicate is justified. In positing a set of predicates, researchers have tried to identify recurring elements of verb meaning that figure in generalizations holding across the set of verbs within (and, ultimately, across) languages. Often these generalizations involve common entailments or common grammatical properties. Wilks (1987) sets out general desiderata for a set of primitive predicates that are implicit in other work. For instance, the set of predicates should be finite in size, and each predicate in the set should indeed be "primitive" in that it should not be reducible to other predicates in the set, nor should it even be partially definable in terms of another predicate. Thus, in positing a new predicate, it is important to consider its effect on the overall set of predicates. Wilks also proposes that the set of predicates should be able to exhaustively describe and distinguish the verbs of each language, but LCS-style representations, by adopting the root-event structure distinction, simply require that the set of primitives should be able to describe all the grammatically relevant event types. It is the role of the root to distinguish between specific verbs of the same event type, and there is a general, but implicit, assumption that the roots themselves cannot be reduced to a set of primitive elements. As Wilks (1987: 760) concludes, the ultimate justification for a set of primitives lies in their "special organizing role in a language system". We now briefly present several case studies chosen to illustrate the type of reasoning used in positing a predicate.
One way of arriving at a set of primitive predicates is to adopt a hypothesis that circumscribes the basic inventory of event types, while allowing for all events to be analyzed in terms of these types. This approach is showcased in the work of Jackendoff (1972, 1976, 1983, 1987, 1990), who develops ideas proposed by Gruber (1965/1976). Jackendoff adopts the localist hypothesis: motion and location events are basic and all other events should be construed as such events. There is one basic type of motion event, represented
by the primitive predicate go, which takes as arguments a theme and a path (e.g. The cart went from the farm to the market). There are two basic types of locational events, represented by the predicate be (for stative events) and stay (for non-stative events); these predicates also take a theme and a location as arguments (e.g. The coat was/stayed in the closet). In addition, Jackendoff introduces the predicates cause and let, which are used to form complex events taking as arguments a causer and a motion or location event. Events that are not obviously events of motion or location are construed in terms of some abstract form of motion or location. For example, with events of possession, possessums can be taken as themes and possessors as locations in an abstract possessional "field" or domain. The verb give is analyzed as describing a causative motion event in the possessional field in which a possessum is transferred from one possessor to another. Physical and mental states and changes of state can be seen as involving an entity being "located" in a state or "moving" from one state to a second state in an identificational field; the verb break, for instance, describes an entity moving from a state of being whole to a state of being broken. Generalizing, correspondences are set up between the components of motion and location events and the components of other event domains or "semantic fields" in Jackendoff's terms; this is what Jackendoff (1983: 188) calls the Thematic Relations Hypothesis; see article 30 (Jackendoff) Conceptual Semantics. In general on this view, predicates are most strongly motivated when they figure prominently in lexical organization and in cross-field generalizations.

This kind of cross-field organization can be illustrated in a number of ways. First, many English verbs have uses based on the same predicate in more than one field (Jackendoff 1983: 203-204). Thus, the predicate stay receives support from the English verb keep, whose meaning presumably involves the predicates cause and stay, combined in a representation as in (16), because it shows uses involving the three fields just introduced.

(16) [ cause (x, (stay y, z)) ]

(17) a. Terry kept the bike in the shed. (Positional)
     b. Terry kept the bike basket. (Possessional)
     c. Terry kept the bike clean. (Identificational)

Without the notion of semantic field, there would be no reason to expect English to have verbs which can be used to describe events which on the surface seem quite different from one another, as in (17). Second, rules of inference hold of shared predicates across fields (Jackendoff 1976). One example is that "if an event is caused, it takes place" (Jackendoff 1976: 110), so that the entailment The bike stayed in the shed can be derived from (17a), the entailment The bike basket stayed with Terry can be derived from (17b), and the entailment The bike stayed clean can be derived from (17c). This supports the use of the predicate cause across fields. Finally, predicates are justified when they explain cross-field generalizations in the use of prepositions. The use of the allative preposition to is taken to support the analysis of give as a causative verb of motion.
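The second kind of evidence - shared rules of inference - is easy to state mechanically. The sketch below (our own encoding, with invented names) implements "if an event is caused, it takes place" as a rule that strips the cause predicate from (16), yielding the stay entailments of (17a-c) in whatever semantic field they are couched.

```python
# A sketch of the cross-field inference rule 'if an event is caused, it takes
# place' (Jackendoff 1976); the encoding is ours, for illustration.

def caused_event_entailment(lcs):
    """From [cause (x, E)], derive the entailed event E."""
    if isinstance(lcs, tuple) and lcs[0] == "cause":
        return lcs[2]          # the embedded event, whatever its field
    return None

# (16) applied to the three fields in (17): keep = cause + stay
examples = [
    ("cause", "Terry", ("stay", "bike", "in the shed")),        # (17a)
    ("cause", "Terry", ("stay", "bike basket", "with Terry")),  # (17b)
    ("cause", "Terry", ("stay", "bike", "clean")),              # (17c)
]
for lcs in examples:
    print(caused_event_entailment(lcs))
# ('stay', 'bike', 'in the shed')        The bike stayed in the shed.
# ('stay', 'bike basket', 'with Terry')  The bike basket stayed with Terry.
# ('stay', 'bike', 'clean')              The bike stayed clean.
```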
These very considerations lead Carter (1978: 70-74) to argue against the primitive predicate stay. Carter points out that few English words have meanings that include the notion captured by stay, yet if stay were to number among the primitive predicates, such words would be expected to be quite prevalent. So, while the primitives cause and become are motivated because languages often contain a multitude of lexical items differentiated by just these predicates (e.g. the various uses of cool in The cook cooled the cake, The cake cooled, The cake was cool), there is no minimal pair differentiated by the existence of stay; the verb cool, for example, cannot also mean 'stay cool'. Carter also notes that if not is included in the set of predicates, then the predicate stay becomes superfluous, as it could be replaced by not plus change, a predicate which is roughly an analogue to Jackendoff's go. Carter further notes that as a result simpler statements of certain inference rules and other generalizations might be possible.

However, the primitive predicates which serve best as the basis for cross-field generalizations are not necessarily the ones that emerge from efforts to account for argument alternations - the efforts that lead to LCS-style representations and their descendants. This point can be illustrated by examining another predicate whose existence has been controversial, have. Jackendoff posits a possessional field that is modeled on the locational field: being possessed is taken to be similar to being at a location - existing at that location (see also Lyons 1967, 1968: 391-395). This approach receives support since many entailments involving location also apply to possession, such as the entailment described for stay. Furthermore, in some languages, including Hindi-Urdu, the same verb is used in basic locational and possessive sentences, suggesting that possession can be reduced to location (though the facts are often more complicated than they appear on the surface; see Harley 2003). Nevertheless, Pinker (1989: 189-190) and Tham (2004: 62-63, 74-85, 100-104) argue that an independent predicate have is necessary; see also Harley (2003). Pinker points out that in terms of the expression of its arguments, it is belong and not have which resembles locational predicates. The verb belong takes the possessum, which would be analyzed as a theme (i.e. located entity), as its subject and the possessor, which would be analyzed as a location, as an oblique. Its argument realization, then, parallels that of a locational predicate; compare (18a) and (18b). In contrast, have takes the possessor as subject and the possessum as object, as in (19), so its arguments show the reverse syntactic prominence relations - a "marked" argument realization, which would need an explanation on the localist analysis.

(18) a. One of the books belongs to me.
     b. One of the books is on the table.

(19) I have one of the books.

Pinker points out that an analysis which takes have to be a marked possessive predicate is incompatible with the observations that it is a high-frequency verb, which is acquired early and unproblematically by children. Tham (2004: 100-104) further points out that it is belong that is actually the "marked" verb from other perspectives: it imposes a referentiality condition on its possessum and it is used in a restricted set of information structure contexts - all restrictions that have does not share. Taking all these observations together, Tham concludes that have shows the unmarked realization of arguments for possessive predicates, while belong shows a marked realization of arguments.
Thus, she argues that the semantic prominence relations in unmarked possessive and locative sentences are quite different and, therefore, warrant positing a predicate have.
5. Subeventual analysis
One way in which more recent work on event structure departs from earlier work on LCSs is that it begins to use the structure of the semantic representation itself, rather than reference to particular predicates in this representation, in formulating generalizations about argument realization. In so doing, this work capitalizes on the assumption, present in some form since the generative semantics era, that predicate decompositions may have a subeventual analysis. Thus, it recognizes a distinction between two types of event structures: simple event structures and complex event structures, which themselves are constituted of simple event structures. The prototypical complex event structure is a causative event structure, in which an entity or event causes another event, though Ramchand (2008) takes some causative events to be constituted of three subevents: an initiating event, a process, and a result.

Beginning with the work of the generative semanticists, the positing of a complex event structure was supported using evidence from scope ambiguities involving various adverbial phrases (McCawley 1968, 1971, Morgan 1969, von Stechow 1995, 1996). Specifically, a complex event structure may afford certain adverbials, such as again, more scope-taking options than a simple event structure, and thus adverbials may show interpretations in sentences denoting complex events that are unavailable in those denoting simple events. Thus, (20) shows both so-called "restitutive" and "repetitive" readings, while (21) has only a "repetitive" reading.

(20) Dale closed the door again.
     Repetitive: the action of closing the door was performed before.
     Restitutive: the door was previously in the state of being closed, but there is no presupposition that someone had previously closed the door.

(21) John kicked the door again.
     Repetitive: the action of kicking the door was performed before.

The availability of two readings in (20) and one reading in (21) is explained by attributing a complex event structure to (20) and a simple event structure to (21).
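The two readings of (20) correspond to two attachment sites for again in the decomposition; schematically (our notation, in the spirit of the decompositional treatments cited above):

```latex
\textbf{Repetitive:}\quad
  \mathit{again}\,\big[\, x \;\textsc{cause}\; [\,\textsc{become}\;
  [\,\mathit{closed}(\mathit{door})\,]\,]\,\big]

\textbf{Restitutive:}\quad
  \big[\, x \;\textsc{cause}\; [\,\textsc{become}\;
  \mathit{again}\,[\,\mathit{closed}(\mathit{door})\,]\,]\,\big]
```

A simple event structure like that of kick offers only the outer attachment site, which is why (21) has only the repetitive reading.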
A recent line of research argues that the architecture of event structure also matters to argument realization, thus motivating complex event structures based on argument realization considerations. This idea is proposed by Grimshaw & Vikner (1993), who appeal to it to explain certain restrictions on the passivization of verbs of creation (though the pattern of acceptability that they are trying to explain turns out to have been mischaracterized; see Macfarland 1995). This idea is further exploited by Rappaport Hovav & Levin (1998a) and Levin & Rappaport Hovav (1999) to explain a variety of facts related to objecthood.

In this work, the notion of event complexity gains explanatory power via the assumption that there must be an argument in the syntax for each subevent in an event structure (Rappaport Hovav & Levin 1998a, 2001; see also Grimshaw & Vikner 1993 and van Hout 1996 for similar conditions). Given this assumption, a verb with a simple event structure may be transitive or intransitive, while a verb with a complex event structure, say a causative verb, must necessarily be transitive. Rappaport Hovav & Levin (1998a) attribute the necessary transitivity of break and melt, which contrasts with the "optional" transitivity of sweep and wipe, to this constraint; the former, as causative verbs, have a complex event structure.

(22) a. *Blair broke/melted.
     b. Blair wiped and swept.

Levin & Rappaport Hovav (1999) use the same assumption to explain why a resultative based on an unergative verb can only predicate its result state of the subject via a "fake" reflexive object.

(23) My neighbor talked *(herself) hoarse.

Resultatives have a complex, causative event structure, so there must be a syntactically realized argument representing the argument of the result state subevent; in (23) it is a reflexive pronoun as it is the subject which ends up in the result state. Levin (1999) uses the same idea to explain why agent-act-on-patient verbs are transitive across languages, while other two-argument verbs vary in their transitivity: only the former are required to be transitive. As in other work on LCSs and event structure, the use of subeventual structure is motivated by argument realization considerations.

Pustejovsky (1995) and van Hout (1996) propose an alternative perspective on event complexity: they take telic events, rather than causative events, to be complex events. Since most causative events are telic events, the two views of event complexity assign complex event structures in many of the same instances. A major difference is in the treatment of telic uses of manner of motion verbs such as Terry ran to the library and telic uses of consumption verbs such as Kerry ate the peach; these are considered complex predicates on the telicity approach, but not on the causative approach. See Levin (2000) for some discussion.

The subeventual analysis also defines hierarchical relations among arguments, allowing rules of argument realization to be formulated in terms of the geometry of the LCS. We now discuss advantages of such a formulation over direct reference to semantic roles.
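The argument-per-subevent condition itself can be stated as a simple check over event structures. The sketch below is our own illustration (names invented): it compares the number of subevents with the number of syntactically realized arguments, reproducing the judgments in (22).

```python
# A sketch of the argument-per-subevent condition (Rappaport Hovav & Levin
# 1998a); encoding and names are ours, for illustration.

def count_subevents(event_structure):
    """A causative event structure has a causing and a result subevent."""
    return 2 if event_structure["causative"] else 1

def realizable(event_structure, realized_args):
    """Each subevent must be identified by a realized argument."""
    return len(realized_args) >= count_subevents(event_structure)

break_es = {"root": "BREAK", "causative": True}   # complex event structure
sweep_es = {"root": "SWEEP", "causative": False}  # simple event structure

print(realizable(break_es, ["Blair"]))           # False: *Blair broke.
print(realizable(break_es, ["Blair", "vase"]))   # True: Blair broke the vase.
print(realizable(sweep_es, ["Blair"]))           # True: Blair swept.
```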
6. LCSs and syntax
LCSs, as predicate decompositions, include the embedding of constituents, giving rise to a hierarchical structure. This hierarchical structure, which includes the subeventual structure discussed in section 5, allows a notion of semantic prominence to be defined, which mirrors the notion of syntactic prominence. For instance, Wunderlich (1997a, 1997b, 2006) introduces a notion of a-command defined over predicate decompositions, which is an analogue of syntactic c-command. By taking advantage of the hierarchical structure of LCSs, it becomes possible to formulate argument realization rules in terms of the geometry of LCSs and, more importantly, to posit natural constraints on the nature of the mapping between LCS and syntax. As discussed at length in Levin & Rappaport Hovav (2005: chapter 5), many researchers assume that the mapping between lexical semantics and syntax obeys a constraint of prominence preservation: relations of semantic prominence in a semantic representation are preserved in the corresponding syntactic representation, so that prominence in the syntax reflects prominence in semantics (Bouchard 1995). This idea is implicit in the many studies that use a thematic hierarchy - a ranking of semantic roles - to guide the semantics-syntax mapping and explain various other facets of grammatical behavior; however, most work adopting a thematic hierarchy does not provide independent motivation for the posited role ranking. Predicate decompositions can provide some substance to the notion of a thematic hierarchy by correlating the position of a role in the hierarchy with the position of the argument bearing that role in a predicate decomposition (Baker 1997, Croft 1998, Kiparsky 1997, Wunderlich 1997a, 1997b, 2000). (There are other ways to ground the thematic hierarchy; see article 18 (Davis) Thematic roles.)
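Grounding the hierarchy in geometry can be sketched concretely: in the toy code below (entirely our own), an argument's prominence is identified with its embedding depth in the decomposition, and prominence preservation amounts to realizing less deeply embedded arguments in higher syntactic positions.

```python
# A sketch of geometry-based prominence: rank arguments by embedding depth in
# the decomposition (shallower = more prominent). Encoding is ours.

def argument_depths(lcs, depth=0, found=None):
    found = {} if found is None else found
    for part in lcs:
        if isinstance(part, tuple):
            argument_depths(part, depth + 1, found)
        elif isinstance(part, str) and part in ("x", "y", "z"):
            found[part] = min(depth, found.get(part, depth))
    return found

# [[x act] cause [become [y <BROKEN>]]]
causative_break = (("act", "x"), "cause", ("become", ("y", "BROKEN")))
depths = argument_depths(causative_break)
print(sorted(depths, key=depths.get))   # ['x', 'y']: x outranks y, so x is
# realized as subject and y as object, preserving semantic prominence.
```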
Researchers such as Bouchard (1995), Kiparsky (1997), and Wunderlich (1997a, 1997b) assume that predicate decompositions constitute a lexical semantic representation, but many other researchers now assume that predicate decompositions are syntactic representations, built from syntactic primitives and constrained by principles of syntax. This move obviates the need for prominence preservation in the semantics-syntax mapping since the lexical semantic representation and the syntactic representation are one. The idea that predicate decompositions motivated by semantic considerations are remarkably similar to syntactic structures, and thus should be taken to be syntactic structures, has previously been made explicit in the generative semantics literature (e.g. Morgan 1969). Hale & Keyser were the first to articulate this position in the context of current syntactic theory; their LCS-style syntactic structures are called "lexical relational structures" in some of their work (1992, 1993, 1997). The proposal that predicate decompositions should be syntactically instantiated has gained currency recently and is defended or assumed in a range of work, including Erteschik-Shir & Rapoport (2004), Mateu (2001a, 2001b), Travis (2000), and Zubizarreta & Oh (2007), who all build directly on Hale & Keyser's work, as well as Alexiadou, Anagnostopoulou & Schäfer (2006), Harley (2003, 2005), Pylkkänen (2008), and Ramchand (2008), who calls the "lexical" part of syntax "first phase syntax". We now review the types of arguments adduced in support of this view.

Hale & Keyser (1993) claim that their approach explains why there are few semantic role types (although this claim is not entirely uncontroversial; see Dowty 1991 and Kiparsky 1997). For them, the argument structure of a verb is syntactically defined and represented. Furthermore, individual lexical categories (V, in particular) are constrained so as to project syntactic structures using just the syntactic notions of specifier and complement. These syntactic structures are associated with coarse-grained semantic notions, often corresponding to the predicates typical in standard predicate decompositions; the positions in these structures correspond to semantic roles, just as the argument positions in a standard predicate decomposition are said to correspond to semantic roles; see section 4. For example, the patient of a change of state verb is the specifier of a verbal projection in which a verb takes an adjective complement (i.e. the N position in '[V N [V V A]]'). Since the number of possible structural relations in syntax is limited, the number of expressible semantic roles is also limited.

Furthermore, Hale & Keyser suggest that the nature of these syntactic representations of argument structure also provides insight into Baker's (1988: 46, 1997) Uniformity of Theta Role Assignment, which requires that identical semantic relationships between items be represented by identical structural relations between those items at d-structure. On Hale & Keyser's approach, this principle must follow since semantic roles are defined over a hierarchical syntactic structure, with a particular role always having the same instantiation. These ideas are developed further
in Ramchand (2008), who assumes a slightly more articulated lexical syntactic structure than Hale & Keyser do.

Hale & Keyser provide support for their proposal that verb meanings have internal syntactic structure from the syntax of denominal verbs. They observe that certain impossible types of denominal verbs parallel impossible types of noun incorporation. In particular, they argue that just as there is no noun incorporation of either agents or recipients in languages with noun incorporation (Baker 1988: 453-454, fn. 13, 1996: 291-295), so there is no productive denominal verb formation where the base noun is interpreted as an agent or a recipient (e.g. *I churched the money, where church is understood as a recipient; *It cowed a calf, where cow is understood as the agent). However, denominal verbs are productively formed from nouns analyzed as patients (e.g. I buttered the pan) and containers or locations (e.g. I bottled the wine). Hale & Keyser argue that the possible denominal verb types follow on the assumption that these putative verbs are derived from syntactic structures in which the base noun occupies a position which reflects its semantic role. This point is illustrated using their representations for the verbs paint and shelve given in (24) and (25), respectively, which are presented in the linearized form given in Wechsler (2006: 651, ex. (17)-(18)) rather than in Hale & Keyser's (1993) tree representations.

(24) a. We painted the house.
     b. We [V' V1 [VP house [V' V2 [PP Pwith paint]]]].

(25) a. We shelved the books.
     b. We [V' V1 [VP books [V' V2 [PP Pon shelf]]]].

The verbs paint and shelve are derived through the movement of the base noun - the verb's root - into the empty V1 position in the structures in (b), after first merging it with the preposition Pwith or Pon and then with V2. Hale & Keyser argue that the movement of the root is subject to a general constraint on the movement of heads (Baker 1988, Travis 1984). Likewise, putative verbs such as bush or house, used in sentences such as *I bushed a trim (with the intended interpretation 'I gave the bush a trim') or *I housed a coat of paint (with the intended interpretation 'I gave the house a coat of paint'), are also excluded by the same constraint.

Hale & Keyser's approach is sharply criticized by Kiparsky (1997). He points out that the syntax alone will not prevent the derivation of sentences such as *I bushed some fertilizer from a putative source corresponding to I put some fertilizer on the bush (cf. the source for I bottled the wine or I corralled the horse). The association of a particular root with a specific lexical syntactic structure is governed by conceptual principles such as "If an action refers to a thing, it involves a canonical use of the thing" (Kiparsky 1997: 482). Such principles ensure that bush will not be inserted into a structure as a location, since unlike a bottle, its canonical use is not as a container or place.

Denominal verbs in English, although they do not involve any explicit verb-forming morphology, have been said, then, to parallel constructions in other languages which do involve overt morphological or syntactic derivation (e.g. noun incorporation), and this parallel has been taken to support the syntactic derivation of these words.
Comparable arguments can be made in other areas of the English lexicon. For example, in most languages of the world, manner of motion verbs cannot on their own license a directional phrase, though in English all manner of motion verbs can. Specifically, sentences parallel to the English Tracy ambled into the room are derived through a variety of morphosyntactic means in other languages, including the use of directional suffixes or applicative morphemes on manner of motion verbs and the combination of manner and directed motion verbs in compounds or serial verb constructions (Schaefer 1985, Slobin 2004, Talmy 1991). Japanese, for example, must compound a manner of motion verb with a directed motion verb in order to express manner of motion to a goal, as the contrast in (26) shows.

(26) a. ?John-wa kishi-e oyoida.
        John-Top shore-to swam
        'John swam to the shore.' (intended; Yoneyama 1986: 1, ex. (1b))
     b. John-wa kishi-e oyoide-itta.
        John-Top shore-to swimming-went
        'John swam to the shore.' (Yoneyama 1986: 2, ex. (3b))

The association of manner of motion roots with a directed motion event type is accomplished in theories such as Rappaport Hovav & Levin (1998b, 2001) and Levin & Rappaport Hovav (1999) by processes which augment the event structure associated with the manner of motion verbs. But such theories never make explicit exactly where the derivation of these extended structures takes place. Ramchand (2008) and Zubizarreta & Oh (2007) argue that since these processes are productive and their outputs are to a large extent compositional, they should be assigned to the "generative computational" module of the grammar, namely, syntax. Finally, as Ramchand explicitly argues, this syntacticized approach suggests that languages that might appear quite different are, in fact, underlyingly quite similar, once lexical syntactic structures are considered.

Finally, we comment on the relation between structured event representations and the lexical entries of verbs. Recognizing that many roots in English can appear as words belonging to several lexical categories and that verbs themselves can be associated with various event structures, Borer (2005) articulates a radically nonlexical position: she proposes that roots are category neutral. That is, there is no association specified in the lexicon between roots and the event structures they appear with. Erteschik-Shir & Rapoport (2004, 2005), Levin & Rappaport Hovav (2005), Ramchand (2008), and Rappaport Hovav & Levin (2005) all point out that even in English the flexibility of this association is still limited by the semantics of the root. Ramchand includes in the lexical entries of verbs the parts of the event structure that a verbal root can be associated with, while Levin & Rappaport Hovav make use of "canonical realization rules", which pair roots with event structures based on their ontological types, as discussed in section 3.

7. Conclusion
LCSs are a form of predicate decomposition intended to capture those facets of verb meaning which determine grammatical behavior, particularly in the realm of argument realization. Research on LCSs and the structured representations that are their descendants has contributed to our understanding of the nature of verb meaning and the
relation between verb syntax and semantics. This research has shown the importance of semantic representations that distinguish between root and event structure, as well as the importance of the architecture of the event structure to the determination of grammatical behavior. Furthermore, such developments have led some researchers to propose that representations of verb meaning should be syntactically instantiated.

This research was supported by Israel Science Foundation Grants 806-03 and 379-07 to Rappaport Hovav. We thank Scott Grimm, Marie-Catherine de Marneffe, and Tanya Nikitina for comments on an earlier draft.

8. References
Alexiadou, Artemis, Elena Anagnostopoulou & Florian Schäfer 2006. The properties of anticausatives crosslinguistically. In: M. Frascarelli (ed.). Phases of Interpretation. Berlin: Mouton de Gruyter, 187-211.
Aronoff, Mark 1993. Morphology by Itself. Stems and Inflectional Classes. Cambridge, MA: The MIT Press.
Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5-16.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL: The University of Chicago Press.
Baker, Mark C. 1996. The Polysynthesis Parameter. New York: Oxford University Press.
Baker, Mark C. 1997. Thematic roles and syntactic structure. In: L. Haegeman (ed.). Elements of Grammar. Handbook of Generative Syntax. Dordrecht: Kluwer, 73-137.
Bartsch, Renate & Theo Vennemann 1972. The grammar of relative adjectives and comparison. Linguistische Berichte 20, 19-32.
Beavers, John 2008. Scalar complexity and the structure of events. In: J. Dölling & T. Heyde-Zybatow (eds.). Event Structures in Linguistic Form and Interpretation. Berlin: de Gruyter, 245-265.
Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford University Press.
Bouchard, Denis 1995. The Semantics of Syntax. A Minimalist Approach to Grammar. Chicago, IL: The University of Chicago Press.
Bresnan, Joan 1982. The passive in lexical theory. In: J. Bresnan (ed.). The Mental Representation of Grammatical Relations. Cambridge, MA: The MIT Press, 3-86.
Carter, Richard J. 1978. Arguing for semantic representations. Recherches Linguistiques de Vincennes 5-6, 61-92.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 21-63.
Doron, Edit 2003. Agency and voice. The semantics of the Semitic templates. Natural Language Semantics 11, 1-67.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67, 547-619.
Erteschik-Shir, Nomi & Tova R. Rapoport 2004. Bare aspect. A theory of syntactic projection. In: J. Guéron & J. Lecarme (eds.). The Syntax of Time. Cambridge, MA: The MIT Press, 217-234.
Erteschik-Shir, Nomi & Tova R. Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 65-86.
Farmer, Ann K. 1984. Modularity in Syntax. A Study of Japanese and English. Cambridge, MA: The MIT Press.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R. T. Harms (eds.). Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 1-88.
Fillmore, Charles J. 1970. The grammar of hitting and breaking. In: R. A. Jacobs & P. S. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 120-133.
Fodor, Jerry & Ernest Lepore 1999. Impossible words? Linguistic Inquiry 30, 445-453.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA: The MIT Press.
Grimshaw, Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Grimshaw, Jane & Sten Vikner 1993. Obligatory adjuncts and the structure of events. In: E. Reuland & W. Abraham (eds.). Knowledge and Language, vol. 2: Lexical and Conceptual Structure. Dordrecht: Kluwer, 143-155.
Gruber, Jeffrey S. 1965/1976. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-Holland, 1976, 1-210.
Guerssel, Mohamed et al. 1985. A cross-linguistic study of transitivity alternations. In: W. H. Eilfort, P. D. Kroeber & K. L. Peterson (eds.). Papers from the Parasession on Causatives and Agentivity. Chicago, IL: Chicago Linguistic Society, 48-63.
Hale, Kenneth L. & Samuel J. Keyser 1987. A View from the Middle. Lexicon Project Working Papers 10. Cambridge, MA: Center for Cognitive Science, MIT.
Hale, Kenneth L. & Samuel J. Keyser 1992. The syntactic character of thematic structure. In: I. M. Roca (ed.). Thematic Structure. Its Role in Grammar. Berlin: Foris, 107-143.
Hale, Kenneth L. & Samuel J. Keyser 1993. On argument structure and the lexical expression of syntactic relations. In: K. L. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: The MIT Press, 53-109.
Hale, Kenneth L. & Samuel J. Keyser 1997. On the complex nature of simple predicators. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29-65.
Hale, Kenneth L. & Samuel J. Keyser 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: The MIT Press.
Harley, Heidi 2003. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.). Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31-70.
Harley, Heidi 2005. How do verbs get their names? Denominal verbs, manner incorporation and the ontology of verb roots in English. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 42-64.
Haspelmath, Martin 1993. More on the typology of inchoative/causative verb alternations. In: B. Comrie & M. Polinsky (eds.). Causatives and Transitivity. Amsterdam: Benjamins, 87-120.
Hay, Jennifer, Christopher Kennedy & Beth Levin 1999. Scalar structure underlies telicity in 'degree achievements'. In: T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and Linguistic Theory (=SALT) IX. Ithaca, NY: Cornell University, 127-144.
van Hout, Angeliek 1996. Event Semantics of Verb Frame Alternations. A Case Study of Dutch and Its Acquisition. Doctoral dissertation. Katholieke Universiteit Brabant, Tilburg.
Jackendoff, Ray S. 1972.
Semantic Interpretation in Generative Grammar. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1976. Toward an explanatory semantic representation. Linguistic Inquiry 7, 89-150.
Jackendoff, Ray S. 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18, 369-411.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray S. 1996a. Conceptual semantics and cognitive linguistics. Cognitive Linguistics 7, 93-129.
Jackendoff, Ray S. 1996b. The proper treatment of measuring out, telicity, and perhaps even quantification in English. Natural Language and Linguistic Theory 14, 305-354.
Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidrich (ed.). Semantics and Communication. Amsterdam: North-Holland, 251-283.
Kennedy, Christopher 1999. Projecting the Adjective. The Syntax and Semantics of Gradability and Comparison. New York: Garland.
Kennedy, Christopher 2001. Polar opposition and the ontology of 'degrees'. Linguistics & Philosophy 24, 33-70.
Kennedy, Christopher & Beth Levin 2008. Measure of change. The adjectival core of degree achievements. In: L. McNally & C. Kennedy (eds.). Adjectives and Adverbs. Syntax, Semantics, and Discourse. Oxford: Oxford University Press, 156-182.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 473-499.
Koontz-Garboden, Andrew 2007. States, Changes of State, and the Monotonicity Hypothesis. Ph.D. dissertation. Stanford University, Stanford, CA.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 197-235.
Lakoff, George 1968. Some verbs of change and causation. In: S. Kuno (ed.). Mathematical Linguistics and Automatic Translation. Report NSF-20. Cambridge, MA: Aiken Computation Laboratory, Harvard University.
Lakoff, George 1970. Irregularity in Syntax. New York: Holt, Rinehart & Winston.
Laughren, Mary 1988. Towards a lexical representation of Warlpiri verbs. In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 215-242.
Levin, Beth 1999. Objecthood. An event structure perspective. In: S. Billings, J. Boyle & A. Griffith (eds.). Proceedings of the Chicago Linguistic Society (=CLS) 35, Part I: Papers from the Main Session. Chicago, IL: Chicago Linguistic Society, 223-247.
Levin, Beth 2000. Aspect, lexical semantic representation, and argument expression. In: L. J. Conathan et al. (eds.). Proceedings of the 26th Annual Meeting of the Berkeley Linguistics Society (=BLS). General Session and Parasession on Aspect. Berkeley, CA: Berkeley Linguistic Society, 413-429.
Levin, Beth & Malka Rappaport Hovav 1991. Wiping the slate clean. A lexical semantic exploration. Cognition 41, 123-151.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics Interface. Cambridge, MA: The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1999. Two structures for compositionally derived events. In: T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and Linguistic Theory (=SALT) IX. Ithaca, NY: Cornell University, 199-223.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge University Press.
Lyons, John 1967. A note on possessive, existential and locative sentences. Foundations of Language 3, 390-396.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Macfarland, Talke 1995. Cognate Objects and the Argument/Adjunct Distinction in English. Ph.D. dissertation. Northwestern University, Evanston, IL.
Marantz, Alec P. 1984. On the Nature of Grammatical Relations. Cambridge, MA: The MIT Press.
Mateu, Jaume 2001a. Small clause results revisited. In: N. Zhang (ed.). Syntax of Predication. ZAS Papers in Linguistics 26.
Berlin: Zentrum für Allgemeine Sprachwissenschaft.
Mateu, Jaume 2001b. Unselected objects. In: N. Dehé & A. Wanner (eds.). Structural Aspects of Semantically Complex Verbs. Frankfurt/M.: Lang, 83-104.
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure. In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society (=CLS). Chicago, IL: Chicago Linguistic Society, 71-80.
McCawley, James D. 1971. Prelexical syntax. In: R. J. O'Brien (ed.). Report of the 22nd Annual Roundtable Meeting on Linguistics and Language Studies. Washington, DC: Georgetown University Press, 19-33.
McClure, William T. 1994. Syntactic Projections of the Semantics of Aspect. Ph.D. dissertation. Cornell University, Ithaca, NY.
Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1, 49-70.
Nida, Eugene A. 1975. Componential Analysis of Meaning. An Introduction to Semantic Structures. The Hague: Mouton.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA: The MIT Press.
Pesetsky, David M. 1982. Paths and Categories. Ph.D. dissertation. MIT, Cambridge, MA.
Pesetsky, David M. 1995. Zero Syntax. Experiencers and Cascades. Cambridge, MA: The MIT Press.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, MA: The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Pylkkänen, Liina 2008. Introducing Arguments. Cambridge, MA: The MIT Press.
Ramchand, Gillian C. 1997. Aspect and Predication. The Semantics of Argument Structure. Oxford: Clarendon Press.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge: Cambridge University Press.
Rappaport Hovav, Malka 2008. Lexicalized meaning and the internal temporal structure of events. In: S. Rothstein (ed.). Theoretical and Crosslinguistic Approaches to the Semantics of Aspect. Amsterdam: Benjamins, 13-42.
Rappaport Hovav, Malka & Beth Levin 1998a. Morphology and lexical semantics. In: A. Spencer & A. Zwicky (eds.). The Handbook of Morphology. Oxford: Blackwell, 248-271.
Rappaport Hovav, Malka & Beth Levin 1998b. Building verb meanings. In: M. Butt & W. Geuder (eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications, 97-134.
Rappaport Hovav, Malka & Beth Levin 2001. An event structure account of English resultatives. Language 77, 766-797.
Rappaport Hovav, Malka & Beth Levin 2005. Change of state verbs. Implications for theories of argument projection. In: N. Erteschik-Shir & T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 276-286.
Rappaport Hovav, Malka & Beth Levin 2008. The English dative alternation. The case for verb sensitivity. Journal of Linguistics 44, 129-167.
Rappaport Hovav, Malka & Beth Levin 2010. Reflections on manner/result complementarity. In: E. Doron, M. Rappaport Hovav & I. Sichel (eds.). Syntax, Lexical Semantics, and Event Structure. Oxford: Oxford University Press, 21-38.
Rappaport, Malka, Beth Levin & Mary Laughren 1988. Niveaux de représentation lexicale. Lexique 7, 13-32. English translation in: J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht: Kluwer, 1993, 37-54.
Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 70-126.
Rothstein, Susan 2003. Structuring Events. A Study in the Semantics of Aspect. Oxford: Blackwell.
Schaefer, Ronald P. 1985. Motion in Tswana and its characteristic lexicalization. Studies in African Linguistics 16, 57-87.
Slobin, Dan I. 2004.
The many ways to search for a frog. Linguistic typology and the expression of motion events. In: S. Strömqvist & L. Verhoeven (eds.). Relating Events in Narrative, vol. 2: Typological and Contextual Perspectives. Mahwah, NJ: Erlbaum, 219-257.
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli et al. (eds.). Lexical Knowledge in the Organization of Language. Amsterdam: Benjamins, 81-118.
von Stechow, Arnim 1996. The different readings of wieder. A structural account. Journal of Semantics 13, 87-138.
Stowell, Timothy 1981. Origins of Phrase Structure. Ph.D. dissertation. MIT, Cambridge, MA.
Talmy, Leonard 1975. Semantics and syntax of motion. In: J. P. Kimball (ed.). Syntax and Semantics 4. New York: Academic Press, 181-238.
Talmy, Leonard 1985. Lexicalization patterns. Semantic structure in lexical forms. In: T. Shopen (ed.). Language Typology and Syntactic Description, vol. 3: Grammatical Categories and the Lexicon. Cambridge: Cambridge University Press, 57-149.
Talmy, Leonard 1991. Path to realization. A typology of event conflation. In: L. A. Sutton, C. Johnson & R. Shields (eds.). Proceedings of the 17th Annual Meeting of the Berkeley Linguistics Society (=BLS). General Session and Parasession. Berkeley, CA: Berkeley Linguistic Society, 480-519.
Talmy, Leonard 2000. Toward a Cognitive Semantics, vol. 2: Typology and Process in Concept Structuring. Cambridge, MA: The MIT Press.
Taylor, John R. 1996. On running and jogging. Cognitive Linguistics 7, 21-34.
Tenny, Carol L. 1994. Aspectual Roles and the Syntax-Semantics Interface. Dordrecht: Kluwer.
Tham, Shiao Wei 2004. Representing Possessive Predication. Semantic Dimensions and Pragmatic Bases. Ph.D. dissertation. Stanford University, Stanford, CA.
Travis, Lisa D. 1984. Parameters and Effects of Word Order Variation. Ph.D. dissertation. MIT, Cambridge, MA.
Travis, Lisa D. 2000. The l-syntax/s-syntax boundary. Evidence from Austronesian. In: I. Paul, V. Phillips & L. Travis (eds.). Formal Issues in Austronesian Linguistics. Dordrecht: Kluwer, 167-194.
van Valin, Robert D. 1993. A synopsis of role and reference grammar. In: R. D. van Valin (ed.). Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1-164.
van Valin, Robert D. 2005. Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function. Cambridge: Cambridge University Press.
Wechsler, Stephen 2006. Thematic structure. In: K. Brown (ed.). Encyclopedia of Language and Linguistics. 2nd edn. Amsterdam: Elsevier, 645-653. 1st edn. 1994.
Wilks, Yorick 1987. Primitives. In: S. C. Shapiro (ed.). Encyclopedia of Artificial Intelligence. New York: Wiley, 759-761.
Williams, Edwin 1981. Argument structure and morphology. The Linguistic Review 1, 81-114.
Wunderlich, Dieter 1997a. Argument extension by lexical adjunction. Journal of Semantics 14, 95-142.
Wunderlich, Dieter 1997b. cause and the structure of verbs. Linguistic Inquiry 28, 27-68.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options. A study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich (eds.). Lexicon in Focus. Berlin: Akademie Verlag, 247-270.
Wunderlich, Dieter 2006. Argument hierarchy and other factors determining argument realization. In: I. Bornkessel et al. (eds.). Semantic Role Universals and Argument Linking. Theoretical, Typological, and Psycholinguistic Perspectives. Berlin: Mouton de Gruyter, 15-52.
Yoneyama, Mitsuaki 1986. Motion verbs in conceptual semantics. Bulletin of the Faculty of Humanities 22. Tokyo: Seikei University, 1-15.
Zubizarreta, Maria Luisa & Eunjeong Oh 2007. On the Syntactic Composition of Manner and Motion. Cambridge, MA: The MIT Press.

Beth Levin, Stanford, CA (USA)
Malka Rappaport Hovav, Jerusalem (Israel)
20. Idioms and collocations

1. Introduction: Collocation, collocations, and idioms
2. Lexical properties of idioms
3. Compositionality
4. How frozen are idioms?
5. Syntactic flexibility
6. Morphosyntax
7. Idioms as constructions
8. Diachronic changes
9. Idioms in the lexicon
10. Idioms in the mental lexicon
11. Summary and conclusion
12. References

Abstract

Idioms constitute a subclass of multi-word units that exhibit strong collocational preferences and whose meanings are at least partially non-compositional. Drawing on English and German corpus data, we discuss a number of lexical, syntactic, morphological and semantic properties of Verb Phrase idioms that distinguish them from freely composed phrases. The classic view of idioms as "long words" admits of little or no variation of a canonical form. Fixedness is thought to reflect semantic non-compositionality: the nonavailability of semantic interpretation for some or all idiom constituents and the impossibility of parsing syntactically ill-formed idioms block regular grammatical operations. However, corpus data testify to a wide range of discourse-sensitive flexibility and variation, weakening the categorical distinction between idioms and freely composed phrases. We cite data indicating that idioms are subject to the same diachronic developments as simple lexemes. Finally, we give a brief overview of psycholinguistic research into the processing of idioms and attempts to determine their representation in the mental lexicon.

1. Introduction: Collocation, collocations, and idioms

Words in text and speech do not co-occur freely but follow rules and patterns. We draw an initial three-fold distinction among recurrent lexical co-occurrences: collocation, the patterning of words found in close neighborhood, and collocations and idioms, multi-word units with lexical status that are often distinguished in terms of their fixedness and semantic non-compositionality. The remainder of the paper will focus on idioms, in particular verb phrase idioms. We describe their syntactic, morphosyntactic and lexical properties with respect to frozenness and variation. The data examined here show a sliding scale of fixedness and a concomitant range of semantic compositionality. We next consider the diachronic behavior of idioms, which does not seem to differ from that of simple lexemes. Finally, several theories concerning the mental representation of idioms are discussed.
1.1. Collocation: A statistical property of language

Collocation, the co-occurrence patterns of words observed in spoken and written language, is constrained by syntactic (grammatical), semantic, and lexical properties of words. At each level, linguists have attempted to formulate rules and constraints for co-occurrence. The syntactic constraints on lexical co-occurrence are specific to a word but constrained by its syntactic class membership. For example, the adjective prone, unlike hungry and tired, subcategorizes for a Prepositional Phrase headed by to; moreover, hungry and tired, but not prone, can occur pre-nominally. Firth (1957) called the syntactic constraints on a word's selection of neighboring words "colligation". Firth moreover recognized that words display collocational properties beyond those imposed by syntax. He coined the term "collocation" for the "habitual or customary places" of a word. Firth's statement that "you can recognize a word by the company it keeps" has been given scientific validation by corpus studies establishing lexical profiles for words, which reflect their regular attraction to other words.

Co-occurrence properties of individual lexical items can be expressed quantitatively (Halliday 1966, Sinclair 1991, Stubbs 2001, Krenn 2000, Evert 2005, inter alia). Church & Hanks (1990) and Church et al. (1991) proposed Mutual Information as a measure of the tendency of word forms to select or deselect one another in a context. Mutual Information compares the observed probability of two lexemes' co-occurrence with the probability expected if each occurred independently, i.e., if their co-occurrence were merely chance (a minimal computational sketch is given at the end of this section). Mutual Information is a powerful tool for measuring the degree of fixedness, and thus lexical status, of word groups. Highly fixed expressions that have been identified statistically are candidates for inclusion in the lexicon, regardless of their semantic transparency. Fixed expressions are also likely to be stored as units in speakers' mental lexicons, and retrieved as units rather than composed anew each time. Psycholinguists have long been aware of people's tendency to chunk (Miller 1956).

Statistical analyses can also quantify the flexibility of fixed expressions, which are rarely completely frozen (see sections 3 and 4). Fazly & Stevenson (2006) propose a method for the automatic discovery of the set of syntactic variations that VP idioms can undergo and that should be included in their lexical representation. Fazly & Stevenson further incorporate this information into statistical measures that effectively predict the idiomaticity level of a given expression. Others have measured the distributional similarity between an expression and its constituents (McCarthy, Keller & Carroll 2003, Baldwin et al. 2003).

The collocational properties of a language are perhaps best revealed in the errors committed by learners and non-native speakers. A fluent speaker may have sufficient command of the language to find the words that express an intended concept. However, only considerable competence allows a speaker to select, from among several near-synonyms, the one favored by its neighbors; to the learner, the subtleties of lexical preference appear unmotivated and arbitrary.
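The Mutual Information measure discussed above can be made concrete with a small computational sketch. The following is a minimal illustration under simple assumptions (raw unigram and bigram counts drawn from a tokenized corpus; the function and variable names are ours, not part of Church & Hanks' presentation):

    import math

    def pmi(count_xy, count_x, count_y, n):
        """Pointwise mutual information in the spirit of Church & Hanks
        (1990): the log of the ratio between the observed probability
        of the word pair and the probability expected if x and y
        co-occurred only by chance (i.e., independently)."""
        p_xy = count_xy / n   # observed co-occurrence probability
        p_x = count_x / n     # probability of x on its own
        p_y = count_y / n     # probability of y on its own
        return math.log2(p_xy / (p_x * p_y))

A strongly collocating pair such as the parts of blood-red will score far above zero, whereas a pair whose co-occurrence frequency is no higher than chance predicts will score near zero.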
1.2. Collocations: Phrasal lexical items

We distinguish collocation, the linguistic and statistically measurable phenomenon of co-occurrence, from collocations, specific instances of collocation that have lexical status. Collocations like beach house, eat up, and blood-red are mappings of word forms and word meanings, much like simple lexemes such as house and eat and red. Collocations are "pre-fabricated" combinations of simple lexemes.

Jackendoff (1997) collected multi-word units (MWUs) from the popular U.S. television show "Wheel of Fortune" and showed the high frequency of collocations and phrases, both in terms of types and tokens. He estimates that MWUs constitute at least half the entire lexicon; Cowie (1998) reports the percentage of verb phrase idioms and collocations in news stories and feature articles to be around forty percent. Such figures make it clear that MWUs constitute a significant part of the lexicon and that collocating is a pervasive aspect of linguistic behavior.

Collocations include a wide range of MWUs. A syntax-based typology (e.g., Moon 1998) distinguishes sentential proverbs (the early bird gets the worm), routine formulae (thanks a lot), and adjective phrases expressing similes (sharp as a whistle). Frames or constructions (Fillmore, Kay & O'Connor 1988) consist of closed-class words with slots for open-class words, such as what's X doing Y? and the X-er the Y-er; these frames carry meaning independent of the particular lexemes that fill the open slots. A large class of collocations are support (or "light") verb constructions, where a noun collocates with a specific verbal head, such as make a decision and take a photograph (Storrer 2007).

Under a semantically based classification, phrasal lexemes form a continuous scale of fixedness and semantic transparency. Following Moon (1998), we say that collocations are decomposable multi-word units that often follow the paradigmatic patterns of free language. For example, a number of German verb phrase collocations show the causative/inchoative/stative alternation pattern, e.g., in Rage bringen/kommen/sein (make/become/be enraged).

1.3. Idioms

Idioms like hit the ceiling, lose one's head, and when the cows come home constitute another class of MWUs with lexical status. They differ from the kinds of collocations discussed in the previous section in several respects. To begin with, the lexical make-up of idioms is usually unpredictable and often highly idiosyncratic, violating the usual rules of selectional restrictions. Examples are English rain cats and dogs and talk turkey, and German unter dem Pantoffel stehen (lit. 'stand under the slipper', be dominated). Some idioms have a possible literal (non-idiomatic) interpretation, such as drag one's feet and not move a finger.

Certain idioms are syntactically ill-formed, such as English trip the light fantastic and German an jemandem einen Narren gefressen haben (lit. 'have eaten a fool on/at somebody', be infatuated with somebody) or Bauklötze staunen (lit. 'be astonished toy blocks', be very astonished), which violate the verb's subcategorization properties. Others, like by and large, do not constitute syntactic categories.

Perhaps the most characteristic feature ascribed to idioms is their semantic non-compositionality. Because the meaning of an idiom is not made up of the meanings of its constituents (or only to a limited extent), idiom meanings are considered largely opaque. A common argument says that if the components of idioms are semantically non-transparent to speakers, they are not available for the kinds of grammatical operations found in free, literal language.
As a result, idioms are often considered fixed, "frozen" expressions, or "long words" (Swinney & Cutler 1979, Bobrow & Bell 1973).
We will consider each of these properties attributed to idioms, focusing largely on verb phrase (VP) idioms, which have a verbal head that selects for at least one noun phrase or prepositional phrase complement. Such idioms have generated much discussion, focused mostly on putative constraints on their syntactic frozenness. VP idioms were also found to be the most frequent type of MWUs in the British Hector Corpus (Moon 1998).

2. Lexical properties of idioms

Idioms are perhaps most recognizable by their lexical make-up. Selectional restrictions are frequently violated, and the idiom constituents tend not to be semantically related to words in the surrounding context, thus signaling the need for a non-literal interpretation. But there are many syntactically and semantically well-formed strings that are polysemous between an idiomatic and a literal, compositional meaning, such as play second fiddle and fall off the wagon. Here, context determines which meaning is likely the intended one. Of course, the ambiguity between the literal and the idiomatic reading is related to the plausibility of the literal reading. Thus, pull yourself up by your bootstraps, give/lend somebody a hand, and German mit der Kirche ums Dorf fahren (lit. 'drive with the church around the village', deal with an issue in an overly complicated or laborious manner) denote highly implausible events in their literal readings.

2.1. Polarity

Many idioms are negative polarity items: not have a leg to stand on, not give somebody the time of day, not lift a finger, be neither fish nor fowl, no love lost, not give a damn/hoot/shit. Without the negation, the idiomatic reading is lost; in specific contexts, it may be preserved in the presence of a marked stress pattern. Corpus data show that many idioms do not require a fixed negation component but can occur in other negative environments (questions, conditionals, etc.), just like their non-idiomatic counterparts (Sohn 2006, Sailer & Richter 2002, Stantcheva 2006 for German). Idioms as negative polarity items occur crosslinguistically, and negative polarity may be one universal hallmark of idiomatic language.

2.2. Idiom-specific lexemes

Many idioms contain lexemes that do not occur outside their idiomatic contexts. Examples are English thinking cap (in put on one's thinking cap), humble pie (in eat humble pie), and German Hungertuch (in am Hungertuch nagen, lit. 'gnaw at the hunger-cloth', be destitute). Similarly, the noun Schlafittchen in the idiom am Schlafittchen packen (lit. 'grab someone by the wing', confront or catch somebody) is an obsolete word referring to "wing", and its meaning is opaque to contemporary speakers. Other idiom components have meaning outside their idiomatic context but may occur rarely in free language (gift horse in don't look a gift horse in the mouth); cf. article 80 (Olsen) Semantics of compounds.

2.3. Non-referential pronouns

Another lexical peculiarity of many English idioms is the obligatory presence of a non-referential it: have it coming, give it a shot, lose it, have it in for somebody, etc. These
pronouns are a fixed component of the idioms, and no number or gender variation can occur without loss of the idiomatic reading. They do not carry meaning, though they may have been referential at an earlier stage, before the idiom became fixed. Substituting a context-appropriate noun does not seem felicitous, either: have the punishment coming, give the problem a shot, etc. seem odd at best. (A somewhat similar phenomenon is found in a number of German idioms containing eins, einen (lit. 'one'): sich eins lachen, einen draufmachen, 'have a good laugh'/'have a night on the town'.)

3. Compositionality

Idioms are often referred to as non-compositional, i.e., their meaning is not composed of the meanings of their constituents, as it is in freely composed language. In fact, idioms vary with respect to their degree of non-compositionality. One way to measure the extent to which an idiom violates the compositional norms of the language is to examine and measure statistically the collocational properties of its constituents. The verb fressen (eat like an animal) co-occurs with nouns that fill the roles of the eating and the eaten entities; among the latter, Narr ("fool") will stand out as not belonging to the class of entities included in the verb's selectional preferences; a thesaurus or WordNet (Fellbaum 1998) could firm up this intuition, as the sketch below illustrates.
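Such a check against a lexical resource can be sketched minimally with the WordNet interface of the NLTK library; the function name and the choice of food as the target class are our own illustrative assumptions, not a procedure proposed in the literature:

    from nltk.corpus import wordnet as wn

    def below(noun, target):
        """True if some sense of `noun` lies below some sense of
        `target` in WordNet's hypernym hierarchy."""
        targets = set(wn.synsets(target, pos=wn.NOUN))
        for sense in wn.synsets(noun, pos=wn.NOUN):
            if targets & set(sense.closure(lambda s: s.hypernyms())):
                return True
        return False

    # An "eat"-class verb prefers objects from the food class;
    # a person-denoting noun such as "fool" falls outside it:
    print(below("pie", "food"))    # expected: True
    print(below("fool", "food"))   # expected: False

A noun that fails such a test, like the object of gefressen in the idiom cited above, stands out as violating the verb's selectional preferences.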
In highly opaque idioms, none of the constituents can be mapped onto a referent, as is the case with an jemandem einen Narren gefressen haben and the much-cited kick the bucket. The literal equivalents of these verbs do not select for arguments that the idiom constituents could be mapped to in a one-to-one fashion so as to allow a "translation".

When an idiom's lexeme becomes obsolete outside its idiomatic use, its meaning becomes opaque. For example, Fittiche in the idiom unter die Fittiche nehmen seems not to be interpreted as "wings" by some contemporary speakers, as attested by corpus examples like unter der/seiner Fittiche, where the morphology shows that the speaker analyzed the noun as a feminine singular rather than a plural and possibly assigned it a new meaning.

Lexical substitutions and adjectival modifications show that speakers often assign a meaning to one or more constituents of an idiom (see section 6.3 on lexical variation), though such variations from the "canonical forms" tend to be idiosyncratic and are often specific to the discourse in which they are found. Many idioms are partially decomposable; they may contain one or more lexemes with a conventional meaning or a metaphor.

3.1. Metaphors

Many idioms contain metaphors, and the notion of metaphor is closely linked to idioms; cf. article 26 (Tyler & Takahashi) Metaphors and metonymies. We use the term here to denote a single lexeme (most often a noun or noun phrase) with a figurative meaning. For example, jungle is often used as a metaphor for a complex, messy, competitive, and perhaps dangerous situation; this use of jungle derives from its literal meaning, "impenetrable forest", and preserves some of the salient properties of the original concept (Glucksberg 2001). A metaphor like jungle may become conventionalized and be readily interpreted in the appropriate contexts as referring to a competitive situation. Thus, the expression it's a jungle out there is readily understandable due to the interpretability of the noun. Idioms that contain conventional or transparent metaphors are thus at least partly compositional.

Many idioms arguably contain metaphors, though their use is bound to a particular idiom. For example, nettle in grasp the nettle and bull in take the bull by the horns can be readily interpreted as referring to specific entities in the context where the idioms are used. But such metaphors do not work outside the idioms; a sentence like this assignment involves numerous nettles/bulls would be difficult to interpret, despite the fact that nettle and bull seem like appropriate expressions referring to a difficult problem or a challenge, respectively, and could thus be considered "motivated".

Arguments have been made for a cognitive, culturally embedded basis of metaphors and of idioms containing metaphors, one that shapes their semantic structure and makes motivation possible (Lakoff & Johnson 1980, Lakoff 1987, Gibbs & Steen 1999, Dobrovol'skij 2004, inter alia). Indeed, certain conceptual metaphors are crosslinguistically highly prevalent and productive. For example, the "time is money" metaphor is reflected in the many possession verbs (have, give, take, buy, cost) that take time as an argument.

However, Burger (2007) points out that not all idioms are metaphorical and that not all metaphorical idioms can be traced back to general conceptual metaphors. For example, in the idiom pull strings, strings does not encode an established metaphor that is readily understandable outside the idiom and that is exploited in related idioms. Indeed, a broad claim to the universality and cultural independence of metaphor seems untenable. The English idiom pull strings structurally and lexically resembles the German idiom die Drähte ziehen (lit. 'pull the wires'), but the German idiom does not mean "to use one's influence in one's favor" but rather "to mastermind".

Testing the conceptual metaphor hypothesis, Keysar & Bly (1995) asked speakers to guess the meanings of unfamiliar idioms involving such salient concepts as "high" and "up", which many cognitivists claim to be universally associated with increased quality and quantity. Subjects could not guess the meanings of many of the phrases, casting doubt on the power and absolute universality of conceptual metaphors.

4. How frozen are idioms?

A core property associated with idioms is lexical, morphosyntactic, and syntactic fixedness, which is said to be a direct consequence of semantic opacity. Many grammatical frameworks, beginning with early Transformational-Generative Grammar, explicitly classify idioms as "long words" that are exempt from regular and productive rules, much like strong verbs. Lexicographic resources implicitly reinforce the view of idioms as fixed strings by listing idioms in morphologically unmarked citation forms, or lemmata, usually based on the lexicographer's intuition. And experimental psycholinguists studying idiom processing commonly use only such citation forms.

The few much-cited classic examples, including kick the bucket and trip the light fantastic, served well to exemplify the properties ascribed in a wholesale fashion to all idioms. But soon linguists began to note that not all idioms were alike. Fraser (1970) distinguished several degrees of syntactic frozenness among idioms. Where a mapping can be made from the idiom's constituents to those of its literal counterpart, syntactic
flexibility is possible; simply put, the degree of semantic opacity is reflected in the degree of syntactic frozenness. Echoing Weinreich (1969), Fraser states that there is no idiom that is completely free. In particular, he rules out focusing operations like clefting and topicalization, as these are conditional on the focused constituent bearing a meaning. Citing constructed data, Cruse (2004) also categorically rules out cleft constructions for idioms.

Syntactic and lexical frozenness is commonly ascribed to the semantic non-compositionality of idioms. Indeed, it seems reasonable to assume that semantically unanalyzable strings cannot be subject to regular grammatical processes, as syntactic variations typically serve to (de-)focus or modify particular constituents; when these carry no meanings, such operations are unmotivated.

But corpus studies show that a "standard", fixed form cannot be determined for many idioms, at least not on a quantitative basis. Speakers use many idioms in ways that deviate from the forms given in dictionaries in more than one respect; some idioms exhibit the same degree of freedom as freely composed strings.

We rely in this article on corpus examples to illustrate idiom variation. The English data come from Moon's (1998) extensive analysis of the Hector Corpus. The German data are based on an in-depth analysis of 1,000 German idioms in a one billion word corpus of German (Fellbaum 2006, 2007b). We are not aware of similar large-scale corpus analyses for other languages, and the data in this article are therefore somewhat biased towards English and German.

5. Syntactic flexibility

As a wider range of idioms was considered, linguists realized that syntactic flexibility was not an exception. The influential paper by Nunberg, Sag & Wasow (1994) argued for a systematic relation between semantic compositionality and flexibility. Both Abeillé (1995), who based her analysis on a number of French idioms, and Dobrovol'skij (1999) situate idioms on a continuum of flexibility that interacts with semantic transparency. But while it seems true that constituents that are assigned a meaning by speakers are open to grammatical operations as well as to modification and even lexical substitution, semantic transparency is not a requirement for variation.

We consider as an example the German idiom kein Blatt vor den Mund nehmen (lit. 'take no leaf/sheet in front of one's mouth', be outspoken, speak one's mind). Blatt (leaf, sheet) has no obvious referent. Yet numerous attested corpus examples are found where Blatt is passivized, topicalized, relativized and pronominalized; moreover, this idiom need not always appear as a Negative Polarity Item:

(1) Bei BMW wird kein Blatt vor den Mund genommen. (passivization)

(2) Ein Blatt habe er nie vor den Mund genommen. (topicalization)

(3) Das Blatt, das Eva vor ihr erregendes Geheimnis gehalten, ich nähme es nicht einmal vor den Mund. (relativization and pronominalization)

(4) Ein Regierungssprecher ist ein Mann, der sich 100 Blätter vor den Mund nimmt. (quantification without negation)
Even "cran-morphemes", i.e., words that do not usually occur outside the idiom, can behave like free lexemes. For example, Fettnäpfchen rarely occurs outside the idiom ins Fettnäpfchen treten (lit. 'step into the little grease pot', commit a social gaffe), and no meaning can be attached to it by a contemporary speaker that would map to the meaning of a constituent in a literal equivalent. Yet the corpus shows numerous uses of this idiom where the noun is relativized, topicalized, quantified, and modified:

(5) Das Fettnäpfchen, in das die Frau ihres jüngsten Sohnes gestiegen ist, ist aber auch riesig. (relativization)

(6) Ins Fettnäpfchen trete ich bestimmt mal und das ist gut so. (topicalization)

(7) Silvio Berlusconi: Ein Mann, viele Fettnäpfchen. (quantification)

(8) Immer trat der New Yorker ins bereitstehende Fettnäpfchen. (adjectival modification)

Syntactic variation is probably due to speakers' ad-hoc assignment of meanings to idiom components. Adjectival modification and lexical variation in particular indicate that the semantic interpretation of inherently opaque constituents is dependent on the particular context in which the idiom is embedded. For more examples and discussion see Moon (1998) and Fellbaum (2006, 2007b).

6. Morphosyntax

The apparent fixedness of many idioms extends beyond constituent order and lexical selection to their morphosyntactic make-up.

6.1. Modality, tense, aspect

The idiomatic reading of many strings requires a specific modality, tense, or aspect. Thus horses wouldn't get (NP) to (VP) is at best odd without the conditional: horses don't get (NP) to (VP). Similarly, the meaning of I couldn't care less is not preserved in I could care less. Interestingly, the negation in the idiom is often omitted even when the speaker does not intend a change of polarity, as in I could care less; perhaps speakers are uneasy about the double negation in such cases but do not decompose the idiom to realize that a single negation changes the meaning. Another English idiom requiring a specific modal is will not/won't hear of it.

The German idioms in den Tuschkasten gefallen sein (lit. 'have fallen into the paint box', be overly made up) and nicht auf den Kopf gefallen sein (lit. 'not have fallen on one's head', be rather intelligent) denote states and cannot be interpreted as idioms in any tense other than the perfect (Fellbaum 2007a).

6.2. Determiner and number

The noun in many VP idioms is preceded by a definite determiner even though it lacks definiteness and, frequently, reference: kick the bucket, fall off the wagon, buy the farm. The determiner here may be a relic from the time when transparent phrases became lexicalized as idioms.
A regular and frequent grammatical process in German is the contraction of the determiner with a preceding preposition. Thus, the idiom jemanden hinters Licht führen (lit. 'lead someone behind the light', deceive someone) occurs most frequently with hinter (behind) and das (the) contracted to hinters. Eisenberg (1999, inter alia) asserts that the figurative reading of many German idioms precludes a change in the noun's determiner. This hypothesis seems appealing: if the noun phrase is semantically opaque, the choice of determiner does not follow the usual rules of grammar and is therefore arbitrary. As speakers cannot interpret the noun, the determiner is not subject to grammatical processes. Decontraction in cases like hinters Licht führen is thus ruled out by Eisenberg, as is contraction in cases where the idiom's citation form found in dictionaries shows the preposition separated from the determiner.

However, Firenze (2007) cites numerous corpus examples of idioms with decontraction. In some cases, the decontraction is conditioned by the insertion of lexical material such as adjectives between the preposition and the determiner, or by number variation of the noun; in other cases, contracted and decontracted forms alternate freely. Such data show that the NP is subject to the regular grammatical processes of free language, even when its meaning is not transparent: Licht in the idiom has no referent, just like wool in the corresponding English idiom pull the wool over someone's eyes.

Corpus data also show number variation for the nouns. For example, in den Bock zum Gärtner machen (lit. 'make the buck the gardener', put an incompetent person in charge), both nouns occur predominantly in the singular. But we find examples with plural nouns, Böcke zu Gärtnern machen, where the speaker refers to several specific incompetent persons.

6.3. Lexical variation

Variation is frequently attested for the nominal, verbal, and adjectival components of VP idioms. In most cases, this variation is context-dependent and shows conscious playfulness on the part of the speaker or writer. Fellbaum & Stathi (2006) discuss three principal cases of lexical variation: paradigmatic variation, adjectival modification, and compounding. In each case, the variation plays on the literal interpretation of the varied component. Paradigmatic substitution has occurred where the verb in jemandem die Leviten lesen (lit. 'read the Leviticus to someone', read someone the riot act) has been replaced by quaken (croak like a frog) and brüllen (scream). An adjective has been added to the noun in an economics text: die marktwirtschaftlichen Leviten lesen, 'read the market-economical riot act'. An example for compounding is er nimmt kein Notenblatt vor den Mund, where the idiom kein Blatt vor den Mund nehmen (lit. 'take no leaf/sheet in front of one's mouth', be outspoken) occurs in the context of musical performance and Blatt becomes Notenblatt, 'sheet of music'.

Besides morphological and syntactic variation, corpus data show that speakers vary the lexical form of many idioms. Moon's (1998) observation that lexical variation is often humorous is confirmed by the German corpus data, as is Moon's finding that such variation is found most frequently in journalism. (See also Kjellmer 1991 for examples of playful lexical variation in collocations, such as not exactly my cup of tequila.)
Ad-hoc lexical variation is dependent on the particular discourse and must play on the literal reading of the substituted constituent rather than on a metaphoric one. That is, spill
the secret would be hard to interpret, whereas rock the submarine would be interpretable in the appropriate context, as submarine invokes the paradigmatically related boat.

Moon (1998) discusses another kind of lexical variation, called idiom schemas, exemplified by the group of idioms hit the deck/sack/hay. Idiom schemas correspond to Nunberg, Sag & Wasow's (1994) idiom families, such as don't give a hoot/damn/shit. The variation is limited, and each variant is lexicalized rather than ad-hoc. But in other cases, lexical variation is highly productive. An example is the German idiom hier tanzt der Bär (lit. 'here dances the bear', this is where the action is), which has spawned a family of new expressions with the same meaning, including hier steppt der Bär (lit. 'the bear does a step dance here') and even hier rappt der Hummer (lit. 'the lobster raps here') and hier boxt der Papst (lit. 'the Pope is boxing here'). There are clear constraints on the lexical variation in terms of semantic sets, unlike in cases like don't give a hoot/damn/shit; this may account for the productivity of the "bear" idiom.

6.4. Semantic reanalysis

Motivation is undoubtedly a strong factor in the variability of idioms; if speakers can assign meaning to a component, even if only in a specific context, the component is available for modification and syntactic operations. Corpus examples with lexical and syntactic variations suggest that speakers attribute meaning to idiom components that are opaque to contemporary speakers and remotivate them. Gehweiler (2007) discusses the German idiom in die Röhre schauen (lit. 'look into the pipe', go empty-handed), which originates in the language of hunters and referred to dogs peering into foxholes. The meaning of the noun here is opaque to contemporary speakers, but the idiom has acquired more recent, additional senses in which the noun is re-interpreted.

7. Idioms as constructions

Fillmore, Kay & O'Connor (1988) discuss the idiomaticity of syntactic constructions like the X-er the Y-er, which carry meaning independent of the lexical items that fill the slots; cf. article 86 (Kay & Michaelis) Constructional meaning.

Among VP idioms with a more regular phrase structure, some require the suppression of an argument for the idiomatic reading, resulting in a violation of the verb's subcategorization properties. For example, werfen ('throw') in the German idiom das Handtuch werfen (lit. 'throw the towel') does not co-occur with a Location (Goal) argument, which is required in the verb's non-idiomatic use. Other idioms require the presence of an argument that is optional in the literal language; this is the case for many ditransitive German VP idioms, including jemandem ein Bein stellen (lit. 'place a leg for someone', trip someone up) and jemandem eine Szene machen (lit. 'make a scene for someone', cause a scene that embarrasses someone). Here, the additional indirect object (indicated by the placeholder someone) is most often an entity that is negatively affected by the event, a Maleficiary. Whereas in free language use, ditransitive constructions often denote the transfer of a Theme from a Source to a Goal or Recipient, this is rarely the case for ditransitive idioms (Moon 1998, Fellbaum 2007c). Instead, such idioms exemplify Green's (1974) "symbolic action": events intended by the Agent to have a specific effect on a Beneficiary or, more often, a
Maleficiary. Ditransitives that do not denote the transfer of an entity are highly restricted in the literal language (Green 1974); it is interesting that so many idioms denoting an event with a negatively affected entity are expressed with a ditransitive construction.

Structurally defined classes of idioms can be accounted for under Goldberg's (1995) Construction Grammar, where syntactic configurations carry meaning independent of lexical material; however, Goldberg (1995) and Goldberg & Jackendoff (2004) consider syntactically well-formed and lexically productive structures such as resultatives rather than idioms. The identification of classes of grammatically marked idioms points to idiom-specific syntactic configurations or frames that carry meaning.

8. Diachronic changes

The origins of specific idioms are a subject of much speculation and folk etymology. Among the idioms with an indisputable history are those found in the Bible (e.g., throw pearls before the swine, fall from grace, give up the ghost from the King James Bible). These tend to be found in many of the languages into which the Bible was translated. Many other idioms originate in specific domains: pull one's punches, go the distance (boxing), have an ace up one's sleeve, let the chips fall where they may (gambling), and fall on one's sword, bite the bullet (warfare).

Idioms are subject to the same diachronic processes that have been observed for lexemes with literal interpretations. Longitudinal corpus studies by Gehweiler, Höser & Krämer (2007) show how German VP idioms undergo extension, merging, and semantic splitting, and may develop new, homonymic readings.

Idioms may also change their usage over time. Thus, the German idiom unter dem Pantoffel stehen (lit. 'stand underneath somebody's slipper', be under someone's thumb) used to refer to domestic situations where a husband is dominated by his wife. But corpus data from the past few decades show that this idiom is extended not only to spouses but also to other social relations, such as employees in a workplace dominated by a boss (Fellbaum 2005).

Idioms can also change their phrase structure over time. Kwasniak (2006) examines cases where a sentential idiom turns into a VP idiom when a fixed component becomes a "free" constituent that is no longer part of the idiom.

9. Idioms in the lexicon

As form-meaning pairs, idioms belong in the lexicon. A dual nature - semantic simplicity but structural complexity - is often ascribed to them. But the question arises as to why natural languages show complex encoding for concepts whose semantics are as straightforward as those of simple lexemes. It has often been observed that many idioms express concepts already covered by simple lexemes but with added connotational nuances or restrictions to specific contexts and social situations; classic examples are the many idioms meaning 'die', ranging from the disrespectful to the euphemistic. But many - perhaps most - idioms, including cut one's teeth on, live from hand to mouth, have eyes for, lend a hand, and lose one's head, are neutral with respect to register. Similar idioms are found across languages, for example idioms expressing a range of judgments of physical appearance, mental ability, and social aptitude.
Subtle register differences alone do not seem to warrant the structural and lexical extravagance of idioms, the constraints on their use, and the added burden on language acquisition and processing. An examination of how English VP idioms fit into the structure of the lexicon reveals that many lack non-idiomatic synonyms and express meanings not covered by simple lexemes, arguably filling "lexical gaps". Moreover, many idioms appear not to fit the regular lexicalization patterns of English.

A typology of idioms based on semantic criteria is suggested in Fellbaum (2002, 2007a). It includes idioms expressing negations of events or states (miss the bus, fall through the cracks, go begging) and idioms expressing several events linked by a Boolean operator (fish or cut bait, have one's cake and eat it). Such structurally and semantically complex idioms can be found across languages. One function of idioms may be to encode pre-packaged complex messages that cannot be expressed by simple words and whose salience makes them candidates for lexical encoding.

10. Idioms in the mental lexicon

How are idioms represented in the mental lexicon and in speakers' grammar? On the one hand, they are more or less fixed MWUs - long words - that speakers produce and recognize as such, which suggests that they are represented exactly like simple lexemes. On the other hand, attested idiom use shows a wide range of syntactic, morphological, and especially lexical variation, indicating that speakers access the internal structure of idioms and subject them to all the grammatical processes found in literal language.

To determine whether idioms are represented and processed as unanalyzable units or as potentially (and often partially) decomposable strings, psycholinguistic experiments have investigated both the production and the comprehension of idioms. The materials used in virtually all experiments are constructed and not based on corpus examples, yet the findings and the hypotheses based on the results are fully compatible with naturally occurring data.

Comprehension time studies show that familiar idioms like kick the bucket are processed faster in their idiomatic meaning ('die') than in a literal one (kicking a pail). Glucksberg (2001) asserts that the literal meaning may be inhibited by the figurative meaning of a string, though both may be accessed. However, the timing experiments argue against a processing model where the idiomatic reading kicks in only after a literal one has failed.

Cutting & Bock (1997), based on a number of experiments involving idiom production, propose a "hybrid" theory of idiom representation. They argue for the existence of a lexical concept node for each idiom; at the same time, idioms are syntactically and semantically analyzed during production, independent of the idiom's degree of compositionality. This "hybrid account" is also supported by Sprenger, Levelt & Kempen (2006), who show that idioms can be primed with lexemes that are semantically related to constituents of the idioms. Sprenger, Levelt & Kempen propose the notion of a "superlemma", a conceptual unit whose lexemes are bound both to their idiomatic use and to their use in free language.

The superlemma theory is compatible with the Configuration Hypothesis that Cacciari & Tabossi (1988) formulated on the basis of idiom comprehension experiments. The Configuration Hypothesis maintains that speakers activate the literal meanings of words in a
phrase and recognize the idiomatic meaning of a polysemous string only when they recognize an idiom-specific configuration of lexemes or encounter a "key" lexeme. One such key in many idioms may be the definite article (kick the bucket, fall off the wagon, buy the farm), which suggests that a referent for the noun has been previously introduced into the discourse; when no matching antecedent can be found, another interpretation of the string must be attempted (Fellbaum 1993). The key hypothesis is compatible with Cutting & Bock's proposal concerning idioms' representation in the mental lexicon.

Kuiper (2004) analyzed a collection of slips of the tongue involving idioms, comprising 1,000 errors. He proposes a taxonomy of sources for the errors from all levels of grammar. Kuiper's analysis of the data shows that idioms are not simply stored as frozen long words, consistent with the superlemma theory of idiom representation.

10.1. Idioms in natural language processing

If one inputs an idiom into a machine translation engine (such as Babelfish or Google Translate), it does not - in many cases - return a corresponding idiom or an adequate non-idiomatic translation in the target language. This is one indication that the recognition and processing especially of non-compositional idioms are still a challenge. One reason is that the lexical resources that many NLP applications rely on do not include many idioms and fixed collocations. When idioms are listed in computational lexicons, it is often in a fixed form; idioms exhibiting morphosyntactic flexibility and lexical variation make automatic recognition very challenging.

A more promising approach than lexical look-up is to search for the co-occurrence of the components of an idiom within a specific window, regardless of syntactic configuration and morphological categories (a minimal sketch of such a matcher is given below). Lexical variation can be accounted for by searching for words that are semantically similar, as reflected in a thesaurus. Another difficulty for the automatic processing of idioms is polysemy. Many idioms are "plausible" and have a literal reading (keep the ball rolling, make a dent in, not move a finger). To distinguish the literal and the idiomatic readings, a system would have to perform a semantic analysis of the wider context, a task similar to that performed by humans when disambiguating between literal and idiomatic meanings.
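The window-based search just described can be sketched as follows. This is a minimal illustration under our own assumptions (lemmatized input text; an idiom stored as a set of content-word lemmas; the function name is hypothetical), not a published algorithm:

    def find_idiom_candidates(lemmas, idiom_lemmas, window=8):
        """Return start positions of windows over a lemmatized text
        that contain all content-word lemmas of the idiom, regardless
        of word order, inflection, or intervening material.
        `idiom_lemmas` is a set, e.g. {"treten", "Fettnäpfchen"}."""
        return [i for i in range(len(lemmas))
                if idiom_lemmas <= set(lemmas[i:i + window])]

    # {"treten", "Fettnäpfchen"} matches both canonical ins Fettnäpfchen
    # treten and modified variants such as trat ... ins bereitstehenden
    # Fettnäpfchen, since only the lemmas must co-occur in the window.

Because the match ignores word order and inflection, such a matcher tolerates exactly the kinds of morphosyntactic flexibility documented in sections 5 and 6; semantic similarity over a thesaurus could then extend it to lexical variants.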
11. Summary and conclusion

A prevailing view in linguistics represents idioms as "long words", largely non-compositional multi-word units with little or no room for deviation from a canonical form; any morphosyntactic flexibility is often thought to be directly related to semantic transparency. Corpus investigations show, first, that idioms are subject to far more variation than the traditional view would allow, and, second, that speakers use idioms in creative ways even in the absence of full semantic interpretation. The boundary between compositional and non-compositional strings appears to be soft, as speakers assign ad-hoc, discourse-specific meanings to idiom constituents that are opaque outside of certain contexts. Psycholinguistic experiments, too, indicate that idiomatic and non-idiomatic language is not strictly separated in our mental lexicon and grammar.

Perhaps the most important function of many idioms, which may account for their universality and ubiquity, is that they provide convenient, pre-fabricated, conventionalized encodings of often complex messages.

12. References

Abeillé, Anne 1995. The flexibility of French idioms. A representation with lexicalized tree adjoining grammar. In: M. Everaert et al. (eds.). Idioms. Structural and Psychological Perspectives. Hillsdale, NJ: Erlbaum, 15-42.
Baldwin, Timothy, Colin Bannard, Takaaki Tanaka & Dominic Widdows 2003. An empirical model of multiword expression decomposability. In: Proceedings of the ACL-03 Workshop on Multiword Expressions. Analysis, Acquisition and Treatment. Stroudsburg, PA: ACL, 89-96.
Bobrow, Daniel & Susan Bell 1973. On catching on to idiomatic expressions. Memory & Cognition 1, 343-346.
Burger, Harald 2007. Phraseologie. Eine Einführung am Beispiel des Deutschen. 3rd edn. Berlin: Erich Schmidt. 1st edn. 1998.
Cacciari, Cristina & Patrizia Tabossi 1988. The comprehension of idioms. Journal of Memory and Language 27, 668-683.
Church, Kenneth W. & Patrick Hanks 1990. Word association norms, mutual information, and lexicography. Computational Linguistics 16, 22-29.
Church, Kenneth W., William Gale, Patrick Hanks & Donald Hindle 1991. Using statistics in lexical analysis. In: U. Zernik (ed.). Lexical Acquisition. Exploiting On-Line Resources to Build a Lexicon. Hillsdale, NJ: Erlbaum, 115-164.
Cowie, Anthony P. 1998. Phraseology. Theory, Analysis, and Applications. Oxford: Oxford University Press.
Cruse, D. Alan 2004. Meaning in Language. An Introduction to Semantics and Pragmatics. 2nd edn. Oxford: Oxford University Press. 1st edn. 2000.
Cutting, J. Cooper & Kathryn Bock 1997. That's the way the cookie bounces. Syntactic and semantic components of experimentally elicited idiom blends. Memory & Cognition 25, 57-71.
Dobrovol'skij, Dmitrij 1999. Haben transformationelle Defekte der Idiomstruktur semantische Ursachen? In: N. Fernandez Bravo, I. Behr & C. Rozier (eds.). Phraseme und typisierende Rede. Tübingen: Stauffenburg, 25-37.
Dobrovol'skij, Dmitrij 2004. Lexical semantics and combinatorial profile. A corpus-based approach. In: G. Williams & S. Vesser (eds.). Euralex II. Lorient: Université de Bretagne Sud, 787-796.
Eisenberg, Peter 1999. Grundriss der deutschen Grammatik, vol. 2: Der Satz. Stuttgart: Metzler.
Evert, Stefan 2005. The Statistics of Word Cooccurrences. Word Pairs and Collocations. Doctoral dissertation. University of Stuttgart.
Fazly, Afsaneh & Suzanne Stevenson 2006. Automatically constructing a lexicon of verb phrase idiomatic combinations. In: Proceedings of the European Chapter of the Association for Computational Linguistics (EACL) 11. Stroudsburg, PA: ACL, 337-344.
Fellbaum, Christiane 1993. The determiner in English idioms. In: C. Cacciari & P. Tabossi (eds.). Idioms. Processing, Structure, and Interpretation. Hillsdale, NJ: Erlbaum, 271-295.
Fellbaum, Christiane 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fellbaum, Christiane 2002. VP idioms in the lexicon. Topics for research using a very large corpus. In: S. Busemann (ed.). Proceedings of the Konferenz zur Verarbeitung natürlicher Sprache (=KONVENS) 6. Kaiserslautern: DFKI, 7-11.
Fellbaum, Christiane 2005. Unter dem Pantoffel stehen. Circular der Berlin-Brandenburgischen Akademie der Wissenschaften 31, 18.
Fellbaum, Christiane (ed.) 2006. Corpus-Based Studies of German Idioms and Light Verbs. Special issue of the International Journal of Lexicography 19.
Fellbaum, Christiane 2007a. The ontological loneliness of idioms. In: A. Schalley & D. Zaefferer (eds.). Ontolinguistics. How Ontological Status Shapes the Linguistic Coding of Concepts. Berlin: Mouton de Gruyter, 419-434.
Fellbaum, Christiane (ed.) 2007b. Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum.
Fellbaum, Christiane 2007c. Argument selection and alternations in VP idioms. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 188-202.
Fellbaum, Christiane & Ekaterini Stathi 2006. Idiome in der Grammatik und im Kontext. Wer brüllt hier die Leviten? In: K. Proost & E. Winkler (eds.). Von Intentionalität zur Bedeutung konventionalisierter Zeichen. Festschrift für Gisela Harras zum 65. Geburtstag. Tübingen: Narr, 125-146.
Fillmore, Charles, Paul Kay & Mary O'Connor 1988. Regularity and idiomaticity in grammatical constructions. Language 64, 501-538.
Firenze, Anna 2007. 'You fool her' doesn't mean (that) 'you conduct her behind the light'. (Dis)agglutination of the determiner in German idioms. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 152-163.
Firth, John R. 1957. A Synopsis of Linguistic Theory 1930-1955. In: Studies in Linguistic Analysis. Oxford: Blackwell, 1-32.
Fraser, Bruce 1970. Idioms within a transformational grammar. Foundations of Language 6, 22-42.
Gehweiler, Elke 2007. How do homonymic idioms arise? In: M. Nenonen & S. Niemi (eds.). Proceedings of Collocations and Idioms 1. Joensuu: Joensuu University Press.
Gehweiler, Elke, Iris Höser & Undine Kramer 2007. Types of changes in idioms. Some surprising results of corpus research. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 109-137.
Gibbs, Ray & Gerard J. Steen (eds.) 1999. Metaphor in Cognitive Linguistics. Amsterdam: Benjamins.
Glucksberg, Sam 2001. Understanding Figurative Language. From Metaphors to Idioms. New York: Oxford University Press.
Goldberg, Adele 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Goldberg, Adele & Ray Jackendoff 2004. The English resultative as a family of constructions. Language 80, 532-568.
Green, Georgia M. 1974. Semantics and Syntactic Regularity. Bloomington, IN: Indiana University Press.
Halliday, Michael 1966. Lexis as a linguistic level. In: C. E. Bazell et al. (eds.). In Memory of J. R. Firth. London: Longman, 148-162.
Jackendoff, Ray 1997. Twistin' the night away. Language 73, 534-559.
Keysar, Boaz & Bridget Bly 1995. Intuitions of the transparency of idioms. Can one keep a secret by spilling the beans? Journal of Memory and Language 34, 89-109.
Kjellmer, Göran 1991. A mint of phrases. In: K. Aijmer & B. Altenberg (eds.). English Corpus Linguistics. Studies in Honor of Jan Svartvik. London: Longman, 111-127.
Krenn, Brigitte 2000. The Usual Suspects. Data-Oriented Models for Identification and Representation of Lexical Collocations. Doctoral dissertation. University of Saarbrücken.
Kuiper, Konrad 2004. Slipping on superlemmas. In: A. Häcki-Buhofer & H. Burger (eds.). Phraseology in Motion, vol. I. Baltmannsweiler: Schneider, 371-379.
Kwasniak, Renata 2006. Wer hat nun den Salat? Now who's got the mess? Reflections on phraseological derivation. From sentential to verb phrase idiom.
International Journal of Lexicography 19, 459-478.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago, IL: The University of Chicago Press.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of Chicago Press.
McCarthy, Diana, Bill Keller & John Carroll 2003. Detecting a continuum of compositionality in phrasal verbs. In: Proceedings of the ACL-03 Workshop on Multiword Expressions. Analysis, Acquisition and Treatment. Stroudsburg, PA: ACL, 73-80.
Miller, George A. 1956. The magical number seven, plus or minus two. Some limits on our capacity for processing information. Psychological Review 63, 81-97.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach. Oxford: Clarendon Press.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70, 491-538.
Sailer, Manfred & Frank Richter 2002. Not for love or money. Collocations! In: G. Jäger et al. (eds.). Proceedings of Formal Grammar 7. Stanford, CA: CSLI Publications, 149-160.
Sinclair, John 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Soehn, Jan-Philipp 2006. On idiom parts and their contexts. Linguistik Online 27, 11-28.
Sprenger, Simone A., Willem J. M. Levelt & Gerard A. M. Kempen 2006. Lexical access during the production of idiomatic phrases. Journal of Memory and Language 54, 161-184.
Stantcheva, Diana 2006. The many faces of negation. German VP idioms with a negative component. International Journal of Lexicography 19, 397-418.
Storrer, Angelika 2007. Corpus-based investigations on German support verb constructions. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum, 164-187.
Stubbs, Michael 2001. Words and Phrases. Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Swinney, Douglas & Anne Cutler 1979. The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior 18, 523-534.
Weinreich, Uriel 1969. Problems in the analysis of idioms. In: J. Puhvel (ed.). Substance and Structure of Language. Berkeley, CA: University of California Press, 23-81.

Christiane Fellbaum, Princeton, NJ (USA)

21. Sense relations

1. Introduction
2. The basic sense relations
3. Sense relations and word meaning
4. Conclusion
5. References

Abstract

This article explores the definition and interpretation of the traditional paradigmatic sense relations such as hyponymy, synonymy, meronymy and antonymy, and syntagmatic relations such as selectional restrictions. A descriptive and critical overview of the relations is provided in section 2, and in section 3 the relation between sense relations and different theories of word meaning is briefly reviewed. The discussion covers early to mid twentieth century structuralist approaches to lexical meaning, with their concomitant view of the lexicon as being structured into semantic fields, leading to more recent work on decompositional approaches
to word meaning. The latter are contrasted with atomic views of lexical meaning and the capturing of semantic relations through the use of meaning postulates.

1. Introduction

Naive discussions of meaning in natural languages almost invariably centre around the meanings of content words, rather than the meanings of grammatical words or of phrases and sentences, as is normal in academic approaches to the semantics of natural languages. Indeed, at first sight, it might seem to be impossible to construct a theory of the meaning of sentences without first uncovering the complexity of meaning relations that hold between the words of a language that make them up. So, it might be argued, to know the meaning of the sentence Matthew rears horses we need also at least to know the meaning of Matthew rears animals or Matthew breeds horses, since horses are a kind of animal and rearing tends to imply breeding. It is in this context that the notion of sense relations, the meaning relations between words (and expressions) of a language, could be seen as fundamental to the success of the semantic enterprise. Indeed, the study of sense relations has a long tradition in the western grammatical and philosophical traditions, going back at least to Aristotle, with discussions of relevant phenomena appearing throughout the medieval and later literature. However, the systematisation and taxonomic classification of the system of sense relations was only taken up in the structuralist movements of the twentieth century, particularly in Europe following the swift developments in structuralist linguistics after de Saussure. This movement towards systematic analyses of word sense was then taken up in the latter part of that century and the early part of the twenty-first in the formal modelling of sense relations and, in particular, in the development of computational models of these for the purposes of natural language processing.

The notion of 'sense' in this context may be variously interpreted, but is usually interpreted in contrast to the notion of reference (or, equivalently, denotation or extension). The latter expresses the idea that one aspect of word meaning is the relation between words and the things that they can be used properly to talk about. Thus, the reference/denotation of cat is the set of all cats (that are, have been and will be); that of run (on one theoretical approach), the set of all past, present and future events of running (or, on another view, the set of all things that ever have, are or will engage in the activity we conventionally refer to in English as running). Sense, on the other hand, abstracts away from the things themselves to the property that allows us to pick them out. The sense of cat is thus the property that allows us to identify on any occasion an object of which it can truthfully be said that it is a cat - 'catness' (however that might be construed, cognitively in terms of some notion of concept, see for instance Jackendoff 2002, or model-theoretically in terms of denotations at different indices, see Montague 1973). Sense relations are thus relations between the properties that words express, rather than between the things they can be used to talk about (although, as becomes clear very quickly, it is often very difficult to separate the two notions).
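On the model-theoretic construal just mentioned, the sense/reference contrast is standardly rendered along the following lines (one textbook formulation, assuming an interpretation function relativised to indices w; it is offered here only as an illustration):

\[ \text{reference (extension) at } w: \quad \llbracket \textit{cat} \rrbracket^{w} = \{x : x \text{ is a cat in } w\} \]
\[ \text{sense (intension)}: \quad \lambda w.\,\llbracket \textit{cat} \rrbracket^{w} \]

Sense relations, on this view, are relations between such intensions, required to hold at every index rather than merely in the actual world.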
Whether or not the study of sense relations can provide a solid basis for the development of semantic theories (and there are good reasons for assuming it cannot, see for example Kilgarriff 1997), the elaboration and discussion of such meaning relations can nevertheless shed light on the nature of the problems we confront in providing such theories, not least in helping to illuminate features of meaning that are truly amenable to semantic analysis and those that remain mysterious.
2. The basic sense relations

There are two basic types of sense relation. The most commonly presented in introductory texts are the paradigmatic relations, which hold between words of the same general category or type and which are characterised in terms of contrast and hierarchy. Typically, a paradigmatic relation holds between words (or word-forms) when there is a choice between them. So given the string John bought a, it is possible to substitute any noun that denotes something that can be bought: suit, T-shirt, cauliflower, vegetable, house, ... Between some of these words there is more to the choice than just the fact that they are nouns denoting commodities. So, for example, if John bought a suit is true then it follows that John bought a pair of trousers is also true, by virtue of the fact that pairs of trousers are parts of suits; and if John bought a cauliflower is true then John bought a vegetable is also true, this time by virtue of the fact that cauliflowers are vegetables.

The second type of sense relations are syntagmatic, which hold between words according to their ability to co-occur meaningfully with each other in sentences. Typically, syntagmatic sense relations hold between words of different syntactic categories or (possibly) semantic types, such as verbs and nouns or adverbs and prepositional phrases. In general, the closer the syntactic relation between two words, such as between a head word and its semantic arguments or between a modifier and a head, the more likely it is that one word will impose conditions on the semantic properties the other is required to show. For example, in the discussion in the previous paragraph, the things that one can (non-metaphorically) buy are limited to concrete objects that are typically acceptable commodities in the relevant culture: in a culture without slavery, adding boy to the string would be highly marked. As we shall see below, there is a sense in which these two dimensions, of paradigm and syntagm, cannot be kept entirely apart, but it is useful to begin the discussion as if they do not share interdependencies.

Of the paradigmatic sense relations there are three basic ones that can be defined between lexemes, involving sense inclusion, sense exclusion and identity of sense. Within these three groups, a number of different types of relation can be identified and, in addition to these, other sorts of sense relations, such as part-whole, have been identified and discussed in the literature. As with most taxonomic endeavours, researchers may be 'lumpers', preferring as few primary distinctions as possible, or 'splitters', who consider possibly small differences in classificatory properties as sufficient to identify a different class. With respect to sense relations, the problem of when to define an additional distinction within the taxonomy gives rise to questions about the relationship between knowledge of the world and knowledge of a word: where does one end and the other begin (see article 32 (Hobbs) Word meaning and world knowledge)? In this article, I shall deal with only those relations that are sufficiently robust as to have become standard within lexical semantics: antonymy, hyponymy, synonymy and meronymy. In general, finer points of detail will be ignored and the discussion will be confined to the primary, and generally accepted, sense relations, beginning with hyponymy.

2.1. Hyponymy
Hyponymy involves specific instantiations of a more general concept, such as holds between horse and animal, or vermilion and red, or buy and get. In each case, one word provides a more specific type of concept than is displayed by the other. The more specific
word is called a hyponym and the more general word is the superordinate, which may also be referred to as a hyperonym or hypernym, although the latter is dispreferred as in non-rhotic dialects of English it is homophonous with hyponym. Where the words being classified according to this relation are nouns, one can test for hyponymy by replacing X and Y in the frame 'X is a kind of Y' and seeing if the result makes sense. So we have '(A) horse is a kind of animal' but not '(An) animal is a kind of horse', and so on.

A very precise definition of the relation is not entirely straightforward, however. One obvious approach is to have recourse to class inclusion, so that the set of things denoted by a hyponym is a subset of the set of things denoted by the superordinate. So the class of buying events is a subset of the class of getting events. This works fine for words that describe concrete entities or events, but becomes metaphysically more challenging when abstract words like thought, emotion, belief, understand, think, etc. are considered. More importantly, there are words that may be said to have sense but no denotation, such as phoenix, hobbit, light sabre and so on. As such expressions do not pick out anything in the real world, they can be said to denote only the empty set, and yet, obviously, there are properties that such entities would possess if they existed that would enable us to tell them apart. A better definition of hyponymy therefore foregoes the obvious and intuitive reliance on class membership and defines the relation in terms of sense inclusion rather than class inclusion: the sense of the superordinate being included in the sense of the hyponym. So a daffodil has the sense of flower included in it, and more besides. If we replace 'sense', as something we are trying to define, with the (perhaps) more neutral term 'property', then we have:

(1) Hyponymy: X is a hyponym of Y if it is the case that if anything is such that it has the property expressed by X then it also has the property expressed by Y.

Notice that this characterisation in terms of a universally quantified implication statement does not require there actually to be anything that has a particular property, merely that if such a thing existed it would have that property. So unicorn may still be considered a hyponym of animal, because if such things did exist, they would partake of 'animalness'. Furthermore, in general if X and Y are hyponyms of Z they are called co-hyponyms, where two words can be defined as co-hyponyms just in case they share the same superordinate term and one is not a hyponym of the other. Co-hyponyms are generally incompatible in sense, unless they are synonymous (see below section 2.2). For example, horse, cat, bird, sheep, etc. are all co-hyponyms of animal and all mutually incompatible with each other: *That sheep is a horse. Hyponymy is a transitive relation, so that if X is a hyponym of Y and Y is a hyponym of Z then X is a hyponym of Z: foal is a hyponym of horse, horse is a hyponym of animal, and so foal is a hyponym of animal. Note that because of this transitivity, foal is treated as a co-hyponym of, and so incompatible with, not only filly and stallion, but also sheep, lamb and bull, etc. This sort of property indicates how hyponymy imposes partial hierarchical structure on a vocabulary. Such hierarchies may define a taxonomy of (say) natural kinds as in Fig. 21.1. Complete hierarchies are not common, however.
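For concreteness, (1) can be given a model-theoretic rendering (one natural formalisation, assuming the intensional notation illustrated in section 1, and not the article's own notation):

\[ X \text{ is a hyponym of } Y \iff \forall w\,\forall x\,\bigl[\llbracket X \rrbracket^{w}(x) \rightarrow \llbracket Y \rrbracket^{w}(x)\bigr] \]

Because the implication must hold at every index w, unicorn qualifies as a hyponym of animal despite its empty actual extension, and the transitivity of hyponymy follows immediately from the transitivity of implication.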
Often in trying to define semantic fields of this sort, the researcher discovers that there may be gaps in the system where some expected superordinate term is missing. For example, in the lexical field defined by move we have hyponyms like swim, fly, roll and then a whole group of verbs involving movement using legs such as run, walk, hop, jump, skip, crawl, etc. There is, however, no word in English to express the concept that classifies the latter group together. Such gaps abound in any attempt to construct a fully
hierarchical lexicon based on hyponymy.

[Fig. 21.1: Hyponyms of animal - a tree with animal at the top; co-hyponyms including sheep, cow, eagle, blackbird, shark and pike at the next level; and below them mare, stallion, foal, filly, ewe, lamb, calf and bull.]

Some of these gaps may be explicable through socio-cultural norms (for example, gaps in kinship terms in all languages), but many are simply random: languages do not require all hierarchical terms to be lexicalised. That is not to say, however, that languages cannot express such apparently superordinate concepts. As above, we can provide the concept required as superordinate by modifying its apparent superordinate to give move using legs. Indeed, Lyons (1977) argues that hyponymy can in general be defined in terms of predicate modification of a superordinate. Thus, swim is move through fluid, mare is female horse, lamb is immature sheep and so on. This move pushes a paradigmatic relation onto some prior syntagmatic basis:

Hyponymy is a paradigmatic relation of sense which rests upon the encapsulation in the hyponym of some syntagmatic modification of the sense of the superordinate. (Lyons 1977: 294)

Such a definition does not work completely. For example, it makes no sense at the level of natural kinds (is horse to be defined as equine animal?), and there are other apparent syntagmatic definitions that are problematic in that precise definitions are not obvious (saunter is exactly what kind of walk?). Such considerations, of course, reflect the vagueness of the concepts that words express out of context, and so we might expect any such absolute definition of hyponymy to fail.

Hyponymy strictly speaking is definable only between words of the same (syntactic) category, but some groups of apparent co-hyponyms seem to be related to a word of some other category. This seems particularly true of predicate-denoting expressions like adjectives, which often seem to relate to (abstract) nouns as superordinates rather than to some other adjective. For example, round, square, tetrahedral, etc. all seem to be 'quasi-hyponyms' of the noun shape, and hot, warm, cool, cold relate to temperature.

Finally, the hierarchies induced by hyponymy may be cross-cutting. So the animal field also relates to fields involving maturity (adult, young) or sex (male, female) and perhaps other domains. This entails that certain words may be hyponyms of more than one superordinate, depending on different dimensions of relatedness. As we shall see below, such multiple dependencies have given rise to a number of theoretical approaches to word meaning that try to account directly for sense relations in terms of primitive sense components or inheritance of properties in some hierarchical arrangement of conceptual or other properties.

2.2. Synonymy

Synonymy between two words involves sameness of sense, and two words may be defined as synonyms if they are mutually hyponymous. For example, sofa and settee are both
hyponyms of furniture and both mutually entailing, since if Bill is sitting on a settee is true, then it is true that Bill is sitting on a sofa, and vice versa. This way of viewing synonymy defines it as occurring between two words just in case they are mutually intersubstitutable in any sentence without changing the meaning (or truth conditions) of those sentences. So violin and fiddle both denote the same sort of musical instrument, so that Joan plays the violin and Joan plays the fiddle have the same truth conditions (are both true or false in all the same circumstances).

Synonyms are beloved of lexicographers, and thesauri contain lists of putative synonyms. However, true or absolute synonyms are very rarely attested, and there is a significant influence of context on the acceptability of apparent synonyms. Take an example from Roget's thesaurus at random, entry 726, combatant, in which some of the synonyms presented are:

(2) disputant, controversialist, litigant, belligerent; competitor, rival; fighter, assailant, aggressor, champion; swashbuckler, duellist, bully, fighting-man, boxer, gladiator, ...

Even putting to one side the dated expressions, it would be difficult to construct a single context in which all these words could be substituted for each other without altering the meaning or giving rise to pragmatic awkwardness. Context is the key for the acceptability of synonym construal. Even the clear synonymy of fiddle = violin mentioned above shows differences in acceptability in different contexts: if Joan typically plays violin in a symphony orchestra or as soloist in classical concerti, then someone might object that she does not play the fiddle, where that term may be taken to imply the playing of more popular styles of music. Even the sofa = settee example might be argued to show some differences, for example in terms of the register of the word or possibly in terms of possible differences in the objects the words pick out. It would appear, in fact, that humans typically don't entertain full synonymy and, when presented with particular synonyms in context, will try to provide explanations of (possibly imaginary) differences. Such an effect would be explicable in terms of some pragmatic theory such as Relevance Theory (Sperber & Wilson 1986/1995), in which the use of different expressions in the same contexts is expected to give rise to different inferential effects.

A more general approach to synonymy allows there to be degrees of synonymy, where this may be considered to involve degrees of semantic overlap, and it is this sort of synonymy that is typically assumed by lexicographers in the construction of dictionaries. Kill and murder are strongly but not absolutely synonymous, differing perhaps in terms of the intentionality of the killer/murderer and also the sorts of objects such expressions may take (one may kill a cockroach but does not thereby murder it). Of course, there are conditions on the degree of semantic similarity that we consider to be definitional of synonymy. In the first place, it should in general be the case that the denial of one synonym implicitly denies the other. Mary is not truthful seems correctly to implicitly deny the truth of Mary is honest, and Joan didn't hit the dog implies that she didn't beat it. Such implications may only go one way, and that is often the case with near synonyms.
The second condition on the amount of semantic overlap that induces near synonymy is that the terms should not be contrastive. Thus, labrador and corgi have a large amount of semantic overlap in that they express breeds of dog, but there is an inherent contrast between these terms, and so they cannot in general be intersubstituted in any context while maintaining the truth of the sentence. Near-synonyms are often used to explain a word already used: John was dismissed, sacked in fact. But if the terms contrast in meaning in some way then the resulting expression
is usually nonsensical: #John bought a corgi, a labrador, in fact, where # indicates pragmatic markedness.

The felicity of the use of particular words is strongly dependent on context. The reasons for this have to do with the ways in which synonyms may differ. At the very least, two synonymous terms may differ in style or register. So, for example, baby and neonate both refer to newborn humans, but while the neonate was born three weeks premature means the same as the baby was born three weeks premature, What a beautiful neonate! is distinctly peculiar. Some synonyms differ in terms of stylistic markedness. Conceal and hide, for example, are not always felicitously intersubstitutable. John hid the silver in the garden and John concealed the silver in the garden seem strongly synonymous, but John concealed Mary's gloves in the cupboard does not have the same air of normality about it as John hid Mary's gloves in the cupboard. Other aspects of stylistic variation involve expressiveness or slang. So while gob and mouth mean the same, the former is appropriately used in very informal contexts only or for its shock value. Swear words in general often have acceptable counterparts, and euphemism and dysphemism thus provide a fertile ground for synonyms: lavatory = bathroom = toilet = bog = crapper, etc.; fuck = screw = sleep with, and so on. Less obvious differences in expressiveness come about through the use of synonyms that indicate familiarity with the object being referred to, such as the variants of kinship terms: mother = mum = mummy = ma. Regional and dialectal variations of a language may also give rise to synonyms that may or may not co-occur in the language at large: valley = dale = glen or autumn = fall. Sociolinguistic variation thus plays a very large part in the existence of near synonymy in a language.

2.3. Antonymy

The third primary paradigmatic sense relation involves oppositeness in meaning, often called antonymy (although Lyons 1977 restricts the use of this term to gradable opposites), and is defined informally in terms of contrast, such that if 'A is X' then 'A is not Y'. So, standardly, if John is tall is true then John is not small is also true. Unlike hyponymy, there are a number of ways in which the senses of words contrast. The basic distinction is typically made between gradable and ungradable opposites. Typically expressed by adjectives in English and other Western European languages, gradable antonyms form instances of contraries and implicitly or explicitly invoke a field over which the grading takes place, i.e. a standard of comparison. Assuming, for example, that John is human, then human size provides the scale against which John is tall is measured. In this way, John's being tall for a human does not mean that he is tall when compared to buildings. Note that the implicit scale has to be the same scale invoked for any antonym: John is tall contrasts with John is not small for a human, not John is not small for a building. Other examples of gradable antonyms are easy to identify: cold/hot, good/bad, old/young and so on.

(3) Gradable antonymy: Gradable antonyms form instances of contraries and implicitly or explicitly invoke a field over which the grading takes place, i.e. a standard of comparison.

Non-gradable antonyms are, as the name suggests, absolutes and divide up the domain of discourse into discrete classes. Hence, not only does the positive of one antonym imply
the negation of the other, but the negation of one implies the positive of the other. Such non-gradable antonyms are also called complementaries and include pairs such as male/female, man/woman, dead/alive.

(4) Complementaries (or binary antonyms) are all non-gradable, and the sense of one entails the negation of the other and the negation of one sense entails the positive sense of the other.

Notice that there is, in fact, a syntagmatic restriction that is crucial to this definition. It has to be the case that the property expressed by some word is meaningfully predicable of the object to which it is applied. So, while that person is male implies that that person is not female, and that person is not female implies that that person is male, that rock is not female does not imply that that rock is male. Rocks are things that do not have sexual distinctions. Notice further that binary antonyms are quite easily coerced into being gradable, in which case the complementarity of the concepts disappears. So we can say of someone that they are not very alive without committing ourselves to the belief that they are very dead. Amongst complementaries, some pairs of antonyms may be classed as privative, in that one member expresses a positive property that the other negates. These include pairs such as animate/inanimate. Others are termed equipollent when both properties express a positive concept, such as male/female. Such a distinction is not always easy to make: is the relation dead/alive equipollent or privative?

Some relational antonyms differ in the perspective from which a relation is viewed: in other words, according to the order of their arguments. Pairs such as husband/wife, parent/child are of this sort and are called converses.

(5) Converses involve relational terms where the argument positions involved with one lexeme are reversed with another and vice versa.

So if Mary is John's wife then John is Mary's husband. In general, the normal antonymic relation stands, provided that the relation expressed is strictly asymmetric: Mary is Hilary's child implies Mary is not Hilary's parent. Converses may involve triadic predicates as well, such as buy/sell, although it is only the agent and the goal that are reversed in such cases. If Mary buys a horse from Bill then it must be the case that Bill sells a horse to Mary. Note that the relations between the two converses here are not parallel: the agent subject of buy is related to the goal object of sell, whereas the agent subject of sell is related to the source object of buy. This may indicate that the converse relation, in this case at least, resides in the actual situations described by sentences containing these verbs rather than necessarily being inherently part of the meanings of the verbs themselves.

So far, we have seen antonyms that are involved in binary contrasts such as die/live, good/bad and so on, and the relation of antonymy is typically said only to refer to such binary contrasts. But contrast of sense is not per se restricted to binary contrasts. For example, co-hyponyms all have the basic oppositeness property of exclusion. So, if something is a cow, it is not a sheep, dog, lion or any other type of animal, and equally for all other co-hyponyms that are not synonyms. It is this exclusion of sense that makes corgi and labrador non-synonymous despite the large overlap in their semantic properties (as breeds of dogs). Lyons (1977) calls such a non-binary relation 'incompatibility'.
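Before turning to scalar contrasts, the binary relations in (4) and (5) can be restated symbolically (a compact paraphrase of the definitions above; the restriction to a domain D of appropriately sexed individuals is added here to reflect the proviso about rocks):

\[ \text{complementaries:} \quad \forall x \in D\,[\textit{male}(x) \leftrightarrow \neg\,\textit{female}(x)] \]
\[ \text{converses:} \quad \forall x\,\forall y\,[\textit{wife}(x,y) \leftrightarrow \textit{husband}(y,x)] \]
\[ \text{triadic converses:} \quad \forall x\,\forall y\,\forall z\,[\textit{buy}(x,y,z) \leftrightarrow \textit{sell}(z,y,x)] \]

The crossed argument order in the third schema encodes the observation just made that the agent of buy corresponds to the goal of sell and vice versa.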
Some contrastive gradable antonyms form scales, where there is an increase (or decrease) of some characteristic property from one extreme of the scale to the other. With respect to the property heat or temperature we have the scale [freezing, cold, cool, lukewarm, warm, hot, boiling]. These adjectives may be considered to be quasi-hyponyms of the noun heat, and all partake of the oppositeness relation. Interestingly (at least for this scale), related points on the scale act like (gradable) antonyms: freezing/boiling, cold/hot, cool/warm. There are other types of incompatible relations, such as ranks (e.g. [private (soldier), corporal, sergeant, staff sergeant, warrant officer, lieutenant, major, ...]) and cycles (e.g. [Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday]).

Finally on this topic, it is necessary again to point out the context-sensitivity of antonymy. Although within the colour domain red has no obvious antonym, in particular contexts it does. So with respect to wine, the antonym of red is white, and in the context of traffic signals its opposite is green. Without a context, the obvious antonym of dry is wet, but again within the context of wine its antonym is sweet, in the context of skin it is soft, and for food moist (examples taken from Murphy 2003). This contextual dependence is problematic for the definition of antonymy just over words (rather than concepts), unless it is assumed that the lexicon is massively homonymous (see below). (See also article 22 (Löbner) Dual oppositions.)

2.4. Part-whole relations

The final widely recognised paradigmatic sense relation is that involving 'part-of' relations or meronymies.

(6) Meronymy: If X is part-of Y or Y has X then X is a meronym of Y and Y is a holonym of X.

Thus, toe is a meronym of foot and foot is a meronym of leg, which in turn is a meronym of body. Notice that there are some similarities between meronymy and hyponymy, in that a (normal) hand includes fingers and finger somehow includes the idea of hand; but, of course, they are not the same things and so do not always take part in the same entailment relations as hyponyms and superordinates. So while Mary hurt her finger (sort of) entails Mary hurt her hand, just as Mary hurt her lamb entails Mary hurt an animal, Mary saw her finger does not entail Mary saw her hand, whereas Mary saw her lamb does entail Mary saw an animal. Hence, although meronymy is like hyponymy in that part-whole relations define hierarchical distinctions in the vocabulary, it is crucially different in that meronyms and holonyms define different types of object that may not share any semantic properties at all: a finger is not a kind of hand, though it does share properties with hands, such as being covered in skin and being made of flesh and bone; but a wheel shares very little with one of its holonyms, car, beyond being a manufactured object. Indeed, appropriate entailment relations between sentences containing meronyms and their corresponding holonyms are not easily stated and, while the definition given above is a reasonable approximation, it is not unproblematic.

Cruse (1986) attempts to restrict the meronymy relation just to those connections between words that allow both the 'X is part of Y' and 'Y has X' paraphrases. He points out that the 'has a' relation does not always involve a 'part of' one, at least between the two words: a wife has a husband but not #a husband is part of a wife. On the other hand, the reverse
may also not hold: stress is part of the job does not mean that the job has stress, at least not in the sense of possession. However, even if one accepts that both paraphrases must hold of a meronymic pair, there remain certain problems. As an example, while the pair of sentences a husband is part of a marriage and a marriage has a husband seems reasonably acceptable, it is not obvious that marriage is strictly a holonym of husband. It may be, therefore, that it is necessary to restrict the relation to words that denote things of the same general type, concrete or abstract, which will induce different 'part of' relations depending on the way some word is construed. So, a chapter is part of a book = Books have chapters if book is taken to be the abstract construal of structure, but not if it is taken to be the concrete object. Furthermore, it might be necessary to invoke notions like 'discreteness' in order to constrain the relation. For example, flesh is part of a hand and hands have flesh, but are these words thereby in a meronymic relationship? Flesh is a substance and so not individuated, and if meronymy requires parts and wholes to be discretely identifiable, then the relationship would not hold of these terms. Again we come to the problem of world-knowledge, which tells us that fingers are prototypically parts of hands, versus word-knowledge: is it the case that the meaning of 'finger' necessarily contains the information that it forms part of a hand, and thus that some aspect of the meaning of 'hand' is contained in the meaning of 'finger'? If that were the case, how do we account for the lack of any such inference in extensions of the word to cover (e.g.) emerging shoots of plants (cf. finger of asparagus)? (See article 32 (Hobbs) Word meaning and world knowledge.)

This short paragraph does not do justice to the extensive discussions of meronymy, but it should be clear that it is by far the most problematic of the paradigmatic relations to pin down, a situation that has led some scholars to reject its existence as a different type of sense relation altogether. (See further Croft & Cruse 2004: 159-163, Murphy 2003: 216-235.)

2.5. Syntagmatic relations

Syntagmatic relations between words appear to be less amenable to the sort of taxonomies associated with paradigmatic relations. However, there is no doubt that some words 'go naturally' with each other, beyond what may be determined by general syntactic rules of combination. At one extreme, there are fixed idioms where the words must combine to yield a specific meaning. Hence we have idiomatic expressions in English meaning 'die' such as kick the bucket (a reference to death by hanging) or pass away or pass over (to the other side) (references to religious beliefs) and so on. There are also certain words that have only a very limited ability to appear with others, as is the case with addled, which can only apply to eggs or brains. Other collocational possibilities may be much freer, although none are constrained solely by syntactic category. For example, hit is a typical transitive verb in English that takes a noun phrase as object. However, it further constrains which noun phrases it acceptably collocates with by requiring the thing denoted by that noun phrase to have concrete substance. Beyond that, collocational properties are fairly free; see article 20 (Fellbaum) Idioms and collocations:

(7) The plane hit water/a building/#the idea.
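A collocational restriction of this kind can be given a toy computational rendering. The feature inventory and the lexical entries below are hypothetical, chosen simply to mirror (7); they make no claim about any actual lexicon:

    # Toy sketch: selectional restrictions as feature checking.
    NOUN_FEATURES = {
        "water":    {"concrete"},
        "building": {"concrete"},
        "idea":     {"abstract"},
    }

    VERB_RESTRICTIONS = {
        # hit demands a concrete object (in non-metaphorical uses)
        "hit": {"concrete"},
    }

    def acceptable(verb, noun):
        """True if the noun carries every feature the verb demands
        of its object position."""
        return VERB_RESTRICTIONS[verb] <= NOUN_FEATURES.get(noun, set())

    for noun in ("water", "building", "idea"):
        print(f"hit + {noun}:", acceptable("hit", noun))
    # hit + water: True / hit + building: True / hit + idea: False

A failed check need not signal outright ill-formedness; as the discussion below makes clear, hearers frequently rescue such clashes by coercion or metaphorical reinterpretation.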
That words do have semantic collocational properties can be seen by examining strings of words that are 'grammatical' (however defined) but make no sense. An extreme
example of this is Chomsky's (1965) ubiquitous colorless green ideas sleep furiously, in which the syntactic combination of the words is licit (as an example of a subject noun phrase containing modifiers and a verb phrase containing an intransitive verb and an adverb) but no information is expressed, because of the semantic anomalies that result from this particular combination. There are various sources of anomaly. In the first place, there may be problems resulting from what Cruse (2000) calls collocational preferences. Where such preferences are violated, various degrees of anomaly can arise, ranging from the marginally odd through to the incomprehensible. For example, the sentence my pansies have passed away is peculiar because pass away is typically predicated of a human (or pet), not flowers. However, the synonymous sentence my pansies have died involves no such peculiarity, since the verb die is predicable of anything which is capable of life, such as plants and animals. A worse clash of meaning thus derives from the collocation of an inanimate subject and any verb or idiom meaning 'die'. My bed has died is thus worse than my pansies have passed away, although notice that metaphorical interpretations can be (and often are) attributed to such strings. For example, one could interpret my bed has died as indicating that the bed has collapsed or is otherwise broken, and such metaphorical extensions are common. Compare the use of die collocated with words such as computer, car, phone, etc. Notice further in this context that the antonym of die here is not live but go or run.

It is only when too many clashes occur that metaphorical interpretation breaks down and no information at all can be derived from a string of words. Consider again colorless green ideas sleep furiously. Parts of this sentence are less anomalous than the whole, and we can assign (by whatever means) some interpretation to them:

(8) a. Green ideas: 'environmentally friendly ideas' or 'young, untried ideas' (both via the characteristic property of young plant shoots);
b. Colorless ideas: 'uninteresting ideas' (via lacklustre, dull);
c. Colorless green ideas: 'uninteresting ideas about the environment' or 'uninteresting untried ideas' (via associated negative connotations of things without colour);
d. Ideas sleep: 'ideas are not currently active' (via inactivity associated with sleeping);
e. Green parrots sleep furiously: 'parrots determinedly asleep (?)' or 'parrots restlessly asleep'.

But in putting all the words together, the effort involved in resolving all the contradictions outweighs any possible effect that the information content of the final proposition could have on the context. The more contradictions that need to be resolved in processing some sentence, the greater the amount of computation required to infer a non-contradictory proposition from it, and the less information the inferred proposition will convey. A sentence may be said to be truly anomalous if there is no relevant proposition that can be deduced from it by pragmatic means.

A second type of clash involving collocational preferences is induced when words are combined into phrases but add no new information to the string. Cruse calls this pleonasm and exemplifies it with examples such as John kicked the ball with his foot (Cruse 2000: 223). Since kicking involves contact with something by a foot, the string-final prepositional phrase adds nothing new and the sentence is odd (even though context may allow the apparent tautology to be acceptable).
Similar oddities arise with collocations
such as female mother or human author. Note that pleonasm does not always give rise to feelings of oddity. For example, pregnant female does not seem as peculiar as #female mother, even though the concept of 'female' is included in 'pregnant'. This observation is indicative of the fact that certain elements in strings of words have a privileged status. In particular, pleonastic anomaly appears to be worse when a semantic head (a noun or verb that determines the semantic properties of its satellites, and which does not always coincide with what may be identified as the syntactic head of a construction) appears with a modifier (or sometimes a complement) whose meaning is contained within that of the head: bovine mammal is better than #mammalian cow (what other sort of cow could there be?). Pleonastic anomaly can usually be obviated by substituting a hyponym for one expression or a superordinate for the other, since this will give rise to informativity with respect to the combination of words: female parent, He struck the ball with his foot and so on.

Collocational preferences are often discussed with respect to the constraints imposed by verbs or nouns on their arguments, and sometimes these constraints have been incorporated into syntactic theories. In Chomsky (1965), for example, the subcategorisation of verbs was defined not just in terms of the syntactic categories of their arguments but also in terms of their semantic selectional properties. A verb like kick imposes a restriction on its direct object that it be concrete (in non-metaphorical uses) and on its subject that it be something with legs, like an animal, whereas verbs like think require abstract objects and human subjects. Subsequent research into the semantic properties of arguments led to the postulation of participant (or case or thematic) roles which verbs 'assign' to their arguments, with the effect that certain roles constrain the semantic preferences of the verb to certain sorts of subjects and objects. A verb like fear, therefore, assigns to its subject the role of experiencer, thus limiting acceptable collocations to things that are able to fear, such as humans and other animals. Some roles, such as experiencer, agent, recipient, etc., are more tightly constrained by certain semantic properties, such as animacy, volition and mobility, than others, such as theme or patient (Dowty 1991).

We have seen that paradigmatic relations cannot always be separated from concepts of syntagmatic relatedness. Lyons, for example, attempts to define hyponymy in terms of the modification of a superordinate, while the basic relation of antonymy holds only if the word being modified denotes something of which the relevant properties can appropriately be predicated. Given that meanings are constructed in natural languages by putting words together, it would be unsurprising if syntagmatic relations were, in some sense, primary and paradigmatic relations were principally determined by collocational properties between words. Indeed, the primacy of syntagmatic relations is supported by psycholinguistic and acquisition studies. It is reported in Murphy (2003), for example, that in word association experiments children under 7 tend to provide responses that reflect collocational patterns rather than paradigmatic ones. So to a word like black, young children are more likely to respond with bird or board than with the antonym white.
Older children and adults, on the other hand, tend to give paradigmatic responses. There is, furthermore, some evidence that knowledge of paradigmatic relations is associated with metalinguistic awareness: that is, awareness of the properties of, and interactions between, those words. The primacy of syntagmatic relations over paradigmatic relations seems further to be borne out by corpus and computational studies of collocation and lexis. For example, there are many approaches to the automatic disambiguation of homonyms, identical
word forms that have different meanings (see below), that rely on syntagmatic context to determine which sense is the most likely on a particular occasion of the use of some word. Such studies also rely on corpus work from which collocational probabilities between expressions are calculated. In this regard, it is interesting also to consider experimental research in computational linguistics which attempts to induce automatic recognition of synonyms and hyponyms in texts. Erk (2009) reports research which uses vector spaces to define representations of word meanings, where such spaces are defined in terms of collocations between words in corpora. Without going into any detail here, she reports that (near) synonyms can be identified in this manner to a high degree of accuracy, but that hyponymic relations cannot be identified easily without such information being directly encoded; the use of vector spaces, however, allows such encoding to be done and yields good results. Although this is not her point, Erk's results are interesting with respect to the possible relation between syntagmatic and paradigmatic sense relations. Synonymy may be defined not just semantically, as involving sameness of sense, but syntactically, as allowing substitution in all the same linguistic contexts (an idealisation, of course, given the rarity of full synonymy, but probabilistic techniques may be used to get a definition of the degree of similarity between the contexts in which two words can appear). Hence, we might expect that defining vector spaces in terms of collocational possibilities in texts will yield a high degree of comparability between synonyms. But hyponymy cannot be defined easily in syntactic terms, as hyponyms and their superordinates will not necessarily collocate in the same way (for example, four-legged animal is more likely than four-legged dog, while Collie dog is fine but #Collie animal is marginal at best). Thus, just taking collocation into account, even in a sophisticated manner, will fail to identify words in hyponymous relations. This implies that paradigmatic sense relations are 'higher order' or 'metalexical' relations that do not emerge directly from syntagmatic ones.
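The collocation-vector idea behind such work can be illustrated with a toy sketch. The vocabulary and counts below are invented for exposition; research of the kind Erk reports uses large corpora and more sophisticated weighting schemes:

    import math

    # Invented co-occurrence counts: how often each context word is
    # observed within a window around the target word.
    VECTORS = {
        "sofa":   {"sit": 12, "comfortable": 7, "living-room": 5},
        "settee": {"sit": 10, "comfortable": 6, "living-room": 4},
        "violin": {"play": 15, "orchestra": 9, "string": 6},
    }

    def cosine(u, v):
        """Cosine similarity between two sparse count vectors."""
        dot = sum(u[w] * v[w] for w in set(u) & set(v))
        norm_u = math.sqrt(sum(c * c for c in u.values()))
        norm_v = math.sqrt(sum(c * c for c in v.values()))
        return dot / (norm_u * norm_v)

    print(cosine(VECTORS["sofa"], VECTORS["settee"]))  # close to 1
    print(cosine(VECTORS["sofa"], VECTORS["violin"]))  # 0.0: no shared contexts

The invented sofa/settee vectors come out nearly parallel, while sofa and violin share no contexts at all; as just noted, however, a high score cannot by itself tell us whether two words are synonyms, co-hyponyms, or hyponym and superordinate.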
I leave this matter to one side from now on because, despite the strong possibility that syntagmatic relations are cognitively primary, paradigmatic relations remain the focus of studies of lexical semantics.

2.6. Homonymy and polysemy

Although not sense relations of the same sort as those reviewed above, in that they do not structure the lexicon in systematic ways, homonymy and polysemy nevertheless have an important place in considerations of word meaning and have played an important part in the development of theories of lexical semantics since the last two decades of the twentieth century. Homonymy involves formal identity between words with distinct meanings (i.e. interpretations with distinct extensions and senses), which Weinreich (1963) calls "contrastive ambiguity". Such formal identity may involve the way a word is spoken (homophony, 'same sound'), as with bank, line, taxi, can, lead (noun)/led (verb), and/or the way it is written (homography, 'same writing'), as with bank, line, putting. It is often the case that the term homonymy is reserved only for those words that are both homophones and homographs, but equally often the term is used for either relation. Homonymy may be full or partial: in the former case, every form of the lexeme is identical for both senses, as holds for the noun punch (the drink or the action); in the latter, only some forms of the lexeme are identical for both senses, as between the verb punch and its corresponding noun. Homonymy leads to
the sort of ambiguity that is easily resolved in discourse context, whether locally through syntactic disambiguation (9a), through the context provided within a sentence (9b), or from the topic of conversation (9c).

(9) a. His illness isn't terminal.
b. My terminal keeps cutting out on me.
c. I've just been through Heathrow Airport. The new terminal is rubbish.

In general, there is very little to say about homonymy. It is random and generally only tolerated when the meanings of the homonyms are sufficiently semantically differentiated as to be easily disambiguated. Of more interest is polysemy, in which a word has a range of meanings in different local contexts but in which the meaning differences are taken to be related in some way. While homonymy may be said to involve true ambiguity, polysemy involves some notion of vagueness or underspecification with respect to the meanings a polyseme has in different contexts (see article 23 (Kennedy) Ambiguity and vagueness). The classic example of a polysemous word is mouth, which can denote the mouth of a human or animal or various other types of opening, such as that of a bottle or, more remotely, of a river. Unlike homonymy, no notion of contrast in sense is involved, and polysemes are considered to have an apparently unique basic meaning that is modified in context. The word bank is both a homonym and a polyseme: in its meaning of 'financial establishment' it is polysemous between its interpretation as the institution (The bank raised its interest rates yesterday) and its physical manifestation (The bank is next to the school). One of the things that differentiates polysemy from homonymy is that the different senses of polysemes are not 'suppressed' in context (as with homonyms); rather, one aspect of sense is foregrounded or highlighted. Other senses are available in the discourse and can be picked up by other words in the discourse:

(10) a. Mary tried to jump through the window (aperture), but it was closed (aperture/physical object) and she broke it (physical object).
b. *Mary walked along the bank of the river. It had just put up interest rates yet again.

Polysemy may involve a number of different properties: change of syntactic category (11); variation in valency (12); and subcategorisation properties (13).

(11) a. Rambo picked up the hammer (noun).
b. Rambo hammered (verb) the nail into the tree.

(12) a. The candle melted.
b. The heat melted the candle.

(13) a. Rambo forgot that he had buried the cat (clausal complement - factive interpretation).
b. Rambo forgot to bury the cat (infinitival complement - non-factive interpretation).

Some polysemy may be hidden and extensive, as often with gradable adjectives, where the adjective often picks out some typical property associated with the head noun that it modifies, which may vary considerably from noun to noun, as with the adjective good in
(15), where the interpretation varies considerably according to the semantics of the head noun, contrasting strongly with other adjectives like big, as illustrated in (14). Note that one might characterise the meaning of this adjective in terms of an underspecified semantics such as that given in (15e).

(14) big car/big computer/big nose: 'big for N'

(15) a. good meal: 'tasty, enjoyable, pleasant'
b. good knife: 'sharp, easy to handle'
c. good car: 'reliable, comfortable, fast'
d. good typist: 'accurate, quick, reliable'
e. good N: 'positive evaluation of some property associated with N'

There are also many common alternations in polysemy that one may refer to as constructional (or logical) polysemy, since they are regular and result from the semantic properties of what is denoted.

(16) a. Figure/Ground: window, door, room
b. Count/Mass: lamb, beer
c. Container/Contained: bottle, glass
d. Product/Producer: book, Kleenex
e. Plant/Food: apple, spinach
f. Process/Result: examination, merger

Such alternations depend to a large degree on the perspective that is taken with respect to the objects denoted. So a window may be viewed in terms of an aperture (e.g. when it is open) or in terms of what it is made of (glass or plastic in some sort of frame), while other nouns can be viewed in terms of their physical or functional characteristics, and so on.

Polysemy is not exceptional but rather the norm for word interpretation in context. Arguably every content word is polysemous and may have its meaning extended in context, systematically or unsystematically. The sense extensions of mouth, for example, are clear examples of unsystematic metaphorical uses of the word, unsystematic because the metaphor cannot be extended to just any sort of opening: ?#mouth of a flask, ?#mouth of a motorway, ?#mouth of a stream, #mouth of a pothole. Of course, any of these collocations could become accepted, but it tends to be the case that, until a particular collocation has become commonplace, the phrase will be interpreted as involving real metaphor rather than the use of a polysemous word. Unsystematic polysemy, therefore, may have a diachronic dimension, with true (but not extreme) metaphorical uses becoming interpreted as polysemy once established within a language. It is also possible for diachronically homonymous terms to be interpreted as polysemes at some later stage. This has happened with the word ear, where the two senses (of a head and of corn) derive from different words in Old English (ēare and ēar, respectively).

More systematic types of metaphorical extension have been noted above, but extensions may also result from metonymy: the use of a word in a non-literal way, often based on a part-whole or 'connected to' relationship. This may happen with respect to names of composers or authors, where the use of the name may refer to the person or to what they
have produced. (17) may be interpreted as Mary liking the man or the music (and indeed listening to or playing the latter).

(17) Mary likes Beethoven.

Ad hoc types of metonymy may simply extend the concept of some word to some aspect of a situation that is loosely related to, but contextually determined by, what the word actually means. John has new wheels may be variously interpreted as John having a new car or, if he was known to be paraplegic, as him having a new wheelchair. A more extreme, but classic, example is one like (18), in which the actual meaning of the food lasagna is extended to the person who ordered it. (Such examples are also known as 'ham sandwich' cases after the examples found in Nunberg 1995.)

(18) The lasagna is getting impatient.

Obviously context is paramount here. In a classroom or on a farm, the example would be unlikely to make any sense, whereas in a restaurant, where the situation necessarily involves a relation between customers and food, a metonymic relation can be easily constructed. Some metonymic creations may become established within a linguistic community and thus become less context-dependent. For example, the word suit(s) may refer not only to the item of clothing but also to people who wear them, and thus the word gets associated with the types of people who do jobs that involve the wearing of suits, such as business people.

3. Sense relations and word meaning

As indicated in the discussion above, the benefit of studying sense relations appears to be that it gives us an insight into word meaning generally. For this reason, such relations have often provided the basis for different theories of lexical semantics.

3.1. Lexical fields and componential analysis

One of the earliest modern attempts to provide a theory of word meaning using sense relations is associated with the European structuralists, developing out of the work of de Saussure in the first part of the twentieth century. Often associated with theories of componential analysis (see below), lexical field theory gave rise to a number of competing approaches to lexical meaning, which all share the hypothesis that the meanings (or senses) of words derive from their relations to other words within some thematic/conceptual domain defining a semantic or lexical field. In particular, it is assumed that the hierarchical and contrastive relations between words sharing a conceptual domain are sufficient to define the meaning of those words. Early theorists such as Trier (1934) or Porzig (1934) were especially interested in the way such fields develop over time, with words shifting with respect to the part of a conceptual field that they cover as other words come into or leave that space. For Trier, the essential properties of a lexical field are that:

(i) the meaning of an individual word is dependent upon the meaning of all the other words in the same conceptual domain;
(ii) a lexical field has no gaps, so that the field covers some connected conceptual space (or reflects some coherent aspect of the world); (iii) if a word undergoes a change in meaning, then the whole structure of the lexical field also changes. One of the obvious weaknesses of such an approach is that a conceptual domain cannot be identified independently of the meanings of the expressions themselves, and so the approach appears somewhat circular. Indeed, such research presented little more than descriptions of diachronic semantic changes, as there was little or no predictive power in determining what changes are and are not possible within lexical fields, nor which lexical gaps are tolerated and which are not. Indeed, it seems reasonable to suppose that no such theory could exist, given the randomness that the lexical development of contentive words displays, and so there is no reason to suppose that sense relations play any part in determining such change. (See Ullman 1957, Geckeler 1971, Coseriu & Geckeler 1981 for detailed discussions of field theories at different periods of the twentieth century.)

Although it is clear that 'systems of interrelated senses' (Lyons 1977: 252) exist within languages, it is not clear that they can usefully form the basis for explicating word meaning. The most serious criticism of lexical field theory as more than a descriptive tool is that it has unfortunate implications for how humans could ever know the meaning of a word: if a word's meaning is determined by its relation to other words in its lexical field, then to know that meaning someone has to know all the words associated with that lexical field. For example, to know the meaning of tulip, it would not be enough to know that it is a hyponym of (plant) bulb and a co-hyponym of daffodil, crocus, anemone, lily and dahlia, but also of trillium, erythronium, bulbinella, disa, brunsvigia and so on. But only a botanist specialising in bulbous plants is likely to know anything like the complete list of names, and even then this is unlikely. Of course, one might say that individuals might have a shallower or deeper knowledge of some lexical field, but the problem persists if one is trying to characterise the nature of word meaning within a language rather than within individuals. And it means that the structure of a lexical field, and thus the meaning of a word, will necessarily change with any and every apparent change in knowledge. But it is far from clear that the meaning of tulip would be affected if, for example, botanists decided that a disa is not bulbous but rhizomatous, and thus does not after all form part of the particular lexical field of plants that have bulbs as storage organs. It is obvious that someone can be said to know the meaning of tulip independently of whether they have any knowledge of any other bulb or even flowering plant.

Field theory came to be associated in the nineteen-sixties with another theory of word meaning in which sense relations played a central part. This is the theory of componential analysis, which was adapted by Katz & Fodor (1963) for linguistic meaning from similar approaches in anthropology. In this theory, the meaning of a word is decomposed into semantic components, often conceived as features of some sort. Such features are taken to be cognitively real semantic primitives which combine to define the meanings of words in a way that automatically predicts their paradigmatic sense relations with other words.
For example, one might decompose the two meanings of dog as consisting of the primitive features [CANINE] and [CANINE, MALE, ADULT]. Since the latter contains the semantic structure of the former, it is directly determined to be a hyponym. Assuming that bitch has the componential analysis [CANINE, FEMALE, ADULT], the hyponym meaning of dog is easily identified as an antonym of bitch, as the two differ in just one semantic feature. So the theory provides a direct way of accounting for sense relations: synonymy involves identity of features; hyponymy involves extension of features; and antonymy involves difference in one feature.
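For illustration, these three criteria can be rendered mechanically as a small sketch (in Python); the feature inventory and the set-based encoding are assumptions for the purposes of the example, not part of Katz & Fodor's formalism:

    # A minimal sketch of the feature calculus just described; the feature
    # inventory and the three criteria (identity, extension, one-feature
    # difference) follow the text, but the encoding itself is illustrative.
    dog_general = frozenset({'CANINE'})
    dog_male    = frozenset({'CANINE', 'MALE', 'ADULT'})
    bitch       = frozenset({'CANINE', 'FEMALE', 'ADULT'})

    def synonymous(a, b):
        return a == b                    # identity of features

    def hyponym_of(a, b):
        return b < a                     # a properly extends b's features

    def antonymous(a, b):
        return len(a ^ b) == 2           # exactly one feature swapped

    assert hyponym_of(dog_male, dog_general)
    assert antonymous(dog_male, bitch)   # MALE vs. FEMALE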
Although actively pursued in the nineteen-sixties and -seventies, the approach fell out of favour in mainstream linguistics for a number of reasons. From an ideological perspective, the theory became associated with the Generative Semantics movement, which attempted to derive surface syntax from deep semantic meaning components. When this movement was discredited, the logically distinct semantic theory of componential analysis was largely rejected along with it. More significantly, however, the theory came in for heavy criticism. In the first place, there is the problem of how primitives are to be identified, particularly if the assumption is that the set of primitives is universal. Although more recent work has attempted to give this aspect of decompositional theories as a whole a more empirically motivated foundation (Wierzbicka 1996), there nevertheless appears to be some randomness in the choice of primitives and the way they are said to operate within particular languages. Additionally, the theory has problems with things like natural kinds: what distinguishes [CANINE] from the meaning of dog or [EQUINE] from horse? And does each animal (or plant) species have to be distinguished in this way? If so, then the theory achieves very little beyond adding information about sex and age to the basic concepts described by dog and horse. Using Latinate terms to indicate sense components for natural kinds simply obscures the fact that the central meanings of these expressions are not decomposable. An associated problem is that features were often treated as binary, so that, for example, puppy might be analysed as [CANINE, −ADULT]. Likewise, instead of MALE/FEMALE one might have ±MALE, or [−ALIVE] for dead. The problem here is obvious: how does one choose a non-arbitrary property as the unmarked one? ±FEMALE and ±DEAD are just as valid as primitive features as the reverse, reflecting the fact that male/female and dead/alive are equipollent antonyms (see section 2.3). Furthermore, the restriction to binary values excludes the relational concepts that are necessary for any analysis of meaning in general. Overall, then, while componential analysis does provide a means of predicting sense relations, it does so at the expense of a considerable amount of arbitrariness.

3.2. Lexical decomposition

Although the structuralist concept of lexical fields did not develop in the way that its proponents might have expected, it nevertheless reinforced the view that words are semantically related and that this relatedness can be identified and used to structure a vocabulary. It is this concept of a structured lexicon that persists in mainstream linguistics. In the same way, lexical decompositional analyses have developed in rather different ways than were envisaged at the time that componential semantic analysis was developed. See article 17 (Engelberg) Frameworks of decomposition.

Decomposition of lexical meaning appears in Montague (1973), one of the earliest attempts to provide a formal analysis of a fragment of a natural language, as one of two different mechanisms for specifying the interpretations of words.
Certain expressions with a logical interpretation, like be and necessarily, are decomposed, not into cognitive primitives, but into complex logical expressions reflecting their truth-conditional content. For example, necessarily receives the logical translation λp[□ˇp], where the abstracted variable p ranges over propositions and the decomposition has the effect
of equating the semantics of the adverb with that of the logical necessity operator, □. Montague restricted decomposition of this sort to those grammatical expressions whose truth-conditional meaning can be given a purely logical characterisation. In a detailed analysis of word meaning within Montague Semantics, however, Dowty (1979) argued that certain entailments associated with content expressions are constant in the same way as those associated with grammatical expressions, and he extended the decompositional approach to analyse such words in order to capture such apparently independent entailments. Dowty's exposition is concerned primarily with inferences from verbs (and more complex predicates) that involve tense and modality. By adopting three operators, DO, BECOME and CAUSE, he is able to decompose the meanings of a range of different types of verbs, including activities, accomplishments, inchoatives and causatives, to account for the entailments that can be drawn from sentences containing them. For example, he provides decomposition rules for de-adjectival inchoative and causative verbs in English that modify the predicative interpretation of base adjectives. Dowty uses the propositional operator BECOME for inchoative interpretations of (e.g.) cool: λx[BECOME cool'(x)], where cool' is the semantic representation of the meaning of the predicative adjective and the semantics of BECOME ensures that the resulting predicate is true of some individual just in case it is now cool but just previously was not. The causative interpretation of the verb involves the CAUSE operator in addition: λy λx[x CAUSE [BECOME cool'(y)]], which guarantees the entailment between (e.g.) Mary cooled the wine and Mary caused the wine to become cool. Dowty also gives more complex (and less obviously logical) decompositions for other content expressions, such as kill, which may be interpreted as x causes y to become not alive.
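The truth conditions Dowty assigns to BECOME can be illustrated with a small sketch (in Python); the discrete timeline and the state encoding are assumptions made for the example, not Dowty's interval semantics itself:

    # A minimal sketch of a BECOME-style operator over a discrete timeline
    # of states; the timeline and its encoding are invented for illustration.
    def become(pred, timeline, t):
        """BECOME(pred) is true at t iff pred was false just before t and is true at t."""
        return t > 0 and not pred(timeline[t - 1]) and pred(timeline[t])

    cool = lambda state: state['wine'] == 'cool'
    timeline = [{'wine': 'warm'}, {'wine': 'warm'}, {'wine': 'cool'}]

    assert become(cool, timeline, 2)        # 'the wine became cool' at t = 2
    assert not become(cool, timeline, 1)    # no transition yet at t = 1
    # CAUSE would additionally relate an action of Mary's to this transition,
    # yielding the entailment from 'Mary cooled the wine' to the BECOME claim.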
The quasi-logical decompositions suggested by Dowty have been taken up in theories such as Role and Reference Grammar (van Valin & LaPolla 1997), but primarily to account for syntagmatic relations such as argument realisation rather than for paradigmatic sense relations. The same is not quite true of other decompositional theories of semantics, such as that put forward in the Generative Lexicon Theory of Pustejovsky (1995). Pustejovsky presents a theory designed specifically to account for structured polysemous relations such as those given in (11-13), utilising a complex internal structure for word meanings that goes a long way further than that put forward by Katz & Fodor (1963). In particular, words are associated with a number of different 'structures' that may be more or less complex. These include: argument structure, which gives the number and semantic type of logical arguments; event structure, specifying the type of event of the lexeme; and lexical inheritance structure, essentially hyponymous relations (which can be of a more general type showing the hierarchical structure of the lexicon). The most powerful and controversial structure proposed is the qualia structure. Qualia is a Latin term meaning 'of whatever sort' and is used for the Greek aitiai 'blame', 'responsibility' or 'cause' to link the current theory with Aristotle's modes of explanation. Essentially the qualia structure gives semantic properties of a number of different sorts, concerning basic sense properties, prototypicality properties and certain sorts of encyclopaedic information. This provides an explicit model for how meaning shifts and polyvalency phenomena interact. The qualia structure provides the structural template over which semantic combinatorial devices, such as co-composition, type coercion and subselection, may apply to alter the meaning of a lexical item. Pustejovsky (1991: 417) defines qualia structure as:
- The relation between [a word's denotation] and its constituent parts
- That which distinguishes it within a larger domain (its physical characteristics)
- Its purpose and function
- Whatever brings it about

Such information is used in interpreting sentences such as those in (19):

(19) a. Bill uses the train to get to work.
     b. This car uses diesel fuel.

The verb use is semantically underspecified, and the factors that allow us to determine which sense is appropriate for any instance of the verb are the qualia structures of each phrase in the construction and a rich mode of composition, which is able to take advantage of this information. For example, in (19a) it is the function of trains to take people to places, so use here may be interpreted as 'catches', 'takes' or 'rides on'. Analogously, cars contain engines and engines require fuel to work, so the verb in (19b) can be interpreted as 'runs on', 'requires', etc. Using these mechanisms Pustejovsky provides analyses of complex lexical phenomena, including coercion, polysemy and both paradigmatic and syntagmatic sense relations. Without going into detail, Pustejovsky's fundamental hypothesis is that the lexicon is generative and compositional, with complex meanings deriving from less complex ones in structured ways, so that the lexical representations of words should contain only as much information as they need to express a basic concept that allows as wide a range of combinatorial properties as possible. Additionally, lexical information is hierarchically structured, with rules specifying how phrasal representations can be built up from lexical ones as words are combined. This view contrasts strongly with the one discussed in the next section. See article 17 (Engelberg) Frameworks of decomposition.

3.3. Meaning postulates and semantic atomism

In addition to decomposition for logico-grammatical word meaning, Montague (1973) also adopted a second approach to accounting for the meaning of basic expressions, one that relates the denotations of words (analogously, concepts) to each other via logical postulates. Meaning postulates were introduced in Carnap (1956) and consist of universally quantified conditional or bi-conditional statements in the logical metalanguage which constrain the denotations of the constant that appears in the antecedent. For example, Montague provides an example that relates the denotations of the verb seek and the phrase try to find. (20) states (simplified from Montague) that for every instance of x seeking y there is an instance of x trying to find y:

(20) □∀x∀y[seek'(x,y) ↔ try-to'(x, ^find'(x,y))]

Note that the semantics of seek, on this approach, does not contain the content of try to find, as in the decompositional approach. The necessity operator, □, ensures that the relation holds in all admissible models, i.e. in all states-of-affairs that we can talk about using the object language. This raises the bi-conditional statement to the status of a logical truth (an axiom), which ensures that on every occasion in which it is true to say of
someone that she is seeking something, it is also true to say that she is trying to find that something (and vice versa). Meaning postulates provide a powerful tool for encoding detailed information about non-logical entailments associated with particular lexemes (or their translation counterparts). Note that within formal, model-theoretic semantics such postulates act not as constraints on the meanings of words, but on their denotations. In other words, they reflect world knowledge, how situations are, not how word meanings relate to each other. While it is possible to use meaning postulates to capture word meanings within model-theoretic semantics, this requires a full intensional logic and the postulation of 'impossible worlds' to allow fine-grained differentiations between worlds in which certain postulates do not hold. (See Cann 1993 for an attempt at this, and a critique of the notion of impossible worlds in Fox & Lappin 2005; cf. also article 33 (Zimmermann) Model-theoretic semantics.)

A theory that utilises meaning postulates treats the meanings of words as atomic, with their semantic relations specified directly. So, although traditional sense relations, both paradigmatic and syntagmatic, can easily be reconstructed in the system (see Cann 1993 for an attempt at this), they do not follow from the semantics of the words themselves. For advocates of this theory, this is taken as an advantage. In the first place, it allows for conditional, as opposed to bi-conditional, relations, the latter being what a decompositional approach requires. So while we might want to say that an act of killing involves an act of causing something to die, the reverse may not hold. If kill is decomposed as x CAUSE (BECOME (¬alive'(y))), then this fact cannot be captured. A second advantage of atomicity is that even if a word's meaning can be decomposed to a large extent, there is nevertheless often a 'residue of meaning' which cannot be decomposed into other elements. This is exactly what the feature CANINE is in the simple componential analysis given above: it is the core meaning of dog/bitch that cannot be further decomposed. In decomposition, therefore, one needs both some form of atomic concept and the decomposed elements, whereas in atomic approaches word meanings are individual concepts (or denotations), not further decomposed. What relations they have with the meanings of other words is a matter of the world (or of experience of the world), not of the meanings of the words themselves. Fodor & Lepore (1998) argue extensively against decompositionality, in particular against Pustejovsky's notion of the generative lexicon, in a way similar to the criticism made against field theories above. They argue that while it might be that dog (necessarily) denotes an animal, knowing that dogs are animals is not necessary for knowing what dog means. Since knowing such inferences is not necessary for knowing the meaning of the word, they (including interlexical relations) should not be imposed on lexical entries, because these relations are not part of the linguistic meaning. Criticisms can be made of atomicity and the use of meaning postulates (see Pustejovsky 1998 for a rebuttal of Fodor & Lepore's views). In particular, since meaning postulates are capable of defining any type of semantic relation, traditional sense relations form just arbitrary and unpredictable parts of the postulate system, impossible to generalise over.
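The point about one-way (conditional) postulates can be made concrete with a small sketch (in Python); the entities, the relation names and the brute-force enumeration of models are assumptions made for the example, not Montague's or Dowty's actual fragments:

    # A minimal sketch of a conditional postulate of the kind just discussed
    # for kill and cause to die; all names and data are invented.
    from itertools import chain, combinations

    entities = ['a', 'b']
    pairs = [(x, y) for x in entities for y in entities]

    def subsets(s):
        s = list(s)
        return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

    # admissible models satisfy: kill'(x,y) -> cause-become-not-alive'(x,y)
    models = [{'kill': set(k), 'cause_die': set(c)}
              for k in subsets(pairs) for c in subsets(pairs)]
    admissible = [m for m in models if m['kill'] <= m['cause_die']]

    # the entailment holds in every admissible model, but not its converse:
    assert all(m['kill'] <= m['cause_die'] for m in admissible)
    assert any(not m['cause_die'] <= m['kill'] for m in admissible)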
Nevertheless it is possible to define theories in which words have atomic meanings but the paradigmatic sense relations are used to organise the lexicon. One such is WordNet, developed by George A. Miller (1995) to provide a lexical database of English organised by grouping together words that are cognitive synonyms (a synset), each of which expresses a distinct concept, with different concepts associated with a word being found in different synsets (much like a thesaurus). These synsets are then related to each other by lexical and conceptual properties, including the basic paradigmatic sense relations.
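The organisation by synsets can be inspected directly; the following sketch uses NLTK's WordNet interface (it assumes nltk is installed and the 'wordnet' corpus has been downloaded, and the synset names reflect one particular WordNet version):

    # A minimal sketch of querying WordNet's synsets and sense relations.
    from nltk.corpus import wordnet as wn

    for synset in wn.synsets('dog')[:3]:
        print(synset.name(), synset.lemma_names())   # one synset = one concept

    # hyponymy is encoded as links between synsets, not between words:
    print(wn.synset('dog.n.01').hypernyms())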
Although it remains true that the sense relations are stated independently of the semantics of the words themselves, it is nonetheless possible to claim that using them as an organisational principle of the lexicon provides them with a primitive status with respect to human cognitive abilities. WordNet was set up to reflect the apparent way that humans process expressions in a language, and so using the sense relations as an organisational principle is tantamount to claiming that they are the basis for the organisation of the human lexicon, even if the grouping of specific words into synsets and the relations defined between them are not determined directly by the meanings of the words themselves. (See article 110 (Frank & Padó) Semantics in computational lexicons.)

A more damning criticism of the atomic approach is that context-dependent polysemy is impossible, because each meaning (whether treated as a concept or a denotation) is in principle independent of every other meaning. A consequence of this, as Pustejovsky points out, is that every polysemous interpretation of a word has to be listed separately, and the interpretation of a word in context is a matter of selecting the right concept/denotation a priori. It cannot be computed from aspects of the meaning of a word together with those of the other words with which it appears. For example, the meanings of gradable adjectives such as good in (15) will need different concepts associated with each collocation that are in principle independent of each other. Given that new collocations between words are made all the time, and under the assumption that the number of slightly new senses that result is potentially infinite, this is a problem for storage, given the finite resources of the human brain. A further consequence is that, without some means of computing new senses, the independent concepts cannot be learned and so must be innate. While Fodor (1998) has suggested that this consequence may in fact be true, this is an extremely controversial and unpopular hypothesis that is not likely to help our understanding of the nature of word meaning.

4. Conclusion

In the above discussion, I have not been able to do more than scratch the surface of the debates over the sense relations and their place in theories of word meaning. I have not discussed the important contributions of decompositionalists such as Jackendoff, or the problem of analyticity (Quine 1960), or the current debate between contextualists and semantic minimalists (Cappelen & Lepore 2005, Wedgwood 2007). Neither have I gone into any detail about the variations and extensions of sense relations themselves, such as are often found in Cognitive Linguistics (e.g. Croft & Cruse 2004). And much more besides. Are there any conclusions that we can currently draw? Clearly, sense relations are good descriptive devices, helping with the compilation of dictionaries and thesauri, as well as the development of large-scale databases of words for use in various applications beyond the confines of linguistics, psychology and philosophy. It would, however, appear that the relation between sense relations and word meaning itself remains problematic.
Given the overriding context dependence of the latter, it is possible that pragmatics will provide explanations of the observed phenomena better than explicitly semantic approaches (see, for example, Blutner 2002, Murphy 2003, Wilson & Carston 2006). Furthermore, the evidence from psycholinguistic and developmental studies, as well as the collocational sensitivity of sense, indicates that syntagmatic relations may be cognitively primary and that paradigmatic relations may be learned, either explicitly or through experience as
Quine, Willard van Orman 1960. Word and Object. Cambridge, MA: The MIT Press.
Sperber, Dan & Deirdre Wilson 1986/1995. Relevance. Communication and Cognition. Oxford: Blackwell. 2nd edn. (with postface) 1995.
Trier, Jost 1934. Das sprachliche Feld. Eine Auseinandersetzung. Neue Jahrbücher für Wissenschaft und Jugendbildung 10, 428-449. Reprinted in: L. Antal (ed.). Aspekte der Semantik. Zu ihrer Theorie und Geschichte 1662-1970. Frankfurt/M.: Athenäum, 1972, 77-104.
Ullman, Stephen 1957. The Principles of Semantics. 2nd edn. Glasgow: Jackson. 1st edn. 1951.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function. Cambridge: Cambridge University Press.
Wedgwood, Daniel 2007. Shared assumptions. Semantic minimalism and relevance theory. Journal of Linguistics 43, 647-681.
Weinreich, Uriel 1963. On the semantic structure of language. In: J. H. Greenberg (ed.). Universals of Language. Cambridge, MA: The MIT Press, 142-216.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wilson, Deirdre & Robyn Carston 2006. Metaphor, relevance and the 'emergent property' issue. Mind & Language 21, 404-433.

Ronnie Cann, Edinburgh (United Kingdom)

22. Dual oppositions in lexical meaning

1. Preliminaries
2. Duality
3. Examples of duality groups
4. Semantic aspects of duality groups
5. Phase quantification
6. Conclusion
7. References

Abstract

Starting from well-known examples, a notion of duality is presented that overcomes the shortcomings of the traditional definition in terms of internal and external negation. Rather, duality is defined as a logical relation in terms of equivalence and contradiction. Based on this definition, the notion of duality groups, or squares, is introduced, along with examples from quantification, modality, aspectual modification and scalar predication (adjectives). The groups exhibit remarkable asymmetries as to the lexicalization of their four potential members. The lexical gaps become coherent if the members of duality groups are consistently assigned to four types, corresponding e.g. to some, all, no, and not all. Among these types, the first two are usually lexicalized, the third only rarely and the fourth almost never. Using the example of the German schon ("already") group, scalar adjectives and standard quantifiers, the notion of phase quantification is introduced as a general pattern of second-order predication which subsumes quantifiers as well as aspectual particles and scalar adjectives. Four interrelated types of phase quantifiers form a duality group. According to elementary monotonicity criteria the four types rank on a scale of markedness that accounts for the lexical distribution within the duality groups.
1. Preliminaries

Duality of lexical expressions is a fundamental logical relation. However, unlike others such as antonymy, it receives much less attention. Duality relates all and some, must and can, possible and necessary, already and still, become and stay. Implicitly, it is even involved in ordinary antonymy such as that between big and small. Traditionally, duality is defined in terms of inner and outer negation: two operators are dual iff the outer negation of the one is equivalent to the inner negation of the other; alternatively, two operators are dual iff one is equivalent to the simultaneous inner and outer negation of the other. For example, some is equivalent to not all not. Duality, in fact, is a misnomer. Given the possibility of inner and outer negation, there are always four cases involved with duality: a given operator, its outer negation, its inner negation and its dual, i.e. inner plus outer negation. Gottschalk (1953) therefore proposed to replace the term duality by quaternality.

In this article, a couple of representative examples are introduced before we proceed to a formal definition of the relation of duality. The general definition of duality is not as trivial as it might appear at first sight. Inner and outer negations are not always available for dual operators at the syntactic level, whence it is necessary to base the definition on a semantic notion of negation. Following the formal definition of duality, a closer look is taken at a variety of complete duality groups of four, their general structure and their relationship to Aristotle's so-called Square of Oppositions. Duality groups of four exhibit striking asymmetries: out of the four possible cases, two are almost always lexicalized, while the third is only occasionally lexicalized and the fourth almost never. (If the latter two are not lexicalized, they are expressed by using explicit negation with one of the other two cases.) Criteria will be offered for assigning the four members of a group to four types defined in terms of monotonicity and "tolerance".

A general conceptual format is described that allows the analysis of dual operators as instances of the general pattern of "phase quantification". This is a pattern of second-order predication; a phase quantifier predicates about a given first-order predication that there is, or is not, a transition on some scale between the predication being false and being true, i.e. a switch in truth-value. Four possibilities arise out of this setting: (i) there is a transition from false to true; (ii) there is no transition from false to true; (iii) there is a transition from true to false; (iv) there is no transition from true to false. These four possibilities of phase quantification form a duality group of four. It can be argued that all known duality groups are semantically instances of this general scheme.

1.1. First examples

1.1.1. Examples from logic

Probably the best-known cases of duality are the quantifiers of standard predicate logic, ∃ and ∀. A quantifier is attached to a variable and combined with a sentence (formula, proposition) to yield a quantified sentence.

(1) a. ∀x P   'for every x, P'
    b. ∃x P   'for at least one x, P'
Duality of the two quantifiers is stated in the logical equivalences in (2):

(2) a. ∃x P ≡ ¬∀x ¬P
    b. ∀x P ≡ ¬∃x ¬P
    c. ¬∃x P ≡ ∀x ¬P
    d. ¬∀x P ≡ ∃x ¬P

Duality can be paraphrased in terms of EXTERNAL NEGATION and INTERNAL NEGATION (cf. article 63 (Herburger) Negation). External negation is the negation of the whole statement, as in the left formula in (2c,d) and in the right formula in (2a,b). Internal negation concerns the part of the formula following the quantifier, i.e. the "scope" of the quantifier (cf. article 62 (Szabolcsi) Scope and binding). For example, according to (2c) the external negation of existential quantification is logically equivalent to the internal negation of universal quantification, and vice versa in (2d). If both sides in (2c) and (2d) are negated and double negation is eliminated, one obtains (2a) and (2b), respectively. In fact the four equivalences in (2) are mutually equivalent: they all state that universal and existential quantification are duals.

Dual operators are not necessarily operators on sentences. It is only required that at least one of their operands can undergo negation ("internal negation"), and that the result of combining the operator with its operand(s) can be negated, too ("external negation"). Another case of duality is constituted by conjunction ∧ and disjunction ∨; the duality of the two connectives is expressed by De Morgan's Laws, for example:

(3) ¬(A ∧ B) ≡ (¬A ∨ ¬B)

These dual operators are two-place connectives, operating on two sentences, and internal negation is applied to both operands.

1.1.2. First examples from natural language

The duality relationship between ∃ and ∀ is analogously found with their natural language equivalents some and every. Note that sentences with some NPs as subject are properly negated by replacing some with no (cf. Löbner 2000: §1 for the proper negation of English sentences):

(4) a. some tomatoes are green = not every tomato is not green
    b. every tomato is green = no tomato is not green
    c. no tomato is green = every tomato is not green
    d. not every tomato is green = some tomatoes are not green

The operand of the quantificational subject NP is its 'nuclear' scope, the VP. Modal verbs are another field where duality relations are of central importance. Modal verbs combine with infinitives. A duality equivalence for epistemic must and can is stated in (5):

(5) he must have lied = he cannot have told the truth
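The equivalences in (4) can be checked mechanically over a toy model; the following sketch (in Python) does so, with an invented tomato inventory standing in for the domain of quantification:

    # A minimal sketch verifying the pattern of (4); all data are invented.
    tomatoes = {'t1': 'green', 't2': 'red', 't3': 'green'}
    green = lambda x: tomatoes[x] == 'green'
    dom = list(tomatoes)

    # (4a): some tomatoes are green = not every tomato is not green
    assert any(green(x) for x in dom) == (not all(not green(x) for x in dom))
    # (4b): every tomato is green = no tomato is not green
    assert all(green(x) for x in dom) == (not any(not green(x) for x in dom))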
Aspectual particles such as already and still are among the most thoroughly studied cases of dual operators. Their duality can be demonstrated by pairs of questions and negative answers as in (6). Let us assume that on and off are logically complementary, i.e. equivalent to the negations of each other:

(6) a. Is the light already on? - No, the light is still off.
    b. Is the light still on? - No, the light is already off.

1.2. Towards a general notion of duality

The relationship of duality is based on logical equivalence. Duality therefore constitutes a logical relation. In Model-theoretic semantics (cf. article 33 (Zimmermann) Model-theoretic semantics), meaning is equated with truth conditions; therefore logical relations are considered sense relations (cf. article 21 (Cann) Sense relations). However, in richer accounts of meaning that assume a conceptual basis for meanings, logical equivalence does not necessarily amount to equality of meanings (cf. Löbner 2002, 2003: §§4.6, 10.5). It can therefore not be inferred from equivalences such as those in (4) to (6) that the meanings of the pairs of dual expressions match in a particular way. All one can say is that their meanings are such that they result in these equivalences.

In addition, expressions which exhibit duality relations are rather abstract in meaning and, as a rule, can all be used in various different constructions and meanings. In general, the duality relationship only obtains when the two expressions are used in particular constructions and/or in particular meanings. For example, the dual of German schon "already" is noch "still" in cases like the one in (6), but in other uses the dual of schon is erst (temporal "only"); noch, on the other hand, has uses where it does not possess a dual at all (cf. Löbner 1989 for a detailed discussion).

The duality relation crucially involves external negation of the whole complex of operator with operands, and internal operand negation. The duality relationship may concern one operand (out of possibly more), as in the case of the quantifiers, modal verbs or aspectual particles, or more than one (witness conjunction and disjunction). In order to permit internal negation, the operands have to be of sentence type or else of some type of predicate expression. For the need of external negation, the result of combining dual operators with their operands must itself be eligible for negation. A first definition of duality, in accordance with semantic tradition, would be:

(7) Let q and q' be two operators that fulfil the following conditions:
    a. they can be applied to the same domain of operands
    b. the operands can be negated (internal negation)
    c. the results of applying the operators to appropriate operands can be negated (external negation)
    Then q and q' are DUALS iff external negation of one is equivalent to internal negation of the other.

This definition, however, is in need of modification. First, "negation" must not be taken in a syntactic sense, as it usually is. If it were, English already and still would not be candidates for duality, as they allow neither external nor internal syntactic negation.
This is shown in Löbner (1999: 89f) for internal negation; as to external negation, already and still can only be negated by replacing them with not yet and no more/not anymore, respectively. The term 'negation' in (7) therefore has to be replaced by a proper logical notion. A second inadequacy is hidden in the apparently harmless condition (a): dual operators need not be defined for the same domain of operands. For example, already and still have different domains: already presupposes that the state expressed did not obtain before, while still presupposes that it may not obtain later. Therefore, (8a) and (8b) are semantically odd if we assume that one cannot be not young before being young, or not old after being old:

(8) a. She's already young.
    b. She's still old.

These inadequacies of the traditional definition will be taken care of below.

1.3. Predicates, equivalence, and negation

1.3.1. Predicates and predicate expressions

For proper semantic considerations, it is very important to distinguish carefully between the levels of expression and of meaning. Unfortunately there is a terminological tradition that conflates these two levels when talking of "predicates", "arguments", "operators", "operands", "quantifiers" etc.: these terms are very often used both for certain types of expressions and for their meanings. In order to avoid this type of confusion, the following terminological distinctions will be observed in this article: A "predicate" is a meaning; what a meaning is depends on semantic theory (cf. article 1 (Maienborn, von Heusinger & Portner) Meaning in linguistics). In a model-theoretic approach, a predicate would be a function that assigns truth values to one or more arguments; in a cognitive approach, a predicate can be considered a concept that assigns truth values to arguments. For example, the meaning of has lied would be a predicate (function or concept) which in a given context (or possible world) assigns the truth value true to everyone who has lied and false to those who told the truth. Expressions, lexical or complex, with predicate meanings will be called PREDICATE EXPRESSIONS. ARGUMENTS which a predicate is applied to are neither expressions nor meanings; they are objects in the world (or universe of discourse); such objects may or may not be denoted by linguistic expressions; if they are, let us call these expressions ARGUMENT TERMS. (For a more comprehensive discussion of these distinctions see Löbner 2002, 2003: §6.2.) Sometimes, arguments of predicate expressions are not explicitly specified by means of an argument term. For example, sentences are usually considered as predicating about a time argument, the time of reference (cf. article 57 (Ogihara) Tense), but very often the time of reference is not specified by an explicit expression such as yesterday. The terms OPERATOR, OPERAND and QUANTIFIER will all be used for certain types of expressions.

If the traditional definition of duality given in (7) is inadequate, it is basically because it attempts to define duality at the level of expressions. Rather, duality has to be defined at the level of meanings, because it is a logical relation, and logical relations between expressions originate from their meanings.
1.3.2. The operands of dual operators

The first prerequisite for an adequate definition of duality is a precise semantic characterization of the operands of dual operators ("d-operators", for short). Since the operands must be eligible for negation, their meanings have to be predicates, i.e. something that assigns a truth-value to arguments. Negation ultimately operates on truth-values; its effect on a predicate is the conversion of the truth value assigned. Predicate expressions range from lexical expressions such as verbs, nouns and adjectives to complex expressions like VPs, NPs, APs or whole sentences. In (4), the dual operators are the subject NPs some tomatoes and every tomato; their operands are the VPs is/are green and their respective negations is/are not green; they express predications about the tomatoes referred to. In (5), the operands of the dual modal verbs are the infinitives have lied and have told the truth; they express predications about the referent of the subject NP he. In (6), the operands of already and still are the remainders of the sentence: the light is on/off; in this case, these sentences are taken as predicates about the reference time implicitly referred to.

Predicates are never universally applicable, but only in a specific DOMAIN of cases. For a predicate P, its domain D(P) is the set of those tuples of arguments the predicate assigns a truth value to. The notion of domain carries over to predicate expressions: their domain is the domain of the predicate that constitutes their meaning. In the following, PRESUPPOSITIONS (cf. article 91 (Beaver & Geurts) Presupposition) of a sentence or other predicate expression are understood as conditions that simply restrict the domain of the predicate that is its meaning. For example, the sentence Is the light already on in (6) presupposes (among other conditions) (p1) that there is a uniquely determined referent of the NP the light (cf. article 41 (Heim) Definiteness and indefiniteness) and (p2) that this light was not on before. The predication expressed by the sentence about the time of reference is thus restricted to those times when (p1) and (p2) are fulfilled, i.e. those times where there is a unique light which was not on before. In general, a predicate expression p will yield a truth-value for a given tuple of arguments if and only if the presuppositions of p are fulfilled. This classical Fregean view of presuppositions is adequate here, as we are dealing with the logical level exclusively. It follows from this notion of presupposition that predicate expressions which are defined for the same domain of arguments necessarily carry identical presuppositions. In particular, that is the case if two predicate expressions are logically equivalent:

Definition 1: logical equivalence
Let p and p' be predicate expressions with identical domains. p and p' are LOGICALLY EQUIVALENT - p ≡ p' - iff for every argument tuple in their common domain, p and p' yield identical truth values.
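Definition 1 can be made concrete with a small sketch (in Python), under the assumption that a partial predicate is represented as a mapping from arguments to truth values, so that its domain is its key set; the example values are invented:

    # A minimal sketch of Definition 1 for partial predicates.
    def logically_equivalent(p, q):
        return p.keys() == q.keys() and all(p[a] == q[a] for a in p)

    p1 = {'t1': True, 't2': False}
    p2 = {'t1': True, 't2': False}
    p3 = {'t1': True}                 # smaller domain: different presuppositions

    assert logically_equivalent(p1, p2)
    assert not logically_equivalent(p1, p3)   # no common domain, no equivalence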
1.3.3. The negation relation

The crucial relation of logical contradiction can be defined analogously. It will be called 'neg', the 'neg(ation) relation'. Expressions in this relationship, too, have identical presuppositions.

Definition 2: negation relation
Let p and p' be predicate expressions with identical domains. p and p' are NEG-OPPOSITES, or NEGATIVES, of each other - p neg p' - iff for every argument tuple out of their common domain, p and p' yield opposite truth values.

Note that this is a semantic definition of negation, as it is defined in terms of predicates, i.e. at the level of meaning. The tests for duality require the construction of pairs of neg-opposites, or NEG-PAIRS for short. This means coming up with means of negation at the level of expressions, i.e. with lexical or grammatical means of converting the meanings of predicate expressions.

For sentences, an obvious way of constructing a negative is the application (or de-application) of grammatical negation ('g-negation', in the following). In English, g-negation takes different forms depending on the structure of the sentence (cf. Löbner 2000: §1 for more detail). The normal form is g-negation by VP negation, with do-auxiliarization in the case of non-auxiliary verbs. If the VP is within the scope of a higher-order operator such as a focus particle or a quantifying expression, either the higher-order operator is subject to g-negation or it is substituted by its neg-opposite. Such higher-order operators include many instances of duality. English all, every, always, everywhere, can and others can be directly negated, while some, sometimes, somewhere, must, already, still etc. are replaced for g-negation by no, never, nowhere, need not, not yet and no more, respectively. For the construction of neg-pairs of predicate expressions other than sentences, g-negation can sometimes be used, e.g. for VPs. In other cases, lexical inversion may be available, i.e. the replacement of a predicate expression by a lexical neg-opposite such as on/off, to leave/to stay, member/non-member. Lexical inversion is not a systematic means of constructing neg-pairs because it is contingent on what the lexicon provides. But it is a valuable instrument for duality tests.

1.4. Second-order predicates and subnegation

1.4.1. D-operators

D-operators must be eligible for negation and are therefore predicate expressions themselves; as we saw, at least one of their arguments, their operand(s), must again be a predicate expression. For example, the auxiliary must in (5) is a predicate expression that takes another predicate expression, have lied, as its operand. In this sense, d-operators are second-order predicate expressions, i.e. predicate expressions that predicate about predicates. D-operators may have additional predicate or non-predicate arguments. For an operator q and its operand p, let 'q(p)' denote the morpho-syntactic combination of q and p, whatever its form. In terms of the types of Formal Semantics, the simplest types of d-operators would be (t,t) (sentential operators) and ((a,t),t) (quantifiers); a frequent type, represented by focus particles, is (((a,t),a),t) (cf. article 33 (Zimmermann) Model-theoretic semantics for logical types).

1.4.2. Negation and subnegation

When the definition of the neg-relation is applied to d-operators, it captures external negation. The case of internal negation is taken care of by the 'subneg(ation) relation'.
Two operators are subneg-opposites if, loosely speaking, they yield the same truth values for neg-opposite operands.

Definition 3: subnegation opposites
Let q and q' be operators with a predicate type argument. Let the predicate domains of q and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth value for the neg-opposites of p. q and q' are SUBNEG(ATION) OPPOSITES, or SUBNEGATIVES, of each other - q subneg q' - iff for any predicate expressions p and p' eligible as operands of q and q', respectively: if p neg p' then q(p) ≡ q'(p').

(For operators with more than one predicate argument, such as conjunction and disjunction, the definition would have to be modified in an obvious way.) If two d-operators are subnegatives, their domains need not be identical. The definition only requires that if q is defined for p, any subnegative q' is defined for the negatives of p. If the predicate domain of q contains negatives for every predicate it contains, then q and q' have the same domain. Such are the domains of the logical quantifiers, but that does not hold for pairs of operators with different presuppositions, e.g. already and still (cf. §5.1). An example of subnegatives is the pair always/never:

(9) Max always is late = Max never is on time

To be late and to be on time are neg-opposites, here in the scope of always and never, respectively. The two quantificational adverbials have the same domain, whence the domain condition in Definition 3 is fulfilled.

2. Duality

2.1. General definition

Definition 3 above paves the way for the proper definition of duality:

Definition 4: dual opposites
Let q and q' be operators with a predicate type argument. Let the predicate domains of q and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth value for the neg-opposites of p. q and q' are DUAL (OPPOSITE)S of each other - q dual q' - iff for any predicate expressions p and p' eligible as operands of q and q', respectively: if p neg p' then q(p) neg q'(p').

The problem with condition (a) in the traditional definition in (7) is taken care of by the domain condition here; and any mention of grammatical negation is replaced by relating
Due to the equivalences in (10) and (11), with any d-operator q the operations N, S and D yield a group of exactly four cases: q, Nq, Sq, Dq - provided Nq, Sq and Dq each differ from q (see §2.3 for reduced groups of two members). Each of the four cases may be expressible in different, logically equivalent ways. Thus what is called a "case" here is basically a set of logically equivalent expressions. According to (10) and (11), any further application of the operations N, S and D just yields one of these four cases: the group is closed under these operations. Within such a group, no element is logically privileged: instead of defining the group in terms of q and its correlates Nq, Sq and Dq, we might, for example, just as well start from Sq and take its correlates NSq = Dq, SSq = q, and DSq = Nq.

Definition 5: duality group
A DUALITY GROUP is a group of up to four operators in the mutual relations neg, subneg and dual that contains at least a pair of dual operators.

Duality groups can be graphically represented as a square of the structure depicted in Fig. 22.1.

[Fig. 22.1: Duality square - q and Dq connected as duals along the top edge, Nq and Sq as duals along the bottom edge; the vertical pairs are neg-opposites, the diagonal pairs subneg-opposites]

Although the underlying groups of four operators related by neg, subneg and dual are perfectly symmetrical, duality groups in natural, and even in formal, languages are almost always deficient, in that not all operators are lexicalized. This issue will be taken up in §§3, 4 and 5 below.

2.3. Reduced duality groups and self-duality

For the sake of completeness, we will briefly mention cases of reduced (not deficient) duality groups. The duality square may collapse into a constellation of two, or even one, case(s) if the operations N, D or S are of no effect on the truth conditions, i.e. if Nq, Dq or Sq are equivalent to q itself. Neutralization of N contradicts the Law of Contradiction: if q ≡ Nq and q were true for any predicate operand p, it would at the same time be false. Therefore, the domain of q must be empty. This might occur if q is an expression with contradictory presuppositions, whence the operator is never put to work. For such an "idle" operator, negatives, subnegatives and duals are necessarily idle, too. Hence the whole duality group collapses into one case.
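The closure of a duality group under N, S and D can be verified mechanically; the following sketch (in Python) encodes the three operations for quantifiers over a finite domain, an encoding assumed here purely for illustration (it ignores presuppositions and partiality):

    # A minimal sketch of the operations N, S and D for total operands.
    from itertools import product

    def N(q):                       # external negation
        return lambda dom, p: not q(dom, p)

    def S(q):                       # internal (operand) negation
        return lambda dom, p: q(dom, lambda x: not p(x))

    def D(q):                       # dual: internal plus external negation
        return N(S(q))

    every = lambda dom, p: all(p(x) for x in dom)
    some  = lambda dom, p: any(p(x) for x in dom)

    dom = [0, 1, 2]
    for values in product([False, True], repeat=len(dom)):
        p = lambda x, v=values: v[x]
        assert D(every)(dom, p) == some(dom, p)      # the dual of 'every' is 'some'
        assert S(some)(dom, p) == N(every)(dom, p)   # subneg of 'some' = neg of 'every'
        assert D(D(every))(dom, p) == every(dom, p)  # the group is closed: DD = identity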
(17) I don't want you to leave = I want you not to leave

The question as to which verbs are candidates for the neg-raising phenomenon is, as far as I know, not finally settled (see Horn 1978 for a comprehensive discussion, also Horn 1989: §5.2). The fact that NR verbs are self-dual allows, however, a certain general characterization. The condition of self-duality has the consequence that if v is true for p, it is false for Np, and if v is false for p, it is true for Np. Thus NR verbs express propositional attitudes that, in their domain, "make up their mind" for any proposition as to whether the attitude holds for the proposition or its negation. For example, NR want applies only to such propositions as the subject has either a positive or a negative preference for. The claim of self-duality is tantamount to the presupposition that this holds for any possible operand. Thereby the domains of NR verbs are restricted to such pairs of p and Np to which the attitude either positively or negatively applies. Among the modal verbs, those meaning "want to", like German wollen, do not partake in the duality groups listed below. I propose that these modal verbs are NR verbs, whence their duality groups collapse to a group of two [q/Dq, Nq/Sq], e.g. [wollen, wollen neg].

THE GENERIC OPERATOR. In mainstream accounts of characterizing (i-generic) sentences (cf. Krifka et al. 1995, also article 47 (Carlson) Genericity) it is commonly assumed that their meanings involve a covert genericity operator GEN. For example, men are dumb would be analyzed as (18a):

(18) a. GEN[x](x is a man; x is dumb)

According to this analysis, the meaning of the negative sentence men are not dumb would receive the GEN analysis (18b), with the negation within the scope of GEN:

(18) b. GEN[x](x is a man; ¬ x is dumb)

The sentence men are not dumb is the regular grammatical negation of men are dumb, whence it should also be analysed as (18c), i.e. as external negation with respect to GEN:

(18) c. ¬GEN[x](x is a man; x is dumb)

It follows immediately that GEN is self-dual. This fact has not been recognized in the literature on generics. In fact, it invalidates all available accounts of the semantics of GEN, which all agree in analyzing GEN as some variant of universal quantification. Universal quantification, however, is not self-dual, as internal and external negation clearly yield different results (see Löbner 2000: §4.2 for elaborate discussion).

HOMOGENEOUS QUANTIFICATION. Ordinary restricted quantification may happen to be self-dual if the predicate quantified yields the same truth-value for all elements of the domain of quantification, i.e. if it is either true of all cases or false of all cases. (As a special case, this condition obtains if the domain of quantification contains just one element u; both ∃ and ∀ are then equivalent to the self-dual Iu.) The "homogeneous quantifier" ∃∀ (cf. Löbner 1989: 179, and Löbner 1990: 27ff for more discussion) is a two-place operator which takes one formula for the restriction and a second formula for the predication quantified. It will be used in §5.2.
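The singleton-domain case just mentioned can be checked directly; the following sketch (in Python) assumes the same illustrative encoding of quantifiers as functions over a finite domain:

    # A minimal sketch: on a one-element domain, ∃ and ∀ coincide and are self-dual.
    every = lambda dom, p: all(p(x) for x in dom)
    some  = lambda dom, p: any(p(x) for x in dom)
    dual  = lambda q: (lambda dom, p: not q(dom, lambda x: not p(x)))

    u = ['u']                                    # a one-element domain
    for truth in (False, True):
        p = lambda x, t=truth: t
        assert every(u, p) == some(u, p)         # ∃ and ∀ coincide here
        assert dual(every)(u, p) == every(u, p)  # and are therefore self-dual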
Tab. 22.2: Duality Groups of Deontic Expressions

type of expression                   | q                             | dual                            | negative               | subnegative
modal verbs (deontic modality)       | may/can                       | must                            | (must neg), (may neg)  | (need neg)
modal verbs (2)                      | may                           | shall                           | (shall neg)            | (need neg)
adjectives                           | possible                      | necessary                       | impossible             | unnecessary
German causative deontic verbs       | ermöglichen "render possible" | erzwingen "force"               | verhindern "prevent"   | erübrigen "render unnecessary"
imperative                           | imperative of permission      | imperative of request           | (neg imperative)       | -
causative                            | causative of permission       | causative of causation          | (neg causative)        | -
verbs of deontic and causal modality | accept, allow, let/admit      | demand, request, let/make/force | refuse, forbid, prevent | (demand neg), (request neg), (force neg)

The imperative form has two uses, the prototypical one of request, or command, and a permissive use, corresponding to the first cell of the respective duality group. The negation of the imperative, however, only has the neg-reading of prohibition. The case of permitting not to do something cannot be expressed by a simple imperative and negation. Similarly, grammatical causative forms such as in Japanese tend to have a weak (permissive) reading and a strong reading (of causation), while their negation inevitably expresses prevention, i.e. causing not to. The same holds for English let and German lassen.

3.2.3. Epistemic modality

In the groups of modal verbs in epistemic use, g-negation of may yields a subnegative, unlike the neg-reading of the same form in the deontic group. Can, however, yields a neg-reading with negation. Thus, the duality groups exhibit remarkable inconsistencies such as the near-equivalence of may and can along with a clear difference between their respective g-negations, or the equivalence of the g-negations of non-equivalent may and must.

Tab. 22.3: Duality Groups of Epistemic Expressions

type of expression                | q                  | dual         | negative                      | subnegative
modal verbs (1)                   | can                | must         | can neg                       | need neg
modal verbs (2)                   | may                | must         | can neg                       | may neg
epistemic adjectives              | possible           | certain      | impossible                    | questionable
adverbs                           | possibly           | certainly    | in no case                    | -
verbs of attitude                 | hold possible      | believe      | exclude                       | doubt
adjectives for logical properties | satisfiable        | tautological | contradictory, unsatisfiable  | (neg tautological)
verbs for logical relations       | be compatible with | entail       | exclude                       | (neg entail)
Logical necessity and possibility can be considered a variant of epistemic modality. Here the correspondence to quantification is obvious, as these properties and relations are defined in terms of existential and universal quantification over models (or worlds, or contexts).

3.2.4. Aspectual operators

Aspectual operators deal with transitions in time between opposite phases, or equivalently, with beginning, ending and continuation. Duality groups are defined by verbs such as begin and become and by focus particles such as already and still. The German particle schon will be analyzed in §5.1 in more detail, as a paradigm case of 'phase quantification'. The particles of the nur noch group have no immediate equivalents in English. See Löbner (1999: §5.3) for semantic explanations.

3.2.5. More focus particles, conjunctions

Further focus particles such as only, even and also are candidates for duality relations. A duality account of German nur ("only", "just") is proposed in Löbner (1990: §9). In some of the uses of only analyzed there, it functions as the dual of auch ("also"). An analysis of auch, however, is not offered there. König (1991a, 1991b) proposed to consider causal because and concessive although duals, due to the intuition that 'although p, not q' means something like 'not (because p, q)'. However, Iten (2005) argues convincingly against that view.

3.2.6. Scalar adjectives

Löbner (1990: §8) offers a detailed account of scalar adjectives which analyzes them as dual operators on an implicit predication of markedness. The analysis will be briefly sketched in §5.1. According to this view, pairs of antonyms, together with their negations, form duality groups such as [long, short, (neg long), (neg short)]. Independently of this analysis, certain functions of scalar adjectives exhibit logical relations that are similar to the duality relations. Consider the logical relationships between the positive with enough and too with the positive, as well as between equative and comparative:

(24) a. x is not too short = x is long enough
        x is too long = x is not short enough
     b. x is not as long as y = x is shorter than y
        x is as long as y = x is not shorter than y

The negation of too with the positive is equivalent to the positive of the antonym with enough, and the negation of the equative is equivalent to the comparative of the antonym. Antonymy essentially means reversal of the underlying common scale. Thus the equivalences in (24) represent instances of a "duality" relation that is based on negation and scale reversal, another self-inverse operation, instead of being based on negation and subnegation. In view of such data, the notion of duality might be generalized to analogous logical relations based on two self-inverse operations.
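Under a simple degree semantics, the equivalences in (24) hold by construction; the following sketch (in Python) shows this, where the degree values and the contextual standard interval [lo, hi] are invented assumptions, not part of Löbner's analysis:

    # A minimal sketch of (24): negation plus scale reversal yield equivalences.
    deg = {'x': 3, 'y': 5}
    lo, hi = 2, 4                    # contextually acceptable lengths

    too_short    = lambda a: deg[a] < lo
    long_enough  = lambda a: deg[a] >= lo
    too_long     = lambda a: deg[a] > hi
    short_enough = lambda a: deg[a] <= hi
    as_long_as   = lambda a, b: deg[a] >= deg[b]
    shorter_than = lambda a, b: deg[a] < deg[b]

    # (24a)
    assert (not too_short('x')) == long_enough('x')
    assert too_long('x') == (not short_enough('x'))
    # (24b)
    assert (not as_long_as('x', 'y')) == shorter_than('x', 'y')
    assert as_long_as('x', 'y') == (not shorter_than('x', 'y'))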
3.3. Asymmetries within duality groups

The duality groups considered here exhibit remarkable asymmetries. Let us refer to the first element of a duality group as type 1, its dual as type 2, its negative as type 3 and its subnegative as type 4. The first thing to observe is the fact that there are always lexical or grammatical means of directly expressing type 1 and type 2, sometimes type 3 and almost never type 4. This tendency has long been observed for the quantifier groups (see Horn 1972 for an early account, Dohmann 1974a,b for cross-linguistic data, and Horn to appear for a comprehensive recent survey). In addition to the lexical gaps, there is a considerable bias of negation towards type 3, even if the negated operand is type 2. Types 3 and 4 are not only less frequently lexicalized; if they are, the respective expressions are often derived, at least historically, from types 1 and 2 by negative affixes, cf. n-either, n-ever, n-or, im-possible. The converse never occurs. Thus, type 4 appears to be heavily marked: on a scale of markedness we obtain 1, 2 < 3 < 4.

A closer comparison of type 1 and type 2 shows that type 2 is marked vs. type 1, too. As for nominal existential quantification, languages are somewhat at pains when it comes to an expression of neutral existential quantification. This might at first glance appear to indicate markedness of existential vs. universal quantification. However, nominal existential quantification can be considered practically "built into" mere predication under the most frequent mode of particular (non-generic) predication: particular predication entails reference, which in turn entails existence. Thus, existential quantification is so unmarked that it is even the default case, not in need of overt expression. A second point of preference compared to universal quantification is the degree of elaboration of existential quantification by more specific operators such as numerals and other quantity specifications. For the modal types, this argumentation does not apply. What distinguishes type 1 from type 2 in these cases is the fact that irregular negation, i.e. subneg readings of g-negation, only occurs with type 2 operators; in this respect, epistemic may constitutes an exception.

Tab. 22.4: Duality Groups of Aspectual Expressions

type of expression                                               | q                 | dual                | negative                    | subnegative
verbs of beginning etc.                                          | begin, become     | continue, stay      | end, (become neg)           | (neg begin), (neg stay)
German aspectual particles with stative operands (1)             | schon "already"   | noch "still"        | (noch neg) "neg yet"        | (neg mehr) "neg more"
German aspectual particles with focus on a specification of time | schon "already"   | erst "only"         | (noch neg) "neg yet"        | (neg erst) "neg still"
German aspectual particles with stative operands (2)             | endlich "finally" | noch immer "STILL"  | noch immer neg "STILL NEG"  | (endlich neg mehr) "finally neg more"
German aspectual particles with scalar stative operands (3)      | nur noch          | noch                | (neg nur noch)              | (neg mehr)
The aspectual groups will be discussed later. So much may, however, be stated: in all languages there are very many verbs that incorporate type 1 'become' or 'begin', as opposed to very few verbs that would incorporate type 2 'continue' or 'stay' or type 3 'stop', and apparently no verbs incorporating 'not begin'.

These observations, which result in a scale of markedness of type 1 < type 2 < type 3 < type 4, are tendencies. In individual groups, type 2 may be unmarked vs. type 1. Conjunction is unmarked vs. disjunction, and so is the command use of the imperative as opposed to the permission use. Of course, the tendencies are contingent on which element is chosen as the first of the group. They emerge only if the members of the duality groups are assigned the types they are. Given the perfect internal symmetry of duality groups, the type assignment might seem arbitrary - unless it can be motivated independently. What is needed, therefore, are independent criteria for the assignment of the four types. These will be introduced in §4.2 and §5.3.

4. Semantic aspects of duality groups

4.1. Duality and the Square of Opposition

The quantificational and modal duality groups (Tab. 22.1, 22.2, 22.3) can be arranged in a second type of logical constellation, the ancient Square of Opposition (SqO), established by the logical relations of entailment, contradiction (i.e. neg-opposition), contrariety and subcontrariety. The four relations are essentially entailment relations (cf. Löbner 2002, 2003: §4 for an introduction). They are defined for arbitrary, not necessarily second-order, predicates. The relation of contradictoriness is just neg.

Definition 7: entailment relations for predicate expressions
Let p and p' be arbitrary predicate expressions with the same domain D.
(i) p ENTAILS p' iff for every a in D, if p is true for a, then p' is true for a.
(ii) p and p' are CONTRARIES iff p entails Np'.
(iii) p and p' are SUBCONTRARIES iff Np entails p'.

The traditional SqO has four vertices A, I, E, O corresponding to ∀, ∃, ¬∃ and ¬∀, or type 1, 2, 3, 4, respectively, although in a different arrangement than in Fig. 22.1:

Fig. 22.2: Square of Opposition (A and E at the top as contraries, I and O at the bottom as subcontraries; entailment runs from A down to I and from E down to O, and the diagonals are contradictories)
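As a minimal illustration, not from the original text, the relations of Definition 7 can be verified in a finite model. The Python sketch below treats the four quantifier types as second-order predicates over the subsets of a small invented universe; the nonempty restrictor encodes the existential import that the traditional square presupposes.

```python
from itertools import combinations

# Invented finite model: a nonempty restrictor (the "students") inside a
# slightly larger universe of individuals.
STUDENTS = {"ann", "bea", "cem"}
UNIVERSE = {"ann", "bea", "cem", "dan"}

def powerset(s):
    s = list(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

EXTENSIONS = powerset(UNIVERSE)       # all candidate predicate extensions

A = lambda P: STUDENTS <= P           # every student      (type 2)
I = lambda P: bool(STUDENTS & P)      # some student       (type 1)
E = lambda P: not (STUDENTS & P)      # no student         (type 3)
O = lambda P: not (STUDENTS <= P)     # not every student  (type 4)

def entails(q1, q2):       return all(q2(P) for P in EXTENSIONS if q1(P))
def contraries(q1, q2):    return entails(q1, lambda P: not q2(P))    # Def. 7(ii)
def subcontraries(q1, q2): return entails(lambda P: not q1(P), q2)    # Def. 7(iii)

assert entails(A, I) and entails(E, O)     # the unilateral vertical entailments
assert contraries(A, E)                    # A and E are contraries
assert subcontraries(I, O)                 # I and O are subcontraries
assert all(A(P) != O(P) and I(P) != E(P) for P in EXTENSIONS)  # diagonals: neg
print("SqO relations verified in the toy model.")
```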
The duality square and the SqO depict different, in fact independent, logical relations. Unlike the duality square, the SqO is asymmetric in the vertical dimension: the entailment relations are unilateral, and the relation between A and E is different from the one between I and O. The relations in the SqO are basically second-order relations (between first-order predicates), while the duality relations dual and subneg are third-order relations (between second-order predicates). The entailment relations can be established between arbitrary first-order predicate expressions; the duality relations would then simply be unavailable due to the lack of predicate-type arguments. For example, any pair of contrary first-order predicate expressions such as frog and dog together with their negations span a SqO with, say, A = dog, E = frog, I = N frog, O = N dog.

There are also SqO's of second-order operators which are not duality groups. For example, let A be more than one and E no, with their respective negations O not more than one/at most one and I some/at least one. The SqO relations obtain, but A and I, i.e. more than one and some, are not duals: the dual of more than one is not more than one not, i.e. at most one not. This is clearly not equivalent with some.

On the other hand, there are duality groups which do not constitute SqO's, for example the schon group. An SqO arrangement of this group would have schon and noch nicht, and noch and nicht mehr, as diagonally opposite vertices. However, in this group duals have different presuppositions; in fact this holds for all horizontal or vertical pairs of vertices. Therefore, since the respective relations of entailment, contrariety and subcontrariety require identical presuppositions, none of these obtains. In Löbner (1990: 210) an artificial example is constructed which shows that the duality relations do not entail the SqO relations even if all four operators share the same domain. There may be universal constraints for natural language which exclude such cases.
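The independence of the two constellations can likewise be checked in a toy model. The following Python sketch is illustrative only: the domain and restrictor are invented, and the dual is implemented in the standard way as outer negation combined with subnegation. It confirms that more than one, some, no and at most one satisfy the SqO relations while more than one and some fail the duality test.

```python
from itertools import combinations

U = {"a", "b", "c", "d"}     # invented universe
R = {"a", "b", "c"}          # invented restrictor, e.g. the students

def powerset(s):
    s = list(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

EXT = powerset(U)

more_than_one = lambda P: len(R & P) > 1      # A
no            = lambda P: len(R & P) == 0     # E
some          = lambda P: len(R & P) >= 1     # I
at_most_one   = lambda P: len(R & P) <= 1     # O

def entails(q1, q2): return all(q2(P) for P in EXT if q1(P))
def dual(q):         return lambda P: not q(U - P)   # D = negation plus subnegation

# The SqO relations obtain for A, I, E, O ...
assert entails(more_than_one, some) and entails(no, at_most_one)
assert entails(more_than_one, lambda P: not no(P))       # A, E contraries
assert entails(lambda P: not some(P), at_most_one)       # I, O subcontraries

# ... but A and I are not duals: dual(more than one) means 'at most one not'.
witness = {"a"}
print(some(witness), dual(more_than_one)(witness))       # True False: not equivalent
assert any(dual(more_than_one)(P) != some(P) for P in EXT)
```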
4.2. Criteria for the distinction among the four types

4.2.1. Monotonicity: types 1 and 2 vs. types 3 and 4

The most salient difference among the four types is that between types 1 and 2 and types 3 and 4: types 1 and 2 are positive, types 3 and 4 negative. The difference can be captured by the criterion of monotonicity. Barwise & Cooper (1981) first introduced this property for quantifiers; see also article 43 (Keenan) Quantifiers.

Definition 8: monotonicity
a. An operator q is UPWARD MONOTONE - mon↑ - if for any operands p and p': if p entails p', then q(p) entails q(p').
b. An operator q is DOWNWARD MONOTONE - mon↓ - if for any operands p and p': if p entails p', then q(p') entails q(p).

All type 1 and type 2 operators of the groups listed are mon↑, while the type 3 and type 4 operators are mon↓. Negation inverts entailment: if p entails p', then Np' entails Np. Hence, both N and S invert the direction of monotonicity. To see this, consider (25):

(25) have a coke entails have a drink, therefore
a. every is mon↑: every student had a coke entails every student had a drink
b. no is mon↓: no student had a drink entails no student had a coke

Downward monotonicity is a general trait of semantically negative expressions (cf. Löbner 2000: §1.3). It is generally marked vs. upward monotonicity. Mon↓ operators, including most prominently g-negation itself, license negative polarity items (cf. article 64 (Giannakidou) Polarity items) and are thus specially restricted. Negative utterances in general are heavily marked pragmatically since they require special context conditions (Givón 1975).

4.2.2. Tolerance: types 1 and 4 vs. types 2 and 3

Intuitively, types 1 and 4 are weak as opposed to the strong types 2 and 3. The weak types make weaker claims. For example, one positive or negative case is enough to verify existential or negated universal quantification, respectively, whereas for the verification of universal and negated existential quantification the whole domain of quantification has to be checked. This distinction can be captured by the property of (in)consistency, or (in)tolerance:

Definition 9: tolerance and intolerance
a. An operator q is TOLERANT iff for some neg-pair p and Np of operands, q is true for both p and Np.
b. An operator q is INTOLERANT iff it is not tolerant.

Intolerant operators are "strong", tolerant ones "weak".

(26) a. intolerant every: every light is off excludes every light is on
b. tolerant some: some lights are on is compatible with some lights are off

An operator q is intolerant iff for all operands p: if q(p) is true, then q(Np) is false, i.e. Nq(Np) is true. Hence an operator is intolerant iff it entails its dual. Unless q is self-dual, it is different from its dual; hence only one of two different duals can entail its dual. Therefore, of two different dual operators one is intolerant and the other tolerant, or both are tolerant. In the quantificational and modal groups, type 2 generally entails type 1, whence type 2 is intolerant and type 1 tolerant. Since negation inverts entailment, type 3 entails type 4 if type 2 entails type 1. Thus in these groups, type 4 is tolerant, and type 3 intolerant. The criterion of (in)tolerance cannot be applied to the aspectual groups in Tab. 22.4. This gap will be taken care of in §5.3.

As a result, for those groups that enter the SqO (the quantificational and modal groups), the four types can be distinguished as follows.

Tab. 22.5: Type Distinctions in Duality Squares and the Square of Opposition

type | monotonicity | tolerance
type 1 / I | mon↑ | tolerant
type 2 / A | mon↑ | intolerant
type 3 / E | mon↓ | intolerant
type 4 / O | mon↓ | tolerant
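Definitions 8 and 9 also lend themselves to mechanical verification. The Python sketch below is an added illustration: the finite model is invented, and the neg-pair p/Np is rendered as complementation within the universe. It computes monotonicity and tolerance for the four quantifier types and reproduces the pattern of Tab. 22.5.

```python
from itertools import combinations

U = {"a", "b", "c"}          # invented universe; the restrictor is U itself
R = U

def powerset(s):
    s = list(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

EXT = powerset(U)

quantifiers = {
    "some (1/I)":      lambda P: bool(R & P),
    "every (2/A)":     lambda P: R <= P,
    "no (3/E)":        lambda P: not (R & P),
    "not every (4/O)": lambda P: not (R <= P),
}

def mon_up(q):     # Definition 8a: p entails p' => q(p) entails q(p')
    return all(q(Q) for P in EXT for Q in EXT if P <= Q and q(P))

def mon_down(q):   # Definition 8b: p entails p' => q(p') entails q(p)
    return all(q(P) for P in EXT for Q in EXT if P <= Q and q(Q))

def tolerant(q):   # Definition 9a, with Np rendered as the complement U - P
    return any(q(P) and q(U - P) for P in EXT)

for name, q in quantifiers.items():
    direction = "mon↑" if mon_up(q) else ("mon↓" if mon_down(q) else "neither")
    print(f"{name:16s} {direction}  {'tolerant' if tolerant(q) else 'intolerant'}")
# Output reproduces Tab. 22.5: some is mon↑ and tolerant, every mon↑ and
# intolerant, no mon↓ and intolerant, not every mon↓ and tolerant.
```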
Horn (to appear: 14) directly connects (in)tolerance to the asymmetries of g-negation w.r.t. types 3 and 4. He states that 'intolerant q may lexically incorporate S, but tends not to lexicalize N; conversely, tolerant q may lexically incorporate N, but bars lexicalization of S' (the quotations are modified so as to fit the terminology and notation used here). Since intolerant operators are of type 2 and tolerant ones of type 1, both incorporations of S or N lead to type 3.

The logical relations in the SqO can all be derived from the fact that ∀ entails ∃. They are logically equivalent to ∀ being intolerant, which, in turn, is equivalent to each of the (in)tolerance values of the other three quantifiers. The monotonicity properties cannot be derived from the SqO relations. They are inherent to the semantics of universal and existential quantification.

4.3. Explanations of the asymmetries

There is considerable discussion as to the reasons for the gaps observed in the SqO groups. Horn (1972) and Horn (to appear) suggest that type 4 is a conversational implicature of type 1 and hence in no need of extra lexicalization: some implicates not all, since if it were in fact all, one would have chosen to say so. This being so, type 4 is not altogether superfluous; some contexts genuinely require the expression of type 4; for example, only not all, not some, can be used with normal intonation as a refusal of all.

Löbner (1990: §5.7) proposes a speculative explanation in terms of possible differences in cognitive cost. The argument is as follows. Assume that the relative unmarkedness of type 1 indicates that type 1 is cognitively basic and that the other types are cognitively implemented as type 1 plus some cognitive equivalents of N, S, or D. If the duality group is built up as [q, Dq, Nq, DNq], this would explain the scale of markedness, if one assumes that application of D is less expensive than application of N, and simultaneous application of both is naturally even more expensive than either. Since we are accustomed to think of D as composed of S and N, this might appear implausible; however, an analysis is offered (see §5.2) where, indeed, D is simple (essentially presupposition negation) and S the combined effect of N and D.

Jaspers (2005) discusses the asymmetries within the SqO, in particular the missing lexicalization of type 4 / O, in more breadth and depth than any account before. He, too, takes type 1 as basic, "pivotal" in his terminology, and types 2 and 3 as derived from type 1 by two different elementary relations, one of them N. The fourth type, he argues, does not exist at all (although it can be expressed compositionally). His explanation is therefore basically congruent with the one in Löbner (1990), although the argument is based not on speculations about cognitive effort, but on a reflection on the character of human logic. For details of a comparison between the two approaches see Jaspers (2005: §2.2.5.2 and §4).

5. Phase quantification

In Löbner (1987, 1989, 1990) a theory of "phase quantification" was developed which was first designed to provide uniform analyses of the various schon groups in a way that captures the duality relationships. The theory later turned out to be also applicable to "only" (German nur), scalar adjectives and, in fact, universal and existential quantification in
Tab. 22.6: Presuppositions and Assertions of the schon Group

operator | relation to schon | presupposition: previous state | assertion: state at t_e
schon(t_e, p) | - | not-p | p
noch(t_e, p) | dual | p | p
noch nicht(t_e, p) | neg | not-p | not-p
nicht mehr(t_e, p) | subneg | p | not-p

Types 2, 3, and 4 can be directly analyzed as generated from type 1 by application of N and D, where D is just negation of the presupposition. Mittwoch (1993) and van der Auwera (1993) have questioned the presuppositions of the schon group. Their criticism, however, is essentially due to a confusion of types of uses (each use potentially comes with different presuppositions), or of presuppositions and conversational implicatures. Löbner (1999) offers an elaborate discussion, and refutation, of these arguments.

Scalar adjectives. Scalar adjectives frequently come in pairs of logically contrary antonyms such as long/short, old/young, expensive/cheap. They relate to a scale that is based on some ordering. They encode a dimension for their argument such as size, length, age, price, and rank its degree on the respective scale. Pairs of antonyms consist of a positive element +A and a negative element -A (see Bierwisch 1989 for an elaborate general discussion). +A states for its argument that it occupies a high degree on the scale, a degree that is marked against lower degrees. -A states that the degree is low, i.e. marked against higher degrees. The respective negations predicate an unmarked degree on the scale as opposed to marked higher degrees (neg +A), or marked lower degrees (neg -A). The criteria of marked degrees are context dependent and need not be discussed here.

Similar to the meanings of the schon group, +A can be seen as predicating of an argument t that on the given scale it is placed above a critical point where unmarkedness changes into markedness, and analogously for the other three cases. In the diagrams in Fig. 22.4, unmarkedness is denoted by 0 and markedness by 1, as (un)markedness coincides with the truth value that +A and -A assign to their argument.

Fig. 22.4: Phase diagrams for scalar adjectives (four diagrams for +A(t), -A(t), not +A(t) and not -A(t), each showing the sequence of 0- and 1-phases on the underlying scale)

Other uses of scalar adjectives such as comparative, equative, positive with degree specification, enough or too can be analyzed analogously (Löbner 1990: §8).

Existential and universal quantification. In order to determine the truth value of quantification restricted to some domain b and about a predicate expressed by p, the elements of b have to be checked in some arbitrary order as to whether p is true or false of them.
Huddleston, Rodney & Geoffrey K. Pullum 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Iten, Corinne 2005. Linguistic Meaning, Truth Conditions and Relevance. The Case of Concessives. Houndmills: Palgrave Macmillan.
Jaspers, Dany 2005. Operators in the Lexicon: On the Negative Logic of Natural Language. Ph.D. dissertation, University of Leiden. Utrecht: Landelijke Onderzoekschool Taalwetenschap (= LOT).
König, Ekkehard 1991a. Concessive relations as the dual of causal relations. In: D. Zaefferer (ed.). Semantic Universals and Universal Semantics. Dordrecht: Foris, 190-209.
König, Ekkehard 1991b. The Meaning of Focus Particles in German. London: Routledge.
Krifka, Manfred, Francis J. Pelletier, Gregory N. Carlson, Alice ter Meulen, Godehard Link & Gennaro Chierchia 1995. Genericity: An introduction. In: G. N. Carlson & F. J. Pelletier (eds.). The Generic Book. Chicago, IL: The University of Chicago Press, 1-124.
Löbner, Sebastian 1987. Quantification as a major module of natural language semantics. In: J. A. G. Groenendijk, M. J. B. Stokhof & D. de Jongh (eds.). Studies in Discourse Representation Theory and the Theory of Generalized Quantifiers. Dordrecht: Foris, 53-85.
Löbner, Sebastian 1989. German schon - erst - noch: An integrated analysis. Linguistics & Philosophy 12, 167-212.
Löbner, Sebastian 1990. Wahr neben Falsch. Duale Operatoren als die Quantoren natürlicher Sprache. Tübingen: Niemeyer.
Löbner, Sebastian 1999. Why German schon and noch are still duals: A reply to van der Auwera. Linguistics & Philosophy 22, 45-107.
Löbner, Sebastian 2000. Polarity in natural language: Predication, quantification and negation in particular and characterizing sentences. Linguistics & Philosophy 23, 213-308.
Löbner, Sebastian 2002. Understanding Semantics. London: Hodder-Arnold.
Löbner, Sebastian 2003. Semantik. Eine Einführung. (German version of Löbner 2002). Berlin: de Gruyter.
Mittwoch, Anita 1993. The relationship between schon/already and noch/still: A reply to Löbner. Natural Language Semantics 2, 71-82.
Palmer, Frank R. 2001. Mood and Modality. Cambridge: Cambridge University Press.
Schlücker, Barbara 2008. Warum nicht bleiben nicht werden ist: Ein Plädoyer gegen die Dualität von werden und bleiben. Linguistische Berichte 215, 345-371.

Sebastian Löbner, Düsseldorf (Germany)
V. Ambiguity and vagueness

23. Ambiguity and vagueness: An overview

1. Interpretive uncertainty
2. Ambiguity
3. Vagueness
4. Conclusion
5. References

Abstract

Ambiguity and vagueness are two varieties of interpretive uncertainty which are often discussed together, but are distinct both in their essential features and in their significance for semantic theory and the philosophy of language. Ambiguity involves uncertainty about mappings between levels of representation with different structural characteristics, while vagueness involves uncertainty about the actual meanings of particular terms. This article examines ambiguity and vagueness in turn, providing a detailed picture of their empirical characteristics and the diagnostics for identifying them, and explaining their significance for theories of meaning. Although this article continues the tradition of discussing ambiguity and vagueness together, one of its goals is to emphasize the ways that these phenomena are distinct in their empirical properties, in the factors that give rise to them, and in the analytical tools that can be brought to bear on them.

1. Interpretive uncertainty

Most linguistic utterances display "interpretive uncertainty", in the sense that the mapping from an utterance to a meaning (Grice's 1957 "meaning_NN") appears (at least from the external perspective of a hearer or addressee) to be one-to-many rather than one-to-one. Whether the relation between utterances and meanings really is one-to-many is an open question, which has both semantic and philosophical significance, as I will outline below. What is clear, though, is that particular strings of phonemes, letters or manual signs used to make utterances are, more often than not, capable of conveying distinct meanings.

As a first step, it is important to identify which kinds of interpretive uncertainty (viewed as empirical phenomena) are of theoretical interest. Consider for example (1a-b), which manifest several different kinds of uncertainty.

(1) a. Sterling's cousin is funny.
b. Julian's brother is heavy.

One kind concerns the kinds of individuals that the English noun phrases Sterling's cousin and Julian's brother can be used to pick out: the former is compatible with Sterling's cousin being male or female and the latter is compatible with Julian's brother
being older or younger than him. However, this sort of uncertainty merely reflects the fact that cousin and brother are indeterminate with respect to sex and age, respectively: both terms have conditions of application that specify certain kinds of familial relationships, and brother, unlike cousin, also imposes conditions on the sex of the individuals it applies to, but beyond these constraints these terms do not discriminate between objects as a matter of meaning. (Indeterminacy is also sometimes referred to as "generality"; see Zwicky & Sadock 1975, 1984; Gillon 1990, 2004.)

That this is so can be seen from the fact that distinctions of this sort do not affect judgments of truth or falsity. For example, assuming that the antecedents of the conditionals in (2a-b) specify the minimal difference between the actual world and the counterfactual worlds under consideration, the fact that (2a) is false and (2b) is true shows that a change in sex, unlike a change in familial relationships, does not affect the truth of the application of cousin.

(2) Lily is Sterling's cousin...
a. ...but if she were a boy, she wouldn't be Sterling's cousin anymore.
b. ...but if her mother weren't Sterling's father's sister, she wouldn't be Sterling's cousin anymore.

Likewise, the hypothetical change in age in (3a) doesn't affect the truth of the application of brother, though the change in sex in (3b) now does make a difference.

(3) Sterling is Julian's brother...
a. ...but if their ages were reversed, he wouldn't be Julian's brother anymore.
b. ...but if he were a girl, he wouldn't be Julian's brother anymore.

Indeterminacy reflects the fact that the meaning of a word or phrase typically does not involve an exhaustive specification of the features of whatever falls under its extension; some features are left open, resulting in the sort of flexibility of application we see above. This is not to say that such features couldn't be specified: many languages contain cousin terms that do make distinctions based on sex (either through grammatical gender, as in Italian cugino 'cousin (masc.)' vs. cugina 'cousin (fem.)', or lexically, as in Norwegian fetter 'male cousin' vs. kusine 'female cousin'), and some contain male sibling terms that specify age (such as Mandarin gege 'older brother' vs. didi 'younger brother'). Whether a particular distinction is indeterminate or not is thus somewhat arbitrary and language specific, and while it might be interesting to determine if there are cultural or historical explanations for the presence/absence of such distinctions in particular languages, the existence of indeterminacy in any single language is typically not a fact of particular significance for its semantic analysis.

A second type of uncertainty manifested by (1a) and (1b) is of much greater importance for semantic analysis, however, as it involves variability in the truth or satisfaction conditions that a particular bit of an utterance introduces into the meaning calculation. This kind of uncertainty is ambiguity, which manifests itself as variation in truth conditions: one and the same utterance token can be judged true of one situation and false of another, or the other way around, depending on how it is interpreted. In (1a) and (1b), we see ambiguity in the different ways of understanding the contributions of funny and heavy to the truth conditions. (1a) can be construed either as a claim that Sterling's cousin
has an ability to make people laugh ("funny ha-ha") or that she tends to display odd or unusual behavior ("funny strange"). Similarly, (1b) can be used to convey the information that Julian's brother has a relatively high degree of weight, or that he is somehow serious, ponderous, or full of gravitas. That these pairs of interpretations involve distinct truth conditions is shown by the fact that we can use the same term (or, more accurately, the same bits of phonology) to say something that is true and something that is false of the same state of affairs, as in (4) (Zwicky & Sadock 1975).

(4) Sterling's cousin used to make people laugh with everything she did, though she was never in any way strange or unusual. She was funny without being funny. Lately, however, she has started behaving oddly, and has lost much of her sense of humor. Now she's funny but not very funny.

Both examples also manifest an ambiguity in the nature of the relation that holds between the genitive-marked nominal in the possessive construction and the denotation of the whole possessive DP; cf. article 45 (Barker) Possessives and relational nouns. While the most salient relations are the familial ones expressed by the respective head nouns (cousin of and brother of), it is possible to understand these sentences as establishing different relations. For example, if Julian is one of several tutors working with a family of underachieving brothers, we could use (1b) as a way of saying something about the brother who has been assigned to Julian, without in any way implying that Julian himself stands in the brother of relation to anyone. (He could be an only child.)

Even after we settle on a particular set of conditions of application for the ambiguous terms in (1a) and (1b) (e.g. that we mean "funny ha-ha" by funny or are using heavy to describe an object's weight), a third type of uncertainty remains about precisely what properties these terms ascribe to the objects to which they are applied, and possibly about whether these terms can even be applied in the first place. This is vagueness, and it is of still greater significance for semantic theory, as it raises fundamental questions about the nature of meaning, about deduction and reasoning, and about knowledge of language.

Consider an utterance of (1b) in a context in which we know that heavy is being used to characterize Julian's brother's weight. If we take the person who utters this sentence to be speaking truthfully, we may conclude that Julian's brother's weight is above some threshold. However, any conclusions about how heavy he is will depend on a range of other contextual factors, such as his age, his height, information about the individuals under discussion, the goals of the discussion, the interests of the discourse participants, and so forth, and even then will be rough at best. For example, if we know that Julian's brother is a 4-year-old, and that we're talking about the children in his preschool class, we can conclude from an utterance of (1b) that his weight is somewhere above some threshold, but it would be extremely odd to follow up such an utterance by saying something like (5).

(5) Well, since that means he is at least 17.5 kg, we need to make sure that he is one of the carriers in the piggy-back race, rather than one of the riders.
(5) is odd because it presumes a specific cut-off point separating the heavy things from the non-heavy things (17.5 kg), but the kind of uncertainty involved in vagueness is precisely uncertainty about where the cut-off is.
This can be further illustrated by a new context. Imagine that we are in a situation in which the relevant contextual factors are clear: Julian's brother is a 4-year-old, we're talking about the children in his class, and we want to decide who should be the anchor on the tug-of-war team. In addition, we also know that Julian's brother weighs exactly 15.2 kg. Even with these details spelled out - in particular, even with our knowledge of Julian's brother's actual weight - we might still be uncertain as to whether (1b) is true: Julian's brother is a borderline case for truthful application of the predicate.

Borderline cases and uncertainty about the boundaries of a vague predicate's extension raise significant challenges for semantic theory. If we don't (and possibly can't) know exactly how much weight is required to put an object in the extension of heavy (in a particular context of use), even when we are aware of all of the potentially relevant facts, can we truly say that we know the meaning of the term? Do we have only incomplete knowledge of its meaning? If our language contained only a few predicates like heavy, we might be able to set them aside as interesting but ultimately insignificant puzzles. Vagueness is pervasive in natural language, however, showing up in all grammatical categories and across lexical fields, so understanding the principles underlying this type of uncertainty is of fundamental importance for semantic theory.

In the rest of this article, I will take a closer look at ambiguity and vagueness in turn, providing a more detailed picture of their empirical characteristics and the diagnostics for identifying them, and explaining their significance for theories of meaning. Although this article follows in a long tradition of discussing ambiguity and vagueness together, a goal of the article is to make it clear that these phenomena are distinct in their empirical properties, in the factors that give rise to them, and in the analytical tools that can be brought to bear on them. However, both present important challenges for semantics and philosophy of language, and in particular, for a compositional, truth conditional theory of meaning.

2. Ambiguity

2.1. Varieties of ambiguity

Ambiguity is associated with utterance chunks corresponding to all levels of linguistic analysis, from phonemes to discourses, and is characterized by the association of a single orthographic or phonological string with more than one meaning. Ambiguity can have significant consequences, for example if the wording of a legal document is such that it allows for interpretations that support distinct judgments. But it can also be employed for humorous effect, as in the following examples from the 1980s British comedy series A Bit of Fry and Laurie (created by Stephen Fry and Hugh Laurie).

(6) FRY: You have a daughter, I believe.
LAURIE: Yeah, Henrietta.
FRY: Did he? I'm sorry to hear that. That must've hurt.

(6) illustrates a case of phonological ambiguity, playing on the British comedians' pronunciations of the name Henrietta and the sentence Henry ate her. (7) makes use of the lexical ambiguity between the name Nancy and the British slang term nancy, which means weak or effeminate when used as an adjective.
test of contradiction, which involves determining whether the same string of words or phonemes (modulo the addition/subtraction of negation) can be used to simultaneously affirm and deny a particular state of affairs. We saw this test illustrated with funny in (4) above; the fact that (11) is read as a true contradiction shows that a merely indeterminate term like cousin does not allow for this.

(11) Lily's mother is Sterling's father's sister. Since Lily is a girl, she is Sterling's cousin but not his cousin.

A second important test involves identity of sense anaphora, such as ellipsis, anaphoric too, pronominalization and so forth. As pointed out by Lakoff (1970), such relations impose a parallelism constraint that requires an anaphoric term and its antecedent to have the same meaning (cf. article 70 (Reich) Ellipsis for a discussion of parallelism in ellipsis), which has the effect of reducing the interpretation space of an ambiguous expression. For example, consider (12), which can have either of the truth-conditionally distinct interpretations paraphrased in (12a) and (12b), depending on whether the subject the fish is associated with the agent or theme argument of the verb eat.

(12) The fish is ready to eat.
a. The fish is ready to eat a meal.
b. The fish is ready to be eaten.

When (12) is the antecedent in an identity of sense anaphora construction, as in (13a-b), the resulting structure remains two-ways ambiguous; it does not become four-ways ambiguous: if the fish is ready to do some eating, then the chicken is too; if the fish is ready to be eaten, then the chicken is too.

(13) a. The fish is ready to eat, and the chicken is ready to eat too.
b. The fish is ready to eat, but the chicken isn't.

That is, these sentences do not have understandings in which the fish is an agent and the chicken is a theme, or vice-versa. This fact can be exploited to demonstrate ambiguity by constructing test examples in such a way that one of the expressions in the identity of sense relation is compatible only with one interpretation of the ambiguous term or structure. For example, since potatoes are not plausible agents of an eating event, the second conjunct of (14a) disambiguates the first conjunct towards the interpretation in (12b). In contrast, (14b) is somewhat odd, because children are typically taken to be agents of eating events, rather than themes, but the context of a meal promotes the theme- rather than agent-based interpretation of the first conjunct, creating a conflict. This conflict is even stronger in (14c), which strongly implies either agentive potatoes or partially-cooked children, giving the whole sentence a certain dark humour.

(14) a. The fish is ready to eat, but the potatoes are not.
b. ?The fish is ready to eat, but the children are not.
c. ??The potatoes are ready to eat, but the children are not.
Part of the humorous effect of (14c) is based on the fact that a sensible interpretation of the sentence necessitates a violation of the default parallelism of sense imposed by ellipsis, a parallelism that would not be required if there weren't two senses to begin with. The use of such violations for rhetorical effect is referred to as syllepsis, or sometimes by the more general term zeugma.

2.3. Ambiguity and semantic theory

Ambiguity has played a central role in the development of semantic theory by providing crucial data for both building and evaluating theories of lexical representation and semantic composition. Cases of ambiguity are often "analytical choice points" which can lead to very different conclusions depending on how the initial ambiguity is evaluated. For example, whether scope ambiguities are taken to be structural (reflecting different Logical Forms), lexical (reflecting optional senses, possibly derived via type-shifting), or compositional (reflecting indeterminacy in the application of composition rules) has consequences for the overall architecture of a theory of the syntax-semantics interface, as noted above.

Consider also the ambiguity of adjectival modification structures such as (15) (first discussed in detail by Bolinger 1967; see also Siegel 1976; McConnell-Ginet 1982; Cinque 1993; Larson 1998 and article 52 (Demonte) Adjectives), which is ambiguous between the "intersective" reading paraphrased in (15a) and the "nonintersective" one paraphrased in (15b).

(15) Olga is a beautiful dancer.
a. Olga is a dancer who is beautiful.
b. Olga is a dancer who dances beautifully.

Siegel (1976) takes this to be a case of lexical ambiguity (in the adjective beautiful), and builds a theory of adjective meaning on top of this assumption. In contrast, Larson (1998) argues that the adjectives themselves are unambiguous, and shows how the different interpretations can be accommodated by hypothesizing that nouns (like verbs) introduce a Davidsonian event variable, and that the adjective can take either the noun's individual variable or its event variable as an argument. (The first option derives the interpretation in (15a); the second derives the one in (15b).)

In addition to playing an important methodological role in semantic theory, ambiguity has also been presented as a challenge for foundational assumptions of semantic theory. For example, Parsons (1973) develops an argument which aims to show that the existence of ambiguity in natural language provides a challenge for the hypothesis that sentence meaning involves truth conditions (see Saka 2007 for a more recent version of this argument). The challenge goes like this. Assume that a sentence S contains an ambiguous term whose different senses give rise to distinct truth conditions p and q (as in the case of (15)), such that (16a-b) hold.

(16) a. S is true if and only if p
b. S is true if and only if q
c. p if and only if q

But (16a-b) mutually entail (16c), which is obviously incorrect, so one of our assumptions must be wrong; according to Parsons, the problematic assumption is the one that
of interpretive uncertainty completely by denying that fixing the reference of pronouns is part of semantics.

In each of the cases discussed above, the key to responding to Parsons' challenge was to demonstrate that standard (or at least reasonable) assumptions about lexico-syntactic representation and composition support the view that observed variability in truth-conditions has a representational basis: in each case, the mappings from representations to meanings are one-to-one, but the mappings from representations (and meanings) to pronunciations are sometimes many-to-one. But what if standard assumptions fail to support such a view? In such a case, Parsons' challenge would reemerge, unless a representational distinction can be demonstrated. One of the strongest cases of this sort comes from the work of Charles Travis, who discusses a particular form of truth conditional variability associated with color terms (see e.g., Travis 1985, 1994, 1997). The following passage illustrates the phenomenon:

A story. Pia's Japanese maple is full of russet leaves. Believing that green is the colour of leaves, she paints them. Returning, she reports, 'That's better. The leaves are green now.' She speaks truth. A botanist friend then phones, seeking green leaves for a study of green-leaf chemistry. 'The leaves (on my tree) are green,' Pia says. 'You can have those.' But now Pia speaks falsehood. (Travis 1997: 89)

This scenario appears to show that distinct utterances of the words in (27), said in order to describe the same scenario (the relation between the leaves and a particular color), can be associated with distinct truth values.

(27) The leaves are green.

Following the line of reasoning initially advanced by Parsons, Travis concludes from this example that sentence meaning is not truth conditional; that instead, the semantic value of a sentence at most imposes some necessary conditions under which it may be true (as well as conditions under which it may be used), but those conditions need not be sufficient, and the content of the sentence does not define a function from contexts to truth.

Travis' skeptical conclusion is challenged by Kennedy & McNally (2010), however, who ask us to consider a modified version of the story of Pia and her leaves. Now she has a pile of painted leaves of varying shades of green (pile A) as well as a pile of naturally green leaves, also of varying shades (pile B). Pia's artist friend walks in and asks if she can have some green leaves for a project. Pia invites her to sort through the piles and take whichever leaves she wants. In sorting through the piles, the artist might utter any of the sentences in (28) in reference either to leaves from pile A or to leaves from pile B, as appropriate based on the way that they manifest green: the particular combination of hue, saturation, and brightness, extent of color, and so forth.

(28) a. These leaves are green.
b. These leaves are greener than those.
c. These leaves aren't as green as those.
d. These leaves are less green than those.
3. Vagueness

3.1. The challenge of vagueness

It is generally accepted that the locus of vagueness in sentences like (29) is the predicate headed by the gradable adjective expensive: this sentence is vague because, intuitively, what it means to count as expensive is unclear.

(29) The coffee in Rome is expensive.

Sentences like (29) have three distinguishing characteristics, which have been the focus of much work on vagueness in semantics and the philosophy of language. The first is contextual variability in truth conditions: (29) could be judged true if asserted as part of a conversation about the cost of living in Rome vs. Naples (In Rome, even the coffee is expensive!), for example, but false in a discussion of the cost of living in Chicago vs. Rome (The rents are high in Rome, but at least the coffee is not expensive!). This kind of variability is of course not restricted to vague predicates - for example, the relational noun citizen introduces variability because it has an implicit argument (citizen of x), but it is not vague - though all vague predicates appear to display it.

The second feature of vagueness is the existence of borderline cases. For any context, in addition to the sets of objects that a predicate like is expensive is clearly true of and clearly false of, there is typically a third set of objects for which it is difficult or impossible to make these judgments. Just as it is easy to imagine contexts in which (29) is clearly true and contexts in which it is clearly false, it is also easy to imagine a context in which such a decision cannot be so easily made. Consider, for example, a visit to a coffee shop to buy a pound of coffee. The Mud Blend at $1.50/pound is clearly not expensive, and the Organic Kona at $20/pound is clearly expensive, but what about the Swell Start Blend at $9.25/pound? A natural response is "I'm not sure"; this is the essence of being a borderline case.

Finally, vague predicates give rise to the Sorites Paradox, illustrated in (30).

(30) The Sorites Paradox
P1. A $5 cup of coffee is expensive (for a cup of coffee).
P2. Any cup of coffee that costs 1 cent less than an expensive one is expensive (for a cup of coffee).
C. Therefore, any free cup of coffee is expensive.

The structure of the argument appears to be valid, and the premises appear to be true, but the conclusion is without a doubt false. Evidently, the problem lies somewhere in the inductive second premise; what is hard is figuring out exactly what goes wrong. And even if we do, we also need to explain both why it is so hard to detect the flaw and why we are so willing to accept it as valid in the first place. These points are made forcefully by Fara (2000), who succinctly characterizes the challenges faced by any explanatorily adequate account of vagueness in the form of the following three questions:

(31) a. The Semantic Question
If the inductive premise of a Sorites argument is false, then is its classical negation - the sharp boundaries claim that there is an adjacent pair in a sorites
variability, borderline cases, and the Sorites Paradox as a guide, we can find vague terms in all grammatical classes: nouns (like heap, which gives the Sorites Paradox its name), verbs (such as like, or more significantly, know), determiners (such as many and few), prepositions (such as near) and even locative adverbials, as in the Fry and Laurie dialogue in (33).

(33) FRY: There are six million people out there....
LAURIE: Really? What do they want?

Here the humor of Laurie's response comes from the vagueness of out there: whether it extends to cover a broad region beyond the location of utterance (Fry's intention) or whether it picks out a more local region (Laurie's understanding). Vagueness is thus pervasive, and its implications for the analysis of linguistic meaning extend to all parts of the lexicon.

3.2. Approaches to vagueness

It is impossible to summarize all analyses of vagueness in the literature, so I will focus here on an overview of four major approaches, based on supervaluations, epistemic uncertainty, contextualism, and interest relativity. For a larger survey of approaches, see Williamson (1994); Keefe & Smith (1997); and Fara & Williamson (2002).

3.2.1. Supervaluations

Let's return to one of the fundamental properties of vague predicates: the existence of borderline cases. (34a) is clearly true; (34b) is clearly false; (34c) is (at least potentially) borderline.

(34) a. Mercury is close to the sun.
b. Pluto is close to the sun.
c. The earth is close to the sun.

However, even if we are uncertain about the truth of (34c), we seem to have clear intuitions about (35a-b): the first is a logical truth, and the second is a contradiction.

(35) a. The earth is or isn't close to the sun.
b. The earth is and isn't close to the sun.

This is not particularly surprising, as these sentences are instances of (36a-b):

(36) a. p ∨ ¬p
b. p ∧ ¬p

As noted by Fine (1975), these judgments show that logical relations (such as the Law of the Excluded Middle in (36a)) can hold between sentences which do not themselves have clear truth values in a particular context of utterance. Fine accepts the position that sentences involving borderline cases, such as (34c), can fail to have truth values
in particular contexts (this is what it means to be a borderline case), and accounts for judgments like (35a) by basing truth valuations for propositions built out of logical connectives not on facts about specific contexts of evaluation, but rather on a space of interpretations in which all "gaps" in truth values have been filled. In other words, the truth of examples like (35a-b) is based on supervaluations (van Fraassen 1968, 1969) rather than simple valuations. The crucial components of Fine's theory are stated in (37)-(39); a similar set of proposals (and a more comprehensive linguistic analysis) can be found in Kamp (1975) and especially Klein (1980).

(37) Specification space
A partially ordered set of points corresponding to different ways of specifying the predicates in the language; at each point, every proposition is assigned true, false or nothing according to an "intuitive" valuation. This valuation must obey certain crucial constraints, such as Fine's Penumbral Connections, which ensure e.g. that if x is taller than y, it can never be the case that x is tall is false while y is tall is true (cf. the Consistency Postulate of Klein 1980).

(38) Completability
Any point can be extended to a point at which every proposition is assigned a truth value, subject to the following constraints:
a. fidelity: Truth values at complete points are 1 or 0.
b. stability: Definite truth values are preserved under extension.

(39) Supertruth
A proposition is supertrue (or superfalse) at a partial specification iff it is true (false) at all complete extensions.

According to this approach, the reason that we have clear intuitions about (35a) and (35b) is that for any way of making things more precise, we're always going to end up with a situation where the former holds for any proposition and the latter fails to hold, regardless of whether the proposition has a truth value at the beginning. In particular, given (38), (35a) is supertrue and (35b) is superfalse.

This theory provides an answer to Fara's Semantic Question about vagueness. According to this theory, any complete and admissible specification will entail a sharp boundary between the things that a vague predicate is true and false of. This renders inductive statements like the second premise of (40) superfalse, even when the argument as a whole is evaluated relative to a valuation that does not assign truth values to some propositions of the form x is heavy, i.e., one that allows for borderline cases.

(40) a. A 100 kilogram stone is heavy.
b. Any stone that weighs 1 gram less than a heavy one is heavy.
c. #A 1 gram stone is heavy.

The supervaluationist account thus gives up bivalence (the position that all propositions are either true or false relative to particular assignments of semantic values), but still manages to retain important generalizations from classical logic (such as the Law of the
eliminate borderline cases and identify where the second premise of the Sorites Paradox fails? Williamson's response comes in several parts. First, he assumes that meaning supervenes on use; as such, a difference in meaning entails a difference in use, but not vice-versa. Second, he points out that the meanings of some terms may be stabilized by natural divisions (cf. Putnam's 1975 distinction between H2O and XYZ), while the meanings of others (the vague ones) cannot be so stabilized: a slight shift in our disposition to say that the earth is close to the sun would slightly shift the meaning of close to the sun. The boundary is sharp, but not fixed. But this in turn means that an object around the borderline of a vague predicate P could easily have been (or not been) P had the facts (in particular, the linguistic facts) been slightly different - different in ways that are too complex for us to fully catalogue, let alone compute. Given this instability, we can never really know about a borderline case whether it is or is not P. This last point leads to the principle in (41), which is another way of saying that vague knowledge requires a margin for error.

(41) The Margin for Error Principle
For a given way of measuring differences in measurements relevant to the application of property P, there will be a small but non-zero constant c such that if x and y differ in those measurements by less than c and x is known to be P, then y is P.

The upshot of this reasoning is that it is impossible to know whether x is P is true or false when x and y differ by less than c. That's why we fail to reject the second premise of the Sorites, and also why "big" changes make a difference. (If we replace 1 cent with 1 dollar in (30), or 1 gram with 10 kilograms in (40), the paradox disappears.)

There are a number of challenges to this account, most of which focus on the central hypothesis that we can be ignorant about core aspects of meaning and still somehow manage to have knowledge of meaning at all. Williamson (1992) lists these challenges and provides responses to them; here I focus on the question of whether this theory is adequate as an account of vagueness. According to Fara (2000), it is not, because although it addresses the Semantic and Epistemological Questions, it does not address the Psychological one. In particular, there is no account of why we don't have the following reaction to the inductive premise: "That's false! I don't know where the shift from P to ¬P is, so there are cases that I'm not willing to make a decision about, but I know it's in there somewhere, so the premise must be false." Williamson (1997) suggests the answer has to do with the relation between imagination and experience. The argument runs as follows:

(42) i. It is impossible to gain information through imagination that cannot be gained through experience.
ii. It is impossible to recognize the experience of the boundary transition in a sorites sequence because the transition lacks a distinctive appearance.
iii. Therefore, it is impossible to imagine the transition.

This failure of imagination then makes it impossible to reject the inductive premise, since doing so precisely requires imagination of the crucial boundary transition. However,
length. The epistemic account of vagueness provides no account of this fact (nor does an unaugmented supervaluationist account, since it too needs to allow for contextual shifting of standards of comparison). If the impossibility of crisp judgments in examples like these involving definite descriptions and our judgments about the second premise of the Sorites Paradox are instances of the same basic phenomenon, then the failure of the epistemic account of vagueness to explain the former raises questions about its applicability to the latter.

3.2.3. Contextualism and interest relativity

Raffman (1996) observes three facts about vague predicates and sorites sequences. First, as we have already discussed, vague predicates have context dependent extensions. Second, when presented with a sorites sequence based on a vague predicate P, a competent speaker will at some point stop (or start, depending on direction) judging P to be true of objects in the sequence. Third, even if we fix the (external) context, the shift can vary from speaker to speaker and from run to run. This is also a part of competence with P.

These observations lead Raffman (1994, 1996) to a different perspective on the problem of vagueness: reconciling tolerance (insensitivity to marginal changes) with categorization (the difference between being red and orange, tall and not tall, etc.). She frames the question in the following way: how can we simultaneously explain the fact that a competent speaker seems to be able to apply incompatible predicates (e.g., red vs. orange; tall vs. not tall) to marginally different (adjacent) items in the sequence and the fact that people are unwilling to reject the inductive premise of the Paradox? Her answer involves recognizing the fact that evaluation of a sorites sequence triggers a context shift, which in turn triggers a shift in the extension of the predicate in such a way as to ensure that incompatible predicates are never applied to adjacent pairs, and to make (what looks like) a sequence of inductive premises all true. (For similar approaches, see Kamp 1981; Bosch 1983; Soames 1999.) This gives the illusion of validity, but since there is an extension shift, the predicate at the end of the series is not the same as the one at the beginning, so the argument is invalid.

There are three pieces to her account. The first comes from work in cognitive psychology, which distinguishes between two kinds of judgments involved in a sorites sequence. The first is categorization, which involves judgments of similarity to a prototype/standard; the second is discrimination, which involves judgments of sameness/difference between pairs. Singular judgments about items involve categorization, and it is relative to such judgments that a cutoff point is established. Discrimination, on the other hand, doesn't care where cutoff points fall, but it imposes a different kind of constraint: adjacent pairs must be categorized in the same way (Tversky & Kahneman 1974).

At first glance, it appears that the categorization/discrimination distinction just restates the problem of vagueness in different terms: if for any run of a sorites sequence, a competent speaker will at some point make a category shift, how do we reconcile such shifts with the fact that we resist discrimination between adjacent pairs?
Note that the problem is not the fact that a speaker might eventually say of an object o_i in a sorites sequence based on P that it is not P, even if she judged o_i+1 to be P, because this is a singular judgment about o_i. The problem is that given the pair ⟨o_i, o_i+1⟩, the speaker will refuse to treat them differently. This is what underlies judgments about the inductive
premise of the Sorites Paradox, and possibly the crisp judgment effects discussed above as well, though this is less clear (see below).

The second part of Raffman's proposal is designed to address this problem, by positing that a category shift necessarily involves a change in perspective such that the new category instantaneously absorbs the preceding objects in the sequence. Such backwards spread is the result of entering a new psychological state, a Gestalt shift that triggers a move from one 'category anchor' or prototype to another, e.g., from the influence of the red anchor to the influence of the orange one. This gives rise to the apparent boundlessness of vague predicates: a shift in category triggers a shift in the border away from the edge, giving the impression that it never was there in the first place. And in fact, as far as the semantics is concerned, when it comes to making judgments about pairs of objects, it never is.

This is the third part of the analysis, which makes crucial appeal to the context dependence of vague predicates. Raffman proposes that the meaning of a vague predicate P is determined by two contextual factors. The external context includes discourse factors that fix domain, comparison class, dimension, etc. of P. The internal context includes the properties of an individual's psychological state that determine dispositions to make judgments of P relative to some external context. Crucially, a category shift causes a change in internal context (e.g., from a state in which the red anchor dominates to one in which the orange anchor does), which in turn results in a change in the extension of the predicate in the way described above, resulting in backwards spread.

Taken together, these assumptions provide answers to each of Fara's questions. The answer to the Semantic Question is clearly positive, since the commitment to category shifts involves a commitment to the position that a vague predicate can be true of one member o_i of an adjacent pair in a sorites sequence and false of the other, o_i+1. The reason we cannot say which pair ⟨o_i, o_i+1⟩ has this property, however, is that the act of judging o_i to be not P (or P) causes a shift in the contextual meaning of P to P', which, given backwards spread, treats o_i and o_i+1 the same. This answer to the Epistemological Question also underlies the answer to the Psychological Question: even though the inductive premise of the Sorites Paradox is false for any fixed meaning of a vague predicate, we think that it is true because it is possible to construct a sequence of true statements that look like (instantiations of) the inductive premise, but which in fact do not represent valid reasoning because they involve different contextual valuations of the vague predicate. For example, if we are considering a sequence of 100 color patches ⟨p1, p2, ..., p100⟩ ranging from 'pure' red to 'pure' orange, such that a category shift occurs upon encountering patch p47, the successive conditional statements in (46a) and (46c) work out to be true thanks to backwards spread (because their subconstituents are both true and both false in their contexts of evaluation, respectively), even though red means something different in each case.

(46) a. If p45 is red, then p46 is red.
p45, p46 ∈ [[red]]c
TRUE → TRUE ⊨ TRUE
b. shift at p47: change from context c to context c'
c. If p46 is red, then p47 is red.
p46, p47 ∉ [[red]]c'
FALSE → FALSE ⊨ TRUE
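The dynamics behind (46) can be made concrete with a small simulation. The following Python sketch is an added illustration, not Raffman's own model: the initial cutoff and the size of the backwards spread are invented parameters, and only the mechanism (a category shift with backwards spread, triggered by probing a pair that straddles the boundary) is taken from the text. Every instantiation of the inductive premise then comes out true in its context of evaluation.

```python
# p1 ... p100 run from 'pure' red to 'pure' orange. CUTOFF0 and SPREAD are
# invented parameters for the illustration.
CUTOFF0 = 47     # in context c, patches p1..p46 fall under 'red'
SPREAD  = 3      # how far the new category absorbs preceding patches

class Judge:
    def __init__(self):
        self.cutoff = CUTOFF0            # the current internal context

    def red(self, i):
        return i < self.cutoff           # extension of 'red' in this context

    def judge_pair(self, i):
        """Evaluate 'if p_i is red then p_i+1 is red' in the current context.
        Probing a pair that straddles the boundary triggers a category shift:
        the boundary retreats (backwards spread), so the pair is judged alike."""
        if self.red(i) != self.red(i + 1):
            self.cutoff = i - SPREAD + 1     # shift: the new category absorbs p_i
        return (not self.red(i)) or self.red(i + 1)   # material implication

j = Judge()
print(j.judge_pair(45))   # (46a): TRUE -> TRUE, evaluated in context c
print(j.judge_pair(46))   # (46c): FALSE -> FALSE, evaluated in shifted context c'

# Walking the whole sequence: every instance of the inductive premise is true
# in its context of evaluation, although 'red' never keeps a fixed extension.
j2 = Judge()
assert all(j2.judge_pair(i) for i in range(1, 100))
print("The boundary is never brought into focus.")
```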
relativity, will cause the extension of the predicate to shift in a way that ensures that they are evaluated in the same way. In Fara's words: "the boundary between the possessors and the lackers in a sorites series is not sharp in the sense that we can never bring it into focus; any attempt to bring it into focus causes it to shift somewhere else." (Fara 2000: 75-76)

One of the reasons that the contextualist and interest relative analyses provide compelling answers to the Psychological Question is that they are inherently psychological: the former in the role that psychological state plays in fixing context sensitive denotations; the latter in the role played by interest relativity. Moreover, in providing an explanation for judgments about pairs of objects, these analyses can support an explanation of the 'crisp judgment' effects discussed in the previous section, provided they can be linked to a semantics that appropriately distinguishes the positive and comparative forms of a scalar predicate.

However, the very aspects of these analyses that are central to their successes also raise fundamental problems that question their ultimate status as comprehensive accounts of vagueness, according to Stanley (2003). Focusing first on the contextualist analysis, Stanley claims that it makes incorrect predictions about versions of the Sorites that involve sequential conditionals and ellipsis. Stanley takes the contextualist to be committed to a view in which vague predicates are a type of indexical expression. Indexicals, he observes, have the property of remaining invariant under ellipsis: (48b), for example, cannot be used to convey the information expressed by (48a).

(48) a. Kim voted for that_a candidate because Lee voted for that_b candidate.
     b. Kim voted for that_a candidate because Lee did [vote for that_b candidate].

Given this, Stanley argues that if the contextualist account of vagueness entails that vague predicates are indexicals, then our judgments about sequences of conditionals like (46) (keeping the context the same) should change when the predicates are elided. Specifically, since ellipsis requires indexical identity, it must be the case that the elided occurrences of red in (49) be assigned the same valuation as their antecedents, i.e. that [[red]]^c = [[red]]^c′.

(49) a. If p_45 is red, then p_46 is red too.
        p_45, p_46 ∈ [[red]]^c
        TRUE → TRUE ⊨ TRUE
     b. shift at p_47: change from context c to context c′
     c. If p_46 is red, then p_47 is red too.
        p_46 ∈ [[red]]^c, p_47 ∉ [[red]]^c
        TRUE → FALSE ⊨ FALSE

But this is either in conflict with backwards spread, in which case ellipsis should be impossible, or it entails that (49c) should be judged false while (49a) is judged true. Neither of these predictions is borne out: the judgments about (49) are in all relevant respects identical to those about (46). Raffman (2005) responds to this criticism by rejecting the view that the contextualist account necessarily treats vague predicates as indexicals, and suggests that the kind of 'shiftability' of vague predicates under ellipsis that is necessary to make the account work is analogous to what we see with comparison classes in examples like (50a), which has the meaning paraphrased in (50b) (Klein 1980).
(50) a. That elephant is large and that flea is too.
     b. That elephant is large for an elephant and that flea is large for a flea.

This is probably not the best analogy, however: accounts of comparison class shift in examples like (50a) rely crucially on the presence of a binding relation between the subject and a component of the meaning of the predicate (see e.g., Ludlow 1989; Kennedy 2007), subsuming such cases under a general analysis of 'sloppy identity' in ellipsis. If a binding relation of this sort were at work in (49), the prediction would be that the predicate in the consequent of (49a) and the antecedent of (49c) should be valued in exactly the same way (since the subjects are the same), which, all else being equal, would result in exactly the problematic judgments about the truth and falsity of the two conditionals that Stanley discusses. In the absence of an alternative contextualist account of how vague predicates should behave in ellipsis, then, Stanley's objection remains in force.

This objection does not present a problem for Fara's analysis, which assumes that vague predicates have fixed denotations. However, the crucial hypothesis that these denotations are interest relative comes with its own problems, according to Stanley. In particular, he argues that this position leads to the implication that the meaning of a vague predicate is always relativized to some agent, namely the entity relative to whom significance is assessed. But this implication is inconsistent with the fact that we can have beliefs about the truth or falsity of a sentence like (51) without having beliefs about any agent relative to whom Mt. Everest's height is supposed to be significant.

(51) Mt. Everest is tall.

Moreover, the truth of the proposition conveyed by an utterance of (51) by a particular individual can remain constant even in hypothetical worlds in which that individual doesn't exist, something that would seem to be impossible if the truth of the proposition has something to do with the utterer's interests. Fara (2008) rejects Stanley's criticism on the grounds that it presumes that an "agent of interest" is a constituent of the proposition expressed by (51), something that is not the case given her particular assumptions about the compositional semantics of the vague scalar predicates she focuses on, which builds on the decompositional syntax and semantics of Kennedy (1999, 2007) (see also Bartsch & Vennemann 1972). However, to the extent that her analysis creates entailments about the existence of individuals with the relevant interests, it is not clear that Stanley's criticisms can be so easily set aside.

4. Conclusion

Ambiguity and vagueness are two forms of interpretive uncertainty, and as such, are often discussed in tandem. They are fundamentally different in their essential features, however, and in their significance for semantic theory and the philosophy of language. Ambiguity is essentially a "mapping problem", and while there are significant analytical questions about how (and at what level) to best capture different varieties of ambiguity, the phenomenon per se does not represent a significant challenge to current conceptions of semantics. Vagueness, on the other hand, raises deeper questions about knowledge
of meaning. The major approaches to vagueness that I have outlined here, all of which come from work in philosophy of language, provide different answers to these questions, but none is without its own set of challenges. Given this, as well as the fact that this phenomenon has seen relatively little in the way of close analysis by linguists, vagueness has the potential to be an important and rich domain of research for semantic theory.

5. References

Barker, Chris 2002. The dynamics of vagueness. Linguistics & Philosophy 25, 1-36.
Bartsch, Renate & Theo Vennemann 1972. The grammar of relative adjectives and comparison. Linguistische Berichte 20, 19-32.
Bierwisch, Manfred 1989. The semantics of gradation. In: M. Bierwisch & E. Lang (eds.). Dimensional Adjectives. Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 71-261.
Bogusławski, Andrzej 1975. Measures are measures. In defence of the diversity of comparatives and positives. Linguistische Berichte 36, 1-9.
Bolinger, Dwight 1967. Adjectives in English. Attribution and predication. Lingua 18, 1-34.
Bosch, Peter 1983. 'Vagueness' is context-dependence. A solution to the Sorites Paradox. In: T. Ballmer & M. Pinkal (eds.). Approaching Vagueness. Amsterdam: North-Holland, 189-210.
Cinque, Guglielmo 1993. On the evidence for partial N movement in the Romance DP. University of Venice Working Papers in Linguistics 3(2), 21-40.
Cresswell, Max J. 1976. The semantics of degree. In: B. Partee (ed.). Montague Grammar. New York: Academic Press, 261-292.
Fara, Delia Graff 2000. Shifting sands. An interest-relative theory of vagueness. Philosophical Topics 28, 45-81. Originally published under the name "Delia Graff".
Fara, Delia Graff 2008. Profiling interest relativity. Analysis 68, 326-335.
Fara, Delia Graff & Timothy Williamson (eds.) 2002. Vagueness. Burlington, VT: Ashgate. Originally published under the name "Delia Graff".
Fine, Kit 1975. Vagueness, truth and logic. Synthese 30, 265-300.
van Fraassen, Bas C. 1968. Presupposition, implication, and self-reference. Journal of Philosophy 65, 136-152.
van Fraassen, Bas C. 1969. Presuppositions, supervaluations, and free logic. In: K. Lambert (ed.). The Logical Way of Doing Things. New Haven, CT: Yale University Press, 69-91.
Gillon, Brendan S. 1990. Ambiguity, generality, and indeterminacy. Tests and definitions. Synthese 85, 391-416.
Gillon, Brendan S. 2004. Ambiguity, indeterminacy, deixis, and vagueness. Evidence and theory. In: S. Davis & B. Gillon (eds.). Semantics. A Reader. Oxford: Oxford University Press, 157-187.
Grice, H. Paul 1957. Meaning. The Philosophical Review 66, 377-388.
Heim, Irene 2000. Degree operators and scope. In: B. Jackson & T. Matthews (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) X. Ithaca, NY: Cornell University, 40-64.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Jacobson, Pauline 1999. Towards a variable-free semantics. Linguistics & Philosophy 22, 117-184.
Jacobson, Pauline 2000. Paycheck pronouns, Bach-Peters sentences, and variable-free semantics. Natural Language Semantics 8, 77-155.
Kamp, Hans 1975. Two theories of adjectives. In: E. Keenan (ed.). Formal Semantics of Natural Language. Cambridge: Cambridge University Press, 123-155.
Kamp, Hans 1981. The paradox of the heap. In: U. Mönnich (ed.). Aspects of Philosophical Logic. Dordrecht: Reidel, 225-277.
Kamp, Hans & Barbara H. Partee 1995. Prototype theory and compositionality. Cognition 57, 129-191.
Keefe, Rosanna & Peter Smith (eds.) 1997. Vagueness. A Reader. Cambridge, MA: The MIT Press.
Stanley, Jason 2003. Context, interest relativity, and the Sorites. Analysis 63, 269-280.
von Stechow, Arnim 1984. Comparing semantic theories of comparison. Journal of Semantics 3, 1-77.
Travis, Charles 1985. On what is strictly speaking true. Canadian Journal of Philosophy 15, 187-229.
Travis, Charles 1994. On constraints of generality. Proceedings of the Aristotelian Society 94, 165-188.
Travis, Charles 1997. Pragmatics. In: B. Hale & C. Wright (eds.). A Companion to the Philosophy of Language. Oxford: Blackwell, 87-107.
Tversky, Amos & Daniel Kahneman 1974. Judgment under uncertainty. Heuristics and biases. Science 185(4157), 1124-1131.
Ward, Gregory 2004. Equatives and deferred reference. Language 80, 262-289.
Ward, Gregory, Richard Sproat & Gail McKoon 1991. A pragmatic analysis of so-called anaphoric islands. Language 67, 439-474.
Wheeler, Samuel 1972. Attributives and their modifiers. Noûs 6, 310-334.
Williamson, Timothy 1992. Vagueness and ignorance. Proceedings of the Aristotelian Society. Supplementary Volume 66, 145-162.
Williamson, Timothy 1994. Vagueness. London: Routledge.
Williamson, Timothy 1997. Imagination, stipulation and vagueness. Philosophical Issues 8, 215-228.
Zwicky, Arnold & Jerrold Sadock 1975. Ambiguity tests and how to fail them. In: J. P. Kimball (ed.). Syntax and Semantics, vol. 4. New York: Academic Press, 1-36.
Zwicky, Arnold & Jerrold Sadock 1984. A reply to Martin on ambiguity. Journal of Semantics 3, 249-256.

Christopher Kennedy, Chicago, IL (USA)

24. Semantic underspecification

1. Introduction
2. The domain of semantic underspecification
3. Approaches to semantic underspecification
4. Motivation
5. Semantic underspecification and the syntax-semantics interface
6. Further processing of underspecified representations
7. References

Abstract

This article reviews semantic underspecification, which has emerged over the last three decades as a technique to capture several readings of an ambiguous expression in one single representation by deliberately omitting the differences between the readings in the semantic descriptions. After classifying the kinds of ambiguity to which underspecification can be applied, important properties of underspecification formalisms will be discussed that can be used to distinguish subgroups of these formalisms. The remainder of the article then presents various motivations for the use of underspecification, and expounds the derivation and further processing of underspecified semantic representations.

Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 535-574
1. Introduction

Underspecification is defined as the deliberate omission of information from linguistic descriptions to capture several alternative realisations of a linguistic phenomenon in one single representation. Underspecification emerged in phonology (see Steriade 1995 or Harris 2007 for an overview), where it was used e.g. for values of features that need not be specified because they can be predicted independently, e.g., by redundancy rules or by phonological processes. The price for this simplification, however, is additional layers or stages in phonological processes/representations, which resurfaces in most approaches that use underspecification in semantics.

In the 1980s, underspecification was adopted in semantics. Here the relevant linguistic phenomenon is meaning; thus, underspecified representations are intended to capture whole sets of different meanings in one representation. Since this applies only to sets of meanings that correspond to the readings of a linguistic expression, semantic underspecification emerges as a technique for the treatment of ambiguity. (Strictly speaking, underspecification could be applied to semantic indefiniteness in general, which also encompasses vagueness, see Pinkal 1995 and article 23 (Kennedy) Ambiguity and vagueness. But since underspecification focusses almost exclusively on ambiguity, vagueness will be neglected.) While underspecification is not restricted to expressions with systematically related sets of readings (as opposed to homonyms), it is in practice applied to such expressions only. The bulk of the work in semantic underspecification focusses on scope ambiguity.

In natural language processing, underspecification is endorsed to keep semantic representations of ambiguous expressions tractable and to avoid unnecessary disambiguation steps; a completely new use of underspecification emerged in hybrid processing, where it provides a common format for the results of deep and shallow processing. Underspecification is also used in syntax and discourse analysis to obtain compact representations whenever several structures can be assigned to a specific sentence (Marcus, Hindle & Fleck 1983; Rambow, Weir & Shanker 2001; Muskens 2001) or discourse, respectively (Asher & Fernando 1999; Duchier & Gardent 2001; Schilder 2002; Egg & Redeker 2008; Regneri, Egg & Koller 2008).

This article gives an overview of underspecification techniques in semantics. First the range of phenomena in semantics to which underspecification (formalisms) can be applied is sketched in section 2. Section 3 outlines important properties of underspecification formalisms which distinguish different subgroups of these formalisms. Various motivations for using underspecification in semantics are next outlined in section 4. The remaining two sections focus on the derivation of underspecified semantic representations by a suitable syntax-semantics interface (section 5) and on the further processing of these representations (section 6).

2. The domains of semantic underspecification

Before introducing semantic underspecification in greater detail, ambiguous expressions that are in principle amenable to a treatment in terms of semantic underspecification will be classified according to two criteria. These criteria compare the readings of these
(4) Every researcher of a company saw most samples.

(5) [Every man]_i read a book he_i liked.

(6) Every linguist attended a conference, and every computer scientist did, too.

The subject in (4) exhibits nested quantification, where one quantifier-introducing DP comprises another one. (4) is challenging because the number of its readings is less than the number of the possible permutations of its quantifiers (3! = 6). The scope ordering that is ruled out in any case is ∀ > most′ > ∃ (Hobbs & Shieber 1987). (See section 3.1. for further discussion of this example; a small illustration of the excluded permutation follows below.) In (5), the anaphoric dependency of a book he liked on every man restricts the quantifier scope ambiguity in that the DP with the anaphor must be in the scope of its antecedent (Reyle 1993). In (6), quantifier scope is ambiguous, but must be the same in both sentences (i.e., if every linguist outscopes a conference, every computer scientist does, too), which yields two readings. In a third reading, a conference receives scope over everything else, i.e., both linguists and computer scientists attend the same conference (Hirschbühler 1982; Crouch 1995; Dalrymple, Shieber & Pereira 1991; Shieber, Pereira & Dalrymple 1996; Egg, Koller & Niehren 2001).

Other scope-bearing items can also enter into scope ambiguity, e.g., negation and modal expressions, as in (7) and (8):

(7) Everyone didn't come. (∀ > ¬ or ¬ > ∀)

(8) A unicorn seems to be in the garden. (∃ > seem or seem > ∃)

Such cases can also be captured by underspecification. For instance, Minimal Recursion Semantics (Copestake et al. 2005) describes them by underspecifying the scope of the quantifiers and fixing the other scope-bearing items scopally. But cases of scope ambiguity without quantifiers show that underspecifying quantifier scope only is not general enough. For instance, cases of 'neg raising' (Sailer 2006) like (9) have a reading denying that John believes that Peter will come, and one attributing to John the belief that Peter will not come:

(9) John doesn't think Peter will come.

Sailer analyses these cases as a scope ambiguity between the matrix verb and the negation (whose syntactic position is invariably in the matrix clause). Other such examples involve coordinated DPs, like in (10), (11), or (12) (Hurum 1988; Babko-Malaya 2004; Chaves 2005b):

(10) I want to marry Peggy or Sue.

(11) Every man and every woman solved a puzzle.

(12) Every lawyer and his secretary met.
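To make the reading count for (4) concrete, the following sketch (my own illustration in Python, not part of the formalisms discussed here; the quantifier labels are assumptions) enumerates the 3! = 6 scope permutations and filters out the single ordering excluded by Hobbs & Shieber (1987):

    # Enumerate scope orderings for (4) 'Every researcher of a company saw
    # most samples' and filter the excluded ordering (an illustration,
    # not code from Hobbs & Shieber 1987).
    from itertools import permutations

    QUANTIFIERS = ("forall", "most", "exists")  # every, most, a

    def admissible(order):
        # 'a company' originates inside the restriction of 'every
        # researcher'; if it is outscoped by the universal quantifier, it
        # sits in that restriction and can no longer interact scopally
        # with 'most', so 'most' cannot intervene between the two.
        return order != ("forall", "most", "exists")

    readings = [o for o in permutations(QUANTIFIERS) if admissible(o)]
    print(len(readings))  # 5: only five of the six permutations survive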
However, the definition of semantically and syntactically homogeneous ambiguity also includes cases where the readings consist of the same building blocks, but differ in that some of the readings exhibit more than one instance of specific building blocks. This kind of semantically and syntactically homogeneous ambiguity shows up in the ellipsis in (17). Its two readings 'John wanted to greet everyone that Bill greeted' and 'John wanted to greet everyone that Bill wanted to greet' differ in that there is one instance of the semantic contribution of the matrix verb want in the first reading, but two instances in the second reading (Sag 1976):

(17) John wanted to greet everyone that Bill did.

This is due to the fact that the pro-form did is interpreted in terms of a suitable preceding VP (its antecedent), and that there are two such suitable VPs in (17), viz., wanted to greet everyone that Bill did and greet everyone that Bill did. ((17) is a case of antecedent-contained deletion, see Shieber, Pereira & Dalrymple 1996 and Egg, Koller & Niehren 2001 for underspecified accounts of this phenomenon; see also article 70 (Reich) Ellipsis.)

Another example of this kind of semantically and syntactically homogeneous ambiguity is the Afrikaans example (18) (Sailer 2004). Both the inflected form of the matrix verb wou 'wanted' and the auxiliary het in the subordinate clause introduce a past tense operator. But these examples have three readings:

(18) Jan wou gebel het.
     Jan want.PAST called have
     'Jan wanted to call/Jan wants to have called/Jan wanted to have called.'

The readings can be analysed (in the order given in (18)) as (19a-c). That is, the readings comprise one or two instances of the past tense operator (a small illustration of this reading count follows at the end of this subsection):

(19) a. PAST(want′(j, ^call′(j)))
     b. want′(j, ^PAST(call′(j)))
     c. PAST(want′(j, ^PAST(call′(j))))

Finally, the criterion 'syntactically and semantically homogeneous' as defined in this subsection will be compared to similar classes of ambiguity from the literature. Syntactic and semantic homogeneity is sometimes referred to as structural ambiguity. But this term is itself ambiguous in that it is sometimes used in the broader sense of 'semantically homogeneous' (i.e., syntactically homogeneous or not). But then it would also encompass the group of semantically but not syntactically homogeneous ambiguities discussed in the next subsection.

The group of semantically and syntactically homogeneous ambiguities coincides by and large with Bunt's (2007) 'structural semantic ambiguity' class, excepting ambiguous compounds like math problem and the collective/distributive ambiguity of quantifiers, both of which are syntactically but not semantically homogeneous: Different readings of a compound each instantiate an unspecific semantic relation between the components in a unique way. Similarly, distributive and collective readings of a quantifier are distinguished in the semantics by the presence or absence of a distributive or collective operator, e.g., Link's (1983) distributive D-operator (see article 46 (Lasersohn) Mass nouns and plurals).
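The way the two past tense tokens of (18) yield exactly the three readings in (19) can be made concrete with a small enumeration. The sketch below is my own illustration, not Sailer's (2004) analysis: each of the two PAST tokens either surfaces as a distinct operator or merges with the other one, and at least one token must survive:

    # Enumerate the readings of (18): each of the two PAST tokens either
    # surfaces as a separate operator or merges with the other token (an
    # illustration of the reading count, not Sailer's 2004 formalism).
    from itertools import product

    def readings():
        out = set()
        for outer_past, inner_past in product((True, False), repeat=2):
            if not (outer_past or inner_past):
                continue  # the two tokens cannot both disappear
            body = "call'(j)"
            if inner_past:
                body = f"PAST({body})"   # PAST inside the complement of want
            body = f"want'(j, ^{body})"
            if outer_past:
                body = f"PAST({body})"   # PAST above want
            out.add(body)
        return out

    for r in sorted(readings()):
        print(r)
    # PAST(want'(j, ^PAST(call'(j))))  = (19c)
    # PAST(want'(j, ^call'(j)))        = (19a)
    # want'(j, ^PAST(call'(j)))        = (19b)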
The description (28) can then be read as follows: The fragment at the top consists of a hole only, i.e., we do not yet know what the described representations look like. However, since the relation R relates this hole and the right and the left fragment, they are both part of these representations; only the order is open. Finally, the holes in both the right and the left fragment are related to the bottom fragment in terms of R, i.e., the bottom fragment is in the scope of either quantifier. The only semantic representations compatible with this description are (27a-b).

To derive the described readings from such a constraint (its solutions), the relation R between holes and fragments is monotonically strengthened until all the holes are related to a fragment, and all the fragments except the one at the top are identified with a hole (this is called 'plugging' in Bos 2004). In (28), one can strengthen R by adding the pair consisting of the hole in the left fragment and the right fragment. Here the relation between the hole in the universal fragment and the bottom fragment in (28) is omitted:

(29) □
     |
     ∀x.woman′(x) → □
     |
     ∃y.man′(y) ∧ □
     |
     love′(x, y)

Identifying the hole-fragment pairs in R in (29) then yields (27a), one of the solutions of (28). The other solution (27b) can be derived by first adding to R the pair consisting of the hole in the right fragment and the left fragment.

Underspecification formalisms that implement scope in this way comprise Underspecified Discourse Representation Theory (UDRT; Reyle 1993; Reyle 1996; Frank & Reyle 1995), Minimal Recursion Semantics (MRS; Copestake et al. 2005), the Constraint Language for Lambda Structures (CLLS; Egg, Koller & Niehren 2001), the language of Dominance Constraints (DC; subsumed by CLLS; Althaus et al. 2001), Hole Semantics (HS; Bos 1996; Bos 2004; Kallmeyer & Romero 2008), and Logical Description Grammar (Muskens 2001). Koller, Niehren & Thater (2003) show that expressions of HS can be translated into expressions of DC and vice versa; Fuchss et al. (2004) describe how to translate MRS expressions into DC expressions. Player (2004) claims that this is due to the fact that UDRT, MRS, CLLS, and HS are the same 'modulo cosmetic differences'; however, his comparison does not pertain to CLLS but to the language of dominance constraints.

Scope relations like the one between a quantifying DP and the verb it is an argument of can also be expressed in terms of suitable variables. This is implemented e.g. in the Underspecified Semantic Description Language (USDL; Pinkal 1996; Niehren, Pinkal & Ruhrberg 1997; Egg & Kohlhase 1997 present a dynamic version of this language). In USDL, the constraints for (26) are expressed by the equations in (30):

(30) a. X₀ = C₁(every_woman @ λx₁(C₂(love @ x₂ @ x₁)))
     b. X₀ = C₃(a_man @ λx₂(C₄(love @ x₂ @ x₁)))
Here 'every_woman' and 'a_man' stand for the two quantifiers in the semantics of (26), '@' denotes explicit functional application in the metalanguage, and 'λx', λ-abstraction over x. These equations can now be solved by an algorithm like the one in Huet (1975). For instance, for the ∀∃-reading of (26), the variables would be resolved as in (31a-c). This yields (31d), whose right-hand side corresponds to (27a):

(31) a. C₁ = C₄ = λP.P
     b. C₂ = λP.a_man @ λx₂(P)
     c. C₃ = λP.every_woman @ λx₁(P)
     d. X₀ = every_woman @ λx₁(a_man @ λx₂(love @ x₂ @ x₁))

Another way to express such scope relations is used in the version of the Quasi-Logical Form (QLF) in Alshawi & Crouch (1992), viz., list-valued metavariables in semantic representations whose specification indicates quantifier scope. Consider e.g. the (simplified) representation for (26) in (32a), which comprises an underspecified scoping list (the variable _s before the colon). Here the meanings of every woman and a man are represented as complex terms; such terms comprise term indices (+m and +w) and the restrictions of the quantifiers (man and woman, respectively). Specifying this underspecified reading to the reading with wide scope for the universal quantifier then consists in instantiating the variable _s to the list [+w,+m] in (32b), which corresponds to (27a):

(32) a. _s: love(term(+w, ..., woman, ...), term(+m, ..., man, ...))
     b. [+w,+m]: love(term(+w, ..., woman, ...), term(+m, ..., man, ...))

Even though QLF representations seem to differ radically from the ones that use dominance constraints, Lev (2005) shows how to translate them into expressions of Hole Semantics, which is based on dominance relations.

Finally, I will show how Glue Language Semantics (GLS; Dalrymple et al. 1997; Crouch & van Genabith 1999; Dalrymple 2001) handles scope ambiguity. Each lexical item introduces so-called meaning constructors that relate syntactic constituents (I abstract away from details of the interface here) and semantic representations. For instance, for the proper name John, the constructor is 'DP ~ John′', which states that the DP John has the meaning John′ ('~' relates syntactic constituents and their meanings). Such statements can be arguments of linear logic connectives like the conjunction ⊗ and the implication ⊸, e.g., the meaning constructor for love:

(33) ∀X, Y. DP_subj ~ X ⊗ DP_obj ~ Y ⊸ S ~ love′(X, Y)

In prose: Whenever the subject interpretation in a sentence S is X and the object interpretation is Y, then the S meaning is love′(X, Y). That is, these constructors specify how the meanings of smaller constituents determine the meaning of a larger constituent. The implication ⊸ is resource-sensitive: 'A ⊸ B' can be paraphrased as 'use a resource A to derive (or produce) B'. The resource is 'consumed' in this process, i.e., no longer available for further derivations. Thus, from A and A ⊸ B one can deduce B, but no longer
A. For (33), this means that after deriving the S meaning, the two DP interpretations are no longer available for further processes of semantic construction (consumed).

The syntax-semantics interface collects these meaning constructors during the construction of the syntactic structure of an expression. For ambiguous expressions such as (26), the resulting collection is an underspecified representation of its different readings. Representations for the readings of the expression can then be derived from this collection by linear-logic deduction. In the following, the presentation is simplified in that DP-internal semantic construction is omitted and only the DP constructors are given:

(34) a. ∀H, P. (∀x. DP ~ x ⊸ H ~_t P(x)) ⊸ H ~ every′(woman′, P)
     b. ∀G, R. (∀y. DP ~ y ⊸ G ~_t R(y)) ⊸ G ~ a′(man′, R)

The semantics of every woman in (34a) can be paraphrased as follows: Look for a resource of the kind 'use a resource that a DP semantics is x to build the truth-valued (subscript t of ~) meaning P(x) of another constituent H'. Then consume this resource and assume that the semantics of H is every′(woman′, P); here every′ abbreviates the usual interpretation of every. The representation for a man works analogously.

With these constructors for the verb and its arguments, the semantic representation of (26) in GLS is (35d), the conjunction of the constructors of the verb and its arguments. Note that semantic construction has identified the DPs that are mentioned in the three constructors:

(35) a. ∀H, P. (∀x. DP_subj ~ x ⊸ H ~_t P(x)) ⊸ H ~ every′(woman′, P)
     b. ∀G, R. (∀y. DP_obj ~ y ⊸ G ~_t R(y)) ⊸ G ~ a′(man′, R)
     c. ∀X, Y. DP_subj ~ X ⊗ DP_obj ~ Y ⊸ S ~ love′(X, Y)
     d. (35a) ⊗ (35b) ⊗ (35c)

From such conjunctions of constructors, fully specified readings can be derived. For (26), the scope ambiguity is modelled in GLS in that two different semantic representations for the sentence can be derived from (35d). Either derivation starts with choosing one of the two possible specifications of the verb meaning in (35c), which determine the order in which the argument interpretations are consumed:

(36) a. ∀X. DP_subj ~ X ⊸ (∀Y. DP_obj ~ Y ⊸ S ~ love′(X, Y))
     b. ∀Y. DP_obj ~ Y ⊸ (∀X. DP_subj ~ X ⊸ S ~ love′(X, Y))

I will now illustrate the derivation of the ∀∃-reading of (26). The next step uses the general derivation rule (37) and the instantiations in (38):

(37) from A ⊸ B and B ⊸ C one can deduce A ⊸ C

(38) G = S, Y = y, and R = λy.love′(X, y)

From specification (36a) and the object semantics (35b) we then obtain (39a); this then goes together with the subject semantics (35a) to yield (39b), a notational variant of (27a):
(39) a. ∀X. DP_subj ~ X ⊸ S ~ a′(man′, λy.love′(X, y))
     b. every′(woman′, λx.a′(man′, λy.love′(x, y)))

The derivation for the other reading of (26) chooses the other specification (36b) of the verb meaning and works analogously.

3.1.2. A more involved example

After this expository account of the way that the simple ambiguity of (26) is captured in various underspecification formalisms, reconsider the more involved nested quantification in (40) [= (4)], whose constraint is given in (41).

(40) Every researcher of a company saw most samples

(41) [constraint graph] Top hole: □. Fragments: ∃y.company′(y) ∧ □; ∀x.(researcher′(x) ∧ □) → □; most′(sample′, λz.□); of′(x, y); see′(x, z). The top hole dominates the three quantifier fragments; of′(x, y) is dominated by the scope hole of the existential fragment and the restriction hole of the universal fragment; see′(x, z) is dominated by the scope holes of the universal and the most-fragment.

As expounded in section 2.1, not all scope relations of the quantifiers are possible in (40). I assume that (40) has exactly five readings; there is no reading with the scope ordering ∀ > most′ > ∃. (41) is a suitable underspecified representation of (40) in that it has exactly these five readings as solutions. As a first step of disambiguation, we can order the existential and the universal fragment. Giving the former narrow scope yields (42):

(42) [constraint graph] The universal fragment ∀x.(researcher′(x) ∧ □) → □ now outscopes ∃y.company′(y) ∧ □, which sits below the restriction hole of the universal fragment and dominates of′(x, y); most′(sample′, λz.□) remains unordered with respect to the universal fragment, and the scope holes of both dominate see′(x, z).

But once the existential fragment is outscoped by the universal fragment, it can no longer interact scopally with the most- and the see-fragment, because it is part of the restriction of the universal quantifier. That is, there are two readings to be derived from (42), with the most-fragment scoping below or above the universal fragment. This rules out a reading in which most scopes below the universal, but above the existential quantifier:

(43) a. ∀x.(researcher′(x) ∧ ∃y.company′(y) ∧ of′(x, y)) → most′(sample′, λz.see′(x, z))
     b. most′(sample′, λz.∀x.(researcher′(x) ∧ ∃y.company′(y) ∧ of′(x, y)) → see′(x, z))

The second way of fixing the scope of the existential w.r.t. the universal quantifier in (41) gives us (44):
(47) love′(⟨forall x woman′(x)⟩, ⟨exists y man′(y)⟩)

(47) renders the semantics of DPs as terms, i.e., scope-bearing expressions whose scope has not been determined yet. Terms are triples of a quantifier, a bound variable, and a restriction. The set of fully specified representations encompassed by such a representation is then determined by a resolution algorithm. The algorithm 'discharges' terms, i.e., integrates them into the rest of the representation, which determines their scope. (Formally, a term is replaced by its variable, then quantifier, variable, and restriction are prefixed to the resulting expression.) For instance, to obtain the representation (27a) for the '∀∃'-reading of (26) the existential term is discharged first, which yields (48):

(48) ∃y.man′(y) ∧ love′(⟨forall x woman′(x)⟩, y)

Discharging the remaining term then yields (27a); to derive (27b) from (47), one would discharge the universal term first. Such an approach is adopted e.g. in the Core Language Engine version of Moran (1988) and Alshawi (1992).

Hobbs & Shieber (1987) show that a rather involved resolution algorithm is needed to prevent overgeneration in more complicated cases of scope ambiguity, in particular, for nested quantification. Initial representations for nested quantification comprise nested terms, e.g., the representation (49) for (40):

(49) see′(⟨forall x researcher′(x) ∧ of′(x, ⟨exists y company′(y)⟩)⟩, ⟨most z sample′(z)⟩)

Here the restriction on the resolution is that the inner term may never be discharged before the outer one, which in the case of (40) rules out the unwanted 6th possible permutation of the quantifiers. Otherwise, this permutation could be generated by discharging the terms in the order '∃, most′, ∀'. Such resolution algorithms lend themselves to a straightforward integration of preference rules such as 'each outscopes other determiners', see section 6.4.

Other ways of handling nested quantification by restricting resolution algorithms for underspecified representations have been discussed in the literature. First, one could block vacuous binding (even though vacuous binding does not make formulae ill-formed), i.e., request an appropriate bound variable in the scope of every quantifier. In Hobbs & Shieber's (1987) terms, the step from (51), the initial representation for (50), to (52) is blocked, because the discharged quantifier fails to bind an occurrence of a variable y in its scope (the only occurrence of y in its scope is inside a term, hence not accessible for binding). Thus, the unwanted solution (53) cannot be generated:

(50) Every researcher of a company came

(51) come′(⟨forall x researcher′(x) ∧ of′(x, ⟨exists y company′(y)⟩)⟩)

(52) ∃y.company′(y) ∧ come′(⟨forall x researcher′(x) ∧ of′(x, y)⟩)

(53) ∀x.(researcher′(x) ∧ of′(x, y)) → ∃y.company′(y) ∧ come′(x)
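The vacuous-binding filter can be made concrete with a short sketch. The following is my own reconstruction in Python, deliberately simplified; the term encoding and the variable convention are assumptions, not Hobbs & Shieber's actual implementation. A term may only be discharged if the discharged quantifier binds an accessible occurrence of its variable, where occurrences buried inside remaining terms do not count:

    # A simplified reconstruction of the vacuous-binding filter (an
    # illustration; not Hobbs & Shieber's 1987 implementation). Terms are
    # encoded as ("term", quantifier, variable, restriction); variables
    # are strings starting with "?".

    def accessible_vars(expr):
        # Collect variable occurrences, but do not descend into
        # undischarged terms: those occurrences are not accessible.
        if isinstance(expr, tuple):
            if expr[0] == "term":
                return set()
            return set().union(*(accessible_vars(s) for s in expr[1:]))
        return {expr} if isinstance(expr, str) and expr.startswith("?") else set()

    def replace(expr, old, new):
        if expr == old:
            return new
        if isinstance(expr, tuple):
            return tuple(replace(s, old, new) for s in expr)
        return expr

    def discharge(expr, term):
        _tag, quant, var, restriction = term
        scope = replace(expr, term, var)
        if var not in accessible_vars(scope):
            return None  # vacuous binding: blocked, cf. the step (51) -> (52)
        return (quant, var, restriction, scope)

    # (51): come'((forall x researcher'(x) & of'(x, (exists y company'(y)))))
    inner = ("term", "exists", "?y", ("company", "?y"))
    outer = ("term", "forall", "?x",
             ("and", ("researcher", "?x"), ("of", "?x", inner)))
    sentence = ("come", outer)

    print(discharge(sentence, inner))              # None: the step to (52) is blocked
    print(discharge(sentence, outer) is not None)  # True: discharging the outer term first is fine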
In prose: The whole structure (represented by the label l_⊤) consists of a set of labelled DRS fragments (for the semantic contributions of DP, negation, and VP, respectively) that are ordered in a way indicated by a relation ORD. For an underspecified representation of the two readings of (61), the scope relations between l₁ and l₂ are left open in ORD:

(63) ORD = ⟨l_⊤ > l₁, l_⊤ > l₂, scope(l₁) > l₃, scope(l₂) > l₃⟩

Here '>' means 'has scope over', and scope maps a DRS fragment onto the empty DRS box it contains. Fully specified representations for the readings of (61) can then also be expressed in terms of (62). In these cases, ORD comprises in addition to the items in (63) a relation to determine the scope between universal quantifier and negation, e.g., scope(l₁) > l₂ for the reading with wide scope of the universal quantifier.

Another instance of such a 'monostratal' underspecification formalism is the (revised) Quasi-Logical Form (QLF) of Alshawi & Crouch (1992), which uses list-valued metavariables in semantic representations whose specification indicates quantifier scope. See the representation for (26) in (32a). Kempson & Cormack (1981) also assume a single level of semantic representation (higher-order predicate logic) for quantifier scope ambiguities.

3.4. Compositionality

Another distinction between underspecification formalisms centres upon the notion of resource: In most underspecification formalisms, the elements of a constraint show up at least once in all its solutions, in fact, exactly once, except in special cases like ellipses. For instance, in UDRT constraints and their solutions share the same set of DRS fragments, in CLLS (Egg, Koller & Niehren 2001) the relation between constraints and their solutions is defined as an assignment function from node variables (in constraints) to nodes (in the solutions), and in Glue Language Semantics this resource-sensitivity is explicitly encoded in the representations (expressions of linear logic).

Due to this resource-sensitivity, every solution of an underspecified semantic representation of a linguistic expression preserves the semantic contributions of the parts of the expression. If different parts happen to introduce identical semantic material, then each instance must show up in each solution. For instance, any solution to a constraint for (64a) must comprise two universal quantifiers. The contributions of the two DPs may not be conflated in solutions, which rules out that (64a) and (64b) could share a reading 'for every person x: x likes x'.

(64) a. Everyone likes everyone
     b. Everyone likes himself

While this strategy seems natural in that the difference between (64a) and (64b) need not be stipulated by additional mechanisms, there are cases where different instances of the same semantic material seem to merge in the solutions. Reconsider e.g. the case of Afrikaans past tense marking (65) [= (18)] in Sailer (2004). This example has two tense markers and three readings. Sailer points out that
the two instances of the past tense marker seem to merge in the first and the second reading of (65):

(65) Jan wou gebel het
     Jan want.PAST called have
     'Jan wanted to call/Jan wants to have called/Jan wanted to have called'

A direct formalisation of this intuition is possible if one relates fragments in terms of subexpressionhood, as in the underspecified analyses in the LRS framework (Richter & Sailer 2006; Kallmeyer & Richter 2006). If constraints introduce identical fragments as subexpressions of a larger fragment, these fragments can but need not coincide in the solutions of the constraints. For the readings of (18), the constraint (simplified) is (66a):

(66) a. ⟨[PAST(γ)]_β, [PAST(ζ)]_ε, [want′(j, ^η)]_θ, [call′(j)]_ι, β ≤ α, ε ≤ δ, θ ≤ δ, ι ≤ γ, ι ≤ ζ, ι ≤ η⟩
     b. PAST(want′(j, ^call′(j)))

In prose: The two PAST- and the want-fragments are subexpressions of (relation '≤') the whole expression (as represented by the variables α or δ), while the call-fragment is a subexpression of the arguments of the PAST operators and the intensionalised second argument of want. This constraint describes all three semantic representations in (19); e.g., to derive (66b) [= (19a)], the following equations are needed: α = δ = β = ε, γ = ζ = θ, and η = ι. The crucial equation here is β = ε, which equates two fragments. (Additional machinery is needed to block unwanted readings, see Sailer 2004.)

This approach is more powerful than resource-sensitive formalisms. The price one has to pay for this additional power is the need to control explicitly whether identical material may coincide or not, e.g., for the analyses of negative concord in Richter & Sailer (2006). (See also article 6 (Pagin & Westerstahl) Compositionality.)

3.5. Expressivity and compactness

The standard way to evaluate an underspecification formalism is to apply it to challenging ambiguous examples, e.g., (67) [= (40)], and to check whether there is an expression of the formalism that expresses all and only the attested readings of the example:

(67) Every researcher of a company saw most samples

However, what if these readings are contextually restricted, or if the sentence has only four readings, as claimed by Kallmeyer & Romero (2008) and others, lacking the reading (45b) with the scope ordering ∃ > most′ > ∀?

Underspecification approaches that model scope in terms of partial order between fragments of semantic representations run into problems already with the second of these possibilities: Any constraint set that encompasses the four readings in which most′ has highest or lowest scope also covers the fifth reading (45b) (Ebert 2005). This means
that such underspecification formalisms are not expressive in the sense of König & Reyle (1999) or Ebert (2005), since they cannot represent arbitrary subsets of the readings of an ambiguous expression. The formalisms are of different expressivity, e.g., approaches that model quantifier scope by lists (such as Alshawi 1992) are less expressive than those that use dominance relations, or scope lists together with an explicit ordering of list elements as in Fox & Lappin's (2005) Property Theory with Curry Typing.

Fully expressive is the approach of Koller, Regneri & Thater (2008), which uses Regular Tree Grammars (RTGs) for scope underspecification. Rules of these grammars expand nonterminals into tree fragments. For instance, the rule S → f(A, B) expands S into a tree whose mother is labelled by f, and whose children are the subtrees to be derived by expanding the nonterminals A and B. Koller, Regneri & Thater (2008) show that dominance constraints can be translated into RTGs, e.g., the constraint (68) [= (41)] for the semantics of (40) is translated into (69).

(68) [constraint graph, identical to (41)]

(69) {1-5} → ∃comp({2-5})         {1-4} → ∃comp({2-4})
     {1-5} → ∀res({1-2},{4-5})    {1-2} → ∃comp({2})
     {1-5} → most({1-4})          {2-4} → ∀res({2},{4})
     {2-5} → ∀res({2},{4-5})      {4-5} → most({4})
     {2-5} → most({2-4})          {2} → of
     {1-4} → ∀res({1-2},{4})      {4} → see

In (69), the fragments of (68) are addressed by numbers: 1, 3, and 5 are the fragments for the existential, universal, and most-DP, respectively, and 2 and 4 are the fragments for of and see. All nonterminals correspond to parts of constraints; they are abbreviated as sequences of fragments. For instance, {2-5} corresponds to the whole constraint except the existential fragment. Rules of the RTG specify on the right-hand side the root of the partial constraint introduced on the left-hand side; for instance, the first rule expresses wide scope of a company over the whole sentence. The RTG (69) yields the same five solutions as (68).

In (69), the reading ∃ > most′ > ∀ can be excluded easily, by removing the production rule {2-5} → most({2-4}): This leaves only one expansion rule for {2-5}. Since {2-5} emerges only as child of ∃comp with widest scope, only ∀res can be the root of the tree below widest-scope ∃comp. This shows that RTGs are more expressive than dominance constraints. (In more involved cases, restricting the set of solutions can be less simple, however, which means that RTGs can get larger if specific solutions are to be excluded.)

This last observation points to another property of underspecification formalisms that is interdependent with expressivity, viz., compactness: A (sometimes tacit) assumption is
of the sentence constituent. This gives the desired flexibility to derive scopally different semantic representations like in (27) from uniform syntactic structures like (2). The approach of Woods (1967, 1978) works similarly: Semantic contributions of DPs are collected separately from the main semantic representation; they can be integrated into this main representation immediately or later.

Another approach of this kind is Steedman (2007). Here non-universal quantifiers and their scope with respect to universal quantifiers are modelled in terms of Skolem functions. (See Kallmeyer & Romero 2008 for further discussion of this strategy.) These functions can have arguments for variables bound by universal quantifiers to express the fact that they are outscoped by these quantifiers. Consider e.g. the two readings of (26) in Skolem notation:

(70) a. ∀x.woman′(x) → man′(sk₁) ∧ love′(x, sk₁) ('one man for all women')
     b. ∀x.woman′(x) → man′(sk₂(x)) ∧ love′(x, sk₂(x)) ('a possibly different man per woman')

For the derivation of the different readings of a scopally underspecified expression, Steedman uses underspecified Skolem functions, which can be specified at any point in the derivation w.r.t. their environment, viz., the n-tuple of variables bound by universal quantifiers so far. For (26), the semantics of a man would be represented by λQ.Q(skolem′(man′)), where skolem′ is a function from properties P and environments E to generalised Skolem terms like f(E), where P holds of f(E). The term λQ.Q(skolem′(man′)) can be specified at different steps in the derivation, with different results: Immediately after the DP has been formed, specification returns a Skolem constant like sk₁ in (70a), since the environment is still empty. After combining the semantics of the DPs and the verb, the environment is the 1-tuple with the variable x bound by the universal quantifier of the subject DP, and specification at that point yields a Skolem term like sk₂(x).

This sketch of competing approaches to the syntax-semantics interface shows that the functionality of this interface (or, an attempt to uphold it in spite of semantically and syntactically homogeneous ambiguous expressions) can be a motivation for underspecification: Functionality is preserved for such an expression directly in that there is a function from its syntactic structure to its underspecified semantic representation that encompasses all its readings.

4.2. Ambiguity and negation

Semantic underspecification also helps avoid problems with disjunctive representations of the meaning of ambiguous expressions that show up under negation: Negating an ambiguous expression is intuitively interpreted as the disjunction of the negated expressions, i.e., one of the readings of the expression is denied. However, if the meaning of the expression itself is modelled as the disjunction of its readings, the negated expression emerges as the negation of the disjunction, which is equivalent to the conjunction of the negated readings, i.e., every reading of the expression is denied, which runs counter to intuitions. For instance, for (26) such a semantic representation can be abbreviated as (71), which turns into (72) after negation:
What is more, a complete disambiguation may not even be necessary. In these cases, postponing ambiguity resolution, and resolving ambiguity only on demand, makes NLP systems more efficient. For instance, scope ambiguities are often irrelevant for translation; therefore it would be useless to identify the intended reading of such an expression: Its translation into the target language would be ambiguous in the same way again. Therefore e.g. the Verbmobil project (machine translation of spontaneous spoken dialogue; Wahlster 2000) used a scopally underspecified semantic representation (Schiehlen 2000).

The analyses of concrete NLP systems show clearly that combinatorial explosion is a problem for NLP that suggests the use of underspecification (pace Player 2004). The large numbers of readings that are attributed to linguistic expressions are due to the fact that, first, the number of scope-bearing constituents per expression is underestimated (there are many more such constituents besides DPs, e.g., negation, modal verbs, quantifying adverbials like three times or again), and, second and much worse, spurious ambiguities come in during syntactic and semantic processing of the expressions. For instance, Koller, Regneri & Thater (2008) found that 5% of the representations in the Rondane Treebank (underspecified MRS representations of sentences from the domain of Norwegian tourist information, distributed as part of the English Resource Grammar, Copestake & Flickinger 2000) have more than 650,000 solutions; the record holder is the rather innocuous looking (74) with about 4.5 × 10¹² scope readings:

(74) Myrdal is the mountain terminus of the Flåm rail line (or Flåmsbana) which makes its way down the lovely Flåm Valley (Flåmsdalen) to its sea-level terminus at Flåm.

The median number of scope readings per sentence is 56 (Koller, Regneri & Thater 2008), so, short of applying specific measures to eliminate spurious ambiguities (see section 6.2), combinatorial explosion definitely is a problem for semantic analysis in NLP.

In recent years, underspecification has turned out to be very useful for NLP in another way, viz., in that underspecified semantics provides an interface bridging the gap between deep and shallow processing. To combine the advantages of both kinds of processing (accuracy vs. robustness and speed), both can be combined in NLP applications (hybrid processing). The results of deep and shallow syntactic processing can straightforwardly be integrated on the semantic level (instead of combining the results of deep and shallow syntactic analyses). An example of an architecture for hybrid processing is the 'Heart of Gold' developed in the project 'DeepThought' (Callmeier et al. 2004).

Since shallow syntactic analyses provide only a part of the information to be gained from deep analysis, the semantic information derivable from the results of a shallow parse (e.g., by a part-of-speech tagger or an NP chunker) can only be a part of the one derived from the results of a deep parse. Underspecification formalisms can be used to model this kind of partial information as well. For instance, deep and shallow processing may yield different results with respect to argument linking: NP chunkers (as opposed to systems of deep processing) do not relate verbs and their syntactic arguments, e.g., experiencer and patient in (75). Any semantic analysis based on such a chunker will thus fail to identify individuals in NP and verb semantics as in (76):
(75) Max saw Mary

(76) named(x₁, Max), see(x₂, x₃), named(x₄, Mary)

Semantic representations of different depths must be compatible in order to combine results from parallel deep and shallow processing or to transform shallow into deep semantic analyses by adding further pieces of information. Thus, the semantic representation formalism must be capable of separating the semantic information from different sources appropriately. For instance, information on argument linking should be listed separately; thus, a full semantic analysis of (75) should look like (77) rather than (78). Robust MRS (Copestake 2003) is an underspecification formalism that was designed to fulfill this demand (a small sketch of this kind of merging follows after Tab. 24.1):

(77) named(x₁, Max), see(x₂, x₃), named(x₄, Mary), x₁ = x₂, x₃ = x₄

(78) named(x₁, Max), see(x₁, x₄), named(x₄, Mary)

4.4. Semantic construction

Finally, underspecification formalisms turn out to be interesting from the perspective of semantic construction in general, independently of the issue of ambiguity. This interest is based on two properties of these formalisms, viz., their portability and their flexibility.

First, underspecification formalisms do not presuppose a specific syntactic analysis (which would do a certain amount of preprocessing for the mapping from syntax to semantics, like the mapping from surface structure to Logical Form in Generative Grammar). Therefore the syntax-semantics interface can be defined in a very transparent fashion, which makes the formalisms very portable in that they can be coupled with different syntactic formalisms. Tab. 24.1 lists some of the realised combinations of syntactic and semantic formalisms.

Second, the flexibility of the interfaces that are needed to derive underspecified representations of ambiguous expressions is also available for unambiguous cases that pose a challenge for any syntax-semantics interface. For instance, semantic construction for the modification of modifiers and indefinite pronouns like everyone is a problem, because the types of functor (semantics of the modifier) and argument (semantics of the modified expression) do not fit: For instance, in (79) the PP semantics is a function from properties to properties, while the semantics of the pronoun as well as that of the whole modification structure are sets of properties.

Tab. 24.1: Realised couplings of underspecification formalisms and syntax formalisms

        HPSG                     LFG                           (L)TAG
MRS     Copestake et al. (2005)  Oepen et al. (2004)           Kallmeyer & Joshi (1999)
GLS     Asudeh & Crouch (2002)   Dalrymple (2001)              Frank & van Genabith (2001)
UDRT    Frank & Reyle (1995)     van Genabith & Crouch (1999)  Cimiano & Reyle (2005)
HS      Chaves (2002)                                          Kallmeyer & Joshi (2003)
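To illustrate the division of labour in (77), the following sketch (my own illustration in Python; the list encoding is an assumption, not the actual Robust MRS format of Copestake 2003) keeps the bare predications from a shallow analysis separate from the linking equalities contributed by a deep parse, and resolves the equalities only when both sources are combined:

    # Merging shallow predications with separately listed argument-linking
    # constraints, in the spirit of (77) (an illustration, not the actual
    # Robust MRS format).
    shallow = [("named", "x1", "Max"), ("see", "x2", "x3"), ("named", "x4", "Mary")]
    linking = [("x1", "x2"), ("x3", "x4")]  # contributed by the deep parse only

    def apply_linking(predications, equalities):
        canon = {}  # maps a variable to its canonical representative

        def find(v):
            while v in canon:
                v = canon[v]
            return v

        for a, b in equalities:
            ra, rb = find(a), find(b)
            if ra != rb:
                canon[rb] = ra
        return [tuple(find(arg) if arg.startswith("x") else arg for arg in p)
                for p in predications]

    print(apply_linking(shallow, linking))
    # [('named', 'x1', 'Max'), ('see', 'x1', 'x3'), ('named', 'x3', 'Mary')]
    # i.e., a notational variant of (78)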
the main DP fragment. The top fragments (holes that determine the scope of the DP, because the top fragment of a constituent always dominates all its other fragments) of DP, determiner, and NP are identical. ('SSI' indicates that it is a rule of the syntax-semantics interface; ⟦X⟧ stands for the main and ⌈X⌉ for the top fragment of a constituent X.) The main fragment of a VP (of a sentence) emerges by applying the main verb (VP) fragment to the main fragment of the object (subject) DP. The top fragments of the verb (VP) and its DP argument are identical to the one of the emerging VP (S):

(81) (SSI) [_VP V DP] ⇒ ⟦VP⟧: ⟦V⟧(⟦DP⟧); ⌈VP⌉ = ⌈V⌉ = ⌈DP⌉

(82) (SSI) [_S DP VP] ⇒ ⟦S⟧: ⟦VP⟧(⟦DP⟧); ⌈S⌉ = ⌈DP⌉ = ⌈VP⌉

We assume that for all lexical entries, the main fragments are identical to the standard semantic representation (e.g., for every, we get ⟦Det⟧: λQλP.∀x.Q(x) → P(x)), and the top fragments are holes dominating the main fragments. If in unary projections like the one of man from N to N̄ and NP main and top fragments are merely inherited, the semantics of a man emerges as (83):

(83) ⌈DP⌉: □
         |
     ∃y.man′(y) ∧ □
         |
     ⟦DP⟧: y

The crucial point is the decision to let the bound variable be the main fragment in the DP semantics. The intermediate DP fragment between top and main fragment is ignored in further processes of semantic construction. Combining (83) with the semantics of the verb yields (84):

(84) ⌈VP⌉: □
         |
     ∃y.man′(y) ∧ □
         |
     ⟦VP⟧: love′(y)

Finally, the semantics of every woman, which is derived in analogy to (83), is combined with (84) through rule (82). According to this rule, the two top fragments are identified and the two main fragments are combined by functional application into the main S fragment, but the two intermediate fragments, which comprise the two quantifiers, are not addressed at all, and hence remain dangling in between. The result is the desired dominance diamond:
566 V. Ambiguity and vagueness Hurum's (1988) algorithm, the CLE resolution algorithm (Moran 1988;Alshawi 1992), and Chaves' (2003) extension of Hole Semantics detect specific cases of equivalent solutions (e.g., when one existential quantifier immediately dominates another one) and block all but one of them.The blocking is only effective once the solutions are enumerated. In contrast, Roller & Thater (2006) and Roller, Regneri & Thater (2008) present algorithms to reduce spurious ambiguities that map underspecified representations on (more restricted) underspecified representations. 6.3. Reasoning with underspecified representations Sometimes fully specified information can be deduced from an underspecified semantic representation. For instance, if Amdlic is a woman, then (26) allows us to conclude that she loves a man, because this conclusion is valid no matter which reading of (26) is chosen. For UDRT (Konig & Reyle 1999; Reyle 1992; Reyle 1993; Reyle 1996) and APL (Jaspars & van Eijck 1996), there are calculi for such reasoning with underspecified representations. Van Deemter (1996) discusses different kinds of consequence relations for this reasoning. 6.4. Integration of preferences In many cases of scope ambiguity, the readings are not on a par in that some are more preferred than others. Consider e.g. a slight variation of (26), here the 3V-reading is preferred over the V3-reading: (86) A woman loves every man One could integrate these preferences into underspecified representations of scopally ambiguous expressions to narrow down the number of its readings or to order the generation of solutions (Alshawi 1992). 6.4.1. Kinds of preferences The preferences discussed in the literature can roughly be divided into three groups. The first group are based on syntactic structure, starting with Johnson-Laird (1969) and Lakoff's (1971) claim that surface linear order or precedence introduces a preference for wide scope of the preceding scope-bearing item (but see e.g.Villalta 2003 for experimental counterevidence). Precedence can be interpreted in terms of a syntactic configuration such as c-command (e.g., VanLehn 1978), since in a right-branching binary phrase-marker preceding constituents c-command the following ones. However, these preferences are not universally valid: Kurtzman & MacDonald (1993) report a clear preference for wide scope of the embedded DP in the case of nested quantification as in (87). Here the indefinite article precedes (and c-commands) the embedded DP, but the V3-reading is nevertheless preferred: (87) I met a researcher from every university Hurum (1988) and VanLehn (1978) make the preference of scope-bearing items to take scope outside the constituent they are directly embedded in also dependent on the
6.3. Reasoning with underspecified representations

Sometimes fully specified information can be deduced from an underspecified semantic representation. For instance, if Amélie is a woman, then (26) allows us to conclude that she loves a man, because this conclusion is valid no matter which reading of (26) is chosen. For UDRT (König & Reyle 1999; Reyle 1992; Reyle 1993; Reyle 1996) and APL (Jaspars & van Eijck 1996), there are calculi for such reasoning with underspecified representations. Van Deemter (1996) discusses different kinds of consequence relations for this reasoning.

6.4. Integration of preferences

In many cases of scope ambiguity, the readings are not on a par, in that some are more preferred than others. Consider e.g. a slight variation of (26), where the ∃∀-reading is preferred over the ∀∃-reading:

(86) A woman loves every man

One could integrate these preferences into underspecified representations of scopally ambiguous expressions to narrow down the number of their readings or to order the generation of solutions (Alshawi 1992).

6.4.1. Kinds of preferences

The preferences discussed in the literature can roughly be divided into three groups. The first group is based on syntactic structure, starting with Johnson-Laird's (1969) and Lakoff's (1971) claim that surface linear order or precedence introduces a preference for wide scope of the preceding scope-bearing item (but see e.g. Villalta 2003 for experimental counterevidence). Precedence can be interpreted in terms of a syntactic configuration such as c-command (e.g., VanLehn 1978), since in a right-branching binary phrase-marker preceding constituents c-command the following ones. However, these preferences are not universally valid: Kurtzman & MacDonald (1993) report a clear preference for wide scope of the embedded DP in the case of nested quantification as in (87). Here the indefinite article precedes (and c-commands) the embedded DP, but the ∀∃-reading is nevertheless preferred:

(87) I met a researcher from every university

Hurum (1988) and VanLehn (1978) make the preference of scope-bearing items to take scope outside the constituent they are directly embedded in also dependent on the category of that constituent (e.g., it is much stronger for items inside PPs than for items inside infinitival clauses). The scope preference algorithm of Gambäck & Bos (1998) gives scope-bearing nonheads (complements and adjuncts) in binary-branching syntactic structures immediate scope over the respective head.

The second group of preferences focuses on grammatical functions and thematic roles. Functional hierarchies that indicate a preference to take wide scope have been proposed by Ioup (1975) (88a) and VanLehn (1978) (88b):

(88) a. topic > deep and surface subject > deep subject or surface subject > indirect object > prepositional object > direct object
b. preposed PP, topic NP > subject > (complement in) sentential or adverbial PP > (complement in) verb phrase PP > object

While Ioup combines thematic and functional properties in her hierarchy (by including the notion of 'deep subject'), Pafel (2005) distinguishes grammatical functions (only subject and sentential adverb) and thematic roles (strong and weak patienthood) explicitly. There is a certain amount of overlap between structural preferences and the functional hierarchies, at least in a language like English: here DPs higher on the functional hierarchy also tend to c-command DPs lower on the hierarchy, because they are more likely to surface as subjects.

The third group of preferences addresses the quantifiers (or, rather, the determiners expressing them) themselves. Ioup (1975) and VanLehn (1978) introduce a hierarchy of determiners:

(89) each > every > all > most > many > several > some (plural) > a few

CLE incorporates such preference rules, too (Moran 1988; Alshawi 1992), e.g., the rule that each outscopes other determiners, and that negation is outscoped by some and outscopes every. Some of these preferences can be put down to a more general preference for logically weaker interpretations, in particular, the tendency of universal quantifiers to outscope existential ones (recall that the ∀∃-reading of sentences like (26) is weaker than the ∃∀-reading; VanLehn 1978; Moran 1988; Alshawi 1992). Similarly, scope of the negation above every and below some returns existential statements, which are weaker than the (unpreferred) alternative scopings (universal statements) in that they do not make a claim about the whole domain. Pafel (2005) lists further properties, among them focus and discourse binding (whether a DP refers to an already established set of entities, as e.g. in few of the books as opposed to few books).
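To see how such preferences could be used to rank the solutions of an underspecified representation, consider the following Python sketch, which combines the determiner hierarchy in (89) with linear precedence. All numeric weights are invented for the illustration (tuned here so that precedence outweighs the hierarchy, as in (86)); they have no empirical status.

    from collections import namedtuple
    from itertools import permutations

    DP = namedtuple("DP", "text det")
    HIERARCHY = ["each", "every", "all", "most", "many",
                 "several", "some", "a few"]            # cf. (89)

    def strength(det):
        return len(HIERARCHY) - HIERARCHY.index(det)

    def score(scoping, surface):
        # Wider scope earns a larger weight; precedence is weighted
        # more heavily than the determiner hierarchy (factor 6).
        total = 0
        for pos, dp in enumerate(scoping):
            width = len(scoping) - pos
            total += width * strength(dp.det)
            total += 6 * width * (len(surface) - surface.index(dp))
        return total

    surface = [DP("a woman", "some"), DP("every man", "every")]   # (86)
    for scoping in sorted(permutations(surface),
                          key=lambda s: -score(s, surface)):
        print(" > ".join(dp.text for dp in scoping),
              score(scoping, surface))

With these weights, the ∃∀ scoping a woman > every man wins narrowly (41 vs. 40), mimicking in miniature the interaction of precedence and the determiner hierarchy discussed next.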
6.4.2. Interaction of preferences

It has been argued that the whole range of quantifier scope effects can only be accounted for in terms of an interaction of different principles. Fodor (1982) and Hurum (1988) assume an interaction between linear precedence and the determiner hierarchy, which is corroborated by experimental results of Filik,

Asher, Nicholas & Alex Lascarides 2003. Logics of Conversation. Cambridge: Cambridge University Press.
Asudeh, Ash & Richard Crouch 2002. Glue semantics for HPSG. In: F. van Eynde, L. Hellan & D. Beermann (eds.). Proceedings of the 8th International Conference on Head-Driven Phrase Structure Grammar. Stanford, CA: CSLI Publications, 1-19.
Babko-Malaya, Olga 2004. LTAG semantics of NP-coordination. In: Proceedings of the 7th International Workshop on Tree Adjoining Grammars and Related Formalisms. Vancouver, 111-117.
Bierwisch, Manfred 1983. Semantische und konzeptuelle Repräsentation lexikalischer Einheiten. In: R. Růžička & W. Motsch (eds.). Untersuchungen zur Semantik. Berlin: Akademie Verlag, 61-99.
Bierwisch, Manfred 1988. On the grammar of local prepositions. In: M. Bierwisch, W. Motsch & I. Zimmermann (eds.). Syntax, Semantik und das Lexikon. Berlin: Akademie Verlag, 1-65.
Bierwisch, Manfred & Ewald Lang (eds.) 1987. Grammatische und konzeptuelle Aspekte von Dimensionsadjektiven. Berlin: Akademie Verlag.
Blackburn, Patrick & Johan Bos 2005. Representation and Inference for Natural Language. A First Course in Computational Semantics. Stanford, CA: CSLI Publications.
Bos, Johan 1996. Predicate logic unplugged. In: P. Dekker & M. Stokhof (eds.). Proceedings of the 10th Amsterdam Colloquium. Amsterdam: ILLC, 133-142.
Bos, Johan 2004. Computational semantics in discourse. Underspecification, resolution, and inference. Journal of Logic, Language and Information 13, 139-157.
Brants, Thorsten, Wojciech Skut & Hans Uszkoreit 2003. Syntactic annotation of a German newspaper corpus. In: A. Abeillé (ed.). Treebanks. Building and Using Parsed Corpora. Dordrecht: Kluwer, 73-88.
Bronnenberg, Wim, Harry Bunt, Jan Landsbergen, Remko Scha, Wijnand Schoenmakers & Eric van Utteren 1979. The question answering system PHLIQA1. In: L. Bolc (ed.). Natural Language Question Answering Systems. London: Macmillan, 217-305.
Bunt, Harry 2007. Underspecification in semantic representations. Which technique for what purpose? In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 3. Amsterdam: Springer, 55-85.
Callmeier, Ulrich, Andreas Eisele, Ulrich Schäfer & Melanie Siegel 2004. The DeepThought core architecture framework. In: M.T. Lino et al. (eds.). Proceedings of the 4th International Conference on Language Resources and Evaluation (= LREC). Lisbon: European Language Resources Association, 1205-1208.
Chaves, Rui 2002. Principle-based DRTU for HPSG. A case study. In: Proceedings of the 1st International Workshop on Scalable Natural Language Understanding (= ScaNaLu). Heidelberg: EML.
Chaves, Rui 2003. Non-redundant scope disambiguation in underspecified semantics. In: B. ten Cate (ed.). Proceedings of the 8th ESSLLI Student Session. Vienna, 47-58.
Chaves, Rui 2005a. DRT and underspecification of plural ambiguities. In: Proceedings of the 6th International Workshop in Computational Semantics. Tilburg: Tilburg University, 78-89.
Chaves, Rui 2005b. Underspecification and NP coordination in constraint-based grammar. In: F. Richter & M. Sailer (eds.). Proceedings of the ESSLLI'05 Workshop on Empirical Challenges and Analytical Alternatives to Strict Compositionality. Edinburgh: Heriot-Watt University, 38-58.
Cimiano, Philipp & Uwe Reyle 2005. Talking about trees, scope and concepts. In: H. Bunt, J. Geertzen & E. Thijsse (eds.). Proceedings of the 6th International Workshop in Computational Semantics. Tilburg: Tilburg University, 90-102.
Cooper, Robin 1983. Quantification and Syntactic Theory. Dordrecht: Reidel.
Copestake, Ann & Dan Flickinger 2000. An open-source grammar development environment and broad-coverage English grammar using HPSG. In: Proceedings of the 2nd International Conference on Language Resources and Evaluation (= LREC). Athens: European Language Resources Association, 591-600.
Copestake, Ann, Dan Flickinger, Carl Pollard & Ivan Sag 2005. Minimal recursion semantics. An introduction. Research on Language and Computation 3, 281-332.
Crouch, Richard 1995. Ellipsis and quantification. A substitutional approach. In: Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Dublin: Association for Computational Linguistics, 229-236.
Crouch, Richard & Josef van Genabith 1999. Context change, underspecification and the structure of Glue Language derivations. In: M. Dalrymple (ed.). Semantics and Syntax in Lexical Functional Grammar. The Resource Logic Approach. Cambridge, MA: The MIT Press, 117-189.
Dalrymple, Mary 2001. Lexical Functional Grammar. New York: Academic Press.
Dalrymple, Mary, John Lamping, Fernando Pereira & Vijay Saraswat 1997. Quantifiers, anaphora, and intensionality. Journal of Logic, Language and Information 6, 219-273.
Dalrymple, Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higher-order unification. Linguistics & Philosophy 14, 399-452.
van Deemter, Kees 1996. Towards a logic of ambiguous expressions. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 203-237.
Dölling, Johannes 1995. Ontological domains, semantic sorts and systematic ambiguity. International Journal of Human-Computer Studies 43, 785-807.
Duchier, Denys & Claire Gardent 2001. Tree descriptions, constraints and incrementality. In: H. Bunt, R. Muskens & E. Thijsse (eds.). Computing Meaning, vol. 2. Dordrecht: Kluwer, 205-227.
Ebert, Christian 2005. Formal Investigations of Underspecified Representations. Ph.D. dissertation. King's College, London.
Egg, Markus 2004. Mismatches at the syntax-semantics interface. In: S. Müller (ed.). Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar. Stanford, CA: CSLI Publications, 119-139.
Egg, Markus 2005. Flexible Semantics for Reinterpretation Phenomena. Stanford, CA: CSLI Publications.
Egg, Markus 2006. Anti-Ikonizität an der Syntax-Semantik-Schnittstelle. Zeitschrift für Sprachwissenschaft 25, 1-38.
Egg, Markus 2007. Against opacity. Research on Language and Computation 5, 435-455.
Egg, Markus & Michael Kohlhase 1997. Dynamic control of quantifier scope. In: P. Dekker, M. Stokhof & Y. Venema (eds.). Proceedings of the 11th Amsterdam Colloquium. Amsterdam: ILLC, 109-114.
Egg, Markus, Alexander Koller & Joachim Niehren 2001. The constraint language for lambda structures. Journal of Logic, Language and Information 10, 457-485.
Egg, Markus & Gisela Redeker 2008. Underspecified discourse representation. In: A. Benz & P. Kühnlein (eds.). Constraints in Discourse. Amsterdam: Benjamins, 117-138.
van Eijck, Jan & Manfred Pinkal 1996. What do underspecified expressions mean? In: R. Cooper et al. (eds.). Building the Framework. FraCaS Deliverable D15, section 10.1. Edinburgh: University of Edinburgh, 264-266.
Filik, Ruth, Kevin Paterson & Simon Liversedge 2004. Processing doubly quantified sentences. Evidence from eye movements. Psychonomic Bulletin and Review 11, 953-959.
Fodor, Janet 1982. The mental representation of quantifiers. In: S. Peters & E. Saarinen (eds.). Processes, Beliefs, and Questions. Dordrecht: Reidel, 129-164.
Fox, Chris & Shalom Lappin 2005. Underspecified interpretations in a Curry-typed representation language. Journal of Logic and Computation 15, 131-143.
Frank, Annette & Josef van Genabith 2001. GlueTag. Linear logic based semantics for LTAG - and what it teaches us about LFG and LTAG. In: M. Butt & T. Holloway King (eds.). Proceedings of the LFG '01 Conference. Stanford, CA: CSLI Publications, 104-126.
Frank, Annette & Uwe Reyle 1995. Principle-based semantics for HPSG. In: Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Dublin: Association for Computational Linguistics, 9-16.
Fuchss, Ruth, Alexander Koller, Joachim Niehren & Stefan Thater 2004. Minimal recursion semantics as dominance constraints. Translation, evaluation, and analysis. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (= ACL). Barcelona: Association for Computational Linguistics, 247-254.
Gambäck, Björn & Johan Bos 1998. Semantic-head based resolution of scopal ambiguities. In: Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics (= ACL). Montreal: Association for Computational Linguistics, 433-437.
Gawron, Jean Mark & Stanley Peters 1990. Anaphora and Quantification in Situation Semantics. Menlo Park, CA: CSLI Publications.
van Genabith, Josef & Richard Crouch 1999. Dynamic and underspecified semantics for LFG. In: M. Dalrymple (ed.). Semantics and Syntax in Lexical Functional Grammar. The Resource Logic Approach. Cambridge, MA: The MIT Press, 209-260.
Harris, John 2007. Representation. In: P. de Lacy (ed.). The Cambridge Handbook of Phonology. Cambridge: Cambridge University Press, 119-137.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Hendriks, Herman 1993. Studied Flexibility. Categories and Types in Syntax and Semantics. Amsterdam: ILLC.
Hirschbühler, Paul 1982. VP deletion and across the board quantifier scope. In: J. Pustejovsky & P. Sells (eds.). Proceedings of the North Eastern Linguistics Society (= NELS) 12. Amherst, MA: GLSA, 132-139.
Hobbs, Jerry & Stuart Shieber 1987. An algorithm for generating quantifier scoping. Computational Linguistics 13, 47-63.
Hobbs, Jerry, Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69-142.
Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7-28.
Hoeksema, Jack 1985. Categorial Morphology. New York: Garland.
Huet, Gérard P. 1975. A unification algorithm for typed λ-calculus. Theoretical Computer Science 1, 27-57.
Hurum, Sven 1988. Handling scope ambiguities in English. In: Proceedings of the 2nd Conference on Applied Natural Language Processing (= ANLP). Morristown, NJ: Association for Computational Linguistics, 58-65.
Ioup, Georgette 1975. Some universals for quantifier scope. In: J. Kimball (ed.). Syntax and Semantics 4. New York: Academic Press, 37-58.
Jaspars, Jan & Jan van Eijck 1996. Underspecification and reasoning. In: R. Cooper et al. (eds.). Building the Framework. FraCaS Deliverable D15, section 10.2, 266-287.
Johnson-Laird, Philip 1969. On understanding logically complex sentences. Quarterly Journal of Experimental Psychology 21, 1-13.
Kallmeyer, Laura & Aravind K. Joshi 1999. Factoring predicate argument and scope semantics. Underspecified semantics with LTAG. In: P. Dekker (ed.). Proceedings of the 12th Amsterdam Colloquium. Amsterdam: ILLC, 169-174.
Kallmeyer, Laura & Aravind K. Joshi 2003. Factoring predicate argument and scope semantics. Underspecified semantics with LTAG. Research on Language and Computation 1, 3-58.
Kallmeyer, Laura & Frank Richter 2006. Constraint-based computational semantics. A comparison between LTAG and LRS. In: Proceedings of the 8th International Workshop on Tree-Adjoining Grammar and Related Formalisms. Sydney: Association for Computational Linguistics, 109-114.
Kallmeyer, Laura & Maribel Romero 2008. Scope and situation binding in LTAG using semantic unification. Research on Language and Computation 6, 3-52.
Keller, William 1988. Nested Cooper storage. The proper treatment of quantification in ordinary noun phrases. In: U. Reyle & Ch. Rohrer (eds.). Natural Language Parsing and Linguistic Theories. Dordrecht: Reidel, 432-447.
Kempson, Ruth & Annabel Cormack 1981. Ambiguity and quantification. Linguistics & Philosophy 4, 259-309.
Koller, Alexander, Joachim Niehren & Stefan Thater 2003. Bridging the gap between underspecification formalisms. Hole semantics as dominance constraints. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (= EACL). Budapest: Association for Computational Linguistics, 195-202.
Koller, Alexander, Michaela Regneri & Stefan Thater 2008. Regular tree grammars as a formalism for scope underspecification. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (= ACL). Columbus, OH: Association for Computational Linguistics, 218-226.
Koller, Alexander & Stefan Thater 2005. The evolution of dominance constraint solvers. In: Proceedings of the ACL Workshop on Software. Ann Arbor, MI: Association for Computational Linguistics, 65-76.
Koller, Alexander & Stefan Thater 2006. An improved redundancy elimination algorithm for underspecified descriptions. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (= ACL). Sydney: Association for Computational Linguistics, 409-416.
König, Esther & Uwe Reyle 1999. A general reasoning scheme for underspecified representations. In: H.J. Ohlbach & U. Reyle (eds.). Logic, Language and Reasoning. Essays in Honour of Dov Gabbay. Dordrecht: Kluwer, 1-28.
Kratzer, Angelika 1998. Scope or pseudoscope? Are there wide-scope indefinites? In: S. Rothstein (ed.). Events and Grammar. Dordrecht: Kluwer, 163-196.
Kurtzman, Howard & Maryellen MacDonald 1993. Resolution of quantifier scope ambiguities. Cognition 48, 243-279.
Lakoff, George 1971. On generative semantics. In: D. Steinberg & L. Jakobovits (eds.). Semantics. An Interdisciplinary Reader in Philosophy, Linguistics and Psychology. Cambridge: Cambridge University Press, 232-296.
Larson, Richard 1998. Events and modification in nominals. In: D. Strolovitch & A. Lawson (eds.). Proceedings of Semantics and Linguistic Theory (= SALT) VIII. Ithaca, NY: CLC Publications, 145-168.
Larson, Richard & Sungeun Cho 2003. Temporal adjectives and the structure of possessive DPs. Natural Language Semantics 11, 217-247.
Lev, Iddo 2005. Decoupling scope resolution from semantic composition. In: H. Bunt, J. Geertzen & E. Thijsse (eds.). Proceedings of the 6th International Workshop on Computational Semantics. Tilburg: Tilburg University, 139-150.
Link, Godehard 1983. The logical analysis of plurals and mass terms. A lattice-theoretic approach. In: R. Bäuerle, Ch. Schwarze & A. von Stechow (eds.). Meaning, Use, and the Interpretation of Language. Berlin: Mouton de Gruyter, 303-323.
Marcus, Mitchell, Donald Hindle & Margaret Fleck 1983. D-theory. Talking about talking about trees. In: Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics (= ACL). Cambridge, MA: Association for Computational Linguistics, 129-136.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and with an introduction by Richmond H. Thomason. New Haven, CT: Yale University Press.
Moran, Douglas 1988. Quantifier scoping in the SRI Core Language Engine. In: Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics (= ACL). Buffalo, NY: Association for Computational Linguistics, 33-40.
Muskens, Reinhard 2001. Talking about trees and truth-conditions. Journal of Logic, Language and Information 10, 417-455.
Nerbonne, John 1993. A feature-based syntax/semantics interface. Annals of Mathematics and Artificial Intelligence 8, 107-132.
Niehren, Joachim, Manfred Pinkal & Peter Ruhrberg 1997. A uniform approach to underspecification and parallelism. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (= ACL). Madrid: Association for Computational Linguistics, 410-417.
Oepen, Stephan, Helge Dyvik, Jan Tore Lønning, Erik Velldal, Dorothee Beermann, John Carroll, Dan Flickinger, Lars Hellan, Janni Bonde Johannessen, Paul Meurer, Torbjørn Nordgård & Victoria Rosén 2004. Som å kapp-ete med trollet? Towards MRS-based Norwegian-English machine translation. In: Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation (= TMI). Baltimore, MD, 11-20.
Pafel, Jürgen 2005. Quantifier Scope in German. Amsterdam: Benjamins.
Park, Jong 1995. Quantifier scope and constituency. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (= ACL). Cambridge, MA: Association for Computational Linguistics, 205-212.
Pinkal, Manfred 1995. Logic and Lexicon. The Semantics of the Indefinite. Dordrecht: Kluwer.
Pinkal, Manfred 1996. Radical underspecification. In: P. Dekker & M. Stokhof (eds.). Proceedings of the 10th Amsterdam Colloquium. Amsterdam: ILLC, 587-606.
Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 1. Dordrecht: Kluwer, 33-55.
Player, Nickie 2004. Logics of Ambiguity. Ph.D. dissertation. University of Manchester.
Poesio, Massimo 1996. Semantic ambiguity and perceived ambiguity. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 159-201.
Poesio, Massimo, Patrick Sturt, Ron Artstein & Ruth Filik 2006. Underspecification and anaphora. Theoretical issues and preliminary evidence. Discourse Processes 42, 157-175.
Pulman, Stephen G. 1997. Aspectual shift as type coercion. Transactions of the Philological Society 95, 279-317.
Rambow, Owen, David Weir & Vijay K. Shanker 2001. D-tree substitution grammars. Computational Linguistics 27, 89-121.
Rapp, Irene & Armin von Stechow 1999. Fast 'almost' and the visibility parameter for functional adverbs. Journal of Semantics 16, 149-204.
Regneri, Michaela, Markus Egg & Alexander Koller 2008. Efficient processing of underspecified discourse representations. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (= ACL), Short Papers. Columbus, OH: Association for Computational Linguistics, 245-248.
Reyle, Uwe 1992. Dealing with ambiguities by underspecification. A first order calculus for unscoped representations. In: M. Stokhof & P. Dekker (eds.). Proceedings of the 8th Amsterdam Colloquium. Amsterdam: ILLC, 493-512.
Reyle, Uwe 1993. Dealing with ambiguities by underspecification. Construction, representation and deduction. Journal of Semantics 10, 123-179.
Reyle, Uwe 1996. Co-indexing labeled DRSs to represent and reason with ambiguities. In: K. van Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 239-268.
Richter, Frank & Manfred Sailer 1996. Syntax für eine unterspezifizierte Semantik. PP-Anbindung in einem deutschen HPSG-Fragment. In: S. Mehl, A. Mertens & M. Schulz (eds.). Präpositionalsemantik und PP-Anbindung. Duisburg: Institut für Informatik, 39-47.
Richter, Frank & Manfred Sailer 2006. Modeling typological markedness in semantics. The case of negative concord. In: S. Müller (ed.). Proceedings of the HPSG 06 Conference. Stanford, CA: CSLI Publications, 305-325.
Robaldo, Livio 2007. Dependency Tree Semantics. Ph.D. dissertation. University of Torino.
Sag, Ivan 1976. Deletion and Logical Form. Ph.D. dissertation. MIT, Cambridge, MA.
Sailer, Manfred 2000. Combinatorial Semantics and Idiomatic Expressions in Head-Driven Phrase Structure Grammar. Ph.D. dissertation. University of Tübingen.
Sailer, Manfred 2004. Past tense marking in Afrikaans. In: C. Meier & M. Weisgerber (eds.). Proceedings of Sinn und Bedeutung 8. Konstanz: University of Konstanz, 233-248.
Sailer, Manfred 2006. Don't believe in underspecified semantics. In: O. Bonami & P. Cabredo Hofherr (eds.). Empirical Issues in Formal Syntax and Semantics, vol. 6. Paris: CSSP, 375-403.
Schiehlen, Michael 2000. Semantic construction. In: W. Wahlster (ed.). Verbmobil. Foundations of Speech-to-Speech Translation. Berlin: Springer, 200-216.
Schilder, Frank 2002. Robust discourse parsing via discourse markers, topicality and position. Natural Language Engineering 8, 235-255.
Schubert, Lenhart & Francis Pelletier 1982. From English to logic. Context-free computation of 'conventional' logical translation. American Journal of Computational Linguistics 8, 26-44.
Shieber, Stuart, Fernando Pereira & Mary Dalrymple 1996. Interaction of scope and ellipsis. Linguistics & Philosophy 19, 527-552.
Steedman, Mark 2000. The Syntactic Process. Cambridge, MA: The MIT Press.
Steedman, Mark 2007. Surface-Compositional Scope-Alternation without Existential Quantifiers. Draft 5.2, September 2007. Ms. Edinburgh, ILCC/University of Edinburgh. http://www.iccs.informatics.ed.ac.uk/~steedman/papers.html, August 25, 2009.
Steriade, Donka 1995. Underspecification and markedness. In: J. Goldsmith (ed.). The Handbook of Phonological Theory. Oxford: Blackwell, 114-174.
de Swart, Henriette 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347-385.
VanLehn, Kurt 1978. Determining the Scope of English Quantifiers. MA thesis. MIT, Cambridge, MA. MIT Technical Report AI-TR-483.
Villalta, Elisabeth 2003. The role of context in the resolution of quantifier scope ambiguities. Journal of Semantics 20, 115-162.
Wahlster, Wolfgang (ed.) 2000. Verbmobil. Foundations of Speech-to-Speech Translation. Berlin: Springer.
Westerståhl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics & Philosophy 21, 635-643.
Woods, William 1967. Semantics for a Question-Answering System. Ph.D. dissertation. Harvard University, Cambridge, MA.
Woods, William 1978. Semantics and quantification in natural language question answering. In: M. Yovits (ed.). Advances in Computers, vol. 17. New York: Academic Press, 2-87.

Markus Egg, Berlin (Germany)

25. Mismatches and coercion

1. Enriched type theories
2. Type coercion
3. Aspect shift and coercion
4. Cross-linguistic implications
5. Conclusion
6. References

Abstract

The principle of compositionality of meaning is the foundation of semantic theory. With function application as the main rule of combination, compositionality requires complex expressions to be interpreted in terms of function-argument structure. Type theory lends itself to an insightful representation of many combinations of a functor and its argument. But not all well-formed linguistic combinations are accounted for within classical type theory as adopted by Montague Grammar. Conflicts between the requirements of a functor and the properties of its arguments are described as type mismatches. Three solutions to type mismatches have been proposed within enriched type theories, namely type raising, type shifting, and type coercion. This article focusses on instances of type coercion. We provide examples, propose lexical and contextual restrictions on type coercion, and discuss the status of coercion as a semantic enrichment operation. The paper includes a special application of the notion of coercion in the domain of tense and aspect. Aspectual coercion affects the relationship between predicative and grammatical aspect. We treat examples from English and Romance languages in a cross-linguistic perspective.
1. Enriched type theories

1.1. Type mismatches

According to the principle of compositionality of meaning, the meaning of a complex whole is a function of the meaning of its composing parts and the way these parts are put together. With function application as the main rule for combining linguistic expressions, compositionality requires that complex expressions be interpreted in terms of function-argument structure. Functors do not combine with just any argument. They look for arguments with particular syntactic and semantic properties. A type mismatch arises when the properties of the argument do not match the requirements of the functor. Such a type mismatch can lead to ungrammaticalities. Consider the ill-formed combination eat laugh, where the transitive verb eat wants to take an object of category NP, and does not combine with the intransitive verb laugh of category VP. In type-theoretical terms, eat is an expression of type (e,(e,t)), which requires an expression of type e as its argument. Given that laugh is of type (e,t), there is a mismatch between the two expressions, so they do not combine by function application, and the combination eat laugh is ungrammatical.

In this article we do not focus on syntactic incompatibilities, but on semantic type mismatches and ways to resolve conflicts between the requirements of a functor and the properties of its argument. Sections 1.2 and 1.3 briefly refer to type raising and type shifting as operations defined within an enriched type-theoretical framework. Section 2 addresses type coercion, as introduced by Pustejovsky (1995). Section 3 treats aspectual coercion within a semantic theory of tense and aspect.

1.2. Type raising

As we saw with the ill-formed combination eat laugh, a type mismatch can lead to ungrammaticalities. Not all type mismatches have such drastic consequences. Within extended versions of type theory, mechanisms of type raising have been formulated that deal with certain type mismatches (cf. Hendriks 1993). Consider the VPs kiss Bill and kiss every boy, which are both well-formed expressions of type (e,t). The transitive verb kiss is an expression of type (e,(e,t)). In the VP kiss Bill, the transitive verb directly combines with the object of type e. However, the object every boy is not of type e, but of type ((e,t),t). The mismatch is resolved by raising the type of the argument of the transitive verb. If we assume that a transitive verb like kiss can take either proper names or generalized quantifiers as its argument, we interpret kiss as an expression of type (e,(e,t)) or of type (((e,t),t),(e,t)). Argument raising allows the transitive verb kiss to combine with the object every boy by function application.
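The bookkeeping behind argument raising can be made explicit in a few lines of code. The following Python sketch is ours, a simple illustration of the type arithmetic rather than of any particular grammar formalism; it encodes types as nested pairs and shows how raising turns the failing combination kiss every boy into a well-typed one.

    E, T = "e", "t"

    def fn(a, b):
        # the functional type (a,b)
        return (a, b)

    def apply_type(functor, argument):
        # Function application: (a,b) applied to a yields b;
        # anything else is a type mismatch.
        if isinstance(functor, tuple) and functor[0] == argument:
            return functor[1]
        raise TypeError(f"type mismatch: {functor} applied to {argument}")

    ET = fn(E, T)              # laugh, man: (e,t)
    TV = fn(E, ET)             # kiss, eat: (e,(e,t))
    GQ = fn(ET, T)             # every boy: ((e,t),t)

    def raise_argument(tv):
        # Argument raising: (e,(e,t)) becomes (((e,t),t),(e,t)).
        assert tv == TV
        return fn(GQ, ET)

    print(apply_type(TV, E))                       # kiss Bill: (e,t)
    try:
        apply_type(TV, GQ)                         # kiss every boy: mismatch
    except TypeError as err:
        print(err)
    print(apply_type(raise_argument(TV), GQ))      # after raising: (e,t)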
1.3. Type shifting

Type raising affects the domain in which the expression is interpreted, but maintains function application as the rule of combination, so it preserves compositionality. For other type mismatches, rules of type shifting have been proposed, in particular in the interpretation of indefinites (Partee 1987). In standard Montague Grammar, NPs have a denotation in the domain of type e (e.g. proper names) and/or type ((e,t),t) (generalized quantifiers). In certain environments, the indefinite seems to behave more like a predicate (type (e,t)) than like an argument. We find the predicative use of indefinites in predicative constructions (Sue is a lawyer, She considers him a fool), in measurement constructions (This baby weighs seven pounds), with certain 'light' verbs (This house has many windows), and in existential constructions (There is a book on the table; see article 69 (McNally) Existential sentences). An interpretation of indefinites as denoting properties makes it possible to treat predicative and existential constructions as functors taking arguments of type (e,t) (see article 85 (de Hoop) Type shifting, and references therein).

The type-shifting approach raises problems for the role of indefinites in the object position of regular transitive verbs, though. Assuming that we interpret the verb eat as an expression of type (e,(e,t)), and the indefinite an apple as an expression of type (e,t), we would not know how to combine the two in eat an apple. Argument raising does not help in this case, for a functor of type (((e,t),t),(e,t)) does not combine with an argument of type (e,t) either. Two solutions present themselves. If we assign transitive verbs two types, the type (e,(e,t)) and the type ((e,t),(e,t)), we accept a lexical ambiguity of all verbs (Van Geenhoven 1998). If widespread lexical ambiguities are not viewed as an attractive solution, the alternative is to develop combinatory rules other than function application. File Change Semantics and Discourse Representation Theory (Heim 1982, Kamp 1981, Kamp & Reyle 1993) follow this route, but not in a strictly type-theoretical setting. The indefinite introduces a variable that is existentially closed at the discourse level if it is not embedded under a quantificational binder. De Swart (2001) shows that we can reinterpret the DRT rule of existential closure in type-theoretical terms. The price we pay is a fairly complex system of closure rules in order to account for monotone increasing, decreasing and non-monotone quantifiers. These complexities favor a restriction of the property-type denotation of indefinites to special contexts, where specific construction rules can be motivated by the particularities of the construction at hand.

2. Type coercion

In Section 1, we saw that enriched type theories have formulated rules of type raising and type shifting to account for well-formed natural language examples that present mismatches in stricter type theories. A third way of resolving type mismatches in natural language has been defined in terms of type coercion, and this route constitutes the focus of this article.

2.1. Type coercion as a strategy to resolve type mismatches

The use of coercion to resolve type mismatches goes back to the work on polysemy by James Pustejovsky (Pustejovsky 1995). Pustejovsky points out that the regular rules of type-theoretic combinatorics are unable to describe sentences such as (1):

(1) a. Mary finished the novel in January 2007. A week later, she started a new book.
b. John wants a car until next week.

Verbs like start and finish in (1a) do not directly take expressions of type e or ((e,t),t), such as the novel or a new book, for they operate on processes or actions rather than on objects, as we see in start reading, finish writing.
As Dowty (1979) points out, the temporal adverbial until next week modifies a hidden or understood predicate in (1b), as in John wants to have a car until next week. The type mismatch in (1) does not lead to ungrammaticalities, but triggers a reinterpretation of the combination of the verb and its object. The hearer fills in the missing process-denoting expression, and interprets finish the novel as 'finish reading the novel' or 'finish writing the novel', and wants a car as 'wants to have a car'. This process of reinterpretation of the argument, triggered by the type mismatch between a functor (here: the verb finish or start) and its argument (here: the object the novel or a new book), is dubbed type coercion by Pustejovsky. Type coercion is a semantic operation that converts an argument to the type expected by the function, where it would otherwise result in a type error. Formally, function application with coercion is defined as in (2) (Pustejovsky 1995: 111), with α the argument and β the functor:

(2) Function application with coercion. If α is of type c, β of type (a,b), then:
(i) if type c = a, then β(α) is of type b,
(ii) if there is a σ ∈ Σα such that σ(α) results in an expression of type a, then β(σ(α)) is of type b,
(iii) otherwise a type error is produced.

If the argument is of the right type to be the input of the functor, function application applies as usual (i). If the argument can be reinterpreted in such a way that it satisfies the input requirements of the function, function application with coercion applies (ii). Σα represents the set of well-defined reinterpretations of α, leading to the type a for σ(α) that is required as the input of the functor β. If such a reinterpretation is not available, the sentence is ungrammatical (iii).

Type coercion is an extremely powerful process. If we assume that any type mismatch between an operator and its argument can be repaired by the hearer simply filling in the understood meaning, we would run the risk of inserting hidden meaningful expressions anywhere, thus admitting random combinations of expressions, and losing any ground for explaining ungrammaticalities. So the process of type coercion must be severely restricted.
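Spelled out procedurally, definition (2) amounts to something like the following Python sketch. It is a toy version under invented names: the mini reinterpretation sets Σα stand in for the qualia-based lexicon discussed in section 2.2 below, and a full account would let context choose among reinterpretations where several are available.

    QUALIA = {
        # Stand-ins for the reinterpretation sets Σα of definition (2).
        "the novel": [("telic", "reading the novel"),
                      ("agentive", "writing the novel")],
        "a car":     [("telic", "having a car")],
    }

    def coerce_apply(functor, expected_type, arg):
        # Clause (i): direct application if the types already match.
        if arg["type"] == expected_type:
            return f"{functor}({arg['expr']})"
        # Clause (ii): apply a reinterpretation σ from the argument's set;
        # here we simply take the first one, ignoring contextual choice.
        for role, event in QUALIA.get(arg["expr"], []):
            return f"{functor}({event})   [coerced via {role} role]"
        # Clause (iii): otherwise a type error is produced.
        raise TypeError(f"{functor} cannot combine with {arg['expr']}")

    novel = {"expr": "the novel", "type": "e"}
    print(coerce_apply("finish", "event", novel))
    # -> finish(reading the novel)   [coerced via telic role]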
2.2. Restrictions on type coercion

There are three important restrictions Pustejovsky (1995) imposes upon type coercion in order to avoid overgeneralization. First, coercion does not occur freely, but must always be triggered as part of the process of function application (clause (ii) of definition 2). In (1), the operators triggering type coercion are the verbs finish, start and want. Outside of the context of such a trigger, the nominal expressions the novel, a new book and a car do not get reinterpreted as processes involving the objects described by these noun phrases (clause (i) of definition 2). Second, type coercion always affects the argument, never the functor. In example (1), this means that the interpretation of finish, start and want is as usual, and only the interpretation of the nominal expressions the novel, a new book and a car is enriched by type coercion. Third, the interpretation process involved in type coercion requires a set of well-defined reinterpretations for any expression α (Σα in clause (ii) of definition 2). Pustejovsky (1995) emphasizes the role of the lexicon in the reinterpretation process. He develops a generative lexicon, in which not only the referential interpretation of lexical items is spelled out, but an extended interpretation is developed in terms of argument structure, extended event structure and qualia structure (involving roles such as the agentive and telic roles). In (1), the relevant qualia of the lexical items novel and book include the fact that these nouns denote artifacts which are produced in particular ways (agentive role: process of writing), and which are used for particular purposes (telic role: process of reading). The hearer uses these extended lexical functions to fill in the understood meaning in examples like (1). The qualia structure is claimed to be part of our linguistic knowledge, so Pustejovsky maintains a strict distinction between lexical and world knowledge. See article 32 (Hobbs) Word meaning and world knowledge for more on the relation between word meaning and world knowledge.

Type coercion with verbs like begin, start, finish is quite productive in everyday language use. Google provided many examples involving home construction as in (3a, b), building or producing things more generally (3c-f), and, by extension, organizing (3g, h):

(3) a. When I became convalescent, father urged Mr. X to begin the house.
b. We will begin the roof on the pavilion on December 1.
c. Today was another day full of hard work as our team stays late hours to finish the car.
d. (about a bird) She uses moss, bits of vegetation, spider webs, bark flakes, and pine needles to finish the cup-shaped nest.
e. Susan's dress is not at any picture, but the lady in the shop gave very good instructions how to finish the dress.
f. Once I finish an album, I'll hit the road and tour. I got a good band together and I really feel pleased.
g. The second step was to begin the House Church. (context: a ministry page)
h. This is not the most helpful way to begin a student-centered classroom.

Examples with food preparation (4a) and consumption (4b) are also easy to find, even though we sometimes need some context to arrive at the desired interpretation (4c).

(4) a. An insomniac, she used to get up at four am and start the soup for dinner, while rising dough for breakfast, and steaming rice for lunch.
b. The portions were absolutely ridiculously huge, even my boyfriend the human garbage disposal couldn't finish his plate.
c. A pirate Jeremy is able to drink up a rum barrel in 10 days. A pirate Amelie needs 2 weeks for that. How much time do they need together to finish the barrel?

Finally, we find a wide range of other constructions, involving exercises, tables, exams, etc. to complete (5a), technical equipment or computer programs to run (5b), lectures to teach (by lecturers) (5c), school programs to attend (by students) (5d), etc.

(5) a. Read the instructions carefully before you begin the exam.
b. Press play to begin the video.
c. Begin the lesson by asking what students have seen in the sky.
d. These students should begin the ancient Greek or Latin sequence now if they have not already done so.

These reinterpretations are clearly driven by Pustejovsky's qualia structure, and are thus firmly embedded in a generative lexicon.
But type coercion can also apply in creative language use, where the context rather than the lexicon drives the reinterpretation process. If the context indicates that Mary is a goat that eats anything she is fed, we could end up with an interpretation of (1a) in which Mary finished eating the novel, and started eating a new book a week later (cf. Lascarides & Copestake 1998). One might be skeptical about the force of contextual pressure. Indeed, a quick Google search revealed no naturally occurring examples of 'begin/finish the book' (as in: a goat beginning/finishing to eat the book). However, there are other examples that suggest a strongly context-dependent application of type coercion:

(6) a. Is it too late to start geranium plants from cuttings? (context: a Q&A page on geraniums; meaning: begin growing geranium plants)
b. (After briefly looking at spin as an angular momentum property, we will begin discussing rotational spectroscopy.) We finish the particle on a sphere, and begin to apply those ideas to rotational spectography. (context: chemistry class schedule; meaning: finish discussing)

The number of such examples of creative language use is fairly low, but in the right context, their meaning is easy to grasp. All in all, type coercion is a quite productive process, and the limits on this process reside in a combination of lexical and contextual information.

2.3. Putting type coercion to work

The contexts in (7) offer another prominent example of type coercion discussed by Pustejovsky (1995):

(7) a. We will need a fast boat to get back in time.
b. John put on a long album during dinner.

Although fast is an adjective modifying the noun boat in (7a), it semantically operates on the action involving the boat, rather than the boat itself. Similarly, the adjective long in (7b) has an interpretation as an event modifier, so it selects the telic role of playing the record.

Pustejovsky's notion of type coercion has been put to good use in other contexts as well. One application involves the interpretation of associative ('bridging') definite descriptions (Clark 1975):

(8) Sara brought an interesting book home from the library. A photo of the author was on the cover.

The author and the cover are new definites, for their referents have not been introduced in previous discourse. The process of accommodation is facilitated by an extended argument structure, which is based on the qualia structure of the noun (Bos, Buitelaar & Mineur 1995), and the rhetorical structure of the discourse (Asher & Lascarides 1998). Accordingly, the most coherent interpretation of (8) is to interpret the author and the cover as referring to the author and the cover of the book introduced in the first sentence.
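A minimal sketch of how an extended argument structure licenses such bridging might look as follows in Python. The association sets are invented stand-ins for a qualia-based lexicon; actual proposals such as Bos, Buitelaar & Mineur (1995) are considerably richer.

    LEXICON = {
        # Hypothetical qualia-derived associations of a noun.
        "book":  {"author", "cover", "title"},
        "house": {"door", "roof"},
    }

    discourse = ["book"]        # antecedents introduced so far

    def resolve_bridge(definite):
        # Prefer the most recent antecedent licensing the association.
        for antecedent in reversed(discourse):
            if definite in LEXICON.get(antecedent, set()):
                return f"the {definite} of the {antecedent}"
        return None             # no licensed association: accommodation fails

    print(resolve_bridge("author"))   # -> the author of the book
    print(resolve_bridge("engine"))   # -> None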
Kluck (2007) applies type coercion to metonymic reinterpretations as in (9), merging Pustejovsky's generative lexicon with insights concerning conceptual blending:

(9) a toy truck, a stone lion

The metonymic interpretation is triggered by a conflict between the semantics of the noun and that of the modifying adjective. In all these examples, we find a shift to an 'image' of the object, rather than an instance of the object itself. The metonymic type coercion in (9) comes close to instances of polysemy like those in (10) that Nunberg (1995) has labelled 'predicate transfer'.

(10) The ham sandwich from table three wants to pay.

This sentence is naturally used in a restaurant setting, where one waiter informs another one that the client seated at table three, who had a ham sandwich for lunch, is ready to pay. Nunberg analyzes this as a mapping from one property onto a new one that is functionally related to it; in this case the mapping from the ham sandwich to the person who had the ham sandwich for lunch. Note that the process of predicate transfer in (10) is not driven by a type mismatch, but by other conflicts in meaning (often driven by selection restrictions on the predicate). This supports the view that reinterpretation must be contextually triggered in general.

Asher & Pustejovsky (2005) use Rhetorical Discourse Representation Theory to describe the influence of context more precisely (see article 37 (Kamp & Reyle) Discourse Representation Theory for more on discourse theories). They point out that in the context of a fairy tale, we have no difficulty ascribing human-like properties to goats that allow us to interpret examples like (11):

(11) The goat hated the film but enjoyed the book.

In a more realistic setting, we might interpret (11) as the goat eating the book, and rejecting the film (cf. section 2.2 above), but in a fictional interpretation, goats can become talking, thinking and reading agents, thereby assuming characteristics that their sortal typing would not normally allow. Thus discourse may override lexical preferences.

Type coercion, metonymic type coercion and predicate transfer can be included in the set of mechanisms that Nunberg (1995) subsumes under the label 'transfer of meaning', and that come into play in creative language use as well as systematic polysemy. Function application with coercion is a versatile notion, which is applied in a wide range of cases where mismatches between a functor and its argument are repaired by reinterpretation of the argument. Coercion restores compositionality in the sense that reinterpretation involves inserting hidden meaningful material that enriches the semantics of the argument, and enables function application. Lexical theories such as those developed by Pustejovsky (1995), Nunberg (1995) and Asher & Pustejovsky (2005), highlighting not only denotational and selective properties, but also functional connections and discourse characteristics, are best suited to capture the systematic relations of polysemy we find in these cases.
The aspectual operator applies (recursively if necessary) to the eventuality description provided by the predicate-argument structure, so in (13b), the perfect maps the ongoing process of writing a book onto its consequent state. Aspectual restrictions and implications for a cross-linguistic semantics of the perfect are discussed in Schmitt (2001) and de Swart (2003, 2007). Article 49 (Portner) Perfect and progressive offers a general treatment of the semantics and pragmatics of the perfect. Tense operates on the output of the aspectual operator (if present), so in (13a), the state of the ongoing process of eating is located in the past, and no assertion about the event reaching its inherent endpoint in the real world is made. The structure in (13a) gives rise to the semantic representation in DRT format in Fig. 25.1.

[s, t, n | t < n, s ○ t, s: PROG(e: Julie eat a fish)]
Fig. 25.1: Julie was eating a fish

According to Fig. 25.1, there is a state of Julie eating a fish in progress, which is located at some time t preceding the speech time n ('now'). The intensional semantics of the Progressive (Dowty 1979, Landman 1992, and others) is hidden in the truth conditions of the Prog operator, and is not our concern in this article (see article 49 (Portner) Perfect and progressive for more discussion).
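The layered composition of eventuality description, grammatical aspect and tense can be mimicked in a small Python sketch. It is ours, with invented representations: the DRS is reduced to a record of referents and conditions ('O' stands for temporal overlap), and the truth conditions of PROG are left opaque, as in the text.

    def PROG(desc):
        # Progressive: maps a dynamic (process/event) description onto
        # the state of that eventuality being in progress.
        if desc["class"] not in {"process", "event"}:
            raise TypeError("Progressive requires a process or event description")
        return {"class": "state", "core": f"PROG(e: {desc['core']})"}

    def PAST(desc):
        # Past tense: locates the outermost eventuality before speech time n.
        return {"universe": ["s", "t", "n"],
                "conditions": ["t < n", "s O t", f"s: {desc['core']}"]}

    eat_a_fish = {"class": "process", "core": "Julie eat a fish"}
    print(PAST(PROG(eat_a_fish)))   # reproduces the conditions of Fig. 25.1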
3.2. Aspectual mismatches and coercion

It is well known that the English Progressive does not take just any eventuality description as its argument. The Progressive describes processes and events in their development, and the combination with a state verb frequently leads to ungrammaticalities (14a, b).

(14) a. *?Bill is being sick/in the garden.
b. *?Julie is having blue eyes.
c. I'm feeding him a line, and he is believing every word.
d. Bill is being obnoxious.

However, the Progressive felicitously combines with a state verb in (14c) (from Michaelis 2003) and (14d). A process of aspectual reinterpretation takes place in such examples, and the verb is conceived as dynamic, describing an active involvement of the agent. The dynamic reinterpretation arises out of the conflict between the aspectual requirements of the Progressive and the aspectual features of the state description. De Swart (1998) and Zucchi (1998) use the notion of aspectual coercion to explain the meaning effects arising from such mismatches. In an extended use of the structure in (12), aspectual reinterpretation triggers the insertion of a hidden coercion operator Csd, mapping a stative description onto a dynamic one:

(15) a. He is believing every word.
[Pres [Prog [Csd [he believe every word]]]]
b. Bill is being obnoxious.
[Pres [Prog [Csd [Bill be obnoxious]]]]

The hidden coercion operator is inserted in the slot for grammatical aspect. In (15), Csd adapts the properties of the state description to the needs of the Progressive. More precisely, the operator Csd reinterprets the state description as a dynamic description, which has the aspectual features that allow it to be an argument of the Progressive operator. In this way, the insertion of a hidden coercion operator resolves the aspectual mismatch between the eventuality description and the aspectual operator in contexts like (15). (15b) gives rise to the semantic representation in DRT in Fig. 25.2:

[s, t, n | t = n, t ○ s, s: PROG(d: Csd(s′: Bill be obnoxious))]
Fig. 25.2: Bill is being obnoxious

The dynamic variable d, obtained by coercion of the state s′, is the input for the Progressive operator. The output of the Progressive is again a state, but the state of an event or process being in progress is more dynamic than the underlying lexical state. The DRT representation contains the coercion operator Csd, with the subscripts indicating the aspectual characterization of the input and output description. The actual interpretation of the coercion operator is pushed into the truth conditions of C, because it is partly dependent on lexical and contextual information (compare also Section 2). The relevant interpretation of Csd in Fig. 25.2 is dynamic:

(16) Dynamic is a function from sets of state eventualities onto sets of dynamic eventualities in such a way that the state is presented as a process or event that the agent is actively involved in.

Note that (14a) and (14b) cannot easily be reinterpreted along the lines of (16), because of the lack of agentivity in the predicate. Following Moens & Steedman (1988), we assume that all mappings between sets of eventualities that are not labelled by overt aspectual operators are free. This implies that all semantically possible mappings that the language allows between sets of eventualities, but that are not the denotation of an overt grammatical aspect operator like the Progressive, the Perfect, etc., are possible values of the hidden coercion operator. Of course, the value of the coercion operator is constrained by the aspectual characteristics of the input eventuality description and the aspectual requirements of the aspectual operator. Thus, the coercion operator in Fig. 25.2 is restricted to being a mapping from sets of states to sets of dynamic eventualities.
3.3. Extension to aspectual adverbials

Grammatical operators like the Progressive or the Perfect are not the only expressions overtly residing in the Aspect slot in the structure proposed in (12). Consider the sentences in (17)-(19), which illustrate that in adverbials combine with event descriptions, and for and until adverbials with processes or states:

(17) in adverbials
a. Jennifer drew a spider in five minutes. (event)
b. #Jennifer ran in five minutes. (process)
c. #Jennifer was sick in five minutes. (state)

(18) for adverbials
a. Jennifer was sick for two weeks. (state)
b. Jennifer ran for five minutes. (process)
c. #Jennifer drew a spider for five minutes. (event)

(19) until adverbials
a. Jennifer was sick until today. (state)
b. Jennifer slept until midnight. (process)
c. #Jennifer drew a spider until midnight. (event)

The aspectual restrictions on in and for adverbials have been noted since Vendler (1957); see article 48 (Filip) Aspectual class and Aktionsart and references therein. For until adverbials, see de Swart (1996) and references therein. A for adverbial measures the temporal duration of a state or a process, whereas an in adverbial measures the time it takes for an event to culminate. Until imposes a right boundary upon a state or a process. Their different semantics is responsible for the aspectual requirements that in, for and until adverbials impose upon the eventuality description with which they combine, as illustrated in (17)-(19).

We focus here on for and in adverbials. Both have been treated as measurement expressions that yield a quantized eventuality (Krifka 1989). Thus to run denotes a set of processes, but to run for five minutes denotes a set of quantized eventualities. Aspectual adverbials can then be treated as operators mapping one set of eventualities onto another (cf. Moens & Steedman 1988). For adverbials map a set of states or processes onto a set of events; in adverbials map a set of events onto a set of events. The mapping approach provides a natural semantics for sentences (17a) and (18a, b). However, the mismatches in (17b, c) and (18c) are marked as pragmatically infelicitous (#), rather than ungrammatical (*). This raises the question whether the aspectual restrictions on in and for adverbials are strictly semantic, or rather loosely pragmatic, in which case an underspecification account might be equally successful (cf. Dölling 2003 for a proposal, and article 24 (Egg) Semantic underspecification for a general discussion of the topic).
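Treating the adverbials as mappings between sets of eventualities invites a direct computational rendering. The sketch below uses invented names and anticipates the coercion operator Ceh introduced in (20) just below: a for adverbial measures homogeneous descriptions, coercing an event description into a process first, while an in adverbial insists on an event.

    def for_adverbial(desc, duration):
        # for-adverbials measure states/processes; an event description
        # is first coerced by Ceh (process/frequentative reinterpretation).
        if desc["class"] == "event":
            desc = {"class": "process", "core": f"Ceh({desc['core']})"}
        return {"class": "event",          # the output is quantized
                "core": f"FOR({duration}, {desc['core']})"}

    def in_adverbial(desc, duration):
        # in-adverbials measure the time an event takes to culminate.
        if desc["class"] != "event":
            raise TypeError("in-adverbial requires an event description")
        return {"class": "event", "core": f"IN({duration}, {desc['core']})"}

    read_a_book = {"class": "event", "core": "Sally read a book"}
    print(for_adverbial(read_a_book, "two hours"))   # cf. (20a) below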
We can maintain the semantic treatment of aspectual adverbials as two different kinds of measurement expressions if we treat the well-formed examples that combine in with states/processes or for with event descriptions as instances of aspectual coercion, as illustrated in (20) and (21):

(20) a. Sally read a book for two hours. (process reading of event)
[Past [for two hours [Ceh [Sally read a book]]]]
b. Jim hit a golf ball into the lake for an hour. (frequentative reading of event)

over eventualities such that the frequentative/habitual state describes a stable property over time of an agent. Note that habitual readings are constrained to iterable events, which imposes constraints upon the predicate-argument structure, as analyzed in de Swart (2006).

The advantage of a coercion analysis over an underspecification approach in the treatment of the aspectual sensitivity of in and for adverbials is that we get a systematic explanation of the meaning effects that arise in the unusual combinations exemplified in (17b, c), (18c), (20) and (21), but not in (17a) and (18a, b). Effects of aspectual coercion have been observed in semantic processing (Piñango et al. 2006 and references therein). Brennan & Pylkkänen (2008) hypothesize that the online effects of aspectual coercion are milder than those of type coercion, because it is easier to shift into another kind of event than to shift an object into an event. Nevertheless, they find an effect with magnetoencephalography, which they take to support the view that aspectual coercion is part of compositional semantics, rather than a context-driven pragmatic enrichment of an underspecified semantic representation, as proposed by Dölling (2003) (cf. also the discussion in section 2.4 above).

3.4. Aspectually sensitive tenses

Aspectual mismatches can arise at all levels in the structure proposed in (12). De Swart (1998) takes the English simple past tense to be aspectually transparent, in the sense that it lets the aspectual characteristics of the eventuality description shine through. Thus, both he was sick and he ate an apple get an episodic reading, and the sentences locate a state and an event in the past respectively. Not all English tenses are aspectually transparent. The habitual interpretation of he drinks or she washes the car (Schmitt 2001, Michaelis 2003) arises out of the interaction of the process/event description and the present tense, but cannot be taken to be the meaning of either, for he drank and she washed the car have an episodic besides a habitual interpretation, and he is sick has just an episodic interpretation. We can explain the special meaning effects if we treat the English simple present as an aspectually restricted tense. It is a present tense in the sense that it describes the eventuality as overlapping in time with the speech time. It is aspectually restricted in the sense that it exclusively operates on states (Schmitt 2001, Michaelis 2003). This restriction is the result of two features of the simple present, namely its characterization as an imperfective tense, which requires a homogeneous eventuality description as its input (cf. also the French Imparfait in section 4.4 below), and the fact that it posits an overlap of the eventuality with the speech time, which requires an evaluation in terms of instants, rather than intervals. States are the only eventualities that combine these two properties.
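In the same style as the earlier sketches, the aspectual restriction of the simple present can be rendered as an operator that accepts only states and repairs dynamic inputs with the hidden operator Cds (again purely illustrative; 'O' marks temporal overlap):

    def simple_present(desc):
        # The simple present operates on states only; a dynamic description
        # is first coerced into a (habitual) state by Cds.
        if desc["class"] != "state":
            desc = {"class": "state", "core": f"Cds(e: {desc['core']})"}
        return {"universe": ["s", "t", "n"],
                "conditions": ["s O t", "t O n", f"s: {desc['core']}"]}

    print(simple_present({"class": "state", "core": "Bill be sick"}))   # episodic
    print(simple_present({"class": "process", "core": "Bill drink"}))   # habitual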
As a result, the interpretation of (23a) is standard, and the sentence gets an episodic interpretation, but the aspectual mismatch in (23b) and (23c) triggers the insertion of a coercion operator Cds, reinterpreting the dynamic description as a stative one (see Fig. 25.4).

(23) a. Bill is sick.
[Present [Bill be sick]]
b. Bill drinks.
[Present [Cds [Bill drink]]]
c. Julie washes the car.
[Present [Cds [Julie wash the car]]]

[s, t, n | s ○ t, t ○ n, s: Cds(e: Julie wash the car)]
Fig. 25.4: Julie washes the car

In (23b) and (23c), the value of Cds is a habitual interpretation. A habitual reading is stative, because it describes a regularly recurring action as a stable property over time (Smith 2005). The contrast between (23a) and (23b, c) confirms the general feature of coercion, namely that reinterpretation only arises if coercion is forced by an aspectual mismatch. If there is no conflict between the aspectual requirements of the functor and the input, no reinterpretation takes place, and we get a straightforward episodic interpretation (23a). Examples (23b) and (23c) illustrate that the reinterpretations that are available through coercion are limited to those that are not grammaticalized by the language (Moens 1987). The overt Progressive blocks the progressive interpretation of sentences like (23b, c) as the result of a diachronic development in English (Bybee, Perkins & Pagliuca 1994: 144).

3.5. Aspectual coercion and rhetorical structure

At first sight, the fact that an episodic as well as a habitual interpretation is available for he drank and she washed the car might seem to be a counterexample to the general claim that aspectual coercion is not freely available:

(24) Bill drank.
a. [Past [Bill drink]] (episodic interpretation)
b. [Past [Cds [Bill drink]]] (habitual interpretation)

(25) Julie washed the car.
a. [Past [Julie wash the car]] (episodic interpretation)
b. [Past [Cds [Julie wash the car]]] (habitual interpretation)

Given that the episodic interpretation spelled out in (24a) and (25a) does not show an aspectual mismatch, we might not expect to find habitual interpretations involving a stative reinterpretation of the dynamic description, as represented in (24b) and (25b), but we do. Such aspectual reinterpretations are not triggered by a sentence-internal aspectual conflict, but by rhetorical requirements of the discourse. Habitual interpretations can be triggered if the sentence is presented as part of a list of stable properties of the agent over time, as in (26):

(26) a. In high school, Julie made a little pocket money by helping out the neighbours. She washed the car, and mowed the lawn.
b. In college, Bill led a wild life. He drank, smoked, and played in a rock band, rather than going to class.

Aspectual coercion may be governed by discourse requirements as in (26), but even then, the shifted interpretation is not free, but must be triggered by the larger context, and more specifically the rhetorical structure of the discourse. An intermediate case is provided by examples in which an adverb in the sentence indicates the rhetorical function of the eventuality description, and triggers aspectual coercion, as in (27):

(27) Suddenly, Jennifer knew the answer. (inchoative reading of state)
[ Past [ C_se [ Jennifer know the answer ]]]

Suddenly only applies to happenings, but know the answer is a state description, so (27) illustrates an aspectual mismatch. The insertion of a hidden coercion operator solves the conflict, so we read the sentence in such a way that the adverb suddenly brings out the inception of the state as the relevant happening in the discourse.

3.6. Key features of aspectual coercion

In this section, we defined aspectual mapping operations with coercion in DRT. Formally, the proposal can be summed up as follows:

(28) Aspectual mappings with coercion in DRT
(i) When a predicate-argument structure S_i is modified by an aspectual operator Asp denoting a mapping from a set of eventualities of aspectual class α to a set of eventualities of aspectual class α′, introduce in the universe of discourse of the DRS a new discourse referent α′ and introduce in the set of conditions a new condition α′: Asp(K_i), where K_i is the sub-DRS contributed by S_i.
(ii) If S_i denotes an eventuality description of aspectual class α, introduce in the universe of discourse of K_i a new discourse referent α, and introduce in the set of conditions of K_i a new condition α: γ, where γ is the denotation of S_i.
(iii) If S_i denotes an eventuality description of aspectual class α″ (α″ ≠ α), and there is a coercion operator C_{α″α} such that C_{α″α}(γ) denotes a set of eventualities of aspectual class α, introduce in the universe of K_i a new discourse referent α, and introduce in the set of conditions of K_i a new condition α: C_{α″α}(γ), where γ is the denotation of S_i.
(iv) If S_i denotes an eventuality description of aspectual class α″ (α″ ≠ α), and there is no coercion operator C_{α″α} such that C_{α″α}(γ) denotes a set of eventualities of aspectual class α, modification of S_i by the aspectual operator Asp is semantically ill-formed.

According to this definition, aspectual coercion only arises when aspectual features of an argument do not meet the requirements of the functor (typically an adverbial expression, an aspectual operator, or a tense operator) (compare clauses ii and iii). As a result, we never find coercion operators that map a set of states onto a set of states, or a set of events onto a set of events, but we always find mappings from one aspectual class to another.
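To make the clauses of (28) concrete, the following minimal Python sketch mimics where a coercion operator gets inserted. It is a toy illustration of ours, not part of de Swart's (1998) formalism: the three class labels, the coercion inventory COERCIONS, and the function apply_aspectual_operator are invented names, and the homogeneous/quantized distinctions of the text are collapsed into three simple classes.

    # Illustrative sketch of definition (28): aspectual mappings with coercion.
    # Aspectual classes and the coercion inventory are simplified stand-ins.

    STATE, PROCESS, EVENT = "state", "process", "event"

    # Hypothetical inventory of coercion operators C_{a''a}, mapping an
    # eventuality description of one class to a description of another class.
    COERCIONS = {
        (EVENT, STATE): "C_ds",     # dynamic -> stative (e.g. habitual reading)
        (PROCESS, STATE): "C_ds",
        (STATE, EVENT): "C_se",     # stative -> dynamic (e.g. inchoative reading)
        (EVENT, PROCESS): "C_eh",   # event -> homogeneous
    }

    def apply_aspectual_operator(asp, input_class, output_class, desc, desc_class):
        """Modify a predicate-argument structure desc of class desc_class by an
        aspectual operator asp: input_class -> output_class, inserting a coercion
        operator on a mismatch (clauses ii-iv of (28))."""
        if desc_class == input_class:                  # clause (ii): direct application
            inner = desc
        elif (desc_class, input_class) in COERCIONS:   # clause (iii): insert coercion
            c = COERCIONS[(desc_class, input_class)]
            inner = f"{c} [ {desc} ]"
        else:                                          # clause (iv): no repair available
            raise ValueError("semantically ill-formed")
        # clause (i): the operator introduces a new eventuality of the output class
        return output_class, f"{asp} [ {inner} ]"

    # The English simple present: requires a state, locates a state (section 3.4).
    print(apply_aspectual_operator("Present", STATE, STATE, "Bill be sick", STATE))
    # -> ('state', 'Present [ Bill be sick ]')                   (23a)
    print(apply_aspectual_operator("Present", STATE, STATE, "Bill drink", EVENT))
    # -> ('state', 'Present [ C_ds [ Bill drink ] ]')            (23b)

Resolving what C_ds actually contributes in context (a habitual reading, an iterative reading, etc.) is the separate interpretive step discussed in sections 3.4 and 3.5; the sketch only models where the operator is inserted.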
[h, t, n | t < n, h ○ t, h: C_eh([e | e: he do his homework])]
Fig. 25.6: Il faisait ses devoirs

out of the conflict between the aspectual requirements of the tense operator and the input eventuality description. The outcome of the coercion process is that state/process descriptions reported in the Passé Simple are event-denoting, and fit into a sequence of ordered events, whereas event descriptions reported in the Imparfait are state-denoting, and do not move the reference time forward.

(38) a. Ils s'aperçurent de l'ennemi. Jacques eut grand peur.
they REFL noticed.PS of the enemy Jacques had.PS great fear
'They noticed the enemy. Jacques became very fearful.'
b. Il se noyait quand l'agent le sauva en le retirant de l'eau.
he REFL drowned.IMP when the officer him saved.PS by him withdrawing.PART from the water
'He was drowning when the officer saved him by dragging him out of the water.'

The examples in (38) are to be compared to those in (30) and (31), where no coercion takes place, but the discourse structures induced by the Passé Simple and the Imparfait are similar. We conclude that the coercion approach maintains Kamp & Rohrer's (1983) insights about the temporal structure of narrative discourse in French, while offering a compositional analysis of the internal structure of the sentence.

Versions of a treatment of the perfective/imperfective contrast in terms of aspectually sensitive tense operators have been proposed for other Romance languages, cf. Schmitt (2001). If we embed the analysis of the perfective and imperfective past tenses in the hierarchical structure proposed in (12), we predict that they always take wide scope over overt aspectual operators, including aspectual adverbials, negation, and quantificational adverbs (iterative adverbs like twice and frequentative adverbs like often). If these expressions are interpreted as mappings between sets of eventualities, we predict that the distribution of perfective/imperfective tenses is sensitive to the output of the aspectual operator. De Swart (1998) argues that eventuality descriptions modified by in and for adverbials are quantized (cf. section 3.3 above), and are thus of the right aspectual class to serve as the input to the Passé Simple in French. She shows that coercion effects arise in the combination of these adverbials with the Imparfait. De Swart & Molendijk (1999) extend the analysis to negation, iteration and frequency in French. Lenci & Bertinetto (2000) treat the perfective/imperfective past tense distribution in sentences expressing iteration and frequency in Italian, and Pérez-Leroux et al. (2007) show that similar meaning effects arise in Spanish. Coercion effects also play a role in the first and second language acquisition of aspect, cf. Montrul & Slabakova (2002), Pérez-Leroux et al. (2007), and references therein.
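On the same toy assumptions as the sketch given after (28), the aspectually sensitive French past tenses of this section can be encoded as operators with fixed input classes, so that mismatching inputs automatically trigger coercion. The encodings below are again our own illustrative simplifications (in particular, the Imparfait's homogeneous input class is reduced to the process label):

    # Hypothetical encodings of the two French past tenses as aspectually
    # sensitive operators, reusing apply_aspectual_operator from above.

    def passe_simple(desc, desc_class):
        # Requires a quantized (event) description; states/processes are
        # coerced into events (e.g. inchoative 'eut grand peur' in (38a)).
        return apply_aspectual_operator("Passe Simple", EVENT, EVENT, desc, desc_class)

    def imparfait(desc, desc_class):
        # Requires a homogeneous description; events are coerced via C_eh
        # (cf. Fig. 25.6).
        return apply_aspectual_operator("Imparfait", PROCESS, PROCESS, desc, desc_class)

    print(passe_simple("Jacques avoir grand peur", STATE))
    # -> ('event', 'Passe Simple [ C_se [ Jacques avoir grand peur ] ]')
    print(imparfait("il faire ses devoirs", EVENT))
    # -> ('process', 'Imparfait [ C_eh [ il faire ses devoirs ] ]')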
5. Conclusion

This article outlined the role that type coercion and aspectual reinterpretation play in the discussion on type mismatches in the semantic literature. This does not mean that the coercion approach has not been under attack, though. Three issues stand out.

The question whether coercion is a semantic enrichment mechanism or a pragmatic inference based on an underspecified semantics has led to an interesting debate in the semantic processing literature. The question is not entirely resolved at this point, but there are strong indications that type coercion and aspectual coercion lead to delays in online processing, supporting a semantic treatment.

As far as aspectual coercion is concerned, the question whether predicative aspect and grammatical aspect are to be analyzed with the same ontological notions (as in the mapping approach discussed here), or require two different sets of tools (as in Smith 1991/1997), is part of an ongoing discussion in the literature. It has been argued that the lexical component of the mapping approach is not fine-grained enough to account for more subtle differences in predicative aspect (Caudal 2005). In English, there is a class of verbs (including read, wash, comb, ...) that easily allow the process reading, whereas others (eat, build, ...) do not (Kratzer 2004). Event decomposition approaches (as in Kempchinsky & Slabakova 2005) support the view that some aspectual operators apply at a lower level than the full predicate-argument structure, which affects the view on predicative aspect developed by Verkuyl (1972/1993). The contrast between English-type languages, where aspectual operators apply at the VP level, and Slavic languages, where aspectual prefixes are interpreted at the V-level, is highly relevant here (Filip & Rothstein 2006).

The sensitivity of coercion operators to lexical and contextual operators implies that the semantic representations developed in this article are not complete without a mechanism that resolves the interpretation of the coercion operator in the context of the sentence and the discourse. So far, we only have a sketch of how this would have to run (Pulman 1997, de Swart 1998, Hamm & van Lambalgen 2005, Asher & Pustejovsky 2005). There is also a more fundamental problem. Most operators we discussed in the aspectual section are sensitive to the stative/dynamic distinction, or the homogeneous/quantized distinction. This implies that the relation established by coercion is not strictly functional, for it leaves us with several possible outputs (Verkuyl 1999). The same issue arises with type coercion. However, spelling out the different readings semantically would bring us back to a treatment in terms of ambiguities or underspecification, and we would lose our grip on the systematicity of the operator-argument relation. Notwithstanding the differences in framework, and various problems with the implementation of the notion of coercion, the idea that reinterpretation effects are real, and require a compositional analysis, seems to have found its way into the literature.

6. References

de Almeida, Roberto 2004. The effect of context on the processing of type-shifting verbs. Brain & Language 90, 249-261.
Asher, Nicholas & Alex Lascarides 1998. Bridging. Journal of Semantics 15, 83-113.
Asher, Nicholas & James Pustejovsky 2005. Word Meaning and Commonsense Metaphysics. Ms. Austin, TX, University of Texas/Waltham, MA, Brandeis University.
non-literal language and a number of major innovations in metaphor theory. Metaphor and metonymy are now understood to be ubiquitous aspects of language, not simply fringe elements. The study of metaphor and metonymy has provided a major source of evidence for Cognitive Linguists to argue against the position that a sharp divide exists between semantics and pragmatics. Mounting empirical data from psychology, especially work by Gibbs, has led many to question the sharp boundary between literal and metaphorical meaning. While distinct perspectives remain among contemporary metaphor theorists, very recent developments in relevance theory and cognitive semantics also show intriguing areas of convergence.

1. Introduction

It is an exaggeration to say that metaphor and metonymy are the tropes that launched 1,000 theories. Still, they are long recognized uses of language that have stimulated considerable debate among language scholars and most recently served as a major impetus for Cognitive Linguistics, a theory of language which challenges the modularity hypothesis and postulates such as a strict divide between semantics and pragmatics (cf. article 88 (Jaszczolt) Semantics and pragmatics). From Aristotle to the 1930's, the study of metaphor and metonymy was primarily seen as part of the tradition of rhetoric. Metaphor in particular was seen as a highly specialized, non-literal, 'deviant' use of language, used primarily for literary or poetic effect. Although scholars developed a number of variations on the theme, the basic analysis that metaphor involved using one term to stand for another remained largely unchanged.

Beginning with Richards (1936) and his insight that the two elements in a metaphor (the tenor and vehicle) are both active contributors to the interpretation, the debate has shifted away from understanding metaphor as simply substituting one term for another or setting up a simple comparison (although this definition is still found in reference books such as the Encyclopedia Britannica). In the main, contemporary metaphor scholars now believe that the interpretations and inferences generated by metaphors are too complex to be explained in terms of simple comparisons. (However, see discussion of resemblance metaphors, below.) Richards' insights stirred debate among philosophers such as Black and Grice. Subsequently, linguists (for example, Grady, Johnson, Lakoff, Reddy, Sperber & Wilson, and Giora) and psychologists (for example, Coulson, Gibbs, and Keysar & Bly) have joined in the discussion. The result has been a sharp increase in interest in metaphor and metonymy. Debate has revolved around issues such as the challenges such non-literal language poses for truth-conditional semantics and pragmatics, for determining the division between literal and non-literal language, and the insights metaphor and metonymy potentially offer to our understanding of the connections between language and cognition. The upshot has been a number of major innovations which include, but are not limited to, a deeper understanding of the relations between the terms in linguistic metaphor, that is, tenor (or target) and vehicle (or source); exploration of the type of knowledge, 'dictionary' versus 'encyclopedic', necessary to account for the interpretation of metaphor; and development of a typology of metaphors, that is, primary, complex, and resemblance metaphor.
Given the recent diversity of scholarly activity, universally accepted definitions of either metaphor or metonymy are difficult to come by. All metaphor scholars recognize certain examples of metaphor, such as 'Ted is the lion of the Senate', and agree that such
examples of metaphor represent, in some sense, non-literal use of language. Scholars holding a more traditional, truth-conditional perspective might offer a definition along the lines of 'use of a word to create a novel class inclusion relationship'. In contrast, conceptual metaphor theorists hold that metaphor is "a pattern of conceptual association" (Grady 2007) resulting in "understanding and experiencing one kind of thing in terms of another" (Lakoff & Johnson 1980: 5). A key difference between perspectives is that the truth-conditional semanticists represent metaphor as a largely word-level phenomenon, while cognitive semanticists view metaphor as reflective of cognitive patterns of organization and processing more generally.

Importantly, even though exemplars of metaphor such as 'Ted is a lion' are readily agreed upon, the prototypicality of such a metaphor is controversial. More traditional approaches tend to treat 'Ted is a lion' as a central example of metaphor; in contrast, conceptual metaphor theory sees it as one of three types of metaphor (a resemblance metaphor) and not necessarily the most prototypical. According to Evans & Green (2006) and Grady (2008), resemblance metaphors represent the less prototypical type of metaphors and reflect a comparison involving cultural norms and stereotypes. A subset of resemblance metaphors, termed image metaphors, are discussed by Lakoff & Turner (1989) as more clearly involving physical comparisons, as in 'My wife's waist is an hourglass'. For cognitive semanticists, uses such as 'heavy' in Stewart's brother is heavy, meaning Stewart's brother is a serious, deep thinker, represent the most prevalent, basic type of metaphor, termed primary metaphor. In contrast, a truth-conditional semanticist (cf. articles 23 (Kennedy) Ambiguity and vagueness and 24 (Egg) Semantic underspecification) would likely classify this use of 'heavy' as a case of literal language use (sometimes referred to as a dead metaphor) involving polysemy and ambiguity.

Not surprisingly, analogous divisions are found in the analyses of metonymy. All semanticists recognize a statement such as 'Brad is just another pretty face' or 'The ham sandwich is ready for the check' as central examples of metonymy. Moreover, there is general agreement that a metonymic relationship involves referring to one entity in terms of a closely associated or conceptually contiguous entity. Nevertheless, as with analyses of metaphor, differences exist in terms of whether metonymy is considered solely a linguistic phenomenon in which one linguistic entity refers to another or whether it is reflective of more general cognition. Cognitive semanticists argue "metonymy is a cognitive process in which one conceptual entity [...] provides mental access to another conceptual entity within the same [cognitive] domain [...]" (Kövecses & Radden 1998: 39).

The chapter is organized as follows: Section 2 will present an overview of the traditional approaches to metaphor which include Aristotle's classic representation as well as Richards', Black's and Davidson's contributions. Section 3 addresses pragmatic accounts of metaphor with particular attention to recent discussions within Relevance theory. Section 4 presents psycholinguistic approaches, particularly Gentner's metaphor-as-career account and Glucksberg's metaphor-as-category account and the more recent "dual reference" hypothesis. Section 5 focuses on cognitive and conceptual accounts.
It provides an overview of Lakoff & Johnson's (1980) original work on conceptual metaphor, later refinements of that work, especially development of the theory of experiential correlation and Grady's typology of metaphors, and conceptual blending theory. Section 6 addresses metonymy.
on metaphor, and hence no systematic patterns of metaphor. Like Black, and in keeping with the long-standing tradition, Davidson believed that metaphor is a lexical phenomenon. Gibbs (1994: 220-221) points out that Davidson's "metaphors-without-meaning" view reflects the centuries-old belief in the importance of an ideal scientific-type language and that Davidson's denial of meaning or cognitive content in metaphor is motivated by a theory-internal need to restrict the study of meaning to truth-conditional sentences.

3. Pragmatic accounts of metaphor

Since their inception in the 1960's, most pragmatic theories have preserved or reinforced this sharp line of demarcation between semantic meanings and pragmatic interpretations in their models of metaphor understanding. In his theory of conversational implicature, Grice (1969/1989, 1975) argued that the non-literal meaning of utterances is derived inferentially from a set of conversational maxims (e.g. be truthful and be relevant). Given the underlying assumption that talk participants are rational and cooperative, an apparent violation of any of those maxims triggers a search for an appropriate conversational implicature that the speaker intended to convey in context (cf. articles 92 (Simons) Implicature and 94 (Potts) Conventional implicature). In this conception of non-literal language understanding, known as the "standard pragmatic model", metaphor comprehension involves a series of inferential steps - analyze the literal meaning of an utterance first, then reject it due to its literal falsity, and then derive an alternative interpretation that fits the context. In Robert is a bulldozer, for instance, the literal interpretation is patently false (i.e. violates the maxim of quality) and this recognition urges the listener to infer that the speaker must have meant something non-literal (e.g., "persistent and does not let obstacles stand in his way in getting things done"). In his speech act theory, Searle (1979) also espoused this standard model and posited an even more elaborate set of principles and steps to be followed in deriving non-literal meanings.

One significant implication of the standard pragmatic view is that metaphors, whose interpretation is indirect and requires conversational implicatures, should take additional time to comprehend over that needed to interpret equivalent literal speech, which is interpreted directly (Gibbs 1994, 2006b). The results of numerous psycholinguistic experiments, however, have shown this result to be highly questionable. In the majority of experiments, listeners/readers were found to take no longer to understand the figurative interpretations of metaphor, metonymy, sarcasm, idioms, proverbs, and indirect speech acts than to understand equivalent literal expressions, particularly if these are seen in realistic linguistic and social contexts (see Gibbs 1994, 2002 for reviews; see also the discussion below of psycholinguistic analyses of metaphor understanding for findings on processing time for conventional versus novel figurative expressions).

Within their increasingly influential framework of relevance theory, Sperber & Wilson (1995; Wilson & Sperber 2002a, 2002b) present an alternative model of figurative language understanding that does not presuppose an initial literal interpretation and its rejection.
Instead, metaphors are processed similarly to literal speech in that interpretive hypotheses are considered in their order of accessibility with the process stopping once the expectation of "optimal relevance" has been fulfilled. A communicative input is optimally relevant when it connects with available contextual assumptions to yield maximal cognitive effects (such as strengthening, revising, and negating such contextual
In the case of Caroline is a princess, the contextually derived ad hoc concept princess* is narrowed, in a particular context, from the full extension of the encoded concept princess to cover only a certain type of princess ("spoiled" and "indulged"), excluding other kinds (e.g. "well-bred", "well-educated", "altruistic", etc.). It is also at the same time a case of broadening in that the ad hoc concept covers a wider category of people who behave in self-centered, spoiled manners, but are non-royalty, and whose prototypical members are coddled girls and young women.

The relevance-theoretic account of figurative language understanding thus pivots on the "literal-loose-metaphorical continuum", whereby no criterion can be found for distinguishing literal, loose, and metaphorical utterances, at least in terms of inferential steps taken to arrive at an interpretation that satisfies the expectation of optimal relevance. What is crucial here is the claim that the fully inferential process of language understanding unfolds equally for all types of utterances and that the linguistically encoded content of an utterance (hence, the "decoded" content of the utterance for the addressee) merely serves as a starting point for inferring the communicator's intended meaning, even when it is a "literal" interpretation. In other words, strictly literal interpretations that involve neither broadening nor narrowing of lexical concepts are derived through the same process of mutually adjusting explicit content (i.e. explicature, as distinct from the decoded content) with implicit content (i.e. implicature), as in loose and metaphorical interpretations of utterances (Sperber & Wilson 2008: 93).

As seen above, this constitutes a radical departure from the standard pragmatic model of language understanding, which gives precedence to the literal over the non-literal in the online construction of the speaker's intended meaning. Relevance theory assumes, instead, that literal interpretations are neither default interpretations to be considered first, nor are they equivalent to the linguistically encoded (and decoded) content of given utterances. The linguistically denoted meaning recovered by decoding is therefore "just one of the inputs to an inferential process which yields an interpretation of the speaker's meaning" (Wilson & Sperber 2002b: 600) and has "no useful theoretical role" in the study of verbal communication (Wilson & Sperber 2002b: 583). Under this genuinely inferential model, "interpretive hypotheses about explicit content and implicatures are developed partly in parallel rather than in sequence and stabilize when they are mutually adjusted so as to jointly confirm the hearer's expectations of optimal relevance" (Sperber & Wilson 2008: 96; see also Wilson & Sperber 2002b). As an illustration of how the inferential process may proceed in literal and figurative use of language, let us look at two examples from Sperber & Wilson (2008).

(1) Peter: For Billy's birthday, it would be nice to have some kind of show.
Mary: Archie is a magician. Let's ask him. (Sperber & Wilson 2008: 90)

In this verbal exchange, the word magician can be assumed to be ambiguous in its linguistic denotation, between (a) someone with supernatural powers who performs magic (magician1), and (b) someone who does magic tricks to amuse an audience (magician2).
In the minimal context given, where two people are talking about a show for a child's birthday party, the second encoded sense is likely to be activated first, under the assumption that Mary's utterance will achieve optimal relevance by addressing Peter's suggestion that they have a show for Billy's birthday party. Since magicians in the sense of magician2 put on magic shows that children enjoy, this assumption, combined with
Frame Semantics, and 86 (Kay & Michaelis) Constructional meaning for broader perspectives on language and cognition within the framework of Cognitive Linguistics).

Moving away from the deviance tradition, Lakoff & Johnson focused on the ubiquity of, often highly conventionalized, metaphor in everyday speech and what this reveals about general human cognition. Rather than placing metaphor on the periphery of language, Lakoff & Johnson argue that it is central to human thought processes and hence central to language. They defined metaphor as thinking about a concept (the target) from one knowledge domain in terms of another domain (the source). In particular, they articulated the notion of embodied meaning, i.e., that human conceptualization is crucially structured by bodily experience with the physical-spatial world; much of metaphor is a reflection of this structuring, by which concepts from abstract, internal or less familiar domains are structured by concepts from the physical/spatial domains. This structuring is held to be asymmetrical. The argument is that many of these abstract or internal domains have not developed language in their own terms; for instance, English has few domain-specific temporal terms and generally represents time in terms of the spatial domain. Thus language from the familiar, accessible, intersubjective domains of the social-spatial-physical is recruited to communicate about the abstract and internal.

This overarching vision is illustrated in Lakoff & Johnson's well known analysis of the many non-spatial, metaphoric meanings of up and down. They note that up and down demonstrate a wide range of meanings in everyday English, a few of which are illustrated below:

(3) a. The price of gas is up/down. (amount)
b. Jane is further up/down in the corporation than most men. (power, status)
c. Scooter is feeling up/down these days. (mood)

Their argument is that all these uses of up and down reflect a coherent pattern of human bodily experience with verticality. The importance of up and down to human thinking is centrally tied to the particularities of the human body; humans have a unique physical asymmetry stemming from the fact that we walk on our hind legs and that our most important organs of perception are located in our heads, as opposed to our feet. Moreover, the up/down pattern cannot be attributed simply to these two words undergoing semantic extension, as the conceptual metaphor is found with most words expressing a vertical orientation:

(4) a. The price of gas is rising/declining.
b. She's in high/low spirits today.
c. She's over/under most men in the corporation.

This analysis explains a consistent asymmetry found in such metaphors; language does not express concepts from social/physical/spatial domains in terms of concepts from abstract or internal domains. For example, concepts such as physical verticality are not expressed in terms of emotions or amount; a sentence such as He seems particularly confident cannot be interpreted to mean something like He seems particularly tall, whereas He's standing tall now does have the interpretation of a feeling of confidence and well-being. For Conceptual Metaphor Theorists, the asymmetry of mappings represents a key, universal constraint on the vast majority of metaphors.
In contrast to other approaches, conceptual metaphor theory tends not to consider metaphoric use of words and phrases on a case-by-case, word-by-word basis, but rather points out broader patterns of use. Thus, a distinction is made between conceptual metaphors, which represent underlying conceptual structure, and metaphoric expressions, which are understood as linguistic reflections of the underlying conceptual structure organized by systematic cross-domain mappings. The correspondences are understood to be mappings across domains of conceptual structure. Cognitive semanticists hold that words do not refer directly to entities or events in the world, but rather they act as access points to conceptual structure. They also advocate an encyclopedic view of word meaning and argue that a lexical item serves as an access point to richly patterned memory (Langacker 1987, 1991, 1999, 2008; also see the discussion of Conceptual Blending below).

Lakoff & Johnson argue that such coherent patterns of bodily experience are represented in memory in what they term 'image schemas', which are multi-modal, multi-faceted, systematically patterned conceptualizations. Some of the best explored image schemas include BALANCE, RESISTANCE, BEGINNING-PATH-END POINT, and MOMENTUM (Gibbs 2006a). Conceptual metaphors draw on these image schemas and can form entrenched, multiple mappings from a source domain to a target domain. Her problems are weighing her down is a metaphorical expression of the underlying conceptual metaphor DIFFICULTIES ARE PHYSICAL BURDENS, which draws on the image-schematic structure representing physical resistance to gravity and momentum. The metaphor maps concepts from the external domain of carrying physical burdens onto the internal domain of mental states. Such underlying conceptual metaphors are believed to be directly accessible without necessarily involving any separate processing of a literal meaning and subsequent inferential processes to determine the metaphorical meaning (Gibbs 1994, 2002). Hence, Conceptual Metaphor Theory questions a sharp division between semantics and pragmatics, rejecting the traditional stance that metaphoric expressions are first processed as literal language and, only when no appropriate interpretation can be found, are then processed using pragmatic principles. The same underlying conceptual metaphor is found in numerous metaphorical expressions, such as He was crushed by his wife's illness and Her heart felt heavy with sorrow.

Recognition of the centrality of mappings is a hallmark of Conceptual Metaphor Theory; Grady (2007: 190), in fact, argues that the construct of cross-domain mapping is "the most fundamental notion of CMT". Additionally, since what is being mapped across domains is cognitive models, the theory holds that the mappings allow systematic access to inferential structure of the source domain, as well as images and lexicon (Coulson 2006). So, the mapping from the domain of physical weight to the domain of mental states includes the information that the effect of carrying the burden increases and decreases with its size or weight, as expressed in the domain of emotional state in phrases such as Facing sure financial ruin, George felt the weight of the world settling on him; Teaching one class is a light work load; and Everyone's spirits lightened as more volunteers joined the organization and the number of all-night work sessions was reduced.
Conceptual metaphor theorists argue that while the conceptual metaphor is entrenched, the linguistic expression is often novel. Coulson (2006) further notes that viewing metaphorical language as a manifestation of the underlying conceptual system offers an explanation for why we consistently find limited but systematic mappings between the source domain and target domain (often termed 'partial projection'). The systematicity stems from the access to higher order inferential structures within the conceptual domains which allows mappings across
physical structure. Moreover, the analysis suggests that while primary metaphors, such as ORGANIZATION IS COMPLEX PHYSICAL STRUCTURE and FUNCTIONING/PERSISTING IS REMAINING ERECT/WHOLE, will be found in most languages, the precise cross-domain mappings or complex metaphors which occur will differ from language to language.

Grady points out that humans regularly observe or experience the co-occurrence of two separable phenomena, such that the two phenomena become closely associated in memory. Grady identifies scores of experiential correlations, of which MORE IS UP is probably the most frequently cited. The argument is that humans frequently observe liquids being poured into containers or objects added to piles of one sort or another. These seemingly simple acts actually involve two discrete aspects which tend to go unrecognized: the addition of some substance and an increase in vertical elevation. Notice that it is possible to differentiate these two phenomena; for instance, one could pour the liquid on the floor, thus adding more liquid to a puddle, without causing an increase in elevation. However, humans are likely to use containers and piles because they offer important affordances involving control and organization which humans find particularly useful. The correlation between increases in amount and increases in elevation is ubiquitous and important for humans and thus forms a well entrenched experiential correlation or mapping between two domains.

Another important experiential correlation Grady identifies is AFFECTION IS WARMTH. He argues that many of the most fundamental experiences during the first few months of life involve the infant experiencing the warmth of a caretaker's body when being held, nursed and comforted. Thus, from the first moments of life, the infant begins to form an experiential correlation between the domain of temperature and the domain of affection. This experiential correlation (or primary metaphor) is reflected in expressions such as warm smile and warm welcome. Conversely, lack of warmth is associated with discomfort and physical distance from the caretaker, as reflected in the expressions cold stare or frigid reception.

The theory of experiential correlation has begun to gain support from a number of related fields. Psycholinguistic evidence comes from work by Zhong & Leonardelli (2008), who investigated the AFFECTION IS WARMTH metaphor and found that subjects report experiencing a sense of physical coldness when interpreting phrases such as 'icy glare'. From the area of child language learning, Chris Johnson (1999) has argued that caretaker speech often reflects experiential correlations such as KNOWING IS SEEING and that children may have no basis for distinguishing between literal and metaphorical uses of a word like see. If caretakers regularly use see in contexts where it can mean either the literal 'perceive visually', as in "Let's see what is in the bag", or the metaphorical 'learn/find out', as in "Let's see what this tastes like", children may develop a sense of the word which conflates literal and metaphorical meanings. Only at some later stage of development do they come to understand that there are two separate uses and go through a process Johnson terms deconflation. Thus, Johnson argues that the conceptual mappings between experiential correlates are reinforced in the child's earliest exposure to language.
Experiential correlation and the theory of primary metaphors were also a major impetus for the "Neural Theory of Language" (Lakoff & Johnson 1999; Narayanan 1999). This is a computational theory which has successfully modeled inferences arising from metaphoric interpretation of language. Experiential correlations are represented in terms of computational 'neural nets'; correlation mappings are treated as neural circuits linking representations of source and target concepts. Consistent with a usage-based, frequency-driven model of language, the neural net model assumes that neural circuits are automatically established when a perceptual and a nonperceptual concept are repeatedly co-activated.
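The co-activation mechanism lends itself to a deliberately simple Hebbian-style sketch. What follows is our own toy illustration of the general idea, not Narayanan's (1999) actual model; the concept labels, learning rate, and entrenchment threshold are all invented for the example.

    # Toy Hebbian-style sketch: a link between a perceptual and a nonperceptual
    # concept strengthens each time the two are co-activated, and a recurrently
    # strengthened link eventually counts as an entrenched "circuit".

    from collections import defaultdict

    LEARNING_RATE = 0.1   # strengthening per co-activation (arbitrary value)
    ENTRENCHED = 0.5      # threshold for treating a link as a circuit (arbitrary)

    links = defaultdict(float)

    def co_activate(perceptual, nonperceptual):
        """Strengthen the link between two concepts that are activated together."""
        links[(perceptual, nonperceptual)] += LEARNING_RATE

    # An infant repeatedly experiences warmth together with affection:
    for _ in range(7):
        co_activate("warmth", "affection")
    co_activate("warmth", "hunger")   # an occasional, non-systematic pairing

    circuits = {pair for pair, weight in links.items() if weight >= ENTRENCHED}
    print(circuits)   # {('warmth', 'affection')} -- only the recurrent pairing entrenches

On this picture, the AFFECTION IS WARMTH correlation is simply the link that frequency of experience has pushed over the entrenchment threshold, which is what makes the model consistent with a usage-based view.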
The theory of experiential correlation moves our understanding of metaphor past explanations based only on comparison or similarity or ad hoc categories. It addresses the conundrum of how it is that much metaphoric expression, such as 'warm smile', involves two domains that have no immediately recognizable similarities. Grady does not claim that all metaphor is based on experiential correlation. In addition to primary metaphors and their related complex metaphors, he posits another type, resemblance metaphor, to account for expressions such as Achilles is a lion. These are considered one-shot metaphors that tend to have limited extensions. They involve mappings based on conventionalized stereotypes, here the western European stereotype that lions are fierce, physically powerful and fearless. Key to this account is Grady's adherence to the tenets of lexical meaning being encyclopedic in nature (thus lion acts as an access point to the conventional stereotypes) and mappings across conceptual domains. Clearly, resemblance metaphors, with their direct link to conventional stereotypes, are expected to be culturally specific, in contrast to primary metaphors, which are predicted to be more universal. Grady (1999b) has analyzed over 15 languages, representing a wide range of historically unrelated languages, for the occurrence of several primary metaphors such as BIG IS IMPORTANT. He found near universal expression of these metaphors in the languages examined.

The work by Grady and his colleagues addresses many of the criticisms of Lakoff & Johnson (1980). By setting up a typology of metaphors, conceptual metaphor theory can account for the occurrence of both universal and language-specific metaphors. Within a specific language, the notion of complex metaphors, based on multiple primary metaphors, offers an explanation for systematic, recurring metaphorical patterns, including a compelling account of partial projection from source to target. The construct of experiential correlation also provides a methodology for distinguishing between the three proposed types of metaphor. Primary metaphors and their related complex metaphors are strictly asymmetrical in their mappings and ultimately grounded in ubiquitous, fundamental human experiences with the world. In contrast, resemblance metaphors are not strictly asymmetrical. So we find not only Achilles is a lion but also The lion is the king of beasts. Moreover, metaphor theorists have not been able to find a basic bodily experience that would link lions (or any fierce animals) with people. Finally, identifying the three categories allows a coherent account of embodied metaphor without having to entirely abandon the insight that certain metaphors, such as 'My wife's waist is an hourglass' or 'Achilles is a lion', are based on some sort of comparison.

Lakoff & Johnson (1980) also hypothesized that certain idioms, such as 'spill the beans', were transparent (or made sense) to native speakers because they are linked to underlying conceptual structures, such as the CONTAINER image schema.
Keysar & Bly (1995, 1999) question the claim that such transparent idioms can offer insights into cognition. They argue that the native speakers' perception of the relative transparency of an idiom's meaning is primarily a function of what native speakers believe the meaning of the idiom to be. This is essentially a claim for a language-based, rather than conceptually based, meaning of idioms and metaphors. Keysar & Bly carried out a set of experiments in which they presented subjects with attested English idioms which could be analyzed as transparent, but which had fallen out of use. Examples included 'the goose
This framework holds that words do not refer directly to entities in the world, but rather they act as prompts to access conceptual structure and to set up short-term, local conceptual structures called mental spaces. Mental spaces can be understood as "temporary containers for relevant, partial information about a particular domain" (Coulson 2006: 35). The key architecture of the theory is the conceptual integration network, which is "an array of mental spaces in which the processes of conceptual blending unfold" (Coulson 2006: 35). A conceptual integration network is made up of two or more input spaces structured by information from distinct domains, a generic space that contains 'skeletal' conceptual structure that is common to both input spaces, and a blended space. Mappings occur across the input spaces as well as selectively projecting to the blended space.

One of the most often cited examples of a conceptual blend is That surgeon is a butcher, which is interpreted to mean that the surgeon is incompetent and careless. Neither the surgeon space nor the butcher space contains the notion of incompetence. The key issue is how to account for this emergent meaning. Both spaces contain common structure such as an agent, an entity acted upon (human versus animal), goals (healing the patient versus severing flesh), means for achieving the goals (precise incisions followed by repair of the wound versus slashing flesh, hacking bones), etc. Analogy mappings project across the two input spaces linking the shared structure. The notion of incompetence arises in the blended space from the conflict between the goal of healing the patient, which projects from the surgeon space, and the means for achieving the goal, slashing flesh and hacking, which projects from the butcher space. Sweetser (1999) has also analyzed noun-noun compounds such as land yacht (large, showy, luxury car) and couch potato (person who spends a great deal of time sitting and watching TV) as blends. Such noun-noun compounds are particularly interesting as the entities being referred to are not yachts or potatoes of any kind. Conceptual blending theory subsumes conceptual metaphor, with its commitment to embodied meaning. Its adherents also argue that the processes occurring in non-literal language and a variety of other phenomena, such as conditionals and diachronic meaning extension, are essentially the same as those needed to explain the interpretation of literal language.

Over the past decade, there has been a growing convergence between cognitive semanticists and relevance theorists in the area of metaphor analysis (see Gibbs & Tendahl 2006). For instance, as we saw above, the latest versions of relevance theory (e.g., Wilson & Sperber 2002a, 2002b; Wilson & Carston 2006; Sperber & Wilson 2008) now posit entries for lexical items which include direct access to 'encyclopedic assumptions' which are exploited in the formation and processing of metaphors and metonymies, a position which is quite similar to Langacker's (1987) analysis of words being prompts to encyclopedic knowledge. Wilson & Carston (2006: 425) also accept the analysis that physical descriptions such as 'hard, rigid, inflexible, cold', etc. apply to human psychological properties through inferential routes such as 'broadening' to create superordinate concepts (hard*, rigid*, cold*, etc.) which have both physical and psychological instances.
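Before turning to Wilson & Carston's extension of this encyclopedic machinery, the integration-network architecture described above for That surgeon is a butcher can be made concrete with a small sketch. The data structure below is our own rough illustration; the role labels and space contents are invented for the example and are not drawn from the blending literature.

    # Toy sketch of a conceptual integration network for
    # "That surgeon is a butcher".

    network = {
        "generic": ["agent", "entity acted upon", "goal", "means"],
        "input_surgeon": {
            "agent": "surgeon",
            "entity acted upon": "human patient",
            "goal": "healing the patient",
            "means": "precise incisions followed by repair of the wound",
        },
        "input_butcher": {
            "agent": "butcher",
            "entity acted upon": "animal carcass",
            "goal": "severing flesh",
            "means": "slashing flesh, hacking bones",
        },
    }

    # Cross-space (analogy) mappings link counterpart roles in the two inputs.
    cross_space_mappings = [(role, role) for role in network["generic"]]

    # Selective projection: the blend inherits agent and goal from the surgeon
    # space but means from the butcher space.
    blend = {
        "agent": network["input_surgeon"]["agent"],
        "goal": network["input_surgeon"]["goal"],
        "means": network["input_butcher"]["means"],
    }

    # Emergent structure: incompetence is computed in the blend, from the clash
    # between the projected goal and the projected means; it is in neither input.
    blend["emergent"] = "incompetence: goal of healing vs. butchery means"
    print(blend)

The structural point the sketch makes is that the judgment of incompetence is computed in the blended space rather than copied from either input, which is exactly the emergent-meaning problem the network architecture is designed to capture.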
Wilson & Carston further argue that ad hoc, on-the-fly categories can be constructed through the same process of broadening. For instance, they posit that the process of broadening provides an inferential route to the interpretation of Robert is a bulldozer, where the encoded concept bulldozer has the logical feature, i.e. encoded meaning, machine of a certain kind, and the following encyclopedic assumptions:
(5) a. large, powerful, crushing, dangerous to bystanders, etc.
b. looks like this (xxx); moves like this (yyy), etc.
c. goes straight ahead regardless of obstacles
d. pushes aside obstructions; destroys everything in its path
e. hard to stop or resist from outside; drowns out human voices, etc.

They continue, "Some of these encyclopedic features also apply straightforwardly to humans. Others (powerful, goes straight ahead ...) have both a basic, physical sense and a further psychological sense, which is frequently encountered and therefore often lexicalized" (Wilson & Carston 2006: 425). However, the "inferential" theory of metaphor skirts the issue of how the very process of creating "ad hoc" concepts (such as 'hard to stop') and extended lexicalized concepts (such as cold* or rigid*) may be motivated in the first place. The qualities and consequences of physical coldness manifested by ice are not the same as the affectations of lack of caring or standoffishness which are evoked in the interpretation of an expression such as Sally is a block of ice. The physical actions and entities involved in a scenario of a bulldozer moving straight ahead crushing physical obstacles such as mounds of dirt and buildings are manifestly different from the actions of a person and the entities involved in a situation in which a person ignores the wishes of others and acts in an aggressive and obstinate manner. As noted above, it remains unclear how the notions of narrowing and broadening are 'straightforward inferences' in cases like Robert is a bulldozer, where the key psychological properties such as "aggressive" and "obstinate" cannot be drawn from the lexical entry of "bulldozer" or its encyclopedic entry, because such properties simply are not part of our understanding of earth-moving machines.

Another gap in the relevance theory approach, and indeed all approaches that treat metaphor on a word-by-word basis, is the failure to recognize that most metaphors are part of very productive, systematic patterns of conceptualization based on embodied experience. For instance, Wilson & Carston argue that block of ice has a particular set of encyclopedic assumptions associated with it, such as solid, hard, cold, rigid, inflexible, and unpleasant to touch or interact with. This treatment fails to systematically account for the fact that the words icy, frigid, frosty, chilly, even cool, are all used to express the same emotional state. In contrast, conceptual metaphor theory accounts for these pervasive patterns through the construct of experiential correlation and embodied meaning. Similarly, the account of Robert is a bulldozer outlined above fails to recognize the underlying image schemas relating to force dynamics, such as momentum and movement along a path, that give rise to any number of conceptually related metaphoric expressions to describe a person as obstinate, powerful and aggressive. Robert could have also been described as behaving like a runaway freight train or as bowling over or mowing down his opponents to the same effect, even though the dictionary entries (and their related encyclopedic entries) for freight trains, bowling balls, and scythes or lawn mowers would have very little overlap with that of a bulldozer.

6. Metonymy

Over the centuries, metonymy has received less attention than metaphor. Indeed, Aristotle made no clear distinction between the two. Traditionally, metonymy was defined
VI. Cognitively oriented approaches to semantics

27. Cognitive Semantics: An overview

1. Introduction
2. The semantics of grammar
3. Schematic structure
4. Conceptual organization
5. Interactions among semantic structures
6. Conclusion
7. References

Abstract

The linguistic representation of conceptual structure is the central concern of the two-to-three-decades-old field that has come to be known as "cognitive linguistics". Its approach is concerned with the patterns in which and processes by which conceptual content is organized in language. It addresses the linguistic structuring of such basic conceptual categories as space and time, scenes and events, entities and processes, motion and location, and force and causation. To these it adds the basic ideational and affective categories attributed to cognitive agents, such as attention and perspective, volition and intention, and expectation and affect. It addresses the semantic structure of morphological and lexical forms, as well as of syntactic patterns. And it addresses the interrelationships of conceptual structures, such as those in metaphoric mapping, those within a semantic frame, those between text and context, and those in the grouping of conceptual categories into large structuring systems. Overall, its aim is to ascertain the global integrated system of conceptual structuring in language.

1. Introduction

The linguistic representation of conceptual structure is the central concern of the two-to-three-decades-old field that has come to be known generally as "cognitive linguistics" through such defining works as Fauconnier (1985), Fauconnier & Turner (2002), Fillmore (1975, 1976), Lakoff (1987, 1992), Langacker (1987, 1991), and Talmy (2000a, 2000b), as well as through edited collections like Geeraerts & Cuyckens (2007). This field can first be characterized by contrasting its "conceptual" approach with two other approaches, the "formal" and the "psychological". Particular research traditions have largely based themselves within one of these approaches, while aiming - with greater or lesser success - to address the concerns of the other two approaches.

The formal approach focuses on the overt structural patterns exhibited by linguistic forms, largely abstracted away from or regarded as autonomous from any associated meaning. This approach thus includes the study of syntactic, morphological, and morphemic structure. The tradition of generative grammar has been centered in the formal
approach. But its relations to the other two approaches have remained limited. It has all along referred to the importance of relating its grammatical component to a semantic component, and there has indeed been much good work on aspects of meaning, but this enterprise has generally not addressed the overall conceptual organization of language. The formal semantics that has been adopted within the generative tradition (e.g., Lappin 1997) has largely included only enough about meaning to correlate with the formal categories and operations that the main body of the tradition has focused on. And the reach of generative linguistics to psychology has largely considered only the kinds of cognitive structure and processing that might be needed to account for its formal categories and operations.

The psychological approach regards language from the perspective of general cognitive systems such as perception, memory, attention, and reasoning. Centered in this approach, the field of psychology has also addressed the other two approaches. Its conceptual concerns (see e.g., Neely 1991) have in particular included semantic memory, the associativity of concepts, the structure of categories, inference generation, and contextual knowledge. But it has insufficiently considered systematic conceptual structuring - the global integrated system of schematic structures with which language organizes conceptual content.

By contrast, the conceptual approach of cognitive linguistics is concerned with the patterns in which and processes by which conceptual content is organized in language. It has thus addressed the linguistic structuring of such basic conceptual categories as space and time, scenes and events, entities and processes, motion and location, and force and causation. To these it adds the basic ideational and affective categories attributed to cognitive agents, such as attention and perspective, volition and intention, and expectation and affect. It addresses the semantic structure of morphological and lexical forms, as well as of syntactic patterns. And it addresses the interrelationships of conceptual structures, such as those in metaphoric mapping, those within a semantic frame, those between text and context, and those in the grouping of conceptual categories into large structuring systems. Overall, the aim of cognitive linguistics is to ascertain the global integrated system of conceptual structuring in language.

Cognitive linguistics, further, addresses the concerns of the other two approaches to language. First, it examines the formal properties of language from its conceptual perspective. Thus, it aims to account for grammatical structure in terms of the functions this serves in the representation of conceptual structure. Second, as one of its most distinguishing characteristics, cognitive linguistics aims to relate its findings to the cognitive structures that concern the psychological approach. It aims both to help account for the behavior of conceptual phenomena within language in terms of those psychological structures, and at the same time, to help work out some of the properties of those structures themselves on the basis of its detailed understanding of how language realizes them. It is this trajectory toward unification with the psychological that motivates the term "cognitive" within the name of this linguistic tradition.
In the long term, its aim is to integrate the linguistic and the psychological perspectives on cognitive organization in a unified understanding of human conceptual structure. With its focus on the conceptual, cognitive linguistics regards "meaning" or "semantics" simply as conceptual content as it is organized by language. Thus, general conception as experienced by individuals - i.e., thought - includes linguistic meaning within its greater compass. And while linguistic meaning - whether that expressible by an individual language or by language in general - apparently involves a selection from or constraints upon general conception, it is nevertheless qualitatively of a piece with it.
Cognitive linguistics is as ready as other linguistic approaches to represent an aspect of language abstractively with a symbolic formula or schematic diagram, provided that that aspect is judged both to consist of discrete components in crisp relationships and to be clearly understood. But most cognitive linguists share the sensibility that such formal representations poorly accord with the gradients, partial overlaps, interactions that lead to mutual modification, processes of fleshing out, inbuilt forms of vagueness, and the like that they observe in semantics. They instead aim to set forth such phenomena through descriptive means that provide precision and rigor without formalisms. They further find that formal accounts present their representations of language organization with premature exhaustiveness and mistakenly uniform certainty. We might propose developing a field of "theoryology" that taxonomizes types of theories according to their defining properties. In such a field, a formal theory of language, at any given phase of its grasp of language phenomena, would be of the type that requires encompassive and perfected mechanisms to account for those phenomena. But cognitive linguistics rests on a type of theory that, at its foundation, builds in gradients for the stage of development to which any given aspect of language under analysis has been brought, and for the certainty with which the analysis is held.

While cognitive linguists largely share the approach to language outlined here, they may differ in more limited respects. For example, Ronald Langacker generally stresses the contribution of every morpheme and construction in a sentence to the unified meaning of the sentence, while George Lakoff and I see certain interactions among the elements as overriding such contributions. And Lakoff stresses a theory of embodiment that Langacker and I break into subtheories and in part challenge (see section 5.1). Terminologically, "cognitive linguistics" refers to the field as a whole. Within that field, "cognitive grammar" is largely associated with Langacker's work, while "cognitive semantics" is largely associated with my own work (the main focus below), though it is sometimes used more extendedly.

Externally, cognitive linguistics is perhaps closest to functional linguistics (e.g., Givón 1989). Discourse is central to the latter while more peripheral to the former, but both approach language with similar sensibilities. Jackendoff's approach is comparable in spirit to cognitive linguistics, as seen in article 30 (Jackendoff) Conceptual Semantics. Thus, he assumes a mentalist, rather than a cognition-avoiding logical, basis for meaning. And he critiques the approaches of Fodor, Wierzbicka, and Levin and Rappaport (see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) much as does cognitive linguistics. But his reliance on an algebraic features-based formalism to represent meaning differs from the cognitive-linguistic view of its inadequacy in handling the semantic gradience and modulation cited above. And his privileging of spatial structure in semantics is at variance with the significance that cognitive linguistics sees in such further domains as temporal structure, force-dynamic/causal structure, cognitive state (including purpose, expectation, affect, familiarity), and reality status (including factual, counterfactual, conditional, potential), as well as domains that he himself cites, like social relations.
2. The semantics of grammar

To turn to the specific contents of Cognitive Semantics, then, this outline opens with the semantics of grammar because it is the key to conceptual structuring in language. A universal design feature of languages is that their meaning-bearing forms are divided into two different subsystems, the open-class, or lexical, and the closed-class, or grammatical (see Talmy 2000a: ch. 1). Open classes have many members and can readily add many more. They commonly include (the roots of) nouns, verbs, and adjectives. Closed classes have relatively few members and are difficult to augment. They include bound forms - inflections, derivations, and clitics - and such free forms as prepositions, conjunctions, and determiners. In addition to such overt closed classes, a language can have certain implicit closed classes such as word order patterns, a set of lexical categories (e.g., nounhood, verbhood, etc. per se), a set of grammatical relations (e.g., subject status, direct object status, etc.), and grammatical constructions.

2.1. Semantic constraint on grammar

Within this formal distinction, the crucial semantic finding is that the meanings that open-class forms can express are virtually unrestricted, whereas those of closed-class forms are highly constrained. This constraint applies both to the conceptual categories they can refer to and to the particular member notions within any such category. For example, many languages around the world have closed-class forms in construction with a noun that indicate the number of the noun's referent, but no languages have closed-class forms indicating its color. And even closed-class forms referring to number can indicate such notions as singular, dual, plural, paucal, and the like, but never such notions as even, odd, a dozen, or countable. By contrast, open-class forms can refer to all such notions, as the very words just used demonstrate.

The total set of conceptual categories with their member notions that closed-class forms can ever refer to thus constitutes an approximately closed inventory. Individual languages draw in different patterns from this universally available inventory for their particular set of grammatically expressed meanings. The inventory is graduated, progressing from categories and notions that may well appear universally in all languages (a candidate is "polarity" with its members 'positive' and 'negative'), through ones appearing in many but not all languages (a candidate is "number"), down to ones appearing in just a few languages (an example is "rate" with its members 'fast' and 'slow').

2.2. Topological principle for grammar

The next issue is what determines the conceptual categories and member notions included in the inventory as against those excluded from it. No single global principle is evident, but several semantic constraints with broad scope have been found. One of these, the "topology principle", applies to the meanings - or "schemas" - of closed-class forms referring to space, time, or certain other domains. This principle largely excludes Euclidean properties such as absolutes of distance, size, shape, or angle from such schemas. Instead, these schemas exhibit such topological properties as "magnitude neutrality" and "shape neutrality".

To illustrate magnitude neutrality, the spatial schema of the English preposition across prototypically represents motion along a path from one edge of a bounded plane perpendicularly to its opposite. But this schema is abstracted away from magnitude, so the preposition can be used equally well in The ant crawled across my palm and in The bus drove across the country. Likewise in time, the temporal schema of the past tense
example), each with its own distinctive schematic structure. Such categories also largely share certain structural properties, such as the capacity for converting from one category member to another, the multiple nesting of such conversions, and a structural parallelism between objects in space and events in time. At a second level, these categories join in extensive "schematic systems" that structure major sectors of conception. Four of these schematic systems are outlined next.

3.1. Configurational structure

One schematic system, "configurational structure", comprehends all the respects in which closed-class schemas represent structure for space or time or other conceptual domains, often in virtually geometric patterns (see Talmy 2000a: ch. 3, 2003, 2006; Herskovits 1986; article 98 (Pederson) The expression of space; article 107 (Landau) Space in semantics and cognition). It thus includes much that is within the schemas represented by spatial prepositions, by temporal conjunctions, and by tense and aspect markers, as well as by markers that otherwise interact with open-class forms with respect to object and event structure. This last type of schema is seen in the categories of "plexity" and "state of boundedness", treated next.

3.1.1. Plexity

The conceptual category of plexity pertains to a quantity's state of articulation into equivalent elements. Its two main member notions are "uniplexity" and "multiplexity". The novel term "plexity" was chosen to capture an underappreciated generalization present across the traditional categories of "number" for objects in space and "aspect" for events in time. In turn, uniplexity thus covers both the singular and the semelfactive, while multiplexity covers both the plural and the iterative.

If an open-class form is intrinsically lexicalized for a uniplex referent, a closed-class form in construction with it can trigger a cognitive operation of "multiplexing" that copies its original solo referent onto various points of space or time. Thus, in English, the noun bird and the verb (to) sigh intrinsically have a uniplex referent. But this can be multiplexed by adding -s to the noun, as in birds, or by adding keep -ing to the verb, as in keep sighing. (True, English keep is open-class, but parallel forms in other languages are closed-class iterative forms.)

An operation of "unit excerpting" can perform the reverse conversion from multiplexity to uniplexity on an intrinsically multiplex open-class form. In English, this operation is performed only by grammatical complexes, as in going from furniture to (a) piece of furniture or from breathe to take a breath. But other languages have simplex forms. Thus, Yiddish goes from groz 'grass' to (a) grezl '(a) blade of grass'. And Russian goes from čixat' 'sneeze a multiplex number of times' to čixnut' 'sneeze once'.

3.1.2. State of boundedness

A second conceptual category is "state of boundedness", with two main member notions, "unboundedness" and "boundedness". An unbounded quantity is conceptualized as able to continue on indefinitely without intrinsic finiteness. A bounded quantity is conceptualized as an individuated unit entity with a boundary around it. As with plexity, these new terms are intended to capture the commonality across the space and time domains, and
viewing, or family of viewings, points ahead to a subsequent wife or wives following John's marriage with the woman at the punchbowl. Finally, a perspective point of the speaker at the present moment of speech is established by the past tense of the main verb was. It is this perspective point at which the speaker's cumulative knowledge of the reported sequence of events is stored as memory and, in turn, which functions as the origin of a retrospective direction of viewing over the earlier sequence. The earlier perspective points are here nested within the scope of the viewing from the current perspective point.

3.3. Distribution of attention

A third schematic system, "distribution of attention", directs a listener's attention differentially over the structured scene from the established perspective point (see Talmy 2000a: ch. 4, 2007). Grammatical and other devices set up regions with different degrees of salience, arrange these regions into different patterns, and map these patterns in one or another way over the components of the structured scene. Several patterns are outlined here.

3.3.1. Focal attention

One attentional arrangement is a center-surround pattern with the center foregrounded as the focus and with the surround backgrounded. The grammatical relation of subject status can direct focal attention to the referent of the subject nominal, and alternative selections of subject can place the center of the pattern over different referents, even ones within the same event. Thus, focal attention can be mapped either onto the seller in a commercial transaction, with lesser attention on the remainder, as in The clerk sold the vase to the customer, or onto the buyer, with lesser attention on the new remainder, as in The customer bought the vase from the clerk (see Talmy 2000a: ch. 1).

For another realization of this pattern, Fillmore's (1976) term "frame" and Langacker's (1987) term "base" refer to a structured set of coentailed concepts in the attentional background. Their respective terms "highlighting" and "profiling" then refer to the foregrounding of the portion of the set that a morpheme refers to directly. A Husserl (1970) example can illustrate. The nouns husband and wife both presuppose the conception of a married couple in the background of attention, while each focuses attention on one or the other member of such a pair in the foreground.

3.3.2. Level of synthesis

In expressions referring to the same scene, different grammatical forms can direct greater attention to either of two main "levels of synthesis" or of granularity, the Gestalt level or the componential level. Thus, the head status of pyramid in the sentence The pyramid of bricks came crashing down raises its salience over that of bricks with its dependent status. More attention is at the Gestalt level of the whole pyramid, conceptually tracking its overall movement. But the dependency relations are reversed in the sentence The bricks in the pyramid came crashing down. Here, more attention is at the componential level of the constituent bricks, tracking their multiple movements (see Talmy 2000a: ch. 1).
3.3.3. Window of attention

A third pattern is the "window of attention". Here, one or more (discontinuous) portions of a referent scene are foregrounded in attention (or "windowed") by the basic device of their explicit mention, while the remainder of the scene is backgrounded in attention (or "gapped") by its omission from mention. To illustrate, the sentence The pen kept rolling off the uneven table conveys the conception of an iterating cycle in which a pen progresses through the phases of lying on a table, falling down, lying on the ground, and being placed back on the table. But the overt linguistic material refers only to the departure phase of the pen's cyclic path. Accordingly, only this portion of the total referent is foregrounded in attention, while the remainder of the cycle is relatively backgrounded. With enough context, the alternative sentence I kept placing the pen back on the uneven table could refer to the same cycle. But here, the presence of overt material referring to the return phase of that cycle foregrounds that phase in attention, while now the departure phase is backgrounded (see Talmy 2000a: ch. 4).

3.3.4. Attentional nesting

Nesting was shown for configuration and for perspective, and it can also be seen in the schematic system of attention. It appears in the second of the following two sentences: The customer bought a vase. / The customer was sold a vase. In the second sentence, focal attention is first directed to the seller by the lexical choice of sell but is then redirected to the buyer by the passive voice. If this redirection of attention were total, then the second sentence would be semantically indistinguishable from the first sentence, but in fact it is not. Rather, the redirection of attention is only partial: it leaves intact the foregrounding of the seller's active intentional role, but it shifts the main focus onto the buyer as target. Altogether, then, it can be said that attention on the seller is hierarchically embedded within a more dominant attention on the buyer.

4. Conceptual organization

In addition to schematic systems, language has many other forms of extensive and integrated conceptual organization, such as the three presented next. Although Figure/Ground organization and factive/fictive organization could be respectively comprehended under the attentional and the configurational schematic systems, and force-dynamic organization has elsewhere been treated as a fourth schematic system, these are all extensive enough and cut across enough distinctions to be presented here as separate bodies of conceptual organization.

4.1. Figure/Ground organization

In representing many spatial, temporal, equational, and other situations, language is so organized as to single out two portions of the situation, the "Figure" and the "Ground", and to relate the former to the latter (see Talmy 2000a: ch. 5). In particular, the Figure is a conceptually movable entity; its location or path is conceived as a variable whose particular value is at issue. The Ground is a reference entity with a stationary setting relative to a reference frame; the Figure's variable is characterized with respect to it.
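The asymmetry of the two roles can be made concrete with a minimal sketch in, say, Python - the data structure below is purely illustrative and is not itself part of the linguistic analysis. It encodes a locative sentence as a Figure/Ground relation; reversing the role assignment yields a formally parallel but conceptually distinct construal of the same scene:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Locative:
    """One conceptualization of a spatial scene: the Figure's location
    is the variable at issue; the Ground is the reference entity."""
    figure: str    # conceptually movable entity whose location is at issue
    ground: str    # reference entity the Figure is characterized against
    relation: str  # the schematic site or path, e.g. 'near', 'on', 'across'

    def gloss(self) -> str:
        return f"The {self.figure} is {self.relation} the {self.ground}."

# 'The post office is near the bank' vs. the role-reversed construal:
a = Locative(figure="post office", ground="bank", relation="near")
b = Locative(figure="bank", ground="post office", relation="near")
print(a.gloss())  # The post office is near the bank.
print(b.gloss())  # The bank is near the post office.
# Same scene, same relation, but a != b: the Figure/Ground assignment
# is imposed by the speaker, not given by the scene itself (see 5.1.1).
assert a != b
```

The point of the sketch is only that one and the same scene supports two distinct Figure/Ground assignments; nothing in the scene itself dictates the choice.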
4.2.2. Emanation paths

Another category of fictive motion, "emanation paths", involves the fictive conceptualization of an intangible line emerging from a source object, passing in a straight line through space, and terminating on a target object, where factively nothing is in motion. In one subtype, "demonstrative paths", a directed line emerges from the pointed front of a source object. This is seen in The arrow points toward / past / away from the town. In the "radiation paths" subtype, a beam of radiation emanates from a radiant object and terminates on an irradiated object. This is seen in Light shone from the sun into the cave. It might be claimed that photons do factively emanate from a radiant object, so that fictive motion need not be invoked. However, we do not see photons, so any representation of motion is cognitively imputed. In any case, in a related subtype, "shadow paths", none will claim the existence of "shadowons", and yet once again fictive motion is seen in a sentence like The pole threw its shadow on the wall. Finally, a "sensory path" is represented as moving from the experiencer to the experienced object in a sentence like I looked into / past / away from the tunnel. Such an emanating "line of sight" can also be represented as moving laterally. Both these forms of fictivity - first lateral, then axial - are represented in I slowly looked down into the well.

One question for this fictive category, though, is what determines the direction of the intangible emanation. Logically, since motion is imagined, it should be possible to conceptualize a reversed path. Attempts at representing such reversed paths appear in sentences like *Light shone from my hand onto the sun, or *The shadow jumped from the wall onto the pole, or *I looked from that distant mountain into my eyes. But such formulations do not exist in any language that represents such events fictively. Rather, an "active-determinative" principle appears to govern the direction of emanation. Of the two objects, the more active or determinative one is conceptualized as the source. Thus, relative to my hand, the sun is brighter, hence more active, and must be treated as the source of radiative emanation. My agency in looking is more active than the inanimate perceived object, so I am treated as the source of sensory emanation. And the pole is more determinative - I can move the pole and the shadow will also move, but I cannot perform the opposite operation of moving the shadow and getting the pole to move - so the pole is treated as the source of shadow emanation.

4.3. Force dynamics

Language has an extensive conceptual system of "force dynamics" for representing the patterns in which one entity, the "Agonist", has force exerted on it by another entity, the "Antagonist" (see Talmy 2000a: ch. 7). It covers such concepts as an Agonist's natural tendency toward action or rest, an Antagonist's opposition to such a tendency, the Agonist's resistance to this opposition, and the Antagonist's overcoming of such resistance. It includes the concepts of causing and letting, helping and hindering, and blockage and the removal of blockage. It generalizes over the causative concepts of traditional linguistics, placing them naturally within a matrix of finer distinctions. It also cuts across conceptual domains, from the physical, to the psychological, to the social, as the subsections below illustrate.
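Before turning to those domains, the core logic of the basic steady-state patterns can be set out in a compact sketch (illustrative Python only; the function and its labels paraphrase the pattern names used below and are not Talmy's own notation). The resulting state of the Agonist follows from its intrinsic tendency together with the relative strength of the Antagonist:

```python
def force_dynamic_outcome(agonist_tendency: str, antagonist_stronger: bool):
    """Basic steady-state force dynamics: the Agonist's intrinsic tendency
    ('motion' or 'rest') is realized unless a stronger Antagonist opposes
    it, in which case the opposite state results."""
    if antagonist_stronger:
        # the Antagonist prevails: one of the "causing" patterns
        result = "rest" if agonist_tendency == "motion" else "motion"
        label = "extended causing of " + result
    else:
        # the Agonist prevails over opposition: one of the "despite" patterns
        result = agonist_tendency
        label = "despite"
    return result, label

# The two readings of 'The ball kept rolling along the green' (see 4.3.1):
print(force_dynamic_outcome("rest", True))     # ('motion', 'extended causing of motion')
print(force_dynamic_outcome("motion", False))  # ('motion', 'despite')
```

Note that the two calls return the same resulting state, motion, under opposite force-dynamic patterns - which is exactly why the sentence is ambiguous between them.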
4.3.1. The physical domain

A contrast between two sentences can illustrate the physical domain. The sentence The ball rolled along the green represents motion in a force-dynamically neutral way. But The ball kept rolling along the green adds force dynamics to the otherwise same spatial movement. In fact, it has readings for two different force-dynamic patterns. Interpreted under the "extended causing of motion" pattern, the ball as Agonist has a natural tendency toward rest but is being overcome by a stronger Antagonist such as the wind. Alternatively, interpreted under one of the "despite" patterns, the ball as Agonist has a natural tendency toward motion and is overcoming a weaker Antagonist such as stiff grass - that is, it moves along despite opposition from the grass.

4.3.2. The psychological domain

An individual's psyche can be conceptualized and linguistically represented as a "divided self" in which two different components are in force-dynamic opposition. To illustrate, the sentence I didn't respond is force-dynamically neutral. But the sentence I refrained from responding, though it still represents a lack of response, now adds in the force-dynamic pattern "extended causing of rest". Specifically, a more central part of me, the Agonist, has a tendency toward responding, while a more peripheral part of me, the Antagonist, opposes this tendency, is stronger, and so blocks a response. The two opposing parts are explicitly represented in the corresponding sentence I held myself back from responding.

4.3.3. The social domain

Much as the closed-class category of prepositions is largely associated with a specific semantic category, that of paths or sites in relation to a Ground object, so the closed-class category of modals is largely associated with the semantic category of force dynamics - in particular, with its social application. Here, certain interpersonal interactions, mediated solely through communication, can be metaphorically represented in terms of force or pressure exerted by one individual or group on another. For example, must, as in You must go to school, represents one of the "causing" force-dynamic patterns between individuals. It sets the subject up as an Agonist whose desire - taken as a kind of tendency - is to do the opposite of the predicate's referent. And it sets up an implicit Antagonist - for example, I, your mother, people at large - that exerts psychological pressure on the Agonist toward performance of the undesired action. The modal may, as in You may go to the playground, instead represents a "letting" force-dynamic pattern. Here, the subject as Agonist has a desire or tendency toward the stated action that could have been blocked by an implicit stronger Antagonist, but this potential blockage is withheld.

5. Interactions among semantic structures

The preceding discussion has mostly dealt with conceptual structures each in its own terms. But a major aspect of language organization is that conceptual structures, from small to large, can also interact with each other in accordance with certain principles. Such interactions are grouped together below under four extensive categories.
5.1. Conceptual imposition

A widespread view about the contents and structures of cognition is that they ultimately derive from real properties of external phenomena, through processes of perception and abstraction, in what John Searle has called the "world-to-mind direction of fit". While acknowledging such processes, cognitive linguistics calls attention instead to intrinsic content and structure in cognition - presumably mainly of innate origin - and to how extensive they are. Such native cognitive properties certainly apply to the general functioning of cognition. But they also apply in many forms of "conceptual imposition" - the imputation of certain contents and structures to our conceptions and perceptions of the world in a "mind-to-world direction of fit". Several realizations of such conceptual imposition are outlined next.

5.1.1. The imputation of content or structure

An initial non-linguistic example of autochthonous cognition is "affect". Emotions such as anger or affection are experienced either as such or as applied to outside entities. But it is difficult to see how such feelings could arise from a process of abstraction from the external world. Linguistic examples of course abound. Fictive motion offers some immediately striking ones, such as the "shadow path" in The pole threw its shadow onto the wall. As described in 4.2.2, the literal wording here depicts a movement from pole to wall that is not overtly perceived as occurring "out there". That is, at least the language-related portion of our cognition imposes the conceptualization of motion onto what would be perceived as static.

Actually, though, virtually all the semantic structures described so far are forms of conceptual imposition. Thus, in the scene represented by the sentence The post office is near the bank, based on the discussion in 4.1, it could hardly be claimed that the post office is inherently Figure-like and the bank Ground-like, beyond our cognitive imputation of those roles to those objects. And houses dispersed over a valley, as described in 3.2.2, could scarcely possess an associated moving or stationary perspective point from which they are viewed, apart from the linguistic forms that ascribe such perspective to the represented scene.

5.1.2. Alternatives of conceptualization

A consequence of the fact that a particular structural or contentful conception can be imputed to a phenomenon is that a range of alternative conceptions could also be imputed to it. These are alternatives of what my work has termed the "conceptualization" and Langacker's has termed the "construal" of a phenomenon. For example, as seen in 4.2.1, the fictive motion that could be imputed to a fence along one coextension path, as in The fence goes from the plateau down into the valley, could also be imputed to it along the reverse path, as in The fence goes from the valley up onto the plateau.

Or consider the deictics this and that, which establish a conceptual boundary in space and depict an indicated object as being respectively either on the speaker's side of the boundary or on the side opposite the speaker. Then, referring to the exact same bicycle standing, say, some 8 feet away, a speaker could opt to say either This bike is in my way or That bike is in my way. The speaker can thus impose alternatives of conceptualization
on the scene, imputing a conceptual boundary either between himself and the bike or on the other side of the bike.

5.1.3. Embodiment

The notion of "embodiment" extends the idea of conceptual imposition. It assumes that such imposed concepts are largely based on experiences humans have of their bodies interacting with environments or on psychological or neural structure. It proposes that such experiences are imputed to, or form the basis of, our understanding of most phenomena (Lakoff & Johnson 1999). In my view, though, the linguistic literature has largely applied the blanket term "embodiment" to a range of insufficiently distinguished ideas that differ in their validity. Four such distinct ideas of embodiment in current use are outlined here.

First, in what might be called the "bulk encounter" idea of embodiment, phenomena are grouped and categorized in terms of the way in which our bodies - with their particular shape and mesoscopic size - can interact with them. But this idea is either incorrect or limited. For example, many languages have closed-class representation for a linear configuration, as English does with the preposition along. This preposition applies to a Ground object schematizable as linear. But due to magnitude neutrality (see 2.2), this schema can be applied to objects of quite different sizes, as in The ant climbed up along the matchstick and The squirrel climbed up along the tree trunk. Yet, although the along schema can group a matchstick and a tree trunk together, we bodily interact with those objects in quite different ways. Accordingly, the bulk encounter idea of embodiment does not account for this and perhaps much else in the structure of linguistically represented conception.

Second, in what could be called the "neural infrastructure" idea of embodiment, it is the organization and operation of our neural structure that determines how we conceptualize phenomena. Thus, the linear schematization just cited might arise from neurally based processes of visual perception that function to abstract out just such one-dimensional contours. Comparably, the concept evoked on hearing a word such as bicycle or coffee might arise from the reactivation of the visual, motor, and olfactory areas that were previously active during interaction with those objects, in the manner of Damasio's "convergence zones". The problem with this idea of embodiment is that, although generally correct, it is simply subsumed by psychology and needs no separate statement of its own.

Third, in what could be called the "concreteness as basic" idea of embodiment, the view is that experience with the tangible world is developmentally earlier and provides the basis for later conceptions of intangible phenomena, much as in Piagetian theory. A commonly cited example is concepts of time based on those of space. Another is the conception of purpose based on that of destination, that is, one's destination in a physical journey, in what Lakoff (1992) terms the "event structure metaphor". While something of this directional bias is evident in metaphoric mapping (see below), it is not clear that it correctly characterizes cognitive organization. On the contrary, we may well have an innate cognitive system dedicated to temporal processing - perhaps already evident very early - that includes perception of and control over duration; starting, continuing, and stopping; interrupting and resuming; repeating; waiting; and speeding up and slowing down.
We may likewise have an innate cognitive system for intention or purpose. In any case, "purpose" cannot be derived from "destination". After all, the concept that a person moving from point X to point Y has Y as a "destination" already includes a component of purpose. When such a component is lacking, we do not say that a person has point Y as her destination but rather that her motion simply "stops" at that point. Accordingly, the notion of purpose present in the concept of "destination" could not derive from perceptions of concrete motion patterns, but might originate in an innate cognitive system for the enactment and conception of intention or purpose.

In a fourth and final type here, what can be called the "anti-objectivism" idea of embodiment faults the view that there exists an autonomous truth, uniform and pervasive, in such realms as logic and mathematics that the human mind taps into for its understandings and activities in those realms. Rather, we deal with such realms by imputing or mapping onto them various of our conceptual schemas, motor programs, or other cognitive structures. On this view, we do much of our thinking and reasoning in terms of such experientially derived structures. For example, our sense of the meaning of the word angle is not derived from some independent ideal mathematical realm, but is rather built up from our experience, e.g., from perceptions of a static forking branch, from moving two sticks axially until their ends touch, or from rotating one stick while its end touches that of another. This view of how we think may be largely correct. But if applied too broadly, it might obscure the possible existence of an actual cognitive system for objectivity and reason. Such a system might have the capacity to check for coherence across concepts, for global consistency across conceptual and cognitive domains, and for consistency across inferences and reasoning - whether or not the assessed components themselves arose through otherwise embodied processes - and it might be the source of the very conception of an objective domain.

5.2. Cognitive recruitment

I propose the term "recruitment" for a pervasive cognitive process in which a cognitive configuration with a certain original function or conceptual content gets used to perform another function or to represent some other concept. That is, the basic function or concept is appropriated or co-opted in the service of manifesting another one. Such recruitment would certainly cover all tropes, including fictivity and metaphor. Thus, in a coextension path example of fictive motion like The fence goes from the plateau to the valley (see 4.2.1), the morphemes go, from, and to originally and basically refer to motion, but this reference is conscripted in the service of representing a stationary configuration.

And metaphor can also be understood in terms of recruitment. In cognitive linguistics, metaphor has been mainly studied not for its salient poetic form familiar from literature but - under the term "conceptual metaphor" - for its largely unconscious pervasive structuring of everyday expression (see e.g., Lakoff 1992; article 26 (Tyler & Takahashi) Metaphors and metonymies). In the basic analysis, certain structural elements of a conceptual "source domain" are mapped onto the content of a conceptual "target domain". But in our present terms, it can also be said that the conceptual structures and morphemic meanings original to the source domain are recruited for use as structures and meanings within the target domain.
The directionality of the mapping - based on the "concreteness as basic" view of embodiment (see 5.1.3) - is typically from a more concrete domain grounded in bodily experience to a more abstract domain. Thus, the more palpable domain of space is systematically mapped onto the more abstract domain of time in such everyday expressions as Christmas is ahead / near / almost here / upon us / past.

Recruitment can be seen as well in the appropriation of one type of construction to serve as another type. For example, the English question construction with certain modals can serve as a request, as in Could you pass me the salt? Fictivity terminology could be extended to label the host construction here as a fictive question, and the parasitic construction as a factive request. Or it could be said that a request construction has recruited a question construction.

Finally, to illustrate functional recruitment, repair mechanisms in discourse, in their basic function, comprise a variety of devices that a speaker uses to remedy hitches that arise in the production of an utterance. Talmy (2000b: ch. 6) cites a recorded example of a young woman rejecting a suitor in which she uses an inordinate density of repair mechanisms, including false starts, interruptions, corrections, and repetitions. But it is evident that these originally corrective devices have been co-opted to perform a different function: to manifest embarrassed concern for the addressee's sensitive feelings. And, built in turn upon that function is the further function of the speaker's signaling to the addressee that she did have his feelings in mind.

5.3. Semantic conflict resolution

A conflict or incompatibility often exists between the references of two constituents in a sentence, or between the reference of a constituent and the context or one's general knowledge (see Talmy 2000b: ch. 5). The treatment of such semantic conflict thus complements treatments of semantic "unification" in which the referents of constituents integrate unproblematically. A hearer of a conflict generally applies one out of a set of resolutions to it. These include shifts, blends, juxtapositions, and juggling. Of these, the first two are characterized next.

5.3.1. Shifts

In the type of resolution Talmy (1977) termed a "shift" - now largely called "coercion" after Pustejovsky (1993) - the reference of one of the two conflicting forms changes so as to accord with the reference of the other form. A shift can involve the cancellation, stretching, or replacement of a semantic feature. Each of these three types of shifts is illustrated next.

The across schema cited in 2.2 can illustrate component cancellation. This schema prototypically involves a horizontal path on a bounded plane from one edge perpendicularly to its opposite. But the path's termination on the distal edge can be canceled, as in a sentence like The shopping cart rolled across the boulevard and was hit by an oncoming car. Here, the English preposition is not blocked from usage, or replaced by some preposition referring to partial planar traversal, but continues on with one of its semantic components missing. In fact, the preposition can continue in usage even with both of the path's edge contacts canceled, as seen in The tumbleweed rolled across the desert for an hour.

The across schema can also illustrate component stretching. A prototypical constraint on this schema, not mentioned earlier, is that the main axis of the plane, which is perpendicular to the path, may be longer than the path or of the same length, but cannot be
shorter. Accordingly, I can swim "across" a square swimming pool from one edge to the other, or "across" a canal from one bank to the other, but if my path parallels a canal's banks, I am not swimming "across" the canal but "along" it. But what if I am at an oblong pool and swim from one of the narrow edges to its opposite? In referring to this situation, the acceptability of the sentence I swam across the pool is great where the pool is only slightly longer than a square shape, and decreases as its relative length increases. The across schema thus permits the path length within the relative-axis constraint to be stretched moderately but not too far.

Finally, component replacement can be seen in a sentence like She is somewhat pregnant. Here, the gradient specification of somewhat conflicts with the basic all-or-none specification of pregnant. A hearer might resolve this conflict through the mechanism of juxtaposition, to yield the "incongruity effect" of humor. If not, though, the hearer can shift pregnant into accord with somewhat by replacing its 'all-or-none' component with that of 'gradience'. Then the overall meaning of pregnant shifts as well from involving the presence or absence of a fetus to involving the length of gestation.

5.3.2. Blends

An incompatibility between two sets of specifications in a sentence can also be resolved as a "blend", in which a hearer generates an often imaginative conceptual hybrid that accommodates both of the original conceptual inputs in some novel relation to each other. Talmy (1977) distinguished two types of blends, superimposition and introjection, and illustrated the former with the sentence My sister wafted through the party. The conflict here is between waft suggesting something like a leaf moving gently in an irregular pattern through the air, and the rest of the sentence suggesting a person (moving) through a group of other people. In myself, this sentence evokes the blended conceptualization of my sister wandering aimlessly through the party, somewhat unconscious of the events around her, and of the party somehow suffused with a slight rushing sound of air.

Fauconnier & Turner (2002) have greatly elaborated on this process, also terming it a "blend" or a "conceptual integration". In their terms, two separate mental spaces (see below) can map elements of their content and structure into a third mental space that constitutes a blend of the two inputs, with potentially novel structure. Thus, in referring to a modern catamaran reenacting a century-old voyage by an early clipper, a speaker can say At this point, the catamaran is barely maintaining a 4-day lead over the clipper. The speaker here conceptually superimposes the two treks and generates the apparency of a race.

5.4. Semantic interrelations

In the preceding three subsections, semantic structures have in effect "acted on" each other to yield a novel conceptual derivative. But semantic elements and structures can also simply relate to each other in particular patterns. Four such patterns are outlined next.

5.4.1. Within one sense of a morpheme

Several bodies of research within cognitive linguistics address the structured relations among the semantic components of the meaning of a morpheme in one of its polysemous senses. Two of these are "frame semantics" and "prototype theory", outlined next.
Fillmore's (e.g., 1976) Frame Semantics (see article 29 (Gawron) Frame Semantics) shows that the meaning of a morpheme does not simply consist of a central concept - the main concern of a speaker in using the morpheme - but extends out indefinitely with ever further conceptual associations that bear particular relations to each other and to the central concept. In fact, several different morphemes can share roughly the same extended frame while foregrounding different portions in the center. Thus, such "commercial frame" verbs as sell, buy, spend, charge, and cost all share in their frames a seller, a buyer, money, and goods, as well as the transfer of money from the buyer to the seller and, in return for that, the transfer of the goods from the seller to the buyer. Each of these concepts in turn rests on a further conceptual infrastructure. For example, the 'money' concept rests on notions of governmental minting and socially agreed value, while the 'in return for' concept rests on notions of reciprocity and equity.

In Lakoff's (1987) prototype theory (see article 28 (Taylor) Prototype theory), a morpheme's meaning can generally be viewed as a category whose members differ in privilege, whose properties can vary in number and strength, and whose boundary can vary in scope. In its most prototypical usage, then, the morpheme refers to the most privileged category member, assigns the fullest set of its properties at their greatest strength to that member, and tightens its boundary to enclose the smallest scope. For an example from Fillmore (1976), the meaning of breakfast - in its most prototypical usage - consists of eating certain foods, namely, eggs, bacon, toast, coffee, orange juice, and the like, at a certain time of day, namely, in the morning. But the meaning can be extended to less prototypical values, for example, either to different foods - The Joneses eat liver and onions for breakfast - or to different times of the day - Breakfast is served all day.

5.4.2. Across different senses of a morpheme

Brugman (1981) was the first to show that for a polysemous morpheme, one sense can function as the prototype to which the other senses are progressively linked by conceptual increments within a "radial category". Thus, for the preposition over, the prototype sense may be 'horizontal motion above an object', as in The bird flew over the hill. But linked to this by "endpoint focus" is the sense in Sam lives over the hill.

5.4.3. Relations from within a morpheme to across a sentence

The "Motion typology" of Talmy (2000b: ch. 1) proposes a universal semantic framework for an event of motion or location. This consists of four components in the main event proper - the moving or stationary "Figure", its state of "Motion" (moving or being located), its "Path" (path or site), and the "Ground" that serves as its reference point - plus an outside "Co-event", typically of Manner or of Cause. Languages differ typologically as to which of these components they characteristically include within the verb of a sentence, and which they locate elsewhere in the sentence. And these two sets of allocations are correlated. Thus, a "verb-framed" language like Spanish characteristically places the components of Motion and Path together in the verb, and so has an extensive series of "path verbs" with meanings like 'enter', 'exit', 'ascend', 'descend', 'cross', 'pass', and 'return'.
In correlation with this lexicalization pattern for the verb, the language has a ready colloquial construction for representing the Co-event - typically a gerund form that can appear right after the path verb. For example, 'I ran into the cave' might be expressed as Entré corriendo a la cueva - literally, "I entered running to the cave". By contrast, a "satellite-framed" language like English characteristically places the components of Motion and Co-event together in the verb, and so has a series of "Manner verbs" like run, limp, scuttle, and speed. In correlation with this lexicalization pattern for the verb, the language also has an extensive series of "path satellites" - e.g., in, out, up, down, past, across, and back - as well as a partially overlapping set of path prepositions, together with the syntactic construction for their inclusion after the verb. The English sentence corresponding to the preceding Spanish one is thus: I ran into the cave.

This correlation within a language between a verb's lexicalization pattern and the rest of the syntax in a motion sentence can be put into relief by noting minimally occurring patterns (see Slobin 1996). Thus, Spanish does not have a path satellite category or an extensive set of path prepositions, and in fact can largely not use the prepositions it does have to represent a path. For instance, it could not do so for the cave example. For its part, English does not have a colloquial gerund construction for use with its few path verbs (which in any case are mostly borrowed from Romance languages, where they are native). Thus, a sentence like I entered the cave running is fully awkward.

5.4.4. Across a sentence

Fauconnier (1985) shows how different portions of a sentence can set up distinct "mental spaces" with particular relations to each other. Each such space is a relatively self-contained conceptual domain with its component elements in a particular arrangement; two spaces can share many of the same elements; and a mapping can be established between corresponding elements. The mapping is directional, going from a "base" space - a conceptual domain generally factual for the speaker - to a "subordinate" space that can be counterfactual, representational, at a different time, etc. Thus, in Max thinks Harry's name is Joe, the speaker's base space includes 'Max' and 'Harry' as elements; the word thinks sets up a subordinate space for a portion of Max's belief system; and this contains an element 'Joe' that corresponds to 'Harry'.

6. Conclusion

In this survey, the field of cognitive linguistics in general and of Cognitive Semantics in particular is seen to have as its central concern the representation of conceptual structure in language. The field addresses properties of conceptual structure both local and global, both autonomous and interactive, and both typological and universal. And it relates these linguistic properties to more general properties of cognition. While much has already been done in this relatively young linguistic tradition, it remains quite dynamic and is extending its explorations in a number of new directions.

7. References

Brugman, Claudia 1981. The Story of 'Over'. MA thesis, University of California, Berkeley, CA.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language. Cambridge, MA: The MIT Press.
28. Prototype theory

1. Introduction
2. Prototype effects
3. Prototypes and the basic level
4. The cultural context of categories
5. Prototypes and categories
6. Objections to prototypes
7. Words and the world
8. Prototypes and polysemy
9. References

Abstract

According to a long-established theory, categories are defined in terms of a set of features. Entities belong in the category if, and only if, they exhibit each of the defining features. The theory is problematic for a number of reasons. Many of the categories which are lexicalized in language are incompatible with this kind of definition, in that category members do not necessarily share the set of defining features. Moreover, the theory is unable to account for prototype effects, that is, speakers' judgements that some entities are 'better' examples of a category than others. These findings led to the development of prototype theory, whereby a category is structured around its good examples. This article reviews the relevant empirical findings and discusses a number of different ways in which prototype categories can be theorized, with particular reference to the functional basis of categories and their role in broader conceptual structures. The article concludes with a discussion of how the notion of prototype category has been extended to handle polysemy, where the various senses of a word can be structured around, and can be derived from, a more central, prototypical sense.

1. Introduction

In everyday discourse, the term 'prototype' refers to an engineer's model which, after testing and possible improvement, may then go into mass production. In linguistics and in cognitive science more generally, the term has acquired a specialized sense, although the idea of a basic unit, from which other examples can be derived, may still be discerned. The term, namely, refers to the best, most typical, or most central member of a category. Things belong in the category in virtue of their sharing of commonalities with the prototype. Prototype theory refers to this view on the nature of categories. This article examines the role of prototypes in semantics, especially in lexical semantics. To the extent that words can be said to be names of categories, prototype theory becomes a theory of word meaning.

Prototype theory contrasts with the so-called classical, or Aristotelian, theory of categorization (Lakoff 1982, 1987; Taylor 2003a). According to the classical theory, a category is defined in terms of a set of properties, or features, and an entity is a member of the category if it exhibits each of the features. Each of the features is necessary; jointly they are sufficient. The classical theory captures the 'essence' of a category in contrast to
the 'accidental' properties of category members. The theory entails that categories have clear-cut boundaries and that all members, in their status as category members, have equal status within the category.

An example which is often cited to illustrate the classical theory is the category 'bachelor', defined in terms of the features [+human], [+adult], [+male], and [-married]. Any entity which exhibits each of the four defining features is, by definition, a member of the category and thus can bear the designation bachelor. Something which exhibits only three or fewer of the features is not a member. In their status as bachelors, all members of the category are equal (though they may, of course, differ with respect to non-essential, accidental features, such as their height, wealth, and such like).

The classical theory is attractive for a number of reasons. First, the theory neatly accounts for the relation of entailment. If X is a bachelor, then necessarily X is unmarried, because 'unmarried' is a property already contained in the definition of bachelor. Second, the theory explains why some expressions are contradictions. In married bachelor, or This bachelor is married, the property designated by married conflicts with a definitional feature of bachelor. Third, the theory accounts for the distinction between analytic and synthetic statements. This bachelor is a man is analytic; it is necessarily true in virtue of the definitions of the words. This man is a bachelor is synthetic; its truth is contingent on the facts of the matter. The theory also goes some way towards explaining concept combination. A rich bachelor refers to an entity which exhibits the features of bachelor plus the features of rich.

In spite of these obvious attractions, there are many problems associated with the classical theory. First, the theory is unable to account for the well-documented prototype effects, to be discussed in section 2. Another problem is that the theory says nothing about the function of categories. Organisms categorize in order to manage the myriad impressions of the environment. If something looks like an X, then it may well be an X, and we should behave towards it as we would towards other Xs. The classical theory is unable to accommodate this everyday kind of inferencing. According to the classical theory, the only way to ascertain whether something is an X is to check it out for each of the defining features of X. Having done that, there is nothing further to say about the matter. Categorization of the entity serves no useful purpose to the organism.

A related matter is that the theory makes no predictions as to which categories are likely to be lexicalized in human languages. In principle, any random set of features can define a category. But many of these possible categories - for example, a category defined by the features [is red], [was manufactured before 1980], [weighs 8 kg] - although perfectly well-formed in terms of the theory, are not likely to be named in the lexicon of any human language.

A further set of problems arises in regard to the features. In terms of the theory, the features are more basic than the categories which they define. In many cases, however, the priority of the features might be questioned. [Having feathers] may well be a necessary feature of 'bird'.
But do we comprehend the category 'bird' on the basis of a prior understanding of what it means for a creature to have feathers, in contrast, say, to having fur, hair, scales, or spines? Probably not. Rather, we understand 'feathers' in consequence of our prior acquaintance with birds. Another point to bear in mind is that each feature will itself define a category, namely, the category of entities exhibiting that feature. The feature [+adult] - one component of the definition of bachelor - singles out the category of adults. We will need to provide a classical definition of the category 'adult', whose
level thus turns out to be the most informative and efficient of the taxonomic levels. It is the level which packs the most information, both in terms of what something is and with respect to what it is not. It is not surprising, therefore, that basic level terms tend to be of high frequency, they are short, and they are learned early in first language acquisition.

The findings also make possible a more sophisticated understanding of prototypes. As noted, basic level categories tend to be contrastive. It is possible, then, to view the prototype as a category member which exhibits the maximum number of features which are typical of the category and which are not shared by members of neighbouring categories. Rosch et al. (1976) in this connection speak of the cue validity of features. A feature has high cue validity to the extent that presence of the feature is a (fairly) reliable predictor of category membership. For example, [having a liver] has almost zero cue validity with respect to the category 'bird'. Although all birds do have a liver, so also do countless other kinds of creature. On the other hand, [having feathers] has very high cue validity, since being feathered is a distinctive characteristic of most, if not all birds, and birds alone. [Being able to fly] has a somewhat lower, but still quite high cue validity, since there are some other creatures (butterflies, wasps, bats) which also fly. [Not being able to fly] would have very low cue validity, not only because it applies to only a few kinds of birds, but because there are countless other kinds of things which do not fly. It is for these reasons that being able to fly, and having feathers, feature in the bird prototype.

Other researchers (e.g. Murphy 2002: 215) have drawn attention to the converse of cue validity, namely, category validity. If cue validity can be defined as the probability of membership in category C, given feature f, i.e. P(C|f), category validity can be defined as the probability that an entity will exhibit feature f, given its membership in C, i.e. P(f|C). The interaction of cue and category validity offers an interesting perspective on inferencing strategies, mentioned in section 1 in connection with the classical theory. (See also article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning.) A person observes that entity e exhibits feature f. If the feature has high cue validity with respect to category C, the person may infer, with some degree of confidence, that e is a member of C. One may then make inferences about e, based on category validity. For example, an entity with feathers is highly likely to be a bird. If it is a bird, it is likely to be able to fly, and much more besides.

The hypothetical category mentioned in section 1 - the category defined by the features [is red], [was manufactured before 1980], and [weighs 8 kg] - exhibits extremely low cue and category validity, and this could be one reason why such a category would never be lexicalized in a human language. The fact that something is red scarcely predicts membership in the category, since there are countless other red things in the universe; likewise for the two other features. The only way to assign membership in the category would be to check off each of the three features. Having done this, one can make no predictions about further properties of the entity. To all intents and purposes, the hypothetical category would be quite useless.
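As a concrete illustration, the two quantities can be estimated from simple co-occurrence counts, as in the following minimal Python sketch (the counts are invented purely for illustration and are not drawn from Rosch's or Murphy's data):

```python
def cue_validity(n_feature_and_category: int, n_feature_total: int) -> float:
    """P(C | f): how reliably feature f predicts membership in category C."""
    return n_feature_and_category / n_feature_total

def category_validity(n_feature_and_category: int, n_category_total: int) -> float:
    """P(f | C): how likely a member of C is to exhibit feature f."""
    return n_feature_and_category / n_category_total

# Toy counts over an imagined sample of entities (figures illustrative only):
n_birds = 100            # members of the category 'bird'
n_feathered = 102        # feathered things (birds plus, say, feather dusters)
n_feathered_birds = 100  # feathered things that are birds
n_with_liver = 5000      # creatures with a liver: birds plus countless others
n_birds_with_liver = 100

# [having feathers]: high cue validity AND high category validity
print(cue_validity(n_feathered_birds, n_feathered))      # ~0.98
print(category_validity(n_feathered_birds, n_birds))     # 1.0

# [having a liver]: near-zero cue validity despite perfect category validity
print(cue_validity(n_birds_with_liver, n_with_liver))    # 0.02
print(category_validity(n_birds_with_liver, n_birds))    # 1.0
```

The sketch makes the asymmetry explicit: both features are true of all birds, but only the feature with high cue validity licenses the inference from feature to category, while category validity licenses the further inferences one may draw once membership is assumed.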
A functional account of prototypes, outlined above, may be contrasted with an account in terms of frequency of occurrence. In response to the question 'Where does prototypicality come from?' (Geeraerts 1988), many people are inclined to say that prototypes (or prototypical instances) are encountered more frequently than more marginal examples and that that is what makes them prototypical. Although frequency of occurrence certainly may be a factor (our prototypical vehicles are now somewhat different from those of 100 years ago, in consequence of changing methods of transportation), it cannot be the
confidential and not at all public property. The status of information as public or private property can be expected to vary according to circumstances and cultural conventions.

5. Prototypes and categories

There are several ways of understanding the notion of prototype and of the relation between a prototype and a category (Taylor 2008). Some of these have been hinted at in the preceding discussion. In this section we examine them in more detail.

5.1. Categories are defined with respect to a 'best example'

On this approach, a category is understood and mentally represented simply in terms of a good example. One understands 'red' in terms of a mental image of a good red, other hues being assimilated to the category in virtue of their similarity to the prototype. Some of Rosch's statements may be taken in support of this view. For example, Heider (1971: 455) surmises that "much actual learning of semantic reference, particularly in perceptual domains, may occur through generalization from focal exemplars". Elsewhere she writes of "conceiving of each category in terms of its clear cases rather than its boundaries" (Rosch 1978: 35-36).

An immediate problem arises with this approach. Any colour can be said to be similar to red in some respect (if only in virtue of its being a colour) and is therefore eligible to be described as red 'to some degree'. In order to avoid this manifestly false prediction we might suppose that the outer limits of category membership will be set by the existence of neighbouring, contrasting categories. As a colour becomes more distant from focal red and approaches focal orange, there comes a point at which it will no longer be possible to categorize it as red, not even to a small degree. The colour is, quite simply, not red. Observe that this account presupposes a structuralist view of lexical semantics, whereby word meanings divide up conceptual space in a mosaic-like manner, such that the denotational range of one term is restricted by the presence of neighbouring terms (Lyons 1977: 260). It predicts (correctly in the case of colours, or at least, basic level colours) that membership will be graded, in that an entity may be judged to be a member of a category only to a certain degree, depending on its distance from the prototype. The category, as a consequence, will have fuzzy boundaries, and degree of membership in one category will inversely correlate with degree of membership in a neighbouring category. The 'redder' a shade of orange, the less it is orange and the more it is red.

There are a small number of categories for which the above account may well be valid, including the household receptacles studied by Labov: as a vessel morphs from a prototypical cup into a prototypical bowl, categorization as cup gradually decreases, offset by increased categorization as bowl. The account may also be valid for scalar concepts such as hot, warm, cool, and cold, where the four terms exhaustively divide up the temperature dimension.

But for a good many categories it will not be possible to maintain that they are understood simply in terms of a prototype, with their boundaries set by neighbouring terms. In the first place, the mosaic metaphor of word meanings may not apply. This is the case with near synonyms, that is, words which arguably have distinct prototypes, but whose usage ranges overlap and which are not obviously contrastive. Take the pair high and tall (Taylor 2003b).
Tall applies prototypically to humans (tall man), high to inanimates (high
mountain). Yet the words do not mutually circumscribe each other at their boundaries. Many entities can be described equally well as tall or high; use of one term does not exclude use of the other. It would be bizarre to say of a mountain that it is high but not tall, or vice versa.

The approach is also problematic in the case of categories which cannot be reduced to values on a continuously varying dimension or on a set of such dimensions. Consider natural kind terms such as bird, mammal, and reptile, or gold, silver, and platinum. Natural kinds are presumed to have a characteristic 'essence', be it genetic, molecular, or whatever. (This said, the category of natural kind terms may not be clear-cut; see Keil 1989. As a rule of thumb, we can say that natural kinds are the kinds of things which scientists study. We can imagine scientists studying the nature of platinum, but not the nature of furniture.) While natural kind categories may well show goodness-of-example effects, they tend to have very precise boundaries. Birds, as we know them, do not morph gradually into mammals (egg-laying monotremes like the platypus notwithstanding), nor can we conceive of a metal which is half-way between gold and silver. And, indeed, it would be absurd to claim that knowledge of the bird prototype (e.g. a small songbird, such as a robin) is all there is to the bird concept, to claim, in other words, that the meaning of bird is 'robin', and that creatures are called birds simply on the basis of their similarity to the prototype. While a duck may be similar to a robin in many respects, we cannot appeal to the similarity as evidence that ducks should be called robins. In the case of categories like 'bird', the prototype is clearly insufficient as a category representation. We need to know what kinds of things are likely to be members of the category, how far we can generalize from the prototype, and where (if only approximately) the boundaries lie. We need an understanding of the category which somehow encompasses all its members.

5.2. The prototype as a set of weighted attributes

In subsequent work Rosch came to a more sophisticated understanding of prototype, proposing that

categories tend to become defined in terms of prototypes or prototypical instances that contain the attributes most representative of items inside and least representative of items outside the category. (Rosch 1978: 30; italics added)

A category now comes to be understood as a set of attributes which are differentially weighted according to their cue validity, that is, their importance in diagnosing category membership; an entity belongs in the category if the cumulative weightings of its attributes achieve a certain threshold level. On this approach, category members need not share the same attributes, nor is an attribute necessarily shared by all category members. Rather, the category hangs together in virtue of a 'family resemblance' (Rosch & Mervis 1975), in which attributes 'criss-cross', like the threads of a rope (Wittgenstein 1978: 32). The more similar an instance is to all other category members (this being a measure of its family resemblance), the more prototypical it is of the category.

A major advantage of the weighted attribute view is that it makes possible a "summary representation" of a category, which, like the classical theory, "somehow encompass[es] an entire concept" (Murphy 2002: 49). As a matter of fact, a classical category would turn out to be a limiting case, in which each of the features has an equal and maximal weighting, and without the presence of each of the features the threshold value would not be attained.
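The weighted attribute view lends itself to a simple computational statement. The following sketch (in Python; the attribute names, weights, and threshold are invented for illustration, not drawn from Rosch's data) scores an entity by summing the cue-validity weights of the attributes it exhibits, and admits it to the category if the score reaches a threshold:

```python
# Graded category membership from weighted attributes (illustrative only).
# Attribute names, weights, and the threshold are invented for the example.
BIRD_ATTRIBUTES = {
    "lays_eggs": 0.9,   # high cue validity
    "has_beak": 0.9,
    "flies": 0.7,
    "small": 0.3,
    "sings": 0.3,       # low cue validity
}
THRESHOLD = 1.8

def membership_score(entity_attributes):
    """Sum the weights of the attributes the entity exhibits."""
    return sum(w for attr, w in BIRD_ATTRIBUTES.items() if attr in entity_attributes)

def is_member(entity_attributes):
    return membership_score(entity_attributes) >= THRESHOLD

robin = {"lays_eggs", "has_beak", "flies", "small", "sings"}
ostrich = {"lays_eggs", "has_beak"}

print(membership_score(robin), is_member(robin))      # 3.1 True: prototypical member
print(membership_score(ostrich), is_member(ostrich))  # 1.8 True: marginal member
```

Note that the two members here share only some attributes, as the family resemblance view requires; and setting every weight equal and the threshold to their total recovers the classical, all-or-nothing category as the limiting case.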
The weighted attribute view raises the interesting possibility that the prototype may not correspond to any actual category member; it is more in the nature of an idealized abstraction. Confirmation comes from work with artificial categories (patterns of dots which deviate to varying degrees from a pre-established prototype), where subjects have been able to identify the prototype of a category they have learned, even though they had not previously been exposed to it (Posner & Keele 1968).

5.3. Categories as exemplars

A radical alternative to feature-based approaches construes a category simply as a collection of instances. Knowledge of a category consists in a memory store of encountered exemplars (Smith & Medin 1981). Categorization of a new instance occurs in virtue of similarities to one or more of the stored exemplars, a prototypical example being one which exhibits the highest degree of similarity with the greatest number of instances. There are several variants of the exemplar view of categories. The exemplars might be individual instances encountered on specific occasions; especially for superordinate categories, on the other hand, the exemplars might be the basic categories which instantiate them (Storms, de Boeck & Ruts 2000). In its purest form, the exemplar theory denies that people make generalizations over category exemplars. Mixed representations might also be envisaged, however, whereby instances which closely resemble each other coalesce into a generic image which preserves what is common to the instances and filters out the idiosyncratic details (Ross & Makin 1999).

On the face of it, the exemplar view, even in its mixed form, looks rather implausible. The idea that we retain specific memories of previously encountered instances would surely make intolerable demands on human memory. Several factors, however, suggest that we should not dismiss the exemplar theory out of hand; indeed, Storms, de Boeck & Ruts (2000) report that the exemplar theory outperforms the summary representation theory, at least with respect to membership in superordinate categories. First, computer simulations have shown that exemplar models are able to account for a surprising range of experimental findings on human categorization, including, importantly, prototype effects (Hintzman 1986). Second, there is evidence that human memory is indeed rich in episodic detail (Schacter 1987). Even such apparently irrelevant aspects of encountered language as the position of a piece of text on a page (Rothkopf 1971), or the voice with which a word is spoken (Goldinger 1996), may be retained over substantial periods of time. Moreover, humans are exquisitely sensitive to the frequency with which events, including linguistic events, have occurred (Ellis 2002). Bybee (2001) has argued that frequency should be recognized as a major determinant of linguistic performance, acceptability judgements, and language change.
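Read computationally, the exemplar view stores instances rather than a summary. The sketch below (Python; the feature vectors and the exponential similarity measure are invented for illustration, not taken from the studies cited) categorizes a new item by its summed similarity to stored exemplars, and treats as most prototypical the stored exemplar that is most similar, on average, to its fellows:

```python
# Exemplar-based categorization (illustrative sketch).
# Exemplars are (width, height) feature vectors with invented values.
import math

EXEMPLARS = {
    "cup":  [(3.0, 3.0), (3.2, 2.8), (2.8, 3.1)],
    "bowl": [(5.0, 2.0), (5.5, 1.8), (4.8, 2.2)],
}

def similarity(a, b, c=1.0):
    """Exponentially decaying similarity, as in many exemplar models."""
    return math.exp(-c * math.dist(a, b))

def categorize(item):
    """Assign the category whose summed exemplar similarity is highest."""
    return max(EXEMPLARS, key=lambda cat: sum(similarity(item, e) for e in EXEMPLARS[cat]))

def prototype(cat):
    """The stored exemplar most similar, on average, to the others."""
    exs = EXEMPLARS[cat]
    return max(exs, key=lambda e: sum(similarity(e, o) for o in exs if o is not e))

print(categorize((4.0, 2.5)))   # a borderline vessel; prints the better-supported category
print(prototype("cup"))         # an emergent 'best example', with no summary stored
```

The design point is that prototype effects fall out of the similarity computation without any summary representation being stored, which is exactly what the simulations cited above demonstrate on a larger scale.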
A focus on exemplars would tie in with the trend towards usage-based models of grammar (Langacker 2000, Tomasello 2003). It is axiomatic, in a usage-based model, that linguistic knowledge is acquired on the basis of encounters with actual usage events. While generalizations may be made over encountered events, the particularities of the events need not thereby be erased from memory (Langacker 1987: 29). Indeed, it is now widely recognized that a great deal of linguistic knowledge must reside in rather particular facts
of natural kind terms. As mentioned earlier, Labov queried the idea that the set of things called cups might possess a defining essence, distinct from the recognition features which allow a person to categorize something as a cup. In this case, the stereotype turns out to be nothing other than the prototype. Not to be forgotten also is the fact that speakers may operate with more than one understanding of a category. The case of 'adult' was already mentioned, where an 'expert' bureaucratic definition might co-exist with a looser, multi-dimensional, and inherently fuzzy understanding of what constitutes an adult.

6.3. Prototypes save: An excuse for lazy lexicographers?

Wierzbicka (1990) maintained that appeal to prototypes is simply an excuse for lazy semanticists to avoid having to formulate rigorous word definitions. Underlying Wierzbicka's position is the view that words are indeed amenable to definitions which are able to predict their full usage range. She offers sample definitions of the loci classici of the prototype literature, including 'game', 'lie', and 'bird'. However, as Geeraerts (1997: 13-16) has aptly remarked, Wierzbicka's definitions often sneak prototype effects in by the back door, as it were. For example, Wierzbicka (1990: 361-362) claims that the ability to fly is part of the bird-concept, in spite of the fact that some birds are flightless. The discrepancy is expressed in terms of how a person would imagine a bird, namely, as a creature able to move in the air, with the proviso that 'some creatures of this kind cannot move in the air'. Far from discrediting prototype structure, Wierzbicka's definition simply incorporates it.

7. Words and the world

Rosch's work addressed the relation between words and the things in the world to which the words can refer. The relation can be studied from two perspectives (Taylor 2007). We can ask, for this word, what are the things which it can be used to refer to? This is the semasiological, or referring, perspective. Alternatively we can ask, for this thing, what are the words that we can use to refer to it? This is the onomasiological, or naming, perspective (Blank 2003). The two perspectives roughly correspond to the way in which dictionaries and thesauri are organized. A dictionary lists words and gives their meanings. A thesaurus lists concepts and gives words which can refer to them.

The two perspectives underlie much research in colour terminology. Consider, for example, the data elicitation techniques employed by MacLaury (1995), which involved three procedures. The first requires subjects to name a series of colour chips presented in random sequence; this procedure elicits the basic colour terms of the language. Next, for each of the colour terms proffered on the naming task, subjects are asked to identify its focal reference on a colour chart; this procedure elicits the prototypes of the colour terms. Third, for each colour term, subjects map the term on the colour chart, indicating which colours could be named by the word; this procedure shows the referential range of the term. MacLaury's research, therefore, combined an onomasiological perspective (from world to word, i.e. "What do you call this?") with a semasiological perspective (from word to world).
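The two perspectives can be pictured as two directions of lookup over the same word-referent pairs. In this toy Python sketch (the chip identifiers and naming data are invented, not MacLaury's), the semasiological question runs from word to things and the onomasiological question from thing to words:

```python
# Two directions of lookup over the same naming data (toy example).
NAMING_DATA = [            # (colour chip, term offered), as in an elicitation task
    ("chip01", "red"), ("chip02", "red"), ("chip03", "orange"),
    ("chip02", "orange"),  # a borderline chip named both ways
]

def semasiological(term):
    """Word -> world: which things can this word refer to? (dictionary direction)"""
    return {chip for chip, t in NAMING_DATA if t == term}

def onomasiological(chip):
    """World -> word: what can this thing be called? (thesaurus direction)"""
    return {t for c, t in NAMING_DATA if c == chip}

print(semasiological("red"))      # {'chip01', 'chip02'}
print(onomasiological("chip02"))  # {'red', 'orange'}
```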
over the hill, the trajector traces a random path which 'covers' the hill. In The board is over the hole, the board completely obscures the hole. In this usage, the verticality of the trajector vis-à-vis the landmark is no longer obligatory: the board could be positioned vertically against the hole.

The examples give the flavour of what came to be known, for obvious reasons, as a radial category. The various senses radiate out from the central, prototypical sense, like spokes in a wheel. This approach to polysemy has proved extremely attractive to many researchers, not least because it lends itself to the visual display of central and derived senses. For a particularly well worked-out example, see Fillmore & Atkins' (2000) account of English crawl in comparison to French ramper. The approach has been seen as a convenient way to handle the fact that the various senses of a word may not share a common definitional core. Just as the various things we call 'furniture' may not exhibit a set of necessary and sufficient features, so also the various senses of a word may resist a definition in terms of an invariant semantic core.

The approach has also been taken up with respect to the semantics of constructions (Goldberg 1995, 2006). Take, for example, the ditransitive [V NP1 NP2] construction in English. Its presumed prototype, illustrated by give the dog a bone, involves the transfer of one entity, NP2, to another, NP1, such that NP1 ends up having NP2. But in throw the dog a bone there is only the intention that NP1 should have NP2; there is no entailment that NP1 does end up having NP2. More distant from the prototypical sense are examples such as deny someone access, where the intention is that NP2 should be withheld from NP1.

In applying the notion of a prototype category to cases of polysemy (whether lexical or constructional), we must be aware of the differences between the two phenomena. On the one hand, we can use the word fruit to refer, firstly, to apples and oranges, but also to olives. Although the word can refer to different kinds of things, the word presumably has a single sense and designates a single category of objects (albeit a prototypically structured category). But when we use the word to refer to the outcome of a person's efforts, as in the fruit of my labours or The project bore fruit, we are using the word in a different sense. The outcome of a person's efforts cannot be regarded as just another marginal example of fruit, akin to an olive or a coconut. Rather, the metaphorical sense has to be regarded as an extension from the botanical sense. Even so, to speak of the two senses as forming a category, and to claim that one of the senses is the prototype, is to use the terms 'category' and 'prototype' also in an extended sense.

In the case of fruit, it is reasonably clear which of the senses is to be taken as basic and which are extensions therefrom. But in other cases a decision may not be so easy. As noted above, for Lakoff and Brugman the central sense of over was movement 'above and across' (The plane flew over the city). For Tyler & Evans (2001), on the other hand, the 'protoscene' of the preposition is exemplified by The bee is hovering over the flower, which lacks the notion of movement 'across'. The question now arises: on what basis is the central sense identified as such?
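A radial category can be pictured as a graph whose edges run from a sense to its extensions. The sketch below (Python; the sense labels and links are a drastic simplification invented for the example, not a faithful rendering of Lakoff & Brugman's or Tyler & Evans' networks) makes the structural point at issue: 'marginal' just means 'far from whatever sense is taken as central', so changing the centre changes the margins:

```python
# A toy radial network for senses of 'over'; the links are illustrative only.
from collections import deque

EXTENSIONS = {             # sense -> senses treated as derived from it
    "above-across": ["above", "covering"],
    "above":        ["control"],      # e.g. 'she has authority over him'
    "covering":     ["completion"],   # illustrative link only
}

def distance_from(centre):
    """Breadth-first distance of every reachable sense from a chosen centre."""
    dist, queue = {centre: 0}, deque([centre])
    while queue:
        s = queue.popleft()
        for ext in EXTENSIONS.get(s, []):
            if ext not in dist:
                dist[ext] = dist[s] + 1
                queue.append(ext)
    return dist

print(distance_from("above-across"))  # margins relative to a Lakoff-style centre
print(distance_from("above"))         # a different centre yields different margins
```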
Whereas Rosch substantiated the prototype notion by a variety of experimental techniques, linguists applying the prototype model to polysemous items appeal (implicitly or explicitly) to a variety of principles, which may sometimes be in conflict. One is descriptive elegance, whereby the prototype is identified as that sense to which the others can most reasonably, or most economically, be related. However, as the example of over demonstrates,
different linguists are liable to come up with different proposals as to what is the central sense. Another principle appeals to the organic growth of the polysemous category, with a historically older sense being taken as more central than senses which have developed later. Relevant here are certain assumptions concerning metaphorical extension (see article 26 (Tyler & Takahashi) Metaphors and metonymies). Thus, Lakoff (1987: 416-417) claims that the spatial sense of long (as in a long stick) is 'more central' than the temporal sense (a long time), on the basis of what is supposed to be a very general conceptual metaphor which maps spatial notions onto non-spatial domains.

A controversial question concerns the psychological reality of radial categories. Experimental evidence, such as it is, suggests that radial categories might actually have very little psychological reality for speakers of the language (Sandra & Rice 1995). One might, for example, suppose that the radial structure represents the outcome of the acquisition process. Data from acquisition studies, however, do not always corroborate the radial analysis. Amongst the earliest uses of over which are acquired by children are uses such as fall over, over here, and all over (i.e. 'finished') (Hallan 2001). These would probably be regarded as marginal senses on just about any radial analysis. It is also legitimate to ask what it would mean, in terms of a speaker's linguistic knowledge, for a particular sense of a word to be 'marginal' or 'non-prototypical'. Both the temporal and the spatial uses of long are frequent, and both have to be mastered by any competent speaker of the language.

In the case of a prototype category (as studied by Rosch) we are dealing with a single sense (with its prototype structure) of a word. In the case of a radial category (as proposed by Lakoff) we are dealing with several senses (each of which will also no doubt have a prototype structure). The distinction is based on whether we are dealing with a single sense of a word or multiple (related) senses. However, the allocation of the various uses of a word to a single sense or to two different senses can be fraught with difficulty, and the various tests for diagnosing the matter can give conflicting results (Geeraerts 1993, Tuggy 1993). For example, do paint a portrait, paint the kitchen, and paint white stripes on the road exemplify a single sense of paint, or two (or perhaps even three) closely related senses? There are arguments for each of these positions. Recently, a number of scholars have queried whether it is legitimate in principle even to try to identify the senses of a word (Allwood 2003, Zlatev 2003).

Perhaps the most reasonable conclusion to draw from the above is that knowing a word involves learning a set (possibly, a very large and open-ended set) of established uses and usage patterns (Taylor 2006). Such an account would be reminiscent of the exemplar theory of categorization, in that a speaker retains memories, not of category members, but of word uses. Whether, or how, a speaker of the language perceives these uses to be related may not have all that much bearing on the speaker's proficiency in the language. The notion of prototype in the Roschean sense might not therefore be all that relevant. The notion of prototype, and extensions therefrom, might, however, be important in the case of novel, or creative, uses. In this connection, Langacker (1987: 381) speaks of 'local' prototypes.
Langacker (1987: 57) characterizes a language as an inventory of conventionalized symbolic resources. Often, the conceptualization that a speaker wishes to symbolize on a particular occasion will not correspond exactly with any of the available resources. Inevitably, some extension of an existing resource will be indicated. The existing resource constitutes the local prototype and the actual usage is an extension from it. If the extension is used on future occasions, it may become entrenched and will
Langacker, Ronald 1987. Foundations of Cognitive Grammar, vol. 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald 2000. A dynamic usage-based model. In: M. Barlow & S. Kemmer (eds.). Usage-Based Models of Language. Stanford, CA: CSLI Publications, 1-63.
Lewandowska-Tomaszczyk, Barbara 2007. Polysemy, prototypes, and radial categories. In: D. Geeraerts & H. Cuyckens (eds.). The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press, 139-169.
Lyons, John 1977. Semantics. Cambridge: Cambridge University Press.
MacLaury, Robert 1991. Prototypes revisited. Annual Review of Anthropology 20, 55-74.
MacLaury, Robert 1995. Vantage theory. In: J. Taylor & R. MacLaury (eds.). Language and the Cognitive Construal of the World. Berlin: Mouton de Gruyter, 231-276.
Majid, Asifa, Melissa Bowerman, Miriam van Staden & James S. Boster 2007. The semantic categories of cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics 18, special issue, 133-152.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English: A Corpus-Based Approach. Oxford: Clarendon Press.
Murphy, Gregory 2002. The Big Book of Concepts. Cambridge, MA: The MIT Press.
Murphy, Gregory & Douglas Medin 1985. The role of theories in conceptual coherence. Psychological Review 92, 289-316.
Osherson, Daniel & Edward E. Smith 1981. On the adequacy of prototype theory as a theory of concepts. Cognition 9, 35-58.
Posner, Michael & Steven Keele 1968. On the genesis of abstract ideas. Journal of Experimental Psychology 77, 353-363.
Pulman, Stephen G. 1983. Word Meaning and Belief. London: Croom Helm.
Putnam, Hilary 1975. Philosophical Papers, vol. 2: Mind, Language and Reality. Cambridge: Cambridge University Press.
Rosch, Eleanor 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104, 192-233.
Rosch, Eleanor 1977. Human categorization. In: N. Warren (ed.). Studies in Cross-Cultural Psychology, vol. 1. London: Academic Press, 3-49.
Rosch, Eleanor 1978. Principles of categorization. In: E. Rosch & B. Lloyd (eds.). Cognition and Categorization. Hillsdale, NJ: Erlbaum, 27-48. Reprinted in: B. Aarts et al. (eds.). Fuzzy Grammar: A Reader. Oxford: Oxford University Press, 2004, 91-108.
Rosch, Eleanor & Carolyn B. Mervis 1975. Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7, 573-605.
Rosch, Eleanor, Carolyn Mervis, Wayne Gray, David Johnson & Penny Boyes-Braem 1976. Basic objects in natural categories. Cognitive Psychology 8, 382-439.
Ross, Brian H. & Valerie S. Makin 1999. Prototype versus exemplar models. In: R. J. Sternberg (ed.). The Nature of Cognition. Cambridge, MA: The MIT Press, 205-241.
Rothkopf, Ernst Z. 1971. Incidental memory for location of information in text. Journal of Verbal Learning and Verbal Behavior 10, 608-613.
Sandra, Dominiek & Sally Rice 1995. Network analysis of prepositional meaning: Mirroring whose mind - the linguist's or the language user's? Cognitive Linguistics 6, 89-130.
Schacter, Daniel 1987. Implicit memory: History and current status. Journal of Experimental Psychology: Learning, Memory, and Cognition 13, 501-518.
Smith, Edward E. & Douglas L. Medin 1981. Categories and Concepts. Cambridge, MA: Harvard University Press.
Storms, Gert, Paul de Boeck & Wim Ruts 2000. Prototype and exemplar-based information in natural language categories. Journal of Memory and Language 42, 51-73.
Sweetser, Eve 1987. The definition of lie:
An examination of the folk models underlying a semantic prototype. In: D. Holland & N. Quinn (eds.). Cultural Models in Language and Thought. Cambridge: Cambridge University Press, 43-66.
frames in providing a principled account of the openness and richness of word-meanings, distinguishing a frame-based account from classical approaches, such as accounts based on conceptual primitives, lexical fields, and connotation, and showing how they can play a role in the account of how word meaning interacts with syntactic valence.

For there exists a great chasm between those, on the one side, who relate everything to a single central vision, one system more or less coherent or articulate, in terms of which they understand, think and feel - a single, universal, organizing principle in terms of which alone all that they are and say has significance - and, on the other side, those who pursue many ends, often unrelated and even contradictory, connected, if at all, only in some de facto way, for some psychological or physiological cause, related by no moral or aesthetic principle. Berlin (1957: 1), cited by Minsky (1975)

1. Introduction

Two properties of word meanings contribute mightily to the difficulty of providing a systematic account. One is the openness of word meanings. The variety of word meanings is the variety of human experience. Consider defining words such as Tuesday, barber, alimony, seminal, amputate, and brittle. One needs to make reference to diverse practices, processes, and objects in the social and physical world: repeatable calendar events, grooming and hair, marriage and divorce, discourse about concepts and theories, and events of breaking. Before this seemingly endless diversity, semanticists have in the past stopped short, excluding it from the semantic enterprise and attempting to draw a line between a small linguistically significant set of primitive concepts and the openness of the lexicon.

The other problem is the closely related problem of the richness of word meanings. Words are hard to define, not so much because they invoke fine content-specific distinctions, but because they invoke vast amounts of background information. The concept of buying presupposes the complex social fact of a commercial transaction. The concept of alimony presupposes the complex social fact of divorce, which in turn presupposes the complex social fact of marriage. Richness, too, has inspired semanticists simply to stop, to draw a line, saying exact definitions of concepts do not matter for theoretical purposes.

This boundary-drawing strategy, providing a response if not an answer to the problems of richness and openness, deserves some comment. As linguistic semanticists, the story goes, our job is to account for systematic, structurally significant properties of meaning. This includes:

(1) a. the kinds of syntactic constructions lexical meanings are compatible with,
       i. the kinds of participants that become subjects and objects
       ii. regular semantic patterns of oblique markings and valence alternations
    b. regular patterns of inference licensed by category, syntactic construction, or closed-class lexical item.

The idea is to carve off that part of semantics necessary for knowing and using the syntactic patterns of the language. To do this sort of work, we do not need to pay attention to every conceptually possible distinction. Instead we need a small set of semantic
depends on the concept of marriage. The dependency is definitional: unless you define what a marriage is, you can't define what a divorce is; unless you define what a divorce is, you can't define what alimony is. Thus there is a very real sense in which the dependencies we are describing move us toward simpler concepts. Notice, however, that the dependency is leading in a different direction than an analysis that decomposes meanings into a small set of primitives like cause and become. Instead of leading to concepts of increasing generality and abstractness, we are being led to define the situations or circumstances which provide the necessary background for the concepts we are describing. The concepts of marriage and divorce are equally specific, but the institution of marriage provides the necessary background for the institution of divorce.

Or consider the complex subject of Tuesdays (Fillmore 1985). We live in a world of cyclic events. Seasons come and go and then return. This leads to a cyclic calendar which divides time up into repeating intervals, which are divided up further. Years are divided into months, which are divided into weeks, which are divided into days, which have cyclic names. Each week has a Sunday, a Monday, a Tuesday, and so on. Defining Tuesday entails defining the notion of a cyclic calendar. Knowing the word Tuesday may not entail knowing the word Sunday, but it does entail understanding at least the concepts of a week and a day and their relation, and that each week has exactly one Tuesday.

We thus have words and background concepts. We will call the background concept the frame. Now the idea of a frame begins to have some lexical semantic bite with the observation that a single concept may provide the background for a set of words. Thus the concept of marriage provides the background for words/suffixes/phrases such as bride, groom, marriage, wedding, divorce, -in-law, elope, fiancée, best man, maid-of-honor, honeymoon, husband, and wife, as well as a variety of basic kinship terms omitted here for reasons of space. The concept of a calendar cycle provides the frame for lexical items such as week, month, year, season, Sunday, ..., Saturday, January, ..., December, day, night, morning, and afternoon. Notice that a concept, once defined, may provide the background frame for further concepts. Thus, divorce itself provides the background frame for lexical items such as alimony, divorce, divorce court, divorce attorney, ex-husband, and ex-wife. In sum, a frame may organize a vocabulary domain:

Borrowing from the language of gestalt psychology we could say that the assumed background of knowledge and practices - the complex frame behind this vocabulary domain - stands as a common ground to the figure representable by any of the individual words. [Words belonging to a frame] are lexical representatives of some single coherent schematization of experience or knowledge. (Fillmore 1985: 223)

Now a premise of Frame Semantics is that the relation between lexical items and frames is open-ended. Thus one way in which the openness of the lexicon manifests itself is in building concepts in unpredictable ways against the backdrop of other concepts. The concept of marriage seems to be universal or near-universal in human culture. The concept of alimony is not.
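The chain of definitional dependencies can be made concrete as a data structure in which each frame points to its background frame, so that understanding a word means walking up that chain. A minimal Python sketch follows (the frame names and lexical groupings abbreviate the examples above; the encoding itself is my assumption, not part of the theory):

```python
# Frames with background links; each frame also lists words it backgrounds.
FRAMES = {
    "marriage": {"background": None,
                 "words": ["bride", "groom", "wedding", "husband", "wife"]},
    "divorce":  {"background": "marriage",
                 "words": ["divorce", "ex-husband", "ex-wife", "divorce court"]},
    "alimony":  {"background": "divorce", "words": ["alimony"]},
}

def background_chain(frame):
    """All frames presupposed by a frame, nearest first."""
    chain = []
    frame = FRAMES[frame]["background"]
    while frame is not None:
        chain.append(frame)
        frame = FRAMES[frame]["background"]
    return chain

print(background_chain("alimony"))  # ['divorce', 'marriage']
```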
No doubt concepts sometimes pop into the lexicon along with their defining frames (perhaps satellite is an example), but the usual case is to build them up out of some existing frame (thus horseless carriage leading to car is the more usual model).
2.2. Basic tools

We have thus far focused on the role of frames in a theory of word meanings. Note that nothing in particular hangs on the notion word. Frames may also have a conventional connection to a simple syntactic construction or idiom; give someone the slip probably belongs to the same frame as elude. Or they may be tied to more complex constructions such as the Comparative Correlative (cf. article 86 (Kay & Michaelis) Constructional meaning):

(7) The more I drink the better you look.

This construction has two "slots" requiring properties of quantity or degree. The same issues of background situation and profiled participants arise whether the linguistic exponent is a word or a construction. The term sign, used in exactly the same sense as it is used by construction grammarians, will serve here as well. As a theory of the conventional association of schematized situations and linguistic exponents, then, Frame Semantics makes the assumption that there is always some background knowledge relative to which linguistic elements do some profiling, and relative to which they are defined. Two ideas are central:

1. a background concept
2. a set of signs including all the words and constructions that utilize this conceptual background.

Two other important frame-theoretic concepts are frame elements and profiling. Thus far in introducing frames I have emphasized what might be called the modularity of knowledge: our knowledge of the world can usefully be divided up into concrete chunks. Equally important to the Fillmorian conception of frames is their integrating function. That is, frames provide us with the means to integrate with other frames in context to produce coherent wholes. For this function, the crucial concept is the notion of a frame element (Fillmore & Baker 2000). A frame element is simply a regular participant, feature, or attribute of the kind of situation described by a frame. Thus, frame elements of the wedding frame will include the husband, wife, wedding ceremony, wedding date, best man, and maid of honor, for example. Frame elements need not be obligatory; one may have a wedding without a best man; but they need to be regular recurring features. Thus, frames have slots, replaceable elements. This means that frames can be linked to other frames by sharing participants or even by being participants in other frames. They can be components of an interpretation.

In Frame Semantics, all word meanings are relativized to frames. But different words select different aspects of the background to profile (we use the terminology in Langacker 1984). Sometimes the aspects profiled by different words are mutually exclusive parts of the circumstances, such as the husband and wife in the marriage frame, but sometimes word meanings differ not in what they profile, but in how they profile it. In such cases, I will say words differ in perspective (Fillmore 1977a). I will use Fillmore's much-discussed commercial event example (Fillmore 1976) to illustrate:
(8) a. John sold the book to Mary for $100.
    b. Mary bought the book from John for $100.
    c. Mary paid John $100 for the book.

Verbs like buy, sell, and pay have as background the concept of a commercial transaction, an event in which a buyer gives money to a seller in exchange for some goods. Now because the transaction is an exchange, it can be thought of as containing what Fillmore calls two subscenes: a goods transfer, in which the goods are transferred from the seller to the buyer, and a money transfer, in which the money is transferred from the buyer to the seller. Here it is natural to say that English has, as a valence realization option for transfers of possession, one in which the object being transferred from one possessor to another is realized as direct object. Thus verbs profiling the money transfer will make the money the direct object (pay and collect) and verbs profiling the goods transfer will make the goods the direct object (buy and sell). The difference between these verb pairs can then be chalked up to what is profiled.

But what about the difference between buy and sell? By hypothesis, both verbs profile a goods transfer, but in one case the buyer is subject and in the other the seller is. Perhaps this is just an arbitrary choice. This is in some sense what the thematic role theory of Dowty (1991) says: since (8a) and (8b) are mutually entailing, there can be no semantic account of the choice of subject. In Frame Semantics, however, we may attempt to describe the facts as follows: in the case of buy the buyer is viewed as (perspectivalized as) agent; in the case of sell, the seller is. There are two advantages to this description. First, it allows us to preserve a principle assumed by a number of linguists, that cross-linguistically agents must be subjects. Second, it allows us to interpret certain adjuncts that enter into special relations with agents: instrumentals, benefactives, and purpose clauses.

(9) a. John bought the book from Mary with/for his last pay check. [Both with and for allow the reading on which the pay check provides the funds for the purchase.]
    b. Mary sold the book to John ?with/for his last paycheck. [Only for allows the reading on which the pay check provides the funds.]
    c. John bought the house from Sue for Mary. [Allows the reading on which Mary is the ultimate owner; disallows the reading on which Mary is the seller and Sue is the seller's agent.]
    d. Sue sold the house to John for Mary. [Allows the reading on which Mary is the seller and Sue is the seller's agent; disallows the reading on which Mary is the ultimate owner.]
    e. John bought the house from Sue to evade taxes/as a tax dodge. [The tax benefit is John's.]
    f. Sue sold the house to John to evade taxes/as a tax dodge. [The tax benefit is Sue's.]

But what does it mean to say that a verb takes a perspective which "views" a particular participant as an agent? The facts are, after all, that both the buyer and the seller are agents; they have all the entailment properties that characterize what we typically call agents; and this, Dowty's theory of thematic roles tells us, is why verbs like buy and sell can co-exist. I will have more to say on this point in section 4; for the moment I will confine myself to the following general observation on what Frame Semantics allows: what is profiled and what is left out is not determined by the entailment facts of its frame.
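The division of labour between a shared frame and per-verb profiling can be stated very directly. The sketch below (Python; the lexical entries are my simplification of the account just given, not FrameNet data) keeps a single commercial-transaction frame and lets each verb contribute only a mapping from frame elements to grammatical functions, including an entry for spend, taken up immediately below, which provides no slot at all for the seller:

```python
# One frame, several profilings: a simplified commercial transaction lexicon.
VERBS = {   # frame element -> grammatical function; simplified from the text
    "buy":  {"buyer": "subject", "goods": "object",
             "seller": "from-oblique", "money": "for-oblique"},
    "sell": {"seller": "subject", "goods": "object",
             "buyer": "to-oblique", "money": "for-oblique"},
    "pay":  {"buyer": "subject", "money": "object",
             "seller": "indirect object", "goods": "for-oblique"},
    "spend": {"buyer": "subject", "money": "object",
              "goods": "on-oblique"},   # note: no mapping for the seller
}

def realize(verb, participants):
    """Map event participants onto grammatical functions for this verb."""
    return {gf: participants[fe]
            for fe, gf in VERBS[verb].items() if fe in participants}

event = {"buyer": "Mary", "seller": "John", "goods": "the book", "money": "$100"}
for v in ("buy", "sell", "spend"):
    print(v, realize(v, event))   # same event, three different realizations
```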
Complex variations are possible. For example, as Fillmore observes, the commercial transaction frame is associated with verbs that have no natural way of realizing the seller:

(10) John spent $100 on that book.

Nothing in the valence marking of the verb spend suggests that what is being profiled here is a possession transfer; neither the double object construction, nor from, nor to is possible for marking a core commercial transaction participant. Rather the pattern seems to be the one available for what one might call resource consumption verbs like waste, lose, use (up), and blow. In this profiling, there is no room for a seller. Given that such variation in what is profiled is allowed, the idea that the agenthood of a participant might be part of what's included or left out does not seem so far-fetched. As I will argue in section 4, the inclusion of events into the semantics can help us make semantic sense of what abstractions like this might mean.

These considerations argue that there can be more than one frame backgrounding a single word meaning; for example, the concepts of commercial event, possession transfer, and agentivity simultaneously define buy. A somewhat different but related issue is the issue of event structure. There is strong evidence cross-linguistically, at least in the form of productive word-formation processes, that some verbs - for example, causatives - represent complex events that can only be expressed through a combination of two frames with a very specific semantics. So it appears that a word meaning can simultaneously invoke a configuration of frames, with particulars of the configuration sometimes spelled out morphologically.

The idea that any word meaning exploits a background is of use in the account of polysemy. Different senses will in general involve relativization to different frames. As a very simple example, consider the use of spend in the following sentence:

(11) John spent 10 minutes fixing his watch.

How are we to describe the relationship of the use of spend in this example, which basically describes a watch-fixing event, with that in (10), which describes a commercial transaction? One way is to say that one sense involves the commercial transaction, and another involves a frame we might call action duration, which relates actions to their duration, a frame that would also be invoked by durative uses of for. A counterproposal is that there is one sense here, which involves an actor using up a resource. But such a proposal runs up against the problem that spend really has rather odd disjunctive selection restrictions:

(12) John spent 30 packs of cigarettes that afternoon.

Sentence (12) is odd except perhaps in a context (such as a prison or boarding school) where cigarette packs have become a fungible medium of exchange; what it cannot mean is that John simply used up the cigarettes (by smoking them, for example). The point is that a single general resource consumption meaning ought to freely allow resources other than time and money, so a single resource consumption sense does not correctly describe the readings available for (12); however, a sense invoking a commercial transaction frame constrained to very specific circumstances does. Note also that the fact
that 30 packs of cigarettes can be the money participant in the right context is naturally accommodated. The right constraint on the money participant is not that it be cash (for which Visa and Mastercard can be thankful), but that it be a fungible medium of exchange.

Summarizing:
1. Frames are motivated primarily by issues of understanding and converge with various schema-like conceptions advanced by cognitive psychologists, AI researchers, and cognitive linguists. They are experientially coherent backgrounds with variable components that allow us to organize families of concepts.
2. The concept of frames has far-reaching consequences when applied to lexical semantics, because a single frame can provide the organizing background for a set of words. Thus frames can provide an organizing principle for a rich open lexicon. FrameNet is an embodiment of these ideas.
3. In proposing an account of lexical semantics rich enough for a theory of understanding, Frame Semantics converges with other lexical semantic research which has been bringing a richer set of concepts to bear on problems of the syntax-semantics interface.

Having sketched the basic idea, I want in the next two sections to briefly contrast the notion frame with two other ideas that have played a major role in semantics: the idea of a relation, as incorporated via set theory and predicate logic into semantics, and the idea of a lexical field.

3. Related conceptions

In this section I compare the idea of frames with two other concepts of major importance in theories of lexical semantics, relations and lexical fields. The comparison offers the opportunity to develop some other key ideas of Frame Semantics, including profiling and saliency.

3.1. Frames versus relations: profiling and saliency

Words (most verbs, some nouns, arguably all degreeable adjectives) describe relations in the world. Love and hate are relations between animate experiencers and objects. The verb believe describes a relation between an animate experiencer and a proposition. These are commonplace views among philosophers of language, semanticists, and syntacticians, and they have provided the basis for much fruitful work. Where do frames fit in?

For Fillmore, frames describe the factual basis for relations. In this sense they are "pre-"relational. To illustrate, Fillmore (1985) cites Mill's (1847) discussion of the words father and son. Although there is a single history of events which establishes both the father- and the son-relation, the words father and son pick out different entities in the world. In Mill's terminology, the words denote different things, but connote a single thing, the shared history. This history, which Mill calls the fundamentum relationis (the foundation of the relation), determines that the two relations bear a fixed structural relation to
each other. It is the idea of a determinate structure for a set of relations that Fillmore likens to the idea of a frame. Thus, a frame defines not a single relation but, minimally, a structured set of relations. This conception allows for a natural description not just of pairs of words like father and son, but also of single words which do not in fact settle on a particular relation. Consider the verb risk, discussed in Fillmore & Atkins (1998), which seems to allow a range of participants into a single grammatical "slot". For example:

(13) Joan risked
     a. censure.
     b. her car.
     c. a trip down the advanced ski slope.

The risk frame has at least three distinct participants: (a) the bad thing that may happen, (b) the valued thing that may be lost, and (c) the activity that may cause the bad thing to happen. All can be realized in the direct object position, as (13) shows. Since there are three distinct relations here, a theory that identifies lexical meanings with relations needs to say there are three meanings as well. Frame Semantics would describe this as one frame allowing three distinct profilings. It is the structure of the frame, together with the profiling options the language makes available, which makes the three alternatives possible. Other verbs with a similar indeterminacy of participant are copy, collide, and mix:

(14) a. Sue copied her costume (from a film poster).
     b. Sue copied the film poster.
     c. The truck and the car collided.
     d. The truck collided with the car.
     e. John mixed the soup.
     f. John mixed the paste into the soup.
     g. John mixed the paste and the flour.

In each of these cases the natural Frame Semantics account would be to say that the frame remains constant while the profilings or perspective changes. Thus, under a Frame Semantics approach, verbal valence alternations are to be expected, and the possibility of such alternations provides motivation for the idea of a background frame with a range of participants and a range of profiling options. Now on a theory in which senses are relations, all the verbs in (14) must have different senses. This is, for example, because the arguments in (14a) and (14b) fill different roles. Frame Semantics allows another option: we can say the same verb sense is used in both cases. The differences in interpretation arise because of differences in profiling and perspectivalization.

3.2. Frames versus lexical fields

Because frames define lexical sets, it is useful to contrast the concept of frames with an earlier body of lexical semantic work which takes as central the identification of lexical sets. This work develops the idea of lexical fields (Weisgerber 1962; Coseriu 1967; Trier 1971; Geckeler 1971; Lehrer & Kittay 1992). Lexical fields define sets of lexical items in mutually defining relations, in other words, lexical semantic paradigms. The
that preoccupied lexical field theorists: a specialization of height that includes just the polar adjectives, a specialization of temperature that includes just the set cold, cool, warm, hot, and so on. This level of specificity in fact roughly describes the granularity of FrameNet.

3.3. Minskian frames

As described in Fillmore (1982), the term frame is borrowed from Marvin Minsky. It will be useful, before tackling the question of how profiling and perspectivalization work, to take a closer look at this precursor. In Minsky's original frames paper (Minsky 1975), frames were put forth as a solution to the problem of scene interpretation in vision. Minsky's proposal was in reaction to those who, like the Gestalt theorists (Koffka 1963), viewed scene perception as a single holistic process governed by principles similar to those at work in electric fields. Minsky thought scenes were assembled in independent chunks, constituent by constituent, in a series of steps involving interpretation and integration. To describe this process, a model was needed that factored the visual field into a number of discrete chunks, each with its own model of change with its own discrete phases. A frame was thus a dynamic model of some specific kind of object with specific participants and parameters. The model had built-in expectations about ways in which the object could change, either in time or as a viewer's perspective on it changed, formalized as operations mapping old frame states to new frame states. A frame also included a set of participants whose status changed under these operations; those moving into certain distinguished slots are foregrounded.

Thus, for example, in the simplified version of Minsky's cube frame, shown before and after a rotation in Figs. 29.1 and 29.2, a frame state encodes a particular view of a cube and the participants are cube faces. One possible operation is a rotation of the cube, defined to place new faces in certain view-slots, and to move old faces out, possibly out of view. The faces that end up in view are the foregrounded participants of the resulting frame state. Thus the cube frame offers the tools for representing particular views or perspectives on a cube, together with the operations that may connect them in time.

Fig. 29.1: View of cube together with simplified cube frame representing that view. Links marked "fg" lead to foregrounded slots; slots marked "invis" are backgrounded. Faces D and C are out of view.
The connection between an argument frame like agent patient and simple circumstance frames can be illustrated through the example of the possession transfer frame (related to verbs like give, get, take, receive, acquire, bequeath, loan, and so on). Represented as an AVM, this is:

(19) POSSESSION TRANSFER
     donor       animate
     possession  entity
     recipient   animate

Now both give and acquire will be defined in terms of the possession transfer frame, but give and acquire differ in that with give the donor becomes subject and with acquire the recipient does. (Compare the difference between buy and sell discussed in section 2.2.) We will account for this difference by saying that give and acquire have different mappings from the agent patient frame to their shared circumstance frame (possession transfer). This works as follows.

We define the relation between a circumstance and an argument frame via a perspectivalizing function. Here are the axioms for what we will call the acquisition function, on which the recipient is agent:

(20) a. acquisition : possession_transfer → agent_patient
     b. agent ∘ acquisition = recipient
     c. patient ∘ acquisition = possession
     d. source ∘ acquisition = donor

The first line defines acquisition as a mapping from the sort possession_transfer to the sort agent_patient, that is, as a mapping from possession transfer eventualities to agent patient eventualities. The mapping is total; that is, each possession transfer is guaranteed to have an agent patient eventuality associated with it. In the second line, the symbol ∘ stands for function composition; the composition of the agent function with the acquisition function (written agent ∘ acquisition) is the same function (extensionally) as the recipient relation. Thus the filler of the recipient role in a possession transfer must be the same as the filler of the agent role in the associated agent patient eventuality. And so on, for the other axioms. Summing up, AVM style:

(21) POSSESSION TRANSFER          AGENT PATIENT
     donor       [1]              agent    [2]
     recipient   [2]              source   [1]
     possession  [3]              patient  [3]
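Read procedurally, the axioms in (20) say that acquisition builds an agent patient eventuality from a possession transfer by systematic role renaming. A minimal Python sketch follows (a toy model in which eventualities are records; the record encoding is my assumption for illustration, not part of the formal theory):

```python
# A toy model of the acquisition mapping in (20)/(21).
def acquisition(pt):
    """possession_transfer -> agent_patient: the recipient is viewed as agent."""
    return {"agent": pt["recipient"],     # agent o acquisition = recipient
            "source": pt["donor"],        # source o acquisition = donor
            "patient": pt["possession"]}  # patient o acquisition = possession

def agent(ap):
    return ap["agent"]

transfer = {"donor": "John", "recipient": "Mary", "possession": "the book"}
assert agent(acquisition(transfer)) == transfer["recipient"]  # axiom (20b)
print(acquisition(transfer))
```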
I will call the mapping that makes the donor agent donation:

(22) POSSESSION TRANSFER          AGENT PATIENT
     donor       [1]              agent    [1]
     recipient   [2]              goal     [2]
     possession  [3]              patient  [3]

With the acquisition and donation mappings defined, the predicates give and acquire can be defined as compositions with donation and acquisition:

     give = possession_transfer ∘ donation⁻¹
     acquire = possession_transfer ∘ acquisition⁻¹

donation⁻¹ is an inverse of donation, a function from agent patient eventualities to possession transfers defined only for those agent patient events related to possession transfers. Composing this with the possession transfer predicate makes give a predicate true of those agent patient events related to possession transfers, whose agents are donors and whose patients are possessions. The treatment of acquire is parallel but uses the acquisition mapping. For more extensive discussion, see Gawron (2008).

Summarizing, we have:
a. an argument frame agent patient, with direct consequences for syntactic valence (agents become subject, patients direct object, and so on);
b. a circumstance frame possession transfer, which captures the circumstances of possession transfer;
c. perspectivalizing functions acquisition and donation which map participants in the circumstances to argument structure.

This is the basic picture of perspectivalization. The picture becomes more interesting with a richer example. In the discussion that follows, I assume a commercial transaction frame with at least the following frame elements:

(23) COMMERCIAL TRANSACTION
     buyer    animate
     seller   animate
     money    fungible
     goods    entity

This is a declaration that various functions from event sorts to truth values and entity sorts exist, a rather austere model for the sort of rich backgrounding function we have assumed for frames. We will see how this model is enriched below. Our picture of profiling and perspectivalization can be extended to the more complex cases of commercial transaction predicates with one more composition. For example, we may define buy' as follows:
Fig. 29.3: Left: Lexical network for commercial transaction. Right: Same network with the perspectivalization chosen by buy in the boxed area.

do ... cause become ... in Dowty's system attributes a fixed structure to causatives of inchoatives. As these examples show, a frame may have subscene roles which carve up its constituents in incompatible ways.

Now this may seem peculiar. Shouldn't the roles of a frame define a fixed relation between disjoint entities? I submit that the answer is no. The roles associated with each sort of event are regularities that help us classify an event as of that sort. But such functions are not guaranteed to carve up each event into non-overlapping, hierarchically structured parts. Sometimes distinct roles may select overlapping constituents of events, particularly when independent individuation criteria are not decisive, as when the constituents are collectives, or shapeless globs of stuff, or abstract things such as events or event types. Thus we get the cases discussed above like collide, mix, and risk, where different ways of profiling the frames give us distinct, incompatible sets of roles. We may choose to view the colliders as a single collective entity (X and Y collided), or as two (X collided with Y). We may choose to separate a figure from a ground in the mixing event (14f), or lump them together (mix X and Y), or just view the mixed substance as one (14e). Finally, risks involve an action (13c) and a potential bad consequence (13a), and, for a restricted set of cases in which that bad consequence is a loss, a lost thing (13b).

What of relations? Formally, in this frame-based picture, we have replaced relations with event predicates, each of which is defined through some composed set of mappings to a set of events that will be defined only for some fixed set of roles. Clearly, for every lexical predicate, there is a corresponding relation, namely one defined for exactly the same set of roles as the predicate. Thus in the end the description of the kind of lexical semantic entity which interfaces with the combinatorial semantics is not very different.
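The point that roles need not carve an event into disjoint parts can also be made concrete. In this Python sketch (the participants and groupings are invented for illustration, loosely following the mix examples in (14)), one and the same mixing event supports two profilings: one separating figure and ground, one lumping them into a single collective filler, so the two role sets select overlapping constituents:

```python
# One event, two incompatible role profilings (cf. the mix examples in (14)).
EVENT = {"stuffs": ["the paste", "the soup"], "mixer": "John"}

def profile_figure_ground(event):
    """'John mixed the paste into the soup': figure vs. ground."""
    figure, ground = event["stuffs"]
    return {"agent": event["mixer"], "figure": figure, "ground": ground}

def profile_collective(event):
    """'John mixed the paste and the soup': one collective patient."""
    return {"agent": event["mixer"], "patient": set(event["stuffs"])}

fg = profile_figure_ground(EVENT)
coll = profile_collective(EVENT)
# The two profilings select overlapping constituents of the same event:
assert {fg["figure"], fg["ground"]} == coll["patient"]
print(fg)
print(coll)
```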
(27) a. The glass was hurled against the wall and broke.
     b. The cockroach was hurled against the wall and died.

Thus the defaults at play in determining matters of "valence" differ from those in discourse. We can at least describe the contrast in (26) - not explain it - by saying that movement is an optional component of the breaking frame through which the denotation of the verb break is defined, and not a component of the killing frame; or, in terms of the formal picture of section 4: within the conventional lexical network linking frames in English there is a partial function from breaking events to movement subscenes; there is no such function for killing events.

Contrast this with Fillmore's (1985: 232) example, discussed in section 2.1:

(28) We never open our presents until morning.

The point of this example was that it evoked Christmas without containing a single word specific to Christmas. How might an automatic interpretation system simulate what is going on for human understanders? Presumably by a kind of application of Occam's razor: there is one and only one frame that explains both the presence of presents and the custom of waiting until morning, and that is the Christmas frame. Thus the assumption that gets us the most narrative bang for the buck is Christmas. In this case the frame has to be evoked by dynamically assembling pieces of information activated in this piece of discourse.

These two examples show that frames will function differently in a theory of discourse understanding than they do in a theory of sign-meanings in at least two ways. They will require a different notion of default, and they will need to resort to different inferencing strategies, such as inference to the most economical explanation.

7. Conclusion

The logical notion of a relation, which preserves certain aspects of the linearization syntax forces on us, has at times appeared to offer an attractive account of what we grasp when we grasp sign meanings. But the data we have been looking at in this brief excursion into Frame Semantics point another way. Lexical senses seem to be tied to the same kind of schemata that organize our perceptions and interpretations of the social and physical world. In these schemata participants are neither linearized nor uniquely individuated, and the mapping into the linearized regime of syntax is constrained but underdetermined. We see words with options in what their exact participants are and how they are realized. Frames offer a model that is both specific enough and flexible enough to accommodate these facts, while offering the promise of a firm grounding for lexicographic description and an account of text understanding.

8. References

Asher, Nicholas & Alex Lascarides 1995. Lexical disambiguation in a discourse context. Journal of Semantics 12, 69-108.
Baker, Collin & Charles J. Fillmore 2001. Frame Semantics for text understanding. In: Proceedings of the WordNet and Other Lexical Resources Workshop. Pittsburgh, PA: North American Association of Computational Linguistics, 3-4.
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6, 222-254.
Fillmore, Charles J. & B.T.S. Atkins 1994. Starting where the dictionaries stop: The challenge for computational lexicography. In: B.T.S. Atkins & A. Zampolli (eds.). Computational Approaches to the Lexicon. Oxford: Clarendon Press.
Fillmore, Charles J. & B.T.S. Atkins 1998. FrameNet and lexicographic relevance. In: Proceedings of the First International Conference on Language Resources and Evaluation. Granada: International Committee on Computational Linguistics, 28-30.
Fillmore, Charles J. & Collin Baker 2000. FrameNet. http://www.icsi.berkeley.edu/~framenet, October 13, 2008.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. New York: Garland.
Gawron, Jean Mark 2008. Circumstances and Perspective: The Logic of Argument Structure. http://repositories.cdlib.org/ucsdling/sdlp3/1, October 13, 2008.
Geckeler, Horst 1971. Strukturelle Semantik und Wortfeldtheorie. München: Fink.
Hobbs, Jerry R., Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69-142.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Koffka, Kurt 1963. Principles of Gestalt Psychology. New York: Harcourt, Brace, and World.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Reidel, 545-665.
Lakoff, George 1983. Category: An essay in cognitive linguistics. In: Linguistic Society of Korea (ed.). Linguistics in the Morning Calm. Seoul: Hanshin, 139-194.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of Chicago Press.
Langacker, Ronald 1984. Active zones. In: C. Brugman et al. (eds.). Proceedings of the Tenth Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 172-188.
Lehrer, Adrienne & Eva Kittay (eds.) 1992. Frames, Fields, and Contrasts: New Essays in Semantic and Lexical Organization. Hillsdale, NJ: Lawrence Erlbaum.
Levin, Beth 1993. English Verb Classes and Alternations. Chicago, IL: The University of Chicago Press.
Mill, John Stuart 1847. A System of Logic. New York: Harper and Brothers.
Minsky, Marvin 1975. A framework for representing knowledge. In: P. Winston (ed.). The Psychology of Computer Vision. New York: McGraw-Hill, 211-277. http://web.media.mit.edu/~minsky/papers/Frames/frames.html, October 13, 2008.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA: The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA: The MIT Press.
Rounds, William C. 1997. Feature logics. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 475-533.
Rumelhart, Donald 1980. Schemata: The building blocks of cognition. In: R. J. Spiro, B. C. Bruce & W. F. Brewer (eds.). Theoretical Issues in Reading Comprehension. Hillsdale, NJ: Lawrence Erlbaum, 33-58.
Schank, Roger C. & R. P. Abelson 1977. Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum.
Smolka, Gert 1992. Feature constraint logics for unification grammars. Journal of Logic Programming 12, 51-87.
Trier, Jost 1971. Aufsätze und Vorträge zur Wortfeldtheorie. The Hague: Mouton.
Weisgerber, J. Leo 1962. Grundzüge der inhaltsbezogenen Grammatik. Düsseldorf: Schwann.

Jean Mark Gawron, San Diego, CA (USA)
30. Conceptual Semantics

1. Overall framework
2. Major features of Conceptual Structure
3. Compositionality
4. References

Abstract
Conceptual Semantics takes the meanings of words and sentences to be structures in the minds of language users, and it takes phrases to refer not to the world per se, but rather to the world as conceptualized by language users. It therefore takes seriously constraints on a theory of meaning coming from the cognitive structure of human concepts, from the need to learn words, and from the connection between meaning, perception, action, and nonlinguistic thought. The theory treats meanings, like phonological structures, as articulated into substructures or tiers: a division into an algebraic Conceptual Structure and a geometric/topological Spatial Structure; a division of the former into Propositional Structure and Information Structure; and possibly a division of Propositional Structure into a descriptive tier and a referential tier. All of these structures contribute to word, phrase, and sentence meanings. The ontology of Conceptual Semantics is richer than in most approaches, including not only individuals and events but also locations, trajectories, manners, distances, and other basic categories. Word meanings are decomposed into functions and features, but some of the features and connectives among them do not lend themselves to standard definitions in terms of necessary and sufficient conditions. Phrase and sentence meanings are compositional, but not in the strict Fregean sense: many aspects of meaning are conveyed through coercion, ellipsis, and constructional meaning.

1. Overall framework

Conceptual Semantics is a formal approach to natural language meaning developed by Jackendoff (1983, 1987, 1990, 2002, 2007) and Pinker (1989, 2007); Pustejovsky (1995) has also been influential in its development. The approach can be characterized at two somewhat independent levels. The first is the overall framework for the theory of meaning, and how this framework is integrated into linguistics, philosophy of language, and cognitive science (section 1). The second is the formal machinery that has been developed to achieve the goals of this framework (sections 2 and 3). These two are somewhat independent: the general framework might be realized in terms of other formal approaches, and many aspects of the formal machinery can be deployed within other frameworks for studying meaning.

The fundamental goal of Conceptual Semantics is to describe how humans express their understanding of the world by means of linguistic utterances. From this goal flow two theoretical commitments. First, linguistic meaning is to be described in mentalistic/psychological terms - and eventually in neuroscientific terms. The theory of meaning, like the theories of generative syntax and phonology, is taken to be about what is going
sentence meanings. Besides differences in the phenomena that it focuses on, Cognitive Grammar differs from Conceptual Semantics in four respects. First, the style of its formalization is different and arguably less rigorous. Second, it tends to connect to nonlinguistic phenomena through theories of embodied cognition rather than through more standard cognitive neuroscience. Third (and related), many practitioners of Cognitive Grammar take a rather empiricist (or Lockean) view of learning, whereas Conceptual Semantics admits a considerable structured innate (or Kantian) basis to concept formation. Fourth, Conceptual Semantics is committed to syntax having a certain degree of independence from semantics, whereas Cognitive Grammar seeks to explain all aspects of syntactic form in terms of the meaning(s) expressed (see article 86 (Kay & Michaelis) Constructional meaning).

1.3. Conceptual Structure and Spatial Structure; interfaces with syntax and phonology

The central hypothesis of Conceptual Semantics is that there is a level of mental representation, Conceptual Structure, which instantiates sentence meanings and serves as the formal basis for inference and for connection with world knowledge and perception. The overall architecture of the mind in which this is embedded is shown in Fig. 30.1.

[Fig. 30.1: Architecture of the mind. Phonology ↔ Syntax ↔ Conceptual Structure ↔ Spatial Structure; Phonology also connects to audition and to motor instructions to the vocal tract (LANGUAGE); Spatial Structure also connects to Vision, Audition, Haptic perception, Proprioception, and Action representations (COGNITION, PERCEPTION/ACTION).]

Each of the levels in Fig. 30.1 is a generative system with its own primitives and principles of combination. The arrows indicate interfaces among representations: sets of principles that provide systematic mappings from one level to the other. On the left-hand side are the familiar linguistic levels and their interfaces to hearing and speaking. On the right-hand side are nonlinguistic connections to the world through visual, haptic, and proprioceptive perception, and through the formulation of action. (One could add general-purpose audition, smell, and taste as well.) In the middle lies cognition, here instantiated as the levels of Conceptual Structure and Spatial Structure.

Spatial Structure is hypothesized (Jackendoff 1987, 1996a; Landau & Jackendoff 1993) as a geometric/topological encoding of 3-dimensional object shape, spatial layout, motion, and possibly force. It is not a strictly visual representation, because these features of the conceptualized physical world can also be derived by touch (the haptic sense), by proprioception (the spatial position of one's own body), and to some degree by auditory localization. Spatial Structure is the medium in which these disparate perceptual modalities are integrated. It also serves as input for formulating one's own physical actions. It is moreover the form in which memory for the shape of familiar objects and object categories is stored. This level must be generative, since one encounters, remembers, and acts in relation to an indefinitely large number of
objects and spatial configurations in the course of life (see article 107 (Landau) Space in semantics and cognition).

However, it is impossible to encode all aspects of cognition in geometric/topological terms. A memory for a shape must also encode the standard type/token distinction, i.e. whether this is the shape of a particular object or of an object category. Moreover, the taxonomy of categories is not necessarily a taxonomy of shapes. For instance, forks and chairs are not at all similar in shape, but both are artifacts; there is no general characterization of artifacts in terms of shape or the spatial character of the actions one performs with them. Likewise, the distinction between familiar and unfamiliar objects and actions cannot be characterized in terms of shape. Finally, social relations such as kinship, alliance, enmity, dominance, possession, and reciprocation cannot be formulated in geometric terms. All of these aspects of cognition instead lend themselves to an algebraic encoding in terms of features (binary or multi-valued, e.g. TYPE vs. TOKEN) and functions of one or more arguments (e.g. x INSTANCE-OF y, x SUBCATEGORY-OF y, x KIN-OF y). This system of algebraic features and functions constitutes Conceptual Structure. Note that at least some of the distinctions just listed, in particular the social relations, are also made by nonhuman primates. Thus a description of primate nonlinguistic cognition requires some form of Conceptual Structure, though doubtless far less rich than in the human case.

Conceptual Structure too has its limitations. Attempts to formalize meaning and reasoning have always had to confront the impossibility of coding perceptual characteristics such as color, shape, texture, and manner of motion in purely algebraic terms. The architecture in Fig. 30.1 proposes to overcome this difficulty by sharing the work of encoding meaning between the geometric format of Spatial Structure and the algebraic format of Conceptual Structure: spatial concepts have complementary representations in both domains. (In a sense, this is a more sophisticated version of Paivio's 1971 dual-coding hypothesis and the view of mental imagery espoused by Kosslyn 1980.)

Turning now to the interfaces between meaning and language, a standard assumption in both standard logic and mainstream generative grammar is that semantics interfaces exclusively with syntax, and in fact that the function of the syntax-semantics interface is to determine meaning in one-to-one fashion from syntactic structure (this is made explicit in article 81 (Harley) Semantics in Distributed Morphology). The assumption, rarely made explicit (but going back at least to Descartes), is that combinatorial thought is possible only through the use of combinatorial language. This comports with the view, common well into the 20th century, that animals are incapable of thought. Modern cognitive ethology (Hauser 2000; Cheney & Seyfarth 2007) decisively refutes this view, and with it the assumption that syntax is the source of combinatorial thought. Conceptual Semantics is rooted instead in the intuition that language is combinatorial because it evolved to express a pre-existing combinatorial faculty of thought (condition C6). Under this approach, it is quite natural to expect Conceptual Structure to be far richer than syntactic structure - as indeed it is.
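To make this algebraic mode of encoding concrete, the following sketch (in Python; the feature values, relation names, and data structures are invented for illustration and are not part of Jackendoff's formalism) shows how taxonomic and social relations can be stated without any appeal to shape:

```python
# Illustrative sketch: Conceptual Structure as algebraic features and
# functions. The inventory below (TYPE/TOKEN, INSTANCE-OF, SUBCATEGORY-OF,
# KIN-OF) follows the text's examples; everything else is an assumption.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    name: str
    features: frozenset = field(default_factory=frozenset)  # e.g. {"TYPE"} or {"TOKEN"}

# Two-place conceptual functions, encoded as relation sets rather than shapes:
INSTANCE_OF = set()       # x INSTANCE-OF y: token x instantiates category y
SUBCATEGORY_OF = set()    # x SUBCATEGORY-OF y: taxonomic relation between categories
KIN_OF = set()            # x KIN-OF y: a social relation, not a spatial one

fork = Entity("fork", frozenset({"TYPE"}))
chair = Entity("chair", frozenset({"TYPE"}))
artifact = Entity("artifact", frozenset({"TYPE"}))
my_fork = Entity("my_fork", frozenset({"TOKEN"}))

SUBCATEGORY_OF |= {(fork, artifact), (chair, artifact)}
INSTANCE_OF |= {(my_fork, fork)}

# Forks and chairs share no shape, but the algebraic encoding captures
# their common category membership without reference to Spatial Structure:
assert (fork, artifact) in SUBCATEGORY_OF and (chair, artifact) in SUBCATEGORY_OF
```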
Culicover & Jackendoff (2005) argue that the increasing complexity and abstraction of the structures posited by mainstream generative syntax up to and including the Minimalist Program (Chomsky 1995) have been motivated above all by the desire to encode all semantic relations overtly or covertly in syntactic structure. Ultimately the attempt fails because semantic relations are too rich and multidimensional to be encoded in terms of purely syntactic mechanisms.
In Conceptual Semantics, a word is regarded as a part of the language/thought interface: it is a long-term memory association of a piece of phonological structure (e.g. /kæt/), some syntactic features (singular count noun), and a piece of Conceptual Structure (FELINE ANIMAL, PET, etc.). If it is a word for a concept involving physical space, it also includes a piece of Spatial Structure (what cats look like). Other interface principles establish correspondences between semantic argument structure (e.g. what characters an action involves) and syntactic argument structure (e.g. transitivity), between scope of quantification in semantics and position of quantifiers in syntax, and between topic and focus in semantics (information structure, see article 71 (Hinterwimmer) Information structure) and affixation and/or position in syntax. In order to deal with situations where topic and focus are coded only in terms of stress and intonation (e.g. The dog CHASED the mailman), the theory offers the possibility of a further interface that establishes a correspondence directly between semantics and phonology, bypassing syntax altogether.

If it proves necessary to posit an additional level of "linguistic semantic structure" that is devoted specifically to features relevant for grammatical expression (as posited in article 31 (Lang & Maienborn) Two-level Semantics, article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure and article 81 (Harley) Semantics in Distributed Morphology), such a level would be inserted between Conceptual Structure and syntax, with interfaces to both. Of course, in order for language to be understood and to play a role in inference, it is still necessary for words to bridge all the way from phonology to Conceptual Structure and Spatial Structure - they cannot stop at the putative level of linguistic semantic structure. Hence the addition of such an extra component would not at all change the content of Conceptual Structure, which is necessary to drive inference and the connection to perception. (Jackendoff 2002, section 9.7, argues that in fact such an extra component is unnecessary.)

2. Major features of Conceptual Structure

This section sketches some of the important features of Conceptual Structure (henceforth CS). There is no space here to spell out formal details; the reader is referred to Jackendoff (1983; 2002, chapters 11-12).

2.1. Tiers in CS

A major advance in phonological theory was the realization that phonological structure is not a single formal object, but rather a collection of tiers, each with its own formal organization, which divide up the work of phonology into a number of independent but correlated domains. These include at least segmental and syllabic structure; the amalgamation of syllables into larger domains such as feet, phonological words, and intonational phrases; the metrical grid that assigns stress; and the structure of intonation contours correlated with prosodic domains. A parallel innovation is proposed within Conceptual Semantics.

The first division, already described, is into Spatial Structure and Conceptual Structure. Within CS, the clearest division is into Propositional Structure, a function-argument encoding of who did what to whom, how, where, and when (arguments and modifiers), versus Information Structure, the encoding of Topic, Focus, and Common Ground. These two aspects of meaning are orthogonal, in that virtually any constituent of a clause, with any thematic
or modifying role, can function as Topic or Focus or part of Common Ground. Languages typically use different grammatical machinery for expressing these two aspects of meaning. For instance, roles in Propositional Structure are typically expressed (morpho-)syntactically in terms of position and/or case with respect to a head. Roles in Information Structure are typically expressed by special focusing constructions, by special focusing affixes, by special topic and focus positions that override propositional roles, and above all by stress and intonation - which is never used to mark propositional roles. Both the syntactic and the semantic phenomena suggest that Propositional and Information Structure are orthogonal but linked organizations of the semantic material in a sentence.

More controversially, Jackendoff (2002) proposes segregating Propositional Structure into two tiers, a descriptive tier and a referential tier. The former expresses the hierarchical arrangement of functions, arguments, and modifiers. The latter expresses the sentence's referential commitments to each of the characters and events, and the binding relations among them; it is a dependency graph along the lines of Discourse Representation Theory (see article 37 (Kamp & Reyle) Discourse Representation Theory). The idea behind this extra tier is that such issues as anaphora, quantification, specificity, and referential opacity are in many respects orthogonal to who is performing the action and who the action is being performed on. The canonical grammatical structures of language typically mirror the latter rather closely: the relative embedding of syntactic constituents reflects the relative embedding of arguments and modifiers. On the other hand, scope of quantification, specificity, and opacity are not at all canonically expressed in the surface of natural languages; this is why theories of quantification typically invoke something like "quantifier raising" to relate surface position to scope. The result is a semantic structure in which the referential commitments are on the outside of the expression, and the thematic structure remains deeply embedded inside, its arguments bound to quantifiers outside. Dividing the expressive work into descriptive and referential tiers helps clarify the resulting notational logjam.

The division into descriptive and referential tiers also permits an insightful account of two kinds of anaphora. Standard definite anaphora, as in (1a), is anaphoric on the referential tier and indicates coreference. One-anaphora, as in (1b), however, is anaphoric on the descriptive tier and indicates a different individual with the same description.

(1) a. Bill saw a balloon and I saw it too.
    b. Bill saw a balloon and I saw one too.

2.2. Ontological categories and aspectual features

Reference is typically discussed in terms of NPs that refer to objects. Conceptual Semantics takes the position that there is a far wider range of ontological types to which reference can be made (m-reference, of course). The deictic that is used in (2a) to (m-)refer to an object that the hearer is invited to locate in (his or her conceptualization of) the visual environment. Similarly, the underlined deictics in (2b-g) are used to refer to other sorts of entities.

(2) a. Would you pick that [pointing] up, please? [reference to object]
    b. Would you put your hat there [pointing], please? [reference to location]
    c. They went thataway [pointing]. [reference to direction]
    d. Can you do this [demonstrating]? [reference to action]
    e. That [pointing] had better never happen in MY house! [reference to event]
    f. The fish that got away was this [demonstrating] long. [reference to distance]
    g. You may start ... right ... now [clapping]! [reference to time]

This enriched ontology leads to a proliferation of referential expressions in the semantic structure of sentences. For instance, John went to Boston refers not only to John and Boston, but also to the event of John going to Boston and to the trajectory 'to Boston', which terminates at Boston. The event corresponds to the Davidsonian event variable (see article 34 (Maienborn) Event semantics) and the events of Situation Semantics (see article 35 (Ginzburg) Situation Semantics and NL ontology and article 36 (Ginzburg) Situation Semantics). Event and Situation Semanticists have taken this innovation to be a major advance in the ontology over classical formal semantics. Yet the expressions in (2) clearly show a far more differentiated ontology; events are just a small part of the full story. At the same time, it should be recalled that the 'existence' of trajectories, distances, and so forth is a matter not of how the world is, but of how speakers conceptualize the world.

Incidentally, trajectories such as to Boston are often thought to intrinsically involve motion. A more careful analysis suggests that this is not the case. Motion along the trajectory is a product of composing the motion verb went with to Boston. But the very same trajectory is referred to in the road leads to Boston - a stative sentence expressing the extent of the road - and in the sign points to Boston - a stative sentence expressing the orientation of the sign. The difference among these examples comes from the semantics of the verb and the subject, not from the prepositional phrase. The semantics of expressions of location and trajectory - including their crosslinguistic differences and relationships to Spatial Structure - has become a major preoccupation in areas of semantics related to Conceptual Semantics (e.g. Bloom et al. 1996, Talmy 2000, Levinson 2003, van der Zee & Slack 2003; see article 98 (Pederson) The expression of space; article 107 (Landau) Space in semantics and cognition).

Orthogonal to the ontological category features are aspectual features. It has long been known (Declerck 1979, Hinrichs 1985, Bach 1986, and many others) that the distinction between objects and substances (expressed by count and mass NPs respectively) parallels the distinction between events and processes (telic and atelic sentences) (see article 46 (Lasersohn) Mass nouns and plurals; article 48 (Filip) Aspectual class and Aktionsart). Conceptual Semantics expresses this parallelism (Jackendoff 1991) through a feature [±bounded]. Another feature, [±internal structure], deals with aggregation, including plurality. (3) shows how these features apply to materials and situations. (The situations in (3) are expressed as NPs but could just as easily be sentences.)

(3)                                   Material           Situation
    [+bounded, -internal structure]   object (a dog)     single telic event (a sneeze)
    [-bounded, -internal structure]   substance (dirt)   atelic process (sleeping)
    [+bounded, +internal structure]   group (a herd)     multiple telic event (some sneezes)
    [-bounded, +internal structure]   aggregate (dogs)   iterated events (sneezing repeatedly)
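The cross-classification in (3) can be rendered schematically as follows (a minimal Python sketch; the encoding is purely illustrative and assumes nothing beyond the feature labels of the table):

```python
# Illustrative encoding of the features [+/-bounded] and [+/-internal
# structure] from (3). The labels are taken from the table; the dictionary
# lookup is just one convenient way to realize the cross-classification.

CLASSIFICATION = {
    # (bounded, internal_structure): (material category, situation category)
    (True,  False): ("object (a dog)",    "single telic event (a sneeze)"),
    (False, False): ("substance (dirt)",  "atelic process (sleeping)"),
    (True,  True):  ("group (a herd)",    "multiple telic event (some sneezes)"),
    (False, True):  ("aggregate (dogs)",  "iterated events (sneezing repeatedly)"),
}

def classify(bounded: bool, internal_structure: bool, domain: str) -> str:
    """Return the category label for a feature bundle in a given domain."""
    material, situation = CLASSIFICATION[(bounded, internal_structure)]
    return material if domain == "material" else situation

print(classify(True, False, "material"))    # object (a dog)
print(classify(False, True, "situation"))   # iterated events (sneezing repeatedly)
```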
Trajectories or paths also partake of this feature system: a path such as into the forest, with an inherent endpoint, is [+bounded]; along the road, with no inherent endpoint, is [-bounded]. Because these features cut across ontological categories, they can be used to calculate the telicity and iterativity of a sentence based on the contributions of all its parts (Jackendoff 1996b). For instance, (4a) is telic because its subject and path are bounded, and its verb is a motion verb. (4b) is atelic because its path is unbounded and its verb is a motion verb. (4c) is atelic and iterative because its subject is an unbounded aggregate of individuals and its verb is a motion verb. (4d) is stative, hence atelic, because the verb is stative - even though its path is bounded. (Jackendoff 1996b shows formally how these results follow.)

(4) a. John walked into the forest.
    b. John walked along the road.
    c. People walked into the forest.
    d. The road leads into the forest.

2.3. Feature analysis in word meanings

Within Conceptual Semantics, word meanings are regarded as composite, but not necessarily built up in a fashion that lends itself to definitions in terms of other words. This subsection and the next three lay out five sorts of evidence for this view, and five different innovations that therefore must be introduced into lexical decomposition.

The first case is when a particular semantic feature spans a number of semantic fields. Conceptual Semantics grew out of the fundamental observations of Gruber (1965), who showed that the notions of location, change, and causation extend over the semantic fields of space, possession, and predication. For example, the sentences in (5) express change in three different semantic fields, in each case using the verb go and expressing the endpoint of change as the object of to.

(5) a. John went to New York. [space]
    b. The inheritance went to John. [possession]
    c. The light went from green to red. [predication]

Depending on the language, sometimes these fields share vocabulary and sometimes they don't. Nevertheless, the semantic generalizations ring true crosslinguistically. The best way to capture this crosscutting is by analyzing motion, change of possession, and change of predication in terms of a common primitive function GO (alternating with BE and STAY) plus a "field feature" that localizes it to a particular semantic field (space vs. possession vs. predication). Neither the function nor the field feature is lexicalized by itself: GO is not on its own the meaning of go. Rather, these two elements are like features in phonology, where for example voiced is not on its own a phonological segment but when combined with other features serves to distinguish one segment from another. Thus these meaning components cannot be expressed as word-like primes.

An extension of this approach involves force-dynamic predicates (Talmy 1988, Jackendoff 1990: chapter 7), where for instance force, entail, be obligated, and the various senses of must share a feature, and permit, be consistent with, have a right, and the various
senses of may share another value of the same feature. At the same time, these predicates differ in whether they are in the semantic field of physical force, social constraint, logical relation, or prediction.

Another such case was mentioned in section 2.2: the strong semantic parallel between the mass-count distinction in material substances and the process-event distinction in situations. Despite the parallel, only a few words cut across these domains. One happens to be the word end, which can be applied to speeches, to periods of time, and to tables of certain shapes (e.g. long ones but not circular ones). On the Conceptual Semantics analysis (Jackendoff 1991), end encodes a boundary of an entity that can be idealized as one-dimensional, whatever its ontological type. And because only certain table shapes can be construed as elaborations of a one-dimensional skeleton, only such tables have ends. (Note that an approach to end in terms of metaphor only restates the problem. Why do these metaphors exist? Answer: Because conceptualization has this feature structure.)

The upshot of cases like these is that word meanings cannot be expressed in terms of word-like definitions, because the primitive features are not on their own expressible as words.

2.4. Spatial structure in word meanings

One of the motivations for concluding that linguistic meaning must be segregated from "world knowledge" (as in Two-level Semantics and Lexical Conceptual Structure) is that there are many words with parallel grammatical behavior but clearly different semantics. For instance, verbs of manner of locomotion such as jog, sprint, amble, strut, and swagger have identical grammatical behavior but clearly differ in meaning. Yet there is no evident way to decompose them into believable algebraic features. These actions differ in how they look and how they feel. Similarly, a definition of chair in terms of "[+has-a-seat]" and "[+has-a-back]" is obviously artificial. Rather, our knowledge of the shape of chairs seems to have to do with what they look like and what it is like to sit in them - where sitting is ultimately understood in terms of performing the action. Likewise, our knowledge of dog at some level involves knowing that dogs bark. But to encode this purely in terms of a feature like "[+barks]" misses the point. It is what barking sounds like that is important - and of course this sound also must be involved in the meaning of the verb bark.

In each of these cases, what is needed to specify the word meaning is not an algebraic feature structure, but whatever cognitive structures encode categories of shapes, actions, and sounds. Among these structures are Spatial Structures of the sort discussed in section 1.3, which encode conceptualizations of shape, color, texture, decomposition into parts, and physical motion. As suggested there, it is not that these structures alone constitute the word meanings in question. Rather, it is the combination of Conceptual Structure with these structures that fills out the meanings.

Jackendoff (1996a) hypothesizes that these more perceptual elements of meaning do not interface directly with syntactic structure; that is, only Conceptual Structure makes a difference in syntactic behavior. For example, the differences in manner of motion among the verbs mentioned above are coded in Spatial Structure and therefore make no difference in their grammatical behavior.
If correct, this would account for the fact that such factors are not usually considered part of "linguistic semantics", even though they play a crucial role in understanding. Furthermore, since these factors are not encoded in
a format amenable to linguistic expression, they cannot be decomposed into definitions composed of words. The best one can do by way of definition is ostension, relying on the hearer to pick out the relevant factors of the environment.

2.5. Centrality conditions and preference rules

It is well known that many words do not have a sharply delimited denotation. An ancient case is bald: the central case is total absence of hair, but there is no particular amount of hair that serves as dividing point between bald and non-bald. Another case is color terms: for instance, there are focal values of red and orange and a smooth transition of hues between them; but there is no sharp dividing line, one side of which is definitely red and the other side is definitely orange. To reinforce a point made in section 1.1, it is not our ignorance of the true facts about baldness and redness that leads to this conclusion. Rather, there simply is no fact of the matter. When judgments of such categories are tested experimentally, the intermediate cases lead to slower, more variable, and more context-dependent judgments. The character of these judgments has to do more with the conceptualization of the categories in question than with the nature of the real world (see article 28 (Taylor) Prototype theory, article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning).

In Conceptual Semantics, such words involve centrality conditions. They are coded in terms of a focal or central case (such as completely bald or focally red), which serves as prototype. Cases that deviate from the prototype (as in baldness) - or for which another candidate prototype competes (as in color words) - result in the observed slowness and variability of judgments. Such behavior is in fact what would be expected from a neural implementation - sharp categorical behavior is actually much harder to explain in neural terms.

A more complex case that results in noncategorical judgments involves so-called cluster concepts. The satisfaction conditions for such concepts are combined by a non-Boolean connective (let's call it "smor") for which there is no English word. If a concept C is characterized by [condition A smor condition B], then stereotypical instances of C satisfy both condition A and condition B, and more marginal cases satisfy either A or B. For instance, the verb climb stereotypically involves (A) moving upward by (B) clambering along a vertically aligned surface, as in (6a). However, (6b) violates condition A while observing condition B, and (6c,d) are the opposite. (6e,f), which violate both conditions, are unacceptable. This shows that neither condition is necessary, yet either is sufficient.

(6) a. The bear climbed the tree. [upward clambering]
    b. The bear climbed down the tree/across the cliff. [clambering only]
    c. The airplane climbed to 30,000 feet. [upward only]
    d. The snake climbed the tree. [upward only]
    e. *The airplane climbed down to 10,000 feet. [neither upward nor clambering]
    f. *The snake climbed down the tree. [neither upward nor clambering]
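A minimal computational rendering of such a two-condition cluster concept is sketched below (Python; the function and its default behavior follow the informal description of climb and of the violable defaults discussed in the next paragraph, but the encoding itself is an illustrative assumption, not part of the theory):

```python
# Sketch of a preference-rule ("smor") combination of two conditions for
# 'climb': A = upward motion, B = clambering. Neither condition is
# necessary; either is sufficient; both hold by default when unspecified.

from typing import Optional

def climb_acceptable(upward: Optional[bool], clambering: Optional[bool]) -> bool:
    """Acceptability of a 'climb' description.

    None means the condition is left unspecified by the sentence; by
    default it is then assumed to hold (a violable default condition).
    """
    a = True if upward is None else upward          # default: moving upward
    b = True if clambering is None else clambering  # default: clambering
    return a or b   # either condition is sufficient; only A-and-B failure is out

print(climb_acceptable(True, True))    # (6a) the bear climbed the tree -> True
print(climb_acceptable(False, True))   # (6b) climbed down, clambering  -> True
print(climb_acceptable(True, False))   # (6c,d) airplane/snake upward   -> True
print(climb_acceptable(False, False))  # (6e,f) neither condition       -> False
print(climb_acceptable(None, None))    # 'The bear climbed': defaults   -> True
```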
The connective between the conditions is not simple logical disjunction, because if we hear simply The bear climbed, we assume it was going upward by clambering. That is, both conditions are default conditions, and either is violable (Fillmore 1982). Jackendoff (1983) calls conditions linked by this connective preference rules. This connective is involved in the analysis of Wittgenstein's (1953) famous example game, in the verb see (Jackendoff 1983: chapter 8), in the preposition in (Jackendoff 2002: chapter 11), and countless other cases. It is also pervasive elsewhere in cognition, for example gestalt principles of perceptual grouping (Wertheimer 1923, Jackendoff 1983: chapter 8) and even music and phonetic perception (Lerdahl & Jackendoff 1983). Because this connective is not lexicalized, word meanings involving it cannot be expressed as standard definitions.

2.6. Dot-objects

An important aspect of Conceptual Semantics stemming from the work of Pustejovsky (1995) is the notion of dot-objects - entities that subsist simultaneously in multiple semantic domains. A clear example is a book, a physical object that has a size and weight, but that also is a bearer of information. Like other cluster concepts, either aspect of this concept can be absent: a blank notebook bears no information, and the book whose plot I am currently developing in my head is not (yet) a physical object. But a stereotypical book partakes of both domains. The information component can be linked to other instantiations besides books, such as speech, thoughts in people's heads, computer chips, and so on. Pustejovsky notates the semantic category of objects like books with a dot between the two domains: [PHYSICAL OBJECT • INFORMATION], hence the nomenclature "dot-object." Note that this treatment of book is different from considering the word polysemous. It accounts for the fact that properties from both domains can be applied to the same object at once: The book that fell off the shelf [physical] discusses the war [information].

Corresponding to this sort of dot-object there are dot-actions. Reading is at once a physical activity - moving one's glance over a page - and an informational one - taking in the information encoded on the page. Writing is creating physical marks that instantiate information, as opposed to, say, scribbling, which need not instantiate information. Implied in this analysis is that spoken language also is conceptualized as a dot-object: sounds dotted with information (or meaning). The same information can be conveyed by different sounds (e.g. by speaking in a different language), and the same sounds can convey different information (e.g. different readings of an ambiguous sentence, or different pragmatic construals of the same sentence in different contexts). Then speaking involves emitting sounds dotted with information; by contrast, groaning is pure sound emission.

Another clear case of a dot-object is a university, which consists at once of a collection of buildings and an academic organization: Walden College covers 25 acres of hillside and specializes in teaching children of the rich. Still another such domain (pointed out by Searle 1995) is actions in a game. For example, hitting a ball to a certain location is a physical action whose significance in terms of the game may be a home run, which adds runs, or a foul ball, which adds strikes.
For such a case, the physical domain is "dotted" with a special "game domain," in terms of which one carries out the calculation of points or the like to determine who wins in the end.
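A rough data-structure sketch of a dot-object (Python; the attribute names and the two-plane representation are invented for illustration) shows how predicates from both domains can be applied to one and the same individual:

```python
# Sketch: a dot-object subsists in two (or more) semantic planes at once.
# Predicates select the plane they apply to, so both kinds of properties
# can be predicated of the same individual, as in 'The book that fell off
# the shelf discusses the war.'

from dataclasses import dataclass
from typing import Optional

@dataclass
class PhysicalPlane:
    weight_kg: float
    location: str

@dataclass
class InformationPlane:
    topic: str

@dataclass
class Book:  # [PHYSICAL OBJECT . INFORMATION]
    physical: Optional[PhysicalPlane]        # None for the not-yet-written book
    information: Optional[InformationPlane]  # None for a blank notebook

war_history = Book(PhysicalPlane(0.9, "shelf"), InformationPlane("the war"))

# 'fell off the shelf' applies to the physical plane:
assert war_history.physical.location == "shelf"
# 'discusses the war' applies to the information plane of the same object:
assert war_history.information.topic == "the war"
```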
Symbolic uses of objects, say in religious or patriotic contexts, can also be analyzed in terms of dot-objects and dot-actions with significance in the symbolized domain. Similarly with money, where coins, bills, checks, and so on - and the exchange thereof - are both physical objects and monetary values (Searle calls the latter institutional facts, in contrast with physical brute facts).

Perhaps the most far-reaching application of dot-objects is to the domain of persons (Jackendoff 2007). On one hand, a person is a physical object that occupies a position in space, has weight, can fall, has blue eyes, and so forth. On the other hand, a person has a personal identity in terms of which social roles are understood: one's kinship or clan relations, one's social and contractual obligations, one's moral responsibility, and so forth. The distinction between these two domains is recognized crossculturally as the difference between body on one hand and soul or spirit on the other. Cultures are full of beliefs about spirits such as ghosts and gods, with personal identity and social significance but no bodies. We quite readily conceptualize attaching personal identity to different bodies, as in beliefs in life after death and reincarnation, films like Freaky Friday (in which mother and daughter involuntarily exchange bodies), and Gregor Samsa's metamorphosis into a giant cockroach. A different sort of such dissociation is Capgras Syndrome (McKay, Langdon & Coltheart 2005), in which a stroke victim claims his wife has been replaced by an impostor who looks exactly the same. Thus persons, like books and universities, are dot-objects.

Social actions partake of this duality between physical and social/personal as well. For example, shaking hands is a physical action whose social significance is to express mutual respect between persons. The same social significance can be attached to other actions, say to bowing, high-fiving, or a man kissing a lady's hand. And the same physical action can have different social significance; for instance hissing is evidently considered an expression of approval in some cultures, rather than an expression of disapproval as in ours.

Note that this social/personal domain is not the same as Theory of Mind, although they overlap a great deal. On one hand, we attribute intentions and goals not just to persons but also to animals, who do not have social roles (with the possible exception of pets, who are treated as "honorary" persons). On the other hand, social characteristics such as one's clan and one's rights and obligations are not a consequence of what one believes or intends: they are just bare social facts. The consequence is that we likely conceptualize people in three domains "dotted" together: the physical domain, the personal/social domain, and the domain of sentient/animate entities.

A formal consequence of this approach is that the meaning of an expression containing dot-objects and dot-actions is best treated in terms of two or more linked "planes" of meaning operating in parallel. Some inferences are carried out on the physical plane, others on the associated informational, symbolic, or social plane. Particularly through the importance of social predicates to our thought and action, such a formal treatment is fundamental to understanding human conceptualization and linguistic meaning.
3. Compositionality

A central idealization behind most theories of semantics, including those of mainstream generative grammar and much of formal logic, is classical Fregean compositionality, which can be stated roughly as (7) (see article 6 (Pagin & Westerstahl) Compositionality).
(8a), taking place at a time and in a brain location consistent with semantic rather than syntactic processing.

Another such case is reference transfer (Nunberg 1979), in which an NP is used to refer to something related such as 'picture of NP', 'statue of NP', 'actor portraying NP' and so on:

(10) a. There's Chomsky up on the top shelf, next to Plato. [statue of or book by Chomsky]
     b. [One waitress to another:] The ham sandwich in the corner wants some more coffee. [person who ordered sandwich]
     c. I'm parked out back. I got smashed up on the way here. [my car]

Jackendoff (1992) (also Culicover & Jackendoff 2005) shows that these shifts cannot be disregarded as "merely pragmatic," for two reasons. First, a theory that is responsible for how speakers understand sentences must account for these interpretations. Second, some of these types of reference transfer have interactions with anaphoric binding, which is taken to be a hallmark of grammar. Suppose Richard Nixon went to see the opera Nixon in China. It might have happened that ...

(11) Nixon was horrified to watch himself sing a foolish aria to Chou En-lai.

Here Nixon stands for the real person and himself stands for the portrayed Nixon on stage. However, such a connection is not always possible:

(12) *After singing his aria to Chou En-lai, Nixon was horrified to see himself get up and leave the opera house.

(11) and (12) are syntactically identical in the relevant respects. Yet the computation of anaphora is sensitive to which NP's reference has been shifted. This shows that reference transfer must be part of semantic composition. Jackendoff (1992) demonstrates that the meaning of reference transfer cannot be built into syntactic structure in order to derive it by Fregean composition.

Another sort of challenge to Fregean composition comes from constructional meaning, where ordinary syntax is paired with nonstandard semantic composition (see article 86 (Kay & Michaelis) Constructional meaning). Examples appear in (13): the verb is syntactically the head of the VP, but it does not select its complements. Rather, the verb functions semantically as a means or manner expression.

(13) a. Bill belched his way out of the restaurant. ['Bill went out of the restaurant belching']
     b. Laura laughed the afternoon away. ['Laura spent the afternoon laughing']
     c. The car squealed around the corner. ['The car went around the corner squealing']
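A rough sketch of this kind of non-Fregean composition for the way-construction in (13a) follows (Python; the GO/MEANS decomposition tracks the glosses in (13), while the representation itself is an illustrative assumption, not Jackendoff's notation):

```python
# Sketch: constructional meaning in the way-construction. The verb is the
# syntactic head, but semantically it contributes only a means/manner
# component; the motion predicate GO comes from the construction itself.

from dataclasses import dataclass

@dataclass
class CS:
    function: str    # e.g. "GO"
    subject: str
    path: str
    means: str       # manner/means contributed by the verb

def way_construction(subject: str, verb: str, path: str) -> CS:
    """Compose 'SUBJ V-ed his/her way PATH' non-Fregeanly: the construction,
    not the verb, supplies the GO predicate."""
    return CS(function="GO", subject=subject, path=path, means=f"{verb}ing")

print(way_construction("Bill", "belch", "out of the restaurant"))
# CS(function='GO', subject='Bill', path='out of the restaurant', means='belching')
```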
the interfaces with syntax and phonology. Thus the empirical phenomena studied within Conceptual Semantics provide arguments for the theory's overall worldview, one that is consistent with the constraints of the mentalistic framework.

4. References

Bach, Emmon 1986. The algebra of events. Linguistics & Philosophy 9, 5-16.
Bloom, Paul 2000. How Children Learn the Meanings of Words. Cambridge, MA: The MIT Press.
Bloom, Paul et al. (eds.) 1996. Language and Space. Cambridge, MA: The MIT Press.
Carnap, Rudolf 1939. Foundations of Logic and Mathematics. Chicago, IL: The University of Chicago Press.
Cheney, Dorothy L. & Robert M. Seyfarth 2007. Baboon Metaphysics. The Evolution of a Social Mind. Chicago, IL: The University of Chicago Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Culicover, Peter W. & Ray Jackendoff 2005. Simpler Syntax. Oxford: Oxford University Press.
Declerck, Renaat 1979. Aspect and the bounded/unbounded (telic/atelic) distinction. Linguistics 17, 761-794.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language. Cambridge, MA: The MIT Press.
Fellbaum, Christiane (ed.) 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT Press.
Fillmore, Charles 1982. Towards a descriptive framework for deixis. In: R. Jarvella & W. Klein (eds.). Speech, Place, and Action. New York: Wiley, 31-59.
Fodor, Jerry A. 1975. The Language of Thought. Cambridge, MA: Harvard University Press.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Frege, Gottlob 1892/1952. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1952, 56-78.
Givón, Talmy 1995. Functionalism and Grammar. Amsterdam: Benjamins.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago, IL: The University of Chicago Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-Holland, 1976, 1-210.
Hauser, Marc D. 2000. Wild Minds. What Animals Really Think. New York: Henry Holt.
Hinrichs, Erhard W. 1985. A Compositional Semantics for Aktionsarten and NP Reference in English. Ph.D. dissertation. Ohio State University, Columbus, OH.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1987. Consciousness and the Computational Mind. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, Ray 1991. Parts and boundaries. Cognition 41, 9-45. Reprinted in: R. Jackendoff. Meaning and the Lexicon. The Parallel Architecture 1975-2010. Oxford: Oxford University Press, 138-173.
Jackendoff, Ray 1992. Mme. Tussaud meets the binding theory. Natural Language and Linguistic Theory 10, 1-31.
Jackendoff, Ray 1996a. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds.). Language and Space. Cambridge, MA: The MIT Press, 1-30. Reprinted in: R. Jackendoff. Meaning and the Lexicon. The Parallel Architecture 1975-2010. Oxford: Oxford University Press, 112-134.
Jackendoff, Ray 1996b. The proper treatment of measuring out, telicity, and possibly even quantification in English. Natural Language and Linguistic Theory 14, 305-354.
Reprinted in:
Stainton, Robert J. 2006. Words and Thoughts. Subsentences, Ellipsis, and the Philosophy of Language. Oxford: Oxford University Press.
Talmy, Leonard 1978. The relation of grammar to cognition. A synopsis. In: D. Waltz (ed.). Theoretical Issues in Natural Language Processing. New York: ACM, 14-24. Revised and enlarged version in: L. Talmy. Toward a Cognitive Semantics, vol. I. Concept Structuring Systems. Cambridge, MA: The MIT Press, 2000, 21-96.
Talmy, Leonard 1988. Force-dynamics in language and cognition. Cognitive Science 12, 49-100.
Talmy, Leonard 2000. Toward a Cognitive Semantics. Cambridge, MA: The MIT Press.
Tarski, Alfred 1956. The concept of truth in formalized languages. In: A. Tarski. Logic, Semantics, Metamathematics. Translated by J. H. Woodger. 2nd edn. ed. J. Corcoran. London: Oxford University Press, 152-197.
van der Zee, Emile & Jon Slack (eds.) 2003. Representing Direction in Language and Space. Oxford: Oxford University Press.
Verkuyl, Henk 1993. A Theory of Aspectuality. The Interaction between Temporal and Atemporal Structure. Cambridge: Cambridge University Press.
Wertheimer, Max 1923. Laws of organization in perceptual forms. Reprinted in: W. D. Ellis (ed.). A Source Book of Gestalt Psychology. London: Routledge & Kegan Paul, 1938, 71-88.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by G.E.M. Anscombe. Oxford: Blackwell.

Ray Jackendoff, Medford, MA (USA)

31. Two-level Semantics: Semantic Form and Conceptual Structure

1. Introduction
2. Polysemy problems
3. Compositionality and beyond: Semantic underspecification and coercion
4. More on SF variables and their instantiation at the CS level
5. Summary and outlook
6. References

Abstract
Semantic research of the last decades has been shaped by an increasing interest in conceptuality, that is, in emphasizing the conceptual nature of the meanings conveyed by natural language expressions. Among the multifaceted approaches emerging from this tendency, the article focuses on discussing a framework that has become known as »Two-level Semantics«. The central idea it pursues is to assume and justify two basically distinct, but closely interacting, levels of representation that spell out the meaning of linguistic expressions: Semantic Form (SF) and Conceptual Structure (CS). The distinction of SF vs. CS representations is substantiated by its role in accounting for related parallel distinctions including 'lexical vs. contextually specified meaning', 'grammar-based vs. concept-based restrictions', 'storage vs. processing' etc. The SF vs. CS distinction is discussed on the basis of semantic problems regarding polysemy, underspecification, coercion, and inferences.
distinction of SF vs. CS but also in providing criteria for deciding what requirements the representations at issue have to meet. The methodologically most relevant conclusion drawn by Kelter & Kaup (article 108) reads as follows: "researchers should acknowledge the fact that concepts and word meaning are different knowledge structures."

The claim in (5) suggests that if it is the SF of lexical items that is stored in long-term memory, the entries should be confined to representing what may be called "context-free lexical meanings", whereas CS representations compiled and processed in working memory should take charge of what may be called "contextually specified (parts of) utterance meanings". The difference between the two types of meaning representations indicates the virtual semantic underspecification of the former and the possible semantic enrichment of the latter. There is a series of recent experimental studies designed and carried out along these lines which - in combination with evidence from corpus data, linguistic diagnostics etc. - are highly relevant for the theoretical issues raised by SF vs. CS representations. Experiments reported by Stolterfoht, Gese & Maienborn (2010) and Kaup, Lüdtke & Maienborn (2010) succeeded in providing processing evidence that supports the distinction of, e.g., primary adjectives vs. adjectivized participles vs. verbal participles, that is, evidence for packaging categories relevant to SF representations. In addition, these studies reveal the processing costs of contextualizing semantically underspecified items, a result that supports the view that contextualizing the interpretation of a given linguistic expression e is realized by building up an enriched CS representation on the basis of SF(e).

1.3. SF vs. CS - an illustration from everyday life

To round off the picture outlined so far, we illustrate the features listed in (2)-(5) in favor of the SF vs. CS distinction by an example we are well acquainted with, viz. the representations involved in handling numbers, number symbols, and numerals in everyday life. Note that each of the semiotic objects in (6)-(8) below represents in some way the numerical concept »18«. However, how numerical concepts between »10« and »20« are stored, activated, and operated on in our memory is poorly understood as yet, so the details regarding the claim in (5) must be left open. Suffice it to agree that »18« stands for the concept we make use of, say, in trying to mentally add up the sum to be paid for our purchases in the shopping trolley. With this proviso in mind, we now look at the representations of the concept »18« in (6)-(8) to find out their interrelations.

(6) [two iconic tally notations representing the quantity 18: single strokes, and strokes bundled into groups]

(7) a. XVIII
    a'. IIXX (rarely occurring alternative)
    b. 18

(8) a. eighteen, achtzehn (8 + 10) English, German
    b. dix-huit, shiba (10 + 8) French, Mandarin
    c. okto-kai-deka ((8)-and-(10)) Greek
    d. diez y ocho ((10)-and-(8)) Spanish
    e. vosem-na-dcat' ((8)-on-(10)) Russian
    f. duo-de-viginti ((2)-of-(20)) Latin
    g. ocht-deec (8 + (2 x 5)) Irish
    h. deu-naw (2 x 9) Welsh

(6) shows two iconic non-verbal representations of a quantity whose correlation with the concept »18« and/or with the numerals in (8) rests on the ability to count and the availability of numerals. The tallying systems in (6) are simple but inefficient for doing arithmetic and hence hardly suitable to serve as semantic representations of the numerals in (8).

(7) shows two symbolic non-verbal representations of »18«, generated by distinct writing systems for numbers. The Roman number symbols are partially iconic in that they encode addition by iterating up to three special symbols for one, ten, hundred, or thousand, partially symbolic due to placing the symbol of a small number in front of the symbol of a larger number, thus indicating subtraction, cf. (7a, a'). The lack of a symbol for zero prevented the creation of a positional system, the lack of means to indicate multiplication or division impeded calculation. Both were obstacles to progress in mathematics. Thus, Roman number symbols may roughly render the lexical meanings of (8a-f) but not those of (8g-h) and all other variants involving multiplication or division.

The Indo-Arabic system of number symbols exemplified by 18 in (7b) is a positional system without labels based on exponents of ten (10⁰, 10¹, 10², ..., 10ⁿ). As a representational system for numbers it is recursive and potentially infinite in yielding unambiguous and well-distinguished chains of symbols as output. Thus, knowing the system implies knowing that 18 ≠ 81 or that 17 and 19 are the direct predecessor and successor of 18, respectively, even if we do not have pertinent number words at our disposal to name them. Moreover, it is this representation of numbers that we use when we do arithmetic with paper and pencil or by pressing the keys of an electronic calculator. Enriched with auxiliary symbols for arithmetical operations and for marking their scope of application, as well as furnished with conventions for writing equations etc., this notational system is a well-defined means to reduce the use of mathematical expressions to representations of their Conceptual Structures, that is, to the CS representations they denote, independent of any natural language in which these expressions may be read aloud or dictated. Let's call this enriched system of Indo-Arabic number symbols the "CS system of mathematical expressions".

Now, what about the SF representations of numerals? Though all number words in (8) denote the concept »18«, it is obvious that their lexical meanings differ in the way they are composed, cf. the second column in (8). As regards their combinatorial category, the number words in (8) are neither determinative nor copulative compounds, nor are they conjoined phrases. They are perhaps best categorized as juxtapositions with or without connectives, cf. (8c-f) and (8a-b, g-h), respectively. The unique feature of complex number words is that the relations between their numeral constituents are nothing but encoded fundamental arithmetic operations (addition and multiplication are preferred; division and subtraction are less frequent). Thus, the second column in (8) shows SF representations of the number words in the first column couched in terms of the CS system of mathematical expressions. The latter are construable as functor-argument structures with arithmetic operators ('+', '-', 'x' etc.)
as functors, quantity constants for digits as arguments, and parentheses (...) as boundaries marking lexical building blocks. Now let's see what all this tells us about the distinctions in (2)-(5) above.
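To make the functor-argument construal concrete, the following sketch (Python; the term encoding is an illustrative assumption, not the authors' formalism) represents each numeral's SF as a small arithmetic term and evaluates it to its language-independent CS value:

```python
# Sketch: SF representations of numerals as functor-argument structures,
# following the second column of (8). Evaluating an SF term yields the
# language-independent CS value - here, the number 18 in every case.

from typing import Union

Term = Union[int, tuple]  # a term is a digit constant or (operator, t1, t2)

SF = {
    "eighteen (English)":     ("+", 8, 10),
    "dix-huit (French)":      ("+", 10, 8),
    "duo-de-viginti (Latin)": ("-", 20, 2),
    "deu-naw (Welsh)":        ("*", 2, 9),
}

def cs_value(term: Term) -> int:
    """Evaluate an SF term to its CS value."""
    if isinstance(term, int):
        return term
    op, a, b = term
    ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
           "*": lambda x, y: x * y}
    return ops[op](cs_value(a), cs_value(b))

# Differently packaged SFs, one and the same CS value:
assert all(cs_value(t) == 18 for t in SF.values())
```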
The subset relation SF ⊂ CS mentioned in connection with (2) also holds for the number symbols in (7b). The CS system of mathematical expressions is capable of representing all partitions of 18 that draw on fundamental arithmetic operations. Based on this, the CS system at stake covers the internal structures of complex number words, cf. (8), as well as those of equations at the sentence level like 18 = 3 x 6; 18 = 2 x 9; 18 = 72 : 4 etc. By contrast, the subset of SF representations for numerals is restricted in two respects. First, not all admissible partitions of a complex number like 18 are designated as SF of a complex numeral lexicalized to denote »18« (cf. the sketch at the end of this subsection). The grammar of number words in L is interspersed with (certain types of) L-specific packing strategies, cf. Hurford (1975), Greenberg (1978). Second, since the ideal relationship between systems of number symbols and systems of numerals is a one-to-one correspondence, the non-ambiguity required of the output of numeral systems practically forbids creation or use of synonymous number names (except for the distinct numerals used for e.g. 1995 when speaking of years or of prices in €).

There is still another conclusion to be drawn from (6)-(8) in connection with (2). The CS system of mathematical expressions is a purposeful artifact created and developed to solve the "homogenization problem" raised by CS representations for the well-defined field of numbers and arithmetic operations on them. First, the mental operations of counting, adding, multiplying etc., which the system is designed to represent, have been abstracted from practical actions, viz. from lining up things, bundling up things, bundling up bundles of things etc. Second, the CS representations of mathematical expressions provided by the system are unambiguous, complete (that is, fully specified and containing neither gaps nor variables to be instantiated by elements from outside the system), and independent of the particular languages in which they may be verbalized.

The lexicon-based packaging and contents of the components of SF representations claimed in (3) and (4) are also corroborated by (6)-(8). The first point to note is the L-specific ways in which (i) numerals are categorized in morpho-syntactic terms and (ii) their lexical meanings are composed. The second point is this: Complex numerals differ from regular (determinative or copulative) compounds in that the relations between their constituents are construed as encodings of fundamental arithmetical operations, cf. (8a-h). This unique feature of the subgrammar of number words also yields a strong argument wrt. the "justification problem" posed by the assumption of lexicon-based SF representations.

The claims in (3) and (4) concerning the non-linguistic nature of CS representations are supported by the fact that e.g. 18 is an intermodally valid representation of the concept »18« as it covers both the perception-based iconic representations of »18« in (6) and the lexicon-based linguistic expressions denoting »18« in (8). Thus, the unique advantage of the CS system of mathematical expressions is founded on the representational intermodality and the conceptual homogeneity it has achieved in the history of mathematical thinking. No other science is more dependent on the representations of its subject than mathematics.
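The restriction noted above - that only some CS partitions of 18 are lexically packaged as SFs - can be illustrated schematically (Python; the enumeration of partitions and the sample of packagings are illustrative assumptions):

```python
# Sketch: the CS system of mathematical expressions covers every arithmetic
# partition of 18, but only a few partitions are "packaged" as the SF of
# some language's numeral (second column of (8)): SF is a proper subset of CS.

additive = {("+", a, 18 - a) for a in range(1, 18)}
multiplicative = {("*", a, 18 // a) for a in range(2, 18) if 18 % a == 0}
subtractive = {("-", 18 + a, a) for a in (1, 2)}
cs_partitions = additive | multiplicative | subtractive

# Packagings actually lexicalized in (8) (English/French, Latin, Welsh):
sf_packagings = {("+", 8, 10), ("+", 10, 8), ("-", 20, 2), ("*", 2, 9)}

assert sf_packagings < cs_partitions  # proper subset: SF representations in CS
```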
Revealing as this illustration may be, the insights it yields cannot simply be extended to the lexicon and grammar of a natural language L beyond the subgrammar of number words. The correlations between systems of number names and their SF representations in terms of the CS system of mathematical expressions form a special case which results from the creation of a non-linguistic semiotic artifact, viz. a system to represent number concepts under controlled laboratory conditions. The meanings, the combinatorial
Whereas the institution and the building readings are somewhat compatible, the building and the personnel readings are not; as regards the (type of) reading of the antecedent, anaphoric pronouns are less tolerant than relative pronouns or repeated DPs. The data in (17) show that the conceptual specifications of SF (school) differ in ways that are poorly understood as yet; cf. Asher (2007) for some discussion. The semantics of institution nouns, for a while the signature tune of Two-level Semantics, elicited a certain amount of discussion and criticism, cf. Herzog & Rollinger (1991), Bierwisch & Bosch (1995). The problems expounded in these volumes are still unsolved, but they sharpened our view of the intricacies of the SF vs. CS distinction.

2.2. Locative prepositions

In many languages the core inventory of adpositions encodes spatial relations that localize some x (called theme, figure or located object) wrt. the place occupied by some y (called relatum, ground or reference object), where x and y may pairwise range over objects, substances, and events. Regarding the conceptual basis of these relations, locative prepositions in English and related languages are usually subdivided into topological (in, at, on), directional (into, onto), dimensional (above, under, behind), and path-defining (along, around) prepositions. The semantic problems posed by these lexical items can best be illustrated with in, which supposedly draws on spatial containment, pure and simple, and which is therefore taken to be the prime example of a topological preposition.
To illustrate how SF (in) is integrated into a lexical entry with information on Phonetic Form (PF), Grammatical Features (GF), Argument Structure (AS) etc., we take German in as a telling example: It renders English in vs. into with distinct cases which in turn correspond to the values of the feature [±Dir(ectional)] subcategorizing the internal argument y, and to further syntactic distinctions. The entry in (18) is taken from Bierwisch (1988: 37); examples are added in (19). The interdependence of the values for the case feature [±Obl(ique)] and for the category feature [±Dir] is indicated by means of the metavariable α ∈ {+, −} and by the conventions (i) −α inverts the value of α and (ii) (αW) means that W is present if α = + and absent if α = −.

(18) Lexical entry of the German preposition in:
     PF: /in/;  GF: [−V, −N, αDir];  AS: λy λx, with y marked [−αObl];
     SF: [(αFIN) [LOC x] ⊂ [LOC y]]

(19) a. Die Straße/Fahrt führt in die Stadt.  [+Dir, −Obl] = Acc; "x is a path ending in y"
        The street/journey leads into the city.
        /in/; [−V, −N, +Dir]; λy λx [(FIN) [LOC x] ⊂ [LOC y]]; y: [−Obl]
722 VI. Cognitively oriented approaches to semantics general schema for the SF of locative prepositions thereby abandoning the problematic functor c and revising the functor loc: (22) Xy Xx (loc (x, PREP*(y))), where loc localizes the object x in the region determined by the preposition p and prep* is a variable ranging overp-based regions. Bierwisch (1996:69) replaces SF (in) in (18) with Xy Xx [x [loc [int y]]] commenting "jc loc p identifies the condition that the location of x be (improperly) included in p" and "int y identifies a location determined by the boundaries of y, that is, the interior of y". Although this proposal avoids some of the problems with the functor c, the puzzling effect of "(improperly) included" remains and so does the definition of int y as yielding "the interior of x". Herweg (1989) advocates an abstract SF (in) which draws on proper spatial inclusion such that the examples in (21) are semantically marked due to violating the "Presupposition of Argument Homogeneity". The resulting truth value gap triggers certain function- based accommodations at the CS level that account for the interpretations of (21a-d). Hottenroth (1991), in a detailed analysis of French dans, rejects the idea that SF (dans) might draw on imprecise region-creating constants like int>\ Instead, SF (dans) should encode the conditions on the relatum in prototypical uses of dans. The standard reference region of dans is a three-dimensional empty closed container (bottle, bag, box etc.). If the relatum of dans does not meet one or more of these characteristics, the reference region is conceptually adapted by means of certain processing principles (laws of closure, mental demarcation of unbounded y, conceptual switching from 3D to 2D etc.). In view of data like those in (21), Carstcnscn (2001) proposes to do away with the region account altogether and to replace it with a perception-based account of prepositions that draws on the conceptual representation of changes of focused spatial attention. To sum up, the brief survey of developments in the semantic analysis of prepositions may also be taken as proof of the heuristic productivity emanating from the SF vs. CS distinction. Among polyscmous verbs, the verb to open has gained much attention, cf. Bierwisch (article 16). Based on a French-German comparison, Schwarze & Schepping (1995) discuss what type of polysemy is to be accounted for at which of the two levels. Functional categories (determiners, complementizers, connectives etc.), whose lexical meanings lack any support in perception and are hence purely operative, have seldom been analyzed in terms of the SF vs. CS distinction so far; but cf. Lang (2004) for an analysis that accounts for the abstract meanings of and, but etc. and their contextual specification by inferences drawn from the structural context, the discourse context, and/or from world knowledge. Clearly, the 'poorer' the lexical meaning of such a synsemantic lexical item, the more will its semantic contribution need to be enriched by means of contextualization. 3. Compositionality and beyond: Semantic underspecification and coercion Two-level Semantics was first mainly concerned with polysemy problems of the kind illustrated in the previous section. Emphasis was laid on developing an adequate theory
of lexical semantics that would be able to deal properly and on systematic grounds with the distinction of word knowledge and world knowledge. A major tenet of Two-level Semantics as a lexicon-based theory of natural language meaning is that the internal decompositional structure of lexical items determines their external combinatorial properties, that is, their external syntactic behavior. This is why compositionality issues are of eminent interest to Two-level Semantics; cf. (1a).
There is wide agreement among semanticists that, given the combinatorial nature of linguistic meaning, some version of the principle of compositionality - as formulated, e.g., in (23) - must certainly hold. But in view of the complexity and richness of natural language meaning, there is also consensus that compositional semantics is faced with a series of challenges and problems; see article 6 (Pagin & Westerståhl) Compositionality.

(23) Principle of compositionality:
     The meaning of a complex expression is a function of the meanings of its parts and the way they are syntactically combined.

Rather than weakening the principle of compositionality or abandoning it altogether, Two-level Semantics seeks to cope with the compositionality challenge by confining compositionality to the level of Semantic Form. That is, SF is understood as comprising exactly those parts of natural language meaning that are (i) context-independent and (ii) compositional, in the sense that they are built in parallel with syntactic structure. This leaves space to integrate non-compositional aspects of meaning constitution at the level of Conceptual Structure. In particular, the mapping of SF-representations onto CS-representations may include non-local contextual information and thereby qualify as non-compositional. Of course, the operations at the CS level as well as the SF-CS mapping operations are also combinatorial and can therefore be said to be compositional in a broader sense. Yet their combinatorics is not bound to mirror the syntactic structure of the given linguistic expression and thus does not qualify as compositional in a strict sense. This substantiates the assumption of two distinct levels of meaning representation as discussed in §1. Thus, Two-level Semantics' account of the richness and flexibility of natural language meaning constitution consists in assuming a division of labor between a rather abstract, context-independent and strictly compositionally determined SF and a contextually enriched CS that also includes non-compositionally derived meaning components.
Various solutions have been proposed for implementing this general view of the SF vs. CS distinction. These differ mainly in (a) the syntactic fine-tuning of the compositional operations and the abstractness of the corresponding SF-representations, and in (b) the way of handling non-compositional meaning aspects in terms of, e.g., coercion operations. These issues will be discussed in turn.

3.1. Combinatory meaning variation

Assumptions concerning the spell-out of the specific mechanisms of compositionality are generally guided by parsimony. That is, the fewer semantic operations warranting compositionality are postulated, the better. On this view, it would be attractive to have a single semantic operation, presumably functional application, figuring as the semantic counterpart to syntactic binary branching. An illustration is given in (24):
Given the lexical entries for the locative preposition in and the proper noun Berlin in (24a) and (24b) respectively, functional application of the preposition to its internal argument yields (24c) as the compositional result corresponding to the semantics of the PP.

(24) a. in: λy λx (loc (x, in*(y)))
     b. Berlin: berlin
     c. [PP in [DP Berlin]]: λy λx (loc (x, in*(y))) (berlin) = λx (loc (x, in*(berlin)))

Functional application is suitable for syntactic head-complement relationships as it reveals a correspondence between the syntactic head-non-head relationship and the semantic functor-argument relationship. In (24c), for instance, the preposition in is both the syntactic head of the PP and the semantic functor, which takes the DP as its argument. Syntactic adjuncts, on the other hand, cannot be properly accounted for by functional application as they lack a comparable syntax-semantics correspondence. In syntactic head-adjunct configurations the semantic functor, if any, is not the syntactic head but the non-head; for an overview of the different solutions that have been put forth to cope with this syntax-semantics imbalance see article 54 (Maienborn & Schäfer) Adverbs and adverbials. Different scholars working in different formal frameworks have suggested remarkably convergent solutions, according to which the relevant semantic operation applying to syntactic head-adjunct configurations is predicate conjunction. This might be formulated, for instance, in terms of a modification template MOD as given in (25); cf., e.g., Higginbotham's (1985) notion of θ-identification, Bierwisch's (1997) adjunction schema, Wunderlich's (1997b) argument sharing, or the composition rule of predicate modification in Heim & Kratzer (1998).

(25) Modification template MOD:
     MOD: λQ λP λx (P(x) & Q(x))

The template MOD takes a modifier and an expression to be modified (= modifyee) and turns them into a conjunction of predicates. More specifically, an (intersective) modifier adds a predicate that is linked up to the referential argument of the expression to be modified. In (26) and (27) illustrations are given for nominal modification and verbal modification, respectively. In (26), the semantic contribution of the modifier is added as an additional predicate of the noun's referential argument. In (27), the modifier provides an additional predicate of the verb's eventuality argument.

(26) a. house: λz (house (z))
     b. [PP in Berlin]: λu (loc (u, in*(berlin)))
     c. [NP [NP house] [PP in Berlin]]:
        λQ λP λx (P(x) & Q(x)) (λz (house (z))) (λu (loc (u, in*(berlin))))
        = λx (house (x) & loc (x, in*(berlin)))

(27) a. sleep: λz λe (sleep (e) & agent (e, z))
     b. [PP in Berlin]: λu (loc (u, in*(berlin)))
     c. [VP [VP sleep] [PP in Berlin]]:
        λQ λP λx (P(x) & Q(x)) (λz λe (sleep (e) & agent (e, z))) (λu (loc (u, in*(berlin))))
        = λz λe (sleep (e) & agent (e, z) & loc (e, in*(berlin)))

The semantic template MOD thus provides the compositional semantic counterpart to syntactic head-adjunct configurations. There are good reasons to assume that, besides functional application, some version of MOD is required when it comes to spelling out the basic mechanisms of compositionality.
The template MOD in (25) captures a very fundamental insight about the compositional contribution of intersective modifiers. Nevertheless, scholars working within the Two-level Semantics paradigm have emphasized that a modification analysis along the lines of MOD fails to cover the whole range of intersective modification; cf., e.g., Maienborn (2001, 2003) for locative adverbials, Dölling (2003) for adverbial modifiers in general, Bücking (2009, 2010) for nominal modifiers. Modifiers appear to be more flexible in choosing their compositional target, both in the verbal domain and in the nominal domain. Besides supplying an additional predicate of the modifyee's referential argument, as in (26) and (27), modifiers may also relate less directly to their host argument. Some illustrations are given in (28)-(30). (For the sake of simplicity the data are presented in English.)

(28) a. The cook prepared the chicken in a Marihuana sauce. (cf. Maienborn 2003)
     b. The bank robbers escaped on bicycles.
     c. Paul tickled Maria on her neck.

(29) a. Anna dressed Max's hair unobtrusively. (cf. Dölling 2003: 530)
     b. Ede reached the summit in two days. (cf. Dölling 2003: 516)

(30) a. the fast processing of the data (cf. Bücking 2009: 94)
     b. the preparation of the chicken in a pepper sauce (cf. Bücking 2009: 102)
     c. Georg's querying of the men (cf. Bücking 2010: 51)

The locative modifiers in (28) differ from the general MOD pattern as illustrated in (27) in that they do not locate the whole event but only one of its integral parts. For instance, in (28b) it is not the escape that is located on bicycles but - according to the preferred reading - the agent of this event, viz. the bank robbers. In the case of (28c), the linguistic structure does not even tell us what is located on Maria's neck. It could be Paul's hand but also, e.g., a feather he used for tickling Maria. Maienborn (2001, 2003) calls these modifiers "event-internal modifiers" and sets them apart from "event-external modifiers" such as in (27), which serve to holistically locate the verb's eventuality argument.
Similar observations are made by Dölling (2003) wrt. cases like (29). Sentence (29a) is ambiguous. It might be interpreted as expressing that Anna performed the event of dressing Max's hair in an unobtrusive manner. This is what the application of MOD would result in. But (29a) has another reading, according to which it is not the event of hair dressing that is unobtrusive but Max's resulting hair-style. Once again, the modifier's contribution does not apply directly to the verb's eventuality argument but to some referent related to it. The same holds true for (29b), where the temporal adverbial cannot relate to the punctual event of Ede reaching the summit but only to its preparatory phase.
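Before turning to how such indirect relations are handled, the compositional core in (24)-(27) can be made concrete with a minimal sketch of my own: SF terms are encoded as Boolean-valued Python closures over a toy model, so that functional application and the template MOD become ordinary function application and conjunction. The model facts and names (FACTS, h1, h2) are assumptions for illustration only.

    # Sketch: functional application (24) and the modification template
    # MOD (25), with SF terms encoded as closures over a toy model.

    def region_in(y):               # region constant in*: here just a tag
        return ("in*", y)

    FACTS = {("h1", ("in*", "berlin"))}

    def loc(x, region):             # toy at-region predicate over FACTS
        return (x, region) in FACTS

    # (24a) in: lambda y . lambda x . loc(x, in*(y))
    prep_in = lambda y: lambda x: loc(x, region_in(y))

    # (24c) [PP in Berlin] by functional application
    in_berlin = prep_in("berlin")

    # (25) MOD: lambda Q . lambda P . lambda x . P(x) & Q(x)
    MOD = lambda Q: lambda P: lambda x: P(x) and Q(x)

    # (26) house modified by [PP in Berlin]
    house = lambda z: z in {"h1", "h2"}
    house_in_berlin = MOD(in_berlin)(house)

    assert house_in_berlin("h1") and not house_in_berlin("h2")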
(34) Paul tickled Maria on her neck.
     SF: ∃e (tickle (e) & agent (e, paul) & patient (e, maria) & R (e, v) & loc (v, on*(maria's neck)))
     CS: ∃e ∃x (tickle (e) & agent (e, paul) & patient (e, maria) & instr (e, x) & feather (x) & loc (x, on*(maria's neck)))

This conceptual spell-out provides a plausible utterance meaning for sentence (34). It goes beyond the compositionally determined meaning by exploiting our conceptual knowledge that tickling is performed with some instrument which needs to have spatial contact with the object being tickled. Consequently, the SF-parameter R can be identified as the instrument relation, and the parameter v may be instantiated, e.g., by a feather. Although not manifest at the linguistic surface, such conceptually inferred units are plausible potential instantiations of the compositionally introduced SF-parameter v. (Dölling and Maienborn use abduction as a formal means of deriving a contextually specified CS from a semantically underspecified SF; cf. Hobbs et al. (1993). We will come back to the SF-CS mapping in §4.)
Different proposals have been developed for implementing the notion of a more liberal and flexible combinatorics, such as MOD', into the compositional machinery. Maienborn (2001, 2003) argues that MOD' is only licensed in particular structural environments: Event-internal modifiers have a base adjunction site in close proximity to the verb, whereas event-external adjuncts adjoin at VP-level. These distinct structural positions provide the key to a compositional account. Maienborn thus formulates a more fine-tuned syntax-semantics interface condition that subsumes MOD and MOD' under a single compositional rule MOD*.

(35) Modification template MOD*:
     MOD*: λQ λP λx (P(x) & R (x, v) & Q(v))
     Condition on the application of MOD*: If MOD* is applied in a structural environment of categorial type X, then R = part-of, otherwise (i.e. in an XP-environment) R is the identity function.

If MOD* is applied in an XP-environment, then R is instantiated as identity, i.e. v is identified with the referential argument of the modified expression, thus yielding the standard variant MOD. If applied in an X-environment, R is instantiated as the part-of relation, which pairs entities with their integral constituents. Thus, in Maienborn's account the observed meaning variability is traced back to a grammatically constrained semantic indeterminacy that is characteristic of modification.
Dölling (2003) takes a different track by assuming that the SF-parameter R is not rooted in modification but is of a more general nature. Specifically, he suggests that R is introduced compositionally whenever a one-place predicate enters the composition. By this move, the SF of a complex expression is systematically extended by a series of SF-parameters, which guarantee that the application of any one-place predicate to its argument is systematically shifted to the conceptual level. On Dölling's account, the SF of a complex linguistic expression is maximally abstract and underspecified, with SF-parameters delineating possible (though not necessarily actual) sites of meaning variation.
Differences aside, the studies of Dölling, Maienborn and other scholars working in the Two-level Semantics paradigm emphasize that potential sources for semantic
indeterminacy are not only to be found in the lexicon but may also emerge in the course of composition, and they strive to model this combinatory meaning variation in terms of a rigid account of lexical and compositional semantics. A key role in linking linguistic and extra-linguistic knowledge is taken by so-called SF-parameters. These are free variables that are installed under well-defined conditions at SF and are designed to be instantiated at the level of CS. SF-parameters are a means of triggering and controlling the conceptual enrichment of a grammatically determined meaning representation. They delineate precisely those gaps within the Semantic Form that call for conceptual specification, and they impose sortal restrictions on possible conceptual fillers. SF-parameters can thus be seen as well-defined windows through which compositional semantics allows linguistic expressions to access and constrain conceptual structures.

3.2. Non-compositional meaning adjustments

Conceptual specification of a compositionally determined, underspecified, abstract meaning skeleton, as illustrated in the previous section, is the core notion that characterizes the Two-level Semantics perspective on the semantics-pragmatics interface. Its focus is on the conceptual exploitation of a linguistic expression's regular meaning potential. A second focus typically pursued within Two-level Semantics concerns the possibilities of a conceptual solution of combinatory conflicts arising in the course of composition. These are combinatory adjustment operations by which a strictly speaking ill-formed linguistic expression gets an admissible yet irregular interpretation. In the literature such non-compositional rescue operations are generally discussed under the label of "coercion". An example is given in (36).

(36) The alarm clock stood intentionally on the table.

The sentence in (36) does not offer a regular integration for the subject-oriented adverbial intentionally, i.e., the subject NP the alarm clock does not fulfill the adverbial's selectional restriction for an intentional subject. Hence, a compositional clash results, and the sentence is ungrammatical. Nevertheless, although deviant, there seems to be a way to rescue the sentence so that it becomes acceptable and interpretable. In the case of (36), a possible repair strategy would be to introduce an actor who is responsible for the fact that the alarm clock stands on the table. This move would provide a suitable anchor for the adverbial's semantic contribution. Thus, we understand (36) as saying that someone put the alarm clock on the table on purpose. That is, in case of a combinatorial clash, there seems to be a certain leeway for non-compositional adjustments of the compositionally derived meaning. The defective part is "coerced" into the right format.
Coercion phenomena are a topic of intensive research in current semantics. Up to now the primary focus has been on the widely ramified notion of aspectual coercion (e.g. Moens & Steedman 1988; Pulman 1997; de Swart 1998; Dölling 2003, 2010; Egg 2005) and on cases of so-called "complement coercion" as in Peter began the book (e.g. Pustejovsky 1995; Egg 2003; Asher 2007); see article 25 (de Swart) Mismatches and coercion for an overview. The framework of Two-level Semantics is particularly suited to investigate these borderline cases at the semantics-pragmatics interface because of its
(41) ∃c [[QUANT MAX def.pole′] = [Norm + c]]

This much on SF variables coming with dimension terms and on their instantiation in structural contexts that are provided by the morphosyntax of the sentence at issue. After a brief look at the elements instantiating the metavariable DIM, we will discuss a type of SF variable that is rooted in the lexical field structure of DA terms.
Conceived as a basic module of cognition, dimension assignment to spatial objects involves entities and operations at three levels. The perceptual level provides the sensory input from vision and other senses; the conceptual level serves as a filter system reducing perceptual distinctions to the level that our everyday knowledge of space needs; and the semantic level accounts for the ways in which conceptually approved features are encoded in categorized lexemes and arranged in lexical fields. DA basically draws on Dimension Assignment Parameters (DAP) that are provided by two frames of reference, which determine the dimensional designation of spatial objects:

(42) a. The Inherent Proportion Schema (IPS) yields proportion-based gestalt features by identifying the object's extents as MAXimal, MINimal, and ACROSS axis, respectively.
     b. The Primary Perceptual Space (PPS) yields contextually determined position features of spatial objects by identifying the object's extents as aligned with the VERTical axis, with the OBServer axis, and/or with an ACROSS axis in between.

The DAP in small caps listed in (42) occur in two representational formats that reflect the SF vs. CS distinction. In SF representations, the DAP figure as functor constants of category N/N in the SF of L-particular dimension terms that instantiate DIM within the general schema in (40). In CS representations, elements of the DAP inventory figure as conceptual features in so-called Object Schemata (cf. 4.2 below) that contain the conceptually defining as well as the contextually specified spatial features of the object at issue.
Lang (2001) shows that the lexical field of spatial dimension terms in a language L is determined by the share it has in IPS and PPS, respectively. While reference to the vertical is ubiquitous, the lexical coverage of DA terms amounts to the following typology: proportion-based languages (Mandarin, Russian) adhere to IPS, observer-based ones (Korean, Japanese) adhere to PPS, and mixed-type ones (English, German) draw on an overlap between IPS and PPS. The semantic effects of this typology are inter alia reflected by the respective ACROSS terms: In P-based and in O-based languages, they are lexically distinct and referentially unambiguous; in mixed-type languages like English they lack both of these properties. Note the referential ambiguity of the English ACROSS term wide in (44.1) and its contextualized interpretations in (44.2-4) when referring to a board sized 100 × 30 × 3 cm in the spatial settings I-III shown in (43):
the pertinent DA terms for a and b have been added in (47). The respective extent chosen as d′ to anchor ACROSS in the OS at issue and/or to interpret wide in (44.2-4) is in boldface.

(47) I.   ⟨ a       b       c   ⟩
           max     0       min
           max     across  min        a = long, b = wide

     II.  ⟨ a       b       c   ⟩
           max     0       min
           across  vert    min        a = wide, b = high

     III. ⟨ a       b       c   ⟩
           max     0       min
           across  obs     min        a = wide, b = deep

The OS in (47), as CS representations of (43) and (44), capture all semantic aspects of DA discussed so far, but they deserve some further remarks. First, (47-I) results from primary identification à la (46a), indicated by matching entries in the 2nd and 3rd row, while (47-II and III) are instances of contextual specification as defined in (46b). Second, the typological characteristics of a P/O-mixed-type language are met, as d′ for wide may be taken from IPS as in (47-I) or from PPS as in (47-II and III). Third, the rows of an OS, which contain the defining spatial properties and possibly also some contextual specifications, can be taken as a heuristic cue for designing the SF representations of object names that lexically reflect the varying degree of integration into spatial contexts we observe in (43)-(44), e.g. board (freely movable) < notice-board (hanging) < windowsill (bottom part of a window) - in this respect OS may be seen as an attempt to capture what Bierwisch (article 16 in this volume) calls "dossiers". Fourth, Lang, Carstensen & Simmons (1991) presents a Prolog system of DA using OS enriched by sidedness features, and Lang (2001) proposes a detailed catalogue of types of spatial objects with their OS accounting for primary entries and for contextually induced orientation or perspectivization. Fifth, despite their close interaction by means of the operations in (46), DAP as elements of SF representations and OS entries as CS elements are subject to different constraints, which is another reason to keep them distinct.
The entries in an OS are subject to conditions of conceptual compatibility that inter alia define the set of admissible complex OS entries listed as vertically arranged pairs in (48):

(48) max     max     max     0       0       0
     across  vert    obs     across  vert    obs

An important generalization is that (48) holds independently of the way in which the complex entry happens to come about. So, the combination of MAX and VERT in the same column may result from primary identification in the 2nd row, cf. The pole is 2 m tall, where the SF of tall contains MAX x & VERT x as a conjunction of DAP, or from contextual specification, cf. The pole is 2 m high, where VERT is added in the 3rd row. The semantic structure of DA terms is therefore constrained by compatibility conditions at the CS level, but within this scope it is cross-linguistically open to different lexicalization patterns and to variation of what is covered by the SF of single DA terms.
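The following sketch (an illustration of my own; the dictionary encoding and function names are assumptions) shows how an Object Schema can be represented and checked against the admissible column pairs in (48); it reproduces the three schemata in (47) for the board in settings I-III.

    # Sketch: Object Schemata (OS) as two annotation rows over the
    # object's extents, plus the compatibility check from (48).

    ADMISSIBLE = {("max", "across"), ("max", "vert"), ("max", "obs"),
                  ("0", "across"), ("0", "vert"), ("0", "obs")}
    # Matching entries (primary identification) are trivially admissible.

    def well_formed(os):
        """An OS is admissible if every column's <row2, row3> pair is."""
        return all(p == q or (p, q) in ADMISSIBLE
                   for p, q in zip(os["proportion"], os["contextual"]))

    # (47-I)-(47-III): the 100 x 30 x 3 cm board in settings I-III
    OS_I   = {"proportion": ("max", "0", "min"),
              "contextual": ("max", "across", "min")}   # a = long, b = wide
    OS_II  = {"proportion": ("max", "0", "min"),
              "contextual": ("across", "vert", "min")}  # a = wide, b = high
    OS_III = {"proportion": ("max", "0", "min"),
              "contextual": ("across", "obs", "min")}   # a = wide, b = deep

    assert all(well_formed(os) for os in (OS_I, OS_II, OS_III))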
(50b), respectively. For the whole range of entailments and SF postulates based on DA terms see Bierwisch (1989).

(50) a. The board is short → The board is not long.
     b. The board is not long enough ↔ The board is too short.

(51) a. ∃c [[QUANT MAX B] = [N − c]] ⇒ ¬∃c [[QUANT MAX B] = [N + c]]
     b. ¬∃c [[QUANT MAX B] = [N + c]] ⇔ ∃c [[QUANT MAX B] = [N − c]]

4.3.2. Contextually induced dimensional designation

Valid inferences like those in (52) are accounted for, and invalid ones like those in (53) are avoided, by drawing on the information provided by, or else lacking in, contextually specified OS.

(52) a. The board is 1 m wide and 0.3 m high → The board is 1 m long and 0.3 m wide.
     b. The pole is 2 m tall/2 m high → The pole is 2 m long.

(53) a. The wall is wide and high enough ↛ The wall is long and wide enough.
     b. The tower is 10 m tall/high ↛ *The tower is 10 m long.

The valid inferences result from the operation of de-specification, which is simply the reverse of the operation of contextual specification defined in (46b):

(54) De-specification:
     a. For any OS for x with a vertical entry ⟨p, q⟩, there is an OS′ with ⟨p, p⟩.
     b. For any OS for x with a vertical entry ⟨0, q⟩, there is an OS′ with ⟨0, across⟩.

The inferences in (53a, b) are ruled out as invalid because the OS under review do not contain the type of entries needed for (54) to apply.

4.3.3. Commensurability of object extents

Note that the DA terms long, wide and/or thick are not hyponyms to big, despite the fact that big may refer to the [v + c] of one, two or all three extents of a 3D object, depending on the OS of the objects at issue. When objects differing in dimensionality are compared by using the DA term big, the dimensions it covers are determined by the common share of the OS involved, cf. (55):

(55) a. My car is too big for the parking space. (too long and/or too wide)
     b. My car is too big for the garage door. (too wide and/or too high)

So it is above all the two mapping operations between SF and CS representations as defined in (46a, b) and exemplified by DAP and OS that account for the whole range of seemingly complicated facts about DA to spatial objects.
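As a final illustration of how one of these mapping operations can be operationalized, here is a small continuation of the previous sketch for de-specification (54); the column encoding is again my own assumption, and only the tall/long case from (52b) is asserted, since the fragment above does not spell out the tower's OS for (53b).

    # Sketch: de-specification (54) as an operation on one OS column,
    # written as a <row2, row3> pair for a single object extent.

    def despecify(column):
        p, q = column
        if p == "0":
            return ("0", "across")  # (54b): <0, q>  ->  <0, across>
        return (p, p)               # (54a): <p, q>  ->  <p, p>

    # (52b) The pole is 2 m tall/high -> The pole is 2 m long:
    # 'tall/high' anchors in <max, vert>; de-specification recovers the
    # <max, max> column that 'long' requires, so the inference is valid.
    assert despecify(("max", "vert")) == ("max", "max")

    # (53b) is blocked because the tower's OS contains no column from
    # which (54) could recover <max, max>; no assertion is made here,
    # as the tower's exact OS encoding is not given in the text above.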
5. Summary and outlook

In this article we have reported on some pros and cons related to distinguishing SF and CS representations and illustrated them by data and facts from a selection of semantic phenomena. Now we briefly outline the state of the art in more general terms and take a look at the desiderata that define the agenda for future research. The current situation can be summarized in three statements: (i) the SF vs. CS distinction brings in clear-cut advantages, as shown by the examples in §§2-4; (ii) we still lack reliable heuristic strategies for identifying the appropriate SF of a lexical item; (iii) it is difficult to define the scope of variation a given SF can cover at the CS level.
What we urgently need is independent evidence for the basic assumption underlying the distinction: SF representations and CS representations differ in nature as they are subject to completely different principles of organization. By correlating the SF vs. CS distinction with distinctions relevant to other levels of linguistic structure formation, cf. (2)-(5) in section 1, the article has taken some steps in that direction. One of them is to clarify the differences between SF and CS that derive from their linguistic vs. non-linguistic origin; cf. (4). The linguistic basis of the SF-representations of DA terms, for instance, is manifested (i) in the DAP constants' interrelation by postulates underlying lexical relations, (ii) in participating in certain lexicalization patterns (e.g. proportion-based vs. observer-based), and (iii) in being subject to the uniqueness constraint in (49), which is indicative of the semioticity of the system it applies to, whereas the Conceptual System CS is not a semiotic one. Pursuing this line of research, phenomena specific to natural languages like idiosyncrasies, designation gaps, collocations, connotations, folk etymologies etc. should be scrutinized for their possible impact on establishing SF as a linguistically determined level of representation.
The non-linguistic basis of CS-representations, e.g. the OS involved in DA, is manifested (i) in the fact that OS entries are exclusively subject to perception-based compatibility conditions; (ii) in their function to integrate input from the spatial environment regardless of the channel it comes in; and (iii) in their property to allow for valid inferences to be drawn on entries that are induced as contextual specifications. To deepen our understanding of CS-representations, presumptions like the following deserve to be investigated on a broader spectrum and in more detail: (i) CS representations may be underspecified in certain respects, cf. the role of '0' in OS, but as they are not semiotic entities they are not ambiguous; (ii) the compatibility conditions defining admissible OS suggest that the following relation may hold wrt. the well-formedness of representations: sortal restrictions ⊂ selectional restrictions; (iii) CS representations have to be contingent, since contradictory entries cause the system of inferences to break down; contradictions at SF level trigger accommodation activities.
As the agenda above suggests, a better understanding of the interplay of linguistic and non-linguistic aspects of meaning constitution along the lines developed here is particularly to be expected from interdisciplinary research combining methods and insights from linguistics, psycholinguistics, neurolinguistics and cognitive psychology.

6. References

Asher, Nicholas 2007.
A Web of Words: Lexical Meaning in Context. Ms. Austin, TX: University of Texas.
738 VI. Cognitively oriented approaches to semantics Bierwiscli, Manfred 1983. Semantische und konzeptuelle Interpretation lexikalischer Einheiten. In: R. Ruzi£ka & W. Motsch (eds). Untersuchungen zur Semantik. Berlin: Akademie Verlag, 61-99. Bierwiscli, Manfred 1988. On the grammar of local prepositions In: M. Bierwiscli, W. Motsch & I. Ziinmerinann (eds.). Syntax, Semantik und Lexikon. Berlin: Akademie Verlag, 1-65. Bierwiscli, Manfred 1989. The semantics of gradation. In: M. Bierwiscli & E. Lang (eds.). Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer,71-261. Bierwiscli, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds). Language and Space. Cambridge, MA:The MIT Press, 31-76. Bierwiscli, Manfred 1997. Lexical information from a minimalist point of view. In: Ch. Wilder, H.-M. Gartner & M. Bierwiscli (eds.). The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag. 227-266. Bierwiscli, Manfred 2005. The event structure of cause and become. In: C. Maienborn & A. Wollstein (eds.). Event Arguments: Foundations and Applications.Tiib'mgeiv. Niemeyer, 11-44. Bierwiscli, Manfred 2007. Semantic Form as interface. In: A. Spath (cd.). Interfaces and Interface Conditions. Berlin: dc Gruytcr, 1-32. Bierwiscli, Manfred 2010. become and its presuppositions In: R. Bauerle, U Reyle & T E. Ziniinerinann (eds.). Presupposition and Discourse. Bingley: Emerald, 189-234. Bierwiscli, Manfred & Peter Bosch (eds.) 1995. Semantic and Conceptual Knowledge (Arbeitspa- piere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland. Bierwiscli, Manfred & Ewald Lang (eds) 1989a. Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer. Bierwiscli, Manfred & Ewald Lang 1989b. Somewhat longer - much deeper - further and further. Epilogue to the Dimensional Adjective Project. In: M. Bierwiscli & E. Lang (eds). Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 471-514. Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items Cognition 42,23-60. Blutncr, Rcinhard 1995. Systcmatischc Polyscmic: Ansatzc zur Erzcugung und Bcschrankung von Interprctationsvariantcn. In: M. Bierwisch & P. Bosch (eds). Semantic and Conceptual Knowledge (Arbeitspapiere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland, 33-67. Blutner, Reinhard 1998. Lexical pragmatics Journal of Semantics 15,115-162. Blutncr, Rcinhard 2004. Pragmatics and the lexicon. In: L. R. Horn & G. Ward (eds). The Handbook of Pragmatics. Maiden, MA: Blackwcll, 488-514. Bucking, Sebastian 2009. Modifying event noininals: Syntactic surface meets semantic transparency. In: A. Riester & T Solstad (eds). Proceedings of Sinn und Bedeutung (= SuB) 13. Stuttgart: University of Stuttgart, 93-107. Bucking, Sebastian 2010. Zur Interpretation adnominaler Genitive bei nominalisierten Infinitiven im Deutschen. Zeitschrift flir Sprachwissenschaft 29,39-77. Carstensen. Kai-Uwe 2001. Sprache, Raum und Aufmerksamkeit. Tubingen: Niemeyer. Dolling, Johannes 1994. Sortale Selektioiisbeschrankungen und systematische Bedeutungsvai iabil- itat. In: M. Schwarz (ed.). Kognitive Semantik - Cognitive Semantics. Tubingen: Narr. 41-60. Dolling, Johannes 2001. Systematische Bedeutungsvariationen: Semantische Form und kontextuelle Interpretation (Linguistische Arbeitsberichte 78). Leipzig: University of Leipzig. Dolling, Johannes 2003. Flexibility in adverbal modification: Reinterpretation as contextual enrichment. In: E. Lang, C. 
Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts. Berlin: de Gruyter, 511-552.
Dölling, Johannes 2005a. Semantische Form und pragmatische Anreicherung: Situationsausdrücke in der Äußerungsinterpretation. Zeitschrift für Sprachwissenschaft 24, 159-225.
Dölling, Johannes 2005b. Copula sentences and entailment relations. Theoretical Linguistics 31, 317-329.
Dölling, Johannes 2010. Aspectual coercion and eventuality structure. To appear in: K. Robering & V. Engerer (eds.). Verbal Semantics.
Egg, Markus 2003. Beginning novels and finishing hamburgers: Remarks on the semantics of 'to begin'. Journal of Semantics 20, 163-191.
role in language. We can be in a room, in a social group, in the midst of an activity, in trouble, and in politics. We can move a chair from the desk to the table, move money from one bank account to another, move a discussion from religion to politics, and move an audience to tears. A fundamental distinction between tangibles and intangibles rules out the possibility of understanding the sense of "in" or "move" common to all these uses.
Our effort, by contrast, has sought to exploit the insights of linguists such as Gruber (1965), the generative semanticists, Johnson (1987), Lakoff (1987), Jackendoff (see article 30 (Jackendoff) Conceptual Semantics), and Talmy (see article 27 (Talmy) Cognitive Semantics: An overview). Johnson, Lakoff, Talmy and others have used the term "image schemas" to refer to a conceptual framework that includes topological relations but excludes, for example, Euclidean notions of magnitude and shape. We have been developing core theories that formalize something like the image schemas, and we have been using these to define or characterize words. Among the theories we have developed are theories of composite entities, or things made of other things, the figure-ground relation, scalar notions, change of state, and causality. The idea behind these abstract core theories is that they capture a wide range of phenomena that share certain features. The theory of composite entities, for example, is intended to accommodate natural physical objects like volcanos, artifacts like automobiles, complex events and processes like concerts and photosynthesis, and complex informational objects like mathematical proofs. The theory of scales captures commonalities shared by distance, time, numbers, money, and degrees of risk, severity, and happiness.
The most common words in English (and other languages) can be defined or characterized in terms of these abstract core theories. Specific kinds of composite entities and scales, for example, are then defined as instances of these abstract concepts, and we thereby gain access to the rich vocabulary the abstract theories provide.
We can illustrate the link between word meaning and core theories with the rather complex verb "range". A core theory of scales provides axioms involving predicates such as scale, <, subscale, top, bottom, and at. Then we are able to define "range" by the following axiom:

(∀x, y, z) range(x, y, z) ≡ (∃s, s₁, u₁, u₂) scale(s) ∧ subscale(s₁, s) ∧ bottom(y, s₁) ∧ top(z, s₁) ∧ u₁ ∈ x ∧ at(u₁, y) ∧ u₂ ∈ x ∧ at(u₂, z) ∧ (∀u ∈ x)(∃v ∈ s₁) at(u, v)

That is, x ranges from y to z if and only if there is a scale s with a subscale s₁ whose bottom is y and whose top is z, such that some member u₁ of x is at y, some member u₂ of x is at z, and every member u of x is at some point v in s₁. Then by choosing different scales and instantiating the at relation in different ways, we can get such uses as

The buffalo ranged from northern Texas to southern Saskatchewan.
The students' SAT scores range from 1100 to 1550.
The hepatitis cases range from moderate to severe.
His behavior ranges from sullen to vicious.

Many things can be conceptualized as scales, and when this is done, a large vocabulary, including the word "range", becomes available.
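To see the axiom at work, here is a minimal executable rendering of the range definition; the encoding of a scale as an ordered list of points and of at as a finite mapping are simplifying assumptions of mine, not Hobbs's formalization.

    # Sketch: the 'range' axiom over a toy scale. A scale is encoded as
    # an ordered list of points; 'at' maps each member of x to a point.

    def ranges(members, y, z, scale, at):
        """x ranges from y to z iff some member is at y, some is at z,
        and every member is at a point in the subscale [y, z]."""
        if y not in scale or z not in scale:
            return False
        subscale = scale[scale.index(y):scale.index(z) + 1]  # bottom y, top z
        points = [at[m] for m in members]
        return (y in points and z in points
                and all(p in subscale for p in points))

    # The students' SAT scores range from 1100 to 1550 (coarse toy scale):
    sat_scale = list(range(400, 1601, 50))
    scores = {"ann": 1100, "bo": 1350, "cy": 1550}
    assert ranges(scores.keys(), 1100, 1550, sat_scale, scores)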
the use of a word in context. This is far from a solved problem, but recasting meaning and interpretation in this way gives us a formal, computational way of approaching the problem.
Defeasibility in the logic gives us an approach to prototypes (see article 28 (Taylor) Prototype theory; Rosch 1975). Categories correspond to predicates and are characterized by a set of possibly defeasible inferences, expressed as axioms, among which are their traditional defining features. For example, bachelors are unmarried and birds fly.

(∀x) bachelor(x) ⊃ unmarried(x)
(∀x) bird(x) ∧ etc₁(x) ⊃ fly(x)

where etc₁(x) indicates the defeasibility of the axiom. Each instance of a category has a subset of the defeasible inferences that hold in its particular case. The more prototypical, the more inferences. In the case of the penguin, which is not a prototypical bird, the defeasible inference about flying is defeated. In this view, the basic level category is the predicate with the richest set of associated axioms. For example, there is more gain in useful knowledge from learning an animal is a dog than from learning a dog is a boxer.
Similarly, defeasible inference lends itself to a treatment of novel metaphor. In metaphor, some properties are transferred from a source to a target, and some are not. When we say Pat is a pig, we draw inferences about manner and quantity of eating from "pig", but not about four-leggedness or species membership. The latter inferences are defeated by the other things we know. Hobbs (1992) develops this idea.
Taking abstract core theories as basic may seem to run counter to a central tenet of cognitive linguistics, namely, that our understanding of many abstract domains is founded on spatial metaphor. It is certainly true that the field of spatial relationships, along with social relationships, is one of the domains babies have to figure out first. But I think that to say we figure out space first and then transfer that knowledge to other domains is to seriously underestimate the difficulty of figuring out space. There are many ways one could conceptualize space, e.g., via Euclidean geometry. But in fact it is the topological concepts which predominate in a baby's spatial understanding. A one-year-old baby fascinated by "in" might put a necklace into a trash can and a Cheerio into a shoe, despite their very different sizes and shapes. In spatial metaphor it is generally the topological properties that get transferred from the source to the target. In taking the abstract core theories as basic, we are isolating precisely the topological properties of space that are most likely to be the basis for understanding metaphorical domains.
If one were inclined to make innateness arguments, one position would be that we are born with an instinctive ability to operate in spatial environments. We begin to use this immediately when we are born, and when we encounter abstract domains, we tap into its rich models. The alternative, more in line with our development here, is that we are born with at least a predisposition towards instinctive abstract patterns - composite entities, scales, change, and so on - which we first apply in making sense of our spatial environment, and then apply to other, more abstract domains as we encounter them. This has the advantage over the first position that it is specific about exactly what properties of space might be in our innate repertoire.
For example, the scalar notions of "closer" and "farther" are in it; exact measures of distance are not. A nicely paradoxical coda for
summing up this position is that we understand space by means of a spatial metaphor. I take Talmy's critique of the "concreteness as basic" idea as making a similar point (see article 27 (Talmy) Cognitive Semantics: An overview).
Many of the preceding articles have proposed frameworks for linking words to an underlying conceptual structure. These can all be viewed as initial forays into the problem of connecting lexical meaning with world knowledge. The content of this work survives translation among the various frameworks that have been used for examining it, and survives recasting it as a problem of explicitly encoding world knowledge, specifically, a theory of image schemas explicating such concepts as composite entities, figure-ground, scales, change of state, causality, aggregation, and granularity shifts - an abstract theory that can be instantiated in many different, more specialized domains. The core theories we are developing are not so much theories about particular aspects of the world, but rather abstract frameworks that are useful in making sense of a number of different kinds of phenomena. Levin and Rappaport Hovav (see article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) say, "All theories of event structure, either implicitly or explicitly, recognize a distinction between the primitive predicates which define the range of event types available and a component which represents what is idiosyncratic in a verb's meaning." The abstract theories presented here are an explication of the former of these.
This work can be seen as an attempt at a kind of deep lexical semantics. Not only are the words "decomposed" into what were once called primitives, but also the primitives are explicated in axiomatic theories, enabling one to reason deeply about the concepts conveyed by the text.

2. Core abstract theories

2.1. Composite entities

Composite entities are things made of other things. A composite entity is characterized by a set of components, a set of properties of these components, and a set of relations among the components and between the components and the whole. The concept of composite entity captures the minimal complexity something must have in order for it to have structure. It is hard to imagine something that cannot be conceptualized as a composite entity. For this reason, a vocabulary for talking about composite entities will be broadly applicable.
The elements of a composite entity can themselves be viewed as composite entities, and this gives us a very common example of shifting granularities. It allows us to distinguish between the structure and the function of an entity. The function of an entity as a component of a larger composite entity is its relations to the other elements of the larger composite entity, its environment, while the entity itself is viewed as indecomposable. The structure of the entity is revealed when we decompose it and view it as a composite entity itself. We look at it at a finer granularity. An important question any time we can view an entity both functionally and structurally is how the functions of the entity are implemented in its structure. We need to spell out the structure-function articulations.
For example, a librarian might view a book as an indecomposable entity and be interested in its location in the library, its relationship to other books, to the bookshelves, and
to the people who check the book out. This is a functional view of the book with respect to the library. We can also view it structurally by inquiring as to its parts, its content, its binding, and so on. In spelling out the structure-function articulations, we might say something about how its content, its size, and the material used in its cover determine its proper location in the library.
A composite entity can serve as the ground against which some external figure can be located or can move (see article 27 (Talmy) Cognitive Semantics: An overview). A primitive predicate at expresses this relation. In at(x, y, s), s is a composite entity, y is one of its elements, and x is an external entity. The relation says that the figure x is at a point y in the composite entity s, which is the ground.
The at relation plays primarily two roles in the knowledge base. First, it is involved in the "decompositions" of many lexical items. We saw this above in the definition of "range". There is a very rich vocabulary of terms for talking about the figure-ground relation. This means that whenever a relation in some domain can be viewed as an instance of the figure-ground relation, we acquire at a stroke a rich vocabulary for talking about that domain. This gives rise to the second role the at predicate plays in the knowledge base. A great many specific domains have relations that are stipulated to be instances of the at relation. There are a large number of axioms of the form

(∀x, y, s) r(x, y, s) ⊃ at(x, y, s)

It is in this way that many of the metaphorical usages that pervade natural language discourse are accommodated. Once we characterize some piece of the world as a composite entity, and some relation as an at relation, we have acquired the whole locational way of talking about it. Once this is enriched with a theory of time and change, we can import the whole vocabulary of motion. For example, in computer science, a data structure can be viewed as a composite entity, and we can stipulate that if a pointer points to a node in a data structure, then the pointer is at that node. We have then acquired a spatial metaphor, and we can subsequently talk about, for example, the pointer moving around the data structure. Space, of course, is itself a composite entity and can be talked about using a locational vocabulary.
Other examples of at relations are

A person at an object in a system of objects: John is at his desk.
An object at a location in a coordinate system: The post office is at the corner of 34th Street and Eighth Avenue.
A person's salary at a particular point on the money scale: John's salary reached $75,000 this year.
An event at a point on the time line: The meeting is at three o'clock.
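The pointer example can be made concrete with a sketch of the axiom schema r(x, y, s) ⊃ at(x, y, s): a domain relation is registered as an instance of at, and the locational vocabulary then applies to that domain. The registry mechanism and all names below are illustrative assumptions, not Hobbs's implementation.

    # Sketch: stipulating domain relations as instances of 'at', per the
    # axiom schema  r(x, y, s) -> at(x, y, s).  The registry is made up.

    AT_INSTANCES = []      # domain relations declared to be 'at' relations

    def declare_at(relation):
        AT_INSTANCES.append(relation)

    def at(x, y, s):
        """Figure x is at element y of composite entity s if some
        registered domain relation holds of (x, y, s)."""
        return any(r(x, y, s) for r in AT_INSTANCES)

    # Domain: a pointer into a linked data structure.
    def points_to(ptr, node, structure):
        return structure.get(ptr) == node

    declare_at(points_to)

    lst = {"head": "n1"}             # pointer 'head' points to node n1
    assert at("head", "n1", lst)     # the pointer is AT that node;
    # the motion vocabulary now applies: moving the pointer is a
    # change in what it is 'at'.
    lst["head"] = "n2"
    assert at("head", "n2", lst)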
1. a broad sense, including having a son, having a condition hold, and having a college degree
2. having a feature or property, i.e., the property holding of the entity
3. a sentient being having a feeling or internal property
4. a person owning a possession
7. have a person related in some way: have an assistant
9. have left: have three more chapters to write
12. have a disease: have influenza
17. have a score in a game: have three touchdowns

The supersense can be characterized by the axiom

have-s1(x, y) ⊃ relatedTo(x, y)

In these axioms, supersenses are indexed with s, WordNet senses with w, and FrameNet senses with f. Unindexed predicates are from core theories. The individual senses are then specializations of the supersense where more domain-specific predicates are explicated in more specialized domains. For example, sense 4 relates to the supersense as follows:

have-w4(x, y) ≡ possess(x, y)
have-w4(x, y) ⊃ have-s1(x, y)

where the predicate possess would be explicated in a commonsense theory of economics, relating it to the privileged use of the object. Similarly, have-w3(x, y) links with the supersense but has the restrictions that x is sentient and that the "relatedTo" property is the predicate-argument relation between the feeling and its subject.
The second supersense of "have" is "come to be in a relation to". This is our changeTo predicate. Thus, the definition of this supersense is

have-s2(x, y) ≡ changeTo(e) ∧ have-s1′(e, x, y)

The WordNet senses this covers are as follows:

10. be confronted with: we have a fine mess
11. experience: the stocks had a fast run-up
14. receive something offered: have this present
15. come into possession of: he had a gift from her
16. undergo, e.g., an injury: he had his arm broken in the fight
18. have a baby

In these senses the new relation is initiated but the subject does not necessarily play a causal or agentive role. The particular change involved is specialized in the WordNet senses to a confronting, a receiving, a giving birth, and so on.
The third supersense of "have" is "cause to come to be in a relation to". The axiom defining this is

have-s3(x, y) ≡ cause(x, e) ∧ have-s2′(e, x, y)

The WordNet senses this covers are
remainder-w1(x, e) ≡ remain-w3(x, e)

That is, x is the remainder after process e if and only if x remains after e. The other three senses result from specialization of the removal process to arithmetic division, arithmetic subtraction, and the purposeful cutting of a piece of cloth. The noun "remains" refers to what remains (w3) after a process of consumption or degradation.

3.4. The nature of word senses

The most common words in a language are typically the most polysemous. They often have a central meaning indicating their general topological structure. Each new sense introduces inferences that cannot be reliably determined just from a core meaning plus contextual factors. They tend to build up along what Brugman (1981), Lakoff (1987) and others have called a radial category structure (see article 28 (Taylor) Prototype theory). Sense 2 may be a slight modification of sense 1, and senses 3 and 4 different slight modifications of sense 2.
It is easy to describe the links that take us from one sense to an adjacent one in the framework presented here. Each sense corresponds to a predicate which is characterized by one or more axioms involving that predicate. A move to an adjacent sense happens when incremental changes are made to the axioms. As we have seen in the examples of this section, the changes are generally additions to the antecedents or consequents of the axioms. The principal kinds of additions are embedding in change and cause, as we saw in the supersenses of "have"; the instantiation of general predicates like relatedTo and at to more specific predicates in particular domains, as we saw in all three cases; and the addition of domain-specific constraints on arguments, as in restricting y to be clothes in remove-f1.
A good account of the lexical semantics of a word should not just catalog various word senses. It should detail the radial category structure of the word senses, and for each link, it should say what incremental addition or modification resulted in the new sense. Note that radial categories provide us with a logical structure for the lexicon, and also no doubt a historical one, but not a developmental one. Children often learn word senses independently and only later, if ever, realize the relation among the senses. See article 28 (Taylor) Prototype theory for further discussion of issues with respect to radial categories.

4. Distinguishing lexical and world knowledge

It is perhaps natural to ask whether a principled boundary can be drawn between linguistic knowledge and knowledge of the world. To make this issue more concrete, consider the following seven statements:

(1) If a string w₁ is a noun phrase and a string w₂ is a verb phrase, then the concatenation w₁w₂ is a clause.
(2) The transitive verb "moves" corresponds to the predication move2(x, y), provided a string describing x occurs as its subject and a string describing y occurs as its direct object.
(3) If an entity x moves (in sense move2) an entity y, then x causes a change of state or location of y.
knowledge and interpretation processes, as with the study of lexical conceptual structures, as a strategic decision to identify a fruitful and tractable research problem.
Pustejovsky's work is an attempt to specify the knowledge that is required for interpreting at least the majority of nonstandard uses of words. Kilgarriff (2001) tests this hypothesis by examining the uses of nine particular words in a 20-million word corpus. 41 of 2276 instances were judged to be nonstandard since they did not correspond to any of the entries for the word in a standard dictionary. Of these, only two nonstandard uses were derivable from Pustejovsky's qualia structures. The others required deeper commonsense knowledge or previous acquaintance with collocations. Kilgarriff's conclusion is that "Any theory that relies on a distinction between general and lexical knowledge will founder." (Kilgarriff 2001: 325)
Some researchers in natural language processing have argued that lexical knowledge should be distinguished from other knowledge because it results in more efficient computation or more efficient comprehension and production. One example concerns hyperonymy relations, such as that car(x) implies vehicle(x). It is true that some kinds of inferences lend themselves more to efficient computation than others, and inferences involving only monadic predicates are one example. But where this is true, it is a result not of their content but of structural properties of the inferences, and these cut across the lexical-world distinction. Any efficiency realized in inferring vehicle(x) can be realized in inferring expensive(x) as well.
All of statements (1)-(7) are facts about the world, because sentences and their structure and words and their roles in sentences are things in the world, as much as barges, planes, and New Jersey. There is certainly knowledge we have that is knowledge about words, including how to pronounce and spell words, predicate-argument realization patterns, alternation rules, subcategorization patterns, grammatical gender, and so on. But words are part of the world, and one might ask why this sort of knowledge should have any special cognitive status. Is it any different in principle from the kind of knowledge one has about friendship, cars, or the properties of materials? In all these cases, we have entities, properties of entities, and relations among them. Lexical knowledge is just ordinary knowledge where the entities in question are words. There are no representational reasons for treating linguistic knowledge as special, providing we are willing to treat the entities in our subject matter as first-class individuals in our logic (cf. Hobbs 1985). There are no procedural reasons for treating linguistic knowledge as special, since parsing, argument realization, lexical decomposition, the coercion of metonymies, and so on can all be implemented straightforwardly as inference. The argument that parsing and lexical decomposition, for example, can be done efficiently on present-day computers, whereas commonsense reasoning cannot, does not seem to apply to the human brain; psycholinguistic studies show that the influence of world knowledge kicks in as early as syntactic and lexical knowledge, and yields the necessary results just as quickly.
We are led to the conclusion that any drawing of lines is for the strategic purpose of identifying a coherent, tractable and fruitful area of research. Statements (1)-(6) are examples from six such areas.
Once we have identified and explicated such areas, the next question is what connections or articulations there are among them; Pustejovsky's research and work on lexical conceptual structures are good examples of people addressing this question.
However, all of this does not mean that linguistic insights can be ignored. The world can be conceptualized in many ways. Some of them lend themselves to a deep treatment
of lexical semantics, and some of them impede it. Put the other way around, looking closely at language leads us to a particular conceptualization of the world that has proved broadly useful in everyday life. It provides us with topological relations rather than with the precision of Euclidean 3-space. It focuses on changes of state rather than on correspondences with an a priori time line. A defeasible notion of causality is central in it. It provides means for aggregation and shifting granularities. It encompasses those properties of space that are typically transferred to new target domains when what looks like a spatial metaphor is invoked. More specific domains can then be seen as instantiations of these abstract theories. Indeed, Euclidean 3-space itself is such a specialization.
Language provides us with a rich vocabulary for talking about the abstract domains. The core meanings of many of the most common words in language can be defined or characterized in these core theories. When the core theory is instantiated in a specific domain, the vocabulary associated with the abstract domain is also instantiated, giving us a rich vocabulary for talking about and thinking about the specific domain. Conversely, when we encounter general words in the contexts of specific domains, understanding how the specific domains instantiate the abstract domains allows us to determine the specific meanings of the general words in their current context.
We understand language so well because we know so much. Therefore, we will not have a good account of how language works until we have a good account of what we know about the world and how we use that knowledge. In this article I have sketched a formalization of one very abstract way of conceptualizing the world, one that arises from an investigation of lexical semantics and is closely related to the lexical decompositions and image schemas that have been argued for by other lexical semanticists. It enables us to capture formally the core meanings of many of the most common words in English and other languages, and it links smoothly with more precise theories of specific domains.
I have profited from discussions of this work with Gully Burns, Peter Clark, Tim Clausner, Christiane Fellbaum, and Rutu Mulkar-Mehta. This work was performed in part under the IARPA (DTO) AQUAINT program, contract N61339-06-C-0160.

5. References
Baker, Colin F., Charles J. Fillmore & Beau Cronin 2003. The structure of the FrameNet database. International Journal of Lexicography 16, 281-296.
Bloomfield, Leonard 1933. Language. New York: Holt, Rinehart & Winston.
Brugman, Claudia 1981. The Story of 'Over'. MA thesis, University of California, Berkeley, CA.
Cycorp 2008. http://www.cyc.com/. December 9, 2010.
Davis, Ernest 1990. Representations of Commonsense Knowledge. San Mateo, CA: Morgan Kaufmann.
Fodor, Jerry A. 1980. Methodological solipsism considered as a research strategy in cognitive science. Behavioral and Brain Sciences 3, 63-109.
Fodor, Jerry A. 1983. The Modularity of Mind. An Essay on Faculty Psychology. Cambridge, MA: The MIT Press.
Ginsberg, Matthew L. (ed.) 1987. Readings in Nonmonotonic Reasoning. San Mateo, CA: Morgan Kaufmann.
VII. Theories of sentence semantics

33. Model-theoretic semantics
1. Truth-conditional semantics
2. Possible worlds semantics
3. Model-theoretic semantics
4. References

Abstract
Model-theoretic semantics is a special form of truth-conditional semantics. According to it, the truth-values of sentences depend on certain abstract objects called models. Understood in this way, models are mathematical structures that provide the interpretations of the (non-logical) lexical expressions of a language and determine the truth-values of its (declarative) sentences. Originally designed for the semantic analysis of mathematical logic, model-theoretic semantics has become a standard tool in linguistic semantics, mostly through the impact of Richard Montague's seminal work on the analogy between formal and natural languages. As such, it is frequently (and loosely) identified with possible worlds semantics, which rests on an identification of sentence meanings with regions in Logical Space, the class of all possible worlds. In fact, the two approaches have much in common and are not always easy to keep apart. In a sense, (i) model-theoretic semantics can be thought of as a restricted form of possible worlds semantics, where models represent possible worlds; in another sense, (ii) model-theoretic semantics can be seen as a wild generalization of possible worlds semantics, treating Logical Space as variable rather than given. Consequently, the present introductory exposition of model-theoretic semantics also covers possible worlds semantics - hopefully helping to disentangle the relationship between the two approaches. It starts with a general discussion of truth-conditional semantics (section 1), the main purpose of which is to provide some motivation and background. Model-theoretic semantics is then approached from the possible worlds point of view (section 2), highlighting the similarities between the two approaches that give rise to perspective (i). The final section 3 turns to model theory as providing a mathematical reconstruction of possible worlds semantics, ultimately arriving at the more abstract perspective (ii).

1. Truth-conditional semantics
The starting point for the truth-conditional approach to semantics is the tight connection between meaning and truth. From a pre-theoretic point of view, linguistic meaning may appear a multi-faceted blend of phenomena, partly subjective and private, partly social and inter-subjective, mostly vague, slippery, and apparently hard to define in precise terms. Nevertheless, there are a few unshakable, yet substantial (non-tautological) insights into meaning. Among these is the contention that differences in truth-value necessitate differences in meaning:
Most Certain Principle [MCP] cf. Cresswell (1982: 69)
If S1 and S2 are declarative sentences such that, under given circumstances, S1 is true whereas S2 is not, then S1 and S2 differ in meaning.

The MCP certainly has the flavour of an a priori truth. If it is one, an adequate analysis of meaning ought to take care of it by making a conceptual connection with truth; in fact, this connection might lie at the heart of any adequate theoretical reconstruction of meaning. If so, meaning is primarily sentence meaning; for sentences are the bearers of truth, i.e. the only expressions that can be true (or false). We will say that the truth value of a (declarative) sentence S is 1 [under given circumstances] if S is true [under these circumstances]; and that its truth value [...] is 0 otherwise. Then the lesson to be learnt from the MCP is that, given arbitrary circumstances, the meaning of a sentence S determines its truth value under these circumstances: nothing, i.e. no sentence S′, could have that meaning without having the same truth value. Whatever the meanings of (declarative) sentences may be, then, they must somehow determine their truth conditions, i.e. the circumstances under which the sentences are true. According to truth-conditional semantics, anything beyond this most marked trait of sentence meanings ought to be left out of consideration (or to pragmatics), thus paving the way for the following 'minimalist' semantic axiom:

Basic Principle of Truth-Conditional Semantics [BP] cf. Wittgenstein (1922: 4.431)
Any two declarative sentences agree in meaning just in case they agree in their truth conditions.

Since by (our) definition, truth conditions are circumstances of truth, the BP implies that sentences that differ in meaning also differ in truth value under at least some circumstances. Truth-conditional semantics owes much of its flavour, as well as some of its problems, to this immediate consequence of the BP, to which we will return in section 2.
According to the BP, the meanings of sentences covary with their truth conditions. It is important to realize (and easy to overlook) that a statement of covariance is not an equation. The - common - identification of sentence meanings with truth conditions requires further justification. Some motivation for it may be sought in functionalist considerations, according to which truth conditions suffice to explain the extralinguistic roles attributed to sentence meaning in social interaction and cognitive processes. Though we cannot go into these matters here, we will later see examples of covarying entities - the material models introduced in Section 2.3 - that are less likely to be identified with sentence meanings than truth conditions in the sense envisaged here. Bearing this in mind, we will still read the BP as suggesting that the meanings of sentences may as well be taken to be their truth conditions.
In its most radical form, truth-conditional semantics seeks to derive the meanings of non-sentential expressions from sentence meanings, as the contributions these expressions make to the truth conditions of the sentences in which they occur; cf. Frege (1884: 71). Without any further specification and constraints, this characterisation of non-sentential meaning is bound to be incomplete. To begin with, the contribution an expression makes to the truth conditions of a sentence in which it occurs obviously depends on
the position in which it occurs: the contribution of an expression A occupying position p in a sentence S to the meaning of S must be understood as the contribution of A as occurring in p; for the sake of definiteness, positions p can be identified with (certain) partial, injective functions on the domain of expressions, inserting occupants into hosts, so that an (occupant) expression A occurs in position p of a (host) expression B just in case p(A) = B. Positions in sentences are structural positions that need not coincide with positions in surface strings; in particular, the contributions made by the parts of structurally ambiguous (surface) sentences depend on the latter's readings, which have different structures, hosting their parts in different structural positions.
Given their sensitivity to syntactic structure, the truth-conditional contributions of its parts can hardly be determined by looking at one sentence in isolation. Otherwise minimal pairs like the following, taken from Abbott & Hauser (1995: 6, fn. 10), would create problems when it comes to the truth-conditional contributions of their parts:

(1) Someone is buying a car.
(2) Someone is selling a car.

(1) and (2) have the same truth conditions and the same overall structure, in which the underlined verbs occupy the same position. Since the two sentences are otherwise identical, it would seem that the two alternatives make the same contribution. If so, this contribution cannot be their meaning - after all, the two underlined verb forms are not synonymous and generally do differ in their truth-conditional contributions, as in:

(3) Mary is buying a car.
(4) Mary is selling a car.

Hence, lest the two verbs come out as synonymous, their meanings can only be identified with their truth-conditional contributions if the latter are construed globally, taking into account all positions in which the verbs may occur. Generalising from this example, it emerges that sameness of meaning among any non-sentential expressions A and B (of the same category) requires sameness of truth-conditional contribution throughout the structural positions of all sentences. Since A and B occupy the same structural position in two sentences SA and SB if SB derives from SA by putting B in place of A, this condition on synonymy may be cast in terms of a:

Substitution Principle [Subst]
If two non-sentential expressions of the same category have the same meaning, either may replace the other in all positions within any sentence without thereby affecting the truth conditions of that sentence.

The italicized restriction is made so as not to rule out the possibility that two expressions make the same truth-conditional contribution without being grammatically equivalent. As argued in Gazdar et al. (1985: 32), likely and probable may be cases in point - given the grammatical contrast between likely to leave vs. *probable to leave.
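The interplay of the BP and Subst can be made concrete in a small computational sketch. The following Python fragment is merely illustrative: the four-world 'Logical Space' and the stipulated truth conditions of (1)-(4) are assumptions made up for the occasion, not part of the theory under discussion.

    # A toy Logical Space; the worlds and truth conditions are stipulated.
    W = {'w1', 'w2', 'w3', 'w4'}

    tc = {  # truth conditions modelled as sets of worlds
        'Someone is buying a car':  {'w1', 'w2'},
        'Someone is selling a car': {'w1', 'w2'},   # same truth conditions as (1)
        'Mary is buying a car':     {'w1'},
        'Mary is selling a car':    {'w1', 'w2'},   # differs from (3) in w2
    }

    def same_meaning(s1, s2):
        # BP: sentences agree in meaning iff they agree in truth conditions.
        return tc[s1] == tc[s2]

    # (1) and (2) coincide, yet 'buying' and 'selling' are not synonymous:
    # substituting one for the other changes truth conditions elsewhere,
    # which is exactly what Subst rules out for synonyms.
    assert same_meaning('Someone is buying a car', 'Someone is selling a car')
    assert not same_meaning('Mary is buying a car', 'Mary is selling a car')

On this construal, the failure of the second assertion to mirror the first is what blocks the identification of the two verbs' meanings with their contribution to any single sentence.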
Taken together, the BP and Subst form the backbone of truth-conditional semantics: the meanings of sentences coincide with their truth conditions; and synonymy among non-sentential expressions is restricted by the synonymies among the sentences in which they occur. Subst leaves open what kinds of entities the meanings of non-sentential expressions are. Keeping the 'minimalist' spirit behind the passage from the MCP to the BP, it is tempting to identify them by strengthening Subst to a biconditional, thereby arriving at a synonymy criterion, according to which two (non-sentential) expressions have the same meaning if they may always replace each other within a sentence without affecting the truth conditions of that sentence. Ontologically speaking, this synonymy criterion has meaning supervene on truth conditions; cf. Kupffer (2007). However, we will depart from radical truth-conditional semantics and not take it for granted that this criterion always applies; possible counterexamples will be addressed in due course.
The above considerations also support a more general version of Subst according to which various synonymous (non-sentential) expressions may be replaced simultaneously. As a result, the meaning of any complex expression can be construed as only depending on the meanings of its immediate parts - thus giving rise to a famous principle, which we note here for future reference:

Principle of Compositionality [Compo]
The meaning of a complex expression functionally depends on the meanings of its immediate parts and the way in which they are combined.

Like Subst, Compo does not say what kinds of objects non-sentential meanings are. In fact, a permutation argument has it that they are hopelessly underdetermined: if a truth-conditional account specified some objects x and y as the meanings of some (non-sentential) expressions A and B, then an alternative account could be given that would also be in line with Subst, but according to which x and y have changed their places. We will take a closer look at this argument in section 3.
Even though Compo and Subst are not committed to a particular concept of non-sentential meaning, some choices might be more natural than others. Indeed, Subst falls out immediately if truth-conditional contributions and thus meanings (of non-sentential expressions) are constructed by abstraction, as classes of expressions that make the same truth-conditional contributions in all positions in which they occur: if the meaning of an expression A is identified with the set of all expressions with which A can be replaced without any change in truth conditions, then Subst trivially holds. However, there is a serious problem with this strategy: if (non-sentential) meanings were classes of intersubstitutable expressions, they would be essentially language-dependent. But if meanings were just substitution classes, then meaning-preserving translation between distinct languages would be impossible; and even more absurdly, all expressions of a language would have to change their meanings each time a new word is added to it! This certainly speaks against a construal of meanings in terms of abstraction, even though synonymy classes have proved useful as representatives of meanings in the algebraic study of compositionality; cf. Hodges (2001).
If contributing to truth conditions and conforming to Subst were all there is to non-sentential meanings, then there would be no fact of the matter as to what kinds of objects
they are; moreover, there would be no obvious criteria for deciding whether two expressions of different languages have the same meaning. Hence a substantial theory of non-sentential meaning cannot live on Subst alone. One way to arrive at such a theory is by finding a natural analogue to the MCP that goes beyond the realm of sentences. The most prominent candidate is what may be dubbed the:

Rather Plausible Principle [RPP]
If T1 and T2 are terms such that, under given circumstances, T1 refers to something to which T2 does not refer, then T1 and T2 differ in meaning.

A term in the sense of the RPP is an expression that may be said to refer to something. Proper names, definite descriptions, and pronouns are cases in point. Thus, in one of its usages, (i) the name Manfred refers to a certain person; under the current actual circumstances, (ii) the description Regine's husband refers to the same person; and in an email to Manfred Kupffer, I can use (iii) the pronoun you to refer to him. Of course, in different contexts, (iii) can also refer to another person; hence the two terms mentioned in the RPP must be understood as being used in the same context. Under different (past or counterfactual) circumstances, (ii) may refer to another person, or to no one at all; the RPP only mentions the reference of terms under the same circumstances, and in the latter case thus applies vacuously. For the purpose of this survey we will also assume that names with more than one bearer, like (i), are lexically ambiguous; the terms quantified over in the RPP must therefore be taken as disambiguated expressions rather than surface forms.
Why should the RPP be less certain than the MCP? One possible reason comes from ontology, the study of what there is: the exact subject matter of sentences containing referring terms is not necessarily clearly defined, let alone obvious. The reader is referred to the relevant literature on the inscrutability of reference; Williams (2005) contains a good, if biased, survey.
Putting ontological qualms aside, we may conclude from the RPP that whatever the meanings of terms may be, they must somehow determine their reference conditions, i.e. the referents as depending on the circumstances. According to a certain brand of truth-conditional semantics, anything beyond this most marked trait of term meanings ought to be left out of consideration (or to pragmatics), thus giving rise to a stronger version of our 'minimalist' semantic axiom:

Extended Basic Principle of Truth-Conditional Semantics [EBP]
Any two declarative sentences agree in meaning just in case they agree in their truth conditions; and any two terms agree in meaning just in case they agree in their reference conditions.

Like the BP, the EBP is merely a statement of covariance, not an equation. The - common - identification of term meanings with reference conditions also requires further 'external' evidence. However, as in the case of the BP, we will read the EBP as suggesting that term meanings be reference conditions. Following this suggestion, and given Subst, the EBP may then be used to determine the meanings of at least some non-sentential expressions in a non-arbitrary and at the same time language-independent way, employing a heuristics based on:
Frege's Functional Principle [FFP]
If not specified by the EBP, the meaning of an expression A is the meaning of (certain) expressions in which A occurs as functionally depending on A's occurrence.

Since the functional meanings of parts can be thought of as contributions to host meanings, FFP can be seen as a version of Frege's (1884: x) Context Principle that is not committed to the radical form of truth-conditional semantics: 'nach der Bedeutung der Wörter muss im Satzzusammenhange, nicht in ihrer Vereinzelung gefragt werden' ['one must ask for the meanings of words in their sentential context, not in isolation']. Obviously, FFP is more specific; it also deviates from the original in having term extensions determined directly, in accordance with the EBP - and Frege's (1892) later policy.
According to FFP, the meaning of an expression A that is not a sentence or a term turns out to be a function assigning meanings of host expressions (in which A occurs) to meanings of positions (in which A occurs) - where the latter can be made out in case the positions coincide with adjacent expressions whose meanings have been independently identified - either by the EBP, or by previous applications of the same strategy. As a case in point, the meanings of (coordinating) conjunctions C (like and and or) may be determined from their occurrences in unembedded coordinations of sentences A and B, i.e. sentences of the form A C B; these occurrences are completely determined by the two coordinated sentences and may thus be identified with the ordered pairs (A, B). Hence, following FFP, the meaning of a conjunction comes out as a function from pairs of sentence meanings to sentence meanings, i.e. a function assigning truth conditions to pairs of truth conditions. In particular, the meaning of and assigns the truth conditions of any sentence A and B to the pair consisting of the truth conditions of A and the truth conditions of B. In a similar vein, by looking at simple predications of the form T P (where T is a term), the meanings of predicates P come out as functions assigning sentence meanings to term meanings. Since the latter are determined by the EBP, predicate meanings are thus construed as functions from reference conditions to truth conditions. This result may in turn be used to obtain the meanings of transitive verbs V, which may take terms T as objects to form predicates, the meanings of which have already been identified; hence, by another application of FFP, the meaning of a transitive verb comes out as a function assigning predicate meanings to reference conditions (= the meanings of terms in object position).
In one form or another, the heuristics just sketched has been applied widely, and rather successfully, in linguistic semantics. Following it, a large variety of expressions can be assigned meanings whose primary function is to contribute to truth and reference conditions without depending on any particular language. Nevertheless, for a variety of reasons, following FFP (in the way indicated here) does not settle the issue of determining meaning in a fully principled, non-arbitrary way:
- As the above examples show, the meanings of expressions depend on a choice of primary occurrences, i.e. positions in which they can be read off the immediate context. However, in general there is no way of telling which of a given number of grammatical constructions ought to serve this purpose; and the choice may influence the meanings obtained by applying FFP.
Quantifier phrases - expressions like nobody; most linguists; every atom in the universe; etc. - are cases in point. Customarily they are analyzed by putting them in subject position and thus interpreting them as functions from predicate meanings (determined in the way above) to sentence meanings (determined
quantifier in subject position may depend on the value of the predicate extension for nameless individuals, the aforementioned idealised generalisation of predicate extensions to functions operating on arbitrary individuals turns out to be crucial for this step.
The above extensional adaptation is much more limited in its application than FFP in general: it only works as long as extensions behave compositionally; otherwise it is bound to fail; and there are a number of environments that cannot be analyzed by compositionally combining extensions. Still, a considerable portion of the grammatical constructions of English is extensional in that the extensional version of FFP may be applied to them, thus extending the notion of extension to ever more kinds of expressions. In the present and the following section we will be exclusively concerned with extensional constructions and, for convenience, pretend that all constructions are extensional.
The generalisation of extensions to expressions of (almost) arbitrary categories leads to a natural theoretical construction of meanings in terms of extensions and circumstances. The key to this construction lies in a generalisation of the EBP, which comes for free for those expressions whose extensions have been constructed in the way indicated, i.e. using the extensional version of FFP. To see this, we have to be more specific about truth conditions and reference conditions (in the sense intended here): since they are truth values and referents as depending on circumstances, we will henceforth take them to be functions assigning the former to the latter. This identification obviously presupposes that, whatever circumstances may be, they collectively form a set that may serve as the domain of these functions. Though we will not be entirely specific as to the nature of this set, we assume that it is vast, and that its members do not miss a detail:

Vastness: Circumstances may be as hypothetical as can be.
Detail: Circumstances are as specific as can be.

Vastness reflects the raison d'être of circumstances in semantic theory: according to the EBP, they serve to differentiate the meanings of non-synonymous sentences and terms by providing them with distinct truth values and referents, respectively. In order to fulfill this task in an adequate way, actual circumstances - those met in the actual world - obviously do not suffice, as a random example shows:

(5) The physicist who discovered the X-rays died in Munich.
(6) The first Nobel laureate in physics died in Munich.

As the educated reader knows, it so happens that the German physicist Wilhelm Conrad Röntgen discovered a phenomenon known as X-rays in 1895, was subsequently awarded the first Nobel prize in physics in 1901, and died in Munich in 1923. Hence the sentences (5) and (6) are both true; and their subjects refer to the same person. In fact, (5) and (6) are true under any actual circumstances; likewise, their subjects share their referent under any actual circumstances. Nevertheless, (5) and (6) are far from being synonymous, and neither are the definite descriptions in their subject position. In order to reflect these semantic differences in their truth and reference conditions, respectively, circumstances are called for under which (5) and (6) differ in truth value and, consequently, their subjects do not corefer. Such circumstances are not hard to imagine; the inventive reader
is invited to concoct his or her pertinent favourite scenario. Since it is hardly foreseeable what kinds of differences are needed in order to separate two non-synonymous sentences or terms by circumstances under which their extensions do not coincide, we will assume that any conceivable circumstances count as evidence for non-synonymy, however far-fetched or bizarre they may be. The following pair, inspired by Macbeath (1982), is a case in point:

(7) Dr Who ate himself.
(8) Dr Who is his own father.

Lest both (7) and (8) come out as false for all circumstances and thus as synonymous, there better be circumstances that contradict current science, thereby reflecting the meaning difference between the two sentences.
Detail is an assumption made mostly for convenience and definiteness; but once made, it cannot be ignored. It captures the idea that circumstances are chosen in such a way as to determine the truth values of any sentence whatsoever. As far as actual circumstances are concerned, this is a rather natural assumption. My present circumstances are such that I am sitting in a train traveling through northern regions of Germany at a fairly high speed. I have no idea what precisely the speed is, nor do I know where exactly the train is right now, how many passengers are on it, how old the engineer is, etc. But all of these details, and innumerable others, are resolved by these circumstances, which are as specific as can be. The point of Detail is that counterfactual (= non-actual) circumstances are just as specific. Hence if I had been on a space shuttle instead of riding a train, the circumstances that I would have been in would have included a specific velocity, a specific number of co-passengers, etc. To be sure, these details could be filled in in different ways, all of which correspond to different counterfactual circumstances. Given Detail, then, counterfactual circumstances are unlike the ordinary conception of worlds of imagination or fiction: stories and novels usually, nay always, leave open a lot of details - like the exact number of hairs on the protagonist's head, etc. If anything, worlds of fiction correspond to sets of circumstances; for instance, 'the world of Sherlock Holmes' corresponds to the set of (counterfactual) circumstances that are in accordance with whatever Conan Doyle's stories say, and disagree on all kinds of unsettled details.
As was said above, we assume that the totality of all actual and counterfactual circumstances forms a rather large set which is called Logical Space; cf. Wittgenstein (1922: 3.4 & passim). We will follow logical and semantic tradition and refer to the elements of Logical Space as possible worlds rather than circumstances, even though the term is slightly misleading in that it not only suggests lack of detail (as was just noted) but also a certain grandeur, which the members of Logical Space need not have; it will be left open here whether possible worlds are all-inclusive agglomerations of facts, or whether they could also correspond to more mundane (!), medium-sized situations. What is important is that Logical Space is sufficiently rich to differentiate the meanings of arbitrary non-synonymous sentences and terms. Yet even the macroscopic Vastness of Logical Space and microscopic Details of its worlds cannot guarantee that any two sentences that appear to differ in meaning also differ in their truth conditions.
Thus, e.g., (9)-(10) are true of precisely the same possible worlds, but arguably differ in meaning:
Other (extensional) constructions may require some ingenuity on the semanticist's part in order to determine the precise way in which extensions combine. Quantified objects are an infamous case in point. As the reader may verify, for any possible world w, the extension of a predicate of the form V Q, where V is a transitive verb and Q is a quantifier phrase (in direct object position), can be given by the following equation:

(15) ||V Q||^w = λx. ||Q||^w(λy. ||V||^w(y)(x))

Like (14), equation (15) completely specifies the extension of a certain kind of complex expression in terms of the extensions of its immediate parts; and it does so in a perfectly general manner, for arbitrary worlds w. By definition, this is so for all extensional constructions, which are extensional precisely in that the extensions of the expressions they combine determine the extensions of the resulting phrases, thus giving rise to general equations like (14) or (15). In particular, then, the combinatorial part (b) of the specification of extensions does not depend on the particular circumstances envisaged. It thus turns out that the world dependence of extensions is entirely a matter of lexical specification (a). Hence the role worlds play in the determination of extensions can be seen as being restricted to the extensions of lexical expressions - the rest is compositionality, captured by general equations like (14) and (15).
To the extent that the determination of extensions is the only role possible worlds w play in (compositional extensional) semantics, they could be identified with functions assigning extensions to lexical expressions. If A is a lexical expression, we would thus have:

(16) ||A||^w = Fw(A)

In (16) Fw is a function assigning to any expression in its domain the extension of that expression at world w. What precisely is the domain of Fw? It has just been noted that it only contains lexical expressions - but does it include all of them? A brief reflection shows that this need not even be so; for some of the equations of the form (16) do not depend on a particular choice of w. For instance, as noted above, under any circumstances w the extension of a (sentence-)coordinating conjunction and is a binary function on truth values; the same goes for other 'logical' particles:

(17) a. ||and||^w = λu.λv. u × v [= (12e)]
     b. ||or||^w = λu.λv. (u + v) − (u × v)
     c. ||not||^w = λu. 1 − u

The equations in (17) imply that, in any worlds w and w′, the extensions of and, or, and not remain stable: ||and||^w = ||and||^w′, ||or||^w = ||or||^w′, and ||not||^w = ||not||^w′. This being so, the specifications of the world-dependent extensions of lexical expressions may safely skip such logical words. A similar case can be made for quantificational determiners and the is of identity:

(18) a. ||every||^w = λP.λQ. ⌜P ⊆ Q⌝ [= (12a)]
     b. ||no||^w = λP.λQ. ⌜P ∩ Q = ∅⌝
     c. ||is||^w = λx.λy. ⌜x = y⌝
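For concreteness, the world-independent extensions in (17) and (18), as well as the combination rule (15), can be emulated with truth values 0/1 and characteristic functions over a finite domain. In the following Python sketch, the toy domain, the relation admires, and the two sample predicates are invented for illustration; only the defining equations themselves are taken from the text.

    D = {'roentgen', 'curie', 'planck'}           # an assumed toy domain

    AND = lambda u: lambda v: u * v               # (17a)
    OR  = lambda u: lambda v: (u + v) - u * v     # (17b)
    NOT = lambda u: 1 - u                         # (17c)

    def every(P):                                 # (18a): 1 iff P is a subset of Q
        return lambda Q: int(all(Q(x) for x in D if P(x)))

    def no(P):                                    # (18b): 1 iff P and Q do not overlap
        return lambda Q: int(not any(P(x) and Q(x) for x in D))

    IS = lambda x: lambda y: int(x == y)          # (18c)

    def verb_object(V, Q):                        # (15): ||V Q|| = λx. ||Q||(λy. ||V||(y)(x))
        return lambda x: Q(lambda y: V(y)(x))

    # Two invented predicate extensions, as characteristic functions:
    physicist = lambda x: int(x in {'roentgen', 'planck'})
    chemist   = lambda x: int(x == 'curie')

    # An invented transitive-verb extension, curried object-first:
    admires = lambda y: lambda x: int(x == 'curie' and y != 'curie')

    assert no(physicist)(chemist) == 1
    assert every(physicist)(chemist) == 0
    assert verb_object(admires, every(physicist))('curie') == 1   # 'curie admires every physicist'
    assert AND(1)(OR(0)(1)) == 1

Note that none of these definitions mentions a world parameter: that is precisely the sense in which the extensions of logical words remain stable across Logical Space.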
(ii-a) |every|^Mw = λP.λQ. ⌜P ⊆ Q⌝, where P ⊆ Uw and Q ⊆ Uw
(iii) |A|^Mw = Fw(A), if A ∈ NE
(iv-a) |D N|^Mw = |D|^Mw(|N|^Mw) if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun;

In order to complete (29), clauses (i) and (ii) would have to take care of all logical words of E. We will now give a semantic characterisation of logical words that helps to draw the line between the non-logical part NE - or NL in general - and the rest of the lexicon. We have already seen that one characteristic feature of logical words is that their extension is largely world-independent: it may depend on which individuals there are in the world but it does not depend on any particular worldly facts. However, having a world-independent intension (even stricto sensu) is not sufficient for being a logical word. Indeed, there are good reasons for taking the intensions of proper names as constant functions over Logical Space; cf. Kripke (1972), where terms with constant intensions are called rigid designators. Still, a name like Jesus can hardly be called logical even if its bearer turned out to be present in every possible world.
What, then, makes a word, or an expression in general, logical? The (rather standard) answer given here rests on the intuition that the extension of a logical word can be described in purely structural terms. As a case in point, the determiner no is logical because its extension may be described as the relation that holds between two sets of individuals just in case they do not overlap - or, in terms of characteristic functions: that function which, successively applied to two characteristic functions (of sets of individuals), yields (the truth value) 1 just in case these functions do not both assign 1 to any individual. If one thinks of functions as configurations of arrows leading from arguments to values, then the extension of no - and that of a logical word in general - can be described solely in terms of the abstract arrangement of these arrows, without mentioning any particular individuals or other extensions - apart from the truth values. In other words, the descriptions do not depend on the identity of the individuals, which may be anything - or replaced by any other individuals. The idea, then, is to characterise logical words as having extensions that are stable under any replacement of individuals. In order to make this intuition precise, a bit of notation should come in handy.
It is customary to classify the extensions of expressions according to whether they derive (i) from the EBP, or (ii) by application of FFP: (i) the extensions of sentences are said to be of type t, those of terms are of type e; (ii) if the extension of an expression operates on extensions of some type a resulting in extensions of some type b, it is said to be of type (a,b). In other words, (i) t is the type of truth values; e is the type of individuals (or entities); (ii) (a,b) is the type of (total) functions from a to b. More precisely, t is a canonical label of the set of truth values, etc.; 't' is mnemonic for truth value, 'e' abbreviates entity. In this notation, the extensions of sentences, terms, coordinating conjunctions, predicates, transitives, quantificational phrases, and determiners are of types t; e; (t,(t,t)); (e,t); (e,(e,t)); ((e,t),t); and ((e,t),((e,t),t)), respectively.
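The type inventory just listed is easy to mechanise. In the following Python sketch, types are encoded as the atomic labels 'e' and 't' and as pairs (a,b); the counting function is an illustrative extra, not part of the text, but it reflects the definition of U-extensions given in section 3.1 below (functions of type (a,b) are total functions from a-extensions to b-extensions).

    E, T = 'e', 't'

    def fn(a, b):
        # the type (a,b) of functions from a-extensions to b-extensions
        return (a, b)

    # Type assignment for the categories listed above:
    tau = {
        'sentence':    T,
        'term':        E,
        'conjunction': fn(T, fn(T, T)),
        'predicate':   fn(E, T),
        'transitive':  fn(E, fn(E, T)),
        'quantifier':  fn(fn(E, T), T),
        'determiner':  fn(fn(E, T), fn(fn(E, T), T)),
    }

    def count(a, n):
        # number of extensions of type a over a domain of n individuals
        if a == E:
            return n
        if a == T:
            return 2
        dom, cod = a
        return count(cod, n) ** count(dom, n)

    # Over a 3-element domain there are 2**3 = 8 predicate extensions
    # and 2**8 = 256 quantifier extensions:
    assert count(tau['predicate'], 3) == 8
    assert count(tau['quantifier'], 3) == 256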
It should be noted that, unlike the extensions of the expressions, their types remain the same throughout Logical Space (and thus across all material models). The function τL assigning to each expression of a
language L the unique type of its extensions is called the type assignment of L; we take τL to be part of the specification of the language L (and will usually suppress the subscript).
Now for the characterisation of logicality. Given (not necessarily distinct) possible worlds w and w′, a replacement (of w by w′) is a bijective function from the domain of individuals of w to the domain of individuals of w′ (which must therefore have the same cardinality). Replacements may then be generalized from individuals (of type e) to all types of extensions: given a replacement ρ (from w to w′), the following recursive equations define corresponding functions ρa, for each type a:
- ρe = ρ;
- ρt = {(0,0), (1,1)} [= λx.x, where 'x' ranges over truth values];
- ρ(a,b) = λf. {(ρa(x), ρb(y)) | f(x) = y} [where 'f' ranges over functions of type (a,b)].
In other words, the generalised replacement leaves truth values untouched - because they define the structure of extensions like characteristic functions - and maps functional extensions to corresponding functions, replacing arrows from x to y by arrows between substitutes; it is readily seen that, for any type a, ρa is the identical mapping on the extensions of type a if (and only if) ρ is the identical mapping on the domain of individuals.
Given the above generalisation of replacements from individuals to objects of arbitrary types, a logical word A can be defined as one whose extensions cannot be affected by replacements. More precisely, if f is an intension of type a (i.e. a function from W to extensions of type a), then f is (replacement-)invariant just in case for any replacements ρ and ρ′ of w by some world w′, it holds that ρa(f(w)) = ρ′a(f(w)); and logical words may then be characterised as lexical expressions A with invariant intensions: ρτ(A)(||A||^w) = ρ′τ(A)(||A||^w), for any worlds w and replacements ρ and ρ′ of w by some world w′. The definition implies that ρτ(A)(||A||^w) = ||A||^w whenever A is a logical word and ρ is a replacement from a world w to itself (or any world with the same domain of individuals) - which is a classical criterion of logicality, going back to Lindenbaum & Tarski (1935), and rediscovered by a number of scholars since; cf. MacFarlane (2008: Section 5) for a survey. Extensive treatments of replacements in (models of) Logical Space can be found in Fine (1977) and Rabinowicz (1979). Derivatively, we will also call extensions of type a invariant if they happen to be the values of invariant intensions of type a.
Invariant extensions are always of a peculiar, combinatorial kind. In particular, and ignoring all too small universes (of cardinalities 1 and 2), extensions of type e are never invariant; there are only two invariant extensions of type (e,t), viz. the empty one and the universe; and four of type (e,(e,t)), viz. identity, distinctness, and the two trivial (empty and universal) binary relations; extensions of type ((e,t),t) are invariant just in case they classify sets of individuals according to their cardinalities; etc. As a consequence, the extensions of logical words are severely limited by the invariance criterion. Still, it takes more for an intension to be invariant than just having invariant extensions; in particular, a function that assigns different invariant extensions to worlds with the same domain cannot be invariant.
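The recursive clauses for ρa can be implemented directly once extensions over a finite universe are encoded hashably: individuals as themselves, truth values as 0/1, and a function of type (a,b) as a sorted tuple of argument-value pairs. The two-element universe and the encoding below are assumptions of this Python sketch; the invariance check for no then mirrors the criterion just stated.

    E, T = 'e', 't'
    U = ('a', 'b')                      # an assumed two-element universe

    def rho(a, p, x):
        # lift the individual replacement p (a dict on U) to type a
        if a == E:
            return p[x]
        if a == T:
            return x                    # truth values stay put
        dom, cod = a
        return tuple(sorted((rho(dom, p, u), rho(cod, p, v)) for (u, v) in x))

    def char(S):
        # encode the characteristic function of S (a subset of U) as a tuple
        return tuple(sorted((x, int(x in S)) for x in U))

    preds = [char(S) for S in (set(), {'a'}, {'b'}, {'a', 'b'})]

    def overlap(P, Q):
        return any(pv and qv for (_, pv), (_, qv) in zip(P, Q))

    # The extension of 'no', encoded as a function of type ((e,t),((e,t),t)):
    NO = tuple(sorted((P, tuple(sorted((Q, int(not overlap(P, Q))) for Q in preds)))
                      for P in preds))

    swap = {'a': 'b', 'b': 'a'}
    det = ((E, T), ((E, T), T))
    assert rho(det, swap, NO) == NO                        # 'no' is invariant
    assert rho((E, T), swap, char({'a'})) != char({'a'})   # a 'material' predicate is not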
More generally, if the domains U and U′ of two material models M and M′ have the same cardinality, the invariance condition on logical words of the second kind (ii) ensures that, intuitively speaking, they have analogous extensions. Thus, e.g., a determiner whose extension relative to M is the subset-relation on U will denote the subset-relation on U′ in M′.
by (22a) is the case iff {w | |S|^Mw = 1} ⊆ {w | |S′|^Mw = 1}. Similar arguments apply to the other sense relations defined above. Let us define the L-intension |A| of an expression A (of a language L) as the function that assigns A's extension (according to Mw) to each material model: |A|(Mw) = |A|^Mw, i.e. |A| = λMw. |A|^Mw. Observation (22) shows that L-intensions are as good as intensions when it comes to determining extensions; and we have seen that they may also be used to reconstruct sense relations defined via intensions, basically on account of (22b). Moreover, the Boolean structure of the set of propositions expressible in L (= those that happen to be the intensions of a sentence of L) turns out to be isomorphic to the Boolean structure of L-propositions (= the L-intensions of the sentences of L): as the patient reader may want to verify, the mapping from ||S|| to |S| is one-one (injective) and preserves algebraic (Boolean) structure in that ||S|| ⊆ ||S′|| iff |S| ⊆ |S′|, ||S|| ∩ ||S′|| corresponds to |S| ∩ |S′|, etc. The tight relation between Logical Space and the set of material models can also be gleaned from considering worlds that cannot be distinguished by the expressions of a given language L:

Definition
If w and w′ are possible worlds and L is a language, then w is L-indistinguishable from w′ - in symbols: w ≡L w′ - iff ||A||^w = ||A||^w′, for any expression A of L.

In general, L-indistinguishability will not collapse into identity. The special case in which it does will be marked for future reference:

Definition
A language L is discriminative iff no two distinct possible worlds w and w′ are L-indistinguishable.

Hence in a discriminative language L any two distinct possible worlds w and w′ can be distinguished by some expression A: ||A||^w ≠ ||A||^w′; it is not hard to see that in this case the material models stand in a one-one correspondence with Logical Space. In particular, if L contains a term the referent of which at a given world is that world itself, L will certainly be discriminative; arguably, the definite description the world is such a term, thus rendering English discriminative. In any case, for almost all languages L, any two L-indistinguishable worlds give rise to the same material model:

(23) If w ≡L w′, then Mw = Mw′.

As a consequence, material models stand in a one-one correspondence to the equivalence classes induced by L-indistinguishability (which is obviously an equivalence relation).
Since L-intensions and material models are suited to playing the role of worlds in determining extensions and to characterising Boolean structure, one may wonder whether they could replace worlds and intensions in general. In other words, is it possible to do (extensional) semantics without Logical Space altogether? In particular, could meaning be defined in terms of material models instead of possible worlds? Ontological thrift seems to support this option: on the face of it, material models are rather down-to-earth mathematical structures as compared to the dubious figments of imagination that
possible worlds may appear to the semantic layman. However, this impression is mistaken: though material models are mathematical structures, they consist of precisely the stuff that possible worlds are made of, viz. fictional individuals under fictional circumstances. Still, replacing intensions with L-intensions may turn out to be, if not ontologically less extravagant, at least theoretically more parsimonious, sparing us a baroque theory of Logical Space and its structure. However, this is not obvious either:
- So far material models have been defined in terms of Logical Space, since the extension a material model assigns to an expression A depends on details about w. If material models are to replace worlds altogether, an independent characterisation for them would have to be given. For instance, we have seen that no possible world separates (9) from (10), particularly because there is no intensional difference between woodchuck and groundhog. Hence these two nouns have the same L-intension: this is a consequence of the above definition of a material model, relating it to the world w on which it is based; it is reflected in the general equation (20m). Now, if the very notion of a material model is going to be defined independently of Logical Space, then the coincidence of |woodchuck|^Mw and |groundhog|^Mw in all material models Mw would have to be guaranteed in some other way, without reference to the worlds these models are based on. Hence some restriction to the effect that no model can assign different extensions to the nouns under consideration would have to be formulated. And even if in this particular case the restriction should appear unwelcome, there are other cases in which similar restrictions are certainly needed - like the infamous inclusion relation between the extensions of bachelor and unmarried. Arguably, at the end of the day the restrictions to be imposed on the material models amount to a theory of Logical Space; cf. Etchemendy (1990: 23f) for a similar point.
- Whereas Logical Space is absolute, material models are language-dependent. In particular (and disregarding neurotic cases), for any two distinct languages L and L′, the sets of material models of L and of L′ will not overlap. Hence if material models were to replace worlds in semantic theory, expressions from different languages would always have different intensions. In particular, sentences would never be true under the same circumstances, if circumstances corresponded to models. To make up for this obvious inadequacy, some procedure for translating material models across languages would be called for. For instance, a material model for English assigning a certain set of individuals to the noun groundhog would have to be translated into a material model for German assigning the same set to the noun Murmeltier. In general, then, circumstances would correspond to equivalence classes of material models of different languages. Obviously, at the end of the day the construction of these equivalence classes again recapitulates the construction of Logical Space; presumably, a one-one correspondence between material models and possible worlds requires an ideal 'language' of Logical Space, as envisaged by Wittgenstein (1922).
It thus appears that Logical Space can only be eliminated from (extensional) semantics at the price of a theory of something very much like Logical Space.
The upshot is that neither ontological objectionability nor theoretical economy are sufficient motives for reversing the conceptual priority of Logical Space over the set of material models. On the other hand, this does not mean that material models ought to be dispensed with. On
the contrary, they may be seen as compressions of possible worlds, reducing them to the barest necessities of semantic analysis.

2.3. Intensionality
The adaptation of FFP to derive extensions for expressions other than sentences and terms only works in so-called extensional contexts, i.e. (syntactic) constructions in which extensionally equivalent parts may replace each other without affecting the extension of the host expression. However, a number of environments prove to be non-extensional. Clausal complements to attitude verbs like think and know are classical cases in point: their contribution to the extension of the predicate cannot be their own extension, which is merely a truth value; if it were, materially equivalent clausal complements (= those with identical truth values) would be substitutable salva veritate (= preserving the truth [value] of the host sentence) - which they are not. E.g., (24) and (25) may differ in truth value even if the underlined clauses do not.

(24) John thinks Mary is home.
(25) John thinks Ann is pregnant.

This failure of substitutivity obviously blocks the derivation of the extension of think via the extensional version of FFP: if it were a function f assigning the predicate extension to the extension of the complement clause, then, as soon as the complement clauses have the same truth value (and are thus co-extensional), f would have to assign the same extension to the predicates:

(26) If: ||Mary is home||^w = ||Ann is pregnant||^w,
     then: f(||Mary is home||^w) = f(||Ann is pregnant||^w).

The argument (26) shows that an attitude verb like think cannot have an extension f that operates on the extension (= truth value) of its complement clause. It does not show that attitude verbs do not have extensions, nor that FFP cannot be used to determine them. Rather, since FFP seeks to derive the extension of an expression in terms of the contribution made by its natural environment (or primary context), the lesson from (26) ought to be that this contribution must consist in more than a truth value. On the other hand, given Subst, it is safe to assume that it is the intension of the complement clause; cf. Frege (1892). After all, by the BP, any two intensionally equivalent sentences are synonymous and may thus replace each other in any (sentential) position - including non-extensional ones - without affecting the truth conditions of the host sentence. In particular, any two sentences of the forms T thinks S and T thinks S′ (where T is a term) will be synonymous if the sentences S and S′ have the same intension. But then at any possible world, the extensions of the predicates thinks S and thinks S′ coincide, and thus so do their intensions. In other words, there can be no difference in the extensions of the predicates without a difference in the intensions of the complement clauses, which is to say that the former functionally depend on the latter. Following the spirit of FFP, then, one can think of the intensions of the embedded clauses as the contributions they
extensions relative to material models of E may also contain clauses like the following, where E⁺ is a more inclusive fragment of English than E:

(iv-c) |V S|^Mw = |V|^Mw(λw′. |S|^Mw′) if V S is a predicate, where V is an attitude verb and S is a clausal complement.

Equations like (iv-c) show that the programme of eliminating Logical Space in favour of the set of material models cannot be upheld beyond the realm of extensional constructions. For, unlike the intension of the embedded clause, λw′. |S|^Mw′, the corresponding L-intension, λMw′. |S|^Mw′, cannot serve as an argument to the extension |V|^Mw of the attitude verb: the latter is the value of a lexical extension assignment Fw, which itself is a component of Mw, which in turn is a member of the domain of λMw′. |S|^Mw′ - which cannot be, for set-theoretic reasons.
It should be noted that the intension of the embedded clause is defined in terms of its extensions relative to all other material models. While this does not present any technical problem, it does make the material models less self-contained than their extensional counterparts, which contain all the information needed to determine the extensions of all expressions. If need be, this seeming defect can be remedied by tucking in all of Logical Space and its inhabitants as components of material models:

Definition
Given a possible world w* and a language L, the intensional material model (for L based on w*) is the quadruple Mw* = (W, U, w*, F) consisting of the set W of all possible worlds; the domain function U assigning to each possible world w the domain of individuals Uw; the world w* itself; and the lexical interpretation function F which assigns to every non-logical lexical expression A of L, and every possible world w, the extension of A at w.

Two complications arising from this definition are worth noting:
- Universal and existential quantification over possible worlds are prime candidates for logical operations of type ((s,t),t), the former yielding the truth value 1 only if applied to Logical Space itself, whereas the latter is true of all but the empty set of possible worlds. Since the extensions of lexical expressions need not be of extensional types, the logicality border needs some adjustment. For the time being we will leave this matter open, returning to it in Section 3.2 with a natural extension of the replacement approach to logicality.
- Lest clauses like (iv-c) should make reference to anything outside the intensional model, the lexical interpretation function is defined for all worlds w. As a consequence, two distinct intensional material models only differ in their 3rd component, and each determines the extensions of all expressions at all possible worlds. Hence the procedure for determining the extensions is more general than expected; its precise format will also be given in Section 3.2.
Intensional material models are rather redundant objects, which is why they are not used in real-life semantics; we have mainly defined them here for future reference. Once again, it is obvious that set theory precludes Logical Space as it appears in them, from
being replaced by the set of all intensional material models. On the other hand, its role might be played by the set of extensional material models, i.e. those that only cover the extensional part of L - containing only its lexical expressions with extensions of extensional types, and its extensional syntactic constructions.

3. Model-theoretic semantics

3.1. Extensional model space

One rather obvious strategy for constructing (extensional) models within set theory is to start with material models and replace their individuals by (arbitrary) set-theoretic objects; as it turns out, this can be done by a straightforward generalisation of the replacements used in the above characterisation of logical words (cf. Section 2.2). However, since the individuals to be replaced also function as the building blocks of extensions, the latter will have to be generalised first. Given any non-empty set U, the U-extensions of type e are the elements of U; the U-extensions of type t are the truth values; and if a and b are extensional types, then the U-extensions of type (a,b) are the (total) functions from U-extensions of type a to U-extensions of type b. Hence, U-extensions (of extensional types) are to the elements of U what ordinary extensions are to the individuals of a given possible world; and clearly, ordinary extensions come out as U-extensions, where U happens to be the domain of individuals in w.

Definition
Given a language L, a formal model (for L) is a pair M = (U, F) consisting of a non-empty set U (= the universe of M) and a function F which assigns to every non-logical lexical expression A of L a U-extension of type τ_L(A).

Using the recursive procedure applied earlier, we can extend any bijection ρ between any (not necessarily distinct) non-empty sets U and U′ of the same cardinality to a family of functions ρ_a taking U-extensions to corresponding U′-extensions (where a is an extensional type): ρ_e = ρ; ρ_t(x) = x, if x is a truth value; and ρ_b(f(x)) = ρ_{(a,b)}(f)(ρ_a(x)), whenever f and x are U-extensions of types (a,b) and a, respectively. It is readily verified that replacements assign structural analogues to the U-extensions they are applied to. For instance, if f is a U-extension of some type (a,t), then ρ_{(a,t)}(f) is (i.e., characterises) the set of all U′-extensions of the form ρ_a(x), where x is a U-extension of type a characterised by f; in particular, and given that ρ_a is a bijection, f and ρ_{(a,t)}(f) are of the same cardinality. It is also worth noticing that the values of invariant extensions are themselves invariant, in an obvious sense:

Observations
Let U_w be the domain of individuals of some world w, X an invariant U_w-extension of some type a, and ρ a bijection from U_w to a set U of the same cardinality. Then:
(*) ρ′_a(X) = ρ″_a(ρ_a(X)), for any bijections ρ′ and ρ″ from U_w and U to some set U′ of the same cardinality, respectively;
(**) ρ_a(X) = ρ′_a(X), for any bijection ρ′ from U_w to U.

The proofs of (*) and (**) are rather straightforward and thus left to the readers.
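The replacement recursion lends itself to a direct computational rendering. The following sketch is merely our own toy illustration (the helper names freeze, thaw and lift are invented, and finite functional extensions are encoded as Python dictionaries); it is not part of the formal apparatus of this article.

    # Extensional types are 'e', 't', or pairs (a, b); a U-extension of type
    # (a, b) is encoded as a dict mapping (frozen) U-extensions of type a to
    # U-extensions of type b.

    def freeze(x):
        """Make an extension hashable so it can serve as a dictionary key."""
        return frozenset(x.items()) if isinstance(x, dict) else x

    def thaw(k):
        """Undo freeze, recovering a functional extension from a dict key."""
        return dict(k) if isinstance(k, frozenset) else k

    def lift(rho, typ, x):
        """rho_a: transport the U-extension x of type typ along bijection rho."""
        if typ == 'e':
            return rho[x]          # rho_e = rho
        if typ == 't':
            return x               # rho_t is the identity on truth values
        a, b = typ                 # rho_b(f(x)) = rho_(a,b)(f)(rho_a(x))
        return {freeze(lift(rho, a, thaw(k))): lift(rho, b, v)
                for k, v in x.items()}

    U = ['a', 'b']
    rho = {'a': 'b', 'b': 'a'}       # a bijection from U to U
    P = {'a': 1, 'b': 0}             # a U-extension of type ('e','t')
    print(lift(rho, ('e', 't'), P))  # -> {'b': 1, 'a': 0}

As the final line illustrates, lifting preserves structure in the sense of the text: the image of a characteristic function of type (a,t) characterises exactly the ρ_a-images of what the original characterised.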
this representation is a perfect match if the language L is discriminative. However, even if the material models correspond to the possible worlds in a one-one fashion, there is no guarantee that so do the classes of ersatz models in (34). More precisely, even if (35) holds of any worlds w and w′ and the corresponding material models M_w and M_{w′}, the analogous implication (36) about the latter and their representations in Ersatz Space need not be true:

(35) If w ≠ w′, then M_w ≠ M_{w′}.
(36) If M_w ≠ M_{w′}, then |M_w|_≅ ≠ |M_{w′}|_≅.

In other words, distinct material models may be represented by the same ersatz models - which will be the case precisely if L allows for distinct, but isomorphic material models in the first place. Discriminativity does not exclude this possibility: it only implies some extensional differences between any two material models; but these differences could be made up for by the replacements used in representing material models in Ersatz Space. In order to guarantee a perfect match between Logical Space and Ersatz Space, a stronger condition on L is needed than discriminativity. The natural candidate is completeness, which is defined like discriminativity, except that it is based not on indistinguishability but on the weaker notion of equivalence:

Definitions
If w and w′ are possible worlds and L is a language, then w is L-equivalent to w′ - in symbols: w ≡_L w′ - iff ||S||^w = ||S||^{w′}, for any declarative sentence S of L. A language L is complete iff no two distinct possible worlds w and w′ are L-equivalent.

Unlike L-indistinguishability, L-equivalence is not affected by replacements in that it only concerns the truth values of sentences rather than the extensions of arbitrary expressions. The following observations about arbitrary worlds w and w′ and languages L are not hard to establish:

(37) a. If w ≈_L w′, then w ≡_L w′.
b. If L is complete, then L is discriminative.
c. If L is complete and w ≠ w′, then |M_w|_≅ ≠ |M_{w′}|_≅.

In effect, (37c) says that, via the isomorphicity classes, the Ersatz Space of a complete language matches Logical Space. Thus, completeness plays a similar role for the adequacy of ersatz models as does discriminativity in the case of material models. However, whereas discriminative languages are easy to find, completeness seems a much rarer property. Of course, if a language is incomplete, its ersatz models could still match Logical Space in the sense of (37a), but then again their isomorphicity classes may equally well contain, and thus conflate, representations of distinct worlds. However, since the differences between distinct worlds that correspond to the same ersatz models are - by definition - inexpressible in the language under investigation, this imperfect match
between Logical Space and Ersatz Space does not necessarily conflict with the general programme of replacing possible worlds with formal models.

As far as its potential in descriptive semantics goes, then, Ersatz Space does seem to earn its name. However, as the reader will have noticed, its very definition still appears to be of no avail when it comes to soothing ontological worries: by employing the concept of representation, it depends on material models - and thus presupposes Logical Space. To overcome this embarrassment, a definition of Ersatz Space is needed that does not rely on Logical Space. The usual strategy is one of approximation, starting out from a maximally wide model space and gradually restricting it by eliminating those formal models that do not represent any material counterparts; the crucial point is that these restrictions be formulated without reference to Logical Space, i.e. in the language of pure set theory, or with reference only to urelements that are less dubious than the inhabitants of Logical Space. We will refer to the natural starting point of this enterprise as L's Model Space, and identify it with the class of all pure models (for a given language L), i.e. all formal models whose universe is a pure set. It should be noted that the concept of a pure model only depends on L's syntax and type assignment, both of which in principle may be given in terms of pure set theory.

In order to exploit Model Space for semantic purposes, the procedure for determining the extensions of expressions must be generalised from ersatz models to arbitrary pure models. Hence, for any pure model M = (U, F), the extensions of logical expressions need to be specified, as well as the combinations of extensions corresponding to the syntactic constructions. As in the case of Logical Space and Ersatz Space, this should not pose any particular problems. In fact, these extensions and combinations are invariant and should only depend on the universe U. Thus, for those U that happen to be of the same cardinality as the universe U_w of some material model M_w = (U_w, F_w), their specifications may be taken over from Ersatz Space, which is bound to contain at least some pure model with universe U (and isomorphic to M_w). For all other models they would have to be suitably generalised. Thus, e.g., it is natural to define the extension of the English determiner every as characterising the subset relation on a given universe U, even if the latter is larger than any of the universes U_w encountered in Logical Space; similarly, the extension of a quantifier phrase D N, where D is a determiner and N a count noun, can be determined by functional application - which, suitably restricted, is an invariant U-extension of type (((e,t),((e,t),t)),((e,t),((e,t),t))); etc.

At the end of the day, then, the formal models resemble their material and ersatz counterparts. In particular, the specification of extensions is strikingly similar:

(38) For any expression A of E and any formal model M = (U, F) for E, the extension of A relative to M - [A]^M - is determined by the following induction (on the grammatical complexity of A):
(i-a) [and]^M = λu. λv. u × v, where u ∈ {0,1} and v ∈ {0,1}
...
(ii-a) [every]^M = λP. λQ. ⌜P ⊆ Q⌝, where P ⊆ U and Q ⊆ U
(iii) [A]^M = F(A), if A ∈ N_E
(iv-a) [D N]^M = [D]^M([N]^M),
if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun.

Given specifications of extensions in the style of (38), one may start approximating the Ersatz Space of a language L by restricting its Model Space. A common strategy to this end is to eliminate models that have certain sentences come out false, viz. analytic sentences that owe their truth to their very meaning. As a case in point, one may require of appropriate models for English that the extension of (39) be the truth value 1. For the sake of the example it may be assumed that the extension of a predicative adjective is a set of individuals and that it is passed on to the predicate (Copula + Adjective); cf. Heim & Kratzer (1998: 61ff).

(39) No bachelor is married.

More generally, a set Σ of sentences of L may be used to characterise the class of all pure models M (for L) such that [S]^M = 1, for all S ∈ Σ. Ever since Carnap (1952), sentences of a language L that are used to characterise the appropriateness of formal models for L have been called meaning postulates; and we will refer to a set of sentences used for this purpose as a postulate system. Obviously, the truth of a meaning postulate like (39) may guarantee the truth or falsity of certain other sentences:

(40) Every bachelor is not married.
(41) Some bachelor is married.

Once (39) is adopted as a meaning postulate, there is no need to also take on (40), or to rule out (41) separately, because the truth values of these two sentences come out as intended in any formal model for English relative to which (39) is true. Alternatively, if (40) is taken as a meaning postulate, (39) and (41) come out as desired. As this example suggests, when it comes to approximating Ersatz Space by meaning postulates, there need not be a general criterion for preferring one postulate system over another. And it is equally obvious that the effect of a meaning postulate, or a postulate system, may be achieved by other means. Thus, e.g., the effect of adopting (39) as a meaning postulate may also be obtained by the following constraint on appropriate formal models M = (U, F) for E:

(42) F(bachelor) ∩ F(married) = ∅

In other words, the effect of adopting (39) or (40) as a meaning postulate is to establish a certain sense relation between the noun bachelor and the participle married (which we take to be a lexical item, if only for the sake of the example). The relation is known as incompatibility and holds between any two expressions just in case their extensions of some type (a,t) cannot overlap. For future reference, we also define the relation of compatibility as it holds between box and wooden - and in general between expressions A and B if their extensions may overlap. Interestingly, compatibilities do not have to be
established by meaning postulates because they are guaranteed by the existence of any model attesting the overlaps; hence compatibility will hold as long as Model Space is not (erroneously) narrowed down so as to exclude all of these models. Obviously any meaning postulate S can be replaced by a corresponding constraint on formal models to the effect that S come out true; and given a procedure for determining the extensions of arbitrary expressions, this constraint can be given directly in terms of the extensions of the (non-logical) lexical expressions S contains. On the other hand, not every constraint on formal models needs to correspond to a meaning postulate, or even a postulate system; cf. Zimmermann (1985) for a concrete example. On top of restrictions on the range of possible extensions of lexical expressions, more global constraints may rule out pure models that do not belong to Ersatz Space for cardinality reasons.

In general, then, the set-theoretic reconstruction of Logical Space consists in formulating suitable constraints on the members of Model Space, i.e. the pure models. Taken together, these constraints define a class K of appropriate models (according to the constraints). Ideally, this class should coincide with Ersatz Space. The closer K gets to this ideal, and the more it coincides with Ersatz Space in semantically relevant aspects, the more descriptively adequate will K be. Thus, e.g., the set of K-valid sentences - those that are true according to all M ∈ K - should be the same as the set of sentences that are true throughout Logical Space.

If two formal models are isomorphic, both represent precisely the same material models, and hence there is no reason for ruling out (or counting in) one but not the other. Appropriateness constraints are thus subject to the following meta-constraint on classes K of appropriate models:

(43) If M ≅ M′, then M ∈ K iff M′ ∈ K.

In mathematical jargon, (43) says that the class K of appropriate models is closed under isomorphism: if M and M′ are isomorphic formal models (for some language L), then M is appropriate according to a given set of constraints just in case M′ is appropriate according to the same constraints. Two things about (43) are noteworthy. First, as a consequence of (31), constraints formulated in terms of meaning postulates (or postulate systems) always satisfy (43): any two isomorphic models make the same sentences true. Secondly, if the constraints are jointly satisfiable at all (which they should be), the appropriate models always form a proper class; this is so because for any formal model M = (U, F) and any set U′ of the same cardinality as U, there is an isomorphic model of the form M′ = (U′, F′) - and hence any pure set whatsoever will be a member of some universe of an appropriate formal model.

One interesting consequence of (43) is that, no matter how closely a given system of constraints and/or meaning postulates may approximate Logical Space, it will never be able to pin down a specific model, let alone specific extensions of all expressions. In fact, given any term A (of some language L) and any pure set x, there will be an appropriate model M_x = (U_x, F_x) (for L), according to which [A]^{M_x} = x; M_x may be constructed from any given appropriate model M = (U, F) by replacing [A]^M with x (and simultaneously x with [A]^M, in case x ∈ U) - thereby preserving appropriateness, by (43).
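The construction just described - trading the extension of a term for an arbitrary pure set - is easy to emulate computationally. The following sketch is our own toy demonstration (the mini-fragment, the lexicon and the helper true_in are invented); it reuses the lift helper from the sketch in Section 3.1 to re-interpret the non-logical lexicon along a permutation, and checks that every sentence of the fragment keeps its truth value while the reference of the term is shuffled.

    # A miniature formal model: universe U and lexical interpretation F.
    U = ['ann', 'bob', 'cat']
    F = {'Ann': 'ann',                                  # term, type e
         'sleeps': {'ann': 1, 'bob': 0, 'cat': 1}}      # predicate, type (e,t)
    TYPES = {'Ann': 'e', 'sleeps': ('e', 't')}

    def true_in(lexicon, subj, pred):
        """Truth value of the sentence 'subj pred' with the given lexicon."""
        return lexicon[pred][lexicon[subj]]

    pi = {'ann': 'bob', 'bob': 'cat', 'cat': 'ann'}     # a permutation of U

    # Re-interpret every lexical item by its pi-image: an isomorphic model.
    F_star = {word: lift(pi, TYPES[word], F[word]) for word in F}

    assert true_in(F, 'Ann', 'sleeps') == true_in(F_star, 'Ann', 'sleeps')
    print(F['Ann'], '->', F_star['Ann'])                # reference: ann -> bob

Truth values are stable under the shuffle, reference is not - the computational core of the permutation argument discussed below.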
This only reflects the strategy of having inhabitants of Logical Space represented by arbitrary set-theoretic objects and is thus hardly surprising. However, a similar line of thought does
give rise to interesting consequences for the relation between reference and truth. This is the gist of the following permutation argument, made famous by Putnam (1977, 1980), with predecessors including Newman (1928: 137ff), Jeffrey (1964: 82ff), Field (1975), and Wallace (1977); see Devitt (1983), Lewis (1984), Abbott (1997), Williams (2005: 89ff), and Button (2011) for critical discussion of its impact.

Given a model M for a language L, the extensions according to any isomorphic model M* may be characterised directly by an inductive specification in terms of M, without mention of M*. For the present purposes, it suffices to consider the special case in which M = (U, F) and M* = (U*, F*) are models for our extensional fragment of English and share their universe U = U*. More specifically, since any bijection π on U is a model-isomorphism from M to a model M* = (U*, F*), the following induction characterises the extensions relative to M* entirely in terms of M:

(44) For any expression A of E and any formal model M = (U, F) for E, the permuted extension of A relative to M - ‖A‖^M - is determined by the following induction (on the grammatical complexity of A):
(i-a) ‖and‖^M = λu. λv. u × v, where u ∈ {0,1} and v ∈ {0,1}
...
(ii-a) ‖every‖^M = λP. λQ. ⌜P ⊆ Q⌝, where P ⊆ U and Q ⊆ U
(iii) ‖A‖^M = π_{τ(A)}(F(A)), if A ∈ N_E
(iv-a) ‖D N‖^M = ‖D‖^M(‖N‖^M),
if D N is a quantifier phrase, where D is a quantificational determiner and N is a count noun.

Obviously, ‖A‖^M = π_{τ(A)}([A]^M) = [A]^{M*}, for any expression A of E. Moreover, there is a striking similarity between (44) and the specification (38) of the extensions relative to M. In fact, the only difference lies in clause (iii): according to (44), non-logical words are assigned the π-image of the U-extension they are assigned according to (38). The formulation of (44iii) may suggest that the values ‖A‖^M somehow depend on the permutation π when, of course, they are perfectly independent U-extensions in their own right - just like the values [A]^M: any instantiation of either (38iii) or (44iii) will assign some specific U-extension to some specific expression. In particular, then, (44) is no more complicated or roundabout than (38); it is just different. Yet, as far as the specification of the truth values of sentences S of L is concerned, the two agree, since ‖S‖^M = π_t([S]^M) = [S]^M.

The construction (44) can also be carried out if M = M_w is a material model. Of course, in this case the permutation model M* cannot be expected to be a material model too, but then again it is not needed for the definition of the ‖A‖^M-values anyway. All that is needed is a permutation π of the domain of w. (44) will then deliver exactly the same truth valuation of L as (38). The comparison between (44) and (38) thus illustrates that truth is independent of reference in that the latter is not determined by the former. In particular, then, although the extensions of terms help determine the truth values of the sentences in which they occur, this role hopelessly underdetermines them:
if reference is merely contribution to truth, then reference is arbitrary; else, reference has to be grounded independently.

3.2. Intensional Model Space

Beyond the realm of extensionality the strategy of approximating Logical Space in terms of formal models is bound to reach its limits. Even when restricted to the extensional part of a language L and constrained by meaning postulates or otherwise, Model Space is a proper class and thus cannot serve as the domain of any function. As a consequence, the set-theoretic reconstruction of intensional semantics cannot have intensional models assign L-intensions to expressions. However, this does not mean that formal models cannot be adapted to intensional languages. Rather, they would each have to come with their own set-theoretic reconstruction of Logical Space, consisting of an arbitrary (non-empty) set W representing the worlds and a system U of arbitrary (non-empty) universes, i.e. one set per world (representation):

Definition
A formal ontology is a pair (W, U), where W is a non-empty set (= the worlds according to (W, U)) and U is a function with domain W such that U(w) is a non-empty set whenever w ∈ W [= the individuals of w, according to (W, U)].

In the literature on modal logic, the requirement of non-emptiness is sometimes weaker, applying to the union of all domains rather than each individual domain; cf. Fine (1977: 144). Given a formal ontology (W, U), one may generalise U-extensions from extensional to arbitrary types. Due to the world-dependence of individual domains, these generalised extensions also depend on which element w of W is chosen. More precisely, (W, U, w)-extensions may be defined by induction on (the complexity of) their types: (i) (W, U, w)-extensions of type t are truth values; (ii) (W, U, w)-extensions of type e are members of U(w); (iii) whenever a and b are types, (W, U, w)-extensions of type (a,b) are functions assigning (W, U, w)-extensions of type b to all (W, U, w)-extensions of type a; and (iv) (W, U, w)-extensions of type (s,a) are functions f with domain W assigning to any w′ ∈ W a (W, U, w′)-extension of type a (where, again, a is a type). At first glance, this definition may appear gruesome, but closer inspection shows that it merely mimics (a straightforward generalisation of) the replacements defined in the previous part; we invite the reader to check this for her- or himself. (W, U, w)-extensions give rise to (W, U)-intensions of any type a as functions assigning a (W, U, w)-extension of type a to any w ∈ W. We thus arrive at the following:

Definition
Given a language L, an intensional formal model (for L) is a quadruple M = (W, U, w*, F), where (W, U) is a formal ontology; w* is a member of W (= the actual world according to M); and F is a function which assigns to every non-logical lexical expression A of L a (W, U)-intension of type τ_L(A) (= the lexical interpretation function according to M). Moreover we say that an intensional formal model M = (W, U, w*, F) is based on the ontology (W, U) and call the members of W and ∪_{w∈W} U(w) the worlds and individuals
3.4. Mathematical Model Theory

Model-theoretic interpretation originated in mathematical logic, where it is not used to approximate Logical Space but rather accounts for semantic variation among various (fragments of) formal languages; cf. Etchemendy (1990) for more on this difference in perspective, which incidentally does not affect any mathematical technicalities and results. Though a large part of the latter concern certain varieties of first-order predicate logic, they may have repercussions on model-theoretic semantics of natural language, especially if they concern questions of expressiveness. The two most fundamental results of mathematical model theory - usually derived as corollaries to the completeness of first-order logic - are cases in point:

- According to the Compactness Theorem, a set Σ of (closed) first-order formulae can only imply a particular (closed first-order) formula if the latter already follows from a finite subset of Σ. As a consequence, no first-order formula can express that the universe is infinite (though it may imply this) - because such a formula would be implied by the set of (first-order expressible) sentences that say that there are at least n objects, but not by any of its finite subsets.
- According to the Löwenheim-Skolem Theorem, a set Σ of (closed) first-order formulae that is true in a model with an infinite universe is true in models with universes of arbitrary infinite cardinalities. In particular, no collection of first-order formulae can express that the universe has a particular infinite cardinality.

According to a fundamental result of abstract model theory due to Lindström (1969), the above two theorems characterise first-order logic in that any language exceeding its expressive power is bound to fail at least one of them. There is little doubt that natural languages have the resources to express infinity; if they can also be shown to be able to express, say, the countability of the universe, Lindström's Theorem would imply that the notion of validity in them cannot be axiomatised - cf. Ebbinghaus et al. (1994) for the technical background.

The model-theoretical study of higher-order logics is comparatively less well investigated, a major theme being the distinction between standard and non-standard models, the latter being introduced to restore axiomatisability - at the price of losing the kinds of idealisations in the set-theoretic construction of denotations mentioned and motivated in Section 2; a survey of the most fundamental results of higher-order model theory can be found in van Benthem & Doets (1983).

4. References

Abbott, Barbara 1997. Models, truth and semantics. Linguistics & Philosophy 20, 117-138.
Abbott, Barbara & Larry Hauser 1995. Realism, Model Theory, and Linguistic Semantics. Paper presented at the Annual Meeting of the Linguistic Society of America, New Orleans, January 8, 1995. http://cogprints.org/256/1/realism.html. December 9, 2010.
Benthem, Johan van & Kees Doets 1983. Higher-order logic. In: D. M. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic, vol. I. Dordrecht: Reidel, 275-329.
Button, Tim 2011. The metamathematics of Putnam's model-theoretic arguments. Erkenntnis 74, 321-349.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL: The University of Chicago Press.
Carnap, Rudolf 1952. Meaning postulates. Philosophical Studies 3, 65-73.
Casanovas, Enrique 2007. Logical operations and invariance. Journal of Philosophical Logic 36, 33-60.
Cresswell, Maxwell J. 1982. The autonomy of semantics. In: S. Peters & E. Saarinen (eds.). Processes, Beliefs, and Questions. Dordrecht: Kluwer, 69-86.
Devitt, Michael 1983. Realism and the renegade Putnam: A critical study of meaning and the moral sciences. Noûs 17, 291-301.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Kluwer.
Ebbinghaus, Heinz-Dieter, Jörg Flum & Wolfgang Thomas 1994. Mathematical Logic. 2nd edn. Berlin: de Gruyter.
Etchemendy, John 1990. The Concept of Logical Consequence. Cambridge, MA: The MIT Press.
Field, Hartry 1975. Conventionalism and instrumentalism in semantics. Noûs 9, 375-405.
Fine, Kit 1977. Properties, propositions, and sets. Journal of Philosophical Logic 6, 135-191.
Frege, Gottlob 1884/1950. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl. Breslau: W. Koebner, 1884. English translation in: J. Austin. The Foundations of Arithmetic: A Logico-mathematical Enquiry into the Concept of Number. 1st edn. Oxford: Blackwell, 1950.
Frege, Gottlob 1892/1980. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100, 25-50. English translation in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980, 56-78.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum & Ivan A. Sag 1985. Generalized Phrase Structure Grammar. Cambridge, MA: The MIT Press.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Oxford University Press.
Hodges, Wilfried 2001. Formal features of compositionality. Journal of Logic, Language and Information 10, 7-28.
Janssen, Theo M. V. 1983. Foundations and Applications of Montague Grammar. Doctoral dissertation. University of Amsterdam.
Jeffrey, Richard J. 1964. Review of Sections IV-V of E. Nagel et al. (eds.). Logic, Methodology, and Philosophy of Science (Stanford, CA 1962). Journal of Philosophy 61, 79-88.
Kripke, Saul A. 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language. Dordrecht: Kluwer, 253-355.
Kupffer, Manfred 2007. Contextuality as supervenience. In: M. Aloni, P. Dekker & F. Roelofsen (eds.). Proceedings of the Sixteenth Amsterdam Colloquium. Amsterdam: ILLC, 139-144. http://www.illc.uva.nl/AC2007/uploaded_files/proceedings-AC07.pdf. December 9, 2010.
Lasersohn, Peter 2000. Same, models and representation. In: B. Jackson & T. Matthews (eds.). Proceedings from Semantics and Linguistic Theory (= SALT) X. Ithaca, NY: Cornell University, 83-97.
Lasersohn, Peter 2005. The temperature paradox as evidence for a presuppositional analysis of definite descriptions. Linguistic Inquiry 36, 127-134.
Lewis, David K. 1984. Putnam's paradox. Australasian Journal of Philosophy 62, 221-236.
Lindenbaum, Adolf & Alfred Tarski 1935/1983. Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien. Ergebnisse eines mathematischen Kolloquiums 7, 15-22. English translation in: J. Corcoran (ed.). Logic, Semantics, Metamathematics. 2nd edn. Indianapolis, IN: Hackett Publishing Company, 1983, 384-392.
Lindström, Per 1969. On extensions of elementary logic. Theoria 35, 1-11.
Löbner, Sebastian 1979. Intensionale Verben und Funktionalbegriffe. Tübingen: Niemeyer.
Macbeath, Murray 1982. "Who was Dr Who's Father?". Synthese 51, 397-430.
MacFarlane, John 2008. Logical constants. In: E. Zalta (ed.). The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). Online publication:
http://plato.stanford.edu/entries/logical-constants/. December 9, 2010.
Machover, Moshe 1994. Review of G. Sher. The Bounds of Logic. A Generalized Viewpoint (Cambridge, MA 1991). British Journal for the Philosophy of Science 45, 1078-1083.
Mendelson, Elliott 1997. An Introduction to Mathematical Logic. 4th edn. London: Chapman & Hall.
Menzel, Christopher 1990. Actualism, ontological commitment, and possible world semantics. Synthese 85, 355-389.
These developments are accompanied by a newly found interest in the linguistic and ontological foundation of events. To the extent that more attention is paid to less typical events than the classical "Jones buttering a toast" or "Brutus stabbing Caesar", which always come to the Davidsonian semanticist's mind first, there is a growing awareness of the vagueness and incongruities lurking behind the notion of events and its use in linguistic theorizing. A particularly controversial case in point is the status of states. The question of whether state expressions can be given a Davidsonian treatment analogous to process and event expressions (in the narrow sense) is still open to debate.

All in all, Davidsonian event arguments have become a very familiar "all-purpose" linguistic instrument over the past decades, and recent years have seen a continual extension of possible applications far beyond the initial focus on verb semantics and adverbials, now also including a growing body of psycholinguistic studies that aim to investigate the role of events in natural language representation and processing.

This article will trace the motivation, development, and applications of event semantics during the past decades and provide a picture of current views on the role of events in natural language meaning. Section 2 introduces the classical Davidsonian paradigm, providing an overview of its motivation and some classical and current applications, as well as an ontological characterization of events and their linguistic diagnostics. Section 3 discusses the Neo-Davidsonian turn with its broader perspective on eventualities and the use of thematic roles. This section also includes some notes on decompositional approaches to event semantics. Section 4 turns to the stage-level/individual-level distinction, outlining the basic linguistic phenomena that are grouped together under this label and discussing the event semantic treatments that have been proposed as well as the criticism they have received. Section 5 returns to ontological matters by reconsidering the category of states and asking whether indeed all of them, in particular the referents introduced by so-called "statives", fulfill the criteria for Davidsonian eventualities. And, finally, section 6 presents some experimental results of recent psycholinguistic studies that have tested the insights of Davidsonian event semantics. The article concludes with some final remarks in section 7.

2. Davidsonian event semantics

2.1. Motivation and applications

On the standard view in Pre-Davidsonian times, a transitive verb such as to butter in (1a) would be conceived of as introducing a relation between the subject Jones and the direct object the toast, thus yielding the logical form (1b).

(1) a. Jones buttered the toast.
b. butter (jones, the toast)

The only individuals that sentence (1a) talks about according to (1b) are Jones and the toast. As Davidson (1967) points out, such a representation does not allow us to refer explicitly to the action described by the sentence and specify it further by adding, e.g., that Jones did it slowly, deliberately, with a knife, in the bathroom, at midnight. What, asks Davidson, does it refer to in such a continuation? His answer is that action verbs introduce an additional hidden event argument that stands for the action proper. Under
this perspective, a transitive verb introduces a three-place relation holding between the subject, the direct object and an event argument. Davidson's proposal thus amounts to replacing (1b) with the logical form in (1c).

(1) c. ∃e [butter (jones, the toast, e)]

This move paves the way for a straightforward analysis of adverbial modification. If verbs introduce a hidden event argument, then standard adverbial modifiers may be simply analyzed as first-order predicates that add information about this event; cf. article 54 (Maienborn & Schäfer) Adverbs and adverbials on the problems of alternative analyses and further details of the Davidsonian approach to adverbial modification. Thus, Davidson's classical sentence (2a) takes the logical form (2b).

(2) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. ∃e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]

According to (2b), sentence (2a) expresses that there was an event of Jones buttering the toast, and this event was located in the bathroom. In addition, it was performed by using a knife as an instrument, and it took place at midnight. Thus, the verb's hidden event argument provides a suitable target for adverbial modifiers. As Davidson points out, this allows adverbial modifiers to be treated analogously to adnominal modifiers: Both target the referential argument of their verbal or nominal host. Adverbial modification is thus seen to be logically on a par with adjectival modification:

what adverbial clauses modify is not verbs but the events that certain verbs introduce. Davidson (1969/1980: 167)

One of the major advances achieved through the analysis of adverbial modifiers as first-order predicates on the verb's event argument is its straightforward account of the characteristic entailment patterns of sentences with adverbial modifiers. For instance, we want to be able to infer from (2a) the truth of the sentences in (3). On a Davidsonian account this follows directly from the logical form (2b) by virtue of the logical rule of simplification; cf. (3′). See, e.g., Eckardt (1998, 2002) on the difficulties that these entailment patterns pose for a classical operator approach to adverbials such as advocated by Thomason & Stalnaker (1973); see also article 54 (Maienborn & Schäfer) Adverbs and adverbials.

(3) a. Jones buttered the toast in the bathroom at midnight.
b. Jones buttered the toast in the bathroom.
c. Jones buttered the toast at midnight.
d. Jones buttered the toast with the knife.
e. Jones buttered the toast.

(3′) a. ∃e [butter (jones, the toast, e) & in (e, the bathroom) & at (e, midnight)]
b. ∃e [butter (jones, the toast, e) & in (e, the bathroom)]
c. ∃e [butter (jones, the toast, e) & at (e, midnight)]
d. ∃e [butter (jones, the toast, e) & instr (e, the knife)]
e. ∃e [butter (jones, the toast, e)]

Further evidence for the existence of hidden event arguments can be adduced from anaphoricity, quantification and definite descriptions, among others: Having introduced event arguments, the anaphoric pronoun it in (4) may now straightforwardly be analyzed as referring back to a previously mentioned event, just like other anaphoric expressions take up object referents and the like.

(4) It happened silently and in complete darkness.

Hidden event arguments also provide suitable targets for numerals and frequency adverbs as in (5).

(5) a. Anna has read the letter three times/many times.
b. Anna has often/seldom/never read the letter.

Krifka (1990) shows that nominal measure expressions may also be used as a means of measuring the event referent introduced by the verb. Krifka's example (6) has a reading which does not imply that there were necessarily 4000 ships that passed through the lock in the given time span but that there were 4000 passing events of maybe just one single ship. That is, what is counted by the nominal numeral in this reading are passing events rather than ships.

(6) 4000 ships passed through the lock last year.

Finally, events may also serve as referents for definite descriptions as in (7).

(7) a. the fall of the Berlin Wall
b. the buttering of the toast
c. the sunrise

See, e.g., Bierwisch (1989), Grimshaw (1990), Zucchi (1993), Ehrich & Rapp (2000), Rapp (2007) for event semantic treatments of nominalizations; cf. also article 51 (Grimshaw) Deverbal nominalization. Engelberg (2000: 100ff) offers an overview of the phenomena for which event-based analyses have been proposed since Davidson's insight was taken up and developed further in linguistics.

The overall conclusion that Davidson invites us to draw from all these linguistic data is that events are things in the real world like objects; they can be counted, they can be anaphorically referred to, they can be located in space and time, they can be ascribed further properties. All this indicates that the world, as we conceive of it and talk about it, is apparently populated by such things as events.

2.2. Ontological properties and linguistic diagnostics

Semantic research over the past decades has provided impressive confirmation of Davidson's (1969/1980: 137) claim that "there is a lot of language we can make systematic sense of if we suppose events exist". But, with Quine's dictum "No entity without
identity!" in mind, we have to ask: What kind of things are events? What are their identity criteria? And how are their ontological properties reflected through linguistic structure? None of these questions has received a definitive answer so far, and many versions of the Davidsonian approach have been proposed, with major and minor differences between them. Focussing on the commonalities behind these differences, it still seems safe to say that there is at least one core assumption in the Davidsonian approach that is shared more or less explicitly by most scholars working in this paradigm. This is that eventualities are, first and foremost, particular spatiotemporal entities in the world. As LePore (1985: 151) puts it, "[Davidson's] central claim is that events are concrete particulars - that is, unrepeatable entities with a location in space and time."

As the past decades' discussion of this issue has shown (see, e.g., the overviews in Lombard 1998, Engelberg 2000, and Pianesi & Varzi 2000), it is nevertheless notoriously difficult to turn the above ontological outline into precise identity criteria for eventualities. For illustration, I will mention just two prominent attempts. Lemmon (1967) suggests that two events are identical just in case they occupy the same portion of space and time. This notion of events seems much too coarse-grained, at least for linguistic purposes, since any two events that just happen to coincide in space and time would, on this account, be identical. To take Davidson's (1969/1980: 178) example, we wouldn't be able to distinguish the event of a metal ball rotating around its own axis during a certain time from an event of the metal ball becoming warmer during the very same time span. Note that we could say that the metal ball is slowly becoming warmer while it is rotating quickly, without expressing a contradiction. This indicates that we are dealing with two separate events that coincide in space and time. Parsons (1990), on the other hand, attempts to establish genuinely linguistic identity criteria for events: "When a verb-modifier appears truly in one source and falsely in another, the events cannot be identical." (Parsons 1990: 157). This, by contrast, yields a notion of events that is too fine-grained; see, e.g., the criticism by Eckardt (1998: § 3.1) and Engelberg (2000: 221-225). What we are still missing, then, are ontological criteria of the appropriate grain for identifying events. This is the conclusion Pianesi & Varzi (2000) arrive at in their discussion of the ontological nature of events:

[...] the idea that events are spatiotemporal particulars whose identity criteria are moderately thin [...] has found many advocates both in the philosophical and in the linguistic literature. [...] But they all share with Davidson's the hope for a 'middle ground' account of the number of particular events that may simultaneously occur in the same place. Pianesi & Varzi (2000: 555)

We can conclude, then, that the search for ontological criteria for identifying events will probably continue for some time. In the meantime, linguistic research will have to build on a working definition that is up to the demands of natural language analysis. What might also be crucial for our notion of events (besides their spatial and temporal dimensions) is their inherently relational character. Authors like Parsons (1990), Carlson (1998), Eckardt (1998), and Asher (2000) have argued that events necessarily involve participants serving some function.
In fact, the ability of Davidsonian analyses to make explicit the relationship between events and their participants, either via thematic roles or by some kind of decomposition (see sections 3.2 and 3.3 below), is certainly one of the major reasons for the continuing popularity of such analyses among linguists. This
elaborate on the internal functional set-up of events. This was already illustrated by our sentence (2); see article 54 (Maienborn & Schäfer) Adverbs and adverbials for details on the contribution of manner adverbials and similar expressions that target the internal structure of events.

This is, in a nutshell, the Davidsonian view shared (explicitly or implicitly) by current event-based approaches. The diagnostics in (10) provide a suitable tool for detecting hidden event arguments and may therefore help us to assess the Neo-Davidsonian claim that event arguments are not confined to action verbs but have many further sources, to which we will turn next.

3. The Neo-Davidsonian turn

3.1. The notion of eventualities

Soon after they took the linguistic stage, it became clear that event arguments were not to be understood as confined to the class of action verbs, as Davidson originally proposed, but were likely to have a much wider distribution. A guiding assumption of what has been called the Neo-Davidsonian paradigm, developed particularly by Higginbotham (1985, 2000) and Parsons (1990, 2000), is that any verbal predicate may have such a hidden Davidsonian argument, as illustrated by the following quotations from Higginbotham (1985) and Chierchia (1995).

The position E corresponds to the 'hidden' argument place for events, originally suggested by Donald Davidson (1967). There seem to be strong arguments in favour of, and little to be said against, extending Davidson's idea to verbs other than verbs of change or action. Under this extension, statives will also have E-positions. Higginbotham (1985: 10)

A basic assumption I am making is that every VP, whatever its internal structure and aspectual characteristics, has an extra argument position for eventualities, in the spirit of Davidson's proposal. [...] In a way, having this extra argument slot is part of what makes something a VP, whatever its inner structure. Chierchia (1995: 204)

Note that already some of the first commentators on Davidson's proposal took a similarly broad view on the possible sources for Davidson's extra argument. For instance, Kim (1969: 204) notes: "When we talk of explaining an event, we are not excluding what, in a narrower sense of the term, is not an event but rather a state or a process." So it was only natural to extend Davidson's original proposal and combine it with Vendler's (1967) classification of situation types into states, activities, accomplishments and achievements. In fact, the continuing strength and attractiveness of the overall Davidsonian enterprise for contemporary linguistics rests to a great extent on the combination of these two congenial insights: Davidson's introduction of an ontological category of events present in linguistic structure, and Vendler's subclassification of different situation types according to the temporal-aspectual properties of the respective verb phrases; cf., e.g., Piñón (1997), Engelberg (2002), Sæbø (2006) for some more recent event semantic studies on the lexical and/or aspectual properties of certain verb classes.
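The inferential payoff of the Davidsonian format can be made concrete in code. The following sketch is our own toy encoding (predicate names follow (2b) above; nothing in it is specific to any published implementation): a logical form is represented as the set of its conjuncts about the event variable, so that the entailments in (3) reduce to a subset test - the computational counterpart of the rule of simplification.

    # Davidsonian logical forms as sets of conjuncts about the event variable 'e'.
    sentence_2a = {('butter', 'jones', 'the toast', 'e'),
                   ('in', 'e', 'the bathroom'),
                   ('instr', 'e', 'the knife'),
                   ('at', 'e', 'midnight')}

    def entails(premise, conclusion):
        """Conjunction elimination: dropping conjuncts preserves truth."""
        return conclusion <= premise            # subset test on conjunct sets

    sentence_3b = {('butter', 'jones', 'the toast', 'e'),
                   ('in', 'e', 'the bathroom')}
    sentence_3e = {('butter', 'jones', 'the toast', 'e')}

    assert entails(sentence_2a, sentence_3b) and entails(sentence_2a, sentence_3e)
    assert not entails(sentence_3e, sentence_2a)   # the converse inference fails

As the text notes, it is precisely such entailment patterns that are difficult to capture on a classical operator approach to adverbials.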
The definition and delineation of events (comprising Vendler's accomplishments and achievements), processes (activities in Vendler's terms) and states has been an extensively discussed and highly controversial topic of studies particularly on tense and aspect. The reader is referred to the articles 48 (Filip) Aspectual class and Aktionsart and 97 (Smith) Tense and aspect. For our present purposes the following brief remarks shall suffice:

First, a terminological note: The notion "event" is often understood in a broad sense, i.e. as covering, besides events in a narrow sense, processes and states as well. Bach (1986) has introduced the term "eventuality" for this broader notion of events. In the remainder of this article I will stick to speaking of events in a broad sense unless explicitly indicated otherwise. Other labels for an additional Davidsonian event argument that can be found in the literature include "spatiotemporal location" (e.g. Kratzer 1995) and "Davidsonian argument" (e.g. Chierchia 1995).

Secondly, events (in a narrow sense), processes, and states may be characterized in terms of dynamicity and telicity. Events and processes are dynamic eventualities, states are static. Furthermore, events have an inherent culmination point, i.e., they are telic, whereas processes and states, being atelic, have no such inherent culmination point; see Krifka (1989, 1992, 1998) for a mereological characterization of events and cf. also Dowty (1979), Rothstein (2004).

Finally, accomplishments and achievements, the two subtypes of events in a narrow sense, differ wrt. their temporal extension. Whereas accomplishments such as expressed by read the book, eat one pound of cherries, run the 100m final have a temporal extension, achievements such as reach the summit, find the solution, win the 100m final are momentary changes of state with no temporal duration. See Kennedy & Levin (2008) on so-called degree achievements expressed by verbs like to lengthen, to cool, etc. The variable aspectual behavior of these verbs - atelic (permitting the combination with a for-PP) or telic (permitting the combination with an in-PP) - is explained in terms of the relation between the event structure and the scalar structure of the base adjective; cf. (12).

(12) a. The soup cooled for 10 minutes. (atelic)
b. The soup cooled in 10 minutes. (telic)

Turning back to the potential sources for Davidsonian event arguments, in more recent times not only verbs, whether eventive or stative, have been taken to introduce an additional argument, but other lexical categories as well, such as adjectives, nouns and also prepositions. Motivation for this move comes from the observation that all predicative categories provide basically the same kind of empirical evidence that motivated Davidson's proposal and thus call for a broader application of the Davidsonian analysis. The following remarks from Higginbotham & Ramchand (1997) are typical of this view:

Once we assume that predicates (or their verbal, etc.
heads) have a position for events, taking the many consequences that stem therefrom, as outlined in publications originating with Donald Davidson (1967), and further applied in Higginbotham (1985, 1989), and Terence Parsons (1990), we are not in a position to deny an event-position to any predicate; for the evidence for, and applications of, the assumption are the same for all predicates. Higginbotham & Ramchand (1997: 54)

As these remarks indicate, nowadays Neo-Davidsonian approaches often take event arguments to be a trademark not only of verbs but of predicates in general. We will come back to this issue in section 5 when we reconsider the category of states.
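The aspectual cross-classification reviewed above can be summed up in two binary features plus duration. The following sketch is merely our own tabulation of Vendler's classes in code form; the feature names are informal shorthand rather than an established inventory.

    def vendler_class(dynamic, telic, durative):
        """Classify an eventuality description by the features of section 3.1."""
        if not dynamic:
            return 'state'                    # static, atelic (know, love)
        if not telic:
            return 'activity/process'         # dynamic, atelic (speak, wait)
        return 'accomplishment' if durative else 'achievement'

    # 'read the book' is dynamic, telic and temporally extended:
    print(vendler_class(dynamic=True, telic=True, durative=True))   # accomplishment
    # 'reach the summit' is a momentary change of state:
    print(vendler_class(dynamic=True, telic=True, durative=False))  # achievement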
3.2. Events and thematic roles

The second core assumption of Neo-Davidsonian accounts, besides assuming a broader distribution of event arguments, concerns the way of relating the event argument to the predicate and its regular arguments. While Davidson (1967) introduced the event argument as an additional argument of the verbal predicate, thereby augmenting its arity, Neo-Davidsonian accounts use the notion of thematic roles for linking an event to its participants. Thus, the Neo-Davidsonian version of Davidson's logical form in (2b) for the classical sentence (2a), repeated here as (13a/b), takes the form in (13c).

(13) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. ∃e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]
c. ∃e [butter (e) & agent (e, jones) & patient (e, the toast) & in (e, the bathroom) & instr (e, the knife) & at (e, midnight)]

On a Neo-Davidsonian view, all verbs are uniformly one-place predicates ranging over events. The verb's regular arguments are introduced via thematic roles such as agent, patient, experiencer, etc., which express binary relations holding between events and their participants; cf. article 18 (Davis) Thematic roles for details on the nature, inventory and hierarchy of thematic roles. Note that due to this move of separating the verbal predicate from its arguments and adding them as independent conjuncts, Neo-Davidsonian accounts give up to some extent the distinction between arguments and modifiers. At least it is no longer possible to read off the number of arguments a verb has from the logical representation. While Davidson's notation in (13b) conserves the argument/modifier distinction by reserving the use of thematic roles for the integration of circumstantial modifiers, the Neo-Davidsonian notation (13c) uses thematic roles both for arguments such as the agent Jones and for modifiers such as the instrumental the knife; see Parsons (1990: 96ff) for motivation and defense and Bierwisch (2005) for some criticism on this point.

3.3. Decompositional event semantics

The overall Neo-Davidsonian approach is also compatible with adopting a decompositional perspective on the semantics of lexical items, particularly of verbs; cf. articles 16 (Bierwisch) Semantic features and primes and 17 (Engelberg) Frameworks of decomposition. Besides a standard lexical entry for a transitive verb such as to close in (14a), which translates the verbal meaning into a one-place predicate close on events, one might also choose to decompose the verbal meaning into more basic semantic predicates like the classical cause, become etc.; cf. Dowty (1979). A somewhat simplified version of Parsons' "subatomic" approach is given in (14b); cf. Parsons (1990: 120).

(14) a. to close: λy λx λe [close (e) & agent (e, x) & theme (e, y)]
b. to close: λy λx λe [agent (e, x) & theme (e, y) & ∃e′ [cause (e, e′) & theme (e′, y) & ∃s [become (e′, s) & closed (s) & theme (s, y)]]]

According to (14b) the transitive verb to close expresses an action e taken by an agent x on a theme y which causes an event e′ of y changing into a state s of being closed. On this account a causative verb introduces not one hidden event argument but three.
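The contrast between (14a) and (14b) can be rendered computationally by treating the λ-bindings as curried functions that assemble conjunct lists. This is our own minimal illustration (the names close_simple and close_decomposed are invented, and the existentially bound e′ and s of (14b) are inlined as placeholder strings).

    # Curried lexical entries in the style of (14); conjuncts are plain tuples.
    def close_simple(y):
        return lambda x: lambda e: [('close', e), ('agent', e, x), ('theme', e, y)]

    def close_decomposed(y):
        return lambda x: lambda e: [
            ('agent', e, x), ('theme', e, y),
            ('cause', e, "e'"), ('theme', "e'", y),   # the action e causes a change e'
            ('become', "e'", 's'), ('closed', 's'),   # e' culminates in a state s
            ('theme', 's', y)]

    # 'Jones closed the door': supply the theme, the agent, and the event variable.
    print(close_simple('the door')('jones')('e'))
    print(close_decomposed('the door')('jones')('e'))

The decomposed entry makes the three hidden arguments of a causative verb - the action e, the change e′, and the result state s - directly visible in the output.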
used and combined with further semantic instruments such as decompositional and/or thematic role approaches.

4. The stage-level/individual-level distinction

4.1. Linguistic phenomena

A particularly prominent application field for contemporary event semantic research is provided by the so-called stage-level/individual-level distinction, which goes back to Carlson (1977) and, as a precursor, Milsark (1974, 1977). Roughly speaking, stage-level predicates (SLPs) express temporary or accidental properties, whereas individual-level predicates (ILPs) express (more or less) permanent or inherent properties; some examples are given in (17) vs. (18).

(17) Stage-level predicates
a. adjectives: tired, drunk, available, ...
b. verbs: speak, wait, arrive, ...

(18) Individual-level predicates
a. adjectives: intelligent, blond, altruistic, ...
b. verbs: know, love, resemble, ...

The stage-level/individual-level distinction is taken to be a conceptually founded distinction that is grammatically reflected. Lexical predicates are classified as being either SLPs or ILPs. In recent years, a growing list of quite diverse linguistic phenomena has been associated with this distinction. Some illustrative cases will be mentioned next; cf., e.g., Higginbotham & Ramchand (1997), Fernald (2000), Jäger (2001), Maienborn (2003: §2.3) for commented overviews of SLP/ILP diagnostics that have been discussed in the literature.

4.1.1. Subject effects

Bare plural subjects of SLPs have, besides a generic reading ('Firemen are usually available.'), also an existential reading ('There are firemen who are available.'), whereas bare plural subjects of ILPs only have a generic reading ('Firemen are usually altruistic.'):

(19) a. Firemen are available. (SLP: generic + existential reading)
b. Firemen are altruistic. (ILP: only generic reading)

4.1.2. There-coda

Only SLPs (20) but not ILPs (21) may appear in the coda of a there-construction:

(20) a. There were children sick. (SLP)
b. There was a door open.

(21) a. *There were children tall. (ILP)
b. *There was a door wooden.
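The idea that the coda restriction piggybacks on a lexical SLP/ILP classification can be stated as a toy filter. The sketch below is our own illustration only: the mini-lexicon and the function name are invented, and the hard-coded two-way classification is precisely what section 4.3 will call into question.

    # Toy lexicon classifying predicates as stage-level or individual-level.
    LEXICON = {'sick': 'SLP', 'open': 'SLP', 'available': 'SLP',
               'tall': 'ILP', 'wooden': 'ILP', 'altruistic': 'ILP'}

    def licenses_there_coda(predicate):
        """Only SLPs may appear in the coda of a there-construction; cf. (20)/(21)."""
        return LEXICON[predicate] == 'SLP'

    for pred in ('sick', 'tall'):
        mark = '' if licenses_there_coda(pred) else '*'
        print(f"{mark}There were children {pred}.")
        # -> 'There were children sick.' / '*There were children tall.'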
In sum, the standard perspective under which all these contrasts concerning subject effects, when-conditionals, locative modifiers, and so on have been considered is that they are distinct surface manifestations of a common underlying contrast. The stage-level/individual-level hypothesis is that the distinction of SLPs and ILPs rests on a fundamental (although still not fully understood) conceptual opposition that is reflected in multiple ways in the grammatical system. The following quotation from Fernald (2000) is representative of this view:

Many languages display grammatical effects due to the two kinds of predicates, suggesting that this distinction is fundamental to the way humans think about the universe. Fernald (2000: 4)

Given that the conceptual side of the coin is still rather mysterious (Fernald (2000: 4): "Whatever sense of permanence is crucial to this distinction, it must be a very weak notion"), most stage-level/individual-level advocates content themselves with investigating the grammatical side (Higginbotham & Ramchand (1997: 53): "Whatever the grounds for this distinction, there is no doubt of its force"). We will come back to this issue in section 4.3.

4.2. Event semantic treatments

A first semantic analysis of the stage-level/individual-level contrast was developed by Carlson (1977). Carlson introduces a new kind of entities, which he calls "stages". These are spatiotemporal partitions of individuals. SLPs and ILPs are then analyzed as predicates ranging over different kinds of entities: ILPs are predicates over individuals, and SLPs are predicates over stages. Thus, on Carlson's approach the stage-level/individual-level distinction amounts to a basic difference at the ontological level.

Kratzer (1995) takes a different direction, locating the relevant difference at the level of the argument structure of the corresponding predicates. Crucially, SLPs have an extra event argument on Kratzer's account, whereas ILPs lack such an extra argument. The lexical entries for a SLP like tired and an ILP like blond are given in (27).

(27) a. tired: λx λe [tired (e, x)]
b. blond: λx [blond (x)]

This difference in argument structure may now be exploited for selectional restrictions, for instance. Perception verbs, e.g., require an event-denoting complement; see the discussion of (11) in section 2.2. This prerequisite is only fulfilled by SLPs, which explains the SLP/ILP difference observed in (24). Moreover, the ban of ILPs from depictive constructions (see (25) vs. (26)) can be traced back to the need of the secondary predicate to provide a state argument that temporally includes the main predicate's event referent.

A very influential syntactic explanation for the observed subject effects within Kratzer's framework has been proposed by Diesing (1992). She assumes different subject positions for SLPs and ILPs: Subjects of SLPs have a VP-internal base position; subjects of ILPs are base-generated VP-externally. In addition, Diesing formulates a
so-called Mapping Hypothesis, which serves as a syntax/semantics interface condition on the derivation of a logical form. (Diesing assumes a Lewis-Kamp-Heim style tripartite logical form consisting of a non-selective quantifier Q, a restrictive clause (RC), and a nuclear scope (NS).) Diesing's (1992) Mapping Hypothesis states that VP-material is mapped into the nuclear scope, and VP-external material is mapped into the restrictive clause. Finally, Diesing takes the VP-boundary to be the place for the existential closure of the nuclear scope. The different readings for SLP and ILP bare plural subjects follow naturally from these assumptions: If SLP subjects stay in their VP-internal base position, they will be mapped into the nuclear scope and, consequently, fall under the scope of the existential quantifier. This leads to the existential reading; cf. (28a). Or they move to a higher, VP-external subject position, in which case they are mapped into the restrictive clause and fall under the scope of the generic operator. This leads to the generic reading; cf. (28b). ILP subjects, having a VP-external base position, may only exploit the latter option. Thus, they only have a generic reading; cf. (29).

(28) a. ∃e,x [NS firemen(x) & available(e, x)] (cf. Kratzer 1995: 141)
b. Gen e,x [RC firemen(x) & in(x, e)] [NS available(e, x)]

(29) Gen x [RC firemen(x)] [NS altruistic(x)]

Kratzer's account also offers a straightforward solution for the different behavior of SLPs and ILPs wrt. locative modification; cf. (23). Having a Davidsonian event argument, SLPs provide a suitable target for locative modifiers; hence, they can be located in space. ILPs, on the other hand, lack such an additional event argument, and therefore do not introduce any referent whose location could be further specified via adverbial modification. This is illustrated in (30)/(31). While combining a SLP with a locative modifier yields a semantic representation like (30b), any attempt to add a locative to an ILP must necessarily fail; cf. (31b).

(30) a. Maria was tired in the car.
b. ∃e [tired(e, maria) & in(e, the car)]

(31) a. */??Maria was blond in the car.
b. [blond(maria) & in(???, the car)]

Thus, on a Kratzerian analysis, SLPs and ILPs indeed differ in their ability to be located in space (see the above quote from Fernald), and this difference is traced back to the presence vs. absence of an event argument. Analogously, the event variable of SLPs provides a suitable target for when-conditionals to quantify over in (22a), whereas the ILP case (22b) lacks such a variable; cf. Kratzer's (1995) Prohibition against Vacuous Quantification.

A somewhat different event semantic solution for the incompatibility of ILPs with locative modifiers has been proposed by Chierchia (1995). He takes a Neo-Davidsonian perspective according to which all predicates introduce event arguments. Thus, SLPs and ILPs do not differ in this respect. In order to account for the SLP/ILP contrast in combination with locatives, Chierchia then introduces a distinction between two kinds of events: SLPs refer to location dependent events whereas ILPs refer to location
independent events; see also McNally (1998). The observed behavior wrt. locatives follows on the assumption that only location dependent events can be located in space. As Chierchia (1995: 178) puts it: "Intuitively, it is as if ILP were, so to speak, unlocated. If one is intelligent, one is intelligent nowhere in particular. SLP, on the other hand, are located in space."

Despite all differences, Kratzer's and Chierchia's analyses have some important commonalities. Both consider the SLP/ILP contrast in (30)/(31) as a grammatical effect. That is, sentences like (31a) do not receive a compositional semantic representation; they are grammatically ill-formed. Kratzer and Chierchia furthermore share the general intuition that SLPs (and only those) can be located in space. This is what the difference in (30a) vs. (31a) is taken to show. And, finally, both analyses rely crucially on the idea that at least SLPs, and possibly all predicates, introduce Davidsonian event arguments.

All in all, Kratzer's (1995) synthesis of the stage-level/individual-level distinction with Davidsonian event semantics has been extremely influential, opening up a new field of research and stimulating the development of further theoretical variants and of alternative proposals.

4.3. Criticisms and further developments

In subsequent studies of the stage-level/individual-level distinction two tendencies can be observed. On the one hand, the SLP/ILP contrast has been increasingly conceived of in information structural terms. Roughly speaking, ILPs relate to categorial judgements, whereas SLPs may build either categorial or thetical judgements; cf., e.g., Ladusaw (1994), McNally (1998), Jäger (2001). With this move, the stage-level/individual-level distinction is usually no longer seen as a lexically codified contrast but rather as being structurally triggered. On the other hand, there is growing skepticism concerning the empirical adequacy of the stage-level/individual-level hypothesis. Authors such as Higginbotham & Ramchand (1997), Fernald (2000), and Jäger (2001) argue that the phenomena subsumed under this label are actually quite distinct and do not yield such a uniform contrast upon closer scrutiny as a first glance might suggest.

For instance, as already noted by Bäuerle (1994: 23), the group of SLPs that support an existential reading of bare plural subjects is actually quite small; cf. (19a). The majority of SLPs, such as tired or hungry in (32), behave more like ILPs, i.e., they only yield a generic reading.

(32) Firemen are hungry/tired. (SLP: only generic reading)

In view of the sentence pair in (33) Higginbotham & Ramchand (1997: 66) suspect that some notion of speaker proximity might also be of relevance for the availability of existential readings.

(33) a. (Guess whether) firemen are nearby/at hand.
b. ?(Guess whether) firemen are far away/a mile up the road.

There-constructions, on the other hand, also appear to tolerate ILPs, contrary to what one would expect; cf. the example (34) taken from Carlson (1977: 72).
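The type-driven core of Kratzer's analysis can be made concrete with a small sketch. The following Haskell fragment is my own toy encoding, not anything from the cited literature: all type names and the dummy extensions are invented for illustration. SLPs are modelled as relations between individuals and events, ILPs as plain predicates of individuals, and a locative modifier as a predicate of events, so that the (un)availability of locative modification falls out as a matter of typing.

```haskell
-- Toy encoding of Kratzer's (1995) argument-structure contrast.
-- All names and extensions are invented for illustration.
data Ind   = Maria          deriving (Eq, Show)
data Event = E1 | E2        deriving (Eq, Show)

-- SLPs carry an extra Davidsonian event argument; ILPs do not.
type SLP = Ind -> Event -> Bool
type ILP = Ind -> Bool

tired :: SLP
tired _ _ = True            -- dummy extension

blond :: ILP
blond _ = True              -- dummy extension

-- A locative modifier is a predicate of events ...
inTheCar :: Event -> Bool
inTheCar e = e == E1        -- dummy extension

-- ... so it can intersect with an SLP via the shared event argument:
tiredInTheCar :: Ind -> Event -> Bool
tiredInTheCar x e = tired x e && inTheCar e

-- An ILP offers no event argument to target: the analogue
--   blondInTheCar x e = blond x e && inTheCar e
-- is a type error, mirroring the ill-formedness of (31a).
```

On this rendering, Chierchia's alternative would instead give both predicate classes the SLP type and locate the restriction in which events a modifier like inTheCar may hold of.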
820 VII. Theories of sentence semantics VP-modifiers. They should not be confounded with locative frame adverbials such as those in (39).These are sentential modifiers that do not add an additional predicate to a VP's event argument but instead provide a semantically underspecified domain restriction for the overall proposition. Locative frame adverbials often yield temporal or conditional interpretations (e.g.'When he was in Italy, Maradona was married.' for (39c)) but might also be interpreted epistemically, for instance ('According to the belief of the people in Italy, Maradona was married.'); see Maienborn (2001) for details and cf. also article 54 (Maienborn & Schafer) Adverbs and adverbials. (39) Locative frame adverbials: a. By candlelight, Carolin resembled her brother. b. Maria was drunk in the car. c. In Italy, Maradona was married. Secondly, we are now in a position to make more precise what is going on in sentence pairs like (23), repeated here as (40), which are often taken to demonstrate the different behavior of SLPs and ILPs wrt. location in space; cf. the discussion in section 4. (40) a. Maria was tired/hungry/nervous in the car. (SLP) b. ??Maria was blond/intelligent/a linguist in the car. (ILP) Actually, this SLP/ILP contrast is not an issue of grammaticality but concerns the acceptability of these sentences under a temporal reading of the locative frame; cf. Maienborn (2004) for a pragmatic explanation of this temporariness effect. Thirdly, sentences (38d/e) are well-formed under an alternative syntactic analysis that takes the locative as the main predicate and the adjective as a depictive secondary predicate. Under this syntactic analysis sentence (38d) would express that there was a state of the dress being on the clothesline, and this state is temporally included in an accompanying state of the dress being wet.This is not the kind of evidence needed to substantiate the Neo-Davidsonian claim that states can be located in space. If the locative were a true event-related modifier, sentence (38d) should have the interpretation:There was a state of wetness of the dress, and this state is located on the clothesline. (38d) has no such reading; cf. the discussion on this point between Rothstein (2005) and Maienborn (2005b). Turning back to our event diagnostics, the same split within the group of state expressions that we observed in the previous cases also shows up with manner adverbials, comitatives and the like - that is, modifiers that elaborate on the internal functional structure of events. State verbs combine regularly with them, whereas statives do not, as (41) shows. Katz (2003) dubbed this the Stative Adverb Gap. (41) Manner adverbials etc.: a. Bardo slept calmly/with his teddy/without a pacifier. b. Carolin sat motionless/stiff at the table. c. The pearls gleamed dully/reddishly/moistly. d. *Bardo was calmly/with his teddy/without a pacifier tired. e. *Carolin was restlessly/patiently thirsty. f. *Andrea resembled with her daughter Romy Schneider. g. *Bardo owned thriftily/generously much money.
There has been some discussion of apparent counterexamples to this Stative Adverb Gap such as (42). While, e.g., Jäger (2001), Mittwoch (2005), Dölling (2005) or Rothstein (2005) conclude that such cases provide convincing evidence for assuming a Davidsonian argument for statives as well, Katz (2000, 2003, 2008) and Maienborn (2003, 2005a,b, 2007) argue that these either involve degree modification, as in (42a), or are instances of event coercion, i.e. a sentence such as (42b) is, strictly speaking, ungrammatical but can be "rescued" by inferring some event argument to which the manner adverbial may then apply regularly; see the discussion in section 6.2. For instance, what John is passionate about in (42b) is not the state of being a Catholic but the activities associated with this state (e.g. going to mass, praying, going to confession). If no related activities come to mind for some predicate, such as being a relative of Grit in (42'b), then the pragmatic rescue fails and the sentence becomes odd. According to this view, understanding sentences such as (42b) requires a non-compositional reinterpretation of the stative expression that is triggered by the lack of a regular Davidsonian event argument.

(42) a. Lisa firmly believed that James was innocent.
b. John was a Catholic with great passion in his youth.

(42') b. ??John was a relative of Grit with great passion in his youth.

See also Rothmayr (2009) for a recent analysis of the semantics of stative verbs including a decompositional account of stative/eventive ambiguities as illustrated in (43):

(43) a. Hair obstructed the drain. (stative reading)
b. A plumber obstructed the drain. (preferred eventive reading)

A further case of stative/eventive ambiguities is discussed by Engelberg (2005) in his study of dispositional verbs such as German helfen (help), gefährden (endanger), erleichtern (facilitate). These verbs may have an eventive or a stative reading depending on whether the subject is nominal or sentential; cf. (44). Trying to account for these readings within the Davidsonian program turns out to be challenging in several respects. Engelberg advocates the philosophical concept of supervenience as a useful device to account for the evaluative rather than causal dependency of the effect state expressed by these verbs.

(44) a. Rebecca helped Jamaal in the kitchen. (eventive)
b. That Rebecca had fixed the water pipes helped Jamaal in the kitchen. (stative)

In view of the evidence reviewed above, it seems justified to conclude that statives, including all copular constructions, do not behave as one would expect if they had a hidden Davidsonian argument, regardless of whether they express a temporary or a permanent property.

What conclusions should we draw from these linguistic observations concerning the ontological category of states? There are basically two lines of argumentation that have been pursued in the literature.

Maienborn takes the behavior wrt. the classical event diagnostics in (10) as a sufficiently strong linguistic indication of an underlying ontological difference and assumes that only state verbs denote true Davidsonian eventualities, i.e., Davidsonian states,
whereas statives resist a Davidsonian analysis but refer instead to what Maienborn calls Kimian states, exploiting Kim's (1969, 1976) notion of temporally bound property exemplifications. Kimian states may be located in time and they allow for anaphoric reference. Yet, lacking an inherent spatial dimension, they are ontologically "poorer", more abstract entities than Davidsonian eventualities; cf. Maienborn (2003, 2005a,b, 2007) for details.

Authors like Dölling (2005), Higginbotham (2005), Ramchand (2005) or Rothstein (2005) take a different track. On their perspective, the observed linguistic differences call for a more liberal definition of eventualities that includes the referents of stative expressions. In particular, they are willing to give up the assumption of eventualities having an inherent spatial dimension. Hence, Ramchand (2005: 372) proposes the following alternative to the definition offered in (8):

(45) Eventualities are abstract entities with constitutive participants and with a constitutive relation to the temporal dimension.

So the issue basically is whether we opt for a narrow or a broad definition of events. Forty years after Davidson's first plea for events we still don't know for sure what kind of things event(ualitie)s actually are.

6. Psycholinguistic studies

In recent years, a growing interest has emerged in testing theoretical linguistic assumptions about event structure by means of psycholinguistic experiments. Two research areas involving events have attracted major interest within the still developing field of semantic processing; cf. articles 15 (Bott, Featherston, Radó & Stolterfoht) Experimental methods and 102 (Frazier) Meaning in psycholinguistics. These are the processing of underlying event structures and of event coercion.

6.1. The processing of underlying event structures

The first focus of interest concerns the issue of distinguishing different kinds of events in terms of the complexity of their internal structure. Gennari & Poeppel (2003) show that the processing of event sentences such as (46a) takes significantly longer than the processing of otherwise similar stative sentences such as (46b).

(46) a. The visiting scientist solved the intricate math problem. (eventive)
b. The visiting scientist lacked any knowledge of English. (stative)

This processing difference is attributed to eventive verbs having a more complex decompositional structure than stative verbs; cf. the Bierwisch-style representations in (47).

(47) a. to solve: λy λx λe [e: cause(x, become(solved(y)))]
b. to lack: λy λx λs [s: lack(x, y)]

Thus, the study of Gennari & Poeppel (2003) adduces empirical evidence for the event vs. state distinction and it provides experimental support for the psychological reality of
structuring natural language meaning in terms of decompositional representations. This is, of course, a highly controversial issue; cf. the argumentation in Fodor, Fodor & Garrett (1975), de Almeida (1999) and Fodor & Lepore (1998) against decomposition, and see also the more differentiated perspective taken in Mobayyen & de Almeida (2005).

McKoon & Macfarland (2000, 2002), taking up a distinction made by Levin & Rappaport Hovav (1995), investigate two kinds of causative verbs, viz. verbs denoting an externally caused event (e.g. break) as opposed to verbs denoting an internally caused event (e.g. bloom). Whereas the former include a causing subevent as well as a change-of-state subevent, the latter only express a change of state; cf. McKoon & Macfarland (2000: 834). Thus, the two verb classes differ wrt. their decompositional complexity. McKoon and Macfarland describe a series of experiments that show that there are clear processing differences corresponding to this lexical distinction. Sentences with external causation verbs take significantly longer to process than sentences with internal causation verbs. In addition, this processing difference shows up with the transitive as well as the intransitive use of the respective verbs; cf. (48) vs. (49). McKoon and Macfarland conclude from this finding that the causing subevent remains implicitly present even if no explicit cause is mentioned in the break-case. That is, their experiments suggest that both transitive and intransitive uses of, e.g., awake in (49) are based on the same lexical semantic event structure consisting of two subevents. And conversely, if an internal causation verb is used transitively, as wilt in (48a), the sentence is still understood as denoting a single event with the subject referent being part of the change-of-state event.

(48) Internal causation verbs:
a. The bright sun wilted the roses.
b. The roses wilted.

(49) External causation verbs:
a. The fire alarm awoke the residents.
b. The residents awoke.

In sum, the comprehension of break, awake etc. requires understanding a more complex event conceptualization than that of bloom, wilt etc. This psycholinguistic finding corroborates theoretically motivated assumptions on the verbs' lexical semantic representations. See also Härtl (2008) for a more thorough and differentiated study of implicit event information. Härtl discusses whether, to what extent, and at which processing level implicit event participants and implicit event predicates are still accessible for interpretation purposes. Most notably, the studies of McKoon & Macfarland (2000, 2002) and Gennari & Poeppel (2003) provide strong psycholinguistic support for the assumption that verb meanings are represented and processed in terms of an underlying event structure.

6.2. The processing of event coercion

The second focus of psycholinguistic research on events is devoted to the notion of event coercion. Coercion refers to the forcing of an alternative interpretation when the compositional machinery fails to derive a regular interpretation. In other words, event coercion is a kind of rescue operation which solves a grammatical conflict by using additional
knowledge about the involved event type; cf. also article 25 (de Swart) Mismatches and coercion. There are two types of coercion that are prominently discussed in the literature. The first type, so-called complement coercion, is illustrated in (50). The verb to begin requires an event-denoting complement and forces the given object-denoting complement the book into a contextually appropriate event reading. Hence, sentence (50) is reinterpreted as expressing that John began, e.g., to read the book; cf., e.g., Pustejovsky (1995), Egg (2003).

(50) John began the book.

The second kind, so-called aspectual coercion, refers to a set of options for adjusting the aspectual type of a verb phrase according to the demands of a temporal modifier. For instance, the punctual verb to sneeze in (51a) is preferably interpreted iteratively in combination with the durative adverbial for five minutes, whereas the temporal adverbial for years forces a habitual reading of the verb phrase to smoke a morning cigarette in (51b), and the stative expression to be in one's office receives an ingressive reinterpretation due to the temporal adverbial in 10 minutes in (51c); cf., e.g., Moens & Steedman (1988), Pulman (1997), de Swart (1998), Dölling (2003), Egg (2005). See also the classification of aspectual coercions developed in Hamm & van Lambalgen (2005).

(51) a. John sneezed for five minutes.
b. John smoked a morning cigarette for years.
c. John was in his office in 10 minutes.

There are basically two kinds of theoretical accounts that have been developed for the linguistic phenomena subsumed under the label of event coercion: type-shifting accounts (e.g., Moens & Steedman 1988, de Swart 1998) and underspecification accounts (e.g., Pulman 1997, Egg 2005); cf. articles 24 (Egg) Semantic underspecification and 25 (de Swart) Mismatches and coercion. These accounts and the predictions they make for the processing of coerced expressions have been the subject of several psycholinguistic studies; cf., e.g., de Almeida (2004), Pickering, McElree & Traxler (2005) and Traxler et al. (2005) on complement coercion, and Piñango, Zurif & Jackendoff (1999), Piñango, Mack & Jackendoff (to appear), Pickering et al. (2006), Bott (2008a,b), Brennan & Pylkkänen (2008) on aspectual coercion. The crucial question is whether event coercion causes additional processing costs, and if so, at which point in the course of meaning composition such additional processing takes place. The results obtained so far still don't yield a fully stable picture. Whether processing differences are detected or not seems to depend partly on the chosen experimental methods and tasks; cf. Pickering et al. (2006). Pylkkänen & McElree (2006) draw the following interim balance: Whereas complement coercion always raises additional processing costs (at least without contextual support), aspectual coercion does not appear to lead to significant processing difficulties. Pylkkänen & McElree (2006) propose the following interpretation of these results: Complement coercion involves an ontological type conflict between the verb's request for an event argument and a given object referent. This ontological type conflict requires an immediate and time-consuming repair; otherwise the compositional process would break down. Aspectual coercion, on the other hand, only involves sortal shifts within the category of events that do not seem to affect composition and should therefore best be taken as an instance of semantic
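To give the flavor of the type-shifting accounts mentioned above, here is a small Haskell sketch of my own; it loosely follows the spirit of Moens & Steedman (1988), but the three-way aspectual classification and all names are simplifications invented for this illustration. A durative for-adverbial selects a process; applied to a point-like event description, a coercion step (iteration) adjusts the aspectual class first.

```haskell
-- Toy type-shifting treatment of aspectual coercion; the aspectual
-- classification and all names are invented simplifications.
data AspClass = Point | Process | State deriving (Eq, Show)

data EventDesc = EventDesc { gloss :: String, aspClass :: AspClass }
  deriving Show

-- Coercion step: a point-like description is shifted to an iterated
-- process (sneeze ~> sneeze repeatedly), cf. (51a).
iterated :: EventDesc -> EventDesc
iterated ed = ed { gloss = "repeatedly " ++ gloss ed, aspClass = Process }

-- 'for <duration>' demands a Process; a Point triggers coercion,
-- a State is left unhandled in this toy fragment.
forAdverbial :: String -> EventDesc -> Maybe EventDesc
forAdverbial dur ed = case aspClass ed of
  Process -> Just ed { gloss = gloss ed ++ " for " ++ dur }
  Point   -> forAdverbial dur (iterated ed)  -- coerce, then retry
  State   -> Nothing

-- forAdverbial "five minutes" (EventDesc "sneeze" Point)
--   ==> Just (EventDesc "repeatedly sneeze for five minutes" Process)
```

An underspecification account would instead leave the adverbial's demand as a constraint to be resolved later, rather than eagerly rewriting the verb phrase's representation in this way.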
35. Situation Semantics and the ontology of natural language

- The priority of information: language has external significance, as model theoretic semantics has always emphasized, but, as cognitive scientists of various stripes emphasize, it also has mental significance, yielding information about agents' internal states; in this respect see also article 11 (Kempson) Formal semantics and representationalism. What is needed is a way of capturing the commonality between the external and the mental, a matter exacerbated when multimodal meaning (gesture, gaze, visual access) enters into the picture.
- Cognitive readability: in common with all other biological organisms, language users are resource bounded agents. This requires that only relatively "small" entities feature in semantic accounts, hence the emphasis on situations and their characterization in a computable fashion.
- Structured objects: semantic objects such as propositions need to be treated in a way that treats their identity conditions very much on a par with 'ordinary' individuals. Such entities are structured objects:

(1) The primitives of our theory are all real things: individuals, properties, relations, and space-time locations. Out of these and objects available from the set theory we construct a universe of abstract objects. (Barwise & Perry 1983: 178)

That is, structured objects arrive on the scene with certain constraints that 'define them' in terms of other entities of the ontology in a manner that is inspired by proof theoretic approaches. This way of setting up the ontology has the potential of avoiding various foundational problems that beset classical theories of properties and propositions. For propositions, these problems typically center around doxastic puzzles such as logical omniscience and its variants.

An important component in fulfilling these desiderata, according to Barwise and Perry, is a theory by means of which external (and internal) reality can be represented - an ontology of some kind. The formalism that emerged came to be known as Situation Theory - its make up and motivation constitute the focus of sections 2, 3, and 4 of the paper. These proceed in an order that reflects the evolution of Situation Theory: initially as a theory of situations, then as a theory that includes both concrete entities such as situations and abstract ones such as propositions, finally as a more extended ontology, comprising in addition entities such as questions, outcomes, and possibilities. Section 5 of the paper concerns the emergence of a type theoretic version of the theory, within a formalism initiated by Robin Cooper. Section 6 provides some concluding remarks.

2. Introducing situations into semantics: Empirical motivations

Situation Semantics owes its initial prominence to its analysis of the naked infinitive (NI) construction, exemplified in (2). Here is a construction, argued Barwise (1989b), that intrinsically requires positing situations - spatio-temporally located parts of the world. One component of this argument goes as follows: the difference in meaning between (2a) and (2b) illustrates that "logically equivalent" NIs (relative to an evaluation by a world) are not semantically equivalent. And yet, the intuitive validity of the inference from (2b) to (2c) and the inference described in (2d) shows that NIs bring with them
Lewis 1979), until Barwise and Perry's proposal, the requisite relativization was not considered a matter to be handled in semantic theory. Barwise and Perry's essential idea is that in language use more than one situation comes into the picture: they make a distinction between the described situation, the situation which roughly speaking a declarative utterance picks out, and a resource situation, so called because it is used as a resource to fix the range/reference of an NP. Cooper's argument is based on data such as (7), modelled on an example from Lewis (1979), where two domains are in play, a local one and a New Zealand one. The former is exploited in the first two sentences, after which the New Zealand domain takes over. At the point marked ‖ we are to imagine a sudden shift back to the local domain. By assuming that domains are situations we capture the fact that once a shift is made, it encompasses the entire situation, ensuring that the dog referred to is local:

(7) The dog is under the piano and the cat is in the carton. The cat will never meet our other cat because our other cat lives in New Zealand. Our New Zealand cat lives with the Cresswells and their dog. And there he'll stay because the dog would be sad if the cat went away. ‖ The cat's going to pounce on you. And the dog's coming too.

For computational work using resource situations, integrated also with visual information, see Poesio (1993). For experimental work on the resolution of definites in conversation taking a closely related perspective see Brown-Schmidt & Tanenhaus (2008).

3. The Austinian picture

The ontology we have discussed so far comprises situations and situation types (as well as, of course, the elements that make up these entities - individuals, role to individual assignments). Situations are the main ingredient in a treatment of bare perceptual reports and play a significant role in underpinning NP meaning and assertion. This is essentially the ontology of Situations and Attitudes, in which there were no propositions. These were rejected as 'artifact(s) of the semantic endeavor' (Barwise & Perry 1983). As Barwise & Perry (1985) subsequently admitted, this was not a move they were required to make. Indeed proposition-like entities, more intensional than situations, are a necessary ingredient for accounts of attitude reports and illocutionary acts. Sets of situations, although somewhat more fine grained than sets of worlds, are vulnerable to sophisticated variants of logical omniscience (see e.g. Soames' puzzle in Soames 1985). Nonetheless, Angelika Kratzer has initiated an approach, somewhat confusingly also known as Situation Semantics, that does attempt to exploit sets of situations for precisely this role and develops accounts for a wide range of linguistic phenomena, including modality, donkey anaphora, exhaustivity, and factivity. See Kratzer (1989) for an early version of this approach, and Kratzer (2008) for a detailed, recent survey.

The next cheapest solution available within the Situations and Attitudes ontology would be to draft the situation types to serve as the propositional entities. Indeed, situation types are competitive in such a role: they can distinguish identity statements that involve distinct constituents (e.g.
(8a) corresponds to the situation type in (8c), whereas (8b) corresponds to the situation type in (8d)), while allowing substitutivity of co-referentials and cross-linguistic equivalents, as exemplified respectively by (8e) and (8f), the Hebrew analogue of (8b):
(8) a. Enesco is identical with himself.
b. Poulenc is identical with himself.
c. ((Identical; enesco, enesco))
d. ((Identical; poulenc, poulenc))
e. He is identical with himself.
f. Poulank zehe leacmo.

Nonetheless, post-1985 situation theory did not go for the cheapest solution; as we will see in section 4, not succumbing to ontological stinginess pays off when scaling up the theory to deal with other abstract entities. Building on a conception articulated 30 years earlier by Austin (1970), Barwise & Etchemendy (1987) developed a theory of propositions in which a proposition is a structured object prop(s, σ), individuated in terms of a situation s and a situation type σ. Here the intuition is that s is the described situation (or the belief situation, in so far as it is used to describe an agent's belief, or utterance token, in the case of locutionary propositions discussed below), with the relationship between s and σ being the one introduced above in our discussion of NIs, leading to a straightforward notion of truth and falsity:

(9) a. prop(s, σ) is true iff s : σ (s is of type σ).
b. prop(s, σ) is false iff s : ¬σ (s is not of type σ).

In saying that a proposition prop(s, σ) is individuated in terms of s and σ, the intention is to say that prop(s, σ) = prop(t, τ) if and only if s = t and σ = τ. Individuating propositions in terms of their "subject matter" (i.e. the situation type component) is familiar, but what is innovative and/or puzzling is the claim that two propositions can be distinct despite having the same subject matter. I mention three examples from the literature of cases which motivate differentiating propositions on the basis of their situational component. The first is one we saw above in the case of definiteness resolution, where the possibility of using 'the dog' is underwritten by distinct presuppositions; the difference in the presuppositions resides in the different resource situations exploited:

(10) a. prop(s_local, ((UNIQUE, Dog)))
b. prop(s_newzealand, ((UNIQUE, Dog)))

A second case are the locutionary propositions introduced by Ginzburg (2011). Ginzburg argues that characterizing both the update potential and the range of utterances that can be used to seek clarification about a given utterance u0 requires reference to the utterance token u0, as well as to its grammatical type Tu (see article 36 (Ginzburg) Situation Semantics for details). By defining propositions (locutionary propositions) individuated in terms of u0 and Tu one can simultaneously define update and clarification potential for utterances. In this case, there are potentially many instances of distinct locutionary propositions, which need to be differentiated on the basis of the utterance token - minimally any two utterances classified as being of the same type by the grammar.

The original motivation for Austinian propositions was in the treatment of the Liar paradox by Barwise & Etchemendy (1987). This paradox concerns sentences like (11a,b) which, pretheoretically, are false if true and true if false. Although one approach to this
viable theory of propositions can on its own deliver a viable theory of the attitudes. This is because attitudes have structure not perfectly mirrored by their external content, a realization of which prompted Barwise and Perry to abandon their initial, essentially proposition-based account. The most striking illustration of this is in puzzles like Kripke's Pierre (Kripke 1979), who is unaware that the wonderful city of Londres about which he learnt as a child is the same place as the squalid London, where he currently resides. While his beliefs are perfectly rational, we can say of him that he believes that London is pretty and also does not believe that London is pretty. One possible conclusion from this (see e.g. Crimmins 1993), a way out of paradox, is that attitude reports involve implicit reference to attitudinal states: relative to information acquired in France, Pierre believes London is pretty; relative to information acquired on the ground in London, he believes the opposite. Here is yet another important role for situations in linguistic description.

One way to integrate this in an account of complementation was offered in Cooper & Ginzburg (1996) and Ginzburg (1995) for declarative and interrogative attitude reports, respectively. This constitutes a compositional reformulation of the philosophical accounts cited above. The main idea is to assume that attitude predicates involve at least three arguments: an agent, an attitudinal state and a semantic entity. For instance with respect to belief, this relates an agent's belief in a proposition to facts about the agent's mental situation. This amounts to linking a positive belief attribution of proposition p relative to the mental situation ms with the existence of an internal belief state whose content is p. An example of such a mental situation is given in section 5.

4. A wider ontological net

The ontology of Situation Theory (ST) was originally designed on the basis of a rather restricted data set. One of the challenges of more recent work has been to extend this ontology in order to account for two related key domains for semantics: root clauses in conversational use and verb complementation. A large body of semantic work that has emerged since the late 1970s demonstrates that interrogative clauses possess denotations (questions) distinct in semantic type from declarative ones; imperative and subjunctive clauses possess denotations (dubbed outcomes by Ginzburg & Sag 2000) distinct in semantic type from declarative and interrogative ones; and facts are distinct from true propositions; for detailed empirical evidence for these distinctions, see Vendler (1972), Asher (1993), Peterson (1997) and Ginzburg & Sag (2000); see also article 66 (Krifka) Questions and article 67 (Han) Imperatives.

The main challenge in developing an ontology which distinguishes the diverse menagerie of abstract entities including propositions, questions, outcomes and facts is characterizing the structure of these entities, indeed figuring out how the distinct entities relate to each other. As pointed out by Ginzburg & Sag (2000), quantified NPs and certain adverbs are possible in declarative, interrogative and imperative semantic environments. Hence, the ontology must provide a semantic unit which constitutes the input/output of such adverbial modifiers and of NP quantification.
To make this concrete - the assumption that the denotation of imperatives is of a type distinct from t (however cashed out) is difficult to square with (a simplistic implementation of) the received wisdom that NPs such as 'everyone' are of type ⟨⟨e, t⟩, t⟩. If the latter were the case, composing 'everyone' with 'vacate the building' in (14c) would yield a denotation of type t:
(14) a. Everyone vacated the building.
b. Did everyone vacate the building?
c. Everyone vacate the building!
d. Kim always wins.
e. Does Kim always win?
f. Always wear white!

As we will see subsequently, good candidates for this role are situation types. These, as we observed in section 3, are not conflated with propositions in the situation theoretic universe. Ginzburg & Sag (2000) set out to construct an ontology that appropriately distinguishes these entities and yet retains the features of the ST ontology discussed earlier. The ontology, dubbed a Situational Universe with Abstract Entities (SU+AE), was developed in line with the strategy of Barwise and Perry's (1). This was implemented on two levels, one within a universe of type-based feature structures (Carpenter 1992). This universe underpinned grammatical analysis, using Head Driven Phrase Structure Grammar (HPSG). A denotational semantics was also developed in the Axiom of Foundation with Atoms (AFA)-based set theoretic framework of Seligman & Moss (1997). In what follows, I restrict attention to the latter.

A semantic universe is identified with a relational structure S of the form [A; S1, ..., Sm; R1, ..., Rm]. Here A - sometimes notated also as |S| - is the universe of the structure. From the class of relations we single out the S1, ..., Sm, which are called the structural relations, as they are to capture the structure of certain elements in the domain. Each Si can be thought of as providing a condition that defines a single structured object in terms of a list of n objects x1, ..., xn. Situations and situation types serve as the 'basic building blocks' from which the requisite abstract entities of the ontology are constructed:

Propositions are structurally determined by a situation and a situation type. (See discussion in section 3.)

Intuitively, each outcome is a specification of a situation which is futurate relative to some other given situation. Given this, outcomes are structurally determined by a situation and a situation type abstract whose temporal argument is abstracted away, thereby allowing specification of fulfilledness conditions.

Possibilities, a subclass of which constitutes the universe's facts, are structurally determined by a proposition. This reflects the tight link between propositions and possibilities. As Ginzburg & Sag (2000) explain, there is no obvious priority between possibilities and propositions: one could develop an ontology where propositions are built out of possibilities.

An additional assumption made is that the semantic universe is closed under simultaneous abstraction. Simultaneous abstraction, originally defined by Aczel & Lunnon (1991), is a semantic operation akin to λ-abstraction with one significant extension: abstraction is over sets of elements, including the empty set. Moreover, abstraction (including over the empty set) is potent - the body out of which abstraction occurs is distinct from the abstract. The assumption about closure under simultaneous abstraction is akin to the common type theoretic assumption about closure under functional type formation.
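The idea of structural determination can be illustrated with a small sketch. The following Haskell fragment is my own crude rendering of the prose above, not Ginzburg & Sag's formalism: each structural relation becomes a constructor, so that identity of structured objects reduces to identity of their defining components.

```haskell
-- Toy rendering of SU+AE structured objects; all type names are
-- invented, and situations/situation types are opaque atoms here.
newtype Situation = Situation String          deriving (Eq, Show)
newtype SitType   = SitType String            deriving (Eq, Show)
type Time = Int

-- Propositions: structurally determined by a situation and a type.
data Prop = Prop Situation SitType            deriving (Eq, Show)

-- Outcomes: a situation plus a situation type abstracted over its
-- temporal argument (crudely, a function from times to types).
data Outcome = Outcome Situation (Time -> SitType)

-- Possibilities (facts being a subclass): determined by a proposition.
newtype Possibility = Possibility Prop        deriving (Eq, Show)

-- Structural individuation comes for free from derived equality:
-- Prop s1 t1 == Prop s2 t2  holds iff  s1 == s2 and t1 == t2,
-- matching prop(s, sigma) = prop(t, tau) iff s = t and sigma = tau.
```

Note that an Outcome cannot support decidable equality in this encoding (its second component is a function), which crudely reflects why abstraction-based entities require a more careful treatment of individuation.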
in Barwise (1989c), and comprehensively developed in Seligman & Moss (1997), were rather complex. Concretely, simultaneous λ-abstraction with restrictions is a tool with a variety of uses, including quantification, questions, and the specification of attitudinal states and meanings (for the latter see article 36 (Ginzburg) Situation Semantics). Its complex set theoretic characterization made it difficult to use. Concomitantly, the theory in this form required an auxiliary coding into a distinct formalism (e.g. typed feature structures) for grammatical and computational applications. Neither of these versions of the theory provides an adequate notion of role-dependency, which has become standard in recent treatments of anaphora and quantification and in which much semantic work has been invested in frameworks such as Discourse Representation Theory and Dynamic Semantics; see Gawron & Peters (1990) for a detailed theory of anaphora and quantification in situation semantics, though one that is not dynamic.

Motivated to a large extent by such concerns, the situation theoretic outlook has been redeveloped using tools from Type Theory with Records (TTR), a framework initiated by Robin Cooper. Ever since Sundholm and Ranta's pioneering work (Sundholm 1986; Ranta 1994), there has been interest in using constructive type theory (often referred to as Martin-Löf Type Theory) as a framework for semantics (see e.g. Fernando 2001 and Krahmer & Piwek 1999). TTR is a model theoretic outgrowth of constructive type theory. Its provision of entities at both levels of tokens and types allows one to combine aspects of the typed feature structures world and the set theoretic world, enabling its use as a computational grammatical formalism. As we will see, TTR provides the semanticist with a formalism that satisfies the desiderata I mentioned in section 1. Cooper (2006a) has shown that the lion's share of situation theoretic results can be recast in TTR - the main exception being those results that depend explicitly on the existence of a non-well-founded universe, for instance Barwise and Etchemendy's account of the Liar; the type theoretic universe is well founded. But one could, according to Cooper (p.c.), recreate non-well-foundedness at the level where witnessing of types occurs. In addition, TTR allows for DRT-oriented or Montagovian treatments of anaphora and quantification. For a computational implementation of TTR, see Cooper (2008); for a closely related framework, the Grammatical Framework, see Ranta (2004).

The move to TTR is, however, not primarily a means of capturing and perhaps mildly refining past results; crucially, it underpins a theory of conversational interaction on both illocutionary and metacommunicative levels. A side effect of this is, via a theory of generation, an account of attitude reports. In the remaining space, I will briefly exposit the basics of TTR, show its ability to underpin SU+AEs and briefly sketch how this can be used to define basic information states in dialogue. One linguistic application will be provided, one that ties up situations, information states, and meaning: a specification of the meaning of a discourse-bound pronoun.

5.1. Generalizing the situation/situation type relation

The most fundamental notion of TTR is the typing judgement a : T, classifying an object a as being of type T.
This can be seen as a generalization of the situation semantics judgement s : σ - a generalization in that not only situations can figure as subjects of typing judgements. Note that the theory provides the objects and the types, but this form of judgement, as well as other forms, is metatheoretical. Examples are given in
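As a rough stand-in for such examples, the following Haskell fragment gives my own toy rendering of typing judgements over records; it ignores the dependent fields that make real TTR record types expressive, so it illustrates the shape of the judgement a : T rather than Cooper's actual type theory.

```haskell
-- Toy judgements a : T over atoms and (non-dependent) records.
-- The encoding - strings as labels, association lists as records -
-- is mine and much cruder than TTR proper.
data Obj  = Atom String | Rec [(String, Obj)]     deriving (Eq, Show)
data Type = IND | RecType [(String, Type)]        deriving (Eq, Show)

-- ofType a t decides the judgement a : t in this toy setting.
ofType :: Obj -> Type -> Bool
ofType (Atom _) IND = True
ofType (Rec fields) (RecType fieldTypes) =
  and [ maybe False (`ofType` t) (lookup label fields)
      | (label, t) <- fieldTypes ]                -- extra fields allowed
ofType _ _ = False

-- ofType (Rec [("x", Atom "maria")]) (RecType [("x", IND)])  ==> True
-- Record subtyping falls out: a record of a richer type also
-- satisfies any record type with fewer fields, as in TTR.
```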
goes on to offer a criterion of type individuation of record types using λ-types, where the corresponding 'labels' function as bound variables. Ginzburg (2005a) shows how to formulate a theory of questions as propositional abstracts in TTR, while using the standard TTR notion of abstraction. In this way, a possible criticism of the approach of Ginzburg & Sag (2000), that they use an ad hoc and complex notion of abstraction, can be circumvented. Similarly, Ginzburg (2005b) shows how to explicate outcomes within TTR.

5.3. Ontology in interaction

The most active area in the application of TTR to the description of NL is in the area of dialogue. Larsson (2002) and Cooper (2006b) showed how to decompose interaction protocols, such as those specified situation theoretically in Ginzburg (1996), by using TTR to describe update rules on the information states of dialogue participants. This was extended by Ginzburg (2011) to cover a variety of illocutionary moves, metacommunicative interaction (see article 36 (Ginzburg) Situation Semantics for some discussion) and conversational genres. Fernández (2006) uses TTR to develop a wide coverage of the range of non-sentential utterances that occur in conversation.

In these works, information states are assumed to consist of a public and an unpublicized part. For current purposes we restrict attention to the public part, also known as each participant's dialogue gameboard (DGB). Each DGB is a record of the type given in (27) - the spkr, addr fields allow one to track turn ownership, Facts represents conversationally shared assumptions, Pending and Moves represent respectively moves that are in the process of being/have been grounded, and QUD tracks the questions currently under discussion:

(27) [spkr : Ind
addr : Ind
c-utt : addressing(spkr, addr)
Facts : Set(Prop)
Pending : list(LocProp)
Moves : list(LocProp)
QUD : poset(Question)]

We call a mapping that indicates how one DGB can be modified by conversationally related action a conversational rule, and the types specifying its domain and its range respectively the preconditions and the effects.

Here I exemplify the use of TTR to give a partial characterization of the meaning of pronouns in dialogue, a task that links assertion acceptance, situations, and meaning. The main challenge for a theory of meaning for pronouns is of course how to characterize their antecedency conditions; here I restrict attention to intersentential cases, see Ginzburg (2011) for an extension of this account to intra-sentential cases. Dialogue takes us away quite quickly from certain received ideas on this score: antecedents can arise from queries (28a), and from partially understood or even disfluent utterances ((28b,c) respectively). Moreover, as (28d) illustrates, where 'he' cannot refer to 'Jake', the shelf life of an antecedent is potentially quite short. Although the data are subtle, a plausible assumption is that for non-referential NPs anaphora are not generally possible from
within a query (polar or wh) (Groenendijk 1998), or from an assertion that has been rejected (e.g. (28e,f)).

(28) a. A: Did John phone? B: He's out of contact in Daghestan.
b. A: Did John phone? B: Is he someone with a booming bass voice?
c. Peter was, well he was fired.
d. A: Jake hit Bill. / B: No, he patted him on the back. / A: Ah. Is Bill going to the party tomorrow? / B: No. / A: Is #he/Jake?
e. A: Do you own an apartment? B: Yes. A: Where is it located?
f. A: Do you own an apartment? B: No. A: #Where might you buy it?

This means, naturally enough, that witnesses to non-referential NPs can only emerge in a context where the corresponding assertion has been accepted. A natural move to make in light of this is to postulate a witnessing process as a side effect of assertion acceptance, a consequence of which will be the emergence of referents for non-referential NPs. For uniformity's sake, we can assume that these witnesses get incorporated into the contextual parameters (c-params) of that utterance, which in any case include (witnesses for) the referential NPs. This means that c-params serves uniformly as the locus for witnesses of 'discourse anaphora'.

The rule incorporating non-referential witnesses in c-params is actually simply a minor add-on to the rule that underwrites assertion acceptance (see Ginzburg 2011, chapter 4) - the rule underpinning the utterance of acceptances. It can be viewed as providing for a witness for situation/event anaphora, since this is what gets directly introduced into c-params. In cases where the witness is a record (essentially when the proposition is positive), NP witnesses will emerge. In (29) the preconditions involve the fact that the speaker's latest move is an assertion of a proposition whose type is T1. The effects change the speaker/addressee roles (since the asserted-to becomes the accepter) and add a record w, including a witness for T1, to the contextual parameters.

(29) Accept move:
preconds: [spkr : Ind
addr : Ind
p = [sit = sit1; sit-type = T1] : Prop
LatestMove.content = Assert(spkr, addr, p) : IllocProp]
effects: [spkr = preconds.addr : Ind
addr = preconds.spkr : Ind
w = preconds.LatestMove.c-params ∪ [sit = sit1] : Rec
Moves = m0 ⊕ preconds.Moves : list(LocProp)
m0.content = Accept(spkr, addr, p) : LocProp
m0.c-param = w : Rec]

We can now state the meaning of a singular pronoun somewhat schematically as follows: it is a word whose contextual parameters include an antecedent, which is to be sought from among the constituents of an active move; the pronoun is identical in reference to
this antecedent and agrees with it. Space precludes a careful characterization of what it means to be active, but given the data we saw above it is a composite property determined by QUD - essentially being specific to an element of QUD - and Pending. See article 36 (Ginzburg) Situation Semantics for the justification for including reference to an utterance's constituents in grammatical representation. In (30), I provide a lexical entry in the style of HPSG that captures this specification: here m represents the active move and a the antecedent; the final condition on c-params requires that within m's contextual parameters is one whose index is identical to that of a's content:

(30) [PHON : ⟨she⟩
c-params : [m : LocProp
a : Sign
c1 : member(a, m.constits)
c2 : ActiveMove(m)
m.sit.c-params : [a.c-params.index = a.cont.index : Ind]]
syncat : [head : N
ana : +
agr = c-params.m.cat.agr : [num = sg : Number
gen = fem : Gender
pers = third : Person]]
cont : [index = a.cont.index : Ind]]

6. Conclusions

Situation semantics initiated an ontology-oriented approach to semantics: the aim being to develop means of representing the external and internal reality of agents in a cognitively tractable way. The initial emphasis was on situations, based in part on evidence from the naked infinitival construction. Situations, parts of the world, have proved to be of significant importance to a variety of phenomena ranging from the domains associated with NP use to negation and, one way or the other, play a significant role in the very act of asserting.

The theory of situations subsequently led to a theory of propositions (Austinian propositions), structured objects constructed from situations and situation types: Austinian propositions are significantly more fine-grained than possible worlds propositions, but coarse-grained enough to pass translation and paraphrase criteria. They also have a potential construal in terms of differently coarse-grained memory traces. The technique of individuating abstract entities as structured objects enables the theory to scale up: by integrating questions, outcomes and facts into the ontology, Situation Theory was able to underpin a rudimentary theory of illocutionary interaction (entities such as questions, propositions and outcomes serve as the descriptive content of queries, assertions and requests) and a theory of complementation for attitudinal predicates.
A recent development has been to recast the theory in type theoretic terms, concretely using the formalism of Type Theory with Records. Type Theory with Records has many characteristics similar to situation theory - an ontology-oriented approach, computational tractability, structured objects. This enables most of the results situation theory achieved to be maintained. On the other hand, Type Theory with Records brings with it a more transparent formalism, and the existence of dependent types allows both dynamic semantic and unification grammar techniques to be utilized. Indeed, all these can be combined to construct a theory of illocutionary and metacommunicative interaction, one of the key areas of development for semantics in the early 21st century.

7. References

Aczel, Peter 1988. Non-Well-Founded Sets. Stanford, CA: CSLI Publications.
Aczel, Peter, David Israel, Stanley Peters & Yasuhiro Katagiri (eds.) 1993. Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications.
Aczel, Peter & Rachel Lunnon 1991. Universes and parameters. In: J. Barwise et al. (eds.). Situation Theory and Its Applications, II. Stanford, CA: CSLI Publications, 3-24.
Allen, James 1983. Maintaining knowledge about temporal intervals. Communications of the ACM 26, 832-843.
Asher, Nicholas 1993. Reference to Abstract Objects in English: A Philosophical Semantics for Natural Language Metaphysics. Dordrecht: Kluwer.
Austin, John L. 1970. Truth. In: J. Urmson & G. J. Warnock (eds.). Philosophical Papers. 2nd edn. Oxford: Oxford University Press, 117-133.
Barwise, Jon 1989a. Branch points in Situation Theory. In: J. Barwise. The Situation in Logic. Stanford, CA: CSLI Publications, 255-276.
Barwise, Jon 1989b. Scenes and other situations. In: J. Barwise. The Situation in Logic. Stanford, CA: CSLI Publications, 5-36.
Barwise, Jon 1989c. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & John Etchemendy 1987. The Liar: An Essay on Truth and Circularity. Oxford: Oxford University Press.
Barwise, Jon et al. (eds.) 1991. Situation Theory and Its Applications, II. Stanford, CA: CSLI Publications.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press.
Barwise, Jon & John Perry 1985. Shifting situations and shaken attitudes. Linguistics & Philosophy 8, 399-452.
Brown-Schmidt, Sarah & Michael K. Tanenhaus 2008. Real-time investigation of referential domains in unscripted conversation: A targeted language game approach. Cognitive Science: A Multidisciplinary Journal 32, 643-684.
Carpenter, Bob 1992. The Logic of Typed Feature Structures: With Applications to Unification Grammars, Logic Programs, and Constraint Resolution. Cambridge: Cambridge University Press.
Cooper, Robin 1993. Generalized quantifiers and resource situations. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 191-211.
Cooper, Robin 1996. The role of situations in Generalized Quantifiers. In: S. Lappin (ed.). Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 65-86.
Cooper, Robin 1998. Austinian propositions, Davidsonian events and perception complements. In: J. Ginzburg et al. (eds.). The Tbilisi Symposium on Logic, Language and Computation: Selected Papers. Stanford, CA: CSLI Publications, 19-34.
Cooper, Robin 2006a. Austinian truth in Martin-Löf Type Theory. Research on Language and Computation 3, 333-362.
Kamp, H. 1979. Events, instants and temporal reference. In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer, 376-417.
Kim, Yookyung 1998. Information articulation and truth conditions of existential sentences. Language and Information 1, 67-105.
Krahmer, Emiel & Paul Piwek 1999. Presupposition projection as proof construction. In: H. Bunt & R. Muskens (eds.). Computing Meaning: Current Issues in Computational Semantics. Dordrecht: Kluwer, 281-300.
Kratzer, Angelika 1989. An investigation of the lumps of thought. Linguistics & Philosophy 12, 607-653.
Kratzer, Angelika 2008. Situations in natural language semantics. In: E. N. Zalta (ed.). The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). http://plato.stanford.edu/entries/situation-semantics/. December 15, 2010.
Kripke, Saul 1975. Outline of a theory of truth. The Journal of Philosophy, 690-716.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel, 239-283.
Larsson, Staffan 2002. Issue-based Dialogue Management. Doctoral dissertation. University of Gothenburg.
Lewis, David K. 1979. Scorekeeping in a language game. In: R. Bäuerle, U. Egli & A. von Stechow (eds.). Semantics from Different Points of View. Berlin: Springer, 172-187.
McCawley, James D. 1979. Presupposition and discourse structure. In: C.-K. Oh & D. Dinneen (eds.). Presupposition. New York: Academic Press, 371-388.
McGee, Vann 1991. Review of J. Barwise & J. Etchemendy. The Liar (Oxford, 1987). Philosophical Review 100, 472-474.
Moss, Lawrence 1989. Review of J. Barwise & J. Etchemendy. The Liar (Oxford, 1987). Bulletin of the American Mathematical Society 20, 216-225.
Muskens, Reinhard 1989. Meaning and Partiality. Doctoral dissertation. University of Amsterdam. Reprinted: Stanford, CA: CSLI Publications, 1995.
Neale, Stephen 1988. Events and logical form. Linguistics & Philosophy 11, 303-321.
Peterson, Philip 1997. Fact, Proposition, Event. Dordrecht: Kluwer.
Poesio, Massimo 1993. A situation-theoretic formalization of definite description interpretation in plan elaboration dialogues. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 339-374.
Ranta, Aarne 1994. Type Theoretical Grammar. Oxford: Oxford University Press.
Ranta, Aarne 2004. Grammatical Framework. Journal of Functional Programming 14, 145-189.
Richard, Mark 1990. Propositional Attitudes: An Essay on Thoughts and How We Ascribe Them. Cambridge, MA: The MIT Press.
Russell, Bertrand 1905. On denoting. Mind 14, 479-493.
Schulz, Stephen 1993. Modal Situation Theory. In: P. Aczel et al. (eds.). Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications, 163-188.
Seligman, Jerry & Larry Moss 1997. Situation Theory. In: J. van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: North-Holland, 239-309.
Soames, Scott 1985. Lost innocence. Linguistics & Philosophy 8, 59-71.
Sundholm, Göran 1986. Proof theory and meaning. In: D. Gabbay & F. Guenthner (eds.). Handbook of Philosophical Logic. Oxford: Oxford University Press, 471-506.
Vendler, Zeno 1972. Res Cogitans. Ithaca, NY: Cornell University Press.
Vogel, Carl & Jonathan Ginzburg 1999. A situated theory of modality. Paper presented at the 3rd Tbilisi Symposium on Logic, Language, and Computation, Batumi, Republic of Georgia, September 1999.

Jonathan Ginzburg, Paris (France)
VIII. Theories of discourse semantics

36. Situation Semantics: From indexicality to metacommunicative interaction

1. Introduction
2. Desiderata for semantics
3. The Relational Theory of Meaning
4. Meaning, utterances, and dialogue
5. Closing remarks
6. References

Abstract

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. This article takes as its starting point the focus on utterance (as opposed to sentence) interpretation. The far-reaching aims Barwise and Perry proposed for semantic theory are spelled out. Barwise and Perry's Relational Theory of Meaning is described, in particular its emphasis on utterance situations and on the reification of information. The final part of the article explains how conceptual apparatus from situation semantics has ultimately come to play an important role in a highly challenging enterprise, modelling dialogue interaction, in particular metacommunicative interaction.

1. Introduction

Situation Semantics emerged in the 1980s with an ambitious program of reform for semantics, both in the domain of semantic ontology and with regard to the integration of context in meaning. In their 1983 book Situations and Attitudes (Barwise & Perry 1983), as well as a host of other publications from around that time collected in Barwise (1989) and Perry (2000), Barwise and Perry argued for the preeminence of a situation-based ontology and took contexts of utterance to be situations, thereby offering the potential for a richer view of context than was available previously. For situation semantics and ontology, see article 35 (Ginzburg) Situation Semantics and NL ontology.

This article takes as its starting point the focus on utterance (as opposed to sentence) interpretation. In section 2 I spell out the far-reaching aims Barwise and Perry proposed for semantic theory. In section 3 I sketch Barwise and Perry's Relational Theory of Meaning, in particular its emphasis on utterance situations and on the reification of information. I also point out some of the weaknesses of Barwise and Perry's enterprise, particularly the approach to context. One of these weaknesses, in my view, is that although the theory is quite powerful, it was largely applied to traditional, sentence-level semantics. The final section of this article, section 4, explains how conceptual apparatus from situation semantics has ultimately come to play an important role in a highly challenging enterprise, modelling dialogue interaction.

Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 852-872
36. Situation Semantics: From indexicality to metacommunicative interaction 853

2. Desiderata for semantics

Barwise and Perry's starting point is model theoretic semantics, as developed in the classical Montague Semantics tradition (see e.g. Montague 1974; Dowty, Wall & Peters 1981; Gamut 1991 and article 33 (Zimmermann) Model-theoretic semantics): a natural language is likened to a formal language (first order logic, intensional logic etc.). On this approach, providing a semantics for such a language involves primarily assigning interpretations (or denotations) to the words of the language and stating rules that allow phrases to be interpreted in a compositional manner. This allows both the productivity of NL meaning and the potential for various kinds of ambiguity to be explicated. Contexts, on this view, are identified with indices - tuples consisting of a small and fixed number of dimensions, prototypically providing values for speaker, addressee, time, and location. Interpretations of words/phrases are then all taken to be relative to contexts, thereby yielding two essential semantic entities: contents/interpretations, and characters/meanings, which arise by abstracting the indices away from contents/interpretations. These - supplemented by lexical meaning postulates - can be used to explicate logically valid inference.

Barwise and Perry view this picture of semantics as significantly too restrictive. The basic perspective they adopt is one in which linguistic understanding is assimilated to the extraction of information by resource-bounded agents in their natural environment (inspired in part by the work of Gibson, e.g. Gibson 1979). This drives their emphasis on a number of unorthodox-seeming fundamental desiderata for semantic theory, desiderata which, as we will subsequently see, find considerable resonance in the desiderata for a theory of meaning for conversational interaction. The first class of desiderata are metatheoretical in nature and can be summed up as follows:

Desideratum 1: The priority of information

Language has external significance, as model theoretic semantics has always emphasized, but, as cognitive scientists of various stripes emphasize, it also has mental significance, yielding information about agents' internal states. What is needed is a way of capturing the commonality between the external and the mental, the flow of information - the chain from fact to thought in one participant's mind to utterance to thought in another participant's mind, graphically exemplified in Fig. 36.1.

Fig. 36.1: The Flow of Information. (Barwise & Perry 1983: 17)
854 VIII. Theories of discourse semantics

An important component in fulfilling this desideratum, according to Barwise and Perry, is a theory by means of which external (and internal) reality can be represented - an ontology of some kind. This is what developed into situation theory and type theory with records (see article 35 (Ginzburg) Situation Semantics and NL ontology). A key ingredient in such a theory is some notion of constraint, a way of capturing necessary, natural, or conventional linkages between situations (e.g. smoke means fire, the image being such and such means the leg is broken, etc.), along with a theory of how agents in a situation extract information using constraints. The other crucial component is the naturalization of linguistic meanings - their reduction to concepts from the physical world - in terms of constraints.

The other two pivotal desiderata put forward by Barwise and Perry are more directly aimed at repositioning the semantic fulcrum, from interpretation towards context.

Desideratum 2: Information content is underdetermined by interpretation

We might provide news about the Argentinean elections using any of the sentences in (1). All three sentences uttered in these circumstances intuitively have the same external significance - we would wish to identify their content and, on some accounts, their meaning as well. Nonetheless, different information can be acquired from each: for instance, (1b) allows one to infer that Kirchner is a woman, whereas Lavagna is a man.

(1) a. Kirchner beat Lavagna.
    b. Señora Kirchner defeated Señor Lavagna.
    c. Cristina's losing opponent was Lavagna.

Desideratum 3: Language is an efficient medium

Barwise and Perry emphasize that the flip side of productivity gets less attention as a fundamental characteristic of NL: the possibility of reusing the same expression to say different things. Examples of the phenomena Barwise and Perry had in mind are in (2), and even in 2009 they are tricky. By 'tricky' I don't mean that we lack a diagnosis; I mean that there is no single formal and/or implemented semantic/pragmatic theory that picks them all off with ease, interfacing along the way with inter alia theories of gesture, gaze, and visual access.

(2) a. A: I'm right, you're wrong. B: I'm right, you're wrong.
    b. I want you, you, and you to stand here and I want you, you, and you to stand here. (based on examples in Levinson 1983; Pollard & Sag 1994)
    c. A: John is irritating John no end. B: He can't be annoying him so badly.
    d. In last week's FoLLI dissertation prize meeting sadly the linguist voted for the linguist, whereas the logician voted for the logician. (based on an example in Cooper 1996)

3. The Relational Theory of Meaning

At the heart of situation semantics is the Relational Theory of Meaning. There are two fundamentally innovative aspects underlying this theory, both of significant importance to current semantic theory in the wider sense:
36. Situation Semantics: From indexicality to metacommunicative interaction 855

(3) a. Meaning Reification: the reification of meanings as entities on which humans reason (rather than as metatheoretical entities, as standard in logic).
    b. Speech Events as Semantic Entities: recognition of speech events (incl. speakers, addressees, the speech token) as fundamental semantic units; sentences are viewed as derivative: type-like entities that emerge from utterances, or, as Barwise and Perry put it, uniformities over utterances.

To get a feel for the theory, consider a simple example. (4b), taken to be the meaning of (4a), is a crude representation of an early version of the Relational Theory of Meaning: a (declarative) meaning relates all utterance events u in which there exists a speaker a, addressee b, spatiotemporal locations l, t, referents j, m (for the names 'Jacky' and 'Molly' respectively) to described events e in which j bites m at l. This relation is exemplified graphically in Fig. 36.2, which emphasizes the reality of the utterance situation. I have purposely used quasi-Davidsonian notation (see article 34 (Maienborn) Event semantics) to indicate that the central insight there is independent of the various more and particularly less standard formalisms in which the Relational Theory of Meaning has been couched. As we will soon see, there are various, significantly different ways to cash out the characterization of u, e and their interrelation.

(4) a. Jacky bit Molly.
    b. { ⟨u,e⟩ | ∃a,b,l,j,m,t [uttering(a, 'Jacky is biting Molly', u) ∧ addressee(u,b) ∧ In(u,l) ∧ referring(a,j,'Jacky') ∧ Named(j,'Jacky') ∧ referring(a,m,'Molly') ∧ Named(m,'Molly') ∧ coincident(l,t) ∧ describing(a,e) ∧ bite(e,j,m,t)] }

Fig. 36.2: The meaning of 'Jacky is biting Molly' as a relation between situations in which this construction is uttered and events in which a Jacky bites a Molly. (Barwise & Perry 1983: 122)

Of the two assumptions, Speech Events as Semantic Entities was introduced by Barwise and Perry in a stronger form than (3), graphically exemplified in Fig. 36.2 - not only do they make reference to speech events, but Barwise and Perry actually posit a compositional aspect to speech events:

(5) a. If α is a phrase with sub-constituents X, Y, then uttering(a, α, u) entails the existence of two subevents u1, u2 of u such that
856 VIII. Theories of discourse semantics

    b. u1 < u2 (u1 temporally precedes u2)
    c. uttering(a, X, u1)
    d. uttering(a, Y, u2)

This formulation raises a variety of issues concerning syntax, presupposing essentially a strongly surface-oriented and linearized approach. For obvious reasons of space I cannot enter into these, but they constitute an important backdrop. Speech Events as Semantic Entities underlay a number of grammar fragments subsequent to Barwise & Perry (1983) (e.g. Gawron & Peters 1990; Cooper & Poesio 1994), but on the whole was not the focus of much interest until Poesio realized its significance for conversational processing, as we discuss in section 4.2. In contrast, issues concerning Meaning Reification drove much research in the heyday of Situation Semantics.

The relation exemplified in (4b) is certainly a relation with relevance to the semantics of utterances of 'Jacky is biting Molly': it relates events in which the speaker mouths a particular linguistic form while referring to a Jacky and a Molly with an event the speaker is describing in which that Jacky bit that Molly. Barwise and Perry view attunement to the constraint in (4) - the awareness of similarities between situations and of relationships that obtain between such similar situations - as being what underlies our competence to use and understand such utterances. Nonetheless, there are two aspects from which the formulation above abstracts away: contextual parameter instantiation and truth evaluation. (4) does not make explicit the fact that understanding such an utterance involves finding appropriate referents for the two NP sub-utterances, and indeed in certain circumstances - e.g. for an overhearer who cannot see the speech participants or hears a recording - for the speaker and the time. In fact, in the original formulation of the Relational Theory of Meaning Barwise and Perry made a point of not packaging all of context in one event/situation, but distinguished three components of context: (a) the discourse situation, comprising the public aspects of an utterance (including all the standard indexical parameters), (b) the speaker connections, comprising information pertaining to a speaker's referential intentions, and (c) resource situations, events/situations distinct from the described situation, used to serve as referential/quantificational domains. Although the discourse situation/speaker connection dichotomy does not seem to have survived - examples such as (2b) illustrate the importance of speaker intention even with 'simple indexicals' - the ultimate insight to be drawn here, it seems, is the unbounded nature of contextual dependence. Resource situations are one of the important contributions of situation semantics (see particularly Cooper 1996), and are further discussed in article 35 (Ginzburg) Situation Semantics and NL ontology.

Returning to (4), the formulation of the Relational Theory of Meaning as a relation between contextual situations (the discourse situation, speaker connections, zero or more resource situations) and described situations is problematic. It means that the latter cannot serve as the denotations of declarative utterances (since they are not truth bearers), nor does it generalize to non-declarative meaning. This reflects the fact that in Situations and Attitudes Barwise and Perry attempt to stick to an avowedly "concrete" ontology, one which eschews abstract entities such as propositions, leading them into various foundational problems.
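To make the shape of the Relational Theory of Meaning in (4b) concrete, here is a minimal Python sketch. It is not Barwise and Perry's formalism: the class names, fields and the toy connections dictionary are all hypothetical, introduced purely for illustration; the only thing the sketch is meant to capture is that a declarative meaning is a relation between utterance situations and described events.

# A minimal sketch of (4b), under simplifying assumptions: utterance
# situations and described events are plain records, and the meaning of
# 'Jacky is biting Molly' is the relation that holds between them. All
# names below (UtteranceSituation, connections, ...) are hypothetical.
from dataclasses import dataclass
from typing import Dict

@dataclass
class UtteranceSituation:
    speaker: str
    addressee: str
    location: str
    time: int
    connections: Dict[str, str]  # the speaker's referential connections for names

@dataclass
class DescribedEvent:
    relation: str  # e.g. 'bite'
    agent: str
    patient: str
    time: int

def meaning_jacky_is_biting_molly(u: UtteranceSituation, e: DescribedEvent) -> bool:
    """Holds iff u and e are related as in (4b): the individuals the speaker
    connects to 'Jacky' and 'Molly' are the biter and the bitten, and the
    described event is temporally coincident with the utterance."""
    j, m = u.connections.get('Jacky'), u.connections.get('Molly')
    return (j is not None and m is not None
            and e.relation == 'bite' and e.agent == j and e.patient == m
            and e.time == u.time)

# Different utterance situations pair with different described events:
u1 = UtteranceSituation('a', 'b', 'park', 5, {'Jacky': 'jacky1', 'Molly': 'molly3'})
e1 = DescribedEvent('bite', 'jacky1', 'molly3', 5)
assert meaning_jacky_is_biting_molly(u1, e1)

The design point the sketch tries to preserve is that the same linguistic form, paired with different utterance situations (different connections, different times), stands in the meaning relation to different described events - the efficiency property of Desideratum 3.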
This stance was abandoned soon after - various notions of propositions emerged as situation theory developed. Hence, in works such as Gawron & Peters (1990), Cooper & Poesio (1994), works from a maturer version of situation semantics, (declarative)
36. Situation Semantics: From indexicality to metacommunicative interaction 861

(rank: 30; 5% of turns). Clarification Requests (CRs) constitute approximately 4-5% of all utterances (see e.g. Purver 2004; Rodriguez & Schlangen 2004). Moreover, there is suggestive evidence from artificial life simulation studies that the existence of CRs is not an incidental feature of interaction but a key component in the long-term viability of a language. Macura & Ginzburg (2006) and Macura (2007) show that when repair acts are a part of a linguistic interaction system, a stable language can be maintained over generations, whereas in a community endowed with a language that lacks CRification, as I refer to the interaction brought about by a CR, the emergent divergence among language users is so high that the language eventually dies out.

Ignoring metacommunicative interaction, as has been the case for just about the entire tradition of formal semantics, means missing out on one of the basic building blocks of linguistic interaction. Situation Semantics was itself complicit in this. However, the view of language it provides, with its reference to speech events as part of the semantic domain and its reification of meanings, provides important building blocks for a theory of metacommunicative interaction.

How then to integrate metacommunicative aspects into the semantic process? Such phenomena have been studied extensively by psycholinguists and conversational analysts in terms of notions such as grounding and feedback (in the sense of Clark 1996 and Allwood 1995, respectively) and of repair (in the sense of Schegloff 1987). The main claim, which originates with Clark & Schaefer (1989), is that any dialogue move m made by A must be grounded (viz. acknowledged as understood) by the other conversational participant B before it enters the common ground; failing this, CRification must ensue. While Clark and Schaefer's assumption about grounding is somewhat too strong, as Allwood argues, it provides a starting point, indicating the need to interleave the potential for grounding/CRification incrementally, the size of the increments being an important empirical issue.

From a semantic theory we might expect the ability to generate concrete predictions about the forms/meanings of metacommunicative utterances in context. Such a characterization needs to cover both the range of possibilities associated with successful communication (grounding) and those associated with imperfect communication - indeed it has been argued that miscommunication is the more general case (see e.g. Healey 2008). Thus, we can suggest that the adequacy of a semantic theory involves the ability to characterize, for any utterance type, the update that emerges in the aftermath of successful grounding and the full range of possible CRs otherwise. This is, arguably, the early 21st century analogue of truth conditions. The update component of this criterion builds on earlier adequacy criteria that emerged from dynamic semantics frameworks (see article 38 (Dekker) Dynamic semantics). Nonetheless, these frameworks have abstracted away from metacommunication. I now consider two general approaches that strive to develop semantic theories capable of delivering grounding conditions/CRification potential.
The first approach, an extension of Discourse Representation Theory (DRT) (see article 37 (Kamp & Reyle) Discourse Representation Theory), aims at explicating inter alia the potential for acknowledgements and utterance-oriented presuppositions; the second approach, constructed from the start as a theory of dialogue per se, shows how to characterize CRification potential. A crucial assumption both approaches share, one that distinguishes them from other dynamic semantic work (e.g. Roberts 1996; Groenendijk 1998; Dekker 2004; Asher & Lascarides 2003), but one that seems inescapable if metacommunicative
36. Situation Semantics: From indexicality to metacommunicative interaction 863

the two utterances in (16a). The discourse entities ce1 and ce2 can serve as antecedents both of implicit anaphoric references, e.g. in the case of 'backward' acts like answers to questions, and of explicit ones. Consider (17): this may be viewed as performing at least two functions here: implicitly accepting the option proposed in ce1, and performing a query. Indeed, backward-looking acts (for the backward/forward-looking dialogue act dichotomy see Core & Allen 1997) such as accept are all implicitly anaphoric to a previous conversational event (ce1 in this case); hence the assumption that conversational events introduce discourse markers just like normal events do.

(17) a. A: We should send an engine to Avon. B: Shall we use engine E3?
     b. [ce1, ce2, ce3 | ce1: open-option(A,B,[x,w,e | engine(x), Avon(w), e: send(A,B,x,w)]),
         ce2: accept(B,ce1),
         ce3: ask(B,A,[y,e' | engine(y), E3(y), e': use(A,B,y)])]

In fact, as mentioned earlier, Poesio and Traum develop their theory on the basis of a strong and dynamicized version of Speech Events as Semantic Entities: an utterance is taken to be a sequence of micro-conversational events (MCEs). On this view, the discourse situation is updated not just when a complete sentence has been observed, but whenever a new event is observed. Psychological research suggests that such updates can take place every few milliseconds (Tanenhaus & Trueswell 1995), so that observing the utterance of a phoneme is sufficient to cause an update; but in practice PTT typically assumes that updates take place after every word. The incremental update hypothesis is motivated not just by psychological findings about incremental interpretation in sentential utterances, but by the fact that in dialogue many types of conversational acts are hardly, if ever, performed with full sentences. A class of non-sentential utterances that quite clearly lead to immediate updates of the discourse situation are those used to perform dialogue control acts such as take-turn, keep-turn and release-turn actions, whose function is to synchronize the two participants in the conversation as to who is holding the floor (Traum & Hinkelmann 1992), and acknowledgements. These conversational actions are sometimes performed by sentential utterances that also generate a core speech act (e.g., the second utterance in (17a)), but more commonly they are generated by single-word discourse markers like 'mmh', 'okay', 'well', 'now'.

In PTT, lexicon and grammar are formulated as defeasible rules characterizing the update potential of locutionary acts. The motivation for defeasibility includes psycholinguistic results about lexical access, e.g. work such as Onifer & Swinney (1981) demonstrating that conversationalists simultaneously access all meanings of ambiguous words. Lexical entries and syntactic rules link a precondition stated in terms of the phonological/syntactic characteristics of a micro-conversational event and a possible effect stated in terms of the possible meaning of that event. In particular, syntactic rules enable the construction of compound locutionary events, whose atomic constituents are the MCEs corresponding to utterances of individual words. Each locutionary act laᵢ sets up the potential for a subsequent illocutionary act ilᵢ, one of whose effects is to constitute an acknowledgement of laᵢ. This provides the basis for a treatment of grounding and dialogue control particles.
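As a rough illustration of the incremental picture just described - not PTT's actual defeasible-rule format - the following Python sketch consumes an utterance as a sequence of micro-conversational events and updates the discourse situation after every word; all names and the toy acknowledgement rule are assumptions made for the example.

# A toy model of word-by-word update over micro-conversational events
# (MCEs). The MCE/DiscourseSituation encodings are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class MCE:
    speaker: str
    word: str

@dataclass
class DiscourseSituation:
    events: List[MCE] = field(default_factory=list)
    turn_holder: str = ''

    def update(self, mce: MCE) -> None:
        """Every observed MCE enters the discourse situation immediately;
        single-word dialogue control particles take effect at once."""
        self.events.append(mce)
        self.turn_holder = mce.speaker
        if mce.word in {'okay', 'mmh'}:
            print(f"{mce.speaker} acknowledges the pending utterance")

ds = DiscourseSituation()
for w in "we should send an engine to Avon".split():
    ds.update(MCE('A', w))     # the situation grows word by word
ds.update(MCE('B', 'okay'))    # B grounds A's utterance with a particle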
I illustrate this for 'okay' in its use as an acknowledgement particle; PTT assumes that locutionary acts generate - here in a causal sense introduced by Goldman (1970) - core speech acts. The lexical entry could be specified, roughly, as in (18), where u represents a locutionary and ce an illocutionary act respectively:
866 VIII. Theories of discourse semantics

lexemes (e.g. 'yes', 'no'), through reprise fragments, can be analyzed as indexical expressions relative to the DGB. Context change is specified in terms of conversational rules, rules that specify the effects applicable to a DGB that satisfies certain preconditions. This allows both illocutionary effects to be modelled (preconditions for and effects of greeting, querying, assertion, parting etc.), interleaved with locutionary effects, our focus here.

In the immediate aftermath of the speech event u, Pending gets updated with a record of the form [sit = u, sit-type = Tu] (of type LocProp (locutionary proposition)). Here Tu is a grammatical type that emerges during the process of parsing u, as already exemplified above in (6). The relationship between u and Tu - describable in terms of the Austinian proposition pu = [sit = u, sit-type = Tu] - can be utilized in providing an analysis of grounding/CRification conditions:

(21) a. Grounding: pu is true: the utterance type fully classifies the utterance token.
     b. CRification: Tu is weak (e.g. incomplete word recognition); u is incompletely specified (e.g. incomplete contextual resolution).

Thus, pending utterances are the locus off of which to read grounding/CR conditions. Without saying much more, we can formulate a lexical entry for CR particles like 'eh?' (Purver 2004). Given a context that supplies a speaker, an addressee and a pending utterance, the content expressed is a question querying the intended content of that utterance:

(22) [ phon : eh
       cat = interjection : syncat
       c-params : [ spkr : IND
                    addr : IND
                    pending : utt
                    c2 : address(addr, spkr, pending) ]
       cont = Ask(c-params.spkr, c-params.addr,
                  λx.Mean(c-params.addr, c-params.pending, x)) : IllocProp ]

(22) is straightforward apart from one point - what is the type utt? This actually is a fundamental semantic issue, one which, as we will see, responds to the underdetermination of information by interpretation desideratum raised in section 2: what is the semantic type of pending? In other words, what information needs to be associated with pending to enable the formulation of grounding conditions/CR potential? The requisite information needs to be such that it enables the original speaker to interpret and recognize the coherence of the range of possible clarification queries that the original addressee might make.
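A toy rendering may help fix intuitions about how a contextual check of the kind in (22) could work. The encoding below - records as Python dictionaries, types as field-to-type maps - is a drastic simplification introduced only for illustration, not TTR itself; the field names mirror (22).

# A toy rendering of the contextual check behind (22): 'eh?' demands a
# context record with a speaker, an addressee and a pending utterance,
# and its content is a question about what that utterance meant.
def is_of_type(record: dict, record_type: dict) -> bool:
    """A record belongs to a record type if it has every field the type
    requires, with a value of the required (Python) type."""
    return all(k in record and isinstance(record[k], t)
               for k, t in record_type.items())

# c-params type for 'eh?': spkr, addr, and a pending (locutionary) utterance
EH_CPARAMS_TYPE = {'spkr': str, 'addr': str, 'pending': dict}

def eh_content(c_params: dict):
    if not is_of_type(c_params, EH_CPARAMS_TYPE):
        raise ValueError("'eh?' infelicitous: no pending utterance in context")
    # content: a question querying the intended content of the pending utterance
    return ('Ask', c_params['spkr'], c_params['addr'],
            ('Mean', c_params['addr'], c_params['pending'], '?x'))

ctx = {'spkr': 'B', 'addr': 'A', 'pending': {'phon': 'Jacky bit Molly'}}
print(eh_content(ctx))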
872 VIII. Theories of discourse semantics

Tanenhaus, Michael & John Trueswell 1995. Sentence comprehension. In: J. Miller & P. Eimas (eds.). Handbook of Perception and Cognition, vol. 11: Speech, Language and Communication. New York: Academic Press, 217-262.
Traum, David & Elizabeth Hinkelmann 1992. Conversation acts in task-oriented spoken dialogue. Computational Intelligence 8, 575-599.

Jonathan Ginzburg, Paris (France)

37. Discourse Representation Theory

1. Introduction
2. DRT at work
3. Presupposition and binding
4. Binding in DRT
5. Lexicon and inference
6. Extensions
7. Direct reference and anchors
8. Coverage, extensions of the framework, implementations
9. References

Abstract

Discourse Representation Theory (DRT) originated from the desire to account for aspects of linguistic meaning that have to do with the connections between sentences in a discourse or text (as opposed to the meanings that individual sentences have in isolation). The general framework it proposes is dynamic: the semantic contribution that a sentence makes to a discourse or text is analysed as its contribution to the semantic representation - Discourse Representation Structure or DRS - that has already been constructed for the sentences preceding it. Interpretation is thus described as a transformation process which turns DRSs into other (as a rule more informative) DRSs, and meaning is explicated in terms of the canons that govern the construction of DRSs. DRT's emphasis on semantic representations distinguishes it from other dynamic frameworks (such as the Dynamic Predicate Logic and Dynamic Montague Grammar developed by Groenendijk and Stokhof, and numerous variants of those). DRT is - both in its conception and in the details of its implementation - a theory of semantic representation, or logical form. The selection of topics for this survey reflects our view of what are the most important contributions of DRT to natural language semantics (as opposed to philosophy or artificial intelligence).

1. Introduction

1.1. Origins

The origins of Discourse Representation Theory (DRT) had to do with the semantic connection between adjacent sentences in discourse. Its starting point was the analysis of tense,

Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 872-923
37. Discourse Representation Theory 873

and more specifically the question of how to define the different roles of the Imperfect (Imp) and the Simple Past (PS) in French. The semantic effects these tenses produce are often visible through the links they establish between the sentences in which they occur and the sentences preceding them. A telling example, which has become something of a prototype for the sentence-linking role of tenses, is the following. (1) is the original example, in French; (2), its translation into English, establishes the same point.

(1) Quand Alain ouvrit (PS) les yeux, il vit (PS) sa femme qui était (Imp) debout près de son lit.
    a. Elle lui sourit. (PS)
    b. Elle lui souriait. (Imp)

(2) When Alain opened his eyes he saw his wife who was standing by his bed.
    a. She smiled.
    b. She was smiling.

The difference between (1a) and (1b) is striking: the PS-sentence in (1a) is understood as describing the reaction of Alain's wife to his waking up, the Imp-sentence in (1b) as describing a state of affairs that already holds at the time when Alain opens his eyes and sees her: the very first thing he sees is his smiling wife.

In the late seventies the study of the tenses in French and other languages led to the conviction that their discourse-linking properties are an essential aspect of their meaning, and an effort got under way to formulate interpretation rules for different tenses that make their linking roles explicit. In 1980 came the awareness that the mechanisms which account for the inter-sentential connections that are established by tenses can also be invoked to explain the inter- and intra-sentential links between pronouns and their anaphoric antecedents. (An analogy in the spirit of Partee 1973, but within the realm of anaphora rather than deixis.) DRT was the result of working out the details for a small fragment dealing with sentence-internal and -external anaphora. This first fragment (Kamp 1981a) dealt only with pronominal anaphora, but a treatment of temporal anaphora, which offered an analysis of, among others, the anaphoric properties of PS and Imp, followed in the same year (Kamp 1981b).

The theory presented in Kamp (1981a) proved to be equivalent to the independently developed File Change Semantics of Heim, which became available to a general audience at roughly the same time (Heim 1982). However, DRT and FCS were inspired by different intentions from the start, a difference that became more pronounced with the advent of Groenendijk and Stokhof's Dynamic Semantics (Groenendijk & Stokhof 1991; Groenendijk & Stokhof 1990). Dynamic Semantics in the spirit of Groenendijk and Stokhof followed the lead of FCS, not DRT (Barwise & Perry 1983; Rooth 1987). From the very beginning one of the strong motivations of DRT was the desire to capture certain features of the way in which interpretations of sentences, texts and discourses are represented in the mind of the interpreter, including features that cannot be recaptured from the truth conditions that the chosen interpretation determines. This representational aspect of DRT was at first seen by some as a drawback, viz. as an unwelcome deviation from the emphatically anti-psychologistic methods and philosophy of Montague Grammar (Montague 1973; Montague 1970a; Groenendijk & Stokhof 1990);

874 VIII. Theories of discourse semantics

but with time this resistance appears to have lessened, largely because of the growing trend to see linguistics as a branch of cognitive science. (How good the representations of DRT are from a cognitive perspective, i.e. how much they tell us about the way in which humans represent information - or at least how they represent the information that is conveyed to them through language - is another matter, and one about which the last word has not been said.)

In the course of the 1980s the scope of DRT was extended to the full paradigm of tense forms in French and English, as well as to a range of temporal adverbials, to anaphoric plural pronouns and other plural NPs, to the representation of propositional attitudes and attitude reports, i.e. sentences and bits of text that describe the propositional attitudes of an agent or agents (Kamp 1990; Asher 1986; Kamp 2003), and to ellipsis (Asher 1993; Lerner & Pinkal 1995; Hardt 1992). The nineties saw, besides extension and consolidation of the applications mentioned, a theory of lexical meaning compatible with the general principles of DRT (Kamp & Roßdeutscher 1992; Kamp & Roßdeutscher 1994a), and an account of presupposition (van der Sandt 1992; Beaver 1992; Beaver 1997; Beaver 2004; Geurts 1994; Geurts 1999; Geurts & van der Sandt 1999; Kamp 2001a; Kamp 2001b; van Genabith, Kamp & Reyle 2010). The nineties also saw the beginnings of two important extensions of DRT that have become theories in their own right and with their own names, U(nderspecified) DRT (Reyle 1993) and S(egmented) DRT (Asher 1993; Lascarides & Asher 1993; Asher & Lascarides 2003). SDRT would require a chapter of its own and we will only say a very few words about it here; UDRT will be discussed (all too briefly) in Section 8.2. In the first decade of the present century DRT was extended to cover focus-background structure (Kamp 2004; Riester 2008; Riester & Kamp 2010) and the treatment of various types of indefinites (Bende-Farkas & Kamp 2001; Farkas & de Swart 2003).

2. DRT at work

In this section we show in some detail how DRT deals with one of the examples that motivated its development.

2.1. Tense in texts

As noted in Section 1, the starting point for DRT was an attempt in the late seventies to come to grips with certain problems in the theory of tense and aspect. In the sixties and early seventies formal research into the ways in which natural languages express temporal information had been dominated by temporal logics of the kind that had been developed from the fifties onwards, starting with the work of Prior and others (Prior 1967; Kamp 1968; Vlach 1973). It became increasingly clear, however, that there were aspects to the way in which temporal information is handled in natural languages which neither the original Priorean logics nor later extensions of them could handle. One of the challenges that tenses present to semantic theory is to determine how they combine temporal and aspectual information and how those two kinds of information interact in the links that tenses establish between their own sentences and the ones preceding them. (1) and (2) are striking examples of this challenge. Here we will look at a pair of slightly simplified discourses which illustrate the same point.
874 VIII. Theories of discourse semantics but with time this resistance appears to have lessened, largely because of the growing trend to see linguistics as a branch of cognitive science. (How good the representations of DRT are from a cognitive perspective, i.e. how much they tell us about the way in which humans represent information - or at least how they represent the information that is conveyed to them through language - is another matter, and one about which the last word has not been said.) In the course of the 1980s the scope of DRT was extended to the full paradigm of tense forms in French and English, as well as to a range of temporal adverbials, to anaphoric plural pronouns and other plural NPs, the representation of propositional attitudes and attitude reports, i.e. sentences and bits of text that describe the propositional attitudes of an agent or agents (Kamp 1990; Ashcr 1986; Kamp 2003), and to ellipsis (Ashcr 1993; Lerner & Pinkal 1995;Hardt 1992.) The nineties saw,besides extension and consolidation of the applications mentioned, a theory of lexical meaning compatible with the general principles of DRT (Kamp & RoBdeutscher 1992; Kamp & RoBdeutscher 1994a), and an account of presupposition (van der Sandt 1992; Beaver 1992; Beaver 1997; Beaver 2004; Geurts 1994; Geurts 1999; Geurts & van der Sandt 1999; Kamp 2001a; Kamp 2001b; van Genabith, Kamp & Reyle 2010). The nineties also saw the beginnings of two important extensions of DRT that have become theories in their own right and with their own names, U(nderspecified) DRT (Reyle 1993) and S(egmented) DRT (Asher 1993; Lascarides & Asher 1993; Asher & Lascarides 2003). SDRT would require a chapter on its own and wc will only say a very few words about it here; UDRT will be discussed (all too briefly) in Section 8.2. In the first decade of the present century DRT was extended to cover focus-background structure (Kamp 2004; Riester 2008; Riester & Kamp 2010) and the treatment of various types of indefinites (Bende-Farkas & Kamp 2001;Farkas& de Swart 2003). 2. DRT at work In this section we show in some detail how DRT deals with one of the examples that motivated its development. 2.1. Tense in texts As noted in Section 1, the starting point for DRT was an attempt in the late seventies to come to grips with certain problems in the theory of tense and aspect. In the sixties and early seventies formal research into the ways in which natural languages express temporal information had been dominated by temporal logics of the kind that had been developed from the fifties onwards, starting with the work of Prior and others (Prior 1967; Kamp 1968; Vlach 1973). It became increasingly clear, however, that there were aspects to the way in which temporal information is handled in natural languages which neither the original Priorcan logics nor later extensions of them could handle. One of the challenges that tenses present to semantic theory is to determine how they combine temporal and aspectual information and how those two kinds of information interact in the links that tenses establish between their own sentences and the ones preceding them. (1) and (2) are striking examples of this challenge. Here we will look at a pair of slightly simplified discourses which illustrate the same point.
37. Discourse Representation Theory (3) Alain woke up. a. His wife smiled. b. His wife was smiling. We will assume that (3a) and (3b) are related to the first sentence of (3) in the same way as the second sentences of (1) and (2) are related to the first sentences there: in the case of (3b) Alain's wife was smiling when Alain opened his eyes, in (3a) she smiled as a reaction to that.These respective interpretations may not be as compelling as they are in the case of (2) or (1), but they are there and it is these readings for which we are now going to construct semantic representations. We assume that the first sentence has the syntactic structure given in (4). (4) Note that this structure makes the assumption, familiar from syntactic theories based on the work of Chomsky, that the interpretation provided by tense is located at a node T high up in the sentence tree and above the one containing the verb. We will see presently what implications this has for the construction of a semantic representation. We assume that the construction is bottom up (unlike in the first explicit formulations of DRT (Kamp 1981a; Kamp & Reyle 1993) where it is top down, and which for many years were treated as a kind of standard in DRT). Before we describe the construction procedure, we show the resulting DRS in (5), so that the reader will have an idea of what we are working towards. We will then describe the procedure in more detail. (5) t<n Alain (x) ect e:wake_up'(x)
VIII. Theories of discourse semantics

The one remaining task is the binding of t. In the present case, where the S node we are dealing with is the S node of a discourse-initial main clause, binding of the discourse referent t is existential, which once again amounts to transferring the discourse referent from its store to the universe of the DRS adjacent to it. The resulting representation is the one that was already shown as (5).

We now turn to the two second sentences (a) and (b) of (3). This is where the special features of DRT, which concern the semantic relations between successive sentences in a discourse, come into prominence. We start with sentence (3b). To avoid unhelpful complications we treat the form be smiling as a single verb, which serves to describe states: the condition s: be_smiling'(x) means that s is a state to the effect that x is smiling. Construction of the semantic representation of (3b) proceeds in much the same way as that of the first sentence of (3). The only differences have to do with the temporal location of the state s, with the representation of the subject phrase his wife and with the binding of the temporally locating discourse referent t'. The first difference manifests itself in the transition from VP to TP. We assume that states are located at times in the sense that the time is one at or during which the state holds; that is, the location relation between t' and s is t' ⊆ s (rather than the converse relation which was assumed for events). So we get as representation at this construction stage:

(10) [DRS diagram not reproduced: the representation of (3b) at this stage, locating the state s at the time t' via the condition t' ⊆ s]

The representation of the subject phrase his wife differs from that of Alain in several respects. First, the discourse referent introduced as referential argument of wife' must be a fresh discourse referent (i.e. one that is distinct from the discourse referents that have been previously introduced into the representation of the discourse), so that no clashes can occur between the referential roles they are meant to play. In particular, the new
The referential argument is not represented by a separate phrase but introduced by the predicate word wife itself; it becomes the referent of the complete phrase of which the noun wife is the lexical head - here the DP Alain's wife. The non-relational noun woman only has a referential argument, which for instance can be expressed by a containing DP like the woman. Wc write woman(y) and wifc(y,z) rather than y:woman or y:wife-of(z), which might have been expected given what has just been said in connection with verbs.This arguably is a slight inconsistency of notation, but it is harmless and it has become standard practice within DRT, so we will stick to it
37. Discourse Representation Theory

of the event of the first sentence - as in John went to see his mother. He took the bus. (And these are just some of the various possibilities.) The interactions between rhetorical relations and temporal relations have been a central issue in SDRT (Segmented Discourse Representation Theory; see Lascarides & Asher 1993; Asher & Lascarides 2003), and in fact they were an important impetus to the development of that approach. They are a topic that we can do no more than mention here; the reader is referred to the just mentioned publications and to further work cited in Asher & Lascarides (2003).

The rhetorical relation between (3a) and the first sentence of (3) is called Narration in SDRT (as well as in some other accounts of discourse relations, cf. Mann & Thompson 1988, as well as article 39 (Zeevat) Rhetorical relations). The Narration relation between two successive sentences is one of the relations which entail that the event of the second sentence temporally follows that of the first. We implement the implications of this for DRS construction by assuming that the same relation holds between the location times t and t' of these events and use for this the condition t < t'. This time the joint representation of the two sentences together has the form given in (15).

(15)  t  e  x  t'  e'  x'  u
      Alain(x)
      t < n
      e ⊆ t
      e: wake_up'(x)
      wife'(x', u)
      t' < n
      e' ⊆ t'
      t < t'
      u = x
      e': smile'(x')

We have dealt with these two examples at a level of detail that some readers may have thought excessive, and inappropriate for a handbook. We have done so for two reasons. First, the DRS-construction principles we have discussed reflect some of the fundamental assumptions of DRT about the nature of verbal and nominal predication. These are intimately connected with the principles of sentence-external and -internal discourse connection for which DRT has long been primarily known and to which much that will be said in this article will be devoted. Without the details we have gone into in our discussion of (3) it would have been much more difficult to explain these more specifically dynamic aspects of DRT.

Through the work of van der Sandt and Geurts starting in the late eighties (van der Sandt 1992; van der Sandt & Geurts 1991; Geurts 1994; Geurts 1999), the basic architecture of DRT underwent a fundamental change. One part of that change is illustrated by the two-step procedure we have just used in dealing with the second sentences of (3): first a preliminary representation was constructed solely on the basis of the sentence itself, and then the issues that this construction had left unsettled were resolved on the basis of the context representation provided by the antecedent part of the discourse. (In our example this was just the representation of the first sentence of (3).)
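The DRS format and the merge operation used to form (14) and (15) are simple enough to state computationally. The following Python sketch is a minimal encoding under stated assumptions: conditions are flat tuples, and the utterance time n is treated as a given constant; real DRT conditions (and the construction procedure itself) are of course richer.

# A minimal encoding of DRSs as pairs <U, Con>, with the merge operation
# described above (union of universes, union of condition sets) and the
# properness check that distinguishes (13) from (14). The condition
# format (predicate, arg, ...) is an assumption made for illustration.
from dataclasses import dataclass, field
from typing import Set, Tuple

Condition = Tuple[str, ...]  # e.g. ('wake_up', 'e', 'x') or ('<', 't', 'n')

@dataclass
class DRS:
    universe: Set[str] = field(default_factory=set)
    conditions: Set[Condition] = field(default_factory=set)

    def merge(self, other: 'DRS') -> 'DRS':
        return DRS(self.universe | other.universe,
                   self.conditions | other.conditions)

    def is_proper(self) -> bool:
        """Proper iff every discourse referent used in a condition occurs
        in the universe ('n', the utterance time, counts as given)."""
        used = {arg for cond in self.conditions for arg in cond[1:]}
        return used <= self.universe | {'n'}

# The first sentence of (3) and the improper DRS (13):
k1 = DRS({'t', 'e', 'x'},
         {('Alain', 'x'), ('<', 't', 'n'), ('incl', 'e', 't'),
          ('wake_up', 'e', 'x')})
k13 = DRS({"t'", 's', "x'", 'u'},
          {('wife', "x'", 'u'), ('<', "t'", 'n'), ('incl', "t'", 's'),
           ('=', "t'", 't'), ('=', 'u', 'x'), ('be_smiling', 's', "x'")})

assert not k13.is_proper()        # x and t occur in conditions but not in U
assert k1.merge(k13).is_proper()  # the merge (14) is proper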
(6) wake up verb nom e x Selectional Restrictions: event animal Semantic Representation: e:wake_up'(x) This entry says that wake up is a verb, that its referential argument is an event (see the Selection Restrictions), that wake up has one non-referential argument, which must
VIII. Theories of discourse semantics The one remaining task is the binding of t. In the present case, where the S node we are dealing with is the S node of a discourse-initial main clause, binding of the discourse referent t is existential, which once again amounts to transferring the discourse referent from its store to the universe of the DRS adjacent to it. The resulting representation is the one that was already shown as (5). We now turn to the two second sentences (a) and (b) of (3). This is where the special features of DRT, which concern the semantic relations between successive sentences in a discourse, come into prominence. We start with sentence (3b). To avoid unhelpful complications we treat the form be smiling as a single verb, which serves to describe states: the condition s: be_smiling'(x) means that s is a state to the effect that x is smiling. Construction of the semantic representation of (3b) proceeds in much the same way as that of the first sentence of (3). The only differences have to do with the temporal location of the state s, with the representation of the subject phrase his wife and with the binding of the temporally locating discourse referent t'.The first difference manifests itself in the transition from VP to TP. We assume that states are located at times in the sense that the time is one at or during which the state holds; that is, the location relation between t' and s is t' c s (rather than the converse relation which was assumed for events). So we get as representation at this construction stage: (10) The representation of the subject phrase his wife differs from that of Alain in several respects. First, the discourse referent introduced as referential argument of wife' must be a fresh discourse referent (i.e. one that is distinct from the discourse referents that have been previously introduced into the representation of the discourse) so that no clashes can occur between the referential roles they are meant to play. In particular, the new
882 VIII. Theories of discourse semantics (13) t' s x' u wife'(x', u) t'<n t'cs t' = t u = s:be_smiling'(x') The identification of t' with t also amounts to a kind of anaphora resolution. In fact, past tenses in non-initial sentences of a discourse tend to be anaphoric in this sort of way. It is this anaphoric dimension which enables them to contribute to the temporal connectedness of texts and discourses, something that the present examples are meant to illustrate. Note that (13) is not a proper DRS in the sense that some of the discourse referents occurring in its conditions (viz. x and t) do not occur in its universe. Such improper DRSs do not have well-defined denotations in the sense we specified earlier: embeddings of their universes do not determine the satisfaction of all of their conditions.The improper- ness of (13) disappears, however, when it is merged with the representation of the first sentence (by forming the union of their universes and the union of their condition sets). The result of the merge is given in (14). (14) t e x t' s x' u Alain (x) t < : t e:wake_up' (x) wife'(x',u) t' < n t' c s t' = t u = x s:be_smiling'(x') (14) is the joint representation of the two sentences of (3b). It is a proper DRS and thus has a denotation and a truth value in every model. The representation construction for (3a) is in most respects like that for (3b). The only differences have to do with the fact that (3a) functions as an event description, and not, like (3b), as a state description.This difference shows up, first, in that the verb smile introduces an event discourse referent e' (and not a state discourse referent) and, second, in that the temporal relation established between e' and e is different from that between s and e in the case of (3b). Intuitively, wc saw, the relationship between c' and c is that c' follows c. But how does this relationship get established? Although this has been the subject of discussion for several decades, it continues to be a topic for ongoing research and debate. But this much is clear: How successions of event sentences are interpreted depends very much on the rhetorical relations between them, and these vary. For instance, the event described inthesecondoftwoconnectedsentencesinadiscourseissometimesunderstoodasareacfioft to the event described in the first sentence - the natural interpretation in the case of (3a) - but in other cases it can also be understood as the cause of that event - the most salient interpretation for John fell. Bill pushed him; or it can be understood as an elaboration
37. Discourse Representation Theory of the event of the first sentence - as in John went to see his mother. He took the bus. (And these are just some of the various possibilities.) The interactions between rhetorical relations and temporal relations have been a central issue in SDRT (Segmented Discourse Representation Theory; see Lascarides & Asher 1993; Asher & Lascarides 2003), and in fact, they were an important impetus to the development of that approach. They are a topic that we can do no more than mention here; the reader is referred to the just mentioned publications and to further work cited in Asher & Lascarides (2003). The rhetorical relation between (3a) and the first sentence of (3) is called Narration in SDRT (as well as in some other accounts of discourse relations, cf. Mann & Thompson 1988, as well as article 39 (Zeevat) Rhetorical relations). The Narration relation between two successive sentences is one of the relations which entail that the event of the second sentence temporally follows that of the first. We implement the implications of this for DRS construction by assuming that the same relation holds between the location times t and t' of these events and use for this the condition t < t'. This time the joint representation of the two sentences together has the form given in (15). (15) t e x t' e' x' u Alain (x) t <n e c t e:wake_up'(x) wife'(x',u) |t'< n e'et' t<t' u = e':smile'(x') We have dealt with these two examples at a level of detail that some readers may have thought excessive, and inappropriate for a handbook. We have done so for two reasons. First, the DRS-construction principles we have discussed reflect some of the fundamental assumptions of DRT about the nature of verbal and nominal predication.These are intimately connected with the principles of sentence-external and -internal discourse connection for which DRT has long been primarily known and to which much that will be said in this article will be devoted. Without the details wc have gone into in our discussion of (3) it would have been much more difficult to explain these more specifically dynamic aspects of DRT. Through the work of van der Sandt and Geurts starting in the late eighties (van der Sandt 1992; van der Sandt & Geurts 1991; Geurts 1994; Geurts 1999), the basic architecture of DRT underwent a fundamental change. One part of that change is illustrated by the two-step procedure wc have just used in dealing with the second sentences of (3): first a preliminary representation was constructed solely on the basis of the sentence itself, and then the issues that this construction had left unsettled were resolved on the basis of the context representation provided by the antecedent part of the discourse. (In our example this was just the representation of the first sentence of (3).)
884 VIII. Theories of discourse semantics What is still missing, though, is a precise articulation of the steps that provide the links between the two sentence representations. A more detailed proposal for how such links are established will be discussed below in Section 3. 2.2. Donkey sentences and complex DRS conditions Before we get to that discussion we will first have a look at examples, given in (16) and (17), of the two types of donkey sentences which were used in Geach (1962) to illustrate the puzzle known as the donkey sentence problem or the donkey pronoun problem. The early development of DRT, and its initial reception, were closely connected with this problem. (16) If Alain owns a donkey he beats it. (17) Every farmer who owns a donkey beats it. In both these sentences the pronoun it is understood as referring back to the indefinite DP a donkey. The methods that are suggested by classical predicate logic for constructing logical forms for these sentences, in which pronouns are treated as bound variables and indefinite DPs as existential quantifiers, have trouble with such sentences - their logical forms come out either as ill-formed or they give the wrong truth conditions. The two apparent options for (16) are shown in (18) and (19). (18) By(donkey'(y) a own'(a,y)) -» beat'(a,y) (19) By(donkey'(y) a (own'(a,y) —► beat'(a,y)) In (18) the final occurrence of y is not bound, so we have an open formula, which doesn't determine any definite truth conditions.This problem does not arise for (19),but the truth conditions of this formula are clearly not those of (16).The solution of DRT (Heim 1982; Kamp 1981a) was based on a conceptual parallel between the sentences in (16) and (17) and a two sentence discourse like (20). (20) Alain owns a donkey. He beats it. In (20) the pronouns he and it of the second sentence are anaphoric to Alain and a donkey in the first sentence in the same way that his is anaphoric to Alain in (1) and (3); and a DRT-based interpretation of (20) will establish these anaphoric links in the same way in which we assumed the link was established between his and Alain in the treatment we have presented of (3b).The point about (16) and (17) is that here too the same linking mechanism is involved. Let us focus first on (16).This sentence has the form of a conditional "if A then B".The DRT conception of a conditional is this: the antecedent A functions as the description of a certain type of situation, and the conditional as a whole has the function to express that if a situation is of this type then it is also of the type described by the consequent B. And since it is about situations of the type described by A that the claim is made that they also satisfy the description given by B, it is
37. Discourse Representation Theory 887 model M and X is a set of discourse referents, g 2t f means that Dom(g) = Dom(f) u X. Moreover, when K is a DRS, then g ^K f is short for g 3t f, where X = U^. Definition 2 (Verification of complex DRS conditions) Let f be a function from discourse referents to elements of the model M, and let K, and K2 be DRSs. (a) f verifies -iK, in M if there is no extension g 3k, f which verifies all the conditions of K, in M. (b) f verifies K, => K2 in M if every extension g ^k, f which verifies all the conditions of K, in M has an extension h 3^ which verifies all the conditions of K2 in M. (c) f verifies K, v K2 in M if either there is an extension gl ^k, f which verifies all the conditions of K, in M or there is an extension g2 3k2S which verifies all the conditions of K2 in M. /K (d) [verifies K, / \K2in JWif every extension g2i/f, where U = Uk, u{a},such that g verifies all the conditions of K{ in M has an extension h 3k2 g which verifies all the conditions of K2 in M. DRS languages which include the operators mentioned in Def. 1 (with the semantics given for them in Def. 2) have at least the expressive power of first order predicate logic. As in standard formulations of first order logic the set of operators is redundant - some operators in the set can be expressed with the help of others. But for the given set of DRS operators the redundancy is more extreme. Since existential quantification and conjunction are built into the structure of DRSs, just adding -i already suffices to express all the remaining classical operators. It should be stressed, however, that the operators of extensional logic cover only a small part of the operator-like constructions that are found in natural languages and that play a crucial part in making natural languages the powerful and flexible instruments of expression and communication they are. In order to cover more of these operations DRS languages have been extended repeatedly as time went on. Some such extensions will be discussed in Section 6. We return to the discussion of the sentences in (16) and (17). The point of departure for that discussion was the analogy between those sentences and discourses like (20): The discourse context that the first sentence (20) provides for the second sentence of (20) is provided in the conditional sentence (16) by the conditional's antecedent for the conditional's consequent. A similar analogy applies to (17), where it is the restrictor of the quantifier that provides a discourse context for the nuclear scope.The DRS for (17) is given in (22).
37. Discourse Representation Theory 889 fact, there were two new perspectives, each of them focussed more on the structure and use of natural language than on the properties of logical calculi. The first, best known through the writings of Stalnaker (Stalnaker 1972; Stalnaker 1974; Stalnaker 1979), is the view that presupposition is a pragmatic phenomenon, in the following sense. When we use language to communicate we always assume that there is much information we already share with those we are trying to communicate with; without such a supporting Common Ground it would be virtually impossible to convey any information concisely and clearly. So it is a natural and ubiquitous feature of verbal communication that much information is presupposed without which we couldn't express ourselves as concisely as we want to. The second perspective - sometimes somewhat unfortunately referred to as "semantic" - emphasises the conventional aspect of presupposition: Certain words and grammatical constructions come with presuppositions in the sense that no utterance in which these words or constructions occur can be considered felicitous unless the context in which the utterance is made verifies the presuppositions they introduce. A speaker who uses such words or constructions cannot help but make the corresponding presuppositions, the Common Ground of which she assumes that it obtains between her and her interlocutors must verify those presuppositions. Historically this second view, according to which linguistic presuppositions are conventionally associated with their triggers, is primarily connected with the names of Langendoen and Savin, Kiparsky, and Karttunen (Langendoen & Savin 1971; Kiparsky & Kiparski 1970; Karttuncn 1973; Karttunen 1974). (Sec also article 91 (Beaver & Gcurts) Presupposition.) It is this second perspective that motivates the treatment of presupposition within DRT. With the semantic perspective came the awareness that presupposition is much more widespread than had previously been assumed. Presuppositions are not only triggered by definite descriptions and other singular terms, but also by many other words and constructions. Indeed, when investigations inspired by this new awareness got under way, a significant part of the work consisted in identifying the presupposition triggers that can be found in English and other natural languages; and even today that search is far from over. The central challenge that this way of looking at presupposition was soon seen to present for the theoretical linguist, however, was the Projection Problem (Karttunen 1973). Karttunen noted that presuppositions carried by simple sentences often disappear when these sentences arc embedded within more complex ones. Our intuitions about the truth and felicity conditions of simple sentences give us reliable information about which words and constructions are presupposition triggers, and about what presuppositions they trigger;but sometimes, when the triggers occur in logically complex sentences, these very presuppositions seem to be absent nonetheless. To give an example: the word again carries the presupposition that an event or state of the kind described by the clause containing it happened or obtained at some time preceding the event or state that is being described. Thus compare the utterances (23a) and (23b). (23) a. He will come. b. He will come again. c. He won't come again. d. Will he come again?
pronoun is different in form from the presupposition imposed by a word like again, but that is as far as the difference goes.

In van Genabith, Kamp & Reyle (to appear) a distinction is made between anaphoric and non-anaphoric presuppositions. Anaphoric presuppositions always involve anaphoric discourse referents and impose on the context the constraint that it supplies discourse referents which can serve as antecedents for those. The anaphoric discourse referent is then interpreted as standing to its antecedent in a certain relation; often this relation is coreference, but not always (not e.g. in cases of temporal anaphora). Non-anaphoric presuppositions are presuppositions in the sense of presupposition theory "before DRT" and act as entailment requirements: they express presuppositions that the context is required to entail. (As a matter of fact, both these types of presuppositions can be seen as special cases of a more general form of anaphora cum presupposition: a constraint to the effect that the context provides one or more discourse referents for which it entails certain properties and/or relations; arguably van der Sandt's theory can be interpreted as advocating this generalised notion.) Here we follow the version of presupposition theory presented in van Genabith, Kamp & Reyle (to appear). In this version anaphoric pronouns are treated as triggers of presuppositions of the form ⟨x?, C₁(x), ..., Cₙ(x)⟩, where x is the discourse referent representing the pronoun and C₁, ..., Cₙ are conditions connected with the pronoun in question (e.g. a condition to the effect that the referent must be a person of the female sex) and as before the question mark behind the discourse referent x serves as indicator that the context must provide an antecedent discourse referent with which x is to be identified.

With this new treatment of anaphoric pronouns the representation of (23b) will no longer require a store. (However, we will see below that stores are still needed.) Instead the two pronoun occurrences now each contribute a presupposition. The new representation, which replaces (25b), is given in (26), rendered here in linear form.

(26) ⟨ { ⟨x?, person(x), male(x)⟩ },
       [t, e | yesterday(t), t < n, e ⊆ t, e: come′(x)] ⟩
     ⟨ { ⟨x′?, person(x′), male(x′)⟩, ⟨t′, e′ | t′ < n, e′ ⊆ t′, e′: come′(x′)⟩ },
       [t₂, e₂ | n < t₂, e₂ ⊆ t₂, e₂: come′(x′)] ⟩

Resolution of the first pronoun presupposition is possible only if a suitable discourse referent is provided by the utterance context. In such a context the second pronoun presupposition can then also be resolved to that discourse referent. This will lead to
identification of x′ with x. And once this identification has been made, the antecedent DRS will entail the again-presupposition.

(26) is a typical example of a preliminary sentence representation, in which all anaphoric and presuppositional constraints that parts of the sentence impose on their respective contexts have been explicitly represented in the form of anaphoric and non-anaphoric presuppositions.

One of the unsolved issues in presupposition theory is accommodation. It often happens that a speaker utters a sentence which carries a presupposition that is neither entailed by some local context provided by the sentence itself nor by what the interpreter has reason to assume is the global context (or Common Ground; we do not make a distinction in this article between global context and Common Ground). In such cases the interpreter will normally accommodate the presupposition by adjusting the global context so that it satisfies the constraint that the presupposition imposes. Apparently - thus runs the implicit reasoning behind accommodation - the speaker is assuming a global context distinct from the one I was assuming myself, for otherwise she would not have used the sentence she did use (Beaver 1997; Beaver 2002). (A problem arises in those cases where the necessary adjustments conflict with some of the interpreter's beliefs: should he give these up or should he rather dismiss the speaker's utterance as infelicitous? How recipients resolve conflicts of this sort can depend on all sorts of factors. The presupposition theory of DRT has nothing to say about this.)

It should be stressed that accommodation of a presupposition need not take the form of simply adopting the presupposition itself as a new context constituent. Often the recipient will assume that the Common Ground contains additional information that exceeds what is needed to entail a presupposition, simply because that is the more plausible adjustment. And on the other hand it is often the case that the accommodated information need not entail the presupposition on its own but only in combination with information that is already present, say in the local context where the presupposition can be resolved. In such cases the accommodated information may be weaker than the presupposition (or the two may be logically independent). The first of these points has been emphasised by Beaver and the second in Kamp & Roßdeutscher (1994a), which proposes the term presupposition justification to cover all cases of (a) verification of the presupposition by the context as is, (b) wholesale accommodation and (c) satisfaction by a combination of old and new (that is, accommodated) contextual information.

A further intriguing feature of presupposition justification is that the given discourse context often points to a particular accommodation that will guarantee presupposition verification; and the force of this pointing may be so powerful that it confers upon the indicated accommodation the apparent status of a valid inference (Kamp 2001b). One famous illustration of this last effect is an example due to Kripke (Kripke 2009), which once more involves the presupposition trigger again. Compare the sentences in (27).

(27) a. We won't have pizza on John's birthday if we are going to have pizza on Mary's birthday.
     b. We won't have pizza again on John's birthday if we are going to have pizza on Mary's birthday.
     c. We already had pizza on Billie's birthday last week.
So we won't have pizza again on John's birthday if we are (also) going to have pizza on Mary's birthday.
location times can be bound in each of the three ways listed above for nominal discourse referents. In our examples so far we have only encountered structural binding of location times. But instances of the other two types are also found. Quantificational binding of location times occurs in the presence of temporal adverbs like always and mostly. These adverbs give rise to temporal duplex conditions and introduce location time discourse referents that are bound by the quantifiers which the adverbs introduce (see Reyle, Roßdeutscher & Kamp 2008 for details). There are also cases where location times are bound presuppositionally. In (29a) the temporal relative clause ten minutes after Bill called her serves to locate the event described by the main clause Mary left the house: the relative clause introduces a location time for the calling event, and this time in turn serves to locate the event described by the main clause.

(29) a. Mary left the house ten minutes after Bill called her.
     b. Mary didn't leave the house ten minutes after Bill called her.

That both the event of Bill calling and its location time have a presuppositional status is shown by (29b), the negation of (29a). The most prominent interpretation of this sentence is one according to which Bill did call Mary at some time t and ten minutes after t Mary did not leave the house. (An interpretation according to which the time at which Mary left the house wasn't ten minutes after the time when Bill called because Bill didn't call seems possible only in quite special contexts.) This indicates that both the event discourse referent of the relative clause and the discourse referent for its location time are bound presuppositionally.

5. Lexicon and inference

The meanings of the sentences we utter and the texts we write are a function of the meanings of the words from which they are built. So any account of linguistic meaning must start with the meanings of words. And that entails that any semantic theory must include a lexicon. In fact, the lexicon is arguably the most demanding part of semantics, on the one hand because languages have so many words, and on the other because capturing the meaning of even a single word - or crafting a satisfactory semantic entry for it - can be very hard. How hard will depend on what purpose the entry is supposed to serve. But we have already hinted at what that purpose should be: the semantics of words that is specified by their entries must be such that the meanings of sentences and texts can be computed from these entries in a systematic, compositional fashion.

From the start this has been the guiding principle for lexicon development within DRT (see in particular Kamp & Roßdeutscher 1994a). And here it takes the following form: The semantic specifications given in the lexical entries of words should be such that they can be imported into the semantic representations of the sentences and texts that can be constructed out of those words, and these representations should capture the meanings of the represented texts and sentences in the specific sense that they support the inferences that human interpreters are able to draw from those texts and sentences. In DRT the problem of semantic lexical specification is thus tied directly to the problem of inference: Suppose that T is a text and that S is a sentence which an interpreter
would naturally infer (or be able to infer) from T. Then the DRS for T should entail the DRS for S and our theory should allow us to show this. And whether the theory can do that will in almost all cases depend on what the lexical entries for the words that occur in T and S contribute to their representations.

Although the development of a DRT-based lexicon has been a concern for many years, there still is a long way to go. In particular, current DRT-based lexicons lack the broad coverage that is indispensable in automated natural language processing. Even so, existing results exceed by a wide margin what could be reported within the limits of this article. We have therefore decided to focus on some of the methodological problems of semantic lexicon design as they present themselves within DRT, instead of presenting an inevitably fragmentary overview of the proposals for the entries of particular words and word classes. (For more on lexical semantics within a DRT-based framework see Kamp & Roßdeutscher 1994a; Kamp & Roßdeutscher 1994b; Roßdeutscher 1994; Roßdeutscher 2000; Solstad 2007, and for a more recent perspective Lechler & Roßdeutscher 2009; Hamm & Kamp 2009.)

Our first encounter with the lexicon was in Section 2 when we looked at the DRS construction for the sentences in (3). DRS construction always starts with the insertion of semantic representations for the words of the sentence for which a DRS is being built. This presupposes that the lexical entries for those words provide the representations that are suited for this task. For verbs, we saw, this minimally involves two requirements: the entry must (i) provide a semantic representation with slots for the verb's arguments and (ii) contain information that is needed by the syntactic parser (which provides the input trees to the DRS construction algorithm) to link these slots to argument phrases in the sentence. But these are just the minimal requirements. In addition we would also like lexical entries to provide semantic information that will make it possible to deduce from DRSs for sentences or bits of discourse those conclusions that are readily available to a human interpreter. On this point the one entry we have so far presented - that for the verb wake up, given in (6) and repeated here as (30) - seems to score pretty close to the bottom of the scale: To simply use a DRS predicate wake_up′ as semantics for the verb wake up doesn't tell us very much about what wake up means, or what inferences it supports.

(30) wake up (verb)
                    e         x (nom)
     Sel.Restr.:    event     animal
     Sem.Repr.:     e: wake_up′(x)

Of course there is little that can be predicted about the inferential power of an entire system from just one entry. Most of the inferences that we draw from what we hear or read have to do with the semantic connections between words and not just with the meanings of single words. To take one, very simple example: from the information that Alain woke up at t we can infer that he was awake immediately after t and that he was asleep or unconscious immediately before. It is conceivable that even with the simplistic entry in (30) these inferences could be computed as long as enough is said, in some part of the overall theory, about the connections between the predicate wake_up′ and the
The second principle has to do with the punctual character of the time specification 10 o'clock. Intuitively such a time t cannot extend beyond the duration of an event e happening at that time. This entails that if e happens at t and s is a state that right-abuts e, then s must include some initial segment of the period that stretches from t into the future. Using "punct" to represent the notion of punctuality that is conveyed by expressions like 10 o'clock, we can formalise this principle as in (38). With the help of (37) and (38), (36) can be modified into (39), from which (34) can be obtained by renaming discourse referents and thinning (i.e. throwing away discourse referents and conditions).

(39) [t, t′, t″, e, x, s | t < n, 10-o'clock(t′), t ⊆ t′, e ⊆ t, res(s, e), e ⊃⊂ s, t″ ⊆ s, Alain(x), s: awake′(x)]

To infer the DRS for (33c) from that of (33a) a little more is involved. We now need in addition: (i) entries for asleep and unconscious (which are like (34)) and the meaning postulate (32); (ii) a general principle to the effect that if a state s of a certain type is the result state of an event e, then immediately before e there was a state of the opposite type, i.e. a state s′ which is characterised by the negation of the condition that characterises s. (Often s′ and s are referred to as the prestate and poststate of e.) The prestate of an event e will hold at the moment e begins. Sometimes it will last throughout e, but it may also cease before that, in which case the gap between it and the result state will be bridged by some transitional state. However, if we assume that for each state s and interval t included in the duration of s there is a state s′ of the same type as s and with t as its duration, then the Prestate Principle can be formulated simply as saying that there is a state which abuts e on the left and is of a type opposite to e's result state.

Note that the Prestate Principle has the status of making explicit a presupposition. Compare for instance the sentences Bill left the room and Bill didn't leave the room. For both of these the salient interpretation is that according to which, at the time in question, Bill is inside the room. For some reason the question is raised whether Bill left the room at that time, and the two sentences resolve that issue in opposite ways - the one says that Bill did leave the room and the other that he remained in it. (The negated sentence can be used in relation to a situation in which Bill is not in the room at t, as in Bill didn't leave the room for the simple reason that he wasn't in the room to begin with at that time. But
such contexts are marked, and they typically involve information which denies that the prestate held at the relevant time.)

The upshot of this is that the general principle according to which any combination of an event e and a result state s implies the existence of a corresponding prestate s′ must take the form of adding to the representation of event e and result state s a presupposition saying that e is left-abutted by a state whose characterisation is the opposite of that of s. (40) gives the template for adding prestate presuppositions to arbitrary representations of this kind (i.e. to arbitrary representations introduced by change-of-state verbs). K is a schematic letter for DRSs, K′ for a DRS or a DRS-condition.

(40)

To infer the DRS for (33c) from the DRS (36) for (33a) we first expand the latter by an application of (40) in which K′ is instantiated by the condition awake′(x). ((41) shows this instantiation of the second term in (40).) The condition that the event of x waking up was immediately preceded by a state to the effect that x was not awake is incorporated into (36) as a presupposition (see (42)). After the presupposition of (42) has been justified (possibly through accommodation), it is available as non-presuppositional information and can be merged with the rest of (42), see (43). At this point we can apply (32) to replace the characterisation awake′(x) of the prestate s₀ by asleep′(x) ∨ unconscious′(x). Moreover, we can also apply to the state s₀ a principle that is the mirror image of principle (38): if t″ is punctual, e ⊆ t″, and s₀ ⊃⊂ e, then s₀ includes some period of time t‴ that left-abuts t″. With this last application a DRS has been obtained from which the DRS for (33c) can be derived by renaming and thinning.

(42) ⟨ [s₀ | t ⊆ s₀, s₀: ¬awake′(x), s₀ ⊃⊂ e],
       [t, e, x, t′, s | t < n, 10-o'clock(t′), e ⊆ t, Alain(x), res(s, e), s: awake′(x)] ⟩
the application of these postulates to the premise representation that the inference rules of classical formal logic came into play. In the examples we have looked at the uses we have made of such rules have been very elementary, and in this respect our examples have been good illustrations of what is involved in such inferences generally: The real action - or what there is of it - in such inference processes is in selecting the postulates that make it possible to transform the premise DRS in the ways required. Since the number of meaning postulates that are part of knowing a language is very large, choosing the right ones for a given inferential task could well be a non-trivial problem. It seems however that this is something at which language users are remarkably adept. So, inasmuch as knowing meaning postulates and making the right choices from the set of meaning postulates one knows is arguably part of our linguistic competence, the conclusion would seem to be that our inferential abilities depend as much on our knowledge of language as on a language-independent capacity for abstract formal reasoning.

We conclude this discussion with a word of caution. The inferences reconstructed above were strict, deductively valid inferences. But most of the inferences people draw in the day to day business of real life aren't like that; they are approximate and defeasible. Lexical entries should be such that they can support such inferences too. However, our understanding of how defeasible inferencing works is still fragmentary and tentative. And so long as our grasp of this part of the general problem of inference remains as feeble as it is today, there is no way of telling how well the lexical semantics we propose will stand up in the longer run.

6. Extensions

6.1. Plurals

Natural languages such as English mark the distinction between singular and plural. Plural nouns are as common as singular ones. This in itself would be reason enough to demand of a theory dealing with the semantics of such a language that it can handle plurals as well as singulars. But in the case of DRT there is a special reason for wanting to cover plurals and not only singulars, which has to do with the original motivations for the theory. As noted earlier, one of the main selling points of early DRT was the way it treats the anaphoric relation between pronouns and their indefinite antecedents. A crucial ingredient of that account is the assumption that the interpretation of an anaphoric pronoun always involves a discourse referent that is either made available by the DRS of the preceding discourse or by the DRS of the sentence itself. It is to this discourse referent that the pronoun is then related (often, though not invariably, in the sense of coreference). Plural pronouns, however, do not behave in strict accordance with this principle. They are often interpreted as referring back to elements of the discourse context that are not (yet) represented by discourse referents and that must first be constructed from material that is explicitly contained in the representation. Our first task in this section will be to review the relevant observations about the behaviour of plural pronouns and DRT's account of them. But the extension of DRT to cover plurals also has another important aspect: By extending DRT so that plurals are covered as well as singulars we move from first to second order logic. This point is not as obvious as it might seem, and requires careful discussion.
But it is methodologically
important. We will discuss the two issues - plural anaphora and the expressive power of the extended DRS language - in that order.

Many plural DPs denote sets with two or more members. (Not invariably, see Kamp & Reyle (1993, Ch. 4); but in this brief review we will confine ourselves to DPs for which this is the case.) This applies also to anaphoric plural pronouns. But often the sets that these pronouns refer to are not explicitly represented in the discourse representation that should supply the pronoun's antecedent - not, at least, when that representation has been constructed from the antecedent discourse along the lines sketched in Section 2. An example is (44).

(44) Alan took his wife out to dinner. They shared the hors d'oeuvre.

Here the pronoun they in the second sentence clearly refers to the pair consisting of Alan and his wife. But if the DRS for the first sentence is constructed according to the standard rules of DRS construction, then it will have no discourse referent representing this pair; it will only have discourse referents - x and y, say - that represent Alan and his wife separately. To obtain a discourse referent that can serve as antecedent for they we have to apply the operation of Summation. This application takes the form of introducing a new plural discourse referent Z and adding the condition Z = x ⊕ y, which defines Z as the sum of x and y.

The possibility of constructing discourse referents that can serve as antecedents for plural pronouns is of methodological interest because it is subject to certain restrictions. This is illustrated by (45).

(45) Half of the shareholders were present at the meeting. They learned about what had been said and decided at the meeting only from the newspapers the next morning.

In this sentence they can only refer to the half of the shareholders that were at the meeting, even if that interpretation makes little sense in the given context. The interpretation that would make more sense - the one according to which they refers to the other half, who were not at the meeting - is blocked: subtraction of one set from another is not an operation that is available for creating pronominal antecedents. (It has been argued that subtraction is possible in certain circumstances, though speakers vary in their judgements about such cases. For extensive discussion of this issue see Nouwen 2003 and Kibble 1997.)

Creation of pronoun antecedents is thus not just a matter of logical inference from the DRS representing the given discourse context. Only certain operations, such as the Summation operation that was used to synthesise the Z in constructing a representation for (44), are permitted. (For more details see Kamp & Reyle 1993.) Moreover, the restrictions to which this process is subject, and which distinguish it from standard logical deduction, are specific to the interpretation of pronouns: There is no difficulty in referring to the half of the shareholders that were not at the meeting, provided we use a definite description. (In fact, we just did; and in (45) the phrase the other half would have done as well.)

The upshot of these considerations: There are a number of principles that can be used to construct antecedents for plural pronouns from material that is explicitly present in the given DRS, but these principles do not exhaust the full power of logical inference; a schematic sketch of the contrast follows below.
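The contrast can be made concrete with a small illustration. The following Python fragment is our own schematic sketch (all names in it are invented for the example, not part of any DRT implementation): it treats the plural antecedents constructible from a DRS universe as exactly the Summation-closure of the explicitly represented referents, so that a 'complement' antecedent of the kind the blocked reading of (45) would require is never generated.

    from itertools import combinations

    def summation_antecedents(universe):
        """All plural discourse referents obtainable by Summation alone:
        sums (modelled here as frozensets) of two or more referents that
        are explicitly represented in the DRS universe."""
        out = set()
        for n in range(2, len(universe) + 1):
            for combo in combinations(sorted(universe), n):
                out.add(frozenset(combo))
        return out

    # (44): x = Alan, y = his wife
    print(summation_antecedents({"x", "y"}))   # {frozenset({'x', 'y'})}

    # Note what is absent: no operation of the form A - B (set subtraction)
    # is on offer, so the antecedent needed for the blocked reading of
    # 'they' in (45) - the shareholders minus those at the meeting -
    # cannot be constructed, however easily it could be inferred logically.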
Moreover, these principles, with their built-in restrictions, apply not only within the sentence but equally to cases of inter-sentential anaphora. And as we just saw, they are specific to pronouns; anaphoric definite descriptions, for instance, are not subject to them. The methodological interest of these observations is that what we are seeing here is a distinctive property of some particular linguistic category (plural pronouns), one which applies not just to what happens within single sentences, but equally to multi-sentential discourse. This is a case, in other words, where the grammatical properties of a certain type of expression exert their influence beyond the sentences in which they occur.

The restrictions on the construction of plural pronoun antecedents illustrated in (45) find a parallel in the famous "ball" example due to Partee:

(46) a. One of the ten balls is not in the bag. It/the missing ball is under the sofa.
     b. Nine of the ten balls are in the bag. *It/the missing ball is under the sofa.

Getting the intended antecedent for it in (46b) by subtracting the mentioned set of nine balls (the balls in the bag) from the mentioned set of ten balls (the set of all balls) doesn't work, or at any rate not very well.

This gets us to a question that imposes itself once it has been accepted that the antecedents of plural pronouns may be constructed using a certain range of special purpose principles: How does this account of plural anaphoric pronouns relate to the one for singular pronouns that was outlined in Section 2? The simplest relationship would be that of subsumption: If we accept that singular pronouns must always refer to single individuals (or singleton sets, the distinction doesn't matter here), then the operations that permit antecedent construction will be otiose; they cannot create a discourse referent for a single individual, unless they start from a discourse referent that belongs to the representation already and which they then simply return.

As it turns out, things are not quite that simple. One of the operations that can be used to construct antecedents for plural pronouns is Abstraction. It takes a nominal predication represented within the DRS as input and returns its extension as output. An example is (47), in which Abstraction can be applied to the predicate friend that Alan had invited and returns as output a discourse referent for the extension of this predicate which can then serve as antecedent for the pronoun they in the second sentence.

(47) No friend that Alan had invited came to his party. They were too busy.

If the rules that can be used to produce antecedents for plural pronouns are in principle available for singular pronouns as well, couldn't Abstraction also yield antecedents for singular pronouns, in those cases where the predicate to which it is applied has a unique instance? As it turns out, the evidence suggests that this is indeed the case. A variant of an often quoted example is (48).

(48) It is not true that there is no rabbi officiating at this wedding. He is standing over there, right next to the buffet.

One way to account for the felicity of this discourse is to construe the antecedent of he as the result of applying Abstraction to the predicate rabbi officiating at this wedding together with the general assumption that the number of officiating rabbis at a wedding
Both (50b) and the sentence (50a) are known to be essentially second order (meaning that their truth conditions cannot be captured by any first order statement) and to yield, when conjoined with (representations of) first order sentences that express the familiar properties of addition and multiplication, a non-axiomatisable theory. This establishes the non-axiomatisability of the logic expressible in a fragment of English with plurals, and likewise of a DRS-language which has plural discourse referents and just a handful of predicates (among which ∈ and the *-predicate that occurs in (50b)). DRS languages which licence DRSs like (50b) (such as e.g. the DRS-language assumed in Ch. 4 of Kamp & Reyle 1993) thus have the power of second order logic. (Versions of the argument given here have been known for several decades. A seminal paper on the topic is Boolos 1984, in which a very similar method to the one used here is attributed to David Kaplan.) Attempts to find natural and independently motivated characterisations of sublanguages of such DRS languages which retain the ability of representing a significant portion of English sentences with plurals, but whose logic is axiomatisable (and which must therefore exclude DRSs like (50b)), have thus far been without success, and at this point it seems doubtful that such sublanguages can be found.

6.2. Intensionality

Only some natural language constructions that build clauses out of other clauses are extensional. The majority are intensional, in the sense that the denotation of the resulting clause is a function of the intensions of the clause or clauses from which it is constructed, and not of their truth values. Modifying DRT so that it can handle intensional constructions is straightforward in principle. All that is needed for this is that we replace the extensional models we have been considering up to now by intensional models. We can take these intensional models to be bundles of extensional models indexed by possible worlds. A simple formalisation of this notion is as a pair 𝓜 = ⟨W, M⟩, where W is a nonempty set of possible worlds and M is a function from W to extensional models. In relation to such intensional models 𝓜 we can define the denotation of a DRS K in 𝓜, ⟦K⟧_𝓜, to be the set of all pairs ⟨w, f⟩ where w ∈ W and f is an embedding of K into the universe of M(w) which verifies the conditions of K in M(w); and the proposition expressed by K in 𝓜 can be defined as the set {w | there is an f such that ⟨w, f⟩ ∈ ⟦K⟧_𝓜}. In analogous ways we can also define the intensions in a model 𝓜 that are determined by expressions other than sentences. The details are straightforward.

For this to be of any practical use in linguistic analysis we of course also need linguistically motivated predicates and functors that take intensions as arguments. Here it is useful to distinguish between modal and attitudinal predicates and functors, even if it is not possible to draw a sharp line between these categories. As far as modal notions are concerned, there is (to our knowledge) not much work that has been done towards their analysis within a specifically DR-theoretical context.
But any of the existing proposals for the analysis of modal words or constructions (including counterfactuals and other conditionals) can be incorporated into DRS-languages once intensional models have been adopted, since all further structure that might be needed (such as, for instance, various Kripkean accessibility relations between worlds) can be imposed on these models. The analysis of propositional attitudes is another matter. The account of propositional attitudes and of the operators (believes, intends, etc.) that are used in natural language to describe them differs significantly from the paradigm provided by modal logic, and,
6.3. Propositional attitudes

The classical treatment of belief, knowledge and other propositional attitudes as it has come to us through the work of Carnap, Montague, Kaplan, Lewis and others treats the objects of the attitudes - that which is believed, known, etc. - as intensions, as defined in the last subsection. There is one major objection to this kind of analysis, often referred to as the problem of logical omniscience: intensions do not allow for distinctions that are sufficiently fine-grained. In particular, any two sentences that are logically equivalent express the same proposition (i.e. propositional intension). An account that takes intensions to be the objects of belief cannot explain how a person could stand in a relation of belief to a sentence S while failing to stand in that relation to a sentence S′ in cases where S and S′ happen to be logically equivalent, but where this is not so easy to see and the person in question hasn't seen it. If the person professes belief in S but denies belief in S′, then all the theory could say is that she must either be wrong about S or about S′. There is wide (if not universal) agreement that this conclusion is incompatible with the actual aetiology of belief.

To obtain an ontology of attitudinal objects that is immune to this objection Asher (Asher 1986; Asher 1993) offers a DRT-based account in which the objects of the attitudes are identified as equivalence classes of DRSs. The equivalence relation that generates these classes is defined in terms of the structural properties of DRSs and is substantially tighter than the relation of logical equivalence in classical logic. This approach has the merit of providing an explicit definition of intensional identity that escapes the problem of logical omniscience (at least in its most obvious and unacceptable manifestations). A potential drawback is that it is hard to see which relation of structural equivalence between DRSs gives us the intuitively correct notion of intensional identity. (The problem seems to be in part that intensional identity varies with context.)

A second DRT-based approach, related to Asher's, but different at certain points, was first outlined in Kamp (1990) and subsequently developed in explicit formal detail in Kamp (2003). (See also Kamp 2006 and van Genabith, Kamp & Reyle (to appear).) Here the emphasis is on the development of a representation formalism that is capable of representing not just single propositional attitudes but also combinations of attitudes involving different attitudinal modes - e.g. the combination of a belief and a desire, or of a belief and a doubt. Moreover, it can represent such combinations of two or more attitudes not only for a single agent at a single time, but also combinations of attitudes that a person entertains at different times, or that are held by different persons at the same or at different times. The basic construct of this extension of DRT is a predicate 'Att' whose first argument slot is for the bearer of the represented attitudinal state, while the second slot is for a representation of the attitudinal state that is being ascribed to him. (There is also a third slot, which is reserved for connections between discourse referents occurring in the second slot and objects in the world to which these are anchored, so that these discourse referents function as directly referring terms. For ease of exposition we ignore this slot until further notice; but see the next subsection.)
The representations that fill the second slot of 'Att' are sets of pairs consisting of an attitudinal mode indicator (such as BEL for belief, DES for desire, etc.) and a DRS specifying the propositional content of the attitude represented by the pair. An important feature of this way of representing attitudinal states is that the representations which occur as second members of the pairs may share discourse referents
to capture the effects of direct reference. Let us return to example (57), and focus on the belief it attributes to Alain, while forgetting about the desire. Let z′ and y′ be discourse referents that we, as attributors of the belief, use to represent the road and the coin on which the belief we attribute is targeted. Then we can use the formalism of the last section to express that the attributed belief is (doubly) singular - with respect to z and the road and with respect to y and the coin - by inserting in the third argument position of 'Att' external anchors for y and z. These external anchors can be given the simple form of ordered pairs ⟨z, z′⟩ and ⟨y, y′⟩, where z′ and y′ are again discourse referents, and for the term we insert into the third slot of 'Att' we can use the canonical way of denoting the set consisting of these two anchors, i.e. {⟨z, z′⟩, ⟨y, y′⟩}. If we now also add z′ and y′ to the universe of the main DRS of (57), then we get a DRS which says that there are two entities (represented by z′ and y′) such that Alain has a belief that is singular with respect to these entities. The model-theoretic semantics for 'Att' guarantees that the contents of the attributed attitudes are singular propositions in the sense defined above.

We have assumed that in order to entertain a proposition that is singular with respect to an object an agent must (i) stand in some non-representational relation to that object (representational relations are inherently descriptive, yielding general, not singular propositional content), and (ii) take himself to stand in such a relation to the object. Thus the discourse referent x he uses to represent the object must on the one hand stand in a non-representational relation to the object (parasitic on the non-representational relation between the object and the agent himself). And on the other hand x must have conditions associated with it at the level of internal representation which express that x is non-representationally connected with the object in the relevant way. A condition, or set of conditions, which expresses that a certain discourse referent is non-representationally connected to the object it represents is called an internal anchor.

Sometimes an agent will have an internally anchored discourse referent for which there is no corresponding external anchor. These are cases where the agent is under some kind of illusion, as when he thinks he is seeing a coin in the middle of the road, but in reality there is nothing there, neither a coin nor some other object that he mistakes for a coin. Such representations are deficient; they presuppose the existence of an external anchor, but the presupposition fails to be true. Strictly speaking such representations have no well-defined propositional content - the content tries to be that of a singular proposition, but doesn't succeed. On the other hand an external anchor for a discourse referent of an internal representation for which there is no matching internal anchor will be irrelevant to propositional content; this content would be the same as it would be if the external anchor didn't exist.

These contributions of DRT to questions of direct reference depend essentially on extending its representation formalism with the predicate 'Att' discussed in the last section.
What we said in the concluding paragraph of that section also applies to the contributions to the theory of direct reference discussed in this one: Direct reference, according to the view presented here, is in the first instance an aspect of thought; thoughts - beliefs, desires, etc. - can be directly referential by virtue of involving representations with anchored discourse referents. Utterances can be directly referential as well, but only by virtue of expressing directly referential thoughts. The general shape of the 'Att' construct at work here is sketched schematically below.
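By way of illustration only, here is a small data-structure rendering of the 'Att' format (a sketch under our own assumptions; the class names, field names and example conditions are invented and are not part of the DRT formalism itself): the second slot holds a set of mode-indicator/DRS pairs, and the third slot holds external anchors of the kind ⟨z, z′⟩, ⟨y, y′⟩ used above for (57).

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DRS:
        universe: frozenset     # discourse referents
        conditions: frozenset   # DRS conditions, given here as plain strings

    @dataclass(frozen=True)
    class Attitude:
        mode: str               # attitudinal mode indicator: "BEL", "DES", ...
        content: DRS            # DRS specifying the propositional content

    @dataclass(frozen=True)
    class Att:
        bearer: str             # first slot: the agent the state is ascribed to
        state: frozenset        # second slot: set of Attitude pairs
        anchors: frozenset      # third slot: external anchors <internal, external>

    # Alain's belief from (57), made doubly singular via external anchors
    # (the conditions below are invented stand-ins for the real content):
    belief = Attitude("BEL", DRS(frozenset({"y", "z"}),
                                 frozenset({"coin(y)", "road(z)", "on(y,z)"})))
    alains_state = Att("Alain", frozenset({belief}),
                       frozenset({("z", "z'"), ("y", "y'")}))
    print(alains_state.anchors)   # {('z', "z'"), ('y', "y'")}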
8. Coverage, extensions of the framework, implementations

8.1. Coverage

The preceding sections have given some impression of the linguistic scope of DRT: Singular and plural DPs of various types; the behaviour of tenses and temporal adverbs, as well as certain aspect operators; presuppositions of any kind; certain modal and intensional operators (the aspect operators among them); propositional attitudes and direct reference. Among contributions to these topics that make a significant use of DRT but have not yet been mentioned are Stirling (1993) on switch reference, Farkas & de Swart (2003) on incorporation, de Swart (1991) and Reyle, Roßdeutscher & Kamp (2008) on temporal quantification, and de Swart & Molendijk (1999) on the interaction between tense and negation. There are also natural language phenomena that so far have not been mentioned at all, but where there is substantial DRT-based work. We mention two: (i) Ellipsis (Hardt 1992; Klein 1987; Asher, Hardt & Busquets 1997; Bos 1993) and (ii) Information Structure (Kamp 2004; Riester 2008).

8.2. Extensions: SDRT and UDRT

DRT has led to a number of extensions that differ from it substantially and that have come to be known by their own names. The best known of these are S(egmented)DRT and U(nderspecified)DRT. Each of these would need a separate contribution. Rather than compromising by attempting a presentation that would be far too brief to do them justice we limit ourselves here to a mere statement of their goals and a few references.

a. SDRT (Asher & Lascarides 1993, 2003): SDRT extends DRT in that it uses DRSs as building blocks of more complex structures in which rhetorical and other discourse relations between the successive sentences that make up a discourse or text are analysed as relations between DRSs representing those sentences. This adds an additional level of discourse structure, with its own principles of structure and computation. SDRT has developed into what is arguably the most sophisticated current approach to some of the most challenging questions on the borderline between semantics and pragmatics.

b. UDRT (Reyle 1993; Reyle 1996; Reyle, Roßdeutscher & Kamp 2008): The aims of UDRT are quite different from those of SDRT. UDRT was motivated by the problem of computing DRT's semantic representations - i.e. DRSs - from syntactic input. A major problem that affects the computation of semantic representations generally (whether DRSs or representations of some other form) is that natural language is full of ambiguity. Words are often ambiguous, and of the ambiguous words many are multiply ambiguous, with three or more (often many more) different readings. Ambiguity arises also in connection with certain syntactic configurations (e.g. those that give rise to scope ambiguities). But in the practice of language use these ambiguities are much less of a problem than they might have been, for mostly they are resolved by context, provided either by the sentences in which they occur or by the circumstances in which these sentences are produced. For the theorist, however, disambiguation is a challenge; and it is an inexorable challenge for those who try to build effective, semantically sophisticated NLP systems. Disambiguation is not only an issue that arises in connection with interpreting nearly every sentence we ever come across; it is also something that almost always
requires inference. One of the premises for the inferences that are needed for resolving an ambiguity in a given sentence is always the still ambiguous representation of the sentence itself. It is important to structure this representation in such a way that the required inferences can be drawn efficiently. In particular it will as a rule be useful to avoid long sentence level disjunctions in which each disjunct represents a potential complete reading of the sentence as a whole. This is the central concern of UDRT. UDRT offers ways of representing ambiguous premises that permit deductions which avoid many of the duplications that are typically involved in such 'proofs by cases'. At the same time the 'underspecified' representations proposed by UDRT (known as 'UDRSs') support the inferences that are typically needed for ambiguity resolution. Thus the inferential component of a UDRT-based system serves a double purpose: (i) assist in ambiguity resolution and (ii) draw inferences from premises for which complete disambiguation can't be achieved. (See also article 24 (Egg) Semantic underspecification. The advantages of underspecification have been challenged in Ebert 2005.)

8.3. DRT and Dynamic Semantics

DRT has often been compared with Dynamic Semantics as developed by Groenendijk and Stokhof and others (see in particular Groenendijk & Stokhof 1991; Groenendijk & Stokhof 1990; Dekker 1993, as well as article 38 (Dekker) Dynamic semantics). In Dynamic Semantics sentences are explicitly analysed as relations between information states: The meaning of a sentence S manifests itself in the way S changes the information state of one who receives it, interprets it and accepts it. So in Dynamic Semantics it is the concept of meaning itself that has become dynamic (meanings are transformers of information states into other information states), whereas in DRT the notion of meaning remains static. As in classical Montague Grammar meanings in DRT are relations between sentences or texts (or their semantic representations) on the one hand and models or possible worlds on the other hand. Here the dynamics doesn't concern the notion of meaning as such, but the process of interpretation. Muskens (1994) showed how DRT can be formalised within a version of the λ-calculus (a modest extension of Montague's HOIL) and as part of that how DRSs can be given a compositional relational semantics of the kind proposed by Dynamic Semantics (see also Eijck & Kamp 1997; Muskens, van Benthem & Visser 1997). This is an elegant and insightful way of eating one's cake and having it. Since the mid-nineties this dynamic approach to the analysis of meaning in natural language has been developed further, especially during the past decade in the work of, among others, Bittner and Brasoveanu (Bittner 2005; Bittner 2007; Bittner 2008; Brasoveanu 2007).

8.4. Cognitive significance

From the very beginning one of the central motivations behind DRT has been the hope that it can tell us more about the processes of language interpretation and semantic representation in the human mind than Montague Semantics - the approach out of which DRT developed and with which it shares most of its theoretical and methodological commitments - either can or wants to. The DRSs of DRT ought to tell us some things about the
(9) Once upon a time there was an old king, who didn't have a son. He did have a daughter, though. Whenever she saw a frog, she kissed it.

(10) a. [x, t | OLD(x, t), KING(x, t), ¬[y | SON_OF(x, y, t)]]
     b. [x, t, z | OLD(x, t), KING(x, t), ¬[y | SON_OF(x, y, t)], DAUGHTER_OF(x, z, t)]
     c. [x, t, z | OLD(x, t), KING(x, t), ¬[y | SON_OF(x, y, t)], DAUGHTER_OF(x, z, t),
                   [v, t′ | FROG(v), SEE(z, v, t′)] ⇒ [ | KISS(z, v, t′)]]

These three DRSs represent the contents of the discourse in (9) after processing the first sentence (10a), the first two sentences (10b), and after processing the whole (10c). Notice that the material contributed by the second and the third sentence gets added to the representations that result from processing the first and the first two sentences. In this way, the pronouns he and she are appropriately related to the established domain of discourse.

1.4. Historical remarks

We end this introductory section with some historical remarks on the treatment of indefinite anaphoric relationships in terms of discourse reference. The subject has gained prominence through, among many others, the logico-philosophical work of Geach (1962) and the seminal but relatively informal work on discourse reference in Karttunen (1968). Kamp (1984) and Heim (1988) were the first, independently, to present a formal framework of interpretation for anaphoric phenomena, DRT and File Change Semantics (FCS)
⟦∃xφ⟧_M = {⟨g, h⟩ | for some k: g[x]k and ⟨k, h⟩ ∈ ⟦φ⟧_M}
⟦φ ∧ ψ⟧_M = {⟨g, h⟩ | for some k: ⟨g, k⟩ ∈ ⟦φ⟧_M and ⟨k, h⟩ ∈ ⟦ψ⟧_M}
⟦∀xφ⟧_M = {⟨g, h⟩ | g = h and for all k: if g[x]k then there is an h′: ⟨k, h′⟩ ∈ ⟦φ⟧_M}
⟦φ ∨ ψ⟧_M = {⟨g, h⟩ | g = h and for some k: ⟨g, k⟩ ∈ ⟦φ⟧_M or for some k: ⟨g, k⟩ ∈ ⟦ψ⟧_M}
⟦φ → ψ⟧_M = {⟨g, h⟩ | g = h and for all k: if ⟨g, k⟩ ∈ ⟦φ⟧_M then there is an h′: ⟨k, h′⟩ ∈ ⟦ψ⟧_M}

Most clauses require the input assignment g and output assignment h to be the same, besides some standard predicate logical conditions. For instance, if an atomic formula like Rt₁ ... tₙ or t₁ = t₂ is true relative to M and g then the input-output pair ⟨g, g⟩ is in the interpretation of such a formula. Intuitively this says that, if the standard test succeeds, g is accepted as possible input and the interpretation of the formula does not change anything in its output. If the test fails then g is not accepted as possible input and in that case there is no assignment h such that ⟨g, h⟩ is in the interpretation of that formula. Exactly when this is the case, that is, when the condition imposed by a formula φ upon M and g is not satisfied, its negation is satisfied, and g is a possible input for ¬φ relative to M. In other words, if φ cannot be executed upon input g, then ¬φ can, and its interpretation will yield g again as output.

Apart from the clauses for existentially quantified formulas and conjunctions, the other clauses in the definition are also static, in the sense that they do not allow input assignments to really change. The associated conditions are straightforward adjustments of the static interpretation of the embedded formulas to the dynamic (relational) interpretation. Only the interpretation of an implication is a bit more involved. An implication (φ → ψ) is satisfied (relative to M and g) iff relative to all ways of satisfying φ on input g in M, ψ is true as well. Since ψ here gets evaluated relative to outputs (k) of interpreting φ, dynamic effects of φ may affect the interpretation of ψ.

Changes in assignment functions are due to the interpretation of existentially quantified formulas. According to the above definition, if we have some input assignment g, then the interpretation of ∃xφ requires us to try out any assignment k which differs from g only in its valuation of x, then see if it serves as an input for interpreting φ, and if it does and outputs h, then h is also a possible output for interpreting ∃xφ on input g. Notice that if x indeed, as in most examples, occurs free in φ, and φ imposes certain conditions on the valuation of x, then the output valuation of x will have to satisfy these conditions. Metaphorically speaking, a 'discourse referent' with the properties attributed to x is introduced by such a formula, and it is labeled with the variable x.

A conjunction does not change any context all by itself, but it does preserve, or rather compose, possible changes brought about by the combined conjuncts. That is to say, if φ accepts an input g and produces some possibly different output k, and if ψ accepts k as input and delivers h as possible output, then the conjunction (φ ∧ ψ) accepts g as possible input upon which h is a possible output. This implements the dynamic idea that the interpretation of (φ ∧ ψ) involves the interpretation of φ first and ψ next.
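These clauses are directly executable over a finite model. The following Python fragment is a minimal sketch of our own (the formula encoding and all names are invented for the illustration; negation, conjunction and the existential quantifier are implemented exactly as characterised above, with assignments modelled as dicts):

    def interpret(phi, model, g):
        """Return the list of possible output assignments of phi on input g."""
        dom, I = model                      # domain and interpretation function
        op = phi[0]
        if op == "atom":                    # ("atom", "R", "x", ...): a test
            args = tuple(g[v] for v in phi[2:])
            return [g] if args in I[phi[1]] else []
        if op == "not":                     # ("not", psi): also a test
            return [g] if not interpret(phi[1], model, g) else []
        if op == "exists":                  # ("exists", "x", psi): reset x, do psi
            _, x, psi = phi
            return [h for d in dom for h in interpret(psi, model, {**g, x: d})]
        if op == "and":                     # ("and", psi, chi): compose relations
            _, psi, chi = phi
            return [h for k in interpret(psi, model, g)
                      for h in interpret(chi, model, k)]
        raise ValueError(f"unknown operator: {op}")

    # A toy model and the formula Ex Fx . Ux (think: "A farmer walked in.
    # He was unhappy."): the occurrence of x in the second conjunct is
    # syntactically free, yet dynamically bound by the existential quantifier.
    model = ({1, 2}, {"F": {(1,)}, "U": {(1,)}})
    phi = ("and", ("exists", "x", ("atom", "F", "x")), ("atom", "U", "x"))
    print(interpret(phi, model, {}))        # [{'x': 1}]

Note how negation returns its input unchanged exactly when the embedded formula has no outputs: this is the 'test' behaviour that undoes binding potential, which is discussed further below.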
2.2. Dynamic binding

By way of illustration, let us first consider a simple example in detail, throughout neglecting reference to a model M.

(12) A farmer owned a donkey. It was unhappy. It didn't have a tail.
     ∃x(Fx ∧ ∃y(Dy ∧ Oxy)) ∧ Uy ∧ ¬∃z(Tz ∧ Hyz)

Relative to input assignment g this will have as output assignment h if we can find assignments k and l such that k is a possible output of interpreting ∃x(Fx ∧ ∃y(Dy ∧ Oxy)) relative to g, and l a possible output of interpreting Uy relative to k, and h a possible output of interpreting ¬∃z(Tz ∧ Hyz) relative to l. Since the second formula is atomic, and the third a negation, we know that in that case k = l and l = h. Assignment h (that is: k) is obtained from g by resetting the value of x so that h(x) ∈ I(F), and by next resetting the value of y so that h(y) ∈ I(D) and ⟨h(x), h(y)⟩ ∈ I(O). That is, h(x) is a farmer who owns a donkey h(y). Observe that for any farmer f and donkey d that f owns, there will be a corresponding assignment h′: g[{x, y}]h′ and such that h′(x) = f and h′(y) = d.

The second conjunct first tests whether y is unhappy, that is, whether h(y) = l(y) ∈ I(U). The third conjunct, a negation, tests whether assignment h cannot serve as input to satisfy the embedded formula ∃z(Tz ∧ Hyz). This is the case if we cannot change the valuation of z into anything that is a tail had by h(y). Putting things together, ⟨g, h⟩ is in the interpretation of our example (12) if, and only if, g[{x, y}]h and h(x) is a farmer who owns a donkey h(y) which is unhappy and does not have a tail. Observe that for any farmer f and unhappy tail-lacking donkey d that f owns, there will be a corresponding assignment h′: g[{x, y}]h′ and such that h′(x) = f and h′(y) = d.

In the example discussed above we see that a free variable y, for instance in the second conjunct, gets semantically related to, or effectively bound by, a preceding existential quantifier which does not have the variable in its syntactic scope. This is an example of a much more general fact about interpretation in DPL, which goes under the folkloric name of a 'donkey equivalence'.

Observation 1 (Donkey Equivalences)
For any formulas φ and ψ
(∃xφ ∧ ψ) ≡ ∃x(φ ∧ ψ)
(∃xφ → ψ) ≡ ∀x(φ → ψ)

These equivalences are classical, but for the fact that they do not come with the proviso that x not occur free in ψ. This is dynamic binding at work. In the first equivalence we see that a syntactically free variable x in ψ may get semantically bound by a previous existential quantifier. The second one shows that this semantic binding gains strong, universal, force in implications. The use of the second equivalence is exemplified by the following, canonical, examples.

(13) If a farmer owns a donkey, he beats it.
     (∃x(Fx ∧ ∃y(Dy ∧ Oxy)) → Bxy)
(14) Every farmer beats every donkey he owns.
     ∀x(Fx → ∀y((Dy ∧ Oxy) → Bxy))

These two sentences have generally been judged equivalent, and so are the naturally associated translations in DPL. (As a historical remark, back in 1979 Urs Egli proposed the above equivalences, in order to account for the anaphoric puzzles that plagued the literature. One of the merits of DPL is that the equivalences show up as true theorems.)

The following, classical, equivalences are also valid in DPL.

Observation 2 (Equivalences that Hold)
¬¬¬φ ≡ ¬φ
∀xφ ≡ ¬∃x¬φ
(φ ∨ ψ) ≡ ¬(¬φ ∧ ¬ψ)
(φ → ψ) ≡ ¬(φ ∧ ¬ψ)

As we have seen, ¬ is an operator that introduces tests without any further dynamic impact, and ∀, ∨ and → do likewise. That is, if φ contains a quantifier with binding potential, this potential gets lost when it occurs in the scope of ¬, ∀, ∨ or →. As a consequence other equivalences typically do not hold in DPL.

Observation 3 (Equivalences that do Not Hold)
¬¬φ ≢ φ
∃xφ ≢ ¬∀x¬φ
(φ ∧ ψ) ≢ ¬(¬φ ∨ ¬ψ)
(φ ∧ ψ) ≢ ¬(φ → ¬ψ)

These non-equivalences are motivated by the observation that the pronouns in the following examples do not seem to be resolved, or at least not bound by the indefinite which figures in the scope of one of the mentioned operators.

(15) Farley doesn't have a car. It is red.
(16) Every man here owns a car. It is a mustang.
(17) Mary has a donkey or she doesn't have one. It brays.

The facts about (undoing) dynamic binding, which follow by the definition of the semantics, correspond one to one with the facts about (in-)accessibility of discourse referents in basic DRT, cf. article 75 (Geurts) Accessibility and anaphora.

2.3. Dynamic consequences

DPL has been motivated by the desire to bring out the logic of a system of interpretation that accounts for anaphoric relationships, like DRT. It allows us to study the logical consequences in full formal detail. Before we can see this more clearly, we have to present the DPL notions of truth and dynamic entailment first.
The dynamics of discourse is much more involved than these simple examples suggest. It is not just the passing on of information exchanged, and not just the creation and utilization of discourse referents. This type of information comes by in a structured manner, as the following examples serve to show. Notice, first, that plural pronouns may pick up plural entities which have not as such been introduced in the discourse.

(24) Bob and Carol went to play bridge with Ted and Alice. They had a wonderful evening.

Obviously the pronoun "they" should not be taken to refer to either Bob or one of the others individually. The pronoun can, however, refer to the couple of Bob and Carol, and also to the whole group of four, Bob, Carol, Ted and Alice. However, this group of four has not been mentioned as such in the first line of example (24). Somehow this plural referent may have to be constructed from the four persons that have been explicitly introduced. (This process of forming plural discourse referents is called 'summation' in Kamp & Reyle 1993.)

Besides this type of summation of individual referents more is at stake. When we deal with plural anaphoric dependencies all questions about the distributive, collective or cumulative interpretation of plurals play up, in a dynamic fashion. Consider the following example.

(25) Seven pupils and four teachers wrote five ballads and some rhymes. They performed them at an evening during the spring holiday.

The first sentence here can be taken to introduce a group X of pupils and teachers, and a group Y of ballads and rhymes, such that X wrote Y and X performed Y at a certain evening. Of course, the writing of Y by X can be analyzed in more detail. Maybe the intended reading is that the pupils wrote the ballads and the teachers the rhymes. Also, the pronouns "they" and "them" can be taken to require further analysis, possibly depending on the particular kind of reading associated with the first sentence. The performers can be taken to have performed, individually or group-wise, the ballads and rhymes they wrote themselves. Upon this rather natural analysis, the truth conditions of the second sentence are dependent on the analysis chosen for the first, so that the dynamics of interpreting the first sentence must deliver not only two plural discourse referents, but some internal relation between these referents as well.

This point also shows in quantified constructions.

(26) Almost all students chose a book. Most of them wrote an essay about it.

The first sentence in this example can be taken to yield a referent set of students who chose a book, and this is the set of individuals which the pronoun "them" intuitively refers to. However, assuming not everyone of them chose the same book, there is no singular referent figuring as the chosen book, which the pronoun it could have referred to. A natural interpretation of the second sentence of (26) is that the students wrote an essay about a book that they individually chose. We witness again a close interaction between
Informative types of discourse not only consist of assertions, but they are, typically, also guided by questions: interrogatives, topics which the interlocutors raise, and questions they themselves face. Already at the outset of formal semantic theorizing about questions, their dynamic role, their role in discourse or dialogue, has been obvious. Adopting a dynamic theoretical perspective, discourses or dialogues can be described as games of stacking and answering 'questions under discussion' or as processes of 'raising and resolving issues'. Such processes are not unstructured, but governed by structural linguistic rules, and highly pragmatic principles of reasonable or rational coordination (Ginzburg 1995; Roberts 1996; Hulstijn 1997).

A quite minimal way to account for this proceeds by representing information as a set of possibilities, one of which is supposed to be actual. The possibilities are grouped in sorts. The sorting indicates that the current issue is not which of the possibilities is the actual one, but which is the sort of the actual one, that is, in which sort of possibilities the actual world can be found. Information states then can be taken to be sets of sets of possibilities, which may get updated with further questions and more data in the development of a discourse.

Such a tentative sketch already provides the basics for characterizing certain basic discourse and dialogue notions like that of a coherent and felicitous dialogue. For a dialogue to proceed coherently and felicitously, one may require assertions to be consistent with the current information state, but also informative: logically speaking it is of no use to accept a state of inconsistent information or to assert what is already (commonly) known. Questions can be required to be non-superfluous as well, so that one doesn't raise an issue which is already there. Assertions can also be required to address issues at stake and not to provide unsolicited information. These observations and requirements, and many others, find a neat formulation in update style systems of interpretation. Asher & Lascarides (2003) gives an impression of the wealth of data to be covered in this direction, and Aloni, Butler & Dekker (2007) gives an overview of recent formalizations in a dynamic paradigm.

3.3. Modality and presupposition

As we said, a pragmatic system of interpretation along the lines of Veltman's update semantics provides a ground for evaluating epistemic modals, assertions made with the auxiliaries may and must or adverbials like maybe and evidently. The sentential operators may and must seem to fit neatly in the dynamic paradigm, as the first can be used to express consistency with the current context of information, and must to express something that can be derived from this context. Thus, the effect of a modal mod, which expresses contextual epistemic possibility, can be tentatively defined as follows.

(32) τ[mod φ] = τ, if τ[φ] is possible, and τ[mod φ] = ⊥ otherwise.

Since the interpretation of the modality is stated in terms of the possible update of the current information state τ with the embedded sentence φ, the interpretation of these modal sentences may be variable. For instance, it may be the case at one point in an exchange that Nancy might be home, as far as we all know, while later in the discourse we may have collected information which rules out that she is home. Epistemically used modals might and must may change their truth, or acceptability, in the course of events.
The dynamic logic of such epistemic operators is investigated in detail in various places in the literature.
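As a concrete illustration of the clauses just given, consider the following small Python sketch of a Veltman-style update system. It is a toy encoding of our own, assuming that possibilities can be modelled as sets of atomic facts; the names of the functions and the atoms are illustrative only. It implements assertion as elimination of possibilities, the consistency test of (32), and, in the spirit of the question-driven picture sketched above, a simple operation that sorts the remaining possibilities according to a raised issue.

    from itertools import combinations

    ATOMS = ['home', 'rain']
    WORLDS = {frozenset(s) for r in range(len(ATOMS) + 1)
              for s in combinations(ATOMS, r)}

    def update(state, p):            # assertion: keep the worlds where p holds
        return {w for w in state if p in w}

    def might(state, p):             # the test of (32): consistency with state
        return state if update(state, p) else set()

    def ask(state, p):               # raise the issue whether p: sort the worlds
        yes = frozenset(update(state, p))
        no = frozenset(state) - yes
        return {block for block in (yes, no) if block}

    state = set(WORLDS)                              # initial ignorance
    print(might(state, 'home') == state)             # True: "Nancy might be home"
    print(len(ask(state, 'rain')))                   # 2: two sorts of possibilities
    state = {w for w in state if 'home' not in w}    # we learn she is not home
    print(might(state, 'home'))                      # set(): the modal now fails

The last two lines replay the Nancy example: the same modal sentence is acceptable at one point in the exchange and unacceptable later, once the state has been updated with information that rules out that she is home.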
function composition, the second formula's presupposition that φ ("Mike has children") is automatically satisfied by the update of the context with the first conjunct φ. Not so according to the rendering of example (3b) as (χφ ∧ φ). This conjunction as a whole still presupposes that φ, while its second conjunct still needlessly and explicitly conveys what has already been presupposed by the first. An update semantics precisely accounts for this difference.

The AB theory of presupposition (van der Sandt 1992; Geurts 1999) is dynamic like the satisfaction theory, but it presents a different account of resolution. According to the AB theory, presuppositions appear in a preliminary phase of interpretation, and at some intermediary level, as distinguished information units which have to be 'resolved' for the interpretation process to be completed. Being resolved roughly means being semantically invisible, the AB counterpart of being satisfied. If they are resolved, these information units as it were dissolve at the intermediary level of DRT's discourse representation structures. If they are not resolved, they get 'accommodated', or, rather, they accommodate themselves, like true squatters. In such a case their contents are settled in a relevant part of a representation which resolves them in the slot where they originally appeared. For a very sketchy illustration, consider again example (33).

(33) Sally believes that Harry didn't quit smoking cannabis.

Nowhere in example (33) is it literally said or communicated that Harry did smoke cannabis, or that Sally believes so. So if the preceding context of such an example does not supply a way of resolving this presupposition, it has to accommodate itself. Even though this may require some further contextual support, one way for this presupposition to accommodate itself is right there where it stands, thus bringing about a reading according to which Sally is taken to believe that it is not the case that Harry smoked cannabis and stopped, similar to the last reading above. It may be easier to obtain a reading by accommodating the presupposition in Sally's beliefs. Sally would then be taken to believe that Harry did smoke cannabis and didn't stop doing so, as one gets from the second reading above. Probably the most straightforward interpretation is one where the presupposition accommodates itself at the main level of interpretation, so that example (33) is taken to say that Harry used to smoke cannabis, and that according to Sally he didn't stop doing so, basically the first reading again.

While the underlying ideas of the satisfaction and the binding theory on satisfaction and resolution are similar, and while both are dynamic, the specific treatments are quite different. The first is a logical approach, in that it predicts presuppositions as a type of entailments following from an independently specified semantics; according to the second, presuppositions are typically computational constructions. The difference is vast, but not unbridgeable. A semantic variant of the AB theory, which fleshes out a logical notion of the meaning or interpretation of unresolved presuppositional structures, is presented in Dekker (2008).

4. Methodological issues

In this section we discuss two more theoretical issues. They are related to the classical Fregean themes of compositionality, representationalism, dynamicity and contextuality, which prove to be of continuing relevance in current debates.
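To see how an update semantics accounts for the asymmetry between the two conjunctions discussed above, here is a schematic Python sketch of the satisfaction account. It is a toy encoding of our own, with possibilities again modelled as sets of atomic facts, and with presupposition failure rendered as an error raised by a partial update; the atoms and function names are illustrative assumptions, not notation from the literature.

    def update(state, p):                     # plain assertion of an atom p
        return {w for w in state if p in w}

    def presupposing(presup, p):              # like χφ: asserts p, presupposes presup
        def partial_update(state):
            if any(presup not in w for w in state):    # not yet common ground
                raise ValueError('presupposition failure: ' + presup)
            return update(state, p)
        return partial_update

    WORLDS = {frozenset(s) for s in [(), ('children',), ('children', 'bald')]}

    # "Mike has children and Mike's children are bald": (φ ∧ χφ)
    s = update(WORLDS, 'children')              # the first conjunct ensures ...
    print(presupposing('children', 'bald')(s))  # ... satisfaction: update succeeds

    # "Mike's children are bald and Mike has children": (χφ ∧ φ)
    try:
        presupposing('children', 'bald')(WORLDS)   # fails in an ignorant context
    except ValueError as e:
        print(e)

On the AB-theoretic alternative, the failing case would instead trigger accommodation: the missing fact would be added at some suitable level of representation rather than producing an error.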
(13) a. John fell. Bill pushed him. (Nonvolitional Cause)
b. John pushed Bill. He was angry. (Volitional Cause)
c. John pushed Bill. I saw it myself. (Justify)

A special case is also formed by elaborations that give extra information about participants or the location, for the purpose of identification or for motivating or explaining the pivot. This whole group of relations can - with some charity - be described as relations between the state or event described by the current sentence and the state or event described by the pivot.

All other relations are different. They are Contrast, Concession, Narration and List, and they seem primarily related to the strategic level: what is reported belongs together because it is relevant to the same issue, but the events and states that are reported are not necessarily related to each other. This does not mean that they cannot have causal, temporal or local relations to each other. Contrast and Concession often go together with temporal simultaneity and local overlap. The elements of Lists also often have non-accidental temporal and local relations to each other. But none of these relations seems to entail any particular temporal, local or causal relationship between the events and states involved. If they nonetheless have a relation of this kind, it would be due to their shared relation with an overarching topic.

Narration (Sequence in the list of relations above) is connected with moving through time and will imply temporal succession, and that is a reason for distinguishing Narration from List. In a Narration, a story is told. The pivot event is abandoned and the current sentence reports the next event that is relevant for the story. That is the definitional property for Narration: the new event happens after the pivot event. Hobbs (1985) adds another element: the new event is contingent on the pivot event. Contingency should be defined in terms of causality, but it is not obvious how this must be done. The idea is that the pivot event does not itself cause the current event, but establishes one of its preconditions. This can be expressed as a counterfactual (14), and an illustration is (15).

(14) If the pivot event had not happened, the current event would not have happened either.

(15) John stepped out of his car and walked up to the door.
contingency: John's stepping out of the car brings him in a state and at a place which makes it possible for him to walk to the door.
counterfactual: If John had not stepped out of his car, he would not have walked up to the door.

Stories are held together not just by temporal succession and contingency, but also by protagonists and locations (the aboutness topic). One can try to express the unity of stories by thinking of them as an Elaboration of a single event. Another way of defining the unity of a story tries to see the whole story as an answer to a single question (the story topic) that gets answered by the events that make it up. Unfortunately, clear ideas about the construction of this topic question from the story have not been forthcoming, except for simple constructed cases.

Narration cannot be reduced to a relation between the reported events, not even if with Hobbs one adopts Contingency. On the topic view, it is the relation to the event
described by the whole story or the overarching topic question that makes the pivot and the current sentence stand in the Narration relation.

Topic questions (see article 72 (Roberts) Topics) do a much better job with Lists. A List (a sequence of sentences connected by the List relation) can be seen as an answer in many parts to a single topic question, typically in a situation in which single sentence answers would not do the same job. Here there are algorithms, e.g. Prüst, Scha & van den Berg 1994, that compute the topic question (or a closely related object) for simple cases. Lists can also be seen as an answer to the problem that one cannot fully specify binary or ternary relations in a single sentence, unless one is very lucky. For instance, let John love Mary and Sue, Bill Mary and Harriet, and Tom only Sue, while there are other boys and girls. It is impossible to give a simple clause which specifies the whole relation, in response to a question: Which boy loves which girl? (16) is true but fails to give the details.

(16) John, Bill and Tom love Mary, Sue and Harriet.

But a List like (18) does. Full specification is possible by splitting up the given question into subquestions as in (17) and answering each in turn as in (18).

(17) Which girls does x love?

The List in (18) can be seen as a strategy to avoid the problem.

(18) John loves Mary and Sue. Bill Mary and Harriet. And Tom Sue.

Contrast is the topic of ongoing discussions. Umbach (2001) provides a definition for the case when the English "but" or rather the German "aber" does not express Concession but what is normally called Formal Contrast. The definition runs as follows: A positively addresses a topic question Q which B (directly or indirectly) addresses negatively. This is illustrated in (19).

(19) John is tall but Bill is small.
topic question: Are John and Bill tall?
background knowledge: Small implies not tall.

(19) should be contrasted with (20), which requires a different topic question.

(20) John is tall and Bill is small.
topic question: How tall are John and Bill?

The generality of this approach can be undermined however by looking at languages where Concession and Formal Contrast are expressed by different words. A case in point is Russian, where the expressions "a" and "no" both correspond to English "but", with "no" specialising in the Concessive readings (Malchukov 2004).
"A" is sometimes rendered by "but" in English. In many cases however, the correct translation is "and". The problem is that the other Russian conjunction "i" is subject to strong restrictions which prevent it from being the uniform translation of "and". One hypothesis (due to Jasinskaja & Zeevat 2008) is that "a" relates to double contrast: a doubly distinct answer to a double wh-question, as in example (21).

(21) John likes Susan "a" Bill likes Harriet.
topic question: Who likes who?

"No" would not be adequate in this case, since "no" marks a special case of an answer to a double wh-question, namely the case where one wh-element is why and the other is whether. The polarity switch is then due to the fact that there are only two distinct polarities that can answer a whether-question.

Concession can be defined along the lines proposed by König (1991) as anticausative: Concession(A, B) iff A normally causes B to be false. This is however not fully general, as was already noted by Ducrot (1973). In an example like (22), the first conjunct argues for going to a restaurant, the second against going there. Typically, the speaker can be taken as committing herself to the drift of the second conjunct, i.e. as proposing not to go.

(22) The food is good but it is expensive.

This can be fitted into the scheme of double wh-questions by making the concessive case provide doubly distinct answers to a why-whether question. In (22), the first conjunct gives a reason for going to the restaurant, and the second conjunct a reason for not going there. The anticausal readings are then the ones where the second conjunct itself is the conclusion that is argued against (in the first conjunct) and argued for (by itself) in the second conjunct.

Concession has its origin in conversation. A Concession is typically partly an acknowledgment of what the other said. Suppose the other said "A". Then one can concede any part of A or a consequence of A. This goes together with not accepting A in its entirety, normally with a rejection of A. In a conversational concession, what is conceded is already accepted by the other speaker and so has common ground status. What is not conceded from A lacks this status. This may explain why subordinate concessive clauses are presupposition triggers.

In English, it is necessary to mark Concession, by markers like but, (al)though, however and nonetheless. In languages like German and Dutch there are concessive markers like zwar and weliswaar inside the first clause that indicate that one is in a Concession and announce the aber- or maar-clause that will follow. So these languages have a coordinate structure that is unambiguously concessive, unlike the English construction with but.

All rhetorical relations allow asyndetic expression (expression without any overt marking) and there are often lexical markers. A curious fact is that while most relations can be expressed by specific grammaticalised markers and can be marked as a relation
between constituent clauses in a syntactically integrated complex construction, there are exceptions to this principle. The analysis of many of the markers as listed in Tab. 39.2 is problematic. Good analyses would give an account of the rhetorical relation(s) they can mark. The growing body of research on the semantics and pragmatics of particles is therefore directly relevant for a better understanding of rhetorical relations (see also article 76 (Zimmermann) Discourse particles).

Tab. 39.2: Grammaticalised cues for the RST relations

Relation                    Grammaticalised markers
Antithesis                  but
Background                  -
Concession                  but, though
Enablement                  -
Evidence                    because, since
Justify                     because, since
Motivation                  because, since
Preparation                 -
Restatement                 -
Summary                     so
Circumstance                while
Condition                   if
Elaboration                 namely
Evaluation                  so
Interpretation              -
Means                       -
Non-volitional Cause        because, since
Non-volitional Result       so
Otherwise                   otherwise, else
Purpose                     -
Solutionhood                -
Unconditional               anyway
Unless                      unless
Volitional Cause            so
Volitional Result           so
Conjunction                 and
Contrast                    but, and
Disjunction                 or
Joint                       -
List                        and, also
Multinuclear Restatement    -
Sequence                    then
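As a toy illustration of how weakly these cues constrain the choice of relation, the table can be inverted into a small Python lookup. The mapping below follows the reconstruction of Tab. 39.2 given above, and the function name and the crude word-based clause representation are illustrative assumptions of our own, not a proposal from the literature.

    CUES = {
        'but':       ['Antithesis', 'Concession', 'Contrast'],
        'though':    ['Concession'],
        'because':   ['Evidence', 'Justify', 'Motivation', 'Non-volitional Cause'],
        'since':     ['Evidence', 'Justify', 'Motivation', 'Non-volitional Cause'],
        'so':        ['Summary', 'Evaluation', 'Non-volitional Result',
                      'Volitional Cause', 'Volitional Result'],
        'while':     ['Circumstance'],
        'if':        ['Condition'],
        'namely':    ['Elaboration'],
        'otherwise': ['Otherwise'],
        'anyway':    ['Unconditional'],
        'unless':    ['Unless'],
        'and':       ['Conjunction', 'Contrast', 'List'],
        'or':        ['Disjunction'],
        'also':      ['List'],
        'then':      ['Sequence'],
    }

    def candidate_relations(text):
        # collect every relation that some cue word in the text may mark
        words = text.lower().replace(',', ' ').replace('.', ' ').split()
        found = sorted({r for w in words for r in CUES.get(w, [])})
        return found or ['asyndetic: any relation, to be inferred']

    print(candidate_relations('John fell because Bill pushed him.'))
    # ['Evidence', 'Justify', 'Motivation', 'Non-volitional Cause']
    print(candidate_relations('John fell. Bill pushed him.'))
    # ['asyndetic: any relation, to be inferred']

Even a complete cue table leaves most of the disambiguation work to inference: as noted above, all rhetorical relations allow asyndetic expression, and the most frequent markers are many ways ambiguous.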
4. Where must rhetorical relations be assumed?

The table at the end of section 3. can already serve to make an important point: it is not just the sentences of a discourse that are related by rhetorical relations. Rhetorical relations must also be assumed within a single sentence, as obtaining between coordinated clauses and between main clauses and subordinated clauses. And between subordinate clauses, as in the following example (23).

(23) Stepping out of his car and walking to the door John noticed a squirrel in the tree.

Stirling (1993) reports that this is in fact the favourite way of telling a story in switch-reference languages. In Latin - where the case system allows a far more reliable way of tracking the different participants in a sentence than the pronominal systems of many modern languages - multiple participial constructions with rhetorical relations holding between them are much favoured. (24) is from Caesar (1869: book 1,24).

(24) a. ipsi confertissima acie, reiecto nostro equitatu, phalange facta sub primam nostram aciem successerunt
b. they themselves in very close order, after having repulsed our cavalry and formed a phalanx, advanced up to our front line.

In (24), the relation of Narration between throwing back the cavalry and forming a phalanx, both expressed by an ablative absolute, is expressed only by the linear order.

This raises the question how deep one should go into syntactic structure for applying rhetorical relations. Very deep, it seems. Any subordinate predication can in principle be related by a rhetorical relation to another predication, although not to any other predication. Predications have to be related to other ones, since the speaker has put them there for a reason and the hearer needs to figure out why the material was deemed useful by the speaker. An exception is material that is used for the purpose of identification, but even such material can simultaneously be used to give Causes or Justifications.

(25) a. The angry farmers blocked the road. (Volitional Cause)
b. Pushing the button, John blew up the wall. (Means)
c. Holding the flowers in his arm, John crossed the road. (Circumstance)
d. When he reached the crossing, he turned right, following Mary's instructions. (Volitional Cause)

One can even quite legitimately ask the opposite question. How many of the semantic connections expressed by lexical and syntactic means are in fact rhetorical relations? As it turns out, many are. Typically all the connectors between sentences have a semantics that is reminiscent of a rhetorical relation. The thematic relations expressed by case and word order have to be analysed by the proto-thematic properties following the analysis of Dowty (1989). And those are suspiciously reminiscent of rhetorical relations: Cause, Volition, Affected, Beneficiary, Result, Instrument. Complement sentences can be related to Elaborations. On this perspective, one could say that there are fundamental semantic relations for natural language which should be recognised in interpretation and that
the same quantifier scopes should be preferentially assumed, etc. Kehler has however convincingly argued that this is not a general default but one that depends for its operation on the assumption of discourse relations that force parallelism: List, Formal Contrast and (cases of) Concession and Narration.

The effects of parallelism can be illustrated by the following examples. In (32) - without special intonation - "she" must be Marian and cannot be Susan. (With stress on "she", it is the other way round, due to the distinctness between "she" and Marian expressed by the contrastive stress: the most parallel reading that is compatible with contrastive stress on "she".)

(32) Marian likes Susan. And she likes Tom.

In the second clause of (33), the ellipsed VP is "like Susan at dinner parties" and not "like Susan". In (34), this default is broken by the provision of a different parallel modifier.

(33) Marian likes Susan at dinner parties. And Tom does too.

(34) Marian likes Susan at dinner parties. And Tom does in dancing class.

Kehler's point is that in cases like these, but not with other rhetorical relations, the syntax of the connected clauses must be very similar. The effects show up especially in ellipsis. Compare (35) with (36). (36) allows two interpretations: Tom likes himself or Tom likes Marian, interpretations that are both unavailable for (35). ((35) will be repaired to have one of the indicated readings, but "And Tom does too." is just not the right way to express them.)

(35) Marian likes herself. (?) And Tom does too.

(36) Marian likes herself. Because Tom does too.

Further compare (37) with (38). Here the unexpressed reflexive in the ellipsed clause creates a problem in (37), while this does not seem to matter in (38).

(37) Marian likes Tom. (?) And he does too.

(38) Marian likes Tom. Because he does too.

What seems to be going on is that List, Formal Contrast and (cases of) Concession force syntactic parallelism and that the antecedent as a syntactic object must allow the interpretation. This restriction is removed when the relation is not one of Similarity. It is quite tempting to think that this can be analysed with quaestios (von Stutterheim & Klein 1989), discourse topics (Asher 2004) or schemas (Prüst, Scha & van den Berg 1994). Lists are then joint answers to a single question (or different elements falling under one discourse topic or sharing a schema). The problem then seems to be that the underlying questions (topics, schemata) in the cases considered are not well-formed. This would block the interpretations that need them. For example, in Prüst, Scha & van den Berg (1994), "And Tom does too" is expanded to "And Tom likes herself", which cannot be interpreted.
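The contrast between (35) and (36) can be mimicked in a few lines of code. The following Python sketch is a toy reconstruction of the point, not an implementation from the literature: the tuple representation of clauses and the small gender lexicon are assumptions made purely for illustration. Under a List-type relation the elided VP is copied as a syntactic object, so a copied reflexive must still be bindable by the new subject; a non-parallel relation such as Explanation instead allows the reflexive to be rebound, yielding the 'sloppy' reading of (36).

    GENDER = {'Marian': 'f', 'Susan': 'f', 'Tom': 'm'}
    REFLEXIVES = {'f': 'herself', 'm': 'himself'}

    def resolve_ellipsis(antecedent, new_subject, relation):
        subj, verb, obj = antecedent          # toy clause: (subject, verb, object)
        if relation == 'List':                # parallelism: strict syntactic copy
            if obj in REFLEXIVES.values() and obj != REFLEXIVES[GENDER[new_subject]]:
                return None                   # the copied reflexive cannot be bound
            return (new_subject, verb, obj)
        # non-parallel relations permit semantic reinterpretation: rebind the
        # reflexive to the new subject, yielding the sloppy reading
        if obj in REFLEXIVES.values():
            obj = REFLEXIVES[GENDER[new_subject]]
        return (new_subject, verb, obj)

    print(resolve_ellipsis(('Marian', 'likes', 'herself'), 'Tom', 'List'))
    # None: "Marian likes herself. (?) And Tom does too."
    print(resolve_ellipsis(('Marian', 'likes', 'herself'), 'Tom', 'Explanation'))
    # ('Tom', 'likes', 'himself'): "... Because Tom does too."

The sketch deliberately leaves out the strict reading of (36), on which Tom likes Marian; what it captures is only the asymmetry that matters here, namely that the parallel, List-type resolution fails exactly where the syntactic copy is uninterpretable.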
5.3. Information structure

There is a seemingly opposite school in the study of rhetorical structure, represented by scholars like von Stutterheim & Klein (1989) and van Kuppevelt (1995). In this approach, the starting point is the task of the speaker, conceived as a question. The structure of a text is then a complex answer to this question. This immediately leads to a distinction between partial answers to the question and satellites to such partial answers (which answer questions of their own, raised by their nucleus). The question of a text also leads to the fixation of the topic and foci of the partial answers and the fixation and movement of the referential parameters in these texts. This leads to interesting parallels with treatments of discourse semantics such as Discourse Representation Theory.

One can say from the perspective of this school that RST is just a classification of a set of natural relations that arise in the goal-directed enterprise that is telling a story, producing an overview of some states of affairs, or providing an extended explanation. A difference with RST is that the question immediately makes a connection with the task of the text as a whole. Moreover, it is possible to develop an account of especially those discourse relations that seem to be governed by information structure rather than by semantic relations almost directly on the basis of the question structure: especially van Kuppevelt works out this connection and treats both information structure and text structure in terms of questions. This makes it possible to link up with areas such as theories of topic-focus articulation (cf. article 71 (Hinterwimmer) Information structure or Grosz, Joshi & Weinstein 1995 on centering).

The theory seems to have the potential to serve as a foundation for rhetorical relations and rhetorical structure. Any element of a text needs to be assigned a role with respect to the question of the text or with respect to one of its answers. If this role can be expressed as another question, specific for the particular element, this question will determine both the rhetorical relation and the pivot.

5.4. Text interpretation

While rhetorical relations started in text generation as a theory of how to structure texts, section 5.1. should have made it clear that there is quite a lot of semantic mileage in having a grasp on rhetorical relations in interpretation. The main obstacle for achieving such a grasp is the fact that rhetorical relations quite commonly are not overtly marked at all.

Two remarks are in order. First of all, it is quite normal that NL has no marker for a conceptual distinction that is nonetheless expressed, in the sense that speakers are aware of the distinction and hearers are supposed to figure out what it is. The many meanings of common prepositions in English like "with", "of" or "in" are a clear case in point. A natural language understanding system faces the task of filling in the blanks there, if it is aiming for understanding that is comparable with what humans get out of the language input. A lot of ambiguity resolution is necessary.

The second remark is that NL is nonetheless full of markers for rhetorical relations and that texts are full of these markers: various coordinating and subordinating connectors express one or more rhetorical relations, and many particles at least rule out some rhetorical relations. (Webber et al. 2003 argues for what seems to be the correct view: the
one needs to be able to compare explanations by a combination of logical and empirical methods to get anywhere at all. It would however not seem that there is anything going on in the area of rhetorical relations that is not just normal computational linguistics. It would therefore seem that further development of empirical computational semantics will bring solutions for asyndetic marking of rhetorical relations.

SDRT has dominated rhetorical structure in recent years. It started as an attempt to get rid of some of the limitations of the treatment of tense and aspect in DRT, a treatment that is largely a refinement of the treatment of tense and aspect in Hinrichs (1986), assuming a very basic narrative structure. (Kamp's earlier collaborative work with Christian Rohrer on French tense and aspect (Kamp & Rohrer 1983) - the discovery context for Discourse Representation Theory - was also limited to narration, this time based on Gustave Flaubert's Education Sentimentale.) It was however clear to everybody that the limitation to narration led to generalisations that are not tenable when one considers a wider category of texts. The original work is reported in Lascarides & Asher (1993) and is influenced by attempts at that time to investigate the applicability of default logic to phenomena in natural language semantics and pragmatics. The combination of rhetorical structure and non-monotonic reasoning was not new, since Hobbs had proposed a combination of abduction and his work on text coherence, but SDRT aims for a comprehensive model for the first time. It brought applications to presupposition and the development of a hybrid logical system which separates the logic of rhetorical relations (a propositional non-monotonic logic) from the logic in which the proper natural language semantics takes place (a variant of standard first order logic). Merging the two logics would lead to an intractable system. While it seems that SDRT can profit as much as any other approach from empirically acquired statistical data, it requires argument to assume that non-monotonic logic is the most suitable formalism for exploiting such data.

The aim of SDRT is to develop an implementable logical model of rhetorical structure. And certainly, the work collected in the monograph Asher & Lascarides (2003) comes closer to that goal than any other effort in the area. Moreover, almost all issues in rhetorical structure have come up in this enterprise, and no student of rhetorical structure would be wise to ignore SDRT.

6. Outlook

I have tried to show in this overview that rhetorical structure is a necessary ingredient of accounts of text generation and of generalised pronominal resolution, including, next to pronouns, tense and aspect, ellipsis and gapping, particles and information structure, and that it is therefore an essential part of context-based interpretation of natural language. Here the qualification "context-based" can be safely dropped: there is no natural language interpretation worthy of the name that is not context-based.

With the exception of SDRT, there is no theoretical model of the recognition of rhetorical relations that has been worked out on a larger scale. For SDRT too, the proper recogniser of rhetorical structure has not been achieved yet. With the exception of the rather limited cue-based recognition of rhetorical relations, there is therefore nothing that can be immediately incorporated into natural language understanding systems. In text generation, rhetorical structure is part of most systems.
Nonetheless there are many phenomena that need improved understanding. For example, while recent years have seen a substantial improvement in our understanding of the meaning of particles
and conjunctions, the question when to deploy these devices in a text generator is still not well understood. It seems that knowing the meaning is not enough; one also needs a proper description of the triggering conditions beyond meaning.

For an area that is as central as rhetorical structure, the research investment so far has been very small indeed. Compared e.g. to the investment in the syntax of sentences, where hundreds if not thousands of researchers are active, there are maybe 75 people altogether who have contributed to the area of rhetorical structure, and only for a few of them has it been a substantial part of their career. The intellectual interest of the two problems is hard to compare, but theoretical syntax seems to have only a minor impact on natural language understanding and generation, and the potential of rhetorical structure seems to be far greater for both of these areas.

I would gamble on two new impulses in the area in the coming years. One is the advent of corpora that are annotated with rhetorical structure. Such corpora have already been developed and annotation schemes have been provided. For useful references, see http://www.isi.edu/marcu/discourse/. More recent is the Penn Discourse Treebank (http://www.ldc.upenn.edu). They can be combined with the development of more semantic understanding within the stochastic paradigm in NLP in combination with logical techniques. This will lead within the coming years to the possibility of estimating the plausibility of a given interpretation in a context and preferring the most plausible interpretation. This will also allow the recognition of rhetorical relations and thereby liberate research on rhetorical structure from the fixation on the problem of asyndetic expression, a problem outside the reach of current technologies. While these new possibilities will help considerably in other areas of semantics and pragmatics, the improvement of rhetorical understanding will have strong repercussions on the overall quality of semantic and pragmatic understanding.

A second promise is optimality theory. Both generation and interpretation can usefully be seen as a choice between alternatives where the choice is guided by principles, i.e. as optimisation problems. Both generation and interpretation also need blocking, and blocking is typical for optimisation problems. Certain interpretations may be obscured by other and better ones, certain possible realisations may be blocked by preferred other realisations. Two interpretational constraints seem directly relevant to rhetorical structure and define defaults there. The first is a principle that maximises the givenness of any element of an interpretation, called *NEW, DOAP, or *ACCOMMODATE by different authors. This creates defaults in the interpretation of rhetorical relations as indicated in the following preference order, taken from Zeevat & Jasinskaja (2007), in which relations grouped together occupy the same slot:

Reformulation > Elaboration > Explanation/Justification > Narration > Contrast/Concession

The principle also underpins the default in pivot identification: the lowest element on the right frontier that fits.

A good thing about the area of rhetorical relations is that, while schools have been formed, they have not led to divisions other than in general ideology. The Right