Logic and Interactive RAtionality
Yearbook 2012
Volume I

Editors: Zoé Christoff, Paolo Galeazzi, Nina Gierasimczuk, Alexandru Marcoci, Sonja Smets

Collecting Editors: Alexandru Baltag (Europe), Wesley Holliday (United States), Fenrong Liu (China)

The printing of this book was supported by the UP fund of Johan van Benthem. Cover design by Nina Gierasimczuk.

Foreword

It is our great pleasure to present you with the two volumes of the "Logic and Interactive Rationality" (LIRa) Yearbook 2012, this time officially a joint venture of Amsterdam-BayArea-Beijing. Following tradition, the Yearbook reflects the activities of the LIRa seminar held regularly at the ILLC in Amsterdam. We are very happy that Tsinghua University in Beijing and Stanford University and UC Berkeley in the Bay Area have joined our efforts. Submissions from China and the US show the global nature of research into interactive rationality and logical dynamics. We expect that the Yearbook can serve as a platform for future collaborations among congenial colleagues working in distant parts of the world.

We want to thank all the contributors for allowing us to include the fruits of their work. We would like to give recognition to Nina Gierasimczuk for her cover design, but also for coordinating the work on the volume. We also would like to direct our special thanks to Alexandru Marcoci for typesetting yet another edition of the LIRa Yearbook. Finally, we want to thank Johan van Benthem for sponsoring the seminar's activities, for his efforts to keep us all connected, and for being a constant source of scientific inspiration.

The Editors
Amsterdam-Berkeley-Beijing
February 21st, 2014

Contents of Volume I

1 Preface, by Johan van Benthem
2 Dynamic Epistemic Logic as a Substructural Logic, by Guillaume Aucher
3 Strategic Voting and the Logic of Knowledge, by Hans van Ditmarsch, Jérôme Lang, and Abdallah Saffidine
4 What Does it Mean to Know an Action?, by Jiahong Guo
5 What You Can Do Depends on What You Have Done, by Fengkui Ju and Li Liang
6 Public Announcements under Sheaves, by Kohei Kishida
7 Transition Semantics. The Dynamics of Dependence Logic, by Pietro Galliani
8 Dynamic Measure Logic, by Tamar Lando
9 Surprise in Probabilistic Dynamic Epistemic Logic, by Lorenz Demey
10 A Geo-logical Solution to the Lottery Paradox, with Applications to Conditional Logic, by Hanti Lin and Kevin T. Kelly
11 Changing Types: Information Dynamics for Qualitative Type Spaces, by Dominik Klein and Eric Pacuit
12 Dependent Type Semantics: An Introduction, by Daisuke Bekki
13 A Medieval Epistemic Puzzle, by Sara L. Uckelman

Preface

Johan van Benthem
University of Amsterdam, Stanford University

Rocking All Over the World

Though the song playing on the radio right now is from a band whimsically named the Status Quo, 'Rocking All Over The World' seems the perfect background when reading these exciting volumes. So many dynamic themes combining the tested rigor of logic with attunement to the latest that is happening in the international world today.

In this book, you can drift along on many currents that meet and separate all the time. There are logical analyses of social networks, pluralistic ignorance, strategic voting, and bubbles, trying to come to grips with the burning issues of society right around us. But there are also investigations into the foundations of games as played by rational but bounded agents, who must do their planning based on fallible information and beliefs that sometimes run into surprises and the unexpected, and whose interactions depend on the pervasive phenomenon of dependence. Computation and complexity are further basic themes that are never far away in many contributions, but so is human cognitive behavior, and the intricacies of our common uses of natural language, both logical and probabilistic.

But there is also a lot of methodological innovation to be found here. Mathematicians may appreciate several more technical papers that provide powerful new links with proof theory and substructural logics, as well as new forms of outreach toward category-theoretic sheaf models, probability theory, and measurement theory. And philosophers may be drawn to papers studying the entanglement of knowledge, evidence, action, learning, and ability, some of them bridging between epistemology and philosophy of science.

And if, after all this, you need a trip to quite different realms, just read the papers on dynamic logical foundations of the quantum world below us, or on medieval history: a past every bit as exciting as the future.
This Yearbook is now a production of editors from three sites some eight time zones apart: Amsterdam, Beijing, and the Bay Area. Just multiply these two figures, and you will see that the sun never sets in the realm of dynamics.

Dynamic Epistemic Logic as a Substructural Logic

Guillaume Aucher
University of Rennes 1, INRIA

Abstract

Dynamic Epistemic Logic (DEL) is an influential logical framework for reasoning about the dynamics of beliefs and knowledge. It has been related to older and more established logical frameworks but, despite these connections, DEL remains, arguably, a rather isolated logic in the vast realm of non-classical logics and modal logics. This is problematic if logic is to be viewed ultimately as a unified and unifying field and if we want to prevent DEL from "riding off madly in all directions" (a metaphor used by van Benthem about logic in general). In this article, we show that DEL can be redefined naturally and meaningfully as a two-sorted substructural logic. In fact, it is even one of the most primitive substructural logics since it does not preserve any of the structural rules. Moreover, the ternary semantics of DEL and its dynamic interpretation provide a conceptual foundation for Routley & Meyer's semantics of substructural logics.

1 Introduction

Dynamic Epistemic Logic (DEL) is an influential logical framework for reasoning about the dynamics of beliefs and knowledge, which has drawn the attention of a number of researchers ever since the seminal publication of Baltag et al. (1998). A number of contributions have linked DEL to older and more established logical frameworks: it has been embedded into (automata) PDL (van Eijck 2004, van Benthem and Kooi 2004), it has been given an algebraic semantics (Baltag et al. 2005; 2007), and it has been related to epistemic temporal logic (van Benthem et al. 2009, Aucher and Herzig 2011) and the situation calculus (van Benthem 2011a, van Ditmarsch et al. 2009). Despite these connections, DEL remains, arguably, a rather isolated logic in the vast realm of non-classical logics and modal logics.
This is problematic if logic is to be viewed ultimately as a unified and unifying field and if we want to prevent DEL from "riding off madly in all directions" (a metaphor used by van Benthem (2008; 2011b) about logic in general). In this article, we will show that DEL can be redefined naturally and meaningfully as a two-sorted substructural logic. In fact, it is even one of the most primitive substructural logics since it does not preserve any of the structural rules.

Substructural logics will also benefit from this interaction with DEL. The well-known semantics for substructural logics is based on a ternary relation introduced by Routley & Meyer for relevance logic in the 1970s (Routley and Meyer 1972a;b; 1973, Routley et al. 1982). However, the introduction of this ternary relation was originally motivated by technical reasons, and it turns out that providing a non-circular and conceptually grounded interpretation of this relation remains problematic (Beall et al. 2012). As we shall see, the ternary semantics of DEL provides a conceptual foundation for Routley & Meyer's semantics. In fact, the dynamic interpretation induced by the DEL framework turns out to be not only meaningful, but also consistent with the interpretations of this ternary relation proposed in the substructural literature.

The article is structured as follows. In Section 2 we recall the core of DEL viewed from a semantic perspective. In Section 3 we briefly recall elementary notions of relevance and substructural logics and we observe that the ternary relation of relevance logic can be interpreted as a sort of update. In Section 4 we proceed further to define a substructural language based on this idea. This substructural language extends the DEL language with operators stemming from the Lambek calculus (a substructural logic), but we show that these different substructural operators actually correspond to the DEL operators of (Aucher 2011; 2012).
This allows us to show that DEL is a (two-sorted) substructural logic. In Section 5, we conclude.

2 Dynamic Epistemic Logic

Dynamic epistemic logic (DEL) is a relatively recent non-classical logic (Baltag et al. 1998) which extends ordinary modal epistemic logic (Hintikka 1962) by the inclusion of event/action models (called Lα-models in this article) to describe actions, and a product update operator that defines how epistemic models (called L-models in this article) are updated as the consequence of executing actions described through event models (see Baltag and Moss 2004, van Ditmarsch et al. 2007, van Benthem 2011b, for more details). So, the methodology of DEL splits the task of representing the agents' beliefs and knowledge into three parts: first, one represents their beliefs/knowledge about an initial situation; second, one represents their beliefs/knowledge about an event taking place in this situation; third, one represents the way the agents update their beliefs/knowledge about the situation after (or during) the occurrence of the event. Following this methodology, we also split the exposition of the DEL framework into three sections.

2.1 Representation of the initial situation: L-model

In the rest of this article, ATM is a countable set of propositional letters called atomic facts which describe static situations, and AGT := {1, …, m} is a finite set of agents.

Definition 2.1 (Language L and L-structure). We define the language L inductively as follows:

L : ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | □_j ϕ

where p ranges over ATM and j over AGT. We define ⊥ := p ∧ ¬p for a chosen p ∈ ATM and we also define ⊤ := ¬⊥. The formula ◇_j ϕ is an abbreviation for ¬□_j ¬ϕ, the formula ϕ → ψ is an abbreviation for ¬ϕ ∨ ψ, and the formula ϕ ↔ ψ is an abbreviation for (ϕ → ψ) ∧ (ψ → ϕ). An L-structure is defined inductively as follows, with ϕ ranging over L:

X ::= ϕ | (X, X).
We abusively write ϕ ∈ X when the formula ϕ ∈ L is a substructure of X.

A (pointed) L-model (M, w) represents how the actual world represented by w is perceived by the agents. Atomic facts are used to state properties of this actual world.

Definition 2.2 (L-model). An L-model is a tuple M = (W, R_1, …, R_m, I) where:
• W is a non-empty set of possible worlds,
• R_j ⊆ W × W is an accessibility relation on W, for each j ∈ AGT,
• I : W → 2^ATM is a function assigning to each possible world a subset of ATM. The function I is called an interpretation.
We write w ∈ M for w ∈ W, and (M, w) is called a pointed L-model (w often represents the actual world). We denote by C the set of pointed L-models. If w, v ∈ W, we write wR_j v or (M, w)R_j(M, v) for (w, v) ∈ R_j, and R_j(w) denotes the set {v ∈ W | wR_j v}.

Intuitively, wR_j v means that in world w agent j considers that world v might correspond to the actual world. We now give the truth conditions of the epistemic language L, which is used to describe and state properties of L-models:

Definition 2.3 (Truth conditions of L). Let M be an L-model, w ∈ M and ϕ ∈ L. M, w ⊨ ϕ is defined inductively as follows:

M, w ⊨ p iff p ∈ I(w)
M, w ⊨ ¬ϕ iff not M, w ⊨ ϕ
M, w ⊨ ϕ ∧ ψ iff M, w ⊨ ϕ and M, w ⊨ ψ
M, w ⊨ ϕ ∨ ψ iff M, w ⊨ ϕ or M, w ⊨ ψ
M, w ⊨ □_j ϕ iff for all v ∈ R_j(w), M, v ⊨ ϕ

We write M ⊨ ϕ when M, w ⊨ ϕ for all w ∈ M, and ⊨ ϕ when for all L-models M, M ⊨ ϕ. An L-formula ϕ is said to be valid if ⊨ ϕ. We extend the scope of the relation ⊨ to also relate pointed L-models to structures:

M, w ⊨ X, Y iff M, w ⊨ X and M, w ⊨ Y

Let C be a class of pointed L-models, let X, Y be L-structures. We say that X entails Y in the class C, written X ⊨_C Y, when the following holds:

X ⊨_C Y iff for all pointed L-models (M, w) ∈ C, if M, w ⊨ ϕ for all ϕ ∈ X, then there is ψ ∈ Y such that M, w ⊨ ψ.

We also write X ⊨ Y for X ⊨_C Y, where C is the class of all pointed L-models.

The formula □_j ϕ reads as "agent j believes ϕ".
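
The truth conditions of Definition 2.3 can be sketched as a small recursive evaluator. This is only an illustration under stated assumptions: the tuple encoding of formulas, the function name holds, and the two-world toy model are hypothetical choices made here and are not part of the article. A model is given by a dictionary R from agents to sets of pairs (the relations R_j) and a dictionary I from worlds to sets of atoms (the interpretation).

```python
def holds(R, I, w, phi):
    """Return True iff M, w |= phi, where M is given by R and I (Definition 2.3).

    Formulas are nested tuples (an encoding assumed for this sketch):
    ("atom", p), ("not", f), ("and", f, g), ("or", f, g), ("box", agent, f).
    """
    op = phi[0]
    if op == "atom":
        return phi[1] in I[w]
    if op == "not":
        return not holds(R, I, w, phi[1])
    if op == "and":
        return holds(R, I, w, phi[1]) and holds(R, I, w, phi[2])
    if op == "or":
        return holds(R, I, w, phi[1]) or holds(R, I, w, phi[2])
    if op == "box":
        agent, body = phi[1], phi[2]
        # box_j f holds at w iff f holds at every v with (w, v) in R_j
        return all(holds(R, I, v, body) for (u, v) in R[agent] if u == w)
    raise ValueError(f"unknown connective: {op!r}")


# Hypothetical toy model: agent A cannot tell w from v, agent C can.
R = {
    "A": {("w", "w"), ("w", "v"), ("v", "w"), ("v", "v")},
    "C": {("w", "w"), ("v", "v")},
}
I = {"w": {"p"}, "v": set()}

p = ("atom", "p")
not_box_A_p = ("not", ("box", "A", p))

print(holds(R, I, "w", ("and", p, not_box_A_p)))      # True: p holds, A does not believe p
print(holds(R, I, "w", ("box", "C", not_box_A_p)))    # True: C believes that A does not believe p
```

The two printed checks have the same shape as a formula of the form (p ∧ ¬□_A p) ∧ □_C ¬□_A p evaluated at a pointed model.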
The truth conditions of □_j ϕ are defined in such a way that "agent j believes ϕ" is true in a possible world when ϕ holds in all the worlds agent j considers possible.

Example 1. Assume that agents A, B and C play a card game with three cards: a white one, a red one and a blue one. Each of them has a single card but they do not know the cards of the other players. At each step of the game, some of the players show their/her/his card to another player or to both other players, either privately or publicly. We want to study and represent the dynamics of the agents' beliefs/knowledge in this game. The initial situation is represented by the pointed L-model (M, w) of Figure 1. In this example, AGT := {A, B, C} and ATM := {r_j, b_j, w_j | j ∈ AGT} where r_j stands for 'agent j has the red card', b_j stands for 'agent j has the blue card' and w_j stands for 'agent j has the white card'. The boxed possible world corresponds to the actual world. The propositional letters not mentioned in the possible worlds do not hold in these possible worlds. The accessibility relations are represented by arrows indexed by agents between possible worlds. Reflexive arrows are omitted in the figure, which means that for all worlds v ∈ M and all agents j ∈ AGT, v ∈ R_j(v).

Figure 1: Cards Example

In this model, we have for example the following statement:

M, w ⊨ (w_B ∧ ¬□_A w_B) ∧ □_C ¬□_A w_B.

It states that player A does not 'know' that player B has the white card and player C 'knows' it.

2.2 Representation of the event: Lα-model

The language Lα was introduced by Baltag et al. (1999). The propositional letters p_ψ describing events are called atomic events and range over ATMα = {p_ψ | ψ ranges over L}. The reading of p_ψ is "an event of precondition ψ is occurring".

Definition 2.4 (Language Lα and Lα-structure).
We define the language Lα inductively as follows:

Lα : α ::= p_ψ | ¬α | α ∧ α | α ∨ α | □_j α

where ψ ranges over L and j over AGT. We define ⊥ := p_ψ ∧ ¬p_ψ for a chosen ψ ∈ L and we define ⊤ := ¬⊥. The formula ◇_j α is an abbreviation for ¬□_j ¬α, the formula α → β is an abbreviation for ¬α ∨ β, and the formula α ↔ β is an abbreviation for (α → β) ∧ (β → α). An Lα-structure is defined inductively as follows, with β ranging over Lα:

Sα : Xα ::= β | (Xα, Xα)

We abusively write α ∈ Xα when the formula α ∈ Lα is a substructure of Xα.

A pointed Lα-model (E, e) represents how the actual event represented by e is perceived by the agents. Intuitively, f ∈ R_j(e) means that while the possible event represented by e is occurring, agent j considers it possible that the possible event represented by f is actually occurring.

Definition 2.5 (Lα-model, Baltag et al. 1998). An Lα-model is a tuple E = (Wα, R_1, …, R_m, I) where:
• Wα is a non-empty set of possible events,
• R_j ⊆ Wα × Wα is an accessibility relation on Wα, for each j ∈ AGT,
• I : Wα → L is a function assigning to each possible event a formula of L. The function I is called the precondition function.
Let P be a subset of L. A P-complete Lα-model is an Lα-model which moreover satisfies the following condition:
• I(e) ∈ P, for each e ∈ Wα. (P-complete)
We write e ∈ E for e ∈ Wα, and (E, e) is called a pointed Lα-model (e often represents the actual event). We denote by Cα the set of pointed Lα-models, and by Cα^P the set of pointed P-complete event models. If e, f ∈ Wα, we write eR_j f or (E, e)R_j(E, f) for (e, f) ∈ R_j, and R_j(e) denotes the set {f ∈ Wα | eR_j f}.

The truth conditions of the language Lα are identical to the truth conditions of the language L:

Definition 2.6 (Truth conditions of Lα). Let E be an Lα-model, e ∈ E and α ∈ Lα.
E, e ⊨ α is defined inductively as follows:

E, e ⊨ p_ψ iff I(e) = ψ
E, e ⊨ ¬α iff not E, e ⊨ α
E, e ⊨ α ∧ β iff E, e ⊨ α and E, e ⊨ β
E, e ⊨ α ∨ β iff E, e ⊨ α or E, e ⊨ β
E, e ⊨ □_j α iff for all f ∈ R_j(e), E, f ⊨ α

Let C be a class of pointed Lα-models, let Xα, Yα be Lα-structures. We say that Xα entails Yα in the class C, written Xα ⊨_C Yα, when the following holds:

Xα ⊨_C Yα iff for all pointed Lα-models (E, e) ∈ C, if E, e ⊨ α for all α ∈ Xα, then there is β ∈ Yα such that E, e ⊨ β.

We also write Xα ⊨ Yα for Xα ⊨_Cα Yα, where Cα is the class of all pointed Lα-models.

Figure 2: Players A and B show their cards to each other in front of player C

Example 2. Let us resume Example 1 and assume that players A and B show their card to each other. As it turns out, C noticed that A showed her card to B but did not notice that B did so to A. Players A and B know this. This event is represented in the Lα-model (E, e) of Figure 2. The boxed possible event e corresponds to the actual event 'player A shows her red card' (with precondition r_A), f stands for the event 'player A shows her white card' (with precondition w_A) and g stands for the atomic event 'players A and B show their red and white cards respectively to each other' (with precondition r_A ∧ w_A). The following statement holds in the example of Figure 2:

E, e ⊨ p_{r_A} ∧ ◇_A p_{r_A} ∧ □_A p_{r_A} ∧ ◇_B p_{r_A} ∧ □_B p_{r_A} ∧ ◇_C p_{w_A} ∧ ◇_C p_{r_A ∧ w_A} ∧ □_C (p_{w_A} ∨ p_{r_A ∧ w_A}). (1)

It states that players A and B show their cards to each other, players A and B 'know' this and consider it possible, while player C considers possible that player A shows her white card and also considers possible that player A shows her red card, since he does not know her card. In fact, that is all that player C considers possible since he believes that either player A shows her red card or her white card.

2.3 Update of the initial situation by the event: Product Update

The DEL product update of Baltag et al. (1998) is defined as follows.
This update yields a new L-model (M, w) ⊗ (E, e) representing how the new situation, which was previously represented by (M, w), is perceived by the agents after the occurrence of the event represented by (E, e).

Definition 2.7 (Product Update). Let (M, w) = (W, R_1, …, R_m, I, w) be a pointed L-model and let (E, e) = (Wα, R_1, …, R_m, I, e) be a pointed Lα-model such that M, w ⊨ I(e). The product update of (M, w) and (E, e) is the pointed L-model (M ⊗ E, (w, e)) = (W^⊗, R^⊗_1, …, R^⊗_m, I^⊗, (w, e)) defined as follows: for all v ∈ W and all f ∈ Wα,
• W^⊗ = {(v, f) ∈ W × Wα | M, v ⊨ I(f)},
• R^⊗_j(v, f) = {(u, g) ∈ W^⊗ | u ∈ R_j(v) and g ∈ R_j(f)},
• I^⊗(v, f) = I(v).

Figure 3: Situation after the update of the situation represented in Figure 1 by the event represented in Figure 2

Example 3. As a result of the event described in Example 2, the agents update their beliefs. We get the situation represented in the L-model (M, w) ⊗ (E, e) of Figure 3. In this L-model, we have for example the following statement:

(M, w) ⊗ (E, e) ⊨ (w_B ∧ □_A w_B) ∧ □_C ¬□_A w_B.

It states that player A 'knows' that player B has the white card but player C believes that it is not the case.

3 Substructural logics

Substructural logics are a family of logics lacking some of the structural rules of classical logic. A structural rule is a rule of inference which is closed under substitution of formulas. We shall see in this article that DEL invalidates all of them.

3.1 A substructural language

Our exposition of substructural logics is based on (Restall 2000; 2006, Dunn and Restall 2002). The logical framework presented in (Restall 2000) is much more general and studies a wide range of substructural logics: relevance logic, linear logic, Lambek calculus, display logic, etc.
For the purposes of this article, we will only introduce a fragment of this general framework. The semantics of this fragment is based on the ternary relation of the frame semantics for relevant logic originally introduced by Routley & Meyer (Routley and Meyer 1972a;b; 1973, Routley et al. 1982).

Definition 3.1 (Language LSub and LSub-structure). The language LSub is defined inductively as follows:

LSub : ϕ ::= ⊤ | ⊥ | p | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | □ϕ | ϕ ⊃ ϕ | ϕ ⊂ ϕ | ϕ ◦ ϕ

where p ranges over ATM. An LSub-structure is defined inductively as follows, with ϕ ranging over LSub:

X ::= ϕ | (X, X) | (X; X)

Definition 3.2 (Point set, plump accessibility relation). A point set P = (P, ⊑) is a set P together with a partial order ⊑ on P. The set Prop(P) of propositions on P is the set of all subsets X of P which are closed upwards: that is, if x ∈ X and x ⊑ x′ then x′ ∈ X. We abusively write x ∈ P for x ∈ P.
• A binary relation S is a positive two-place accessibility relation on the point set P iff for any x, y ∈ P where xSy, if x′ ⊑ x then there is a y′ ⊒ y, where x′Sy′. Similarly, if xSy and y ⊑ y′ then there is some x′ ⊑ x, where x′Sy′.
• A ternary relation R is a three-place accessibility relation iff whenever Rxyz and z ⊑ z′ then there are y′ ⊒ y and x′ ⊑ x, where Rx′y′z′. Similarly, if x′ ⊑ x then there are y′ ⊑ y and z′ ⊒ z, where Rx′y′z′, and if y′ ⊑ y then there are x′ ⊑ x and z′ ⊒ z, where Rx′y′z′.
• A ternary relation R is a plump accessibility relation on the point set P if and only if for any x, y, z, x′, y′, z′ ∈ P such that Rxyz, if x′ ⊑ x, y′ ⊑ y and z ⊑ z′, then Rx′y′z′.

Our definition of LSub-model corresponds to the definition of a model in (Restall 2000, Chap. 11) stripped of all its truth sets. These other features are not needed for what concerns us here.

Definition 3.3 (LSub-model).
An LSub-model is a tuple MR = (P, S, R, I) where:
• P = (P, ⊑) is a point set;
• S ⊆ P × P is a positive two-place accessibility relation on P;
• R ⊆ P × P × P is a three-place accessibility relation on P;
• I : P → 2^ATM is an interpretation function.
We abusively write x ∈ MR for x ∈ P, and (MR, x) is called a pointed LSub-model.

Note that in the above definition, there could be multiple positive two-place accessibility relations S_1, …, S_n corresponding to multiple modalities □_1, …, □_n. We refrain from defining LSub-models in their full generality in order to ease the readability of the article.

Definition 3.4 (Truth conditions of LSub). Let MR be an LSub-model, x ∈ MR and ϕ ∈ LSub. The relation MR, x ⊨ ϕ is defined inductively as follows:

MR, x ⊨ ⊤ always
MR, x ⊨ ⊥ never
MR, x ⊨ p iff p ∈ I(x)
MR, x ⊨ ¬ϕ iff not MR, x ⊨ ϕ
MR, x ⊨ ϕ ∧ ψ iff MR, x ⊨ ϕ and MR, x ⊨ ψ
MR, x ⊨ ϕ ∨ ψ iff MR, x ⊨ ϕ or MR, x ⊨ ψ
MR, x ⊨ □ϕ iff for all y ∈ MR, where xSy, MR, y ⊨ ϕ
MR, x ⊨ ϕ ⊃ ψ iff for all y, z ∈ P where Rxyz, if MR, y ⊨ ϕ then MR, z ⊨ ψ
MR, x ⊨ ψ ⊂ ϕ iff for all y, z ∈ P where Ryxz, if MR, y ⊨ ϕ then MR, z ⊨ ψ
MR, x ⊨ ϕ ◦ ψ iff there are y, z ∈ P such that Ryzx, MR, y ⊨ ϕ and MR, z ⊨ ψ

We extend the scope of the relation ⊨ to also relate points to LSub-structures:

MR, x ⊨ X, Y iff MR, x ⊨ X and MR, x ⊨ Y
MR, x ⊨ X; Y iff there are y, z ∈ MR such that Ryzx, MR, y ⊨ X and MR, z ⊨ Y

We say that MR validates an LSub-structure X when for all x ∈ MR, MR, x ⊨ X. Let X be a structure and let ϕ ∈ LSub. We say that X entails ϕ, written X ⊨ ϕ, when the following holds:

X ⊨ ϕ iff for all pointed LSub-models (MR, x), if MR, x ⊨ X, then MR, x ⊨ ϕ.

3.2 Updates as ternary relations

The ternary relation R of the Routley & Meyer semantics was introduced originally for technical reasons: any 2-ary (n-ary) connective of a logical language can be given a semantics by resorting to a 3-ary (resp. n + 1-ary) relation on worlds.
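
How a ternary relation gives meaning to the binary connectives ⊃, ⊂ and ◦ in Definition 3.4 can be sketched in a few lines of Python. The sketch rests on simplifying assumptions that are not in the article: the frame is finite, the order on points is taken to be discrete (so the plumpness conditions play no role), and only atoms and the three binary connectives are handled. The tuple encoding, the name sat, and the one-triple frame are hypothetical.

```python
def sat(R, val, x, phi):
    """Return True iff the point x satisfies phi, for the binary connectives of Definition 3.4.

    R is a set of triples (x, y, z) standing for Rxyz; val maps points to sets of atoms.
    Formulas (an encoding assumed for this sketch):
    ("atom", p), ("imp", f, g) for f ⊃ g, ("pmi", g, f) for g ⊂ f, ("fuse", f, g) for f ◦ g.
    """
    op = phi[0]
    if op == "atom":
        return phi[1] in val[x]
    if op == "imp":
        # x |= f ⊃ g iff for all y, z with Rxyz: y |= f implies z |= g
        return all(sat(R, val, z, phi[2])
                   for (a, y, z) in R if a == x and sat(R, val, y, phi[1]))
    if op == "pmi":
        # x |= g ⊂ f iff for all y, z with Ryxz: y |= f implies z |= g
        return all(sat(R, val, z, phi[1])
                   for (y, a, z) in R if a == x and sat(R, val, y, phi[2]))
    if op == "fuse":
        # x |= f ◦ g iff there are y, z with Ryzx, y |= f and z |= g
        return any(sat(R, val, y, phi[1]) and sat(R, val, z, phi[2])
                   for (y, z, a) in R if a == x)
    raise ValueError(f"unknown connective: {op!r}")


# Hypothetical frame with a single triple R(s, e, t).
R = {("s", "e", "t")}
val = {"s": {"p"}, "e": {"a"}, "t": {"q"}}
p, a, q = ("atom", "p"), ("atom", "a"), ("atom", "q")

print(sat(R, val, "s", ("imp", a, q)))     # True
print(sat(R, val, "s", ("imp", a, p)))     # False
print(sat(R, val, "t", ("fuse", p, a)))    # True
print(sat(R, val, "e", ("pmi", q, p)))     # True
```

The three connectives differ only in which of the three positions of R is occupied by the point of evaluation.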
Subsequently, a number of philosophical interpretations of this ternary relation have been proposed and we will briefly recall some of them at the end of this section (see Beall et al. 2012, Restall 2006, Mares and Meyer 2001, for more details). However, one has to admit that providing a non-circular and conceptually grounded interpretation of this relation remains problematic. In this article, we propose a new dynamic interpretation of this relation, inspired by the ternary semantics of DEL.

First, one should observe that the DEL product update ⊗ of Definition 2.7 can be seen as a partial function from a pair of pointed L-model and pointed Lα-model to another pointed L-model:

⊗ : C × Cα → C. (2)

There is a formal similarity between this abstract definition of the DEL product update and the function ⊔ introduced by Urquhart in the early 1970s for providing a semantics to the implication of relevance logic. This similarity is not only formal but also intuitively meaningful. Indeed, the intuitive interpretation of the DEL product update operator is very similar to the intuitive interpretation of the function ⊔ of Urquhart. Points are sometimes also called "worlds", "states", "situations", "set-ups", and as explained by Restall:

"We have a class of points (over which x and y vary), and a function ⊔ which gives us new points from old. The point x ⊔ y is supposed, on Urquhart's interpretation, to be the body of information given by combining x with y." (Restall 2006, p. 363)

and also,

"To be committed to A ⊃ B is to be committed to B whenever we gain the information that A. To put it another way, a body of information warrants A ⊃ B if and only if whenever you update that information with new information which warrants A, the resulting (perhaps new) body of information warrants B." (my emphasis) (Restall 2006, p. 362)

From these two quotes, it is natural to interpret the DEL product update ⊗ of Definition 2.7 as a specific kind of Urquhart's function ⊔.
Moreover, as explained by Restall, this substructural "update" can be nonmonotonic and may correspond to some sort of revision:

"[C]ombination is sometimes nonmonotonic in a natural sense. Sometimes when a body of information is combined with another body of information, some of the original body of information might be lost. This is simplest to see in the case motivating the failure of A ⊢ B ⊃ A. A body of information might tell us that A. However, when we combine it with something which tells us B, the resulting body of information might no longer warrant A (as A might [conflict] with B). Combination might not simply result in the addition of information. It may well warrant its revision." (my emphasis) (Restall 2006, p. 363)

Our dynamic interpretation of the ternary relation is consistent with the above considerations: sometimes, updating beliefs amounts to revising beliefs. As it turns out, belief revision has also been extensively studied within the DEL framework and DEL has been extended to deal with this phenomenon (Aucher 2004, van Ditmarsch 2005, van Benthem 2007a, Baltag and Smets 2008a;b, Liu 2008, Aucher 2008).

More generally, an update can be seen as a partial function from a pair of pointed L-model and pointed Lα-model to a set of pointed L-models:

⊗ : C × Cα → P(C) (3)

Equivalently, an update can be seen as a ternary relation R defined on C ∪ Cα between three pointed models ((M, w), (E, e), (M_f, w_f)) where (M, w) is a pointed L-model, (E, e) is a pointed Lα-model and (M_f, w_f) is another pointed L-model:

R ⊆ C × Cα × C (4)

The ternary relation of Equation (4) then resembles the ternary relation of the Routley & Meyer semantics. This is not surprising since the Routley & Meyer semantics generalizes the Urquhart semantics (they are essentially the same, since as we explained in the previous section, an operational frame is a Routley & Meyer frame where Rxyz holds if and only if x ⊔ y = z).
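
To make the functional and relational views of Equations (2) to (4) concrete, the following Python sketch computes the product update of Definition 2.7 on finite models and wraps it as a partial function on pointed models. It is a minimal illustration under stated assumptions: preconditions are restricted to propositional conditions and are represented as Python predicates over the set of atoms true at a world (a stand-in for M, v ⊨ I(f)), and the names product_update and update_pointed as well as the toy models are hypothetical.

```python
from itertools import product


def product_update(W, R, I, Wa, Ra, pre):
    """Product update of Definition 2.7 on finite models.

    W, Wa: sets of worlds / events; R, Ra: agent -> set of pairs;
    I: world -> set of atoms; pre: event -> predicate on a set of atoms
    (assumption of this sketch: preconditions are propositional).
    """
    # W⊗: pairs (v, f) such that the precondition of f holds at v
    W2 = {(v, f) for v, f in product(W, Wa) if pre[f](I[v])}
    # R⊗_j: componentwise accessibility, restricted to surviving pairs
    R2 = {j: {((v, f), (u, g)) for (v, f) in W2 for (u, g) in W2
              if (v, u) in R[j] and (f, g) in Ra[j]}
          for j in R}
    # I⊗: atomic facts are inherited from the world component
    I2 = {(v, f): I[v] for (v, f) in W2}
    return W2, R2, I2


def update_pointed(model, w, event_model, e):
    """Partial function of Equation (2): returns None when the update is undefined."""
    W, R, I = model
    Wa, Ra, pre = event_model
    if not pre[e](I[w]):
        return None
    return product_update(W, R, I, Wa, Ra, pre), (w, e)


# Hypothetical example: agent A cannot tell w (p true) from v (p false);
# a single event e with precondition p occurs.
W = {"w", "v"}
R = {"A": {(x, y) for x in W for y in W}}
I = {"w": {"p"}, "v": set()}
Wa = {"e"}
Ra = {"A": {("e", "e")}}
pre = {"e": lambda atoms: "p" in atoms}

(W2, R2, I2), point = update_pointed((W, R, I), "w", (Wa, Ra, pre), "e")
print(W2, point)                                             # only the pair ("w", "e") survives
print(update_pointed((W, R, I), "v", (Wa, Ra, pre), "e"))    # None: precondition fails at v
```

Each call of update_pointed that returns a result corresponds to one triple of the relation in Equation (4): the input pointed model, the pointed event model, and the resulting pointed model. The None case is what makes the update a partial function.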
Viewed from the perspective of DEL, the ternary relation then represents a particular sort of update. With this interpretation in mind, Rxyz reads as 'the occurrence of event y in world x results in the world z' and the corresponding conditional α ⊃ ϕ reads as 'the occurrence in the current world of an event satisfying property α results in a world satisfying ϕ'.

The dynamic reading of the ternary relation and its corresponding conditional is very much in line with the so-called "Ramsey Test" of conditional logic. The Ramsey test can be viewed as the very first modern contribution to the logical study of conditionals and much of the contemporary work on conditional logic can be traced back to the famous footnote of Ramsey (1929). Roughly, it consists in defining a counterfactual conditional in terms of belief revision: an agent currently believes that ϕ would be true if ψ were true (i.e. ψ ⊃ ϕ) if and only if he should believe ϕ after learning ψ. A first attempt to provide truth conditions for conditionals, based on Ramsey's ideas, was proposed by Stalnaker. He defined his semantics by means of selection functions over possible worlds f : W × 2^W → W. As one can easily notice, Stalnaker's selection functions could also be considered from a formal point of view as a special kind of ternary relation, since a relation R_f ⊆ W × 2^W × W can be canonically associated to each selection function f. Moreover, like the ternary relation corresponding to a product update (Equation (4)), this ternary relation is 'two-sorted': the antecedent of a conditional takes value in a set of worlds (instead of a single world).¹ So, the dynamic reading of the ternary semantics is consistent with the dynamic reading of conditionals proposed by Ramsey.

This dynamic reading was not really considered and investigated by substructural logicians when they connected the substructural ternary semantics with conditional logic (Beall et al. 2012).
On the other hand, the dynamic reading of inferences has been stressed to a large extent by van Benthem (2007b; 2011b) (we will come back to this point in Section 4.2), and also by Baltag and Smets who distinguished dynamic belief revision from static (standard) belief revision (Baltag and Smets 2006; 2008a;b). What distinguishes dynamic belief revision from static belief revision is that the latter is a revision of the agent's beliefs about the state of the world as it was before an event, and the former is a revision of the agent's beliefs about the state of the world as it is after the event. Note, however, that this important distinction between static belief revision and dynamic belief revision collapses in the case of relevant logic, because in that case we only deal with propositional formulas. This shows again that a dynamic interpretation of the ternary semantics of substructural logic is consistent with the interpretations proposed by substructural logicians.

To summarize our discussion, the DEL product update provides substructural logics with an intuitive and consistent interpretation of its ternary relation. This interpretation is consistent in the sense that the intuitions underlying the definitions of the DEL framework are coherent with those underlying the ternary semantics of substructural logic, as witnessed by our quotes and citations from the substructural literature.

¹ Note that Burgess (1981) already proposed a ternary semantics for conditionals, but his truth conditions and his interpretation of the ternary relation were quite different from ours.

Other interpretations of the ternary relation

One interpretation, due to Barwise (1993) and developed by Restall (1996), takes worlds to be 'sites' or 'channels', a site being possibly a channel and a channel being possibly a site. If x, y and z are sites, Rxyz reads as 'x is a channel between y and z'. Hence, if ϕ ⊃ ψ is true at channel
x, it means that all sites y and z connected by channel x are such that if ϕ is information available in y, then ψ is information available in z. Another similar interpretation, due to Mares (1996), adapts Perry and Israel (1990)’s theory of information to the relational semantics. In this interpretation, worlds are situations in the sense of Barwise and Perry (1983)’s situation semantics and pieces of information – called infons – can carry information about other infons: an infon might carry the information that a red light on a mobile phone carries the information that the battery of the mobile phone is low. In this interpretation, the ternary relation R represents the informational links in situations: if there is an informational link in situation x that says that an infon σ carries the information that the infon π also holds, then if Rxyz holds and y contains the infon σ, then z contains the infon π. Other interpretations of the ternary relation have been proposed by Beall et al. (2012), with a particular focus on their relation to conditionality.

4 DEL is a substructural logic

In this section, we will extend the languages L and L_α of Section 2 with the substructural operators ◦, ⊃ and ⊂. We will also provide a substructural semantics for this language based on the idea of viewing an update as the ternary relation of a substructural frame (L_Sub-model). This idea is motivated and intuitively grounded in the analysis of the previous section.

4.1 An extended DEL language

Our language extends both the language L and the language L_α of Section 2. Like our semantics, it is two-sorted: it contains both formulas of L and formulas of L_α.

Definition 4.1 (Language L_R). The language L_R is two-sorted and is defined by a double induction as follows:

L_R^1 : ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | □_j ϕ | α ⊃ ϕ | ϕ ◦ α
L_R^2 : α ::= p_ψ | ¬α | α ∧ α | α ∨ α | □_j α | ϕ ⊂ ϕ

where p ranges over ATM, ψ ranges over L_R^1 and j over AGT.
The abbreviations ϕ → ψ, ϕ ↔ ψ and α → β, α ↔ β are defined as in Definitions 2.1 and 2.4.

Definition 4.2 (L_R-structure and L_R-sequent). The L_R-structures are defined inductively as follows:

S_1 : X ::= ϕ | (X, X) | (X; X_α)
S_2 : X ::= ϕ | (X, X)

where ϕ ranges over L_R and X_α ranges over L_α-structures. An L_R-sequent is an L_α-sequent or an expression of the form X ⊢ Y, where X ∈ S_1, Y ∈ S_2.

Definition 4.3 (DEL product update model). The DEL product update model is the tuple M_⊗ = (P, R_1, ..., R_m, R_⊗, I) where:

• P := (C ∪ C_α, ≃), where ≃ is the bisimilarity relation;
• R_j ⊆ P × P is a positive two-place accessibility relation on P for each j ∈ AGT such that for all x, y ∈ P, where x = (M_x, w_x) and y = (M_y, w_y): x ∈ R_j(y) iff M_x = M_y and w_x ∈ R_j(w_y);
• R_⊗ := {(x, y, z) ∈ C × C_α × C | x ⊗ y = z} is a plump ternary relation on P;
• I(x) := I(x), for all x ∈ C ∪ C_α.

The DEL product update model is an L_Sub-model where points are pointed L-models and pointed L_α-models. The ternary relation R_⊗ is defined and motivated by the explanations of the previous section. Note that the accessibility relations R_j of L-models and L_α-models are seen in this definition as positive two-place accessibility relations R_j. The truth conditions are the same as the ones for L_R-models:

Definition 4.4 (Truth conditions of L_R). Let M_⊗ be the DEL product update model, x ∈ M_⊗ and ϕ ∈ L_R.
The relation M_⊗, x ⊨ ϕ is defined inductively as follows:

M_⊗, x ⊨ p iff p ∈ I(x)
M_⊗, x ⊨ ¬ϕ iff not M_⊗, x ⊨ ϕ
M_⊗, x ⊨ ϕ ∧ ψ iff M_⊗, x ⊨ ϕ and M_⊗, x ⊨ ψ
M_⊗, x ⊨ ϕ ∨ ψ iff M_⊗, x ⊨ ϕ or M_⊗, x ⊨ ψ
M_⊗, x ⊨ □_j ϕ iff for all y ∈ P such that x R_j y, M_⊗, y ⊨ ϕ
M_⊗, x ⊨ α ⊃ ψ iff for all y, z ∈ P such that R_⊗ xyz, if M_⊗, y ⊨ α then M_⊗, z ⊨ ψ
M_⊗, x ⊨ ψ ⊂ ϕ iff for all y, z ∈ P such that R_⊗ yxz, if M_⊗, y ⊨ ϕ then M_⊗, z ⊨ ψ
M_⊗, x ⊨ ϕ ◦ α iff there are y, z ∈ P such that R_⊗ yzx, M_⊗, y ⊨ ϕ and M_⊗, z ⊨ α

We extend the scope of the relation ⊨ to also relate points to L_R-structures:

M_⊗, x ⊨ X, Y iff M_⊗, x ⊨ X and M_⊗, x ⊨ Y
M_⊗, x ⊨ X; Y iff there are y, z ∈ P such that R_⊗ yzx, M_⊗, y ⊨ X and M_⊗, z ⊨ Y

Let C′ ⊆ C ∪ C_α be a class of pointed L-models or L_α-models, and let X ⊢ ϕ be an L_R-sequent. We say that X entails ϕ in the class C′, written X ⊢_C′ ϕ, when the following holds:

X ⊢_C′ ϕ iff for all x ∈ C′, if M_⊗, x ⊨ X then M_⊗, x ⊨ ϕ.

We also write X ⊢ ϕ for X ⊢_(C ∪ C_α) ϕ.

4.2 DEL operators are substructural operators

In this section, we will show that the DEL operators introduced in (Aucher 2011; 2012) correspond to the substructural operators ◦, ⊃ and ⊂.

Recently, van Benthem (2010) again expressed some worries about interpreting the Lambek Calculus (the paradigmatic substructural logic) as a base logic of information flow while trying to connect the operators ◦, ⊃ and ⊂ of substructural logic to some sort of DEL operators. Indeed, the DEL operators usually rely on the regular algebra of sequential composition, choice and iteration, which are of a quite different nature. Recently, I introduced some DEL operators called progression, regression and epistemic planning (Aucher 2011; 2012), the operator of regression being a natural generalization of the standard and original action modality [E, e]ϕ of DEL (Baltag et al. 1998). It turns out that these operators can all be identified with connectives of the substructural language L_R.
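The truth conditions of Definition 4.4 for ◦, ⊃ and ⊂ can be checked mechanically on a finite frame. The sketch below is only an illustration under stated assumptions: formulas are encoded as nested tuples, the tags `"imp"`, `"pmi"` and `"fuse"` are names invented here for ⊃, ⊂ and ◦, the modalities □_j are left out, and the three-point frame is a hypothetical toy example rather than the DEL product update model itself.

```python
def holds(R, val, x, f):
    """Evaluate formula f (a nested tuple) at point x of a finite ternary frame.

    R is a set of triples (x, y, z); val maps each point to its set of atoms.
    """
    op = f[0]
    if op == "atom":
        return f[1] in val.get(x, set())
    if op == "not":
        return not holds(R, val, x, f[1])
    if op == "and":
        return holds(R, val, x, f[1]) and holds(R, val, x, f[2])
    if op == "or":
        return holds(R, val, x, f[1]) or holds(R, val, x, f[2])
    if op == "imp":   # ("imp", alpha, psi) encodes alpha ⊃ psi
        return all(holds(R, val, z, f[2])
                   for (a, y, z) in R if a == x and holds(R, val, y, f[1]))
    if op == "pmi":   # ("pmi", psi, phi) encodes psi ⊂ phi
        return all(holds(R, val, z, f[1])
                   for (y, b, z) in R if b == x and holds(R, val, y, f[2]))
    if op == "fuse":  # ("fuse", phi, alpha) encodes phi ◦ alpha
        return any(holds(R, val, y, f[1]) and holds(R, val, z, f[2])
                   for (y, z, c) in R if c == x)
    raise ValueError(f"unknown connective: {op!r}")


# Hypothetical three-point frame: a state "w", an event "e" and the
# resulting state "we", related by the single triple (w, e, we).
R = {("w", "e", "we")}
val = {"w": {"p"}, "e": {"a"}, "we": {"q"}}
p, a, q = ("atom", "p"), ("atom", "a"), ("atom", "q")
```

In this toy frame, `holds(R, val, "we", ("fuse", p, a))` checks the updated state, `holds(R, val, "w", ("imp", a, q))` checks the initial state, and `holds(R, val, "e", ("pmi", q, p))` checks the event, mirroring the three positions of the ternary relation.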
We first briefly recall their definitions below and then we give our correspondence results between the two kinds of operators.

Progression

The operator of progression is denoted ⊗ in (Aucher 2011). In (Aucher 2012, Def. 41), a constructive definition of this operator is provided using characteristic formulas (called “Kit Fine” formulas). Here, we provide an alternative and non-constructive definition of the progression of ϕ by α, denoted ϕ ⊗ α:

Theorem 1. Let (M_f, w_f) be a pointed L-model and let ϕ ∈ L and α ∈ L_α. Then,

M_f, w_f ⊨ ϕ ⊗ α iff there is a pointed L-model (M, w) and a pointed L_α-model (E, e) such that (M, w) ⊗ (E, e) ≃ (M_f, w_f) (that is, the two are bisimilar), M, w ⊨ ϕ and E, e ⊨ α.

Proof. It follows from Lemmata 43 and 44 of (Aucher 2011).

Epistemic planning

The operator of epistemic planning is denoted ⟨⟩_P in (Aucher 2012). It is defined relative to a finite set P of formulas/preconditions/atomic events. In (Aucher 2012, Def. 14–15), a constructive definition of this operator is provided using characteristic formulas (called “Kit Fine” formulas). As it turns out, an alternative and non-constructive definition of the epistemic planning from ϕ to ϕ_f, denoted ϕ ⟨⟩_P ϕ_f, exists as well:

Theorem 2 (Aucher 2012). Let ϕ, ϕ_f ∈ L and let P be a finite subset of L. Then, for all P-complete L_α-models (E, e), it holds that

E, e ⊨ ϕ ⟨⟩_P ϕ_f iff there is (M, w) such that M, w ⊨ ϕ, M, w ⊨ I(e) and (M, w) ⊗ (E, e) ⊨ ϕ_f.

The dual of the operator ϕ ⟨⟩_P ϕ_f is defined by:

ϕ []_P ϕ_f := ¬(ϕ ⟨⟩_P ¬ϕ_f). (5)

Theorem 2 entails that ϕ []_P ϕ_f can be alternatively defined as follows: for all P-complete L_α-models (E, e), it holds that

E, e ⊨ ϕ []_P ϕ_f iff for all (M, w) such that M, w ⊨ ϕ, if M, w ⊨ I(e) then (M, w) ⊗ (E, e) ⊨ ϕ_f. (6)

Example 4. In the situation depicted in the L-model of Figure 1, agent B does not know that agent A has the red card and does not know that agent C has the blue card: M, w ⊨ (◇_B r_A ∧ ◇_B ¬r_A) ∧ (◇_B b_C ∧ ◇_B ¬b_C).
Our problem is therefore the following: What sufficient and necessary property (i.e. ‘minimal’ property) should an event fulfill so that its occurrence in the initial situation (M, w) results in a situation where agent B knows the true state of the world, i.e. agent B knows that agent A has the red card and that agent C has the blue card? The answer to this question obviously depends on the kind of atomic events we consider. In this example, the events P = {p_{b_C}, p_{r_A}, p_{w_B}} under consideration are the following. First, agent C shows her blue card (p_{b_C}), second, agent A shows her red card (p_{r_A}), and third, agent B herself shows her white card (p_{w_B}). Answering this question amounts to computing the formula (M, w) ⟨⟩_P □_B (r_A ∧ b_C ∧ w_B). Applying the algorithm of (Aucher 2012, Definition 15), we obtain that

(M, w) ⟨⟩_P □_B (r_A ∧ b_C ∧ w_B) ↔ □_B (p_{b_C} ∨ p_{r_A})

is valid. In other words, this result states that agent B should believe either that agent A shows her red card or that agent C shows her blue card in order to know the true state of the world. Indeed, since there are only three different cards which are known by the agents and agent B already knows her card, if she learns the card of (at least) one of the other agents, she will also be able to infer the card of the third agent.

Regression

The operator of regression is denoted ⟨⟩ in (Aucher 2011). In (Aucher 2012, Def. 41), a constructive definition of this operator is provided using characteristic formulas (called “Kit Fine” formulas) by adapting and translating the reduction axioms of Baltag et al. (1998). As it turns out, an alternative and non-constructive definition of the regression of ϕ_f by α, denoted α ⟨⟩ ϕ_f, exists as well:

Theorem 3. Let α ∈ L_α and ϕ_f ∈ L. Then, for all pointed L-models (M, w), it holds that

M, w ⊨ α ⟨⟩ ϕ_f iff there is (E, e) such that E, e ⊨ α, M, w ⊨ I(e) and (M, w) ⊗ (E, e) ⊨ ϕ_f.

Note that we could define a dual operator of α ⟨⟩ ϕ_f as follows:

α [] ϕ_f = ¬(α ⟨⟩ ¬ϕ_f).
(7)

Then, the counterpart of Theorem 3 for this dual operator is as follows:

M, w ⊨ α [] ϕ_f iff for all (E, e) such that E, e ⊨ α, if M, w ⊨ I(e) then (M, w) ⊗ (E, e) ⊨ ϕ_f. (8)

As shown in (Aucher 2012, Sec. 6), the operator α [] ϕ_f is a generalization of the original and more standard DEL operator [E, e]ϕ almost exclusively used in the DEL literature (Baltag et al. 1998).

Correspondence between DEL and substructural operators

As one can easily notice, there is a strong similarity between the operations of progression, epistemic planning and regression and the operations of substructural logic, more precisely of the Lambek Calculus. In fact, there exists a rigorous mapping between them, as the following theorem shows:

Theorem 4. Let P be a finite subset of L, let x = (M, w) ∈ C and let y = (E, e) ∈ C_α^P be a P-complete pointed event model. Let ϕ, ψ ∈ L and let α ∈ L_α. Then,

M_⊗, x ⊨ ϕ ◦ α iff M, w ⊨ ϕ ⊗ α
M_⊗, x ⊨ α ⊃ ϕ iff M, w ⊨ α [] ϕ
M_⊗, y ⊨ ψ ⊂ ϕ iff E, e ⊨ ϕ []_P ψ

Moreover, for all α, α_1, ..., α_n ∈ L_α, for all ϕ, ψ, ϕ_0, ϕ_1, ..., ϕ_n ∈ L, we have:

ϕ; α ⊢ ψ iff ϕ, α ⊢ ψ
(((ϕ_0; α_1), ϕ_1); ...; α_n), ϕ_n ⊢ ψ iff ϕ_0, α_1, ϕ_1, ..., α_n, ϕ_n ⊢ ψ

Substructural operators    DEL operators
◦                          ⊗
⊃                          []
⊂                          []

Figure 4: Correspondence between DEL and substructural operators

The key Theorem 42 of (Aucher 2011) relates DEL-sequents and the operator of progression: for all ϕ, ϕ_f ∈ L and α ∈ L_α, it holds that

ϕ, α ⊢ ϕ_f iff ϕ ⊗ α ⊢ ϕ_f. (9)

As it turns out, this theorem is also valid in any substructural logic: it corresponds to a theorem of the Lambek calculus. More generally, all the theorems of the non-associative Lambek calculus hold in our DEL setting if we use the translation given in Figure 4. In particular, if P is a finite subset of L, then for all ϕ, ϕ_f ∈ L and α ∈ L_α, it holds that

ϕ; α ⊢ ϕ_f iff ϕ ⊢ α [] ϕ_f (10)
ϕ ⊢ α [] ϕ_f iff ϕ ⊗ α ⊢ ϕ_f (11)
ϕ ⊗ α ⊢ ϕ_f iff α ⊢_(C_α^P) ϕ []_P ϕ_f (12)
ϕ ⊢ α [] ϕ_f iff α ⊢_(C_α^P) ϕ []_P ϕ_f (13)

5 Conclusion

We proved in this article that DEL is a two-sorted substructural logic.
Also, we argued in Section 3.2 that our embedding of DEL within the framework of substructural logic is intuitively consistent, in the sense that in this embedding the intuitions underlying the DEL framework are coherent with the intuitive interpretations proposed for the ternary semantics of substructural logics. This may explain to a certain extent why some substructural phenomena arise in the dynamic inferences of van Benthem (2008): “it seemed that structural rules address mere symptoms of some underlying phenomenon” (van Benthem 2011b, p. 297). I claim that these “symptoms” are caused at a deeper semantic level by the fact that an update, and in this case the DEL product update, can be represented by the ternary relation of substructural logics.

In a certain sense, this article is in line with (van Benthem 2008; 2011b) and contributes to relating even more closely the programs of Logical Pluralism (Beall and Restall 2006) and Logical Dynamics (van Benthem 2011b). Roughly, the informal idea underpinning the connection between these two logical paradigms is to consider different reasoning styles and their corresponding consequence relations as the result of different sorts of updates induced by various informational tasks (such as observation, memory, questions and answers, dialogue, or general communication). We showed that this approach is not only meaningful from an intuitive point of view, but that it can also be realized at a formal level if the ternary relation of substructural logic is interpreted intuitively as a sort of update. So, we hope that our embedding will strengthen the connections between the two areas of research represented by Logical Pluralism (and substructural logics) on the one hand and Logical Dynamics on the other hand.
In fact, our point of view is also very much in line with the claim of (Gärdenfors 1991, Makinson and Gärdenfors 1989) that non-monotonic reasoning and belief revision are “two sides of the same coin”: as a matter of fact, non-monotonic reasoning is a reasoning style and belief revision is a sort of update. Likewise, the formal connection in this case also relies on a similar idea based on the Ramsey test.

In this article, we focused on the DEL product update. It is, however, a particular kind of update operator and the ternary relation of substructural logics could actually be a representation of any sort of update, including the various revision and update operators which have been studied in the logics of “common sense reasoning” of artificial intelligence and philosophical logic, such as conditional logic (Nute and Cross 2001), default and non-monotonic logics (Makinson 2005, Gabbay et al. 1998), belief revision theory (Gärdenfors 1988), etc. Different kinds of updates, induced by different informational tasks, define different kinds of reasoning styles. If one adheres to our interpretation of the ternary relation, the dynamic notion of update then becomes the foundational concept of substructural logics.²

² This article is a short version of (Aucher 2013).

References

G. Aucher. A combined system for update logic and belief revision. In M. Barley and N. K. Kasabov, editors, PRIMA, volume 3371 of Lecture Notes in Computer Science, pages 1–17. Springer, 2004.
G. Aucher. Perspectives on belief and change. PhD thesis, University of Otago – University of Toulouse, 2008.
G. Aucher. DEL-sequents for progression. Journal of Applied Non-Classical Logics, 21(3-4):289–321, 2011.
G. Aucher. DEL-sequents for regression and epistemic planning. Journal of Applied Non-Classical Logics, 22(4):337–367, 2012.
G. Aucher. Outstanding Contributions: Johan F. A. K. van Benthem on Logical and Informational Dynamics, chapter DEL as a substructural logic. Trends in Logic. Springer, forthcoming, 2013.
G. Aucher and A. Herzig. Exploring the power of converse events. Dynamic Formal Epistemology, pages 51–74, 2011.
A. Baltag and L. Moss. Logic for epistemic programs. Synthese, 139(2):165–224, 2004.
A. Baltag and S. Smets. Conditional doxastic models: A qualitative approach to dynamic belief revision. Electronic Notes in Theoretical Computer Science, 165:5–21, 2006.
A. Baltag and S. Smets. Texts in Logic and Games, volume 4, chapter The Logic of Conditional Doxastic Actions, pages 9–31. Amsterdam University Press, 2008a.
A. Baltag and S. Smets. Texts in Logic and Games, volume 3, chapter A Qualitative Theory of Dynamic Interactive Belief Revision, pages 9–58. Amsterdam University Press, 2008b.
A. Baltag, L. Moss, and S. Solecki. The logic of public announcements, common knowledge and private suspicions. Technical report, Indiana University, 1999.
A. Baltag, B. Coecke, and M. Sadrzadeh. Algebra and sequent calculus for epistemic actions. Electronic Notes in Theoretical Computer Science, 126:27–52, 2005.
A. Baltag, B. Coecke, and M. Sadrzadeh. Epistemic actions as resources. Journal of Logic and Computation, 17(3):555–585, 2007.
J. Barwise. Constraints, channels, and the flow of information. Situation Theory and its Applications, 3:3–27, 1993.
J. Barwise and J. Perry. Situations and Attitudes. Cambridge, Massachusetts. MIT Press, 1983.
A. Baltag, L. S. Moss, and S. Solecki. The logic of public announcements and common knowledge and private suspicions. In I. Gilboa, editor, TARK, pages 43–56. Morgan Kaufmann, 1998.
J. Beall, R. Brady, J. M. Dunn, A. Hazen, E. Mares, R. K. Meyer, G. Priest, G. Restall, D. Ripley, J. Slaney, et al. On the ternary relation and conditionality. Journal of Philosophical Logic, 41(3):595–612, 2012.
J. C. Beall and G. Restall. Logical pluralism. Oxford University Press, 2006.
J. van Benthem. Exploring logical dynamics. CSLI Publications, Stanford, 1996.
J. van Benthem. Dynamic logic for belief revision. Journal of Applied Non-Classical Logics, 17(2):129–155, 2007a.
J. van Benthem. Inference in action. Publications de l’Institut Mathématique, Nouvelle Série, 82(96):3–16, 2007b.
J. van Benthem. Logical dynamics meets logical pluralism? The Australasian Journal of Logic, 6:182–209, 2008.
J. van Benthem. Modal logic for open minds. CSLI Publications, 2010.
J. van Benthem. McCarthy variations in a modal key. Artificial Intelligence, 175(1):428–439, 2011a.
J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011b.
J. van Benthem and B. Kooi. Reduction axioms for epistemic actions. In R. Schmidt, I. Pratt-Hartmann, M. Reynolds, and H. Wansing, editors, AiML-2004: Advances in Modal Logic, number UMCS-04-9-1 in Technical Report Series, pages 197–211, University of Manchester, 2004.
J. van Benthem, J. Gerbrandy, T. Hoshi, and E. Pacuit. Merging frameworks for interaction. Journal of Philosophical Logic, 38(5):491–526, 2009.
J. P. Burgess. Quick completeness proofs for some logics of conditionals. Notre Dame Journal of Formal Logic, 22(1):76–84, 1981.
H. van Ditmarsch. Prolegomena to dynamic logic for belief revision. Synthese, 147:229–275, 2005.
H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library. Springer, 2007.
H. P. van Ditmarsch, A. Herzig, and T. D. Lima. From situation calculus to dynamic epistemic logic. Journal of Logic and Computation, 21(2):179–204, 2009.
J. M. Dunn and G. Restall. Relevance logic. Handbook of Philosophical Logic, 6:1–128, 2002.
J. van Eijck. Reducing dynamic epistemic logic to PDL by program transformation. Technical Report SEN-E0423, CWI, 2004.
D. M. Gabbay, C. J. Hogger, J. A. Robinson, J. Siekmann, and D. Nute, editors. Handbook of logic in artificial intelligence and logic programming, volume Nonmonotonic reasoning and uncertain reasoning (Volume 3). Clarendon Press, 1998.
P. Gärdenfors. Knowledge in Flux (Modeling the Dynamics of Epistemic States). Bradford/MIT Press, Cambridge, Massachusetts, 1988.
P. Gärdenfors. Belief revision and nonmonotonic logic: Two sides of the same coin? In Logics in AI, pages 52–54. Springer, 1991.
J. Hintikka. Knowledge and Belief, An Introduction to the Logic of the Two Notions. Cornell University Press, Ithaca and London, 1962.
F. Liu. Changing for the Better: Preference Dynamics and Agent Diversity. PhD thesis, ILLC, University of Amsterdam, 2008.
D. Makinson. Bridges from classical to nonmonotonic logic. King’s College, 2005.
D. Makinson and P. Gärdenfors. Relations between the logic of theory change and nonmonotonic logic. In A. Fuhrmann and M. Morreau, editors, The Logic of Theory Change, volume 465 of Lecture Notes in Computer Science, pages 185–205. Springer, 1989.
E. D. Mares. Relevant logic and the theory of information. Synthese, 109(3):345–360, 1996.
E. D. Mares and R. K. Meyer. The Blackwell guide to philosophical logic, chapter Relevant Logics. Wiley-Blackwell, 2001.
D. Nute and C. B. Cross. Handbook of Philosophical Logic, volume 4, chapter Conditional logic, pages 1–98. Kluwer Academic Pub, 2001.
J. Perry and D. Israel. What is information? Information, Language, and Cognition, 1, 1990.
F. Ramsey. Philosophical Papers, chapter General Propositions and Causality. Cambridge University Press, Cambridge, 1929.
G. Restall. Information flow and relevant logics. In Logic, Language and Computation: The 1994 Moraga Proceedings, pages 463–477. CSLI Publications, 1996.
G. Restall. An Introduction to Substructural Logics. Routledge, 2000.
G. Restall. Relevant and substructural logics. Handbook of the History of Logic, 7:289–398, 2006.
R. Routley and R. Meyer. The semantics of entailment. Studies in Logic and the Foundations of Mathematics, 68:199–243, 1973.
R. Routley and R. K. Meyer. The semantics of entailment—II. Journal of Philosophical Logic, 1(1):53–73, 1972a.
R. Routley and R. K. Meyer. The semantics of entailment—III. Journal of Philosophical Logic, 1(2):192–208, 1972b.
R. Routley, V. Plumwood, and R. K. Meyer. Relevant logics and their rivals. Ridgeview Publishing Company, 1982.

Strategic Voting and the Logic of Knowledge

Hans van Ditmarsch, Jérôme Lang, and Abdallah Saffidine
LORIA – CNRS / Université de Lorraine
LAMSADE – CNRS / Université Paris Dauphine
University of New South Wales
Abstract

We propose a general framework for strategic voting when a voter may lack knowledge about other votes or about other voters’ knowledge about her own vote. In this setting we define notions of manipulation and equilibrium. We also model actions changing knowledge about votes, such as a voter revealing its preference or a central authority performing a voting poll. Some forms of manipulation are preserved under such updates and others are not. Another form of knowledge dynamics is the effect of a voter declaring its vote. We envisage Stackelberg games for uncertain profiles. The purpose of this investigation is to provide the epistemic background for the analysis and design of voting rules that incorporate uncertainty.

1 Introduction

A well-known fact in social choice theory is that strategic voting, also known as manipulation, becomes harder when voters know less about the preferences of other voters. Standard approaches to manipulation in social choice theory (Gibbard 1973, Satterthwaite 1975) as well as in computational social choice (Bartholdi et al. 1989) assume that the manipulating voter knows perfectly how the other voters will vote. Some approaches (Duggan and Schwartz 2000, Barbera et al. 1998) assume that voters have a probabilistic prior belief on the outcome of the vote, which encompasses the case where each voter has a probability distribution over the set of profiles. A recent paper (Conitzer et al. 2011) extends coalitional manipulation to incomplete knowledge, by distinguishing manipulating from non-manipulating voters and by considering that the manipulating coalition has, for each voter outside the coalition, a set of possible votes encoded in the form of a partial order over candidates.
Still, we think that the study of strategic voting under complex belief states has received little attention so far, especially when voters are uncertain about the uncertainties of other voters, i.e., when we model higher-order beliefs of voters.

An extreme case of uncertainty is when a voter is completely ignorant about other votes. In that case, if a manipulation under incomplete knowledge is defined in a pessimistic way, i.e., if it is said to be successful if it succeeds for all possible votes of other voters, voting rules may well be non-manipulable. For the special case where all other voters are non-strategic this is shown for most common voting rules in (Conitzer et al. 2011).

In the first place we model how uncertainty about the preferences of other voters may determine a strategic vote, and how a reduction in this uncertainty may change a strategic vote. We restrict ourselves to the case where uncertainty is over a number of well-described alternatives, including the true state of affairs, between which the voter is unable to distinguish.

We also investigate the dynamics of uncertainty. The uncertainty reduction may be due to receiving information on voting intentions in polls or to voters directly telling you their preference. For simplicity we assume that received information is correct, or rather, we only model the consequences of incorporating new information after the decision to consider the information reliable. Such informative actions can then be modelled as truthful public announcements (Plaza 1989).

Another form of dynamics is the dynamics of declaring votes. Declaring votes can be modelled as assignments (ontic / factual change). Just as there may be uncertainty about truthful votes, there may also be uncertainty about declared votes. Consider the following. Half of the votes are declared. It is not known whether candidate x or y has taken the lead, but z has clearly lost. You still have to vote. Does this influence your strategy?
Another example is that of safe manipulation (Slinko and White 2008), where the manipulating voter announces her vote to a (presumably large) set of voters sharing her preferences but is unsure of how many will follow her. Finally, consider Stackelberg voting games, wherein voters declare their votes in sequence, following a fixed, exogenously defined order. Our framework applies to Stackelberg voting games with uncertainty about profiles.

There are several ways of expressing incomplete knowledge about the linear order of a voter. The literature on possible and necessary winners assumes that it is expressed by a collection of partial strict orders (one for each voter), while Hazon et al. (2008) consider it to consist of a collection of probability distributions, or a collection of sets of linear orders (one for each voter). Whereas the latter is more expressive (some sets of linear orders do not correspond to the set of extensions of a partial order), the former is more succinct. Ours is a more expressive modelling than both modes of representation, because an uncertain profile can be any set of profiles. A set of profiles such as {(a ≻_1 b ≻_1 c, a ≻_2 b ≻_2 c), (b ≻_1 a ≻_1 c, b ≻_2 a ≻_2 c)} expresses uncertainty (ignorance) about which candidate voters 1 and 2 rank first, but knowledge (certainty) that voters 1 and 2 have identical preferences. This is not possible in (Hazon et al. 2008), and a fortiori also not in (Konczak and Lang 2005) and subsequent works on the possible winner problem. Of course, this mode of representation is also the least succinct of all. However, succinctness and complexity issues will play no role yet in this paper, where we focus on modelling and expressivity.
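To illustrate the expressivity claim, the following Python sketch encodes the two-profile example and checks that it cannot be written as one independent set of votes per voter. The encoding (a vote as a tuple of candidates from most to least preferred, a profile as a tuple of votes) and the helper names `per_voter_votes`, `is_product` and `known_same_preferences` are assumptions made for this illustration only.

```python
from itertools import product

# A vote lists the candidates from most to least preferred;
# a profile holds one vote per voter (voter 1 first, then voter 2).
uncertain = {
    (("a", "b", "c"), ("a", "b", "c")),
    (("b", "a", "c"), ("b", "a", "c")),
}


def per_voter_votes(profiles):
    """Project a set of profiles onto one set of possible votes per voter."""
    n = len(next(iter(profiles)))
    return [{profile[i] for profile in profiles} for i in range(n)]


def is_product(profiles):
    """True iff the set equals the Cartesian product of its per-voter projections."""
    return set(product(*per_voter_votes(profiles))) == set(profiles)


def known_same_preferences(profiles):
    """True iff all voters have identical votes in every profile considered possible."""
    return all(len(set(profile)) == 1 for profile in profiles)
```

Here `is_product(uncertain)` is false while `known_same_preferences(uncertain)` is true: taking the per-voter projections would add the two mixed profiles and lose the information that both voters agree.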
Somewhat surprisingly, there are yet more complex scenarios that cannot be seen as uncertainty between a number of given profiles: it may be that a voter cannot distinguish between two situations with identical profiles, because in the first case yet another voter has some uncertainty about the profile, but in the other case not.

Our investigation is restricted in various ways: (i) we model uncertainty and manipulability of individuals but not of coalitions, (ii) we model knowledge but not belief, and, in the dynamics, truthful announcements but not lying, (iii) we model incomplete knowledge (uncertainty) but not other forms of incompleteness, and (iv) as already said, we have not investigated complexity and succinctness.

The reason for these restrictions is our desire to, first, present this complete logical framework for voters uncertain about profiles. Later we wish to broaden our scope. Let us briefly comment on these issues here.

Epistemic and voting notions for coalitions are treated in Section 8 in some detail.

There are many scenarios wherein voters may have incorrect beliefs about preferences, or where information changing actions are intended to deceive. I may incorrectly believe that you prefer a over b, whereas you really prefer b over a. I may tell you that I prefer a over b, but I may be lying. Such scenarios can also be modelled in epistemic logic, with the same tools and techniques as presented in this paper, but we have restricted ourselves to knowledge: reliable beliefs. This is already a far and high enough jump from the typical social choice theory perspective of reliable common knowledge of preferences, and we think that the variety of phenomena described within the restriction of knowledge and reliable information already sufficiently demonstrates the expressive power of the extension of voting with uncertainty.
The study of uncertain votes is different from the study of other forms of incompleteness, e.g., when the number of voters or candidates may be unknown. The only form of incompleteness that we model is incomplete knowledge in the form of inability to determine which of a number of well-defined alternatives is the case. Here, we also restrict ourselves.

Complexity issues will be occasionally referred to in running text and in the concluding Section 4.

A link between epistemic logic and voting has first been given, as far as we know, in (Chopra et al. 2004), who use knowledge graphs to indicate that a voter is uncertain about the preference of another voter. A more recent approach, within the area known as social software, is (Parikh et al. 2011). The recent (Conitzer et al. 2011) walks a middle way, namely one where equivalence classes are called information sets, as in treatments of knowledge and uncertainty in economics, but where the uncertain voter does not take the uncertainty of other voters into account.

2 Voting

This section recalls standard voting terminology. Assume a finite set N = {1, ..., n} of n voters (or agents), and a finite set C = {a, b, c, ...} of m candidates (or alternatives). Voter variables are i and j, and candidate variables are x and y (and x_1, x_2, ...).

Definition 2.1 (Vote). For each voter i a vote ≻_i ⊆ C × C is a linear order on C. If voter i prefers candidate a to candidate b in vote ≻_i, we write a ≻_i b.

Vote variables are ≻_i, ≻′_i, etc. Instead of x_1 ≻_i ··· ≻_i x_m we also write ≻_i : x_1 ... x_m, or depict it vertically in a table.

Definition 2.2 (Profile). A profile P is a collection {≻_1, ..., ≻_n} of n votes. Let O(C) be the set of linear orders of C. Then O(C)^n is the set of all profiles for N.

Profile variables are P, P′, .... If P ∈ O(C)^n, ≻_i ∈ P, and ≻′_i ∈ O(C), then P[≻_i/≻′_i] is the profile wherein ≻_i is substituted by ≻′_i in P.

Definition 2.3 (Voting rule).
A voting rule is a function F : O(C)^n → C from the set of profiles to the set of candidates.

The voting rule determines which candidate wins the election: F(P) is the winner. A voting correspondence C : O(C)^n → 2^C \ {∅} maps a profile to a nonempty set of tied cowinners. To obtain a voting rule from a voting correspondence (to obtain a unique winner from a non-empty set of cowinners) we assume an exogenously specified tie-breaking mechanism, that is, a total order over candidates.

Voters cannot be assumed to vote according to their preferences. Relative to a given profile P, a vote ≻_i ∈ P can be called the truthful vote or preference. A voter may change her truthful vote if this improves the outcome of the voting. This is called a manipulation or strategic vote.

Definition 2.4 (Manipulation). Let i ∈ N, P ∈ O(C)^n and ≻_i ∈ P, and let ≻′_i ∈ O(C). If F(P[≻_i/≻′_i]) ≻_i F(P), then ≻′_i is a successful manipulation by voter i.

Of course some votes that are not truthful still do not improve the outcome: relative to the truthful vote ≻_i ∈ P, any ≻′_i ∈ O(C) can be called a possible vote. Finally, there is the case of the declared vote, after which a voter can no longer change her vote. Information on declared votes may be available to other voters (such as in Stackelberg games), and that may change their subsequent strategic votes. This is an overview of different votes:

• truthful vote / preference;
• strategic vote / successful manipulation;
• possible vote;
• declared vote.

We now define stable outcomes of the voting rule.
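Before turning to stable outcomes, Definitions 2.3 and 2.4 can be made concrete with a minimal Python sketch. Plurality with a fixed tie-breaking order is used here purely as an example rule; the text does not single it out. The function names, the 0-based voter indices and the five-voter profile are hypothetical choices made for this illustration.

```python
from itertools import permutations

TIE_BREAK = ("a", "b", "c")  # assumed exogenous tie-breaking order


def plurality(profile, tie_break):
    """Winner under plurality; ties go to the earliest candidate in tie_break."""
    scores = {c: 0 for c in tie_break}
    for vote in profile:
        scores[vote[0]] += 1  # each vote gives one point to its top candidate
    best = max(scores.values())
    return next(c for c in tie_break if scores[c] == best)


def rule(profile):
    """The voting rule F of this example."""
    return plurality(profile, TIE_BREAK)


def prefers(vote, x, y):
    """True iff x is ranked strictly above y in vote."""
    return vote.index(x) < vote.index(y)


def successful_manipulations(profile, i, F):
    """All votes v such that F(P[vote_i / v]) is strictly preferred by voter i to F(P)."""
    truthful = profile[i]
    current = F(profile)
    found = []
    for alt in permutations(truthful):
        changed = profile[:i] + (alt,) + profile[i + 1:]
        if prefers(truthful, F(changed), current):
            found.append(alt)
    return found


# Hypothetical profile of five voters (indices 0..4) over candidates a, b, c.
P = (
    ("a", "b", "c"),
    ("a", "c", "b"),
    ("b", "c", "a"),
    ("b", "a", "c"),
    ("c", "b", "a"),
)
```

In this profile a and b tie on plurality score and the tie-break selects a. Voter 4, whose truthful vote ranks c first and a last, can make b win by ranking b first, so `successful_manipulations(P, 4, rule)` is nonempty, whereas voter 0 already obtains her top candidate and has no successful manipulation.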
The combination of a profile P and a voting rule F defines a strategic game: a player is a voter; an individual strategy for a player is a vote (an individual strategy for a player in the game-theoretical sense need not be a strategic vote in the social-choice-theoretical sense); a strategy profile (of players) is therefore a profile in our defined sense (of voters); and the preference of a player among the outcomes is according to his truthful vote: given voter i with truthful vote ≻_i ∈ P, and profiles P', P'', i prefers outcome F(P') over outcome F(P'') in the game-theoretical sense iff F(P') ≻_i F(P''). The relevant equilibrium notion is:

Definition 2.5 (Equilibrium profile). Given a profile P, a profile P' is an equilibrium profile iff no agent has a successful manipulation in P', where each agent compares outcomes according to her truthful vote in P.

In the view of a voting process as a game, an equilibrium profile corresponds to a Nash equilibrium. Manipulation and equilibrium for coalitions are addressed later, in Section 8.

3 Knowledge profiles

We model uncertainty about voting in the sense of incomplete knowledge about votes. The terminology to describe such uncertainty that we introduce in this section is fairly standard in modal logic (Fagin et al. 1995), but its application to social choice theory is novel. The novelty consists in taking models with profiles instead of valuations of propositional variables. An expression like b ≻_i a is a proposition ‘voter i prefers candidate b over candidate a’, which is true or false for any given profile; and from that perspective, a profile is nothing but a collection where for all voters all such variables are given a value true or false: a valuation.

Definition 3.1 (Knowledge profile). Given is the set O(C)^n of all profiles for a set N = {1, ..., n} of n voters. A profile model is a structure 𝒫 = (S, {∼_1, ..., ∼_n}, π), where S is a domain of abstract objects called states; where for i = 1, ..., n, ∼_i is an indistinguishability relation that is an equivalence relation; and where valuation π : S → O(C)^n assigns a profile to each state. A knowledge profile is a pointed structure 𝒫_s where 𝒫 is a profile model and s is a state in the domain of 𝒫.

If s ∼_i s', π(s) = P, and π(s') = P', then voter i is uncertain if the profile is P or P'; e.g., if ≻_j: bca in P and ≻_j: cba in P', then voter i is uncertain if voter j prefers b over c or c over b. Instead of ‘voter i is uncertain if’ we also say ‘voter i does not know that’. We can do this formally in a logical language interpreted on knowledge profiles.

Definition 3.2 (Logical language). The language L over the set of voters N = {1, ..., n} and the set of preferences is defined as follows, where i is an agent and a, b ∈ C:

    ϕ ::= a ≻_i b | ¬ϕ | ϕ ∧ ϕ | K_i ϕ

A profile P is defined in L by abbreviation as the description of the valuation (the conjunction of all its terms a ≻_i b and all its excluded terms ¬(a ≻_i b)). Similarly, a vote ≻_i is defined in L by abbreviation as the i-part of that. An element of the language is called a formula; ϕ is a formula variable. Formula K_i ϕ stands for ‘voter i knows that ϕ’. We have allowed ourselves to overload the meaning of a ≻_i b, as it is really the name for the atomic proposition uniquely interpreted (below) as the truth of a ≻_i b.

Definition 3.3 (Semantics). The interpretation of formulas in a knowledge profile is defined as follows:

    𝒫_s ⊨ a ≻_i b   iff   a ≻_i b, where ≻_i ∈ π(s)
    𝒫_s ⊨ ¬ϕ        iff   it is not the case that 𝒫_s ⊨ ϕ
    𝒫_s ⊨ ϕ ∧ ψ     iff   𝒫_s ⊨ ϕ and 𝒫_s ⊨ ψ
    𝒫_s ⊨ K_i ϕ     iff   for every t such that s ∼_i t, 𝒫_t ⊨ ϕ

Given a knowledge profile 𝒫_s and a proposition ϕ, agent i knows that ϕ if and only if ϕ holds in all states of 𝒫 that are indistinguishable for i from s (i.e., for all s' ∈ S such that s ∼_i s'). If 𝒫_s ⊨ ϕ for all s ∈ S, we write 𝒫 ⊨ ϕ (ϕ is valid on 𝒫), and if this is the case for all 𝒫, we say that ϕ is valid, and we write ⊨ ϕ. Propositions like ‘voter i knows the profile’ now have a precise description.
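The semantics of Definition 3.3 translates almost line by line into a recursive evaluator. The sketch below is an assumption-laden illustration, not part of the formal development: formulas are encoded as nested tuples, equivalence relations as partitions, and the names cell, holds and knows_profile are hypothetical. The three-state model is the one described in Example 1.

```python
# Formulas as nested tuples (an encoding chosen for this sketch):
#   ("pref", i, a, b)   the atom "a >_i b"
#   ("not", f), ("and", f, g), ("K", i, f)


def cell(partition, s):
    """The equivalence class of state s in a partition (list of sets)."""
    return next(c for c in partition if s in c)


def holds(model, s, phi):
    """Truth of formula phi in state s of a profile model."""
    states, partitions, pi = model
    op = phi[0]
    if op == "pref":
        _, i, a, b = phi
        vote = pi[s][i]                       # voter i's vote, best first
        return vote.index(a) < vote.index(b)
    if op == "not":
        return not holds(model, s, phi[1])
    if op == "and":
        return holds(model, s, phi[1]) and holds(model, s, phi[2])
    if op == "K":
        _, i, psi = phi
        # psi must hold in every state voter i cannot tell apart from s
        return all(holds(model, t, psi) for t in cell(partitions[i], s))
    raise ValueError(f"unknown operator: {op}")


def knows_profile(model, s, i):
    """'Voter i knows the profile': one profile throughout i's class of s."""
    _, partitions, pi = model
    return all(pi[t] == pi[s] for t in cell(partitions[i], s))


P = {1: "acbd", 2: "dcba"}
P_PRIME = {1: "dcba", 2: "dcba"}
MODEL = (
    {"s", "t", "u"},
    {1: [{"s", "t"}, {"u"}], 2: [{"s"}, {"t", "u"}]},
    {"s": P, "t": P, "u": P_PRIME},
)

K2_A_OVER_D = ("K", 2, ("pref", 1, "a", "d"))
print(holds(MODEL, "s", K2_A_OVER_D), holds(MODEL, "t", K2_A_OVER_D))
```

Representing each ∼_i as a partition is a convenience that relies on ∼_i being an equivalence relation, as Definition 3.1 requires.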
Example 1. Consider the following profile model 𝒫 consisting of three states s, t, u, for two voters 1 and 2. State s is assigned profile P, wherein a ≻_1 c ≻_1 b ≻_1 d and d ≻_2 c ≻_2 b ≻_2 a, etc. States that are indistinguishable for a voter i are linked with an i-labelled edge. The partition for 1 on the domain is therefore {{s, t}, {u}}, and the partition for 2 on the domain is {{s}, {t, u}}.

     s, P              t, P              u, P'
    1   2             1   2             1   2
    a   d             a   d             d   d
    c   c    --1--    c   c    --2--    c   c
    b   b             b   b             b   b
    d   a             d   a             a   a

States s and t have been assigned the same profile P but have different epistemic properties. In s, 2 knows that 1 prefers a over d, whereas in t 2 does not know that. We list some such relevant formulas:

• 𝒫_s ⊨ K_2 a ≻_1 d
• 𝒫_t ⊭ K_2 a ≻_1 d
• 𝒫 ⊨ (≻_1 → K_1 ≻_1) ∧ (≻_2 → K_2 ≻_2) (Both voters know their preference.)

The example demonstrates that we cannot do away with states. Sometimes, different states are assigned the same profile. But in many typical scenarios different states are assigned different profiles, and then we can truly say that the uncertainty of a voter is about a collection of profiles.

We now define the notion of ‘voter i changes her vote’ in L.

Definition 3.4 (Changing a vote). We define P ↔_i P' as

    (P → ≻_i) ∧ (P' → ≻'_i) ∧ ⋀_{j≠i; a,b∈C} (a ≻_j b ↔ a ≻'_j b).

Given the abbreviations defined, P → ≻_i stands for ≻_i ∈ P. Formula P ↔_i P' says that there is a vote ≻'_i such that P' = P[≻_i/≻'_i].

Surprisingly, our logic of knowledge and voter preferences, which we extend with dynamics in the next sections, is not in fact a dynamic logic of preference (Liu 2011). Given that, the following perspective may be of interest. In our models, the preferences are modelled as propositional variables.
These induce preferences between states by enriching the model with total orders expressing that one state is preferred to another if the outcome of the truthful vote for the profile of the first state is preferred to the outcome of the vote for the profile of the second state.

Definition 3.5 (Models for knowledge and preference). Given a knowledge profile 𝒫_s with 𝒫 = (S, {∼_1, ..., ∼_n}, π), the induced preference knowledge profile 𝒫_s is defined as 𝒫 = (S, {∼_1, ..., ∼_n}, {≻_1, ..., ≻_n}, π), where ≻_i is defined as: for all s, t ∈ S, s ≻_i t iff F(π(s)) ≻_i F(π(t)).

Thus we reclaim the epistemic plausibility models of (Baltag and Smets 2008) and therefore, indirectly, approaches such as (Liu 2011), although not in the meaning of ‘agent i considers state s more plausible than state t’, but in the sense of ‘voter i prefers the outcome of voting of the profile in s to the outcome of voting of the profile in t’. As there, one has a choice between global preferences or ‘local’ preferences (intersection of global preferences with equivalence classes). This embedding seems important enough to mention as a result:

Proposition 1. The epistemic logic of votes can be embedded into epistemic plausibility logic.

Proof. We refer to the embedding of Definition 3.5.

4 Manipulation and knowledge

In a knowledge profile it may be that a voter can manipulate the vote but does not know that, because she considers it possible that another profile is the case in which she cannot manipulate the vote. Such situations call for more refined notions of manipulation that also involve knowledge. They can be borrowed from the knowledge and action literature (van Benthem 2001, Jamroga and van der Hoek 2004).

Given is a knowledge profile 𝒫_s where π(s) = P. If voter i can manipulate P, then voter i also can manipulate 𝒫_s. The uncertainty is about what the profile is. But this does not affect that P is the actual profile.
In our modelling, if the voter can manipulate P, she always considers it possible that she can manipulate P. This is a consequence of modelling uncertain knowledge instead of uncertain belief. However, there are situations wherein she considers it possible that she can manipulate, but where in fact she cannot manipulate, namely if she considers a state possible with a profile that is not the profile in the actual state.

A curious situation is the one wherein in all states that the voter considers possible there is a successful manipulation, but where, unfortunately, this is not the same strategic vote in all such states! So she knows that she has a successful manipulation, but she does not know what the manipulation is. This is called de dicto knowledge of manipulation.

A stronger form of knowing is when there is a vote that is strategic in the profile for any state that the voter considers possible. This is called de re knowledge of manipulation.

A further situation of interest for voting theory is when (a) in any profile that the voter considers possible she can vote such that the outcome is either the same or better than when she had voted sincerely, and when (b) for at least one possible profile the outcome is better. This can be called weakly successful manipulation. (It is somewhat unclear if the qualification weak should apply to the manipulation or to the knowledge, as it is a property of a set of profiles.)

Definition 4.1 (Knowledge of manipulation).
Given a knowledge profile 𝒫_s,

• voter i can successfully manipulate 𝒫_s if she can successfully manipulate the profile π(s);
• voter i considers possible that she can successfully manipulate 𝒫_s if there is a t such that s ∼_i t and she can successfully manipulate π(t);
• voter i knows ‘de dicto’ that she can successfully manipulate 𝒫_s if for all t such that s ∼_i t she can successfully manipulate π(t);
• voter i knows ‘de re’ that she can successfully manipulate 𝒫_s if there is a vote ≻'_i such that for all t such that s ∼_i t, ≻'_i is a successful manipulation for profile π(t);
• voter i knows ‘de re’ that she can weakly successfully manipulate 𝒫_s if: (a) there is a vote ≻'_i such that for all t such that s ∼_i t, either ≻'_i is a successful manipulation for profile π(t) or the outcome of that vote in π(t) does not change, and (b) there is a t such that s ∼_i t and ≻'_i is a successful manipulation for profile π(t).

There is also a weakly successful version of ‘de dicto’ knowledge of manipulation. These notions of knowledge of manipulation do not assume that voters know their own vote, although to apply them under these circumstances could lead to counterintuitive results.

If voter i knows ‘de re’ that she can manipulate the election, she has the ability to manipulate, namely by strategically voting ≻'_i. On the other hand, ‘de dicto’ manipulations do not have any practical interest, since the voter does not seem to have the ability to manipulate the election. It is akin to ‘game of chicken’ type equilibria in game theory (Osborne and Rubinstein 1994). Therein, for each strategy of a player there is a complementary strategy of the other player such that the pair is an equilibrium. This cannot be guaranteed without coordination. Example 2, below, illustrates ‘de dicto’ manipulability.

Example 2. We consider manipulation with voting according to the Borda voting rule.
Consider three agents, four candidates, and two profiles P and P' that are indistinguishable for agent 1, but that agents 2 and 3 can tell apart, as follows.

         P                       P'
    1   2   3               1   2   3
    c   d   b               c   d   b
    b   a   d     --1--     b   a   a
    a   c   c               a   c   c
    d   b   a               d   b   d

There is also a tie-breaking preference b ≻ c ≻ d ≻ a. The difference between the profiles P and P' is that 3 prefers d over a in P but a over d in P'. We show that 1 can manipulate the election if the profile is P, and that 1 can manipulate the election if the profile is P', but that the manipulation for P gives a worse outcome for P', and that the manipulation for P' gives a worse outcome for P. Therefore she is not effectively able to manipulate the outcome of the election.

In Borda, the ranks for each candidate in each vote are added, and the candidate with the highest sum wins, modulo the tie-breaking preference. The most preferred candidate gets 3 points, the second choice 2 points, etc. First, the outcome when all three agents give their truthful vote. We write xyzw when there are x points for a, y for b, z for c, w for d.

    profile   count   observation        outcome
    P         3555    b, c, d are tied   b
    P'        5553    a, b, c are tied   b

Voter 1 can manipulate P or P' by downgrading b. But this is tricky, because it comes at the price of making a or d, or both, more preferred. This price is indeed too high: in P, 1 can achieve a better outcome by ≻'_1 defined as ≻'_1: cabd. Let Q = P[≻_1/≻'_1], and Q' = P'[≻_1/≻'_1]. Although 1 prefers the winner in Q over the winner in P, the winner in Q' is less preferred by her than the winner in P':

    profile   count   observation        outcome
    Q         4455    c, d are tied      c
    Q'        6453                       a

In P', 1 can achieve a better outcome by ≻''_1 defined as ≻''_1: cdba. Let R = P[≻_1/≻''_1], and R' = P'[≻_1/≻''_1]. Now, 1 prefers the winner in R' over the winner in P', but the winner in R is less preferred by her than the winner in P:

    profile   count   observation        outcome
    R         2457    1’s worst dream    d
    R'        4455    c, d are tied      c

For the record, the winners for all different votes for voter 1 where c is most preferred:

    vote of 1   cbad      cabd      cdba      cadb      cdab      cbda
    in P        b(3555)   c(4455)   d(2457)   d(4356)   d(3357)   d(2556)
    in P'       b(5553)   a(6453)   c(4455)   a(6354)   c(5355)   b(4554)

In the language L we cannot say that the outcome of the election in P is preferred by a voter to the outcome of the election in P'. For that, we need to add primitives P ≻_i P' to the language. These act as background knowledge. They encode the voting function so that its results are available in all states and in all profile models.

Definition 4.2 (Language L+). We expand the set of propositional variables with P ≻_i P' for any P, P' ∈ O(C)^n, and we add the following clause to the semantics:

    𝒫_s ⊨ P ≻_i P'   iff   F(P) ≻_i F(P').

The variables P ≻_i P' mean that voter i prefers the candidate chosen by the votes in P over the candidate chosen by the votes in P'. This is an (inefficient) way to encode the voting function. We observe that the semantics is indeed independent of state s and profile model 𝒫. These are model validities 𝒫 ⊨ P ≻_i P'.

All notions of manipulation in Definition 4.1 are definable in the language L+.

Definition 4.3. Let 𝒫_s be a knowledge profile with profile P.

• Voter i has a successful manipulation:
    P ∧ (P → ≻_i) ∧ ⋁_{P'} (P' ≻_i P ∧ (P' ↔_i P)).
• Voter i has a successful manipulation ≻'_i:
    P ∧ (P → ≻_i) ∧ (P' → ≻'_i) ∧ (P' ↔_i P) ∧ P' ≻_i P.
• Voter i knows de dicto that she has a successful manipulation:
    P ∧ (P → ≻_i) ∧ K_i ⋁_{P'} ((P' ↔_i P) ∧ P' ≻_i P).
• Voter i knows de re that she has a successful manipulation:
    P ∧ (P → ≻_i) ∧ ⋁_{≻'_i} [((P' ↔_i P) ∧ P' ≻_i P ∧ (P' → ≻'_i)) ∧ K_i (P'' → ((P' ↔_i P'') ∧ P' ≻_i P''))].

De re knowledge of weak manipulation is similarly defined.

Proposition 2. Knowledge of manipulation is definable in L+.

Proof.
As evidenced in Definition 4.3.

5 Equilibrium and knowledge

Determining equilibria under incomplete knowledge comes down to decision making under incomplete knowledge. Therefore we have to choose a decision criterion. Expected utility makes no sense here, because we did not start with probabilities over profiles in the first place, nor with utilities. In the absence of prior probabilities, the following three criteria make sense. (i) The insufficient reason (or Laplace) criterion considers all possible states in a given situation as equiprobable. This criterion was used in (Ågotnes and van Ditmarsch 2011) to determine equilibria of certain (Bayesian) games of imperfect information. (ii) The minimax regret criterion selects the decision minimizing the maximum utility loss, taken over all possible states, compared to the best decision, had the voter known the true state. (iii) The pessimistic (or Wald, or maximin) criterion compares decisions according to their worst possible consequences. The latter criterion, which we also call risk averse, is one that fits well our probability-free and utility-free model; this was also the criterion chosen in (Conitzer et al. 2011). The only assumption here is that the probability distribution is positive in all states. We now fix this criterion for the rest of the paper. (Pessimistic, optimistic, and yet other criteria only assuming positive probability are applied to social choice settings in the recent (Parikh et al. 2011). We think their interesting results can be modelled as games using our setting.)

In the presence of knowledge, the definition of an equilibrium extends naturally. The trick is that for each agent, the combination of an agent i and an equivalence class [s]∼_i for that agent (for some state s in the knowledge profile) defines a so-called virtual agent (we model these imperfect information games as Bayesian games (Harsanyi 1967–1968)).
Thus, agent i is multiplied into as many virtual agents as there are equivalence classes for ∼_i in the model.

In our setting we can almost think of these equivalence classes as sets of indistinguishable profiles. Almost but not quite: we recall that states with different properties in a given equivalence class, or states in different equivalence classes, may be assigned the same profile. An equilibrium is then a combination of votes such that none of the virtual agents has an interest to deviate.

An intuitively more appealing solution than virtual agents, also applied in (Ågotnes and van Ditmarsch 2011), is to stick to the agents we already have, but change the set of votes into a larger set of conditional votes, where the conditions are the equivalence classes for the agents. This we will now follow in the definition below. For risk-averse voters we can effectively determine if a conditional profile is an equilibrium without taking probability distributions into account, unlike in the more general setting of Bayesian games that it originates with.

Definition 5.1 (Conditional equilibrium). Given is a knowledge profile model 𝒫 such that every voter knows her preference (truthful vote). For each agent i, a conditional vote is a function [≻]_i : S/∼_i → O(C), i.e., a function that assigns a vote to each equivalence class for that agent. A conditional profile is a collection of n conditional votes, one for each agent. A conditional voting game is then a (standard) strategic game where voters declare conditional votes. A conditional profile is an equilibrium iff no agent has a successful manipulation in any of its equivalence classes.

The outcome of a conditional profile consisting of conditional votes is an n-tuple of vectors (x_1, ..., x_m) where voter i has m equivalence classes. The definition of equilibrium for the conditional voting game is derived from the Bayesian game form. It is not the standard form of strategic games!
Consider a case with two equivalence classes for a voter 1 where two outcome vectors for 1 are (a, d) and (d, a), and a ≻_1 d. We cannot say which of these two is preferred: therefore, the outcomes for 1 are not ordered, and therefore, this does not define a standard strategic game. However, if we only vary 1’s vote in the first argument (equivalence class) or in the second argument, the outcomes are ordered. This is the Bayesian game computation of equilibrium, where we determine manipulability for each virtual agent. Therefore, in the definition we did not write ‘A conditional profile is an equilibrium iff no agent has a successful manipulation’ but ‘(...) iff no agent has a successful manipulation in any of its equivalence classes.’

The requirement in Def. 5.1 that voters know their preference (truthful vote) is needed because the value they associate with an equivalence class is its worst outcome, which might otherwise be undefined.

Example 3. We recall Example 1. There are two voters 1, 2, and four candidates a, b, c, d. Consider a plurality vote with a tie-breaking rule b ≻ a ≻ c ≻ d. First consider the profile P defined as

    1   2
    a   d
    c   c
    b   b
    d   a

If 1 votes for her preference a and 2 votes for his preference d, then the tie-break selects a, 2’s least preferred candidate. If instead 2 votes c, a will still win. But if 2 votes b, b wins. We observe that (a, b) and (b, b) are equilibrium pairs of votes, and that for 1 voting a is dominant. This is also apparent from the voting matrix (wherein equilibria are bracketed), and even more so when we express the payoffs for both voters by their ranking for the winner, as in the second matrix (an entry xy gives x points to voter 1 and y points to voter 2, with 3 for the top-ranked candidate and 0 for the bottom-ranked one).

    1\2   a     b     c     d
    a     a    [b]    a     a
    b     b    [b]    b     b
    c     a     b     c     c
    d     a     b     c     d

    1\2   a     b     c     d
    a     30   [11]   30    30
    b     11   [11]   11    11
    c     30    11    22    22
    d     30    11    22    03

Example 4. We now add uncertainty to the setting of Example 3. Consider another profile P', that is as P, but where 1’s vote is ≻'_1: dcba. Now consider a knowledge profile as follows.
It remains the case that the actual profile is P; voter 2 is uncertain which of P and P' is the case; whereas voter 1 knows that. (It is tempting to add: voter 1 of course knows that, as she knows her own vote; but our framework equally applies to situations where she does not, e.g., because she has not yet made up her mind.) And, as one should always add: 1 and 2 know that this is the uncertainty about the profile. The profile model 𝒫 of this knowledge profile consists of states t and u.

     t, P              u, P'
    1   2             1   2
    a   d             d   d
    c   c    --2--    c   c
    b   b             b   b
    d   a             a   a

What are the conditional equilibria of 𝒫? Votes (a, b) and (b, b) still lead to electing b and are the equilibria in state t with profile P. The only equilibrium vote for state u with profile P' is (d, d): the preferences are identical for 1 and 2, and d is their top candidate.

We argue our way towards the equilibria of this conditional voting game. There are two. Of course, alternatively to this argument one can directly determine that these are equilibria by applying Definition 5.1 in a 16 × 4 matrix (below). Recall that we assumed that voters are risk-averse.

    1\2   a     b     c     d
    aa    aaa   bbb   aaa   aaa
    ab    aba   bbb   aba   aba
    ac    aaa   bbb   aca   aca
    ad    aaa   bbb   aca   ada
    ba    baa   bbb   baa   baa
    bb    bbb   bbb   bbb   bbb
    bc    baa   bbb   bcb   bcb
    bd    baa   bbb   bcb   bdb
    ca    aaa   bbb   caa   caa
    cb    aba   bbb   cbb   cbb
    cc    aaa   bbb   ccc   ccc
    cd    aaa   bbb   ccc   cdc
    da    aaa   bbb   caa   daa
    db    aba   bbb   cbb   dbb
    dc    aaa   bbb   ccc   dcc
    dd    aaa   bbb   ccc   ddd

First, consider voter 1. For each equivalence class of 1, we have to determine her optimal vote. If the profile is P, 1’s vote for a is dominant, so whatever strategic considerations 2 may have due to the additional uncertainty about the profile make no difference. Voter 1 votes a. If the profile is P', d is dominant for 1.

Next, consider voter 2. Because 2 is risk-averse he will vote b.
For if 2 votes d and the profile is P, a wins because 1 votes a, as this is dominant for 1 (or b wins because 1 votes b); whereas if the profile is P' and 2 votes d, then d wins because 1 votes d, which is dominant there. The worst outcome of these two is a (or b). Whereas if 2 votes b, the worst outcome is b. (The votes c and a can be eliminated from consideration as well.)

The two equilibria that we can associate with this knowledge profile are below. The conditional vote for 1 in the first equilibrium is actually defined as: [≻]_1({t}) = ≻_1 and [≻]_1({u}) = ≻'_1; and the vote for 2 is conditional on one equivalence class; in other words, it is unconditional. The equivalent verbose formulation is more intelligible.

• (if 1 prefers a then a and if 1 prefers d then d, b),
• (if 1 prefers a then b and if 1 prefers d then d, b).

In particular, 2 does not know that d is his equilibrium vote in P', because he considers it possible that the profile is P, where, if 2 votes d, 1 votes a (or 1 can improve her outcome by voting a), in which case 2 is worse off than with d.

In the 16 × 4 matrix, a conditional vote ab for 1 means: in t she votes a and in u she votes b. The outcome triples xyz represent: (worst and only) outcome for 1 in the equivalence class of t; (worst and only) outcome for 1 in the equivalence class of u; (worst) outcome for 2 in the equivalence class {t, u}. The table contains much symmetry. We omitted the table in terms of ranked outcomes. A triple like aaa corresponds to ranked outcome 144: the equal winners a for voter 1 are ranked according to different profiles; a is most preferred in state t / in profile P, hence 1, but a is least preferred in state u / in profile P', hence 4. In the table, the third of a triple xyz is necessarily equal to the least preferred of x and y, but this is an artifact of the example (namely, that the two equivalence classes for 1 together comprise the equivalence class for 2).
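The equilibrium check of Definition 5.1 for Example 4 can also be carried out mechanically. The following sketch rests on several assumptions made for illustration: plurality with the tie-break b ≻ a ≻ c ≻ d stated in Example 3, risk-averse voters who value an equivalence class by its worst outcome, and hypothetical helper names (plurality, worst_rank, is_equilibrium, triple). The function triple mirrors the xyz entries of the 16 × 4 matrix.

```python
from itertools import product

CANDIDATES = "abcd"
TIE_BREAK = "bacd"                      # b > a > c > d, as stated in Example 3
# Truthful votes per state (best first); voter 2's vote is the same in t and u.
PREF = {"t": {1: "acbd", 2: "dcba"}, "u": {1: "dcba", 2: "dcba"}}
CLASSES = {1: [("t",), ("u",)], 2: [("t", "u")]}    # equivalence classes


def plurality(v1, v2):
    """Winner when voter 1 names v1 and voter 2 names v2."""
    return min((v1, v2), key=TIE_BREAK.index)


def outcome(cond1, v2, state):
    """cond1 = (1's vote in t, 1's vote in u); voter 2 has a single class."""
    v1 = cond1[0] if state == "t" else cond1[1]
    return plurality(v1, v2)


def worst_rank(voter, cls, cond1, v2):
    """Risk-averse value of a class: rank of its worst outcome (0 = best)."""
    return max(PREF[s][voter].index(outcome(cond1, v2, s)) for s in cls)


def is_equilibrium(cond1, v2):
    # voter 1 may deviate separately in each of her classes {t} and {u}
    for k, cls in enumerate(CLASSES[1]):
        for x in CANDIDATES:
            alt = tuple(x if j == k else v for j, v in enumerate(cond1))
            if worst_rank(1, cls, alt, v2) < worst_rank(1, cls, cond1, v2):
                return False
    # voter 2 has the single class {t, u}
    cls = CLASSES[2][0]
    for y in CANDIDATES:
        if worst_rank(2, cls, cond1, y) < worst_rank(2, cls, cond1, v2):
            return False
    return True


def triple(cond1, v2):
    """Outcome in t, outcome in u, and 2's worst outcome over {t, u}."""
    x, y = outcome(cond1, v2, "t"), outcome(cond1, v2, "u")
    worst_for_2 = max((x, y), key=PREF["t"][2].index)
    return x + y + worst_for_2


equilibria = [(c1, v2) for c1 in product(CANDIDATES, repeat=2)
              for v2 in CANDIDATES if is_equilibrium(c1, v2)]
print(equilibria)
```

Both conditional profiles named in the text, (ad, b) and (bd, b), pass this check. Read literally as "no strict improvement in any class", the check also accepts further profiles in which 2 votes b, for instance (aa, b): once 2 votes b, the tie-break makes b win whatever 1 does. The argument in the text narrows these down further by letting 1 vote d in u, where d is dominant for her.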
Example 5. We can add further uncertainty to Example 4.

     s, P              t, P              u, P'
    1   2             1   2             1   2
    a   d             a   d             d   d
    c   c    --1--    c   c    --2--    c   c
    b   b             b   b             b   b
    d   a             d   a             a   a

Consider a third state that has the same profile P as the actual state, but that has different epistemic properties: 2 is not uncertain about the profile there, but 1 cannot distinguish this from the other state for P wherein 2 is uncertain about the profile. This is the profile model from Example 1. Will 1 vote differently in s and t? In fact, she will not, nor will 2, and the conditional equilibrium votes remain the same; strictly, 2’s vote should depend on his equivalence class, but as 2’s choice is the same either way, namely b, his vote is more succinctly described as an unconditional vote: b.

We did not yet attempt to characterize conditional equilibria in the logic of the previous sections, as we did for manipulation and knowledge of manipulation (Def. 3.4 and 4.3). This might be interesting for epistemic game theory (Aumann and Brandenburger 1995, Perea 2012), but even so we only deal with the special case of voting games.

6 Dynamics: revealing preference

We can extend the modal logical setting for voting and knowledge of the previous sections with logical operations that are dynamic in character. In the context of voting, two obvious choices here are public announcement of a proposition (such as an agent revealing her true preference), and declaring a vote. Such actions can be modelled as semantic operations 𝒫_s ↦ 𝒫_s|ϕ (for propositions ϕ, e.g., ϕ = ≻_i for revealing her preference) and 𝒫_s ↦ 𝒫_s^{⊳_i := ⊤} (for voter i declaring vote ⊳_i), respectively. In this section we deal with public announcement; in the next section, with public assignment.

A well-known dynamic feature of epistemic logics is truthful public announcement (Plaza 1989).
Given a knowledge profile 𝒫_s, the requirement for execution of public announcement of ϕ is that ϕ is true in 𝒫_s, and the way to execute it is to restrict the model 𝒫 to all the states where ϕ is true. We can then investigate the truth of propositions in that model restriction: we can evaluate formulas of form [ϕ]ψ, for ‘After announcement of ϕ, ψ (is true)’, such as: ‘After 1 reveals her preference (truthful vote) to 2, 2 knows that he has a successful manipulation’. We need to add a clause to the logical language for these announcements and define their semantics. The model restriction to the ϕ-states is denoted as 𝒫_s|ϕ.

Definition 6.1 (Public announcement). We add an inductive clause [ϕ]ψ to the logical language L (i.e., a dynamic modal operator with an argument of type formula followed by a postcondition also of type formula). Its semantics is:

    𝒫_s ⊨ [ϕ]ψ   iff   𝒫_s ⊨ ϕ implies 𝒫_s|ϕ ⊨ ψ,

where 𝒫|ϕ = (S', ∼'_1, ..., ∼'_n, π') such that S' = {t ∈ S : 𝒫_t ⊨ ϕ}, ∼'_i = ∼_i ∩ (S' × S'), and π'(a ≻_i b) = π(a ≻_i b) ∩ S'.

Example 6. Consider again Examples 1 and 4, with plurality voting. In state t (for profile P), after voter 1 informs voter 2 of her true preference (a public announcement), the uncertainty in the model disappears and 1 and 2 commonly know that the profile is P. The equilibrium vote remains (b, b). So this seems not a big deal. On the other hand, in state u voter 1 has an incentive to make her preference known to 2: after that, 2’s equilibrium vote changes from b to d, and the equilibrium profile is now (d, d). And that is a big deal.

The transitions can be depicted as follows:

     t, P             t, P              u, P'             u, P'
    1   2            1   2             1   2             1   2
    a   d            a   d             d   d             d   d
    c   c     ⇐      c   c    --2--    c   c      ⇒      c   c
    b   b            b   b             b   b             b   b
    d   a            d   a             a   a             a   a

We can now formalize statements such as 𝒫_t ⊨ ¬K_2 a ≻_1 c ∧ [a ≻_1 c] K_2 a ≻_1 c.
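The closing formula of Example 6 can be checked by implementing public announcement as model restriction. This is a minimal sketch under the same assumptions as the semantics it extends: formulas are nested tuples, each ∼_i is given as a partition, and cell, restrict and holds are hypothetical names. The model is the two-state model of Example 4.

```python
# Formulas as nested tuples (an encoding chosen for this sketch):
#   ("pref", i, a, b), ("not", f), ("and", f, g), ("K", i, f),
#   ("ann", f, g)   for the announcement formula [f]g


def cell(partition, s):
    """The equivalence class of state s in a partition (list of sets)."""
    return next(c for c in partition if s in c)


def restrict(model, phi):
    """Model restriction: keep only the states where phi holds."""
    states, partitions, pi = model
    keep = {t for t in states if holds(model, t, phi)}
    new_partitions = {i: [c & keep for c in cells if c & keep]
                      for i, cells in partitions.items()}
    return keep, new_partitions, {t: pi[t] for t in keep}


def holds(model, s, phi):
    states, partitions, pi = model
    op = phi[0]
    if op == "pref":
        _, i, a, b = phi
        return pi[s][i].index(a) < pi[s][i].index(b)
    if op == "not":
        return not holds(model, s, phi[1])
    if op == "and":
        return holds(model, s, phi[1]) and holds(model, s, phi[2])
    if op == "K":
        _, i, psi = phi
        return all(holds(model, t, psi) for t in cell(partitions[i], s))
    if op == "ann":
        _, announced, post = phi
        if not holds(model, s, announced):
            return True                 # announcement cannot be executed
        return holds(restrict(model, announced), s, post)
    raise ValueError(f"unknown operator: {op}")


MODEL = (
    {"t", "u"},
    {1: [{"t"}, {"u"}], 2: [{"t", "u"}]},
    {"t": {1: "acbd", 2: "dcba"}, "u": {1: "dcba", 2: "dcba"}},
)
A_OVER_C = ("pref", 1, "a", "c")
K2 = ("K", 2, A_OVER_C)
FORMULA = ("and", ("not", K2), ("ann", A_OVER_C, K2))
print(holds(MODEL, "t", FORMULA))       # True
```

After the announcement only state t survives, so voter 2's class shrinks from {t, u} to {t} and he comes to know 1's preference for a over c.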
There are two obvious ways to interpret such public announcements in voting theory: (i) when voters make announcements about their own preferences (and such that these announcements are trusted by other voters), and, more properly from the viewpoint of public announcement logic, (ii) when external observers, such as a central authority, reveal preferences to voters. The latter can be interpreted as holding a voting poll. Successive voting polls reduce the uncertainty for the individual voter about the preferences (truthful votes) of other voters. And this may determine the strategic vote. Two obvious results are that:

Proposition 3. Knowledge of weakly successful manipulation is not preserved after update.

Proof. We recall Definition 4.1. For the weak form of manipulation there were two requirements: (a) the profiles of all states in a given equivalence class for voter i must have either an equal or a better outcome, and (b) the profile of at least one state in that equivalence class needs to have a manipulation. The state with a manipulation need not be the actual state; therefore, after model restriction the existential requirement (b) may no longer hold. This holds for ‘de re’ as well as ‘de dicto’ knowledge.

Proposition 4. Knowledge of successful manipulation is preserved after update.

Proof. The profiles of all states have a manipulation, a universal property that is preserved after update.

7 Dynamics of declaring votes

A voter i declaring a vote ⊳_i can be modelled in dynamic epistemic terms as an assignment (a.k.a. ontic change, in contrast to an informative change like an announcement and coalition deliberation). A succinct way to model this is to expand the knowledge profiles with a duplicate set of propositional variables expressing voter preference, initially all set to false. To distinguish the preference (truthful vote) from the declared vote we keep writing ≻_i for the former whereas we write ⊳_i for the latter.
So, the set of variables a ≻_i b encode the preferences of the voters, whereas variables a ⊳_i b encode their declared votes. The action of declaring a vote ⊳_i, defined by preferences a ⊳_i b, sets the value of the propositions encoding ⊳_i in the model to true: these are the assignments a ⊳_i b := ⊤ executed for all a ⊳_i b in ⊳_i. If we assume that the declared vote is public, then this assignment can be executed in all states of the knowledge profile. The dynamic epistemic logic equivalent to achieve that is a public assignment (van Ditmarsch et al. 2005, van Benthem et al. 2006).

Definition 7.1 (Public assignment). We add an inductive clause [a ⊳_i b := ⊤]ϕ to the logical language. For the semantics, given a knowledge profile 𝒫_s,

    𝒫_s ⊨ [a ⊳_i b := ⊤]ϕ   iff   (𝒫^{a ⊳_i b})_s ⊨ ϕ,

where 𝒫^{a ⊳_i b} is as 𝒫 except that π(a ⊳_i b) = D(𝒫). By abbreviation we define ⊳_i := ⊤ as the sequential execution of all assignments a ⊳_i b := ⊤ for all terms a ⊳_i b in ⊳_i.

Assignments need not be to ‘true’ (⊤) but can be to any formula. Such an assignment a ⊳_i b := ψ has semantics π(a ⊳_i b) = {t ∈ D(𝒫) | 𝒫_t ⊨ ψ}. Declaring one’s preference, the truthful vote, can then be seen as the assignment ⊳_i := ≻_i.

Example 7. Consider a ⊳_1 b ⊳_1 c. The assignment declaring this vote is the sequence of three assignments a ⊳_1 b := ⊤, b ⊳_1 c := ⊤, a ⊳_1 c := ⊤, abbreviated as ⊳_1 := ⊤.

Example 8. Another continuation of Example 4 is with declaring votes. If in state t voter 2 declares his vote, i.e., fixes d as the candidate of his choice, 1 votes a, because with the given tie-break b ≻ a ≻ c ≻ d, her preference a now gets elected. We can simulate this assignment as the sequence of d ⊳_2 c := ⊤, d ⊳_2 b := ⊤, d ⊳_2 a := ⊤ (or as the assignment of preference to the declared vote: ⊳_2 := ≻_2). For simplicity this is depicted as making d bold.
[Figure: the knowledge profile of Example 8 before (left) and after (right) voter 2 declares his vote. Each block lists the preferences of voters 1 and 2 from most to least preferred; the two states are linked by the indistinguishability relation labelled 2. On the right, the declared candidate $d$ of voter 2 is meant to be set in bold.]

    1  2         1  2              1  2         1  2
    a  d         d  d              a  d         d  d
    c  c  --2--  c  c      ⇒       c  c  --2--  c  c
    b  b         b  b              b  b         b  b
    d  a         a  a              d  a         a  a
    t, P         u, P'             t, P'        u, P'

We have no results yet for the interaction of declaring votes and revealing voter preference, but Stackelberg games are the obvious games of interest here.

Axiomatization and completeness

All four logics proposed in this work have sound and complete axiomatizations with respect to the class of profile models. However, this is not remarkable. We have therefore omitted these axiomatizations; for them, see the cited references.

8 Chair and coalitions

We have some modelling results concerning matters relevant for social choice theory that we have chosen not to incorporate in the main story, so as not to lose focus there: how to model the central authority, and group notions of preference and knowledge.

8.1 Central authority

Apart from the $n$ voters, it seems convenient to distinguish yet another agent: a designated agent named 0, the central authority, or chair. We recall that the tie-breaking preference tie is a linear order on candidates. Apart from applying the tie, the central authority may perform other kinds of actions such as fixing the agenda. This also opens the door to the logical modelling of well-studied problems in computational social choice, such as control by the chair, or determining possible winners. The main reason not to model the chair is that her role is uniform throughout the model (throughout any knowledge profile model). We assume that there is no uncertainty about what the voting rule (and the tie-breaking preference) is. So in that sense it is exogenous.

The universal relation $S \times S$ on a knowledge profile model can be seen as the indistinguishability relation of agent 0, the central authority. On a connected model (i.e., when there is always a path between any two states in the model) this is the same as common knowledge of the voters.
The computational tasks of the central authority, be it determining the possible winners or finding strategic actions such as agenda fixing or any other form of control, can only be harder on knowledge profiles, as the central authority has to take uncertainty into account. By identifying the central authority with an agent with universal ignorance we can be precise about how much harder.

A partial profile in the social choice literature corresponds in a profile model to the set of profiles completing it, with identity access for all voters, and indistinguishable for the central authority, as in the following example. (The set of partial profiles then seems to consist of such disconnected parts.)

Example 9. The following depicts the partial profile $(b \succ_1 a \succ_1 c,\ a \succ_2 \{b, c\})$. Voters 1 and 2 have identity access on the profile model. The central authority is agent 0.

    1  2         1  2
    b  a         b  a
    a  b  --0--  a  c
    c  c         c  b

8.2 Coalitional manipulation

Group notions play an important role in social choice theory. We consider coalitions $G \subseteq N$. As straightforward generalizations of (individual) preference $\succ_i$, (individual) manipulation, (weak) equilibrium, and (weak) equilibrium of a conditional voting game, we can also define coalitional preference $\succ_G$ and successful manipulation by a coalition $G$. A profile $P'$ is a strong equilibrium profile iff no coalition has a successful manipulation.

Group notions also play an important role in epistemic logic. Two notions useful in our setting are common knowledge and distributed knowledge. Given a knowledge profile, a proposition is commonly known if it is true in all states reachable (from the actual state of the knowledge profile) by arbitrarily long finite paths in the model (reflexive transitive closure of access for all voters in the coalition). With the interpretation of common knowledge of coalition $G$ we can thus associate an equivalence relation $\sim_G$ (defined as $(\bigcup_{i \in G} \sim_i)^*$).
A proposition is distributedly known in a knowledge profile if it is true in the intersection of accessibility relations in the actual state (the relation $\bigcap_{i \in G} \sim_i$).

If there is no uncertainty about the profile, the voters have common knowledge about the profile. This assumption is almost always made in social choice theory. It is important to observe that in the presence of uncertainty this strong form of common knowledge disappears, but that still some form of common knowledge remains: all agents have common knowledge of the structure of the profile model. This means that they have common knowledge of the set of states, the accessibility relations of the knowledge model, and what profiles these states stand for. The only thing they do (or rather, may) not know is the designated point of the profile model: what the preferences (truthful votes) are.

Coalitions play a big role in voting, partly because in realistic settings the power of individual voters is very limited. Now by analogy, just as the vote of an individual agent depends on her knowledge, the vote of a coalition would seem to depend on the common knowledge of that coalition. But that seems wrong. In voting theory, the power of a coalition means the power of a set of agents that can decide on a joint action as a result of communication between them. Communication makes the uncertainty about each other's profiles disappear. In terms of knowledge profiles, this means that we are talking about another model, namely the model where for all agents $i \in G$, $\sim_i$ is refined to $\bigcap_{i \in G} \sim_i$. What determines the voting power of a coalition seems rather its distributed knowledge. We are still exploring the implications of these observations, and should note that other choices can also be made to model the power of a coalition in voting.
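As a rough illustration of the two group relations just discussed, the following Python sketch computes, for a toy model, the relation associated with common knowledge (reflexive transitive closure of the union of the voters' relations) and the relation associated with distributed knowledge (their intersection). The state names, the two example partitions and all function names are assumptions made for this sketch; they do not come from the paper.

```python
from itertools import product


def equivalence(blocks):
    """Build an equivalence relation (a set of pairs) from a partition."""
    return {(x, y) for block in blocks for x in block for y in block}


def common_knowledge_relation(states, relations):
    """Reflexive transitive closure of the union of the given relations."""
    closure = {(s, s) for s in states}
    for rel in relations:
        closure |= set(rel)
    changed = True
    while changed:
        changed = False
        # product() works on a snapshot, so adding to closure here is safe
        for (x, y), (y2, z) in product(list(closure), repeat=2):
            if y == y2 and (x, z) not in closure:
                closure.add((x, z))
                changed = True
    return closure


def distributed_knowledge_relation(relations):
    """Intersection of the given relations."""
    result = set(relations[0])
    for rel in relations[1:]:
        result &= set(rel)
    return result


# Hypothetical three-state model with a coalition of two voters.
states = {"s", "t", "u"}
sim_1 = equivalence([{"s", "t"}, {"u"}])  # voter 1 cannot tell s from t
sim_2 = equivalence([{"s"}, {"t", "u"}])  # voter 2 cannot tell t from u

common = common_knowledge_relation(states, [sim_1, sim_2])
distributed = distributed_knowledge_relation([sim_1, sim_2])
```

In this toy model the common knowledge relation links every state to every other one, whereas the distributed knowledge relation is the identity: after pooling their information, the two voters would know the actual state.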
Knowledge of manipulation and equilibria of conditional voting games can also be defined for coalitions but have been left out of this presentation.

9 Conclusion, further research

We presented a formal logical semantics for the interaction of voting and knowledge. The semantic primitive is the knowledge profile: a profile including uncertainty of voters about what the actual profile is. This reveals different notions for knowledge of manipulation, such as de re knowledge of manipulation and de dicto knowledge of manipulation, and novel notions for equilibria, such as conditional equilibrium for risk-averse voters. Dynamic operations on such knowledge profiles, and their effects on manipulation, can also be modelled; here we distinguished public announcements, such as revealing true preferences, from public assignments, i.e., declaring votes.

As far as the formalization is concerned, our setting is very similar to that of the recent literature on robust mechanism design (Bergemann and Morris 2005), which generalizes classical mechanism design by weakening the common knowledge assumptions of the environment among the players and the planner. In (Bergemann and Morris 2005) uncertainty is modelled with information partitions. The main technical difference is that in our setting, as in classical social choice theory, preferences are ordinal, whereas in (robust) mechanism design preferences are numerical payoffs, which allows for payments (which we do not allow). This connection with mechanism design, however, is certainly worth exploring further. (We are very grateful to an anonymous reviewer for pointing out this connection to us.)
The logical setting defined in the paper allows us to represent various classes of situations already studied specifically in (computational) social choice, and new classes of problems will of course be representable as well; it thus provides a homogeneous, unified representation framework. In some of the classes of problems we need one more agent, the chair. The chair may have preferences, but does not vote. In some classes of problems the dynamics plays a crucial role in defining these problems, both as announcements (revealing preference) and assignments (declaring votes). Here are a few such problems:

1. Possible and necessary winners (Konczak and Lang 2005): there is one more agent (the chair), who has incomplete knowledge of each of the votes; the voters' knowledge does not matter. $x$ is a possible winner if the chair does not know that $x$ is not a (co)winner, and a necessary winner if the chair knows that $x$ is a (co)winner.
2. Stackelberg voting games (Xia and Conitzer 2010): voters express their votes in sequence, in a commonly known order. Their preferences are common knowledge. The votes are announced publicly and each voter thus knows the votes of the voters who speak before him.
3. Sequential voting games with abstention (Desmedt and Elkind 2010): voters express their votes in sequence, preferences are common knowledge; the voting rule is plurality; voters have the choice to vote or to abstain; voting is costly.
4. Control by adding or removing voters or candidates (Bartholdi III et al. 1992): the chair has perfect knowledge of the voters' preferences; voters have no knowledge (and thus are supposed to vote truthfully); the chair may add or remove some candidates as well as register or unregister voters.
5.
Sequential voting on multi-issue domains (Lang and Xia 2009): the set of alternatives is a combinatorial domain, therefore the valuations are preference relations over tuples of values; voters vote in sequence, issue by issue, and the value for the (binary) issue is chosen by majority, and then communicated to the voters.

10 Acknowledgements

This contribution to the LIRa yearbook is identical to (van Ditmarsch et al. 2013). The work was previously presented at AAMAS 2012 Valencia and at the ESSLLI 2012 Opole workshop ‘Strategies for Learning, Belief Revision and Preference Change’. Hans van Ditmarsch is also affiliated to IMSc (Institute of Mathematical Sciences), Chennai, as a research associate. He acknowledges support from European Research Council grant EPS 313360.

References

R. Aumann and A. Brandenburger. Epistemic conditions for Nash equilibrium. Econometrica, 63:1161–1180, 1995.
A. Baltag and S. Smets. A qualitative theory of dynamic interactive belief revision. In Proc. of 7th LOFT, Texts in Logic and Games 3, pages 13–60. Amsterdam University Press, 2008.
S. Barbera, A. Bogomolnaia, and H. van der Stel. Strategy-proof probabilistic rules for expected utility maximizers. Mathematical Social Sciences, 35(2):89–103, 1998.
J. Bartholdi, C. Tovey, and M. Trick. The computational difficulty of manipulating an election. Social Choice and Welfare, 6(3):227–241, 1989.
J. Bartholdi III, C. Tovey, and M. Trick. How hard is it to control an election? Mathematical and Computer Modelling, 16(8/9):27–40, 1992.
J. van Benthem. Games in dynamic epistemic logic. Bulletin of Economic Research, 53(4):219–248, 2001.
J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
D. Bergemann and S. Morris. Robust mechanism design. Econometrica, 73(6):1771–1813, 2005.
S. Chopra, E. Pacuit, and R. Parikh.
Knowledge-theoretic properties of strategic voting. In Proc. of 9th JELIA, LNCS 3229, pages 18–30, 2004.
V. Conitzer, T. Walsh, and L. Xia. Dominating manipulations in voting with partial information. In Proc. of AAAI, 2011.
Y. Desmedt and E. Elkind. Equilibria of plurality voting with abstentions. In ACM Conference on Electronic Commerce, pages 347–356, 2010.
H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic epistemic logic with assignment. In Proc. of 4th AAMAS, pages 141–148. ACM, 2005.
H. van Ditmarsch, J. Lang, and A. Saffidine. Strategic voting and the logic of knowledge. In Proc. of 14th TARK – Chennai, 2013.
J. Duggan and T. Schwartz. Strategic manipulability without resoluteness or shared beliefs: Gibbard-Satterthwaite generalized. Social Choice and Welfare, 17(1):85–93, 2000.
R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. MIT Press, Cambridge MA, 1995.
A. Gibbard. Manipulation of voting schemes: A general result. Econometrica, 41:587–601, 1973.
J. Harsanyi. Games with Incomplete Information Played by ‘Bayesian’ Players, Parts I, II, and III. Management Science, 14:159–182, 320–334, 486–502, 1967–1968.
N. Hazon, Y. Aumann, S. Kraus, and M. Wooldridge. Evaluation of election outcomes under uncertainty. In Proc. of AAMAS ’08, pages 959–966, 2008.
W. Jamroga and W. van der Hoek. Agents that know how to play. Fundamenta Informaticae, 63:185–219, 2004.
K. Konczak and J. Lang. Voting procedures with incomplete preferences. In Proc. IJCAI Multidisciplinary Workshop on Advances in Preference Handling, 2005.
J. Lang and L. Xia. Sequential composition of voting rules in multi-issue domains. Mathematical Social Sciences, 57(3):304–324, 2009.
F. Liu. Reasoning about Preference Dynamics. Springer, 2011. Synthese Library, Vol. 354.
M. Osborne and A. Rubinstein. A Course in Game Theory. MIT Press, 1994.
R. Parikh, C. Tasdemir, and A. Witzel. The power of knowledge in games. In Proc.
of the Workshop on Reasoning About Other Minds, 2011. CEUR Workshop Proceedings. Volume: 751.
A. Perea. Epistemic game theory. Cambridge University Press, 2012.
J. Plaza. Logics of public communications. In Proc. of the 4th ISMIS, pages 201–216. Oak Ridge National Laboratory, 1989.
M. A. Satterthwaite. Strategy-proofness and Arrow's conditions: Existence and correspondence theorems for voting procedures and social welfare functions. Journal of Economic Theory, 10(2):187–217, April 1975.
A. Slinko and S. White. Is it ever safe to vote strategically? Technical report, Auckland University, 2008. Dep. of Math. Research Report 563.
L. Xia and V. Conitzer. Stackelberg voting games: Computational aspects and paradoxes. In Proc. of AAAI, 2010.
T. Ågotnes and H. van Ditmarsch. What will they say? - Public announcement games. Synthese, 179(S.1):57–85, 2011.

What Does it Mean to Know an Action?

Jiahong Guo
School of Philosophy and Sociology, Beijing Normal University
Abstract

This paper focuses on knowing actions as knowing their denotations in terms of successful transitions. We view actions (programs) as black boxes exhibiting input-output behavior. This view ignores all intermediate details, and we investigate how much it can do for us. Knowledge of an action is then defined with a new notation in a combined language of dynamic and epistemic logic. The main purpose of this article is to propose a first approximation of knowing actions by their external input-output behavior, implemented in a first order epistemic setting. The full version of this idea takes place in the framework of first order epistemic models, with the help of a standard translation on program expressions. Some basic logical principles of reasoning are explored with respect to validity in this way, and we apply them in particular to general epistemic properties of knowing actions, and to knowledge of different ways of generating actions, through tests, serial combination, choice, and iteration. Beyond that, because of the shape of our definition, knowing an action basically has the same introspection properties as those assumed in the basic propositional epistemic logic. The general base logic of the system is also demonstrated. Inside this full system, the logic of knowing actions here involves just a smaller fragment of the full language, but we have not determined its special properties yet.

1 Introduction

It is customary in Cognitive Psychology and Epistemology to make at least a rough division between declarative knowledge and procedural knowledge. Declarative knowledge is about knowing propositions, about knowing that. Procedural knowledge is about interaction with the world, about knowing how. It is also common in the philosophy community to draw a finer distinction among the categories knowing that, knowing how and knowing to do (Tang 2011).
Some philosophers claim that knowing how is just a species of knowing that (Stanley 2011, Stanley and Williamson 2001). We are not going to explore or justify the actual division of different kinds of knowledge and the relationships between them. Only one kind of procedural knowledge, knowing an action, will be given a formal representation here.

There is a rich literature about declarative knowledge. The history of the study of what it means to have factual knowledge starts in Antiquity, and makes a restart with the advent of Kripke semantics and the proposal of Hintikka to analyze knowledge and belief in terms of access to possible worlds (Hintikka 1962). This analysis was taken up in cognitive science (Gärdenfors 1988), computer science (Halpern 1987, Halpern et al. 2009, Fagin et al. 1995) and game theory (Aumann 1976, Battigalli and Bonanno 1999, Perea 2012).

The analysis of knowledge in computer science, however, also brings in a dynamic component. It is just a small step from knowledge based programming to knowledge of programs, knowledge of procedures, or knowledge of actions. Programs and actions have long been the area of dynamic logics that capture the effects of explicitly defined procedures (Pratt 1982, Harel et al. 2000). A natural combination of this dynamics with knowledge emerges in dynamic epistemic logic or DEL (van Benthem 1996, Baltag et al. 1998, van Benthem 2011, Ditmarsch et al. 2006). But this still does not directly deal with the analysis of what it means to know a procedure. In fact, several perspectives on the analysis of “knowing how” are possible:

1. Knowing how is about how to achieve a proposition $\varphi$ via some actions or procedures: e.g., to win a game.
2. Knowing how is about maintaining $\varphi$ as an invariant through some actions or procedures that can be controlled by the agent.
3. Knowing how is about all (or some of) the denotations of procedures or actions.
4.
Knowing how is for an agent to be able to perform certain actions or procedures as a whole: being able to swim, being able to play chess.
5. More variants are possible and useful, e.g., for the analysis of what it means for a person to know a foreign language, to know a book (the Bible, say), to know the special theory of relativity, or even to know one's spouse.

In this paper, we will take only one of these lines, focussing on knowing actions as knowing their denotations in terms of successful transitions. Thus, we view actions or procedures (complex actions) as a black box exhibiting input-output behavior. This means viewing a procedure or action as a relation, whose denotation is a set of pairs of input/output states. This view ignores all intermediate details, and we shall investigate how much it can do for us, while eventually also arriving at a better understanding of its limitations, and the need for richer views of actions.

There is some previous literature on logics for knowing actions (for some approaches different from ours, see Singh 1999). In a separate section at the end of this paper, we also briefly discuss some approaches closer to ours, and state our reasons for going beyond these earlier attempts.

Technically speaking, our proposal will be as follows. In dynamic logic, a program $\alpha$ denotes a set of ordered pairs in some given state space, where all intermediate events are ignored. We will say that an agent knows an action $\alpha$ if he can distinguish the set of input-output pairs that constitutes $\alpha$ from all other possible sets of input-output pairs (possibly within some given class of relations).

Here the propositional notion of knowledge is standard, given in terms of the accessibility relations $R$ of epistemic logic. Here $K\varphi$ is true in a world $w$ when $\varphi$ is true in all $R$-successors of $w$.
Now we will define knowledge of an action as follows, using a new notation in a combined language of dynamic and epistemic logic, and following the above idea of knowing the input-output behavior: $\dot{K}\alpha$ is true at a world $w$ iff for any two states $x, y$, if $xR_\alpha y$ then the agent knows it, otherwise the agent knows $\neg xR_\alpha y$. Practically speaking, then, the agent can make a judgment whether a given pair of states is a possible transition in the action relation or not. To bring out the logical principles governing this notion, it seems natural to explain this a bit further in a first order modal language, where we can quantify over a suitable domain of states and transitions. This is the technical framework that we shall develop below.

In summary, the main purpose of this article is to propose a first approximation of knowing actions by their external input-output behavior, implemented in a first order epistemic-dynamic setting. We will discuss in particular what reasoning about knowing-that and knowing-how becomes available in this way, with a major focus on knowledge of complex actions. At the end, we evaluate where we stand, and discuss what further structure of actions and logics would be needed to give a fuller account of action and procedure.

2 Syntax and semantics of Propositional Dynamic Logic

We start with the language of propositional dynamic logic (PDL) in order to obtain a suitable language of knowing actions, the basis of our later definition of the essential semantic item $\dot{K}\alpha$ in a first order epistemic framework. All the formulas in the original language of PDL have a corresponding “standard translation” into first-order logic, enriched with fixed-point operators where needed. We will use this tool, too, toward setting up our eventual proposed system. The language of PDL is defined in two parts: formulas and actions (van Benthem et al. 2012).
The propositional part of the language (denoted by $L_{PDL}$) over some set of basic propositions $\Phi$ is given by (for simplicity, only a single agent is considered here):

$\varphi ::= \top \mid p \mid \neg\varphi \mid \varphi \lor \varphi \mid [\alpha]\varphi$

where $p$ ranges over $\Phi$. We do not yet introduce the knowledge operator $K$ and the crucial new item $\dot{K}\alpha$ of knowing action $\alpha$ here: they will be settled later in the first order epistemic framework. The dual of $[\alpha]\varphi$ is denoted by $\langle\alpha\rangle\varphi$. Other Boolean connectives are defined as usual. If we assume that a set of basic action symbols $A$ is given, then the language of actions for regular programs can be formally defined as:

$\alpha ::= a \mid {?}\varphi \mid \alpha^{\smile} \mid \alpha;\alpha \mid \alpha \cup \alpha \mid \alpha^*$

where $a$ ranges over $A$. For example, $p \land [a;b]q \to q$ is a well-formed formula. Two special atomic actions will be needed later: abort $\iota$ and skip $\Downarrow$, representing the empty relation and the identity relation on states, respectively.

Models $M$ for PDL are triples $(S, \{R_a \mid a \text{ an atomic action symbol}\}, V)$, where $S$ is a set of states or worlds (a state is commonly denoted by $s$), and $V$ is a valuation map assigning to each propositional letter a set of states, that is, $V(p) \subseteq S$, meaning that $p$ is true in every state of $V(p)$. The binary relation $R_a$ on $S$ interprets the respective atomic action $a$ in the model $M$. The interpretation of an arbitrary action $\alpha$ in a model $M$ can be explained inductively by the structure of actions as usual in dynamic logic. Now the truth of an arbitrary formula of PDL can be defined in a world of a model as follows (van Benthem et al. 2012):

$M, s \models \top$ always
$M, s \models p \iff s \in V(p)$
$M, s \models \neg\varphi \iff M, s \not\models \varphi$
$M, s \models \varphi \lor \psi \iff M, s \models \varphi$ or $M, s \models \psi$
$M, s \models \varphi \land \psi \iff M, s \models \varphi$ and $M, s \models \psi$
$M, s \models \langle\alpha\rangle\varphi \iff$ for some $t$, $(s,t) \in [[\alpha]]^M$ and $M, t \models \varphi$
$M, s \models [\alpha]\varphi \iff$ for all $t$ with $(s,t) \in [[\alpha]]^M$ it holds that $M, t \models \varphi$

where the binary relation $[[\alpha]]^M$ (also denoted by $R_\alpha$ if the context $M$ is clear) interpreting the action $\alpha$ in the model is defined as in (van Benthem et al.
2012):

$[[a]]^M = R_a$
$[[?\varphi]]^M = \{(s,s) \mid M, s \models \varphi\}$
$[[\alpha^{\smile}]]^M = ([[\alpha]]^M)^{\smile}$
$[[\alpha;\beta]]^M = [[\alpha]]^M \circ [[\beta]]^M$
$[[\alpha \cup \beta]]^M = [[\alpha]]^M \cup [[\beta]]^M$
$[[\alpha^*]]^M = ([[\alpha]]^M)^*$

It is clear that the natural denotation of the abort action $\iota$ is $\emptyset$, and that of skip $\Downarrow$ is $\Delta$, the identity relation on $S$. If $(s,t) \in R_\alpha$ in a model $M$, we may write this as $M, (s,t) \models \alpha$ for convenience, but $\alpha$ itself is not a formula in the language.

3 Standard translation of PDL into FOL

As can be seen in any good textbook on modal logic (Blackburn et al. 2001), all PDL formulas (without Kleene star) can be translated into a fragment of first order language (FOL hereafter) with at most two free variables, and even with Kleene star, PDL formulas can be translated into a fragment of FOL in which infinite disjunctions are allowed. We now take a quick look at the standard translation of PDL with regular programs.

Definition 3.1. Let $L_{PDL}$ be given as above. The target language FOL which is used for translating PDL formulas has unary predicate symbols $P, Q, \ldots$ corresponding to propositional letters $p, q, \ldots$ in $\Phi$ and a binary relation symbol $R_a$ for each atomic action $a$. We write $F(x)$ to denote a first order formula $F$ with free variable $x$.

Let $x$ be a first-order variable. Following (Blackburn et al. 2001), the standard translation $ST_x$ taking propositional modal formulas in the system PDL to first-order formulas in FOL can be defined inductively as follows:

• $ST_x(\top) = (x = x)$
• $ST_x(p) = Px$
• $ST_x(\neg\varphi) = \neg ST_x(\varphi)$
• $ST_x(\varphi \lor \psi) = ST_x(\varphi) \lor ST_x(\psi)$
• $ST_x(\varphi \land \psi) = ST_x(\varphi) \land ST_x(\psi)$
• $ST_x([\alpha]\varphi) = \forall y(ST_{xy}(\alpha) \to ST_y(\varphi))$
• $ST_x(\langle\alpha\rangle\varphi) = \exists y(ST_{xy}(\alpha) \land ST_y(\varphi))$

The formula translation calls on $ST_{xy}$ to start recursively decomposing the action $\alpha$. This requires two free variables to define binary relations.
First for the *-free fragment of PDL:

• $ST_{xy}(a) = xR_a y$
• $ST_{xy}(?\varphi) = xR_{?\varphi}y$, where $\varphi$ is a PDL formula and $R_{?\varphi} = \{(s,s) \mid M, s \models \varphi\}$
• $ST_{xy}(\alpha;\beta) = \exists z(ST_{xz}(\alpha) \land ST_{zy}(\beta))$
• $ST_{xy}(\alpha \cup \beta) = ST_{xy}(\alpha) \lor ST_{xy}(\beta)$

Still, the intended interpretation of $\alpha^*$ is the reflexive, transitive closure of $R_\alpha$, and this kind of closure of a binary relation is not expressible in FOL. Since the meaning of $\alpha^*$ is defined as

$R_\alpha^* = \bigcup_{n \in \mathbb{N}} R_\alpha^n,$

one way to go here is to use infinitely long disjunctions to capture the meaning of an iterated action $\alpha$:

$x(R_\alpha)^* y \iff (x = y) \lor xR_\alpha y \lor \bigvee_{n \ge 1} \exists z_1 \ldots z_n (xR_\alpha z_1 \land \ldots \land z_n R_\alpha y).$

So, in an infinitary modal logic allowing countably infinite disjunctions and conjunctions, we can standardly translate all the formulas of PDL. The clauses for the *-free fragment have been given above, and the following clause settles the Kleene star:

$ST_{xy}(\alpha^*) = (x = y) \lor xR_\alpha y \lor \bigvee_{n \ge 1} \exists z_1 \ldots z_n (xR_\alpha z_1 \land \ldots \land z_n R_\alpha y).$

Alternatively, we can translate PDL into the logic LFP(FO), which extends first order logic with fixed-point operators for recursive definitions using monotone operations. Either way, as with the standard translation of classical modal logic, we can define natural corresponding first order models.

Definition 3.2. A corresponding first order model $M^F$ can be constructed as $\langle S, I \rangle$, where $S$ is the domain, the same as in $M$, and $I$ is an interpretation which interprets the unary predicate symbols $P, Q, \ldots$ as subsets of $S$ with $P^I = V(p)$, $Q^I = V(q)$, ..., respectively. And $I$ interprets the binary relation symbol $R_a$ just as $R_a$ in PDL for each atomic action $a$.

We may understand the states in the PDL model $M$ intuitively as individuals (objects) here in the model $M^F$. Likewise, transition relations between worlds become relations between individuals.
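The relational reading of programs that the translation mirrors can be sketched for finite models as follows. This is only an illustrative sketch: the tuple encoding of programs and formulas, the dictionary layout of the model and all function names are assumptions of this example, and on a finite state space the Kleene star is computed by plain iteration rather than by an infinite disjunction.

```python
def compose(r1, r2):
    """Relational composition: first r1, then r2."""
    return {(x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2}


def star(rel, states):
    """Reflexive transitive closure of rel over a finite set of states."""
    closure = {(s, s) for s in states}
    while True:
        extended = closure | compose(closure, rel)
        if extended == closure:
            return closure
        closure = extended


def interpret(prog, model):
    """Denotation of a program as a set of (input, output) pairs."""
    kind = prog[0]
    if kind == "atom":
        return set(model["rel"][prog[1]])
    if kind == "test":
        return {(s, s) for s in model["states"] if holds(prog[1], model, s)}
    if kind == "conv":
        return {(t, s) for (s, t) in interpret(prog[1], model)}
    if kind == "seq":
        return compose(interpret(prog[1], model), interpret(prog[2], model))
    if kind == "choice":
        return interpret(prog[1], model) | interpret(prog[2], model)
    if kind == "star":
        return star(interpret(prog[1], model), model["states"])
    raise ValueError(f"unknown program constructor: {kind}")


def holds(formula, model, s):
    """Truth of a PDL formula at state s."""
    kind = formula[0]
    if kind == "top":
        return True
    if kind == "prop":
        return s in model["val"][formula[1]]
    if kind == "not":
        return not holds(formula[1], model, s)
    if kind == "or":
        return holds(formula[1], model, s) or holds(formula[2], model, s)
    if kind == "box":
        rel = interpret(formula[1], model)
        return all(holds(formula[2], model, t) for (u, t) in rel if u == s)
    if kind == "dia":
        rel = interpret(formula[1], model)
        return any(holds(formula[2], model, t) for (u, t) in rel if u == s)
    raise ValueError(f"unknown formula constructor: {kind}")


# Hypothetical model: a leads from 0 to 1, b from 1 to 2, and q holds at 2.
model = {
    "states": {0, 1, 2},
    "rel": {"a": {(0, 1)}, "b": {(1, 2)}},
    "val": {"q": {2}},
}
a, b = ("atom", "a"), ("atom", "b")
a_then_b = interpret(("seq", a, b), model)
box_q_at_0 = holds(("box", ("seq", a, b), ("prop", "q")), model, 0)
iterated = interpret(("star", ("choice", a, b)), model)
```

In this toy model the composed program relates state 0 to state 2 only, so the formula with the box over the composed program and q holds at state 0.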
And, getting ahead of ourselves, while a single PDL model allows of no variation here, it is easy to see how we could give this an epistemic twist, by varying universes of states and denotations of programs across worlds, allowing for significant knowledge or ignorance of agents about what a given action really does.

Now we are ready to state some correspondence results between PDL and FOL (allowing countably infinite disjunctions and conjunctions) with the help of the standard translation:

Fact 3.1. For every $\varphi \in L_{PDL}$,

$M, s \models \varphi \iff M^F \models ST_x(\varphi)[s]$;
$M \models \varphi \iff M^F \models \forall x\, ST_x(\varphi)$.

Proof. The proofs are similar to those in the classical textbook on modal logic (Blackburn et al. 2001). We only check the first claim for formulas of the form $\langle ?\varphi \rangle\psi$. First suppose $M, s \models \langle ?\varphi \rangle\psi$. This means that for some $t \in S$, $sR_{?\varphi}t$ and $M, t \models \psi$. Now $sR_{?\varphi}t$ means that $s = t$ and $M, s \models \varphi$, so $M, s \models \psi$ as well. Then by the induction hypothesis, we have $M^F \models ST_x(\psi)[s]$. And $ST_x(\langle ?\varphi \rangle\psi) = \exists y(ST_{xy}(?\varphi) \land ST_y(\psi)) = \exists y(xR_{?\varphi}y \land ST_y(\psi))$. Taking $s$ itself as the witness for $y$, we obtain $M^F \models \exists y(xR_{?\varphi}y \land ST_y(\psi))[s]$, since $M, s \models \varphi$ (hence $sR_{?\varphi}s$) and $M^F \models ST_x(\psi)[s]$. This shows that $M^F \models ST_x(\langle ?\varphi \rangle\psi)[s]$, as required. Next we prove the direction from right to left. Suppose that $M^F \models ST_x(\langle ?\varphi \rangle\psi)[s]$, where $ST_x(\langle ?\varphi \rangle\psi) = \exists y(xR_{?\varphi}y \land ST_y(\psi))$ as above. This means that $M^F \models \exists y(xR_{?\varphi}y \land ST_y(\psi))[s]$ holds. So there exists some $t \in S$ satisfying $sR_{?\varphi}t$ such that $ST_x(\psi)$ is true under the assignment assigning $t$ to $x$. The former shows that $M, s \models \varphi$ and $s = t$. The latter means that $M^F \models ST_x(\psi)[t]$. By the induction hypothesis, we have $M, t \models \psi$. These results show precisely that $M, s \models \langle ?\varphi \rangle\psi$, as required.

4 Extension to a first order epistemic framework

Next the formalization of knowledge of actions can be considered.
We could add the knowledge operators $K$, $\dot{K}$ and the formula $\dot{K}\alpha$ to the language of PDL directly, but it does not seem easy to settle the semantics of this item purely inside the usual propositional setting. By contrast, our intuitive idea is to consider this as knowing the binary relation interpreting $\alpha$: for any two states $x, y$, if $xR_\alpha y$ then the agent knows it, otherwise the agent knows $\neg xR_\alpha y$. Practically speaking, the agent can make a judgment whether the given pair of states is in the action relation or not. It seems natural to explain this intuition in an extended first order modal setting.

4.1 Language

First the alphabet of our first order epistemic language (FOEL, hereafter) is defined as follows (it is based on FOL with countably infinite disjunctions and conjunctions; we add constant symbols for convenience in proving some results later):

• constant symbols: $a, b, c, \ldots$
• variables: $x, y, z, \ldots$
• unary predicate symbols: $P, Q, \ldots$ corresponding to the propositional letters $p, q, \ldots$ in PDL
• a special binary relation symbol: $=$
• binary relation symbols: $R_a, R_b, \ldots$ corresponding to the atomic actions $a, b, \ldots$ in PDL
• quantifier symbols: $\forall$, $\exists$
• knowledge operators: $K$, $\langle K \rangle$

Only unary and binary predicates are concerned here. As in FOL with countably infinite disjunctions and conjunctions, we can have the standard translation from PDL to FOEL. Then formulas and actions in PDL are translated into first order formulas (maybe infinitely long) with several free variables. It seems useful to introduce new binary relation symbols for compound actions, following the formation rules for compound actions, while the actions in the subscripts can be standardly translated into infinitary FOL (and FOEL) as done in the last section:

$R_{?\varphi}$, $R_{\alpha;\beta}$, $R_{\alpha \cup \beta}$, $R_{\alpha^*}$.

This means that, essentially, every action has a corresponding binary relation symbol in FOEL, although strictly speaking only binary relation symbols for atomic actions are in the language.
We introduce this trick in order to have a simple expression of our later proposal. The language does not have function symbols either, so terms are only constants and variables here; the set of terms may be represented as $t_1, t_2, \ldots$. Now the FOEL formulas (denoted by $L_{FOEL}$) can be generated as

$\varphi ::= Pt \mid t_1 = t_2 \mid t_1 R_\alpha t_2 \mid \neg\varphi \mid \varphi \lor \varphi \mid K\varphi \mid \forall x\varphi$

Other Boolean cases can be defined as usual. Now we have enough syntax to formally introduce the formula $\dot{K}\alpha$ (for every action $\alpha$) as an abbreviation of a first order epistemic formula in the next definition:

Definition 4.1. The formula $\dot{K}\alpha$, representing knowing the arbitrary action $\alpha$, is defined as the following first order epistemic sentence (containing no free variables)

$\forall x \forall y((xR_\alpha y \to K\, xR_\alpha y) \land (\neg xR_\alpha y \to K \neg xR_\alpha y))$

in our first order epistemic language $L_{FOEL}$.

In what follows, we will pay close attention to the first order formulas that are standard translations from PDL and that abbreviate statements of knowing actions, since exploring the logical properties of knowing actions is a central theme here. Moreover, when talking about a PDL formula $\varphi$ in FOEL, we use the first order sentence $\forall x\, ST_x(\varphi)$ to refer to $\varphi$, but in most situations we directly use $\varphi$ as an abbreviation of $\forall x\, ST_x(\varphi)$, for several reasons: in some cases, such as analyzing properties of tests, we do not want to complicate matters, so only original PDL formulas are allowed to be arguments of a test; $\varphi$ is simpler than $\forall x\, ST_x(\varphi)$ and, just like $\dot{K}\alpha$, it can be more intuitively understood when used to form a valid result with $\dot{K}\alpha$; and every $\varphi$ corresponds to $\forall x\, ST_x(\varphi)$ in the respective models $M$ and $M^F$, so it is easy to deal with semantically.

4.2 Models

Having given the language, we now look at the semantic structures where it is supposed to be interpreted.

Definition 4.2.
A first order epistemic model is constructed from the original dynamic model M and the corresponding first order model M^F: we augment M^F to a quintuple M_C = ⟨W, R, D, I, w*⟩, where W is a non-empty set of possible worlds and R is an epistemic relation (commonly an equivalence relation) on W. D is just S in M^F (the same as the set of possible states in M). I is an interpretation: relative to a world w, it interprets each unary predicate symbol as a set of individuals (P^{I,w} ⊆ D for every w ∈ W) and each binary relation symbol as a binary relation over the domain D (R_a^{I,w} ⊆ D × D). w* ∈ W is the actual world, which inherits all the information from the original PDL model M: P^{I,w*} = V(p), Q^{I,w*} = V(q), . . . for the respective proposition letters p, q, . . . in Φ, and R_a^{I,w*} = R_a, R_b^{I,w*} = R_b, . . . for the respective atomic actions a, b, . . . in A.

Here we may think of D intuitively as the set of states relative to every world w ∈ W; the set D is fixed across worlds (this is called constant domains in the literature). The interpretations of the binary relation symbols for arbitrary (complex) actions α (symbols we added to the language for simplicity) are defined inductively in each world, just as in dynamic logic:

R_a^{I,w} = a binary relation R ⊆ D × D
R_{?ϕ}^{I,w} = {(d, d) ∈ D × D | M, d ⊨ ϕ}
R_{αˇ}^{I,w} = (R_α^{I,w})ˇ
R_{α;β}^{I,w} = R_α^{I,w} ◦ R_β^{I,w}
R_{α∪β}^{I,w} = R_α^{I,w} ∪ R_β^{I,w}
R_{α*}^{I,w} = (R_α^{I,w})*

Clearly R_α^{I,w} ⊆ D × D for every action α and every world w.

Some further explanations are in order. Every constant symbol c is interpreted as the same individual c ∈ D in all worlds. The superscript I,w will be omitted for convenience when the context is clear. Just as in first order semantics, the valuation function σ assigns an individual in D to each variable and constant symbol at each world.
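The inductive clauses for complex actions can be illustrated on a finite domain. The following is a minimal Python sketch, not part of the formal development: relations are represented as sets of pairs, and all helper names (compose, converse, identity, denote_test, star) are hypothetical. Composition follows the order used in the text, first an α-step and then a β-step, and the test clause takes as input an assumed predicate saying at which states ϕ holds.

```python
def compose(r, s):
    """Relational composition: first an r-step, then an s-step."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    """Converse relation: swap the two components of every pair."""
    return {(b, a) for (a, b) in r}

def identity(domain):
    """The diagonal relation on the domain."""
    return {(d, d) for d in domain}

def denote_test(domain, holds):
    """Denotation of a test ?phi, given a predicate for where phi holds."""
    return {(d, d) for d in domain if holds(d)}

def star(r, domain):
    """Reflexive transitive closure of r over a finite domain."""
    closure = identity(domain) | set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger
```

Union needs no helper, since it is just set union of the two extensions. On an infinite domain the closure loop in star would not terminate in general, so the sketch only applies to finite state spaces.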
As in first order logic, the terms t are exactly the constants and variables (recall that there are no function symbols in our language).

We have given an intuitive motivation for these semantic structures earlier. Basically, they are a first-order way of viewing what the state space of a process can be like. Variation across worlds arises naturally when we think of agents who do not know exactly how the process runs, for instance because of observational limitations. In multi-agent settings (not defined here, but analogous to our setting), epistemic variation may arise because agents who know their own actions may still not know those of other agents.

Now we come to the truth definition for our language.

Definition 4.3. Truth of a formula ϕ at a world w under an assignment σ in a given model M_C is defined inductively as usual (just as in first order semantics, σ(t) ∈ D for every term t):

M_C, w, σ ⊨ t = t ⇐⇒ always
M_C, w, σ ⊨ Pt ⇐⇒ σ(t) ∈ P^{I,w}
M_C, w, σ ⊨ t1 R_α t2 ⇐⇒ (σ(t1), σ(t2)) ∈ R_α^{I,w}
M_C, w, σ ⊨ ¬ϕ ⇐⇒ M_C, w, σ ⊭ ϕ
M_C, w, σ ⊨ ϕ ∨ ψ ⇐⇒ M_C, w, σ ⊨ ϕ or M_C, w, σ ⊨ ψ
M_C, w, σ ⊨ ϕ ∧ ψ ⇐⇒ M_C, w, σ ⊨ ϕ and M_C, w, σ ⊨ ψ
M_C, w, σ ⊨ Kϕ ⇐⇒ for every w′ ∈ W with wRw′, M_C, w′, σ ⊨ ϕ
M_C, w, σ ⊨ ∀xϕ ⇐⇒ for all d ∈ D, M_C, w, σ[x := d] ⊨ ϕ

If a first order formula F is true under every assignment σ at world w, we say that it is true at w, written M_C, w ⊨ F (or w ⊨ F if the model M_C is clear from the context). If F is true at all worlds of M_C, we say that it is valid in the model M_C, written M_C ⊨ F. If F is valid in all models of the above first order epistemic setting, it is called valid, written ⊨ F. The duals of K and ∀ are defined as usual in epistemic logic and first order logic. Clearly, the truth of a formula without free variables does not depend on the assignment σ.
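To make the truth clauses concrete for the sentence abbreviated by K̇α, here is a small Python sketch of a checker over a finite constant-domain model. It is an illustration under stated assumptions, not the paper's own machinery: the function names, the dictionary encoding of the interpretation I, and the two-world example model are all hypothetical.

```python
def successors(R, w):
    """Worlds epistemically accessible from w under the relation R."""
    return {v for (u, v) in R if u == w}

def knows_action(R, domain, interp, w, name):
    """Check the sentence abbreviated by K-dot(name) at world w.

    interp[v][name] is the extension of the relation symbol at world v.
    """
    here = interp[w][name]
    for x in domain:
        for y in domain:
            for v in successors(R, w):
                there = interp[v][name]
                # (x R y -> K x R y) and (not x R y -> K not x R y):
                # membership of (x, y) must agree at w and at every successor.
                if ((x, y) in here) != ((x, y) in there):
                    return False
    return True

# Hypothetical model: two worlds, total (hence equivalence) relation,
# constant domain {1, 2}; action a is fixed across worlds, b is not.
R = {("w", "w"), ("w", "v"), ("v", "w"), ("v", "v")}
D = {1, 2}
I = {
    "w": {"a": {(1, 2)}, "b": {(1, 2)}},
    "v": {"a": {(1, 2)}, "b": {(2, 1)}},
}
```

In this assumed model, knows_action(R, D, I, "w", "a") holds because the extension of a is the same at both worlds, while knows_action(R, D, I, "w", "b") fails because the pair (1, 2) is in the extension of b at w but not at v.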
In particular, we may consider the truth of K̇α at a world w of M_C simpliciter, that is,

M_C, w ⊨ ∀x∀y((x R_α y → K x R_α y) ∧ (¬x R_α y → K¬x R_α y))

This is simply true or false at the world, since the first order epistemic formula abbreviated by K̇α contains no free variables.

4.3 Logic

The next natural theme is the logic of these models and of the language FOEL. To start with, the classical epistemic logic for single-agent knowledge of propositions is widely accepted to be S5. Here are its three well-known axioms, with their prominent epistemic interpretations, over and above the axiom system K of basic modal logic:

Veridicality Kϕ → ϕ.
Positive Introspection Kϕ → KKϕ.
Negative Introspection ¬Kϕ → K¬Kϕ.

By basic modal logic, S5 is sound and complete with respect to the class of all models (and frames) whose accessibility relation is an equivalence relation. As for an intuitive interpretation, the first of these axioms seems uncontroversial: knowledge has to be true, or it does not deserve to be called knowledge according to the tradition of western philosophy. The introspection properties have been widely and deeply discussed by philosophers, and both objections and support can be found in the literature; still, they give a reasonable characterization of propositional knowledge. We will not focus on this philosophical issue here. We believe our treatment of knowing an action would also work if other philosophical positions were taken.

Next, for standard first order modal logic, it is well known that all the theorems of the minimal first order normal modal logic with constant domains (the system also called first order K), as presented in (Priest 2008), are valid in all of the above models. In our setting, the assumption of a constant domain means that the agent knows the state space of the relevant process, though perhaps not its transition structure.
In particular, one can check that both the Barcan Formula (BF) and the Converse Barcan Formula (CBF)

BF: ∀xKF(x) → K∀xF(x)
CBF: K∀xF(x) → ∀xKF(x)

are valid in standard relational first order modal models with constant domains (Arló-Costa and Pacuit 2006). This fixes our basic logic. The minimal logic determined by all of the above first order epistemic structures is first order S5 plus BF and CBF with a countably infinite language, since R is an equivalence relation and the domain remains constant across worlds.

5 Reasoning about knowledge of actions

Now we can study the logical properties of reasoning about knowledge of actions. We do so in several steps. First, we consider the simplest atomic actions, namely tests. Our second topic is general epistemic properties of knowing an action, such as its being veridical or introspective. Finally, we take up knowledge of complex actions, and how this knowledge relates to knowledge of their components.

5.1 Knowing tests

Consider the test action ?ϕ (without loss of generality, we assume the proposition ϕ to be purely Boolean in order to avoid entangled iterations). In regular programs, the action ?ϕ can only be executed in a state where ϕ is true, and it does not change that state. Our intuition is this: when someone says that a person knows a test ?ϕ, this means that the person knows whether ϕ. Formally, then, we expect the following assertion to hold:

Fact 5.1. K̇(?ϕ) ↔ Kϕ ∨ K¬ϕ is valid.

Interestingly, we can indeed prove this, in a somewhat roundabout way, with the help of the standard translation. The rest of this subsection is a fairly detailed investigation of this fact and of several basic observations around it.

The test ?ϕ denotes the binary relation {(a, a) | a ∈ D and ϕ is true at the state a of the model M}. We only consider tests of PDL formulas ϕ; the standard translation of ?ϕ is ST_xy(?ϕ) = ST_x(?ϕ) = x R_{?ϕ} x.
Recall that, for every world w of the model M_C, ϕ is true at a state s of M if and only if the standard translation of ϕ is true at w when the free variable is assigned the corresponding state (individual). That is, given a world w, for every first order assignment σ and state (individual) a, M, a ⊨ ϕ if and only if M_C, w, σ[x := a] ⊨ ST_x(ϕ).

In this setting, we may first expect the following property of knowledge of tests: for a PDL formula ϕ, if a rational agent knows that ϕ (written Kϕ, which is actually K∀x ST_x(ϕ) in FOEL; recall that we abbreviate ∀x ST_x(ϕ) as ϕ for simplicity), then he knows its test. That is, the following claim should hold:

Fact 5.2. The formula Kϕ → K̇(?ϕ) is valid.

Proof. Suppose Kϕ is true at an arbitrary world w of an arbitrary first order epistemic model M_C. Then ∀x ST_x(ϕ) is true at every w′ ∈ W with wRw′. Since R is an equivalence relation, ∀x ST_x(ϕ) is also true at w. Now consider an arbitrary a ∈ D (recall that a is just an element of S in the PDL model M) such that a R_{?ϕ} a holds at w. Since ∀x ST_x(ϕ) is true at w, we have M_C, w, σ ⊨ ∀x ST_x(ϕ) for every assignment σ (by the definition of truth at a world of M_C), and likewise M_C, w′, σ ⊨ ∀x ST_x(ϕ) at every world w′ with wRw′, since Kϕ is true at w. By the properties of the standard translation, M, a ⊨ ϕ for every a ∈ S. This means that a R_{?ϕ} a holds at every world w′. Therefore K a R_{?ϕ} a is true at w, and we conclude that ∀x(x R_{?ϕ} x → K x R_{?ϕ} x) is true at w of M_C, as required.

Next consider an arbitrary a ∈ D such that a R_{?ϕ} a fails at w (that is, ¬a R_{?ϕ} a is true at w). Then M, a ⊭ ϕ, and by the properties of the standard translation M_C, w, σ[x := a] ⊭ ST_x(ϕ) for every assignment σ.
But we have shown M_C, w ⊨ ∀x ST_x(ϕ), a contradiction, so this case cannot arise. Hence ¬a R_{?ϕ} a → K¬a R_{?ϕ} a is vacuously true at w. Since a is arbitrary, it is safe to conclude that ∀x(¬x R_{?ϕ} x → K¬x R_{?ϕ} x) is true at w. The two cases together show that K̇(?ϕ) is true at w. Since w is arbitrary and R_{?ϕ} is interpreted in the same way in different models with the constant domain S, we conclude that the formula Kϕ → K̇(?ϕ) is valid.

Interestingly, the converse of our observation does not hold.

Fact 5.3. The formula K̇(?ϕ) → Kϕ is not valid in general in our first order epistemic models.

Proof. To see this, consider a counter-model M_C with W = {w, w′}, D = {a, b}, R = {(w, w), (w, w′), (w′, w), (w′, w′)}, and a valuation V that makes p false at a and b in the original model M (other things remain the same). The interpretation I of M_C makes the unary predicate P equal to V(p) = ∅ at w. Clearly, then, R_{?p}^{I,w} = ∅ and R_{?p}^{I,w′} = ∅. This means that M_C, w ⊨ ∀x∀y(x R_{?p} y → K x R_{?p} y) and M_C, w ⊨ ∀x∀y(¬x R_{?p} y → K¬x R_{?p} y), that is, M_C, w ⊨ K̇(?p). But obviously M_C, w ⊭ Kp (i.e., K∀xPx), since there is a world w′ with wRw′ such that M_C, w′ ⊭ ∀xPx.

This formal result agrees with our intuition that knowledge of the action of testing a proposition ϕ does not necessarily imply knowledge that ϕ holds. However, we can obtain the following weaker version:

Fact 5.4. The formula K̇(?ϕ) → Kϕ ∨ K¬ϕ is valid.

Proof. Suppose that M_C, w ⊨ K̇(?ϕ) for arbitrary M_C and w. First consider the case R_{?ϕ}^{I,w} = ∅. Then ¬x R_{?ϕ} x is true at w for every x ∈ D. Since K̇(?ϕ) is true at w, K¬x R_{?ϕ} x is true at w for arbitrary x as well, which means that ¬x R_{?ϕ} x is true at each w′ with wRw′, for all x ∈ D. It follows that, for an arbitrary σ and all a ∈ D, M_C, w′, σ[x := a] ⊨ ST_x(¬ϕ), and hence M_C, w′, σ ⊨ ∀x ST_x(¬ϕ), that is, M_C, w′ ⊨ ¬ϕ by abbreviation, since σ is arbitrary.
So ϕ is false (hence ¬ϕ is true) at each w′, and thus K¬ϕ is true at w.

Next consider the case R_{?ϕ}^{I,w} = {(d, d) | d ∈ D}. Then x R_{?ϕ} x is true at w for all x ∈ D. Since K̇(?ϕ) is true there, K x R_{?ϕ} x is also true at w for every x. Hence, for every x, x R_{?ϕ} x is true at each R-successor w′ of w. It follows that, for an arbitrary σ and all a ∈ D, M_C, w′, σ[x := a] ⊨ ST_x(ϕ), and hence M_C, w′, σ ⊨ ∀x ST_x(ϕ), that is, M_C, w′ ⊨ ϕ by abbreviation, since σ is arbitrary. So ϕ is true at each w′, and thus Kϕ is true at w.

The last case is ∅ ⊂ R_{?ϕ}^{I,w} ⊂ {(d, d) | d ∈ D}. Then there exist objects a and c in D such that a R_{?ϕ} a is true but c R_{?ϕ} c is false (c R_{?¬ϕ} c is true) at w. Since K̇(?ϕ) is true at w, K a R_{?ϕ} a and K c R_{?¬ϕ} c are both true there, so a R_{?ϕ} a is true but c R_{?ϕ} c is false at each w′ with wRw′. Thus, for an arbitrary σ, M_C, w′, σ[x := a] ⊨ ST_x(ϕ) but M_C, w′, σ[x := c] ⊨ ST_x(¬ϕ) for each w′, that is, M_C, w′, σ[x := c] ⊨ ¬ST_x(ϕ). It follows that M_C, w′, σ ⊨ ¬∀x ST_x(ϕ). Since σ is arbitrary, M_C, w′ ⊨ ¬∀x ST_x(ϕ) for each w′ with wRw′, and hence M_C, w ⊨ K¬∀x ST_x(ϕ). That is just M_C, w ⊨ K¬ϕ by abbreviation, as required.

The above proof shows that knowledge of a test ?ϕ leads to knowledge of ϕ or knowledge of ¬ϕ. With the help of the first result above, we can deduce a more general result, perhaps surprising at first sight, which connects knowledge of ?ϕ with knowledge of ϕ, where ϕ is a PDL formula (here again the abbreviation of ∀x ST_x(ϕ) in FOEL).

Fact 5.5. The formula K̇(?ϕ) ∧ ϕ ↔ Kϕ is valid.

Proof. The direction from right to left is trivial, since we have shown that Kϕ → K̇(?ϕ) is valid, and Kϕ → ϕ is valid in classical epistemic logic. For the direction from left to right, suppose K̇(?ϕ) ∧ ϕ is true at an arbitrary world w of a model M_C.
Then for an arbitrary assignment σ, M_C, w, σ ⊨ ∀x ST_x(ϕ), and so M_C, w, σ[x := a] ⊨ ST_x(ϕ) for every a ∈ D. It follows that a R_{?ϕ} a is true at w for every a. Since K̇(?ϕ) is true at w, K a R_{?ϕ} a is also true at w for every a. This means that a R_{?ϕ} a is true at each w′ with wRw′, for every a. That is, for an arbitrary σ and every a ∈ D, M_C, w′, σ[x := a] ⊨ ST_x(ϕ), and so M_C, w′, σ ⊨ ∀x ST_x(ϕ). Since σ is arbitrary, M_C, w′ ⊨ ∀x ST_x(ϕ), which is just M_C, w′ ⊨ ϕ by abbreviation. Hence Kϕ is true at w.

It may also be interesting to check the following.

Fact 5.6. The formula K̇(?ϕ) ↔ K̇(?¬ϕ) is valid.

Proof. We only prove the direction from left to right; the other direction is similar. Suppose M_C, w ⊨ K̇(?ϕ) for arbitrary M_C and w. We check that K̇(?¬ϕ) also holds at w. For arbitrary a ∈ D, first consider the case a R_{?¬ϕ} a: this means M_C, w, σ[x := a] ⊨ ST_x(¬ϕ) for every σ, that is, M_C, w, σ[x := a] ⊭ ST_x(ϕ). It follows that ¬a R_{?ϕ} a is true at w. Since K̇(?ϕ) is true at w, K¬a R_{?ϕ} a is true at w. But ¬a R_{?ϕ} a is equivalent to a R_{?¬ϕ} a (this can be shown in the original model M), hence M_C, w ⊨ K a R_{?¬ϕ} a. So M_C, w ⊨ ∀x(x R_{?¬ϕ} x → K x R_{?¬ϕ} x), as required. Next consider the case ¬a R_{?¬ϕ} a for arbitrary a ∈ D: this means M_C, w, σ[x := a] ⊨ ST_x(ϕ) for every σ, that is, a R_{?ϕ} a is true at w. Since K̇(?ϕ) is true at w, K a R_{?ϕ} a is true at w. But a R_{?ϕ} a is equivalent to ¬a R_{?¬ϕ} a, hence w ⊨ K¬a R_{?¬ϕ} a. So M_C, w ⊨ ∀x(¬x R_{?¬ϕ} x → K¬x R_{?¬ϕ} x), as required. The two cases together show that K̇(?¬ϕ) is also true at w, so K̇(?ϕ) → K̇(?¬ϕ) is valid.

With the help of all the above results, we can derive the statement made at the beginning of this subsection, which said that ‘knowing whether a proposition ϕ holds is just knowing it or knowing its negation’: K̇(?ϕ) ↔ Kϕ ∨ K¬ϕ.

Proof. This is not difficult to show: the direction from left to right has already been shown.
For the other direction, suppose Kϕ ∨ K¬ϕ is true at an arbitrary world w of a model M_C. Then Kϕ is true at w or K¬ϕ is true at w. Accordingly, K̇(?ϕ) is true at w or K̇(?¬ϕ) is true at w, since Kϕ → K̇(?ϕ) has been shown valid. Applying the result K̇(?ϕ) ↔ K̇(?¬ϕ) just proved, K̇(?ϕ) is true at w in either case, as required.

5.2 General properties of knowing actions

Just as in classical epistemic logic for propositions, we are interested in general properties of reasoning about knowing actions. Interestingly, many sound general principles corresponding to those of classical epistemic logic, such as Positive and Negative Introspection, can be proved valid. First, we have the following counterpart of Positive Introspection for knowing actions:

Fact 5.7. The formula K̇α → K K̇α is valid.

Proof. Suppose M_C, w ⊨ K̇α for an arbitrary world w of a given first order epistemic model M_C. We need to show that w′ ⊨ K̇α for every w′ ∈ W with wRw′, that is,

w′ ⊨ ∀x∀y((x R_α y → K x R_α y) ∧ (¬x R_α y → K¬x R_α y)).

Consider two arbitrary objects c, d ∈ D, and first suppose w′ ⊨ c R_α d. There are two subcases. One is w ⊨ c R_α d: by the supposition w ⊨ K̇α, we have w ⊨ K c R_α d, meaning that u ⊨ c R_α d for every u ∈ W with wRu. But wRw′ and R is an equivalence relation, so the R-successors of w′ are exactly the R-successors of w. Hence w′ ⊨ K c R_α d, as required. The other subcase is w ⊭ c R_α d: this is equivalent to w ⊨ ¬c R_α d, since the truth of c R_α d does not depend on first order assignments. Then by the supposition w ⊨ K̇α we get w ⊨ K¬c R_α d. It follows that w′ ⊨ ¬c R_α d since wRw′, contradicting w′ ⊨ c R_α d. So this subcase is impossible.

Next suppose w′ ⊨ ¬c R_α d. Again there are two subcases. One is w ⊨ c R_α d: this subcase can be excluded similarly by deriving a contradiction.
We only treat the other subcase, w ⊭ c R_α d: it is equivalent to w ⊨ ¬c R_α d for the same reason as above. Then by the supposition w ⊨ K̇α we get w ⊨ K¬c R_α d, which means that u ⊨ ¬c R_α d for every u ∈ W with wRu. But wRw′ and R is an equivalence relation, so the R-successors of w′ are exactly the R-successors of w. Hence w′ ⊨ K¬c R_α d, as required. The two cases together give the desired result.

Interestingly, we also have a counterpart of Negative Introspection:

Fact 5.8. The formula ¬K̇α → K¬K̇α is valid.

Proof. Suppose M_C, w ⊨ ¬K̇α for an arbitrary world w of a given first order epistemic model M_C. We need to show that w′ ⊨ ¬K̇α for every w′ ∈ W with wRw′, that is, w′ ⊨ ¬∀x∀y((x R_α y → K x R_α y) ∧ (¬x R_α y → K¬x R_α y)), which is equivalent to w′ ⊨ ∃x∃y((x R_α y ∧ ¬K x R_α y) ∨ (¬x R_α y ∧ ⟨K⟩ x R_α y)). Since w ⊨ ¬K̇α by supposition, we can find objects c, d that satisfy one of the following conditions:

1. w ⊨ c R_α d ∧ ¬K c R_α d
2. w ⊨ ¬c R_α d ∧ ⟨K⟩ c R_α d

Consider the first case. There are two subcases:

1. w′ ⊨ c R_α d. By w ⊨ ¬K c R_α d in case 1, there exists some u ∈ W with wRu such that u ⊨ ¬c R_α d. But wRw′ and R is an equivalence relation, so w′Ru. Hence w′ ⊨ ¬K c R_α d, and so w′ ⊨ c R_α d ∧ ¬K c R_α d, as required.
2. w′ ⊨ ¬c R_α d. By w ⊨ c R_α d and w′Rw (since wRw′ and R is symmetric), we have w′ ⊨ ⟨K⟩ c R_α d, and so w′ ⊨ ¬c R_α d ∧ ⟨K⟩ c R_α d.

In both subcases of case 1, w′ ⊨ (c R_α d ∧ ¬K c R_α d) ∨ (¬c R_α d ∧ ⟨K⟩ c R_α d) holds. Next consider the second case, again with two subcases:

1. w′ ⊨ c R_α d. By w ⊨ ¬c R_α d in case 2 and w′Rw (since wRw′ and R is symmetric), we have w′ ⊨ ¬K c R_α d, and so w′ ⊨ c R_α d ∧ ¬K c R_α d, as required.
2. w′ ⊨ ¬c R_α d. By w ⊨ ⟨K⟩ c R_α d in case 2, there exists some u ∈ W with wRu such that u ⊨ c R_α d. But wRw′ and R is an equivalence relation, so w′Ru. Hence w′ ⊨ ⟨K⟩ c R_α d, and so w′ ⊨ ¬c R_α d ∧ ⟨K⟩ c R_α d.

In both subcases of case 2, w′ ⊨ (c R_α d ∧ ¬K c R_α d) ∨ (¬c R_α d ∧ ⟨K⟩ c R_α d) holds as well.
Hence in every case there exist objects c, d ∈ D satisfying w′ ⊨ (c R_α d ∧ ¬K c R_α d) ∨ (¬c R_α d ∧ ⟨K⟩ c R_α d) at every world w′ ∈ W with wRw′. This just says that w ⊨ K¬K̇α holds for the arbitrary given world w and model M_C under the supposition w ⊨ ¬K̇α, as required.

Finally, we define the dual of K̇α, written ⟨K̇⟩α, by adding two negations to the universal first order sentence to obtain an existential version, that is,

⟨K̇⟩α = ∃x∃y((R_α xy → K R_α xy) ∧ (¬R_α xy → K¬R_α xy)).

Then the following general properties can be shown (for a non-empty domain D):

K̇α → ⟨K̇⟩α
⟨K̇⟩α → K⟨K̇⟩α

These may be seen as counterparts of the axioms D and 5 in classical doxastic logic.

5.3 Knowing complex actions

Now we provide an extensive technical discussion of how knowing complex actions relates to knowing their components. We follow the inductive steps in the definition of complex PDL programs, in a number of subsections.

Abort and Skip First consider the simplest case, the abort action ι. The relation symbol R_ι of the abort action is interpreted as ∅ in every model. Intuitively, the abort action should be known in every world by rational agents, and K̇ι can indeed be checked to be valid in our first order epistemic setting. The argument is this. For the first clause of the definition of K̇ι, given an arbitrary world w, no pair of states (individuals) is in the empty relation, so ∀x∀y(R_ι xy → K R_ι xy) holds vacuously. For the second clause, no pair of states (a, b) belongs to ∅ in any world, that is, ¬a R_ι b is true in every world. Clearly, then, w ⊨ K¬a R_ι b.

Next, consider the atomic skip action ⇓. The corresponding relation symbol R_⇓ is interpreted as Δ_D = {(a, b) ∈ D² | a = b} = {(a, a) | a ∈ D} in every world of each of the above first order models. In this way, we can again conclude that K̇⇓ is valid. Here is the proof in more formal detail.
Let c, d ∈ D be arbitrary states. If (c, d) ∈ R_⇓ at a world w, then c = d, and this holds in every world, since D = S is fixed and R_⇓ is interpreted in the same way (as the identity relation) in every world. It follows that K(c Δ_D d) is true at w. If (c, d) ∉ R_⇓ at w, then c ≠ d at w, and hence at every world, again because D is fixed. Similarly, K¬(c Δ_D d) is true at w. Together these show that K̇⇓ holds at w. Since w is arbitrary, K̇⇓ is true in every world of an arbitrary model.

Converse actions A natural next step is the relationship between knowledge of an action α and knowledge of its converse. Suppose an agent knows α in a world. This means that for each pair of states, the agent can decide whether it satisfies R_α or not. Since the agent is rational and has sufficient inferential ability, he can also decide whether the pair satisfies the converse of R_α, and vice versa. The following fact bears this out:

Fact 5.9. The formula K̇α ↔ K̇αˇ is valid.

Proof. This is not difficult to prove formally. In every world w of each first order model M_C, R_{αˇ} is interpreted as the converse of R_α^{I,w}. Now K̇α is just ∀x∀y((R_α xy → K R_α xy) ∧ (¬R_α xy → K¬R_α xy)). By the definition of the converse relation and first order logic, it is equivalent to ∀y∀x((R_{αˇ} yx → K R_{αˇ} yx) ∧ (¬R_{αˇ} yx → K¬R_{αˇ} yx)). The latter is just K̇αˇ by the definition of knowing actions. The other direction is analogous.

Sequential combination Next we look at knowledge of the crucial sequential combination of two actions α and β, given knowledge of α and of β. Intuitively, if a rational agent can grasp the full denotations of R_α and R_β, it should be possible for him to grasp the denotation of the new relation composed from R_α and R_β. That is, we expect

Fact 5.10. The formula K̇α ∧ K̇β → K̇(α; β) is valid.

Proof.
Interestingly, this can be proved formally in our first order epistemic setting. Suppose M_C, w ⊨ K̇α ∧ K̇β at an arbitrary world w of a given model M_C, and first consider arbitrary a and b in D for which a R_α ◦ R_β b holds at w. (In later proofs we will directly use variables such as x, y to stand for individuals a, b, and M_C, w may be omitted if the context is clear. The added relation symbol R_{α;β} is interpreted as the binary relation R_α ◦ R_β in our first order epistemic models, just as in the classical logic of actions.) This means that there exists z such that a R_α z ∧ z R_β b holds at w. By the definition of knowing an action and the supposition K̇α ∧ K̇β, there exists z such that K a R_α z ∧ K z R_β b is true at w as well. It follows by classical epistemic logic that there exists z such that K(a R_α z ∧ z R_β b) is true at w. By first order modal logic K with constant domains, as presented in (Priest 2008) and (Arló-Costa and Pacuit 2006), we can safely conclude that K∃z(a R_α z ∧ z R_β b), that is, K a R_α ◦ R_β b (K a R_{α;β} b), is true at w. Since a, b are arbitrary, ∀x∀y(x R_α ◦ R_β y → K x R_α ◦ R_β y) is true at w, as required.

Now consider arbitrary x, y for which ¬x R_α ◦ R_β y is true at w. This means ¬∃z(x R_α z ∧ z R_β y), which by first order logic is equivalent to ∀z(¬x R_α z ∨ ¬z R_β y). Since K̇α ∧ K̇β is true at w, we get ∀z(K¬x R_α z ∨ K¬z R_β y) at w, which implies ∀zK(¬x R_α z ∨ ¬z R_β y) in basic modal logic, and hence ∀zK¬(x R_α z ∧ z R_β y) by propositional logic. Since BF and CBF are valid in standard first order modal logic (relational semantics) with constant domains, we can conclude that w satisfies K∀z¬(x R_α z ∧ z R_β y), which is equivalent to K¬∃z(x R_α z ∧ z R_β y), that is, K¬x R_α ◦ R_β y is true at w, as required.

One might expect the converse to hold as well. But it does not:

Fact 5.11. The formula K̇(α; β) → K̇α ∧ K̇β is not valid.

Proof.
From our experience with binary relations, we can conclude that it should not be valid in general. Intuitively, if an agent knows all pairs in the extension of the composition of two binary relations, it does not follow that he knows all pairs in the extensions of the separate relations, because the same composition can be generated in different ways. Even with more information, such as knowing one of the combined actions, the agent need not know the other. This means that K̇(α; β) ∧ K̇α → K̇β is not valid either. Suppose R_a^{I,w} = {(1,2), (1,3)} and R_b^{I,w} = {(2,2), (3,2)} are the denotations of the atomic actions a and b, respectively. It follows that R_{a;b}^{I,w} = {(1,2)}. Then the agent cannot draw a definite conclusion about R_b from the information about R_{a;b} and R_a: he may arrive at relations different from the original R_b^{I,w} (for instance, the relation {(2,2)} yields the same composition with R_a). A fact pointing in the same direction is that the formula is not valid in general in the tableau framework of (Priest 2008).

Choice Turning to combinations of actions by choice, we can make similar observations.

Fact 5.12. The formula K̇α ∧ K̇β → K̇(α ∪ β) is valid.

Proof. Suppose K̇α ∧ K̇β is true at an arbitrary w of M_C, and for arbitrary x, y consider the case that x R_{α∪β} y holds at w. In dynamic logic for actions, R_{α∪β} = R_α ∪ R_β, so x (R_α ∪ R_β) y. It follows that x R_α y or x R_β y, and then K x R_α y or K x R_β y is true at w, since K̇α ∧ K̇β is true there. By normal modal logic, K(x R_α y ∨ x R_β y), that is, K(x R_{α∪β} y), is true at w, as required. Now consider the case ¬x R_{α∪β} y for arbitrary x, y. This means that ¬x R_α y and ¬x R_β y are true at w. Since K̇α ∧ K̇β is true there, K¬x R_α y and K¬x R_β y, that is, K¬x R_α y ∧ K¬x R_β y, is true at w. This is equivalent to K(¬x R_α y ∧ ¬x R_β y) in propositional epistemic logic, and hence to K¬(x R_α y ∨ x R_β y) by propositional logic.
Since x R_α y ∨ x R_β y is just x R_{α∪β} y by the definition of ∪, the latter is just K¬(x R_{α∪β} y). Hence it is also true at w, as required.

As with sequential combination, the converse does not hold.

Fact 5.13. The formula K̇(α ∪ β) → K̇α ∧ K̇β is not valid.

Proof. This is easy to understand intuitively: knowing the denotation of R_α ∪ R_β is not enough to know its components separately. In fact, one can check by tableaux that even the weaker version K̇(α ∪ β) → K̇α ∨ K̇β is not valid in the above first order epistemic framework. The following counter-example may help. Suppose that in a model M_C we have R_α^{I,w} = {(1,2), (1,3)} and R_β^{I,w} = {(2,1), (1,3)}. Clearly R_α^{I,w} ∪ R_β^{I,w} = {(1,2), (1,3), (2,1)}. We cannot tell which component an element of the union comes from: (1,3), for example, could come from R_α or R_β or both. So in general the agent cannot draw a clear boundary between the extensions of R_α and R_β within the denotation of R_α ∪ R_β, which means that he may not know either of α and β.

Interestingly, the converse of the above weaker version,

(K̇α ∨ K̇β) → K̇(α ∪ β)

is not valid either, just as expected, since knowing one of the component programs is not enough to know the whole program. Furthermore,

K̇α → K̇(α ∪ β)

and

K̇(α ∪ β) ∧ ¬K̇α → K̇β

are both invalid in our first order epistemic framework. These results appear to be in keeping with our intuitions about knowledge of random combinations of binary relations.

Kleene star We continue with properties of knowledge of iterations α*, arguably the most complex form of structured action. Recall that in classical dynamic logic for regular programs (actions), R_{α*} is interpreted as (R_α)*, the reflexive transitive closure of R_α (see also (Doets and van Eijck 2004, van Benthem et al. 2012)). That is,

(R_α)* = Δ ∪ R_α ∪ R_α^2 ∪ R_α^3 ∪ . . .
Here R_α^n is defined (interpreted) as follows in our first order epistemic model with countably infinite language:

R_α^0 = Δ_D
R_α^n = R_α ◦ R_α^{n−1} for n > 0.

This can be expressed without the dots, as mentioned for the standard translation:

R_α^* = ⋃_{n∈N} R_α^n.

Perhaps the first natural question on this issue is: if a rational agent knows an action α, is it reasonable for him to know the compound action α* that is generated from α? The intuitive answer seems to be “yes”, at least at the level of denotations where we are operating. And indeed we can show

Fact 5.14. The formula K̇α → K̇α* is valid.

Proof. To prove this, we first define α^n (for n ∈ N):

α^0 = ⇓
α^n = α; α^{n−1} for n > 0.

We then prove the following lemma: for every n ∈ N, K̇α → K̇α^n is valid. For n = 0, α^0 means that α is not executed at all, so α^0 is just ⇓, and K̇⇓ has been shown to be valid. For n = 1, K̇α → K̇α is trivially valid. Now suppose K̇α → K̇α^k is valid for some k ∈ N. Then K̇α → K̇α ∧ K̇α^k is valid by the induction hypothesis. But we have already shown that K̇α ∧ K̇α^k → K̇(α; α^k) is valid, so K̇α → K̇(α; α^k), that is, K̇α → K̇α^{k+1}, is valid as well.

Now we can prove that K̇α → K̇α* is valid. Suppose K̇α is true at an arbitrary world w of an arbitrary model M_C. By the lemma, M_C, w ⊨ K̇α^n for every n ∈ N. For any x, y, first consider the case that x (R_α)* y is true at w. Then at least one of the following holds at w: x Δ_D y, x R_α y, . . ., x R_{α^n} y, . . . (for some n > 1). Without loss of generality, suppose it is x R_{α^i} y (for i > 1). By the lemma just proved, K x R_{α^i} y is true at w. This means that x R_{α^i} y is true at every w′ with wRw′. Hence x R_α^* y is true at every such w′, since R_{α^i} ⊆ R_α^*. So K x R_α^* y is true at w, as required. Next consider the case that ¬x (R_α)* y is true at w. Then none of the following holds at w: x Δ_D y, x R_α y, . . ., x R_{α^n} y, . . . (for some n > 1); that is, ¬x Δ_D y, ¬x R_α y, . . ., ¬x R_{α^n} y, . . . are all true at w. By the lemma again, K¬x Δ_D y, K¬x R_α y, . . ., K¬x R_{α^n} y, . . . are all true at w. It follows that ¬x Δ_D y, ¬x R_α y, . . ., ¬x R_{α^n} y, . . . are all true at every w′ with wRw′. So x R_α^* y is false (that is, ¬x R_α^* y is true) at every such w′. This shows M_C, w ⊨ K¬x R_α^* y, as required.

The converse principle, that is, the implication K̇α* → K̇α, does not seem to be valid in general. It is not difficult to check this by providing a counter-model, but we omit this here.

From all the above results on knowing complex actions, we can draw two general conclusions. In all cases it can be proved that, if an agent knows the components of an action, then he also knows the complex action. But the converses do not hold in general, and several counter-examples have been found. What this means to us is that the level of mere denotations does not provide enough information for the agent to recover how the components worked. This seems a natural limit to what can be achieved by letting knowledge operate just on the transition relation associated with complex actions or procedures. If we want to go further, more structure will have to be given to the objects of knowledge.

6 Some related work

An earlier logic for knowing actions can be found in (Li 2005, Liu and Li 2005). Their work is mainly based on a propositional modal language, and mainly epistemic actions are considered. The definition of knowing an action Kα in Li (2005) is

M, s ⊨ Kα if and only if for any t, t′ ∈ S, if sRt and tRt′, then (t, t′) ∈ R_α.

Here M = (S, R, R_α (for each action symbol α), V) is a dynamic epistemic model, and s is a state similar to those defined in section 2 here (an epistemic relation R has been added). In this way, knowing an action α at s can be understood as ‘in every R-successor t knows all the output states of α’.
While this has some technical charm and virtues in its formal development, we doubt the motivation of this definition on at least two points. First, the input states of actions are ignored, even though these seem crucial to understanding what an action does. Second, agents know something in R-successors, but not directly at s, which seems strange. We can also derive several undesired logical validities in this framework, which we do not pursue here.

For earlier preliminary work on the logic of knowing complex actions, see (Wang and Cui 2010). The authors define the denotation of an action α as a non-empty subset of W, making actions degenerate into propositions. Some of the principles that come out of this are dubious. For example, K(α; β) → Kα is valid, so that knowledge of a sequential combination of two actions leads to knowledge of the first action. This seems implausible.

Even though we see the value of these two pioneering attempts, we decided to set out in a different direction in this paper, for the reasons stated.

7 Conclusions and further directions

In this paper, we have proposed and investigated an analysis of knowing an action for a rational agent. Using some plausible intuitions from dynamic logic, we defined knowledge of an action as knowing the transition relation of the action. Our full version of this idea took place in the framework of first-order epistemic models, with the help of a standard translation on program expressions. We investigated some basic logical principles of reasoning validated in this way, and applied them in particular to general epistemic properties of knowing actions, and to knowledge of different ways of generating actions, through tests, serial combination, choice, and iteration. We found some interesting principles, such as the equivalence of knowing a test with knowing whether the tested proposition holds.
Beyond that, because of the shape of our definition, knowing an action basically had the same introspection properties as those assumed in the base logic adopted for propositional knowledge.

The general base logic of our system is first-order epistemic predicate logic S5 plus the axioms BF and CBF for constant domains. Inside this full system, the logic of knowing actions in our sense involves just a smaller fragment of the full language, but we have not determined its special properties.

Another obvious desideratum concerns a sort of imbalance in our system. We have propositional knowledge about programs, but these programs themselves were knowledge-free. Of course, it makes eminent sense to also look at programs that themselves involve epistemic structure. For instance, ‘knowledge programs’ with epistemic test conditions are studied in (Fagin et al. 1995, Baltag and Moss 2004). For example, knowing the program π = IF Kp THEN α ELSE skip seems intuitively equivalent to K̇((?Kp; α) ∪ ⇓) in our language. Clearly more needs to be done here to bring this within the scope of our analysis. Concrete examples of settings where our proposal applies may be imperfect information games (van Benthem 2001, van Benthem and Liu 2004), where one often speaks of knowing a strategy, leading to knowledge programs for “uniform strategies”. And even knowledge programs are just one instance of what can be seen as a more general interest in “epistemizing” notions from traditional modal and dynamic logic.

To mention one other example, consider an issue like the following, which connects our various notions in yet another way. Consider the basic invariance under bisimulation underlying modal logic. What does it mean to know a bisimulation between two models (a notion of knowledge referring to a relation once more), and what follows from that for our propositional knowledge of what is true in those models?
Next, our proposal is only a first and very simple start, with a very sparse notion of action. One can easily increase the level of detail here. One obvious line left unexplored is the kind of denotation that we have taken. There are also semantics for dynamic logic (and its extensions such as the modal µ-calculus) that do not give just input-output relations, but also intermediate “traces”. We think that our proposal will also work on such more structured trace semantics for actions, but the precise logical effects of this remain to be explored.

But in the end, we think that even this is not enough, and we need much more fine-grained views of the inner structure of actions or procedures, to give the knowledge operator more structure to work on. These richer denotations might come from other process theories in computer science, or also from games as models of structured interactive computation. Intuitions about validity and non-validity of the principles discussed in this paper may then reverse. We have nothing to offer in this direction here. But we do hope that we have provided a starting point that lends itself to further investigation, while mapping out issues that will return even as we turn up the magnification in our account of actions and processes.

Acknowledgements

This work was sponsored by the youth project of the National Social Science Foundation of China (No. 09CZX032) and the major program of the National Social Science Foundation of China (No. 12&ZD118). Most of the research for this paper was done while I was visiting ILLC, University of Amsterdam (The Netherlands) from August 2011 to August 2012, funded by the China Scholarship Council. I would like to extend my gratitude to Johan van Benthem and Jan van Eijck for their creative and helpful guidance, suggestions and comments on the research, and for their careful help in revising and polishing the article.
Special thanks go to Alexandru Baltag, Sonja Smets and other colleagues in Amsterdam for organizing and participating in the LIRa and LogICCC seminars, from which I acquired several views, ideas and techniques related to this research.

References

H. Arló-Costa and E. Pacuit. First-order classical modal logic. Studia Logica, 84(2):171–210, 2006.
R. J. Aumann. Agreeing to disagree. The Annals of Statistics, 4(6):1236–1239, 1976.
A. Baltag and L. Moss. Logics for epistemic programs. Synthese: Knowledge, Rationality, and Action, 139(2):165–224, 2004.
A. Baltag, L. Moss, and S. Solecki. The logic of public announcements, common knowledge, and private suspicions. In I. Gilboa, editor, Proceedings of TARK’98, pages 43–56, 1998.
P. Battigalli and G. Bonanno. Recent results on belief, knowledge and the epistemic foundations of game theory. Research in Economics, 53(2):149–225, 1999.
J. van Benthem. Exploring Logical Dynamics. Center for the Study of Language and Information, Studies in Logic, Language, and Information, 1996.
J. van Benthem. Games in dynamic epistemic logic. Bulletin of Economic Research, 53:216–248, 2001.
J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
J. van Benthem and F. Liu. Diversity of logical agents in games. Philosophia Scientiae, 8(2):163–178, 2004.
J. van Benthem, H. van Ditmarsch, J. van Eijck, and J. Jaspars. Logic in Action. Open Course Project, Institute for Logic, Language and Computation, University of Amsterdam, 2012.
P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, 2001.
H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library. Springer, 2006.
K. Doets and J. van Eijck. The Haskell Road to Logic, Maths and Programming. College Publications, 2004.
R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. The MIT Press, Cambridge, MA, 1995.
P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. The MIT Press, Cambridge, MA, 1988.
J. Halpern. Using reasoning about knowledge to analyse distributed systems. Annual Review of Computer Science, 2:37–68, 1987.
J. Halpern, D. Samet, and E. Segev. Defining knowledge in terms of belief: the modal logic perspective. Review of Symbolic Logic, 2(3):469–487, 2009.
D. Harel, D. Kozen, and J. Tiuryn. Dynamic Logic. The MIT Press, 1st edition, 2000.
J. Hintikka. Knowledge and Belief. Cornell University Press, 1962.
X. Li. Three kinds of logics of knowing an action. Logic and Cognition online journal, 3(3):35–59, 2005.
Z. Liu and X. Li. Cognition of actions. Journal of Hunan University of Science & Technology (Social Science Edition), 8(6):33–38, 2005.
A. Perea. Epistemic Game Theory. Cambridge University Press, 2012.
V. R. Pratt. On the composition of processes. POPL, pages 213–223, 1982.
G. Priest. An Introduction to Non-Classical Logic. Cambridge University Press, second edition, 2008.
M. Singh. Know-how. In A. Rao and M. Wooldridge, editors, Foundations of Rational Agency, Applied Logic Series, pages 105–132. Kluwer, 1999.
J. Stanley. Know How. Oxford University Press, 2011.
J. Stanley and T. Williamson. Knowing how. Journal of Philosophy, 98(8):411–444, 2001.
R. Tang. Knowing that, knowing how, and knowing to do. Frontiers of Philosophy in China, 6(3):426–442, 2011.
J. Wang and J. Cui. Knowing regular action logic. Journal of Southwest University (Social Science Edition), 36(4):59–65, 2010.

What You Can Do Depends on What You Have Done

Fengkui Ju and Li Liang
Department of Philosophy, Beijing Normal University, Beijing, China

Abstract

We aim to present a deontic logic with updates as an extension of Boolean Modal Logic. The features of this logic include the following: (a) deontic relations are defined on sets of finite sequences of states, called histories, and consequently, formulas are evaluated at histories, not states; and (b) it has two dynamic operators, which tend to update the obligation states of agents in different ways. This logic reflects the distinction between the descriptive and prescriptive use of norm sentences.

1 Introduction

One fundamental issue of deontic logic is Jorgensen’s dilemma (1937). This dilemma was originally about imperatives. There are inferences involving imperatives in our lives. However, imperatives express orders and do not have truth values, so it is hard to say that there is a logic of imperatives. A dilemma arises. Traditionally, deontic logic does not consider imperatives. However, norm sentences such as “you should stay” or “you may leave” are similar to imperatives in many cases: they can also be used to change agents’ behaviors, and therefore do not have truth values. Hence, this dilemma is also a serious problem in deontic logic.

There are two puzzles attached to this dilemma: Ross’s Paradox and the Free Choice Permission Paradox, both of which were identified by Ross (1944). The first puzzle can be illustrated by the inference “you should mail this letter; therefore, you should mail it or burn it”. This inference is intuitively strange, but valid according to classical logic. The second puzzle is opposite to the first one; it notes that the inference “you may drink coffee or tea; therefore, you may drink coffee” is not valid in classical logic but is intuitively plausible.

In order to solve Jorgensen’s dilemma, as mentioned in (Hilpinen 2001), many philosophers have proposed a distinction between two different uses of norm sentences: descriptive and prescriptive uses.
Norm sentences are used descriptively to state what the agent ought to do or what he is allowed to do. These sentences can be true or false in these cases. In the prescriptive use, norm sentences are used to generate norms and do not have truth values. Jorgensen’s dilemma would disappear if the prescriptive use of norm sentences were not relevant to deontic logic. Deontic logic is “legalized” this way. We consider this distinction reasonable. However, we do not think that prescriptive norm sentences are irrelevant to deontic logic. We believe that for any moral agent, there is an obligation state regarding his obligations and freedoms. Descriptive norm sentences describe these states, while prescriptive norm sentences change them. In this paper, we present a dynamic deontic logic to realize this concept.

There is a “dynamic” direction in deontic logic, in which works are based on dynamic logics. A fundamental work is that of Meyer (1988), which provided a deontic logic as an extension of Propositional Dynamic Logic. Influenced by Anderson-Kanger Deontic Logic, this work introduced a propositional constant that intuitively means that the requirements of morality are violated. Deontic operators are defined by this constant, but they are applied to actions, not propositions. There is also a “dynamic” direction in semantics for imperatives and permissions, starting from (Veltman 2009). This work is based on update semantics. It proposed a notion of plans, i.e., sets of to-do lists, which can be viewed as sets of actions. Imperatives and permissions update plans in different ways: the former tend to “strengthen” them, while the latter tend to “weaken” them.

This paper attempts to combine the spirits of these two research lines. As a propositional dynamic logic, Boolean Modal Logic contains these three action constructors: complement, intersection and choice. Our work is an extension of this logic in both language and semantics.
The extended language contains a deontic operator, applied to actions, and two dynamic operators, corresponding to the descriptive utterance of obligations and the prescriptive utterance of permissions. The prescriptive utterance of obligations is derived from other utterances. A model is a labeled transition system plus a deontic relation, which is defined on the set of finite sequences of states, called histories, not on the set of states. The truth of a formula is defined relative to histories, not states. Histories represent what the agent has done. In this way, the idea that what you have done affects what you can do is reflected semantically. Descriptive norm sentences describe models, while prescriptive norm sentences update models by changing deontic relations. In this logic, Ross’s Paradox is not valid, but the Free Choice Permission Paradox is. At the end, we axiomatize the logic.

2 Language and semantics

2.1 Language

Let $\Pi_0$ be a countable set of atomic actions and $\Phi_0$ a countable set of atomic propositions. Let $a$ range over $\Pi_0$ and $p$ over $\Phi_0$. The sets $\Pi$ of actions and $\Phi$ of propositions are defined as follows:

$$\alpha ::= a \mid 1 \mid \overline{\alpha} \mid (\alpha \cap \alpha) \mid (\alpha \cup \alpha)$$

$$\varphi ::= p \mid \top \mid O\alpha \mid \neg\varphi \mid (\varphi \wedge \varphi) \mid \langle\alpha\rangle\varphi \mid [\downarrow\alpha]\varphi \mid [\uparrow\alpha]\varphi$$

The empty action $0$ is defined as $\overline{1}$. Other routine propositional connectives, the falsity $\bot$, and the dual $[\alpha]\varphi$ of $\langle\alpha\rangle\varphi$ are defined in the usual way. To perform $\overline{\alpha}$ is to do something that is not $\alpha$. To perform $\alpha \cap \beta$ is to perform $\alpha$ and $\beta$ at the same time. To perform $\alpha \cup \beta$ is to perform $\alpha$ or $\beta$. This language does not have compositions of actions, and all actions are just one unit deep. The formula $O\alpha$ means that the agent ought to do $\alpha$. As the dual of $O\alpha$, $P\alpha$ is defined as $\neg O\overline{\alpha}$, which means that the agent may do $\alpha$. For any $\alpha$, $O\alpha$ is called a pure deontic formula. The expression $\downarrow\alpha$ denotes the descriptive utterance of “you should do $\alpha$”, and $[\downarrow\alpha]\varphi$ means that $\varphi$ is true after this utterance.
$\uparrow\alpha$ denotes the prescriptive utterance of “you may do $\alpha$”, and $[\uparrow\alpha]\varphi$ indicates that $\varphi$ is true after the utterance.

2.2 Models

Let $W$ be a set of states. Let $\Delta_W$ denote the set of finite non-empty sequences of states in $W$. Each element of $\Delta_W$ is called a history of $W$. Capitals like $H$, $J$ and $K$ denote histories. For any $H \in \Delta_W$, let $\mathring{H}$ denote the last state of $H$. A model is a tuple $M = (W, \{R_\alpha \mid \alpha \in \Pi\}, D, V)$ where

1. $W$ is a non-empty set of states;
2. $R_\alpha \subseteq W \times W$;
3. $D \subseteq \Delta_W \times \Delta_W$ and for any $(H, J) \in D$, $J = (H, w)$ for some $w \in W$;
4. $V$ is a function from $\Phi_0$ to $2^W$.

$D$ is called the deontic relation. $D$ has no loops, i.e., no history $H$ can reach itself in finitely many steps. This intuitively means that the agent’s history is always going to be his history. A model $M$ is standard if it meets the following additional constraints:

1. $R_1 = W \times W$;
2. $R_{\overline{\alpha}} = (W \times W) - R_\alpha$;
3. $R_{\alpha \cap \beta} = R_\alpha \cap R_\beta$;
4. $R_{\alpha \cup \beta} = R_\alpha \cup R_\beta$;
5. $D$ is serial.

[Figure 1: A Standard Model. Left: a labeled transition system over the states $w_1, \dots, w_5$ with transitions labeled $a$, $b$, $c$, $d$ and $e$. Right: the deontic relation, listed as $D((w_2), (w_2, w_2))$, $D((w_2, w_2), (w_2, w_2, w_1))$, $D((w_2, w_2, w_1), (w_2, w_2, w_1, w_3))$, $D((w_2, w_2, w_1), (w_2, w_2, w_1, w_4))$, $D((w_2, w_2, w_1, w_3), (w_2, w_2, w_1, w_3, w_3))$, ...]

Figure 1 depicts what a standard model looks like. A labeled transition system is on the left, and the deontic relation is on the right. Histories are sequences of states. Since actions are transitions between states, histories represent what the agent has done. Suppose he is standing in $w_2$ with a blank history $(w_2)$ behind him, which means that he has done nothing. According to the deontic relation, he now must perform $a$. After $a$ is done, he is still in $w_2$; however, his history is now $(w_2, w_2)$, and he must perform $b$, which will take him to $w_1$. What he is allowed to do depends on what he has done. There are three actions possible for the agent to perform in $w_1$: $c$, $d$ and $e$.
However, given the history $(w_2, w_2, w_1)$, as a moral agent, he is not allowed to do $e$, and he must perform $c$ or $d$, although he can freely choose which one.

We require the deontic relation to be serial. We make this requirement for the following reasons: we believe that for any action, no matter what the world is and what the agent has done, he is allowed to perform it or the opposite of it, and we do not think a coherent legal system could tolerate the existence of situations in which the agent is forbidden to do anything.

In some cases, performing an action in a state might not change this state. For example, consider an agent pushing a revolving door. It is not reasonable to think that what the agent has to do never changes before and after performing this sort of action, because otherwise, if the agent has to push this door, he might have to push it forever. This is one reason we introduce histories as parameters in defining deontic relations. A second reason will be explained later.

2.3 Updates of models

Let $M = (W, \{R_\alpha \mid \alpha \in \Pi\}, D, V)$ be a model, $H$ a history, and $\alpha$ an action.

Definition 2.1 (Two Updates of Deontic Relations).

1. $D^{\downarrow\alpha}_H = D - \{(H, (H, w)) \mid \neg R_\alpha(\mathring{H}, w)\}$;
2. $D^{\uparrow\alpha}_H = D \cup \{(H, (H, w)) \mid R_\alpha(\mathring{H}, w)\}$.

The only difference among $D$, $D^{\downarrow\alpha}_H$ and $D^{\uparrow\alpha}_H$ is that $H$ might “see” less in $D^{\downarrow\alpha}_H$ and “see” more in $D^{\uparrow\alpha}_H$ than in $D$. For any $D$, let $g_H(D) = \{w \in W \mid D(H, (H, w))\}$, which is called the goodness set of $H$ in $D$. Let $R^{\mathring{H}}_\alpha = \{w \mid R_\alpha(\mathring{H}, w)\}$. It can be verified that $g_H(D^{\downarrow\alpha}_H) = g_H(D) \cap R^{\mathring{H}}_\alpha$ and $g_H(D^{\uparrow\alpha}_H) = g_H(D) \cup R^{\mathring{H}}_\alpha$. Essentially, the two updates are two different ways of changing the goodness set of $H$ in $D$. If $D$ is serial, then $D^{\uparrow\alpha}_H$ is serial, but $D^{\downarrow\alpha}_H$ might not be. However, given that $D$ is serial, if there is a $w$ such that $D(H, (H, w))$ and $R_\alpha(\mathring{H}, w)$, then $D^{\downarrow\alpha}_H$ is serial. The following proposition includes some results about manipulating updates, which will be used later:

Proposition 1.

1. $(D^{\downarrow\alpha}_H)^{\downarrow\beta}_J = (D^{\downarrow\beta}_J)^{\downarrow\alpha}_H$;
2. $(D^{\uparrow\alpha}_H)^{\uparrow\beta}_J = (D^{\uparrow\beta}_J)^{\uparrow\alpha}_H$;
3. $(D^{\downarrow\alpha}_H)^{\uparrow\beta}_J = (D^{\uparrow\beta}_J)^{\downarrow\alpha}_H$, where $J \neq H$.

Based on the updates of deontic relations, we define updates of models:

Definition 2.2 (Two Updates of Models).

1. $M^{\downarrow\alpha}_H = (W, \{R_\alpha \mid \alpha \in \Pi\}, D^{\downarrow\alpha}_H, V)$;
2. $M^{\uparrow\alpha}_H = (W, \{R_\alpha \mid \alpha \in \Pi\}, D^{\uparrow\alpha}_H, V)$.

The two updates only change the deontic relations of models. We see that if $M$ is standard, then $M^{\uparrow\alpha}_H$ is standard, but $M^{\downarrow\alpha}_H$ might not be. $M^{\downarrow\alpha}_H$ can be viewed as the result of updating $M$ with the descriptive utterance of “you should do $\alpha$” at the history $H$, and $D^{\uparrow\alpha}_H$ as the result of updating $D$ with the prescriptive utterance of “you may do $\alpha$” at $H$. The first update tends to “stop” some transitions, while the second tends to “free” some links.

We take the model illustrated in Figure 1 as an example. Uttering “you should do $c$” in the descriptive way at $(w_2, w_2, w_1)$ would cut the deontic link between $(w_2, w_2, w_1)$ and $(w_2, w_2, w_1, w_3)$. This means the agent is not allowed to transition to $w_3$ and must perform $c$. Prescriptively uttering “you may do $e$” at $(w_2, w_2, w_1)$ would generate a link between $(w_2, w_2, w_1)$ and $(w_2, w_2, w_1, w_5)$, which means he can do $e$ now.

2.4 Semantics

Let $M = (W, \{R_\alpha \mid \alpha \in \Pi\}, D, V)$ be a model, and $H$ a history. Here we do not require $M$ to be standard. Truth of formulas at $H$ is defined as follows:

1. $M, H \models p \Leftrightarrow \mathring{H} \in V(p)$;
2. $M, H \models \top$ always holds;
3. $M, H \models O\alpha \Leftrightarrow$ for any $w \in W$, if $D(H, (H, w))$, then $R_\alpha(\mathring{H}, w)$;
4. $M, H \models \neg\varphi \Leftrightarrow$ not $M, H \models \varphi$;
5. $M, H \models (\varphi \wedge \psi) \Leftrightarrow M, H \models \varphi$ and $M, H \models \psi$;
6. $M, H \models \langle\alpha\rangle\varphi \Leftrightarrow$ there is a $w \in W$ such that $R_\alpha(\mathring{H}, w)$ and $M, (H, w) \models \varphi$;
7. $M, H \models [\downarrow\alpha]\varphi \Leftrightarrow M, H \models P\alpha$ implies $M^{\downarrow\alpha}_H, H \models \varphi$;
8. $M, H \models [\uparrow\alpha]\varphi \Leftrightarrow M^{\uparrow\alpha}_H, H \models \varphi$.

It can be verified that

9. $M, H \models P\alpha \Leftrightarrow$ there is a $w \in W$ such that $D(H, (H, w))$ and $R_\alpha(\mathring{H}, w)$;
10. $M, H \models [\alpha]\varphi \Leftrightarrow$ for any $w \in W$, if $R_\alpha(\mathring{H}, w)$, then $M, (H, w) \models \varphi$.

The formula $\langle\alpha\rangle\varphi$ being true at $H$ means that there is a way to perform $\alpha$ such that after $\alpha$ is done, $\varphi$ is true at the new history. It can be verified that $M, H \models O\alpha$ if and only if $g_H(D) \subseteq R^{\mathring{H}}_\alpha$.
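The clauses for $O\alpha$ and $P\alpha$ can be tried out on the model of Figure 1. The following is a minimal sketch, not the authors' construction: the function names are hypothetical, only the finite fragment of $D$ listed in the figure is encoded, and the endpoints of the transitions (a loops at w2, b goes from w2 to w1, c, d and e go from w1 to w4, w3 and w5) are an assumption read off from the surrounding discussion.

```python
from itertools import product

# Hypothetical encoding of the Figure 1 model (all names are illustrative).
# Assumed transitions: a: w2->w2, b: w2->w1, c: w1->w4, d: w1->w3, e: w1->w5.
W = {"w1", "w2", "w3", "w4", "w5"}
FULL = set(product(W, W))
ATOMIC = {
    "a": {("w2", "w2")},
    "b": {("w2", "w1")},
    "c": {("w1", "w4")},
    "d": {("w1", "w3")},
    "e": {("w1", "w5")},
}


def rel(action):
    """Interpret an action term as a set of state pairs (standard-model clauses)."""
    tag = action[0]
    if tag == "atom":
        return ATOMIC[action[1]]
    if tag == "one":
        return FULL
    if tag == "not":
        return FULL - rel(action[1])
    if tag == "and":
        return rel(action[1]) & rel(action[2])
    if tag == "or":
        return rel(action[1]) | rel(action[2])
    raise ValueError(f"unknown action constructor: {tag}")


# Histories are tuples of states; D is a set of (history, extended history) pairs.
H1 = ("w2",)
H2 = ("w2", "w2")
H3 = ("w2", "w2", "w1")
D = {(H1, H2), (H2, H3), (H3, H3 + ("w3",)), (H3, H3 + ("w4",))}


def goodness(D, H):
    """The goodness set g_H(D): last states of the histories that H 'sees' in D."""
    return {J[-1] for (X, J) in D if X == H}


def obligatory(D, H, action):
    """O(action) at H: every good successor is reached by the action."""
    R = rel(action)
    return all((H[-1], w) in R for w in goodness(D, H))


def permitted(D, H, action):
    """P(action) at H: some good successor is reached by the action."""
    R = rel(action)
    return any((H[-1], w) in R for w in goodness(D, H))


a, b, c, d, e = (("atom", x) for x in "abcde")
print(obligatory(D, H1, a), obligatory(D, H2, b))
print(obligatory(D, H3, ("or", c, d)), permitted(D, H3, e))
```

Under these assumptions the sketch agrees with the description of Figure 1: $a$ is obligatory at $(w_2)$, $b$ at $(w_2, w_2)$, and at $(w_2, w_2, w_1)$ the agent must do $c$ or $d$ and may not do $e$. Here `obligatory` is literally the inclusion of the goodness set $g_H(D)$ in the set of $\alpha$-successors of the last state of $H$.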
This intuitively means that $\alpha$ is obligatory for the agent if, whatever he does without violating morality, $\alpha$ would be performed. We can also verify that $M, H \models P\alpha$ if and only if $g_H(D) \cap R^{\mathring{H}}_\alpha \neq \emptyset$. This means that he is allowed to perform $\alpha$ if there is a way to perform $\alpha$ without violating morality. Similar ideas can be found in (Hilpinen 2001).

We consider only standard models reasonable. As discussed, given that a model $M$ is standard, $M^{\downarrow\alpha}_H$ might not be standard, unless there is a $w \in W$ such that $D(H, (H, w))$ and $R_\alpha(\mathring{H}, w)$, that is, $M, H \models P\alpha$. This is why we define the truth condition of $[\downarrow\alpha]\varphi$ as a conditional. Updates that result in non-standard models are unsuccessful ones. The truth of formulas is defined at histories in general models, not just in standard models, so the definition is well-defined. This semantics would collapse to classical relational semantics if the deontic part were ignored; hence, it is a genuine extension of Boolean Modal Logic. A formula $\varphi$ is valid if for any standard model $M$ and history $H$, $M, H \models \varphi$.

3 Valid formulas

Proposition 2. The following formulas are valid:

1. $O\alpha \to P\alpha$;
2. $P\alpha \to \langle\alpha\rangle\top$;
3. $[\downarrow\alpha]O\alpha$;
4. $\langle\alpha\rangle\top \to [\uparrow\alpha]P\alpha$.

From the first two items, we obtain that Kant’s Law, expressed as $O\alpha \to \langle\alpha\rangle\top$, is valid. The third indicates that the agent ought to do $\alpha$ after the descriptive utterance of “you should do $\alpha$”. The last item expresses that he is allowed to do $\alpha$ after the prescriptive utterance of “you may do $\alpha$”, given that $\alpha$ is possible to perform.

Hilpinen (2001) proposed a principle to explain why Ross’s Paradox seems invalid: in our intuitions, if a norm sentence $N_1$ entails $N_2$, then the normative effects of $N_1$ entail the normative effects of $N_2$. The prescriptive utterance of “you should mail the letter or burn it” gives the agent the permission to burn the letter, but the utterance of “you should mail the letter” does not; therefore, the normative effects of the latter do not entail the normative effects of the former.
Then Ross’s Paradox is not valid. We consider this principle plausible. Even further, we believe that its converse is also reasonable. In fact, the bi-implication version of this principle underlies update semantics in defining validity. According to the stronger version, the Free Choice Permission Paradox is valid, as the prescriptive utterance of “you may drink coffee” just gives the agent the freedom to drink coffee, whereas the utterance of “you may drink coffee or tea” gives him the freedom to drink tea, in addition to the freedom to drink coffee. Our language contains dynamic operators and models contain normative factors; thus, the normative effects of utterances can be expressed in this setting.

We believe that prescriptive norm sentences generate not only obligations but also permissions. In (Ju and Liu 2011), we have argued that with respect to normative effects, prescriptively uttering “you should do $\alpha$” is equivalent to prescriptively uttering “you may do $\alpha$” and then descriptively uttering “you should do $\alpha$”. Define $[\uparrow\downarrow\alpha]$ as $[\uparrow\alpha][\downarrow\alpha]$, which represents the action of prescriptively uttering “you should do $\alpha$”. Ross’s Paradox fails here. Let $c$ denote the action of mailing the letter, and $e$ the action of burning the letter. We look at the model illustrated in Figure 1. It can be verified that $[\uparrow\downarrow c][\uparrow\downarrow(c \cup e)]Pe$ is true at the history $(w_2, w_2, w_1)$, but $[\uparrow\downarrow c]Pe$ is false at it. Therefore, the normative effects of “you should mail the letter” do not entail the normative effects of “you should mail it or burn it”.

One may wonder why $[\downarrow\alpha][\uparrow\alpha]$ is not used to denote prescriptive utterances of “you should do $\alpha$”. The update sequence $[\uparrow\alpha][\downarrow\alpha]$ might be different from $[\downarrow\alpha][\uparrow\alpha]$, and the only difference is this: given that $\alpha$ is possible to perform, $[\uparrow\alpha][\downarrow\alpha]$ would always be successful, but $[\downarrow\alpha][\uparrow\alpha]$ might not, as $[\downarrow\alpha]$ might make a standard model not serial.
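The two claims about Figure 1 can be checked mechanically. The sketch below is illustrative only: the helper names (`down`, `up`, `prescribe_should`) are hypothetical, actions are represented directly by their transition relations, only the finite fragment of $D$ listed in the figure is encoded, and the endpoints of $c$, $d$ and $e$ (to w4, w3 and w5) are an assumption read off from the discussion of the figure.

```python
# Assumed transitions out of w1 in Figure 1 (hypothetical encoding).
C = {("w1", "w4")}  # mailing the letter
Dd = {("w1", "w3")}
E = {("w1", "w5")}  # burning the letter

H3 = ("w2", "w2", "w1")
D = {
    (("w2",), ("w2", "w2")),
    (("w2", "w2"), H3),
    (H3, H3 + ("w3",)),
    (H3, H3 + ("w4",)),
}


def goodness(D, H):
    """Last states of the histories that H 'sees' in D."""
    return {J[-1] for (X, J) in D if X == H}


def permitted(D, H, R):
    """P at H: some good successor of H is reached via the relation R."""
    return any((H[-1], w) in R for w in goodness(D, H))


def down(D, H, R):
    """Descriptive 'you should do it': drop links from H not sanctioned by R."""
    return {(X, J) for (X, J) in D if not (X == H and (H[-1], J[-1]) not in R)}


def up(D, H, R):
    """Prescriptive 'you may do it': add a link from H for every R-successor."""
    return D | {(H, H + (w,)) for (s, w) in R if s == H[-1]}


def prescribe_should(D, H, R):
    """The composite update: 'up' followed by 'down'.

    Returns None when the precondition of the 'down' step (permission) fails,
    in which case the boxed formula holds vacuously.
    """
    D1 = up(D, H, R)
    if not permitted(D1, H, R):
        return None
    return down(D1, H, R)


after_mail = prescribe_should(D, H3, C)
after_mail_then_mail_or_burn = prescribe_should(after_mail, H3, C | E)

print(permitted(after_mail, H3, E))                    # Pe after "should mail"
print(permitted(after_mail_then_mail_or_burn, H3, E))  # Pe after both utterances

# Order of the two updates: 'down' alone is blocked for e, 'up' then 'down' is not.
print(permitted(D, H3, E))
print(goodness(prescribe_should(D, H3, E), H3))
```

Under these assumptions, burning is not permitted after “you should mail the letter” alone, but it is permitted once “you should mail it or burn it” has been uttered as well. The last two lines illustrate the point about ordering: at $(w_2, w_2, w_1)$ the descriptive update with $e$ has a failing precondition, whereas the composite update with $e$ goes through and leaves $w_5$ as the only good successor.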
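We believe that in real life, given that $\alpha$ is possible to perform, the prescriptive utterance of “you should do $\alpha$” is always meaningful. This is the reason.

The Free Choice Permission Paradox, $[\uparrow(\alpha \cup \beta)][\uparrow\alpha]\varphi \leftrightarrow [\uparrow(\alpha \cup \beta)]\varphi$, is valid in this semantics, which is easy to check.

The following lemma says that the two updates do not change a model much:

Lemma 1. Let $J$ be a proper super-sequence of $H$.

1. $M^{\downarrow\alpha}_H, J \models \varphi$ if and only if $M, J \models \varphi$;
2. $M^{\uparrow\alpha}_H, J \models \varphi$ if and only if $M, J \models \varphi$.

By this lemma, we can show these two propositions:

Proposition 3. The following formulas are valid:

1. $[\downarrow\alpha]p \leftrightarrow (P\alpha \to p)$;
2. $[\downarrow\alpha]\top \leftrightarrow (P\alpha \to \top)$;
3. $[\downarrow\alpha]O\beta \leftrightarrow (P\alpha \to O(\overline{\alpha} \cup \beta))$;
4. $[\downarrow\alpha]\neg\varphi \leftrightarrow (P\alpha \to \neg[\downarrow\alpha]\varphi)$;
5. $[\downarrow\alpha](\varphi \wedge \psi) \leftrightarrow ([\downarrow\alpha]\varphi \wedge [\downarrow\alpha]\psi)$;
6. $[\downarrow\alpha]\langle\beta\rangle\varphi \leftrightarrow (P\alpha \to \langle\beta\rangle\varphi)$;
7. $[\downarrow\alpha][\downarrow\beta]\varphi \leftrightarrow [\downarrow(\alpha \cap \beta)]\varphi$.

Proposition 4. The following formulas are valid:

1. $[\uparrow\alpha]p \leftrightarrow p$;
2. $[\uparrow\alpha]\top \leftrightarrow \top$;
3. $[\uparrow\alpha]O\beta \leftrightarrow (O\beta \wedge [\alpha \cap \overline{\beta}]\bot)$;
4. $[\uparrow\alpha]\neg\varphi \leftrightarrow \neg[\uparrow\alpha]\varphi$;
5. $[\uparrow\alpha](\varphi \wedge \psi) \leftrightarrow ([\uparrow\alpha]\varphi \wedge [\uparrow\alpha]\psi)$;
6. $[\uparrow\alpha]\langle\beta\rangle\varphi \leftrightarrow \langle\beta\rangle\varphi$;
7. $[\uparrow\alpha][\uparrow\beta]\varphi \leftrightarrow [\uparrow(\alpha \cup \beta)]\varphi$.

From these propositions, we obtain that formulas containing only one dynamic operator can be equivalently reduced to formulas not containing any. By introducing histories, we obtain the valid formulas $[\downarrow\alpha]\langle\beta\rangle\varphi \leftrightarrow (P\alpha \to \langle\beta\rangle\varphi)$ and $[\uparrow\alpha]\langle\beta\rangle\varphi \leftrightarrow \langle\beta\rangle\varphi$, and consequently the reduction of dynamic operators. This is the second motivation for using the notion of histories, mentioned above.

4 Axiomatization

4.1 Axiomatization

Let $\Phi_{PC}$ be the language generated from $\Phi_0 \cup \{\top\}$ under $\neg$, $\wedge$ and $\vee$, where $\Phi_0$ is the set of atomic propositions. Let $f$ be a natural bijective function from the set $\Pi$ of actions to $\Phi_{PC}$. We say $\alpha$ and $\beta$ are equivalent if $f(\alpha) \leftrightarrow f(\beta)$ is a tautology. For instance, $\overline{a \cap b}$ is equivalent to $\overline{a} \cup \overline{b}$. The axiomatization of the logic consists of eight classes of axioms:

A. Basic axioms of normal modal logics:
(a) all propositional tautologies;
(b) $[\alpha](\varphi \to \psi) \to ([\alpha]\varphi \to [\alpha]\psi)$.

B. The axiom for choice: $\langle\alpha \cup \beta\rangle\varphi \leftrightarrow \langle\alpha\rangle\varphi \vee \langle\beta\rangle\varphi$.

C. Axioms for the universal modality:
(a) $\varphi \to \langle 1\rangle\varphi$;
(b) $\varphi \to [1]\langle 1\rangle\varphi$;
(c) $\langle 1\rangle\langle 1\rangle\varphi \to \langle 1\rangle\varphi$;
(d) $\langle\alpha\rangle\varphi \to \langle 1\rangle\varphi$.

D. The axiom for the empty modality: $[0]\bot$.

E. Axioms for equivalence of actions: $\langle\alpha\rangle\varphi \leftrightarrow \langle\alpha'\rangle\varphi$, if $\alpha$ and $\alpha'$ are equivalent.

F. Axioms for the deontic operator $O$:
(a) $(O\alpha \wedge O\beta) \leftrightarrow O(\alpha \cap \beta)$;
(b) $\neg O\alpha \to P\overline{\alpha}$;
(c) $(O\alpha \wedge P\beta) \to P(\alpha \cap \beta)$;
(d) $O\alpha \to \langle\alpha\rangle\top$;
(e) $P\alpha \to \langle\alpha\rangle\top$.

G. Axioms for the dynamic operator $[\downarrow\alpha]$:
(a) $[\downarrow\alpha]p \leftrightarrow (P\alpha \to p)$;
(b) $[\downarrow\alpha]\top \leftrightarrow (P\alpha \to \top)$;
(c) $[\downarrow\alpha]O\beta \leftrightarrow (P\alpha \to O(\overline{\alpha} \cup \beta))$;
(d) $[\downarrow\alpha]\neg\varphi \leftrightarrow (P\alpha \to \neg[\downarrow\alpha]\varphi)$;
(e) $[\downarrow\alpha](\varphi \wedge \psi) \leftrightarrow ([\downarrow\alpha]\varphi \wedge [\downarrow\alpha]\psi)$;
(f) $[\downarrow\alpha]\langle\beta\rangle\varphi \leftrightarrow (P\alpha \to \langle\beta\rangle\varphi)$.

H. Axioms for the dynamic operator $[\uparrow\alpha]$:
(a) $[\uparrow\alpha]p \leftrightarrow p$;
(b) $[\uparrow\alpha]\top \leftrightarrow \top$;
(c) $[\uparrow\alpha]O\beta \leftrightarrow (O\beta \wedge [\alpha \cap \overline{\beta}]\bot)$;
(d) $[\uparrow\alpha]\neg\varphi \leftrightarrow \neg[\uparrow\alpha]\varphi$;
(e) $[\uparrow\alpha](\varphi \wedge \psi) \leftrightarrow ([\uparrow\alpha]\varphi \wedge [\uparrow\alpha]\psi)$;
(f) $[\uparrow\alpha]\langle\beta\rangle\varphi \leftrightarrow \langle\beta\rangle\varphi$.

and three inference rules:

1. Modus Ponens: given $\varphi$ and $\varphi \to \psi$, prove $\psi$;
2. Generalization: given $\varphi$, prove $[\alpha]\varphi$;
3. Replacement of Validity: given $\varphi \leftrightarrow \varphi'$, prove $[\downarrow\alpha]\varphi \leftrightarrow [\downarrow\alpha]\varphi'$ and $[\uparrow\alpha]\varphi \leftrightarrow [\uparrow\alpha]\varphi'$.

The class E is decidable, as the set of tautologies is decidable. The logic is sound and complete with respect to the class of standard models. The soundness is easy to verify. We now show the completeness. Firstly, we show that the logic restricted to $\Phi^0$, the sub-language of $\Phi$ not containing any dynamic operators, is complete with respect to the class of standard models. Then, by use of the classes G and H of axioms and the inference rule RV, we obtain the completeness of the whole logic in a similar way as (van Ditmarsch et al. 2007) for Public Announcement Logic. To show the restricted completeness, it suffices to show that for any consistent formula $\varphi$ in $\Phi^0$, there is a standard model $M$ and a history $H$ such that $M, H \models \varphi$. Let $G$ be a consistent formula. Let $\Sigma$ be the smallest set of formulas such that $G \in \Sigma$ and $\Sigma$ is closed under sub-formulas.

4.2 CINF

Let $a_1, \dots, a_n$ be all the atomic actions of $\Sigma$ and $A = \{a_1, \dots, a_n, \overline{a_1}, \dots, \overline{a_n}\}$. Each element of $A$ is called a literal action.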
Define ΘA as such a set: {X ⊆ A | for any i ≤ n, exactly one of ai and ai is in X}. ΘA has 2n members. For any X ∈ ΘA , γ = X is called a path relative to Σ, which is an intersection of some literal actions. T There are 2n paths, if we do not consider orders of literal actions. Enumerate these paths as γ1 , . . . , γ2n . In any standard model, Rγ1 , . . . , Rγ2n are pairwise disjoint blocks and the union of them is W × W. In other words, {Rγ1 , . . . , Rγ2n } is a partition of W × W. By some refections we can get that for any α built from a1 , . . . , an , if α is not equiva- lent to 0, Rα is the union of some of these blocks. These blocks are like atomic parts of W × W. Here is an example. Suppose a, b and c are all the atomic actions under considerations. There are 8 paths: a ∩ b ∩ c, . . . , a ∩ b ∩ c, and W × W is divided into 8 parts: Ra∩b∩c , . . . Ra∩b∩c . Each non-empty action whose atomic actions occur in a, b and c is the union of some of these parts. For example, Ra∩(b∩c) = Ra∩b∩c ∪ Ra∩b∩c ∪ Ra∩b∩c . The classes D and E of axioms guarantee this result: Lemma 2. For any α occurring in Σ, if α is not equivalent to 0, there are paths γn1 , . . . , γnm such that α is equivalent to γn1 ∪ · · · ∪ γnm . 92 A Dynamic Deontic Logic For any α not equivalent to 0, we call γn1 ∪ · · · ∪ γnm the choice-intersection normal form (CINF) of α relative to Σ. Actions equivalent to 0 such as a ∩ a do not have corresponding CINFs. Lemma 3. 1. Let γh1 ∪· · ·∪γhi and γ j1 ∪· · ·∪γ jk be the CINFs of β and β. Then {γ j1 , . . . , γ jk } = {γ1 , . . . , γ2n } − {γh1 , . . . , γhi }; 2. Let γh1 ∪ · · · ∪ γhi , γ j1 ∪ · · · ∪ γ jk and γl1 ∪ · · · ∪ γlm be the CINFs of β, π and β ∩ π. Then {γl1 , . . . , γlm } = {γh1 , . . . , γhi } ∩ {γ j1 , . . . , γ jk }; 3. Let γh1 ∪ · · · ∪ γhi , γ j1 ∪ · · · ∪ γ jk and γl1 ∪ · · · ∪ γlm be the CINFs of β, π and β ∪ π. Then {γl1 , . . . , γlm } = {γh1 , . . . , γhi } ∪ {γ j1 , . . . , γ jk }; 4. 
If α is equivalent to 1, the CINF of α is γ1 ∪ · · · ∪ γ2n . The axiom D is used in proving the second item. 4.3 An incomplete model and its generated submodel Let MC = (W C , {RCα | α ∈ Π}, V C ) be the structure where 1. W C is the set of maximal consistent sets; 2. Rα uv if and only if for any ϕ, ϕ ∈ v implies hαiϕ ∈ u; 3. For any p, V(p) = {u ∈ W C | p ∈ u}. This structure is not a model, as the deontic relation is missing. Actually, if we ignore the deontic part of the language, it is the canonical model. Lemma 4. 1. If hαiϕ ∈ u, there is a v ∈ W C such that ϕ ∈ v and RCα uv; 2. RCα∪β = RCα ∪ RCβ ; 3. For any α equivalent to 0, RCα = ∅. Let w be a maximal consistent set containing G. Let M = (W, {Rα | α ∈ Π}, V) be the substructure of MC generated from w under the relation RC1 . The class C of axioms guarantee that R1 is the universal relation on W. Here a similar lemma with Lemma 4: Ju and Liang 93 Lemma 5. 1. If hαiϕ ∈ u, there is a v ∈ W such that ϕ ∈ v and Rα uv; 2. Rα∪β = Rα ∪ Rβ ; 3. For any α equivalent to 1, Rα = W × W; 4. For any α equivalent to 0, Rα = ∅. 4.4 Filtration Define a relation ≈Σ on W as this: u ≈Σ v if and only if u ∩ Σ = v ∩ Σ. This is an equivalence relation. Let M f = (W f , {Rαf | α ∈ Π}, V f ) be such a structure: 1. W f is the partition of W under ≈Σ ; 2. Rαf |u||v| if and only if there are x ∈ |u| and y ∈ |v| such that Rα xy; 3. for any p, V f (p) = {|u| | p ∈ u}. For any x, y ∈ |u| and ϕ ∈ Σ, ϕ ∈ x if and only if ϕ ∈ y. We use ϕ B |u| to express that ϕ ∈ x for any x ∈ |u|. Here is a similar lemma with Lemma 5: Lemma 6. 1. If hαiϕ B |u|, there is a v ∈ W such that ϕ B |v| and Rαf |u||v|; f 2. Rα∪β = Rαf ∪ Rβf ; 3. For any α equivalent to 1, Rαf = W f × W f ; 4. For any α equivalent to 0, Rαf = ∅. With the help of Lemma 3 and 6, it is not hard to show the following lemma: Lemma 7. 1. For any α of Σ not equivalent to 0, Rαf = Rγf n ∪···∪γnm , where γn1 ∪ · · · ∪ γnm is the 1 CINF of α; 2. Rγf 1 ∪ · · · ∪ Rγf 2n = W f × W f ; 3. 
For any ⟨α⟩⊤ and u ∈ W, if ⟨α⟩⊤ ⊳ |u|, there is a v ∈ W such that R^f_α |u||v|.

We now take stock of the situation confronting us. Our purpose is to show that the consistent formula G is satisfiable in a standard model. Two things are important: a standard model and satisfiability. The structure M^f = (W^f, {R^f_α | α ∈ Π}, V^f) might not be standard even if we ignore the deontic part, as R^f_{\overline{α}} = W^f × W^f − R^f_α and R^f_{α∩β} = R^f_α ∩ R^f_β might not be satisfied. We hope to transform it into a standard model. Those actions not built from a_1, …, a_n are irrelevant, and we can freely manipulate the interpretations of their parts not involving a_1, …, a_n; therefore, these actions do not present a problem. However, we are using the Henkin method, so at least to some extent we should "respect" the interpretations of the actions built from a_1, …, a_n if we want to obtain satisfiability. On reflection, with the help of Lemmas 3 and 6, we can see that if R^f_{γ_1}, …, R^f_{γ_{2^n}} are real atomic blocks, M^f is standard. That R^f_{γ_1}, …, R^f_{γ_{2^n}} are real atomic blocks means that they are pairwise disjoint and that R^f_{γ_1} ∪ ··· ∪ R^f_{γ_{2^n}} = W^f × W^f. The second condition holds by Lemma 7. The situation is now clear: to achieve our goal, we only need to do two things: make R^f_{γ_1}, …, R^f_{γ_{2^n}} pairwise disjoint, and respect the interpretations of the actions built from a_1, …, a_n. The copy method given in (Gargov and Passy 1990) can achieve both at the same time, although that paper contains a few mistakes, which will be explained later in a footnote.

4.5 A standard model

Let M_1 = (W_1, {R^1_α | α ∈ Π}, V_1), …, M_{2^n} = (W_{2^n}, {R^{2^n}_α | α ∈ Π}, V_{2^n}) be 2^n pairwise disjoint structures that are isomorphic to M^f. We now build a standard model from these structures. For any i ≤ 2^n, let f_i be an isomorphism from M_i to M^f. Let f = f_1 ∪ ··· ∪ f_{2^n} and U = W_1 ∪ ··· ∪ W_{2^n}. Let g : U → {1, …
, 2^n} be the following function: for any s ∈ U, g(s) is the index of the set from which s comes, i.e., for any s ∈ U, s ∈ W_{g(s)}. For any s ∈ U, let ϕ ⋉ s denote ϕ ⊳ f(s). For any i ≤ 2^n, we define a relation B_{γ_i} on U:

Definition 4.1 (Atomic Relations). Let s, t ∈ U. Let γ_{k_1}, …, γ_{k_m} be the sequence such that (i) it consists of all paths γ such that R^f_γ f(s) f(t) and (ii) k_1 < ··· < k_m. B_{γ_i} st if and only if there is a j ≤ m such that i = k_j and j = (g(t) mod m) + 1.

The sequence γ_{k_1}, …, γ_{k_m} is never empty, which is guaranteed by Lemma 7. If a path γ_i does not occur in γ_{k_1}, …, γ_{k_m}, there is no j ≤ m such that i = k_j, and so not B_{γ_i} st. Suppose γ_i occurs in γ_{k_1}, …, γ_{k_m}. Then there is one and only one j ≤ m such that i = k_j, which means that γ_i is the j-th element of γ_{k_1}, …, γ_{k_m}. In this case, if j = (g(t) mod m) + 1, then B_{γ_i} st; otherwise not.

Our purpose is to produce atomic relations. To do this, we must ensure that for any s, t ∈ U, (s, t) belongs to one and only one path. This definition gives a way to assign (s, t) to the "right" path.¹

Here is a concrete example. Let a, b, c be all the atomic actions of Σ. The paths are γ_1 = a ∩ b ∩ c, …, γ_8 = \overline{a} ∩ \overline{b} ∩ \overline{c}. Let s, t ∈ U and g(t) = 4. Then t ∈ W_4. Suppose R^f_{γ_2} f(s) f(t), R^f_{γ_5} f(s) f(t), R^f_{γ_7} f(s) f(t), and no other path relates f(s) to f(t). The sequence satisfying the two conditions in Definition 4.1 is γ_2, γ_5, γ_7. Then k_1 = 2, k_2 = 5 and k_3 = 7. As j = 2 satisfies 5 = k_j and j = (4 mod 3) + 1, we get B_{γ_5} st. For any i ≤ 8, if i ≠ 5, there is no j satisfying i = k_j and j = (4 mod 3) + 1, and so not B_{γ_i} st.

By the following lemma, B_{γ_1}, …, B_{γ_{2^n}} are real atomic relations:

Lemma 8.
1. B_{γ_1}, …, B_{γ_{2^n}} are pairwise disjoint;
2. B_{γ_1} ∪ ··· ∪ B_{γ_{2^n}} = U × U.

Proof. (1) Assume there are i, j ≤ 2^n such that i ≠ j and B_{γ_i} ∩ B_{γ_j} ≠ ∅. Then there are s, t ∈ U such that B_{γ_i} st and B_{γ_j} st. Let γ_{k_1}, …
, γ_{k_m} be the sequence such that it consists of all paths γ such that R^f_γ f(s) f(t) and k_1 < ··· < k_m. Let x, y ≤ m be such that i = k_x and j = k_y. As k_x ≠ k_y, x ≠ y. By the definitions of B_{γ_i} and B_{γ_j}, we get that x = (g(t) mod m) + 1 and y = (g(t) mod m) + 1. This is impossible. Hence B_{γ_1}, …, B_{γ_{2^n}} are pairwise disjoint.
(2) Trivially, B_{γ_1} ∪ ··· ∪ B_{γ_{2^n}} ⊆ U × U. Let s, t ∈ U. Let γ_{k_1}, …, γ_{k_m} be the sequence such that it consists of all paths γ such that R^f_γ f(s) f(t) and k_1 < ··· < k_m. Let x ≤ m be such that x = (g(t) mod m) + 1. By the definition of B_{γ_{k_x}}, we have B_{γ_{k_x}} st.

Definition 4.2 (A Model). N = (U, {S_α | α ∈ Π}, E, Z) is the model where
1. U is defined as above;
2. For any atomic action a occurring in Σ, S_a = B_{γ_{n_1}} ∪ ··· ∪ B_{γ_{n_m}}, where γ_{n_1} ∪ ··· ∪ γ_{n_m} is the CINF of a;² for any atomic action b not occurring in Σ, S_b = U × U; interpretations of compound actions are defined from interpretations of atomic actions by the corresponding operations;
3. E(H, J) if and only if there is an s ∈ U such that J = (H, s) and for any Oα ∈ Σ, if Oα ⋉ H̊, then S_α(H̊, s);
4. For any p, Z(p) = ⋃_{i ≤ 2^n} V_i(p).

¹ Gargov and Passy (1990) made a mistake at this point: the definition of atomic relations given there cannot guarantee that for any s, t ∈ U, (s, t) belongs to exactly one path. There are two other mistakes in that work: (i) by Lemma 2, those actions equivalent to 0 do not correspond to any CINF, but that paper did not notice this; (ii) Lemma 3 is necessary for the proof of completeness, but it is not mentioned in that paper at all.
² Here a might be the universal action 1.

Clearly, S_{\overline{α}} = U × U − S_α, S_{α∩β} = S_α ∩ S_β and S_{α∪β} = S_α ∪ S_β. As S_1 = B_{γ_1} ∪ ··· ∪ B_{γ_{2^n}}, S_1 = U × U by Lemma 8. If E is serial, this is a standard model.

By the following lemma, in this model, the interpretations of the actions not equivalent to 0 are unions of some atomic relations.

Lemma 9.
For any α of Σ not equivalent to 0, S_α = B_{γ_{n_1}} ∪ ··· ∪ B_{γ_{n_m}}, where γ_{n_1} ∪ ··· ∪ γ_{n_m} is the CINF of α.

In proving this lemma, we have to use Lemma 3. Now we show a crucial result:

Lemma 10. Suppose α occurs in Σ.
1. For any s, t ∈ U, if S_α st, then R^f_α f(s) f(t);
2. For any s ∈ U and y ∈ W^f, if R^f_α f(s) y, there is a t ∈ U such that f(t) = y and S_α st.

Proof. (1) Assume S_α st. By Definition 4.2, S_α is the result of operating on interpretations of atomic actions. Therefore, if α were equivalent to 0, S_α would be empty. Hence α is not equivalent to 0. Let γ_{n_1} ∪ ··· ∪ γ_{n_m} be the CINF of α. By Lemma 9, S_α = B_{γ_{n_1}} ∪ ··· ∪ B_{γ_{n_m}}. There is an i ≤ m such that B_{γ_{n_i}} st. By the definition of B_{γ_{n_i}}, R^f_{γ_{n_i}} f(s) f(t). By Lemmas 6 and 7, R^f_α = R^f_{γ_{n_1}} ∪ ··· ∪ R^f_{γ_{n_m}}. Then R^f_α f(s) f(t).
(2) Assume R^f_α f(s) y. By Lemma 6, α is not equivalent to 0. Let γ_{n_1} ∪ ··· ∪ γ_{n_m} be the CINF of α. Since R^f_α = R^f_{γ_{n_1}} ∪ ··· ∪ R^f_{γ_{n_m}}, there is an i ≤ m such that R^f_{γ_{n_i}} f(s) y. Let γ_{k_1}, …, γ_{k_h} be the sequence such that it consists of all the paths γ such that R^f_γ f(s) y and k_1 < ··· < k_h. Then γ_{n_i} occurs in γ_{k_1}, …, γ_{k_h}. Let j ≤ h be such that k_j = n_i. By Definition 4.1, we need a t with f(t) = y and j = (g(t) mod h) + 1. Suppose j = 1. Let t ∈ U be such that g(t) = h and f(t) = y; such a t exists because h ≤ 2^n. Then (g(t) mod h) + 1 = 1 = j, so B_{γ_{k_j}} st, that is, B_{γ_{n_i}} st. Since S_α = B_{γ_{n_1}} ∪ ··· ∪ B_{γ_{n_m}}, S_α st. Suppose j > 1. Let t ∈ U be such that g(t) = j − 1 and f(t) = y. Then (g(t) mod h) + 1 = j, so B_{γ_{k_j}} st, that is, B_{γ_{n_i}} st. Since S_α = B_{γ_{n_1}} ∪ ··· ∪ B_{γ_{n_m}}, S_α st.

Lemma 11 (Existence Lemmas for ⟨α⟩ϕ and ⟨α⟩⊤).
1. For any ⟨α⟩ϕ ∈ Σ, if ⟨α⟩ϕ ⋉ s, there is a t such that ϕ ⋉ t and S_α st;
2. For any ⟨α⟩⊤, if ⟨α⟩⊤ ⋉ s, there is a t such that S_α st.

Lemma 12. E is serial.

Proof. Let H be a history of N. Let Oα_1, …, Oα_n be all the pure deontic formulas in Σ such that Oα_1, …, Oα_n ⋉ H̊. Suppose n < 1. Trivially, we have E(H, (H, t)) for any t ∈ U. Suppose 1 ≤ n. Then Oα_1, …, Oα_n ⊳ f(H̊). Then for any u ∈ f(H̊), Oα_1, …, Oα_n ∈ u.
By Axiom F1, for any u ∈ f(H̊), O(α_1 ∩ ··· ∩ α_n) ∈ u. By Axiom F4, ⟨α_1 ∩ ··· ∩ α_n⟩⊤ ∈ u for any u ∈ f(H̊). Then ⟨α_1 ∩ ··· ∩ α_n⟩⊤ ⊳ f(H̊). Then ⟨α_1 ∩ ··· ∩ α_n⟩⊤ ⋉ H̊. By Lemma 11, there is a t ∈ U such that S_{α_1 ∩ ··· ∩ α_n}(H̊, t). Then S_{α_1}(H̊, t), …, S_{α_n}(H̊, t). By the definition of E, we have E(H, (H, t)).

Hence N is a standard model.

4.6 Restricted completeness

Lemma 13 (Existence Lemma for Pα). If Pα ⋉ H̊, there is a t ∈ U such that E(H, (H, t)) and S_α(H̊, t).

Proof. Suppose Pα ⋉ H̊. Let Oα_1, …, Oα_n be all the pure deontic formulas in Σ such that Oα_1, …, Oα_n ⋉ H̊.
Suppose n < 1. Then trivially, we get E(H, (H, t)) for any t ∈ U. As Pα ⋉ H̊, Pα ⊳ f(H̊). Then for any u ∈ f(H̊), Pα ∈ u. By Axiom F5, for any u ∈ f(H̊), ⟨α⟩⊤ ∈ u. Then ⟨α⟩⊤ ⊳ f(H̊). Then ⟨α⟩⊤ ⋉ H̊. By Lemma 11, there is a t ∈ U such that S_α(H̊, t).
Suppose 1 ≤ n. Then Oα_1, …, Oα_n, Pα ⊳ f(H̊). Then for any u ∈ f(H̊), Oα_1, …, Oα_n, Pα ∈ u. By Axiom F1, for any u ∈ f(H̊), O(α_1 ∩ ··· ∩ α_n) ∈ u. By Axiom F3, P(α_1 ∩ ··· ∩ α_n ∩ α) ∈ u for any u ∈ f(H̊). By Axiom F5, ⟨α_1 ∩ ··· ∩ α_n ∩ α⟩⊤ ∈ u for any u ∈ f(H̊). Then ⟨α_1 ∩ ··· ∩ α_n ∩ α⟩⊤ ⊳ f(H̊). Then ⟨α_1 ∩ ··· ∩ α_n ∩ α⟩⊤ ⋉ H̊. By Lemma 11, there is a t ∈ U such that S_{α_1 ∩ ··· ∩ α_n ∩ α}(H̊, t). Then S_{α_1}(H̊, t), …, S_{α_n}(H̊, t) and S_α(H̊, t). By the definition of E, E(H, (H, t)).

Lemma 14 (Truth Lemma). For any ϕ ∈ Σ, ϕ ⋉ H̊ if and only if N, H ⊨ ϕ.

Proof. We proceed by induction on the structure of ϕ. The cases of p, ⊤, ¬ψ and (ψ ∧ χ) are routine, and we skip them.
The case ϕ = Oα. Suppose Oα ⋉ H̊. Let E(H, (H, t)). By the definition of E, we have S_α(H̊, t). Then N, H ⊨ Oα. Now suppose N, H ⊨ Oα. Then for any t ∈ U, if E(H, (H, t)), then S_α(H̊, t). Assume not Oα ⋉ H̊. Then not Oα ⊳ f(H̊). Since Oα ∈ Σ, Oα ∉ u for any u ∈ f(H̊). Then ¬Oα ∈ u for any u ∈ f(H̊). By Axiom F2, P\overline{α} ∈ u for any u ∈ f(H̊). Then P\overline{α} ⊳ f(H̊). Then P\overline{α} ⋉ H̊.
By Lemma 13, there is a t ∈ U such that E(H, (H, t)) and S_{\overline{α}}(H̊, t). However, since E(H, (H, t)), we have S_α(H̊, t). This is impossible, as S_α ∩ S_{\overline{α}} = ∅.
The case ϕ = ⟨α⟩ψ. Suppose ⟨α⟩ψ ⋉ H̊. By Lemma 11, there is a t ∈ U such that ψ ⋉ t and S_α(H̊, t). Since Σ is closed under sub-formulas, ψ ∈ Σ. By the inductive hypothesis, N, (H, t) ⊨ ψ. Then N, H ⊨ ⟨α⟩ψ. Now suppose N, H ⊨ ⟨α⟩ψ. Then there is a t ∈ U such that S_α(H̊, t) and N, (H, t) ⊨ ψ. By the inductive hypothesis, ψ ⋉ t. Then ψ ⊳ f(t). Then for any z ∈ f(t), ψ ∈ z. Since S_α(H̊, t), by Lemma 10, we get R^f_α(f(H̊), f(t)). Then there are u ∈ f(H̊) and v ∈ f(t) such that R_α uv. Then ψ ∈ v. By the definition of R_α, ⟨α⟩ψ ∈ u. Since ⟨α⟩ψ ∈ Σ, ⟨α⟩ψ ⊳ f(H̊). Then ⟨α⟩ψ ⋉ H̊.

Proposition 5. The logic restricted to Φ_0 is complete with respect to the class of standard models.

4.7 Completeness by translation

Definition 4.3 (Translation). The translation function t : Φ → Φ_0 is defined as follows:
1. t(p) = p
2. t(⊤) = ⊤
3. t(Oα) = Oα
4. t(¬ϕ) = ¬t(ϕ)
5. t(ϕ ∧ ψ) = t(ϕ) ∧ t(ψ)
6. t(⟨α⟩ϕ) = ⟨α⟩t(ϕ)
7. t([↓α]p) = t(Pα → p)
8. t([↓α]⊤) = t(⊤)
9. t([↓α]Oβ) = t(Pα → O(\overline{α} ∪ β))
10. t([↓α]¬ϕ) = t(Pα → ¬[↓α]ϕ)
11. t([↓α](ϕ ∧ ψ)) = t([↓α]ϕ ∧ [↓α]ψ)
12. t([↓α]⟨β⟩ϕ) = t(Pα → ⟨β⟩ϕ)
13. t([↓α][↓β]ϕ) = t([↓α]t([↓β]ϕ))
14. t([↓α][↑β]ϕ) = t([↓α]t([↑β]ϕ))
15. t([↑α]p) = t(p)
16. t([↑α]⊤) = t(⊤)
17. t([↑α]Oβ) = t(Oβ ∧ [α ∩ \overline{β}]⊥)
18. t([↑α]¬ϕ) = t(¬[↑α]ϕ)
19. t([↑α](ϕ ∧ ψ)) = t([↑α]ϕ ∧ [↑α]ψ)
20. t([↑α]⟨β⟩ϕ) = t(⟨β⟩ϕ)
21. t([↑α][↑β]ϕ) = t([↑α]t([↑β]ϕ))
22. t([↑α][↓β]ϕ) = t([↑α]t([↓β]ϕ))

Definition 4.4 (Complexity). The complexity function c : Φ → ℕ is defined as follows:
1. p^c = 1
2. ⊤^c = 1
3. (Oα)^c = 1
4. (¬ϕ)^c = 1 + ϕ^c
5. (ϕ ∧ ψ)^c = 1 + max(ϕ^c, ψ^c)
6. (⟨α⟩ϕ)^c = 1 + ϕ^c
7. ([↓α]ϕ)^c = 5 × ϕ^c
8. ([↑α]ϕ)^c = 5 × ϕ^c

Lemma 15. For any ϕ ∈ Φ, t(ϕ) ∈ Φ_0.

This means that t really does translate all formulas of Φ into Φ_0.

Lemma 16.
1. ([↓α]ϕ)^c = ([↓β]ϕ)^c;
2.
(t([↓α]ϕ))^c = (t([↓β]ϕ))^c.

Lemma 17. The function c meets the following conditions:
1. If ϕ is a proper sub-formula of ψ, then ϕ^c < ψ^c.
2. ([↓α]p)^c > (Pα → p)^c
3. ([↓α]⊤)^c > (⊤)^c
4. ([↓α]Oβ)^c > (Pα → O(\overline{α} ∪ β))^c
5. ([↓α]¬ϕ)^c > (Pα → ¬[↓α]ϕ)^c
6. ([↓α](ϕ ∧ ψ))^c > ([↓α]ϕ ∧ [↓α]ψ)^c
7. ([↓α]⟨β⟩ϕ)^c > (Pα → ⟨β⟩ϕ)^c
8. ([↑α]p)^c > (p)^c
9. ([↑α]⊤)^c > (⊤)^c
10. ([↑α]Oβ)^c > (Oβ ∧ [α ∩ \overline{β}]⊥)^c
11. ([↑α]¬ϕ)^c > (¬[↑α]ϕ)^c
12. ([↑α](ϕ ∧ ψ))^c > ([↑α]ϕ ∧ [↑α]ψ)^c
13. ([↑α]⟨β⟩ϕ)^c > (⟨β⟩ϕ)^c
14. ([↓α]ϕ)^c > (t([↓α]ϕ))^c
15. ([↑α]ϕ)^c > (t([↑α]ϕ))^c

Proof. We only show item (14). The proof of item (15) is similar, and the other items can be easily proved. We proceed by induction on ϕ^c. Suppose ϕ^c = 1. Then ϕ = p, ϕ = ⊤ or ϕ = Oβ. By items (2), (3) and (4), we can easily see that ([↓α]ϕ)^c > (t([↓α]ϕ))^c holds in all these cases. Suppose for any ψ, if ψ^c < ϕ^c, then ([↓α]ψ)^c > (t([↓α]ψ))^c. Here we only show that ([↓α]ϕ)^c > (t([↓α]ϕ))^c if ϕ = [↓β]ψ for some ψ. Since ψ^c < ([↓β]ψ)^c, we get ([↓α]ψ)^c > (t([↓α]ψ))^c by the inductive hypothesis. By Lemma 16, ([↓β]ψ)^c > (t([↓β]ψ))^c. Since ([↓α][↓β]ψ)^c = 5 × ([↓β]ψ)^c and ([↓α]t([↓β]ψ))^c = 5 × (t([↓β]ψ))^c, we know that ([↓α][↓β]ψ)^c > ([↓α]t([↓β]ψ))^c. Since (t([↓β]ψ))^c < ([↓β]ψ)^c, we get ([↓α]t([↓β]ψ))^c > (t([↓α]t([↓β]ψ)))^c by the inductive hypothesis again. Since t([↓α][↓β]ψ) = t([↓α]t([↓β]ψ)), we have ([↓α][↓β]ψ)^c > (t([↓α][↓β]ψ))^c.

Lemma 18. For any ϕ ∈ Φ, ⊢ ϕ ↔ t(ϕ).

Proof. We proceed by induction on ϕ^c. Suppose ϕ^c = 1. Then ϕ = p, ϕ = ⊤ or ϕ = Oα. In these cases, t(ϕ) = ϕ, and then ⊢ ϕ ↔ t(ϕ). Suppose for any ψ, if ψ^c < ϕ^c, then ⊢ ψ ↔ t(ψ). We want to show that ⊢ ϕ ↔ t(ϕ). It suffices to show that ⊢ ϕ ↔ t(ϕ) in all of the cases where ϕ is one of the following formulas: ¬ψ, ψ ∧ χ, ⟨α⟩ψ, [↓α]p, [↓α]⊤, [↓α]Oβ, [↓α]¬ψ, [↓α](ψ ∧ χ), [↓α]⟨β⟩ψ, [↓α][↓β]ψ, [↓α][↑β]ψ, [↑α]p, [↑α]⊤, [↑α]Oβ, [↑α]¬ψ, [↑α](ψ ∧ χ), [↑α]⟨β⟩ψ, [↑α][↓β]ψ, and [↑α][↑β]ψ.
This can be proved using Lemmas 16 and 17. Here we only go through the case ϕ = [↓α][↓β]ψ. Since [↓β]ψ is a proper sub-formula of [↓α][↓β]ψ, by Lemma 17, we get ([↓β]ψ)^c < ([↓α][↓β]ψ)^c. By the inductive hypothesis, ⊢ [↓β]ψ ↔ t([↓β]ψ). By the inference rule Replacement of Validity, ⊢ [↓α][↓β]ψ ↔ [↓α]t([↓β]ψ). By Lemma 17, (t([↓β]ψ))^c < ([↓β]ψ)^c. Then ([↓α]t([↓β]ψ))^c < ([↓α][↓β]ψ)^c. By the inductive hypothesis again, ⊢ [↓α]t([↓β]ψ) ↔ t([↓α]t([↓β]ψ)). Since t([↓α]t([↓β]ψ)) = t([↓α][↓β]ψ), we have ⊢ [↓α][↓β]ψ ↔ t([↓α][↓β]ψ).

Proposition 6 (Completeness). The logic is complete with respect to the class of standard models.

Proof. Suppose ⊨ ϕ. By Lemma 18, ⊢ ϕ ↔ t(ϕ). Since the logic is sound, ⊨ ϕ ↔ t(ϕ). Then ⊨ t(ϕ). By Lemma 15, t(ϕ) ∈ Φ_0. By Proposition 5, ⊢ t(ϕ). Since ⊢ ϕ ↔ t(ϕ), we get ⊢ ϕ.

5 Future work

Our semantics does not work for descriptive permissions. A descriptive utterance of a sentence such as "you may do α or β" might only inform the agent that he is allowed to do something, without specifying what. After such an utterance, the agent might still not know how to act. Utterances of this sort raise uncertainties, but our semantics has no means of handling them.

Two questions are next on our agenda. The language in this work contains three action operators: complement, intersection and choice. It is natural to add test and composition to obtain a more powerful language in which conditional and sequential obligations and permissions can be expressed. This is an issue we want to pursue in the future. According to our semantics, the last update always overrides the previous ones. For example, given that α is possible to perform, the prescriptive permission "you may do α" always gives the agent the freedom to do α, regardless of what obligations have been put on him. This is the case only if there is only one speaker or moral source.
In real life, there are many moral sources whose authorities are ranked, and only orders and permissions from speakers with higher authority can override those from speakers with lower authority. Introducing prioritized speakers into this framework is another direction of future work for us.

Notes This paper is an extension of (Ju and Liang 2013), where we only axiomatized the logic restricted to Φ_0, which contains no dynamic operators, and did not present an axiomatization for the whole logic.

Acknowledgements This research is supported by the National Social Science Foundation of China (No. 12CZX053) and the Major Bidding Project of the National Social Science Foundation of China (No. 10&ZD073). We would like to thank Maria Aloni, Davide Grossi, Fenrong Liu, Hans van Ditmarsch, Frank Veltman, Yanjing Wang, Yi Wang, Tomoyuki Yamada, and the three anonymous reviewers of the 4th International Workshop of Logic, Rationality and Interaction, for their helpful comments and suggestions.

References

H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Springer, Heidelberg, 2007.
G. Gargov and S. Passy. A note on boolean modal logic. In P. P. Petkov, editor, Mathematical Logic, pages 311–321. Plenum Press, 1990.
R. Hilpinen. Deontic logic. In L. Goble, editor, The Blackwell Guide to Philosophical Logic, pages 159–182. Blackwell Publishing, 2001.
J. Jorgensen. Imperatives and logic. Erkenntnis, 7(1):288–296, 1937.
F. Ju and L. Liang. A dynamic deontic logic based on histories. In D. Grossi, O. Roy, and H. Huang, editors, Logic, Rationality, and Interaction, volume 8196 of Lecture Notes in Computer Science, pages 176–189. Springer Berlin Heidelberg, 2013.
F. Ju and F. Liu. Prioritized imperatives and normative conflicts. European Journal of Analytic Philosophy, 7(2):33–54, 2011.
J.-J. C. Meyer. A different approach to deontic logic: Deontic logic viewed as a variant of dynamic logic.
Notre Dame Journal of Formal Logic, 29(1):109–136, 1988.
A. Ross. Imperatives and logic. Philosophy of Science, 11(1):30–46, 1944.
F. Veltman. Imperatives at the borderline of semantics and pragmatics. Manuscript, 2009.

Public Announcements under Sheaves
Kohei Kishida
University of Oxford
The goal of this article is to bring together the frameworks of model-update semantics for (propositional) public-announcement logic (van Ditmarsch et al. 2008) and of sheaf semantics for first-order modal logic (Awodey and Kishida 2008, Gabbay et al. 2009, Kishida 2011), and thereby to obtain a sheaf semantics for first-order public-announcement logic. The first attempt to extend dynamic epistemic logics to the first order was made by Kooi (2007), who introduced terms to refer to epistemic agents, and an extension of public-announcement logic to the first order was briefly given by Ma (2011);¹ both of these extensions used constant domains for interpreting first-order vocabulary. (A first-order extension of dynamic logic was given in (Harel 1979) and (Harel et al. 2000), also with constant domains.) This article pushes ahead with these extensions by employing a sheaf structure, making progress toward a more flexible and useful treatment of first-order notions.

We will first review model-update semantics for propositional public-announcement logic in Section 1, and sheaf semantics for first-order modal logic in Section 2. Then, in Section 3, we will show how to combine the ideas of these two semantics into sheaf semantics for first-order public-announcement logic. We will conclude the article with a brief discussion of future work and projects in Section 4.

¹ Theorem 14 of Ma (2011), the chief theorem in the section on first-order public-announcement logic, is unfortunately incorrect, with an invalid reduction axiom for quantifiers. Hence, even in the constant-domain setting, a correct axiomatization of first-order public-announcement logic has not been given.

1 Reviewing the propositional case

We first review the basics of PAL, public-announcement logic, in the propositional case (see van Ditmarsch et al. 2008 or van Benthem 2011 for a detailed exposition of motivations, definitions and results).
One characteristic of the approach of this article is to use (monotone) neighborhood semantics, rather than Kripke semantics, as a static basis of PAL, although all the results that follow apply equally to Kripke semantics (which is just a particular subclass of monotone neighborhood semantics). This feature is not standard, but it is not new to this article either: neighborhood semantics for propositional PAL was first introduced by Demey (2010) and Zvesper (2010).²

A language L of PAL is a propositional language with two sorts of unary modal operators: K_i, for each i ∈ I for a fixed set I of "agents"; and [!σ], for each sentence σ of L. Given a sentence ϕ, the sentence K_i ϕ is supposed to mean, approximately, that the agent i "knows" that ϕ, and [!σ]ϕ is supposed to mean that ϕ will be the case after it is "publicly (and truthfully) announced" or "publicly observed" that σ.³ Let us say L is a static language if it contains K_i but not [!σ], and a PAL language if it contains both. We use (monotone) neighborhood semantics to interpret the operators K_i.

Definition 1.1. A monotone neighborhood frame, or MN-frame for short, is a tuple (W, {ℬ_i}_{i∈I}) of any set W ≠ ∅ of "worlds" and any map ℬ_i : W → 𝒫𝒫W for every i ∈ I.⁴ The members B ⊆ W of ℬ_i(w) are called the basic i-neighborhoods of w, and ℬ_i is called an i-basic-neighborhood map. Moreover, each ℬ_i induces an operation int_i : 𝒫W → 𝒫W, called an i-interior operation, with

w ∈ int_i(A) ⇐⇒ B ⊆ A for some B ∈ ℬ_i(w). (int)

² Both Demey (2010) and Zvesper (2010) discuss the fully general neighborhood semantics, whereas in this article we only use the monotone case. Both Demey's and Zvesper's goal behind their use of neighborhood semantics, as opposed to Kripke semantics, is to falsify principles such as K_i(ϕ → ψ) → (K_i ϕ → K_i ψ) and to avoid the so-called "problem of logical omniscience"; our goal is different, as we lay out at the end of this section.

³ See Chs.
2 and 3 of (van Benthem 2011) for discussions of precisely what K_i ϕ and [!σ]ϕ mean given the semantics of PAL.

⁴ Our formulation of monotone neighborhood semantics may appear different from, but is in fact equivalent to, the more common formulation (e.g., in Chellas 1980). Commonly, an MN-frame (W, N) is assumed to have N : W → 𝒫𝒫W "upward closed" (i.e., to have A ∈ N(w) whenever B ∈ N(w) and B ⊆ A ⊆ W), while N induces int with

w ∈ int(A) ⇐⇒ A ∈ N(w). (int′)

Clearly, MN-frames in this sense are MN-frames in our sense ((int′) is equivalent to (int) when ℬ = N is upward closed).

It is obvious that a Kripke frame (or model) is just an MN-frame (or model) with each ℬ_i(w) a singleton, so that B ∈ ℬ_i(w) is just the set of worlds that are i-accessible from w.

We interpret each sentence ϕ with a subset ⟦ϕ⟧ of W as the so-called "truth set" of ϕ, so that w ∈ ⟦ϕ⟧ means that ϕ is true at the world w; in this sense subsets of W can be called "propositions". So, each w ∈ W has a family (perhaps empty) of propositions as its basic i-neighborhoods, and w is in the i-interior of a proposition A ⊆ W iff some basic i-neighborhood of w "entails" A. Then we interpret the classical connectives classically, and the modal operators K_i with the i-interior operations.

Definition 1.2. Given a propositional static or PAL language L, a monotone neighborhood model, or MN-model, for L consists of an MN-frame (W, {ℬ_i}_{i∈I}) and any map ⟦·⟧ that assigns a set ⟦p⟧ ⊆ W to each atomic sentence p of L. A monotone neighborhood interpretation, or MN-interpretation, over an MN-model (W, {ℬ_i}_{i∈I}, ⟦·⟧) for L is a map that extends ⟦·⟧ to all the sentences ϕ of L following the constraints

⟦¬⟧ = W \ ·, so that ⟦¬ϕ⟧ = ⟦¬⟧⟦ϕ⟧ = W \ ⟦ϕ⟧, (¬Prop)
⟦∧⟧ = ∩, so that ⟦ϕ ∧ ψ⟧ = ⟦ϕ⟧ ⟦∧⟧ ⟦ψ⟧ = ⟦ϕ⟧ ∩ ⟦ψ⟧, (∧Prop)
⟦→⟧ = (W \ ·) ∪ ·, so that ⟦ϕ → ψ⟧ = ⟦ϕ⟧ ⟦→⟧ ⟦ψ⟧ = (W \ ⟦ϕ⟧) ∪ ⟦ψ⟧, (→Prop)
⟦K_i⟧ = int_i, so that w ∈ ⟦K_i ϕ⟧ = ⟦K_i⟧⟦ϕ⟧ ⇐⇒ B ⊆ ⟦ϕ⟧ for some B ∈ ℬ_i(w). (□Prop)
The semantics of PAL involves the updating of models by restricting them to submodels.

Definition 1.3. Given an MN-model M = (W, {ℬ_i}_{i∈I}, ⟦·⟧) for L and nonempty S ⊆ W, let M^S = (S, {ℬ_i^S}_{i∈I}, ⟦·⟧^S) be the MN-model for L with
• ℬ_i^S(w) = { B ∩ S | B ∈ ℬ_i(w) },⁵ and
• ⟦p⟧^S = ⟦p⟧ ∩ S for every atomic sentence p of L.

⁴ (continued) On the other hand, each MN-frame (W, ℬ) in our sense "generates" an MN-frame (W, N_ℬ) as commonly defined by setting A ∈ N_ℬ(w) ⇐⇒ B ⊆ A for some B ∈ ℬ(w), and then N_ℬ(w) induces with (int′) the same int : 𝒫W → 𝒫W as ℬ does with (int), which means that (W, ℬ) and (W, N_ℬ) have the same ⟦K⟧. In short, the two formulations give the same semantics of K. An advantage of ours is that, in modelling, it enables us to distinguish basic neighborhoods B ∈ ℬ(w) from other neighborhoods A ∈ N_ℬ(w), a distinction that corresponds to the one between direct observability and verifiability, in terms of the observability interpretation mentioned at the end of this section.

This operation of restriction update provides a semantics for [!σ]:

Definition 1.4. Given a propositional PAL language L, an MN-interpretation over an MN-model M = (W, {ℬ_i}_{i∈I}, ⟦·⟧) for L is called a monotone neighborhood PAL-interpretation, or MN-PAL-interpretation, over M if it satisfies

⟦[!σ]ϕ⟧ = ⟦σ⟧ ⟦→⟧ ⟦ϕ⟧^{⟦σ⟧}, so that w ∈ ⟦[!σ]ϕ⟧ ⇐⇒ if w ∈ ⟦σ⟧ then w ∈ ⟦ϕ⟧^{⟦σ⟧}, ([!·]Prop)

where ⟦·⟧^{⟦σ⟧} is an MN-PAL-interpretation over (⟦σ⟧, {ℬ_i^{⟦σ⟧}}_{i∈I}, ⟦·⟧^{⟦σ⟧}) if ⟦σ⟧ ≠ ∅; we write ⟦ϕ⟧^{⟦σ⟧} = ∅ if ⟦σ⟧ = ∅.

Under the condition ([!·]Prop), [!σ]ϕ intuitively means that, if σ can be truthfully announced (in the sense of simply being true), then the public announcement that σ will make it the case that ϕ. It is worth noting that, when we write ⟨!σ⟩ϕ for ¬[!σ]¬ϕ, ([!·]Prop) implies

⟦⟨!σ⟩ϕ⟧ = ⟦ϕ⟧^{⟦σ⟧}, (⟨!·⟩Prop)

because ⟦⟨!σ⟩ϕ⟧ = ⟦¬[!σ]¬ϕ⟧ = W \ (⟦σ⟧ ⟦→⟧ ⟦¬ϕ⟧^{⟦σ⟧}) = ⟦σ⟧ \ ⟦¬ϕ⟧^{⟦σ⟧} = ⟦ϕ⟧^{⟦σ⟧}.

The class of MN-PAL-interpretations forms monotone neighborhood semantics for PAL.
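As a hedged illustration of the restriction update and of the truth condition for [!σ], the following minimal sketch computes truth sets over a finite MN-model. The tuple encoding of formulas and the names `MNModel`, `restrict` and `truth` are assumptions made for this example only; they are not part of the article.

```python
from typing import NamedTuple


class MNModel(NamedTuple):
    worlds: frozenset   # the set W of worlds
    nbhd: dict          # agent i -> {world w -> set of basic i-neighborhoods (frozensets)}
    val: dict           # atomic sentence p -> truth set of p (a frozenset)


def restrict(m, S):
    """Restriction update of m to a nonempty subset S of its worlds."""
    S = frozenset(S)
    nbhd = {i: {w: {B & S for B in Bw} for w, Bw in Bi.items() if w in S}
            for i, Bi in m.nbhd.items()}
    val = {p: V & S for p, V in m.val.items()}
    return MNModel(S, nbhd, val)


def truth(m, phi):
    """Truth set of phi in m; formulas are nested tuples (hypothetical encoding)."""
    op = phi[0]
    if op == "atom":
        return m.val[phi[1]]
    if op == "not":
        return m.worlds - truth(m, phi[1])
    if op == "and":
        return truth(m, phi[1]) & truth(m, phi[2])
    if op == "imp":
        return (m.worlds - truth(m, phi[1])) | truth(m, phi[2])
    if op == "K":      # ("K", i, phi): the i-interior of the truth set of phi
        A = truth(m, phi[2])
        return frozenset(w for w in m.worlds
                         if any(B <= A for B in m.nbhd[phi[1]][w]))
    if op == "ann":    # ("ann", sigma, phi) encodes [!sigma]phi
        S = truth(m, phi[1])
        after = truth(restrict(m, S), phi[2]) if S else frozenset()
        return (m.worlds - S) | after
    raise ValueError(f"unknown operator: {op}")


# A three-world model in which agent "i" has W as its only basic neighborhood.
W = frozenset({1, 2, 3})
m = MNModel(W,
            {"i": {w: {W} for w in W}},
            {"p": frozenset({1, 2}), "q": frozenset({1})})
p, q = ("atom", "p"), ("atom", "q")
Kp = ("K", "i", p)
lhs = ("ann", p, ("K", "i", q))               # [!p] K_i q
rhs = ("imp", p, ("K", "i", ("ann", p, q)))   # p -> K_i [!p] q
```

In this toy model K_i p holds nowhere, while [!p]K_i p holds everywhere, and `lhs` and `rhs` receive the same truth set, which checks one instance of the equivalence between [!σ]K_i ϕ and σ → K_i [!σ]ϕ.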
The logic of this semantics, M-PAL, has all the axiom or rule schemes of classical logic, as well as the rule scheme

(M) from ϕ → ψ, infer K_i ϕ → K_i ψ.

⁵ Zvesper (2010) uses the same definition of restriction to a submodel as ours for PAL. Demey (2010), on the other hand, uses a seemingly different definition; it agrees with ours in the monotone case, but disagrees in the non-monotone case. (Related but crucially different notions of submodel in the neighborhood setting are discussed in (Hansen 2003).) Our definition of submodel is a straightforward generalization of a subspace in topology: the family {ℬ_i^S}_{i∈I} makes the inclusion map m : S → W :: w ↦ w "I-continuous", in the sense that m^{-1}[int_i^W(A)] ⊆ int_i^S(m^{-1}[A]) for every A ⊆ W and for every i ∈ I. Moreover, {ℬ_i^S}_{i∈I} is the "coarsest" family from which m is I-continuous; or, equivalently, any map f : X → W with f[X] ⊆ S that is I-continuous to (W, {ℬ_i}_{i∈I}) is I-continuous to (S, {ℬ_i^S}_{i∈I}) (when regarded as a map f : X → S). In category-theoretic terms, m is a "regular monomorphism" in the category of MN-frames and I-continuous maps.

To add more axioms and rules, correspondence results are available. For instance, let us say that an MN-frame (W, {ℬ_i}_{i∈I}) is an MCN-frame if B_0, B_1 ∈ ℬ_i(w) implies B_2 ⊆ B_0 ∩ B_1 for some B_2 ∈ ℬ_i(w); then the logic of the MN-PAL-interpretations over MCN-frames, MC-PAL, is given by adding to M-PAL the axiom scheme

(C) (K_i ϕ ∧ K_i ψ) → K_i(ϕ ∧ ψ).

Moreover, monotone neighborhood semantics for PAL validates the same set of axioms of reduction as Kripke semantics for PAL does:

(R_atom) [!σ]p ≡ (σ → p) for atomic p;
(R_¬) [!σ]¬ϕ ≡ (σ → ¬[!σ]ϕ);
(R_∧) [!σ](ϕ ∧ ψ) ≡ ([!σ]ϕ ∧ [!σ]ψ);
(R_□) [!σ]K_i ϕ ≡ (σ → K_i [!σ]ϕ);
(R_it) [!σ][!τ]ϕ ≡ [!(σ ∧ [!σ]τ)]ϕ.

It is worth showing how monotone neighborhood semantics for PAL, with a different truth condition for K_i than Kripke's, validates R_□.
First observe that ⟦K_i ϕ⟧^{⟦σ⟧} = ⟦σ⟧ ∩ ⟦K_i [!σ]ϕ⟧, since

w ∈ ⟦K_i ϕ⟧^{⟦σ⟧} = int_i^{⟦σ⟧}(⟦ϕ⟧^{⟦σ⟧})
⇐⇒ w ∈ ⟦σ⟧ and B ⊆ ⟦ϕ⟧^{⟦σ⟧} for some B ∈ ℬ_i^{⟦σ⟧}(w)
⇐⇒ w ∈ ⟦σ⟧ and B ∩ ⟦σ⟧ ⊆ ⟦ϕ⟧^{⟦σ⟧} for some B ∈ ℬ_i(w)
⇐⇒ w ∈ ⟦σ⟧ and B ⊆ ⟦σ⟧ ⟦→⟧ ⟦ϕ⟧^{⟦σ⟧} = ⟦[!σ]ϕ⟧ for some B ∈ ℬ_i(w)
⇐⇒ w ∈ ⟦σ⟧ and w ∈ int_i(⟦[!σ]ϕ⟧) = ⟦K_i [!σ]ϕ⟧

for each w ∈ W. Therefore ⟦[!σ]K_i ϕ⟧ = ⟦σ⟧ ⟦→⟧ ⟦K_i ϕ⟧^{⟦σ⟧} = ⟦σ⟧ ⟦→⟧ ⟦K_i [!σ]ϕ⟧ = ⟦σ → K_i [!σ]ϕ⟧.

R_atom–R_it reduce any sentence with [!σ] to one without, by pushing [!σ] of [!σ]ϕ inward when ϕ is compound and eventually eliminating [!σ] when ϕ is atomic. Hence R_atom–R_it completely axiomatize PAL by reducing it to its "announcement-free", static fragment, which is just the modal logic given by M (along with any other axioms or rules on K_i that one chooses to add).

Theorem 1. Given a class M of MN-models, let T be the logic of M in a propositional static language L. Then the logic of M in the PAL language extending L is obtained by adding R_atom–R_it to T.

It needs explaining why we should use neighborhood semantics, instead of the standard Kripke approach, to interpret PAL. One merit is that neighborhood semantics facilitates what may be called a verifiability interpretation of K_i operators. Think of ℬ_i(w) as the family of propositions that the agent i can directly observe at the world w; then (□Prop) means that K_i ϕ is true at w iff some proposition that i can observe entails ϕ, that is, iff i can verify that ϕ.⁶ For instance, let us represent infinite sequences of coin tosses with maps w : ℕ → 2 in W = 2^ℕ, so that w(n) = 1 means that the (n + 1)st toss comes up heads in the world w ∈ W; then what we can directly observe is the outcome of each toss or finite combinations thereof, so that B ∈ ℬ_i(w) iff B = { v ∈ 2^ℕ | v(n) = w(n) for all n ∈ J } for some finite J ⊆ ℕ.⁷ Then, for instance, even though it is not directly observable that some toss comes up heads, it is verifiable (if true).
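The coin-toss example can be checked on a finite truncation. The sketch below assumes sequences of only N = 3 tosses (an assumption made so that the model is finite; the article's W = 2^ℕ is infinite), with basic neighborhoods given by fixing the outcomes on a set J of tosses. The names `basic_neighborhoods`, `interior` and `some_heads` are hypothetical.

```python
from itertools import combinations, product

N = 3  # assumption: truncate the infinite sequences to N tosses
W = set(product((0, 1), repeat=N))


def basic_neighborhoods(w):
    """All sets {v | v(n) = w(n) for all n in J}, for J a subset of the N tosses."""
    out = []
    for r in range(N + 1):
        for J in combinations(range(N), r):
            out.append(frozenset(v for v in W if all(v[n] == w[n] for n in J)))
    return out


def interior(A):
    """Worlds having some basic neighborhood included in A."""
    return {w for w in W if any(B <= A for B in basic_neighborhoods(w))}


# The proposition "some toss comes up heads".
some_heads = frozenset(w for w in W if 1 in w)

# It is never a basic neighborhood (not directly observable) ...
directly_observable = any(some_heads in basic_neighborhoods(w) for w in W)
# ... yet it coincides with its own interior (verifiable wherever it is true).
verifiable_where_true = interior(some_heads) == set(some_heads)
```

Because the truncated space is finite, every proposition is verifiable here; the sketch therefore only illustrates the gap between direct observability and verifiability, not the failure of verifiability that can occur in the infinite case.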
We should note that such an interpretation is precluded in Kripke semantics, where each ℬ_i(w) is a singleton or has a smallest element, which means that there is a single observation that verifies everything verifiable.

⁶ Topologic (Dabrowski et al. 1996) and evidence logic (van Benthem and Pacuit 2011) are based on similar interpretations of K_i and of topology or neighborhoods.
⁷ The operation int_i : 𝒫(2^ℕ) → 𝒫(2^ℕ) that this ℬ_i induces with (int) is exactly the interior operation of the Cantor space.

The combination of the verifiability interpretation with PAL is suitable for the purpose of expressing epistemic inquiries. In scientific inquiry, or inquiry of any kind, an increase in knowledge (which PAL expresses by the restriction update of models) can turn the unobservable observable; for instance, scientists used to observe only a segment of light in a certain glass case, where they now observe a trail of a charged particle in a cloud chamber. The restriction update of PAL captures such a phenomenon precisely, by enabling sentences ϕ with ⟦ϕ⟧ ∉ ℬ_i(w) to nonetheless have ⟦ϕ⟧^S ∈ ℬ_i^S(w).

2 Reviewing "Neighborhood-Sheaf" semantics for First-Order Modal Logic

Next we review sheaf semantics for first-order modal logic, and in particular, "neighborhood-sheaf" semantics (Kishida 2011) (which subsumes Kripke-sheaf semantics (Gabbay et al. 2009) and topological-sheaf semantics (Awodey and Kishida 2008), as well as neighborhood semantics with constant domains (Arló-Costa and Pacuit 2006)); we will extend it to public announcements in the next section. A major advantage of using sheaf semantics as opposed to constant-domain semantics (either Kripke semantics or neighborhood semantics (Arló-Costa and Pacuit 2006)) is that the former facilitates more flexible treatment of identity of first-order individuals; this is particularly important in applying first-order public announcement logic to epistemic
contexts where the identity of objects is itself a subject of epistemic inquiry. We refer the reader to (Awodey and Kishida 2008) and (Kishida 2011) for more explanation of how the semantics works both conceptually and technically.

Given MN-frames (X, ℬ_X) and (Y, ℬ_Y), we say that a map f : X → Y is open if f^{-1}[int_Y(A)] = int_X(f^{-1}[A]) for every A ⊆ Y. Then, by a local isomorphism, or neighborhood sheaf, we mean an open map π : D → W between MCN-frames (and not just MN-frames) such that, for every a ∈ D with ℬ_D(a) ≠ ∅, there is B ∈ ℬ_D(a) with π|_B : B → W injective (or, equivalently, π|_B : B → π[B] bijective).⁸ It is worth noting that the Kripke sheaves (Gabbay et al. 2009) are just the neighborhood sheaves with ℬ_W(w) a singleton for each w ∈ W, and that the topological sheaves (Awodey and Kishida 2008) are just the neighborhood sheaves over a topological space (W, ℬ_W).

Fixing any MCN-frame W, write LI/W for the category of MCN-frames and local isomorphisms sliced over W. This category provides structures for interpreting first-order logic, when we regard W as a set of worlds. Indeed, whereas in interpreting propositional modal logic we regard worlds as having no internal structure, in interpreting first-order modal logic we regard worlds as models of first-order logic. So, given any (surjective) local isomorphism π : D → W, we regard the inverse image D_w = π^{-1}[{w}] of each w ∈ W as a domain of individuals of the model w; then D = Σ_{w∈W} D_w, the disjoint union of all D_w, is the domain of individuals that are from some model or other, and, for any individual a ∈ D, π(a) is the (unique) model it is from.⁹

The category LI/W provides with each w ∈ W more structures for first-order logic. For instance, the n-fold product of π in LI/W comes with the set

D^n = Σ_{w∈W} (D_w × ··· × D_w) = { (a_1, …, a_n) | π(a_1) = ··· = π(a_n) }

of n-tuples from the same model, and with the map π^n : D^n → W :: (a_1, …
, a_n) ↦ π(a_1) = ··· = π(a_n); in particular, D^0 = W and π^0 is the identity on W. (We will later discuss what basic-neighborhood map the n-fold product of π : D → W has on D^n.)

8 See footnote 11 for why we require MCN-frames and not just MN-frames.

9 Thus π : D → W gives essentially the same ontological picture as David Lewis’s counterpart theory (Lewis 1968), though our interpretation of □ differs significantly from Lewis’s.

Accordingly, we interpret the classical part of first-order modal logic within each world or model w ∈ W. Let L be a first-order (modal) language (that may have the relation symbol =, any function symbols, etc.). Then, extending the interpretation ⟦σ⟧ ⊆ W = D^0 of sentences σ, we interpret a formula ϕ of L with no free variables other than x̄ = (x_1, …, x_n) with the set ⟦x̄ | ϕ⟧ ⊆ D^n of n-tuples ā = (a_1, …, a_n) ∈ D^n of which ϕ is true (with a_i in place of x_i) in π^n(ā). So, for instance, ā ∈ ⟦x̄ | ¬ϕ⟧ iff ā ∉ ⟦x̄ | ϕ⟧, because ¬ϕ is true of ā in π^n(ā) iff ϕ is not true of ā in π^n(ā). Thus ⟦·⟧ must satisfy, for each n,

  (¬FO)  ⟦¬⟧ = D^n \ ·,  so that ⟦x̄ | ¬ϕ⟧ = D^n \ ⟦x̄ | ϕ⟧,
  (∧FO)  ⟦∧⟧ = ∩,  so that ⟦x̄ | ϕ ∧ ψ⟧ = ⟦x̄ | ϕ⟧ ∩ ⟦x̄ | ψ⟧,
  (→FO)  ⟦→⟧ = (D^n \ ·) ∪ ·,  so that ⟦x̄ | ϕ → ψ⟧ = (D^n \ ⟦x̄ | ϕ⟧) ∪ ⟦x̄ | ψ⟧,

straightforwardly extending (¬Prop)–(→Prop), and moreover

  (∃)  ā ∈ ⟦x̄ | ∃y ϕ⟧ ⟺ (ā, b) ∈ ⟦x̄, y | ϕ⟧ for some b ∈ D,
  (vac)  (ā, b) ∈ ⟦x̄, y | ϕ⟧ ⟺ ā ∈ ⟦x̄ | ϕ⟧ when y is not free in ϕ.

From the parallelism between (¬Prop)–(→Prop) and (¬FO)–(→FO), it is useful to extract the following “many-sorted” view of this semantics.
That is, for the connectives ¬ through →, which give sentences when taking sentences as arguments and give n-ary formulas when taking n-ary formulas as arguments, W provides a possible-world model of worlds w ∈ W and propositions ⟦σ⟧ ⊆ W; D provides a possible-world model of individuals a ∈ D and properties ⟦x | ϕ⟧ ⊆ D; and each D^n provides a possible-world model of n-tuples ā ∈ D^n and n-ary relations ⟦x̄ | ϕ⟧.

To interpret □, we take advantage of the “many-sorted” view, and straightforwardly extend (□Prop); that is, we interpret □ working on a sentence with int_W on W, □ working on a unary formula with int_D on D, and □ working on an n-ary formula with int_{D^n} on D^n.¹⁰

10 We can contrast this with the non-sheaf frameworks of Kripke’s semantics (Kripke 1963) and of neighborhood semantics with constant domains by Arló-Costa and Pacuit (2006), in both of which □ is always interpreted with structures on W. In this regard, our interpretation is conceptually closer to that in David Lewis’s counterpart theory (Lewis 1968), which interprets □ working on a unary formula with a counterpart relation among individuals. Notwithstanding this and other conceptual differences, however, models by Kripke and by Arló-Costa and Pacuit can be subsumed as constant sheaves, in which D consists of copies of W having ℬ_D strictly parallel to ℬ_W.

Each D^n is an MCN-frame equipped with a basic-neighborhood map ℬ^n : D^n → 𝒫𝒫(D^n) such that, for each ā ∈ D^n and B ⊆ D^n,

  B ∈ ℬ^n(ā) ⟺ B = (B_1 × ··· × B_n) ∩ D^n, where B_i ∈ ℬ(a_i) for some i and, for each i, either B_i ∈ ℬ(a_i) or B_i = D;

in other words, when we write p_i : D^n → D :: ā ↦ a_i for each projection, ℬ^n(ā) consists of the inverse images p_i^{-1}[B] of B ∈ ℬ(a_i) (for all i) and their (nonempty) finite intersections.¹¹ So (□Prop) extends to the constraint that, for each n,

  (□FO)  ⟦□⟧ = int_{D^n},  so that ā ∈ ⟦x̄ | □ϕ⟧ ⟺ B ⊆ ⟦x̄ | ϕ⟧ for some B ∈ ℬ^n(ā).

To sum up, we enter

Definition 2.1.
A neighborhood-sheaf model for a first-order language L is a pair (π, ⟦·⟧) of a surjective local isomorphism π : D → W and a map ⟦·⟧ that interprets¹²
• each n-ary predicate R of L with a subset ⟦R⟧ of D^n, and
• each n-ary function symbol f of L with an arrow ⟦f⟧ : D^n → D in LI/W (the case of individual constants is covered with n = 0).¹³

11 π^n with (D^n, ℬ^n(ā)) is the n-fold product of π in the category LI/W of MCN-frames and local isomorphisms over W. We can loosen the definition of local isomorphism π : D → W so that W and D can be MN-frames that are not MCN-frames, and consider the category of MN-frames and local isomorphisms over W. Yet the n-fold product of π in this category is “coarser” than in LI/W, given as ℬ^n(ā) = ⋃_{1≤i≤n} { p_i^{-1}[B] | B ∈ ℬ(a_i) } for the maps p_i : D^n → D :: ā ↦ a_i. This ℬ^n is too coarse for the purpose of modelling □, because together with (□FO) it entails

  ā ∈ ⟦x̄ | □ϕ⟧ ⟺ ∃i ∃B ∈ ℬ_D(a_i). p_i^{-1}[B] ⊆ ⟦x̄ | ϕ⟧
    ⟺ ∃i ∃B ∈ ℬ_D(a_i). B ⊆ ⟦x_i | ∀x_1 ··· ∀x_{i−1} ∀x_{i+1} ··· ∀x_n ϕ⟧
    ⟺ ∃i. a_i ∈ ⟦x_i | □∀x_1 ··· ∀x_{i−1} ∀x_{i+1} ··· ∀x_n ϕ⟧.

To avoid this, we need a basic-neighborhood map D^n → 𝒫𝒫(D^n) that is “finer” than this ℬ^n. This means that, to provide a model, we need an infinite sequence ((W, ℬ_W), (D, ℬ_D), (D^2, ℬ^2), …) of data, rather than just a pair ((W, ℬ_W), (D, ℬ_D)). In this article we choose MCN-frames as opposed to MN-frames, so that a pair ((W, ℬ_W), (D, ℬ_D)) suffices.

12 Given (□FO) and (vac), the requirement that π : D → W is an open map amounts to the condition that the square with horizontal edges int_W : ⟦σ⟧ ↦ ⟦□σ⟧ and int_D : ⟦y | σ⟧ ↦ ⟦y | □σ⟧, and with both vertical edges π^{-1}, commutes. In other words, the requirement means that ⟦y | □σ⟧ is well-defined from ⟦□σ⟧.

13 That ⟦f⟧ : D^n → D is an arrow in LI/W means that π ∘ f = π^n and that f is a local isomorphism; but for these to hold it is enough that π ∘ f = π^n and that f is continuous, meaning that f^{-1}[int_D(A)] ⊆ int_{D^n}(f^{-1}[A]).
Moreover, a neighborhood-sheaf interpretation over a neighborhood-sheaf model (π, ⟦·⟧) for L is a map that extends ⟦·⟧ to all the terms t and formulas ϕ of L, interpreting t with ⟦x̄ | t⟧ : D^n → D and ϕ with ⟦x̄ | ϕ⟧ ⊆ D^n, following (¬FO)–(□FO) along with the constraint that ⟦x̄ | R x̄⟧ = ⟦R⟧ for each n-ary predicate R.¹⁴

To illustrate how a neighborhood-sheaf model (π : D → W, ⟦·⟧) interprets terms, let us take an individual constant c. (See Figure 1.)

Figure 1: A global section ⟦c⟧ of a sheaf π : D → W

Since each w ∈ W is a model of first-order logic, c has its reference in w, written ⟦c⟧(w), which lives in w; that is, ⟦c⟧(w) ∈ D_w and π(⟦c⟧(w)) = w. Thus, the interpretation of c in the entire (π, ⟦·⟧) is a map ⟦c⟧ : W → D such that π ∘ ⟦c⟧ : W → W is the identity on W. Moreover, ⟦c⟧ is not just any map picking ⟦c⟧(w) arbitrarily from D_w, but has to be an arrow in LI/W, that is, a continuous map, picking ⟦c⟧(w) continuously. In topological terms, ⟦c⟧ forms a “global section”.

The class of neighborhood-sheaf interpretations forms neighborhood-sheaf semantics, and its logic FOMC is obtained by adding the rule M and the axiom C to classical first-order logic.¹⁵ Correspondence results are also available; for instance, FOS4, first-order logic with S4, is the logic of the subclass of topological-sheaf models (Awodey and Kishida 2008).

14 There are constraints on interpreting terms and interpreting substitution of terms, but we omit them in this article. See (Awodey and Kishida 2008) and (Kishida 2011) instead.

15 FOMC proves the converse Barcan formula □∀x ϕ → ∀x □ϕ and also ∃x □ϕ → □∃x ϕ, but neither the Barcan formula ∀x □ϕ → □∀x ϕ nor □∃x ϕ ⊢ ∃x □ϕ. Also, FOMC fails to prove x = y → □(x = y) and x ≠ y → □(x ≠ y).

3 First-Order Public-Announcement Logic

We are finally ready to combine the two semantics in the previous sections to provide a neighborhood-sheaf semantics for FOPAL, first-order public-announcement logic.
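The two ingredients being combined here, the interior operation induced by a basic-neighborhood map and the restriction update, can be tried out on a finite frame. The following is only a minimal sketch under stated assumptions: the three-world frame, the names `interior` and `restrict`, and the chosen sets are all hypothetical, and the code does not check the MCN-frame conditions. It reads int(A) as the set of points having some basic neighborhood inside A, and restricts neighborhoods by intersecting them with S.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set

# A basic-neighborhood map: each point is sent to a set of basic neighborhoods.
Basis = Dict[Hashable, Set[FrozenSet[Hashable]]]


def interior(basis: Basis, a: Iterable[Hashable]) -> Set[Hashable]:
    """Points w that have some basic neighborhood B with B a subset of a."""
    target = frozenset(a)
    return {w for w, nbhds in basis.items() if any(b <= target for b in nbhds)}


def restrict(basis: Basis, s: Iterable[Hashable]) -> Basis:
    """Restriction update: keep the points of s, cut every neighborhood down to s."""
    keep = frozenset(s)
    return {w: {b & keep for b in basis[w]} for w in basis if w in keep}


# Hypothetical three-world frame; worlds and neighborhoods are made up.
basis: Basis = {
    0: {frozenset({0, 1})},
    1: {frozenset({1, 2})},
    2: {frozenset({2})},
}
phi = {0}  # extension of some sentence, chosen for illustration

before = interior(basis, phi)                   # no neighborhood of 0 fits in {0}
after = interior(restrict(basis, {0, 2}), phi)  # after announcing S = {0, 2}
print(before, after)
```

In this toy frame `before` is empty and `after` is `{0}`: the set {0} is not verifiable at world 0 until world 1 has been eliminated, which is the kind of effect the restriction update is meant to capture.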
We define a language L of FOPAL as a first-order language (possibly with the relation symbol =, function symbols, etc.) with operators K_i for i ∈ I and [!σ] for all sentences σ of L. In this article, however, we settle for a pair of restrictions (leaving a formulation without them for future work): we assume that the set I of agents is a singleton, and simply write □ for K_i; we also require that σ in [!σ] be a sentence, with no free variables.

We define the restriction update of sheaf models by taking advantage of the “many-sorted” view yet again. [!σ] gives a sentence when taking a sentence as an argument and gives an n-ary formula when taking an n-ary formula as an argument. So we interpret [!σ] working on a sentence with a restriction update of W, [!σ] working on a unary formula with a restriction update of D, and [!σ] working on an n-ary formula with a restriction update of D^n. These restriction updates are parallel to one another, and in particular to the propositional case (⟦σ⟧, ℬ_⟦σ⟧, ⟦·⟧_⟦σ⟧).

The only question is to what subsets we should restrict D and D^n. In the propositional case, we eliminate worlds w ∉ ⟦σ⟧ since they are no longer possibilities once σ is announced. So, similarly, we eliminate individuals a such that π(a) ∉ ⟦σ⟧, since such a, not being from worlds in ⟦σ⟧, are no longer “possible individuals” once σ is announced; therefore we restrict D to π^{-1}[⟦σ⟧] = Σ_{w∈⟦σ⟧} D_w, the set of individuals from ⟦σ⟧. More generally, given S ⊆ W, the restriction of D^n from over W to over S is

  D^n_S = (π^n)^{-1}[S] = Σ_{w∈S} D^n_w ⊆ D^n,

the set of n-tuples from S; in particular, D^0_S = S. Thus, extending Definition 1.3 straightforwardly, we have the following.

Definition 3.1. Given a neighborhood-sheaf model M = (π : D → W, ⟦·⟧) for L and nonempty S ⊆ W, let M_S = (π_S, ⟦·⟧_S) be the neighborhood-sheaf model for L defined as follows.
• π_S = π↾D_S : D_S → S :: a ↦ π(a);
• ℬ_S(w) = { B ∩ S | B ∈ ℬ_W(w) };
• ℬ_{D_S}(a) = { B ∩ D_S | B ∈ ℬ_D(a) };
• ⟦R⟧_S = ⟦R⟧ ∩ D^n_S for every n-ary predicate R of L; and
• ⟦f⟧_S = ⟦f⟧↾D^n_S : D^n_S → D_S :: ā ↦ ⟦f⟧(ā) for every n-ary function symbol f of L.

As the definition presupposes, and as is straightforward to check, π_S is a local isomorphism over S (with ℬ_S), and so M_S is in fact a neighborhood-sheaf model.¹⁶

16 It is also worth noting that the restriction preserves (finite) products, in the sense that not only (D^n)_S = (D_S)^n = D^n_S, but also (ℬ^n)_{D^n_S} = (ℬ_{D_S})^n. To sum up in category-theoretic terms, the restriction is given by the “change of base” functor from LI/W to LI/S, which preserves products.

So, for the model-update interpretation of [!σ], let us enter

Definition 3.2. Given a language L of FOPAL, a neighborhood-sheaf interpretation over a neighborhood-sheaf model M = (π : D → W, ⟦·⟧) is called a neighborhood-sheaf PAL-interpretation over M if it satisfies the following, straightforwardly extending ([!·]Prop). (It may be worth noting that D^n_⟦σ⟧ = ⟦x̄ | σ⟧.)

  ([!·]FO)  ⟦x̄ | [!σ]ϕ⟧ = ⟦x̄ | σ⟧ ⟦→⟧ ⟦x̄ | ϕ⟧_⟦σ⟧,
  so that ā ∈ ⟦x̄ | [!σ]ϕ⟧ ⟺ either ā ∈ D^n \ D^n_⟦σ⟧ or ā ∈ ⟦x̄ | ϕ⟧_⟦σ⟧ ⊆ D^n_⟦σ⟧.

The class of such interpretations forms neighborhood-sheaf semantics for FOPAL. Also, the subclasses of interpretations over Kripke sheaves and over topological sheaves respectively form Kripke-sheaf semantics and topological-sheaf semantics for FOPAL.

The semantics for FOPAL validates Ratom–Rit, just as in the case of propositional PAL. A language of FOPAL has one more operator, namely ∃; the semantics validates the reduction axiom

  (R∃)  [!σ]∃y ϕ ≡ ∃y [!σ]ϕ,
because for each ā ∈ D^n we have

  ā ∈ ⟦x̄ | [!σ]∃y ϕ⟧
  ⟺ either ā ∈ D^n \ D^n_⟦σ⟧ or ā ∈ ⟦x̄ | ∃y ϕ⟧_⟦σ⟧
  ⟺ either (ā, b) ∈ D^{n+1} \ D^{n+1}_⟦σ⟧ for some b ∈ D \ D_⟦σ⟧ or (ā, b) ∈ ⟦x̄, y | ϕ⟧_⟦σ⟧ for some b ∈ D_⟦σ⟧
  ⟺ either (ā, b) ∈ D^{n+1} \ D^{n+1}_⟦σ⟧ for some b ∈ D or (ā, b) ∈ ⟦x̄, y | ϕ⟧_⟦σ⟧ for some b ∈ D
  ⟺ there is b ∈ D such that either (ā, b) ∈ D^{n+1} \ D^{n+1}_⟦σ⟧ or (ā, b) ∈ ⟦x̄, y | ϕ⟧_⟦σ⟧
  ⟺ there is b ∈ D such that (ā, b) ∈ ⟦x̄, y | [!σ]ϕ⟧
  ⟺ ā ∈ ⟦x̄ | ∃y [!σ]ϕ⟧.

Therefore, again, Ratom–Rit and R∃ completely axiomatize FOPAL by reducing it to its static fragment, that is, FOMC (or any stronger logic one chooses; in particular, FOK for Kripke-sheaf semantics and FOS4 for topological-sheaf semantics).

Theorem 2. Given a class M of neighborhood-sheaf models, let T be the logic of M in a first-order modal (but static) language L. Then the logic of M in the language of FOPAL extending L is obtained by adding Ratom–Rit and R∃ to T.

In this way, the sheaf semantics extends PAL to the first order in a natural, straightforward, and modular fashion. Nonetheless, its flexibility in treating individuals and their identity facilitates interesting dynamic behaviors of models. To see this, let us take an example from Awodey and Kishida (2008), namely, a local homeomorphism (that is, a local isomorphism with topology) π : ℝ⁺ → S¹ :: a ↦ (cos 2πa, sin 2πa) with ⟦≤⟧ = { (a, b) ∈ (ℝ⁺)² | a ≤ b }.¹⁷ (See Figure 2.) ℝ⁺ draws a spiral over the circle S¹. In this spiral, the intervals (0, 1] and (0, 1) are ⟦x | ∀y. x ≤ y⟧ and ⟦x | □∀y. x ≤ y⟧, the sets of elements that are “actually the least” and that are “necessarily the least” (in their own worlds); therefore, with ϕ short for ∀y. x ≤ y → □∀y. x ≤ y, we have 1 ∉ ⟦x | ϕ⟧ and so ϕ is not valid in (π, ⟦·⟧). Now consider the image of (0, 1) = ⟦x | □∀y. x ≤ y⟧ projected down to the circle, S¹ \ {(1, 0)}; it is ⟦∃x □∀y. x ≤ y⟧. Write σ for ∃x □∀y. x ≤ y.
The restriction to ⟦σ⟧ eliminates 1 ∉ ⟦x | ϕ⟧, and so ϕ becomes valid in (π_⟦σ⟧, ⟦·⟧_⟦σ⟧); thus, (π, ⟦·⟧) validates [!σ]ϕ.

Figure 2: Restriction update of a sheaf

Now, recall that any neighborhood-sheaf model (π : D → W, ⟦·⟧) has to interpret an individual constant c with a continuous, global section ⟦c⟧ : W → D. This means that the spiral model above cannot interpret a language with any individual constant, since π : ℝ⁺ → S¹ has no global section. The public announcement of σ, however, “severs” the spiral at 1, 2, 3, …, updating the model into a constant-domain model consisting of ℕ-many copies of ⟦σ⟧, and hence the new model accommodates a language with any constants for natural numbers. This example points to an application of FOPAL to dynamic processes of naming things in epistemic inquiries, in which the increase in knowledge makes our ontology better structured and thereby enables us to name more objects.

17 Awodey and Kishida (2008), 157–159.

4 Conclusion

In this article, we laid out three elements of extension for the standard PAL, namely, the (monotone) neighborhood setting, the first-order version FOPAL, and its neighborhood-sheaf semantics. The first facilitates the verifiability interpretation of K_i operators. The second brings into the PAL analysis of dynamic and epistemic processes a first-order vocabulary, and in particular names for individuals. And, lastly, the sheaf semantics gives us a flexible way of modelling how a first-order ontology as well as a first-order vocabulary can evolve in dynamic and epistemic processes.

Obvious extensions of this work include the following. In this article we only dealt with a single-agent setting for FOPAL; so a multi-agent setting is naturally expected, and it is a natural step to discuss distributed knowledge and common knowledge as well.
Also, we only dealt with public announcement of closed sentences, but public announcement of open formulas is also an informative event, so the fully general semantics of FOPAL should be able to deal with it. Yet the principal thesis of this article is that the sheaf semantics enables us to lift whatever structure we have at the propositional level to the first order straightforwardly. Therefore the semantics is expected to help us lift dynamic epistemic logics (van Ditmarsch et al. 2008) or other formalisms of logical dynamics (van Benthem 2011), such as belief revision (Baltag et al. 2008), to the first order.

Above all, bringing logical dynamics to the first order adds an entirely new aspect to the scope of logical dynamics, namely, dynamics in first-order ontology. The sheaf semantics enables us to flexibly model the interaction among the evolutions of knowledge, of ontology and of language; and it should be an interesting future project to introduce modal operators and other expressions for explicitly expressing and analyzing this interaction.

Acknowledgements This research was funded by the VIDI grant 639.072.904 of the Netherlands Organization for Scientific Research. The author is grateful to the anonymous referees for, and the audience at, the workshop Logic and Engineering of Natural Language Semantics 9, and in particular to Hans Kamp and to Katsuhiko Sano, for their insightful comments and helpful suggestions. An extended version of this paper was presented at the Logic and Interactive Rationality seminar at the University of Amsterdam, and the author thanks the audience there, and in particular Alexandru Baltag and Johan van Benthem, for inspiring discussions and instructive comments. A grateful acknowledgment also goes to Lorenz Demey and Barteld Kooi for valuable discussions of this research as well as of their own related research.

References

H. Arló-Costa and E. Pacuit. First-order classical modal logic.
Studia Logica, 84:171–210, November 2006.

S. Awodey and K. Kishida. Topology and modality: the topological interpretation of first-order modal logic. Review of Symbolic Logic, 1:146–166, August 2008.

A. Baltag, H. P. van Ditmarsch, and L. S. Moss. Epistemic logic and information update. In P. Adriaans and J. van Benthem, editors, Philosophy of Information. Amsterdam: Elsevier, 2008.

J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge: Cambridge University Press, 2011.

J. van Benthem and E. Pacuit. Dynamic logics of evidence-based beliefs. Studia Logica, 99:61–92, October 2011.

B. F. Chellas. Modal Logic: An Introduction. Cambridge: Cambridge University Press, 1980.

A. Dabrowski, L. S. Moss, and R. Parikh. Topological reasoning and the logic of knowledge. Annals of Pure and Applied Logic, 78, 1996.

L. Demey. Towards a dynamics of realistic knowledge: neighborhood semantics for public announcement logic. In Fourth Conference of the Dutch-Flemish Association for Analytic Philosophy, Leuven, Belgium, January 2010.

H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic-Epistemic Logic. Dordrecht: Springer, 2008.

D. M. Gabbay, V. Shehtman, and D. Skvortsov. Quantification in Nonclassical Logic, volume 1. Burlington: Elsevier, 2009.

H. H. Hansen. Monotonic modal logics. Master’s thesis, University of Amsterdam, Amsterdam, the Netherlands, October 2003.

D. Harel. First-Order Dynamic Logic. Berlin: Springer, 1979.

D. Harel, D. Kozen, and J. Tiuryn. Dynamic Logic. Cambridge: MIT Press, 2000.

K. Kishida. Neighborhood-sheaf semantics for first-order modal logic. Electronic Notes in Theoretical Computer Science, 278:129–143, 2011.

B. Kooi. Dynamic term-modal logic. In J. van Benthem, S. Ju, and F. Veltman, editors, A Meeting of the Minds: Proceedings of the Workshop on Logic, Rationality and Interaction, Beijing, 2007, pages 173–185. London: College Publications, 2007.

S. Kripke.
Semantical considerations on modal logic. Acta Philosophica Fennica, 16, 1963.

D. Lewis. Counterpart theory and quantified modal logic. Journal of Philosophy, 65, 1968.

M. Ma. Mathematics of public announcements. In H. P. van Ditmarsch, J. Lang, and S. Ju, editors, Logic, Rationality and Interaction: Third International Workshop, LORI 2011, Guangzhou, China, October 10–13, 2011, Proceedings, pages 193–205. Heidelberg: Springer, 2011.

J. Zvesper. Playing with Information. PhD thesis, University of Amsterdam, Amsterdam, the Netherlands, March 2010. ILLC Dissertation Series DS-2010-02.

Transition Semantics. The Dynamics of Dependence Logic
Pietro Galliani
University of Helsinki
Abstract

We examine the relationship between Dependence Logic and game logics. A variant of Dynamic Game Logic, called Transition Logic, is developed, and we show that its relationship with Dependence Logic is comparable to the one between First-Order Logic and Dynamic Game Logic discussed by van Benthem.

This suggests a new perspective on the interpretation of Dependence Logic formulas, in terms of assertions about reachability in games of imperfect information against Nature. We then capitalize on this intuition by developing expressively equivalent variants of Dependence Logic in which this interpretation is brought to the foreground.

1 Introduction

1.1 Dependence Logic

Dependence Logic (Väänänen 2007a) is an extension of First-Order Logic which adds dependence atoms of the form =(t_1, …, t_n) to it, with the intended interpretation “the value of the term t_n is a function of the values of the terms t_1 … t_{n−1}.” In recent years, this logic has attracted a remarkable amount of interest as a formalism for the formal analysis of the properties of dependence itself in a first-order setting; and some recent papers (Grädel and Väänänen 2013, Engström 2012, Galliani 2012) explore the effects of replacing dependence atoms with other similar primitives such as independence atoms (Grädel and Väänänen 2013), multivalued dependence atoms (Engström 2012), or inclusion or exclusion atoms (Galliani 2011; 2012).

In this section we will recall the definition of the semantics of this logic; then, in the rest of this work, we will examine its relationship with (an imperfect-information variant of) Dynamic Game Logic, and discuss how it (or variants thereof) may be used as a formalism for reasoning about games of imperfect information.

Definition 1.1 (Assignments and substitutions). Let M be a first order model and let V be a finite set of variables.
Then an assignment over M with domain V is a function s from V to the set Dom(M) of all elements of M. Furthermore, for any assignment s over M with domain V, any element m ∈ Dom(M) and any variable v (not necessarily in V), we write s[m/v] for the assignment with domain V ∪ {v} such that

  s[m/v](w) = m if w = v;  s[m/v](w) = s(w) if w ∈ V \ {v}

for all w ∈ V ∪ {v}.

Definition 1.2 (Team). Let M be a first-order model and let V be a finite set of variables. A team X over M with domain Dom(X) = V is a set of assignments from V to M.

Definition 1.3 (Relations corresponding to teams). Let X be a team over M, and let v⃗ be a finite tuple of variables in its domain. Then X(v⃗) is the relation {s(v⃗) : s ∈ X}. Furthermore, we write Rel(X) for X(Dom(X)).

As is often the case for Dependence Logic, we will assume that all our formulas are in Negation Normal Form:

Definition 1.4 (Dependence Logic, Syntax). Let Σ be a first-order signature. Then the set of all dependence logic formulas with signature Σ is given by

  ϕ ::= R t⃗ | ¬R t⃗ | =(t_1, …, t_n) | ϕ ∨ ϕ | ϕ ∧ ϕ | ∃vϕ | ∀vϕ

where R ranges over all relation symbols, t⃗ ranges over all tuples of terms of the appropriate arities, t_1 … t_n range over all terms and v ranges over the set Var of all variables.

The set Free(ϕ) of all free variables of a formula ϕ is defined precisely as in First Order Logic, with the additional condition that all variables occurring in a dependence atom are free with respect to it.

Definition 1.5 (Dependence Logic, Semantics). Let M be a first-order model, let X be a team over it, and let ϕ be a Dependence Logic formula with the same signature as M and with free variables in Dom(X). Then we say that X satisfies ϕ in M, and we write M ⊨_X ϕ, if and only if

TS-lit: ϕ is a first-order literal and M ⊨_s ϕ for all s ∈ X;
TS-dep: ϕ is a dependence atom =(t_1, …, t_n) and any two assignments s, s′ ∈ X which assign the same values to t_1 …
t_{n−1} also assign the same value to t_n;
TS-∨: ϕ is of the form ψ_1 ∨ ψ_2 and there exist two teams Y_1 and Y_2 such that X = Y_1 ∪ Y_2, M ⊨_{Y_1} ψ_1 and M ⊨_{Y_2} ψ_2;
TS-∧: ϕ is of the form ψ_1 ∧ ψ_2, M ⊨_X ψ_1 and M ⊨_X ψ_2;
TS-∃: ϕ is of the form ∃vψ and there exists a function F : X → Dom(M) such that M ⊨_{X[F/v]} ψ, where X[F/v] = {s[F(s)/v] : s ∈ X};
TS-∀: ϕ is of the form ∀vψ and M ⊨_{X[M/v]} ψ, where X[M/v] = {s[m/v] : s ∈ X, m ∈ Dom(M)}.

The disjunction of Dependence Logic does not behave like the classical disjunction: for example, it is easy to see that =(x) ∨ =(x) is not equivalent to =(x), as the former holds for the team X = {{(x, 0)}, {(x, 1)}} and the latter does not. However, it is possible to define the classical disjunction in terms of the other connectives:

Definition 1.6 (Classical Disjunction). Let ψ_1 and ψ_2 be two Dependence Logic formulas, and let u_1 and u_2 be two variables not occurring in them. Then we write ψ_1 ⊔ ψ_2 as a shorthand for

  ∃u_1 ∃u_2 (=(u_1) ∧ =(u_2) ∧ ((u_1 = u_2 ∧ ψ_1) ∨ (u_1 ≠ u_2 ∧ ψ_2))).

Proposition 1. For all formulas ψ_1 and ψ_2, all models M with at least two elements¹ whose signature contains that of ψ_1 and ψ_2, and all teams X whose domain contains the free variables of ψ_1 and ψ_2,

  M ⊨_X ψ_1 ⊔ ψ_2 ⇔ M ⊨_X ψ_1 or M ⊨_X ψ_2.

1 In general, we will assume throughout this work that all first-order models which we are considering have at least two elements. As one-element models are trivial, this is not a very onerous restriction.

The following four propositions are from (Väänänen 2007a):

Proposition 2. For all models M and Dependence Logic formulas ϕ, M ⊨_∅ ϕ.

Proposition 3 (Downwards Closure). If M ⊨_X ϕ and Y ⊆ X then M ⊨_Y ϕ.

Proposition 4 (Locality). If M ⊨_X ϕ and X(Free(ϕ)) = Y(Free(ϕ)) then M ⊨_Y ϕ.

Proposition 5 (From Dependence Logic to Σ^1_1). Let ϕ(v⃗) be a Dependence Logic formula with free variables in v⃗. Then there exists a Σ^1_1 sentence Φ(R) such that

  M ⊨_X ϕ ⇔ M ⊨ Φ(X(v⃗))

for all suitable models M and for all nonempty teams X.
Furthermore, in Φ(R) the symbol R occurs only negatively.

As proved in (Kontinen and Väänänen 2009), there is also a converse for the last proposition:

Theorem 1 (From Σ^1_1 to Dependence Logic). Let Φ(R) be a Σ^1_1 sentence in which R occurs only negatively. Then there exists a Dependence Logic formula ϕ(v⃗), where |v⃗| is the arity of R, such that

  M ⊨_X ϕ ⇔ M ⊨ Φ(X(v⃗))

for all suitable models M and for all nonempty teams X whose domain contains v⃗.

Because of this correspondence between Dependence Logic and Existential Second Order Logic, it is easy to see that Dependence Logic is closed under second-order existential quantification: for all Dependence Logic formulas ϕ(v⃗, P) over the signature Σ ∪ {P} there exists a Dependence Logic formula ∃Pϕ(v⃗, P) over the signature Σ such that

  M ⊨_X ∃Pϕ(v⃗, P) ⇔ ∃P s.t. M ⊨_X ϕ(v⃗, P)

for all models M with signature Σ and for all teams X over the free variables of ϕ. Therefore, in the rest of this work we will add second-order existential quantifiers to the language of Dependence Logic, and we will write ∃Pϕ(v⃗, P) as a shorthand for the corresponding Dependence Logic expression.

1.2 Dynamic Game Logic

Game logics are logical formalisms for reasoning about games and their properties in a very general setting. Whereas the Game Theoretic Semantics approach attempts to use game-theoretic techniques to interpret logical systems, game logics attempt to put logic to the service of game theory, by providing a high-level language for the study of games. They generally contain two different kinds of expressions:

1. Game terms, which are descriptions of games in terms of compositions of certain primitive atomic games, whose interpretation is presumed fixed for any given game model;
2. Formulas, which, in general, correspond to assertions about the abilities of players in games.
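Returning briefly to the team semantics of Subsection 1.1, the clauses TS-dep and TS-∨ can be checked mechanically on small teams. The following is a minimal sketch under stated assumptions, not a full evaluator: teams are lists of Python dictionaries, dependence atoms are restricted to variables rather than arbitrary terms, and the names `dep_atom` and `split_or` are hypothetical. The search in `split_or` tries only partitions of the team, which is enough for downward-closed formulas (Proposition 3).

```python
from itertools import product


def dep_atom(team, xs, y):
    """TS-dep for variables: assignments that agree on xs also agree on y."""
    seen = {}
    for s in team:
        key = tuple(s[x] for x in xs)
        # The first value of y seen for this key must match all later ones.
        if seen.setdefault(key, s[y]) != s[y]:
            return False
    return True


def split_or(team, sat1, sat2):
    """TS-or: some split of the team into Y1 and Y2 with sat1(Y1) and sat2(Y2).

    Only partitions are tried; by downward closure this loses no solutions.
    """
    team = list(team)
    for mask in product([True, False], repeat=len(team)):
        y1 = [s for s, m in zip(team, mask) if m]
        y2 = [s for s, m in zip(team, mask) if not m]
        if sat1(y1) and sat2(y2):
            return True
    return False


# The two-assignment team X = {{(x, 0)}, {(x, 1)}} from the text.
X = [{"x": 0}, {"x": 1}]


def const_x(team):
    """The constancy atom =(x): x takes a single value across the team."""
    return dep_atom(team, (), "x")


print(const_x(X), split_or(X, const_x, const_x))
```

On this team `const_x(X)` is false while `split_or(X, const_x, const_x)` is true, matching the observation that =(x) ∨ =(x) is not equivalent to =(x). The empty team satisfies the atom vacuously, in line with Proposition 2.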
In this subsection, we are going to summarize the definition of a variant of Dynamic Game Logic (Parikh 1985).² Then, in the next subsection, we will discuss a remarkable connection between First-Order Logic and Dynamic Game Logic discovered by Johan van Benthem in (van Benthem 2003).

One of the fundamental semantic concepts of Dynamic Game Logic is the notion of forcing relation:

Definition 1.7 (Forcing Relation). Let S be a nonempty set of states. A forcing relation over S is a set ρ ⊆ S × Parts(S), where Parts(S) is the powerset of S.

In brief, a forcing relation specifies the abilities of a player in a perfect-information game: (s, X) ∈ ρ if and only if the player has a strategy that guarantees that, whenever the initial position of the game is s, the terminal position of the game will be in X. A (two-player) game is then defined as a pair of forcing relations satisfying some axioms:

Definition 1.8 (Game). Let S be a nonempty set of states. A game over S is a pair (ρ_E, ρ_A) of forcing relations over S satisfying the following conditions for all i ∈ {E, A}, all s ∈ S and all X, Y ⊆ S:

Monotonicity: If (s, X) ∈ ρ_i and X ⊆ Y then (s, Y) ∈ ρ_i;
Consistency: If (s, X) ∈ ρ_E and (s, Y) ∈ ρ_A then X ∩ Y ≠ ∅;
Non-triviality: (s, ∅) ∉ ρ_i;
Determinacy: If (s, X) ∉ ρ_i then (s, S \ X) ∈ ρ_j, where j ∈ {E, A} \ {i}.³

2 The main difference between this version and the one of Parikh’s original paper lies in the absence of the iteration operator γ* from our formalism. In this, we follow (van Benthem 2003, van Benthem et al. 2008).

3 This requirement is nothing but a formal version of Zermelo’s Theorem: if one of the players cannot force the outcome of the game to belong to a set of “winning outcomes” X, this implies that the other player can force it to belong to the complement of X.

Definition 1.9 (Game Model). Let S be a nonempty set of states, let Φ be a nonempty set of atomic propositions and let Γ be a nonempty set of atomic game symbols.
Then a game model over S, Φ and Γ is a triple (S, {(ρ^g_E, ρ^g_A) : g ∈ Γ}, V), where (ρ^g_E, ρ^g_A) is a game over S for all g ∈ Γ and where V is a valuation function associating each p ∈ Φ with a subset V(p) ⊆ S.

The language of Dynamic Game Logic, as we already mentioned, consists of game terms, built up from atomic games, and of formulas, built up from atomic propositions. The connection between these two parts of the language is given by the test operation ϕ?, which turns any formula ϕ into a test game, and the diamond operation, which combines a game term γ and a formula ϕ into a new formula ⟨γ, i⟩ϕ which asserts that agent i can guarantee that the game γ will end in a state satisfying ϕ.

Definition 1.10 (Dynamic Game Logic - Syntax). Let Φ be a nonempty set of atomic propositions and let Γ be a nonempty set of atomic game symbols. Then the sets of all game terms γ and formulas ϕ are defined as

  γ ::= g | ϕ? | γ; γ | γ ∪ γ | γ^d
  ϕ ::= ⊥ | p | ¬ϕ | ϕ ∨ ϕ | ⟨γ, i⟩ϕ

for p ranging over Φ, g ranging over Γ, and i ranging over {E, A}.

We already mentioned the intended interpretations of the test connective ϕ? and of the diamond connective ⟨γ, i⟩ϕ. The interpretations of the other game connectives should be clear: γ^d is obtained by swapping the roles of the players in γ, γ_1 ∪ γ_2 is a game in which the existential player E chooses whether to play γ_1 or γ_2, and γ_1; γ_2 is the concatenation of the two games corresponding to γ_1 and γ_2 respectively.

Definition 1.11 (Dynamic Game Logic - Semantics). Let G = (S, {(ρ^g_E, ρ^g_A) : g ∈ Γ}, V) be a game model over S, Γ and Φ.
Then for all game terms γ and all formulas ϕ of Dynamic Game Logic over Γ and Φ we define a game |γ|_G and a set |ϕ|_G ⊆ S as follows:

DGL-atomic-game: For all g ∈ Γ, |g|_G = (ρ^g_E, ρ^g_A);

DGL-test: For all formulas ϕ, |ϕ?|_G = (ρ_E, ρ_A), where
• s ρ_E X iff s ∈ |ϕ|_G and s ∈ X;
• s ρ_A X iff s ∉ |ϕ|_G or s ∈ X
for all s ∈ S and all X with ∅ ≠ X ⊆ S;

DGL-concat: For all game terms γ_1 and γ_2, |γ_1; γ_2|_G = (ρ_E, ρ_A), where, for all i ∈ {E, A} and for |γ_1|_G = (ρ^1_E, ρ^1_A), |γ_2|_G = (ρ^2_E, ρ^2_A),
• s ρ_i X if and only if there exists a Z such that s ρ^1_i Z and for each z ∈ Z there exists a set X_z satisfying z ρ^2_i X_z such that X = ⋃_{z∈Z} X_z;

DGL-∪: For all game terms γ_1 and γ_2, |γ_1 ∪ γ_2|_G = (ρ_E, ρ_A), where
• s ρ_E X if and only if s ρ^1_E X or s ρ^2_E X, and
• s ρ_A X if and only if s ρ^1_A X and s ρ^2_A X
where, as before, |γ_1|_G = (ρ^1_E, ρ^1_A) and |γ_2|_G = (ρ^2_E, ρ^2_A);⁴

DGL-dual: If |γ|_G = (ρ_E, ρ_A) then |γ^d|_G = (ρ_A, ρ_E);

DGL-⊥: |⊥|_G = ∅;

DGL-atomic-pr: |p|_G = V(p);

DGL-¬: |¬ϕ|_G = S \ |ϕ|_G;

DGL-∨: |ϕ_1 ∨ ϕ_2|_G = |ϕ_1|_G ∪ |ϕ_2|_G;

DGL-◇: If |γ|_G = (ρ_E, ρ_A) then for all ϕ, |⟨γ, i⟩ϕ|_G = {s ∈ S : ∃X_s ⊆ |ϕ|_G s.t. s ρ_i X_s}.

If s ∈ |ϕ|_G, we say that ϕ is satisfied by s in G and we write G ⊨_s ϕ.

We will not discuss here the properties of this logic, or the many variants and extensions of it which have been developed and studied. It is worth pointing out, however, that van Benthem et al. (2008) introduced a Concurrent Dynamic Game Logic that can be considered one of the main sources of inspiration for the Transition Logic that we will develop in Subsection 3.2.

4 Van Benthem et al. (2008) give the following alternative condition for the powers of the universal player:
• s ρ_A X if and only if X = Z_1 ∪ Z_2 for two Z_1 and Z_2 such that s ρ^1_A Z_1 and s ρ^2_A Z_2.
It is trivial to see that, if our games satisfy the monotonicity condition, this rule is equivalent to the one we presented.
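Some of the clauses above can be made concrete over a finite state set. The sketch below covers only DGL-∪, DGL-dual and the diamond clause; forcing relations are stored as explicit sets of (state, set) pairs, and the helper names (`powerset`, `move_to`, `choice`, `dual`, `diamond`) as well as the three-state example are hypothetical. The atomic games used are deterministic moves to a fixed state, for which both players have the same powers.

```python
from itertools import combinations


def powerset(states):
    """All subsets of a finite state set, as frozensets."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def move_to(states, target):
    """Deterministic atomic game: from any state the play ends in `target`.

    Either player can force exactly the sets that contain `target`.
    """
    rel = {(s, x) for s in states for x in powerset(states) if target in x}
    return (rel, rel)  # (rho_E, rho_A)


def choice(g1, g2):
    """DGL-union: E picks the game, so E's powers unite and A's intersect."""
    (e1, a1), (e2, a2) = g1, g2
    return (e1 | e2, a1 & a2)


def dual(g):
    """DGL-dual: swap the roles of the two players."""
    rho_e, rho_a = g
    return (rho_a, rho_e)


def diamond(g, player, phi):
    """States from which `player` can force some outcome set inside phi."""
    rho = g[0] if player == "E" else g[1]
    goal = frozenset(phi)
    return {s for (s, x) in rho if x <= goal}


S = {0, 1, 2}
g = choice(move_to(S, 1), move_to(S, 2))
print(diamond(g, "E", {2}), diamond(g, "A", {2}), diamond(dual(g), "A", {2}))
```

Here E can force the outcome 2 from every state by choosing the second game, A cannot, and after dualizing the roles are exchanged. Concatenation and tests are omitted to keep the sketch short.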
1.3 Dynamic Game Logic and First Order Logic

In this subsection, we will briefly recall a remarkable result from (van Benthem 2003) which establishes a connection between Dynamic Game Logic and First-Order Logic. In brief, as the following two theorems demonstrate, either of these logics can be seen as a special case of the other, in the sense that models and formulas of the one can be uniformly translated into models and formulas of the other in a way which preserves satisfiability and truth:

Theorem 2. Let $G = (S, \{(\rho_g^E, \rho_g^A) : g \in \Gamma\}, V)$ be any game model, let $\varphi$ be any game formula for the same language, and let $s \in S$. Then it is possible to uniformly construct a first-order model $G^{FO}$, a first-order formula $\varphi^{FO}$ and an assignment $s^{FO}$ of $G^{FO}$ such that

$G \models_s \varphi \Leftrightarrow G^{FO} \models_{s^{FO}} \varphi^{FO}$.

Theorem 3. Let $M$ be any first-order model, let $\varphi$ be any first-order formula for the signature of $M$, and let $s$ be an assignment of $M$. Then it is possible to uniformly construct a game model $G^{DGL}$, a game formula $\varphi^{DGL}$ and a state $s^{DGL}$ such that

$M \models_s \varphi \Leftrightarrow G^{DGL} \models_{s^{DGL}} \varphi^{DGL}$.

We will not discuss here the proofs of these two results. Their significance, however, deserves a few words. In brief, what this back-and-forth representation between First Order Logic and Dynamic Game Logic tells us is that it is possible to understand First Order Logic as a logic for reasoning about determined games! In the next sections, we will attempt to develop a similar result for the case of Dependence Logic.

2 Transition Logic

2.1 A logic for imperfect information games against nature

We will now define a variant of Dynamic Game Logic, which we will call Transition Logic. It deviates from the basic framework of Dynamic Game Logic in two fundamental ways:
1. It considers one-player games against Nature, instead of two-player games as is usual in Dynamic Game Logic;
2. It allows for uncertainty about the initial position of the game.
Hence, Transition Logic can be seen as a decision-theoretic logic, rather than a game-theoretic one: Transition Logic formulas, as we will see, correspond to assertions about the abilities of a single agent acting under uncertainty, instead of assertions about the abilities of agents interacting with each other. In principle, it is certainly possible to generalize the approach discussed here to multiple agents acting in situations of imperfect information, and doing so might cause interesting phenomena to surface; but for the time being, we will content ourselves with developing this formalism and discussing its connection with Dependence Logic.

Our first definition is a fairly straightforward generalization of the concept of forcing relation:

Definition 2.1 (Transition system). Let $S$ be a nonempty set of states. A transition system over $S$ is a nonempty relation $\theta \subseteq \mathrm{Parts}(S) \times \mathrm{Parts}(S)$ satisfying the following requirements:

Downwards Closure: If $(X, Y) \in \theta$ and $X' \subseteq X$ then $(X', Y) \in \theta$;

Monotonicity: If $(X, Y) \in \theta$ and $Y \subseteq Y'$ then $(X, Y') \in \theta$;

Non-creation: $(\emptyset, Y) \in \theta$ for all $Y \subseteq S$;

Non-triviality: If $X \neq \emptyset$ then $(X, \emptyset) \notin \theta$.

Informally speaking, a transition system specifies the abilities of an agent: for all $X, Y \subseteq S$ such that $(X, Y) \in \theta$, the agent has a strategy which guarantees that the output of the transition will be in $Y$ whenever the input of the transition is in $X$. The four axioms which we gave capture precisely this intended meaning, as we will see.

Definition 2.2 (Decision Game). A decision game is a triple $\Gamma = (S, E, O)$, where $S$ is a nonempty set of states, $E$ is a nonempty set of possible decisions for our agent and $O$ is an outcome function from $S \times E$ to $\mathrm{Parts}(S)$. If $s' \in O(s, e)$, we say that $s'$ is a possible outcome of $s$ under $e$; if $O(s, e) = \emptyset$, we say that $e$ fails on input $s$.

Definition 2.3 (Abilities in a decision game). Let $\Gamma = (S, E, O)$ be a decision game, and let $X, Y \subseteq S$.
Then we say that $\Gamma$ allows the transition $X \to Y$, and we write $\Gamma : X \to Y$, if and only if there exists an $e \in E$ such that $\emptyset \neq O(s, e) \subseteq Y$ for all $s \in X$ (that is, if and only if our agent can make a decision which guarantees that the outcome will be in $Y$ whenever the input is in $X$).

Theorem 4 (Transition Systems and abilities). A set $\theta \subseteq \mathrm{Parts}(S) \times \mathrm{Parts}(S)$ is a transition system if and only if there exists a decision game $\Gamma = (S, E, O)$ such that

$(X, Y) \in \theta \Leftrightarrow \Gamma : X \to Y$.

What this theorem tells us is that our notion of transition system is the correct one: it captures precisely the abilities of an agent making choices under imperfect information and attempting to guarantee that, if the initial state is in a set $X$, the outcome will be in a set $Y$.

Definition 2.4 (Trump). Let $S$ be a nonempty set of states. A trump over $S$ is a nonempty, downwards closed family of subsets of $S$.

Whereas a transition system describes the abilities of an agent to transition from a set of possible initial states to a set of possible terminal states, a trump describes the agent's abilities to reach some terminal state from a set of possible initial states:⁵

Proposition 6. Let $\theta$ be a transition system and let $Y \subseteq S \neq \emptyset$. Then $\mathrm{reach}(\theta, Y) = \{X \mid (X, Y) \in \theta\}$ forms a trump. Conversely, for any trump $\mathcal{X}$ over $S$ there exists a transition system $\theta$ such that $\mathcal{X} = \mathrm{reach}(\theta, Y)$ for any nonempty $Y \subseteq S$.

We can now define the syntax and semantics of Transition Logic:

Definition 2.5 (Transition Model). Let $\Phi$ be a set of atomic propositional symbols and let $\Theta$ be a set of atomic transition symbols. Then a transition model is a tuple $T = (S, \{\theta_t : t \in \Theta\}, V)$, where $S$ is a nonempty set of states, $\theta_t$ is a transition system over $S$ for any $t \in \Theta$, and $V$ is a function sending each $p \in \Phi$ to a trump over $S$.

Definition 2.6 (Transition Logic - Syntax). Let $\Phi$ be a set of atomic propositions and let $\Theta$ be a set of atomic transitions.
Then the transition terms and formulas of our language are defined respectively as

$\tau ::= t \mid \varphi? \mid \tau \otimes \tau \mid \tau \cap \tau \mid \tau; \tau$
$\varphi ::= \top \mid p \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \langle\tau\rangle\varphi$

where $t$ ranges over $\Theta$ and $p$ ranges over $\Phi$.

Definition 2.7 (Transition Logic - Semantics). Let $T = (S, \{\theta_t : t \in \Theta\}, V)$ be a transition model, let $\tau$ be a transition term, and let $X, Y \subseteq S$. Then we say that $\tau$ allows the transition from $X$ to $Y$, and we write $T \models_{X \to Y} \tau$, if and only if

TL-atomic-tr: $\tau = t$ for some $t \in \Theta$ and $(X, Y) \in \theta_t$;

TL-test: $\tau = \varphi?$ for some transition formula $\varphi$ such that $T \models_X \varphi$ in the sense described later in this definition, and $X \subseteq Y$;

TL-$\otimes$: $\tau = \tau_1 \otimes \tau_2$, and $X = X_1 \cup X_2$ for two $X_1$ and $X_2$ such that $T \models_{X_1 \to Y} \tau_1$ and $T \models_{X_2 \to Y} \tau_2$;

TL-$\cap$: $\tau = \tau_1 \cap \tau_2$, $T \models_{X \to Y} \tau_1$ and $T \models_{X \to Y} \tau_2$;

TL-concat: $\tau = \tau_1; \tau_2$ and there exists a $Z \subseteq S$ such that $T \models_{X \to Z} \tau_1$ and $T \models_{Z \to Y} \tau_2$.

Analogously, let $\varphi$ be a transition formula, and let $X \subseteq S$. Then we say that $X$ satisfies $\varphi$, and we write $T \models_X \varphi$, if and only if

TL-$\top$: $\varphi = \top$;

TL-atomic-pr: $\varphi = p$ for some $p \in \Phi$ and $X \in V(p)$;

TL-$\vee$: $\varphi = \psi_1 \vee \psi_2$ and $T \models_X \psi_1$ or $T \models_X \psi_2$;

TL-$\wedge$: $\varphi = \psi_1 \wedge \psi_2$, $T \models_X \psi_1$ and $T \models_X \psi_2$;

TL-$\diamond$: $\varphi = \langle\tau\rangle\psi$ and there exists a $Y$ such that $T \models_{X \to Y} \tau$ and $T \models_Y \psi$.

⁵ The term "trump" is taken from (Hodges 1997), who used it to describe the set of all teams which satisfy a given formula.

Proposition 7. For any transition model $T$, transition term $\tau$ and transition formula $\varphi$, the set $|\tau|_T = \{(X, Y) : T \models_{X \to Y} \tau\}$ is a transition system and the set $|\varphi|_T = \{X : T \models_X \varphi\}$ is a trump.

Proof. By induction.

We end this subsection with a few simple observations about this logic. First of all, we did not take negation as one of the primitive connectives. Indeed, Transition Logic, much like Dependence Logic, has an intrinsically existential character: it can be used to reason about which sets of possible states an agent may reach, but not to reason about which ones such an agent must reach.
There is of course no reason, in principle, why a negation could not be added to the language, just as there is no reason why a negation cannot be added to Dependence Logic, thus obtaining the far more powerful Team Logic (Väänänen 2007b, Kontinen and Nurmi 2009); however, this possible extension will not be studied in this work.

The connectives of Transition Logic are, for the most part, very similar to those of Dynamic Game Logic, and their interpretation should pose no difficulties. The exception is the tensor operator $\tau_1 \otimes \tau_2$, which replaces the game union operator $\gamma_1 \cup \gamma_2$ and which, while sharing roughly the same informal meaning, behaves in a very different way from the semantic point of view (for example, it is not in general idempotent!).

The decision game corresponding to $\tau_1 \otimes \tau_2$ can be described as follows: first the agent chooses an index $i \in \{1, 2\}$, then he or she picks a strategy for $\tau_i$ and plays accordingly. However, the choice of $i$ may be a function of the initial state: hence, the agent can guarantee that the output state will be in $Y$ whenever the input state is in $X$ only if he or she can split $X$ into two subsets $X_1$ and $X_2$ and guarantee that a state in $Y$ will be reached from any state in $X_1$ when $\tau_1$ is played, and from any state in $X_2$ when $\tau_2$ is played.

It is also of course possible to introduce a "true" choice operator $\tau_1 \cup \tau_2$, with semantical condition

TL-$\cup$: $T \models_{X \to Y} \tau_1 \cup \tau_2$ iff $T \models_{X \to Y} \tau_1$ or $T \models_{X \to Y} \tau_2$;

but we will not explore this possibility any further in this work, nor will we consider any other possible connectives such as, for example, the iteration operator

TL-$*$: $T \models_{X \to Y} \tau^*$ iff there exist $n \in \mathbb{N}$ and $Z_0 \ldots Z_n$ such that $Z_0 = X$, $Z_n = Y$ and $T \models_{Z_i \to Z_{i+1}} \tau$ for all $i \in 0 \ldots n - 1$.

2.2 Transition Logic and Dependence Logic

This subsection contains the central result of this work, that is, the analogues of Theorems 2 and 3 for Dependence Logic and Transition Logic.
Representing Dependence Logic models and formulas in Transition Logic is fairly simple:

Definition 2.8 ($M^{TL}$). Let $M$ be a first-order model. Then $M^{TL}$ is the transition model $(S, \Theta, V)$ such that
• $S$ is the set of all assignments over $M$ (so that the subsets of $S$ are exactly the teams over $M$);
• The set of all atomic transition symbols is $\{\exists v, \forall v : v \in \mathrm{Var}\}$, and hence $\Theta$ is $\{\theta_{\exists v}, \theta_{\forall v} : v \in \mathrm{Var}\}$;
• For any variable $v$, $\theta_{\exists v} = \{(X, Y) : \exists F \text{ s.t. } X[F/v] \subseteq Y\}$ and $\theta_{\forall v} = \{(X, Y) : X[M/v] \subseteq Y\}$;
• For any first-order literal or dependence atom $\alpha$, $V(\alpha) = \{X : M \models_X \alpha\}$.

Definition 2.9 ($\varphi^{TL}$). Let $\varphi$ be a Dependence Logic formula. Then $\varphi^{TL}$ is the transition term defined as follows:
1. If $\varphi$ is a literal or a dependence atom, $\varphi^{TL} = \varphi?$;
2. If $\varphi = \psi_1 \vee \psi_2$, $\varphi^{TL} = (\psi_1)^{TL} \otimes (\psi_2)^{TL}$;
3. If $\varphi = \psi_1 \wedge \psi_2$, $\varphi^{TL} = (\psi_1)^{TL} \cap (\psi_2)^{TL}$;
4. If $\varphi = \exists v \psi$, $\varphi^{TL} = \exists v; (\psi)^{TL}$;
5. If $\varphi = \forall v \psi$, $\varphi^{TL} = \forall v; (\psi)^{TL}$.

Theorem 5. For all first-order models $M$, teams $X$ and formulas $\varphi$, the following are equivalent:
• $M \models_X \varphi$;
• $\exists Y$ s.t. $M^{TL} \models_{X \to Y} \varphi^{TL}$;
• $M^{TL} \models_X \langle\varphi^{TL}\rangle\top$;
• $M^{TL} \models_{X \to S} \varphi^{TL}$.

Proof. By structural induction on $\varphi$.

One interesting aspect of this representation result is that Dependence Logic formulas correspond to Transition Logic transitions, not to Transition Logic formulas. This is a first hint that Dependence Logic can be thought of as a logic of transitions; in the later sections, we will explore this idea in more depth.

Representing transition models, transition terms and formulas in Dependence Logic is somewhat more complex:

Definition 2.10 ($T^{DL}$). Let $T = (S, (\theta_t : t \in \Theta), V)$ be a transition model. Furthermore, for any $t \in \Theta$, let $\theta_t = \{(X_i, Y_i) : i \in I_t\}$, and, for any $p \in \Phi$, let $V(p) = \{X_j : j \in J_p\}$.
Then $T^{DL}$ is the first-order model with domain⁶ $S \uplus \biguplus\{I_t : t \in \Theta\} \uplus \biguplus\{J_p : p \in \Phi\}$ whose signature contains
• For every $t \in \Theta$, a ternary relation $R_t$ whose interpretation is $\{(i, x, y) : i \in I_t, x \in X_i, y \in Y_i\}$;
• For every $p \in \Phi$, a binary relation $V_p$ whose interpretation is $\{(j, x) : j \in J_p, x \in X_j\}$.

⁶ Here we write $A \uplus B$ for the disjoint union of the sets $A$ and $B$.

Definition 2.11 ($\varphi^{DL}_x$ and $\tau^{DL}_x$). For any transition formula $\varphi$ and variable $x$, the Dependence Logic formula $\varphi^{DL}_x$ is defined as
1. $\top^{DL}_x$ is $\top$;
2. For all $p \in \Phi$, $p^{DL}_x$ is $\exists j(=(j) \wedge V_p(j, x))$;
3. $(\psi_1 \vee \psi_2)^{DL}_x$ is $(\psi_1)^{DL}_x \sqcup (\psi_2)^{DL}_x$, where $\sqcup$ is the classical disjunction introduced in Definition 1.6;
4. $(\psi_1 \wedge \psi_2)^{DL}_x$ is $(\psi_1)^{DL}_x \wedge (\psi_2)^{DL}_x$;
5. $(\langle\tau\rangle\psi)^{DL}_x$ is $\exists P((\tau)^{DL}_x(P) \wedge \forall y(\neg Py \vee (\psi)^{DL}_y))$,
where for any transition term $\tau$, variable $x$ and unary relation symbol $P$, $\tau^{DL}_x(P)$ is defined as
6. For all $t \in \Theta$, $t^{DL}_x(P)$ is $\exists i(=(i) \wedge \exists y(R_t(i, x, y)) \wedge \forall y(\neg R_t(i, x, y) \vee Py))$;
7. For all formulas $\varphi$, $(\varphi?)^{DL}_x(P)$ is $\varphi^{DL}_x \wedge Px$;
8. $(\tau_1 \otimes \tau_2)^{DL}_x(P) = (\tau_1)^{DL}_x(P) \vee (\tau_2)^{DL}_x(P)$;
9. $(\tau_1 \cap \tau_2)^{DL}_x(P) = (\tau_1)^{DL}_x(P) \wedge (\tau_2)^{DL}_x(P)$;
10. $(\tau_1; \tau_2)^{DL}_x(P) = \exists Q((\tau_1)^{DL}_x(Q) \wedge \forall y(\neg Qy \vee (\tau_2)^{DL}_y(P)))$ for a new and unused variable $y$.

Theorem 6. For all transition models $T = (S, (\theta_t : t \in \Theta), V)$, transition terms $\tau$, transition formulas $\varphi$, variables $x$, sets $P \subseteq S$ and teams $X$ over $T^{DL}$ with $X(x) \subseteq S$,⁷

$T^{DL} \models_X \varphi^{DL}_x \Leftrightarrow T \models_{X(x)} \varphi$

and

$T^{DL} \models_X \tau^{DL}_x(P) \Leftrightarrow T \models_{X(x) \to P} \tau$.

⁷ That is, such that $X(x)$ is a set of states of the transition model.

Proof. The proof is by structural induction on terms and formulas. We show the two most interesting cases:

5. $T^{DL} \models_X (\langle\tau\rangle\psi)^{DL}_x$ if and only if there exists a $P$ such that $T^{DL} \models_X (\tau)^{DL}_x(P)$ and $T^{DL} \models_{X[T^{DL}/y]} \neg Py \vee (\psi)^{DL}_y$. By induction hypothesis, the first condition holds if and only if $T \models_{X(x) \to P} \tau$.
As for the second one, it holds if and only if $X[T^{DL}/y] = Y_1 \cup Y_2$ for two $Y_1$, $Y_2$ such that $T^{DL} \models_{Y_1} \neg Py$ and $T^{DL} \models_{Y_2} (\psi)^{DL}_y$. But then we must have that $T \models_{Y_2(y)} \psi$ and that $P \subseteq Y_2(y)$; therefore, by downwards closure, $T \models_P \psi$ and finally $T \models_{X(x)} \langle\tau\rangle\psi$.

Conversely, suppose that there exists a $P$ such that $T \models_{X(x) \to P} \tau$ and $T \models_P \psi$; then by induction hypothesis we have that $T^{DL} \models_X (\tau)^{DL}_x(P)$ and that $T^{DL} \models_{X[T^{DL}/y]} \neg Py \vee (\psi)^{DL}_y$, and hence $T^{DL} \models_X (\langle\tau\rangle\psi)^{DL}_x$.

10. $T^{DL} \models_X \exists Q((\tau_1)^{DL}_x(Q) \wedge \forall y(\neg Qy \vee (\tau_2)^{DL}_y(P)))$ if and only if there exists a $Q$ such that $T \models_{X(x) \to Q} \tau_1$ and there exists a $Q' \supseteq Q$ such that $T \models_{Q' \to P} \tau_2$. By downwards closure, if this is the case then $T \models_{Q \to P} \tau_2$ too, and hence $T \models_{X(x) \to P} \tau_1; \tau_2$, as required.

Conversely, suppose that there exists a $Q$ such that $T \models_{X(x) \to Q} \tau_1$ and $T \models_{Q \to P} \tau_2$. Then, by induction hypothesis, $T^{DL} \models_X (\tau_1)^{DL}_x(Q)$; and furthermore, $X[T^{DL}/y]$ can be split into

$Z_1 = \{s \in X[T^{DL}/y] : s(y) \notin Q\}$

and

$Z_2 = \{s \in X[T^{DL}/y] : s(y) \in Q\}$.

It is trivial to see that $T^{DL} \models_{Z_1} \neg Qy$; and furthermore, since $Z_2(y) = Q$ and $T \models_{Q \to P} \tau_2$, by induction hypothesis we have that $T^{DL} \models_{Z_2} (\tau_2)^{DL}_y(P)$. Thus $T^{DL} \models_X \forall y(\neg Qy \vee (\tau_2)^{DL}_y(P))$ and finally $T^{DL} \models_X (\tau_1; \tau_2)^{DL}_x(P)$, and this concludes the proof.

Hence, the relationship between Transition Logic and Dependence Logic is analogous to the one between Dynamic Game Logic and First-Order Logic. In the next sections, we will develop variants of Dependence Logic which are syntactically closer to Transition Logic, while still being first-order: as we will see, the resulting frameworks are expressively equivalent to Dependence Logic on the level of satisfiability, but can be used to represent finer-grained phenomena of transitions between sets of assignments.
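To make the notions of Subsection 2.1 concrete, the abilities of a small decision game can be computed by brute force and checked against the four requirements of Definition 2.1. The sketch below is an illustration under assumptions introduced here (the two-state game with decisions "stay" and "flip", and all function names, are hypothetical); it also shows the tensor and concatenation operations acting directly on sets of pairs $(X, Y)$.

```python
from itertools import combinations


def powerset(states):
    """Return all subsets of `states` (including the empty one) as frozensets."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def abilities(states, decisions, outcome):
    """Definition 2.3: all pairs (X, Y) such that some decision e has a
    nonempty outcome O(s, e) contained in Y for every s in X."""
    subsets = powerset(states)
    return {(X, Y) for X in subsets for Y in subsets
            if any(all(bool(outcome[(s, e)]) and outcome[(s, e)] <= Y
                       for s in X)
                   for e in decisions)}


def is_transition_system(states, theta):
    """Brute-force check of the four requirements of Definition 2.1."""
    subsets = powerset(states)
    empty = frozenset()
    downwards = all((X2, Y) in theta for (X, Y) in theta
                    for X2 in subsets if X2 <= X)
    monotone = all((X, Y2) in theta for (X, Y) in theta
                   for Y2 in subsets if Y <= Y2)
    non_creation = all((empty, Y) in theta for Y in subsets)
    non_trivial = all((X, empty) not in theta for X in subsets if X)
    return (bool(theta) and downwards and monotone
            and non_creation and non_trivial)


def tensor(theta1, theta2):
    """TL-tensor: X splits into X1 and X2, handled by tau1 and tau2."""
    return {(X1 | X2, Y1) for (X1, Y1) in theta1
            for (X2, Y2) in theta2 if Y1 == Y2}


def compose(theta1, theta2):
    """TL-concat: reach some Z with tau1, then reach Y from Z with tau2."""
    return {(X, Y) for (X, Z1) in theta1
            for (Z2, Y) in theta2 if Z1 == Z2}


# Hypothetical decision game: two states, the agent may stay or flip.
S = {0, 1}
E = ["stay", "flip"]
O = {(s, "stay"): frozenset({s}) for s in S}
O.update({(s, "flip"): frozenset({1 - s}) for s in S})
theta = abilities(S, E, O)
both_to_zero = (frozenset({0, 1}), frozenset({0}))
```

In this example the pair `both_to_zero` belongs to `tensor(theta, theta)` but not to `theta`: the agent can reach state 0 from an unknown initial state only if the decision may depend on that state, which illustrates why the tensor is not idempotent in general.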
3 Dynamic variants of Dependence Logic

3.1 Dependence Logic and transitions between teams

Now that we have established a connection between Dependence Logic and a variant of Dynamic Game Logic, it is time to explore what this might imply for the further development of logics of imperfect information. If, as Theorems 5 and 6 suggest, Dependence Logic can be thought of as a logic of imperfect-information decision problems, perhaps it could be possible to develop variants of Dependence Logic in which expressions can be interpreted directly as transition systems?

In what follows, we will do exactly that, first with Transition Dependence Logic (a variant of Dependence Logic, expressively equivalent to it, which is also a quantified version of Transition Logic) and then with Dynamic Dependence Logic, in which all expressions are interpreted as transitions!

But why would we be interested in such variants of Dependence Logic? One possible answer, which we will discuss in this subsection, is that transitions between teams are already a central object of study in the field of Dependence Logic, albeit in a non-explicit manner: after all, the semantics of Dependence Logic interprets quantifiers in terms of transformations of teams, and disjunctions in terms of decompositions of teams into subteams. This intuition is central to the study of issues of interdefinability in Dependence Logic and its variants, like for example the ones discussed in (Galliani 2012).

As a simple example, let us recall Definition 1.6:

$\psi_1 \sqcup \psi_2 := \exists u_1 \exists u_2 (=(u_1) \wedge =(u_2) \wedge ((u_1 = u_2 \wedge \psi_1) \vee (u_1 \neq u_2 \wedge \psi_2)))$,

where $u_1$ and $u_2$ are new variables. As we said in Proposition 1, $M \models_X \psi_1 \sqcup \psi_2$ if and only if $M \models_X \psi_1$ or $M \models_X \psi_2$. We will now sketch the proof of this result, and, as we will see, this proof will hinge on the fact that the above expression can be read as a specification of the following algorithm:
1.
Choose an element $a \in \mathrm{Dom}(M)$ and extend the team $X$ by assigning $a$ as the value of $u_1$ for all assignments;
2. Choose an element $b \in \mathrm{Dom}(M)$ and further extend the team by assigning $b$ as the value of $u_2$ for all assignments;
3. Split the resulting team into two subteams $Y_1$ and $Y_2$ such that
(a) $\psi_1$ holds in $Y_1$, and the values of $u_1$ and $u_2$ coincide for all assignments in it;
(b) $\psi_2$ holds in $Y_2$, and the values of $u_1$ and $u_2$ differ for all assignments in it.

Since the values of $u_1$ and $u_2$ are chosen to always be respectively $a$ and $b$, one of $Y_1$ and $Y_2$ is empty and the other is of the form $X[ab/u_1 u_2]$, and since $u_1$ and $u_2$ do not occur in $\psi_1$ or $\psi_2$ the above algorithm can succeed (for some choice of $a$ and $b$) only if $M \models_X \psi_1$ or $M \models_X \psi_2$.

3.2 Transition Dependence Logic

As stated, we will now define a variant of Dependence Logic which can also be seen as a quantified variant of Transition Logic. We will then prove that the resulting Transition Dependence Logic is expressively equivalent to Dependence Logic, in the sense that any Dependence Logic formula is equivalent to some Transition Dependence Logic formula and vice versa.

Definition 3.1 (Transition Dependence Logic - Syntax). Let $\Sigma$ be a first-order signature. Then the sets of all transition terms and of all formulas of Transition Dependence Logic are given by the rules

$\tau ::= \exists v \mid \forall v \mid \varphi? \mid \tau \otimes \tau \mid \tau \cap \tau \mid \tau; \tau$
$\varphi ::= R\vec{t} \mid \neg R\vec{t} \mid =(t_1, \ldots, t_n) \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \langle\tau\rangle\varphi$

where $v$ ranges over all variables in $\mathrm{Var}$, $R$ ranges over all relation symbols of the signature, $\vec{t}$ ranges over all tuples of terms of the required arities, $n$ ranges over $\mathbb{N}$ and $t_1 \ldots t_n$ range over the terms of our signature.

Definition 3.2 (Transition Dependence Logic - Semantics). Let $M$ be a first-order model, let $\tau$ be a first-order transition term of the same signature, and let $X$ and $Y$ be teams over $M$.
Then we say that the transition $X \to Y$ is allowed by $\tau$ in $M$, and we write $M \models_{X \to Y} \tau$, if and only if

TDL-$\exists$: $\tau$ is of the form $\exists v$ for some $v \in \mathrm{Var}$ and there exists an $F$ such that $X[F/v] \subseteq Y$;

TDL-$\forall$: $\tau$ is of the form $\forall v$ for some $v \in \mathrm{Var}$ and $X[M/v] \subseteq Y$;

TDL-test: $\tau$ is of the form $\varphi?$, $M \models_X \varphi$ in the sense given later in this definition, and $X \subseteq Y$;

TDL-$\otimes$: $\tau$ is of the form $\tau_1 \otimes \tau_2$ and $X = X_1 \cup X_2$ for some $X_1$ and $X_2$ such that $M \models_{X_1 \to Y} \tau_1$ and $M \models_{X_2 \to Y} \tau_2$;

TDL-$\cap$: $\tau$ is of the form $\tau_1 \cap \tau_2$, $M \models_{X \to Y} \tau_1$ and $M \models_{X \to Y} \tau_2$;

TDL-concat: $\tau$ is of the form $\tau_1; \tau_2$ and there exists a team $Z$ such that $M \models_{X \to Z} \tau_1$ and $M \models_{Z \to Y} \tau_2$.

Similarly, let $\varphi$ be a formula and let $X$ be a team with domain $\mathrm{Var}$. Then we say that $X$ satisfies $\varphi$ in $M$, and we write $M \models_X \varphi$, if and only if

TDL-lit: $\varphi$ is a first-order literal and $M \models_s \varphi$ in the usual first-order sense for all $s \in X$;

TDL-dep: $\varphi$ is a dependence atom $=(t_1, \ldots, t_n)$ and any two $s, s' \in X$ which assign the same values to $t_1 \ldots t_{n-1}$ also assign the same value to $t_n$;

TDL-$\vee$: $\varphi$ is of the form $\varphi_1 \vee \varphi_2$ and $M \models_X \varphi_1$ or $M \models_X \varphi_2$;

TDL-$\wedge$: $\varphi$ is of the form $\varphi_1 \wedge \varphi_2$, $M \models_X \varphi_1$ and $M \models_X \varphi_2$;

TDL-$\diamond$: $\varphi$ is of the form $\langle\tau\rangle\psi$ and there exists a $Y$ such that $M \models_{X \to Y} \tau$ and $M \models_Y \psi$.

As the next theorem shows, in this semantics formulas and transitions are interpreted in terms of trumps and transition systems:

Theorem 7. For all Transition Dependence Logic formulas $\varphi$, all models $M$ and all teams $X$ and $Y$, we have that

Downwards Closure: If $M \models_X \varphi$ and $Y \subseteq X$ then $M \models_Y \varphi$;

Empty Team Property: $M \models_\emptyset \varphi$.

Furthermore, for all Transition Dependence Logic transition terms $\tau$, all models $M$ and all teams $X$, $Y$ and $Z$,

Downwards Closure: If $M \models_{X \to Y} \tau$ and $Z \subseteq X$ then $M \models_{Z \to Y} \tau$;

Monotonicity: If $M \models_{X \to Y} \tau$ and $Y \subseteq Z$ then $M \models_{X \to Z} \tau$;

Non-creation: For all $Y$, $M \models_{\emptyset \to Y} \tau$;

Non-triviality: If $X \neq \emptyset$ then it is not the case that $M \models_{X \to \emptyset} \tau$.

Proof. The proof is by structural induction over $\varphi$ and $\tau$.
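The team operations behind the TDL-$\exists$, TDL-$\forall$ and TDL-dep clauses can be tried out on a tiny finite model. The following sketch is illustrative only and makes several assumptions that are not in the text: assignments are Python dicts frozen into hashable sets of items, dependence atoms are restricted to variables rather than arbitrary terms, and the helper names and the two-element domain are hypothetical.

```python
from itertools import product


def freeze(assignment):
    """Hashable form of an assignment (a dict from variables to elements)."""
    return frozenset(assignment.items())


def supplement(team, var, choice):
    """X[F/v]: in each assignment s, set var to choice(s)."""
    return {freeze({**dict(s), var: choice(dict(s))}) for s in team}


def duplicate(team, var, domain):
    """X[M/v]: in each assignment, set var to every element of the domain."""
    return {freeze({**dict(s), var: a}) for s in team for a in domain}


def dependence_atom(team, variables):
    """TDL-dep for =(v1, ..., vn) over variables: assignments that agree on
    v1 ... v(n-1) must also agree on vn."""
    *inputs, output = variables
    seen = {}
    for s in map(dict, team):
        key = tuple(s[v] for v in inputs)
        if seen.setdefault(key, s[output]) != s[output]:
            return False
    return True


def exists_transition(team, var, domain, target):
    """TDL-exists: is there a choice function F with X[F/v] inside target?
    Brute force over all functions from the team to the domain."""
    members = list(team)
    for values in product(sorted(domain), repeat=len(members)):
        image = {freeze({**dict(s), var: a})
                 for s, a in zip(members, values)}
        if image <= target:
            return True
    return False


# Hypothetical example over the domain {0, 1} with variables x and y.
domain = {0, 1}
start = {freeze({"x": 0, "y": 0})}
all_x = duplicate(start, "x", domain)                 # x ranges over the domain
copied = supplement(all_x, "y", lambda s: s["x"])     # y := x
```

Here `copied` satisfies the atom $=(x, y)$ but not the constancy atom $=(y)$, and the transition from `all_x` to `copied` is allowed by $\exists y$. The empty team satisfies every dependence atom and can transition to any team, in line with the Empty Team Property and Non-creation.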
Also, it is not difficult to see, on the basis of the results of the previous section, that this new variant of Dependence Logic is equivalent to the usual one:

Theorem 8. For every Dependence Logic formula $\varphi$ there exists a Transition Dependence Logic transition term $\tau_\varphi$ such that

$M \models_X \varphi \Leftrightarrow \exists Y \text{ s.t. } M \models_{X \to Y} \tau_\varphi \Leftrightarrow M \models_X \langle\tau_\varphi\rangle\top$

for all first-order models $M$ and teams $X$.

Proof. $\tau_\varphi$ is defined by structural induction on $\varphi$, as follows:
1. If $\varphi$ is a first-order literal or a dependence atom then $\tau_\varphi = \varphi?$;
2. If $\varphi$ is $\varphi_1 \vee \varphi_2$ then $\tau_\varphi = \tau_{\varphi_1} \otimes \tau_{\varphi_2}$;
3. If $\varphi$ is $\varphi_1 \wedge \varphi_2$ then $\tau_\varphi = \tau_{\varphi_1} \cap \tau_{\varphi_2}$;
4. If $\varphi$ is $\exists v \psi$ then $\tau_\varphi = \exists v; \tau_\psi$;
5. If $\varphi$ is $\forall v \psi$ then $\tau_\varphi = \forall v; \tau_\psi$.
It is then trivial to verify, again by induction on $\varphi$, that $M \models_X \varphi$ if and only if $M \models_X \langle\tau_\varphi\rangle\top$, as required.

This representation result associates Dependence Logic formulas to Transition Dependence Logic transition terms. This fact highlights the dynamical nature of Dependence Logic operators, which we discussed in the previous subsection: in this framework, quantifiers describe transformations of teams, the Dependence Logic connectives are operations over games, and the literals are interpreted as tests. In fact, one might wonder what the purpose of Transition Dependence Logic formulas is: could we do away with them altogether, and develop a variant of Transition Dependence Logic in which all formulas are transitions?

Later, we will explore this idea further; but first, let us verify that Transition Dependence Logic is no more expressive than Dependence Logic.

Theorem 9. For every Transition Dependence Logic formula $\varphi$ there exists a Dependence Logic formula $T(\varphi)$ such that

$M \models_X \varphi \Leftrightarrow M \models_X T(\varphi)$

for all first-order models $M$ and teams $X$. Furthermore, for every Transition Dependence Logic transition term $\tau$ and Dependence Logic formula $\theta$ there is a Dependence Logic formula $U(\tau, \theta)$ such that

$M \models_X U(\tau, \theta) \Leftrightarrow \exists Y \text{ s.t. } M \models_{X \to Y} \tau \text{ and } M \models_Y \theta$,

again for all first-order models $M$ and teams $X$.

Proof.
By structural induction over $\varphi$ and $\tau$.

However, in a sense, Transition Dependence Logic allows one to consider subtler distinctions than Dependence Logic does. The formula $\forall x \exists y(=(y, f(x)) \wedge Pxy)$, for example, could be translated as any of
• $\langle\forall x; \exists y\rangle(=(y, f(x)) \wedge Pxy)$;
• $\langle\forall x; \exists y\rangle\langle=(y, f(x))?\rangle Pxy$;
• $\langle\forall x; \exists y\rangle\langle Pxy?\rangle =(y, f(x))$;
• $\langle\forall x; \exists y\rangle\langle(Pxy?) \cap (=(y, f(x))?)\rangle\top$.

The intended interpretations of these formulas are rather different, even though they happen to be satisfied by the same teams; for this reason, Transition Dependence Logic may be thought of as a proper refinement of Dependence Logic even though it has exactly the same expressive power.

3.3 Dynamic Predicate Logic

Dynamic Semantics is the name given to a family of semantical frameworks which subscribe to the following principle (Groenendijk and Stokhof 1991):

The meaning of a sentence does not lie in its truth conditions, but rather in the way it changes (the representation of) the information of the interpreter.

In various forms, this intuition can be found prefigured in some of the later work of Ludwig Wittgenstein, as well as in the research of philosophers of language such as Austin, Grice, Searle, Strawson and others (Dekker 2008); but its formal development can be traced back to the work of Groenendijk and Stokhof on the proper treatment of pronouns in formal linguistics (Groenendijk and Stokhof 1991).

We will not discuss here the formal definition of Groenendijk and Stokhof's formalism or its linguistic applications. All that is relevant for our purposes is that, according to it, formulas are interpreted as transitions from assignments to assignments: for example, the satisfaction conditions for conjunctions are given by

DPL-$\wedge$: $\varphi$ is of the form $\psi_1 \wedge \psi_2$ and there exists an $h$ such that $M \models_{s \to h} \psi_1$ and $M \models_{h \to s'} \psi_2$;

The similarity between this semantics and our semantics for transition terms should be evident.
Hence, it seems natural to ask whether we can adopt, for a suitable variant of Dependence Logic, the following variant of Groenendijk and Stokhof's motto:

The meaning of a formula does not lie in its satisfaction conditions, but rather in the team transitions it allows.

In the next section, we will make use of this intuition to develop another, terser version of Dependence Logic; and finally, we will discuss some implications of this new version for the further development and the possible applications of this interesting logical formalism.

3.4 Dynamic Dependence Logic

We will now develop a formula-free variant of Transition Dependence Logic, along the lines of Groenendijk and Stokhof's Dynamic Predicate Logic.

Definition 3.3 (Dynamic Dependence Logic - Syntax). Let $\Sigma$ be a first-order signature. The set of all formulas of Dynamic Dependence Logic over $\Sigma$ is given by the rules

$\tau ::= R\vec{t} \mid \neg R\vec{t} \mid =(t_1, \ldots, t_n) \mid \exists v \mid \forall v \mid \tau \otimes \tau \mid \tau \cap \tau \mid \tau; \tau$

where, as usual, $R$ ranges over all relation symbols of our signature, $\vec{t}$ ranges over all tuples of terms of the required lengths, $n$ ranges over $\mathbb{N}$, $t_1 \ldots t_n$ range over all terms, and $v$ ranges over $\mathrm{Var}$.

The semantical rules associated to this language are precisely as one would expect:

Definition 3.4 (Dynamic Dependence Logic - Semantics). Let $M$ be a first-order model, let $\tau$ be a Dynamic Dependence Logic formula over the signature of $M$, and let $X$ and $Y$ be two teams over $M$ with domain $\mathrm{Var}$. Then we say that $\tau$ allows the transition $X \to Y$ in $M$, and we write $M \models_{X \to Y} \tau$, if and only if

DDL-lit: $\tau$ is a first-order literal, $M \models_s \tau$ in the usual first-order sense for all $s \in X$, and $X \subseteq Y$;

DDL-dep: $\tau$ is a dependence atom $=(t_1, \ldots, t_n)$, $X \subseteq Y$, and any two assignments $s, s' \in X$ which coincide over $t_1 \ldots$
$t_{n-1}$ also coincide over $t_n$;

DDL-$\exists$: $\tau$ is of the form $\exists v$ for some $v \in \mathrm{Var}$, and $X[F/v] \subseteq Y$ for some $F : X \to \mathrm{Dom}(M)$;

DDL-$\forall$: $\tau$ is of the form $\forall v$ for some $v \in \mathrm{Var}$, and $X[M/v] \subseteq Y$;

DDL-$\otimes$: $\tau$ is of the form $\tau_1 \otimes \tau_2$ and $X = X_1 \cup X_2$ for two teams $X_1$ and $X_2$ such that $M \models_{X_1 \to Y} \tau_1$ and $M \models_{X_2 \to Y} \tau_2$;

DDL-$\cap$: $\tau$ is of the form $\tau_1 \cap \tau_2$, $M \models_{X \to Y} \tau_1$ and $M \models_{X \to Y} \tau_2$;

DDL-concat: $\tau$ is of the form $\tau_1; \tau_2$, and there exists a $Z$ such that $M \models_{X \to Z} \tau_1$ and $M \models_{Z \to Y} \tau_2$.

A formula $\tau$ is said to be satisfied by a team $X$ in a model $M$ if and only if there exists a $Y$ such that $M \models_{X \to Y} \tau$; and if this is the case, we will write $M \models_X \tau$.

It is not difficult to see that Dynamic Dependence Logic is equivalent to Transition Dependence Logic (and, therefore, to Dependence Logic).

Proposition 8. Let $\varphi$ be a Dependence Logic formula. Then there exists a Dynamic Dependence Logic formula $\varphi'$ which is equivalent to it, in the sense that

$M \models_X \varphi \Leftrightarrow M \models_X \varphi' \Leftrightarrow \exists Y \text{ s.t. } M \models_{X \to Y} \varphi'$

for all suitable teams $X$ and models $M$.

Proposition 9. Let $\tau$ be a Dynamic Dependence Logic formula. Then there exists a Transition Dependence Logic transition term $\tau'$ such that

$M \models_{X \to Y} \tau \Leftrightarrow M \models_{X \to Y} \tau'$

for all suitable $X$, $Y$ and $M$, and hence such that

$M \models_X \tau \Leftrightarrow M \models_X \langle\tau'\rangle\top$.

Corollary 1. Dynamic Dependence Logic is equivalent to Transition Dependence Logic and to Dependence Logic.

Proof. Follows from the two previous results and from the equivalence between Dependence Logic and Transition Dependence Logic.

4 Further work

In this work, we established a connection between a variant of Dynamic Game Logic and Dependence Logic, and we used it as the basis for the development of variants of Dependence Logic in which it is possible to talk directly about transitions from teams to teams. This suggests a new perspective on Dependence Logic and Team Semantics, one which allows us to study them as a special kind of algebra of nondeterministic transitions between relations.
One of the main problems that is now open is whether it is possible to axiomatize these algebras, in the same sense in which Allen Mann (2009) offers an axiomatization of the algebra of trumps corresponding to IF Logic (or, equivalently, to Dependence Logic).

Furthermore, we might want to consider different choices of connectives, like for example ones related to the theory of database transactions. The investigation of the relationships between the resulting formalisms is a natural continuation of the currently ongoing work on the relationship between various extensions of Dependence Logic, and promises to be of great utility for the further development of this fascinating line of research.

Acknowledgements

The author wishes to thank Johan van Benthem and Jouko Väänänen for a number of useful suggestions and insights. Furthermore, he wishes to thank the reviewers for a number of highly useful suggestions and comments.

References

J. van Benthem. Logic games are complete for game logics. Studia Logica, 75:183–203, 2003.

J. van Benthem, S. Ghosh, and F. Liu. Modelling simultaneous games in dynamic logic. Synthese, 165:247–268, 2008.

P. Dekker. A guide to dynamic semantics. ILLC Prepublication Series, (PP-2008-42), 2008.

F. Engström. Generalized quantifiers in dependence logic. Journal of Logic, Language and Information, 21(3):299–324, 2012.

P. Galliani. Multivalued dependence logic and independence logic. In Non-classical Modal and Predicate Logics, 2011.

P. Galliani. Inclusion and exclusion dependencies in team semantics: On some logics of imperfect information. Annals of Pure and Applied Logic, 163(1):68–84, 2012.

E. Grädel and J. Väänänen. Dependence and independence. Studia Logica, 101(2):399–410, 2013.

J. Groenendijk and M. Stokhof. Dynamic Predicate Logic. Linguistics and Philosophy, 14(1):39–100, 1991.

W. Hodges. Compositional Semantics for a Language of Imperfect Information.
Journal of the Interest Group in Pure and Applied Logics, 5(4):539–563, 1997.

J. Kontinen and V. Nurmi. Team logic and second-order logic. In H. Ono, M. Kanazawa, and R. de Queiroz, editors, Logic, Language, Information and Computation, volume 5514 of Lecture Notes in Computer Science, pages 230–241. Springer Berlin / Heidelberg, 2009.

J. Kontinen and J. Väänänen. On definability in dependence logic. Journal of Logic, Language and Information, 18(3):317–332, 2009.

A. L. Mann. Independence-friendly cylindric set algebras. Logic Journal of IGPL, 17(6):719–754, 2009.

R. Parikh. The logic of games and its applications. In Selected papers of the international conference on "foundations of computation theory" on Topics in the theory of computation, pages 111–139, New York, NY, USA, 1985. Elsevier North-Holland, Inc.

J. Väänänen. Dependence Logic. Cambridge University Press, 2007a.

J. Väänänen. Team Logic. In J. van Benthem, D. Gabbay, and B. Löwe, editors, Interactive Logic. Selected Papers from the 7th Augustus de Morgan Workshop, pages 281–302. Amsterdam University Press, 2007b.

Dynamic Measure Logic
Tamar Lando
Columbia University
Abstract

This paper brings together Dana Scott's measure-based semantics for the propositional modal logic S4, and recent work in Dynamic Topological Logic. In a series of recent talks, Scott showed that the language of S4 can be interpreted in the Lebesgue measure algebra, $\mathcal{M}$, or algebra of Borel subsets of the real interval, [0, 1], modulo sets of measure zero. Conjunctions, disjunctions and negations are interpreted via the Boolean structure of the algebra, and we add an interior operator on $\mathcal{M}$ that interprets the $\Box$-modality. In this paper we show how to extend this measure-based semantics to the bimodal logic S4C. S4C is a dynamic topological logic that is interpreted in 'dynamic topological systems,' or topological spaces together with a continuous function acting on the space. We extend Scott's measure-based semantics to this bimodal logic by defining a class of operators on the algebra $\mathcal{M}$, which we call O-operators and which take the place of continuous functions in the topological semantics for S4C. The main result of the paper is that S4C is complete for the Lebesgue measure algebra. A strengthening of this result, also proved here, is that there is a single measure-based model in which all non-theorems of S4C are refuted.

1 Introduction

Kripke models for normal modal logics, consisting of a set of possible worlds together with a binary accessibility relation, are, by now, widely familiar. But long before Kripke semantics became standard, Tarski showed that the propositional modal logic S4 can be interpreted in topological spaces. In the topological semantics for S4, a topological space is fixed, and each propositional variable, $p$, is assigned an arbitrary subset of the space: the set of points where $p$ is true.
Conjunctions, disjunctions and negations are interpreted as set-theoretic intersections, unions and complements (thus, e.g., ϕ ∧ ψ is true at all points in the intersection of the set of points where ϕ is true and the set of points where ψ is true). The □-modality of S4 is interpreted via the topological interior: □ϕ is true at any point in the topological interior of the set of points at which ϕ is true.

In this semantics, the logic S4 can be seen as describing topological spaces. Indeed, with the topological semantics it became possible to ask not just whether S4 is complete for the set of topological validities (formulas valid in every topological space), but also whether S4 is complete for any given topological space. The culmination of Tarski’s work in this area was a very strong completeness result. In 1944, Tarski and McKinsey proved that S4 is complete for any dense-in-itself metric space. One particularly important case was the real line, R, and as the topological semantics received renewed interest in recent years, more streamlined proofs of Tarski’s result for this special case emerged (in, e.g., Aiello 2003, Bezhanishvili and Gehrke 2005, Mints and Zhang 2005).

The real line, however, can be investigated not just from a topological point of view, but from a measure-theoretic point of view. Here, the measure we have in mind is the usual Lebesgue measure on the reals. In the last several years Dana Scott introduced a new probabilistic or measure-based semantics for S4 that is built around Lebesgue measure on the reals and is in some ways closely related to Tarski’s older topological semantics (see Scott 2009).

Scott’s semantics is essentially algebraic: formulas are interpreted in the Lebesgue measure algebra, or the σ-algebra of Borel subsets of the real interval [0, 1], modulo sets of measure zero (henceforth, “null sets”). We denote this algebra by M. Thus elements of M are equivalence classes of Borel sets.
In Scott’s semantics, each propositional variable is assigned to some element of M. Conjunctions, disjunctions and negations are assigned to meets, joins and complements in the algebra, respectively. In order to interpret the S4 □-modality, we add to the algebra an “interior” operator (defined below), which we construct from the collection of open elements in the algebra, or elements that have an open representative. Unlike the Kripke or topological semantics, there is no notion here of truth at a point (or at a “world”). In (Fernandez-Duque 2010) and (Lando 2012) it was shown that S4 is complete for the Lebesgue measure algebra.

The introduction of a measure-based semantics for S4 raises a host of questions that are, at this point, entirely unexplored. Among them: What about natural extensions of S4? Can we give a measure-based semantics not just for S4 but for some of its extensions that have well-known topological interpretations?

This paper focuses on a family of logics called dynamic topological logics. These logics were investigated over the last fifteen years, in an attempt to describe “dynamic topological systems” by means of modal logic. A dynamic topological system is a pair ⟨X, f⟩, where X is a topological space and f is a continuous function on X. We can think of f as moving points in X in discrete units of time. Thus in the first moment, x is mapped to f(x), then to f(f(x)), etc. The most basic dynamic topological logic is S4C. In addition to the S4 □-modality, it has a temporal modality, which we denote by ○. Intuitively, we understand the formula ○p as saying that at the “next moment in time,” p will be true. Thus we put: x ∈ V(○p) iff f(x) ∈ V(p).

In (Kremer and Mints 2005) and (Slavnov 2003) it was shown that S4C is incomplete for the real line, R.
However, in (Slavnov 2005) it was shown that S4C is complete for Euclidean spaces of arbitrarily large finite dimension, and in (Fernandez-Duque 2005) it was shown that S4C is complete for R².

The aim of this paper is to give a measure-based semantics for the logic S4C, along the lines of Scott’s semantics for S4. Again, formulas will be assigned to some element of the Lebesgue measure algebra, M. But what about the dynamical aspect, i.e., the interpretation of the ○-modality? We show that there is a very natural way of interpreting the ○-modality via operators on the algebra M that take the place of continuous functions in the topological semantics. These operators can be viewed as transforming the algebra in discrete units of time. Thus one element is sent to another in the first instance, then to another in the second instance, and so on. The operators we use to interpret S4C are O-operators: ones that take “open” elements in the algebra to open elements (defined below). But there are obvious extensions of this idea: for example, to interpret the logic of homeomorphisms on topological spaces, one need only look at automorphisms of the algebra M.

Adopting a measure-based semantics for S4C brings with it certain advantages. Not only do we reap the probabilistic features that come with Scott’s semantics for S4, but the curious dimensional asymmetry that appears in the topological semantics (where S4C is incomplete for R but complete for R²) disappears in the measure-based semantics. Our main result is that the logic S4C is complete for the Lebesgue measure algebra. A strengthening of this result, also proved here, is that S4C is complete for a single model of the Lebesgue measure algebra. Due to well-known results by Oxtoby, this algebra is isomorphic to the algebra generated by Euclidean space of arbitrary dimension. In other words, S4C is complete for the reduced measure algebra generated by any Euclidean space.
2 Topological semantics for S4C

Let the language L_{□,○} consist of a countable set, PV = {p_n | n ∈ N}, of propositional variables, and be closed under the binary connectives ∧, ∨, →, ↔, the unary operators ¬, □, ◇, and a unary modal operator ○ (thus, L_{□,○} is the language of propositional S4 enriched with a new modality, ○).

Definition 2.1. A dynamic topological space is a pair ⟨X, f⟩, where X is a topological space and f : X → X is a continuous function on X. A dynamic topological model is a triple, ⟨X, f, V⟩, where X is a topological space, f : X → X is a continuous function, and V : PV → P(X) is a valuation assigning to each propositional variable a subset of X. We say that ⟨X, f, V⟩ is a model over X.

We extend V to the set of all formulas in L_{□,○} by means of the following recursive clauses:

V(ϕ ∨ ψ) = V(ϕ) ∪ V(ψ)
V(¬ϕ) = X − V(ϕ)
V(□ϕ) = Int(V(ϕ))
V(○ϕ) = f⁻¹(V(ϕ))

where ‘Int’ denotes the topological interior.

Let N = ⟨X, f, V⟩ be a dynamic topological model. We say that a formula ϕ is satisfied at a point x ∈ X if x ∈ V(ϕ), and we write N, x ⊨ ϕ. We say ϕ is true in N (N ⊨ ϕ) if N, x ⊨ ϕ for each x ∈ X. We say ϕ is valid in X (⊨_X ϕ), if for any model N over X, we have N ⊨ ϕ. Finally, we say ϕ is topologically valid if it is valid in every topological space.

Definition 2.2. The logic S4C in the language L_{□,○} is given by the following axioms:

– the classical tautologies,
– S4 axioms for □,
(A1) ○(ϕ ∨ ψ) ↔ (○ϕ ∨ ○ψ),
(A2) (○¬ϕ) ↔ (¬○ϕ),
(A3) ○□ϕ → □○ϕ (the axiom of continuity),

and the rules of modus ponens and necessitation for both □ and ○.

Following (Kremer and Mints 2005), we use S4C both for this axiomatization and for the set of all formulas derivable from the axioms by the inference rules.

We close this section by listing the known completeness results for S4C in the topological semantics.

Theorem 2.3.
(Completeness) For any formula ϕ ∈ L_{□,○}, the following are equivalent:

(i) S4C ⊢ ϕ;
(ii) ϕ is topologically valid;
(iii) ϕ is true in any finite topological space;
(iv) ϕ is valid in Rⁿ for n ≥ 2.

Proof. The equivalence of (i)-(iii) was proved in (Artemov et al. 1997). The equivalence of (i) and (iv) was proved in (Fernandez-Duque 2005). This was a strengthening of a result proved in (Slavnov 2005).

Theorem 2.4. (Incompleteness for R) There exists ϕ ∈ L_{□,○} such that ϕ is valid in R, but ϕ is not topologically valid.

Proof. See (Kremer and Mints 2005) and (Slavnov 2003).

3 Kripke semantics for S4C

In this section we show that the logic S4C can also be interpreted in the more familiar setting of Kripke frames. It is well known that the logic S4 (which does not include the ‘temporal’ modality, ○) is interpreted in transitive, reflexive Kripke frames, and that such frames just are topological spaces of a certain kind. It follows that the Kripke semantics for S4 is just a special case of the topological semantics for S4. In this section, we show that the logic S4C can be interpreted in transitive, reflexive Kripke frames with some additional ‘dynamic’ structure, and, again, that Kripke semantics for S4C is a special case of the more general topological semantics for S4C. Henceforth, we assume that Kripke frames are both transitive and reflexive.

Definition 3.1. A dynamic Kripke frame is a triple ⟨W, R, G⟩ where W is a set, R is a reflexive, transitive relation on W, and G : W → W is a function that is R-monotone in the following sense: for any u, v ∈ W, if uRv, then G(u) R G(v).

Definition 3.2. A dynamic Kripke model is a pair ⟨F, V⟩ where F = ⟨W, R, G⟩ is a dynamic Kripke frame and V : PV → P(W) is a valuation assigning to each propositional variable an arbitrary subset of W.
We extend V to the set of all formulas in L_{□,○} by the following recursive clauses:

V(ϕ ∨ ψ) = V(ϕ) ∪ V(ψ)
V(¬ϕ) = W − V(ϕ)
V(○ϕ) = G⁻¹(V(ϕ))
V(□ϕ) = {w ∈ W | v ∈ V(ϕ) for all v ∈ W such that wRv}.

Given a dynamic Kripke frame K = ⟨W, R, G⟩, we can impose a topology on W via the accessibility relation R. We define the open subsets of W as those subsets that are upward closed under R:

(*) O ⊆ W is open iff x ∈ O and xRy implies y ∈ O.

Recall that an Alexandroff topology is a topological space in which arbitrary intersections of open sets are open. The reader can verify that the collection of open subsets of W includes the entire space, the empty set, and is closed under arbitrary intersections and unions. Hence, viewing ⟨W, R⟩ as a topological space, the space is Alexandroff.

Going in the other direction, if X is an Alexandroff topology, we can define a relation R on X by:

(@) xRy iff x is a point of closure of {y}.

(Equivalently, y belongs to every open set containing x.) Clearly R is reflexive. To see that R is transitive, suppose that xRy and yRz. Let O be an open set containing x. Then since x is a point of closure for {y}, y ∈ O. But since y is a point of closure for {z}, z ∈ O. So x is a point of closure for {z} and xRz.

So far, we have shown that static Kripke frames, ⟨W, R⟩, correspond to Alexandroff topologies. But what about the dynamical aspect? Here we invite the reader to verify that R-monotonicity of the function G is equivalent to continuity of G in the topological setting. It follows that dynamic Kripke frames are just dynamic Alexandroff topologies.

In view of the fact that every finite topology is Alexandroff (if X is finite, then there are only finitely many open subsets of X), we have shown that finite topologies are just finite Kripke frames. This result, together with Theorem 2.3 (iii), gives the following completeness theorem for Kripke semantics:

Lemma 3.3.
For any formula ϕ ∈ L_{□,○}, the following are equivalent:

(i) S4C ⊢ ϕ;
(ii) ϕ is true in any finite Kripke frame (= finite topological space).

In what follows, it will be useful to consider not just arbitrary finite Kripke frames, but frames that carry some additional structure. The notion we are after is that of a stratified dynamic Kripke frame, introduced by Slavnov (2005). We recall his definitions below.

Definition 3.4. Let K = ⟨W, R, G⟩ be a dynamic Kripke frame. A cone in K is any set U_v = {w ∈ W | vRw} for some v ∈ W. We say that v is a root of U_v.

Note in particular that any cone, U_v, in K is an open subset of W; indeed, it is the smallest open subset containing v.

Definition 3.5. Let K = ⟨U, R, G⟩ be a finite dynamic Kripke frame. We say that K is stratified if there is a sequence ⟨U_1, …, U_n⟩ of pairwise disjoint cones in K with roots u_1, …, u_n respectively, such that U = ⋃_k U_k; G(u_k) = u_{k+1} for k < n; and G is injective. We say the stratified Kripke frame has depth n and (with slight abuse of notation) we call u_1 the root of the stratified frame.

Note that it follows from R-monotonicity of G that G(U_k) ⊆ U_{k+1}, for k < n.

Definition 3.6. Define the function CD (“circle depth”) on the set of all formulas in L_{□,○} inductively, as follows.

CD(p) = 0 for any propositional variable p;
CD(ϕ ∨ ψ) = max {CD(ϕ), CD(ψ)};
CD(¬ϕ) = CD(ϕ);
CD(□ϕ) = CD(ϕ);
CD(○ϕ) = 1 + CD(ϕ).

We also refer to CD(ϕ) as the ○-depth of ϕ.

Lemma 3.7. Suppose the formula ϕ is not a theorem of S4C, and CD(ϕ) = n. Then there is a stratified finite dynamic Kripke frame K with depth n + 1 such that ϕ is refuted at the root of K.

Proof. The proof is by Lemma 3.3 and by a method of ‘disjointizing’ finite Kripke frames (for the details, see Slavnov 2005).

4 Algebraic semantics for S4C

We saw that the topological semantics for S4C is a generalization of the Kripke semantics. Can we generalize further?
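Before turning to algebras, the claim of Section 3 that R-monotonicity of G amounts to continuity in the topology (*) can be checked by brute force on a small frame. The following is only a sketch under stated assumptions: the three-point frame `W`, `R` and all function names are illustrative choices, not taken from the paper. It enumerates every map G : W → W, compares the two conditions, and also tests the continuity axiom (A3) in the form G⁻¹(□S) ⊆ □G⁻¹(S) for monotone G.

```python
from itertools import combinations, product

# Hypothetical three-point frame: 0 sees 1 and 2; R is reflexive and transitive.
W = (0, 1, 2)
R = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2)}

def subsets(xs):
    """All subsets of xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_open(S):
    """Condition (*): S is open iff it is upward closed under R."""
    return all(y in S for x in S for y in W if (x, y) in R)

def preimage(G, S):
    """G^{-1}(S), which is also the clause for V(next phi)."""
    return frozenset(w for w in W if G[w] in S)

def is_monotone(G):
    """R-monotonicity: uRv implies G(u) R G(v)."""
    return all((G[u], G[v]) in R for (u, v) in R)

def is_continuous(G):
    """Continuity: the preimage of every open set is open."""
    return all(is_open(preimage(G, S)) for S in subsets(W) if is_open(S))

def box(S):
    """Kripke clause for the box: all R-successors lie in S."""
    return frozenset(w for w in W if all(v in S for v in W if (w, v) in R))

# Every function G : W -> W, encoded as a dict.
ALL_MAPS = [dict(zip(W, values)) for values in product(W, repeat=len(W))]

# Monotone and continuous maps coincide on this frame.
agree = all(is_monotone(G) == is_continuous(G) for G in ALL_MAPS)

# (A3): next-box is included in box-next, for every monotone G and every set S.
a3_valid = all(
    preimage(G, box(S)) <= box(preimage(G, S))
    for G in ALL_MAPS if is_monotone(G)
    for S in subsets(W)
)
```

On this frame both `agree` and `a3_valid` evaluate to `True`. A finite check of this kind illustrates the equivalence but of course does not replace the general argument.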
Just as classical propositional logic is interpreted in Boolean algebras, we would like to interpret modal logics algebraically. Tarski and McKinsey showed that this can be done for the logic S4, interpreting the □-modality as an interior operator on a Boolean algebra. In this section we show that the same can be done for the logic S4C, interpreting the ○-modality via O-operators on a Boolean algebra.

We denote the top and bottom elements of a Boolean algebra by 1 and 0, respectively.

Definition 4.1. A topological Boolean algebra is a Boolean algebra, A, together with an interior operator I on A that satisfies:

(I1) I(1) = 1;
(I2) Ia ≤ a;
(I3) IIa = Ia;
(I4) I(a ∧ b) = Ia ∧ Ib.

Example 4.2. The set of all subsets P(X) of a topological space X with set-theoretic meets, joins and complements and where the operator I is just the topological interior operator (for A ⊆ X, I(A) = Int(A)) is a topological Boolean algebra. More generally, any collection of subsets of X that is closed under finite intersections, unions, complements and topological interiors is a topological Boolean algebra. We call any such algebra a topological field of sets.

Suppose A is a topological Boolean algebra with interior operator I. We define the open elements in A as those elements a for which

Ia = a. (1)

Definition 4.3. Let A_1 and A_2 be topological Boolean algebras. We say h : A_1 → A_2 is a Boolean homomorphism if h preserves Boolean operations. We say h is a Boolean embedding if h is an injective Boolean homomorphism. We say h is a homomorphism if h preserves Boolean operations and the interior operator. We say h is an embedding if h is an injective homomorphism. Finally, we say A_1 and A_2 are isomorphic if there is an embedding from A_1 onto A_2.

Definition 4.4. Let A_1 and A_2 be topological Boolean algebras, and let h : A_1 → A_2. We say h is an O-map if

(i) h is a Boolean homomorphism.
(ii) For any c open in A_1, h(c) is open in A_2.
An O-operator is an O-map from a topological Boolean algebra to itself.

Lemma 4.5. Let A_1 and A_2 be topological Boolean algebras, with interior operators I_1 and I_2 respectively. Suppose that h : A_1 → A_2 is a Boolean homomorphism. Then h is an O-map iff for every a ∈ A_1,

h(I_1 a) ≤ I_2(h(a)). (2)

Proof. We let G_1 and G_2 denote the collection of open elements in A_1 and A_2 respectively. (⇒) Suppose h is an O-map. Then h(I_1 a) ∈ G_2 by Definition 4.4 (ii). Also, I_1 a ≤ a, so h(I_1 a) ≤ h(a) (h is a Boolean homomorphism, hence preserves order). Taking interiors on both sides, we have h(I_1 a) = I_2(h(I_1 a)) ≤ I_2(h(a)). (⇐) Suppose that for every a ∈ A_1, h(I_1 a) ≤ I_2(h(a)). Let c ∈ G_1. Then c = I_1 c, so h(c) = h(I_1 c) ≤ I_2(h(c)). But also, I_2(h(c)) ≤ h(c). So h(c) = I_2(h(c)) and h(c) ∈ G_2.

We are now in a position to state the algebraic semantics for the language L_{□,○}.

Definition 4.6. A dynamic algebra is a pair ⟨A, h⟩, where A is a topological Boolean algebra and h is an O-operator on A. A dynamic algebraic model is an ordered triple, ⟨A, h, V⟩, where A is a topological Boolean algebra, h is an O-operator on A, and V : PV → A is a valuation, assigning to each propositional variable p ∈ PV an element of A. We say ⟨A, h, V⟩ is a model over A.

We can extend V to the set of all formulas in L_{□,○} by the following recursive clauses:

V(ϕ ∨ ψ) = V(ϕ) ∨ V(ψ)
V(¬ϕ) = −V(ϕ)
V(□ϕ) = I V(ϕ)
V(○ϕ) = h V(ϕ).

(The remaining binary connectives, → and ↔, and unary operator, ◇, are defined in terms of the above in the usual way.)

We define standard validity relations. Let N = ⟨A, h, V⟩ be a dynamic algebraic model. We say ϕ is true in N (N ⊨ ϕ) iff V(ϕ) = 1. Otherwise, we say ϕ is refuted in N. We say ϕ is valid in A (⊨_A ϕ) if for any algebraic model N over A, N ⊨ ϕ. Finally, we let DML_A = {ϕ | ⊨_A ϕ} (i.e., the set of validities in A). In our terminology, soundness of S4C for A is the claim: S4C ⊆ DML_A. Completeness of S4C for A is the claim: DML_A ⊆ S4C.

Proposition 4.7.
(Soundness) Let A be a topological Boolean algebra. Then S4C ⊆ DML_A.

Proof. We have to show that the S4C axioms are valid in A and that the rules of inference preserve truth. To see that (A1) is valid, note that:

V(○(ϕ ∨ ψ)) = h(V(ϕ) ∨ V(ψ))
= h(V(ϕ)) ∨ h(V(ψ)) (h a Boolean homomorphism)
= V(○ϕ ∨ ○ψ)

Thus V(○(ϕ ∨ ψ) ↔ (○ϕ ∨ ○ψ)) = 1. Validity of (A2) is proved similarly. For (A3), note that:

V(○□ϕ) = h(I V(ϕ))
≤ I h(V(ϕ)) (by Lemma 4.5)
= V(□○ϕ)

So V(○□ϕ) ≤ V(□○ϕ) and V(○□ϕ → □○ϕ) = 1. This takes care of the special ○-modality axioms. The remaining axioms are valid by soundness of S4 for any topological Boolean algebra (see e.g. Rasiowa and Sikorski 1963). To see that necessitation for ○ preserves validity, suppose that ϕ is valid in A (i.e., for every algebraic model N = ⟨A, h, V⟩, we have V(ϕ) = 1). Then V(○ϕ) = h(V(ϕ)) = h(1) = 1, and ○ϕ is valid in A.

5 Reduced measure algebras

We would like to interpret S4C not just in arbitrary topological Boolean algebras, but in algebras carrying a probability measure, or ‘measure algebras.’ In this section we show how to construct such algebras from separable metric spaces together with a σ-finite Borel measure (defined below).

Definition 5.1. Let A be a Boolean σ-algebra, and let µ be a non-negative function on A. We say µ is a measure on A if for any countable collection {a_n} of disjoint elements in A, µ(⋁_n a_n) = Σ_n µ(a_n).

If µ is a measure on A, we say µ is positive if 0 is the only element at which µ takes the value 0. We say µ is σ-finite if 1 is the countable join of elements in A with finite measure.¹ Finally, we say µ is normalized if µ(1) = 1.

Definition 5.2. A measure algebra is a Boolean σ-algebra A together with a positive, σ-finite measure µ on A.

Lemma 5.3. Let A be a Boolean σ-algebra and let µ be a σ-finite measure on A. Then there is a normalized measure ν on A such that for all a ∈ A, µ(a) = 0 iff ν(a) = 0.

Proof.
Since µ is σ-finite, there exists a countable collection {s_n | n ≥ 1} ⊆ A such that ⋁_{n≥1} s_n = 1 and µ(s_n) < ∞ for each n ≥ 1. WLOG we can assume the s_n’s are pairwise disjoint (i.e., s_n ∧ s_m = 0 for m ≠ n). For any a ∈ A, let

ν(a) = Σ_{n≥1} 2^{−n} · µ(a ∧ s_n) / µ(s_n).

The reader can verify that ν has the desired properties.

¹ I.e., there is a countable collection of elements A_n in A such that ⋁_n A_n = 1 and µ(A_n) < ∞ for each n ∈ N.

In what follows, we show how to construct measure algebras from a topological space, X, together with a Borel measure on X. The relevant definition is given below.

Definition 5.4. Let X be a topological space. We say that µ is a Borel measure on X if µ is a measure defined on the σ-algebra of Borel subsets of X.²

Let X be a topological space, and let µ be a σ-finite Borel measure on X. We let Borel(X) denote the collection of Borel subsets of X and let Null_µ denote the collection of measure-zero Borel sets in X. Then Borel(X) is a Boolean σ-algebra, and Null_µ is a σ-ideal in Borel(X). We form the quotient algebra

M^µ_X = Borel(X)/Null_µ.

(Equivalently, we can define the equivalence relation ∼ on Borel sets in X by A ∼ B iff µ(A △ B) = 0, where △ denotes symmetric difference. Then M^µ_X is the algebra of equivalence classes under ∼.) Boolean operations in M^µ_X are defined in the usual way in terms of underlying sets:

|A| ∨ |B| = |A ∪ B|
|A| ∧ |B| = |A ∩ B|
−|A| = |X − A|

Lemma 5.5. There is a unique measure ν on M^µ_X such that ν|A| = µ(A) for all A in Borel(X). Moreover, the measure ν is σ-finite and positive.

Proof. See (Halmos 1959, pg. 79).

It follows from Lemma 5.5 that M^µ_X is a measure algebra. We follow Halmos (1959) in referring to any algebra of the form M^µ_X as a reduced measure algebra.³

² I.e., on the smallest σ-algebra containing all open subsets of X.

³ In fact, Halmos allows as ‘measure algebras’ only algebras with a normalized measure.
We relax this constraint here, in order to allow for the ‘reduced measure algebra’ generated by the entire real line together with the usual Lebesgue measure. This algebra is, of course, isomorphic to M^µ_X, where X is the real interval [0, 1], and µ is the usual Lebesgue measure on X. This amendment was suggested by the anonymous referee.

Lemma 5.6. Let X be a topological space and let µ be a σ-finite Borel measure on X. Then for any |A|, |B| ∈ M^µ_X, |A| ≤ |B| iff A ⊆ B ∪ N for some N ∈ Null_µ.

Proof. (⇒) If |A| ≤ |B|, then |A| ∧ |B| = |A|, or equivalently |A ∩ B| = |A|. This means that (A ∩ B) △ A ∈ Null_µ, so A − B ∈ Null_µ. But A ⊆ B ∪ (A − B). (⇐) Suppose A ⊆ B ∪ N for some N ∈ Null_µ. Then A ∩ (B ∪ N) = A, and |A| ∧ |B ∪ N| = |A|. But |B ∪ N| = |B|, so |A| ∧ |B| = |A|, and |A| ≤ |B|.

For the remainder of this section, let X be a separable metric space, and let µ be a σ-finite Borel measure on X. Where the intended measure is obvious, we will drop superscripts, writing M_X for M^µ_X.

So far we have seen only that M^µ_X is a Boolean algebra. In order to interpret the □-modality of S4C in M^µ_X, we need to construct an interior operator on this algebra (thus transforming M^µ_X into a topological Boolean algebra). We do this via the topological structure of the underlying space, X.

Let us say that an element a ∈ M^µ_X is open if a = |U| for some open set U ⊆ X. We denote the collection of open elements in M^µ_X by G^µ_X (or, dropping superscripts, G_X).

Proposition 5.7. G^µ_X is closed under (i) finite meets and (ii) arbitrary joins.

Proof. (i) This follows from the fact that open sets in X are closed under finite intersections.

(ii) Let {a_i | i ∈ I} be a collection of elements in G^µ_X. We need to show that sup {a_i | i ∈ I} exists and is equal to some element in G^µ_X. Since X is separable, there exists a countable dense set D in X. Let B be the collection of open balls in X centered at points in D with rational radius.
Then any open set in X can be written as a union of elements in B. Let S be the collection of elements B ∈ B such that |B| ≤ a_i for some i ∈ I. We claim that

sup {a_i | i ∈ I} = |⋃S|.

First, we need to show that |⋃S| is an upper bound on {a_i | i ∈ I}. For each i ∈ I, a_i = |U_i| for some open set U_i ⊆ X. Since U_i is open, it can be written as a union of elements in B. Moreover, each of these elements is a member of S (if B ∈ B and B ⊆ U_i, then |B| ≤ |U_i| = a_i). So U_i ⊆ ⋃S and a_i = |U_i| ≤ |⋃S|.

For the reverse inequality (≥) we need to show that if m is an upper bound on {a_i | i ∈ I}, then |⋃S| ≤ m. Let m = |M|. Note that S is countable (since S ⊆ B and B is countable). We can write S = {B_n | n ∈ N}. Then for each n ∈ N, there exists i ∈ I such that |B_n| ≤ a_i ≤ m. By Lemma 5.6, B_n ⊆ M ∪ N_n for some N_n ∈ Null_µ. Taking unions, ⋃_n B_n ⊆ M ∪ ⋃_n N_n, and ⋃_n N_n ∈ Null_µ. By Lemma 5.6, |⋃S| = |⋃_n B_n| ≤ m.

We can now define an interior operator, I^µ_X, on M^µ_X via the collection of open elements, G^µ_X. For any a ∈ M^µ_X, let

I^µ_X a = sup {c ∈ G^µ_X | c ≤ a}.

Lemma 5.8. I^µ_X is an interior operator.

Proof. For simplicity of notation, we let I denote I^µ_X and let G denote G^µ_X. Then (I1) follows from the fact that 1 ∈ G. (I2) follows from the fact that a is an upper bound on {c ∈ G | c ≤ a}. For (I3) note that by (I2), we have IIa ≤ Ia. Moreover, if c ∈ G with c ≤ a, then c ≤ Ia (since Ia is the supremum of all such c). Thus ⋁{c ∈ G | c ≤ a} ≤ ⋁{c ∈ G | c ≤ Ia}, and Ia ≤ IIa. For (I4) note that since a ∧ b ≤ a, we have I(a ∧ b) ≤ Ia. Similarly, I(a ∧ b) ≤ Ib, so I(a ∧ b) ≤ Ia ∧ Ib. For the reverse inequality, note that Ia ∧ Ib ≤ a (since Ia ≤ a), and similarly Ia ∧ Ib ≤ b. So Ia ∧ Ib ≤ a ∧ b. Moreover, Ia ∧ Ib ∈ G. It follows that Ia ∧ Ib ≤ I(a ∧ b).

Remark 5.9. Is the interior operator I^µ_X non-trivial? (That is, does there exist a ∈ M^µ_X such that Ia ≠ a?) This depends on the space, X, and the measure, µ.
If we let X be the real interval, [0, 1], and let µ be the Lebesgue measure on Borel subsets of X, then the interior operator is non-trivial (for the proof, see Lando 2012). But suppose µ is a non-standard measure on the real interval, [0, 1], defined by:

µ(A) = 1 if 1/2 ∈ A, and µ(A) = 0 otherwise.

Then Borel([0, 1])/Null_µ is the algebra 2, and both elements of this algebra are ‘open.’ So Ia = a for each element a in the algebra.

Remark 5.10. The operator I^µ_X does not coincide with taking topological interiors on underlying sets. More precisely, it is in general not the case that for A ⊆ X, I^µ_X(|A|) = |Int(A)|, where ‘Int(A)’ denotes the topological interior of A. Let X be the real interval [0, 1] with the usual topology, and let µ be Lebesgue measure restricted to measurable subsets of X. Consider the set X − Q and note that |X − Q| = |X| (Q is countable, hence has measure zero). We have:

I^µ_X(|X − Q|) = I^µ_X(|X|) = I^µ_X(1) = 1.

However, |Int(X − Q)| = |∅| = 0.

Remark 5.11. Note that an element a ∈ M^µ_X is open just in case I^µ_X a = a. Indeed, if a is open, then a ∈ {c ∈ G^µ_X | c ≤ a}. So a = sup {c ∈ G^µ_X | c ≤ a} = I^µ_X a. Also, if I^µ_X a = a, then a is the join of a collection of elements in G^µ_X, and so a ∈ G^µ_X. This shows that the definition of ‘open’ elements given above fits with the definition in (1).

In what follows, it will sometimes be convenient to express the interior operator I^µ_X in terms of underlying open sets, as in the following Lemma:

Lemma 5.12. Let A ⊆ X. Then I^µ_X(|A|) = |⋃{O open | |O| ≤ |A|}|.

Proof. By definition of I^µ_X, I^µ_X(|A|) = sup {c ∈ G^µ_X | c ≤ |A|}. Let B and D be as in the proof of Proposition 5.7, and let S be the collection of elements B ∈ B such that |B| ≤ |A|. Then by the proof of Proposition 5.7, I^µ_X(|A|) = |⋃S|. But now ⋃S = ⋃{O open | |O| ≤ |A|}. (This follows from the fact that any open set O ⊆ X can be written as a union of elements in B.) Thus, I^µ_X(|A|) = |⋃S| = |⋃{O open | |O| ≤ |A|}|.
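The formula of Lemma 5.12 and the phenomenon of Remark 5.10 can be made concrete on a finite toy example. The sketch below is an analogue only, under loudly stated assumptions: the three-point space, its topology (up-sets of a preorder) and the point masses are hypothetical, and the space is not metrizable, so it is not an instance of the separable metric setting of this section. Classes modulo null sets are represented by discarding the null points, the order is tested as in Lemma 5.6, and the interior is computed as in Lemma 5.12.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy data (not from the paper): a three-point space whose open
# sets are the up-sets of the preorder R, with a point mass on each point.
X = frozenset({0, 1, 2})
R = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2)}
MASS = {0: Fraction(1, 2), 1: Fraction(0), 2: Fraction(1, 2)}
NULL = frozenset(x for x in X if MASS[x] == 0)  # every null set is a subset of this

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

OPENS = [S for S in subsets(X) if all(y in S for x in S for y in X if (x, y) in R)]

def cls(A):
    """Canonical representative of the class |A|: drop the null points."""
    return frozenset(A) - NULL

def leq(A, B):
    """|A| <= |B| iff A is contained in B up to a null set (Lemma 5.6)."""
    return frozenset(A) - frozenset(B) <= NULL

def measure_interior(A):
    """I(|A|) computed as in Lemma 5.12: the class of the union of all open O with |O| <= |A|."""
    return cls(frozenset().union(*[O for O in OPENS if leq(O, A)]))

def topological_interior(A):
    """Int(A): the union of all open sets contained in A."""
    return frozenset().union(*[O for O in OPENS if O <= frozenset(A)])

A = frozenset({0, 2})
via_algebra = measure_interior(A)              # the top element, class of X
via_topology = cls(topological_interior(A))    # the class of {2} only
```

Here `via_algebra` is the top element of the quotient algebra while `via_topology` is the strictly smaller class of {2}, mirroring Remark 5.10. The operator is also non-trivial on this example, since `measure_interior({0})` is the bottom element although the class of {0} is not.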
We have shown that M^µ_X together with the operator I^µ_X is a topological Boolean algebra. Of course, for purposes of our semantics, we are interested in O-operators on M^µ_X. How do such maps arise? Unsurprisingly, a rich source of examples comes from continuous functions on the underlying topological space X. Let us spell this out more carefully.

Definition 5.13. Let X and Y be topological spaces and let µ and ν be Borel measures on X and Y respectively. We say f : X → Y is measure-zero preserving (MZP) if for any A ⊆ Y, ν(A) = 0 implies µ(f⁻¹(A)) = 0.

Lemma 5.14. Let X and Y be separable metric spaces, and let µ and ν be σ-finite Borel measures on X and Y respectively. Suppose B is a Borel subset of X with µ(B) = µ(X), and f : B → Y is measure-zero preserving and continuous. Define h^{|·|}_f : M^ν_Y → M^µ_X by

h^{|·|}_f(|A|) = |f⁻¹(A)|.

Then h^{|·|}_f is an O-map. In particular, if X = Y, then h^{|·|}_f is an O-operator.

Proof. First, we must show that h^{|·|}_f is well-defined.⁴ Indeed, if |A| = |B|, then ν(A △ B) = 0. And since f is MZP, µ(f⁻¹(A) △ f⁻¹(B)) = µ(f⁻¹(A △ B)) = 0. So f⁻¹(A) ∼ f⁻¹(B). This shows that h^{|·|}_f |A| is independent of the choice of representative, A. Furthermore, it is clear that h^{|·|}_f is a Boolean homomorphism. To see that it is an O-map, we need only show that if c ∈ G^ν_Y, then h^{|·|}_f(c) ∈ G^µ_X. But if c ∈ G^ν_Y then c = |U| for some open set U ⊆ Y. By continuity of f, f⁻¹(U) is open in B. So f⁻¹(U) = O ∩ B for some O open in X. So h^{|·|}_f(c) = |f⁻¹(U)| = |O| ∈ G^µ_X.

By the results of the previous section, we can now interpret the language of S4C in reduced measure algebras. In particular, we say an algebraic model ⟨A, h, V⟩ is a dynamic measure model if A = M^µ_X for some separable metric space X and a σ-finite Borel measure µ on X.

⁴ Note that by continuity of f, f⁻¹(A) is a Borel set in B, hence also a Borel set in X.
We are particularly interested in the reduced measure algebra generated by the real interval, [0, 1], together with the usual Lebesgue measure.

Definition 5.15. Let I be the real interval [0, 1] and let λ denote Lebesgue measure restricted to the Borel subsets of I. The Lebesgue measure algebra is the algebra M^λ_I.

Because of its central importance, we denote the Lebesgue measure algebra without subscripts or superscripts, by M. Furthermore, we denote the collection of open elements in M by G and the interior operator on M by I. As in Definition 4.6, we let DML_M = {ϕ | ⊨_M ϕ} (i.e., the set of validities in M). In our terminology, soundness of S4C for M is the claim: S4C ⊆ DML_M. Completeness of S4C for M is the claim: DML_M ⊆ S4C.

Proposition 5.16. (Soundness) S4C ⊆ DML_M.

Proof. Immediate from Proposition 4.7.

Remark 5.17. The algebra M is isomorphic to the algebra Leb([0, 1])/Null_µ where Leb([0, 1]) is the σ-algebra of Lebesgue-measurable subsets of the real interval [0, 1], and Null_µ is the σ-ideal of Lebesgue measure-zero sets. This follows from the fact that every Lebesgue-measurable set in [0, 1] differs from some Borel set by a set of measure zero.

6 Isomorphism between reduced measure algebras

In this section we use a well-known result of Oxtoby’s to show that any reduced measure algebra generated by a separable metric space with a σ-finite, nonatomic Borel measure is isomorphic to M. By Oxtoby’s result, we can think of M as the canonical separable measure algebra.

In the remainder of this section, let J denote the space [0, 1] − Q (with the usual metric topology), and let δ denote Lebesgue measure restricted to the Borel subsets of J.

Definition 6.1. A topological space X is topologically complete if X is homeomorphic to a complete metric space.

Definition 6.2. Let X be a topological space. A Borel measure µ on X is nonatomic if µ({x}) = 0 for each x ∈ X.

Theorem 6.3.
(Oxtoby, 1970) Let X be a topologically complete, separable metric space, and let µ be a normalized, nonatomic Borel measure on X. Then there exists a Borel set B ⊆ X and a function f : B → J such that µ(X − B) = 0 and f is a measure-preserving homeomorphism (where the measure on J is δ).

Proof. See (Oxtoby 1970).

Lemma 6.4. Suppose X and Y are separable metric spaces, and µ and ν are normalized Borel measures on X and Y respectively. If f : X → Y is a measure-preserving homeomorphism, then M^µ_X is isomorphic to M^ν_Y.⁵

Proof. For simplicity of notation, we drop superscripts, writing simply M_X, G_X, and I_X, etc. Let h^{|·|}_f : M_Y → M_X be defined by

h^{|·|}_f(|A|) = |f⁻¹(A)|.

This function is well-defined because f is MZP and continuous. (The first property ensures that h^{|·|}_f(|A|) is independent of representative A; the second ensures that f⁻¹(A) is Borel.) Clearly h^{|·|}_f is a Boolean homomorphism. We can define the mapping h^{|·|}_{f⁻¹} : M_X → M_Y by

h^{|·|}_{f⁻¹}(|A|) = |f(A)|.

Then h^{|·|}_f and h^{|·|}_{f⁻¹} are inverses, so h^{|·|}_f is bijective. We need to show that h^{|·|}_f preserves interiors, i.e., h^{|·|}_f(I_Y a) = I_X h^{|·|}_f(a). The inequality (≤) follows from the fact that h^{|·|}_f is an O-map (see Lemma 5.14). For the reverse inequality, we need to see that h^{|·|}_f(I_Y a) is an upper bound on {c ∈ G_X | c ≤ h^{|·|}_f(a)}. If c ∈ G_X, then h^{|·|}_{f⁻¹}(c) ∈ G_Y and if c ≤ h^{|·|}_f(a), then h^{|·|}_{f⁻¹}(c) ≤ h^{|·|}_{f⁻¹}(h^{|·|}_f(a)) = a. Thus h^{|·|}_{f⁻¹}(c) ≤ I_Y a, and c = h^{|·|}_f(h^{|·|}_{f⁻¹}(c)) ≤ h^{|·|}_f(I_Y a).

Corollary 6.5. Let X be a separable metric space, and let µ be a nonatomic σ-finite Borel measure on X with µ(X) > 0. Then, M^µ_X ≅ M.

Proof. By Lemma 5.3, we can assume that µ is normalized.⁶ Let X_comp be the completion of the metric space X. Clearly X_comp is separable.
We can extend the Borel measure µ on X to a Borel measure µ* on X_comp by letting µ*(S) = µ(S ∩ X) for any Borel set S in X_comp. The reader can convince himself that µ* is a normalized, nonatomic, σ-finite Borel measure on X_comp, and that M^{µ*}_{X_comp} ≅ M^µ_X. By Theorem 6.3, there exists a set B ⊆ X_comp and a function f : B → J such that µ*(B) = 1 and f is a measure-preserving homeomorphism. By Lemma 6.4, M_J ≅ M_B. We have:

M ≅ M_J ≅ M_B ≅ M^{µ*}_{X_comp} ≅ M^µ_X.

⁵ We can relax the conditions of the lemma, so that instead of requiring that f is measure-preserving, we require only that ν(f(S)) = 0 iff µ(S) = 0. In fact, we can further relax these conditions so that f : B → C, where B ⊆ X, C ⊆ Y, µ(B △ X) = 0, and ν(C △ Y) = 0. We prove the lemma as stated because only this weaker claim is needed for the proof of Corollary 6.5.

⁶ More explicitly: If µ is σ-finite, then by Lemma 5.3 there is a normalized Borel measure µ* on X such that µ*(S) = 0 iff µ(S) = 0 for each S ⊆ X. It follows that M^µ_X ≅ M^{µ*}_X (where the isomorphism is not, in general, measure-preserving).

7 Invariance maps

At this point, we have at our disposal two key results: completeness of S4C for finite stratified Kripke frames, and the isomorphism between M^µ_X and M for any separable metric space X and σ-finite, nonatomic Borel measure µ. Our aim in what follows will be to transfer completeness from finite stratified Kripke frames to the Lebesgue measure algebra, M. But how to do this?

We can view any topological space as a topological Boolean algebra, indeed as the topological field of all subsets of the space (see Example 4.2). Viewing the finite stratified Kripke frames in this way, what we need is 'truth-preserving' maps between the algebras generated by Kripke frames and M^µ_X, for appropriately chosen X and µ. The key notion here is that of a "dynamic embedding" (defined below) of one dynamic algebra into another.
Although our specific aim is to transfer truth from Kripke algebras to reduced measure algebras, the results we present here are more general and concern truth-preserving maps between arbitrary dynamic algebras.

Recall that a dynamic algebra is a pair ⟨A, h⟩, where A is a topological Boolean algebra, and h is an O-operator on A.

Definition 7.1. Let M_1 = ⟨A_1, h_1⟩ and M_2 = ⟨A_2, h_2⟩ be two dynamic algebras. We say a function h : M_1 → M_2 is a dynamic embedding if
(i) h is an embedding of A_1 into A_2;
(ii) h ∘ h_1 = h_2 ∘ h.

Lemma 7.2. Let M_1 = ⟨A_1, h_1, V_1⟩ and M_2 = ⟨A_2, h_2, V_2⟩ be two dynamic algebraic models. Suppose that h : ⟨A_1, h_1⟩ → ⟨A_2, h_2⟩ is a dynamic embedding, and for every propositional variable p, V_2(p) = h ∘ V_1(p). Then for any ϕ ∈ L_{□,○}, V_2(ϕ) = h ∘ V_1(ϕ).

Proof. By induction on the complexity of ϕ.

Corollary 7.3. Let M_1 = ⟨A_1, h_1, V_1⟩ and M_2 = ⟨A_2, h_2, V_2⟩ be two dynamic algebraic models. Suppose that h : ⟨A_1, h_1⟩ → ⟨A_2, h_2⟩ is a dynamic embedding, and for every propositional variable p, V_2(p) = h ∘ V_1(p). Then for any ϕ ∈ L_{□,○}, M_1 ⊨ ϕ iff M_2 ⊨ ϕ.

Proof.
M_2 ⊨ ϕ iff V_2(ϕ) = 1
iff h ∘ V_1(ϕ) = 1 (by Lemma 7.2)
iff V_1(ϕ) = 1 (since h is an embedding)
iff M_1 ⊨ ϕ.

Let ⟨X, F⟩ be a dynamic topological space and let A_X be the topological field of all subsets of X (see Example 4.2). We define the function h_F on A_X by h_F(S) = F^{-1}(S). It is not difficult to see that h_F is an O-operator. We say that ⟨A_X, h_F⟩ is the dynamic algebra generated by (or corresponding to) the dynamic topological space ⟨X, F⟩.

Our goal is to embed the dynamic algebras generated by finite dynamic Kripke frames into a dynamic measure algebra, ⟨M^µ_X, h⟩, where X is some appropriately chosen separable metric space and µ is a nonatomic, σ-finite Borel measure on X. In view of Corollary 7.3 and completeness for finite dynamic Kripke frames, this will give us completeness for the measure semantics.
The basic idea is to construct such embeddings via 'nice' maps on the underlying topological spaces. To this end, we introduce the following new definition:

Definition 7.4. Suppose X and Y are topological spaces, and µ is a Borel measure on X. Let γ : X → Y. We say γ has the M-property with respect to µ if for any subset S ⊆ Y:
(i) γ^{-1}(S) is Borel;
(ii) for any open set O ⊆ X, if γ^{-1}(S) ∩ O ≠ ∅ then µ(γ^{-1}(S) ∩ O) > 0.

Lemma 7.5. Suppose ⟨X, F⟩ is a dynamic topological space, where X is a separable metric space and F is measure-zero preserving, and let µ be a σ-finite Borel measure on X with µ(X) > 0. Suppose ⟨Y, G⟩ is a dynamic topological space, and ⟨A_Y, h_G⟩ is the corresponding dynamic algebra. Let B be a subset of X with µ(B) = µ(X), and suppose we have a map γ : B → Y that satisfies:
(i) γ is continuous, open and surjective;
(ii) γ ∘ F = G ∘ γ;
(iii) γ has the M-property with respect to µ.
Then the map Φ : ⟨A_Y, h_G⟩ → ⟨M^µ_X, h^{|·|}_F⟩ defined by Φ(S) = |γ^{-1}(S)| is a dynamic embedding.

Proof. By the fact that M^µ_X is isomorphic to M^µ_B, we can view Φ as a map from ⟨A_Y, h_G⟩ into ⟨M^µ_B, h^{|·|}_F⟩, where h^{|·|}_F is viewed as an operator on M^µ_B. Note that Φ is well-defined by the fact that γ satisfies clause (i) of the M-property. We need to show that (i) Φ is an embedding of ⟨A_Y, h_G⟩ into ⟨M^µ_B, h^{|·|}_F⟩, and (ii) Φ ∘ h_G = h^{|·|}_F ∘ Φ.

(i) Clearly Φ is a Boolean homomorphism. We prove that Φ is injective and preserves interiors.

• (Injectivity) Suppose, toward contradiction, that Φ(S_1) = Φ(S_2) and S_1 ≠ S_2. Then γ^{-1}(S_1) ∼ γ^{-1}(S_2), and S_1 △ S_2 ≠ ∅. Let y ∈ S_1 △ S_2. By surjectivity of γ, we have γ^{-1}(y) ≠ ∅. Moreover, µ(γ^{-1}(y)) > 0 (since γ has the M-property w.r.t. µ, and the entire space B is open). So µ(γ^{-1}(S_1) △ γ^{-1}(S_2)) = µ(γ^{-1}(S_1 △ S_2)) ≥ µ(γ^{-1}(y)) > 0, and hence γ^{-1}(S_1) ≁ γ^{-1}(S_2). ⊥.
• (Preservation of Interiors) For clarity, we will denote the topological interior in the spaces Y and B by Int_Y and Int_B respectively, and the interior operator on M^µ_B by I. Let S ⊆ Y. It follows from continuity and openness of γ : B → Y that γ^{-1}(Int_Y(S)) = Int_B(γ^{-1}(S)). Note that,
– Φ(Int_Y(S)) = |γ^{-1}(Int_Y(S))| = |Int_B(γ^{-1}(S))| = |⋃{O open in B | O ⊆ γ^{-1}(S)}|
– I(Φ(S)) = I|γ^{-1}(S)| = |⋃{O open in B | |O| ≤ |γ^{-1}(S)|}| (by Lemma 5.12)
Thus it is sufficient to show that for any open set O ⊆ B, O ⊆ γ^{-1}(S) iff |O| ≤ |γ^{-1}(S)|. The left-to-right direction is obvious. For the right-to-left direction, suppose (toward contradiction) that |O| ≤ |γ^{-1}(S)| but that O ⊈ γ^{-1}(S). Then O ⊆ γ^{-1}(S) ∪ N for some N ⊆ B with µ(N) = 0. Moreover, since O ⊈ γ^{-1}(S), there exists x ∈ O such that x ∉ γ^{-1}(S). Let y = γ(x). Then γ^{-1}(y) ∩ O ≠ ∅. Since γ has the M-property with respect to µ, it follows that µ(γ^{-1}(y) ∩ O) > 0. But γ^{-1}(y) ∩ O ⊆ N (since γ^{-1}(y) ∩ O ⊆ O ⊆ γ^{-1}(S) ∪ N, and γ^{-1}(y) ∩ γ^{-1}(S) = ∅). ⊥.

We've shown that Φ is an embedding of ⟨A_Y, h_G⟩ into ⟨M^µ_B, h^{|·|}_F⟩. In view of the isomorphism between M^µ_X and M^µ_B, we have shown that Φ is an embedding of ⟨A_Y, h_G⟩ into M^µ_X.

(ii) We know that γ ∘ F = G ∘ γ. Taking inverses, we have F^{-1} ∘ γ^{-1} = γ^{-1} ∘ G^{-1}. Now let S ⊆ Y. Then:

Φ ∘ h_G(S) = |γ^{-1}(G^{-1}(S))| = |F^{-1}(γ^{-1}(S))| = h^{|·|}_F ∘ Φ(S).

8 Completeness of S4C for the Lebesgue measure algebra

In this section we prove the main result of the paper: completeness of S4C for the Lebesgue measure algebra, M. Recall that completeness is the claim that DML_M ⊆ S4C. In fact, we prove the contrapositive: for any formula ϕ ∈ L_{□,○}, if ϕ ∉ S4C, then ϕ ∉ DML_M. Our strategy is as follows. If ϕ is a non-theorem of S4C, then by Lemma 3.7, ϕ is refuted in some finite stratified Kripke frame K = ⟨W, R, G⟩.
Viewing the frame algebraically (i.e., as a topological field of sets), we must construct a dynamic embedding Φ : ⟨A_W, h_G⟩ → ⟨M, h⟩, where ⟨A_W, h_G⟩ is the dynamic Kripke algebra generated by the dynamic Kripke frame K, and h is some O-operator on M. In view of the isomorphism between M and M^µ_X for any separable metric space X and nonatomic, σ-finite Borel measure µ on X with µ(X) > 0, it is enough to construct a dynamic embedding of the Kripke algebra into M^µ_X, for appropriately chosen X and µ.

The constructions in this section are a modification of the constructions introduced in (Slavnov 2005), where it is proved that S4C is complete for topological models in Euclidean spaces of arbitrarily large finite dimension. The modifications we make are measure-theoretic, and are needed to accommodate the new 'probabilistic' setting. We are very much indebted to Slavnov for his pioneering work in (Slavnov 2005).⁷

8.1 Outline of the proof

Let us spell out the plan for the proof a little more carefully. The needed ingredients are all set out in Lemma 7.5. Our first step will be to construct the dynamic topological space ⟨X, F⟩, where X is a separable metric space, and F is a measure-zero preserving, continuous function on X. We must also construct a measure µ on the Borel sets of X that is nonatomic and σ-finite, such that µ(X) > 0. We want to embed the Kripke algebra ⟨A_W, h_G⟩ into ⟨M^µ_X, h^{|·|}_F⟩, and to do this, we must construct a topological map γ : B → W, where B ⊆ X and µ(B) = µ(X), and γ satisfies the requirements of Lemma 7.5. In particular, we must ensure that (i) γ is open, continuous and surjective, (ii) γ ∘ F = G ∘ γ, and (iii) γ has the M-property with respect to µ.

In Section 8.2, we show how to construct the dynamic space ⟨X, F⟩ and the Borel measure µ on X. In Section 8.3, we construct the map γ : B → W, and show that it has the desired properties.
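
The plan of Section 8.1 can be condensed into one chain of maps. The display below is only a schematic summary in the notation of Lemma 7.5 and Corollary 6.5, not an additional result. It assumes that the operator written h on M is the transport of h^{|·|}_F along the isomorphism M^µ_X ≅ M, which the text leaves implicit, and it uses the amsmath command \xrightarrow.

```latex
% Schematic of the completeness strategy (requires amsmath).
% Assumption: h on M is obtained by transporting h^{|.|}_F along M^mu_X ~ M.
\[
  \langle A_W, h_G \rangle
    \xrightarrow{\;\Phi\;}
  \langle M^{\mu}_{X}, h^{|\cdot|}_{F} \rangle
    \;\cong\;
  \langle M, h \rangle ,
  \qquad
  \Phi(S) = |\gamma^{-1}(S)| .
\]
% Requirements on gamma : B -> W with mu(B) = mu(X), as in Lemma 7.5.
\[
  \text{(i) } \gamma \text{ is continuous, open and surjective;} \quad
  \text{(ii) } \gamma \circ F = G \circ \gamma ; \quad
  \text{(iii) } \gamma \text{ has the M-property w.r.t. } \mu .
\]
```

Read left to right: a refutation of ϕ in the Kripke algebra is carried by Φ into M^µ_X (Corollary 7.3) and then across the isomorphism into M.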
8.2 The topological carrier of the countermodel

Let X_n = I^1 ⊔ ··· ⊔ I^n, where I^k is the k-dimensional unit cube and ⊔ denotes disjoint union. We would like X_n to be a metric space, so we think of the cubes I^k as embedded in the space ℝ^n, and as lying at a certain fixed distance from one another. For simplicity of notation, we denote points in I^k by (x_1, …, x_k), and do not worry about how exactly these points are positioned in ℝ^n.

For each k < n, define the map F_k : I^k → I^{k+1} by (x_1, …, x_k) ↦ (x_1, …, x_k, 1/2). We let

F(x) = F_k(x) if x ∈ I^k, k < n
F(x) = x if x ∈ I^n

Clearly F is injective. For each k ≥ 2 we choose a privileged "midsection" D_k = [0, 1]^{k−1} × {1/2} of I^k. Thus, F(I^k) = D_{k+1} for k < n.

⁷ Where possible, we have preserved Slavnov's original notation in (Slavnov 2005).

Figure 1: The space X_3 = I^1 ⊔ I^2 ⊔ I^3. Note that µ(I^1) = 1, µ(I^2) = 2, and µ(I^3) = 3. The shaded regions in I^2 and I^3 denote the midsections, D_2 and D_3, respectively.

The space X_n will be the carrier of our countermodels (we will choose n according to the ○-depth of the formula which we are refuting, as explained in the next section).

We define a non-standard measure, µ, on X_n. This somewhat unusual measure will allow us to transfer countermodels on Kripke frames back to the measure algebra, M^µ_{X_n}. Let µ on I^1 be Lebesgue measure on ℝ restricted to Borel subsets of I^1. Suppose we have defined µ on I^1, …, I^k. For any Borel set B in I^{k+1}, let B_1 = B ∩ D_{k+1} and B_2 = B \ D_{k+1}. Then B = B_1 ⊔ B_2. We define

µ(B) = µ(F^{-1}(B_1)) + λ(B_2)

where λ is the usual Lebesgue measure in ℝ^{k+1}. Finally, for any Borel set B ⊆ X_n, we let

µ(B) = Σ_{k=1}^{n} µ(B ∩ I^k)

Note that µ(I^1) = 1, and in general µ(I^{k+1}) = µ(I^k) + 1. Thus

µ(X_n) = µ(I^1 ⊔ ··· ⊔ I^n) = Σ_{k=1}^{n} k = ½(n² + n).

Lemma 8.1. µ is a nonatomic, σ-finite Borel measure on X_n.

Proof. Clearly µ is nonatomic.
Moreover, since µ(X_n) < ∞, µ is σ-finite. The only thing left to show is that µ is countably additive. Suppose that {B_m}_{m∈ℕ} is a collection of pairwise disjoint Borel subsets of X_n.

Claim 8.2. For any k ≤ n,

µ(⋃_m (B_m ∩ I^k)) = Σ_m µ(B_m ∩ I^k).

Proof of Claim: By induction on k.⁸

But now we have:

µ(⋃_m B_m) = Σ_k µ[(⋃_m B_m) ∩ I^k] (by definition of µ)
= Σ_k µ[⋃_m (B_m ∩ I^k)]
= Σ_k Σ_m µ(B_m ∩ I^k) (by Claim 8.2)
= Σ_m Σ_k µ(B_m ∩ I^k)
= Σ_m µ(B_m) (by definition of µ)

⁸ The base case is by countable additivity of Lebesgue measure on the unit interval, [0, 1]. For the induction step, suppose the claim is true for k − 1. Then we have:
µ(⋃_m (B_m ∩ I^k)) = µ[F^{-1}(⋃_m (B_m ∩ I^k ∩ D_k))] + λ[⋃_m (B_m ∩ I^k) \ D_k] (by definition of µ)
= µ[⋃_m F^{-1}(B_m ∩ I^k ∩ D_k)] + Σ_m λ((B_m ∩ I^k) \ D_k) (by countable additivity of λ)
= Σ_m µ[F^{-1}(B_m ∩ I^k ∩ D_k)] + Σ_m λ((B_m ∩ I^k) \ D_k) (by induction hypothesis)
= Σ_m (µ[F^{-1}(B_m ∩ I^k ∩ D_k)] + λ((B_m ∩ I^k) \ D_k))
= Σ_m µ(B_m ∩ I^k) (by definition of µ)

Lemma 8.3. X_n is a separable metric space and F : X_n → X_n is measure-preserving and continuous.

Proof. The set of rational points in I^k is dense in I^k (k ≤ n), so X_n is separable. Continuity of F follows from the fact that F is a translation in ℝ^n; F is measure-preserving by the construction of µ.

8.3 Completeness

Assume we are given a formula ϕ ∈ L_{□,○} such that ϕ is not a theorem of S4C, and let n = CD(ϕ) + 1. By Lemma 3.7, there is a finite stratified, dynamic Kripke model K = ⟨W, R, G, V_1⟩ of depth n such that ϕ is refuted at the root of K. In other words, there is a collection of pairwise disjoint cones W_1, …, W_n with roots w^1_0, …, w^n_0 respectively, such that W = ⋃_{k≤n} W_k; G is injective; G(w^k_0) = w^{k+1}_0 for each k < n; and K, w^1_0 ⊭ ϕ. Let the space X = X_n = I^1 ⊔ ··· ⊔ I^n and the measure µ be as defined in the previous section. We construct a map γ̃ : X → W in a countable number of stages.
To do this we will make crucial use of the notion of ε-nets, defined below:

Definition 8.4. Given a metric space S and ε > 0, a subset Ω of S is an ε-net for S if for any y ∈ S, there exists x ∈ Ω such that d(x, y) < ε (where d denotes the distance function in S).

Observe that if S is compact, then for any ε > 0 there is a finite ε-net for S.

Basic construction. Let w^1_root = w^1_0, and let w_1, …, w_{r_1} be the R-successors of w^1_root. At the first stage, we select r_1 pairwise disjoint closed cubes T_1, …, T_{r_1} in I^1, making sure that their total measure adds up to no more than (1/2)^{0+2}; that is, Σ_{k≤r_1} µ(T_k) < 1/4. For each x in the interior of T_k we let γ̃(x) = w_k (k ≤ r_1). With slight abuse of notation we put γ̃(T_k) = w_k. We refer to T_1, …, T_{r_1} as terminal cubes, and we let

I^1_1 = I^1 − ⋃_{k=1}^{r_1} Int(T_k).

At any subsequent stage, we assume we are given a set I^1_i that is equal to I^1 with a finite number of open cubes removed from it. Thus I^1_i is a compact set. We find a (1/2^i)-net Ω_i for I^1_i, and for each point y ∈ Ω_i we choose r_1 pairwise disjoint closed cubes T^y_1, …, T^y_{r_1} in the (1/2^i)-neighborhood of y, putting γ̃(T^y_k) = w_k (for k ≤ r_1, with the same meaning as above). Again, we refer to the T^y_k's as terminal cubes. Since Ω_i is finite, we create only a finite number of new terminal cubes at this stage, and we make sure to do this in such a way as to remove a total measure of no more than (1/2)^{i+2}. We let I^1_{i+1} be the set I^1_i minus the interiors of the new terminal cubes.

After doing this countably many times, we are left with some points in I^1 that do not belong to the interior of any terminal cube. We call such points exceptional points and we put γ̃(x) = w^1_root for each exceptional point x ∈ I^1. This completes the definition of γ̃ on I^1.

Now assume that we have already defined γ̃ on I^j. We let w^{j+1}_root = w^{j+1}_0 and let w_1, …, w_{r_{j+1}} be the R-successors of w^{j+1}_root. We define γ̃ on I^{j+1} as follows.
At first we choose r_{j+1} closed cubes T_1, …, T_{r_{j+1}} in I^{j+1}, putting γ̃(T_k) = w_k (for k ≤ r_{j+1}). In choosing T_1, …, T_{r_{j+1}}, we make sure that these cubes are not only pairwise disjoint (as before) but also disjoint from the midsection D_{j+1}. Again, we also make sure to remove a total measure of no more than (1/2)^{0+2} µ(I^{j+1}). We let

I^{j+1}_1 = I^{j+1} − ⋃_{k=1}^{r_{j+1}} Int(T_k).

At stage i, we assume we are given a set I^{j+1}_i equal to I^{j+1} minus the interiors of a finite number of closed cubes. Thus I^{j+1}_i is compact, and we choose a finite (1/2^i)-net Ω_i for I^{j+1}_i. For each y ∈ Ω_i we choose r_{j+1} closed terminal cubes T_1, …, T_{r_{j+1}} in the (1/2^i)-neighborhood of y. We make sure that these cubes are not only pairwise disjoint, but disjoint from the midsection D_{j+1}. Since Ω_i is finite, we add only finitely many new terminal cubes in this way. It follows that there is an ε-neighborhood of D_{j+1} that is disjoint from all the terminal cubes added up to this stage. Moreover, for each terminal cube T of I^j defined at the ith stage, F(T) ⊆ D_{j+1}, and we let T′ be some closed cube in I^{j+1} containing F(T) and of height at most ε. To ensure that the equality γ̃ ∘ F(x) = G ∘ γ̃(x) holds for all points x belonging to the interior of terminal cubes of I^j, we put:

γ̃(T′) = G ∘ γ̃(T).

Finally, we have added only finitely many terminal cubes at this stage, and we do so in such a way as to make sure that the total measure of these cubes is no more than (1/2)^{i+2} µ(I^{j+1}). We let I^{j+1}_{i+1} be the set I^{j+1}_i minus the new terminal cubes added at this stage.

We iterate this process countably many times, removing a countable number of terminal cubes from I^{j+1}. For all exceptional points x in I^{j+1} (i.e., points that do not belong to the interior of any terminal cube defined at any stage) we put γ̃(x) = w^{j+1}_root.
Noting that exceptional points of I^j are pushed forward under F to exceptional points in I^{j+1}, we see that the equality γ̃ ∘ F(x) = G ∘ γ̃(x) holds for exceptional points as well. This completes the construction of γ̃ on X.

We pause now to prove two facts about the map γ̃ that will be of crucial importance in what follows.

Lemma 8.5. Let E(I^j) be the collection of all exceptional points in I^j for some j ≤ n. Then µ(E(I^j)) ≥ ½ µ(I^j).

Proof. At stage i of the construction of γ̃ on I^j, we remove from I^j terminal cubes of total measure no more than (1/2)^{i+2} µ(I^j). Thus over countably many stages we remove a total measure of no more than µ(I^j) Σ_{i≥0} (1/2)^{i+2} = ½ µ(I^j). The remaining points in I^j are all exceptional, so µ(E(I^j)) ≥ µ(I^j) − ½ µ(I^j) = ½ µ(I^j).

Lemma 8.6. Let x ∈ I^j be an exceptional point for some j ≤ n. Then γ̃(x) = w^j_0, and for any ε > 0 and any w_k ∈ W_j there is a terminal cube T contained in the ε-neighborhood of x with γ̃(T) = w_k.

Proof. Since x ∈ I^j is exceptional, it belongs to I^j_i for each i ∈ ℕ. We can pick i large enough so that 1/2^i < ε/2. But then, in the notation above, there exists a point y ∈ Ω_i such that d(x, y) < ε/2. The statement now follows from the Basic Construction, since for each w_k ∈ W_j there is a terminal cube T_k in the (1/2^i)-neighborhood of y (and so also in the ε/2-neighborhood of y, hence by the triangle inequality in the ε-neighborhood of x) with γ̃(T_k) = w_k.

Construction of the maps γ_l. In the basic construction we defined a map γ̃ : X → W that we will use in order to construct a sequence of 'approximation' maps, γ_1, γ_2, γ_3, …, where γ_1 = γ̃. In the end, we will construct the needed map, γ, as the limit (appropriately defined) of these approximation maps.

We begin by putting γ_1 = γ̃. The terminal cubes of γ_1 and the exceptional points of γ_1 are the terminal cubes and exceptional points of the Basic Construction. Note that each of I^1, …
, I^n contains countably many terminal cubes of γ_1 together with exceptional points that don't belong to any terminal cube.

Assume that γ_l is defined and that for each terminal cube T of γ_l, all points in the interior of T are mapped by γ_l to a single element in W, which we denote by γ_l(T). Moreover, assume that:
(i) γ_l ∘ F = G ∘ γ_l;
(ii) for any terminal cube T of γ_l in I^j, F maps T into some terminal cube T′ of γ_l in I^{j+1}, for j < n,
where F is again the embedding (x_1, …, x_j) ↦ (x_1, …, x_j, 1/2).

We now define γ_{l+1} on the interiors of the terminal cubes of γ_l. In particular, for any terminal cube T of γ_l in I^1, let T^1 = T and let T^{j+1} be the terminal cube of I^{j+1} containing F(T^j), for j < n. Then we have a system T^1, …, T^n exactly like the system I^1, …, I^n in the Basic Construction. We define γ_{l+1} on the interiors of T^1, …, T^n in the same way as we defined γ̃ on I^1, …, I^n, letting w^j_root = γ_l(T^j) and letting w_1, …, w_{r_j} be the R-successors of w^j_root.

The only modification we need to make is a measure-theoretic one. In particular, in each of the terminal cubes T^j, we want to end up with a set of exceptional points that carries non-zero measure (this will be important for proving that the limit map we define, γ, has the M-property with respect to µ). To do this, assume γ_{l+1} has been defined on T^1, …, T^j, and that for k ≤ j, µ(E(T^k)) ≥ ½ µ(T^k), where E(T^k) is the set of exceptional points in T^k. When we define γ_{l+1} on T^{j+1}, we make sure that at the first stage we remove terminal cubes with a total measure of no more than (1/2)^{0+2} µ(T^{j+1}). At stage i, where we are given T^{j+1}_i, we remove terminal cubes with a total measure of no more than (1/2)^{i+2} µ(T^{j+1}). Again, this can be done because at each stage i we remove only a finite number of terminal cubes, so we can make the size of these cubes small enough to ensure we do not exceed the allocated measure.
Thus, over countably many stages we remove from T^{j+1} a total measure of no more than µ(T^{j+1}) Σ_{i≥0} (1/2)^{i+2} = ½ µ(T^{j+1}). Letting E(T^{j+1}) be the set of exceptional points in T^{j+1}, we have µ(E(T^{j+1})) ≥ ½ µ(T^{j+1}).

We do this for each terminal cube T of γ_l in I^1. Next we do the same for all the remaining terminal cubes T of γ_l in I^2 (i.e., those terminal cubes in I^2 that are disjoint from D_2), and again for all the remaining terminal cubes T of γ_l in I^3 (the terminal cubes in I^3 that are disjoint from D_3), etc. At the end of this process we have defined γ_{l+1} on the interior of each terminal cube of γ_l. For any point x ∈ X that does not belong to the interior of any terminal cube of γ_l, we put γ_{l+1}(x) = γ_l(x). The terminal cubes of γ_{l+1} are the terminal cubes of the Basic Construction applied to each of the terminal cubes of γ_l. The points in the interior of terminal cubes of γ_l that do not belong to the interior of any terminal cube of γ_{l+1} are the exceptional points of γ_{l+1}.

In view of the measure-theoretic modifications we made above, we have the following analog of Lemma 8.5:

Lemma 8.7. Let l ∈ ℕ and let T be any terminal cube of γ_l and E(T) be the set of exceptional points of γ_{l+1} in T. Then

µ(E(T)) ≥ ½ µ(T).

Furthermore, the reader can convince himself that we have the following analog of Lemma 8.6 for the maps γ_l:

Lemma 8.8. Let x be an exceptional point of γ_l and let γ_l(x) = w. Then for any ε > 0 and any v such that wRv, there is a terminal cube T of γ_l contained in the ε-neighborhood of x with γ_l(T) = v.

Finally, note that if x is an exceptional point of γ_l for some l, then γ_l(x) = γ_{l+k}(x) for any k ∈ ℕ. We let B denote the set of points that are exceptional for some γ_l, and define the map γ : B → W as follows:

γ(x) = lim_{l→∞} γ_l(x).

Lemma 8.9. µ(B) = µ(X).

Proof. Let T_l be the set of all points that belong to some terminal cube of γ_l. Note that T_l ⊇ T_{l+1} for l ∈ ℕ, and µ(T_1) is finite.
Thus µ(⋂_l T_l) = lim_{l→∞} µ(T_l) = 0. (The limit value follows from Lemma 8.7.) Finally, note that B = X − ⋂_l T_l. So B is Borel, and µ(B) = µ(X) − µ(⋂_l T_l) = µ(X).

We have constructed a map γ : B → W where µ(B) = µ(X). Moreover, by the Basic Construction, we have γ_l ∘ F(x) = G ∘ γ_l(x) for each l ∈ ℕ. It follows that γ ∘ F(x) = G ∘ γ(x) for x ∈ B. All that is left to show is that (i) γ is continuous, open, and surjective; and (ii) γ has the M-property with respect to µ.

Lemma 8.10. γ has the M-property with respect to µ.

Proof. We show that for any subset S ⊆ W, (i) γ^{-1}(S) is Borel; and (ii) for any open set O ⊆ X, if γ^{-1}(S) ∩ O ≠ ∅ then µ(γ^{-1}(S) ∩ O) ≠ 0. Note that since W is finite, it is sufficient to prove this for the case where S = {w} for some w ∈ W.

(i) Note that x ∈ γ^{-1}(w) iff x is exceptional for some γ_l and x belongs to some terminal cube T of γ_{l−1}, with γ_{l−1}(T) = w. There are only countably many such cubes, and the set of exceptional points in each such cube is closed. So γ^{-1}(w) is a countable union of closed sets, hence Borel.

(ii) Suppose that O is open in X with γ^{-1}(w) ∩ O ≠ ∅. Let x ∈ γ^{-1}(w) ∩ O. Again, x is exceptional for some γ_l. Pick ε > 0 such that the ε-neighborhood of x is contained in O. By Lemma 8.8, there is a terminal cube T of γ_l contained in the ε-neighborhood of x such that γ_l(T) = w (since wRw). Letting E(T) be the set of exceptional points of γ_{l+1} in T, we know that E(T) ⊆ γ^{-1}(w). But by Lemma 8.7, µ(E(T)) ≥ ½ µ(T) > 0. So E(T) is a subset of γ^{-1}(w) ∩ O of non-zero measure, and µ(γ^{-1}(w) ∩ O) > 0.

In what follows, for any w ∈ W, let U_w = {v ∈ W | wRv} (i.e., U_w is the smallest open set in W containing w).

Lemma 8.11. γ is continuous.

Proof. Let U be an open set in W and suppose that x ∈ γ^{-1}(U). Let γ(x) = w ∈ U. Then x is exceptional for some γ_l. So x belongs to an (open) terminal cube T of γ_{l−1} with γ_{l−1}(T) = w.
By R-monotonicity of ⟨γ_l(y)⟩ for all y ∈ B, we know that for any y ∈ T, γ(y) ∈ U_w, i.e., T ⊆ γ^{-1}(U_w). Moreover, since w ∈ U and U is open, we have U_w ⊆ U. Thus x ∈ T ⊆ γ^{-1}(U). This shows that γ^{-1}(U) is open in X.

Lemma 8.12. γ is open.

Proof. Let O be open in B and let w ∈ γ(O). We show that U_w ⊆ γ(O). We know that there exists x ∈ O such that γ(x) = w. Moreover, x is exceptional for some γ_l. Pick ε > 0 small enough so that the ε-neighborhood of x is contained in O. By Lemma 8.8, for each v ∈ U_w there is a terminal cube T_v of γ_l contained in the ε-neighborhood of x such that γ_l(T_v) = v. But then for any exceptional point y_v of γ_{l+1} that lies in T_v, we have γ(y_v) = γ_{l+1}(y_v) = v, and y_v ∈ O. We have shown that for all v ∈ U_w, v ∈ γ(O). It follows that γ(O) is open.

Lemma 8.13. γ is surjective.

Proof. This follows immediately from the fact that γ 'hits' each of the roots, w^1_0, …, w^n_0, of K and γ is open.

Corollary 8.14. ϕ is refuted in M.

Proof. We stipulated that ϕ is refuted in the dynamic Kripke model K = ⟨W, R, G, V_1⟩. Equivalently, letting M_1 = ⟨A_K, h_G, V_1⟩ be the dynamic algebraic model corresponding to K, ϕ is refuted in M_1. By Lemma 8.11, Lemma 8.12, Lemma 8.13, and Lemma 8.10, the map γ : B → W (i) is continuous, open and surjective; (ii) satisfies γ ∘ F = G ∘ γ; and (iii) has the M-property with respect to µ. Thus by Lemma 7.5, the map Φ : ⟨A_K, h_G⟩ → ⟨M^µ_X, h^{|·|}_F⟩ defined by

Φ(S) = |γ^{-1}(S)|

is a dynamic embedding. We now define the valuation V_2 : PV → M^µ_X by:

V_2(p) = Φ ∘ V_1(p)

and we let M_2 = ⟨M^µ_X, h^{|·|}_F, V_2⟩. By Corollary 7.3, M_2 ⊭ ϕ. In view of the isomorphism M^µ_X ≅ M, we have shown that ϕ is refuted in M.

We have shown that for any formula ϕ ∉ S4C, ϕ is refuted in M. We conclude the section by stating this completeness result more formally as follows:

Theorem 8.15. DML_M ⊆ S4C.
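
Combined with soundness (Proposition 5.16), Theorem 8.15 gives an exact characterization of the validities of M. The display below merely restates the two inclusions side by side and is not a new result. The symbol L_{\Box,\bigcirc} for the language is an assumption about notation that the extracted text does not show clearly.

```latex
% Soundness and completeness for the Lebesgue measure algebra M, combined.
% Assumption: the language is written L_{\Box,\bigcirc}.
\[
  \mathrm{S4C} \subseteq \mathrm{DML}_{M} \quad \text{(Proposition 5.16)},
  \qquad
  \mathrm{DML}_{M} \subseteq \mathrm{S4C} \quad \text{(Theorem 8.15)},
\]
\[
  \text{hence} \quad \mathrm{DML}_{M} = \mathrm{S4C},
  \quad \text{i.e.} \quad
  \mathrm{S4C} \vdash \varphi \iff M \models \varphi
  \quad \text{for every } \varphi \in L_{\Box,\bigcirc} .
\]
```

Here M ⊨ ϕ is read as in Definition 5.15: ϕ is valid in every dynamic algebraic model over M.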
9 Completeness for a single measure model

In this section we prove a strengthening of the completeness result of the previous section, showing that there is a single dynamic measure model ⟨M, h, V⟩ in which every non-theorem of S4C is refuted.

Definition 9.1. Denote by M^ω the product M × M × M × ···. This is a Boolean algebra, where Boolean operations are defined component-wise:

(a_1, a_2, a_3, …) ∨ (b_1, b_2, b_3, …) = (a_1 ∨ b_1, a_2 ∨ b_2, a_3 ∨ b_3, …)
(a_1, a_2, a_3, …) ∧ (b_1, b_2, b_3, …) = (a_1 ∧ b_1, a_2 ∧ b_2, a_3 ∧ b_3, …)
−(a_1, a_2, a_3, …) = (−a_1, −a_2, −a_3, …)

Definition 9.2. We say (a_1, a_2, a_3, …) is an open element in M^ω if a_k is open in M for each k ∈ ℕ.

The collection of open elements in M^ω is closed under finite meets and arbitrary joins, and contains the top and bottom elements (since operations in M^ω are component-wise). We define the operator I^ω on M^ω by:

I^ω(a_1, a_2, a_3, …) = (Ia_1, Ia_2, Ia_3, …).

Then I^ω is an interior operator on M^ω (the proof is the same as the proof of Lemma 5.8). So the algebra M^ω together with the interior operator I^ω is a topological Boolean algebra.

Lemma 9.3. There is a dynamic algebraic model M = ⟨M^ω, h, V⟩ such that for any formula ϕ ∈ L_{□,○}, the following are equivalent:
(i) S4C ⊢ ϕ;
(ii) M ⊨ ϕ.

Proof. Let ⟨ϕ_k⟩ be an enumeration of all non-theorems of S4C (there are only countably many formulas, so only countably many non-theorems). By completeness of S4C for M, for each k ∈ ℕ there is a model M_k = ⟨M, h_k, V_k⟩ such that M_k ⊭ ϕ_k. We construct a model M = ⟨M^ω, h, V⟩, where h and V are defined as follows. For any ⟨a_k⟩_{k∈ℕ} = (a_1, a_2, a_3, …) ∈ M^ω, and for any propositional variable p:

h((a_1, a_2, a_3, …)) = ⟨h_k(a_k)⟩_{k∈ℕ}
V(p) = ⟨V_k(p)⟩_{k∈ℕ}.

(The fact that h is an O-operator follows from the fact that h is computed component-wise according to the h_k's, and each h_k is an O-operator.) We can now prove the lemma.
The direction (i) ⇒ (ii) follows from Proposition 4.7. We show (ii) ⇒ (i) by proving the contrapositive. Suppose that S4C ⊬ ϕ. Then ϕ = ϕ_k for some k ∈ ℕ. We claim that

π_k V(ϕ) = V_k(ϕ)

where π_k is the projection onto the kth coordinate. (Proof: By induction on the complexity of ϕ, and the fact that π_k is a topological homomorphism.) In particular, π_k V(ϕ_k) = V_k(ϕ_k) ≠ 1. So V(ϕ_k) ≠ 1, and M ⊭ ϕ_k.

Lemma 9.4. M^ω is isomorphic to M.

Proof. We need to construct an isomorphism from M^ω onto M. Let (a_1, a_2, a_3, …) be an arbitrary element in M^ω. Then for each k ∈ ℕ, we can choose a set A_k ⊆ [0, 1] such that a_k = |A_k| and 1 ∉ A_k. We define a sequence of points s_k in the real interval [0, 1] as follows:

s_0 = 0
s_1 = 1/2
s_2 = 3/4

In general, s_k = (2^k − 1)/2^k (k ≥ 1). We now define a sequence of intervals I_k having the s_k's as endpoints:

I_0 = [0, 1/2)
I_1 = [1/2, 3/4)
I_2 = [3/4, 7/8)

and in general I_k = [s_k, s_{k+1}). Our idea is to map each set A_k into the interval I_k. We do this by letting B_k = {l_k x + s_k | x ∈ A_k}, where l_k is the length of I_k. Clearly B_k ⊆ I_k and B_k ∩ B_j = ∅ for all k ≠ j. We can now define the map h : M^ω → M by:

h(a_1, a_2, a_3, …) = |⋃_{k∈ℕ} B_k|

where B_k is defined as above. The reader can now verify that h is an isomorphism.

Corollary 9.5. There is a dynamic measure model M = ⟨M, h, V⟩ such that for any formula ϕ ∈ L_{□,○}, the following are equivalent:
(i) S4C ⊢ ϕ;
(ii) M ⊨ ϕ.

Proof. Immediate from Lemma 9.3 and Lemma 9.4.

Acknowledgements

I would like to thank Grigori Mints and Dana Scott for valuable comments, and Sergei Slavnov for his pioneering work in this area.

References

M. Aiello, J. van Benthem, and G. Bezhanishvili. Reasoning about space: The modal way. Journal of Logic and Computation, 13(6):889–920, 2003.

S. Artemov, J. Davoren, and A. Nerode. Modal logics and topological semantics for hybrid systems. MSI 97-05, Cornell University, 1997.

G. Bezhanishvili and M. Gehrke.
Completeness of S4 with respect to the real line: Revisited. Annals of Pure and Applied Logic, 131(1-3):287–301, 2005.

D. Fernandez-Duque. Dynamic topological completeness for ℝ². Logic Journal of IGPL, 15:77–107, 2005.

D. Fernandez-Duque. Absolute completeness of S4u for its measure-theoretic semantics. Advances in Modal Logic, 8, 2010.

P. Halmos. Boolean algebras. New York: Stevens & Co., 1959.

P. Kremer and G. Mints. Dynamic topological logic. Annals of Pure and Applied Logic, 131(1-3):133–158, 2005.

T. Lando. Completeness of S4 for the Lebesgue measure algebra. Journal of Philosophical Logic, 41(2):287–316, 2012.

G. Mints and T. Zhang. A proof of topological completeness of S4 in (0,1). Annals of Pure and Applied Logic, 133(1-3):231–235, 2005.

J. Oxtoby. Homeomorphic measures in metric spaces. Proceedings of the American Mathematical Society, 24(3):419–423, 1970.

H. Rasiowa and R. Sikorski. The mathematics of metamathematics. Państwowe Wydawnictwo Naukowe, Warsaw, Poland, 1963.

D. Scott. Mixing modality and probability. Lecture notes, 2009.

S. Slavnov. Two counterexamples in the logic of dynamic topological systems. TR 2003015, Cornell University, 2003.

S. Slavnov. On completeness of dynamic topological logic. Moscow Mathematical Journal, 5:477–492, 2005.

Surprise in Probabilistic Dynamic Epistemic Logic
Lorenz Demey
Center for Logic and Analytic Philosophy, KU Leuven
Abstract
This paper presents a new analysis of surprise in the framework of probabilistic dynamic epistemic logic. This analysis is based on current psychological theories, and as a result, several experimentally observed aspects of surprise can be derived as theorems within the logical system. Most importantly, however, I will argue that unlike other formal accounts of surprise, the current analysis is able to capture the essentially dynamic nature of surprise. This conceptual elucidation also yields additional empirical benefits: the new analysis can capture important aspects of surprise that are not covered by earlier frameworks, such as its transitory nature.

1 Introduction
The phenomenon of surprise is ubiquitous in everyday life. People get surprised all the time; for example, by an unexpected flash of light, or, more ‘down to earth’, about the fact that their local grocery store has run out of milk (after all, the store is usually well-stocked!). The role of surprise in human life has been intensively studied in psychology from cognitive, social, developmental and educational perspectives. Furthermore, computer scientists have implemented the psychological findings about human surprise in artificial agents, and used logical models to describe these agent architectures. Surprise even crops up in various philosophical debates, such as those concerning the role of surprising evidence in Bayesian epistemology, or concerning the so-called surprise examination paradox.1

The overarching goal of this paper is to provide a new analysis of the phenomenon of surprise in the framework of probabilistic dynamic epistemic logic. This account is based on the vast amount of experimental work on surprise in psychology, which should benefit its empirical adequacy.
The paper’s main thesis, however, is of a more conceptual nature: surprise is an essentially dynamic phenomenon, and any good formal analysis should represent this dynamics explicitly. I will argue that all current formalizations of surprise in artificial intelligence and logic fail to fully capture this dynamics, and show that the framework developed in this paper is able to capture it. As an additional benefit, this new framework can be used to analyze some aspects of surprise that could not be analyzed before.

This enterprise is motivated by a variety of interrelated issues. In the first place, a logical perspective on surprise can help to elucidate the basic properties of this notion. Starting from the concrete empirical results about surprise, a complete axiomatization is proposed in which the observed behavioral patterns can be derived as theorems. In other words, the fundamental laws of surprise can be ‘reverse engineered’ out of the concrete behavior that they generate. Secondly, the resulting logical system serves as a highly expressive language to formally specify agent architectures; it belongs to the general framework of (dynamic) epistemic logic, which is becoming a contemporary ‘lingua franca’ in multi-agent systems. Thirdly, and most importantly, this project constitutes a concrete illustration of the so-called dynamic turn in logic (van Benthem 1996; 2011, Demey 2013), which maintains that many theorems, phenomena, etc. that are usually expressed or analyzed in an entirely static way actually involve a great deal of dynamics, and could benefit significantly from analyses which explicitly represent this underlying dynamics. Considering the main conceptual thesis of this paper (as stated above), it should be clear that the paper offers a new illustration of the dynamic turn in logic.

The remainder of the paper is organized as follows. Section 2 briefly reviews the literature on surprise in cognitive science, multi-agent systems and logic.
In Section 3 I argue that two earlier formalizations do not adequately represent the dynamic nature of surprise, and make some suggestions on how this can be achieved. In Section 4, then, I show how these suggestions can be developed into a full-fledged dynamic logic of surprise, which can capture several key aspects of surprise, such as its transitory (short-lived) nature and its role in belief revision. Finally, Section 5 wraps things up.

1 These philosophical debates will not be directly addressed in this paper; for overviews, the reader can consult (Talbott 2008) and (Chow 1998), respectively.

2 Three perspectives on surprise
This section provides an overview of the literature on surprise in cognitive science, multi-agent systems, and logic, focusing on those topics and debates that are most relevant for our current purposes. For more comprehensive overviews, the reader can consult (Macedo et al. 2009; 2012, Reisenzein and Meyer 2009), and the longer version of this paper (Demey forthcoming b).

2.1 Cognitive Science
The emotion of surprise is probably of old phylogenetic origin (Reisenzein et al. 1996). This short-lived state of mind is caused in an agent when she encounters an event that she did not expect. Surprise comes in degrees of intensity, which depend monotonically on the degree of unexpectedness of the surprise-causing event (Stiensmeier-Pelster et al. 1995).

The cognitive-psychoevolutionary theory of surprise (Meyer et al. 1997) claims that typically, an unexpected event elicits a sequence of four processes. First, the event is appraised as unexpected, i.e. as conflicting with a previously held belief. Secondly, if the degree of unexpectedness is sufficiently large, then ongoing processes are interrupted and attention is shifted to the unexpected event. Thirdly, the unexpected event is analyzed and evaluated, which can lead to the fourth process, viz. revision of the relevant beliefs.
The fact that this sequence ends in belief revision helps to explain the transitory (short-lived) character of surprise. When a surprising event occurs again and again, subjects tend to ‘get used’ to it, and after a few occurrences they do not find it surprising at all anymore (Charlesworth 1964, Experiment II). Initially, the surprising event is unexpected: it conflicts with a previously held belief B. This leads to a process of belief revision, which removes B from the agent’s stock of beliefs (and perhaps replaces it with another belief). When the same event happens again, it is no longer surprising, because it no longer conflicts with a previously held belief.

2.2 Multi-agent Systems
Since surprise typically leads to processes of learning and belief revision in humans, it is a natural move to endow artificial agents with the capability of feeling surprise, which can guide them in their actions. In a recent series of papers, Macedo and Cardoso have done exactly this (Macedo and Cardoso 2001a;b; 2004, Macedo et al. 2004; 2006). This work is based on the cognitive theories of surprise described above, and can thus also be seen as a simulation of the human surprise mechanism (with various simplifications, obviously).

In the simplest model (Macedo and Cardoso 2001b), the anticipated intensity of surprise elicited by a piece of information ϕ is calculated as follows:2

S(ϕ) := 1 − P(ϕ). (1)

The unexpectedness of ϕ is represented by 1 − P(ϕ). Here, P(ϕ) denotes the subjective probability of ϕ, which is computed based on frequencies stored in the agent’s memory. Thus (1) clearly shows that the intensity of surprise about ϕ is a monotone increasing function of the unexpectedness of ϕ.

2.3 Logic
Lorini (2008) has argued that researchers attempting to incorporate surprise and other emotions into multi-agent systems can benefit from the accuracy of logical frameworks for the formal specification of emotions.
Therefore, Lorini and Castelfranchi (2006; 2007) have developed a logical framework for surprise. Just like Macedo and Cardoso’s, this framework is based on the cognitive theories of surprise described in Subsection 2.1, and can thus be seen as a formal-logical model of human surprise. I will now discuss some of the main features of this framework.

The base logic is a system of probabilistic epistemic logic with a belief operator B and formulas about (linear combinations of) probabilities, such as P(ϕ) ≥ 0.5 and P(ϕ) + 2P(ψ) ≥ 0.7. This system is extended with PDL-style dynamic operators, and two unary operators Test and Datum. The formulas Test(ϕ) and Datum(ϕ) are to be read as “the agent is currently scrutinizing ϕ” and “the agent has perceptual datum ϕ”, respectively. Furthermore, there are actions observe(ϕ) and retrieve(ϕ), which represent observing that ϕ is the case and retrieving (from memory) that ϕ. Each of these actions gives rise to a PDL-style dynamic operator. The two most important axioms are:

[observe(ϕ)]Datum(ϕ), (2)
[retrieve(ϕ)]Test(ϕ). (3)

Axiom (2) says that after the agent observes that ϕ, this becomes a perceptual datum; analogously, axiom (3) says that after the agent has retrieved ϕ, this becomes an item under scrutiny.

2 There exist more complex (and realistic) proposals for defining surprise in terms of unexpectedness (probability) (Macedo et al. 2004). However, the experimental data do not seem to single out one of these complex definitions over the other ones. Furthermore, the main conceptual points of this paper (regarding the dynamic nature of surprise) can perfectly be made using (1). Therefore, I will stick to the simpler definition.

With these resources, the notion of mismatch-based surprise can be defined. This emotion arises when there is a conflict between a perceptual datum ψ and a currently scrutinized belief ϕ; ‘conflict’ here means that the agent believes that ϕ and ψ cannot be jointly true.
Furthermore, the intensity of a mismatch-based surprise is defined as the probability that the agent assigns to the scrutinized belief ϕ. Hence, the more confident the agent is in her belief that ϕ, the more intensely she will be surprised upon receiving a perceptual datum that conflicts with ϕ (this captures exactly the idea that the intensity of surprise is a monotone function of the degree of unexpectedness). Formally:

MismatchS(ψ, ϕ) :≡ Datum(ψ) ∧ Test(ϕ) ∧ B(ψ → ¬ϕ), (4)
IntensityS(ψ, ϕ) = c :≡ MismatchS(ψ, ϕ) ∧ P(ϕ) = c. (5)

3 Surprise as a dynamic phenomenon
In this section I will argue that neither Macedo and Cardoso’s computational nor Lorini and Castelfranchi’s logical models of surprise adequately capture the dynamic nature of surprise. Afterwards I will suggest how the dynamics of surprise can adequately be formalized.

3.1 Quasi-static analyses of surprise
Let’s first fix some terminology. Surprise is caused by an unexpected event. Any mental state (beliefs, desires, emotions, etc.) that the agent had (just) before perceiving the unexpected event will be called ‘prior’; any such state that she has (just) after perceiving the event will be called ‘posterior’.3 A statement that involves only prior notions or only posterior notions will be called ‘temporally coherent’; a statement that involves both prior and posterior notions will be called ‘temporally incoherent’.

Consider Macedo and Cardoso’s analysis of surprise, and recall their Definition (1) of surprise intensity as unexpectedness:

S(ϕ) = 1 − P(ϕ).

The left side contains a posterior notion: the intensity of the surprise felt by the agent after the unexpected event. The right side, however, contains a prior notion: the agent’s subjective probability before the unexpected event. Hence, Definition (1) is a temporally incoherent statement.

3 This terminology is analogous to the use of ‘priors’ and ‘posteriors’ in Bayesian frameworks.
However, it should be emphasized that in this paper, ‘prior’ and ‘posterior’ are defined in terms of (being before or after) perceiving the unexpected event, while in Bayesian frameworks they are defined in terms of (being before or after) the probabilistic update triggered by that event.

To see this more clearly, note that there are two ways of reading (1) as a temporally coherent statement: (i) by considering both S and P to be prior notions, and (ii) by considering both S and P to be posterior notions. For interpretation (i), consider a case where the agent assigns a low (prior) probability to ϕ; Definition (1) then says that she should experience a highly intense surprise about ϕ. Under interpretation (i), this surprise is prior; in other words, the agent is highly surprised about an event before she has even perceived it, which is clearly absurd. For interpretation (ii), consider a case where the agent is highly surprised after perceiving an occurrence of ϕ; Definition (1) then says that she assigns a low probability to ϕ. Under interpretation (ii), this probability is posterior; in other words, even after the agent has observed an occurrence of ϕ, she still assigns a low probability to it, which clearly contradicts the common assumption that agents process new information via Bayesian updating.4

I now turn to Lorini and Castelfranchi’s analysis of surprise. Let’s first consider the qualitative notion of mismatch-based surprise, ignoring surprise intensity for the moment. Recall their Definition (4):

MismatchS(ψ, ϕ) ≡ Datum(ψ) ∧ Test(ϕ) ∧ B(ψ → ¬ϕ).

The left side contains a posterior notion: the agent’s mismatch-based surprise after the unexpected event. The right side is more complicated. The first conjunct is posterior: ψ is only a perceptual datum after it has been observed by the agent; this dynamics was explicitly represented in (2).
The second conjunct is both prior and posterior: ϕ was under scrutiny before the observation of the unexpected event, and remains so afterwards. The third and final conjunct is prior: the agent believed that ψ and ϕ cannot be jointly true before the unexpected event; typically, she will drop this belief as a result of her surprise (recall from Subsection 2.1 that surprise typically leads to a process of belief revision). Thus, in total, Definition (4) is temporally incoherent.5

Finally, let’s consider the quantitative aspects of Lorini and Castelfranchi’s system. Recall their Definition (5) of surprise intensity:

IntensityS(ψ, ϕ) = c ≡ MismatchS(ψ, ϕ) ∧ P(ϕ) = c.

The left side contains a posterior notion: the intensity of the agent’s mismatch-based surprise after she has perceived the unexpected event. The right side is, again, more complicated. The first conjunct (which was also the left side of (4)) is posterior: the agent experiences mismatch-based surprise only after perceiving the unexpected event. The second conjunct, however, involves a prior notion, viz. the probability that the agent assigns to the scrutinized item ϕ before perceiving the unexpected event. Hence, Definition (5) is temporally incoherent as well.

4 And P(ϕ|ϕ) = 1, so after the occurrence of ϕ, the agent should assign probability 1 to it.

5 Again, there are two ways of reading (4) as a temporally coherent statement: by considering all notions that appear in it to be prior, or by considering all those notions to be posterior. It is easy to see, however, that both interpretations quickly lead to counterintuitive consequences. Similar remarks apply to (5), which will be discussed next.
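The temporal readings of Definition (1) discussed in this subsection can be checked with a small numerical sketch. It is only an illustration under stated assumptions: the two-state distribution, the value 0.05, and the helper names `condition` and `unexpectedness` are hypothetical and do not come from Macedo and Cardoso's system.

```python
from fractions import Fraction


def condition(mass, event):
    """Bayesian conditionalization of a probability mass function on an event."""
    total = sum(mass[s] for s in event)
    return {s: (mass[s] / total if s in event else Fraction(0)) for s in mass}


def unexpectedness(mass, event):
    """The right side of Definition (1): 1 - P(event)."""
    return 1 - sum(mass[s] for s in event)


# Hypothetical prior: the agent considers phi very unlikely.
prior = {"phi": Fraction(1, 20), "not_phi": Fraction(19, 20)}
event = {"phi"}

# Probabilities after the agent has observed an occurrence of phi.
posterior = condition(prior, event)

# Posterior surprise computed from the prior probability (the mixed reading).
mixed_reading = unexpectedness(prior, event)

# Reading (ii): surprise computed from the posterior probability.
posterior_reading = unexpectedness(posterior, event)

print(mixed_reading, posterior_reading)
```

Under these assumptions the mixed reading yields a high intensity (19/20), whereas reading (ii) yields 0, because conditionalization gives P(ϕ|ϕ) = 1. An intuitively adequate intensity is thus obtained only by combining a posterior surprise with a prior probability.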
An intuitively right principle about surprise should look somewhat like this: if the agent has a (prior) belief that ψ and ϕ are incompatible, and assigns (prior) probability c to ϕ, then after retrieving ϕ and observing an occurrence of ψ, she will experience a (posterior) mismatch-based surprise with intensity c. Formally, this looks as follows:

B(ψ → ¬ϕ) ∧ P(ϕ) = c → [retrieve(ϕ); observe(ψ)]IntensityS(ψ, ϕ) = c. (6)

However, to derive (6) in Lorini and Castelfranchi’s system, one needs principles such as (7) and (8), which link the agent’s prior and posterior states by claiming that her observation of the occurrence of ψ does not change her relevant beliefs and probabilities in any way. This is highly counterintuitive: both (7) and (8) run entirely against the idea that surprise triggers a process of belief revision; additionally, (8) clearly contradicts the common assumption that agents process new information via Bayesian conditionalization.

B(ψ → ¬ϕ) → [observe(ψ)]B(ψ → ¬ϕ), (7)
P(ϕ) = c → [observe(ψ)]P(ϕ) = c. (8)

3.2 Towards a fully dynamic analysis of surprise
I have shown that both Macedo and Cardoso’s definition of surprise intensity (1) and Lorini and Castelfranchi’s definitions of mismatch-based surprise and its intensity (4–5) are temporally incoherent. There is a uniform explanation for this: surprise is an essentially dynamic phenomenon, but none of these authors explicitly represent this dynamics, so they have to ‘smuggle’ it into their systems, which thus end up being temporally incoherent.6

To obtain a temporally coherent definition of surprise, which respects the different ‘stages’ (before vs. after perceiving the unexpected event), the dynamics of surprise needs to be represented explicitly. I will use a public announcement operator [!ϕ] for this purpose (technical details will be discussed in the next section).
Whether a certain notion is to be interpreted as prior or as posterior is now encoded directly in the syntax of the language: if the notion is within the scope of a dynamic operator, it is posterior; otherwise it is prior. For example, P(ϕ) = 0.2 means that the agent’s prior probability of ϕ is 0.2, while [!ϕ]P(ϕ) = 0.2 means that her posterior probability of ϕ is 0.2.

6 A similar story can be told about the role of dynamics in Aumann’s celebrated ‘agreeing to disagree’ theorem in game theory (Demey 2010; forthcoming c).

We will work with a simple measure of surprise intensity S, based on Macedo and Cardoso’s (1).7 When the surprise dynamics is explicitly represented, (1) is transformed into the following:

[!ϕ]S(ϕ) = c ↔ P(ϕ) = 1 − c. (9)

This principle says that the agent will be surprised about ϕ with intensity c after the unexpected event iff she assigns probability 1 − c to ϕ before the unexpected event. It thus says exactly the same as (1), but now in a temporally coherent way: both sides of (9) are prior statements.8 Furthermore, note that the right-to-left direction of (9) is similar in spirit to (6), which was very intuitive, but which was only derivable using additional implausible principles such as (7–8).

4 Modeling surprise in Probabilistic DEL
In the previous section, I made some suggestions on how the dynamics of surprise can be represented explicitly. In this section, these suggestions will be developed into a full-fledged logical system. I will also show how this system can naturally capture several important properties of surprise.

4.1 Semantic setup
Given the dynamic nature of surprise, and its connection with epistemic states and processes (beliefs, unexpectedness, belief revision, etc.), it is natural to work in the general framework of dynamic epistemic logic.
This framework is rapidly becoming a ‘lingua franca’ or ‘universal toolbox’, which has been applied to problems in game theory, philosophy, artificial intelligence, etc. (Demey 2013).

Fix a countable set Prop of proposition letters. In this paper I will only work with a single agent, so it is not necessary to introduce agent indices. The formal language L is given by the following Backus-Naur form:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | Kϕ | Σ_i c_i P(ϕ_i) ≥ c | S(ϕ) ≥ c | S(ϕ) ≤ c | [!ϕ]ϕ,

where p ∈ Prop and c_i, c ∈ ℚ. As usual, Kϕ means that the agent knows that ϕ. Similarly, P(ϕ) ≥ c means that the agent assigns probability (degree of belief) at least c to ϕ. Arbitrary linear combinations of probability terms are allowed mainly for technical reasons that need not concern us here (Fagin and Halpern 1994). Because of this generality, any type of (in)equality of probabilities is expressible (Kooi 2003, Def. 2). The formula S(ϕ) ≥ c says that the agent is surprised about ϕ with intensity at least c. Here, full expressivity is not allowed, and so the ≥- and ≤-forms are both taken as primitive. One can then define S(ϕ) < c as ¬(S(ϕ) ≥ c), etc.

7 Recall Footnote 2.

8 The left formula as a whole is prior; the subformula S(ϕ) = c occurs inside the scope of the [!ϕ]-operator, and is thus posterior. In other words, principle (9) is able to express a connection between the agent’s prior probability and her posterior surprise intensity in a temporally coherent way, exactly by making use of the dynamic [!ϕ]-operator.

Finally, [!ϕ]ψ should be read as: “after a public announcement of ϕ, it will be the case that ψ”. Its dual is ⟨!ϕ⟩ψ := ¬[!ϕ]¬ψ.
Public announcement is usually explicated in terms of rational communication, but actually, almost any public event can be modeled using public announcements; for example, a strike of lightning can be modeled as a public announcement of the proposition ‘lightning occurs (at time t and location ℓ)’.9 It thus makes perfect sense to represent an unexpected event (whatever its exact nature) as a public announcement.10

We now turn to the models on which this language will be interpreted:

Definition 4.1. A surprise model is a tuple M := ⟨W, R, µ, σ, V⟩, where W is a non-empty and finite set of states, R is an equivalence relation on W, and V : Prop → ℘(W) is a valuation function. Furthermore, µ assigns to every state w ∈ W a probability mass function µ(w) : W → [0, 1] that satisfies two conditions: (i) if (w, v) ∉ R then µ(w)(v) = 0, and (ii) µ(w)(w) > 0. Finally, σ assigns to every state w ∈ W a surprise measure, i.e. a partial function σ(w) : ℘(W) ⇀ [0, 1].

Definition 4.2. The class of all surprise models will be denoted C_S. Furthermore, C*_S is the class of all surprise models whose surprise measures are entirely undefined, i.e. such that σ(w)(X) is undefined for all w ∈ W and X ⊆ W.

A surprise model is thus just an ordinary finite11 Kripke model ⟨W, R, V⟩ with additional components µ and σ. First of all, µ(w)(v) = c means that at state w, the agent assigns probability c to v being the actual state. Similarly, σ(w)(X) = c means that at

9 Van Benthem et al. (2009, p. 71) make a similar comment: “While much of the theory has been developed with conversation and communication in mind, it is important [ . . . ] to stress that we are not doing some sort of formal linguistics. The formal systems we will be dealing with apply just as well to observation, experimentation, learning, or any sort of information-carrying scenario”.

10 This also resolves a terminological tension in the literature on surprise.
Agents are surprised about some propositional content (a piece of information), but their surprise is caused by some (non-propositional) event. In the new system, the propositional content of the surprise is formalized as the proposition ϕ, while its cause is formalized as the public announcement of that proposition. In short: ϕ is a proposition, but !ϕ is an event.

11 The assumption that surprise models are finite ensures that probabilities can be represented using simple probability mass functions. This assumption can be dropped; the general case uses σ-algebras to represent probabilities (Sack 2009, Demey and Sack forthcoming). However, the main points of this paper are of a more conceptual nature, and can perfectly be made using the less sophisticated setup.

state w, the agent experiences surprise with intensity c about X (i.e. about one of the states in X being the actual state). Note the following differences between µ(w) and σ(w) (for any state w ∈ W):

• µ(w) is a total function, so µ(w)(v) is defined for every state v ∈ W; on the other hand, σ(w) is a partial function, so it is allowed that σ(w)(X) is undefined for some sets X ⊆ W,
• µ(w) is required to satisfy conditions (i) and (ii), whose motivation is discussed extensively in (Demey and Sack forthcoming); on the other hand, σ(w) is not required to satisfy any additional conditions whatsoever,
• µ(w) is defined on individual states, and can additively be lifted to sets of states: µ(w)(X) = Σ_{x∈X} µ(w)(x) (this essentially reflects the finite additivity of probabilities); on the other hand, σ(w) is defined directly on sets of states, so it might happen that σ(w)({x, y}) ≠ σ(w)({x}) + σ(w)({y}).

These differences show that unlike the well-behaved epistemological notion of probability (degree of belief), the psychological notion of (degree of) surprise satisfies no static regularities whatsoever.
This is a clear manifestation of the essentially dynamic nature of surprise in the definition of surprise models.12

I now turn to the logic’s semantics. This is entirely as expected; the formal clauses are stated in Definition 4.3. Given a formula ϕ ∈ L and a surprise model M, I use [[ϕ]]^M to denote the set {w ∈ W | M, w ⊨ ϕ}. The clause for surprise formulas holds for ≷ ∈ {≥, ≤}; I will return to it later (see Lemma 2). Note that to interpret a formula of the form [!ϕ]ψ at a surprise model M, the subformula ψ has to be interpreted at the updated model M|ϕ, which is well-defined because of Definition 4.4 and Lemma 1. Finally, Definition 4.5 states the usual definition of semantic validity.

12 One might consider adding the requirements that if X ⊆ Y ⊆ W, then σ(w)(X) ≥ σ(w)(Y) and σ(w)(W − X) = 1 − σ(w)(X), in analogy to the well-known Kolmogorov axioms for probability. However, the only motivation for such requirements seems to be the observation that “surprise is inversely correlated with probability”, which is only plausible if ‘surprise’ is read as posterior and ‘probability’ as prior. I will return to this suggestion after the dynamics has been formally introduced (cf. Lemma 4).

Definition 4.3. Consider a surprise model M and state w in M. Then:

M, w ⊨ p iff w ∈ V(p) (for p ∈ Prop),
M, w ⊨ ¬ϕ iff M, w ⊭ ϕ,
M, w ⊨ ϕ ∧ ψ iff M, w ⊨ ϕ and M, w ⊨ ψ,
M, w ⊨ Kϕ iff for all v ∈ W: if wRv then M, v ⊨ ϕ,
M, w ⊨ Σ_i c_i P(ϕ_i) ≥ c iff Σ_i c_i µ(w)([[ϕ_i]]^M) ≥ c,
M, w ⊨ S(ϕ) ≷ c iff σ(w)([[ϕ]]^M) ≷ c if σ(w)([[ϕ]]^M) is defined, and c = 0 otherwise,
M, w ⊨ [!ϕ]ψ iff if M, w ⊨ ϕ then M|ϕ, w ⊨ ψ.

Definition 4.4. Consider an arbitrary surprise model M = ⟨W, R, µ, σ, V⟩ and formula ϕ ∈ L, and suppose that M, w ⊨ ϕ for some w ∈ W.
Then the updated model M|ϕ := ⟨W^ϕ, R^ϕ, µ^ϕ, σ^ϕ, V^ϕ⟩ is defined as follows:

• W^ϕ := [[ϕ]]^M = {w ∈ W | M, w ⊨ ϕ},
• R^ϕ := R ∩ ([[ϕ]]^M × [[ϕ]]^M),
• µ^ϕ(w)(v) := µ(w)(v) / µ(w)([[ϕ]]^M) for all w, v ∈ W^ϕ,
• σ^ϕ(w)(X) := 1 − µ(w)(X) for all w ∈ W^ϕ, X ⊆ W^ϕ,
• V^ϕ(p) := V(p) ∩ [[ϕ]]^M for every p ∈ Prop.

Definition 4.5. For any formula ϕ ∈ L and class of models C, we say that C ⊨ ϕ iff M, w ⊨ ϕ for all models M ∈ C and states w in M.

Lemma 1. The class C_S is closed under public announcements, i.e. if M ∈ C_S, then also M|ϕ ∈ C_S (for any formula ϕ ∈ L). This does not hold for C*_S.

Proof. The C_S case is trivial: for the non-surprise components, see (Demey 2010, Lemma 9); and since Definition 4.1 does not require the surprise measures to satisfy any additional requirements, there is nothing else to prove. For C*_S, note that by Definition 4.4, the updated surprise measures are total functions, even if the original surprise measures were entirely undefined.

The public announcement of ϕ in a model M deletes all ¬ϕ-states from that model; this is a standard idea (van Ditmarsch et al. 2007). The probability functions are changed by Bayesian conditionalization on the announced proposition ϕ (Kooi 2003). To see this more clearly, note that the definition of the updated probability function can be rewritten using conditional probabilities: µ^ϕ(w)(v) = µ(w)(v | [[ϕ]]^M). Most importantly, the updated surprise measure σ^ϕ(w) is defined in terms of the original probability function µ(w). This is the only substantial property of surprise that is assumed in the logic’s semantic setup; it is clearly of a dynamic nature (linking the original and the updated model).

Even though the surprise measures σ(w) are allowed to be partial, Lemma 2 below shows that this does not lead to any truth value gaps in the semantics. When we are modeling concrete scenarios, we typically want to assume that the agent initially (i.e.
before any unexpected events have taken place) experiences no surprise. Lemma 2 therefore justifies the following heuristic rule:

(Heur): When modeling a scenario, it can be assumed that the ‘initial’ model M (which represents the situation before any unexpected events have taken place) leaves all surprise measures undefined, i.e. that M ∈ C*_S.

Lemma 2. Consider an arbitrary surprise model M = ⟨W, R, µ, σ, V⟩ and formula ϕ ∈ L, and suppose that σ(w)([[ϕ]]^M) is undefined. Then M, w ⊨ S(ϕ) = 0.

Proof. Since σ(w)([[ϕ]]^M) is undefined, it follows by the semantic clause for S(ϕ) ≥ c that M, w ⊨ S(ϕ) ≥ 0 (and M, w ⊭ S(ϕ) ≥ c for all c ≠ 0). Entirely analogously, M, w ⊨ S(ϕ) ≤ 0 (and M, w ⊭ S(ϕ) ≤ c for all c ≠ 0).

The following lemma states that the language L contains no redundancies. In particular, the surprise operator cannot be defined in terms of the other available operators.

Lemma 3. There exists no formula ϕ ∈ L − {S} such that ⊨ ϕ ↔ S(p) ≥ 0.5.

Proof. Consider the surprise models M_1 and M_2, defined as follows:

• M_1 = ⟨W_1, R_1, µ_1, σ_1, V_1⟩, W_1 = {w_1}, R_1 = {(w_1, w_1)}, µ_1(w_1)(w_1) = 1, σ_1(w_1)(X) = 0.6 for all X ⊆ W_1, and V_1(p) = W_1,
• M_2 = ⟨W_2, R_2, µ_2, σ_2, V_2⟩, W_2 = {w_2}, R_2 = {(w_2, w_2)}, µ_2(w_2)(w_2) = 1, σ_2(w_2)(X) = 0.4 for all X ⊆ W_2, and V_2(p) = W_2.

One can show by induction on the complexity of ϕ that for all ϕ ∈ L − {S}: M_1, w_1 ⊨ ϕ iff M_2, w_2 ⊨ ϕ. But it also holds that M_1, w_1 ⊨ S(p) ≥ 0.5, while M_2, w_2 ⊭ S(p) ≥ 0.5.

The distinction between the original and the updated model corresponds exactly to the distinction between prior and posterior notions that was introduced in the previous section. Hence, the definition σ^ϕ(w)(X) = 1 − µ(w)(X) defines posterior surprise in terms of prior probability. As a consequence, all the properties of probability are manifested in the posterior surprise measure (recall Footnote 12):

Lemma 4.
Consider an arbitrary surprise model M = ⟨W, R, µ, σ, V⟩ and formula ϕ ∈ L, and suppose that M, w ⊨ ϕ for some w ∈ W. For all w ∈ W^ϕ and X ⊆ Y ⊆ W^ϕ, it holds that σ^ϕ(w)(X) ≥ σ^ϕ(w)(Y) and that σ^ϕ(w)(W − X) = 1 − σ^ϕ(w)(X).

Proof. Both items follow immediately from the definition of σ^ϕ and the fact that µ(w) is a probability mass function. For example, if X ⊆ Y, then µ(w)(X) ≤ µ(w)(Y), and hence σ^ϕ(w)(X) = 1 − µ(w)(X) ≥ 1 − µ(w)(Y) = σ^ϕ(w)(Y).

Before moving to the logic’s proof theory, I will illustrate and justify its semantics by discussing a simple example in full detail.

Example 1. Consider the following scenario. Mary does not know whether it is currently snowing. In fact, it is indeed currently snowing, but since Mary does not yet know about this, she experiences no surprise about it whatsoever. Furthermore, since it is July and Mary knows that snow in July is very rare at her current location, she considers it very unlikely that it is currently snowing. This example can be formalized using the following surprise model: M = ⟨W, R, µ, σ, V⟩, W = {w, v}, R = W × W, µ(w)(w) = µ(v)(w) = 0.05, µ(w)(v) = µ(v)(v) = 0.95, V(p) = {w}, and σ(w)(X) and σ(v)(X) undefined for all X ⊆ W. (Note that we have followed the heuristic rule Heur discussed above.) The proposition letter p represents ‘it is snowing’; the state w represents the actual world. This model is a faithful representation of the scenario described above; for example:

M, w ⊨ ¬Kp ∧ ¬K¬p ∧ P(p) = 0.05 ∧ P(¬p) = 0.95 ∧ S(p) = 0.

Now suppose that Mary goes outside and sees that it is actually snowing. This can be modeled as a public announcement of p (recall Footnote 9). Applying Definition 4.4, we obtain the updated model M|p, with W^p = {w}, R^p = {(w, w)},

µ^p(w)([[p]]^{M|p}) = µ^p(w)(w) = µ(w)(w) / µ(w)([[p]]^M) = µ(w)(w) / µ(w)(w) = 1,
σ^p(w)([[p]]^{M|p}) = σ^p(w)({w}) = 1 − µ(w)({w}) = 1 − 0.05 = 0.95.

Using this updated model M|p, we find that

M, w ⊨ [!p](Kp ∧ P(p) = 1 ∧ P(¬p) = 0 ∧ S(p) = 0.95).

So after going outside, Mary comes to know that it is in fact snowing. She also adjusts her probabilities: she is now certain that it is snowing; i.e. she assigns probability 1 to p being true and probability 0 to p being false. These are the main cognitive effects of Mary’s observation that it is snowing. However, on the emotional side, she is also highly surprised to find out that it is snowing, because she initially considered this highly unlikely. These are the results that one would intuitively expect, so the semantic setup introduced above seems to yield an adequate representation of (the interactions between) the cognitive (epistemic and probabilistic) and emotional (surprise) effects of a public announcement.

4.2 Axiomatization
I now turn to the logic’s proof theory. Reduction axioms are equivalences which allow us to push the public announcement operator through any of the other connectives, thus yielding an effective procedure to rewrite any formula as an equivalent formula that does not contain any dynamic operators. The reduction axioms for all operators of L − {S} are well-known (Kooi 2003, van Ditmarsch et al. 2007), cf. items 1–5 of Definition 4.6 below. What about reduction axioms for S? Recall that in Subsection 3.2 I suggested a dynamified (and temporally coherent!) version (9) of Macedo and Cardoso’s original (1). With only minor modifications,13 this suggestion can be turned into reduction axioms for S; cf. items 6–7 below.

Definition 4.6. The reduction axioms for public announcement:

1. [!ϕ]p ↔ ϕ → p (for p ∈ Prop)
2. [!ϕ]¬ψ ↔ ϕ → ¬[!ϕ]ψ
3. [!ϕ](ψ_1 ∧ ψ_2) ↔ [!ϕ]ψ_1 ∧ [!ϕ]ψ_2
4. [!ϕ]Kψ ↔ ϕ → K[!ϕ]ψ
5. [!ϕ] Σ_i c_i P(ψ_i) ≥ c ↔ ϕ → Σ_i c_i P(⟨!ϕ⟩ψ_i) ≥ c P(ϕ)
6. [!ϕ]S(ψ) ≥ c ↔ ϕ → P(⟨!ϕ⟩ψ) ≤ 1 − c
7. [!ϕ]S(ψ) ≤ c ↔ ϕ → P(⟨!ϕ⟩ψ) ≥ 1 − c

We are now ready to axiomatize the logic of surprise.

Definition 4.7.
SURPRISE is the logic axiomatized as follows:
• all of propositional logic,
• S5 for the knowledge operator K,
• three sets of axioms which are discussed in detail in (Demey and Sack forthcoming):
  – the Kolmogorov axioms for the probability operator P,
  – auxiliary axioms for linear inequalities,
  – the axioms Kϕ → P(ϕ) = 1 and ϕ → P(ϕ) > 0 (these correspond to conditions (i) and (ii) in Definition 4.1),
• auxiliary axioms and rules for the surprise operator S:
  – S(ϕ) ≥ 0,
  – S(ϕ) > 0 → S(ϕ) ≤ 1,
  – ¬(S(ϕ) ≤ k ∧ S(ϕ) ≥ k′) for all k < k′,
  – S(ϕ) > 0 → (S(ϕ) ≥ k ∨ S(ϕ) ≤ k),
  – if ⊢ ϕ ↔ ψ then ⊢ S(ϕ) ≷ c ↔ S(ψ) ≷ c (for ≷ ∈ {≥, ≤}),
• necessitation for public announcement: if ⊢ ψ, then ⊢ [!ϕ]ψ,
• the reduction axioms for public announcement described in Definition 4.6.

This logic is sound and complete with respect to C_S (see Demey forthcoming b, for a proof). Note that the static axioms for surprise are all concerned with the technical details of this particular formalization of surprise (such as the totality of ≥), rather than with any substantial properties of surprise itself. The only substantial axioms for surprise are thus its reduction axioms (items 6–7 of Definition 4.6), which together constitute a dynamified version of Macedo and Cardoso’s original definition (1). I take this to be a clear manifestation of the essentially dynamic nature of surprise in the axiomatization of the logic.

13 Trivial modifications are that the statement about = needs to be ‘split out’ into statements about ≤ and ≥, and that in the reduction axioms the argument of S should be an arbitrary formula ψ, and not just ϕ itself. A more serious modification is that the right sides of the reduction axioms should not contain simply P(ψ), but rather P(⟨!ϕ⟩ψ), to ‘pre-encode’ the effect of the public announcement.
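To make the semantics behind reduction axioms 6 and 7 concrete, the update from Example 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the paper's formal definition: the dictionary-based model and the names `announce`, `new_mu` and `new_sigma` are hypothetical, the announced formula is represented directly by its truth set, and exact fractions stand in for the probabilities 0.05 and 0.95.

```python
from fractions import Fraction


def announce(worlds, mu, truth_set):
    """Update a model by the public announcement of a formula.

    worlds    -- list of state names
    mu        -- mu[w][v] is the prior probability that state w assigns to v
    truth_set -- set of states where the announced formula is true

    Returns the surviving states, the conditioned probabilities, and a
    function giving posterior surprise as 1 - prior probability.
    """
    new_worlds = [w for w in worlds if w in truth_set]
    new_mu = {}
    for w in new_worlds:
        # Prior probability of the announced formula, assumed to be positive.
        total = sum(mu[w][v] for v in new_worlds)
        new_mu[w] = {v: mu[w][v] / total for v in new_worlds}

    def new_sigma(w, X):
        # Posterior surprise about X is computed from the PRIOR probability of X.
        return 1 - sum(mu[w][v] for v in X)

    return new_worlds, new_mu, new_sigma


# Example 1: w is the actual (snowing) state, v the non-snowing state.
worlds = ["w", "v"]
prior = {s: {"w": Fraction(1, 20), "v": Fraction(19, 20)} for s in worlds}
p_states = {"w"}  # V(p) = {w}

new_worlds, new_mu, new_sigma = announce(worlds, prior, p_states)
prob_p_after = new_mu["w"]["w"]           # posterior probability of p at w
surprise_p_after = new_sigma("w", {"w"})  # posterior surprise about p at w
```

Here `prob_p_after` equals 1 and `surprise_p_after` equals 19/20, matching the values 1 and 0.95 computed in Example 1. Keeping the prior `mu` inside `new_sigma` mirrors the way the right-hand sides of axioms 6 and 7 refer to the probability before the announcement.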
4.3 Some interesting modeling results

I will now show that the logical system developed in the previous subsections is able to capture several properties of surprise. However, there is one technical caveat. Recall that ϕ can only be publicly announced if it is true before the announcement. It is natural to assume that ϕ will still be true after the announcement. However, because public announcements take into account higher-order information, it might happen that ϕ, simply by being announced, becomes false. A typical example is ϕ = p ∧ ¬Kp. If no such ‘self-falsifying’ effects occur, ϕ is called successful:

Definition 4.8. A formula ϕ ∈ L is called successful iff ⊨ [!ϕ]ϕ.

When modeling ‘real-life’ scenarios in a single-agent setting, formulas typically do not involve higher-order information,14 so at least from this modeling perspective, the assumption of successfulness in many of the propositions below is quite harmless. I now turn to the first concrete result.

Proposition 1. The following formula is satisfiable:

(ϕ ∧ ¬Kϕ ∧ P(ϕ) = 0.2 ∧ S(ϕ) = 0) ∧ ⟨!ϕ⟩(Kϕ ∧ P(ϕ) = 1 ∧ S(ϕ) = 0.8) ∧ ⟨!ϕ⟩⟨!ϕ⟩(Kϕ ∧ P(ϕ) = 1 ∧ S(ϕ) = 0).

Proof. Consider M := ⟨W, R, µ, σ, V⟩, with W = {w, v}, R = W × W, V(p) = {w}, µ(w)(w) = 0.2, µ(w)(v) = 0.8 and σ(w)(X) and σ(v)(X) undefined for all X ⊆ W (all components which have not been mentioned are irrelevant, and can thus be assigned values at random). One can easily check that this is a surprise model, and that the formula stated above (with ϕ instantiated to p) is true at M, w. Finally, note that M ∈ C_S^*, i.e. we have followed the heuristic rule Heur.

Proposition 1 shows that the logic is capable of doing what it was designed to do, viz. explicitly representing surprise dynamics. It describes the following scenario. Initially, ϕ is true, but the agent does not know this. Furthermore, she assigns rather low prior probability to it (and thus does not expect its announcement).
However, because she does not yet know that ϕ is actually true, she experiences no surprise about it whatsoever. Next, the unexpected announcement of ϕ occurs, and three things happen: (i) the agent comes to know that ϕ, (ii) she processes this new information by Bayesian conditionalization and thus assigns probability 1 to it, and (iii) she experiences a very high degree of surprise about ϕ (inversely correlated to the low probability that she initially assigned to it). After another announcement of ϕ, the agent’s knowledge and probabilities are not changed; however, because this second announcement was no longer unexpected (after all, in the meanwhile she has come to know that ϕ), her surprise about ϕ drops again to 0. The formula in Proposition 1 captures this scenario in a very natural way, using nested public announcement operators to explicitly represent the successive layers of surprise dynamics.

We now turn to Proposition 2 below. This says that an occurrence of ϕ can lead to surprise about ϕ itself, but also about all of its consequences. For example, it follows from items 1 and 2 that if an agent assigns probability 0.2 to p ∧ q, then after the announcement of this conjunction, she is surprised with intensity 0.8 about p ∧ q, but also about p and q individually. Items 3 and 4 are trivial consequences of 1 and 2; they are mentioned to highlight the subtleties of unsuccessful formulas: if ϕ is not assumed to be successful, then 4 continues to hold, but 3 doesn’t.

14 In a single-agent setting one is typically surprised about ‘facts of nature’, not about one’s epistemic attitudes about such facts. In a multi-agent setting, however, it would be natural to have scenarios like “Alice was surprised when finding out that Bob knows that ϕ”.

Proposition 2. Assume that ϕ ∈ L is successful, and that ⊨ ϕ → ψ. Then:
1. ⊨ P(ϕ) ≥ c → [!ϕ]S(ψ) ≤ 1 − c,
2. ⊨ P(ϕ) ≤ c → [!ϕ]S(ψ) ≥ 1 − c,
3. ⊨ P(ϕ) ≥ c → [!ϕ]S(ϕ) ≤ 1 − c,
4. ⊨ P(ϕ) ≤ c → [!ϕ]S(ϕ) ≥ 1 − c.

Proof.
Straightforward applications of the semantics.

The fact that an occurrence of ϕ can lead to surprise about its consequences presupposes that the agent is actually able to draw those consequences (if the agent did not realize that ψ is a logical consequence of ϕ, then an unexpected occurrence of ϕ would cause her to be surprised about ϕ, but not about ψ). In other words, Proposition 2 shows that the logical system assumes the agent to be logically omniscient.15 An even clearer illustration of this assumption is provided by item 1 of Proposition 3 below, which says that the agent is never surprised about semantic validities. Similarly, items 2 and 3 say that if an agent already knows ϕ, or assigns probability 1 to it, then she will not be surprised about it. These principles are clearly false for actual human beings, who are not logically omniscient, and can e.g. be genuinely surprised upon learning (that some formula is actually) a tautology; rather, the main importance of item 1 is that it elucidates Wittgenstein’s famous anti-psychologistic claim that “there can never be surprises in logic” (Proposition 6.1251, Wittgenstein 1922).

Proposition 3. Assume ϕ ∈ L is successful. Then:
1. if ⊨ ϕ, then ⊨ [!ϕ]S(ϕ) = 0,
2. ⊨ P(ϕ) = 1 → [!ϕ]S(ϕ) = 0,
3. ⊨ Kϕ → [!ϕ]S(ϕ) = 0.

Proof. Straightforward applications of the semantics.

15 This also illustrates the thoroughly epistemic character of surprise: the problem of logical omniscience originally is a problem for epistemic logic, but it automatically carries over into the surprise component.

I will finish this subsection by proving two more substantial results, both of which illustrate how important empirical properties of surprise can be obtained as semantic validities of the logical system.

Proposition 4. Assume ϕ ∈ L is successful. Then for all n ≥ 2, we have:16

⊨ [!ϕ]^n S(ϕ) = 0.

Proof.
First of all, note that since ϕ is successful, it holds that ⊨ ϕ ↔ ⟨!ϕ⟩ϕ; call this principle (†). Consider an arbitrary surprise model M = ⟨W, R, µ, σ, V⟩ and state w, and assume that M, w ⊨ ϕ. For any n ≥ 0, we abbreviate

⟨W^n, R^n, µ^n, σ^n, V^n⟩ = M^n := (···((M^ϕ)^ϕ)···)^ϕ (n times).

Let’s now show that M, w ⊨ [!ϕ]^{n+1} P(ϕ) = 1 for all n ≥ 0. This follows directly from the following calculation:

µ^{n+1}(w)([[ϕ]]^{M^{n+1}}) = µ^{n+1}(w)([[⟨!ϕ⟩ϕ]]^{M^n})
  = µ^{n+1}(w)([[ϕ]]^{M^n})   (†)
  = µ^n(w)([[ϕ]]^{M^n} | [[ϕ]]^{M^n}) = 1.   (Definition 4.4)

We now use this to justify the (‡)-labeled step in the following calculation:

σ^{n+2}(w)([[ϕ]]^{M^{n+2}}) = σ^{n+2}(w)([[⟨!ϕ⟩ϕ]]^{M^{n+1}})
  = σ^{n+2}(w)([[ϕ]]^{M^{n+1}})   (†)
  = 1 − µ^{n+1}(w)([[ϕ]]^{M^{n+1}})   (Definition 4.4)
  = 1 − 1 = 0.   (‡)

This shows that M, w ⊨ [!ϕ]^{n+2} S(ϕ) = 0 for all n ≥ 0.

Informally speaking, Proposition 4 says that after two public announcements of ϕ, the agent is no longer surprised about ϕ. It thus nicely captures the transitory nature of surprise, which was discussed in Subsection 2.1. Furthermore, the proof closely resembles the informal explanation which was given there: the first announcement of ϕ causes the agent to update her probabilities and to assign probability 1 to ϕ, so that the second (and subsequent) announcement is no longer unexpected, and thus no longer surprising.17

16 [!ϕ]^n is defined inductively: [!ϕ]^0 ψ := ψ, and [!ϕ]^{n+1} ψ := [!ϕ][!ϕ]^n ψ.

17 The fact that surprise intensity drops to 0 after only two announcements is no problem for Proposition 4, even though for most real subjects this drop happens more gradually and requires several more repetitions (Charlesworth 1964). The more gradual decrease in surprise intensity is the consequence of personal and coincidental factors, such as intelligence and fatigue. Both the informal explanation in Subsection 2.1 and Proposition 4 make abstraction of such factors, and predict that the drop in surprise intensity will already happen after the second repetition.
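The two calculations in the proof of Proposition 4 can be traced numerically. The following is a minimal sketch, assuming that ϕ is true and successful and that only the agent's probability for ϕ at the actual state needs to be tracked; the function name `surprise_trace` is hypothetical, and the scalar bookkeeping is a simplification of the model-based definitions rather than a replacement for them.

```python
from fractions import Fraction


def surprise_trace(prior_prob, n_announcements):
    """Surprise about phi after each of n successive announcements of phi.

    Assumes phi is true and successful, with prior probability prior_prob > 0.
    """
    prob = Fraction(prior_prob)
    trace = []
    for _ in range(n_announcements):
        # New surprise = 1 - probability of phi just before this announcement.
        trace.append(1 - prob)
        # Conditioning on phi makes phi certain from now on.
        prob = Fraction(1)
    return trace


# Prior probability 0.2, as in Proposition 1, announced three times.
trace = surprise_trace(Fraction(1, 5), 3)
```

With prior probability 1/5, `trace` is `[4/5, 0, 0]`: surprise 0.8 after the first announcement, and 0 from the second announcement onwards, as in Propositions 1 and 4.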
Finally, Proposition 5 says that if an occurrence of (a public announcement of) ϕ leads an agent to change her probability of ψ from a to b in a non-trivial18 fashion, then she will experience at least some surprise about ψ. In other words: surprise is a necessary condition for belief revision (in the current framework: probability revision). This is perfectly in line with the cognitive-psychoevolutionary theory of surprise described in Subsection 2.1, which holds that surprise is part of a sequence of processes triggered by an unexpected event; the final stage of this sequence is typically a process of belief revision.

Proposition 5. Consider ϕ, ψ ∈ L and suppose that ⊨ ¬ψ → [!ϕ]¬ψ. Then

⊨ (P(ψ) = a ∧ [!ϕ]P(ψ) = b ∧ a ≠ b) → [!ϕ]S(ψ) > 0.

Proof. Consider an arbitrary surprise model M = ⟨W, R, µ, σ, V⟩ and state w, and assume that the antecedent of the formula above is true at M, w. For a reductio, assume that M, w ⊭ [!ϕ]S(ψ) > 0. Then it follows that

0 = σ^ϕ(w)([[ψ]]^{M^ϕ}) = σ^ϕ(w)([[⟨!ϕ⟩ψ]]^M) = 1 − µ(w)([[⟨!ϕ⟩ψ]]^M),

and thus µ(w)([[⟨!ϕ⟩ψ]]^M) = 1. From the assumption that ⊨ ¬ψ → [!ϕ]¬ψ in the statement of the proposition, it follows that [[⟨!ϕ⟩ψ]]^M ⊆ [[ψ]]^M, and thus

1 = µ(w)([[⟨!ϕ⟩ψ]]^M) ≤ µ(w)([[ψ]]^M) = a,

so a = 1. Since ⊨ ⟨!ϕ⟩ψ → ϕ, we similarly get that µ(w)([[ϕ]]^M) = 1, and hence

b = µ^ϕ(w)([[ψ]]^{M^ϕ}) = µ^ϕ(w)([[⟨!ϕ⟩ψ]]^M) = µ(w)([[⟨!ϕ⟩ψ]]^M) / µ(w)([[ϕ]]^M) = 1/1 = 1.

We thus have a = 1 = b, which contradicts the assumption that a ≠ b.

Corollary 1. For any ϕ ∈ L, it holds that

⊨ (P(ϕ) = a ∧ [!ϕ]P(ϕ) = b ∧ a ≠ b) → [!ϕ]S(ϕ) > 0.

Proof. It always holds that ⊨ ¬ϕ → [!ϕ]¬ϕ, so by putting ψ = ϕ, the condition of Proposition 5 is always satisfied.

18 This non-triviality requirement is captured by the condition that ⊨ ¬ψ → [!ϕ]¬ψ, i.e. the public announcement of ϕ should not turn any ¬ψ-states into ψ-states. In other words, the change of P(ψ) from a to b is non-trivial if [[ψ]]^M does not grow.
(If [[ψ]]^M grows, then it is trivial that the value of P(ψ) might change: if A ⊆ B, then P(A) ≤ P(B).) Intuitively, exactly the same argument can be made about [[ψ]]^M shrinking rather than growing (i.e. about the requirement that ψ → [!ϕ]ψ), but it turns out that this second requirement is technically speaking not necessary for Proposition 5 to hold. This disanalogy is similar to the disanalogy between items 3 and 4 of Proposition 2.

4.4 A Lockean thesis for surprise

The current framework allows us to express statements such as ‘the agent is surprised about ϕ with intensity 0.8’. In many natural cases, however, we might want to say that the agent is surprised about ϕ, without wishing to commit ourselves to some particular value for her surprise intensity. This is entirely analogous to the epistemic cases, where we might sometimes want to say that the agent believes that ϕ, without committing ourselves to some particular degree of belief. A widespread proposal is to define ‘belief’ as ‘sufficiently high degree of belief’; this proposal is called the Lockean thesis (Demey forthcoming a):

Bϕ :≡ P(ϕ) ≥ τ   (10)

(here, τ ∈ (0.5, 1) is some threshold value). Because of the high similarity between the epistemic case and the surprise case, it seems natural to apply the Lockean thesis also to surprise. In other words, we will introduce a ‘qualitative’ surprise operator by saying that the agent is surprised about ϕ iff she is surprised about ϕ with some sufficiently high intensity. I will argue below that the most natural choice for the value of the surprise intensity threshold is τ, i.e. the same value as the degree of belief threshold. Formally, the Lockean thesis for surprise thus looks as follows:

Sϕ :≡ S(ϕ) ≥ τ.   (11)

Principles (10) and (11) allow us to talk about an agent’s ‘qualitative’ beliefs and surprises.
Furthermore, since both principles make use of the same threshold value τ, there is a natural connection between the two operators they define. Proposition 6 says that after an announcement of ϕ, the agent will be surprised about ψ iff (assuming that ϕ is true) she initially believed that ψ would be false then. This qualitative observation is in line with the cognitive-psychoevolutionary theory of surprise described in Subsection 2.1, which holds that surprise stems from a conflict between unexpected data and a previously held belief.

Proposition 6. [!ϕ]Sψ ←→ ϕ → B[!ϕ]¬ψ.

Proof. Consider the reduction axiom for surprise formulas:

[!ϕ]S(ψ) ≥ τ ←→ ϕ → P(⟨!ϕ⟩ψ) ≤ 1 − τ.   (12)

We have the following chain of SURPRISE-equivalences:

P(⟨!ϕ⟩ψ) ≤ 1 − τ ↔ 1 − P(⟨!ϕ⟩ψ) ≥ τ
  ↔ P(¬⟨!ϕ⟩ψ) ≥ τ
  ↔ P([!ϕ]¬ψ) ≥ τ

and thus (12) can be rewritten as [!ϕ]S(ψ) ≥ τ ←→ ϕ → P([!ϕ]¬ψ) ≥ τ. Applying (11) and (10) to the left- and right-hand sides, respectively, yields the desired result.

5 Conclusion

In this paper I have presented a new analysis of surprise in the framework of probabilistic dynamic epistemic logic. This analysis is based on current psychological theories, and as a result, several experimentally observed aspects of surprise can be derived as theorems within the logical system (recall, for example, Proposition 5 on the role of surprise in belief revision). Furthermore, being based on the contemporary ‘lingua franca’ of (dynamic) epistemic logic, it offers a natural, well-understood and highly expressive language for the formal description of agent architectures (cf. Proposition 1).

Most importantly, however, the analysis naturally captures the dynamic nature of surprise. This is clearly manifested in the logic’s semantics (the surprise measures σ(w) are not required to satisfy any static properties) as well as in its proof theory (the only substantial axioms for surprise are its reduction axioms).
These reduction axioms jointly constitute a temporally coherent definition of surprise, in contrast to earlier, temporally incoherent formalizations such as Macedo and Cardoso’s and Lorini and Castelfranchi’s. This temporal coherence has several advantages. First and foremost, by explicitly distinguishing between prior and posterior notions, the proposed analysis is able to reach a high level of conceptual hygiene. This conceptual advantage also yields additional empirical benefits: the new analysis can capture important aspects of surprise that are not covered by earlier frameworks, such as its transitory nature (cf. Proposition 4).19

19 Unsurprisingly, the aspect of transitoriness is itself of a highly dynamic character, involving repeated occurrences of the unexpected event.

Acknowledgements

A longer version of this paper will be published as (Demey forthcoming b); however, Subsection 4.4 is new. Earlier versions of this paper were presented at the LIRa seminar (ILLC, Amsterdam, October 2012), the Reasoning Club PhD conference (Brussels, September 2012), LOFT 10 (Sevilla, June 2012) and a workshop on modal logic (Brussels, May 2012). I would like to thank the audiences of these talks for their helpful remarks and suggestions. In particular, I would like to thank Alexandru Baltag, Johan van Benthem, Jan van Eijck, Jan Heylen, Emiliano Lorini, Alexandru Marcoci, Ahti-Veikko Pietarinen, Sonja Smets and Jean Paul Van Bendegem for their feedback on earlier versions of this paper. This research was supported by a PhD fellowship of the Research Foundation – Flanders (FWO).

References

J. van Benthem. Exploring Logical Dynamics. CSLI Publications, Stanford, CA, 1996.
J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, Cambridge, 2011.
J. van Benthem, J. Gerbrandy, and B. P. Kooi. Dynamic update with probabilities. Studia Logica, 93:67–96, 2009.
W. R. Charlesworth.
Instigation and maintenance of curiosity behavior as a function of surprise versus novel and familiar stimuli. Child Development, 35:1169–1186, 1964.
T. Y. Chow. The surprise examination or unexpected hanging paradox. American Mathematical Monthly, 105:41–51, 1998.
L. Demey. Agreeing to disagree in probabilistic dynamic epistemic logic. Master’s thesis, ILLC, Universiteit van Amsterdam, Amsterdam, 2010.
L. Demey. Believing in Logic and Philosophy. PhD thesis, KU Leuven, 2013.
L. Demey. Contemporary epistemic logic and the Lockean thesis. Foundations of Science, forthcoming a.
L. Demey. The dynamics of surprise. Logique et Analyse, forthcoming b.
L. Demey. Agreeing to disagree in probabilistic dynamic epistemic logic. Synthese, forthcoming c.
L. Demey and J. Sack. Epistemic probabilistic logic. In H. van Ditmarsch, J. Y. Halpern, W. van der Hoek, and B. Kooi, editors, Handbook of Logics for Knowledge and Belief. College Publications, London, forthcoming.
H. van Ditmarsch, W. van der Hoek, and B. P. Kooi. Dynamic Epistemic Logic. Springer, Dordrecht, 2007.
R. Fagin and J. Halpern. Reasoning about knowledge and probability. Journal of the ACM, 41:340–367, 1994.
B. P. Kooi. Probabilistic dynamic epistemic logic. Journal of Logic, Language and Information, 12:381–408, 2003.
E. Lorini. Agents with emotions: A logical perspective. Association for Logic Programming Newsletter, 21(2–3):1–9, 2008.
E. Lorini and C. Castelfranchi. The unexpected aspects of surprise. Int. J. of Pattern Rec. and AI, 20:817–833, 2006.
E. Lorini and C. Castelfranchi. The cognitive structure of surprise: Looking for basic principles. Topoi, 26:133–149, 2007.
L. Macedo and A. Cardoso. Creativity and surprise. In G. Wiggins, editor, Proc. of the AISB ’01 Symposium on Creativity in Arts and Science, pages 84–92, York, 2001a. The Society for the Study of Artificial Intelligence and Simulation Behaviour.
L. Macedo and A. Cardoso. Modelling forms of surprise in an artificial agent. In J.
Moore and K. Stenning, editors, Proc. of the 23rd An. Conf. of the Cog. Sci. Soc., pages 588–593, Edinburgh, 2001b. Erlbaum.
L. Macedo and A. Cardoso. Exploration of unknown environments with motivational agents. In N. Jennings and M. Tambe, editors, Proc. of the Third Int. Joint Conf. on Autonomous Agents and MAS, pages 328–335, New York, NY, 2004. IEEE Computer Society.
L. Macedo, A. Cardoso, and R. Reisenzein. Modeling forms of surprise in artificial agents: Empirical and theoretical study of surprise functions. In K. Forbus, D. Gentner, and T. Regier, editors, Proc. of the 26th An. Conf. of the Cog. Sci. Soc., pages 588–593, Mahwah, NJ, 2004. Erlbaum.
L. Macedo, A. Cardoso, and R. Reisenzein. A surprise-based agent architecture. In R. Trappl, editor, Proc. of the 18th European Meeting on Cybernetics and Systems Research, pages 583–588, Vienna, 2006. Austrian Society for Cybernetic Studies.
L. Macedo, A. Cardoso, R. Reisenzein, E. Lorini, and C. Castelfranchi. Artificial surprise. In J. Vallverdú and D. Casacuberta, editors, Handbook of Research on Synthetic Emotions and Sociable Robotics: New Applications in Affective Computing and AI, pages 267–291. IGI Global, Hershey, PA, 2009.
L. Macedo, R. Reisenzein, and A. Cardoso. Surprise and anticipation in learning. In N. M. Seel, editor, Encyclopedia of the Sciences of Learning, pages 3250–3253. Springer, New York, NY, 2012.
W.-U. Meyer, R. Reisenzein, and A. Schützwohl. Towards a process analysis of emotions: The case of surprise. Motivation and Emotion, 21:251–274, 1997.
R. Reisenzein and W.-U. Meyer. Surprise. In D. Sander and K. R. Scherer, editors, Oxford Companion to the Affective Sciences, pages 386–387. Oxford University Press, Oxford, 2009.
R. Reisenzein, W.-U. Meyer, and A. Schützwohl. Reacting to surprising events: A paradigm for emotion research. In N. Frijda, editor, Proc. of the 9th Conf. of the Int. Soc.
for Research on Emotions, pages 292–296, Toronto, 1996. ISRE.
J. Sack. Extending probabilistic dynamic epistemic logic. Synthese, 169:241–257, 2009.
J. Stiensmeier-Pelster, A. Martini, and R. Reisenzein. The role of surprise in the attribution process. Cognition and Emotion, 9:5–31, 1995.
W. Talbott. Bayesian epistemology. In E. N. Zalta, editor, Stanford Encyclopedia of Philosophy. CSLI, Stanford University, Stanford, CA, 2008.
L. Wittgenstein. Tractatus Logico-Philosophicus. Routledge and Kegan Paul, London, 1922.

A Geo-logical Solution to the Lottery Paradox, with Applications to Conditional Logic
Hanti Lin and Kevin T. Kelly
Australian National University, Carnegie Mellon University
Abstract

We defend a set of acceptance rules that avoids the lottery paradox, that is closed under classical entailment, and that accepts uncertain propositions without ad hoc restrictions. We show that the rules we recommend provide a semantics that validates exactly Adams’ conditional logic and are exactly the rules that preserve a natural, logical structure over probabilistic credal states that we call probalogic. To motivate probalogic, we first expand classical logic to geo-logic, which fills the entire unit cube, and then we project the upper surfaces of the geo-logical cube onto the plane of probabilistic credal states by means of standard, linear perspective, which may be interpreted as an extension of the classical principle of indifference. Finally, we apply the geometrical/logical methods developed in the paper to prove a series of trivialization theorems against question-invariance as a constraint on acceptance rules and against rational monotonicity as an axiom of conditional logic in situations of uncertainty.

1 The Lottery Paradox

If Bayesians are right, one’s credal state should be a probability measure p over propositions, where probabilities represent degrees of belief. It seems that one also accepts propositions in light of p. Acceptance of proposition A is sometimes portrayed as a momentous inference making A certain, in the sense that one would bet one’s life against nothing that A is true (e.g., Levi 1967). But that extreme standard would eliminate almost all ordinary examples of accepted propositions. We therefore entertain a more modest view of acceptance, according to which the set of propositions accepted in light of p should, in some sense, aptly capture some characteristics of p to others or, in everyday cognition, to ourselves.
That view is non-inferential in the sense that p is not conditioned on the propositions accepted, but it is inferential in another sense: the accepted propositions may serve as premises in arguments whose conclusions are also accepted in the same, weak sense.

It seems that high probability short of full certainty suffices for acceptance, a view now referred to as the Lockean thesis. But the Lockean rule licenses acceptance of inconsistent sets of propositions, however high the threshold r < 1 is set. For there exists a fair lottery with more than 1/(1 − r) tickets. It is accepted that some ticket wins, since that proposition carries probability 1. But for each ticket, it is also accepted that the ticket loses, since that proposition has probability greater than r. So an inconsistent set of propositions is accepted. That is Henry Kyburg’s lottery paradox (Kyburg 1961).

To elude the paradox, one must abandon either the full Lockean thesis or classical consistency. Kyburg pursued the second course by rejecting the classical inference rule that A, B jointly imply A ∧ B, so that the collection of propositions of form “ticket i does not win” does not entail “no ticket wins”. Most responses side with classical logic and constrain the Lockean thesis in some manner to avoid contradictions. For example, Jeffrey (1970) recommended that the entire practice of acceptance be abandoned in favor of reporting probabilities. Levi (1967) rejected the idea that acceptance can be based on probability alone, since utilities should also be consulted. Or one may impose as a necessary condition that accepted propositions be certain (van Fraassen 1995, Arló-Costa and Parikh 2005), or restrict acceptance to cases in which no logical contradiction happens to result (Pollock 1995, Ryan 1996, Douven 2002).

Our approach is different. Instead of restricting the Lockean thesis, we revise it.
In particular, we defend an unrestricted rule of acceptance that is contradiction-free and yet capable of accepting uncertain propositions, even propositions of fairly low probability. Like the Lockean rule, the proposed rule has a parameter that controls its strictness. When the parameter is tuned toward 1, the proposed rule is almost indistinguishable from the classical logical closure of the Lockean rule; but as the parameter drops toward 0, the proposed rule’s geometry shifts steadily away from that of the Lockean rule so as to avert the lottery paradox.

The rule we recommend was invented by Levi (1996, p.286), who saw no justification for it except as a loose approximation to an alternative rule he took to be justified by decision-theoretic means (Levi 1967; 1969).1 We provide a justification of the rule in terms of preservation of logical structure implicit in the space of probabilistic credal states. The crux is to order probabilistic credal states according to relative logical strength, as Boolean algebra does for propositions. We do so in two steps. First, we start with a sigma algebra of propositions (closed under negation and countable disjunction) and then extend that sigma algebra to cover the entire unit cube by introducing a new connective ¬_d interpreted as negation to degree d, so that ¬_0 ϕ is equivalent to ϕ and ¬_1 ϕ is equivalent to ¬ϕ. The resulting logical structure is called geo-logic (Section 4). Next, we view the geo-logical structure in perspective through the picture plane of possible credal states to obtain a logical structure over credal states that we call probalogic (Sections 5 and 6). Then it is natural to require that every acceptance rule preserves probalogical structure when it maps probabilistic credal states to standard, Boolean propositions.

1 Levi writes: “I do not know how to derive it from a view of the cognitive aims of inquiry [i.e. seeking more information and avoiding error] that seems attractive.” (Levi 1996, p.286) We rediscovered the rule as a consequence of our work on Ockham’s razor. The problem was to extend the Ockham efficiency theorem (Kelly 2008) from methods that choose theories to methods that update probabilistic degrees of belief on theories. That required a concept of retraction of credal states, expounded in (Kelly 2010). We thank Teddy Seidenfeld for bringing the prior publication of the rule to our attention.

The requirement that acceptance rules preserve probalogical structure has appealing consequences for the theory of acceptance. First, we show that the rules we recommend are exactly the rules that preserve probalogical structure (Section 7). Moreover, no plausible logical structure on probability measures is preserved by the Lockean rule or its variants (Section 8).

Another justification of the proposed acceptance rules concerns the logic of conditionals and defeasible reasoning. Frank P. Ramsey proposed an influential, epistemic condition for acceptance of conditional statements, now commonly referred to as the Ramsey test:

If two people are arguing ‘If A, then B?’ and are both in doubt as to A, they are adding A hypothetically to their stock of knowledge and arguing on that basis about B; so that in a sense ‘If A, B’ and ‘If A, ¬B’ are contradictories. We can say that they are fixing their degrees of belief in B given A. (Ramsey 1929, footnote 1)2

2 We take the liberty of substituting A, B for p, q in Ramsey’s text.

Suppose that an agent is in a probabilistic credal state p and adopts an acceptance rule. We propose the following interpretation of the Ramsey test: the agent accepts the (flat) conditional ‘if A then B’ when, by the acceptance rule she adopts, she would accept B in the credal state p(·|A) that results from p by conditioning on A. Thus, conditional acceptance is reduced to Bayesian conditioning and acceptance of non-conditional propositions. This natural semantics allows one to characterize the axioms of conditional logic in terms of their geometrical constraints on acceptance rules, in much the same way that axioms of modal logic are standardly characterized in terms of constraints on accessibility relations among worlds. Accordingly, for each of the axioms in Adams’ conditional logic (Adams 1975), we solve for its geometrical constraint on acceptance rules (Section 9). These constraints are shown to be satisfied by the rules that preserve probalogical structure, so the probalogic-preserving rules validate Adams’ logic with respect to the Ramsey test (Section 10). Conversely, Adams’ logic is shown to be complete with respect to the Ramsey test when acceptance follows probalogic-preserving rules (Section 12). The result is a new probabilistic semantics: it defines validity simply as preservation of acceptance, which improves upon Adams’ ε-δ semantics (Adams 1975); and it allows for accepting propositions of low probabilities, which improves upon Pearl’s infinitesimal semantics (Pearl 1989). Thus, the recommended acceptance rules are vindicated both by probalogic and by conditional logic.

One might hope for validating a stronger logic of flat conditionals than Adams’, e.g. system R (Lehmann and Magidor 1992) or, slightly stronger, the AGM axioms for belief revision (Harper 1975, Alchourrón et al. 1985). We close the door on that hope with a new trivialization theorem (Section 11). In light of that result, we propose that Adams’ conditional logic reflects Bayesian ideals better than AGM belief revision does.

Finally, the acceptance rules we recommend are sensitive to framing effects determined by an underlying question. One might hope that the advantages of the proposed rules could be obtained without question-dependence.
Again, we close the door on that hope with a series of trivialization theorems (Sections 13 and 14), employing the geometrical techniques described above. We conclude that, all things considered, the advantages of the recommended acceptance rules within questions justify their dependence on questions.

2 The geometry of the Lottery Paradox

Let $\mathcal{E}_\kappa = \{E_i : i \in I\}$ be a countable collection of mutually exclusive and jointly exhaustive propositions over some underlying set of possibilities, where $\kappa$ (either $\omega$ or some finite $n$) is the cardinality of the index set $I$.3 We think of $\mathcal{E}_\kappa$ as a question in context whose potential answers are the various $E_i$. Let $\mathcal{A}_\kappa$ be the least collection of propositions containing $\mathcal{E}_\kappa$ that is closed under negation and countable disjunction and conjunction, and let $\mathcal{P}_\kappa$ denote the set of all (countably additive) probability measures defined on $\mathcal{A}_\kappa$. We think of $\mathcal{P}_\kappa$ as the space of probabilistic credal states over answers to question $\mathcal{E}_\kappa$. The subscripts are suppressed in the sequel unless we wish to emphasize cardinality.

3 Note that in the mathematics that follows, we never distinguish between questions of a given cardinality, so no confusion results from identifying questions in terms of cardinality.

We assume that acceptance rules produce sets of propositions that are closed under classical entailment so that, without loss of generality, each acceptance rule may be viewed as a map $\alpha : \mathcal{P} \to \mathcal{A}$, where proposition $\alpha(p)$ is understood as the strongest proposition accepted in light of probability measure $p$. Then proposition $A$ is accepted by rule $\alpha$ at credal state $p$, written $p \Vdash_\alpha A$, if and only if $\alpha(p)$ entails $A$. The acceptance zone of $A$ under $\alpha$ is defined as the set of all credal states at which $A$ is accepted by $\alpha$. For example, the Lockean acceptance rule with threshold set to $r$ in the unit interval is just the mapping:

\[ \lambda_r(p) = \bigwedge \{A \in \mathcal{A} : p(A) \ge r\}. \]
(1)

Each probability measure $p$ in $\mathcal{P}$ can be represented with respect to $\mathcal{E}$ as the $\kappa$-dimensional vector $(p(E_i) : i \in I)$ with components in the unit interval summing to one. In the context of question $\mathcal{E}$, we identify $p$ with its vector, so that the $i$-th component $p_i$ equals $p(E_i)$. When $\kappa = 3$, for example, $\mathcal{P}_3$ corresponds to the set of all such 3-vectors, which is the equilateral triangle in $\mathbb{R}^3$ whose corners have Cartesian coordinates $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$ (Figure 1). To avoid ambiguity, we let $(e_i)_j$ pick out the $j$-th component of $e_i$. Reformulate the Lockean rule (1) as follows:4

\begin{align}
\lambda_r(p) &= \bigwedge \{\neg E_i : p(\neg E_i) \ge r \text{ and } i \in I\}; \tag{2}\\
&= \bigwedge \{\neg E_i : p_i \le 1 - r \text{ and } i \in I\}. \tag{3}
\end{align}

Figure 1: The space $\mathcal{P}_3$ of probabilistic credal states

By this formulation, the acceptance zone of $\neg E_1$ under $\lambda_r$ with respect to question $\mathcal{E}_3$ is depicted in Figure 2. The Lockean rule is now expressed geometrically: its acceptance zone for $\neg E_1$ has a definite, trapezoidal shape that results from truncating the triangular space $\mathcal{P}_3$ parallel to one side. As threshold $r$ is dropped, the trapezoid becomes thicker. The acceptance zones of $\neg E_2$ and $\neg E_3$ are included in Figure 3.a. By closure under entailment, proposition $E_1$ is accepted exactly when both $\neg E_2$ and $\neg E_3$ are accepted, so the corner, diamond-shaped zones license acceptance of potential answers to $\mathcal{E}$. When $r \le 2/3$, the propositions $\neg E_1$, $\neg E_2$, $\neg E_3$ are all accepted at the probability measures contained in the small, dark, central triangle (Figure 3.b). But that set of propositions is inconsistent so, by closure under entailment, the dark, central triangle is the acceptance zone of the inconsistent proposition $\bot$. That is just the lottery paradox for thresholds $r \le 2/3$ (interpret $E_i$ as the proposition “ticket $i$ wins”).

Figure 2: Acceptance zone for $E_2 \vee E_3$ under $\lambda_r$
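The collision can be checked numerically. The following is a minimal sketch in Python, assuming a three-answer question whose answers E1, E2, E3 are indexed 0, 1, 2 and using exact rational arithmetic; the function names `lockean_rejected` and `strongest_accepted` are illustrative and are not notation from the text. It applies formulation (3) with r = 2/3 at the uniform credal state and at a skewed one.

```python
from fractions import Fraction as F

def lockean_rejected(p, r):
    """Indices i for which the Lockean rule, in formulation (3), accepts not-E_i."""
    return {i for i, p_i in enumerate(p) if p_i <= 1 - r}

def strongest_accepted(rejected, n):
    """Strongest accepted proposition, as the set of answers left standing.
    The empty set plays the role of the inconsistent proposition."""
    return set(range(n)) - rejected

r = F(2, 3)
uniform = (F(1, 3), F(1, 3), F(1, 3))
skewed = (F(1, 10), F(8, 10), F(1, 10))

# At the uniform state every answer is rejected: the lottery paradox.
lottery = strongest_accepted(lockean_rejected(uniform, r), 3)
# At the skewed state only index 1 (answer E2) survives.
safe = strongest_accepted(lockean_rejected(skewed, r), 3)
```

Exact fractions are used because the boundary case p_i = 1 - r matters here and is fragile under floating-point rounding.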
Geometrically, the lottery paradox arises because the Lockean rule’s acceptance zones for the various propositions $\neg E_i$ crash clumsily into one another as the probability threshold $r$ decreases. It is easy to design alternative acceptance zones that bend progressively as they approach the center of the triangle so that they eventually meet without overlapping, like the leaves of a camera shutter (Figure 4). The proposed acceptance zones are almost indistinguishable from those of the Lockean rule when $r$ is close to 1. As $r$ approaches 0, the bending becomes more pronounced and the lottery paradox is avoided.

4 This is equivalent to the original formulation because, first, every proposition $A$ is equivalent to the conjunction of all propositions of form $\neg E_i$ that are entailed by $A$ and, second, propositions of form $\neg E_i$ that are entailed by $A$ are at least as probable as $A$.

Figure 3: Acceptance zones under $\lambda_r$, panels (a) and (b)

Figure 4: Progressively bent zones that avert collision, panels (a) and (b)

A special, symmetric case of the proposed rule, which we call the symmetric camera shutter rule, modifies the Lockean rule as follows. Test whether answer $E_i$ to $\mathcal{E}$ should be rejected at credal state $p$ by considering, not probability $p_i$ itself, but the probability ratio:

\[ \sigma(p)_i = \frac{p_i}{\max_j p_j}, \]

resulting in the modified rule:

\[ \nu_r(p) = \bigwedge \{\neg E_i : \sigma(p)_i \le 1 - r \text{ and } i \in I\}. \tag{4} \]

The symmetric camera shutter rule is algebraically the same as the Lockean rule (3) except that each probability is divided by the probability of the distribution’s mode. Say that acceptance rule $\alpha$ is everywhere consistent if and only if $p \nVdash_\alpha \bot$ for each $p$ in $\mathcal{P}$, and say that $\alpha$ is non-skeptical if and only if for each $E_i$ in $\mathcal{E}$ there exists $p$ in $\mathcal{P}$ such that $p(E_i) < 1$ and $p \Vdash_\alpha E_i$. Then:

Proposition 1. Suppose that $\mathcal{E}$ contains at least two answers.
The symmetric camera shutter rule $\nu_r$ is everywhere consistent and non-skeptical, for each $r$ such that $0 < r < 1$.

Proof. For everywhere consistency, note that $\sum_i p_i = 1$, so there exists $i \in I$ such that $p_i = \max_j p_j$. Then, since $r > 0$, $\sigma(p)_i = 1 \not\le 1 - r$, so $p \nVdash_{\nu_r} \neg E_i$, by formula (4). It follows that $p \nVdash_{\nu_r} \bot$. For non-skepticism, let $E_i$ be an arbitrary answer; it suffices to show that $E_i$ is accepted by $\nu_r$ at some credal state $p$ such that $p_i < 1$. Let $p_i = 1/(2 - r)$. Since $\mathcal{E}$ contains at least two answers, choose $j$ in $I$ distinct from $i$ and let $p_j = (1 - r)/(2 - r)$. Since a probability distribution is normalized, $p_k = 0$ for all $k \ne i, j$. Note that $p_i$ is the mode of $p$, since $r > 0$. So for each $k \ne i$:

\[ \sigma(p)_k \le 1 - r, \qquad \sigma(p)_i = 1 \not\le 1 - r, \text{ since } r > 0. \]

Hence $p \Vdash_{\nu_r} E_i$, by formula (4), with $p_i = 1/(2 - r) < 1$, since $r < 1$.

On the other hand:

Proposition 2. Suppose that $\mathcal{E}$ is countably infinite. The Lockean rule $\lambda_r$ is either skeptical or somewhere inconsistent, for each $r$ such that $0 \le r \le 1$.

Proof. If the Lockean rule is not skeptical, then $r < 1$, and thus there exists $p$ in $\mathcal{P}_\omega$ such that $p_i \le 1 - r$, for each $i \in I$. So by formula (3), $\lambda_r(p) = \bot$, and hence $\lambda_r$ is somewhere inconsistent.
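Proposition 1 can be illustrated numerically. The sketch below assumes a three-answer question with answers indexed 0, 1, 2 and exact rational arithmetic; the helper names `sigma` and `shutter_accepted` are hypothetical. It evaluates rule (4) at the uniform state, on a coarse grid of credal states, and at the witness state constructed in the proof for the first answer.

```python
from fractions import Fraction as F

def sigma(p):
    """Probability ratios sigma(p)_i = p_i / max_j p_j."""
    mode = max(p)
    return tuple(p_i / mode for p_i in p)

def shutter_accepted(p, r):
    """Answers not rejected by the symmetric camera shutter rule (4);
    their disjunction is the strongest accepted proposition."""
    return {i for i, s_i in enumerate(sigma(p)) if not s_i <= 1 - r}

r = F(2, 3)

# At the uniform state nothing is rejected, so only the tautology is accepted.
at_uniform = shutter_accepted((F(1, 3),) * 3, r)

# Everywhere consistency, checked on a grid of states with denominators 10.
grid = [(F(a, 10), F(b, 10), F(10 - a - b, 10))
        for a in range(11) for b in range(11 - a)]
always_consistent = all(shutter_accepted(p, r) for p in grid)

# Witness from the proof of non-skepticism, for the answer with index 0.
witness = (1 / (2 - r), (1 - r) / (2 - r), F(0))
accepted_at_witness = shutter_accepted(witness, r)
```

A grid check is of course no substitute for the proof; it only shows the rule behaving as the proposition says on sample states.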
3 Respect for logic

The range of acceptance rule $\alpha : \mathcal{P} \to \mathcal{A}$ has a natural, Boolean logical structure:

\[ (\mathcal{A}, \le, \vee, \wedge, \bot, \top), \]

where the partial order $\le$ corresponds to classical entailment or relative strength of propositions and $\vee$ and $\wedge$ are the least upper bound and the greatest lower bound with respect to $\le$, which correspond to the usual propositional operations of disjunction and conjunction.5 If there were also a motivated logical structure on the space:

\[ (\mathcal{P}, \le, \vee, \wedge, \bot, \top) \]

of probabilistic credal states, in which $\le$ is intended, again, to reflect relative strength, then an obvious constraint on acceptance rules would be to preserve logical structure in the sense that:

\begin{align}
p \le q \;&\Rightarrow\; \alpha(p) \le \alpha(q); \tag{5}\\
\alpha(p \vee q) &= \alpha(p) \vee \alpha(q); \tag{6}\\
\alpha(p \wedge q) &= \alpha(p) \wedge \alpha(q); \tag{7}\\
\alpha(e_i) &= E_i; \tag{8}\\
\alpha(\top) &= \top; \tag{9}\\
\alpha(\bot) &= \bot. \tag{10}
\end{align}

5 In algebraic logic, $A \le B$ means that $A$ is at least as strong as $B$.

Any plausible logical structure over $\mathcal{P}$ should also satisfy the following constraint:

the unit vectors $e_i$, for $i \in I$, are exactly the strongest credal states in $\mathcal{P}$. (11)

Then we already have the following assurance against inconsistency:

Proposition 3 (No lottery paradox). Suppose that acceptance rule $\alpha$ and relative strength $\le$ over $\mathcal{P}$ satisfy conditions (5), (8), and (11). Then $\alpha$ is everywhere consistent.

Proof. Suppose for reductio that for some credal state $p$, $\alpha(p) = \bot$. Then, by condition (11), there exists a strongest state $e_i$ such that $e_i \le p$. So $\alpha(e_i) \le \alpha(p)$, by (5). Then by (8), we have that $E_i = \alpha(e_i) \le \alpha(p) = \bot$. So $E_i \le \bot$, which is false in the Boolean logical structure of $\mathcal{A}$.

Therefore, the lottery paradox witnesses the failure of the Lockean rule to preserve logical structure. But the lottery paradox is only the most glaring consequence of the Lockean rule’s disrespect for logical structure.
It is plausible to suppose that with respect to question $\mathcal{E}$, if credal state $p$ accords maximal probability to answer $E_i$, compared to all the alternative answers to $\mathcal{E}$, and if $E_j$ is a distinct answer to $\mathcal{E}$, then credal state $p(\cdot \mid \neg E_j)$ is at least as strong as $p$:

\[ p_i = \max_k p_k \text{ and } E_i \ne E_j \;\Longrightarrow\; p(\cdot \mid \neg E_j) \le p. \tag{12} \]

But then the Lockean rule again fails to preserve relative strength, i.e., it violates condition (5). Recall from Figure 3 that a consistent Lockean rule’s acceptance zone for $E_2$ is a diamond. The diamond has the wrong shape: its sides meet at an angle that is too acute. For consider a credal state $p$ very close to the inner apex of the diamond, as depicted in Figure 5. Let $q = p(\cdot \mid \neg E_3)$. By condition (12), we have that $q \le p$. But point $q$ lies on the side of the triangle opposite $e_3$ because $q_3 = 0$, and $q$ lies on the ray from $e_3$ that passes through $p$ because $q_1/q_2 = p_1/p_2$. So $\lambda(q) = \neg E_3 \not\le E_2 = \lambda(p)$. Therefore, $q \le p$ but $\lambda(q) \not\le \lambda(p)$, which violates (5). That is another, counterintuitive way to fail to preserve logical order even when the lottery paradox does not arise.

Figure 5: Deeper trouble for the Lockean rule

The preceding argument illustrates a further point: intuitions about relative strength of credal states are tied to conditioning. The boundaries of acceptance zones determined by the Lockean rule do not follow the geometrical rays that correspond to the trajectories of probabilistic credal states under conditioning. For that reason, the Lockean rule is a bad choice for trying to explicate the acceptance of conditionals in terms of conditional probabilities. Specifically, consider the following interpretation of the Ramsey test. Let $A$, $B$ be arbitrary propositions in $\mathcal{A}$. Let the flat conditional with antecedent $A$ and consequent $B$ be expressed by $A \Rightarrow B$. (The arrow notation ‘$\Rightarrow$’ does not denote any binary operation in any algebra; so mathematically, $A \Rightarrow B$ is simply the ordered pair $(A, B)$.
Instead, $A \Rightarrow B$ is used to indicate that we are talking about the acceptance condition of a flat conditional with antecedent $A$ and consequent $B$.) We propose that the Ramsey test be explicated by the following acceptance condition for flat conditionals:6

\[ p \Vdash_\alpha A \Rightarrow B \iff p(\cdot \mid A) \Vdash_\alpha B \text{ or } p(A) = 0. \tag{13} \]

So when the antecedent has nonzero probability, this semantic rule says that flat conditional $A \Rightarrow B$ is accepted at credal state $p$ if and only if, when one “adds $A$ hypothetically to one’s stock of knowledge” and thereby hypothetically conditions $p$ on $A$ to obtain posterior credal state $p(\cdot \mid A)$, one accepts the consequent $B$ in the posterior state.

Consider again the consistent Lockean rule $\lambda$ and credal state $p$ in Figure 5. Then, as is evident from the picture, we have:

\[ p \Vdash_\lambda E_2, \qquad p \nVdash_\lambda \neg E_3 \Rightarrow E_2. \]

Note that $\neg E_3$ is entailed by $E_2$, so Lockean rule $\lambda$ instructs one to retract her acceptance of $E_2$ when a logical consequence of $E_2$ is learned or supposed. That allows one to retract acceptance too easily and violates almost all logics of conditionals that have interested logicians, including Adams’ conditional logic (Adams 1975).7 On the other hand, to preserve acceptance under logically entailed information, it suffices to require conditions (5) and (12). For by (12), credal state $q$ would have been at least as strong as credal state $p$ and hence, by (5), any proposition accepted in $p$ remains accepted in $q$, e.g., $E_2$.

The angles formed by the sides of the acceptance zones are crucial to the preservation of logical structure. The acceptance rules we recommend, the camera shutter rules, do have acceptance zones with the correct angles at their corners and, therefore, do not encounter any of the preceding logical difficulties. We will show that the camera shutter rules preserve a very natural logical structure on state space $\mathcal{P}$ and, therefore, yield a soundness and completeness theorem for Adams’ conditional logic that is simpler and more natural than Adams’ original 1975 version.
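The contrast between the two rules under acceptance condition (13) can be made concrete. The following is a small sketch under stated assumptions: a three-answer question with answers indexed 0, 1, 2, propositions represented as sets of answer indices, a Lockean threshold of 4/5 and a camera shutter threshold of 1/2, all chosen only for illustration. The function names are hypothetical.

```python
from fractions import Fraction as F

def condition(p, event):
    """Bayesian conditioning on the proposition `event` (a set of answer indices).
    Returns None when the event has probability zero, where p(.|A) is undefined."""
    mass = sum(p[i] for i in event)
    if mass == 0:
        return None
    return tuple(p[i] / mass if i in event else F(0) for i in range(len(p)))

def lockean(p, r):
    """Strongest proposition accepted by the Lockean rule, formulation (3)."""
    return {i for i in range(len(p)) if not p[i] <= 1 - r}

def shutter(p, r):
    """Strongest proposition accepted by the symmetric camera shutter rule (4)."""
    mode = max(p)
    return {i for i in range(len(p)) if not p[i] / mode <= 1 - r}

def accepts(rule, p, r, prop):
    """A proposition is accepted when the strongest accepted proposition entails it."""
    return rule(p, r) <= prop

def accepts_conditional(rule, p, r, antecedent, consequent):
    """Acceptance condition (13) for the flat conditional antecedent => consequent."""
    posterior = condition(p, antecedent)
    return posterior is None or accepts(rule, posterior, r, consequent)

p = (F(1, 5), F(3, 5), F(1, 5))
E2, not_E3 = {1}, {0, 1}

lockean_plain = accepts(lockean, p, F(4, 5), E2)
lockean_cond = accepts_conditional(lockean, p, F(4, 5), not_E3, E2)
shutter_plain = accepts(shutter, p, F(1, 2), E2)
shutter_cond = accepts_conditional(shutter, p, F(1, 2), not_E3, E2)
```

Under these assumed thresholds the Lockean rule accepts E2 outright but not conditionally on its own consequence ¬E3, whereas the camera shutter rule accepts both, because conditioning leaves the ratio of the two remaining probabilities unchanged.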
6 If $p(A) \ne 0$, $p(\cdot \mid A)$ is defined to be $p(\,\cdot \wedge A)/p(A)$; otherwise it is undefined.

7 Specifically, the principle that acceptance be preserved under logically implied information can be shown to be equivalent to Cautious Monotonicity, given two other axioms in Adams’ conditional logic (a.k.a. system P): Right Weakening and And. That system will be discussed in detail later.

4 Geologic

Consider classical, infinitary propositional logic, which allows for countable disjunction and conjunction.8 Start with propositional constants $\bot$, $\top$ and propositional variables $V_\kappa = \{E_i : i \in I\}$, where the countable index set $I$ has cardinality $\kappa$. Let $\bigvee_j \phi_j$ and $\bigwedge_j \phi_j$ be countable disjunction and conjunction, respectively. Let language $\mathcal{L}_\kappa$ be the least set containing the propositional constants and the variables in $V_\kappa$ that is closed under negation, countable disjunction, and countable conjunction. We interpret the propositional variables to be mutually exclusive and exhaustive. Under that restriction, each assignment is a $\kappa$-dimensional basis vector $e_i$. Let $B_\kappa$ denote the set of all such vectors. The valuation function for classical logic is definable as follows. In the base case:

\[ v_{e_i}(E_j) = e_i \cdot e_j; \qquad v_{e_i}(\top) = 1; \qquad v_{e_i}(\bot) = 0, \]

where $\cdot$ denotes the vector inner product $x \cdot y = \sum_{i \in I} x_i y_i$. In the inductive case:

\[ v_{e_i}(\neg\phi) = 1 - v_{e_i}(\phi); \qquad v_{e_i}\Big(\bigvee_j \phi_j\Big) = \max_j v_{e_i}(\phi_j); \qquad v_{e_i}\Big(\bigwedge_j \phi_j\Big) = \min_j v_{e_i}(\phi_j). \]

Logical entailment is definable in terms of valuation as follows:

\[ \phi \models \psi \iff v_{e_i}(\phi) \le v_{e_i}(\psi), \text{ for all } i \in I. \]

Let the proposition $[\![\phi]\!]_\kappa$ expressed by $\phi$ in language $\mathcal{L}_\kappa$ denote the set of all assignments in $B_\kappa$ in which $\phi$ evaluates to 1. Each proposition $[\![\phi]\!]_\kappa$ is represented uniquely by its valuation vector:

\[ v_\kappa(\phi) = (v_{e_i}(\phi) : i \in I), \]

which belongs to $2^\kappa$. Define the following relations and operations over $2^\kappa$, where $v^j$ denotes the $j$-th vector of a countable family:

\begin{align*}
u \le v &\iff u_i \le v_i, \text{ for all } i \text{ in } I; \tag{14}\\
(\neg v)_i &= 1 - v_i; \tag{15}\\
\Big(\bigvee_j v^j\Big)_i &= \max_j v^j_i; \tag{16}\\
\Big(\bigwedge_j v^j\Big)_i &= \min_j v^j_i.
\end{align*}
(17)

Then the structure of classical, infinitary logic is captured9 by the mathematical structure:

\[ \mathbb{L}_\kappa = \Big(2^\kappa, \le, \bigvee, \bigwedge, 1, 0\Big). \]

8 For classic studies concerning completeness of infinitary logic, cf. (Karp 1964) and (Barwise 1969). Our applications make no reference to completeness or to proof systems for infinitary logic.

9 I.e., the Lindenbaum-Tarski algebra of the language $\mathcal{L}_\kappa$ is isomorphic to the structure $\mathbb{L}_\kappa$.

Figure 6.a illustrates $\mathbb{L}_3$, which bears a suggestive resemblance to the unit cube $[0, 1]^3$ (Figure 6.b), but it is really just a string-and-bead figure whose strings happen to be sized and stretched to outline a cube. However, one can extend classical propositional logic on $\mathcal{L}_\kappa$ to a fuzzy language $\mathcal{L}^*_\kappa$ that generates fuzzy propositions covering the entire $\kappa$-dimensional unit cube $[0, 1]^\kappa$.10 A fuzzy proposition is just a fuzzy subset (Zadeh 1965) of $B_\kappa$, which is representable by a fuzzy characteristic function from $B_\kappa$ to $[0, 1]$ and, hence, by a fuzzy valuation vector $v$ in $[0, 1]^\kappa$. Formula (14) represents the fuzzy subset relation and formulas (15) through (17) correspond to fuzzy complement, union, and intersection over fuzzy propositions.

Figure 6: Bead-and-string logic vs. geologic, panels (a) and (b)

10 The idea may sound similar to multi-valued logic, but it is quite different. In multi-valued logic, (discrete) logical formulas in $\mathcal{L}_\kappa$ are interpreted over an expanded, continuous space of assignments (Novak et al. 2000); such logics generate a discrete weakening of classical logic, rather than a continuous, conservative extension of classical logic.

Here is one natural way to extend classical logic over $\mathcal{L}_\kappa$ to cover the $\kappa$-dimensional unit cube. For each real number $d$ in the unit interval, let the partial negation $\neg_d \phi$ be understood as the negation of $\phi$ to degree $d$, interpreted as follows:

\[ v_{e_i}(\neg_d \phi) = d\, v_{e_i}(\neg\phi) + (1 - d)\, v_{e_i}(\phi). \]

In particular, $\neg_0 \phi$ is equivalent to $\phi$, whereas $\neg_1 \phi$ is equivalent to $\neg\phi$. Between these extremes, $\neg_{1/2}\, \phi$ hovers semantically midway between $\phi$ and $\neg\phi$. Let $\mathcal{L}^*_\kappa$ be the result of expanding language $\mathcal{L}_\kappa$ with $\neg_d$. Otherwise, the preceding definitions of valuation function $v_{e_i}$ and valuation vector $(v_{e_i}(\phi) : i \in I)$ remain unaltered.11 Partial negation never generates values outside of the unit interval, so all valuation vectors for $\mathcal{L}^*_\kappa$ are in the unit cube $[0, 1]^\kappa$. Conversely, every vector $v$ in $[0, 1]^\kappa$ is the valuation vector of some formula in $\mathcal{L}^*_\kappa$, namely:

\[ \bigvee_{i \in I} (\neg_{1 - v_i} E_i \wedge E_i). \]

So the propositions expressible by the fuzzy language $\mathcal{L}^*_\kappa$ correspond to the vectors in the $\kappa$-dimensional unit cube $[0, 1]^\kappa$. Therefore, we refer to the logic just defined as geologic. Formulas (14) to (17) still make sense for fuzzy valuation functions (because they correspond to the standard definitions of the fuzzy set theoretic operations). Therefore, the structure of geologic is:

\[ \mathbb{L}^*_\kappa = \Big([0, 1]^\kappa, \le, \bigvee, \bigwedge, 1, 0\Big). \]

Since the valuation definition for geologic is exactly the same as for classical logic over the fragment $\mathcal{L}_\kappa$, it follows that $\mathbb{L}^*_\kappa$ restricted to $\mathcal{L}_\kappa$ is just $\mathbb{L}_\kappa$. In other words, geologic is a conservative extension of classical, infinitary logic.

Since the operations in $\mathbb{L}^*_\kappa$ correspond to fuzzy set theoretical operations on propositions, it is immediate that the geological operations satisfy associativity, commutativity, distributivity, and the De Morgan rules (Zadeh 1965). Excluded middle and disjunctive syllogism, on the other hand, can fail spectacularly for propositions in the unit cube’s interior. For example, let $c$ denote the center $(\tfrac{1}{2}, \ldots, \tfrac{1}{2}, \ldots)$ of the unit cube. Then:

\[ \neg c = c; \qquad c \vee \neg c = c; \qquad (e_1 \vee c) \wedge \neg c = c. \]

In spite of that, we think of geologic as the natural extension of classical logic to fuzzy propositions.
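The geological operations are simple enough to compute directly. The sketch below assumes the three-dimensional case with exact rational components; the names `neg`, `join`, `meet` and `basis` are illustrative. It checks the failures of excluded middle and disjunctive syllogism at the center of the cube, and rebuilds an arbitrary vector from the disjunction of partially negated variables given above.

```python
from fractions import Fraction as F
from functools import reduce

def neg(v, d=F(1)):
    """Partial negation to degree d, componentwise: d*(1 - v_i) + (1 - d)*v_i.
    With the default d = 1 this is full negation, formula (15)."""
    return tuple(d * (1 - x) + (1 - d) * x for x in v)

def join(u, v):
    """Disjunction as componentwise maximum, formula (16)."""
    return tuple(max(a, b) for a, b in zip(u, v))

def meet(u, v):
    """Conjunction as componentwise minimum, formula (17)."""
    return tuple(min(a, b) for a, b in zip(u, v))

def basis(i, n):
    """Valuation vector of the propositional variable with index i."""
    return tuple(F(int(j == i)) for j in range(n))

c = (F(1, 2),) * 3                      # center of the unit cube
e1 = basis(0, 3)
excluded_middle = join(c, neg(c))       # stays at c instead of reaching (1, 1, 1)
syllogism = meet(join(e1, c), neg(c))   # also c

# Rebuild v as the join over i of meet(neg_{1 - v_i}(E_i), E_i).
v = (F(1, 4), F(1, 2), F(1))
terms = [meet(neg(basis(i, 3), 1 - v[i]), basis(i, 3)) for i in range(3)]
rebuilt = reduce(join, terms)
```

Each term contributes the value v_i in coordinate i and zero elsewhere, which is why the join of the terms recovers v.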
Associativity, commutativity, distributivity, and the De Morgan rules are all motivated by symmetries of the unit cube. Excluded middle is not motivated by symmetry; it is a mere artifact of an impoverished syntax. Furthermore, unlike modal logic, which is also a conservative extension of classical propositional logic, geologic arises from the addition of a truth-functional negation.

11 More directly, one can simply introduce a new unary connective $a\phi$ called scalar multiple, interpreted by $v_{e_i}(a\phi) = a\, v_{e_i}(\phi)$. But we found it harder to motivate usage of such a connective.

Filling the interior of the Boolean algebra to make it a genuine cube provides an explanatory, geometrical perspective on classical logic. For example, given points $v$ and $u$ in the unit cube, find the smallest parallelepiped solid $S(v, u)$ containing $v$ and $u$ whose sides are parallel to the sides of the cube. Then the uppermost vertex of $S(v, u)$ is $v \vee u$ and the lowermost vertex of $S(v, u)$ is $v \wedge u$ (Figure 7.a). The parallelepiped $S(v, u)$ is like a sub-crystal within the cube, which is another reason for thinking of geologic as geological.12 The geometry of full geological negation is just reflection through the center $c$ of the cube, which is a natural generalization of Boolean complementation. To construct the partial negation $\neg_d v$ of $v$, first reflect $v$ through $c$ to obtain the full negation $\neg v$. Now draw a straight line segment between $v$ and $\neg v$. Then $\neg_d v$ is the point that lies proportion $d$ of the way from $v$ to $\neg v$ along the line segment (Figure 7.b).

Figure 7: Geological operations, panels (a) and (b)

Consider the classical De Morgan rules. Since full negation involves projection through the center $c$ of the cube, think of $c$ as the aperture of a pinhole camera. It is a familiar fact that projection through an aperture inverts the image. But the disjunction $v \vee u$ is the top vertex of the parallelepiped spanning $v$ and $u$.
Projecting the parallelepiped through the aperture inverts it and turns the top vertex into the bottom vertex, which is the conjunction of the projection of $v$ with the projection of $u$ (Figure 8).

12 Note that the same geometrical relationships would hold even if the unit cube were stretched along its various axes to form a prism. We will return to that theme in the last section of the paper.

Figure 8: Geometry of the De Morgan Rules

5 Logic from a probabilistic perspective

For our purposes, the point of geologic is that it affords a unified perspective on logic and probability.13 The set $\mathcal{P}_3$ of possible credal states is a horizontal, triangular plane through the unit cube of geological propositions (Figure 6.b). Thus, credal state space $\mathcal{P}_3$ has a natural embedding within geologic. That embedding generalizes to each countable cardinality $\kappa$. Valuation and probability assignment can both be viewed as inner products within the geological cube: $v_{e_i}(u) = e_i \cdot u$ and $p(b) = p \cdot b$,14 where $u$ is a vector in $[0, 1]^\kappa$ corresponding to an arbitrary, geological proposition, $v_{e_i}$ is a valuation function corresponding to a classical assignment, $b$ is a Boolean valuation vector in $2^\kappa$ corresponding to a classical proposition, and $p$ is a probability measure/vector in $\mathcal{P}_\kappa$.

Say that probability measure $p$ is uniform with respect to $\mathcal{E}$ if and only if $p$ assigns only zero or a fixed value to the answers in $\mathcal{E}$. The support of $p$ is the disjunction of all elements of $\mathcal{E}$ to which $p$ assigns non-zero probability (recall that $\mathcal{E}$ is countable).

13 Due to the truth-functionality of conjunction in fuzzy logic, the fuzzy logic community tends to view fuzzy logic in isolation from probability theory, rather than as a tool for understanding probability theory, as we propose.

The classical principle of indifference is a mapping $\sigma$ that associates each uniform probability distribution $p$ with its support.
For example, $\sigma$ associates $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$ with the classical proposition $\sigma(\tfrac{1}{2}, \tfrac{1}{2}, 0) = (1, 1, 0)$. Construct a ray from $\bot$ through uniform distribution $p$ and then $\sigma(p)$ is the (classical) proposition on the upper surface of the unit cube that the ray points to (Figure 9.a). Algebraically, $\sigma(p)$ is the (unique) scalar multiple of $p$ in the unit cube that has at least one component equal to 1, which amounts to the formula encountered earlier in the definition of the symmetric camera shutter acceptance rules:

\[ \sigma(p)_i = \frac{p_i}{\max_j p_j}, \text{ for } i \in I. \]

Figure 9: Indifference as projection. Panel (a): $\sigma$ maps $(1/2, 1/2, 0)$ to $(1, 1, 0)$. Panel (b): $\sigma$ maps $(2/5, 2/5, 1/5)$ to $(1, 1, 1/2)$.

Say that geological proposition $u$ is fully satisfiable if and only if there exists $e_i$ such that $v_{e_i}(u) = 1$, i.e. $u$ has a component equal to 1. So $\sigma(p)$ is the (unique), fully satisfiable, geological valuation vector that is proportional to $p$. In classical logic, the mapping $\sigma(p)$ is defined only for uniform $p$, but it is defined for all $p$ in geologic, since the (continuous) upper surface of the geological cube covers the entire triangle of probability measures (Figure 9.b). Now every probability measure $p$ has a unique, geological proposition $\sigma(p)$ that stands to $p$ in much the same way that the support of $p$ stands to uniform $p$.

Mapping $\sigma$ has a heuristic interpretation. Think of the unit cube as a room with tiled walls (Figure 10). Imagine that there is a digital camera embedded in the baseboard of the room at corner $\bot$. Think of the triangle $\mathcal{P}_3$ as the picture plane corresponding to the 2-dimensional image received by the camera. Then the inverse $\sigma^{-1}$ of $\sigma$ is the classical perspective rendering of the room’s interior on the picture plane. The perspective is extreme because the camera is literally embedded in the lower corner of the room, so the floor and adjacent walls are tangent to the camera’s view and are rendered as the boundaries of the triangular picture.
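The projection σ and its inverse are easy to compute in the finite case. The following sketch assumes a three-answer question and exact rational arithmetic; the function names are illustrative, and treating the inverse as renormalization to total mass 1 is an assumption that holds for finite questions. It reproduces the two examples labeled in Figure 9.

```python
from fractions import Fraction as F

def sigma(p):
    """Scale p so that its largest component equals 1: p_i / max_j p_j."""
    mode = max(p)
    return tuple(x / mode for x in p)

def sigma_inverse(u):
    """Perspective rendering back onto the picture plane: rescale u to sum to 1."""
    total = sum(u)
    return tuple(x / total for x in u)

def fully_satisfiable(u):
    """A geological proposition is fully satisfiable when some component equals 1."""
    return max(u) == 1

uniform_on_two = (F(1, 2), F(1, 2), F(0))
general = (F(2, 5), F(2, 5), F(1, 5))

a = sigma(uniform_on_two)      # the support of the uniform distribution
b = sigma(general)             # a non-classical point on the upper surface
round_trip = sigma_inverse(b)  # recovers the original credal state
```

For a uniform distribution the result is a Boolean vector, its support; for any other distribution it is a fuzzy proposition on the upper surface of the cube.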
Figure 10: A literal, probabilistic perspective on logic, panels (a) and (b); labels: picture plane, ceiling, walls, vanishing points, $\sigma^{-1}$

The point of the preceding detour through geologic is that the picture plane is the space $\mathcal{P}_3$ of probability measures on $\mathcal{A}_3$ and the walls and ceiling of the office are the fully satisfiable propositions in geologic. So Figure 10 literally illustrates geologic from a probabilistic perspective. That perspective sheds new light on the lottery paradox and its associated conundrums. In particular, note the similarity between the acceptance zones of the proposed, paradox-avoiding rule (Figure 4) and the projected coordinate lines of the unit cube (Figure 10.b). The boundaries of the former always follow the latter.

6 Probalogic

We understand “logic” in the broad, pragmatic sense that logic is wherever logical structure is. If the logical structure pertains to relative strength of credal states, then there is a logic of such states, even though the states in question are not necessarily propositional and the logical relations among them are not plausibly interpreted as arguments. And if the structure happens to be relative to pragmatic factors such as a question that elevates the significance of certain propositions as relevant answers, then logic, itself, is pragmatic; we do not insist that logic must in some sense be prior to or independent of such considerations. Our view accords with an ancient tradition according to which logic is a tool or organon for inquiry, which typically begins with some question and ends with an answer thereto. In this section, we introduce a logic of probabilistic credal states in the broad, pragmatic sense just outlined.

When credence is modeled as qualitative belief in a proposition, it is straightforward to judge the relative strength of credal states in terms of the classical, logical strength of the propositions believed:

\[ B\phi \le B\psi \iff \phi \le \psi. \]
We propose, in a similar spirit, that probabilistic credal states inherit their logical strength from their unique, geological images:

\begin{align}
p \le q &\iff \sigma(p) \le \sigma(q) \tag{18}\\
&\iff \sigma(p)_i \le \sigma(q)_i \text{ for all } i \in I. \tag{19}
\end{align}

Disjunction $\vee$ and conjunction $\wedge$ are standardly defined, respectively, as the least upper bound and the greatest lower bound with respect to $\le$. We call the resulting logical structure on probability measures probalogic:

\[ (\mathcal{P}, \le, \vee, \wedge). \]

Probalogic is just geologic from a probabilistic perspective. Consider arbitrary credal state $p$ in $\mathcal{P}_3$. Which credal states are probalogically at least as weak as $p$? First, project $p$ up to geological proposition $\sigma(p)$ on the upper surface of the geological cube. The geological consequences of $\sigma(p)$ consist of the parallelepiped containing $\top$ whose sides are parallel to the sides of the unit cube and whose bottom-most corner is $\sigma(p)$ (Figure 11). Since $\sigma(p)$ is incident to an upper surface of the cube, the parallelepiped is, in this case, a rectangle lying entirely in one upper face of the unit cube (or, in degenerate cases, entirely within an upper edge of the unit cube). The probalogical consequences of $p$ are contained within the linear perspective projection of that rectangle onto the picture plane $\mathcal{P}_3$. Note that, according to the usual rules of linear perspective, parallel sides of the rectangles meet at vanishing points, which correspond to the corners of $\mathcal{P}_3$ that are not closest to $p$. Similarly, the geological propositions in the range of $\sigma$ that are geologically at least as strong as $\sigma(p)$ are in the rectangle with sides parallel to the sides of the unit cube that has $\sigma(p)$ as its upper corner and the nearest unit vector $e_i$ to $\sigma(p)$ as its lower corner. So the inverse image of that rectangle under $\sigma$ is the set of the credal states that are probalogically at least as strong as $p$. We call the partial order $\le$ so defined relative probalogical strength.
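Relative probalogical strength can be tested componentwise, exactly as (18) and (19) prescribe. The sketch below assumes a three-answer question with answers indexed 0, 1, 2; the names `at_least_as_strong` and `condition_on_not` are hypothetical. It confirms, for one sample state, that conditioning on the negation of a non-modal answer yields a stronger state, as condition (12) requires, and that the order is only partial.

```python
from fractions import Fraction as F

def sigma(p):
    """Geological image of p: p_i / max_j p_j."""
    mode = max(p)
    return tuple(x / mode for x in p)

def at_least_as_strong(p, q):
    """p <= q in the sense of (18)-(19): sigma(p)_i <= sigma(q)_i for all i."""
    return all(a <= b for a, b in zip(sigma(p), sigma(q)))

def condition_on_not(p, j):
    """Bayesian conditioning on the negation of answer j (assumes p[j] < 1)."""
    rest = 1 - p[j]
    return tuple(F(0) if k == j else p[k] / rest for k in range(len(p)))

p = (F(1, 5), F(3, 5), F(1, 5))          # mode at index 1
q = condition_on_not(p, 2)               # rule out the answer with index 2
e2 = (F(0), F(1), F(0))                  # unit vector for the modal answer
uniform = (F(1, 3),) * 3
other = (F(3, 5), F(1, 5), F(1, 5))      # mode at a different answer

q_stronger = at_least_as_strong(q, p)
p_stronger = at_least_as_strong(p, q)
corner_strongest = at_least_as_strong(e2, q)
uniform_weakest = at_least_as_strong(p, uniform)
incomparable = (not at_least_as_strong(p, other)
                and not at_least_as_strong(other, p))
```

States whose modes fall on different answers, such as `p` and `other`, come out incomparable in this example.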
Figure 11: Probalogical strength, panels (a) and (b)

Probalogical disjunction, conjunction, and negation can be defined similarly, as the projections of the corresponding geological operations:

\begin{align}
p \vee q &= \sigma^{-1}(\sigma(p) \vee \sigma(q)); \tag{20}\\
p \wedge q &= \sigma^{-1}(\sigma(p) \wedge \sigma(q)); \tag{21}\\
\neg p &= \sigma^{-1}(\neg\sigma(p)). \tag{22}
\end{align}

Since the geological disjunction of two propositions on the upper surface of the unit cube is also on the upper surface of the unit cube, $\mathcal{P}_3$ is closed under probalogical disjunction. Geometrically, these logical operations can be constructed as perspective renderings of the corresponding geological operations on the cube (Figures 12, 13, and 14).

Figure 12: Probalogical conjunction and disjunction within a face, panels (a) and (b)

Figure 13: Probalogical conjunction and disjunction across faces, panels (a) and (b)

Figure 14: Negation around the perimeter, panels (a) and (b)

Probalogical constants and operations are not necessarily defined. In finite questions, $\top$ denotes the uniform distribution, but in countably infinite questions there is no such distribution. There is no interpretation of $\bot$. Letting $\bot = \top$ is obviously unappealing, but any choice of $\bot$ that is off-center is equally implausible. Geological negation is closed over the lower edges of the upper faces of the unit cube, but is not closed elsewhere over the upper faces of the unit cube, so probalogical negation is defined only over the lower edges of the unit cube. Furthermore, if $\sigma(p)$ and $\sigma(q)$ are on different upper faces of the unit cube, then the conjunction $\sigma(p) \wedge \sigma(q)$ lies below the upper faces of the unit cube, so $p \wedge q = \sigma^{-1}(\sigma(p) \wedge \sigma(q))$ is undefined.

Although we will not pursue the idea in this paper, there is a way to expand $\mathcal{P}$ to a space over which probalogical conjunction and disjunction are closed. Some assumptions are so certain that one does not even conceive of their falsity, e.g., that a particle cannot have two distinct momenta at the same time but can have a definite momentum and position at the same time. But when experience gets strange, we may come to doubt our basic assumptions without having thought yet of any concrete alternatives. In such cases, a natural response is to transfer probability mass to a non-descript “catchall hypothesis” absent from the original algebra $\mathcal{A}_3$. Within $\mathcal{A}_3$, the resulting credal state appears to be normalized to a value less than 1. Accordingly, let $\mathcal{P}^*$ denote the set of all additive measures $p$ on $\mathcal{A}$ such that $0 \le p(\top) \le 1$. Then the problem of closure under negation and conjunction is solved by plausibly extending $\sigma$ to a bijection between $\mathcal{P}^*$ and the entire unit cube as follows (Figure 15):

\[ \sigma^*(p)_i = p(\top) \cdot \frac{p_i}{\max_j p_j}, \text{ for } i \in I. \]

So equations (18) to (22), with $\sigma$ replaced by $\sigma^*$, induce a probalogical structure on $\mathcal{P}^*$ that is closed under the probalogical operations of conjunction, disjunction, and negation.

Figure 15: $\sigma$ extended to measures normalized to a value $\le 1$

7 Acceptance that respects Probalogic

A probalogical acceptance rule $\nu$ is an acceptance rule that preserves probalogical structure in the sense of morphism conditions (5) to (8).15 As described in the preceding section, condition (7) is understood to hold only when $p \wedge q$ is defined over $\mathcal{P}$.

Recall the camera-shutter-like acceptance rules introduced above as one geometrical strategy for solving the lottery paradox.
The rules can be stated a bit more generally, by allowing the threshold $r$ and the strictness of the inequality to vary with $i$. Say that acceptance rule $\nu$ is a camera shutter rule for $\mathcal{E}$ if and only if there exist thresholds $\{r_i : i \in I\}$ in the unit interval and inequalities $\{\lhd_i : i \in I\}$ that are either $\le$ or $<$, such that for each $p$ in $\mathcal{P}$ and $i \in I$:

1. $\nu(p) = \bigwedge \{\neg E_i : \sigma(p)_i \lhd_i 1 - r_i \text{ and } i \in I\}$;
2. if $\lhd_i$ is $\le$ then $r_i > 0$;
3. if $\lhd_i$ is $<$ then $r_i < 1$.

Note that 0 is omitted in the second condition to make it possible to not accept $\neg E_i$, and 1 is omitted in the third condition to make it possible to accept $\neg E_i$; otherwise morphism condition (8) would be violated trivially. The main result of this section is that, over countable dimensions, the camera shutter rules are precisely the rules that preserve probalogic.

Theorem 1 (representation of probalogical rules). Suppose that $\mathcal{E}$ is countable. Then an arbitrary acceptance rule is probalogical if and only if it is a camera shutter rule.

The proof proceeds by a series of lemmas. Let $p, q$ be in $\mathcal{P}$. Define:

$$q \le_i p \iff \sigma(q)_i \le \sigma(p)_i.$$

Lemma 1. Suppose that $q \le_i p$. Then $p = (p \vee e_i) \wedge (p \vee q)$.

Proof. See Figure 16. By the definition of probalogic in terms of geologic, it suffices to show that $\sigma(p) = \sigma((p \vee e_i) \wedge (p \vee q))$. By geologic, the $j$-th component of the right hand side expands to:

$$\min(\max(\sigma(p)_j, \sigma(e_i)_j), \max(\sigma(p)_j, \sigma(q)_j)).$$

Since $(e_i)_i = 1$, it follows that $\max(\sigma(p)_i, \sigma(e_i)_i) = 1$. Since $\sigma(q)_i \le \sigma(p)_i$, it follows that $\max(\sigma(p)_i, \sigma(q)_i) = \sigma(p)_i$. So $\sigma((p \vee e_i) \wedge (p \vee q))_i = \sigma(p)_i$. Now let $E_j$ be in $\mathcal{E}$ for $j \ne i$. Then $(e_i)_j = 0$, so $\max(\sigma(p)_j, \sigma(e_i)_j) = \sigma(p)_j$. In general, $\min(x, \max(x, y)) = x$, so we have as well that $\sigma((p \vee e_i) \wedge (p \vee q))_j = \sigma(p)_j$.

[15] Note that (5) is redundant, for it is derivable from (6).

[Figure 16: Proof of Lemma 1]

Lemma 2. Let $\nu$ satisfy morphism conditions (6), (7), and (8). Let $i \in I$.
Then:

$$p \Vdash_\nu \neg E_i \text{ and } q \le_i p \implies q \Vdash_\nu \neg E_i.$$

Proof. Suppose that $p \Vdash_\nu \neg E_i$ and that $q \le_i p$. Since $p \Vdash_\nu \neg E_i$, it follows that $p_i < \max_k p_k$. For otherwise, $e_i \le p$, so by morphism condition (5), $e_i \Vdash_\nu \neg E_i$, contrary to morphism condition (8). Since it is also the case that $q \le_i p$, Lemma 1 yields that $p = (p \vee e_i) \wedge (p \vee q)$. Suppose for reductio that $\nu(q)$ is logically compatible with $E_i$. Then by morphism condition (6), $\nu(p \vee q)$ is compatible with $E_i$. By morphism condition (8), $\nu(e_i)$ is compatible with $E_i$. So again by morphism condition (6), $\nu(p \vee e_i)$ is compatible with $E_i$. So $\nu(p) = \nu((p \vee e_i) \wedge (p \vee q))$ is compatible with $E_i$, by morphism condition (7) and by the fact that $E_i$ is an atom in algebra $\mathcal{A}$. But $p \Vdash_\nu \neg E_i$. Contradiction. Hence, $q \Vdash_\nu \neg E_i$.

Proof of Theorem 1. For the only if side, let $i \in I$. Define:

$$1 - r_i = \sup\{\sigma(p)_i : p \in \mathcal{P} \text{ and } p \Vdash_\nu \neg E_i\}.$$

Suppose that $\sigma(p)_i < 1 - r_i$. Then $p \Vdash_\nu \neg E_i$, by Lemma 2. Suppose that $\sigma(p)_i > 1 - r_i$. Then $p \nVdash_\nu \neg E_i$, by the definition of $1 - r_i$. Finally, suppose that $\sigma(p)_i = \sigma(q)_i = 1 - r_i$. Consider the case in which there exists $r$ in $\mathcal{P}$ such that $\sigma(r)_i = 1 - r_i$ and $r \Vdash_\nu \neg E_i$. Then $p \Vdash_\nu \neg E_i$ and $q \Vdash_\nu \neg E_i$, by Lemma 2. In the alternative case, it is immediate that $p \nVdash_\nu \neg E_i$ and $q \nVdash_\nu \neg E_i$. Thus, $p \Vdash_\nu \neg E_i$ if and only if $q \Vdash_\nu \neg E_i$. Set $\lhd_i$ to $\le$ in the former case and to $<$ in the latter case. In the former case, suppose for reductio that $r_i = 0$. Then $\nu(e_i) \Vdash \neg E_i$, contradicting morphism condition (8), so $r_i > 0$, as required. In the latter case, suppose for reductio that $r_i = 1$. Then $\nu(e_i) \nVdash E_i$, contradicting morphism condition (8), so $r_i < 1$, as required.

For the if side of the theorem, suppose that $\nu$ is a camera shutter rule for countable $\mathcal{E}$. For morphism condition (5), suppose that $p \le q$. Then $\sigma(p)_i \le \sigma(q)_i$, for each $i \in I$. Then $q \Vdash_\nu \neg E_i$ implies $p \Vdash_\nu \neg E_i$, so $\nu(p) \le \nu(q)$. For morphism condition (6), let $\nu(p) = A$ and $\nu(q) = B$, so:

$$A = \bigwedge \{\neg E_i : \sigma(p)_i \lhd_i 1 - r_i\};$$

$$B = \bigwedge \{\neg E_i : \sigma(q)_i \lhd_i 1 - r_i\}.$$

Let $\mathcal{D} = \{\neg E_i : A \le \neg E_i \text{ and } B \le \neg E_i\}$ and note that $A \vee B = \bigwedge \mathcal{D}$.
Suppose that $\neg E_i$ is in $\mathcal{D}$. Then $\sigma(p)_i \lhd_i 1 - r_i$ and $\sigma(q)_i \lhd_i 1 - r_i$. Hence, $\max(\sigma(p)_i, \sigma(q)_i) \lhd_i 1 - r_i$. Thus, $\nu(p \vee q) \le \neg E_i$. Suppose that $\neg E_i$ is not in $\mathcal{D}$. Then either $\sigma(p)_i \not\lhd_i 1 - r_i$ or $\sigma(q)_i \not\lhd_i 1 - r_i$, so $\max(\sigma(p)_i, \sigma(q)_i) \not\lhd_i 1 - r_i$ and, thus, $\nu(p \vee q) \not\le \neg E_i$. Hence, $\neg E_i$ is in $\mathcal{D}$ if and only if $\nu(p \vee q) \le \neg E_i$. Therefore, $\nu(p \vee q) = \bigwedge \mathcal{D} = A \vee B$. The dual argument works for morphism condition (7).

Recall that the conditions (5)-(7) omit preservation of negation and of the infinitary versions of disjunction and conjunction. There are good reasons to drop those conditions.

Proposition 4. In finite dimensions, no probalogical acceptance rule preserves infinite conjunction and disjunction.

Proof. Consider probalogical acceptance rule $\nu$ for question $\{E_i : i \in I\}$. By morphism condition (8), $\nu(e_1) = E_1$ and $\nu(e_2) = E_2$. Let $L$ be the straight line connecting $e_1$ with $e_2$. Note that no uniform distribution with infinite support is encountered along this line, so it is continuous. So by morphism condition (5), there is a boundary point $b$ such that $q \Vdash_\nu E_1$, for all $q$ closer to $e_1$ than $b$, and $q \nVdash_\nu E_1$, for all $q$ farther from $e_1$ than $b$. Let $m$ be the midpoint of $L$. Consider the case in which $b$ is between $m$ and $e_1$. Consider the case in which $b \Vdash_\nu E_1$. Let $\{p_i : i \in \mathbb{N}\}$ be a discrete sequence of points in line segment $\overline{e_1 b}$ that converges to $b$ and let $\{q_i : i \in \mathbb{N}\}$ be a discrete sequence of points in line segment $\overline{m\,b}$ that converges to $b$. Then:

$$\bigvee_i p_i = b = \bigwedge_i q_i.$$

Suppose that $b \nVdash_\nu E_1$. Then $\nu(\bigvee_i p_i) \ne \bigvee_i \nu(p_i)$. Alternatively, suppose that $b \Vdash_\nu E_1$. Then $\nu(\bigwedge_i q_i) \ne \bigwedge_i \nu(q_i)$.

Proposition 5. In finite dimensions, no probalogical acceptance rule also preserves probalogical negation.

Proof. Let $p = (\tfrac{2}{3}, \tfrac{1}{3}, 0)$. Assume, for reductio, that acceptance rule $\nu$ is probalogical and preserves probalogical negation as well. So by Theorem 1, $\nu$ is a camera shutter rule. Suppose that $\nu$ rejects $E_2$ in $p$.
So $\sigma(p)_2 = \tfrac{1}{2} \lhd_2 1 - r_2$. Note, in Figure 14, that $\neg p = (0, \tfrac{1}{3}, \tfrac{2}{3})$. So by preservation of negation, $\nu$ does not reject $E_2$ at $\neg p$. Thus:

$$\sigma(\neg p)_2 = \tfrac{1}{2} \not\lhd_2 1 - r_2,$$

which is a contradiction. The case in which $\nu$ does not reject $E_2$ in $p$ is similar. The argument generalizes to arbitrary, finite dimensions.

On the other hand, setting each $r_i = \tfrac{1}{2}$ almost preserves negation, in the sense that negation is preserved at all points on the perimeter of the triangle except at the six probability assignments with range $\{0, \tfrac{1}{3}, \tfrac{2}{3}\}$. But even so, no setting for the $r_i$ other than $\tfrac{1}{2}$ has that property, so the demands imposed by negation preservation are unreasonably strict.

8 Acceptance that does not respect Probalogic

The acceptance rules we recommend, the camera shutter rules, are exactly the rules that preserve probalogical structure. Alternative acceptance rules proposed by Kyburg (1961) and by Pollock (1995) fail to preserve probalogical structure; indeed, they fail to preserve any plausible logical structure.

Each Kyburgian acceptance rule $\chi_r$ is a Lockean rule without closure under conjunction:

$$\chi_r(p) = \{A \in \mathcal{A} : p(A) \ge r\}.$$

Let question $\mathcal{E}$ be ternary and set $r = \tfrac{2}{3}$. In Figure 17, the set $\chi_{2/3}(c)$ of propositions accepted at the center $c = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$ is indicated by a solid line and the set $\chi_{2/3}(e_3)$ is indicated by a dashed line. Rule $\chi_{2/3}$ does not preserve logical order in any plausible sense, for corner $e_3$ is at least as strong as center $c$, but $\chi_{2/3}(e_3)$ is, intuitively, not at least as strong as $\chi_{2/3}(c)$ due to the retraction of $e_1 \vee e_2$.

There is, therefore, a hidden dilemma in Kyburg's thesis that one should give up closure of accepted propositions under conjunction. On the one hand, if only $\top$ is accepted at the uniform measure $c$, then there is no lottery paradox and, hence, there is no motivation for failing to close the accepted propositions under conjunction.
On the other hand, if some proposition other than $\top$ is accepted at $c$ (say, a disjunction $D$ that is incompatible with $E_i$), then, using the same argument as above, when one jumps from the center $c$ to the stronger state $e_i$, one must accept $E_i$ (which has probability one) and retract $D$ (which has probability zero) and thus one must fail to expand the set of accepted propositions. In contrast, all camera shutter rules preserve probalogic.

[Figure 17: Kyburgian acceptance rule]

Pollock (1995), Ryan (1996), and Douven (2002) all propose variants of the Lockean acceptance rule, which we will call Pollockian. The basic idea is to restrict the Lockean rule to cases in which it produces no paradox. The idea is illustrated, for ternary $\mathcal{E}$, in Figure 18. The basic difference between Pollockian and Lockean rules in 3 dimensions is that the former return $\top$ whenever the latter return $\bot$ (compare to Figure 3). The choice of $\top$ as a substitute for $\bot$ is natural enough, on grounds of symmetry, but due to the shape of Pollockian acceptance zones, there still exists no single logical structure that all Pollockian rules preserve.

[Figure 18: Pollockian acceptance rules; panels (a) $r > 2/3$, (b) $r \to 2/3^{+}$, (c) $r < 2/3$]

Proposition 6. Suppose that $\mathcal{E}$ is ternary. Let $\preceq$ be an arbitrary partial order on $\mathcal{P}$ whose binary least upper bound operation $\curlyvee$ is totally defined. Then there exists at least one Pollockian acceptance rule that is not a structure preserving map from $(\mathcal{P}_3, \preceq, \curlyvee)$ to $(\mathcal{A}_3, \le, \vee)$.

Proof. Suppose the contrary for reductio. Let $\pi_r$ be a Pollockian rule. When $r > \tfrac{2}{3}$, as in Figure 18.a, the rule $\pi_r$ accepts $E_1 \vee E_2$ at $p = (\tfrac{1}{2}, \tfrac{1}{2}, 0)$ and $E_2 \vee E_3$ at $q = (0, \tfrac{2}{3}, \tfrac{1}{3})$, respectively, whose disjunction is $E_1 \vee E_2 \vee E_3 = \top$. So, to preserve disjunction, $p \curlyvee q$ must lie within the white triangle, where $\top$ is accepted.
If we let $r$ approach $\tfrac{2}{3}$ from above, as in Figure 18.b, the white triangle converges to the center point $c = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$, so $p \curlyvee q = c$. Now consider the case in which $r < \tfrac{2}{3}$ (Figure 18.c). By preservation of disjunction, we have:

$$\top = \pi_r(c) = \pi_r(p \curlyvee q) = \pi_r(p) \vee \pi_r(q) = (E_1 \vee E_2) \vee E_2 = E_1 \vee E_2.$$

Hence $\top = E_1 \vee E_2$, a contradiction.

A dilemma for Pollockian theorists is that, on the one hand, symmetry precludes accepting anything other than $\top$ at the center point $c$, but that implies that there is no logical structure on $\mathcal{P}$ that all Pollockian rules preserve. In contrast, all camera shutter rules preserve probalogic.

9 The geometry of Conditional Logic

As illustrated in Figure 5, acceptance zones with a wrong shape can invalidate plausible principles of nonmonotonic reasoning. In fact, each axiom of the logic for flat conditionals corresponds to a definite, geometrical constraint on acceptance zones. The correspondences are established in this section and are used below to demonstrate that each probalogical rule validates a plausible set of axioms for conditional logic due to Adams (1975).

The acceptance condition of a conditional is defined by (13) as an explication of the Ramsey test:

$$p \Vdash_\alpha A \Rightarrow B \iff p(\cdot|A) \Vdash_\alpha B \text{ or } p(A) = 0.$$

The set of axioms known as Adams' conditional logic (Adams 1975) or system P has been widely recognized as central to conditional and nonmonotonic reasoning (Kraus et al. 1990). They state closure properties for a set of accepted conditionals. Here we rewrite them as closure properties for the set of conditionals accepted at a fixed credal state $p$ under a fixed acceptance rule $\alpha$ (where the horizontal line means material implication):

(Reflexivity) $p \Vdash_\alpha A \Rightarrow A$

(Left Equivalence) $\dfrac{p \Vdash_\alpha A \Rightarrow B}{p \Vdash_\alpha C \Rightarrow B}$ if $A$ is classically equivalent to $C$.

(Right Weakening) $\dfrac{p \Vdash_\alpha A \Rightarrow B}{p \Vdash_\alpha A \Rightarrow C}$ if $B$ classically entails $C$.

(And) $\dfrac{p \Vdash_\alpha A \Rightarrow B \qquad p \Vdash_\alpha A \Rightarrow C}{p \Vdash_\alpha A \Rightarrow (B \wedge C)}$

(Or) $\dfrac{p \Vdash_\alpha A \Rightarrow C \qquad p \Vdash_\alpha B \Rightarrow C}{p \Vdash_\alpha (A \vee B) \Rightarrow C}$

(Cautious Monotonicity) $\dfrac{p \Vdash_\alpha A \Rightarrow B \qquad p \Vdash_\alpha A \Rightarrow C}{p \Vdash_\alpha (A \wedge B) \Rightarrow C}$

Say that acceptance rule $\alpha$ validates an axiom for conditional logic if and only if, for each credal state $p$, $\alpha$ together with $p$ satisfies that axiom. Say that $\alpha$ validates a set of axioms if and only if $\alpha$ validates each axiom in that set. Some axioms in Adams' logic are validated trivially.

Proposition 7. Each acceptance rule validates And, Left Equivalence, and Right Weakening.

Proof. Immediate from the modeling assumption: with respect to each credal state and each acceptance rule, there is a strongest accepted proposition that entails all the other accepted propositions.

Proposition 8. Let $\alpha$ be an acceptance rule. Then, $\alpha$ validates Reflexivity if and only if $\alpha$ accepts every certain proposition in the following sense: $p \Vdash_\alpha A$ for each credal state $p$ in $\mathcal{P}$ and each proposition $A$ in $\mathcal{A}$ such that $p(A) = 1$.

Proof. For the only if side, suppose that $\alpha$ validates Reflexivity. Suppose further that $p(A) = 1$. Then we have that $p \Vdash_\alpha A \Rightarrow A$ (by Reflexivity), and thus that $p(\cdot|A) \Vdash_\alpha A$ (note that $p(\cdot|A)$ exists), and hence that $p \Vdash_\alpha A$ (because $p = p(\cdot|A)$). So $\alpha$ accepts every certain proposition. For the converse, let $\alpha$ be an acceptance rule and $p$ a credal state. Either credal state $p(\cdot|A)$ is undefined, and thus we have that $p \Vdash_\alpha A \Rightarrow A$ by default; or $p(\cdot|A)$ is defined, and thus $p(A|A) = 1$ and then $p(\cdot|A) \Vdash_\alpha A$ (by acceptance of every certain proposition) and hence we have that $p \Vdash_\alpha A \Rightarrow A$ (by definition).

Axioms Cautious Monotonicity and Or impose substantial geometrical constraints on acceptance rules. Let $A$ be a proposition in $\mathcal{A}$. Let $\mathcal{P}|A$ denote the set of all $p$ in $\mathcal{P}$ such that $p(A) = 1$, which we will call the facet of simplex $\mathcal{P}$ for proposition $A$.
The line segment with endpoints $p, q$ in simplex $\mathcal{P}$ is defined by convex combination: $\overline{p\,q} = \{ap + (1 - a)q : a \in [0, 1]\}$.[16] Say that $q$ is a projection of $p$ from facet $\mathcal{P}|\neg A$ onto facet $\mathcal{P}|A$ if and only if (i) there exists a line segment $L$ through $p$ with endpoint $q$ in $\mathcal{P}|A$ and the other endpoint in facet $\mathcal{P}|\neg A$ and (ii) $p$ is not in the complementary facet $\mathcal{P}|\neg A$. Projection is equivalent to Bayesian conditioning:

[16] Addition is defined as vector addition; multiplication is defined as scalar multiplication.

Lemma 3. Credal state $q$ is a projection of $p$ from facet $\mathcal{P}|\neg A$ onto facet $\mathcal{P}|A$ if and only if $p(\cdot|A)$ is defined and $q = p(\cdot|A)$.

Proof. This lemma is trivially true when $p$ is in $\mathcal{P}|A$ or in $\mathcal{P}|\neg A$, so suppose that $p$ is neither in $\mathcal{P}|A$ nor in $\mathcal{P}|\neg A$ and, thus, that both $p(\cdot|A)$ and $p(\cdot|\neg A)$ are defined. For the if side, consider line segment $L = \overline{p(\cdot|A)\, p(\cdot|\neg A)}$, whose endpoints are in $\mathcal{P}|A$ and $\mathcal{P}|\neg A$, respectively. Note that $p$ lies on $L$, since for each $B$ in $\mathcal{A}$,

$$p(B) = p(B|A)\,p(A) + p(B|\neg A)\,p(\neg A) = a\,p(B|A) + (1 - a)\,p(B|\neg A),$$

where $a = p(A)$. Therefore, $p(\cdot|A)$ is a projection of $p$ from $\mathcal{P}|\neg A$ onto $\mathcal{P}|A$. For the only if side, suppose that $q$ is a projection of $p$ from facet $\mathcal{P}|\neg A$ onto facet $\mathcal{P}|A$. So $q$ is in $\mathcal{P}|A$ and there exists credal state $r$ in $\mathcal{P}|\neg A$ such that line segment $\overline{q\,r}$ contains $p$. Then, $p$ lies in the interior of $\overline{q\,r}$, since $p$ is neither in $\mathcal{P}|A$ nor in $\mathcal{P}|\neg A$. So there exists $a$ in the open interval $(0, 1)$ such that $p = aq + (1 - a)r$. Then it suffices to show that $q = p(\cdot|A)$. Consider the case in which $E_i \not\le A$. Then $E_i \le \neg A$. Since $q$ is in facet $\mathcal{P}|A$, we have that $q(E_i) = 0 = p(E_i|A)$. Now consider the case in which $E_i \le A$. Then since $r$ is in facet $\mathcal{P}|\neg A$, we have that $r(E_i) = 0$, so $p(E_i) = a\,q(E_i)$. Similarly, we have that $q(A) = 1$ and $r(A) = 0$, so $p(A) = a \cdot 1 + 0 = a$. Hence,

$$q(E_i) = p(E_i)/a = p(E_i)/p(A) = p(E_i)\,p(A|E_i)/p(A) = p(E_i|A).$$

So $q(\cdot)$ agrees with $p(\cdot|A)$ for all $E_i$ in $\mathcal{E}$ and, thus, for all $B$ in $\mathcal{A}$, as required.

Proposition 9 (Geometry of Cautious Monotonicity).
Let $\alpha$ be an acceptance rule. Then, $\alpha$ validates Cautious Monotonicity if and only if the following condition holds: for each credal state $p$ and for each proposition $A$, if $\alpha$ accepts $A$ at $p$, then $\alpha$ accepts $A$ at the projection of $p$ on the facet $\mathcal{P}|B$, for each logical consequence $B$ of $A$ (as long as the projection exists). In light of Lemma 3, the condition may be restated as:

$$p \Vdash_\alpha A,\ A \le B, \text{ and } p(\cdot|B) \text{ is defined} \implies p(\cdot|B) \Vdash_\alpha A. \tag{23}$$

Proof. The proof of the only if side involves unpacking the definitions and checking that the projection condition (23) is simply an instance of Cautious Monotonicity. For the if side, assume that the projection condition (23) holds. Suppose that $p \Vdash_\alpha A \Rightarrow B$ and $p \Vdash_\alpha A \Rightarrow C$. It suffices to show that $p \Vdash_\alpha (A \wedge B) \Rightarrow C$. If $p(\cdot|A \wedge B)$ is undefined, then by default $p \Vdash_\alpha (A \wedge B) \Rightarrow C$. So suppose that $p(\cdot|A \wedge B)$ is defined and, thus, $p(\cdot|A)$ is defined. Then argue as follows:

$p \Vdash_\alpha A \Rightarrow B,\ p \Vdash_\alpha A \Rightarrow C$
$\implies q \Vdash_\alpha B,\ q \Vdash_\alpha C$, letting $q = p(\cdot|A)$,
$\implies q \Vdash_\alpha B \wedge C,\ (B \wedge C) \le B$
$\implies q(\cdot|B) \Vdash_\alpha B \wedge C$, by condition (23) and the existence of $q(\cdot|B)$, which equals $p(\cdot|A \wedge B)$,
$\implies p(\cdot|A \wedge B) \Vdash_\alpha B \wedge C$, since $p(\cdot|A \wedge B) = q(\cdot|B)$,
$\implies p(\cdot|A \wedge B) \Vdash_\alpha C$
$\implies p \Vdash_\alpha (A \wedge B) \Rightarrow C$.

Proposition 10 (Geometry of Or). Let $\alpha$ be an acceptance rule that validates Reflexivity. Then, $\alpha$ validates Or if and only if the following condition holds: for each line segment $L$ connecting two complementary facets $\mathcal{P}|B$ and $\mathcal{P}|\neg B$, and for each proposition $A$ in $\mathcal{A}$, if $\alpha$ accepts $A$ at both endpoints of $L$, then $\alpha$ accepts $A$ at each point on $L$; in light of Lemma 3, the condition may be restated as:

$$p(\cdot|B) \Vdash_\alpha A,\ p(\cdot|\neg B) \Vdash_\alpha A \implies p \Vdash_\alpha A. \tag{24}$$

Proof. For the only if side, argue as follows:

$p(\cdot|B) \Vdash_\alpha A,\ p(\cdot|\neg B) \Vdash_\alpha A$
$\implies p \Vdash_\alpha B \Rightarrow A,\ p \Vdash_\alpha \neg B \Rightarrow A$
$\implies p \Vdash_\alpha (B \vee \neg B) \Rightarrow A$, by axiom Or,
$\implies p(\cdot|B \vee \neg B) \Vdash_\alpha A$
$\implies p \Vdash_\alpha A$.

For the converse, suppose that $p \Vdash_\alpha A \Rightarrow C$ and $p \Vdash_\alpha B \Rightarrow C$. It suffices to show that $p \Vdash_\alpha (A \vee B) \Rightarrow C$.
If both $p(\cdot|A)$ and $p(\cdot|B)$ are undefined, then $p(\cdot|A \vee B)$ is undefined and thus we have that $p \Vdash_\alpha (A \vee B) \Rightarrow C$ by default. If one is defined and the other is undefined (say, $p(\cdot|A)$ is defined and $p(\cdot|B)$ is undefined), then $p(B) = 0$ and thus $p(\cdot|A \vee B) = p(\cdot|A)$ is defined, so:

$p \Vdash_\alpha A \Rightarrow C$
$\implies p(\cdot|A) \Vdash_\alpha C$
$\implies p(\cdot|A \vee B) \Vdash_\alpha C$, by $p(\cdot|A \vee B) = p(\cdot|A)$,
$\implies p \Vdash_\alpha (A \vee B) \Rightarrow C$.

Last, suppose that both $p(\cdot|A)$ and $p(\cdot|B)$ are defined. So $p(\cdot|A \vee B)$ is defined. Then argue for Or as follows:

$p \Vdash_\alpha A \Rightarrow C,\ p \Vdash_\alpha B \Rightarrow C$
$\implies p(\cdot|A) \Vdash_\alpha C,\ p(\cdot|B) \Vdash_\alpha C$
$\implies q(\cdot|A) \Vdash_\alpha C,\ q(\cdot|B) \Vdash_\alpha C$, letting $q = p(\cdot|A \vee B)$, so $q(\cdot|A) = p(\cdot|A)$ and $q(\cdot|B) = p(\cdot|B)$,
$\implies q \Vdash_\alpha C \vee \neg A,\ q \Vdash_\alpha C \vee \neg B$ $(*)$, see the explanation below,
$\implies q \Vdash_\alpha C \vee \neg(A \vee B)$, by classical entailment,
$\implies q \Vdash_\alpha C \vee \neg(A \vee B),\ q \Vdash_\alpha A \vee B$, since $q(A \vee B) = 1$ and Proposition 8 applies,
$\implies q \Vdash_\alpha C$, by classical entailment,
$\implies p(\cdot|A \vee B) \Vdash_\alpha C$
$\implies p \Vdash_\alpha (A \vee B) \Rightarrow C$.

It only remains to establish step $(*)$. By the symmetric roles of $A$ and $B$, it suffices to show that $q(\cdot|A) \Vdash_\alpha C$ implies that $q \Vdash_\alpha C \vee \neg A$. If $q(\cdot|\neg A)$ is undefined, then $q(A) = 1 - q(\neg A) = 1 - 0 = 1$ and thus $q = q(\cdot|A) \Vdash_\alpha C \le C \vee \neg A$, so $q \Vdash_\alpha C \vee \neg A$. If $q(\cdot|\neg A)$ is defined, then we have both that $q(\cdot|A) \Vdash_\alpha C$ (by supposition) and that $q(\cdot|\neg A) \Vdash_\alpha \neg A$ (by Reflexivity and Proposition 8). So we have both that $q(\cdot|A) \Vdash_\alpha C \vee \neg A$ and that $q(\cdot|\neg A) \Vdash_\alpha C \vee \neg A$ (by classical entailment). Hence $q \Vdash_\alpha C \vee \neg A$, by the convexity condition (24).

10 The Geometry of System P

In this section we examine, for each axiom in conditional logic system P, its geometrical constraint on acceptance rules. It is an easy corollary of the geometrical characterizations in the preceding section that:

Theorem 2 (Lin 2011). Each probalogical rule validates system P.

Proof sketch. When $|\mathcal{E}| = 3$, one can easily verify that probalogical rules satisfy the geometric conditions given in Propositions 7-10 when the consequence relations in question have antecedents of nonzero probability.
The routine verification can be easily generalized to a proof for all countable dimensional cases.

We now proceed to establish a partial converse to Theorem 2. Recall that acceptance zones for answers have the following form under probalogical rules:

$$p \Vdash_\nu E_i \iff E_i = \nu(p)$$
$$\iff E_i = \bigwedge \{\neg E_j : \sigma(p)_j \lhd_j 1 - r_j\}$$
$$\iff \forall j \ne i,\ \sigma(p)_j \lhd_j 1 - r_j$$
$$\iff \forall j \ne i,\ \frac{p_j}{\max_k p_k} \lhd_j 1 - r_j$$
$$\iff \forall j \ne i,\ \frac{p_j}{p_i} \lhd_j 1 - r_j.\text{[17]}$$

Namely, answer $E_i$ is accepted if and only if each rival has sufficiently low odds relative to $E_i$. To allow for more generalized rules entertained below (Section 15), we relax the conditions that the rejection threshold $1 - r_j$ is in the unit interval and that it is constant for all $i$. Accordingly, say that the acceptance zone of answer $E_i$ under $\alpha$ is a blunt diamond (Figure 19.a) if and only if it takes the following form: there exist thresholds $\{t_{ij} : j \in I \setminus \{i\}\}$ in interval $[0, \infty]$ and inequalities $\{\lhd_{ij} : j \in I \setminus \{i\}\}$ that are either $\le$ or $<$, such that for each $p \in \mathcal{P}$:

1. $p \Vdash_\alpha E_i \iff \forall j \ne i,\ \dfrac{p_j}{p_i} \lhd_{ij} t_{ij}$;
2. if $\lhd_{ij}$ is $\le$ then $t_{ij} < \infty$;
3. if $\lhd_{ij}$ is $<$ then $t_{ij} > 0$.

Say that acceptance rule $\alpha$ is corner-monotone if and only if (i) $\alpha(e_i) = E_i$ for each $i \in I$, and (ii) for each $p \in \mathcal{P}$ such that $\alpha(p) = E_i$, we have that $\alpha(q) = E_i$ for all $q$ in line segment $\overline{p\,e_i}$. Corner-monotonicity is a very natural constraint on acceptance rules and it is satisfied by all the rules we have discussed. Our partial converse to Theorem 2 is as follows.

[Figure 19: Acceptance zone of $E_2$; panels (a) and (b)]

Theorem 3 (Blunt diamond, Lin 2011). Let $\alpha$ be an acceptance rule. If $\alpha$ is everywhere consistent, satisfies corner-monotonicity, and validates system P, then for each answer $E_i$ to question $\mathcal{E}$, the acceptance zone of $E_i$ under $\alpha$ is a blunt diamond.

Proof sketch. Here we present a geometric argument for case $|\mathcal{E}| = 3$, which can be easily generalized to each countable dimension.
Solve for the acceptance zone of $E_2$ under $\alpha$, as depicted in Figure 19.b. By corner-monotonicity, the credal states along side $\overline{e_2 e_1}$ of the triangle at which $\alpha$ accepts $E_2$ form a continuous, unbroken line segment with $e_2$ as an endpoint, which is depicted as the heavy, grey line segment lying on $\overline{e_2 e_1}$. The same is true for side $\overline{e_2 e_3}$.[18] Connect the endpoints of the grey line segments to the opposite corners by straight lines, which enclose the grey blunt diamond at the corner $e_2$.

[18] There is an issue whether the line segments are open or closed at the endpoints distinct from $e_2$, which would give rise to a possible mixture of strict and weak inequalities, as stated in the theorem. That detail is handled in the formal proof in (Lin 2011), but is ignored here.

Argue as follows that $p \Vdash_\alpha E_2$, for each point $p$ in the blunt diamond. Consider the projection $p'$ of $p$ to the facet $\mathcal{P}|(E_2 \vee E_3)$. Note that $p'$ is in the heavy, grey line segment along side $\overline{e_2 e_3}$. On line segment $\overline{e_1 p'}$, acceptance rule $\alpha$ accepts $E_1$ at one endpoint ($e_1$) and accepts $E_2$ at the other endpoint ($p'$), so $\alpha$ accepts $E_1 \vee E_2$ at both endpoints. Then, by Proposition 10, we have that $p \Vdash_\alpha E_1 \vee E_2$. By applying the same argument to the projection of $p$ to the facet for proposition $E_1 \vee E_2$, we have that $p \Vdash_\alpha E_3 \vee E_2$. Then $p \Vdash_\alpha E_2$, since $E_2$ is entailed by $E_1 \vee E_2$ together with $E_2 \vee E_3$.

Argue as follows that $q \nVdash_\alpha E_2$, for each point $q$ outside of the blunt diamond. Since $q$ lies outside of the blunt diamond, there exists at least one answer $E_i$ other than $E_2$ such that the projection $q'$ of $q$ to the facet $\mathcal{P}|(E_2 \vee E_i)$ does not touch the grey line segment along side $\overline{e_2 e_i}$. Figure 19.b illustrates the case for $i = 3$. Suppose for reductio that $q \Vdash_\alpha E_2$. Then, by applying Proposition 9 to the projection $q'$ of $q$, we have that $q' \Vdash_\alpha E_2$. But $q' \nVdash_\alpha E_2$, for $q'$ lies outside of the grey line segment, which is a contradiction.
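The equivalence chain that opens this partial converse, from $\nu(p) = E_i$ down to the odds condition $p_j / p_i \lhd_j 1 - r_j$, can be checked numerically. The following is only a minimal sketch under stated assumptions, not the authors' construction: it takes a ternary question, exact rational arithmetic, weak inequalities, and illustrative uniform thresholds $r_i = 1/2$; the function names `sigma`, `camera_shutter`, `accepts_by_odds`, and `grid` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def sigma(p):
    """Rescale credal state p so that its largest component equals 1."""
    top = max(p)
    return [x / top for x in p]

def camera_shutter(p, r, strict=False):
    """Answers (as indices) that survive at p under a camera shutter rule.

    Answer i is rejected when sigma(p)_i <= 1 - r[i] (or < when strict);
    the strongest accepted proposition is the disjunction of the survivors.
    """
    s = sigma(p)
    def rejected(i):
        return s[i] < 1 - r[i] if strict else s[i] <= 1 - r[i]
    return frozenset(i for i in range(len(p)) if not rejected(i))

def accepts_by_odds(p, i, r, strict=False):
    """Odds form: E_i is accepted iff p_j / p_i is low enough for every j != i."""
    if p[i] == 0:
        return False
    for j in range(len(p)):
        if j == i:
            continue
        odds = p[j] / p[i]
        low_enough = odds < 1 - r[j] if strict else odds <= 1 - r[j]
        if not low_enough:
            return False
    return True

def grid(n, dim=3):
    """All credal states over `dim` answers whose components are multiples of 1/n."""
    for counts in product(range(n + 1), repeat=dim):
        if sum(counts) == n:
            yield tuple(F(c, n) for c in counts)

r = [F(1, 2)] * 3  # illustrative thresholds; an assumption of this sketch
# Compare "E_i is the strongest accepted proposition" with the odds condition.
agree = all(
    (camera_shutter(p, r) == frozenset({i})) == accepts_by_odds(p, i, r)
    for p in grid(20) for i in range(3)
)
print(agree)
```

On this grid the two formulations pick out the same acceptance zone for each answer. A finite grid is of course no substitute for the derivation above; it only illustrates what the odds form says.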
11 AGM geometry is trivial

A popular, stronger system for the logic of flat conditionals, R, is obtained from P by adding the following axiom (Lehmann and Magidor 1992):

(Rational Monotonicity) $\dfrac{p \nVdash_\alpha A \Rightarrow \neg B \qquad p \Vdash_\alpha A \Rightarrow C}{p \Vdash_\alpha (A \wedge B) \Rightarrow C}$

Recall the probabilistic Ramsey test assumed in the preceding sections of this paper:

$$p \Vdash_\alpha A \Rightarrow B \iff \alpha(p(\cdot|A)) \le B \text{ or } p(A) = 0.$$

Given this test, validation of system R trivializes uncertain acceptance in the sense defined as follows. Say that acceptance rule $\alpha$ is skeptical if and only if there is some answer to $\mathcal{E}$ that is accepted by $\alpha$ over no open subset of $\mathcal{P}$. Say that $\alpha$ is opinionated if and only if there is no open subset of $\mathcal{P}$ over which $\alpha$ accepts an incomplete, disjunctive answer to $\mathcal{E}$ as strongest. Finally, $\alpha$ is trivial if and only if $\alpha$ is either skeptical or opinionated.

Theorem 4 (Skepticism or opinionation). Let question $\mathcal{E}$ have cardinality $\ge 3$. Suppose that acceptance rule $\alpha$ is everywhere consistent, corner-monotone, and validates system R. Then $\alpha$ is trivial.

Since the probabilistic Ramsey test is based on probabilistic conditioning, acceptance rules must respect the geometry of conditioning in order to validate axioms of nonmonotonic reasoning. What Theorem 4 says is that these geometrical constraints become hopelessly severe when one adds Rational Monotonicity to system P.

Of course, the situation is quite different if one drops probabilistic conditioning from the Ramsey test.[19]

[19] The approach that follows is due to Hannes Leitgeb, who presented his unpublished results at the Opening Celebration of the Center for Formal Epistemology at Carnegie Mellon University in the Summer of 2010. The discussion in this section is based on detailed slides he presented at that meeting and on personal communication with him at that time.

A conditional acceptance rule is a mapping $\beta : \mathcal{P} \times \mathcal{A} \to \mathcal{A}$, where $\beta(p|A) = B$ is interpreted as saying that $B$ is the strongest proposition accepted in $p$ in light of new
information $A$. Then one can state a new, non-probabilistic Ramsey test directly in terms of conditional acceptance:

$$p \Vdash_\beta A \Rightarrow B \iff \beta(p|A) \le B. \tag{25}$$

Such a conditional acceptance rule is an abstract concept that can be filled out in various different ways. For example, say that conditional acceptance rule $\beta$ is Bayesian if and only if there exists a (non-conditional) acceptance rule $\alpha$ such that:

$$\beta(p|A) = \begin{cases} \alpha(p(\cdot|A)) & \text{if } p(A) > 0; \\ \bot & \text{otherwise.} \end{cases} \tag{26}$$

When $\beta$ is Bayesian, the new information $A$ is used to condition the credal state $p$ to obtain $p(\cdot|A)$ and then some new propositional belief state $S'$ is accepted in light of $p(\cdot|A)$ (the upper path in Figure 20). If $\beta$ is Bayesian, then the non-probabilistic Ramsey test for $\beta$ is equivalent to the probabilistic Ramsey test for $\alpha$, so Theorem 4 still applies to $\beta$. But $\beta$ need not be Bayesian. For example, $\beta$ may sidestep Bayesian conditioning entirely by using $\alpha$ to accept a propositional belief state $S = \alpha(p)$ in $p$ and by subsequently applying a propositional belief revision operator $*_p$ (that may depend on $p$) to convert $\alpha(p)$ into a new propositional belief state $S' = \alpha(p) *_p A$ (the lower path in Figure 20):

$$\beta(p|A) = \alpha(p) *_p A. \tag{27}$$

[Figure 20: Two paths. Upper path: Bayesian conditioning of $p$ to $p(\cdot|A)$, followed by $\alpha$. Lower path: $\alpha$ applied to $p$ to obtain $S$, followed by propositional belief revision $*_p$.]

In that case, the validation of system R depends entirely on the propositional revision operator $*_p$; probabilistic conditioning and $\alpha$ are both irrelevant, so the geometrical proof of Theorem 4 is also sidestepped. To validate Rational Monotonicity, it is sufficient to require that each $*_p$ be an AGM belief revision operator (Harper 1975, Alchourrón et al.
1985), thanks to the translation between nonmonotonic logic and belief revision due to Makinson and Gärdenfors (1991).[20]

The escape route just described does not really vindicate or explain Rational Monotonicity from a Bayesian perspective, since Bayesian conditioning is bypassed and Rational Monotonicity is simply imposed on the propositional belief revision operator $*_p$. Moreover, as explained above, non-Bayesian conditional acceptance rules validate system R. On the other hand, it is an immediate corollary of Theorem 2 that the Bayesian rules of form:

$$\beta(p|A) = \nu(p(\cdot|A)) \tag{28}$$

all validate system P with respect to the non-probabilistic Ramsey test. We propose, therefore, that system P reflects Bayesian ideals better than system R.

The proof of Theorem 4 proceeds by establishing a slightly stronger result. The following two properties of an acceptance rule are derivable from validation of system R. Say that $\alpha$ satisfies preservation if and only if, for each credal state $p$ and each proposition $E$, if (new information) $E$ is consistent with (the prior belief state) $\alpha(p)$, then (the posterior belief state) $\alpha(p(\cdot|E))$ entails $\alpha(p) \wedge E$. Say that $\alpha$ satisfies inclusion if and only if, for each credal state $p$ and each proposition $E$, $\alpha(p(\cdot|E))$ is entailed by $\alpha(p) \wedge E$.[21] So, to prove Theorem 4, it suffices to prove the following theorem:

Theorem 5. Let question $\mathcal{E}$ have cardinality $\ge 3$. Suppose that acceptance rule $\alpha$ is everywhere consistent, corner-monotone, and satisfies preservation and inclusion. Then $\alpha$ is trivial.

The proof of Theorem 5 proceeds by a sequence of lemmas and occupies the balance of this section. Suppose that rule $\alpha$ is everywhere consistent, corner-monotone, and satisfies preservation and inclusion. Suppose further that $\alpha$ is not skeptical. It suffices to show that $\alpha$ is opinionated. Let $E_i, E_j$ be distinct answers to $\mathcal{E}$. Choose an arbitrary, third answer $E_m$ to $\mathcal{E}$ (since $\mathcal{E}$ is assumed to have at least three answers).
Let $\triangle e_i e_j e_m$ denote the two-dimensional facet $\mathcal{P}|(E_i \vee E_j \vee E_m)$ (Figure 21.a). Let $\overline{e_i e_m}$ denote the one-dimensional facet $\mathcal{P}|(E_i \vee E_m)$, and similarly for $\overline{e_i e_j}$ and $\overline{e_j e_m}$. Let $L_{im}$ be the set of the credal states on line segment $\overline{e_i e_m}$ at which $E_i$ is accepted by $\alpha$ as strongest; namely:

$$L_{im} = \{p \in \overline{e_i e_m} : \alpha(p) = E_i\}.$$

[20] It is not necessary, though, because to validate system R one does not have to require that $*_p$ satisfies the consistency axiom in AGM, although all the other axioms have to be satisfied. We thank David Etlin for pointing this out.

[21] Inclusion is derivable from system P alone. Preservation is derivable from a special case of Rational Monotonicity alone in which $A$ is replaced by the tautology $\top$. They are so named because of their correspondence to the AGM axioms $K{*}3$ and $K{*}4$.

[Figure 21: Why system R is trivial; panels (a), (b), and (c)]

Lemma 4. $L_{im}$ is a connected line segment of nonzero length that contains $e_i$ but does not contain $e_m$.

Proof. By non-skepticism, there exists open subset $O$ of $\mathcal{P}$ over which $\alpha$ accepts $E_i$ as strongest. Let $O'$ be the image $\{o(\cdot|E_i \vee E_m) : o \in O\}$ of $O$ under conditioning on $E_i \vee E_m$. Since $O$ is open, $O'$ is an open subset of $\overline{e_i e_m}$. Note that the conditioning proposition $E_i \vee E_m$ is consistent with the prior belief state $E_i$, so preservation applies. Since $\alpha$ satisfies preservation, $\alpha$ accepts old belief $E_i$ over $O'$. It follows that $\alpha$ accepts $E_i$ as strongest over $O'$, because $\alpha$ is consistent and the only proposition strictly stronger than $E_i$ in the algebra is the inconsistent proposition $\bot$. So $L_{im}$ is nonempty. Then, since $\alpha$ is corner-monotone, $L_{im}$ is a nonempty, connected line segment that contains $e_i$. It remains to show that $L_{im}$ does not contain $e_m$. Suppose for reductio that $L_{im}$ contains $e_m$. Then $L_{im}$ must be so large that it is identical to $\overline{e_i e_m}$, by corner-monotonicity.
By the same argument for showing that there is an open subset $O'$ of $\overline{e_i e_m}$ over which $\alpha$ accepts $E_i$, we have that there is an open subset $O''$ of $\overline{e_i e_m}$ over which $\alpha$ accepts $E_m$. So $\alpha$ accepts both $E_m$ and $E_i$ over $O''$, and hence by closure under conjunction, $\alpha$ accepts their conjunction, which is an inconsistent proposition. So $\alpha$ is not consistent, which is a contradiction.

Let $a$ be the endpoint of $L_{im}$ that is closest to $e_m$; namely, probability measure $a$ is such that:

$$a \in \overline{e_i e_m}, \qquad a(E_m) = \sup\{p(E_m) : p \in L_{im}\}.$$

By the lemma we just proved, point $a$ lies in the interior of side $\overline{e_i e_m}$. Applying the above argument for pair $(i, m)$ to pair $(j, m)$, we have that the set $L_{jm}$, defined by

$$L_{jm} = \{p \in \overline{e_j e_m} : \alpha(p) = E_j\},$$

is a connected line segment of nonzero length that contains $e_j$ but does not contain $e_m$, with endpoint $b$ that lies in the interior of side $\overline{e_j e_m}$. Since both points $a, b$ lie in the interiors of their respective sides, we have the following constructions. Let $A$ be the line that connects $a$ to $e_j$, $B$ be the line that connects $b$ to $e_i$, and $C$ be the line that connects $e_m$ through the intersection $d$ of $A$ and $B$, to point $c$ on side $\overline{e_i e_j}$.

Lemma 5. $\alpha$ accepts $E_i$ as strongest over the interior of $\triangle a d e_i$.

Proof. Consider an arbitrary point $p$ in the interior of $\triangle a d e_i$ (Figure 21.b). Argue as follows that $\alpha$ accepts $E_i \vee E_j$ at $p$. Take $p$ as a prior state and consider $\neg E_j$ as the conditioning information. Note that credal state $p(\cdot|\neg E_j)$ falls inside $L_{im}$, so $\alpha$ accepts $E_i$ as strongest at the posterior credal state $p(\cdot|\neg E_j)$. Then, since $\alpha$ satisfies inclusion, we have that:

$$\alpha(p) \wedge \neg E_j \models E_i$$

(namely, the posterior belief state $E_i$ is entailed by the conjunction of the prior belief state and the conditioning information). Then, by the consistency of $\alpha$ and the mutual exclusion among the answers, we have only three possibilities for $\alpha(p)$: $\alpha(p)$ is either $E_i$, or $E_j$, or $E_i \vee E_j$. Rule out the last two alternatives as follows.
Suppose for reductio that the prior belief state α(p) is Ej or Ei ∨ Ej. Consider ¬Ei as the conditioning information, which is consistent with the prior belief state and thus makes preservation applicable. Then, since α satisfies preservation, the posterior belief state α(p(·|¬Ei)) must entail α(p) ∧ ¬Ei (i.e. the conjunction of the prior belief state and the information). But the latter proposition α(p) ∧ ¬Ei equals Ej, by the reductio hypothesis. So α(p(·|¬Ei)) = Ej, by the consistency of α. Hence p(·|¬Ei) lies on line segment Ljm by the construction of Ljm, but that is impossible (Figure 21.b). Ruling out the last two alternatives for α(p), we conclude that α(p) = Ei.

Lemma 6. α accepts Ei as strongest over the interior of ei c.

Proof. Let p be an arbitrary interior point of △a d ei. So α(p) = Ei. Consider proposition Ei ∨ Ej as the conditioning information. Then, since α satisfies preservation, the posterior belief state α(p(·|Ei ∨ Ej)) entails α(p) ∧ (Ei ∨ Ej) (i.e. the conjunction of the prior belief state and the information), which equals Ei. Then, by consistency, the posterior belief state is determined: α(p(·|Ei ∨ Ej)) = Ei. Let q be an arbitrary point in the interior of ei c. Then q can be expressed as q = p(·|Ei ∨ Ej) for some point p in the interior of △a d ei (Figure 21.c). So, by the formula we just proved, α(q) = α(p(·|Ei ∨ Ej)) = Ei, as required.

Lemma 7. There is no open subset of ei ej over which α accepts Ei ∨ Ej as strongest.

Proof. We have established in the last lemma that α accepts Ei as strongest over the interior of ei c. By the same argument, α accepts Ej as strongest over the interior of ej c (Figure 21.c). So if α accepts disjunction Ei ∨ Ej as strongest somewhere on ei ej, α does so at some of the three points: ei, ej, and c. (We can rule out the first two alternatives; but for the sake of the lemma, this result suffices.)
Since the choice of Ei and Ej is arbitrary, the last lemma generalizes to the following:

Lemma 8. For each pair of distinct answers Ei, Ej to E, there is no open subset of ei ej over which α accepts Ei ∨ Ej as strongest.

The last lemma establishes opinionation for all edges of the simplex. The next step is to extend opinionation to the whole simplex.

Lemma 9. For each disjunction D of at least two distinct answers to the question, there is no open subset of P over which α accepts D as strongest.

Proof. Suppose for reductio that some disjunction Ei ∨ Ej ∨ X of at least two distinct answers is accepted by α as strongest over some open subset O of P. Take Ei ∨ Ej ∨ X as the prior belief state at each point in O and consider Ei ∨ Ej as the conditioning information. So the image O′ (= {p(·|Ei ∨ Ej) : p ∈ O}) of O under conditioning on Ei ∨ Ej is an open subset of 1-dimensional space ei ej. Let p′ be an arbitrary point in O′. Since α satisfies inclusion, posterior belief state α(p′) is entailed by (Ei ∨ Ej ∨ X) ∧ (Ei ∨ Ej) (i.e. the conjunction of the prior state and the new information), which equals Ei ∨ Ej. But α(p′) also entails Ei ∨ Ej, for otherwise the process of conditioning p′ on ¬Ej to obtain ei would violate the fact that α satisfies inclusion and accepts Ei at ei. So α(p′) = Ei ∨ Ej. Hence α accepts Ei ∨ Ej as strongest over open subset O′ of ei ej, which contradicts the last lemma.

Proof of Theorem 5. Since the last lemma states that α is opinionated, we are done.

Proof of Theorem 4. Immediate from Theorem 5.

12 A new probabilistic semantics for flat conditionals

Axiom system P is characteristic of Adams’ logic of flat conditionals, so it is not surprising that the probalogical rules yield a new probabilistic semantics for which Adams’ logic is sound. In fact, Adams’ logic is both sound and complete for the new semantics.
Let L be a set of sentences of a propositional language that is closed under conjunction, disjunction, and negation. Let ⇒ be a sentential connective standing for “if ... then ...”. The language for the logic of flat conditionals, written L⇒, is the set of all sentences ϕ ⇒ ψ with ϕ, ψ ∈ L. Adams’ logic of flat conditionals for language L⇒ is just the system P that we have stated, except that now it is construed as a system of rules of inference (with the symbol “p ⊩α” deleted). Say that γ is derivable from Γ in Adams’ logic of flat conditionals, written Γ ⊢Adams γ, if and only if γ is derivable from Γ in a finite number of steps using the rules of inference in system P.

A probabilistic model of acceptance for language L⇒ is a triple:

M = (α, p, [[·]]),

where α : P → A is an acceptance rule, p is a probability measure in the domain P of α, and [[·]] is a classical interpretation of L to the codomain A of α. When M = (α, p, [[·]]), say that α is the underlying acceptance rule of M. Let ϕ ⇒ ψ be a flat conditional in L⇒. Acceptance of flat conditional ϕ ⇒ ψ in model M = (α, p, [[·]]), written M ⊨ ϕ ⇒ ψ, is defined by the probabilistic Ramsey test:

\[
M \vDash \varphi \Rightarrow \psi
\iff p \Vdash_\alpha [[\varphi]] \Rightarrow [[\psi]]
\iff p(\cdot \mid [[\varphi]]) \Vdash_\alpha [[\psi]] \ \text{ or } \ p([[\varphi]]) = 0 .
\]

Let Γ be a set of flat conditionals in L⇒. Acceptance of Γ in model M is defined by: M ⊨ Γ if and only if M ⊨ γ for all γ ∈ Γ. Validity is defined straightforwardly, as preservation of acceptance. Let C be a class of acceptance rules. Say that C validates the inference from Γ to γ, written Γ ⊨C γ, if and only if for each probabilistic model M whose underlying acceptance rule is in C, if M ⊨ Γ, then M ⊨ γ.
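The probabilistic Ramsey test can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: propositions are represented as sets of answers (so [[ϕ]] and [[ψ]] are taken as already interpreted), the underlying acceptance rule is a camera shutter rule with a single threshold r, the ratio used by that rule is assumed to be pi divided by the largest answer probability, and boundary cases are resolved with ≤. The names condition, shutter, and accepts_conditional are hypothetical.

```python
from fractions import Fraction as F

def condition(p, event):
    """Return p(.|event), or None when p(event) = 0."""
    mass = sum(p[i] for i in event)
    if mass == 0:
        return None
    return {i: (p[i] / mass if i in event else F(0)) for i in p}

def shutter(p, r):
    """Assumed camera shutter rule: reject answer i when p_i / max_j p_j <= 1 - r.
    The result is the set of surviving answers, read as their disjunction."""
    top = max(p.values())
    return frozenset(i for i in p if p[i] / top > 1 - r)

def accepts_conditional(alpha, p, phi, psi):
    """Ramsey test: accept phi => psi iff p(phi) = 0 or alpha(p(.|phi)) entails psi."""
    posterior = condition(p, phi)
    if posterior is None:
        return True
    return alpha(posterior) <= frozenset(psi)  # subset relation models entailment

p = {"a": F(6, 10), "b": F(3, 10), "c": F(1, 10)}
alpha = lambda q: shutter(q, F(1, 2))

print(accepts_conditional(alpha, p, {"b", "c"}, {"b"}))  # antecedent b-or-c
print(accepts_conditional(alpha, p, {"a", "b"}, {"b"}))  # antecedent a-or-b
print(accepts_conditional(alpha, p, set(), {"b"}))       # zero-probability antecedent
```

Under these assumptions the first and third conditionals are accepted and the second is not: conditioning on b-or-c leaves b dominant, conditioning on a-or-b leaves a dominant, and a zero-probability antecedent is accepted vacuously.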
The proposed probabilistic semantics has the following attractive properties: (i) it is based on the probabilistic Ramsey test for accepting conditionals; (ii) it defines validity simply as preservation of acceptance, which improves upon Adams’ ε-δ semantics (Adams 1975); and (iii) it allows for accepting propositions of probabilities significantly less than 1, which improves upon Pearl’s infinitesimal semantics (Pearl 1989). To establish the soundness and completeness result for Adams’ logic of flat conditionals, it suffices to assume that the underlying acceptance rule is probalogical, or equivalently, a camera shutter rule:

Theorem 6 (soundness and completeness, Lin 2011). Let N be the class of the camera shutter rules. Then, for each finite sentence set Γ and each sentence γ in the language L⇒ of flat conditionals, Γ ⊢Adams γ if and only if Γ ⊨N γ.

13 Question-invariance

To this point, we have considered acceptance only within a fixed question E. But one can and should consider the behavior of acceptance rules across questions. Let Ω denote some infinite collection of possibilities. A question E = {Ei : i ∈ I} is a countable partition of Ω such that each answer/cell Ei is infinite; the requirement of infinite answers rules out the artificial possibility of a maximally informative question whose answers cannot be strengthened. Let AE denote the least collection of propositions containing E and closed under negation and countable disjunction and conjunction. Let E denote the set of all such questions over Ω, and let P denote the set of all countably additive probability measures p such that p is defined on AE for some question E in E. If p is in P, let Ap denote the domain of p and let Ep denote the (unique) question that generates Ap. A (cross-question) acceptance rule is a map β defined on P such that β always maps p to a proposition in Ap.
Then the rules discussed earlier in this paper can be defined explicitly across questions as follows:

\[
\lambda_r(p) = \bigwedge \{\neg E_i : p_i \le 1 - r \text{ and } i \in I_p\};
\]
\[
\lambda(p) = \lambda_{s(p)}(p), \quad \text{where } s(p) = 1 - \frac{1}{2|I_p|};
\]
\[
\nu_r(p) = \bigwedge \{\neg E_i : \sigma(p)_i \vartriangleleft_i 1 - r_i \text{ and } i \in I_p\};
\]
\[
\pi_r(p) =
\begin{cases}
\top & \text{if } \lambda_r(p) = \bot; \\
\lambda_r(p) & \text{otherwise.}
\end{cases}
\]

Rule λr is the Lockean rule with a fixed threshold across all questions in E. Rule νr is the probalogical rule. Rule λ is the ad hoc Lockean rule whose threshold is adjusted to avoid lottery paradoxes in finite questions. Rule πr is the Pollockian rule that substitutes ⊤ for ⊥ whenever the latter is produced by λr. Say that cross-question acceptance rule β is question-invariant if and only if:

p(A) = q(A) ⟹ (p ⊩β A ⟺ q ⊩β A),

for each p, q in P and for each A that is in both Ap and Aq. Question-invariance is appealing. First, question-invariance makes it easier to compute whether to accept A in light of p, since all of the detailed structure of p aside from p(A) can be ignored. Second, question-invariance allows for the accumulation of accepted propositions as one’s question is refined by new concepts and theories. Third, question-invariance allows individual scientists pursuing distinct questions to pool their accepted conclusions.

Probalogical rules, however, are not even remotely question-invariant. For example, in a four-ticket lottery, the probalogical rule ν2/3 licenses acceptance of “ticket 1 will lose” when the question is “will ticket 1 lose or not?”, but not when the question is “which ticket will win?”. That makes one wonder whether the question-dependence of probalogical rules is a design defect that could have been avoided. We now proceed to demonstrate that no question-invariant rule has the three crucial virtues of the probalogical rules: consistency, logical closure, and non-skeptical acceptance of uncertain propositions. Here is the first sign of trouble.
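The four rules defined above, and the four-ticket lottery example, can be sketched in Python. This is a minimal sketch rather than a definitive implementation, and it rests on assumptions that the text here does not settle: σ(p)i is assumed to be pi divided by the largest answer probability, the relation between σ(p)i and 1 − ri is resolved as ≤, and a single threshold r is used for every answer. A proposition is represented as the set of answers that survive rejection, so the empty set stands for ⊥ and the full set for ⊤. All function names are hypothetical.

```python
from fractions import Fraction as F

def lockean(p, r):
    """lambda_r: conjoin not-E_i for every answer with p_i <= 1 - r.
    Returns the surviving answers; the empty set stands for the contradiction."""
    return frozenset(i for i in p if p[i] > 1 - r)

def lockean_adhoc(p):
    """lambda: Lockean rule with threshold s(p) = 1 - 1/(2|I_p|)."""
    return lockean(p, 1 - F(1, 2 * len(p)))

def shutter(p, r):
    """nu_r with one threshold for all answers (assumed ratio p_i / max_j p_j,
    boundary resolved with <=)."""
    top = max(p.values())
    return frozenset(i for i in p if p[i] / top > 1 - r)

def pollock(p, r):
    """pi_r: replace the contradiction by the tautology."""
    accepted = lockean(p, r)
    return accepted if accepted else frozenset(p)

lottery = {t: F(1, 4) for t in (1, 2, 3, 4)}   # "which ticket will win?"
binary = {"lose": F(3, 4), "win": F(1, 4)}     # "will ticket 1 lose or not?"

print(lockean(lottery, F(3, 4)))   # every ticket rejected: contradiction
print(pollock(lottery, F(3, 4)))   # falls back to the tautology
print(lockean_adhoc(lottery))      # nothing rejected
print(shutter(binary, F(2, 3)))    # "ticket 1 will lose" is accepted
print(shutter(lottery, F(2, 3)))   # nothing rejected in the finer question
```

Exact fractions are used because the lottery example sits on the boundary of the acceptance zone, where floating-point rounding could flip the verdict.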
Say that acceptance rule β is non-skeptical about answer E in question E if and only if β accepts E at some probability measure p defined on AE such that p(E) < 1. Say that acceptance rule β is gullible about E in E if and only if β accepts E at some p defined on AE such that p(E) = 0. Then:

Proposition 11. Suppose that β is question-invariant. If β is non-skeptical about answer E in ternary question E, then β is gullible about E in E.

Proof. Consider the equilateral triangle △quv depicted in Figure 22.a. Note that p lies on a line parallel to e2 e3 extending the base uv of the triangle △quv and q is at the apex. Suppose that p ⊩β E1. Then u, v ⊩β E1, by question-invariance. So u ⊩β ¬E2 and v ⊩β ¬E3. Then by question-invariance again, q ⊩β ¬E2 and q ⊩β ¬E3. So q ⊩β ¬E2 ∧ ¬E3 = E1. Therefore, if β accepts E1 at p, then β also accepts E1 at q. Now we can chain such triangles all the way to the bottom of P3 to obtain s such that s ⊩β E1 and s(E1) = 0. Note that if p(E1) < 1, there is room in P3 for such a chain.

[Figure 22: Triangles preserve acceptance; panels (a), (b).]

It gets worse. Say that β is dogmatic about answer E in question E if and only if β accepts E at each probability measure defined on AE.

Proposition 12. Suppose that β is question-invariant. If β is non-skeptical about answer E in ternary question E, then β is dogmatic about E in E.

Proof. Consider the situation depicted in Figure 22.b, in which question-invariant rule β accepts E1 at s, with s(E1) = 0, and let q be an arbitrary credal state in P3. Then there exists an equilateral triangle with s on its base and with q at its apex, so β also accepts E1 at the arbitrarily chosen state q.

Here is the coup de grâce. Say that β is everywhere inconsistent if and only if β(p) = ⊥, for all p in P.
Nothing could be more useless than an acceptance rule that accepts the contradiction in every possible credal state and every possible question.

Theorem 7. Suppose that β is question-invariant. If β is non-skeptical about at least two distinct answers in some ternary question, then β is everywhere inconsistent.

Proof. Suppose that β is non-skeptical about at least two distinct answers Ei, Ej in ternary question E. Then, by Proposition 12, β accepts Ei ∧ Ej and, thus, ⊥ at every state in question E. But ⊥ has the same probability, namely 0, at every state in every question. So, by question-invariance, ⊥ is accepted at every state in every question.

It follows from the preceding propositions that none of the rules listed above is question-invariant. That fact is obvious for probalogical rules and the ad hoc rules, all of which base acceptance explicitly on the underlying question. However, even the logically closed Lockean rule with fixed threshold is question-dependent whenever the threshold is strictly between 0 and 1, for then the rule is neither skeptical nor everywhere inconsistent (at threshold 0 it is everywhere inconsistent and at threshold 1 it is skeptical). If closure under conjunction is dropped, the Lockean rule with a fixed threshold is question-invariant and is non-skeptical, but is also consistent, so it escapes Theorem 7 (recall that set-valued rules are not covered by that proposition).

We are inclined to view Theorem 7 as a reductio argument against question-invariance. That conclusion fits naturally with a minimalist, pragmatic interpretation of accepted proposition A as a more or less apt proxy for one’s underlying credal state p, rather than as new “information” that alters p (e.g., by conditioning p on A). Question-invariance would be nice, but it is not rationally mandated under the minimalist conception of acceptance, and its price in terms of logical virtues within questions is too high.
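The question-dependence of the logically closed Lockean rule with a fixed threshold can be checked on a small example. The sketch below is illustrative only: the credal states, the threshold 3/4, and the name lockean are hypothetical choices, and a proposition is represented as the set of answers that survive rejection.

```python
from fractions import Fraction as F

def lockean(p, r):
    """Logically closed Lockean rule: reject every answer with p_i <= 1 - r and
    return the disjunction of the remaining answers as a set."""
    return frozenset(i for i in p if p[i] > 1 - r)

r = F(3, 4)
ternary = {"E1": F(3, 5), "E2": F(1, 5), "E3": F(1, 5)}
binary = {"E1": F(3, 5), "not-E1": F(2, 5)}   # not-E1 is E2-or-E3

# E1 is accepted when the accepted proposition entails E1.
accepts_E1_ternary = lockean(ternary, r) <= {"E1"}
accepts_E1_binary = lockean(binary, r) <= {"E1"}
print(ternary["E1"] == binary["E1"], accepts_E1_ternary, accepts_E1_binary)
# prints: True True False
```

Both states assign E1 the same probability 3/5, yet E1 is accepted in the ternary question (by conjoining ¬E2 and ¬E3) and not in the binary one, which is exactly a failure of question-invariance.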
14 Refinement-monotonicity

Invariance across all questions is a strong requirement. In this section, we consider the consequences of requiring invariance only over questions that refine or coarsen the given question. Say that E refines F (or that F coarsens E) if and only if each answer to E entails some answer to F. When E refines F, write E ⪯ F. By extension, say that p refines q (written p ⪯ q) when q is the restriction of p to Aq, which implies that Ep ⪯ Eq. Say that cross-question acceptance rule β is refinement-invariant if and only if:

p ⪯ q ⟹ (p ⊩β A ⟺ q ⊩β A),

for each p, q in P and for each proposition A in Aq. However:

Proposition 13. Refinement-invariance is equivalent to question-invariance.

Proof. Suppose that p(A) = q(A). Let r = (p(A), 1 − p(A)) over question {A, ¬A}. Then p ⪯ r ⪰ q. By refinement-invariance, it follows that p ⊩β A ⟺ q ⊩β A. The converse is immediate.

Refinement-invariance demands that acceptance be preserved under both refinement and coarsening. Since questions tend to become more precise as inquiry proceeds, perhaps it suffices merely to preserve acceptance under refinement. Accordingly, say that β is refinement-monotone if and only if:

p ⪯ q ⟹ β(p) ≤ β(q),

for all p, q in P. Refinement-monotonicity suffices for the accumulation of accepted conclusions as the question is refined and for the pooling of propositions accepted across diverse questions. With respect to the latter, let p, q, r be in P. Say that r is a conjunction of p, q if and only if r is a maximally coarse common refinement of p, q. Then say that β preserves conjunction if and only if β(r) ≤ β(p) ∧ β(q), for each p and q in P and for each conjunction r of p and q. Then it is easy to show that:

Proposition 14. Conjunction-preservation is equivalent to refinement-monotonicity.

Alas, probalogical rules also violate refinement-monotonicity, as witnessed by the simple lottery example in the preceding section of this paper.
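The lottery counterexample to refinement-monotonicity can be checked with a short Python sketch. It is an illustration under assumptions: the probalogical rule is modelled as a camera shutter rule with one threshold, using the ratio pi / maxj pj and resolving the boundary with ≤; the helper names restrict, shutter, and as_fine are hypothetical.

```python
from fractions import Fraction as F

def restrict(p, cell_of):
    """Restriction of p to a coarser question; cell_of maps each fine answer
    to the coarse answer that it entails."""
    q = {}
    for answer, mass in p.items():
        q[cell_of[answer]] = q.get(cell_of[answer], F(0)) + mass
    return q

def shutter(p, r):
    """Assumed camera shutter rule: reject answer i when p_i / max_j p_j <= 1 - r."""
    top = max(p.values())
    return frozenset(i for i in p if p[i] / top > 1 - r)

def as_fine(prop, cell_of):
    """Express a proposition over coarse answers as a set of fine answers."""
    return frozenset(a for a, c in cell_of.items() if c in prop)

p = {t: F(1, 4) for t in (1, 2, 3, 4)}                  # which ticket wins?
cell_of = {1: "win", 2: "lose", 3: "lose", 4: "lose"}   # does ticket 1 win or lose?
q = restrict(p, cell_of)                                # p refines q

fine = shutter(p, F(2, 3))
coarse = as_fine(shutter(q, F(2, 3)), cell_of)
print(q)
print(fine <= coarse)   # refinement-monotonicity would require True
```

Here the proposition accepted at the coarse state q is "ticket 1 loses", while the refined state p accepts only the tautology, which does not entail it.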
Again, the failure is not a defect but an ineluctable consequence of the logical virtues of probalogical rules.

Theorem 8. Suppose that β is refinement-monotone, validates system P in each question, and is non-skeptical about both answers in some binary question. Then there exists a facet of at least two dimensions over which β accepts ⊥ everywhere.

The alternative rules listed above also violate refinement-monotonicity, even though they all fail to validate system P. Choosing a probalogical rule at least yields the net advantage of validating P.

The proof of Theorem 8 proceeds by a sequence of lemmas that rely heavily on the geometrical characterizations of the axioms of P established in Section 9. Consider the binary question {E0, F0}, whose space of credal states is depicted in Figure 23.a as the line next to the triangle. Assume that β is non-skeptical about answers E0 and F0, so that β accepts E0 at p0 and F0 at q.

[Figure 23: Acceptance snakes up the triangle; panels (a), (b), (c), (d).]

Since E0 is infinite, split E0 into infinite answers F1 and E1 to produce the refined, ternary question {F0, F1, E1} (Figure 23.b). Suppose that β is refinement-monotone. Then proposition F0 is accepted throughout the line segment L depicted in Figure 23.b, which is defined to be the set of all credal states that refine q. Similarly, proposition E0 = E1 ∨ F1 is accepted throughout the line segment M, which is the set of all credal states that refine p0. Let line segment N connect the right endpoint of L in Figure 23.b to the opposite corner e1, intersecting M at credal state u; then project u to the (one-dimensional) facet for proposition E1 ∨ F0 to obtain credal state p1. The following lemma concerns p1.

Lemma 10. Suppose β is refinement-monotone and validates system P. Then p1 ⊩β E1.

Proof.
Proposition E1 is accepted by β at e1, by the geometry of Reflexivity (Proposition 8); and F0 is accepted at each point on L, by construction. So the disjunction E1 ∨ F0 is accepted by β at both endpoints of N. Then, since u lies on N, we have that u ⊩β E1 ∨ F0, by the geometry of Or (Proposition 10). We have noted that u ⊩β E1 ∨ F1. So u ⊩β E1, because E1 = (E1 ∨ F0) ∧ (E1 ∨ F1). Then, since p1 is the projection of u onto the facet for a logical consequence of E1, the geometry of Cautious Monotonicity (Proposition 9) yields that p1 ⊩β E1, as required.

The result is that E1 is accepted by β with a lower probability than E0. Split E1 into two infinite, exclusive propositions E2 and F2 and, thus, obtain the finer, quaternary question {F0, F1, F2, E2}. Restrict attention to the two-dimensional, triangular facet for proposition F0 ∨ F2 ∨ E2, as depicted in Figure 23.c. Construct credal state p2 as we did for p1, and argue similarly that E2 is accepted at p2, with an even lower probability than the probability at which E1 is accepted at p1. This construction can be repeated until we obtain a refined, finite question {F0, F1, . . . , Fn, En} such that En is accepted at pn with low probability (Figure 23.d), so low that pn is far away from corner en and lies on or above the line L. Therefore:

Lemma 11. Continuing from the preceding lemma, pn ⊩β En.

Then inconsistency arises:

Lemma 12. Continuing from the preceding lemma, let line segment pn fn intersect L at v. Then v ⊩β ⊥.

Proof. Proposition En is accepted by β at pn, by construction; and Fn is accepted at fn, by the geometry of Reflexivity (Proposition 8). So the disjunction En ∨ Fn is accepted by β at both endpoints of line segment pn fn. Then, by the geometry of Or (Proposition 10), v ⊩β En ∨ Fn. But v ⊩β F0, because v lies on L and thus refines q. Since ⊥ = F0 ∧ (En ∨ Fn), we have that v ⊩β ⊥, as required.

Here is the coup de grâce, of which Theorem 8 is an immediate corollary.

Lemma 13.
Continuing from the preceding lemma, let Pn+2 be the set of probability measures defined on AEn+2, where En+2 is the question {F0, . . . , Fn, En}. Then β accepts ⊥ at each credal state p in facet Pn+2|(F0 ∨ Fn ∨ En).

Proof. Let M denote the two-dimensional facet Pn+2|(F0 ∨ Fn ∨ En). Suppose that v lies in the interior, but not the sides, of M. Since ⊥ is accepted at v, we have that ⊥ is accepted at the three corners f0, fn, en of M, by projecting v to the three corners and by the geometry of Cautious Monotonicity (Proposition 9). Then, since each side of M has endpoints that are corners, we have that ⊥ is accepted on the three sides of M, by the geometry of Or (Proposition 10). Then, since each point on M is on a line segment with endpoints on the sides of M, we have that ⊥ is accepted at each credal state on M, as required. When v is not in the interior of M, v lies on side en f0 of M and, thus, cannot be projected to the opposite corner fn. But in that case we can apply the geometry of Or (Proposition 10) to line segment v fn to show that Fn is accepted at every credal state on v fn. Similarly, F0 ∨ En is accepted at every credal state on w en, where w is defined to be the intersection of line L and f0 fn. So ⊥ is accepted at the intersection of v fn and w en, which is in the interior of M; the second case is thus reduced to the first case.

15 Probalogic generalized

We close with a natural generalization of the probalogical framework. The uniform probability measure over E is the center of the simplex P of probability measures on the least algebra over question E and served as the probalogically weakest credal state in P in the presentation to this point. But, as Levi (1967) has emphasized, the answers to question E typically have different contents (e.g., “quantum mechanics is true” has a great deal of content but “quantum mechanics is false” has very little).
Therefore, a credal state that assigns less probability to a cell that has more content could sensibly be understood as weaker than a uniform state that accords the same probability to all cells. In that case, probalogic should be relative not only to question E, but to an assignment of contents to the answers to E. The result is a family of probalogics sensitive both to the cardinality of question E and to the relative contents of the answers to E.

We approach the issue as follows. If the answers Ei differ in content, it is natural to weight answers by weakness and to think of the neutral credal state as the center of mass of the answers. As a result, the weakest credal state is biased toward answers of low content. In particular, the center of P is stronger than a state closer to a very weak answer. Recall that probalogic is just the geological cube in perspective. The sides of the cube have equal length. To represent differences in content, deform the cube into an oblong box whose side lengths are inversely proportional to the strengths of the corresponding answers (Figure 24). Just like the cube, the oblong box may be viewed as a generalized geologic (recall that geological structure does not uniquely determine the metric). Project the generalized geologic from the box to the triangle credal state space, just as before, to induce a generalized probalogic on it. Then the credal states stronger than p are those in the grey region of Figure 24.d. Disjunction and conjunction are defined as before.

[Figure 24: Deformation of geologic and corresponding deformation of probalogic; panels (a), (b), (c), (d). Panel labels: linear transformation, projective transformation. Axes: p(E1), p(E2), p(E3).]

The weakest proposition in the generalized geologic is (m1, m2, m3) (i.e. the vertex of the box that is most distant from the origin), so its rectilinear projection w to the triangle is the weakest credal state in the corresponding probalogic. Projection preserves ratios between the rectangular coordinates, so we have:

\[
w = \left( \frac{m_1}{M}, \frac{m_2}{M}, \frac{m_3}{M} \right), \quad \text{where } M = \sum_{i \in I} m_i .
\]

The coordinates of w uniquely determine the generalized probalogic on P that has w as the weakest state. Intuitively, the result is like viewing a phone booth, rather than a cubical office, from the origin (Figure 24.d).22 Acceptance rules are still defined as maps that preserve probalogical structure and they look like Figure 25. Although the generalized probalogical acceptance rules appear “oblique”, the boundaries of acceptance zones still follow rays from the corners, so they still validate exactly Adams’ conditional logic. Algebraically, the generalized rules take the following form:

\[
\alpha(p) = \bigwedge \left\{ \neg E_i : \frac{p(E_i)/m_i}{\max_j \, p(E_j)/m_j} \vartriangleleft_i 1 - r_i \text{ and } i \in I \right\} .
\]

22 In terms of projective geometry, the geological transformation is a non-rotational, non-reflective linear transformation and, thus, the induced probalogical transformation is a projective transformation that fixes all the corners.

[Figure 25: Generalized probalogical acceptance rule.]

The acceptance rules introduced in (Levi 1996, p.286) are the same, except that we allow different thresholds ri for different answers Ei while Levi does not. As we mentioned at the outset, Levi sees no justification for these rules, relative to his momentous understanding of acceptance as an explicit decision to condition one’s credal state on the accepted proposition and, therefore, to bet one’s life on it against nothing.
Our own justification for the rules, grounded in a weaker conception of acceptance as apt description of one’s credal state relative to a question, is, again, that they preserve naturally defined logical structures over credal states relative to a question and that they validate exactly Adams’ logic of conditionals.

Acknowledgements

The authors are indebted to David Etlin for recommending that we explicitly state a stronger negative result for rational monotonicity. We are indebted to Teddy Seidenfeld for referring us to the passage in which Levi defines the camera shutter rules. We are indebted to Hannes Leitgeb for sharing his alternative approach based on conditional acceptance rules with us. We are also indebted to Clark Glymour and Greg Wheeler for useful questions.

References

E. W. Adams. The Logic of Conditionals. D. Reidel, Dordrecht, 1975.
C. E. Alchourrón, P. Gärdenfors, and D. Makinson. On the logic of theory change: Partial meet contraction and revision functions. The Journal of Symbolic Logic, 50:510–530, 1985.
H. Arló-Costa and R. Parikh. Conditional probability and defeasible inference. Journal of Philosophical Logic, 34:97–119, 2005.
K. Barwise. Infinitary logic and admissible sets. Journal of Symbolic Logic, 34:226–252, 1969.
I. Douven. A new solution to the paradoxes of rational acceptability. British Journal for the Philosophy of Science, 53:391–410, 2002.
B. van Fraassen. Fine-grained opinion, probability and the logic of full belief. Journal of Philosophical Logic, 24:349–377, 1995.
W. Harper. Rational belief change, Popper functions and counterfactuals. Synthese, 30(1-2):221–262, 1975.
R. Jeffrey. Dracula meets Wolfman: Acceptance vs. partial belief. In M. Swain, editor, Induction, Acceptance, and Rational Belief. 1970.
C. Karp. Languages with Expressions of Infinite Length. North Holland, Dordrecht, 1964.
K. Kelly. Ockham’s razor, truth, and information. In J. van Benthem and P. Adriaans, editors, Handbook of the Philosophy of Information. Elsevier, Dordrecht, 2008.
K. Kelly. Ockham’s razor, truth, and probability. In P. Bandyopadhyay and M. Forster, editors, Handbook on the Philosophy of Statistics. Elsevier, Dordrecht, 2010.
S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990.
H. Kyburg. Probability and the Logic of Rational Belief. Wesleyan University Press, Middletown, 1961.
D. Lehmann and M. Magidor. What does a conditional knowledge base entail? Artificial Intelligence, 55:1–60, 1992.
I. Levi. Gambling With Truth: An Essay on Induction and the Aims of Science. MIT Press, Cambridge, 2nd edition, 1967.
I. Levi. Information and inference. Synthese, 19:369–91, 1969.
I. Levi. For the Sake of the Argument: Ramsey Test Conditionals, Inductive Inference and Non-monotonic Reasoning. Cambridge University Press, Cambridge, 1996.
H. Lin. A New Theory of Acceptance that Solves the Lottery Paradox and Provides a Simplified Probabilistic Semantics for Adams’ Logic of Conditionals. Master’s thesis, Carnegie Mellon University, Pittsburgh, PA, 2011.
D. Makinson and P. Gärdenfors. Relations between the logic of theory change and nonmonotonic logic. In A. Fuhrmann and M. Morreau, editors, The Logic of Theory Change (Verlag Lecture Notes in Computer Science 465), pages 183–205, Springer, 1991.
V. Novak, I. Perfilieva, and J. Mockor. Mathematical principles of fuzzy logic. Kluwer, Dordrecht, 2000.
J. Pearl. Probabilistic semantics for nonmonotonic reasoning: A survey. In Proc. First International Conference on Principles of Knowledge Representation and Reasoning, pages 505–516. 1989.
J. Pollock. Cognitive Carpentry. MIT Press, Cambridge, MA, 1995.
F. P. Ramsey. General propositions and causality. In A. Mellor, editor, F. Ramsey and Philosophical Papers. Cambridge University Press, 1990, Cambridge, 1929.
S. Ryan. The epistemic virtues of consistency. Synthese, 109:121–41, 1996.
L. Zadeh. Fuzzy sets. Information and Control, 8:338–353, 1965.

Changing Types: Information Dynamics for Qualitative Type Spaces

Dominik Klein and Eric Pacuit
TiLPS Tilburg, University of Maryland
Abstract

Many different approaches to describing the players’ knowledge and beliefs can be found in the literature on the epistemic foundations of game theory. We focus here on non-probabilistic approaches. The two most prominent are the so-called Kripke or Aumann structures and knowledge structures (non-probabilistic variants of Harsanyi type spaces). Much of the recent work on Kripke structures has focused on dynamic extensions and simple ways of incorporating these. We argue that many of these ideas can be applied to knowledge structures as well. Our main result characterizes precisely when one type can be transformed into another type by a specific type of information update. Our work in this paper suggests that it would be interesting to pursue a theory of “information dynamics” for knowledge structures (and eventually Harsanyi type spaces).

1 Introduction

The central thesis of the epistemic program in game theory is that the basic mathematical model of a game situation should include an explicit parameter describing the players’ informational attitudes1; see (Brandenburger 2008) for the relevant references and a discussion of the key results, and (Perea 2012) for an introduction to this literature. Games are played in specific informational contexts, in which players have specific knowledge and beliefs about each other.2 Many different formal models have been used to represent such informational contexts of a game (see Bonanno and Battigalli 1999, van der Hoek and Pauly 2006, van Benthem et al. 2011, and references therein, for a discussion).

1 This is, of course, something of a truism regarding games of incomplete or imperfect information. But the thesis is intended to apply to all game situations (see Section 5 of Brandenburger 2010, for a precise description of the crucial differences between an epistemic model of a game and a Bayesian game).
In this paper, we are not only interested in structures that describe the informational context of a game, but also in how these structures can change in response to the players’ observations, communicatory acts or other dynamic operations of information change (cf. van Benthem 2011). We focus our attention on the players’ hard information about the game (which we refer to as knowledge following standard terminology in the game theory and epistemic logic literature) and its dynamics.

Broadly speaking, there are two different types of models that have been used to describe the players’ knowledge (and beliefs) in a game situation. Both types of models include a nonempty set S of states of nature (elements of S are intended to represent possible outcomes of a game situation).3 The first type of models are the so-called Aumann or Kripke structures (Aumann 1999, Fagin et al. 1995). These structures describe the players’ knowledge in terms of an epistemic indistinguishability relation over a (finite) set of states W. The second type of models are the knowledge structures of Fagin et al. (1991; 1999), which are non-probabilistic variants of Harsanyi type spaces (Harsanyi 1967).4 The key concept here is a type, which describes the players’ infinite hierarchy of knowledge (i.e., what the players know about the ground facts, what the players know about each other’s knowledge of the ground facts, what the players know about what each other knows about each other’s knowledge of the ground facts, and so on). The precise relationship between these two types of models was clarified in (Fagin et al. 1991; 1999).

Our goal in this paper is to show how to adapt recent work modeling information change on Kripke structures as a product update with an event model (van Ditmarsch et al. 2007) to the more general setting where the players’ knowledge is represented using knowledge structures.
To the best of our knowledge, this is the first attempt to develop a theory of information change for knowledge structures in the style of recent work on dynamic epistemic logic. Our main result (Theorem 2) characterizes precisely when a type in a fixed knowledge structure can be transformed into another type in that structure using the product update operation.

² This is nicely explained by Adam Brandenburger and Amanda Friedenberg (2010, pg. 801): "In any particular structure, certain beliefs, beliefs about beliefs, ..., will be present and others won't be. So, there is an important implicit assumption behind the choice of a structure. This is that it is "transparent" to the players that the beliefs in the [type] structure -- and only those beliefs -- are possible....The idea is that there is a "context" to the strategic situation (eg., history, conventions, etc.) and this "context" causes the players to rule out certain beliefs."
³ Often, it is assumed that the elements of S can be described by some logical language (for example, propositional logic), but this is not crucial for us in this paper.
⁴ See (Siniscalchi 2008) for a modern introduction to type spaces as models of beliefs and (Myerson 2004) for a discussion of Harsanyi's classic paper.

There are two main motivations for this technical study. The first is to explore generalizations of the product update operation. This is done in Section 3.1 where we also generalize a result of (van Ditmarsch and French 2009) characterizing when a Kripke structure can be transformed into another Kripke structure by a product update. The second motivation for this work is to initiate a study of information dynamics for epistemic models of games. The players' information in a game can change in two ways. First, the players' knowledge and beliefs change during the play of a sequential game (for example, they learn about the choices of the other players as the game is played).
The second way that the players' information can change is in response to some exogenous event. For example, during a poker game, a player may accidentally drop his cards or a gust of wind may allow a subset of the players to see certain cards. Of course, one may argue that game-theoretic models should abstract away from these payoff-irrelevant events. We agree that the types of events we have in mind here are irrelevant to a game-theoretic analysis. But these events do change the context⁵ of a game by revealing important information to, or hiding it from, all or some of the players. This paper is a first step towards a more general project that uses the dynamic epistemic logic framework to represent changes in the informational context of a game.

Our paper is organized as follows. Section 2 provides the necessary background on (dynamic) epistemic logic and knowledge structures. Note that this section is written for a reader already familiar with the key concepts and definitions. Consult (van Benthem 2011) and (Fagin et al. 1999) for motivations and a broader discussion of the literature. Our main result is in Section 3.2, with the technical preliminaries found in Section 3.1. We conclude in Section 4 with a discussion of topics for future research.

2 Background

2.1 A primer on Dynamic Epistemic Logic

We assume the reader is familiar with the basics of (dynamic) epistemic logic, and so we only give the key definitions here (see the textbooks Fagin et al. 1995, van Benthem 2011, for an introduction to the subsequent definitions). Let I be the finite set of players and At a (finite or infinite) set of atomic propositions.⁶

⁵ Here, we take the "context" of a game to be all events that influence the players' beliefs in the game situation.
⁶ Atomic propositions are intended to represent properties of states of nature.

Definition 2.1.
The epistemic language, denoted L_EL, is the smallest set of formulas generated by the following grammar:

ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | K_i ϕ

where p ∈ At and i ∈ I. Define L_i ϕ as the dual of K_i (i.e., L_i ϕ := ¬K_i ¬ϕ) and the other boolean connectives (e.g., ∨, →) as usual.

The intended interpretation of K_i ϕ is "agent i knows that ϕ (is true)". The standard semantics for L_EL are Kripke structures.

Definition 2.2. A Kripke structure (for a set of atomic propositions At) is a tuple ⟨W, {R_i}_{i∈I}, V⟩ where W is a set of states, each R_i ⊆ W × W is an equivalence relation,⁷ and V : At → ℘(W) is a valuation function.

To simplify notation, we may write w ∈ M when w ∈ W. Formulas of L_EL are interpreted at states in a Kripke model in the standard way; we only remind the reader of the definition for the knowledge modality:

M, w ⊨ K_i ϕ iff for all v ∈ W, if w R_i v then M, v ⊨ ϕ

The central idea of dynamic epistemic logic is to describe events that change a situation, and the agents' (uncertain) perceptions of these events, as a so-called event model.

Definition 2.3. An event model is a tuple ⟨E, {Q_i}_{i∈I}, pre⟩ where E is a set of basic events, each Q_i ⊆ E × E is an equivalence relation,⁸ and pre : E → L_EL assigns to each primitive event a formula that serves as a precondition for that event. We write e ∈ ℰ if e is an event in the set E of an event model ℰ.

The primitive events represent the basic observations available to the agents in a dynamic situation. Similar to Kripke structures, uncertainty about which events are taking place is represented by relations Q_i. Given our assumption that each Q_i is an equivalence relation, the intended interpretation of e Q_i f is that agent i cannot distinguish between events e and f. The key operation of product update describes how to incorporate into a Kripke structure M (describing an epistemic situation) the epistemic event described by an event model ℰ.

⁷ In this paper, we restrict attention to structures where the epistemic relations are equivalence relations.
These are known in the literature as S5-structures or Aumann structures.
⁸ To keep things manageable for this initial study, we restrict attention to event models with equivalence relations. For much of what follows, this assumption is not crucial.

Definition 2.4. The product update of a Kripke model M = ⟨W, {R_i}_{i∈I}, V⟩ and an event model ℰ = ⟨E, {Q_i}_{i∈I}, pre⟩ is a Kripke model M ⊕ ℰ = ⟨W′, {R′_i}_{i∈I}, V′⟩ defined as follows:
• W′ = {(w, e) ∈ W × E | M, w ⊨ pre(e)}
• (w, e) R′_i (w′, e′) iff w R_i w′ and e Q_i e′
• (w, e) ∈ V′(p) iff w ∈ V(p) ◁

This operation (together with variants appropriate for modeling belief and preference change) has been extensively studied in the literature. We do not provide an overview of this literature here (see van Ditmarsch et al. 2007, van Benthem 2011, for an extensive analysis). Rather, the focus is on how to understand this theory of information dynamics in the context of models of knowledge (and beliefs) typically found in the game theory literature.

We need one additional notion from the general theory of modal logic.

Definition 2.5. Suppose that M_1 = ⟨W_1, R_1, V_1⟩ and M_2 = ⟨W_2, R_2, V_2⟩ are Kripke structures. A nonempty relation Z ⊆ W_1 × W_2 is a bisimulation provided for all w_1 ∈ W_1 and w_2 ∈ W_2, if w_1 Z w_2 then:
(atomic harmony) For all p ∈ At, w_1 ∈ V_1(p) iff w_2 ∈ V_2(p).
(zig) If w_1 R_1 v_1 then there is a v_2 ∈ W_2 such that w_2 R_2 v_2 and v_1 Z v_2.
(zag) If w_2 R_2 v_2 then there is a v_1 ∈ W_1 such that w_1 R_1 v_1 and v_1 Z v_2.
We write M_1, w_1 ↔ M_2, w_2 if there is a bisimulation relating w_1 with w_2. We write M_1 ↔ M_2 if there is a bitotal bisimulation between M_1 and M_2, that is, a bisimulation Z such that for every v ∈ M_1 there is some w ∈ M_2 with v Z w and vice versa.
The relation Z is called a simulation from M_1 to M_2, denoted M_1, w_1 → M_2, w_2, if Z satisfies the atomic harmony and zig properties. Z is called total provided for each w_1 ∈ W_1 there is a w_2 ∈ W_2 such that w_1 Z w_2.
Finally, Z is called functional if it is total and a function from W_1 to W_2 (i.e., for every w_1 ∈ W_1 and w_2, w̃_2 ∈ W_2, w_1 Z w_2 and w_1 Z w̃_2 together imply w_2 = w̃_2).

2.2 Knowledge structures

Knowledge structures were introduced in (Fagin et al. 1999) as an alternative semantics for the basic epistemic language L_EL.⁹ They are non-probabilistic versions of Harsanyi type spaces, which are the predominant model of knowledge and beliefs in the literature on the epistemic foundations of game theory (Brandenburger (2010) offers some explanation of why this is the case).

⁹ See (Fagin et al. 1999) for an extended discussion of knowledge structures aimed at game theorists. Fagin (1994) and Fagin and Vardi (1985) show how variants of knowledge structures can provide an elegant semantics for many modal logics.

The key concept is a κ-world (also called a type in the game theory literature) describing the players' infinite hierarchy of knowledge (belief) of a given state of affairs.

Definition 2.6. Let S be a (finite or infinite) nonempty set (whose elements are called states). A κ-world is a vector of functions f⃗ = ⟨f_0, f_1, f_2, …⟩ of length κ (a possibly infinite ordinal) defined inductively as follows.
• A 1-world is a vector ⟨f_0⟩ where f_0 is a state of nature (i.e., f_0 ∈ S).¹⁰
• For κ > 1 of the form κ = λ + 1 (i.e., κ is a successor ordinal) a κ-world is a vector ⟨f_0, …, f_λ⟩ such that ⟨f_i | i < λ⟩ is a λ-world and f_λ is a function from the set of agents I to the power set of the set of λ-worlds over S (i.e., f_λ : I → ℘(F_λ(S)), where F_λ(S) denotes the set of all λ-worlds over S) that satisfies the following conditions. Let f⃗_{<β} denote the initial segment of f⃗ of length β.
Extendability: If 0 < α < λ, then g⃗ ∈ f_α(i) iff there is some h⃗ ∈ f_λ(i) such that g⃗ = h⃗_{<α} (i.e., higher-order worlds are extensions of lower-order worlds and every lower-order world has at least one higher-order extension).
In addition, since we intend κ-worlds to represent the knowledge of the players, we impose two additional conditions:
Correctness: For each agent i ∈ I, f⃗_{<λ} ∈ f_λ(i) (i.e., every agent must consider the actual state of the world possible).
Introspection: For all i ∈ I, if ⟨g_0, g_1, …⟩ ∈ f_κ(i), then g_λ(i) = f_λ(i), for all λ with 0 < λ < κ (i.e., players cannot consider states possible that differ in their description of their own lower-order beliefs). ◁
• Finally, for κ a limit ordinal, a κ-world is a vector of functions ⟨f_i | i < κ⟩ such that for every λ < κ the vector ⟨f_i | i < λ⟩ is a λ-world.
We denote the set of all κ-worlds over S by F_κ(S).

¹⁰ For the comparison with epistemic logic, it is useful to think of the set of states S as the set of propositional valuations on a set At of atomic propositions. In this case f_0 would be a propositional valuation function.

The intended interpretation is that f_κ(i) ⊆ F_κ(S) is the set of all κ-worlds player i considers possible. Then, κ-worlds f⃗ are descriptions of the state of affairs and the players' higher-order knowledge (up to level κ). Thus, we can interpret the basic epistemic language at κ-worlds. For simplicity, we assume that each atomic proposition E corresponds to a subset of the set of states S. This language is interpreted as follows:

f⃗ ⊨ E ⇔ f_0 ∈ E
f⃗ ⊨ ¬ϕ ⇔ f⃗ ⊭ ϕ
f⃗ ⊨ ϕ ∧ ψ ⇔ f⃗ ⊨ ϕ and f⃗ ⊨ ψ
f⃗ ⊨ K_i ϕ ⇔ for each g⃗ ∈ f_l(i): g⃗ ⊨ ϕ

where l is the quantifier depth¹¹ of ϕ.

There is an alternative way of defining truth of the knowledge modality by defining an accessibility relation on F_κ(S), which transforms F_κ(S) into a Kripke model. We can then use the standard definition of a modal operator. For a κ-world f⃗ = ⟨f_0, f_1, …⟩, let f⃗_i = ⟨f_1(i), f_2(i), …⟩ (note that the state of nature is not part of f⃗_i) and define a relation ∼_i on F_κ(S) as follows: f⃗ ∼_i g⃗ iff f⃗_i = g⃗_i (equality is defined component-wise).
If f⃗ ∼_i g⃗ then we say f⃗ and g⃗ are equivalent according to agent i. It is easy to see that these relations are equivalence relations. They turn F_κ(S) into a Kripke structure (with At = ℘(S) and the valuation function V defined by w ∈ V(E) iff w ∈ E). Fagin et al. (1991, Theorem 2.4) show that the interpretation of the epistemic language given above coincides with the interpretation of the epistemic language obtained by interpreting ⟨F_κ(S), {∼_i}_{i∈I}, V⟩ as a Kripke structure. So, there are two equivalent ways to interpret the basic epistemic language on the set F_κ(S) of κ-worlds. In the remainder of the paper, we will use whichever definition is most convenient.

We are interested in general maps between Kripke structures and knowledge structures. To this end, we fix a set of atomic propositions At and assume that the state space S is the set of propositional valuations of At, i.e., identifying a valuation with its characteristic function, S = ℘(At). To simplify our exposition, we identify p ∈ At with {e ∈ S | p ∈ e} ⊆ S, i.e., the set of valuations containing p. The key observation is that every Kripke structure can be naturally associated with a substructure of ⟨F_ω(S), {∼_i}_{i∈I}, V⟩. The mapping is defined as follows.¹²

¹¹ Quantifier depth is defined as usual by induction on the structure of ϕ ∈ L_EL. Formally, qd(p) = 0, qd(¬ϕ) = qd(ϕ), qd(ϕ ∧ ψ) = max(qd(ϕ), qd(ψ)), and qd(K_i ϕ) = 1 + qd(ϕ).
¹² The mapping is a functional simulation but in general not a bisimulation onto its image. Nonetheless, it is a natural mapping in the sense that when applied to connected components K of ⟨F_ω(S), {∼_i}_{i∈I}, V⟩ it is simply the embedding of K into ⟨F_ω(S), {∼_i}_{i∈I}, V⟩.

Definition 2.7 (Embedding from Kripke structures to knowledge structures). Let M = ⟨W, {R_i}_{i∈I}, V⟩ be a Kripke structure. We associate with each state w ∈ W in M an ω-world f⃗_{M,w} = ⟨f_0^w, f_1^w, f_2^w, …⟩ where the f_α^w are defined by synchronous induction on all worlds w ∈ W:
• f_0^w = {p | w ∈ V(p)}.
• To define the function f_{k+1}^w, assume inductively that f_0^x, f_1^x, f_2^x, …, f_k^x have been defined for all worlds x ∈ W (k a natural number). Then f_{k+1}^w(i) = {⟨f_0^x, f_1^x, …, f_k^x⟩ | w R_i x}.
Define the map r : W → F_ω(℘(At)) as r(w) = f⃗_{M,w}. For every infinite ordinal λ we can continue the construction transfinitely to get a vector ⟨f_i^x | i < λ⟩. Thus this map naturally generalizes to maps r_λ : W → F_λ(℘(At)) for every ordinal λ.

To simplify notation, assume for the rest of the paper that S = ℘(At) and that S is finite. The map r_κ gives a precise way to connect the class of all Kripke structures to a single structure M_κ = ⟨F_κ(S), {∼_i}_{i∈I}, V⟩ for any κ. The following observation is immediate from the relevant definitions.

Observation 1. Let M = ⟨W, {R_i}_{i∈I}, V⟩ be a Kripke structure and M_κ be the structure ⟨F_κ(S), {∼_i}_{i∈I}, V⟩.
1. The relation w Z f⃗ iff r_κ(w) = f⃗ is a functional simulation from M into M_κ, but, in general, is not a bisimulation.
2. There is an ordinal λ, depending on M, such that Z is a bisimulation if κ ≥ λ.¹³
3. In particular, if M is finite, then there is a bisimulation between M and r(M) = ⟨r[W], {∼_i}, V⟩. Moreover, r(M) is the minimal bisimulation contraction of M, i.e., the Kripke model of minimal cardinality that allows for a total bisimulation to M.
4. Let ϕ ∈ L_EL with quantifier depth k and assume that κ > k. Then M, w ⊨ ϕ iff M_κ, r_κ(w) ⊨ ϕ.

¹³ In fact, for M = F_κ(S) we have λ = κ. Moreover, the model M_κ is a terminal object in the category of Kripke models over At with total simulations as morphisms, though F_κ(S) and F_λ(S) are not bisimilar for κ ≠ λ.

Proof. i) The functionality of Z is obvious, since r_κ is a function. Atomic harmony holds by definition of f_0^w. To see that zig holds let v_0, v_1 ∈ M with v_0 R_i v_1 and w ∈ M_κ with v_0 Z w.
Since Z is functional we have w = f⃗_{M,v_0}. An induction shows that f_i^{v_0} = f_i^{v_1} for every i ≤ κ, thus f⃗_{M,v_0}(i) = f⃗_{M,v_1}(i). Thus by definition of ∼_i we have f⃗_{M,v_0} ∼_i f⃗_{M,v_1}. By definition of Z we also have v_1 Z f⃗_{M,v_1}, thus zig holds. Example 3.10 of (Fagin et al. 1999) shows that Z is in general not a bisimulation.
ii) Choose λ_0 such that for all v, w ∈ M the following holds: if there is some µ such that r_µ(v) ≠ r_µ(w), then r_{λ_0}(v) ≠ r_{λ_0}(w); and let λ := λ_0 + ω. We have to show that zag holds: Let v Z w with Z defined as above and let w ∼_i w′. We have to show that there is some v′ ∈ M with r_λ(v′) = w′. Indeed, since w ∼_i w′ we have for all µ < λ that w′_µ ∈ w_µ(i). By the construction of r_λ this implies that for every µ < λ there is some v′ ∈ M such that w′_µ = r_µ(v′). By the choice of λ_0 and the extendability condition, we have that ∃µ ∈ [λ_0; λ] : r_µ(v′) ∈ w_µ(i) implies ∀µ ∈ [λ_0; λ] : r_µ(v′) ∈ w_µ(i). In particular we have by the limit condition that r_λ(v′) = w′ as desired. See chapter 3 of (Fagin et al. 1999) for more details.
iii) Obvious from ii) and the definition of r_ω.
iv) Straightforward induction over the quantifier depth of ϕ.

3 Information dynamics on knowledge structures

Our aim is to examine natural transitions between types in a knowledge structure. These transitions are intended to represent some type of reasoning process or information update. For this initial study, we focus on the operation of product update (restricted to equivalence relations as in Definition 2.4).

3.1 Technical preliminaries: generalized product update

Our first contribution is to define a sequence of products ×_{N_n} between Kripke structures. The idea of applying product update between Kripke structures (rather than Kripke structures and event models) was initially proposed by Jan van Eijck and colleagues (van Eijck et al. 2010). We follow the same basic idea, although our approach differs in a technical, but crucial, way.
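For reference, the standard product update of Definition 2.4, which this section generalizes, can be sketched for finite models as follows. This is only an illustrative sketch: the function name `product_update`, the dictionary-based encoding of models, and the representation of preconditions as Python predicates on states (standing in for "M, w satisfies pre(e)") are assumptions made here, not part of the paper.

```python
from itertools import product

def product_update(worlds, rel, val, events, erel, pre):
    """Product update of a finite Kripke model with a finite event model.

    rel[i] and erel[i] are sets of ordered pairs for agent i, val[p] is the
    set of states where atom p holds, and pre[e] is a Python predicate on
    states standing in for "M, w satisfies pre(e)" (an assumption of this sketch).
    """
    # Keep only the pairs (w, e) where the precondition of e holds at w.
    new_worlds = {(w, e) for w, e in product(worlds, events) if pre[e](w)}
    # Two pairs are related iff both components are related.
    new_rel = {
        i: {(x, y) for x in new_worlds for y in new_worlds
            if (x[0], y[0]) in rel[i] and (x[1], y[1]) in erel[i]}
        for i in rel
    }
    # Atoms are inherited from the state component.
    new_val = {p: {x for x in new_worlds if x[0] in ext} for p, ext in val.items()}
    return new_worlds, new_rel, new_val

# Hypothetical example: agent 1 cannot tell w1 (p) from w2 (not p);
# the single event "a" has precondition p.
worlds = {"w1", "w2"}
rel = {1: {(w, v) for w in worlds for v in worlds}}
val = {"p": {"w1"}}
events = {"a"}
erel = {1: {("a", "a")}}
pre = {"a": lambda w: w in val["p"]}

updated_worlds, updated_rel, updated_val = product_update(
    worlds, rel, val, events, erel, pre)
```

In this hypothetical example only the pair (w1, a) survives, so after the update agent 1 knows p.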
In order to generalize the product update operation so that it applies between two Kripke structures, we must replace the precondition function with something appropriate for merging two Kripke structures. Our approach is to explicitly mark which of the formulas we are interested in, and treat these formulas as atomic propositions.¹⁴ Fix a set I of players and At of atomic propositions (for simplicity assume both are finite).

Definition 3.1 (Language extension).
1. Let T ⊆ L_EL with At ⊆ T. For every ϕ ∈ T we introduce a new constant ϕ̌ called the name of ϕ. Let Ť := {ϕ̌ | ϕ ∈ T}. The language extension with T, denoted by L^T_EL, is the epistemic language with the same set of agents as L_EL and Ť as atomic propositions. By a slight abuse of notation we write p instead of p̌ for p ∈ At ⊆ T. We denote the valuation function over the language L^T_EL by V_T. As usual, we omit the subscript when it is clear from the context.
2. Let M = ⟨W, {R_i}_{i∈I}, V⟩ be a Kripke model with atomic propositions At and let T ⊆ L_EL with At ⊆ T. Then M can naturally be interpreted as a Kripke model over L^T_EL by defining V_T as: w ∈ V_T(ϕ̌) iff M, w ⊨ ϕ. We denote M viewed over L^T_EL by M^T. ◁

In ⊕-updates every state v in the event model comes with a (generally complex) formula ϕ that is the precondition for v to occur. That is, (w, v) is only defined if M, w ⊨ pre(v). This is exactly the idea of the ×_T update defined below: pairs of states are in the new model only if they agree on the formulas in T.

Definition 3.2. i) Let T ⊆ L_EL with At ⊆ T. Let M = ⟨W, {R_i}_{i∈I}, V⟩ and M′ = ⟨W′, {R′_i}_{i∈I}, V′⟩ be two Kripke models over L^T_EL. The product model M × M′ = ⟨W″, {R″_i}_{i∈I}, V″⟩ over L^T_EL is defined as follows:
• W″ = {(w, w′) | w ∈ W, w′ ∈ W′ and for all ϕ̌ ∈ Ť: w ∈ V^M_Ť(ϕ̌) iff w′ ∈ V^{M′}_Ť(ϕ̌)};
• (w, w′) R″_i (v, v′) iff w R_i v and w′ R′_i v′; and
• (w, w′) ∈ V″_Ť(ϕ̌) iff w ∈ V^M_Ť(ϕ̌) (and thus also w′ ∈ V^{M′}_Ť(ϕ̌)).
ii) The generalized product update of M and M′ over T, denoted by M ×_T M′, is the model M × M′ as defined above, interpreted as a model over L_EL (that is, removing all atoms ϕ̌ with ϕ ∈ T \ At and identifying p̌ with p for all p ∈ At).

We write M ×_T M′ where M and M′ are Kripke models over L_EL, meaning that we interpret M and M′ as being models over Ť and do the ×_T-update as defined above. The procedure that we follow to compute this product runs as follows:
1. Pick a set T of statements to keep track of,
2. Build the product in L^T_EL, and
3. Remove the additional information, i.e., restrict the valuation function from Ť to At.

¹⁴ In general, this type of language extension can be used to model agents with limited memory. For instance, this is needed for an analysis of situations such as the sum and product riddle involving the dialogues: A: I don't know ϕ. B: I knew you didn't know before you said that (cf. Sietsma and van Eijck, for an analysis of this puzzle in Public Announcement Logic).

The following example demonstrates this procedure.

Example 1. Let T = {p, K_1 p, K_2 p, K_1 ¬p, K_2 ¬p}. Then the product of the two models is calculated as follows.

[Figure: M (worlds w_1 ⊨ p, w_2 ⊨ ¬p) ×_T M′ (worlds v_1 ⊨ p, v_2 ⊨ ¬p, v_3 ⊨ p, v_4 ⊨ ¬p) = M ×_T M′ (worlds (w_1; v_1) ⊨ p, (w_2; v_4) ⊨ ¬p); edges are labelled with agents 1 and 2.]

Note that the reflexive and transitive arrows are not drawn in the above picture for simplicity. The set T is rich enough to uniquely describe all knowledge assignments of level at most one. Thus, the product reflects a merging of models taking into account the agents' first-order information. The fragments of T true at the individual worlds are:

M, w_1 ⊨ {p, K_1 p}
M, w_2 ⊨ {K_1 ¬p}
M′, v_1 ⊨ {p, K_1 p}
M′, v_2 ⊨ ∅
M′, v_3 ⊨ {p}
M′, v_4 ⊨ {K_1 ¬p}

The only pairs satisfying the same fragment of T are (w_1, v_1) and (w_2, v_4). Observe that in the model M ×_T M′ we have:

M ×_T M′, (w_1; v_1) ⊨ {p, K_1 p, K_2 p}

which is different from {p, K_1 p}, the fragment of T satisfied by M, w_1.
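The computation in Example 1 can be replayed mechanically. The following is a minimal sketch, not the authors' implementation: the helper names (`holds`, `from_partitions`, `fragment`, `gen_product`) and the tuple encoding of formulas are hypothetical, and the information partitions of M and M′ are an assumption, reconstructed here so that they reproduce the fragments of T listed in the example.

```python
def holds(model, w, f):
    """Evaluate formula f at world w; formulas are nested tuples."""
    worlds, rel, val = model
    if f[0] == "atom":                      # ("atom", name)
        return w in val[f[1]]
    if f[0] == "not":                       # ("not", subformula)
        return not holds(model, w, f[1])
    if f[0] == "K":                         # ("K", agent, subformula)
        return all(holds(model, v, f[2]) for v in worlds if (w, v) in rel[f[1]])
    raise ValueError(f"unknown connective: {f[0]}")

def from_partitions(worlds, partitions, val):
    """Build a model whose relations are the equivalence relations of the partitions."""
    rel = {i: {(w, v) for cell in cells for w in cell for v in cell}
           for i, cells in partitions.items()}
    return set(worlds), rel, val

def fragment(model, w, T):
    """The set of formulas of T true at w."""
    return frozenset(f for f in T if holds(model, w, f))

def gen_product(m1, m2, T):
    """Keep pairs agreeing on T, relate them componentwise, keep only the atoms."""
    (W1, R1, V1), (W2, R2, V2) = m1, m2
    W = {(w, v) for w in W1 for v in W2
         if fragment(m1, w, T) == fragment(m2, v, T)}
    R = {i: {(x, y) for x in W for y in W
             if (x[0], y[0]) in R1[i] and (x[1], y[1]) in R2[i]} for i in R1}
    V = {p: {x for x in W if x[0] in V1[p]} for p in V1}
    return W, R, V

p = ("atom", "p")
T = [p, ("K", 1, p), ("K", 2, p), ("K", 1, ("not", p)), ("K", 2, ("not", p))]

# Assumed partitions, chosen to match the fragments stated in Example 1.
M = from_partitions({"w1", "w2"},
                    {1: [{"w1"}, {"w2"}], 2: [{"w1", "w2"}]},
                    {"p": {"w1"}})
M_prime = from_partitions({"v1", "v2", "v3", "v4"},
                          {1: [{"v1"}, {"v2", "v3"}, {"v4"}],
                           2: [{"v1", "v2"}, {"v3", "v4"}]},
                          {"p": {"v1", "v3"}})

prod = gen_product(M, M_prime, T)
```

Under these assumptions the product contains exactly the pairs (w1, v1) and (w2, v4), and the fragment of T at (w1, v1) gains K_2 p, in line with the observation that the product does not preserve the original fragments.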
In general, taking a generalized product update consists of two steps: The first is picking a set of statements T ⊇ At that one wants to keep track of and extending the language to L^T_EL. The second is to do the generalized product update ×_T, that is, the normal product × over L^T_EL followed by omitting all the information about the valuation of Ť \ At, i.e., making the newly created model an L_EL model again. The above example shows that the ×_T product does not preserve higher-order information.

Remark 1. There are epistemic models K, v and L, w over L_EL, a finite fragment T ⊂ L_EL and some ϕ ∈ T \ At such that (v, w) ∈ K × L (the product over L^T_EL) and K × L, (v, w) ⊨ ϕ̌, but K × L, (v, w) ⊭ ϕ (where in the last formula ϕ is evaluated as a formula of L_EL).

There is a close connection between generalized product update and the ⊕-update. In both cases, the result is not the complete cartesian product between the two state spaces, but a subset that is characterized by a certain set of formulae. The precise connection between the two concepts is clarified by the following lemma.

Lemma 1. For every event model ℰ there is some fragment T ⊆ L_EL and a Kripke model M′ (for the language L^T_EL) such that – ⊕ ℰ is the same as – ×_T M′ (i.e., for all Kripke models M, M ⊕ ℰ is isomorphic to M ×_T M′).

Proof. Let ℰ = ⟨E, {Q_i}_{i∈I}, pre⟩ be an event model. Let T be the set {pre(e) | e ∈ E} ∪ At. Construct the model M′ = ⟨W′, {R′_i}, V′⟩ as follows: Let W′ be the set of pairs (e, L_e) where e ∈ E and L_e ⊆ T is a maximally consistent subset of T containing pre(e). The relations R′_i are defined as (e, L_e) R′_i (e′, L_{e′}) iff e Q_i e′, and the valuation V′ is defined by L_e (i.e., (e, L_e) ∈ V′(ϕ̌) provided ϕ ∈ L_e). It is easy to check that this M′ has the desired properties.

Corollary 1.
If there is an upper bound for the quantifier depths of the preconditions in the event model ℰ (i.e., the set {qd(pre(e)) | e ∈ E} has an upper bound) then the set T in the above lemma can be chosen finite. This holds in particular if ℰ is finite.

Proof. Let n − 1 be an upper bound for the quantifier depths of {pre(e) | e ∈ E}. Recall that F_n(℘(At)) is finite, and so there are characteristic formulae ϕ_t of maximal quantifier depth n − 1 for every t ∈ F_n(℘(At)) (that is, F_n(℘(At)), s ⊨ ϕ_t ⇔ s = t). Let T := {ϕ_e | e ∈ F_n(℘(At))} ∪ At and construct the model M′ as follows: W′ := {(e, t) | e ∈ E, t ∈ F_n(℘(At)) and F_n(℘(At)), t ⊨ pre(e)}, let (e, t) R′_i (e′, s) if e Q_i e′, and define V′ as: (e, t) ∈ V′(ϕ̌) iff F_n(℘(At)), t ⊨ ϕ.

The sets S = {ϕ_t | t ∈ F_n(℘(At))} chosen above are special in that these sets reflect all possible knowledge assignments up to depth n − 1. We denote the resulting set of formulas by N_n (i.e., N_n = {ϕ_t | t ∈ F_n(℘(At))} ∪ At).

Remark 2. i) In the above proof, we can turn M′ into an event model ℰ′ by letting pre(e, t) = ϕ_t. In this case we have M ×_{N_n} M′ = M ⊕ ℰ′ for all M. In particular the model ℰ′ is a special event model that only has preconditions from N_n. This follows a general pattern: The initial strength of arbitrary event models is that they allow for a very intuitive description of events in a multi-agent setting. However, from a technical point of view arbitrary event models can be difficult to handle. Therefore it sometimes proves useful to translate arbitrary event models into a certain subclass of event models which are easier to work with. For instance, van Ditmarsch et al. (2008) defined a class of canonical event models that are useful for studying when two event models are equivalent.
ii) The translation of an event model into a Kripke model blurs the distinction between static descriptions of situations and descriptions of events.

There is an interesting peculiarity of the ×_T-products.
Obviously, ×_T is commutative, but the following example shows that it is not associative.¹⁵

Example 2. As in Example 1 let T = {p, K_1 p, K_2 p, K_1 ¬p, K_2 ¬p}. Consider the following L_EL-models, which we interpret as L^T_EL-models.

[Figure: M_1 (worlds w_1 ⊨ p, w_2 ⊨ ¬p) ×_T M_2 (worlds v_1 ⊨ p, v_2 ⊨ ¬p, v_3 ⊨ p, v_4 ⊨ ¬p) ×_T M_3 (worlds u_1 ⊨ p, u_2 ⊨ ¬p); edges are labelled with agents 1 and 2.]

We now show that (M_1 ×_T M_2) ×_T M_3 ≠ M_1 ×_T (M_2 ×_T M_3). As we already noted in the previous example (Example 1), M_1 ×_T M_2 = M_3. In particular, (M_1 ×_T M_2) ×_T M_3 = M_3 ×_T M_3 = M_3, where the last equivalence holds since u_1 and u_2 satisfy different formulas from T. On the other hand, note that the following formulas from T are true at states in M_3:

M_3, u_1 ⊨ {p, K_1 p, K_2 p}
M_3, u_2 ⊨ {K_1 ¬p, K_2 ¬p}

However, there are no states in M_2 satisfying precisely these subsets of T, so M_2 ×_T M_3 = ∅ and consequently M_1 ×_T (M_2 ×_T M_3) = ∅. Thus, we have (M_1 ×_T M_2) ×_T M_3 ≠ M_1 ×_T (M_2 ×_T M_3).¹⁶

¹⁵ In general, it is clear that the process of consecutive learning is not commutative. One's actions in some event B can depend on having learned A before. In our formalization, the non-associativity captures this intuition: (A ×_S B) ×_S C is to be read as being in situation A and learning B, then C, whereas A ×_S (B ×_S C) = A ×_S (C ×_S B) corresponds to learning B and C at the same time. A similar phenomenon has been noticed in the belief merging literature (Maynard-Reid II and Shoham 1998, Section 5.1).

The interpretation of this statement is that first learning E and then learning E′ is different from learning E and E′ at the same time. To be more precise, we have

(E ×_T F) ×_T G ≠ E ×_T (F ×_T G) ≠ E ×_T F ×_T G¹⁷

This non-associativity shows that our framework is rich enough to distinguish between consecutive learning and receiving all information at once. These observations should be contrasted with the theory developed in (van Eijck et al. 2010).
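The failure of associativity in Example 2 can also be checked by direct computation. The sketch below is self-contained and illustrative only: the helper names (`holds`, `make_model`, `fragment`, `times`) are hypothetical, and the partitions of M_1 and M_2 are an assumption, reconstructed to match the fragments of T stated in Examples 1 and 2. M_3 is taken to consist of two worlds that both agents can tell apart.

```python
def holds(model, w, f):
    """Evaluate a formula (nested tuples) at world w of a finite model."""
    worlds, rel, val = model
    if f[0] == "atom":
        return w in val[f[1]]
    if f[0] == "not":
        return not holds(model, w, f[1])
    # Remaining case: ("K", agent, subformula)
    return all(holds(model, v, f[2]) for v in worlds if (w, v) in rel[f[1]])

def make_model(worlds, partitions, val):
    """Relations are the equivalence relations induced by the partitions."""
    rel = {i: {(w, v) for cell in cells for w in cell for v in cell}
           for i, cells in partitions.items()}
    return set(worlds), rel, val

def fragment(model, w, T):
    return frozenset(f for f in T if holds(model, w, f))

def times(m1, m2, T):
    """Generalized product: pairs with equal T-fragments, atoms only."""
    (W1, R1, V1), (W2, R2, V2) = m1, m2
    W = {(w, v) for w in W1 for v in W2
         if fragment(m1, w, T) == fragment(m2, v, T)}
    R = {i: {(x, y) for x in W for y in W
             if (x[0], y[0]) in R1[i] and (x[1], y[1]) in R2[i]} for i in R1}
    V = {q: {x for x in W if x[0] in V1[q]} for q in V1}
    return W, R, V

p = ("atom", "p")
T = [p, ("K", 1, p), ("K", 2, p), ("K", 1, ("not", p)), ("K", 2, ("not", p))]

# Assumed structures, consistent with the fragments given in the examples.
M1 = make_model({"w1", "w2"}, {1: [{"w1"}, {"w2"}], 2: [{"w1", "w2"}]},
                {"p": {"w1"}})
M2 = make_model({"v1", "v2", "v3", "v4"},
                {1: [{"v1"}, {"v2", "v3"}, {"v4"}],
                 2: [{"v1", "v2"}, {"v3", "v4"}]},
                {"p": {"v1", "v3"}})
M3 = make_model({"u1", "u2"}, {1: [{"u1"}, {"u2"}], 2: [{"u1"}, {"u2"}]},
                {"p": {"u1"}})

left = times(times(M1, M2, T), M3, T)    # (M1 x M2) x M3
right = times(M1, times(M2, M3, T), T)   # M1 x (M2 x M3)
```

With these assumed models, `left` has two worlds while `right` is empty, which is the asymmetry the example describes.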
There, the authors are concerned with updates where all preconditions are boolean combinations of the ground variables (describing non-epistemic facts about the state of the world). Learning facts about the world is associative (cf. van Eijck et al. 2010, Theorem 1), whereas learning facts about the players' previous knowledge is not! Van Eijck et al. study the monoid generated by ×_At products.

Our primary goal in this paper is to understand how the ⊕-update works in type spaces. To that end, we first generalize a result from (van Ditmarsch and French 2009).

Theorem 1. Let M_1 be a Kripke structure such that for any two distinct v, w ∈ M_1 there is an epistemic formula ϕ distinguishing v and w (i.e., M_1, w ⊨ ϕ and M_1, v ⊨ ¬ϕ). Let M_2 be an arbitrary Kripke model. Then there is a set of formulas T and an L^T_EL-Kripke structure M′ such that M_1 ×_T M′ ↔ M_2 if and only if there is a total simulation from M_2 to M_1. Furthermore, if the model M_1 is finite the set T can be chosen finite.

Proof. The direction from left to right is easy: Let M′ and T be such that M_1 ×_T M′ = M_2. It is easy to see that the map M_1 ×_T M′ → M_1 sending every pair (w, w′) to w is a functional, hence total, simulation.
For the direction from right to left: Let Z be a total simulation from M_2 to M_1. First we define a Kripke model M◦ = ⟨W^{M◦}, {R_i^{M◦}}_{i∈I}, V^{M◦}⟩:
• W^{M◦} = {(t_1, t_2) | t_i ∈ M_i, i = 1, 2 and t_1 Z t_2}
• (t_1, t_2) R_i^{M◦} (s_1, s_2) iff t_1 R_i^{M_1} s_1 and t_2 R_i^{M_2} s_2
• (t_1, t_2) ∈ V^{M◦}(p) iff t_2 ∈ V^{M_2}(p) (and thus also t_1 ∈ V^{M_1}(p))

¹⁶ There are examples where both (M_1 ×_T M_2) ×_T M_3 and M_1 ×_T (M_2 ×_T M_3) are non-empty; however, they are more complicated while making the same point.
¹⁷ Here E ×_T F ×_T G is the obvious generalization of ×_T where all tuples (e, f) in the definition are replaced by triples (e, f, g).

First we show that the model M◦ is bisimilar to M_2.
We show that the projection map π_2 mapping every (t_1, t_2) ∈ M◦ to t_2 ∈ M_2 is a bitotal bisimulation (recall Definition 2.5). The atom condition is clear. For forth assume that (t_1, t_2) π_2 t_2 and that (t_1, t_2) R_i^{M◦} (s_1, s_2). By the definition of R_i^{M◦} we have t_2 R_i^{M_2} s_2 and by definition of π_2 we have (s_1, s_2) π_2 s_2, thus forth is fulfilled.
Similarly, for back assume that (t_1, t_2) π_2 t_2 and that t_2 R_i^{M_2} s_2. Since Z is a total simulation and t_1 Z t_2 holds by the construction of M◦, there is some s_1 ∈ M_1 with s_1 Z s_2 and t_1 R_i^{M_1} s_1. But this means that (s_1, s_2) ∈ M◦ and that (t_1, t_2) R_i^{M◦} (s_1, s_2), thus proving the back condition.
Since M_2 ↔ M◦, it suffices to show that there is some M′ with M_1 ×_T M′ = M◦. Note that the projection π_1 : M◦ → M_1 sending each pair (t_1, t_2) to t_1 is a functional left simulation. The atom condition is clear, and the rest can be shown with arguments similar to the ones given above.
Now, pick a set T* ⊆ L_EL that contains a distinguishing formula for any two worlds v, w ∈ M_1 and let T := T* ∪ At. Turn M◦ into an L^T_EL-model M′ by defining: (t_1, t_2) ∈ V_T(ϕ̌) iff M_1, t_1 ⊨ ϕ. Since T* contains enough distinguishing formulae, s_1 ∈ M_1 and (t_1, t_2) ∈ M◦ satisfy the same Ť-formulae iff s_1 = t_1. Therefore M_1 ×_T M′ = M◦ as desired. Furthermore, if M_1 is finite, then the set T* can be chosen finite, thus proving the last statement.

Remark 3. Van Ditmarsch and French (2009) contains a proof for a similar statement about ⊕-updates in the finite case. However, the generalization to infinite Kripke models does not hold for the ⊕-update.

Remark 4. Note that the model M′ constructed in the right-to-left direction of the proof of Lemma 1 is in general not an L_EL model that is simply interpreted as an L^T_EL model. That is: there is in general some ϕ ∈ T and some w ∈ M′ such that M′, w ⊨ ϕ̌ but M′, w ⊭ ϕ (where ϕ̌ is an atom of L^T_EL and ϕ is a formula evaluated in M′ interpreted as a Kripke model over At (i.e.
only containing atoms from {p̌ | p ∈ At})). That is: to gain the expressive power of updating with an arbitrary event model, one needs the class of all L^T_EL-models. Interestingly enough, this is no longer true when we restrict ourselves to the class of finite Kripke structures. There, the full expressive power of the class of all ⊕-updates is already given by the class of all finite Kripke models over L_EL together with the set of all ×_{N_n} products for n ∈ ω. More formally, we have the following fact (whose straightforward, but tedious, proof we leave out).

Fact 3.1. Let K = ⟨W, (R_i)_i, V⟩ be a finite Kripke model without bisimilar points and let L = ⟨W′, (R′_i)_i, V′⟩ be a finite Kripke model such that L is obtainable from K by a product update. Then, there is some T ⊇ At and some Kripke model M over L_EL such that K ×_T M = L.

3.2 Characterization result

As discussed in the previous section, every ⊕-update can be written as a ×_T-update over a language in which the formulas in T are treated as atomic propositions. This will help us represent the product update in knowledge structures. First, we need an equivalent to the extension of atomic propositions on types: For n ∈ ℕ let S_n denote the set of all possible n-worlds (thus S_n = F_n(S) and S_0 = S). Technically, this is redundant, though it helps conceptually to distinguish F_n(S) as a type space generated by S from S_n, which is the same type space reinterpreted as a new set of atoms. By switching between those interpretations, every (n + k)-world over S can be seen as a k-world over S_n and thus there is a canonical embedding F_ω(S) ↪ F_ω(S_n).¹⁸

For any two Kripke models K, v and L, w we have defined the product update (K × L, (v, w)) over the ground language L_EL above. Furthermore, we have seen that there is some κ such that r_κ is a bisimulation of K onto its image. Since r_κ(v) is obviously in the image r_κ[K], this implies that parts of K are somehow coded in r_κ(v).
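To make concrete how a state's type codes part of its model, the embedding of Definition 2.7 can be computed level by level for a finite model. The sketch below is illustrative only: the function name `types_up_to` and the encoding of a world as a nested tuple (with f_0 a frozenset of atoms and each later level a tuple of frozensets, one per agent in sorted order) are assumptions made here.

```python
def types_up_to(worlds, rel, val, depth):
    """Return {w: (f_0^w, ..., f_depth^w)} for every state w of a finite model.

    rel[i] is a set of ordered pairs for agent i and val[p] is the set of
    states where atom p holds. Level k + 1 stores, for each agent in sorted
    order, the set of level-k vectors of the states the agent considers possible.
    """
    agents = sorted(rel)
    # f_0^w: the atoms true at w
    vec = {w: (frozenset(q for q in val if w in val[q]),) for w in worlds}
    for _ in range(depth):
        # The right-hand side reads the previous level for all states at once,
        # mirroring the synchronous induction of the definition.
        vec = {w: vec[w] + (tuple(frozenset(vec[x] for x in worlds
                                            if (w, x) in rel[i])
                                  for i in agents),)
               for w in worlds}
    return vec

# Hypothetical model: a and b are duplicates of each other, c differs on p.
worlds = {"a", "b", "c"}
rel = {1: {(w, v) for w in worlds for v in worlds}}
val = {"p": {"a", "b"}}

types = types_up_to(worlds, rel, val, depth=3)
```

In this hypothetical model the duplicate states a and b receive the same vector while c receives a different one, in line with Observation 1, where the image of a finite model under r is its bisimulation contraction.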
The idea of the following definition is that we can unravel enough information about $K$ and $L$ from $r_\kappa(v)$ and $r_\kappa(w)$ to determine $r_\kappa((v, w))$. We define a product $\times_0$ below and we will show later (Lemma 2) that $r_\kappa((v, w)) = r_\kappa(v) \times_0 r_\kappa(w)$. As with the original definition of a $\kappa$-world (see 2.6), the definition is by induction.

Definition 3.3. Suppose that $n \in \mathbb{N}$ and $\vec f, \vec g \in F_\omega(S)$. Then the $\times_0$-product $(\vec f \times_0 \vec g) \in F_\omega(S) \cup \{\emptyset\}$ is defined as follows:
• $(\vec f \times_0 \vec g)_0 = \langle f_0 \rangle$ if $f_0 = g_0$, and $\emptyset$ otherwise.
• $(\vec f \times_0 \vec g)_m(i) = \{(\vec f' \times_0 \vec g')_{m-1} \mid \vec f' \in f_m(i),\ \vec g' \in g_m(i)\}$.

This definition can be lifted to an analogue of the generalized product update: the operator $\times_n$ will correspond to a product update with $\top = N_n$. First observe that the above definition of $\times_0$ works equally well if all $S$ are replaced by $S_n$. As in the case of the generalized product update, the $\times_n$ update implicitly consists of two steps: first a product update between two elements of $F_\kappa(S_n)$, followed by a removal of information, i.e. a projection from $S_n$ to $S$. As with general product updates, the definition contracts these two steps into one:

Definition 3.4. Let $\bar\pi : S_n \to S$ be the projection map sending the tuple $\langle f_0, \ldots, f_{n-1} \rangle$ to $f_0$. Define $\times_n : F_\omega(S_n) \times F_\omega(S_n) \to F_\omega(S)$ as follows:
• $(\vec f \times_n \vec g)_0 = \langle s_0 \rangle$ if $\bar\pi(f_0) = \bar\pi(g_0) = s_0$, and $\emptyset$ otherwise.
• $(\vec f \times_n \vec g)_m(i) = \{(\vec f' \times_n \vec g')_{m-1} \mid \vec f' \in f_m(i),\ \vec g' \in g_m(i)\}$.

[18] Note that this map is not surjective for $n \geq 1$: for instance, the introspection conditions of $F_{k+1}(S)$ give some limitations on which elements of $F_2(S_k)$ can come from $F_{k+1}(S)$.

The following lemma describes the relationship between the $\times_{N_n}$-product and the $\times_n$-product. Basically, the $\times_{N_n}$-product of two Kripke models $(K, w)$ and $(L, v)$ carries the same information as the $\times_n$-product on the types $r(v)$ and $r(w)$.
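The recursive clause shared by Definitions 3.3 and 3.4 can be illustrated on finite-depth objects. The following is only a sketch under strong simplifying assumptions: a type is encoded as a nested pair (state, alternatives per agent) rather than as the sequence of levels used in the text, both arguments are assumed to mention the same agents, empty products are assumed to be dropped from the resulting sets, and the names `make` and `times0` are hypothetical.

```python
from itertools import product as cartesian


def make(state, **alts):
    """Build a hashable finite-depth type: a state of nature plus, for each
    agent, the set of alternatives that agent considers possible.
    (Hypothetical encoding, not the level-sequence encoding of the text.)"""
    return (state, tuple(sorted((agent, frozenset(ws)) for agent, ws in alts.items())))


def times0(f, g):
    """Recursive product in the spirit of Definition 3.3.
    None plays the role of the empty product."""
    (f0, f_alts), (g0, g_alts) = f, g
    if f0 != g0:
        # Base clause: the product exists only if the states of nature agree.
        return None
    g_map = dict(g_alts)
    out = []
    for agent, f_set in f_alts:
        # Recursive clause: pair every alternative of f with every alternative
        # of g for the same agent, keeping only the products that exist.
        pairs = (times0(a, b) for a, b in cartesian(f_set, g_map.get(agent, frozenset())))
        out.append((agent, frozenset(p for p in pairs if p is not None)))
    return (f0, tuple(out))


# Agent "a" is unsure whether the state is p or q; the second argument only
# admits p, so the product removes the q-alternative.
f = make("p", a=[make("p"), make("q")])
g = make("p", a=[make("p")])
updated = times0(f, g)
```

In this toy encoding the product can only shrink an agent's set of first-level alternatives, which matches the intuition that an event cannot add uncertainty about the state of nature.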
For technical convenience we need a definition before we state the lemma. Recall that $N_n \setminus At$ was chosen to be a special set of characteristic formulae for $F_n(S)$. Since the map $r_n$ preserves the validity of all formulae in $N_n$ (see Observation 1 iv)), every state $w$ in a Kripke structure $K$ over $L_{EL}$ satisfies exactly one formula of $N_n \setminus At$. By the definition of $\times_\top$ we thus get, for any Kripke model $K$ over $L_{EL}$ and $L$ over $L_{EL}^{N_n}$, that $(v, w) \in K \times_{N_n} L$ implies that there is exactly one $\check\varphi \in N_n \setminus At$ with $w \in V(\check\varphi)$. Thus, we can restrict our study to Kripke models $L$ satisfying this property. We call Kripke models over $L_{EL}^{N_n}$ satisfying this property admissible. Furthermore, since every $e \in F_n(S)$ satisfies exactly one formula from $N_n \setminus At$, every world in an admissible Kripke model over $L_{EL}^{\top}$ corresponds to exactly one $e \in F_n(S)$. We can define a map $r'$ from admissible Kripke models to $F_\omega(S_n)$ in the same way as we defined $r$.

Lemma 2. Let $n \in \omega$ and let $K, L$ be Kripke models over $L_{EL}^{N_n}$. Let $v \in K$, $w \in L$ satisfy the same $N_n$-formulae. Let $(v, w) \in K \times_{N_n} L$ denote the product of $v$ and $w$ in $K \times_{N_n} L$. Then we have $r((v, w)) = r'(v) \times_n r'(w)$, i.e., the following diagram commutes:

$$\begin{array}{ccc} K,\ L & \xrightarrow{\ \times_{N_n}\ } & K \times_{N_n} L \\ {\scriptstyle r',\, r'}\ \downarrow & & \downarrow\ {\scriptstyle r} \\ F_\omega(S_n),\ F_\omega(S_n) & \xrightarrow{\ \times_n\ } & F_\omega(S) \end{array}$$

Proof. Let $n \in \mathbb{N}$ and let $v \in K$, $w \in L$ satisfy the same $N_n$-formulae. We show inductively that $(r'(v) \times_n r'(w))_k = r((v, w))_k$. For $k = 0$ this is trivial: if $v$ and $w$ satisfy the same atomic propositions over $\check N_n$, we have $(r'(v) \times_n r'(w))_0 = r((v, w))_0 = \{p \in At : v \in V^K(p)\}$. If they satisfy different atomic propositions, we have $(v, w) \notin K \times_{N_n} L$ and $r'(v) \times_n r'(w) = \emptyset$.

Now assume the statement holds for $k - 1$ and let $i \in I$ (the set of agents). First, we show $r((v, w))_k(i) \subseteq (r'(v) \times_n r'(w))_k(i)$. Let $x \in r((v, w))_k(i)$; thus $x$ is a $(k-1)$-world. By construction of the map $r$ there is some $\tilde x$ in $K \times_{N_n} L$ such that $\tilde x\,R_i\,(v, w)$ and $r(\tilde x)_{k-1} = x$.
Thus there are $x_1 \in K$ and $x_2 \in L$ such that the product of $x_1$ and $x_2$ in $K \times_{N_n} L$ is $\tilde x$; in particular $x_1\,R_i\,v$ and $x_2\,R_i\,w$, and $x_1$ and $x_2$ satisfy the same $N_n$-formulae. In particular, $r'(x_1) \times_n r'(x_2) \neq \emptyset$ and by induction we have that $(r((x_1, x_2)))_{k-1} = (r'(x_1) \times_n r'(x_2))_{k-1}$. On the other hand, we have $r'(x_1)_{k-1} \in r'(v)_k(i)$, and similarly for $x_2$ and $w$, by the construction of $r'$. In particular, we have $x = (r'(x_1) \times_n r'(x_2))_{k-1} \in (r'(v) \times_n r'(w))_k(i)$ as desired, thus proving the first direction.

The argument for the reverse inclusion $r((v, w))_k(i) \supseteq (r'(v) \times_n r'(w))_k(i)$ is similar. Let $x \in (r'(v) \times_n r'(w))_k(i)$. Then there are $\tilde x_1 \in r'(v)$ and $\tilde x_2 \in r'(w)$ such that $(r'(\tilde x_1) \times_n r'(\tilde x_2))_{k-1} = x$ and such that there are $x_1 \in K$, $x_2 \in L$ with $r'(x_j) = \tilde x_j$ for $j = 1, 2$, and with $x_1\,R_i\,v$ and $x_2\,R_i\,w$. Since $\tilde x_1 \times_n \tilde x_2$ exists, $x_1$ and $x_2$ satisfy the same $N_n$-formulae. In particular there is some $(x_1, x_2)$ in $K \times_{N_n} L$ with $(x_1, x_2)\,R_i\,(v, w)$. By construction of $r$ we have $r((x_1, x_2))_{k-1} \in r((v, w))_k$ and by induction we have $r((x_1, x_2))_{k-1} = x$, thus proving the reverse direction.

Note that the calculation of $\vec f \times_n \vec g$ from types $\vec f$ and $\vec g$ is computationally efficient: in order to calculate the $k$-th level of $\vec f \times_n \vec g$, only the first $n + k$ levels of $\vec f$ and $\vec g$ are required.

The above definition of $\times_n$ updates gives a way of modeling dynamics on a type space, thus opening up the field of epistemic game theory to belief dynamics. Event models were designed as a very intuitive and natural tool for representing epistemic events in a multiagent setting. The translation of event models into the corresponding pair of Kripke models and a product relation $\times_{N_n}$, and further into a type and a relation $\times_n$, allows us to calculate the change of epistemic status brought about by an event model $E$. On the other hand, every product update with a finite event model can be written as a $\times_n$-update; thus it suffices to understand the structure of $\times_n$ to study product updates.
Thus, $F_\omega(\wp(S))$ is not only a universal Kripke model in the static sense; together with the products $\times_n$ it is also universal in that it incorporates all potential updates.

On Kripke structures, translating event models into types allows us to study updating events as separate entities without any reference to a ground type. Furthermore, the translation blurs the distinction between types as static descriptions of epistemic states and knowledge changing events. One natural and important question is: given two types $\vec f$ and $\vec g$, is there a possible piece of incoming information that transforms $\vec f$ into $\vec g$?

The intuition behind the answer given by the following theorem is: in the entire model, the agents are assumed to be omniscient and non-forgetting. Thus, an event cannot add any uncertainty about the state of nature; it can only remove some states from the sets of possible states. In contrast, for the higher-order information, essentially anything is possible as long as it is compatible with individuals gaining new information about the state of nature. In particular, an epistemic event may increase the uncertainty about other agents' types. This idea is captured by the following definition.

Definition 3.5. For a type $\vec f \in F_\alpha(S)$ we say that a type $\vec g$ is admissible for $\vec f$ iff
• $f_0 = g_0$;
• for all agents $i$: $g_1(i) \subseteq f_1(i)$; and
• for $\alpha > 1$: if $\vec h \in g_\alpha(i)$ then there is a $\vec h' \in f_\alpha(i)$ such that $\vec h$ is admissible for $\vec h'$.

Our characterization theorem is similar to Theorem 1.

Theorem 2. Let $\vec f, \vec g \in F_\alpha(S)$ be types such that $\vec g$ is obtainable by an update from $\vec f$, i.e., there is some $n$ and some $\vec h \in F_\alpha(S_n)$ such that $\vec f \times_n \vec h = \vec g$. Then $\vec g$ is admissible for $\vec f$. If the submodel of $F_\omega(S)$ generated by $\vec f$ is finite, the converse also holds.

Before we can prove this theorem, we recall the following result from infinite combinatorics.

Theorem 3 (König's Lemma). Let $T$ be an infinite, finitely branching tree. Then $T$ has an infinite branch.
Proof. Construct an infinite branch $\langle x_0, x_1, \ldots \rangle$ as follows: $x_0$ is the root. For $i \geq 0$: if $x_0, \ldots, x_i$ are already in the branch, pick a successor $x_{i+1}$ of $x_i$ such that $\{y \mid y > x_{i+1}\}$ is infinite (since the tree is finitely branching, such a successor always exists). Then $\langle x_0, x_1, \ldots \rangle$ is an infinite branch.

Proof of Theorem 2. The first statement is straightforward: let $\mathcal F$ and $\mathcal G$ be the epistemic submodels of $F_\omega(S)$ induced by $\vec f$ and $\vec g$, respectively. Assume that there is some $\vec h \in F_\omega(S_n)$ such that $\vec f \times_n \vec h = \vec g$. By Lemma 2, this is equivalent to saying that $\mathcal F \times_{N_n} \mathcal H = \mathcal G$, where $\mathcal F$, $\mathcal G$, $\mathcal H$ are the Kripke models (over $L_{EL}^{N_n}$) generated from $\vec f$, $\vec g$, and $\vec h$. By Theorem 1 there is a total simulation $S$ from $\mathcal G$ to $\mathcal F$. We show inductively that every $\vec g' \in \mathcal G$ is admissible for every $\vec f' \in \mathcal F$ with $\vec f'\,S\,\vec g'$. The 0th level is clear by the definition of a simulation. Now it suffices to show that the definition of admissibility is fulfilled at the 1st level: since we do this for all $\vec g' \in \mathcal G$, the rest follows from the inductive definition of admissibility and the map $r$. To see that admissibility is fulfilled at the 1st level, let $\vec h \in \mathcal G$ with $\vec g' \sim_i \vec h$. By definition, there is a $\vec h' \in \mathcal F$ with $\vec f' \sim_i \vec h'$. Thus, every state of nature that is conceivable for agent $i$ in $\mathcal G$ via $\vec h$ is also conceivable in $\mathcal F$ via $\vec h'$; this is exactly the definition of being admissible at the first level.

For the second statement, let $\vec g$ be admissible for $\vec f$ and let the submodel of $F_\omega(\wp(S))$ generated by $\vec f$ be finite. Again, let $\mathcal F$ and $\mathcal G$ be the Kripke submodels of $F_\omega(S)$ induced by $\vec f$ and $\vec g$. Define the relation $Z$ between $\mathcal F$ and $\mathcal G$ by $\vec f'\,Z\,\vec g'$ iff $\vec g' \in \mathcal G$ is admissible for $\vec f' \in \mathcal F$. We will show that $Z$ is a total simulation from $\mathcal G$ to $\mathcal F$, thus showing that $\mathcal G$ is obtainable from $\mathcal F$ via update (again using Theorem 1 and Lemma 2). From the first clause in the definition of admissibility it follows that $Z$ satisfies atomic harmony.
Furthermore, $\vec g$ is admissible for $\vec f$ by assumption, thus $Z$ is not empty. To see that $Z$ satisfies the zig condition, we have to show that whenever $\vec g' \in \mathcal G$ is admissible for $\vec f' \in \mathcal F$ and $\vec g' \sim_i \tilde{\vec g}$, then there is some $\tilde{\vec f}$ with $\vec f' \sim_i \tilde{\vec f}$ such that $\tilde{\vec g}$ is admissible for $\tilde{\vec f}$. Once we have that, it is easy to see that $Z$ is also total, since for every $\vec g'$ in $\mathcal G$ there is a chain $\vec g \sim_{i_1} \vec g_1 \sim_{i_2} \ldots \sim_{i_n} \vec g'$ connecting $\vec g$ with $\vec g'$.

Thus, to show (zig), let $\vec g' \in \mathcal G$ be admissible for $\vec f' \in \mathcal F$ and $\vec g' \sim_i \tilde{\vec g}$. We construct an $\omega$-tree $(T, \prec)$ as follows. Since $\vec g' \sim_i \tilde{\vec g}$, we have $\tilde g_{<k} \in g'_k(i)$ for every $k$. Since $\vec g'$ is admissible for $\vec f'$, for every $k$ there is some $h \in f'_k(i)$ such that $\tilde g_{<k}$ is admissible for $h$. Let the $k$th level of the tree $T$ consist of exactly those $h \in f'_k(i)$ that $\tilde g_{<k}$ is admissible for, and let the tree order $\prec$ be the initial-segment preorder. Since $\vec g'$ is admissible for $\vec f'$, every level of $T$ is non-empty. On the other hand, since the state of nature is considered finite, every level of $T$ is finite, thus $T$ is finitely branching. Thus, by König's lemma, $T$ has an infinite path $P$. By construction, $\tilde{\vec f} = \bigcup_{\vec r \in P} \vec r$ is a type, $\tilde{\vec f} \sim_i \vec f'$, and $\tilde{\vec g}$ is admissible for $\tilde{\vec f}$. Since we assumed that the submodel $\mathcal F$ is the substructure of $F_\omega(S)$ induced by $\vec f$ (and thus by $\vec f'$), we have $\tilde{\vec f} \in \mathcal F$; thus the simulation $Z$ relates $\tilde{\vec g}$ to $\tilde{\vec f}$.

Again, there is an obvious counterpart of Remark 4 allowing us to update with $F(S)$-worlds rather than $F(S_n)$-worlds, provided all the induced Kripke structures involved are finite. To be precise, we can show the following: let $\vec f, \vec g \in F_\omega(S)$ be such that the epistemic submodels of $F_\omega(S)$ induced by $\vec f$ and $\vec g$ are finite. Then $\vec g$ is admissible for $\vec f$ if and only if there is some natural number $n$ and some $\vec h \in F_\omega(S)$ such that $\vec f \times_n \vec h = \vec g$.

4 Conclusion and future work

Many different formal models have been used to describe the players' knowledge and beliefs in game-theoretic situations.
The variety of models reflects different mathematical conventions used by the various sub-communities, as well as competing intuitions about how best to describe the players' beliefs and reasoning in a game situation. It is important to understand the precise relationship between the alternative modeling paradigms. In this paper, we focused on the two most prominent models found in the literature on the epistemic foundations of game theory: Kripke or Aumann structures and knowledge structures (non-probabilistic variants of Harsanyi type spaces).

There are two main contributions in this paper. The first is to initiate a study of "information dynamics" for knowledge structures in the style of recent work on dynamic epistemic logic (cf. van Benthem 2011). Such a theory would further illustrate the subtle relationship between type spaces and Kripke structures (updating the discussion initiated in Fagin et al. 1991; 1999). In particular, it allows us to combine the strengths of both approaches and use event models as a tool to describe epistemic events.

The main technical contribution is the definition of a product operation $\times_n$ on the type space $F_\omega(S)$. We provide a procedure that allows us to translate arbitrary event models into types. Furthermore, we show that the $\times_n$ product is powerful enough to simulate all updates by event models. We prove a characterization theorem (Theorem 2) showing when a type can be transformed into another type by updates with an event model.

This is only an initial study. We see our work here as opening up many different avenues of future research. In particular, we plan to investigate the following issues in the future.
• What happens if we allow only updating types from a certain subclass of $F_\alpha(S_n)$ (for example, finite epistemic models $\langle F_\alpha(S_n), \{\sim_i\}_{i \in I}, V \rangle$)?
• What are the "behavioral" implications of our main characterization theorem (Theorem 2)?
For example, if a strategy is rational for a type $\vec f$ in a game $G$, does that strategy remain rational for all types that are admissible for $\vec f$?
• How do we extend the ideas developed in this paper to Harsanyi type spaces, where the beliefs are represented by probability measures? The first step is to generalize the dynamic epistemic logic framework to settings where beliefs are represented by probabilities. Fortunately, this has largely been done (see van Benthem et al. 2009, Aceto et al. 2011, for details). A very interesting direction for future research is to explore how to use the probabilistic event models and product update operation of van Benthem et al. (2009) to prove a result analogous to our main characterization theorem (Theorem 2) for Harsanyi type spaces.
• The relation "obtainable by an update" together with our extended theorem (see Remark 4) turns the set of finite induced submodels of $F_\omega(S)$ into an algebra. Can we characterize this algebra?

Acknowledgements

This research was funded by the NWO Vidi project 016.094.345. The authors would like to thank the anonymous referees for their extremely thorough comments, which greatly improved this paper.

References

L. Aceto, W. van der Hoek, A. Ingólfsdóttir, and J. Sack. Sigma algebras in probabilistic epistemic dynamics. In Proceedings of the Thirteenth Conference on Theoretical Aspects of Rationality and Knowledge, pages 191–199. ACM, 2011.
R. Aumann. Interactive epistemology I: Knowledge. International Journal of Game Theory, 28:263–300, 1999.
J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
J. van Benthem, J. Gerbrandy, and B. Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.
J. van Benthem, E. Pacuit, and O. Roy. Toward a theory of play: A logical perspective on games and interaction. Games, 2(1):52–86, 2011.
G. Bonanno and P.
Battigalli. Recent results on belief, knowledge and the epistemic foundations of game theory. Research in Economics, 53(2):149–225, 1999.
A. Brandenburger. Epistemic game theory: Complete information. In S. N. Durlauf and L. E. Blume, editors, The New Palgrave Dictionary of Economics. Palgrave Macmillan, 2008.
A. Brandenburger. Origins of epistemic game theory. In V. F. Hendricks and O. Roy, editors, Epistemic Logic: Five Questions, pages 59–69. Automatic Press, 2010.
A. Brandenburger and A. Friedenberg. Self-admissible sets. Journal of Economic Theory, 145:785–811, 2010.
H. van Ditmarsch and T. French. Simulation and information: Quantifying over epistemic events. In J.-J. Meyer and J. Broersen, editors, Knowledge Representation for Agents and Multi-Agent Systems, volume 5605 of Lecture Notes in Computer Science, pages 51–65. Springer Berlin / Heidelberg, 2009.
H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Synthese Library. Springer, 2007.
H. van Ditmarsch, J. Ruan, and R. Verbrugge. Sum and product in dynamic epistemic logic. Journal of Logic and Computation, 18:563–588, 2008.
J. van Eijck, Y. Wang, and F. Sietsma. Composing models. In Proceedings of LOFT 2010, 2010.
R. Fagin. A quantitative analysis of modal logic. Journal of Symbolic Logic, 59(1):209–252, 1994.
R. Fagin and M. Vardi. An internal semantics for modal logic: Preliminary report. In Proc. 17th ACM SIGACT Symposium on Theory of Computing, pages 305–315, 1985.
R. Fagin, J. Halpern, and M. Vardi. A model-theoretic analysis of knowledge. Journal of the ACM, 38(2):382–428, 1991.
R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. The MIT Press, 1995.
R. Fagin, J. Geanakoplos, J. Halpern, and M. Vardi. The hierarchical approach to modeling knowledge and common knowledge. International Journal of Game Theory, 28(3):331–365, 1999.
J. Harsanyi. Games with incomplete information played by 'Bayesian' players.
Management Science, 14:159–182, 320–334, 486–502, 1967.
P. Maynard-Reid II and Y. Shoham. From belief revision to belief fusion. In Proceedings of LOFT-98, 1998.
R. Myerson. Harsanyi's games with incomplete information. Management Science, 50(12):1818–1824, 2004.
A. Perea. Epistemic Game Theory: Reasoning and Choice. Cambridge University Press, 2012.
F. Sietsma and J. van Eijck. Action emulation between canonical models. Preprint.
M. Siniscalchi. Epistemic game theory: Beliefs and types. In S. Durlauf and L. Blume, editors, The New Palgrave Dictionary of Economics. Palgrave Macmillan, Basingstoke, 2008.
W. van der Hoek and M. Pauly. Modal logic for games and information. In P. Blackburn, J. van Benthem, and F. Wolter, editors, Handbook of Modal Logic, volume 3 of Studies in Logic, pages 1077–1148. Elsevier, 2006.

Dependent Type Semantics: An Introduction

Daisuke Bekki
Ochanomizu University, Graduate School of Humanities and Sciences
National Institute of Informatics
CREST, Japan Science and Technology Agency
Abstract

This paper introduces dependent type semantics, a framework for natural language semantics based on dependent type theory. The main features of dependent type semantics are as follows. 1) It is dynamic: it analyzes E-type/donkey anaphora with well-formed representations. 2) It is proof-theoretic: deductions between representations are available without recourse to models. 3) It is compositional: semantic representations of sentences are derived from lexicalized representations by a fixed number of combinatory rules. Lastly, 4) it explains accessibility: the accessibility or inaccessibility of anaphoric antecedents depends on the structural differences between proofs.

1 Introduction

1.1 Between dynamism and compositionality

Since 1980, the enterprise of dynamic semantics has pursued an alternative framework to Montagovian semantics, one which compensates for the gap between the syntactic structures and the semantic representations of natural language sentences involving dynamic binding. The difficulty of this pursuit implies that there is tension between dynamism and compositionality, which have not yet been unified in a coherent semantic theory that accounts for both aspects. This tension has been the driving force behind dynamic semantics, and in fact some theories have achieved partial success in unifying the two aspects.

At this point, I should clarify what I mean by dynamism and compositionality. Dynamic semantics explores various empirical data concerning dynamic binding, whose nature is exemplified by the two paradigms of donkey sentences in (1) by Geach (1962) and E-type anaphora in (2) by Evans (1980).

(1) a. Every farmer who owns [a donkey]_i beats it_i.
    b. If [a farmer]_i owns [a donkey]_j, he_i beats it_j.
(2) [A man]_i entered. He_i whistled.

As discussed elsewhere, (1a), for example, is problematic in terms of compositionality.
A compositional semantic theory is one that provides a way to calculate the semantic representation of any target sentence from the semantic representations of its parts. The structural analogue of (1a) (and (1b)), which allows us to give a straightforward compositional analysis, is (3). However, it is not an appropriate semantic representation for (1a), since the variable y occurs free outside the scope of ∃y.

(3) ∀x(Farmer(x) ∧ ∃y(Donkey(y) ∧ Own(x, y)) → Beat(x, y))

In the same way, the structural analogue of (2) is (4), which is not an appropriate representation for (2), since the variable x in Whistle(x) is not bound by ∃x.

(4) ∃x(Man(x) ∧ Enter(x)) ∧ Whistle(x)

Thus we have the following criterion for a successful theory of dynamic semantics.

Criterion #1: It gives sentences involving dynamic binding well-formed semantic representations.

On the other hand, semantic representations have to correctly represent their truth conditions, or the entailment relations in which they are involved. For example, (1a) may participate in the following syllogism.

Example 1 (Donkey Syllogism).
  Every farmer who owns [a donkey]_i beats it_i.
  John is a farmer. Bill is a donkey. John owns Bill.
  ------------------------------------------------
  John beats Bill.

Furthermore, (2) may participate in the following syllogisms.

Example 2 (E-type Syllogisms).
  [A man]_i entered. He_i whistled.        [A man]_i entered. He_i whistled.
  ---------------------------------        ---------------------------------
  A man entered.                           A man whistled.

Although these examples are far from exhaustive, they constitute central paradigms which can be used to check that a given analysis is not immediately falsified. Thus, a successful semantic theory should include a method for calculating such relations in one way or another.

Criterion #2: It calculates the entailment relation between sentences involving dynamic binding.

The first-order representations for sentences (1) and (2) necessary in order to correctly calculate their entailment relations are (5) and (6), respectively.
(5) ∀x(Farmer(x) → ∀y(Donkey(y) ∧ Own(x, y) → Beat(x, y)))
(6) ∃x(Man(x) ∧ Enter(x) ∧ Whistle(x))

These correctly represent the information that sentences (1) and (2) contain, in the sense that any proof system for first-order predicate logic will prove that the inferences in Example 1 and Example 2 are valid.

On the other hand, the structural similarity to the original sentences is lost in (5) and (6), so their direct decomposition does not lead to the respective lexicalized representations. In this way, there is always tension between dynamism and compositionality.

Criterion #3: It is compositional.

Moreover, one important empirical paradigm to be explained by dynamic semantics is that of accessibility constraints that restrict the access of anaphora to its antecedent, an issue reported by Karttunen (1976) in the mid-1970s and given a first generalization in DRT (Kamp 1981, Kamp and Reyle 1993). Dynamic binding is licensed in some configurations but not in others, as exemplified in the following sentences.

(7) Everybody bought [a car]_i. *It_i stinks.
(8) If John bought [a car]_i, it_i must be a Porsche. *It_i stinks.
(9) John didn't buy [a car]_i. *It_i stinks.
(10) John bought [a car]_i or didn't buy anything. *It_i stinks.

Dynamic semantics should thus correctly predict such accessibility constraints and should be able to explain why they appear as they do and where they come from.

Criterion #4: It predicts and explains accessibility constraints.

Therefore, a successful theory of dynamic semantics should satisfy all of criteria #1, #2, #3 and #4.
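The claim that representations (5) and (6) validate the inferences of Example 1 and Example 2 can be checked mechanically. The following Lean 4 sketch is one way to do so; the domain E, the predicate names and the constants john and bill are hypothetical stand-ins for the symbols of the text, and nothing here is part of the framework proposed in this paper.

```lean
-- Donkey Syllogism (Example 1) from representation (5).
example {E : Type} (Farmer Donkey : E → Prop) (Own Beat : E → E → Prop)
    (john bill : E)
    (h : ∀ x, Farmer x → ∀ y, Donkey y ∧ Own x y → Beat x y)
    (hf : Farmer john) (hd : Donkey bill) (ho : Own john bill) :
    Beat john bill :=
  h john hf bill ⟨hd, ho⟩

-- E-type Syllogisms (Example 2) from representation (6):
-- both "A man entered" and "A man whistled" follow.
example {E : Type} (Man Enter Whistle : E → Prop)
    (h : ∃ x, Man x ∧ Enter x ∧ Whistle x) :
    (∃ x, Man x ∧ Enter x) ∧ (∃ x, Man x ∧ Whistle x) :=
  match h with
  | ⟨x, hm, he, hw⟩ => ⟨⟨x, hm, he⟩, ⟨x, hm, hw⟩⟩
```

Both proofs are short precisely because (5) and (6) place every variable inside the scope of its quantifier, which is what the structural analogues (3) and (4) fail to do.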
1.2 "Type Theoretical" approaches

Sundholm (1986) noticed fairly early that Martin-Löf (or Constructive) Type Theory (Martin-Löf 1975; 1984) provides semantic representations for donkey sentences whose structures are parallel to their syntactic structures (Criterion #1 is satisfied) in a different way from DRT (Kamp 1981, Kamp and Reyle 1993), DPL (Groenendijk and Stokhof 1991), and their successors. Subsequently, the following three approaches have been proposed to obtain Sundholmian representations:

1. Ahn and Kolb (1990) provide a set of translation rules from Discourse Representation Structures (DRS) to Sundholmian representations.
2. Dávila-Pérez (1994) presented a reformulation of Montague Grammar (Montague 1973) in terms of Martin-Löf's Higher Type Theory, following the line of (Ranta 1991).
3. Ranta (1994) proposed a generative theory of grammar based on Martin-Löf's Lower Type Theory, known as Type-Theoretical Grammar (TTG).

In all these approaches, representations are incorporated into the proof system of Martin-Löf Type Theory (Criterion #2 is satisfied) and accessibility conditions are explained in a non-ad-hoc manner (Criterion #4 is satisfied: see (Fox 1994) among others), which I discuss in detail in Section 3.

Thus, type theoretical approaches are highly promising for pursuing a unified theory which satisfies all four criteria and serves as a proof-theoretic alternative to dynamic semantics. However, as matters stand, they face several problems with respect to compositionality (Criterion #3 is not satisfied). For example, TTG's derivations of the semantic representations for the sentences in (11) are given as (12).[1]

[1] In TTG, the sorts set and prop are synonyms of type. Man is a type by itself, while Enter(x) is a type depending on x.

(11) a. A man entered.
     b. Every man entered.

(12)
$$\dfrac{\mathrm{Man} : \mathrm{set} \qquad \begin{array}{c} [x : \mathrm{Man}]^{(1)} \\ \vdots \\ \mathrm{Enter}(x) : \mathrm{set} \end{array}}{(\Sigma x{:}\mathrm{Man})\,\mathrm{Enter}(x) : \mathrm{set}}\ (\Sigma I)^{(1)} \qquad\qquad \dfrac{\mathrm{Man} : \mathrm{set} \qquad \begin{array}{c} [x : \mathrm{Man}]^{(1)} \\ \vdots \\ \mathrm{Enter}(x) : \mathrm{set} \end{array}}{(\Pi x{:}\mathrm{Man})\,\mathrm{Enter}(x) : \mathrm{set}}\ (\Pi I)^{(1)}$$

The proof diagrams in (12) use the same set of axioms, and the only difference between them is the rule used in the last step. The derivation yields (11a) if the (ΣI) rule is used, while it yields (11b) if the (ΠI) rule is used instead.

This point is puzzling from the perspective of formal semantics, making us wonder whether it counts as a compositional analysis. At the very least, TTG is not compositional in the sense of Montagovian semantics, since it provides no lexicalized representations for "A" and "Every", and as a result it provides no way to compose the representations in (12) from the representations of the words therein.

On the other hand, the TTG analysis can be regarded as compositional from the perspective of analytic philosophy. That is, TTG analyzes the concepts that the sentences in (11) express, rather than the sentences in (11) themselves, and how such concepts can be composed from more primitive concepts such as Man : set and Enter(x) : set, with a fixed number of rules. In other words, TTG is a theory about the compositionality of concepts, not of sentences.

Therefore, we have to be careful about the status of TTG as a grammar. TTG is not about the relation between sentences and meanings, which Montagovian theories are about, but is rather a theory of meaning which has its own way of verification.

1.3 Toward a dynamic, proof-theoretic and compositional theory of semantics

The aim of this paper is to provide a new framework of formal semantics that satisfies all four criteria of subsection 1.1 in a single and unified setting. The new framework, dependent type semantics, is formulated based on dependent type theory, which originates in Martin-Löf Type Theory and the λ-cube (Barendregt 1992).
Although dependent type semantics is closely related to the three type theoretical approaches in subsection 1.2, the analysis of presupposition and anaphora in dependent type semantics is vastly different in that it employs a mechanism of context passing, which is common in continuation semantics (cf. de Groote 2006).

This paper is organized as follows. In Section 2, I present the basic concepts of dependent type semantics after informally introducing the notion of dependent types (the formal presentation of which is given in Bekki 2013). The basic concepts include the notion of truth based on the paradigm of the Curry-Howard correspondence, the analysis of anaphora based on the relation between propositions and proofs (Criterion #1), the proof theory based on dependent type theory that calculates inferences involving dynamic binding (Criterion #2), and the lexicalized semantic representations and a number of rules to combine these elements of the theory (Criterion #3). In Section 3, accessibility is uniformly explained in terms of dependent type semantics, without assuming any ad-hoc mechanism (Criterion #4). Similarly to the way that instances of inaccessibility are reduced to negation in DPL/HoDL/QDL, they are reduced to the Π-operator in dependent type semantics. In Section 4, in order to make the formulation of dependent type semantics more rigorous, I discuss the polymorphic nature of contexts and reformulate dynamic propositions in a polymorphic manner.

2 Dependent Type Semantics

2.1 What are Dependent Types?

The notion of dependent types originates from Martin-Löf Type Theory (Martin-Löf 1975; 1984), which was proposed as a foundation of constructive mathematics, and the Calculus of Constructions (CoC, Coquand and Huet 1988), which was proposed as a foundation of functional programming and mathematical proofs.
Lately, fragments of Martin-Löf Type Theory and CoC have been integrated into a general theory of the λ-cube (Barendregt 1992) and Pure Type Systems (PTS) (Berardi 1990, Barendregt 1991) with other type theories, such as Girard's System F (Girard et al. 1989). Furthermore, CoC has been extended by the notion of inductive definition of types in Martin-Löf Type Theory, leading to the Calculus of Inductive Constructions, which is known as an underlying language of the proof assistants Coq (Bertot and Castéran 2004) and Agda (Nordström et al. 1990, Bove and Dybjer 2008).

For those familiar with typed lambda calculi and the Curry-Howard correspondence between logic and type theory, the difference between dependent type theory and simply-typed lambda calculi can be summarized as follows.[2]

• Types may depend on terms, i.e., there is a function from terms to types.
• Types that depend on term variables can be quantified (by Π as a universal quantifier and Σ as an existential quantifier).
• (Πx:A)B is a generalized form of the type A → B, and (Σx:A)B is a generalized form of the type A × B (thus, the quantifiers Π and Σ may appear within types).

[2] The formal presentation of the specific theory that we adopt in this paper (a fragment of CoC extended with Σ-types) is given in (Bekki 2013).

2.2 Intra-sentential semantics: Truth as inhabitance

In dependent type semantics, dependent types are applied to natural language semantics in the following way. We start by considering the truth conditions of the simple sentence in (13).

(13) John is a man.

A proper name, such as "John", is considered to denote an entity. In dependent type semantics, this means that a semantic representation of "John" is a term j of type Entity. Assume that John is in fact a man; then there must be John's state of being a man, in the sense of Davidson (1967) and Parsons (1990), which serves as a proof for sentence (13). Such a proof can be represented as a term of type Man(j),[3] which is dependent on j.
Since it is obvious that someone's state of being a man does not serve as a proof that someone else is a man, the collection of (someone's) states of being a man is divided into subcollections of states of being a man in accordance with the bearers of the states. This is the reason why we should consider that the type for the states of being a man depends on the terms that represent the bearer of the states.

By the Curry-Howard correspondence ("terms = proofs"), such a proof of (13) is identified with a term of type Man(j). Again, by the Curry-Howard correspondence ("types = propositions"), the type Man(j) is also regarded as a proposition, which is the semantic representation of sentence (13) (tentatively, until the next section) in dependent type semantics.

According to the proof-theoretic paradigm, sentence (13) is true if and only if there exists a proof for the proposition that the sentence denotes, namely, the type Man(j). This condition can be rephrased as saying that the type Man(j) is inhabited, that is, there exists a term of type Man(j). Equivalently, sentence (13) is true if and only if there is a term that represents John's state of being a man.

Sentence (14) is another simple example.

(14) John ran.

Assume that John actually ran; then there must be a "running" event whose agent is John, again in the sense of Davidson (1967) and Parsons (1990). Such an event is represented as a term of type Ran(j), which is also dependent on j. Therefore, proofs for sentence (14) have the type Ran(j), which is also regarded as a proposition. The same argument as above applies here as well, and the type Ran(j) is (tentatively) a semantic representation of sentence (14), which is true if and only if the type Ran(j) is inhabited.

[3] The type of Man(j) itself is type, which plays the role of type t in Montagovian semantics, so the type of the one-place predicate Ran is Entity → type.
In dependent type semantics, the semantic representation of a declarative sentence is always of the sort type. Thus, the arguments for the simple sentences (13) and (14) generalize as in Definition 2.1.

Definition 2.1 (Truth Condition in Dependent Type Semantics). For any declarative sentence S whose semantic representation S′ is of the sort type, the truth condition of S is stated as follows:⁴

S is true ⇐⇒ S′ is inhabited.

Now, let us consider the truth condition of the existentially quantified sentence (15) in this setting.

(15) A man entered.

A qualified proof that “A man entered” (i.e., “there exists a man who entered”) consists of at least 1) an entity, 2) a proof that the entity is a man (that is, a state of the entity’s being a man), and 3) a proof that the entity entered (that is, an entering event whose agent is the entity). For example, suppose that j and b are John and Bill, m₁ and m₂ are John’s and Bill’s respective states of being a man, and e₁ and e₂ are John’s and Bill’s respective entering events, namely:

(16) j : Entity, m₁ : Man(j), e₁ : Enter(j)
     b : Entity, m₂ : Man(b), e₂ : Enter(b)

Then the tuples ((j, m₁), e₁) and ((b, m₂), e₂) are two possible proofs of “A man entered”. Thus, the truth condition of (15) is that there exists a tuple consisting of an x such that x : Entity, a proof of x’s being a man, and a proof of x’s entering. The type of such proofs is (tentatively) (17), and so is the semantic representation of (15) in dependent type semantics.⁵

⁴ This claim is rather controversial from a philosophical viewpoint; it is related to the anti-realism of (Dummett 1975; 1976; 1991) and (Prawitz 1980). See also (Dávila-Pérez 1996), which argues for a proof-theoretic semantics of type-theoretical grammar.
⁵ The tuples (j, (m₁, e₁)) and (b, (m₂, e₂)) are as much proofs of sentence (15) as ((j, m₁), e₁) and ((b, m₂), e₂), and thus their type (Σx:Entity)(Man(x) ∧ Enter(x)) also qualifies as a semantic representation of sentence (15). Although this looks simpler than (17), I do not adopt it, because it lacks compositionality.

(17) (Σu:(Σx:Entity)Man(x))Enter(π₁(u))

That the representation (17) is in fact of the sort type is proved as follows.⁶

(18) Derivation (each step lists its rule; (1) and (2) mark discharged assumptions):
  a. Entity : type   (AX)
  b. Man : Entity → type   (AX)
  c. x : Entity   (assumption 1)
  d. Man(x) : type   (ΠE, from b and c)
  e. (Σx:Entity)Man(x) : type   (ΣF, from a and d, discharging 1)
  f. Enter : Entity → type   (AX)
  g. u : (Σx:Entity)Man(x)   (assumption 2)
  h. π₁(u) : Entity   (ΣE, from g)
  i. Enter(π₁(u)) : type   (ΠE, from f and h)
  j. (Σu:(Σx:Entity)Man(x))Enter(π₁(u)) : type   (ΣF, from e and i, discharging 2)

Let us proceed to the truth condition of the universally quantified sentence (19).

(19) Every man entered.

A qualified proof of sentence (19) is a function that, for any pair of 1) an entity and 2) a proof that the entity is a man, returns a proof that the entity entered. The type of this function is not of the simple form Entity × Man → Enter, since the type Man depends on the element of Entity, and so does the type Enter. The type of the function, and thus the semantic representation of (19), is (tentatively) (20).

(20) (Πu:(Σx:Entity)Man(x))Enter(π₁(u))

The representation in (20) is of the sort type, which is proved as in (21) (almost isomorphic to (18)).

(21) Derivation:
  a. Entity : type   (AX)
  b. Man : Entity → type   (AX)
  c. x : Entity   (assumption 1)
  d. Man(x) : type   (ΠE, from b and c)
  e. (Σx:Entity)Man(x) : type   (ΣF, from a and d, discharging 1)
  f. Enter : Entity → type   (AX)
  g. u : (Σx:Entity)Man(x)   (assumption 2)
  h. π₁(u) : Entity   (ΣE, from g)
  i. Enter(π₁(u)) : type   (ΠE, from f and h)
  j. (Πu:(Σx:Entity)Man(x))Enter(π₁(u)) : type   (ΠF, from e and i, discharging 2)

The notion introduced in this section, truth as inhabitance (by a proof), together with the bridging perspective of states and events as proofs, is the core of truth conditions within dependent type semantics.
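The two representations (17) and (20) can be written down directly in a proof assistant, where the well-formedness derivations (18) and (21) are carried out by the type checker. The following is a minimal Lean 4 sketch under stated assumptions: Entity, Man and Enter are hypothetical parameters, Lean’s Type stands in for the sort type, and the names AManEntered and EveryManEntered are introduced here for illustration only.

```lean
namespace QuantifierSketch

variable (Entity : Type) (Man Enter : Entity → Type)

-- (17) "A man entered": a Σ-type whose second component depends on π₁ of the first.
abbrev AManEntered : Type :=
  (u : (x : Entity) × Man x) × Enter u.1

-- (20) "Every man entered": the corresponding Π-type.
abbrev EveryManEntered : Type :=
  (u : (x : Entity) × Man x) → Enter u.1

-- With j, m₁, e₁ as in (16), the tuple ((j, m₁), e₁) inhabits (17).
example (j : Entity) (m₁ : Man j) (e₁ : Enter j) :
    AManEntered Entity Man Enter :=
  ⟨⟨j, m₁⟩, e₁⟩

-- A proof of (20) applied to a man yields an entering event for that man.
example (f : EveryManEntered Entity Man Enter) (j : Entity) (m₁ : Man j) :
    Enter j :=
  f ⟨j, m₁⟩

end QuantifierSketch
```

The only difference between the two definitions is the outermost binder, which mirrors the remark that (21) is almost isomorphic to (18).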
However, dependent type semantics comes into its own in inter-sentential cases, which are discussed in the next section.

2.3 Inter-sentential semantics: Dynamics as dependence

Let us move on to the dynamic aspect of dependent type semantics. Consider the E-type pronoun in the sentences in (2).

(2) [A man]ᵢ entered. Heᵢ whistled.

⁶ For details about the rules used in (18), see the appendix of (Bekki 2013).

The semantic representation of the first sentence is (17), as discussed in the preceding section. Henceforth, I abbreviate Man(x) simply as M(x), Enter(x) as E(x), and so forth, for compactness. Moreover, I use the following abbreviated form for types of the form (Σx:Entity)M, which appear often.

Definition 2.2 (Abbreviated forms).
  (Σx)M ≡_def (Σx:Entity)M
  M ≡_def Man
  E ≡_def Enter
  W ≡_def Whistle
  ...

The semantic representation in (17) can now be written as (22).

(22) (Σu:(Σx)M(x))E(π₁(u)) : type

In order to determine the semantic representation of (2), we need a representation of the anaphoric expression “He” in (2) that relates it to its antecedent. The key to the inter-sentential analysis in dependent type semantics is that we can always find the antecedent of an anaphoric expression in the proofs of the preceding discourse.⁷ If ((j, s₁), e₁) is a proof of (22), then “Heᵢ” in (2) must denote j. If ((b, s₂), e₂) is a proof of (22), then “Heᵢ” in (2) denotes b. In general, if v is a proof of (22), namely v : (Σu:(Σx)M(x))E(π₁(u)), then “Heᵢ” in (2) denotes π₁(π₁(v)). In this way, the antecedent of an E-type pronoun can be accessed by means of (a composition of) projections from a proof of the representation of the sentence containing the antecedent.

Therefore, all we need is a mechanism that combines the representations of any two consecutive sentences by passing the proof of the representation of the first sentence to the representation of the second sentence.
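The idea that a pronoun denotes a projection of a proof of the preceding sentence can be sketched in Lean 4. This is an illustrative sketch only: Entity, Man, Enter and Whistle are hypothetical parameters, Type stands in for the sort type, and the names he and HeWhistled are introduced here and are not part of the theory.

```lean
namespace PronounSketch

variable (Entity : Type) (Man Enter Whistle : Entity → Type)

-- (22): the type of proofs of "A man entered".
abbrev AManEntered : Type :=
  (u : (x : Entity) × Man x) × Enter u.1

-- The antecedent of "He": π₁(π₁(v)) for any proof v of (22).
def he (v : AManEntered Entity Man Enter) : Entity := v.1.1

-- "He whistled", given a proof v of the first sentence.
def HeWhistled (v : AManEntered Entity Man Enter) : Type :=
  Whistle (he Entity Man Enter v)

-- If the proof is ((j, s₁), e₁), the pronoun resolves to j.
example (j : Entity) (s₁ : Man j) (e₁ : Enter j) :
    he Entity Man Enter ⟨⟨j, s₁⟩, e₁⟩ = j := rfl

end PronounSketch
```

Because HeWhistled takes the proof v as an argument, the type of the second sentence depends on a term supplied by the first, which is the sense of “dynamics as dependence”.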
For this purpose, we employ a method similar to that of continuation semantics, and the semantic representation of the second sentence in (2) becomes as follows:

(23) (λc)W(selᵢ(c)) : discourse → type

For each index i, the selection function selᵢ is a selected projection function (or a composition thereof).⁸ The choice of the selection function selᵢ corresponds to the choice of indices in generative grammar. In this paper, subscripts in the sentences, such as i in “[A man]ᵢ” and “Heᵢ”, are not a formal notation: they just indicate which expression is the antecedent of which anaphoric expression. The use of subscripts such as i in selᵢ is also an informal notation: it just indicates that the selection function is properly chosen in order to establish the anaphoric link indicated by the same subscript.

⁷ This is in sharp contrast with the view of dynamic logic, which refers to the antecedent by means of assignment functions, while dependent type semantics refers to the antecedent by means of proofs.

⁸ The name sel is adopted from continuation semantics (de Groote 2006, Asher and Pogodalla 2011). More formally, selᵢ is a member of the set of finite compositions of projections, namely { π_{i₁} ∘ ··· ∘ π_{iₙ} | i₁, …, iₙ ∈ {1, 2}, 1 ≤ n }. Similar ideas are also found in monadic semantics (Ogata 2008, Unger 2011).

The variable c in (23) takes the preceding discourse,⁹ and in the discussion below the semantic representation of a sentence is a function from discourse to type, instead of just type. The actual type of discourse varies with the content of the previous discourse, but for the moment I simply specify it as discourse.¹⁰ This change also affects the semantic representation of the first sentence in (2): we add the vacuous discourse variable c to (22) and obtain (24).

(24) (λc)(Σu:(Σx)M(x))E(π₁(u)) : discourse → type

In (24), variable c does not appear in the scope of λc.
This is the case whenever a sentence contains no anaphoric expressions. Below, (24) serves as the semantic representation of the first sentence in (2).

The discourse variable c provides us with a convenient way of combining the semantic representations M, N of two consecutive sentences. They are composed by the dynamic conjunction “;”, defined as follows:¹¹

Definition 2.3 (Dynamic Conjunction Operator).
  M; N ≡_def (λc)(Σu:M(c))(N((c, u)))

By means of the dynamic conjunction operator, we can define a CCG-style conjunction rule that applies to two consecutive sentences.

Definition 2.4 (Dynamic Conjunction Rule).
  Premises: S : M,  CONJ : ;,  S : N
  Conclusion (by ;): S : M; N

This operation can be generalized to apply to any phrase that is “S-reducible”,¹² following Gazdar’s cross-categorial conjunction (Gazdar 1980), as in Definition 2.5.

Definition 2.5 (Generalized Dynamic Conjunction Rule).
  Premises: X : M,  CONJ : ;,  X : N
  Conclusion (by ;): X : (λx⃗)(λc)(Σu:M x⃗ (c))N x⃗ ((c, u))
where X is an S-reducible category.

⁹ It is not a representation of the preceding discourse, but a proof of it. I revisit this issue in later sections.

¹⁰ Rigorously, the type discourse should be formulated as a polymorphic type. I consider this issue again and elaborate on its analysis in Section 4.

¹¹ In the notation N((c, u)), N is a one-place predicate which is fed a tuple (c, u). This is different from N(c, u), where N is a two-place predicate.

¹² The notion of “S-reducible category” (cf. Winter 1995) is recursively defined as follows:
  1. S is an S-reducible category.
  2. If X is an S-reducible category, X/Y and X\Y are also S-reducible categories, for any category Y.
  3. No other category is an S-reducible category.

Now the representations (24) and (23) are combined by the dynamic conjunction rule, and the single representation of the two sentences in (2) is obtained by the following derivation.

(25)
  Premises: S : (λc)(Σu:(Σx)M(x))E(π₁(u)),  CONJ : ;,  S : (λc)W(selᵢ(c))
  Conclusion (by ;): S : (λc)(Σv:(Σu:(Σx)M(x))E(π₁(u)))W(selᵢ((c, v)))

The proper choice of selᵢ here is such that selᵢ(x) = π₁π₁π₂(x). Since π₁(π₁(π₂((c, v)))) = π₁(π₁(v)), the pronoun “He” is analyzed as denoting the entity for which a proof of being a man and a proof of entering exist.

2.4 Dependent Type Semantics is proof-theoretic

As mentioned in Section 1, the sentences in (2) participate in the entailment relations of Example 2, repeated below.

(26) a. [A man]ᵢ entered. Heᵢ whistled. ∴ A man entered.
     b. [A man]ᵢ entered. Heᵢ whistled. ∴ A man whistled.

In dependent type semantics, these are correctly calculated in a proof-theoretic manner, without recourse to models. This counts as one of the advantages of dependent type semantics.

When the premises of (26a) follow a discourse whose proof is c (whatever c may be), applying their semantic representation to c yields the following type.

(27) (Σv:(Σu:(Σx)M(x))E(π₁(u)))W(π₁π₁(v)) : type

The entailment in (26a) is proven in a straightforward manner, since a proof of the conclusion of (26a) is just the first projection of a proof of the premises. In proving (26a), we assume that (27) is inhabited. Assume that the term t is such an inhabitant.

(28)
  t : (Σv:(Σu:(Σx)M(x))E(π₁(u)))W(π₁(π₁(v)))
  π₁(t) : (Σu:(Σx)M(x))E(π₁(u))   (ΣE)

The entailment in (26b) is somewhat more complex, but still simple, as in (29), where the (CONV) rule (see the appendix of Bekki 2013) is applied somewhat loosely.

(29)
  a. t : (Σv:(Σu:(Σx)M(x))E(π₁(u)))W(π₁π₁(v))   (assumption)
  b. π₁(t) : (Σu:(Σx)M(x))E(π₁(u))   (ΣE, from a)
  c. π₁π₁(t) : (Σx)M(x)   (ΣE, from b)
  d. π₂(t) : W(π₁π₁(v))[π₁(t)/v] ≡ W(π₁π₁π₁(t)) ≡ W(π₁(u))[π₁π₁(t)/u]   (ΣE, from a)
  e. (π₁π₁(t), π₂(t)) : (Σu:(Σx)M(x))W(π₁(u))   (ΣI, from c and d)

Thus, the following inferences hold, which gives a proof-theoretic account of the data in (26).

(30) a. (Σv:(Σu:(Σx)M(x))E(π₁(u)))W(π₁π₁(v)) ⊢ (Σu:(Σx)M(x))E(π₁(u))
     b. (Σv:(Σu:(Σx)M(x))E(π₁(u)))W(π₁π₁(v)) ⊢ (Σu:(Σx)M(x))W(π₁(u))

2.5 Dependent Type Semantics is compositional

In dependent type semantics, the semantic representation of the sentences in (2) can be naturally lexicalized and composed. In this paper, I adopt Combinatory Categorial Grammar (henceforth CCG) (Steedman 1996; 2000) as a syntactic theory in order to demonstrate semantic composition along with syntactic derivation. However, with minor modifications, it should work together with any other lexical grammar, such as type-logical grammars, classical/abstract categorial grammars, LFG, LTAG and minimalist grammars, since dependent type semantics is independent of any specific feature of CCG.

I assume the following map ⟦−⟧ from the syntactic categories in CCG to the semantic types in dependent type semantics.

Definition 2.6 (Correspondence between Syntactic Categories in CCG and Semantic Types in Dependent Type Semantics).
  ⟦NP⟧ = Entity
  ⟦N⟧ = Entity → type
  ⟦S⟧ = discourse → type
  ⟦S̄⟧ = discourse → type
  ⟦X/Y⟧ = ⟦Y⟧ → ⟦X⟧
  ⟦X\Y⟧ = ⟦Y⟧ → ⟦X⟧

Theorem 1 states that the combinatory rules of CCG preserve this correspondence, which allows us to conclude that every sentence has a semantic representation of type discourse → type.

Theorem 1 (Soundness of Combinatory Rules with respect to the Category=Type Correspondence). For any combinatory rule in CCG of the following form (where X₁, …, Xₙ, Y are syntactic categories and f₁, …, fₙ, g are semantic representations):
  Premises: X₁ : f₁  ···  Xₙ : fₙ
  Conclusion: Y : g
if f₁ is of type ⟦X₁⟧, …, fₙ is of type ⟦Xₙ⟧, then g is of type ⟦Y⟧.

The proof is routine.
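The category-to-type map of Definition 2.6 and one instance of Theorem 1 can be sketched in Lean 4. This is an illustration under explicit assumptions, not the author’s formalization: the inductive type Cat and the function interp are hypothetical names, the category S̄ is omitted, and Prop is used in place of the sort type purely so that every clause of interp lives in the same universe.

```lean
namespace CategoryTypeSketch

-- CCG categories; `fwd X Y` stands for X/Y and `bwd X Y` for X\Y.
inductive Cat where
  | NP | N | S
  | fwd (x y : Cat)
  | bwd (x y : Cat)

-- The map ⟦−⟧ of Definition 2.6, with Prop standing in for `type`
-- (a simplification made for this sketch only).
abbrev interp (Entity Discourse : Type) : Cat → Type
  | .NP => Entity
  | .N => Entity → Prop
  | .S => Discourse → Prop
  | .fwd x y => interp Entity Discourse y → interp Entity Discourse x
  | .bwd x y => interp Entity Discourse y → interp Entity Discourse x

-- One instance of Theorem 1: forward application (X/Y, Y ⇒ X)
-- sends well-typed inputs to a well-typed output.
def forwardApp {E D : Type} {X Y : Cat}
    (f : interp E D (.fwd X Y)) (a : interp E D Y) : interp E D X :=
  f a

-- Backward application (Y, X\Y ⇒ X) is handled the same way.
def backwardApp {E D : Type} {X Y : Cat}
    (a : interp E D Y) (f : interp E D (.bwd X Y)) : interp E D X :=
  f a

end CategoryTypeSketch
```

That forwardApp and backwardApp type-check is the content of Theorem 1 for these two rules; the remaining combinatory rules (composition, type raising, and so on) would need analogous definitions.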
Then, a derivation of the sentence “A man entered” is as follows:

(31)
  A ⊢ S/(S\NP)/N : (λn)(λp)(λc)(Σu:(Σx)nxc)(p(π₁(u))((c, u)))
  man ⊢ N : (λx)(λc)Mx
  A man ⊢ S/(S\NP) : (λp)(λc)(Σu:((Σx)Mx))p(π₁(u))((c, u))   (>)
  entered ⊢ S\NP : (λx)(λc)Ex
  A man entered ⊢ S : (λc)(Σu:((Σx)Mx))E(π₁(u))   (>)

Intuitively, u denotes a proof of the subject noun phrase, and x denotes a proof of the representation of the head noun of the noun phrase.¹³ The variable c in nxc in the representation of “A” does not appear in the resulting representation above. This is because “A man” does not contain any modifiers that contain anaphora.

(32)
  Heᵢ ⊢ S/(S\NP) : (λp)(λc)p(selᵢ(c))c
  whistled ⊢ S\NP : (λx)(λc)Wx
  Heᵢ whistled ⊢ S : (λc)W(selᵢ(c))   (>)

Thus, the derivation in (25) is reproduced, this time entirely from the lexicalized representations.

¹³ This representation may seem redundant, because TTG would assign the following representation to the first sentence of (2): (Σx:Man)Enter(x). I do not adopt this representation, however, for reasons of compositionality, and the analysis is more robust when we consider noun modifiers such as relative clauses.

(33)
  A man entered ⊢ S : (λc)(Σu:((Σx)Mx))E(π₁(u))
  Heᵢ whistled ⊢ S : (λc)W(selᵢ(c))
  A man entered. Heᵢ whistled ⊢ S : (λc)(Σv:((Σu:(Σx)Mx)E(π₁(u))))W(selᵢ((c, v)))   (;)

3 Accessibility

DRT has boxes as a mechanism for controlling the accessibility of anaphora. In DPL, negation as a test plays the role of a box. The current implementation of dependent type semantics, where the Π-operator plays this role, inherits the idea from DPL, since in dependent type theory negation is defined via the Π-operator.

Let us check some of the classical paradigms of accessibility discussed in DRT and see how dependent type semantics sheds new light on them.

3.1 Universal quantification

The first case is universal quantification in the subject position, which blocks the link from subsequent sentences to indefinites within its scope, as in (7).

(7) Everybody bought [a car]ᵢ. *Itᵢ stinks.

The lexical item for “Everybody” is given by means of the Π-operator as follows.
(34)
  Everybody ⊢ S/(S\NP) : (λp)(λc)(Πu:(Σx)M(x))(p(π₁u)((c, u)))
  bought a car ⊢ S\NP : (λx)(λc)(Σv:(Σy)C(y))B(x, π₁(v))
  Everybody bought a car ⊢ S : (λc)(Πu:(Σx)M(x))(Σv:(Σy)C(y))B(π₁(u), π₁(v))   (>)

(35)
  Everybody bought [a car]ᵢ ⊢ S : (λc)(Πu:(Σx)M(x))(Σv:(Σy)C(y))B(π₁(u), π₁(v))
  ∅ ⊢ CONJ : ;
  *Itᵢ stinks ⊢ S : (λc)S(selᵢ(c))
  Combined (;) ⊢ S : (λc)(Σw:(Πu:(Σx)M(x))(Σv:(Σy)C(y))B(π₁(u), π₁(v)))S(selᵢ((c, w)))

In the derivation above, the projection selᵢ((c, w)) has to pick an antecedent from a proof of the following type:

(36) (Πu:(Σx)M(x))(Σv:(Σy)C(y))B(π₁(u), π₁(v))

However, “a car” (denoted by the variable y) is not accessible from the subsequent discourse, since w is a function from (Σx)M(x) to (Σv:(Σy)C(y))B(π₁(u), π₁(v)), so that it is not possible to pick a car from w by (a composition of) projections alone. Thus, universal quantification behaves like a box in DRT.

3.2 Conditionals

The second case is conditional sentences. In a conditional sentence, an antecedent in the premise part is accessible from anaphora in the consequent part, but not from anaphora in subsequent sentences, as shown in (8).

(8) If John bought [a car]ᵢ, itᵢ must be a Porsche. *Itᵢ stinks.

The sentences in (8) are derived as follows. For the sake of simplicity, I analyze “If” as a material conditional, which is represented by the Π-operator in dependent type semantics.

(37)
  If ⊢ S/S/S : (λp)(λq)(λc)(Πw:pc)(q((c, w)))
  John bought [a car]ᵢ ⊢ S : (λc)(Σu:(Σy)C(y))B(j, π₁(u))
  If John bought [a car]ᵢ ⊢ S/S : (λq)(λc)(Πw:(Σu:(Σy)C(y))B(j, π₁(u)))(q((c, w)))   (>)
  itᵢ must be a Porsche ⊢ S : (λc)bePorsche(selᵢ(c))
  If John bought [a car]ᵢ, itᵢ must be a Porsche ⊢ S : (λc)(Πw:(Σu:(Σy)C(y))B(j, π₁(u)))(bePorsche(selᵢ((c, w))))   (>)

The antecedent “a car” is a proof y of type Entity that satisfies C(y), and it is accessible to “itᵢ” via w. On the other hand, the proof y is not accessible to “Itᵢ” in the second sentence of (8).
(38)
  If John bought [a car]ᵢ, itᵢ must be a Porsche ⊢ S : (λc)(Πw:(Σu:(Σy)C(y))B(j, π₁(u)))(bePorsche(selᵢ((c, w))))   (by (37))
  ∅ ⊢ CONJ : ;
  *Itᵢ stinks ⊢ S : (λc)S(selᵢ(c))
  Combined (;) ⊢ S : (λc)(Σv:(Πw:(Σu:(Σy)C(y))B(j, π₁(u)))(bePorsche(selᵢ((c, w)))))S(selᵢ((c, v)))

As can be seen from the derivation in (38), the second selᵢ has to pick the antecedent from (c, v). However, the functional proof that v denotes encapsulates the participants of the first sentence, including the proof of C(y). Thus, the accessibility constraint in conditional sentences is predicted without any ad hoc assumptions.

3.3 Negation

The third case is negation. Negation is known to block accessibility, as in (9).

(9) John didn’t buy [a car]ᵢ. *Itᵢ stinks.

In dependent type semantics, this constraint is due to the definition of negation via implication.

Definition 3.1 (Negation). ¬A ≡_def (Πx:A)⊥

The semantic representation of the first sentence in (9) is as follows:

(39)
  didn’t ⊢ S\NP/(S\NP) : (λp)(λx)(λc)¬pxc
  buy a car ⊢ S\NP : (λx)(λc)(Σu:(Σy)C(y))B(x, π₁(u))
  didn’t buy a car ⊢ S\NP : (λx)(λc)¬(Σu:(Σy)C(y))B(x, π₁(u))   (>)
  John ⊢ NP : j
  John didn’t buy a car ⊢ S : (λc)¬(Σu:(Σy)C(y))B(j, π₁(u))   (<)

According to Definition 3.1, the semantic representation in the last line, if some context c is fed to it, is the functional type (Πx:(Σu:(Σy)C(y))B(j, π₁(u)))⊥, which is the type of functions from (Σu:(Σy)C(y))B(j, π₁(u)) to ⊥. Therefore, it is not possible to pick up “a car” from its proof by means of a composition of projections. In other words, negation blocks accessibility due to its implicational nature.

Notice that the fact that universal quantification, conditionals and negation block accessibility is not only correctly predicted but also explained in dependent type semantics in a uniform way. They are all represented by the Π-operator, whose proofs are functions, from which a mere composition of projections cannot pick up the elements involved.
This is a deeper explanation of accessibility: it relies only on the difference in the structure of proofs between the Σ-operator and the Π-operator, which is widely adopted in type theory, without recourse to any ad hoc assumptions.

3.4 Nominal conjunction

Further support for dependent type semantics is found in cases of anaphoric links across a nominal conjunction, as in (40).

(40) [A monk]ᵢ and hisᵢ apprentice opened the gate.

Since A ∧ B ≡_def (Σx:A)B (where x ∉ fv(B); see the appendix of Bekki 2013), the analysis of anaphora in conjunctions parallels that in implications and universal quantifications, in the sense that the proof of the first conjunct is passed to the second conjunct. The derivation of “A monk” follows that of “A man” in (31), and I assume the following lexical item for the possessive pronoun “his”.

(41)
  his ⊢ S/(S\NP)/N : (λn)(λp)(λc)(Σv:(Σy)(nyc ∧ of(y, selᵢ(c))))(p(π₁(v))((c, v)))
  apprentice ⊢ N : (λx)(λc)A(x)
  his apprentice ⊢ S/(S\NP) : (λp)(λc)(Σv:(Σy)(A(y) ∧ of(y, selᵢ(c))))(p(π₁(v))((c, v)))   (>)

(42)
  A monk ⊢ S/(S\NP) : (λp)(λc)(Σu:(Σx)Mx)(p(π₁(u))((c, u)))
  and ⊢ CONJ : ;
  his apprentice ⊢ S/(S\NP) : (λp)(λc)(Σv:(Σy)(A(y) ∧ of(y, selᵢ(c))))(p(π₁(v))((c, v)))
  A monk and his apprentice ⊢ S/(S\NP) : (λp)(λc)(Σw:(Σu:(Σx)Mx)(p(π₁(u))((c, u))))(Σv:(Σy)(A(y) ∧ of(y, selᵢ((c, w)))))(p(π₁(v))(((c, w), v)))   (;)

The analysis of “and” as a category CONJ here is the same as that of sentential conjunction in Definition 2.5. The essence is that the proof of the first conjunct is passed to the second conjunct, which may depend on it, i.e., a member of that proof may be used as an antecedent by anaphoric expressions contained in the second conjunct. If the selection function selects “A monk” (which is included in w), “his” gets bound in the intended way.

Moreover, in the verb phrase, both “A monk” and “his apprentice” can serve as an antecedent, as in (43); both readings are derivable if we set the selection function appropriately.

(43) [A monk]ᵢ and [his apprentice]ⱼ met hisᵢ/ⱼ father.
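The contrast between Σ-proofs and Π-proofs that underlies Sections 3.1 to 3.3 can be illustrated with a small Lean 4 sketch. It is offered under stated assumptions only: Entity, Man, Car and Bought are hypothetical parameters, Type stands in for the sort type, and the definition names are introduced here for illustration.

```lean
namespace AccessibilitySketch

variable (Entity : Type) (Man Car : Entity → Type)
variable (Bought : Entity → Entity → Type)

-- (36) "Everybody bought a car": a Π-type, so its proofs are functions.
abbrev EverybodyBoughtACar : Type :=
  (u : (x : Entity) × Man x) → (v : (y : Entity) × Car y) × Bought u.1 v.1

-- "John bought a car": a Σ-type, so its proofs are nested pairs.
abbrev JohnBoughtACar (j : Entity) : Type :=
  (v : (y : Entity) × Car y) × Bought j v.1

-- From a pair, a composition of projections reaches the car.
def theCar (j : Entity) (p : JohnBoughtACar Entity Car Bought j) : Entity :=
  p.1.1

-- From a function, a car is reachable only after a man u is supplied;
-- w on its own is not a pair and has no first projection.
def carBoughtBy (w : EverybodyBoughtACar Entity Man Car Bought)
    (u : (x : Entity) × Man x) : Entity :=
  (w u).1.1

end AccessibilitySketch
```

The extra argument u of carBoughtBy is exactly what a selection function built from projections alone cannot provide, which is why the continuation “*It stinks” finds no antecedent in (35).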
Anaphora has access to the proof of the preceding discourse, but the only method of picking the antecedent is a composition of projection functions. Thus, the pair-shaped proof introduced by a Σ-operator (existential quantification and conjunction) yields accessible referents, while the functional proof introduced by a Π-operator (universal quantification and implication) blocks access to the elements inside it. To sum up, accessibility amounts to reachability by projection.

4 Polymorphism of discourse

In Section 2.3, I defined the semantic type of declarative sentences as discourse → type. However, this is not a precise formalization, as mentioned in Footnote 10, since the type discourse should vary according to the discourse that precedes the sentence. Recall the sentences in (2) and their semantic representations in (33), repeated here as (44) for convenience.

(2) [A man]ᵢ entered. Heᵢ whistled.

(44)
  A man entered ⊢ S : (λc)(Σu:((Σx)Mx))E(π₁(u))
  Heᵢ whistled ⊢ S : (λc)W(selᵢ(c))
  A man entered. Heᵢ whistled ⊢ S : (λc)(Σv:((Σu:(Σx)Mx)E(π₁(u))))W(selᵢ((c, v)))   (;)

If the initial discourse is of type d, the representation (λc)(Σu:((Σx)Mx))E(π₁(u)) must be of type d → type. Then, the new discourse (c, v) (which now includes “A man entered”) is passed to (λc)W(selᵢ(c)), where v is a proof of type (Σu:((Σx)Mx))E(π₁(u)), and thus the type of (c, v) is (Σc:d)(Σu:((Σx)Mx))E(π₁(u)).

One proper way to treat the types of discourses is to introduce the polymorphic types available in the λ-cube. A polymorphic type is a type depending on other types.

(45)
  A man ⊢ S/(S\NP) : (λp)(λδ : type)(λc : δ)(Σu:((Σx)Mx))p(π₁(u))(δ)((c, u))
  entered ⊢ S\NP : (λx : Entity)(λδ : type)(λc : δ)Ex
  A man entered ⊢ S : (λδ : type)(λc : δ)(Σu:((Σx)Mx))E(π₁(u))   (>)

This representation is polymorphic, i.e., it works for any discourse type. Then, one can instantiate the type according to the initial discourse.
(46)
  Premises: d : type,  (λδ : type)(λc : δ)(Σu:((Σx)Mx))E(π₁(u)) : (Πδ:type)(Πc:δ)type
  Conclusion (ΠE): (λc : d)(Σu:((Σx)Mx))E(π₁(u)) : (Πc:d)type

One can also instantiate the type of the discourse passed to the second sentence. The derivation (47) is a polymorphic version of (32).

(47)
  Heᵢ ⊢ S/(S\NP) : (λp)(λδ : type)(λc : δ)p(selᵢ(c))δc
  whistled ⊢ S\NP : (λx : Entity)(λδ : type)(λc : δ)Wx
  Heᵢ whistled ⊢ S : (λδ : type)(λc : δ)W(selᵢ(c))   (>)

(48)
  Premises: (Σu:((Σx)Mx))E(π₁(u)) : type,  (λδ : type)(λc : δ)W(selᵢ(c)) : (Πδ:type)(Πc:δ)type
  Conclusion (ΠE): (λc : (Σu:((Σx)Mx))E(π₁(u)))W(selᵢ(c)) : (Πc:(Σu:((Σx)Mx))E(π₁(u)))type

At first sight this may look complicated, but it is not counterintuitive. This representation says that the discourse passed to the second sentence is a (generalized) pair composed of the discourse that precedes the first sentence and the proof of the first sentence.

This remedy is implemented in a simple manner by two modifications to our theory. First, we uniformly change the semantic representation of every proposition of the form (49) to (50).

(49) (λc : discourse)M : discourse → type

(50) (λδ : type)(λc : δ)M : (Πδ:type)(Πc:δ)type

Second, we modify the dynamic conjunction rule in Definition 2.4 as follows:

Definition 4.1 (Dynamic Conjunction Rule (modified)).
  Premises: S : M : (Πδ:type)(Πc:δ)type,  CONJ : ;,  S : N : (Πδ:type)(Πc:δ)type
  Conclusion (by ;): S : M; N
where
  M; N ≡_def (λδ)(λc′)(Σu:Mδc′)(N((Σc:δ)Mδc′)((c′, u))) : (Πδ:type)(Πc:δ)type

The proof of Theorem 1 must be updated accordingly.

5 Conclusion

This paper presented a new framework, referred to as dependent type semantics, which is based on dependent type theory and the context-passing mechanism of continuation semantics. I have demonstrated that dependent type semantics satisfies the four criteria required of a successful theory of dynamic semantics, discussed in Section 1.1, in the following way:

Criterion #1: it provides well-formed representations for sentences involving dynamic binding.

Criterion #2: it is proof-theoretic and thus predicts entailment relations involving dynamic binding.

Criterion #3: it is compositional, and semantic representations of E-type and donkey sentences can be composed from lexicalized representations.

Criterion #4: it explains accessibility in a uniform way based on the nature of implication/universal quantification.

The core of dependent type semantics, with respect to the notions of truth, dynamics and accessibility, is summarized by the following three concepts: i) truth as inhabitance, ii) dynamics as dependence on contexts, and iii) accessibility as reachability by projections.

I hope that dependent type semantics not only serves as a dynamic, proof-theoretic and compositional framework that correctly predicts and explains both intra-sentential and inter-sentential anaphora and their accessibility, but also provides a new conceptual perspective on the relation between representations, proofs and denotations in natural language semantics.

Acknowledgements

My sincere thanks to the participants in my guest lectures at the University of Amsterdam, Tilburg University and Utrecht University in October 2012, especially to Alexandru Baltag, Kohei Kishida, Reinhard Muskens, Rick Nouwen, Ben Rodenhaeuser and Sonja Smets for their valuable comments. I also thank Alastair Butler, Yuri Ishishita, Yuki Nakano and Ribeka Tanaka for many helpful discussions. This research is partially supported by a Grant-in-Aid for Young Scientists (A) (No. 22680013) from the Ministry of Education, Science, Sports and Culture.

References

R. Ahn and H.-P. Kolb. Discourse representation meets constructive mathematics. In L. Kalman and L. Polos, editors, Papers from the Second Symposium on Logic and Language. Akademiai Kiado, 1990.
N. Asher and S. Pogodalla. SDRT in continuation semantics. In T. Onoda, D. Bekki, and E.
McCready, editors, New Frontiers in Artificial Intelligence (JSAI-isAI 2010 Workshops, Tokyo, Japan, November 2010, Selected Papers from LENLS7), volume LNAI 6797, pages 3–15. Springer, Heidelberg, 2011.
H. P. Barendregt. Introduction to generalized type systems. Journal of Functional Programming, 1(2):125–154, 1991.
H. P. Barendregt. Lambda calculi with types. In S. Abramsky, D. M. Gabbay, and T. Maibaum, editors, Handbook of Logic in Computer Science, volume 2, pages 117–309. Oxford Science Publications, 1992.
D. Bekki. A type-theoretic approach to double negation elimination in anaphora. In Logic and Engineering of Natural Language Semantics 10 (LENLS 10), Tokyo, 2013.
S. Berardi. Type Dependence and Constructive Mathematics. PhD thesis, Mathematical Institute, 1990.
Y. Bertot and P. Castéran. Interactive Theorem Proving and Program Development. Springer, 2004.
A. Bove and P. Dybjer. Dependent types at work, February-March 2008.
T. Coquand and G. Huet. The calculus of constructions. Information and Computation, 76(2-3):95–120, 1988.
D. Davidson. The logical form of action sentences. In N. Rescher, editor, The Logic of Decision and Action. University of Pittsburgh Press, Pittsburgh, 1967.
R. Dávila-Pérez. Translating English into Martin-Löf’s theory of types: A compositional approach. Technical report, University of Essex, May 1994.
R. Dávila-Pérez. Comments on constructive semantics for natural language, 1996.
P. de Groote. Towards a Montagovian account of dynamics. In M. Gibson and J. Howell, editors, 16th Semantics and Linguistic Theory Conference (SALT16), pages 148–155, University of Tokyo, 2006. CLC Publications.
M. Dummett. What is a theory of meaning? In S. Guttenplan, editor, Mind and Language, pages 97–138. Oxford University Press, Oxford, 1975.
M. Dummett. What is a theory of meaning? (II). In Evans and McDowell, editors, Truth and Meaning, pages 67–137. Oxford University Press, Oxford, 1976.
M.
Dummett. The Logical Basis of Metaphysics. Duckworth, London, 1991.
G. Evans. Pronouns. Linguistic Inquiry, 11:337–362, 1980.
C. Fox. Discourse representation, type theory and property theory. In H. Bunt, R. Muskens, and G. Rentier, editors, the International Workshop on Computational Semantics, pages 71–80, Institute for Language Technology and Artificial Intelligence (ITK), Tilburg, 1994.
G. Gazdar. A cross-categorial semantics for conjunction. Linguistics and Philosophy, 3:407–409, 1980.
P. Geach. Reference and Generality: An Examination of Some Medieval and Modern Theories. Cornell University Press, Ithaca, New York, 1962.
J.-Y. Girard, Y. Lafont, and P. Taylor. Proofs and Types. Cambridge Tracts in Theoretical Computer Science 7. Cambridge University Press, 1989.
J. Groenendijk and M. Stokhof. Dynamic predicate logic. Linguistics and Philosophy, 14:39–100, 1991.
H. Kamp. A theory of truth and semantic representation. In J. Groenendijk, T. M. Janssen, and M. Stokhof, editors, Formal Methods in the Study of Language. Mathematical Centre Tract 135, Amsterdam, 1981.
H. Kamp and U. Reyle. From Discourse to Logic. Kluwer Academic Publishers, 1993.
L. Karttunen. Discourse referents. In J. D. McCawley, editor, Syntax and Semantics 7: Notes from the Linguistic Underground, volume 7, pages 363–85. Academic Press, New York, 1976.
P. Martin-Löf. An intuitionistic theory of types. In H. E. Rose and J. Shepherdson, editors, Logic Colloquium ’73, pages 73–118. North-Holland, Amsterdam, 1975.
P. Martin-Löf. Intuitionistic Type Theory, volume 17. Italy: Bibliopolis, Naples, 1984. Sambin, G. (ed.).
R. Montague. The proper treatment of quantification in ordinary English. In J. Hintikka, J. Moravcsic, and P. Suppes, editors, Approaches to Natural Language, pages 221–242. Reidel, Dordrecht, 1973.
B. Nordström, K. Petersson, and J. Smith. Programming in Martin-Löf’s Type Theory. Oxford University Press, 1990.
N. Ogata.
Towards computational non-associative Lambek lambda-calculi for formal pragmatics. In the Fifth International Workshop on Logic and Engineering of Natural Language Semantics (LENLS2008) in Conjunction with the 22nd Annual Conference of the Japanese Society for Artificial Intelligence 2008, pages 79–102, Asahikawa, Japan, 2008.
T. Parsons. Events in the Semantics of English: A Study in Subatomic Semantics. The MIT Press, Cambridge MA, 1990.
D. Prawitz. Intuitionistic logic: A philosophical challenge. In G. von Wright, editor, Logics and Philosophy. Martinus Nijhoff, The Hague, 1980.
A. Ranta. Intuitionistic categorial grammar. Linguistics and Philosophy, 14:203–239, 1991.
A. Ranta. Type-Theoretical Grammar. Oxford University Press, 1994.
M. J. Steedman. Surface Structure and Interpretation. The MIT Press, Cambridge, 1996.
M. J. Steedman. The Syntactic Process (Language, Speech, and Communication). The MIT Press, Cambridge, 2000.
G. Sundholm. Proof theory and meaning. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume III, pages 471–506. Kluwer, Reidel, 1986.
C. Unger. Dynamic semantics as monadic computation. In the Eighth International Workshop on Logic and Engineering of Natural Language Semantics (LENLS8), pages 153–164, Takamatsu, Kagawa, Japan, 2011. JSAI International Symposia on AI 2011.
Y. Winter. Syncategorematic conjunction and structured meanings. In SALT 1995, 1995.

A Medieval Epistemic Puzzle

Sara L. Uckelman
Cluster of Excellence “Asia and Europe in a Global Context”
Universität Heidelberg
Abstract

In this paper we present a dynamic epistemic language with names for propositions, which we use to analyse an epistemic puzzle and associated disputation due to the medieval logician Paul of Venice.

1 Introduction

Epistemic logicians love puzzles, particularly puzzles involving reasoning about imperfect information in multi-agent settings. The literature on epistemic logic and dynamic epistemic logic is rife with examples of such puzzles,¹ and many results in the field have arisen from attempts to formalize and solve these puzzles.

¹ See (van Ditmarsch et al. 2007) for many examples.

In this paper, we consider an interesting little epistemic puzzle with unusual origins. The puzzle is due to the medieval logician Paul of Venice, and occurs in the treatise De scire et dubitare ‘On knowing and being uncertain’² of his monumental work Logica Magna ‘Great Logic’, which was published in the early 15th century.³ Paul introduces this puzzle as a precursor to presenting an obligatio concerning the contents of the puzzle. An obligatio is a type of formalized disputation which was developed in the Middle Ages and used as a method for testing students’ abilities to recognize inferential relations and to reason under settings of changing information. As Paul explains in the treatise of the Logica Magna on obligationes, “the topic of obligationes is nothing other than the topic of inferences presented in a more subtle manner, in a way intended to test whether the respondent has a good head by setting a deceptive course before him” (Venice 1988, p. 33).⁴

² The translation of Latin dubitare and its cognates with ‘to be uncertain’ rather than ‘to doubt’ reflects the fact that dubito A ‘I am uncertain about A’ does not have the implication of belief in ¬A that ‘I doubt A’ can have in English.
³ For information about Paul’s life and other works, see the introduction to (Venice 1981).
⁴ Quod omnes regulae superius adsignatae in Tractatu Consequentiarum de consequentia bona vel non bona sunt hic fundamentaliter sustinendae. Et ratio quia materia obligationorum non est nisi materia consequentiarum stilo subtiliori procedens, et an respondens sit sani capitis gressu deceptorio temptativa (Venice 1988, p. 32).

Though topics in medieval logic are often considered far removed from the interests of modern logicians, it turns out that many of the puzzles medieval logicians were interested in remain interesting today for the same reasons. In this paper we use the tools of dynamic epistemic logic, suitably enriched to be sufficiently expressive to cover the relevant aspects of the puzzle and the ensuing disputation, to provide a formalization and analysis of both. The resulting analysis displays a hitherto unnoticed inconsistency in Paul of Venice’s epistemic theory, and also brings together different formal tools that are not ordinarily combined; as such, what we have to say here will be of interest to scholars of both medieval and modern logic.

We begin by giving the context of the disputation. The treatise De scire et dubitare is devoted to arguments for and against the claim that there is something known by someone which is uncertain to him. One of the arguments in favor of this claim rests on the following two assumptions:

(1) I assume (a) that you know that A is one of the two propositions ‘God exists’ and ‘A human being is a donkey’, and (b) that one A is every A, and (c) that it is hidden from you which of the propositions is A, but (d) you know perfectly well that the proposition ‘God exists’ is necessary and the other, ‘A human being is a donkey’, impossible (Venice 1981, p. 3).⁵

(2) Every proposition you consider which you do not know to be true and do not know to be false is uncertain to you (Venice 1981, p. 5).⁶

⁵ Et pono quod scias A esse alteram illarum ‘Deus est’ et ‘Homo est asinus’, et quod unum A est omne A, et lateat te quae illarum est A, sed bene scias quod illa est necessaria ‘Deus est’ et reliqua impossibilis ‘Homo est asinus’ (Venice 1981, p. 2).
⁶ Omnis propositio de qua consideras quam non scis esse veram nec scis esse falsam sit tibi dubia (Venice 1981, p. 4).

It must be noted, regarding the first assumption, that ‘God exists’ and ‘A human being is a donkey’ are being used as they typically are in medieval contexts; that is, they are being taken as a generic necessity and a generic impossibility, respectively. No part of the argument depends on the specific content of these propositions; it is rather the fact that (d) you know that one is necessary and the other is impossible that is relevant. We will not discuss the details of the argument here, having done so elsewhere (Uckelman 2013b). Instead, we are interested in what follows after Paul has set up this background scenario.

2 Background information and other work

2.1 Obligationes

Though we do not assume that the reader is already familiar with the basics of medieval disputationes de obligationibus and their history, we cannot go into much detail here. Those wishing to know more are directed to (Uckelman 2012; 2013a) for further reading. We briefly recap Paul’s rules and definitions for the basic variant of obligationes, called positio, found in his De obligationibus, Part II, Tract. 8 of the Logica Magna (Venice 1988).

There are two participants, Opponent and Respondent. Opponent begins by putting forward an initial statement, called the positum. When the Respondent admits a positum, he then binds himself to follow the rules of obligationes (Venice 1988, p. 3). It is this relation of rule following between the Respondent and the positum which is, on Paul’s view, precisely what makes up the obligatio:

Definition 2.1. An obligatio is a relation limiting one to uphold some statement, or its equiform, in some way . . .
It is based on the obligater, by virtue of the positio or the depositio; and on the obligated, by reason of his admissio (Venice 1988, pp. 7, 11).⁷

⁷ Obligatio est relatio limitans ad aliquod enuntiabile vel sibi consimile aliqualiter sustinendum . . . in obligante, ratione positionis vel depositionis; in obligato, vero ratione admissionis (Venice 1988, pp. 6, 10).

As is standard, Paul uses a notion of relevance. He first gives a general definition of relevance in terms of a relationship between propositions.

Definition 2.2 (Relevance). A proposition $\varphi$ is relevant to a proposition $\psi$ if $\varphi$ follows from $\psi$ or is inconsistent with $\psi$; it is irrelevant otherwise (Venice 1988, pp. 24–25).

This general definition of relevance is broken down into a number of specific definitions, depending on whether the proposition $\varphi$ is relevant (1) to the positing of the positum, (2) to the positum itself, (3) to both together, or, finally, (4) to the positum taken together with correctly granted propositions and negations of correctly denied propositions (Venice 1988, pp. 24–29). The second definition is relevance in the Swyneshedian tradition; the fourth is the standard Burleyan definition of relevance, which Paul adopts.⁸

⁸ See (Uckelman 2012, §4) for more on these two traditions.

The rest of the preliminary definitions and rules are standard: The positum should be admitted if there is no impediment arising in doing so (Venice 1988, pp. 48–49); any proposition once conceded must be conceded if it is ever put forward again (Venice 1988, pp. 34–35); relevant propositions must be conceded if they follow and denied if they do not (Venice 1988, pp. 54–63); irrelevant propositions must be conceded if they are known to be true, denied if known to be false, and doubted if neither (Venice 1988, pp. 64–65).

2.2 The language and its models

In other work we have developed a formal framework for the analysis of obligationes (Uckelman 2011a;b).
For our present analysis, we use this framework along with the language introduced in (Uckelman 2013b), which is a combination of dynamic and epistemic logic, with additional alethic modalities, quantification over propositions, and names for propositions. The combination of dynamic logic with epistemic logic allows us to model both the actions of the players and their knowledge explicitly.

Definition 2.3 (Well-formed formulas). For sets $\Phi_0$ of propositional letters, $\Xi$ of variables, $A$ of agents (containing $R$ and $O$, for Respondent and Opponent, respectively), and $N$ of names, the set $\Phi$ of well-formed formulas and $\Pi$ of well-formed actions is defined by mutual induction:

$$\varphi := p \in \Phi_0 \mid x \in \Xi \mid x = y : x, y \in \Xi \mid x = p : x \in \Xi, p \in \Phi_0 \mid \neg\varphi \mid \varphi \vee \varphi \mid Tx : x \in \Xi \mid Tp : p \in \Phi \mid \exists x\varphi(x) \mid K_a\varphi : a \in A \mid \Box\varphi \mid [\alpha]\varphi : \alpha \in \Pi \mid \hat{a} : a \in N \mid \hat{a} = \varphi \mid \hat{a} = x \mid {\downarrow}\varphi(\hat{a}{\downarrow})$$

$$\alpha := \varphi! : \varphi \in \Phi \mid \varphi? : \varphi \in \Phi$$

The other propositional operators, $F$, and $\Diamond$ are defined in the usual fashion. We define an uncertainty operator $U_a\varphi := \neg K_a\varphi \wedge \neg K_a\neg\varphi$; $\varphi$ is uncertain for an agent $a$ if he neither knows $\varphi$ nor knows $\varphi$ is false. Lastly, we also introduce a contingency operator, $\nabla\varphi := \neg\Box\varphi \wedge \neg\Box\neg\varphi$. The action $\varphi!$ is a public announcement of $\varphi$ (by any agent), and the action $\varphi?$ is an instruction to test for the truth of $\varphi$.

As is usual, this language is interpreted on Kripke models.

Definition 2.4 (Model). A model is a structure $M = \langle W, w^*, \{\sim_a : a \in A\}, V, \nu, n\rangle$ where
• $W$ is a set, with $w^* \in W$ a designated point (representing the actual world).
• $\{\sim_a : a \in A\}$ is a family of equivalence relations on $W$, one for each member of $A$. The relation $w \sim_a w'$ is interpreted as ‘$w$ and $w'$ are epistemically equivalent for agent $a$’; that is, $a$ cannot distinguish between $w$ and $w'$.
• $V : \Phi_0 \to 2^W$ is a valuation function associating atomic propositions with subsets of $W$.
• $\nu : \Xi \to \Phi_0$ is an assignment of propositions to the variables in $\Xi$. The notion of an $x$-variant of $\nu$ is the standard one.
• $n : N \times W \to \Phi$ is a naming function associating names and propositions. If $n(a, w) = \varphi$, then we say that “$a$ is a name of $\varphi$ at $w$”.

We designate the class of models by $\mathcal{M}$. The semantics for the dynamic modalities are defined in terms of two types of model modification, which require some preliminary definitions.

Definition 2.5. The truth set of $\varphi \in \Phi$ in a model $M$ is $[\![\varphi]\!]^M = \{w : M, w \models \varphi\}$.

Definition 2.6 (Model reduction). The reduction of a model $M$ by a formula $\varphi$ is $M{\upharpoonright}\varphi = \langle W^{M,\varphi}, w^*, \{\sim^{M,\varphi}_a : a \in A\}, V^{M,\varphi}, \nu^{M,\varphi}, n^{M,\varphi}\rangle$, where $W^{M,\varphi} := \{w \in W : M, w \models \varphi\}$, the actual world remains unchanged, and the relations and valuation functions are just restrictions of the originals.

Definition 2.7 (Uncertainty model). Given a model $M = \langle W, w^*, \{\sim_a : a \in A\}, V\rangle$, define the class $\mathcal{M}_{U_a\varphi}$ of $a$-uncertainty models for $\varphi$ as follows: If $M, w^* \models U_a\varphi$, then $\mathcal{M}_{U_a\varphi} = \{M\}$. Otherwise, $\mathcal{M}_{U_a\varphi} = \{M_{U_a\varphi} = \langle W', w'^*, \{\sim'_a : a \in A\}, V'\rangle\}$ such that:
• $W' = W \cup \{v\}$, where $v \notin W$.
• $w'^* = w^*$.
• $\sim'_a$ is the reflexive, transitive, and symmetric closure of $\sim_a \cup \{\langle w'^*, v\rangle\}$.
• For $a' \in A$, if $a' \neq a$, then $\sim'_{a'} = \sim_{a'}$.
• $V'$ is any valuation function minimally extending $V$ in such a way that
  – If $M, w^* \models K_a\varphi$, then $v \in [\![\neg\varphi]\!]$,
  – If $M, w^* \models K_a\neg\varphi$, then $v \in [\![\varphi]\!]$,
  and for all other $\psi$, $v \in [\![\psi]\!]$ iff $w^* \in [\![\psi]\!]$.

Definition 2.8 (Model revision). Given a selection function $f : \mathcal{M} \times A \times \Phi \to \mathcal{M}$ such that $f(M, a, \varphi) \in \mathcal{M}_{U_a\varphi}$, the $f$-revision of a model $M$ by a formula $\varphi$ for an agent $a$, written $M \uparrow_f \varphi$, is defined as follows:
If $\varphi$ is of the form $U_a\psi$: $M \uparrow_f \varphi = f(M, a, \psi)$
Otherwise: $M \uparrow_f \varphi = M{\upharpoonright}\varphi$

Definition 2.9 (Semantics). For a model $M$ with assignment $\nu$, the truth of a formula at a world $w \in W$ is defined recursively as follows:

$M, w \models p$ iff $w \in V(p)$
$M, w \models x$ iff $M, w \models \nu(x)$
$M, w \models x = y$ iff $\nu(x) = \nu(y)$
$M, w \models x = p$ iff $\nu(x) = p$
$M, w \models \exists x\varphi(x)$ iff there is an $x$-variant $\nu'$ of $\nu$ s.t. $M, \nu', w \models \varphi[\nu'(x)/x]$
$M, w \models Tp$ iff $w \in V(p)$
$M, w \models Tx$ iff $w \in V(\nu(x))$
$M, w \models \Box\varphi$ iff $\forall w' \in W$, $M, w' \models \varphi$
$M, w \models K_a\varphi$ iff $\forall w'(\langle w, w'\rangle \in \sim_a$ implies $M, w' \models \varphi)$
$M, w \models \hat{a}$ iff $M, w \models \varphi$ and $n(a, w) = \varphi$
$M, w \models x = \hat{a}$ iff $\nu(x) = n(a, w)$
$M, w \models \hat{a} = \varphi$ iff $n(a, w) = \varphi$
$M, w \models [\varphi!]\psi$ iff $M, w \models \varphi$ implies $M{\upharpoonright}\varphi, w \models \psi$
$M, w \models [\varphi?]\psi$ iff $\forall v \in M \uparrow \varphi$, $v \models \psi$
$M, w \models {\downarrow}\varphi(\hat{a}{\downarrow})$ iff $M, w \models \varphi[n(a, w)/\hat{a}{\downarrow}]$

where $\varphi[\nu(x)/x]$ is the result of replacing all occurrences of $x$ in $\varphi$ with $\nu(x)$, and where $\varphi[n(a, w)/\hat{a}{\downarrow}]$ is the result of replacing all occurrences of $\hat{a}{\downarrow}$ in $\varphi$ with $n(a, w)$.

As we discuss in (Uckelman 2013b), this language is richly expressive. Using it, we can express various types of scope distinctions (e.g., the difference between ‘Of A you know that it is $\varphi$’ and ‘You know that A is $\varphi$’) as well as talk about both propositions and names of propositions.

2.3 Obligationes in this framework

In this section we formally define obligationes disputations. We first specify the possible actions of $R$, which are defined using the dynamic test operator defined in the previous section:

Definition 2.10 (Actions of $R$). Let $\varphi_n$ be a proposition put forward by $O$. The possible actions of $R$ (designated Act) are:

concede : $[\varphi_n?]\top$
deny : $[\neg\varphi_n?]\top$
doubt : $[U_R\varphi_n?]\top$

These are essentially tests for consistency; if $R$ follows the rules in his responses, and does not concede contradictory statements, then he will always remain within a non-empty model. Note that as defined, doubting $\varphi$ is the same as conceding that you are uncertain about $\varphi$. Thus, Paul’s assertion that dubitatio is a variant of positio (Venice 1988, p. 39) is borne out.⁹

Definition 2.11 (Obligatio). An obligatio is a quadruple $\mathcal{O} = \langle\Theta, \mathbf{R}, \Gamma, \Gamma_{\mathbf{R}}\rangle$ where
• $\Theta$ is a sequence of propositions, such that $\theta_0 \in \Theta$ is the obligatum and $\theta_n \in \Theta$ is the proposition put forward by $O$ at round $n$.
• $\mathbf{R} : \Theta \times \mathbb{N} \to 2^{Act}$ is an obligational rule. The intended interpretation of $\mathbf{R}$ is that for any statement proposed by $O$, the rule gives a set of correct responses for $R$.
We often write $\mathbf{R}(\varphi_n)$ for $\mathbf{R}(\varphi, n)$ to simplify notation.
• $\Gamma$ is a sequence of actions, formed by $R$’s actual responses to each element of $\Theta$.
• $\Gamma_{\mathbf{R}}$ is a sequence of actions, formed by the correct response of $R$ to each element in $\Theta$, as given by $\mathbf{R}$.

The set of obligationes is denoted by $\mathfrak{O}$. We abuse notation and identify $\Gamma$ and $\Gamma_{\mathbf{R}}$ with the formulas inside $[\ ]$, that is, if $\Gamma_1 = \langle[\theta_0?]\top, [\neg\theta_1?]\top\rangle$, we identify $\Gamma_1$ with $\langle\theta_0, \neg\theta_1\rangle$. Further, for a set of ordered propositions $\Gamma_n$, let $M \uparrow \Gamma_n = M \uparrow \gamma_0 \uparrow \ldots \uparrow \gamma_n$, that is, $M \uparrow \Gamma_n$ is the result of the sequential reduction of $M$ by the elements of $\Gamma_n$.

Different types of obligationes are modeled by changing $\mathbf{R}$. Paul’s rules for positio are defined as follows:

Definition 2.12. For a model $M$ and positum $\theta_0 \in \Theta$:

$\mathbf{R}(\theta_0)$ = concede iff $M, w \models \langle\theta_0?\rangle\top$
$\mathbf{R}(\theta_0)$ = deny iff $M, w \models [\theta_0?]\bot$

For $\theta_n \in \Theta$, $n > 0$:

If $M \uparrow \Gamma_{n-1} \models \theta_n$: $\mathbf{R}(\theta_n)$ = concede
If $M \uparrow \Gamma_{n-1} \models \neg\theta_n$: $\mathbf{R}(\theta_n)$ = deny
Otherwise:
  If $M, w^* \models K_R\theta_n$: $\mathbf{R}(\theta_n)$ = concede
  If $M, w^* \models K_R\neg\theta_n$: $\mathbf{R}(\theta_n)$ = deny
  If $M, w^* \models \neg(K_R\theta_n \vee K_R\neg\theta_n)$: $\mathbf{R}(\theta_n)$ = doubt

⁹ Note that this is not generally true of dubitatio; see (Uckelman 2011a, Uckelman et al. forthcoming).

Two things to note about Paul’s rules. First, they are deterministic, which means that the sequence $\Gamma_{\mathbf{R}}$ is uniquely defined. Second, they are in a sense global or universal: for any formula $\varphi_n$ which is conceded, either it was true in all the remaining worlds in $M \uparrow \Gamma_n$ or it was known in the original world, that is, it was true in all worlds accessible from the original world (and analogously for formulas which are denied).

Many obligationes take place against the backdrop of some explicit world-knowledge. This is done via the specification of a casus ‘case’.

Definition 2.13 (Casus). Let $Lit(\Phi_0)$ be the set of literals of $\Phi_0$, $Lit(N)$ be the set of all formulas of the form $\hat{a} = \varphi$ or $\hat{a} \neq \varphi$, and $Lit_B(N)$ be the closure of $Lit(N)$ under the Boolean operators. A casus is any $C \subseteq Lit(\Phi_0) \cup Lit_B(N)$.

Definition 2.14.
$M$ models the casus if there is a partition $P = P_1 \cup P_2$ of $W$ with $w^* \in P_1$, such that:
• if $w \sim_R w^*$ or $w \sim_O w^*$, then $w \in P_1$.
• for all $w, v \in P_1$, $w \sim_R v$ and $w \sim_O v$.
• for every positive literal $p \in C$, every negative literal $\neg q \in C$, and every $w \in P_1$, $w \in V(p)$ and $w \notin V(q)$.
• for every $\psi \in C \cap Lit_B(N)$ and $w \in P_1$, $M, w \models \psi$.

Corollary 1. Fix a model $M$ and a casus $C$. Then, for every $\varphi \in C$, if $M$ models $C$, then $M, w^* \models K_R\varphi$ and $M, w^* \models K_O\varphi$.

Proof. Follows straightforwardly from Def. 2.14.

3 Analysis of the disputation

With all of this in place, we can now turn to the analysis of the puzzle and ensuing disputation. Here, $R$ stands for “you”, that is, the reader, who plays the role of the Respondent in the disputation below. We let $p$ := ‘God exists’ and $q$ := ‘A human being is a donkey’, and then the assumptions are:

(1a) $K_R(\hat{a} = p \vee \hat{a} = q)$.
(1b) $\forall x(\hat{a} = x \to (x = p \vee x = q))$.
(1c) $\neg K_R(\hat{a} = p) \wedge \neg K_R(\hat{a} = q) \wedge \neg K_R(\hat{a} \neq p) \wedge \neg K_R(\hat{a} \neq q)$.
(1d) $K_R p \wedge K_R\neg q$.
(2) $\forall x((\neg K_R x \wedge \neg K_R\neg x) \to U_R x)$.

These assumptions set up the background against which the disputation is to be evaluated; that is, this is the presentation of a casus. (2) is simply the definition of uncertainty from §2.2. (1b) is a constraint on allowable $n$ functions, requiring that each name is the name of a unique proposition. This constraint is already built in to our models (cf. Uckelman 2013b, fn. 12). Thus, we must specify a casus which accounts for (1a), (1c), and (1d).

Proposition 1. Let $C = \{\hat{a} = p \vee \hat{a} = q, p, \neg q\}$, and let $M$ model $C$. By Def. 2.14, it follows that $M, w^* \models p$ and $M, w^* \models \neg q$. Hence, by Cor. 1, $M, w^* \models K_R(\hat{a} = p \vee \hat{a} = q)$ and $M, w^* \models K_R p \wedge K_R\neg q$. This takes care of (1a) and (1d).

Proposition 2. Let $C = \{\hat{a} = p \vee \hat{a} = q, p, \neg q\}$, let $M$ model $C$, and fix a suitable selection function $f$. Then, $M' = M \uparrow_f (\hat{a} = p \wedge \hat{a} = q)$ is such that $M', w^* \models \neg K_R(\hat{a} = p) \wedge \neg K_R(\hat{a} = q) \wedge \neg K_R(\hat{a} \neq p) \wedge \neg K_R(\hat{a} \neq q)$.

Proof.
First, note that given the definition of $U_R$, (1c) is equivalent to $U_R(\hat{a} = p) \wedge U_R(\hat{a} = q)$, which in turn is equivalent to $U_R(\hat{a} = p \wedge \hat{a} = q)$. There are two cases: (1) $M, w^* \models U_R(\hat{a} = p \wedge \hat{a} = q)$. In this case, by Def. 2.8, $M = M'$, and we are done. (2) $M, w^* \not\models U_R(\hat{a} = p \wedge \hat{a} = q)$. There are a further four possibilities: (i) $M, w^* \models K_R\,\hat{a} = p$; (ii) $M, w^* \models K_R\,\hat{a} = q$; (iii) $M, w^* \models K_R\,\hat{a} \neq p$; (iv) $M, w^* \models K_R\,\hat{a} \neq q$, each of which is proven analogously, so we take case (i) only. Since $\sim_R$ is reflexive, it follows that $n(a, w^*) = p$. By Def. 2.7, every $M' \in \mathcal{M}_{U_R(\hat{a} = p \wedge \hat{a} = q)}$ is such that there is a $v \sim_R w^*$ where $v \in [\![\hat{a} \neq p]\!]$; in fact, given that $M$ models $C$ and we have chosen a suitable $f$, we know that $n(a, v) = q$, and hence $v \in [\![\hat{a} = q]\!]$. Because $v \sim_R w^*$, $M', w^* \models \neg K_R\,\hat{a} = p$, as required.

The model given in Fig. 1 satisfies all the assumptions (1)–(2). (In fact, though we do not prove this here, it is a minimal model satisfying all the assumptions.) Here, $M = M \uparrow_f (\hat{a} = p \wedge \hat{a} = q)$, because $\hat{a} = p$ and $\hat{a} = q$ are already uncertain for $R$, because he cannot distinguish between the actual world $w^*$ where $\hat{a} = p$ and world $w'$, where $\hat{a} = q$, and thus per Def. 2.7, $\mathcal{M}_{U_a\varphi} = \{M\}$.

[Figure 1: a two-world model. Each of $w^*$ and $w'$ has a reflexive arrow labeled $R, O$; an arrow labeled $R$ links $w^*$ and $w'$. At $w^*$: $p$, $\neg q$, $\hat{a} = p$. At $w'$: $p$, $\neg q$, $\hat{a} = q$.]

Once Paul has set up the assumptions and the argument, he provides an obligatio-like piece of reasoning. He says:

You could work out from this how to reply in the case under discussion. For example:

1 A is true. I am uncertain.
2 A is false. I am uncertain.
3 A is contingent. I deny it.
4 A is possible. I am uncertain.
5 A is necessary. I am uncertain.
6 A is impossible. I am uncertain.
7 You know that A is true. I deny it.
8 You know that A is false. I deny it.
9 A is uncertain to you. I deny it.
10 A is known by you. I am uncertain.
11 You know that A is known by you. I deny it.
12 You are uncertain that A is known by you. I grant it.
13 You know that A is uncertain to you. I deny it.
14 Of A you know that it is known by you. I am uncertain.
15 You are uncertain that A is true. I grant it.
16 Of A I am uncertain that it is true. I deny it.
17 About A I am uncertain. I deny it.
18 You are uncertain about A. I deny it.
19 Of something true you know that it is A. I deny it.
20 Of A you know that it is something. I grant it.
21 You know that A is A. I grant it.
22 Of A you know that it is A. I deny it.
23 Of A you know that it is A or something other than A. I grant it.
24 Of A you know that it is something other than A. I deny it.
25 Of A you know that it is true or false. I grant it.
26 You know that A is necessary or impossible. I grant it.
27 Of A you know that it is possible or contingent. I am uncertain.
28 Of A you know that it is impossible or contingent. I am uncertain.
29 You know that A is possible or contingent. I deny it.
30 You know that A is contingent or impossible. I deny it.

And so on for innumerable other propositions, some of which should be granted, some of which should be denied, and about some of which one should say that one is uncertain, if one wants to examine the matter carefully¹⁰ (Venice 1981, pp. 19, 21).

¹⁰ Due to reasons of space, I will not quote the original Latin here.

(The numbers are my addition to make reference easier.) While Paul does not say explicitly that this is an obligational disputation, nor make explicit the rules involved, it is clear from the presentation (of propositions to which the reader responds by conceding, denying, or doubting) and from the initial assumptions that it is one, about the nature of $\hat{a}$. Thus, starting from an initial model where $R$ concedes that he is uncertain about whether $\hat{a} = p$ or $\hat{a} = q$, if he correctly follows the rules, he will never be forced into admitting that he knows $\hat{a}$ (cf. Uckelman 2011a, Theorem 24).

We now explore how this disputation plays out in the formalization we have defined. We begin with the statements of $O$, which make up the sequence $\Theta$:

$$\Theta_0 = \langle\theta_0\rangle = \langle\neg K_R\hat{a}\rangle$$
$$\Theta_{1-10} = \langle T\hat{a},\ F\hat{a},\ {\downarrow}\nabla\hat{a}{\downarrow},\ {\downarrow}\Diamond\hat{a}{\downarrow},\ {\downarrow}\Box\hat{a}{\downarrow},\ {\downarrow}\Box\neg\hat{a}{\downarrow},\ K_R T\hat{a},\ K_R F\hat{a},\ {\downarrow}U_R\hat{a}{\downarrow},\ {\downarrow}K_R\hat{a}{\downarrow}\rangle$$
$$\Theta_{11-17} = \langle K_R{\downarrow}K_R\hat{a}{\downarrow},\ U_R{\downarrow}K_R\hat{a}{\downarrow},\ K_R{\downarrow}U_R\hat{a}{\downarrow},\ {\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow},\ U_R T\hat{a},\ {\downarrow}U_O T\hat{a}{\downarrow},\ {\downarrow}U_O\hat{a}{\downarrow}\rangle$$
$$\Theta_{18-22} = \langle U_R\hat{a},\ \exists x(Tx \wedge K_R(x = \hat{a})),\ {\downarrow}K_R(\exists x = \hat{a}{\downarrow}),\ K_R(\hat{a} = \hat{a}),\ {\downarrow}K_R(\hat{a}{\downarrow} = \hat{a})\rangle$$
$$\Theta_{23-25} = \langle{\downarrow}K_R(\hat{a}{\downarrow} = \hat{a} \vee \hat{a}{\downarrow} \neq \hat{a}),\ {\downarrow}K_R(\hat{a}{\downarrow} \neq \hat{a}),\ {\downarrow}K_R(T\hat{a}{\downarrow} \vee F\hat{a}{\downarrow})\rangle$$
$$\Theta_{26-27} = \langle K_R(\Box\hat{a} \vee \Box\neg\hat{a}),\ {\downarrow}K_R(\Diamond\hat{a}{\downarrow} \vee (\Diamond\hat{a}{\downarrow} \wedge \Diamond\neg\hat{a}{\downarrow}))\rangle$$
$$\Theta_{28-29} = \langle{\downarrow}K_R(\Box\neg\hat{a}{\downarrow} \vee (\Diamond\hat{a}{\downarrow} \wedge \Diamond\neg\hat{a}{\downarrow})),\ K_R(\Diamond\hat{a} \vee (\Diamond\hat{a} \wedge \Diamond\neg\hat{a}))\rangle$$
$$\Theta_{30} = \langle K_R((\Diamond\hat{a} \wedge \Diamond\neg\hat{a}) \vee \Box\neg\hat{a})\rangle$$

In addition to $\Theta$, Paul also specifies $\Gamma$, the sequence of correct responses of $R$ to the statements of $O$ in $\Theta$:

$\Gamma_{1-30}$ = ⟨doubt : $\theta_1$, doubt : $\theta_2$, deny : $\theta_3$, doubt : $\theta_4$, doubt : $\theta_5$, doubt : $\theta_6$, deny : $\theta_7$, deny : $\theta_8$, deny : $\theta_9$, doubt : $\theta_{10}$, deny : $\theta_{11}$, concede : $\theta_{12}$, deny : $\theta_{13}$, doubt : $\theta_{14}$, concede : $\theta_{15}$, deny : $\theta_{16}$, deny : $\theta_{17}$, deny : $\theta_{18}$, deny : $\theta_{19}$, concede : $\theta_{20}$, concede : $\theta_{21}$, deny : $\theta_{22}$, concede : $\theta_{23}$, deny : $\theta_{24}$, concede : $\theta_{25}$, concede : $\theta_{26}$, doubt : $\theta_{27}$, doubt : $\theta_{28}$, deny : $\theta_{29}$, deny : $\theta_{30}$⟩

We now show that Paul’s $\Gamma$ is correct; that is, $R$ will not be led into error following his advice. We proceed through the first fifteen actions sequentially, omitting reference to $M$ when context makes it clear.

The initial statement $\theta_0 = \neg K_R\hat{a}$ is satisfied at $w^*$ in Fig. 1; for if it were the case that $w^* \models K_R\hat{a}$, then it would be the case that both $w^* \models \hat{a}$ and $w' \models \hat{a}$. But $n(a, w') = q$, and $w' \models \neg q$. Thus, $M \uparrow \Gamma_0 = M$. Thus, $R$ can admit the initial thesis and the disputation begins.

1. Paul says that $R$ should doubt $T\hat{a}$: If $R$ is following the rules, this means that $\neg(K_R T\hat{a} \vee K_R\neg T\hat{a})$ should be true at $w^*$. We know from the preceding that $w^* \models \neg K_R\hat{a}$, so it follows that $w^* \models \neg K_R T\hat{a}$. A similar argument can be given for $w^* \models \neg K_R\neg T\hat{a}$, and hence, by De Morgan’s law, $w^* \models \neg(K_R T\hat{a} \vee K_R\neg T\hat{a})$. Therefore, $R$’s doubting $\theta_1$ is correct and $M \uparrow \Gamma_1 = M$.

2.
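Paul says that $R$ should doubt $F\hat{a}$ as well: An analogous argument to the previous shows that this is correct, and $M \uparrow \Gamma_2 = M$.

3. $R$ should deny ${\downarrow}\nabla\hat{a}{\downarrow}$. There are two cases where this is correct: If $M \uparrow \Gamma_2 \models \neg{\downarrow}\nabla\hat{a}{\downarrow}$ or if $w^* \models K_R\neg{\downarrow}\nabla\hat{a}{\downarrow}$. Both conditions are satisfied; $w^* \models \Diamond p$, but $w^* \not\models \Diamond\neg p$, while the reverse is true at $w'$ with respect to $q$.

4. $R$ should doubt ${\downarrow}\Diamond\hat{a}{\downarrow}$: This is correct if $w^* \models \neg(K_R{\downarrow}\Diamond\hat{a}{\downarrow} \vee K_R\neg{\downarrow}\Diamond\hat{a}{\downarrow})$. It is easy to check that the relevant witness worlds are $w'$ for the first disjunct and $w^*$ for the second, so $R$ is correct in doubting $\theta_4$, and $M \uparrow \Gamma_4 = M$.

5. $R$ should doubt ${\downarrow}\Box\hat{a}{\downarrow}$: This is correct if $M, w^* \models \neg(K_R{\downarrow}\Box\hat{a}{\downarrow} \vee K_R\neg{\downarrow}\Box\hat{a}{\downarrow})$; as in the previous case, the witness worlds are $w'$ for the first disjunct and $w^*$ for the second, so $R$ is correct in doubting $\theta_5$, and $M \uparrow \Gamma_5 = M$.

6. $R$ should doubt ${\downarrow}\Box\neg\hat{a}{\downarrow}$: Completely analogous to the previous; $R$ is correct in doubting $\theta_6$, and $M \uparrow \Gamma_6 = M$.

7. $R$ should deny $K_R T\hat{a}$: This is the case if either $M \uparrow \Gamma_6 \models \neg K_R T\hat{a}$ or $w^* \models K_R\neg K_R T\hat{a}$. $M \uparrow \Gamma_6 \models \neg K_R T\hat{a}$ iff $M \uparrow \Gamma_6, w^* \not\models K_R T\hat{a}$ and $M \uparrow \Gamma_6, w' \not\models K_R T\hat{a}$. Both conjuncts are true because $w^* \sim_R w'$ and $w' \sim_R w'$, and $w' \not\models T\hat{a}$, since $n(a, w') = q$ and $w' \notin V(q)$. The second conjunct is true because $w^* \sim_R w^*$. $R$ has responded correctly, and $M \uparrow \Gamma_7 = M$.

8. $R$ should deny $K_R F\hat{a}$: The argument is parallel to the previous. $R$ has responded correctly, and $M \uparrow \Gamma_8 = M$.

9. $R$ should deny ${\downarrow}U_R\hat{a}{\downarrow}$: This is the case if either $M \uparrow \Gamma_8 \models \neg{\downarrow}U_R\hat{a}{\downarrow}$ or $w^* \models K_R\neg{\downarrow}U_R\hat{a}{\downarrow}$. The former holds if $\neg{\downarrow}U_R\hat{a}{\downarrow}$ holds at both worlds. $w^* \models \neg{\downarrow}U_R\hat{a}{\downarrow}$ iff $w^* \models \neg(\neg{\downarrow}K_R\hat{a}{\downarrow} \wedge \neg{\downarrow}K_R\neg\hat{a}{\downarrow})$, that is, iff $w^* \models {\downarrow}K_R\hat{a}{\downarrow} \vee {\downarrow}K_R\neg\hat{a}{\downarrow}$. $w^* \models {\downarrow}K_R\hat{a}{\downarrow}$ iff $w^* \models K_R p$ iff $w^* \models p$ and $w' \models p$, which is the case. A straightforward argument shows that the opposite disjunct is true for $w'$ because of $q$. $R$ has responded correctly, and $M \uparrow \Gamma_9 = M$.

10. $R$ should doubt ${\downarrow}K_R\hat{a}{\downarrow}$: This is true if $w^* \models \neg(K_R{\downarrow}K_R\hat{a}{\downarrow} \vee K_R\neg{\downarrow}K_R\hat{a}{\downarrow})$, i.e., if $w^* \models \neg K_R{\downarrow}K_R\hat{a}{\downarrow} \wedge \neg K_R\neg{\downarrow}K_R\hat{a}{\downarrow}$. $w^* \models \neg K_R{\downarrow}K_R\hat{a}{\downarrow}$ iff either $w^* \models \neg{\downarrow}K_R\hat{a}{\downarrow}$ or $w' \models \neg{\downarrow}K_R\hat{a}{\downarrow}$, that is, either (1) $w^* \not\models {\downarrow}K_R\hat{a}{\downarrow}$ or (2) $w' \not\models {\downarrow}K_R\hat{a}{\downarrow}$.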
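This means either (1′) $w^* \not\models K_R p$ or (2′) $w' \not\models K_R q$. But (2′) is the case. $w^* \models \neg K_R\neg{\downarrow}K_R\hat{a}{\downarrow}$ iff either $w^* \models {\downarrow}K_R\hat{a}{\downarrow}$ or $w' \models {\downarrow}K_R\hat{a}{\downarrow}$. But $w^* \models K_R p$, so the first disjunct is satisfied. So $R$ has responded correctly, and $M \uparrow \Gamma_{10} = M$.

11. $R$ should deny $K_R{\downarrow}K_R\hat{a}{\downarrow}$: This is the case if either $M \uparrow \Gamma_{10} \models \neg K_R{\downarrow}K_R\hat{a}{\downarrow}$ or $w^* \models K_R\neg K_R{\downarrow}K_R\hat{a}{\downarrow}$. The first disjunct holds since $w' \models \neg q$, and hence $w' \models \neg K_R q$, that is, $w' \models \neg{\downarrow}K_R\hat{a}{\downarrow}$, and thus this formula cannot be known by $R$ at either $w^*$ or $w'$. $R$ has responded correctly, and $M \uparrow \Gamma_{11} = M$.

12. $R$ should concede $U_R{\downarrow}K_R\hat{a}{\downarrow}$. This is the case if either $M \uparrow \Gamma_{11} \models U_R{\downarrow}K_R\hat{a}{\downarrow}$ or $w^* \models K_R U_R{\downarrow}K_R\hat{a}{\downarrow}$. In (10), $R$ doubted ${\downarrow}K_R\hat{a}{\downarrow}$, the result of which is that $M \uparrow \Gamma_{10} \models U_R{\downarrow}K_R\hat{a}{\downarrow}$. But $M \uparrow \Gamma_{10} = M \uparrow \Gamma_{11}$, and hence the first disjunct holds.

13. $R$ should deny $K_R{\downarrow}U_R\hat{a}{\downarrow}$: This is the case if either $M \uparrow \Gamma_{12} \models \neg K_R{\downarrow}U_R\hat{a}{\downarrow}$ or $w^* \models K_R\neg K_R{\downarrow}U_R\hat{a}{\downarrow}$. The first disjunct holds; $w^* \not\models K_R{\downarrow}U_R\hat{a}{\downarrow}$, since $w^* \models K_R p$, and hence $w^* \models {\downarrow}K_R\hat{a}{\downarrow}$, from which it follows that $w^* \models \neg{\downarrow}U_R\hat{a}{\downarrow}$. So $R$ has responded correctly, and $M \uparrow \Gamma_{13} = M$.

14. $R$ should doubt ${\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow}$: This is the case if $w^* \models \neg(K_R{\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow} \vee K_R\neg{\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow})$, that is, $w^* \models \neg K_R{\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow} \wedge \neg K_R\neg{\downarrow}K_R{\downarrow}K_R\hat{a}{\downarrow}$. It is easy to check that $w'$ is a witness for the first conjunct, since $n(a, w') = q$ and $w' \models \neg K_R q$, and $w^*$ is a witness for the second conjunct, because $n(a, w^*) = p$. So $R$ has responded correctly, and $M \uparrow \Gamma_{14} = M$.

15. $R$ should concede $U_R T\hat{a}$: This is the case if $M \uparrow \Gamma_{14} \models U_R T\hat{a}$ or $w^* \models K_R U_R T\hat{a}$. In step (1), $R$ doubted $T\hat{a}$, which means that $M \uparrow \Gamma_1 \models U_R T\hat{a}$; and $M \uparrow \Gamma_1 = M \uparrow \Gamma_{14}$. So $R$ has responded correctly, and $M \uparrow \Gamma_{15} = M$.

We leave the remaining actions as interesting exercises for the reader; they can all be worked out in similar fashion, though the level of complexity increases.

4 Observations

A number of interesting observations arise from the preceding, which we touch on by way of concluding. We begin with relatively specific, precise points, and end with conclusions of more general import.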
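Before turning to these observations, the step-by-step evaluations above can be replayed mechanically. What follows is only a minimal Python sketch under stated assumptions, not part of the formal framework: it hard-codes the two-world model of Fig. 1, represents formulas as Python functions from worlds to truth values, and reads $R$'s response directly off his knowledge at $w^*$. That shortcut is an assumption which happens to be harmless here, because every world of this model is $R$-accessible from $w^*$, so truth throughout the model and being known by $R$ at $w^*$ coincide. All identifiers (`knows`, `de_re`, `response`, and so on) are hypothetical names introduced for illustration.

```python
# Hypothetical labels for the two worlds w* and w' of Fig. 1.
W_STAR, W_PRIME = "w*", "w'"

# Valuation V: p holds at both worlds, q at neither.
VALUATION = {"p": {W_STAR, W_PRIME}, "q": set()}

# Naming function n(a, w): the proposition that the name A picks out at w.
NAMING = {W_STAR: "p", W_PRIME: "q"}

# Epistemic accessibility: R cannot tell the worlds apart, O can.
ACCESS = {
    "R": {W_STAR: {W_STAR, W_PRIME}, W_PRIME: {W_STAR, W_PRIME}},
    "O": {W_STAR: {W_STAR}, W_PRIME: {W_PRIME}},
}


def atom(letter):
    """Atomic proposition: true at w iff w is in V(letter)."""
    return lambda w: w in VALUATION[letter]


def neg(phi):
    return lambda w: not phi(w)


def knows(agent, phi):
    """K_agent phi: phi holds at every world the agent considers possible."""
    return lambda w: all(phi(v) for v in ACCESS[agent][w])


def uncertain(agent, phi):
    """U_agent phi := not K phi and not K not-phi."""
    return lambda w: not knows(agent, phi)(w) and not knows(agent, neg(phi))(w)


def true_a(w):
    """T a-hat: the proposition named by A at w is true at w."""
    return w in VALUATION[NAMING[w]]


def de_re(build):
    """Down-arrow scope: resolve the name at the evaluation world first,
    then evaluate the formula built around the resolved proposition."""
    return lambda w: build(atom(NAMING[w]))(w)


def response(phi, world=W_STAR):
    """R's response, read off his knowledge at the given world (assumption:
    adequate only because all worlds are R-accessible from w*)."""
    if knows("R", phi)(world):
        return "concede"
    if knows("R", neg(phi))(world):
        return "deny"
    return "doubt"


# A few of the propositions from the disputation, keyed by step number.
A_KNOWN_DE_RE = de_re(lambda x: knows("R", x))
STEPS = {
    1: true_a,                                   # A is true.
    7: knows("R", true_a),                       # You know that A is true.
    9: de_re(lambda x: uncertain("R", x)),       # A is uncertain to you.
    10: A_KNOWN_DE_RE,                           # A is known by you.
    12: uncertain("R", A_KNOWN_DE_RE),           # You are uncertain that A is known by you.
    16: uncertain("O", true_a),                  # Of A I am uncertain that it is true.
}

if __name__ == "__main__":
    for step, formula in sorted(STEPS.items()):
        print(step, response(formula))
```

On this model the sketch returns doubt, deny, deny, doubt, concede for steps 1, 7, 9, 10 and 12, in agreement with Paul's answers, and deny for step 16. The treatment of step 16 as a plain `uncertain("O", true_a)` is a simplification of ${\downarrow}U_O T\hat{a}{\downarrow}$ chosen for brevity.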
First, consider the two sentences after where we left off, $\theta_{16}$ and $\theta_{17}$:

Of A I am uncertain that it is true. ($\theta_{16}$)
About A I am uncertain. ($\theta_{17}$)

Historians writing on obligationes often point out the fact that $O$ plays little substantive role as one of the puzzling properties of these disputations. There are no rules that govern $O$’s behavior, and he seems to do little more than rotely spit out sentences. What is disputational about such exchanges? There is no substantive argument being made (cf., e.g., King 1991 and Uckelman 2013a, §3). These two sentences turn such an interpretation of $O$ on its head. Here, just as “you” referred to the reader, i.e., $R$, “I” here can only refer to $O$ (this role played by Paul). Thus, what Paul is asking $R$ to do is reason about not only $R$’s own knowledge, but also about his, that is, Paul or $O$’s. Note that in no part of the discussion is it ever stated what “I” knows, only what “you” knows. The only way that $R$ could have any knowledge about $O$’s knowledge is by it arising from the disputational context: It is because this is set up as a disputation between two people, one of whom has just finished presenting the casus to the other (with the strong, albeit unstated, assumption that “I” know which sentence is on the other side of the card labeled A that I have just held up to you), that $R$ is in any position to make the inference required to deny that $O$ is uncertain about A or its truth value.

This strategic move (using the fact that something has been presented from the author to the reader, or $O$ to $R$, in a particular way to infer the presence of knowledge in either participant) leads us to our second observation. Paul uses this strategic move in many different ways throughout De scire et dubitare.
We have discussed elsewhere how the knowledge which is generated through this fits into traditional medieval approaches to epistemology, whereby knowledge is defined as “a mental grasp (notitia) of anything acquired by the most powerful demonstration” (Uckelman 2013b, p. 3). Two aspects of this approach to epistemology are important in De scire: on the one hand, “mental grasp”, akin to the contemporary concept of “awareness”, is necessary for knowledge because one can only be properly said to know a proposition of which he is aware. One way to make someone aware of a proposition, as a precursor for knowledge, is to present a proof or demonstration of it; a simpler way is for one person simply to assert it to another, and that brings us to the second important aspect, namely that of demonstration. “Demonstration” of the most powerful type corresponds to a full-blooded proof, but medieval philosophers also allowed lesser kinds of knowledge corresponding to weaker types of demonstration, of which testimony by authority counted as one. Unfortunately, such a dynamic approach to logic is inconsistent with other principles that Paul subscribes to (specifically, a certain type of monotonicity), as we show in (Uckelman 2013b, §4.3).

Third, this example illustrates an interesting general feature of obligationes which is not immediately obvious from the way the rules are presented, namely, that in a correctly-played obligatio, the only moves that change the model are the concessions and denials of irrelevant propositions, which can reduce the state-space. Since there were no such irrelevant sentences proposed, the model never changed as a result of the responses of R that we analysed. Ordinarily, the difficulty in disputing in this fashion arises from keeping track of what is relevant and what is irrelevant, and how this changes given previous concessions and denials (on a static view of relevance such as Swyneshed’s, the disputations are substantially easier).
Here, the complexity comes from reasoning with multiply embedded modal operators; the models do not change, but the evaluation of formulas on these models is more complex the more complex the formula is.¹¹

¹¹ Using an ordinary sense of ‘complexity’, not the computational notion of the same.

Fourth, the entire excursion into the Middle Ages presented in this paper serves as yet another reminder that medieval logic, far from being dull syllogistic and irrelevant scholastic wrangling, handled interesting and complex puzzles with a sophistication that makes itself manifest when one attempts to provide a formal analysis of the puzzles. One need only look at the instructions for constructing well-formed formulas in Def. 2.3 to see that the language which we used is incredibly rich and varied; furthermore, every part of the language is required in order to be able to express the distinctions we need to express in order to formalize the argument. The very process of formalization thus not only brings to light interesting aspects of medieval puzzles, but also provides us with perhaps hitherto unexplored combinations of logical languages, the study of whose interaction may prove of interest apart from anything medieval.

Acknowledgements

Research for this paper was partially funded by the NWO project “Towards Logics that Model Natural Reasoning”.

References

H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library Series. Springer, 2007.

P. King. Medieval thought-experiments: The metamethodology of mediaeval science. In T. Horowitz and G. J. Massey, editors, Thought Experiments in Science and Philosophy, pages 43–64. Rowman & Littlefield, 1991.

S. L. Uckelman. Deceit and indefeasible knowledge: The case of Dubitatio. Journal of Applied Non-Classical Logics, 21(3/4):503–519, 2011a.

S. L. Uckelman. A dynamic epistemic logic approach to modeling Obligationes. In D. Grossi, S. Minica, B. Rodenhäuser, and S. Smets, editors, LIRa Yearbook, pages 147–172. Institute for Logic, Language & Computation, 2011b.

S. L. Uckelman. Interactive logic in the Middle Ages. Logic and Logical Philosophy, 21(3):439–471, 2012.

S. L. Uckelman. Medieval Disputationes de obligationibus as formal dialogue systems. Argumentation, 27(2):143–166, 2013a.

S. L. Uckelman. Paul of Venice on a puzzle about uncertainty. In submission, 2013b.

S. L. Uckelman, J. Maat, and K. Rybalko. The art of doubting in Obligationes Parisienses. In C. Kann, B. Löwe, C. Rode, and S. L. Uckelman, editors, Modern Views of Medieval Logic. Peeters, forthcoming.

P. Venice. Pauli Veneti: Logica Magna, Prima Pars: Tractatus de Scire et Dubitare. Oxford University Press, 1981. Ed. and trans. by Patricia Clarke.

P. Venice. Pauli Veneti: Logica Magna, Secunda Pars: Tractatus de Obligationibus. Oxford University Press, 1988. Ed. and trans. by E. Jennifer Ashworth.