Specifying and Enforcing Intertask Dependencies Paul C. Attie,1 Munindar P. Singh,1 Amit Sheth,2 and Marek Rusinkiewicz3 1: Carnot Project, MCC, 3500 W. Balcones Center Drive, Austin, TX 78759-6509 2: Bell Communications Research, 444 Hoes Lane, Piscataway, NJ 08854-4182 3: Department of Computer Science, University of Houston, Houston, TX 77204-3475 email:
[email protected],
[email protected],
[email protected],
[email protected]Abstract Extensions of the traditional atomic transaction model are needed to support the develop- ment of multi-system applications or work ows that access heterogeneous databases and legacy application systems. Most extended transaction models use conditions involving events or de- pendencies between transactions. Intertask dependencies can serve as a uniform framework for dening extended transaction models. In this paper, we introduce event attributes needed to determine whether a dependency is enforceable and to properly schedule events in extended transaction models. Using these attributes and a formalization of a dependency into the tempo- ral logic CTL, we can automatically synthesize an automaton that captures the computations that satisfy the given dependency. We show how a set of such automata can be combined into a scheduler that produces global computations that satisfy all relevant dependencies. We show how dependencies required to implement relaxed transactions such as Sagas can be enforced and discuss brie y the issues of concurrency, safety and recoverability. 1 Introduction One of the main objectives of the Carnot project at MCC is to provide an environment for the development of applications that need to access related information stored in multiple existing systems [Ca91]. An important component of this eort is a facility for relaxed task management. A task is any unit of computation that performs some useful function in a system. The tasks that are of particular interest to us are database transactions. To eciently develop such multi-system applications accessing the existing heterogeneous and closed1 systems, we must be able to modularly capture the execution constraints of various applications. This can be achieved by modeling them as relaxed transactions consisting of related tasks executed on dierent systems. It should be noted, that the requirements of the traditional transaction model based on full isolation, atomic commitment and global serializability may be either too strong, or not sucient for a particular multi-system application. For example, an application may need to ensure that two tasks commit only in a certain temporal order. An example is a banking application in which deposits made into an account over a certain period may have to be processed before debits can be made from the account over the same period. Therefore, we may need to selectively relax the ACID properties [Gra81, HR83] for multi-system transactions to capture precisely the synchrony and coupling requirements based on the true application semantics. The semantic constraints may be specied as intertask dependencies, which are constraints over signicant task events, such as commit and abort. The concomitant reduction in semantic constraints across tasks makes possible the genera- tion of scripts that can be eciently executed with a high level of parallelism. This, in turn, may result in a higher availability of data, better response times and a higher throughput. The modeling of complex telecommunication applications is discussed in [ANRS92] where it is argued that many multi-system applications can be eciently modeled and executed as relaxed transactions. To illustrate these concepts, let us consider the following scenario. A travel agency maintains two databases: one containing detailed information about the bookings made by dierent agents and another containing a summary of the information in the rst database with the number of bookings per agent. When the summary changes, a task is run that sets o an alarm if the summary falls below a preset threshold. An obvious integrity constraint is that for each travel agent, the number of rows in the bookings database should be equal to the number of bookings stored for that agent in the summary database. Assuming that it holds initially, the maintenance of this constraint can be assured by ex- ecuting all the updates to both databases as atomic multidatabase transactions that are globally serializable [BS88]. This, however, may be very inecient or even impossible, if the database 1 In many such systems, the data can be accessed only through the existing interfaces, even if it is internally stored under the control of a general purpose Database Management System (DBMS). Such systems are frequently referred to as legacy systems and the applications that access several of them are called work ows. 1 interfaces do not provide visible two-phase commit facilities. Instead, we may assume that the in- terdatabase integrity is maintained by executing separate tasks that obey the appropriate intertask dependencies. These dependencies state that if a delete task on the bookings database commits, then a decrement-summary task should also commit. Furthermore, if a delete task aborts, while its associated decrement-summary task commits, then we must restore consistency by compensating for the spurious decrement. We do this by executing an increment-summary task. Figure 1 shows the tasks involved in this example; dB, dS, iS, and u?a denote the delete-booking, decrement-summary, increment-summary, and update-alarm tasks, respectively. DELETE BOOKING PP PP PP PP PP P delete decrement update booking summary - alarm (dB) (dS) (u?a) ? increment update summary - alarm (iS) (u?a) Figure 1: Task Graph for the Delete Booking Example We model each intertask dependency as a dependency automaton, which is a nite state automaton whose paths represent the computations that satisfy the dependency. Each such au- tomaton ensures that its corresponding dependency is not violated, by permitting only those events whose execution would not lead to the violation of the dependency. The scheduler receives events corresponding to a possible task execution. It queries the applicable dependency automata to de- termine whether they all allow the event to be executed. If so, the event is executed; otherwise, it is delayed and its execution re-attempted later. In this paper, we present a framework in which dependencies can be stated modularly as constraints across tasks. We also present a scheduler that enforces all stated dependencies and assures that a dynamically changing collection of tasks is executed in accordance with a set of intertask dependencies. The scheduler does this by accepting, rejecting, or delaying signicant events so that the resulting global computation satises all of the dependencies. 2 The rest of the paper is organized as follows. Section 2 provides the technical and method- ological background for our work and gives an example of its application. Section 3 describes how we formally specify dependencies, discusses the important issue of event attributes, their impact on the enforceability of dependencies, and considers how dependencies can be added or removed at run-time. Section 4 gives a formal denition of a dependency automaton, which we use to represent each dependency; it also shows how dependency automata operate and enforce their corresponding dependencies. Section 5 presents our execution model as well as the notion of viable pathsets, which we use as a correctness criterion. It formalizes these denitions and uses them in the denition of a scheduler algorithm.2 It also shows how a relaxed transaction model such as the Sagas [GS87] can be described (and hence enforced) as a set of dependencies. Section 6 brie y discusses the concur- rency control, safety and recovery issues in the context of exible transaction [JNRS91]. Summary and conlusions are presented in Section 7. 2 Background The specication and enforcement of intertask dependencies has recently received considerable attention [CR90, DHL90, El92, Kl91]. Following [Kl91] and [CR92], we specify intertask depen- dencies as constraints on the occurrence and temporal order of certain signicant events specied on a per-task basis. Klein has proposed the following two primitives [Kl91]: 1. e1 ! e2 : If e1 occurs, then e2 must also occur. There is no implied ordering on the occurrences of e1 and e2 . 2. e1 < e2 : If e1 and e2 both occur, then e1 must precede e2 . Examples of the execution dependencies dened in the literature include: Commit Dependency [CR92]: Transaction A is commit-dependent on transaction B, i if both transactions commit, then A commits before B commits. Let the relevant signicant events be denoted as cmA and cmB . This can be expressed as cmA < cmB . Abort Dependency [CR92]: Transaction A is abort-dependent on transaction B, i if B aborts, then A must also abort. Let the signicant events here be abA and abB , so this can be written abB ! abA . Conditional Existence Dependency [Kl91]: If event e1 occurs, then if event e2 also occurs, then event e3 must occur. That is, the existence dependency between e2 and e3 comes into force if e1 occurs. This can be written e1 ! (e2 ! e3). 2 This paper is a slightly revised and abbreviated version of the report [ASRS92] available from the authors. The report contains complete proofs of the theorems mentioned here. 3 The above primitives are important because they can capture many of the semantic con- straints encountered in practice. Any useful framework for intertask dependencies should be at least as powerful. Our approach described below meets this criterion: the above primitives are special cases of our formalism. Committed Y H ] JHH Aborted * J HH J ab H cm J Done abJ 6 J dn J J Executing 6 stNot executing Figure 2: An Example Task State Transition Diagram The relationships between signicant events of a task can be represented by a task state transition diagram which is an abstract representation of the actual task that hides irrelevant details of its sequential computations. Execution of the event causes a transition of the task to another state. Figure 2 shows an example task state transition diagram taken from [Kl91]. From its initial state (at the bottom of the diagram), the task rst executes a start event (st). Once the task has started, it will eventually either rollback, as represented by the ab transition, or nish, as represented by the dn transition (for \done"). When a task is done, it can either commit, i.e., make the cm transition, or abort, i.e., make the ab transition. Depending on the characteristics of the local system at which the task is executing, the set of transitions and the set of signicant events of a task may be dierent. Our approach (and our implementation) accepts tasks that are characterized by an arbitrary set of task states and signicant tasks events. Using the state transition diagrams and signicant events dened above, we can represent the travel agent application described in the previous section as shown in Figure 3. The intertask dependencies are shown as \links" between signicant events of various tasks. 3 Intertask Dependency Declarations As discussed in Section 2, we specify intertask dependencies as constraints on the occurrence and temporal order of events. Let e, ei , ej , etc. denote any signicant event (or when no ambiguity, just event) and D(e1; . . . ; en ) denote an unspecied dependency over e1; . . . ; en . We assume that 4 ?PP ? ? ? ? ? I @ C AK @ PP P ? > I @ ? I @ ? A K ? @ K A C A @ ?? PPP ! A @ ?? A @ ?? ? @ @ ? C A C A 6 P P < A - A 6 @ @ ! @ A A 6 A ? A ? A ? C A A @ A C C 6 6 @ @ @ 6 C ? - ? R ? C ! C @ C C C C ? ? ? ? C K@ A I ?@ K@ AI ? A @ ? ? ? @ ? C A @ ? @ A @ ? C A @ A ! C A 6 @ A 6 C CW & / A A ? @ @ A A ? XXX XXX ! 6 @ @ 6 XX XXX z ? @ R ? Figure 3: Dependencies Between Signicant Events in the Delete Booking Example an event can occur at most once in any possible execution of the system. This is not a restriction in real terms. If a task aborts and must be re-executed, a new id may be generated for it (and for its events). The dependencies can be appropriately modied and everything can proceed normally. 3.1 Formal Specication of Dependencies We adopt the language of Computation Tree Logic (CTL) as the language in which these dependen- cies will be expressed [Em90]. CTL is a powerful language, well-known from distributed computing. A brief description of CTL and modeling of various dependencies is given in Appendix A. The prim- itives < and ! are available as useful macros that yield CTL formulae. CTL makes it possible to uniformly express dierent dependencies. And, since it is a formal language, it helps reduce ambiguity in communication. It also makes it possible to rigorously determine the relationships among dierent dependencies, e.g., whether they are consistent, or whether one entails another. We would like our dependencies to be easily speciable by users or database administrators. For this reason, it is essential that the automata that enforce those dependencies be synthesized au- tomatically from those dependencies. We show that the CTL formulae can be used to automatically synthesize dependency automata for scheduling events. Thus we are able to retain the exibility of Klein's approach, while using a formal, more expressive and general representation. By adopting CTL as the specication language, we can hide the detailed automata synthesis techniques from 5 the person who species dependencies. 3.2 Enforceable Dependencies The scheduler enforces a dependency by variously allowing, delaying, rejecting, or forcing events to occur, so that the resulting computation satises the given dependency. Some syntactically well- formed dependencies may not be enforceable at run-time. For example, the dependency ab(T1) ! cm(T2) is not enforceable, because a scheduler can neither prevent ab(T1) from occurring nor guarantee the occurrence of cm(T2). This is because, in general, a scheduler cannot prevent user tasks from unilaterally deciding to abort. Thus both T1 and T2 can abort. We associate with each signicant event type the following attributes: Forcible: the system can always force the execution of an event of this type. Rejectable: the system can always reject an event of this type. Delayable: the system can delay execution of an event of this type. We assume that every non-real-time signicant event is delayable. This assumption is unavoidable in any model in which event execution is subject to the control of an entity (i.e., the scheduler) other than the task that issues the event. In the discussion below we assume that the systems on which the tasks are executed provide a prepared-to-commit state and a task can issue a prepare-to-commit pr event. The prepared- to-commit state is visible if the application task (the scheduler) can decide whether the prepared task should commit or abort. We also distinguish between signicant events from tasks initiated by users (user events) and those from tasks that are initiated by the scheduler. An example of a system-generated task is one that is invoked under the control of the scheduler to restore mutual consistency of data [RSK91]. Table 1 below shows the attributes of the signicant events of user and system tasks usually found in database applications and DBMSs. User Tasks Forcible Rejectable System Tasks Forcible Rejectable cm p cm p p ab ab pr p pr p p st st Table 1: Attribute Tables for Signicant Events Now we can characterize the run-time enforceability of dependency D(e1 ; . . . ; en) in terms of the attributes of e1 ; . . . ; en described above. In general, a dependency can be run-time enforceable 6 based on the attributes of its events. For example, e1 ! e2 is run-time enforceable if rejectable(e1 ) holds, since we can then delay e1 until e2 is submitted, and reject e1 if we see that task that issues e2 has terminated (or timed out: see below) without issuing e2 . Alternatively, if e2 is forcible, then we can enforce e1 ! e2 at run-time by forcing the execution of e2 when e1 is accepted for execution. Yet another (although somewhat vacuous) strategy would be to unconditionally reject e1 . This strategy is also an option when rejectable(e1 ) holds. As another example, consider e1 < e2 . There are two possible strategies for enforcing this dependency. The rst one, which can be applied for any event attributes, is to delay e2 until either e1 has been accepted for execution, or task 1 has terminated without issuing e1 . The second strategy which can be applied if rejectable(e1 ) holds, is to let e2 be executed when it is submitted and thereafter reject e1 if it is submitted. One way to extend our approach to account for real-time dependencies is by considering real-time events, such as clock times (e.g., 5:00 p.m.), as regular events that lack the attribute of delayability. Consider the dependency e1 < e2 , where e2 is non-delayable. This dependency is enforceable only if e1 is rejectable. The scheduler can enforce e1 < e2 by accepting e1 if e2 has not already occurred and by rejecting e1 otherwise. 3.3 Dynamic Addition and Removal of Dependencies The preceding exposition has implicitly assumed that all dependencies are initially given, i.e., at compile-time. However, dependencies may be added or deleted dynamically at run-time. The addition of a dependency requires that an automaton be synthesized for it and used in the schedul- ing. The removal of a dependency at run-time is achieved by simply removing its corresponding automaton from the scheduler. A potential problem is that a dependency may be added too late to be enforced. Suppose D = e1 ! e2 is added after e1 is executed. Now if e2 is not forcible and is never submitted, D cannot be enforced. This situation is unavoidable in general, since the addition of dependencies cannot be predicted. The best we can do in such a case, is to report a violation if an attempt to add such a dependency is made. 4 Dependency Automata: Enforcing a Single Dependency For each dependency D, we create a nite state machine AD that is responsible for enforcing D. This can be done either manually, or by using an extension of the CTL synthesis technique of [EC82, Em90] that we have developed [ASRS92]. Our procedure requires only the specication of the dependencies, not of the tasks over which those dependences are dened. As a result, it generates 7 an open system. By contrast, traditional temporal logic synthesis methods [EC82, MW84] require a specication of the entire system. Thus their results have to be recomputed whenever the system is modied. The details of the synthesis procedure are omitted here for brevity, but can be found in [ASRS92]. AD is a tuple hs0; S; ; i, where S is a set of states, s0 is the distinguished initial state, is the alphabet, and S S is the transition relation. We use ti to indicate the specic termination event of task i, and " to denote any event which can either be a signicant event (previously notated with e) or a termination event. The elements of are notated as , 0, etc. can be of form a (accept) or r (reject), and can be interleaved or sequenced as described below. a("1; . . . ; "m ): This indicates that AD accepts the events "1 through "m. If this transition is taken by AD , then each "i is accepted and, if "i is a signicant event, it is then forwarded to the event monitor for execution. r(e1; . . . ; em): This indicates that AD rejects the events e1 through em because the execution of any of them would violate the dependency D. 1jjj . . . jjjn, where the i 2 : This indicates the interleaving of the accept operations corresponding to 1 through n . 1; . . .; n, where the i 2 : This indicates the accept operations of i occur before the accept operations of i+1 (for 1 i (n ? 1)). Example Dependency Automata We represent AD as a labeled graph, whose nodes are states, and whose edges are transitions. Each edge is labeled with an element of . denotes the actions, such as accept or reject, that are taken by the scheduler when that transition is executed. In Figures 4 and 5 below, we give example dependency automata for the dependencies e1 < e2, and e1 ! e2 , respectively. The symbol j indicates choice: an edge labeled j 0 may be followed if the scheduler permits either or 0. Operation of an Automaton The following observations are made with respect to how a dependency automaton enforces a dependency. A ti indicates the termination or timing out of a task. A dependency automaton cannot reject a ti event, since it cannot unilaterally prevent such an event. However, an automaton may delay a termination event until an appropriate time. The importance of ti events is that their submission tells the automaton that events that may have been submitted by the given task will 8 PP PPa(t2 )jr(e2) a(t1 ) ? ? PP PP ? PP PP ?a(e1 ) PP ) ? ? a(e2 )jjj a(t ) 1 Pq P H HH J a(e2)ja(t2) H J a(e2 )ja(t2H) HH J H J HH HH J a(e1 ) a(t1) j HHJ ? J H j^ J Figure 4: Dependency Automaton for order dependency e1 < e2 assuming rejectable(e2 ) m @ PPa(t2 ) PP a(t1) r (e )j 1 @ PP PP @ a(e 2 ) PP @ PP PP mH ) @R a(e1 )jjja(e2) Pq P m m HH j H a(e2) a(tH 2)HH a(e1 ) a(t1) j HH H HH r(e1 ) a(t1) j H Hj ? H / m Figure 5: Dependency Automaton for existence dependency e1 ! e2 assuming rejectable(e1 ) 9 denitely not be submitted. This can signicantly aect the automaton's behavior. Knowledge that the given task has terminated may allow the scheduler to accept for execution a previously delayed event ej , as the knowledge that ei will never occur may enable the scheduler to infer that the execution of ej now will not violate certain dependencies. This happens, for example, if a dependency ei < ej is to be enforced and ej has been submitted, but is being delayed. In such a case, the arrival of ti ensures that the dependency ei < ej cannot be violated; consequently, ej can be scheduled (unless doing so would violate some other dependencies). The ordering of the reject operations is not constrained, since they do not contribute to the nal global computation. We assume for simplicity that each task can have at most one event in a given dependency, i.e., only intertask dependencies are explicitly considered. Thus the input alphabet for AD , where D is of the form D(e1; . . . ; en), is fe1 ; . . . ; en; t1; . . . ; tng. That is, the size of the input alphabet for AD is 2n. AD operates as follows. At any time, it is in some state, say, s. Initially, s = s0 . Events arrive sequentially. Let " be the current event. If s has an outgoing edge labeled a(") and incident on state s0 , then the given transition is enabled. This means that, as far as its local state is concerned, AD can change its state to s0 . However, AD cannot actually make the transition unless the scheduler permits it (see Section 5). If the scheduler permits a certain transition, then the automaton can execute it, thereby changing its local state to keep in synchronization with respect to the events executed so far. The behavior of the scheduler is such that it accepts an event only if it can nd an event ordering that is consistent with all of the dependency automata that contain that event in their input alphabet. So if it accepts an event, all the relevant automata must be in agreement. Therefore, each of them must execute the given accepting transition. This ensures that acceptance of the event does not violate any of the dependencies in which the event is mentioned. Similarly, the scheduler can reject an event only if all of the relevant automata reject it, i.e., only if it can nd an event ordering that is consistent with all of the relevant dependency automata executing a rejecting transition for the event. The same reasoning as for accepting an event applies here, since the rejection of an event can also cause the violation of a dependency in which the event is mentioned. Section 5 discusses the operation of the scheduler in detail. Dealing with Failures using Timeouts We have so far interpreted the ti events to indicate the termination of task i. Ordinarily, tasks terminate by committing or aborting. However, system problems, such as disk crashes and commu- nication failures, may cause indenite waits. For example, the automaton shown in Figure 4 delays accepting e2 until t1 or e1 is submitted. This is to ensure that if both e1 and e2 are submitted, then e1 executes before e2 . Thus this automaton could possibly hang forever. One policy is to have the automaton accept e2 when e2 arrives and reject e1 if e1 arrives 10 later. In general, this policy speeds up e2 's task at the cost of aborting e1 's task and, possibly, delaying or aborting the global task. In cases where both policies, namely, one in which an event is indenitely delayed and the other in which an event is eagerly rejected, are unacceptable, a policy based on timeouts may be preferred. This would require tasks to wait, but would allow timeouts to be generated when expected events are not received within a reasonable time. This is an improvement in practical terms, but does not require any signicant change in our approach. We support timeouts by modifying the interpretation of the ti events in the above and associate them with either the normal termination of a task or a timeout on the corresponding event, ei . We assume that ei is not submitted after ti has been submitted. This is easy enough to implement. 5 The Scheduler: Enforcing Multiple Dependencies A system must enforce several dependencies at the same time. A naive approach would generate a product of the individual automata (AD 's) that each enforce a single dependency. However, if there are m individual automata each roughly of size N , then the product automaton has size on the order of N m . This is intractable for all but the smallest m. We avoid this \state explosion problem" [CG87], by coordinating the relevant individual automata at run-time rather than building a static (and exponentially large) product at compile-time. This is achieved using techniques similar to those introduced in [AE89]. The software that does this is the scheduler. 5.1 The Execution Model Figure 6 shows our execution model. Events are submitted to the scheduler as tasks execute. The actual set of signicant events is determined by the application and system tasks. We introduce the correctness criterion of viable pathsets, which is used to check whether all dependencies can be satised if a given event is executed. Computing a viable pathset requires looking at all relevant dependency automata. If an event can be accepted based on the viable pathset criterion, it is given to the event dispatcher for execution. If an event cannot be accepted immediately, then it still may be possible to execute it after other events occur. Therefore, the event is put in the pending set and a decision taken on it later. If the scheduler ever permits the execution of an r(e) transition by some automata, then e is rejected, and a reject(e) message is sent to the task that submitted e to the scheduler. 5.2 Pathset We now discuss the concept of pathset, present an algorithm to compute pathsets, and discuss the execution in more detail. When an event " is submitted, the scheduler searches for a pathset, 11 Submitted Events- Accepted Events in partial order - Execute Event - Tasks Scheduler Dispatcher J Rejected Events Re-attempted J J 6 Execution of JJ JJ Delayed Events J J Delayed Events Queries Replies J '? $ ' J J J ^ J $ Pending Dependency Set Automata a process & % & % indicates indicates a data structure Figure 6: The Execution Model i.e., a corresponding set of paths. A pathset consists of one path from each relevant dependency automaton. The desired pathset must 1. accept ", 2. begin in the current global state of the scheduler, 3. be order-consistent, 4. be a-closed and r-closed, and 5. be executable. Order-consistency means that dierent paths in the set must agree on the order of execution of each pair of events. The requirements of a-closure and r-closure mean that for any event that is accepted or rejected, paths from each automaton referring to that event must be included and must agree on whether to accept or reject it. Executable means that all rejected events must have been submitted and all accepted events must have been submitted or be forcible. A pathset that meets criteria 2{5 is called viable. Following a few technical denitions, we discuss further use of pathsets and an algorithm to determine them. Denition 1 (Global State). A global state s is a tuple hsD1 ; . . . ; sD ; . . . ; sD i where sD is the local state of AD , and D1; . . . Dn i n i i are all the dependencies in the system. 12 The global state is simply the aggregation of the local states of every individual dependency au- tomaton. Denition 2 (Path). A path in AD , D , is a sequence s1 1 s2 2 . . . sj j sj +1 . . . such that (8j 1 : (sj ; j ; sj +1 ) 2 D ) where D is the transition relation of AD . A global computation is a sequence of events as executed by the event dispatcher. Recall that AD is meant to encode all the computations that satisfy dependency D. Thus, each path of AD represents computations that satisfy D. Furthermore, AD is maximal in the sense that every possible computation whose prexes satisfy D is represented by some path in AD . By denition, a global computation must consist solely of events accepted by the scheduler. Our scheduler has the property that, for each dependency D, the projection of any global computation onto the events in D is represented by a path in AD . This means that our scheduler enforces each dependency. Denition 3 (Pathset). A pathset is a set, , of paths such that: 1. Each element of is a path in some AD . 2. Each AD contributes at most one path to . As we mentioned in section 5.1, when an event " is submitted to the scheduler, the scheduler attempts to execute " by nding a viable pathset that accepts ". If such a pathset is found, then all events that are accepted by the pathset are executed in an order that is consistent with that imposed by the pathset. This results in the global state of the scheduler being updated appropriately. If such a pathset is not found, then event " is placed in the pending set. Another attempt at nding a suitable pathset is made when other events that have an impact on the acceptability of " have been submitted. Event " remains in the pending set until either a viable pathset executed that accepts or rejects it. In the latter case, " is rejected for execution by the scheduler and the task that issued " is informed of this decision. 5.3 The Pathset Search Algorithm In Figure 7, we present a (recursive) procedure search that searches for viable pathsets. The procedure is initially called as search (;). The event to be executed, ", and other necessary data structures are assumed to be globals for simplicity (they are passed as parameters in the actual implementation). The search procedure attempts to construct a viable pathset by selecting paths from each relevant automaton that are order-consistent with and are executable. If these paths contain a(") or r(") events that occur in automata outside the set of automata being considered, those automata are considered to ensure a-closure and r-closure of the eventual solution. 13 search () if r-closed() and a-closed() then return(); else f Find an automaton A that is needed to close o ; c := get candidate paths(A, ); for each 2 t := search ( [ f g); if t =6 ; then /* t is viable; end all recursive calls */ return(t); endfor /* all paths in c failed, so return ; */ return(;); g Figure 7: Pathset Search Algorithm 14 The function get candidate paths(A, ) returns a set of executable paths from automa- ton A that are order-consistent with all paths in . Some of the returned paths may be extensions of paths already in . We now establish some correctness properties of the pathset search algorithm. Most proofs are not included here for brevity, but appear in [ASRS92]. Lemma 1 For any event, ", and global state s, if search (;) terminates with 6= ;, then is viable (w.r.t. global state s) and accepts ". Proof sketch. We show that each of the clauses of the denition of viable (page 12) is satised. By construction of the scheduler, the search for a pathset begins in the current global state. New paths that are added to the candidate pathset (c ) are executable and order-consistent with , by denition of the get candidate paths function. Search only terminates when either is empty or is a-closed and r-closed. Lemma 2 search (;) always terminates. Proof sketch. The essential idea is that because the number of automata is nite and each automaton has nitely many paths, only nitely many candidate pathsets need to be considered. Thus the algorithm terminates. 5.4 The Scheduler The scheduler is a nonterminating loop, which on each iteration attempts to execute an event " that has just been submitted or is in the pending set (Figure 6). It does this by invoking search (;). If this invocation returns a nonempty , then is immediately executed. Otherwise, " is placed in the pending set. is executed by (a) accepting the events that accepts in a partial order that is consistent with and (b) rejecting all events rejected by . Denition 4 (Path Projection). The projection "D of global computation onto a dependency automaton D is the path obtained from by removing all transitions " such that " 62 D . Lemma 3 Let be a global computation generated by the scheduler. Then, for every dependency D, "D is a path in AD . 15 Proof sketch. By construction of the scheduler. Note that the paths in c returned by get candidate paths are examined in arbitrary order. Optimality of the constructed pathset could be improved if the paths in c were examined according to some appropriate criterion, such as minimal length or maximal acceptance or minimal rejection. We are currently experimenting with such criteria. 5.5 Example of Scheduler Operation We now give an example of how relaxed transactions expressed with < and ! can be scheduled using our algorithm. For simplicity, let the only dependencies in force be e1 ! e2 and e1 < e2 , where both e1 and e2 are rejectable. Let A! and A< be the corresponding automata. Assume that e1 is submitted rst. We nd a(e1) in A< . However, since no path in A! begins with e1 , the empty pathset is returned and e1 added to the pending set. When e2 is submitted, two executable paths can be found in A! : a(e2 ); a(e1) and a(e2 )jjja(e1). The a-closure requirement now forces the scheduler to search A< for a path that accepts e1 and e2 . The only such path is a(e1); a(e2). Since a(e1 ); a(e2) and a(e2); a(e1) are not mutually order-consistent, the only viable pathset is fa(e1); a(e2), a(e2)jjja(e1)g. This is nally returned. The partial order consistent with it is: e1 and then e2. Table 2 shows how the axioms for the Saga transaction model [GS87], that were formulated in [CR92] using the ACTA formalism, can be expressed using the < and ! primitives. A Saga S , is a sequence of (sub-)transactions Ti , i = 1; . . . ; n. All subtransactions must be either successfully executed in the specied order, in which case the Saga commits or, if one of the subtransactions aborts, the Saga aborts and the compensating transactions CTi are executed in the reverse order. Since the specications use only the < and ! primitives, our scheduler can be used to execute relaxed transactions having the semantics of Sagas. ACTA <; ! notation post(begin(S)) Ti BCD Ti?1 st(Ti) ! cm(Ti?1 ) ^ cm(Ti?1) < st(Ti) CTj WCD CTj+1 cm(CTj+1) < st(CTj ) CTn?1 BAD S (st(CTn?1) ! ab(S)) ^ (ab(S) < st(CTn?1)) post(begin(Ti )) S AD Ti ab(Ti) ! ab(S ) Ti WD S cm(Ti) < ab(S ) CTi BCD Ti st(CTi) ! cm(Ti ) ^ cm(Ti ) < st(CTi) post(commit(Ti)) CTi CMD S ab(S ) ! cm(CTi) CTi BAD S st(CTi) ! ab(S ) ^ ab(S ) < st(CTi) post(begin(Tn )) S SCD Tn cm(Tn ) ! cm(S ) Table 2: Expressing SAGA Dependencies in ACTA and the <; ! Notation 16 6 Executing Multidatabase Transactions We now discuss three issues in executing multidatabase transactions: concurrency control, safety, and recoverability. 6.1 Concurrency Control Our scheduler is a part of a multidatabase environment in which local database systems (LDBS) cooperate in the execution of global transactions. Each LDBS will, in general, contain a concurrency control module, which enforces local concurrency control (typically ensuring local serializability). We may assume that a task executing at each of the local systems has serialization event that determines its position in the local serialization order. For example, it the local system uses two- phase locking (2PL), the serialization order of a local transaction is determined by its lock point { the point when the last lock of the transaction is granted. A problem arises if local concurrency control modules impose an inconsistent ordering on se- rialization events of tasks belonging to a given multidatabase application. We resolve this problem by transferring the responsibility for global concurrency control to the scheduler. This is achieved by restating the concurrency control obligations as a set of dependencies, which are then enforced in the same way as any other dependency. The main dierence is that, unlike other scheduling de- pendencies, the concurrency control dependencies between applications arise at the run time, when a serialization precedence between the tasks belonging to these applications has been established at any site. However, once these dependencies are detected, they can be enforced in the same way as the other dependencies described earlier. Thus we have a uniform mechanism for both dependency enforcement and concurrency control. The main diculty in this approach is that the serialization events are neither reported by the local concurrency controllers, nor can they be deduced from the temporal order of other signicant events controlled by the global scheduler (start, commit, terminate). It is possible for a local concurrency controller to completely execute task Ti before task Tj has even begun, yet serialize them in such a way that that Tj precedes Ti . This problem can be overcome by using the idea of tickets introduced in [GRS91]. As in [GRS91], we may add a ticket read and ticket write operation to each task of a global application. These ticket read/write operations can be regarded as signicant events, and so their execution can be controlled by declaring dependencies that refer to them. Thus the required concurrency control is then obtained simply by declaring an appropriate set of ticket access dependencies. 17 6.2 Flexible Transaction Safety A exible transaction [ELLR90] is dened as a set of subtransactions and their scheduling precon- ditions along with a set of conditions over their nal states [ELLR90]. These conditions specify the acceptable termination states of the exible transaction; it completes successfully i it terminates in such a state. Consider the following example, adapted from [JNRS91]. We have a travel agent exible transaction, consisting of reserve- ight (F ) and reserve-car (C ) subtransactions. If we fail to secure a car reservation, we wish to cancel the plane reservation. This cancellation is achieved by a subtransaction F ? , which is a compensating transaction for F . Thus the set of acceptable termination states for the overall transaction is given in Table 3, where in, cm, and ab indicate that the subtransaction is in its initial state, is committed, and is aborted, respectively. The set of acceptable states is a constraint on the execution of a exible transaction. This constraint can also be expressed as the set of dependencies given in Table 3. F F? C cm in cm ab in in abC < cmF ? ab in ab (abC ^ cmF ) ! cmF ? cm cm ab cmC ! cmF in in in cm cm in Table 3: Acceptable States of a Flexible Transaction 6.3 Recoverability We will not deal extensively with the issue of recovery from failure in this paper. Suce it to say that the following data must be checkpointed in order to enable recovery of the scheduler from a failure: 1. The current state of every dependency automaton. 2. Any (partially executed) pathset (see Section 5), plus the current state along every path in the pathset. 3. The set of pending events. The above data is subject to concurrent updates that must be executed atomically with respect to the checkpointing mechanism. For example, when an event e is executed, the current state of every 18 dependency automaton AD where " occurs in D must be updated. We do not wish a checkpoint to re ect only some of these updates. It should either re ect none of them (corresponding to a state before " is executed), or re ect all of them (corresponding to a state after " is executed). In addition, we require a persistent communication mechanism between the scheduler and the tasks. This is to prevent the loss of messages sent by the tasks to the scheduler while the scheduler is down (i.e., after a failure and before recovery from that failure). Mailboxes or persistent pipes may be used to provide this functionality. 7 Conclusions and Future Work We addressed the problem of specifying and enforcing intertask dependencies. Our framework al- lows dependencies to be stated modularly and succinctly as constraints across tasks. It can be extended to accommodate the issues of concurrency control, exible transaction safety, recoverabil- ity, and the enforcement of other dependencies that are introduced dynamically at run-time. We showed how a dependency can be expressed as an automaton that captures all the computations that satisfy the dependency. We presented a scheduling algorithm that enforces multiple dependencies at the same time. This algorithm uses the automata corresponding to each dependency. We showed that every global computation generated by the scheduler satises all of the dependencies. We also showed how relaxed transaction models such as the Saga model can be captured in our framework. The desiderata for a task scheduler for multidatabase transaction processing include correctness (no dependencies are violated), safety (transaction terminates only in an acceptable state), recoverability, and optimality. We have established the correctness, safety and recoverability of the scheduler, we are curretly studying the optimality issues. An implementation of this work has been completed as part of the distribution services of the Carnot project [Ca91] at MCC. Our implementation is in the concurrent actor language Rosette, whose asynchrony and other features make for a natural realization of our execution model. Carnot enables the development of open applications that use information stored under the control of existing closed systems. The specication and run-time enforcement of data dependencies is an important component of this eort. Acknowledgements We are indebted to Greg Meredith and Christine Tomlinson for numerous discussions, and to Allen Emerson for advice on CTL. We have also beneted from conversations with Darrell Woelk. Sridhar Ganti provided the Sagas example. Phil Cannata, the Carnot project director, has provided the moral support for this research as well as the environment to do it in. 19 References [ANRS92] M. Ansari, L. Ness, M. Rusinkiewicz, and A. Sheth. Using Flexible Transactions to Support Multi-system Telecommunication Applications. Proceedings of the 18th VLDB Conference, August 1992. [AE89] Paul C. Attie and E. Allen Emerson, Synthesis of Concurrent Systems with Many Similar Sequential Processes, Proceedings of 16th Annual ACM Symposium on Principles of Programming Languages, pages 191{201, 1989. [ASRS92] P. Attie, M. Singh, M. Rusinkiewicz, and A. Sheth. Specifying and Enforcing Intertask Dependencies. MCC Technical Report Carnot-245-92, November 1992. [AST92] P. Attie, M. Singh, and C. Tomlinson. A Language Based on Temporal Logic for Specify- ing Intertask Dependencies. MCC Technical Report, Number Carnot-245-92, November 1992. [BS88] Y. Breitbart and A. Silberschatz. Multidatabase update issues. In Proc. of ACM SIG- MOD Int'l Conf on Management of Data, June 1988. [Ca91] P. Cannata. The Irresistible Move Towards Interoperable Database Systems. Proceedings of the 1st International Workshop on Interoperability in Multidatabase Systems, Kyoto, Japan, April 1991. [CG87] E. Clarke and O. Grumberg. Avoiding the State Explosion Problem in Temporal Logic Model Checking Algorithms. Carnegie Mellon University, Pittsburgh, 1987. [CR90] P. Chrysanthis and K. Ramamritham. ACTA: A Framework for Specifying and Reason- ing about Transaction Structure and Behavior. Proceedings of ACM SIGMOD Confer- ence on Management of Data, 1990. [CR92] P. Chrysanthis and K. Ramamritham. ACTA: The SAGA Continues. Chapter 10 in [El92]. [DHL90] U. Dayal, M. Hsu, and R. Ladin. Organizing Long-Running Activities with Triggers and Transactions. Proceedings of ACM SIGMOD Conference on Management of Data, 1990. [DHL91] U. Dayal, M. Hsu, R. Ladin. A Transactional Model for Long-running Activities Pro- ceedings of the 17th VLDB Conference, September 1991. [El92] Ahmed Elmagarmid, editor, Database Transaction Models, Morgan Kaufman, 1992. [ELLR90] A. Elmagarmid, Y. Leu, W. Litwin, and M. Rusinkiewicz. A Multidatabase Transaction Model for Interbase. Proceedings of the 16th VLDB Conference, August 1990. 20 [Em90] E. Allen Emerson. Temporal and Modal Logic. In Handbook of Theoretical Com- puter Science, vol. B, J. Van Leeuwen, editor, 1990. [EC82] E. Allen Emerson and E. Clarke. Using Branching Time Temporal Logic to Synthesize Synchronization Skeletons. Science of Computer Programming vol. 2, 1982, 241{266. [EMSS93] E. Allen Emerson, A. Mok, A. Prasad Sistla and J. Srinivasan. Quantitative Temporal Reasoning. To appear in Real Time Systems Journal, vol. 2, January 1993, 331 { 352. [HR83] T. Haerder and A. Reuter. Principles of transaction-oriented database recovery. ACM Computing Surveys, 15(4), December 1983. [GS87] H. Garcia-Molina and K. Salem. Sagas. Proceedings of ACM SIGMOD Conference on Management of Data, 1987. [Gra81] J.N. Gray. The Transaction Concept: Virtues and Limitations. Proceedings of the 7th VLDB, September 1981. [GRS91] D. Georgakopoulos, M. Rusinkiewicz and A. Sheth. On Serializability of Multidatabase Transactions through Forced Local Con ict. Proceedings of the 7th International Con- ference on Data Engineering, April 1991. [JNRS91] W. Jin, L. Ness, M. Rusinkiewicz and A. Sheth. Executing Service Provisioning Applica- tions as Multidatabase Flexible Transactions (draft). Bellcore Technical Memorandum, 1992. [Kl91] J. Klein. Advanced Rule Driven Transaction Management. Proceedings of the IEEE COMPCON, 1991. [MW84] Z. Manna, P. Wolper. Synthesis of Communicating Processes from Temporal Logic Specications. ACM TOPLAS, vol. 6, no. 1, January 1984, 68{93. [RSK91] M. Rusinkiewicz, A. Sheth, and G. Karabatis. Specifying Interdatabase Dependencies in a Multidatabase Environment. MCC Technical Report ACT-OODS-153-91(Q), May 1991. [Also appears in IEEE Computer, December 1991.] A CTL Syntax and Semantics We have the following syntax for CTL (where p denotes an atomic proposition, and f; g denote (sub-) formulae): 1. Each of p; f ^ g and :f is a formula (where the latter two constructs indicate conjunction and negation, respectively). 21 2. EXj f is a formula that intuitively means that there is an immediate successor state reachable by executing one step of process Pj in which formula f holds. 3. A[f Ug ] is a formula that intuitively means that for every computation path, there is some state along the path where g holds, and f holds at every state along the path until that state. 4. E[f Ug ] is a formula that intuitively means that for some computation path, there is some state along the path where g holds, and f holds at every state along the path until that state. Formally, we give the semantics of CTL formulae with respect to a structure M = (S; A1; . . . ; Ak ; L) that consists of: S - a countable set of states Ai - S S , a binary relation on S giving the possible transitions by process i, and L - a labeling of each state with the set of atomic propositions true in the state. Let A = A1 [ [ Ak . We require that A be total, i.e., that 8x 2 S; 9y : (x; y ) 2 A. A fullpath is an innite sequence of states (s0 ; s1; s2 . . .) such that 8i(si ; si+1) 2 A. To any structure M and state s0 2 S of M , there corresponds a computation tree (whose nodes are labeled with occurrences of states) with root s0 such that s !i t is an arc in the tree i (s; t) 2 Ai . We use the usual notation to indicate truth in a structure: M; s0 j= f means that f is true at state s0 in structure M . When the structure M is understood, we write s0 j= f . We dene j= inductively: s0 j= p i p 2 L(s0) s0 j= :f i not(s0 j= f ) s0 j= f ^ g i s0 j= f and s0 j= g s0 j= EXj f i for some state t, (s0 ; t) 2 Aj and t j= f , s0 j= A[f Ug] i for all fullpaths (s0 ; s1; . . .), 9i[i 0 ^ si j= g ^ 8j (0 j ^ j < i ) sj j= f )] s0 j= E[f Ug] i for some fullpath (s0 ; s1; . . .), 9i[i 0 ^ si j= g ^ 8j (0 j ^ j < i ) sj j= f )] We write j= f to indicate that f is valid, i.e., true at all states in all structures. We introduce the abbreviations f _ g for :(:f ^ :g ), f ) g for :f _ g , and f g for (f ) g ) ^ (g ) f ) for logical disjunction, implication, and equivalence, respectively. We also introduce a number of additional modalities as abbreviations: A[f Wg ] for :E[:f U:g ], E[f Wg ] for 22 :A[:f U:g], AFf for A[trueUf ], EFf for E[trueUf ], AGf for :EF:f , EGf for :AF:f , A[f Uw g] for :E[:gU(:f ^ :g)], E[f Uw g] for E[f Ug] _ EGf , AXi f for :EXi :f , EXf for EX1 f _ _ EXk f , AXf for AX1 f ^ ^ AXk f . Particularly useful modalities are AFf , which means that for every path, there exists a state on the path where f holds, and AGf , which means that f holds at every state along every path. A formula of the form A[f Ug ] or E[f Ug ] is an eventuality formula. An eventuality corresponds to a liveness property in that it makes a promise that something does happen. This promise must be fullled. The eventuality A[f Ug ] (E[f Ug ]) is fullled for s in M provided that for every (respectively, for some) path starting at s, there exists a nite prex of the path in M whose last state satises g and all of whose other states satisfy f . Since AFg and EFg are special cases of A[f Ug ] and E[f Ug ], respectively, they are also eventualities. In contrast, A[f Wg ], E[f Wg ] (and their special cases AGg and EGg ) are invariance formulae. An invariance corresponds to a safety property since it asserts that certain conditions will necessarily be met. CTL is a propositional branching-time temporal logic. That is, it includes propositional logic and temporal operators. A CTL temporal operator is composed of a path-quantier (either A, meaning for all possible computations, or E, meaning for some possible computation), followed by a linear temporal operator (one of F, G, or U). Fp means that p holds at some point along the given computation; Gp means that p holds at all points along the given computation; and pUq means that q holds at some point along the given computation and p holds from the current point until that point. A.0.1 Expressing Dependencies in CTL Atomic propositions naturally model the states of a given system: each proposition corresponds to a signicant event and holds in the state immediately following the occurrence of that event. Now we show how certain dependencies that were motivated and dened by other researchers can be expressed uniformly in CTL. Order Dependency [Kl91]: If both events e1 and e2 occur, then e1 precedes e2. This was expressed as e1 < e2 in the above discussion. In CTL, it becomes: AG[e2 ) AG:e1 ] That is, if e2 occurs, then e1 cannot occur subsequently. Existence Dependency [Kl91]: If event e1 occurs sometimes, then event e2 also occurs some- times. This was expressed as e1 ! e2 in the above discussion. In CTL, it becomes: :E[:e2U(e1 ^ EG:e2)] That is, there is no computation such that e2 does not occur until a state s is reached where s satises (e1 ^ EG:e2 ), i.e., e1 is executed in state s, and subsequently, e2 never occurs. 23 The following instances of the above dependencies have also appeared in the literature. Commit Dependency [CR92]: Transaction A is commit-dependent on transaction B, i if both transactions commit, then A commits before B commits. Let the relevant signicant events be denoted as cmA and cmB . AG[cmB ) AG:cmA ] Abort Dependency [CR92]: Transaction A is abort-dependent on transaction B, i if B aborts, then A must also abort. Let the signicant events here be abA and abB , so this can be written abB ! abA , and is rendered in CTL just like e1 ! e2 above: :E[:abA U(abB ^ EG:abA )] Conditional Existence Dependency [Kl91]: If event e1 occurs, then if event e2 also occurs, then event e3 must occur. That is, the existence dependency between e2 and e3 comes into force if e1 occurs. This can be written e1 ! (e2 ! e3 ). Translating it to CTL involves two applications of the translation of e1 ! e2 given above, one nested inside the other. The rst application, to e2 ! e3 , yields the following \mixed" formula: e1 ! :E[:e3 U(e2 ^ EG:e3 )] The second application, which substitutes :E[:e3 U(e2 ^ EG:e3 )] for e2 in the CTL translation of e1 ! e2 given above, gives us :E[::E[:e3 U(e2 ^ EG:e3 )]U(e1 ^ EG::E[:e3 U(e2 ^ EG:e3)])] Eliminating the double negations nally yields :E[E[:e3U(e2 ^ EG:e3 )]U(e1 ^ EGE[:e3U(e2 ^ EG:e3 )])] A.0.2 Expressing Real-time Dependencies in CTL We use the variant of CTL called RTCTL (Real-Time Computation Tree Logic ) [EMSS93]. This is the same as CTL except that EFt means \will occur after t or more time units along some computation." Real-time Order Dependency: If both events e1 and e2 occur, then e1 precedes e2, and e2 occurs within t time units of e1 . AG[(e2 ) AG:e1 ) ^ (e1 ) :EFt e2 )] Real-time Existence Dependency: If event e1 occurs sometimes, then event e2 also occurs sometimes. Furthermore, e2 occurs no later than t time units after e1 . :E[:e2U(e1 ^ EG:e2)] ^ :EF[e1 ^ EFte2] 24