June, 2002. Revised June, 2003, Copyright, AT&T. This document is not intended for display on Netscape 4 and other older browsers.
We define a programming language for coordinating multimedia and multimodal input in modern browser-based user interfaces. Our language, ReaX, is a reactive notation that precisely captures notions of events, event notifications, real-time scheduling, and simultaneity in an XML-based framework. Moreover, ReaX comes with built-in support for undoability and time seek. The main contributions of ReaX are threefold.
First, ReaX contributes a new event semantics that subsume existing mechanisms such as event notification, event broadcast, event queuing, exception propagation, event bubbling, and event capturing. Our semantics is based on a event tree constructed during a reaction. The event tree provides an explicit account of causality. Explicit causes, such as the raising of events, and implicit causes, such as event notifications, are treated uniformly in this data structure. This allows us to explain the seemingly contradictory requirements of simultaneity and temporally ordered event dispatching that is common to reactive programming notation such as Esterel (a hardware and embedded systems programming language), Statecharts (a visual notation for reactive systems), and SMIL (a multimedia notation for content presentation proposed by the W3C).
Second, ReaX is designed to be an intrinsically undoable language. This is possible because ReaX coordinates events, but is otherwise meant to be free of computations. Consequently, user interfaces coordinated by a ReaX-program provide a uniform and consistent solutions to (1) erroneous user input or (2) delayed system response caused by recognizers of speech or gestures. The solution to (1) is: the user may always invoke the "undo" button linked to the runtime system of ReaX. The solution to (2) is: a program under ReaX supervision that recognizes speech or handwriting may always retract a preliminary hypothesis of what the user intended. The solution to (1) frees the programmer from writing explicit code to undo user input (code that is often not written because it is too complicated). The solution to (2) makes human interaction with input recognizers more natural since system feedback can be provided continuously, not only after a timeout that follows the last user input, as is commonly the case in current practice.
Third, the undo facility is combined with a time seek mechanism, similar to that of SMIL. This mechanism allows media buttons, as found on a CD-player, to be linked to the ReaX runtime system in addition to undo and redo buttons.
To illustrate our approach, we provide an example that exhibits semantic deficiencies in SMIL as it is currently defined. We show how to mend the problems when SMIL is explained in terms of ReaX.
The ReaX programming language is a simple, thread-based scripting language for multimedia and telephony user interface programming. It is formulated as a model for how to simplify and consolidate user interface programming techniques as expressed in SMIL 2.0 [SMIL20], VoiceXML and Dynamic HTML. ReaX borrows as much as possible from the general XML programming notations [XPATH20] and XSLT 2.0 [XSLT20]. In particular, ReaX provides a precise framework for understanding and generalizing the Synchronous Multimedia Interface Language (SMIL). As it stands, SMIL suffers from incomplete and complicated semantics; more importantly, SMIL does not address the fundamental issue of maintaining program state when going backwards and forward in time according to seek and undo mechanisms. In contrast, ReaX offers a precise formulation of a general undo and seek mechanism. Consequently, ReaX-based browsers offer uniform support for undo, redo, forward, and rewind functionalities.
The main features of ReaX are:
In this document, we describe in Sections 2 to 7 the syntax and semantics of ReaX. Section 8 introduces an example of modeling SMIL concepts. Section 9 provides a road map to a sound reactive programming notation based on the declarative concepts of time intervals and synchronization in SMIL. Section 10 outlines how the user interface of ReaX-enabled devices can be linked to the undo and time seek mechanisms that are intrinsic to ReaX-presented content.
This report is preliminary.
A ReaX program is an XML document of the form
In accordance with XML costumes, we use the syntactic category Name defined as in [XML] and we disallow colons. This category is used for names of variables, which may be scoped. For identifiers that must be defined in at most one place, we use the category ID.
All value manipulation in ReaX is based on the XML Path Language (XPath). We rely on XML Path 2.0, which is still under development. An Expression returns a value that is either of a simple data type, such as a Boolean or a string, or a document node, which denotes an XML tree.
Certain operations in XPath may generate type exceptions. The type exception policy is strict: an exception is converted into a ReaX event <ReaX:error type="XPath" /> raised at the place of the expression.
An XPath expression is evaluated with respect to a static context, which defines the value of declared variables and exception policy. In contrast to XSLT, ReaX uses imperative variables, but they are referenced in the same way according to conventional mechanisms of static scoping.
The dynamic context specifies the focus of the expression, that is, the information that specifies which nodes are to be processed by the expression. An XPath expression is evaluated with respect to a context item and a context document. As a default in ReaX, the context document is the event tree and the context node is that of the selected event (as defined in section ).
A Boolean expression Bool-Expr is an XPath expression that evaluates to a Boolean value, that is to true or false.
A time-track expression track-Expr is an XPath expression that evaluates to "start", "pause", "resume", "seek", or "stop".
A location expression Loc-Expr is an XPath expression that is evaluated statically in a context that consists of the program itself with the context node being the element node that has the attribute node where the expression is contained. The value of the location expression is a set of nodes in the ReaX program.
A floating point expression Float-Expr expresses a real number. Here, we use such expressions only for time designations. By default, the unit used is seconds.
As in the current XLST 2.0 proposal, and similar to the result tree fragments in the XLST 1.0, an XML tree constructor Constr express XML trees that can be put together from static parts, using XML syntax, and dynamic parts, using XPath expressions. For example, the expression
Events enjoy properties that allow them to emulate functions found in a variety of programming languages. For example, they may act as exceptions, signals, and asynchronous messages. Events are raised either by the ReaX program itself, by the raise statement, or by the environment as an external event. An event is an XML document. There are three classes of events:
Events may either be consumable or not. A consumable event is handled if possible, but a non-consumable event, called a signal, is never handled.
Events either are raised exactly on the target location specified or are forwarded to the location of the program counter in the thread of the target location.
An event that is raised is usually appended to the end of the post-event queue. But if it has the immediate property, then it is appended to the front of the queue.
A co-event is normally raised before the observed event. Technically, the delay property is not a property of the event, but rather of the way it is generated. If the delay property is specified, then the co-event is raised after the observed event.
An event may be automatically raised to the next enclosing thread, but a cancel statement may prevent this from happening.
In ReaX, a thread is suspended whenever it enters a block. But at the same time, init is signaled on the block as an immediate event. Consequently, the thread is restarted if a co-handler, typically a on-init handler, exists for the signal. The <terminate /> statement does not raise an event. But, termination is usually accomplished through the raising of the exit signal to a block, which by default propagates to all blocks in all handlers, and by transitivity, to all enclosed blocks.
ReaX is a reactive programming language. The statement syntax is formulated in XML. Hence, the syntax provides a convenient abstract tree for the definition of scopes and program counters.
The basic element, called a handler, describes an event-initiated sequence of statements that all execute in the same thread. The event expression of a handler characterizes the events that the handler may catch. A thread is a handler of a thread block or it is the main thread of the program. Each thread of a thread block is called a subthread of a thread containing the block. A thread executes one statement at a time, in sequence. Threads themselves are structured through the mechanism of a state block, whose handlers are called states and define an event-driven state-machine. Both kinds of blocks may contain local variable declarations; thus, blocks delineate scope in the usual way.
We assume that each element in the program is annotated with an id attribute, which we regard as a label. We identify a position in the program with a label. Given positions l and l', we say l is within l' if the node indicated by l is a descendent of the node indicated by l'. In particular, l is within l' if l= l'. The upward closure of a position l consists of all the state blocks l' that contain l and that are in the same thread as l.
During a reaction, a set of protected positions prevents the execution of a handler, and code that contains a handler, when the handler has already been selected for execution for an event e, but is not yet executing due to intervening pre-events, none of which are allowed to prevent the handler from executing e.
The execution status of a thread is either executing, invalid, or suspended. At most one thread executes at a time, and the progress of the execution is reflected by the threads's program counter, which is either invalid, as is the thread itself, or an ID of an element contained in the thread; in the latter case the program counter, and the thread as well, is said to be valid. Execution is always in response to an event. The execution of a thread is considered to take almost no time, and it must consist of a finite number of executions of statements. Normally, the execution lasts until the program counter reaches a block or co-block, upon which the thread is suspended. Alternatively, the execution of a thread may lead to its termination, upon which the thread becomes invalid.
The ReaX program itself is a thread block. The notion of execution status validity extends to all handlers. Validity of an element means that the program counter of the element's thread is defined and is within it. A handler is valid only if the block that contains it is. Any subset of the handlers of a thread block may be valid. But, at most one of the handlers of a state block may be valid. A block is valid if only if the handler that contains it is valid and the program counter of the thread containing the block is inside the block.
The global state is an account of the program counter of each thread and the value of each variable that is declared in a valid block.
The program contains by default an empty handler that matches all events. The main thread is always valid. An event that is not caught anywhere else is caught by the default handler; consequently, such an event has no effect on the global state.
Normally, all threads are in a non-executing state. Only an external event may initiate execution of a thread. This thread may in turn raise additional events which lead to further executions of threads. This process is called the reaction to the external event. The reaction terminates in finite time (unless the program is faulty, in which case either an infinite process of executions leads to nontermination or some execution is infinite). The reaction as a whole is regarded as taking very little time. Thus, in between external events, the ReaX program has enough time to complete the executions of the reactions.
During the reaction, an event e is subjected to the following calculations. First, the intended target of the event is defined as follows. If the event is not forwarded (Section 4 ), then the intended target is the same as the target but only when the program counter of the thread containing the target is at the target or inside the target---when the latter condition is not satisfied, the intended target is undefined. Otherwise, when the event is forwarded, the intended target is the location of the program counter of the thread containing the target (but if the program counter is not defined, the intended target is not defined).
From the intended target, the actual target calculated as follows. The innermost block l' containing l is located. If l' is a protected program position, then the actual target is undefined. Note that potentially l=l'. The event expression of the first handler of l' is evaluated with an XPath focus of e. If this expression evaluates to true, then the handler is the actual handler and l' is the actual target. Otherwise, the remaining handlers are tested in sequence until an accepting handler as found; but if no such handler is found the same procedure is repeated for the innermost enclosing block l'' that contains l' (and so on, recursively, but if any such block is a protected position, the actual target is undefined). The actual target is now defined as the block or thread-block defining the accepting handler.
When the actual target l is defined, then the set of co-events is calculated as consisting of all pairs (e,l), where l is the location of a co-handler that matches e. They are ordered in a sequence according to their locations as per the document ordering of the program. Note that this definition is independent of whether control is actually at the co-handler; it is possible that it is not, but that the co-handler during the reaction will become enabled.
If the actual target is a state block and the program counter is inside a handler l_h of the block, then this handler must the terminated. This is accomplished through the exit event (ReaX:exit, l_h), which is appended to sequence of co-events. The location l_h is added to the set of locked blocks.
An event e whose actual target is a block l_b is executed as follows. For a state block, the program counter will be at a handler of the block. (This will be the case thanks to the (ReaX:exit, l_h) event and the fact that the position l_b, along with all blocks above it in the same thread, cannot execute since they are protected during the processing of the pre-events of e.) Otherwise, the program counter is at the block itself, and it is now changed to be at the handler.
The fundamental data structure that determines the order of execution is the event tree. It is a data structure that grows monotonically during a reaction, eventually containing all events and co-events. The root of the event tree is the external event. At the beginning of the reaction, the event tree consists of this root only and the set of protected positions is empty. Every event has two ordered sets of children called the pre-event queue and post-event queue. If both are empty, then the event is called a leaf. The reaction follows a method, called the event principle, that is defined relative to an event e called the selected event:
Since all threads are statically declared, the memory model is very simple. There is a fixed tree of activation records, each corresponding to a block with allocated memory for the local variables. These variables are pointers into a heap, where XML data is allocated. In an XPath expression, a variable is referred to using the $-syntax. For example, a variable named x is referred to as $x.
Variables are declared in the usual way. There are two forms:
The states block is the fundamental intra-thread control structure. Execution of a states block suspends the containing thread after raising the init signal on the block.
The threads block creates a subthread for each handler. Execution of a thread block suspends the thread after the init signal is raised on the block.
The termination statement
The raise statement is the general method of raising an event.
The semantics of handler and co-handler statements are the same, except for the role that co-handler statements play in determining pre-events.
The on-init handler is used to resume execution immediately after the program counter has reached a states block. It is also used to start subthreads in a thread block immediately after the program counter has reached the thread block. In both cases, it is still possible that other threads suspended at prior locations, according to document order, are resumed beforehand when they observe the init event of the block.
ReaX offers conventional sequential control structures.
A mechanism for binding ReaX scripting to another programming language is through the use of an object element:
If the value of the optional attribute timed is true, then ReaX regards the object as a media element that is presented in real-time. In particular, the object must exhibit various methods for controlling the presentation. These are specified by the elements start, stop, pause, resume, and seek. These elements must all be present when the timed attribute has the value true. They may not be present in any other the case. When present, the Mname-seek is a function that seeks the timeline of the object, which is assumed to be a presentation or animation or other time-dependent activity. The function will be called by the undo mechanism with an argument that is Seek-Constr concatenated with an element <time>t</time>, where t is a floating point number that represents the presentation time in seconds since the execution of the call statement. The value of t is at most Float-Expr if the optional attribute max is provided. Usually, max is set to the length of the presentation. The other control elements are defined analogously. They enable the ReaX program to keep objects in synchronization with its own time tracking, which is affected by undo and seek operations issued by the user.
A method of an object may be invoked through the following syntax:
The undo element specifies in a similar way the method and arguments to be used when the method call is undone.
The time-track attribute, if present, informs the ReaX program that an operation affecting the presentation of the object is carried out. If the time-track attribute is provided with value seek-time, then the seek-time attribute must be present (and it is not allowed in any other case).
If the value of the optional seek-time attribute is provided, then the object is assumed to be rendering a presentation and the call is assumed to change the presentation time. This attribute allows the ReaX program and the object to maintain the same notion of time.
The use of the recover-category attribute is explained below.
As an alternative to providing explicit undo methods, the recover statement stipulates that the way to re-establish the effect of a category of method calls is to simply re-execute the last such call. For example, a method that specifies a complete speech recognition grammar satisfies this property. During the undoing of program execution, there is then no need to explicitly undo each specification and retraction of a complete grammar.
An embedded object is able to issue calls to the ReaX program. To enable such callbacks, the ReaX program declares a special handler:
The checkpoint statement
ReaX maintains a wall-clock that records the amount of seconds that have passed since the beginning of the execution of the program. It is possible to skip forward and to skip backward in a program through an operation called seeking. For example, if no events have taken place during the 10 seconds that have passed since an audio recording was initiated, then it may be possible to rewind the rendering by the amount of say eight seconds, without changing the value of any program variables or program counters. In that case, we say that the presentation time decremented by eight seconds. Generally the presentation time advances at the same rate as the wall-clock; the only exceptions are when the program is seeked in a forwards or backwards direction. In those cases, the presentation time is decremented or implemented by the delta seek, which is the relative amount of time that the presentation, that is, the program execution, is subjected to.
Moreover, some objects are time-tracked: when a call containing a start attribute to that object is executed, time-tracking is initiated and the delta time of the object is set to 0. The delta time then progresses at the same rate as the wall-clock. When a call containing the pause attribute is executed, time-tracking is suspended for the object. When a call containing the resume attribute is executed, time-tracking is resumed. And, when a call containing a seek-time attribute is executed, the delta time of the object is set to the value of the attribute. When a call a containing the stop attribute is executed, the object is no longer time-tracked.
To enable recovering, ReaX maintains a log of the last n_max external events and associated reactions that occurred. The number n_max is implementation dependent. Additionally, the global state of the program is recorded for each recorded reaction. The state is recorded right before the reaction. The presentation time at the time of the reaction is recorded. The global state is augmented with information about the objects that are time-tracked at the time of the reaction along with their delta times. Moreover, the sequence of call statements executed during the reaction is stored. Finally, for each valid recover statement at the beginning of the reaction, the last call with the appropriate recover-category is memorized.
Any statement in ReaX may disallow an undo operation through the use of the optional undo attribute. If it is false, then the reaction in which the statement is executed is not undoable.
Recovery by undo backs up the computation to the last reaction for which a checkpoint statement was executed. Recovery is not possible if this reaction or a later reaction is not undoable.
In general, the computation is backed up to a reaction r by undoing one reaction at time, starting with the last reaction that occurred and proceeding until and but not including r.
A reaction is undone as follows. The sequence of call statements executed is examined in reverse order. For each such statement that contains a undo element, the corresponding method is invoked on the appropriate object. Then, the global state recorded is restored.
Moreover, for each object that is time-tracked after the preceding reaction, a call of the provided seek method is issued with the value of the delta time recorded for the reaction.
Finally, for each recover statement with category C that contains a program counter, the last executed call statement of category C as recorded by the reaction is re-executed.
According to the conventional script model of undoing and redoing, it is possible to apply a redo operation following an undo operation in order to cancel the effect of the latter. To accomplish this effect, the ReaX program re-executes the reactions corresponding to the events that were undone.
A seek to a time before any scheduled timeouts, but to a time after the last reaction, is called a local seek. Let the seeked presentation time be the current presentation time plus the seek delta. The program state is not altered but the presentation time is altered, negatively or positively, according to the seek delta. The maximum seek delta is the minimum of the remaining time of active timeout handlers. The minimum seek delta is the difference between the presentation time and the presentation time of the last reaction. This seek operation does not affect the global state. All objects that are time-tracked are issued a call according to their respective seek element with a time parameter that is the sum of their delta time and the seek delta. Also, the remaining time of all soft timeout handlers is decremented by the seek delta.
If the seeked presentation time is earlier than the presentation time of the last reaction, then the seek operation is accompanied by undo operations up to the last reaction r with a presentation time that is less or equal to the seek time. The seek operation is now performed from r with a delta of the difference between the seeked presentation time and the presentation time of r. The combined operation is called a global backwards seek.
It is also possible to seek forwards in time beyond the time of an active timeout handler. This is accomplished by executing the timeout events that lie between the current presentation time and the seeked presentation time. The difference between the seeked presentation time and the last such time is then the delta of a local seek that is carried out after the timeout reactions. In this way, the effect of redos and local seeks are combined to enable forward time seek.
We here show how to emulate the semantics of a media element in SMIL. The element to be emulated by a ReaX block is
SMIL offers a programming mechanism that substantially simplifies synchronization: the ability to react to events dispatched anywhere. This kind of event response is well-known from the class of synchronous reactive languages. The paradigm is that events are not directed towards any target and are not consumed by event handlers. Instead, events are globally visible, at each instant a number of such events are present in the "event ether."
SMIL 2.0 has a traditional concept of event: an event is dispatched by a thread on (to) a target, which is a block or thread. Events may be handled in the classical manner: by listeners or handlers. But SMIL 2.0 introduces a complementary mechanism: an event-value, which is identified by the event type and the ID of the target, where the event is raised to. The event-value provides a way for a thread to wake up or to terminate. The activation of this mechanism is called "responding to an event" or "delivery of an event notification." For example, the attribute begin='foo.begin' used on a block delays its ativation until the begin event has been dispatched on the block named foo. (The DOM concept that an event is dispatched is called "is raised", "is fired", "has occurred", or "has happened" in SMIL 2.0.)
The mechanism of event-values, and the related mechanism of sync-values, are treated semantically as part of the building of a time graph constructed dynamically as the programs is running. Unfortunately, this mechanism is very complicated and contains several seemingly ad hoc exceptions. Moreover, the semantics are not tied precisely to that of the dispatching of events; for example, in the same instant, awakening by observation may trigger further events to be raised through scripting, which in turn may awake further threads through observation. Thus, there seems to be some implicit ordering of operational steps---but this phenomenon contradicts the point-like nature of time and synchronization semantics. It is the this interaction between the activation of blocks determined by the time graph and the general events that we would like to elucidate with a ReaX-based semantics.
We take the view that event-values and sync-values should be co-events. These "inverse" events are generated by need: a co-event is raised to a co-handler only when an event is selected for execution that matches the event specification of the co-handler. In this way, we can systematically describe temporal relationships along timed threads.
Moreover, our event and history oriented view of the ReaX language implies that a built-in recovery mechanism founded on both traditional checkpointing and time seeking is possible within one framework. And, we have shown how this framework can be integrated with a traditional scripting language, even preserving undoability whenever the programmer provides explicit recipes for recovery.
In the program fragment:
SMIL semantics are declarative and no history of events is maintained. The semantics stipulate that when an element seeked the parent of the element is also seeked. Unfortunately, since no history about actually occurring events is recorded, the seek of the parent element, which in this case is the outer SMIL:seq element, results in the re-execution of the button element. The semantics of SMIL does not assume that the c.click event ever happened. Thus, the resulting state after the seeked is inconsistent: both children of the SMIL:seq element are executed at the same time, something that should be impossible according to a sequential semantics.
In ReaX, the program could be written as follows:
Because ReaX semantics is based on recording the history of events, the ReaX program above correctly rewinds the video without re-executing the button element. This functionality is independent of the <checkpoint /> elements that have been inserted.
But in addition to the seek mechanism, this ReaX program offers an undo facility: if the undo button is pressed during a video presentation, then the state of the ReaX program is made to be the one existing right after the last checkpoint was encountered. So, the effect is that the video presentation is seeked to the beginning. If two undoes are presented in a row, then the ReaX program is brought back to the state that existed after the first checkpoint was encountered. That is, the button will be shown again, and the video will not be playing.
The discussion of forward and backwards seek, along with the undo and redo mechanisms, clearly points to a user interface for ReaX programs that is almost identical to that of CD or tape players. For example, the interface could be arranged as:
The pause button invokes the pause (or resume) function specified in ReaX calls, while the presentation time is paused (or resumed).
The present ReaX language is yet another proposal in a series of notations that colleagues and visiting students have helped formulate. Giuseppe di Fabbrizio greatly influenced the language by formulating usage scenarios and programming situations that ReaX was designed to tackle. Jennifer Beckham have defined an earlier version called ReX (REactive Xml) that conceptually is quite similar. Anders Moller defined and implemented RSS (Reactive Style Sheets), a reactive scripting language for interpreting XML documents.