S5: Desugaring Exceptions

Posted on 17 December 2012.

During our development of S5 (start there if you have no idea what S5 is, as this post assumes some background), our implementation hit several pain points. These are useful points in the specification to highlight, because our implementation is designed to give an unambiguous, precise, understandable account of the language that we can use to make formal claims.

This post touches on one such pain point: balancing the creation of exception objects between the core semantics and the desugaring to the core. This split highlights some of the invariants we try to maintain in our core language, and where the specification breaks them. I don't think this points out a flaw in either; rather, it's an example of a case where traditional models for semantics face difficulty with real-world language features. There is a real design tradeoff here, and S5 currently sits somewhere in the middle between two ends of the spectrum.

Implicit Creation of Exception Objects

Several points in the spec mandate throwing an exception of a particular class.

As with any feature in the specification, we have two choices for how to handle these exceptions. We can:

  • Make them a part of the semantics, or
  • Desugar them away into existing parts of the semantics.

In some cases, this is straightforward. For example, when defineOwnProperty is called and tries to change a property from non-writable to writable on a non-configurable field, an exception object needs to be created whose internal proto field is the built-in TypeError prototype.

In this case, we use different exceptions we've defined in a core semantics implementation, and provide an implementation of defineOwnProperty in the core that instantiates the correct kind of exception. The same pattern is used in a number of built-in functions, and is straightforward and uncontroversial.
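The defineOwnProperty case can be observed from ordinary JavaScript. The following is a minimal sketch that uses only the standard Object.defineProperty and Object.getPrototypeOf; the helper names makeLockedObject and tryMakeWritable are made up for illustration and are not part of S5.

```javascript
// Build an object with a non-writable, non-configurable field "x".
function makeLockedObject() {
  const o = {};
  Object.defineProperty(o, "x", {
    value: 1,
    writable: false,
    configurable: false,
    enumerable: true
  });
  return o;
}

// Try to flip "x" from non-writable to writable and return whatever is thrown
// (or null if nothing is thrown).
function tryMakeWritable(o) {
  try {
    Object.defineProperty(o, "x", { writable: true });
    return null;
  } catch (e) {
    return e;
  }
}

const err = tryMakeWritable(makeLockedObject());
// The thrown value is a freshly allocated object whose prototype is the
// built-in TypeError prototype.
console.log(err instanceof TypeError);
console.log(Object.getPrototypeOf(err) === TypeError.prototype);
```

The point of interest is that some piece of code had to allocate that object and know where the TypeError prototype lives; for built-ins like this one, that code is the core-language implementation of defineOwnProperty.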

Even More Implicit Creations

This works just fine for a number of cases, especially built-in functions. Others cause some head-scratching. For example, setting an unwritable field also throws a TypeError. The semantics is aware of the writability of fields, and will thus refuse to perform assignment on one. But the semantics still isn't, and I argue shouldn't be, in charge of allocating a new exception object and knowing about TypeErrors. There are at least two reasons:

  1. We have kept the creation of objects to a minimum in our core semantics. Only object literal expressions and object key extraction expressions are capable of adding new locations on the heap for objects. All other creations of objects in the spec are desugared into these two operations. This aids proofs over the core, because more of the operations in the semantics enjoy the invariant that they cannot change the object store in any way.

  2. The core semantics and interpreter start evaluation with an empty heap that contains no objects, and an empty environment that contains no bindings. This isn't what I'd recommend for a blazingly fast implementation of JavaScript, but it is a clean model: because of (1), this means that all object references come from some allocation in the code. Expecting that something named TypeError would be available at the level of the semantics would require violating this property, and would complicate the clean model of evaluation.

So, it seems like the natural thing to do for assignment to unwritable fields is to desugar field assignment to check for writability and throw the correct exception if the field isn't writable. That is, we would transform:

o[x] = 5

// becomes

if (o[x<writable>]) o[x] = 5
else %TypeError("unwritable field")

This works correctly, though the actual implementation would be a bit longer, because it would also need to check for extensibility. There are two drawbacks, however:

  1. The desugared code is much less readable, because a field assignment at the surface becomes nearly unrecognizable (imagine what a nested assignment like o[x] = o[y] = 4 would look like).

  2. From an engineering point of view, we now have both the desugared code and the interpreter checking for writability, which leaves code in the exception case that is presumably dead, but not actually guaranteed to be.
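To make the pure desugaring approach and its first drawback concrete, here is a rough sketch in plain JavaScript rather than in the S5 core language. The helper assignChecked is hypothetical and stands in for the desugared form of o[x] = v. It is deliberately simplified: it only looks at own properties, and ignores inherited fields and other details of the real assignment algorithm.

```javascript
// Hypothetical stand-in for the desugared form of `o[x] = v`.
// Simplification: only own properties are inspected.
function assignChecked(o, x, v) {
  const desc = Object.getOwnPropertyDescriptor(o, x);
  if (desc === undefined) {
    // Adding a new field is only allowed on extensible objects.
    if (!Object.isExtensible(o)) {
      throw new TypeError("unextensible object");
    }
  } else if ("writable" in desc && !desc.writable) {
    // Existing data field that is not writable.
    throw new TypeError("unwritable field");
  }
  o[x] = v;
  return v;
}

// The surface expression `o[x] = o[y] = 4` turns into nested checked calls.
const o = {};
assignChecked(o, "x", assignChecked(o, "y", 4));
console.log(o.x, o.y);
```

Even with the checks hidden behind a helper, the nested assignment no longer looks like an assignment, and the underlying o[x] = v still performs its own writability check after the explicit one has passed.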

One Alternate Approach

We've started experimenting with another approach as a result of being somewhat dissatisfied with the pure desugaring approach above. We let the interpreter's checks happen, and have the interpreter throw exceptions as primitive values that can be caught by the desugared code. Our desugaring becomes:

o[x] = 5

// becomes

try { o[x] = 5 } catch %ErrorDispatch

This is less desugared cruft than before, and relies on two tricks for defining %ErrorDispatch and tidying up the interaction with the rest of desugaring.

  • We embed all user-thrown values one level deep in an object data structure with a tagged field that lets us tell user-thrown values from interpreter-thrown exceptions. If we didn't do this, actual JavaScript code might be able to act like the trusted interpreter. We do this in the desugaring of throw expressions.

  • The catch block, %ErrorDispatch, checks the thrown value to see if it is a user-thrown or interpreter-thrown exception. It passes on user-thrown exceptions, and converts interpreter-thrown exceptions to the appropriate type. For example, in this case, it recognizes the string "unwritable field" and creates a TypeError.
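The two tricks above can be sketched in plain JavaScript. This is an illustration under assumptions, not the S5 implementation: the names interpreterSet, userThrow, errorDispatch, and desugaredAssign are made up, the tag field userThrown is a hypothetical choice, and wrapping the converted TypeError in the same envelope as user-thrown values is a guess about how the result would be handed back to surrounding code.

```javascript
// Stand-in for the interpreter's field assignment: on failure it throws a
// bare string (a primitive) and never allocates an exception object.
function interpreterSet(o, x, v) {
  const desc = Object.getOwnPropertyDescriptor(o, x);
  if (desc !== undefined && desc.writable === false) {
    throw "unwritable field";
  }
  o[x] = v;
  return v;
}

// Desugaring of a user-level `throw v`: embed the value one level deep in a
// tagged object so it cannot be confused with an interpreter-thrown string.
function userThrow(v) {
  throw { userThrown: true, value: v };
}

// Sketch of the catch block: pass user-thrown values on unchanged, and
// convert recognized interpreter strings into the appropriate exception.
function errorDispatch(thrown) {
  if (typeof thrown === "object" && thrown !== null && thrown.userThrown === true) {
    throw thrown;
  }
  if (thrown === "unwritable field") {
    throw { userThrown: true, value: new TypeError("unwritable field") };
  }
  throw thrown; // unrecognized: rethrow as-is
}

// Desugared form of `o[x] = v` under this approach.
function desugaredAssign(o, x, v) {
  try {
    return interpreterSet(o, x, v);
  } catch (e) {
    errorDispatch(e);
  }
}
```

Because of the tagging, user code that throws the string "unwritable field" reaches errorDispatch as a tagged object and is passed through untouched; only the bare string thrown by the interpreter stand-in is converted into a TypeError.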

Tradeoffs

There are pros and cons to this pattern.

It moves the desugaring burden from the checks around uses of built-in forms to the definition of %ErrorDispatch, which makes desugared code clearer. It also maintains all the invariants we want about the interpreter itself. On the other hand, it adds an extra implicit dependency between the strings used in the interpreter and the strings that %ErrorDispatch recognizes. It also makes the desugared code less explicit in exchange for removing some noise. We'll see how well we like this, and whether certain classes of exceptions should use different strategies. Function application and lookup on undefined and null could also use this treatment to clarify desugared code, but it's probably not worth it for built-in functions.