S5: Desugaring Exceptions
During our development of S5 (start there if you have no idea what S5 is, as this post assumes some background), our implementation hit several pain points. These are useful points in the specification to highlight, because our implementation is designed to give an unambiguous, precise, understandable account of the language that we can use to make formal claims.
This post touches on one such pain point, which is balancing the creation of exception objects between the core semantics and the desugaring to the core. This split highlights some of the invariants we try to maintain in our core language, and where they are broken by the specification. I don't think this is pointing out any flaw in either; rather, it's an example of a case where traditional models for semantics face difficulty with real-world language features. There is a real design tradeoff here, and S5 currently sits somewhere in the middle of the spectrum.
Several points in the spec mandate throwing an exception of a particular class.
As with any feature in the specification, we have two choices for how to handle these exceptions. We can:
1. implement them directly in the core semantics, or
2. desugar them into uses of constructs that already exist in the core.
In some cases, this is straightforward. For example, when defineOwnProperty is called and tries to change a property from non-writable to writable on a non-configurable field, an exception object needs to be created with the built-in TypeError as its internal proto field. In this case, we use different exceptions we've defined in a core semantics implementation, and provide an implementation of defineOwnProperty in the core that instantiates the correct kind of exception. The same pattern is used in a number of built-in functions, and is straightforward and uncontroversial.
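To give a sense of the shape, here is a rough sketch in the JavaScript-like pseudo-syntax used for the examples later in this post; the function name, the descriptor access, and the reuse of %TypeError are illustrative assumptions, not S5's actual definition of defineOwnProperty:
%defineOwnProperty = func(o, name, desc) {
  // reject an attempt to make a non-writable field writable
  // when the field is non-configurable
  if (!o[name<configurable>] && !o[name<writable>] && desc["writable"])
    %TypeError("defineOwnProperty: non-configurable field")
  else ...  // otherwise, update the property as the spec directs
}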
This works just fine for a number of cases, especially built-in functions. Others cause some head-scratching. For example, setting an unwritable field also throws a TypeError. The semantics is aware of the writability of fields, and will thus refuse to perform assignment on one. But the semantics still isn't, and I argue shouldn't be, in charge of allocating a new exception object and knowing about TypeErrors. There are at least two reasons:
1. We have kept the creation of objects to a minimum in our core semantics. Only object literal expressions and object key extraction expressions are capable of adding new locations on the heap for objects. All other creations of objects in the spec are desugared into these two operations. This aids proofs over the core, because more of the operations in the semantics enjoy the invariant that they cannot change the object store in any way.
2. The core semantics and interpreter start evaluation with an empty heap that contains no objects, and an empty environment that contains no bindings. This isn't what I'd recommend for a blazingly fast implementation of JavaScript, but it is a clean model: because of (1), this means that all object references come from some allocation in the code. Expecting that something named TypeError would be available at the level of the semantics would require violating this property, and would complicate the clean model of evaluation.
So, it seems like the natural thing to do for assignment to unwritable fields is to desugar field assignment to check for writability and throw the correct exception if the field isn't writable. That is, we would transform:
o[x] = 5
// becomes
if (o[x<writable>]) o[x] = 5
else %TypeError("unwritable field")
This works correctly, though the actual implementation would be a bit longer, because it would also need to check for extensibility. There are two drawbacks, however:
1. The desugared code is much less readable, because a field assignment at the surface is nearly unrecognizable (imagine what a nested assignment like o[x] = o[y] = 4 would turn into; a rough expansion appears after this list).
2. From an engineering point of view, we now have the desugared code checking for writability, and the interpreter checking for it, with presumably, but not actually guaranteed, dead code in the exception case.
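To make the first drawback concrete, here is roughly what the nested assignment could expand into under this scheme; this is only a sketch, treating if as an expression and ignoring extensibility checks and evaluation-order details:
o[x] = o[y] = 4
// becomes, roughly
if (o[x<writable>])
  o[x] = (if (o[y<writable>]) o[y] = 4
          else %TypeError("unwritable field"))
else %TypeError("unwritable field")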
We've started experimenting with another approach as a result of being somewhat dissatisfied with the pure desugaring approach above. We let the interpreter's checks happen, and have the interpreter throw exceptions as primitive values that can be caught by the desugared code. Our desugaring becomes:
o[x] = 5
// becomes
try { o[x] = 5 } catch %ErrorDispatch
This produces less desugared cruft than before, and relies on two tricks for defining %ErrorDispatch and tidying up the interaction with the rest of desugaring.
First, we embed all user-thrown values one level deep in an object data structure with a tagged field that lets us tell user-thrown values from interpreter-thrown exceptions. If we didn't do this, actual JavaScript code might be able to act like the trusted interpreter. We do this in the desugaring of throw expressions.
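For instance, the desugaring of a throw expression might wrap the thrown value along these lines; the tag and field names here are illustrative, not the ones S5 actually uses:
throw e
// becomes, roughly
throw { "%user-thrown": true, "value": e }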
Second, the catch block, %ErrorDispatch, checks the thrown value to see if it is a user-thrown or interpreter-thrown exception. It passes on user-thrown exceptions, and converts interpreter-thrown exceptions to the appropriate type. For example, in this case, it recognizes the string "unwritable field" and creates a TypeError.
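A minimal sketch of what %ErrorDispatch could look like, again in illustrative pseudo-syntax; the tag check, the string comparison, and the reuse of %TypeError are assumptions about the shape, not S5's actual definition:
%ErrorDispatch = func(%e) {
  // user-thrown: unwrap and pass the original value along
  if (%e["%user-thrown"]) throw %e["value"]
  // interpreter-thrown: convert the string to the right exception object
  else if (%e === "unwritable field") %TypeError("unwritable field")
  else ...  // other interpreter-thrown strings
}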
There are pros and cons to this pattern.
It moves the desugaring burden from the checks around uses of built-in forms to the definition of %ErrorDispatch, which makes desugared code clearer. It also maintains all the invariants we want about the interpreter itself. On the other hand, it adds an extra implicit dependency between the strings used in the interpreter and those checked in %ErrorDispatch. It also makes the desugared code less explicit in exchange for removing some noise. We'll see how well we like this, and whether certain classes of exceptions should use different strategies. Function application and lookup on undefined and null could also use this treatment to clarify desugared code, but it's probably not worth it for built-in functions.