S5: Adding Stack Traces

S5: Adding Stack Traces

Posted on 02 December 2012.

We've started using S5 to learn about some multi-thousand line programs. While debugging some of the problems S5 had with these, it was clear that we needed some better error messages coming out of the interpreter. This stands to reason: as our codebase grows up to tackle more complicated use cases, the tools need to scale up as well.

Sometimes, we'd run into a bug in the environment, and other times a desugaring bug would trigger an error thrown from the environment. Our environment is 5000 lines of S5 code, each of which may be invoked by desugared code. For example, we have a function that checks if a value is actually an object, used before property lookup and at the top of built-ins such as Object.getPrototypeOf() and Object.defineProperty().

This abstraction is great: it makes desugared code read cleaner (much better than if it were inlined), and shares the functionality across many contexts. Sadly, when the TypeError was thrown, S5 only knew that it came from that particular line in the environment. That isn't particularly useful when that line was reached from line 2398 in a 10000 line file. We needed stack traces. Today we're going to walk through the process of adding stack traces to S5. The total change came in at a whopping 77 lines of code, split across two commits.

We already keep track of the source position (in the original source file) of every expression we create in desugaring: our expression type has a special spot for it:

type exp =
  | Null of pos
  | Undefined of pos
  | String of pos * string
  | Num of pos * float
  | True of pos
  | False of pos
  | Id of pos * id
  | Object of pos * attrs * (string * prop) list
  ...

We just need to find a way to get the information about the expressions we've visited out to the toplevel, and then to the user. S5 models JavaScript exceptions with a special Ocaml exception type:

exception Throw of value * store

This construct keeps track of the value that's propagating upward, and can be recognized by an Ocaml catch block in the implementation of S5 try; the store is included because the interpreter itself is implemented in pure store-passing style (thanks to Ben Lerner). What Throw is lacking is any sort of context about where the exception came from. But this is easy to add:

exception Throw of exp list * value * store

Now a Throw can keep track of an entire list of expressions, for example, the list of expressions in the context of its evaluation. To get the chain started: we create a single element list when an actual throw occurs. This is straightforward to locate in the interpreter. This:

match exp with
  ...
  | Syntax.Throw (p, e) -> let (v, s) = eval e env store in
    raise (Throw (v, s))

Becomes:

match exp with
  ...
  | Syntax.Throw (p, e) -> let (v, s) = eval e env store in
    raise (Throw ([exp], v, s))

This on its own will only include the throw expression itself. But, there's no reason we need to let this Ocaml raise statement reach all the way to the surrounding try or the toplevel. We can wrap every expression evaluation in a handler for the Throw special exception. The handler adds the current expression to the list, and reraises the exception:

let rec eval exp env store : (value * store) =
  let eval exp env store =
    begin try eval exp env store
      with 
      | Throw (exprs, v, s) ->
        raise (Throw (exp::exprs, v, s))
    end in
  match exp with
  ...
  | Syntax.Throw (p, e) -> ...
  | Syntax.Object (p, attrs, props) -> ...
  ...

Now, all inner calls to the evaluator will be wrapped in a handler that is patiently waiting for the possibility of a thrown exception, to add the current expression and continue up the chain. We try it out, and get some crazy looking output:

../envs/es5.env:6:0-0
../envs/es5.env:10:0-0
../envs/es5.env:12:0-0
../envs/es5.env:14:0-0
../envs/es5.env:16:0-0
[574 lines total]
../envs/es5.env:4777:0-0
:0:-1--1
:0:-1--1
:0:-1--1
[around 50 lines]
tests/mybigfile.js:355:-11--9
[around 10 lines]
:0:-1--1
:0:-1--1
[around 100 lines]
tests/mybigfile.js:1981:-3--37
tests/mybigfile.js:1981:-14--37
tests/mybigfile.js:1981:-14--37
tests/mybigfile.js:1981:-14--32
tests/mybigfile.js:1981:-14--32
../envs/es5.env:159:35-1
../envs/es5.env:160:23-3
Uncaught exception: {[#proto: object, #class: "Object", #extensible: true,]}
Fatal error: exception Failure("Uncaught exception")

All told, there are around 1000 lines of output. That is, the interpreter was 1000 expressions deep when some exception was thrown on line 1981 of mybigfile (the error itself is irrelevent here, the fact is that it happened and we needed to diagnose it). But a lot of this is just noise: The first 574 lines are simply the environment warming up, and would be at the front of any such stack trace. All the lines that print as ":0:-1--1" are dummy positions inserted by desugaring for intermediate expressions that don't have any corresponding source location.

There's no reason we have to print the whole stack trace, though. It's just a list of expressions that we're printing the positions of, and we can write arbitrary filters over it. That's just what we did: wrote one filter to remove leading entries with the same file name, and another to remove all dummy positions. After doing that, we get much more reasonable output:

../envs/es5.env:6:0-0
tests/mybigfile.js:355:-11--9
tests/mybigfile.js:355:-11--9
tests/mybigfile.js:355:-11--9
tests/mybigfile.js:355:-11--9
tests/mybigfile.js:355:-11--2
tests/mybigfile.js:355:-11--2
tests/mybigfile.js:355:-11--2
tests/mybigfile.js:355:-11--2
tests/mybigfile.js:355:-11--2
tests/mybigfile.js:1981:-3--37
tests/mybigfile.js:1981:-14--37
tests/mybigfile.js:1981:-14--37
tests/mybigfile.js:1981:-14--32
tests/mybigfile.js:1981:-14--32
../envs/es5.env:159:35-1
../envs/es5.env:160:23-3
Uncaught exception: {[#proto: object, #class: "Object", #extensible: true,]}
Fatal error: exception Failure("Uncaught exception")

We might decide later that we want these to look like traditional stack traces, and only print at function boundaries. But we might also be interested in writing other predicates over the expressions we're deep inside when an error occurs. Perhaps we'll want to detect changes across file boundaries, or exceptions that flow through specific files in the environment. Simple additions like this are possible because our interpreter is small and traditional; these small changes gave us an enormous boost in error understanding. S5 is a fertile ground for trying out ideas like these on an interpreter that handles a real language. What do you want to build?