Screencast: SpiderMonkey hacking intro

I saw Gary Bernhardt's DestroyAllSoftware screencasts last night and thought, "Wow, that's a great idea! I bet it takes a lot less time to make a screencast than to write a blog entry with thousands of gross words in it."

And here you have it: my first screencast. Since it's only five minutes long, I was able to make it right after breakfast! It shows how to get started building the JS shell and how to add a new shell builtin that performs very important functionality.

I'm not very good at speaking and typing concurrently, so be gentle! (Also, it's hard to see if it's not fullscreen and HD quality, so I recommend using that.)

Update 2011-10-19 1700: @dukeleto points out that, if you need to work around recent issues with Mercurial clone timeouts on Mozilla servers, you can also visit the Mercurial web view for the tip revision and download via the bz2 link displayed at the top.

My programming job evaluation criterion

I love making lists with no title or context and then hiding them somewhere in my room. At least, that's the inevitable conclusion of my paper-purging weekend.

I actually found one fun list (under my bed) that I've rematerialized the context for: what criterion do I use, as a candidate, to evaluate a programming position in the valley?

I came up with this short list when I was interviewing at Mozilla. It's a personal list, inextricable from my personality, experiences, and my current lot in life, but some of the thoughts may apply generally and help future candidates to brainstorm.

Purpose

If you're into the whole consequentialism thing.

What's the outcome you're achieving by doing this job? Particularist (i.e. feed your family, put your kid through college) and universalist (i.e. bring education to undeveloped nations, make navigation systems that don't fail) concerns are both important to consider. [*]

Community

Who are the people you're going to be working with? Community in my mind is roughly, "Humans you will be interacting with as a part of your job." Do you have reason to believe this will be a beneficial community for you? Maybe you work best interacting with a particular disposition of person: driven, explanatory, idealistic, encouraging, expert, iconic, pub-dwelling, sky-diving?

As my friend eloquently put it, "Fuck staying in a job where people suck."

Learning

How much do you already know about the subject of your work? How much do you want to know? They say, "God is in the details." I buy it, but if the job expects and encourages you to learn new things in topics that you're passionate about on a regular basis, it sure makes this part easier.

I've also heard, "Lots of folks just try to do the same exact thing they did at their last job over again, but slightly better." Maybe you'd like to avoid that. [†]

Artistic direction

How much control do you have over the direction of the project that you'll be working on? It's quite frustrating to pour your heart and soul into your work if all your projects get cancelled or you're being metaprogrammed.

This brings up related questions of status: How valued will your input be? Is this place a meritocracy in the ways that drive excellent products and good decision making? What capability do you have to devote resources to things that you think are important? (This could be a use-20%-time-to-prove-its-worth-pursuing kind of deal.)

Or, maybe there is no meritocracy, and there's politics. That'd be important to know.

Artistic fidelity

How much craftsmanship is expected? Mandated? Tolerated?

Posed another way: how hacky will your work need to be to meet deadlines, output requirements, expectations, and so forth? How hacky do you anticipate the work of others will be? Is there some mitigating circumstance (like an important product deadline) that makes a lower level of craftsmanship acceptable?

Another consideration: does this job expect you to show appropriate respect for the product being delivered, or is it overly focused on the purity of their development methodology?

Tedium

What percentage of your time will be spent doing mundane tasks (that you have little hope of learning from)? Do you have a good strategy for coping-with/minimizing expected sources of tedium?

Legacy

How much state-that-already-exists will you have to deal with? Sometimes this is to your advantage (i.e. already-established big/important customers) and sometimes it's to your disadvantage (i.e. original code base believed the MS Visual Basic was the language of the future and wrote their application using solely the corner cases of the language semantics).

(Compensation and career advancement weren't on the list. I'm sure that everyone has a price where they'd sacrifice any one of these things, but it's a much more meaningful exercise to evaluate them on their own.)

The categories are supposed to be largely orthogonal, but the idea is simply to provoke thought for meaningful/tough questions during the interview process — if you value your time and attention, the employer should be giving you sufficiently satisfying answers!

No place is perfect, but thinking about what you're getting yourself into and why should put you squarely ahead of the game as a job candidate.

Footnotes

[*]

Programming jobs that work towards solving severe third-world problems that also let you work with interesting technology are not abundant. My current strategy is to work in the interesting tech position of my choosing and donate a pre-set percentage of my salary towards effective charities working in those areas.

[†]

A similar quote that a friend gave me: "Some people don't have ten years of experience so much as the same year of experience ten times."

JS regexps implemented in JS

If you heart JS and want to learn more about how regular expressions work, I've got yet another fun project for you to work on.

Back when I was working more heavily on the regular expression engine, some ECMAScript spec correctness questions came up. As it turns out, the regular expression part of the specification is written in scary-sounding-but-really-not-that-hard-to-understand continuation passing style (and it really seems to make the most sense that way).

I tried to work through the first bug on paper, but I forgot to carry the one or something, and I got the wrong result. dmandelin quickly whipped up a program modeled on the spec to resolve that one example conclusively. Seeing how easy that was, I followed his lead and started working on a little library to resolve these questions in ways that save more trees.

I haven't worked on it much lately (according to this neat GitHub activity doohickey), but I put cdlre out on github today, and I'd be happy to review pull requests. The regular expression specification for matching (ECMAScript revision 5 section 15.10) is easy to translate into running code, and I left a lot of features unimplemented, just for you!

I'll just copy-pasta the rest from the project README:

Potential applications:

  • Regression testing the specification against host implementations.

  • Use in understanding why regular expressions succeed/fail to match.

  • Use in a metacircular interpreter (like Narcissus).

  • Use as a staging ground for regular expression optimizations and/or a regular expression compiler. (Such a compiler could target eval as a backend or a JIT code execution foreign function.)

Goals

  • Be capable of visualizing (or at least dumping out) the ECMAScript standard steps taken in matching a regular expression.

  • Be capable of enabling/disabling the de-facto quirks from various browsers which are not yet part of the standard.

  • Be capable of running a thorough regression suite against the host regular expression engine (presumably with a set of permitted quirk options).

  • Keep the JS code a direct translation from the spec where possible and practical.

I'm sure that hooking in a comprehensible visualization would be a helpful tool for web developers who want to harness the Indiana Jones-like power of regular expressions.

Go-go gadget community?

These tablets are for consumption

I got lucky and came up with a witty title for this one despite sleep deprivation. I could probably go on some sarcastic diatribe about how we happily pay half a thousand dollars for a magazine-consolidating bathroom reading device while people with TB lack necessary medical supplies; but, surprisingly, my goal is not to torture you, dear reader. Mostly because you've got it going on.

In reality, I just wanted to confess to the world that I get it now. Work recently lent me an Asus Transformer tablet (sans fancy keyboard dock thing I've heard about) in order to debug a JS problem in OS X cross compiles. So, I took the plunge, trying to figure out what people actually use these things for in their daily lives.

For me, the answer was pretty simple: streamlined content consumption.

I quickly learned that I can't create anything of value on a tablet in its natural habitat — at least, until the demand for "world's funniest pot-roast-fisted input device typing error videos" goes mainstream.

At first, I found this infuriating. Most of my typical computing time is spent creating things — things of questionable value though they may be. But then, a docile sense of calm and well being washed over me, like that inexplicable clump of undissolved Koolaid powder licked off the lips of a siren or a wildly misfired tranquilizer dart.

I don't have to try to produce things all the time. I can chill.

Reading books in the book reader, catching up on bug mail, knocking down a few cool and refreshing feed reader entries on one of California's patronizingly delectably prodigiously warm October days.

Sure, all the cross country runners care about now is training, but if you entice them to run in a giant hamster ball, how much more likely are they to stop and smell the roses?

(Presumably the aforementioned hamster ball has large air holes that you could potentially smell flowers through.)

Thinking of dynamic language code as a template

You can think of a dynamic language code as a template. This code has generality because of duck typing; the idea is roughly, "If the code can work on a value, it will, with no muss or fuss."

Some programmers find this to be an empowering form of "do what I say" — you don't need to explain everything you want your code to do in terms that even an optimizing compiler can understand.

In traditional dynamic language VMs, this template is instantiated with concrete value types at runtime. When a value flows into a bit of code, the interpreter inspects the type of the value and then runs the appropriate "instantiation of the template".

Example

Consider the function:

function bitAndAdd(a, b, c) {
    return a + (b & c);
}

In its maximally generic, template-looking form it reads like this pseudo bytecode::

bitAndAdd<T0, T1, T2> =
    GETARG0 -> T0
    GETARG1 -> T1
    GETARG2 -> T2
    BITAND<T1, T2> -> T3
    ADD<T1, T3> -> T4

In the JS engine, the T-prefixed types can be any in the set {null, undefined, boolean, int32, double, object, string} — these are the engine internal value tags.

In a bytecode-based interpreter, each bytecode typically asks, "Is this thing that I'm acting on an int32? Or double? Or string? Or ...? Or user defined type?" [*] Once that's determined, it runs the appropriate sub-section of the bytecode implementation for that type. The question-asking part of the bytecode is, in effect, directing you to the appropriate template instantiation. (Note that bytecodes are kind of like mini-templates composed into the larger function template.)

When you call bitAndAdd<int32, int32, int32>, you get an instantiated code path at runtime that looks like::

bitAndAdd<int32, int32, int32> =
    GETARG0 -> int32
    GETARG1 -> int32
    GETARG2 -> int32
    BITAND<int32, int32> -> int32
    ADD<int32, int32> -> ...

This path represents all of the machine code executed in the interpreter that does useful work for this function. Imagine the bytecodes that make up the function as a bunch of enum values in an array. For each enum value in the array, the interpreter dispatches to an type-unspecialized bytecode implementation. Then, the unspecialized bytecode implementation inspects the input types and dispatches to the specialized implementation.

A decent example is the ADD opcode in JavaScript. The unspecialized version of ADD looks something like::

ADD<T0, T1>(T0 lhs, T1 rhs) {
    if (T0 is int32 and T1 is int32)
        result = ADD<int32, int32>(lhs.getInt32(), rhs.getInt32())
    else if (T0 is string and T1 is string)
        result = ADD<string, string>(lhs.getString(), rhs.getString())
    else if (...)
        ...
}

You can see the unspecialized ADD delegating to the type-specialized bytecode sub-implementations. A specialization like ADD<int32, int32> are fairly simple and read something like::

ADD<int32, int32>(int32 lhs, int32 rhs) {
    result = lhs + rhs
    if (overflow)
        result = (double) lhs + (double) rhs;
}

Specializing the template

Of course, with JIT technology, we try to be a lot smarter about making the instantiation efficient. There are a few key mechanisms:

Avoiding bytecode dispatch

The composition of mini-templates (bytecodes) into the larger program template (made out of bytecodes) causes "jump to the next bytecode in the program" cost in an interpreter loop. By creating a machine-code implementation for the sequence of bytecodes that make up a function, we naturally cause machine-code duplication, but avoid that interpretation overhead. This is a natural consequence of choosing to JIT compile a function.

Constant propagation / local type inference (prime example: JägerMonkey)

If the JIT sees that you're performing a bitwise-and, and the spec says that the result of a bitwise-and is always an int32, the JIT can safely assume that fact downstream in the code; for example, in:

function byteSquare(x) {
    var a = x & 0xff;
    var b = a * a;
    return b;
}

The inputs to the a * a multiply are int32s, without question. There's no reason to instantiate the machine code for other types when you know it's going to be a int32.

Whole program type inference (prime examples: JägerMonkey + Type Inference, IonMonkey)

Flowing conservative type information through all the code in the system tells you key interprocedural facts. For example, just by looking at your program it may be easy to see that the divmod(dividend, divisor) function is only ever called with two constant integers as arguments.

With that as the basic idea, the analysis can get a lot fancier; for example, analyses can see that you're only storing new Car() objects into the vehicle property of your Person records, and that new Car() will always have an int32 value in its mufflerCount attribute.

By considering everything [†] that the template could possibly do, we can still narrow down the possibilities in worthwhile cases. This less us instantiate less code and know more facts for use in optimization.

Trace recording (prime example: TraceMonkey)

When a loop is determined to be important (i.e. it hits a certain execution count) a recorder notes which bytecodes execute in the interpreter and what types flow into them. This avoids both bytecode dispatch overhead and type ambiguity, creating a very highly specialized template.

However, if the types ever stop matching up with the observed bytecode execution — say, myObject.quotient used to be an int32, but when we ran the trace a subsequent time it turned out to be a double — the execution of that trace cannot continue, because it's instantiated for the wrong types downstream. Because of this possibility, a bunch of checks for type compatibility have to be inserted in the template instantiation.

It's important to note that the generated template is more than just type dependent — it can also be value depdendent! Because a trace only records a single execution path and questions like "Do I execute the if body or the else body?" can be determined by the values held by a given type.

Some folks argue that single-execution recording tends to create overly instantiated templates for typical dynamic language code.

Profiling (prime example: IonMonkey)

If you think of trace recording as observing bytecode for a single run of execution, profiling is observing bytecode for n runs of execution [‡] and accumulating the notes. This eases the pain of compiling "flip floppy" code, because you don't have to see the primarily taken path the first time, you have n tries to get a feel for it.

This process lets you instantiate a template that's general enough to handle all the things you've seen over the n iterations you've observed.

While trace compilation culls out all but one path of control flow, profiling data can be used to cull out all unobserved (and presumably cold) control flow as well. This may increase the effectiveness and decrease the runtime of dataflow-analysis-based optimizations, because there's less unimportant code to consider. I'm not aware of existing profiling JITs that do this, however, which indicates to me that it may not be an overall win.

Further discussion

Dynamic language compilers try to use runtime feedback as well as things it can infer about the code to specialize the template in the best way possible and make generated machine code blazingly fast. We want to be able to mix, "things we definitely know" (from analysis) with "things we are pretty sure we know" (from runtime feedback) in order to get the maximum speedup possible.

One exciting thing about Just-In-Time is that this runtime feedback makes it at least theoretically possible to exceed the speed of a corresponding static program; however, there are signficant challenges involved. Whenever we start leaning too heavily on "things we are pretty sure we know," we need to be able to keep the world from falling apart when we're wrong.

It's actually a fairly emotional topic in compiler engineering: how do you hold the world together when everything you thought you knew is wrong?

Be strong. Until next time.

Footnotes

[*]

Of course, bytecodes that don't need to know about the types of its operands don't ask the questions. Note that GETARG0 takes no parameters, and so wouldn't need to ask any questions.

[†]

Nearly everything. The type inference algorithm actually makes some conservative simplifications to cut down on analysis time.

[‡]

To some approximation. As an alternative, you can also observe execution for some period of time t, which has some correlation to the number of runs of execution.

« Older entries
Newer entries »