July 29, 2010

Coding style as a feature of language design

roc recently posted a thought-provoking entry titled, "Coding Style as a Failure of Language Design", in which he states:

Languages already make rules about syntax that are somewhat arbitrary. Projects imposing additional syntax restrictions indicate that the language did not constrain the syntax enough; if the language syntax was sufficiently constrained, projects would not feel the need to do it. Syntax would be uniform within and across projects, and developers would not need to learn multiple variants of the same language.

I totally agree with roc's point that there is overhead in learning-and-conforming-to local style guidelines. I also agree that this overhead is unnecessary and that language implementers should find ways to eliminate it; however, I think that imposing additional arbitrary constraints on the syntax is heading in the wrong direction.

Your language's execution engine [*] already has a method of normalizing crazy styles: it forms an abstract syntax tree. Before the abstract syntax tree (AST) is mutated [†] it is in perfect correspondence with the original source text, modulo the infinite number of possible formatting preferences. This is the necessary set of constraints on the syntax that can actually result in your program being executed as it is written. [‡]

So, why don't we just lug that thing around instead of the source text itself?

The dream

The feature that languages should offer is a mux/demux service: mux an infinite number of formatting preferences into an AST (via a traditional parser); demux the AST into source text via an AST-decompiler, parameterized by an arbitrarily large set of formatting options. Language implementations could ship with a pair of standalone binaries. Seriously, the reference language implementation should understand its own formatting parameters at least as well as Eclipse does. [§]
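To make the idea concrete, here's a minimal sketch of the pair in JavaScript. Language.parse, Language.deparse, read, write, and the option names are all hypothetical stand-ins for whatever API a reference implementation would actually ship:

// mux: source text in any style -> the canonical AST that gets checked in.
function mux(sourceText) {
    return Language.parse(sourceText); // A traditional parser.
}

// demux: AST -> source text, parameterized by formatting preferences.
function demux(ast, options) {
    // e.g. options = { indentWidth: 4, braceStyle: "allman", maxLineLength: 99 }
    return Language.deparse(ast, options);
}

// Round trip: your teammate's style goes in, your style comes out.
var ast = mux(read("checkout/module.src"));
write("workspace/module.src", demux(ast, myFormattingPreferences));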

Once you have the demux tool, you run it on your AST files as a post-checkout hook in your revision control system for instant style personalization. If the engine accepted an AST directly as input in lieu of source text, you would only need to demux the files you planned to work on — this could even be an optimization.

Different execution engines are likely to use different ASTs, but there should be little problem with composability: checked-in AST goes through standalone demux with an arbitrary set of preferences, then through the alternate compiler's mux. So long as the engines have the same language grammar for the source text, everybody's happy, and you don't have to waste time writing silly AST-to-AST-prime transforms.

In this model, linters are just composable AST observers/transforms that have no ordering dependencies. You could even offer a service for simple grammatical extensions without going so far as language-level support. Want a block-end delimiter in the Python code you look at? [¶] Why not? Just use a transform to rip it out before it leaves the front end of the execution engine.
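As a sketch of how that might look, with invented node types and field names rather than any real engine's AST: a lint check is just an observer that never mutates the tree, while the block-end-delimiter extension is a transform that strips its nodes back out:

// Observer: reports but never mutates, so any number of these can run
// over the AST in any order.
function lintNoWith(node, report) {
    if (node.type === "WithStatement")
        report(node, "'with' makes the scope chain dynamic; avoid it");
}

// Transform: removes the grammatical-extension nodes so the rest of the
// engine never sees them.
function stripBlockEndDelimiters(node) {
    if (Array.isArray(node.body))
        node.body = node.body.filter(function(child) {
            return child.type !== "BlockEndDelimiter";
        });
    return node;
}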

Reality

Of course, the set of languages we know and love has some overlap with the set of languages that totally suck to parse, whether due to preprocessors or context sensitivity or the desire to parse poems, but I would bet good money that there are solutions for such languages. In any case, the symmetric difference between those two sets could get with it, and new languages would do well to follow suit. It would certainly be an interesting post-FF4 experiment for SpiderMonkey, as we've got a plan on file to clean up the parser interfaces for an intriguing ECMAScript strawman proposal anywho.

Footnotes

[*]

Interpreter, compiler, translator, whatever.

[†]

To do constant folding or what have you.

[‡]

Oh yeah, and comments. We would have to keep those around too. They're easy enough to throw away during the first pass over the AST.

[§]

Even more ideal, you'd move all of that formatting and autocompletion code out of IDEs into a language service API.

[¶]

Presumably because you despise all that is good and righteous in the world? ;-)

B&B++: bed and breakfast for programmers

1. Collect background

This is the latest in my steal-my-idea-but-give-me-free-stuff-after-you-do series, with slightly more earning potential than my last installment, "Strike a Cord".

I recently spoke to some Mozillians who had participated in a "code retreat" — I'd only heard tell of such a thing in lore and folk song, but it seems like a brilliant concept.

The idea is this: a small think tank (of one or more persons) requires a large amount of code throughput on a task which requires a high degree of focus. To facilitate that, they run far from the middling issues of civilized society and deep into the wilderness [*] to code "butt-necked in harmony with Eywa". [†] Through single-minded concentration and a dearth of non-maskable interrupts, they emerge victorious. [‡]

2. ?

Follow these simple steps to steal my idea:

  1. Assume that the aforementioned code retreat process is awesome.

  2. Make a bed-and-breakfast in the outskirts of a city that's attractive to programmers (for whatever reason).

  3. Offer retreats with high-speed internet access, offices with whiteboards, Mirra chairs, height-adjustable desks, pay-as-you-go phone conference equipment, high-res DLP projectors, disco balls, whatever. Make it clearly "the works". If you want to go even further, mount speakers and sound-proof the walls. [§]

  4. Make the experience as luxurious and classy as reasonably possible so that the programmers respect the "sanctity" of the retreat: chef-prepared meals, an indisputably good coffee machine, a Z80 prominently featured as a piece of wall art, and a complimentary bag-o-munchy-chips regimen. Beautiful scenery in which one can walk and think would definitely be a plus, and proximity to a nerd-friendly bar never hurt a nerdy establishment either.

The patrons have a good degree of flexibility as a result of this setup. They might hole themselves away in offices 95% of the time, emerging only to sleep, gather delicious food, and scuttle back into their offices. Alternatively, if they're on a more casual endeavor (coding vacation?), they might choose to strike up conversations with people at meals and go out to see the sights.

3. Profit!

Please do steal my idea and make a lot of money for yourself (share it with no one!) — I only ask that you offer me a free stay once you get off the ground.

I'll leave you off with a little marketing campaign idea:

B&B++: universally evaluated as the way to B, and, after each bed and breakfast, we get a little bit better. Until we overflow. [¶]

Footnotes

[*]

Or a hotel.

[†]

Sadly, I can't take credit for this phrase.

[‡]

Readers familiar with XP may draw a parallel to the practice of Kanban, which has a fascinating backstory and acknowledges the awesome power of JIT.

[§]

For the mercy of those who dislike techno.

[¶]

Hey, I'm giving this advice away for free, you can't expect it to all be good. No company ever survived giving their excellent primary product away for free. [#]

[#]

Ugh, too much meta-humor. If you've read and understood up to this point, I apologize.

Tool teams should work like sleeper cells

I've had some unique experiences interacting-with and participating-in tool development at previous companies that I've worked for, with the quality of those experiences spanning the broad spectrum from train-wreck to near-satisfactory. From that mental scarring has emerged the weighty goo of an idea, which may be interesting food for thought. [*]

How it starts

At some point in a company's growth, management notices that there is a lot of sub-par, [†] redundant, and distributed tool development going on. Employees have been manufacturing quick-and-dirty tools in order to perform their jobs more efficiently.

Management then ponders the benefit of centralizing that tool development. It seems like an easy sell.

Good management will also consider the negative repercussions of turning distributed and independent resources into a shared and centrally managed resource.

How I've seen it work (warning: depressing, hyperbolic)

  1. A group at the company makes a strong enough case to the centralized-tool-management machinery — a request for tool development is granted.

  2. A series of inevitably painful meetings is scheduled where the customer dictates their requirements, after which the tool team either rejects them or misunderstands/mis-prioritizes them because: a) that's not how it works — they have to actively gather the requirements, and b) they don't have enough time to do all the silly little things that the customer wants.

    Because people are fighting each other to get what they want, everybody forgets that the customers haven't really described the problem domain in any relevant detail.

  3. The tool team developers are happy to go code in peace, without going back for more painful meetings. They create a tool according to their understanding of the requirements during the first iteration.

  4. The customer has no idea how the tool team came up with a product that was nothing like what they expected. They say something overly dramatic like, "it's all wrong," pissing off the tool team, and lose faith in the tool team's ability to deliver the product they want.

  5. The customer goes back to doing it manually or continues developing their own tools, expecting the tool team to fail.

  6. The tool team fails because the customer lost interest in telling them what they actually needed and giving good feedback. It wasn't the tool that anybody was looking for because the process doomed it from the start.

I say that this scenario is depressing because tool teams exist to make life better for everybody — they enjoy writing software that makes your life easier. Working with a tool team should not be painful. You should want to jump for joy when you start working with them and take them out to beers when you're finished working with them, because they're just that good. I think that, by taking a less traditional approach, you will be able to achieve much better results...

How it should work

  1. A group at the company makes a strong enough case to the centralized-tool-management machinery — a request for tool development is granted.

  2. A small handful of tool team operatives [‡], probably around two or three people, split off from the rest of the tool team and are placed in the company hierarchy under the team of the customers. They sit in the customers' cube farm, go to their meetings to listen (but no laptops!), etc., just like a typical team member would.

  3. The customer team brings the operatives up to speed, through immersion, on the automatable task that must be performed each day. Depending on the frequency, breadth, and duration of the manual processes, the operatives must perform this manual process somewhere on the scale from weeks to months, until they develop a full understanding of the variety of manual processes that must be performed. [§] All operatives should be 100% assigned to the manual tasks for this duration, temporarily offloading members of the customer team after ramp-up.

  4. Bam! With an unquestionably solid understanding of the problem domain, the tool team sleeper cells activate. 80% of the manual task load is transitioned off of the operatives so that they can begin development work. Agile-style iterations of 1-2 weeks should be used.

  5. After each iteration there must be a usable product (by definition of an iteration). As a result of this, a percentage of the manual task load is shifted back onto the operatives each iteration, augmenting the original 20%. If the tool is actually developing properly, the operatives will be able to cope with the increased load over time.

  6. As the feature set begins to stabilize or the manual task load approaches zero (because it has all been automated), the product is released to the customers for feedback and a limited amount of future-proofing is considered for final iterations.

  7. Most customer feedback is ignored, but a small and reasonable subset is acted on. If the operatives were able to make do with the full task load plus development, it's probably a lot better than it used to be, and the customer is just getting greedy.

  8. The customer takes the operatives out for beers, since the tool team saved them a crapload of time and accounted for all the issues in the problem domain.

  9. A single operative hangs back with the customer for a few more iterations to eyeball maintenance concerns and maybe do a little more future-proofing while the rest head back to the tool team. The one who hangs back gets some kind of special reward for being a team player.

Conclusion

In the sleeper cell approach, the operatives have a clear understanding of what's important through first-hand knowledge and experience and, consequently, know the ways in which the software has to be flexible. It emulates the way organic tool development happens in the wild, as described in the introductory paragraph, but puts the task of creating the actual tool in the hands of experienced tool developers (our operatives!).

I think it's also noteworthy that this approach adheres to a reasonable principle: to write a good program to automate a task, you have to know/understand the variety of ways in which you might perform that task by hand, across all the likely variables.

The operatives are forced to live with the fruits of their labor; i.e. a defect like slow load times will be more painful for them, because they have to work with their tool regularly and take on larger workloads on an ongoing basis, before the customers ever get their hands on it.

Notice that you still get the benefits of centralizing tool developers: a central contact point for tool needs, cultivated expertise in developers, knowledge of a shared code base, and an understanding of infrastructure and contact points for infrastructural resource needs. You avoid, however, the weird customer disconnect that comes with time-slicing a traditional tool team.

Tool developers may also find that they enjoy the team that they're working in so much that they request to stay on that team! How awesome of a pitch is that to new hires? "Do you have a strong background in software development? Work closely with established software experts, make connections to people who will love you when you're done awesome-ing their lives, and take a whirlwind tour of the company within one year."

Footnotes

[*]

Yes, I'm suggesting you digest my mind-goo.

[†]

For some definition of par.

[‡]

I'm calling them operatives now, because their roles are different from tool developers, as you'll see.

[§]

It is beneficial if a small seed of hatred for the manual task begins to develop, though care should be taken not to allow operatives to be consumed by said hatred.

Virtues of Extreme Programming practices

Aside: I've changed the name of my blog to reflect a new writing approach. I've found, with good consistency, that being near-pathologically honest and forward is a boon to my learning productivity. Sometimes it causes temporary setbacks (embarrassment, remorse) when I step into areas that I don't fully understand, but the increased rate of progress is worthwhile. For example, this approach should help me get some SQRRR accomplished more readily, as I can get more ideas out in the open and don't need to feel like an expert on everything I write about.

In my limited experience with Extreme Programming (XP) practices, I've felt they were a long-term benefit for myself and my teammates. Unfortunately, because of XP's deviation from the more standard programming practices that I was taught, the activities initially carried a certain weirdness and unapproachability about them. Tests up front? Two people writing a single piece of code? Broadcasting the concrete tasks you accomplished on a daily basis?

After shaking off the inevitable willies, I've found that those activities improve relationships between myself and other team members and help to solidify code understanding and emphasize maintainability. From what I've read, this is what the developers of XP were trying to help optimize: the productivity that results from accepting the social aspect of coding. It is strictly more useful to form good working relationships with humans than with rubber ducks.

A nice secondary effect from the social coding activity is an increased flow of institutional knowledge. Everybody knows little secrets about the corners of your code base or has figured out optimized workflows — somewhat obviously, interpersonal flow of info helps keep more people in the know. When it takes five to ten minutes to explain a code concept to someone, both parties start to get the feeling it should be documented somewhere.

It reads a bit dramatic, but this snippet from the XP website has been fairly accurate in my experience:

Extreme Programming improves a software project in five essential ways; communication, simplicity, feedback, respect, and courage. Extreme Programmers constantly communicate with their customers and fellow programmers. They keep their design simple and clean.

The cons that I've witnessed are some minor bikeshedding and the increased overhead that seems to accompany these practices.

On the other hand, I've also witnessed these costs get amortized away.

At Mozilla we seem to have a decent code review process down, which is one of my favorite social coding practices when it's done well. At the moment, my team doesn't seem too keen on some of the other practices I've found helpful, and it's certainly not something you should force. In any case, I'm happy to be the guy who talks about how great I've found these practices when the topic comes up until somebody comes around. ;-)

Notes from the JS pit: closure optimization

In anticipation of a much-delayed dentist appointment tomorrow morning and under the assumption that hard liquor removes plaque, I've produced [*] an entry in the spirit of Stevey's Drunken Blog Rants, s/wine/scotch/g. I apologize for any and all incomprehensibility, although Stevey may not mind since it's largely an entry about funargs, which he seems to have a thing for. (Not that I blame him — I'm thinking about them while drinking...) It also appears I may need to prove myself worthy of emigration to planet Mozilla, so hopefully an entry filled with funarg debauchery will serve that purpose as well.

Background

Lately, I've been doing a little work on closure optimization, as permitted by static analysis; i.e. the parser/compiler marks which functions can be optimized into various closure forms.

In a language that permits nested functions and functions as first-class values, there are a few things you need to ask about each function before you optimize it; the following sections walk through them.

Function escape (the funarg problem)

If a function can execute outside the scope in which it was lexically defined, it is said to be a "funarg", a fancy word for "potentially escaping outside the scope where it's defined". We call certain functions in the JS runtime Algol-like closures if they are immediately applied function expressions, like so:

function outer() {
    var x = 12;
    return (function cubeX() { return x * x * x; })();
}

The function cubeX can never execute outside the confines of outer — there's no way for the function definition to escape. It's as if you just took the expression x * x * x, wrapped it in a lambda (function expression), and immediately executed that expression. [†]

Apparently a lot of Algol programmers had the hots for this kinda thing — the whole function-wrapping thing was totally optional, but you chose to do it, Algol programmers, and we respect your choice.

You can optimize this case through static analysis. As long as there's no possibility of escape between a declaration and its use in a nested function, the nested function knows exactly how far to reach up the stack to retrieve/manipulate the variable — the activation record stack is totally determined at compile time. Because there's no escaping, there's not even any need to import the upvar into the Algol-like function.

Dijkstra's display optimization

To optimize this Algol-like closure case we used a construct called a "Dijkstra display" (or something named along those lines). You just keep an array of stack frame pointers, with each array slot representing the frame currently executing at that function nesting level. When outer is called in the above, outer's stack frame pointer would be placed in the display array at nesting level 0, so the array would look like:

Level 0: &outerStackFrame
Level 1: NULL
Level 2: NULL
...

Then, when cubeX is invoked, its stack frame pointer is placed at nesting level 1:

Level 0: &outerStackFrame
Level 1: &cubeXStackFrame
Level 2: NULL
...

At parse time, we tell cubeX that it can reach up to level 0, frame slot 0 to retrieve the jsval for x. [‡] Even if you have "parent" frame references in each stack frame, this array really helps when a function is reaching up many levels to retrieve an upvar, since you can do a single array lookup instead of an n-link parent-chain traversal. Note that this is only useful when you know the upvar-referring functions will never escape, because the display can only track stack frames for functions that are currently executing.

There's also the possibility that two functions at the same nesting level are executing simultaneously, e.g.:

function outer() {
    var x = 24;
    function innerFirst() { return x; }
    function innerSecond() {
        var x = 42;
        return innerFirst();
    }
    return innerSecond();
}

To deal with this case, each stack frame has a pointer to the "chained" display stack frame for that nesting level, which is restored when the executing function returns. To go through the motions, calling outer and then innerSecond gives:

Level 0: &outerStackFrame
Level 1: &innerSecondStackFrame
Level 2: NULL
...

innerSecond then activates innerFirst at the same static level (1), and innerFirst saves the pointer that it's clobbering in the display array:

Level 0: &outerStackFrame
Level 1: &innerFirstStackFrame (encapsulates &innerSecondStackFrame)
Level 2: NULL
...

Then, when innerFirst looks up the static levels for x, it gets the correct value; innerSecond's pointer is restored by a return-style bytecode when innerFirst finishes executing (which would be important if there were further function nesting in innerSecond). [§]
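Here's a toy model of that bookkeeping in JavaScript, rather than SpiderMonkey's actual implementation; the names are mine:

var display = []; // display[n] holds the frame executing at static level n.

function enterFrame(level, frame) {
    frame.savedDisplayEntry = display[level]; // Encapsulate any sibling's frame.
    display[level] = frame;
}

function leaveFrame(level, frame) { // Performed by return-style bytecodes.
    display[level] = frame.savedDisplayEntry;
}

function getUpvar(level, slot) { // Constant time; no parent-chain walk.
    return display[level].slots[slot];
}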

Okay, hopefully I've explained that well enough, because now I get to tell you that we've found this optimization to be fairly useless in SpiderMonkey experimental surveys and we hope to rip it out at some point. The interesting case that we actually care about (flat closures) is discussed in the second to last section.

Free variable references

Because JS is a lexically scoped language [¶] we can determine which enclosing scope a free variable is defined in. [#] If a function's free variables only refer to bindings in the global scope, then it doesn't need any information from the functions that enclose it. For such a function the set of free variables referring to enclosing function scopes is the null set, so we call it a null closure. Top-level functions are null closures. [♠]

function outer() {
    return function cube(x) { return x * x * x; }; // Null closure - no upvars.
}

Free variables are termed upvars, since they are identifiers that refer to variables in higher (enclosing) scopes. At parse time, when we're trying to find a declaration to match up with a use, they're called unresolved lexical dependencies. Though JavaScript scopes are less volatile — and, as some will undoubtedly point out, less flexible — I believe that the name upvar comes from this construct in Tcl, which lets you inject vars into and read vars from arbitrary scopes as determined by the runtime call stack: [♥]

set x 7

proc most_outer {} {
    proc outer {} {
        set x 24
        proc upvar_setter {level} {
            upvar $level x x
            set x 42
        }
        proc upvar_printer {level} {
            upvar $level x x
            puts $x
        }
        upvar_printer 1
        upvar_setter 1
        upvar_printer 1
        upvar_setter 2
        upvar_printer 2
        upvar_printer 3
        upvar_setter 3
        upvar_printer 3
    }
    outer
}
most_outer ;# Yields the numbers 24, 42, 42, 7, and 42.

Upvar redefinitions

If you know that the upvar is never redefined after the nested function is created, it is effectively immutable — similar to the effect of Java's partial closures in anonymous inner classes via the final keyword. In this case, you can create an optimized closure in a form we call a flat closure — if, during static analysis, you find that none of the upvars are redefined after the function definition, you can import the upvars into the closure, effectively copying the immutable jsvals into extra function slots.
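For illustration (my own example, not engine output): here the upvars a and b are assigned before inner is defined and never redefined afterward, so inner can be marked as a flat closure even though it escapes:

function makeAdder() {
    var a = 20;
    var b = 22;
    // a and b look immutable from inner's perspective, so their jsvals can
    // be copied into inner's extra function slots at definition time.
    function inner() { return a + b; }
    return inner; // Escapes, but needs no environment object.
}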

On the other hand, if variables in enclosing scopes are (re)defined after the function definition (and thus don't appear immutable to the function), a shared environment object has to be created so that nested functions can correctly see when the updates to the jsvals occur. Take the following example:

function outer() {
    var communicationChannel = 24;
    function innerGetter() {
        return communicationChannel;
    }
    function innerSetter() {
        communicationChannel = 42;
    }
    return [innerGetter, innerSetter];
}

Closing over references

In this case, outer must create an environment record outside of the stack so that when innerGetter and innerSetter escape on return, they can both communicate through the upvar. This is the nice encapsulation effect you can get through closure-by-reference, and it is often used in the JS "constructor pattern", like so:

function MooCow() {
    var hasBell = false;
    var noise = "Moo.";
    return {
        pontificate: function() { return hasBell ? noise + " <GONG!>" : noise; },
        giveBell: function() { hasBell = true; }
    };
}

It's interesting to note that all the languages I work with these days perform closure-by-reference, as opposed to closure-by-value. In contrast, closure-by-value would snapshot all identifiers in the enclosing scope, so later updates to bindings of immutable types (strings, numbers) would be invisible to the closure.

Sometimes, closure-by-reference can produce side effects that surprise developers, such as:

def surprise():
    funs = [lambda: x ** 2 for x in range(6)]
    assert funs[0]() == 25

This occurs because x is bound in function-local scope (the list comprehension variable leaks into it, at least in Python 2.x), and all the lambdas close over it by reference via the environment record of surprise. When x is mutated in later iterations of the list comprehension, every lambda sees the last value that x was assigned.

I can sympathize. In fact, I've written a program to do so:

var lambdas = [];
var condolences = ["You're totally right",
        "and I understand what you're coming from, but",
        "this is how closures work nowadays"];
for (var i = 0; i < condolences.length; i++) {
    var condolence = condolences[i];
    lambdas.push(function() { return condolence; });
}
print(lambdas[0]());

Keep in mind that var declarations are hoisted to function scope in JS, so condolence is a single binding shared by every lambda.
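If you actually want snapshot semantics in that example, the usual workaround (sketched here, reusing condolences and print from above) is to mint a fresh binding per iteration with an immediately applied function expression:

var lambdas = [];
for (var i = 0; i < condolences.length; i++) {
    lambdas.push((function(condolence) {
        // condolence is a per-call binding, so each lambda closes over a
        // distinct environment record.
        return function() { return condolence; };
    })(condolences[i]));
}
print(lambdas[0]()); // "You're totally right"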

I implore you to note that comments will most likely be received while I'm sober.

Footnotes

[*]

Vomited?

[†]

Cue complaints about the imperfect lambda abstraction in JavaScript. Dang Ruby kids, go play with your blocks! ;-)

[‡]

Roughly. Gory details left out for illustrative purposes.

[§]

There's also the case where the display array runs out of space. I believe we emit unoptimized name-lookups in this case, but I don't entirely recall.

[¶]

With a few insidious dynamic scoping constructs thrown in. I'll get to that in a later entry.

[#]

Barring enclosing with statements and injected eval scopes.

[♠]

Unless they contain an eval or with, in which case we call them "heavyweight" — though they still don't need information from enclosing functions, they must carry a stack of environment records, so they're not optimal. I love how many footnotes I make when I talk about the JavaScript language. ;-)

[♥]

As a result, it's extremely difficult to optimize accesses like these without whole-program analysis.