Archive for April, 2009

Monstrous polymorphism and a Python post-import hook decorator

I queue up a few thousand things to do before I get on an airplane: synchronize two-thousand Google Reader entries, load up a bunch of websites I've been meaning to read, and make sure for-fun projects are pulled from their most updated branches.

Then, once I get up in the air, I realize that I don't really want to do 90% of those things crammed into a seat with no elbow room. I end up doing one or two. Along with reading Stevey's Drunken Blog Rant: When Polymorphism Fails, this entry is all the productivity I can claim. The full code repository for this entry is online if you'd like to follow along.

Polymorphism Recap

The word "polymorphic" comes from Greek roots meaning "many shaped." (Or they lied to me in school — one of those.) From a worldly perspective I can see this meaning two things:

  • A single object can take on many shapes, or

  • Requirements for a general "shape" can be satisfied by different categories of objects.

As it turns out, both of these concepts apply to the Object-Oriented programming, but the canonical meaning is the latter. [*] As Yegge says:

If you have a bunch of similar objects [...], and they're all supposed to respond differently to some situation, then you add a virtual method to them and implement it differently for each object.

(If you don't know what a virtual method is, the Wikipedia page has an alternate explanation.)

Yegge's Example

Yegge demonstrates that strictly adhering to the principles of polymorphism does not always produce the best design:

Let's say you've got a big installed base of monsters. [...] Now let's say one of your users wants to come in and write a little OpinionatedElf monster. [...] Let's say the OpinionatedElf's sole purpose in life is to proclaim whether it likes other monsters or not. It sits on your shoulder, and whenever you run into, say, an Orc, it screams bloodthirstily: "I hate Orcs!!! Aaaaaargh!!!" (This, incidentally, is how I feel about C++.)

The polymorphic approach to this problem is simple: go through every one of your 150 monsters and add a doesMrOpinionatedElfHateYou() method.

This is a great counterexample — it induces an instant recognition of absurdity.

He then touches on the fact that dynamic languages allow you to do neat things consistent with polymorphism due to the flexibility of the object structure (which is typically just a hash map from identifiers to arbitrary object values):

I guess if you could somehow enumerate all the classes in the system, and check if they derive from Monster, then you could do this whole thing in a few lines of code. In Ruby, I bet you can... but only for the already-loaded classes. It doesn't work for classes still sitting on disk! You could solve that, but then there's the network...

This is clearly impractical, but I figured there was some exploratory value to implementing this challenge in Python. This entry is a small walk-through for code to detect interface conformity by inspection, enumerate the classes in the environment, manipulate classes in place, and add an import hook to manipulate classes loaded from future modules.

The Antagonist

Double entendre intended. :-)

class OpinionatedElf(object):

    is_liked_by_class_name = {
        'OpinionatedElf': True,
        'Orc': False,
        'Troll': False}

    def __init__(self, name):
        self.name = name

    def be_scary(self):
        print("I'm small, ugly, and don't like the cut of your jib!")

    def proclaim_penchance(self, other):
        if not IMonster.is_conforming(other):
            print("I can't even tell what that is!")
            return
        is_liked = other.is_liked_by_elf()
        class_name = other.__class__.__name__
        if is_liked is None:
            print("I'm not sure how I feel about %s" % class_name)
            return
        emotion = 'love' if is_liked else 'hate'
        print('I %s %s!!! Aaaaaargh!!!' % (emotion, other.__class__.__name__))

Determining which Classes are Monsters

First of all, Python doesn't require (nor does it encourage) a rigid type hierarchy. Python's all about the interfaces, which are often implicit. Step one is to create a way to recognize classes that implement the monster interface:

    required_methods = ['be_scary']

    def be_scary(self):
        raise NotImplementedError

    @classmethod
    def is_conforming(cls, object):
        result = all(callable(getattr(object, attr_name, None))
            for attr_name in cls.required_methods)
        logging.debug('%s conforms? %s', object, result)
        return result

assert IMonster.is_conforming(IMonster)

This is a simple little class — there are better third party libraries to use if you want real interface functionality (i.e. more generic conformity testing and Design By Contract).

Enumerating the Classes in the Environment

All of the modules that have been loaded into the Python environment are placed into sys.modules. By inspecting each of these modules, we can manipulate the classes contained inside if they conform to our monster interface.

for name, module in sys.modules.iteritems():
    extend_monsters(module)

The extend_monsters function is a bit nuanced because immutable modules also live in sys.modules. We skip those, along with abstract base classes, which have trouble with inspect.getmembers():

def extend_monsters(module, extension_tag='_opinionated_extended'):
    """Extend monsters in the module's top-level namespace to
    tell if they are liked by the :class:`OpinionatedElf`.
    and tag it with the :param:`extension_tag` as a flag name.
    Do not attempt to extend already-flagged modules.
    Do not clobber existing methods with the extension method name.

    Warning: swallows exceptional cases where :param:`module`
        is builtin, frozen, or None.
    """
    name = module.__name__ if module else None
    logging.info('Instrumenting module %s', name)
    if not module or imp.is_builtin(name) or imp.is_frozen(name) \
            or getattr(module, extension_tag, False):
        logging.info('Skipping module: %s', name)
        return
    module._opinionated_instrumented = True
    classes = inspect.getmembers(module, inspect.isclass)
    for name, cls in classes:
        logging.debug('%s: %s', name, cls)
        try:
            conforming = IMonster.is_conforming(cls)
        except AttributeError, e:
            if '__abstractmethods__' in str(e): # Abstract class.
                continue
            raise
        if not conforming:
            continue
        class_name = cls.__name__
        logging.debug('Instrumenting class %s', class_name)
        attr_name = 'is_liked_by_elf'
        if hasattr(cls, attr_name): # Don't clobber existing methods.
            logging.warn('Method already exists: %s', cls)
            continue
        logging.info('Setting %s on %s', attr_name, class_name)
        setattr(cls, attr_name,
            lambda self: OpinionatedElf.is_liked_by_class_name.get(
                self.__class__.__name__, None))

If we were going to be thorough, we would recurse on the members of the class to see if the class scope was enclosing any more IMonster classes, but you're never really going to find them all: if a module defines a monster class in a function-local scope, there's no good way to get the local class statement and modify it through inspection.

In any case, we're at the point where we can modify all monsters in the top-level namespace of already-loaded modules. What about modules that we have yet to load?

Post-import Hook

There is no standard post-import hook (that I know of) in Python. PEP 369 looks promising, but I couldn't find any record of additional work being done on it. The current import hooks, described in PEP 302, are all pre-import hooks. As such, you have to decorate the __import__ builtin, wrapping the original with your intended post-input functionality, like so: [†]

def import_decorator(old_import, post_processor):
    """
    :param old_import: The import function to decorate, most likely
        ``__builtin__.__import__``.
    :param post_processor: Function of the form
        `post_processor(module) -> module`.
    :return: A new import function, most likely to be assigned to
        ``__builtin__.__import__``.
    """
    assert all(callable(fun) for fun in (old_import, post_processor))
    def new_import(``*args``, ``**kwargs``):
        module = old_import(\*args, \*\*kwargs)
        return post_processor(module)
    return new_import

After which we can replace the old __import__ with its decorated counterpart:

__builtin__.__import__ = import_decorator(__builtin__.__import__,
    extend_monsters)

The Network

Yegge brings up the issue of dynamically generated classes by mentioning network communications, calling to mind examples such as Java's RMI and CORBA. This is a scary place to go, even just conceptualizing. If metaclasses are used, I don't see any difficulty in decorating __new__ with the same kind of inspection we employed above; however, code generation presents potentially insurmountable problems.

Decorating the eval family of functions to modify new classes created seems possible, but it would be challenging and requires additional research on my part. exec is a keyword/statement, which I would think is a hopeless cause.

Footnotes

[*]

In accordance with the former, an object can implement many interfaces.

[†]

This function is actually a generic decorate-with-post-processing closure, but I added the references to import for more explicit documentation.

Eliminating web service dependencies with a language-specific abstraction barrier

Hyperbolic analogy: Saying, "You shouldn't need to wrap the web service interface, because it already provides an API," is like saying, "You shouldn't need different programming languages, because they're all Turing complete."

Web services tend to deliver raw data payloads from a flat interface and thus lack the usability of native language APIs. Inevitably, when you program RPC-like interfaces for no language in particular, you incur incompatibilities for every particular language's best practices, idioms, and data models. [*] The issue of appropriately representing exceptions and/or error codes in RPC-like services is a notorious example of this.

There are additional specification mechanisms like WSDL [†] that allow us to make the payloads more object-like. Additional structure is indicated through the use of user-defined "complex types," but this only gets you part of the way to a usable API for any given language. In Python, it's a lot more sensible to perform an operation like in the following abstraction:

from internal_tracker.service import InternalTracker
bug_serivce = InternalTracker(username=getpass.getuser(),
    password=getpass.getpass())
bug = bug_service.get_bug(123456)
bug.actionable.add('Chris Leary') # may raise ReadOnlyException
comment = Comment(text='Adding self to actionable')
bug.arbs.add_comment(comment)
bug.save() # may raise instanceof ServiceWriteException

Than to use an external web service API solution directly (despite using the excellent Suds library):

# Boilerplate
client = suds.client.Client(wsdl=wsdl_uri)
security = suds.wsse.Security()
security.tokens.append(UsernameToken(getpass.getuser(),
    getpass.getpass()))
client.set_options(wsse=security)
internal_tracker_service = client.service

# Usage
service_bug = internal_tracker_service.GetBug(123456)
service_bug.actionable += ', Chris Leary'
# Do we check the response for all WebFault exceptions?
# (Do we check for and handle all the possible transport issues?)
internal_tracker_service.UpdateBug(service_bug)
Comment = internal_tracker_service.factory['Comment']
comment = Comment()
comment.BugId = service_bug.Id
comment.Text = 'Adding self to actionable'
# Again, what should we check?
internal_tracker_service.AddComment(comment)

Why is it good to have the layer of indirection?

Lemma 1: The former example actually reads like Python code. It raises problem-domain-relevant exceptions, uses keyword arguments appropriately, follows language naming conventions, and uses sensible language-specific data types that may be poorly represented in the web service. For example, actionable may be a big comma-delimited string according to the service, whereas it should clearly be modeled as a set of (unique) names, using Python's set data type. Another example is BigIntegers being poorly represented as strings in order to keep the API language-neutral.

Lemma 2: The layer represents an extremely maintainable abstraction barrier between the client and the backing service. Should a team using the abstraction decide it's prudent to switch to, say, Bugzilla, I would have no trouble writing a port for the backing service in which all client code would continue to work. Another example is a scenario in which we determine that the transport is unreliable for some reason, so decide all requests should be retried three times instead of one. [‡] How many places will I need to make changes? How many client code bases do I potentially need to keep track of?

Why is it risky to use the web service interface?

If the web service API represents the problem domain correctly with constructs that make sense for your language, it's fine to use directly. (As long as you're confident you won't have transport-layer issues.) If you're near-certain that the backing service will not change, and/or you're willing to risk all the client code that will depend on that API directly being instantaneously broken, it's fine. The trouble occurs when one of these is not the case.

Let's say that the backing service does change to Bugzilla. Chances are that hacking in adapter classes for the new service would be a horrible upgrade experience that entails:

  1. Repeated discovery of leaky abstractions,

  2. Greater propensity to bugs, [§] and

  3. More difficult maintenance going forward.

Client code that is tightly coupled to the service API would force a rewrite in order to avoid these issues.

Pragmatic Programming says to rely on reliable things, which is a rule that any reasonable person will agree with. [¶] The abstraction barrier is reliable in its loose coupling (direct modeling of the problem domain), whereas direct use of the web service API could force a reliance on quirky external service facts, perhaps deep into client code.

Is there room for compromise?

This is the point in the discussion where we think something along the lines of, "Well, I can just fix the quirky things with a bunch of shims between my code and the service itself." At that point, I contend, you're really just implementing a half-baked version of the language-specific API. It's better to make the abstractions appropriate for the target language and problem domain the first time around than by incrementally adding shims and hoping client code didn't use the underlying quirks before you got to them. Heck, if the web service is extremely well suited to your language, you'll end up proxying most of the time anyway, and the development will be relatively effortless. [#]

What about speed of deployment?

If we have language-specific APIs, won't there be additional delay waiting for it to update when additional capabilities are added to the backing service?

First of all, if the new capability is not within the problem domain of the library, it should be a separate API. This is the single responsibility principle applied to interfaces — you should be programming to an interface abstraction. Just because a backing service has a hodgepodge of responsibilities doesn't mean that our language-specific API should as well. In fact, it probably shouldn't. Let's assume it is in the problem domain.

If the functionality is sane and ready for use in the target language, it should be really simple for the library owner to extend the language-specific API. In fact, if you're using the proxy pattern, you may not have to do anything at all. Let's assume that the functionality is quirky and you're blocked waiting for the library owner to update with the language-specific shim, because it's non-trivial.

Now our solution tends to vary based on the language. Languages like Python have what's known as "gentlemen's privacy", based on the notion of a gentlemen's agreement. Privacy constraints are not enforced at compile-time and/or run-time, so you can just reach through the abstraction barrier if you believe you know what you're doing. Yes, you're making an informed decision to violate encapsulation. Cases like this are exactly when it comes in handy.

assert not hasattr(bug_service, 'super_new_method_we_need')
# HACK: Violate abstraction -- we need this new capability right now
# and Billy-Bo, the library owner, is swamped!
suds_client = bug_service._suds_client
result = suds_client.SuperNewMethodWeNeed()
target_result = de_quirkify(result)

As you can see, we end up implementing the method de_quirkify to de-quirk the quirky web service result into a more language-specific data model — it's bad form to make the code dependent on the web service's quirky output form. We then submit our code for this method to the library owner and suggest that they use it as a basis for their implementation, so that a) they can get it done faster, and b) we can seamlessly factor the hack out.

For privacy-enforcing languages, you would need to expose a public API for getting at the private service, then tell people not to use it unless they know what they're doing. As you can tell, you pretty much wind up with gentlemen's privacy on that interface, anyway.

Footnotes

[*]

In EE we call this kind of phenomenon an impedance mismatch, which results in power loss.

[†]

And many others, with few successful ones.

[‡]

Or maybe we want to switch from HTTP transport to SMTP. Yes, this is actually possible. :-)

[§]

In duplicating exact web service behaviors — you have to be backwards compatible with whichever behaviors the existing client code relies on.

[¶]

It's a syntactic tautology. The caveat is that reasonable people will almost certainly quibble over the classification of what's reliable.

[#]

At least in Python or other languages with a good degree of dynamism, it will.

Bit twiddling: Simple O(1) membership test

Disclaimer

Bit twiddling is fun. Plus, it has several advantages:

  • It gives you a greater understanding of the nuances of the (C) language. As we all know, learning C is important. If you're a computer engineer and have weak C-fu, then it is time, grasshopper.

  • Bit twiddling is important to know if you're ever going to write software in the field of computer engineering. Without a good understanding of how to slice-and-dice primitive data types, your simulations will be significantly more prone to error. Trust me, you don't want to use gdb for hours only to find that your shift value was off by one. Not that I've ever done that.

  • It makes you feel smart, like when you're the only person who knows the secret to your party trick. I don't know what that's like personally, of course, because I spend all my party time bit bashing, but I hear it's similar.

You have to understand, though, that clever tricks without appropriate documentation will make people want to break your face. [*] Always bit bash responsibly: appoint a designated code-reader to make sure you're clear enough, and leave your keys at the door.

The Problem

Let's say you wanted to know whether a number was a valid PCI Express link width in terms of number of lanes. We know that valid widths are x1, x2, x4, x8, x12, x16, or x32, and want to construct a function of the following form:

#include <stdbool.h>
#include <stdint.h>
#include <assert.h>

/**
 * :return: Whether the lane count is valid.
 */
bool is_valid_link_width(uint8_t lane_count);

/**
 * Unit test for ``is_valid_link_width``.
 */
int main(int argc, char **argv) {
    assert(!is_valid_link_width(0));
    assert(is_valid_link_width(1));
    assert(is_valid_link_width(2));
    assert(!is_valid_link_width(3));
    assert(is_valid_link_width(32));
    assert(!is_valid_link_width(33));
    assert(!is_valid_link_width(33));
    assert(!is_valid_link_width(0xff));
    return 0;
}

Note that the uint8_t has a width of exactly 8 bits. [†]

How would you write it?

Less Interesting Solution

If you were thinking switch statement, that will work. You could use a switch statement with intentional fall-throughs and hope that the compiler optimizes a branch table for you. (For values this small and dense it probably will, as mentioned in the referenced article.) If the compiler doesn't write the branch table for you, but instead generates the equivalent of a big if/else if ladder, your solution doesn't satisfy the O(1) constraint: in that case, the worst case control flow hits every rung of the ladder (the else if guards), making it O(n).

bool is_valid_link_width(uint8_t lane_count) {
    switch (lane_count) {
    case 1:
    case 2:
    case 4:
    case 8:
    case 12:
    case 16:
    case 32:
        return true;
    }
    return false;
}

An implementation that I like better, which doesn't put as much faith in the compiler, is as follows:

bool is_valid_link_width(uint8_t lane_count) {
    return 0x100011116ULL & (1ULL << lane_count);
}

How cool is that?

The Neat Trick

The clever insight here is that we can encode all of our target "true" values in binary form, like so:

       32                      16    12    8     4    1
0b__0001__0000__0000__0000__0001__0001__0001__0001__0110

Now, if we were to take a 1 value and move it over a number of binary slots equal to the lane count, it will line up with a 1 value in this long binary number we've constructed. Take the bitwise-AND of those two values, and we wind up with:

  • A non-zero (true) value if they line up, OR

  • A zero value if they don't

This is exactly what we were looking for.

This long binary number we've created must be converted from binary into a hexadecimal value, so that we can represent it as an integer literal in our C program. Encoding each binary 4-tuple into hex from right to left, we get the value 0x100011116.

There's an issue with this value, however. Unless we specify a suffix for our integer literal, the compiler is allowed to truncate the value to its native word size, [‡] which would cause serious problems. For x86 systems with 16 bit words, our value could be truncated to 0x1116, which would only allow lane sizes of 1, 2, 4, 8, and 12 — the allowed values of 16 and 32 would be cut off!

To solve this, as you can see in the function definition, we add the ULL integer suffix, which explicitly marks the integer literal as an unsigned long long. (The long long integer data type was added to the C language in the C99 standard.) This data type is required to be at least 64 bits wide, so it can definitely hold our 33 relevant bits (32 plus the zero bit which is there for the 1ULL << 0 case). The long data type is too small to hold this value, as long can potentially be only 32 bits wide per the C standard (and it is 32 bits wide on most modern machines).

Readability Counts

Note that there's a more readable version of the same trick in the following:

bool is_valid_link_width(uint8_t lane_count) {
    const uint64_t set =
        1ULL << 1
        | 1ULL << 2
        | 1ULL << 4
        | 1ULL << 8
        | 1ULL << 12
        | 1ULL << 16
        | 1ULL << 32;
    return set & (1ULL << lane_count);
}

Here we make the construction of the big integer more explicit and make the code less prone to our errors in encoding the literal binary value into hex. Any compiler worth its salt will fold out the const calculation at compile time, so no overhead will be incurred for writing it this way.

I demonstrated the other way of doing it first to: a) blow your mind a little bit, and b) demonstrate an idiom you might see in other people's (overly) clever code. Now there's a chance you can recognize and decipher it without adequate documentation. Huzzah!

Footnotes

[*]

A great rift in the universe known as "Other People's Perl" has been the cause of 80% of the computer-engineering face breakage since 1987. Don't let this tragedy happen to you or your beloved programmers.

[†]

I like using fixed-width integer types, especially in describing problems, because they helpfully constrain the possibilities on the input domain. This is even more important for newcomers who are just wrapping their heads around bit twiddling.

[‡]

I can't find standards documents/discussions to support this claim, but it's definitely what I was taught. Can anybody provide evidence to confirm/deny?

Generators and resource aquisition/release

One of the neatest things about language lawyers is that they have a keen eye for features of a language that may conflict with each other to produce fail. I, on the other hand, find it fun to stumble around in various languages and analyze interesting cases as I encounter them.

Generators in Python were a subset of a more general concept of coroutines. Generators are an elegant and concise way to write reasonably sized state machines. For that reason, you'll seem them heavily associated with iterators (which are more sytaxerific [*] to write in a language without generators, like Java).

I used to envision generators as little stack frames that were detached from the call stack and placed somewhere in outer space, eating moon cheese and playing with the Django pony, where they lived happily ever after. Surprisingly, that concept didn't match up with reality too well.

PEP 342: Coroutines via Enhanced Generators and PEP 325: Resource-Release Support for Generators are the language lawyer smack-down of my naive view. We used to be unable to perform proper resource acquisition within generators; notably, you couldn't yield from the try suite of a try/finally block, because the only way to guarantee resource release in the finally block was to step the generator until a StopIteration exception:

Restriction: A yield statement is not allowed in the try clause of a try/finally construct. The difficulty is that there's no guarantee the generator will ever be resumed, hence no guarantee that the finally block will ever get executed; that's too much a violation of finally's purpose to bear.

  • PEP 255 — Specification: Yield

from threading import Lock

lock = Lock()

def gen():
    try:
        lock.acquire()
        yield 'Acquired!'
    finally:
        lock.release()

if __name__ == '__main__':
    g = gen()
    print g.next()

We see the addition of this capability in Python 2.5:

$ python2.4 poc.py
  File "poc.py", line 8
    yield 'Acquired!'
SyntaxError: 'yield' not allowed in a 'try' block with a 'finally' clause

$ python2.5 poc.py
Acquired!

Before Python 2.5 there was no way to tell the generator to die and give up its resources. As PEP 342 describes, Python 2.5 turns generators into simple coroutines, which we can force to release its resources [†] when necessary via the close method:

>>> import poc
>>> g = poc.gen()
>>> h = poc.gen()
>>> g.next()
'Acquired!'
>>> g.close() # Force it to release the resource, or we deadlock.
>>> h.next()
'Acquired!'

SimPy

If you're wondering how I came across this combination in day-to-day Python programming, it was largely due to SimPy. I was writing a PCI bus simulation [‡] for fun, to help get a grasp of the SimPy constructs and how they might affect normal object oriented design. [§] I wanted to "acquire" a bus grant, so I analyzed the applicability of with for this resource acquisition.

I went to Stack Overflow and submitted a "feeler" question to see if there was some conventional Python wisdom I was lacking: Is it safe to yield from within a "with" block in Python (and why)?. The concept seemed relatively new to those in the discussion; however, the responses are still insightful.

The Lesson

This experience has demonstrated to me there are two modes of thinking when it comes to Python generators: short-lived and long-lived.

Typical, pre-Python 2.5 generator usage, where generators are really used like generators, lets you glaze over the difference between a regular function and a generator. Really, all that you want to do with this kind of construct is get some values to be used right now. You're not doing anything super-fancy in the generator — it's just nicer syntax to have all of your local variables automatically saved in the generator function than doing it manually in an independent object.

Fancy, SimPy co-routine usage, where generators are managed as coroutines by a central dispatcher, makes a generator take on some more serious object-like semantics. Shared-resource acquisition across coroutine yields should scare you, at least as much as objects that acquire shared resources without releasing them right away. [¶] Perhaps more, seeing as how you're lulled into a state of confidence by understanding short-lived Python generator behaviors.

Footnotes

[*]

This word was invented to make me seem less biased against Java. Oh, also, even more props to Barbara Liskov, (Turing Award winner) for the impetus of generator-based iterators in the CLU language.

[†]

We can do other things with the new capabilities, like feed values back into the generators:

def gen():
    feedback = (yield 'First')
    yield feedback

if __name__ == '__main__':
    g = gen()
    assert g.next() == 'First'
    assert g.send('Test') == 'Test'
[‡]

The original PCI bus is approximately the "Hello, Word!" of platform architecture, so far as I can tell.

[§]

I still haven't gotten solid good grasp of the design methodology changes. If you want one generator to block until the success/failure of another subroutine, then you have to sleep and trigger wake events with the possibility of interrupts. Can you tell I've never used a language with continuations before? ;-)

[¶]

Deadlocking on mutually exclusive resources is easy with a cooperatively multitasking dispatcher: one entity (coroutine) is holding the resource and yields, dispatcher picks another one that wants that same resource, performs a non-blocking acquisition, and then you have circular wait with no preemption == deadlock.