Idiomatic Python refactoring: for-else, “in” (contains) operator

I was perusing the App Engine SDK and I came across this snippet:

if self.choices:
  match = False
  for choice in self.choices:
    if choice == value:
      match = True
  if not match:
    raise BadValueError('Property %s is %r; must be one of %r' %
                        (self.name, value, self.choices))

Since I don’t work with many other Python programmers, I always have trouble figuring out what interesting tidbits would be useful to post in, say, a blog entry. I don’t have a good understanding of the popular knowledge level, but I figure that I can’t go too wrong refactoring code written by Google engineers (who I naively assume are all as cool as Steve Yegge). [*]

The for-else statement

Let’s forget about self for now [†] and refactor to use an obscure (but useful) Python feature, the for-else construct. for-else removes the necessity for the boolean-flag-state idiom from the original code, which is often used in lower level languages. [‡]

if choices:
    for choice in choices:
        if choice == value:
            break
    else:
        raise BadValueError

The for-else statement looks a little strange when you first encounter it, but I’ve come to love it. The else suite is evaluated if you don’t break out of the for loop. In this case, if we didn’t break out of the for loop, then we never found a value equivalent to choice.

We also gain some efficiency over the original by using the break statement as soon as we find a match: there’s no need to keep looking if you’ve already found a result! This can save you from iterating over all len(choices) items if you find it’s a valid choice in the first iteration.

in (contains) operator

Here is an even more readable and Python-like refactoring that uses the in operator: [§]

if choices and value not in choices:
    raise BadValueError

The in operator works on any iterable object and performs the same behavior as the code above: it looks for any item within self.choices such that choice == item. If it finds it early in the list, it won’t keep looking. This is similar behavior to our early break statement from the first refactoring.

Just like the original code with the for loop, the in operator raises a TypeError if choices is not iterable. The in operator is effectively a drop-in replacement for the (more verbose) for loop when it comes to membership testing.

Footnotes

[*] You should read his blog if you don’t already.
[†] For the language lawyers: we’re forgetting about the fact that this code was intended to be executed in a bound instance method. ;)
[‡] For example, C. For more information on programming languages and their "heights", see this Wikipedia entry.
[§] Yeah, yeah… technically it’s the not in operator.

Tags: , ,

9 Responses to “Idiomatic Python refactoring: for-else, “in” (contains) operator”

  1. Steven Says:

    Nice refactoring :)

    I think the original snippet of code comes from the C/C++ mentality (e.g. those who first started programming in C). I still find myself writing code like that, even after being exposed to all kinds of syntactic goodies, because it’s the first thing that comes to mind…

    Nice tip on for..else as well, I’ve never used it before but maybe I’ll give it a try now :)

  2. David Buxton Says:

    Why does one need to test for choices as well as value not in choices? Wouldn’t

    if value not in choices:
    raise BadValueError

    be sufficient? Or is choices set to a non-iterable when it is empty?

  3. Christopher Leary Says:

    @David: In the original code, if choices is anything that evaluates to False, the lookup isn’t performed. For example, the lookup would have been bypassed if choices were False, None, or even an empty iterable! If choices were None, your snippet would have raised a TypeError, which doesn’t follow the semantics of the original.

    You could do something like:

    try:
        if value not in choices:
            raise BadValueError
    except TypeError: # Non-iterables.
        pass

    But this also changes the behavior from the original in a subtle way. Empty iterables (like []) will raise a BadValueError here, but would have evaluated to False in the original “if choices” and bypassed the value check altogether.

  4. Dave Kirby Says:

    how about:

    if  value not in (choices or []):
        raise BadValueError

    If choices is None/False/0 etc then (choices or []) will evaluate to an empty list. The expression needs to be in brackets since ‘in’ binds more tightly than ‘or’.

    Admin: Fixed code syntax — correct markup is “pre lang=’python’”.

  5. Christopher Leary Says:

    @Dave Kirby: value in [] is always False, since an empty list contains no values. Thus, value not in [] is always True. Therefore, your snippet will raise a BadValueError if choices evaluates to False, whereas the original snippet will bypass the check altogether if choices evaluates to False.

  6. G Says:

    The note about the forelse being more efficient over break is moot. You could of used break in the first code too

  7. Christopher Leary Says:

    @G: Agreed. If you read carefully you’ll see that the statement was unrelated to the usage of for-else:

    “We also gain some efficiency over the original by using the break statement as soon as we find a match”

    Thanks for pointing it out more explicitly, though!

  8. rgz Says:

    Holy cow! That not just Pythonic, it reads as ok English.
    “If [there're] choices and value not in choices, raise BadValueError.”

  9. Christopher Leary Says:

    @rgz: Yeah, it’s pretty neat that way, but it’s only coincidental. It’s important not to get too bogged down in reads-like-English-ness, or you end up with overly complicated things like Perl’s natural language constructs (i.e. postfixed if/unless statements) or a strange emphasis on English-like readability, like in this COBOL example on Stack Overflow.

Leave a Reply