Archive for the ‘Python’ Category

Efficiency of list comprehensions

Friday, April 30th, 2010

I’m psyched about the awesome comments on my previous entry, Python by example: list comprehensions. Originally this entry was just a response to those comments, but people who stumbled across this entry on the interwebz found the response format too confusing, so I’ve restructured it for posterity.

Efficiency of the more common usage

Let’s look at the efficiency of list comprehensions in the more common usage, where the comprehension’s list result is actually relevant (or, in compiler-speak, live-out).

Using the following program, you can see the time spent in each implementation and the corresponding bytecode sequence:

import dis
import timeit
 
 
programs = dict(
    loop="""
result = []
for i in range(20):
    result.append(i * 2)
""",
    loop_faster="""
result = []
add = result.append
for i in range(20):
    add(i * 2)
""",
    comprehension='result = [i * 2 for i in range(20)]',
)
 
 
for name, text in programs.iteritems():
    print name, timeit.Timer(stmt=text).timeit()
    code = compile(text, '<string>', 'exec')
    dis.disassemble(code)
loop 11.1495118141
  2           0 BUILD_LIST               0
              3 STORE_NAME               0 (result)
 
  3           6 SETUP_LOOP              37 (to 46)
              9 LOAD_NAME                1 (range)
             12 LOAD_CONST               0 (20)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                23 (to 45)
             22 STORE_NAME               2 (i)
 
  4          25 LOAD_NAME                0 (result)
             28 LOAD_ATTR                3 (append)
             31 LOAD_NAME                2 (i)
             34 LOAD_CONST               1 (2)
             37 BINARY_MULTIPLY
             38 CALL_FUNCTION            1
             41 POP_TOP
             42 JUMP_ABSOLUTE           19
        >>   45 POP_BLOCK
        >>   46 LOAD_CONST               2 (None)
             49 RETURN_VALUE
loop_faster 8.36096310616
  2           0 BUILD_LIST               0
              3 STORE_NAME               0 (result)
 
  3           6 LOAD_NAME                0 (result)
              9 LOAD_ATTR                1 (append)
             12 STORE_NAME               2 (add)
 
  4          15 SETUP_LOOP              34 (to 52)
             18 LOAD_NAME                3 (range)
             21 LOAD_CONST               0 (20)
             24 CALL_FUNCTION            1
             27 GET_ITER
        >>   28 FOR_ITER                20 (to 51)
             31 STORE_NAME               4 (i)
 
  5          34 LOAD_NAME                2 (add)
             37 LOAD_NAME                4 (i)
             40 LOAD_CONST               1 (2)
             43 BINARY_MULTIPLY
             44 CALL_FUNCTION            1
             47 POP_TOP
             48 JUMP_ABSOLUTE           28
        >>   51 POP_BLOCK
        >>   52 LOAD_CONST               2 (None)
             55 RETURN_VALUE
comprehension 7.08145213127
  1           0 BUILD_LIST               0
              3 DUP_TOP
              4 STORE_NAME               0 (_[1])
              7 LOAD_NAME                1 (range)
             10 LOAD_CONST               0 (20)
             13 CALL_FUNCTION            1
             16 GET_ITER
        >>   17 FOR_ITER                17 (to 37)
             20 STORE_NAME               2 (i)
             23 LOAD_NAME                0 (_[1])
             26 LOAD_NAME                2 (i)
             29 LOAD_CONST               1 (2)
             32 BINARY_MULTIPLY
             33 LIST_APPEND
             34 JUMP_ABSOLUTE           17
        >>   37 DELETE_NAME              0 (_[1])
             40 STORE_NAME               3 (result)
             43 LOAD_CONST               2 (None)
             46 RETURN_VALUE

List comprehensions perform better here because you don’t need to load the append attribute off of the list (loop program, bytecode 28) and call it as a function (loop program, bytecode 38). Instead, in a comprehension, a specialized LIST_APPEND bytecode is generated for a fast append onto the result list (comprehension program, bytecode 33).

In the loop_faster program, you avoid the overhead of the append attribute lookup by hoisting it out of the loop and binding the result to a name (bytecode 9-12), so it loops more quickly. (At module level that name is stored with STORE_NAME; inside a function it would be a fastlocal.) However, the comprehension uses a specialized LIST_APPEND bytecode instead of incurring the overhead of a function call, so it still comes out ahead.
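For anyone who wants to repeat the measurement on a newer interpreter, here is a minimal sketch of the same experiment that should run under Python 3. The helper names time_programs and run_program, and the smaller default iteration count, are my own choices for illustration and not part of the original script. Opcode names and absolute timings differ between CPython versions, so don't expect the listings above to match byte for byte.

```python
import dis
import timeit

PROGRAMS = {
    "loop": """
result = []
for i in range(20):
    result.append(i * 2)
""",
    "loop_faster": """
result = []
add = result.append
for i in range(20):
    add(i * 2)
""",
    "comprehension": "result = [i * 2 for i in range(20)]",
}


def time_programs(number=10000):
    """Return {name: seconds} for running each program `number` times."""
    return {
        name: timeit.Timer(stmt=text).timeit(number=number)
        for name, text in PROGRAMS.items()
    }


def run_program(name):
    """Execute one program in a fresh namespace and return its `result` list."""
    namespace = {}
    exec(compile(PROGRAMS[name], "<string>", "exec"), namespace)
    return namespace["result"]


if __name__ == "__main__":
    for name, seconds in sorted(time_programs().items()):
        print("%s %.4f" % (name, seconds))
        dis.dis(compile(PROGRAMS[name], "<string>", "exec"))
```

run_program is only there as a sanity check that all three programs build the same list before you compare their timings.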

Using list comprehensions for side effects

I want to address a point that was brought up in the previous entry as to the efficiency of for loops versus list comprehensions when used purely for side effects, but I’ll discuss the subjective bit first, since that’s the least sciency part.

Readability

Simple test – if you did need the result would the comprehension be easily understood? If the answer is yes then removing the assignment on the left hand side doesn’t magically make it less readable…

Michael Foord

First of all, thanks to Michael for his excellent and thought provoking comment!

My response is that removing the use of the result does indeed make it less readable, precisely because you’re using a result-producing control flow construct where the result is not needed. I suppose I’m positing that it’s inherently confusing to do that with your syntax: there’s a looping form that doesn’t produce a result, so that should be used instead. It’s expressing your semantic intention via syntax.

For advanced Pythonistas it’s easy to figure out what’s going on at a glance, but comprehension-as-loop definitely has a "there’s more than one way to do it" smell about it, which also makes it less amenable to people learning the language.

With a viable comprehension-as-loop option, every time a user goes to write a loop that doesn’t require a result they now ask themselves, "Can I fit this into the list comprehension form?" Those mental branches are, to me, what "one way to do it" is designed to avoid. When I read Perl code, I take "mental exceptions" all the time because the author didn’t use the construct that I would have used in the same situation. Minimizing that is a good thing, so I maintain that "no result needed" should automatically imply a loop construct.

Efficiency

Consider two functions, comprehension and loop:

def loop():
    accum = []
    for i in range(20):
        accum.append(i)
    return accum
 
 
def comprehension():
    accum = []
    [accum.append(i) for i in range(20)]
    return accum

N.B. This example is comparing the efficiency of a list comprehension where the result of the comprehension is ignored to a for loop that produces no result, as is discussed in the referenced entry, Python by example: list comprehensions.

Michael Foord comments:

Your alternative for the single line, easily readable, list comprehension is four lines that are less efficient because the loop happens in the interpreter rather than in C.

However, the disassembly, obtained via dis.dis(func), looks like the following for the loop:

2           0 BUILD_LIST               0
            3 STORE_FAST               0 (accum)
 
3           6 SETUP_LOOP              33 (to 42)
            9 LOAD_GLOBAL              0 (range)
           12 LOAD_CONST               1 (20)
           15 CALL_FUNCTION            1
           18 GET_ITER
      >>   19 FOR_ITER                19 (to 41)
           22 STORE_FAST               1 (i)
 
4          25 LOAD_FAST                0 (accum)
           28 LOAD_ATTR                1 (append)
           31 LOAD_FAST                1 (i)
           34 CALL_FUNCTION            1
           37 POP_TOP
           38 JUMP_ABSOLUTE           19
      >>   41 POP_BLOCK
 
5     >>   42 LOAD_FAST                0 (accum)
           45 RETURN_VALUE

And it looks like the following for the comprehension:

2           0 BUILD_LIST               0
            3 STORE_FAST               0 (accum)
 
3           6 BUILD_LIST               0
            9 DUP_TOP
           10 STORE_FAST               1 (_[1])
           13 LOAD_GLOBAL              0 (range)
           16 LOAD_CONST               1 (20)
           19 CALL_FUNCTION            1
           22 GET_ITER
      >>   23 FOR_ITER                22 (to 48)
           26 STORE_FAST               2 (i)
           29 LOAD_FAST                1 (_[1])
           32 LOAD_FAST                0 (accum)
           35 LOAD_ATTR                1 (append)
           38 LOAD_FAST                2 (i)
           41 CALL_FUNCTION            1
           44 LIST_APPEND
           45 JUMP_ABSOLUTE           23
      >>   48 DELETE_FAST              1 (_[1])
           51 POP_TOP
 
4          52 LOAD_FAST                0 (accum)
           55 RETURN_VALUE

By looking at the bytecode instructions, we see that the list comprehension is, at a language level, actually just "syntactic sugar" for the for loop, as mentioned by nes — they both lower down into the same control flow construct at a virtual machine level, at least in CPython.

The primary difference between the two disassemblies is that a superfluous list comprehension result is stored into fastlocal 1, which is loaded (bytecode 29) and appended to (bytecode 44) each iteration, creating some additional overhead, only to be deleted in bytecode 48. Unless the SETUP_LOOP and POP_BLOCK operations (bytecodes 6 and 41) of the loop disassembly are very expensive (I haven’t looked into their implementation), the comprehension should be the less efficient of the two.
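Rather than take the bytecode argument on faith, the two functions can simply be timed. This is a minimal sketch, assuming Python 3 (it should also run on 2.7); the compare helper and its default iteration count are mine, added for illustration. I'm deliberately not quoting numbers, since they depend on the interpreter version and the machine.

```python
import timeit


def loop():
    accum = []
    for i in range(20):
        accum.append(i)
    return accum


def comprehension():
    accum = []
    [accum.append(i) for i in range(20)]  # result list is built, then discarded
    return accum


def compare(number=10000):
    """Time both functions; returns (loop_seconds, comprehension_seconds)."""
    return (
        timeit.timeit(loop, number=number),
        timeit.timeit(comprehension, number=number),
    )
```

Calling compare() a few times and looking at the ratio of the two numbers is more informative than any single run.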

Because of this, I believe that Michael was mistaken in referring to an overhead that results from use of a for loop versus a list comprehension for CPython. It would be interesting to perform a survey of the list comprehension optimization techniques used in various Python implementations, but optimization seems difficult outside of something like a special Cython construct, because LOAD_GLOBAL range could potentially be changed from the builtin range function. Various issues of this kind are discussed in the (very interesting) paper The effect of unrolling and inlining for Python bytecode optimizations.

Learning Python by example: list comprehensions

Thursday, April 29th, 2010

My friend, who is starting to learn Python 2.x, asked me what this snippet did:

def collapse(seq):
    # Preserve order.
    uniq = []
    [uniq.append(item) for item in seq if not uniq.count(item)]
    return uniq

This is not a snippet that should be emulated (i.e. it’s bad); however, it makes me happy: there are so many things that can be informatively corrected!

What is a list comprehension?

A list comprehension is a special brackety syntax to perform a transform operation with an optional filter clause that always produces a new sequence (list) object as a result. To break it down visually, you perform:

new_range = [i * i          for i in range(5)   if i % 2 == 0]

Which corresponds to:

*result*  = [*transform*    *iteration*         *filter*     ]

The filter piece answers the question, "should this item be transformed?" If the answer is yes, then the transform piece is evaluated and becomes an element in the result. The iteration [*] order is preserved in the result.

Go ahead and figure out what you expect new_range to be in the prior example. You can double check me in the Python shell, but I think it comes out to be:

>>> new_range = [i * i for i in range(5) if i % 2 == 0]
>>> print new_range
[0, 4, 16]
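If it helps, here is the same computation spelled out as an explicit loop, with each piece labeled. The function name squares_of_evens is made up for this sketch; it is only meant to show which part of the comprehension does what.

```python
def squares_of_evens(iterable):
    """Loop equivalent of [i * i for i in iterable if i % 2 == 0]."""
    result = []
    for i in iterable:            # iteration
        if i % 2 == 0:            # filter
            result.append(i * i)  # transform
    return result


new_range = squares_of_evens(range(5))
```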

If it still isn’t clicking, we can try to make the example less noisy by getting rid of the transform and filter — can you tell what this will produce?

>>> new_range = [i for i in range(5)]

So what’s wrong with that first snippet?

As we observed in the previous section, a list comprehension always produces a result list, where the elements of the result list are the transformed elements of the iteration. That means, if there’s no filter piece, there are exactly as many result elements as there were iteration elements.

Weird thing number one about the snippet: the list comprehension result is unused. It’s created, mind you (list comprehensions always create a value, even if you don’t care what it is), but it just goes off to oblivion. (In technical terms, it becomes garbage.) When you don’t need the result, just use a for loop! This is better:

def collapse(seq):
    """Preserve order."""
    uniq = []
    for item in seq:
        if not uniq.count(item):
            uniq.append(item)
    return uniq

It’s two more lines, but it’s less weird looking and wasteful. "Better for everybody who reads and runs your code," means you should do it.

Moral of the story: a list comprehension isn’t just, "shorthand for a loop." It’s shorthand for a transform from an input sequence to an output sequence with an optional filter. If it gets too complex or weird looking, just make a loop. It’s not that hard and readers of your code will thank you.

Weird thing number two: the transform, uniq.append(item), produces None as its output value, because the return value from list.append is always None. Therefore, the result, even though it isn’t kept anywhere, is a list of None values with one entry per item that passed the filter clause, which makes it the same length as uniq.
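Capturing the throwaway list shows what the comprehension actually builds: one None for every item that made it past the filter. The sample seq and the name leftovers are made up for this sketch.

```python
seq = ["a", "b", "a", "c", "b"]
uniq = []

# Same comprehension as in the snippet, except that we keep the result around.
leftovers = [uniq.append(item) for item in seq if not uniq.count(item)]
```

After this runs, uniq holds the three distinct items in order and leftovers holds three None values.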

Weird thing number three: list.count(item) iterates over every element in the list looking for elements that compare == to item. If you think through the case where you call collapse on an entirely unique sequence, you can tell that the collapse algorithm is O(n^2). In fact, it’s even worse than it may seem at first glance, because count will keep going all the way to the end of uniq, even if it finds item in the first index of uniq. What the original author really wanted was item not in uniq, which bails out early if it finds item in uniq.
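Here is a minimal sketch of the loop version with the membership test swapped for not in. It is still quadratic in the worst case, but each scan of uniq stops at the first match instead of always running to the end.

```python
def collapse(seq):
    """Return the items of seq with duplicates removed, preserving order."""
    uniq = []
    for item in seq:
        if item not in uniq:  # stops scanning at the first equal element
            uniq.append(item)
    return uniq
```

Because it only relies on ==, this version works for unhashable items such as lists, too.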

Also worth mentioning for the computer-sciency folk playing along at home: if all elements of the sequence are comparable, you can bring that down to O(n * log n) by using a "shadow" sorted sequence and bisecting to test for membership. If the elements are hashable you can bring it down to O(n), perhaps by using the set datatype (the sets module if you are in Python 2.3, the built-in set from Python 2.4 on). Note that the common cases of strings, numbers, and tuples of hashable elements are hashable, as is pretty much any built-in immutable datatype.
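Both ideas can be sketched in a few lines. The function names collapse_hashable and collapse_comparable are mine. One caveat on the bisect version: list.insert has to shift elements, so with a plain list as the shadow sequence only the number of comparisons is O(n * log n); you would need a tree-like structure to get the full bound.

```python
import bisect


def collapse_hashable(seq):
    """Expected O(n): remember already-seen items in a set."""
    seen = set()
    uniq = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            uniq.append(item)
    return uniq


def collapse_comparable(seq):
    """Test membership by bisecting a sorted "shadow" list."""
    shadow = []  # sorted copy of the items seen so far
    uniq = []
    for item in seq:
        index = bisect.bisect_left(shadow, item)
        if index == len(shadow) or shadow[index] != item:
            shadow.insert(index, item)  # keeps shadow sorted
            uniq.append(item)
    return uniq
```

collapse_comparable assumes the items are mutually orderable, which is a stronger requirement than the == used by the naive version.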

From Python history

It’s interesting to note that Python Enhancement Proposal (PEP) #270 considered putting a uniq function into the language distribution, but withdrew it with the following statement:

Removing duplicate elements from a list is a common task, but there are only two reasons I can see for making it a built-in. The first is if it could be done much faster, which isn’t the case. The second is if it makes it significantly easier to write code. The introduction of sets.py eliminates this situation since creating a sequence without duplicates is just a matter of choosing a different data structure: a set instead of a list.

Remember that sets can only contain hashable elements (same policy as dictionary keys) and are therefore not suitable for all uniq-ifying tasks, as mentioned in the last paragraph of the previous section.
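A quick way to find out whether a set is an option for your data is to try building one and catch the TypeError. This is just an illustrative sketch; can_use_set is a hypothetical helper name, not anything from the standard library.

```python
def can_use_set(items):
    """Return True if every item can be stored in a set (i.e. is hashable)."""
    try:
        set(items)
    except TypeError:
        return False
    return True
```

Note that a tuple is only hashable if everything inside it is, so a tuple containing a list fails this check as well.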

Footnotes

[*] "Iteration" is just a fancy word for "step through the sequence, element by element, and give that element a name." In our case we’re giving the name i.

Code ☃ Unicode

Thursday, April 1st, 2010

Let’s come to terms: angle brackets and forward slashes are overloaded. Between relational operators, templates, bitwise shift operators, XML tags, (HTML/squiggly brace language) comments, division, regular expressions, and path separators, what don’t they do?

I think it’s clear to everyone that XML is the best and most human readable markup format ever conceived (for all data serialization and database backing store applications without exception), so it’s time for all that crufty old junk from yesteryear to learn its place. Widely adopted web standards (such as Binary XML and E4X) and well specified information exchange protocols (such as SOAP) speak for themselves through the synergy they’ve utilized in enterprise compute environments.

The results of a confidential survey I conducted conclusively demonstrate beyond any possibility of refutation that you type more angle brackets in an average markup document than you will type angle-bracket relational operators for the next ten years.

In conclusion, your life expectancy decreases as you continue to use the less-than operator and forward slash instead of accepting XML into your heart as a first-class syntax. I understand that some may not enjoy life or the pursuit of happiness and that they will continue to use deprecated syntaxes. To each their own.

I have contributed a JavaScript parser patch to rectify the situation: the ☃ operator is a heart-warming replacement for the (now XML-exclusive) pointy-on-the-left angle bracket and the commonly seen tilde diaeresis ⍨ replaces slash for delimiting regular expressions. I am confident this patch will achieve swift adoption, as it decreases the context sensitivity of the JavaScript parser, which is a clear and direct benefit for browser end users.

The (intolerably whitespace-sensitive) Python programming language nearly came to a similar conclusion to use unicode more pervasively, while simultaneously making it a real programming language by way of the use of types, but did not have enough garments to see it through.

Another interesting benefit: because JavaScript files may be UTF-16 encoded, this increases the utilization of bytes in the source text by filling the upper octets with non-zero values. This, in the aggregate, will increase the meaningful bandwidth utilization of the Internet as a whole.

Of course, I’d also recommend that C++ solve its nested template delimiter issue with ☃ and ☼ to close instead of increasing the context-sensitivity of the parser. [*] It clearly follows the logical flow of start/end delimiting.

As soon as Emoji are accepted as proper unicode code points, I will revise my recommendation to suggest using the standard poo emoticon for a template start delimiter, because increased giggling is demonstrated to reduce the likelihood of head-and-wall involved injuries during C++ compilation, second only to regular use of head protection while programming.

Footnotes

[*] Which provides a direct detriment to the end user — optimizing compilers spend most of their time in the parser.

Two postfix operations redux: sequence points

Sunday, January 10th, 2010

Get ready for some serious language lawyering.

I was going back and converting my old entries to reStructuredText when I found an entry in which I was wrong! (Shocking, I know.)

C

Stupid old me didn’t know about sequence points back in 2007: the effects of the ++ operator in the C expression i++ * i++ are in an indeterminate state of side-effect completion until one of the language-defined sequence points is encountered (e.g. the semicolon at the end of an expression statement, or a function invocation).

From the C99 standard, section 6.5.2.4 paragraph 2, regarding the postfix increment and decrement operators:

The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented. The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.

Therefore, the compiler is totally at liberty to interpret that expression as:

mov lhs_result, i     ; Copy the values of the postincrement evaluation.
mov rhs_result, i     ; (Which is the original value of i.)
mul result, lhs_result, rhs_result
add i, lhs_result, 1
add i, rhs_result, 1  ; Second increment clobbers with the same value!

With i starting at 11, this produces the same outcome as the GCC compilation in the referenced entry: i is 12 and the result is 121.

As I mentioned before, the reason this can occur is that nothing in the syntax forces the first postincrement to be evaluated before the second one. To give an analogy to concurrency constructs: you have a kind of compile-time "race condition" in your syntax between the two postincrements that could be solved with a sequence point "barrier". [*]

In this assembly, those adds can float anywhere they like after their corresponding mov instruction and can operate directly on i instead of the temporary if they’d prefer. Here’s a possible sequence that results in a value of 132 and i as 13:

mov lhs_result, i ; Gets the original 11.
inc i             ; Increment in-place after the start value is copied.
mov rhs_result, i ; Gets the new value 12.
inc i             ; Increment occurs in-place again, making 13.
mul result, lhs_result, rhs_result
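Python can't reproduce the ambiguity, but it is handy for checking the arithmetic. This sketch transliterates the two instruction orderings above into plain Python functions (the function names are mine). It models the two schedules only; it says nothing about what any particular C compiler will actually emit.

```python
def clobbering_schedule(i):
    """Both reads happen first; both increments write back original + 1."""
    lhs_result = i
    rhs_result = i
    result = lhs_result * rhs_result
    i = lhs_result + 1
    i = rhs_result + 1  # second increment clobbers with the same value
    return result, i


def in_place_schedule(i):
    """Each increment is applied in place right after its read."""
    lhs_result = i
    i += 1
    rhs_result = i
    i += 1
    result = lhs_result * rhs_result
    return result, i
```

Starting from 11, the first function returns (121, 12) and the second returns (132, 13), matching the two outcomes described above.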

Even if you know what you’re doing, mixing two postfix operations, or any side effect, using the less obvious sequence points (like function invocation) is dangerous and easy to get wrong. Clearly it is not a best practice. [†]

Java

Through experimentation, the postincrement operation appears to have sequence-point-like semantics in the Java language, and it does! From the Java language specification (page 416):

The Java programming language also guarantees that every operand of an operator (except the conditional operators &&, ||, and ? :) appears to be fully evaluated before any part of the operation itself is performed.

Which combines with the definition of the postfix increment expression (page 485):

A postfix expression followed by a ++ operator is a postfix increment expression.

As well as left-to-right expression evaluation (page 415):

The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated.

Together, these lead to the definitive conclusion that i++ * i++ will always result in 132 == 11 * 12 and i == 13 when i == 11 to start.

Python

Python has no increment operators specifically so you don’t have to deal with this kind of nonsense.

>>> count = 0
>>> count++
  File "<stdin>", line 1
    count++
          ^
SyntaxError: invalid syntax

Annoyingly for newbies, though, ++count is a valid expression that happens to look like preincrement.

>>> count = 0
>>> ++count
0
>>> --count
0

They’re actually two applications of the unary positive and negative operators, respectively. Just one of the hazards of a context free grammar, I suppose.
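You can ask the parser directly how it reads such an expression. This is a small sketch using the standard ast module; the helper name unary_chain is made up for illustration. It walks down the nested unary operators and reports their node types, outermost first.

```python
import ast


def unary_chain(source):
    """Return the names of the unary operators wrapping an expression."""
    node = ast.parse(source, mode="eval").body
    names = []
    while isinstance(node, ast.UnaryOp):
        names.append(type(node.op).__name__)
        node = node.operand
    return names
```

For "++count" this reports two UAdd nodes and for "--count" two USub nodes, which is why neither expression changes count. The way to actually increment is count += 1.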

Footnotes

[*] I threw this in because the ordeal reminds me of the classic bank account concurrency problem. If it’s more confusing than descriptive, please ignore it. :-)
[†] Since function invocation defines sequence points, I thought this code sequence guaranteed those results:

#include <stdio.h>
 
int identity(int value) { return value; }
 
int main() {
        int i = 11;
        printf("%d\n", identity(i++) * identity(i++));
        printf("%d\n", i);
        return 0;
}

As Dan points out, the order of evaluation is totally unspecified — the left hand and right hand subexpression can potentially be evaluated concurrently.

Why you should bother!

Friday, October 2nd, 2009

I write this entry in response to Why Should I Bother?. The answer, in short, is that I find Python to be a great language for getting things done, and you shouldn’t let stupid interviewers deter you from learning a language that allows you to get more done.

I think I’m a pretty tough interviewer, so I also describe the things I’d recommend that a Python coder knows before applying to a Java position, based on my own technical-interview tactics.

A spectrum of formality

As my friend pointed out to me long ago, many computer scientists don’t care about effective programming methods, because they prefer theory. Understandably, we computer scientists qua programmers (AKA software engineers) find ourselves rent in twain.

As a computer science degree candidate, you are inevitably enamored with complex formalities, terminology, [*] and a robust knowledge of mathematical models that you’ll use a few times per programming project (if you’re lucky). Pragmatic programming during a course of study often takes a back seat to familiar computer science concepts and conformance to industry desires.

As a programmer, I enjoy Python because I find it minimizes boilerplate, maximizes my time thinking about the problem domain, and permits me to use whichever paradigm works best. I find that I write programs more quickly and spend less time working around language deficiencies. Importantly, the execution model fits in my brain.

Real Computer Scientists (TM) tend to love pure-functional programming languages because they fit into mathematical models nicely — founded on recursion, Curry-Howard isomorphism, and what have you — whereas Python is strongly imperative and, in its dynamism, lacks the same sort of formality.

Languages like Java sit somewhere in the middle. They’re still strongly imperative (there are no higher-order functions in Java), but there are more formalities. As the most notable example, compile-time type checking eliminates the possibility of type errors, which gives some programmers a sense of safety. [†] Such languages still let scientists chew on some computer sciencey problems; for example, where values clash with the type system, like provably eliminating NullPointerExceptions, which is fun, but difficult!

As the cost of increased formality, this class of languages is more syntax-heavy and leans on design patterns to get some of the flexibility dynamic typing gives you up front.

It’s debatable which category of languages is easiest to learn, but Java-like languages have footholds in the industry from historical C++ developer bases, Sun’s successful marketing of Java off of C++, and the more recent successes of the C# .NET platform.

It makes sense that we’re predominantly taught this category of languages in school: as a result, we can play the percentages and apply for most available developer jobs. Given that we have to learn it, you might as well do some throw-away programming in it now and again to keep yourself from forgetting everything; however, I’d recommend, as a programmer, that you save the fun projects for whichever language(s) that you find most intriguing.

I picture ease-and-rapidity of development-and-maintenance on a spectrum from low to high friction — other languages I’ve worked in fall somewhere on that spectrum as higher friction than Python. Though many computer scientists much smarter than I seem to conflate formality and safety, I’m fairly convinced I attain code completion and maintainability goals more readily with the imperative and flexible Python language. Plus, perhaps most importantly, I have fun.

My technical-interview protocol

The protocol I use to interview candidates is pretty simple. [‡]

  • Analyze their background experience for potential weaknesses that would affect their ability to perform in the position.
  • Don’t let the candidate talk about anything until the last n minutes, when they can ask questions or comment on the interview experience.
  • Home in on the areas of potential weakness to determine how bad they are (if they’re bad at all).
  • Evaluate how easy it is to overcome those weaknesses. Determine where their level in relevant areas falls in the Dreyfus model and how much it matters to the description of the position.

Potential Java interview weaknesses

Interviewing a candidate whose background is primarily Python based for a generic Java developer position (as in Sayamindu’s entry), I would immediately flag the following areas as potential weaknesses:

Primitive data types

A programmer can pretty much get away with never knowing how a number works in Python, since you typically overflow to appropriately sized data types automatically.

The candidate needs to know what all the Java primitives are when the names are provided to them, and must be able to describe why you would choose to use one over another. Knowing pass-by-value versus pass-by-reference is a plus. In Python there is a somewhat similar distinction between mutable and immutable types — if they understand the subtleties of identifier binding, learning by-ref versus by-value will be a cinch. If they don’t know either, I’ll be worried.
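As a concrete version of the identifier-binding subtlety I'd probe for, here is a small sketch (the names rebind and mutate are mine). Both functions receive the same list object; one rebinds its local name to a new list, the other changes the object the caller is also holding.

```python
def rebind(items):
    items = items + [99]  # builds a new list and rebinds the local name only
    return items


def mutate(items):
    items.append(99)      # changes the object the caller also refers to
    return items


original = [1, 2]
rebound = rebind(original)
after_rebind = list(original)  # snapshot: the caller's list is untouched so far
mutated = mutate(original)
```

A candidate who can explain why original is unchanged after rebind but changed after mutate will have no trouble with Java's handling of primitives versus object references.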

Object oriented design

The candidate’s Python background must not be entirely procedural, or they won’t fare well in a Java environment (which forces object orientation). Additionally, this would indicate that they probably haven’t done much design work: even if they’re an object-orientation iconoclast, they have to know what they’re rebelling against and why.

They need to know:

  • When polymorphism is appropriate.
  • What should be exposed from an encapsulation perspective.
  • What the benefits of interfaces are (in a statically typed, single-inheritance language).

Basically, if they don’t know the fundamentals of object oriented design, I’ll assume they’ve only ever written "scripts," by which I mean, "Small, unimportant code that glues the I/O of several real applications together." I don’t use the term lightly.

Unit testing

If they’ve been writing real-world Python without a single unit test or doctest, they’ve been Doing it Wrong (TM).

unittest is purposefully modeled on xUnit. They may have to learn the new JUnit 4 annotation syntax when they start work, but they should be able to claim they’ve worked with a JUnit 3-like API.
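For reference, this is roughly the shape of test I'd expect a Python candidate to have written: a TestCase subclass with setUp and test_ methods, which maps closely onto the JUnit 3 style. The collapse function is just a stand-in for the code under test, and run_tests is a hypothetical convenience wrapper around the standard loader and runner.

```python
import unittest


def collapse(seq):
    """Stand-in for the code under test: de-duplicate, preserving order."""
    uniq = []
    for item in seq:
        if item not in uniq:
            uniq.append(item)
    return uniq


class CollapseTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, like setUp() in JUnit 3.
        self.seq = [3, 1, 3, 2, 1]

    def test_removes_duplicates(self):
        self.assertEqual(collapse(self.seq), [3, 1, 2])

    def test_empty(self):
        self.assertEqual(collapse([]), [])


def run_tests():
    """Load and run CollapseTest, returning the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(CollapseTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```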

Abstract data structures

Python has tuples, lists and dictionaries — all polymorphic containers — and they’ll do everything that your programmer heart desires. [§] Some other languages don’t have such nice abstractions.

It’d be awesome if they knew:

  • The difference between injective and bijective and how those terms are important to hash function design. If they can tell me this, I’ll let them write my high-performance hash functions.

They must know (in ascending importance):

  • The difference between a HashMap and a TreeMap.
  • The difference between a vector and a linked list, or when one should be preferred over the other. The names are unimportant; I’d clarify that a vector was a dynamically growing array.
  • The "difference" between a tree and a graph.

Turning attention to you, the reader: if you’re lacking in data structures knowledge, I recommend you read a data structures book and actually implement the data structures. Then, take a few minutes to figure out where you’d actually use them in an application. They stick in your head fairly well once you’ve implemented them once.

Some interviewers will ask stupid questions like how to implement sorting algorithms. Again, just pick up a data structures book and implement them once, and you’ll get the gist. Refresh yourself before the interview, because these are a silly favorite — very few people have to implement sorting functions anymore.

Design patterns

Design patterns serve several purposes:

  • They establish a common language for communicating proposed solutions to commonly found problems.
  • They prevent developers from inventing stupid solutions to a solved class of problems.
  • They contain a suite of workarounds for inflexibilities in statically typed languages.

I would want to assure myself that you had an appropriate knowledge of relevant design patterns. More important than the names: if I describe them to you, will you recognize them and their useful applications?

For example, have you ever used the observer pattern? Adapter pattern? Proxying? Facade? You almost certainly had to use all of those if you’ve done major design work in Python.
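To make that concrete, here is about the smallest observer pattern I can think of in Python. The class and method names (Subject, subscribe, notify) are illustrative only. Since functions are first-class, the "observer interface" collapses into a plain callable, which is exactly the kind of translation I'd want a candidate to be able to talk through.

```python
class Subject(object):
    """Keeps a list of callbacks and invokes each one on every event."""

    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def notify(self, event):
        # Iterate over a copy so observers may subscribe during notification.
        for callback in list(self._observers):
            callback(event)


seen = []
subject = Subject()
subject.subscribe(seen.append)
subject.notify("saved")
```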

Background concepts

These are some things that I would feel extra good about if the candidate knew and could accurately describe how they relate to their Python analogs:

  • The importance of string builders (Python list joining idiom)
  • Basic idea of how I/O streams work (Python files under the hood)
  • Basic knowledge of typecasting (Python has implicit polymorphism)
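As an example of the first item, this is the Python list joining idiom I have in mind: collect the pieces in a list and join once at the end, much as you would append to a string builder and convert it to a string once. The function name build_report and its output format are made up for this sketch.

```python
def build_report(rows):
    """Format (name, count) pairs as 'name=count' joined by commas."""
    parts = []
    for name, count in rows:
        parts.append("%s=%d" % (name, count))
    return ", ".join(parts)  # one join instead of repeated concatenation
```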

Practical advice

Some (bad) interviewers just won’t like you because you don’t know their favorite language. If you’re interviewing for a position that’s likely to be Java oriented, find the easiest IDE out there and write an application in it for fun. Try porting a Python application you wrote and see how the concepts translate — that’s often an eye-opener. Or katas!

If you find yourself unawares in an interview with these "language crusaders," there’s nothing you can do but show that you have the capacity to learn their language in the few weeks vacation you have before you start. If it makes you feel better, keep a mental map from languages to number of jerks you’ve encountered — even normalizing by developer-base size the results can be surprising. ;-)

Footnotes

[*] Frequently unnecessary terminology, often trending towards hot enterprise jargon, since that’s what nets the most jobs and grant money.
[†] Dynamic typing proponents are quick to point out that this doesn’t prevent flaws in reasoning, which are the more difficult class of errors, and that you’ll end up writing tests for these anyway.
[‡] Clearly candidates could exploit a vulnerability in my interview protocol: leave off things they know I’m likely to test that they know particularly well; however, I generally ask them to stop after I’m satisfied they know something. Plus, the less I know about their other weaknesses the more unsure I am about them, and thus the less likely I am to recommend them.
[§] Though not necessarily in a performant way; e.g. note the existence of collections.deque and bisect. Knowing Python, I’d quiz the candidate to see if they knew of the performant datatypes.
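For the curious, here is a tiny sketch of the two standard-library tools named in the last footnote. The sample values are made up; the complexity notes in the comments describe the documented behavior of deque and of lists.

```python
import bisect
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # O(1), whereas list.insert(0, x) is O(n)
first = queue.popleft()  # O(1), whereas list.pop(0) is O(n)

scores = [10, 20, 40]
bisect.insort(scores, 30)                  # insert while keeping the list sorted
position = bisect.bisect_left(scores, 20)  # binary search for an insertion point
```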