Quick tips for getting into systems programming
In reply
Andrew (@ndrwdn) asked a great followup question to the last entry on
systems programming at my alma mater:

What follows are a few quick-and-general pointers on "I want to start doing
lower level stuff, but need a motivating direction for a starter project."
They're somewhat un-tested because I haven't mentored any apps-to-systems
transitions, but, as somebody who plays on both sides of that fence, I think
they all sound pretty fun.
A word of warning: systems programming may feel crude at first compared
to the managed languages and application-level design you're used to. However,
even among experts, the prevalence of footguns motivates simple designs and APIs, which can be a beautiful thing.
As a heuristic, when starting out, just code it the simple, ungeneralized way.
If you're doing something interesting, hard problems are likely to present
themselves anyhow!
Microcontrollers rock
Check out sites like hackaday.com to see the incredible feats that
people accomplish through microcontrollers and hobby time.
When starting out, it's great to get the tactile feedback of lighting up a
bright blue LED or successfully sending that first UDP packet to your desktop
at four in the morning.
Microcontroller-based development is also nice because you can build up your understanding of C code, if you're feeling rusty, from basic usage — say, keeping everything you need to store as a global variable or array — to fancier techniques as you improve and gain experience with what works well.
Although I haven't played with them specifically, I understand that Arduino
boards are all the rage these days — there are great tutorials and support
communities out on the web that love to help newbies get started with
microcontrollers. AVR freaks was around even when I was programming on my
STK500. I would recommend reading some forums to figure out which board looks right for you and your intended projects.
At school, people really took to Bruce Land's microcontroller class,
because you can't help but feel the fiero as you work towards more and more
ambitious project goals.
Since that class is still being taught, look to
the exercises and projects (link above) as good examples of what's possible with bright
students and four credits worth of time. [*]
Start fixing bugs on low-level open source projects
Many open source projects love to see willing new contributors. Especially check out projects a) that are known for having good/friendly mentoring and
b) that you think are cool (which will help you stay motivated).
I know one amazing person I worked with at Mozilla got into the project by
taking his time to figure out how to properly patch some open bugs.
If you take that route, either compare your patch to what the project
member has already posted, or request that somebody give you feedback on your
patch.
This is another good way to pick up mentor-like connections.
Check out open courseware for conceptual background
I personally love the rapid evolution of open courseware we're seeing. If you're feeling confident, pick a random low-level thing you've heard-of-but-never-quite-understood, type it into a search engine, and do a deep dive on a lecture or series. If you want a more structured approach, a simple search for systems programming open courseware has quite educational looking results.
General specifics: OSes and reversing

OSes
If you're really into OSes, I think you should just dive in and try writing a little kernel on top of your hardware of choice in qemu (a hardware emulator). Quick searches turn up some seemingly excellent tutorials on writing simple OS kernels on qemu, and writing simple OSes for microcontrollers is often a student project topic in courses like the one I mention above. [†]
With some confidence, patience, maybe a programming guide, and recall of some low-level background from school, I think this should be doable. Some research will be required on effective methods of debugging, though — that's always the trick with bare metal coding.
Or, for something less audacious sounding: build your own Linux kernel with some modifications to figure out what's going on.
There are plenty of guides on how to do this for your Linux distribution of choice, and you can learn a great deal just by fiddling around with code paths and using printk.
Try doing something on the system (in userspace) that's simple to isolate in the kernel source using grep — like mmapping /dev/mem or accessing an entry in /proc — to figure out how it works, and leave no stone unturned.
I recommend taking copious notes, because I find that's the best way to trace out any complex system. Taking notes makes it easy to refer back to previous realizations and backtrack at will.
Read everything that interests you on Linux Kernel Newbies, and subscribe to kernel changelog summaries. Attempt to understand things that interest you in the source tree's /Documentation. Write a really simple Linux Kernel Module. Then, refer to freely available texts for help in making it do progressively more interesting things. Another favorite read of mine was Understanding the Linux Kernel, if you have a hobby budget or a local library that carries it.
Reversing
This I know less about — pretty much everybody I know that has done significant reversing is an IDA wizard, and I, at this point, am not. They are also typically Win32 experts, which I am not. Understanding obfuscated assembly is probably a lot easier with powerful and scriptable tools of that sort, which ideally also have a good understanding of the OS. [‡]
However, one of the things that struck me when I was doing background research for attack mitigation patches was how great the security community was at sharing information through papers, blog entries, and proof of concept code. Also, I found that there are a good number of videos online where security researchers share their insights and methods in the exploit analysis process. Video searches may turn up useful conference proceedings, or it may be more effective to work from the other direction: find conferences that deal with your topic of interest, and see which of those offer video recordings.
During my research on security-related things, a blog entry by Chris Rohlf caused Practical Malware Analysis to end up on my wishlist as an introductory text. Seems to have good reviews all around. Something else to check out on a trip to the library or online forums, perhaps.
Footnotes
Understanding JIT spray
Steel your mind for a tale of intrigue, intertwingled with a complex topic in browser security. (It's kind of all over the place, but I might spray something useful.)
Our story, like so many others, starts out with a browser user like yourself, a bottle of red wine, and a devoted young hacker from the Eastern Bloc [*] that answers to the handle "Coleslaw". [†]
Winey-and-Cheesy Corporation, the largest international wine and cheese distributor, has just blitzkrieg bopped the mainstream media over the head with a tactical PR campaign — a free case of wine and sizable wheel of Gouda for the five millionth visitor to their website. [‡]
The only problem is that Winey-and-Cheesy's massively trafficked website... has been owned.
Coleslaw is something of a wunderkind, and has, through feats of social engineering and technical prowess paralleled only by terrible movies from the mid 90s, gained the ability to insert some arbitrary, special-sauce HTML and JavaScript into that promotional page.
Coleslaw intends to perform a "zero-day attack" — this means that there's a bug in the browser that Coleslaw knows about, but that the browser vendors are unaware of. Coleslaw thinks that this bug can be used to take over the machines of some unsuspecting users who visit the promotional page, capitalizing on their maniacal love of fine dining.
The attacker's dilemma
So, to recap, Coleslaw has found a bug in the browser. Coleslaw wants to exploit that bug in order to obtain arbitrary code execution — the ability to run whatever code Coleslaw feels like on the machine that's running the vulnerable browser. The question is, how does Coleslaw get from point A, "I see that there's a bug", to point B, "I can run anything I want on the vulnerable machine"? The process of figuring this out is called exploit development.
The exploit development process is a narrative about control. Coleslaw starts off by having control over a small set of things — the JavaScript and HTML on a page that the browser visits — but wants to end up controlling everything that the user controls on the vulnerable machine. The environment that internet sites have access to is supposed to be sandboxed; i.e. internet sites are expected to have a somewhat limited and carefully selected set of things it can control. For example, web sites that you happen to stumble across shouldn't be able to delete files off of your hard drive.
Strongly-related to this narrative about control is the concept of determinism. If Coleslaw has a concrete understanding that performing some action, like calling alert from JavaScript, always results in some consequence, like creating and displaying an alert window in the browser, then Coleslaw has effectively extended the realm of control from JavaScript to triggering-the-code-in-the browser-that-displays-an-alert-dialog. Barring bugs, the realm of control is always confined to the sandbox — the set of possible actions are those that the browser vendor permits an untrusted website to take.
Not all bugs are created equal
There are lots of different kinds of bugs that browser software can have. There's a relatively tiny set of bugs that permit control flow hijacking, which are generally of interest for gaining arbitrary code execution. Successful hijacking implies that you have the ability to control the address of the instruction being executed, which is commonly referred to as pseudo-register %eip (where ip is short for instruction pointer). With full control of %eip, the attacker can point it at any executable code — possibly at executable code that they've created.
Control flow hijacking is typically accomplished through some kind of memory corruption, stemming from errors in the use of type-unsafe programming constructs in the browser. In general, the bugs of interest for control flow hijacking are:
Memory writes that can be used to clobber vtable pointer or function pointer values. The attacker may have control over the location of the memory write, the value being written, or both.
Buffer overruns that can be used to manipulate values that are ultimately used to determine code to run. The classic example of this is clobbering return addresses present on the C stack. [§]
There's also the possibility of using an attacker-controllable invalid memory read bug to cause an invalid write to happen further along in program execution. Bugs that cause segfaults are carefully evaluated by browser security teams to see if the invalid memory access being performed can be manipulated for use in control flow hijacking.
Platform-level mitigations: DEP, ASLR, and canaries
There are some nifty platform-level protections against traditional control flow hijacking techniques. They make both taking control of %eip and executing an attacker controlled code sequence more difficult.
One control-flow hijacking mitigation is stack smashing protection, which is enabled at compile time using a technique referred to as "canary values". An attacker could historically use stack buffer overruns to clobber the return address in a function frame with a target %eip value, and the ret instruction at the end of the function's machine code would return to that new (attacker controlled) address value. With this mitigation enabled, however, the compiler places a special value on the stack between local variables (where the buffer lives) and the return value. The compiler also augments the function body with pre-return function prologue code that checks the canary value on the stack against its original value. If a stack buffer overrun causes the return value to be overwritten, the canary that lives in the contiguous space between the locals and return value should indicate that things have gone horribly wrong.
Generally, we tend to think of executables as containing all their executable code as static machine-code. Other than the code that the compiler spat out as specific sections of the executable, nothing else should run over the course of the program's execution. This expectation is codified in an OS-level mitigation called Data Execution Prevention (DEP).
The goal of DEP is to prevent things which are not code from being executed as code at runtime. Your program stack, for example, is just a bunch of space for data that your C function frames can't keep in registers. There's basically no reason that a sane program would ever want to start executing the stack area of memory like it were code — if something like that were to happen, it would be better if your program just terminated, because it could be the pivotal point before an attacker like Coleslaw takes control. Program termination means loss of control for the attacker.
Trying to execute code that was not in the original binary will generally cause the program to fault. In a JIT, however, we purposefully create code at runtime, violating the all-the-code-is-in-the-binary assumption. As a result, we have to explicitly mark the machine code that we create as executable by calling to an operating system API function, like VirtualProtect or mprotect, to indicate that the data the process has created should really be executable.
DEP's close friend from acronym club is Address Space Layout Randomization (ASLR). ASLR reduces determinism in the process that the attacker is trying to exploit by randomizing the stack address, library loading addres, heap address, and PEB address, amongst other key program components. With this mitigation, hardcoded constant addresses in attacker crafted code become probabilistically unlikely to succeed at hitting their target. As an example, the start of the program stack could wind up being placed at one of 16 thousand locations!
This also means that the address of system DLLs, like the ones containing OS API functions like VirtualProtect and C library functions like system, are probabilistically unknown to the attacker. Since the browser ships linked with all ASLR-enabled DLLs, it's difficult to use linked DLL code as direct footholds in process space.
Coleslaw wants to run an attacker-controlled code payload, but DEP makes it difficult to execute that payload, since it won't be marked as executable by default.
Coleslaw wants to be able to turn the bug that relinquishes control of %eip into a reliable exploit, but ASLR makes it difficult to know where to point %eip in order to run exploit code.
I imagine that turning a crash into an exploit isn't trivial these days.
Staged shellcode payloads
The machine-code payloads that attackers create are referred to as shellcode. Shellcode is generally characterized by its size and its goal, which is usually reflected by the "stage" it's said to be running. For example, the very first shellcode to run, in computer science style, is referred to as "stage 0".
Intermediate stages of shellcode are often used to bootstrap more complex executable code sequences. The complexity involved in turning a bug into an exploit often prevents arbitrarily complex code sequences from executing immediately, so tinier code sequences are written that just delegate responsibility to a more easily formed executable payload. Constraints that apply to the code that the exploit starts running directly tend to disappear after you've gone through some amount of indirection.
Shellcode can easily embed astoundingly small code sequences called "egg hunters" to find the memory address of other attacker-controlled payloads. The egg hunters are designed to avoid faulting the application, because faults cause the attacker to lose control. They work by performing a series of fast-and-minimally-sized system calls to determine whether a virtual memory page is safe to traverse through and read to find the "egg" payload delimiter.
Once the address of a stage 1 data payload is determined, stage 0 shellcode may attempt to make that segment of memory executable. Despite ASLR, the address of the VirtualProtect function can be derived by hopping from the known TEB address to the PEB address to the DLL loader address mapping table. Once executable permissions have been added to the stage 1 shellcode, it can simply be jumped to.
Another alternative, if the stage 0 shellcode is executing out of a code space with both writable and executable permissions and sufficient available space, is to use what's called a "GetPC" shellcode sequence to determine the current value of %eip and then copy the contents of a stage 1 shellcode payload buffer into the current code space.
For some bugs it may be possible to create "common" stage 0 shellcode to bootstrap any other shellcode payload — this common shellcode is a valuable commodity for exploit toolkits.
JIT spray, deconstructed
As mentioned earlier, the JIT has to mark its own assembly buffers as executable. An attacker may look at using that fact to generate executable stage 0 shellcode, in order to bypass some of the pain inflicted by DEP. But how could you possibly use JIT compilation process to make shellcode?
JIT spraying is the process of coercing the JIT engine to write many executable pages with embedded shellcode.
—Blazakis, 2010
Dion Blazakis wrote the seminal paper on JIT spray, in which he presented a jaw-dropping example. Blazakis noticed that the following ActionScript [¶] code:
var y = (
0x3c54d0d9 ^
0x3c909058 ^
0x3c59f46a ^
0x3c90c801 ^
0x3c9030d9 ^
0x3c53535b ^
...
)
Was JIT-compiled into the following instruction sequence:
addr op imm assembly
0 B8 D9D0543C MOV EAX,3C54D0D9
5 35 5890903C XOR EAX,3C909058
10 35 6AF4593C XOR EAX,3C59F46A
15 35 01C8903C XOR EAX,3C90C801
20 35 D930903C XOR EAX,3C9030D9
25 35 5B53533C XOR EAX,3C53535B
Check out the first line — it's showing that the first instruction is a MOV that places the 32-bit immediate payload into the EAX register. The 32-bit immediate payload from that instruction (3C54D0D9) is exactly the immediate that was used as the left-hand-side to the long XOR sequence in the original ActionScript code.
Now, if we look at the subsequent lines, we see that the addr column, which is showing the address of instructions relative to the start of the sequence, goes up by 5 every time. That's because each instruction after the initial MOV is performing an XOR against the original value in the accumulator register, EAX, exactly as the ActionScript program described.
Each of these instructions is exactly five bytes long — each instruction has a one-byte opcode prefix, given under the op column, followed by a 32 bit immediate constant: the opcode for MOV EAX,[imm32] is 0xB8, and the opcode sequence for XOR EAX,[imm32] is 0x35.
The immediate column may look confusing at a glance, but it's actually just the little-endian equivalent of the 32-bit immediate given in the assembly: the "little end" (least significant byte) goes "in" (at the lowest memory address), which is why the byte order looks flipped around from the one given in the assembly (and in the original ActionScript program).
It may not look so sinister, but the above table is actually deceiving you! In the table, all of the instructions are the same number of bytes (5) in length. On x86 CPUs, however, instructions are actually a variable number of bytes in length: instructions can be as small as a single byte, but can get quite long: the nop instruction is just a 0x90 opcode byte with no operands, whereas the movl $0xdeadbeef, 0x12345678(%ebx,%edx,1) instruction is significantly larger. [#]
As a result, when we look at this instruction sequence "crooked" — with a one-byte skew to the address — we decode a totally different sequence of instructions. I'll show you what I mean.
Our instructions in memory look like the following buffer:
static const char buf[] = {
0xB8, 0xD9, 0xD0, 0x54, 0x3C,
0x35, 0x58, 0x90, 0x90, 0x3C,
0x35, 0x6A, 0xF4, 0x59, 0x3C,
0x35, 0x01, 0xC8, 0x90, 0x3C,
0x35, 0xD9, 0x30, 0x90, 0x3C,
0x35, 0x5B, 0x53, 0x53, 0x3C
};
When we load this up in GDB, and run the disassemble command, we confirm the instructions present in the above table::
(gdb) disassemble/r buf
Dump of assembler code for function buf:
0x08048460 <+0>: b8 d9 d0 54 3c mov eax,0x3c54d0d9
0x08048465 <+5>: 35 58 90 90 3c xor eax,0x3c909058
0x0804846a <+10>: 35 6a f4 59 3c xor eax,0x3c59f46a
0x0804846f <+15>: 35 01 c8 90 3c xor eax,0x3c90c801
0x08048474 <+20>: 35 d9 30 90 3c xor eax,0x3c9030d9
0x08048479 <+25>: 35 5b 53 53 3c xor eax,0x3c53535b
But then, if we look at the buffer with a one-byte offset, we see a totally different set of instructions! Note the use of buf+1 as the disassembly target.:
(gdb) disassemble/r (buf+1), (buf+sizeof(buf))
Dump of assembler code from 0x8048461 to 0x804847e:
0x08048461 <buf+1>: d9 d0 fnop
0x08048463 <buf+3>: 54 push esp
0x08048464 <buf+4>: 3c 35 cmp al,0x35
0x08048466 <buf+6>: 58 pop eax
0x08048467 <buf+7>: 90 nop
0x08048468 <buf+8>: 90 nop
0x08048469 <buf+9>: 3c 35 cmp al,0x35
0x0804846b <buf+11>: 6a f4 push 0xfffffff4
0x0804846d <buf+13>: 59 pop ecx
0x0804846e <buf+14>: 3c 35 cmp al,0x35
0x08048470 <buf+16>: 01 c8 add eax,ecx
0x08048472 <buf+18>: 90 nop
0x08048473 <buf+19>: 3c 35 cmp al,0x35
0x08048475 <buf+21>: d9 30 fnstenv [eax]
0x08048477 <buf+23>: 90 nop
0x08048478 <buf+24>: 3c 35 cmp al,0x35
0x0804847a <buf+26>: 5b pop ebx
0x0804847b <buf+27>: 53 push ebx
If you look down the middle part of the two disassemblies, before the assembly mnemonics, you can read that the bytes are the same from left to right: the first line of the first disassemblies goes b8 d9 d0 54 3c, and the second disassembly starts on the second byte of that same sequence with d9 d0 54 3c, straddling multiple instructions. This is the magic of variable length instruction encoding: when you look at an instruction stream a little bit sideways, things can change very drastically.
Yo dawg, I heard you like X86 assembly...
It's not obvious, at first glance, just how clever this technique is.
The goal of the ActionScript code pattern is for the attacker to insert arbitrary bytes into the code stream that the JIT otherwise generates. The attacker then uses these arbitrary bytes as an alternate instruction stream. However, the attacker has to compensate for the non-attacker-controlled bytes that surround its own.
Each 32-bit immediate encoded in the ActionScript program starts with a MSB of 0x3c — that byte is little-endian encoded and placed, in memory, right before each of the 0x35s that represent the XOR EAX,[imm32] opcode.
Jumping to the 1-byte offset from the base address of the instruction stream starts us off executing 0xd9 0xd0 — a two-byte instruction that runs a no-op on the floating point unit. Both of these bytes were part of the attacker's immediate value: 0x3c54d0d9.
Effectively, the attacker is able to control 4 out of every 5 bytes per instruction in the stream. They are somewhat limited by the bytes fixed in the instruction stream, however — the MSB of each immediate is a 0x3c so that it can successfully combine with the 0x35 from the XOR EAX,[imm32] opcode to create a nop-like instruction, cmp al,0x35, that keeps the stream executing at the 1-byte offset.
It'd be ideal for the attacker if they could find a way to incorporate the 0x35 into an instruction in a useful way, instead of having to lose a byte in order to control it; however, there are lots of fun tricks that you can play to make compact instruction sequences. By making use of the stacky subset of x86 you can get a nice little MISCy program: pushes and pops are nice one-byte instructions that you can split across the semantic nops to simulate moves, and pushing 8-bit signed immediates only takes two bytes, as you can see at buf+11. Dumping your floating point coprocessor state out to the stack is a two byte sequence. Accessing the TEB is a three byte sequence. How can you not love x86?
For this particular code sequence, the attacker only has a 1/5 chance of jumping to an %eip that gives control back to the JIT program — if you land anywhere in the constant-encoded portion, the instruction sequence will be entirely different.
Outstanding issues
So now we know the basic requirements for pulling off a JIT spray attack:
Deterministic attacker control of values embedded in the instruction stream
Control of %eip
The ability to jump somewhere inside the JIT code, in order to probabilistically execute the attacker's interleaved instruction stream
But wait, how do you know where to jump?
JIT spray opens up the possibility for an attacker to create a lot of very similar code via the JIT compiler, possibly with nop sled prefixes. As a result, one approach to bypassing both DEP and ASLR is to fill enough of the address space with JIT code that you can jump to a random location and hit an attacker-controlled portion valid JIT code buffer with reasonable probability.
But this leads to further questions: What address does the attacker pick to jump to? How much code memory does the attacker spray? Creating a reliable exploit seems significantly more difficult.
Blazakis' solution
In order to create a reliable exploit — as opposed to a probabilistic one — Blazakis used the techniques of pointer inferencing and heap feng shui.
The sandbox makes it particularly tricky to figure out where things live in memory. Those kinds of details definitely aren't supposed to be exposed through the sandbox. If the attacker were able to figure out the locations of things in memory space through the sandbox, it would be considered an information leak.
Pointer inferencing is the technique that Blazakis used to accurately determine the memory location of heapified ActionScript entities in the Flash VM. The inferencing described in Blazakis' paper is cleverly based on the fact that literal integer values in the Flash virtual machines are [♠] hashed alongside of heap-object pointers. By observing the default dictionary enumeration order — the order in which keys exist in the hash table — Blazakis was able to narrow down the value of the object pointer to its exact location. [♥]
"Heap feng shui" is the process of understanding the memory allocation behaviors of the sandboxed environment that code is running in, and using that knowledge to place objects in some known locations in memory relative to each other. Blazakis noted that the ActionScript object heap expands in 16MiB increments and took into account the heuristics for executable allocations when loading ActionScript bytecode entities. Blazakis also relied on the usage of VirtualAlloc in the ActionScript memory allocator, with the knowledge that VirtualAlloc maps the first 64KiB aligned hole that's found in a linear scan through the virtual address space.
Blazakis was able to combine these techniques into reliable stage 0 shellcode execution by:
Determining the exact pointer of the first object within a 16MiB heap chunk.
Spraying just enough JIT code to place a JIT code allocation right after that 16MiB chunk.
Determining the JIT spray address to be the object address + 16MiB.
Adding a value like 0x101 to the base address to get an unaligned JIT code location, as described in the JIT spray section above.
Jumping to that resulting address.
Back to the story: the law of large numbers
So, Coleslaw intends to use a multi-step process:
Find bug that permits control flow hijacking
Perform JIT spray
Jump to probabilistic address for stage 0 shellcode
Importance of leaked information about the memory map becomes apparent here: it prevents you from doing a JIT-spray and jump-pray. However, given enough visitors, like the five million to Winey-and-Cheesy's giveaway, we have to start calculating expected values. As mitigations are added to lower the probability of success, we can see the expected value of ownage drop as well.
What will become of our anti-hero?
Will Coleslaw be able to use JIT spray to successfully exploit the browser?
Will pesky JavaScript engine hackers join forces with The League of Extraordinary Security Experts in an attempt to foil Coleslaw's plans with epic mitigations?
What will become of the wheel of Gouda? I hear that they don't keep for very long...
All this, and more, next week — same time, same channel — on Honest to a Segfault!
Footnotes
Monstrous polymorphism and a Python post-import hook decorator
I queue up a few thousand things to do before I get on an airplane: synchronize
two-thousand Google Reader entries, load up a bunch of websites I've been
meaning to read, and make sure for-fun projects are pulled from their most
updated branches.
Then, once I get up in the air, I realize that I don't really want to do 90%
of those things crammed into a seat with no elbow room. I end up doing one or
two. Along with reading Stevey's Drunken Blog Rant: When Polymorphism
Fails, this entry is all the productivity I can claim. The full code
repository for this entry is online if you'd like to follow along.
Polymorphism Recap
The word "polymorphic" comes from Greek roots meaning "many shaped." (Or they
lied to me in school — one of those.) From a worldly perspective I can see
this meaning two things:
A single object can take on many shapes, or
Requirements for a general "shape" can be satisfied by different categories
of objects.
As it turns out, both of these concepts apply to the Object-Oriented
programming, but the canonical meaning is the latter. [*] As Yegge
says:
If you have a bunch of similar objects [...], and they're all supposed to
respond differently to some situation, then you add a virtual method to them
and implement it differently for each object.
(If you don't know what a virtual method is, the Wikipedia page has an
alternate explanation.)
Yegge's Example
Yegge demonstrates that strictly adhering to the principles of polymorphism
does not always produce the best design:
Let's say you've got a big installed base of monsters. [...] Now let's say one
of your users wants to come in and write a little OpinionatedElf monster. [...]
Let's say the OpinionatedElf's sole purpose in life is to proclaim whether it
likes other monsters or not. It sits on your shoulder, and whenever you run
into, say, an Orc, it screams bloodthirstily: "I hate Orcs!!! Aaaaaargh!!!"
(This, incidentally, is how I feel about C++.)
The polymorphic approach to this problem is simple: go through every one of
your 150 monsters and add a doesMrOpinionatedElfHateYou() method.
This is a great counterexample — it induces an instant recognition of absurdity.
He then touches on the fact that dynamic languages allow you to do neat things
consistent with polymorphism due to the flexibility of the object structure
(which is typically just a hash map from identifiers to arbitrary object
values):
I guess if you could somehow enumerate all the classes in the system, and
check if they derive from Monster, then you could do this whole thing in a few
lines of code. In Ruby, I bet you can... but only for the already-loaded
classes. It doesn't work for classes still sitting on disk! You could solve
that, but then there's the network...
This is clearly impractical, but I figured there was some exploratory value to
implementing this challenge in Python. This entry is a small walk-through for
code to detect interface conformity by inspection, enumerate the classes in the
environment, manipulate classes in place, and add an import hook to manipulate
classes loaded from future modules.
The Antagonist
Double entendre intended. :-)
class OpinionatedElf(object):
is_liked_by_class_name = {
'OpinionatedElf': True,
'Orc': False,
'Troll': False}
def __init__(self, name):
self.name = name
def be_scary(self):
print("I'm small, ugly, and don't like the cut of your jib!")
def proclaim_penchance(self, other):
if not IMonster.is_conforming(other):
print("I can't even tell what that is!")
return
is_liked = other.is_liked_by_elf()
class_name = other.__class__.__name__
if is_liked is None:
print("I'm not sure how I feel about %s" % class_name)
return
emotion = 'love' if is_liked else 'hate'
print('I %s %s!!! Aaaaaargh!!!' % (emotion, other.__class__.__name__))
Determining which Classes are Monsters
First of all, Python doesn't require (nor does it encourage) a rigid type
hierarchy. Python's all about the interfaces, which are often implicit.
Step one is to create a way to recognize classes that implement the
monster interface:
required_methods = ['be_scary']
def be_scary(self):
raise NotImplementedError
@classmethod
def is_conforming(cls, object):
result = all(callable(getattr(object, attr_name, None))
for attr_name in cls.required_methods)
logging.debug('%s conforms? %s', object, result)
return result
assert IMonster.is_conforming(IMonster)
This is a simple little class — there are better third party libraries to use
if you want real interface functionality (i.e. more generic conformity
testing and Design By Contract).
Enumerating the Classes in the Environment
All of the modules that have been loaded into the Python environment are placed
into sys.modules. By inspecting each of these modules, we can
manipulate the classes contained inside if they conform to our monster
interface.
for name, module in sys.modules.iteritems():
extend_monsters(module)
The extend_monsters function is a bit nuanced because immutable modules
also live in sys.modules. We skip those, along with abstract base classes,
which have trouble with inspect.getmembers():
def extend_monsters(module, extension_tag='_opinionated_extended'):
"""Extend monsters in the module's top-level namespace to
tell if they are liked by the :class:`OpinionatedElf`.
and tag it with the :param:`extension_tag` as a flag name.
Do not attempt to extend already-flagged modules.
Do not clobber existing methods with the extension method name.
Warning: swallows exceptional cases where :param:`module`
is builtin, frozen, or None.
"""
name = module.__name__ if module else None
logging.info('Instrumenting module %s', name)
if not module or imp.is_builtin(name) or imp.is_frozen(name) \
or getattr(module, extension_tag, False):
logging.info('Skipping module: %s', name)
return
module._opinionated_instrumented = True
classes = inspect.getmembers(module, inspect.isclass)
for name, cls in classes:
logging.debug('%s: %s', name, cls)
try:
conforming = IMonster.is_conforming(cls)
except AttributeError, e:
if '__abstractmethods__' in str(e): # Abstract class.
continue
raise
if not conforming:
continue
class_name = cls.__name__
logging.debug('Instrumenting class %s', class_name)
attr_name = 'is_liked_by_elf'
if hasattr(cls, attr_name): # Don't clobber existing methods.
logging.warn('Method already exists: %s', cls)
continue
logging.info('Setting %s on %s', attr_name, class_name)
setattr(cls, attr_name,
lambda self: OpinionatedElf.is_liked_by_class_name.get(
self.__class__.__name__, None))
If we were going to be thorough, we would recurse on the members of the class
to see if the class scope was enclosing any more IMonster classes, but
you're never really going to find them all: if a module defines a monster class
in a function-local scope, there's no good way to get the local class statement
and modify it through inspection.
In any case, we're at the point where we can modify all monsters in the
top-level namespace of already-loaded modules. What about modules that we have
yet to load?
Post-import Hook
There is no standard post-import hook (that I know of) in Python. PEP 369
looks promising, but I couldn't find any record of additional work being done
on it. The current import hooks, described in PEP 302, are all pre-import
hooks. As such, you have to decorate the __import__ builtin, wrapping the
original with your intended post-input functionality, like so: [†]
def import_decorator(old_import, post_processor):
"""
:param old_import: The import function to decorate, most likely
``__builtin__.__import__``.
:param post_processor: Function of the form
`post_processor(module) -> module`.
:return: A new import function, most likely to be assigned to
``__builtin__.__import__``.
"""
assert all(callable(fun) for fun in (old_import, post_processor))
def new_import(``*args``, ``**kwargs``):
module = old_import(\*args, \*\*kwargs)
return post_processor(module)
return new_import
After which we can replace the old __import__ with its decorated counterpart:
__builtin__.__import__ = import_decorator(__builtin__.__import__,
extend_monsters)
The Network
Yegge brings up the issue of dynamically generated classes by mentioning
network communications, calling to mind examples such as Java's RMI and
CORBA. This is a scary place to go, even just conceptualizing. If
metaclasses are used, I don't see any difficulty in decorating __new__ with
the same kind of inspection we employed above; however, code generation
presents potentially insurmountable problems.
Decorating the eval family of functions to modify new classes created
seems possible, but it would be challenging and requires additional
research on my part. exec is a keyword/statement, which I would think is a
hopeless cause.
Footnotes
Thoughts on desktop Linux incompatibilities with iPhone and Android
Linux users want music-player/phone integration. Linux users want to sync all of their data — contacts, emails, calendars, bookmarks, documents, ebooks, music, photos, videos — at the touch of a button. Linux users want 3G data rates. Linux users want a state of the art, coordinated mobile platform.
If FLOSS developers are so prone to scratching their own itches, why doesn't there exist such a thing?
Because large scale mobile device companies box us out.
The iPhone Platform
I believe that Linux users who purchase their iPhone with the intent of jailbreaking it to fake compatibility are doing the Linux community a great disservice. They are purchasing a device which is made with the intent of not working with your computer. There's no more mass storage device. There's no longer a known iTunesDB format. The iPhone goes so far to obscure our intended usage that the community-recommend method of gaining functionality was to use an arbitrary code execution exploit. This is what we're driven to do. Do you want to support this behavior with your $200-500?
From a technological standpoint, our historical success at reverse engineering is very cool. It demonstrates the community's technical prowess through our ability to overcome artificial barriers. Despite the coolness factor, however, we can not and should not rely on our ability to kluge around obstacles in our path. Why? Because it doesn't allow us to make any definitive progress. It constantly puts us several steps behind the capabilities of a "properly" functioning device, both due to the difficulty of finding a solution and the misdirection of creative energy. One can't reasonably expect to build a working, Linux-compatible platform on top of a series of hacks that could potentially break with any minor release.
Even more insulting is the message that alternative solutions that work within the system are unwelcome. In my mind, the rallying cry of the Linux community should be "iPhone != iTunes". Ideally, the community could write an iTunes replacement application that played Ogg Vorbis and FLAC files. Let's enumerate some problems that this would solve for FLOSS developers and enthusiasts:
We wouldn't have to reverse engineer the new iTunesDB format (or anything having to do with iTunes).
We wouldn't have to reverse engineer the new iPhone USB protocol.
We would be starting a platform with a solid base that we could build upon. We would no longer be at the mercy of a development shop that clearly doesn't care about our demographic.
We could have it connect to a small socket server on our local machines and automatically sync music over WiFi.
We could play Ogg Vorbis files, for God's sake!
We could write a whole suite of totally legitimate applications for the iPhone to perform compatible iPhone-native-application-like functionality, all within the artificial constraints of the iPhone! There's nothing stopping us — except for the distribution mechanism. If Apple is at all amenable to our cause, the rejection of competitive apps will have to stop. Again: we should not have to void our warranties to use our product in legitimate ways on our competitive computing platforms.
Sadly, even if iTunes-store enlightenment came to fruition, we'd still be screwed. Platform restrictions disallow several key abilities. Case in point, we could not background our iTunes-replacement music player while we browsed the web (or did anything else, for that matter). We find ourselves at the mercy of the exposed API and Human Interface restrictions. Although this is unfortunate, it's decidedly better than founding a platform on our ability to hack around the poor design decisions of others.
The Android Platform
I'm much less well informed about the Android platform and the upcoming HTC Dream mobile device. Nobody is well informed at this point — almost exactly one month from the expected release date — much to the chagrin of potential customers. There are early indications that Linux desktop compatibility will not be supported natively on this platform either. As a Linux user, I can only cross my fingers and hope that Android will be as open as Google makes it out to be, while keeping a close watch on the potentially hazardous centralized distribution model.
Food For Thought
Since this article is supposed to contain my "thoughts on" the subject, I feel I should also share this little tidbit that keeps rattling around in my head. I'm not drawing any conclusions, just providing the reader with another, incomplete step in my thought process.
Monopoly law exists, in part, to disallow certain practices that are thought to be detrimental to "consumer welfare". From Wikipedia (emphasis added):
Competition law does not make merely having a monopoly illegal, but rather abusing the power that a monopoly may confer, for instance through exclusionary practices.
Update: September 20, 2008
An application named MailWrangler was also barred from the Apple Store for vaguely duplicating the functionality of Mail.app. From Angelo DiNardi's article:
Normally to check multiple Gmail accounts in mobile Safari you would have to log in and out of all of the accounts, typing the username and password for each. Using just the Apple Mail application you aren’t able to see threaded views, your google contacts, archive (quickly), star, etc without going through the hassles that are present when using Gmail’s IMAP on the iPhone.
This is another case of barring an application that offers features for a smaller demographic. I personally can't see why Apple is so "afraid" — let third party apps spring up for specialized features, so long as they don't violate the device's terms of use. If you feel like incorporating those features into Mail.app somewhere down the road, the other applications will die out naturally.
I feel sincere sympathy for Angelo; however, on the desktop Linux side we're at an even greater disadvantage — for us, there isn't even similar functionality available on the iPhone platform. To just sync our music, we have to void our warranties. The only thing we can possibly do without voiding our warranties is write an app with similar functionality to the iTunes music player and acquire it through the Apple Store. Forbidding us from doing this makes legitimate desktop Linux use impossible — for what advantage?