December 18, 2012

Quick tips for getting into systems programming

In reply

Andrew (@ndrwdn) asked a great followup question to the last entry on systems programming at my alma mater:

@cdleary Just read your blog post. Are there any resources you would recommend for a Java guy interested in doing systems programming?

What follows are a few quick-and-general pointers on "I want to start doing lower level stuff, but need a motivating direction for a starter project." They're somewhat un-tested because I haven't mentored any apps-to-systems transitions, but, as somebody who plays on both sides of that fence, I think they all sound pretty fun.

A word of warning: systems programming may feel crude at first compared to the managed languages and application-level design you're used to. However, even among experts, the prevalence of footguns motivates simple designs and APIs, which can be a beautiful thing. As a heuristic, when starting out, just code it the simple, ungeneralized way. If you're doing something interesting, hard problems are likely to present themselves anyhow!

Microcontrollers rock

Check out sites like hackaday.com to see the incredible feats that people accomplish through microcontrollers and hobby time. When starting out, it's great to get the tactile feedback of lighting up a bright blue LED or successfully sending that first UDP packet to your desktop at four in the morning.

Microcontroller-based development is also nice because you can build up your understanding of C code, if you're feeling rusty, from basic usage — say, keeping everything you need to store as a global variable or array — to fancier techniques as you improve and gain experience with what works well.

Although I haven't played with them specifically, I understand that Arduino boards are all the rage these days — there are great tutorials and support communities out on the web that love to help newbies get started with microcontrollers. AVR freaks was around even when I was programming on my STK500. I would recommend reading some forums to figure out which board looks right for you and your intended projects.

At school, people really took to Bruce Land's microcontroller class, because you can't help but feel the fiero as you work towards more and more ambitious project goals. Since that class is still being taught, look to the exercises and projects (link above) as good examples of what's possible with bright students and four credits worth of time. [*]

Start fixing bugs on low-level open source projects

Many open source projects love to see willing new contributors. Especially check out projects a) that are known for having good/friendly mentoring and b) that you think are cool (which will help you stay motivated).

I know one amazing person I worked with at Mozilla got into the project by taking his time to figure out how to properly patch some open bugs. If you take that route, either compare your patch to what the project member has already posted, or request that somebody give you feedback on your patch. This is another good way to pick up mentor-like connections.

Check out open courseware for conceptual background

I personally love the rapid evolution of open courseware we're seeing. If you're feeling confident, pick a random low-level thing you've heard-of-but-never-quite-understood, type it into a search engine, and do a deep dive on a lecture or series. If you want a more structured approach, a simple search for systems programming open courseware has quite educational looking results.

General specifics: OSes and reversing

@cdleary Some general but also OS implementation and perhaps malware analysis/RE.

OSes

If you're really into OSes, I think you should just dive in and try writing a little kernel on top of your hardware of choice in qemu (a hardware emulator). Quick searches turn up some seemingly excellent tutorials on writing simple OS kernels on qemu, and writing simple OSes for microcontrollers is often a student project topic in courses like the one I mention above. [†]

With some confidence, patience, maybe a programming guide, and recall of some low-level background from school, I think this should be doable. Some research will be required on effective methods of debugging, though — that's always the trick with bare metal coding.

Or, for something less audacious sounding: build your own Linux kernel with some modifications to figure out what's going on. There are plenty of guides on how to do this for your Linux distribution of choice, and you can learn a great deal just by fiddling around with code paths and using printk. Try doing something on the system (in userspace) that's simple to isolate in the kernel source using grep — like mmapping /dev/mem or accessing an entry in /proc — to figure out how it works, and leave no stone unturned.

I recommend taking copious notes, because I find that's the best way to trace out any complex system. Taking notes makes it easy to refer back to previous realizations and backtrack at will.

Read everything that interests you on Linux Kernel Newbies, and subscribe to kernel changelog summaries. Attempt to understand things that interest you in the source tree's /Documentation. Write a really simple Linux Kernel Module. Then, refer to freely available texts for help in making it do progressively more interesting things. Another favorite read of mine was Understanding the Linux Kernel, if you have a hobby budget or a local library that carries it.

Reversing

This I know less about — pretty much everybody I know that has done significant reversing is an IDA wizard, and I, at this point, am not. They are also typically Win32 experts, which I am not. Understanding obfuscated assembly is probably a lot easier with powerful and scriptable tools of that sort, which ideally also have a good understanding of the OS. [‡]

However, one of the things that struck me when I was doing background research for attack mitigation patches was how great the security community was at sharing information through papers, blog entries, and proof of concept code. Also, I found that there are a good number of videos online where security researchers share their insights and methods in the exploit analysis process. Video searches may turn up useful conference proceedings, or it may be more effective to work from the other direction: find conferences that deal with your topic of interest, and see which of those offer video recordings.

During my research on security-related things, a blog entry by Chris Rohlf caused Practical Malware Analysis to end up on my wishlist as an introductory text. Seems to have good reviews all around. Something else to check out on a trip to the library or online forums, perhaps.

Footnotes

[*]

At the end of the page somebody notes: "This page is transmitted using 100% recycled electrons." ;-)

[†]

Also, don't pass up a chance to browse through the qemu source. Want to know how to emulate a bunch of different hardware efficiently? Use the source, Luke! (Hint: it's a JIT. :-)

[‡]

One other neat thing we occassionally used for debugging at Mozilla was a VMWare-based time-traveling virtual machine instance. It sounded like they were deprecating it a few years back, so I'm not sure the status of it, but if it's still around it would literally allow you to play programs backwards!

Systems programming at my alma mater

Bryan also asked me this at NodeConf last year, where I was chatting with him about the then-in-development IonMonkey:

An old e-mail to the Cornell CS faculty: https://gist.github.com/4278516  Have things changed in the decade since?

I remembered my talk with Bryan when I went to recruit there last year and asked the same interview question that he references — except with the pointer uninitialized so candidates would have to enumerate the possibilities — to see what evidence I could collect. My thoughts on the issue haven't really changed since that chat, so I'll just repeat them here.

(And, although I do not speak for my employer, for any programmers back in Ithaca who think systems programming and stuff like Birman's class is cool beans, my team is hiring both full time and interns in the valley, and I would be delighted if you decided to apply.)

My overarching thought: bring the passion

Many of the people I'm really proud that my teams have hired out of undergrad are just "in love" with systems programming, just as a skilled artisan "cares" about their craft. They work on personal projects and steer their trajectory towards it somewhat independent of the curriculum.

Passion seems to be pretty key, along with follow-through, and ability to work well with others, in the people I've thumbs-up'd over the years. Of course I always want people who do well in their more systems-oriented curriculum and live in a solid part the current-ability curve, but I always have an eye out for the passionately interested ones.

So, I tend to wonder: if an org has a "can systems program" distribution among the candidates, can you predict the existence of the outliers at the career fair from the position of the fat part of that curve?

Anecdotally, myself and two other systems hackers on the JavaScript engine came from the same undergrad program, modulo a few years, although we took radically different paths to get to the team. They are among the best and most passionate systems programmers I've ever known, which also pushes me to think passionate interest may be a high-order bit.

Regardless, it's obviously in systems companies' best interest to try to get the most bang per buck on recruiting trips, so you can see how Bryan's point of order is relevant.

My biased take-away from my time there

I graduated less than a decade ago, so I have my own point of reference. From my time there several years ago, I got the feeling that the mentality was:

This didn't come from any kind of authority, it's just putting into words the "this is how things are done around here" understanding I had at the time. All of them seemed reasonable in context, though I didn't think I wanted to head down the path alluded by those rules of thumb. Of course these were, in the end, just rules of thumb: we still had things like a Linux farm used by some courses.

I feel that the "horrible for teaching" problem extends to other important real-world systems considerations as well: I learned MIPS and Alpha [*], presumably due to their clean RISC heritage, but golly do I ever wish I was taught more about specifics of x86 systems. And POSIX systems. [†]

Of course that kind of thing — picking a "real-world" ISA or compute platform — can be a tricky play for a curriculum: what do you do about the to-be SUN folks? Perhaps you've taught them all this x86-specific nonsense when they only care about SPARC. How many of the "there-be-dragons" lessons from x86 would cross-apply?

There's a balance between trade and fundamentals, and I feel I was often reminded that I was there to cultivate excellent fundamentals which could later be applied appropriately to the trends of industry and academia.

But seriously, it's just writing C...

For my graduating class, CS undergrad didn't really require writing C. The closest you were forced to get was translating C constructs (like loops and function calls) to MIPS and filling in blanks in existing programs. You note the bijection-looking relationship between C and assembly and can pretty much move on.

I tried to steer to hit as much interesting systems-level programming as possible. To summarize a path to learning a workable amount of systems programming in my school of yore, in hopes it will translate to something helpful existing today:

I'm not a good alum in failing to keep up with the goings-ons but, if I had a recommendation based on personal experience, it'd be to do stuff like that. Unfortunately, I've also been at companies where the most basic interview question is "how does a vtable actually work" or on nuances of C++ exceptions, so for some jobs you may want to take an advanced C++ class as well.

Understanding a NULL pointer deref isn't writing C

Eh, it kind of is. On my recruiting trip, if people didn't get my uninitialized pointer dereference question, I would ask them questions about MMUs if they had taken the computer organization class. Some knew how an MMU worked (of course, some more roughly than others), but didn't realize that OSes had a policy of keeping the null page mapping invalid.

So if you understand an MMU, why don't you know what's going to happen in the NULL pointer deref? Because you've never actually written a C program and screwed it up. Or your haven't written enough assembly with pointer manipulation. If you've actually written a Java program and screwed it up you might say NullPointerException, but then you remember there are no exceptions in C, so you have to quickly come up with an answer that fits and say zero.

I think another example might help to illustrate the disconnect: the difference between protected mode and user mode is well understood among people who complete an operating systems course, but the conventions associated with them (something like "tell me about init"), or what a "traditional" physical memory space actually looks like, seem to be out of scope without outside interest.

This kind of interview scenario is usually time to fluency sensitive — wrapping your head around modern C and sane manual memory management isn't trivial, so it does require some time and experience. Plus when you're working regularly with footguns, team members want a basic level of trust in coding capability. It's not that you think the person can't do the job, it's just not the right timing if you need to find somebody who can hit the ground running. Bryan also mentions this in his email.

Thankfully for those of us concerned with the placement of the fat part of the distribution, it sounds like Professor Sirer is saying it's been moving even more in the right direction in the time since I've departed. And, for the big reveal, I did find good systems candidates on my trip, and at the same time avoided freezing to death despite going soft in California all these years.

Brain teaser

I'll round this entry off with a little brain teaser for you systems-minded folks: I contend that the following might not segfault.

// ...

int main() {
    mysterious_function();
    A *a = NULL;
    printf("%d\n", a->integer_member);
    return EXIT_SUCCESS;
}

How many reasons can you enumerate as to why? What if we eliminate the call to the mysterious function?

Footnotes

[*]

In an advanced course we had an Alpha 21264 that I came to love deeply.

[†]

I'm hoping there's more emphasis on POSIX these days with the mobile growth and Linux/OS X dominance in that space.