July 29, 2008

Thoughts on C++ in small memory footprint embedded development

Background

During my senior year I took on an ECE491 Independent Project course to follow up on an ECE476 Microcontrollers project. By the end of ECE491 we had created a Low Speed USB 2.0 stack library for the Atmel Mega32 Microcontroller using ~$6 worth of hardware and ~6000 standard lines of C.

Everybody in ECE476 used the CodeVisionAVR IDE, but we were a unique group in using avr-gcc. Though most students were okay with it, there were some minor features missing from the CodeVision compiler at the time, such as the ability to allocate objects on the heap. ;)

We rewrote the ECE476 code base in ECE491, again using avr-gcc, because we realized that the USB protocol was a lot more complex than the stack we had originally written. One of my main gripes in ECE491 was that I was writing highly object oriented code in a language which didn't support any of the syntax. I'm starting to hack on the code base again, and a port to C++ seems like a good idea (since avr-g++ is also available), but there are some significant trade-offs running through my mind.

The Trade-offs

Things I want from C++ in the project:

Things I don't want from C++ in the project:

The First Google Hit Says...

I've read through Reducing C++ Code Bloat and found it thought provoking. Though the article writes about gcc 3.4 and I'm using gcc 4.2, I can't imagine that the underlying code-bloat concepts have changed much. I'm betting a lot of the compiler directive advice is taken care of by gcc's -Os, but I'll make a note to check it out.

It seems sensible to give up on exceptions ahead of time, but there seems to be some hope that the compiler can figure out good code reuse for the templates. I'm compiling to ELF, then performing an objcopy to turn it into Intel Hex object format. I'm hoping that the conversion is trivial and the good ELF compilation referenced in the article will stick.

In the end it seems like I'm just gambling on how much template reuse will occur. I sure hope that if I do all the porting-to-C++ work it optimizes well — template hoisting looks like one of those idioms I'd prefer to leave alone. :(
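
For reference, here is roughly what I understand template hoisting to mean. This is a minimal sketch, not code from the USB stack: RingBufferBase and RingBuffer are made-up names for illustration, and the memcpy-based storage assumes the element type is trivially copyable. The type-independent logic is hoisted into a non-template base class, so it only gets compiled once, and the template is left as a thin type-safe wrapper.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical example: all type-independent logic lives in a non-template
// base, so it is emitted once no matter how many element types are used.
class RingBufferBase {
protected:
    RingBufferBase(unsigned char* storage, std::size_t elem_size, std::size_t capacity)
        : storage_(storage), elem_size_(elem_size), capacity_(capacity),
          head_(0), count_(0) {}

    bool push_raw(const void* elem) {
        if (count_ == capacity_) return false;  // buffer is full
        std::size_t tail = (head_ + count_) % capacity_;
        std::memcpy(storage_ + tail * elem_size_, elem, elem_size_);
        ++count_;
        return true;
    }

    bool pop_raw(void* out) {
        if (count_ == 0) return false;  // buffer is empty
        std::memcpy(out, storage_ + head_ * elem_size_, elem_size_);
        head_ = (head_ + 1) % capacity_;
        --count_;
        return true;
    }

public:
    std::size_t size() const { return count_; }

private:
    unsigned char* storage_;
    std::size_t elem_size_, capacity_, head_, count_;
};

// Thin wrapper: each instantiation adds only trivial forwarding calls.
// Assumes T is trivially copyable, since the base copies raw bytes.
template <typename T, std::size_t N>
class RingBuffer : public RingBufferBase {
public:
    RingBuffer() : RingBufferBase(bytes_, sizeof(T), N) {}
    bool push(const T& value) { return push_raw(&value); }
    bool pop(T& out) { return pop_raw(&out); }

private:
    unsigned char bytes_[N * sizeof(T)];
};
```

With this layout, RingBuffer<int, 2> and RingBuffer<char, 4> share the push_raw and pop_raw bodies instead of each getting its own copy. The price is the void pointer plumbing, which is exactly the part I'd prefer the compiler handled for me.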

Experimenting with C++ inheritance diamonds and method resolution order

It seems like g++ constructs base classes in the order given by the class' base specifier list (its derivation list), not the order given in the constructor's member initializer list, as in the example below:

#include <iostream>
#include <string>

using namespace std;

class Animal {
public:
    Animal() {
        cout << "Animal!" << endl;
    }
};

class Man : public virtual Animal {
public:
    Man(string exclamation) {
        cout << "Man " << exclamation << "!" << endl;
    }
};

class Bear : public virtual Animal {
public:
    Bear(string exclamation) {
        cout << "Bear " << exclamation << "!" << endl;
    }
};

class Pig : public virtual Animal {
public:
    Pig(string exclamation) {
        cout << "Pig " << exclamation << "!" << endl;
    }
};

class ManBearPig : public Man, public Bear, public Pig {
public:
    ManBearPig(string exclamation)
        : Pig(exclamation), Bear(exclamation), Man(exclamation)
    {
        cout << "ManBearPig " << exclamation << "!" << endl;
    }
};

int main() {
    ManBearPig mbp("away");
    return 0;
}

cdleary@gamma:~/projects/sandbox/sandbox_cpp$ g++ diamond.cpp && ./a.out
Animal!
Man away!
Bear away!
Pig away!
ManBearPig away!

Note that this experiment is a pretty (very) weak basis for the conclusion — it could be using lexicographic order, order based on the day of month, or any number of other unlikely heuristics :) A lot more experimentation is necessary before getting a discernible pattern, but I just felt like messing around.
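
One cheap follow-up experiment is to flip the base specifier list and the initializer list against each other and see which one the output tracks. The sketch below does that with a made-up construction_log helper in place of cout, so the order can be inspected programmatically. PigBearMan and construction_order are likewise names invented for this example. As far as I can tell, the C++ standard specifies this ordering (virtual bases first, then direct bases in base specifier order), so any conforming compiler should agree with g++ here.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records constructor names in the order they run.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

struct Animal { Animal() { construction_log().push_back("Animal"); } };
struct Man  : virtual Animal { Man()  { construction_log().push_back("Man"); } };
struct Bear : virtual Animal { Bear() { construction_log().push_back("Bear"); } };
struct Pig  : virtual Animal { Pig()  { construction_log().push_back("Pig"); } };

// Base specifier list says Man, Bear, Pig; initializer list says Pig, Bear, Man.
struct ManBearPig : Man, Bear, Pig {
    ManBearPig() : Pig(), Bear(), Man() { construction_log().push_back("ManBearPig"); }
};

// Both lists reversed: base specifiers say Pig, Bear, Man; initializers say Man, Bear, Pig.
struct PigBearMan : Pig, Bear, Man {
    PigBearMan() : Man(), Bear(), Pig() { construction_log().push_back("PigBearMan"); }
};

// Constructs a T and returns the names of the constructors in the order they ran.
template <typename T>
std::vector<std::string> construction_order() {
    construction_log().clear();
    T instance;
    (void)instance;  // only the side effects of construction matter
    return construction_log();
}
```

If the base specifier list really is what matters, construction_order<ManBearPig>() should start with Animal, Man, Bear, Pig and construction_order<PigBearMan>() with Animal, Pig, Bear, Man. Compiling with -Wall should also make g++ complain about the mismatched initializer order (via -Wreorder), which is a decent hint in itself.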

Edit (07/27/08): Using correct "base specifier list" terminology instead of my made-up "class initializer list" terminology.

Matching _t types in your .vimrc

Background

I find myself constantly reproducing my .vimrc file. It's most frequently because I'm migrating from system to system; however, I sometimes just lose it during a reformat (or forget to rsync with the -a flag).

One part of Vim that I'm not fond of is its regex. It takes the one thing I like about Perl (the ecumenical regex syntax) and throws it out the window. As a result, I usually write hackish regexes to highlight my type_t cTypes on the fly, which never highlight quite what I want them to.

Evolution

One example is a regex I found doing a Google search for "vim match _t", which, admittedly, doesn't return much. The most relevant hit suggests the following:

syntax match cType /[^ (]*_t[ )]/ " very wrong

This suggestion is pretty bad: it misses cow_t entirely in the first two lines below, and in the third it only matches by swallowing the trailing space along with the type name:

typedef struct Cow cow_t;
cow_t* my_cow;
cow_t my_cow;

At first I thought the correct regex was the following, which matches all of the above:

syntax match cType /\w\+_t\W\{-}/ " also wrong

It's annoying that Vim's regex doesn't have the standard operators (like +) without the escape, and that dropping the awkward match on the last atom (\W, a non-word character) takes a special funky-looking-dealie, \{-}. I believe the Perl equivalent is the equally unintuitive *? postfix, but it has the clear advantage of being the de-facto standard.

The above faultily matches on things like the following, however:

cow_tip();

This indicates that we need to match on the previous portion in all cases, except where there's a word character following. For this, we use the following, correct, construct:

syntax match cType /\w\+_t\w\@!/ " CORRECT!

I couldn't have figured it out without this handy reference, as well as the more extensive Vim documentation, which showed me how to fix my original error with clown-hat looking constructs.
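
To sanity-check the reasoning outside of Vim, here is a small C++ sketch that runs the example lines through std::regex. The patterns are my ECMAScript-style translations and not the Vim patterns themselves: I'm assuming \w+_t stands in for the broken /\w\+_t\W\{-}/ (whose lazy tail matches nothing) and that \w+_t(?!\w) is equivalent to /\w\+_t\w\@!/. The names kNaive, kLookahead and first_match are made up for the example, and it needs a C++11 compiler.

```cpp
#include <regex>
#include <string>

// Assumed ECMAScript translations of the Vim patterns discussed above.
const char* const kNaive = R"(\w+_t)";            // behaves like /\w\+_t\W\{-}/
const char* const kLookahead = R"(\w+_t(?!\w))";  // like /\w\+_t\w\@!/

// Returns the first substring of `line` matched by `pattern`, or "" if none.
std::string first_match(const std::string& pattern, const std::string& line) {
    std::smatch m;
    if (std::regex_search(line, m, std::regex(pattern))) {
        return m.str();
    }
    return "";
}
```

Calling first_match(kNaive, "cow_tip();") should hand back the bogus cow_t, while first_match(kLookahead, "cow_tip();") should come back empty and still find cow_t in the three declarations from earlier.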

Efficiency/Readability Fix (July 30, 2008)

My friend Trevor Caira pointed out the existence of the \zs and \ze atoms. These resize the match to the specified start and end (respectively), without using clown-hat trickery. <@:)

This makes the regex look a great deal more straightforward:

syntax match cType /\w\+_t\ze\W/