'C' category

procfs and preload

Two of the cool utilities that I've checked out lately have centered around /proc. /proc is a virtual filesystem mountpoint — the filesystem entities are generated on the fly by the kernel. The filesystem entities provide information about the kernel state and, consequently, the currently running processes. [*]

The utilities are preload and powertop. Both are written in C, though I think that either of them could be written more clearly in Python.

preload

Preload's premise is fascinating. Each shared library that a running process has mapped into memory (via mmap) can be queried via /proc/[pid]/maps, which contains one entry per mapping, of the form:

[vm_start_addr]-[vm_end_addr] [perms] [file_offset] [device_major_id]:[device_minor_id] [inode_num] [file_path]
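
As a rough illustration of the format above, one line can be pulled apart with sscanf. This is only a sketch: struct map_entry and parse_maps_line() are hypothetical names of mine, not taken from preload's source, and it assumes the addresses fit in an unsigned long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of /proc/[pid]/maps. */
struct map_entry {
    unsigned long start;     /* vm_start_addr (hex in the file) */
    unsigned long end;       /* vm_end_addr (hex in the file) */
    char perms[5];           /* e.g. "r-xp" */
    unsigned long offset;    /* file_offset (hex in the file) */
    unsigned int dev_major;  /* device_major_id (hex in the file) */
    unsigned int dev_minor;  /* device_minor_id (hex in the file) */
    unsigned long inode;     /* inode_num (decimal in the file) */
    char path[4096];         /* file_path; empty for anonymous mappings */
};

/* Parse one maps line. Returns 1 on success, 0 if the line is malformed. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
    int consumed = 0;

    assert(line != NULL && e != NULL);
    e->path[0] = '\0';

    /* %n records how many characters were consumed before the path. */
    if (sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %n",
               &e->start, &e->end, e->perms, &e->offset,
               &e->dev_major, &e->dev_minor, &e->inode, &consumed) < 7)
        return 0;

    /* Whatever follows the inode, minus the newline, is the path. */
    strncpy(e->path, line + consumed, sizeof e->path - 1);
    e->path[sizeof e->path - 1] = '\0';
    e->path[strcspn(e->path, "\n")] = '\0';
    return 1;
}
```

To walk a whole process, you would fopen() its maps file, read it with fgets(), and hand each line to parse_maps_line().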

Preload uses a Markov chain to decide which shared library pages to "pre-load" into the page cache by reading and analyzing these maps over time. Preload's primary goal was to reduce login times by pre-emptively warming up a cold page cache, which it was successful in doing. The catch is that running preload was shown to decrease performance once the cache was warmed up, indicating that it may have just gotten in the way of the native Linux page cache prefetch algorithm. [†]

There are a few other things in /proc that preload uses, like /proc/meminfo, but querying the maps is the meat and potatoes. I was thinking of porting it to Python so that I could understand the structure of the program better, but the fact that the daemon caused a performance decrease over a warm cache turned me off the idea.
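
For comparison, /proc/meminfo is even simpler to read: each line is a field name, a colon, and a number (in kB for most fields). A minimal sketch of parsing one such line follows. parse_meminfo_line() and MEMINFO_KEY_MAX are hypothetical names of mine, and I'm not claiming this is how preload reads the file.

```c
#include <assert.h>
#include <stdio.h>

#define MEMINFO_KEY_MAX 64

/* Parse a line such as "MemFree:   2104904 kB".
 * Stores the field name in key and the number in *value.
 * Returns 1 on success, 0 if the line does not look like "name: number". */
static int parse_meminfo_line(const char *line,
                              char key[MEMINFO_KEY_MAX],
                              unsigned long *value)
{
    assert(line != NULL && key != NULL && value != NULL);

    /* %63[^:] reads everything up to the colon (at most 63 characters). */
    return sscanf(line, "%63[^:]: %lu", key, value) == 2;
}
```

The trailing unit, when present, is simply ignored here.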

Footnotes

[*]

A cool side note — all files in /proc have a file size of 0 except kcore and self.

[†]

The page_cache_readahead() function in the Linux kernel.

Two postfix operations in a single statement in GCC

#include <stdio.h>

int z = 11;

int main()
{
    printf("%d\n", ((z++) * (z++)));
    printf("%d\n", z);
    return 0;
}
$ gcc -o postfix_test.o postfix_test.c; ./postfix_test.o
121
12

Surprised? I sure was. It looks like this gcc build treats the two postfix increments in the statement as a single increment request. I guess this makes sense if you consider the postfix operator to mean, "Wait for this statement to complete, then have the variable increment." Assuming this specification, the second time that you postfix-increment the compiler says, "Yeah, I’m already going to have the variable increment when the statement completes, so there’s no need to tell me again."

On the other hand, prefix increment does work twice in the same statement. So is this a decision that’s left up to the compiler? It is, and then some. I couldn’t find it spelled out in K&R, but the ANSI/ISO C standard makes modifying the same variable twice without an intervening sequence point undefined behavior. The compiler is free to print 121, 132, or anything else, the result can change between compiler versions or optimization levels, and the prefix version is undefined for exactly the same reason.

Updates

2007/09/26 Here’s what Java has to say!

class DoublePostfixTester
{
    public static void main(String[] args)
    {
        int z = 11;
        System.out.println(((z++) * (z++)));
        System.out.println(z);
    }
}
$ javac DoublePostfixTester.java
$ java DoublePostfixTester
132
13

Which is what I would have expected in the first place. Bravo, Java — we’re more alike than I thought.

Matching _t types in your .vimrc

Background

I find myself constantly reproducing my .vimrc file. It's most frequently because I'm migrating from system to system; however, I sometimes just lose it during a reformat (or forget to rsync with the -a flag).

One part of Vim that I'm not fond of is its regex. It takes the one thing I like about Perl (the ecumenical regex syntax) and throws it out the window. As a result, I usually write hackish regexes to highlight my type_t cTypes on the fly, which never highlight quite what I want them to.

Evolution

One example is a regex I found doing a Google search for "vim match _t", which, admittedly, doesn't return much. The most relevant hit suggests the following:

syntax match cType /[^ (]*_t[ )]/ " very wrong

This suggestion is pretty bad. It misses cow_t in the first two of the following examples, and in the third it only matches by swallowing the trailing space into the match:

typedef struct Cow cow_t;
cow_t* my_cow;
cow_t my_cow;

At first I thought the correct regex was the following, which matches all of the above:

syntax match cType /\w\+_t\W\{-}/ " also wrong

It's annoying that Vim regex doesn't have the standard operators (like +) without the escape, and that the last atom (\W, a non-word character) needs the funky-looking \{-} dealie to make it match as few characters as possible, which is what drops it from the match. I believe the Perl equivalent is the equally unintuitive *? postfix, but it has the clear advantage of being the de facto standard.

However, the above wrongly matches the cow_t inside things like cow_tip(); because the lazy \W\{-} is perfectly happy to match nothing at all. This indicates that we need to match the previous portion in all cases, except where there's a word character following. For this, we use the following, correct, construct:

syntax match cType /\w\+_t\w\@!/ " CORRECT!
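
If you think more easily in C than in Vim regex, the rule that pattern encodes can be sketched as a small scanner: take each maximal run of word characters and accept it only if it ends in _t with something in front. This is just an illustration of the rule, not how Vim matches internally. find_t_type() and is_word_char() are hypothetical names, and it assumes the default C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Roughly the set Vim's \w matches: letters, digits, and underscore. */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Find the first identifier in line that ends in "_t" and has at least
 * one word character before the "_t". On success, returns 1 and stores
 * the identifier's offset and length; otherwise returns 0. */
static int find_t_type(const char *line, size_t *start, size_t *len)
{
    size_t i = 0;

    assert(line != NULL && start != NULL && len != NULL);

    while (line[i] != '\0') {
        size_t begin;

        if (!is_word_char(line[i])) {
            i++;
            continue;
        }

        /* Consume the whole identifier, so "cow_tip" is never cut short. */
        begin = i;
        while (is_word_char(line[i]))
            i++;

        if (i - begin >= 3 && line[i - 2] == '_' && line[i - 1] == 't') {
            *start = begin;
            *len = i - begin;
            return 1;
        }
    }
    return 0;
}
```

Consuming the full identifier before testing its suffix plays the same role as the \w\@! check: a name like cow_tip is rejected because the run of word characters does not end at the _t.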

I couldn't have figured it out without this handy reference, as well as the more extensive Vim documentation for fixing my original error by using clown-hat looking constructs.

Efficiency/Readability Fix (July 30, 2008)

My friend Trevor Caira pointed out the existence of the \zs and \ze atoms. These resize the match to the specified start and end (respectively), without using clown-hat trickery. <@:)

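This makes the regex look a great deal more straightforward. One difference to be aware of: \W has to match an actual character, so a type name at the very end of a line, with nothing after it, is not highlighted by this version: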

syntax match cType /\w\+_t\ze\W/