Simple, selfish, and unscientific shootout
Disclaimer
I've caught some flak over publishing my "selfish" (read: empirical testing that yields results which are only relevant to me) multi-language-engine-and-standard-library "shootout" (read: I wrote the same basic functionality across multiple languages, somewhat like on the shootout.alioth.debian.org site, the Computer Language Benchmarks Game). I value the concept and process of learning in the open, but it may require more time and consideration of clarity than I had given in this entry. Taking it down would apparently be a breach of etiquette, so please read the following TL;DR as a primer.
TL;DR: I encourage you to personally try writing small utilities against a variety of language engines when you have the opportunity. Consider how much tweaking of the original code you have to do in order to obtain a well-performing implementation. Weigh the development effort and your natural proficiency against the performance, clarity, and safety of the resulting program. Gather evidence and be eager to test your cost assumptions. Commit to learning about sources of overhead and unforeseen characteristics of your libraries. You may be surprised which engines give the best bang per time spent.
It has also been suggested to me that all native languages are within ~3x of one another on generated code performance, and the rest of the difference is generally attributable to the library or algorithm, so that's an interesting rule of thumb to keep in mind.
If you'd like to see how to write a small utility against a variety of language engines, you can check out the Github repo.
Introduction
We tend to throw around "orders of magnitude" when it comes to "programming language speeds", even though we know that the concept of a programming language having a speed for arbitrary programs makes little sense. But, when I'm coding up something small, I find myself pondering a very concrete question: which available language engine (language implementation and libraries) could I reasonably write this little program against that would give the best speed over development effort?
I'm not looking to optimize all the buttery nooks and crannies of this program, nor do I want to drill into potential deficiencies in the I/O subsystem: I just want to make a painless little utility that doesn't require me to go on a lunch break.
XKCD knows what I'm talking about:

I was writing a very simple, single-threaded program to generate about a billion uniformly random int32s in a text file, and I decided I would do a selfish little shootout: write the same program in a set of "viable" languages (remember, this is all about me :-), unscientifically use time(1) on the programs a few times, consider how painful it was to write, and see what the runtimes come out to be.
For 100 million integers on my CentOS Bloomfield box, these were the runtimes for my initial, naive implementations and their lightly tweaked counterparts:
Impl | Naive Runtime | Naive Ratio | Tweaked Runtime | Tweaked Ratio | Engine |
|---|---|---|---|---|---|
.cpp | ~0m 11s | ~0m 15s | GCC 4.4.6 -O3 | ||
.java | ~0m 18s | ~1.5x | ~0m 19s | ~1.25x | JDK 1.7.0.04 |
.go | ~1m 5s | ~6x | ~0m 23s | ~1.5x | go1.0.1 |
.rs | ~1m 7s | ~6x | ~0m 23s | ~1.5x | rustc -O3 0.2 (trunk) |
.ml | ~0m 37s | ~3.3x | ~0m 35s | ~2.5x | ocamlopt 3.11.2 |
.py | ~1m 6s | ~6x | ~0m 51s | ~3.5x | PyPy 1.9.1 (nightly) |
.lua | ~1m 36s | ~9x | ~0m 27s (FFI) | ~1.8x | LuaJIT 2.0.0-beta10 (trunk) |
.rb | ~1m 50s | ~10x | ruby 2.0.0 (trunk) |
Like all developers, I have varied levels of expertise across languages and their standard libraries; but, as I said, this is a selfish shootout, so my competence in a given language is considered part of the baseline. You'll see in the comments that many readers identified performance bugs in these code samples.
There are also caveats for the random numbers I was generating in OCaml (due to tag bit stealing).
For a billion integers the naive C++0x version took 1m 42s and the naive Java version took 2m 18s (1.35x slower). I didn't want to spend the time to slow down the others by an order of magnitude.
As a result — with perpetual intent to improve my abilities in all engines I work with, willful ignorance of the reasoning, acknowledgement that I need to perform more experiments like this to draw a more reasonable conclusion over time, and malice aforethought — I'll hereby declare myself guilty of leaning a bit more towards writing things like this in C++ when I want better runtimes in the giga range for little IO-and-compute programs.
Show me the code!
I threw the code up on github, but the versions that I wrote naively (before optimization suggestions) are duplicated here for convenience.
C++0x:
#include <cstdlib>
#include <cstdio>
#include <random>
int
main(int argc, char **argv)
{
if (2 != argc) {
fprintf(stderr, "Usage: %s <elem_count>\n", argv[0]);
return EXIT_FAILURE;
}
long long count = atoll(argv[1]);
std::mt19937 rng;
rng.seed(time(NULL));
std::uniform_int<int32_t> dist;
FILE *file = fopen("vec_gen.out", "w");
if (NULL == file) {
perror("could not open vector file for writing");
return EXIT_FAILURE;
}
for (long long i = 0; i < count; ++i) {
int32_t r = dist(rng);
fprintf(file, "%d\n", r);
}
fclose(file);
return EXIT_SUCCESS;
}
Java:
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.util.Random;
import java.io.IOException;
class VecGen
{
static final int INT_MAX = 2147483647;
public static void main(String args[]) {
if (args.length != 1) {
System.err.println("Usage: VecGen <elem_count>");
System.exit(-1);
}
int count = Integer.parseInt(args[0]);
try {
FileWriter fw = new FileWriter("vec_gen.out");
BufferedWriter bw = new BufferedWriter(fw);
Random rng = new Random();
for (int i = 0; i < count; ++i) {
int r = rng.nextInt(INT_MAX);
bw.write(r + "\n");
}
bw.close();
} catch (IOException e) {
System.err.println("Received I/O exception: " + e);
System.exit(-2);
}
}
};
Python:
import random
import sys
def main():
if len(sys.argv) != 2:
print >> sys.stderr, "Usage: %s <elem_count>" % sys.argv[0]
exit(-1)
count = int(sys.argv[1])
random.seed()
with open('vec_gen.out', 'w') as file:
for i in xrange(count):
r = random.getrandbits(31)
print >> file, r
if __name__ == '__main__':
main()
OCaml:
if (Array.length Sys.argv) <> 2 then (
let msg = "Usage: " ^ (Array.get Sys.argv 0) ^ " <elem_count>\n" in
prerr_string msg;
exit (-1)
) else (
let count = int_of_string (Array.get Sys.argv 1) in
let file = open_out "vec_gen.out" in
let rec write_rand_line n =
if n = 0 then ()
else
let r = Random.bits () in
output_string file ((string_of_int r) ^ "\n");
write_rand_line (n - 1)
in
write_rand_line count;
exit 0
)
Lua:
function main(args)
if #args ~= 1 then
io.stderr:write("Usage: " .. args[0] .. " <elem_count>\n")
os.exit(-1)
end
local count = tonumber(args[1])
math.randomseed(os.time())
local upper = math.floor(2^31 - 1)
io.output(io.open("vec_gen.out", "w"))
for i = 1,count do
local r = math.random(0, upper)
io.write(r)
io.write("\n")
end
io.close()
end
main(arg)
Rust:
import io::writer_util;
fn main(args: [str]) {
if vec::len(args) != 2u {
let usage = #fmt("Usage: %s <elem_count>\n", args[0]);
io::stderr().write_str(usage);
os::set_exit_status(-1);
ret;
}
let count = option::get(int::from_str(args[1]));
let rng = rand::seeded_rng(rand::seed());
let fw = result::get(io::buffered_file_writer("vec_gen.out"));
let mut i = 0;
while i < count {
let r = rng.next() & (0x7fffffffu as u32);
fw.write_line(int::to_str(r as int, 10u));
i += 1;
}
}
Go:
package main
import (
"bufio"
"fmt"
"log"
"math/rand"
"os"
"strconv"
)
func main() {
if len(os.Args) != 2 {
fmt.Fprintf(os.Stderr, "usage: %s <elem_count>\n", os.Args[0])
os.Exit(-1)
}
count, err := strconv.Atoi(os.Args[1])
if err != nil { panic(err) }
file, err := os.Create("vec_gen.out")
if err != nil { panic(err) }
defer file.Close()
bw := bufio.NewWriter(file)
defer func() {
err := bw.Flush()
if err != nil { log.Fatal(err) }
}()
for i := 0; i < count; i++ {
r := rand.Int31()
fmt.Fprintf(bw, "%d\n", r)
}
}
Ruby:
def main()
if ARGV.length != 1 then
warn "Usage: #{$0} <elem_count>"
exit -1
end
count = ARGV[0].to_i
file = File.open "vec_gen.out", "w"
upper = 2147483647
for i in 1..count do
r = rand(upper)
file.write r
file.write "\n"
end
end
main()
Updates
- 2012-06-03 2100
Reflect Makefile switch to ocaml native compiler, I was using the bytecode compiler.
- 2012-06-04 1500
Add Ruby, because I seem to remember enough of it.
- 2012-06-04 1930
Update Rust numbers per Graydon's comment. The code under test remained unchanged for the entry's inline results table.