SML is improving, though benchmarks are getting worse
Friday, April 4th, 2008In the following, jenkinsInt is a function that uses no references — it takes an integer, calculates the Jenkins One-at-a-time hash code on a number of bytes corresponding to its width, and returns that value as an integer. Pretty simple, all in all.
Standard ML of New Jersey v110.54 [built: Sat Sep 24 16:19:34 2005] val jenkinsInt = fn : int -> int - jenkinsInt(4); jenkinsInt: 4 Segmentation fault (core dumped)
Segfault!? Clearly this is a failure of the runtime, so I turned to version .62, which is known to be slower:
Standard ML of New Jersey v110.62 [built: Wed May 9 18:06:38 2007] val jenkinsInt = fn : int -> int - jenkinsInt(4); jenkinsInt: 4 uncaught exception Overflow [overflow] raised at: file stdin
At least we get a nice error message in the newer version. It was a lot easier to fix the error once I knew what was really happening.
For kicks, here’s my (fixed) implementation. I don’t guarantee its correctness — I wrote it hastily for a class project.
structure Hashes = struct val () = if Word.wordSize <> Option.valOf(Int.precision) then print("WARNING: unexpected int size\n") else () val maxIntInverse = if Int.maxInt = NONE then ~1 else ~1 * Option.valOf(Int.maxInt) (* I'm not sure how much optimization the compiler does, so I'm precomputing * these values -- they're quite frequently used if a lot of hashing is being * done. *) val zero = Word.fromInt(0) val three = Word.fromInt(3) val six = Word.fromInt(6) val eight = Word.fromInt(8) val ten = Word.fromInt(10) val eleven = Word.fromInt(11) val fifteen = Word.fromInt(15) val sixteen = Word.fromInt(16) val twenty_four = Word.fromInt(24) val two_hundred_fifty_five = Word.fromInt(255) fun jenkinsByte(key:Word.word, hash:Word.word):Word.word = let val hash = hash + key val hash = hash + Word.<<(hash, ten) val hash = Word.xorb(hash, Word.>>(hash, six)) in hash end fun jenkinsChar(key:char, hash:Word.word):Word.word = let val key = Word.fromInt(Char.ord(key)) in jenkinsByte(key, hash) end fun jenkinsFinish(hash:Word.word):int = let val hash = hash + Word.<<(hash, three) val hash = Word.xorb(hash, Word.>>(hash, eleven)) val hash = hash + Word.<<(hash, fifteen) val result = Word.toIntX(hash) val result = if result < maxIntInverse then maxIntInverse else result in Int.abs(result) end fun jenkinsChars(cs:char list):int = let val hash = zero val hash = List.foldl(jenkinsChar)(hash)(cs) val result = Word.toIntX(hash) val result = if result < maxIntInverse then maxIntInverse else result val absResult = Int.abs(result) in absResult end fun jenkinsInt(i:int):int = let val w = Word.fromInt(i) val b1 = Word.andb(w, two_hundred_fifty_five) val b2 = Word.>>(w, eight) val b3 = Word.>>(w, sixteen) val b4 = Word.>>(w, twenty_four) val hash = List.foldl(jenkinsByte)(zero)([b1, b2, b3, b4]) in jenkinsFinish(hash) end fun jenkinsString(s:string):int = jenkinsChars(String.explode(s)) end