On Mon, Jul 20, 2009 at 8:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Anyone want to see if they can beat that? Some testing on other
> architectures would help too.
Hm, I took the three implementations so far (normal, unrolled, and
clz) as well as the two from
http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
and got some very strange results:
normal: 1.494s
clz: 2.214s
unrolled: 2.966s
lookup table: 0.001s
float hack: 11.930s
I can't see why the unrolled implementation is slower than the
non-unrolled one, so I suspect something's wrong with my #ifdefs, but
I don't see it.
I do think the code I grabbed from the Stanford page might be
off by one for our purposes, but I haven't looked closely at that.
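To illustrate how such an off-by-one can arise (a sketch only; I have
not checked which convention each benchmarked variant uses, and the
function names here are made up): a loop of the form
"while (v >>= 1) r++;" yields floor(log2(v)), while a loop that counts
shifts until the value reaches zero yields the number of significant
bits, which is one larger for any nonzero input.

```c
#include <assert.h>

/* floor(log2(v)) for v > 0: counts how many times v can be halved
 * before it reaches zero, not counting the final step. */
static unsigned ilog2_floor(unsigned v)
{
    unsigned r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/* Number of significant bits in v: counts every shift needed to
 * bring v down to zero.  Returns 0 for v == 0. */
static unsigned bit_length(unsigned v)
{
    unsigned r = 0;

    while (v != 0)
    {
        r++;
        v >>= 1;
    }
    return r;
}

/* For any nonzero v the two conventions differ by exactly one. */
static void check_relation(unsigned v)
{
    assert(v != 0);
    assert(bit_length(v) == ilog2_floor(v) + 1);
}
```

For example, for 8 the first function gives 3 and the second gives 4,
so swapping one for the other shifts every result by one.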
I also wonder whether this microbenchmark is actually OK, because it's
testing the same value over and over, so branch prediction will
shine unrealistically well.
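One way to probe that concern would be to feed the function a buffer of
varying inputs instead of a single constant. The following is only a
sketch under my own assumptions: log2_under_test() is a stand-in for
whichever variant is being timed, and the 1..8192 input range, the
buffer size and the fixed-seed LCG are arbitrary choices, not anything
taken from the original test program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVALUES 4096            /* arbitrary buffer size (assumption) */

/* Hypothetical stand-in for the implementation being timed. */
static unsigned log2_under_test(uint32_t v)
{
    unsigned r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/* Fill buf with pseudo-random values in 1..8192 using a fixed-seed
 * linear congruential generator, so every run sees the same inputs. */
static void fill_inputs(uint32_t *buf, size_t n)
{
    uint32_t state = 12345;

    for (size_t i = 0; i < n; i++)
    {
        state = state * 1664525u + 1013904223u;
        buf[i] = (state >> 16) % 8192 + 1;
    }
}

/* Run the function over the whole buffer 'rounds' times.  The sum is
 * returned so the caller can print it, which keeps the compiler from
 * discarding the loop as dead code. */
static unsigned long run_benchmark(const uint32_t *buf, size_t n,
                                   unsigned rounds)
{
    unsigned long sum = 0;

    assert(buf != NULL && n > 0);
    for (unsigned r = 0; r < rounds; r++)
        for (size_t i = 0; i < n; i++)
            sum += log2_under_test(buf[i]);
    return sum;
}
```

Timing run_benchmark() against the constant-input loop would show how
much of the measured difference comes from perfectly predicted
branches.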
--
greg
http://mit.edu/~gsstark/resume.pdf