Re: pgbench - add pseudo-random permutation function - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: pgbench - add pseudo-random permutation function
Date
Msg-id CAEepm=0FMnWm9miM3Tkuq5j3Q31WQJ7vvniy6rinCfkrsByQRQ@mail.gmail.com
Whole thread Raw
In response to Re: pgbench - add pseudo-random permutation function  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: pgbench - add pseudo-random permutation function
List pgsql-hackers
On Wed, Sep 26, 2018 at 8:26 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> > modular_multiply() is an interesting device.  I will leave this to
> > committers with a stronger mathematical background than me, but I have
> > a small comment in passing:
>
> For testing this function, I have manually commented out the various
> shortcuts so that the manual multiplication code is always used, and the
> tests passed. I just did it again.
>
> > +#ifdef HAVE__BUILTIN_POPCOUNTLL
> > + return __builtin_popcountll(n);
> > +#else /* use clever no branching bitwise operator version */
> >
> > I think it is not enough for the compiler to have
> > __builtin_popcountll().  The CPU that this is eventually executed on
> > must actually have the POPCNT instruction[1] (or equivalent on ARM,
> > POWER etc), or the program will die with SIGILL.
>
> Hmmm, I'd be pretty surprised: The point of the builtin is to delegate to
> the compiler the hassle of selecting the best option available, depending
> on target hardware class: whether it issues a cpu/sse4 instruction is
> not/should not be our problem.

True, it selects based on what you tell it:

$ cat x.c
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
        printf("%d\n", __builtin_popcount(atoi(argv[1])));
}
$ gdb -q a.out
...
(gdb) disass main
...
   0x00000000004005a4 <+39>: callq  0x4005c0 <__popcountdi2>
...
$ gcc -mpopcnt x.c
$ gdb -q a.out
...
(gdb) disass main
...
   0x000000000040059f <+34>: popcnt %eax,%eax
...

I'd forgotten one detail... we figure out if need to tell it that we
have SSE4.2 instructions explicitly in the configure script:

checking which CRC-32C implementation to use... SSE 4.2 with runtime check

We enable -msse4.2 just on the one file that needs it, in src/port/Makefile:

pg_crc32c_sse42.o: CFLAGS+=$(CFLAGS_SSE42)

On this CentOS machine I see:

gcc ... -msse4.2 ... -c pg_crc32c_sse42.c -o pg_crc32c_sse42_srv.o

That is necessary because most people consume PostgreSQL through
packages from distributions that have to work on an Athlon II or
whatever, so we can't just use -msse4.2 for every translation unit.
So it becomes our job to isolate small bits of code that use newer
instructions, if it's really worth the effort to do that, and supply
our own runtime checks and provide a fallback.

> > There are CPUs in circulation produced in this decade that don't have
> > it.
>
> Then the compiler, when generating code that is expected to run for a
> large class of hardware which include such old ones, should not use a
> possibly unavailable instruction, or the runtime should take
> responsability for dynamically selecting a viable option.

Right.  Our problem is that if we didn't do our own runtime testing,
users of (say) Debian and RHEL packages (= most users of PostgreSQL)
would effectively never be able to use things like CRC32 instructions.
None of that seems worth it for something like this.

> An interesting side effect of your mail is that while researching a bit on
> the subject I noticed __builtin_clzll() which helps improve the nbits code
> compared to using popcount. Patch attached uses CLZ insted of POPCOUNT.

It's annoying that fls() didn't make it into POSIX along with ffs().
On my system it compiles to a BSR instruction, just like
__builtin_clz().

-- 
Thomas Munro
http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: amul sul
Date:
Subject: 64-bit hash function for hstore and citext data type
Next
From: Fabien COELHO
Date:
Subject: Re: pgbench - add pseudo-random permutation function