Re: pgbench internal contention - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: pgbench internal contention
Date:
Msg-id: 3C78256A-9AAC-434E-9980-4F00BB1EB665@gmail.com
In response to: Re: pgbench internal contention (Andy Colson <andy@squeakycode.net>)
List: pgsql-hackers
On Jul 30, 2011, at 9:40 AM, Andy Colson <andy@squeakycode.net> wrote:
> On 07/29/2011 04:00 PM, Robert Haas wrote:
>> On machines with lots of CPU cores, pgbench can start eating up a lot
>> of system time.  Investigation reveals that the problem is with
>> random(), which glibc implements like this:
>>
>> long int
>> __random ()
>> {
>>   int32_t retval;
>>   __libc_lock_lock (lock);
>>   (void) __random_r (&unsafe_state, &retval);
>>   __libc_lock_unlock (lock);
>>   return retval;
>> }
>> weak_alias (__random, random)
>>
>> Rather obviously, if you're running enough pgbench threads, you're
>> going to have a pretty ugly point of contention there.  On the 32-core
>> machine provided by Nate Boley, with my usual 5-minute SELECT-only
>> test, lazy-vxid and sinval-fastmessages applied, and scale factor 100,
>> "time" shows that pgbench uses almost as much system time as user
>> time:
>>
>> $ time pgbench -n -S -T 300 -c 64 -j 64
>> transaction type: SELECT only
>> scaling factor: 100
>> query mode: simple
>> number of clients: 64
>> number of threads: 64
>> duration: 300 s
>> number of transactions actually processed: 55319555
>> tps = 184396.016257 (including connections establishing)
>> tps = 184410.926840 (excluding connections establishing)
>>
>> real    5m0.019s
>> user    21m10.100s
>> sys    17m45.480s
>>
>> I patched it to use random_r() - the patch is attached - and here are
>> the (rather gratifying) results of that test:
>>
>> $ time ./pgbench -n -S -T 300 -c 64 -j 64
>> transaction type: SELECT only
>> scaling factor: 100
>> query mode: simple
>> number of clients: 64
>> number of threads: 64
>> duration: 300 s
>> number of transactions actually processed: 71851589
>> tps = 239503.585813 (including connections establishing)
>> tps = 239521.816698 (excluding connections establishing)
>>
>> real    5m0.016s
>> user    20m40.880s
>> sys    9m25.930s
>>
>> Since a client-limited benchmark isn't very interesting, I think this
>> change makes sense.  Thoughts?  Objections?  Coding style
>> improvements?
>>
> How much randomness do we really need for test data?  What if it were changed to more of a random starting point and
> then autoinc'd after that?  Or if there were two funcs, a rand() and a next().  If your test really needs randomness
> use rand(), otherwise use next(); it would be way faster, and you don't really care what the number is anyway.

Well, I think you need at least pseudo-randomness for pgbench - reading the table in sequential order is not going to
perform the same as doing random fetches against it.
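
For illustration, here's a minimal sketch of the per-thread approach (not
the actual patch; the type and function names below are made up).  The idea
is simply that each thread carries its own random_r() state, so the hot
path never touches the process-wide lock that glibc's random() takes:

#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Per-thread PRNG state; no shared lock, unlike glibc's random(). */
typedef struct thread_prng
{
    struct random_data data;    /* state consumed by random_r() */
    char        statebuf[64];   /* backing buffer for initstate_r() */
} thread_prng;

static void
thread_prng_init(thread_prng *prng, unsigned int seed)
{
    /* glibc requires random_data to be zeroed before initstate_r() */
    memset(prng, 0, sizeof(*prng));
    initstate_r(seed, prng->statebuf, sizeof(prng->statebuf), &prng->data);
}

static int32_t
thread_prng_next(thread_prng *prng)
{
    int32_t     value;

    random_r(&prng->data, &value);
    return value;
}

Each pgbench thread would initialize one of these with a distinct seed and
call thread_prng_next() wherever it previously called random(), which is
what eliminates the contention seen in the first run above.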

...Robert
