Re: sniff test on some PG 8.4 numbers - Mailing list pgsql-performance

From Jon Nelson
Subject Re: sniff test on some PG 8.4 numbers
Msg-id CAKuK5J1LL0Vx8qa1Yj9jeNDueYwG0ZzFpB6q3+tkGAu=OmY1XQ@mail.gmail.com
In response to Re: sniff test on some PG 8.4 numbers  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-performance
On Sun, Mar 10, 2013 at 10:46 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 3/5/13 10:00 PM, Jon Nelson wrote:
>>
>> On Tue, Mar 5, 2013 at 1:35 PM, Jon Nelson <jnelson+pgsql@jamponi.net>
>> wrote:
>>>
>>>
>>> pgbench -h BLAH -c 32 -M prepared -t 100000 -S
>>> I get 95,000 to 100,000 tps.
>>>
>>> pgbench -h BLAH -c 32 -M prepared -t 100000
>>> seems to hover around 6,200 tps (size 100) to 13,700 (size 400)
>>
>>
>> Some followup:
>> The read test goes (up to) 133K tps, and the read-write test to 22k
>> tps when performed over localhost.
>
>
> All your write numbers are inflated because the test is too short.  This
> hardware will be lucky to sustain 7500 TPS on writes.  But you're only
> writing 100,000 transactions, which means the entire test run isn't even
> hitting the database--only the WAL writes are.  When your test run is
> finished, look at /proc/meminfo.  I'd wager a large sum you'll find "Dirty:"
> has hundreds of megabytes, if not gigabytes, of unwritten information.
> Basically, 100,000 writes on this sort of server can all be cached in
> Linux's write cache, and pgbench won't force them out of there.  So you're
> not simulating sustained database writes, only how fast of a burst the
> server can handle for a little bit.
>
> For a write test, you must run for long enough to start and complete a
> checkpoint before the numbers are of any use, and 2 checkpoints are even
> better.  The minimum useful length is a 10 minute run, so "-T 600" instead
> of using -t.  If you want something that does every trick possible to make
> it hard to cheat at this, as well as letting you graph size and client data,
> try my pgbench-tools: https://github.com/gregs1104/pgbench-tools  (Note that
> there is a bug in that program right now, it spawns vmstat and iostat
> processes but they don't get killed at the end correctly.  "killall vmstat
> iostat" after running is a good idea until I fix that).
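
The "Dirty:" check suggested above can be scripted. This is only a sketch: the function names are made up for illustration, and SAMPLE is a hypothetical excerpt, not output from the machine discussed in this thread.

```python
import re


def dirty_kib(meminfo_text):
    """Return the 'Dirty:' value in kB from /proc/meminfo-style text, or None."""
    match = re.search(r"^Dirty:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_dirty_kib(path="/proc/meminfo"):
    """Read the live value on a Linux host (path is the standard location)."""
    with open(path) as f:
        return dirty_kib(f.read())


# Hypothetical sample text, for illustration only.
SAMPLE = """MemTotal:       65965432 kB
MemFree:         1234567 kB
Dirty:            812345 kB
Writeback:             0 kB
"""

sample_dirty = dirty_kib(SAMPLE)
print(f"Dirty: {sample_dirty / 1024:.1f} MiB")
```

Calling read_dirty_kib() right after a pgbench run ends would show how much written data is still sitting in the Linux page cache.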

I (briefly!) acquired a machine identical to the last one, but this time
with an Areca controller instead of an LSI (4 drives).

The following is with ext4, nobarrier, and noatime. As noted in the
original post, I have done a fair bit of system tuning. I have the
dirty_bytes and dirty_background_bytes set to 3GB and 2GB,
respectively.
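
For reference, a small sketch of what those two settings look like as sysctl entries. It assumes "GB" here means GiB (1024^3 bytes), which the message does not state; the helper name is made up for illustration.

```python
GIB = 1024 ** 3  # assumption: "3GB" and "2GB" are meant as GiB


def dirty_sysctl_lines(dirty_gib, background_gib):
    """Format vm.dirty_bytes / vm.dirty_background_bytes as sysctl.conf lines."""
    return [
        f"vm.dirty_bytes = {dirty_gib * GIB}",
        f"vm.dirty_background_bytes = {background_gib * GIB}",
    ]


lines = dirty_sysctl_lines(3, 2)
for line in lines:
    print(line)
```

With these values, up to about 3 GiB of written data can stay in the page cache before writers are forced to flush, which is worth keeping in mind when reading the write numbers below.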

I built 9.2 and, using it, ran the following pgbench invocation:

pgbench  -j 8  -c 32 -M prepared -T 600

transaction type: TPC-B (sort of)
scaling factor: 400
query mode: prepared
number of clients: 32
number of threads: 8
duration: 600 s
number of transactions actually processed: 16306693
tps = 27176.566608 (including connections establishing)
tps = 27178.518841 (excluding connections establishing)
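
As a quick sanity check, the reported tps can be recomputed from the transaction count and the nominal duration. A minimal sketch; the small difference from pgbench's own figure presumably comes from pgbench dividing by the measured elapsed time rather than exactly 600 s (an assumption, not something shown in the output).

```python
def tps(transactions, seconds):
    """Transactions per second over a run of the given length."""
    return transactions / seconds


# Numbers copied from the pgbench output above.
transactions = 16306693
duration_s = 600
reported_tps = 27176.566608  # "including connections establishing"

computed = tps(transactions, duration_s)
relative_gap = abs(computed - reported_tps) / reported_tps
print(f"computed {computed:.1f} tps, relative gap {relative_gap:.5%}")
```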

> Your read test numbers are similarly inflated, but read test errors aren't
> as large.  Around 133K TPS on select-only is probably accurate. For a read
> test, use "-T 30" to let it run for 30 seconds to get a more accurate
> number.  The read bottleneck on your hardware is going to be the
> pgbench client itself, which on 8.4 is running as a single thread.  On 9.0+
> you can have multiple pgbench workers.  It normally takes 4 to 8 of them to
> saturate a larger server.

The 'select-only' test (same as above with '-S'):

starting vacuum...end.
transaction type: SELECT only
scaling factor: 400
query mode: prepared
number of clients: 32
number of threads: 8
duration: 600 s
number of transactions actually processed: 127513307
tps = 212514.337971 (including connections establishing)
tps = 212544.392278 (excluding connections establishing)

These are the *only* changes I've made to the config file:

shared_buffers = 32GB
wal_buffers = 16MB
checkpoint_segments = 1024
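
To put checkpoint_segments = 1024 in perspective, here is a rough sketch of the WAL volume involved. It assumes the default 16 MB segment size and the default checkpoint_completion_target of 0.5 (neither is listed as changed above), and uses the upper bound on WAL files given in the PostgreSQL documentation for releases that have checkpoint_segments.

```python
SEGMENT_MIB = 16  # default WAL segment size (assumed unchanged)


def wal_between_checkpoints_mib(checkpoint_segments):
    """WAL written before a segment-count-triggered checkpoint starts."""
    return checkpoint_segments * SEGMENT_MIB


def max_wal_files(checkpoint_segments, completion_target=0.5):
    """Documented normal upper bound:
    (2 + checkpoint_completion_target) * checkpoint_segments + 1."""
    return int((2 + completion_target) * checkpoint_segments) + 1


segments = 1024
between_mib = wal_between_checkpoints_mib(segments)
files = max_wal_files(segments)
print(f"{between_mib / 1024:.0f} GiB of WAL between segment-triggered checkpoints")
print(f"up to {files} WAL files, about {files * SEGMENT_MIB / 1024:.1f} GiB on disk")
```

At this setting, a checkpoint during a 10 minute run is more likely to be triggered by checkpoint_timeout (5 minutes by default) than by the segment count.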

I can run either or both of these again with different options, but
mostly I'm looking for a sniff test.
However, I'm now a bit confused.

It seems as though you are saying the write numbers are not believable,
suggesting a value of 7,500 (roughly 1/4 of what I'm getting). If I run
the read test for 30 seconds, I get highly variable results, between
300K and 400K tps. Why are these tps so high compared to your expectations?
Note: I did get better results with HT on vs. with HT off, so I've
left HT on for now.



--
Jon

