Re: Analysis of ganged WAL writes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Analysis of ganged WAL writes
Msg-id 22173.1033992316@sss.pgh.pa.us
In response to Re: Analysis of ganged WAL writes  (Hannu Krosing <hannu@tm.ee>)
Responses Re: Analysis of ganged WAL writes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hannu Krosing <hannu@tm.ee> writes:
> in an ideal world this would be 5*120=600 tps. 
> Have you any good ideas what holds it back for the other 300 tps ?

Well, recall that the CPU usage was about 20% in the single-client test.
(The reason I needed a variant version of pgbench is that this machine
is too slow to do more than 120 TPC-B transactions per second anyway.)

That says that the best possible throughput on this test scenario is 5
transactions per disk rotation --- the CPU is just not capable of doing
more.  I am actually getting about 4 xact/rotation for 10 or more
clients (in fact it seems to reach that plateau at 8 clients, and be
close to it at 7).  I'm inclined to think that the fact that it's 4 not
5 is just a matter of "not quite there" --- there's some additional CPU
overhead due to lock contention, etc, and any slowdown at all will cause
it to miss making 5.  The 20% CPU figure was approximate to begin with,
anyway.
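
To make the quantization argument concrete, here is a back-of-the-envelope
sketch. It assumes 120 disk rotations per second (inferred from the "5*120"
figure quoted above, not measured here) and models CPU cost per transaction
as a whole-number percentage of one rotation; the function names are made up
for illustration.

```python
# Rough model: only whole transactions fit into one disk rotation, so the
# per-rotation count is the floor of (100% / CPU cost per transaction).

ROTATIONS_PER_SEC = 120  # assumption: inferred from the "5*120" figure


def xacts_per_rotation(cpu_pct_per_xact: int) -> int:
    """Whole transactions the CPU can finish within one rotation."""
    return 100 // cpu_pct_per_xact


def tps_ceiling(cpu_pct_per_xact: int) -> int:
    """Throughput ceiling if every rotation commits a full gang."""
    return xacts_per_rotation(cpu_pct_per_xact) * ROTATIONS_PER_SEC


if __name__ == "__main__":
    for pct in (20, 21, 25):
        print(pct, xacts_per_rotation(pct), tps_ceiling(pct))
```

At exactly 20% per transaction the ceiling is 5 per rotation (600 tps), but
at 21% it already drops to 4 per rotation (480 tps), which is the "any
slowdown at all will cause it to miss making 5" effect.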

The other interesting question is why we're not able to saturate the
machine with only 4 or 5 clients.  I think pgbench itself is probably
to blame for that: it can't keep all its backend threads constantly
busy ... especially not when several of them report back transaction
completion at essentially the same instant, as will happen under
ganged-commit conditions.  There will be intervals where multiple
backends are waiting for pgbench to send a new command.  That delay
in starting a new command cycle is probably enough for them to "miss the
bus" of getting included in the next commit write.
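
A toy model of the "miss the bus" guess, purely illustrative: it assumes one
commit write per rotation, expresses time as a percentage of a rotation, and
ignores CPU contention between backends. The parameters work_pct and
restart_delay_pct are hypothetical, not measurements from this test.

```python
# Toy model: after its commit is flushed, a client waits restart_delay_pct
# (time for pgbench to send the next command) and then needs work_pct to get
# its next commit record ready. Commit writes happen once per rotation, so a
# client that is not ready in time waits for the following rotation.


def rotations_per_commit(work_pct: int, restart_delay_pct: int) -> int:
    """Rotations between successive commits of a single client."""
    busy = work_pct + restart_delay_pct
    return max(1, -(-busy // 100))  # ceiling division on integers


def commits_per_rotation(clients: int, work_pct: int,
                         restart_delay_pct: int) -> float:
    """Average commits per rotation across all clients."""
    return clients / rotations_per_commit(work_pct, restart_delay_pct)


if __name__ == "__main__":
    print(commits_per_rotation(4, 90, 0))   # no client-side delay
    print(commits_per_rotation(4, 90, 20))  # small delay, misses the bus
```

In this model a client that needs 90% of a rotation commits every rotation
when there is no restart delay, but only every second rotation once a 20%
delay is added, so four such clients average 2 commits per rotation instead
of 4.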

That's just a guess though; I don't have tools that would let me see
exactly what's happening.  Anyone else want to reproduce the test on
a different system and see what it does?

> If it has CPU utilisation of only 50% then there must be still some
> moderate lock contention. 

No, that's I/O wait I think, forced by the quantization of the number
of transactions that get committed per rotation.

> btw, what is the number for 1-5-10 clients with fsync off ? 

About 640 tps at 1 and 5, trailing off to 615 at 10, and down to 450
at 100 clients (now that must be lock contention...)
        regards, tom lane

