Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
Msg-id 84c22fbb-b9c4-a02f-384b-b4feb2c67193@2ndquadrant.com
In response to Re: Speed up Clog Access by increasing CLOG buffers  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
List pgsql-hackers
Hi,

> The attached results show that:
>
> (a) master shows the same zig-zag behavior - No idea why this wasn't
> observed on the previous runs.
>
> (b) group_update actually seems to improve the situation, because the
> performance keeps stable up to 72 clients, while on master the
> fluctuation starts way earlier.
>
> I'll redo the tests with a newer kernel - this was on 3.10.x which is
> what Red Hat 7.2 uses, I'll try on 4.8.6. Then I'll try with the patches
> you submitted, if the 4.8.6 kernel does not help.
>
> Overall, I'm convinced this issue is unrelated to the patches.

I've been unable to rerun the tests on this hardware with a newer 
kernel, so nothing new on the x86 front.

But as discussed with Amit in Tokyo at pgconf.asia, I got access to a 
Power8e machine (IBM 8247-22L to be precise). It's a much smaller 
machine compared to the x86 one, though - it only has 24 cores in 2 
sockets, 128GB of RAM and less powerful storage, for example.

I've repeated a subset of x86 tests and pushed them to
    https://bitbucket.org/tvondra/power8-results-2

The new results are prefixed with "power-" and I've tried to put them 
right next to the "same" x86 tests.

In all cases the patches significantly reduce the contention on 
CLogControlLock, just like on x86. Which is good and expected.

Otherwise the results are rather boring - no major regressions compared 
to master, and all the patches perform almost exactly the same. Compare 
for example this:

* http://tvondra.bitbucket.org/#dilip-300-unlogged-sync

* http://tvondra.bitbucket.org/#power-dilip-300-unlogged-sync

So the results seem much smoother compared to x86, and the performance 
difference is roughly 3x, which matches the 24 vs. 72 cores.

For pgbench, the difference is much more significant, though:

* http://tvondra.bitbucket.org/#pgbench-300-unlogged-sync-skip

* http://tvondra.bitbucket.org/#power-pgbench-300-unlogged-sync-skip

So, we're doing ~40k on Power8, but 220k on x86 (which is ~6x more, so 
double per-core throughput). My first guess was that this is due to the 
x86 machine having a better I/O subsystem, so I've rerun the tests with 
the data directory in tmpfs, but that produced almost the same results.
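
As a quick sanity check of the arithmetic above, here is a minimal sketch 
in Python. It only uses the rounded figures quoted in this message (~40k 
and 220k, 24 and 72 cores); the variable and function names are made up 
for illustration, and treating the two numbers as pgbench throughput in 
the same unit is an assumption.

```python
# Rounded figures quoted in the message; not exact measurements.
X86_THROUGHPUT = 220_000     # pgbench result on the x86 machine (unit assumed)
POWER8_THROUGHPUT = 40_000   # pgbench result on the Power8 machine (same unit assumed)
X86_CORES = 72
POWER8_CORES = 24


def per_core(throughput, cores):
    """Throughput divided evenly across cores (hypothetical helper)."""
    return throughput / cores


# Ratio of total throughput between the two machines.
total_ratio = X86_THROUGHPUT / POWER8_THROUGHPUT

# Ratio of core counts between the two machines.
core_ratio = X86_CORES / POWER8_CORES

# Ratio of per-core throughput; equal to total_ratio / core_ratio.
per_core_ratio = (per_core(X86_THROUGHPUT, X86_CORES)
                  / per_core(POWER8_THROUGHPUT, POWER8_CORES))

print(f"total throughput ratio: {total_ratio:.2f}x")
print(f"core count ratio:       {core_ratio:.2f}x")
print(f"per-core ratio:         {per_core_ratio:.2f}x")
```

With these rounded inputs the total ratio comes out a bit under 6x and the 
per-core ratio a bit under 2x, which is consistent with the "~6x" and 
"double" approximations above.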

Of course, this observation is unrelated to this patch.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


