Re: Removing PD_ALL_VISIBLE - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Removing PD_ALL_VISIBLE
Msg-id 1354700409.28666.33.camel@jdavis
In response to Re: Removing PD_ALL_VISIBLE  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Removing PD_ALL_VISIBLE
List pgsql-hackers
On Fri, 2012-11-30 at 13:16 -0800, Jeff Davis wrote:
> I tried for quite a while to show any kind of performance difference
> between checking the VM and checking PD_ALL_VISIBLE on a 12-core box (24
> if you count HT).
>
> Three patches in question:
>   1. Current unpatched master
>   2. patch that naively always checks the VM page, pinning and unpinning
> each time
>   3. Same as #2, but tries to keep buffers pinned (I had to fix a bug in
> this patch though -- new version forthcoming)

New patch attached.

Nathan Boley kindly lent me access to a 64-core box, and that shows a
much more interesting result. The previous test (on the 12-core)
basically showed no difference between any of the patches.

Now I see why, on the 64-core box: the interesting region seems to be
around 32 concurrent connections.

The left column is the concurrency, and the right is the runtime. This
test was for concurrent scans of a 350MB table (each process did 4 scans
and quit). Test program attached.
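
For readers without the attachment, here is a minimal sketch of the shape
of such a test. It is not the attached program. It uses threads rather
than separate processes, and `fake_scan` is a hypothetical stand-in for
whatever issues the real table scan (for example, a query run through a
database driver). All function names here are my own.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SCANS_PER_WORKER = 4  # from the description above: each process did 4 scans


def run_worker(scan_once, scans=SCANS_PER_WORKER):
    """Run `scans` sequential scans, as one client would, then return."""
    for _ in range(scans):
        scan_once()


def time_concurrent_scans(scan_once, concurrency, scans=SCANS_PER_WORKER):
    """Return wall-clock seconds for `concurrency` workers to all finish."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(run_worker, scan_once, scans)
                   for _ in range(concurrency)]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return time.perf_counter() - start


def fake_scan():
    # Hypothetical stand-in: a real test would open a connection and
    # scan the test table instead of sleeping.
    time.sleep(0.001)


if __name__ == "__main__":
    # Print "concurrency runtime" pairs in the same layout as the results.
    for n in (1, 2, 4, 8):
        print("%03d %010.6f" % (n, time_concurrent_scans(fake_scan, n)))
```

To reproduce the layout of the results, replace `fake_scan` with a
function that performs one full scan of the table and extend the list of
concurrency levels.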

Patch 1 (scan test):

001 004.299533
002 004.434378
004 004.708533
008 004.518470
012 004.487033
016 004.513915
024 004.765459
032 006.425780
048 007.089146
064 007.908850
072 009.461419
096 013.098646
108 015.278592
128 019.797206

Patch 2 (scan test):

001 004.385206
002 004.596340
004 004.616684
008 004.832248
012 004.858336
016 004.689959
024 005.016797
032 006.857642
048 012.049407
064 025.774772
072 032.680710
096 059.147500
108 083.654806
128 120.350200

Patch 3 (scan test):

001 004.464991
002 004.555595
004 004.562364
008 004.649633
012 004.628159
016 004.518748
024 004.768348
032 004.834177
048 007.003305
064 008.242714
072 009.732261
096 013.231056
108 014.996977
128 020.488570

As you can see, patch #2 starts to show a difference at around 32 and
completely falls over by 48 connections. This is expected because it's
the naive approach that pins the VM page every time it needs it.

Patches #1 and #3 are effectively the same: subsequent runs (with more
measurements around concurrency 32) show that the differences are just
noise, which seems to be greater around the inflection point of 32. Any
of the numbers that seem to show a difference can end up with patch #1
better or patch #3 better, depending on the run.
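
To make the comparison easier to read, the runtimes can be expressed as
ratios against patch #1. The sketch below uses only the numbers listed
above; the variable and function names are mine.

```python
# Concurrency levels and runtimes copied from the three result lists above.
CONCURRENCY = [1, 2, 4, 8, 12, 16, 24, 32, 48, 64, 72, 96, 108, 128]

RUNTIME = {
    1: [4.299533, 4.434378, 4.708533, 4.518470, 4.487033, 4.513915,
        4.765459, 6.425780, 7.089146, 7.908850, 9.461419, 13.098646,
        15.278592, 19.797206],
    2: [4.385206, 4.596340, 4.616684, 4.832248, 4.858336, 4.689959,
        5.016797, 6.857642, 12.049407, 25.774772, 32.680710, 59.147500,
        83.654806, 120.350200],
    3: [4.464991, 4.555595, 4.562364, 4.649633, 4.628159, 4.518748,
        4.768348, 4.834177, 7.003305, 8.242714, 9.732261, 13.231056,
        14.996977, 20.488570],
}


def ratio_to_baseline(patch, baseline=1):
    """Map each concurrency level to runtime(patch) / runtime(baseline)."""
    return {c: p / b
            for c, p, b in zip(CONCURRENCY, RUNTIME[patch], RUNTIME[baseline])}


if __name__ == "__main__":
    r2 = ratio_to_baseline(2)
    r3 = ratio_to_baseline(3)
    for c in CONCURRENCY:
        print("%03d  patch2/patch1=%.2f  patch3/patch1=%.2f" % (c, r2[c], r3[c]))
```

On these numbers, patch #2 takes roughly 1.7 times as long as patch #1 at
48 connections and roughly 6 times as long at 128. Patch #3 stays within
about 5% of patch #1 everywhere except at 32, the noisy point.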

I tried the delete test, too, but I still couldn't see any difference.
(I neglected to mention in my last email: I aborted after each delete so
that it would be repeatable). The inflection point there is
significantly lower, so I assume it must be contending over something
else. I tried making the table unlogged to see if that would change
things, but it didn't change much. This test only scales linearly up to
a concurrency of about 8. Or, there could be something wrong with my test.

So, I conclude that contention is certainly a problem for scans for
patch #2, but patch #3 seems to fix that completely by holding the
buffer pins. The deletes are somewhat inconclusive, but I just can't see
a difference.
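
The difference between the two approaches can be illustrated with a toy
model that only counts pin and unpin calls. This is not the patch code.
`PinCounter` is a hypothetical stand-in for the buffer manager, and the
number of heap pages covered by one VM page is an assumption (default
8kB blocks, one bit per heap page).

```python
# Assumption: one visibility-map page covers this many heap pages.
# The real value depends on the block size and the PostgreSQL version.
HEAP_PAGES_PER_VM_PAGE = 65344


class PinCounter:
    """Counts pin/unpin calls; a stand-in for the buffer manager."""

    def __init__(self):
        self.pins = 0
        self.unpins = 0

    def pin(self, vm_page):
        self.pins += 1
        return vm_page

    def unpin(self, vm_page):
        self.unpins += 1


def scan_naive(n_heap_pages, bufmgr):
    """Patch #2 style: pin and unpin the VM page for every heap page."""
    for heap_page in range(n_heap_pages):
        vm = bufmgr.pin(heap_page // HEAP_PAGES_PER_VM_PAGE)
        # ... the all-visible bit would be checked here ...
        bufmgr.unpin(vm)


def scan_keep_pin(n_heap_pages, bufmgr):
    """Patch #3 style: keep the pin until a different VM page is needed."""
    held = None
    for heap_page in range(n_heap_pages):
        wanted = heap_page // HEAP_PAGES_PER_VM_PAGE
        if held != wanted:
            if held is not None:
                bufmgr.unpin(held)
            held = bufmgr.pin(wanted)
        # ... the all-visible bit would be checked here ...
    if held is not None:
        bufmgr.unpin(held)  # release the last pin at end of scan


if __name__ == "__main__":
    n_pages = 350 * 1024 // 8  # a 350MB table, assuming 8kB pages
    naive, kept = PinCounter(), PinCounter()
    scan_naive(n_pages, naive)
    scan_keep_pin(n_pages, kept)
    print("naive pins:", naive.pins, "kept pins:", kept.pins)
```

Under these assumptions the whole 350MB table is covered by a single VM
page, so the naive scan pins that same page once per heap page while the
pin-holding scan pins it once per scan. That is consistent with every
concurrent scan contending on the same buffer in patch #2.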

Holding more pins does have a distributed cost in theory, as Tom points
out, but I don't know where to begin testing that. We'll have to make a
decision between (a) maintaining the extra complexity and doing the
extra page writes involved with PD_ALL_VISIBLE; or (b) holding onto one
extra pin per table being scanned. Right now, if PD_ALL_VISIBLE did not
exist, it would be pretty hard to justify putting it in as far as I can
tell.

Regards,
    Jeff Davis
