Re: Removing PD_ALL_VISIBLE - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Removing PD_ALL_VISIBLE
Date
Msg-id 1358497876.26970.52.camel@jdavis
In response to Re: Removing PD_ALL_VISIBLE  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Removing PD_ALL_VISIBLE
List pgsql-hackers
On Thu, 2013-01-17 at 14:53 -0800, Jeff Davis wrote:
> Test plan:
>
>   1. Take current patch (without "skip VM check for small tables"
> optimization mentioned above).
>   2. Create 500 tables each about 1MB.
>   3. VACUUM them all.
>   4. Start 500 connections (one for each table)
>   5. Time the running of a loop that executes a COUNT(*) on that
> connection's table 100 times.

Done, with a few extra variables. Again, thanks to Nathan Boley for
lending me the 64-core box. Test program attached.

I did both 1MB tables and 1 tuple tables, but I ended up throwing out
the 1-tuple table results. First of all, as I said, that's a pretty easy
problem to solve, so not really what I want to test. Second, I had to do
so many iterations that I don't think I was testing anything useful. I
did see what might have been a couple of differences, but I would need to
explore in more detail and I don't think it's worth it, so I'm just
reporting on the 1MB tables.

For each test, each of 500 connections runs 10 iterations of a COUNT(*)
on its own 1MB table (which is vacuumed and has the VM bit set). The
query is prepared once. The table has only an int column.

The variable is shared_buffers, going from 32MB (near exhaustion for 500
connections) to 2048MB (everything fits).

The last column is the time range in seconds. I included the range this
time because there was more variance between runs, but I still think
these are good test results.

master:
    32MB: 16.4 - 18.9
    64MB: 16.9 - 17.3
   128MB: 17.5 - 17.9
   256MB: 14.7 - 15.8
   384MB:  8.1 -  9.3
   448MB:  4.3 -  9.2
   512MB:  1.7 -  2.2
   576MB:  0.6 -  0.6
  1024MB:  0.6 -  0.6
  2048MB:  0.6 -  0.6

patch:
    32MB: 16.8 - 17.6
    64MB: 17.1 - 17.5
   128MB: 17.2 - 18.0
   256MB: 14.8 - 16.2
   384MB:  8.0 - 10.1
   448MB:  4.6 -  7.2
   512MB:  2.0 -  2.6
   576MB:  0.6 -  0.6
  1024MB:  0.6 -  0.6
  2048MB:  0.6 -  0.6

Conclusion:

I see about what I expect: a precipitous drop in runtime after
everything fits in shared_buffers (500 1MB tables means the inflection
point around 512MB makes a lot of sense). There does seem to be a
measurable difference right around that inflection point, but it's not
much. Considering that this is the worst case that I could devise, I am
not too concerned about this.

However, it is interesting to see that there really is a lot of
maintenance work being done when we need to move pages in and out of
shared buffers. I'm not sure that it's related to the freelists though.

For the extra pins to really be a problem, I think a much higher
percentage of the buffers would need to be pinned. The case we are
worried about involves scans (if it involved indexes, it would already
be using more than one pin per scan), so the only way to get a high
percentage of pinned buffers is with very small tables. But we don't
really need to use the VM when scanning very small tables (the overhead
would be elsewhere), so I think we're OK.

So, I attached a new version of the patch that doesn't look at the VM
for tables with fewer than 32 pages. That's the only change.

Regards,
    Jeff Davis
