Re: Eliminating PD_ALL_VISIBLE, take 2 - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Eliminating PD_ALL_VISIBLE, take 2
Date
Msg-id 1373910086.14172.17.camel@jdavis
In response to Re: Eliminating PD_ALL_VISIBLE, take 2  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Eliminating PD_ALL_VISIBLE, take 2
List pgsql-hackers
On Sun, 2013-07-14 at 23:06 -0400, Robert Haas wrote:
> > Of course, there's a reason that PD_ALL_VISIBLE is not like a normal
> > hint: we need to make sure that inserts/updates/deletes clear the VM
> > bit. But my patch already addresses that by keeping the VM page pinned.
> 
> I'm of the opinion that we ought to extract the parts of the patch
> that hold the VM pin for longer, review those separately, and if
> they're good and desirable, apply them.

I'm confused. My patch holds a VM page pinned for those cases where
PD_ALL_VISIBLE is currently used -- scans or insert/update/delete. If we
keep PD_ALL_VISIBLE, there's no point in the cached VM pin, right?

> I am not convinced.  I thought about the problem of repeatedly
> switching pinned VM pages during the index-only scans work, and
> decided that we could live with it because, if the table was large
> enough that we were pinning VM pages frequently, we were also avoiding
> I/O.  Of course, this is a logical fallacy, since the table could
> easily be large enough to have quite a few VM pages and yet small
> enough to fit in RAM.  And, indeed, at least in the early days, an
> index scan could beat out an index-only scan by a significant margin
> on a memory-resident table, precisely because of the added cost of the
> VM lookups.  I haven't benchmarked lately so I don't know for sure
> whether that's still the case, but I bet it is.

To check visibility from an index scan, you either need to pin a heap
page or a VM page. Why would checking the heap page be cheaper? Is it
just other code in the VM-testing path that's slower? Or is there
concurrency involved in your measurements which may indicate contention
over VM pages?
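
As a rough illustration of the trade-off in question, the sketch below is a
toy model, not PostgreSQL code. Every name in it (ToyScan,
HEAP_PAGES_PER_VM_PAGE, tuple_visible_without_heap) is hypothetical, and the
pages-per-VM-page ratio is made tiny on purpose. It only counts how often a
scan would have to switch its pinned VM page versus fall back to pinning a
heap page; it says nothing about the real cost of either operation.

```python
# Toy model only: none of these names are PostgreSQL APIs.
from dataclasses import dataclass

HEAP_PAGES_PER_VM_PAGE = 4  # assumption for illustration; far larger in practice


@dataclass
class ToyScan:
    all_visible: list          # one bool per heap page (stand-in for VM bits)
    pinned_vm_page: int = -1   # VM page currently held pinned, -1 means none
    vm_pins: int = 0           # number of times a VM page had to be (re)pinned
    heap_pins: int = 0         # number of heap pages that had to be pinned

    def tuple_visible_without_heap(self, heap_page: int) -> bool:
        """Return True if the VM bit alone settles visibility for this page."""
        vm_page = heap_page // HEAP_PAGES_PER_VM_PAGE
        if vm_page != self.pinned_vm_page:
            # Switching VM pages costs a pin; staying on the same one does not.
            self.pinned_vm_page = vm_page
            self.vm_pins += 1
        if self.all_visible[heap_page]:
            return True
        # Bit not set: fall back to pinning the heap page and checking there.
        self.heap_pins += 1
        return False


# Eight heap pages, one of which is not all-visible.
scan = ToyScan(all_visible=[True, True, False, True, True, True, True, True])
results = [scan.tuple_visible_without_heap(p) for p in range(8)]
print(scan.vm_pins, scan.heap_pins)
```

In this model a sequential pass repins the VM page only when it crosses a
VM-page boundary, while a plain index scan would pin a heap page for every
tuple it checks. Whether the VM path is really cheaper per lookup is exactly
the open question above.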

> I think this idea is worth exploring, although I fear the overhead is
> likely to be rather large.  We could find out, though.  Suppose we
> simply change XLOG_HEAP2_VISIBLE to emit FPIs for the heap pages; how
> much does that slow down vacuuming a large table into which many pages
> have been bulk loaded?  Sadly, I bet it's rather a lot, but I'd like
> to be wrong.

My point was that, if freezing needs to emit an FPI anyway, and we
combine freezing and PD_ALL_VISIBLE, then using WAL properly wouldn't
cost us anything extra. Whether that makes sense depends on which
combination of the other proposals makes it in, of course. I agree that
we don't want to start adding FPIs unnecessarily.

Anyway, thanks for the feedback. Moved out of this 'fest.

Regards,
	Jeff Davis




