On Sun, 2013-07-14 at 23:06 -0400, Robert Haas wrote:
> > Of course, there's a reason that PD_ALL_VISIBLE is not like a normal
> > hint: we need to make sure that inserts/updates/deletes clear the VM
> > bit. But my patch already addresses that by keeping the VM page pinned.
>
> I'm of the opinion that we ought to extract the parts of the patch
> that hold the VM pin for longer, review those separately, and if
> they're good and desirable, apply them.

I'm confused. My patch holds a VM page pinned for those cases where
PD_ALL_VISIBLE is currently used -- scans or insert/update/delete. If we
have PD_ALL_VISIBLE, there's no point in the cache, right?

> I am not convinced. I thought about the problem of repeatedly
> switching pinned VM pages during the index-only scans work, and
> decided that we could live with it because, if the table was large
> enough that we were pinning VM pages frequently, we were also avoiding
> I/O. Of course, this is a logical fallacy, since the table could
> easily be large enough to have quite a few VM pages and yet small
> enough to fit in RAM. And, indeed, at least in the early days, an
> index scan could beat out an index-only scan by a significant margin
> on a memory-resident table, precisely because of the added cost of the
> VM lookups. I haven't benchmarked lately so I don't know for sure
> whether that's still the case, but I bet it is.

To check visibility from an index scan, you either need to pin a heap
page or a VM page. Why would checking the heap page be cheaper? Is it
just other code in the VM-testing path that's slower? Or is there
concurrency involved in your measurements which may indicate contention
over VM pages?
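
For a sense of scale on the "quite a few VM pages and yet small enough
to fit in RAM" point, here is a back-of-the-envelope sketch in Python.
It assumes the default 8 kB block size, a 24-byte page header, and one
visibility bit per heap page. The function names are made up for
illustration and are not PostgreSQL APIs.

```python
import math

BLCKSZ = 8192            # assumed default block size, in bytes
PAGE_HEADER_BYTES = 24   # assumed page header size after alignment
BITS_PER_HEAP_PAGE = 1   # assumed: one all-visible bit per heap page


def heap_pages_per_vm_page():
    """Heap pages whose visibility bits fit on a single VM page."""
    return (BLCKSZ - PAGE_HEADER_BYTES) * 8 // BITS_PER_HEAP_PAGE


def vm_pages_for_table(table_bytes):
    """VM pages needed to cover a heap of the given size in bytes."""
    heap_pages = math.ceil(table_bytes / BLCKSZ)
    return math.ceil(heap_pages / heap_pages_per_vm_page())


if __name__ == "__main__":
    gib = 1024 ** 3
    for size_gib in (1, 16, 64):
        print(size_gib, "GiB heap ->",
              vm_pages_for_table(size_gib * gib), "VM pages")
```

Under those assumptions each VM page covers about 510 MiB of heap, so a
16 GiB table that fits comfortably in RAM would still span 33 VM pages.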

> I think this idea is worth exploring, although I fear the overhead is
> likely to be rather large. We could find out, though. Suppose we
> simply change XLOG_HEAP2_VISIBLE to emit FPIs for the heap pages; how
> much does that slow down vacuuming a large table into which many pages
> have been bulk loaded? Sadly, I bet it's rather a lot, but I'd like
> to be wrong.

My point was that, if freezing needs to emit an FPI anyway, and we
combine freezing and PD_ALL_VISIBLE, then using WAL properly wouldn't
cost us anything. Whether that makes sense depends on what other
combination of proposals makes it in, of course. I agree that we don't
want to start adding FPIs unnecessarily.

Anyway, thanks for the feedback. Moved out of this 'fest.

Regards,
Jeff Davis