Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)
Date
Msg-id 20170731221305.GB25306@marmot
Whole thread Raw
In response to Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page(Was: [HACKERS] [WIP] Zipfian distribution in pgbench)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
>On Mon, Jul 31, 2017 at 1:54 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> That is hard to justify. I don't think that failing to set LP_DEAD hints
>> is the cost that must be paid to realize a benefit elsewhere, though. I
>> don't see much problem with having both benefits consistently. It's
>> actually very unlikely that VACUUM will run, and a TID will be recycled
>> at exactly the wrong time. We could probably come up with a more
>> discriminating way of detecting that that may have happened, at least
>> for Postgres 11. We'd continue to use LSN; the slow path would be taken
>> when the LSN changed, but we do not give up on setting LP_DEAD bits. I
>> think we can justify going to the heap again in this slow path, if
>> that's what it takes.
>
>Yeah, that might possibly be a good approach.

I also wonder if it's worth teaching lazy_scan_heap() to keep around a
list of TIDs that can at least have their LP_DEAD bit set within their
index page, for use during subsequent index vacuuming. Doing at least
this much for TIDs from heap pages that are skipped due to some other
session concurrently holding a pin on the heap page ("pinskipped_pages"
pages) could help some cases, and seems doable. VACUUM does not need an
extra interlock against another VACUUM (such as a buffer pin) here, of
course.

I wouldn't expect this to help very much on many workloads, including
Alik's Zipfian workload, but it might be useful to have a real guarantee
about how long it can be, in VACUUM cycles, before a dead index tuple at
least has its LP_DEAD bit set. LP_DEAD will only be set when an index
scan goes to the heap, and it's not hard to imagine workloads where no
index scan ever does that with dead tuples whose heap TIDs were killed
only very recently.

Unlike with heap pruning, setting the LP_DEAD bit of all dead index
tuples on a leaf page is pretty much as good as a full VACUUM of the
page. The only thing that makes it not quite as good is that you cannot
assume that it's safe to kill the heap TIDs afterwards.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: [HACKERS] [BUGS] BUG #14759: insert into foreign data partitions fail
Next
From: Stephen Frost
Date:
Subject: Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly