Home > mailing lists

Re: Eliminate redundant tuple visibility check in vacuum - Mailing list pgsql-hackers

From	David Geier
Subject	Re: Eliminate redundant tuple visibility check in vacuum
Date	September 1, 2023 08:47:10
Msg-id	ed6011e9-551f-6156-25c7-7e5ff068fcf4@gmail.com Whole thread Raw
In response to	Re: Eliminate redundant tuple visibility check in vacuum (Peter Geoghegan <pg@bowt.ie>)
List	pgsql-hackers

Tree view

Hi,

On 9/1/23 03:25, Peter Geoghegan wrote:
> On Thu, Aug 31, 2023 at 3:35 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> Any inserting transaction which aborts after heap_page_prune()'s
>> visibility check will now be of no concern to lazy_scan_prune(). Since
>> we don't do the visibility check again, we won't find the tuple
>> HEAPTUPLE_DEAD and thus won't have the problem of adding the tuple to
>> the array of tuples for vacuum to reap. This does mean that we won't
>> reap and remove tuples whose inserting transactions abort right after
>> heap_page_prune()'s visibility check. But, this doesn't seem like an
>> issue.
> It's definitely not an issue.
>
> The loop is essentially a hacky way of getting a consistent picture of
> which tuples should be treated as HEAPTUPLE_DEAD, and which tuples
> need to be left behind (consistent at the level of each page and each
> HOT chain, at least). Making that explicit seems strictly better.
>
>> They may not be removed until the next vacuum, but ISTM it is
>> actually worse to pay the cost of re-pruning the whole page just to
>> clean up that one tuple. Maybe if that renders the page all visible
>> and we can mark it as such in the visibility map -- but that seems
>> like a relatively narrow use case.
> The chances of actually hitting the retry are microscopic anyway. It
> has nothing to do with making sure that dead tuples from aborted
> tuples get removed for its own sake, or anything. Rather, the retry is
> all about making sure that all TIDs that get removed from indexes can
> only point to LP_DEAD stubs. Prior to Postgres 14, HEAPTUPLE_DEAD
> tuples with storage would very occasionally be left behind, which made
> life difficult in a bunch of other places -- for no good reason.
>
That makes sense and seems like a much better compromise. Thanks for the 
explanations. Please update the comment to document the corner case and 
how we handle it.

-- 
David Geier
(ServiceNow)

pgsql-hackers by date:

From: Peter Smith
Date: 01 September 2023, 08:29:06
Subject: Re: Synchronizing slots from primary to standby

From: Amit Kapila
Date: 01 September 2023, 09:15:48
Subject: Re: persist logical slots to disk during shutdown checkpoint

Re: Eliminate redundant tuple visibility check in vacuum - Mailing list pgsql-hackers

Previous

Next