Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id 20220220034732.cchlk25elmechd4k@alap3.anarazel.de
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2022-02-19 19:31:21 -0800, Peter Geoghegan wrote:
> On Sat, Feb 19, 2022 at 6:16 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > > Given that heap_surgery's raison d'etre is correcting corruption etc, I think
> > > it makes sense for it to do as minimal work as possible. Iterating through a
> > > HOT chain would be a problem if you e.g. tried to repair a page with HOT
> > > corruption.
> >
> > I guess that's also true. There is at least a legitimate argument to
> > be made for not leaving behind any orphaned heap-only tuples. The
> > interface is a TID, and so the user may already believe that they're
> > killing the heap-only, not just the root item (since ctid suggests
> > that the TID of a heap-only tuple is the TID of the root item, which
> > is kind of misleading).
> 
> Actually, I would say that heap_surgery's raison d'etre is making
> weird errors related to corruption of this or that TID go away, so
> that the user can cut their losses. That's how it's advertised.

I'm not sure those are all that different... Imagine some corruption leading
to two HOT chains ending in the same TID, which our fancy new secure pruning
algorithm might detect.
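
A minimal sketch of how such a check could look, assuming a toy page model in which each item stores the offset of its successor in a plain array. The names `toy_find_shared_chain_member`, `TOY_MAX_ITEMS` and `TOY_NO_NEXT` are hypothetical and are not PostgreSQL APIs; the real pruning code works on line pointers and tuple headers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ITEMS 32      /* hypothetical page capacity for this sketch */
#define TOY_NO_NEXT   (-1)    /* marks the last member of a chain */

/*
 * Walk every chain starting from the given roots and remember which items
 * have already been claimed by a chain. Returns the offset of the first item
 * that is reached a second time (two chains converging, or a chain looping
 * back on itself), or -1 if every item belongs to at most one chain.
 */
static int
toy_find_shared_chain_member(const int *roots, int nroots,
                             const int *next, int nitems)
{
    bool claimed[TOY_MAX_ITEMS];

    assert(nitems >= 0 && nitems <= TOY_MAX_ITEMS);
    memset(claimed, 0, sizeof(claimed));

    for (int r = 0; r < nroots; r++)
    {
        int off = roots[r];

        /* Stop at end of chain or at an out-of-range link. */
        while (off != TOY_NO_NEXT && off >= 0 && off < nitems)
        {
            if (claimed[off])
                return off;     /* already part of a chain: corruption */
            claimed[off] = true;
            off = next[off];
        }
    }
    return -1;
}
```

With roots 0 and 1 and both of them linking to item 2, the second walk runs into an item the first walk already claimed, and the function reports offset 2.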

Either way, I'm a bit surprised by the logic that disallows killing redirect
items. What if you have a redirect pointing to an unused item?


> Let's assume that we don't want to make VACUUM/pruning just treat
> orphaned heap-only tuples as DEAD, regardless of their true HTSV-wise
> status

I don't think that'd ever be a good idea. Those tuples are visible to a
seqscan after all.


> -- let's say that we want to err in the direction of doing
> nothing at all with the page. Now we have to have a weird error in
> VACUUM instead (not great, but better than just spinning between
> lazy_scan_prune and heap_page_prune).

Non-DEAD orphaned versions shouldn't cause a problem in lazy_scan_prune(). The
problem here is DEAD orphaned HOT tuples, and those we should be able to
delete with the new page pruning logic, right?


I think it might be worth getting rid of the need for the retry approach by
reusing the same HTSV status array between heap_page_prune() and
lazy_scan_prune(). Then the only legitimate reason for seeing a DEAD item in
lazy_scan_prune() would be some form of corruption.  And it'd be a pretty
decent performance boost, HTSV ain't cheap.
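
The shape of that idea can be sketched roughly as follows. This is a toy model, not PostgreSQL code: `ToyPage`, `VisStatus`, `toy_satisfies_vacuum`, `toy_prune_page` and `toy_scan_after_prune` are hypothetical names, the `orphaned` flag merely stands in for an item that pruning cannot reach through a chain, and the counter only makes the saved visibility checks visible.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ITEMS 16   /* hypothetical page capacity for this sketch */

/* Simplified visibility states (not the real HTSV_Result enum). */
typedef enum { VIS_UNKNOWN = 0, VIS_LIVE, VIS_RECENTLY_DEAD, VIS_DEAD } VisStatus;

typedef struct
{
    int       nitems;
    VisStatus truth[TOY_MAX_ITEMS];    /* what the expensive check reports */
    bool      orphaned[TOY_MAX_ITEMS]; /* not reachable by chain traversal */
    bool      removed[TOY_MAX_ITEMS];  /* set once pruning removed the item */
} ToyPage;

static int expensive_checks = 0;       /* counts visibility checks performed */

/* Stand-in for the costly per-tuple visibility check. */
static VisStatus
toy_satisfies_vacuum(const ToyPage *page, int off)
{
    assert(off >= 0 && off < page->nitems);
    expensive_checks++;
    return page->truth[off];
}

/*
 * Pass 1 (pruning): compute each status exactly once, store it in the
 * caller-supplied array, and remove the DEAD items pruning can reach.
 * Returns the number of items removed.
 */
static int
toy_prune_page(ToyPage *page, VisStatus *status)
{
    int nremoved = 0;

    for (int off = 0; off < page->nitems; off++)
    {
        status[off] = toy_satisfies_vacuum(page, off);
        if (status[off] == VIS_DEAD && !page->orphaned[off])
        {
            page->removed[off] = true;
            nremoved++;
        }
    }
    return nremoved;
}

/*
 * Pass 2 (scan): reuse the stored statuses instead of re-checking. Because
 * both passes see the same answers, a surviving DEAD item cannot be a
 * concurrency artifact; the return value counts such items so the caller
 * can report corruption instead of retrying.
 */
static int
toy_scan_after_prune(const ToyPage *page, const VisStatus *status, int *nlive)
{
    int unexpected_dead = 0;

    *nlive = 0;
    for (int off = 0; off < page->nitems; off++)
    {
        if (page->removed[off])
            continue;
        if (status[off] == VIS_DEAD)
            unexpected_dead++;
        else if (status[off] == VIS_LIVE)
            (*nlive)++;
    }
    return unexpected_dead;
}
```

In this model a caller would allocate one `VisStatus` array per page, hand it to `toy_prune_page`, and pass the same array to `toy_scan_after_prune`; the visibility check then runs once per item rather than twice, and a nonzero return from the second pass is the corruption signal rather than a cue to loop.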

Greetings,

Andres Freund