Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAH2-WzkRYCXscEkfzO6gqf4fu-SXy-OkiJ2LyJxaE2_UYsY2+Q@mail.gmail.com
Whole thread Raw
In response to Re: New IndexAM API controlling index vacuum strategies  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Mar 15, 2021 at 4:11 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'm not comfortable with this change without adding more safety
> > > checks. If there's ever a case in which the HEAPTUPLE_DEAD case is hit
> > > and the xid needs to be frozen, we'll either cause errors or
> > > corruption. Yes, that's already the case with params->index_cleanup ==
> > > DISABLED, but that's not that widely used.
> >
> > I noticed that Noah's similar 2013 patch [1] added a defensive
> > heap_tuple_needs_freeze() + elog(ERROR) to the HEAPTUPLE_DEAD case. I
> > suppose that that's roughly what you have in mind here?
>
> I'm not sure that's sufficient. If the case is legitimately reachable
> (I'm maybe 60% is not, after staring at it for a long time, but ...),
> then we can't just error out when we didn't so far.

If you're only 60% sure that the heap_tuple_needs_freeze() error thing
doesn't break anything that should work by now then it seems unlikely
that you'll ever get past 90% sure. I think that we should make a
conservative assumption that the defensive elog(ERROR) won't be
sufficient, and proceed on that basis.

> I kinda wonder whether this case should just be handled by just gotoing
> back to the start of the blkno loop, and redoing the pruning. The only
> thing that makes that a bit more complicatd is that we've already
> incremented vacrelstats->{scanned_pages,vacrelstats->tupcount_pages}.

That seems like a good solution to me -- this is a very seldom hit
path, so we can be a bit inefficient without it mattering.

It might make sense to *also* check some things (maybe using
heap_tuple_needs_freeze()) in passing, just for debugging purposes.

> We really should put the per-page work (i.e. the blkno loop body) of
> lazy_scan_heap() into a separate function, same with the
> too-many-dead-tuples branch.

+1.

BTW I've noticed that the code (and code like it) tends to confuse
things that the current VACUUM performed versus things by *some*
VACUUM (that may or may not be current one). This refactoring might be
a good opportunity to think about that as well.

> >  * It is assumed that the caller has checked the tuple with
> >  * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
> >  * (else we should be removing the tuple, not freezing it).
> >
> > Does that need work too?
>
> I'm pretty scared of the index-cleanup-disabled path, for that reason. I
> think the hot path is more likely to be unproblematic, but I'd not bet
> my (nonexistant) farm on it.

Well if we can solve the problem by simply doing pruning once again in
the event of a HEAPTUPLE_DEAD return value from the lazy_scan_heap()
HTSV call, then the comment becomes 100% true (which is not the case
even today).

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Masahiro Ikeda
Date:
Subject: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Next
From: Tom Lane
Date:
Subject: Re: [PATCH]: Allow errors in parameter values to be reported during the BIND phase itself..