Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id 20220312214910.ubzhc3y5phyjq76h@alap3.anarazel.de
Whole thread Raw
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
Hi,

On 2022-03-10 20:55:37 -0800, Peter Geoghegan wrote:
> On Thu, Mar 10, 2022 at 8:08 PM Andres Freund <andres@anarazel.de> wrote:
> > But that's probably best done separately - but perhaps worth to "weaken" the
> > comment a bit?
> 
> I'm confused again.
> 
> I don't get why it's only going to be worth delaying establishing
> vistest if we can also delay establishing OldestXmin in about the same
> way.

GlobalVisTestFor() doesn't trawl through the procarray to compute horizons. It
just gets already computed horizons, potentially very approximate ones that
later will get updated when necessary to determine the visibility of a
concrete xid.  You could call GlobalVisTestFor() and then later call call
vacuum_set_xid_limits() and the test by GlobalVisTestFor() would still be
accurate.


> AFAICT the only compelling reason to have two separate cutoffs from the
> point of view of vacuumlazy.c is that it enables periodically refreshing
> vistest within lazy_scan_prune, to allow it to prune dead tuples more
> eagerly (fewer recently dead tuples, more dead removed tuples). That I can
> see, that much I get.

The main reason to have vistest at all is that for on-access pruning computing
accurate horizons is excruciatingly expensive - the reason behind procarray
scaling badly were all the xmin accesses during normal snapshot computations!
And due to the way the pruning code is used for both vacuuming and on-access
pruning that kind of bleeds into vacuumlazy.c.  That could have been reduced,
but it seemed beneficial to allow to update horizons as we go.


> Periodically refreshing OldestXmin for freezing won't work, though. At
> the very least it seems much less compelling than the vistest idea.

I think there's a fair bit of value in using an "as aggressive as possible"
initial OldestXmin. Which is used to set a bunch of cutoffs / determine
aggressiveness etc.

Once we compute "measured" relfrozenxid however, there's afaict otherwise not
much point in updating OldestXmin as we go, at least if we start to use
vistest for the horizon determinations inside vacuumlazy.c.


> Yes, it probably is true that we could in principle further split
> OldestXmin along "xmin vs xid horizon" lines (I should say "split it
> again" -- you already split OldestXmin into "OldestXmin for freezing,
> vistest for pruning" as part of your snapshot scalability work). But
> would it really make any sense? If so, I can't see why. At least not
> right now.

If we don't switch to using vistest for HTSV determinations in vacuumlazy.c,
then I don't think we can refresh vistest without causing problems in
lazy_scan_prune().

But I think refreshing horizons is a different discussion from using
as-recent-as-possible initial horizons...

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Japin Li
Date:
Subject: Re: BUG #17435: "add column if not exists" always adds new FK on the column
Next
From: PG Bug reporting form
Date:
Subject: BUG #17436: Initializing and starting with Dockerfile fails