Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id CAH2-WzmGdE_ccJCoTxFik_opCdJrYxi76=xEfROP-6WnmrftuQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
On Fri, Mar 18, 2022 at 7:53 AM Andres Freund <andres@anarazel.de> wrote:
> > Agreed. This is why I find your emphasis on as-recent-as-possible
> > initial horizons confusing. It seems as if you're giving almost equal
> > emphasis to both issues, even though one issue (more eager pruning
> > during VACUUM) is of obvious practical value, while the other
> > (as-recent-as-possible initial horizons) is pretty theoretical and
> > academic -- at best.
>
> I'm not intending to weigh them the same. I think using a more recent horizon
> is more important than you describe here, but "computed horizons" will be a
> considerably larger win.

Okay, got it.

My relfrozenxid and freezing patch series patch v10-0002-* took a new
approach to avoiding the visibility map aggressive VACUUM special
case, that seems like it might be relevant here (surprisingly enough).
The latest version totally embraces making all visibility map access
inside lazy_scan_heap use local state describing a range of skippable
blocks. Not that different, really, but still something that hints at
an interesting future direction.

What if we built a palloc'd array of these ranges up front, right
after establishing OldestXmin? We could even build a complete picture
of the entire table before we'd even scanned the first block. I don't
think it would require too much space. We'd then be operating against
a "snapshot of the visibility map just after OldestXmin was
established", even in a multi-hour VACUUM operation. This is kind of a
move in the opposite direction, but it's not necessarily the wrong
direction; at least this way we're not going to dirty concurrently
modified pages from the largest tables, just to set hint bits or clean
up one or two concurrently aborted transactions.

I also think that this would help with your aio work. Having a
palloc'd array like this would be an ideal target for prefetching. I
actually checked what you were doing here. Evidently the visibility
map stuff at the top of lazy_scan_heap was already a problem for aio:


https://github.com/anarazel/postgres/blob/29d3a3a3ddd47906d9128bc68c218664ab15a2c6/src/backend/access/heap/vacuumlazy.c#L927

I'm posting a more refined version of just that patch here, since it's
closer to the ideal for this kind of thing; it takes all of the
decisions away from lazy_scan_heap, which need only execute skipping
by following instructions built by the new helper function.

--
Peter Geoghegan

Attachment

pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Next
From: Peter Geoghegan
Date:
Subject: Re: snapshot recovery conflict despite hot_standby_feedback set to on