Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Dmitry Dolgov
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id 20211109150216.fgfybn35mwnkeef3@localhost
Whole thread Raw
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
List pgsql-bugs
> On Mon, Nov 08, 2021 at 10:32:39AM -0800, Peter Geoghegan wrote:
> On Mon, Nov 8, 2021 at 9:28 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > Interesting, I don't think I've observed those errors. In fact after the
> > recent changes (I've compiled here from 39a31056) around assertion logic
> > and index_delete_check_htid now I'm getting another type of crashes
> > using your scripts. This time heap_page_prune_execute stumbles upon a
> > non heap-only tuple trying to update unused line pointers:
>
> It looks like the new heap_page_prune_execute() assertions catch the
> same problem earlier. It's hard to not suspect the code in pruneheap.c
> itself. Whatever else may have happened, the code in pruneheap.c ought
> to not even try to set a non-heap-only tuple item to LP_UNUSED. ISTM
> that it should explicitly look out for and avoid doing that.
>
> Offhand, I wonder if we just need to have an additional check in
> heap_prune_chain(), which is like the existing "If we find a redirect
> somewhere else, stop --- it must not be same chain" handling used for
> LP_REDIRECT items that aren't at the start of a HOT chain:

Yes, adding such condition works in this case, no non-heap-only tuples
were recorded as unused in heap_prune_chain, and nothing else popped up
afterwards. But now after a couple of runs I could also reproduce (at
least partially) what Alexander was talking about:

    ERROR:  could not open relation with OID 1056321

Not sure yet where is it coming from.



pgsql-bugs by date:

Previous
From: Noah Misch
Date:
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Next
From: Zuber Farooqui
Date:
Subject: Re: BUG #17276: pg_tblspc Permission denied