Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Dmitry Dolgov
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Msg-id 20211213122154.4dhb4cmigqxhsuba@localhost
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
> On Fri, Dec 10, 2021 at 08:58:26PM -0800, Andres Freund wrote:
> Hi,
>
> On 2021-11-13 16:06:40 +0100, Dmitry Dolgov wrote:
> > I got curious whether modifying Alexander's test case could reveal
> > something interesting, and sprinkled it with savepoints and rollbacks.
> > Almost immediately a new problem manifested itself, although the
> > crash has nothing to do with the disconnected tuples as far as I can
> > tell -- still probably worth mentioning. In this case vacuum invoked
> > lazy_scan_prune, and during the first scan one of the chains had a
> > HEAPTUPLE_DEAD tuple at the third position. The processing flow fell
> > through to heap_prune_record_prunable and crashed on an assert because
> > the xid passed in was InvalidTransactionId:
> >
> >     #3  0x000055a2b260d1f9 in heap_prune_record_prunable (prstate=0x7ffd0c0ecdf0, xid=0) at pruneheap.c:872
> >     #4  0x000055a2b260ca72 in heap_prune_chain (buffer=2117, rootoffnum=150, prstate=0x7ffd0c0ecdf0) at pruneheap.c:695
> >     #5  0x000055a2b260bcd6 in heap_page_prune (relation=0x7fb98e217e20, buffer=2117, vistest=0x55a2b31d2d60 <GlobalVisCatalogRels>, old_snap_xmin=0, old_snap_ts=0, report_stats=false, off_loc=0x55a2b3e6a0cc) at pruneheap.c:288
> >     #6  0x000055a2b261309c in lazy_scan_prune (vacrel=0x55a2b3e6a060, buf=2117, blkno=192, page=0x7fb97856bf80 "", vistest=0x55a2b31d2d60 <GlobalVisCatalogRels>, prunestate=0x7ffd0c0ee9d0) at vacuumlazy.c:1739
> >
> > Applying heap_prune_record_prunable only if TransactionIdIsNormal seems
> > to help. The original implementation didn't reach
> > heap_prune_record_prunable either and also doesn't crash.
>
> Does your modified test still find problems with 0001 & 0002 from
> https://postgr.es/m/20211211045710.ljtuu4gfloh754rs%40alap3.anarazel.de
> applied?

Nope, everything seems to be working smoothly.
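
For readers following along, the workaround mentioned in the quoted
message (calling heap_prune_record_prunable only if TransactionIdIsNormal)
can be sketched as a standalone model. This is only an illustration, not
the change from the 0001 & 0002 patches: PruneStateModel, xid_precedes,
record_prunable and record_prunable_if_normal are hypothetical names, and
the typedef and macros are simplified stand-ins for the PostgreSQL
definitions, assumed here rather than copied from the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's transaction-id definitions. */
typedef uint32_t TransactionId;
#define InvalidTransactionId       ((TransactionId) 0)
#define FirstNormalTransactionId   ((TransactionId) 3)
#define TransactionIdIsValid(xid)  ((xid) != InvalidTransactionId)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/* Hypothetical, stripped-down model of the pruning working state. */
typedef struct PruneStateModel
{
    TransactionId new_prune_xid;
} PruneStateModel;

/* Wraparound-aware "id1 is older than id2", valid for two normal xids. */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    return (int32_t) (id1 - id2) < 0;
}

/* Models the asserting callee: it expects a normal transaction id. */
static void
record_prunable(PruneStateModel *prstate, TransactionId xid)
{
    assert(TransactionIdIsNormal(xid));
    if (!TransactionIdIsValid(prstate->new_prune_xid) ||
        xid_precedes(xid, prstate->new_prune_xid))
        prstate->new_prune_xid = xid;
}

/* Guarded call site: skip xids that are not normal, such as xid=0. */
static void
record_prunable_if_normal(PruneStateModel *prstate, TransactionId xid)
{
    if (TransactionIdIsNormal(xid))
        record_prunable(prstate, xid);
}
```

With the guard in place, a call with xid=0 (as in frame #3 of the
backtrace) is skipped instead of tripping the assertion, while normal
xids still update the recorded value to the oldest one seen.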


