Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date
Msg-id CAEze2Wh-2XjU8GXiWXA-bnqFUas96De1YJH+oA58sqjScdi=rg@mail.gmail.com
In response to Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, 10 Jun 2021 at 19:07, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Jun 10, 2021 at 9:57 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > > By "matches what we expect", I meant "involves a just-aborted
> > > transaction". We could defensively verify that the inserting
> > > transaction concurrently aborted at the point of retrying/calling
> > > heap_page_prune() a second time. If there is no aborted transaction
> > > involved (as was the case with this bug), then we can be confident
> > > that something is seriously broken.
> >
> > I believe there are more cases than only the rolled back case, but
> > checking for those cases would potentially help, yes.
>
> Why do you believe that there are other cases?
>
> I'm not aware of any case that causes lazy_scan_prune() to retry using
> the goto, other than the aborted transaction case I described
> (excluding the bug that you diagnosed, which was of course never
> supposed to happen). If it really is possible to observe a retry for
> any other reason then I'd very much like to know all the details - it
> might well signal a distinct bug of the same general variety.

I see one exit for HEAPTUPLE_DEAD on a potentially recently committed
xvac (?), and we might also check against recently committed
transactions if xmin == xmax, although apparently that is not
implemented right now.

With regards,

Matthias van de Meent
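
For readers following the thread, the retry and the defensive check Peter proposes in the quoted text can be illustrated with a small toy model in C. This is only a sketch under stated assumptions, not PostgreSQL code. Every name in it (ToyTuple, toy_prune_sees_dead, toy_scan_sees_dead, toy_scan_page, and the deleted_past_scan_horizon flag) is hypothetical. It models a pruning pass and a later per-tuple check that can disagree, and it treats any retry that is not explained by an aborted inserter as an error instead of looping. The xvac case mentioned above is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real types or APIs. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } ToyXactState;

typedef struct {
    ToyXactState inserter;          /* state of the inserting transaction */
    bool deleted_past_scan_horizon; /* assumption: dead per the scan's horizon only */
    bool pruned;                    /* already removed by a pruning pass */
} ToyTuple;

/* The pruning pass only treats tuples from aborted inserters as dead. */
static bool toy_prune_sees_dead(const ToyTuple *t)
{
    return t->inserter == XACT_ABORTED;
}

/* The scan uses a second rule, so the two passes can disagree. */
static bool toy_scan_sees_dead(const ToyTuple *t)
{
    return t->inserter == XACT_ABORTED || t->deleted_past_scan_horizon;
}

/*
 * Prune, then scan. Returns the number of retries, or -1 when a retry
 * would be needed for a reason other than an aborted inserter.
 * abort_idx simulates one inserter aborting between the first prune and
 * the first scan; pass -1 for no concurrent abort.
 */
static int toy_scan_page(ToyTuple *tuples, int n, int abort_idx)
{
    int retries = 0;

    assert(n >= 0);
    for (;;) {
        bool need_retry = false;

        for (int i = 0; i < n; i++)
            if (!tuples[i].pruned && toy_prune_sees_dead(&tuples[i]))
                tuples[i].pruned = true;

        /* Simulated concurrent abort, first pass only. */
        if (retries == 0 && abort_idx >= 0 && abort_idx < n)
            tuples[abort_idx].inserter = XACT_ABORTED;

        for (int i = 0; i < n; i++) {
            if (tuples[i].pruned || !toy_scan_sees_dead(&tuples[i]))
                continue;
            /* Defensive check: only a just-aborted inserter is expected. */
            if (tuples[i].inserter != XACT_ABORTED)
                return -1;
            need_retry = true;
            break;
        }
        if (!need_retry)
            return retries;
        retries++;
    }
}
```

In this model, a page where one inserter aborts between prune and scan completes after a single retry. A tuple that only the scan considers dead returns -1, where a loop without the check would spin forever.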


