Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
Date
Msg-id 20220524184654.c2zt6coy4s5a6rnh@alap3.anarazel.de
In response to Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
Hi,

On 2022-05-24 10:38:14 -0700, Peter Geoghegan wrote:
> On Tue, May 24, 2022 at 9:37 AM Andres Freund <andres@anarazel.de> wrote:
> > Do we have any idea what really causes the corruption?
> 
> I don't think so.

I think I found it: https://postgr.es/m/20220524183705.cmgbqq32z63qynhe%40alap3.anarazel.de
afaict PROC_IN_SAFE_IC is completely broken right now. Any concurrent prune
can remove rows that are visible to the snapshot held by the
PROC_IN_SAFE_IC backend. Which basically makes them "fair weather snapshots" -
they work only as long as there is no concurrent activity.

Similar behavior is fine for VACUUM - it doesn't use a snapshot / need a
consistent view of the table. But not for CIC - otherwise it could just use
SnapshotAny or such.
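
To make the failure mode concrete, here is a toy model of it. This is only a sketch under loud assumptions: it is not PostgreSQL code, the visibility and pruning rules are drastically simplified, and every name in it (Backend, RowVersion, removal_horizon, keys_seen_by_cic, the xid values) is hypothetical. Only the PROC_IN_SAFE_IC flag name comes from the discussion above. It shows a backend whose snapshot is left out of the removal horizon losing a row version it can still see.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    xmin: int                 # oldest xid this backend's snapshot still needs
    in_safe_ic: bool = False  # toy stand-in for the PROC_IN_SAFE_IC flag


@dataclass
class RowVersion:
    key: int
    xmin: int                 # xid that created this version
    xmax: Optional[int]       # xid that replaced/deleted it, None if live


def removal_horizon(backends, ignore_safe_ic):
    # Versions deleted by an xid below this horizon count as removable.
    xmins = [b.xmin for b in backends
             if not (ignore_safe_ic and b.in_safe_ic)]
    return min(xmins)


def prune(rows, horizon):
    # Drop every dead version whose deleting xid is older than the horizon.
    return [r for r in rows if r.xmax is None or r.xmax >= horizon]


def visible_keys(rows, snapshot_xmin):
    # Simplified visibility: xids below snapshot_xmin count as committed,
    # everything at or above it is invisible to the snapshot.
    return {r.key for r in rows
            if r.xmin < snapshot_xmin
            and (r.xmax is None or r.xmax >= snapshot_xmin)}


def keys_seen_by_cic(ignore_safe_ic):
    cic = Backend(xmin=100, in_safe_ic=True)   # the index-building backend
    other = Backend(xmin=130)                  # some ordinary backend
    rows = [
        RowVersion(key=1, xmin=50, xmax=120),  # old version, updated by xid 120
        RowVersion(key=1, xmin=120, xmax=None),  # new version
    ]
    horizon = removal_horizon([cic, other], ignore_safe_ic)
    return visible_keys(prune(rows, horizon), cic.xmin)


print(keys_seen_by_cic(ignore_safe_ic=False))  # horizon honors the snapshot
print(keys_seen_by_cic(ignore_safe_ic=True))   # horizon ignores the snapshot
```

In this model, honoring the flagged backend's xmin keeps the old version of key 1 around for its snapshot. Ignoring it lets the prune remove that version while the new one is still invisible to the snapshot, so the scan finds nothing for key 1. A scan that needs no consistent view, as in the VACUUM case, is not affected by this.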


I don't really see a realistic alternative other than reverting at this
point. I think this needs to be rethought fairly fundamentally.


> Andrey's tap test fails for me on 14 as expected, and does so reliably
> -- so there is a fairly good reproducer for this.
> 
> I don't have time to debug this right now (...), but it would probably be
> straightforward to get an RR recording of the failure.

I tried that, but it didn't repro under rr within 15min or so.


> (need to work on my pgCon talk)

Good luck :)


> > One thing that'd be worth excluding is the use of parallel index builds.
> 
> I can rule out a problem with parallel index builds -- disabling them
> in the tap test doesn't alter the outcome.

Good. Just to clarify: I was suspicious of PROC_IN_SAFE_IC being set
incoherently in parallel workers or such, not of parallel index builds "in
general".

Greetings,

Andres Freund


