Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data - Mailing list pgsql-bugs

From Andrey Borodin
Subject Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date
Msg-id D22DEA09-80DA-4350-839B-0FC0BD0668A4@yandex-team.ru
In response to Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs

> On 30 July 2021, at 23:41, Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, Jul 30, 2021 at 03:42:10PM +0500, Andrey Borodin wrote:
>>> On 30 July 2021, at 07:25, Noah Misch <noah@leadboat.com> wrote:
>>> What alternative fix designs should we consider?
>>
>> I observe that the provided patch fixes CIC under normal transactions, but the test with 2PC still fails similarly.
>> An unindexed tuple was committed somewhere at the end of Phase 3 or 4.
>> 2021-07-30 15:35:31.806 +05 [25987] 002_cic_2pc.pl LOG:  statement: REINDEX INDEX CONCURRENTLY idx;
>> 2021-07-30 15:35:31.806 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 1
>> 2021-07-30 15:35:31.806 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 2
>> 2021-07-30 15:35:31.806 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6735
>> 2021-07-30 15:35:31.807 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6736
>> 2021-07-30 15:35:31.808 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 3
>> 2021-07-30 15:35:31.808 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6750
>> 2021-07-30 15:35:31.809 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 4
>> 2021-07-30 15:35:31.809 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 5
>> 2021-07-30 15:35:31.809 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6762
>> 2021-07-30 15:35:31.809 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6763
>> 2021-07-30 15:35:31.810 +05 [25987] 002_cic_2pc.pl WARNING:  Phase 6
>> 2021-07-30 15:35:31.810 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid 6/2166
>> 2021-07-30 15:35:31.810 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6767
>> 2021-07-30 15:35:31.810 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6764
>> 2021-07-30 15:35:31.810 +05 [25987] 002_cic_2pc.pl WARNING:  XXX: VirtualXactLock vxid -1/6765
>> 2021-07-30 15:35:31.811 +05 [25987] 002_cic_2pc.pl WARNING:  Phase Final
>> 2021-07-30 15:35:31.811 +05 [25987] 002_cic_2pc.pl LOG:  statement: SELECT bt_index_check('idx',true);
>> 2021-07-30 15:35:31.813 +05 [25987] 002_cic_2pc.pl ERROR:  heap tuple (46,16) from table "tbl" lacks matching index tuple within index "idx" xmin 6751 xmax 0
>
> I see a failure, too.  Once again, "i:" lines are events within the INSERT
> backend, and "r:" lines are events within the REINDEX CONCURRENTLY backend:
>
> r: Phase 2 begins.
> i: INSERT.  Start PREPARE.
> r: Phase 2 commits indisready=t for idx_ccnew.
> r: Start waiting for the INSERT to finish.
> i: PREPARE finishes.
> r: Wake up and start validate_index().  This is a problem.  It needed to wait
>   for COMMIT PREPARED to finish.
I'll investigate this scenario. I've tried sprinkling in some more WaitForLockersMultiple() calls, but so far without success.

> This may have a different explanation than the failure you saw, because my
> INSERT transaction already had a permanent XID before the start of phase 3.  I
> won't have time to study this further in the next several days.  Can you find
> out where things go wrong?
I'll try. This bug is my #1 priority. We repack ~PB of indexes each weekend (only the bloated ones, but in fact many
are bloated). And it seems they are all endangered.

>  The next thing I would study is VirtualXactLock(),
> specifically what happens if the lock holder is a normal backend (with or
> without an XID) when VirtualXactLock() starts but becomes a prepared
> transaction (w/ different PGPROC) before VirtualXactLock() ends.

PreparedXactLock() will do the trick. If we have an xid, we always take a lock on the xid. If we have a vxid, we try to
convert it to an xid and look through all PGPROCs for 2PCs. And then, again, we wait for the xid.
At this point I'm certain that if any transaction is reported by GetLockConflicts(), it will get awaited by
VirtualXactLock().
The problem is that the rogue transaction was never reported by GetLockConflicts().
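
The debug output quoted above can be cross-checked against this claim. The following is only a rough sketch in Python: it assumes that a vxid printed as -1/N by the patch's debug WARNING denotes a prepared transaction whose xid is N (an assumption, not something the log states), and the helper names are made up for illustration. It collects every vxid awaited per phase and checks whether xmin 6751 from the bt_index_check() error was ever among them.

```python
import re

# Debug lines copied from the log quoted above (timestamps and PID trimmed).
LOG = """\
WARNING:  Phase 1
WARNING:  Phase 2
WARNING:  XXX: VirtualXactLock vxid -1/6735
WARNING:  XXX: VirtualXactLock vxid -1/6736
WARNING:  Phase 3
WARNING:  XXX: VirtualXactLock vxid -1/6750
WARNING:  Phase 4
WARNING:  Phase 5
WARNING:  XXX: VirtualXactLock vxid -1/6762
WARNING:  XXX: VirtualXactLock vxid -1/6763
WARNING:  Phase 6
WARNING:  XXX: VirtualXactLock vxid 6/2166
WARNING:  XXX: VirtualXactLock vxid -1/6767
WARNING:  XXX: VirtualXactLock vxid -1/6764
WARNING:  XXX: VirtualXactLock vxid -1/6765
WARNING:  Phase Final
"""

VXID_RE = re.compile(r"VirtualXactLock vxid (-?\d+)/(\d+)")
PHASE_RE = re.compile(r"WARNING:\s+Phase (\w+)")


def awaited_by_phase(log):
    """Map each phase label to the (backend_id, local_xid) pairs awaited in it."""
    phases = {}
    current = None
    for line in log.splitlines():
        phase = PHASE_RE.search(line)
        if phase:
            current = phase.group(1)
            phases[current] = []
            continue
        vxid = VXID_RE.search(line)
        if vxid and current is not None:
            phases[current].append((int(vxid.group(1)), int(vxid.group(2))))
    return phases


def awaited_prepared_xids(phases):
    """ASSUMPTION: backend id -1 marks a prepared xact; the second number is its xid."""
    return {
        xid
        for pairs in phases.values()
        for backend_id, xid in pairs
        if backend_id == -1
    }


phases = awaited_by_phase(LOG)
ROGUE_XMIN = 6751  # xmin reported by the bt_index_check() error

# Was the transaction that inserted the unindexed tuple ever awaited?
print(ROGUE_XMIN in awaited_prepared_xids(phases))  # prints False
```

Under that assumption, 6750 is awaited in Phase 3 and 6762 onwards in later phases, but 6751 never shows up, which is consistent with the transaction not being reported at all rather than being reported and then not awaited.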

Thanks!

Best regards, Andrey Borodin.

