Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data - Mailing list pgsql-bugs

From Andrey Borodin
Subject Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date
Msg-id 15936F8A-0436-40C8-9D1A-4A4623681554@yandex-team.ru
Whole thread Raw
In response to Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data  (Andres Freund <andres@anarazel.de>)
Responses Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
Hi!

> 15 авг. 2021 г., в 18:45, Andres Freund <andres@anarazel.de> написал(а):
>
> Hi,
>
> On 2021-08-15 16:09:37 +0500, Andrey Borodin wrote:
>> From 929736512ebf8eb9ac6ddaaf49b9e6148d72cfbb Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin@acm.org>
>> Date: Fri, 30 Jul 2021 14:40:16 +0500
>> Subject: [PATCH v12 2/6] PoC fix for race in RelationBuildDesc() and relcache
>> invalidation
>>
>> ---
>> src/backend/utils/cache/relcache.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
>> index 13d9994af3..7eec7b1f41 100644
>> --- a/src/backend/utils/cache/relcache.c
>> +++ b/src/backend/utils/cache/relcache.c
>> @@ -997,9 +997,16 @@ equalRSDesc(RowSecurityDesc *rsdesc1, RowSecurityDesc *rsdesc2)
>>  *        (suggesting we are trying to access a just-deleted relation).
>>  *        Any other error is reported via elog.
>>  */
>> +typedef struct InProgressRels {
>> +    Oid relid;
>> +    bool invalidated;
>> +} InProgressRels;
>> +static InProgressRels inProgress[100];
>> +
>> static Relation
>> RelationBuildDesc(Oid targetRelId, bool insertIt)
>> {
>> +    int in_progress_offset;
>>     Relation    relation;
>>     Oid            relid;
>>     HeapTuple    pg_class_tuple;
>> @@ -1033,6 +1040,14 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
>>     }
>> #endif
>>
>> +    for (in_progress_offset = 0;
>> +         OidIsValid(inProgress[in_progress_offset].relid);
>> +         in_progress_offset++)
>> +        ;
>> +    inProgress[in_progress_offset].relid = targetRelId;
>> +retry:
>> +    inProgress[in_progress_offset].invalidated = false;
>> +
>>     /*
>>      * find the tuple in pg_class corresponding to the given relation id
>>      */
>> @@ -1213,6 +1228,12 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
>>      */
>>     heap_freetuple(pg_class_tuple);
>>
>> +    if (inProgress[in_progress_offset].invalidated)
>> +        goto retry;                /* TODO free old one */
>> +    /* inProgress is in fact the stack, we can safely remove it's top */
>> +    inProgress[in_progress_offset].relid = InvalidOid;
>> +    Assert(inProgress[in_progress_offset + 1].relid == InvalidOid);
>> +
>>     /*
>>      * Insert newly created relation into relcache hash table, if requested.
>>      *
>> @@ -2805,6 +2826,13 @@ RelationCacheInvalidateEntry(Oid relationId)
>>         relcacheInvalsReceived++;
>>         RelationFlushRelation(relation);
>>     }
>> +    else
>> +    {
>> +        int i;
>> +        for (i = 0; OidIsValid(inProgress[i].relid); i++)
>> +            if (inProgress[i].relid == relationId)
>> +                inProgress[i].invalidated = true;
>> +    }
>> }
>
> Desperately needs comments. Without a commit message and without
> comments it's hard to review this without re-reading the entire thread -
> which approximately nobody will do.
I've added some comments. But it seems we should use dynamic allocations instead of 100-based array.

>
>
>> From 7e47dae2828d88ddb2161fda0c3b08a158c6cf37 Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin@acm.org>
>> Date: Sat, 7 Aug 2021 20:27:14 +0500
>> Subject: [PATCH v12 5/6] PoC fix clear xid
>>
>> ---
>> src/backend/access/transam/xact.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
>> index 387f80419a..9b19d939eb 100644
>> --- a/src/backend/access/transam/xact.c
>> +++ b/src/backend/access/transam/xact.c
>> @@ -2500,7 +2500,6 @@ PrepareTransaction(void)
>>      * done *after* the prepared transaction has been marked valid, else
>>      * someone may think it is unlocked and recyclable.
>>      */
>> -    ProcArrayClearTransaction(MyProc);
>>
>>     /*
>>      * In normal commit-processing, this is all non-critical post-transaction
>> @@ -2535,6 +2534,8 @@ PrepareTransaction(void)
>>     PostPrepare_MultiXact(xid);
>>
>>     PostPrepare_Locks(xid);
>> +
>> +    ProcArrayClearTransaction(MyProc);
>>     PostPrepare_PredicateLocks(xid);
>
> The comment above ProcArrayClearTransaction would need to be moved and
> updated...
Fixed.

>
>> From 6db9cafd146db1a645bb6885157b0e1f032765e0 Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin@acm.org>
>> Date: Mon, 19 Jul 2021 11:50:02 +0500
>> Subject: [PATCH v12 4/6] Fix CREATE INDEX CONCURRENTLY in precence of vxids
>> converted to 2pc
> ...
>
>> +/*
>> + * TwoPhaseGetXidByVXid
>> + *        Try to lookup for vxid among prepared xacts
>> + */
>> +XidListEntry
>> +TwoPhaseGetXidByVXid(VirtualTransactionId vxid)
>> +{
>> +    int                i;
>> +    XidListEntry    result;
>> +    result.next = NULL;
>> +    result.xid = InvalidTransactionId;
>> +
>> +    LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
>> +
>> +    for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
>> +    {
>> +        GlobalTransaction    gxact = TwoPhaseState->prepXacts[i];
>> +        PGPROC               *proc = &ProcGlobal->allProcs[gxact->pgprocno];
>> +        VirtualTransactionId proc_vxid;
>> +        GET_VXID_FROM_PGPROC(proc_vxid, *proc);
>> +
>> +        if (VirtualTransactionIdEquals(vxid, proc_vxid))
>> +        {
>> +            if (result.xid != InvalidTransactionId)
>> +            {
>> +                /* Already has candidate - need to alloc some space */
>> +                XidListEntry *copy = palloc(sizeof(XidListEntry));
>> +                copy->next = result.next;
>> +                copy->xid = result.xid;
>> +                result.next = copy;
>> +            }
>> +            result.xid = gxact->xid;
>> +        }
>> +    }
>> +
>> +    LWLockRelease(TwoPhaseStateLock);
>> +
>> +    return result;
>> +}
>
> Dynamic memory allocations while holding a fairly central lock - one
> which is now going to be more contended - doesn't seem great.
>
> Is the memory context this is called in guaranteed to be of a proper
> duration?  Including being reset in case there's an error at some point
> before the memory is freed?
I've removed custom list and all memory allocations. If there are multiple 2PCs with same vxid - we just wait for one,
thenretry. 

>> +/*
>> + *        WaitXact
>> + *
>> + *        Wait for xid completition if have xid. Otherwise try to find xid among
>> + *        two-phase procarray entries.
>> + */
>> +static bool WaitXact(VirtualTransactionId vxid, TransactionId xid, bool wait)
>> +{
>> +    LockAcquireResult    lar;
>> +    LOCKTAG                tag;
>> +    XidListEntry        xidlist;
>> +    XidListEntry       *xidlist_ptr = NULL; /* pointer to TwoPhaseGetXidByVXid()s pallocs */
>> +    bool                result;
>> +
>> +    if (TransactionIdIsValid(xid))
>> +    {
>> +        /* We know exact xid - no need to search in 2PC state */
>> +        xidlist.xid = xid;
>> +        xidlist.next = NULL;
>> +    }
>> +    else
>> +    {
>> +        /* You cant have vxid among 2PCs if you have no 2PCs */
>> +        if (max_prepared_xacts == 0)
>> +            return true;
>> +
>> +        /*
>> +         * We must search for vxids in 2pc state
>> +         * XXX: O(N*N) complexity where N is number of prepared xacts
>> +         */
>> +        xidlist = TwoPhaseGetXidByVXid(vxid);
>> +        /* Return if transaction is gone entirely */
>> +        if (!TransactionIdIsValid(xidlist.xid))
>> +            return true;
>> +        xidlist_ptr = xidlist.next;
>> +    }
>
> Perhaps I missed this - but won't we constantly enter this path for
> non-2pc transactions? E.g.
I've restored heuristics that if it's not 2PC - we just exit from WaitPreparedXact().


>
>> @@ -4573,7 +4649,7 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
>>      */
>>     proc = BackendIdGetProc(vxid.backendId);
>>     if (proc == NULL)
>> -        return true;
>> +        return WaitXact(vxid, InvalidTransactionId, wait);
>>
>>     /*
>>      * We must acquire this lock before checking the backendId and lxid
>> @@ -4587,9 +4663,12 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
>>         || proc->fpLocalTransactionId != vxid.localTransactionId)
>>     {
>>         LWLockRelease(&proc->fpInfoLock);
>> -        return true;
>> +        return WaitXact(vxid, InvalidTransactionId, wait);
>>     }
>
> It seems like it's going to add a substantial amount of work even when
> no 2PC xacts are involved?
Only if 2PCs are enabled.

>> diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
>> index e27e1a8fe8..a5f28d3a80 100644
>> --- a/src/include/access/twophase.h
>> +++ b/src/include/access/twophase.h
>> @@ -25,6 +25,17 @@
>>  */
>> typedef struct GlobalTransactionData *GlobalTransaction;
>>
>> +/*
>> + * XidListEntry is expected to be used as list very rarely. Under normal
>> + * circumstances TwoPhaseGetXidByVXid() returns only one xid.
>> + * But under certain conditions can return many xids or nothing.
>> + */
>> +typedef struct XidListEntry
>> +{
>> +    TransactionId xid;
>> +    struct XidListEntry* next;
>> +} XidListEntry;
>
> I don't think we should invent additional ad-hoc list types.
Fixed, removed list here entirely.

I'm attaching new version. It works fine on my machines.

Thanks!

Best regards, Andrey Borodin.


Attachment

pgsql-bugs by date:

Previous
From: Yongqian Li
Date:
Subject: Unexpected behavior from using default config value
Next
From: Andres Freund
Date:
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data