Re: [BUG?] check_exclusion_or_unique_constraint false negative - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [BUG?] check_exclusion_or_unique_constraint false negative
Date
Msg-id CAA4eK1Jfb0xviXYon-_TvHNKeAY7ngAeo++Knu-0RPR6EkSBjA@mail.gmail.com
In response to Re: [BUG?] check_exclusion_or_unique_constraint false negative  (Michail Nikolaev <michail.nikolaev@gmail.com>)
Responses Re: [BUG?] check_exclusion_or_unique_constraint false negative
List pgsql-hackers
On Thu, Aug 1, 2024 at 2:55 PM Michail Nikolaev
<michail.nikolaev@gmail.com> wrote:
>
> > Thanks for pointing out the issue!
>
> Thanks for your attention!
>
> > IIUC, the issue can happen when two concurrent transactions using DirtySnapshot access
> > the same tuples, which is not specific to the parallel apply
>
> Not exactly, it happens for any DirtySnapshot scan over a B-tree index with some other transaction updating the same
> index page (even using the MVCC snapshot).
>
> So, logical replication related scenario looks like this:
>
> * subscriber worker receives a tuple update/delete from the publisher
> * it calls RelationFindReplTupleByIndex to find the tuple in the local table
> * some other transaction updates the tuple in the local table (on subscriber side) in parallel
> * RelationFindReplTupleByIndex may not find the tuple because it uses DirtySnapshot
> * update/delete is lost
>
> Parallel apply mode looks more dangerous because it uses multiple workers on the subscriber side, so the
> probability of the issue is higher.
> In that case, "some other transaction" is just another worker applying changes of a different transaction in parallel.
>
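
The quoted scenario can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: ToyTable, TupleVersion and find_by_index are hypothetical names, and the model assumes that the scan first collects the matching index entries and only afterwards checks each heap tuple, with the concurrent update committing in between. Under those assumptions the old version is already dead when it is checked, and the new version was never in the collected list, so the lookup returns nothing even though the row logically exists the whole time.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class TupleVersion:
    key: int
    value: str
    # True once the transaction that replaced this version has committed.
    dead: bool = False


class ToyTable:
    """Hypothetical stand-in for a heap plus a single index."""

    def __init__(self) -> None:
        self.heap: List[TupleVersion] = []
        self.index: List[Tuple[int, int]] = []  # (key, position in heap)

    def insert(self, key: int, value: str) -> int:
        self.heap.append(TupleVersion(key, value))
        tid = len(self.heap) - 1
        self.index.append((key, tid))
        return tid

    def update(self, tid: int, new_value: str) -> int:
        # An update adds a new version with its own index entry, then the
        # old version becomes dead (the updater is assumed to commit here).
        old = self.heap[tid]
        new_tid = self.insert(old.key, new_value)
        old.dead = True
        return new_tid


def find_by_index(
    table: ToyTable,
    key: int,
    concurrent: Optional[Callable[[], object]] = None,
) -> Optional[TupleVersion]:
    # Step 1: collect the matching index entries in one go (assumption).
    candidates = [tid for k, tid in table.index if k == key]
    # Step 2: another transaction may run and commit before the heap check.
    if concurrent is not None:
        concurrent()
    # Step 3: check the heap; a committed-dead version is not returned.
    for tid in candidates:
        version = table.heap[tid]
        if not version.dead:
            return version
    return None


# No concurrent activity: the row is found.
quiet = ToyTable()
quiet.insert(1, "v1")
found_quiet = find_by_index(quiet, 1)

# A concurrent update commits between step 1 and step 3: the row is missed.
racy = ToyTable()
tid = racy.insert(1, "v1")
found_racy = find_by_index(racy, 1, concurrent=lambda: racy.update(tid, "v2"))
```

In this model found_racy is None, while a second lookup started after the update finds the new version. On the subscriber that miss would correspond to the lost update/delete described in the list above.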

I think it is rather less likely or not possible in a parallel apply
case because such conflicting updates (updates on the same tuple)
should be serialized at the publisher itself. So one of the two updates
will come only after the commit of the transaction that contains the
other.

I haven't tried a test based on your description of the general
problem with DirtySnapshot scans, so I could be wrong. In the case of
logical replication, we will LOG an update_missing type of conflict,
and the user may need to take some manual action based on that. I am
not sure we can do anything specific to logical replication for this,
but feel free to suggest ideas to solve this problem, either in general
or specifically for logical replication.

--
With Regards,
Amit Kapila.


