Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Conflict Detection and Resolution
Date
Msg-id da6965af-57f2-4d15-844c-caf8e873b800@enterprisedb.com
In response to Re: Conflict Detection and Resolution  (shveta malik <shveta.malik@gmail.com>)

On 6/11/24 10:35, shveta malik wrote:
> On Mon, Jun 10, 2024 at 5:24 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 6/10/24 12:56, shveta malik wrote:
>>> On Fri, Jun 7, 2024 at 6:08 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>>>>>
>>>>>>>  UPDATE
>>>>>>> ================
>>>>>>>
>>>>>>> Conflict Detection Method:
>>>>>>> --------------------------------
>>>>>>> Origin conflict detection: The 'origin' info, obtained from the
>>>>>>> commit timestamp generated for the incoming txn at the source node,
>>>>>>> is used to detect conflicts. To compare the remote origin with the
>>>>>>> local one, we must have origin information for local txns as well,
>>>>>>> which can be obtained from the commit timestamp after enabling
>>>>>>> 'track_commit_timestamp' locally.
>>>>>>> The one drawback here is that the 'origin' information cannot be
>>>>>>> obtained once the row is frozen and the commit-timestamp info is
>>>>>>> removed by vacuum. For a frozen row, conflicts cannot be raised, and
>>>>>>> thus the incoming changes will be applied in all cases.
>>>>>>>
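The detection rule quoted above can be made concrete with a toy Python sketch. This is purely illustrative (the function name and shape are my assumptions, not the actual patch): the key point is that a frozen row has no commit-timestamp/origin info left, so no conflict can be raised and the incoming change is applied unconditionally.

```python
# Toy sketch of origin-based conflict detection (illustrative only,
# not the proposed implementation).

def detect_origin_conflict(local_origin, remote_origin):
    """Return True if the incoming change conflicts with a local change.

    local_origin is None when the row is frozen and its commit-timestamp
    info has already been removed by vacuum; in that case no conflict
    can be detected and the remote change is simply applied.
    """
    if local_origin is None:
        return False  # frozen row: apply the incoming change unconditionally
    return local_origin != remote_origin
```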
>>>>>>> Conflict Types:
>>>>>>> ----------------
>>>>>>> a) update_differ: The origin of the incoming update differs from
>>>>>>> the origin of the local row, i.e., the row has already been updated
>>>>>>> locally or by a different node.
>>>>>>> b) update_missing: No row matching the incoming update's key exists.
>>>>>>> The remote is trying to update a row which does not exist locally.
>>>>>>> c) update_deleted: No row matching the incoming update's key exists
>>>>>>> because it has already been deleted. This conflict type is raised
>>>>>>> only if the deleted row is still detectable, i.e., it has not been
>>>>>>> removed by VACUUM yet. If the row has already been removed by
>>>>>>> VACUUM, this conflict cannot be detected; it will instead be
>>>>>>> detected as update_missing and will follow the default or
>>>>>>> configured resolver of update_missing.
>>>>>>>
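The three-way classification above can be summarized in a small Python sketch. The names and structure are illustrative assumptions for exposition, not the actual patch; the point is that update_deleted degrades to update_missing once VACUUM has removed the deleted tuple.

```python
# Hypothetical classification of the three UPDATE conflict types
# (illustrative only, not the proposed implementation).

def classify_update_conflict(row_exists, deleted_row_detectable, origin_differs):
    """Classify the conflict raised by an incoming remote UPDATE."""
    if row_exists:
        # Row found locally: conflict only if it was changed by another origin.
        return "update_differ" if origin_differs else None
    # Row not found: if the deleted tuple is still around (not yet removed
    # by VACUUM) we can report update_deleted; otherwise the best we can
    # say is update_missing.
    return "update_deleted" if deleted_row_detectable else "update_missing"
```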
>>>>>>
>>>>>> I don't understand why update_missing and update_deleted should be
>>>>>> different, especially considering the difference can't be detected
>>>>>> reliably. And even if we do happen to find the row, the associated
>>>>>> TOAST data may have already been removed. So why would this matter?
>>>>>
>>>>> Here, we are trying to tackle the case where the row is 'recently'
>>>>> deleted, i.e. a concurrent UPDATE and DELETE on pub and sub. The user
>>>>> may want to opt for a different resolution in such a case as against
>>>>> the one where the corresponding row was not even present in the first
>>>>> place. The case where the row was deleted long back may not fall into
>>>>> this category, as there is a higher chance that it has been removed
>>>>> by vacuum and can be considered equivalent to the update_missing
>>>>> case.
>>>>>
>>>>
>>>> My point is that if we can't detect the difference reliably, it's not
>>>> very useful. Consider this example:
>>>>
>>>> Node A:
>>>>
>>>>   T1: INSERT INTO t (id, value) VALUES (1,1);
>>>>
>>>>   T2: DELETE FROM t WHERE id = 1;
>>>>
>>>> Node B:
>>>>
>>>>   T3: UPDATE t SET value = 2 WHERE id = 1;
>>>>
>>>> The "correct" order of received messages on a third node is T1-T3-T2.
>>>> But we may also see T1-T2-T3 and T3-T1-T2, e.g. due to network issues
>>>> and so on. For T1-T2-T3 the right decision is to discard the update,
>>>> while for T3-T1-T2 it's to either apply the update or wait for the
>>>> INSERT to arrive.
>>>>
>>>> But if we misdetect the situation, we either end up with a row that
>>>> shouldn't be there, or losing an update.
>>>
>>> Doesn't the above example indicate that 'update_deleted' should also
>>> be considered a necessary conflict type?  Please see the possibilities
>>> of conflicts in all three cases:
>>>
>>>
>>> The "correct" order of receiving messages on node C (as suggested
>>> above) is T1-T3-T2   (case1)
>>> ----------
>>> T1 will insert the row.
>>> T3 will have an update_differ conflict; 'latest_timestamp_wins' or
>>> 'apply' will apply it, while 'earliest_timestamp_wins' or 'skip' will
>>> skip it.
>>> T2 will delete the row (irrespective of whether the update happened or not).
>>> End Result: No Data.
>>>
>>> T1-T2-T3
>>> ----------
>>> T1 will insert the row.
>>> T2 will delete the row.
>>> T3 will have conflict update_deleted. If it is 'update_deleted', the
>>> chances are that the resolver set here is to 'skip' (default is also
>>> 'skip' in this case).
>>>
>>> If vacuum has deleted that row (or if we don't support
>>> 'update_deleted' conflict), it will be 'update_missing' conflict. In
>>> that case, the user may end up inserting the row if resolver chosen is
>>> in favor of apply (which seems an obvious choice for 'update_missing'
>>> conflict; default is also 'apply_or_skip').
>>>
>>> End result:
>>> Row inserted with 'update_missing'.
>>> Row correctly skipped with 'update_deleted' (assuming the obvious
>>> choice seems to be 'skip' for update_deleted case).
>>>
>>> So it seems that with the 'update_deleted' conflict, there are higher
>>> chances of opting for the right decision here (which is to discard the
>>> update), as 'update_deleted' conveys correct info to the user. The
>>> 'update_missing' OTOH does not convey correct info, and the user may
>>> end up inserting the data by choosing apply-favoring resolvers for
>>> 'update_missing'. Again, we get the benefit of 'update_deleted' for
>>> *recently* deleted rows only.
>>>
>>> T3-T1-T2
>>> ----------
>>> T3 may end up inserting the record if the resolver is in favor of
>>> 'apply' and all the columns are received from remote.
>>> T1 will have an 'insert_exists' conflict and thus may either overwrite
>>> the 'updated' values or may leave the data as is (based on whether the
>>> resolver is in favor of apply or not).
>>> T2 will end up deleting it.
>>> End Result: No Data.
>>>
>>> I feel that for the second case (and similar cases), 'update_deleted'
>>> serves as a better conflict type.
>>>
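The three arrival orders walked through above can be replayed with a toy Python simulation. The resolver choices are simplified to the defaults discussed (apply for update_differ and update_missing, skip for update_deleted, apply/overwrite for insert_exists), and everything here is an illustrative assumption rather than the actual patch. It shows the one divergent outcome: T1-T2-T3 produces a stray row only when update_deleted is unavailable (e.g. after vacuum).

```python
# Toy replay of the three arrival orders on node C (illustrative only).
# T1: INSERT (1, 1); T2: DELETE id = 1; T3: UPDATE id = 1 SET value = 2.

def replay(order, update_deleted_supported=True):
    row = None        # current state of the row with id = 1
    deleted = False   # a deleted tuple is still detectable (not vacuumed)
    for op in order:
        if op == "T1":
            row = {"id": 1, "value": 1}           # insert_exists -> overwrite
        elif op == "T2":
            row, deleted = None, True
        elif op == "T3":
            if row is not None:
                row = {"id": 1, "value": 2}       # update_differ -> apply
            elif deleted and update_deleted_supported:
                pass                              # update_deleted -> skip
            else:
                row = {"id": 1, "value": 2}       # update_missing -> apply
    return row
```

Replaying T1-T2-T3 with update_deleted support returns no row (correct), while without it the update is wrongly resurrected as a row; the other two orders end with no data either way.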
>>
>> True, but this is pretty much just a restatement of the example, right?
>>
>> The point I was trying to make is that this hinges on the ability to
>> detect the correct conflict type. And if vacuum can swoop in and remove
>> the recently deleted tuples (which I believe can happen at any time,
>> right?), then that's not guaranteed, because we won't see the deleted
>> tuple anymore.
> 
> Yes, that's correct. However, many cases could benefit from the
> update_deleted conflict type if it can be implemented reliably. That's
> why we wanted to give it a try. But if we can't achieve predictable
> results with it, I'm fine to drop this approach and conflict_type. We
> can consider a better design in the future that doesn't depend on
> non-vacuumed entries and provides a more robust method for identifying
> deleted rows.
> 

I agree having a separate update_deleted conflict would be beneficial;
I'm not arguing against that. My point is actually that I think this
conflict type is required, and that it needs to be detected reliably.

I'm not sure dropping update_deleted entirely would be a good idea,
though. It pretty much guarantees making the wrong decision at least
sometimes. But at least it's predictable and users are more likely to
notice that (compared to update_deleted working on well-behaving systems,
and then failing when a node starts lagging or something).

That's my opinion, though, and I don't intend to stay in the way. But I
think the solution is not that difficult - something needs to prevent
cleanup of recently dead tuples (until the "relevant" changes are
received and applied from other nodes). I don't know if that could be
done based on information we have for subscriptions, or if we need
something new.

>> Also, can the resolver even convert the UPDATE into INSERT and proceed?
>> Maybe with REPLICA IDENTITY FULL?
> 
> Yes, it can, as long as the row doesn't contain toasted data. Without
> toasted data, the new tuple is fully logged. However, if the row does
> contain toasted data, the new tuple won't log it completely. In such a
> case, REPLICA IDENTITY FULL becomes a requirement to ensure we have
> all the data necessary to create the row on the target side. In the
> absence of RI FULL and with the row containing toasted data, the
> operation will be skipped or an error will be raised.
> 
>> Otherwise the row might be incomplete,
>> missing required columns etc. In which case it'd have to wait for the
>> actual INSERT to arrive - which would work for actual update_missing,
>> where the row may be delayed due to network issues. But if that's a
>> mistake due to vacuum removing the deleted tuple, it'll wait forever.
> 
> Even in case of 'update_missing', we do not intend to wait for the
> 'actual insert' to arrive, as it is not guaranteed that the 'insert'
> will arrive at all. And thus we plan to skip or error out (based on the
> user's configuration) if a complete row cannot be created for insertion.
> 
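The decision described above (convert the UPDATE into an INSERT when the logged tuple is complete, otherwise skip or error rather than wait) can be sketched as follows. The helper and its parameter names are hypothetical, for illustration only:

```python
# Sketch of the "missing row" handling for an incoming UPDATE
# (hypothetical helper, not the actual patch).

def resolve_missing_row(new_tuple_complete, resolver="apply_or_skip"):
    """Decide what to do when an incoming UPDATE finds no local row.

    new_tuple_complete: True when the new tuple carries all column values,
    i.e. the row has no unlogged toasted data, or REPLICA IDENTITY FULL
    supplies the missing columns.
    """
    if new_tuple_complete:
        return "insert"  # turn the UPDATE into an INSERT
    # Incomplete row: never wait for the original INSERT (it may never
    # arrive); skip or error based on the configured resolver.
    return "skip" if resolver == "apply_or_skip" else "error"
```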

If the UPDATE contains all the columns and can be turned into an INSERT,
then that seems reasonable. But I don't see how skipping it could work
in general (except for some very simple / specific use cases). I'm not
sure if you suggest to skip just the one UPDATE or transaction as a
whole, but it seems to me either of those options could easily lead to
all kinds of inconsistencies and user confusion.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


