Re: Conflict Detection and Resolution - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Conflict Detection and Resolution
Msg-id: da6965af-57f2-4d15-844c-caf8e873b800@enterprisedb.com
In response to: Re: Conflict Detection and Resolution (shveta malik <shveta.malik@gmail.com>)
List: pgsql-hackers
On 6/11/24 10:35, shveta malik wrote:
> On Mon, Jun 10, 2024 at 5:24 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 6/10/24 12:56, shveta malik wrote:
>>> On Fri, Jun 7, 2024 at 6:08 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>>>>> UPDATE
>>>>>>> ================
>>>>>>>
>>>>>>> Conflict Detection Method:
>>>>>>> --------------------------------
>>>>>>> Origin conflict detection: The 'origin' info is used to detect
>>>>>>> conflict, and can be obtained from the commit timestamp generated
>>>>>>> for the incoming txn at the source node. To compare the remote
>>>>>>> origin with the local one, we must have origin information for
>>>>>>> local txns as well, which can be obtained from the commit
>>>>>>> timestamp after enabling 'track_commit_timestamp' locally.
>>>>>>> The one drawback here is that the 'origin' information cannot be
>>>>>>> obtained once the row is frozen and the commit-timestamp info is
>>>>>>> removed by vacuum. For a frozen row, conflicts cannot be raised,
>>>>>>> and thus the incoming changes will be applied in all cases.
>>>>>>>
>>>>>>> Conflict Types:
>>>>>>> ----------------
>>>>>>> a) update_differ: The origin of an incoming update's key row
>>>>>>> differs from the local row, i.e., the row has already been updated
>>>>>>> locally or by a different node.
>>>>>>> b) update_missing: No row with the same value as the incoming
>>>>>>> update's key exists. Remote is trying to update a row which does
>>>>>>> not exist locally.
>>>>>>> c) update_deleted: No row with the same value as the incoming
>>>>>>> update's key exists; the row was already deleted. This conflict
>>>>>>> type is generated only if the deleted row is still detectable,
>>>>>>> i.e., it has not been removed by VACUUM yet. If the row has
>>>>>>> already been removed by VACUUM, this conflict cannot be detected;
>>>>>>> it will be detected as update_missing and will follow the default
>>>>>>> or configured resolver of update_missing.
>>>>>>>
>>>>>>
>>>>>> I don't understand why update_missing and update_deleted should be
>>>>>> different, especially considering it's not detected reliably. And
>>>>>> also that even if we happen to find the row, the associated TOAST
>>>>>> data may have already been removed. So why would this matter?
>>>>>
>>>>> Here, we are trying to tackle the case where the row was 'recently'
>>>>> deleted, i.e. a concurrent UPDATE and DELETE on pub and sub. The
>>>>> user may want to opt for a different resolution in such a case as
>>>>> against the one where the corresponding row was not even present in
>>>>> the first place. The case where the row was deleted long ago may not
>>>>> fall into this category, as there are higher chances that it has
>>>>> been removed by vacuum, and it can be considered equivalent to the
>>>>> update_missing case.
>>>>>
>>>> My point is that if we can't detect the difference reliably, it's
>>>> not very useful. Consider this example:
>>>>
>>>> Node A:
>>>>
>>>> T1: INSERT INTO t (id, value) VALUES (1,1);
>>>>
>>>> T2: DELETE FROM t WHERE id = 1;
>>>>
>>>> Node B:
>>>>
>>>> T3: UPDATE t SET value = 2 WHERE id = 1;
>>>>
>>>> The "correct" order of received messages on a third node is
>>>> T1-T3-T2. But we may also see T1-T2-T3 and T3-T1-T2, e.g. due to
>>>> network issues and so on. For T1-T2-T3 the right decision is to
>>>> discard the update, while for T3-T1-T2 it's to wait for the INSERT
>>>> to arrive.
>>>>
>>>> But if we misdetect the situation, we either end up with a row that
>>>> shouldn't be there, or we lose an update.
>>>
>>> Doesn't the above example indicate that 'update_deleted' should also
>>> be considered a necessary conflict type? Please see the possibilities
>>> of conflicts in all three cases:
>>>
>>> The "correct" order of receiving messages on node C (as suggested
>>> above) is T1-T3-T2 (case 1)
>>> ----------
>>> T1 will insert the row.
>>> T3 will have an update_differ conflict; latest_timestamp_wins or
>>> apply will apply it, earliest_timestamp_wins or skip will skip it.
>>> T2 will delete the row (irrespective of whether the update happened
>>> or not).
>>> End result: no data.
>>>
>>> T1-T2-T3
>>> ----------
>>> T1 will insert the row.
>>> T2 will delete the row.
>>> T3 will have an update_deleted conflict. If it is 'update_deleted',
>>> chances are that the resolver set here is 'skip' (the default is
>>> also 'skip' in this case).
>>>
>>> If vacuum has removed that row (or if we don't support the
>>> 'update_deleted' conflict), it will be an 'update_missing' conflict.
>>> In that case, the user may end up inserting the row if the chosen
>>> resolver is in favor of apply (which seems an obvious choice for the
>>> 'update_missing' conflict; the default is also 'apply_or_skip').
>>>
>>> End result:
>>> Row inserted with 'update_missing'.
>>> Row correctly skipped with 'update_deleted' (assuming the obvious
>>> choice is 'skip' for the update_deleted case).
>>>
>>> So it seems that with the 'update_deleted' conflict, there are
>>> higher chances of making the right decision here (which is to
>>> discard the update), as 'update_deleted' conveys the correct info to
>>> the user. 'update_missing', OTOH, does not convey the correct info,
>>> and the user may end up inserting the data by choosing apply-favoring
>>> resolvers for 'update_missing'. Again, we get the benefit of
>>> 'update_deleted' for *recently* deleted rows only.
>>>
>>> T3-T1-T2
>>> ----------
>>> T3 may end up inserting the record if the resolver is in favor of
>>> 'apply' and all the columns are received from remote.
>>> T1 will have an 'insert_exists' conflict and thus may either
>>> overwrite the 'updated' values or leave the data as is (based on
>>> whether the resolver is in favor of apply or not).
>>> T2 will end up deleting it.
>>> End result: no data.
>>>
>>> I feel that for the second case (and similar cases),
>>> 'update_deleted' serves as a better conflict type.
>>>
>>
>> True, but this is pretty much just a restatement of the example,
>> right?
>>
>> The point I was trying to make is that this hinges on the ability to
>> detect the correct conflict type. And if vacuum can swoop in and
>> remove the recently deleted tuples (which I believe can happen at any
>> time, right?), then that's not guaranteed, because we won't see the
>> deleted tuple anymore.
>
> Yes, that's correct. However, many cases could benefit from the
> update_deleted conflict type if it can be implemented reliably. That's
> why we wanted to give it a try. But if we can't achieve predictable
> results with it, I'm fine to drop this approach and conflict type. We
> can consider a better design in the future that doesn't depend on
> non-vacuumed entries and provides a more robust method for identifying
> deleted rows.
>

I agree having a separate update_deleted conflict would be beneficial;
I'm not arguing against that - my point is actually that I think this
conflict type is required, and that it needs to be detected reliably.

I'm not sure dropping update_deleted entirely would be a good idea,
though. It pretty much guarantees making the wrong decision at least
sometimes. But at least it's predictable, and users are more likely to
notice that (compared to update_deleted working on well-behaving
systems, and then failing when a node starts lagging or something).
That's my opinion, though, and I don't intend to stand in the way.

But I think the solution is not that difficult - something needs to
prevent cleanup of recently dead tuples (until the "relevant" changes
are received and applied from other nodes). I don't know if that could
be done based on information we have for subscriptions, or if we need
something new.

>> Also, can the resolver even convert the UPDATE into INSERT and
>> proceed? Maybe with REPLICA IDENTITY FULL?
>
> Yes, it can, as long as the row doesn't contain toasted data. Without
> toasted data, the new tuple is fully logged.
> However, if the row does contain toasted data, the new tuple won't be
> logged completely. In such a case, REPLICA IDENTITY FULL becomes a
> requirement to ensure we have all the data necessary to create the row
> on the target side. In the absence of RI FULL, when the logged row
> lacks the toasted data, the operation will be skipped or an error will
> be raised.
>
>> Otherwise the row might be incomplete, missing required columns etc.
>> In which case it'd have to wait for the actual INSERT to arrive -
>> which would work for actual update_missing, where the row may be
>> delayed due to network issues. But if that's a mistake due to vacuum
>> removing the deleted tuple, it'll wait forever.
>
> Even in the case of 'update_missing', we do not intend to wait for the
> actual 'insert' to arrive, as it is not guaranteed that the 'insert'
> will arrive at all. And thus we plan to skip or error out (based on
> the user's configuration) if a complete row cannot be created for
> insertion.
>

If the UPDATE contains all the columns and can be turned into an
INSERT, then that seems reasonable. But I don't see how skipping it
could work in general (except for some very simple / specific use
cases). I'm not sure if you suggest skipping just the one UPDATE or the
transaction as a whole, but it seems to me either of those options
could easily lead to all kinds of inconsistencies and user confusion.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company