Hello!
> Why only by luck?
I mean that last_write_win produces the same result in both of the following cases:
* we found the tuple, detected a conflict, and decided to ignore the
update coming from the publisher
* we were unable to find the tuple, logged an error about it, and
ignored the update coming from the publisher
In both cases, the result is the same: the subscriber version remains
in the table.
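
To make the two cases concrete, here is a toy Python model (not PostgreSQL
code; the function name, the row layout and the scan_finds_tuple flag are
all made up for illustration):

```python
# Toy model of a subscriber applying a remote UPDATE.
# All names here are illustrative; this is not PostgreSQL code.

def apply_remote_update(table, key, remote_row, scan_finds_tuple):
    """Apply remote_row to table[key]; return a string describing the action."""
    local_row = table.get(key) if scan_finds_tuple else None
    if local_row is None:
        # Case 2: the tuple was not found (for example, the scan missed it);
        # an error is logged and the update is skipped.
        return "update_missing: skipped"
    if local_row["ts"] >= remote_row["ts"]:
        # Case 1: conflict detected, the local version is newer and wins.
        return "conflict: kept local"
    table[key] = remote_row
    return "applied remote"


local = {"val": "subscriber", "ts": 200}
remote = {"val": "publisher", "ts": 100}

found = {1: dict(local)}   # the scan finds the tuple
missed = {1: dict(local)}  # the scan misses the tuple
result_found = apply_remote_update(found, 1, remote, scan_finds_tuple=True)
result_missed = apply_remote_update(missed, 1, remote, scan_finds_tuple=False)

# Different code paths, same final table contents.
print(result_found, "|", result_missed, "|", found == missed)
```

If the remote row carried the newer timestamp instead, the two paths would
diverge in this model: the first would apply the update and the second would
still skip it. That is why I call the matching result luck.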
> Then these may not lead to eventual consistency for such cases. So,
> not sure one should anyway rely on these.
But once the SnapshotDirty scan is fixed, it becomes possible to
implement such strategies reliably.
Also, some strategies require some kind of merge function for tuples.
In my understanding, even last_write_win should probably compare
timestamps to determine which version is "newer" because time in
distributed systems can be tricky.
Therefore, we have to find the tuple if it exists.
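
A rough sketch of what I mean, again as toy Python rather than PostgreSQL
code. The row layout, the tie-break on an origin id and the counter merge
are my assumptions, chosen only to show that both kinds of strategy need
the local tuple as input:

```python
# Hypothetical conflict resolvers; names and row layout are assumptions.

def last_write_wins(local_row, remote_row):
    """Pick the newer row by (commit timestamp, origin id)."""
    # Equal timestamps are broken by origin id so that every node makes
    # the same choice for the same pair of rows.
    local_key = (local_row["ts"], local_row["origin"])
    remote_key = (remote_row["ts"], remote_row["origin"])
    return remote_row if remote_key > local_key else local_row


def merge_counters(local_row, remote_row):
    """Example merge function: winner's metadata, larger counter value."""
    winner = last_write_wins(local_row, remote_row)
    return {**winner, "val": max(local_row["val"], remote_row["val"])}


def resolve(local_row, remote_row, strategy):
    if local_row is None:
        # Without the local tuple there is nothing to compare or merge with.
        raise LookupError("local tuple not found")
    return strategy(local_row, remote_row)


local_row = {"val": 3, "ts": 100, "origin": 2}
remote_row = {"val": 5, "ts": 100, "origin": 1}

lww_result = resolve(local_row, remote_row, last_write_wins)
merge_result = resolve(local_row, remote_row, merge_counters)
print(lww_result)
print(merge_result)
```

In this sketch, a scan that misses the local tuple leaves resolve() with
nothing to work on, whichever strategy is chosen.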
> BTW, then isn't it possible that INSERT happens on a different page?
Yes, it is possible - in that case, the bug does not occur. It only
happens if the new TID of the logical tuple is added to the same page.
Just to clarify, this is about B-tree pages, not the heap.
> I think this questions whether we consider the SnapshotDirty results
> correct or not.
In my understanding, this is clearly wrong:
* such behavior is not documented anywhere
* usage patterns assume that such things cannot happen
* new features struggle with it. For example, the new update_deleted
logging may fail to behave correctly (see
038_update_missing_with_retain.pl in the patch). So how should it be
used? Its result might be correct, but it also might not be...
Another option is to document the behavior and rename it to SnapshotMaybe :)
By the way, SnapshotSelf is also affected.
> The case of logical replication giving wrong results
> [0] is the behavior from the beginning of logical replication.
Logical replication was designed mainly for setups without any
concurrent updates on the subscriber side. I think that is why the
issue was overlooked.
Best regards,
Mikhail.