RE: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From: Zhijie Hou (Fujitsu)
Subject: RE: Conflict detection for update_deleted in logical replication
Date:
Msg-id: OS0PR01MB5716662BEB9C0B4E92587FAC946C2@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to: RE: Conflict detection for update_deleted in logical replication ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List: pgsql-hackers
On Friday, September 20, 2024 10:55 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
> On Friday, September 20, 2024 2:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I think that such a time-based configuration parameter would be a
> > reasonable solution. The current concerns are that it might affect
> > vacuum performance and lead to a similar bug we had with
> > vacuum_defer_cleanup_age.
>
> Thanks for the feedback!
>
> I am working on the POC patch and doing some initial performance tests on
> this idea. I will share the results after finishing.
>
> Apart from the vacuum_defer_cleanup_age idea, we have given more thought to
> our approach for retaining dead tuples and have come up with another idea
> that can reliably detect conflicts without requiring users to choose a wise
> value for vacuum_committs_age. This new idea could also reduce the
> performance impact. Thanks a lot to Amit for the off-list discussion.
>
> The concept of the new idea is that dead tuples are only useful for
> detecting conflicts when applying *concurrent* transactions from remote
> nodes. Any subsequent UPDATE from a remote node after removing the dead
> tuples should have a later timestamp, meaning it is reasonable to detect an
> update_missing scenario and convert the UPDATE to an INSERT when applying
> it.
>
> To achieve the above, we can create an additional replication slot on the
> subscriber side, maintained by the apply worker. This slot is used to
> retain the dead tuples. The apply worker will advance the slot.xmin after
> confirming that all the concurrent transactions on the publisher have been
> applied locally.
>
> The process of advancing the slot.xmin could be:
>
> 1) The apply worker calls GetRunningTransactionData() to get the
>    'oldestRunningXid' and considers this as the 'candidate_xmin'.
> 2) The apply worker sends a new message to the walsender to request the
>    latest WAL flush position (GetFlushRecPtr) on the publisher, and saves
>    it as 'candidate_remote_wal_lsn'. Here we could introduce a new feedback
>    message or extend the existing keepalive message (e.g., extend the
>    requestReply bit in the keepalive message to add a
>    'request_wal_position' value).
> 3) The apply worker can continue to apply changes. After applying all the
>    WAL up to 'candidate_remote_wal_lsn', the apply worker can then advance
>    the slot.xmin to 'candidate_xmin'.
>
> This approach ensures that dead tuples are not removed until all concurrent
> transactions have been applied. It can be effective for both bidirectional
> and non-bidirectional replication cases.
>
> We could introduce a boolean subscription option (retain_dead_tuples) to
> control whether this feature is enabled. Each subscription intending to
> detect update_deleted conflicts should set retain_dead_tuples to true.
>
> The following explains how it works in different cases to achieve data
> consistency:
...
> --
> 3 nodes, non-bidirectional, Node C subscribes to both Node A and Node B:
> --

Sorry for a typo here, the times of T2 and T3 were reversed. Please see the
following correction:

> Node A:
>   T1: INSERT INTO t (id, value) VALUES (1,1);   ts=10.00 AM
>   T2: DELETE FROM t WHERE id = 1;               ts=10.01 AM

Here T2 should be at ts=10.02 AM.

> Node B:
>   T3: UPDATE t SET value = 2 WHERE id = 1;      ts=10.02 AM

T3 should be at ts=10.01 AM.

> Node C:
>   apply T1, T2, T3
>
> After applying T2, the apply worker on Node C will check the latest WAL
> flush location on Node B.
> By that time, T3 should have finished, so the xmin will be advanced only
> after applying the WAL that is later than T3. Therefore, the dead tuple
> will not be removed before T3 is applied, which means the update_deleted
> conflict can be detected.
>
> Your feedback on this idea would be greatly appreciated.

Best Regards,
Hou zj
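As a minimal, standalone illustration of the three-step slot.xmin advancement
process quoted above, the sketch below simulates the candidate_xmin /
candidate_remote_wal_lsn handshake in plain C. It is only a sketch: the
ApplyWorkerState struct, the helper functions, and the concrete xid/LSN values
are invented for illustration and are not the actual PostgreSQL apply-worker
code or the patch discussed in this thread.

```c
/*
 * Illustrative, standalone simulation of the proposed slot.xmin advancement
 * protocol.  Types and helpers are simplified stand-ins, not PostgreSQL
 * internals; GetRunningTransactionData()/GetFlushRecPtr() are represented by
 * plain parameters.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;
typedef uint64_t XLogRecPtr;

typedef struct ApplyWorkerState
{
    TransactionId slot_xmin;            /* xmin currently pinned by the slot */
    TransactionId candidate_xmin;       /* step 1: local oldest running xid */
    XLogRecPtr    candidate_remote_lsn; /* step 2: publisher flush position */
    bool          candidate_valid;
} ApplyWorkerState;

/*
 * Steps 1 and 2: remember the local oldest running xid as the candidate and
 * the publisher's current WAL flush position as the target to catch up to.
 */
static void
start_xmin_candidate(ApplyWorkerState *st,
                     TransactionId local_oldest_running_xid,
                     XLogRecPtr remote_flush_lsn)
{
    st->candidate_xmin = local_oldest_running_xid;
    st->candidate_remote_lsn = remote_flush_lsn;
    st->candidate_valid = true;
}

/*
 * Step 3: once everything flushed on the publisher before the candidate was
 * taken has been applied locally, it is safe to advance slot.xmin.
 */
static void
maybe_advance_slot_xmin(ApplyWorkerState *st, XLogRecPtr last_applied_remote_lsn)
{
    if (st->candidate_valid &&
        last_applied_remote_lsn >= st->candidate_remote_lsn)
    {
        st->slot_xmin = st->candidate_xmin;
        st->candidate_valid = false;
        printf("slot.xmin advanced to %u (remote WAL applied up to %lu)\n",
               (unsigned) st->slot_xmin,
               (unsigned long) last_applied_remote_lsn);
    }
}

int
main(void)
{
    ApplyWorkerState st = { .slot_xmin = 100, .candidate_valid = false };

    /* Candidate taken while the publisher's flush position is LSN 5000. */
    start_xmin_candidate(&st, 120, 5000);

    maybe_advance_slot_xmin(&st, 4500); /* not yet: concurrent txns remain */
    maybe_advance_slot_xmin(&st, 5000); /* safe: dead tuples may now be pruned */

    printf("final slot.xmin = %u\n", (unsigned) st.slot_xmin);
    return 0;
}
```

The point the simulation makes is the same as in the mail: the slot's xmin only
moves once the subscriber has applied all publisher WAL flushed before the
candidate was taken, so transactions that were concurrent at that moment still
find the dead tuples they need for conflict detection.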