Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | CAA4eK1+q77+PdWUWzU_DVfVd3VsGGWZOiJPDy9NME-5iGJATBw@mail.gmail.com Whole thread Raw |
In response to | Re: Conflict detection for update_deleted in logical replication (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
On Thu, Nov 21, 2024 at 3:03 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > Attach the V10 patch set which addressed above comments > and fixed a CFbot warning due to un-initialized variable. > We should make the v10_2-0001* as the first main patch for review till we have a consensus to resolve LSN<->Timestamp inversion issue. This is because v10_2 doesn't rely on the correctness of LSN<->Timestamp mapping. Now, say in some later release, we fix the LSN<->Timestamp inversion issue, we can simply avoid sending remote_xact information and it will behave the same as your v10_1 approach. Comments on v10_2_0001*: ====================== 1. +/* + * The phases involved in advancing the non-removable transaction ID. + * + * Refer to maybe_advance_nonremovable_xid() for details on how the function + * transitions between these phases. + */ +typedef enum +{ + DTR_GET_CANDIDATE_XID, + DTR_REQUEST_PUBLISHER_STATUS, + DTR_WAIT_FOR_PUBLISHER_STATUS, + DTR_WAIT_FOR_LOCAL_FLUSH +} DeadTupleRetainPhase; First, can we have a better name for this enum like RetainConflictInfoPhase or something like that? Second, the phase transition is not very clear from the comments atop maybe_advance_nonremovable_xid. You can refer to comments atop tablesync.c or snapbuild.c to see other cases where we have explained phase transitions. 2. + * Wait for the status from the walsender. After receiving the first status + * after acquiring a new candidate transaction ID, do not proceed if there + * are ongoing concurrent remote transactions. In this part of the comments: " .. after acquiring a new candidate transaction ID ..." appears misplaced. 3. In maybe_advance_nonremovable_xid(), the handling of each phase looks ad-hoc though I see that you have done that have so that you can handle the phase change functionality after changing the phase immediately. If we have to ever extend this functionality, it will be tricky to handle the new phase or at least the code will become complicated. How about handling each phase in the order of their occurrence and having separate functions for each phase as we have in apply_dispatch()? That way it would be convenient to invoke the phase handling functionality even if it needs to be called multiple times in the same function. 4. /* + * An invalid position indiates the publisher is also + * a physical standby. In this scenario, advancing the + * non-removable transaction ID is not supported. This + * is because the logical walsender on the standby can + * only get the WAL replay position but there may be + * more WALs that are being replicated from the + * primary and those WALs could have earlier commit + * timestamp. Refer to + * maybe_advance_nonremovable_xid() for details. + */ + if (XLogRecPtrIsInvalid(remote_lsn)) + { + ereport(WARNING, + errmsg("cannot get the latest WAL position from the publisher"), + errdetail("The connected publisher is also a standby server.")); + + /* + * Continuously revert to the request phase until + * the standby server (publisher) is promoted, at + * which point a valid WAL position will be + * received. + */ + phase = DTR_REQUEST_PUBLISHER_STATUS; + } Shouldn't this be an ERROR as the patch doesn't support this case? The same should be true for later patches for the subscription option. -- With Regards, Amit Kapila.
pgsql-hackers by date: