Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | CALDaNm3whh00AN58Azsps6+NLsYgqL6w2hz6wmcqSw5uiYqAbA@mail.gmail.com Whole thread Raw |
In response to | Re: Conflict detection for update_deleted in logical replication (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
RE: Conflict detection for update_deleted in logical replication
|
List | pgsql-hackers |
On Wed, 25 Dec 2024 at 08:13, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Monday, December 23, 2024 2:15 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Hou, > > > > Thanks for updating the patch. Few comments: > > Thanks for the comments! > > > 02. ErrorOnReservedSlotName() > > > > Currently the function is callsed from three points - > > create_physical_replication_slot(), > > create_logical_replication_slot() and CreateReplicationSlot(). > > Can we move them to the ReplicationSlotCreate(), or combine into > > ReplicationSlotValidateName()? > > I am not sure because moving the check into these functions because that would > prevent the launcher from creating the slot as well unless we add a new > parameter for these functions, but I am not sure if it's worth it at this > stage. > > > > > 03. advance_conflict_slot_xmin() > > > > ``` > > Assert(TransactionIdIsValid(MyReplicationSlot->data.xmin)); > > ``` > > > > Assuming the case that the launcher crashed just after > > ReplicationSlotCreate(CONFLICT_DETECTION_SLOT). > > After the restart, the slot can be acquired since > > SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT) > > is true, but the process would fail the assert because data.xmin is still invalid. > > > > I think we should re-create the slot when the xmin is invalid. Thought? > > After thinking more, the standard approach to me would be to mark the slot as > EPHEMERAL during creation and persist it after initializing, so changed like > that. > > > 05. check_remote_recovery() > > > > Can we add a test case related with this? > > I think the code path is already tested, and I am a bit unsure if we want to setup > a standby to test the ERROR case, so didn't add this. > > --- > > Attach the new version patch set which addressed all other comments. Few suggestions: 1) If we have a subscription with detect_update_deleted option and we try to upgrade it with default settings(in case dba forgot to set track_commit_timestamp), the upgrade will fail after doing a lot of steps like that mentioned in ok below: Setting locale and encoding for new cluster ok Analyzing all rows in the new cluster ok Freezing all rows in the new cluster ok Deleting files from new pg_xact ok Copying old pg_xact to new server ok Setting oldest XID for new cluster ok Setting next transaction ID and epoch for new cluster ok Deleting files from new pg_multixact/offsets ok Copying old pg_multixact/offsets to new server ok Deleting files from new pg_multixact/members ok Copying old pg_multixact/members to new server ok Setting next multixact ID and offset for new cluster ok Resetting WAL archives ok Setting frozenxid and minmxid counters in new cluster ok Restoring global objects in the new cluster ok Restoring database schemas in the new cluster postgres *failure* We should detect this at an earlier point somewhere like in check_new_cluster_subscription_configuration and throw an error from there. 2) Also should we include an additional slot for the pg_conflict_detection slot while checking max_replication_slots. Though this error will occur after the upgrade is completed, it may be better to include the slot during upgrade itself so that the DBA need not handle this error separately after the upgrade is completed. 3) We have reserved the pg_conflict_detection name in this version, so if there was a replication slot with the name pg_conflict_detection in the older version, the upgrade will fail at a very later stage like an earlier upgrade shown. I feel we should check if the old cluster has any slot with the name pg_conflict_detection and throw an error earlier itself: +void +ErrorOnReservedSlotName(const char *name) +{ + if (strcmp(name, CONFLICT_DETECTION_SLOT) == 0) + ereport(ERROR, + errcode(ERRCODE_RESERVED_NAME), + errmsg("replication slot name \"%s\" is reserved", + name)); +} 4) We should also mention something like below in the documentation so the user can be aware of it: The slot name cannot be created with pg_conflict_detection, as this is reserved for logical replication conflict detection. Regards, Vignesh
pgsql-hackers by date: