Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | Zcoly3C/pkUyC7up@ip-10-97-1-34.eu-west-3.compute.internal |
In response to | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
Hi,

On Mon, Feb 12, 2024 at 04:19:33PM +0530, Amit Kapila wrote:
> On Mon, Feb 12, 2024 at 3:33 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > A few random comments:
> >
> >
> > 003 ===
> >
> > +      If, after executing the function,
> > +      <link linkend="guc-hot-standby-feedback">
> > +      <varname>hot_standby_feedback</varname></link> is disabled on
> > +      the standby or the physical slot configured in
> > +      <link linkend="guc-primary-slot-name">
> > +      <varname>primary_slot_name</varname></link> is
> > +      removed,
> >
> > I think another option that could lead to slot invalidation is if primary_slot_name
> > is NULL or miss-configured.
> >
>
> If the primary_slot_name is NULL then the function will error out.

Yeah right, it had to be non-NULL initially, so we know there is a physical slot
(if not dropped) that should prevent conflicts in the first place (should hsf be on).

Please forget about comment 003 then.

> >
> > 005 ===
> >
> > + To resume logical replication after failover from the synced logical
> > + slots, the subscription's 'conninfo' must be altered
> >
> > Only in a pub/sub context but not for other ways of using the logical replication
> > slot(s).
> >
>
> Right, but what additional information do you want here? I thought we
> were speaking about the in-build logical replication here so this is
> okay.

The "Logical Decoding Concepts" sub-chapter also mentions "Logical decoding clients",
so I was not sure the part added in the patch was for in-built logical replication
only.

Or maybe just reword it this way: "In case of in-built logical replication, to resume
after failover from the synced......"?

> >
> > 008 ===
> >
> > + ereport(LOG,
> > +         errmsg("dropped replication slot \"%s\" of dbid %d",
> > +                NameStr(local_slot->data.name),
> > +                local_slot->data.database));
> >
> > We emit a message when an "invalidated" slot is dropped but not when we create
> > a slot. Shouldn't we emit a message when we create a synced slot on the standby?
> >
> > I think that could be confusing to see "a drop" message not followed by "a create"
> > one when it's expected (slot valid on the primary for example).
> >
>
> Isn't the below message for sync-ready slot sufficient? Otherwise, in
> most cases, we will LOG multiple similar messages.
>
> + ereport(LOG,
> +         errmsg("newly created slot \"%s\" is sync-ready now",
> +                remote_slot->name));

Yes, it is sufficient if we reach it. For example, during some testing, I was able to
go through this code path:

Breakpoint 2, update_and_persist_local_synced_slot (remote_slot=0x56450e7c49c0, remote_dbid=5) at slotsync.c:340
340             ReplicationSlot *slot = MyReplicationSlot;
(gdb) n
346             if (remote_slot->restart_lsn < slot->data.restart_lsn ||
(gdb)
347                     TransactionIdPrecedes(remote_slot->catalog_xmin,
(gdb)
346             if (remote_slot->restart_lsn < slot->data.restart_lsn ||
(gdb)
358                     return;

which means exiting from update_and_persist_local_synced_slot() without reaching the
"newly created slot" message (the slot on the primary was "inactive").

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
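
For illustration only, the early-return path seen in the gdb session can be modeled as a small standalone C sketch. This is not the slotsync.c code: SlotPosition, xid_precedes() and try_persist_synced_slot() are hypothetical names, xid_precedes() is a simplified stand-in for TransactionIdPrecedes(), and comparing against the local slot's catalog_xmin is an assumption, since the trace truncates the second argument of that call.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;     /* WAL position */
typedef uint32_t TransactionId;  /* 32-bit transaction ID */

#define FirstNormalTransactionId ((TransactionId) 3)

/* Hypothetical, reduced view of the two slot fields the check reads. */
typedef struct SlotPosition
{
    XLogRecPtr    restart_lsn;
    TransactionId catalog_xmin;
} SlotPosition;

/*
 * Wraparound-aware "id1 is older than id2", a simplified stand-in for
 * TransactionIdPrecedes(): special XIDs compare as plain unsigned values,
 * normal XIDs compare modulo 2^32.
 */
bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    if (id1 < FirstNormalTransactionId || id2 < FirstNormalTransactionId)
        return id1 < id2;
    return (int32_t) (id1 - id2) < 0;
}

/*
 * Returns true if the sync-ready message was emitted, false if the
 * function bailed out early. Assumption: the second comparison is
 * against the local slot's catalog_xmin.
 */
bool
try_persist_synced_slot(const char *name,
                        const SlotPosition *remote,
                        const SlotPosition *local)
{
    if (remote->restart_lsn < local->restart_lsn ||
        xid_precedes(remote->catalog_xmin, local->catalog_xmin))
    {
        /* Remote slot is behind the local one: return without logging. */
        return false;
    }

    printf("LOG:  newly created slot \"%s\" is sync-ready now\n", name);
    return true;
}
```

In this model, a remote slot whose restart_lsn is behind the local one makes try_persist_synced_slot() return false without printing anything, which corresponds to the case described above where the "newly created slot" message is never reached.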