Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id ZbyFixCyOef/6sZx@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Responses RE: Synchronizing slots from primary to standby
List pgsql-hackers
Hi,

On Thu, Feb 01, 2024 at 05:29:15PM +0530, shveta malik wrote:
> Attached v75 patch-set. Changes are:
> 
> 1) Re-arranged the patches:
> 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> separated out in v75-001 as those are independent changes.
> 1.2) 'Add logical slot sync capability', 'Slot sync worker as special
> process' and 'App-name changes' are now merged to single patch which
> makes v75-002.
> 1.3) 'Wait for physical Standby confirmation' and 'Failover Validation
> Document' patches are maintained as is (v75-003 and v75-004 now).

Thanks!

I only looked at the commit message for v75-0002 and see that it has changed
since the comment done in [1], but it still does not look correct to me.

"
If a logical slot on the primary is valid but is invalidated on the standby,
then that slot is dropped and recreated on the standby in next sync-cycle
provided the slot still exists on the primary server. It is okay to recreate
such slots as long as these are not consumable on the standby (which is the
case currently). This situation may occur due to the following reasons:
- The max_slot_wal_keep_size on the standby is insufficient to retain WAL
  records from the restart_lsn of the slot.
- primary_slot_name is temporarily reset to null and the physical slot is
  removed.
- The primary changes wal_level to a level lower than logical.
"

If a logical decoding slot "still exists on the primary server" then the primary
can not change the wal_level to lower than logical, one would get something like:

"FATAL:  logical replication slot "logical_slot" exists, but wal_level < logical"

and then slots won't get invalidated on the standby. I've the feeling that the
wal_level conflict part may need to be explained separately? (I think it's not
possible that they end up being re-created on the standby for this conflict, they
will be simply removed as it would mean the counterpart one on the primary 
does not exist anymore).

[1]: https://www.postgresql.org/message-id/ZYWdSIeAMQQcLmVT%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: Race condition in FetchTableStates() breaks synchronization of subscription tables
Next
From: Dilip Kumar
Date:
Subject: Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock