...
>
> Thinking about this further, using quorum settings for
> synchronized_standby_slots can/will certainly result in at least one
> sync standby lagging behind the logical replica, making it probably
> impossible to continue with the existing logical replication setup
> after a failover to the standby that lags behind. Here is what I
> mean:
>
But won't that be true even for synchronous_standby_names? I think in the case of quorum, it is the responsibility of the failover solution to select the most recent synced standby among all the standbys specified in synchronous_standby_names. Similarly here, before failing over the logical subscriber to one of the physical standbys, the failover tool needs to ensure it is switching over to a synced replica. We have given steps in the docs [1] that can be used to identify the replica to which the subscriber can switch over. Will that address your concern?
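As a rough illustration of that selection step, a failover tool could compare the positions of the candidate standbys and only switch the subscriber over to one that has reached the position the subscriber has already received. This is a minimal sketch, not the procedure from the docs [1]: the standby names and LSN values are made up, `parse_lsn` and `pick_failover_target` are hypothetical helpers, and how the LSNs are collected from each server is left out.

```python
from typing import Dict, Optional


def parse_lsn(text: str) -> int:
    """Convert an LSN in PostgreSQL's 'hi/lo' hexadecimal text form to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def pick_failover_target(standby_lsns: Dict[str, str],
                         subscriber_lsn: str) -> Optional[str]:
    """Return the most advanced standby, or None if no standby has
    reached the position the subscriber has already received.

    standby_lsns maps a (hypothetical) standby name to its LSN;
    subscriber_lsn is assumed to be the subscriber's received position.
    """
    if not standby_lsns:
        return None
    # Pick the standby with the highest LSN.
    best = max(standby_lsns, key=lambda name: parse_lsn(standby_lsns[name]))
    # Refuse to fail over if even the best standby is behind the subscriber.
    if parse_lsn(standby_lsns[best]) < parse_lsn(subscriber_lsn):
        return None
    return best


# Made-up example values.
standbys = {"standby1": "0/3000060", "standby2": "0/3000148", "standby3": "0/2FFFF00"}
target = pick_failover_target(standbys, "0/3000100")
```

Returning None rather than the "least bad" standby is a deliberate choice in this sketch: it forces the operator or tool to decide what to do when no standby is ahead of the subscriber.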
+1, the job of failover orchestration is to ensure the new primary is caught up at least to the quorum LSN. Otherwise, it can be a durability issue where users see missing committed transactions.
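To make "quorum LSN" concrete: with a quorum setting such as ANY k (...), a commit is acknowledged once k of the listed standbys confirm it, so the position known to be durable on a quorum is the k-th highest flush LSN among them. A small sketch under that assumption follows; the standby names and LSN values are invented, LSNs are plain integers for brevity, and `quorum_lsn` and `caught_up_standbys` are hypothetical helper names.

```python
from typing import Dict, List


def quorum_lsn(flush_lsns: Dict[str, int], k: int) -> int:
    """Return the k-th highest flush LSN, i.e. the newest position
    that at least k standbys have confirmed."""
    if not 1 <= k <= len(flush_lsns):
        raise ValueError("k must be between 1 and the number of standbys")
    return sorted(flush_lsns.values(), reverse=True)[k - 1]


def caught_up_standbys(flush_lsns: Dict[str, int], k: int) -> List[str]:
    """Return the standbys that have reached the quorum LSN and are
    therefore safe promotion candidates in this simplified model."""
    target = quorum_lsn(flush_lsns, k)
    return sorted(name for name, lsn in flush_lsns.items() if lsn >= target)


# Made-up example: three standbys, quorum of two (ANY 2).
flush = {"s1": 500, "s2": 300, "s3": 400}
safe = caught_up_standbys(flush, 2)
```

In this example s2 is the lagging standby that the original concern describes: promoting it could lose transactions that were already acknowledged to clients.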
BTW, I have also suggested this idea in thread [2]. I don't recall all the ideas/points discussed in that thread but it would be good to check that thread for any alternative ideas and points raised, so that we don't miss anything.
Thanks for sharing the links, the approach is similar. DEFAULT to SAME_AS_SYNCREP_STANDBYS is an interesting option.
I like the idea of avoiding duplicate lists unless the user wants to maintain a separate list.