Re: Issue with logical replication slot during switchover - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Issue with logical replication slot during switchover |
| Date | |
| Msg-id | CAD21AoBSgQoX99aC2PfL7dzCwM0NGvbgrfUD7ALihVCVtgqVXQ@mail.gmail.com |
| In response to | Re: Issue with logical replication slot during switchover (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses | Re: Issue with logical replication slot during switchover |
| List | pgsql-hackers |
On Tue, Nov 18, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Nov 15, 2025 at 4:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 2:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > The point is quite fundamental, do you think we can sync to a
> > > pre-existing slot with the same name and failover marked as true after
> > > the first time the node joins a new primary?
> >
> > Given the current behavior that we cannot create a logical slot with
> > failover=true on the standby, it makes sense to me that we overwrite
> > the pre-existing slot (with synced=false and failover=true) on the old
> > primary by the slot (with synced=true and failover=true) on the new
> > primary if their names, plugin and other properties matches and the
> > pre-existing slot has lesser LSNs and XIDs than the one on the new
> > primary. But at the same time, we need to consider the possible future
> > changes that allow users to create a slot with failover=true also on
> > the standby.
> >
> > Alexander pointed out[1] that allowing to create a slot with
> > failover=true on the standby won't work with the current
> > implementation. I agree with his analysis, and I guess we would need
> > more changes than simply allowing it, regardless of accepting the
> > proposed change. We might need to introduce a replication slot origin
> > or a generation.
> >
>
> AFAICS, the email you pointed out wrote about use cases, not the
> actual code implementation. We can discuss use cases if we want to
> pursue that implementation, but the reason why we decided not to allow
> it was for the cases where users try to configure cascaded standbys to
> also try to sync slots from the first standby that are already being
> synced from the primary. There are quite a few technical challenges in
> supporting that, like how to make sure primary waits even for cascaded
> standbys before sending the changes to logical subscribers.

Right. My point is that these are two independent issues. The fact that
creating a slot with failover=true directly on a standby is difficult
(due to the cascaded-standby cases you mentioned) does not, by itself,
justify allowing us to overwrite an existing slot with failover=true and
synced=false during slot synchronization.

> OTOH, for the cases where there is a totally different logical slot on
> standby (not present on primary) with failover=true, we can allow it
> to be synced from standby-1 to a cascaded standby, though we need some
> way to distinguish those cases. For example, during sync on cascaded
> standby, we can ensure that the slot being synced is not a sync-slot
> (failover=true and sync=true).

Yes. We need some way to distinguish those slots, otherwise if users
create a slot with the same name on the primary, the slot on standby-1
(a cascading standby) could be overwritten. I think we would need some
additional metadata per slot to support that safely.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
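
As a rough illustration of the overwrite condition quoted above (same name, plugin and other properties, failover=true on both sides, pre-existing slot with synced=false and not ahead in LSN/XID), the check could be sketched as follows. This is only a toy model in Python, not PostgreSQL code: the `Slot` class, its field names, and `may_overwrite` are hypothetical, XID wraparound is ignored, and treating equal positions as acceptable is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical, simplified view of a logical replication slot."""
    name: str
    plugin: str
    failover: bool
    synced: bool
    restart_lsn: int   # LSN modeled as a plain integer
    catalog_xmin: int  # XID modeled as a plain integer (no wraparound)


def may_overwrite(local: Slot, remote: Slot) -> bool:
    """Return True if the pre-existing local slot (synced=false,
    failover=true) matches the remote slot and is not ahead of it."""
    return (
        not local.synced
        and local.failover
        and remote.failover
        and local.name == remote.name
        and local.plugin == remote.plugin
        and local.restart_lsn <= remote.restart_lsn
        and local.catalog_xmin <= remote.catalog_xmin
    )


old_primary_slot = Slot("sub1", "pgoutput", True, False, 1000, 700)
new_primary_slot = Slot("sub1", "pgoutput", True, False, 1500, 750)
print(may_overwrite(old_primary_slot, new_primary_slot))
```

Note that such a check depends only on the name and properties, so it cannot tell an unrelated slot that happens to share the name from the slot left over on the old primary. That is the gap the additional per-slot metadata (an origin or a generation) would have to close.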