Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Synchronizing slots from primary to standby
Msg-id 8a06a7d0-b555-43b0-b407-99a618b30ece@gmail.com
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
Hi,

On 11/9/23 11:54 AM, shveta malik wrote:
> 
> PFA v32 patches which has below changes:

Thanks!
   
> 7) Added warning for cases where a user-slot with the same name is
> already present which slot-sync worker is trying to create. Sync for
> such slots is skipped.

I'm seeing an assertion failure and a segfault in this case, due to the
ReplicationSlotRelease() call in synchronize_one_slot().

Adding this extra check prior to it:

-       ReplicationSlotRelease();
+       if (!(found && s->data.sync_state == SYNCSLOT_STATE_NONE))
+               ReplicationSlotRelease();

makes them disappear.
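
To illustrate why the guard helps, here is a minimal standalone sketch. It is not the patch code: MockSlot, mock_acquire(), mock_release() and mock_synchronize_one_slot() are hypothetical stand-ins, the 'n'/'i' values are assumed, and the sketch assumes that the "skip" path for a pre-existing user slot never acquires the slot, so there is nothing to release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values; only the identifier SYNCSLOT_STATE_NONE and the
 * 'i' state are mentioned in the thread. */
#define SYNCSLOT_STATE_NONE      'n'
#define SYNCSLOT_STATE_INITIATED 'i'

/* Hypothetical stand-in for a replication slot. */
typedef struct MockSlot
{
    char sync_state;
} MockSlot;

/* Models the per-process "currently acquired slot" pointer. */
static MockSlot *my_slot = NULL;
static int release_calls = 0;

static void
mock_acquire(MockSlot *s)
{
    my_slot = s;
}

static void
mock_release(void)
{
    /* Releasing without holding a slot is treated as a bug here. */
    assert(my_slot != NULL);
    my_slot = NULL;
    release_calls++;
}

/* Returns true if the slot was synced, false if it was skipped. */
static bool
mock_synchronize_one_slot(MockSlot *s, bool found)
{
    /* A user-created slot with the same name already exists: skip it. */
    bool skipped = found && s->sync_state == SYNCSLOT_STATE_NONE;

    if (!skipped)
        mock_acquire(s);    /* normal path: take the slot and sync it */

    /* Same condition as the proposed guard: release only if not skipped. */
    if (!(found && s->sync_state == SYNCSLOT_STATE_NONE))
        mock_release();

    return !skipped;
}
```

Without the condition around mock_release(), the skipped case would trip the assertion in this model, which mirrors the failure described above.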

> 
> Open Question:
> 1) Currently I have put drop slot logic for slots with 'sync_state=i'
> in slot-sync worker. Do we need to put it somewhere in promotion-logic
> as well? 

Yeah, I think so, because there is a time window in which one could "use" the
slot after the promotion and before it is removed, producing things like:

"
2023-11-09 15:16:50.294 UTC [2580462] LOG:  dropped replication slot "logical_slot2" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.295 UTC [2580462] LOG:  dropped replication slot "logical_slot3" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] LOG:  dropped replication slot "logical_slot4" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] ERROR:  replication slot "logical_slot5" is active for PID 2594628
"

After the promotion, one was able to use logical_slot5, and now we cannot drop it.

> Perhaps in WaitForWALToBecomeAvailable() where we call
> XLogShutdownWalRcv after checking 'CheckForStandbyTrigger'. Thoughts?
> 

You mean here?

/*
 * Check to see if promotion is requested. Note that we do
 * this only after failure, so when you promote, we still
 * finish replaying as much as we can from archive and
 * pg_wal before failover.
 */
if (StandbyMode && CheckForStandbyTrigger())
{
    XLogShutdownWalRcv();
    return XLREAD_FAIL;
}

If so, that sounds like a good place to me.
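
To make the time window concrete, here is a small standalone model of it. It is only a sketch: SketchSlot and sketch_drop_not_ready_slots() are hypothetical names, the 'r' state value is assumed, and the "active PID" check is a simplified version of the error shown in the log above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a synced slot on the standby. */
typedef struct SketchSlot
{
    const char *name;
    char        sync_state;  /* 'i' = sync initiated, not ready; 'r' = ready (assumed) */
    int         active_pid;  /* 0 when no backend is using the slot */
    bool        dropped;
} SketchSlot;

/*
 * Try to drop every slot that is still in state 'i'.
 * Returns the number of such slots that could not be dropped
 * because a backend is currently using them.
 */
static int
sketch_drop_not_ready_slots(SketchSlot *slots, size_t nslots)
{
    int failed = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].dropped || slots[i].sync_state != 'i')
            continue;

        if (slots[i].active_pid != 0)
        {
            /* Corresponds to: ERROR: replication slot ... is active for PID ... */
            failed++;
            continue;
        }

        slots[i].dropped = true;
    }

    return failed;
}
```

If the drop runs as part of the promotion logic, before any backend can acquire a not-ready slot, every slot has active_pid == 0 and the function returns 0. If it runs later, a slot acquired in between stays behind and is counted as a failure.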

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


