Hi,
On 11/9/23 11:54 AM, shveta malik wrote:
>
> PFA v32 patches which has below changes:
Thanks!
> 7) Added warning for cases where a user-slot with the same name is
> already present which slot-sync worker is trying to create. Sync for
> such slots is skipped.
I'm seeing an assertion failure and a segfault in this case, due to the
ReplicationSlotRelease() call in synchronize_one_slot().
Adding this extra check prior to it:
-    ReplicationSlotRelease();
+    if (!(found && s->data.sync_state == SYNCSLOT_STATE_NONE))
+        ReplicationSlotRelease();
makes them disappear.
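To illustrate why the guard helps, here is a small self-contained toy model
(not the actual PostgreSQL code). It assumes that the failure comes from
releasing a slot that was never acquired on the "skip" path; that is my reading
of the symptom, not something I verified in detail. All Toy* types and toy_*
functions are hypothetical stand-ins, and the "guarded" flag only exists to
compare the behaviour with and without the extra check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins: none of these are the real PostgreSQL definitions. */
typedef enum
{
    TOY_SYNCSLOT_STATE_NONE,        /* models a user-created slot */
    TOY_SYNCSLOT_STATE_INITIATED,
    TOY_SYNCSLOT_STATE_READY
} ToySyncState;

typedef struct
{
    const char  *name;
    ToySyncState sync_state;
} ToySlot;

/* Models the backend's "currently acquired slot" pointer. */
static ToySlot *toy_my_slot = NULL;

static void
toy_acquire(ToySlot *s)
{
    toy_my_slot = s;
}

/*
 * Returns false where (by assumption) the real release function would
 * hit an assertion or dereference a NULL pointer: nothing is acquired.
 */
static bool
toy_release(void)
{
    if (toy_my_slot == NULL)
        return false;
    toy_my_slot = NULL;
    return true;
}

/*
 * Models the tail of the sync function. Returns true if cleanup was safe.
 * "found" means a slot with that name already exists locally.
 */
static bool
toy_sync_one_slot(ToySlot *s, bool found, bool guarded)
{
    bool user_slot = found && s->sync_state == TOY_SYNCSLOT_STATE_NONE;

    /* A same-named user slot is skipped (warning only), so never acquired. */
    if (!user_slot)
        toy_acquire(s);

    /* With the extra check, the release is skipped on the skip path. */
    if (guarded && user_slot)
        return true;

    return toy_release();
}
```

In this model, calling toy_sync_one_slot() for a user slot with guarded=false
reports the unsafe release, while guarded=true returns cleanly; slots that were
actually acquired are released in both cases.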
>
> Open Question:
> 1) Currently I have put drop slot logic for slots with 'sync_state=i'
> in slot-sync worker. Do we need to put it somewhere in promotion-logic
> as well?
Yeah, I think so, because there is a time window in which one could "use" the slot
after the promotion and before it is removed, producing things like:
"
2023-11-09 15:16:50.294 UTC [2580462] LOG: dropped replication slot "logical_slot2" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.295 UTC [2580462] LOG: dropped replication slot "logical_slot3" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] LOG: dropped replication slot "logical_slot4" of dbid 5 as it was not sync-ready
2023-11-09 15:16:50.297 UTC [2580462] ERROR: replication slot "logical_slot5" is active for PID 2594628
"
After the promotion, one was able to use logical_slot5, and now we cannot drop it.
> Perhaps in WaitForWALToBecomeAvailable() where we call
> XLogShutdownWalRcv after checking 'CheckForStandbyTrigger'. Thoughts?
>
You mean here?
    /*
     * Check to see if promotion is requested. Note that we do
     * this only after failure, so when you promote, we still
     * finish replaying as much as we can from archive and
     * pg_wal before failover.
     */
    if (StandbyMode && CheckForStandbyTrigger())
    {
        XLogShutdownWalRcv();
        return XLREAD_FAIL;
    }
If so, that sounds like a good place to me.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com