A proposal to force-drop replication slots to make disabling async/sync standbys or logical replication faster in production environments - Mailing list pgsql-hackers
From: Bharath Rupireddy
Subject: A proposal to force-drop replication slots to make disabling async/sync standbys or logical replication faster in production environments
Currently, postgres doesn't allow dropping a replication slot that is active [1]. This can make certain operations in production environments slow, or leave them stuck. The operations in question are disabling async/sync standbys and disabling logical replication, both of which require the postgres instance running on the standby or the subscriber to go down first. If stopping the postgres server takes time, the VM or container has to be killed forcefully, which can itself take a considerable amount of time because there are many layers in between.
How about we provide a function to force-drop a replication slot? Everything else, such as stopping postgres and gracefully unprovisioning the VM, can then be taken care of in the background. The force-drop function would also have to ensure that the walsender that is active for the replication slot is terminated gracefully, without letting the postmaster restart the other backends (right now, if a walsender exits or is terminated, the postmaster restarts all other backends too). The main advantage of the force-drop function is that the disable operations become quicker, with no downtime or crash on the primary/source server.
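For comparison, here is a rough sketch of how a client could approximate a force-drop today using only existing SQL-level pieces: the pg_replication_slots view, pg_terminate_backend() and pg_drop_replication_slot(). This is not the proposed function. The helper name force_drop_slot and its timeout/poll parameters are made up for illustration, and cur is assumed to be any DB-API 2.0 cursor that uses %s placeholders (psycopg2, for example) on an autocommit connection to the primary.

```python
import time


def force_drop_slot(cur, slot_name, timeout_s=10.0, poll_s=0.5):
    """Client-side approximation of a force-drop (hypothetical helper).

    Returns True if the slot was dropped, False if it did not exist.
    Raises TimeoutError if the slot is still active after timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # active_pid is NULL when no walsender is using the slot.
        cur.execute(
            "SELECT active_pid FROM pg_replication_slots WHERE slot_name = %s",
            (slot_name,),
        )
        row = cur.fetchone()
        if row is None:
            return False  # no such slot
        pid = row[0]
        if pid is None:
            # Slot is inactive, so the regular drop is allowed. This can
            # still fail if the standby/subscriber reconnects in between.
            cur.execute("SELECT pg_drop_replication_slot(%s)", (slot_name,))
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "slot %r is still active for PID %s" % (slot_name, pid)
            )
        # Ask the process holding the slot to exit, then poll again.
        cur.execute("SELECT pg_terminate_backend(%s)", (pid,))
        time.sleep(poll_s)
```

The check and the drop are separate statements, so a standby or subscriber that reconnects quickly can re-acquire the slot in between and the drop fails again with the error in [1]. A server-side force-drop could close that window, which is part of the motivation for the proposal.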
Thoughts?
[1] ERROR: replication slot "foo" is active for PID 2598155