On 2020-Apr-06, Alvaro Herrera wrote:
> I think there's a race condition in this: if we kill a walsender and it
> restarts immediately before we (checkpoint) can acquire the slot, we
> will wait for it to terminate on its own. Fixing this requires changing
> the ReplicationSlotAcquire API so that it can be told neither to wait
> nor to raise an error (so we can use an infinite loop: "acquire, if busy
> send signal").
I think this should do it, but I haven't tested it carefully, and the
way it uses the condition variable is not entirely kosher.
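
For illustration, the "acquire, if busy send signal" loop quoted above
could look roughly like the following. This is a standalone mock-up under
stated assumptions, not the actual change: MockSlot,
mock_slot_try_acquire and mock_signal_holder are made-up names that only
simulate a slot and a walsender that restarts, and a real implementation
would presumably sleep on the slot's condition variable between attempts
instead of retrying immediately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a replication slot; not PostgreSQL's struct. */
typedef struct MockSlot
{
    int active_pid;     /* 0 when nobody holds the slot */
    int restarts_left;  /* simulation only: times a walsender re-grabs the slot */
    int next_pid;       /* simulation only: PID given to the next walsender */
} MockSlot;

/*
 * Hypothetical non-blocking acquire: returns 0 on success, otherwise the
 * PID of the current holder.  It never waits and never raises an error.
 */
static int
mock_slot_try_acquire(MockSlot *slot, int my_pid)
{
    assert(slot != NULL);
    if (slot->active_pid != 0)
        return slot->active_pid;
    slot->active_pid = my_pid;
    return 0;
}

/*
 * Simulated signal delivery: the holder exits, but a restarted walsender
 * may grab the slot again before the caller gets to retry (the race).
 */
static void
mock_signal_holder(MockSlot *slot, int pid)
{
    assert(slot != NULL);
    if (slot->active_pid == pid)
        slot->active_pid = 0;
    if (slot->restarts_left > 0)
    {
        slot->restarts_left--;
        slot->active_pid = slot->next_pid++;
    }
}

/*
 * The loop itself: try to acquire; if busy, signal whoever holds the slot
 * and try again.  Returns the number of signals that had to be sent.
 */
static int
acquire_with_signal_loop(MockSlot *slot, int my_pid)
{
    int signals = 0;

    for (;;)
    {
        int holder = mock_slot_try_acquire(slot, my_pid);

        if (holder == 0)
            break;              /* slot is ours now */

        mock_signal_holder(slot, holder);
        signals++;
        /* real code would wait here before retrying */
    }
    return signals;
}
```

The point of the non-waiting, non-erroring acquire is that the caller
keeps control: each time the slot turns out to be busy, it learns which
PID holds it right now and can signal that process, even if it is a
freshly restarted walsender rather than the one signalled earlier.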
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services