Hi,
This is likely related to the issue I've reported[1]: A logical
walsender may be stuck at 100% CPU during shutdown, trying to read an
incomplete FPI_FOR_HINT record and blocking the shutdown sequence.
Stopping the logical replication target made the impacted walsender
exit, which unblocked the shutdown.
There are similar reports of failover being stuck in projects like Patroni[2].
I've provided a way to reproduce the issue in the linked thread, along
with a tentative patch.
Regards,
Anthonin Bonnefoy
[1]: https://www.postgresql.org/message-id/flat/CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ%40mail.gmail.com
[2]: https://github.com/patroni/patroni/issues/3522