Thread: BUG #18862: using pg_autoctl node_2 cannot bring back from maintenance mode

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      18862
Logged by:          mindaugas kancevicius
Email address:      m.kancevicius@gmail.com
PostgreSQL version: 16.7
Operating system:   RHEL 8.10
Description:

Hello,
I am using pg_auto_failover with PostgreSQL 16.7 on RHEL 8.10, with three nodes:
monitor node
node_1
node_2
Before applying configuration changes on node_1, I enable maintenance on node_2:
pg_autoctl enable maintenance
When the configuration on node_1 is complete, I disable maintenance:
pg_autoctl disable maintenance
This worked fine 5-7 times; after each change node_2 caught up with
node_1's TLI: LSN.
The last time I enabled and then disabled maintenance, node_2 somehow froze, and
I received an error when trying to bring it back from maintenance mode:
  Name |  Node |   Host:Port |         TLI: LSN |   Connection |      Reported State |      Assigned State
-------+-------+-------------+------------------+--------------+---------------------+--------------------
node_1 |     1 | node_1:5432 |   4: 97/6C000110 |   read-write |              single |              single
node_2 |     2 | node_2:5432 |   4: 97/6BD90750 |       none ! |         maintenance |          catchingup

The last known TLI: LSN was 97/6BD90338, when both nodes were still in sync:
  Name |  Node |   Host:Port |         TLI: LSN |   Connection |      Reported State |      Assigned State
-------+-------+-------------+------------------+--------------+---------------------+--------------------
node_1 |     1 | node_1:5432 |   4: 97/6BD90338 |   read-write |             primary |             primary
node_2 |     2 | node_2:5432 |   4: 97/6BD90338 |    read-only |           secondary |           secondary

This has happened to me twice in 2 days, each time after several rounds of
enabling and disabling maintenance mode.

Is there any known issue that would explain why this happens randomly on node_2?
The only way I was able to fix it was to drop node_2 from the monitor
node and redo the setup on node_2.
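
For anyone trying to spot this condition quickly, here is a minimal shell sketch that reads a state table in the layout shown in this report and prints every node whose reported state differs from the state assigned by the monitor. The function name stuck_nodes is hypothetical, and the seven-column layout is assumed from the tables above; other pg_auto_failover versions may print the table differently.

```shell
#!/bin/sh
# Hypothetical helper: reads a state table (layout as shown in this report)
# on stdin and prints nodes whose Reported State differs from Assigned State.
stuck_nodes() {
    awk -F'|' '
        # Keep only 7-column data rows; this drops the dashed separator line.
        NF != 7 { next }
        # Data rows carry a numeric node id in column 2; the header does not.
        $2 !~ /^[ \t]*[0-9]+[ \t]*$/ { next }
        {
            # Trim the padding around every cell before comparing.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($6 != $7)
                printf "%s: reported=%s assigned=%s\n", $1, $6, $7
        }
    '
}
```

With the first table above piped into stuck_nodes, only the node_2 row is reported, since node_1 has matching states. On a live cluster the input would presumably come from `pg_autoctl show state`, but that pairing is an assumption and has not been checked against every release.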


Hi,

On Mon, 2025-03-24 at 15:28 +0000, PG Bug reporting form wrote:
> I am using pg_auto_failover postgresql version 16.7, RHEL 8.10:
> monitor node

<snip>

Please report this to the pg_auto_failover project:

https://github.com/hapostgres/pg_auto_failover/issues/new

Regards,
--
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz, @DevrimGunduzTR
