On 11/24/24 13:00, Ron Johnson wrote: > On Sun, Nov 24, 2024 at 2:55 PM Christophe Pettus <xof@thebuild.com > <mailto:xof@thebuild.com>> wrote: > > > On Nov 24, 2024, at 09:15, Ron Johnson <ronljohnsonjr@gmail.com > <mailto:ronljohnsonjr@gmail.com>> wrote: > > > > Doesn't the existence of a replication slot force PG to retain > WAL files when replication is broken? > > It does. I don't recall if the OP said that they were using a > persistent replication slot or not; it's not as common with binary > replication as with logical replication. > > > Really? I wonder why people fight with configuring max_wal_size and > wal_keep_size, when replication slots do all the work for you.
Replication slots persist across crashes and know nothing about the state of their consumer(s). They will prevent removal of required resources even when there is no connection using them. This consumes storage because neither required WAL nor required rows from the system catalogs can be removed by VACUUM as long as they are required by a replication slot. In extreme cases this could cause the database to shut down to prevent transaction ID wraparound (see Section 24.1.5). So if a slot is no longer required it should be dropped. "
Nagios has built-in disk space monitoring, and if it doesn't also have built-in replication monitoring, you can write a plug-in. Or write your own bash script that periodically runs "SELECT * from pg_replication_slots;" and "SELECT * FROM pg_stat_replication;" on the primary and "SELECT * FROM pg_stat_wal_receiver;" on the secondary.
Whichever you do, some monitoring should always be in place.
" Caution
There is a chance that the old primary is up again during the promotion and if subscriptions are not disabled, the logical subscribers may continue to receive data from the old primary server even after promotion until the connection string is altered. This might result in data inconsistency issues, preventing the logical subscribers from being able to continue replication from the new primary server. "
Logical replication is off-topic for this problem, no?