Thread: Primary and standby setting cross-checks

From: Heikki Linnakangas

Currently, if you configure a hot standby server with a smaller 
max_connections setting than the primary, the server refuses to start up:

LOG:  entering standby mode
FATAL:  recovery aborted because of insufficient parameter settings
DETAIL:  max_connections = 10 is a lower setting than on the primary 
server, where its value was 100.
HINT:  You can restart the server after making the necessary 
configuration changes.
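
For reference, this FATAL comes from the startup-time check that compares
the standby's settings against the values the primary recorded in the
control file. A simplified sketch of that check (modeled on
RecoveryRequiresIntParameter() in xlog.c; details abbreviated):

/* Simplified sketch of the startup-time cross-check in xlog.c. */
static void
RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
{
    if (currValue < minValue)
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("recovery aborted because of insufficient parameter settings"),
                 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
                           param_name, currValue, minValue),
                 errhint("You can restart the server after making the necessary configuration changes.")));
}

/* Called with the local GUC and the value taken from pg_control, e.g.: */
RecoveryRequiresIntParameter("max_connections",
                             MaxConnections, ControlFile->MaxConnections);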

Or if you change the setting in the primary while the standby is 
running, replay pauses:

WARNING:  hot standby is not possible because of insufficient parameter 
settings
DETAIL:  max_connections = 100 is a lower setting than on the primary 
server, where its value was 200.
CONTEXT:  WAL redo at 2/E10000D8 for XLOG/PARAMETER_CHANGE: 
max_connections=200 max_worker_processes=8 max_wal_senders=10 
max_prepared_xacts=0 max_locks_per_xact=64 wal_level=logical 
wal_log_hints=off track_commit_timestamp=off
LOG:  recovery has paused
DETAIL:  If recovery is unpaused, the server will shut down.
HINT:  You can then restart the server after making the necessary 
configuration changes.
CONTEXT:  WAL redo at 2/E10000D8 for XLOG/PARAMETER_CHANGE: 
max_connections=200 max_worker_processes=8 max_wal_senders=10 
max_prepared_xacts=0 max_locks_per_xact=64 wal_level=logical 
wal_log_hints=off track_commit_timestamp=off
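
The replay-time variant is driven by the XLOG_PARAMETER_CHANGE record seen
in the CONTEXT lines above, which carries the primary's new settings.
Roughly, redo stores the new values in the control file and re-runs the
same check; in recent releases that pauses recovery rather than shutting
down immediately. The record payload looks like this (from pg_control.h):

/* WAL payload for XLOG_PARAMETER_CHANGE, as declared in pg_control.h. */
typedef struct xl_parameter_change
{
    int         MaxConnections;
    int         max_worker_processes;
    int         max_wal_senders;
    int         max_prepared_xacts;
    int         max_locks_per_xact;
    int         wal_level;
    bool        wal_log_hints;
    bool        track_commit_timestamp;
} xl_parameter_change;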

Both of these behaviors are rather unpleasant.

I thought I could get rid of this limitation with my CSN snapshot patch 
[1], because it eliminates the fixed-size known-assigned XIDs array, but 
there's a second reason for the cross-check: it is also used to ensure 
that the standby has enough space in the lock manager to hold any 
AccessExclusiveLocks taken by transactions in the primary.
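
To spell out the lock manager connection: the shared lock table is sized
from max_locks_per_transaction and the number of backend slots, and the
cross-check exists so that the standby's table is guaranteed to be able to
absorb every AccessExclusiveLock the primary could WAL-log. The sizing
formula (NLOCKENTS() in lock.c):

/* Shared lock table capacity, from src/backend/storage/lmgr/lock.c: */
#define NLOCKENTS() \
    mul_size(max_locks_per_xact, add_size(MaxBackends, max_prepared_xacts))

/*
 * MaxBackends is in turn derived from max_connections plus the worker and
 * WAL sender slots, which is why all of those GUCs participate in the
 * cross-check: shrinking any of them on the standby shrinks the lock
 * table too.
 */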

So firstly, I think that's a bad tradeoff. In the vast majority of cases, 
you would not run out of lock space anyway, if the server were simply 
allowed to start up. Secondly, the cross-check of settings doesn't fully 
prevent the problem: it ensures that the lock table is large enough to 
accommodate all the locks that could possibly be held in the primary, but 
it doesn't take into account any additional locks held by read-only 
queries in the hot standby. So if you have queries running in the standby 
that take a lot of locks, this can happen anyway:

2024-08-29 21:44:32.634 EEST [668327] FATAL:  out of shared memory
2024-08-29 21:44:32.634 EEST [668327] HINT:  You might need to increase 
"max_locks_per_transaction".
2024-08-29 21:44:32.634 EEST [668327] CONTEXT:  WAL redo at 2/FD40FCC8 
for Standby/LOCK: xid 996 db 5 rel 154045
2024-08-29 21:44:32.634 EEST [668327] WARNING:  you don't own a lock of 
type AccessExclusiveLock
2024-08-29 21:44:32.634 EEST [668327] LOG:  RecoveryLockHash contains 
entry for lock no longer recorded by lock manager: xid 996 database 5 
relation 154045
TRAP: failed Assert("false"), File: 
"../src/backend/storage/ipc/standby.c", Line: 1053, PID: 668327
postgres: startup recovering 
0000000100000002000000FD(ExceptionalCondition+0x6e)[0x556a4588396e]
postgres: startup recovering 
0000000100000002000000FD(+0x44156e)[0x556a4571356e]
postgres: startup recovering 
0000000100000002000000FD(StandbyReleaseAllLocks+0x78)[0x556a45712738]
postgres: startup recovering 
0000000100000002000000FD(ShutdownRecoveryTransactionEnvironment+0x15)[0x556a45712685]
postgres: startup recovering 
0000000100000002000000FD(shmem_exit+0x111)[0x556a457062e1]
postgres: startup recovering 
0000000100000002000000FD(+0x434132)[0x556a45706132]
postgres: startup recovering 
0000000100000002000000FD(proc_exit+0x59)[0x556a45706079]
postgres: startup recovering 
0000000100000002000000FD(errfinish+0x278)[0x556a45884708]
postgres: startup recovering 
0000000100000002000000FD(LockAcquireExtended+0xa46)[0x556a45719386]
postgres: startup recovering 
0000000100000002000000FD(StandbyAcquireAccessExclusiveLock+0x11d)[0x556a4571330d]
postgres: startup recovering 
0000000100000002000000FD(standby_redo+0x70)[0x556a45713690]
postgres: startup recovering 
0000000100000002000000FD(PerformWalRecovery+0x7b3)[0x556a4547d313]
postgres: startup recovering 
0000000100000002000000FD(StartupXLOG+0xac3)[0x556a4546dae3]
postgres: startup recovering 
0000000100000002000000FD(StartupProcessMain+0xe8)[0x556a45693558]
postgres: startup recovering 
0000000100000002000000FD(+0x3ba95d)[0x556a4568c95d]
postgres: startup recovering 
0000000100000002000000FD(+0x3bce41)[0x556a4568ee41]
postgres: startup recovering 
0000000100000002000000FD(PostmasterMain+0x116e)[0x556a4568eaae]
postgres: startup recovering 
0000000100000002000000FD(+0x2f960e)[0x556a455cb60e]
/lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f10ef042c8a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f10ef042d45]
postgres: startup recovering 
0000000100000002000000FD(_start+0x21)[0x556a453af011]
2024-08-29 21:44:32.641 EEST [668324] LOG:  startup process (PID 668327) 
was terminated by signal 6: Aborted
2024-08-29 21:44:32.641 EEST [668324] LOG:  terminating any other active 
server processes
2024-08-29 21:44:32.654 EEST [668324] LOG:  shutting down due to startup 
process failure
2024-08-29 21:44:32.729 EEST [668324] LOG:  database system is shut down

Granted, if you restart the server, recovery will probably succeed, 
because the restart kills all the other queries that were holding locks. 
But yuck. With assertions disabled it looks a little less scary, but it's 
still not nice.
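
For context, the failing path in the backtrace is the startup process
acquiring the lock on behalf of the primary's transaction. A rough sketch
of the relevant bit (simplified from StandbyAcquireAccessExclusiveLock()
in standby.c; the exact arguments may differ across versions):

/* Simplified from StandbyAcquireAccessExclusiveLock() in standby.c. */
void
StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
{
    LOCKTAG     locktag;

    /* ... the lock is also recorded in RecoveryLockHash, omitted here ... */

    SET_LOCKTAG_RELATION(locktag, dbOid, relOid);

    /*
     * reportMemoryError = true: if the shared lock table is full, this
     * raises the "out of shared memory" ERROR, which the startup process
     * promotes to FATAL, and recovery dies as shown above.
     */
    (void) LockAcquireExtended(&locktag, AccessExclusiveLock,
                               true, false, true, NULL);
}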

So how to improve this? I see a few options:

a) Downgrade the error at startup to a warning, and allow starting the 
standby with smaller settings than the primary. At least with a smaller 
max_locks_per_transaction. The other settings also affect the size of the 
known-assigned XIDs array, but if the CSN snapshot patch gets committed, 
that will get fixed. In most cases there is enough lock memory anyway, 
and it will be fine. Just fix the assertion failure so that the error 
message is a little nicer. (A sketch of this follows below.)

b) If you run out of lock space, kill running queries and prevent new 
ones from starting. Track the locks in the startup process's private 
memory until there is enough space in the lock manager, and then re-open 
for queries. In essence, go from hot standby mode to warm standby until 
it's possible to go back to hot standby mode again. (Also sketched below.)
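
For (a), the core of the change could be as small as lowering the error
level in the cross-check; a sketch only, not a full patch, and the message
wording here is made up:

/* Sketch for option (a): warn instead of refusing to start. */
static void
RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
{
    if (currValue < minValue)
        ereport(WARNING,        /* was: FATAL */
                (errmsg("\"%s\" is lower on the standby than on the primary (%d < %d)",
                        param_name, currValue, minValue),
                 errhint("Recovery may fail if the standby runs out of lock table space.")));
}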
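
And a very rough sketch of what (b) might look like; DisableHotStandby(),
makePendingLock() and the retry step are invented names, just to
illustrate the shape of the idea:

/*
 * Sketch for option (b): on lock table exhaustion, fall back from hot
 * standby to warm standby instead of dying.
 */
static List *pendingRecoveryLocks = NIL;    /* startup-private memory */

void
StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
{
    LOCKTAG     locktag;

    SET_LOCKTAG_RELATION(locktag, dbOid, relOid);

    /*
     * reportMemoryError = false: a full lock table now returns
     * LOCKACQUIRE_NOT_AVAIL instead of erroring out.
     */
    if (LockAcquireExtended(&locktag, AccessExclusiveLock,
                            true, false, false, NULL) == LOCKACQUIRE_NOT_AVAIL)
    {
        /* Kill queries and stop admitting new ones: hot -> warm standby. */
        DisableHotStandby();

        /*
         * Remember the lock privately; a later retry step would move these
         * into the lock manager as space frees up, and re-enable hot
         * standby once the list is empty.
         */
        pendingRecoveryLocks = lappend(pendingRecoveryLocks,
                                       makePendingLock(xid, &locktag));
    }
}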

Thoughts, better ideas?

[1] https://commitfest.postgresql.org/49/4912/

-- 
Heikki Linnakangas
Neon (https://neon.tech)