hot standby failed in read pg_commit_ts - Mailing list pgsql-hackers

From James Pang
Subject hot standby failed in read pg_commit_ts
Msg-id CAHgTRfe33jcxh9i0Wjea7RJSx6FjL2JVWKe0RbX8_=FXu=mATw@mail.gmail.com
List pgsql-hackers
Experts,

We run PostgreSQL 14.18. We changed the following parameters on the primary and then restarted it: wal_level=logical (previously replica), track_commit_timestamp=on, max_worker_processes=40, max_wal_senders=30.

After the primary restarted successfully, recovery on the hot standby aborted, and the instance restarted recovery from the last restartpoint. That restartpoint points to an old transaction from about 10 minutes before track_commit_timestamp and wal_level=logical were enabled. The startup process, now running with the new setting track_commit_timestamp=on, tried to read pg_commit_ts data for this old transaction and failed with "Could not read from file "pg_commit_ts/xxxx" at offset 180xx4: read too few bytes". This old transaction finished before track_commit_timestamp was enabled on the primary, so there is no corresponding commit_timestamp data or xlog for it.

Is this failure expected, or is it a bug like this older one: https://www.postgresql.org/message-id/4744cb5b-8962-8f10-f729-7cfeba807fcb%40amazon.com
And is there a fix for this case?

The standby was stuck in a restart/failure loop; we have already recreated the hot standby to fix the issue.

LOG:  started streaming WAL from primary at 1F4F/xxxxxxxx on timeline 1
FATAL:  recovery aborted because of insufficient parameter settings
DETAIL:  max_worker_processes = 20 is a lower setting than on the primary server, where its value was 40.
HINT:  You can restart the server after making the necessary configuration changes.
CONTEXT:  WAL redo at 1F4F/800000A0 for XLOG/PARAMETER_CHANGE: max_connections=xxx max_worker_processes=40 max_wal_senders=30 max_prepared_xacts=0 max_locks_per_xact=64 wal_level=logical wal_log_hints=off track_commit_timestamp=on
LOG:  startup process (PID xxxx) exited with exit code 1
LOG:  terminating any other active server processes
LOG:  shutting down due to startup process failure
LOG:  database system is shut down
LOG:  starting PostgreSQL 14.18 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 xxxx, 64-bit

<<< We changed max_worker_processes and max_wal_senders to match the primary; the standby then started recovery from the last restartpoint.
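
The first failure in the log comes from the hot standby rule that certain parameters must be set at least as high on the standby as on the primary. A minimal sketch of a check that could be run before restarting the primary, assuming the settings of both servers have already been collected into dictionaries. The function name and the sample standby value for max_wal_senders are hypothetical; the parameter list follows the PostgreSQL hot standby documentation.

```python
# Parameters whose value on the standby must be >= the value on the primary
# (list taken from the PostgreSQL hot standby documentation).
HOT_STANDBY_MIN_PARAMS = (
    "max_connections",
    "max_prepared_transactions",
    "max_locks_per_transaction",
    "max_wal_senders",
    "max_worker_processes",
)


def find_insufficient_settings(primary, standby):
    """Return {name: (standby_value, primary_value)} for every too-low setting."""
    problems = {}
    for name in HOT_STANDBY_MIN_PARAMS:
        if name not in primary or name not in standby:
            continue  # value not supplied for one side; nothing to compare
        standby_value = int(standby[name])
        primary_value = int(primary[name])
        if standby_value < primary_value:
            problems[name] = (standby_value, primary_value)
    return problems


# max_worker_processes 20 vs 40 comes from the log above;
# the standby's max_wal_senders = 10 is a hypothetical value.
primary = {"max_worker_processes": 40, "max_wal_senders": 30}
standby = {"max_worker_processes": 20, "max_wal_senders": 10}

for name, (low, required) in find_insufficient_settings(primary, standby).items():
    print(f"{name} = {low} on standby is lower than {required} on primary")
```

Raising the reported parameters on the standby first, and only then restarting the primary, should avoid the "insufficient parameter settings" abort. It does not by itself address the later pg_commit_ts read failure.
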

...
LOG:  database system was interrupted while in recovery at log time 2026-02-2x 07:4x:xx UTC
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.

...
LOG:  entering standby mode
...
LOG:  redo starts at 1F4F/5Axxxx  (this is the last restartpoint; it belongs to an old transaction from about 10 minutes before track_commit_timestamp=on and wal_level=logical were enabled)
...
FATAL:  could not access status of transaction 3576xxxx                                      
DETAIL:  Could not read from file "pg_commit_ts/xxxx" at offset 180xx4: read too few bytes.    
CONTEXT:  WAL redo at 1F4F/5Cxxxx for Transaction/COMMIT: 2026-02-2x 07:4x:21.7xxx+00
LOG:  startup process (PID xxx) exited with exit code 1
LOG:  terminating any other active server processes
LOG:  shutting down due to startup process failure
LOG:  database system is shut down
LOG:  starting PostgreSQL 14.18 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 xxxx, 64-bit
...
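
For reference, the pg_commit_ts segment file and offset that a transaction ID maps to can be sketched with the arithmetic below. It assumes the default 8 kB block size, 10 bytes per commit timestamp entry, and 32 pages per SLRU segment; these values come from a reading of commit_ts.c and slru.h in PostgreSQL 14 and are worth verifying against the source. The function name and the sample xid are hypothetical and are not the elided values from the log.

```python
# Assumed constants (default build of PostgreSQL 14; verify against the source).
BLCKSZ = 8192                      # block size in bytes
COMMIT_TS_ENTRY_SIZE = 10          # 8-byte timestamp + 2-byte replication origin id
XACTS_PER_PAGE = BLCKSZ // COMMIT_TS_ENTRY_SIZE
SLRU_PAGES_PER_SEGMENT = 32        # pages stored in one segment file


def locate_commit_ts_entry(xid):
    """Map an xid to (segment file, page offset in file, entry offset in page)."""
    page = xid // XACTS_PER_PAGE
    segment = page // SLRU_PAGES_PER_SEGMENT
    page_offset = (page % SLRU_PAGES_PER_SEGMENT) * BLCKSZ
    entry_offset = (xid % XACTS_PER_PAGE) * COMMIT_TS_ENTRY_SIZE
    return f"pg_commit_ts/{segment:04X}", page_offset, entry_offset


# Hypothetical xid, chosen only to illustrate the arithmetic.
path, page_offset, entry_offset = locate_commit_ts_entry(1_000_000)
print(path, page_offset, entry_offset)
```

Under these assumptions the offset in the error message would be the start of a whole page inside the segment file, and "read too few bytes" would mean the file ends before that page does, which fits a segment that was never extended while track_commit_timestamp was off.
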

Thanks,

James
