Hi Hackers,
When a subscription has retain_dead_tuples enabled with maxretention set
to zero (unlimited retention), adjust_xid_advance_interval() caps
xid_advance_interval to Min(interval, maxretention). Since maxretention
is zero, this always collapses the interval to zero milliseconds.
A zero interval makes TimestampDifferenceExceeds(last_time, now, 0)
always true in get_candidate_xid(), so the apply worker calls
GetOldestActiveTransactionId() on every single WAL message. Under
moderate write load this results in a huge number of ProcArrayLock
acquisitions.
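
To illustrate why a zero interval defeats the throttle, here is a small
standalone model of the elapsed-time check. It is only a sketch, not the
PostgreSQL source: difference_exceeds() is a hypothetical stand-in for
TimestampDifferenceExceeds(), assuming timestamps in microseconds and a
">=" comparison against the threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for TimestampDifferenceExceeds().
 * Assumption: timestamps are in microseconds and the check passes when
 * the elapsed time is at least msec milliseconds.
 */
bool
difference_exceeds(int64_t start_us, int64_t stop_us, int msec)
{
    int64_t diff = stop_us - start_us;

    /* With msec == 0 this reduces to diff >= 0, true whenever stop >= start. */
    return diff >= (int64_t) msec * 1000;
}
```

With a threshold of 0 the check passes even when no time has elapsed, so
nothing rate-limits the caller; with 100 it passes only after 100 ms.
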
Fix by adding a maxretention > 0 guard to the cap. When maxretention is
zero, the cap is skipped and the exponential back-off in
adjust_xid_advance_interval() works as intended, growing the interval
from 100 ms toward the 180 s ceiling.
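
The effect of the guard can be sketched with a simplified model. This is
not the patch itself: the function names, the MIN/MAX macros and the
doubling factor are assumptions for illustration, and the real
adjust_xid_advance_interval() handles more cases than shown here.

```c
#include <stdbool.h>

#define MIN_INTERVAL_MS 100     /* assumed floor, matching the 100 ms above */
#define MAX_INTERVAL_MS 180000  /* assumed ceiling, matching the 180 s above */

static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Back-off step shared by both variants; doubling is an assumption. */
static int
backoff_step(int interval_ms, bool new_xid_found)
{
    if (interval_ms > 0 && !new_xid_found)
        return min_int(interval_ms * 2, MAX_INTERVAL_MS);
    return MIN_INTERVAL_MS;
}

/* Model of the old behaviour: the cap is applied unconditionally. */
int
next_interval_unguarded(int interval_ms, bool new_xid_found, int maxretention_ms)
{
    interval_ms = backoff_step(interval_ms, new_xid_found);
    return min_int(interval_ms, maxretention_ms);
}

/* Model of the fix: cap only when a finite retention limit is set. */
int
next_interval_guarded(int interval_ms, bool new_xid_found, int maxretention_ms)
{
    interval_ms = backoff_step(interval_ms, new_xid_found);
    if (maxretention_ms > 0)
        interval_ms = min_int(interval_ms, maxretention_ms);
    return interval_ms;
}
```

In this model, with maxretention_ms = 0 the unguarded variant always
returns 0, while the guarded variant doubles 100 ms on each idle round
until it reaches the ceiling. A non-zero maxretention_ms still caps the
interval in both variants.
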
Measured with perf uprobe counting GetOldestActiveTransactionId calls
at ~39K TPS (pgbench, 5 clients):
Before fix: 25,104 calls / 5 s (~5,021/s)
After fix: 31 calls / 5 s (~6/s)
Thanks,
Satya