Backpatch critical performance fixes to pgarch.c
This backpatches commits beb4e9ba1652 and 1fb17b190341 (originally
appearing in REL_15_STABLE) to REL_14_STABLE. Performance of the WAL
archiver can become pretty critical at times, and reports exist of
users getting in serious trouble (hours of downtime, loss of replicas)
for lack of this optimization.
We'd like to backpatch these to REL_13_STABLE too, but because of the
very invasive changes made by commit d75288fb27b8 in the 14 timeframe,
we deem it too risky :-(
Original commit messages appear below.
Discussion: https://postgr.es/m/202411131605.m66syq5i5ucl@alvherre.pgsql
commit beb4e9ba1652a04f66ff20261444d06f678c0b2d
Author: Robert Haas <rhaas@postgresql.org>
AuthorDate: Thu Nov 11 15:02:53 2021 -0500
Improve performance of pgarch_readyXlog() with many status files.
Presently, the archive_status directory is scanned for each file to
archive. When there are many status files, say because archive_command
has been failing for a long time, these directory scans can get very
slow. With this change, the archiver remembers several files to archive
during each directory scan, speeding things up.
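As a rough illustration of the batching idea described above (not the
actual pgarch.c code), the following minimal sketch keeps the
lowest-sorting ".ready" entries seen in one scan so that later requests
can be served without rescanning. BATCH_SIZE, NAME_BUF and
collect_batch() are hypothetical names introduced for this example, and
the sketch deliberately ignores the priority given to timeline history
files.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_SIZE 4    /* hypothetical: files remembered per scan */
#define NAME_BUF   64   /* hypothetical: buffer size, terminator included */

/* qsort comparator over an array of C-string pointers. */
static int
cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/*
 * Given the entry names seen during ONE directory scan, remember up to
 * BATCH_SIZE of the lowest-sorting names that end in ".ready", with the
 * suffix stripped.  Returns the number of names stored in batch[].
 */
static int
collect_batch(const char **names, int nnames,
              char batch[BATCH_SIZE][NAME_BUF])
{
    const size_t suffix_len = strlen(".ready");
    const char **ready;
    int         nready = 0;
    int         nbatch = 0;

    assert(nnames >= 0);
    ready = malloc(sizeof(char *) * (nnames > 0 ? nnames : 1));
    if (ready == NULL)
        return 0;

    /* Keep only ".ready" entries whose base name fits in a slot. */
    for (int i = 0; i < nnames; i++)
    {
        size_t      len = strlen(names[i]);

        if (len > suffix_len && len - suffix_len < NAME_BUF &&
            strcmp(names[i] + len - suffix_len, ".ready") == 0)
            ready[nready++] = names[i];
    }

    /* Oldest (lowest-sorting) files first. */
    qsort(ready, nready, sizeof(char *), cmp_names);

    for (int i = 0; i < nready && i < BATCH_SIZE; i++)
    {
        size_t      base = strlen(ready[i]) - suffix_len;

        memcpy(batch[i], ready[i], base);
        batch[i][base] = '\0';
        nbatch++;
    }

    free(ready);
    return nbatch;
}
```

A caller would hand out the entries of batch[] one at a time and only
scan the directory again once the batch is exhausted.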
To ensure timeline history files are archived as quickly as possible,
XLogArchiveNotify() forces the archiver to do a new directory scan as
soon as the .ready file for one is created.
Nathan Bossart, per a long discussion involving many people. It is
not clear to me exactly who out of all those people reviewed this
particular patch.
Discussion: http://postgr.es/m/CA+TgmobhAbs2yabTuTRkJTq_kkC80-+jw=pfpypdOJ7+gAbQbw@mail.gmail.com
Discussion: http://postgr.es/m/620F3CE1-0255-4D66-9D87-0EADE866985A@amazon.com
commit 1fb17b1903414676bd371068739549cd2966fe87
Author: Tom Lane <tgl@sss.pgh.pa.us>
AuthorDate: Wed Dec 29 17:02:50 2021 -0500
Fix issues in pgarch's new directory-scanning logic.
The arch_filenames[] array elements were one byte too small, so that
a maximum-length filename would get corrupted if another entry
were made after it. (Noted by Thomas Munro, fix by Nathan Bossart.)
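The off-by-one described above can be pictured with a small sketch.
MAX_NAME_CHARS, NUM_SLOTS, slots[] and store_name() are hypothetical
stand-ins chosen for this example, not the real pgarch.c definitions.
Each slot needs room for the longest name plus its terminating NUL;
if the slots were sized MAX_NAME_CHARS only, the terminator of a
maximum-length name would fall on the first byte of the next slot and be
overwritten by the next entry stored there.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME_CHARS 8    /* hypothetical: longest name, excluding '\0' */
#define NUM_SLOTS      2    /* hypothetical: number of remembered names */

/* Each slot holds MAX_NAME_CHARS characters plus the terminating NUL. */
static char slots[NUM_SLOTS][MAX_NAME_CHARS + 1];

/* Copy a name, including its terminator, into the given slot. */
static void
store_name(int slot, const char *name)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    assert(strlen(name) <= MAX_NAME_CHARS);
    memcpy(slots[slot], name, strlen(name) + 1);
}
```

With the "+ 1" in place, storing a maximum-length name in slot 0 and
then another name in slot 1 leaves slot 0 intact.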
Move these arrays into a palloc'd struct, so that we aren't wasting
a few kilobytes of static data in each non-archiver process.
Add a binaryheap_reset() call to make it plain that we start the
directory scan with an empty heap. I don't think there's any live
bug of that sort, but it seems fragile, and this is very cheap
insurance.
Cleanup for commit beb4e9ba1, so no back-patch needed.
Discussion: https://postgr.es/m/CA+hUKGLHAjHuKuwtzsW7uMJF4BVPcQRL-UMZG_HM-g0y7yLkUg@mail.gmail.com
Branch
------
REL_14_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/4abf615cc8a8ca80430b5a0bfa18be6efcea96b2
Modified Files
--------------
src/backend/access/transam/xlogarchive.c |  14 +++
src/backend/postmaster/pgarch.c          | 208 +++++++++++++++++++++++++++----
src/include/postmaster/pgarch.h          |   1 +
3 files changed, 197 insertions(+), 26 deletions(-)