On Tue, Dec 17, 2024 at 04:50:16PM -0800, Robert Pang wrote:
> We recently observed a few cases where Postgres running on Linux
> encountered an issue with WAL segment files. Specifically, two WAL
> segments were linked to the same physical file after Postgres ran out
> of memory and the OOM killer terminated one of its processes. This
> resulted in the WAL segments overwriting each other and Postgres
> failing a later recovery.

Yikes!

> We found this fix [1] that has been applied to Postgres 16, but the
> cases we observed were running Postgres 15. Given that older major
> versions will be supported for a good number of years, and the
> potential for irrecoverability exists (even if rare), we would like to
> discuss the possibility of back-patching this fix.

IMHO this is a good time to reevaluate. It looks like we originally held
off on back-patching out of an abundance of caution, but the fix has now
had time to bake, so I think it's worth seriously considering, especially
since we have a report from the field.

--
nathan