pgsql: Move temporary file cleanup to before_shmem_exit(). - Mailing list pgsql-committers

From Andres Freund
Subject pgsql: Move temporary file cleanup to before_shmem_exit().
Date
Msg-id E1mCYSw-0005Q4-VR@gemulon.postgresql.org
Whole thread Raw
Responses Re: pgsql: Move temporary file cleanup to before_shmem_exit().  (Michael Paquier <michael@paquier.xyz>)
List pgsql-committers
Move temporary file cleanup to before_shmem_exit().

As reported by a few OSX buildfarm animals there exist at least one path where
temporary files exist during AtProcExit_Files() processing. As temporary file
cleanup causes pgstat reporting, the assertions added in ee3f8d3d3ae caused
failures.

This is not an OSX specific issue, we were just lucky that timing on OSX
reliably triggered the problem.  The known way to cause this is a FATAL error
during perform_base_backup() with a MANIFEST used - adding an elog(FATAL)
after InitializeBackupManifest() reliably reproduces the problem in isolation.

The problem is that the temporary file created in InitializeBackupManifest()
is not cleaned up via resource owner cleanup as WalSndResourceCleanup()
currently is only used for non-FATAL errors. That then allows to reach
AtProcExit_Files() with existing temporary files, causing the assertion
failure.

To fix this problem, move temporary file cleanup to a before_shmem_exit() hook
and add assertions ensuring that no temporary files are created before / after
temporary file management has been initialized / shut down. The cleanest way
to do so seems to be to split fd.c initialization into two, one for plain file
access and one for temporary file access.

Right now there's no need to perform further fd.c cleanup during process exit,
so I just renamed AtProcExit_Files() to BeforeShmemExit_Files(). Alternatively
we could perform another pass through the files to check that no temporary
files exist, but the added assertions seem to provide enough protection
against that.

It might turn out that the assertions added in ee3f8d3d3ae will cause too much
noise - in that case we'll have to downgrade them to a WARNING, at least
temporarily.

This commit is not necessarily the best approach to address this issue, but it
should resolve the buildfarm failures. We can revise later.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210807190131.2bm24acbebl4wl6i@alap3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/675c945394b36c2db0e8c8c9f6209c131ce3f0a8

Modified Files
--------------
src/backend/storage/file/fd.c     | 57 ++++++++++++++++++++++++++++++++++-----
src/backend/utils/init/postinit.c | 15 +++++++++--
src/include/storage/fd.h          |  1 +
3 files changed, 65 insertions(+), 8 deletions(-)


pgsql-committers by date:

Previous
From: Peter Eisentraut
Date:
Subject: pgsql: Remove T_MemoryContext
Next
From: Andres Freund
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o