pgsql: Clean up assorted messiness around AllocateDir() usage. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Clean up assorted messiness around AllocateDir() usage.
Date
Msg-id E1eLypg-000491-W1@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Clean up assorted messiness around AllocateDir() usage.

This patch fixes a couple of low-probability bugs that could lead to
reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE)
concerning directory-open or file-open failures.  It also fixes places
where we took shortcuts in reporting such errors, either by using elog
instead of ereport or by using ereport but forgetting to specify an
errcode.  And it eliminates a lot of just plain redundant error-handling
code.

In service of all this, export fd.c's formerly-static function
ReadDirExtended, so that external callers can make use of the coding
pattern

    dir = AllocateDir(path);
    while ((de = ReadDirExtended(dir, path, LOG)) != NULL)

if they'd like to treat directory-open failures as mere LOG conditions
rather than errors.  Also fix FreeDir to be a no-op if we reach it
with dir == NULL, as such a coding pattern would cause.

Then, remove code at many call sites that was throwing an error or log
message for AllocateDir failure, as ReadDir or ReadDirExtended can handle
that job just fine.  Aside from being a net code savings, this gets rid of
a lot of not-quite-up-to-snuff reports, as mentioned above.  (In some
places these changes result in replacing a custom error message such as
"could not open tablespace directory" with more generic wording "could not
open directory", but it was agreed that the custom wording buys little as
long as we report the directory name.)  In some other call sites where we
can't just remove code, change the error reports to be fully
project-style-compliant.

Also reorder code in restoreTwoPhaseData that was acquiring a lock
between AllocateDir and ReadDir; in the unlikely but surely not
impossible case that LWLockAcquire changes errno, AllocateDir failures
would be misreported.  There is no great value in opening the directory
before acquiring TwoPhaseStateLock, so just do it in the other order.

Also fix CheckXLogRemoved to guarantee that it preserves errno,
as quite a number of call sites are implicitly assuming.  (Again,
it's unlikely but I think not impossible that errno could change
during a SpinLockAcquire.  If so, this function was broken for its
own purposes as well as breaking callers.)

And change a few places that were using not-per-project-style messages,
such as "could not read directory" when "could not open directory" is
more correct.

Back-patch the exporting of ReadDirExtended, in case we have occasion
to back-patch some fix that makes use of it; it's not needed right now
but surely making it global is pretty harmless.  Also back-patch the
restoreTwoPhaseData and CheckXLogRemoved fixes.  The rest of this is
essentially cosmetic and need not get back-patched.

Michael Paquier, with a bit of additional work by me

Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/543d8c2f5d4c6b290534f22cc570f51ef75214e5

Modified Files
--------------
src/backend/access/transam/xlog.c |  8 ++++++++
src/backend/storage/file/fd.c     | 32 ++++++++++++++++++++++++--------
src/include/storage/fd.h          |  2 ++
3 files changed, 34 insertions(+), 8 deletions(-)


pgsql-committers by date:

Previous
From: Robert Haas
Date:
Subject: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table,log i
Next
From: Tom Lane
Date:
Subject: pgsql: Improve error handling in RemovePgTempFiles().