Re: avoid multiple hard links to same WAL file after a crash - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: avoid multiple hard links to same WAL file after a crash
Date
Msg-id 20220408170519.GA1409463@nathanxps13
Whole thread Raw
In response to Re: avoid multiple hard links to same WAL file after a crash  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Fri, Apr 08, 2022 at 09:53:12AM -0700, Nathan Bossart wrote:
> On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote:
>> I'd actually be in favor of nuking durable_rename_excl() from orbit
>> and putting the file-exists tests in the callers. Otherwise, someone
>> might assume that it actually has the semantics that its name
>> suggests, which could be pretty disastrous. If we don't want to do
>> that, then I'd changing to do the stat-then-durable-rename thing
>> internally, so we don't leave hard links lying around in *any* code
>> path. Perhaps that's the right answer for the back-branches in any
>> case, since there could be third-party code calling this function.
> 
> I think there might be another problem.  The man page for rename() seems to
> indicate that overwriting an existing file also introduces a window where
> the old and new path are hard links to the same file.  This isn't a problem
> for the WAL files because we should never be overwriting an existing one,
> but I wonder if it's a problem for other code paths.  My guess is that many
> code paths that overwrite an existing file are first writing changes to a
> temporary file before atomically replacing the original.  Those paths are
> likely okay, too, as you can usually just discard any existing temporary
> files.

Ha, so there are only a few callers of durable_rename_excl() in the
PostgreSQL tree.  One is basic_archive.c, which is already doing a stat()
check.  IIRC I only used durable_rename_excl() here to handle the case
where multiple servers are writing archives to the same location.  If that
happened, the archiver process would begin failing.  If a crash left two
hard links to the same file around, we will silently succeed the next time
around thanks to the compare_files() check.  Besides the WAL installation
code, the only other callers are in timeline.c, and both note that the use
of durable_rename_excl() is for "paranoidly trying to avoid overwriting an
existing file (there shouldn't be one)."

So AFAICT basic_archive.c is the only caller with a strong reason for using
durable_rename_excl(), and even that might not be worth keeping it around.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Joshua Brindle
Date:
Subject: Re: [PoC/RFC] Multiple passwords, interval expirations
Next
From: Mark Wong
Date:
Subject: Re: trigger example for plsample