Re: avoid multiple hard links to same WAL file after a crash - Mailing list pgsql-hackers

From Tom Lane
Subject Re: avoid multiple hard links to same WAL file after a crash
Date
Msg-id 2370127.1650308835@sss.pgh.pa.us
In response to Re: avoid multiple hard links to same WAL file after a crash  (Michael Paquier <michael@paquier.xyz>)
Responses Re: avoid multiple hard links to same WAL file after a crash  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote:
>> I wonder if this is really true. I thought rename() was supposed to be atomic.

> Not always.  For example, some old versions of MacOS have a non-atomic
> implementation of rename(), like prairiedog with 10.4.  Even 10.5 does
> not handle atomicity as far as I recall.

I think that's not talking about the same thing.  POSIX requires rename(2)
to replace an existing target link atomically:

    If the link named by the new argument exists, it shall be removed and
    old renamed to new. In this case, a link named new shall remain
    visible to other threads throughout the renaming operation and refer
    either to the file referred to by new or old before the operation
    began.

(It's that requirement that ancient macOS fails to meet.)
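
As a minimal sketch of what that guarantee covers, assuming a POSIX
system: the helper below renames one file over an existing one and checks
that the target name now refers to the file that used to be at the old
name.  The name rename_replaces_target is hypothetical, and a
single-threaded check like this cannot observe the atomicity itself, only
the end state.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: rename oldpath over newpath (which may already
 * exist) and report whether newpath now refers to the file that oldpath
 * named before the call.
 *
 * Returns 1 if it does, 0 if it does not, -1 on any system call failure.
 */
static int
rename_replaces_target(const char *oldpath, const char *newpath)
{
    struct stat before;
    struct stat after;

    /* Remember which file oldpath refers to. */
    if (stat(oldpath, &before) != 0)
        return -1;

    /* POSIX: an existing newpath is replaced, never left missing. */
    if (rename(oldpath, newpath) != 0)
        return -1;

    if (stat(newpath, &after) != 0)
        return -1;

    /* Same device and inode means same underlying file. */
    return before.st_dev == after.st_dev && before.st_ino == after.st_ino;
}
```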

However, I do not see any text that addresses the question of whether
the old link disappears atomically with the appearance of the new link,
and it seems like that'd be pretty impractical to ensure in cases like
moving a link from one directory to another.  (What would it even mean
to say that, considering that a thread can't read the two directories
at the same instant?)  From a crash-safety standpoint, it'd surely be
better to make the new link before removing the old, so I imagine
that's what most file systems do.
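
To illustrate the ordering I have in mind, here is a user-space sketch,
assuming a POSIX system with hard-link support.  It is not how any
particular file system implements rename(2); the helper name
move_link_in_two_steps is hypothetical.  It makes the new link first and
removes the old one second, and reports the link count seen in between.
A crash in that window would leave two hard links to the same file.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: move a link from oldpath to newpath in two
 * separate steps, new link first, old link removed second.
 *
 * Returns the link count observed between the two steps (2 for a file
 * that started with a single link), or -1 on any system call failure.
 */
static long
move_link_in_two_steps(const char *oldpath, const char *newpath)
{
    struct stat st;

    /* Step 1: create the new name; fails if newpath already exists. */
    if (link(oldpath, newpath) != 0)
        return -1;

    /* Both names are visible here and refer to the same file. */
    if (stat(newpath, &st) != 0)
        return -1;

    /* Step 2: remove the old name. */
    if (unlink(oldpath) != 0)
        return -1;

    return (long) st.st_nlink;
}
```

Doing the two steps in the opposite order would instead open a window in
which the file has no name at all, which is the worse failure mode.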

            regards, tom lane


