Re: stat() vs ERROR_DELETE_PENDING, round N + 1 - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: stat() vs ERROR_DELETE_PENDING, round N + 1
Date
Msg-id CA+hUKGKL=3HZAPLh2En6TyNv242zsR=i8bZsCAAHDDN1K94Byw@mail.gmail.com
In response to Re: stat() vs ERROR_DELETE_PENDING, round N + 1  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: stat() vs ERROR_DELETE_PENDING, round N + 1
List pgsql-hackers
On Fri, Sep 3, 2021 at 2:01 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> Might be stupid, if a delete-pending'ed file can obstruct something,
> couldn't we change unlink on Windows to rename to a temporary random
> name then remove it?  We do something like it explicitly while WAL
> file removal. (It may cause degradation on bulk file deletion, and we
> may need further fix so that such being-deleted files are excluded
> while running a directory scan, though..)
>
> However, looking [1], with that strategy there may be a case where
> such "deleted" files may be left alone forever, though.

It's a good idea.  I tested it and it certainly does fix the
basebackup problem I've seen (experimental patch attached).  But,
yeah, I'm also a bit worried that that path could be fragile and need
special handling in lots of places.
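
For illustration, a minimal sketch of the rename-then-remove idea in portable C. This is not the attached experimental patch; the function name unlink_via_rename() and the ".deleted.<hex>" suffix are hypothetical, chosen only to show the two steps.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the attached patch): free up "path" by renaming
 * the file to a throwaway name in the same directory, then remove that name.
 * If the removal is delayed by a delete-pending state, only the throwaway
 * name lingers, so the original name can be reused immediately.
 *
 * Returns 0 on success, -1 on failure with errno set.
 */
int
unlink_via_rename(const char *path)
{
    char        tmppath[1024];
    int         len;

    assert(path != NULL);

    /*
     * The ".deleted.<hex>" suffix is an assumed naming convention.  rand()
     * is unseeded here; real code would need a collision-resistant name.
     */
    len = snprintf(tmppath, sizeof(tmppath), "%s.deleted.%08x",
                   path, (unsigned int) rand());
    if (len < 0 || (size_t) len >= sizeof(tmppath))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Step 1: move the file out of the way so its name is free. */
    if (rename(path, tmppath) != 0)
        return -1;

    /* Step 2: remove the throwaway name. */
    if (remove(tmppath) != 0)
        return -1;

    return 0;
}
```

A recognisable suffix like this is also what a directory scan would have to filter on, which is where the special handling mentioned above comes in.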

I also tried writing a new open() wrapper using the lower level
NtCreateFile() interface, and then an updated stat() wrapper built on
top of that.  As a non-Windows person, getting that to (mostly) work
involved a fair amount of suffering.  I can share that if someone is
interested.  While learning about that family of interfaces, though, I
realised we could keep the existing Win32-based code and just retrieve
the NT status as well when a call fails, which leads to a very small
change (experimental patch attached).
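
To make the NT status idea concrete, here is a rough sketch of the decision it enables. It is not the attached patch. The function name is hypothetical, and the two constants are defined locally (with the values documented in winerror.h and ntstatus.h) so that the sketch compiles anywhere. On Windows the inputs would come from GetLastError() and from something like ntdll's RtlGetLastNtStatus(); the latter is my assumption about how the status could be obtained.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local copies of the documented Windows values, for portability. */
#define SKETCH_ERROR_ACCESS_DENIED      5u          /* winerror.h */
#define SKETCH_STATUS_DELETE_PENDING    0xC0000056u /* ntstatus.h */

/*
 * Hypothetical helper: choose an errno for a failed open()/stat().
 *
 * Win32 reports a delete-pending file as ERROR_ACCESS_DENIED, which is
 * indistinguishable from a genuine permissions problem.  The NT status
 * is more specific, so consult it before settling on EACCES.
 *
 * Returns ENOENT or EACCES for the access-denied case, or 0 if this
 * sketch does not handle the error (the caller maps it as usual).
 */
int
errno_for_open_failure(uint32_t win32_error, uint32_t nt_status)
{
    if (win32_error == SKETCH_ERROR_ACCESS_DENIED)
    {
        /* The file is already on its way out: treat it as gone. */
        if (nt_status == SKETCH_STATUS_DELETE_PENDING)
            return ENOENT;

        /* A real access problem. */
        return EACCES;
    }

    return 0;
}
```

Whether ENOENT is the right answer in every case (for example when creating a file) is a separate question that this sketch does not address.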

The best idea is probably to set FILE_DISPOSITION_DELETE |
FILE_DISPOSITION_POSIX_SEMANTICS before unlinking.  This appears to be
a silver bullet, but isn't available on ancient Windows releases that
we support, or file systems other than local NTFS.  So maybe we need a
combination of that + STATUS_DELETE_PENDING as shown in the attached.
I'll look into that next.
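
A rough sketch of what the POSIX-semantics unlink might look like through the Win32 API, where the flags are spelled FILE_DISPOSITION_FLAG_DELETE and FILE_DISPOSITION_FLAG_POSIX_SEMANTICS. The function name is hypothetical, the Windows branch assumes SDK headers that expose FILE_DISPOSITION_INFO_EX, and the non-Windows branch exists only so the sketch builds elsewhere. It does not include the fallback for older releases or other file systems.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/*
 * Hypothetical helper: unlink "path" so that the name disappears at once,
 * even if other handles are still open.
 *
 * Returns 0 on success, -1 on failure.  On Windows the caller can inspect
 * GetLastError(); elsewhere errno is set by unlink().
 */
int
posix_semantics_unlink(const char *path)
{
    assert(path != NULL);

#ifdef _WIN32
    {
        HANDLE      h;
        FILE_DISPOSITION_INFO_EX info;
        BOOL        ok;

        /* DELETE access is required to change the file's disposition. */
        h = CreateFileA(path,
                        DELETE,
                        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                        NULL,
                        OPEN_EXISTING,
                        FILE_FLAG_OPEN_REPARSE_POINT,
                        NULL);
        if (h == INVALID_HANDLE_VALUE)
            return -1;

        /* Ask for POSIX semantics: the name goes away immediately. */
        info.Flags = FILE_DISPOSITION_FLAG_DELETE |
            FILE_DISPOSITION_FLAG_POSIX_SEMANTICS;
        ok = SetFileInformationByHandle(h, FileDispositionInfoEx,
                                        &info, sizeof(info));
        CloseHandle(h);

        return ok ? 0 : -1;
    }
#else
    /* POSIX systems already behave this way. */
    return unlink(path);
#endif
}
```

Where SetFileInformationByHandle() rejects the request, real code would presumably fall back to the ordinary unlink path and rely on the STATUS_DELETE_PENDING check.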

