Re: checkpointer: PANIC: could not fsync file: No such file or directory - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: checkpointer: PANIC: could not fsync file: No such file or directory
Date
Msg-id CA+hUKGKHFDpuJZ5bqPx6TtxavTApRa6MgHAFhHvZJxYUsWdK2A@mail.gmail.com
In response to Re: checkpointer: PANIC: could not fsync file: No such file or directory  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: checkpointer: PANIC: could not fsync file: No such file or directory  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Nov 29, 2019 at 11:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Fri, Nov 29, 2019 at 10:50:36AM +1300, Thomas Munro wrote:
> > On Fri, Nov 29, 2019 at 3:13 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > On Wed, Nov 27, 2019 at 7:53 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > >  2019-11-26 23:41:50.009-05 | could not fsync file "pg_tblspc/16401/PG_12_201909212/16460/973123799.10": No such file or directory
> > >
> > > I managed to reproduce this (see below).  I think I know what the
> >
> > Here is a patch that fixes the problem by sending all the
> > SYNC_FORGET_REQUEST messages up front.
>
> I managed to reproduce it too, but my recipe is crummy enough that I'm not even
> going to send it..
>
> I confirmed that patch also seems to work for my worse recipe.

Thanks.

One thing I'm wondering about is what happens if we encounter EPERM,
EIO etc while probing the existing files, so that we give up early and
don't deal with some higher-numbered files.

(1)  We'll still have unlinked the lower numbered files, and we'll
leave the higher numbered files where they are, which might confuse
matters if the relfilenode number is later recycled so the zombie
segments appear to belong to the new relfilenode.  That is
longstanding PostgreSQL behaviour not changed by commit 3eb77eba or
this patch IIUC (despite moving a few things around), and I guess it's
unlikely to bite you considering all the coincidences required, but if
there's a transient EPERM (think Windows virus checker opening files
without the shared read flag), it's not inconceivable.  One solution
to that would be not to queue the unlink request for segment 0 if
anything goes wrong while unlinking the other segments (in other
words: if you can't unlink all the segments, deliberately leak segment
0 and thus the *whole relfilenode*, not just the problem file(s)).
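To illustrate what I mean, here's a minimal standalone sketch (not actual PostgreSQL source; `unlink_segment` is a hypothetical stand-in for the per-segment unlink, and the hard-coded failure on segment 5 just simulates a transient EPERM): unlink from the highest segment down, and only touch segment 0 if everything else succeeded, so a partial failure leaks the whole relfilenode rather than leaving zombie high-numbered segments behind.

```c
/* Hypothetical sketch, not PostgreSQL source.  Remove higher-numbered
 * segments first, and only unlink segment 0 when every other segment
 * was removed successfully.  On any failure, segment 0 is deliberately
 * leaked so the relfilenode number cannot be recycled while zombie
 * segments still exist on disk. */
#include <stdbool.h>
#include <stdio.h>

/* Stub for the real per-segment unlink; segment 5 "fails" here to
 * simulate a transient EPERM (e.g. a virus checker holding the file). */
static bool unlink_segment(int segno)
{
    return segno != 5;
}

/* Returns true only if all segments, including segment 0, were removed. */
static bool unlink_relation_segments(int nsegments)
{
    for (int segno = nsegments - 1; segno >= 1; segno--)
    {
        if (!unlink_segment(segno))
        {
            /* Leak segment 0, and thus the whole relfilenode. */
            fprintf(stderr, "segment %d not removed; leaking segment 0\n",
                    segno);
            return false;
        }
    }
    return unlink_segment(0);
}
```

The point of the ordering is that segment 0's continued existence is what keeps the relfilenode number reserved, so leaking it is the safe direction to fail in.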

(2)  Even with the fix I just proposed, if you give up early due to
EPERM, EIO etc, there might still be sync requests queued for high
numbered segments, so you could reach the PANIC case.  It's likely
that, once you've hit such an error, the checkpointer is going to be
panicking anyway when it starts seeing similar errors, but still, it'd
be possible to do better here (especially if the error was transient
and short lived).  If we don't accept that behaviour, we could switch
(back) to a single cancel message that can whack every request
relating to the relation (essentially as it was in 11, though it
requires a bit more work for the new design), or stop using
_mdfd_getseg() for this so that you can remove segments independently
without worrying about sync requests for other segments (it was
actually like that in an earlier version of the patch for commit
3eb77eba, but someone complained that it didn't benifit from fd
3eb77eba, but someone complained that it didn't benefit from fd
caching).
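For the single-cancel-message idea, the essential operation is just a queue compaction keyed on the relation rather than on (relation, segment), so no request for a high-numbered segment can survive and later hit the PANIC path. A toy standalone sketch (again not PostgreSQL source; the fixed-size array and field names are made up for illustration):

```c
/* Hypothetical sketch, not PostgreSQL source: one "forget" message
 * cancels every pending sync request for a relation, regardless of
 * segment number. */
#include <stddef.h>

#define QUEUE_CAP 16

typedef struct SyncRequest
{
    unsigned relnode;           /* which relation the request is for */
    int      segno;             /* segment number within the relation */
} SyncRequest;

static SyncRequest queue[QUEUE_CAP];
static size_t      queue_len = 0;

static void enqueue(unsigned relnode, int segno)
{
    if (queue_len < QUEUE_CAP)
        queue[queue_len++] = (SyncRequest){ relnode, segno };
}

/* Compact the queue in place, dropping every request that matches the
 * given relation; returns how many requests were cancelled. */
static size_t forget_relation(unsigned relnode)
{
    size_t kept = 0;
    size_t dropped = 0;

    for (size_t i = 0; i < queue_len; i++)
    {
        if (queue[i].relnode == relnode)
            dropped++;
        else
            queue[kept++] = queue[i];
    }
    queue_len = kept;
    return dropped;
}
```

Because the match ignores the segment number, the cancel works even for requests queued for segments the unlinking backend never managed to probe, which is exactly the case the early-EPERM scenario leaves behind.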


