Re: subscriptionCheck failures on nightjar - Mailing list pgsql-hackers

From Andres Freund
Subject Re: subscriptionCheck failures on nightjar
Date
Msg-id 20190213181225.fathyapig4sm4exa@alap3.anarazel.de
Whole thread Raw
In response to Re: subscriptionCheck failures on nightjar  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: subscriptionCheck failures on nightjar  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: subscriptionCheck failures on nightjar  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2019-02-13 12:59:19 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
> >> Bleah.  But in any case, the rename should not create a situation
> >> in which we need to fsync the file data again.
> 
> > Well, it's not super well defined which of either you need to make the
> > rename durable, and it appears to differ between OSs. Any argument
> > against fixing it up like I suggested, by using an fd from before the
> > rename?
> 
> I'm unimpressed.  You're speculating about the filesystem having random
> deviations from POSIX behavior, and using that weak argument to justify a
> totally untested technique having its own obvious portability hazards.

Uhm, we've reproduced failures due to the lack of such fsyncs at some
point. And not some fringe OS, but ext4 (albeit with data=writeback).

I don't think POSIX has yet figured out what they actually think is
required:
http://austingroupbugs.net/view.php?id=672

I guess we could just ignore ENOENT in this case, that ought to be just
as safe as using the old fd.


> Also, I wondered why this is coming out as a PANIC.  I thought originally
> that somebody must be causing this code to run in a critical section,
> but it looks like the real issue is just that fsync_fname() uses
> data_sync_elevel, which is
> 
> int
> data_sync_elevel(int elevel)
> {
>     return data_sync_retry ? elevel : PANIC;
> }
> 
> I really really don't want us doing questionably-necessary fsyncs with a
> PANIC as the result.

Well, given the 'failed fsync throws dirty data away' issue, I don't
quite see what we can do otherwise. But:


> Perhaps more to the point, the way this was coded, the PANIC applies
> to open() failures in fsync_fname_ext() not just fsync() failures;
> that's painting with too broad a brush isn't it?

That indeed seems wrong. Thomas?  I'm not quite sure how to best fix
this though - I guess we could rename fsync_fname_ext's eleval parameter
to fsync_failure_elevel? It's not visible outside fd.c, so that'd not be
to bad?

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: reducing isolation tests runtime
Next
From: Tom Lane
Date:
Subject: Re: subscriptionCheck failures on nightjar