Re: Cygwin cleanup - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Cygwin cleanup
Date
Msg-id 20230124003622.rg6trhbfhnxmnprj@awork3.anarazel.de
Whole thread Raw
In response to Re: Cygwin cleanup  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Hi,

On 2023-01-23 17:28:14 -0600, Justin Pryzby wrote:
> On Thu, Jan 12, 2023 at 10:17:55PM -0600, Justin Pryzby wrote:
> > On Thu, Jan 12, 2023 at 06:43:54PM -0800, Andres Freund wrote:
> > > > It looks like logical decoding may be the "most wrong" place that
> > > > wal_sync_method is being used, so maybe my change is reasonable to
> > > > consider, and not just a workaround.
> > > 
> > > I don't follow. What does using fsync_fname() vs fsync_fname_ext() have to do
> > > with pg_fsync() using wal_sync_method? fsync_fname() is just a wrapper around
> > > fsync_fname_ext(). Both end up in pg_fsync().
> > 
> > My patch used fsync_fname_ext() which would cause an ERROR rather than a
> > PANIC when failing to fsync the logical decoding pathname.
> > 
> > > Are you actually proposing that we don't PANIC after an fsync for the category
> > > of files that you list here, even with data_sync_retry set?
> > 
> > Yes, but I'm referring only to my change to SnapBuildSerialize().
> > 
> > The rest of the verbage was me trying to figure out the
> > history/evolution of pg_fsync usage.
> 
> Also note the existing comment (originating from Thomas' "fsync-gate"
> commit, which introduced data_sync_retry):
> 
> +        * It's safe to just ERROR on fsync() here because we'll retry the whole
> +        * operation including the writes.
> 
> Also, it seems to work fine if one calls pg_fsync() again, rather than
> calling fsync_fname(), which re-opens the file.

I don't think that'd achieve the same thing necessarily. But it's notoriously
hard to know what which OS requires in this area.


> It also seems to work fine if one omits the initial call to
> fsync_fname("pg_logical/snapshots", true);

I don't think it's a good idea randomly weaken individual fsyncs just because
it somehow, without any theory as to how, fixes tests on cygwin.


> Since SnapBuildSerialize() isn't atomic (the system could crash at any
> point), I'm not seeing why these wouldn't be adequately safe.

I'm not sure what you mean by that. It's atomic from a crash safety view: It
first writes into a tempfile, fsyncs that + directory, then renames the file
into place, fsyncs new filename + directory again. Tempfiles are removed after
a crash.  In case of a crash you can either end up with an "old" or a "new"
file.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Logical replication timeout problem
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)