Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Dipesh Pandit
Subject Re: .ready and .done files considered harmful
Date
Msg-id CAN1g5_EgkiSAd2Hh_T_bC9Z+GmdQTEhR=eb=edh8aKAiLDnysQ@mail.gmail.com
Whole thread Raw
In response to Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
Hi,

Thanks for the feedback.

> Should we have XLogArchiveNotify(), writeTimeLineHistory(), and
> writeTimeLineHistoryFile() enable the directory scan instead?  Else,
> we have to exhaustively cover all such code paths, which may be
> difficult to maintain.  Another reason I am bringing this up is that
> my patch for adjusting .ready file creation [0] introduces more
> opportunities for .ready files to be created out-of-order.

XLogArchiveNotify() notifies Archiver when a log segment is ready for
archival by creating a .ready file. This function is being called for each
log segment and placing a call to enable directory scan here will result
in directory scan for each log segment.

We can have writeTimeLineHistory() and writeTimeLineHistoryFile() to
enable directory scan to handle the scenarios related to timeline switch.

However, in other scenarios, I think we have to explicitly call PgArchEnableDirScan()
to enable directory scan. PgArchEnableDirScan() takes care of waking up
archiver so that the caller of this function need not have to nudge the archiver.

> +    /*
> +     * This is a fall-back path, check if we are here due to the unavailability
> +     * of next anticipated log segment or the archiver is being forced to
> +     * perform a full directory scan. Reset the flag in shared memory only if
> +     * it has been enabled to force a full directory scan and then proceed with
> +     * directory scan.
> +     */
> +    if (PgArch->dirScan)
> +        PgArch->dirScan = false;

> Why do we need to check that the flag is set before we reset it?  I
> think we could just always reset it since we are about to do a
> directory scan anyway

Yes, I agree.

> > If there is a race condition here with setting the flag, then an
> > alternative design would be to use a counter - either a plain old
> > uint64 or perhaps pg_atomic_uint64 - and have the startup process
> > increment the counter when it wants to trigger a scan. In this design,
> > the archiver would never modify the counter itself, but just remember
> > the last value that it saw. If it later sees a different value it
> > knows that a full scan is required. I think this kind of system is
> > extremely robust against the general class of problems that you're
> > talking about here, but I'm not sure whether we need it, because I'm
> > not sure whether there is a race with just the bool.

> I'm not sure, either.  Perhaps it would at least be worth adding a
> pg_memory_barrier() after setting dirScan to false to avoid the
> scenario I mentioned (which may or may not be possible).  IMO this
> stuff would be much easier to reason about if we used a lock instead,
> even if the synchronization was not strictly necessary.  However, I
> don't want to hold this patch up too much on this point.

There is one possible scenario where it may run into a race condition. If
archiver has just finished archiving all .ready files and the next anticipated
log segment is not available then in this case archiver takes the fall-back
path to scan directory. It resets the flag before it begins directory scan.
Now, if a directory scan is enabled by a timeline switch or .ready file created
out of order in parallel to the event that the archiver resets the flag then this
might result in a race condition. But in this case also archiver is eventually
going to perform a directory scan and the desired file will be archived as part
of directory scan. Apart of this I can't think of any other scenario which may
result into a race condition unless I am missing something.

I have incorporated the suggestions and updated a new patch. PFA patch v9.

Thanks,
Dipesh
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: elog.c query_id support vs shutdown
Next
From: Matthias van de Meent
Date:
Subject: Re: NAMEDATALEN increase because of non-latin languages