Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Bossart, Nathan
Subject Re: .ready and .done files considered harmful
Date
Msg-id 27F8BC15-6242-44C8-9818-5D02A3BF8F5C@amazon.com
Whole thread Raw
In response to Re: .ready and .done files considered harmful  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: .ready and .done files considered harmful  (Dipesh Pandit <dipesh.pandit@gmail.com>)
List pgsql-hackers
Thanks for the new version of the patch.  Overall, I think it is on
the right track.

+    /*
+     * This .ready file is created out of order, notify archiver to perform
+     * a full directory scan to archive corresponding WAL file.
+     */
+    StatusFilePath(archiveStatusPath, xlog, ".ready");
+    if (stat(archiveStatusPath, &stat_buf) == 0)
+    {
+        PgArchEnableDirScan();
+        PgArchWakeup();
+    }

Should we have XLogArchiveNotify(), writeTimeLineHistory(), and
writeTimeLineHistoryFile() enable the directory scan instead?  Else,
we have to exhaustively cover all such code paths, which may be
difficult to maintain.  Another reason I am bringing this up is that
my patch for adjusting .ready file creation [0] introduces more
opportunities for .ready files to be created out-of-order.

+    /*
+     * This is a fall-back path, check if we are here due to the unavailability
+     * of next anticipated log segment or the archiver is being forced to
+     * perform a full directory scan. Reset the flag in shared memory only if
+     * it has been enabled to force a full directory scan and then proceed with
+     * directory scan.
+     */
+    if (PgArch->dirScan)
+        PgArch->dirScan = false;

Why do we need to check that the flag is set before we reset it?  I
think we could just always reset it since we are about to do a
directory scan anyway.

On 8/18/21, 7:25 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Tue, Aug 17, 2021 at 4:19 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>> Thinking further, I think the most important thing to ensure is that
>> resetting the flag happens before we begin the directory scan.
>> Consider the following scenario in which a timeline history file would
>> potentially be lost:
>>
>>         1. Archiver completes directory scan.
>>         2. A timeline history file is created and the flag is set.
>>         3. Archiver resets the flag.
>
> Dipesh says in his latest email that the archiver resets the flag just
> before it begins a directory scan. If that's accurate, then I think
> this sequence of events can't occur.
>
> If there is a race condition here with setting the flag, then an
> alternative design would be to use a counter - either a plain old
> uint64 or perhaps pg_atomic_uint64 - and have the startup process
> increment the counter when it wants to trigger a scan. In this design,
> the archiver would never modify the counter itself, but just remember
> the last value that it saw. If it later sees a different value it
> knows that a full scan is required. I think this kind of system is
> extremely robust against the general class of problems that you're
> talking about here, but I'm not sure whether we need it, because I'm
> not sure whether there is a race with just the bool.

I'm not sure, either.  Perhaps it would at least be worth adding a
pg_memory_barrier() after setting dirScan to false to avoid the
scenario I mentioned (which may or may not be possible).  IMO this
stuff would be much easier to reason about if we used a lock instead,
even if the synchronization was not strictly necessary.  However, I
don't want to hold this patch up too much on this point.

Nathan

[0] https://postgr.es/m/05AD5FE2-9A53-4D11-A3F8-3A83EBB0EB93%40amazon.com


pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: The Free Space Map: Problems and Opportunities
Next
From: Andrew Dunstan
Date:
Subject: support for windows robocopy in archive_command and restore_command