Re: Concurrency issue in pg_rewind - Mailing list pgsql-hackers

From Alexander Kukushkin
Subject Re: Concurrency issue in pg_rewind
Date
Msg-id CAFh8B=mmZ5S-Y5Lf9v=oYfWMG+OxJpXHtO8yjXOaAoJUFXmX3g@mail.gmail.com
In response to Re: Concurrency issue in pg_rewind  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Responses Re: Concurrency issue in pg_rewind  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
List pgsql-hackers
On Thu, 17 Sep 2020 at 14:04, Alexey Kondratov
<a.kondratov@postgrespro.ru> wrote:
>
> Hm, I cannot understand why wal-g (or any other tool) is trying to run
> pg_rewind, while WAL copying (and prefetch) is still in progress? Why do
> not just wait until it is finished?

wal-g doesn't try to call pg_rewind.
First, we called wal-g; it fetched the file we requested and exited.
But before exiting, wal-g forks, and the child process prefetches the
next few WAL segments.
We don't really know when the child process exits, and we can't wait for it.
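
To illustrate the pattern (this is only a rough sketch in C, not wal-g's
actual code; the function names are made up), the parent starts the
background work and returns without ever waiting for the child:

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical placeholder for "prefetch the next few WAL segments". */
static void
prefetch_next_segments(void)
{
    /* A real tool would download files into its staging directory here. */
}

/*
 * Hypothetical helper: fork a child that runs the prefetch and return
 * immediately. The parent never calls waitpid(), so once the parent
 * exits nobody is tracking when the child finishes.
 */
static pid_t
start_background_prefetch(void (*prefetch)(void))
{
    pid_t pid;

    assert(prefetch != NULL);

    pid = fork();
    if (pid == 0)
    {
        /* Child: may keep running after the parent has already exited. */
        prefetch();
        _exit(0);
    }

    /* Parent: pid > 0 on success, -1 if fork() failed. */
    return pid;
}
```

Whoever invoked the parent (postgres or pg_rewind running the
restore_command) only sees the parent's exit status; the prefetching
child is not its child, so it has nothing to wait on.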

>
> It is also not clear for me why it does not put prefetched WAL files
> directly into the pg_wal?

Because this is how postgres works. It doesn't matter whether the
specific WAL segment is already there; postgres will call the
restore_command anyway.
The restore command also doesn't know whether the file in pg_wal is OK,
so keeping the prefetched file in some other place and moving it into
pg_wal on request seems to be a good approach.
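
As a rough sketch of the "keep it elsewhere and move it" idea (in C, with
a made-up helper name; this is an assumption about the general approach,
not wal-g's implementation), the restore command can try to rename a
previously prefetched file into place and fall back to a normal fetch
when there is none:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: move a prefetched segment to the path requested
 * by the restore_command.
 *
 * Returns 1 if the file was moved, 0 if the source does not exist
 * (the caller should fetch the segment from the archive instead),
 * and -1 on any other error.
 */
static int
move_prefetched_segment(const char *prefetched_path, const char *dest_path)
{
    assert(prefetched_path != NULL && dest_path != NULL);

    /* Both paths are assumed to be on the same filesystem. */
    if (rename(prefetched_path, dest_path) == 0)
        return 1;

    /* ENOENT: no prefetched copy (or a path component is missing). */
    if (errno == ENOENT)
        return 0;

    return -1;
}
```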

> With --restore-target-wal pg_rewind is trying to call restore_command on
> its own and it can happen at two stages:
>
> 1) When pg_rewind is trying to find the last checkpoint preceding a
> divergence point. In that case file map is not even yet initialized.
> Thus, all fetched WAL segments at this stage will be present in the file
> map created later.

Nope, it will fetch the files you requested, and in addition it will
leave a child process running in the background which is doing the
prefetch (manipulating files under pg_wal/.wal-g/...)

>
> 2) When it creates a data pages map. It should traverse WAL from the
> last common checkpoint till the final shutdown point in order to find
> all modified pages on the target. At this stage pg_rewind only updates
> info about data segments in the file map. That way, I see a minor
> problem that WAL segments fetched at this stage would not be deleted,
> since they are absent in the file map.
>
> Anyway, pg_rewind does not delete neither WAL segments, not any other
> files in the middle of the file map creation, so I cannot imagine, how
> it can get into the same trouble on its own.

When pg_rewind was creating the map, some temporary files were there,
because the forked child process of wal-g was still running.
When the wal-g child process exits, it removes some of these files.
Specifically, it was trying to prefetch 0000008400000A7600000024 into
the pg_wal/.wal-g/prefetch/running/0000008400000A7600000024, but
apparently the file wasn't available on S3 and prefetch failed,
therefore the empty file was removed.


> Although keeping arbitrary files inside PGDATA does not look like a good
> idea for me, I do not see anything criminal in skipping non-existing
> file, when executing a file map by pg_rewind.

Good, I will prepare a patch then.
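
Just to show the idea (a minimal sketch in C with made-up names, not the
actual patch and not pg_rewind's existing code): when removing a file
that was listed while the file map was built, treat "file is already
gone" as a non-error and report only other failures.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
typedef enum
{
    REMOVE_OK,
    REMOVE_ALREADY_GONE,
    REMOVE_FAILED
} remove_result;

/*
 * Hypothetical helper: remove a file that was seen earlier. If another
 * process deleted it in the meantime (ENOENT), skip it silently.
 */
static remove_result
remove_listed_file(const char *path)
{
    assert(path != NULL);

    if (unlink(path) == 0)
        return REMOVE_OK;

    /* The file disappeared between listing and removal: nothing to do. */
    if (errno == ENOENT)
        return REMOVE_ALREADY_GONE;

    fprintf(stderr, "could not remove file \"%s\": %s\n",
            path, strerror(errno));
    return REMOVE_FAILED;
}
```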

Regards,
--
Alexander Kukushkin


