Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown) - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)
Msg-id 20191004083721.GA1829@paquier.xyz
In response to Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Responses Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)
List pgsql-hackers
On Thu, Oct 03, 2019 at 12:43:37PM +0300, Alexey Kondratov wrote:
> On 03.10.2019 6:07, Michael Paquier wrote:
>> I have reworked your first patch as per the attached.  What do you
>> think about it?  The part with the control file needs to go down to
>> v12, and I would likely split that into two commits on HEAD: one for
>> the control file and a second for the recovery.conf portion with the
>> fix for --no-ensure-shutdown to keep a cleaner history.
>
> It looks fine for me excepting the progress reporting part. It now adds
> PG_CONTROL_FILE_SIZE to fetch_done. However, I cannot find that control file
> is either included into filemap and fetch_size or counted during
> calculate_totals(). Maybe I've missed something, but now it looks like we
> report something that wasn't planned for progress reporting, doesn't
> it?

Right.  The pre-12 code actually handles that incorrectly, as it assumes
that any file written through file_ops.c should be part of the
progress.  So I went with the simplest solution, and backpatched this
part with 6f3823b.  I have also committed the set of fixes for the new
options, so that we have a better base of work than what's on HEAD
currently.
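
As an illustration of the mismatch discussed above, here is a minimal
sketch in C of progress accounting where a write only counts toward the
"done" total if the file was also counted in the planned total.  The
struct and function names are hypothetical and only loosely modelled on
the fetch_size / fetch_done counters mentioned in this thread; this is
not the code from 6f3823b.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters, named after the ones discussed in the thread. */
typedef struct ProgressSketch
{
	uint64_t	fetch_size;		/* bytes planned for transfer */
	uint64_t	fetch_done;		/* bytes transferred so far */
} ProgressSketch;

/* Planning phase: account for a file that will be copied. */
static void
plan_file(ProgressSketch *p, uint64_t size)
{
	assert(p != NULL);
	p->fetch_size += size;
}

/*
 * Write phase: only writes for files that were planned are counted.
 * A file written outside the plan (the control file, in the case
 * discussed here) must not inflate fetch_done.
 */
static void
record_write(ProgressSketch *p, uint64_t nbytes, bool planned)
{
	assert(p != NULL);
	if (planned)
		p->fetch_done += nbytes;
}

/* Integer percentage of planned bytes transferred so far. */
static int
progress_percent(const ProgressSketch *p)
{
	assert(p != NULL);
	if (p->fetch_size == 0)
		return 0;
	return (int) (p->fetch_done * 100 / p->fetch_size);
}
```

With this shape, an unplanned write leaves the counters untouched, so
the reported percentage cannot exceed what was planned.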

>> +               # Check that incompatible options error out.
>> +               command_fails(
>> +                       [
>> +                               'pg_rewind', "--debug",
>> +                               "--source-pgdata=$standby_pgdata",
>> +                               "--target-pgdata=$master_pgdata", "-R",
>> +                               "--no-ensure-shutdown"
>> +                       ],
>> +                       'pg_rewind local with -R');
>> Incompatible options had better be checked within a separate perl
>> script?  We generally do that for the other binaries.
>
> Yes, it makes sense. I've reworked the patch with tests and added a couple
> of extra cases.

Regarding the tests, adding a --dry-run command is a good idea.
However, I think that there is more value in having the tests exercise
the single-user mode automatically, as that's more critical from the
point of view of a rewind run, and stopping the cluster in immediate
mode causes, as expected, the next --dry-run command to fail.

Another thing is that I think that we should use -F with --single.
This makes recovery faster, and the target data folder is synced
at the end of pg_rewind anyway.
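
For readers following along, a minimal sketch in C of how such a
single-user-mode command line could be assembled with -F.  The helper
name, the buffer handling, the rejection of paths containing a double
quote, and the use of template1 with /dev/null as input are assumptions
made for this illustration; the attached patch is the reference for
what pg_rewind actually runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a command that starts postgres in
 * single-user mode on the target data directory with fsync disabled
 * (-F), so that crash recovery completes before the rewind.
 *
 * Returns 0 on success, -1 if a path contains a double quote (which
 * would break the quoting used here) or if the buffer is too small.
 */
static int
build_single_user_cmd(char *buf, size_t buflen,
					  const char *postgres_path, const char *datadir)
{
	int			n;

	assert(buf != NULL && postgres_path != NULL && datadir != NULL);

	if (strchr(postgres_path, '"') != NULL || strchr(datadir, '"') != NULL)
		return -1;

	n = snprintf(buf, buflen,
				 "\"%s\" --single -F -D \"%s\" template1 < /dev/null",
				 postgres_path, datadir);

	/* snprintf reports the length it needed; detect truncation. */
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	return 0;
}
```

The resulting string would then be handed to something like system().
Skipping fsync here is acceptable only because, as noted above, the
target data folder is synced at the end of pg_rewind.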

Using the long option names makes the tests easier to follow in this
case, so I have switched -R to --write-recovery-conf.

Some comments and the docs were using confusing wording, so I have
reworked what I found (like many "it" in a single sentence referring
to different things).

+command_fails(
+    [
+        'pg_rewind', "--debug",
+        "--source-pgdata=$standby_pgdata",
+        "--target-pgdata=$master_pgdata",
+        "--no-ensure-shutdown"
+    ],
+    'pg_rewind local without source shutdown');
Regarding the whole set of incompatible options, we have many more of
those after the initial option parsing, so I think that we should group
all the cheap ones together.  Let's tackle that as a separate patch.
We can also just check --no-ensure-shutdown directly in RewindTest.pm,
as I have switched the cluster to no longer be cleanly shut down in
order to stress the automatic recovery path, and trigger that check
before running pg_rewind for the local and remote modes.

Attached is an updated patch with all I found.  What do you think?
--
Michael
