Re: Trap errors from streaming child in pg_basebackup to exit early - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Trap errors from streaming child in pg_basebackup to exit early
Date
Msg-id 2289827C-7462-4B47-AD18-0601FAD36143@yesql.se
In response to Re: Trap errors from streaming child in pg_basebackup to exit early  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Trap errors from streaming child in pg_basebackup to exit early  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
> On 21 Feb 2022, at 03:03, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Feb 18, 2022 at 10:00:43PM +0100, Daniel Gustafsson wrote:
>> This is a good idea, I was going in a different direction earlier with a test but
>> this is cleaner.  The attached 0001 refactors pump_until; 0002 fixes a trivial
>> spelling error found while hacking; and 0003 is the previous patch complete
>> with a test that passes on Cirrus CI.
>
> This looks rather sane to me, and I can confirm that this passes
> the CI and a manual run of MSVC tests with my own box.

Great, thanks!

> +is($node->poll_query_until('postgres',
> +   "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE " .
> +   "application_name = '010_pg_basebackup.pl' AND wait_event = 'WalSenderMain' " .
> +   "AND backend_type = 'walsender'"), "1", "Walsender killed");
> If you do that, don't you have a risk to kill the WAL sender doing the
> BASE_BACKUP?  That could falsify the test.  It seems to me that it
> would be safer to add a check on query ~ 'START_REPLICATION' or
> something like that.

I don't think there's a risk, but I've added the check on query as well since
it also makes it more readable.
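
For illustration, the query with the extra condition might look roughly like the sketch below. It is not the attached patch: it only assembles the string, reusing the filter from the quoted hunk and adding the `query ~ 'START_REPLICATION'` match suggested above.

```perl
use strict;
use warnings;

# Sketch only: the same filter as the quoted hunk, plus a match on the
# query column so that only a walsender running START_REPLICATION is hit.
my $query =
    "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE "
  . "application_name = '010_pg_basebackup.pl' "
  . "AND wait_event = 'WalSenderMain' "
  . "AND backend_type = 'walsender' "
  . "AND query ~ 'START_REPLICATION'";

print "$query\n";

# In the TAP test this string would take the place of the query passed to
# $node->poll_query_until('postgres', ...) in the hunk quoted above.
```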

> -           diag("aborting wait: program timed out");
> -           diag("stream contents: >>", $$stream, "<<");
> -           diag("pattern searched for: ", $untl);
> Keeping some of this information around would be useful for
> debugging in the refactored routine.

Maybe, but we don't really have diag output anywhere in the modules or the
tests so I didn't see much of a precedent for keeping it.  Inspecting the repo I
think we can remove two more in pg_rewind, which I just started a thread for.

> +my $sigchld_bb = IPC::Run::start(
> +   [
> +       @pg_basebackup_defs, '-X', 'stream', '-D', "$tempdir/sigchld",
> +       '-r', '32', '-d', $node->connstr('postgres')
> +   ],
> I would recommend the use of long options here as a matter to
> self-document what this does, and add a comment explaining why
> --max-rate is preferable, mainly for fast machines.

Fair enough, done.
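
As a rough sketch of the long-option form (not the attached patch), the argument list from the quoted hunk could be spelled as below. The values of `@pg_basebackup_defs`, `$tempdir` and `$connstr` are hypothetical placeholders for what the test harness provides; `--wal-method`, `--pgdata`, `--max-rate` and `--dbname` are the long forms of `-X`, `-D`, `-r` and `-d`.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for values the real test gets from its harness.
my @pg_basebackup_defs = ('pg_basebackup', '--no-sync');
my $tempdir            = '/tmp/example';
my $connstr            = 'port=5432 dbname=postgres';

my @cmd = (
    @pg_basebackup_defs,
    '--wal-method', 'stream',
    '--pgdata',     "$tempdir/sigchld",
    # Throttle the transfer so that, even on a fast machine, the backup is
    # presumably still running when the test terminates the walsender.
    '--max-rate', '32',
    '--dbname',   $connstr,
);

print join(' ', @cmd), "\n";

# The test would hand an array reference like \@cmd to IPC::Run::start.
```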

--
Daniel Gustafsson        https://vmware.com/

