RE: speed up a logical replica setup - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: speed up a logical replica setup
Date
Msg-id TY3PR01MB9889FD26E1DE89FC8D51973EF5432@TY3PR01MB9889.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: speed up a logical replica setup  ("Euler Taveira" <euler@eulerto.com>)
Responses Re: speed up a logical replica setup
List pgsql-hackers
Dear Euler,

Thanks for giving comments! I want to reply some of them.

>
I didn't provide the whole explanation. I'm envisioning the use case that pg_ctl
doesn't reach the consistent state and the timeout is reached (the consequence
is that pg_createsubscriber aborts the execution). It might occur on a busy
server. The probability that it occurs with the current code is low (LSN gap
for recovery is small). Maybe I'm anticipating issues when the base backup
support is added but better to raise concerns during development.
>

Hmm, actually I didn't know the case. Thanks for explanation. I want to see
how you describe on the doc.

>
pg_upgrade doesn't but others do like pg_rewind, pg_resetwal, pg_controldata,
pg_checksums. It seems newer tools tend to provide short and long options.
>

Oh, you are right.

>
Nothing? If you interrupt the execution, there will be objects left behind and
you, as someone that decided to do it, have to clean things up. What do you
expect this tool to do? The documentation will provide some guidance informing
the object name patterns this tool uses and you can check for leftover objects.
Someone can argue that is a valid feature request but IMO it is not one in the
top of the list.
>

OK, so let's keep current style.

>
Why? Are you suggesting that the dry run mode covers just the verification
part? If so, it is not a dry run mode. I would expect it to run until the end
(or until it accomplish its goal) but *does not* modify data. For pg_resetwal,
the modification is one of the last steps and the other ones (KillFoo
functions) that are skipped modify data. It ends the dry run mode when it
accomplish its goal (obtain the new control data values). If we stop earlier,
some of the additional steps won't be covered by the dry run mode and a failure
can happen but could be detected if you run a few more steps.
>

Yes, it was my expectation. I'm still not sure which operations can detect by the
dry_run, but we can keep it for now.

>
Why? See [1]. I prefer the kind mode (always wait until the recovery ends) but
you and Amit are proposing a more aggressive mode. The proposal (-t 60) seems
ok right now, however, if the goal is to provide base backup support in the
future, you certainly should have to add the --recovery-timeout in big clusters
or those with high workloads because base backup is run between replication slot
creation and consistent LSN. Of course, we can change the default when base
backup support is added.
>

Sorry, I was missing your previous post. Let's keep yours.

>
Good point. I included a check for pg_create_subscription role and CREATE
privilege on the specified database.
>

Not sure, but can we do the replication origin functions by these privilege?
According to the doc[1], these ones seem not to be related.

[1]: https://www.postgresql.org/docs/devel/functions-admin.html#FUNCTIONS-REPLICATION

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/




pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: set_cheapest without checking pathlist
Next
From: James Coleman
Date:
Subject: Re: Parallelize correlated subqueries that execute within each worker