Re: speed up a logical replica setup - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: speed up a logical replica setup
Msg-id 7aa94b4a-9139-4b24-a184-dd17b38a0c8f@app.fastmail.com
In response to Re: speed up a logical replica setup  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, May 23, 2024, at 5:54 AM, Amit Kapila wrote:
On Wed, May 22, 2024 at 8:46 PM Euler Taveira <euler@eulerto.com> wrote:
>
> Following the same line that simplifies the code, we can: (a) add a loop in
> check_subscriber() that waits until walreceiver is available on subscriber or
> (b) use a timeout. The main advantage of (a) is that the primary slot is already
> available but I'm afraid we need an escape mechanism for the loop (timeout?).
>

Sorry, it is not clear to me why we need any additional loop in
check_subscriber(), aren't we speaking about the problem in
check_publisher() function?

The idea is to use check_subscriber() to check pg_stat_walreceiver. Once this
view returns a row and primary_slot_name is set on the standby, the referenced
replication slot should be active on the primary. Hence, the query in
check_publisher() makes sure that the referenced replication slot is in use on
the primary.
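
A minimal sketch of that sequence, written in Python for brevity rather than in
the tool's C, and not taken from any patch: poll the standby until
pg_stat_walreceiver returns a row (with a timeout as the escape mechanism), then
ask the primary whether that slot is active. The function names, the default
timeout, and the `standby`/`primary` callables are hypothetical; the `%s`
placeholder assumes a psycopg-style driver. Only the catalog views and their
columns come from PostgreSQL.

```python
import time
from typing import Callable, Optional, Sequence

# Hypothetical helper type: runs one SQL statement with parameters and
# returns the result rows as a list of tuples.
Query = Callable[[str, Sequence], list]

WALRECEIVER_SQL = "SELECT slot_name FROM pg_catalog.pg_stat_walreceiver"
SLOT_ACTIVE_SQL = (
    "SELECT active FROM pg_catalog.pg_replication_slots "
    "WHERE slot_name = %s AND slot_type = 'physical'"
)


def wait_for_walreceiver(standby: Query, timeout: float = 60.0,
                         interval: float = 1.0,
                         clock=time.monotonic,
                         sleep=time.sleep) -> Optional[str]:
    """Return the slot name reported by the walreceiver, or None on timeout."""
    deadline = clock() + timeout
    while True:
        rows = standby(WALRECEIVER_SQL, ())
        # A row with a non-empty slot_name means the walreceiver is running
        # and streaming through a replication slot.
        if rows and rows[0][0]:
            return rows[0][0]
        if clock() >= deadline:
            return None  # escape mechanism: give up instead of looping forever
        sleep(interval)


def slot_in_use_on_primary(primary: Query, slot_name: str) -> bool:
    """True if the physical slot exists on the primary and is active."""
    rows = primary(SLOT_ACTIVE_SQL, (slot_name,))
    return bool(rows) and bool(rows[0][0])
```

Passing the clock and sleep functions as parameters is only there to make the
loop easy to exercise without a running server.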

Why in the first place do we need to ensure that primary_slot_name is
active on the primary? You mentioned something related to WAL
retention but I don't know how that is related to this tool's
functionality. If at all, we are bothered about WAL retention on the
primary that should be the WAL corresponding to consistent_lsn
computed by setup_publisher() but this check doesn't seem to ensure
that.

Maybe it is a lot of checks. I'm afraid there isn't a simple way to get the
replication slot used by physical replication and make sure it is really in
use. I mean, if primary_slot_name = 'foo' is set on the standby, there is no
guarantee that the replication slot 'foo' exists on the primary. The idea is to
get the exact replication slot name used by physical replication in order to
drop it. Once I post a patch, it should be clear. (Another idea is to relax this
check and rely only on primary_slot_name to drop this replication slot on the
primary. The replication slot might not exist, and that shouldn't raise an error
in this case.)
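
For illustration only, the relaxed alternative could look roughly like the
sketch below (Python, not the tool's C, and not the patch). It calls
pg_drop_replication_slot() through a SELECT over pg_replication_slots, so a
missing slot yields zero rows instead of an error. The function name and the
`primary` callable are hypothetical, and the `%s` placeholder assumes a
psycopg-style driver.

```python
from typing import Callable, Sequence

# Hypothetical helper type: runs one SQL statement with parameters and
# returns the result rows as a list of tuples.
Query = Callable[[str, Sequence], list]

# The drop function is evaluated once per matching row, so when no slot with
# that name exists nothing is dropped and no error is raised.
DROP_IF_EXISTS_SQL = (
    "SELECT pg_catalog.pg_drop_replication_slot(slot_name) "
    "FROM pg_catalog.pg_replication_slots "
    "WHERE slot_name = %s"
)


def drop_primary_slot_if_exists(primary: Query, slot_name: str) -> bool:
    """Drop the slot named by primary_slot_name; return True if it existed.

    Note: pg_drop_replication_slot() still fails if the slot is active, so
    this assumes the standby has already stopped streaming from it.
    """
    rows = primary(DROP_IF_EXISTS_SQL, (slot_name,))
    return len(rows) > 0
```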


--
Euler Taveira
