On Wed, May 22, 2024 at 8:46 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Wed, May 22, 2024, at 8:19 AM, Amit Kapila wrote:
>
> > v2-0002: not changed
> >
>
> We have added more tries to see if the primary_slot_name becomes
> active but I think it is still fragile because it is possible on slow
> machines that the required slot didn't become active even after more
> retries. I have raised the same comment previously [2] and asked an
> additional question but didn't get any response.
>
>
> Following the same line that simplifies the code, we can: (a) add a loop in
> check_subscriber() that waits until walreceiver is available on subscriber or
> (b) use a timeout. The main advantage of (a) is that the primary slot is already
> available but I'm afraid we need an escape mechanism for the loop (timeout?).
>
Sorry, it is not clear to me why we need any additional loop in
check_subscriber(). Aren't we talking about the problem in the
check_publisher() function?

Why do we need to ensure that primary_slot_name is active on the
primary in the first place? You mentioned something related to WAL
retention, but I don't see how that is related to this tool's
functionality. If we are concerned about WAL retention on the primary
at all, the WAL that matters is the WAL corresponding to the
consistent_lsn computed by setup_publisher(), and this check doesn't
seem to ensure that it is retained.
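
To illustrate the kind of condition I have in mind, here is a rough,
standalone sketch. It is not code from the patch. It assumes that the
relevant question is whether a slot's restart_lsn is at or before
consistent_lsn, since a slot retains WAL from its restart_lsn onward.
The helper names parse_lsn() and wal_retained_from() are hypothetical,
and the LSNs are taken as text in the usual "hi/lo" hexadecimal form.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN given in the textual "hi/lo" hexadecimal form,
 * e.g. "0/3000148". Returns false if the string is malformed.
 * (Hypothetical helper, for illustration only; it does not guard
 * against halves longer than 8 hex digits.)
 */
static bool
parse_lsn(const char *str, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;
	char		junk;

	/* Exactly two fields must match; a third means trailing garbage. */
	if (sscanf(str, "%X/%X%c", &hi, &lo, &junk) != 2)
		return false;

	*lsn = ((uint64_t) hi << 32) | lo;
	return true;
}

/*
 * Assumption: WAL needed from consistent_lsn is still retained if the
 * slot's restart_lsn is not ahead of consistent_lsn. Malformed input
 * is treated as "not retained".
 */
static bool
wal_retained_from(const char *slot_restart_lsn, const char *consistent_lsn)
{
	uint64_t	restart;
	uint64_t	consistent;

	if (!parse_lsn(slot_restart_lsn, &restart) ||
		!parse_lsn(consistent_lsn, &consistent))
		return false;

	return restart <= consistent;
}
```

For example, wal_retained_from("0/3000000", "0/3000148") would be true,
whereas a restart_lsn of "1/0" against a consistent_lsn of "0/FFFFFFFF"
would be false. Whether the slot is active at that moment does not enter
into this comparison.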
--
With Regards,
Amit Kapila.