Re: Switching XLog source from archive to streaming when primary available - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Switching XLog source from archive to streaming when primary available
Date
Msg-id CALj2ACV0ZkXggzMrfx6pi_MGTR-rGOY3AOUwvUP1pg=iE6+LwA@mail.gmail.com
Whole thread Raw
In response to Re: Switching XLog source from archive to streaming when primary available  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Switching XLog source from archive to streaming when primary available
List pgsql-hackers
On Sat, Sep 10, 2022 at 3:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Well, for wal_keep_size, using bytes makes sense.  Given you know how much
> disk space you have, you can set this parameter accordingly to avoid
> retaining too much of it for standby servers.  For your proposed parameter,
> it's not so simple.  The same setting could have wildly different timing
> behavior depending on the server.  I still think that a timeout is the most
> intuitive.

Hm. In v3 patch, I've used the timeout approach, but tracking the
duration server spends in XLOG_FROM_ARCHIVE as opposed to tracking
last failed time in streaming from primary.

> So I think it
> would be difficult for the end user to decide on a value.  However, even
> the timeout approach has this sort of problem.  If your parameter is set to
> 1 minute, but the current archive takes 5 minutes to recover, you won't
> really be testing streaming replication once a minute.  That would likely
> need to be documented.

Added a note in the docs.

On Fri, Sep 9, 2022 at 10:46 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Being late for the party.

Thanks for reviewing this.

> It seems to me that the function is getting too long.  I think we
> might want to move the core part of the patch into another function.

Yeah, the WaitForWALToBecomeAvailable() (without this patch) has
around 460 LOC out of which WAL fetching from the chosen source is of
240 LOC, IMO, this code will be a candidate for a new function. I
think that part can be discussed separately.

Having said that, I moved the new code to a new function.

> I think it might be better if intentionalSourceSwitch doesn't need
> lastSourceFailed set. It would look like this:
>
> >  if (lastSourceFailed || switchSource)
> >  {
> >     if (nonblocking && lastSourceFailed)
> >        return XLREAD_WOULDBLOCK;

I think the above looks good, done that way in the latest patch.

> I don't think the flag first_time is needed.

Addressed this in the v4 patch.

Please review the attached v4 patch addressing above review comments.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)
Next
From: Etsuro Fujita
Date:
Subject: Re: list of acknowledgments for PG15