Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From shveta malik
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAJpy0uA2M0DmUMRJ6VZkcuPWdgnwd6m5jGqfiBG4Y6Nm6dumiw@mail.gmail.com
In response to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Wed, Oct 25, 2023 at 3:15 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On 10/25/23 5:00 AM, shveta malik wrote:
> > On Tue, Oct 24, 2023 at 11:54 AM Drouvot, Bertrand
> > <bertranddrouvot.pg@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> On 10/23/23 2:56 PM, shveta malik wrote:
> >>> On Mon, Oct 23, 2023 at 5:52 PM Drouvot, Bertrand
> >>> <bertranddrouvot.pg@gmail.com> wrote:
> >>
> >>>> We are waiting for DEFAULT_NAPTIME_PER_CYCLE (3 minutes) before checking if there
> >>>> is new synced slot(s) to be created on the standby. Do we want to keep this behavior
> >>>> for V1?
> >>>>
> >>>
> >>> I think for the slotsync workers case, we should reduce the naptime in
> >>> the launcher to say 30sec and retain the default one of 3mins for
> >>> subscription apply workers. Thoughts?
> >>>
> >>
> >> Another option could be to keep DEFAULT_NAPTIME_PER_CYCLE and create a new
> >> API on the standby that would refresh the list of sync slot at wish, thoughts?
> >>
> >
> > Do you mean API to refresh list of DBIDs rather than sync-slots?
> > As per current design, launcher gets DBID lists for all the failover
> > slots from the primary at intervals of DEFAULT_NAPTIME_PER_CYCLE.
>
> I mean an API to get a newly created slot on the primary being created/synced on
> the standby at wish.
>
> Also let's imagine this scenario:
>
> - create logical_slot1 on the primary (and don't start using it)
>
> Then on the standby we'll get things like:
>
> 2023-10-25 08:33:36.897 UTC [740298] LOG:  waiting for remote slot "logical_slot1" LSN (0/C00316A0) and catalog xmin (752) to pass local slot LSN (0/C0049530) and and catalog xmin (754)
>
> That's expected and due to the fact that ReplicationSlotReserveWal() does set the slot
> restart_lsn to a value < at the corresponding restart_lsn slot on the primary.
>
> - create logical_slot2 on the primary (and start using it)
>
> Then logical_slot2 won't be created/synced on the standby until there is activity on logical_slot1 on the primary
> that would produce things like:
>
> 2023-10-25 08:41:35.508 UTC [740298] LOG:  wait over for remote slot "logical_slot1" as its LSN (0/C005FFD8) and catalog xmin (756) has now passed local slot LSN (0/C0049530) and catalog xmin (754)
>
> With this new dedicated API, it will be:
>
> - clear that the API call is "hanging" until there is some activity on the newly created slot
> (currently there is "waiting for remote slot " message in the logfile as mentioned above but
> I'm not sure that's enough)
>
> - be possible to create/sync logical_slot2 in the example above without waiting for activity
> on logical_slot1.
>
> Maybe we should change our current algorithm during slot creation so that a newly created inactive
> slot on the primary does not block other newly created "active" slots on the primary to be created
> on the standby? Depending on how we implement that, the new API may not be needed at all.
>
> Thoughts?
>

I discussed this with my colleague Hou-San, and we think one
possibility is to speed up the advancement of restart_lsn on the
primary. This can be done by connecting to the remote and executing
pg_log_standby_snapshot() at reasonable intervals while the standby
is waiting during slot creation. That should speed things up to a
reasonable extent without having to wait for the user or bgwriter to
do the same for us. Logical decoding currently uses a similar
approach to speed up slot creation; I am referring to the use of
LogStandbySnapshot() in SnapBuildWaitSnapshot() and
ReplicationSlotReserveWal().
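
To make the idea a bit more concrete, here is a rough, standalone
sketch of the wait loop with a periodic nudge. This is not code from
the patch: SlotPos, wait_for_remote_slot() and the fake_* helpers are
made-up names, and the "remote" is a toy in-memory stand-in. In a real
implementation the nudge would presumably be a
"SELECT pg_log_standby_snapshot()" sent to the primary, the loop would
sleep between polls, and the xmin comparison would need to be
wraparound-aware (e.g. TransactionIdPrecedes()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for XLogRecPtr / TransactionId (assumption). */
typedef struct SlotPos
{
    uint64_t restart_lsn;
    uint32_t catalog_xmin;
} SlotPos;

/* Hypothetical hooks: read the remote slot position, nudge the primary. */
typedef SlotPos (*FetchRemoteFn) (void *ctx);
typedef void (*NudgeFn) (void *ctx);

/* True once the remote slot has reached or passed the local slot.
 * Plain integer comparison; ignores xid wraparound for brevity. */
static bool
remote_has_passed_local(SlotPos remote, SlotPos local)
{
    return remote.restart_lsn >= local.restart_lsn &&
           remote.catalog_xmin >= local.catalog_xmin;
}

/*
 * Poll the remote slot up to max_polls times. Every nudge_every polls
 * (0 disables nudging) call nudge(), which models running
 * pg_log_standby_snapshot() on the primary. Returns the poll number
 * on which the wait ended, or -1 if it never did.
 */
static int
wait_for_remote_slot(SlotPos local, FetchRemoteFn fetch, NudgeFn nudge,
                     void *ctx, int max_polls, int nudge_every)
{
    assert(fetch != NULL);

    for (int poll = 1; poll <= max_polls; poll++)
    {
        if (remote_has_passed_local(fetch(ctx), local))
            return poll;

        if (nudge != NULL && nudge_every > 0 && poll % nudge_every == 0)
            nudge(ctx);

        /* A real worker would sleep / wait on its latch here. */
    }
    return -1;
}

/* Toy primary: an idle slot that only advances when nudged. */
typedef struct FakeRemote
{
    SlotPos pos;
    int     nudges;
} FakeRemote;

static SlotPos
fake_fetch(void *ctx)
{
    return ((FakeRemote *) ctx)->pos;
}

static void
fake_nudge(void *ctx)
{
    FakeRemote *r = (FakeRemote *) ctx;

    r->pos.restart_lsn += 150;  /* arbitrary illustrative step */
    r->pos.catalog_xmin += 2;
    r->nudges++;
}
```

With nudging disabled, the toy idle slot never catches up and the loop
gives up; with a nudge every second poll it finishes on the third poll.
Whether the extra WAL record on the primary is acceptable at a given
interval is of course the part that needs discussion.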
Thoughts?

thanks
Shveta


