Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
Date
Msg-id 20210519.152529.1656496447023808157.horikyota.ntt@gmail.com
In response to Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
List pgsql-hackers
At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in 
> On Wed, May 19, 2021 at 10:16 AM Fujii Masao
> <masao.fujii@oss.nttdata.com> wrote:
> >
> > On 2021/05/18 15:46, Michael Paquier wrote:
> > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> > >> Currently a promotion causes all available WAL to be replayed before
> > >> a standby becomes a primary whether it was in paused state or not.
> > >> OTOH, something like immediate promotion (i.e., standby becomes
> > >> a primary without replaying outstanding WAL) might be useful for
> > >> some cases. I don't object to that.
> > >
> > > Sounds like a "promotion immediate" mode.  It does not sound difficult
> > > nor expensive to add a small test for that in one of the existing
> > > recovery tests triggering a promotion.  Could you add one based on
> > > pg_get_wal_replay_pause_state()?
> >
> > You're thinking to add the test like the following?
> > #1. Pause the recovery
> > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> > #3. Trigger standby promotion
> > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> >
> > It seems not easy to do the test #4 stably because
> > pg_get_wal_replay_pause_state() needs to be executed
> > before the promotion finishes.
> 
> Even for #2, we cannot be sure whether it will be 'paused' or
> 'pause requested'.

We usually use poll_query_until() to wait until some desired state is
reached.  And, as Michael suggested, the function
pg_get_wal_replay_pause_state() still works while
recovery_end_command is running.  So somewhat more detailed steps
would be:

#0. Configure the server with a recovery_end_command that waits for
    some trigger, then start the server.
#1. Pause the recovery
#2. Wait until pg_get_wal_replay_pause_state() returns 'paused'
#3. Trigger standby promotion
#4. Wait until pg_get_wal_replay_pause_state() returns 'not paused'
#5. Trigger recovery_end_command to let promotion proceed.
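
To illustrate the polling idea only, here is a minimal sketch in
Python.  It is not the Perl TAP helper poll_query_until() itself, and
the names poll_until(), scripted_states() and wait_for_trigger_file()
are hypothetical, introduced just for this example.  The state source
is a scripted stand-in for querying pg_get_wal_replay_pause_state(),
and the trigger-file wait is one assumed way a recovery_end_command
could block until step #5.

```python
import os
import time


def poll_until(fetch_state, expected, timeout_s=5.0, interval_s=0.01):
    """Call fetch_state() repeatedly until it returns `expected`.

    Returns True if the expected value was seen before the timeout,
    False otherwise.  This mirrors the idea of waiting for a desired
    state instead of checking it once.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_state() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def scripted_states(states):
    """Hypothetical stand-in for running a query against a standby.

    Each call yields the next scripted value; once the script is
    exhausted, the last value is repeated.
    """
    it = iter(states)
    last = None

    def fetch():
        nonlocal last
        last = next(it, last)
        return last

    return fetch


def wait_for_trigger_file(path, timeout_s=60.0, interval_s=0.1):
    """Block until `path` exists (assumed trigger mechanism for step #0)."""
    return poll_until(lambda: os.path.exists(path), True, timeout_s, interval_s)


if __name__ == "__main__":
    # Step #2: the state may pass through 'pause requested' first,
    # so wait for 'paused' rather than checking once.
    after_pause = scripted_states(["not paused", "pause requested", "paused"])
    print(poll_until(after_pause, "paused"))

    # Step #4: after promotion is triggered, wait for 'not paused'
    # while the recovery_end_command is still blocked.
    after_promote = scripted_states(["paused", "not paused"])
    print(poll_until(after_promote, "not paused"))
```

In a real test the scripted source would be replaced by an actual
query against the standby, and the trigger file would be created by
the test script when it reaches step #5.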

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


