Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing. - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing. |
Date | |
Msg-id | 20210519.164352.1786666023204393975.horikyota.ntt@gmail.com |
In response to | Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing. (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses | Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing. |
List | pgsql-hackers |
At Wed, 19 May 2021 16:21:58 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>
>
> On 2021/05/19 15:25, Kyotaro Horiguchi wrote:
> > At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar
> > <dilipbalaut@gmail.com> wrote in
> >> On Wed, May 19, 2021 at 10:16 AM Fujii Masao
> >> <masao.fujii@oss.nttdata.com> wrote:
> >>>
> >>> On 2021/05/18 15:46, Michael Paquier wrote:
> >>>> On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> >>>>> Currently a promotion causes all available WAL to be replayed before
> >>>>> a standby becomes a primary whether it was in paused state or not.
> >>>>> OTOH, something like immediate promotion (i.e., standby becomes
> >>>>> a primary without replaying outstanding WAL) might be useful for
> >>>>> some cases. I don't object to that.
> >>>>
> >>>> Sounds like a "promotion immediate" mode. It does not sound difficult
> >>>> nor expensive to add a small test for that in one of the existing
> >>>> recovery tests triggering a promotion. Could you add one based on
> >>>> pg_get_wal_replay_pause_state()?
> >>>
> >>> You're thinking to add the test like the following?
> >>> #1. Pause the recovery
> >>> #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> >>> #3. Trigger standby promotion
> >>> #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> >>>
> >>> It seems not easy to do the test #4 stably because
> >>> pg_get_wal_replay_pause_state() needs to be executed
> >>> before the promotion finishes.
> >>
> >> Even for #2, we can not ensure that whether it will be 'paused' or
> >> 'pause requested'.
> > We often use poll_query_until() to make sure some desired state is
> > reached.
>
> Yes.
>
> > And, as Michael suggested, the function
> > pg_get_wal_replay_pause_state() still works at the time of
> > recovery_end_command. So a bit more detailed steps are:
>
> IMO this idea is tricky and fragile, so I'm inclined to avoid that if
> possible.

Agreed, the recovery_end_command would be something like the following
avoiding dependency on sh. However, I'm not sure it works as well on
Windows..

recovery_end_command='perl -e "while( -f \'$trigfile\') {sleep 0.1;}"'

> Attached is the POC patch to add the following tests.
>
> #1. Check that pg_get_wal_replay_pause_state() reports "not paused" at first.
> #2. Request to pause archive recovery and wait until it's actually paused.
> #3. Request to resume archive recovery and wait until it's actually resumed.
> #4. Request to pause archive recovery and wait until it's actually paused.
>     Then, check that the paused state ends and promotion continues
>     if a promotion is triggered while recovery is paused.
>
> In #4, pg_get_wal_replay_pause_state() is not executed while promotion
> is ongoing. #4 checks that pg_is_in_recovery() returns false and
> the promotion finishes expectedly in that case. Isn't this test enough
> for now?

+1 for adding some tests for pg_wal_replay_pause() but the test seems
like checking only that pg_get_wal_replay_pause_state() returns the
expected state value. Don't we need to check that the recovery is
actually paused and that the promotion happens at expected LSN?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
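
For readers following the thread, here is a rough, self-contained Python sketch of the polling pattern discussed in this message: request a pause, poll until the state is really 'paused' (it may report 'pause requested' first), then promote and confirm that recovery has ended. This is only a simulation under stated assumptions. `FakeStandby`, `poll_until`, and `run_scenario` are hypothetical names invented for illustration; they are not PostgreSQL APIs and not the POC patch, which would use the Perl TAP framework and poll_query_until() against a real standby.

```python
import time
from typing import Callable, Tuple


def poll_until(get_state: Callable[[], str], expected: str,
               timeout: float = 5.0, interval: float = 0.01) -> bool:
    """Call get_state() repeatedly until it returns `expected` or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeStandby:
    """Hypothetical stand-in for a standby server (assumption, not a real API)."""

    def __init__(self) -> None:
        self._state = "not paused"
        self._polls_before_paused = 0
        self.in_recovery = True

    def request_pause(self) -> None:
        # The pause is only requested here; it takes effect a little later.
        self._state = "pause requested"
        self._polls_before_paused = 3

    def promote(self) -> None:
        # In this model, promotion ends the paused state and ends recovery.
        self._state = "not paused"
        self.in_recovery = False

    def pause_state(self) -> str:
        # Simulate the delay between 'pause requested' and 'paused'.
        if self._state == "pause requested":
            if self._polls_before_paused == 0:
                self._state = "paused"
            else:
                self._polls_before_paused -= 1
        return self._state


def run_scenario() -> Tuple[str, str, bool, str, bool]:
    standby = FakeStandby()
    initial = standby.pause_state()
    standby.request_pause()
    right_after_request = standby.pause_state()
    reached_paused = poll_until(standby.pause_state, "paused")
    standby.promote()
    return (initial, right_after_request, reached_paused,
            standby.pause_state(), standby.in_recovery)
```

The point of polling rather than checking once is the one Dilip raises in the quoted text: immediately after the request, the state may still be 'pause requested', so a single check would be unstable.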