Re: Is Recovery actually paused? - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Is Recovery actually paused?
Date
Msg-id CAFiTN-tN325vACNZL+Y1JYqDTi1Egu8wawrjuyE9yxvAgdqsjw@mail.gmail.com
Whole thread Raw
In response to Re: Is Recovery actually paused?  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Is Recovery actually paused?
List pgsql-hackers
On Mon, Nov 30, 2020 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Nov 30, 2020 at 12:17 PM Yugo NAGATA <nagata@sraoss.co.jp> wrote:
>
> Thanks for looking into this.
>
> > On Thu, 22 Oct 2020 20:36:48 +0530
> > Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > On Thu, Oct 22, 2020 at 7:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 22, 2020 at 6:59 AM Kyotaro Horiguchi
> > > > <horikyota.ntt@gmail.com> wrote:
> > > > >
> > > > > At Wed, 21 Oct 2020 11:14:24 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
> > > > > > On Wed, Oct 21, 2020 at 7:16 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > > One idea could be, if the recovery process is waiting for WAL and a
> > > > > > > recovery pause is requested then we can assume that the recovery is
> > > > > > > paused because before processing the next wal it will always check
> > > > > > > whether the recovery pause is requested or not.
> > > > > ..
> > > > > > However, it might be better to implement this by having the system
> > > > > > absorb the pause immediately when it's in this state, rather than
> > > > > > trying to detect this state and treat it specially.
> > > > >
> > > > > The paused state is shown in pg_stat_activity.wait_event and it is
> > > > > strange that pg_is_wal_replay_paused() is inconsistent with the
> > > > > column.
> > > >
> > > > Right
> > > >
> > > > To make them consistent, we need to call recoveryPausesHere()
> > > > > at the end of WaitForWALToBecomeAvailable() and let
> > > > > pg_wal_replay_pause() call WakeupRecovery().
> > > > >
> > > > > I think we don't need a separate function to find the state.
> > > >
> > > > The idea makes sense to me.  I will try to change the patch as per the
> > > > suggestion.
> > >
> > > Here is the patch based on this idea.
> >
> > I reviewd this patch.
> >
> > First, I made a recovery conflict situation using a table lock.
> >
> > Standby:
> > #= begin;
> > #= select * from t;
> >
> > Primary:
> > #= begin;
> > #= lock t in ;
> >
> > After this, WAL of the table lock cannot be replayed due to a lock acquired
> > in the standby.
> >
> > Second, during the delay, I executed pg_wal_replay_pause() and
> > pg_is_wal_replay_paused(). Then, pg_is_wal_replay_paused was blocked until
> > max_standby_streaming_delay was expired, and eventually returned true.
> >
> > I can also see the same behaviour by setting recovery_min_apply_delay.
> >
> > So, pg_is_wal_replay_paused waits for recovery to get paused and this works
> > successfully as expected.
> >
> > However, I wonder users don't expect pg_is_wal_replay_paused to wait.
> > Especially, if max_standby_streaming_delay is -1, this will be blocked forever,
> > although this setting may not be usual. In addition, some users may set
> > recovery_min_apply_delay for a large.  If such users call pg_is_wal_replay_paused,
> > it could wait for a long time.
> >
> > At least, I think we need some descriptions on document to explain
> > pg_is_wal_replay_paused could wait while a time.
>
> Ok

Fixed this, added some comments in .sgml as well as in function header

> > Also, how about adding a new boolean argument to pg_is_wal_replay_paused to
> > control whether this waits for recovery to get paused or not? By setting its
> > default value to true or false, users can use the old format for calling this
> > and the backward compatibility can be maintained.
>
> So basically, if the wait_recovery_pause flag is false then we will
> immediately return true if the pause is requested?  I agree that it is
> good to have an API to know whether the recovery pause is requested or
> not but I am not sure is it good idea to make this API serve both the
> purpose?  Anyone else have any thoughts on this?
>
> >
> > As another comment, while pg_is_wal_replay_paused is blocking, I can not cancel
> > the query. I think CHECK_FOR_INTERRUPTS() is necessary in the waiting loop.
> >
> >
> > +                   errhint("Recovery control functions can only be executed during recovery.")));
> >
> > There are a few tabs at the end of this line.
>
> I will fix.

Fixed this as well.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Greg Nancarrow
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Next
From: Dilip Kumar
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)