
Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works. - Mailing list pgsql-hackers

From	Kyotaro Horiguchi
Subject	Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.
Date	June 30, 2020 00:14:23
Msg-id	20200630.091423.1542668690374745736.horikyota.ntt@gmail.com
In response to	RE: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works. ("higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com>)
Responses	Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.
List	pgsql-hackers


Oops! I misunderstood that.

At Mon, 29 Jun 2020 13:00:25 +0000, "higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com> wrote in 
> Fujii-san, thank you for comments.
> 
> >The cause of this problem is that the checkpointer's sleep time is calculated
> >from both checkpoint_timeout and archive_timeout during normal running,
> >but calculated only from checkpoint_timeout during recovery. So Daisuke-san's
> >patch tries to change that so that it's calculated from both of them even
> >during recovery. No?
> 
> Yes, it's exactly so.
> 
> >last_xlog_switch_time is not updated during recovery. So "elapsed_secs" can be
> >large and cur_timeout can be negative. Isn't this problematic?
> 
> Yes... My patch was missing this.

The patch also causes WaitLatch to be called with a zero timeout,
which triggers an assertion failure.

> How about using the original archive_timeout value for calculating cur_timeout during recovery?
> 
>                 if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
>                 {
>                         elapsed_secs = now - last_xlog_switch_time;
>                         if (elapsed_secs >= XLogArchiveTimeout)
>                                 continue;               /* no sleep for us ... */
>                         cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
>                 }
> +               else if (XLogArchiveTimeout > 0)
> +                       cur_timeout = Min(cur_timeout, XLogArchiveTimeout);
> 
> During recovery, an accurate cur_timeout is not calculated because elapsed_secs is not used.
> However, after recovery is complete, WAL archiving will start by the time the next archive_timeout is reached.
> I felt this is enough to solve the problem.

That causes an unwanted change of cur_timeout during recovery.
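
To make the effect concrete, here is a minimal stand-alone sketch of the
sleep-time calculation quoted above. It is a simplified model, not the
actual checkpointer code: the function name, its parameters, the
SKETCH_MIN macro, and the use_proposed_else_branch flag are all
hypothetical, and a return value of 0 stands in for the "continue"
(no sleep) paths.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical model of the checkpointer's sleep-time calculation.
 * All times are in seconds.  Returns the number of seconds to sleep,
 * or 0 when the loop would "continue" without sleeping.
 */
static int
sketch_sleep_secs(int checkpoint_timeout, int archive_timeout,
                  int since_last_checkpoint, int since_last_switch,
                  bool in_recovery, bool use_proposed_else_branch)
{
    int cur_timeout;

    assert(checkpoint_timeout > 0 && archive_timeout >= 0);

    /* Time left until the next timed checkpoint. */
    if (since_last_checkpoint >= checkpoint_timeout)
        return 0;
    cur_timeout = checkpoint_timeout - since_last_checkpoint;

    if (archive_timeout > 0 && !in_recovery)
    {
        /* Normal running: also honor archive_timeout. */
        if (since_last_switch >= archive_timeout)
            return 0;
        cur_timeout = SKETCH_MIN(cur_timeout,
                                 archive_timeout - since_last_switch);
    }
    else if (archive_timeout > 0 && use_proposed_else_branch)
    {
        /* Proposed branch: clamp to archive_timeout during recovery. */
        cur_timeout = SKETCH_MIN(cur_timeout, archive_timeout);
    }

    return cur_timeout;
}
```

Under this model, with checkpoint_timeout = 300 and archive_timeout = 60,
the sleep during recovery is 300 seconds without the else-branch and 60
seconds with it, so the checkpointer would wake up more often while still
in recovery.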

> >As another approach, what about waking the checkpointer up at the end of
> >recovery like we already do for walsenders?

We don't want to change the checkpoint interval during recovery, which
means we cannot consider archive_timeout at the first checkpoint after
recovery ends. So I think that the suggestion from Fujii-san is the
right direction.
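
As a rough illustration of why a wakeup at the end of recovery helps,
the following sketch models only the timing. It is not PostgreSQL code:
the function and parameter names are hypothetical, and it assumes that
the sleep length was planned during recovery from checkpoint_timeout
alone.

```c
#include <stdbool.h>

/*
 * Hypothetical timing model (all values in seconds).
 *
 * The checkpointer starts a sleep of planned_sleep seconds at
 * sleep_started_at while recovery is still in progress.  Recovery ends
 * at recovery_ends_at.  The return value is the time at which the
 * checkpointer next recomputes its timeout, which is the first point
 * where archive_timeout could be taken into account.
 */
static int
sketch_next_recompute_time(int sleep_started_at, int planned_sleep,
                           int recovery_ends_at, bool wake_at_recovery_end)
{
    int natural_wakeup = sleep_started_at + planned_sleep;

    /* An explicit wakeup cuts the sleep short when recovery ends. */
    if (wake_at_recovery_end && recovery_ends_at < natural_wakeup)
        return recovery_ends_at;

    return natural_wakeup;
}
```

For example, with a 300-second sleep started at time 0 and recovery
ending at time 10, the model recomputes the timeout at 300 without the
wakeup and at 10 with it, while the sleep length used during recovery
itself stays unchanged.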

> If the above solution is not good, I will consider this approach.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
