On Tue, Feb 21, 2023 at 8:01 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Feb 21, 2023 at 03:22:58PM +0900, Kyotaro Horiguchi wrote:
> > At Tue, 21 Feb 2023 12:51:51 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> >> commit 7ff23c6d277d1d90478a51f0dd81414d343f3850
> >> Author: Thomas Munro <tmunro@postgresql.org>
> >> Date: Mon Aug 2 17:32:20 2021 +1200
> >>
> >> Run checkpointer and bgwriter in crash recovery.
> >>
> >> Start up the checkpointer and bgwriter during crash recovery (except in
> >> --single mode), as we do for replication. This wasn't done back in
> >> commit cdd46c76 out of caution. Now it seems like a better idea to make
> >> the environment as similar as possible in both cases. There may also be
> >> some performance advantages.
> >
> > Wouldn't just previnting restartpoints during crash recovery mean
> > making this change inneffective? If something wrong actually happens
> > when restartpoints happen during crash recovery, shouldn't we identify
> > the cause and let restartpoints run safely during crash recovery?
>
> I will let Thomas comment on that (now added in CC of this thread),
> because, honestly, I am not sure.
Thanks, catching up with this thread.
> Now, it seems to me that the original intent of starting the
> checkpointer this early is to be able to unify the code path for the
> end-of-recovery checkpoint, primarily, which is a fine goal in itself
> because the startup process can query a checkpoint rather than doing
> things itself. It is the fact of starting early the bgwriter that
> showed most of the improvements in crash recovery with some aggressive
> tuning setups, under specific workloads like heavy INSERTs.
>
> This report is telling us that restartpoints during crash recovery may
> not be a good idea without more careful consideration, even if the
> original intent of 7ff23c6 is to allow them in the crash recovery
> case. I am not sure about this last point. (Restartpoint is
> mentioned once in the thread of this commit, FWIW.)
I started by wanting to have the bgwriter active. For that, the
checkpointer must also be active, to have a shared fsync request queue
(since both the startup process and the bgwriter create fsync
requests, it wouldn't work if the startup process had a local
"pendingOps"). Looking into this.