Re: please update ps display for recovery checkpoint - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: please update ps display for recovery checkpoint
Msg-id 20201002072814.GE1464@paquier.xyz
In response to Re: please update ps display for recovery checkpoint  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: please update ps display for recovery checkpoint  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Sat, Sep 19, 2020 at 11:00:31AM -0500, Justin Pryzby wrote:
> Maybe it's a bad idea if the checkpointer is continuously changing its display.
> I don't see the utility in it, since log_checkpoints does more than ps could
> ever do.  I'm concerned that would break things for someone using something
> like pgrep.

At the end of recovery, there is a code path where the startup process
triggers the checkpoint by itself if the bgwriter is not launched, but
there is also a second code path where, if the bgwriter is started and
the cluster is not promoted, the startup process requests an
immediate checkpoint and then waits for it.  It is IMO equally
important to update the display of the checkpointer in this case to
show that the checkpointer is running an end-of-recovery checkpoint.

> Related: I have always thought that this message meant "recovery will complete
> Real Soon", but I now understand it to mean "beginning the recovery checkpoint,
> which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).

Yep.  And at the end of crash recovery, seconds feel like minutes.

I agree that "checkpointer checkpoint" is not the best fit.  Using
parentheses would also be inconsistent with the other usages of this
API in the backend code.  What about adding "running" then?  This
would give "checkpointer running end-of-recovery checkpoint".

While looking at this patch, I was tempted to use a StringInfo to build
the string to display, as that would make it easier to add any extra
information later, giving the attached.
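
To illustrate the idea outside the backend, here is a rough standalone
sketch, not the attached patch.  DisplayBuf and its helpers are
hypothetical stand-ins for StringInfo, checkpoint_display() is a made-up
name, the flag values are only illustrative, and the "shutdown" case is
an assumed example of extra information that could be appended later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the backend's checkpoint flag bits (illustrative values). */
#define CHECKPOINT_IS_SHUTDOWN		0x0001
#define CHECKPOINT_END_OF_RECOVERY	0x0002

/* Minimal growable string, a hypothetical stand-in for StringInfo. */
typedef struct DisplayBuf
{
	char	   *data;
	size_t		len;
	size_t		cap;
} DisplayBuf;

static void
display_buf_init(DisplayBuf *buf)
{
	buf->cap = 32;
	buf->len = 0;
	buf->data = malloc(buf->cap);
	if (buf->data == NULL)
		abort();
	buf->data[0] = '\0';
}

/* Append a NUL-terminated string, doubling the buffer as needed. */
static void
display_buf_append(DisplayBuf *buf, const char *s)
{
	size_t		add;

	assert(buf != NULL && s != NULL);
	add = strlen(s);
	while (buf->len + add + 1 > buf->cap)
	{
		buf->cap *= 2;
		buf->data = realloc(buf->data, buf->cap);
		if (buf->data == NULL)
			abort();
	}
	memcpy(buf->data + buf->len, s, add + 1);	/* copies the NUL too */
	buf->len += add;
}

/*
 * Build the activity part of the ps display for a checkpoint.  The
 * process title prefix ("checkpointer") is assumed to be added elsewhere.
 * The caller frees the returned string.
 */
static char *
checkpoint_display(int flags)
{
	DisplayBuf	buf;

	display_buf_init(&buf);
	display_buf_append(&buf, "running ");
	if (flags & CHECKPOINT_END_OF_RECOVERY)
		display_buf_append(&buf, "end-of-recovery ");
	else if (flags & CHECKPOINT_IS_SHUTDOWN)
		display_buf_append(&buf, "shutdown ");
	display_buf_append(&buf, "checkpoint");
	return buf.data;
}
```

Calling checkpoint_display(CHECKPOINT_END_OF_RECOVERY) yields the
activity string proposed above, and each additional piece of information
is one more append rather than a new format string.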
--
Michael

Attachment