On Mon, Nov 14, 2022 at 9:31 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> > > Whilte at it, I noticed that we report redo progress for PITR, but we
> > > don't report when standby enters archive recovery mode, say due to a
> > > failure in the connection to primary or after the promote signal is
> > > found. Isn't it useful to report in this case as well to know the
> > > recovery progress?
> >
> > I think your patch disables progress too early, effectively turning
> > off the standby progress feature. The purpose was to report on things
> > that take long periods during recovery, not just prior to recovery.
> >
> > I would advocate that we disable progress only while waiting, as I've done here:
> > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com
>
> Maybe I'm confused here, but I think that, on a standby, startup
> progress messages are only printed until the main redo loop is
> reached. Otherwise, we would print a message on a standby every 10s
> forever, which seems like a thing that most users would not like. So I
> think that Bharath has the right idea here.
Yes, the idea is to disable the timeout on standby completely since we
actually don't report any recovery progress. Keeping it enabled,
unnecessarily calls startup_progress_timeout_handler() every
log_startup_progress_interval seconds i.e. 10 seconds. That's the
intention of the patch.
> I don't think that his patch is right in detail, though. I don't think
> the call to disable_timeout() needs to be conditional,
Yes, disable_timeout() returns if the timeout was previously disabled
i.e. all_timeouts[STARTUP_PROGRESS_TIMEOUT].active is false. I changed
it in the v2 patch.
> and I don't
> think the Assert is correct.
You're right. My intention there was to check if the timeout is
enabled while ereport_startup_progress() is called. In the v2 patch,
when we actually disable the timeout startup_progress_timer_expired
gets set to false and has_startup_progress_timeout_expired() just
returns in such a case.
> Also, I think that your patch has the
> right idea in encapsulating the disable_timeout() call inside a new
> function disable_startup_progress_timeout(), rather than having the
> details known directly by xlogrecovery.c.
Yes, I too like Simon's idea of {enable,
disable}_startup_progress_timeout functions, I utilized them in the v2
patch here.
I actually want to get rid of begin_startup_progress_phase() which now
becomes a thin wrapper calling disable and enable functions and ensure
the callers do follow enable()-report_progress()-disable() way to use
the feature, however I didn't code for that as it needs changes across
many files. If okay, I can code for that too. Thoughts?
Please review the v2 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com