Re: when the startup process doesn't (logging startup delays) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: when the startup process doesn't (logging startup delays)
Date
Msg-id CA+TgmoY-0AUppRhiOGg1Bev+CwE_ZTajCDAY8B-wSLuw0KGC0w@mail.gmail.com
Whole thread Raw
In response to Re: when the startup process doesn't (logging startup delays)  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: when the startup process doesn't (logging startup delays)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Sun, Jul 25, 2021 at 1:56 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Fri, Jul 23, 2021 at 04:09:47PM +0530, Nitin Jadhav wrote:
> > > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, path);
> > > when action == datadir_fsync_fname.
> >
> > I agree and fixed it.
>
> I saw that you fixed it by calling InitStartupProgress() after the walkdir()
> calls which do pre_sync_fname.  So then walkdir is calling
> LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing fsync,
> and then LogStartupProgress() is returning because !AmStartupProcess().
>
> That seems indirect, fragile, and confusing.  I suggest that walkdir() should
> take an argument for which operation to pass to LogStartupProgress().  You can
> pass a special enum for cases where nothing should be logged, like
> STARTUP_PROCESS_OP_NONE.

I don't think walkdir() has any business calling LogStartupProgress()
at all. It's supposed to be a generic function, not one that is only
available to be called from the startup process, or has different
behavior there. From my point of view, the right thing is to put the
logging calls into the particular callbacks that SyncDataDirectory()
uses.

> On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote:
> > 1) I still don't see the need for two functions LogStartupProgress and
> > LogStartupProgressComplete. Most of the code is duplicate. I think we
> > can just do it with a single function something like [1]:
>
> I agree that one function can do this more succinctly.  I think it's best to
> use a separate enum value for START operations and END operations.

Maybe I'm missing something here, but I don't understand the purpose
of this. You can always combine two functions into one, but it's only
worth doing if you end up with less code, which doesn't seem to be the
case here. The strings are all different and that's most of the
function, and the other stuff that gets done isn't the same either, so
you'd just end up with a bunch of if-statements. That doesn't seem
like an improvement.

Thinking further, I guess I understand it from the caller's
perspective. It's not necessarily clear why in some places we call
LogStartupProgress() and others LogStartupProgressComplete(). Someone
might expect a function with "complete" in the name like that to only
be called once at the very end, rather than once at the end of a
phase, and it does sort of make sense that you'd want to call one
function everywhere rather than sometimes one and sometimes the other
... but I'm not exactly sure how to get there from here. Having only
LogStartupProgress() but having it do a giant if-statement to figure
out whether we're mid-phase or end-of-phase does not seem like the
right approach.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: SQL/JSON: JSON_TABLE
Next
From: Tom Lane
Date:
Subject: Re: Skip temporary table schema name from explain-verbose output.