Re: when the startup process doesn't - Mailing list pgsql-hackers

From Andres Freund
Subject Re: when the startup process doesn't
Date
Msg-id 20210420202827.5qv6lji2wmmvkytc@alap3.anarazel.de
Whole thread Raw
In response to when the startup process doesn't  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: when the startup process doesn't  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hi,

On 2021-04-19 13:55:13 -0400, Robert Haas wrote:
> Another possible approach would be to accept connections for
> monitoring purposes even during crash recovery. We can't allow access
> to any database at that point, since the system might not be
> consistent, but we could allow something like a replication connection
> (the non-database-associated variant). Maybe it would be precisely a
> replication connection and we'd just refuse all but a subset of
> commands, or maybe it would be some other kinds of thing. But either
> way you'd be able to issue a command in some mini-language saying "so,
> tell me how startup is going" and it would reply with a result set of
> some kind.

The hard part about this seems to be how to perform authentication -
obviously we can't do catalog lookups for users at that time.

If that weren't the issue, we could easily do much better than now, by
just providing an errdetail() with recovery progress information. But we
presumably don't want to spray such information to unauthenticated
connection attempts.


I've vaguely wondered before whether it'd be worth having something like
an "admin" socket somewhere in the data directory. Which explicitly
wouldn't require authentication, have the cluster owner as the user,
etc.  That'd not just be useful for monitoring during recovery, but also
make some interactions with the server easier for admin tools I think.



> If I had to pick one of these two ideas, I'd pick the one the
> log-based solution, since it seems easier to access and simplifies
> retrospective analysis, but I suspect SQL access would be quite useful
> for some users too, especially in cloud environments where "just log
> into the machine and have a look" is not an option.

However, leaving aside the implementation effort, the crazy idea
above would not easily address the issue of only being accessible with
local access...

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google)
Next
From: Andres Freund
Date:
Subject: Re: when the startup process doesn't