Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Date
Msg-id wujfqghd2uucboqjgo2igtsedrfp7mbif7iavu52asvfr72fju@27jevcaoz555
Whole thread Raw
In response to Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible  (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
List pgsql-hackers
Hi,

On 2024-11-07 10:44:25 -0800, Jacob Champion wrote:
> On Thu, Nov 7, 2024 at 10:12 AM Andres Freund <andres@anarazel.de> wrote:
> > I don't understand why the pgstat_bestart()/pgstat_bestart_pre_auth() split
> > makes sense. The latter is going to redo most of the work that the former
> > did. What's the point of that?
> >
> > Why not have a new function that initializes just the missing additional
> > information? Or for that matter, why not move most of what pgstat_bestart()
> > does into pgstat_beinit()?
> 
> I talk about that up above [1]. I agree that this is all complicated
> and fragile, but at the moment, I think splitting things apart is not
> going to reduce the complexity in any way. I'm all ears for a
> different approach, though (and it sounds like Michael is taking a
> stab at it too).

I think the patch should not be merged as-is. It's just too ugly and fragile.


> > This doesn't really seem like it's actually using wait events to describe
> > waits. The new wait events cover stuff like memory allocations etc, see
> > e.g. pg_SSPI_make_upn().
> 
> I've also asked about the "scope" of the waits in the OP [2]. I can
> move them downwards in the stack, if you'd prefer.

Well, right now they're just not actually wait events, so yes, they'd need to
be moved down.

I think it might make more sense to use pgstat_report_activity() or such here,
rather than using wait events to describe something that's not a wait.


> > This isn't just pedantry - all the relevant code really needs to be rewritten
> > to allow the blocking to happen in an interruptible way, otherwise
> > authentication timeout etc can't realiably work. Once that's done you can
> > actually define useful wait events too.
> 
> I agree that would be amazing! I'm not about to tackle reliable
> interrupts for all of the current blocking auth code for v18, though.
> I'm just trying to make it observable when we do something that
> blocks.

Well, with that justification we could end up adding wait events for large
swaths of code that might not actually ever wait.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
Next
From: Nathan Bossart
Date:
Subject: Re: Popcount optimization using AVX512