Re: Add session statistics to pg_stat_database - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Add session statistics to pg_stat_database
Date
Msg-id 2099659e5d679d070f0bb2edb573f296f3c9813f.camel@cybertec.at
Whole thread Raw
In response to Re: Add session statistics to pg_stat_database  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Add session statistics to pg_stat_database  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
On Tue, 2020-11-17 at 17:33 +0100, Magnus Hagander wrote:
> I've taken a look as well, and here are a few short notes:

Much appreciated!

> * It talks about "number of connections" but "number of aborted sessions". We should probably
>   be consistent about talking either about connections or sessions? In particular, connections
>   seems wrong in this case, because it only starts counting after authentication is complete
>   (since otherwise we send no stats)? (This goes for both docs and actual function names)

Yes, that is true.  I have changed "connections" to "sessions" and renamed the new
column "connections" to "session_count".

I think that most people will understand a session as started after a successful
connection.

> * Is there a reason we're counting active and idle in transaction (including aborted),
>   but not fastpath? In particular, we seem to ignore fastpath -- if we don't want to single
>   it out specifically, it should probably be included in active?

The only reason is that I didn't think of it.  Fixed.

> * pgstat_send_connstat() but pgstat_recv_connection(). Let's call both connstat or both
>   connection (I'd vote connstat)?

Agreed, done.

> * Is this actually a fix that's independent of the new stats? It seems in general to be
>   changing the behaviour of "force", which is more generic?
> -               !have_function_stats)
> +               !have_function_stats && !force)

The comment right above that reads:
/* Don't expend a clock check if nothing to do */
So it is just a quick exit if there is nothing to do.

But with that patch we have something to do if "force" (see below) is true:
Report the remaining session duration and if the session was closed normally.

Thus the additional check.

> * in pgstat_send_connstat() you pass the parameter "force" in as "disconnect".
>   That behaviour at least requires a comment saying why, I think. My understanding is
>   it relies on that "force" means this is a "backend is shutting down", but that is not
>   actually documented anywhere. Maybe the "force" parameter should actually be renamed
>   to indicate this is really what it means, to avoid a future mistake in the area?
>   But even with that, how does that turn into disconnect?

"pgstat_report_stat(true)" is only called from "pgstat_beshutdown_hook()", so
it is currently only called when the backend is about to exit.

According the the comments the flag means that "caller wants to force stats out".
I guess that the author thought that there may arise other reasons to force sending
statistics in the future (commit 641912b4d from 2007).

However, since that has not happened, I have renamed the flag to "disconnect" and
adapted the documentation.  This doesn't change the current behavior, but establishes
a new rule.

> * Maybe rename pgStatSessionDisconnected to pgStatSessionNormalDisconnected?
>   To avoid having to go back to the setting point and look it up in a comment.

Long, descriptive names are a good thing.
I have decided to use "pgStatSessionDisconnectedNormally", since that is even longer
and seems to fit the "yes or no" category better.
 
> I wonder if there would also be a way to count "sessions that crashed" as well.
>  That is,the ones that failed in a way that caused the postmaster to restart the system.
>  But that's information we'd have to send from the postmaster, but I'm actually unsure
>  if we're "allowed" to send things to the stats collector from the postmaster.
>  But I think it could be quite useful information to have. Maybe we can find some way
>  to piggyback on the fact that we're restarting the stats collector as a result?

Sure, a crash count would be useful.  I don't know if it is easy for the stats collector
to tell the difference between a start after a backend crash and - say - starting from
a base backup.

Patch v6 attached.

I think that that would be material for another patch, and I don't think it should go
to "pg_stat_database", because a) it might be hard to tell to which database the crashed
backend was attached, b) it might be a background process that doesn't belong to a database
and c) if the crash were caused by - say - corruption in a shared catalog, it would be
misleading.

Yours,
Laurenz Albe

Attachment

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE
Next
From: Simon Riggs
Date:
Subject: Re: VACUUM (DISABLE_PAGE_SKIPPING on)