Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Date
Msg-id 4095ceb328780d1bdba77ac6d9785fd7577ed047.camel@cybertec.at
Whole thread Raw
In response to Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead  (Andres Freund <andres@anarazel.de>)
Responses Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
List pgsql-hackers
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
> On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
> > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
> > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
> > > a separate timestamp in pgstat_send_connstats() to compute the difference from
> > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
> > > return value.
> > 
> > Makes sense to me.  How about passing "now", which was just initialized from
> > GetCurrentTransactionStopTimestamp(), as additional parameter to
> > pgstat_send_connstats() and use that value instead of taking the current time?
> 
> Yes.

Here is a patch for that.

> > > I'm also not all that happy with sending yet another UDP packet for this.
> > 
> > Are you suggesting that connection statistics should be shoehorned into
> > some other statistics message?  That would reduce the number of UDP packets,
> > but it sounds ugly and confusing to me.
> 
> That ship already has sailed. Look at struct PgStat_MsgTabstat
> 
> Given that we transport number of commits/commits, block read/write time
> adding the time the connection was active/inactive doesn't really seem like it
> makes things meaningfully worse?

Point taken.

I looked at the other statistics sent in pgstat_report_stat(), and I see
none that are sent unconditionally.  Are you thinking of this:

    /*
     * Send partial messages.  Make sure that any pending xact commit/abort
     * gets counted, even if there are no table stats to send.
     */
    if (regular_msg.m_nentries > 0 ||
        pgStatXactCommit > 0 || pgStatXactRollback > 0)
        pgstat_send_tabstat(®ular_msg);
    if (shared_msg.m_nentries > 0)
        pgstat_send_tabstat(&shared_msg);

I can't think of a way to hack this up that wouldn't make my stomach turn.

> > > Alternatively we could just not send an update to connection stats every 500ms
> > > for every active connection, but only do so at connection end. The database
> > > stats would only contain stats for disconnected sessions, while the stats for
> > > "live" connections are maintained via backend_status.c.
> > 
> > That was my original take, but if I remember right, Magnus convinced me that
> > it would be more useful to have statistics for open sessions as well.
> > With a connection pool, connections can stay open for a very long time,
> > and the accuracy and usefulness of the statistics would become questionable.
> 
> That's not a contradiction to what I propose. Having the data available via
> backend_status.c allows to sum up the data for active connections and the data
> for past connections.
> 
> I think it's also just cleaner to not constantly report changing preliminary
> data as pgstat_send_connstats() does.

Currently, the data are kept in static variables in the backend process.
That would have to change for such an approach, right?

> Doubling the number of UDP messages in common workloads seems also problematic
> enough that it should be addressed for 14.

Ok, but I don't know how to go about it.

Yours,
Laurenz Albe

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: PoC/WIP: Extended statistics on expressions
Next
From: Amit Kapila
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o