pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead - Mailing list pgsql-hackers

From Andres Freund
Subject pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Date
Msg-id 20210801205501.nyxzxoelqoo4x2qc@alap3.anarazel.de
Whole thread Raw
Responses Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
List pgsql-hackers
Hi,

Since

commit 960869da0803427d14335bba24393f414b476e2c
Author: Magnus Hagander <magnus@hagander.net>
Date:   2021-01-17 13:34:09 +0100

    Add pg_stat_database counters for sessions and session time

pgstat_report_stat() does another timestamp computation via
pgstat_send_connstats(), despite typically having computed one just a few
lines before.

Given that timestamp computation isn't all that cheap, that's not great. Even
more, that additional timestamp computation makes things *less* accurate:

void
pgstat_report_stat(bool disconnect)
...
    /*
     * Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL
     * msec since we last sent one, or the backend is about to exit.
     */
    now = GetCurrentTransactionStopTimestamp();
    if (!disconnect &&
        !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
        return;

    /* for backends, send connection statistics */
    if (MyBackendType == B_BACKEND)
        pgstat_send_connstats(disconnect, last_report);

    last_report = now;

and then pgstat_send_connstats() does:
    /* session time since the last report */
    TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report),
                        GetCurrentTimestamp(),
                        &secs, &usecs);
    msg.m_session_time = secs * 1000000 + usecs;

We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.


I'm also not all that happy with sending yet another UDP packet for this. This
doubles the UDP packets to the stats collector in the common cases (unless
more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
been accessed). We already send plenty of "summary" information via
PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
separate message for connection stats?


Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c.  That'd give us *more*
information for less costs, because we then could see idle/active times for
individual connections.

That'd likely be more change than what we would want to do at this point in
the release cycle though. But I think we ought to do something about the
increased overhead...

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Gilles Darold
Date:
Subject: Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace
Next
From: Andres Freund
Date:
Subject: Re: slab allocator performance issues