Re: shared-memory based stats collector - v70 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: shared-memory based stats collector - v70
Date
Msg-id 20220809164748.6omys6j64yf2a2lr@awork3.anarazel.de
Whole thread Raw
In response to Re: shared-memory based stats collector - v70  (Greg Stark <stark@mit.edu>)
Responses Re: shared-memory based stats collector - v70
Re: shared-memory based stats collector - v70
List pgsql-hackers
Hi,

On 2022-08-09 12:00:46 -0400, Greg Stark wrote:
> I was more aiming at a C function that extensions could use directly
> rather than an SQL function -- though I suppose having the former it
> would be simple enough to implement the latter using it. (though it
> would have to be one for each stat type I guess)

I think such a C extension could exist today, without patching core code? It'd
be a bit ugly to include pgstat_internal.h, I guess, but other than that...


> The reason I want a C function is I'm trying to get as far as I can
> without a connection to a database, without a transaction, without
> accessing the catalog, and as much as possible without taking locks.

I assume you don't include lwlocks under locks?


> I think this is important for making monitoring highly reliable and low
> impact on production.

I'm doubtful about that, but whatever.


> The main problem with my current code is that I'm accessing the shared
> memory hash table directly. This means the I'm possibly introducing
> locking contention on the shared memory hash table.

I don't think that's a large enough issue to worry about unless you're
polling at a very high rate, which'd be a bad idea in itself. If a backend
can't get the lock for some stats change it'll defer flushing the stats a bit,
so it'll not cause a lot of other problems.


> I'm thinking of separating the shared memory hash scan from the metric scan
> so the list can be quickly built minimizing the time the lock is held.

I'd really really want to see some evidence that any sort of complexity here
is worth it.


> I have a few things I would like to suggest for future improvements to
> this infrastructure. I haven't polished the details of it yet but the
> main thing I think I'm missing is the catalog name for the object. I
> don't want to have to fetch it from the catalog and in any case I
> think it would generally be useful and might regularize the
> replication slot handling too.

I'm *dead* set against including catalog names in shared memory stats. That'll
add a good amount of memory usage and complexity, without any sort of
comensurate gain.


> I also think it would be nice to have a change counter for every stat
> object, or perhaps a change time. Prometheus wouldn't be able to make
> use of it but other monitoring software might be able to receive only
> metrics that have changed since the last update which would really
> help on databases with large numbers of mostly static objects.

I think you're proposing adding overhead that doesn't even have a real user.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: moving basebackup code to its own directory
Next
From: Andres Freund
Date:
Subject: Re: shared-memory based stats collector - v70