Re: shared-memory based stats collector - v70 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: shared-memory based stats collector - v70 |
Date | |
Msg-id | 20220809164748.6omys6j64yf2a2lr@awork3.anarazel.de |
In response to | Re: shared-memory based stats collector - v70 (Greg Stark <stark@mit.edu>) |
Responses |
Re: shared-memory based stats collector - v70
Re: shared-memory based stats collector - v70 |
List | pgsql-hackers |
Hi,

On 2022-08-09 12:00:46 -0400, Greg Stark wrote:
> I was more aiming at a C function that extensions could use directly
> rather than an SQL function -- though I suppose having the former it
> would be simple enough to implement the latter using it. (though it
> would have to be one for each stat type I guess)

I think such a C extension could exist today, without patching core code? It'd
be a bit ugly to include pgstat_internal.h, I guess, but other than that...

> The reason I want a C function is I'm trying to get as far as I can
> without a connection to a database, without a transaction, without
> accessing the catalog, and as much as possible without taking locks.

I assume you don't include lwlocks under locks?

> I think this is important for making monitoring highly reliable and low
> impact on production.

I'm doubtful about that, but whatever.

> The main problem with my current code is that I'm accessing the shared
> memory hash table directly. This means the I'm possibly introducing
> locking contention on the shared memory hash table.

I don't think that's a large enough issue to worry about unless you're
polling at a very high rate, which'd be a bad idea in itself. If a backend
can't get the lock for some stats change it'll defer flushing the stats a bit,
so it'll not cause a lot of other problems.

> I'm thinking of separating the shared memory hash scan from the metric scan
> so the list can be quickly built minimizing the time the lock is held.

I'd really really want to see some evidence that any sort of complexity here
is worth it.

> I have a few things I would like to suggest for future improvements to
> this infrastructure. I haven't polished the details of it yet but the
> main thing I think I'm missing is the catalog name for the object. I
> don't want to have to fetch it from the catalog and in any case I
> think it would generally be useful and might regularize the
> replication slot handling too.

I'm *dead* set against including catalog names in shared memory stats. That'll
add a good amount of memory usage and complexity, without any sort of
commensurate gain.

> I also think it would be nice to have a change counter for every stat
> object, or perhaps a change time. Prometheus wouldn't be able to make
> use of it but other monitoring software might be able to receive only
> metrics that have changed since the last update which would really
> help on databases with large numbers of mostly static objects.

I think you're proposing adding overhead that doesn't even have a real user.

Greetings,

Andres Freund