Re: shared-memory based stats collector - v70 - Mailing list pgsql-hackers

From Greg Stark
Subject Re: shared-memory based stats collector - v70
Date
Msg-id CAM-w4HNLtcKxrG0WepCKjt5tizbvZRcrqABX=jNNZoi_18E5rw@mail.gmail.com
In response to Re: shared-memory based stats collector - v70  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Responses Re: shared-memory based stats collector - v70
Re: shared-memory based stats collector - v70
List pgsql-hackers
On Tue, 16 Aug 2022 at 08:49, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>
>
> +               if (p->key.kind != PGSTAT_KIND_RELATION)
> +                       continue;

Hm. So presumably this needs to be extended. Either to let the caller
decide which types of stats to return or to somehow return all the
stats intermixed. In my monitoring code I did the latter because I
didn't think going through the hash table repeatedly would be very
efficient. But it's definitely a pretty awkward API since I need a
switch statement that explicitly lists each case and casts the result.
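
For illustration, the "switch that lists each case and casts the result"
shape might look roughly like the sketch below. Every Demo* name here is a
hypothetical stand-in invented for the example; these are not the real
pgstat structs, nor the actual monitoring code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the stats kinds; not the real pgstat enum. */
typedef enum DemoStatKind
{
    DEMO_KIND_DATABASE,
    DEMO_KIND_RELATION,
    DEMO_KIND_FUNCTION
} DemoStatKind;

/* Hypothetical per-kind bodies, one field each to keep the sketch short. */
typedef struct DemoDbStats   { long xact_commit; }     DemoDbStats;
typedef struct DemoRelStats  { long tuples_inserted; } DemoRelStats;
typedef struct DemoFuncStats { long numcalls; }        DemoFuncStats;

/* One intermixed entry: a key plus an untyped pointer to the body. */
typedef struct DemoEntry
{
    DemoStatKind kind;
    unsigned int dboid;
    unsigned int objoid;
    const void  *body;      /* must match 'kind' */
} DemoEntry;

/*
 * Format one entry as text. Returns the snprintf() result, or -1 for a
 * kind this function does not know about. The awkward part is visible:
 * each case has to be listed explicitly and has to cast 'body' itself.
 */
static int
demo_format_entry(const DemoEntry *e, char *buf, size_t buflen)
{
    assert(e != NULL && buf != NULL);

    switch (e->kind)
    {
        case DEMO_KIND_DATABASE:
            return snprintf(buf, buflen, "db %u commits=%ld", e->dboid,
                            ((const DemoDbStats *) e->body)->xact_commit);
        case DEMO_KIND_RELATION:
            return snprintf(buf, buflen, "rel %u/%u inserted=%ld",
                            e->dboid, e->objoid,
                            ((const DemoRelStats *) e->body)->tuples_inserted);
        case DEMO_KIND_FUNCTION:
            return snprintf(buf, buflen, "func %u/%u calls=%ld",
                            e->dboid, e->objoid,
                            ((const DemoFuncStats *) e->body)->numcalls);
    }
    return -1;              /* unknown kind: caller decides what to do */
}
```

Letting the caller pass in the kinds it wants would avoid the casts on the
caller's side, at the cost of one pass over the table per kind.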

> > 2) When I did the function attached above I tried to avoid returning
> > the whole set and make it possible to process them as they arrive.
>
> Is it the way it has been done? (did not look at your function yet)

I did it with callbacks. It was quick and easy and convenient for my
use case. But in general I don't really like callbacks and would think
some kind of iterator-style API would be nicer.
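
To make the contrast concrete, here is a minimal sketch of the two API
shapes over a toy array-backed table. The Demo* types and functions are
hypothetical and exist only for this example; nothing here is an existing
pgstat interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy table standing in for the stats hash table. */
typedef struct DemoStat  { int kind; long value; } DemoStat;
typedef struct DemoTable { const DemoStat *items; size_t nitems; } DemoTable;

/* Callback style: the walker owns the loop and calls back per entry. */
typedef void (*demo_stat_cb) (const DemoStat *stat, void *arg);

static void
demo_foreach(const DemoTable *t, demo_stat_cb cb, void *arg)
{
    for (size_t i = 0; i < t->nitems; i++)
        cb(&t->items[i], arg);
}

/* Iterator style: the caller owns the loop and pulls entries. */
typedef struct DemoIter { const DemoTable *table; size_t pos; } DemoIter;

static void
demo_iter_init(DemoIter *it, const DemoTable *t)
{
    it->table = t;
    it->pos = 0;
}

/* Returns the next entry, or NULL once the table is exhausted. */
static const DemoStat *
demo_iter_next(DemoIter *it)
{
    if (it->pos >= it->table->nitems)
        return NULL;
    return &it->table->items[it->pos++];
}

/* The same computation written both ways. */
static void
demo_sum_cb(const DemoStat *stat, void *arg)
{
    *(long *) arg += stat->value;   /* state smuggled through void * */
}

static long
demo_sum_with_callback(const DemoTable *t)
{
    long sum = 0;

    demo_foreach(t, demo_sum_cb, &sum);
    return sum;
}

static long
demo_sum_with_iterator(const DemoTable *t)
{
    DemoIter it;
    const DemoStat *s;
    long sum = 0;

    demo_iter_init(&it, t);
    while ((s = demo_iter_next(&it)) != NULL)
        sum += s->value;            /* state is just a local variable */
    return sum;
}
```

With the iterator the caller's state stays in ordinary local variables and
the caller can stop early; with the callback it has to go through the
void * argument.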

I am handling the stats entries as they turn up. I'm constructing the
text output for each in a callback and buffering up the whole HTTP
response in a string buffer.

I think that's ok, but if I wanted to avoid buffering it up and do
network I/O directly, then I would think the thing to do would be to
build the list of entry keys and then loop over that list, doing a hash
lookup for each one, generating the response for it, and writing it out
to the network. That way there wouldn't be anything locked, not even
the hash table, while doing network I/O. It would mean a lot of
traffic on the hash table though.
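
A rough sketch of that two-phase approach, under heavy assumptions: the
"shared table" is a small fixed array, the lock is a plain flag that only
exists so the sketch can assert nothing is held during output, and all
Demo* names are hypothetical. It is not how the real shared hash table or
its locking works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX 8

/* Hypothetical key/slot layout standing in for the shared hash table. */
typedef struct DemoKey  { int kind; unsigned dboid; unsigned objoid; } DemoKey;
typedef struct DemoSlot { bool used; DemoKey key; long value; } DemoSlot;
typedef struct DemoShared { bool locked; DemoSlot slots[DEMO_MAX]; } DemoShared;

/* Toy lock: a flag, not a real LWLock or mutex. */
static void demo_lock(DemoShared *s)   { assert(!s->locked); s->locked = true; }
static void demo_unlock(DemoShared *s) { assert(s->locked);  s->locked = false; }

static bool
demo_key_eq(const DemoKey *a, const DemoKey *b)
{
    return a->kind == b->kind && a->dboid == b->dboid && a->objoid == b->objoid;
}

/* Phase 1: copy out just the keys while holding the lock. */
static size_t
demo_collect_keys(DemoShared *s, DemoKey *keys, size_t maxkeys)
{
    size_t n = 0;

    demo_lock(s);
    for (size_t i = 0; i < DEMO_MAX && n < maxkeys; i++)
        if (s->slots[i].used)
            keys[n++] = s->slots[i].key;
    demo_unlock(s);
    return n;
}

/* Phase 2 helper: one lookup per key, copying the value out under lock. */
static bool
demo_lookup(DemoShared *s, const DemoKey *key, long *value)
{
    bool found = false;

    demo_lock(s);
    for (size_t i = 0; i < DEMO_MAX; i++)
    {
        if (s->slots[i].used && demo_key_eq(&s->slots[i].key, key))
        {
            *value = s->slots[i].value;
            found = true;
            break;
        }
    }
    demo_unlock(s);
    return found;
}

/* Emit every entry; returns how many were written. */
static size_t
demo_emit_all(DemoShared *s, FILE *out)
{
    DemoKey keys[DEMO_MAX];
    size_t  nkeys = demo_collect_keys(s, keys, DEMO_MAX);
    size_t  written = 0;

    for (size_t i = 0; i < nkeys; i++)
    {
        long value;

        if (!demo_lookup(s, &keys[i], &value))
            continue;           /* entry went away since phase 1 */
        assert(!s->locked);     /* nothing is held while doing I/O */
        fprintf(out, "%d/%u/%u %ld\n", keys[i].kind, keys[i].dboid,
                keys[i].objoid, value);
        written++;
    }
    return written;
}
```

The cost is visible in the structure: one full pass to collect keys, then
one locked lookup per key, which is the extra traffic on the hash table.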

> > -- on that note I wonder if I've done something
> > wrong because I noted a few records with InvalidOid where I didn't
> > expect it.
>
> It looks like that InvalidOid for the dbid means that the entry is for a
> shared relation.

Ah yes. I had actually found that but forgotten it.

There's also a database entry with dboid=InvalidOid which is
apparently where background workers with no database attached report
stats.

> I've in mind to add some filtering on the dbid (I think it could be
> useful for monitoring tool with a persistent connection to one database
> but that wants to pull the stats database per database).
>
> I don't think a lookup through the local cache will work if the
> entry/key is related to a database other than the one the API is
> launched from.

Isn't there also a local hash table used to find the entries to reduce
traffic on the shared hash table? Even if you don't take a snapshot,
does the entry get entered there? There are definitely still parts of
this that I only have a pretty vague understanding of :/

-- 
greg


