Re: shared-memory based stats collector - v70 - Mailing list pgsql-hackers
From | Drouvot, Bertrand
Subject | Re: shared-memory based stats collector - v70
Date |
Msg-id | 1c56f4c6-61ce-d131-553c-297326e20bf1@amazon.com
In response to | Re: shared-memory based stats collector - v70 (Greg Stark <stark@mit.edu>)
Responses | Re: shared-memory based stats collector - v70
List | pgsql-hackers
Hi,

On 8/15/22 4:46 PM, Greg Stark wrote:
> On Thu, 11 Aug 2022 at 02:11, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>> As Andres was not -1 about that idea (as it should be low cost to add
>> and maintain) as long as somebody cares enough to write something: then
>> I'll give it a try and submit a patch for it.
>
> I agree it would be a useful feature. I think there may be things to
> talk about here though.
>
> 1) Are you planning to go through the local hash table and
> LocalSnapshot and obey the consistency mode? I was thinking a flag
> passed to build_snapshot to request global mode might be sufficient
> instead of a completely separate function.

I think the new API should behave as PGSTAT_FETCH_CONSISTENCY_NONE (as
querying across all the databases increases the risk of having to deal
with a "large" number of objects).

I have in mind to do something along those lines (it still needs some
filtering, an extra check on permissions, ...):

+       dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
+       while ((p = dshash_seq_next(&hstat)) != NULL)
+       {
+               Datum           values[PG_STAT_GET_ALL_TABLES_STATS_COLS];
+               bool            nulls[PG_STAT_GET_ALL_TABLES_STATS_COLS];
+               PgStat_StatTabEntry *tabentry = NULL;
+
+               MemSet(values, 0, sizeof(values));
+               MemSet(nulls, false, sizeof(nulls));
+
+               if (p->key.kind != PGSTAT_KIND_RELATION)
+                       continue;
+               if (p->dropped)
+                       continue;
+
+               stats_data = dsa_get_address(pgStatLocal.dsa, p->body);
+               LWLockAcquire(&stats_data->lock, LW_SHARED);
+               tabentry = pgstat_get_entry_data(PGSTAT_KIND_RELATION, stats_data);
+
+               values[0] = ObjectIdGetDatum(p->key.dboid);
+               values[1] = ObjectIdGetDatum(p->key.objoid);
+               values[2] = Int64GetDatum(tabentry->tuples_inserted);
.
.
+               tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
+               LWLockRelease(&stats_data->lock);
+       }
+       dshash_seq_term(&hstat);

What do you think?

> 2) When I did the function attached above I tried to avoid returning
> the whole set and make it possible to process them as they arrive.

Is that the way it has been done? (I did not look at your function yet.)

> I actually was hoping to get to the point where I could start shipping
> out network data as they arrive and not even buffer up the response,
> but I think I need to be careful about hash table locking then.

If using dshash_seq_next(), the already returned elements are locked.
But I guess you would like to unlock them (if you are able to process
them as they arrive)?

> 3) The key difference here is that we're returning whatever stats are
> in the hash table rather than using the catalog to drive a list of id
> numbers to look up.

Right.

> I guess the API should make it clear this is what is being returned

Right. I think we'll end up with a set of relation ids (not their
names) and their associated stats.

> -- on that note I wonder if I've done something
> wrong because I noted a few records with InvalidOid where I didn't
> expect it.

It looks like InvalidOid for the dbid means that the entry is for a
shared relation. Where did you see them (while not expecting them)?

> 4) I'm currently looping over the hash table returning the records all
> intermixed. Some users will probably want to do things like "return
> all Relation records for all databases" or "return all Index records
> for database id xxx". So some form of filtering may be best or perhaps
> a way to retrieve just the keys so they can then be looked up one by
> one (through the local cache?).

I have in mind to add some filtering on the dbid (I think it could be
useful for a monitoring tool with a persistent connection to one
database but that wants to pull the stats database per database); see
the sketch below.
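Something like this inside the loop above could do it (just a sketch;
the dboid parameter name is hypothetical, with InvalidOid meaning "no
filtering"):

+               /*
+                * Hypothetical dbid filter: skip entries that do not
+                * belong to the requested database.
+                */
+               if (OidIsValid(dboid) && p->key.dboid != dboid)
+                       continue;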
I don't think a lookup through the local cache will work if the
entry/key is related to a database other than the one the API is
launched from.

> 5) On that note I'm not clear how the local cache will interact with
> these cross-database lookups. That should probably be documented...

Yeah, I don't think that would work (if by local cache you mean what is
in the relcache).

Regards,

--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com