Re: Split index and table statistics into different types of stats - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: Split index and table statistics into different types of stats
Date: 2023-11-13
Msg-id: 20231113204439.r4lmys73tessqmak@awork3.anarazel.de
In response to: Re: Split index and table statistics into different types of stats ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses: Re: Split index and table statistics into different types of stats
List: pgsql-hackers
Hi,

On 2023-11-13 09:26:56 +0100, Drouvot, Bertrand wrote:
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -799,11 +799,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
>  	 * Read the buffer, and update pgstat counters to reflect a cache hit or
>  	 * miss.
>  	 */
> -	pgstat_count_buffer_read(reln);
> +	if (reln->rd_rel->relkind == RELKIND_INDEX)
> +		pgstat_count_index_buffer_read(reln);
> +	else
> +		pgstat_count_table_buffer_read(reln);

It's not nice from a layering POV that we need this level of awareness in
bufmgr.c. I wonder if this is an argument for first splitting out stats like
blocks_hit, blocks_fetched into something like "relfilenode stats" - they're
agnostic of the relkind.

There aren't that many such stats right now, admittedly, but I think we'll
want to also track dirtied, written blocks on a per relation basis once we
can (i.e. we key the relevant stats by relfilenode instead of oid, so we can
associate stats when writing out buffers).

> +/*
> + * Initialize a relcache entry to count access statistics. Called whenever an
> + * index is opened.
> + *
> + * We assume that a relcache entry's pgstatind_info field is zeroed by
> + * relcache.c when the relcache entry is made; thereafter it is long-lived
> + * data.
> + *
> + * This does not create a reference to a stats entry in shared memory, nor
> + * allocate memory for the pending stats. That happens in
> + * pgstat_assoc_index().
> + */
> +void
> +pgstat_init_index(Relation rel)
> +{
> +	/*
> +	 * We only count stats for indexes
> +	 */
> +	Assert(rel->rd_rel->relkind == RELKIND_INDEX);
> +
> +	if (!pgstat_track_counts)
> +	{
> +		if (rel->pgstatind_info != NULL)
> +			pgstat_unlink_index(rel);
> +
> +		/* We're not counting at all */
> +		rel->pgstat_enabled = false;
> +		rel->pgstatind_info = NULL;
> +		return;
> +	}
> +
> +	rel->pgstat_enabled = true;
> +}
> +
> +/*
> + * Prepare for statistics for this index to be collected.
> + *
> + * This ensures we have a reference to the stats entry before stats can be
> + * generated. That is important because an index drop in another
> + * connection could otherwise lead to the stats entry being dropped, which
> + * then later would get recreated when flushing stats.
> + *
> + * This is separate from pgstat_init_index() as it is not uncommon for
> + * relcache entries to be opened without ever getting stats reported.
> + */
> +void
> +pgstat_assoc_index(Relation rel)
> +{
> +	Assert(rel->pgstat_enabled);
> +	Assert(rel->pgstatind_info == NULL);
> +
> +	/* Else find or make the PgStat_IndexStatus entry, and update link */
> +	rel->pgstatind_info = pgstat_prep_index_pending(RelationGetRelid(rel),
> +													rel->rd_rel->relisshared);
> +
> +	/* don't allow link a stats to multiple relcache entries */
> +	Assert(rel->pgstatind_info->relation == NULL);
> +
> +	/* mark this relation as the owner */
> +	rel->pgstatind_info->relation = rel;
> +}
> +
> +/*
> + * Break the mutual link between a relcache entry and pending index stats
> + * entry. This must be called whenever one end of the link is removed.
> + */
> +void
> +pgstat_unlink_index(Relation rel)
> +{
> +
> +	if (rel->pgstatind_info == NULL)
> +		return;
> +
> +	/* link sanity check for the index stats */
> +	if (rel->pgstatind_info)
> +	{
> +		Assert(rel->pgstatind_info->relation == rel);
> +		rel->pgstatind_info->relation = NULL;
> +		rel->pgstatind_info = NULL;
> +	}
> +}
> ...

This is a fair bit of duplicated code - perhaps we could have shared helpers?

> +/* ----------
> + * PgStat_IndexStatus			Per-index status within a backend
> + *
> + * Many of the event counters are nontransactional, ie, we count events
> + * in committed and aborted transactions alike. For these, we just count
> + * directly in the PgStat_IndexStatus.
> + * ----------
> + */
> +typedef struct PgStat_IndexStatus
> +{
> +	Oid			r_id;			/* relation's OID */
> +	bool		r_shared;		/* is it a shared catalog? */
> +	struct PgStat_IndexXactStatus *trans;	/* lowest subxact's counts */
> +	PgStat_IndexCounts counts;	/* event counts to be sent */
> +	Relation	relation;		/* rel that is using this entry */
> +} PgStat_IndexStatus;
> +
>  /* ----------
>   * PgStat_TableXactStatus		Per-table, per-subtransaction status
>   * ----------
> @@ -227,6 +264,29 @@ typedef struct PgStat_TableXactStatus
>  } PgStat_TableXactStatus;
>
>
> +/* ----------
> + * PgStat_IndexXactStatus		Per-index, per-subtransaction status
> + * ----------
> + */
> +typedef struct PgStat_IndexXactStatus
> +{
> +	PgStat_Counter tuples_inserted; /* tuples inserted in (sub)xact */
> +	PgStat_Counter tuples_updated;	/* tuples updated in (sub)xact */
> +	PgStat_Counter tuples_deleted;	/* tuples deleted in (sub)xact */
> +	bool		truncdropped;	/* relation truncated/dropped in this
> +								 * (sub)xact */
> +	/* tuples i/u/d prior to truncate/drop */
> +	PgStat_Counter inserted_pre_truncdrop;
> +	PgStat_Counter updated_pre_truncdrop;
> +	PgStat_Counter deleted_pre_truncdrop;
> +	int			nest_level;		/* subtransaction nest level */
> +	/* links to other structs for same relation: */
> +	struct PgStat_IndexXactStatus *upper;	/* next higher subxact if any */
> +	PgStat_IndexStatus *parent; /* per-table status */
> +	/* structs of same subxact level are linked here: */
> +	struct PgStat_IndexXactStatus *next;	/* next of same subxact */
> +} PgStat_IndexXactStatus;

I don't think much of this is used? It doesn't look like you're using most of
the fields. Which makes sense - there's not really the same transactional
behaviour for indexes as there is for tables.

Greetings,

Andres Freund
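[Editor's note: the "relfilenode stats" idea discussed above can be sketched in miniature. This is a hypothetical, self-contained illustration, not PostgreSQL code - the names RelFileNodeCounts, FakeRelation, count_buffer_read, and count_buffer_hit are invented for the example. The point it shows: the buffer-manager-level call site bumps relkind-agnostic counters, and any table-vs-index split would happen later, at stats-flush time, where the relkind is known.]

```c
/*
 * Hypothetical sketch of relkind-agnostic "relfilenode stats":
 * counters like blocks_fetched/blocks_hit that apply equally to
 * tables and indexes, so bufmgr.c needs no RELKIND_INDEX branch.
 */
typedef struct RelFileNodeCounts
{
	long		blocks_fetched; /* buffer reads requested */
	long		blocks_hit;		/* reads satisfied from shared buffers */
} RelFileNodeCounts;

/* Stand-in for a relcache entry carrying pending per-relfilenode counts. */
typedef struct FakeRelation
{
	RelFileNodeCounts counts;
} FakeRelation;

/*
 * What the ReadBufferExtended() call site would look like: a single
 * counting call, no relkind dispatch. Attributing the counts to table
 * vs. index stats would be done by the stats machinery at flush time.
 */
static void
count_buffer_read(FakeRelation *reln)
{
	reln->counts.blocks_fetched++;
}

static void
count_buffer_hit(FakeRelation *reln)
{
	reln->counts.blocks_hit++;
}
```

Under this layout, bufmgr.c stays agnostic of the relkind, which addresses the layering complaint while still letting per-table and per-index views be derived from the same counters.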