Re: Split index and table statistics into different types of stats - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Split index and table statistics into different types of stats |
Date | |
Msg-id | 06517d27-7a81-3b60-60c4-9a92c6498981@gmail.com Whole thread Raw |
In response to | Re: Split index and table statistics into different types of stats (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Split index and table statistics into different types of stats
|
List | pgsql-hackers |
Hi, On 11/4/22 9:51 PM, Melanie Plageman wrote: > Hi Bertrand, > > I'm glad you are working on this. > > I had a few thoughts/ideas > Thanks for looking at it! > It seems better to have all of the counts in the various stats structs > not be prefixed with n_, i_, t_ > > typedef struct PgStat_StatDBEntry > { > ... > PgStat_Counter n_blocks_fetched; > PgStat_Counter n_blocks_hit; > PgStat_Counter n_tuples_returned; > PgStat_Counter n_tuples_fetched; > ... > > I've attached a patch (0002) to change this in case you are interested > in making such a change I did not renamed initially the fields/columns to ease the review. Indeed, I think we should go further than removing the n_, i_ and t_ prefixes so that the fields actually match the view's columns. For example, currently pg_stat_all_indexes.idx_tup_read is linked to "tuples_returned", so that it would make sense to rename "tuples_returned" to "tuples_read" or even "tup_read" in the indexes counters. That's why I had in mind to do this fields/columns renaming into a separate patch (once this one is committed), so that the current one focus only on splitting the stats: what do you think? > (I've attached all of my suggestions in patches > along with your original patch so that cfbot still passes). > > On Wed, Nov 2, 2022 at 5:00 AM Drouvot, Bertrand > <bertranddrouvot.pg@gmail.com> wrote: >> On 11/1/22 1:30 AM, Andres Freund wrote: >>> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote: >>>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS) >>>> PG_RETURN_INT64(result); >>>> } >>>> >>>> +Datum >>>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS) >>>> +{ >>>> + Oid relid = PG_GETARG_OID(0); >>>> + int64 result; >>>> + PgStat_StatIndEntry *indentry; >>>> + >>>> + if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL) >>>> + result = 0; >>>> + else >>>> + result = (int64) (indentry->blocks_fetched); >>>> + >>>> + PG_RETURN_INT64(result); >>>> +} >>> >>> We have so many copies of this by now - perhaps we first should deduplicate >>> them somehow? Even if it's just a macro or such. >>> >> >> Yeah good point, a new macro has been defined for the "int64" return >> case in V3 attached. > > I looked for other opportunities to de-duplicate, but most of the functions > that were added that are identical except the return type and > PgStat_Kind are short enough that it doesn't make sense to make wrappers > or macros. > Yeah, agree. > I do think it makes sense to reorder the members of the two structs > PgStat_IndexCounts and PgStat_TableCounts so that they have a common > header. I've done that in the attached patch (0003). > That's a good idea, thanks! But for that we would need to have the same fields names, means: - Remove the prefixes (as you've done in 0002) - And probably reduce the number of fields in the new PgStat_RelationCounts that 003 is introducing (for example tuples_returned should be excluded if we're going to rename it later on to "tuples_read" for the indexes to map the pg_stat_all_indexes.idx_tup_read column). ISTM that we should do it in the "renaming" effort, after this patch is committed. > In the flush functions, I was also thinking it might be nice to use the > same pattern as is used in [1] and [2] to add the counts together. It > makes the lines a bit easier to read, IMO. If we remove the prefixes > from the count fields, this works for many of the fields. I've attached > a patch (0004) that does something like this, in case you wanted to go > in this direction. I like it too but same remarks as previously. I think it should be part of the "renaming" effort. > > Since you have made new parallel functions for indexes and tables for > many of the functions in pgstat_relation.c, perhaps it makes sense to > split it into pgstat_table.c and pgstat_index.c? Good point, thanks, I'll work on it. > > One question I had about the original code (not related to your > refactor) is why the pending stats aren't memset in the flush functions > after aggregating them into the shared stats. Not sure I'm getting your point, do you think something is not right with the flush functions? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: