Thread: Split index and table statistics into different types of stats

Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi hackers,

Please find attached a patch proposal to split index and table 
statistics into different types of stats.

This idea has been proposed by Andres in a couple of threads, see [1] 
and [2].

To sum up:

We currently track index and table types of statistics in the same 
format (so that a number of the "columns" in index stats are currently 
unused) and we rename column in views etc to make them somewhat sensible.

So that the immediate benefits to $SUBJECT are:

- have reasonable names for the fields
- shrink the current memory usage

The attached patch proposal:

- renames PGSTAT_KIND_RELATION to PGSTAT_KIND_TABLE
- creates a new PGSTAT_KIND_INDEX
- creates new macros: pgstat_count_index_fetch(), 
pgstat_count_index_scan(), pgstat_count_index_tuples(), 
pgstat_count_index_buffer_read() and pgstat_count_index_buffer_hit() to 
increment the indexes related stats
- creates new SQL callable functions dedicated to the indexes that are 
used in system_views.sql

It also adds basic tests in src/test/regress/sql/stats.sql for toast and 
partitions (we may want to create a dedicated patch for those additional 
tests though).

The fields renaming has not been done to ease the reading of this patch 
(I think it would be better to create a dedicated patch for the renaming 
once the split is done).

I'm adding a new CF entry for this patch.

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


[1]: 
https://www.postgresql.org/message-id/flat/20221019181930.bx73kul4nbiftr65%40awork3.anarazel.de

[2]: 
https://www.postgresql.org/message-id/20220818195124.c7ipzf6c5v7vxymc%40awork3.anarazel.de

Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 10/31/22 2:31 PM, Justin Pryzby wrote:
> I didn't looks closely, but there's a couple places where you wrote
> ";;", which looks unintentional.
> 
> -               PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);
> +               PG_RETURN_TIMESTAMPTZ(tabentry->lastscan);;
> 

Thanks for looking at it!
oops, thanks for the keen eyes ;-) Fixed in v2 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
> Please find attached a patch proposal to split index and table statistics
> into different types of stats.
> 
> This idea has been proposed by Andres in a couple of threads, see [1] and
> [2].

Thanks for working on this!



> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
> index 5b49cc5a09..8a715db82e 100644
> --- a/src/backend/catalog/heap.c
> +++ b/src/backend/catalog/heap.c
> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>          RelationDropStorage(rel);
>  
>      /* ensure that stats are dropped if transaction commits */
> -    pgstat_drop_relation(rel);
> +    pgstat_drop_heap(rel);

I don't think "heap" is a good name for these, even if there's some historical
reasons for it. Particularly because you used "table" in some bits and pieces
too.


>  /*
> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>  void
>  pgstat_create_relation(Relation rel)
>  {
> -    pgstat_create_transactional(PGSTAT_KIND_RELATION,
> -                                rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> -                                RelationGetRelid(rel));
> +    if (rel->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_create_transactional(PGSTAT_KIND_INDEX,
> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> +                                    RelationGetRelid(rel));
> +    else
> +        pgstat_create_transactional(PGSTAT_KIND_TABLE,
> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> +                                    RelationGetRelid(rel));
> +}

Hm - why is this best handled on this level, rather than at the caller?


> +/*
> + * Support function for the SQL-callable pgstat* functions. Returns
> + * the collected statistics for one index or NULL. NULL doesn't mean
> + * that the index doesn't exist, just that there are no statistics, so the
> + * caller is better off to report ZERO instead.
> + */
> +PgStat_StatIndEntry *
> +pgstat_fetch_stat_indentry(Oid relid)
> +{
> +    PgStat_StatIndEntry *indentry;
> +
> +    indentry = pgstat_fetch_stat_indentry_ext(false, relid);
> +    if (indentry != NULL)
> +        return indentry;
> +
> +    /*
> +     * If we didn't find it, maybe it's a shared index.
> +     */
> +    indentry = pgstat_fetch_stat_indentry_ext(true, relid);
> +    return indentry;
> +}
> +
> +/*
> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
> + * whether the to-be-accessed index is shared or not.
> + */
> +PgStat_StatIndEntry *
> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
> +{
> +    Oid            dboid = (shared ? InvalidOid : MyDatabaseId);
> +
> +    return (PgStat_StatIndEntry *)
> +        pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>  }

Do we need this split anywhere for now? I suspect not, the table case is
mainly for the autovacuum launcher, which won't look at indexes "in isolation".



> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>      PG_RETURN_INT64(result);
>  }
>  
> +Datum
> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
> +{
> +    Oid            relid = PG_GETARG_OID(0);
> +    int64        result;
> +    PgStat_StatIndEntry *indentry;
> +
> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
> +        result = 0;
> +    else
> +        result = (int64) (indentry->blocks_fetched);
> +
> +    PG_RETURN_INT64(result);
> +}

We have so many copies of this by now - perhaps we first should deduplicate
them somehow? Even if it's just a macro or such.

Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/1/22 1:30 AM, Andres Freund wrote:
> Hi,
> 
> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
>> Please find attached a patch proposal to split index and table statistics
>> into different types of stats.
>>
>> This idea has been proposed by Andres in a couple of threads, see [1] and
>> [2].
> 
> Thanks for working on this!
> 

Thanks for looking at it!

> 
> 
>> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
>> index 5b49cc5a09..8a715db82e 100644
>> --- a/src/backend/catalog/heap.c
>> +++ b/src/backend/catalog/heap.c
>> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>>           RelationDropStorage(rel);
>>   
>>       /* ensure that stats are dropped if transaction commits */
>> -    pgstat_drop_relation(rel);
>> +    pgstat_drop_heap(rel);
> 
> I don't think "heap" is a good name for these, even if there's some historical
> reasons for it. Particularly because you used "table" in some bits and pieces
> too.
> 

Agree, replaced by "table" where appropriate in V3 attached.

> 
>>   /*
>> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>>   void
>>   pgstat_create_relation(Relation rel)
>>   {
>> -    pgstat_create_transactional(PGSTAT_KIND_RELATION,
>> -                                rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> -                                RelationGetRelid(rel));
>> +    if (rel->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_create_transactional(PGSTAT_KIND_INDEX,
>> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +                                    RelationGetRelid(rel));
>> +    else
>> +        pgstat_create_transactional(PGSTAT_KIND_TABLE,
>> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +                                    RelationGetRelid(rel));
>> +}
> 
> Hm - why is this best handled on this level, rather than at the caller?
> 
> 

Agree that it should be split in 
pgstat_create_table()/pgstat_create_index() (also as it was already 
split for the "drop" case): done in V3.

>> +/*
>> + * Support function for the SQL-callable pgstat* functions. Returns
>> + * the collected statistics for one index or NULL. NULL doesn't mean
>> + * that the index doesn't exist, just that there are no statistics, so the
>> + * caller is better off to report ZERO instead.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry(Oid relid)
>> +{
>> +    PgStat_StatIndEntry *indentry;
>> +
>> +    indentry = pgstat_fetch_stat_indentry_ext(false, relid);
>> +    if (indentry != NULL)
>> +        return indentry;
>> +
>> +    /*
>> +     * If we didn't find it, maybe it's a shared index.
>> +     */
>> +    indentry = pgstat_fetch_stat_indentry_ext(true, relid);
>> +    return indentry;
>> +}
>> +
>> +/*
>> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
>> + * whether the to-be-accessed index is shared or not.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
>> +{
>> +    Oid            dboid = (shared ? InvalidOid : MyDatabaseId);
>> +
>> +    return (PgStat_StatIndEntry *)
>> +        pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>>   }
> 
> Do we need this split anywhere for now? I suspect not, the table case is
> mainly for the autovacuum launcher, which won't look at indexes "in isolation".
> 

Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in 
pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()).

> 
> 
>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>>       PG_RETURN_INT64(result);
>>   }
>>   
>> +Datum
>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
>> +{
>> +    Oid            relid = PG_GETARG_OID(0);
>> +    int64        result;
>> +    PgStat_StatIndEntry *indentry;
>> +
>> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
>> +        result = 0;
>> +    else
>> +        result = (int64) (indentry->blocks_fetched);
>> +
>> +    PG_RETURN_INT64(result);
>> +}
> 
> We have so many copies of this by now - perhaps we first should deduplicate
> them somehow? Even if it's just a macro or such.
> 

Yeah good point, a new macro has been defined for the "int64" return 
case in V3 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Melanie Plageman
Date:
Hi Bertrand,

I'm glad you are working on this.

I had a few thoughts/ideas

It seems better to have all of the counts in the various stats structs
not be prefixed with n_, i_, t_

typedef struct PgStat_StatDBEntry
{
...
    PgStat_Counter n_blocks_fetched;
    PgStat_Counter n_blocks_hit;
    PgStat_Counter n_tuples_returned;
    PgStat_Counter n_tuples_fetched;
...

I've attached a patch (0002) to change this in case you are interested
in making such a change (I've attached all of my suggestions in patches
along with your original patch so that cfbot still passes).

On Wed, Nov 2, 2022 at 5:00 AM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
> On 11/1/22 1:30 AM, Andres Freund wrote:
> > On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
> >> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
> >>      PG_RETURN_INT64(result);
> >>   }
> >>
> >> +Datum
> >> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
> >> +{
> >> +    Oid                     relid = PG_GETARG_OID(0);
> >> +    int64           result;
> >> +    PgStat_StatIndEntry *indentry;
> >> +
> >> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
> >> +            result = 0;
> >> +    else
> >> +            result = (int64) (indentry->blocks_fetched);
> >> +
> >> +    PG_RETURN_INT64(result);
> >> +}
> >
> > We have so many copies of this by now - perhaps we first should deduplicate
> > them somehow? Even if it's just a macro or such.
> >
>
> Yeah good point, a new macro has been defined for the "int64" return
> case in V3 attached.

I looked for other opportunities to de-duplicate, but most of the functions
that were added that are identical except the return type and
PgStat_Kind are short enough that it doesn't make sense to make wrappers
or macros.

I do think it makes sense to reorder the members of the two structs
PgStat_IndexCounts and PgStat_TableCounts so that they have a common
header. I've done that in the attached patch (0003).

In the flush functions, I was also thinking it might be nice to use the
same pattern as is used in [1] and [2] to add the counts together. It
makes the lines a bit easier to read, IMO. If we remove the prefixes
from the count fields, this works for many of the fields. I've attached
a patch (0004) that does something like this, in case you wanted to go
in this direction.

Since you have made new parallel functions for indexes and tables for
many of the functions in pgstat_relation.c, perhaps it makes sense to
split it into pgstat_table.c and pgstat_index.c?

One question I had about the original code (not related to your
refactor) is why the pending stats aren't memset in the flush functions
after aggregating them into the shared stats.

- Melanie

[1] https://github.com/postgres/postgres/blob/master/src/backend/utils/activity/pgstat_checkpointer.c#L49
[2] https://github.com/postgres/postgres/blob/master/src/backend/utils/activity/pgstat_database.c#L370

Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/4/22 9:51 PM, Melanie Plageman wrote:
> Hi Bertrand,
> 
> I'm glad you are working on this.
> 
> I had a few thoughts/ideas
> 

Thanks for looking at it!

> It seems better to have all of the counts in the various stats structs
> not be prefixed with n_, i_, t_
> 
> typedef struct PgStat_StatDBEntry
> {
> ...
>      PgStat_Counter n_blocks_fetched;
>      PgStat_Counter n_blocks_hit;
>      PgStat_Counter n_tuples_returned;
>      PgStat_Counter n_tuples_fetched;
> ...
> 
> I've attached a patch (0002) to change this in case you are interested
> in making such a change

I did not renamed initially the fields/columns to ease the review.

Indeed, I think we should go further than removing the n_, i_ and t_ 
prefixes so that the fields actually match the view's columns.

For example, currently pg_stat_all_indexes.idx_tup_read is linked to 
"tuples_returned", so that it would make sense to rename 
"tuples_returned" to "tuples_read" or even "tup_read" in the indexes 
counters.

That's why I had in mind to do this fields/columns renaming into a 
separate patch (once this one is committed), so that the current one 
focus only on splitting the stats: what do you think?

> (I've attached all of my suggestions in patches
> along with your original patch so that cfbot still passes).
> 
> On Wed, Nov 2, 2022 at 5:00 AM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>> On 11/1/22 1:30 AM, Andres Freund wrote:
>>> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
>>>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>>>>       PG_RETURN_INT64(result);
>>>>    }
>>>>
>>>> +Datum
>>>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
>>>> +{
>>>> +    Oid                     relid = PG_GETARG_OID(0);
>>>> +    int64           result;
>>>> +    PgStat_StatIndEntry *indentry;
>>>> +
>>>> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
>>>> +            result = 0;
>>>> +    else
>>>> +            result = (int64) (indentry->blocks_fetched);
>>>> +
>>>> +    PG_RETURN_INT64(result);
>>>> +}
>>>
>>> We have so many copies of this by now - perhaps we first should deduplicate
>>> them somehow? Even if it's just a macro or such.
>>>
>>
>> Yeah good point, a new macro has been defined for the "int64" return
>> case in V3 attached.
> 
> I looked for other opportunities to de-duplicate, but most of the functions
> that were added that are identical except the return type and
> PgStat_Kind are short enough that it doesn't make sense to make wrappers
> or macros.
> 

Yeah, agree.

> I do think it makes sense to reorder the members of the two structs
> PgStat_IndexCounts and PgStat_TableCounts so that they have a common
> header. I've done that in the attached patch (0003).
> 

That's a good idea, thanks! But for that we would need to have the same 
fields names, means:

- Remove the prefixes (as you've done in 0002)
- And probably reduce the number of fields in the new 
PgStat_RelationCounts that 003 is introducing (for example 
tuples_returned should be excluded if we're going to rename it later on 
to "tuples_read" for the indexes to map the 
pg_stat_all_indexes.idx_tup_read column).

ISTM that we should do it in the "renaming" effort, after this patch is 
committed.

> In the flush functions, I was also thinking it might be nice to use the
> same pattern as is used in [1] and [2] to add the counts together. It
> makes the lines a bit easier to read, IMO. If we remove the prefixes
> from the count fields, this works for many of the fields. I've attached
> a patch (0004) that does something like this, in case you wanted to go
> in this direction.

I like it too but same remarks as previously. I think it should be part 
of the "renaming" effort.

> 
> Since you have made new parallel functions for indexes and tables for
> many of the functions in pgstat_relation.c, perhaps it makes sense to
> split it into pgstat_table.c and pgstat_index.c?

Good point, thanks, I'll work on it.

> 
> One question I had about the original code (not related to your
> refactor) is why the pending stats aren't memset in the flush functions
> after aggregating them into the shared stats.

Not sure I'm getting your point, do you think something is not right 
with the flush functions?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/14/22 11:36 AM, Drouvot, Bertrand wrote:
> On 11/4/22 9:51 PM, Melanie Plageman wrote:
>> Since you have made new parallel functions for indexes and tables for
>> many of the functions in pgstat_relation.c, perhaps it makes sense to
>> split it into pgstat_table.c and pgstat_index.c?
> 
> Good point, thanks, I'll work on it.
> 

Please find attached v4 adding pgstat_table.c and pgstat_index.c (and 
removing pgstat_relation.c).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2022-11-15 10:48:40 +0100, Drouvot, Bertrand wrote:
> diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
> index ae3365d917..be7f175bf1 100644
> --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -36,24 +36,34 @@
>  
>  #define HAS_PGSTAT_PERMISSIONS(role)     (has_privs_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS) ||
has_privs_of_role(GetUserId(),role))
 
>  
> +#define PGSTAT_FETCH_STAT_ENTRY(entry, stat_name) ((entry == NULL) ? 0 : (int64) (entry->stat_name))
> +
>  Datum
> -pg_stat_get_numscans(PG_FUNCTION_ARGS)
> +pg_stat_get_index_numscans(PG_FUNCTION_ARGS)
>  {
>      Oid            relid = PG_GETARG_OID(0);
>      int64        result;
> -    PgStat_StatTabEntry *tabentry;
> +    PgStat_StatIndEntry *indentry = pgstat_fetch_stat_indentry(relid);
>  
> -    if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
> -        result = 0;
> -    else
> -        result = (int64) (tabentry->numscans);
> +    result = PGSTAT_FETCH_STAT_ENTRY(indentry, numscans);
>  
>      PG_RETURN_INT64(result);
>  }

This still leaves a fair bit of boilerplate. ISTM that the function body
really should just be a single line.

Might even be worth defining the whole function via a macro. Perhaps something like

PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans);

I think the logic to infer which DB oid to use for a stats entry could be
shared between different kinds of stats. We don't need to duplicate it.

Is there any reason to not replace the "double lookup" in
pgstat_fetch_stat_tabentry() with IsSharedRelation()?


This should probably be done in a preparatory commit.

Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/16/22 9:12 PM, Andres Freund wrote:
> Hi,
> 
> On 2022-11-15 10:48:40 +0100, Drouvot, Bertrand wrote:
>> diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
>> index ae3365d917..be7f175bf1 100644
>> --- a/src/backend/utils/adt/pgstatfuncs.c
>> +++ b/src/backend/utils/adt/pgstatfuncs.c
>> @@ -36,24 +36,34 @@
>>   
>>   #define HAS_PGSTAT_PERMISSIONS(role)     (has_privs_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS) ||
has_privs_of_role(GetUserId(),role))
 
>>   
>> +#define PGSTAT_FETCH_STAT_ENTRY(entry, stat_name) ((entry == NULL) ? 0 : (int64) (entry->stat_name))
>> +
>>   Datum
>> -pg_stat_get_numscans(PG_FUNCTION_ARGS)
>> +pg_stat_get_index_numscans(PG_FUNCTION_ARGS)
>>   {
>>       Oid            relid = PG_GETARG_OID(0);
>>       int64        result;
>> -    PgStat_StatTabEntry *tabentry;
>> +    PgStat_StatIndEntry *indentry = pgstat_fetch_stat_indentry(relid);
>>   
>> -    if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
>> -        result = 0;
>> -    else
>> -        result = (int64) (tabentry->numscans);
>> +    result = PGSTAT_FETCH_STAT_ENTRY(indentry, numscans);
>>   
>>       PG_RETURN_INT64(result);
>>   }
> 
> This still leaves a fair bit of boilerplate. ISTM that the function body
> really should just be a single line.
> 
> Might even be worth defining the whole function via a macro. Perhaps something like
> 
> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans);

Thanks for the feedback!

Right, what about something like the following?

"
#define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, pgstat_fetch_stat_function, relid, stat_name) \
    do { \
        pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \
        PG_RETURN_INT64(entry == NULL ? 0 : (int64) (entry->stat_name)); \
    } while (0)

Datum
pg_stat_get_index_numscans(PG_FUNCTION_ARGS)
{
    PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans);
}
"

> 
> I think the logic to infer which DB oid to use for a stats entry could be
> shared between different kinds of stats. We don't need to duplicate it.
> 

Agree, will provide a new patch version once [1] is committed.


> Is there any reason to not replace the "double lookup" in
> pgstat_fetch_stat_tabentry() with IsSharedRelation()?
> 
> 

Thanks for the suggestion!

> This should probably be done in a preparatory commit.

Proposal submitted in [1].

[1]:
https://www.postgresql.org/message-id/flat/2e4a0ae1-2696-9f0c-301c-2330e447133f%40gmail.com#e47bf5d2902121461b61ed47413628fc

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2022-11-18 12:18:38 +0100, Drouvot, Bertrand wrote:
> On 11/16/22 9:12 PM, Andres Freund wrote:
> > This still leaves a fair bit of boilerplate. ISTM that the function body
> > really should just be a single line.
> > 
> > Might even be worth defining the whole function via a macro. Perhaps something like
> > 
> > PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans);
> 
> Thanks for the feedback!
> 
> Right, what about something like the following?
> 
> "
> #define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, pgstat_fetch_stat_function, relid, stat_name) \
>     do { \
>         pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \
>         PG_RETURN_INT64(entry == NULL ? 0 : (int64) (entry->stat_name)); \
>     } while (0)
> 
> Datum
> pg_stat_get_index_numscans(PG_FUNCTION_ARGS)
> {
>     PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans);
> }
> "

That's better, but still seems like quite a bit of repetition, given the
number of accessors. I think I like my idea of a macro defining the whole
function a bit better.

I'd define a "base" macro and then a version that's specific to tables and
indexes each, so that the pieces related to that don't have to be repeated as
often.


> > This should probably be done in a preparatory commit.
> 
> Proposal submitted in [1].

Now merged.

Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/21/22 12:19 AM, Andres Freund wrote:
> Hi,
> 
> On 2022-11-18 12:18:38 +0100, Drouvot, Bertrand wrote:
>> On 11/16/22 9:12 PM, Andres Freund wrote:
>>> This still leaves a fair bit of boilerplate. ISTM that the function body
>>> really should just be a single line.
>>>
>>> Might even be worth defining the whole function via a macro. Perhaps something like
>>>
>>> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(PGSTAT_KIND_INDEX, pg_stat_get_index, numscans);
>>
>> Thanks for the feedback!
>>
>> Right, what about something like the following?
>>
>> "
>> #define PGSTAT_FETCH_STAT_ENTRY(pgstat_entry_kind, pgstat_fetch_stat_function, relid, stat_name) \
>>     do { \
>>         pgstat_entry_kind *entry = pgstat_fetch_stat_function(relid); \
>>         PG_RETURN_INT64(entry == NULL ? 0 : (int64) (entry->stat_name)); \
>>     } while (0)
>>
>> Datum
>> pg_stat_get_index_numscans(PG_FUNCTION_ARGS)
>> {
>>     PGSTAT_FETCH_STAT_ENTRY(PgStat_StatIndEntry, pgstat_fetch_stat_indentry, PG_GETARG_OID(0), numscans);
>> }
>> "
> 
> That's better, but still seems like quite a bit of repetition, given the
> number of accessors. I think I like my idea of a macro defining the whole
> function a bit better.
> 

Got it, what about creating another preparatory commit to first introduce something like:

"
#define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \
Datum \
function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \
{ \
Oid            relid = PG_GETARG_OID(0); \
int64        result; \
PgStat_StatTabEntry *tabentry; \
if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \
    result = 0; \
else \
    result = (int64) (tabentry->stat_name); \
PG_RETURN_INT64(result); \
} \

PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans);

PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned);
.
.
.
"

If that makes sense to you, I'll submit this preparatory patch.

> Now merged.

Thanks!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
Bharath Rupireddy
Date:
On Mon, Nov 21, 2022 at 7:03 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 11/21/22 12:19 AM, Andres Freund wrote:
> >
> > That's better, but still seems like quite a bit of repetition, given the
> > number of accessors. I think I like my idea of a macro defining the whole
> > function a bit better.
> >
>
> Got it, what about creating another preparatory commit to first introduce something like:
>
> "
> #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \
> Datum \
> function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \
> { \
> Oid                     relid = PG_GETARG_OID(0); \
> int64           result; \
> PgStat_StatTabEntry *tabentry; \
> if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \
>         result = 0; \
> else \
>         result = (int64) (tabentry->stat_name); \
> PG_RETURN_INT64(result); \
> } \
>
> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans);
>
> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned);
> .
> .
> .
> "
>
> If that makes sense to you, I'll submit this preparatory patch.

I think the macros stitching the function declarations and definitions
is a great idea to avoid code duplicacy. We seem to be using that
approach already - PG_FUNCTION_INFO_V1, SH_DECLARE, SH_DEFINE and its
friends, STEMMER_MODULE and so on. +1 for first applying this
principle for existing functions. Looking forward to the patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/22/22 7:19 AM, Bharath Rupireddy wrote:
> On Mon, Nov 21, 2022 at 7:03 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> On 11/21/22 12:19 AM, Andres Freund wrote:
>>>
>>> That's better, but still seems like quite a bit of repetition, given the
>>> number of accessors. I think I like my idea of a macro defining the whole
>>> function a bit better.
>>>
>>
>> Got it, what about creating another preparatory commit to first introduce something like:
>>
>> "
>> #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \
>> Datum \
>> function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \
>> { \
>> Oid                     relid = PG_GETARG_OID(0); \
>> int64           result; \
>> PgStat_StatTabEntry *tabentry; \
>> if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \
>>          result = 0; \
>> else \
>>          result = (int64) (tabentry->stat_name); \
>> PG_RETURN_INT64(result); \
>> } \
>>
>> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans);
>>
>> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned);
>> .
>> .
>> .
>> "
>>
>> If that makes sense to you, I'll submit this preparatory patch.
> 
> I think the macros stitching the function declarations and definitions
> is a great idea to avoid code duplicacy. We seem to be using that
> approach already - PG_FUNCTION_INFO_V1, SH_DECLARE, SH_DEFINE and its
> friends, STEMMER_MODULE and so on. +1 for first applying this
> principle for existing functions. Looking forward to the patch.
> 

Thanks! Patch proposal submitted in [1].

I'll resume working on the current thread once [1] is committed.

[1]: https://www.postgresql.org/message-id/d547a9bc-76c2-f875-df74-3ad6fd9d6236%40gmail.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/22/22 8:12 AM, Drouvot, Bertrand wrote:
> Hi,
> 
> On 11/22/22 7:19 AM, Bharath Rupireddy wrote:
>> On Mon, Nov 21, 2022 at 7:03 PM Drouvot, Bertrand
>> <bertranddrouvot.pg@gmail.com> wrote:
>>>
>>> On 11/21/22 12:19 AM, Andres Freund wrote:
>>>>
>>>> That's better, but still seems like quite a bit of repetition, given the
>>>> number of accessors. I think I like my idea of a macro defining the whole
>>>> function a bit better.
>>>>
>>>
>>> Got it, what about creating another preparatory commit to first introduce something like:
>>>
>>> "
>>> #define PGSTAT_DEFINE_REL_FIELD_ACCESSOR(function_name_prefix, stat_name) \
>>> Datum \
>>> function_name_prefix##_##stat_name(PG_FUNCTION_ARGS) \
>>> { \
>>> Oid                     relid = PG_GETARG_OID(0); \
>>> int64           result; \
>>> PgStat_StatTabEntry *tabentry; \
>>> if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \
>>>          result = 0; \
>>> else \
>>>          result = (int64) (tabentry->stat_name); \
>>> PG_RETURN_INT64(result); \
>>> } \
>>>
>>> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, numscans);
>>>
>>> PGSTAT_DEFINE_REL_FIELD_ACCESSOR(pg_stat_get, tuples_returned);
>>> .
>>> .
>>> .
>>> "
>>>
>>> If that makes sense to you, I'll submit this preparatory patch.
>>
>> I think the macros stitching the function declarations and definitions
>> is a great idea to avoid code duplicacy. We seem to be using that
>> approach already - PG_FUNCTION_INFO_V1, SH_DECLARE, SH_DEFINE and its
>> friends, STEMMER_MODULE and so on. +1 for first applying this
>> principle for existing functions. Looking forward to the patch.
>>
> 
> Thanks! Patch proposal submitted in [1].
> 
> I'll resume working on the current thread once [1] is committed.
> 
> [1]: https://www.postgresql.org/message-id/d547a9bc-76c2-f875-df74-3ad6fd9d6236%40gmail.com
> 

As [1] mentioned above has been committed (83a1a1b566), please find attached V5 related to this thread making use of
thenew macros (namely PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ).
 

I switched from using "CppConcat" to using "##", as it looks to me it's easier to read now that we are adding another
concatenationto the game (due to the table/index split).
 

The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" (I don't have a better idea yet): purpose is
tofollow the naming convention for PgStat_StatTabEntry/PgStat_StatIndEntry and
pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry,thoughts?
 

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

> Hi,
>  
> As [1] mentioned above has been committed (83a1a1b566), please find attached V5 related to this thread making use of
thenew macros (namely PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ).
 
> 
> I switched from using "CppConcat" to using "##", as it looks to me it's easier to read now that we are adding another
concatenationto the game (due to the table/index split).
 
> 
> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" (I don't have a better idea yet): purpose
isto follow the naming convention for PgStat_StatTabEntry/PgStat_StatIndEntry and
pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry,thoughts?
 
> 
> Looking forward to your feedback,
> 

Attaching V6 (mandatory rebase due to 8018ffbf58).

While at it, I got rid of the weirdness mentioned above by creating 2 sets of macros (one for the tables and one for
theindexes).
 

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 12/7/22 11:11 AM, Drouvot, Bertrand wrote:
> Hi,
> 
>> Hi,
>>
>> As [1] mentioned above has been committed (83a1a1b566), please find attached V5 related to this thread making use of
thenew macros (namely PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ).
 
>>
>> I switched from using "CppConcat" to using "##", as it looks to me it's easier to read now that we are adding
anotherconcatenation to the game (due to the table/index split).
 
>>
>> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" (I don't have a better idea yet): purpose
isto follow the naming convention for PgStat_StatTabEntry/PgStat_StatIndEntry and
pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry,thoughts?
 
>>
>> Looking forward to your feedback,
>>
> 
> Attaching V6 (mandatory rebase due to 8018ffbf58).
> 
> While at it, I got rid of the weirdness mentioned above by creating 2 sets of macros (one for the tables and one for
theindexes).
 
> 
> Looking forward to your feedback,
> 
> Regards,
> 

Attaching V7, mandatory rebase due to 66dcb09246.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 12/10/22 10:54 AM, Drouvot, Bertrand wrote:
> Hi,
> 
> On 12/7/22 11:11 AM, Drouvot, Bertrand wrote:
>> Hi,
>>
>>> Hi,
>>>
>>> As [1] mentioned above has been committed (83a1a1b566), please find attached V5 related to this thread making use
ofthe new macros (namely PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ).
 
>>>
>>> I switched from using "CppConcat" to using "##", as it looks to me it's easier to read now that we are adding
anotherconcatenation to the game (due to the table/index split).
 
>>>
>>> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" (I don't have a better idea yet): purpose
isto follow the naming convention for PgStat_StatTabEntry/PgStat_StatIndEntry and
pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry,thoughts?
 
>>>
>>> Looking forward to your feedback,
>>>
>>
>> Attaching V6 (mandatory rebase due to 8018ffbf58).
>>
>> While at it, I got rid of the weirdness mentioned above by creating 2 sets of macros (one for the tables and one for
theindexes).
 
>>
>> Looking forward to your feedback,
>>
>> Regards,
>>
> 
> Attaching V7, mandatory rebase due to 66dcb09246.
> 

Attaching V8, mandatory rebase due to c8e1ba736b.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote:
> diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
> index 4017e175e3..fca166a063 100644
> --- a/src/backend/access/common/relation.c
> +++ b/src/backend/access/common/relation.c
> @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode)
>      if (RelationUsesLocalBuffers(r))
>          MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
>  
> -    pgstat_init_relation(r);
> +    if (r->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_init_index(r);
> +    else
> +        pgstat_init_table(r);
>  
>      return r;
>  }
> @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
>      if (RelationUsesLocalBuffers(r))
>          MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
>  
> -    pgstat_init_relation(r);
> +    if (r->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_init_index(r);
> +    else
> +        pgstat_init_table(r);
>  
>      return r;
>  }

Not this patch's fault, but the functions in relation.c have gotten duplicated
to an almost ridiculous degree :(


> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 3fb38a25cf..98bb230b95 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
>       * Read the buffer, and update pgstat counters to reflect a cache hit or
>       * miss.
>       */
> -    pgstat_count_buffer_read(reln);
> +    if (reln->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_count_index_buffer_read(reln);
> +    else
> +        pgstat_count_table_buffer_read(reln);
>      buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
>                              forkNum, blockNum, mode, strategy, &hit);
>      if (hit)
> -        pgstat_count_buffer_hit(reln);
> +    {
> +        if (reln->rd_rel->relkind == RELKIND_INDEX)
> +            pgstat_count_index_buffer_hit(reln);
> +        else
> +            pgstat_count_table_buffer_hit(reln);
> +    }
>      return buf;
>  }

Not nice to have additional branches here :(.

I think going forward we should move buffer stats to a "per-relfilenode" stats
entry (which would allow to track writes too), then thiw branch would be
removed again.


> +/* -------------------------------------------------------------------------
> + *
> + * pgstat_index.c
> + *      Implementation of index statistics.

This is a fair bit of duplicated code. Perhaps it'd be worth keeping
pgstat_relation with code common to table/index stats?


> +bool
> +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
> +{
> +    static const PgStat_IndexCounts all_zeroes;
> +    Oid            dboid;
> +
> +    PgStat_IndexStatus *lstats; /* pending stats entry  */
> +    PgStatShared_Index *shrelcomstats;

What does "com" stand for in shrelcomstats?


> +    PgStat_StatIndEntry *indentry;    /* index entry of shared stats */
> +    PgStat_StatDBEntry *dbentry;    /* pending database entry */
> +
> +    dboid = entry_ref->shared_entry->key.dboid;
> +    lstats = (PgStat_IndexStatus *) entry_ref->pending;
> +    shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats;
> +
> +    /*
> +     * Ignore entries that didn't accumulate any actual counts, such as
> +     * indexes that were opened by the planner but not used.
> +     */
> +    if (memcmp(&lstats->i_counts, &all_zeroes,
> +               sizeof(PgStat_IndexCounts)) == 0)
> +    {
> +        return true;
> +    }

I really need to propose pg_memiszero().



>  Datum
> -pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
> +pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS)
>  {
>      Oid            relid = PG_GETARG_OID(0);
>      int64        result;
> @@ -1360,17 +1413,32 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
>      PG_RETURN_INT64(result);
>  }
>  
> +Datum
> +pg_stat_get_ind_xact_numscans(PG_FUNCTION_ARGS)
> +{
> +    Oid            relid = PG_GETARG_OID(0);
> +    int64        result;
> +    PgStat_IndexStatus *indentry;
> +
> +    if ((indentry = find_indstat_entry(relid)) == NULL)
> +        result = 0;
> +    else
> +        result = (int64) (indentry->i_counts.i_numscans);
> +
> +    PG_RETURN_INT64(result);
> +}

Why didn't all these get converted to the same macro based approach as the
!xact versions?


>  Datum
>  pg_stat_get_xact_tuples_returned(PG_FUNCTION_ARGS)
>  {
>      Oid            relid = PG_GETARG_OID(0);
>      int64        result;
> -    PgStat_TableStatus *tabentry;
> +    PgStat_IndexStatus *indentry;
>  
> -    if ((tabentry = find_tabstat_entry(relid)) == NULL)
> +    if ((indentry = find_indstat_entry(relid)) == NULL)
>          result = 0;
>      else
> -        result = (int64) (tabentry->t_counts.t_tuples_returned);
> +        result = (int64) (indentry->i_counts.i_tuples_returned);
>  
>      PG_RETURN_INT64(result);
>  }

There's a bunch of changes like this, and I don't understand -
pg_stat_get_xact_tuples_returned() now looks at index stats, even though it
afaics continues to be used in pg_stat_xact_all_tables? Huh?


> +/* ----------
> + * PgStat_IndexStatus            Per-index status within a backend
> + *
> + * Many of the event counters are nontransactional, ie, we count events
> + * in committed and aborted transactions alike.  For these, we just count
> + * directly in the PgStat_IndexStatus.
> + * ----------
> + */

Which counters are transactional for indexes? None, no?

> diff --git a/src/test/recovery/t/029_stats_restart.pl b/src/test/recovery/t/029_stats_restart.pl
> index 83d6647d32..8b0b597419 100644
> --- a/src/test/recovery/t/029_stats_restart.pl
> +++ b/src/test/recovery/t/029_stats_restart.pl
> @@ -43,8 +43,8 @@ my $sect = "initial";
>  is(have_stats('database', $dboid, 0), 't', "$sect: db stats do exist");
>  is(have_stats('function', $dboid, $funcoid),
>      't', "$sect: function stats do exist");
> -is(have_stats('relation', $dboid, $tableoid),
> -    't', "$sect: relation stats do exist");
> +is(have_stats('table', $dboid, $tableoid),
> +    't', "$sect: table stats do exist");

Think this should grow a test for an index too. There's not that much point in
the isolation case, because we don't have transactional stats, but here it
seems worth testing?

Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 1/5/23 1:27 AM, Andres Freund wrote:
> Hi,
> 
> On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote:
>> diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
>> index 4017e175e3..fca166a063 100644
>> --- a/src/backend/access/common/relation.c
>> +++ b/src/backend/access/common/relation.c
>> @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode)
>>       if (RelationUsesLocalBuffers(r))
>>           MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
>>   
>> -    pgstat_init_relation(r);
>> +    if (r->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_init_index(r);
>> +    else
>> +        pgstat_init_table(r);
>>   
>>       return r;
>>   }
>> @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
>>       if (RelationUsesLocalBuffers(r))
>>           MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
>>   
>> -    pgstat_init_relation(r);
>> +    if (r->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_init_index(r);
>> +    else
>> +        pgstat_init_table(r);
>>   
>>       return r;
>>   }
> 
> Not this patch's fault, but the functions in relation.c have gotten duplicated
> to an almost ridiculous degree :(
> 

Thanks for looking at it!
Right, I'll have a look and will try to submit a dedicated patch for this.

> 
>> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
>> index 3fb38a25cf..98bb230b95 100644
>> --- a/src/backend/storage/buffer/bufmgr.c
>> +++ b/src/backend/storage/buffer/bufmgr.c
>> @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
>>        * Read the buffer, and update pgstat counters to reflect a cache hit or
>>        * miss.
>>        */
>> -    pgstat_count_buffer_read(reln);
>> +    if (reln->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_count_index_buffer_read(reln);
>> +    else
>> +        pgstat_count_table_buffer_read(reln);
>>       buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
>>                               forkNum, blockNum, mode, strategy, &hit);
>>       if (hit)
>> -        pgstat_count_buffer_hit(reln);
>> +    {
>> +        if (reln->rd_rel->relkind == RELKIND_INDEX)
>> +            pgstat_count_index_buffer_hit(reln);
>> +        else
>> +            pgstat_count_table_buffer_hit(reln);
>> +    }
>>       return buf;
>>   }
> 
> Not nice to have additional branches here :(.

Indeed, but that does look like the price to pay for the moment ;-(

> 
> I think going forward we should move buffer stats to a "per-relfilenode" stats
> entry (which would allow to track writes too), then thiw branch would be
> removed again.
> 
> 

Agree. I think the best approach is to have this patch committed and then resume working on [1] (which would most
probablyintroduce
 
the "per-relfilenode" stats.) Does this approach make sense to you?


>> +/* -------------------------------------------------------------------------
>> + *
>> + * pgstat_index.c
>> + *      Implementation of index statistics.
> 
> This is a fair bit of duplicated code. Perhaps it'd be worth keeping
> pgstat_relation with code common to table/index stats?
> 

Good point, will look at what can be done.

> 
>> +bool
>> +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
>> +{
>> +    static const PgStat_IndexCounts all_zeroes;
>> +    Oid            dboid;
>> +
>> +    PgStat_IndexStatus *lstats; /* pending stats entry  */
>> +    PgStatShared_Index *shrelcomstats;
> 
> What does "com" stand for in shrelcomstats?
> 

Oops, thanks!

This naming is coming from my first try while working on this subject (that I did not share).
The idea I had at that time was to create a PGSTAT_KIND_RELATION_COMMON stat type for common stats between tables and
indexes
and a dedicated one (PGSTAT_KIND_TABLE) for tables (given that indexes would have been fully part of the common one).
But it did not work well (specially as we want "dedicated" field names), so I preferred to submit the current
proposal.

Will fix this bad naming.

> 
>> +    PgStat_StatIndEntry *indentry;    /* index entry of shared stats */
>> +    PgStat_StatDBEntry *dbentry;    /* pending database entry */
>> +
>> +    dboid = entry_ref->shared_entry->key.dboid;
>> +    lstats = (PgStat_IndexStatus *) entry_ref->pending;
>> +    shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats;
>> +
>> +    /*
>> +     * Ignore entries that didn't accumulate any actual counts, such as
>> +     * indexes that were opened by the planner but not used.
>> +     */
>> +    if (memcmp(&lstats->i_counts, &all_zeroes,
>> +               sizeof(PgStat_IndexCounts)) == 0)
>> +    {
>> +        return true;
>> +    }
> 
> I really need to propose pg_memiszero().
> 

Oh yeah, great idea, that would be easier to read.

> 
> 
>>   Datum
>> -pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
>> +pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS)
>>   {
>>       Oid            relid = PG_GETARG_OID(0);
>>       int64        result;
>> @@ -1360,17 +1413,32 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
>>       PG_RETURN_INT64(result);
>>   }
>>   
>> +Datum
>> +pg_stat_get_ind_xact_numscans(PG_FUNCTION_ARGS)
>> +{
>> +    Oid            relid = PG_GETARG_OID(0);
>> +    int64        result;
>> +    PgStat_IndexStatus *indentry;
>> +
>> +    if ((indentry = find_indstat_entry(relid)) == NULL)
>> +        result = 0;
>> +    else
>> +        result = (int64) (indentry->i_counts.i_numscans);
>> +
>> +    PG_RETURN_INT64(result);
>> +}
> 
> Why didn't all these get converted to the same macro based approach as the
> !xact versions?
> 

I think the "benefits" was not that "big" as compared to the number of non xact ones.
But, good point, now with the tables/indexes split I think it does: I'll submit a dedicated patch for it.

> 
>>   Datum
>>   pg_stat_get_xact_tuples_returned(PG_FUNCTION_ARGS)
>>   {
>>       Oid            relid = PG_GETARG_OID(0);
>>       int64        result;
>> -    PgStat_TableStatus *tabentry;
>> +    PgStat_IndexStatus *indentry;
>>   
>> -    if ((tabentry = find_tabstat_entry(relid)) == NULL)
>> +    if ((indentry = find_indstat_entry(relid)) == NULL)
>>           result = 0;
>>       else
>> -        result = (int64) (tabentry->t_counts.t_tuples_returned);
>> +        result = (int64) (indentry->i_counts.i_tuples_returned);
>>   
>>       PG_RETURN_INT64(result);
>>   }
> 
> There's a bunch of changes like this, and I don't understand -
> pg_stat_get_xact_tuples_returned() now looks at index stats, even though it
> afaics continues to be used in pg_stat_xact_all_tables? Huh?
> 
> 

Looks like a mistake (I probably messed up while doing all those changes that "look the same"), thanks for pointing
out!
I'll go through each one and double check.

>> +/* ----------
>> + * PgStat_IndexStatus            Per-index status within a backend
>> + *
>> + * Many of the event counters are nontransactional, ie, we count events
>> + * in committed and aborted transactions alike.  For these, we just count
>> + * directly in the PgStat_IndexStatus.
>> + * ----------
>> + */
> 
> Which counters are transactional for indexes? None, no?

Right, will fix.

> 
>> diff --git a/src/test/recovery/t/029_stats_restart.pl b/src/test/recovery/t/029_stats_restart.pl
>> index 83d6647d32..8b0b597419 100644
>> --- a/src/test/recovery/t/029_stats_restart.pl
>> +++ b/src/test/recovery/t/029_stats_restart.pl
>> @@ -43,8 +43,8 @@ my $sect = "initial";
>>   is(have_stats('database', $dboid, 0), 't', "$sect: db stats do exist");
>>   is(have_stats('function', $dboid, $funcoid),
>>       't', "$sect: function stats do exist");
>> -is(have_stats('relation', $dboid, $tableoid),
>> -    't', "$sect: relation stats do exist");
>> +is(have_stats('table', $dboid, $tableoid),
>> +    't', "$sect: table stats do exist");
> 
> Think this should grow a test for an index too. There's not that much point in
> the isolation case, because we don't have transactional stats, but here it
> seems worth testing?
> 

+1, will do.


[1]:
https://www.postgresql.org/message-id/flat/20221220181108.e5fddk3g7cive3v6%40alap3.anarazel.de#4efb4ea3593233bdb400bfb25eb30b81

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
Nitin Jadhav
Date:
>> +/* -------------------------------------------------------------------------
>> + *
>> + * pgstat_index.c
>> + *    Implementation of index statistics.
>
> This is a fair bit of duplicated code. Perhaps it'd be worth keeping
> pgstat_relation with code common to table/index stats?

+1 to keep common functions/code between table and index stats. Only
the data structure should be different as the goal is to shrink the
current memory usage.

Thanks & Regards,
Nitin Jadhav

On Thu, Jan 5, 2023 at 3:35 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On 1/5/23 1:27 AM, Andres Freund wrote:
> > Hi,
> >
> > On 2023-01-03 15:19:18 +0100, Drouvot, Bertrand wrote:
> >> diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
> >> index 4017e175e3..fca166a063 100644
> >> --- a/src/backend/access/common/relation.c
> >> +++ b/src/backend/access/common/relation.c
> >> @@ -73,7 +73,10 @@ relation_open(Oid relationId, LOCKMODE lockmode)
> >>      if (RelationUsesLocalBuffers(r))
> >>              MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
> >>
> >> -    pgstat_init_relation(r);
> >> +    if (r->rd_rel->relkind == RELKIND_INDEX)
> >> +            pgstat_init_index(r);
> >> +    else
> >> +            pgstat_init_table(r);
> >>
> >>      return r;
> >>   }
> >> @@ -123,7 +126,10 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
> >>      if (RelationUsesLocalBuffers(r))
> >>              MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
> >>
> >> -    pgstat_init_relation(r);
> >> +    if (r->rd_rel->relkind == RELKIND_INDEX)
> >> +            pgstat_init_index(r);
> >> +    else
> >> +            pgstat_init_table(r);
> >>
> >>      return r;
> >>   }
> >
> > Not this patch's fault, but the functions in relation.c have gotten duplicated
> > to an almost ridiculous degree :(
> >
>
> Thanks for looking at it!
> Right, I'll have a look and will try to submit a dedicated patch for this.
>
> >
> >> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> >> index 3fb38a25cf..98bb230b95 100644
> >> --- a/src/backend/storage/buffer/bufmgr.c
> >> +++ b/src/backend/storage/buffer/bufmgr.c
> >> @@ -776,11 +776,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
> >>       * Read the buffer, and update pgstat counters to reflect a cache hit or
> >>       * miss.
> >>       */
> >> -    pgstat_count_buffer_read(reln);
> >> +    if (reln->rd_rel->relkind == RELKIND_INDEX)
> >> +            pgstat_count_index_buffer_read(reln);
> >> +    else
> >> +            pgstat_count_table_buffer_read(reln);
> >>      buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
> >>                                                      forkNum, blockNum, mode, strategy, &hit);
> >>      if (hit)
> >> -            pgstat_count_buffer_hit(reln);
> >> +    {
> >> +            if (reln->rd_rel->relkind == RELKIND_INDEX)
> >> +                    pgstat_count_index_buffer_hit(reln);
> >> +            else
> >> +                    pgstat_count_table_buffer_hit(reln);
> >> +    }
> >>      return buf;
> >>   }
> >
> > Not nice to have additional branches here :(.
>
> Indeed, but that does look like the price to pay for the moment ;-(
>
> >
> > I think going forward we should move buffer stats to a "per-relfilenode" stats
> > entry (which would allow to track writes too), then thiw branch would be
> > removed again.
> >
> >
>
> Agree. I think the best approach is to have this patch committed and then resume working on [1] (which would most
probablyintroduce
 
> the "per-relfilenode" stats.) Does this approach make sense to you?
>
>
> >> +/* -------------------------------------------------------------------------
> >> + *
> >> + * pgstat_index.c
> >> + *    Implementation of index statistics.
> >
> > This is a fair bit of duplicated code. Perhaps it'd be worth keeping
> > pgstat_relation with code common to table/index stats?
> >
>
> Good point, will look at what can be done.
>
> >
> >> +bool
> >> +pgstat_index_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
> >> +{
> >> +    static const PgStat_IndexCounts all_zeroes;
> >> +    Oid                     dboid;
> >> +
> >> +    PgStat_IndexStatus *lstats; /* pending stats entry  */
> >> +    PgStatShared_Index *shrelcomstats;
> >
> > What does "com" stand for in shrelcomstats?
> >
>
> Oops, thanks!
>
> This naming is coming from my first try while working on this subject (that I did not share).
> The idea I had at that time was to create a PGSTAT_KIND_RELATION_COMMON stat type for common stats between tables and
indexes
> and a dedicated one (PGSTAT_KIND_TABLE) for tables (given that indexes would have been fully part of the common
one).
> But it did not work well (specially as we want "dedicated" field names), so I preferred to submit the current
proposal.
>
> Will fix this bad naming.
>
> >
> >> +    PgStat_StatIndEntry *indentry;  /* index entry of shared stats */
> >> +    PgStat_StatDBEntry *dbentry;    /* pending database entry */
> >> +
> >> +    dboid = entry_ref->shared_entry->key.dboid;
> >> +    lstats = (PgStat_IndexStatus *) entry_ref->pending;
> >> +    shrelcomstats = (PgStatShared_Index *) entry_ref->shared_stats;
> >> +
> >> +    /*
> >> +     * Ignore entries that didn't accumulate any actual counts, such as
> >> +     * indexes that were opened by the planner but not used.
> >> +     */
> >> +    if (memcmp(&lstats->i_counts, &all_zeroes,
> >> +                       sizeof(PgStat_IndexCounts)) == 0)
> >> +    {
> >> +            return true;
> >> +    }
> >
> > I really need to propose pg_memiszero().
> >
>
> Oh yeah, great idea, that would be easier to read.
>
> >
> >
> >>   Datum
> >> -pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
> >> +pg_stat_get_tab_xact_numscans(PG_FUNCTION_ARGS)
> >>   {
> >>      Oid                     relid = PG_GETARG_OID(0);
> >>      int64           result;
> >> @@ -1360,17 +1413,32 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS)
> >>      PG_RETURN_INT64(result);
> >>   }
> >>
> >> +Datum
> >> +pg_stat_get_ind_xact_numscans(PG_FUNCTION_ARGS)
> >> +{
> >> +    Oid                     relid = PG_GETARG_OID(0);
> >> +    int64           result;
> >> +    PgStat_IndexStatus *indentry;
> >> +
> >> +    if ((indentry = find_indstat_entry(relid)) == NULL)
> >> +            result = 0;
> >> +    else
> >> +            result = (int64) (indentry->i_counts.i_numscans);
> >> +
> >> +    PG_RETURN_INT64(result);
> >> +}
> >
> > Why didn't all these get converted to the same macro based approach as the
> > !xact versions?
> >
>
> I think the "benefits" was not that "big" as compared to the number of non xact ones.
> But, good point, now with the tables/indexes split I think it does: I'll submit a dedicated patch for it.
>
> >
> >>   Datum
> >>   pg_stat_get_xact_tuples_returned(PG_FUNCTION_ARGS)
> >>   {
> >>      Oid                     relid = PG_GETARG_OID(0);
> >>      int64           result;
> >> -    PgStat_TableStatus *tabentry;
> >> +    PgStat_IndexStatus *indentry;
> >>
> >> -    if ((tabentry = find_tabstat_entry(relid)) == NULL)
> >> +    if ((indentry = find_indstat_entry(relid)) == NULL)
> >>              result = 0;
> >>      else
> >> -            result = (int64) (tabentry->t_counts.t_tuples_returned);
> >> +            result = (int64) (indentry->i_counts.i_tuples_returned);
> >>
> >>      PG_RETURN_INT64(result);
> >>   }
> >
> > There's a bunch of changes like this, and I don't understand -
> > pg_stat_get_xact_tuples_returned() now looks at index stats, even though it
> > afaics continues to be used in pg_stat_xact_all_tables? Huh?
> >
> >
>
> Looks like a mistake (I probably messed up while doing all those changes that "look the same"), thanks for pointing
out!
> I'll go through each one and double check.
>
> >> +/* ----------
> >> + * PgStat_IndexStatus                       Per-index status within a backend
> >> + *
> >> + * Many of the event counters are nontransactional, ie, we count events
> >> + * in committed and aborted transactions alike.  For these, we just count
> >> + * directly in the PgStat_IndexStatus.
> >> + * ----------
> >> + */
> >
> > Which counters are transactional for indexes? None, no?
>
> Right, will fix.
>
> >
> >> diff --git a/src/test/recovery/t/029_stats_restart.pl b/src/test/recovery/t/029_stats_restart.pl
> >> index 83d6647d32..8b0b597419 100644
> >> --- a/src/test/recovery/t/029_stats_restart.pl
> >> +++ b/src/test/recovery/t/029_stats_restart.pl
> >> @@ -43,8 +43,8 @@ my $sect = "initial";
> >>   is(have_stats('database', $dboid, 0), 't', "$sect: db stats do exist");
> >>   is(have_stats('function', $dboid, $funcoid),
> >>      't', "$sect: function stats do exist");
> >> -is(have_stats('relation', $dboid, $tableoid),
> >> -    't', "$sect: relation stats do exist");
> >> +is(have_stats('table', $dboid, $tableoid),
> >> +    't', "$sect: table stats do exist");
> >
> > Think this should grow a test for an index too. There's not that much point in
> > the isolation case, because we don't have transactional stats, but here it
> > seems worth testing?
> >
>
> +1, will do.
>
>
> [1]:
https://www.postgresql.org/message-id/flat/20221220181108.e5fddk3g7cive3v6%40alap3.anarazel.de#4efb4ea3593233bdb400bfb25eb30b81
>
> Regards,
>
> --
> Bertrand Drouvot
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>
>



Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2023-01-09 17:08:42 +0530, Nitin Jadhav wrote:
> +1 to keep common functions/code between table and index stats. Only
> the data structure should be different as the goal is to shrink the
> current memory usage.

I don't think the goal is solely to shrink memory usage - it's also to make it
possible to add more stats that are specific to just indexes or just
tables. Of course that's related to memory usage...

Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
vignesh C
Date:
On Tue, 3 Jan 2023 at 19:49, Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On 12/10/22 10:54 AM, Drouvot, Bertrand wrote:
> > Hi,
> >
> > On 12/7/22 11:11 AM, Drouvot, Bertrand wrote:
> >> Hi,
> >>
> >>> Hi,
> >>>
> >>> As [1] mentioned above has been committed (83a1a1b566), please find attached V5 related to this thread making use
ofthe new macros (namely PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ). 
> >>>
> >>> I switched from using "CppConcat" to using "##", as it looks to me it's easier to read now that we are adding
anotherconcatenation to the game (due to the table/index split). 
> >>>
> >>> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" (I don't have a better idea yet):
purposeis to follow the naming convention for PgStat_StatTabEntry/PgStat_StatIndEntry and
pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry,thoughts? 
> >>>
> >>> Looking forward to your feedback,
> >>>
> >>
> >> Attaching V6 (mandatory rebase due to 8018ffbf58).
> >>
> >> While at it, I got rid of the weirdness mentioned above by creating 2 sets of macros (one for the tables and one
forthe indexes). 
> >>
> >> Looking forward to your feedback,
> >>
> >> Regards,
> >>
> >
> > Attaching V7, mandatory rebase due to 66dcb09246.
> >
>
> Attaching V8, mandatory rebase due to c8e1ba736b.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
d540a02a724b9643205abce8c5644a0f0908f6e3 ===
=== applying patch ./v8-0001-split_tables_indexes_stats.patch
....
patching file src/backend/utils/activity/pgstat_table.c (renamed from
src/backend/utils/activity/pgstat_relation.c)
Hunk #25 FAILED at 759.
....
1 out of 29 hunks FAILED -- saving rejects to file
src/backend/utils/activity/pgstat_table.c.rej

[1] - http://cfbot.cputube.org/patch_41_3984.log

Regards,
Vignesh



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 1/19/23 12:28 PM, vignesh C wrote:
> On Tue, 3 Jan 2023 at 19:49, Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>> Attaching V8, mandatory rebase due to c8e1ba736b.
> 
> The patch does not apply on top of HEAD as in [1], please post a rebased patch:
> === Applying patches on top of PostgreSQL commit ID
> d540a02a724b9643205abce8c5644a0f0908f6e3 ===
> === applying patch ./v8-0001-split_tables_indexes_stats.patch
> ....
> patching file src/backend/utils/activity/pgstat_table.c (renamed from
> src/backend/utils/activity/pgstat_relation.c)
> Hunk #25 FAILED at 759.
> ....
> 1 out of 29 hunks FAILED -- saving rejects to file
> src/backend/utils/activity/pgstat_table.c.rej
> 
> [1] - http://cfbot.cputube.org/patch_41_3984.log
> 

Thanks for the warning!

Please find attached a rebased version (previous comments
up-thread still need to be addressed though).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Michael Paquier
Date:
On Sat, Jan 21, 2023 at 06:42:51AM +0100, Drouvot, Bertrand wrote:
> Please find attached a rebased version (previous comments
> up-thread still need to be addressed though).

This patch has a lot of conflicts.  Could you send a rebased version?
--
Michael

Attachment

Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 3/16/23 7:54 AM, Michael Paquier wrote:
> On Sat, Jan 21, 2023 at 06:42:51AM +0100, Drouvot, Bertrand wrote:
>> Please find attached a rebased version (previous comments
>> up-thread still need to be addressed though).
> 
> This patch has a lot of conflicts.  Could you send a rebased version?

Thanks for looking at it!

Please find attached v10.

Please note that the previous comments
up-thread still need to be addressed though, and that this patch is also somehow linked to the:

"Generate pg_stat_get_xact*() functions with Macros" patch (see [1]) (which itself has some dependencies, see [2])

My plan was to get [1] done before resuming working on the "Split index and table statistics into different types of
stats"one.
 

Regards,

[1]: https://commitfest.postgresql.org/42/4106/
[2]: https://www.postgresql.org/message-id/11744b0e-5f7f-aba8-7d9c-2ff0a0c6e2b2%40gmail.com

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Michael Paquier
Date:
On Thu, Mar 16, 2023 at 10:24:32AM +0100, Drouvot, Bertrand wrote:
> My plan was to get [1] done before resuming working on the "Split
> index and table statistics into different types of stats" one.

Okay, I was unsure what should be the order here.  Let's see about [1]
first, then.
--
Michael

Attachment

Re: Split index and table statistics into different types of stats

From
"Gregory Stark (as CFM)"
Date:
On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> My plan was to get [1] done before resuming working on the "Split index and table statistics into different types of
stats"one.
 
> [1]: https://commitfest.postgresql.org/42/4106/


Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27.

There's only a few days left in this CF. Would you like to leave this
here? Should it be marked Needs Review or Ready for Commit? Or should
we move it to the next CF now?



--
Gregory Stark
As Commitfest Manager



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote:
> On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> My plan was to get [1] done before resuming working on the "Split index and table statistics into different types of
stats"one.
 
>> [1]: https://commitfest.postgresql.org/42/4106/
> 
> 
> Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27.
> 
> There's only a few days left in this CF. Would you like to leave this
> here? Should it be marked Needs Review or Ready for Commit? Or should
> we move it to the next CF now?
> 
> 
> 

I moved it to the next commitfest and marked the target version as v17.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
Michael Paquier
Date:
On Tue, Apr 04, 2023 at 12:04:35PM +0200, Drouvot, Bertrand wrote:
> I moved it to the next commitfest and marked the target version as
> v17.

Thanks for moving it.  I think that we should be able to do a bit more
work for the switch to macros in pgstatfuncs.c, but this is going to
require more review than the feature freeze date allow, I am afraid.
--
Michael

Attachment

Re: Split index and table statistics into different types of stats

From
Daniel Gustafsson
Date:
> On 4 Apr 2023, at 12:04, Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
> On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote:
>> On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand
>> <bertranddrouvot.pg@gmail.com> wrote:
>>>
>>> My plan was to get [1] done before resuming working on the "Split index and table statistics into different types
ofstats" one. 
>>> [1]: https://commitfest.postgresql.org/42/4106/
>> Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27.
>> There's only a few days left in this CF. Would you like to leave this
>> here? Should it be marked Needs Review or Ready for Commit? Or should
>> we move it to the next CF now?
>
> I moved it to the next commitfest and marked the target version as v17.

This patch no longer applies (with tests failing when it did), and the thread
has stalled.  I'm marking this returned with feedback for now, please feel free
to resubmit to a future CF with a new version of the patch.

--
Daniel Gustafsson




Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 7/10/23 11:14 AM, Daniel Gustafsson wrote:
>> On 4 Apr 2023, at 12:04, Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
>> On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote:
>>> On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand
>>> <bertranddrouvot.pg@gmail.com> wrote:
>>>>
>>>> My plan was to get [1] done before resuming working on the "Split index and table statistics into different types
ofstats" one.
 
>>>> [1]: https://commitfest.postgresql.org/42/4106/
>>> Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27.
>>> There's only a few days left in this CF. Would you like to leave this
>>> here? Should it be marked Needs Review or Ready for Commit? Or should
>>> we move it to the next CF now?
>>
>> I moved it to the next commitfest and marked the target version as v17.
> 
> This patch no longer applies (with tests failing when it did), and the thread
> has stalled.  I'm marking this returned with feedback for now, please feel free
> to resubmit to a future CF with a new version of the patch.

Thanks for the update.
I'll resume working on it and re-submit once done.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 7/10/23 11:14 AM, Daniel Gustafsson wrote:
>> On 4 Apr 2023, at 12:04, Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
>> On 4/3/23 11:47 PM, Gregory Stark (as CFM) wrote:
>>> On Thu, 16 Mar 2023 at 05:25, Drouvot, Bertrand
>>> <bertranddrouvot.pg@gmail.com> wrote:
>>>>
>>>> My plan was to get [1] done before resuming working on the "Split index and table statistics into different types
ofstats" one.
 
>>>> [1]: https://commitfest.postgresql.org/42/4106/
>>> Generate pg_stat_get_xact*() functions with Macros ([1]) was committed March 27.
>>> There's only a few days left in this CF. Would you like to leave this
>>> here? Should it be marked Needs Review or Ready for Commit? Or should
>>> we move it to the next CF now?
>>
>> I moved it to the next commitfest and marked the target version as v17.
> 
> This patch no longer applies (with tests failing when it did), and the thread
> has stalled.  I'm marking this returned with feedback for now, please feel free
> to resubmit to a future CF with a new version of the patch.
> 

FWIW, attached a rebased version as V11.

Will now work on addressing the up-thread remaining comments.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Split index and table statistics into different types of stats

From
Andres Freund
Date:
Hi,

On 2023-11-13 09:26:56 +0100, Drouvot, Bertrand wrote:
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -799,11 +799,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
>       * Read the buffer, and update pgstat counters to reflect a cache hit or
>       * miss.
>       */
> -    pgstat_count_buffer_read(reln);
> +    if (reln->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_count_index_buffer_read(reln);
> +    else
> +        pgstat_count_table_buffer_read(reln);

It's not nice from a layering POV that we need this level of awareness in
bufmgr.c.  I wonder if this is an argument for first splitting out stats like
blocks_hit, blocks_fetched into something like "relfilenode stats" - they're
agnostic of the relkind.  There aren't that many such stats right now,
admittedly, but I think we'll want to also track dirtied, written blocks on a
per relation basis once we can (i.e. we key the relevant stats by relfilenode
instead of oid, so we can associate stats when writing out buffers).


> +/*
> + * Initialize a relcache entry to count access statistics.  Called whenever an
> + * index is opened.
> + *
> + * We assume that a relcache entry's pgstatind_info field is zeroed by relcache.c
> + * when the relcache entry is made; thereafter it is long-lived data.
> + *
> + * This does not create a reference to a stats entry in shared memory, nor
> + * allocate memory for the pending stats. That happens in
> + * pgstat_assoc_index().
> + */
> +void
> +pgstat_init_index(Relation rel)
> +{
> +    /*
> +     * We only count stats for indexes
> +     */
> +    Assert(rel->rd_rel->relkind == RELKIND_INDEX);
> +
> +    if (!pgstat_track_counts)
> +    {
> +        if (rel->pgstatind_info != NULL)
> +            pgstat_unlink_index(rel);
> +
> +        /* We're not counting at all */
> +        rel->pgstat_enabled = false;
> +        rel->pgstatind_info = NULL;
> +        return;
> +    }
> +
> +    rel->pgstat_enabled = true;
> +}
> +
> +/*
> + * Prepare for statistics for this index to be collected.
> + *
> + * This ensures we have a reference to the stats entry before stats can be
> + * generated. That is important because an index drop in another
> + * connection could otherwise lead to the stats entry being dropped, which then
> + * later would get recreated when flushing stats.
> + *
> + * This is separate from pgstat_init_index() as it is not uncommon for
> + * relcache entries to be opened without ever getting stats reported.
> + */
> +void
> +pgstat_assoc_index(Relation rel)
> +{
> +    Assert(rel->pgstat_enabled);
> +    Assert(rel->pgstatind_info == NULL);
> +
> +    /* Else find or make the PgStat_IndexStatus entry, and update link */
> +    rel->pgstatind_info = pgstat_prep_index_pending(RelationGetRelid(rel),
> +                                                    rel->rd_rel->relisshared);
> +
> +    /* don't allow link a stats to multiple relcache entries */
> +    Assert(rel->pgstatind_info->relation == NULL);
> +
> +    /* mark this relation as the owner */
> +    rel->pgstatind_info->relation = rel;
> +}
> +
> +/*
> + * Break the mutual link between a relcache entry and pending index stats entry.
> + * This must be called whenever one end of the link is removed.
> + */
> +void
> +pgstat_unlink_index(Relation rel)
> +{
> +
> +    if (rel->pgstatind_info == NULL)
> +        return;
> +
> +    /* link sanity check for the index stats */
> +    if (rel->pgstatind_info)
> +    {
> +        Assert(rel->pgstatind_info->relation == rel);
> +        rel->pgstatind_info->relation = NULL;
> +        rel->pgstatind_info = NULL;
> +    }
> +}
> ...

This is a fair bit of duplicated code - perhaps we could have shared helpers?


> +/* ----------
> + * PgStat_IndexStatus            Per-index status within a backend
> + *
> + * Many of the event counters are nontransactional, ie, we count events
> + * in committed and aborted transactions alike.  For these, we just count
> + * directly in the PgStat_IndexStatus.
> + * ----------
> + */
> +typedef struct PgStat_IndexStatus
> +{
> +    Oid            r_id;            /* relation's OID */
> +    bool        r_shared;        /* is it a shared catalog? */
> +    struct PgStat_IndexXactStatus *trans;    /* lowest subxact's counts */
> +    PgStat_IndexCounts counts;    /* event counts to be sent */
> +    Relation    relation;        /* rel that is using this entry */
> +} PgStat_IndexStatus;
> +
>  /* ----------
>   * PgStat_TableXactStatus        Per-table, per-subtransaction status
>   * ----------
> @@ -227,6 +264,29 @@ typedef struct PgStat_TableXactStatus
>  } PgStat_TableXactStatus;
>  
>  
> +/* ----------
> + * PgStat_IndexXactStatus        Per-index, per-subtransaction status
> + * ----------
> + */
> +typedef struct PgStat_IndexXactStatus
> +{
> +    PgStat_Counter tuples_inserted; /* tuples inserted in (sub)xact */
> +    PgStat_Counter tuples_updated;    /* tuples updated in (sub)xact */
> +    PgStat_Counter tuples_deleted;    /* tuples deleted in (sub)xact */
> +    bool        truncdropped;    /* relation truncated/dropped in this
> +                                 * (sub)xact */
> +    /* tuples i/u/d prior to truncate/drop */
> +    PgStat_Counter inserted_pre_truncdrop;
> +    PgStat_Counter updated_pre_truncdrop;
> +    PgStat_Counter deleted_pre_truncdrop;
> +    int            nest_level;        /* subtransaction nest level */
> +    /* links to other structs for same relation: */
> +    struct PgStat_IndexXactStatus *upper;    /* next higher subxact if any */
> +    PgStat_IndexStatus *parent; /* per-table status */
> +    /* structs of same subxact level are linked here: */
> +    struct PgStat_IndexXactStatus *next;    /* next of same subxact */
> +} PgStat_IndexXactStatus;

I don't think much of this is used? It doesn't look like you're using most of
the fields. Which makes sense - there's not really the same transactional
behaviour for indexes as there is for tables.


Greetings,

Andres Freund



Re: Split index and table statistics into different types of stats

From
"Drouvot, Bertrand"
Date:
Hi,

On 11/13/23 9:44 PM, Andres Freund wrote:
> Hi,
> 
> On 2023-11-13 09:26:56 +0100, Drouvot, Bertrand wrote:
>> --- a/src/backend/storage/buffer/bufmgr.c
>> +++ b/src/backend/storage/buffer/bufmgr.c
>> @@ -799,11 +799,19 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
>>        * Read the buffer, and update pgstat counters to reflect a cache hit or
>>        * miss.
>>        */
>> -    pgstat_count_buffer_read(reln);
>> +    if (reln->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_count_index_buffer_read(reln);
>> +    else
>> +        pgstat_count_table_buffer_read(reln);
> 
> It's not nice from a layering POV that we need this level of awareness in
> bufmgr.c.  I wonder if this is an argument for first splitting out stats like
> blocks_hit, blocks_fetched into something like "relfilenode stats" - they're
> agnostic of the relkind. 

Thanks for looking at it! Yeah I think that would make a lot of sense
to track some stats per relfilenode.

> There aren't that many such stats right now,
> admittedly, but I think we'll want to also track dirtied, written blocks on a
> per relation basis once we can (i.e. we key the relevant stats by relfilenode
> instead of oid, so we can associate stats when writing out buffers).
> 
> 

Agree. Then, I think that would make sense to start this effort before the
split index/table one. I can work on a per relfilenode stat patch first.

Does this patch ordering make sense to you?

1) Introduce per relfilenode stats
2) Split index and table stats

>> +/*
>> + * Initialize a relcache entry to count access statistics.  Called whenever an
>> + * index is opened.
>> + *
>> + * We assume that a relcache entry's pgstatind_info field is zeroed by relcache.c
>> + * when the relcache entry is made; thereafter it is long-lived data.
>> + *
>> + * This does not create a reference to a stats entry in shared memory, nor
>> + * allocate memory for the pending stats. That happens in
>> + * pgstat_assoc_index().
>> + */
>> +void
>> +pgstat_init_index(Relation rel)
>> +{
>> +    /*
>> +     * We only count stats for indexes
>> +     */
>> +    Assert(rel->rd_rel->relkind == RELKIND_INDEX);
>> +
>> +    if (!pgstat_track_counts)
>> +    {
>> +        if (rel->pgstatind_info != NULL)
>> +            pgstat_unlink_index(rel);
>> +
>> +        /* We're not counting at all */
>> +        rel->pgstat_enabled = false;
>> +        rel->pgstatind_info = NULL;
>> +        return;
>> +    }
>> +
>> +    rel->pgstat_enabled = true;
>> +}
>> +
>> +/*
>> + * Prepare for statistics for this index to be collected.
>> + *
>> + * This ensures we have a reference to the stats entry before stats can be
>> + * generated. That is important because an index drop in another
>> + * connection could otherwise lead to the stats entry being dropped, which then
>> + * later would get recreated when flushing stats.
>> + *
>> + * This is separate from pgstat_init_index() as it is not uncommon for
>> + * relcache entries to be opened without ever getting stats reported.
>> + */
>> +void
>> +pgstat_assoc_index(Relation rel)
>> +{
>> +    Assert(rel->pgstat_enabled);
>> +    Assert(rel->pgstatind_info == NULL);
>> +
>> +    /* Else find or make the PgStat_IndexStatus entry, and update link */
>> +    rel->pgstatind_info = pgstat_prep_index_pending(RelationGetRelid(rel),
>> +                                                    rel->rd_rel->relisshared);
>> +
>> +    /* don't allow link a stats to multiple relcache entries */
>> +    Assert(rel->pgstatind_info->relation == NULL);
>> +
>> +    /* mark this relation as the owner */
>> +    rel->pgstatind_info->relation = rel;
>> +}
>> +
>> +/*
>> + * Break the mutual link between a relcache entry and pending index stats entry.
>> + * This must be called whenever one end of the link is removed.
>> + */
>> +void
>> +pgstat_unlink_index(Relation rel)
>> +{
>> +
>> +    if (rel->pgstatind_info == NULL)
>> +        return;
>> +
>> +    /* link sanity check for the index stats */
>> +    if (rel->pgstatind_info)
>> +    {
>> +        Assert(rel->pgstatind_info->relation == rel);
>> +        rel->pgstatind_info->relation = NULL;
>> +        rel->pgstatind_info = NULL;
>> +    }
>> +}
>> ...
> 
> This is a fair bit of duplicated code - perhaps we could have shared helpers?
> 

Yeah, I had it in mind and that was part of the "Will now work on addressing the
up-thread remaining comments" remark I made up-thread.

> 
>> +/* ----------
>> + * PgStat_IndexStatus            Per-index status within a backend
>> + *
>> + * Many of the event counters are nontransactional, ie, we count events
>> + * in committed and aborted transactions alike.  For these, we just count
>> + * directly in the PgStat_IndexStatus.
>> + * ----------
>> + */
>> +typedef struct PgStat_IndexStatus
>> +{
>> +    Oid            r_id;            /* relation's OID */
>> +    bool        r_shared;        /* is it a shared catalog? */
>> +    struct PgStat_IndexXactStatus *trans;    /* lowest subxact's counts */
>> +    PgStat_IndexCounts counts;    /* event counts to be sent */
>> +    Relation    relation;        /* rel that is using this entry */
>> +} PgStat_IndexStatus;
>> +
>>   /* ----------
>>    * PgStat_TableXactStatus        Per-table, per-subtransaction status
>>    * ----------
>> @@ -227,6 +264,29 @@ typedef struct PgStat_TableXactStatus
>>   } PgStat_TableXactStatus;
>>   
>>   
>> +/* ----------
>> + * PgStat_IndexXactStatus        Per-index, per-subtransaction status
>> + * ----------
>> + */
>> +typedef struct PgStat_IndexXactStatus
>> +{
>> +    PgStat_Counter tuples_inserted; /* tuples inserted in (sub)xact */
>> +    PgStat_Counter tuples_updated;    /* tuples updated in (sub)xact */
>> +    PgStat_Counter tuples_deleted;    /* tuples deleted in (sub)xact */
>> +    bool        truncdropped;    /* relation truncated/dropped in this
>> +                                 * (sub)xact */
>> +    /* tuples i/u/d prior to truncate/drop */
>> +    PgStat_Counter inserted_pre_truncdrop;
>> +    PgStat_Counter updated_pre_truncdrop;
>> +    PgStat_Counter deleted_pre_truncdrop;
>> +    int            nest_level;        /* subtransaction nest level */
>> +    /* links to other structs for same relation: */
>> +    struct PgStat_IndexXactStatus *upper;    /* next higher subxact if any */
>> +    PgStat_IndexStatus *parent; /* per-table status */
>> +    /* structs of same subxact level are linked here: */
>> +    struct PgStat_IndexXactStatus *next;    /* next of same subxact */
>> +} PgStat_IndexXactStatus;
> 
> I don't think much of this is used? It doesn't look like you're using most of
> the fields. Which makes sense - there's not really the same transactional
> behaviour for indexes as there is for tables.
> 
> 

Fully agree. I had in mind to revisit this stuff too.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Split index and table statistics into different types of stats

From
Bertrand Drouvot
Date:
Hi,

On Tue, Nov 14, 2023 at 09:04:03AM +0100, Drouvot, Bertrand wrote:
> On 11/13/23 9:44 PM, Andres Freund wrote:
> > Hi,
> > 
> > It's not nice from a layering POV that we need this level of awareness in
> > bufmgr.c.  I wonder if this is an argument for first splitting out stats like
> > blocks_hit, blocks_fetched into something like "relfilenode stats" - they're
> > agnostic of the relkind.
> 
> Thanks for looking at it! Yeah I think that would make a lot of sense
> to track some stats per relfilenode.
> 
> > There aren't that many such stats right now,
> > admittedly, but I think we'll want to also track dirtied, written blocks on a
> > per relation basis once we can (i.e. we key the relevant stats by relfilenode
> > instead of oid, so we can associate stats when writing out buffers).
> > 
> > 
> 
> Agree. Then, I think that would make sense to start this effort before the
> split index/table one. I can work on a per relfilenode stat patch first.
> 
> Does this patch ordering make sense to you?
> 
> 1) Introduce per relfilenode stats
> 2) Split index and table stats

Just a quick update on this: I had a chat with Andres at pgconf.eu and we agreed
on the above ordering so that:

1) I started working on relfilenode stats (I hope to be able to provide a POC
patch soon).

2) The CF entry [1] status related to this thread has been changed to "Waiting
on Author".

[1]: https://commitfest.postgresql.org/47/4792/

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com