Re: Split index and table statistics into different types of stats - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Split index and table statistics into different types of stats
Date
Msg-id 4cc5476f-4360-1bed-c60d-2c21adf7db13@gmail.com
Whole thread Raw
In response to Re: Split index and table statistics into different types of stats  (Andres Freund <andres@anarazel.de>)
Responses Re: Split index and table statistics into different types of stats
List pgsql-hackers
Hi,

On 11/1/22 1:30 AM, Andres Freund wrote:
> Hi,
> 
> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
>> Please find attached a patch proposal to split index and table statistics
>> into different types of stats.
>>
>> This idea has been proposed by Andres in a couple of threads, see [1] and
>> [2].
> 
> Thanks for working on this!
> 

Thanks for looking at it!

> 
> 
>> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
>> index 5b49cc5a09..8a715db82e 100644
>> --- a/src/backend/catalog/heap.c
>> +++ b/src/backend/catalog/heap.c
>> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>>           RelationDropStorage(rel);
>>   
>>       /* ensure that stats are dropped if transaction commits */
>> -    pgstat_drop_relation(rel);
>> +    pgstat_drop_heap(rel);
> 
> I don't think "heap" is a good name for these, even if there's some historical
> reasons for it. Particularly because you used "table" in some bits and pieces
> too.
> 

Agree, replaced by "table" where appropriate in V3 attached.

> 
>>   /*
>> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>>   void
>>   pgstat_create_relation(Relation rel)
>>   {
>> -    pgstat_create_transactional(PGSTAT_KIND_RELATION,
>> -                                rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> -                                RelationGetRelid(rel));
>> +    if (rel->rd_rel->relkind == RELKIND_INDEX)
>> +        pgstat_create_transactional(PGSTAT_KIND_INDEX,
>> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +                                    RelationGetRelid(rel));
>> +    else
>> +        pgstat_create_transactional(PGSTAT_KIND_TABLE,
>> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +                                    RelationGetRelid(rel));
>> +}
> 
> Hm - why is this best handled on this level, rather than at the caller?
> 
> 

Agree that it should be split in 
pgstat_create_table()/pgstat_create_index() (also as it was already 
split for the "drop" case): done in V3.

>> +/*
>> + * Support function for the SQL-callable pgstat* functions. Returns
>> + * the collected statistics for one index or NULL. NULL doesn't mean
>> + * that the index doesn't exist, just that there are no statistics, so the
>> + * caller is better off to report ZERO instead.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry(Oid relid)
>> +{
>> +    PgStat_StatIndEntry *indentry;
>> +
>> +    indentry = pgstat_fetch_stat_indentry_ext(false, relid);
>> +    if (indentry != NULL)
>> +        return indentry;
>> +
>> +    /*
>> +     * If we didn't find it, maybe it's a shared index.
>> +     */
>> +    indentry = pgstat_fetch_stat_indentry_ext(true, relid);
>> +    return indentry;
>> +}
>> +
>> +/*
>> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
>> + * whether the to-be-accessed index is shared or not.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
>> +{
>> +    Oid            dboid = (shared ? InvalidOid : MyDatabaseId);
>> +
>> +    return (PgStat_StatIndEntry *)
>> +        pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>>   }
> 
> Do we need this split anywhere for now? I suspect not, the table case is
> mainly for the autovacuum launcher, which won't look at indexes "in isolation".
> 

Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in 
pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()).

> 
> 
>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>>       PG_RETURN_INT64(result);
>>   }
>>   
>> +Datum
>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
>> +{
>> +    Oid            relid = PG_GETARG_OID(0);
>> +    int64        result;
>> +    PgStat_StatIndEntry *indentry;
>> +
>> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
>> +        result = 0;
>> +    else
>> +        result = (int64) (indentry->blocks_fetched);
>> +
>> +    PG_RETURN_INT64(result);
>> +}
> 
> We have so many copies of this by now - perhaps we first should deduplicate
> them somehow? Even if it's just a macro or such.
> 

Yeah good point, a new macro has been defined for the "int64" return 
case in V3 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: Adding doubly linked list type which stores the number of items in the list
Next
From: Yuya Watari
Date:
Subject: Re: [PoC] Reducing planning time when tables have many partitions