Re: Split index and table statistics into different types of stats - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Split index and table statistics into different types of stats
Date
Msg-id 20221101003006.3ehvn2prai26rkrf@awork3.anarazel.de
Whole thread Raw
In response to Split index and table statistics into different types of stats  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Split index and table statistics into different types of stats
List pgsql-hackers
Hi,

On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
> Please find attached a patch proposal to split index and table statistics
> into different types of stats.
> 
> This idea has been proposed by Andres in a couple of threads, see [1] and
> [2].

Thanks for working on this!



> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
> index 5b49cc5a09..8a715db82e 100644
> --- a/src/backend/catalog/heap.c
> +++ b/src/backend/catalog/heap.c
> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>          RelationDropStorage(rel);
>  
>      /* ensure that stats are dropped if transaction commits */
> -    pgstat_drop_relation(rel);
> +    pgstat_drop_heap(rel);

I don't think "heap" is a good name for these, even if there's some historical
reasons for it. Particularly because you used "table" in some bits and pieces
too.


>  /*
> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>  void
>  pgstat_create_relation(Relation rel)
>  {
> -    pgstat_create_transactional(PGSTAT_KIND_RELATION,
> -                                rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> -                                RelationGetRelid(rel));
> +    if (rel->rd_rel->relkind == RELKIND_INDEX)
> +        pgstat_create_transactional(PGSTAT_KIND_INDEX,
> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> +                                    RelationGetRelid(rel));
> +    else
> +        pgstat_create_transactional(PGSTAT_KIND_TABLE,
> +                                    rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
> +                                    RelationGetRelid(rel));
> +}

Hm - why is this best handled on this level, rather than at the caller?


> +/*
> + * Support function for the SQL-callable pgstat* functions. Returns
> + * the collected statistics for one index or NULL. NULL doesn't mean
> + * that the index doesn't exist, just that there are no statistics, so the
> + * caller is better off to report ZERO instead.
> + */
> +PgStat_StatIndEntry *
> +pgstat_fetch_stat_indentry(Oid relid)
> +{
> +    PgStat_StatIndEntry *indentry;
> +
> +    indentry = pgstat_fetch_stat_indentry_ext(false, relid);
> +    if (indentry != NULL)
> +        return indentry;
> +
> +    /*
> +     * If we didn't find it, maybe it's a shared index.
> +     */
> +    indentry = pgstat_fetch_stat_indentry_ext(true, relid);
> +    return indentry;
> +}
> +
> +/*
> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
> + * whether the to-be-accessed index is shared or not.
> + */
> +PgStat_StatIndEntry *
> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
> +{
> +    Oid            dboid = (shared ? InvalidOid : MyDatabaseId);
> +
> +    return (PgStat_StatIndEntry *)
> +        pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>  }

Do we need this split anywhere for now? I suspect not, the table case is
mainly for the autovacuum launcher, which won't look at indexes "in isolation".



> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>      PG_RETURN_INT64(result);
>  }
>  
> +Datum
> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
> +{
> +    Oid            relid = PG_GETARG_OID(0);
> +    int64        result;
> +    PgStat_StatIndEntry *indentry;
> +
> +    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
> +        result = 0;
> +    else
> +        result = (int64) (indentry->blocks_fetched);
> +
> +    PG_RETURN_INT64(result);
> +}

We have so many copies of this by now - perhaps we first should deduplicate
them somehow? Even if it's just a macro or such.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: heavily contended lwlocks with long wait queues scale badly
Next
From: Michael Paquier
Date:
Subject: Re: Simplifying our Trap/Assert infrastructure