Re: Per backend relation statistics tracking - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Per backend relation statistics tracking
Date
Msg-id 7fhpds4xqk6bnudzmzkqi33pinsxammpljwde5gfkjdygvejrj@ojkzfr7dxkmm
In response to Per backend relation statistics tracking  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
Hi,

On 2025-08-12 07:48:10 +0000, Bertrand Drouvot wrote:
> From 9e2f8cb9a87f1d9be91f2f39ef25fbb254944968 Mon Sep 17 00:00:00 2001
> From: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
> Date: Mon, 4 Aug 2025 08:14:02 +0000
> Subject: [PATCH v1 01/10] Adding per backend relation statistics tracking
> 
> This commit introduces per backend relation stats tracking and adds a
> new PgStat_BackendRelPending struct to store the pending statistics. To begin with,
> this commit adds a new counter (heap_scan) to record the number of sequential
> scans initiated on tables.
> 
> This commit relies on the existing per backend statistics machinery that has been
> added in 9aea73fc61d.
> ---
>  src/backend/access/heap/heapam.c            |  3 ++
>  src/backend/utils/activity/pgstat_backend.c | 59 +++++++++++++++++++++
>  src/include/pgstat.h                        | 14 +++++
>  src/include/utils/pgstat_internal.h         |  3 +-
>  src/tools/pgindent/typedefs.list            |  1 +
>  5 files changed, 79 insertions(+), 1 deletion(-)
>   73.9% src/backend/utils/activity/
>    7.4% src/include/utils/
>   15.4% src/include/
>    3.2% src/
> 
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 0dcd6ee817e..d9d6fb6c6ea 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -467,7 +467,10 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
>       * and for sample scans we update stats for tuple fetches).
>       */
>      if (scan->rs_base.rs_flags & SO_TYPE_SEQSCAN)
> +    {
>          pgstat_count_heap_scan(scan->rs_base.rs_rd);
> +        pgstat_count_backend_rel_heap_scan();
> +    }
>  }
>

I don't like that this basically doubles the overhead of keeping stats by
tracking everything twice. The proper solution is to do the per-backend
accounting not in the hot path (i.e. in scans), but when the pending stats
are summarized to be flushed to the shared stats.
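
To make that concrete, here is a minimal sketch of accumulating the
per-backend total at flush time. Every name in it (RelPending,
BackendRelStats, count_heap_scan, flush_pending, the fixed-size arrays) is a
hypothetical stand-in chosen for illustration, not the actual pgstat
structures or functions; the real pending-entry handling would look different.

```c
#include <assert.h>

/* Hypothetical stand-ins, NOT the real PostgreSQL pgstat structures. */
#define NRELS 3

typedef struct RelPending
{
    long heap_scan;             /* scans counted since the last flush */
} RelPending;

typedef struct BackendRelStats
{
    long heap_scan;             /* per-backend total across all relations */
} BackendRelStats;

static RelPending rel_pending[NRELS];       /* per-relation pending counters */
static long shared_rel_heap_scan[NRELS];    /* stand-in for shared stats */
static BackendRelStats backend_stats;       /* stand-in for per-backend stats */

/* Hot path: a single increment per scan, same cost as without the feature. */
static inline void
count_heap_scan(int rel)
{
    assert(rel >= 0 && rel < NRELS);
    rel_pending[rel].heap_scan++;
}

/*
 * Flush path: runs rarely. While each relation's pending counter is pushed
 * to the shared stats, the same delta is added to the per-backend total,
 * so nothing is counted twice in the scan code.
 */
static void
flush_pending(void)
{
    for (int i = 0; i < NRELS; i++)
    {
        long n = rel_pending[i].heap_scan;

        if (n == 0)
            continue;

        shared_rel_heap_scan[i] += n;   /* existing per-relation flush */
        backend_stats.heap_scan += n;   /* per-backend summary, done here */
        rel_pending[i].heap_scan = 0;
    }
}
```

With this shape, the scan code calls only count_heap_scan(); the per-backend
number lags until the next flush, which is the same visibility delay the
per-relation stats already have.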

FWIW, I think this was done wrongly for the per-backend IO stats too. I've
seen the increased overhead in profiles - and IO related counters aren't
incremented remotely as often as the scan related counters are.

Greetings,

Andres Freund


