Re: [PATCH] pg_statio_all_tables: several rows per table due to invalid TOAST index - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PATCH] pg_statio_all_tables: several rows per table due to invalid TOAST index
Date
Msg-id 1703908.1647814114@sss.pgh.pa.us
Whole thread Raw
In response to Re: [PATCH] pg_statio_all_tables: several rows per table due to invalid TOAST index  (Andrei Zubkov <zubkov@moonset.ru>)
Responses Re: [PATCH] pg_statio_all_tables: several rows per table due to invalid TOAST index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Andrei Zubkov <zubkov@moonset.ru> writes:
> On Tue, 2021-11-30 at 17:29 +0900, Michael Paquier wrote:
>> Hmm.  Why should we care about invalid indexes at all, including
>> pg_statio_all_indexes?

> I think we should care about them at least because they are exists and
> can consume resources. For example, invalid index is to be updated by
> DML operations.
> Of course we can exclude such indexes from a view using isvalid,
> isready, islive fields. But in such case we should mention this in the
> docs, and more important is that the new such states of indexes can
> appear in the future causing change in a view definition. Counting all
> indexes regardless of states seems more reasonable to me.

Yeah, I agree, especially since we do it like that for the table's
own indexes.  I have a couple of comments though:

1. There's a silly typo in the view definition (it outputs tidx_blks_read
twice).  Fixed in the attached v2.

2. Historically, if you put any constraints on the view output, like
    select * from pg_statio_all_tables where relname like 'foo%';
you'd get a commensurate reduction in the amount of work done.  With
this version, you don't: the CTE will get computed in its entirety
even if we just need one row of its result.  This seems pretty bad,
especially for installations with many tables --- I suspect many
users would think this cure is worse than the disease.

I'm not quite sure what to do about #2.  I thought of just removing
X.indexrelid from the GROUP BY clause and summing over the toast
index(es) as we do for the table's index(es).  But that doesn't
work: if there are N > 1 table indexes, then the counts for
the toast index(es) will be multiplied by N, and conversely if
there are multiple toast indexes then the counts for the table
indexes will be scaled up.  We need to sum separately over the
table indexes and toast indexes, and I don't immediately see how
to do that without creating an optimization fence.

            regards, tom lane

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..a977efd3c5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -710,6 +710,18 @@ CREATE VIEW pg_stat_xact_user_tables AS
           schemaname !~ '^pg_toast';

 CREATE VIEW pg_statio_all_tables AS
+    WITH indstat AS (
+        SELECT
+            indrelid,
+            sum(pg_stat_get_blocks_fetched(indexrelid) -
+                pg_stat_get_blocks_hit(indexrelid))::bigint
+            AS idx_blks_read,
+            sum(pg_stat_get_blocks_hit(indexrelid))::bigint
+            AS idx_blks_hit
+        FROM
+            pg_index
+        GROUP BY indrelid
+    )
     SELECT
             C.oid AS relid,
             N.nspname AS schemaname,
@@ -717,22 +729,19 @@ CREATE VIEW pg_statio_all_tables AS
             pg_stat_get_blocks_fetched(C.oid) -
                     pg_stat_get_blocks_hit(C.oid) AS heap_blks_read,
             pg_stat_get_blocks_hit(C.oid) AS heap_blks_hit,
-            sum(pg_stat_get_blocks_fetched(I.indexrelid) -
-                    pg_stat_get_blocks_hit(I.indexrelid))::bigint AS idx_blks_read,
-            sum(pg_stat_get_blocks_hit(I.indexrelid))::bigint AS idx_blks_hit,
+            I.idx_blks_read AS idx_blks_read,
+            I.idx_blks_hit AS idx_blks_hit,
             pg_stat_get_blocks_fetched(T.oid) -
                     pg_stat_get_blocks_hit(T.oid) AS toast_blks_read,
             pg_stat_get_blocks_hit(T.oid) AS toast_blks_hit,
-            pg_stat_get_blocks_fetched(X.indexrelid) -
-                    pg_stat_get_blocks_hit(X.indexrelid) AS tidx_blks_read,
-            pg_stat_get_blocks_hit(X.indexrelid) AS tidx_blks_hit
+            X.idx_blks_read AS tidx_blks_read,
+            X.idx_blks_hit AS tidx_blks_hit
     FROM pg_class C LEFT JOIN
-            pg_index I ON C.oid = I.indrelid LEFT JOIN
+            indstat I ON C.oid = I.indrelid LEFT JOIN
             pg_class T ON C.reltoastrelid = T.oid LEFT JOIN
-            pg_index X ON T.oid = X.indrelid
+            indstat X ON T.oid = X.indrelid
             LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
-    WHERE C.relkind IN ('r', 't', 'm')
-    GROUP BY C.oid, N.nspname, C.relname, T.oid, X.indexrelid;
+    WHERE C.relkind IN ('r', 't', 'm');

 CREATE VIEW pg_statio_sys_tables AS
     SELECT * FROM pg_statio_all_tables
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index ac468568a1..df1a3c1297 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2256,24 +2256,30 @@ pg_statio_all_sequences| SELECT c.oid AS relid,
    FROM (pg_class c
      LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
   WHERE (c.relkind = 'S'::"char");
-pg_statio_all_tables| SELECT c.oid AS relid,
+pg_statio_all_tables| WITH indstat AS (
+         SELECT pg_index.indrelid,
+            (sum((pg_stat_get_blocks_fetched(pg_index.indexrelid) -
pg_stat_get_blocks_hit(pg_index.indexrelid))))::bigintAS idx_blks_read, 
+            (sum(pg_stat_get_blocks_hit(pg_index.indexrelid)))::bigint AS idx_blks_hit
+           FROM pg_index
+          GROUP BY pg_index.indrelid
+        )
+ SELECT c.oid AS relid,
     n.nspname AS schemaname,
     c.relname,
     (pg_stat_get_blocks_fetched(c.oid) - pg_stat_get_blocks_hit(c.oid)) AS heap_blks_read,
     pg_stat_get_blocks_hit(c.oid) AS heap_blks_hit,
-    (sum((pg_stat_get_blocks_fetched(i.indexrelid) - pg_stat_get_blocks_hit(i.indexrelid))))::bigint AS idx_blks_read,
-    (sum(pg_stat_get_blocks_hit(i.indexrelid)))::bigint AS idx_blks_hit,
+    i.idx_blks_read,
+    i.idx_blks_hit,
     (pg_stat_get_blocks_fetched(t.oid) - pg_stat_get_blocks_hit(t.oid)) AS toast_blks_read,
     pg_stat_get_blocks_hit(t.oid) AS toast_blks_hit,
-    (pg_stat_get_blocks_fetched(x.indexrelid) - pg_stat_get_blocks_hit(x.indexrelid)) AS tidx_blks_read,
-    pg_stat_get_blocks_hit(x.indexrelid) AS tidx_blks_hit
+    x.idx_blks_read AS tidx_blks_read,
+    x.idx_blks_hit AS tidx_blks_hit
    FROM ((((pg_class c
-     LEFT JOIN pg_index i ON ((c.oid = i.indrelid)))
+     LEFT JOIN indstat i ON ((c.oid = i.indrelid)))
      LEFT JOIN pg_class t ON ((c.reltoastrelid = t.oid)))
-     LEFT JOIN pg_index x ON ((t.oid = x.indrelid)))
+     LEFT JOIN indstat x ON ((t.oid = x.indrelid)))
      LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
-  WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char", 'm'::"char"]))
-  GROUP BY c.oid, n.nspname, c.relname, t.oid, x.indexrelid;
+  WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char", 'm'::"char"]));
 pg_statio_sys_indexes| SELECT pg_statio_all_indexes.relid,
     pg_statio_all_indexes.indexrelid,
     pg_statio_all_indexes.schemaname,

pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: shared-memory based stats collector - v66
Next
From: Tom Lane
Date:
Subject: Re: [PATCH] pg_statio_all_tables: several rows per table due to invalid TOAST index