Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From Maxim Orlov
Subject Re: POC: GROUP BY optimization
Msg-id CACG=ezaYM1tr6Lmp8PRH1aeZq=rBKXEoTwgzMcLaD5MPhfW0Lg@mail.gmail.com
In response to Re: POC: GROUP BY optimization  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
Hi!

Here is another issue with the test introduced in 0452b461bc405: I think it may be unstable in some circumstances,
for example when a different BLCKSZ is used. To demonstrate, I've made a small change to the number of tuples inserted:

$ git diff
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index d6ed5d0eff..414078d4ec 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -1187,7 +1187,7 @@ CREATE TABLE btg AS SELECT
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,11900) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;

With this change the bulk extension kicks in, so we get zeroed pages in the relation. The plan also changes,
switching from an index scan to a seq scan:
@@ -2734,7 +2734,7 @@
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,11900) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;
 -- GROUP BY optimization by reorder columns by frequency
@@ -2760,62 +2760,57 @@

 -- Engage incremental sort
 explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
-                   QUERY PLAN
--------------------------------------------------
+          QUERY PLAN
+------------------------------
  Group
    Group Key: x, y, z, w
-   ->  Incremental Sort
+   ->  Sort
          Sort Key: x, y, z, w
-         Presorted Key: x, y
-         ->  Index Scan using btg_x_y_idx on btg
-(6 rows)
+         ->  Seq Scan on btg
+(5 rows)
... and so on.

So, my proposal is simple. I think we need not just "ANALYZE btg" but
"VACUUM ANALYZE btg", to get rid of the zeroed pages in this particular
case. PFA the corresponding patch.
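
For illustration only, here is a minimal sketch of how the idea could be checked by hand. This is not the attached patch itself. It assumes the btg table and the btg_x_y_idx index from the test have already been created with the modified row count, and the pg_class query is just one possible way to see what size the planner assumes for the relation:

```sql
-- Assumption: btg and btg_x_y_idx already exist, created as in the
-- regression test above with generate_series(1,11900).

-- Proposed replacement for the plain "ANALYZE btg;" in the test.
VACUUM ANALYZE btg;

-- Optional check (not part of the test): the page and tuple counts
-- recorded for the relation.
SELECT relpages, reltuples FROM pg_class WHERE relname = 'btg';

-- Re-run the query from the test and compare the plan with the
-- expected output.
explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
```

Whether the plan returns to the Incremental Sort over the index scan depends on the build settings, so the last statement is meant for comparison and guarantees nothing.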

--
Best regards,
Maxim Orlov.