Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns
Date
Msg-id 87vadrf4o4.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Huong> Not yet fully understand the related commit, but I think it is
 Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from
 Huong> preprocess_grouping_sets when all columns in GROUPING SETS are
 Huong> unsortable. Is that right?

 Andrew> No, that's definitely wrong. The intent is to be able to
 Andrew> generate a hashed path in this case, it's just the logic that
 Andrew> tries to prefer sorting to hashing when the input arrives
 Andrew> already sorted is doing the wrong thing for unsortable data.

Attached is what I think is the correct fix, which I'll commit shortly.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9c4a1baf5f..7fabe017c2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4017,7 +4017,28 @@ consider_groupingsets_paths(PlannerInfo *root,
 
         Assert(can_hash);
 
-        if (pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+        /*
+         * If the input is coincidentally sorted usefully (which can happen
+         * even if is_sorted is false, since that only means that our caller
+         * has set up the sorting for us), then save some hashtable space by
+         * making use of that. But we need to watch out for degenerate cases:
+         *
+         * 1) If there are any empty grouping sets, then group_pathkeys might
+         * be NIL if all non-empty grouping sets are unsortable. In this case,
+         * there will be a rollup containing only empty groups, and the
+         * pathkeys_contained_in test is vacuously true; this is ok.
+         *
+         * XXX: the above relies on the fact that group_pathkeys is generated
+         * from the first rollup. If we add the ability to consider multiple
+         * sort orders for grouping input, this assumption might fail.
+         *
+         * 2) If there are no empty sets and only unsortable sets, then the
+         * rollups list will be empty (and thus l_start == NULL), and
+         * group_pathkeys will be NIL; we must ensure that the vacuously-true
+         * pathkeys_contain_in test doesn't cause us to crash.
+         */
+        if (l_start != NULL &&
+            pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
         {
             unhashed_rollup = lfirst_node(RollupData, l_start);
             exclude_groups = unhashed_rollup->numGroups;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index d21a494a9d..c7deec2ff4 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -1018,6 +1018,18 @@ explain (costs off)
          ->  Values Scan on "*VALUES*"
 (9 rows)
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+ unsortable_col | count 
+----------------+-------
+              1 |     4
+              1 |     4
+              2 |     4
+              2 |     4
+(4 rows)
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index eb68028603..c32d23b8d7 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -292,6 +292,11 @@ explain (costs off)
   select a, b, grouping(a,b), array_agg(v order by v)
     from gstest1 group by cube(a,b);
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),

pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Online enabling of checksums
Next
From: Oleksandr Shulgin
Date:
Subject: Re: [HACKERS] [PATCH] Generic type subscripting