Thread: PostgreSQL 10: Segmentation fault when using GROUPING SETS with allunsortable columns

Hi,

I have found a case which could get segmentation fault when using GROUPING SETS.
It occurs when columns in GROUPING SETS are all unsortable but hashable.

Attached grouping_sets_segv.zip include module to reproduce this problem.

I think it made from below change.

https://www.postgresql.org/docs/current/static/release-10.html#id-1.11.6.8.5.3.6
---
Allow hashed aggregation to be used with grouping sets (Andrew Gierth)
---
https://github.com/postgres/postgres/commit/b5635948ab165b6070e7d05d111f966e07570d81


bt from attached grouping_sets_segv.zip is like below.
---------------
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000797855 in consider_groupingsets_paths (root=0x11fe078, grouped_rel=0x120cfc0, path=0x11ff488,
is_sorted=0'\000',  
    can_hash=1 '\001', target=0x11ffb20, gd=0x11fedb8, agg_costs=0x7fffed813850, dNumGroups=400) at planner.c:4205
4205                            unhashed_rollup = lfirst(l_start);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-8.el7.x86_64libcom_err-1.42.9-7.el7.x86_64 libselinux-2.2.2-6.el7.x86_64
libxml2-2.9.1-6.el7_2.3.x86_64openssl-libs-1.0.1e-42.el7.x86_64 pcre-8.32-14.el7.x86_64 tde_for_pg10-1.2.0-0.el7.x86_64
xz-libs-5.1.2-9alpha.el7.x86_64zlib-1.2.7-17.el7.x86_64 
(gdb) bt
#0  0x0000000000797855 in consider_groupingsets_paths (root=0x11fe078, grouped_rel=0x120cfc0, path=0x11ff488,
is_sorted=0'\000',  
    can_hash=1 '\001', target=0x11ffb20, gd=0x11fedb8, agg_costs=0x7fffed813850, dNumGroups=400) at planner.c:4205
#1  0x0000000000797404 in create_grouping_paths (root=0x11fe078, input_rel=0x11fe290, target=0x11ffb20,
agg_costs=0x7fffed813850, 
    gd=0x11fedb8) at planner.c:4045
#2  0x0000000000793a06 in grouping_planner (root=0x11fe078, inheritance_update=0 '\000', tuple_fraction=0) at
planner.c:1895
#3  0x0000000000791ed8 in subquery_planner (glob=0x11b2a50, parse=0x11b2448, parent_root=0x0, hasRecursion=0 '\000',
tuple_fraction=0)
    at planner.c:862
#4  0x0000000000790d9e in standard_planner (parse=0x11b2448, cursorOptions=256, boundParams=0x0) at planner.c:334
#5  0x0000000000790b30 in planner (parse=0x11b2448, cursorOptions=256, boundParams=0x0) at planner.c:210
#6  0x00000000008756a1 in pg_plan_query (querytree=0x11b2448, cursorOptions=256, boundParams=0x0) at postgres.c:796
#7  0x00000000008757ce in pg_plan_queries (querytrees=0x11febc8, cursorOptions=256, boundParams=0x0) at postgres.c:862
#8  0x0000000000875a92 in exec_simple_query (
    query_string=0x11b1060 "select c1,c2 from testtbl_gouping_sets group by grouping sets(c1,c2);") at postgres.c:1027
#9  0x0000000000879dc2 in PostgresMain (argc=1, argv=0x115d570, dbname=0x115d3d8 "si2", username=0x112dfc0 "postgres")
    at postgres.c:4088
#10 0x00000000007dddb7 in BackendRun (port=0x1155fc0) at postmaster.c:4405
#11 0x00000000007dd53f in BackendStartup (port=0x1155fc0) at postmaster.c:4077
#12 0x00000000007d9a47 in ServerLoop () at postmaster.c:1755
#13 0x00000000007d9086 in PostmasterMain (argc=1, argv=0x112be90) at postmaster.c:1363
#14 0x00000000007184a2 in main (argc=1, argv=0x112be90) at main.c:228
(gdb) p l_start
$1 = (ListCell *) 0x0
(gdb) p gd->rollups
$2 = (List *) 0x0
(gdb)
---------------

Look at related source I found that,
In planner.c:preprocess_grouping_sets, we do not update gd->rollups if all of columns in
GROUPING SETS are unsortable but hashable.

After that in planner.c:consider_groupingsets_paths we used gd->rollups and made the SEGV above.

Not yet fully understand the related commit, but I think it is fine to put
ERRCODE_FEATURE_NOT_SUPPORTED error from preprocess_grouping_sets when all columns in
GROUPING SETS are unsortable. Is that right?

I have tried to write a patch (attached).
Could anyone confirm for me?


---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/

Attachment
Hi,

> I have found a case which could get segmentation fault when using GROUPING
> SETS.
> It occurs when columns in GROUPING SETS are all unsortable but hashable.

I have added this thread to current CF.
please let me know if you need more information.


---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/



Hi,

On 2018-03-17 23:43:22 +0000, Huong Dangminh wrote:
> Hi,
> 
> I have found a case which could get segmentation fault when using GROUPING SETS.
> It occurs when columns in GROUPING SETS are all unsortable but hashable.
> 
> Attached grouping_sets_segv.zip include module to reproduce this problem.
> 
> I think it made from below change.
> 
> https://www.postgresql.org/docs/current/static/release-10.html#id-1.11.6.8.5.3.6
> ---
> Allow hashed aggregation to be used with grouping sets (Andrew Gierth)
> ---
> https://github.com/postgres/postgres/commit/b5635948ab165b6070e7d05d111f966e07570d81

Andrew, any comments?

Regards,

Andres


>>>>> "Huong" == Huong Dangminh <huo-dangminh@ys.jp.nec.com> writes:

 Huong> Hi,

 Huong> I have found a case which could get segmentation fault when
 Huong> using GROUPING SETS. It occurs when columns in GROUPING SETS are
 Huong> all unsortable but hashable.

Yes, mea culpa. will fix.

BTW, you should have reported this as a bug via the pgsql-bugs list or
the bug submission form.

 Huong> Attached grouping_sets_segv.zip include module to reproduce this
 Huong> problem.

A much simpler way to reproduce it is to use one of the builtin types
that isn't sortable (I tend to use xid for testing).

 Huong> Not yet fully understand the related commit, but I think it is
 Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from
 Huong> preprocess_grouping_sets when all columns in GROUPING SETS are
 Huong> unsortable. Is that right?

No, that's definitely wrong. The intent is to be able to generate a
hashed path in this case, it's just the logic that tries to prefer
sorting to hashing when the input arrives already sorted is doing the
wrong thing for unsortable data.

-- 
Andrew (irc:RhodiumToad)


>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Huong> Not yet fully understand the related commit, but I think it is
 Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from
 Huong> preprocess_grouping_sets when all columns in GROUPING SETS are
 Huong> unsortable. Is that right?

 Andrew> No, that's definitely wrong. The intent is to be able to
 Andrew> generate a hashed path in this case, it's just the logic that
 Andrew> tries to prefer sorting to hashing when the input arrives
 Andrew> already sorted is doing the wrong thing for unsortable data.

Attached is what I think is the correct fix, which I'll commit shortly.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9c4a1baf5f..7fabe017c2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4017,7 +4017,28 @@ consider_groupingsets_paths(PlannerInfo *root,
 
         Assert(can_hash);
 
-        if (pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+        /*
+         * If the input is coincidentally sorted usefully (which can happen
+         * even if is_sorted is false, since that only means that our caller
+         * has set up the sorting for us), then save some hashtable space by
+         * making use of that. But we need to watch out for degenerate cases:
+         *
+         * 1) If there are any empty grouping sets, then group_pathkeys might
+         * be NIL if all non-empty grouping sets are unsortable. In this case,
+         * there will be a rollup containing only empty groups, and the
+         * pathkeys_contained_in test is vacuously true; this is ok.
+         *
+         * XXX: the above relies on the fact that group_pathkeys is generated
+         * from the first rollup. If we add the ability to consider multiple
+         * sort orders for grouping input, this assumption might fail.
+         *
+         * 2) If there are no empty sets and only unsortable sets, then the
+         * rollups list will be empty (and thus l_start == NULL), and
+         * group_pathkeys will be NIL; we must ensure that the vacuously-true
+         * pathkeys_contain_in test doesn't cause us to crash.
+         */
+        if (l_start != NULL &&
+            pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
         {
             unhashed_rollup = lfirst_node(RollupData, l_start);
             exclude_groups = unhashed_rollup->numGroups;
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index d21a494a9d..c7deec2ff4 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -1018,6 +1018,18 @@ explain (costs off)
          ->  Values Scan on "*VALUES*"
 (9 rows)
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+ unsortable_col | count 
+----------------+-------
+              1 |     4
+              1 |     4
+              2 |     4
+              2 |     4
+(4 rows)
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index eb68028603..c32d23b8d7 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -292,6 +292,11 @@ explain (costs off)
   select a, b, grouping(a,b), array_agg(v order by v)
     from gstest1 group by cube(a,b);
 
+-- unsortable cases
+select unsortable_col, count(*)
+  from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
+  order by unsortable_col::text;
+
 -- mixed hashable/sortable cases
 select unhashable_col, unsortable_col,
        grouping(unhashable_col, unsortable_col),

Hi,

Thanks for response and the fix patch.

> >>>>> "Huong" == Huong Dangminh <huo-dangminh@ys.jp.nec.com> writes:
>
>  Huong> Hi,
>
>  Huong> I have found a case which could get segmentation fault when  Huong>
> using GROUPING SETS. It occurs when columns in GROUPING SETS are  Huong>
> all unsortable but hashable.
>
> Yes, mea culpa. will fix.
>
> BTW, you should have reported this as a bug via the pgsql-bugs list or the
> bug submission form.

Sorry. I will be careful in next time.

>  Huong> Attached grouping_sets_segv.zip include module to reproduce this
> Huong> problem.
>
> A much simpler way to reproduce it is to use one of the builtin types that
> isn't sortable (I tend to use xid for testing).

Yeah. I did not realize xid type also do.

>  Huong> Not yet fully understand the related commit, but I think it is
> Huong> fine to put ERRCODE_FEATURE_NOT_SUPPORTED error from  Huong>
> preprocess_grouping_sets when all columns in GROUPING SETS are  Huong>
> unsortable. Is that right?
>
> No, that's definitely wrong. The intent is to be able to generate a hashed
> path in this case, it's just the logic that tries to prefer sorting to hashing
> when the input arrives already sorted is doing the wrong thing for unsortable
> data.

Thanks, It should be like that.
And the patch work fine for me.


---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/