Thread: BUG #18568: BUG: Result wrong when do group by on partition table!

BUG #18568: BUG: Result wrong when do group by on partition table!

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:

First, we create one case-insensitive collation use ICU:
```sql
    CREATE COLLATION case_insensitive (
        provider = icu,
        locale = 'und-u-ks-level2',
        deterministic = false
    );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
    SET enable_partitionwise_aggregate TO true;
    SET enable_partitionwise_join TO true;
    SET max_parallel_workers_per_gather TO 0;
    SET enable_incremental_sort TO off;
    CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");
    CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b',
'c', 'd');
    CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f',
'A');
    CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C',
'D', 'E');
    INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM
generate_series(0, 2999) i;
    ANALYZE pagg_tab;
```

We do group by on the table pagg_tab use `case_insensitive` collation, we
hope group key is case-insensitive.
but we find the execution result is not what we expected.
```shell
    postgres=# SELECT c collate case_insensitive, count(c) FROM pagg_tab GROUP
BY c collate case_insensitive;
     c | count
    ---+-------
     A |   300
     e |   300
     E |   300
     D |   300
     C |   300
     B |   300
     d |   300
     c |   300
     b |   300
     a |   300
    (10 rows)
```

The reason is the function group_by_has_partkey() do not check partkey's
collation, that lead to explain error.
```shell
    postgres=# EXPLAIN SELECT c collate case_insensitive, count(c) FROM
pagg_tab GROUP BY c collate case_insensitive ;
                                          QUERY PLAN
    --------------------------------------------------------------------------------------
     Append  (cost=12.00..60.15 rows=10 width=10)
       ->  HashAggregate  (cost=12.00..12.02 rows=2 width=10)
             Group Key: pagg_tab.c
             ->  Seq Scan on pagg_tab_p2 pagg_tab  (cost=0.00..9.00 rows=600
width=2)
       ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
             Group Key: pagg_tab_1.c
             ->  Seq Scan on pagg_tab_p3 pagg_tab_1  (cost=0.00..18.00
rows=1200 width=2)
       ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
             Group Key: pagg_tab_2.c
             ->  Seq Scan on pagg_tab_p1 pagg_tab_2  (cost=0.00..18.00
rows=1200 width=2)
    (10 rows)
```

So, group_by_has_partkey() need to verify if the partkey's collation matches
the groupkey,
meanwhile, if groupkey is RelabelType node and it's collation equal to
partkey's, it should 
also set variable `found` to true.




PG Bug reporting form <noreply@postgresql.org> 于2024年8月6日周二 15:01写道:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:       

First, we create one case-insensitive collation use ICU:
```sql
        CREATE COLLATION case_insensitive (
                provider = icu,
                locale = 'und-u-ks-level2',
                deterministic = false
        );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
        SET enable_partitionwise_aggregate TO true;
        SET enable_partitionwise_join TO true;
        SET max_parallel_workers_per_gather TO 0;
        SET enable_incremental_sort TO off;
        CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");
        CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b',
'c', 'd');
        CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f',
'A');
        CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C',
'D', 'E');
        INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM
generate_series(0, 2999) i;
        ANALYZE pagg_tab;
```

We do group by on the table pagg_tab use `case_insensitive` collation, we
hope group key is case-insensitive.
but we find the execution result is not what we expected.
```shell
        postgres=# SELECT c collate case_insensitive, count(c) FROM pagg_tab GROUP
BY c collate case_insensitive;
         c | count
        ---+-------
         A |   300
         e |   300
         E |   300
         D |   300
         C |   300
         B |   300
         d |   300
         c |   300
         b |   300
         a |   300
        (10 rows)
```

The reason is the function group_by_has_partkey() do not check partkey's
collation, that lead to explain error.
```shell
        postgres=# EXPLAIN SELECT c collate case_insensitive, count(c) FROM
pagg_tab GROUP BY c collate case_insensitive ;
                                              QUERY PLAN
        --------------------------------------------------------------------------------------
         Append  (cost=12.00..60.15 rows=10 width=10)
           ->  HashAggregate  (cost=12.00..12.02 rows=2 width=10)
                 Group Key: pagg_tab.c
                 ->  Seq Scan on pagg_tab_p2 pagg_tab  (cost=0.00..9.00 rows=600
width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_1.c
                 ->  Seq Scan on pagg_tab_p3 pagg_tab_1  (cost=0.00..18.00
rows=1200 width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_2.c
                 ->  Seq Scan on pagg_tab_p1 pagg_tab_2  (cost=0.00..18.00
rows=1200 width=2)
        (10 rows)
```

So, group_by_has_partkey() need to verify if the partkey's collation matches
the groupkey,
meanwhile, if groupkey is RelabelType node and it's collation equal to
partkey's, it should
also set variable `found` to true.


Yeah, I can reproduce $subject on HEAD.
But  I found this when debug into group_by_has_partkey(), as below:
call nodeToString(groupexprs):
VAR : varcollid 16384
call nodeToString(partexpr):
VAR: varcollid 16384

So the collid of partkey and groupexpr is same, so add check here may not fix this issue.

I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.

And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data. 
Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
failed regress.

--
Tender Wang
Attachment

Re: BUG #18568: BUG: Result wrong when do group by on partition table!

From
Aleksander Alekseev
Date:
Hi,

> [...]
> I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
> set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation
value.
>
> And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data.
> Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test,
andI don't find
 
> failed regress.

```
+SELECT c collate case_insensitive, count(c) FROM
+pagg_tab_col GROUP BY c collate case_insensitive;
+ c | count
+---+-------
+ e |   600
+ D |   600
+ C |   600
+ B |   600
+ A |   600
+(5 rows)
```

Shouldn't we use UPPER(c) and ORDER BY in the test case to make the
results deterministic?

-- 
Best regards,
Aleksander Alekseev





Tender Wang <tndrwang@gmail.com> 于2024年8月6日周二 16:42写道:
Yeah, I can reproduce $subject on HEAD.
But  I found this when debug into group_by_has_partkey(), as below:
call nodeToString(groupexprs):
VAR : varcollid 16384
call nodeToString(partexpr):
VAR: varcollid 16384

So the collid of partkey and groupexpr is same, so add check here may not fix this issue.

I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.

And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data. 
Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
failed regress.

The plan seems not what we want, because I forgot to set enable_partitionwise_aggregate to true. 


--
Tender Wang
Attachment

回复: BUG #18568: BUG: Result wrong when do group by on partition table!

From
"狂奔的蜗牛"
Date:
Partkey's collation stored in RelOptInfo->part_scheme, and I use it to fix the bug.
The attachment is my solution!
 


------------------ 原始邮件 ------------------
发件人: "Tender Wang" <tndrwang@gmail.com>;
发送时间: 2024年8月6日(星期二) 下午4:42
收件人: "狂奔的蜗牛"<1105066510@qq.com>;"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!



PG Bug reporting form <noreply@postgresql.org> 于2024年8月6日周二 15:01写道:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:       

First, we create one case-insensitive collation use ICU:
```sql
        CREATE COLLATION case_insensitive (
                provider = icu,
                locale = 'und-u-ks-level2',
                deterministic = false
        );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
        SET enable_partitionwise_aggregate TO true;
        SET enable_partitionwise_join TO true;
        SET max_parallel_workers_per_gather TO 0;
        SET enable_incremental_sort TO off;
        CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");
        CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b',
'c', 'd');
        CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f',
'A');
        CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C',
'D', 'E');
        INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM
generate_series(0, 2999) i;
        ANALYZE pagg_tab;
```

We do group by on the table pagg_tab use `case_insensitive` collation, we
hope group key is case-insensitive.
but we find the execution result is not what we expected.
```shell
        postgres=# SELECT c collate case_insensitive, count(c) FROM pagg_tab GROUP
BY c collate case_insensitive;
         c | count
        ---+-------
         A |   300
         e |   300
         E |   300
         D |   300
         C |   300
         B |   300
         d |   300
         c |   300
         b |   300
         a |   300
        (10 rows)
```

The reason is the function group_by_has_partkey() do not check partkey's
collation, that lead to explain error.
```shell
        postgres=# EXPLAIN SELECT c collate case_insensitive, count(c) FROM
pagg_tab GROUP BY c collate case_insensitive ;
                                              QUERY PLAN
        --------------------------------------------------------------------------------------
         Append  (cost=12.00..60.15 rows=10 width=10)
           ->  HashAggregate  (cost=12.00..12.02 rows=2 width=10)
                 Group Key: pagg_tab.c
                 ->  Seq Scan on pagg_tab_p2 pagg_tab  (cost=0.00..9.00 rows=600
width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_1.c
                 ->  Seq Scan on pagg_tab_p3 pagg_tab_1  (cost=0.00..18.00
rows=1200 width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_2.c
                 ->  Seq Scan on pagg_tab_p1 pagg_tab_2  (cost=0.00..18.00
rows=1200 width=2)
        (10 rows)
```

So, group_by_has_partkey() need to verify if the partkey's collation matches
the groupkey,
meanwhile, if groupkey is RelabelType node and it's collation equal to
partkey's, it should
also set variable `found` to true.


Yeah, I can reproduce $subject on HEAD.
But  I found this when debug into group_by_has_partkey(), as below:
call nodeToString(groupexprs):
VAR : varcollid 16384
call nodeToString(partexpr):
VAR: varcollid 16384

So the collid of partkey and groupexpr is same, so add check here may not fix this issue.

I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.

And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data. 
Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
failed regress.

--
Tender Wang
Attachment

回复: BUG #18568: BUG: Result wrong when do group by on partition table!

From
"狂奔的蜗牛"
Date:
Hi,
I am working to compatible with Oracle in PG,
in oracle, the result used case-insensitive collation is not unique.
There are two oracle's test case:

first case:
```sql
    CREATE TABLE group_by_ci_test_1 (c varchar2(20));
    INSERT INTO group_by_ci_test_1 VALUES ('a');
    INSERT INTO group_by_ci_test_1 VALUES ('a');
    INSERT INTO group_by_ci_test_1 VALUES ('A');
    SELECT c collate binary_ci, count(c) FROM group_by_ci_test_1 GROUP BY c collate binary_ci;
```
first result
```shell
    CCOLLATEBINARY_CI      COUNT(C)
    -------------------- ----------
    a                             3
```

second case:
```sql
    CREATE TABLE group_by_ci_test_2 (c varchar2(20));
    INSERT INTO group_by_ci_test_2 VALUES ('A');
    INSERT INTO group_by_ci_test_2 VALUES ('a');
    INSERT INTO group_by_ci_test_2 VALUES ('a');
    SELECT c collate binary_ci, count(c) FROM group_by_ci_test_2 GROUP BY c collate binary_ci;
```
second result:
```shell
    CCOLLATEBINARY_CI      COUNT(C)
    -------------------- ----------
    A                             6
```


 


------------------ 原始邮件 ------------------
发件人: "Aleksander Alekseev" <aleksander@timescale.com>;
发送时间: 2024年8月6日(星期二) 下午5:09
收件人: "pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
抄送: "Tender Wang"<tndrwang@gmail.com>;"狂奔的蜗牛"<1105066510@qq.com>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!

Hi,

> [...]
> I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
> set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.
>
> And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data.
> Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
> failed regress.

```
+SELECT c collate case_insensitive, count(c) FROM
+pagg_tab_col GROUP BY c collate case_insensitive;
+ c | count
+---+-------
+ e |   600
+ D |   600
+ C |   600
+ B |   600
+ A |   600
+(5 rows)
```

Shouldn't we use UPPER(c) and ORDER BY in the test case to make the
results deterministic?

--
Best regards,
Aleksander Alekseev


PG Bug reporting form <noreply@postgresql.org> 于2024年8月6日周二 15:01写道:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:       

First, we create one case-insensitive collation use ICU:
```sql
        CREATE COLLATION case_insensitive (
                provider = icu,
                locale = 'und-u-ks-level2',
                deterministic = false
        );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
        SET enable_partitionwise_aggregate TO true;
        SET enable_partitionwise_join TO true;
        SET max_parallel_workers_per_gather TO 0;
        SET enable_incremental_sort TO off;
        CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");

I think above create table again. Should we allow this CREATE TABLE?   The partition key
definition are not same with column definiton.  Is it better to report error for users?

--
Tender Wang


狂奔的蜗牛 <1105066510@qq.com> 于2024年8月6日周二 19:33写道:
Partkey's collation stored in RelOptInfo->part_scheme, and I use it to fix the bug.
The attachment is my solution!

I look through your patch. Some code is not native for PG. For example:

if (nodeTag(groupexpr) == T_RelabelType)

usually, we use IsA() func.

--
Tender Wang
I think what I have done in v2 is not the right way. Because partition prune logic first checks whether it is equal
or not, then it will check the collation match or not. So I should not change the set_baserel_partition_key_exprs() logic.

I look through your patch and make some changes.

1. 
  git am your patch, report warnings, some code format issue.
2.  
 I removed this branch: if (IsA(groupexpr, RelabelType)).
 Because in your added test case, it didn't enter this branch. I didn't figure out what  kind of group by clause is RelableType.
 You can provide a test case based on the v3 patch.

3. Tweek a little about the test case.

By the way, your last two emails only sent to me, please cc the pgsql-bugs.



狂奔的蜗牛 <1105066510@qq.com> 于2024年8月6日周二 19:33写道:
Partkey's collation stored in RelOptInfo->part_scheme, and I use it to fix the bug.
The attachment is my solution!
 


------------------ 原始邮件 ------------------
发件人: "Tender Wang" <tndrwang@gmail.com>;
发送时间: 2024年8月6日(星期二) 下午4:42
收件人: "狂奔的蜗牛"<1105066510@qq.com>;"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!



PG Bug reporting form <noreply@postgresql.org> 于2024年8月6日周二 15:01写道:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:       

First, we create one case-insensitive collation use ICU:
```sql
        CREATE COLLATION case_insensitive (
                provider = icu,
                locale = 'und-u-ks-level2',
                deterministic = false
        );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
        SET enable_partitionwise_aggregate TO true;
        SET enable_partitionwise_join TO true;
        SET max_parallel_workers_per_gather TO 0;
        SET enable_incremental_sort TO off;
        CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");
        CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b',
'c', 'd');
        CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f',
'A');
        CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C',
'D', 'E');
        INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM
generate_series(0, 2999) i;
        ANALYZE pagg_tab;
```

We do group by on the table pagg_tab use `case_insensitive` collation, we
hope group key is case-insensitive.
but we find the execution result is not what we expected.
```shell
        postgres=# SELECT c collate case_insensitive, count(c) FROM pagg_tab GROUP
BY c collate case_insensitive;
         c | count
        ---+-------
         A |   300
         e |   300
         E |   300
         D |   300
         C |   300
         B |   300
         d |   300
         c |   300
         b |   300
         a |   300
        (10 rows)
```

The reason is the function group_by_has_partkey() do not check partkey's
collation, that lead to explain error.
```shell
        postgres=# EXPLAIN SELECT c collate case_insensitive, count(c) FROM
pagg_tab GROUP BY c collate case_insensitive ;
                                              QUERY PLAN
        --------------------------------------------------------------------------------------
         Append  (cost=12.00..60.15 rows=10 width=10)
           ->  HashAggregate  (cost=12.00..12.02 rows=2 width=10)
                 Group Key: pagg_tab.c
                 ->  Seq Scan on pagg_tab_p2 pagg_tab  (cost=0.00..9.00 rows=600
width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_1.c
                 ->  Seq Scan on pagg_tab_p3 pagg_tab_1  (cost=0.00..18.00
rows=1200 width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_2.c
                 ->  Seq Scan on pagg_tab_p1 pagg_tab_2  (cost=0.00..18.00
rows=1200 width=2)
        (10 rows)
```

So, group_by_has_partkey() need to verify if the partkey's collation matches
the groupkey,
meanwhile, if groupkey is RelabelType node and it's collation equal to
partkey's, it should
also set variable `found` to true.


Yeah, I can reproduce $subject on HEAD.
But  I found this when debug into group_by_has_partkey(), as below:
call nodeToString(groupexprs):
VAR : varcollid 16384
call nodeToString(partexpr):
VAR: varcollid 16384

So the collid of partkey and groupexpr is same, so add check here may not fix this issue.

I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.

And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data. 
Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
failed regress.

--
Tender Wang


--
Tender Wang
Attachment

回复: BUG #18568: BUG: Result wrong when do group by on partition table!

From
"狂奔的蜗牛"
Date:
this case will enter `if (IsA(groupexpr, RelabelType))` branch. 

We set "C" as groupkey's collation, and it's not equal to column's.
In transformGroupClause(), syntax `COLLATE xxx` will create CollateExpr node, and CollateExpr's arg is Var node.
And then, planner will call eval_const_expressions_mutator() to transform CollateExpr to RelableType if CollateExpr->collOid not equal to Collate->arg's collation.


About V3 patch, PartCollMatchesExprColl() may be not suitable, because collation of partkey must be equal to groupkey's, even though they are all InvalidOid.


```sql
CREATE COLLATION case_insensitive (
provider = icu,
locale = 'und-u-ks-level2',
deterministic = false
);

SET enable_partitionwise_aggregate TO true;
SET enable_partitionwise_join TO true;
SET max_parallel_workers_per_gather TO 0;
SET enable_incremental_sort TO off;
drop table pagg_tab;

CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c collate "C");
CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b', 'c', 'd');
CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f', 'A');
CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C', 'D', 'E');

INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM generate_series(0, 2999) i;

ANALYZE pagg_tab;

EXPLAIN SELECT c collate "C", count(c) FROM pagg_tab GROUP BY c collate "C" ;
SELECT c collate "C", count(c) FROM pagg_tab GROUP BY c collate "C";
```





 


------------------ 原始邮件 ------------------
发件人: "Tender Wang" <tndrwang@gmail.com>;
发送时间: 2024年8月7日(星期三) 中午12:01
收件人: "狂奔的蜗牛"<1105066510@qq.com>;
抄送: "pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!

I think what I have done in v2 is not the right way. Because partition prune logic first checks whether it is equal
or not, then it will check the collation match or not. So I should not change the set_baserel_partition_key_exprs() logic.

I look through your patch and make some changes.

1. 
  git am your patch, report warnings, some code format issue.
2.  
 I removed this branch: if (IsA(groupexpr, RelabelType)).
 Because in your added test case, it didn't enter this branch. I didn't figure out what  kind of group by clause is RelableType.
 You can provide a test case based on the v3 patch.

3. Tweek a little about the test case.

By the way, your last two emails only sent to me, please cc the pgsql-bugs.



狂奔的蜗牛 <1105066510@qq.com> 于2024年8月6日周二 19:33写道:
Partkey's collation stored in RelOptInfo->part_scheme, and I use it to fix the bug.
The attachment is my solution!
 


------------------ 原始邮件 ------------------
发件人: "Tender Wang" <tndrwang@gmail.com>;
发送时间: 2024年8月6日(星期二) 下午4:42
收件人: "狂奔的蜗牛"<1105066510@qq.com>;"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!



PG Bug reporting form <noreply@postgresql.org> 于2024年8月6日周二 15:01写道:
The following bug has been logged on the website:

Bug reference:      18568
Logged by:          Webbo Han
Email address:      1105066510@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7.6
Description:       

First, we create one case-insensitive collation use ICU:
```sql
        CREATE COLLATION case_insensitive (
                provider = icu,
                locale = 'und-u-ks-level2',
                deterministic = false
        );
```

Then, we create the partition table, meanwhile we set the collation of
column c to `case_insensitive`,
and set partkey's collation to 'C'.
```sql
        SET enable_partitionwise_aggregate TO true;
        SET enable_partitionwise_join TO true;
        SET max_parallel_workers_per_gather TO 0;
        SET enable_incremental_sort TO off;
        CREATE TABLE pagg_tab (c text collate case_insensitive) PARTITION BY LIST(c
collate "C");
        CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('a', 'b',
'c', 'd');
        CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('e', 'f',
'A');
        CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('B', 'C',
'D', 'E');
        INSERT INTO pagg_tab SELECT substr('abcdeABCDE', (i % 10) +1 , 1) FROM
generate_series(0, 2999) i;
        ANALYZE pagg_tab;
```

We do group by on the table pagg_tab use `case_insensitive` collation, we
hope group key is case-insensitive.
but we find the execution result is not what we expected.
```shell
        postgres=# SELECT c collate case_insensitive, count(c) FROM pagg_tab GROUP
BY c collate case_insensitive;
         c | count
        ---+-------
         A |   300
         e |   300
         E |   300
         D |   300
         C |   300
         B |   300
         d |   300
         c |   300
         b |   300
         a |   300
        (10 rows)
```

The reason is the function group_by_has_partkey() do not check partkey's
collation, that lead to explain error.
```shell
        postgres=# EXPLAIN SELECT c collate case_insensitive, count(c) FROM
pagg_tab GROUP BY c collate case_insensitive ;
                                              QUERY PLAN
        --------------------------------------------------------------------------------------
         Append  (cost=12.00..60.15 rows=10 width=10)
           ->  HashAggregate  (cost=12.00..12.02 rows=2 width=10)
                 Group Key: pagg_tab.c
                 ->  Seq Scan on pagg_tab_p2 pagg_tab  (cost=0.00..9.00 rows=600
width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_1.c
                 ->  Seq Scan on pagg_tab_p3 pagg_tab_1  (cost=0.00..18.00
rows=1200 width=2)
           ->  HashAggregate  (cost=24.00..24.04 rows=4 width=10)
                 Group Key: pagg_tab_2.c
                 ->  Seq Scan on pagg_tab_p1 pagg_tab_2  (cost=0.00..18.00
rows=1200 width=2)
        (10 rows)
```

So, group_by_has_partkey() need to verify if the partkey's collation matches
the groupkey,
meanwhile, if groupkey is RelabelType node and it's collation equal to
partkey's, it should
also set variable `found` to true.


Yeah, I can reproduce $subject on HEAD.
But  I found this when debug into group_by_has_partkey(), as below:
call nodeToString(groupexprs):
VAR : varcollid 16384
call nodeToString(partexpr):
VAR: varcollid 16384

So the collid of partkey and groupexpr is same, so add check here may not fix this issue.

I continue to find out why the collation id of partkey is 16384(e.g. case_insensitive). The partkey expr info is
set in set_baserel_partition_key_exprs(), which it uses partkey->parttypcoll[cnt] value not  partkey->partcollation value.

And partkey->parttypcoll[cnt] is assigned from pg_attribute , which is the column c meta data. 
Should we use partkey->partcollation value?  I try to fix that in the attached patch. I add your case in the test, and I don't find
failed regress.

--
Tender Wang


--
Tender Wang


狂奔的蜗牛 <1105066510@qq.com> 于2024年8月7日周三 13:35写道:
this case will enter `if (IsA(groupexpr, RelabelType))` branch. 

We set "C" as groupkey's collation, and it's not equal to column's.
In transformGroupClause(), syntax `COLLATE xxx` will create CollateExpr node, and CollateExpr's arg is Var node.
And then, planner will call eval_const_expressions_mutator() to transform CollateExpr to RelableType if CollateExpr->collOid not equal to Collate->arg's collation.

Yech. Thanks for the explanation.
Because exprCollation() return the resultcollid if the nodetag is Relabletype, the expr() && PartKeyCollMatchesExprColl() can handle this situation.
So we don't need "if (IsA(groupexpr, RelableTyple) " this branch.
 


About V3 patch, PartCollMatchesExprColl() may be not suitable, because collation of partkey must be equal to groupkey's, even though they are all InvalidOid.

I'm not sure about what you said.  I uesd it because partition prune do this way. I keep it temporarily. 

The test case in v3 patch only has EXPLAIN statement. I add SELECT statement and make the result more stable according to Aleksander advices.
--
Tender Wang
Attachment

回复: BUG #18568: BUG: Result wrong when do group by on partition table!

From
"狂奔的蜗牛"
Date:
PartCollMatchesExprColl()  is not strict enough.
If partcoll == InvalidOid and groupcoll != InvalidOid, PartCollMatchesExprColl() return true always.
Just determine whether groupcoll is equal to partcoll, like this "partcoll == groupcoll".


We cannot delete "if (IsA(groupexpr, RelableTyple) " branch, 
becasuse if groupexpr is RelabelType and  partcoll equal to groupcoll,  the "equal(groupexpr, partexpr) && PartKeyCollMatchesExprColl(partcoll, groupexpr_coll)" condition return false!!!
This is not what we expect.


We could rename "groupexpr_coll" to groupcoll, it looks more elegant.


 


------------------ 原始邮件 ------------------
发件人: "Tender Wang" <tndrwang@gmail.com>;
发送时间: 2024年8月7日(星期三) 下午2:57
收件人: "狂奔的蜗牛"<1105066510@qq.com>;
抄送: "pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;"aleksander"<aleksander@timescale.com>;
主题: Re: BUG #18568: BUG: Result wrong when do group by on partition table!



狂奔的蜗牛 <1105066510@qq.com> 于2024年8月7日周三 13:35写道:
this case will enter `if (IsA(groupexpr, RelabelType))` branch. 

We set "C" as groupkey's collation, and it's not equal to column's.
In transformGroupClause(), syntax `COLLATE xxx` will create CollateExpr node, and CollateExpr's arg is Var node.
And then, planner will call eval_const_expressions_mutator() to transform CollateExpr to RelableType if CollateExpr->collOid not equal to Collate->arg's collation.

Yech. Thanks for the explanation.
Because exprCollation() return the resultcollid if the nodetag is Relabletype, the expr() && PartKeyCollMatchesExprColl() can handle this situation.
So we don't need "if (IsA(groupexpr, RelableTyple) " this branch.
 


About V3 patch, PartCollMatchesExprColl() may be not suitable, because collation of partkey must be equal to groupkey's, even though they are all InvalidOid.

I'm not sure about what you said.  I uesd it because partition prune do this way. I keep it temporarily. 

The test case in v3 patch only has EXPLAIN statement. I add SELECT statement and make the result more stable according to Aleksander advices.
--
Tender Wang


狂奔的蜗牛 <1105066510@qq.com> 于2024年8月7日周三 16:06写道:
PartCollMatchesExprColl()  is not strict enough.
If partcoll == InvalidOid and groupcoll != InvalidOid, PartCollMatchesExprColl() return true always.
Just determine whether groupcoll is equal to partcoll, like this "partcoll == groupcoll".

I understand what you said. Actually, I keep it just curious when partcoll is InvalidOid.  Why it works for partition prune.
Is it not same between check partkey is equal to groupexpr and check partkey is equal to qualclause?


We cannot delete "if (IsA(groupexpr, RelableTyple) " branch, 
becasuse if groupexpr is RelabelType and  partcoll equal to groupcoll,  the "equal(groupexpr, partexpr) && PartKeyCollMatchesExprColl(partcoll, groupexpr_coll)" condition return false!!!
This is not what we expect.
 
" if groupexpr is RelabelType and  partcoll equal to groupcoll ", according to original logic, will return false in this situation. 
Now you think we can support above situation. Am I understand correctly? 
We're better to add more test case to cover the code if we're going to support this. The test cases now seem not going
into the RelableTyple branch.


We could rename "groupexpr_coll" to groupcoll, it looks more elegant.

No objection. You can continue to support RelableType situation and add more test cases based on V4.

--
Tender Wang