Re: BUG #18568: BUG: Result wrong when do group by on partition table! - Mailing list pgsql-bugs

From Amit Langote
Subject Re: BUG #18568: BUG: Result wrong when do group by on partition table!
Date
Msg-id CA+HiwqE0bjnhuGm43qDFE0P_u9rf-cV99fUHcv66jPYVdsC73Q@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18568: BUG: Result wrong when do group by on partition table!  (Tender Wang <tndrwang@gmail.com>)
Responses Re: BUG #18568: BUG: Result wrong when do group by on partition table!
List pgsql-bugs
Hi,

On Tue, Oct 22, 2024 at 8:14 PM Tender Wang <tndrwang@gmail.com> wrote:
> Amit Langote <amitlangote09@gmail.com> 于2024年10月22日周二 17:25写道:
>> On Tue, Oct 22, 2024 at 5:30 PM Tender Wang <tndrwang@gmail.com> wrote:
>> > Amit Langote <amitlangote09@gmail.com> 于2024年10月22日周二 15:33写道:
>> >> Not really.  As the documentation says, collation can be specified per
>> >> column or per operation:
>> >>
>> >> https://www.postgresql.org/docs/current/collation.html
>> >>
>> >> In this case, the operation is partitioning.  When you specify the
>> >> COLLATE clause for a partition key, it means that the partitioning
>> >> logic, such as partition tuple routing, will use that collation
>> >> instead of the column-specified or the column type's collation.
>> >
>> >
>> > Since you said partition key had its own collation, and but we used column type's collation in
>> >  set_baserel_partition_key_exprs() as below:
>> >
>> > partexpr = (Expr *) makeVar(varno, attno,
>> > partkey->parttypid[cnt],
>> > partkey->parttypmod[cnt],
>> > partkey->parttypcoll[cnt], 0);
>> >
>> > I think why not we directly use the partition key collation(e.g.  partcollation).
>>
>> That's a good question but I don't immediately know the answer.
>>
>> It seems like it has been like this since the beginning or since the
>> commit that added the RelOptInfo.partexprs field (9140cf8269).
>
> When I looked at the commit(9140cf8269),  I found that in 9140cf8269, the PartitionSchemeData struct had
> a filed:  Oid  *parttypcoll; whose value was equal to column type's collation.  But now HEAD this field has changed
to
> Oid   *partcollation; whose value is equal to partition key collation. This commit 2af28e6 made above change.
>
> commit 2af28e603319224e87fd35ab62f36ef6de45eaac
> Author: Robert Haas <rhaas@postgresql.org>
> Date:   Wed Feb 28 12:16:09 2018 -0500
>
>     For partitionwise join, match on partcollation, not parttypcoll.
>
>     The previous code considered two tables to have the partition scheme
>     if the underlying columns had the same collation, but what we
>     actually need to compare is not the collations associated with the
>     column but the collation used for partitioning.  Fix that.
>
>     Robert Haas and Amit Langote
>
>     Discussion: http://postgr.es/m/0f95f924-0efa-4cf5-eb5f-9a3d1bc3c33d@lab.ntt.co.jp

Good find.

> Now I suspect the commit 2af28e6 forgot to fix RelOptInfo's partexprs collation.
> If we change the partexprs collation to Partition Key collation, as I said in [1], this bug will not happen.
> But I'm not sure this fix will affect other codes that use RelOptInfo's partexprs.

Yeah, I suspect we could maybe just put the collation from
partcollation into varcollid of the partition key Var, but what about
partition keys that are not simple column references?  Would they
carry the correct collation such that exprCollation() returns the
right collation OID?

Also, we'll need to be sure that the semantic of anything that's using
partexprs is not broken because of this.

--
Thanks, Amit Langote



pgsql-bugs by date:

Previous
From: Tender Wang
Date:
Subject: Re: BUG #18568: BUG: Result wrong when do group by on partition table!
Next
From: Amit Langote
Date:
Subject: Re: BUG #18568: BUG: Result wrong when do group by on partition table!