pgsql: Fix collation handling for grouping keys in eager aggregation - Mailing list pgsql-committers

From: Richard Guo
Subject: pgsql: Fix collation handling for grouping keys in eager aggregation
Msg-id: E1w9a7i-003Aa2-1e@gemulon.postgresql.org
List: pgsql-committers
Fix collation handling for grouping keys in eager aggregation

When determining if it is safe to use an expression as a grouping key
for partial aggregation, eager aggregation relies on the B-tree
equalimage support function to ensure that equality implies image
equality.

Previously, the code incorrectly passed the default collation of the
expression's data type to the equalimage procedure, rather than the
expression's actual collation.  As a result, if a column used a
non-deterministic collation but the base type's default collation was
deterministic, eager aggregation would incorrectly assume that the
column was safe for byte-level grouping.  This could cause rows to be
prematurely grouped and subsequently discarded by strict join
conditions, resulting in incorrect query results.
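
One way this can go wrong is illustrated by the following minimal Python model. It is not PostgreSQL code. It assumes that a case-insensitive comparison stands in for a non-deterministic collation, that partial aggregation keeps the first value it sees as each group's representative, and that the join above it compares values byte for byte. The table contents and function names are hypothetical.

```python
from collections import OrderedDict

# Hypothetical data: t2 holds (key, amount) rows; t1 holds join keys.
T1 = ["A"]
T2 = [("a", 1), ("A", 2)]


def join_then_aggregate(t1, t2):
    """Reference plan: join on exact (byte-wise) equality, then sum per key."""
    totals = {}
    for k1 in t1:
        for k2, amount in t2:
            if k1 == k2:
                totals[k1] = totals.get(k1, 0) + amount
    return totals


def aggregate_then_join(t1, t2, group_eq_key):
    """Eager plan: partially aggregate t2 first, then join the groups.

    group_eq_key maps a value to its equality class under the grouping
    collation; the first value seen becomes the group's representative.
    """
    groups = OrderedDict()  # equality class -> [representative, partial sum]
    for k2, amount in t2:
        cls = group_eq_key(k2)
        if cls not in groups:
            groups[cls] = [k2, 0]
        groups[cls][1] += amount

    totals = {}
    for k1 in t1:
        for representative, partial in groups.values():
            if k1 == representative:  # the join still compares bytes exactly
                totals[k1] = totals.get(k1, 0) + partial
    return totals


expected = join_then_aggregate(T1, T2)
# Case-insensitive grouping: "a" and "A" collapse into one group.
eager_nondeterministic = aggregate_then_join(T1, T2, str.lower)
# Byte-wise grouping: every distinct image keeps its own group.
eager_deterministic = aggregate_then_join(T1, T2, lambda s: s)
```

Under these assumptions the reference plan returns a total of 2 for "A", the eager plan with case-insensitive grouping returns no rows at all (the merged group is represented by "a", which the join rejects), and the eager plan with byte-wise grouping agrees with the reference plan.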

This patch fixes the issue by passing the expression's actual
collation to the equalimage procedure.
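
The difference between the old and new behavior can be sketched as follows. This is a simplified model, not the code in initsplan.c or relnode.c: the `Collation` and `Expr` classes and all function names are hypothetical, and the sketch assumes for illustration that the equalimage check for a text-like type reduces to asking whether the collation it receives is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collation:
    name: str
    deterministic: bool


@dataclass(frozen=True)
class Expr:
    # Default collation of the expression's data type.
    type_default_collation: Collation
    # Collation actually attached to the expression (e.g. from the column).
    collation: Collation


def equalimage(collation: Collation) -> bool:
    """Hypothetical stand-in for an equalimage support function:
    equality implies image equality only for deterministic collations."""
    return collation.deterministic


def safe_grouping_key_before_fix(expr: Expr) -> bool:
    # Old behavior as described above: consult the type's default collation.
    return equalimage(expr.type_default_collation)


def safe_grouping_key_after_fix(expr: Expr) -> bool:
    # New behavior: consult the expression's actual collation.
    return equalimage(expr.collation)


default_text = Collation("default", deterministic=True)
case_insensitive = Collation("case_insensitive", deterministic=False)

# A text column declared with a non-deterministic collation.
column = Expr(type_default_collation=default_text, collation=case_insensitive)

before = safe_grouping_key_before_fix(column)
after = safe_grouping_key_after_fix(column)
```

In this model `before` is true and `after` is false: only the second check notices that the column's own collation is non-deterministic and so declines to treat it as a safe grouping key.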

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48A53PY1Y4zoj7YhxPww9fO1hfnbdntKfA855zpXfVFRA@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bd94845e8c90475b8149f6a091876a1827b6b305

Modified Files
--------------
src/backend/optimizer/plan/initsplan.c         | 10 ++-
src/backend/optimizer/util/relnode.c           | 10 ++-
src/test/regress/expected/collate.icu.utf8.out | 98 +++++++++++++++++++++-----
src/test/regress/sql/collate.icu.utf8.sql      | 45 ++++++++++++
4 files changed, 143 insertions(+), 20 deletions(-)

