Fix HAVING-to-WHERE pushdown with nondeterministic collations - Mailing list pgsql-hackers

From Richard Guo
Subject Fix HAVING-to-WHERE pushdown with nondeterministic collations
Date
Msg-id CAMbWs48Dn2wW6XM94GZsoyMiH42=KgMo+WcobPKuWvGYnWaPOQ@mail.gmail.com
Whole thread Raw
Responses Re: Fix HAVING-to-WHERE pushdown with nondeterministic collations
List pgsql-hackers
As briefly discussed on Discord, when a GROUP BY clause uses a
nondeterministic collation, the planner's optimization of moving
HAVING clauses to WHERE can produce incorrect results if the HAVING
clause applies a stricter collation.

CREATE TABLE t (x TEXT COLLATE case_insensitive);
INSERT INTO t VALUES ('a'), ('A');

SELECT x, count(*) FROM t GROUP BY x HAVING x = 'a' COLLATE "C";

This returns count=1, but should return count=2.

The attached draft patch fixes this for HEAD by leveraging GROUP Vars
(Vars referencing RTE_GROUP) to detect collation conflicts on a
per-clause basis, so only unsafe clauses are kept in HAVING while safe
ones are still pushed.  Please see the commit message for more
details.

For versions prior to v18, we do not have GROUP Vars.  I wonder if we
can take a conservative approach: skipping the HAVING-to-WHERE
pushdown optimization entirely if any GROUP BY expression uses a
nondeterministic collation.

Thoughts and reviews are welcome.

- Richard

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Skipping schema changes in publication