Re: Multiple grouping set specs referencing duplicate alias - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Multiple grouping set specs referencing duplicate alias
Date
Msg-id 3642474.1666568956@sss.pgh.pa.us
Whole thread Raw
In response to Multiple grouping set specs referencing duplicate alias  (David Kimura <david.g.kimura@gmail.com>)
List pgsql-hackers
David Kimura <david.g.kimura@gmail.com> writes:
> I think I may have stumbled across a case of wrong results on HEAD (same
> through version 9.6, though interestingly 9.5 produces different results
> altogether).

> test=# SELECT i AS ai1, i AS ai2 FROM generate_series(1,3)i GROUP BY
> ai2, ROLLUP(ai1) ORDER BY ai1, ai2;

Yeah, this is an instance of an issue we've known about for awhile:
when using grouping sets (ROLLUP), the planner fails to distinguish
between "ai1" and "ai1 as possibly nulled by the action of the
grouping node".  This has been discussed at, eg, [1] and [2].
The direction I'd like to take to fix it is to invent explicit
labeling of Vars that have been nulled by some operation such as
outer joins or grouping, and then represent grouping set outputs
as either PlaceHolderVars or Vars tied to a new RTE that represents
the grouping step.  I have been working on a patch that'd do the
first half of that [3], but it's been slow going, because we've
indulged in a lot of semantic squishiness in this area and cleaning
it all up is a large undertaking.

> I tinkered a bit and hacked together an admittedly ugly patch that triggers an
> explicit sort constructed from the parse tree.

I seriously doubt that that'll fix all the issues in this area.
We really really need to understand that a PathKey based on
the scan-level value of a Var is different from a PathKey based
on a post-nulling-step value.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAMbWs48AtQTQGk37MSyDk_EAgDO3Y0iA_LzvuvGQ2uO_Wh2muw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/7dbdcf5c-b5a6-ef89-4958-da212fe10176%40iki.fi
[3] https://www.postgresql.org/message-id/flat/830269.1656693747@sss.pgh.pa.us



pgsql-hackers by date:

Previous
From: Виктория Шепард
Date:
Subject: Re: Re[2]: Possible solution for masking chosen columns when using pg_dump
Next
From: Andres Freund
Date:
Subject: Re: Use simplehash.h instead of dynahash in SMgr