I noticed a couple of days ago that the current sources coredump if you
try to use GROUP BY on the first sub-SELECT of a UNION, eg
create table category (name text, image text, url text, parent oid);
select name from category group by name
union select image from category;
=> kerboom
(It works if you put a GROUP BY on the second select, though. 6.4.2
didn't coredump in a cursory test, but it didn't produce the right
answers either.)
A check of the mail archives shows that Bill Carlson reported this
bug to pgsql-sql on 22 April, but I'd not picked up on it at the time.
The cause is that plan_union_queries() is failing to clear out the
groupclause before it returns control to union_planner, so the GROUP BY
gets applied twice, once to the subselect and once (incorrectly) to the
result of the UNION. (This wouldn't have happened with a less klugy
representation for UNION parsetrees, but I digress.) You can see this
happening if you look at the EXPLAIN output; the coredump only happens
at execution time.
This patch fixes it:
*** backend/optimizer/prep/prepunion.c.orig Sun Jun 6 13:38:11 1999
--- backend/optimizer/prep/prepunion.c Wed Jun 9 20:38:48 1999
***************
*** 192,197 ****
--- 192,204 ---- /* needed so we don't take the flag from the first query */ parse->uniqueFlag = NULL;
+ /* Make sure we don't try to apply the first query's grouping stuff
+ * to the Append node, either. Basically we don't want union_planner
+ * to do anything when we return control, except add the top sort/unique
+ * nodes for DISTINCT processing if this wasn't UNION ALL, or the top
+ * sort node if it was UNION ALL with a user-provided sort clause.
+ */
+ parse->groupClause = NULL; parse->havingQual = NULL; parse->hasAggs = false;
I feel fairly confident that this is a low-risk patch; certainly
it cannot break anything that doesn't involve GROUP BY and UNION.
Is there any objection to my committing it at this late hour?
regards, tom lane