UNION + GROUP BY bug located - Mailing list pgsql-hackers

From Tom Lane
Subject UNION + GROUP BY bug located
Date
Msg-id 8363.928976202@sss.pgh.pa.us
Whole thread Raw
Responses Re: [HACKERS] UNION + GROUP BY bug located
List pgsql-hackers
I noticed a couple of days ago that the current sources coredump if you
try to use GROUP BY on the first sub-SELECT of a UNION, eg

create table category (name text, image text, url text, parent oid);

select name from category group by name
union select image from category;

=> kerboom

(It works if you put a GROUP BY on the second select, though.  6.4.2
didn't coredump in a cursory test, but it didn't produce the right
answers either.)

A check of the mail archives shows that Bill Carlson reported this
bug to pgsql-sql on 22 April, but I'd not picked up on it at the time.

The cause is that plan_union_queries() is failing to clear out the
groupclause before it returns control to union_planner, so the GROUP BY
gets applied twice, once to the subselect and once (incorrectly) to the
result of the UNION.  (This wouldn't have happened with a less klugy
representation for UNION parsetrees, but I digress.)  You can see this
happening if you look at the EXPLAIN output; the coredump only happens
at execution time.

This patch fixes it:


*** backend/optimizer/prep/prepunion.c.orig    Sun Jun  6 13:38:11 1999
--- backend/optimizer/prep/prepunion.c    Wed Jun  9 20:38:48 1999
***************
*** 192,197 ****
--- 192,204 ----         /* needed so we don't take the flag from the first query */         parse->uniqueFlag = NULL;

+     /* Make sure we don't try to apply the first query's grouping stuff
+      * to the Append node, either.  Basically we don't want union_planner
+      * to do anything when we return control, except add the top sort/unique
+      * nodes for DISTINCT processing if this wasn't UNION ALL, or the top
+      * sort node if it was UNION ALL with a user-provided sort clause.
+      */
+     parse->groupClause = NULL;     parse->havingQual = NULL;     parse->hasAggs = false; 


I feel fairly confident that this is a low-risk patch; certainly
it cannot break anything that doesn't involve GROUP BY and UNION.
Is there any objection to my committing it at this late hour?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Postgres 6.5 beta2 and beta3 problem
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] UNION + GROUP BY bug located