Re: [HACKERS] Oops, I seem to have changed UNION's behavior - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] Oops, I seem to have changed UNION's behavior
Date
Msg-id 199905091314.JAA21695@candle.pha.pa.us
Whole thread Raw
In response to Oops, I seem to have changed UNION's behavior  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Was there a conclusion on this?


> The equal() updates I installed yesterday (to fix the "don't know
> whether nodes of type 600 are equal" problem) have had an unintended
> side effect.
> 
> Am I right in thinking that UNION (without ALL) is defined to do a
> DISTINCT on its result, so that duplicates are removed even if the
> duplicates both came from the same source table?  That's what 6.4.2
> does, but I do not know if it's strictly kosher according to the SQL
> spec.
> 
> If so, the code is now busted, because with the equal() extension in
> place, cnfify() is able to recognize and remove duplicate select
> clauses.  That is, "SELECT xxx UNION SELECT xxx" will be folded to
> just "SELECT xxx" ... and that doesn't mean the same thing.
> 
> An actual example: given the data
> 
> play=> select a from tt;
> a
> -
> 1
> 1
> 2
> 3
> (4 rows)
> 
> Under 6.4.2 I get:
> 
> play=> select a from tt union select a from tt;
> a
> -
> 1
> 2
> 3
> (3 rows)
> 
> Note lack of duplicate "1".  Under current sources I get:
> 
> ttest=> select a from tt union select a from tt;
> a
> -
> 1
> 1
> 2
> 3
> (4 rows)
> 
> since the query is effectively reduced to just "select a from tt".
> 
> Assuming that 6.4.2 is doing the Right Thing, I see two possible fixes:
> (1) simplify equal() to say that two T_Query nodes are never equal, or
> (2) modify the planner so that the "select distinct" operation is
> inserted explicitly, and will thus happen even if the UNIONed selects
> are collapsed into just one.
> 
> (1) is a trivial fix of course, but it worries me --- maybe someday
> we will need equal() to give an honest answer for Query nodes.
> But I don't have the expertise to apply (2), and it seems like rather
> a lot of work for a boundary case that isn't really interesting in
> practice.
> 
> Comments?  *Is* 6.4.2 behaving according to the SQL spec?
> 
>             regards, tom lane
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] char(n) default '' crashes server
Next
From: Vadim Mikheev
Date:
Subject: Re: [HACKERS] Re: [COMMITTERS] 'pgsql/src/backend/parser gram.c'