Thread: 8.4b1: Query returning results in different order to 8.3

8.4b1: Query returning results in different order to 8.3

From: Ian Barwick

Hi

The following query is distilled down from a real production query for
ease of reproduction:

    SELECT 1 AS id, 2 AS tmpl_id
     WHERE FALSE  -- (in production, only rarely will this clause return a row)
        UNION
    SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
         UNION
     SELECT 3 AS id, 101 AS tmpl_id
      ORDER BY 1
     ) tmpl
     WHERE tmpl_id IS NOT NULL

In 8.3 and earlier it consistently produces this result:

 id | tmpl_id
----+---------
  2 |      96
  3 |     101

This is the result I am expecting this query to produce ("expecting"
as in this is what I intend it to do, and it seems to work as
intended).

In 8.4beta1 the result is unpredictable; sometimes the row with id 3
is returned first, e.g.:

 id | tmpl_id
----+---------
  3 |     101
  2 |      96

Changing the values selected for tmpl_id produces different ordering;
on both systems tested [*], changing 101 to 102 on the third select
produces the expected ordering consistently.

    SELECT 1 AS id, 2 AS tmpl_id
     WHERE FALSE
        UNION
    SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
         UNION
     SELECT 3 AS id, 102 AS tmpl_id
      ORDER BY 1
     ) tmpl
     WHERE tmpl_id IS NOT NULL

 id | tmpl_id
----+---------
  2 |      96
  3 |     102

Using other values produces varying results, I can't see a pattern.

Note that removing the first SELECT completely produces the expected
ordering consistently:

    SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
         UNION
     SELECT 3 AS id, 101 AS tmpl_id
      ORDER BY 1
     ) tmpl
     WHERE tmpl_id IS NOT NULL

 id | tmpl_id
----+---------
  2 |      96
  3 |     101


Workaround / solution to produce consistent results is to move the
"ORDER BY 1" to the main SELECT clause:

    SELECT 1 AS id, 2 AS tmpl_id
     WHERE FALSE
        UNION
    SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
         UNION
     SELECT 3 AS id, 101 AS tmpl_id
     ) tmpl
     WHERE tmpl_id IS NOT NULL
  ORDER BY 1

(The full version of this query in its original form is in production
on 8.2 and 8.3 versions and I am confident it has always produced
consistent results. It is used to select the appropriate template for
pages on a website and someone would have noticed long before now if
it was serving up the wrong template).

Note I'm not sure whether this is a bug, or whether the assumption
made for the original query (that the row order returned by the
subquery would be carried over to the main part of the query) is
incorrect but just happened to work as expected pre-8.4.

[*] tested on:
 - Ubuntu 8.10 running on VIA C7-M
 - OS X 10.5 running on Intel Core Duo


Regards

Ian Barwick


Re: 8.4b1: Query returning results in different order to 8.3

From: Heikki Linnakangas

Ian Barwick wrote:
> Note I'm not sure whether this is a bug, or whether the assumption
> made for the original query (that the row order returned by the
> subquery would be carried over to the main part of the query) is
> incorrect but just happened to work as expected pre-8.4.

The latter. Without an ORDER BY (at the outermost level), the order of 
the result is not well defined. Before 8.4, UNION was always performed 
by a Sort + Unique, which explains why the output is always sorted in 
previous releases. 8.4 knows how to perform it with a Hash Aggregate, 
which doesn't yield sorted output. You can look at the EXPLAIN output to 
see the difference.
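
A minimal sketch of that comparison, assuming an 8.4 server and a psql session. Plan shapes and costs depend on the build and settings, so no plan output is shown here. enable_hashagg is a standard planner setting; that switching it off steers this particular UNION back to a sort-based plan is an assumption to verify on your own installation, not something stated in this thread.

```sql
-- Show the plan chosen for the query from the original report.
EXPLAIN
SELECT 1 AS id, 2 AS tmpl_id
 WHERE FALSE
    UNION
SELECT * FROM
(SELECT 2 AS id, 96 AS tmpl_id
     UNION
 SELECT 3 AS id, 101 AS tmpl_id
  ORDER BY 1
 ) tmpl
 WHERE tmpl_id IS NOT NULL;

-- Assumption: with hash aggregation disabled for this session, the
-- planner is expected to fall back to a sort-based plan for the UNION.
SET enable_hashagg = off;

EXPLAIN
SELECT 1 AS id, 2 AS tmpl_id
 WHERE FALSE
    UNION
SELECT * FROM
(SELECT 2 AS id, 96 AS tmpl_id
     UNION
 SELECT 3 AS id, 101 AS tmpl_id
  ORDER BY 1
 ) tmpl
 WHERE tmpl_id IS NOT NULL;

-- Restore the default so later queries are planned normally.
RESET enable_hashagg;
```

Comparing the two plans should show which node performs the duplicate elimination in each case. Either way, only an outermost ORDER BY makes the row order something an application can depend on.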

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: 8.4b1: Query returning results in different order to 8.3

From: Andrew Dunstan

Ian Barwick wrote:
>
>
> Workaround / solution to produce consistent results is to move the
> "ORDER BY 1" to the main SELECT clause:
>
>     SELECT 1 AS id , 2 AS tmpl_id
>      WHERE FALSE
>         UNION
>     SELECT * FROM
>     (SELECT 2 AS id, 96 AS tmpl_id
>          UNION
>      SELECT 3 AS id, 101 AS tmpl_id
>      ) tmpl
>      WHERE tmpl_id IS NOT NULL
>   ORDER BY 1
>
> (The full version of this query in its original form is in production
> on 8.2 and 8.3 versions and I am confident it has always produced
> consistent results. It is used to select the appropriate template for
> pages on a website and someone would have noticed long before now if
> it was serving up the wrong template).
>
> Note I'm not sure whether this is a bug, or whether the assumption
> made for the original query (that the row order returned by the
> subquery would be carried over to the main part of the query) is
> incorrect but just happened to work as expected pre-8.4.
>
>
>   

I don't believe it's a bug - the assumption is one you're not entitled 
to make. Your "workaround" is the correct solution, ISTM.

cheers

andrew


Re: 8.4b1: Query returning results in different order to 8.3

From: Tom Lane

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Ian Barwick wrote:
>> Note I'm not sure whether this is a bug, or whether the assumption
>> made for the original query (that the row order returned by the
>> subquery would be carried over to the main part of the query) is
>> incorrect but just happened to work as expected pre-8.4.

> The latter. Without an ORDER BY (at the outermost level), the order of 
> the result is not well defined. Before 8.4, UNION was always performed 
> by a Sort + Unique, which explains why the output is always sorted in 
> previous releases. 8.4 knows how to perform it with a Hash Aggregate, 
> which doesn't yield sorted output.

This is mentioned in the release notes, but I suppose we'd better
promote it to the "observe the following incompatibilities" list...

                        regards, tom lane


Re: 8.4b1: Query returning results in different order to 8.3

From: Grzegorz Jaskiewicz

On 18 Apr 2009, at 22:22, Tom Lane wrote:
>
> This is mentioned in the release notes, but I suppose we'd better
> promote it to the "observe the following incompatibilities" list...
>

This is a really funny one, because people naturally expect UNION
[ALL] to stay in the same order. Unlike a table, the order here cannot
change through inserts/updates, etc.
I am sure many people, even well-experienced ones, will stumble upon that one.

I am guessing that UNION [ALL] performance just had to be improved
for CTEs? Or was it something completely separate?




Re: 8.4b1: Query returning results in different order to 8.3

From: Greg Stark

On Sat, Apr 18, 2009 at 11:19 PM, Grzegorz Jaskiewicz
<gj@pointblue.com.pl> wrote:
> This is a really funny one, because people naturally expect UNION [ALL] to
> stay in the same order. Unlike a table, the order here cannot change through
> inserts/updates, etc.
> I am sure many people, even well-experienced ones, will stumble upon that one.

There is a misunderstanding here. UNION has *never* preserved the
order of the subqueries. In the OP's query it was *not* preserving
the order. It was a coincidence: the subquery happened to be sorted
on the first field, and since UNION re-sorted the whole result set by
all the fields in order, the result was also in order by the first
field.
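
One way to see the coincidence for yourself is a variant of the OP's query in which the subquery sorts the "wrong" way. This is only a sketch: the DESC variant is not from the original report, and what comes back depends on the plan the server picks.

```sql
-- Variant of the original query: the subquery now asks for
-- descending order on the first column.
SELECT 1 AS id, 2 AS tmpl_id
 WHERE FALSE
    UNION
SELECT * FROM
(SELECT 2 AS id, 96 AS tmpl_id
     UNION
 SELECT 3 AS id, 101 AS tmpl_id
  ORDER BY 1 DESC
 ) tmpl
 WHERE tmpl_id IS NOT NULL;

-- With a Sort + Unique plan for the outer UNION, the rows are re-sorted
-- on all columns, so id 2 would be expected before id 3 despite the DESC.
-- With a hashed plan, the order is simply unspecified.
```

If the inner order were really carried over, a sort-based server would return id 3 first here.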

> I am guessing that UNION [ALL] performance just had to be improved for
> CTEs? Or was it something completely separate?

Hash aggregates are new relative to set operations, which have been
around a very long time. Set operations didn't take advantage of the
new code, but it was always fairly obvious that they should eventually
be changed to do so. I think it came along with fixing DISTINCT to use
hash aggregates, which was a similar situation.

UNION ALL should still preserve the order of the subqueries. It just
returns all the rows of each subquery one after the other with no
other work.

-- 
greg


Re: 8.4b1: Query returning results in different order to 8.3

From: Ian Barwick

2009/4/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Ian Barwick wrote:
>>> Note I'm not sure whether this is a bug, or whether the assumption
>>> made for the original query (that the row order returned by the
>>> subquery would be carried over to the main part of the query) is
>>> incorrect but just happened to work as expected pre-8.4.
>
>> The latter. Without an ORDER BY (at the outermost level), the order of
>> the result is not well defined. Before 8.4, UNION was always performed
>> by a Sort + Unique, which explains why the output is always sorted in
>> previous releases. 8.4 knows how to perform it with a Hash Aggregate,
>> which doesn't yield sorted output.
>
> This is mentioned in the release notes, but I suppose we'd better
> promote it to the "observe the following incompatibilities" list...

Thanks for clarifying that. The relevant section in the release notes (which I managed to miss) is this:


It would certainly be worth an explicit mention as I imagine the previous behaviour has been consistent enough for queries to have come to rely on it. 

Regards


Ian Barwick

Re: 8.4b1: Query returning results in different order to 8.3

From: Heikki Linnakangas

Greg Stark wrote:
> UNION ALL should still preserve the order of the subqueries. It just
> returns all the rows of each subquery one after the other with no
> other work.

Although without an ORDER BY that's not well-defined and thus not 
guaranteed to work in future versions either. I doubt we're going to 
change that anytime soon, but I wouldn't rely on it in an application.
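
For an application that needs the rows of each branch in a fixed sequence, a sketch of one way to avoid relying on that behaviour is to carry an explicit ordering column and sort on it at the outermost level. The column name "branch" is hypothetical and exists only for this illustration.

```sql
-- Tag each branch with its intended position, then sort explicitly.
SELECT id, tmpl_id FROM
(SELECT 1 AS branch, 2 AS id, 96 AS tmpl_id
     UNION ALL
 SELECT 2 AS branch, 3 AS id, 101 AS tmpl_id
 ) tmpl
 ORDER BY branch, id;
```

Because the ordering is stated in the outermost ORDER BY, the row with id 2 comes before the row with id 3 regardless of how the server chooses to execute the UNION ALL.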

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com