Thread: 8.4b1: Query returning results in different order to 8.3
Hi

The following query is distilled down from a real production query
for ease of reproduction:

  SELECT 1 AS id, 2 AS tmpl_id
   WHERE FALSE -- (in production, only rarely will this clause return a row)
   UNION
  SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
      UNION
     SELECT 3 AS id, 101 AS tmpl_id
     ORDER BY 1
    ) tmpl
   WHERE tmpl_id IS NOT NULL

In 8.3 and earlier it consistently produces this result:

   id | tmpl_id
  ----+---------
    2 |      96
    3 |     101

This is the result I am expecting this query to produce ("expecting"
as in this is what I intend it to do, and it seems to work as
intended).

In 8.4beta1 the result is unpredictable; sometimes the row with id 3
is returned first, e.g.:

   id | tmpl_id
  ----+---------
    3 |     101
    2 |      96

Changing the values selected for tmpl_id produces different ordering;
on both systems tested [*], changing 101 to 102 on the third select
produces the expected ordering consistently.

  SELECT 1 AS id, 2 AS tmpl_id
   WHERE FALSE
   UNION
  SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
      UNION
     SELECT 3 AS id, 102 AS tmpl_id
     ORDER BY 1
    ) tmpl
   WHERE tmpl_id IS NOT NULL

   id | tmpl_id
  ----+---------
    2 |      96
    3 |     102

Using other values produces varying results; I can't see a pattern.

Note that removing the first SELECT completely produces the expected
ordering consistently:

  SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
      UNION
     SELECT 3 AS id, 101 AS tmpl_id
     ORDER BY 1
    ) tmpl
   WHERE tmpl_id IS NOT NULL

   id | tmpl_id
  ----+---------
    2 |      96
    3 |     101

Workaround / solution to produce consistent results is to move the
"ORDER BY 1" to the main SELECT clause:

  SELECT 1 AS id, 2 AS tmpl_id
   WHERE FALSE
   UNION
  SELECT * FROM
    (SELECT 2 AS id, 96 AS tmpl_id
      UNION
     SELECT 3 AS id, 101 AS tmpl_id
    ) tmpl
   WHERE tmpl_id IS NOT NULL
   ORDER BY 1

(The full version of this query in its original form is in production
on 8.2 and 8.3 versions and I am confident it has always produced
consistent results. It is used to select the appropriate template for
pages on a website and someone would have noticed long before now if
it was serving up the wrong template).

Note I'm not sure whether this is a bug, or whether the assumption
made for the original query (that the row order returned by the
subquery would be carried over to the main part of the query) is
incorrect but just happened to work as expected pre-8.4.

[*] tested on:
 - Ubuntu 8.10 running on VIA C7-M
 - OS X 10.5 running on Intel Core Duo

Regards

Ian Barwick
Ian Barwick wrote:
> Note I'm not sure whether this is a bug, or whether the assumption
> made for the original query (that the row order returned by the
> subquery would be carried over to the main part of the query) is
> incorrect but just happened to work as expected pre-8.4.

The latter. Without an ORDER BY (at the outermost level), the order of
the result is not well defined. Before 8.4, UNION was always performed
by a Sort + Unique, which explains why the output is always sorted in
previous releases. 8.4 knows how to perform it with a Hash Aggregate,
which doesn't yield sorted output. You can look at the EXPLAIN output
to see the difference.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
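The difference between the two strategies can be illustrated outside the database. What follows is a minimal Python sketch, not PostgreSQL's implementation: `union_sort_unique` and `union_hash` are hypothetical helpers, and the bucket function (`sum(row) % nbuckets`) is a toy stand-in chosen only to make the effect visible.

```python
from itertools import groupby

def union_sort_unique(*inputs):
    """Deduplicate by sorting all rows, then dropping adjacent duplicates."""
    rows = sorted(row for rel in inputs for row in rel)
    return [key for key, _ in groupby(rows)]

def union_hash(*inputs, nbuckets=8):
    """Deduplicate with a toy hash table; rows are emitted bucket by bucket."""
    buckets = [[] for _ in range(nbuckets)]
    for rel in inputs:
        for row in rel:
            # Toy hash function (an assumption, not what PostgreSQL uses).
            bucket = buckets[sum(row) % nbuckets]
            if row not in bucket:
                bucket.append(row)
    return [row for bucket in buckets for row in bucket]

empty = []                   # stands in for: SELECT 1, 2 WHERE FALSE
tmpl = [(2, 96), (3, 101)]   # the subquery result, already ordered by id

print(union_sort_unique(empty, tmpl))  # sorted output, as a side effect
print(union_hash(empty, tmpl))         # order follows the hash buckets
```

With this toy hash, the row with id 3 happens to land in an earlier bucket than the row with id 2, so the hash-based variant emits it first. Both variants return the same set of rows; only a sort applied to the final result fixes the order.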
Ian Barwick wrote:
>
> Workaround / solution to produce consistent results is to move the
> "ORDER BY 1" to the main SELECT clause:
>
>   SELECT 1 AS id , 2 AS tmpl_id
>    WHERE FALSE
>    UNION
>   SELECT * FROM
>     (SELECT 2 AS id, 96 AS tmpl_id
>       UNION
>      SELECT 3 AS id, 101 AS tmpl_id
>     ) tmpl
>    WHERE tmpl_id IS NOT NULL
>    ORDER BY 1
>
> (The full version of this query in its original form is in production
> on 8.2 and 8.3 versions and I am confident it has always produced
> consistent results. It is used to select the appropriate template for
> pages on a website and someone would have noticed long before now if
> it was serving up the wrong template).
>
> Note I'm not sure whether this is a bug, or whether the assumption
> made for the original query (that the row order returned by the
> subquery would be carried over to the main part of the query) is
> incorrect but just happened to work as expected pre-8.4.
>

I don't believe it's a bug - the assumption is one you're not entitled
to make. Your "workaround" is the correct solution, ISTM.

cheers

andrew
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Ian Barwick wrote:
>> Note I'm not sure whether this is a bug, or whether the assumption
>> made for the original query (that the row order returned by the
>> subquery would be carried over to the main part of the query) is
>> incorrect but just happened to work as expected pre-8.4.

> The latter. Without an ORDER BY (at the outermost level), the order of
> the result is not well defined. Before 8.4, UNION was always performed
> by a Sort + Unique, which explains why the output is always sorted in
> previous releases. 8.4 knows how to perform it with a Hash Aggregate,
> which doesn't yield sorted output.

This is mentioned in the release notes, but I suppose we'd better
promote it to the "observe the following incompatibilities" list...

			regards, tom lane
On 18 Apr 2009, at 22:22, Tom Lane wrote:
>
> This is mentioned in the release notes, but I suppose we'd better
> promote it to the "observe the following incompatibilities" list...
>

This is a really funny one, because people naturally expect UNION [ALL]
to stay in the same order. Unlike the table, order here cannot change by
inserts/updates, etc.
I am sure many, even well experienced will stumble upon that one.

Me is guessing, that UNION [ALL] performance just had to be improved for
CTEs ? Or was it something completely separate.
On Sat, Apr 18, 2009 at 11:19 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
> This is a really funny one, because people naturally expect UNION [ALL] to
> stay in the same order. Unlike the table, order here cannot change by
> inserts/updates, etc.
> I am sure many, even well experienced will stumble upon that one.

There is a misunderstanding here. UNION has *never* preserved the
order of the subqueries before. In the OP's query it was *not*
preserving the order. It was a coincidence that the order the subquery
was in was sorted on the first field and since UNION resorted the
whole result set by all the fields in order that meant it was in order
by the first field.

> Me is guessing, that UNION [ALL] performance just had to be improved for
> CTEs ? Or was it something completely separate.

Hash aggregates were new relative to set operations which have been
around a very long time. They didn't take advantage of the new code
but it was always fairly obvious that they should eventually have been
changed to. I think it came along with fixing DISTINCT to use hash
aggregates which was a similar situation.

UNION ALL should still preserve the order of the subqueries. It just
returns all the rows of each subquery one after the other with no
other work.

-- 
greg
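The coincidence described here can be made concrete with a small Python sketch. The helpers `union_pre84` and `union_all` are hypothetical names: they only model "sort on all columns, then remove duplicates" and "append the inputs in turn", and are not how PostgreSQL is implemented.

```python
from itertools import chain

def union_pre84(*inputs):
    """Model of Sort + Unique: sort rows on all columns, drop duplicates."""
    return sorted(set(chain.from_iterable(inputs)))

def union_all(*inputs):
    """Model of UNION ALL: emit each input's rows one after the other."""
    return list(chain.from_iterable(inputs))

# Subquery ordered by its first column (like ORDER BY 1 in the thread).
by_first = [(2, 96), (3, 101)]
# Hypothetical subquery ordered by its second column instead (ORDER BY 2).
by_second = [(3, 1), (2, 9)]

print(union_pre84([], by_first))   # matches the subquery order, by coincidence
print(union_pre84([], by_second))  # subquery order is lost: re-sorted by column 1
print(union_all([(3, 1)], [(2, 9)]))  # inputs appended in the order given
```

The sort-based model returns the rows of `by_second` in a different order from the subquery, which shows that the subquery's ORDER BY was never what determined the result order. The append-only behaviour of the `union_all` model reflects what the planner does in practice, which is not the same as a guarantee.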
2009/4/19 Tom Lane <tgl@sss.pgh.pa.us>
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Ian Barwick wrote:
>>> Note I'm not sure whether this is a bug, or whether the assumption
>>> made for the original query (that the row order returned by the
>>> subquery would be carried over to the main part of the query) is
>>> incorrect but just happened to work as expected pre-8.4.
>> The latter. Without an ORDER BY (at the outermost level), the order of
>> the result is not well defined. Before 8.4, UNION was always performed
>> by a Sort + Unique, which explains why the output is always sorted in
>> previous releases. 8.4 knows how to perform it with a Hash Aggregate,
>> which doesn't yield sorted output.
> This is mentioned in the release notes, but I suppose we'd better
> promote it to the "observe the following incompatibilities" list...
Thanks for clarifying that. The relevant section in the release notes (which I managed to miss) is this:
It would certainly be worth an explicit mention as I imagine the previous behaviour has been consistent enough for queries to have come to rely on it.
Regards
Ian Barwick
Greg Stark wrote:
> UNION ALL should still preserve the order of the subqueries. It just
> returns all the rows of each subquery one after the other with no
> other work.

Although without an ORDER BY that's not well-defined and thus not
guaranteed to work in future versions either. I doubt we're going to
change that anytime soon, but I wouldn't rely on it in an application.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com