Thread: Preserving order through an inner join
I have a query structured like this: SELECT stuff FROM (SELECT more stuff FROM table1 ORDER BY field1) AS q1 INNER JOIN table2 ON ( ... ) and have found that the INNER JOIN is ignoring the order set for q1. The final results are not ordered by field1. This works for other databases (eg. MySQL and Sqllite3) but not PostgreSQL. I would really like to support PostgreSQL but this ordering problem is stopping me from doing so. I can make some small changes to the query structure as long as it works for the other DBs as well. Moving the ORDER BY outsideq1 would be a large amount of work, however (these queries are generated by a program), so I am hoping that thereis a simpler solution. Any suggestions for getting this to work? Kevin
Hello 2010/9/26 Kevin Jardine <kevinjardine@yahoo.com>: > I have a query structured like this: > > SELECT stuff FROM > (SELECT more stuff FROM > table1 > ORDER BY field1) AS q1 > INNER JOIN table2 ON ( ... ) > > and have found that the INNER JOIN is ignoring the order set for q1. > you can block a hash join that cannot to protect order. But it can be very contra-productive - this method of JOIN is the most faster. SQLite or MySQL are too simple and doesn't provide some more sophisticated constructs. Principally your query is wrong - the real order is based on final ORDER BY. It's probable, so your queries will stop working (on SQLite or MySQL) in future, when these databases will be more mature. Regards Pavel Stehule > The final results are not ordered by field1. > > This works for other databases (eg. MySQL and Sqllite3) but not PostgreSQL. > > I would really like to support PostgreSQL but this ordering problem is stopping me from doing so. > > I can make some small changes to the query structure as long as it works for the other DBs as well. Moving the ORDER BYoutside q1 would be a large amount of work, however (these queries are generated by a program), so I am hoping that thereis a simpler solution. > > Any suggestions for getting this to work? > > Kevin > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Hi Pavel, I'm not really interested in a "my database is better than your database" discussion. I want to know how to preserve the order. As I said, moving the ORDER BY out of the subquery would be a lot of work. I am trying to find another solution. Any suggestions from anyone? Kevin --- On Sun, 9/26/10, Pavel Stehule <pavel.stehule@gmail.com> wrote: > From: Pavel Stehule <pavel.stehule@gmail.com> > Subject: Re: [GENERAL] Preserving order through an inner join > To: "Kevin Jardine" <kevinjardine@yahoo.com> > Cc: pgsql-general@postgresql.org > Date: Sunday, September 26, 2010, 9:23 PM > Hello > > 2010/9/26 Kevin Jardine <kevinjardine@yahoo.com>: > > I have a query structured like this: > > > > SELECT stuff FROM > > (SELECT more stuff FROM > > table1 > > ORDER BY field1) AS q1 > > INNER JOIN table2 ON ( ... ) > > > > and have found that the INNER JOIN is ignoring the > order set for q1. > > > > you can block a hash join that cannot to protect order. But > it can be > very contra-productive - this method of JOIN is the most > faster. > SQLite or MySQL are too simple and doesn't provide some > more > sophisticated constructs. Principally your query is wrong - > the real > order is based on final ORDER BY. It's probable, so your > queries will > stop working (on SQLite or MySQL) in future, when these > databases will > be more mature. > > Regards > > Pavel Stehule > > > The final results are not ordered by field1. > > > > This works for other databases (eg. MySQL and > Sqllite3) but not PostgreSQL. > > > > I would really like to support PostgreSQL but this > ordering problem is stopping me from doing so. > > > > I can make some small changes to the query structure > as long as it works for the other DBs as well. Moving the > ORDER BY outside q1 would be a large amount of work, however > (these queries are generated by a program), so I am hoping > that there is a simpler solution. > > > > Any suggestions for getting this to work? > > > > Kevin > > > > > > > > > > -- > > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-general > > >
Kevin Jardine <kevinjardine@yahoo.com> writes: > I have a query structured like this: > SELECT stuff FROM > (SELECT more stuff FROM > table1 > ORDER BY field1) AS q1 > INNER JOIN table2 ON ( ... ) > and have found that the INNER JOIN is ignoring the order set for q1. > The final results are not ordered by field1. Indeed. Many of the possible join techniques won't preserve that ordering. > This works for other databases (eg. MySQL and Sqllite3) but not PostgreSQL. It might sometimes accidentally fail to fail, but I think you'll find that there are *no* SQL databases where this is guaranteed to work the way you expect. The SQL standard explicitly disavows any particular output row order unless there is a top-level ORDER BY. (In fact, unless things have changed recently an ORDER BY in a sub-select isn't even legal per spec.) > I can make some small changes to the query structure as long as it works for the other DBs as well. Moving the ORDER BYoutside q1 would be a large amount of work, however (these queries are generated by a program), so I am hoping that thereis a simpler solution. Nope, that's what you need to do. regards, tom lane
OK, then. The sub-select needs to go. Thanks for helpful advice. Kevin --- On Sun, 9/26/10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > From: Tom Lane <tgl@sss.pgh.pa.us> > Subject: Re: [GENERAL] Preserving order through an inner join > To: "Kevin Jardine" <kevinjardine@yahoo.com> > Cc: pgsql-general@postgresql.org > Date: Sunday, September 26, 2010, 9:37 PM > Kevin Jardine <kevinjardine@yahoo.com> > writes: > > I have a query structured like this: > > SELECT stuff FROM > > (SELECT more stuff FROM > > table1 > > ORDER BY field1) AS q1 > > INNER JOIN table2 ON ( ... ) > > > and have found that the INNER JOIN is ignoring the > order set for q1. > > > The final results are not ordered by field1. > > Indeed. Many of the possible join techniques won't > preserve that ordering. > > > This works for other databases (eg. MySQL and > Sqllite3) but not PostgreSQL. > > It might sometimes accidentally fail to fail, but I think > you'll find > that there are *no* SQL databases where this is guaranteed > to work the > way you expect. The SQL standard explicitly disavows > any particular > output row order unless there is a top-level ORDER > BY. (In fact, > unless things have changed recently an ORDER BY in a > sub-select isn't > even legal per spec.) > > > I can make some small changes to the query structure > as long as it works for the other DBs as well. Moving the > ORDER BY outside q1 would be a large amount of work, however > (these queries are generated by a program), so I am hoping > that there is a simpler solution. > > Nope, that's what you need to do. > > > regards, tom lane > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Sun, Sep 26, 2010 at 07:32:01PM +0000, Kevin Jardine wrote: > Hi Pavel, > > I'm not really interested in a "my database is better than your database" discussion. Pavel did not say that his database is better than yours. What he said was that your query is wrong (with respect to what you said you expect it to return). Tom attested that it isn't even "legal" SQL as per the spec. Please keep things factual. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
> -----Original Message----- > From: Kevin Jardine [mailto:kevinjardine@yahoo.com] > Sent: Sunday, September 26, 2010 3:15 PM > To: pgsql-general@postgresql.org > Subject: Preserving order through an inner join > > I have a query structured like this: > > SELECT stuff FROM > (SELECT more stuff FROM > table1 > ORDER BY field1) AS q1 > INNER JOIN table2 ON ( ... ) > > and have found that the INNER JOIN is ignoring the order set for q1. > > The final results are not ordered by field1. > > This works for other databases (eg. MySQL and Sqllite3) but > not PostgreSQL. > > I would really like to support PostgreSQL but this ordering > problem is stopping me from doing so. > > I can make some small changes to the query structure as long > as it works for the other DBs as well. Moving the ORDER BY > outside q1 would be a large amount of work, however (these > queries are generated by a program), so I am hoping that > there is a simpler solution. > > Any suggestions for getting this to work? > > Kevin > As others stated, it's a bad idea to order sub-select. But if you really, really need it (m.b. not in this particular case), you could do: SELECT stuff FROM (SELECT field1, field2, COUNT(*) FROM table1 GROUP BY field1, field2 ORDER BY field1) AS q1 INNER JOIN table2 ON ( ... ) Regards, Igor Neyman
On Sun, Sep 26, 2010 at 9:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The SQL standard explicitly disavows any particular
output row order unless there is a top-level ORDER BY. (In fact,
unless things have changed recently an ORDER BY in a sub-select isn't
even legal per spec.)
Not sure about the SQL spec allowing it, but an ORDER BY followed by a LIMIT does have valid use cases in sub-selects.
Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
Mail sent from my BlackLaptop device
Following up on other replies, and agreeing that SQL has no inherent row ordering and it is only by accident that you are seeing such ... The *only* way to reliably preserve sort order is by explicitly encoding the order in your data, by maintaining a column or columns through to the outermost query on which you sort. A generic solution which you can apply is to use a SQL "RANK"-alike feature to generate a serial number column in your rowset; you can do this in the inner query where you want to have all your unique or complex sorting details, and then your sort order will be preserved (as long as you explicitly propagate said column either by name or with *) through to the outer query. You can then order by and exclude said column in the outermost query if you want, but you don't have to because your application will still get the ordering information in said extra column. For example, something like this: SELECT stuff FROM (SELECT more stuff, rank() OVER (ORDER BY field1) AS ordcol FROM table1 ) AS q1 INNER JOIN table2 ON ( ... ) ORDER BY ordcol See http://www.postgresql.org/docs/9.0/interactive/queries-table-expressions.html#QUERIES-WINDOW and the 3 urls it links to. -- Darren Duncan Kevin Jardine wrote: > I have a query structured like this: > > SELECT stuff FROM > (SELECT more stuff FROM > table1 > ORDER BY field1) AS q1 > INNER JOIN table2 ON ( ... ) > > and have found that the INNER JOIN is ignoring the order set for q1. > > The final results are not ordered by field1. > > This works for other databases (eg. MySQL and Sqllite3) but not PostgreSQL. > > I would really like to support PostgreSQL but this ordering problem is stopping me from doing so. > > I can make some small changes to the query structure as long as it works for the other DBs as well. Moving the ORDER BYoutside q1 would be a large amount of work, however (these queries are generated by a program), so I am hoping that thereis a simpler solution. > > Any suggestions for getting this to work?
Gurjeet Singh wrote: > On Sun, Sep 26, 2010 at 9:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The SQL standard explicitly disavows any particular > output row order unless there is a top-level ORDER BY. (In fact, > unless things have changed recently an ORDER BY in a sub-select isn't > even legal per spec.) > > Not sure about the SQL spec allowing it, but an ORDER BY followed by a > LIMIT does have valid use cases in sub-selects. Absolutely it does, but that is just a row *filtering* operation. You still have to have a separate ORDER BY in the outermost query to get result rows output in a particular order. -- Darren Duncan
On 09/27/2010 03:37 AM, Tom Lane wrote: > Kevin Jardine<kevinjardine@yahoo.com> writes: >> I have a query structured like this: >> SELECT stuff FROM >> (SELECT more stuff FROM >> table1 >> ORDER BY field1) AS q1 >> INNER JOIN table2 ON ( ... ) > >> and have found that the INNER JOIN is ignoring the order set for q1. > >> The final results are not ordered by field1. > > Indeed. Many of the possible join techniques won't preserve that ordering. Just to elaborate on this for the OP: When joining two data sets, PostgreSQL has a number of choices about how to do it. Some of these are: - Merge join (http://en.wikipedia.org/wiki/Sort-merge_join) - Hash join (http://en.wikipedia.org/wiki/Hash_join) - Nested loop + index/sequential table scans (http://en.wikipedia.org/wiki/Nested_loop_join) As you will see from the descriptions and how they work, many join algorithms do not preserve the order of the input relations. Different join types are optimal for different absolute and relative sizes of input relations and different join conditions; all of them are useful in one situation or another. Only using joins that preserved the natural order of the relations (which is undefined as per the SQL spec and 99% of the time people don't care about anyway) would be pretty nasty for performance. You could potentially force preservation of order in your particular query by telling PostgreSQL to use a nested loop with your subquery as the outer loop, but it'd require overriding the query planner's join method and join order optimization in ugly ways that aren't well supported. Not only that, but you'll have to do that kind of hacking-around if you target any other database that supports at least hash joins, and you'll have to do it using database-specific query hints, configuration options, session variables, etc. I hope this helps to explain in a bit more detail why it's worth putting in the effort to update your code. -- Craig ringer