Thread: subquery join order by
Hello, (I googled and read docs before sending this e-mail). Is it necessary to use order by twice (inside and outside) to get the proper order if I have an ordered subqery in a join? select * from (select distinct on (b_id) * from a order by b_id, id) sub left join b on b.id = sub.b_id; or select * from (select distinct on (b_id) * from a order by b_id, id) sub left join b on b.id = sub.b_id order by b_id; It seems to me that it's enough to use 'order by' only inside wheter 'by desc' or 'by asc' (b_id), however I'd like to be sure. Thank you. Mage
On 19 November 2010 01:36, Mage <mage@mage.hu> wrote: > Hello, > > (I googled and read docs before sending this e-mail). > > Is it necessary to use order by twice (inside and outside) to get the proper > order if I have an ordered subqery in a join? > > select * from (select distinct on (b_id) * from a order by b_id, id) sub > left join b on b.id = sub.b_id; > > or > > select * from (select distinct on (b_id) * from a order by b_id, id) sub > left join b on b.id = sub.b_id order by b_id; > > > It seems to me that it's enough to use 'order by' only inside wheter 'by > desc' or 'by asc' (b_id), however I'd like to be sure. > > Thank you. > > Mage You should always use ORDER BY on the outer-most part of the query since that's what will be finally returning your data. Don't bother with ordering sub-selects. So in your case, just use: SELECT * FROM (SELECT DISTINCT ON (b_id) * FROM a) sub LEFT JOIN b ON b.id = sub.b_id ORDER BY sub.b_id, sub.id; But why bother with a sub-select anyway? You can write it as: SELECT DISTINCT ON (a.b_id) * FROM a LEFT JOIN b ON b.id = a.b_id ORDER BY a.b_id, a.id; -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935
On 11/19/2010 03:21 AM, Thom Brown wrote: > > You should always use ORDER BY on the outer-most part of the query > since that's what will be finally returning your data. Don't bother > with ordering sub-selects. I definiatelly have to use the "order by" inside for two reasons. When "distinct on (x)" is used then x must be in the first column in the order by part. The second column in the order by decides which records will I include in the join so it is very important to use it for ordering. > So in your case, just use: > > SELECT * > FROM (SELECT DISTINCT ON (b_id) * FROM a) sub > LEFT JOIN b ON b.id = sub.b_id > ORDER BY sub.b_id, sub.id; select distinct on (id) * from b order by name; ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions > But why bother with a sub-select anyway? You can write it as: > > SELECT DISTINCT ON (a.b_id) * > FROM a > LEFT JOIN b ON b.id = a.b_id > ORDER BY a.b_id, a.id; I considered this, however the subquery is generated by an ORM. I wanted to separate it. Also the whole join affects many rows. I thought it's cheaper to preselect them inside the subquery then do the join. I am not sure. Explain analyze is my good friend but in this case I prefer to ask. Mage
> I considered this, however the subquery is generated by an ORM. I > wanted to separate it. > > Also the whole join affects many rows. I thought it's cheaper to > preselect them inside the subquery then do the join. I am not sure. > Explain analyze is my good friend but in this case I prefer to ask. # EXPLAIN ANALYZE select * from (select distinct on (b_id) * from a order by b_id, id) sub left join b on b.id = sub.b_id; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=187.45..243.70 rows=1230 width=44) (actual time=0.000..0.000 rows=3 loops=1) [...] (11 rows) # EXPLAIN ANALYZE SELECT DISTINCT ON (a.b_id) * FROM a LEFT JOIN b ON b.id = a.b_id ORDER BY a.b_id, a.id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Unique (cost=1339.24..1405.05 rows=200 width=44) (actual time=0.000..0.000 rows=3 loops=1) [...] (15 rows) mage=# EXPLAIN ANALYZE select * from (select distinct on (b_id) * from a order by b_id, id) sub left join b on b.id = sub.b_id order by b.id; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Sort (cost=306.83..309.90 rows=1230 width=44) (actual time=0.000..0.000 rows=3 loops=1) The subquery seems to be better choice even with double ordering. But is the second order required? Mage