Re: Limit + group + join - Mailing list pgsql-performance
From | Mark Kirkwood |
---|---|
Subject | Re: Limit + group + join |
Date | |
Msg-id | 430E85ED.60109@paradise.net.nz Whole thread Raw |
In response to | Limit + group + join (Tobias Brox <tobias@nordicbet.com>) |
Responses |
Re: Limit + group + join
Re: Limit + group + join Re: Limit + group + join |
List | pgsql-performance |
Tobias, Interesting example: The 'desc' seems to be the guy triggering the sort, e.g: explain select c.id from c join b on c_id=c.id group by c.id order by c.id limit 5; QUERY PLAN ----------------------------------------------------------------------------------------- Limit (cost=0.00..0.28 rows=5 width=4) -> Group (cost=0.00..4476.00 rows=80000 width=4) -> Merge Join (cost=0.00..4276.00 rows=80000 width=4) Merge Cond: ("outer".id = "inner".c_id) -> Index Scan using c_pkey on c (cost=0.00..1518.00 rows=80000 width=4) -> Index Scan using b_on_c on b (cost=0.00..1558.00 rows=80000 width=4) (6 rows) Whereas with it back in again: explain select c.id from c join b on c_id=c.id group by c.id order by c.id desc limit 5; QUERY PLAN -------------------------------------------------------------------------------------- Limit (cost=10741.08..10741.11 rows=5 width=4) -> Group (cost=10741.08..11141.08 rows=80000 width=4) -> Sort (cost=10741.08..10941.08 rows=80000 width=4) Sort Key: c.id -> Hash Join (cost=1393.00..4226.00 rows=80000 width=4) Hash Cond: ("outer".c_id = "inner".id) -> Seq Scan on b (cost=0.00..1233.00 rows=80000 width=4) -> Hash (cost=1193.00..1193.00 rows=80000 width=4) -> Seq Scan on c (cost=0.00..1193.00 rows=80000 width=4) (9 rows) However being a bit brutal: set enable_mergejoin=false; set enable_hashjoin=false; explain select c.id from c join b on c_id=c.id group by c.id order by c.id desc limit 5; QUERY PLAN -------------------------------------------------------------------------------------------------- Limit (cost=0.00..15.24 rows=5 width=4) -> Group (cost=0.00..243798.00 rows=80000 width=4) -> Nested Loop (cost=0.00..243598.00 rows=80000 width=4) -> Index Scan Backward using c_pkey on c (cost=0.00..1518.00 rows=80000 width=4) -> Index Scan using b_on_c on b (cost=0.00..3.01 rows=1 width=4) Index Cond: (b.c_id = "outer".id) (6 rows) What is interesting is why this plan is being rejected... Cheers Mark Tobias Brox wrote: > Consider this setup - which is a gross simplification of parts of our > production system ;-) > > create table c (id integer primary key); > create table b (id integer primary key, c_id integer); > create index b_on_c on b(c_id) > > insert into c (select ... lots of IDs ...); > insert into b (select id, id from c); /* keep it simple :-) */ > > Now, I'm just interessted in some few rows. > > All those gives good plans: > > explain select c.id from c order by c.id limit 1; > explain select c.id from c group by c.id order by c.id limit 1; > explain select c.id from c join b on c_id=c.id order by c.id limit 1; > > ... BUT ... combining join, group and limit makes havoc: > > explain select c.id from c join b on c_id=c.id group by c.id order by c.id > desc limit 5; >
pgsql-performance by date: