# Limit + group + join

 From: Tobias Brox Limit + group + join 26 August 2005 , 00:27:31 20050826002709.GK10328@tobias.lan (view: Whole thread, Raw) Re: Limit + group + join  ("Jeffrey W. Baker") Re: Limit + group + join  (Mark Kirkwood) pgsql-performance

Tree view

Limit + group + join  (Tobias Brox, )
Re: Limit + group + join  ("Jeffrey W. Baker", )
Re: Limit + group + join  (Tobias Brox, )
Re: Limit + group + join  ("Jeffrey W. Baker", )
Re: Limit + group + join  (Mark Kirkwood, )
Re: Limit + group + join  (Tobias Brox, )
Re: Limit + group + join  (Stephan Szabo, )
Re: Limit + group + join  (Tom Lane, )
Re: Limit + group + join  (Mark Kirkwood, )
Re: Limit + group + join  (Tom Lane, )
Re: Limit + group + join  (Mark Kirkwood, )
Re: Limit + group + join  (Tom Lane, )
Re: Limit + group + join  (Mark Kirkwood, )
Re: Limit + group + join  (Tobias Brox, )
Re: Limit + group + join  (Greg Stark, )
Re: Limit + group + join  ("Merlin Moncure", )
Re: Limit + group + join  ("Merlin Moncure", )

```Consider this setup - which is a gross simplification of parts of our
production system ;-)

create table c (id integer primary key);
create table b (id integer primary key, c_id integer);
create index b_on_c on b(c_id)

insert into c (select ... lots of IDs ...);
insert into b (select id, id from c); /* keep it simple :-) */

Now, I'm just interessted in some few rows.

All those gives good plans:

explain select c.id from c order by c.id limit 1;
explain select c.id from c group by c.id order by c.id limit 1;
explain select c.id from c join b on c_id=c.id order by c.id limit 1;

... BUT ... combining join, group and limit makes havoc:

explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
desc limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit  (cost=3809.65..3809.67 rows=5 width=4)
->  Group  (cost=3809.65..3940.59 rows=26187 width=4)
->  Sort  (cost=3809.65..3875.12 rows=26188 width=4)
Sort Key: c.id
->  Hash Join  (cost=559.34..1887.89 rows=26188 width=4)
Hash Cond: ("outer".id = "inner".c_id)
->  Seq Scan on c  (cost=0.00..403.87 rows=26187 width=4)
->  Hash  (cost=403.87..403.87 rows=26187 width=4)
->  Seq Scan on b  (cost=0.00..403.87 rows=26187 width=4)
(9 rows)

I get the same behaviour on pg 7.4.7 and pg 8.0.2.  Of course, I can
probably use subqueries instead of join - though, I would have wished the
planner could do better ;-)

--
Notice of Confidentiality: This information may be confidential, and