Re: in() VS exists() - Mailing list pgsql-sql
From | Julien |
---|---|
Subject | Re: in() VS exists() |
Date | |
Msg-id | 1205424600.3240.30.camel@frodon.be-bif.ulb.ac.be Whole thread Raw |
In response to | in() VS exists() (Julien <jcigar@ulb.ac.be>) |
Responses |
Re: in() VS exists()
|
List | pgsql-sql |
If I understood well the query plan, the planner optimize the IN(SELECT ...) version with a JOIN (line 19-20 of the first paste) : -> Hash IN Join (cost=240.95..4011.20 rows=1436 width=4) (actual time=93.971..201.908 rows=1431 loops=1) Hash Cond: ("outer".id = "inner".specimen_id) so I guess that : select count(sp.id) from specimens sp where sp.id in (select specimen_id from sequences); is almost the same as : select count(sp.id) from specimens sp INNER JOIN (select specimen_id from sequences GROUP BY specimen_id) as foo on foo.specimen_id = sp.id; ? Thanks, Julien On Thu, 2008-03-13 at 14:46 +0100, Bart Degryse wrote: > The chapter on indexes in the manual should give you a pretty good > idea on the why. > IN and EXISTS are not the only possibilities, you can also use inner > or outer joins. > Which solution performs best depends on the data, the database > version, the available indexes, ... > > >>> Julien <jcigar@ulb.ac.be> 2008-03-13 15:47 >>> > Hello, > > Does anyone has an idea why sometimes: > - select ... where ... in (select ...) > is faster than : > - select ... where ... exists(select ...) > and sometimes it's the opposite ? > > I had such a situation, I've pasted the queries on: > http://rafb.net/p/KXNZ6892.html and http://rafb.net/p/jvo5DO38.html > > It's running PostgreSQL 8.1 with an effective_cache_size of 30000. > > specimens.id is the primary key and there are indexes on > sequences(specimen_id) and specimen_measurements(specimen_id) > > Is there a general "rule" to know when to use the in() version and > when > to use the exists() version ? Is it true to say that the exists() > version is more scalable (with many rows) than the in() version (from > the little tests I made it seems the case) ? > > Thanks, > Julien > > -- > Julien Cigar > Belgian Biodiversity Platform > http://www.biodiversity.be > Université Libre de Bruxelles (ULB) > Campus de la Plaine CP 257 > Bâtiment NO, Bureau 4 N4 115C (Niveau 4) > Boulevard du Triomphe, entrée ULB 2 > B-1050 Bruxelles > Mail: jcigar@ulb.ac.be > @biobel: http://biobel.biodiversity.be/person/show/471 > Tel : 02 650 57 52 > > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql > -- Julien Cigar Belgian Biodiversity Platform http://www.biodiversity.be Université Libre de Bruxelles (ULB) Campus de la Plaine CP 257 Bâtiment NO, Bureau 4 N4 115C (Niveau 4) Boulevard du Triomphe, entrée ULB 2 B-1050 Bruxelles Mail: jcigar@ulb.ac.be @biobel: http://biobel.biodiversity.be/person/show/471 Tel : 02 650 57 52