join question - Mailing list pgsql-general

From Grzegorz Jaśkiewicz
Subject join question
Date
Msg-id 2f4958ff0810221452qabafe52p4089278804a8cfcb@mail.gmail.com
Responses Re: join question  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hey folks,

I am trying to rewrite a query that currently takes 1.5 minutes to finish. I got it down to 20 s, and I am still trying to bring it down further.

Basically, the query currently looks something like this:

select a.*, b.* 
 from foo a
   join bar b on a.id = b.a_id and a.banned <> true
 where
   a.start <= now()
  and
   b.end > now();


That's the 20 s query. I have now got it down to 10 s by using something that, in my eyes, should always be wrong and goes against all logic. So could someone please explain to me why this is faster:

select a.*, b.* 
 from foo a
   join bar b on a.id = b.a_id
 where
  not exists (
      select id from foo where foo.id = b.a_id and foo.banned <> true
   )
 and
   a.start <= now()
  and
   b.end > now();


The plans differ, obviously: the second one uses an index to look up .banned in foo, whilst the first one goes for a seq scan.
The result is the same, but I was actually expecting quite the opposite. So is a join on 1-2M rows a bad idea?
The effect can be seen on both 8.1 and CVS head.
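
For anyone wanting to compare the two plans, something along these lines should show them side by side. This is only a sketch against the simplified foo/bar names used in the second query, not the real schema, and the actual plans and timings will depend on the data and indexes:

```sql
-- Sketch only: foo, bar and their columns are the simplified names from this post.
-- EXPLAIN ANALYZE executes the query and prints the chosen plan with actual timings.

-- Variant 1: banned check as part of the join condition.
EXPLAIN ANALYZE
select a.*, b.*
 from foo a
   join bar b on a.id = b.a_id and a.banned <> true
 where
   a.start <= now()
  and
   b.end > now();

-- Variant 2: banned check moved into a NOT EXISTS subquery.
EXPLAIN ANALYZE
select a.*, b.*
 from foo a
   join bar b on a.id = b.a_id
 where
  not exists (
      select id from foo where foo.id = b.a_id and foo.banned <> true
   )
  and
   a.start <= now()
  and
   b.end > now();
```

The interesting part of the output is whether foo is read with a Seq Scan or an Index Scan node in each variant, and the row counts reported for each node.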

I would be grateful if someone could clarify this for me.

-- 
GJ
