Thread: Index Usage using IN
Hi, I have 2 tables both have an index on ID (both ID columns are an oid). I want to find only only rows in one and not the other. Select ID from TableA where ID not IN ( Select ID from Table B) This always generates sequential scans. Table A has about 250,000 rows. Table B has about 250,000 Rows. We should get a Scan on Table B and a Index Lookup on Table A. Is there any way to force this? enable_seqscan off doesn't help at all. The Plan is Seq Scan on tablea(cost=100000000.00..23883423070450.96 rows=119414 width=4) Filter: (NOT (subplan))" SubPlan -> Seq Scan on tableb (cost=100000000.00..100004611.17 rows=242617 width=4) Thanks Ralph
On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: > Hi, > > I have 2 tables both have an index on ID (both ID columns are an oid). > > I want to find only only rows in one and not the other. > > Select ID from TableA where ID not IN ( Select ID from Table B) Have you considered this: SELECT ID from TableA EXCEPT Select ID from Table B ? -jwb
On Wed, 2006-02-01 at 12:22 -0800, Jeffrey W. Baker wrote: > On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: > > Hi, > > > > I have 2 tables both have an index on ID (both ID columns are an oid). > > > > I want to find only only rows in one and not the other. > > > > Select ID from TableA where ID not IN ( Select ID from Table B) > > Have you considered this: > > SELECT ID from TableA EXCEPT Select ID from Table B Alternately: SELECT a.ID FROM TableA AS a LEFT JOIN TableB AS b ON a.ID = b.ID WHERE b.ID IS NULL -jwb
"Jeffrey W. Baker" <jwbaker@acm.org> writes: > On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote: >> Select ID from TableA where ID not IN ( Select ID from Table B) > Have you considered this: > SELECT ID from TableA EXCEPT Select ID from Table B Also, increasing work_mem might persuade the planner to try a hashed subplan, which'd be a lot better than what you have. Note that it's quite unlikely that indexes are going to help for this. regards, tom lane
On Thu, Feb 02, 2006 at 09:12:59 +1300, Ralph Mason <ralph.mason@telogis.com> wrote: > Hi, > > I have 2 tables both have an index on ID (both ID columns are an oid). > > I want to find only only rows in one and not the other. > > Select ID from TableA where ID not IN ( Select ID from Table B) > > This always generates sequential scans. > > Table A has about 250,000 rows. Table B has about 250,000 Rows. > > We should get a Scan on Table B and a Index Lookup on Table A. I don't think that is going to work if there are NULLs in table B. I don't know whether or not Postgres has code to special case NULL testing (either for constraints ruling them out, or doing probes for them in addition to the key it is trying to match) for doing NOT IN. Just doing a simple index probe into table A isn't going to tell you all you need to know if you don't find a match.
Select ID from TableA where not exists ( Select ID from Table B where ID = TableA.ID) might give you index scan. Of course, that is only useful is TableA is very small table. Not appropriate for 250k rows on 2/1/2006 12:12 PM Ralph Mason said the following: > Hi, > > I have 2 tables both have an index on ID (both ID columns are an oid). > > I want to find only only rows in one and not the other. > > Select ID from TableA where ID not IN ( Select ID from Table B) > > This always generates sequential scans. > > Table A has about 250,000 rows. Table B has about 250,000 Rows. > > We should get a Scan on Table B and a Index Lookup on Table A. > > Is there any way to force this? enable_seqscan off doesn't help at all. > > The Plan is > > Seq Scan on tablea(cost=100000000.00..23883423070450.96 rows=119414 > width=4) > Filter: (NOT (subplan))" > SubPlan -> Seq Scan on tableb (cost=100000000.00..100004611.17 > rows=242617 width=4) > > > Thanks > Ralph > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >