Thread: RE: [SQL] To create the query for a select where one is NOT in th e other
RE: [SQL] To create the query for a select where one is NOT in th e other
From
"Jackson, DeJuan"
Date:
> On Sat, 19 Sep 1998, Karl Denninger wrote: > > > Date: Sat, 19 Sep 1998 17:55:14 -0500 > > From: Karl Denninger <karl@denninger.net> > > To: pgsql-sql@postgreSQL.org > > Subject: [SQL] To create the query for a select where one is NOT in > the other > > > > I want to specify the following SELECT and can't figure it out: > > > > Table "a" and table "b" have identical schemas > > > > I want to select all the records in "a" in which a key field in > "a" > > does NOT have a corresponding row in "b". > > > > I know how to do this if I want the opposite, that is, if I want > all > > the rows in "a" in which the key IS present in "b" this works: > > > > select a.field1, a.field2 from t1 a, t2 b where a.field1 = > b.field1; > > I think what you need to do is use a subquery: > > select key_field from table1 where key_field NOT IN (select key_field > from > table2) ; > This query will run faster and get you the same results: SELECT key_field FROM table1 a WHERE NOT EXISTS(SELECT b.key_field FROM table2 b WHERE b.key_field = a.key_field); The reason query two is faster than one is two fold... 1) The second query's subquery will only return row's if and only if there is a match, unlike the first query which will have to return every row in table2 for every row in table1 and then compare table1.key_field to every one of those values. 2) The first query will not use indexes at all. Where as the second one will for the subquery and a table scan for table1. I suggest timing both queries with and without indexes. And use 'explain' to see the query plans. Most IN/NOT IN queries can be written as an EXISTS/NOT EXISTS query and run faster. > The subquery should select a list of all the keys in table 2, and then > no > key from talbe 1 that matches anything in this list will be included > in > your search results. > > Make sense?? > > eric > > > > > However, the intuitive modification of this (change "=" to "<>") > produces > > a monster result in which each row in "a" other than the one which > matches > > gets output for each instance of "b" (which if you think about it > does > > make sense). > > > > How do I structure an SQL statement to get the desired rows > returned? > > > > -- > > -- > > Karl Denninger (karl@denninger.net) > > > > "Yes, the president should resign. He has lied to the American > people, > > time and time again, and betrayed their trust. Since he has > > admitted guilt, there is no reason to put the American people > through > > an impeachment. He will serve absolutely no purpose in finishing > out > > his term, the only possible solution is for the president to save > > some dignity and resign." > > > ______________________________________________________________________ > __ > > > > 12th Congressional District Hopeful William Jefferson Clinton, > > during the Nixon investigations > > > > > > > > _______________________ > Eric McKeown > ericm@palaver.net > http://www.palaver.net > > >
On Mon, 21 Sep 1998, Jackson, DeJuan wrote: > Date: Mon, 21 Sep 1998 13:07:13 -0500 > From: "Jackson, DeJuan" <djackson@cpsgroup.com> > To: Eric McKeown <ericm@palaver.net>, Karl Denninger <karl@denninger.net> > Cc: pgsql-sql@postgreSQL.org > Subject: RE: [SQL] To create the query for a select where one is NOT in th e other > > > On Sat, 19 Sep 1998, Karl Denninger wrote: > > > > I think what you need to do is use a subquery: > > > > select key_field from table1 where key_field NOT IN (select key_field > > from > > table2) ; > > > This query will run faster and get you the same results: > SELECT key_field FROM table1 a WHERE NOT EXISTS(SELECT b.key_field FROM > table2 b WHERE b.key_field = a.key_field); Bear with me; I'm learning, and I have a couple of questions. > > The reason query two is faster than one is two fold... > 1) The second query's subquery will only return row's if and only if > there is a match, unlike the first query which will have to return every > row in table2 for every row in table1 and then compare table1.key_field > to every one of those values. I understand that--makes perfect sense. > 2) The first query will not use indexes at all. Where as the second one > will for the subquery and a table scan for table1. > > I suggest timing both queries with and without indexes. And use > 'explain' to see the query plans. Most IN/NOT IN queries can be written > as an EXISTS/NOT EXISTS query and run faster. So what is the main difference between IN/NOT IN and EXISTS/NOT EXISTS queries? The fact that IN/NOT IN doesn't use indexes and EXISTS/NOT EXISTS does? Is EXISTS/NOT EXISTS standard SQL syntax? Thanks for the pointers... _______________________ Eric McKeown ericm@palaver.net http://www.palaver.net