Home > mailing lists

Re: Join Statements - Mailing list pgsql-sql

From	Andy Marden
Subject	Re: Join Statements
Date	February 26, 2002 12:56:04
Msg-id	a5ggbc$3032$1@jupiter.hub.org Whole thread Raw
In response to	Re: Join Statements (Mark Nielsen <python@kepnet.net>)
List	pgsql-sql

Tree view

Your problem here is in the use of the use of '!='. All the columns from b
will be null if there is no match. Any null result always fails a test so
null is never not equal to anything (or equal to it for that matter!). Don't
get me started on NULLs any the ludicrous ways that RDBMSs deal them them!

What you need is:

select count(a.*)
from a left outer join b on a.id = b.id
where b.id is null

This will give you the best performing query rather than NOT IN sub query.
It also gives you greater control of the query plan. Internally, these days,
this is what Oracle converts a not in sub-query to if you let it. PostgreSQL
still persists with the old way, it seems, of doing a nested-loop scan of
the sub query for each row in the outer which is usually horribly
inefficient.

The alternative solution below with the NOT ... ANY seems to give the same
query plan as the NOT IN approach

Cheers

Andy

"Mark Nielsen" <python@kepnet.net> wrote in message
news:3C6EA55C.8040905@kepnet.net...
> select distinct TABLENAME_id from TABLENAME_backup where  NOT
> TABLENAME_id = ANY (select TABLENAME_id from TABLENAME)
>
> Here is an example of me getting all the ids from the backup table that
> doesn't exist in the main table. This is similar to what you want.
>
> I think I am answering your question, or close to it.
> Mark
>
> T Conti wrote:
>
> > Howdy:
> >
> > I need to put together an SQL statement that will return all of the
> > rows in table A that are not in table B.  Table A and B have the same
> > primary key.  For example:
> >
> > select count(a.*)
> > from a  (nolock) left outer join
> > b (nolock) on a.id = b.id
> > where a.id != b.id
> >
> > This did not work.  It returned 0 rows.  I know that this could be
> > done very easily in a sub-select, but that seems inefficient.  Is
> > there any way to accomplish what I mentioned above in the join
> > statement or is the sub-select the way to go?
> >
> > Thanks for the help,
> > Tom
> >
>
>

pgsql-sql by date:

From: Christof Glaser
Date: 26 February 2002, 12:55:05
Subject: Re: Removing duplicates

From: Andreas Joseph Krogh
Date: 26 February 2002, 12:59:17
Subject: Re: Timestamp output

Re: Join Statements - Mailing list pgsql-sql

Previous

Next