Re: dumb question - Mailing list pgsql-general

From Kevin Grittner
Subject Re: dumb question
Date
Msg-id CACjxUsMU3E5nL0EuDMbxQ9vaKWXYtztnsDB-7ZfRjE19bEkyaQ@mail.gmail.com
Whole thread Raw
In response to Re: dumb question  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: dumb question
List pgsql-general
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
>> Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want.  (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
 max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
 max
-----
   3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

>> do note, this is whats known as an 'anti-join', and these can be pretty
>> expensive on large tables.
>
> +1

*Can* be.  Proper indexing can make them very reasonable.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-general by date:

Previous
From: rob stone
Date:
Subject: Re: Partitioned tables do not return affected row counts to client
Next
From: "David G. Johnston"
Date:
Subject: Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3