Thread: slow query - where not in

slow query - where not in

From
Jeremiah Elliott
Date:
here is the query that is killing me:

select shoporder from sodetailtabletrans where shoporder not in(select
shoporder from soheadertable)

This is just an example query. Any time I use 'where not in(' it takes several
hours to return a resultset. The postgres version is 7.2.3 although I have
tried it on my test server which has 7.3 on it and it runs just as slow. The
server is a fast server 2GHz with a gig of ram.  I have tried several
differant index setups but nothing seems to help.

soheadertable - 5104 rows
CREATE TABLE "soheadertable" (
   "shoporder" numeric(10,0) NOT NULL,
   "initrundate" date,
   "actualrundate" date,
   "processedminutes" numeric(10,0),
   "starttime" timestamptz,
   "endtime" timestamptz,
   "line" int4,
   "personcount" numeric(10,0),
   "product" varchar(15),
   "qtytomake" numeric(10,3),
   "linestatus" numeric(2,0) DEFAULT 1,
   "finishlinestatus" numeric(2,0) DEFAULT 1,
   "qtyinqueue" numeric(10,3),
   "lastcartonprinted" numeric(10,0),
   "qtydone" int8,
   "warehouse" text,
   "rescheduledate" date,
   "calculateddatetorun" date
);
CREATE  UNIQUE INDEX "shoporder_soheadertable_ukey" ON "soheadertable"
("shoporder");

sodetailtabletrans - 31494 rows
CREATE TABLE "sodetailtabletrans" (
   "shoporder" numeric(10,0) NOT NULL,
   "soseq" numeric(5,0) NOT NULL,
   "product" char(15) NOT NULL,
   "qtyqueued" numeric(17,2),
   "qtyneeded" numeric(17,2),
   "qtyallocated" numeric(17,2),
   "qtyused" numeric(17,2),
   "linestatus" numeric(2,0) DEFAULT 1,
   "unitsperenditem" numeric(10,1),
   CONSTRAINT "sodetailtrans_pk" PRIMARY KEY ("shoporder", "soseq")

-Jeremiah Elliott
jelliott@artcobell.com
);


Re: slow query - where not in

From
Bruno Wolff III
Date:
On Fri, Mar 28, 2003 at 09:38:50 -0600,
  Jeremiah Elliott <jelliott@artcobell.com> wrote:
> here is the query that is killing me:
>
> select shoporder from sodetailtabletrans where shoporder not in(select
> shoporder from soheadertable)

This will probably work better in 7.4.

For now there are several ways to rewrite this query.

If there are no null values for shoporder in soheadertable or
sodetailtabletrans you can use not exists instead of not in:
select shoporder from sodetailtabletrans where shoporder not exists(select
shoporder from soheadertable)

You can use set difference to calculate the result:
select shoporder from sodetailtabletrans except all select
shoporder from soheadertable

If there are no null values for shoporder in one of sodetailtabletrans
or soheadertable you can user an outer join with a restriction that limits
the rows of interest to those that don't match:
select sodetailtabletrans.shoporder from sodetailtabletrans left join
soheadertable using (shoporder) where soheadertable.shoporder is null


Re: slow query - where not in

From
Greg Stark
Date:
Jeremiah Elliott <jelliott@artcobell.com> writes:

> here is the query that is killing me:
>
> select shoporder from sodetailtabletrans where shoporder not in(select
> shoporder from soheadertable)
>
> This is just an example query. Any time I use 'where not in(' it takes several
> hours to return a resultset. The postgres version is 7.2.3 although I have
> tried it on my test server which has 7.3 on it and it runs just as slow. The
> server is a fast server 2GHz with a gig of ram.  I have tried several
> differant index setups but nothing seems to help.

This should be improved with 7.4, however there are some other things you can
try now.

try

SELECT shoporder
  FROM sodetailtabletrans
 WHERE NOT EXISTS (
       SELECT 1
         FROM soheadertable
        WHERE shoporder = sodetailtabletrans.shoporder
       )

or else try something like

         SELECT a.shoporder
           FROM sodetailtabletrans as a
LEFT OUTER JOIN soheadertable as b ON (a.shoporder = b.shoporder)
          WHERE b.shoporder IS NULL


--
greg

Re: slow query - where not in

From
"Michael Paesold"
Date:
Bruno Wolff III <bruno@wolff.to> wrote:

> Jeremiah Elliott <jelliott@artcobell.com> wrote:
> > here is the query that is killing me:
> >
> > select shoporder from sodetailtabletrans where shoporder not in(select
> > shoporder from soheadertable)
>

> If there are no null values for shoporder in soheadertable or
> sodetailtabletrans you can use not exists instead of not in:
> select shoporder from sodetailtabletrans where shoporder not exists(select
> shoporder from soheadertable)

I think this should rather be:

SELECT shoporder FROM sodetailtabletrans
  WHERE NOT EXISTS (
    SELECT 1 FROM soheadertable
    WHERE soheadertable.shoporder = sodetailtabletrans.shoporder
    )

Regards,
Michael Paesold


Re: slow query - where not in

From
Bruno Wolff III
Date:
On Fri, Mar 28, 2003 at 17:53:46 +0100,
  Michael Paesold <mpaesold@gmx.at> wrote:
> Bruno Wolff III <bruno@wolff.to> wrote:
>
> I think this should rather be:
>
> SELECT shoporder FROM sodetailtabletrans
>   WHERE NOT EXISTS (
>     SELECT 1 FROM soheadertable
>     WHERE soheadertable.shoporder = sodetailtabletrans.shoporder
>     )

Thanks for catching my mistake.


Re: slow query - where not in

From
Jeremiah Elliott
Date:
thanks guys - Greg, Bruno and Michael. That made a world of diferance.

thx
-Jeremiah