Re: "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2 - Mailing list pgsql-performance

From Ing. Marcos Ortiz Valmaseda
Subject Re: "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2
Date
Msg-id 955263366.48014741295283147090.JavaMail.root@ucimail4.uci.cu
Whole thread Raw
Responses Re: "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2
List pgsql-performance
Query is :
SELECT distinct m.id,coalesce(m.givenname,''),
       coalesce(m.midname,''),
       m.surname from marinerstates ms,vessels vsl,mariner m
WHERE  m.id=ms.marinerid and ms.vslid=vsl.id
AND    ms.state='Active' and coalesce(ms.endtime,now())::date >= '2006-07-15'
AND    ms.starttime::date <= '2007-01-11'  AND
       m.marinertype='Mariner'  and m.id
NOT IN (SELECT distinct mold.id
        FROM marinerstates msold,
        vessels vslold,
        mariner mold
        WHERE mold.id=msold.marinerid
        AND  msold.vslid=vslold.id
        AND msold.state='Active'
        AND coalesce(msold.endtime,now())::date >= '2006-07-15'
        AND msold.starttime::date <= '2007-01-11'
        AND EXISTS (SELECT 1
                    FROM marinerstates msold2
                    WHERE msold2.marinerid=msold.marinerid
                    AND msold2.state='Active'
                    AND msold2.id <> msold.id
                    AND msold2.starttime<msold.starttime
                    AND (msold.starttime-msold2.endtime)<='18 months')
                    AND mold.marinertype='Mariner' )
       ORDER BY m.surname,coalesce(m.givenname,'')
      ,coalesce(m.midname,'');

i get the following execution times: (with \timing)
FBSD_DEV : query : 240.419 ms
LINUX_PROD : query : 219.568 ms
FBSD_TEST : query :  2285.509 ms
LINUX_TEST : query : 5788.988 ms

Re writing the query in the "NOT EXIST" variation like:

SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname from marinerstates ms,vessels
vsl,marinerm where  
m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and coalesce(ms.endtime,now())::date >= '2006-07-15' and
ms.starttime::date <= '2007-01-11'  and m.marinertype='Mariner'  and NOT EXISTS
   (SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner mold  where mold.id=msold.marinerid and
msold.vslid=vslold.idand  
   msold.state='Active' and coalesce(msold.endtime,now())::date >= '2006-07-15' and msold.starttime::date <=
'2007-01-11'and  
   exists (select 1 from marinerstates msold2 where msold2.marinerid=msold.marinerid and msold2.state='Active' and
msold2.id<> msold.id and  
              msold2.starttime<msold.starttime AND (msold.starttime-msold2.endtime)<='18 months')
   and mold.marinertype='Mariner' AND mold.id=m.id)
order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'');
gives:

FBSD_DEV : query : 154.000 ms
LINUX_PROD : query : 153.408 ms
FBSD_TEST : query :  137.000 ms
LINUX_TEST : query : 404.000 ms


Well, on the Release Notes on the PostgreSQL-8.4 Documentation, the developers recommend to use NOT EXISTS
instead NOT IN, because the first clause has a better performance. So, you can use it on that way.

Other questions?
- Do you have a partial index on marinerstates.marinerid where this condition is accomplished?
- Do you have a index on mariner.id?
- Can you provide a explain of these queries on the PostgreSQL-9.0 machines?

Regards


Ing. Marcos Luís Ortíz Valmaseda
Linux User # 418229 && PostgreSQL DBA
Centro de Tecnologías Gestión de Datos (DATEC)
http://postgresql.uci.cu
http://www.postgresql.org
http://it.toolbox.com/blogs/sql-apprentice

pgsql-performance by date:

Previous
From: Achilleas Mantzios
Date:
Subject: "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2
Next
From: Mladen Gogala
Date:
Subject: Re: "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2