Home > mailing lists

Re: Unable to get acceptable performance from EXCEPT - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Unable to get acceptable performance from EXCEPT
Date	May 10, 2000 18:39:06
Msg-id	20127.957998329@sss.pgh.pa.us Whole thread Raw
In response to	Unable to get acceptable performance from EXCEPT (Alfred Perlstein <bright@wintelcom.net>)
Responses	Re: Unable to get acceptable performance from EXCEPT
List	pgsql-hackers

Tree view

Alfred Perlstein <bright@wintelcom.net> writes:
> =# select ref_id from ref_old except select ref_id from ref_new;
> Takes over 10 minutes, probably closer to half an hour.
> I've also tried using 'NOT IN ( select ref_id from ref_new )'

Yup.  EXCEPT is effectively translated to a NOT IN, if I recall
correctly, and neither IN ( sub-select ) nor NOT IN ( sub-select )
are implemented very efficiently.  Basically you get O(N^2) behavior
because the inner select is rescanned for each outer tuple.

We have a TODO list item to try to be smarter about this...

> Is there a way to formulate my SQL to get Postgresql to follow
> this algorithm [ kind of like a mergejoin ]

No, but you could try

select ref_id from ref_old where not exists
(select ref_id from ref_new where ref_id = ref_old.ref_id);

which would at least be smart enough to consider using an index
on ref_new(ref_id) instead of a sequential scan.
        regards, tom lane

pgsql-hackers by date:

From: Bruce Momjian
Date: 10 May 2000, 18:32:15
Subject: Re: setproctitle() no longer used?

From: Vince Vielhaber
Date: 10 May 2000, 18:59:11
Subject: Re: setproctitle() no longer used?

Re: Unable to get acceptable performance from EXCEPT - Mailing list pgsql-hackers

Previous

Next