Re: Q: Table scans on set difference - Mailing list pgsql-general

From Tom Lane
Subject Re: Q: Table scans on set difference
Date
Msg-id 20529.1152891383@sss.pgh.pa.us
Whole thread Raw
In response to Re: Q: Table scans on set difference  (Alban Hertroys <alban@magproductions.nl>)
List pgsql-general
Alban Hertroys <alban@magproductions.nl> writes:
> G. Ralph Kuntz, MD wrote:
>> explain select file_name from encounter_properties_table where file_name not
>> in (select filename from xfiles);

> What about:
> explain select file_name from encounter_properties_table
> where not exists (
>     select file_name from xfiles where filename = file_name);

If you only need the file name, an EXCEPT would probably work much
better:

select file_name from encounter_properties_table
except
select filename from xfiles;

Another possibility is to abuse the outer join machinery:

select file_name, ... from
    encounter_properties_table l left join xfiles r
      on l.file_name = r.filename
where r.filename is null;

Generally speaking, NOT IN performance is going to suck unless the
sub-select is small enough to fit in a hashtable.  You could consider
increasing work_mem enough that it would fit, but with 500K filenames
needed, that's probably not going to win.

            regards, tom lane

pgsql-general by date:

Previous
From: Florian Weimer
Date:
Subject: Re: Timestamp vs timestamptz
Next
From: Tom Lane
Date:
Subject: Re: apparent wraparound