Re: Question on a select - Mailing list pgsql-general

From Bruno Wolff III
Subject Re: Question on a select
Date
Msg-id 20050102075249.GA18496@wolff.to
Whole thread Raw
In response to Re: Question on a select  (Madison Kelly <linux@alteeve.com>)
Responses Re: Question on a select  (Madison Kelly <linux@alteeve.com>)
List pgsql-general
On Sun, Jan 02, 2005 at 01:58:20 -0500,
  Madison Kelly <linux@alteeve.com> wrote:
> Bruno Wolff III wrote:
> >SELECT a_name, a_type, a_dir, a_<others> FROM table_a
> >  WHERE a_name, a_type, a_dir NOT IN (
> >    SELECT b_name, b_type, b_dir FROM table_b)
> >;
> >
> >In pre 7.4 versions or if there are NULLs in the key columns for table_b
> >then you probably want to use NOT EXISTS (with a moodified WHERE clause)
> >instead on NOT IN.
>
> Hi Bruno,
>
>   Thank you for replying! I tried your example but I am getting a
> syntax error:
>
> tle-bu=> SELECT file_name, file_parent_dir, file_type FROM file_info_1
> WHERE file_name, file_parent_dir, file_type NOT IN (SELECT fs_name,
> fs_parent_dir, fs_type FROM file_set_1);
> ERROR:  syntax error at or near "," at character 78

There should be parenthesis around the list to test.
WHERE a_name, a_type, a_dir NOT IN (
should be
WHERE (a_name, a_type, a_dir) NOT IN (
>
>   I just replied to Vincent's post with a lot of detail on what I am
> trying to do and how my DB is constructed. His second example worked but
> I also had a syntax error on his first example. This program will be
> working with very large data sets so I would love to get your method
> working so that I could try benchmarking them to see which, in my
> application, would be most effective.

I believe that the NOT IN query should run comparably to the LEFT JOIN
example supplied by the other person (at least in recent versions of
Postgres). I would expect this to run faster than using NOT EXISTS.
You probably want to try all 3. The semantics of the three ways of doing
this are not all equivalent if there are NULLs in the data being used
to eliminate rows. As you indicated you don't have NULLs this shouldn't
be a problem.

Another way to write this is using set different (EXCEPT or EXCEPT ALL)
using the key fields and then joining back to table a to pick up the
other fields. However this will almost certianly be slower than the
other methods.

pgsql-general by date:

Previous
From: Madison Kelly
Date:
Subject: Re: Question on a select
Next
From: Madison Kelly
Date:
Subject: Re: Question on a select