Re: Proper Join and check - Mailing list pgsql-general

From Billy G. Allie
Subject Re: Proper Join and check
Date
Msg-id 200210070601.g9761DI08907@bajor.mug.org
Whole thread Raw
In response to Re: Proper Join and check  (Patrick Nelson <pnelson@neatech.com>)
List pgsql-general
Patrick Nelson wrote:
> Billy G. Allie wrote:
> ----------------->>>>
> The query you are looking for is:
>
> select a.sym from tableone a
> where a.sym not in (select b.sym from tablemany b
>                     where b.sym = a.sym);
>
> This query will run MUCH faster if you create a secondary index for
> tablemany (in fact, don't even try it without the index for any significant
> number or rows):
    [. . .]
> Oh yeah it helped...  Thanks the query took like 4 seconds.  I'm not sure I
> totally understand the b.sym = a.sym though.

With the b.sym = a.sym, the query optimizer will use an index scan of tablemany.
Without it, a sequential scan will be used (very VERY S L O W for any signifi-
cant number of rows).

Without the b.sym = a.sym, the result of the subselect will be all the rows in
tablemany which will have to be scanned to see if the test (not in) succeeds.
With the b.sym = a.sym, the result of the subselect will be empty or the value
of b.sym (as determined by an index lookup, which is why the index was needed).

I hope this clarifies things somewhat.
___________________________________________________________________________
____       | Billy G. Allie    | Domain....: Bill.Allie@mug.org
|  /|      | 7436 Hartwell     | MSN.......: B_G_Allie@email.msn.com
|-/-|----- | Dearborn, MI 48126|
|/  |LLIE  | (313) 582-1540    |

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: cross-posts (was Re: Large databases, performance)
Next
From: "Shridhar Daithankar"
Date:
Subject: Re: understanding insert slowdown