Thread: Proper Join and check

Proper Join and check

From
Patrick Nelson
Date:
If I have a one to many relation set on sym in each table is like... say
this:

CREATE TABLE "tableone" (
 "sym" varchar(5) NOT NULL,
 PRIMARY KEY ("sym") );

and

CREATE TABLE "tablemany" (
 "id"  int4       NOT NULL DEFAULT nextval('res_id_seq'::text),
 "sym" varchar(5) REFERENCES tableone ON UPDATE CASCADE,
 PRIMARY KEY ("id") );

Now these tables hold a lot of data so I want to do this correctly.  How do
I find the records in tableone that don't have any corresponding records in
tablemany?  In my application this shouldn't happen accept during the
initial inserting in tableone which briefly is followed by inserting in
tablemany.

Re: Proper Join and check

From
Patrick Nelson
Date:
Billy G. Allie wrote:
----------------->>>>
The query you are looking for is:

select a.sym from tableone a
where a.sym not in (select b.sym from tablemany b
                    where b.sym = a.sym);

This query will run MUCH faster if you create a secondary index for
tablemany (in fact, don't even try it without the index for any significant
number or rows):

create index tablemany_sym on tablemany(sym);

A better solution though, would be to use triggers to update tablemany when
tableone is updated, or to wrap both updates in a single transaction so that
the tableone transaction can be rolled back if tablemany is not updated.

Not knowing what you are attempting to do and how the updates occur prevents
me from going into any more detail than this.
----------------->>>>

Oh yeah it helped...  Thanks the query took like 4 seconds.  I'm not sure I
totally understand the b.sym = a.sym though.

I was trying to not bore anyone with a in depth description.  But since you
asked...

There is company data in each record of the tableone, and daily inputs of
accounting data in tablemany.  Now when a company gets added to table one
it's because it's YTD sales went above a certain point.  So a record gets
inserted into tableone, and then the accounting data gets inserted into
tablemany sequentially in that order.  When any company drops below that
certain point it gets flagged and if it stays there for a certain time, it
gets dropped from the tableone and tablemany.  The only constraint I had set
was the "REFERENCES tableone ON UPDATE CASCADE" which adds some continuity,
but not much.  Hadn't thought about triggers or really anything.
Application was working fine until I noticed that there was a sym in
tableone that didn't exist in tablemany.

Does that describe it good?

Again thanks for the index and select query.

Re: Proper Join and check

From
"Billy G. Allie"
Date:
Patrick Nelson wrote:
> Billy G. Allie wrote:
> ----------------->>>>
> The query you are looking for is:
>
> select a.sym from tableone a
> where a.sym not in (select b.sym from tablemany b
>                     where b.sym = a.sym);
>
> This query will run MUCH faster if you create a secondary index for
> tablemany (in fact, don't even try it without the index for any significant
> number or rows):
    [. . .]
> Oh yeah it helped...  Thanks the query took like 4 seconds.  I'm not sure I
> totally understand the b.sym = a.sym though.

With the b.sym = a.sym, the query optimizer will use an index scan of tablemany.
Without it, a sequential scan will be used (very VERY S L O W for any signifi-
cant number of rows).

Without the b.sym = a.sym, the result of the subselect will be all the rows in
tablemany which will have to be scanned to see if the test (not in) succeeds.
With the b.sym = a.sym, the result of the subselect will be empty or the value
of b.sym (as determined by an index lookup, which is why the index was needed).

I hope this clarifies things somewhat.
___________________________________________________________________________
____       | Billy G. Allie    | Domain....: Bill.Allie@mug.org
|  /|      | 7436 Hartwell     | MSN.......: B_G_Allie@email.msn.com
|-/-|----- | Dearborn, MI 48126|
|/  |LLIE  | (313) 582-1540    |

Re: Proper Join and check

From
"Billy G. Allie"
Date:
Patrick Nelson wrote:
> If I have a one to many relation set on sym in each table is like... say
> this:
>
> CREATE TABLE "tableone" (
>  "sym" varchar(5) NOT NULL,
>  PRIMARY KEY ("sym") );
>
> and
>
> CREATE TABLE "tablemany" (
>  "id"  int4       NOT NULL DEFAULT nextval('res_id_seq'::text),
>  "sym" varchar(5) REFERENCES tableone ON UPDATE CASCADE,
>  PRIMARY KEY ("id") );
>
> Now these tables hold a lot of data so I want to do this correctly.  How do
> I find the records in tableone that don't have any corresponding records in
> tablemany?  In my application this shouldn't happen accept during the
> initial inserting in tableone which briefly is followed by inserting in
> tablemany.

The query you are looking for is:

select a.sym from tableone a
where a.sym not in (select b.sym from tablemany b
                    where b.sym = a.sym);

This query will run MUCH faster if you create a secondary index for tablemany (in fact, don't even try it without the
indexfor any significant number or rows): 

create index tablemany_sym on tablemany(sym);

A better solution though, would be to use triggers to update tablemany when tableone is updated, or to wrap both
updatesin a single transaction so that the tableone transaction can be rolled back if tablemany is not updated. 

Not knowing what you are attempting to do and how the updates occur prevents me from going into any more detail than
this.

I hope this helps.
--
____       | Billy G. Allie    | Domain....: Bill.Allie@mug.org
|  /|      | 7436 Hartwell     | MSN.......: B_G_Allie@email.msn.com
|-/-|----- | Dearborn, MI 48126|
|/  |LLIE  | (313) 582-1540    |



Attachment