Thread: Why is this a cross join?

Why is this a cross join?

From
Tim Uckun
Date:
I have a seemingly simple query that returns way too many records.
Here is the count query

select count(*) from (
    select crm.*
    from sap_crm_sales crm
     inner join model_configurations mc on left(crm.customer_class, 6)
= left(mc.sap_code,6)
) as s

The crm table has about 44K records, the model table has about 1K the
join produces about 9 million  records

It doesn't matter if I make it a left join or inner join the count
ends up the same.

Re: Why is this a cross join?

From
Alban Hertroys
Date:
On Feb 17, 2013, at 9:16, Tim Uckun <timuckun@gmail.com> wrote:

> I have a seemingly simple query that returns way too many records.
> Here is the count query
>=20
> select count(*) from (
>    select crm.*
>    from sap_crm_sales crm
>     inner join model_configurations mc on left(crm.customer_class, 6)
> =3D left(mc.sap_code,6)
> ) as s
>=20
> The crm table has about 44K records, the model table has about 1K the
> join produces about 9 million  records


Apparently the first 6 characters of those fields are quite common, =
which gives you a result for every possible combination of the same =
6-character value.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: Why is this a cross join?

From
Tim Uckun
Date:
> Apparently the first 6 characters of those fields are quite common, which
> gives you a result for every possible combination of the same 6-character
> value.


Mmmmm. That seems kind of weird.   Is there any way to NOT have this
be a cross join?  For example if I extracted the first six characters
into a field and then joined on them it would not be a cross join
right?

Re: Why is this a cross join?

From
Tony Theodore
Date:
On 18/02/2013, at 7:58 AM, Tim Uckun <timuckun@gmail.com> wrote:

>> Apparently the first 6 characters of those fields are quite common, =
which
>> gives you a result for every possible combination of the same =
6-character
>> value.
>=20
>=20
> Mmmmm. That seems kind of weird.   Is there any way to NOT have this
> be a cross join?  For example if I extracted the first six characters
> into a field and then joined on them it would not be a cross join
> right?

In some way, every join is a cross join, with the results filtered =
according to the specificity of the join conditions. In this case:

inner join model_configurations mc on left(crm.customer_class, 6) =3D =
left(mc.sap_code,6)

"customer_class" sounds like a fairly generic sort of field, so you'd =
expect many matches. Truncating the fields is likely to make this even =
less specific, returning more results.

Cheers,

Tony
 =20=

Re: Why is this a cross join?

From
Tim Uckun
Date:
>
> In some way, every join is a cross join, with the results filtered according to the specificity of the join
conditions.In this case: 
>
> inner join model_configurations mc on left(crm.customer_class, 6) = left(mc.sap_code,6)
>
> "customer_class" sounds like a fairly generic sort of field, so you'd expect many matches. Truncating the fields is
likelyto make this even less specific, returning more results. 
>

I guess I am not explaining it properly..

Say I created new columns on both tables called "first_6" and
populated them with the substrings.  If I did a inner join or a left
join on those fields would I still get a cross join?

inner join model_configurations mc on mc.first_6 = crm.first_6

Re: Why is this a cross join?

From
Adrian Klaver
Date:
On 02/17/2013 02:09 PM, Tim Uckun wrote:
>>
>> In some way, every join is a cross join, with the results filtered according to the specificity of the join
conditions.In this case: 
>>
>> inner join model_configurations mc on left(crm.customer_class, 6) = left(mc.sap_code,6)
>>
>> "customer_class" sounds like a fairly generic sort of field, so you'd expect many matches. Truncating the fields is
likelyto make this even less specific, returning more results. 
>>
>
> I guess I am not explaining it properly..
>
> Say I created new columns on both tables called "first_6" and
> populated them with the substrings.  If I did a inner join or a left
> join on those fields would I still get a cross join?
>
> inner join model_configurations mc on mc.first_6 = crm.first_6
>
>
http://www.postgresql.org/docs/9.2/interactive/sql-select.html

...CROSS JOIN and INNER JOIN produce a simple Cartesian product, the
same result as you get from listing the two items at the top level of
FROM, but restricted by the join condition (if any). CROSS JOIN is
equivalent to INNER JOIN ON (TRUE), that is, no rows are removed by
qualification. These join types are just a notational convenience, since
they do nothing you couldn't do with plain FROM and WHERE...

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Why is this a cross join?

From
Tony Theodore
Date:
On 18/02/2013, at 9:09 AM, Tim Uckun <timuckun@gmail.com> wrote:

>>=20
>> In some way, every join is a cross join, with the results filtered =
according to the specificity of the join conditions. In this case:
>>=20
>> inner join model_configurations mc on left(crm.customer_class, 6) =3D =
left(mc.sap_code,6)
>>=20
>> "customer_class" sounds like a fairly generic sort of field, so you'd =
expect many matches. Truncating the fields is likely to make this even =
less specific, returning more results.
>>=20
>=20
> I guess I am not explaining it properly..
>=20
> Say I created new columns on both tables called "first_6" and
> populated them with the substrings.  If I did a inner join or a left
> join on those fields would I still get a cross join?

No, it would be no different if you created new columns - the join =
condition is the same.

You're not actually getting a cross join, that would be many more =
records than 9million - you're just not setting a specific enough =
filter.

Cheers,

Tony

Re: Why is this a cross join?

From
John R Pierce
Date:
On 2/17/2013 2:09 PM, Tim Uckun wrote:
> Say I created new columns on both tables called "first_6" and
> populated them with the substrings.  If I did a inner join or a left
> join on those fields would I still get a cross join?
>
> inner join model_configurations mc on mc.first_6 = crm.first_6

without additional filtering, that will return a row for every row in
crm that matches a row in mc.    if there's a 1000 rows in crm that each
match an average of 6 rows in mc, that would return 6000 rows.

crm left join mc would return all of those, plus all the rows in crm
that don't have any match in mc (the mc.* fields would be NULL in these
result rows).



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: Why is this a cross join?

From
Kevin Grittner
Date:
Tim Uckun <timuckun@gmail.com> wrote:=0A=0A> I guess I am not explaining it=
 properly..=0A>=0A> Say I created new columns on both tables called "first_=
6" and=0A> populated them with the substrings.=A0 If I did a inner join or =
a=0A> left join on those fields would I still get a cross join?=0A>=0A> inn=
er join model_configurations mc on mc.first_6 =3D crm.first_6=0A=0AAs other=
s have said, that would not change the results -- you are=0Agetting exactly=
 what you are requesting, but apparently not what=0Ayou want.=A0 Can you ex=
plain what it is that you do want?=A0 If you=0Ahave ten rows in one table w=
ith the same "first six" value, and 20=0Arows in the other table with that =
same "first six" value, how do=0Ayou want them to be matched up?=A0 Do you =
want one row in the result=0Afor every row in one of the tables?=A0 If so, =
how do you want to=0Adetermine which of the matching rows in the other tabl=
e to choose,=0Aand which to ignore?=0A=0A-- =0AKevin Grittner=0AEnterpriseD=
B: http://www.enterprisedb.com=0AThe Enterprise PostgreSQL Company

Re: Why is this a cross join?

From
Tim Uckun
Date:
> for every row in one of the tables?  If so, how do you want to
> determine which of the matching rows in the other table to choose,
> and which to ignore?


In this case it's a simple lookup. Any of the matching rows will do
but I just want one.

Re: Why is this a cross join?

From
Kevin Grittner
Date:
Tim Uckun <timuckun@gmail.com> wrote:=0A=0A>> how do you want to determine =
which of the matching rows in the=0A>> other table to choose, and which to =
ignore?=0A>=0A> Any of the matching rows will do but I just want one.=0A=0A=
Well, if you don't care, this might give you what you want.=0A=0Aselect dis=
tinct on (x.first6) crm.*, x.*=0A=A0 from from sap_crm_sales crm=0A=A0 join=
 (select left(mc.sap_code,6) as first6, mc.*=0A=A0=A0=A0=A0=A0=A0=A0=A0=A0 =
from model_configurations mc) x=0A=A0=A0=A0 on x.first6 =3D left(crm.custom=
er_class, 6)=0A=A0 order by x.first6;=0A=0AI can think of about a dozen dif=
ferent things which fit the=0Adescription that you've given so far that you=
 *might* want instead,=0Abut unless you get more specific we can only guess=
.=0A=0AIf this isn't what you want, your best bet might might be to come=0A=
up with a self-contained example, where you create an populate=0Atables wit=
h a small amount of data, and show what you would like as=0Aa result given =
that data set.=0A=0A--=0AKevin Grittner=0AEnterpriseDB: http://www.enterpri=
sedb.com=0AThe Enterprise PostgreSQL Company