Re: LEFT JOIN and missing values - Mailing list pgsql-sql

From Tomasz Myrta
Subject Re: LEFT JOIN and missing values
Date
Msg-id 3E6F80AB.8040709@klaster.net
Whole thread Raw
In response to LEFT JOIN and missing values  (mila <me@pierro.dds.nl>)
Responses Re: LEFT JOIN and missing values  (mila <me@pierro.dds.nl>)
List pgsql-sql
mila wrote:
> Hi,
> 
> I have a problem, like this:
> 
> I have 2 tables, T1 (id, val1) -- a vector
> and T2 (id1, id2, val2) -- this is actually a sqare matrix
> in the form of a very long table.
> 
> I need to do some operations with them, such as take a
> dot-product of 2 rows of the matrix (encoded as T2), weighted with T1..         .
> 
> The query is the following:
> 
> select c.id1, d.id1, sum(c.val2*d.val2*T1.val1)
>        from T1, T2 c, T2 d
>        where c.ID2 = T1.id AND d.id2 = T1.id AND c.id1 < d.id1
> group by c.id1, d.id1;

> 
> This gives me all possible pairs of (weighted) dot-products, as I want them.
> 
> Now, some 90% of values in the T2 are the same, say, 0.5, so I could
> thouw them away. But then the query above is not valid anymore, since
> if there is mo match on the condition   c.ID2 = T1.id AND d.id2 =
> T1.id, nothing is added (it might be that c.id2 is there and d.id2 is
> not!).
> 
> I quiess I need a left join + substitute for missing value. Can anyone
> help me with the query? I cannot figure it out, too bad ad left
> joins...
> 
> thanks,
> Mila

First let's rewrite your query to explicit joins (I like them very much, 
because they show the problem more clearly)

select c.id1, d.id1, sum(c.val2*d.val2*T1.val1)
from T1 join T2 c on (c.ID2 = T1.id) join T2 d on (d.id2 = T1.id and c.id1 < d.id1)
group by c.id1, d.id1;

If I understand well you want to delete useless values:

delete from t2 where val2=0.5

your query should look then:

select c.id1, d.id1, sum( case when c.id2 is null then 0.5 else c.val2 end * case when d.id2 is null then 0.5 else
d.val2end *T1.val1)
 
from T1 left join T2 c on (c.ID2 = T1.id) left join T2 d on (d.id2 = T1.id and (c.id1 < d.id1 or c.id1 is null))
group by c.id1, d.id1;

I'm not sure about this query... There is a problem - is it possible to 
have missing values in your matrix? This solution replaces all missing 
values into 0.5. What should happen to "T2 alias d" if there is missing 
value in "T2 alias c"? What happens then to clause "c.id1<d.id1" if "c" 
is not found?

Regards,
Tomasz Myrta



pgsql-sql by date:

Previous
From: "Zeugswetter Andreas SB SD"
Date:
Subject: Re: [HACKERS] Cursors and backwards scans and SCROLL
Next
From: Hesham Gowaily
Date:
Subject: Execution plan problem