Thread: How to find missing values across multiple OUTER JOINs

How to find missing values across multiple OUTER JOINs

From
Drew
Date:
I'm having troubles using multiple OUTER JOINs, which I think I want  
to use to solve my problem.

My problem is to find all non-translated string values in our  
translations database, given the 4 following tables:

SOURCE (source_id PRIMARY KEY, language_id, value, ...)
TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...)
TRANSLATION_PAIR (source_id, translation_id)
LANGUAGE(language_id PRIMARY KEY, name)

This seems to me the appropriate situation for using OUTER JOINs, but  
I cannot figure out how to get the null rows without the not-null rows.

Here's my best guess at this query:
SELECT  
s.source_id,tp.translation_pair_id,t.translation_id,t.language_id,  
l.name
                                                                            FROM source s
 
LEFT OUTER JOIN translation_pair tp USING(source_id)
LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id  
AND t.translation_id is null)
RIGHT OUTER JOIN language l on l.language_id = t.language_id;

To test this query, I have a string that only has a translation in  
English and used it in this test query.  SELECT  
s.source_id,tp.translation_pair_id,t.translation_id,t.language_id,  
l.name    FROM (select * FROM source s WHERE source_id = 159986) as s        LEFT OUTER JOIN translation_pair tp
USING(source_id)       LEFT OUTER JOIN translation t ON tp.translation_id =  
 
t.translation_id        RIGHT OUTER JOIN language l on l.language_id = t.language_id;

This yields promising results: source_id | translation_pair_id | translation_id | language_id  
|         name
-----------+---------------------+----------------+------------- 
+----------------------           |                     |                |             |  
Russian           |                     |                |             |  
Danish           |                     |                |             | Dutch    159986 |             1893187 |
1743833|           4 |  
 
English           |                     |                |             |  
Finnish           |                     |                |             |  
French           |                     |                |             |  
German           |                     |                |             |  
Italian           |                     |                |             |  
Japanese           |                     |                |             |  
Korean           |                     |                |             |  
Norwegian           |                     |                |             |  
Simplified Chinese           |                     |                |             |  
Spanish           |                     |                |             |  
Swedish           |                     |                |             |  
Traditional Chinese           |                     |                |             |  
Portuguese           |                     |                |             |  
Polish           |                     |                |             |  
Turkish           |                     |                |             | Czech           |                     |
       |             |  
 
Brazilian Portuguese
(20 rows)

However, when I try to exclude the one not-null row, doing this:  SELECT  
s.source_id,tp.translation_pair_id,t.translation_id,t.language_id,  
l.name    FROM (select * FROM source s WHERE source_id = 159986) AS s        LEFT OUTER JOIN translation_pair tp
USING(source_id)       LEFT OUTER JOIN translation t ON tp.translation_id =  
 
t.translation_id AND t.translation_id IS NULL        RIGHT OUTER JOIN language l on l.language_id = t.language_id;

I expect 19 rows, but instead get 20 rows, all containing null values.
source_id | translation_pair_id | translation_id | language_id  
|         name
-----------+---------------------+----------------+------------- 
+----------------------           |                     |                |             |  
Russian           |                     |                |             |  
Danish           |                     |                |             | Dutch           |                     |
      |             |  
 
English           |                     |                |             |  
Finnish           |                     |                |             |  
French           |                     |                |             |  
German           |                     |                |             |  
Italian           |                     |                |             |  
Japanese           |                     |                |             |  
Korean           |                     |                |             |  
Norwegian           |                     |                |             |  
Simplified Chinese           |                     |                |             |  
Spanish           |                     |                |             |  
Swedish           |                     |                |             |  
Traditional Chinese           |                     |                |             |  
Portuguese           |                     |                |             |  
Polish           |                     |                |             |  
Turkish           |                     |                |             | Czech           |                     |
       |             |  
 
Brazilian Portuguese
(20 rows)


I'm guessing I need to group the joins together, to avoid some  
associative problem.

Do you see what I'm doing wrong?

Thanks for the help,

Drew


Re: How to find missing values across multiple OUTER JOINs

From
Ragnar
Date:
On þri, 2007-06-05 at 23:55 -0700, Drew wrote:
> I'm having troubles using multiple OUTER JOINs, which I think I want  
> to use to solve my problem.
> 
> My problem is to find all non-translated string values in our  
> translations database, given the 4 following tables:
> 
> SOURCE (source_id PRIMARY KEY, language_id, value, ...)
> TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...)
> TRANSLATION_PAIR (source_id, translation_id)
> LANGUAGE(language_id PRIMARY KEY, name)
> 
> This seems to me the appropriate situation for using OUTER JOINs, but  
> I cannot figure out how to get the null rows without the not-null rows.
> 
> Here's my best guess at this query:
> SELECT  
> s.source_id,tp.translation_pair_id,t.translation_id,t.language_id,  
> l.name                                                                   
>                                                                          
>                                                      FROM source s
> LEFT OUTER JOIN translation_pair tp USING(source_id)
> LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id  
> AND t.translation_id is null
move this condition out of the ON clause into a WHERE clause
> )
> RIGHT OUTER JOIN language l on l.language_id = t.language_id;


SELECT s.source_id,      tp.translation_pair_id,      t.translation_id,      t.language_id,      l.name
FROM source s    LEFT OUTER JOIN translation_pair tp USING(source_id)    LEFT OUTER JOIN translation t ON
tp.translation_id= t.translation_id    RIGHT OUTER JOIN language l         on l.language_id =t.language_id
 
WHERE t.translation_id is null;


(i did not check the rest of your query)

hope this helps,
gnari




Re: How to find missing values across multiple OUTER JOINs

From
Richard Huxton
Date:
Drew wrote:

> This seems to me the appropriate situation for using OUTER JOINs, but I 
> cannot figure out how to get the null rows without the not-null rows.

> However, when I try to exclude the one not-null row, doing this:
>   SELECT 
> s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, l.name
>     FROM (select * FROM source s WHERE source_id = 159986) AS s
>         LEFT OUTER JOIN translation_pair tp USING(source_id)
>         LEFT OUTER JOIN translation t ON tp.translation_id = 
> t.translation_id AND t.translation_id IS NULL
>         RIGHT OUTER JOIN language l on l.language_id = t.language_id;

Ah, you need to put the "IS NULL" outside the join.

SELECT
...
WHERE  s.source_id IS NULL OR tp.translation_pair_id IS NULL OR ...
--   Richard Huxton  Archonet Ltd


Re: How to find missing values across multiple OUTER JOINs

From
Drew
Date:
Thanks! That was it.

Drew

On Jun 6, 2007, at 1:45 AM, Ragnar wrote:

> On þri, 2007-06-05 at 23:55 -0700, Drew wrote:
>> I'm having troubles using multiple OUTER JOINs, which I think I want
>> to use to solve my problem.
>>
>> My problem is to find all non-translated string values in our
>> translations database, given the 4 following tables:
>>
>> SOURCE (source_id PRIMARY KEY, language_id, value, ...)
>> TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...)
>> TRANSLATION_PAIR (source_id, translation_id)
>> LANGUAGE(language_id PRIMARY KEY, name)
>>
>> This seems to me the appropriate situation for using OUTER JOINs, but
>> I cannot figure out how to get the null rows without the not-null
>> rows.
>>
>> Here's my best guess at this query:
>> SELECT
>> s.source_id,tp.translation_pair_id,t.translation_id,t.language_id,
>> l.name
>>
>>                                                     FROM source s
>> LEFT OUTER JOIN translation_pair tp USING(source_id)
>> LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id
>> AND t.translation_id is null
> move this condition out of the ON clause into a WHERE clause
>> )
>> RIGHT OUTER JOIN language l on l.language_id = t.language_id;
>
>
> SELECT s.source_id,
>       tp.translation_pair_id,
>       t.translation_id,
>       t.language_id,
>       l.name
> FROM source s
>     LEFT OUTER JOIN translation_pair tp USING(source_id)
>     LEFT OUTER JOIN translation t ON tp.translation_id =
> t.translation_id
>     RIGHT OUTER JOIN language l
>         on l.language_id =t.language_id
> WHERE t.translation_id is null;
>
>
> (i did not check the rest of your query)
>
> hope this helps,
> gnari
>
>