Thread: How to find missing values across multiple OUTER JOINs
I'm having troubles using multiple OUTER JOINs, which I think I want to use to solve my problem. My problem is to find all non-translated string values in our translations database, given the 4 following tables: SOURCE (source_id PRIMARY KEY, language_id, value, ...) TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...) TRANSLATION_PAIR (source_id, translation_id) LANGUAGE(language_id PRIMARY KEY, name) This seems to me the appropriate situation for using OUTER JOINs, but I cannot figure out how to get the null rows without the not-null rows. Here's my best guess at this query: SELECT s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, l.name FROM source s LEFT OUTER JOIN translation_pair tp USING(source_id) LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id AND t.translation_id is null) RIGHT OUTER JOIN language l on l.language_id = t.language_id; To test this query, I have a string that only has a translation in English and used it in this test query. SELECT s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, l.name FROM (select * FROM source s WHERE source_id = 159986) as s LEFT OUTER JOIN translation_pair tp USING(source_id) LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id RIGHT OUTER JOIN language l on l.language_id = t.language_id; This yields promising results: source_id | translation_pair_id | translation_id | language_id | name -----------+---------------------+----------------+------------- +---------------------- | | | | Russian | | | | Danish | | | | Dutch 159986 | 1893187 | 1743833| 4 | English | | | | Finnish | | | | French | | | | German | | | | Italian | | | | Japanese | | | | Korean | | | | Norwegian | | | | Simplified Chinese | | | | Spanish | | | | Swedish | | | | Traditional Chinese | | | | Portuguese | | | | Polish | | | | Turkish | | | | Czech | | | | Brazilian Portuguese (20 rows) However, when I try to exclude the one not-null row, doing this: SELECT s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, l.name FROM (select * FROM source s WHERE source_id = 159986) AS s LEFT OUTER JOIN translation_pair tp USING(source_id) LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id AND t.translation_id IS NULL RIGHT OUTER JOIN language l on l.language_id = t.language_id; I expect 19 rows, but instead get 20 rows, all containing null values. source_id | translation_pair_id | translation_id | language_id | name -----------+---------------------+----------------+------------- +---------------------- | | | | Russian | | | | Danish | | | | Dutch | | | | English | | | | Finnish | | | | French | | | | German | | | | Italian | | | | Japanese | | | | Korean | | | | Norwegian | | | | Simplified Chinese | | | | Spanish | | | | Swedish | | | | Traditional Chinese | | | | Portuguese | | | | Polish | | | | Turkish | | | | Czech | | | | Brazilian Portuguese (20 rows) I'm guessing I need to group the joins together, to avoid some associative problem. Do you see what I'm doing wrong? Thanks for the help, Drew
On þri, 2007-06-05 at 23:55 -0700, Drew wrote: > I'm having troubles using multiple OUTER JOINs, which I think I want > to use to solve my problem. > > My problem is to find all non-translated string values in our > translations database, given the 4 following tables: > > SOURCE (source_id PRIMARY KEY, language_id, value, ...) > TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...) > TRANSLATION_PAIR (source_id, translation_id) > LANGUAGE(language_id PRIMARY KEY, name) > > This seems to me the appropriate situation for using OUTER JOINs, but > I cannot figure out how to get the null rows without the not-null rows. > > Here's my best guess at this query: > SELECT > s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, > l.name > > FROM source s > LEFT OUTER JOIN translation_pair tp USING(source_id) > LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id > AND t.translation_id is null move this condition out of the ON clause into a WHERE clause > ) > RIGHT OUTER JOIN language l on l.language_id = t.language_id; SELECT s.source_id, tp.translation_pair_id, t.translation_id, t.language_id, l.name FROM source s LEFT OUTER JOIN translation_pair tp USING(source_id) LEFT OUTER JOIN translation t ON tp.translation_id= t.translation_id RIGHT OUTER JOIN language l on l.language_id =t.language_id WHERE t.translation_id is null; (i did not check the rest of your query) hope this helps, gnari
Drew wrote: > This seems to me the appropriate situation for using OUTER JOINs, but I > cannot figure out how to get the null rows without the not-null rows. > However, when I try to exclude the one not-null row, doing this: > SELECT > s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, l.name > FROM (select * FROM source s WHERE source_id = 159986) AS s > LEFT OUTER JOIN translation_pair tp USING(source_id) > LEFT OUTER JOIN translation t ON tp.translation_id = > t.translation_id AND t.translation_id IS NULL > RIGHT OUTER JOIN language l on l.language_id = t.language_id; Ah, you need to put the "IS NULL" outside the join. SELECT ... WHERE s.source_id IS NULL OR tp.translation_pair_id IS NULL OR ... -- Richard Huxton Archonet Ltd
Thanks! That was it. Drew On Jun 6, 2007, at 1:45 AM, Ragnar wrote: > On þri, 2007-06-05 at 23:55 -0700, Drew wrote: >> I'm having troubles using multiple OUTER JOINs, which I think I want >> to use to solve my problem. >> >> My problem is to find all non-translated string values in our >> translations database, given the 4 following tables: >> >> SOURCE (source_id PRIMARY KEY, language_id, value, ...) >> TRANSLATION (translation_id PRIMARY KEY, language_id, value, ...) >> TRANSLATION_PAIR (source_id, translation_id) >> LANGUAGE(language_id PRIMARY KEY, name) >> >> This seems to me the appropriate situation for using OUTER JOINs, but >> I cannot figure out how to get the null rows without the not-null >> rows. >> >> Here's my best guess at this query: >> SELECT >> s.source_id,tp.translation_pair_id,t.translation_id,t.language_id, >> l.name >> >> FROM source s >> LEFT OUTER JOIN translation_pair tp USING(source_id) >> LEFT OUTER JOIN translation t ON tp.translation_id = t.translation_id >> AND t.translation_id is null > move this condition out of the ON clause into a WHERE clause >> ) >> RIGHT OUTER JOIN language l on l.language_id = t.language_id; > > > SELECT s.source_id, > tp.translation_pair_id, > t.translation_id, > t.language_id, > l.name > FROM source s > LEFT OUTER JOIN translation_pair tp USING(source_id) > LEFT OUTER JOIN translation t ON tp.translation_id = > t.translation_id > RIGHT OUTER JOIN language l > on l.language_id =t.language_id > WHERE t.translation_id is null; > > > (i did not check the rest of your query) > > hope this helps, > gnari > >