Thread: Help with strange join
God morning, everybody! I have a problem that I don't seem to be able to solve by my self, thats why I kindly ask the list now… I have a database containing some tables, containing different receipts. Every receipt have an unknown number of ingredients linked to it, and every ingredients name is in a separate table sincce it needs to be translated to different languages. The tables are link according to the following: receipts <- related_ingredients <- ingredients <- languages On a listpage I want to list (ten and ten) all receipts that has _all the ingredients translated_. My problem is how to find out that every ingredient is translated. If I just do JOINs, I will not be able to find out if only one or all of the ingredients are translated. What I need is something that, for example, returns the receiptnumber if, and only if, the number of translated ingredients that is returned (after joining ingredients with languages) is the same as the number of entrys in the table "related_ingredients". Since I'm really close to deadline on this project and standing still for the moment, I really would appreciate some help right now... Best regards, Victor Spång Arthursson
On Fri, 4 Feb 2005 09:44:15 +0100, Victor Spång Arthursson <victor@tosti.dk> wrote: > God morning, everybody! > > I have a problem that I don't seem to be able to solve by my self, > thats why I kindly ask the list now… > > I have a database containing some tables, containing different receipts. > > Every receipt have an unknown number of ingredients linked to it, and > every ingredients name is in a separate table sincce it needs to be > translated to different languages. > > The tables are link according to the following: > > receipts <- related_ingredients <- ingredients <- languages > > On a listpage I want to list (ten and ten) all receipts that has _all > the ingredients translated_. > > My problem is how to find out that every ingredient is translated. > > If I just do JOINs, I will not be able to find out if only one or all > of the ingredients are translated. What I need is something that, for > example, returns the receiptnumber if, and only if, the number of > translated ingredients that is returned (after joining ingredients with > languages) is the same as the number of entrys in the table > "related_ingredients". Can you send the table structure and the query that does this? It may just be a matter of adding a subselect with a HAVING clause, but we won't know until we have more information. -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org
Victor Spång Arthursson wrote: > The tables are link according to the following: > > receipts <- related_ingredients <- ingredients <- languages > If I just do JOINs, I will not be able to find out if only one or all of > the ingredients are translated. What I need is something that, for > example, returns the receiptnumber if, and only if, the number of > translated ingredients that is returned (after joining ingredients with > languages) is the same as the number of entrys in the table > "related_ingredients". Something like this could work (column names and values made up while going on): SELECT * FROM (receipts INNER JOIN related_ingredients ON (receipt_id)) r OUTER JOIN (ingredients INNER JOIN languages ON (ingredient_id)) i WHERE i.language_id = 1 AND i.name IS NOT NULL The idea is that the outer join allows receipts with untranslated ingredients to end up in the results as NULL values, so that you can check for them. I suppose a HAVING clause would also work, but AFAIK you need to GROUP BY to be able to do that... All of this is untested of course, so I may have made mistakes. -- Alban Hertroys MAG Productions T: +31(0)53 4346874 F: +31(0)53 4346876 E: alban@magproductions.nl W: http://www.magproductions.nl
2005-02-04 kl. 13.00 skrev Mike Rylander: > Can you send the table structure and the query that does this? It may > just be a matter of adding a subselect with a HAVING clause, but we > won't know until we have more information. Sure - coming up! First table is the main receipt table: tostipippitest=# select nummer, opskriftsnavn as receiptname from opskrifter as receipts; nummer | receiptname --------+-------------------- 12345 | 882120001093591418 121222 | 534886001105088842 33233 | 217710001096269634 (3 rows) The id in this table is to be found in the related_ingredients-table: tostipippitest=# select id, ingrediens, maengde as amount, opskrift from opskrifter_content as related_ingredients where opskrift = 12345; id | ingrediens | amount | opskrift ----+------------+--------+-------- 8 | i21 | 2 | 12345 9 | i18 | 7 | 12345 11 | i24 | | 12345 4 | i17 | 2 | 12345 3 | i14 | 1 | 12345 (5 rows) Then there is the ingredients-table that the above relates to: tostipippitest=# select distinct betegnelse, opskrifter_ingredienser.id from opskrifter_content join opskrifter_ingredienser on ingrediens = opskrifter_ingredienser.id where opskrifter_content.opskrift = '12345'; betegnelse | id --------------------+----- 33794001087300778 | i24 135860001084976781 | i14 366841001086602763 | i17 377948001087300210 | i21 722896001087299185 | i18 (5 rows) And last there is the language-table: tostipippitest=# select * from languages limit 10 offset 0; relid | text | lang --------------------+--------------------------------------------- +------ 541388001083147128 | Lagervare Indasia + ID Andet 0 | DK 542973001083147128 | Specialbl. lagervare ID Pulver 100 | DK 544538001083147128 | Specialbl. lagervare ID Flydende 500 | DK 546152001083147128 | Specialbl. lagervare ID Andet 0 | DK 547733001083147128 | Specialbl. til SM lagervare ID Pulver 100 | DK 549545001083147128 | Specialbl. til SM lagervare ID Flydende 500 | DE 551072001083147128 | Specialbl. til SM lagervare ID Andet 0 | DK 552622001083147128 | Specialbl.kunder - ej lager ID Pulver 100 | DK 554156001083147128 | Specialbl.kunder - ej lager ID Flydende 500 | DK 555959001083147128 | Specialbl.kunder - ej lager ID Andet 0 | DK (10 rows) As you can see, there is for exampe only one entry with german language, "DE". When joining the ingredients on this table, the result can be max one entry. That is a result, but since it is less than the number of entrys in the table related_ingredients, I want all of the query to fail, thus not returning the actual receipt as a receipt totally translated in my dummy-SQL looking something like "SELECT * from reciepts JOIN (this strange sql that returns only the numbers of the receipts that are totally translated) on receiptnumber = othertable.receipt"… Thanks in advance, Victor Spång Arthursson Denmark / Sweden
On Fri, 4 Feb 2005 17:52:45 +0100, Victor Spång Arthursson <victor@tosti.dk> wrote: > > 2005-02-04 kl. 13.00 skrev Mike Rylander: > > > Can you send the table structure and the query that does this? It may > > just be a matter of adding a subselect with a HAVING clause, but we > > won't know until we have more information. > > Sure - coming up! > > First table is the main receipt table: > > tostipippitest=# select nummer, opskriftsnavn as receiptname from > opskrifter as receipts; > nummer | receiptname > --------+-------------------- > 12345 | 882120001093591418 > 121222 | 534886001105088842 > 33233 | 217710001096269634 > (3 rows) > > The id in this table is to be found in the related_ingredients-table: > > tostipippitest=# select id, ingrediens, maengde as amount, opskrift > from opskrifter_content as related_ingredients where opskrift = 12345; > id | ingrediens | amount | opskrift > ----+------------+--------+-------- > 8 | i21 | 2 | 12345 > 9 | i18 | 7 | 12345 > 11 | i24 | | 12345 > 4 | i17 | 2 | 12345 > 3 | i14 | 1 | 12345 > (5 rows) > > Then there is the ingredients-table that the above relates to: > > tostipippitest=# select distinct betegnelse, opskrifter_ingredienser.id > from opskrifter_content join opskrifter_ingredienser on ingrediens = > opskrifter_ingredienser.id where opskrifter_content.opskrift = '12345'; > betegnelse | id > --------------------+----- > 33794001087300778 | i24 > 135860001084976781 | i14 > 366841001086602763 | i17 > 377948001087300210 | i21 > 722896001087299185 | i18 > (5 rows) > > And last there is the language-table: > > tostipippitest=# select * from languages limit 10 offset 0; > relid | text | lang > --------------------+--------------------------------------------- > +------ > 541388001083147128 | Lagervare Indasia + ID Andet 0 | DK > 542973001083147128 | Specialbl. lagervare ID Pulver 100 | DK > 544538001083147128 | Specialbl. lagervare ID Flydende 500 | DK > 546152001083147128 | Specialbl. lagervare ID Andet 0 | DK > 547733001083147128 | Specialbl. til SM lagervare ID Pulver 100 | DK > 549545001083147128 | Specialbl. til SM lagervare ID Flydende 500 | DE > 551072001083147128 | Specialbl. til SM lagervare ID Andet 0 | DK > 552622001083147128 | Specialbl.kunder - ej lager ID Pulver 100 | DK > 554156001083147128 | Specialbl.kunder - ej lager ID Flydende 500 | DK > 555959001083147128 | Specialbl.kunder - ej lager ID Andet 0 | DK > (10 rows) > > As you can see, there is for exampe only one entry with german > language, "DE". When joining the ingredients on this table, the result > can be max one entry. That is a result, but since it is less than the > number of entrys in the table related_ingredients, I want all of the > query to fail, thus not returning the actual receipt as a receipt > totally translated in my dummy-SQL looking something like "SELECT * > from reciepts JOIN (this strange sql that returns only the numbers of > the receipts that are totally translated) on receiptnumber = > othertable.receipt"… How about: SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) WHERE EXISTS ( SELECT l.relid FROM languages l WHERE l.relid = i.betegnelse GROUP BY l.relid HAVING COUNT(DISINCT l.lang) = 1); I tested the subselect but not the entire thing. You could turn that subselect into a join instead of an EXISTS, though I'm not sure which will be faster: SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING COUNT(DISINCT l.lang) = 1) s ON (s.relid = i.betegnelse); Hope that helps! -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org
2005-02-04 kl. 20.36 skrev Mike Rylander: > How about: > > SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = > c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) > WHERE EXISTS ( SELECT l.relid FROM languages l WHERE l.relid = > i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1); Hi and thanks for your answer! Could you please try to break this SQL down and try to explain what it does, and also show how to change language… Sincerely Victor Spång Arthursson
On Mon, 7 Feb 2005 12:34:39 +0100, Victor Spång Arthursson <victor@tosti.dk> wrote: > > 2005-02-04 kl. 20.36 skrev Mike Rylander: > > > How about: > > > > SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = > > c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) > > WHERE EXISTS ( SELECT l.relid FROM languages l WHERE l.relid = > > i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1); > > Hi and thanks for your answer! No problem. Did it do what you expect? > > Could you please try to break this SQL down and try to explain what it > does, and also show how to change language… > Sure. I guess that the EXISTS version was faster? In any case: SELECT r.* -- you can add more columns here -- from any table but languages -- I assume the joins are correct and transparent. I just -- attempted to follow your example. I wasn't sure about -- the join from ingredients to languages since the example -- didn't have any exact matches, but the format of the relid -- column looked right. FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) WHERE EXISTS ( -- EXISTS runs a subselect and if ANY rows are returned -- it becomes true http://www.postgresql.org/docs/8.0/interactive/functions-subquery.html#AEN12496 -- This query, if run by itself, will give you all the "relid"s from -- "languages" where ALL the rows in the group have the same -- "lang" attribute. The HAVING clause is the key. It looks at -- each group of rows with the same relid as a virtual subtable -- and counts the number of distinct "lang" columns. If I -- understood your question correctly, any partially translated -- ingredient lists will have more than one distinct "lang". In -- order to select a specific language you would need to -- wrap this in another because of GROUP BY restrictions, and -- use an aggregate to return the "lang" attribute. SELECT lang FROM ( SELECT l.relid, MAX(lang) as lang FROM languages l WHERE l.relid = i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1 ) s2 WHERE l.lang = 'DE' ); To generalize this, just replace WHERE l.lang = 'DE' with WHERE l.lang = ? and pass the language in as a parameter to the query. The JOIN version would be much easier to limit. BTW, there was a typo in the first post in the JOIN version. It's fixed below. Just change it from SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse); to SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer = c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id) JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse AND l.lang = 'DE'); I am curious as to which is faster for you. If "languages" is small I would expect the EXISTS version to be faster, but for a large "languages" the JOIN should be faster since it only generates the subselect result set once. -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org