Thread: Help with strange join

Help with strange join

From
Victor Spång Arthursson
Date:
God morning, everybody!

I have a problem that I don't seem to be able to solve by my self,
thats why I kindly ask the list now…

I have a database containing some tables, containing different receipts.

Every receipt have an unknown number of ingredients linked to it, and
every ingredients name is in a separate table sincce it needs to be
translated to different languages.

The tables are link according to the following:

receipts <- related_ingredients <- ingredients <- languages

On a listpage I want to list (ten and ten) all receipts that has _all
the ingredients translated_.

My problem is how to find out that every ingredient is translated.

If I just do JOINs, I will not be able to find out if only one or all
of the ingredients are translated. What I need is something that, for
example, returns the receiptnumber if, and only if, the number of
translated ingredients that is returned (after joining ingredients with
languages) is the same as the number of entrys in the table
"related_ingredients".

Since I'm really close to deadline on this project and standing still
for the moment, I really would appreciate some help right now...

Best regards,

Victor Spång Arthursson

Re: Help with strange join

From
Mike Rylander
Date:
On Fri, 4 Feb 2005 09:44:15 +0100, Victor Spång Arthursson
<victor@tosti.dk> wrote:
> God morning, everybody!
>
> I have a problem that I don't seem to be able to solve by my self,
> thats why I kindly ask the list now…
>
> I have a database containing some tables, containing different receipts.
>
> Every receipt have an unknown number of ingredients linked to it, and
> every ingredients name is in a separate table sincce it needs to be
> translated to different languages.
>
> The tables are link according to the following:
>
> receipts <- related_ingredients <- ingredients <- languages
>
> On a listpage I want to list (ten and ten) all receipts that has _all
> the ingredients translated_.
>
> My problem is how to find out that every ingredient is translated.
>
> If I just do JOINs, I will not be able to find out if only one or all
> of the ingredients are translated. What I need is something that, for
> example, returns the receiptnumber if, and only if, the number of
> translated ingredients that is returned (after joining ingredients with
> languages) is the same as the number of entrys in the table
> "related_ingredients".

Can you send the table structure and the query that does this?  It may
just be a matter of adding a subselect with a HAVING clause, but we
won't know until we have more information.

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org

Re: Help with strange join

From
Alban Hertroys
Date:
Victor Spång Arthursson wrote:
> The tables are link according to the following:
>
> receipts <- related_ingredients <- ingredients <- languages

> If I just do JOINs, I will not be able to find out if only one or all of
> the ingredients are translated. What I need is something that, for
> example, returns the receiptnumber if, and only if, the number of
> translated ingredients that is returned (after joining ingredients with
> languages) is the same as the number of entrys in the table
> "related_ingredients".

Something like this could work (column names and values made up while
going on):

SELECT *
FROM (receipts INNER JOIN related_ingredients ON (receipt_id)) r
OUTER JOIN (ingredients INNER JOIN languages ON (ingredient_id)) i
WHERE i.language_id = 1
AND i.name IS NOT NULL

The idea is that the outer join allows receipts with untranslated
ingredients to end up in the results as NULL values, so that you can
check for them.

I suppose a HAVING clause would also work, but AFAIK you need to GROUP
BY to be able to do that...

All of this is untested of course, so I may have made mistakes.

--
Alban Hertroys
MAG Productions

T: +31(0)53 4346874
F: +31(0)53 4346876
E: alban@magproductions.nl
W: http://www.magproductions.nl


Re: Help with strange join

From
Victor Spång Arthursson
Date:
2005-02-04 kl. 13.00 skrev Mike Rylander:

> Can you send the table structure and the query that does this?  It may
> just be a matter of adding a subselect with a HAVING clause, but we
> won't know until we have more information.

Sure - coming up!

First table is the main receipt table:

tostipippitest=# select nummer, opskriftsnavn as receiptname from
opskrifter as receipts;
  nummer |    receiptname
--------+--------------------
  12345  | 882120001093591418
  121222 | 534886001105088842
  33233  | 217710001096269634
(3 rows)

The id in this table is to be found in the related_ingredients-table:

tostipippitest=# select id, ingrediens, maengde as amount, opskrift
from opskrifter_content as related_ingredients where opskrift = 12345;
  id | ingrediens | amount | opskrift
----+------------+--------+--------
   8 | i21        |      2 | 12345
   9 | i18        |      7 | 12345
  11 | i24        |        | 12345
   4 | i17        |      2 | 12345
   3 | i14        |      1 | 12345
(5 rows)

Then there is the ingredients-table that the above relates to:

tostipippitest=# select distinct betegnelse, opskrifter_ingredienser.id
from opskrifter_content join opskrifter_ingredienser on ingrediens =
opskrifter_ingredienser.id where opskrifter_content.opskrift = '12345';
      betegnelse     | id
--------------------+-----
   33794001087300778 | i24
  135860001084976781 | i14
  366841001086602763 | i17
  377948001087300210 | i21
  722896001087299185 | i18
(5 rows)

And last there is the language-table:

tostipippitest=# select * from languages limit 10 offset 0;
        relid        |                    text                     | lang
--------------------+---------------------------------------------
+------
  541388001083147128 | Lagervare Indasia + ID Andet 0              | DK
  542973001083147128 | Specialbl. lagervare ID Pulver 100          | DK
  544538001083147128 | Specialbl. lagervare ID Flydende 500        | DK
  546152001083147128 | Specialbl. lagervare ID Andet 0             | DK
  547733001083147128 | Specialbl. til SM lagervare ID Pulver 100   | DK
  549545001083147128 | Specialbl. til SM lagervare ID Flydende 500 | DE
  551072001083147128 | Specialbl. til SM lagervare ID Andet 0      | DK
  552622001083147128 | Specialbl.kunder - ej lager ID Pulver 100   | DK
  554156001083147128 | Specialbl.kunder - ej lager ID Flydende 500 | DK
  555959001083147128 | Specialbl.kunder - ej lager ID Andet 0      | DK
(10 rows)

As you can see, there is for exampe only one entry with german
language, "DE". When joining the ingredients on this table, the result
can be max one entry. That is a result, but since it is less than the
number of entrys in the table related_ingredients, I want all of the
query to fail, thus not returning the actual receipt as a receipt
totally translated in my dummy-SQL looking something like "SELECT *
from reciepts JOIN (this strange sql that returns only the numbers of
the receipts that are totally translated) on receiptnumber =
othertable.receipt"…

Thanks in advance,

Victor Spång Arthursson
Denmark / Sweden


Re: Help with strange join

From
Mike Rylander
Date:
On Fri, 4 Feb 2005 17:52:45 +0100, Victor Spång Arthursson
<victor@tosti.dk> wrote:
>
> 2005-02-04 kl. 13.00 skrev Mike Rylander:
>
> > Can you send the table structure and the query that does this?  It may
> > just be a matter of adding a subselect with a HAVING clause, but we
> > won't know until we have more information.
>
> Sure - coming up!
>
> First table is the main receipt table:
>
> tostipippitest=# select nummer, opskriftsnavn as receiptname from
> opskrifter as receipts;
>   nummer |    receiptname
> --------+--------------------
>   12345  | 882120001093591418
>   121222 | 534886001105088842
>   33233  | 217710001096269634
> (3 rows)
>
> The id in this table is to be found in the related_ingredients-table:
>
> tostipippitest=# select id, ingrediens, maengde as amount, opskrift
> from opskrifter_content as related_ingredients where opskrift = 12345;
>   id | ingrediens | amount | opskrift
> ----+------------+--------+--------
>    8 | i21        |      2 | 12345
>    9 | i18        |      7 | 12345
>   11 | i24        |        | 12345
>    4 | i17        |      2 | 12345
>    3 | i14        |      1 | 12345
> (5 rows)
>
> Then there is the ingredients-table that the above relates to:
>
> tostipippitest=# select distinct betegnelse, opskrifter_ingredienser.id
> from opskrifter_content join opskrifter_ingredienser on ingrediens =
> opskrifter_ingredienser.id where opskrifter_content.opskrift = '12345';
>       betegnelse     | id
> --------------------+-----
>    33794001087300778 | i24
>   135860001084976781 | i14
>   366841001086602763 | i17
>   377948001087300210 | i21
>   722896001087299185 | i18
> (5 rows)
>
> And last there is the language-table:
>
> tostipippitest=# select * from languages limit 10 offset 0;
>         relid        |                    text                     | lang
> --------------------+---------------------------------------------
> +------
>   541388001083147128 | Lagervare Indasia + ID Andet 0              | DK
>   542973001083147128 | Specialbl. lagervare ID Pulver 100          | DK
>   544538001083147128 | Specialbl. lagervare ID Flydende 500        | DK
>   546152001083147128 | Specialbl. lagervare ID Andet 0             | DK
>   547733001083147128 | Specialbl. til SM lagervare ID Pulver 100   | DK
>   549545001083147128 | Specialbl. til SM lagervare ID Flydende 500 | DE
>   551072001083147128 | Specialbl. til SM lagervare ID Andet 0      | DK
>   552622001083147128 | Specialbl.kunder - ej lager ID Pulver 100   | DK
>   554156001083147128 | Specialbl.kunder - ej lager ID Flydende 500 | DK
>   555959001083147128 | Specialbl.kunder - ej lager ID Andet 0      | DK
> (10 rows)
>
> As you can see, there is for exampe only one entry with german
> language, "DE". When joining the ingredients on this table, the result
> can be max one entry. That is a result, but since it is less than the
> number of entrys in the table related_ingredients, I want all of the
> query to fail, thus not returning the actual receipt as a receipt
> totally translated in my dummy-SQL looking something like "SELECT *
> from reciepts JOIN (this strange sql that returns only the numbers of
> the receipts that are totally translated) on receiptnumber =
> othertable.receipt"…

How about:

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
WHERE EXISTS ( SELECT l.relid FROM languages l WHERE  l.relid =
i.betegnelse GROUP BY l.relid HAVING COUNT(DISINCT l.lang) = 1);

I tested the subselect but not the entire thing.  You could turn that
subselect into a join instead of an EXISTS, though I'm not sure which
will be faster:

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING
COUNT(DISINCT l.lang) = 1) s ON (s.relid = i.betegnelse);

Hope that helps!

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org

Re: Help with strange join

From
Victor Spång Arthursson
Date:
2005-02-04 kl. 20.36 skrev Mike Rylander:

> How about:
>
> SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
> c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
> WHERE EXISTS ( SELECT l.relid FROM languages l WHERE  l.relid =
> i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1);

Hi and thanks for your answer!

Could you please try to break this SQL down and try to explain what it
does, and also show how to change language…

Sincerely

Victor Spång Arthursson


Re: Help with strange join

From
Mike Rylander
Date:
On Mon, 7 Feb 2005 12:34:39 +0100, Victor Spång Arthursson
<victor@tosti.dk> wrote:
>
> 2005-02-04 kl. 20.36 skrev Mike Rylander:
>
> > How about:
> >
> > SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
> > c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
> > WHERE EXISTS ( SELECT l.relid FROM languages l WHERE  l.relid =
> > i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1);
>
> Hi and thanks for your answer!

No problem.  Did it do what you expect?

>
> Could you please try to break this SQL down and try to explain what it
> does, and also show how to change language…
>

Sure.  I guess that the EXISTS version was faster?  In any case:

SELECT r.*    -- you can add more columns here
                     -- from any table but languages

          -- I assume the joins are correct and transparent.  I just
          -- attempted to follow your example.  I wasn't sure about
          -- the join from ingredients to languages since the example
          -- didn't have any exact matches, but the format of the relid
          -- column looked right.
          FROM opskrifter r
                JOIN opskrifter_content c ON (r.nummer =c.opskrift)
                JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)

         WHERE
                EXISTS (  -- EXISTS runs a subselect and if ANY rows
are returned
                                -- it becomes true
http://www.postgresql.org/docs/8.0/interactive/functions-subquery.html#AEN12496

                                 -- This query, if run by itself, will
give you all the "relid"s from
                                 -- "languages" where ALL the rows in
the group have the same
                                 -- "lang" attribute.  The HAVING
clause is the key.  It looks at
                                 -- each group of rows with the same
relid as a virtual subtable
                                 -- and counts the number of distinct
"lang" columns.  If I
                                 -- understood your question
correctly, any partially translated
                                 -- ingredient lists will have more
than one distinct "lang".  In
                                 -- order to select a specific
language you would need to
                                 -- wrap this in another because of
GROUP BY restrictions, and
                                 -- use an aggregate to return the
"lang" attribute.

           SELECT lang FROM (
                                 SELECT l.relid, MAX(lang) as lang
FROM languages l
                                         WHERE  l.relid = i.betegnelse
                                         GROUP BY l.relid
                                         HAVING COUNT(DIStINCT l.lang) = 1
                               ) s2
                                WHERE l.lang = 'DE'
                 );

To generalize this, just replace
                                WHERE l.lang = 'DE'
with
                                WHERE l.lang = ?
and pass the language in as a parameter to the query.  The JOIN
version would be much easier to limit.  BTW, there was a typo in the
first post in the JOIN version.  It's fixed below.  Just change it
from

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING
COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse);

to

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING
COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse AND l.lang = 'DE');


I am curious as to which is faster for you.  If "languages" is small I
would expect the EXISTS version to be faster, but for a large
"languages" the JOIN should be faster since it only generates the
subselect result set once.

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org