Thread: Re: table has many to many relationship with itself - how

Re: table has many to many relationship with itself - how

From
SCassidy@overlandstorage.com
Date:
Starting with this:

create sequence languages_seq increment by 1;
create table languages (
  id integer primary key default nextval('languages_seq'),
  language_name varchar(100)
);
insert into languages (id, language_name) values (1, 'English');
insert into languages (id, language_name) values (2, 'French');
insert into languages (id, language_name) values (3, 'Spanish');
insert into languages (id, language_name) values (4, 'Italian');

create table phrases(
 id serial primary key,
 language integer references languages(id),
 content text
);
insert into phrases (language, content) values (1, 'the book');
insert into phrases (language, content) values (2, 'le livre');
insert into phrases (language, content) values (3, 'el libro');
insert into phrases (language, content) values (4, 'il libro');
insert into phrases (language, content) values (1, 'the room');
insert into phrases (language, content) values (4, 'la stanza');
insert into phrases (language, content) values (4, 'la camera');


For your translations table, I would go with something like this:


create sequence translations_seq increment by 1;
create table translations (
 translation_id integer primary key default nextval('translations_seq'),
 lang1_id integer references phrases(id),
 lang2_id integer references phrases(id)
);


(I like specifying my own sequence names, instead of using "serial", plus
using a default this way lets me insert an integer directly, when
necessary, or letting it default, but you can use serial, if you want).

That lets you insert rows for multiple to/from pairs.  Also, some words
have multiple meanings, or more than one word has the same meaning.  For
example, the English word "room" can be either "camera" or "stanza" in
Italian.

testdb1=> select * from phrases;
 id | language |  content
----+----------+-----------
  1 |        1 | the book
  2 |        2 | le livre
  3 |        3 | el libro
  4 |        4 | il libro
  5 |        1 | the room
  6 |        4 | la stanza
  7 |        4 | la camera
(7 rows)

testdb1=> insert into translations (lang1_id, lang2_id) values (1, 2);
INSERT 666949 1
testdb1=> insert into translations (lang1_id, lang2_id) values (1, 3);
INSERT 666950 1
testdb1=> insert into translations (lang1_id, lang2_id) values (1, 4);
INSERT 666953 1
testdb1=> insert into translations (lang1_id, lang2_id) values (5, 6);
INSERT 666954 1
testdb1=> insert into translations (lang1_id, lang2_id) values (5, 7);
INSERT 666955 1

Then, you can do this:
select p.content from phrases p where p.id in (select lang2_id from
translations where lang1_id = 5);
  content
-----------
 la stanza
 la camera
(2 rows)

I assume that this is a fairly simple "phrasebook" type of data set.
Partly, the structure depends on how you intend to access the data after
you build it.

Just an idea.

Susan



                  
                           "Daniel McBrearty"
                  
                      <danielmcbrearty@gmail.        To:       pgsql-general@postgresql.org
                  
                      com>                           cc:
                  
                           Sent by:                  Subject:  [GENERAL] table has many to many relationship with
itself- how to          
                                                      implement?
                  

                  
                      pgsql-general-owner@pos         |-------------------|
                  
                      tgresql.org                     | [ ] Expand Groups |
                  
                                                      |-------------------|
                  

                  
                           06/14/2006 01:53
                  
                      AM
                  

                  

                  




Hi all,

I have a table "phrases" that looks like this :

create table phrases(
 id serial ,
 language integer references langauges(id),
 content text
);


Simply a word or phrase in some language.

Now I want to express the concept of a "translation". A translation is
a number of phrases from different languages that are a translation of
each other. There is nothing else to say about a translation - though
it does need to be referencable by other tables, so it needs an ID.

One way to do this is with these two tables:

create table translations (
 id serial primary key
);

create table translations_to_phrases (
 translation_id integer references translations(id),
 phrase_id integer references phrases(id),
 primary key (translation_id, phrase_id)
);

Now this actually works as a data structure; the translations table is
a bit odd, having only an id, but that is all we really need.

Can I do this though? can I create a row in translations?

insert into table translations ... insert what?

The other way to do this that I see is to lose the link table
translations_to_phrases, and then make translations

create table translations (
 id serial primary key,
 phrases integer[]
);


but it seems that I can no longer make postgre aware that the integers
in translations(phrases) are references.

What is the best solution?

Thanks

Daniel

--
Daniel McBrearty
email : danielmcbrearty at gmail.com
www.engoi.com : the multi - language vocab trainer
BTW : 0873928131

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match





----------------------------------------------------------------------------------------------
Simply protected storage solutions ensure that your information is
automatically safe, readily available and always there, visit us at http://www.overlandstorage.com
----------------------------------------------------------------------------------------------


Re: table has many to many relationship with itself - how

From
"Daniel McBrearty"
Date:
thanks Susan for your idea.

I thought that was it for a moment, then I saw a problem (I think) -
it works, but gets quite inefficient.

when you have 2 phrases which are a translation, that is just one entry

when you have 3, that is 3
4 => 6

and so on.

In practice we might have 15, 20 languages in a translation. It is
unlimited many-to-many.

So the problem is a bit to do this while keeping the solution efficient.

I am almost sold on just using

create table translaton (
  id serial,
  phrases integer[]
);

but afaik there is no way to tell pg that the array contains refs to
another table. I could still live with this though, if noone has a
better way.

regards

Daniel


--
Daniel McBrearty
email : danielmcbrearty at gmail.com
www.engoi.com : the multi - language vocab trainer
BTW : 0873928131

Re: table has many to many relationship with itself - how

From
"John D. Burger"
Date:
SCassidy@overlandstorage.com wrote:

> Starting with this:
>
> create sequence languages_seq increment by 1;
> create table languages (
>   id integer primary key default nextval('languages_seq'),
>   language_name varchar(100)
> );

> (I like specifying my own sequence names, instead of using "serial",
> plus
> using a default this way lets me insert an integer directly, when
> necessary, or letting it default, but you can use serial, if you want).

You can always insert your own value into a SERIAL column.  From the
8.1 docs:

> The data types serial and bigserial are not true types, but merely a
> notational convenience for setting up unique identifier columns
> (similar to the AUTO_INCREMENT property supported by some other
> databases). In the current implementation, specifying
>
> CREATE TABLE tablename (
>     colname SERIAL
> );
>
> is equivalent to specifying:
>
> CREATE SEQUENCE tablename_colname_seq;
> CREATE TABLE tablename (
>     colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
> );

(http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-
SERIAL)

Your example above differs only in the sequence name (plus you have a
PK constraint, but you can do this on a SERIAL column too).

- John D. Burger
   MITRE


Re: table has many to many relationship with itself - how

From
Bruno Wolff III
Date:
On Wed, Jun 14, 2006 at 13:51:50 -0700,
  SCassidy@overlandstorage.com wrote:
> Starting with this:
>
> create sequence languages_seq increment by 1;
> create table languages (
>   id integer primary key default nextval('languages_seq'),
>   language_name varchar(100)
> );
> insert into languages (id, language_name) values (1, 'English');
> insert into languages (id, language_name) values (2, 'French');
> insert into languages (id, language_name) values (3, 'Spanish');
> insert into languages (id, language_name) values (4, 'Italian');
>
> create table phrases(
>  id serial primary key,
>  language integer references languages(id),
>  content text
> );
> insert into phrases (language, content) values (1, 'the book');
> insert into phrases (language, content) values (2, 'le livre');
> insert into phrases (language, content) values (3, 'el libro');
> insert into phrases (language, content) values (4, 'il libro');
> insert into phrases (language, content) values (1, 'the room');
> insert into phrases (language, content) values (4, 'la stanza');
> insert into phrases (language, content) values (4, 'la camera');
>
>
> For your translations table, I would go with something like this:
>
>
> create sequence translations_seq increment by 1;
> create table translations (
>  translation_id integer primary key default nextval('translations_seq'),
>  lang1_id integer references phrases(id),
>  lang2_id integer references phrases(id)
> );

I think you are better off putting the equivalence information in the phrases
table. (This assumes that treating translations of a phrase into various
languages forms an equivalence class.) Under this model each phrase will
be in exactly one equivalence class, so that adding an equivalence class
column to the phrase table seems like a good solution.