Thread: advice on setting up schema sought

advice on setting up schema sought

From
"danmcb"
Date:
Hi

I have a website that has multilingual text stored in the database.
Currently I just have a flat table (lets called it "translations"), one
row per text item, one column per language. This works OK, for now, but
I am looking at a redesign. Mostly I want to keep information about the
languages in teh db as well, so that look like an extra table, one row
per lang.

The problem that now arises is that there is an expected correlation
between the "languages" and "translations" tables - there should be a
row in languages for each column of translations. AFAIK (could well be
wrong, I am no expert in db theory) there is no real way to express in
the ddl. Not ideal.

An alternative layout would now be to lose the "translations" table,
and have two tables in place; one called "base_text" containing the
text to be translated, and another called, say, "tx_text" which
contains the translations. Each row of "tx_text" references both
"base_text" and also "languages".

This looks like a nice layout, as there is an abstract rep of the
languages, and we lose the "translations" table which can get very
wide. It's nice that the schema doesn't actually change to add a new
language.

BUT there are certain invariants that need to be enforced. The main one
is this:

There must only be one row in "site_text" for any given language
referencing a given row of "base_text".

How can I enforce this? Also have to bear in mind that there COULD
easily be two translators working on the same language. It is
imperative that they are not able to simultaneously save a translation
of the same base text.

I would also like to have a column in "languages", type boolean, called
"is_base" - this says what the base language is. Here, only ONE row can
have a true value. (Obviously it has default value of false and is not
null).

Another invariant now comes in - the language referenced by every row
of "site_text" MUST have "is_base" set to FALSE.

If anyone can tell me how best to express to handle this stuff in
postgresql, I'd be grateful. Also general comments on whether this is a
good schema or not are welcome.

regards, and thanks in advance

Daniel


Re: advice on setting up schema sought

From
Date:
Hi (see below),
> Hi
>
> I have a website that has multilingual text stored
> in the database.
> Currently I just have a flat table (lets called it
> "translations"), one
> row per text item, one column per language. This
> works OK, for now, but
> I am looking at a redesign. Mostly I want to keep
> information about the
> languages in teh db as well, so that look like an
> extra table, one row
> per lang.
>
> The problem that now arises is that there is an
> expected correlation
> between the "languages" and "translations" tables -
> there should be a
> row in languages for each column of translations.
> AFAIK (could well be
> wrong, I am no expert in db theory) there is no real
> way to express in
> the ddl. Not ideal.
>
> An alternative layout would now be to lose the
> "translations" table,
> and have two tables in place; one called "base_text"
> containing the
> text to be translated, and another called, say,
> "tx_text" which
> contains the translations. Each row of "tx_text"
> references both
> "base_text" and also "languages".
>
> This looks like a nice layout, as there is an
> abstract rep of the
> languages, and we lose the "translations" table
> which can get very
> wide. It's nice that the schema doesn't actually
> change to add a new
> language.
>
> BUT there are certain invariants that need to be
> enforced. The main one
> is this:
>
> There must only be one row in "site_text" for any
> given language
> referencing a given row of "base_text".
>
> How can I enforce this? Also have to bear in mind
> that there COULD
> easily be two translators working on the same
> language. It is
> imperative that they are not able to simultaneously
> save a translation
> of the same base text.
>
> I would also like to have a column in "languages",
> type boolean, called
> "is_base" - this says what the base language is.
> Here, only ONE row can
> have a true value. (Obviously it has default value
> of false and is not
> null).
>
> Another invariant now comes in - the language
> referenced by every row
> of "site_text" MUST have "is_base" set to FALSE.
>
> If anyone can tell me how best to express to handle
> this stuff in
> postgresql, I'd be grateful. Also general comments
> on whether this is a
> good schema or not are welcome.
>
> regards, and thanks in advance
>
> Daniel

Daniel, i'm not sure i totally understand your
request.  from what i could make of it, though, would
this workd?

t_text
text_id (pkey)
language_id (fkey)
text

t_language
language_id (pkey)
language

the foreign key of the t_text table would reference
the the pkey (primary key) of the t_language.

every row of multi-lingual text would have an
associated language.

is this a solution that can help you or did i
misunderstand your question?

ps - you lost me at site_text and base_text.

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: advice on setting up schema sought

From
Bruno Wolff III
Date:
On Mon, Apr 10, 2006 at 22:30:47 -0700,
  operationsengineer1@yahoo.com wrote:
>
> Daniel, i'm not sure i totally understand your
> request.  from what i could make of it, though, would
> this workd?
>
> t_text
> text_id (pkey)
> language_id (fkey)
> text
>
> t_language
> language_id (pkey)
> language
>
> the foreign key of the t_text table would reference
> the the pkey (primary key) of the t_language.

In the translation table, you need to make the primary key a composite of
text_id and language_id so that only one unique combination of the two is
allowed.

Re: advice on setting up schema sought

From
Date:

--- Bruno Wolff III <bruno@wolff.to> wrote:

> On Mon, Apr 10, 2006 at 22:30:47 -0700,
>   operationsengineer1@yahoo.com wrote:
> >
> > Daniel, i'm not sure i totally understand your
> > request.  from what i could make of it, though,
> would
> > this workd?
> >
> > t_text
> > text_id (pkey)
> > language_id (fkey)
> > text
> >
> > t_language
> > language_id (pkey)
> > language
> >
> > the foreign key of the t_text table would
> reference
> > the the pkey (primary key) of the t_language.
>
> In the translation table, you need to make the
> primary key a composite of
> text_id and language_id so that only one unique
> combination of the two is
> allowed.

something like so...

t_base_text
text_id (pkey)
language_id (fkey)
base_text (text)

t_language
language_id (pkey)
language (text)

t_translations
translations_id (pkey - composite of text_id and
language_id with a marker such as "_" inbetween.
otherwise 1 and 11 ("111") can be confused with 11 and
1 ("111").  unless someone has a better idea (which
they likely do).
translation (text)

would this be an option?

Bruno, do you even need a translations table?
couldn't you just use...

t_text
text_id (pkey)
is_base (bool)
text (text)

if the base text is all in one language, then you
wouldn't even need the boolean (as long as that rule
sticks, anyway).

is there a reason one setup is more compelling than
another?

tia...

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: advice on setting up schema sought

From
Bruno Wolff III
Date:
On Tue, Apr 11, 2006 at 12:07:54 -0700,
  operationsengineer1@yahoo.com wrote:
>
>
> t_translations
> translations_id (pkey - composite of text_id and
> language_id with a marker such as "_" inbetween.
> otherwise 1 and 11 ("111") can be confused with 11 and
> 1 ("111").  unless someone has a better idea (which
> they likely do).
> translation (text)
>
> would this be an option?

No, you want to use the primary key declaration on the table, not the
column. Done that way you can specify more than one column as forming
the primary key.

> Bruno, do you even need a translations table?
> couldn't you just use...
>
> t_text
> text_id (pkey)
> is_base (bool)
> text (text)
>
> if the base text is all in one language, then you
> wouldn't even need the boolean (as long as that rule
> sticks, anyway).

This wouldn't solve the primary key issue, but it wouldn't unreasonable
to combine the primary text and translations together especially if the
primary text was always in a specific language.

Re: advice on setting up schema sought

From
Date:
> > t_translations
> > translations_id (pkey - composite of text_id and
> > language_id with a marker such as "_" inbetween.
> > otherwise 1 and 11 ("111") can be confused with 11
> and
> > 1 ("111").  unless someone has a better idea
> (which
> > they likely do).
> > translation (text)
> >
> > would this be an option?
>
> No, you want to use the primary key declaration on
> the table, not the
> column. Done that way you can specify more than one
> column as forming
> the primary key.

like so?

t_translations
language_id (used for table level pkey)
base_text_id (used for table level pkey)
translation (text)

i haven't worked with table level pkeys before.  next
time i get pgadmin3 fired up, i'll look into it.

thanks,

oe1

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: advice on setting up schema sought

From
Bruno Wolff III
Date:
On Tue, Apr 11, 2006 at 14:01:08 -0700,
  operationsengineer1@yahoo.com wrote:
> like so?
>
> t_translations
> language_id (used for table level pkey)
> base_text_id (used for table level pkey)
> translation (text)
>
> i haven't worked with table level pkeys before.  next
> time i get pgadmin3 fired up, i'll look into it.

Here is an example of a table constraint right from the documentation:
CREATE TABLE films (
    code        char(5),
    title       varchar(40),
    did         integer,
    date_prod   date,
    kind        varchar(10),
    len         interval hour to minute,
    CONSTRAINT code_title PRIMARY KEY(code,title)
);

Re: advice on setting up schema sought

From
Date:
--- Bruno Wolff III <bruno@wolff.to> wrote:

> On Tue, Apr 11, 2006 at 14:01:08 -0700,
>   operationsengineer1@yahoo.com wrote:
> > like so?
> >
> > t_translations
> > language_id (used for table level pkey)
> > base_text_id (used for table level pkey)
> > translation (text)
> >
> > i haven't worked with table level pkeys before.
> next
> > time i get pgadmin3 fired up, i'll look into it.
>
> Here is an example of a table constraint right from
> the documentation:
> CREATE TABLE films (
>     code        char(5),
>     title       varchar(40),
>     did         integer,
>     date_prod   date,
>     kind        varchar(10),
>     len         interval hour to minute,
>     CONSTRAINT code_title PRIMARY KEY(code,title)
> );
>

cool beans!  but my mysql friends tell me pgsql is
slow...  is it?

juuuuuust kidding!

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: advice on setting up schema sought

From
"Daniel McBrearty"
Date:
thanks people for your input.

I also posted this over on pgsql-general and got a pretty thorough reply from one Michael Glaesemann, which seems to have pretty much sorted it for me.

cheers

Daniel


--
Daniel McBrearty
email : danielmcbrearty at gmail.com
www.engoi.com : the multi - language vocab trainer
BTW : 0873928131