On Fri, Oct 08, 2010 at 12:12:54PM +0200, Wolfgang Keller wrote:
> I'm working on a database schema which contains lots of
> "type code lookup" tables. The entries of these tables are
> also hierarchically related among themselves
> (subtype/supertype), to store rather large and quite complex
> taxonomies.
.From my experience it depends. Either you've got a "coding
system" which you need to provide localized "translations"
for or else you've got arbitrary type code lookups.
With coding systems it is typically not really a translation
of the coded term but rather *another* term people attach to
the same code - incidentally when using another language.
Terms in one language change while they don't change in
another. Think of the code as defining a class with all the
local language terms being ever-so-slightly different things
all belonging into that class (eg. while "back pain" and
"Kreuzschmerz" aren't considered translations of each other
*medically* they can well be considered to group under the
same ICD-10 code). Thus I've found this general scheme to
work well:
create table coded_term (
pk serial primary key,
code text,
term text,
lang text,
fk_coding_system integer
references coding_system(pk),
unique(code, term, lang, fk_coding_system)
);
(it can be argued whether lang should fold into coding_system)
If it's about arbitrary lookup values for codes I am using a
gettext version rewritten in pgsql similar to this:
create table lut_colors (
pk serial primary key
color text
);
create view v_lut_colors as
select
pk
as pk_lut_color,
color
as color,
_(color)
as l10n_color
from
lut_colors;
(you don't need the view or you don't need it in this way but
it's useful)
Now you guessed it: _() is a plpgsql function which does a
translation table lookup based on the database account (or a
passed in user name) and a pre-configured (or passed in)
language per said account/user name. It falls back from,
say, "de_DE" to, say, "de" to returning the original string.
The translation table is filled this way:
select i18n_upd_tx('de_DE', 'blue', 'blau');
select i18n_upd_tx('de_DE', 'grey', 'grau');
It doesn't really matter which language is used as the
"original" lookup language as long as a translation exists
for the desired target language:
select i18n_upd_tx('en', '1ö34kjafg8', 'yellow');
will properly make
select _('1ö34kjafg8', 'en');
return "yellow".
All the code for this is to be found in the git repository
for GNUmed at gitorious:
http://gitorious.org/gnumed
> BTW: Methods that use a single table to hold all
> translations for all strings in all tables of the entire
> schema are not very useful in this case, since it can't be
> excluded that depending on the context (i.e. the specific
> semantics of the specific "type code lookup" table) the
> translation of one and the same string in one language will
> be different in other languages.
Well, either add in a context field to the _()/i18n_upd_tx()
approach or consider using the coding system approach. You
might even figure out a way to use the tableoid in the
translation function:
create view v_lut_colors as
select
pk
as pk_lut_color,
color
as color,
_(color, lut_colors.tableoid)
as l10n_color
from
lut_colors;
This would require applying the tableoid when adding
translations though.
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346