Thread: Internationalisation (i18n) with Postgres as backend
Hi, I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS. So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from some Postgresqlgurus. I am especially interested in feedback and suggestions in relation to the following questions: (a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?) (b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mindthe schema is not set in stone, so completely out of the box suggestions welcome !). The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have contentin one or more languages. create table langtest( pageid text not null, objectid text not null , objectlang text not null, objectdata text not null); create unique index on (pageid,objectid,objectlang); insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet'); insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem'); insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem'); select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[])order by objectid,array_position('{de,en}'::text[],objectLang); (The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegiblelanguages as input) Thanks ! Laura
On 6/1/21 2:09 PM, Laura Smith wrote: > Hi, > > I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingualCMS. > > So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from somePostgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions: > > (a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?) > > (b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mindthe schema is not set in stone, so completely out of the box suggestions welcome !). > > The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have contentin one or more languages. > > create table langtest( > pageid text not null, > objectid text not null , > objectlang text not null, > objectdata text not null); > > create unique index on (pageid,objectid,objectlang); > > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet'); > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem'); > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem'); > > select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[])order by objectid,array_position('{de,en}'::text[],objectLang); > > (The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegiblelanguages as input) > > Thanks ! > > Laura > What is your notion of "object". I first assumed it was akin to "document" but then pages have objects.
Hi Laura,
Did you consider using hstore to store language and data as a kvp? For example:
b2bc_owner@b2bcreditonline=# create table langtest(pageid text, objectid text, objectdata hstore, constraint langtest_pk primary key (pageid, objectid));
CREATE TABLE
CREATE TABLE
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'abc', '"en"=>"en for abc","de"=>"de for abc"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'def', '"en"=>"en for def"');
INSERT 0 1
INSERT 0 1
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'def', '"en"=>"en for def"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# create or replace function langtestfunc(text, text, text[]) returns text language sql as $$ select a.data from langtest as t, unnest(t.objectdata->$3) as a(data) where t.pageid = $1 and t.objectid = $2 and a.data is not null limit 1 $$;
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'abc', array['de', 'en']);
langtestfunc
--------------
de for abc
(1 row)
langtestfunc
--------------
de for abc
(1 row)
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'def', array['de', 'en']);
langtestfunc
--------------
en for def
(1 row)
langtestfunc
--------------
en for def
(1 row)
Just a thought.
Cheers,
Steve
On Wed, Jun 2, 2021 at 6:09 AM Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
Hi,
I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS.
So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from some Postgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions:
(a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?)
(b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !).
The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have content in one or more languages.
create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);
create unique index on (pageid,objectid,objectlang);
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem');
select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) order by objectid,array_position('{de,en}'::text[],objectLang);
(The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegible languages as input)
Thanks !
Laura
> What is your notion of "object". I first assumed it was akin to > "document" but then pages have objects. I think my terminology is a bit off. A document/page has object(s) on it. Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template.
Hi Steve, I didn't consider hstore, I did consider jsonb though. The thing that made me lean towards individual rows rather than consolidated was that I thought versioning would ultimatelybe easier/cleaner to achieve with individual rows (e.g. using tsrange & gist exclude). But willing to be provenwrong. Laura ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Tuesday, 1 June 2021 22:10, Steve Baldwin <steve.baldwin@gmail.com> wrote: > Hi Laura, > > Did you consider using hstore to store language and data as a kvp? For example: > > b2bc_owner@b2bcreditonline=# create table langtest(pageid text, objectid text, objectdata hstore, constraint langtest_pkprimary key (pageid, objectid)); > CREATE TABLE > b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'abc', '"en"=>"en for abc","de"=>"de for abc"'); > INSERT 0 1 > b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'def', '"en"=>"en for def"'); > INSERT 0 1 > b2bc_owner@b2bcreditonline=# create or replace function langtestfunc(text, text, text[]) returns text language sql as $$select a.data from langtest as t, unnest(t.objectdata->$3) as a(data) where t.pageid = $1 and t.objectid = $2 and a.datais not null limit 1 $$; > b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'abc', array['de', 'en']); > langtestfunc > -------------- > de for abc > (1 row) > b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'def', array['de', 'en']); > langtestfunc > -------------- > en for def > (1 row) > > Just a thought. > > Cheers, > > Steve > > On Wed, Jun 2, 2021 at 6:09 AM Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote: > > > Hi, > > > > I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingualCMS. > > > > So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from somePostgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions: > > > > (a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?) > > > > (b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mindthe schema is not set in stone, so completely out of the box suggestions welcome !). > > > > The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may havecontent in one or more languages. > > > > create table langtest( > > pageid text not null, > > objectid text not null , > > objectlang text not null, > > objectdata text not null); > > > > create unique index on (pageid,objectid,objectlang); > > > > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet'); > > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem'); > > insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem'); > > > > select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[])order by objectid,array_position('{de,en}'::text[],objectLang); > > > > (The idea being that the select query will be wrapped into a function which the frontend will call, passing a list ofelegible languages as input) > > > > Thanks ! > > > > Laura
On 6/1/21 5:00 PM, Laura Smith wrote:
Then I take it this template (layout?) is to be re-used across disparate content. So /instances/ of template applied to content (of vice versa?) are documents. Two separate domains to be managed, no? Is this an authoring system? Hard copy or digital presentation?What is your notion of "object". I first assumed it was akin to "document" but then pages have objects.I think my terminology is a bit off. A document/page has object(s) on it. Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template.
On Wednesday, 2 June 2021 00:42, Rob Sargent <robjsargent@gmail.com> wrote: > On 6/1/21 5:00 PM, Laura Smith wrote: > > > > What is your notion of "object". I first assumed it was akin to > > > "document" but then pages have objects. > > > > I think my terminology is a bit off. > > > > A document/page has object(s) on it. > > > > Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template. > > Then I take it this template (layout?) is to be re-used across disparate content. So /instances/ of template applied tocontent (of vice versa?) are documents. Two separate domains to be managed, no? Is this an authoring system? Hard copyor digital presentation? Yes, you are thinking along the right lines. The templates themselves are out-of-scope for the database, they are defined and managed in front-end as local filesystemfiles for the relevant parser to render since frontend tech is generally better suited to this role than databases. The role of the database is (when given a document ID) : - Tell the frontend which template to use - Provide content to fill template placeholders The content may take two forms: (a) Pure "fill the gaps" content for template placeholders (i.e. text for here, image for there etc.) (b) A list result set which then gets rendered by frontend (i.e think change lists, news items, lists of people's bios etc.) Versioning is a requirement of the system, particularly at content level. The internationalisation requirement comes in at content level. The templates themselves don't have any translatable elements. So, I guess TL;DR is yes, an authoring system. Currently only spec'd for digital (browser and mobile access), no print. Hope this helps clarify !
On Tue, 2021-06-01 at 23:08 +0000, Laura Smith wrote: > I didn't consider hstore, I did consider jsonb though. > > The thing that made me lean towards individual rows rather than consolidated was > that I thought versioning would ultimately be easier/cleaner to achieve with > individual rows (e.g. using tsrange & gist exclude). But willing to be proven wrong. Your intuition is good. Don't use JSON for that, and don't use the outdated hstore extension in particular. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tue, 2021-06-01 at 20:09 +0000, Laura Smith wrote: > I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingualCMS. > > I am especially interested in feedback and suggestions in relation to the following questions: > > (a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?) > > (b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? > (bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !). > > The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have contentin one or more languages. > > create table langtest( > pageid text not null, > objectid text not null , > objectlang text not null, > objectdata text not null); > > create unique index on (pageid,objectid,objectlang); > > select distinct on(objectid)objectid,objectlang,pageid,objectdata > from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) > order by objectid,array_position('{de,en}'::text[],objectLang); That looks ok, except you should remove "objectid" from the index. That column makes that the index cannot be used for "objectlang" effectively. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com