Internationalisation (i18n) with Postgres as backend
Hi,
I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS.
So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from some Postgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions:
(a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?)
(b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !).
The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have content in one or more languages.
create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);
create unique index on (pageid,objectid,objectlang);
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem');
select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) order by objectid,array_position('{de,en}'::text[],objectLang);
(The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegible languages as input)
Thanks !
Laura
On 6/1/21 2:09 PM, Laura Smith wrote:
Hi,
I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS.
So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from some Postgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions:
(a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?)
(b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !).
The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have content in one or more languages.
create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);create unique index on (pageid,objectid,objectlang);
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem');select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) order by objectid,array_position('{de,en}'::text[],objectLang);
(The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegible languages as input)
Thanks !
Laura
What is your notion of "object". I first assumed it was akin to
"document" but then pages have objects.
Hi Laura,
Did you consider using hstore to store language and data as a kvp? For
example:
b2bc_owner@b2bcreditonline=# create table langtest(pageid text, objectid
text, objectdata hstore, constraint langtest_pk primary key (pageid,
objectid));
CREATE TABLE
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'abc',
'"en"=>"en for abc","de"=>"de for abc"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'def',
'"en"=>"en for def"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# create or replace function langtestfunc(text,
text, text[]) returns text language sql as $$ select a.data from langtest
as t, unnest(t.objectdata->$3) as a(data) where t.pageid = $1 and
t.objectid = $2 and a.data is not null limit 1 $$;
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'abc', array['de',
'en']);
langtestfunc
--------------
de for abc
(1 row)
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'def', array['de',
'en']);
langtestfunc
--------------
en for def
(1 row)
Just a thought.
Cheers,
Steve
On Wed, Jun 2, 2021 at 6:09 AM Laura Smith <
n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
Show quoted text
Hi,
I'm creating a Postgres backend for an internal tool which is essentially
a very simple implementation of multi-lingual CMS.So far my thoughts are along the lines of the below, but I would
appreciate a second (or more !) pair of eyes from some Postgresql gurus. I
am especially interested in feedback and suggestions in relation to the
following questions:(a) Is this going to work as expected (i.e. have I missed some obvious
foot-guns ?)(b) Is this manner of doing things reasonably efficient or are there
better ways I should be thinking of ? (bear in mind the schema is not set
in stone, so completely out of the box suggestions welcome !).The basic design concept (oversimplified) is: For each page, you have one
or more objects and those objects may have content in one or more languages.create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);create unique index on (pageid,objectid,objectlang);
insert into langTest(pageID,objectID,objectLang,objectData) values
('zzz','abc','en','Lorem ipsum dolor sit amet');
insert into langTest(pageID,objectID,objectLang,objectData) values
('zzz','abc','de','Amet sit dolor ipsum lorem');
insert into langTest(pageID,objectID,objectLang,objectData) values
('zzz','def','en','Dolor ipsum amet sit lorem');select distinct on(objectid)objectid,objectlang,pageid,objectdata from
langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) order
by objectid,array_position('{de,en}'::text[],objectLang);(The idea being that the select query will be wrapped into a function
which the frontend will call, passing a list of elegible languages as input)Thanks !
Laura
What is your notion of "object". I first assumed it was akin to
"document" but then pages have objects.
I think my terminology is a bit off.
A document/page has object(s) on it.
Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template.
Hi Steve,
I didn't consider hstore, I did consider jsonb though.
The thing that made me lean towards individual rows rather than consolidated was that I thought versioning would ultimately be easier/cleaner to achieve with individual rows (e.g. using tsrange & gist exclude). But willing to be proven wrong.
Laura
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, 1 June 2021 22:10, Steve Baldwin <steve.baldwin@gmail.com> wrote:
Show quoted text
Hi Laura,
Did you consider using hstore to store language and data as a kvp? For example:
b2bc_owner@b2bcreditonline=# create table langtest(pageid text, objectid text, objectdata hstore, constraint langtest_pk primary key (pageid, objectid));
CREATE TABLE
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'abc', '"en"=>"en for abc","de"=>"de for abc"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# insert into langtest values ('zz', 'def', '"en"=>"en for def"');
INSERT 0 1
b2bc_owner@b2bcreditonline=# create or replace function langtestfunc(text, text, text[]) returns text language sql as $$ select a.data from langtest as t, unnest(t.objectdata->$3) as a(data) where t.pageid = $1 and t.objectid = $2 and a.data is not null limit 1 $$;
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'abc', array['de', 'en']);
langtestfunc
--------------
de for abc
(1 row)
b2bc_owner@b2bcreditonline=# select langtestfunc('zz', 'def', array['de', 'en']);
langtestfunc
--------------
en for def
(1 row)Just a thought.
Cheers,
Steve
On Wed, Jun 2, 2021 at 6:09 AM Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
Hi,
I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS.
So far my thoughts are along the lines of the below, but I would appreciate a second (or more !) pair of eyes from some Postgresql gurus. I am especially interested in feedback and suggestions in relation to the following questions:
(a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?)
(b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ? (bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !).
The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have content in one or more languages.
create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);create unique index on (pageid,objectid,objectlang);
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','en','Lorem ipsum dolor sit amet');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','abc','de','Amet sit dolor ipsum lorem');
insert into langTest(pageID,objectID,objectLang,objectData) values ('zzz','def','en','Dolor ipsum amet sit lorem');select distinct on(objectid)objectid,objectlang,pageid,objectdata from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[]) order by objectid,array_position('{de,en}'::text[],objectLang);
(The idea being that the select query will be wrapped into a function which the frontend will call, passing a list of elegible languages as input)
Thanks !
Laura
On 6/1/21 5:00 PM, Laura Smith wrote:
What is your notion of "object". I first assumed it was akin to
"document" but then pages have objects.I think my terminology is a bit off.
A document/page has object(s) on it.
Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template.
Then I take it this template (layout?) is to be re-used across disparate
content. So /instances/ of template applied to content (of vice versa?)
are documents. Two separate domains to be managed, no? Is this an
authoring system? Hard copy or digital presentation?
On Wednesday, 2 June 2021 00:42, Rob Sargent <robjsargent@gmail.com> wrote:
On 6/1/21 5:00 PM, Laura Smith wrote:
What is your notion of "object". I first assumed it was akin to
"document" but then pages have objects.I think my terminology is a bit off.
A document/page has object(s) on it.
Or, perhaps better expressed, think of document/page as the template and object(s) is what fills the gaps in the template.
Then I take it this template (layout?) is to be re-used across disparate content. So /instances/ of template applied to content (of vice versa?) are documents. Two separate domains to be managed, no? Is this an authoring system? Hard copy or digital presentation?
Yes, you are thinking along the right lines.
The templates themselves are out-of-scope for the database, they are defined and managed in front-end as local filesystem files for the relevant parser to render since frontend tech is generally better suited to this role than databases.
The role of the database is (when given a document ID) :
- Tell the frontend which template to use
- Provide content to fill template placeholders
The content may take two forms:
(a) Pure "fill the gaps" content for template placeholders (i.e. text for here, image for there etc.)
(b) A list result set which then gets rendered by frontend (i.e think change lists, news items, lists of people's bios etc.)
Versioning is a requirement of the system, particularly at content level.
The internationalisation requirement comes in at content level. The templates themselves don't have any translatable elements.
So, I guess TL;DR is yes, an authoring system. Currently only spec'd for digital (browser and mobile access), no print.
Hope this helps clarify !
On Tue, 2021-06-01 at 23:08 +0000, Laura Smith wrote:
I didn't consider hstore, I did consider jsonb though.
The thing that made me lean towards individual rows rather than consolidated was
that I thought versioning would ultimately be easier/cleaner to achieve with
individual rows (e.g. using tsrange & gist exclude). But willing to be proven wrong.
Your intuition is good. Don't use JSON for that, and don't use the outdated
hstore extension in particular.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Tue, 2021-06-01 at 20:09 +0000, Laura Smith wrote:
I'm creating a Postgres backend for an internal tool which is essentially a very simple implementation of multi-lingual CMS.
I am especially interested in feedback and suggestions in relation to the following questions:
(a) Is this going to work as expected (i.e. have I missed some obvious foot-guns ?)
(b) Is this manner of doing things reasonably efficient or are there better ways I should be thinking of ?
(bear in mind the schema is not set in stone, so completely out of the box suggestions welcome !).The basic design concept (oversimplified) is: For each page, you have one or more objects and those objects may have content in one or more languages.
create table langtest(
pageid text not null,
objectid text not null ,
objectlang text not null,
objectdata text not null);create unique index on (pageid,objectid,objectlang);
select distinct on(objectid)objectid,objectlang,pageid,objectdata
from langTest where pageid='zzz' and objectLang = any('{de,en}'::text[])
order by objectid,array_position('{de,en}'::text[],objectLang);
That looks ok, except you should remove "objectid" from the index.
That column makes that the index cannot be used for "objectlang" effectively.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com