Modelling a web CMS in Postgres ... a little advice needed

Started by Laura Smithover 3 years ago6 messagesgeneral
Jump to latest
#1Laura Smith
n5d9xq3ti233xiyif2vp@protonmail.ch

Hi

I'm looking at using pgsql as a backend to a web CMS but could do with a little advice from the crowd on the wiseness of my schema thinking.

TL;DR the design is centered around two tables "pages" and "page_content", where "pages" has a jsonb column that refers to "page_content" in a key-value style (key for content block location ID on the web page, value for the database lookup).  Probably both, but certainly "page_content" would need to be versioned.

My present thinking is along the following lines (table columns minimised for this post):

create table pages (
  page_id uuid primary key not null,
  page_metadata jsonb not null
);

create table page_content(
  content_id uuid not null,
  content_version_id uuid not null
  content_valid tstzrange not null default tstzrange(now(),'infinity'),
  content_data text,
EXCLUDE USING gist (content_id WITH =, content_valid WITH && ) DEFERRABLE INITIALLY DEFERRED
);
create unique index if not exists on page_content(content_version_id);
CREATE OR REPLACE VIEW current_content AS select * from page_content where content_valid @> now();

An example "page_metadata" entry might look something like :
{
"page":"foo",
"description":"bar",
"content":[
"pageHeader":"E52DD77C-F3B5-40D9-8E65-B95F54E1C76B",
"pageMainLeft":"0BEFA002-7F9B-4A6A-AD33-CA916751B648"
]
}

So I guess my questions here are :
Am i nuts with this thinking ? Is there a saner way to do this ? Should I be using pgsql at all for this, e.g. the cool kids will probably say I should be using a graph database ?  (N.B. I did consider a pure old-school relational model with no jsonb, but I considered it too difficult to model the dynamic nature of the fields, i.e. unknown many-many relationship between page content locations and content ... but I'm willing to be proven wrong by wiser minds)

Then, on a more technical level .... what would an optimal query for looping through the json content array look like ?  I have always been pretty useless when it comes to CTE expressions !

Thanks all

#2Tony Shelver
tshelver@gmail.com
In reply to: Laura Smith (#1)
Fwd: Modelling a web CMS in Postgres ... a little advice needed

From: Tony Shelver <tshelver@gmail.com>
Date: Thu, 11 Aug 2022 at 11:47
Subject: Re: Modelling a web CMS in Postgres ... a little advice needed
To: Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch>

On Thu, 11 Aug 2022 at 09:35, Laura Smith <
n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:

Hi

I'm looking at using pgsql as a backend to a web CMS but could do with a
little advice from the crowd on the wiseness of my schema thinking.

TL;DR the design is centered around two tables "pages" and "page_content",
where "pages" has a jsonb column that refers to "page_content" in a
key-value style (key for content block location ID on the web page, value
for the database lookup). Probably both, but certainly "page_content"
would need to be versioned.

My present thinking is along the following lines (table columns minimised
for this post):

create table pages (
page_id uuid primary key not null,
page_metadata jsonb not null
);

create table page_content(
content_id uuid not null,
content_version_id uuid not null
content_valid tstzrange not null default tstzrange(now(),'infinity'),
content_data text,
EXCLUDE USING gist (content_id WITH =, content_valid WITH && ) DEFERRABLE
INITIALLY DEFERRED
);
create unique index if not exists on page_content(content_version_id);
CREATE OR REPLACE VIEW current_content AS select * from page_content where
content_valid @> now();

An example "page_metadata" entry might look something like :
{
"page":"foo",
"description":"bar",
"content":[
"pageHeader":"E52DD77C-F3B5-40D9-8E65-B95F54E1C76B",
"pageMainLeft":"0BEFA002-7F9B-4A6A-AD33-CA916751B648"
]
}

So I guess my questions here are :
Am i nuts with this thinking ? Is there a saner way to do this ? Should I
be using pgsql at all for this, e.g. the cool kids will probably say I
should be using a graph database ? (N.B. I did consider a pure old-school
relational model with no jsonb, but I considered it too difficult to model
the dynamic nature of the fields, i.e. unknown many-many relationship
between page content locations and content ... but I'm willing to be proven
wrong by wiser minds)

Then, on a more technical level .... what would an optimal query for
looping through the json content array look like ? I have always been
pretty useless when it comes to CTE expressions !

Thanks all

Off the top of my head, it seems like you are reinventing the wheel.

There are some very good CMS systems out there, such as Strapi, which is an
open source, headless CMS, and which can use Postgresql as the underlying
DB.

It could be worth a while to install this and see how it generates the
underlying pages. As a benefit, it also generates the UI to maintain the
underlying data, and also provides a set of APIs that you can call to
access the data from many different front-ends.

Tony

#3Laura Smith
n5d9xq3ti233xiyif2vp@protonmail.ch
In reply to: Tony Shelver (#2)
Re: Fwd: Modelling a web CMS in Postgres ... a little advice needed

Hi Tony

The reason I'm looking to do it from scratch is that its a case of "once bitten, twice shy".

This CMS will be replacing a Joomla based CMS.

I can't quite say I'm enamoured by the option of trading one "baggage included" opinionated CMS for another.  Also in this day and age, removing the excess baggage means an improved security footprint.

Laura
------- Original Message -------
On Thursday, August 11th, 2022 at 10:49, Tony Shelver <tshelver@gmail.com> wrote:

Show quoted text

From: Tony Shelver <tshelver@gmail.com>
Date: Thu, 11 Aug 2022 at 11:47
Subject: Re: Modelling a web CMS in Postgres ... a little advice needed
To: Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch>

On Thu, 11 Aug 2022 at 09:35, Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:

Hi

I'm looking at using pgsql as a backend to a web CMS but could do with a little advice from the crowd on the wiseness of my schema thinking.

TL;DR the design is centered around two tables "pages" and "page_content", where "pages" has a jsonb column that refers to "page_content" in a key-value style (key for content block location ID on the web page, value for the database lookup). Probably both, but certainly "page_content" would need to be versioned.

My present thinking is along the following lines (table columns minimised for this post):

create table pages (
page_id uuid primary key not null,
page_metadata jsonb not null
);

create table page_content(
content_id uuid not null,
content_version_id uuid not null
content_valid tstzrange not null default tstzrange(now(),'infinity'),
content_data text,
EXCLUDE USING gist (content_id WITH =, content_valid WITH && ) DEFERRABLE INITIALLY DEFERRED
);
create unique index if not exists on page_content(content_version_id);
CREATE OR REPLACE VIEW current_content AS select * from page_content where content_valid @> now();

An example "page_metadata" entry might look something like :
{
"page":"foo",
"description":"bar",
"content":[
"pageHeader":"E52DD77C-F3B5-40D9-8E65-B95F54E1C76B",
"pageMainLeft":"0BEFA002-7F9B-4A6A-AD33-CA916751B648"
]
}

So I guess my questions here are :
Am i nuts with this thinking ? Is there a saner way to do this ? Should I be using pgsql at all for this, e.g. the cool kids will probably say I should be using a graph database ? (N.B. I did consider a pure old-school relational model with no jsonb, but I considered it too difficult to model the dynamic nature of the fields, i.e. unknown many-many relationship between page content locations and content ... but I'm willing to be proven wrong by wiser minds)

Then, on a more technical level .... what would an optimal query for looping through the json content array look like ? I have always been pretty useless when it comes to CTE expressions !

Thanks all

Off the top of my head, it seems like you are reinventing the wheel.

There are some very good CMS systems out there, such as Strapi, which is an open source, headless CMS, and which can use Postgresql as the underlying DB.

It could be worth a while to install this and see how it generates the underlying pages. As a benefit, it also generates the UI to maintain the underlying data, and also provides a set of APIs that you can call to access the data from many different front-ends.

Tony

#4Laura Smith
n5d9xq3ti233xiyif2vp@protonmail.ch
In reply to: Laura Smith (#1)
Re: Fwd: Modelling a web CMS in Postgres ... a little advice needed

Sent with Proton Mail secure email.

------- Original Message -------
On Thursday, August 11th, 2022 at 13:11, Ray O'Donnell <ray@rodonnell.ie> wrote:

On 11/08/2022 11:00, Laura Smith wrote:

Hi Tony

The reason I'm looking to do it from scratch is that its a case of
"once bitten, twice shy".

This CMS will be replacing a Joomla based CMS.

I can't quite say I'm enamoured by the option of trading one "baggage
included" opinionated CMS for another. Also in this day and age,
removing the excess baggage means an improved security footprint.

I would second what Tony Shelver said - it sounds as if what you're
trying to do has already been done, and for the most part done well. I'd
recommend Drupal, which works well with PostgreSQL, and which is very
extensible and customisable.

I'd agree with you that removing baggage can mean improved security -
on the other hand, having a huge, security-conscious community behind an
open-source project gets you there too.

Ray.

Tony's suggestion also needed Node.js, and frankly I'm not even going to open that can of worms.

At this point I'm not interested in "very extensible and customisable". Joomla fills those words too, and my present installation is full of all sorts of third-party extensions ... the thought of all the security vulnerabilities scares the bejeezus out of me !

Hence I want to get rid of bells, whistles and baggage. Forget the fancy GUI, forget the "community" of random plugins.

I just want a solid Postgres, which exposes to the outside world via a limited number of Pl/pgsql functions. The frontend can then pull the data and render it into a template.

I know I probably sound nuts. But as I said, I've spent too long in the hell of Joomla. I don't want to replace it with another thing that tries to be all things to all people ... I just want to follow the KISS principle now.

#5Tony Shelver
tshelver@gmail.com
In reply to: Laura Smith (#3)
Re: Fwd: Modelling a web CMS in Postgres ... a little advice needed

On Thu, 11 Aug 2022 at 12:00, Laura Smith <
n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:

Hi Tony

The reason I'm looking to do it from scratch is that its a case of "once
bitten, twice shy".

This CMS will be replacing a Joomla based CMS.

I can't quite say I'm enamoured by the option of trading one "baggage
included" opinionated CMS for another. Also in this day and age, removing
the excess baggage means an improved security footprint.

Laura, fair enough.

At the time i used a static site generator (SSG), which was Jekyll at the
time, to generate our website.
I could have used Joomla or Wordpress as the CMS, but these are terrible
solutions as you mentioned, as they are not designed as pure CMS engines,
especially for use with a static site generator.

I went with my own rolled CMS as well.

One part of my requirement was for a product sales database, which ended up
being a lot more complex than your design, as I needed categories, product
groups, products, prices by area and more. I did this in Postgresql, and
pulled the data via a GraphQL API implemented via Postgraphile, which
generates a GraphQL API from a Postgres schema automagically.

The other part was for the static web pages. What I came up with was to
store pages and then a set of tags where I could assign content to each tag
for the page.
For example, you talked about Page Headers and PageMainLeft. I stored
these tagged with the name of the page, and the tag name, plus the content.

The SSG then saw the tags in the HTML page template, and pulled in the
content via Postgraphile.

At the time I looked at several 'headless' CMS engines, and Strapi came
close to meeting requirements, but was still in early development stages.

Today, I would look at a different route, likely leveraging something like
Strapi. I *may* still consider using native Postgres tables to manage the
product database, but there are a lot of other proven CMS options on the
market now without the baggage of Joomla, Wordpress et al.
Node.js I regard as a necessary evil if you are developing relatively
complex websites, integrating multiple APIs and other resources.

Very much depends on your environment though.
For example, one huge advantage of the headless CMS options is that the
whole user management and access is built in if you have a large user base
maintaining different parts of the website content. Another is things like
formal APIs to get content, and access management / security of the content
itself.

There isn't really a one-solution-fits-all-requirements option when it
comes to content management, unfortunately.

Regards

#6Rob Sargent
robjsargent@gmail.com
In reply to: Laura Smith (#4)
Re: Modelling a web CMS in Postgres ... a little advice needed

On Aug 11, 2022, at 5:30 AM, Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:

Sent with Proton Mail secure email.

------- Original Message -------
On Thursday, August 11th, 2022 at 13:11, Ray O'Donnell <ray@rodonnell.ie> wrote:

On 11/08/2022 11:00, Laura Smith wrote:

Hi Tony

The reason I'm looking to do it from scratch is that its a case of
"once bitten, twice shy".

This CMS will be replacing a Joomla based CMS.

I can't quite say I'm enamoured by the option of trading one "baggage
included" opinionated CMS for another. Also in this day and age,
removing the excess baggage means an improved security footprint.

I would second what Tony Shelver said - it sounds as if what you're
trying to do has already been done, and for the most part done well. I'd
recommend Drupal, which works well with PostgreSQL, and which is very
extensible and customisable.

I'd agree with you that removing baggage can mean improved security -
on the other hand, having a huge, security-conscious community behind an
open-source project gets you there too.

Ray.

Tony's suggestion also needed Node.js, and frankly I'm not even going to open that can of worms.

At this point I'm not interested in "very extensible and customisable". Joomla fills those words too, and my present installation is full of all sorts of third-party extensions ... the thought of all the security vulnerabilities scares the bejeezus out of me !

Hence I want to get rid of bells, whistles and baggage. Forget the fancy GUI, forget the "community" of random plugins.

I just want a solid Postgres, which exposes to the outside world via a limited number of Pl/pgsql functions. The frontend can then pull the data and render it into a template.

I know I probably sound nuts. But as I said, I've spent too long in the hell of Joomla. I don't want to replace it with another thing that tries to be all things to all people ... I just want to follow the KISS principle now.

I once worked a company which produce reference text for radiologist. The content was generated from an in-house app which wrote a form of mark up xml and store in postgres (version 8,9 no json). That was hard copy though (but beautiful).

I would like to better understand your concept of “page”. I’m not following how you’re managing layout and how variable that is in your world.

I would suggest getting as many of the things like page name (foo) and description(bar) as possible out of the json. How many tags (eg pageHeader, pageMainLeft) do you have? Do queries ask for pages with pageMainLeft having some string value within their content? Will you index the content separately?

Pictures as content?