Question about where to deploy the business logics for data processing

Started by Nim Li almost 3 years ago, 11 messages, general
#1 Nim Li
mr.nim.li@gmail.com

Hello.

We have a PostgreSQL database with many tables, as well as foreign tables,
dblink, triggers, functions, indexes, etc., for managing the business logic
of the data within the database. We also have a custom table for tracking
slowly changing dimensions (type 2).
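
For readers unfamiliar with the term, a type 2 slowly changing dimension keeps history by closing the current row and adding a new one instead of overwriting. Below is a minimal in-memory sketch of that bookkeeping in TypeScript. The row shape and the names (`DimRow`, `validFrom`, `validTo`, `applyChange`) are assumptions made for illustration, not the poster's actual table.

```typescript
// Hypothetical row shape for a type 2 dimension; column names are assumptions.
interface DimRow {
  key: string;            // business key, e.g. a sample id
  value: string;          // the tracked attribute
  validFrom: string;      // ISO date on which this version became effective
  validTo: string | null; // null while this version is the current one
}

// Apply a change: close the current version (if any) and append a new one.
// Returns a new array and leaves the input untouched.
function applyChange(rows: DimRow[], key: string, value: string, asOf: string): DimRow[] {
  const result: DimRow[] = [];
  let unchanged = false;
  for (const row of rows) {
    if (row.key === key && row.validTo === null) {
      if (row.value === value) {
        unchanged = true; // same value: no new version needed
        result.push(row);
      } else {
        result.push({ ...row, validTo: asOf }); // close the old version
      }
    } else {
      result.push(row);
    }
  }
  if (!unchanged) {
    result.push({ key: key, value: value, validFrom: asOf, validTo: null });
  }
  return result;
}

let history: DimRow[] = [];
history = applyChange(history, "sample-1", "lab A", "2023-01-01");
history = applyChange(history, "sample-1", "lab B", "2023-06-09");
```

Inside the database the two steps (close, then insert) would normally run in one transaction so they cannot be separated; in application code that guarantee has to be arranged explicitly.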

Currently we are looking into using TypeORM (with the NestJS framework) to
connect to the database and build a back end that provides a web service.
Some reasons for using TypeORM are that it can update the database schema
without any SQL code, works very well with Git, etc. From what I am reading,
Git seems to work better with TypeORM than with individual batch files of
SQL code (I still need to find out more about this). Yet I do not think the
ORM concept deals with database-specific features, such as dblink and
trigger functions, which handle the business logic or any ETL automation
within the database itself (I should read more about this as well).

Anyway, in our team discussion, I was told that in modern programming
practice, the world is moving away from deploying program logic within
the database (e.g., by using PL/pgSQL). Instead, the proper way is supposedly
to deploy all the program logic in the framework that connects to the
database, such as NestJS in our case. So all we would need in the database
is the schema (managed by the ORM), and we should move all the existing
business logic (currently handled by database triggers, functions, dblink,
etc.) into TypeScript code within the NestJS framework.

I wonder if anyone in the community has gone through a change like this? I
mean moving the business logic from PL/pgSQL within the database to code in
the NestJS framework, and relying only on TypeORM to manage updates to the
database without any SQL code. Any thoughts about such a change?

Thank you!!

#2 Rob Sargent
robjsargent@gmail.com
In reply to: Nim Li (#1)
Re: Question about where to deploy the business logics for data processing

You’re riding a pendulum which has swung too far.
In any organization of even minimal complexity, the physical data model and the deployable business model are never well aligned: the usages of the data are in a different dimension than the storage and maintenance of the data. I’ve not heard of TypeORM but on this list ORMs are notorious for generating poorly performing queries. The notion that application programming will replace database triggers is ludicrous.

#3 Michael Nolan
htfoot@gmail.com
In reply to: Rob Sargent (#2)
Re: Question about where to deploy the business logics for data processing

Clearly I'm a 73 year old dinosaur, because I believe in having the
business logic in the database wherever possible. But the development
projects I've been around lately aren't using triggers at all. (And
it should not surprise anyone, certainly not me, that consistency of
data enforcement is an ongoing issue in these projects.)
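
The consistency problem mentioned here can be illustrated with a small sketch. It is not database code: the hypothetical `Table` class stands in for a table, and the `nonNegative` rule stands in for a CHECK constraint or trigger. All names are made up for illustration.

```typescript
type Row = { id: number; amount: number };
// A rule returns an error message, or null when the row is acceptable.
type Rule = (row: Row) => string | null;

// Stand-in for a table whose rules run on every insert, whoever the writer is.
class Table {
  private rows: Row[] = [];
  private rules: Rule[];

  constructor(rules: Rule[]) {
    this.rules = rules;
  }

  insert(row: Row): void {
    for (const rule of this.rules) {
      const error = rule(row);
      if (error !== null) {
        throw new Error(error);
      }
    }
    this.rows.push(row);
  }

  count(): number {
    return this.rows.length;
  }
}

const nonNegative: Rule = (row) => (row.amount < 0 ? "amount must be >= 0" : null);

// Writer A validates in application code before inserting.
function writerA(table: Table, row: Row): boolean {
  if (row.amount < 0) {
    return false;
  }
  table.insert(row);
  return true;
}

// Writer B (say, a batch script) has no application-side check.
function writerB(table: Table, row: Row): void {
  table.insert(row);
}

// Without a central rule, writer B's bad row slips in.
const unguarded = new Table([]);
writerB(unguarded, { id: 1, amount: -5 });

// With the rule attached to the table, the same write is rejected.
const guarded = new Table([nonNegative]);
let rejected = false;
try {
  writerB(guarded, { id: 1, amount: -5 });
} catch (err) {
  rejected = true;
}
```

The point of the sketch is only that a rule attached to the data applies to every writer, while a rule in one application applies to that application alone.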

Mike Nolan

#4 Lorusso Domenico
domenico.l76@gmail.com
In reply to: Nim Li (#1)
Re: Question about where to deploy the business logics for data processing

Hmm, we need to start from two concepts:

1. Competence
2. Network lag

Competence: programmers usually aren't skilled enough in the
architecture and the actual needs of each layer.
This is a problem, because programmers often try to do everything with what
they already know (e.g. performing joins in Java...).

A correct design requires identifying at least the data logic, the process
logic, the business logic and the presentation logic.

One of the most important goals of the data logic is to ensure the
correctness of the data from many points of view (covering all of them is
impossible).

That involves:

- audit information
- bitemporal management
- strict definition and verification of data (foreign keys, checks,
  management of compatibility)
- consistent replication of data for different uses
- isolating access according to actual needs
- design

So an application that requires changing the data model does not seem to be
well designed...

Network lag
The first problem is latency: we must minimize the amount of data passed
over the network.
This means, for example, creating a service that allows the caller to
choose only the information it needs.
But it also means getting all the information needed in a single call,
designing asynchronous services, and caching data physically near the
frontend or the middle layer.
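
As a rough illustration of letting the caller choose only the information it needs, here is a sketch of a projection helper in TypeScript. The `project` function and the sample record are hypothetical. A real service would more likely push the field list down into the SQL query, so that unneeded columns never leave the database.

```typescript
// Return a copy of `record` that contains only the requested fields.
function project<T, K extends keyof T & string>(record: T, fields: K[]): Pick<T, K> {
  const out: Record<string, unknown> = {};
  for (const field of fields) {
    out[field] = record[field];
  }
  return out as unknown as Pick<T, K>;
}

// Illustrative record: two small fields and two potentially large ones.
const full = { id: 7, name: "run-42", notes: "long free text", payload: "large blob" };

// The caller asks only for what it needs; the large fields are not sent.
const slim = project(full, ["id", "name"]);
```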

Based on these two concepts I suggest:

- develop the data logic near or inside the database;
- design a powerful and addictive API;
- don't allow the business logic to change the model;
- organize/copy data in jsonb with a strong JSON schema to provide
  coherence through every layer;
- ensure a system that grants ACID guarantees to your process.

--
Domenico L.

to amaze for half an hour, a history book is enough;
I tried to learn the Treccani by heart... [F.d.A.]

#5 Nim Li
mr.nim.li@gmail.com
In reply to: Lorusso Domenico (#4)
Re: Question about where to deploy the business logics for data processing

Hello,

Thank you so so much for all the feedback so far. :D

About this comment:

"... an application that requires changing the data model does not seem
to be well designed...don't allow model change by the business logic..."

I work in a science research facility. When researchers start a project,
they don't necessarily have the full picture of what they are hoping to
achieve (yet they may have some ideas about a starting point that allows
them to move forward). By the time they see 40 percent of what they have
done, they may start to think differently and move in a different
direction, or in some cases they may spin it off into something different
after a certain period of time. Coming with my Agile development mindset
into the research area, it is common for me to see users changing their
requirements and expectations while keeping the same buckets for the data.
Yes, there is quite a lot of work to keep the researchers happy. ;-)

I suppose when there is a specific end goal to achieve for a project, a more
specific design is feasible based on that goal. But when the end goal is not
necessarily clear, and/or is changeable, I am not sure how we can draw a
black-and-white line to determine whether a design is good or not
(... and for how long...).

I imagine one option may be to put less logic and fewer restrictions on the
data side, which gives the researchers more flexibility on their end. But
this may not always be feasible due to the specific protocol of the study.
Perhaps there are other approaches and/or principles for dealing with a
situation like mine?

My main focus is still on getting more opinions about where to implement
the business logic for data processing ... if you have any thoughts about
the design, I would love to hear them as well.

Thank you so so much for sharing!

#6 Ron
ronljohnsonjr@gmail.com
In reply to: Nim Li (#5)
Re: Question about where to deploy the business logics for data processing

You can be sure that banks and academic research projects have different
needs.  Heck, your University's class scheduling software has different
needs from the research problems that you support.

The bottom line is that putting all of the "business" logic in TypeORM
*locks you into* using an ORM, while putting as much "business" logic as
possible in the database as stored procedures, triggers, foreign keys, etc.
doesn't. Parts of the application can be in Java, some in JS, C, C++, Rust,
Perl, even COBOL.

On the other hand, putting so much logic into the database essentially
*locks you into that RDBMS*.

--
Born in Arizona, moved to Babylonia.

#7 Guyren Howe
guyren@gmail.com
In reply to: Ron (#6)
Re: Question about where to deploy the business logics for data processing

People change applications and programming languages all the time.

But change the database? Particularly away from Postgres, which is for nearly any purpose clearly the best SQL database available?

You have to pick one. Heck, write your triggers and stored procedures in Python and you can change to SQL Server, or in Java and you have the option of Oracle.

There is never a good reason to use MySQL. :-)

Guyren G Howe

#8 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Nim Li (#5)
Re: Question about where to deploy the business logics for data processing

Seems to me you are looking for a two-part setup:

1) An experimental playground where ideas and processes can be tested out
in a more free-form manner. Some example software I have used or
experimented with that can fill that role:

Pandas
https://pandas.pydata.org/

DuckDB
https://duckdb.org/

Polars
https://pola-rs.github.io/polars-book/

2) Once something that resembles a solid plan has been developed, then
move to Postgres, or not.

--
Adrian Klaver
adrian.klaver@aklaver.com

#9Michael Nolan
htfoot@gmail.com
In reply to: Ron (#6)
Re: Question about where to deploy the business logics for data processing

You're gonna lock yourself into SOMETHING; that's why there are
thousands of COBOL programs still being maintained.

Mike Nolan


On Fri, Jun 9, 2023 at 3:39 PM Ron <ronljohnsonjr@gmail.com> wrote:

You can be sure that banks and academic research projects have different needs. Heck, your University's class scheduling software has different needs from the research problems that you support.

The bottom line is that putting all of the "business" logic in TypeORM locks you into using an ORM, while putting as much "business" logic as possible in the database as stored procedures, triggers, foreign keys, etc... doesn't. Parts of the application can be in Java, some in JS, C, C++, Rust, Perl, even COBOL.

On the other hand, putting so much logic into the database essentially locks you into that RDBMS.
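
The trade-off described above can be sketched roughly as follows. This is only an illustration, assuming SQLite (via Python's standard library) as a stand-in for PostgreSQL so that it runs anywhere; the `sample` table, the pH rule and the audit table are hypothetical. The point is that a rule declared in the database is enforced for every client, whatever language or ORM that client uses.

```python
import sqlite3

# SQLite is used only so the sketch runs anywhere; in PostgreSQL the same
# idea would be a CHECK constraint plus a PL/pgSQL trigger function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    ph REAL NOT NULL CHECK (ph BETWEEN 0 AND 14)   -- hypothetical rule
);
CREATE TABLE sample_audit (sample_id INTEGER, old_ph REAL, new_ph REAL);
-- Every UPDATE of ph is audited, whichever client issued it.
CREATE TRIGGER sample_audit_trg AFTER UPDATE OF ph ON sample
BEGIN
    INSERT INTO sample_audit VALUES (OLD.id, OLD.ph, NEW.ph);
END;
""")


def try_insert(ph):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO sample (ph) VALUES (?)", (ph,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(7.2)    # within the allowed range
rejected = try_insert(42.0)   # violates the CHECK constraint

with conn:
    conn.execute("UPDATE sample SET ph = 6.8 WHERE id = 1")
audit_rows = conn.execute("SELECT * FROM sample_audit").fetchall()
```

The flip side is the one already named: the trigger body is written in the database's own dialect, so it would have to be rewritten when moving to another RDBMS.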

#10Lorusso Domenico
domenico.l76@gmail.com
In reply to: Lorusso Domenico (#4)
Re: Question about where to deploy the business logics for data processing

Hi Nim,
well, this is a very particular scenario.
In a few words, these projects will never go live for production purposes;
they exist just to verify some hypotheses.

In this case, it could be acceptable to generate the schema on the fly, but
it isn't easy to automate every aspect related to optimization
(partitioning, indexes and so on).

Coming to your last question, where to put the data manipulation logic:
again, in this case, minimizing the LAN traffic could be your main goal,
and this means logic inside the DB.
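
A minimal sketch of the traffic argument, assuming SQLite as a runnable stand-in for PostgreSQL and a hypothetical `reading` table: both options compute the same per-sensor average, but the first moves every row to the client while the second moves one row per sensor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [(f"s{i % 3}", float(i)) for i in range(3000)],
)

# Option A: pull every row to the client and aggregate there.
rows = conn.execute("SELECT sensor, value FROM reading").fetchall()
totals, counts = {}, {}
for sensor, value in rows:
    totals[sensor] = totals.get(sensor, 0.0) + value
    counts[sensor] = counts.get(sensor, 0) + 1
client_side = {s: totals[s] / counts[s] for s in totals}

# Option B: let the database aggregate and return one row per sensor.
summary = conn.execute(
    "SELECT sensor, AVG(value) FROM reading GROUP BY sensor"
).fetchall()
db_side = dict(summary)

rows_moved_a = len(rows)      # every reading crosses the wire
rows_moved_b = len(summary)   # only the aggregated rows do
```

With an in-memory database nothing actually crosses a network, so the row counts are only a proxy for the data that would travel between the database server and the application.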

--
Domenico L.

to amaze for half an hour, a history book is enough,
I tried to learn the Treccani by heart... [F.d.A.]

#11Merlin Moncure
mmoncure@gmail.com
In reply to: Nim Li (#1)
Re: Question about where to deploy the business logics for data processing

On Thu, Jun 8, 2023 at 10:22 PM Nim Li <mr.nim.li@gmail.com> wrote:

I wonder if anyone in the community has gone through changes like this? I
mean ... moving the business logic from PL/SQL within the database to the
code in the NestJS framework, and relying only on TypeORM to manage the
update of the database without any SQL code? Any thoughts about such a
change?

Heads up, this is something of a religious database debate in the industry,
and you are asking a bunch of database guys what they think about this, and
their biases will show in their answers.

Having said that, your developers are utterly, completely, wrong. This is
classic, "my technology good, your technology bad", and most of the reasons
given to migrate the stack boil down to "I don't know SQL and will do
absolutely anything to avoid learning it", to the point of rewriting the
entire freaking project into (wait for it) JavaScript, which might very well
be the worst possible language for data management.

The arguments supplied are tautological: "SQL is bad because you have to
write SQL, which is bad", except for the laughably incorrect "sql can't be
checked into git". Guess what, it can (try git add my_func.sql), and there
are many techniques to deal with this.
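
One such technique, sketched here under assumptions: numbered SQL scripts kept in the repository and applied in order, with a bookkeeping table recording what has already run. The file names, the `schema_version` table and the `migrate` helper are all hypothetical, a dictionary stands in for the files on disk, and SQLite stands in for PostgreSQL so the sketch is runnable.

```python
import sqlite3

# Stand-in for *.sql files committed to the repository (hypothetical names).
MIGRATIONS = {
    "001_create_project.sql":
        "CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL);",
    "002_add_status.sql":
        "ALTER TABLE project ADD COLUMN status TEXT DEFAULT 'draft';",
}


def migrate(conn, migrations):
    """Apply scripts not yet recorded in schema_version, in filename order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (filename TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT filename FROM schema_version")}
    applied = []
    for name in sorted(migrations):
        if name in done:
            continue  # already applied on an earlier run
        conn.executescript(migrations[name])
        conn.execute("INSERT INTO schema_version VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn, MIGRATIONS)
second_run = migrate(conn, MIGRATIONS)   # nothing left to apply
```

Because each script is an ordinary text file, it diffs, reviews and merges in git like any other source file; functions and triggers can be versioned the same way.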

Now, database deployments are a very complex topic which doesn't go away
when using an ORM. In fact, they often get worse. There are tools which can
generate change scripts from database A to A'; are there tools to do that
for NestJS object models? Is there automatic dependency tracking for them?
Next thing you know, they will be moving all your primary keys to
GUIDs ("scaling problem, solved!") and whining about database performance
when you actually get some users.

WHY is writing SQL so bad? Is it slower? faster? Better supported?
plpgsql is very highly supported and draws from a higher talent pool than
"NestJS". Suppose you want to mix some Python or enterprise Java into your
application stack. What then?

ORMs are famously brittle and will often break if any data interaction to
the database does not itself go through the ORM, meaning you will be
writing and deploying programs to do simple tasks. They are slow,
discourage strong data modelling, interact with the database inefficiently,
and do not manage concurrent access to data well.
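
The concurrency point can be illustrated with a small sketch. It makes no claim about how any particular ORM behaves; it only contrasts a load-modify-save pattern with a single atomic statement. SQLite stands in for PostgreSQL, the `counter` table is hypothetical, and the two "clients" are simulated by interleaving statements on one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, n INTEGER NOT NULL)")
conn.execute("INSERT INTO counter VALUES (1, 0), (2, 0)")


def read(cid):
    """Return the current value of counter `cid`."""
    return conn.execute("SELECT n FROM counter WHERE id = ?", (cid,)).fetchone()[0]


# Load-modify-save: both clients read the same value before either writes.
a_seen = read(1)
b_seen = read(1)
conn.execute("UPDATE counter SET n = ? WHERE id = 1", (a_seen + 1,))
conn.execute("UPDATE counter SET n = ? WHERE id = 1", (b_seen + 1,))
lost_update_result = read(1)   # one of the two increments was overwritten

# Atomic statement: the database computes the new value itself.
conn.execute("UPDATE counter SET n = n + 1 WHERE id = 2")
conn.execute("UPDATE counter SET n = n + 1 WHERE id = 2")
atomic_result = read(2)        # both increments are kept
```

Application code can avoid the lost update as well, for example with row locks or a version column, but that has to be done deliberately rather than being the default.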

merlin