Moving to Postgresql database

Started by veem vabout 2 years ago18 messagesgeneral
Jump to latest
#1veem v
veema0000@gmail.com

Hello Experts,
If some teams are well versed with the Oracle database architecture and its
optimizers working and designing applications on top of this. Now moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy, isolation
level, performance which one should definitely be aware of?

#2Justin Clift
justin@postgresql.org
In reply to: veem v (#1)
Re: Moving to Postgresql database

On 2024-01-15 14:16, veem v wrote:

Hello Experts,
If some teams are well versed with the Oracle database architecture and
its
optimizers working and designing applications on top of this. Now
moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware
about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy,
isolation
level, performance which one should definitely be aware of?

Is this the kind of thing you're looking for?

https://www.enterprisedb.com/blog/the-complete-oracle-to-postgresql-migration-guide-tutorial-move-convert-database-oracle-alternative

Regards and best wishes,

Justin Clift

#3Henrique Lima
henrique.sglima@gmail.com
In reply to: Justin Clift (#2)
Re: Moving to Postgresql database

You can find a lot information in this link

https://docs.aws.amazon.com/dms/latest/oracle-to-aurora-postgresql-migration-playbook/chap-oracle-aurora-pg.html

To actually perform the migration, you can use some AWS tools such as SCT +
DMS:
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-an-oracle-database-to-aurora-postgresql-using-aws-dms-and-aws-sct.html

Postgresql is a relational database just like Oracle, so database designs
(OLTP vs OLAP) are the same.
However, implementations of partitioning, indexing, isolation, etc... Those
have the same concept but some implementation differences.

Isolation, for instance, in Oracle you have the UNDO tablespace. In
Postgresql you have a different method that uses MVCC (Multi Version
Concurrency Control), which keeps deleted records (dead tuples), until a
vacuum process (or autovacuum) runs.
Performance, you have different memory parameters to adjust, hints. For
execution plan management, you need to install the Postgresql extension
apg_mgt_plan. I would recommend the following extension:

- pgstattuple - row level statistics
- pg_stat_statements - planning and execution of statistics
- apg_mgmt_plan - for management of query plans
- pg_cron - to schedule jobs

Index fragmentation occurs in Postgresql as well, so be sure to monitor it
because you may need to run a REINDEX CONCURRENTLY (in oracle INDEX REBUILD
ONLINE) when high fragmentation is observed.
Indexes in Postgresql have the INCLUDE option (which is similar to SQL
Server), which are columns to append to the index so you dont need to scan
the table.

Since you are going to Aurora Postgresql, it has the Performance Insights
tool which provides statistics and queries run against your Aurora Cluster.
You can enable it for free to have 7 days of statistics. It comes handy if
you are used to AWR in Oracle to investigate issues.

On Mon, Jan 15, 2024 at 1:43 PM Justin Clift <justin@postgresql.org> wrote:

On 2024-01-15 14:16, veem v wrote:

Hello Experts,
If some teams are well versed with the Oracle database architecture and
its
optimizers working and designing applications on top of this. Now
moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware
about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy,
isolation
level, performance which one should definitely be aware of?

Is this the kind of thing you're looking for?

https://www.enterprisedb.com/blog/the-complete-oracle-to-postgresql-migration-guide-tutorial-move-convert-database-oracle-alternative

Regards and best wishes,

Justin Clift

--
Best Regards,
_____________________________
Henrique S. G. Lima
Mobile: +1 (204) 951-6191

#4Dominique Devienne
ddevienne@gmail.com
In reply to: veem v (#1)
Re: Moving to Postgresql database

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com> wrote:

Is any key design/architectural changes should the app development team
[...], should really aware about

Hi. One of the biggest pitfall of PostgreSQL, from the app-dev perspective,
is the fact any failed statement fails the whole transaction, with ROLLBACK
as the only recourse.

So if you have any code that does
try-something-and-if-it-fails-do-something-else,
which works in most RDBMS AFAIK, then that's NOT going to work with
PostgreSQL.

I think there's an extension to add it (don't recall its name), but I'm
always surprise it's not built-in,
even just as an opt-in choice. But maybe AWS Aurora is different in that
regard? I'm talking OSS PostgreSQL.

OTOH, one the biggest benefits of PostgreSQL vs Oracle is transactional
DDLs.
But for many/most, DDLs are mostly fixed, so doesn't matter as much as it
does to us.

libpq is much better than OCI, although nowdays there's a better official C
API on top of OCI.
And the protocol being open and OSS, unlike Oracle SQL*NET, there are
alternate pure-JS,
pure-RUST, pure-GO, etc... implementations beside libpq to suit the
client-side dev-stack better.

Of course, Oracle is batteries-included, while PostgreSQL relies on its
vast extension ecosystem instead.
Except you're limited to the (small) subset that intersect the Cloud
vendors managed PostgreSQL offer, if
you must also support those...

Another major difference is that the catalogs (dictionaries) in PostgreSQL
are fully open (modulo pg_authid and a few others).
So there's no USER_, ALL_, DBA_ variants that hide what objects exist in
the cluster, depending on privileges, like there is in Oracle.
Knowing an object exists doesn't mean you can access it, but that's a no-no
for some security-wise.

If you care about LOBs, Oracle SecureFile are (way?) faster, last we tested
a long time ago.
OTOH, PostgreSQL bytea is much larger and convenient that Oracle's RAW (but
that's also very dated info).

These are the main ones that come to mind. I'm sure there are many others.
FWIW. --DD

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dominique Devienne (#4)
Re: Moving to Postgresql database

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:

Is any key design/architectural changes should the app development
team [...], should really aware about

Hi. One of the biggest pitfall of PostgreSQL, from the app-dev perspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

https://www.postgresql.org/docs/current/sql-savepoint.html

"SAVEPOINT establishes a new savepoint within the current transaction.

A savepoint is a special mark inside a transaction that allows all
commands that are executed after it was established to be rolled back,
restoring the transaction state to what it was at the time of the
savepoint."

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Dominique Devienne
ddevienne@gmail.com
In reply to: Adrian Klaver (#5)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:
Is any key design/architectural changes should the app development
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev

perspective,

is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

"SAVEPOINT establishes a new savepoint within the current transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.

I really wish https://github.com/lzlabs/pg_statement_rollback was built-in.
Don't make it the default, for backward compatibility, but please let me
opt-in to it, w/o an extension.

https://buttondown.email/nelhage/archive/notes-on-some-postgresql-implementation-details/
https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful
https://about.gitlab.com/blog/2021/09/29/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dominique Devienne (#6)
Re: Moving to Postgresql database

On 1/16/24 09:04, Dominique Devienne wrote:

On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com

<mailto:veema0000@gmail.com>

<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
     Is any key design/architectural changes should the app

development

     team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev

perspective,

is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

"SAVEPOINT establishes a new savepoint within the current transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.

The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Dominique Devienne
ddevienne@gmail.com
In reply to: Adrian Klaver (#7)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 6:10 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 1/16/24 09:04, Dominique Devienne wrote:

On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com

<mailto:veema0000@gmail.com>

<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
Is any key design/architectural changes should the app

development

team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev

perspective,

is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

"SAVEPOINT establishes a new savepoint within the current

transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra

commands.

The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.

Technically, it's still a ROLLBACK, so that is indeed the only recourse.
But sure, I take your point, you can emulate statement-level (implicit)
rollback
via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.

But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD

PS: I'd also be happy to hear why it's not, or won't be, on technical terms.

#9Ron
ronljohnsonjr@gmail.com
In reply to: Adrian Klaver (#7)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 12:10 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 1/16/24 09:04, Dominique Devienne wrote:

On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com

<mailto:veema0000@gmail.com>

<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
Is any key design/architectural changes should the app

development

team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev

perspective,

is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

"SAVEPOINT establishes a new savepoint within the current

transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra

commands.

The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.

Performance-killing alternatives are not really altternatives.

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Dominique Devienne (#8)
Re: Moving to Postgresql database

On 1/16/24 09:23, Dominique Devienne wrote:

On Tue, Jan 16, 2024 at 6:10 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

Technically, it's still a ROLLBACK, so that is indeed the only recourse.

Actually ROLLBACK TO:

https://www.postgresql.org/docs/current/sql-rollback-to.html

You would get a different outcome with just a plain ROLLBACK.

But sure, I take your point, you can emulate statement-level (implicit)
rollback
via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.

But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD

PS: I'd also be happy to hear why it's not, or won't be, on technical terms.

--
Adrian Klaver
adrian.klaver@aklaver.com

#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ron (#9)
Re: Moving to Postgresql database

On 1/16/24 09:59, Ron Johnson wrote:

Performance-killing alternatives are not really altternatives.

Unless it is the only one that solves your problem.

--
Adrian Klaver
adrian.klaver@aklaver.com

#12Ron
ronljohnsonjr@gmail.com
In reply to: Adrian Klaver (#11)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 1:09 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 1/16/24 09:59, Ron Johnson wrote:

Performance-killing alternatives are not really altternatives.

Unless it is the only one that solves your problem.

Amputating one head cures one's migraines, but nobody thinks it's a
viable solution to the problem of migraines.

Show quoted text

--
Adrian Klaver
adrian.klaver@aklaver.com

#13Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Ron (#9)
Re: Moving to Postgresql database

On 1/16/24 11:59 AM, Ron Johnson wrote:

      > Hi. One of the biggest pitfall of PostgreSQL, from the app-dev
     perspective,
      > is the fact any failed statement fails the whole

transaction, with

      > ROLLBACK as the only recourse.

     "SAVEPOINT establishes a new savepoint within the current

transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few

articles...

Also, that incurs extra round trips to the server, from the extra

commands.

The point was that '...  with ROLLBACK as the only recourse.' is not
the
case. There is an alternative, whether you want to use it being a
separate question.

Performance-killing alternatives are not really altternatives.

What's the actual performance issue here?

I'm also wondering what the use case for constantly retrying errors is.
--
Jim Nasby, Data Architect, Austin TX

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dominique Devienne (#8)
Re: Moving to Postgresql database

Dominique Devienne <ddevienne@gmail.com> writes:

But sure, I take your point, you can emulate statement-level (implicit)
rollback via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.

But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD

PS: I'd also be happy to hear why it's not, or won't be, on technical terms.

The reason it's not going to happen is that the community (or at least
the more senior developers) still remembers what happened the last
time we tried it.

We did implement server-side auto-rollback years ago in PG 7.3,
and it was enough of a disaster that we took it out again in 7.4.
The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.

regards, tom lane

#15Dominique Devienne
ddevienne@gmail.com
In reply to: Tom Lane (#14)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 10:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dominique Devienne <ddevienne@gmail.com> writes:

But sure, I take your point, you can emulate statement-level (implicit)
rollback via an explicit SAVEPOINT, and ROLLBACK to the savepoint

instead.

But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD

PS: I'd also be happy to hear why it's not, or won't be, on technical

terms.

The reason it's not going to happen is that the community (or at least
the more senior developers) still remembers what happened the last
time we tried it.

We did implement server-side auto-rollback years ago in PG 7.3,
and it was enough of a disaster that we took it out again in 7.4.

Thanks Tom. That's insightful, and obviously something I didn't know.

The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.

OK. That speaks against making it the default for sure.
But what if the client-code explicitly opts-in to that mode/switch?

Is the pg_statement_rollback technically wrong? Can't what it does be done
better and more efficiently if it was in the core itself? Is it a lot of
code?

Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.
This creates frictions for ports to PostrgeSQL, and cross-RDBMBS apps in
general.
Thus if it was at least possible to opt-in to it, that would be a great
advance IMHO.

Client backend processes are per-user-per-DB. Would such a switch be
applied to the DB?
DBs are typically tailored to specific applications, this something like
this would work.
Thus all backends accessing a DB that opted-in to
statement-implicit-rollback would use it (by default)?
Or could that be decided on a per-client-backend basis?

I know the discussion will probably stop here. It's unlikely to happen, I
get that.
I think that's a pity, especially since there's a proof of concept, which I
assume if technically valid.

Thanks, --DD

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dominique Devienne (#15)
Re: Moving to Postgresql database

Dominique Devienne <ddevienne@gmail.com> writes:

On Tue, Jan 16, 2024 at 10:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.

OK. That speaks against making it the default for sure.
But what if the client-code explicitly opts-in to that mode/switch?

It wasn't default in 7.3, either.

The key point here is that "the client code" isn't monolithic:
there are frequently 3 or 4 layers involved, all maintained by
different sets of people. If any one of them chooses to flip the
switch, all of them have to cope with the results (possibly without
even having observed the change). Like I said, it was a mess.

Perhaps we could have got away with changing this back around 1997.
By the time we tried (7.3 was released in 2002), it was already
too late because of the amount of client-side code that needed to
change and couldn't change in a timely fashion. Twenty years
later, that situation has to be many times worse.

Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.

I'm well aware of that. It doesn't matter.

regards, tom lane

#17Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Dominique Devienne (#15)
Re: Moving to Postgresql database

On 1/17/24 5:31 AM, Dominique Devienne wrote:

Is the pg_statement_rollback technically wrong? Can't what it does be done
better and more efficiently if it was in the core itself? Is it a lot of
code?

I took a quick look at that extension, and it's doing pretty much what
you'd do if this was baked into Postgres. The performance penaltiy that
you'll suffer here is that you're going to assign a new transaction ID
for every statement, which can be significantly more expensive than
using one XID per BEGIN/COMMIT (depending of course on how many
statements you have inside a BEGIN/COMMIT).

By the way, you might take a look at Babelfish[1] since it has to solve
this issue as well due to some of the error handling modes that T-SQL
supports.

Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.

I'm really curious what other databases you've seen that have this
behavior, because the only time I've ever seen it was T-SQL. Way back in
Sybase 11 days it was the only behavior you had, but at some point SQL
Server (and maybe Sybase) added additional options.

Frankly, this paradigm has always seemed completely broken to me. The
entire point of having transactions is so you have all-or-nothing
behavior: either everything works or the transaction aborts. I realize
that automatically rolling a statement back doesn't technically violate
ACID, but IMO it definitely violates the spirit of it. While there are
certainly *some* legitimate uses for rolling a statement back on error,
in 30 years I've seen maybe one scenario where you'd want to roll a
statement back on *any* error, and even then it was only on a specific
statement - not every statement that might get sent to the server.

1: https://babelfishpg.org/
--
Jim Nasby, Data Architect, Austin TX

#18Merlin Moncure
mmoncure@gmail.com
In reply to: Dominique Devienne (#6)
Re: Moving to Postgresql database

On Tue, Jan 16, 2024 at 11:05 AM Dominique Devienne <ddevienne@gmail.com>
wrote:

On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 1/16/24 00:06, Dominique Devienne wrote:

On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:
Is any key design/architectural changes should the app development
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev

perspective,

is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.

"SAVEPOINT establishes a new savepoint within the current transaction.

I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.

Er, *every* statement incurs a round trip to the server. Candidly, I'm
not sure your point here is entirely thought through, unless you are taking
it to mean when writing ad hoc sql written to the console rather than
generally. The savepoint infrastructure directly implements transaction
control, and does it really well. It's both classic, broadly implemented,
and standard.

If you are concerned about round trips in general, you'd want to move to a
function or a procedure, where you have classic exception handling, if/else
blocks, etc, and there are no round trips. postgres really rewards mastery
of server side development practices.

merlin

Show quoted text