Moving to Postgresql database
Hello Experts,
If some teams are well versed with the Oracle database architecture and its
optimizers working and designing applications on top of this. Now moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy, isolation
level, performance which one should definitely be aware of?
On 2024-01-15 14:16, veem v wrote:
Hello Experts,
If some teams are well versed with the Oracle database architecture and
its
optimizers working and designing applications on top of this. Now
moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware
about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy,
isolation
level, performance which one should definitely be aware of?
Is this the kind of thing you're looking for?
Regards and best wishes,
Justin Clift
You can find a lot information in this link
To actually perform the migration, you can use some AWS tools such as SCT +
DMS:
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-an-oracle-database-to-aurora-postgresql-using-aws-dms-and-aws-sct.html
Postgresql is a relational database just like Oracle, so database designs
(OLTP vs OLAP) are the same.
However, implementations of partitioning, indexing, isolation, etc... Those
have the same concept but some implementation differences.
Isolation, for instance, in Oracle you have the UNDO tablespace. In
Postgresql you have a different method that uses MVCC (Multi Version
Concurrency Control), which keeps deleted records (dead tuples), until a
vacuum process (or autovacuum) runs.
Performance, you have different memory parameters to adjust, hints. For
execution plan management, you need to install the Postgresql extension
apg_mgt_plan. I would recommend the following extension:
- pgstattuple - row level statistics
- pg_stat_statements - planning and execution of statistics
- apg_mgmt_plan - for management of query plans
- pg_cron - to schedule jobs
Index fragmentation occurs in Postgresql as well, so be sure to monitor it
because you may need to run a REINDEX CONCURRENTLY (in oracle INDEX REBUILD
ONLINE) when high fragmentation is observed.
Indexes in Postgresql have the INCLUDE option (which is similar to SQL
Server), which are columns to append to the index so you dont need to scan
the table.
Since you are going to Aurora Postgresql, it has the Performance Insights
tool which provides statistics and queries run against your Aurora Cluster.
You can enable it for free to have 7 days of statistics. It comes handy if
you are used to AWR in Oracle to investigate issues.
On Mon, Jan 15, 2024 at 1:43 PM Justin Clift <justin@postgresql.org> wrote:
On 2024-01-15 14:16, veem v wrote:
Hello Experts,
If some teams are well versed with the Oracle database architecture and
its
optimizers working and designing applications on top of this. Now
moving
same team to work on AWS aurora postgresql databases design/development
projects. Is any key design/architectural changes should the app
development team or the database design team, should really aware
about, so
as to take right decision on any new development project in AWS aurora
postgresql database?
Or
Is there any list of differences(as compared to Oracle database) in key
concepts like for example basic design concepts, Normalization,
Partitioning, clustering, backup and recovery, Indexing strategy,
isolation
level, performance which one should definitely be aware of?Is this the kind of thing you're looking for?
Regards and best wishes,
Justin Clift
--
Best Regards,
_____________________________
Henrique S. G. Lima
Mobile: +1 (204) 951-6191
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com> wrote:
Is any key design/architectural changes should the app development team
[...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-dev perspective,
is the fact any failed statement fails the whole transaction, with ROLLBACK
as the only recourse.
So if you have any code that does
try-something-and-if-it-fails-do-something-else,
which works in most RDBMS AFAIK, then that's NOT going to work with
PostgreSQL.
I think there's an extension to add it (don't recall its name), but I'm
always surprise it's not built-in,
even just as an opt-in choice. But maybe AWS Aurora is different in that
regard? I'm talking OSS PostgreSQL.
OTOH, one the biggest benefits of PostgreSQL vs Oracle is transactional
DDLs.
But for many/most, DDLs are mostly fixed, so doesn't matter as much as it
does to us.
libpq is much better than OCI, although nowdays there's a better official C
API on top of OCI.
And the protocol being open and OSS, unlike Oracle SQL*NET, there are
alternate pure-JS,
pure-RUST, pure-GO, etc... implementations beside libpq to suit the
client-side dev-stack better.
Of course, Oracle is batteries-included, while PostgreSQL relies on its
vast extension ecosystem instead.
Except you're limited to the (small) subset that intersect the Cloud
vendors managed PostgreSQL offer, if
you must also support those...
Another major difference is that the catalogs (dictionaries) in PostgreSQL
are fully open (modulo pg_authid and a few others).
So there's no USER_, ALL_, DBA_ variants that hide what objects exist in
the cluster, depending on privileges, like there is in Oracle.
Knowing an object exists doesn't mean you can access it, but that's a no-no
for some security-wise.
If you care about LOBs, Oracle SecureFile are (way?) faster, last we tested
a long time ago.
OTOH, PostgreSQL bytea is much larger and convenient that Oracle's RAW (but
that's also very dated info).
These are the main ones that come to mind. I'm sure there are many others.
FWIW. --DD
On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:Is any key design/architectural changes should the app development
team [...], should really aware aboutHi. One of the biggest pitfall of PostgreSQL, from the app-dev perspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse.
https://www.postgresql.org/docs/current/sql-savepoint.html
"SAVEPOINT establishes a new savepoint within the current transaction.
A savepoint is a special mark inside a transaction that allows all
commands that are executed after it was established to be rolled back,
restoring the transaction state to what it was at the time of the
savepoint."
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:
Is any key design/architectural changes should the app development
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-devperspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse."SAVEPOINT establishes a new savepoint within the current transaction.
I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.
I really wish https://github.com/lzlabs/pg_statement_rollback was built-in.
Don't make it the default, for backward compatibility, but please let me
opt-in to it, w/o an extension.
https://buttondown.email/nelhage/archive/notes-on-some-postgresql-implementation-details/
https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful
https://about.gitlab.com/blog/2021/09/29/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/
On 1/16/24 09:04, Dominique Devienne wrote:
On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>
<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
Is any key design/architectural changes should the appdevelopment
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-devperspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse."SAVEPOINT establishes a new savepoint within the current transaction.
I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.
The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jan 16, 2024 at 6:10 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 1/16/24 09:04, Dominique Devienne wrote:
On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>
<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
Is any key design/architectural changes should the appdevelopment
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-devperspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse."SAVEPOINT establishes a new savepoint within the current
transaction.
I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extracommands.
The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.
Technically, it's still a ROLLBACK, so that is indeed the only recourse.
But sure, I take your point, you can emulate statement-level (implicit)
rollback
via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.
But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD
PS: I'd also be happy to hear why it's not, or won't be, on technical terms.
On Tue, Jan 16, 2024 at 12:10 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 1/16/24 09:04, Dominique Devienne wrote:
On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>
<mailto:veema0000@gmail.com <mailto:veema0000@gmail.com>>> wrote:
Is any key design/architectural changes should the appdevelopment
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-devperspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse."SAVEPOINT establishes a new savepoint within the current
transaction.
I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extracommands.
The point was that '... with ROLLBACK as the only recourse.' is not the
case. There is an alternative, whether you want to use it being a
separate question.
Performance-killing alternatives are not really altternatives.
On 1/16/24 09:23, Dominique Devienne wrote:
On Tue, Jan 16, 2024 at 6:10 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:
Technically, it's still a ROLLBACK, so that is indeed the only recourse.
Actually ROLLBACK TO:
https://www.postgresql.org/docs/current/sql-rollback-to.html
You would get a different outcome with just a plain ROLLBACK.
But sure, I take your point, you can emulate statement-level (implicit)
rollback
via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DDPS: I'd also be happy to hear why it's not, or won't be, on technical terms.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 1/16/24 09:59, Ron Johnson wrote:
Performance-killing alternatives are not really altternatives.
Unless it is the only one that solves your problem.
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jan 16, 2024 at 1:09 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 1/16/24 09:59, Ron Johnson wrote:
Performance-killing alternatives are not really altternatives.
Unless it is the only one that solves your problem.
Amputating one head cures one's migraines, but nobody thinks it's a
viable solution to the problem of migraines.
Show quoted text
--
Adrian Klaver
adrian.klaver@aklaver.com
On 1/16/24 11:59 AM, Ron Johnson wrote:
> Hi. One of the biggest pitfall of PostgreSQL, from the app-dev
perspective,
> is the fact any failed statement fails the wholetransaction, with
> ROLLBACK as the only recourse.
"SAVEPOINT establishes a new savepoint within the current
transaction.
I wish it was that easy.
I've been scared away from using them, after reading a fewarticles...
Also, that incurs extra round trips to the server, from the extra
commands.
The point was that '... with ROLLBACK as the only recourse.' is not
the
case. There is an alternative, whether you want to use it being a
separate question.Performance-killing alternatives are not really altternatives.
What's the actual performance issue here?
I'm also wondering what the use case for constantly retrying errors is.
--
Jim Nasby, Data Architect, Austin TX
Dominique Devienne <ddevienne@gmail.com> writes:
But sure, I take your point, you can emulate statement-level (implicit)
rollback via an explicit SAVEPOINT, and ROLLBACK to the savepoint instead.
But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DD
PS: I'd also be happy to hear why it's not, or won't be, on technical terms.
The reason it's not going to happen is that the community (or at least
the more senior developers) still remembers what happened the last
time we tried it.
We did implement server-side auto-rollback years ago in PG 7.3,
and it was enough of a disaster that we took it out again in 7.4.
The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.
regards, tom lane
On Tue, Jan 16, 2024 at 10:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dominique Devienne <ddevienne@gmail.com> writes:
But sure, I take your point, you can emulate statement-level (implicit)
rollback via an explicit SAVEPOINT, and ROLLBACK to the savepointinstead.
But my point remains, that something like what that extension does should
be an option of PostgreSQL itself, not an extension. --DDPS: I'd also be happy to hear why it's not, or won't be, on technical
terms.
The reason it's not going to happen is that the community (or at least
the more senior developers) still remembers what happened the last
time we tried it.We did implement server-side auto-rollback years ago in PG 7.3,
and it was enough of a disaster that we took it out again in 7.4.
Thanks Tom. That's insightful, and obviously something I didn't know.
The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.
OK. That speaks against making it the default for sure.
But what if the client-code explicitly opts-in to that mode/switch?
Is the pg_statement_rollback technically wrong? Can't what it does be done
better and more efficiently if it was in the core itself? Is it a lot of
code?
Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.
This creates frictions for ports to PostrgeSQL, and cross-RDBMBS apps in
general.
Thus if it was at least possible to opt-in to it, that would be a great
advance IMHO.
Client backend processes are per-user-per-DB. Would such a switch be
applied to the DB?
DBs are typically tailored to specific applications, this something like
this would work.
Thus all backends accessing a DB that opted-in to
statement-implicit-rollback would use it (by default)?
Or could that be decided on a per-client-backend basis?
I know the discussion will probably stop here. It's unlikely to happen, I
get that.
I think that's a pity, especially since there's a proof of concept, which I
assume if technically valid.
Thanks, --DD
Dominique Devienne <ddevienne@gmail.com> writes:
On Tue, Jan 16, 2024 at 10:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The problem is that now you have a switch somewhere (whether a GUC
or something else, still a switch) that fundamentally changes the
transactional semantics seen by applications. Run an application
in the wrong mode and you have a broken app. Worse, there is an
awful lot of client-side code that now has to cope with both
behaviors. We thought that would be okay ... well, it wasn't.
It was a mess. It would be a bigger mess now if we were to try it
again, because there would be even more broken client code.
OK. That speaks against making it the default for sure.
But what if the client-code explicitly opts-in to that mode/switch?
It wasn't default in 7.3, either.
The key point here is that "the client code" isn't monolithic:
there are frequently 3 or 4 layers involved, all maintained by
different sets of people. If any one of them chooses to flip the
switch, all of them have to cope with the results (possibly without
even having observed the change). Like I said, it was a mess.
Perhaps we could have got away with changing this back around 1997.
By the time we tried (7.3 was released in 2002), it was already
too late because of the amount of client-side code that needed to
change and couldn't change in a timely fashion. Twenty years
later, that situation has to be many times worse.
Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.
I'm well aware of that. It doesn't matter.
regards, tom lane
On 1/17/24 5:31 AM, Dominique Devienne wrote:
Is the pg_statement_rollback technically wrong? Can't what it does be done
better and more efficiently if it was in the core itself? Is it a lot of
code?
I took a quick look at that extension, and it's doing pretty much what
you'd do if this was baked into Postgres. The performance penaltiy that
you'll suffer here is that you're going to assign a new transaction ID
for every statement, which can be significantly more expensive than
using one XID per BEGIN/COMMIT (depending of course on how many
statements you have inside a BEGIN/COMMIT).
By the way, you might take a look at Babelfish[1] since it has to solve
this issue as well due to some of the error handling modes that T-SQL
supports.
Basically implicit-statement-level-rollback is the norm, AFAIK, and
PostgreSQL is the exception here.
I'm really curious what other databases you've seen that have this
behavior, because the only time I've ever seen it was T-SQL. Way back in
Sybase 11 days it was the only behavior you had, but at some point SQL
Server (and maybe Sybase) added additional options.
Frankly, this paradigm has always seemed completely broken to me. The
entire point of having transactions is so you have all-or-nothing
behavior: either everything works or the transaction aborts. I realize
that automatically rolling a statement back doesn't technically violate
ACID, but IMO it definitely violates the spirit of it. While there are
certainly *some* legitimate uses for rolling a statement back on error,
in 30 years I've seen maybe one scenario where you'd want to roll a
statement back on *any* error, and even then it was only on a specific
statement - not every statement that might get sent to the server.
1: https://babelfishpg.org/
--
Jim Nasby, Data Architect, Austin TX
On Tue, Jan 16, 2024 at 11:05 AM Dominique Devienne <ddevienne@gmail.com>
wrote:
On Tue, Jan 16, 2024 at 5:07 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:On 1/16/24 00:06, Dominique Devienne wrote:
On Mon, Jan 15, 2024 at 5:17 AM veem v <veema0000@gmail.com
<mailto:veema0000@gmail.com>> wrote:
Is any key design/architectural changes should the app development
team [...], should really aware about
Hi. One of the biggest pitfall of PostgreSQL, from the app-devperspective,
is the fact any failed statement fails the whole transaction, with
ROLLBACK as the only recourse."SAVEPOINT establishes a new savepoint within the current transaction.
I wish it was that easy.
I've been scared away from using them, after reading a few articles...
Also, that incurs extra round trips to the server, from the extra commands.
Er, *every* statement incurs a round trip to the server. Candidly, I'm
not sure your point here is entirely thought through, unless you are taking
it to mean when writing ad hoc sql written to the console rather than
generally. The savepoint infrastructure directly implements transaction
control, and does it really well. It's both classic, broadly implemented,
and standard.
If you are concerned about round trips in general, you'd want to move to a
function or a procedure, where you have classic exception handling, if/else
blocks, etc, and there are no round trips. postgres really rewards mastery
of server side development practices.
merlin