Re: Question regarding how databases support atomicity

Started by Siddharth Jainalmost 2 years ago8 messagesgeneral
Jump to latest
#1Siddharth Jain
siddhsql@gmail.com

On Fri, May 3, 2024 at 8:00 PM Siddharth Jain <siddhsql@gmail.com> wrote:

Show quoted text

I am trying to sharpen my understanding of databases. Let's say there is
an operation foo as part of the public API that internally translates to
more than 1 operation - I am sure there are examples like this in postgres.
So to do foo we have to do following in order in all or none fashion:

1. Step 1
2. Step 2
3. Step 3

The way I understand this is that if there is a failure in-between, we
start undoing and reverting the previous operations one by one. But what if
there is a failure and we are not able to revert an operation. How is that
situation handled? e.g., something failed when we tried to do Step 3. now
we revert Step 2 and succeed. but when we try to revert step 1 we fail.
what happens now? To me, it seems its impossible to guarantee true
atomicity in general.

S.

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Siddharth Jain (#1)

On Friday, May 3, 2024, Siddharth Jain <siddhsql@gmail.com> wrote:

On Fri, May 3, 2024 at 8:00 PM Siddharth Jain <siddhsql@gmail.com> wrote:

I am trying to sharpen my understanding of databases. Let's say there is
an operation foo as part of the public API that internally translates to
more than 1 operation - I am sure there are examples like this in postgres.
So to do foo we have to do following in order in all or none fashion:

1. Step 1
2. Step 2
3. Step 3

The way I understand this is that if there is a failure in-between, we
start undoing and reverting the previous operations one by one.

Not in PostgreSQL. All work performed is considered provisional until a
commit succeeds. At which point all provisional work, which had been
tagged with the same transaction identifier, becomes reality to the rest of
the system, by virtue of marking the transaction live. If the commit never
happens, either because of error, rollback, or session end, the transaction
ends up being left unalive and eventually is cleaned up.

You need to ensure a “begin” happens before Step 1 and a “commit” after
Step 3.

David J.

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: David G. Johnston (#2)

On Friday, May 3, 2024, David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Friday, May 3, 2024, Siddharth Jain <siddhsql@gmail.com> wrote:

On Fri, May 3, 2024 at 8:00 PM Siddharth Jain <siddhsql@gmail.com> wrote:

The way I understand this is that if there is a failure in-between, we
start undoing and reverting the previous operations one by one.

You need to ensure a “begin” happens before Step 1 and a “commit” after
Step 3.

As described in our documentation.

https://www.postgresql.org/docs/current/tutorial-transactions.html

David J.

#4Christophe Pettus
xof@thebuild.com
In reply to: Siddharth Jain (#1)

On May 3, 2024, at 20:02, Siddharth Jain <siddhsql@gmail.com> wrote:

The way I understand this is that if there is a failure in-between, we start undoing and reverting the previous operations one by one. But what if there is a failure and we are not able to revert an operation. How is that situation handled? e.g., something failed when we tried to do Step 3. now we revert Step 2 and succeed. but when we try to revert step 1 we fail. what happens now? To me, it seems its impossible to guarantee true atomicity in general.

PostgreSQL does not "undo" operations as such. When modifications are made to the database, those modifications (inserts, updates, deletes) are marked with the ID of the transaction that made them. A COMMIT or ROLLBACK in PostgreSQL just notes if those modifications are now "permanent" (if the transaction committed) or "invisible" (if the transaction rolled back). This technique in general is called Multi-Version Concurrency Control. Here's a good presentation that describes how it works in PostgreSQL:

https://momjian.us/main/writings/pgsql/mvcc.pdf

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#2)

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Friday, May 3, 2024, Siddharth Jain <siddhsql@gmail.com> wrote:

The way I understand this is that if there is a failure in-between, we

start undoing and reverting the previous operations one by one.

Not in PostgreSQL. All work performed is considered provisional until a
commit succeeds. At which point all provisional work, which had been
tagged with the same transaction identifier, becomes reality to the rest of
the system, by virtue of marking the transaction live.

Right. We don't use UNDO; instead, we use multiple versions of
database rows (MVCC). A transaction doesn't modify the contents
of existing rows, but just marks them as provisionally outdated, and
then inserts new row versions that are marked provisionally inserted.
Other transactions ignore the outdate markings and the uncommitted new
rows, until the transaction commits, at which time the new versions
become live and the old ones become dead. If the transaction never
does commit -- either through ROLLBACK or a crash -- then the old row
versions remain live and the new ones are dead. In either case, we
don't have a consistency or correctness problem, but we do have dead
rows that must eventually get vacuumed away to prevent indefinite
storage bloat. That can be done by background housekeeping processes
though (a/k/a autovacuum).

I believe Oracle, for one, actually does use UNDO. I don't know
what they do about failure-to-UNDO.

regards, tom lane

#6Siddharth Jain
siddhsql@gmail.com
In reply to: Tom Lane (#5)

Thanks All for the kind responses. I understand how MVCC takes care of
atomicity for updates to rows. I was developing a project where lets say
data for each table is stored in its own folder together with metadata (we
are not talking postgres now). So if I have two tables A and B I have a
folder structure like:
A
\_ metadata.json
B
\_ metadata.json
Now if I want to rename a table, I need to move the folder and also update
metadata accordingly. These are two separate operations but need to be done
atomically - all or none. in this case it is possible that we succeed in
renaming the folder but fail to update metadata for whatever reason. then
if we try to undo the folder rename we get another failure for whatever
reason. how to deal with such scenarios? are there no such scenarios in
postgres?

On Fri, May 3, 2024 at 8:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Friday, May 3, 2024, Siddharth Jain <siddhsql@gmail.com> wrote:

The way I understand this is that if there is a failure in-between, we

start undoing and reverting the previous operations one by one.

Not in PostgreSQL. All work performed is considered provisional until a
commit succeeds. At which point all provisional work, which had been
tagged with the same transaction identifier, becomes reality to the rest

of

the system, by virtue of marking the transaction live.

Right. We don't use UNDO; instead, we use multiple versions of
database rows (MVCC). A transaction doesn't modify the contents
of existing rows, but just marks them as provisionally outdated, and
then inserts new row versions that are marked provisionally inserted.
Other transactions ignore the outdate markings and the uncommitted new
rows, until the transaction commits, at which time the new versions
become live and the old ones become dead. If the transaction never
does commit -- either through ROLLBACK or a crash -- then the old row
versions remain live and the new ones are dead. In either case, we
don't have a consistency or correctness problem, but we do have dead
rows that must eventually get vacuumed away to prevent indefinite
storage bloat. That can be done by background housekeeping processes
though (a/k/a autovacuum).

I believe Oracle, for one, actually does use UNDO. I don't know
what they do about failure-to-UNDO.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Siddharth Jain (#6)

Siddharth Jain <siddhsql@gmail.com> writes:

Thanks All for the kind responses. I understand how MVCC takes care of
atomicity for updates to rows. I was developing a project where lets say
data for each table is stored in its own folder together with metadata (we
are not talking postgres now). So if I have two tables A and B I have a
folder structure like:
A
\_ metadata.json
B
\_ metadata.json
Now if I want to rename a table, I need to move the folder and also update
metadata accordingly. These are two separate operations but need to be done
atomically - all or none. in this case it is possible that we succeed in
renaming the folder but fail to update metadata for whatever reason. then
if we try to undo the folder rename we get another failure for whatever
reason. how to deal with such scenarios? are there no such scenarios in
postgres?

There aren't, because we don't assume that we can apply filesystem
metadata changes like that. That's why our on-disk files are not
named after their tables ;-) --- that way they never need renaming.

https://www.postgresql.org/docs/current/storage-file-layout.html

There may be filesystems out there that would give you guarantees in
this area, but it'd require non-POSIX and hence non-portable system
calls.

regards, tom lane

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Siddharth Jain (#6)

On 5/7/24 9:48 AM, Siddharth Jain wrote:

Thanks All for the kind responses. I understand how MVCC takes care of
atomicity for updates to rows. I was developing a project where lets say
data for each table is stored in its own folder together with metadata
(we are not talking postgres now). So if I have two tables A and B I
have a folder structure like:
A
\_ metadata.json
B
\_ metadata.json
Now if I want to rename a table, I need to move the folder and also
update metadata accordingly. These are two separate operations but need
to be done atomically - all or none. in this case it is possible that we
succeed in renaming the folder but fail to update metadata for whatever
reason. then if we try to undo the folder rename we get another failure
for whatever reason. how to deal with such scenarios? are there no such
scenarios in postgres?

The only thing I can think of is creating a function in one of the
untrusted languages plpython3u or plperlu to do the renaming. Then in
say plpython3u case wrap the actions in try/except block. On a failure
take the appropriate undo action.

On Fri, May 3, 2024 at 8:29 PM Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>> wrote:

"David G. Johnston" <david.g.johnston@gmail.com
<mailto:david.g.johnston@gmail.com>> writes:

On Friday, May 3, 2024, Siddharth Jain <siddhsql@gmail.com

<mailto:siddhsql@gmail.com>> wrote:

The way I understand this is that if there is a failure

in-between, we

start undoing and reverting the previous operations one by one.

Not in PostgreSQL.  All work performed is considered provisional

until a

commit succeeds.  At which point all provisional work, which had been
tagged with the same transaction identifier, becomes reality to

the rest of

the system, by virtue of marking the transaction live.

Right.  We don't use UNDO; instead, we use multiple versions of
database rows (MVCC).  A transaction doesn't modify the contents
of existing rows, but just marks them as provisionally outdated, and
then inserts new row versions that are marked provisionally inserted.
Other transactions ignore the outdate markings and the uncommitted new
rows, until the transaction commits, at which time the new versions
become live and the old ones become dead.  If the transaction never
does commit -- either through ROLLBACK or a crash -- then the old row
versions remain live and the new ones are dead.  In either case, we
don't have a consistency or correctness problem, but we do have dead
rows that must eventually get vacuumed away to prevent indefinite
storage bloat.  That can be done by background housekeeping processes
though (a/k/a autovacuum).

I believe Oracle, for one, actually does use UNDO.  I don't know
what they do about failure-to-UNDO.

                        regards, tom lane

--
Adrian Klaver
adrian.klaver@aklaver.com