COMMIT IN STORED PROCEDURE WHILE IN A LOOP

Started by Ravi Krishnaover 3 years ago16 messagesgeneral

s_ravikrishna@aol.com

over 3 years ago

AWS Aurora based on PG 13

I am writing a sproc to copy a schema into another. Here is the relevant portion of the code.
Basically I want to commit after every table is created. In big schemas with hundreds of table I do not want to run entire operation in one transaction.

I am getting error at COMMIT -> cannot commit while a subtransaction is active.
Is commit not possible in a loop

BEGIN

FOR object IN

SELECT table_name::text FROM information_schema.TABLES WHERE table_schema = source_schema

AND table_type = 'BASE TABLE'

LOOP

buffer := dest_schema || '.' || object;

BEGIN

sql_stmt := 'CREATE TABLE ' || buffer || ' (LIKE ' || source_schema || '.' || object;

sql_stmt := sql_stmt || ' INCLUDING COMMENTS INCLUDING IDENTITY INCLUDING CONSTRAINTS INCLUDING DEFAULTS INCLUDING GENERATED INCLUDING INDEXES)' ;

--RAISE NOTICE '%' , sql_stmt ;

execute sql_stmt ;

COMMIT;

sql_stmt := 'INSERT INTO ' || buffer || ' OVERRIDING SYSTEM VALUE SELECT * FROM ' || source_schema || '.' || object ;

--RAISE NOTICE '%' , sql_stmt ;

execute sql_stmt ;

COMMIT;

EXCEPTION

WHEN OTHERS THEN

GET STACKED DIAGNOSTICS text_var1 = MESSAGE_TEXT,

text_var2 = PG_EXCEPTION_DETAIL,

text_var3 = PG_EXCEPTION_HINT;

print_msg := 'ERROR:->' || text_var1 || '|' || text_var2 || '|' || text_var3 ;

RAISE NOTICE '%' , print_msg ;

END ;

END LOOP;

END;

Import Notes

Reference msg id not found: 353091661.109766.1666124045054.ref@mail.yahoo.com

Christophe Pettus

xof@thebuild.com

over 3 years ago

In reply to: Ravi Krishna (#1)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Oct 18, 2022, at 13:14, Ravi Krishna <s_ravikrishna@aol.com> wrote:

I am getting error at COMMIT -> cannot commit while a subtransaction is active.
Is commit not possible in a loop

You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for the duration of the BEGIN / END.

Bryn Llewellyn

bryn@yugabyte.com

over 3 years ago

In reply to: Christophe Pettus (#2)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

xof@thebuild.com wrote:

s_ravikrishna@aol.com wrote:

I am getting error at COMMIT -> cannot commit while a subtransaction is active...

You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for the duration of the BEGIN / END.

This surprised me when I first started to use PG (after all those years with ORCL). There are some use cases where this limitation is a nuisance. For example, if you want to implement the famous retry loop (with exponential backoff) for an error that is provoked at commit time... well, you have to do it in client-side code.

Could the limitation be lifted by making tractable internal implementation changes? Or is it rooted in profoundly deep features of the architecture—meaning that it could never be lifted?

Ravi Krishna

s_ravikrishna@aol.com

over 3 years ago

In reply to: Christophe Pettus (#2)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

You can commit in a loop, but not in BEGIN / END block that has an exception handler:> that creates a subtransaction for the duration of the BEGIN / END.

The reason I have to deal with error exception is that I want to ignore failure on a table and move on to next table.
I thought I can trick it by moving the part of the code which executes dynamic sql and has exception handler in it,to a separate procedure. This way I can keep the main procedure without exception handler. It failed too withthe same error. So that means this restriction is not at lexical level, but at execution level.This is so disappointing. It seems there is no way I can copy each table and commit it individually and also capture error.

Christophe Pettus

xof@thebuild.com

over 3 years ago

In reply to: Ravi Krishna (#4)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Oct 18, 2022, at 14:29, Ravi Krishna <s_ravikrishna@aol.com> wrote:

You can commit in a loop, but not in BEGIN / END block that has an exception handler:
that creates a subtransaction for the duration of the BEGIN / END.

The reason I have to deal with error exception is that I want to ignore failure on a table and move on to next table.

Rather than have a loop inside the BEGIN / END, you could put the BEGIN EXCEPTION END inside the loop, catch the error, store the important parts of the exception in a variable, and then do the COMMIT after the END statement but before the next iteration of the loop. A bit messier, but it gets the job done.

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Bryn Llewellyn (#3)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

Bryn Llewellyn <bryn@yugabyte.com> writes:

xof@thebuild.com wrote:

You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for the duration of the BEGIN / END.

This surprised me when I first started to use PG (after all those years
with ORCL).

Really? BEGIN with an exception block is a subtransaction because it's
defined to roll back to the database state as of the start of the block
if an exception occurs. COMMIT in the middle fundamentally conflicts
with that, I should think. Does Oracle interpret that differently?

regards, tom lane

Christophe Pettus

xof@thebuild.com

over 3 years ago

In reply to: Bryn Llewellyn (#3)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Oct 18, 2022, at 14:15, Bryn Llewellyn <bryn@yugabyte.com> wrote:
Could the limitation be lifted by making tractable internal implementation changes? Or is it rooted in profoundly deep features of the architecture—meaning that it could never be lifted?

That is a very good question. One of the issues (as I understand it) is to be able to continue to use the same connection in the event of an error. Here's the scenario:

1. SELECT * FROM my_func();

2. my_func gets an error. The connection is now in an error state.

3. So, catching the exception doesn't do you much good, because future operations will get an error.

This problem is solved by wrapping it in a savepoint, since rolling back to the savepoint will undo the error state of the connection.

There might be a way forward, but it's hard to get the proper semantics of an exception handler without doing something like that.

Laurenz Albe

laurenz.albe@cybertec.at

over 3 years ago

In reply to: Tom Lane (#6)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Tue, 2022-10-18 at 17:33 -0400, Tom Lane wrote:

Bryn Llewellyn <bryn@yugabyte.com> writes:

xof@thebuild.com wrote:

You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for the duration of the BEGIN / END.

This surprised me when I first started to use PG (after all those years
with ORCL).

Really? BEGIN with an exception block is a subtransaction because it's
defined to roll back to the database state as of the start of the block
if an exception occurs. COMMIT in the middle fundamentally conflicts
with that, I should think. Does Oracle interpret that differently?

Looks like Oracle doesn't care much about that:

SQL> CREATE TABLE mytab (id integer CHECK (id > 0));

Table created.

SQL> CREATE PROCEDURE committest IS
2 BEGIN
3 INSERT INTO mytab VALUES (42);
4 COMMIT;
5 INSERT INTO mytab VALUES (-42);
6 EXCEPTION
7 WHEN OTHERS THEN
8 NULL;
9 END;
10 /

Procedure created.

SQL> CALL committest();

Call completed.

SQL> SELECT * FROM mytab;

ID
----------
42

I looks like Oracle allows you to randomly interfere with its transaction handling.
If you run commit and then enter an exception handler, it simply doesn't rollback.

Yours,
Laurenz Albe

Mladen Gogala

gogala.mladen@gmail.com

over 3 years ago

In reply to: Tom Lane (#6)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Tue, 2022-10-18 at 17:33 -0400, Tom Lane wrote:

Bryn Llewellyn <bryn@yugabyte.com> writes:

xof@thebuild.com wrote:

You can commit in a loop, but not in BEGIN / END block that has
an exception handler: that creates a subtransaction for the
duration of the BEGIN / END.

This surprised me when I first started to use PG (after all those
years
with ORCL).

Really? BEGIN with an exception block is a subtransaction because
it's
defined to roll back to the database state as of the start of the
block
if an exception occurs. COMMIT in the middle fundamentally conflicts
with that, I should think. Does Oracle interpret that differently?

regards, tom lane

Hi Tom,
Yes, Oracle does interpret that differently. Bryn may correct me if I'm
wrong, but Oracle creates an implicit save point when it encounters
BEGIN. Exception handler doesn't necessarily roll things back. Oracle
behavior is not standard and PgSQL adheres to SQL standard better than
Oracle. However, being as pervasive as it is, Oracle is de facto
standard.
Also, Oracle has something called "autonomous transaction" which, in
effect, means that a session can have two concurrent transactions open,
which is also non-standard:

https://docs.oracle.com/en/database/oracle/oracle-database/19/lnpls/AUTONOMOUS_TRANSACTION-pragma.html#GUID-AD33D949-081B-4CD3-A240-C29773E908C3

Amazon, lead by Kevin Closson, the guy who has famously designed Oracle
Exadata among other things, even came up with the recipe how to migrate
it to Postgres:

https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/

I am a bit skeptical toward that recipe and I usually prefer
programming solutions with opening another thread and sending a
message. BTW, speaking of Kevin, he has also written pgio, which is a
PostgreSQL version of his SLOB package. Kevin is the only retired
Oracle ACE in existence. BTW, Bryn also used to be an Oracle ACE.
Regards

#10

Mladen Gogala

gogala.mladen@gmail.com

over 3 years ago

In reply to: Christophe Pettus (#5)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Tue, 2022-10-18 at 14:31 -0700, Christophe Pettus wrote:

Rather than have a loop inside the BEGIN / END, you could put the
BEGIN EXCEPTION END inside the loop, catch the error, store the
important parts of the exception in a variable, and then do the
COMMIT after the END statement but before the next iteration of the
loop. A bit messier, but it gets the job done.

Commit within a loop is an extremely bad idea. Commit is an expensive
operation which includes incrementing the XID, which is global and not
local to the process. There is also a WAL write which has to be waited
on. Every commit implies at least one write operation. If that was not
the case, Postgres wouldn't be ACID compliant. There would be problem
with the "D". Commit within a loop will have many adverse effects on
the performance. Here is what Tom Kyte, who used to be a celebrity in
the world of Oracle, said on this topic:

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::p11_question_id:4951966319022

Surprisingly enough, the argument in the article is mostly portable,
doesn't depend on the database type at all. Basically, transaction is a
logical unit of work. If 1000 rows need to be updated, it's better to
update them in a single transaction than in 1000 transactions. That is
particularly true for Postgres which doesn't have problems with the
undo tablespace and ORA-1555 "snapshot too old" error because of the
different architecture. Also, Oracle has a secret optimization: it
doesn't wait for commit, if the commit is issued within PL/SQL loop.
Your idea solves the syntactic problem with commits within PLPG/SQL
loops but it doesn't solve other problems that such programming
causes. Commit within loop is a very bad idea. The best way to resolve
the problems with commit within the loop is to remove the programmer
trying to do that from the project. In my humble opinion, programmers
who do stuff like that should suffer unusual and cruel punishment.

PS:
----
I am sure, that if Tom Kyte would dare to make a suggestion on this
list, there would be someone who would try to explain "the Postgres
way" to him in a condescending manner. I've seen that before.

#11

Christophe Pettus

xof@thebuild.com

over 3 years ago

In reply to: Mladen Gogala (#10)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On Oct 18, 2022, at 19:18, gogala.mladen@gmail.com wrote:

Commit within a loop is an extremely bad idea.

This is an over-generalization. There are many use-cases for this (if there were not, procedures wouldn't have been nearly as important a feature).

For example, if you are processing a large update (in the hundreds of thousands or more of rows), you often want to commit regularly so that other processes don't have to wait for the whole thing to finish due to row-level locks, and to give vacuum a chance to deal with the dead tuples. Similarly, while inserting one row at a time and committing is usually not a great idea, it can make sense to do large inserts in batches.

Applications do this kind of thing all the time, very successfully; it was just that the loop was in the application rather than in the procedure.

High commit rates happen all the time, and they don't break PostgreSQL. For example, an IoT application collecting sensor data and doing many inserts per second is also doing many commits per second, since each bare INSERT is in its own transaction. PostgreSQL handles it just fine.

#12

Mladen Gogala

gogala.mladen@gmail.com

over 3 years ago

In reply to: Christophe Pettus (#11)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

Comments in-line.

On Tue, 2022-10-18 at 21:02 -0700, Christophe Pettus wrote:

On Oct 18, 2022, at 19:18, gogala.mladen@gmail.com wrote:

Commit within a loop is an extremely bad idea.

This is an over-generalization. There are many use-cases for this
(if there were not, procedures wouldn't have been nearly as important
a feature).

For example, if you are processing a large update (in the hundreds of
thousands or more of rows), you often want to commit regularly so
that other processes don't have to wait for the whole thing to finish
due to row-level locks, and to give vacuum a chance to deal with the
dead tuples. Similarly, while inserting one row at a time and
committing is usually not a great idea, it can make sense to do large
inserts in batches.

That depends. Multiple commits will slow down the processing. If the
goal of the exercise is to let update complete as quickly as possible,
then it will be a single commit. If the goal is to enable normal
processing and let the enormous update complete in its due time, then
you'll do what you describe.

Applications do this kind of thing all the time, very successfully;
it was just that the loop was in the application rather than in the
procedure.

High commit rates happen all the time, and they don't break
PostgreSQL. For example, an IoT application collecting sensor data
and doing many inserts per second is also doing many commits per
second, since each bare INSERT is in its own transaction. PostgreSQL
handles it just fine.

Point of my post is that the business logic, in your case it's IoT
sensors, determines what is transaction and when to commit. Advice like
"commit often and commit early", to paraphrase the famous Chicago
mayor, is easy to find but I would take it with grain of salt.
Regards
--
Mladen Gogala
Database Consultant
https://dbwhisperer.wordpress.com

#13

Ron

ronljohnsonjr@gmail.com

over 3 years ago

In reply to: Mladen Gogala (#12)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

On 10/19/22 08:06, Mladen Gogala wrote:
[snip]

Applications do this kind of thing all the time, very successfully; it
was just that the loop was in the application rather than in the procedure.

High commit rates happen all the time, and they don't break PostgreSQL.
For example, an IoT application collecting sensor data and doing many
inserts per second is also doing many commits per second, since each bare
INSERT is in its own transaction. PostgreSQL handles it just fine.

Point of my post is that the business logic, in your case it's IoT
sensors, determines what is transaction and when to commit. Advice like
"commit often and commit early", to paraphrase the famous Chicago mayor,
is easy to find but I would take it with grain of salt.

In the normal course of operation (i.e, not when bulk loading), you /should/
commit at the end of every "business transaction". We've committed after X
business when running stovepipe "batch" jobs processing input files. In
those cases, though, we had to track progress through the file; in the case
of a rollback, the application had to go back to the last input file "save
point" and start over.

--
Angular momentum makes the world go 'round.

#14

Bryn Llewellyn

bryn@yugabyte.com

over 3 years ago

In reply to: Mladen Gogala (#12)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

It seems that I made a thread-discipline error when I asked a question that had nothing to do with the frequency, or the cost, of committing when I saw this reply (paraphrased for brevity here) from Christophe:

You [cannot] commit in [a] BEGIN / END [block statement] that has an exception handler [because] that creates a subtransaction for the duration of the [block statement].

I asked this in response (again, paraphrased for brevity):

Could the limitation be lifted...? [Or is the limitation] rooted in profoundly deep features of the architecture?

Sorry that I caused some distraction. Anyway, Tom replied immediately. He said:

BEGIN with an exception block is a subtransaction because it's defined to roll back to the database state as of the start of the block if an exception occurs. COMMIT in the middle fundamentally conflicts with that, I should think.

Thanks, Tom. It's clear to me now that the present PG paradigm will never, ever change.

So my conclusion stands for this use case: I'm using "serializable" isolation (which luxury ORCL doesn't afford me); I know that I can get a "cannot serialize" error at "commit" time. Here, I cannot take appropriate action within my PL/pgSQL code and hide the whole story of what this is about from client code. Rather, I must explain the business to the authors of the next tier, and teach them when, and how, retry is appropriate.

Tom asked, too, if ORCL has a different paradigm... Briefly, yes—and radically so. But (all of you) do please feel free to skip over my sketch here if it doesn't interest you.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
«
The real account of which this is a sketch is of no interest at all if you work only with PG and have never used ORCL. But if you need to switch, intellectually, from one to the other—and maybe need to do this all the time because your job duties span both systems—then it *is* interesting.

ORCL's PL/SQL is compiled (and optimized) at "create procedure" time. The new source code replaces the old in the catalog. And if it failed to compile, then you're left with an invalid unit that you cannot execute. Compilation errors are recorded in the catalog too. Further, static dependencies (proc upon proc, proc upon table, etc) are also recorded in the catalog. This is intimately connected with the hard distinction between static and dynamic SQL. The latter simply passes the text on "as is" into the byte code to be dealt with at run-time. Only statements like "select", "insert", "update", delete" and a few others can be static SQL. Table creation and the like must be dynamic SQL. This probably shocks those who move from PG to ORCL because you cannot, for example, create a table and then operate on it with static SQL in the same procedure.

In particular, for the present discussion, the PL/SQL block statement is a pure lexical device. (This is the case in PL/SQL's progenitor, ADA. And that's where all that stuff about DIANA, that the PL/SQL programmer eventually comes to hear about, comes from.) All memory that you had a block statement in the source is lost in the compiled so-called byte code that gets interpreted at run time. On the other hand, every call from PL/SQL to SQL is done in its own subtransaction—and if it fails, then that single statement is atomically rolled back. The effect of all the SQLs to date, at this moment, remains intact—but uncommitted. (Of course, you might have issued "commit"(s) programmatically. So I'm talking about SQLs that were done since the most recent "commit".)

Significantly, the failure of a call from PL/SQL to SQL raises an exception—so (as well as the single-statement rollback) you now have an in-flight exception that flies up through successive scopes in search of a matching handler. If it remains unhandled at the last moment before the top-level PL/SQL "call" is due to finish, then a "rollback" is automatically issued. But if a handler *is* found, well... the exception is dead and you can carry on. Like everything else in programming, the code author must work out what "safe" is. (It could be to turn an insert that fails 'cos a unique key is violated into an update.) In ORCL, just as in PG, writing "when others than null" is held to be stupid. And code examples that do this are deemed to be not worthy of discussion.

Though the paradigms are different, each allows you properly to implement mission-critical applications. It's rather like English and Chinese. Astonishingly different. But each supports all that you need to let people communicate about mundane daily business, science, philosophy, epistemology, and so on.)
»

#15

Thomas Kellerer

shammat@gmx.net

over 3 years ago

In reply to: Mladen Gogala (#9)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

gogala.mladen@gmail.com schrieb am 19.10.2022 um 01:46:

Amazon, lead by Kevin Closson, the guy who has famously designed
Oracle Exadata among other things, even came up with the recipe how
to migrate it to Postgres:

https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/

The workaround to use dblink to simulate autonomous transactions is nothing new,
and has been floating around for quite a while.

Here is a blog post from 2012 (6 years before the Amazon recipe was published)

https://raghavt.blog/autonomous-transaction-in-postgresql-9-1/

and another one from 2016

https://blog.dalibo.com/2016/09/08/Pragma_Autonoumous_transactions.html

#16

Benedict Holland

benedict.m.holland@gmail.com

over 3 years ago

In reply to: Thomas Kellerer (#15)

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

This seems like a bad use of a stored procedure. Why wouldn't you spin up
40 clients with a table name and run it across 40 connections? But also, I
don't like loops in stored procedures. Working with loops in a set based
system hurts me but it's personal preference.

Like, I could write this in python in less than 4 hours with an extremely
simple sproc. How often are you running table deletions like this to
require a sproc?

Thanks,
Ben

On Wed, Oct 19, 2022, 5:39 PM Thomas Kellerer <shammat@gmx.net> wrote:

Show quoted text

gogala.mladen@gmail.com schrieb am 19.10.2022 um 01:46:

Amazon, lead by Kevin Closson, the guy who has famously designed
Oracle Exadata among other things, even came up with the recipe how
to migrate it to Postgres:

https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/

The workaround to use dblink to simulate autonomous transactions is
nothing new,
and has been floating around for quite a while.

Here is a blog post from 2012 (6 years before the Amazon recipe was
published)

https://raghavt.blog/autonomous-transaction-in-postgresql-9-1/

and another one from 2016

https://blog.dalibo.com/2016/09/08/Pragma_Autonoumous_transactions.html