[PATCH] Hooks at XactCommand level

rjuju123@gmail.com

almost 5 years ago

In reply to: David Steele (#2)

Re: [PATCH] Hooks at XactCommand level

Hi David,

On Thu, Mar 11, 2021 at 07:41:35AM -0500, David Steele wrote:

Hi Julien,

On 12/8/20 5:15 AM, Gilles Darold wrote:

Based on a PoC reported in a previous thread [1] I'd like to propose new
hooks around transaction commands. The objective of this patch is to
allow PostgreSQL extension to act at start and end (including abort) of
a SQL statement in a transaction.

You have signed up to review this patch. Do you know when you will have a
chance to do that?

Thanks for the reminder! And sorry about that, I've unfortunately been quite
busy with other work duties recently. I already started to look at it (and
pg_statement_rollbackv2) so I should be able to post a review within a few
days!

[1]: /messages/by-id/20181207192006.rf4tkfl25oc6pqmv@alvherre.pgsql

rjuju123@gmail.com

almost 5 years ago

In reply to: Gilles Darold (#1)

Re: [PATCH] Hooks at XactCommand level

Hi,

On Tue, Dec 08, 2020 at 11:15:12AM +0100, Gilles Darold wrote:

Based on a PoC reported in a previous thread [1] I'd like to propose new
hooks around transaction commands. The objective of this patch is to
allow PostgreSQL extension to act at start and end (including abort) of
a SQL statement in a transaction.

The idea for these hooks is born from the no go given to Takayuki
Tsunakawa's patch[2] proposing an in core implementation of
statement-level rollback transaction and the pg_statement_rollback
extension[3] that we have developed at LzLabs. The extension
pg_statement_rollback has two limitation, the first one is that the
client still have to call the ROLLBACK TO SAVEPOINT when an error is
encountered and the second is that it generates a crash when PostgreSQL
is compiled with assert that can not be fixed at the extension level.

This topic came up quite often on the mailing list, the last being from ï¿½lvaro
at [1]/messages/by-id/20181207192006.rf4tkfl25oc6pqmv@alvherre.pgsql. I think there's a general agreement that customers want that feature,
won't stop asking for it, and many if not all forks ended up implementing it.

I would still prefer if he had a way to support if in vanilla postgres, with of
course all possible safeguards to avoid an epic fiasco.

I personally think that ï¿½lvaro's previous approach, giving the ability to
specify the rollback behavior in the TransactionStmt grammar, would be enough
(I mean without the GUC part) to cover realistic and sensible usecases, which
is where the client fully decides whether there's a statement level rollback or
not. One could probably build a custom module on top of that to introduce some
kind of GUC to change the behavior more globally if it wants to take that risk.

If such an approach is still not wanted for core inclusion, then I'm in favor
of adding those hooks. There's already a published extension that tries to
implement that (for complete fairness I'm one of the people to blame), but as
Gilles previously mentioned this is very hackish and the currently available
hooks makes it very hard if not impossible to have a perfect implementation.
It's clear that people will never stop to try doing it, so at least let's make
it possible using a custom module.

It's also probably worthwhile to mention that the custom extension implementing
server side statement level rollback wasn't implemented because it wasn't
doable in the client side, but because the client side implementation was
causing a really big overhead due to the need of sending the extra commands,
and putting it on the server side lead to really significant performance
improvement.

Although that I have not though about other uses for these hooks, they
will allow a full server side statement-level rollback feature like in
commercial DBMSs like DB2 and Oracle. This feature is very often
requested by users that want to migrate to PostgreSQL.

I also thought about it, and I don't really see other possible usage for those
hooks.

There is no additional syntax or GUC, the patch just adds three new hooks:

* start_xact_command_hook called at end of the start_xact_command()
function.
* finish_xact_command called in finish_xact_command() just before
CommitTransactionCommand().
* abort_current_transaction_hook called after an error is encountered at
end of AbortCurrentTransaction().

These hooks allow an external plugins to execute code related to the SQL
statements executed in a transaction.

The only comment I have for those hooks is for the
abort_current_transaction_hook. AbortCurrentTransaction() can be called
recursively, so should the hook provide some more information about the
CurrentTransactionState, like the blockState, or is
GetCurrentTransactionNestLevel() enough to act only for the wanted calls?

[1]: /messages/by-id/20181207192006.rf4tkfl25oc6pqmv@alvherre.pgsql
/messages/by-id/20181207192006.rf4tkfl25oc6pqmv@alvherre.pgsql

gilles@darold.net

almost 5 years ago

In reply to: Julien Rouhaud (#4)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Le 12/03/2021 à 06:55, Julien Rouhaud a écrit :

Hi,

On Tue, Dec 08, 2020 at 11:15:12AM +0100, Gilles Darold wrote:

Based on a PoC reported in a previous thread [1] I'd like to propose new
hooks around transaction commands. The objective of this patch is to
allow PostgreSQL extension to act at start and end (including abort) of
a SQL statement in a transaction.

The idea for these hooks is born from the no go given to Takayuki
Tsunakawa's patch[2] proposing an in core implementation of
statement-level rollback transaction and the pg_statement_rollback
extension[3] that we have developed at LzLabs. The extension
pg_statement_rollback has two limitation, the first one is that the
client still have to call the ROLLBACK TO SAVEPOINT when an error is
encountered and the second is that it generates a crash when PostgreSQL
is compiled with assert that can not be fixed at the extension level.

This topic came up quite often on the mailing list, the last being from Álvaro
at [1]. I think there's a general agreement that customers want that feature,
won't stop asking for it, and many if not all forks ended up implementing it.

I would still prefer if he had a way to support if in vanilla postgres, with of
course all possible safeguards to avoid an epic fiasco.

I have added Alvarro and Takayuki to the thread, this patch is inspired
from their proposals. I wrote this patch after reading the thread and
concluding that a core implementation doesn't seems to make the
consensus and that this feature could be available to users through an
extension.

I personally think that Álvaro's previous approach, giving the ability to
specify the rollback behavior in the TransactionStmt grammar, would be enough
(I mean without the GUC part) to cover realistic and sensible usecases, which
is where the client fully decides whether there's a statement level rollback or
not. One could probably build a custom module on top of that to introduce some
kind of GUC to change the behavior more globally if it wants to take that risk.

Yes probably, with this patch I just want to propose an external
implementation of the feature. The extension implementation "just"
require these three hooks to propose the same feature as if it was
implemented in vanilla postgres. The feature can be simply enabled or
disabled by a custom user defined variable before a transaction is
started or globaly for all transaction.

If such an approach is still not wanted for core inclusion, then I'm in favor
of adding those hooks. There's already a published extension that tries to
implement that (for complete fairness I'm one of the people to blame), but as
Gilles previously mentioned this is very hackish and the currently available
hooks makes it very hard if not impossible to have a perfect implementation.
It's clear that people will never stop to try doing it, so at least let's make
it possible using a custom module.

It's also probably worthwhile to mention that the custom extension implementing
server side statement level rollback wasn't implemented because it wasn't
doable in the client side, but because the client side implementation was
causing a really big overhead due to the need of sending the extra commands,
and putting it on the server side lead to really significant performance
improvement.

Right, the closer extension to reach this feature is the extension we
develop at LzLabs [2]https://github.com/lzlabs/pg_statement_rollback/ but it still require a rollback to savepoint at
client side in case of error. The extension [3]https://github.com/darold/pg_statement_rollbackv2 using these hooks
doesn't have this limitation, everything is handled server side.

Although that I have not though about other uses for these hooks, they
will allow a full server side statement-level rollback feature like in
commercial DBMSs like DB2 and Oracle. This feature is very often
requested by users that want to migrate to PostgreSQL.

I also thought about it, and I don't really see other possible usage for those
hooks.

Yes I have not a lot of imagination too for possible other use for these
hooks but I hope that in itself this feature can justify them. I just
though that if we expose the query_string at command_start hook we could
allow its modification by external modules, but this is surely the worst
idea I can produce.

There is no additional syntax or GUC, the patch just adds three new hooks:

* start_xact_command_hook called at end of the start_xact_command()
function.
* finish_xact_command called in finish_xact_command() just before
CommitTransactionCommand().
* abort_current_transaction_hook called after an error is encountered at
end of AbortCurrentTransaction().

These hooks allow an external plugins to execute code related to the SQL
statements executed in a transaction.

The only comment I have for those hooks is for the
abort_current_transaction_hook. AbortCurrentTransaction() can be called
recursively, so should the hook provide some more information about the
CurrentTransactionState, like the blockState, or is
GetCurrentTransactionNestLevel() enough to act only for the wanted calls?

I don't think we need to pass any information at least for the rollback
at statement level extension. All information needed are accessible and
actually at abort_current_transaction_hook we only toggle a boolean to
fire the rollback.

I have rebased the patch.

Thanks for the review.

[2]: https://github.com/lzlabs/pg_statement_rollback/

[3]: https://github.com/darold/pg_statement_rollbackv2

--
Gilles Darold
LzLabs GmbH
http://www.lzlabs.com/

Attachments:

v2-0001-hook-on-start-finish-abort-command.patchtext/x-patch; charset=UTF-8; name=v2-0001-hook-on-start-finish-abort-command.patchDownload

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 6395a9b240..8f1bc3431a 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -314,6 +314,9 @@ typedef struct SubXactCallbackItem
 static SubXactCallbackItem *SubXact_callbacks = NULL;
 
 
+/* Hook for plugins to get control of at end of AbortCurrentTransaction */
+AbortCurrentTransaction_hook_type abort_current_transaction_hook = NULL;
+
 /* local function prototypes */
 static void AssignTransactionId(TransactionState s);
 static void AbortTransaction(void);
@@ -3382,6 +3385,9 @@ AbortCurrentTransaction(void)
 			AbortCurrentTransaction();
 			break;
 	}
+
+	if (abort_current_transaction_hook)
+		(*abort_current_transaction_hook) ();
 }
 
 /*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2b1b68109f..1f7382d182 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -205,6 +205,12 @@ static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);
 
+/*
+ * Hooks for plugins to get control at end/start of
+ * start_xact_command()/finish_xact_command().
+ */
+XactCommandStart_hook_type start_xact_command_hook = NULL;
+XactCommandFinish_hook_type finish_xact_command_hook = NULL;
 
 /* ----------------------------------------------------------------
  *		routines to obtain user input
@@ -2671,8 +2677,16 @@ start_xact_command(void)
 	 * not desired, the timeout has to be disabled explicitly.
 	 */
 	enable_statement_timeout();
+
+	/*
+	 * Now give loadable modules a chance to execute code before a transaction
+	 * command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
 }
 
+
 static void
 finish_xact_command(void)
 {
@@ -2681,6 +2695,13 @@ finish_xact_command(void)
 
 	if (xact_started)
 	{
+		/*
+		 * Now give loadable modules a chance to execute code just before a
+		 * transaction command is committed.
+		 */
+		if (finish_xact_command_hook)
+			(*finish_xact_command_hook) ();
+
 		CommitTransactionCommand();
 
 #ifdef MEMORY_CONTEXT_CHECKING
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 34cfaf542c..6c1273bb44 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -482,4 +482,8 @@ IsModifySupportedInParallelMode(CmdType commandType)
 	return (commandType == CMD_INSERT);
 }
 
+/* Hook for plugins to get control in start_xact_command() and finish_xact_command() */
+typedef void (*AbortCurrentTransaction_hook_type) (void);
+extern PGDLLIMPORT AbortCurrentTransaction_hook_type abort_current_transaction_hook;
+
 #endif							/* XACT_H */
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index ed2c4d374b..91c663358a 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -42,4 +42,10 @@ extern uint64 PortalRunFetch(Portal portal,
 							 long count,
 							 DestReceiver *dest);
 
+/* Hook for plugins to get control in start_xact_command() and finish_xact_command() */
+typedef void (*XactCommandStart_hook_type) (void);
+extern PGDLLIMPORT XactCommandStart_hook_type start_xact_command_hook;
+typedef void (*XactCommandFinish_hook_type) (void);
+extern PGDLLIMPORT XactCommandFinish_hook_type finish_xact_command_hook;
+
 #endif							/* PQUERY_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1d1d5d2f0e..29cd5acbff 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -13,6 +13,7 @@ A_Expr_Kind
 A_Indices
 A_Indirection
 A_Star
+AbortCurrentTransaction_hook_type
 AbsoluteTime
 AccessMethodInfo
 AccessPriv
@@ -2821,6 +2822,8 @@ XPVIV
 XPVMG
 XactCallback
 XactCallbackItem
+XactCommandStart_hook_type
+XactCommandFinish_hook_type
 XactEvent
 XactLockTableWaitInfo
 XidBoundsViolation

rjuju123@gmail.com

almost 5 years ago

In reply to: Gilles Darold (#5)

Re: [PATCH] Hooks at XactCommand level

On Fri, Mar 19, 2021 at 11:02:29PM +0100, Gilles Darold wrote:

Le 12/03/2021 ï¿½ 06:55, Julien Rouhaud a ï¿½critï¿½:

I don't think we need to pass any information at least for the rollback
at statement level extension. All information needed are accessible and
actually at abort_current_transaction_hook we only toggle a boolean to
fire the rollback.

That's what I thought but I wanted to be sure.

So, I have nothing more to say about the patch itself. At that point, I guess
that we can't keep postponing that topic, and we should either:

- commit this patch, or ï¿½lvaro's one based on a new grammar keyword for BEGIN
(maybe without the GUC if that's the only hard blocker), assuming that there
aren't any technical issue with those

- reject this patch, and I guess set in stone that vanilla postgres will
never allow that.

Given the situation I'm not sure if I should mark the patch as Ready for
Committer or not. I'll leave it as-is for now as ï¿½lvaro is already in Cc.

I have rebased the patch.

Thanks!

rjuju123@gmail.com

almost 5 years ago

In reply to: Julien Rouhaud (#6)

Re: [PATCH] Hooks at XactCommand level

On Sat, Mar 20, 2021 at 06:33:24PM +0800, Julien Rouhaud wrote:

So, I have nothing more to say about the patch itself. At that point, I guess
that we can't keep postponing that topic, and we should either:

- commit this patch, or ï¿½lvaro's one based on a new grammar keyword for BEGIN
(maybe without the GUC if that's the only hard blocker), assuming that there
aren't any technical issue with those

- reject this patch, and I guess set in stone that vanilla postgres will
never allow that.

Given the situation I'm not sure if I should mark the patch as Ready for
Committer or not. I'll leave it as-is for now as ï¿½lvaro is already in Cc.

I just switched the patch to Ready for Committer.

Nicolas CHAHWEKILIAN

leptitstagiaire@gmail.com

over 4 years ago

In reply to: Julien Rouhaud (#7)

Re: [PATCH] Hooks at XactCommand level

Hello,

As far as I am concerned, I am totally awaiting for this kind of feature exposed here, for one single reason at this time : the extension pg_statement_rollback will be much more valuable with the ability of processing "rollback to savepoint" without the need for explicit instruction from client side (and this patch is giving this option).
The way the improvement is suggested here seems to be clever enough to allow many interesting behaviours from differents kinds of extensions.

Thank you,

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Nicolas CHAHWEKILIAN (#8)

Re: [PATCH] Hooks at XactCommand level

Nicolas CHAHWEKILIAN <leptitstagiaire@gmail.com> writes:

As far as I am concerned, I am totally awaiting for this kind of feature
exposed here, for one single reason at this time : the extension
pg_statement_rollback will be much more valuable with the ability of
processing "rollback to savepoint" without the need for explicit
instruction from client side (and this patch is giving this option).

What exactly do these hooks do that isn't done as well or better
by the RegisterXactCallback and RegisterSubXactCallback mechanisms?
Perhaps we need to define some additional event types for those?
Or pass more data to the callback functions?

I quite dislike inventing a hook that's defined as "run during
start_xact_command", because there is basically nothing that's
not ad-hoc about that function: it's internal to postgres.c
and both its responsibilities and its call sites have changed
over time. I think anyone hooking into that will be displeased
by the stability of their results.

BTW, per the cfbot the patch doesn't even apply right now.

regards, tom lane

#10

gilles@darold.net

over 4 years ago

In reply to: Tom Lane (#9)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Le 01/07/2021 à 18:47, Tom Lane a écrit :

Nicolas CHAHWEKILIAN <leptitstagiaire@gmail.com> writes:

As far as I am concerned, I am totally awaiting for this kind of feature
exposed here, for one single reason at this time : the extension
pg_statement_rollback will be much more valuable with the ability of
processing "rollback to savepoint" without the need for explicit
instruction from client side (and this patch is giving this option).

What exactly do these hooks do that isn't done as well or better
by the RegisterXactCallback and RegisterSubXactCallback mechanisms?
Perhaps we need to define some additional event types for those?
Or pass more data to the callback functions?

Sorry it take me time to recall the reason of the hooks. Actually the
problem is that the callbacks are not called when a statement is
executed after an error so that we fall back to error:

ERROR: current transaction is aborted, commands ignored until end
of transaction block

For example with the rollback at statement level extension:

BEGIN;
UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1; -- >>>>> will fail
LOG: statement: UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1;
ERROR: invalid input syntax for type integer: "two"
LINE 1: UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1;
^
UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1; -- >>>>> will
fail again
LOG: statement: UPDATE tbl_rsl SET id = 'two', val = 2 WHERE id = 1;
ERROR: current transaction is aborted, commands ignored until end
of transaction block
SELECT * FROM tbl_rsl; -- Should show records id 1 + 2
LOG: statement: SELECT * FROM tbl_rsl;
ERROR: current transaction is aborted, commands ignored until end
of transaction block

With the exention and the hook on start_xact_command() we can continue
and execute all the following statements.

I have updated the patch to only keep the hook on start_xact_command(),
as you've suggested the other hooks can be replaced by the use of the
xact callback. The extension has also been updated for testing the
feature, available here https://github.com/darold/pg_statement_rollbackv2

I quite dislike inventing a hook that's defined as "run during
start_xact_command", because there is basically nothing that's
not ad-hoc about that function: it's internal to postgres.c
and both its responsibilities and its call sites have changed
over time. I think anyone hooking into that will be displeased
by the stability of their results.

Unfortunately I had not found a better solution, but I just tried with
placing the hook in function BeginCommand() in src/backend/tcop/dest.c
and the extension is working as espected. Do you think it would be a
better place?In this case I can update the patch. For this feature we
need a hook that is executed before any command even if the transaction
is in abort state to be able to inject the rollback to savepoint, maybe
I'm not looking at the right place to do that.

Thanks

--
Gilles Darold
http://www.darold.net/

Attachments:

command-start-finish-hook-v3.patchtext/x-patch; charset=UTF-8; name=command-start-finish-hook-v3.patchDownload

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..4b9f8eeb3c 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -207,6 +207,8 @@ static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);
 
+/* Hooks for plugins to get control at end of start_xact_command() */
+XactCommandStart_hook_type start_xact_command_hook = NULL;
 
 /* ----------------------------------------------------------------
  *		routines to obtain user input
@@ -2708,6 +2710,13 @@ start_xact_command(void)
 		!get_timeout_active(CLIENT_CONNECTION_CHECK_TIMEOUT))
 		enable_timeout_after(CLIENT_CONNECTION_CHECK_TIMEOUT,
 							 client_connection_check_interval);
+
+	/*
+	 * Now give loadable modules a chance to execute code before a transaction
+	 * command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
 }
 
 static void
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 2318f04ff0..540ede42fd 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -48,4 +48,8 @@ extern bool PlannedStmtRequiresSnapshot(struct PlannedStmt *pstmt);
 
 extern void EnsurePortalSnapshotExists(void);
 
+/* Hook for plugins to get control in start_xact_command() and finish_xact_command() */
+typedef void (*XactCommandStart_hook_type) (void);
+extern PGDLLIMPORT XactCommandStart_hook_type start_xact_command_hook;
+
 #endif							/* PQUERY_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 64c06cf952..f473c2dc39 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2934,6 +2934,7 @@ XPVIV
 XPVMG
 XactCallback
 XactCallbackItem
+XactCommandStart_hook_type
 XactEvent
 XactLockTableWaitInfo
 XidBoundsViolation

#11

gilles@darold.net

over 4 years ago

In reply to: Tom Lane (#9)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Le 01/07/2021 à 18:47, Tom Lane a écrit :

Nicolas CHAHWEKILIAN <leptitstagiaire@gmail.com> writes:

As far as I am concerned, I am totally awaiting for this kind of feature
exposed here, for one single reason at this time : the extension
pg_statement_rollback will be much more valuable with the ability of
processing "rollback to savepoint" without the need for explicit
instruction from client side (and this patch is giving this option).

What exactly do these hooks do that isn't done as well or better
by the RegisterXactCallback and RegisterSubXactCallback mechanisms?
Perhaps we need to define some additional event types for those?
Or pass more data to the callback functions?

I quite dislike inventing a hook that's defined as "run during
start_xact_command", because there is basically nothing that's
not ad-hoc about that function: it's internal to postgres.c
and both its responsibilities and its call sites have changed
over time. I think anyone hooking into that will be displeased
by the stability of their results.

Sorry Tom, it seems that I have totally misinterpreted your comments,
google translate was not a great help for my understanding but Julien
was. Thanks Julien.

I'm joining a new patch v4 that removes the need of any hook and adds a
new events XACT_EVENT_COMMAND_START and SUBXACT_EVENT_COMMAND_START that
can be cautch in the xact callbacks when a new command is to be executed.

--
Gilles Darold
http://www.darold.net/

Attachments:

command-start-finish-hook-v4.patchtext/x-patch; charset=UTF-8; name=command-start-finish-hook-v4.patchDownload

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 441445927e..3b5f6bfc2d 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -6118,3 +6118,32 @@ MarkSubTransactionAssigned(void)
 
 	CurrentTransactionState->assigned = true;
 }
+
+/*
+ * CallXactStartCommand
+ *
+ * Wrapper over CallXactCallbacks or CallSubXactCallbacks called in postgres.c
+ * at end of start_xact_command(). It allows to user-defined code to be executed
+ * for the start of any command through a xact registered callback. This function
+ * do nothing if a transaction is not started.
+ *
+ * The related events XactEvent/SubXactEvent are XACT_EVENT_COMMAND_START and
+ * SUBXACT_EVENT_COMMAND_START.
+ */
+void
+CallXactStartCommand(void)
+{
+	TransactionState s = CurrentTransactionState;
+
+	if (s->blockState == TBLOCK_DEFAULT || s->blockState == TBLOCK_STARTED)
+		return;
+
+	/*
+	 * Call start-of-xact callbacks with start command event
+	 */
+	if (s->parent && s->subTransactionId)
+		CallSubXactCallbacks(SUBXACT_EVENT_COMMAND_START, s->subTransactionId,
+							 s->parent->subTransactionId);
+	else
+		CallXactCallbacks(XACT_EVENT_COMMAND_START);
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..a0f4a17c51 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2708,6 +2708,16 @@ start_xact_command(void)
 		!get_timeout_active(CLIENT_CONNECTION_CHECK_TIMEOUT))
 		enable_timeout_after(CLIENT_CONNECTION_CHECK_TIMEOUT,
 							 client_connection_check_interval);
+
+	/*
+	 * Instruct registered callbacks on xact that a new command is to be
+	 * executed. It allows to execute user-defined code before any new
+	 * statement is executed. This is the responsability of the user code
+	 * to take care of the state of the transaction, it can be in an ABORT
+	 * state. The related xact events are XACT_EVENT_COMMAND_START and
+	 * SUBXACT_EVENT_COMMAND_START.
+	 */
+	CallXactStartCommand();
 }
 
 static void
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 134f6862da..190fc7151b 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -119,7 +119,8 @@ typedef enum
 	XACT_EVENT_PREPARE,
 	XACT_EVENT_PRE_COMMIT,
 	XACT_EVENT_PARALLEL_PRE_COMMIT,
-	XACT_EVENT_PRE_PREPARE
+	XACT_EVENT_PRE_PREPARE,
+	XACT_EVENT_COMMAND_START
 } XactEvent;
 
 typedef void (*XactCallback) (XactEvent event, void *arg);
@@ -129,7 +130,8 @@ typedef enum
 	SUBXACT_EVENT_START_SUB,
 	SUBXACT_EVENT_COMMIT_SUB,
 	SUBXACT_EVENT_ABORT_SUB,
-	SUBXACT_EVENT_PRE_COMMIT_SUB
+	SUBXACT_EVENT_PRE_COMMIT_SUB,
+	SUBXACT_EVENT_COMMAND_START
 } SubXactEvent;
 
 typedef void (*SubXactCallback) (SubXactEvent event, SubTransactionId mySubid,
@@ -467,4 +469,6 @@ extern void EnterParallelMode(void);
 extern void ExitParallelMode(void);
 extern bool IsInParallelMode(void);
 
+extern void CallXactStartCommand(void);
+
 #endif							/* XACT_H */

#12

[1]: https://github.com/darold/pg_statement_rollbackv2

gilles@darold.net

over 4 years ago

In reply to: Gilles Darold (#11)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Hi,

I have renamed the patch and the title of this proposal registered in
the commitfest "Xact/SubXact event callback at command start" to reflect
the last changes that do not include new hooks anymore.

Here is the new description corresponding to the current patch.

This patch allow to execute user-defined code for the start of any
command through a xact registered callback. It introduce two new events
in XactEvent and SubXactEvent enum called respectively
XACT_EVENT_COMMAND_START and SUBXACT_EVENT_COMMAND_START. The callback
is not called if a transaction is not started.

The objective of this new callback is to be able to call user-defined
code before any new statement is executed. For example it can call a
rollback to savepoint if there was an error in the previous transaction
statement, which allow to implements Rollback at Statement Level at
server side using a PostgreSQL extension, see [1]https://github.com/darold/pg_statement_rollbackv2 .

The patch compile and regressions tests with assert enabled passed
successfully.

There is no regression test for this feature but extension at [1]https://github.com/darold/pg_statement_rollbackv2 has
been used for validation of the new callback.

The patch adds insignificant overhead by looking at an existing callback
definition but clearly it is the responsibility to the developer to
evaluate the performances impact of its user-defined code for this
callback as it will be called before each statement. Here is a very
simple test using pgbench -c 20 -j 8 -T 30

tps = 669.930274 (without user-defined code)
tps = 640.718361 (with user-defined code from extension [1]https://github.com/darold/pg_statement_rollbackv2)

the overhead for this extension is around 4.5% which I think is not so
bad good for such feature (internally it adds calls to RELEASE +
SAVEPOINT before each write statement execution and in case of error a
ROLLBACK TO savepoint).

--
Gilles Darold
http://www.darold.net/

Attachments:

00001-startcommand_xact_callback-v1.difftext/x-patch; charset=UTF-8; name=00001-startcommand_xact_callback-v1.diffDownload

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 441445927e..3b5f6bfc2d 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -6118,3 +6118,32 @@ MarkSubTransactionAssigned(void)
 
 	CurrentTransactionState->assigned = true;
 }
+
+/*
+ * CallXactStartCommand
+ *
+ * Wrapper over CallXactCallbacks or CallSubXactCallbacks called in postgres.c
+ * at end of start_xact_command(). It allows to user-defined code to be executed
+ * for the start of any command through a xact registered callback. This function
+ * do nothing if a transaction is not started.
+ *
+ * The related events XactEvent/SubXactEvent are XACT_EVENT_COMMAND_START and
+ * SUBXACT_EVENT_COMMAND_START.
+ */
+void
+CallXactStartCommand(void)
+{
+	TransactionState s = CurrentTransactionState;
+
+	if (s->blockState == TBLOCK_DEFAULT || s->blockState == TBLOCK_STARTED)
+		return;
+
+	/*
+	 * Call start-of-xact callbacks with start command event
+	 */
+	if (s->parent && s->subTransactionId)
+		CallSubXactCallbacks(SUBXACT_EVENT_COMMAND_START, s->subTransactionId,
+							 s->parent->subTransactionId);
+	else
+		CallXactCallbacks(XACT_EVENT_COMMAND_START);
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..a0f4a17c51 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2708,6 +2708,16 @@ start_xact_command(void)
 		!get_timeout_active(CLIENT_CONNECTION_CHECK_TIMEOUT))
 		enable_timeout_after(CLIENT_CONNECTION_CHECK_TIMEOUT,
 							 client_connection_check_interval);
+
+	/*
+	 * Instruct registered callbacks on xact that a new command is to be
+	 * executed. It allows to execute user-defined code before any new
+	 * statement is executed. This is the responsability of the user code
+	 * to take care of the state of the transaction, it can be in an ABORT
+	 * state. The related xact events are XACT_EVENT_COMMAND_START and
+	 * SUBXACT_EVENT_COMMAND_START.
+	 */
+	CallXactStartCommand();
 }
 
 static void
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 134f6862da..190fc7151b 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -119,7 +119,8 @@ typedef enum
 	XACT_EVENT_PREPARE,
 	XACT_EVENT_PRE_COMMIT,
 	XACT_EVENT_PARALLEL_PRE_COMMIT,
-	XACT_EVENT_PRE_PREPARE
+	XACT_EVENT_PRE_PREPARE,
+	XACT_EVENT_COMMAND_START
 } XactEvent;
 
 typedef void (*XactCallback) (XactEvent event, void *arg);
@@ -129,7 +130,8 @@ typedef enum
 	SUBXACT_EVENT_START_SUB,
 	SUBXACT_EVENT_COMMIT_SUB,
 	SUBXACT_EVENT_ABORT_SUB,
-	SUBXACT_EVENT_PRE_COMMIT_SUB
+	SUBXACT_EVENT_PRE_COMMIT_SUB,
+	SUBXACT_EVENT_COMMAND_START
 } SubXactEvent;
 
 typedef void (*SubXactCallback) (SubXactEvent event, SubTransactionId mySubid,
@@ -467,4 +469,6 @@ extern void EnterParallelMode(void);
 extern void ExitParallelMode(void);
 extern bool IsInParallelMode(void);
 
+extern void CallXactStartCommand(void);
+
 #endif							/* XACT_H */

#13

/messages/by-id/Pine.LNX.4.44.0303172059170.1975-100000@peter.localdomain

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Gilles Darold (#12)

Re: [PATCH] Hooks at XactCommand level

Gilles Darold <gilles@darold.net> writes:

I have renamed the patch and the title of this proposal registered in
the commitfest "Xact/SubXact event callback at command start" to reflect
the last changes that do not include new hooks anymore.

Hmm, it doesn't seem like this has addressed my concern at all.
The callbacks are still effectively defined as "run during
start_xact_command", so they're not any less squishy semantically
than they were before. The point of my criticism was that you
should move the call site to someplace that's more organically
connected to execution of commands.

Another thing I'm not too pleased with in this formulation is that it's
very unclear what the distinction is between XACT_EVENT_COMMAND_START
and SUBXACT_EVENT_COMMAND_START. AFAICS, *every* potential use-case
for this would have to hook into both callback chains, and most likely
would treat the two events alike. Plus, as you note, the goalposts
have suddenly been moved for the amount of overhead it's okay to have
in an XactCallback or SubXactCallback function. So that might cause
problems for unrelated code. It's probably better to not try to
re-use that infrastructure.

The objective of this new callback is to be able to call user-defined
code before any new statement is executed. For example it can call a
rollback to savepoint if there was an error in the previous transaction
statement, which allow to implements Rollback at Statement Level at
server side using a PostgreSQL extension, see [1] .

Urgh. Isn't this re-making the same mistake we made years ago, namely
trying to implement autocommit on the server side? I fear this will
be a disaster even larger than that was, because if it's an extension
then pretty much no applications will be prepared for the new semantics.
I strongly urge you to read the discussions that led up to f85f43dfb,
and to search the commit history before that for mentions of
"autocommit", to see just how extensive the mess was.

I spent a little time trying to locate said discussions; it's harder
than it should be because we didn't have the practice of citing email
threads in the commit log at the time. I did find

which seems to have been the proximate decision, and here are
a few threads talking about all the messes that were created
for JDBC etc:

/messages/by-id/3D793A93.7030000@xythos.com
/messages/by-id/3383060E-272E-11D7-BA14-000502E740BA@wellsgaming.com
/messages/by-id/Law14-F37PIje6n0ssr00000bc1@hotmail.com

Basically, changing transactional semantics underneath clients is
a horrid idea. Having such behavior in an extension rather than
the core doesn't make it less horrid. If we'd designed it to act
that way from day one, maybe it'd have been fine. But as things
stand, we are quite locked into the position that this has to be
managed on the client side.

</digression>

That point doesn't necessarily invalidate the value of having
some sort of hook in this general area. But I would kind of like
to see another use-case, because I don't believe in this one.

regards, tom lane

#14

gilles@darold.net

over 4 years ago

In reply to: Tom Lane (#13)

Re: [PATCH] Hooks at XactCommand level

Le 14/07/2021 à 21:26, Tom Lane a écrit :

Gilles Darold <gilles@darold.net> writes:

I have renamed the patch and the title of this proposal registered in
the commitfest "Xact/SubXact event callback at command start" to reflect
the last changes that do not include new hooks anymore.

Hmm, it doesn't seem like this has addressed my concern at all.
The callbacks are still effectively defined as "run during
start_xact_command", so they're not any less squishy semantically
than they were before. The point of my criticism was that you
should move the call site to someplace that's more organically
connected to execution of commands.

I would like to move it closer to the command execution but the only
place I see would be in BeginCommand() which actually is waiting for
code to execute, for the moment this function does nothing. I don't see
another possible place after start_xact_command() call. All my attempt
to inject the callback after start_xact_command() result in a failure.
If you see an other place I will be please to give it a test.

Another thing I'm not too pleased with in this formulation is that it's
very unclear what the distinction is between XACT_EVENT_COMMAND_START
and SUBXACT_EVENT_COMMAND_START. AFAICS, *every* potential use-case
for this would have to hook into both callback chains, and most likely
would treat the two events alike. Plus, as you note, the goalposts
have suddenly been moved for the amount of overhead it's okay to have
in an XactCallback or SubXactCallback function. So that might cause
problems for unrelated code. It's probably better to not try to
re-use that infrastructure.

Actually XACT_EVENT_COMMAND_START occurs only after the call BEGIN, when
a transaction starts, whereas SUBXACT_EVENT_COMMAND_START occurs in all
subsequent statement execution of this transaction. This helps to
perform different actions following the event. In the example extension
only SUBXACT_EVENT_COMMAND_START is used but for example I could use
event XACT_EVENT_COMMAND_START to not send a RELEASE savepoint as there
is none. I detect this case differently but this could be an improvement
in the extension.

<digression>

The objective of this new callback is to be able to call user-defined
code before any new statement is executed. For example it can call a
rollback to savepoint if there was an error in the previous transaction
statement, which allow to implements Rollback at Statement Level at
server side using a PostgreSQL extension, see [1] .

Urgh. Isn't this re-making the same mistake we made years ago, namely
trying to implement autocommit on the server side? I fear this will
be a disaster even larger than that was, because if it's an extension
then pretty much no applications will be prepared for the new semantics.
I strongly urge you to read the discussions that led up to f85f43dfb,
and to search the commit history before that for mentions of
"autocommit", to see just how extensive the mess was.

I spent a little time trying to locate said discussions; it's harder
than it should be because we didn't have the practice of citing email
threads in the commit log at the time. I did find

/messages/by-id/Pine.LNX.4.44.0303172059170.1975-100000@peter.localdomain

which seems to have been the proximate decision, and here are
a few threads talking about all the messes that were created
for JDBC etc:

/messages/by-id/3D793A93.7030000@xythos.com
/messages/by-id/3383060E-272E-11D7-BA14-000502E740BA@wellsgaming.com
/messages/by-id/Law14-F37PIje6n0ssr00000bc1@hotmail.com

Basically, changing transactional semantics underneath clients is
a horrid idea. Having such behavior in an extension rather than
the core doesn't make it less horrid. If we'd designed it to act
that way from day one, maybe it'd have been fine. But as things
stand, we are quite locked into the position that this has to be
managed on the client side.

Yes I have suffered of this implementation for server side autocommit,
it was reverted in PG 7.4 if I remember well. I'm old enough to remember
that :-). I'm also against restoring this feature inside PG core but the
fact that the subject comes again almost every 2 years mean that there
is a need on this feature. This is why I'm proposing to be able to use
an extension for those who really need the feature, with all the
associated warning.

For example in my case the first time I was needing this feature was to
emulate the behavior of DB2 that allows rollback at statement level.
This is not exactly autocommit because the transaction still need to be
validated or rolledback at end, this is just that an error will not
invalidate the full transaction but just the failing statement. I think
that this is different. Actually I have an extension that is doing that
for most of the work but we still have to send the ROLLBACK TO savepoint
at client side which is really a performances killer and especially
painful to implement with JDBC Exception blocks.

Recently I was working on an Oracle to PostgreSQL migration and want to
implement an other Oracle feature like that is heavily used when
importing data from different sources into a data warehouse. It's very
common in the Oracle world to batch data importinside a transaction and
log silently the errors into a dedicated table to be processed later.
"Whatever" (this concern only certain errors) happens you continue to
import the data and DBAs will check what to fix and will re-import the
records in error. Again, I have an extension that is doing that but we
still have to generate the ROLLBACK TO at client side. This can be
avoided with this proposal and will greatly simplify the code at client
side.

We all know the problems of such server side implementation but once you
have implemented it at client side and you are looking for better
performances it's obvious that this kind of extension could help. The
other solution is to move to a proprietary PostgreSQL fork which is
surely not what we want.

</digression>

That point doesn't necessarily invalidate the value of having
some sort of hook in this general area. But I would kind of like
to see another use-case, because I don't believe in this one.

I have sited two use-case, they are both based on the rollback at
statement level feature. I'm pretty sure that there is several other
use-cases that escape my poor imagination. IMHO the possibility to offer
the rollback at statement level feature through an extension should be
enough but if anyone have other use-case I will be pleased to create an
extension to test it :-)

--
Gilles Darold
http://www.darold.net/

#15

gilles@darold.net

over 4 years ago

In reply to: Gilles Darold (#14)

Re: [PATCH] Hooks at XactCommand level

Le 15/07/2021 à 09:44, Gilles Darold a écrit :

Le 14/07/2021 à 21:26, Tom Lane a écrit :

Gilles Darold<gilles@darold.net> writes:

I have renamed the patch and the title of this proposal registered in
the commitfest "Xact/SubXact event callback at command start" to reflect
the last changes that do not include new hooks anymore.

Hmm, it doesn't seem like this has addressed my concern at all.
The callbacks are still effectively defined as "run during
start_xact_command", so they're not any less squishy semantically
than they were before. The point of my criticism was that you
should move the call site to someplace that's more organically
connected to execution of commands.

I would like to move it closer to the command execution but the only
place I see would be in BeginCommand() which actually is waiting for
code to execute, for the moment this function does nothing. I don't
see another possible place after start_xact_command() call. All my
attempt to inject the callback after start_xact_command() result in a
failure. If you see an other place I will be please to give it a test.

Looks like I have not well understood again, maybe you want me to move
the callback just after the start_xact_command() so that it is not
"hidden" in the "run during
start_xact_command". Ok, I will do that.

--
Gilles Darold
http://www.darold.net/

#16

gilles@darold.net

over 4 years ago

In reply to: Tom Lane (#13)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Le 14/07/2021 à 21:26, Tom Lane a écrit :

Gilles Darold <gilles@darold.net> writes:

I have renamed the patch and the title of this proposal registered in
the commitfest "Xact/SubXact event callback at command start" to reflect
the last changes that do not include new hooks anymore.

Hmm, it doesn't seem like this has addressed my concern at all.
The callbacks are still effectively defined as "run during
start_xact_command", so they're not any less squishy semantically
than they were before. The point of my criticism was that you
should move the call site to someplace that's more organically
connected to execution of commands.

Another thing I'm not too pleased with in this formulation is that it's
very unclear what the distinction is between XACT_EVENT_COMMAND_START
and SUBXACT_EVENT_COMMAND_START. AFAICS, *every* potential use-case
for this would have to hook into both callback chains, and most likely
would treat the two events alike.

Please find in attachment the new version v2 of the patch, I hope this
time I have well understood your advices. Myapologies for this waste of
time.

I have moved the call to the callback out of start_xact_command() and
limit his call into exec_simple_query() and c_parse_exemessage(). There
is other call to start_xact_command() elsewhere but actually these two
places areenough for what I'm doing with the extensions. I have updated
the extension test cases to check the behavior when autocommit is on or
off, error in execute of prepared statement and error in update where
current of cursor. But there is certainly a case that I have missed.

Other calls of start_xact_command() are in exec_bind_message(),
exec_execute_message(), exec_describe_statement_message(),
exec_describe_portal_message and PostgresMain. In my test they are
either not called or generates duplicates calls to the callback with
exec_simple_query() and c_parse_exemessage().

Also CallXactStartCommand() will only use one event
XACT_EVENT_COMMAND_START and only do a single call:

CallXactCallbacks(XACT_EVENT_COMMAND_START);

Plus, as you note, the goalposts have suddenly been moved for the
amount of overhead it's okay to have in an XactCallback or SubXactCallback
function. So that might cause problems for unrelated code. It's probably
better to not try to re-use that infrastructure.

About this maybe I was not clear in my bench, the overhead is not
introduced by the patch on the callback, there is no overhead. But by
the rollback at statement level extension. In case this was clear but
you think that we must not reuse this callback infrastructure do you
mean that I should fallback to a hook?

Best regard,

--
Gilles Darold
http://www.darold.net/

Attachments:

00001-startcommand_xact_callback-v2.difftext/x-patch; charset=UTF-8; name=00001-startcommand_xact_callback-v2.diffDownload

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 441445927e..7384345a48 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -6118,3 +6118,27 @@ MarkSubTransactionAssigned(void)
 
 	CurrentTransactionState->assigned = true;
 }
+
+/*
+ * CallXactStartCommand
+ *
+ * Wrapper over CallXactCallbacks called in postgres.c after the call to
+ * start_xact_command(). It allows to user-defined code to be executed
+ * for the start of any command through a xact registered callback. This
+ * function do nothing if a transaction is not started.
+ *
+ * The related XactEvent is XACT_EVENT_COMMAND_START.
+ */
+void
+CallXactStartCommand(void)
+{
+	TransactionState s = CurrentTransactionState;
+
+	if (s->blockState == TBLOCK_DEFAULT || s->blockState == TBLOCK_STARTED)
+		return;
+
+	/*
+	 * Call start-of-xact callbacks with start command event
+	 */
+	CallXactCallbacks(XACT_EVENT_COMMAND_START);
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..97a1a19d10 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -989,6 +989,13 @@ exec_simple_query(const char *query_string)
 	 */
 	start_xact_command();
 
+	/*
+	 * Instruct registered callbacks on xact that a new command is to be
+	 * executed. It allows to execute user-defined code before any new
+	 * statement is executed.
+	 */
+	CallXactStartCommand();
+
 	/*
 	 * Zap any pre-existing unnamed statement.  (While not strictly necessary,
 	 * it seems best to define simple-Query mode as if it used the unnamed
@@ -1082,6 +1089,13 @@ exec_simple_query(const char *query_string)
 		/* Make sure we are in a transaction command */
 		start_xact_command();
 
+		/*
+		 * Instruct registered callbacks on xact that a new command is to be
+		 * executed. It allows to execute user-defined code before any new
+		 * statement is executed.
+		 */
+		CallXactStartCommand();
+
 		/*
 		 * If using an implicit transaction block, and we're not already in a
 		 * transaction block, start an implicit block to force this statement
@@ -1361,6 +1375,13 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	 */
 	start_xact_command();
 
+	/*
+	 * Instruct registered callbacks on xact that a new command is to be
+	 * executed. It allows to execute user-defined code before any new
+	 * statement is executed.
+	 */
+	CallXactStartCommand();
+
 	/*
 	 * Switch to appropriate context for constructing parsetrees.
 	 *
@@ -2708,6 +2729,7 @@ start_xact_command(void)
 		!get_timeout_active(CLIENT_CONNECTION_CHECK_TIMEOUT))
 		enable_timeout_after(CLIENT_CONNECTION_CHECK_TIMEOUT,
 							 client_connection_check_interval);
+
 }
 
 static void
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 134f6862da..994e00e3b0 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -119,7 +119,8 @@ typedef enum
 	XACT_EVENT_PREPARE,
 	XACT_EVENT_PRE_COMMIT,
 	XACT_EVENT_PARALLEL_PRE_COMMIT,
-	XACT_EVENT_PRE_PREPARE
+	XACT_EVENT_PRE_PREPARE,
+	XACT_EVENT_COMMAND_START
 } XactEvent;
 
 typedef void (*XactCallback) (XactEvent event, void *arg);
@@ -467,4 +468,6 @@ extern void EnterParallelMode(void);
 extern void ExitParallelMode(void);
 extern bool IsInParallelMode(void);
 
+extern void CallXactStartCommand(void);
+
 #endif							/* XACT_H */

#17

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Gilles Darold (#16)

Re: [PATCH] Hooks at XactCommand level

Gilles Darold <gilles@darold.net> writes:

[ 00001-startcommand_xact_callback-v2.diff ]

I've not read this version of the patch, but I see from the cfbot's
results that it's broken postgres_fdw. I recall that postgres_fdw
uses the XactCallback and SubXactCallback mechanisms, so I'm betting
this means that you've changed the semantics of those callbacks in
an incompatible way. That's probably not a great idea. We could
fix postgres_fdw, but there are more than likely some external
modules that would also get broken, and that is supposed to be a
reasonably stable API.

regards, tom lane

#18

Andres Freund

andres@anarazel.de

over 4 years ago

In reply to: Tom Lane (#17)

Re: [PATCH] Hooks at XactCommand level

Hi,

On 2021-07-30 13:58:51 -0400, Tom Lane wrote:

Gilles Darold <gilles@darold.net> writes:

[ 00001-startcommand_xact_callback-v2.diff ]

I've not read this version of the patch, but I see from the cfbot's
results that it's broken postgres_fdw. I recall that postgres_fdw
uses the XactCallback and SubXactCallback mechanisms, so I'm betting
this means that you've changed the semantics of those callbacks in
an incompatible way. That's probably not a great idea. We could
fix postgres_fdw, but there are more than likely some external
modules that would also get broken, and that is supposed to be a
reasonably stable API.

I think this may partially be an issue with the way that postgres_fdw
uses the callback than with the patch. It disconnects from the server
*regardless* of the XactEvent passed to the callback. That makes it
really hard to extend the callback mechanism to further events...

Now, I'm also *quite* unconvinced that the placement of the
new CallXactStartCommand() in postgres.c is right.

On 2021-07-16 11:48:24 +0200, Gilles Darold wrote:

Other calls of start_xact_command() are in exec_bind_message(),
exec_execute_message(), exec_describe_statement_message(),
exec_describe_portal_message and PostgresMain. In my test they are either
not called or generates duplicates calls to the callback with
exec_simple_query() and c_parse_exemessage().

That seems like an issue with your test then. Prepared statements can be
parsed in one transaction and bind+exec'ed in another. And you even can
execute transaction control statements this way.

IMO this'd need tests somewhere that allow us to verify the hook
placements do something sensible.

It does not seems not great to add a bunch of external function calls
into all these routines. For simple queries postgres.c's exec_*
functions show up in profiles - doing yet another function call that
then also needs to look at various memory locations plausibly will show
up. Particularly when used with pipelined queries.

I'm *very* unconvinced it makes sense to implement a feature like this
in an extension / that we should expose API for that purpose. To me the
top-level transaction state is way too tied to our internals for it to
be reasonably dealt with in an extension. And I think an in-core version
would need to tackle the overhead and internal query execution issues
this feature has.

Greetings,

Andres Freund

#19

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Andres Freund (#18)

Re: [PATCH] Hooks at XactCommand level

Andres Freund <andres@anarazel.de> writes:

On 2021-07-30 13:58:51 -0400, Tom Lane wrote:

I've not read this version of the patch, but I see from the cfbot's
results that it's broken postgres_fdw.

I think this may partially be an issue with the way that postgres_fdw
uses the callback than with the patch. It disconnects from the server
*regardless* of the XactEvent passed to the callback. That makes it
really hard to extend the callback mechanism to further events...

Perhaps. Nonetheless, I thought upthread that adding these events
as Xact/SubXactCallback events was the wrong design, and I still
think that. A new hook would be a more sensible way.

I'm *very* unconvinced it makes sense to implement a feature like this
in an extension / that we should expose API for that purpose. To me the
top-level transaction state is way too tied to our internals for it to
be reasonably dealt with in an extension.

Yeah, that's the other major problem --- the use-case doesn't seem
very convincing. I'm not even sold on the goal, let alone on trying
to implement it by hooking into these particular places. I think
that'll end up being buggy and fragile as well as not very performant.

regards, tom lane

#20

Andres Freund

andres@anarazel.de

over 4 years ago

In reply to: Tom Lane (#19)

Re: [PATCH] Hooks at XactCommand level

Hi,

On 2021-07-30 17:49:09 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2021-07-30 13:58:51 -0400, Tom Lane wrote:

I've not read this version of the patch, but I see from the cfbot's
results that it's broken postgres_fdw.

I think this may partially be an issue with the way that postgres_fdw
uses the callback than with the patch. It disconnects from the server
*regardless* of the XactEvent passed to the callback. That makes it
really hard to extend the callback mechanism to further events...

Perhaps. Nonetheless, I thought upthread that adding these events
as Xact/SubXactCallback events was the wrong design, and I still
think that. A new hook would be a more sensible way.

I know I've wanted additional events in XactEvent before that'd also be
problematic for pg_fdw, but not make sense as a separate event. E.g. an
event when an xid is assigned.

I'm *very* unconvinced it makes sense to implement a feature like this
in an extension / that we should expose API for that purpose. To me the
top-level transaction state is way too tied to our internals for it to
be reasonably dealt with in an extension.

Yeah, that's the other major problem --- the use-case doesn't seem
very convincing. I'm not even sold on the goal, let alone on trying
to implement it by hooking into these particular places. I think
that'll end up being buggy and fragile as well as not very performant.

I'm more favorable than you on the overall goal. Migrations to PG are a
frequent and good thing and as discussed before, lots of PG forks ended
up implementing a version of this. Clearly there's demand.

However, I think a proper implementation would require a substantial
amount of effort. Including things like optimizing the subtransaction
logic so that switching the feature on doesn't lead to xid wraparound
issues. Adding odd hooks doesn't move us towards a real solution imo.

Greetings,

Andres Freund

#21

gilles@darold.net

over 4 years ago

In reply to: Andres Freund (#20)

Re: [PATCH] Hooks at XactCommand level

Le 31/07/2021 à 01:28, Andres Freund a écrit :

I'm *very* unconvinced it makes sense to implement a feature like this
in an extension / that we should expose API for that purpose. To me the
top-level transaction state is way too tied to our internals for it to
be reasonably dealt with in an extension.

Yeah, that's the other major problem --- the use-case doesn't seem
very convincing. I'm not even sold on the goal, let alone on trying
to implement it by hooking into these particular places. I think
that'll end up being buggy and fragile as well as not very performant.

I'm more favorable than you on the overall goal. Migrations to PG are a
frequent and good thing and as discussed before, lots of PG forks ended
up implementing a version of this. Clearly there's demand.

Sorry for the response delay. I have though about adding this odd hook
to be able to implement this feature through an extension because I
don't think this is something that should be implemented in core. There
were also patches proposals which were all rejected.

We usually implement the feature at client side which is imo enough for
the use cases. But the problem is that this a catastrophe in term of
performances. I have done a small benchmark to illustrate the problem.
This is a single process client on the same host than the PG backend.

For 10,000 tuples inserted with 50% of failures and rollback at
statement level handled at client side:

Expected: 5001, Count: 5001
DML insert took: 13 wallclock secs ( 0.53 usr + 0.94 sys =
1.47 CPU)

Now with statement at rollback level handled at server side using the
hook and the extension:

Expected: 5001, Count: 5001
DML insert took: 4 wallclock secs ( 0.27 usr + 0.32 sys =
0.59 CPU)

And with 100,000 tuples this is worst. Without the extension:

Expected: 50001, Count: 50001
DML insert took: 1796 wallclock secs (14.95 usr + 20.29 sys =
35.24 CPU)

with server side Rollback at statement level:

Expected: 50001, Count: 50001
DML insert took: 372 wallclock secs ( 4.85 usr + 5.45 sys =
10.30 CPU)

I think this is not so uncommon use cases and that could shows the
interest of such extension.

However, I think a proper implementation would require a substantial
amount of effort. Including things like optimizing the subtransaction
logic so that switching the feature on doesn't lead to xid wraparound
issues. Adding odd hooks doesn't move us towards a real solution imo.

I would like to help on this part but unfortunately I have no idea on
how we can improve that.

Best regards,

--
Gilles Darold

#22

gilles@darold.net

over 4 years ago

In reply to: Tom Lane (#19)

1 attachment(s)

Re: [PATCH] Hooks at XactCommand level

Le 30/07/2021 à 23:49, Tom Lane a écrit :

Andres Freund <andres@anarazel.de> writes:

On 2021-07-30 13:58:51 -0400, Tom Lane wrote:

I've not read this version of the patch, but I see from the cfbot's
results that it's broken postgres_fdw.

I think this may partially be an issue with the way that postgres_fdw
uses the callback than with the patch. It disconnects from the server
*regardless* of the XactEvent passed to the callback. That makes it
really hard to extend the callback mechanism to further events...

Perhaps. Nonetheless, I thought upthread that adding these events
as Xact/SubXactCallback events was the wrong design, and I still
think that. A new hook would be a more sensible way.

I'm *very* unconvinced it makes sense to implement a feature like this
in an extension / that we should expose API for that purpose. To me the
top-level transaction state is way too tied to our internals for it to
be reasonably dealt with in an extension.

Yeah, that's the other major problem --- the use-case doesn't seem
very convincing. I'm not even sold on the goal, let alone on trying
to implement it by hooking into these particular places. I think
that'll end up being buggy and fragile as well as not very performant.

I've attached the new version v5 of the patch that use a hook instead of
the use of a xact callback. Compared to the first implementation calls
to the hook have been extracted from the start_xact_command() function.
The test extension have also be updated.

If I understand well the last discussions there is no chance of having
this hook included. If there is no contrary opinion I will withdraw the
patch from the commitfest. However thank you so much to have taken time
to review this proposal.

Best regards,

--
Gilles Darold

Attachments:

command-start-hook-v5.patchtext/x-patch; charset=UTF-8; name=command-start-hook-v5.patchDownload

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 530caa520b..bc62a2cb98 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -207,6 +207,8 @@ static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);
 
+/* Hooks for plugins to get control at end of start_xact_command() */
+XactCommandStart_hook_type start_xact_command_hook = NULL;
 
 /* ----------------------------------------------------------------
  *		routines to obtain user input
@@ -989,6 +991,13 @@ exec_simple_query(const char *query_string)
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/*
 	 * Zap any pre-existing unnamed statement.  (While not strictly necessary,
 	 * it seems best to define simple-Query mode as if it used the unnamed
@@ -1082,6 +1091,13 @@ exec_simple_query(const char *query_string)
 		/* Make sure we are in a transaction command */
 		start_xact_command();
 
+		/*
+		 * Now give loadable modules a chance to execute code
+		 * before a transaction command is processed.
+		 */
+		if (start_xact_command_hook)
+			(*start_xact_command_hook) ();
+
 		/*
 		 * If using an implicit transaction block, and we're not already in a
 		 * transaction block, start an implicit block to force this statement
@@ -1361,6 +1377,13 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/*
 	 * Switch to appropriate context for constructing parsetrees.
 	 *
@@ -1647,6 +1670,13 @@ exec_bind_message(StringInfo input_message)
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/* Switch back to message context */
 	MemoryContextSwitchTo(MessageContext);
 
@@ -2140,6 +2170,13 @@ exec_execute_message(const char *portal_name, long max_rows)
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/*
 	 * If we re-issue an Execute protocol request against an existing portal,
 	 * then we are only fetching more rows rather than completely re-executing
@@ -2546,6 +2583,13 @@ exec_describe_statement_message(const char *stmt_name)
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/* Switch back to message context */
 	MemoryContextSwitchTo(MessageContext);
 
@@ -2641,6 +2685,13 @@ exec_describe_portal_message(const char *portal_name)
 	 */
 	start_xact_command();
 
+	/*
+	 * Now give loadable modules a chance to execute code
+	 * before a transaction command is processed.
+	 */
+	if (start_xact_command_hook)
+		(*start_xact_command_hook) ();
+
 	/* Switch back to message context */
 	MemoryContextSwitchTo(MessageContext);
 
@@ -4561,6 +4612,13 @@ PostgresMain(int argc, char *argv[],
 				/* start an xact for this function invocation */
 				start_xact_command();
 
+				/*
+				 * Now give loadable modules a chance to execute code
+				 * before a transaction command is processed.
+				 */
+				if (start_xact_command_hook)
+					(*start_xact_command_hook) ();
+
 				/*
 				 * Note: we may at this point be inside an aborted
 				 * transaction.  We can't throw error for that until we've
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 2318f04ff0..540ede42fd 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -48,4 +48,8 @@ extern bool PlannedStmtRequiresSnapshot(struct PlannedStmt *pstmt);
 
 extern void EnsurePortalSnapshotExists(void);
 
+/* Hook for plugins to get control in start_xact_command() and finish_xact_command() */
+typedef void (*XactCommandStart_hook_type) (void);
+extern PGDLLIMPORT XactCommandStart_hook_type start_xact_command_hook;
+
 #endif							/* PQUERY_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37cf4b2f76..110541731c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2938,6 +2938,7 @@ XPVIV
 XPVMG
 XactCallback
 XactCallbackItem
+XactCommandStart_hook_type
 XactEvent
 XactLockTableWaitInfo
 XidBoundsViolation

#23

Andres Freund

andres@anarazel.de

over 4 years ago

In reply to: Gilles Darold (#21)

Re: [PATCH] Hooks at XactCommand level

Hi,

On 2021-08-10 10:12:26 +0200, Gilles Darold wrote:

Sorry for the response delay. I have though about adding this odd hook to be
able to implement this feature through an extension because I don't think
this is something that should be implemented in core. There were also
patches proposals which were all rejected.

We usually implement the feature at client side which is imo enough for the
use cases. But the problem is that this a catastrophe in term of
performances. I have done a small benchmark to illustrate the problem. This
is a single process client on the same host than the PG backend.

For 10,000 tuples inserted with 50% of failures and rollback at statement
level handled at client side:

ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ Expected: 5001, Count:ï¿½ 5001
ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ DML insert took: 13 wallclock secs ( 0.53 usr +ï¿½ 0.94 sys =ï¿½ 1.47
CPU)

Something seems off here. This suggests every insert took 2.6ms. That
seems awfully long, unless your network latency is substantial. I did a
quick test implementing this in the naive-most way in pgbench, and I get
better times - and there's *lots* of room for improvement.

I used a pgbench script that sent the following:
BEGIN;
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
RELEASE SAVEPOINT insert_success;
{repeat 5 times}
COMMIT;

I.e. 5 failing and 5 succeeding insertions wrapped in one transaction. I
get >2500 tps, i.e. > 25k rows/sec. And it's not hard to optimize that
further - the {ROLLBACK TO,RELEASE} SAVEPOINT; SAVEPOINT; INSERT can be
sent in one roundtrip. That gets me to somewhere around 40k rows/sec.

BEGIN;

\startpipeline
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
RELEASE SAVEPOINT insert_success;
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

{repeat last two blocks three times}
COMMIT;

Greetings,

Andres Freund

#24

gilles@darold.net

over 4 years ago

In reply to: Andres Freund (#23)

Re: [PATCH] Hooks at XactCommand level

Le 13/08/2021 à 11:58, Andres Freund a écrit :

Hi,

On 2021-08-10 10:12:26 +0200, Gilles Darold wrote:

Sorry for the response delay. I have though about adding this odd hook to be
able to implement this feature through an extension because I don't think
this is something that should be implemented in core. There were also
patches proposals which were all rejected.

We usually implement the feature at client side which is imo enough for the
use cases. But the problem is that this a catastrophe in term of
performances. I have done a small benchmark to illustrate the problem. This
is a single process client on the same host than the PG backend.

For 10,000 tuples inserted with 50% of failures and rollback at statement
level handled at client side:

Expected: 5001, Count: 5001
DML insert took: 13 wallclock secs ( 0.53 usr + 0.94 sys = 1.47
CPU)

Something seems off here. This suggests every insert took 2.6ms. That
seems awfully long, unless your network latency is substantial. I did a
quick test implementing this in the naive-most way in pgbench, and I get
better times - and there's *lots* of room for improvement.

I used a pgbench script that sent the following:
BEGIN;
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
RELEASE SAVEPOINT insert_success;
{repeat 5 times}
COMMIT;

I.e. 5 failing and 5 succeeding insertions wrapped in one transaction. I
get >2500 tps, i.e. > 25k rows/sec. And it's not hard to optimize that
further - the {ROLLBACK TO,RELEASE} SAVEPOINT; SAVEPOINT; INSERT can be
sent in one roundtrip. That gets me to somewhere around 40k rows/sec.

BEGIN;

\startpipeline
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
RELEASE SAVEPOINT insert_success;
SAVEPOINT insert_fail;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

\startpipeline
ROLLBACK TO SAVEPOINT insert_fail;
SAVEPOINT insert_success;
INSERT INTO testinsert(data) VALUES (1);
\endpipeline

{repeat last two blocks three times}
COMMIT;

Greetings,

Andres Freund

I have written a Perl script to mimic what I have found in an Oracle
batch script to import data in a table. I had this use case in a recent
migration the only difference is that the batch was written in Java.

$dbh->do("BEGIN") or die "FATAL: " . $dbh->errstr . "\n";
my $start = new Benchmark;
my $sth = $dbh->prepare("INSERT INTO t1 VALUES (?, ?)");
exit 1 if (not defined $sth);
for (my $i = 0; $i <= 10000; $i++)
{
        $dbh->do("SAVEPOINT foo") or die "FATAL: " . $dbh->errstr . "\n";
        # Generate a duplicate key each two row inserted
        my $val = $i;
        $val = $i-1 if ($i % 2 != 0);
        unless ($sth->execute($val, 'insert '.$i)) {
                $dbh->do("ROLLBACK TO foo") or die "FATAL: " .
$dbh->errstr . "\n";
        } else {
                $dbh->do("RELEASE foo") or die "FATAL: " . $dbh->errstr
. "\n";
        }
}
$sth->finish();
my $end = new Benchmark;

$dbh->do("COMMIT;");

my $td = timediff($end, $start);
print "DML insert took: " . timestr($td) . "\n";

The timing reported are from my personal computer, there is no network
latency, it uses localhost. Anyway, the objective was not to bench the
DML throughput but the overhead of the rollback at statement level made
at client side versus server side. I guess that you might have the same
speed gain around x3 to x5 or more following the number of tuples?

The full script can be found here
https://github.com/darold/pg_statement_rollbackv2/blob/main/test/batch_script_example.pl

Cheers,

--
Gilles Darold

#25