Possible segfault when sending notification within a ProcessUtility hook

Started by Anthonin Bonnefoyabout 2 years ago3 messages
#1Anthonin Bonnefoy
anthonin.bonnefoy@datadoghq.com
1 attachment(s)

Hi,

I've encountered the following segfault:

#0: 0x0000000104e821a8 postgres`list_head(l=0x7f7f7f7f7f7f7f7f) at
pg_list.h:130:17
#1: 0x0000000104e81c9c postgres`PreCommit_Notify at async.c:932:16
#2: 0x0000000104dd02f8 postgres`CommitTransaction at xact.c:2236:2
#3: 0x0000000104dcfc24 postgres`CommitTransactionCommand at xact.c:3061:4
#4: 0x000000010528a880 postgres`finish_xact_command at postgres.c:2777:3
#5: 0x00000001052883ac postgres`exec_simple_query(query_string="notify
test;") at postgres.c:1298:4

This happens when a transaction block fails and a ProcessUtility hook
sends a notification during the rollback command.

When a transaction block fails, it will enter in a TBLOCK_ABORT state,
waiting for a rollback. Calling rollback will switch to a
TBLOCK_ABORT_END state and will only go through CleanupTransaction.
If a hook sends a notification during the rollback command, a
notification will be queued but its content will be wiped when the
TopTransactionContext is destroyed.
Trying to send a notification immediately after will segfault in
PreCommit_Notify as pendingNotifies->events will be invalid.

There's a test_notify_rollback test module attached to the patch that reproduces
the issue.

Moving notification clean up from AbortTransaction to CleanupTransaction fixes
the issue as it will clear pendingActions in the same function that destroys the
TopTransactionContext.

Regards,
Anthonin

Attachments:

v1-0001-Fix-segfault-when-notifying-in-a-ProcessUtility-h.patchapplication/octet-stream; name=v1-0001-Fix-segfault-when-notifying-in-a-ProcessUtility-h.patchDownload
From 7304ccf1b558a03221552ac3313e03ea41d4b3de Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Date: Tue, 5 Dec 2023 16:46:03 +0100
Subject: [PATCH v1] Fix segfault when notifying in a ProcessUtility hook

This can happen when using a transaction block and a ProcessUtility hook
that would send a notification on Rollback.

When a transaction block fails, it will enter in a TBLOCK_ABORT state,
waiting for a rollback. Calling rollback will switch to a
TBLOCK_ABORT_END state and will only go through CleanupTransaction.
If a hook has sent a notification during the rollback command, a
notification will be queued but its content will be wiped during memory
cleanup.
Trying to send a notification immediately after will segfault in
PreCommit_Notify as pendingNotifies->events will be invalid.

Fix by moving cleanup notification from AbortTransaction to
CleanupTransaction.
---
 src/backend/access/transam/xact.c             |  2 +-
 src/backend/commands/async.c                  |  6 +-
 src/include/commands/async.h                  |  2 +-
 src/test/modules/Makefile                     |  1 +
 src/test/modules/meson.build                  |  1 +
 .../modules/test_notify_rollback/.gitignore   |  4 ++
 .../modules/test_notify_rollback/Makefile     | 26 ++++++++
 src/test/modules/test_notify_rollback/README  |  5 ++
 .../expected/test_notify_rollback.out         | 14 +++++
 .../modules/test_notify_rollback/meson.build  | 30 +++++++++
 .../sql/test_notify_rollback.sql              | 13 ++++
 .../test_notify_rollback.c                    | 63 +++++++++++++++++++
 12 files changed, 162 insertions(+), 5 deletions(-)
 create mode 100644 src/test/modules/test_notify_rollback/.gitignore
 create mode 100644 src/test/modules/test_notify_rollback/Makefile
 create mode 100644 src/test/modules/test_notify_rollback/README
 create mode 100644 src/test/modules/test_notify_rollback/expected/test_notify_rollback.out
 create mode 100644 src/test/modules/test_notify_rollback/meson.build
 create mode 100644 src/test/modules/test_notify_rollback/sql/test_notify_rollback.sql
 create mode 100644 src/test/modules/test_notify_rollback/test_notify_rollback.c

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 536edb3792..854797d24c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2800,7 +2800,6 @@ AbortTransaction(void)
 	AtAbort_Portals();
 	smgrDoPendingSyncs(false, is_parallel_worker);
 	AtEOXact_LargeObject(false);
-	AtAbort_Notify();
 	AtEOXact_RelationMap(false, is_parallel_worker);
 	AtAbort_Twophase();
 
@@ -2910,6 +2909,7 @@ CleanupTransaction(void)
 	TopTransactionResourceOwner = NULL;
 
 	AtCleanup_Memory();			/* and transaction memory */
+	AtCleanup_Notify();
 
 	s->fullTransactionId = InvalidFullTransactionId;
 	s->subTransactionId = InvalidSubTransactionId;
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 264f25a8f9..5459d182eb 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -1652,15 +1652,15 @@ SignalBackends(void)
 }
 
 /*
- * AtAbort_Notify
+ * AtCleanup_Notify
  *
- *	This is called at transaction abort.
+ *	This is called at transaction cleanup.
  *
  *	Gets rid of pending actions and outbound notifies that we would have
  *	executed if the transaction got committed.
  */
 void
-AtAbort_Notify(void)
+AtCleanup_Notify(void)
 {
 	/*
 	 * If we LISTEN but then roll back the transaction after PreCommit_Notify,
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index a44472b352..1acc05d03d 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -40,7 +40,7 @@ extern void Async_UnlistenAll(void);
 /* perform (or cancel) outbound notify processing at transaction commit */
 extern void PreCommit_Notify(void);
 extern void AtCommit_Notify(void);
-extern void AtAbort_Notify(void);
+extern void AtCleanup_Notify(void);
 extern void AtSubCommit_Notify(void);
 extern void AtSubAbort_Notify(void);
 extern void AtPrepare_Notify(void);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 5d33fa6a9a..1b3a3107b5 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -23,6 +23,7 @@ SUBDIRS = \
 		  test_integerset \
 		  test_lfind \
 		  test_misc \
+		  test_notify_rollback \
 		  test_oat_hooks \
 		  test_parser \
 		  test_pg_dump \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index b76f588559..90f49bceea 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -20,6 +20,7 @@ subdir('test_ginpostinglist')
 subdir('test_integerset')
 subdir('test_lfind')
 subdir('test_misc')
+subdir('test_notify_rollback')
 subdir('test_oat_hooks')
 subdir('test_parser')
 subdir('test_pg_dump')
diff --git a/src/test/modules/test_notify_rollback/.gitignore b/src/test/modules/test_notify_rollback/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/test_notify_rollback/Makefile b/src/test/modules/test_notify_rollback/Makefile
new file mode 100644
index 0000000000..dad3533ea5
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_notify_rollback/Makefile
+
+MODULE_big = test_notify_rollback
+OBJS = \
+	$(WIN32RES) \
+	test_notify_rollback.o
+PGFILEDESC = "test_notify_rollback - send async notification during rollback"
+
+REGRESS = test_notify_rollback
+
+# disable installcheck for now
+NO_INSTALLCHECK = 1
+# and also for now force NO_LOCALE and UTF8
+ENCODING = UTF8
+NO_LOCALE = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_notify_rollback
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_notify_rollback/README b/src/test/modules/test_notify_rollback/README
new file mode 100644
index 0000000000..76cbb22977
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/README
@@ -0,0 +1,5 @@
+This module tests sending an async notification in a process utility hook.
+Async notifications are sent at commit time. However, a rollback statement
+after a failed transaction won't go through CommitTransaction and won't send
+queued notifications. It will still go through AtCleanup_Memory, resetting
+context.
diff --git a/src/test/modules/test_notify_rollback/expected/test_notify_rollback.out b/src/test/modules/test_notify_rollback/expected/test_notify_rollback.out
new file mode 100644
index 0000000000..529b7d0ad8
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/expected/test_notify_rollback.out
@@ -0,0 +1,14 @@
+LOAD 'test_notify_rollback';
+-- Start a transaction block
+BEGIN;
+-- Abort the transaction with a syntax error
+    g;
+ERROR:  syntax error at or near "g"
+LINE 1: g;
+        ^
+-- Rollback the transaction, test_notify_rollback's hook will
+-- queue a notification while we're in a failed transaction state
+ROLLBACK;
+-- Try to use notify after the rollback, making sure we don't
+-- segfault trying to use memory that was reset
+NOTIFY test;
diff --git a/src/test/modules/test_notify_rollback/meson.build b/src/test/modules/test_notify_rollback/meson.build
new file mode 100644
index 0000000000..21f4719010
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/meson.build
@@ -0,0 +1,30 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+test_notify_rollback_sources = files(
+  'test_notify_rollback.c',
+)
+
+if host_system == 'windows'
+  test_notify_rollback_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'test_notify_rollback',
+    '--FILEDESC', 'test_notify_rollback - send async notification during rollback',])
+endif
+
+test_notify_rollback = shared_module('test_notify_rollback',
+  test_notify_rollback_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_notify_rollback
+
+tests += {
+  'name': 'test_notify_rollback',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'regress': {
+    'sql': [
+      'test_notify_rollback',
+    ],
+    'regress_args': ['--no-locale', '--encoding=UTF8'],
+    'runningcheck': false,
+  },
+}
diff --git a/src/test/modules/test_notify_rollback/sql/test_notify_rollback.sql b/src/test/modules/test_notify_rollback/sql/test_notify_rollback.sql
new file mode 100644
index 0000000000..fda58af053
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/sql/test_notify_rollback.sql
@@ -0,0 +1,13 @@
+LOAD 'test_notify_rollback';
+
+-- Start a transaction block
+BEGIN;
+-- Abort the transaction with a syntax error
+    g;
+-- Rollback the transaction, test_notify_rollback's hook will
+-- queue a notification while we're in a failed transaction state
+ROLLBACK;
+
+-- Try to use notify after the rollback, making sure we don't
+-- segfault trying to use memory that was reset
+NOTIFY test;
diff --git a/src/test/modules/test_notify_rollback/test_notify_rollback.c b/src/test/modules/test_notify_rollback/test_notify_rollback.c
new file mode 100644
index 0000000000..c5465ab45f
--- /dev/null
+++ b/src/test/modules/test_notify_rollback/test_notify_rollback.c
@@ -0,0 +1,63 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_notify_rollback.c
+ *		Code for testing sending notification within ProcessUtility hook.
+ *
+ * Copyright (c) 2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_notify_rollback/test_notify_rollback.c
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "commands/async.h"
+#include "tcop/utility.h"
+
+PG_MODULE_MAGIC;
+
+/* Saved hook values */
+static ProcessUtility_hook_type next_ProcessUtility_hook = NULL;
+
+/* Notify rollback Hook */
+static void REGRESS_utility_command(PlannedStmt *pstmt,
+									const char *queryString, bool readOnlyTree,
+									ProcessUtilityContext context,
+									ParamListInfo params,
+									QueryEnvironment *queryEnv,
+									DestReceiver *dest, QueryCompletion *qc);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+	/* ProcessUtility hook */
+	next_ProcessUtility_hook = ProcessUtility_hook;
+	ProcessUtility_hook = REGRESS_utility_command;
+}
+
+static void
+REGRESS_utility_command(PlannedStmt *pstmt,
+						const char *queryString,
+						bool readOnlyTree,
+						ProcessUtilityContext context,
+						ParamListInfo params,
+						QueryEnvironment *queryEnv,
+						DestReceiver *dest,
+						QueryCompletion *qc)
+{
+	/* Forward to next hook in the chain */
+	if (next_ProcessUtility_hook)
+		(*next_ProcessUtility_hook) (pstmt, queryString, readOnlyTree,
+									 context, params, queryEnv,
+									 dest, qc);
+	else
+		standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+								context, params, queryEnv,
+								dest, qc);
+	Async_Notify("test", NULL);
+}
-- 
2.39.3 (Apple Git-145)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Anthonin Bonnefoy (#1)
Re: Possible segfault when sending notification within a ProcessUtility hook

Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> writes:

This happens when a transaction block fails and a ProcessUtility hook
sends a notification during the rollback command.

Why should we regard that as anything other than a bug in the
ProcessUtility hook? A failed transaction should not send any
notifies.

Moving notification clean up from AbortTransaction to CleanupTransaction fixes
the issue as it will clear pendingActions in the same function that destroys the
TopTransactionContext.

Maybe that would be okay, or maybe not, but I'm disinclined to
mess with it without a better argument for changing it. It seems
like there would still be room to break things with mistimed
calls to async.c functions.

regards, tom lane

#3Anthonin Bonnefoy
anthonin.bonnefoy@datadoghq.com
In reply to: Tom Lane (#2)
Re: Possible segfault when sending notification within a ProcessUtility hook

On Tue, Dec 5, 2023 at 9:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Why should we regard that as anything other than a bug in the
ProcessUtility hook? A failed transaction should not send any
notifies.

Fair point. That was also my initial assumption but I thought that the
transaction
state was not available from a hook as I've missed
IsAbortedTransactionBlockState.

I will rely on IsAbortedTransactionBlockState to avoid this case,
thanks for the input.

Regards,
Anthonin.