Persist injection points across server restarts

Started by Michael Paquier9 months ago18 messages

michael@paquier.xyz

9 months ago

3 attachment(s)

Hi all,

A bug related to data visibility on standbys with a race condition
mixing 2PC transactions and synchronous_standby_names with the
checkpointer has been fixed a couple of days ago:
/messages/by-id/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru

One issue that we had while discussing this thread is that there was
no easy way to have a regression test because the problem requires a
wait in the checkpointer at a very early startup phase where the
shared memory status data related to s_s_names has *not* been
initialized yet.

I have been working on the problem and found out one nice way to
address this limitation, introducing in the module injection_points a
new function that flushes to a file the set of injection points
currently attached to a cluster, reloading the data from the file to
shmem early at startup when initializing the shmem state data through
shared_preload_libraries.

With that in place, it is then possible to make the checkpointer wait
when it starts at a very early stage, giving a way to reproduce the
original failure reported on the other thread:
- A wait injection point is attached.
- A flush is used to write the points' data to disk.
- Node restarts, loading back their state.
- The wait triggers in the checkpointer.

So, please find attached a patch set for all that:
- 0001 is a patch I have stolen from a different thread (see [1]/messages/by-id/Z_xYkA21KyLEHvWR@paquier.xyz -- Michael),
introducing InjectionPointList() that retrieves a list of the
injection points attached.
- 0002 extends injection_points with a new flush function, that can be
used in TAP tests to persist some points across restarts. One sticky
point is that I did not want to add any of this information in the
core backend injection point APIs, nor to any of the backend
structures because that's not necessary, and what's here is enough for
some TAP tests.
- 0003 adds a new regression test providing some coverage for
2e57790836c6. Reverting 2e57790836c6 causes the test to fail. This
shows how to use this new facility.

This is v19 work, so I am adding that to the next commit fest.

Thanks,

[1]: /messages/by-id/Z_xYkA21KyLEHvWR@paquier.xyz -- Michael
--
Michael

Attachments:

0001-Add-InjectionPointList-to-retrieve-list-of-injection.patchtext/x-diff; charset=us-asciiDownload

From daf866a6584e24bd69f4fa126ae1799f1ba0c912 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 22 Apr 2025 12:59:30 +0900
Subject: [PATCH 1/3] Add InjectionPointList() to retrieve list of injection
 points

This hides the internals of the shmem array lookup, allocating the
result in a palloc'd array usable by the caller.
---
 src/include/utils/injection_point.h      | 16 +++++++++
 src/backend/utils/misc/injection_point.c | 46 ++++++++++++++++++++++++
 src/tools/pgindent/typedefs.list         |  1 +
 3 files changed, 63 insertions(+)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 6ba64cd1ebc6..3714739a8772 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -11,6 +11,19 @@
 #ifndef INJECTION_POINT_H
 #define INJECTION_POINT_H
 
+#include "nodes/pg_list.h"
+
+/*
+ * Injection point data, used when retrieving a list of all the attached
+ * injection points.
+ */
+typedef struct InjectionPointData
+{
+	const char *name;
+	const char *library;
+	const char *function;
+} InjectionPointData;
+
 /*
  * Injection points require --enable-injection-points.
  */
@@ -46,6 +59,9 @@ extern void InjectionPointCached(const char *name);
 extern bool IsInjectionPointAttached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
+/* Get the current set of injection points attached */
+extern List *InjectionPointList(void);
+
 #ifdef EXEC_BACKEND
 extern PGDLLIMPORT struct InjectionPointsCtl *ActiveInjectionPoints;
 #endif
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 97ab851f0a63..60b406e4c2e0 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -584,3 +584,49 @@ IsInjectionPointAttached(const char *name)
 	return false;				/* silence compiler */
 #endif
 }
+
+/*
+ * Retrieve a list of all the injection points currently attached.
+ *
+ * This list is palloc'd in the current memory context.
+ */
+List *
+InjectionPointList(void)
+{
+#ifdef USE_INJECTION_POINTS
+	List	   *inj_points = NIL;
+	uint32		max_inuse;
+
+	LWLockAcquire(InjectionPointLock, LW_SHARED);
+
+	max_inuse = pg_atomic_read_u32(&ActiveInjectionPoints->max_inuse);
+
+	for (uint32 idx = 0; idx < max_inuse; idx++)
+	{
+		InjectionPointEntry *entry;
+		InjectionPointData *inj_point;
+		uint64		generation;
+
+		entry = &ActiveInjectionPoints->entries[idx];
+		generation = pg_atomic_read_u64(&entry->generation);
+
+		/* skip free slots */
+		if (generation % 2 == 0)
+			continue;
+
+		inj_point = (InjectionPointData *) palloc0(sizeof(InjectionPointData));
+		inj_point->name = pstrdup(entry->name);
+		inj_point->library = pstrdup(entry->library);
+		inj_point->function = pstrdup(entry->function);
+		inj_points = lappend(inj_points, inj_point);
+	}
+
+	LWLockRelease(InjectionPointLock);
+
+	return inj_points;
+
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+	return NIL;	/* keep compiler quiet */
+#endif
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e5879e00dffe..31d5f8f71f46 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1282,6 +1282,7 @@ InjectionPointCacheEntry
 InjectionPointCallback
 InjectionPointCondition
 InjectionPointConditionType
+InjectionPointData
 InjectionPointEntry
 InjectionPointsCtl
 InjectionPointSharedState
-- 
2.49.0

0002-injection_points-Add-function-to-flush-injection-poi.patchtext/x-diff; charset=us-asciiDownload

From ded8b52a4212c32fe8f79ba68fb076332c3aa05d Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 30 Apr 2025 12:27:33 +0900
Subject: [PATCH 2/3] injection_points: Add function to flush injection point
 data to disk

The function injection_points_flush() can be called in a session to
persist to disk all the injection points currently attached to the
cluster.  These are saved in file called injection_points.data at the
root of PGDATA, and loaded by the cluster at startup with all the points
registered automatically attached.

This will be useful to test scenarios where injection points need to be
for example attached at very early stages of startup, before it is
possible to attach anything via SQL.
---
 .../injection_points--1.0.sql                 |  10 ++
 .../injection_points/injection_points.c       | 163 ++++++++++++++++++
 src/test/modules/injection_points/meson.build |   1 +
 .../injection_points/t/002_data_persist.pl    |  53 ++++++
 4 files changed, 227 insertions(+)
 create mode 100644 src/test/modules/injection_points/t/002_data_persist.pl

diff --git a/src/test/modules/injection_points/injection_points--1.0.sql b/src/test/modules/injection_points/injection_points--1.0.sql
index 5d83f08811b0..e99d44507be1 100644
--- a/src/test/modules/injection_points/injection_points--1.0.sql
+++ b/src/test/modules/injection_points/injection_points--1.0.sql
@@ -3,6 +3,16 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION injection_points" to load this file. \quit
 
+--
+-- injection_points_flush()
+--
+-- Flush to disk all the data of the injection points attached.
+--
+CREATE FUNCTION injection_points_flush()
+RETURNS void
+AS 'MODULE_PATHNAME', 'injection_points_flush'
+LANGUAGE C STRICT;
+
 --
 -- injection_points_attach()
 --
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index ad528d77752b..fb239dca5537 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -24,6 +24,7 @@
 #include "nodes/value.h"
 #include "storage/condition_variable.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/lwlock.h"
 #include "storage/shmem.h"
@@ -39,6 +40,14 @@ PG_MODULE_MAGIC;
 #define INJ_MAX_WAIT	8
 #define INJ_NAME_MAXLEN	64
 
+/* Location of injection point data files, if flush has been requested */
+#define INJ_DUMP_FILE	"injection_points.data"
+#define INJ_DUMP_FILE_TMP	INJ_DUMP_FILE ".tmp"
+
+/* Magic number identifying the injection file */
+static const uint32 INJ_FILE_HEADER = 0xFF345678;
+
+
 /*
  * Conditions related to injection points.  This tracks in shared memory the
  * runtime conditions under which an injection point is allowed to run,
@@ -148,6 +157,9 @@ static void
 injection_shmem_startup(void)
 {
 	bool		found;
+	int32		num_inj_points;
+	uint32		header;
+	FILE	   *file;
 
 	if (prev_shmem_startup_hook)
 		prev_shmem_startup_hook();
@@ -169,6 +181,84 @@ injection_shmem_startup(void)
 	}
 
 	LWLockRelease(AddinShmemInitLock);
+
+	/*
+	 * Done if some other process already completed the initialization.
+	 */
+	if (found)
+		return;
+
+	/*
+	 * Note: there should be no need to bother with locks here, because there
+	 * should be no other processes running when this code is reached.
+	 */
+
+	/* Load injection point data, if any has been found while starting up */
+	file = AllocateFile(INJ_DUMP_FILE, PG_BINARY_R);
+
+	if (file == NULL)
+	{
+		if (errno != ENOENT)
+			goto error;
+
+		/* No file?  We are done. */
+		return;
+	}
+
+	if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+		fread(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	if (header != INJ_FILE_HEADER)
+		goto error;
+
+	for (int i = 0; i < num_inj_points; i++)
+	{
+		const char *name;
+		const char *library;
+		const char *function;
+		uint32		len;
+		char		buf[1024];
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		name = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		library = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		function = pstrdup(buf);
+
+		/* No private data handled here */
+		InjectionPointAttach(name, library, function, NULL, 0);
+	}
+
+	/*
+	 * Remove the persisted injection point file, we do not need it anymore.
+	 */
+	unlink(INJ_DUMP_FILE);
+	FreeFile(file);
+
+	return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read file \"%s\": %m",
+					INJ_DUMP_FILE)));
+	if (file)
+		FreeFile(file);
+
+	unlink(INJ_DUMP_FILE);
 }
 
 /*
@@ -330,6 +420,79 @@ injection_wait(const char *name, const void *private_data)
 	SpinLockRelease(&inj_state->lock);
 }
 
+/*
+ * SQL function for flushing injection point data to disk.
+ */
+PG_FUNCTION_INFO_V1(injection_points_flush);
+Datum
+injection_points_flush(PG_FUNCTION_ARGS)
+{
+	FILE	   *file = NULL;
+	List	   *inj_points = NIL;
+	ListCell   *lc;
+	int32		num_inj_points;
+
+	inj_points = InjectionPointList();
+	if (inj_points == NIL)
+		PG_RETURN_VOID();
+
+	num_inj_points = list_length(inj_points);
+
+	file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W);
+	if (file == NULL)
+		goto error;
+
+	if (fwrite(&INJ_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+		goto error;
+
+	if (fwrite(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	foreach(lc, inj_points)
+	{
+		InjectionPointData *inj_point = lfirst(lc);
+		uint32		len;
+
+		len = strlen(inj_point->name);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->name, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->library);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->library, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->function);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->function, 1, len + 1, file) != len + 1)
+			goto error;
+	}
+
+	if (FreeFile(file))
+	{
+		file = NULL;
+		goto error;
+	}
+
+	/*
+	 * Rename file into place, so we atomically replace any old one.
+	 */
+	(void) durable_rename(INJ_DUMP_FILE_TMP, INJ_DUMP_FILE, LOG);
+
+	PG_RETURN_VOID();
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write file \"%s\": %m",
+					INJ_DUMP_FILE_TMP)));
+	if (file)
+		FreeFile(file);
+	unlink(INJ_DUMP_FILE_TMP);
+	PG_RETURN_VOID();
+}
+
 /*
  * SQL function for creating an injection point.
  */
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index d61149712fd7..b994f8d413d3 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -56,6 +56,7 @@ tests += {
     },
     'tests': [
       't/001_stats.pl',
+      't/002_data_persist.pl',
     ],
   },
 }
diff --git a/src/test/modules/injection_points/t/002_data_persist.pl b/src/test/modules/injection_points/t/002_data_persist.pl
new file mode 100644
index 000000000000..9ecb05230931
--- /dev/null
+++ b/src/test/modules/injection_points/t/002_data_persist.pl
@@ -0,0 +1,53 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+# Tests for persistence of injection point data.
+
+use strict;
+use warnings FATAL => 'all';
+use locale;
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test persistency of statistics generated for injection points.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Node initialization
+my $node = PostgreSQL::Test::Cluster->new('master');
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+));
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Attach a couple of points, which are going to be made persistent.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-error', 'error');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice-2', 'notice');");
+
+# Flush and restart, the injection points still exist.
+$node->safe_psql('postgres', "SELECT injection_points_flush();");
+$node->restart;
+
+my ($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-notice-2')");
+ok( $stderr =~
+	  /NOTICE:  notice triggered for injection point persist-notice-2/,
+	"injection point triggering NOTICE exists");
+
+($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-error')");
+ok($stderr =~ /ERROR:  error triggered for injection point persist-error/,
+	"injection point triggering ERROR exists");
+
+done_testing();
-- 
2.49.0

0003-Add-regression-test-for-2PC-visibility-check-on-stan.patchtext/x-diff; charset=us-asciiDownload

From 67136a69201f0fda22cd7c79e5e02e2d1bfc3380 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 30 Apr 2025 14:15:23 +0900
Subject: [PATCH 3/3] Add regression test for 2PC visibility check on standby

This adds some test coverage for a defect fixed in 2e57790836c6, where
the only reliable test back then was to use a hardcoded sleep().

This test relies on an injection point that is persisted across a node
restart, so as it is possible to cause the checkpointer to wait when
configuring the shared memory state of shared_preload_libraries at a
very early startup stage.

Reverting 2e57790836c6 causes the test to fail, and it passes on HEAD.
---
 src/backend/replication/syncrep.c   |  3 ++
 src/test/recovery/t/009_twophase.pl | 60 +++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index cc35984ad008..d02633619c28 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -84,6 +84,7 @@
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
 #include "utils/guc_hooks.h"
+#include "utils/injection_point.h"
 #include "utils/ps_status.h"
 
 /* User-settable parameters for sync rep */
@@ -968,6 +969,8 @@ SyncRepUpdateSyncStandbysDefined(void)
 	if (sync_standbys_defined !=
 		((WalSndCtl->sync_standbys_status & SYNC_STANDBY_DEFINED) != 0))
 	{
+		INJECTION_POINT("checkpointer-syncrep-update");
+
 		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
 
 		/*
diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl
index 1a662ebe499d..f14da1549bac 100644
--- a/src/test/recovery/t/009_twophase.pl
+++ b/src/test/recovery/t/009_twophase.pl
@@ -51,6 +51,27 @@ $node_paris->append_conf(
 ));
 $node_paris->start;
 
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+my $injection_points_supported =
+  $node_london->check_extension('injection_points');
+if ($injection_points_supported != 0)
+{
+	$node_london->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+	# Set shared_preload_libraries, to allow the injection points to persist
+	# across restarts.
+	$node_london->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+	$node_paris->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+}
+
 # Switch to synchronous replication in both directions
 configure_and_reload($node_london, "synchronous_standby_names = 'paris'");
 configure_and_reload($node_paris, "synchronous_standby_names = 'london'");
@@ -327,6 +348,23 @@ $cur_primary->psql(
 	INSERT INTO t_009_tbl_standby_mvcc VALUES (2, 'issued to ${cur_primary_name}');
 	PREPARE TRANSACTION 'xact_009_standby_mvcc';
 	");
+
+# Attach an injection point to wait in the checkpointer when configuring
+# the shared memory state data related to synchronous_standby_names, then
+# persist the attached point to disk so as the follow-up restart will be able
+# to wait at the early stages of the checkpointer startup sequence.
+#
+# Note that as the checkpointer has already applied the
+# synchronous_standby_names configuration, this has no effect until the
+# next startup of the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_attach('checkpointer-syncrep-update', 'wait')"
+	);
+	$cur_primary->psql('postgres', "SELECT injection_points_flush()");
+}
+
 $cur_primary->stop;
 $cur_standby->restart;
 
@@ -341,6 +379,16 @@ is($psql_out, '0',
 
 # Commit the transaction in primary
 $cur_primary->start;
+
+# Make sure that the checkpointer is waiting before setting up the data of
+# synchronous_standby_names in shared memory.  We want the checkpointer to be
+# stuck and make sure that the next COMMIT PREPARED is detected correctly on
+# the standby when remote_apply is set on the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->wait_for_event('checkpointer',
+		'checkpointer-syncrep-update');
+}
 $cur_primary->psql(
 	'postgres', "
 SET synchronous_commit='remote_apply'; -- To ensure the standby is caught up
@@ -361,6 +409,18 @@ is($psql_out, '2',
 	"Committed prepared transaction is visible to new snapshot in standby");
 $standby_session->quit;
 
+# Remove the injection point, the checkpointer now applies the configuration
+# related to synchronous_standby_names in shared memory.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_wakeup('checkpointer-syncrep-update')");
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_detach('checkpointer-syncrep-update')");
+}
+
+$cur_standby->restart;
+
 ###############################################################################
 # Check for a lock conflict between prepared transaction with DDL inside and
 # replay of XLOG_STANDBY_LOCK wal record.
-- 
2.49.0

Andrey Borodin

x4mmm@yandex-team.ru

9 months ago

In reply to: Michael Paquier (#1)

Re: Persist injection points across server restarts

On 30 Apr 2025, at 10:35, Michael Paquier <michael@paquier.xyz> wrote:

- A flush is used to write the points' data to disk.

Interesting functionality.
Did you consider custom resource manager to WAL-log injection points? This way we do not need to flush them explicitly, they always will be persistent.
Though they will appear on standby, which is, probably, not expected functionality...

Best regards, Andrey Borodin.

Michael Paquier

michael@paquier.xyz

9 months ago

In reply to: Andrey Borodin (#2)

Re: Persist injection points across server restarts

On Wed, Apr 30, 2025 at 10:52:51AM +0500, Andrey Borodin wrote:

Did you consider custom resource manager to WAL-log injection
points? This way we do not need to flush them explicitly, they
always will be persistent.

WAL-logging would not work in the case I've faced, because we need a
point much earlier than the startup process beginning any redo
activity, so I don't think that this would be useful.
--
Michael

Michael Paquier

michael@paquier.xyz

8 months ago

In reply to: Michael Paquier (#1)

3 attachment(s)

Re: Persist injection points across server restarts

On Wed, Apr 30, 2025 at 02:35:40PM +0900, Michael Paquier wrote:

This is v19 work, so I am adding that to the next commit fest.

Rebased, to fix a conflict I've introduced with a different commit.
--
Michael

Attachments:

v2-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patchtext/x-diff; charset=us-asciiDownload

From f84213eab2fc324f0e92ca7c55e164da6b2a74ee Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 22 Apr 2025 12:59:30 +0900
Subject: [PATCH v2 1/3] Add InjectionPointList() to retrieve list of injection
 points

This hides the internals of the shmem array lookup, allocating the
result in a palloc'd array usable by the caller.
---
 src/include/utils/injection_point.h      | 16 +++++++++
 src/backend/utils/misc/injection_point.c | 46 ++++++++++++++++++++++++
 src/tools/pgindent/typedefs.list         |  1 +
 3 files changed, 63 insertions(+)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index a37958e1835f..fd5bc061b7bd 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -11,6 +11,19 @@
 #ifndef INJECTION_POINT_H
 #define INJECTION_POINT_H
 
+#include "nodes/pg_list.h"
+
+/*
+ * Injection point data, used when retrieving a list of all the attached
+ * injection points.
+ */
+typedef struct InjectionPointData
+{
+	const char *name;
+	const char *library;
+	const char *function;
+} InjectionPointData;
+
 /*
  * Injection points require --enable-injection-points.
  */
@@ -47,6 +60,9 @@ extern void InjectionPointCached(const char *name, void *arg);
 extern bool IsInjectionPointAttached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
+/* Get the current set of injection points attached */
+extern List *InjectionPointList(void);
+
 #ifdef EXEC_BACKEND
 extern PGDLLIMPORT struct InjectionPointsCtl *ActiveInjectionPoints;
 #endif
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index f58ebc8ee522..12570fba56e4 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -584,3 +584,49 @@ IsInjectionPointAttached(const char *name)
 	return false;				/* silence compiler */
 #endif
 }
+
+/*
+ * Retrieve a list of all the injection points currently attached.
+ *
+ * This list is palloc'd in the current memory context.
+ */
+List *
+InjectionPointList(void)
+{
+#ifdef USE_INJECTION_POINTS
+	List	   *inj_points = NIL;
+	uint32		max_inuse;
+
+	LWLockAcquire(InjectionPointLock, LW_SHARED);
+
+	max_inuse = pg_atomic_read_u32(&ActiveInjectionPoints->max_inuse);
+
+	for (uint32 idx = 0; idx < max_inuse; idx++)
+	{
+		InjectionPointEntry *entry;
+		InjectionPointData *inj_point;
+		uint64		generation;
+
+		entry = &ActiveInjectionPoints->entries[idx];
+		generation = pg_atomic_read_u64(&entry->generation);
+
+		/* skip free slots */
+		if (generation % 2 == 0)
+			continue;
+
+		inj_point = (InjectionPointData *) palloc0(sizeof(InjectionPointData));
+		inj_point->name = pstrdup(entry->name);
+		inj_point->library = pstrdup(entry->library);
+		inj_point->function = pstrdup(entry->function);
+		inj_points = lappend(inj_points, inj_point);
+	}
+
+	LWLockRelease(InjectionPointLock);
+
+	return inj_points;
+
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+	return NIL;	/* keep compiler quiet */
+#endif
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9ea573fae210..e4842de815e7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1283,6 +1283,7 @@ InjectionPointCacheEntry
 InjectionPointCallback
 InjectionPointCondition
 InjectionPointConditionType
+InjectionPointData
 InjectionPointEntry
 InjectionPointsCtl
 InjectionPointSharedState
-- 
2.49.0

v2-0002-injection_points-Add-function-to-flush-injection-.patchtext/x-diff; charset=us-asciiDownload

From 440625011c4d01c9fc29b59f09c5c1006189cfac Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 30 Apr 2025 12:27:33 +0900
Subject: [PATCH v2 2/3] injection_points: Add function to flush injection
 point data to disk

The function injection_points_flush() can be called in a session to
persist to disk all the injection points currently attached to the
cluster.  These are saved in file called injection_points.data at the
root of PGDATA, and loaded by the cluster at startup with all the points
registered automatically attached.

This will be useful to test scenarios where injection points need to be
for example attached at very early stages of startup, before it is
possible to attach anything via SQL.
---
 .../injection_points--1.0.sql                 |  10 ++
 .../injection_points/injection_points.c       | 163 ++++++++++++++++++
 src/test/modules/injection_points/meson.build |   1 +
 .../injection_points/t/002_data_persist.pl    |  53 ++++++
 4 files changed, 227 insertions(+)
 create mode 100644 src/test/modules/injection_points/t/002_data_persist.pl

diff --git a/src/test/modules/injection_points/injection_points--1.0.sql b/src/test/modules/injection_points/injection_points--1.0.sql
index cc76b1bf99ae..966e1342e4ae 100644
--- a/src/test/modules/injection_points/injection_points--1.0.sql
+++ b/src/test/modules/injection_points/injection_points--1.0.sql
@@ -3,6 +3,16 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION injection_points" to load this file. \quit
 
+--
+-- injection_points_flush()
+--
+-- Flush to disk all the data of the injection points attached.
+--
+CREATE FUNCTION injection_points_flush()
+RETURNS void
+AS 'MODULE_PATHNAME', 'injection_points_flush'
+LANGUAGE C STRICT;
+
 --
 -- injection_points_attach()
 --
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index 3da0cbc10e08..c6c541113680 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -24,6 +24,7 @@
 #include "nodes/value.h"
 #include "storage/condition_variable.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/lwlock.h"
 #include "storage/shmem.h"
@@ -39,6 +40,14 @@ PG_MODULE_MAGIC;
 #define INJ_MAX_WAIT	8
 #define INJ_NAME_MAXLEN	64
 
+/* Location of injection point data files, if flush has been requested */
+#define INJ_DUMP_FILE	"injection_points.data"
+#define INJ_DUMP_FILE_TMP	INJ_DUMP_FILE ".tmp"
+
+/* Magic number identifying the injection file */
+static const uint32 INJ_FILE_HEADER = 0xFF345678;
+
+
 /*
  * Conditions related to injection points.  This tracks in shared memory the
  * runtime conditions under which an injection point is allowed to run,
@@ -151,6 +160,9 @@ static void
 injection_shmem_startup(void)
 {
 	bool		found;
+	int32		num_inj_points;
+	uint32		header;
+	FILE	   *file;
 
 	if (prev_shmem_startup_hook)
 		prev_shmem_startup_hook();
@@ -172,6 +184,84 @@ injection_shmem_startup(void)
 	}
 
 	LWLockRelease(AddinShmemInitLock);
+
+	/*
+	 * Done if some other process already completed the initialization.
+	 */
+	if (found)
+		return;
+
+	/*
+	 * Note: there should be no need to bother with locks here, because there
+	 * should be no other processes running when this code is reached.
+	 */
+
+	/* Load injection point data, if any has been found while starting up */
+	file = AllocateFile(INJ_DUMP_FILE, PG_BINARY_R);
+
+	if (file == NULL)
+	{
+		if (errno != ENOENT)
+			goto error;
+
+		/* No file?  We are done. */
+		return;
+	}
+
+	if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+		fread(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	if (header != INJ_FILE_HEADER)
+		goto error;
+
+	for (int i = 0; i < num_inj_points; i++)
+	{
+		const char *name;
+		const char *library;
+		const char *function;
+		uint32		len;
+		char		buf[1024];
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		name = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		library = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		function = pstrdup(buf);
+
+		/* No private data handled here */
+		InjectionPointAttach(name, library, function, NULL, 0);
+	}
+
+	/*
+	 * Remove the persisted injection point file, we do not need it anymore.
+	 */
+	unlink(INJ_DUMP_FILE);
+	FreeFile(file);
+
+	return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read file \"%s\": %m",
+					INJ_DUMP_FILE)));
+	if (file)
+		FreeFile(file);
+
+	unlink(INJ_DUMP_FILE);
 }
 
 /*
@@ -343,6 +433,79 @@ injection_wait(const char *name, const void *private_data, void *arg)
 	SpinLockRelease(&inj_state->lock);
 }
 
+/*
+ * SQL function for flushing injection point data to disk.
+ */
+PG_FUNCTION_INFO_V1(injection_points_flush);
+Datum
+injection_points_flush(PG_FUNCTION_ARGS)
+{
+	FILE	   *file = NULL;
+	List	   *inj_points = NIL;
+	ListCell   *lc;
+	int32		num_inj_points;
+
+	inj_points = InjectionPointList();
+	if (inj_points == NIL)
+		PG_RETURN_VOID();
+
+	num_inj_points = list_length(inj_points);
+
+	file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W);
+	if (file == NULL)
+		goto error;
+
+	if (fwrite(&INJ_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+		goto error;
+
+	if (fwrite(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	foreach(lc, inj_points)
+	{
+		InjectionPointData *inj_point = lfirst(lc);
+		uint32		len;
+
+		len = strlen(inj_point->name);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->name, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->library);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->library, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->function);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->function, 1, len + 1, file) != len + 1)
+			goto error;
+	}
+
+	if (FreeFile(file))
+	{
+		file = NULL;
+		goto error;
+	}
+
+	/*
+	 * Rename file into place, so we atomically replace any old one.
+	 */
+	(void) durable_rename(INJ_DUMP_FILE_TMP, INJ_DUMP_FILE, LOG);
+
+	PG_RETURN_VOID();
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write file \"%s\": %m",
+					INJ_DUMP_FILE_TMP)));
+	if (file)
+		FreeFile(file);
+	unlink(INJ_DUMP_FILE_TMP);
+	PG_RETURN_VOID();
+}
+
 /*
  * SQL function for creating an injection point.
  */
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index d61149712fd7..b994f8d413d3 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -56,6 +56,7 @@ tests += {
     },
     'tests': [
       't/001_stats.pl',
+      't/002_data_persist.pl',
     ],
   },
 }
diff --git a/src/test/modules/injection_points/t/002_data_persist.pl b/src/test/modules/injection_points/t/002_data_persist.pl
new file mode 100644
index 000000000000..9ecb05230931
--- /dev/null
+++ b/src/test/modules/injection_points/t/002_data_persist.pl
@@ -0,0 +1,53 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+# Tests for persistence of injection point data.
+
+use strict;
+use warnings FATAL => 'all';
+use locale;
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test persistency of statistics generated for injection points.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Node initialization
+my $node = PostgreSQL::Test::Cluster->new('master');
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+));
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Attach a couple of points, which are going to be made persistent.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-error', 'error');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice-2', 'notice');");
+
+# Flush and restart, the injection points still exist.
+$node->safe_psql('postgres', "SELECT injection_points_flush();");
+$node->restart;
+
+my ($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-notice-2')");
+ok( $stderr =~
+	  /NOTICE:  notice triggered for injection point persist-notice-2/,
+	"injection point triggering NOTICE exists");
+
+($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-error')");
+ok($stderr =~ /ERROR:  error triggered for injection point persist-error/,
+	"injection point triggering ERROR exists");
+
+done_testing();
-- 
2.49.0

v2-0003-Add-regression-test-for-2PC-visibility-check-on-s.patchtext/x-diff; charset=us-asciiDownload

From aab10ab4b45fcbd215eda5e12c3281c32f3e8e73 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 21 May 2025 17:15:19 +0900
Subject: [PATCH v2 3/3] Add regression test for 2PC visibility check on
 standby

This adds some test coverage for a defect fixed in 2e57790836c6, where
the only reliable test back then was to use a hardcoded sleep().

This test relies on an injection point that is persisted across a node
restart, so as it is possible to cause the checkpointer to wait when
configuring the shared memory state of shared_preload_libraries at a
very early startup stage.

Reverting 2e57790836c6 causes the test to fail, and it passes on HEAD.
---
 src/backend/replication/syncrep.c   |  3 ++
 src/test/recovery/t/009_twophase.pl | 60 +++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index cc35984ad008..e2be2603d1ce 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -84,6 +84,7 @@
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
 #include "utils/guc_hooks.h"
+#include "utils/injection_point.h"
 #include "utils/ps_status.h"
 
 /* User-settable parameters for sync rep */
@@ -968,6 +969,8 @@ SyncRepUpdateSyncStandbysDefined(void)
 	if (sync_standbys_defined !=
 		((WalSndCtl->sync_standbys_status & SYNC_STANDBY_DEFINED) != 0))
 	{
+		INJECTION_POINT("checkpointer-syncrep-update", NULL);
+
 		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
 
 		/*
diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl
index 1a662ebe499d..f14da1549bac 100644
--- a/src/test/recovery/t/009_twophase.pl
+++ b/src/test/recovery/t/009_twophase.pl
@@ -51,6 +51,27 @@ $node_paris->append_conf(
 ));
 $node_paris->start;
 
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+my $injection_points_supported =
+  $node_london->check_extension('injection_points');
+if ($injection_points_supported != 0)
+{
+	$node_london->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+	# Set shared_preload_libraries, to allow the injection points to persist
+	# across restarts.
+	$node_london->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+	$node_paris->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+}
+
 # Switch to synchronous replication in both directions
 configure_and_reload($node_london, "synchronous_standby_names = 'paris'");
 configure_and_reload($node_paris, "synchronous_standby_names = 'london'");
@@ -327,6 +348,23 @@ $cur_primary->psql(
 	INSERT INTO t_009_tbl_standby_mvcc VALUES (2, 'issued to ${cur_primary_name}');
 	PREPARE TRANSACTION 'xact_009_standby_mvcc';
 	");
+
+# Attach an injection point to wait in the checkpointer when configuring
+# the shared memory state data related to synchronous_standby_names, then
+# persist the attached point to disk so as the follow-up restart will be able
+# to wait at the early stages of the checkpointer startup sequence.
+#
+# Note that as the checkpointer has already applied the
+# synchronous_standby_names configuration, this has no effect until the
+# next startup of the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_attach('checkpointer-syncrep-update', 'wait')"
+	);
+	$cur_primary->psql('postgres', "SELECT injection_points_flush()");
+}
+
 $cur_primary->stop;
 $cur_standby->restart;
 
@@ -341,6 +379,16 @@ is($psql_out, '0',
 
 # Commit the transaction in primary
 $cur_primary->start;
+
+# Make sure that the checkpointer is waiting before setting up the data of
+# synchronous_standby_names in shared memory.  We want the checkpointer to be
+# stuck and make sure that the next COMMIT PREPARED is detected correctly on
+# the standby when remote_apply is set on the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->wait_for_event('checkpointer',
+		'checkpointer-syncrep-update');
+}
 $cur_primary->psql(
 	'postgres', "
 SET synchronous_commit='remote_apply'; -- To ensure the standby is caught up
@@ -361,6 +409,18 @@ is($psql_out, '2',
 	"Committed prepared transaction is visible to new snapshot in standby");
 $standby_session->quit;
 
+# Remove the injection point, the checkpointer now applies the configuration
+# related to synchronous_standby_names in shared memory.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_wakeup('checkpointer-syncrep-update')");
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_detach('checkpointer-syncrep-update')");
+}
+
+$cur_standby->restart;
+
 ###############################################################################
 # Check for a lock conflict between prepared transaction with DDL inside and
 # replay of XLOG_STANDBY_LOCK wal record.
-- 
2.49.0

Andrey Borodin

x4mmm@yandex-team.ru

8 months ago

In reply to: Michael Paquier (#4)

Re: Persist injection points across server restarts

On 21 May 2025, at 11:20, Michael Paquier <michael@paquier.xyz> wrote:

Rebased, to fix a conflict I've introduced with a different commit.

I've looked into the patch set and have some more questions:
1. What if we flush many times? Is it supposed to overwrite injection_points.data?
2. What if we restart many times? first startup will remove the file... maybe remove it explicitly?
3. InjectionPoint private data is not handled, is it OK?
4. How session-local points are expected to be flushed? into which session they will be loaded? Maybe let's document that they are not saved?

Besides these, cool new abilities and a test for a bug, looks good to me.

Best regards, Andrey Borodin.

Michael Paquier

michael@paquier.xyz

8 months ago

In reply to: Andrey Borodin (#5)

Re: Persist injection points across server restarts

On Wed, May 21, 2025 at 12:17:55PM +0300, Andrey Borodin wrote:

I've looked into the patch set and have some more questions:
1. What if we flush many times? Is it supposed to overwrite injection_points.data?

Yes.

2. What if we restart many times? first startup will remove the file... maybe remove it explicitly?

Yes. The point is to persist across one restart. That's the same
behavior as what we do for the stats, implying that an extra flush
would be required across multiple restarts. As this behavior is
within the extension, I don't see why we could not change the
internals at some point if we're unhappy with what it does. The only
case where I can see us do a node restart with point persistence is
TAP tests, where local points don't really make sense. It is true
that we could introduce at some point tests with more advanced
filtering conditions, like backend types where flushes of the private
data could make sense. We don't have that now.

3. InjectionPoint private data is not handled, is it OK?

Because I don't have a case for that in core yet. We could add it, of
course, but I've always defined the bar of what gets into the backend
code as of something that is used by the core tests. I've just not
needed it for this test case.

4. How session-local points are expected to be flushed? into which
session they will be loaded? Maybe let's document that they are not
saved?

Yes, I was wondering about that, but finished by discarding it based
on the same argument as the previous point. Again, we could do that.
Flushes with restarts would be part of TAP tests, where we don't care
about concurrent test sessions, so I'm not sure it is worth worrying
at this stage. That feels like bloat in the backend interface than
actually required.
--
Michael

Rahila Syed

rahilasyed90@gmail.com

8 months ago

In reply to: Michael Paquier (#4)

Re: Persist injection points across server restarts

Hi,

This is v19 work, so I am adding that to the next commit fest.

Rebased, to fix a conflict I've introduced with a different commit.

Thank you for the rebased patches. I reviewed the code and have a few
comments for
your consideration.

It appears that Github CI is reporting failures with
injection_points/002_data_persist
failing across all OSes.

Following are comments on
v2-0002-injection_points-Add-function-to-flush-injection-.patch
1.
 /*
+        * Rename file into place, so we atomically replace any old one.
+        */
+       (void) durable_rename(INJ_DUMP_FILE_TMP, INJ_DUMP_FILE, LOG);

I wonder if it would be better to include a check for the return value of
`durable_rename` to
manage potential failure scenarios.

2. +
+ file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W);

Could we perhaps add a brief comment clarifying the rationale behind the
use of a
temporary file in this context?

3.   +               if (fread(buf, 1, len + 1, file) != len + 1)
+                       goto error;
+               library = pstrdup(buf);

It seems that `buf` should be null-terminated before being passed to
`pstrdup(buf)`.
Otherwise, `strlen` might not accurately determine the length. This seems
to be
consistent with the handling of `fread` calls in `snapmgr.c`.

Thank you,
Rahila Syed

Michael Paquier

michael@paquier.xyz

8 months ago

In reply to: Rahila Syed (#7)

3 attachment(s)

Re: Persist injection points across server restarts

On Mon, May 26, 2025 at 01:17:46PM +0530, Rahila Syed wrote:

It appears that Github CI is reporting failures with
injection_points/002_data_persist
failing across all OSes.

I am not sure to see what you are referring to here, based on the
following reports:
https://commitfest.postgresql.org/patch/5731/
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5731
http://cfbot.cputube.org/highlights/all.html

The last failure reported that I know of was due to the addition of
the runtime arguments to the macro INJECTION_POINT().

I wonder if it would be better to include a check for the return value of
`durable_rename` to
manage potential failure scenarios.

Or we can be smarter and make this call an ERROR.

2. +
+ file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W);

Could we perhaps add a brief comment clarifying the rationale behind the
use of a
temporary file in this context?

Sure. This is about avoiding the load of incomplete files if there's
a crash in the middle of a flush call, for example.

It seems that `buf` should be null-terminated before being passed to
`pstrdup(buf)`.
Otherwise, `strlen` might not accurately determine the length. This seems
to be
consistent with the handling of `fread` calls in `snapmgr.c`.

Hmm, good point. That would be safer, even if the write part should
already push the null termination.
--
Michael

Attachments:

v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patchtext/x-diff; charset=us-asciiDownload

From fe86d0066c009affbe312bc528673e5d032bdfad Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 22 Apr 2025 12:59:30 +0900
Subject: [PATCH v3 1/3] Add InjectionPointList() to retrieve list of injection
 points

This hides the internals of the shmem array lookup, allocating the
result in a palloc'd array usable by the caller.
---
 src/include/utils/injection_point.h      | 16 +++++++++
 src/backend/utils/misc/injection_point.c | 46 ++++++++++++++++++++++++
 src/tools/pgindent/typedefs.list         |  1 +
 3 files changed, 63 insertions(+)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index a37958e1835f..fd5bc061b7bd 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -11,6 +11,19 @@
 #ifndef INJECTION_POINT_H
 #define INJECTION_POINT_H
 
+#include "nodes/pg_list.h"
+
+/*
+ * Injection point data, used when retrieving a list of all the attached
+ * injection points.
+ */
+typedef struct InjectionPointData
+{
+	const char *name;
+	const char *library;
+	const char *function;
+} InjectionPointData;
+
 /*
  * Injection points require --enable-injection-points.
  */
@@ -47,6 +60,9 @@ extern void InjectionPointCached(const char *name, void *arg);
 extern bool IsInjectionPointAttached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
+/* Get the current set of injection points attached */
+extern List *InjectionPointList(void);
+
 #ifdef EXEC_BACKEND
 extern PGDLLIMPORT struct InjectionPointsCtl *ActiveInjectionPoints;
 #endif
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index f58ebc8ee522..12570fba56e4 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -584,3 +584,49 @@ IsInjectionPointAttached(const char *name)
 	return false;				/* silence compiler */
 #endif
 }
+
+/*
+ * Retrieve a list of all the injection points currently attached.
+ *
+ * This list is palloc'd in the current memory context.
+ */
+List *
+InjectionPointList(void)
+{
+#ifdef USE_INJECTION_POINTS
+	List	   *inj_points = NIL;
+	uint32		max_inuse;
+
+	LWLockAcquire(InjectionPointLock, LW_SHARED);
+
+	max_inuse = pg_atomic_read_u32(&ActiveInjectionPoints->max_inuse);
+
+	for (uint32 idx = 0; idx < max_inuse; idx++)
+	{
+		InjectionPointEntry *entry;
+		InjectionPointData *inj_point;
+		uint64		generation;
+
+		entry = &ActiveInjectionPoints->entries[idx];
+		generation = pg_atomic_read_u64(&entry->generation);
+
+		/* skip free slots */
+		if (generation % 2 == 0)
+			continue;
+
+		inj_point = (InjectionPointData *) palloc0(sizeof(InjectionPointData));
+		inj_point->name = pstrdup(entry->name);
+		inj_point->library = pstrdup(entry->library);
+		inj_point->function = pstrdup(entry->function);
+		inj_points = lappend(inj_points, inj_point);
+	}
+
+	LWLockRelease(InjectionPointLock);
+
+	return inj_points;
+
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+	return NIL;	/* keep compiler quiet */
+#endif
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a8346cda633a..4ad6fdb0d003 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1283,6 +1283,7 @@ InjectionPointCacheEntry
 InjectionPointCallback
 InjectionPointCondition
 InjectionPointConditionType
+InjectionPointData
 InjectionPointEntry
 InjectionPointsCtl
 InjectionPointSharedState
-- 
2.49.0

v3-0002-injection_points-Add-function-to-flush-injection-.patchtext/x-diff; charset=us-asciiDownload

From bfb448567fdf3fce5cc7b459468ca379b40a0b04 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 27 May 2025 09:48:39 +0900
Subject: [PATCH v3 2/3] injection_points: Add function to flush injection
 point data to disk

The function injection_points_flush() can be called in a session to
persist to disk all the injection points currently attached to the
cluster.  These are saved in file called injection_points.data at the
root of PGDATA, and loaded by the cluster at startup with all the points
registered automatically attached.

This will be useful to test scenarios where injection points need to be
for example attached at very early stages of startup, before it is
possible to attach anything via SQL.
---
 .../injection_points--1.0.sql                 |  10 ++
 .../injection_points/injection_points.c       | 170 ++++++++++++++++++
 src/test/modules/injection_points/meson.build |   1 +
 .../injection_points/t/002_data_persist.pl    |  53 ++++++
 4 files changed, 234 insertions(+)
 create mode 100644 src/test/modules/injection_points/t/002_data_persist.pl

diff --git a/src/test/modules/injection_points/injection_points--1.0.sql b/src/test/modules/injection_points/injection_points--1.0.sql
index cc76b1bf99ae..966e1342e4ae 100644
--- a/src/test/modules/injection_points/injection_points--1.0.sql
+++ b/src/test/modules/injection_points/injection_points--1.0.sql
@@ -3,6 +3,16 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION injection_points" to load this file. \quit
 
+--
+-- injection_points_flush()
+--
+-- Flush to disk all the data of the injection points attached.
+--
+CREATE FUNCTION injection_points_flush()
+RETURNS void
+AS 'MODULE_PATHNAME', 'injection_points_flush'
+LANGUAGE C STRICT;
+
 --
 -- injection_points_attach()
 --
diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index 3da0cbc10e08..ba039cfb0ca7 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -24,6 +24,7 @@
 #include "nodes/value.h"
 #include "storage/condition_variable.h"
 #include "storage/dsm_registry.h"
+#include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/lwlock.h"
 #include "storage/shmem.h"
@@ -39,6 +40,14 @@ PG_MODULE_MAGIC;
 #define INJ_MAX_WAIT	8
 #define INJ_NAME_MAXLEN	64
 
+/* Location of injection point data files, if flush has been requested */
+#define INJ_DUMP_FILE	"injection_points.data"
+#define INJ_DUMP_FILE_TMP	INJ_DUMP_FILE ".tmp"
+
+/* Magic number identifying the injection file */
+static const uint32 INJ_FILE_HEADER = 0xFF345678;
+
+
 /*
  * Conditions related to injection points.  This tracks in shared memory the
  * runtime conditions under which an injection point is allowed to run,
@@ -151,6 +160,9 @@ static void
 injection_shmem_startup(void)
 {
 	bool		found;
+	int32		num_inj_points;
+	uint32		header;
+	FILE	   *file;
 
 	if (prev_shmem_startup_hook)
 		prev_shmem_startup_hook();
@@ -172,6 +184,87 @@ injection_shmem_startup(void)
 	}
 
 	LWLockRelease(AddinShmemInitLock);
+
+	/*
+	 * Done if some other process already completed the initialization.
+	 */
+	if (found)
+		return;
+
+	/*
+	 * Note: there should be no need to bother with locks here, because there
+	 * should be no other processes running when this code is reached.
+	 */
+
+	/* Load injection point data, if any has been found while starting up */
+	file = AllocateFile(INJ_DUMP_FILE, PG_BINARY_R);
+
+	if (file == NULL)
+	{
+		if (errno != ENOENT)
+			goto error;
+
+		/* No file?  We are done. */
+		return;
+	}
+
+	if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+		fread(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	if (header != INJ_FILE_HEADER)
+		goto error;
+
+	for (int i = 0; i < num_inj_points; i++)
+	{
+		const char *name;
+		const char *library;
+		const char *function;
+		uint32		len;
+		char		buf[1024];
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		buf[len] = '\0';
+		name = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		buf[len] = '\0';
+		library = pstrdup(buf);
+
+		if (fread(&len, sizeof(uint32), 1, file) != 1)
+			goto error;
+		if (fread(buf, 1, len + 1, file) != len + 1)
+			goto error;
+		buf[len] = '\0';
+		function = pstrdup(buf);
+
+		/* No private data handled here */
+		InjectionPointAttach(name, library, function, NULL, 0);
+	}
+
+	/*
+	 * Remove the persisted injection point file, we do not need it anymore.
+	 */
+	unlink(INJ_DUMP_FILE);
+	FreeFile(file);
+
+	return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read file \"%s\": %m",
+					INJ_DUMP_FILE)));
+	if (file)
+		FreeFile(file);
+
+	unlink(INJ_DUMP_FILE);
 }
 
 /*
@@ -343,6 +436,83 @@ injection_wait(const char *name, const void *private_data, void *arg)
 	SpinLockRelease(&inj_state->lock);
 }
 
+/*
+ * SQL function for flushing injection point data to disk.
+ */
+PG_FUNCTION_INFO_V1(injection_points_flush);
+Datum
+injection_points_flush(PG_FUNCTION_ARGS)
+{
+	FILE	   *file = NULL;
+	List	   *inj_points = NIL;
+	ListCell   *lc;
+	int32		num_inj_points;
+
+	inj_points = InjectionPointList();
+	if (inj_points == NIL)
+		PG_RETURN_VOID();
+
+	num_inj_points = list_length(inj_points);
+
+	/*
+	 * The injection point data is written to a temporary file renamed to a
+	 * final file to avoid incomplete files that could be loaded by backends.
+	 */
+	file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W);
+	if (file == NULL)
+		goto error;
+
+	if (fwrite(&INJ_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+		goto error;
+
+	if (fwrite(&num_inj_points, sizeof(int32), 1, file) != 1)
+		goto error;
+
+	foreach(lc, inj_points)
+	{
+		InjectionPointData *inj_point = lfirst(lc);
+		uint32		len;
+
+		len = strlen(inj_point->name);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->name, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->library);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->library, 1, len + 1, file) != len + 1)
+			goto error;
+
+		len = strlen(inj_point->function);
+		if (fwrite(&len, sizeof(uint32), 1, file) != 1 ||
+			fwrite(inj_point->function, 1, len + 1, file) != len + 1)
+			goto error;
+	}
+
+	if (FreeFile(file))
+	{
+		file = NULL;
+		goto error;
+	}
+
+	/*
+	 * Rename file into place, so we atomically replace any old one.
+	 */
+	durable_rename(INJ_DUMP_FILE_TMP, INJ_DUMP_FILE, ERROR);
+
+	PG_RETURN_VOID();
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write file \"%s\": %m",
+					INJ_DUMP_FILE_TMP)));
+	if (file)
+		FreeFile(file);
+	unlink(INJ_DUMP_FILE_TMP);
+	PG_RETURN_VOID();
+}
+
 /*
  * SQL function for creating an injection point.
  */
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index d61149712fd7..b994f8d413d3 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -56,6 +56,7 @@ tests += {
     },
     'tests': [
       't/001_stats.pl',
+      't/002_data_persist.pl',
     ],
   },
 }
diff --git a/src/test/modules/injection_points/t/002_data_persist.pl b/src/test/modules/injection_points/t/002_data_persist.pl
new file mode 100644
index 000000000000..9ecb05230931
--- /dev/null
+++ b/src/test/modules/injection_points/t/002_data_persist.pl
@@ -0,0 +1,53 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+# Tests for persistence of injection point data.
+
+use strict;
+use warnings FATAL => 'all';
+use locale;
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test persistency of statistics generated for injection points.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Node initialization
+my $node = PostgreSQL::Test::Cluster->new('master');
+$node->init;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+));
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Attach a couple of points, which are going to be made persistent.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-error', 'error');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('persist-notice-2', 'notice');");
+
+# Flush and restart, the injection points still exist.
+$node->safe_psql('postgres', "SELECT injection_points_flush();");
+$node->restart;
+
+my ($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-notice-2')");
+ok( $stderr =~
+	  /NOTICE:  notice triggered for injection point persist-notice-2/,
+	"injection point triggering NOTICE exists");
+
+($result, $stdout, $stderr) =
+  $node->psql('postgres', "SELECT injection_points_run('persist-error')");
+ok($stderr =~ /ERROR:  error triggered for injection point persist-error/,
+	"injection point triggering ERROR exists");
+
+done_testing();
-- 
2.49.0

v3-0003-Add-regression-test-for-2PC-visibility-check-on-s.patchtext/x-diff; charset=us-asciiDownload

From c1556b972686dea8e748956220a0960516a4cbb3 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 21 May 2025 17:15:19 +0900
Subject: [PATCH v3 3/3] Add regression test for 2PC visibility check on
 standby

This adds some test coverage for a defect fixed in 2e57790836c6, where
the only reliable test back then was to use a hardcoded sleep().

This test relies on an injection point that is persisted across a node
restart, so as it is possible to cause the checkpointer to wait when
configuring the shared memory state of shared_preload_libraries at a
very early startup stage.

Reverting 2e57790836c6 causes the test to fail, and it passes on HEAD.
---
 src/backend/replication/syncrep.c   |  3 ++
 src/test/recovery/t/009_twophase.pl | 60 +++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index cc35984ad008..e2be2603d1ce 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -84,6 +84,7 @@
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
 #include "utils/guc_hooks.h"
+#include "utils/injection_point.h"
 #include "utils/ps_status.h"
 
 /* User-settable parameters for sync rep */
@@ -968,6 +969,8 @@ SyncRepUpdateSyncStandbysDefined(void)
 	if (sync_standbys_defined !=
 		((WalSndCtl->sync_standbys_status & SYNC_STANDBY_DEFINED) != 0))
 	{
+		INJECTION_POINT("checkpointer-syncrep-update", NULL);
+
 		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
 
 		/*
diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl
index 1a662ebe499d..f14da1549bac 100644
--- a/src/test/recovery/t/009_twophase.pl
+++ b/src/test/recovery/t/009_twophase.pl
@@ -51,6 +51,27 @@ $node_paris->append_conf(
 ));
 $node_paris->start;
 
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+my $injection_points_supported =
+  $node_london->check_extension('injection_points');
+if ($injection_points_supported != 0)
+{
+	$node_london->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+	# Set shared_preload_libraries, to allow the injection points to persist
+	# across restarts.
+	$node_london->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+	$node_paris->append_conf(
+		'postgresql.conf', qq(
+		shared_preload_libraries = 'injection_points'
+	));
+}
+
 # Switch to synchronous replication in both directions
 configure_and_reload($node_london, "synchronous_standby_names = 'paris'");
 configure_and_reload($node_paris, "synchronous_standby_names = 'london'");
@@ -327,6 +348,23 @@ $cur_primary->psql(
 	INSERT INTO t_009_tbl_standby_mvcc VALUES (2, 'issued to ${cur_primary_name}');
 	PREPARE TRANSACTION 'xact_009_standby_mvcc';
 	");
+
+# Attach an injection point to wait in the checkpointer when configuring
+# the shared memory state data related to synchronous_standby_names, then
+# persist the attached point to disk so as the follow-up restart will be able
+# to wait at the early stages of the checkpointer startup sequence.
+#
+# Note that as the checkpointer has already applied the
+# synchronous_standby_names configuration, this has no effect until the
+# next startup of the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_attach('checkpointer-syncrep-update', 'wait')"
+	);
+	$cur_primary->psql('postgres', "SELECT injection_points_flush()");
+}
+
 $cur_primary->stop;
 $cur_standby->restart;
 
@@ -341,6 +379,16 @@ is($psql_out, '0',
 
 # Commit the transaction in primary
 $cur_primary->start;
+
+# Make sure that the checkpointer is waiting before setting up the data of
+# synchronous_standby_names in shared memory.  We want the checkpointer to be
+# stuck and make sure that the next COMMIT PREPARED is detected correctly on
+# the standby when remote_apply is set on the primary.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->wait_for_event('checkpointer',
+		'checkpointer-syncrep-update');
+}
 $cur_primary->psql(
 	'postgres', "
 SET synchronous_commit='remote_apply'; -- To ensure the standby is caught up
@@ -361,6 +409,18 @@ is($psql_out, '2',
 	"Committed prepared transaction is visible to new snapshot in standby");
 $standby_session->quit;
 
+# Remove the injection point, the checkpointer now applies the configuration
+# related to synchronous_standby_names in shared memory.
+if ($injection_points_supported != 0)
+{
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_wakeup('checkpointer-syncrep-update')");
+	$cur_primary->psql('postgres',
+		"SELECT injection_points_detach('checkpointer-syncrep-update')");
+}
+
+$cur_standby->restart;
+
 ###############################################################################
 # Check for a lock conflict between prepared transaction with DDL inside and
 # replay of XLOG_STANDBY_LOCK wal record.
-- 
2.49.0

Rahila Syed

rahilasyed90@gmail.com

8 months ago

In reply to: Michael Paquier (#8)

Re: Persist injection points across server restarts

Hi,

Thank you for addressing the review comments.

It appears that Github CI is reporting failures with

injection_points/002_data_persist
failing across all OSes.

I am not sure to see what you are referring to here, based on the
following reports:
https://commitfest.postgresql.org/patch/5731/
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5731
http://cfbot.cputube.org/highlights/all.html

The last failure reported that I know of was due to the addition of
the runtime arguments to the macro INJECTION_POINT().

After applying the v3-patches, I see failure like these:
macOS - Sonoma - Meson - Cirrus CI
<https://cirrus-ci.com/task/4589192849653760>
Windows - Server 2019, VS 2019 - Meson & ninja - Cirrus CI
<https://cirrus-ci.com/task/5715092756496384>
Linux - Debian Bookworm - Meson - Cirrus CI
<https://cirrus-ci.com/task/6629886430806016>

I wonder if it would be better to include a check for the return value of
`durable_rename` to
manage potential failure scenarios.

Or we can be smarter and make this call an ERROR.

One issue with this approach is that if a failure occurs during rename, we
cannot
unlink(INJ_DUMP_FILE_TMP), because durable_rename does not handle this
before throwing an error.

Should we also consider unlinking the target file INJ_DUMP_FILE in case of
an
error during renaming?
Since target files are already synced at the start in the durable_rename()
function.

Thank you,
Rahila Syed

#10

Michael Paquier

michael@paquier.xyz

8 months ago

In reply to: Rahila Syed (#9)

Re: Persist injection points across server restarts

On Thu, May 29, 2025 at 01:17:21PM +0530, Rahila Syed wrote:

After applying the v3-patches, I see failure like these:
macOS - Sonoma - Meson - Cirrus CI
<https://cirrus-ci.com/task/4589192849653760>
Windows - Server 2019, VS 2019 - Meson & ninja - Cirrus CI
<https://cirrus-ci.com/task/5715092756496384>
Linux - Debian Bookworm - Meson - Cirrus CI
<https://cirrus-ci.com/task/6629886430806016>

These tasks complain about the following:
[07:22:58.931] ../src/include/utils/injection_point.h:62:28: note:
'InjectionPointList' declared here [07:22:58.931] 62 | extern
InjectionPointData *InjectionPointList(uint32 *num_points);

The latest version of the patch (v3) posted on this thread includes
the following:
./v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patch:
+extern List *InjectionPointList(void);
./v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patch:
+InjectionPointList(void)
./v3-0002-injection_points-Add-function-to-flush-injection-.patch:
+ inj_points = InjectionPointList();

How is it possible for what you have applied to include changes that
are not part of the patch set posted? Manual mistake on your side
perhaps where versions of the patches have been mixed?

The CF bot is green:
https://commitfest.postgresql.org/patch/5731/

Should we also consider unlinking the target file INJ_DUMP_FILE in case of
an error during renaming?

I don't think that this is worth caring for the sake of tests. If an
error happens before the atomic rename is done to the final file,
leaving around a temporary file, the next flush would just recreate a
new file from scratch with the new contents.
--
Michael

#11

Rahila Syed

rahilasyed90@gmail.com

8 months ago

In reply to: Michael Paquier (#10)

Re: Persist injection points across server restarts

Hi Michael,

On Fri, May 30, 2025 at 5:01 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, May 29, 2025 at 01:17:21PM +0530, Rahila Syed wrote:

After applying the v3-patches, I see failure like these:
macOS - Sonoma - Meson - Cirrus CI
<https://cirrus-ci.com/task/4589192849653760>
Windows - Server 2019, VS 2019 - Meson & ninja - Cirrus CI
<https://cirrus-ci.com/task/5715092756496384>
Linux - Debian Bookworm - Meson - Cirrus CI
<https://cirrus-ci.com/task/6629886430806016>

These tasks complain about the following:
[07:22:58.931] ../src/include/utils/injection_point.h:62:28: note:
'InjectionPointList' declared here [07:22:58.931] 62 | extern
InjectionPointData *InjectionPointList(uint32 *num_points);

The latest version of the patch (v3) posted on this thread includes
the following:
./v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patch:
+extern List *InjectionPointList(void);
./v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patch:
+InjectionPointList(void)
./v3-0002-injection_points-Add-function-to-flush-injection-.patch:
+ inj_points = InjectionPointList();

How is it possible for what you have applied to include changes that
are not part of the patch set posted? Manual mistake on your side
perhaps where versions of the patches have been mixed?

The CF bot is green:
https://commitfest.postgresql.org/patch/5731/

I apologize for the confusion; I mistakenly applied the wrong version of
the 0001 patch.
I can now confirm that the tests are passing on my end.

Thank you,
Rahila Syed

#12

Andrey Borodin

x4mmm@yandex-team.ru

8 months ago

In reply to: Michael Paquier (#8)

Re: Persist injection points across server restarts

On 27 May 2025, at 05:51, Michael Paquier <michael@paquier.xyz> wrote:

<v3-0001-Add-InjectionPointList-to-retrieve-list-of-inject.patch>

<v3-0002-injection_points-Add-function-to-flush-injection-.patch>

<v3-0003-Add-regression-test-for-2PC-visibility-check-on-s.patch>

I've made another round of looking into these patches.
I think it's fine that private data is not included. But the feature looks orthogonal, and I do not see specific reasons why it should be omitted when IPs are persisted.
Another idea of improvement is using distinguishable errors in injection_shmem_startup(). Like differentiating between read error and wrong magic number.
But there's no big value in these improvements, so the patch is fine as is too.

Best regards, Andrey Borodin.

#13

Jeff Davis

pgsql@j-davis.com

7 months ago

In reply to: Michael Paquier (#1)

Re: Persist injection points across server restarts

On Wed, 2025-04-30 at 14:35 +0900, Michael Paquier wrote:

- 0001 is a patch I have stolen from a different thread (see [1]),
introducing InjectionPointList() that retrieves a list of the
injection points attached.

Unless that other thread is blocked for some reason, I'd suggest just
going ahead with the SQL callable function first. That seems
independently useful.

- 0002 extends injection_points with a new flush function, that can
be
used in TAP tests to persist some points across restarts. One sticky
point is that I did not want to add any of this information in the
core backend injection point APIs, nor to any of the backend
structures because that's not necessary, and what's here is enough
for
some TAP tests.

This seems like a fair amount of complexity for a single use case.
Arguably, complexity around testing infrastructure is a lot less of a
problem than other kinds of complexity, so it might be OK. But it would
be nice if there were a couple cases that would benefit rather than
one.

Regards,
Jeff Davis

#14

Michael Paquier

michael@paquier.xyz

7 months ago

In reply to: Jeff Davis (#13)

Re: Persist injection points across server restarts

On Mon, Jun 02, 2025 at 12:42:35PM -0700, Jeff Davis wrote:

Unless that other thread is blocked for some reason, I'd suggest just
going ahead with the SQL callable function first. That seems
independently useful.

Thanks. Having InjectionPointList() was also required for the other
thread.

This seems like a fair amount of complexity for a single use case.
Arguably, complexity around testing infrastructure is a lot less of a
problem than other kinds of complexity, so it might be OK. But it would
be nice if there were a couple cases that would benefit rather than
one.

In all the approaches I've considered, this one was the least worst of
all based on the point that all the complexity is hidden in the test
module; there is no need to touch the backend code at all as long as
there is a way to retrieve the list of points that would be dumped to
disk.

Another set of test cases I had in mind was waits during recovery
before consistency is reached. There is no way to add a point without
connecting to the database, and we've had plenty of fixes involving
the startup process and a different process, mostly the checkpointer.
That's an annoying limitation.
--
Michael

#15

Jeff Davis

pgsql@j-davis.com

7 months ago

In reply to: Michael Paquier (#14)

Re: Persist injection points across server restarts

On Tue, 2025-06-03 at 13:13 +0900, Michael Paquier wrote:

In all the approaches I've considered, this one was the least worst
of
all based on the point that all the complexity is hidden in the test
module; there is no need to touch the backend code at all as long as
there is a way to retrieve the list of points that would be dumped to
disk.

True, though it does create a new file.

Another set of test cases I had in mind was waits during recovery
before consistency is reached. There is no way to add a point
without
connecting to the database, and we've had plenty of fixes involving
the startup process and a different process, mostly the checkpointer.
That's an annoying limitation.

If you have in mind some other ways to use it than I like it a lot
more. And I don't have a better idea.

Regards,
Jeff Davis

#16

Andres Freund

andres@anarazel.de

7 months ago

In reply to: Michael Paquier (#14)

Re: Persist injection points across server restarts

Hi,

On 2025-06-03 13:13:06 +0900, Michael Paquier wrote:

On Mon, Jun 02, 2025 at 12:42:35PM -0700, Jeff Davis wrote:

This seems like a fair amount of complexity for a single use case.
Arguably, complexity around testing infrastructure is a lot less of a
problem than other kinds of complexity, so it might be OK. But it would
be nice if there were a couple cases that would benefit rather than
one.

In all the approaches I've considered, this one was the least worst of
all based on the point that all the complexity is hidden in the test
module; there is no need to touch the backend code at all as long as
there is a way to retrieve the list of points that would be dumped to
disk.

Another set of test cases I had in mind was waits during recovery
before consistency is reached. There is no way to add a point without
connecting to the database, and we've had plenty of fixes involving
the startup process and a different process, mostly the checkpointer.
That's an annoying limitation.

I'm somewhat doubtful this is is the right direction. Tests that require
injection points before consistency also can't wait for injection points using
the SQL interface or such, so most of the stuff has to be written in C
anyway. And if so, you also can attach to injection points in the relevant
shared_preload_libraries entry.

Greetings,

Andres Freund

#17

Michael Paquier

michael@paquier.xyz

7 months ago

In reply to: Andres Freund (#16)

Re: Persist injection points across server restarts

On Tue, Jun 03, 2025 at 03:34:16PM -0400, Andres Freund wrote:

I'm somewhat doubtful this is is the right direction. Tests that require
injection points before consistency also can't wait for injection points using
the SQL interface or such, so most of the stuff has to be written in C
anyway. And if so, you also can attach to injection points in the relevant
shared_preload_libraries entry.

Hmm. I'm wondering about an alternate approach here: a postmaster GUC
in injection_points that can take in input a list of
name/library/function where the module would load them when
initializing. That's a bit artistic, perhaps, still it would work
without having to worry about the flush and reload steps.
--
Michael

#18

Michael Paquier

michael@paquier.xyz

6 months ago

In reply to: Michael Paquier (#17)

Re: Persist injection points across server restarts

On Wed, Jun 04, 2025 at 09:15:08AM +0900, Michael Paquier wrote:

On Tue, Jun 03, 2025 at 03:34:16PM -0400, Andres Freund wrote:

I'm somewhat doubtful this is is the right direction. Tests that require
injection points before consistency also can't wait for injection points using
the SQL interface or such, so most of the stuff has to be written in C
anyway. And if so, you also can attach to injection points in the relevant
shared_preload_libraries entry.

Hmm. I'm wondering about an alternate approach here: a postmaster GUC
in injection_points that can take in input a list of
name/library/function where the module would load them when
initializing. That's a bit artistic, perhaps, still it would work
without having to worry about the flush and reload steps.

[A couple of weeks later]

I am still not sure what's the right course of action here, so I'd
agree to just reject the patch for now, but keep it in mind once we
have more cases where this could be useful. The case I've found is
perhaps not enough. Updating the CF app to reflect on this feedback
now, thanks all for the input.
--
Michael