WAL-logging facility for pgstats kinds

Started by Michael Paquierabout 1 year ago11 messages
#1Michael Paquier
michael@paquier.xyz
4 attachment(s)

Hi all,

While brainstorming on the contents of the thread I have posted a
couple of days ago, I have been looking at what could be done so as
pgstats and WAL-logging could work together. This was point 2) from
this message:
/messages/by-id/Z2tblEmfuOfZy4zx@paquier.xyz

While considering the point that not all stats data is worth
replicating, I have fallen down to the following properties that are
nice to have across the board:
- A pgstats kind should be able to WAL-log some data that is should be
able to decide. Including versioning of the data.
- The data kind should be able to decide how this data is handled at
recovery (different persistency depending on standby vs crash
recovery, for one).
- Having to add one RMGR for each stats kind is not going to scale,
especially as we can control the redo path using a callback part of
PgStat_KindInfo. For custom kinds, this enforces some validation
based on if the stats library has been really loaded at startup with
shared_preload_library.
- It is nice for each out-of-core stats kind to not have to load and
define a custom RMGR, duplicating what this central facility is doing.
- The persistency of the stats data is better across crashes: this
approach makes the cleanup of orphaned pgstats entries easier as this
can be enforced at replay by each stats kind, if necessary, and it
is also possible to force the creation of stats data.
- For variable-numbered stats, the WAL-logging can be timed with the
timing where any pending stats are flushed, if there is some.

Table stats to prevent autovacuum from committing Seppuku for tables
without stats after a crash is the first use case I can think about,
where we'd want to split and expose some DML stats. Scans stats
should remain untouched on standbys, for example. There may be other
stats data that is worth replicating, all these could use this new
facility.

Attached is the result of this investigation, where I have finished by
implementing a new backend facility where pgstats kinds can WAL-log
data at will using a new RMGR called "PgStat". The most interesting
part is pgstat_xlog_data() in a new file called pgstat_xlog.c, that
stats kinds can use as an entry point to WAL-log data structures with
a XLogRegisterData() managed by a new record structure called
xl_pgstat_data.

For clarity, the patch set has been split into several pieces, I hope
rather edible:
- 0001, a fix I've posted on a different thread [1]/messages/by-id/Z23zcE4w1djukkva@paquier.xyz -- Michael, used in patch
0004 to test this new facility.
- 0002, a refactoring piece to be able to expose stats kind data into
the frontend (for pg_waldump).
- 0003 is the core backend piece, introducing the WAL-logging routine
and the new RMGR.
- 0004, that provides tests for the new backend facility via a custom
stats kind. Requires 0001.

I am going to attach it to the next CF.

Comments or opinions are welcome.

[1]: /messages/by-id/Z23zcE4w1djukkva@paquier.xyz -- Michael
--
Michael

Attachments:

v1-0001-injection_points-Tweak-variable-numbered-stats-to.patchtext/x-diff; charset=us-asciiDownload
From dc7f5ac52407c8916a560572cfb2c0adf183dbab Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 09:16:02 +0900
Subject: [PATCH v1 1/4] injection_points: Tweak variable-numbered stats to
 work with pending data

As coded, the module was not using pending entries to store its data
locally before a flush to the central dshash is done with a timed
pgstat_report_stat() call.  Hence, the flush callback was defined, but
finished by being not used.  As a template, this is more efficient than
the original logic of updating directly the shared memory entries as
this reduces the interactions that need to be done with the pgstats
hash table in shared memory.

injection_stats_flush_cb() was also missing a pgstat_unlock_entry(), so
add one, while on it.
---
 .../modules/injection_points/injection_stats.c    | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c
index e16b9db284..21d5c10f39 100644
--- a/src/test/modules/injection_points/injection_stats.c
+++ b/src/test/modules/injection_points/injection_stats.c
@@ -80,6 +80,9 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 		return false;
 
 	shfuncent->stats.numcalls += localent->numcalls;
+
+	pgstat_unlock_entry(entry_ref);
+
 	return true;
 }
 
@@ -127,13 +130,13 @@ pgstat_create_inj(const char *name)
 	if (!inj_stats_loaded || !inj_stats_enabled)
 		return;
 
-	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_INJECTION, InvalidOid,
-											PGSTAT_INJ_IDX(name), false);
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  PGSTAT_INJ_IDX(name), NULL);
+
 	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
 
 	/* initialize shared memory data */
 	memset(&shstatent->stats, 0, sizeof(shstatent->stats));
-	pgstat_unlock_entry(entry_ref);
 }
 
 /*
@@ -168,16 +171,14 @@ pgstat_report_inj(const char *name)
 	if (!inj_stats_loaded || !inj_stats_enabled)
 		return;
 
-	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_INJECTION, InvalidOid,
-											PGSTAT_INJ_IDX(name), false);
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  PGSTAT_INJ_IDX(name), NULL);
 
 	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
 	statent = &shstatent->stats;
 
 	/* Update the injection point statistics */
 	statent->numcalls++;
-
-	pgstat_unlock_entry(entry_ref);
 }
 
 /*
-- 
2.45.2

v1-0002-Move-information-about-pgstats-kinds-into-its-own.patchtext/x-diff; charset=us-asciiDownload
From 9ba0ae7cc0c087f1a60331215bdb025993837ba2 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 11:54:45 +0900
Subject: [PATCH v1 2/4] Move information about pgstats kinds into its own
 header

This information is split into its own header as it will be used by a
follow-up patch, to be used in some frontend and backend code.
---
 src/include/pgstat.h            | 57 +-------------------------
 src/include/utils/pgstat_kind.h | 72 +++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 56 deletions(-)
 create mode 100644 src/include/utils/pgstat_kind.h

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 75a41e8ff3..601b8075d5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -18,6 +18,7 @@
 #include "replication/conflict.h"
 #include "utils/backend_progress.h" /* for backward compatibility */
 #include "utils/backend_status.h"	/* for backward compatibility */
+#include "utils/pgstat_kind.h"
 #include "utils/relcache.h"
 #include "utils/wait_event.h"	/* for backward compatibility */
 
@@ -33,62 +34,6 @@
 /* Default directory to store temporary statistics data in */
 #define PG_STAT_TMP_DIR		"pg_stat_tmp"
 
-/* The types of statistics entries */
-#define PgStat_Kind uint32
-
-/* Range of IDs allowed, for built-in and custom kinds */
-#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
-#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
-
-/* use 0 for INVALID, to catch zero-initialized data */
-#define PGSTAT_KIND_INVALID 0
-
-/* stats for variable-numbered objects */
-#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
-#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
-#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
-#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
-#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
-#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
-
-/* stats for fixed-numbered objects */
-#define PGSTAT_KIND_ARCHIVER	7
-#define PGSTAT_KIND_BGWRITER	8
-#define PGSTAT_KIND_CHECKPOINTER	9
-#define PGSTAT_KIND_IO	10
-#define PGSTAT_KIND_SLRU	11
-#define PGSTAT_KIND_WAL	12
-
-#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
-#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
-#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
-
-/* Custom stats kinds */
-
-/* Range of IDs allowed for custom stats kinds */
-#define PGSTAT_KIND_CUSTOM_MIN	128
-#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
-#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
-
-/*
- * PgStat_Kind to use for extensions that require an ID, but are still in
- * development and have not reserved their own unique kind ID yet. See:
- * https://wiki.postgresql.org/wiki/CustomCumulativeStats
- */
-#define PGSTAT_KIND_EXPERIMENTAL	128
-
-static inline bool
-pgstat_is_kind_builtin(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
-}
-
-static inline bool
-pgstat_is_kind_custom(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
-}
-
 /* Values for track_functions GUC variable --- order is significant! */
 typedef enum TrackFunctionsLevel
 {
diff --git a/src/include/utils/pgstat_kind.h b/src/include/utils/pgstat_kind.h
new file mode 100644
index 0000000000..526fa04e8f
--- /dev/null
+++ b/src/include/utils/pgstat_kind.h
@@ -0,0 +1,72 @@
+/* ----------
+ *	pgstat_kind.h
+ *
+ *	Definitions related to the statistics kinds for the PostgreSQL
+ *	cumulative statistics system.  Can be included in backend or
+ *	frontend code.
+ *
+ *	Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ *	src/include/utils/pgstat_kind.h
+ * ----------
+ */
+#ifndef PGSTAT_KIND_H
+#define PGSTAT_KIND_H
+
+/* The types of statistics entries */
+#define PgStat_Kind uint32
+
+/* Range of IDs allowed, for built-in and custom kinds */
+#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
+#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
+
+/* use 0 for INVALID, to catch zero-initialized data */
+#define PGSTAT_KIND_INVALID 0
+
+/* stats for variable-numbered objects */
+#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
+#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
+#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
+#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
+#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
+#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
+
+/* stats for fixed-numbered objects */
+#define PGSTAT_KIND_ARCHIVER	7
+#define PGSTAT_KIND_BGWRITER	8
+#define PGSTAT_KIND_CHECKPOINTER	9
+#define PGSTAT_KIND_IO	10
+#define PGSTAT_KIND_SLRU	11
+#define PGSTAT_KIND_WAL	12
+
+#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
+#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
+#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
+
+/* Custom stats kinds */
+
+/* Range of IDs allowed for custom stats kinds */
+#define PGSTAT_KIND_CUSTOM_MIN	128
+#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
+#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
+
+/*
+ * PgStat_Kind to use for extensions that require an ID, but are still in
+ * development and have not reserved their own unique kind ID yet. See:
+ * https://wiki.postgresql.org/wiki/CustomCumulativeStats
+ */
+#define PGSTAT_KIND_EXPERIMENTAL	128
+
+static inline bool
+pgstat_is_kind_builtin(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
+}
+
+static inline bool
+pgstat_is_kind_custom(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
+}
+
+#endif							/* PGSTAT_KIND_H */
-- 
2.45.2

v1-0003-Add-RMGR-and-WAL-logging-API-for-pgstats.patchtext/x-diff; charset=us-asciiDownload
From d260487585c9effec3b22994825bec9ea7a734dc Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 12:03:34 +0900
Subject: [PATCH v1 3/4] Add RMGR and WAL-logging API for pgstats

This commit adds a new facility in the backend that offers to pgstats
kinds the possibility to WAL-log data that need to be persisted across
restarts:
- A new callback called redo_cb can be defined by a pgstats_kind.
- pgstat_xlog.c includes one API able to register some payload data and
its size.  Stats kinds doing WAL-logging require a definition of
redo_cb.
- pg_waldump is able to show the data logged, as hexadedimal data.

This facility has applications for in-core and custom stats kinds, with
one primary case being the possibility to WAL-log some statistics
related to relations so as these are not lost post-crash.  This is left
as future work.
---
 src/include/access/rmgrlist.h            |  1 +
 src/include/utils/pgstat_internal.h      |  8 +++
 src/include/utils/pgstat_xlog.h          | 41 +++++++++++++
 src/backend/access/rmgrdesc/Makefile     |  1 +
 src/backend/access/rmgrdesc/meson.build  |  1 +
 src/backend/access/rmgrdesc/pgstatdesc.c | 48 +++++++++++++++
 src/backend/access/transam/rmgr.c        |  1 +
 src/backend/utils/activity/Makefile      |  1 +
 src/backend/utils/activity/meson.build   |  1 +
 src/backend/utils/activity/pgstat_xlog.c | 77 ++++++++++++++++++++++++
 src/bin/pg_waldump/.gitignore            |  1 +
 src/bin/pg_waldump/rmgrdesc.c            |  1 +
 src/bin/pg_waldump/t/001_basic.pl        |  3 +-
 src/tools/pgindent/typedefs.list         |  1 +
 14 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 src/include/utils/pgstat_xlog.h
 create mode 100644 src/backend/access/rmgrdesc/pgstatdesc.c
 create mode 100644 src/backend/utils/activity/pgstat_xlog.c

diff --git a/src/include/access/rmgrlist.h b/src/include/access/rmgrlist.h
index 78e6b908c6..abf9393bbe 100644
--- a/src/include/access/rmgrlist.h
+++ b/src/include/access/rmgrlist.h
@@ -47,3 +47,4 @@ PG_RMGR(RM_COMMIT_TS_ID, "CommitTs", commit_ts_redo, commit_ts_desc, commit_ts_i
 PG_RMGR(RM_REPLORIGIN_ID, "ReplicationOrigin", replorigin_redo, replorigin_desc, replorigin_identify, NULL, NULL, NULL, NULL)
 PG_RMGR(RM_GENERIC_ID, "Generic", generic_redo, generic_desc, generic_identify, NULL, NULL, generic_mask, NULL)
 PG_RMGR(RM_LOGICALMSG_ID, "LogicalMessage", logicalmsg_redo, logicalmsg_desc, logicalmsg_identify, NULL, NULL, NULL, logicalmsg_decode)
+PG_RMGR(RM_PGSTAT_ID, "PgStat", pgstat_redo, pgstat_desc, pgstat_identify, NULL, NULL, NULL, NULL)
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 811ed9b005..47d7f8988e 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -258,6 +258,14 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*init_backend_cb) (void);
 
+	/*
+	 * Perform custom actions when replaying WAL related to a statistics kind.
+	 *
+	 * "data" is a pointer to the stats information that can be used by the
+	 * stats kind at redo.
+	 */
+	void		(*redo_cb) (void *data, size_t data_size);
+
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
 	 * data is used.  See flush_fixed_cb for fixed-numbered stats.
diff --git a/src/include/utils/pgstat_xlog.h b/src/include/utils/pgstat_xlog.h
new file mode 100644
index 0000000000..25b7a78dd5
--- /dev/null
+++ b/src/include/utils/pgstat_xlog.h
@@ -0,0 +1,41 @@
+/* ----------
+ * pgstat_xlog.h
+ *	  Exports from utils/activity/pgstat_xlog.c
+ *
+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ * src/include/utils/pgstat_xlog.h
+ * ----------
+ */
+#ifndef PGSTAT_XLOG_H
+#define PGSTAT_XLOG_H
+
+#include "access/xlog.h"
+#include "access/xlogdefs.h"
+#include "access/xlogreader.h"
+#include "utils/pgstat_kind.h"
+
+/*
+ * Generic WAL record for pgstat data.
+ */
+typedef struct xl_pgstat_data
+{
+	PgStat_Kind stats_kind;
+	size_t		data_size;		/* size of the data */
+	/* data payload, worth data_size */
+	char		data[FLEXIBLE_ARRAY_MEMBER];
+} xl_pgstat_data;
+
+#define SizeOfPgStatData    (offsetof(xl_pgstat_data, data))
+
+extern XLogRecPtr pgstat_xlog_data(PgStat_Kind stats_kind,
+								   const void *data,
+								   size_t data_size);
+
+/* RMGR API */
+#define XLOG_PGSTAT_DATA	0x00
+extern void pgstat_redo(XLogReaderState *record);
+extern void pgstat_desc(StringInfo buf, XLogReaderState *record);
+extern const char *pgstat_identify(uint8 info);
+
+#endif							/* PGSTAT_XLOG_H */
diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index cd95eec37f..3979d22946 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	logicalmsgdesc.o \
 	mxactdesc.o \
 	nbtdesc.o \
+	pgstatdesc.o \
 	relmapdesc.o \
 	replorigindesc.o \
 	rmgrdesc_utils.o \
diff --git a/src/backend/access/rmgrdesc/meson.build b/src/backend/access/rmgrdesc/meson.build
index e8b7a65fc7..951dacbd72 100644
--- a/src/backend/access/rmgrdesc/meson.build
+++ b/src/backend/access/rmgrdesc/meson.build
@@ -14,6 +14,7 @@ rmgr_desc_sources = files(
   'logicalmsgdesc.c',
   'mxactdesc.c',
   'nbtdesc.c',
+  'pgstatdesc.c',
   'relmapdesc.c',
   'replorigindesc.c',
   'rmgrdesc_utils.c',
diff --git a/src/backend/access/rmgrdesc/pgstatdesc.c b/src/backend/access/rmgrdesc/pgstatdesc.c
new file mode 100644
index 0000000000..0fc19d1e56
--- /dev/null
+++ b/src/backend/access/rmgrdesc/pgstatdesc.c
@@ -0,0 +1,48 @@
+/*-------------------------------------------------------------------------
+ *
+ * pgstatdesc.c
+ *	  rmgr descriptor routines for utils/activity/pgstat_xlog.c
+ *
+ * Portions Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/access/rmgrdesc/pgstatdesc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/pgstat_xlog.h"
+
+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+	if (info == XLOG_PGSTAT_DATA)
+	{
+		xl_pgstat_data *xlrec = (xl_pgstat_data *) rec;
+		char	   *sep = "";
+
+		appendStringInfo(buf, "stats kind \"%u\"; with data (%zu bytes): ",
+						 xlrec->stats_kind, xlrec->data_size);
+		/* Write data payload as a series of hex bytes */
+		for (int cnt = 0; cnt < xlrec->data_size; cnt++)
+		{
+			appendStringInfo(buf, "%s%02X", sep, (unsigned char) xlrec->data[cnt]);
+			sep = " ";
+		}
+	}
+}
+
+const char *
+pgstat_identify(uint8 info)
+{
+	if ((info & ~XLR_INFO_MASK) == XLOG_PGSTAT_DATA)
+		return "PGSTAT_DATA";
+
+	return NULL;
+}
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 1b7499726e..945843a78e 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -39,6 +39,7 @@
 #include "replication/message.h"
 #include "replication/origin.h"
 #include "storage/standby.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 /* IWYU pragma: end_keep */
 
diff --git a/src/backend/utils/activity/Makefile b/src/backend/utils/activity/Makefile
index 24b64a2742..186108c60a 100644
--- a/src/backend/utils/activity/Makefile
+++ b/src/backend/utils/activity/Makefile
@@ -33,6 +33,7 @@ OBJS = \
 	pgstat_subscription.o \
 	pgstat_wal.o \
 	pgstat_xact.o \
+	pgstat_xlog.o \
 	wait_event.o \
 	wait_event_funcs.o
 
diff --git a/src/backend/utils/activity/meson.build b/src/backend/utils/activity/meson.build
index 380d3dd70c..a79e2d3731 100644
--- a/src/backend/utils/activity/meson.build
+++ b/src/backend/utils/activity/meson.build
@@ -18,6 +18,7 @@ backend_sources += files(
   'pgstat_subscription.c',
   'pgstat_wal.c',
   'pgstat_xact.c',
+  'pgstat_xlog.c',
 )
 
 # this includes a .c file with contents generated in ../../../include/activity,
diff --git a/src/backend/utils/activity/pgstat_xlog.c b/src/backend/utils/activity/pgstat_xlog.c
new file mode 100644
index 0000000000..397a49a33a
--- /dev/null
+++ b/src/backend/utils/activity/pgstat_xlog.c
@@ -0,0 +1,77 @@
+/* ----------
+ * pgstat_xlog.c
+ *	   WAL replay logic for cumulative statistics.
+ *
+ *
+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/activity/pgstat_xlog.c
+ * ----------
+ */
+
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xloginsert.h"
+#include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
+
+/*
+ * Write pgstats record data into WAL.
+ */
+XLogRecPtr
+pgstat_xlog_data(PgStat_Kind stats_kind, const void *data,
+				 size_t data_size)
+{
+	xl_pgstat_data xlrec;
+	XLogRecPtr	lsn;
+
+	xlrec.stats_kind = stats_kind;
+	xlrec.data_size = data_size;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, SizeOfPgStatData);
+	XLogRegisterData(data, data_size);
+
+	lsn = XLogInsert(RM_PGSTAT_ID, XLOG_PGSTAT_DATA);
+	return lsn;
+}
+
+/*
+ * Redo just passes down to the stats kind expected by this record
+ * data included in it.
+ */
+void
+pgstat_redo(XLogReaderState *record)
+{
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+	xl_pgstat_data *xlrec;
+	PgStat_Kind stats_kind;
+	size_t		data_size;
+	void	   *data;
+	const PgStat_KindInfo *kind_info;
+
+	if (info != XLOG_PGSTAT_DATA)
+		elog(PANIC, "pgstat_redo: unknown op code %u", info);
+
+	xlrec = (xl_pgstat_data *) XLogRecGetData(record);
+
+	stats_kind = xlrec->stats_kind;
+	data_size = xlrec->data_size;
+	data = xlrec->data;
+
+	kind_info = pgstat_get_kind_info(stats_kind);
+
+	if (kind_info == NULL)
+		elog(FATAL, "pgstat_redo: invalid stats data found: kind=%u",
+			 stats_kind);
+
+	if (kind_info->redo_cb == NULL)
+		elog(FATAL, "pgstat_redo: no redo callback found: kind=%s",
+			 kind_info->name);
+
+	kind_info->redo_cb(data, data_size);
+}
diff --git a/src/bin/pg_waldump/.gitignore b/src/bin/pg_waldump/.gitignore
index ec51f41c76..e5089b322d 100644
--- a/src/bin/pg_waldump/.gitignore
+++ b/src/bin/pg_waldump/.gitignore
@@ -13,6 +13,7 @@
 /logicalmsgdesc.c
 /mxactdesc.c
 /nbtdesc.c
+/pgstatdesc.c
 /relmapdesc.c
 /replorigindesc.c
 /rmgrdesc_utils.c
diff --git a/src/bin/pg_waldump/rmgrdesc.c b/src/bin/pg_waldump/rmgrdesc.c
index 6b8c17bb4c..05b4f2acd1 100644
--- a/src/bin/pg_waldump/rmgrdesc.c
+++ b/src/bin/pg_waldump/rmgrdesc.c
@@ -30,6 +30,7 @@
 #include "replication/origin.h"
 #include "rmgrdesc.h"
 #include "storage/standbydefs.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 
 #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 578e473139..367ed6ab8c 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -73,7 +73,8 @@ BRIN
 CommitTs
 ReplicationOrigin
 Generic
-LogicalMessage$/,
+LogicalMessage
+PgStat$/,
 	'rmgr list');
 
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e1c4f913f8..a5b1a2c998 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4134,6 +4134,7 @@ xl_multixact_create
 xl_multixact_truncate
 xl_overwrite_contrecord
 xl_parameter_change
+xl_pgstat_data
 xl_relmap_update
 xl_replorigin_drop
 xl_replorigin_set
-- 
2.45.2

v1-0004-injection_points-Add-option-and-tests-for-WAL-log.patchtext/x-diff; charset=us-asciiDownload
From 877951141b1affc55e022332f54120621e1a4958 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 12:05:33 +0900
Subject: [PATCH v1 4/4] injection_points: Add option and tests for WAL-logging

This serves as a template for WAL-logging related to pgstats, based on
the variable-numbered stats the module includes.  Some TAP tests are
added to check the backend facility.
---
 .../injection_points/injection_points.c       | 19 ++++++-
 .../injection_points/injection_stats.c        | 52 +++++++++++++++++++
 .../injection_points/injection_stats.h        |  3 +-
 .../modules/injection_points/t/001_stats.pl   | 22 ++++++++
 src/tools/pgindent/typedefs.list              |  1 +
 5 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index 6bcde7b34e..a5180e97f5 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -104,7 +104,7 @@ extern PGDLLEXPORT void injection_wait(const char *name,
 static bool injection_point_local = false;
 
 /*
- * GUC variable
+ * GUC variables
  *
  * This GUC is useful to control if statistics should be enabled or not
  * during a test with injection points, like for example if a test relies
@@ -112,6 +112,12 @@ static bool injection_point_local = false;
  */
 bool		inj_stats_enabled = false;
 
+/*
+ * This GUC controls if statistics of injection points should be logged
+ * into WAL or not.
+ */
+bool		inj_stats_wal_enabled = false;
+
 /* Shared memory init callbacks */
 static shmem_request_hook_type prev_shmem_request_hook = NULL;
 static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
@@ -534,6 +540,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("injection_points.log_stats",
+							 "Enables WAL-logging for injection points statistics.",
+							 NULL,
+							 &inj_stats_wal_enabled,
+							 false,
+							 PGC_POSTMASTER,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	MarkGUCPrefixReserved("injection_points");
 
 	/* Shared memory initialization */
diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c
index 21d5c10f39..c14655f70e 100644
--- a/src/test/modules/injection_points/injection_stats.c
+++ b/src/test/modules/injection_points/injection_stats.c
@@ -21,6 +21,7 @@
 #include "pgstat.h"
 #include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
 
 /* Structures for statistics of injection points */
 typedef struct PgStat_StatInjEntry
@@ -34,6 +35,14 @@ typedef struct PgStatShared_InjectionPoint
 	PgStat_StatInjEntry stats;
 } PgStatShared_InjectionPoint;
 
+/* Structure for data of injection points logged in WAL */
+typedef struct PgStat_StatInjRecord
+{
+	uint64		objid;			/* hash of the point name */
+	PgStat_StatInjEntry entry;	/* stats data */
+} PgStat_StatInjRecord;
+
+static void injection_stats_redo_cb(void *data, size_t data_size);
 static bool injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
 
 static const PgStat_KindInfo injection_stats = {
@@ -48,6 +57,7 @@ static const PgStat_KindInfo injection_stats = {
 	.shared_data_off = offsetof(PgStatShared_InjectionPoint, stats),
 	.shared_data_len = sizeof(((PgStatShared_InjectionPoint *) 0)->stats),
 	.pending_size = sizeof(PgStat_StatInjEntry),
+	.redo_cb = injection_stats_redo_cb,
 	.flush_pending_cb = injection_stats_flush_cb,
 };
 
@@ -64,6 +74,31 @@ static const PgStat_KindInfo injection_stats = {
 /* Track if stats are loaded */
 static bool inj_stats_loaded = false;
 
+/*
+ * REDO callback for injection point stats.
+ */
+static void
+injection_stats_redo_cb(void *data, size_t data_size)
+{
+	PgStat_StatInjRecord *record_data = (PgStat_StatInjRecord *) data;
+	PgStat_StatInjEntry record_entry = record_data->entry;
+	PgStat_StatInjEntry *stat_entry;
+	PgStatShared_InjectionPoint *shstatent;
+	PgStat_EntryRef *entry_ref;
+
+	Assert(data_size == sizeof(PgStat_StatInjRecord));
+
+	/* create or fetch existing entry */
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  record_data->objid, NULL);
+
+	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
+	stat_entry = &shstatent->stats;
+
+	/* Update the injection point statistics, overwriting any existing data */
+	*stat_entry = record_entry;
+}
+
 /*
  * Callback for stats handling
  */
@@ -72,6 +107,7 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 {
 	PgStat_StatInjEntry *localent;
 	PgStatShared_InjectionPoint *shfuncent;
+	PgStat_StatInjRecord record_data = {0};
 
 	localent = (PgStat_StatInjEntry *) entry_ref->pending;
 	shfuncent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
@@ -81,8 +117,24 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	shfuncent->stats.numcalls += localent->numcalls;
 
+	record_data.objid = entry_ref->shared_entry->key.objid;
+	record_data.entry = shfuncent->stats;
+
 	pgstat_unlock_entry(entry_ref);
 
+	/*
+	 * Store the data in WAL, if not in recovery and if the option is enabled.
+	 */
+	if (!RecoveryInProgress() && inj_stats_wal_enabled)
+	{
+		XLogRecPtr	lsn;
+
+		lsn = pgstat_xlog_data(PGSTAT_KIND_INJECTION, &record_data,
+							   sizeof(PgStat_StatInjRecord));
+
+		/* Force a flush, to ensure persistency */
+		XLogFlush(lsn);
+	}
 	return true;
 }
 
diff --git a/src/test/modules/injection_points/injection_stats.h b/src/test/modules/injection_points/injection_stats.h
index c48d533b4b..2583f3fe9c 100644
--- a/src/test/modules/injection_points/injection_stats.h
+++ b/src/test/modules/injection_points/injection_stats.h
@@ -15,8 +15,9 @@
 #ifndef INJECTION_STATS
 #define INJECTION_STATS
 
-/* GUC variable */
+/* GUC variables */
 extern bool inj_stats_enabled;
+extern bool inj_stats_wal_enabled;
 
 /* injection_stats.c */
 extern void pgstat_register_inj(void);
diff --git a/src/test/modules/injection_points/t/001_stats.pl b/src/test/modules/injection_points/t/001_stats.pl
index 36728f16fc..395c2e7cdb 100644
--- a/src/test/modules/injection_points/t/001_stats.pl
+++ b/src/test/modules/injection_points/t/001_stats.pl
@@ -75,4 +75,26 @@ $node->stop;
 $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', "''");
 $node->start;
 
+# Stop the server and enable WAL, the stats are preserved after recovery.
+$node->stop;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+injection_points.log_stats = true
+));
+$node->start;
+
+# Two calls, both WAL-logged.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('stats-wal-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->stop('immediate');
+$node->start;
+$numcalls = $node->safe_psql('postgres',
+	"SELECT injection_points_stats_numcalls('stats-wal-notice');");
+is($numcalls, '2', 'number of stats after crash with WAL-logging enabled');
+
 done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a5b1a2c998..47e3e07f5e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2166,6 +2166,7 @@ PgStat_StatDBEntry
 PgStat_StatFuncEntry
 PgStat_StatInjEntry
 PgStat_StatInjFixedEntry
+PgStat_StatInjRecord
 PgStat_StatReplSlotEntry
 PgStat_StatSubEntry
 PgStat_StatTabEntry
-- 
2.45.2

#2Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#1)
3 attachment(s)
Re: WAL-logging facility for pgstats kinds

On Fri, Dec 27, 2024 at 12:32:25PM +0900, Michael Paquier wrote:

For clarity, the patch set has been split into several pieces, I hope
rather edible:
- 0001, a fix I've posted on a different thread [1], used in patch
0004 to test this new facility.
- 0002, a refactoring piece to be able to expose stats kind data into
the frontend (for pg_waldump).
- 0003 is the core backend piece, introducing the WAL-logging routine
and the new RMGR.
- 0004, that provides tests for the new backend facility via a custom
stats kind. Requires 0001.

I am going to attach it to the next CF.

0001 has been applied as of b757abefc041. Here is a rebased patch set
for the rest.
--
Michael

Attachments:

v2-0002-Move-information-about-pgstats-kinds-into-its-own.patchtext/x-diff; charset=us-asciiDownload
From 55a3c446c44f32204a070ca33c6ac01a16a2da3d Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 11:54:45 +0900
Subject: [PATCH v2 2/4] Move information about pgstats kinds into its own
 header

This information is split into its own header as it will be used by a
follow-up patch, to be used in some frontend and backend code.
---
 src/include/pgstat.h            | 57 +-------------------------
 src/include/utils/pgstat_kind.h | 72 +++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 56 deletions(-)
 create mode 100644 src/include/utils/pgstat_kind.h

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 75a41e8ff3..601b8075d5 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -18,6 +18,7 @@
 #include "replication/conflict.h"
 #include "utils/backend_progress.h" /* for backward compatibility */
 #include "utils/backend_status.h"	/* for backward compatibility */
+#include "utils/pgstat_kind.h"
 #include "utils/relcache.h"
 #include "utils/wait_event.h"	/* for backward compatibility */
 
@@ -33,62 +34,6 @@
 /* Default directory to store temporary statistics data in */
 #define PG_STAT_TMP_DIR		"pg_stat_tmp"
 
-/* The types of statistics entries */
-#define PgStat_Kind uint32
-
-/* Range of IDs allowed, for built-in and custom kinds */
-#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
-#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
-
-/* use 0 for INVALID, to catch zero-initialized data */
-#define PGSTAT_KIND_INVALID 0
-
-/* stats for variable-numbered objects */
-#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
-#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
-#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
-#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
-#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
-#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
-
-/* stats for fixed-numbered objects */
-#define PGSTAT_KIND_ARCHIVER	7
-#define PGSTAT_KIND_BGWRITER	8
-#define PGSTAT_KIND_CHECKPOINTER	9
-#define PGSTAT_KIND_IO	10
-#define PGSTAT_KIND_SLRU	11
-#define PGSTAT_KIND_WAL	12
-
-#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
-#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
-#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
-
-/* Custom stats kinds */
-
-/* Range of IDs allowed for custom stats kinds */
-#define PGSTAT_KIND_CUSTOM_MIN	128
-#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
-#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
-
-/*
- * PgStat_Kind to use for extensions that require an ID, but are still in
- * development and have not reserved their own unique kind ID yet. See:
- * https://wiki.postgresql.org/wiki/CustomCumulativeStats
- */
-#define PGSTAT_KIND_EXPERIMENTAL	128
-
-static inline bool
-pgstat_is_kind_builtin(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
-}
-
-static inline bool
-pgstat_is_kind_custom(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
-}
-
 /* Values for track_functions GUC variable --- order is significant! */
 typedef enum TrackFunctionsLevel
 {
diff --git a/src/include/utils/pgstat_kind.h b/src/include/utils/pgstat_kind.h
new file mode 100644
index 0000000000..526fa04e8f
--- /dev/null
+++ b/src/include/utils/pgstat_kind.h
@@ -0,0 +1,72 @@
+/* ----------
+ *	pgstat_kind.h
+ *
+ *	Definitions related to the statistics kinds for the PostgreSQL
+ *	cumulative statistics system.  Can be included in backend or
+ *	frontend code.
+ *
+ *	Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ *	src/include/utils/pgstat_kind.h
+ * ----------
+ */
+#ifndef PGSTAT_KIND_H
+#define PGSTAT_KIND_H
+
+/* The types of statistics entries */
+#define PgStat_Kind uint32
+
+/* Range of IDs allowed, for built-in and custom kinds */
+#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
+#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
+
+/* use 0 for INVALID, to catch zero-initialized data */
+#define PGSTAT_KIND_INVALID 0
+
+/* stats for variable-numbered objects */
+#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
+#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
+#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
+#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
+#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
+#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
+
+/* stats for fixed-numbered objects */
+#define PGSTAT_KIND_ARCHIVER	7
+#define PGSTAT_KIND_BGWRITER	8
+#define PGSTAT_KIND_CHECKPOINTER	9
+#define PGSTAT_KIND_IO	10
+#define PGSTAT_KIND_SLRU	11
+#define PGSTAT_KIND_WAL	12
+
+#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
+#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
+#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
+
+/* Custom stats kinds */
+
+/* Range of IDs allowed for custom stats kinds */
+#define PGSTAT_KIND_CUSTOM_MIN	128
+#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
+#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
+
+/*
+ * PgStat_Kind to use for extensions that require an ID, but are still in
+ * development and have not reserved their own unique kind ID yet. See:
+ * https://wiki.postgresql.org/wiki/CustomCumulativeStats
+ */
+#define PGSTAT_KIND_EXPERIMENTAL	128
+
+static inline bool
+pgstat_is_kind_builtin(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
+}
+
+static inline bool
+pgstat_is_kind_custom(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
+}
+
+#endif							/* PGSTAT_KIND_H */
-- 
2.45.2

v2-0003-Add-RMGR-and-WAL-logging-API-for-pgstats.patchtext/x-diff; charset=us-asciiDownload
From 2fd2b76238618e62e37eb011c0fe145162602a9d Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 12:03:34 +0900
Subject: [PATCH v2 3/4] Add RMGR and WAL-logging API for pgstats

This commit adds a new facility in the backend that offers to pgstats
kinds the possibility to WAL-log data that need to be persisted across
restarts:
- A new callback called redo_cb can be defined by a pgstats_kind.
- pgstat_xlog.c includes one API able to register some payload data and
its size.  Stats kinds doing WAL-logging require a definition of
redo_cb.
- pg_waldump is able to show the data logged, as hexadedimal data.

This facility has applications for in-core and custom stats kinds, with
one primary case being the possibility to WAL-log some statistics
related to relations so as these are not lost post-crash.  This is left
as future work.
---
 src/include/access/rmgrlist.h            |  1 +
 src/include/utils/pgstat_internal.h      |  8 +++
 src/include/utils/pgstat_xlog.h          | 41 +++++++++++++
 src/backend/access/rmgrdesc/Makefile     |  1 +
 src/backend/access/rmgrdesc/meson.build  |  1 +
 src/backend/access/rmgrdesc/pgstatdesc.c | 48 +++++++++++++++
 src/backend/access/transam/rmgr.c        |  1 +
 src/backend/utils/activity/Makefile      |  1 +
 src/backend/utils/activity/meson.build   |  1 +
 src/backend/utils/activity/pgstat_xlog.c | 77 ++++++++++++++++++++++++
 src/bin/pg_waldump/.gitignore            |  1 +
 src/bin/pg_waldump/rmgrdesc.c            |  1 +
 src/bin/pg_waldump/t/001_basic.pl        |  3 +-
 src/tools/pgindent/typedefs.list         |  1 +
 14 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 src/include/utils/pgstat_xlog.h
 create mode 100644 src/backend/access/rmgrdesc/pgstatdesc.c
 create mode 100644 src/backend/utils/activity/pgstat_xlog.c

diff --git a/src/include/access/rmgrlist.h b/src/include/access/rmgrlist.h
index 78e6b908c6..abf9393bbe 100644
--- a/src/include/access/rmgrlist.h
+++ b/src/include/access/rmgrlist.h
@@ -47,3 +47,4 @@ PG_RMGR(RM_COMMIT_TS_ID, "CommitTs", commit_ts_redo, commit_ts_desc, commit_ts_i
 PG_RMGR(RM_REPLORIGIN_ID, "ReplicationOrigin", replorigin_redo, replorigin_desc, replorigin_identify, NULL, NULL, NULL, NULL)
 PG_RMGR(RM_GENERIC_ID, "Generic", generic_redo, generic_desc, generic_identify, NULL, NULL, generic_mask, NULL)
 PG_RMGR(RM_LOGICALMSG_ID, "LogicalMessage", logicalmsg_redo, logicalmsg_desc, logicalmsg_identify, NULL, NULL, NULL, logicalmsg_decode)
+PG_RMGR(RM_PGSTAT_ID, "PgStat", pgstat_redo, pgstat_desc, pgstat_identify, NULL, NULL, NULL, NULL)
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 811ed9b005..47d7f8988e 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -258,6 +258,14 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*init_backend_cb) (void);
 
+	/*
+	 * Perform custom actions when replaying WAL related to a statistics kind.
+	 *
+	 * "data" is a pointer to the stats information that can be used by the
+	 * stats kind at redo.
+	 */
+	void		(*redo_cb) (void *data, size_t data_size);
+
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
 	 * data is used.  See flush_fixed_cb for fixed-numbered stats.
diff --git a/src/include/utils/pgstat_xlog.h b/src/include/utils/pgstat_xlog.h
new file mode 100644
index 0000000000..25b7a78dd5
--- /dev/null
+++ b/src/include/utils/pgstat_xlog.h
@@ -0,0 +1,41 @@
+/* ----------
+ * pgstat_xlog.h
+ *	  Exports from utils/activity/pgstat_xlog.c
+ *
+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ * src/include/utils/pgstat_xlog.h
+ * ----------
+ */
+#ifndef PGSTAT_XLOG_H
+#define PGSTAT_XLOG_H
+
+#include "access/xlog.h"
+#include "access/xlogdefs.h"
+#include "access/xlogreader.h"
+#include "utils/pgstat_kind.h"
+
+/*
+ * Generic WAL record for pgstat data.
+ */
+typedef struct xl_pgstat_data
+{
+	PgStat_Kind stats_kind;
+	size_t		data_size;		/* size of the data */
+	/* data payload, worth data_size */
+	char		data[FLEXIBLE_ARRAY_MEMBER];
+} xl_pgstat_data;
+
+#define SizeOfPgStatData    (offsetof(xl_pgstat_data, data))
+
+extern XLogRecPtr pgstat_xlog_data(PgStat_Kind stats_kind,
+								   const void *data,
+								   size_t data_size);
+
+/* RMGR API */
+#define XLOG_PGSTAT_DATA	0x00
+extern void pgstat_redo(XLogReaderState *record);
+extern void pgstat_desc(StringInfo buf, XLogReaderState *record);
+extern const char *pgstat_identify(uint8 info);
+
+#endif							/* PGSTAT_XLOG_H */
diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index cd95eec37f..3979d22946 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	logicalmsgdesc.o \
 	mxactdesc.o \
 	nbtdesc.o \
+	pgstatdesc.o \
 	relmapdesc.o \
 	replorigindesc.o \
 	rmgrdesc_utils.o \
diff --git a/src/backend/access/rmgrdesc/meson.build b/src/backend/access/rmgrdesc/meson.build
index e8b7a65fc7..951dacbd72 100644
--- a/src/backend/access/rmgrdesc/meson.build
+++ b/src/backend/access/rmgrdesc/meson.build
@@ -14,6 +14,7 @@ rmgr_desc_sources = files(
   'logicalmsgdesc.c',
   'mxactdesc.c',
   'nbtdesc.c',
+  'pgstatdesc.c',
   'relmapdesc.c',
   'replorigindesc.c',
   'rmgrdesc_utils.c',
diff --git a/src/backend/access/rmgrdesc/pgstatdesc.c b/src/backend/access/rmgrdesc/pgstatdesc.c
new file mode 100644
index 0000000000..0fc19d1e56
--- /dev/null
+++ b/src/backend/access/rmgrdesc/pgstatdesc.c
@@ -0,0 +1,48 @@
+/*-------------------------------------------------------------------------
+ *
+ * pgstatdesc.c
+ *	  rmgr descriptor routines for utils/activity/pgstat_xlog.c
+ *
+ * Portions Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/access/rmgrdesc/pgstatdesc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/pgstat_xlog.h"
+
+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+	if (info == XLOG_PGSTAT_DATA)
+	{
+		xl_pgstat_data *xlrec = (xl_pgstat_data *) rec;
+		char	   *sep = "";
+
+		appendStringInfo(buf, "stats kind \"%u\"; with data (%zu bytes): ",
+						 xlrec->stats_kind, xlrec->data_size);
+		/* Write data payload as a series of hex bytes */
+		for (int cnt = 0; cnt < xlrec->data_size; cnt++)
+		{
+			appendStringInfo(buf, "%s%02X", sep, (unsigned char) xlrec->data[cnt]);
+			sep = " ";
+		}
+	}
+}
+
+const char *
+pgstat_identify(uint8 info)
+{
+	if ((info & ~XLR_INFO_MASK) == XLOG_PGSTAT_DATA)
+		return "PGSTAT_DATA";
+
+	return NULL;
+}
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 1b7499726e..945843a78e 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -39,6 +39,7 @@
 #include "replication/message.h"
 #include "replication/origin.h"
 #include "storage/standby.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 /* IWYU pragma: end_keep */
 
diff --git a/src/backend/utils/activity/Makefile b/src/backend/utils/activity/Makefile
index 24b64a2742..186108c60a 100644
--- a/src/backend/utils/activity/Makefile
+++ b/src/backend/utils/activity/Makefile
@@ -33,6 +33,7 @@ OBJS = \
 	pgstat_subscription.o \
 	pgstat_wal.o \
 	pgstat_xact.o \
+	pgstat_xlog.o \
 	wait_event.o \
 	wait_event_funcs.o
 
diff --git a/src/backend/utils/activity/meson.build b/src/backend/utils/activity/meson.build
index 380d3dd70c..a79e2d3731 100644
--- a/src/backend/utils/activity/meson.build
+++ b/src/backend/utils/activity/meson.build
@@ -18,6 +18,7 @@ backend_sources += files(
   'pgstat_subscription.c',
   'pgstat_wal.c',
   'pgstat_xact.c',
+  'pgstat_xlog.c',
 )
 
 # this includes a .c file with contents generated in ../../../include/activity,
diff --git a/src/backend/utils/activity/pgstat_xlog.c b/src/backend/utils/activity/pgstat_xlog.c
new file mode 100644
index 0000000000..397a49a33a
--- /dev/null
+++ b/src/backend/utils/activity/pgstat_xlog.c
@@ -0,0 +1,77 @@
+/* ----------
+ * pgstat_xlog.c
+ *	   WAL replay logic for cumulative statistics.
+ *
+ *
+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/activity/pgstat_xlog.c
+ * ----------
+ */
+
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xloginsert.h"
+#include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
+
+/*
+ * Write pgstats record data into WAL.
+ */
+XLogRecPtr
+pgstat_xlog_data(PgStat_Kind stats_kind, const void *data,
+				 size_t data_size)
+{
+	xl_pgstat_data xlrec;
+	XLogRecPtr	lsn;
+
+	xlrec.stats_kind = stats_kind;
+	xlrec.data_size = data_size;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, SizeOfPgStatData);
+	XLogRegisterData(data, data_size);
+
+	lsn = XLogInsert(RM_PGSTAT_ID, XLOG_PGSTAT_DATA);
+	return lsn;
+}
+
+/*
+ * Redo just passes down to the stats kind expected by this record
+ * data included in it.
+ */
+void
+pgstat_redo(XLogReaderState *record)
+{
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+	xl_pgstat_data *xlrec;
+	PgStat_Kind stats_kind;
+	size_t		data_size;
+	void	   *data;
+	const PgStat_KindInfo *kind_info;
+
+	if (info != XLOG_PGSTAT_DATA)
+		elog(PANIC, "pgstat_redo: unknown op code %u", info);
+
+	xlrec = (xl_pgstat_data *) XLogRecGetData(record);
+
+	stats_kind = xlrec->stats_kind;
+	data_size = xlrec->data_size;
+	data = xlrec->data;
+
+	kind_info = pgstat_get_kind_info(stats_kind);
+
+	if (kind_info == NULL)
+		elog(FATAL, "pgstat_redo: invalid stats data found: kind=%u",
+			 stats_kind);
+
+	if (kind_info->redo_cb == NULL)
+		elog(FATAL, "pgstat_redo: no redo callback found: kind=%s",
+			 kind_info->name);
+
+	kind_info->redo_cb(data, data_size);
+}
diff --git a/src/bin/pg_waldump/.gitignore b/src/bin/pg_waldump/.gitignore
index ec51f41c76..e5089b322d 100644
--- a/src/bin/pg_waldump/.gitignore
+++ b/src/bin/pg_waldump/.gitignore
@@ -13,6 +13,7 @@
 /logicalmsgdesc.c
 /mxactdesc.c
 /nbtdesc.c
+/pgstatdesc.c
 /relmapdesc.c
 /replorigindesc.c
 /rmgrdesc_utils.c
diff --git a/src/bin/pg_waldump/rmgrdesc.c b/src/bin/pg_waldump/rmgrdesc.c
index 6b8c17bb4c..05b4f2acd1 100644
--- a/src/bin/pg_waldump/rmgrdesc.c
+++ b/src/bin/pg_waldump/rmgrdesc.c
@@ -30,6 +30,7 @@
 #include "replication/origin.h"
 #include "rmgrdesc.h"
 #include "storage/standbydefs.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 
 #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 578e473139..367ed6ab8c 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -73,7 +73,8 @@ BRIN
 CommitTs
 ReplicationOrigin
 Generic
-LogicalMessage$/,
+LogicalMessage
+PgStat$/,
 	'rmgr list');
 
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e1c4f913f8..a5b1a2c998 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4134,6 +4134,7 @@ xl_multixact_create
 xl_multixact_truncate
 xl_overwrite_contrecord
 xl_parameter_change
+xl_pgstat_data
 xl_relmap_update
 xl_replorigin_drop
 xl_replorigin_set
-- 
2.45.2

v2-0004-injection_points-Add-option-and-tests-for-WAL-log.patchtext/x-diff; charset=us-asciiDownload
From 54537bbceab106b7a40aebdc815e6cd1ccd481cc Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 27 Dec 2024 12:05:33 +0900
Subject: [PATCH v2 4/4] injection_points: Add option and tests for WAL-logging

This serves as a template for WAL-logging related to pgstats, based on
the variable-numbered stats the module includes.  Some TAP tests are
added to check the backend facility.
---
 .../injection_points/injection_points.c       | 19 ++++++-
 .../injection_points/injection_stats.c        | 52 +++++++++++++++++++
 .../injection_points/injection_stats.h        |  3 +-
 .../modules/injection_points/t/001_stats.pl   | 22 ++++++++
 src/tools/pgindent/typedefs.list              |  1 +
 5 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index 6bcde7b34e..a5180e97f5 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -104,7 +104,7 @@ extern PGDLLEXPORT void injection_wait(const char *name,
 static bool injection_point_local = false;
 
 /*
- * GUC variable
+ * GUC variables
  *
  * This GUC is useful to control if statistics should be enabled or not
  * during a test with injection points, like for example if a test relies
@@ -112,6 +112,12 @@ static bool injection_point_local = false;
  */
 bool		inj_stats_enabled = false;
 
+/*
+ * This GUC controls if statistics of injection points should be logged
+ * into WAL or not.
+ */
+bool		inj_stats_wal_enabled = false;
+
 /* Shared memory init callbacks */
 static shmem_request_hook_type prev_shmem_request_hook = NULL;
 static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
@@ -534,6 +540,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("injection_points.log_stats",
+							 "Enables WAL-logging for injection points statistics.",
+							 NULL,
+							 &inj_stats_wal_enabled,
+							 false,
+							 PGC_POSTMASTER,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	MarkGUCPrefixReserved("injection_points");
 
 	/* Shared memory initialization */
diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c
index 21d5c10f39..c14655f70e 100644
--- a/src/test/modules/injection_points/injection_stats.c
+++ b/src/test/modules/injection_points/injection_stats.c
@@ -21,6 +21,7 @@
 #include "pgstat.h"
 #include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
 
 /* Structures for statistics of injection points */
 typedef struct PgStat_StatInjEntry
@@ -34,6 +35,14 @@ typedef struct PgStatShared_InjectionPoint
 	PgStat_StatInjEntry stats;
 } PgStatShared_InjectionPoint;
 
+/* Structure for data of injection points logged in WAL */
+typedef struct PgStat_StatInjRecord
+{
+	uint64		objid;			/* hash of the point name */
+	PgStat_StatInjEntry entry;	/* stats data */
+} PgStat_StatInjRecord;
+
+static void injection_stats_redo_cb(void *data, size_t data_size);
 static bool injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
 
 static const PgStat_KindInfo injection_stats = {
@@ -48,6 +57,7 @@ static const PgStat_KindInfo injection_stats = {
 	.shared_data_off = offsetof(PgStatShared_InjectionPoint, stats),
 	.shared_data_len = sizeof(((PgStatShared_InjectionPoint *) 0)->stats),
 	.pending_size = sizeof(PgStat_StatInjEntry),
+	.redo_cb = injection_stats_redo_cb,
 	.flush_pending_cb = injection_stats_flush_cb,
 };
 
@@ -64,6 +74,31 @@ static const PgStat_KindInfo injection_stats = {
 /* Track if stats are loaded */
 static bool inj_stats_loaded = false;
 
+/*
+ * REDO callback for injection point stats.
+ */
+static void
+injection_stats_redo_cb(void *data, size_t data_size)
+{
+	PgStat_StatInjRecord *record_data = (PgStat_StatInjRecord *) data;
+	PgStat_StatInjEntry record_entry = record_data->entry;
+	PgStat_StatInjEntry *stat_entry;
+	PgStatShared_InjectionPoint *shstatent;
+	PgStat_EntryRef *entry_ref;
+
+	Assert(data_size == sizeof(PgStat_StatInjRecord));
+
+	/* create or fetch existing entry */
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  record_data->objid, NULL);
+
+	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
+	stat_entry = &shstatent->stats;
+
+	/* Update the injection point statistics, overwriting any existing data */
+	*stat_entry = record_entry;
+}
+
 /*
  * Callback for stats handling
  */
@@ -72,6 +107,7 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 {
 	PgStat_StatInjEntry *localent;
 	PgStatShared_InjectionPoint *shfuncent;
+	PgStat_StatInjRecord record_data = {0};
 
 	localent = (PgStat_StatInjEntry *) entry_ref->pending;
 	shfuncent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
@@ -81,8 +117,24 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	shfuncent->stats.numcalls += localent->numcalls;
 
+	record_data.objid = entry_ref->shared_entry->key.objid;
+	record_data.entry = shfuncent->stats;
+
 	pgstat_unlock_entry(entry_ref);
 
+	/*
+	 * Store the data in WAL, if not in recovery and if the option is enabled.
+	 */
+	if (!RecoveryInProgress() && inj_stats_wal_enabled)
+	{
+		XLogRecPtr	lsn;
+
+		lsn = pgstat_xlog_data(PGSTAT_KIND_INJECTION, &record_data,
+							   sizeof(PgStat_StatInjRecord));
+
+		/* Force a flush, to ensure persistency */
+		XLogFlush(lsn);
+	}
 	return true;
 }
 
diff --git a/src/test/modules/injection_points/injection_stats.h b/src/test/modules/injection_points/injection_stats.h
index c48d533b4b..2583f3fe9c 100644
--- a/src/test/modules/injection_points/injection_stats.h
+++ b/src/test/modules/injection_points/injection_stats.h
@@ -15,8 +15,9 @@
 #ifndef INJECTION_STATS
 #define INJECTION_STATS
 
-/* GUC variable */
+/* GUC variables */
 extern bool inj_stats_enabled;
+extern bool inj_stats_wal_enabled;
 
 /* injection_stats.c */
 extern void pgstat_register_inj(void);
diff --git a/src/test/modules/injection_points/t/001_stats.pl b/src/test/modules/injection_points/t/001_stats.pl
index 36728f16fc..395c2e7cdb 100644
--- a/src/test/modules/injection_points/t/001_stats.pl
+++ b/src/test/modules/injection_points/t/001_stats.pl
@@ -75,4 +75,26 @@ $node->stop;
 $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', "''");
 $node->start;
 
+# Stop the server and enable WAL, the stats are preserved after recovery.
+$node->stop;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+injection_points.log_stats = true
+));
+$node->start;
+
+# Two calls, both WAL-logged.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('stats-wal-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->stop('immediate');
+$node->start;
+$numcalls = $node->safe_psql('postgres',
+	"SELECT injection_points_stats_numcalls('stats-wal-notice');");
+is($numcalls, '2', 'number of stats after crash with WAL-logging enabled');
+
 done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a5b1a2c998..47e3e07f5e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2166,6 +2166,7 @@ PgStat_StatDBEntry
 PgStat_StatFuncEntry
 PgStat_StatInjEntry
 PgStat_StatInjFixedEntry
+PgStat_StatInjRecord
 PgStat_StatReplSlotEntry
 PgStat_StatSubEntry
 PgStat_StatTabEntry
-- 
2.45.2

#3Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Michael Paquier (#2)
Re: WAL-logging facility for pgstats kinds

Hi,

On Tue, Dec 31, 2024 at 09:52:31AM +0900, Michael Paquier wrote:

On Fri, Dec 27, 2024 at 12:32:25PM +0900, Michael Paquier wrote:

For clarity, the patch set has been split into several pieces, I hope
rather edible:
- 0001, a fix I've posted on a different thread [1], used in patch
0004 to test this new facility.
- 0002, a refactoring piece to be able to expose stats kind data into
the frontend (for pg_waldump).
- 0003 is the core backend piece, introducing the WAL-logging routine
and the new RMGR.
- 0004, that provides tests for the new backend facility via a custom
stats kind. Requires 0001.

I am going to attach it to the next CF.

0001 has been applied as of b757abefc041. Here is a rebased patch set
for the rest.

Thanks for the patch!

A few random comments:

About v2-0002:

=== 1

+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group

As pgstat_kind.h is a new file, s/Copyright (c) 2001-2024/Copyright (c) 2025/
maybe?

No more comments, as v2-0002 is "just" moving some stuff from pgstat.h to
pgstat_kind.h.

About v2-0003:

=== 2

Same as === 1 for pgstatdesc.c, pgstat_xlog.c and pgstat_xlog.h.

=== 3

+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{

Looks like logicalmsg_desc(), fine by me, except:

+ /* Write data payload as a series of hex bytes */

s/data payload/stats data/ maybe?

=== 4

+const char *
+pgstat_identify(uint8 info)

Looks like logicalmsg_identify(), fine by me.

=== 5

+typedef struct xl_pgstat_data
+{
+       PgStat_Kind stats_kind;
+       size_t          data_size;              /* size of the data */
+       /* data payload, worth data_size */
+       char            data[FLEXIBLE_ARRAY_MEMBER];
+} xl_pgstat_data;

Yeah, snapshotConflictHorizon and isCatalogRel are not needed here.

=== 6

+       stats_kind = xlrec->stats_kind;
+       data_size = xlrec->data_size;
+       data = xlrec->data;
+
+       kind_info = pgstat_get_kind_info(stats_kind);
+
+       if (kind_info == NULL)
+               elog(FATAL, "pgstat_redo: invalid stats data found: kind=%u",
+                        stats_kind);
+
+       if (kind_info->redo_cb == NULL)
+               elog(FATAL, "pgstat_redo: no redo callback found: kind=%s",
+                        kind_info->name);

What about moving the data_size and data assignments after the kind_info
and kind_info->redo_cb checks?

Also,

s/invalid stats data/invalid stats kind/ maybe?

About v2-0004:

=== 7

+ PgStat_StatInjRecord record_data = {0};

PgStat_StatInjRecord does not contain any padding but as it acts as a template,
better to use memset() instead? (to cover the cases where the record contains
padding).

=== 8

+       record_data.objid = entry_ref->shared_entry->key.objid;
+       record_data.entry = shfuncent->stats;

So it makes === 7 useless as here we are setting all the fields. But I think
it's good to keep === 7 as it acts as a template.

=== 9

+ if (!RecoveryInProgress() && inj_stats_wal_enabled)

s/!RecoveryInProgress()/XLogInsertAllowed()/ maybe?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#4Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#1)
Re: WAL-logging facility for pgstats kinds

Hi,

On 2024-12-27 12:32:25 +0900, Michael Paquier wrote:

While brainstorming on the contents of the thread I have posted a
couple of days ago, I have been looking at what could be done so as
pgstats and WAL-logging could work together. This was point 2) from
this message:
/messages/by-id/Z2tblEmfuOfZy4zx@paquier.xyz

While considering the point that not all stats data is worth
replicating, I have fallen down to the following properties that are
nice to have across the board:
- A pgstats kind should be able to WAL-log some data that is should be
able to decide. Including versioning of the data.

I don't really think that's right. For cumulative stats to make sense on both
a primary and a replica, or a primary after a crash, they need to cover things
that *already* are WAL logged.

E.g. for pg_stat_all_tables.n_{tup_{ins,upd,del,hot_upd},live_tup,dead_tup,
...}, - which are important for autovacuum scheduling - we should use the WAL
records covering those changes to re-assemble the stats during WAL replay.

The only reason that that's not trivial is that we don't have access to the
relation oid during crash recovery. Which in turn is why I think we should
change relation level stats to be keyed by relfilenode, rather than relation
oid.

I can't think of a real case where we would want to WAL log the stats
themselves, rather than re-emitting stats during replay based on the WAL
record of the "underlying object".

Do you have counter-examples?

Greetings,

Andres Freund

#5Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#4)
Re: WAL-logging facility for pgstats kinds

On Thu, Jan 02, 2025 at 08:08:42PM -0500, Andres Freund wrote:

I can't think of a real case where we would want to WAL log the stats
themselves, rather than re-emitting stats during replay based on the WAL
record of the "underlying object".

Do you have counter-examples?

I'm not sure if the rebuild based on the WAL records is simpler than
logging a snapshot of them that the startup process could digest.

Anyway, I've also wanted to be able to replicate stats for historical
tracking for stats logged through hooks, and a second case was
injection point stats. All these would require registering a custom
rmgr on top of a custom stats kind, and centralizing the logic eases a
lot some of the sanity checks they'd require as the redo callback of a
stats kind can just be attached to its PgStat_KindInfo.

I know that Lucas has been playing a bit with the area, and perhaps
he has more cases in mind where the replication of stats data could be
relevant. So I am adding him in CC. Perhaps I could be wrong, of
course.
--
Michael

#6Michael Paquier
michael@paquier.xyz
In reply to: Bertrand Drouvot (#3)
3 attachment(s)
Re: WAL-logging facility for pgstats kinds

On Tue, Dec 31, 2024 at 11:29:26AM +0000, Bertrand Drouvot wrote:

=== 1

+ * Copyright (c) 2001-2024, PostgreSQL Global Development Group

As pgstat_kind.h is a new file, s/Copyright (c) 2001-2024/Copyright (c) 2025/
maybe?

Fixed.

No more comments, as v2-0002 is "just" moving some stuff from pgstat.h to
pgstat_kind.h.

Thanks. This split is independent of the main topic of the thread,
and I think that it just makes the declarations cleaner, so I'm
planning to apply this one. I was also planning to write some custom
tool to manipulate the pgstats file. That will help a bit.

For the rest of the patch set, let's see how the discussion moves on..

Same as === 1 for pgstatdesc.c, pgstat_xlog.c and pgstat_xlog.h.

Fixed.

+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{

Looks like logicalmsg_desc(), fine by me, except:

Yes, there's not much you can do here, except if we invent one day a
way to load external libraries in the frontend.

s/data payload/stats data/ maybe?
s/invalid stats data/invalid stats kind/ maybe?

Okay.

+ PgStat_StatInjRecord record_data = {0};

PgStat_StatInjRecord does not contain any padding but as it acts as a template,
better to use memset() instead? (to cover the cases where the record contains
padding).

Dammit, forgot about that. Good point about the padding, especially
for the template bits.

+ if (!RecoveryInProgress() && inj_stats_wal_enabled)

s/!RecoveryInProgress()/XLogInsertAllowed()/ maybe?

I'd rather use RecoveryInProgress() here even if XLogInsertAllowed()
is a synonym of that, minus the update of LocalXLogInsertAllowed for
the local process.
--
Michael

Attachments:

v3-0002-Move-information-about-pgstats-kinds-into-its-own.patchtext/x-diff; charset=us-asciiDownload
From f5eb1b66f1883d2b2e23dfb20eedf8328b0a01f3 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Jan 2025 13:18:27 +0900
Subject: [PATCH v3 2/4] Move information about pgstats kinds into its own
 header

This information is split into its own header as it will be used by a
follow-up patch, to be used in some frontend and backend code.
---
 src/include/pgstat.h            | 57 +-------------------------
 src/include/utils/pgstat_kind.h | 72 +++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 56 deletions(-)
 create mode 100644 src/include/utils/pgstat_kind.h

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6475889c58..81ad8e2b43 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -18,6 +18,7 @@
 #include "replication/conflict.h"
 #include "utils/backend_progress.h" /* for backward compatibility */
 #include "utils/backend_status.h"	/* for backward compatibility */
+#include "utils/pgstat_kind.h"
 #include "utils/relcache.h"
 #include "utils/wait_event.h"	/* for backward compatibility */
 
@@ -33,62 +34,6 @@
 /* Default directory to store temporary statistics data in */
 #define PG_STAT_TMP_DIR		"pg_stat_tmp"
 
-/* The types of statistics entries */
-#define PgStat_Kind uint32
-
-/* Range of IDs allowed, for built-in and custom kinds */
-#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
-#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
-
-/* use 0 for INVALID, to catch zero-initialized data */
-#define PGSTAT_KIND_INVALID 0
-
-/* stats for variable-numbered objects */
-#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
-#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
-#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
-#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
-#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
-#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
-
-/* stats for fixed-numbered objects */
-#define PGSTAT_KIND_ARCHIVER	7
-#define PGSTAT_KIND_BGWRITER	8
-#define PGSTAT_KIND_CHECKPOINTER	9
-#define PGSTAT_KIND_IO	10
-#define PGSTAT_KIND_SLRU	11
-#define PGSTAT_KIND_WAL	12
-
-#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
-#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
-#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
-
-/* Custom stats kinds */
-
-/* Range of IDs allowed for custom stats kinds */
-#define PGSTAT_KIND_CUSTOM_MIN	128
-#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
-#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
-
-/*
- * PgStat_Kind to use for extensions that require an ID, but are still in
- * development and have not reserved their own unique kind ID yet. See:
- * https://wiki.postgresql.org/wiki/CustomCumulativeStats
- */
-#define PGSTAT_KIND_EXPERIMENTAL	128
-
-static inline bool
-pgstat_is_kind_builtin(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
-}
-
-static inline bool
-pgstat_is_kind_custom(PgStat_Kind kind)
-{
-	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
-}
-
 /* Values for track_functions GUC variable --- order is significant! */
 typedef enum TrackFunctionsLevel
 {
diff --git a/src/include/utils/pgstat_kind.h b/src/include/utils/pgstat_kind.h
new file mode 100644
index 0000000000..f44169fd5a
--- /dev/null
+++ b/src/include/utils/pgstat_kind.h
@@ -0,0 +1,72 @@
+/* ----------
+ *	pgstat_kind.h
+ *
+ *	Definitions related to the statistics kinds for the PostgreSQL
+ *	cumulative statistics system.  Can be included in backend or
+ *	frontend code.
+ *
+ *	Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ *	src/include/utils/pgstat_kind.h
+ * ----------
+ */
+#ifndef PGSTAT_KIND_H
+#define PGSTAT_KIND_H
+
+/* The types of statistics entries */
+#define PgStat_Kind uint32
+
+/* Range of IDs allowed, for built-in and custom kinds */
+#define PGSTAT_KIND_MIN	1		/* Minimum ID allowed */
+#define PGSTAT_KIND_MAX	256		/* Maximum ID allowed */
+
+/* use 0 for INVALID, to catch zero-initialized data */
+#define PGSTAT_KIND_INVALID 0
+
+/* stats for variable-numbered objects */
+#define PGSTAT_KIND_DATABASE	1	/* database-wide statistics */
+#define PGSTAT_KIND_RELATION	2	/* per-table statistics */
+#define PGSTAT_KIND_FUNCTION	3	/* per-function statistics */
+#define PGSTAT_KIND_REPLSLOT	4	/* per-slot statistics */
+#define PGSTAT_KIND_SUBSCRIPTION	5	/* per-subscription statistics */
+#define PGSTAT_KIND_BACKEND	6	/* per-backend statistics */
+
+/* stats for fixed-numbered objects */
+#define PGSTAT_KIND_ARCHIVER	7
+#define PGSTAT_KIND_BGWRITER	8
+#define PGSTAT_KIND_CHECKPOINTER	9
+#define PGSTAT_KIND_IO	10
+#define PGSTAT_KIND_SLRU	11
+#define PGSTAT_KIND_WAL	12
+
+#define PGSTAT_KIND_BUILTIN_MIN PGSTAT_KIND_DATABASE
+#define PGSTAT_KIND_BUILTIN_MAX PGSTAT_KIND_WAL
+#define PGSTAT_KIND_BUILTIN_SIZE (PGSTAT_KIND_BUILTIN_MAX + 1)
+
+/* Custom stats kinds */
+
+/* Range of IDs allowed for custom stats kinds */
+#define PGSTAT_KIND_CUSTOM_MIN	128
+#define PGSTAT_KIND_CUSTOM_MAX	PGSTAT_KIND_MAX
+#define PGSTAT_KIND_CUSTOM_SIZE	(PGSTAT_KIND_CUSTOM_MAX - PGSTAT_KIND_CUSTOM_MIN + 1)
+
+/*
+ * PgStat_Kind to use for extensions that require an ID, but are still in
+ * development and have not reserved their own unique kind ID yet. See:
+ * https://wiki.postgresql.org/wiki/CustomCumulativeStats
+ */
+#define PGSTAT_KIND_EXPERIMENTAL	128
+
+static inline bool
+pgstat_is_kind_builtin(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_BUILTIN_MIN && kind <= PGSTAT_KIND_BUILTIN_MAX;
+}
+
+static inline bool
+pgstat_is_kind_custom(PgStat_Kind kind)
+{
+	return kind >= PGSTAT_KIND_CUSTOM_MIN && kind <= PGSTAT_KIND_CUSTOM_MAX;
+}
+
+#endif							/* PGSTAT_KIND_H */
-- 
2.47.1

v3-0003-Add-RMGR-and-WAL-logging-API-for-pgstats.patchtext/x-diff; charset=us-asciiDownload
From a67dcb96c093c427e6b7fb1eeb906e2d05c3796a Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Jan 2025 13:26:46 +0900
Subject: [PATCH v3 3/4] Add RMGR and WAL-logging API for pgstats

This commit adds a new facility in the backend that offers to pgstats
kinds the possibility to WAL-log data that need to be persisted across
restarts:
- A new callback called redo_cb can be defined by a pgstats_kind.
- pgstat_xlog.c includes one API able to register some payload data and
its size.  Stats kinds doing WAL-logging require a definition of
redo_cb.
- pg_waldump is able to show the data logged, as hexadedimal data.

This facility has applications for in-core and custom stats kinds, with
one primary case being the possibility to WAL-log some statistics
related to relations so as these are not lost post-crash.  This is left
as future work.
---
 src/include/access/rmgrlist.h            |  1 +
 src/include/utils/pgstat_internal.h      |  8 +++
 src/include/utils/pgstat_xlog.h          | 41 ++++++++++++++
 src/backend/access/rmgrdesc/Makefile     |  1 +
 src/backend/access/rmgrdesc/meson.build  |  1 +
 src/backend/access/rmgrdesc/pgstatdesc.c | 49 ++++++++++++++++
 src/backend/access/transam/rmgr.c        |  1 +
 src/backend/utils/activity/Makefile      |  1 +
 src/backend/utils/activity/meson.build   |  1 +
 src/backend/utils/activity/pgstat_xlog.c | 72 ++++++++++++++++++++++++
 src/bin/pg_waldump/.gitignore            |  1 +
 src/bin/pg_waldump/rmgrdesc.c            |  1 +
 src/bin/pg_waldump/t/001_basic.pl        |  3 +-
 src/tools/pgindent/typedefs.list         |  1 +
 14 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100644 src/include/utils/pgstat_xlog.h
 create mode 100644 src/backend/access/rmgrdesc/pgstatdesc.c
 create mode 100644 src/backend/utils/activity/pgstat_xlog.c

diff --git a/src/include/access/rmgrlist.h b/src/include/access/rmgrlist.h
index 8e7fc9db87..409397cb21 100644
--- a/src/include/access/rmgrlist.h
+++ b/src/include/access/rmgrlist.h
@@ -47,3 +47,4 @@ PG_RMGR(RM_COMMIT_TS_ID, "CommitTs", commit_ts_redo, commit_ts_desc, commit_ts_i
 PG_RMGR(RM_REPLORIGIN_ID, "ReplicationOrigin", replorigin_redo, replorigin_desc, replorigin_identify, NULL, NULL, NULL, NULL)
 PG_RMGR(RM_GENERIC_ID, "Generic", generic_redo, generic_desc, generic_identify, NULL, NULL, generic_mask, NULL)
 PG_RMGR(RM_LOGICALMSG_ID, "LogicalMessage", logicalmsg_redo, logicalmsg_desc, logicalmsg_identify, NULL, NULL, NULL, logicalmsg_decode)
+PG_RMGR(RM_PGSTAT_ID, "PgStat", pgstat_redo, pgstat_desc, pgstat_identify, NULL, NULL, NULL, NULL)
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 4bb8e5c53a..7a87d4e2f6 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -258,6 +258,14 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*init_backend_cb) (void);
 
+	/*
+	 * Perform custom actions when replaying WAL related to a statistics kind.
+	 *
+	 * "data" is a pointer to the stats information that can be used by the
+	 * stats kind at redo.
+	 */
+	void		(*redo_cb) (void *data, size_t data_size);
+
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
 	 * data is used.  See flush_fixed_cb for fixed-numbered stats.
diff --git a/src/include/utils/pgstat_xlog.h b/src/include/utils/pgstat_xlog.h
new file mode 100644
index 0000000000..8e229b4f12
--- /dev/null
+++ b/src/include/utils/pgstat_xlog.h
@@ -0,0 +1,41 @@
+/* ----------
+ * pgstat_xlog.h
+ *	  Exports from utils/activity/pgstat_xlog.c
+ *
+ * Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ * src/include/utils/pgstat_xlog.h
+ * ----------
+ */
+#ifndef PGSTAT_XLOG_H
+#define PGSTAT_XLOG_H
+
+#include "access/xlog.h"
+#include "access/xlogdefs.h"
+#include "access/xlogreader.h"
+#include "utils/pgstat_kind.h"
+
+/*
+ * Generic WAL record for pgstat data.
+ */
+typedef struct xl_pgstat_data
+{
+	PgStat_Kind stats_kind;
+	size_t		data_size;		/* size of the data */
+	/* stats data, worth data_size */
+	char		data[FLEXIBLE_ARRAY_MEMBER];
+} xl_pgstat_data;
+
+#define SizeOfPgStatData    (offsetof(xl_pgstat_data, data))
+
+extern XLogRecPtr pgstat_xlog_data(PgStat_Kind stats_kind,
+								   const void *data,
+								   size_t data_size);
+
+/* RMGR API */
+#define XLOG_PGSTAT_DATA	0x00
+extern void pgstat_redo(XLogReaderState *record);
+extern void pgstat_desc(StringInfo buf, XLogReaderState *record);
+extern const char *pgstat_identify(uint8 info);
+
+#endif							/* PGSTAT_XLOG_H */
diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index cd95eec37f..3979d22946 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	logicalmsgdesc.o \
 	mxactdesc.o \
 	nbtdesc.o \
+	pgstatdesc.o \
 	relmapdesc.o \
 	replorigindesc.o \
 	rmgrdesc_utils.o \
diff --git a/src/backend/access/rmgrdesc/meson.build b/src/backend/access/rmgrdesc/meson.build
index 96c98e800c..91c53e634c 100644
--- a/src/backend/access/rmgrdesc/meson.build
+++ b/src/backend/access/rmgrdesc/meson.build
@@ -14,6 +14,7 @@ rmgr_desc_sources = files(
   'logicalmsgdesc.c',
   'mxactdesc.c',
   'nbtdesc.c',
+  'pgstatdesc.c',
   'relmapdesc.c',
   'replorigindesc.c',
   'rmgrdesc_utils.c',
diff --git a/src/backend/access/rmgrdesc/pgstatdesc.c b/src/backend/access/rmgrdesc/pgstatdesc.c
new file mode 100644
index 0000000000..233f9f1994
--- /dev/null
+++ b/src/backend/access/rmgrdesc/pgstatdesc.c
@@ -0,0 +1,49 @@
+/*-------------------------------------------------------------------------
+ *
+ * pgstatdesc.c
+ *	  rmgr descriptor routines for utils/activity/pgstat_xlog.c
+ *
+ * Portions Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/access/rmgrdesc/pgstatdesc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/pgstat_xlog.h"
+
+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+	if (info == XLOG_PGSTAT_DATA)
+	{
+		xl_pgstat_data *xlrec = (xl_pgstat_data *) rec;
+		char	   *sep = "";
+
+		appendStringInfo(buf, "stats kind \"%u\"; with data (%zu bytes): ",
+						 xlrec->stats_kind, xlrec->data_size);
+
+		/* Write stats data as a series of hex bytes */
+		for (int cnt = 0; cnt < xlrec->data_size; cnt++)
+		{
+			appendStringInfo(buf, "%s%02X", sep, (unsigned char) xlrec->data[cnt]);
+			sep = " ";
+		}
+	}
+}
+
+const char *
+pgstat_identify(uint8 info)
+{
+	if ((info & ~XLR_INFO_MASK) == XLOG_PGSTAT_DATA)
+		return "PGSTAT_DATA";
+
+	return NULL;
+}
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 1b7499726e..945843a78e 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -39,6 +39,7 @@
 #include "replication/message.h"
 #include "replication/origin.h"
 #include "storage/standby.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 /* IWYU pragma: end_keep */
 
diff --git a/src/backend/utils/activity/Makefile b/src/backend/utils/activity/Makefile
index 9c2443e1ec..fdbdd6bfe6 100644
--- a/src/backend/utils/activity/Makefile
+++ b/src/backend/utils/activity/Makefile
@@ -33,6 +33,7 @@ OBJS = \
 	pgstat_subscription.o \
 	pgstat_wal.o \
 	pgstat_xact.o \
+	pgstat_xlog.o \
 	wait_event.o \
 	wait_event_funcs.o
 
diff --git a/src/backend/utils/activity/meson.build b/src/backend/utils/activity/meson.build
index d8e56b49c2..533efef2cb 100644
--- a/src/backend/utils/activity/meson.build
+++ b/src/backend/utils/activity/meson.build
@@ -18,6 +18,7 @@ backend_sources += files(
   'pgstat_subscription.c',
   'pgstat_wal.c',
   'pgstat_xact.c',
+  'pgstat_xlog.c',
 )
 
 # this includes a .c file with contents generated in ../../../include/activity,
diff --git a/src/backend/utils/activity/pgstat_xlog.c b/src/backend/utils/activity/pgstat_xlog.c
new file mode 100644
index 0000000000..59a06cd9e1
--- /dev/null
+++ b/src/backend/utils/activity/pgstat_xlog.c
@@ -0,0 +1,72 @@
+/* ----------
+ * pgstat_xlog.c
+ *	   WAL replay logic for cumulative statistics.
+ *
+ *
+ * Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/activity/pgstat_xlog.c
+ * ----------
+ */
+
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xloginsert.h"
+#include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
+
+/*
+ * Write pgstats record data into WAL.
+ */
+XLogRecPtr
+pgstat_xlog_data(PgStat_Kind stats_kind, const void *data,
+				 size_t data_size)
+{
+	xl_pgstat_data xlrec;
+	XLogRecPtr	lsn;
+
+	xlrec.stats_kind = stats_kind;
+	xlrec.data_size = data_size;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, SizeOfPgStatData);
+	XLogRegisterData(data, data_size);
+
+	lsn = XLogInsert(RM_PGSTAT_ID, XLOG_PGSTAT_DATA);
+	return lsn;
+}
+
+/*
+ * Redo just passes down to the stats kind expected by this record
+ * data included in it.
+ */
+void
+pgstat_redo(XLogReaderState *record)
+{
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+	xl_pgstat_data *xlrec;
+	PgStat_Kind stats_kind;
+	const PgStat_KindInfo *kind_info;
+
+	if (info != XLOG_PGSTAT_DATA)
+		elog(PANIC, "pgstat_redo: unknown op code %u", info);
+
+	xlrec = (xl_pgstat_data *) XLogRecGetData(record);
+
+	stats_kind = xlrec->stats_kind;
+	kind_info = pgstat_get_kind_info(stats_kind);
+
+	if (kind_info == NULL)
+		elog(FATAL, "pgstat_redo: invalid stats kind found: kind=%u",
+			 stats_kind);
+
+	if (kind_info->redo_cb == NULL)
+		elog(FATAL, "pgstat_redo: no redo callback found: kind=%s",
+			 kind_info->name);
+
+	kind_info->redo_cb(xlrec->data, xlrec->data_size);
+}
diff --git a/src/bin/pg_waldump/.gitignore b/src/bin/pg_waldump/.gitignore
index ec51f41c76..e5089b322d 100644
--- a/src/bin/pg_waldump/.gitignore
+++ b/src/bin/pg_waldump/.gitignore
@@ -13,6 +13,7 @@
 /logicalmsgdesc.c
 /mxactdesc.c
 /nbtdesc.c
+/pgstatdesc.c
 /relmapdesc.c
 /replorigindesc.c
 /rmgrdesc_utils.c
diff --git a/src/bin/pg_waldump/rmgrdesc.c b/src/bin/pg_waldump/rmgrdesc.c
index 6b8c17bb4c..05b4f2acd1 100644
--- a/src/bin/pg_waldump/rmgrdesc.c
+++ b/src/bin/pg_waldump/rmgrdesc.c
@@ -30,6 +30,7 @@
 #include "replication/origin.h"
 #include "rmgrdesc.h"
 #include "storage/standbydefs.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 
 #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 8d574a410c..70256ecccb 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -73,7 +73,8 @@ BRIN
 CommitTs
 ReplicationOrigin
 Generic
-LogicalMessage$/,
+LogicalMessage
+PgStat$/,
 	'rmgr list');
 
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index eb93debe10..4db1e7cf4b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4135,6 +4135,7 @@ xl_multixact_create
 xl_multixact_truncate
 xl_overwrite_contrecord
 xl_parameter_change
+xl_pgstat_data
 xl_relmap_update
 xl_replorigin_drop
 xl_replorigin_set
-- 
2.47.1

v3-0004-injection_points-Add-option-and-tests-for-WAL-log.patchtext/x-diff; charset=us-asciiDownload
From 00f4bc416b214b5d2f7d53ed5eedbd1209f43953 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Jan 2025 13:36:28 +0900
Subject: [PATCH v3 4/4] injection_points: Add option and tests for WAL-logging

This serves as a template for WAL-logging related to pgstats, based on
the variable-numbered stats the module includes.  Some TAP tests are
added to check the backend facility.
---
 .../injection_points/injection_points.c       | 19 ++++++-
 .../injection_points/injection_stats.c        | 54 +++++++++++++++++++
 .../injection_points/injection_stats.h        |  3 +-
 .../modules/injection_points/t/001_stats.pl   | 22 ++++++++
 src/tools/pgindent/typedefs.list              |  1 +
 5 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index ad528d7775..c7f0d6d242 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -104,7 +104,7 @@ extern PGDLLEXPORT void injection_wait(const char *name,
 static bool injection_point_local = false;
 
 /*
- * GUC variable
+ * GUC variables
  *
  * This GUC is useful to control if statistics should be enabled or not
  * during a test with injection points, like for example if a test relies
@@ -112,6 +112,12 @@ static bool injection_point_local = false;
  */
 bool		inj_stats_enabled = false;
 
+/*
+ * This GUC controls if statistics of injection points should be logged
+ * into WAL or not.
+ */
+bool		inj_stats_wal_enabled = false;
+
 /* Shared memory init callbacks */
 static shmem_request_hook_type prev_shmem_request_hook = NULL;
 static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
@@ -534,6 +540,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("injection_points.log_stats",
+							 "Enables WAL-logging for injection points statistics.",
+							 NULL,
+							 &inj_stats_wal_enabled,
+							 false,
+							 PGC_POSTMASTER,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	MarkGUCPrefixReserved("injection_points");
 
 	/* Shared memory initialization */
diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c
index 5db62bca66..df52cb026e 100644
--- a/src/test/modules/injection_points/injection_stats.c
+++ b/src/test/modules/injection_points/injection_stats.c
@@ -21,6 +21,7 @@
 #include "pgstat.h"
 #include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
 
 /* Structures for statistics of injection points */
 typedef struct PgStat_StatInjEntry
@@ -34,6 +35,14 @@ typedef struct PgStatShared_InjectionPoint
 	PgStat_StatInjEntry stats;
 } PgStatShared_InjectionPoint;
 
+/* Structure for data of injection points logged in WAL */
+typedef struct PgStat_StatInjRecord
+{
+	uint64		objid;			/* hash of the point name */
+	PgStat_StatInjEntry entry;	/* stats data */
+} PgStat_StatInjRecord;
+
+static void injection_stats_redo_cb(void *data, size_t data_size);
 static bool injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
 
 static const PgStat_KindInfo injection_stats = {
@@ -48,6 +57,7 @@ static const PgStat_KindInfo injection_stats = {
 	.shared_data_off = offsetof(PgStatShared_InjectionPoint, stats),
 	.shared_data_len = sizeof(((PgStatShared_InjectionPoint *) 0)->stats),
 	.pending_size = sizeof(PgStat_StatInjEntry),
+	.redo_cb = injection_stats_redo_cb,
 	.flush_pending_cb = injection_stats_flush_cb,
 };
 
@@ -64,6 +74,31 @@ static const PgStat_KindInfo injection_stats = {
 /* Track if stats are loaded */
 static bool inj_stats_loaded = false;
 
+/*
+ * REDO callback for injection point stats.
+ */
+static void
+injection_stats_redo_cb(void *data, size_t data_size)
+{
+	PgStat_StatInjRecord *record_data = (PgStat_StatInjRecord *) data;
+	PgStat_StatInjEntry record_entry = record_data->entry;
+	PgStat_StatInjEntry *stat_entry;
+	PgStatShared_InjectionPoint *shstatent;
+	PgStat_EntryRef *entry_ref;
+
+	Assert(data_size == sizeof(PgStat_StatInjRecord));
+
+	/* create or fetch existing entry */
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  record_data->objid, NULL);
+
+	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
+	stat_entry = &shstatent->stats;
+
+	/* Update the injection point statistics, overwriting any existing data */
+	*stat_entry = record_entry;
+}
+
 /*
  * Callback for stats handling
  */
@@ -72,6 +107,9 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 {
 	PgStat_StatInjEntry *localent;
 	PgStatShared_InjectionPoint *shfuncent;
+	PgStat_StatInjRecord record_data;
+
+	memset(&record_data, 0, sizeof(PgStat_StatInjRecord));
 
 	localent = (PgStat_StatInjEntry *) entry_ref->pending;
 	shfuncent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
@@ -81,8 +119,24 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	shfuncent->stats.numcalls += localent->numcalls;
 
+	record_data.objid = entry_ref->shared_entry->key.objid;
+	record_data.entry = shfuncent->stats;
+
 	pgstat_unlock_entry(entry_ref);
 
+	/*
+	 * Store the data in WAL, if not in recovery and if the option is enabled.
+	 */
+	if (!RecoveryInProgress() && inj_stats_wal_enabled)
+	{
+		XLogRecPtr	lsn;
+
+		lsn = pgstat_xlog_data(PGSTAT_KIND_INJECTION, &record_data,
+							   sizeof(PgStat_StatInjRecord));
+
+		/* Force a flush, to ensure persistency */
+		XLogFlush(lsn);
+	}
 	return true;
 }
 
diff --git a/src/test/modules/injection_points/injection_stats.h b/src/test/modules/injection_points/injection_stats.h
index ba310c52c7..baa8bb506c 100644
--- a/src/test/modules/injection_points/injection_stats.h
+++ b/src/test/modules/injection_points/injection_stats.h
@@ -15,8 +15,9 @@
 #ifndef INJECTION_STATS
 #define INJECTION_STATS
 
-/* GUC variable */
+/* GUC variables */
 extern bool inj_stats_enabled;
+extern bool inj_stats_wal_enabled;
 
 /* injection_stats.c */
 extern void pgstat_register_inj(void);
diff --git a/src/test/modules/injection_points/t/001_stats.pl b/src/test/modules/injection_points/t/001_stats.pl
index d4539fe872..4b34dd0a5a 100644
--- a/src/test/modules/injection_points/t/001_stats.pl
+++ b/src/test/modules/injection_points/t/001_stats.pl
@@ -75,4 +75,26 @@ $node->stop;
 $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', "''");
 $node->start;
 
+# Stop the server and enable WAL, the stats are preserved after recovery.
+$node->stop;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+injection_points.log_stats = true
+));
+$node->start;
+
+# Two calls, both WAL-logged.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('stats-wal-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->stop('immediate');
+$node->start;
+$numcalls = $node->safe_psql('postgres',
+	"SELECT injection_points_stats_numcalls('stats-wal-notice');");
+is($numcalls, '2', 'number of stats after crash with WAL-logging enabled');
+
 done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4db1e7cf4b..303a06b3fb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2167,6 +2167,7 @@ PgStat_StatDBEntry
 PgStat_StatFuncEntry
 PgStat_StatInjEntry
 PgStat_StatInjFixedEntry
+PgStat_StatInjRecord
 PgStat_StatReplSlotEntry
 PgStat_StatSubEntry
 PgStat_StatTabEntry
-- 
2.47.1

#7Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#6)
2 attachment(s)
Re: WAL-logging facility for pgstats kinds

On Fri, Jan 10, 2025 at 01:46:53PM +0900, Michael Paquier wrote:

I'd rather use RecoveryInProgress() here even if XLogInsertAllowed()
is a synonym of that, minus the update of LocalXLogInsertAllowed for
the local process.

I've applied v2-0002 for the new header as it is useful on its own.
Rebased to avoid the wrath of the CF bot, as v3.
--
Michael

Attachments:

v3-0003-Add-RMGR-and-WAL-logging-API-for-pgstats.patchtext/x-diff; charset=us-asciiDownload
From 7d4aa6fc0e348065b46c31cbb21d7a4d2f21e5b2 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Jan 2025 13:26:46 +0900
Subject: [PATCH v3 3/4] Add RMGR and WAL-logging API for pgstats

This commit adds a new facility in the backend that offers to pgstats
kinds the possibility to WAL-log data that need to be persisted across
restarts:
- A new callback called redo_cb can be defined by a pgstats_kind.
- pgstat_xlog.c includes one API able to register some payload data and
its size.  Stats kinds doing WAL-logging require a definition of
redo_cb.
- pg_waldump is able to show the data logged, as hexadedimal data.

This facility has applications for in-core and custom stats kinds, with
one primary case being the possibility to WAL-log some statistics
related to relations so as these are not lost post-crash.  This is left
as future work.
---
 src/include/access/rmgrlist.h            |  1 +
 src/include/utils/pgstat_internal.h      |  8 +++
 src/include/utils/pgstat_xlog.h          | 41 ++++++++++++++
 src/backend/access/rmgrdesc/Makefile     |  1 +
 src/backend/access/rmgrdesc/meson.build  |  1 +
 src/backend/access/rmgrdesc/pgstatdesc.c | 49 ++++++++++++++++
 src/backend/access/transam/rmgr.c        |  1 +
 src/backend/utils/activity/Makefile      |  1 +
 src/backend/utils/activity/meson.build   |  1 +
 src/backend/utils/activity/pgstat_xlog.c | 72 ++++++++++++++++++++++++
 src/bin/pg_waldump/.gitignore            |  1 +
 src/bin/pg_waldump/rmgrdesc.c            |  1 +
 src/bin/pg_waldump/t/001_basic.pl        |  3 +-
 src/tools/pgindent/typedefs.list         |  1 +
 14 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100644 src/include/utils/pgstat_xlog.h
 create mode 100644 src/backend/access/rmgrdesc/pgstatdesc.c
 create mode 100644 src/backend/utils/activity/pgstat_xlog.c

diff --git a/src/include/access/rmgrlist.h b/src/include/access/rmgrlist.h
index 8e7fc9db87..409397cb21 100644
--- a/src/include/access/rmgrlist.h
+++ b/src/include/access/rmgrlist.h
@@ -47,3 +47,4 @@ PG_RMGR(RM_COMMIT_TS_ID, "CommitTs", commit_ts_redo, commit_ts_desc, commit_ts_i
 PG_RMGR(RM_REPLORIGIN_ID, "ReplicationOrigin", replorigin_redo, replorigin_desc, replorigin_identify, NULL, NULL, NULL, NULL)
 PG_RMGR(RM_GENERIC_ID, "Generic", generic_redo, generic_desc, generic_identify, NULL, NULL, generic_mask, NULL)
 PG_RMGR(RM_LOGICALMSG_ID, "LogicalMessage", logicalmsg_redo, logicalmsg_desc, logicalmsg_identify, NULL, NULL, NULL, logicalmsg_decode)
+PG_RMGR(RM_PGSTAT_ID, "PgStat", pgstat_redo, pgstat_desc, pgstat_identify, NULL, NULL, NULL, NULL)
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 4bb8e5c53a..7a87d4e2f6 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -258,6 +258,14 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*init_backend_cb) (void);
 
+	/*
+	 * Perform custom actions when replaying WAL related to a statistics kind.
+	 *
+	 * "data" is a pointer to the stats information that can be used by the
+	 * stats kind at redo.
+	 */
+	void		(*redo_cb) (void *data, size_t data_size);
+
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
 	 * data is used.  See flush_fixed_cb for fixed-numbered stats.
diff --git a/src/include/utils/pgstat_xlog.h b/src/include/utils/pgstat_xlog.h
new file mode 100644
index 0000000000..8e229b4f12
--- /dev/null
+++ b/src/include/utils/pgstat_xlog.h
@@ -0,0 +1,41 @@
+/* ----------
+ * pgstat_xlog.h
+ *	  Exports from utils/activity/pgstat_xlog.c
+ *
+ * Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ * src/include/utils/pgstat_xlog.h
+ * ----------
+ */
+#ifndef PGSTAT_XLOG_H
+#define PGSTAT_XLOG_H
+
+#include "access/xlog.h"
+#include "access/xlogdefs.h"
+#include "access/xlogreader.h"
+#include "utils/pgstat_kind.h"
+
+/*
+ * Generic WAL record for pgstat data.
+ */
+typedef struct xl_pgstat_data
+{
+	PgStat_Kind stats_kind;
+	size_t		data_size;		/* size of the data */
+	/* stats data, worth data_size */
+	char		data[FLEXIBLE_ARRAY_MEMBER];
+} xl_pgstat_data;
+
+#define SizeOfPgStatData    (offsetof(xl_pgstat_data, data))
+
+extern XLogRecPtr pgstat_xlog_data(PgStat_Kind stats_kind,
+								   const void *data,
+								   size_t data_size);
+
+/* RMGR API */
+#define XLOG_PGSTAT_DATA	0x00
+extern void pgstat_redo(XLogReaderState *record);
+extern void pgstat_desc(StringInfo buf, XLogReaderState *record);
+extern const char *pgstat_identify(uint8 info);
+
+#endif							/* PGSTAT_XLOG_H */
diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index cd95eec37f..3979d22946 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	logicalmsgdesc.o \
 	mxactdesc.o \
 	nbtdesc.o \
+	pgstatdesc.o \
 	relmapdesc.o \
 	replorigindesc.o \
 	rmgrdesc_utils.o \
diff --git a/src/backend/access/rmgrdesc/meson.build b/src/backend/access/rmgrdesc/meson.build
index 96c98e800c..91c53e634c 100644
--- a/src/backend/access/rmgrdesc/meson.build
+++ b/src/backend/access/rmgrdesc/meson.build
@@ -14,6 +14,7 @@ rmgr_desc_sources = files(
   'logicalmsgdesc.c',
   'mxactdesc.c',
   'nbtdesc.c',
+  'pgstatdesc.c',
   'relmapdesc.c',
   'replorigindesc.c',
   'rmgrdesc_utils.c',
diff --git a/src/backend/access/rmgrdesc/pgstatdesc.c b/src/backend/access/rmgrdesc/pgstatdesc.c
new file mode 100644
index 0000000000..233f9f1994
--- /dev/null
+++ b/src/backend/access/rmgrdesc/pgstatdesc.c
@@ -0,0 +1,49 @@
+/*-------------------------------------------------------------------------
+ *
+ * pgstatdesc.c
+ *	  rmgr descriptor routines for utils/activity/pgstat_xlog.c
+ *
+ * Portions Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/access/rmgrdesc/pgstatdesc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/pgstat_xlog.h"
+
+void
+pgstat_desc(StringInfo buf, XLogReaderState *record)
+{
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+	if (info == XLOG_PGSTAT_DATA)
+	{
+		xl_pgstat_data *xlrec = (xl_pgstat_data *) rec;
+		char	   *sep = "";
+
+		appendStringInfo(buf, "stats kind \"%u\"; with data (%zu bytes): ",
+						 xlrec->stats_kind, xlrec->data_size);
+
+		/* Write stats data as a series of hex bytes */
+		for (int cnt = 0; cnt < xlrec->data_size; cnt++)
+		{
+			appendStringInfo(buf, "%s%02X", sep, (unsigned char) xlrec->data[cnt]);
+			sep = " ";
+		}
+	}
+}
+
+const char *
+pgstat_identify(uint8 info)
+{
+	if ((info & ~XLR_INFO_MASK) == XLOG_PGSTAT_DATA)
+		return "PGSTAT_DATA";
+
+	return NULL;
+}
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 1b7499726e..945843a78e 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -39,6 +39,7 @@
 #include "replication/message.h"
 #include "replication/origin.h"
 #include "storage/standby.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 /* IWYU pragma: end_keep */
 
diff --git a/src/backend/utils/activity/Makefile b/src/backend/utils/activity/Makefile
index 9c2443e1ec..fdbdd6bfe6 100644
--- a/src/backend/utils/activity/Makefile
+++ b/src/backend/utils/activity/Makefile
@@ -33,6 +33,7 @@ OBJS = \
 	pgstat_subscription.o \
 	pgstat_wal.o \
 	pgstat_xact.o \
+	pgstat_xlog.o \
 	wait_event.o \
 	wait_event_funcs.o
 
diff --git a/src/backend/utils/activity/meson.build b/src/backend/utils/activity/meson.build
index d8e56b49c2..533efef2cb 100644
--- a/src/backend/utils/activity/meson.build
+++ b/src/backend/utils/activity/meson.build
@@ -18,6 +18,7 @@ backend_sources += files(
   'pgstat_subscription.c',
   'pgstat_wal.c',
   'pgstat_xact.c',
+  'pgstat_xlog.c',
 )
 
 # this includes a .c file with contents generated in ../../../include/activity,
diff --git a/src/backend/utils/activity/pgstat_xlog.c b/src/backend/utils/activity/pgstat_xlog.c
new file mode 100644
index 0000000000..59a06cd9e1
--- /dev/null
+++ b/src/backend/utils/activity/pgstat_xlog.c
@@ -0,0 +1,72 @@
+/* ----------
+ * pgstat_xlog.c
+ *	   WAL replay logic for cumulative statistics.
+ *
+ *
+ * Copyright (c) 2001-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/activity/pgstat_xlog.c
+ * ----------
+ */
+
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xloginsert.h"
+#include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
+
+/*
+ * Write pgstats record data into WAL.
+ */
+XLogRecPtr
+pgstat_xlog_data(PgStat_Kind stats_kind, const void *data,
+				 size_t data_size)
+{
+	xl_pgstat_data xlrec;
+	XLogRecPtr	lsn;
+
+	xlrec.stats_kind = stats_kind;
+	xlrec.data_size = data_size;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, SizeOfPgStatData);
+	XLogRegisterData(data, data_size);
+
+	lsn = XLogInsert(RM_PGSTAT_ID, XLOG_PGSTAT_DATA);
+	return lsn;
+}
+
+/*
+ * Redo just passes down to the stats kind expected by this record
+ * data included in it.
+ */
+void
+pgstat_redo(XLogReaderState *record)
+{
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+	xl_pgstat_data *xlrec;
+	PgStat_Kind stats_kind;
+	const PgStat_KindInfo *kind_info;
+
+	if (info != XLOG_PGSTAT_DATA)
+		elog(PANIC, "pgstat_redo: unknown op code %u", info);
+
+	xlrec = (xl_pgstat_data *) XLogRecGetData(record);
+
+	stats_kind = xlrec->stats_kind;
+	kind_info = pgstat_get_kind_info(stats_kind);
+
+	if (kind_info == NULL)
+		elog(FATAL, "pgstat_redo: invalid stats kind found: kind=%u",
+			 stats_kind);
+
+	if (kind_info->redo_cb == NULL)
+		elog(FATAL, "pgstat_redo: no redo callback found: kind=%s",
+			 kind_info->name);
+
+	kind_info->redo_cb(xlrec->data, xlrec->data_size);
+}
diff --git a/src/bin/pg_waldump/.gitignore b/src/bin/pg_waldump/.gitignore
index ec51f41c76..e5089b322d 100644
--- a/src/bin/pg_waldump/.gitignore
+++ b/src/bin/pg_waldump/.gitignore
@@ -13,6 +13,7 @@
 /logicalmsgdesc.c
 /mxactdesc.c
 /nbtdesc.c
+/pgstatdesc.c
 /relmapdesc.c
 /replorigindesc.c
 /rmgrdesc_utils.c
diff --git a/src/bin/pg_waldump/rmgrdesc.c b/src/bin/pg_waldump/rmgrdesc.c
index 6b8c17bb4c..05b4f2acd1 100644
--- a/src/bin/pg_waldump/rmgrdesc.c
+++ b/src/bin/pg_waldump/rmgrdesc.c
@@ -30,6 +30,7 @@
 #include "replication/origin.h"
 #include "rmgrdesc.h"
 #include "storage/standbydefs.h"
+#include "utils/pgstat_xlog.h"
 #include "utils/relmapper.h"
 
 #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 8d574a410c..70256ecccb 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -73,7 +73,8 @@ BRIN
 CommitTs
 ReplicationOrigin
 Generic
-LogicalMessage$/,
+LogicalMessage
+PgStat$/,
 	'rmgr list');
 
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index eb93debe10..4db1e7cf4b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4135,6 +4135,7 @@ xl_multixact_create
 xl_multixact_truncate
 xl_overwrite_contrecord
 xl_parameter_change
+xl_pgstat_data
 xl_relmap_update
 xl_replorigin_drop
 xl_replorigin_set
-- 
2.47.1

v3-0004-injection_points-Add-option-and-tests-for-WAL-log.patchtext/x-diff; charset=us-asciiDownload
From 397540085a0f176806aa36d0c8c85f8a6dbbf938 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Jan 2025 13:36:28 +0900
Subject: [PATCH v3 4/4] injection_points: Add option and tests for WAL-logging

This serves as a template for WAL-logging related to pgstats, based on
the variable-numbered stats the module includes.  Some TAP tests are
added to check the backend facility.
---
 .../injection_points/injection_points.c       | 19 ++++++-
 .../injection_points/injection_stats.c        | 54 +++++++++++++++++++
 .../injection_points/injection_stats.h        |  3 +-
 .../modules/injection_points/t/001_stats.pl   | 22 ++++++++
 src/tools/pgindent/typedefs.list              |  1 +
 5 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index ad528d7775..c7f0d6d242 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -104,7 +104,7 @@ extern PGDLLEXPORT void injection_wait(const char *name,
 static bool injection_point_local = false;
 
 /*
- * GUC variable
+ * GUC variables
  *
  * This GUC is useful to control if statistics should be enabled or not
  * during a test with injection points, like for example if a test relies
@@ -112,6 +112,12 @@ static bool injection_point_local = false;
  */
 bool		inj_stats_enabled = false;
 
+/*
+ * This GUC controls if statistics of injection points should be logged
+ * into WAL or not.
+ */
+bool		inj_stats_wal_enabled = false;
+
 /* Shared memory init callbacks */
 static shmem_request_hook_type prev_shmem_request_hook = NULL;
 static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
@@ -534,6 +540,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("injection_points.log_stats",
+							 "Enables WAL-logging for injection points statistics.",
+							 NULL,
+							 &inj_stats_wal_enabled,
+							 false,
+							 PGC_POSTMASTER,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	MarkGUCPrefixReserved("injection_points");
 
 	/* Shared memory initialization */
diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c
index 5db62bca66..df52cb026e 100644
--- a/src/test/modules/injection_points/injection_stats.c
+++ b/src/test/modules/injection_points/injection_stats.c
@@ -21,6 +21,7 @@
 #include "pgstat.h"
 #include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
+#include "utils/pgstat_xlog.h"
 
 /* Structures for statistics of injection points */
 typedef struct PgStat_StatInjEntry
@@ -34,6 +35,14 @@ typedef struct PgStatShared_InjectionPoint
 	PgStat_StatInjEntry stats;
 } PgStatShared_InjectionPoint;
 
+/* Structure for data of injection points logged in WAL */
+typedef struct PgStat_StatInjRecord
+{
+	uint64		objid;			/* hash of the point name */
+	PgStat_StatInjEntry entry;	/* stats data */
+} PgStat_StatInjRecord;
+
+static void injection_stats_redo_cb(void *data, size_t data_size);
 static bool injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
 
 static const PgStat_KindInfo injection_stats = {
@@ -48,6 +57,7 @@ static const PgStat_KindInfo injection_stats = {
 	.shared_data_off = offsetof(PgStatShared_InjectionPoint, stats),
 	.shared_data_len = sizeof(((PgStatShared_InjectionPoint *) 0)->stats),
 	.pending_size = sizeof(PgStat_StatInjEntry),
+	.redo_cb = injection_stats_redo_cb,
 	.flush_pending_cb = injection_stats_flush_cb,
 };
 
@@ -64,6 +74,31 @@ static const PgStat_KindInfo injection_stats = {
 /* Track if stats are loaded */
 static bool inj_stats_loaded = false;
 
+/*
+ * REDO callback for injection point stats.
+ */
+static void
+injection_stats_redo_cb(void *data, size_t data_size)
+{
+	PgStat_StatInjRecord *record_data = (PgStat_StatInjRecord *) data;
+	PgStat_StatInjEntry record_entry = record_data->entry;
+	PgStat_StatInjEntry *stat_entry;
+	PgStatShared_InjectionPoint *shstatent;
+	PgStat_EntryRef *entry_ref;
+
+	Assert(data_size == sizeof(PgStat_StatInjRecord));
+
+	/* create or fetch existing entry */
+	entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_INJECTION, InvalidOid,
+										  record_data->objid, NULL);
+
+	shstatent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
+	stat_entry = &shstatent->stats;
+
+	/* Update the injection point statistics, overwriting any existing data */
+	*stat_entry = record_entry;
+}
+
 /*
  * Callback for stats handling
  */
@@ -72,6 +107,9 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 {
 	PgStat_StatInjEntry *localent;
 	PgStatShared_InjectionPoint *shfuncent;
+	PgStat_StatInjRecord record_data;
+
+	memset(&record_data, 0, sizeof(PgStat_StatInjRecord));
 
 	localent = (PgStat_StatInjEntry *) entry_ref->pending;
 	shfuncent = (PgStatShared_InjectionPoint *) entry_ref->shared_stats;
@@ -81,8 +119,24 @@ injection_stats_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	shfuncent->stats.numcalls += localent->numcalls;
 
+	record_data.objid = entry_ref->shared_entry->key.objid;
+	record_data.entry = shfuncent->stats;
+
 	pgstat_unlock_entry(entry_ref);
 
+	/*
+	 * Store the data in WAL, if not in recovery and if the option is enabled.
+	 */
+	if (!RecoveryInProgress() && inj_stats_wal_enabled)
+	{
+		XLogRecPtr	lsn;
+
+		lsn = pgstat_xlog_data(PGSTAT_KIND_INJECTION, &record_data,
+							   sizeof(PgStat_StatInjRecord));
+
+		/* Force a flush, to ensure persistency */
+		XLogFlush(lsn);
+	}
 	return true;
 }
 
diff --git a/src/test/modules/injection_points/injection_stats.h b/src/test/modules/injection_points/injection_stats.h
index ba310c52c7..baa8bb506c 100644
--- a/src/test/modules/injection_points/injection_stats.h
+++ b/src/test/modules/injection_points/injection_stats.h
@@ -15,8 +15,9 @@
 #ifndef INJECTION_STATS
 #define INJECTION_STATS
 
-/* GUC variable */
+/* GUC variables */
 extern bool inj_stats_enabled;
+extern bool inj_stats_wal_enabled;
 
 /* injection_stats.c */
 extern void pgstat_register_inj(void);
diff --git a/src/test/modules/injection_points/t/001_stats.pl b/src/test/modules/injection_points/t/001_stats.pl
index d4539fe872..4b34dd0a5a 100644
--- a/src/test/modules/injection_points/t/001_stats.pl
+++ b/src/test/modules/injection_points/t/001_stats.pl
@@ -75,4 +75,26 @@ $node->stop;
 $node->adjust_conf('postgresql.conf', 'shared_preload_libraries', "''");
 $node->start;
 
+# Stop the server and enable WAL, the stats are preserved after recovery.
+$node->stop;
+$node->append_conf(
+	'postgresql.conf', qq(
+shared_preload_libraries = 'injection_points'
+injection_points.log_stats = true
+));
+$node->start;
+
+# Two calls, both WAL-logged.
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('stats-wal-notice', 'notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->safe_psql('postgres',
+	"SELECT injection_points_run('stats-wal-notice');");
+$node->stop('immediate');
+$node->start;
+$numcalls = $node->safe_psql('postgres',
+	"SELECT injection_points_stats_numcalls('stats-wal-notice');");
+is($numcalls, '2', 'number of stats after crash with WAL-logging enabled');
+
 done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4db1e7cf4b..303a06b3fb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2167,6 +2167,7 @@ PgStat_StatDBEntry
 PgStat_StatFuncEntry
 PgStat_StatInjEntry
 PgStat_StatInjFixedEntry
+PgStat_StatInjRecord
 PgStat_StatReplSlotEntry
 PgStat_StatSubEntry
 PgStat_StatTabEntry
-- 
2.47.1

#8Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#7)
Re: WAL-logging facility for pgstats kinds

Hi,

On 2025-01-14 12:54:36 +0900, Michael Paquier wrote:

On Fri, Jan 10, 2025 at 01:46:53PM +0900, Michael Paquier wrote:

I'd rather use RecoveryInProgress() here even if XLogInsertAllowed()
is a synonym of that, minus the update of LocalXLogInsertAllowed for
the local process.

I've applied v2-0002 for the new header as it is useful on its own.
Rebased to avoid the wrath of the CF bot, as v3.

Because I saw this being moved to the new CF: I continue to *strenuously*
object to this design. As outlined upthread, I think it's going into the
completely wrong direction.

- Andres

#9Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#8)
Re: WAL-logging facility for pgstats kinds

On Mon, Feb 10, 2025 at 11:43:30AM -0500, Andres Freund wrote:

Because I saw this being moved to the new CF: I continue to *strenuously*
object to this design. As outlined upthread, I think it's going into the
completely wrong direction.

Right. FWIW, I'm not sure that we can absolutely just discard a
possiblity like what this patch is doing, but I see your point that it
may not fit into the final picture, depending on what we finish with.
I'll go discard that for now, keeping it aside in case.
--
Michael

#10Cédric Villemain
Cedric.Villemain@Data-Bene.io
In reply to: Michael Paquier (#9)
Re: WAL-logging facility for pgstats kinds

On 12/02/2025 01:50, Michael Paquier wrote:

On Mon, Feb 10, 2025 at 11:43:30AM -0500, Andres Freund wrote:

Because I saw this being moved to the new CF: I continue to *strenuously*
object to this design. As outlined upthread, I think it's going into the
completely wrong direction.

Right. FWIW, I'm not sure that we can absolutely just discard a
possiblity like what this patch is doing, but I see your point that it
may not fit into the final picture, depending on what we finish with.
I'll go discard that for now, keeping it aside in case.

I am developing (WIP) an extension to add a facility similar to
pg_replication_slots, but for statistics or metrics. On one side, it
would allow the registration of external "stats plugins," and on the
other side, it would enable the consumption of whatever these plugins
return. I believe a similar idea may have been proposed in the
past-perhaps involving opening a dedicated port for accessing
stats/metrics-but I haven't yet searched for related public talks or
archives.

This approach makes it easy to envision a background worker on a replica
that connects to a stats slot and processes the received stats/metrics
as needed.

Additionally, there is significant potential for external applications
to leverage the PostgreSQL stats system to collect metrics instead of
relying on relational tables, promising a highly efficient solution. In
such cases, providing a mechanism to replicate this information could be
very beneficial for end-users.

Though I also support Andres' position to avoid abusing WAL in this
context, I am slightly more flexible: this exception would apply only if
there is a clear separation between stats essential for PostgreSQL (or
its extensions) to function correctly during or after a switchover, and
those (stats or) metrics that are irrelevant to PostgreSQL itself.

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D

#11Michael Paquier
michael@paquier.xyz
In reply to: Cédric Villemain (#10)
Re: WAL-logging facility for pgstats kinds

On Sat, Feb 15, 2025 at 10:31:33AM +0100, Cédric Villemain wrote:

Additionally, there is significant potential for external applications to
leverage the PostgreSQL stats system to collect metrics instead of relying
on relational tables, promising a highly efficient solution. In such cases,
providing a mechanism to replicate this information could be very beneficial
for end-users.

Hmm. You mean with the extra addition of a hook around
pgstat[_shmem].c so as it would be possible to take some custom action
when pushing the stats to the pgstats dshash or for the fixed-numbered
stats?

Though I also support Andres' position to avoid abusing WAL in this context,
I am slightly more flexible: this exception would apply only if there is a
clear separation between stats essential for PostgreSQL (or its extensions)
to function correctly during or after a switchover, and those (stats or)
metrics that are irrelevant to PostgreSQL itself.

All these parts are doable even today outside of core with a custom
RMGR, so it's not like there are no solutions in place. The
introduction of a pgstats callback that can attached to a stats kind
is able to enforce a stricter rule in the backend, at least, without
the need to consume a bunch of RMGR IDs for that. (It is possible to
have one RMGR used across many extensions, clumsy but doable.)
--
Michael