[PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer

Started by BharatDBabout 2 months ago2 messages
#1BharatDB
bharatdbpg@gmail.com
4 attachment(s)

Hi all,

While debugging checkpointer write behavior (related to the write-combining
discussions)[1]/messages/by-id/CAAKRu_ZBzTp-o4pu1UwmpLWkFmAmP7dyGFo867HxMo-AF+=MDw@mail.gmail.com, I recently found some of the enhancements related to
extending PostgreSQL's internal statistics . By including contextual
information to checkpoint completion logs, when `log_checkpoints` is
enabled we can now know the checkpoint type (manual, timed and immediate),
checkpoint duration and checkpoint completion timestamp through SQL for
better monitoring and analysis. I think it would be helpful to understand
more clearly about the current state of a bug and for further review on
future bugs. I had discussed the same with one of the contributors of
PostgreSQL and as per the suggestion, I am submitting a small independent
patch that adds two additional fields to the pg_stat_checkpointer log.

1. Patch features: I introduced these two fields internally as newly added
statistics and was able to surface them through SQL. The added statistics
were:
a. last_checkpoint_time : Exposes the completion timestamp of the most
recent checkpoint and is useful for monitoring checkpoint frequency and
diagnosing irregular checkpoint behavior.

b. checkpoint_total_time: Reports the total time spent in the last
completed checkpoint and helps administrators detect unexpectedly long or
stalled checkpoints.

I am attaching my observations, screenshots and patch in support for this.
Observations:
1. Terminal output for type of checkpoint occurred:

soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.

postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -i -s 50 -p 55432 -d
postgres
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...

creating primary keys...
done in 7.49 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 5.52 s, vacuum 0.20 s, primary keys 1.77 s).
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -c 8 -j 8 -T 120 -p
55432 -d postgres
pgbench (19devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 120 s
number of transactions actually processed: 228099
number of failed transactions: 0 (0.000%)
latency average = 4.209 ms
initial connection time = 7.762 ms
tps = 1900.858503 (without initial connection time)
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.

postgres=# CHECKPOINT;
CHECKPOINT
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint
complete" "$PGDATA/logfile" | tail -1
[sudo] password for soumya:
2025-11-20 11:51:06.128 IST [18026] LOG: checkpoint complete (immediate):
wrote 7286 buffers (44.5%), wrote 4 SLRU buffers; 0 WAL file(s) added, 0
removed, 27 recycled; write=0.095 s, sync=0.034 s, total=0.279 s; sync
files=17, longest=0.004 s, average=0.002 s; distance=447382 kB,
estimate=531349 kB; lsn=0/7F4EDED8, redo lsn=0/7F4EDE80
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$

2. Terminal output for type of checkpoint duration and checkpoint
completion timestamp:

soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.

postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM
pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175119 | 2025-11-20 11:51:06.128918+05:30
(1 row)

postgres=# CHECKPOINT;
CHECKPOINT
postgres=# SELECT pg_sleep(1);
pg_sleep
----------

(1 row)

postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM
pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175138 | 2025-11-20 11:58:02.879149+05:30
(1 row)

postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint
complete" "$PGDATA/logfile" | tail -1
2025-11-20 11:58:02.879 IST [18026] LOG: checkpoint complete (immediate):
wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0
removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.019 s; sync
files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=478214
kB; lsn=0/7F4EDFE0, redo lsn=0/7F4EDF88
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$

I hope this will be helpful for proceeding further. Looking forward to
more feedback.

Thanking you.

Regards,
Soumya
BharatDB

References:
[1]: /messages/by-id/CAAKRu_ZBzTp-o4pu1UwmpLWkFmAmP7dyGFo867HxMo-AF+=MDw@mail.gmail.com
/messages/by-id/CAAKRu_ZBzTp-o4pu1UwmpLWkFmAmP7dyGFo867HxMo-AF+=MDw@mail.gmail.com

Attachments:

Screenshot from 2025-11-20 11-56-31.pngimage/png; name="Screenshot from 2025-11-20 11-56-31.png"Download
Screenshot from 2025-11-20 12-00-05.pngimage/png; name="Screenshot from 2025-11-20 12-00-05.png"Download
0001-Added-enhancement-related-to-checkpoint-reason-durat.patchtext/x-patch; charset=US-ASCII; name=0001-Added-enhancement-related-to-checkpoint-reason-durat.patchDownload
From 465ed2178a5437f31497ca5f5bb4a38ed0451b7a Mon Sep 17 00:00:00 2001
From: BharatDB <bharatdbpg@gmail.com>
Date: Mon, 10 Nov 2025 11:15:24 +0530
Subject: [PATCH] Added enhancement related to checkpoint reason / duration
 info in logs and stats

Signed-off-by: BharatDB <bharatdbpg@gmail.com>
---
 src/backend/access/transam/xlog.c             | 11 +++++-
 src/backend/catalog/system_views.sql          |  4 ++-
 .../utils/activity/pgstat_checkpointer.c      | 34 +++++++++++++++++++
 src/backend/utils/adt/pgstatfuncs.c           | 21 ++++++++++++
 src/include/catalog/pg_proc.dat               | 11 ++++++
 src/include/pgstat.h                          |  5 ++-
 6 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ec992d2139..9217508917 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6761,6 +6761,7 @@ LogCheckpointEnd(bool restartpoint, int flags)
 	sync_msecs = TimestampDifferenceMilliseconds(CheckpointStats.ckpt_sync_t,
 												 CheckpointStats.ckpt_sync_end_t);
 
+
 	/* Accumulate checkpoint timing summary data, in milliseconds. */
 	PendingCheckpointerStats.write_time += write_msecs;
 	PendingCheckpointerStats.sync_time += sync_msecs;
@@ -6774,7 +6775,15 @@ LogCheckpointEnd(bool restartpoint, int flags)
 
 	total_msecs = TimestampDifferenceMilliseconds(CheckpointStats.ckpt_start_t,
 												  CheckpointStats.ckpt_end_t);
-
+												
+	
+	/* Store in PendingCheckpointerStats */
+	PendingCheckpointerStats.checkpoint_total_time += (double) total_msecs;
+	PendingCheckpointerStats.last_checkpoint_time = CheckpointStats.ckpt_end_t;
+
+	/* Publishing it */
+	pgstat_report_checkpointer();
+	
 	/*
 	 * Timing values returned from CheckpointStats are in microseconds.
 	 * Convert to milliseconds for consistent printing.
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dec8df4f8e..903e001d95 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1190,7 +1190,9 @@ CREATE VIEW pg_stat_checkpointer AS
         pg_stat_get_checkpointer_sync_time() AS sync_time,
         pg_stat_get_checkpointer_buffers_written() AS buffers_written,
         pg_stat_get_checkpointer_slru_written() AS slru_written,
-        pg_stat_get_checkpointer_stat_reset_time() AS stats_reset;
+        pg_stat_get_checkpointer_stat_reset_time() AS stats_reset,
+        pg_stat_get_checkpointer_checkpoint_total_time() AS checkpoint_total_time,
+        pg_stat_get_checkpointer_last_checkpoint_time() AS last_checkpoint_time;
 
 CREATE VIEW pg_stat_io AS
 SELECT
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index e65034a30a..62ef427b82 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -56,8 +56,14 @@ pgstat_report_checkpointer(void)
 	CHECKPOINTER_ACC(sync_time);
 	CHECKPOINTER_ACC(buffers_written);
 	CHECKPOINTER_ACC(slru_written);
+	CHECKPOINTER_ACC(checkpoint_total_time);
 #undef CHECKPOINTER_ACC
 
+	/* only overwrite if we actually have a new timestamp */
+	if (PendingCheckpointerStats.last_checkpoint_time != 0)
+    	stats_shmem->stats.last_checkpoint_time =
+        	PendingCheckpointerStats.last_checkpoint_time;
+	
 	pgstat_end_changecount_write(&stats_shmem->changecount);
 
 	/*
@@ -71,6 +77,28 @@ pgstat_report_checkpointer(void)
 	pgstat_flush_io(false);
 }
 
+/* ------------------------------------------------------------
+ * Extended checkpointer stats reporting function
+ * ------------------------------------------------------------
+ */
+void
+pgstat_report_checkpointer_extended(long total_msecs, TimestampTz end_time)
+{
+    
+	PgStat_CheckpointerStats *checkpointer_stats;
+
+
+   checkpointer_stats = pgstat_fetch_stat_checkpointer();
+   if (!checkpointer_stats)
+       return;
+
+
+   checkpointer_stats->checkpoint_total_time += total_msecs;
+   checkpointer_stats->last_checkpoint_time = end_time;
+
+}
+
+
 /*
  * pgstat_fetch_stat_checkpointer() -
  *
@@ -136,5 +164,11 @@ pgstat_checkpointer_snapshot_cb(void)
 	CHECKPOINTER_COMP(sync_time);
 	CHECKPOINTER_COMP(buffers_written);
 	CHECKPOINTER_COMP(slru_written);
+	CHECKPOINTER_COMP(checkpoint_total_time);
 #undef CHECKPOINTER_COMP
+
+	pgStatLocal.snapshot.checkpointer.last_checkpoint_time = stats_shmem->stats.last_checkpoint_time;
+
+	elog(LOG, "DBG snapshot_cb: copied last_checkpoint_time=%ld",
+    	(long) pgStatLocal.snapshot.checkpointer.last_checkpoint_time);
 }
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index a710508979..591ad2ac88 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2292,3 +2292,24 @@ pg_stat_have_stats(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(pgstat_have_entry(kind, dboid, objid));
 }
+
+PG_FUNCTION_INFO_V1(pg_stat_get_checkpointer_checkpoint_total_time);
+
+Datum
+pg_stat_get_checkpointer_checkpoint_total_time(PG_FUNCTION_ARGS)
+{
+    PgStat_CheckpointerStats *stats = pgstat_fetch_stat_checkpointer();
+    PG_RETURN_FLOAT8(stats->checkpoint_total_time);
+}
+
+PG_FUNCTION_INFO_V1(pg_stat_get_checkpointer_last_checkpoint_time);
+
+Datum
+pg_stat_get_checkpointer_last_checkpoint_time(PG_FUNCTION_ARGS)
+{
+   
+    PgStat_CheckpointerStats *stats = pgstat_fetch_stat_checkpointer();
+	if (!stats) PG_RETURN_NULL();
+	PG_RETURN_TIMESTAMPTZ(stats->last_checkpoint_time);
+
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9121a382f7..a57053c4e2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5980,6 +5980,17 @@
   proname => 'pg_stat_get_checkpointer_stat_reset_time', provolatile => 's',
   proparallel => 'r', prorettype => 'timestamptz', proargtypes => '',
   prosrc => 'pg_stat_get_checkpointer_stat_reset_time' },
+# New functions for checkpointer
+{ oid => '7000', 
+  descr => 'total time spent in last checkpoint in milliseconds',
+  proname => 'pg_stat_get_checkpointer_checkpoint_total_time', provolatile => 's',
+  proparallel => 'r', prorettype => 'float8', proargtypes => '',
+  prosrc => 'pg_stat_get_checkpointer_checkpoint_total_time' },
+{ oid => '7001', 
+  descr => 'timestamp of last checkpoint completion',
+  proname => 'pg_stat_get_checkpointer_last_checkpoint_time', provolatile => 's', 
+  proparallel => 'r', prorettype => 'timestamptz', proargtypes => '', 
+  prosrc => 'pg_stat_get_checkpointer_last_checkpoint_time' },
 { oid => '2772',
   descr => 'statistics: number of buffers written by the bgwriter for cleaning dirty buffers',
   proname => 'pg_stat_get_bgwriter_buf_written_clean', provolatile => 's',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 7ae503e71a..a8eb1f8add 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -263,7 +263,9 @@ typedef struct PgStat_CheckpointerStats
 	PgStat_Counter sync_time;
 	PgStat_Counter buffers_written;
 	PgStat_Counter slru_written;
-	TimestampTz stat_reset_timestamp;
+	PgStat_Counter checkpoint_total_time;  /* new: total ms of last checkpoint */
+	TimestampTz    last_checkpoint_time;   /* new: end time of last checkpoint */
+	TimestampTz    stat_reset_timestamp;
 } PgStat_CheckpointerStats;
 
 
@@ -583,6 +585,7 @@ extern PgStat_BgWriterStats *pgstat_fetch_stat_bgwriter(void);
 
 extern void pgstat_report_checkpointer(void);
 extern PgStat_CheckpointerStats *pgstat_fetch_stat_checkpointer(void);
+extern void pgstat_report_checkpointer_extended(long total_msecs, TimestampTz end_time);
 
 
 /*
-- 
2.34.1

0001-Enhance-checkpoint-logs-to-include-reason-manual-tim.patchtext/x-patch; charset=US-ASCII; name=0001-Enhance-checkpoint-logs-to-include-reason-manual-tim.patchDownload
From 28ac031e3484df0d293bfeabd4ec08bed961b6f2 Mon Sep 17 00:00:00 2001
From: BharatDB <bharatdbpg@gmail.com>
Date: Mon, 3 Nov 2025 16:19:09 +0530
Subject: [PATCH] Enhance checkpoint logs to include reason
 (manual/timed/immediate)

Signed-off-by: BharatDB <bharatdbpg@gmail.com>
---
 src/backend/access/transam/xlog.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fd91bcd68e..ec992d2139 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -56,6 +56,7 @@
 #include "access/transam.h"
 #include "access/twophase.h"
 #include "access/xact.h"
+#include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "access/xlogarchive.h"
 #include "access/xloginsert.h"
@@ -6732,7 +6733,7 @@ LogCheckpointStart(int flags, bool restartpoint)
  * Log end of a checkpoint.
  */
 static void
-LogCheckpointEnd(bool restartpoint)
+LogCheckpointEnd(bool restartpoint, int flags)
 {
 	long		write_msecs,
 				sync_msecs,
@@ -6740,6 +6741,17 @@ LogCheckpointEnd(bool restartpoint)
 				longest_msecs,
 				average_msecs;
 	uint64		average_sync_time;
+	const char *ckpt_reason = "timed";
+
+	/* Determine checkpoint reason */
+	if (flags & CHECKPOINT_IS_SHUTDOWN)
+		ckpt_reason = "shutdown";
+	else if (flags & CHECKPOINT_END_OF_RECOVERY)
+		ckpt_reason = "end-of-recovery";
+	else if (flags & CHECKPOINT_FAST)
+		ckpt_reason = "immediate";
+	else if (flags & CHECKPOINT_FORCE)
+		ckpt_reason = "forced";
 
 	CheckpointStats.ckpt_end_t = GetCurrentTimestamp();
 
@@ -6782,12 +6794,13 @@ LogCheckpointEnd(bool restartpoint)
 	 */
 	if (restartpoint)
 		ereport(LOG,
-				(errmsg("restartpoint complete: wrote %d buffers (%.1f%%), "
+				(errmsg("restartpoint complete (%s): wrote %d buffers (%.1f%%), "
 						"wrote %d SLRU buffers; %d WAL file(s) added, "
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
 						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						ckpt_reason,
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -6806,12 +6819,13 @@ LogCheckpointEnd(bool restartpoint)
 						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo))));
 	else
 		ereport(LOG,
-				(errmsg("checkpoint complete: wrote %d buffers (%.1f%%), "
+				(errmsg("checkpoint complete (%s): wrote %d buffers (%.1f%%), "
 						"wrote %d SLRU buffers; %d WAL file(s) added, "
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
 						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						ckpt_reason,
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -7400,7 +7414,7 @@ CreateCheckPoint(int flags)
 		TruncateSUBTRANS(GetOldestTransactionIdConsideredRunning());
 
 	/* Real work is done; log and update stats. */
-	LogCheckpointEnd(false);
+	LogCheckpointEnd(false, flags);
 
 	/* Reset the process title */
 	update_checkpoint_display(flags, false, true);
@@ -7868,7 +7882,7 @@ CreateRestartPoint(int flags)
 		TruncateSUBTRANS(GetOldestTransactionIdConsideredRunning());
 
 	/* Real work is done; log and update stats. */
-	LogCheckpointEnd(true);
+	LogCheckpointEnd(true, flags);
 
 	/* Reset the process title */
 	update_checkpoint_display(flags, true, true);
-- 
2.34.1

#2Álvaro Herrera
alvherre@kurilemu.de
In reply to: BharatDB (#1)
Re: [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer

On 2025-Nov-20, BharatDB wrote:

Hi all,

Please see
/messages/by-id/CA+TgmoZWTLxmDXP9cKsLOhTdZXJ2BXHJK7+0w206wxGPYpXH4g@mail.gmail.com
especially point 2.

--
Álvaro Herrera