Change log level for notifying hot standby is waiting non-overflowed snapshot

Started by torikoshia, 11 months ago, 21 messages

torikoshia@oss.nttdata.com

11 months ago

1 attachment(s)

Hi,

When a hot standby is restarted in a state where subtransactions have
overflowed, it may become inaccessible:

$ psql: error: connection to server at "localhost" (::1), port 5433
failed: FATAL: the database system is not yet accepting connections
DETAIL: Consistent recovery state has not been yet reached.

However, the log message that indicates the cause of this issue seems to
be only output at the DEBUG1 level:

elog(DEBUG1,
"recovery snapshot waiting for non-overflowed snapshot or "
"until oldest active xid on standby is at least %u (now %u)",
standbySnapshotPendingXmin,
running->oldestRunningXid);

I believe this message would be useful not only for developers but also
for users.
How about changing the log level from DEBUG1 to NOTICE or higher?

Background:
One of our customers recently encountered an issue where the hot standby
became inaccessible after a restart.
The issue resolved itself after some time and I suspect it was caused by
a subtransaction overflow.
If the log level had been higher, it would have been easier to
diagnose the problem.
Even at NOTICE, the cause may be hard to spot because
log_min_messages defaults to WARNING, but it still seems that a
higher log level is better than DEBUG1.
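
The point about log_min_messages can be illustrated with a small sketch. It is a simplified model, not PostgreSQL's actual code: the `Severity` enum and `reaches_server_log()` are hypothetical names, the ranks are not the numeric values used by the server, and the ordering simply follows the one documented for log_min_messages (DEBUG levels, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, PANIC).

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified severity ranks, listed in the order documented
 * for log_min_messages.  These are NOT the server's internal values.
 */
typedef enum Severity
{
	SEV_DEBUG1,
	SEV_INFO,
	SEV_NOTICE,
	SEV_WARNING,
	SEV_ERROR,
	SEV_LOG,
	SEV_FATAL,
	SEV_PANIC
} Severity;

/*
 * In this model a message is written to the server log when its rank is
 * at or above the configured threshold.
 */
static bool
reaches_server_log(Severity msg_level, Severity log_min_messages)
{
	return msg_level >= log_min_messages;
}
```

Under this model, `reaches_server_log(SEV_NOTICE, SEV_WARNING)` is false, so raising the message from DEBUG1 to NOTICE alone would not make it visible with the default threshold, whereas a message at LOG would be.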

I would appreciate your thoughts.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachments:

v1-0001-Change-loglevel-for-waiting-non-overflowed-snapshot.patch (text/x-diff)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 2e54c11f88..20c505add7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1125,7 +1125,7 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 					 "recovery snapshots are now enabled");
 			}
 			else
-				elog(DEBUG1,
+				elog(NOTICE,
 					 "recovery snapshot waiting for non-overflowed snapshot or "
 					 "until oldest active xid on standby is at least %u (now %u)",
 					 standbySnapshotPendingXmin,
@@ -1303,7 +1303,7 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 	if (standbyState == STANDBY_SNAPSHOT_READY)
 		elog(DEBUG1, "recovery snapshots are now enabled");
 	else
-		elog(DEBUG1,
+		elog(NOTICE,
 			 "recovery snapshot waiting for non-overflowed snapshot or "
 			 "until oldest active xid on standby is at least %u (now %u)",
 			 standbySnapshotPendingXmin,

Fujii Masao

masao.fujii@oss.nttdata.com

11 months ago

In reply to: torikoshia (#1)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/02/03 22:35, torikoshia wrote:

> Hi,
>
> When a hot standby is restarted in a state where subtransactions have overflowed, it may become inaccessible:
>
> $ psql: error: connection to server at "localhost" (::1), port 5433 failed: FATAL: the database system is not yet accepting connections
> DETAIL: Consistent recovery state has not been yet reached.

Could you share the steps to reproduce this situation?

> However, the log message that indicates the cause of this issue seems to be only output at the DEBUG1 level:
>
> elog(DEBUG1,
>        "recovery snapshot waiting for non-overflowed snapshot or "
>        "until oldest active xid on standby is at least %u (now %u)",
>        standbySnapshotPendingXmin,
>        running->oldestRunningXid);
>
> I believe this message would be useful not only for developers but also for users.

Isn't this log message too difficult for most users? It seems to
describe PostgreSQL's internal mechanisms, making it hard
for users to understand the issue and what actions to take.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#2)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-03-03 13:10, Fujii Masao wrote:

Thanks for your comments!

> On 2025/02/03 22:35, torikoshia wrote:
>> Hi,
>>
>> When a hot standby is restarted in a state where subtransactions have
>> overflowed, it may become inaccessible:
>>
>> $ psql: error: connection to server at "localhost" (::1), port 5433
>> failed: FATAL: the database system is not yet accepting connections
>> DETAIL: Consistent recovery state has not been yet reached.
>
> Could you share the steps to reproduce this situation?

We can reproduce this situation using the following procedure.
I performed this test with one asynchronous standby server.

-- overflow subtransaction (savepoints require an open transaction block)
(primary)=# create table t1 (i int);
(primary)=# begin;
(primary)=# select 'insert into t1 values (1); savepoint s_' ||
generate_series(1, 70) ; \gexec
(primary)=# checkpoint;

-- restart standby
$ pg_ctl restart -D data_stb/
waiting for server to shut down.... done
server stopped
waiting for server to start.... LOG: redirecting log output to logging
collector process
........................................................... stopped
waiting
pg_ctl: server did not start in time

-- standby log
DEBUG: recovery snapshot waiting for non-overflowed snapshot or until
oldest active xid on standby is at least 887 (now 818)
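
Why about 70 savepoints are enough to trigger this can be sketched as follows. Each backend caches at most PGPROC_MAX_CACHED_SUBXIDS (64) subtransaction XIDs; once a transaction needs more, its cache is marked as overflowed. The code below is only a simplified model under that assumption: the `SubxidCache` struct and the helper functions are hypothetical names, and the starting XID is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors PostgreSQL's per-backend limit; everything else is hypothetical. */
#define PGPROC_MAX_CACHED_SUBXIDS 64

typedef struct SubxidCache
{
	uint32_t	xids[PGPROC_MAX_CACHED_SUBXIDS];
	int			count;
	bool		overflowed;
} SubxidCache;

static void
subxid_cache_init(SubxidCache *cache)
{
	cache->count = 0;
	cache->overflowed = false;
}

/* Record one subtransaction XID; once the cache is full, only set the flag. */
static void
subxid_cache_add(SubxidCache *cache, uint32_t xid)
{
	if (cache->count < PGPROC_MAX_CACHED_SUBXIDS)
		cache->xids[cache->count++] = xid;
	else
		cache->overflowed = true;
}

/* Simulate one transaction that assigns XIDs to n_subxacts subtransactions. */
static bool
overflows_after(int n_subxacts)
{
	SubxidCache cache;

	subxid_cache_init(&cache);
	for (int i = 0; i < n_subxacts; i++)
		subxid_cache_add(&cache, 1000u + (uint32_t) i);	/* arbitrary XIDs */
	return cache.overflowed;
}
```

In this model `overflows_after(64)` is false and `overflows_after(70)` is true, which matches the reproduction above: the open transaction holds more subtransactions than the cache can track, so the standby cannot build a complete snapshot from the running-transactions record.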

>> However, the log message that indicates the cause of this issue seems
>> to be only output at the DEBUG1 level:
>>
>> elog(DEBUG1,
>>        "recovery snapshot waiting for non-overflowed snapshot or "
>>        "until oldest active xid on standby is at least %u (now %u)",
>>        standbySnapshotPendingXmin,
>>        running->oldestRunningXid);
>>
>> I believe this message would be useful not only for developers but
>> also for users.
>
> Isn't this log message too difficult for most users? It seems to
> describe PostgreSQL's internal mechanisms, making it hard
> for users to understand the issue and what actions to take.

Agreed and I feel that a message suggesting something like "check if
there are any overflowing transactions on the primary side" would make
it useful.
On the other hand, the manual's explanation of
pg_stat_get_backend_subxact() does not mention subtransaction overflow,
so I am not sure how much detail should be included.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#3)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/04 0:20, torikoshia wrote:

> On 2025-03-03 13:10, Fujii Masao wrote:
>
> Thanks for your comments!
>
>> On 2025/02/03 22:35, torikoshia wrote:
>>> Hi,
>>>
>>> When a hot standby is restarted in a state where subtransactions have overflowed, it may become inaccessible:
>>>
>>> $ psql: error: connection to server at "localhost" (::1), port 5433 failed: FATAL: the database system is not yet accepting connections
>>> DETAIL: Consistent recovery state has not been yet reached.
>>
>> Could you share the steps to reproduce this situation?
>
> We can reproduce this situation using the following procedure.

Thanks! I was able to reproduce the issue.

> Agreed and I feel that a message suggesting something like "check if there are any overflowing transactions on the primary side" would make it useful.

I’m wondering if this message might still be confusing for users.
Would they immediately understand what "overflowing transactions" means?
Even after reading this message, it also seems unclear what actions
they should take to resolve the issue. Plus, this message can appear
multiple times if there are multiple overflowing transactions before
the standby starts accepting read-only connections - which could be even more confusing.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#4)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-03-04 03:17, Fujii Masao wrote:

>> Agreed and I feel that a message suggesting something like "check if
>> there are any overflowing transactions on the primary side" would make
>> it useful.
>
> I’m wondering if this message might still be confusing for users.
> Would they immediately understand what "overflowing transactions"
> means?
> Even after reading this message, it also seems unclear what actions
> they should take to resolve the issue. Plus, this message can appear
> multiple times if there are multiple overflowing transactions before
> the standby starts accepting read-only connections - which could be even
> more confusing.

It seems better to reconsider the content and timing of this message
output.

I personally think that logging the situation in which a subtransaction
overflow prevents hot standby connections would help users and support
providers understand the cause of the issue.
Do you think such logging is unnecessary?

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: torikoshia (#5)

2 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

Hi,

After an off-list discussion with Fujii-san, I'm now trying to modify
the following message that is output when a client attempts to connect,
instead of changing the log level as in the original proposal:

$ psql: error: connection to server at "localhost" (::1), port 5433
failed: FATAL: the database system is not yet accepting connections
DETAIL: Consistent recovery state has not been yet reached.

I now have two candidate patches for this.

The 1st one (v1-0001-Change-log-message-when-hot-standby-is-not-access.patch)
is a simple update to the existing log message, explicitly mentioning that
snapshot overflow could be a possible cause.
The 2nd one (v1-0001-Make-it-clear-when-hot-standby-is-inaccessible-du.patch)
introduces new states for pmState and CAC_state (which manages
whether connections can be accepted) to represent waiting for a
non-overflowed snapshot.

The advantage of the 2nd one is that it makes it clear whether the
connection failure is due to not reaching a consistent recovery state or
a snapshot overflow. However, I haven't found other significant
benefits, and I feel it might be overkill.
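
The difference between the two candidates can be sketched in a few lines. This is only an illustration of the design choice, not code from either patch: `RecoveryPhase` and the two mapping functions are hypothetical, while the CAC_* names follow the ones used in the attached patches.

```c
/* Hypothetical summary of how far recovery has progressed on the standby. */
typedef enum RecoveryPhase
{
	PHASE_NOT_CONSISTENT,		/* consistent recovery state not reached */
	PHASE_SNAPSHOT_PENDING,		/* consistent, but no usable snapshot yet */
	PHASE_READY					/* read-only connections can be accepted */
} RecoveryPhase;

typedef enum CacResult
{
	CAC_OK,
	CAC_NOTCONSISTENT_OR_OVERFLOWED,	/* 1st patch: one combined state */
	CAC_NOTCONSISTENT,					/* 2nd patch: two separate states */
	CAC_SNAPSHOT_PENDING
} CacResult;

/* 1st patch: both causes share one state, so one message covers both. */
static CacResult
cac_first_patch(RecoveryPhase phase)
{
	return (phase == PHASE_READY) ? CAC_OK : CAC_NOTCONSISTENT_OR_OVERFLOWED;
}

/* 2nd patch: each cause has its own state and can get its own message. */
static CacResult
cac_second_patch(RecoveryPhase phase)
{
	switch (phase)
	{
		case PHASE_NOT_CONSISTENT:
			return CAC_NOTCONSISTENT;
		case PHASE_SNAPSHOT_PENDING:
			return CAC_SNAPSHOT_PENDING;
		case PHASE_READY:
			break;
	}
	return CAC_OK;
}
```

With the first mapping a client cannot tell the two causes apart; the second distinguishes them at the cost of an extra postmaster state and signal.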

Personally, I feel the 1st patch may be sufficient, but I would appreciate
any feedback.

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachments:

v1-0001-Make-it-clear-when-hot-standby-is-inaccessible-du.patch (text/x-diff)

From 38a9ec23af2dc43ad24d939bb015d28d550d71fd Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshi@sraoss.co.jp>
Date: Wed, 12 Mar 2025 21:47:22 +0900
Subject: [PATCH v1] Make it clear when hot standby is inaccessible due to
 subtransaction overflow

Previously, the log message only assumed that the recovery process had
not yet reached a consistent point. However, even after reaching the
consistent point, if there is a transaction with an overflowed
subtransaction, hot standby becomes inaccessible.
Since there was no log message indicating this reason, it was
difficult to identify the cause.

This patch explicitly handles such cases, making the cause clearer in
the logs.
---
 src/backend/postmaster/postmaster.c | 29 ++++++++++++++++++++++-------
 src/backend/storage/ipc/procarray.c | 17 +++++++++++++++++
 src/backend/tcop/backend_startup.c  | 13 +++++++++++++
 src/include/storage/pmsignal.h      |  2 ++
 src/include/tcop/backend_startup.h  |  1 +
 5 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6..5c3de3f97d 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -333,6 +333,8 @@ typedef enum
 	PM_INIT,					/* postmaster starting */
 	PM_STARTUP,					/* waiting for startup subprocess */
 	PM_RECOVERY,				/* in archive recovery mode */
+	PM_SNAPSHOT_PENDING,		/* in snapshot pending because of an
+								 * overflowed subtransaction */
 	PM_HOT_STANDBY,				/* in hot standby mode */
 	PM_RUN,						/* normal "database is alive" state */
 	PM_STOP_BACKENDS,			/* need to stop remaining backends */
@@ -1814,6 +1816,9 @@ canAcceptConnections(BackendType backend_type)
 		else if (!FatalError && pmState == PM_RECOVERY)
 			return CAC_NOTCONSISTENT;	/* not yet at consistent recovery
 										 * state */
+		else if (!FatalError && pmState == PM_SNAPSHOT_PENDING)
+			return CAC_SNAPSHOT_PENDING;	/* waiting for non-overflowed
+											 * snapshot */
 		else
 			return CAC_RECOVERY;	/* else must be crash recovery */
 	}
@@ -2111,7 +2116,7 @@ process_pm_shutdown_request(void)
 			 */
 			if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
 				connsAllowed = false;
-			else if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
+			else if (pmState == PM_STARTUP || pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING)
 			{
 				/* There should be no clients, so proceed to stop children */
 				UpdatePMState(PM_STOP_BACKENDS);
@@ -2145,7 +2150,7 @@ process_pm_shutdown_request(void)
 			sd_notify(0, "STOPPING=1");
 #endif
 
-			if (pmState == PM_STARTUP || pmState == PM_RECOVERY)
+			if (pmState == PM_STARTUP || pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING)
 			{
 				/* Just shut down background processes silently */
 				UpdatePMState(PM_STOP_BACKENDS);
@@ -2711,6 +2716,7 @@ HandleFatalError(QuitSignalReason reason, bool consider_sigabrt)
 
 			/* wait for children to die */
 		case PM_RECOVERY:
+		case PM_SNAPSHOT_PENDING:
 		case PM_HOT_STANDBY:
 		case PM_RUN:
 		case PM_STOP_BACKENDS:
@@ -3193,6 +3199,7 @@ pmstate_name(PMState state)
 			PM_TOSTR_CASE(PM_INIT);
 			PM_TOSTR_CASE(PM_STARTUP);
 			PM_TOSTR_CASE(PM_RECOVERY);
+			PM_TOSTR_CASE(PM_SNAPSHOT_PENDING);
 			PM_TOSTR_CASE(PM_HOT_STANDBY);
 			PM_TOSTR_CASE(PM_RUN);
 			PM_TOSTR_CASE(PM_STOP_BACKENDS);
@@ -3245,7 +3252,7 @@ LaunchMissingBackgroundProcesses(void)
 	 * the shutdown checkpoint.  That's done in PostmasterStateMachine(), not
 	 * here.)
 	 */
-	if (pmState == PM_RUN || pmState == PM_RECOVERY ||
+	if (pmState == PM_RUN || pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING ||
 		pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
 	{
 		if (CheckpointerPMChild == NULL)
@@ -3281,7 +3288,7 @@ LaunchMissingBackgroundProcesses(void)
 	 */
 	if (PgArchPMChild == NULL &&
 		((XLogArchivingActive() && pmState == PM_RUN) ||
-		 (XLogArchivingAlways() && (pmState == PM_RECOVERY || pmState == PM_HOT_STANDBY))) &&
+		 (XLogArchivingAlways() && (pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING || pmState == PM_HOT_STANDBY))) &&
 		PgArchCanRestart())
 		PgArchPMChild = StartChildProcess(B_ARCHIVER);
 
@@ -3313,7 +3320,7 @@ LaunchMissingBackgroundProcesses(void)
 	if (WalReceiverRequested)
 	{
 		if (WalReceiverPMChild == NULL &&
-			(pmState == PM_STARTUP || pmState == PM_RECOVERY ||
+			(pmState == PM_STARTUP || pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING ||
 			 pmState == PM_HOT_STANDBY) &&
 			Shutdown <= SmartShutdown)
 		{
@@ -3663,8 +3670,15 @@ process_pm_pmsignal(void)
 		UpdatePMState(PM_RECOVERY);
 	}
 
-	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+	if (CheckPostmasterSignal(PMSIGNAL_SNAPSHOT_PENDING) &&
 		pmState == PM_RECOVERY && Shutdown == NoShutdown)
+	{
+		UpdatePMState(PM_SNAPSHOT_PENDING);
+	}
+
+	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+		(pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING) &&
+		Shutdown == NoShutdown)
 	{
 		ereport(LOG,
 				(errmsg("database system is ready to accept read-only connections")));
@@ -3806,7 +3820,7 @@ process_pm_pmsignal(void)
 	}
 
 	if (StartupPMChild != NULL &&
-		(pmState == PM_STARTUP || pmState == PM_RECOVERY ||
+		(pmState == PM_STARTUP || pmState == PM_RECOVERY || pmState == PM_SNAPSHOT_PENDING ||
 		 pmState == PM_HOT_STANDBY) &&
 		CheckPromoteSignal())
 	{
@@ -4130,6 +4144,7 @@ bgworker_should_start_now(BgWorkerStartTime start_time)
 			/* fall through */
 
 		case PM_RECOVERY:
+		case PM_SNAPSHOT_PENDING:
 		case PM_STARTUP:
 		case PM_INIT:
 			if (start_time == BgWorkerStart_PostmasterStart)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 2e54c11f88..bb37ad2fc2 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -58,6 +58,7 @@
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/pg_lfind.h"
+#include "storage/pmsignal.h"
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
@@ -1125,11 +1126,19 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 					 "recovery snapshots are now enabled");
 			}
 			else
+			{
+				/*
+				 * Inform postmaster that we are waiting for a non-overflowed
+				 * snapshot, so it can notify clients why the connection is
+				 * not yet acceptable.
+				 */
+				SendPostmasterSignal(PMSIGNAL_SNAPSHOT_PENDING);
 				elog(DEBUG1,
 					 "recovery snapshot waiting for non-overflowed snapshot or "
 					 "until oldest active xid on standby is at least %u (now %u)",
 					 standbySnapshotPendingXmin,
 					 running->oldestRunningXid);
+			}
 			return;
 		}
 	}
@@ -1303,11 +1312,19 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 	if (standbyState == STANDBY_SNAPSHOT_READY)
 		elog(DEBUG1, "recovery snapshots are now enabled");
 	else
+	{
+		/*
+		 * Inform postmaster that we are waiting for a non-overflowed
+		 * snapshot, so it can notify clients why the connection is not yet
+		 * acceptable.
+		 */
+		SendPostmasterSignal(PMSIGNAL_SNAPSHOT_PENDING);
 		elog(DEBUG1,
 			 "recovery snapshot waiting for non-overflowed snapshot or "
 			 "until oldest active xid on standby is at least %u (now %u)",
 			 standbySnapshotPendingXmin,
 			 running->oldestRunningXid);
+	}
 }
 
 /*
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index c70746fa56..17e9708136 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -303,6 +303,19 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 							 errmsg("the database system is not accepting connections"),
 							 errdetail("Hot standby mode is disabled.")));
 				break;
+			case CAC_SNAPSHOT_PENDING:
+				if (EnableHotStandby)
+					ereport(FATAL,
+							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
+							 errmsg("the database system is not yet accepting connections"),
+							 errdetail("Snapshot is pending because subtransaction is overflowed."),
+							 errhint("Find and close a transaction with more than %d subtransactions", PGPROC_MAX_CACHED_SUBXIDS)));
+				else
+					ereport(FATAL,
+							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
+							 errmsg("the database system is not accepting connections"),
+							 errdetail("Hot standby mode is disabled.")));
+				break;
 			case CAC_SHUTDOWN:
 				ereport(FATAL,
 						(errcode(ERRCODE_CANNOT_CONNECT_NOW),
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index d84a383047..a67813a15b 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -33,6 +33,8 @@
 typedef enum
 {
 	PMSIGNAL_RECOVERY_STARTED,	/* recovery has started */
+	PMSIGNAL_SNAPSHOT_PENDING,	/* snapshot is pending because of an
+								 * overflowed subtransaction */
 	PMSIGNAL_BEGIN_HOT_STANDBY, /* begin Hot Standby */
 	PMSIGNAL_ROTATE_LOGFILE,	/* send SIGUSR1 to syslogger to rotate logfile */
 	PMSIGNAL_START_AUTOVAC_LAUNCHER,	/* start an autovacuum launcher */
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 7328561120..866a3b7cd2 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -30,6 +30,7 @@ typedef enum CAC_state
 	CAC_SHUTDOWN,
 	CAC_RECOVERY,
 	CAC_NOTCONSISTENT,
+	CAC_SNAPSHOT_PENDING,
 	CAC_TOOMANY,
 } CAC_state;
 
-- 
2.43.0

v1-0001-Change-log-message-when-hot-standby-is-not-access.patch (text/x-diff)

From 96c95cbf855419c909b0f79b79be47c2220d2c51 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshi@sraoss.co.jp>
Date: Wed, 12 Mar 2025 21:45:43 +0900
Subject: [PATCH v1] Change log message when hot standby is not accessible

Previously, the log message only assumed that the recovery process had not yet
reached a consistent point. However, even when we have reached the consistent
point, if there is a transaction whose subtransaction is overflowed, the hot
standby is not accessible, and it was difficult to identify the cause since
no log message indicated the reason.
This change improves clarity by explicitly mentioning the case.
---
 src/backend/postmaster/postmaster.c | 6 ++++--
 src/backend/tcop/backend_startup.c  | 5 +++--
 src/include/tcop/backend_startup.h  | 2 +-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6..aafd238a7c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1812,8 +1812,10 @@ canAcceptConnections(BackendType backend_type)
 		else if (!FatalError && pmState == PM_STARTUP)
 			return CAC_STARTUP; /* normal startup */
 		else if (!FatalError && pmState == PM_RECOVERY)
-			return CAC_NOTCONSISTENT;	/* not yet at consistent recovery
-										 * state */
+			return CAC_NOTCONSISTENT_OR_OVERFLOWED; /* not yet at consistent
+													 * recovery state or
+													 * subtransaction is
+													 * overflowed */
 		else
 			return CAC_RECOVERY;	/* else must be crash recovery */
 	}
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index c70746fa56..46b1709e67 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -291,12 +291,13 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 						(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 						 errmsg("the database system is starting up")));
 				break;
-			case CAC_NOTCONSISTENT:
+			case CAC_NOTCONSISTENT_OR_OVERFLOWED:
 				if (EnableHotStandby)
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not yet accepting connections"),
-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),
+							 errhint("In the latter case, find and close the transaction with more than %d subtransactions", PGPROC_MAX_CACHED_SUBXIDS)));
 				else
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 7328561120..6580efcfbd 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -29,7 +29,7 @@ typedef enum CAC_state
 	CAC_STARTUP,
 	CAC_SHUTDOWN,
 	CAC_RECOVERY,
-	CAC_NOTCONSISTENT,
+	CAC_NOTCONSISTENT_OR_OVERFLOWED,
 	CAC_TOOMANY,
 } CAC_state;
 
-- 
2.43.0

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#6)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/12 21:57, torikoshia wrote:

> Hi,
>
> After an off-list discussion with Fujii-san, I'm now trying to modify the following message that is output when a client attempts to connect, instead of changing the log level as in the original proposal:
>
> $ psql: error: connection to server at "localhost" (::1), port 5433 failed: FATAL: the database system is not yet accepting connections
> DETAIL: Consistent recovery state has not been yet reached.
>
> I now have two candidate patches for this.

Thanks for the patches!

> The 1st one (v1-0001-Change-log-message-when-hot-standby-is-not-access.patch) is a simple update to the existing log message, explicitly mentioning that snapshot overflow could be a possible cause.
> The 2nd one (v1-0001-Make-it-clear-when-hot-standby-is-inaccessible-du.patch) introduces new states for pmState and CAC_state (which manages whether connections can be accepted) to represent waiting for a non-overflowed snapshot.
>
> The advantage of the 2nd one is that it makes it clear whether the connection failure is due to not reaching a consistent recovery state or a snapshot overflow. However, I haven't found other significant benefits, and I feel it might be overkill.

I agree that adding a new postmaster signal and state for
this minor issue seems unnecessary.

> Personally, I feel the 1st patch may be sufficient, but I would appreciate any feedback.

Agreed.

> -							 errdetail("Consistent recovery state has not been yet reached.")));
> +							 errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),
> +							 errhint("In the latter case, find and close the transaction with more than %d subtransactions", PGPROC_MAX_CACHED_SUBXIDS)));

This message might be too detailed. Instead, how about simplifying it
to something like: "Consistent recovery state has not been reached,
or snapshot is not ready for hot standby."

We can then update the documentation to clarify that overflowed subtransactions
may delay snapshot readiness for hot standby and explain how to address it.
For example, the current description - "it will begin accepting connections once
the recovery has brought the system to a consistent state." - should be updated
to reflect this condition.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#7)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

Hi,

On 2025-03-21 02:15, Fujii Masao wrote:

Thanks for your review!

>> Personally, I feel the 1st patch may be sufficient, but I would appreciate
>> any feedback.
>
> Agreed.
>
>> -							 errdetail("Consistent recovery state has not been yet reached.")));
>> +							 errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),
>> +							 errhint("In the latter case, find and close the transaction with more than %d subtransactions", PGPROC_MAX_CACHED_SUBXIDS)));
>
> This message might be too detailed. Instead, how about simplifying it
> to something like: "Consistent recovery state has not been reached,
> or snapshot is not ready for hot standby."

Agreed.

Do you also think the errhint message is unnecessary?
I agree with your idea to add a description of the overflowed
subtransaction in the manual, but I'm not sure all users will be able to
find it.
Some people may not understand what needs to be done to make the
snapshot ready for hot standby.
I think adding an errhint may help those users.

> We can then update the documentation to clarify that overflowed subtransactions
> may delay snapshot readiness for hot standby and explain how to address it.
> For example, the current description - "it will begin accepting connections once
> the recovery has brought the system to a consistent state." - should be updated
> to reflect this condition.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#8)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/21 21:29, torikoshia wrote:

> Hi,
>
> On 2025-03-21 02:15, Fujii Masao wrote:
>
> Thanks for your review!
>
>>> Personally, I feel the 1st patch may be sufficient, but I would appreciate any feedback.
>>
>> Agreed.
>>
>>> -                             errdetail("Consistent recovery state has not been yet reached.")));
>>> +                             errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),
>>> +                             errhint("In the latter case, find and close the transaction with more than %d subtransactions", PGPROC_MAX_CACHED_SUBXIDS)));
>>
>> This message might be too detailed. Instead, how about simplifying it
>> to something like: "Consistent recovery state has not been reached,
>> or snapshot is not ready for hot standby."
>
> Agreed.
>
> Do you also think the errhint message is unnecessary?
> I agree with your idea to add a description of the overflowed subtransaction in the manual, but I'm not sure all users will be able to find it.
> Some people may not understand what needs to be done to make the snapshot ready for hot standby.
> I think adding an errhint may help those users.

I see your concern that users might overlook the documentation and
struggle to find a solution. However, I still believe it's better to
include this information in the documentation rather than logging it
as a hint. Since the scenario where the hint would be useful is
relatively rare, logging it every time might be more confusing than helpful.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#9)

1 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-03-24 00:08, Fujii Masao wrote:

>> Do you also think the errhint message is unnecessary?
>> I agree with your idea to add a description of the overflowed
>> subtransaction in the manual, but I'm not sure all users will be able
>> to find it.
>> Some people may not understand what needs to be done to make the
>> snapshot ready for hot standby.
>> I think adding an errhint may help those users.
>
> I see your concern that users might overlook the documentation and
> struggle to find a solution. However, I still believe it's better to
> include this information in the documentation rather than logging it
> as a hint. Since the scenario where the hint would be useful is
> relatively rare, logging it every time might be more confusing than
> helpful.

Thanks for your opinion; it sounds reasonable.

I've attached an updated patch.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachments:

v2-0001-Change-log-message-when-hot-standby-is-not-access.patch (text/x-diff)

From d2eb01a10b07d10e1a3fee2150862986dc0edf41 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Date: Mon, 24 Mar 2025 23:13:52 +0900
Subject: [PATCH v2] Change log message when hot standby is not accessible

Previously, the log message only assumed that the recovery process had not yet
reached a consistent point. However, even when we have reached the consistent
point, if there is a transaction whose subtransaction is overflowed, the hot
standby is not accessible, and it was difficult to identify the cause since
no log message indicated the reason.
This change improves clarity by explicitly mentioning the case.

Additionally, the documentation has been updated to clarify that overflowed
subtransactions can delay snapshot readiness for hot standby and how to solve
it.

---
 doc/src/sgml/high-availability.sgml | 7 +++++--
 src/backend/postmaster/postmaster.c | 6 ++++--
 src/backend/tcop/backend_startup.c  | 4 ++--
 src/include/tcop/backend_startup.h  | 2 +-
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601..6e336b1a8e 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1535,8 +1535,11 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
    <para>
     When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
     standby server, it will begin accepting connections once the recovery has
-    brought the system to a consistent state.  All such connections are
-    strictly read-only; not even temporary tables may be written.
+    brought the system to a consistent state.  However, overflowed
+    subtransactions may also delay snapshot readiness for hot standby. In such
+    case, the issue can be resolved by closing the transaction containing the
+    overflowed subtransactions.  All connections accepted by the hot standby
+    are strictly read-only; not even temporary tables may be written.
    </para>
 
    <para>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a0c37532d2..6c678af8e6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1825,8 +1825,10 @@ canAcceptConnections(BackendType backend_type)
 		else if (!FatalError && pmState == PM_STARTUP)
 			return CAC_STARTUP; /* normal startup */
 		else if (!FatalError && pmState == PM_RECOVERY)
-			return CAC_NOTCONSISTENT;	/* not yet at consistent recovery
-										 * state */
+			return CAC_NOTCONSISTENT_OR_OVERFLOWED; /* not yet at consistent
+													 * recovery state or
+													 * subtransaction is
+													 * overflowed */
 		else
 			return CAC_RECOVERY;	/* else must be crash recovery */
 	}
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index 27c0b3c2b0..37ba5f760e 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -306,12 +306,12 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 						(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 						 errmsg("the database system is starting up")));
 				break;
-			case CAC_NOTCONSISTENT:
+			case CAC_NOTCONSISTENT_OR_OVERFLOWED:
 				if (EnableHotStandby)
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not yet accepting connections"),
-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Consistent recovery state has not been yet reached, or snapshot is not ready for hot standby.")));
 				else
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 578828c1ca..4a4b878e2d 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -36,7 +36,7 @@ typedef enum CAC_state
 	CAC_STARTUP,
 	CAC_SHUTDOWN,
 	CAC_RECOVERY,
-	CAC_NOTCONSISTENT,
+	CAC_NOTCONSISTENT_OR_OVERFLOWED,
 	CAC_TOOMANY,
 } CAC_state;
 

base-commit: 19c6eb06c51f4da70e2ea0f1bdb64a0142e8e2aa
-- 
2.48.1

#11

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#10)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/24 23:18, torikoshia wrote:

On 2025-03-24 00:08, Fujii Masao wrote:

Do you also think the errhint message is unnecessary?
I agree with your idea to add a description of the overflowed subtransaction in the manual, but I'm not sure all users will be able to find it.
Some people may not understand what needs to be done to make the snapshot ready for hot standby.
I think adding an errhint may help those users.

I see your concern that users might overlook the documentation and
struggle to find a solution. However, I still believe it's better to
include this information in the documentation rather than logging it
as a hint. Since the scenario where the hint would be useful is
relatively rare, logging it every time might be more confusing than helpful.

Thanks for your opinion and it sounds reasonable.

Attached an updated patch.

Thanks for updating the patch!

In high-availability.sgml, the "Administrator's Overview" section already
describes the conditions for accepting hot standby connections.
This section should also be updated accordingly.

+    brought the system to a consistent state.  However, overflowed
+    subtransactions may also delay snapshot readiness for hot standby. In such
+    case, the issue can be resolved by closing the transaction containing the
+    overflowed subtransactions.  All connections accepted by the hot standby
+    are strictly read-only; not even temporary tables may be written.

It would be better to move this explanation about overflowed subtransactions
to the "Administrator's Overview" section.

-			case CAC_NOTCONSISTENT:
+			case CAC_NOTCONSISTENT_OR_OVERFLOWED:

This new name seems a bit too long. I'm OK to leave the name as it is.
Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.
Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#12

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#11)

1 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

Hi,

I had another off-list discussion with Fujii-san, and according to the
following manual[1], it seems that a transaction with an overflowed
subtransaction is already considered inconsistent:

Reaching a consistent state can also be delayed in the presence of
both of these conditions:

- A write transaction has more than 64 subtransactions
- Very long-lived write transactions

IIUC, the manual suggests that both conditions must be met -- recovery
reaching at least minRecoveryPoint and no overflowed subtransactions --
for the standby to be considered consistent.

OTOH, the following log message is emitted even when subtransactions
have overflowed, which appears to contradict the definition of
consistency mentioned above:

LOG: consistent recovery state reached

This log message is triggered when recovery progresses beyond
minRecoveryPoint (according to CheckRecoveryConsistency()).
However, since this state does not satisfy 'consistency' defined in the
manual, I think it would be more accurate to log that it has merely
reached the "minimum recovery point".
Furthermore, it may be better to emit the above log message only when
recovery has progressed beyond minRecoveryPoint and there are no
overflowed subtransactions.

Attached patch does this.
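
As a rough, self-contained sketch of the two-stage logging described above
(this is not the patch itself; RecoveryStage, recovery_stage() and
stage_log_message() are hypothetical names, and the two booleans merely stand
in for the checks that CheckRecoveryConsistency() performs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the stages discussed above; not PostgreSQL code. */
typedef enum
{
	STAGE_BEFORE_MIN_RECOVERY_POINT,	/* nothing logged yet */
	STAGE_MIN_RECOVERY_POINT_REACHED,	/* replay passed minRecoveryPoint */
	STAGE_CONSISTENT					/* ... and the snapshot is usable */
} RecoveryStage;

/* Map the two conditions onto the stage the standby would be in. */
RecoveryStage
recovery_stage(bool past_min_recovery_point, bool snapshot_ready)
{
	if (!past_min_recovery_point)
		return STAGE_BEFORE_MIN_RECOVERY_POINT;
	if (!snapshot_ready)
		return STAGE_MIN_RECOVERY_POINT_REACHED;
	return STAGE_CONSISTENT;
}

/* Message logged on entering a stage, or NULL if nothing is logged. */
const char *
stage_log_message(RecoveryStage stage)
{
	switch (stage)
	{
		case STAGE_BEFORE_MIN_RECOVERY_POINT:
			return NULL;
		case STAGE_MIN_RECOVERY_POINT_REACHED:
			return "minimum recovery point reached";
		case STAGE_CONSISTENT:
			return "consistent recovery state reached";
	}
	assert(!"unknown RecoveryStage");
	return NULL;
}
```

With a snapshot still pending because of overflowed subtransactions,
recovery_stage(true, false) yields the intermediate stage, so only the first
message would appear in the log.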

Additionally, renaming variables such as reachedConsistency in
CheckRecoveryConsistency might also be appropriate.
However, in the attached patch, I have left them unchanged for now.

On 2025-03-25 00:55, Fujii Masao wrote:

-                  case CAC_NOTCONSISTENT:
+                  case CAC_NOTCONSISTENT_OR_OVERFLOWED:
This new name seems a bit too long. I'm OK to leave the name as it is.
Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.

Beyond just the length issue, given the understanding outlined above, I
now think CAC_NOTCONSISTENT does not actually need to be changed.

In high-availability.sgml, the "Administrator's Overview" section already
describes the conditions for accepting hot standby connections.
This section should also be updated accordingly.

Agreed.
I have updated this section to mention that the resolution is to close
the problematic transaction.
OTOH the changes made in v2 patch seem unnecessary, since the concept of
'consistent' is already explained in the "Administrator's Overview."

-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),

Given the above understanding, "or" is not appropriate in this context,
so I left this message unchanged.
Instead, I have added an errhint. The phrasing in the hint message
aligns with the manual, allowing users to search for this hint and find
the newly added resolution instructions.

What do you think?

[1]: https://www.postgresql.org/docs/devel/hot-standby.html

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachments:

v3-0001-Add-hint-message-when-hot-standby-is-unaccessible.patch (text/x-diff)

From 2f552c683cfc3f4ba69a33f279ec80bca60e1c93 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Date: Thu, 27 Mar 2025 10:51:37 +0900
Subject: [PATCH v3] Add hint message when hot standby is unaccessible

Currently, when hot standby is inaccessible due to an overflowed
subtransaction, it is difficult for users to determine the cause
since no user-level log message indicates it.
This patch adds a hint message to indicate the reason.

Additionally, there is an inconsistency between the documentation and
the log messages regarding the definition of 'consistent' recovery.
The documentation states that a consistent state requires both that
recovery has progressed beyond minRecoveryPoint and that there are no
overflowed subtransactions, whereas the source code considers only
minRecoveryPoint.
This patch updates the log message to align with the documentation.

---
 doc/src/sgml/high-availability.sgml       | 1 +
 src/backend/access/transam/xlogrecovery.c | 5 ++++-
 src/backend/tcop/backend_startup.c        | 4 +++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601..6ceb57b0a0 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1991,6 +1991,7 @@ LOG:  database system is ready to accept read-only connections
        </listitem>
       </itemizedlist>
 
+    The former case can be resolved by closing the transaction.
     If you are running file-based log shipping ("warm standby"), you might need
     to wait until the next WAL file arrives, which could be as long as the
     <varname>archive_timeout</varname> setting on the primary.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 2c19013c98..8152f90c99 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2249,7 +2249,7 @@ CheckRecoveryConsistency(void)
 
 		reachedConsistency = true;
 		ereport(LOG,
-				(errmsg("consistent recovery state reached at %X/%X",
+				(errmsg("minimum recovery point reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
 	}
 
@@ -2268,6 +2268,9 @@ CheckRecoveryConsistency(void)
 		SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
 		LocalHotStandbyActive = true;
+		ereport(LOG,
+				(errmsg("consistent recovery state reached at %X/%X",
+						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
 
 		SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
 	}
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index 27c0b3c2b0..252443f4ca 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -311,7 +311,9 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not yet accepting connections"),
-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Consistent recovery state has not been yet reached."),
+							 errhint("Minimum recovery point has not been yet reached or a write transaction may have more than %d subtransactions.",
+									 PGPROC_MAX_CACHED_SUBXIDS)));
 				else
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),

base-commit: c325a7633fcb33dbd73f46ddbbe91e95ddf3b227
-- 
2.48.1

#13

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: torikoshia (#12)

1 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-03-27 11:06, torikoshia wrote:

Hi,

I had another off-list discussion with Fujii-san, and according to the
following manual[1], it seems that a transaction with an overflowed
subtransaction is already considered inconsistent:

Reaching a consistent state can also be delayed in the presence of
both of these conditions:

- A write transaction has more than 64 subtransactions
- Very long-lived write transactions

IIUC, the manual suggests that both conditions must be met -- recovery
reaching at least minRecoveryPoint and no overflowed subtransactions
-- for the standby to be considered consistent.

OTOH, the following log message is emitted even when subtransactions
have overflowed, which appears to contradict the definition of
consistency mentioned above:

LOG: consistent recovery state reached

This log message is triggered when recovery progresses beyond
minRecoveryPoint(according to CheckRecoveryConsistency()).
However, since this state does not satisfy 'consistency' defined in
the manual, I think it would be more accurate to log that it has
merely reached the "minimum recovery point".
Furthermore, it may be better to emit the above log message only when
recovery has progressed beyond minRecoveryPoint and there are no
overflowed subtransactions.

Attached patch does this.

Additionally, renaming variables such as reachedConsistency in
CheckRecoveryConsistency might also be appropriate.
However, in the attached patch, I have left them unchanged for now.

On 2025-03-25 00:55, Fujii Masao wrote:
-                  case CAC_NOTCONSISTENT:
+                  case CAC_NOTCONSISTENT_OR_OVERFLOWED:
This new name seems a bit too long. I'm OK to leave the name as it is.
Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.
Beyond just the length issue, given the understanding outlined above,
I now think CAC_NOTCONSISTENT does not actually need to be changed.

In high-availability.sgml, the "Administrator's Overview" section already
describes the conditions for accepting hot standby connections.
This section should also be updated accordingly.

Agreed.
I have updated this section to mention that the resolution is to close
the problematic transaction.
OTOH the changes made in v2 patch seem unnecessary, since the concept
of 'consistent' is already explained in the "Administrator's
Overview."
-							 errdetail("Consistent recovery state has not been yet 
reached.")));
+							 errdetail("Consistent recovery state has not been yet
reached, or snappshot is pending because subtransaction is
overflowed."),
Given the above understanding, "or" is not appropriate in this
context, so I left this message unchanged.
Instead, I have added an errhint. The phrasing in the hint message
aligns with the manual, allowing users to search for this hint and
find the newly added resolution instructions.

On second thought, it may not be appropriate to show this output to
clients attempting to connect. This message should be shown to
administrators, not to clients.

From this point of view, it'd be better to output a message indicating
the status inside ProcArrayApplyRecoveryInfo(). However, a
straightforward implementation would result in the same message being
logged every time an XLOG_RUNNING_XACTS WAL is received, making it
noisy.
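
For illustration only, one way to avoid that noise would be to report the
pending state once per transition rather than once per record. The sketch
below is not what the attached patch does, and PendingSnapshotReporter and
report_pending_snapshot() are hypothetical names, not PostgreSQL APIs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state; remembers whether the wait was already reported. */
typedef struct
{
	bool		already_reported;
} PendingSnapshotReporter;

/*
 * Call once per (simulated) running-xacts record.  Writes a message to
 * "out" only when the snapshot first becomes pending, and re-arms once
 * the snapshot is ready.  Returns true if a message was written.
 */
bool
report_pending_snapshot(PendingSnapshotReporter *reporter,
						bool snapshot_pending, FILE *out)
{
	assert(reporter != NULL && out != NULL);

	if (!snapshot_pending)
	{
		/* Snapshot is ready: allow a future wait to be reported again. */
		reporter->already_reported = false;
		return false;
	}
	if (reporter->already_reported)
		return false;			/* same wait as before: stay quiet */

	fputs("recovery snapshot waiting for non-overflowed snapshot\n", out);
	reporter->already_reported = true;
	return true;
}
```

Under this assumption, repeated records during one wait would produce a
single line, at the cost of carrying a little extra state in the startup
process.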

Instead of directly outputting a log indicating that a hot standby
connection cannot be established due to subtransaction overflow, the
attached patch updates the manual so that administrators can determine
whether a subtransaction overflow has occurred based on the modified log
output.

What do you think?

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachments:

v4-0001-Modify-and-add-log-messages-to-align-with-the-doc.patch (text/x-diff)

From 908db07687d9d4473ab86d93356b7d605329c0be Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Date: Thu, 27 Mar 2025 23:50:08 +0900
Subject: [PATCH v4] Modify and add log messages to align with the doc

There is an inconsistency between the documentation[1] and the log
messages regarding the definition of a 'consistent' recovery state.
The documentation states that a consistent state requires both that
recovery has progressed beyond minRecoveryPoint and that there are
no overflowed subtransactions, whereas the source code previously
considered only minRecoveryPoint.

This patch modifies and adds log messages to align with the
documentation. Specifically, messages are now logged both when
recovery progresses beyond minRecoveryPoint and when the system
reaches a consistent state in the updated definition. As a result,
the "consistent recovery state reached" message is now output
later than before.

As a side effect, previously, if hot standby remained inaccessible
due to overflowed subtransactions, it was difficult to determine
the cause. With this change in log messages and the corresponding
update to the documentation, the reason for the delay can now be
identified more clearly.

[1] https://www.postgresql.org/docs/devel/hot-standby.html
---
 doc/src/sgml/high-availability.sgml       | 8 ++++++++
 src/backend/access/transam/xlogrecovery.c | 5 ++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601..c83c5403ed 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1967,6 +1967,10 @@ LOG:  entering standby mode
 
 ... then some time later ...
 
+LOG:  minimum recovery point reached
+
+... It may take some time ...
+
 LOG:  consistent recovery state reached
 LOG:  database system is ready to accept read-only connections
 </programlisting>
@@ -1991,6 +1995,10 @@ LOG:  database system is ready to accept read-only connections
        </listitem>
       </itemizedlist>
 
+    In this case, only "minimum recovery point reached" is logged, and
+    "consistent recovery state reached" is not logged. This issue can be
+    resolved by closing the transaction.
+    This case can be resolved by closing the transaction.
     If you are running file-based log shipping ("warm standby"), you might need
     to wait until the next WAL file arrives, which could be as long as the
     <varname>archive_timeout</varname> setting on the primary.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 2c19013c98..8152f90c99 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2249,7 +2249,7 @@ CheckRecoveryConsistency(void)
 
 		reachedConsistency = true;
 		ereport(LOG,
-				(errmsg("consistent recovery state reached at %X/%X",
+				(errmsg("minimum recovery point reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
 	}
 
@@ -2268,6 +2268,9 @@ CheckRecoveryConsistency(void)
 		SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
 		LocalHotStandbyActive = true;
+		ereport(LOG,
+				(errmsg("consistent recovery state reached at %X/%X",
+						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
 
 		SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
 	}

base-commit: 0f3604a518f8b3fd35ffc344d85c71693ded0dde
-- 
2.48.1

#14

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#13)

1 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/28 0:13, torikoshia wrote:

On 2025-03-27 11:06, torikoshia wrote:
Hi,

I had another off-list discussion with Fujii-san, and according to the
following manual[1], it seems that a transaction with an overflowed
subtransaction is already considered inconsistent:

Reaching a consistent state can also be delayed in the presence of
both of these conditions:

- A write transaction has more than 64 subtransactions
- Very long-lived write transactions

IIUC, the manual suggests that both conditions must be met -- recovery
reaching at least minRecoveryPoint and no overflowed subtransactions
-- for the standby to be considered consistent.

OTOH, the following log message is emitted even when subtransactions
have overflowed, which appears to contradict the definition of
consistency mentioned above:

LOG: consistent recovery state reached

This log message is triggered when recovery progresses beyond
minRecoveryPoint(according to CheckRecoveryConsistency()).
However, since this state does not satisfy 'consistency' defined in
the manual, I think it would be more accurate to log that it has
merely reached the "minimum recovery point".
Furthermore, it may be better to emit the above log message only when
recovery has progressed beyond minRecoveryPoint and there are no
overflowed subtransactions.

Attached patch does this.

Additionally, renaming variables such as reachedConsistency in
CheckRecoveryConsistency might also be appropriate.
However, in the attached patch, I have left them unchanged for now.

On 2025-03-25 00:55, Fujii Masao wrote:
-                  case CAC_NOTCONSISTENT:
+                  case CAC_NOTCONSISTENT_OR_OVERFLOWED:
This new name seems a bit too long. I'm OK to leave the name as it is.
Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.
Beyond just the length issue, given the understanding outlined above,
I now think CAC_NOTCONSISTENT does not actually need to be changed.

In high-availability.sgml, the "Administrator's Overview" section already
describes the conditions for accepting hot standby connections.
This section should also be updated accordingly.

Agreed.
I have updated this section to mention that the resolution is to close
the problematic transaction.
OTOH the changes made in v2 patch seem unnecessary, since the concept
of 'consistent' is already explained in the "Administrator's
Overview."
-                             errdetail("Consistent recovery state has not been yet reached.")));
+                             errdetail("Consistent recovery state has not been yet
reached, or snappshot is pending because subtransaction is
overflowed."),
Given the above understanding, "or" is not appropriate in this
context, so I left this message unchanged.
Instead, I have added an errhint. The phrasing in the hint message
aligns with the manual, allowing users to search for this hint and
find the newly added resolution instructions.
On second thought, it may not be appropriate to show this output to clients attempting to connect. This message should be notified not to clients but to administrators.

From this point of view, it'd be better to output a message indicating the status inside ProcArrayApplyRecoveryInfo(). However, a straightforward implementation would result in the same message being logged every time an XLOG_RUNNING_XACTS WAL is received, making it noisy.

Instead of directly outputting a log indicating that a hot standby connection cannot be established due to subtransaction overflow, the attached patch updates the manual so that administrators can determine whether a subtransaction overflow has occurred based on the modified log output.

What do you think?

I had the same thought during our off-list discussion. However,
after reviewing the recovery code - such as recoveryStopsBefore(),
which checks whether a consistent state is reached - I now believe
the manual’s definition of a consistent state may be incorrect.
A consistent state should be defined as the point where recovery
has reached minRecoveryPoint.

If we were to change the definition to match the manual,
we would also need to update various recovery checks,
which wouldn't be a trivial task.

Given that, I now think it's better to revive your v1 patch,
which introduces a new postmaster signal and improves the error message
when connections are not accepted during hot standby. I've attached
a revised version of the patch based on your v1. Thoughts?
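
To make the intended behavior concrete, here is a minimal standalone sketch
of the three-way decision in the revised patch. It only mirrors the branch
order of the CAC_NOTHOTSTANDBY case; RejectReason, classify_rejection() and
rejection_detail() are hypothetical names used for illustration, not
functions in the tree:

```c
#include <assert.h>
#include <stdbool.h>

/* Why a standby in recovery refuses a connection (illustrative only). */
typedef enum
{
	REJECT_HOT_STANDBY_DISABLED,
	REJECT_SNAPSHOT_NOT_READY,
	REJECT_NOT_CONSISTENT
} RejectReason;

/*
 * Same branch order as the patched switch case: check whether hot standby
 * is enabled first, then whether the consistent state has been reached.
 */
RejectReason
classify_rejection(bool enable_hot_standby, bool reached_consistency)
{
	if (!enable_hot_standby)
		return REJECT_HOT_STANDBY_DISABLED;
	if (reached_consistency)
		return REJECT_SNAPSHOT_NOT_READY;	/* consistent, snapshot pending */
	return REJECT_NOT_CONSISTENT;
}

/* DETAIL text the patch associates with each reason. */
const char *
rejection_detail(RejectReason reason)
{
	switch (reason)
	{
		case REJECT_HOT_STANDBY_DISABLED:
			return "Hot standby mode is disabled.";
		case REJECT_SNAPSHOT_NOT_READY:
			return "Recovery snapshot is not yet ready for hot standby.";
		case REJECT_NOT_CONSISTENT:
			return "Consistent recovery state has not been yet reached.";
	}
	assert(!"unknown RejectReason");
	return "";
}
```

The point of the middle branch is that the postmaster needs to know that the
consistent state was reached, which is why the patch adds the new signal.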

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachments:

v5-0001-Improve-error-message-when-standby-does-accept-co.patch (text/plain; charset=UTF-8)

From a21c5fa943c2e372bf6b77fcbb32345751fb7a84 Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Fri, 28 Mar 2025 01:40:52 +0900
Subject: [PATCH v5] Improve error message when standby does not accept
 connections.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Even after reaching the minimum recovery point, if there are long-lived
write transactions with more than 64 subtransactions on the primary, the
recovery snapshot may not yet be ready for hot standby, delaying read-only
connections on the standby. Previously, when read-only connections were
not accepted due to this condition, the following error message was logged:

    FATAL:  the database system is not yet accepting connections
    DETAIL:  Consistent recovery state has not been yet reached.

This DETAIL message was misleading because the following message was
already logged in this case:

    LOG:  consistent recovery state reached

This contradiction, i.e., indicating that the recovery state was consistent
while also stating it wasn’t, caused confusion.

This commit improves the error message to better reflect the actual state:

    FATAL: the database system is not accepting connections
    DETAIL: Recovery snapshot is not yet ready for hot standby.
    HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server.

To implement this, the commit introduces a new postmaster signal,
PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches
a consistent recovery state, it sends this signal to the postmaster,
allowing it to correctly recognize that state.

Since this is not a clear bug, the change is applied only to the master
branch and is not back-patched.
---
 doc/src/sgml/high-availability.sgml       |  9 ++++++---
 src/backend/access/transam/xlogrecovery.c |  6 ++++++
 src/backend/postmaster/postmaster.c       | 12 +++++++++---
 src/backend/tcop/backend_startup.c        | 16 ++++++++++++----
 src/include/storage/pmsignal.h            |  1 +
 src/include/tcop/backend_startup.h        |  2 +-
 6 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601d..0e84ca66901 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1974,9 +1974,12 @@ LOG:  database system is ready to accept read-only connections
     Consistency information is recorded once per checkpoint on the primary.
     It is not possible to enable hot standby when reading WAL
     written during a period when <varname>wal_level</varname> was not set to
-    <literal>replica</literal> or <literal>logical</literal> on the primary.  Reaching
-    a consistent state can also be delayed in the presence of both of these
-    conditions:
+    <literal>replica</literal> or <literal>logical</literal> on the primary.
+    Even after reaching a consistent state, the recovery snapshot may not
+    be ready for hot standby if both of the following conditions are met,
+    delaying accepting read-only connections.  To enable hot standby,
+    a long-lived write transaction with more than 64 subtransactions
+    needs to be closed on the primary.
 
       <itemizedlist>
        <listitem>
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 0aa3ab59085..6ce979f2d8b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -291,6 +291,11 @@ static bool backupEndRequired = false;
  * Consistent state means that the system is internally consistent, all
  * the WAL has been replayed up to a certain point, and importantly, there
  * is no trace of later actions on disk.
+ *
+ * This flag is used only by the startup process and postmaster. When
+ * minRecoveryPoint is reached, the startup process sets it to true and
+ * sends a PMSIGNAL_RECOVERY_CONSISTENT signal to the postmaster,
+ * which then sets it to true upon receiving the signal.
  */
 bool		reachedConsistency = false;
 
@@ -2248,6 +2253,7 @@ CheckRecoveryConsistency(void)
 		CheckTablespaceDirectory();
 
 		reachedConsistency = true;
+		SendPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT);
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index c966c2e83af..3fe45de5da0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1825,8 +1825,7 @@ canAcceptConnections(BackendType backend_type)
 		else if (!FatalError && pmState == PM_STARTUP)
 			return CAC_STARTUP; /* normal startup */
 		else if (!FatalError && pmState == PM_RECOVERY)
-			return CAC_NOTCONSISTENT;	/* not yet at consistent recovery
-										 * state */
+			return CAC_NOTHOTSTANDBY;	/* not yet ready for hot standby */
 		else
 			return CAC_RECOVERY;	/* else must be crash recovery */
 	}
@@ -3699,6 +3698,7 @@ process_pm_pmsignal(void)
 		/* WAL redo has started. We're out of reinitialization. */
 		FatalError = false;
 		AbortStartTime = 0;
+		reachedConsistency = false;
 
 		/*
 		 * Start the archiver if we're responsible for (re-)archiving received
@@ -3724,8 +3724,14 @@ process_pm_pmsignal(void)
 		UpdatePMState(PM_RECOVERY);
 	}
 
-	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT) &&
 		pmState == PM_RECOVERY && Shutdown == NoShutdown)
+	{
+		reachedConsistency = true;
+	}
+
+	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+		(pmState == PM_RECOVERY && Shutdown == NoShutdown))
 	{
 		ereport(LOG,
 				(errmsg("database system is ready to accept read-only connections")));
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index a07c59ece01..f920d9e4e55 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/xlog.h"
+#include "access/xlogrecovery.h"
 #include "common/ip.h"
 #include "common/string.h"
 #include "libpq/libpq.h"
@@ -306,17 +307,24 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 						(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 						 errmsg("the database system is starting up")));
 				break;
-			case CAC_NOTCONSISTENT:
-				if (EnableHotStandby)
+			case CAC_NOTHOTSTANDBY:
+				if (!EnableHotStandby)
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not yet accepting connections"),
-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Hot standby mode is disabled.")));
+				else if (reachedConsistency)
+					ereport(FATAL,
+							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
+							 errmsg("the database system is not accepting connections"),
+							 errdetail("Recovery snapshot is not yet ready for hot standby."),
+							 errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server.",
+									 PGPROC_MAX_CACHED_SUBXIDS)));
 				else
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not accepting connections"),
-							 errdetail("Hot standby mode is disabled.")));
+							 errdetail("Consistent recovery state has not been yet reached.")));
 				break;
 			case CAC_SHUTDOWN:
 				ereport(FATAL,
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index d84a383047e..67fa9ac06e1 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -33,6 +33,7 @@
 typedef enum
 {
 	PMSIGNAL_RECOVERY_STARTED,	/* recovery has started */
+	PMSIGNAL_RECOVERY_CONSISTENT,	/* recovery has reached consistent state */
 	PMSIGNAL_BEGIN_HOT_STANDBY, /* begin Hot Standby */
 	PMSIGNAL_ROTATE_LOGFILE,	/* send SIGUSR1 to syslogger to rotate logfile */
 	PMSIGNAL_START_AUTOVAC_LAUNCHER,	/* start an autovacuum launcher */
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 578828c1caf..dcb9d056643 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -36,7 +36,7 @@ typedef enum CAC_state
 	CAC_STARTUP,
 	CAC_SHUTDOWN,
 	CAC_RECOVERY,
-	CAC_NOTCONSISTENT,
+	CAC_NOTHOTSTANDBY,
 	CAC_TOOMANY,
 } CAC_state;
 
-- 
2.48.1

#15

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#14)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-03-31 12:51, Fujii Masao wrote:

I had the same thought during our off-list discussion. However,
after reviewing the recovery code - such as recoveryStopsBefore(),
which checks whether a consistent state is reached - I now believe
the manual’s definition of a consistent state may be incorrect.
A consistent state should be defined as the point where recovery
has reached minRecoveryPoint.

I now agree with you.

If we were to change the definition to match the manual,
we would also need to update various recovery checks,
which wouldn't be a trivial task.

Given that, I now think it's better to revive your v1 patch,
which introduces a new postmaster signal and improves the error message
when connections are not accepted during hot standby. I've attached
a revised version of the patch based on your v1. Thoughts?

Thank you for writing the patch!

Here are some comments on the documentation.

The following description in high-availability.sgml also seems to misuse
the word 'consistent':

When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
standby server, it will begin accepting connections once the recovery has
brought the system to a consistent state.

Since this is part of the "User's Overview" section, it may not be
appropriate to include too much detail.
How about rewording it to avoid using 'consistent', for example:

When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
standby server, it will begin accepting connections once it is ready.

+    delaying accepting read-only connections.  To enable hot standby,
+    a long-lived write transaction with more than 64 subtransactions
+    needs to be closed on the primary.

Would it be better to use 'transactions' in the plural form rather than
the singular?

- There may be more than one such transaction.
- The <itemizedlist> below also uses the plural form.
- The newly added message also uses the plural form:
   + errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server."

What do you think?

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

#16

Yugo Nagata

nagata@sraoss.co.jp

10 months ago

In reply to: Fujii Masao (#14)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On Mon, 31 Mar 2025 12:51:06 +0900
Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2025/03/28 0:13, torikoshia wrote:
On 2025-03-27 11:06, torikoshia wrote:
Hi,

I had another off-list discussion with Fujii-san, and according to the
following manual[1], it seems that a transaction with an overflowed
subtransaction is already considered inconsistent:

Reaching a consistent state can also be delayed in the presence of
both of these conditions:

- A write transaction has more than 64 subtransactions
- Very long-lived write transactions

IIUC, the manual suggests that both conditions must be met -- recovery
reaching at least minRecoveryPoint and no overflowed subtransactions
-- for the standby to be considered consistent.

OTOH, the following log message is emitted even when subtransactions
have overflowed, which appears to contradict the definition of
consistency mentioned above:

LOG: consistent recovery state reached

This log message is triggered when recovery progresses beyond
minRecoveryPoint (according to CheckRecoveryConsistency()).
However, since this state does not satisfy 'consistency' defined in
the manual, I think it would be more accurate to log that it has
merely reached the "minimum recovery point".
Furthermore, it may be better to emit the above log message only when
recovery has progressed beyond minRecoveryPoint and there are no
overflowed subtransactions.

Attached patch does this.

Additionally, renaming variables such as reachedConsistency in
CheckRecoveryConsistency might also be appropriate.
However, in the attached patch, I have left them unchanged for now.

On 2025-03-25 00:55, Fujii Masao wrote:
-                  case CAC_NOTCONSISTENT:
+                  case CAC_NOTCONSISTENT_OR_OVERFLOWED:
This new name seems a bit too long. I'm OK to leave the name as it is.
Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.
Beyond just the length issue, given the understanding outlined above,
I now think CAC_NOTCONSISTENT does not actually need to be changed.

In high-availability.sgml, the "Administrator's Overview" section already
describes the conditions for accepting hot standby connections.
This section should also be updated accordingly.

Agreed.
I have updated this section to mention that the resolution is to close
the problematic transaction.
OTOH the changes made in v2 patch seem unnecessary, since the concept
of 'consistent' is already explained in the "Administrator's
Overview."
-                             errdetail("Consistent recovery state has not been yet reached.")));
+                             errdetail("Consistent recovery state has not been yet
reached, or snappshot is pending because subtransaction is
overflowed."),
Given the above understanding, "or" is not appropriate in this
context, so I left this message unchanged.
Instead, I have added an errhint. The phrasing in the hint message
aligns with the manual, allowing users to search for this hint and
find the newly added resolution instructions.
On second thought, it may not be appropriate to show this output to clients attempting to connect. This message should be notified not to clients but to administrators.

From this point of view, it'd be better to output a message indicating the status inside ProcArrayApplyRecoveryInfo(). However, a straightforward implementation would result in the same message being logged every time an XLOG_RUNNING_XACTS WAL is received, making it noisy.

Instead of directly outputting a log indicating that a hot standby connection cannot be established due to subtransaction overflow, the attached patch updates the manual so that administrators can determine whether a subtransaction overflow has occurred based on the modified log output.

What do you think?
I had the same thought during our off-list discussion. However,
after reviewing the recovery code - such as recoveryStopsBefore(),
which checks whether a consistent state is reached - I now believe
the manual’s definition of a consistent state may be incorrect.
A consistent state should be defined as the point where recovery
has reached minRecoveryPoint.

If we were to change the definition to match the manual,
we would also need to update various recovery checks,
which wouldn't be a trivial task.

Given that, I now think it's better to revive your v1 patch,
which introduces a new postmaster signal and improves the error message
when connections are not accepted during hot standby. I've attached
a revised version of the patch based on your v1. Thoughts?

I prefer this approach, which clarifies in the documentation that
consistency and subtransaction overflow are separate concepts.

Here are minor comments on the patch:

-           case CAC_NOTCONSISTENT:
-               if (EnableHotStandby)
+           case CAC_NOTHOTSTANDBY:
+               if (!EnableHotStandby)
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not yet accepting connections"),
-                            errdetail("Consistent recovery state has not been yet reached.")));
+                            errdetail("Hot standby mode is disabled.")));
+               else if (reachedConsistency)
+                   ereport(FATAL,
+                           (errcode(ERRCODE_CANNOT_CONNECT_NOW),
+                            errmsg("the database system is not accepting connections"),
+                            errdetail("Recovery snapshot is not yet ready for hot standby."),
+                            errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server.",
+                                    PGPROC_MAX_CACHED_SUBXIDS)));
                else
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not accepting connections"),
-                            errdetail("Hot standby mode is disabled.")));
+                            errdetail("Consistent recovery state has not been yet reached.")));

The message says "the database system is not yet accepting connections" when "Hot standby mode is disabled".
I think "yet" is unnecessary in that case. Conversely, when the detail is "Recovery snapshot is not yet ready for hot standby"
or "Consistent recovery state has not been yet reached", it seems better to use "yet".
Regards,
Yugo Nagata

--
Yugo Nagata <nagata@sraoss.co.jp>

#17

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#15)

1 attachment(s)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/31 22:44, torikoshia wrote:

Here are some comments on the documentation.

Thanks for the review!

The following description in high-availability.sgml also seems to misuse the word 'consistent':

When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
standby server, it will begin accepting connections once the recovery has
brought the system to a consistent state.

Since this is part of the "User's Overview" section, it may not be appropriate to include too much detail.
How about rewording it to avoid using 'consistent', for example:

When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
standby server, it will begin accepting connections once it is ready.

"once it is ready." feels too vague to me. How about using
"once recovery has brought the system to a consistent state and
be ready for hot standby." instead?

+    delaying accepting read-only connections.  To enable hot standby,
+    a long-lived write transaction with more than 64 subtransactions
+    needs to be closed on the primary.
Would it be better to use 'transactions' in the plural form rather than the singular?

- There may be more than one such transaction.
- The <itemizedlist> below also uses the plural form.
- The newly added message also uses the plural form:
+ errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server."

What do you think?

I'm not sure whether the plural form is better here, but I've updated
the patch as suggested. Attached is the revised version.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachments:

v6-0001-Improve-error-message-when-standby-does-accept-co.patch (text/plain; charset=UTF-8)

From 2c83ae035c8093498b78243095180fa906a39c4a Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Tue, 1 Apr 2025 00:47:58 +0900
Subject: [PATCH v6] Improve error message when standby does accept
 connections.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Even after reaching the minimum recovery point, if there are long-lived
write transactions with 64 subtransactions on the primary, the recovery
snapshot may not yet be ready for hot standby, delaying read-only
connections on the standby. Previously, when read-only connections were
not accepted due to this condition, the following error message was logged:

    FATAL:  the database system is not yet accepting connections
    DETAIL:  Consistent recovery state has not been yet reached.

This DETAIL message was misleading because the following message was
already logged in this case:

    LOG:  consistent recovery state reached

This contradiction, i.e., indicating that the recovery state was consistent
while also stating it wasn’t, caused confusion.

This commit improves the error message to better reflect the actual state:

    FATAL: the database system is not yet accepting connections
    DETAIL: Recovery snapshot is not yet ready for hot standby.
    HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server.

To implement this, the commit introduces a new postmaster signal,
PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches
a consistent recovery state, it sends this signal to the postmaster,
allowing it to correctly recognize that state.

Since this is not a clear bug, the change is applied only to the master
branch and is not back-patched.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/02db8cd8e1f527a8b999b94a4bee3165@oss.nttdata.com
---
 doc/src/sgml/high-availability.sgml       | 12 ++++++++----
 src/backend/access/transam/xlogrecovery.c |  6 ++++++
 src/backend/postmaster/postmaster.c       | 12 +++++++++---
 src/backend/tcop/backend_startup.c        | 18 +++++++++++++-----
 src/include/storage/pmsignal.h            |  1 +
 src/include/tcop/backend_startup.h        |  2 +-
 6 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601d..b47d8b4106e 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1535,7 +1535,8 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
    <para>
     When the <xref linkend="guc-hot-standby"/> parameter is set to true on a
     standby server, it will begin accepting connections once the recovery has
-    brought the system to a consistent state.  All such connections are
+    brought the system to a consistent state and be ready for hot standby.
+    All such connections are
     strictly read-only; not even temporary tables may be written.
    </para>
 
@@ -1974,9 +1975,12 @@ LOG:  database system is ready to accept read-only connections
     Consistency information is recorded once per checkpoint on the primary.
     It is not possible to enable hot standby when reading WAL
     written during a period when <varname>wal_level</varname> was not set to
-    <literal>replica</literal> or <literal>logical</literal> on the primary.  Reaching
-    a consistent state can also be delayed in the presence of both of these
-    conditions:
+    <literal>replica</literal> or <literal>logical</literal> on the primary.
+    Even after reaching a consistent state, the recovery snapshot may not
+    be ready for hot standby if both of the following conditions are met,
+    delaying accepting read-only connections.  To enable hot standby,
+    long-lived write transactions with more than 64 subtransactions
+    need to be closed on the primary.
 
       <itemizedlist>
        <listitem>
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 0aa3ab59085..6ce979f2d8b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -291,6 +291,11 @@ static bool backupEndRequired = false;
  * Consistent state means that the system is internally consistent, all
  * the WAL has been replayed up to a certain point, and importantly, there
  * is no trace of later actions on disk.
+ *
+ * This flag is used only by the startup process and postmaster. When
+ * minRecoveryPoint is reached, the startup process sets it to true and
+ * sends a PMSIGNAL_RECOVERY_CONSISTENT signal to the postmaster,
+ * which then sets it to true upon receiving the signal.
  */
 bool		reachedConsistency = false;
 
@@ -2248,6 +2253,7 @@ CheckRecoveryConsistency(void)
 		CheckTablespaceDirectory();
 
 		reachedConsistency = true;
+		SendPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT);
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index c966c2e83af..3fe45de5da0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1825,8 +1825,7 @@ canAcceptConnections(BackendType backend_type)
 		else if (!FatalError && pmState == PM_STARTUP)
 			return CAC_STARTUP; /* normal startup */
 		else if (!FatalError && pmState == PM_RECOVERY)
-			return CAC_NOTCONSISTENT;	/* not yet at consistent recovery
-										 * state */
+			return CAC_NOTHOTSTANDBY;	/* not yet ready for hot standby */
 		else
 			return CAC_RECOVERY;	/* else must be crash recovery */
 	}
@@ -3699,6 +3698,7 @@ process_pm_pmsignal(void)
 		/* WAL redo has started. We're out of reinitialization. */
 		FatalError = false;
 		AbortStartTime = 0;
+		reachedConsistency = false;
 
 		/*
 		 * Start the archiver if we're responsible for (re-)archiving received
@@ -3724,8 +3724,14 @@ process_pm_pmsignal(void)
 		UpdatePMState(PM_RECOVERY);
 	}
 
-	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT) &&
 		pmState == PM_RECOVERY && Shutdown == NoShutdown)
+	{
+		reachedConsistency = true;
+	}
+
+	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
+		(pmState == PM_RECOVERY && Shutdown == NoShutdown))
 	{
 		ereport(LOG,
 				(errmsg("database system is ready to accept read-only connections")));
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index a07c59ece01..84e1c6f2831 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/xlog.h"
+#include "access/xlogrecovery.h"
 #include "common/ip.h"
 #include "common/string.h"
 #include "libpq/libpq.h"
@@ -306,17 +307,24 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 						(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 						 errmsg("the database system is starting up")));
 				break;
-			case CAC_NOTCONSISTENT:
-				if (EnableHotStandby)
+			case CAC_NOTHOTSTANDBY:
+				if (!EnableHotStandby)
+					ereport(FATAL,
+							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
+							 errmsg("the database system is not accepting connections"),
+							 errdetail("Hot standby mode is disabled.")));
+				else if (reachedConsistency)
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
 							 errmsg("the database system is not yet accepting connections"),
-							 errdetail("Consistent recovery state has not been yet reached.")));
+							 errdetail("Recovery snapshot is not yet ready for hot standby."),
+							 errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server.",
+									 PGPROC_MAX_CACHED_SUBXIDS)));
 				else
 					ereport(FATAL,
 							(errcode(ERRCODE_CANNOT_CONNECT_NOW),
-							 errmsg("the database system is not accepting connections"),
-							 errdetail("Hot standby mode is disabled.")));
+							 errmsg("the database system is not yet accepting connections"),
+							 errdetail("Consistent recovery state has not been yet reached.")));
 				break;
 			case CAC_SHUTDOWN:
 				ereport(FATAL,
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index d84a383047e..67fa9ac06e1 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -33,6 +33,7 @@
 typedef enum
 {
 	PMSIGNAL_RECOVERY_STARTED,	/* recovery has started */
+	PMSIGNAL_RECOVERY_CONSISTENT,	/* recovery has reached consistent state */
 	PMSIGNAL_BEGIN_HOT_STANDBY, /* begin Hot Standby */
 	PMSIGNAL_ROTATE_LOGFILE,	/* send SIGUSR1 to syslogger to rotate logfile */
 	PMSIGNAL_START_AUTOVAC_LAUNCHER,	/* start an autovacuum launcher */
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 578828c1caf..dcb9d056643 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -36,7 +36,7 @@ typedef enum CAC_state
 	CAC_STARTUP,
 	CAC_SHUTDOWN,
 	CAC_RECOVERY,
-	CAC_NOTCONSISTENT,
+	CAC_NOTHOTSTANDBY,
 	CAC_TOOMANY,
 } CAC_state;
 
-- 
2.48.1

#18

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: Yugo Nagata (#16)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/03/31 22:45, Yugo Nagata wrote:

I prefer this approach, which clarifies in the documentation that
consistency and subtransaction overflow are separate concepts.

Here are minor comments on the patch:

Thanks for the review!

-           case CAC_NOTCONSISTENT:
-               if (EnableHotStandby)
+           case CAC_NOTHOTSTANDBY:
+               if (!EnableHotStandby)
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not yet accepting connections"),
-                            errdetail("Consistent recovery state has not been yet reached.")));
+                            errdetail("Hot standby mode is disabled.")));
+               else if (reachedConsistency)
+                   ereport(FATAL,
+                           (errcode(ERRCODE_CANNOT_CONNECT_NOW),
+                            errmsg("the database system is not accepting connections"),
+                            errdetail("Recovery snapshot is not yet ready for hot standby."),
+                            errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server.",
+                                    PGPROC_MAX_CACHED_SUBXIDS)));
                else
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not accepting connections"),
-                            errdetail("Hot standby mode is disabled.")));
+                            errdetail("Consistent recovery state has not been yet reached.")));

I may have unintentionally modified the error message.
I fixed the patch as suggested. Please check the latest patch
I posted earlier in response to Torikoshi-san.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#19

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#18)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-04-01 01:12, Fujii Masao wrote:

On 2025/03/31 22:45, Yugo Nagata wrote:

I prefer this approach, which clarifies in the documentation that
consistency and subtransaction overflow are separate concepts.

Here are minor comments on the patch:

Thanks for the review!

-           case CAC_NOTCONSISTENT:
-               if (EnableHotStandby)
+           case CAC_NOTHOTSTANDBY:
+               if (!EnableHotStandby)
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not yet accepting connections"),
-                            errdetail("Consistent recovery state has not been yet reached.")));
+                            errdetail("Hot standby mode is disabled.")));
+               else if (reachedConsistency)
+                   ereport(FATAL,
+                           (errcode(ERRCODE_CANNOT_CONNECT_NOW),
+                            errmsg("the database system is not accepting connections"),
+                            errdetail("Recovery snapshot is not yet ready for hot standby."),
+                            errhint("To enable hot standby, close write transactions with more than %d subtransactions on the primary server.",
+                                    PGPROC_MAX_CACHED_SUBXIDS)));
                else
                    ereport(FATAL,
                            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                             errmsg("the database system is not accepting connections"),
-                            errdetail("Hot standby mode is disabled.")));
+                            errdetail("Consistent recovery state has not been yet reached.")));

The message says "the database system is not yet accepting
connections" when "Hot standby mode is disabled".
I think "yet" is unnecessary in that case. Conversely, when the detail
is "Recovery snapshot is not yet ready for hot standby"
or "Consistent recovery state has not been yet reached", it seems
better to use "yet".

I may have unintentionally modified the error message.
I fixed the patch as suggested. Please check the latest patch
I posted earlier in response to Torikoshi-san.

Thank you for updating the patch!

LGTM.
I feel like changing the status to 'Ready for Committer', but since
Nagata-san may have additional comments, I'm leaving it as 'Needs
Review'.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

#20

Fujii Masao

masao.fujii@oss.nttdata.com

10 months ago

In reply to: torikoshia (#19)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025/04/01 20:54, torikoshia wrote:

Thank you for updating the patch!

LGTM.

I've pushed the patch. Thanks!

I feel like changing the status to 'Ready for Committer', but since Nagata-san may have additional comments, I'm leaving it as 'Needs Review'.

If any issues arise, let's continue to address them.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#21

torikoshia

torikoshia@oss.nttdata.com

10 months ago

In reply to: Fujii Masao (#20)

Re: Change log level for notifying hot standby is waiting non-overflowed snapshot

On 2025-04-02 15:21, Fujii Masao wrote:

On 2025/04/01 20:54, torikoshia wrote:

Thank you for updating the patch!

LGTM.

I've pushed the patch. Thanks!

Thanks a lot!

I feel like changing the status to 'Ready for Committer', but since
Nagata-san may have additional comments, I'm leaving it as 'Needs
Review'.

If any issues arise, let's continue to address them.

OK.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.