Improve handling of parameter differences in physical replication

Started by Peter Eisentraut · almost 6 years ago · 27 messages
#1Peter Eisentraut
peter.eisentraut@2ndquadrant.com
1 attachment(s)

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record. The standby then checks whether its own settings are at least
as big as the ones on the primary. If not, the standby shuts down with
a fatal error.

The correspondence of settings between primary and standby is required
because those settings influence certain shared memory sizings that are
required for processing WAL records that the primary might send. For
example, if the primary sends a prepared transaction, the standby must
have had max_prepared_transactions set appropriately or it won't be able
to process those WAL records.
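
As a rough sketch (paraphrasing the check in CheckRequiredParameterValues() in xlog.c, not the exact code), the standby compares each local setting against the primary's value recorded in the control file, along these lines:

    RecoveryRequiresIntParameter("max_connections",
                                 MaxConnections, ControlFile->MaxConnections);
    RecoveryRequiresIntParameter("max_prepared_transactions",
                                 max_prepared_xacts, ControlFile->max_prepared_xacts);
    RecoveryRequiresIntParameter("max_locks_per_transaction",
                                 max_locks_per_xact, ControlFile->max_locks_per_xact);
    /* ...and similarly for max_worker_processes and max_wal_senders; each
     * call currently errors out if the local value is lower, which is what
     * shuts the standby down. */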

However, fatally shutting down the standby immediately upon receipt of
the parameter change record might be a bit of an overreaction. The
resources related to those settings are not required immediately at that
point, and might never be required if the activity on the primary does
not exhaust all those resources. An extreme example is raising
max_prepared_transactions on the primary but never actually using
prepared transactions.

Where this becomes a serious problem is if you have many standbys and
you do a failover. If the newly promoted standby happens to have a
higher setting for one of the relevant parameters, all the other
standbys that have followed it then shut down immediately and won't be
able to continue until you change all their settings.

If we didn't do the hard shutdown and we just let the standby roll on
with recovery, nothing bad will happen and it will eventually produce an
appropriate error when those resources are required (e.g., "maximum
number of prepared transactions reached").

So I think there are better ways to handle this. It might be reasonable
to provide options. The attached patch doesn't do that but it would be
pretty easy. What the attached patch does is:

Upon receipt of XLOG_PARAMETER_CHANGE, we still check the settings but
only issue a warning and set a global flag if there is a problem. Then
when we actually hit the resource issue and the flag was set, we issue
another warning message with relevant information. Additionally, at
that point we pause recovery instead of shutting down, so a hot standby
remains usable. (That could certainly be configurable.)

Btw., I think the current setup is slightly buggy. The MaxBackends
value that is used to size shared memory is computed as MaxConnections +
autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders, but
we don't track autovacuum_max_workers in WAL.
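
To illustrate the gap (compute_max_backends is a made-up name for this sketch, not the actual source):

    /* How MaxBackends is derived from the GUCs listed above.
     * autovacuum_max_workers participates in the sum but is not carried in
     * XLOG_PARAMETER_CHANGE, so the standby cannot verify it against the
     * primary's value. */
    static int
    compute_max_backends(int max_connections, int autovacuum_max_workers,
                         int max_worker_processes, int max_wal_senders)
    {
        return max_connections + autovacuum_max_workers + 1 +
               max_worker_processes + max_wal_senders;
    }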

(This patch was developed together with Simon Riggs.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v1-0001-Improve-handling-of-parameter-differences-in-phys.patch (text/plain)
From f5b4b7fd853b0dba2deea6b1e8290ae4c6df7081 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 27 Feb 2020 08:50:37 +0100
Subject: [PATCH v1] Improve handling of parameter differences in physical
 replication

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

The correspondence of settings between primary and standby is required
because those settings influence certain shared memory sizings that
are required for processing WAL records that the primary might send.
For example, if the primary sends a prepared transaction, the standby
must have had max_prepared_transaction set appropriately or it won't
be able to process those WAL records.

However, fatally shutting down the standby immediately upon receipt of
the parameter change record might be a bit of an overreaction.  The
resources related to those settings are not required immediately at
that point, and might never be required if the activity on the primary
does not exhaust all those resources.  If we just let the standby roll
on with recovery, it will eventually produce an appropriate error when
those resources are used.

So this patch relaxes this a bit.  Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem.  Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information.  Additionally, at that
point we pause recovery, so a hot standby remains usable.
---
 src/backend/access/transam/twophase.c |  3 ++
 src/backend/access/transam/xlog.c     | 56 ++++++++++++++++++++++-----
 src/backend/storage/ipc/procarray.c   | 15 ++++++-
 src/backend/storage/lmgr/lock.c       | 10 +++++
 src/include/access/xlog.h             |  1 +
 5 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 5adf956f41..fdac2ed69d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2360,11 +2360,14 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn,
 
 	/* Get a free gxact from the freelist */
 	if (TwoPhaseState->freeGXacts == NULL)
+	{
+		StandbyParamErrorPauseRecovery("max_prepared_transaction", max_prepared_xacts);
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("maximum number of prepared transactions reached"),
 				 errhint("Increase max_prepared_transactions (currently %d).",
 						 max_prepared_xacts)));
+	}
 	gxact = TwoPhaseState->freeGXacts;
 	TwoPhaseState->freeGXacts = gxact->next;
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d19408b3be..71c4f87511 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -257,6 +257,8 @@ bool		InArchiveRecovery = false;
 static bool standby_signal_file_found = false;
 static bool recovery_signal_file_found = false;
 
+static bool need_restart_for_parameter_values = false;
+
 /* Was the last xlog file restored from archive, or local? */
 static bool restoredFromArchive = false;
 
@@ -5970,6 +5972,36 @@ SetRecoveryPause(bool recoveryPause)
 	SpinLockRelease(&XLogCtl->info_lck);
 }
 
+/*
+ * If in hot standby, pause recovery because of a parameter conflict.
+ *
+ * Similar to recoveryPausesHere() but with a different messaging.  The user
+ * is expected to make the parameter change and restart the server.  If they
+ * just unpause recovery, they will then run into whatever error change is
+ * after this function call for the non-hot-standby case.
+ *
+ * param_name is the parameter at fault and currValue its current value, for
+ * producing a message.
+ */
+void
+StandbyParamErrorPauseRecovery(const char *param_name, int currValue)
+{
+	if (!AmStartupProcess() || !need_restart_for_parameter_values)
+		return;
+
+	ereport(WARNING,
+			(errmsg("recovery paused because of insufficient setting of parameter %s (currently %d)", param_name, currValue),
+			 errdetail("The value must be at least as high as on the primary server."),
+			 errhint("Recovery cannot continue unless the parameter is changed and the server restarted.")));
+
+	SetRecoveryPause(true);
+	while (RecoveryIsPaused())
+	{
+		pg_usleep(1000000L);    /* 1000 ms */
+		HandleStartupProcInterrupts();
+	}
+}
+
 /*
  * When recovery_min_apply_delay is set, we wait long enough to make sure
  * certain record types are applied at least that interval behind the master.
@@ -6149,16 +6181,20 @@ GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream)
  * Note that text field supplied is a parameter name and does not require
  * translation
  */
-#define RecoveryRequiresIntParameter(param_name, currValue, minValue) \
-do { \
-	if ((currValue) < (minValue)) \
-		ereport(ERROR, \
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the master server (its value was %d)", \
-						param_name, \
-						currValue, \
-						minValue))); \
-} while(0)
+static void
+RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
+{
+	if (currValue < minValue)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("insufficient setting for parameter %s", param_name),
+				 errdetail("%s = %d is a lower setting than on the master server (where its value was %d).",
+						   param_name, currValue, minValue),
+				 errhint("Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.")));
+		need_restart_for_parameter_values = true;
+	}
+}
 
 /*
  * Check to see if required parameters are set high enough on this server
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4a5b26c23d..5b14d4f1a3 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3653,7 +3653,20 @@ KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid,
 		 * If it still won't fit then we're out of memory
 		 */
 		if (head + nxids > pArray->maxKnownAssignedXids)
-			elog(ERROR, "too many KnownAssignedXids");
+		{
+			/*
+			 * The error messages here refer to max_connections, but any
+			 * setting that contributes to TOTAL_MAX_CACHED_SUBXIDS would
+			 * work.  But listing them all without differentiation would
+			 * probably be confusing.
+			 */
+			StandbyParamErrorPauseRecovery("max_connections", MaxConnections);
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of shared memory"),
+					 errdetail("There are no more KnownAssignedXids slots."),
+					 errhint("You might need to increase max_connections.")));
+		}
 	}
 
 	/* Now we can insert the xids into the space starting at head */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 56dba09299..b4658dac25 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -924,10 +924,13 @@ LockAcquireExtended(const LOCKTAG *locktag,
 			if (locallockp)
 				*locallockp = NULL;
 			if (reportMemoryError)
+			{
+				StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 				ereport(ERROR,
 						(errcode(ERRCODE_OUT_OF_MEMORY),
 						 errmsg("out of shared memory"),
 						 errhint("You might need to increase max_locks_per_transaction.")));
+			}
 			else
 				return LOCKACQUIRE_NOT_AVAIL;
 		}
@@ -962,10 +965,13 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		if (locallockp)
 			*locallockp = NULL;
 		if (reportMemoryError)
+		{
+			StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
 					 errhint("You might need to increase max_locks_per_transaction.")));
+		}
 		else
 			return LOCKACQUIRE_NOT_AVAIL;
 	}
@@ -2747,6 +2753,7 @@ FastPathGetRelationLockEntry(LOCALLOCK *locallock)
 		{
 			LWLockRelease(partitionLock);
 			LWLockRelease(&MyProc->backendLock);
+			StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
@@ -4077,6 +4084,7 @@ lock_twophase_recover(TransactionId xid, uint16 info,
 	if (!lock)
 	{
 		LWLockRelease(partitionLock);
+		StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("out of shared memory"),
@@ -4142,6 +4150,7 @@ lock_twophase_recover(TransactionId xid, uint16 info,
 				elog(PANIC, "lock table corrupted");
 		}
 		LWLockRelease(partitionLock);
+		StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("out of shared memory"),
@@ -4434,6 +4443,7 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
 		{
 			LWLockRelease(partitionLock);
 			LWLockRelease(&proc->backendLock);
+			StandbyParamErrorPauseRecovery("max_locks_per_transaction", max_locks_per_xact);
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..44465b3829 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -286,6 +286,7 @@ extern XLogRecPtr GetXLogInsertRecPtr(void);
 extern XLogRecPtr GetXLogWriteRecPtr(void);
 extern bool RecoveryIsPaused(void);
 extern void SetRecoveryPause(bool recoveryPause);
+extern void StandbyParamErrorPauseRecovery(const char *param_name, int currValue);
 extern TimestampTz GetLatestXTime(void);
 extern TimestampTz GetCurrentChunkReplayStartTime(void);
 
-- 
2.25.0

#2Sergei Kornilov
In reply to: Peter Eisentraut (#1)
Re: Improve handling of parameter differences in physical replication

Hello

Thank you for working on this!

Where this becomes a serious problem is if you have many standbys and you do a failover.

+1
Several times my team has wanted to pause recovery instead of getting a panic after changing settings on the primary. (The same goes for create_tablespace_directories replay errors...)

Is the behavior of shutting down the standby immediately upon receipt of the parameter change documented anywhere (outside the code)? doc/src/sgml/high-availability.sgml only says that the standby will "refuse to start".

regards, Sergei

#3Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Peter Eisentraut (#1)
Re: Improve handling of parameter differences in physical replication

On 2020/02/27 17:23, Peter Eisentraut wrote:

When certain parameters are changed on a physical replication primary,   this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL record.  The standby then checks whether its own settings are at least as big as the ones on the primary.  If not, the standby shuts down with a fatal error.

The correspondence of settings between primary and standby is required because those settings influence certain shared memory sizings that are required for processing WAL records that the primary might send.  For example, if the primary sends a prepared transaction, the standby must have had max_prepared_transaction set appropriately or it won't be able to process those WAL records.

However, fatally shutting down the standby immediately upon receipt of the parameter change record might be a bit of an overreaction.  The resources related to those settings are not required immediately at that point, and might never be required if the activity on the primary does not exhaust all those resources.  An extreme example is raising max_prepared_transactions on the primary but never actually using prepared transactions.

Where this becomes a serious problem is if you have many standbys and you do a failover.  If the newly promoted standby happens to have a higher setting for one of the relevant parameters, all the other standbys that have followed it then shut down immediately and won't be able to continue until you change all their settings.

If we didn't do the hard shutdown and we just let the standby roll on with recovery, nothing bad will happen and it will eventually produce an appropriate error when those resources are required (e.g., "maximum number of prepared transactions reached").

So I think there are better ways to handle this.  It might be reasonable to provide options.  The attached patch doesn't do that but it would be pretty easy.  What the attached patch does is:

Upon receipt of XLOG_PARAMETER_CHANGE, we still check the settings but only issue a warning and set a global flag if there is a problem.  Then when we actually hit the resource issue and the flag was set, we issue another warning message with relevant information.  Additionally, at that point we pause recovery instead of shutting down, so a hot standby remains usable.  (That could certainly be configurable.)

+1

Btw., I think the current setup is slightly buggy.  The MaxBackends value that is used to size shared memory is computed as MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders, but we don't track autovacuum_max_workers in WAL.

Maybe this is because autovacuum doesn't work during recovery?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Fujii Masao (#3)
Re: Improve handling of parameter differences in physical replication

On 2020-02-27 11:13, Fujii Masao wrote:

Btw., I think the current setup is slightly buggy.  The MaxBackends value that is used to size shared memory is computed as MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders, but we don't track autovacuum_max_workers in WAL.

Maybe this is because autovacuum doesn't work during recovery?

Autovacuum on the primary can use locks or xids, and so it's possible
that the standby when processing WAL encounters more of those than it
has locally allocated shared memory to handle.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#4)
Re: Improve handling of parameter differences in physical replication

On Thu, Feb 27, 2020 at 02:37:24PM +0100, Peter Eisentraut wrote:

On 2020-02-27 11:13, Fujii Masao wrote:

Btw., I think the current setup is slightly buggy.  The

MaxBackends value that is used to size shared memory is computed as
MaxConnections + autovacuum_max_workers + 1 + max_worker_processes +
max_wal_senders, but we don't track autovacuum_max_workers in WAL.

Maybe this is because autovacuum doesn't work during recovery?

Autovacuum on the primary can use locks or xids, and so it's possible that
the standby when processing WAL encounters more of those than it has locally
allocated shared memory to handle.

Putting aside your patch because that sounds like a separate issue..
Doesn't this mean that autovacuum_max_workers should be added to the
control file, that we need to record in WAL any updates done to it and
that CheckRequiredParameterValues() is wrong?
--
Michael

#6Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#5)
Re: Improve handling of parameter differences in physical replication

On 2020-02-28 08:45, Michael Paquier wrote:

On Thu, Feb 27, 2020 at 02:37:24PM +0100, Peter Eisentraut wrote:

On 2020-02-27 11:13, Fujii Masao wrote:

Btw., I think the current setup is slightly buggy.  The

MaxBackends value that is used to size shared memory is computed as
MaxConnections + autovacuum_max_workers + 1 + max_worker_processes +
max_wal_senders, but we don't track autovacuum_max_workers in WAL.

Maybe this is because autovacuum doesn't work during recovery?

Autovacuum on the primary can use locks or xids, and so it's possible that
the standby when processing WAL encounters more of those than it has locally
allocated shared memory to handle.

Putting aside your patch because that sounds like a separate issue..
Doesn't this mean that autovacuum_max_workers should be added to the
control file, that we need to record in WAL any updates done to it and
that CheckRequiredParameterValues() is wrong?

That would be a direct fix, yes.

Perhaps it might be better to track the combined MaxBackends instead,
however.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#6)
Re: Improve handling of parameter differences in physical replication

On Fri, Feb 28, 2020 at 08:49:08AM +0100, Peter Eisentraut wrote:

Perhaps it might be better to track the combined MaxBackends instead,
however.

Not sure about that. I think that we should keep them separated, as
that's more useful for debugging and more verbose for error reporting.

(Worth noting that max_prepared_xacts is separate because of its dummy
PGPROC entries created by PREPARE TRANSACTION, so it cannot be
included in the set).
--
Michael

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: Improve handling of parameter differences in physical replication

On 2020-Feb-27, Peter Eisentraut wrote:

So this patch relaxes this a bit. Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem. Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information. Additionally, at that
point we pause recovery, so a hot standby remains usable.

Hmm, so what is the actual end-user behavior? As I read the code, we
first send the WARNING, then pause recovery until the user resumes
replication; at that point we raise the original error. Presumably, at
that point the startup process terminates and is relaunched, and replay
continues normally. Is that it?

I think if the startup process terminates because of the original error,
after it is unpaused, postmaster will get that as a signal to do a
crash-recovery cycle, closing all existing connections. Is that right?
If so, it would be worth improving that (possibly by adding a
sigsetjmp() block) to avoid the disruption.
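
The sigsetjmp() idiom referred to here is the one used elsewhere in the backend (e.g., in PostgresMain()); as a rough sketch of the pattern only, not a proposed change, and assuming the usual backend headers:

    sigjmp_buf  local_sigjmp_buf;

    if (sigsetjmp(local_sigjmp_buf, 1) != 0)
    {
        /* An ereport(ERROR) longjmp'd back here instead of killing the process. */
        error_context_stack = NULL;
        EmitErrorReport();
        FlushErrorState();
        /* ...decide whether to pause, retry, or shut down... */
    }

    /* Arm the jump target; subsequent ereport(ERROR) calls return above. */
    PG_exception_stack = &local_sigjmp_buf;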

Thanks,

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Alvaro Herrera (#8)
Re: Improve handling of parameter differences in physical replication

On Sat, 29 Feb 2020 at 06:39, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2020-Feb-27, Peter Eisentraut wrote:

So this patch relaxes this a bit. Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem. Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information. Additionally, at that
point we pause recovery, so a hot standby remains usable.

Hmm, so what is the actual end-user behavior? As I read the code, we
first send the WARNING, then pause recovery until the user resumes
replication; at that point we raise the original error.

I think that after recovery is paused, users would be better off restarting the
server rather than resuming recovery. I agree with this idea, but I'm
slightly concerned that users might not realize that recovery is
paused until they look at that line in the server log or at
pg_stat_replication, because the standby server is still functional. So
I think we could periodically send a WARNING to inform users that we're
still waiting for the parameter change and restart.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Alvaro Herrera (#8)
Re: Improve handling of parameter differences in physical replication

On 2020-02-28 16:33, Alvaro Herrera wrote:

Hmm, so what is the actual end-user behavior? As I read the code, we
first send the WARNING, then pause recovery until the user resumes
replication; at that point we raise the original error. Presumably, at
that point the startup process terminates and is relaunched, and replay
continues normally. Is that it?

No, at that point you get the original (and still current) behavior: the server
instance shuts down with a fatal error.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Masahiko Sawada (#9)
Re: Improve handling of parameter differences in physical replication

On 2020-03-09 09:11, Masahiko Sawada wrote:

I think after recovery is paused users will be better to restart the
server rather than resume the recovery. I agree with this idea but I'm
slightly concerned that users might not realize that recovery is
paused until they look at that line in server log or at
pg_stat_replication because the standby server is still functional. So
I think we can periodically send WARNING to inform user that we're
still waiting for parameter change and restart.

I think that would be annoying, unless you create a system for
configuring those periodic warnings.

I imagine in a case like having set max_prepared_transactions but never
actually using prepared transactions, people will just ignore the
warning until they have their next restart, so it could be months of
periodic warnings.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Peter Eisentraut (#11)
Re: Improve handling of parameter differences in physical replication

On Mon, 9 Mar 2020 at 18:45, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-09 09:11, Masahiko Sawada wrote:

I think after recovery is paused users will be better to restart the
server rather than resume the recovery. I agree with this idea but I'm
slightly concerned that users might not realize that recovery is
paused until they look at that line in server log or at
pg_stat_replication because the standby server is still functional. So
I think we can periodically send WARNING to inform user that we're
still waiting for parameter change and restart.

I think that would be annoying, unless you create a system for
configuring those periodic warnings.

I imagine in a case like having set max_prepared_transactions but never
actually using prepared transactions, people will just ignore the
warning until they have their next restart, so it could be months of
periodic warnings.

Well, I meant periodically sending warning messages while waiting for
the parameter change, that is, after exhausting resources and stopping
recovery. In this situation users need to notice that as soon as
possible.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Masahiko Sawada (#12)
Re: Improve handling of parameter differences in physical replication

At Mon, 9 Mar 2020 21:13:38 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in

On Mon, 9 Mar 2020 at 18:45, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-03-09 09:11, Masahiko Sawada wrote:

I think after recovery is paused users will be better to restart the
server rather than resume the recovery. I agree with this idea but I'm
slightly concerned that users might not realize that recovery is
paused until they look at that line in server log or at
pg_stat_replication because the standby server is still functional. So
I think we can periodically send WARNING to inform user that we're
still waiting for parameter change and restart.

I think that would be annoying, unless you create a system for
configuring those periodic warnings.

I imagine in a case like having set max_prepared_transactions but never
actually using prepared transactions, people will just ignore the
warning until they have their next restart, so it could be months of
periodic warnings.

Well I meant to periodically send warning messages while waiting for
parameter change, that is after exhausting resources and stopping
recovery. In this situation user need to notice that as soon as
possible.

If we lose the connection, the standby continues to complain about the lost
connection every 5 seconds. This is a situation of that kind.

By the way, when I reduced max_connections only on the master and then took
exclusive locks until the standby complained about lock exhaustion, I saw a
WARNING referring to max_locks_per_transaction instead of
max_connections.

WARNING: insufficient setting for parameter max_connections
DETAIL: max_connections = 2 is a lower setting than on the master server (where its value was 3).
HINT: Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.
CONTEXT: WAL redo at 0/60000A0 for XLOG/PARAMETER_CHANGE: max_connections=3 max_worker_processes=8 max_wal_senders=2 max_prepared_xacts=0 max_locks_per_xact=10 wal_level=replica wal_log_hints=off track_commit_timestamp=off
WARNING: recovery paused because of insufficient setting of parameter max_locks_per_transaction (currently 10)
DETAIL: The value must be at least as high as on the primary server.
HINT: Recovery cannot continue unless the parameter is changed and the server restarted.
CONTEXT: WAL redo at 0/6004A80 for Standb

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#13)
Re: Improve handling of parameter differences in physical replication

On 2020-03-10 09:57, Kyotaro Horiguchi wrote:

Well I meant to periodically send warning messages while waiting for
parameter change, that is after exhausting resources and stopping
recovery. In this situation user need to notice that as soon as
possible.

If we lose connection, standby continues to complain about lost
connection every 5 seconds. This is a situation of that kind.

My argument is that it's not really the same. If a standby is
disconnected for more than a few minutes, it's really not going to be a
good standby anymore after a while. In this case, however, having
certain parameter discrepancies is really harmless and you can run with
it for a long time. I'm not strictly opposed to a periodic warning, but
it's unclear to me how we would find a good interval.

By the way, when I reduced max_connection only on master then take
exclusive locks until standby complains on lock exchaustion, I see a
WARNING that is saying max_locks_per_transaction instead of
max_connection.

WARNING: insufficient setting for parameter max_connections
DETAIL: max_connections = 2 is a lower setting than on the master server (where its value was 3).
HINT: Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.
CONTEXT: WAL redo at 0/60000A0 for XLOG/PARAMETER_CHANGE: max_connections=3 max_worker_processes=8 max_wal_senders=2 max_prepared_xacts=0 max_locks_per_xact=10 wal_level=replica wal_log_hints=off track_commit_timestamp=off
WARNING: recovery paused because of insufficient setting of parameter max_locks_per_transaction (currently 10)
DETAIL: The value must be at least as high as on the primary server.
HINT: Recovery cannot continue unless the parameter is changed and the server restarted.
CONTEXT: WAL redo at 0/6004A80 for Standb

This is all a web of half-truths. The lock tables are sized based on
max_locks_per_xact * (MaxBackends + max_prepared_xacts). So if you run
out of lock space, we currently recommend (in the single-server case),
that you raise max_locks_per_xact, but you could also raise
max_prepared_xacts or something else. So this is now the opposite case
where the lock table on the master was bigger because of max_connections.
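
A worked example with made-up numbers, only to show why the hint can point at the wrong knob:

    /* lock table slots ~= max_locks_per_xact * (MaxBackends + max_prepared_xacts) */
    int max_locks_per_xact = 64;           /* identical on primary and standby */
    int primary_slots = 64 * (120 + 0);    /* primary: MaxBackends = 120 -> 7680 slots */
    int standby_slots = 64 * (40 + 0);     /* standby: MaxBackends = 40  -> 2560 slots */
    /* The standby's lock table is smaller even though max_locks_per_transaction
     * matches, so replay can exhaust it while the hint suggests raising
     * max_locks_per_transaction rather than max_connections. */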

We could make the advice less specific and just say, in essence, you
need to make some parameter changes; see earlier for some hints.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Peter Eisentraut (#14)
Re: Improve handling of parameter differences in physical replication

At Tue, 10 Mar 2020 14:47:47 +0100, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in

On 2020-03-10 09:57, Kyotaro Horiguchi wrote:

Well I meant to periodically send warning messages while waiting for
parameter change, that is after exhausting resources and stopping
recovery. In this situation user need to notice that as soon as
possible.

If we lose connection, standby continues to complain about lost
connection every 5 seconds. This is a situation of that kind.

My argument is that it's not really the same. If a standby is
disconnected for more than a few minutes, it's really not going to be
a good standby anymore after a while. In this case, however, having
certain parameter discrepancies is really harmless and you can run
with it for a long time. I'm not strictly opposed to a periodic
warning, but it's unclear to me how we would find a good interval.

I meant the behavior after streaming is paused. That situation leads
to loss of WAL or to running out of WAL storage on the master. Actually,
5 seconds as an interval would be too frequent, but maybe we need at
least one message per WAL segment?

By the way, when I reduced max_connection only on master then take
exclusive locks until standby complains on lock exchaustion, I see a
WARNING that is saying max_locks_per_transaction instead of
max_connection.

...

WARNING: recovery paused because of insufficient setting of parameter
max_locks_per_transaction (currently 10)
DETAIL: The value must be at least as high as on the primary server.
HINT: Recovery cannot continue unless the parameter is changed and the
server restarted.
CONTEXT: WAL redo at 0/6004A80 for Standb

This is all a web of half-truths. The lock tables are sized based on
max_locks_per_xact * (MaxBackends + max_prepared_xacts). So if you
run out of lock space, we currently recommend (in the single-server
case), that you raise max_locks_per_xact, but you could also raise
max_prepared_xacts or something else. So this is now the opposite
case where the lock table on the master was bigger because of
max_connections.

Yeah, I know. So I'm not sure whether the checks on individual GUC
variables (other than wal_level) make sense. We might not even need
the WARNING on parameter change.

We could make the advice less specific and just say, in essence, you
need to make some parameter changes; see earlier for some hints.

In that sense the direction mentioned above seems sensible.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#15)
1 attachment(s)
Re: Improve handling of parameter differences in physical replication

Here is an updated patch that incorporates some of the suggestions. In
particular, some of the warning messages have been rephrased to be more
accurate (but also less specific), the warning message at recovery pause
now repeats every minute, and the documentation has been updated.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-Improve-handling-of-parameter-differences-in-phys.patch (text/plain)
From 3fb5c70b4b7e132cb75271bbd70c4a1523a72e86 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 11 Mar 2020 19:25:44 +0100
Subject: [PATCH v2] Improve handling of parameter differences in physical
 replication

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

The correspondence of settings between primary and standby is required
because those settings influence certain shared memory sizings that
are required for processing WAL records that the primary might send.
For example, if the primary sends a prepared transaction, the standby
must have had max_prepared_transaction set appropriately or it won't
be able to process those WAL records.

However, fatally shutting down the standby immediately upon receipt of
the parameter change record might be a bit of an overreaction.  The
resources related to those settings are not required immediately at
that point, and might never be required if the activity on the primary
does not exhaust all those resources.  If we just let the standby roll
on with recovery, it will eventually produce an appropriate error when
those resources are used.

So this patch relaxes this a bit.  Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem.  Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information.  Additionally, at that
point we pause recovery, so a hot standby remains usable.
---
 doc/src/sgml/high-availability.sgml   | 48 +++++++++++++-----
 src/backend/access/transam/twophase.c |  3 ++
 src/backend/access/transam/xlog.c     | 72 +++++++++++++++++++++++----
 src/backend/storage/ipc/procarray.c   |  9 +++-
 src/backend/storage/lmgr/lock.c       | 10 ++++
 src/include/access/xlog.h             |  1 +
 6 files changed, 120 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index bc4d98fe03..c5a9f9d5df 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2148,18 +2148,14 @@ <title>Administrator's Overview</title>
    </para>
 
    <para>
-    The setting of some parameters on the standby will need reconfiguration
-    if they have been changed on the primary. For these parameters,
-    the value on the standby must
-    be equal to or greater than the value on the primary.
-    Therefore, if you want to increase these values, you should do so on all
-    standby servers first, before applying the changes to the primary server.
-    Conversely, if you want to decrease these values, you should do so on the
-    primary server first, before applying the changes to all standby servers.
-    If these parameters
-    are not set high enough then the standby will refuse to start.
-    Higher values can then be supplied and the server
-    restarted to begin recovery again.  These parameters are:
+    The settings of some parameters determine the size of shared memory for
+    tracking transaction IDs, locks, and prepared transactions.  These shared
+    memory structures should be no smaller on a standby than on the primary.
+    Otherwise, it could happen that the standby runs out of shared memory
+    during recovery.  For example, if the primary uses a prepared transaction
+    but the standby did not allocate any shared memory for tracking prepared
+    transactions, then recovery will abort and cannot continue until the
+    standby's configuration is changed.  The parameters affected are:
 
       <itemizedlist>
        <listitem>
@@ -2188,6 +2184,34 @@ <title>Administrator's Overview</title>
         </para>
        </listitem>
       </itemizedlist>
+
+    The easiest way to ensure this does not become a problem is to have these
+    parameters set on the standbys to values equal to or greater than on the
+    primary.  Therefore, if you want to increase these values, you should do
+    so on all standby servers first, before applying the changes to the
+    primary server.  Conversely, if you want to decrease these values, you
+    should do so on the primary server first, before applying the changes to
+    all standby servers.  The WAL tracks changes to these parameters on the
+    primary, and if a standby processes WAL that indicates that the current
+    value on the primary is higher than its own value, it will log a warning, for example:
+<screen>
+WARNING:  insufficient setting for parameter max_connections
+DETAIL:  max_connections = 80 is a lower setting than on the master server (where its value was 100).
+HINT:  Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.
+</screen>
+    Recovery will continue but could abort at any time thereafter.  (It could
+    also never end up failing if the activity on the primary does not actually
+    require the full extent of the allocated shared memory resources.)  If
+    recovery reaches a point where it cannot continue due to lack of shared
+    memory, recovery will pause and another warning will be logged, for example:
+<screen>
+WARNING:  recovery paused because of insufficient parameter settings
+DETAIL:  See earlier in the log about which settings are insufficient.
+HINT:  Recovery cannot continue unless the configuration is changed and the server restarted.
+</screen>
+    This warning will repeated once a minute.  At that point, the settings on
+    the standby need to be updated and the instance restarted before recovery
+    can continue.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 5adf956f41..ce35f15f34 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2360,11 +2360,14 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn,
 
 	/* Get a free gxact from the freelist */
 	if (TwoPhaseState->freeGXacts == NULL)
+	{
+		StandbyParamErrorPauseRecovery();
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("maximum number of prepared transactions reached"),
 				 errhint("Increase max_prepared_transactions (currently %d).",
 						 max_prepared_xacts)));
+	}
 	gxact = TwoPhaseState->freeGXacts;
 	TwoPhaseState->freeGXacts = gxact->next;
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 614a25242b..3e825a2414 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -258,6 +258,8 @@ bool		InArchiveRecovery = false;
 static bool standby_signal_file_found = false;
 static bool recovery_signal_file_found = false;
 
+static bool need_restart_for_parameter_values = false;
+
 /* Was the last xlog file restored from archive, or local? */
 static bool restoredFromArchive = false;
 
@@ -5971,6 +5973,52 @@ SetRecoveryPause(bool recoveryPause)
 	SpinLockRelease(&XLogCtl->info_lck);
 }
 
+/*
+ * If in hot standby, pause recovery because of a parameter conflict.
+ *
+ * Similar to recoveryPausesHere() but with a different messaging.  The user
+ * is expected to make the parameter change and restart the server.  If they
+ * just unpause recovery, they will then run into whatever error change is
+ * after this function call for the non-hot-standby case.
+ *
+ * We intentionally do not give advice about specific parameters or values
+ * here because it might be misleading.  For example, if we run out of lock
+ * space, then in the single-server case we would recommend raising
+ * max_locks_per_transaction, but in recovery it could equally be the case
+ * that max_connections is out of sync with the primary.  If we get here, we
+ * have already logged any parameter discrepancy in
+ * RecoveryRequiresIntParameter(), so users can go back to that and get
+ * concrete and accurate information.
+ */
+void
+StandbyParamErrorPauseRecovery(void)
+{
+	TimestampTz last_warning = 0;
+
+	if (!AmStartupProcess() || !need_restart_for_parameter_values)
+		return;
+
+	SetRecoveryPause(true);
+
+	do
+	{
+		TimestampTz now = GetCurrentTimestamp();
+
+		if (TimestampDifferenceExceeds(last_warning, now, 60000))
+		{
+			ereport(WARNING,
+					(errmsg("recovery paused because of insufficient parameter settings"),
+					 errdetail("See earlier in the log about which settings are insufficient."),
+					 errhint("Recovery cannot continue unless the configuration is changed and the server restarted.")));
+			last_warning = now;
+		}
+
+		pg_usleep(1000000L);    /* 1000 ms */
+		HandleStartupProcInterrupts();
+	}
+	while (RecoveryIsPaused());
+}
+
 /*
  * When recovery_min_apply_delay is set, we wait long enough to make sure
  * certain record types are applied at least that interval behind the master.
@@ -6150,16 +6198,20 @@ GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream)
  * Note that text field supplied is a parameter name and does not require
  * translation
  */
-#define RecoveryRequiresIntParameter(param_name, currValue, minValue) \
-do { \
-	if ((currValue) < (minValue)) \
-		ereport(ERROR, \
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the master server (its value was %d)", \
-						param_name, \
-						currValue, \
-						minValue))); \
-} while(0)
+static void
+RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
+{
+	if (currValue < minValue)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("insufficient setting for parameter %s", param_name),
+				 errdetail("%s = %d is a lower setting than on the master server (where its value was %d).",
+						   param_name, currValue, minValue),
+				 errhint("Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.")));
+		need_restart_for_parameter_values = true;
+	}
+}
 
 /*
  * Check to see if required parameters are set high enough on this server
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index f45a619deb..cfb88db4a4 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3654,7 +3654,14 @@ KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid,
 		 * If it still won't fit then we're out of memory
 		 */
 		if (head + nxids > pArray->maxKnownAssignedXids)
-			elog(ERROR, "too many KnownAssignedXids");
+		{
+			StandbyParamErrorPauseRecovery();
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of shared memory"),
+					 errdetail("There are no more KnownAssignedXids slots."),
+					 errhint("You might need to increase max_connections.")));
+		}
 	}
 
 	/* Now we can insert the xids into the space starting at head */
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 56dba09299..cbcc7611f4 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -924,10 +924,13 @@ LockAcquireExtended(const LOCKTAG *locktag,
 			if (locallockp)
 				*locallockp = NULL;
 			if (reportMemoryError)
+			{
+				StandbyParamErrorPauseRecovery();
 				ereport(ERROR,
 						(errcode(ERRCODE_OUT_OF_MEMORY),
 						 errmsg("out of shared memory"),
 						 errhint("You might need to increase max_locks_per_transaction.")));
+			}
 			else
 				return LOCKACQUIRE_NOT_AVAIL;
 		}
@@ -962,10 +965,13 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		if (locallockp)
 			*locallockp = NULL;
 		if (reportMemoryError)
+		{
+			StandbyParamErrorPauseRecovery();
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
 					 errhint("You might need to increase max_locks_per_transaction.")));
+		}
 		else
 			return LOCKACQUIRE_NOT_AVAIL;
 	}
@@ -2747,6 +2753,7 @@ FastPathGetRelationLockEntry(LOCALLOCK *locallock)
 		{
 			LWLockRelease(partitionLock);
 			LWLockRelease(&MyProc->backendLock);
+			StandbyParamErrorPauseRecovery();
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
@@ -4077,6 +4084,7 @@ lock_twophase_recover(TransactionId xid, uint16 info,
 	if (!lock)
 	{
 		LWLockRelease(partitionLock);
+		StandbyParamErrorPauseRecovery();
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("out of shared memory"),
@@ -4142,6 +4150,7 @@ lock_twophase_recover(TransactionId xid, uint16 info,
 				elog(PANIC, "lock table corrupted");
 		}
 		LWLockRelease(partitionLock);
+		StandbyParamErrorPauseRecovery();
 		ereport(ERROR,
 				(errcode(ERRCODE_OUT_OF_MEMORY),
 				 errmsg("out of shared memory"),
@@ -4434,6 +4443,7 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
 		{
 			LWLockRelease(partitionLock);
 			LWLockRelease(&proc->backendLock);
+			StandbyParamErrorPauseRecovery();
 			ereport(ERROR,
 					(errcode(ERRCODE_OUT_OF_MEMORY),
 					 errmsg("out of shared memory"),
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..fae19a9d26 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -286,6 +286,7 @@ extern XLogRecPtr GetXLogInsertRecPtr(void);
 extern XLogRecPtr GetXLogWriteRecPtr(void);
 extern bool RecoveryIsPaused(void);
 extern void SetRecoveryPause(bool recoveryPause);
+extern void StandbyParamErrorPauseRecovery(void);
 extern TimestampTz GetLatestXTime(void);
 extern TimestampTz GetCurrentChunkReplayStartTime(void);
 
-- 
2.25.0

#17Masahiko Sawada
masahiko.sawada@2ndquadrant.com
In reply to: Peter Eisentraut (#16)
Re: Improve handling of parameter differences in physical replication

On Thu, 12 Mar 2020 at 04:34, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Here is an updated patch that incorporates some of the suggestions. In
particular, some of the warning messages have been rephrased to more
accurate (but also less specific), the warning message at recovery pause
repeats every 1 minute, and the documentation has been updated.

Thank you for updating the patch. I have one comment on the latest
version patch:

+   do
+   {
+       TimestampTz now = GetCurrentTimestamp();
+
+       if (TimestampDifferenceExceeds(last_warning, now, 60000))
+       {
+           ereport(WARNING,
+                   (errmsg("recovery paused because of insufficient parameter settings"),
+                    errdetail("See earlier in the log about which settings are insufficient."),
+                    errhint("Recovery cannot continue unless the configuration is changed and the server restarted.")));
+           last_warning = now;
+       }
+
+       pg_usleep(1000000L);    /* 1000 ms */
+       HandleStartupProcInterrupts();
+   }

I think we can set wait event WAIT_EVENT_RECOVERY_PAUSE here.
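
Something along these lines, mirroring what recoveryPausesHere() already does around its sleep (a sketch against the loop above, not the final code):

    pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
    pg_usleep(1000000L);        /* 1000 ms */
    pgstat_report_wait_end();
    HandleStartupProcInterrupts();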

The others look good to me.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18Sergei Kornilov
In reply to: Masahiko Sawada (#17)
Re: Improve handling of parameter differences in physical replication

Hello

I think we can set wait event WAIT_EVENT_RECOVERY_PAUSE here.

+1, since we added this in recoveryPausesHere.

PS: do we need to add a prototype for the RecoveryRequiresIntParameter function at the top of xlog.c?

regards, Sergei

#19Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#18)
Re: Improve handling of parameter differences in physical replication

On 2020-03-27 20:15, Sergei Kornilov wrote:

I think we can set wait event WAIT_EVENT_RECOVERY_PAUSE here.

+1, since we added this in recoveryPausesHere.

committed with that addition

PS: do we need to add a prototype for the RecoveryRequiredIntParameter function in top of xlog.c?

There is no consistent style, I think, but I usually only add prototypes
for static functions if they are required because of the ordering in the
file.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#20Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#19)
2 attachment(s)
Re: Improve handling of parameter differences in physical replication

Here is another stab at this subject.

This is a much simplified variant: When encountering a parameter change
in the WAL where the primary's value is higher than the standby's current
setting, we log a warning (instead of raising an error, as until now) and
pause recovery. If you resume (unpause) recovery, the instance shuts down as before.

This allows you to keep your standbys running for a bit (depending on
lag requirements) and schedule the required restart more deliberately.

I had previously suggested making this new behavior configurable, but
there didn't seem to be much interest in that, so I have not included
that here.

The documentation changes are mostly carried over from previous patch
versions (but adjusted for the actual behavior of the patch).

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v3-0001-Replace-a-macro-by-a-function.patch (text/plain)
From 2d3ffb221ae9418ebd7e1fc7fb771213014ca216 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 24 Jun 2020 08:49:42 +0200
Subject: [PATCH v3 1/2] Replace a macro by a function

Using a macro is ugly and not justified here.
---
 src/backend/access/transam/xlog.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a1256a103b..dcec2111b3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6232,16 +6232,17 @@ GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream)
  * Note that text field supplied is a parameter name and does not require
  * translation
  */
-#define RecoveryRequiresIntParameter(param_name, currValue, minValue) \
-do { \
-	if ((currValue) < (minValue)) \
-		ereport(ERROR, \
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the master server (its value was %d)", \
-						param_name, \
-						currValue, \
-						minValue))); \
-} while(0)
+static void
+RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
+{
+	if (currValue < minValue)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the master server (its value was %d)",
+						param_name,
+						currValue,
+						minValue)));
+}
 
 /*
  * Check to see if required parameters are set high enough on this server
-- 
2.27.0

v3-0002-Pause-recovery-for-insufficient-parameter-setting.patch (text/plain)
From 18740a021a90207423032cb7201f54650b81c15a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 24 Jun 2020 09:18:21 +0200
Subject: [PATCH v3 2/2] Pause recovery for insufficient parameter settings

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

This patch changes this behavior to pause recovery at that point
instead.  That allows read traffic on the standby to continue while
database administrators figure out next steps.  When recovery is
unpaused, the server shuts down (as before).  The idea is to fix the
parameters while recovery is paused and then restart when there is a
maintenance window.
---
 doc/src/sgml/high-availability.sgml | 47 +++++++++++++++++++++--------
 src/backend/access/transam/xlog.c   | 17 ++++++++++-
 2 files changed, 51 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 65c3fc62a9..d32ec6d9b3 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2159,18 +2159,14 @@ <title>Administrator's Overview</title>
    </para>
 
    <para>
-    The setting of some parameters on the standby will need reconfiguration
-    if they have been changed on the primary. For these parameters,
-    the value on the standby must
-    be equal to or greater than the value on the primary.
-    Therefore, if you want to increase these values, you should do so on all
-    standby servers first, before applying the changes to the primary server.
-    Conversely, if you want to decrease these values, you should do so on the
-    primary server first, before applying the changes to all standby servers.
-    If these parameters
-    are not set high enough then the standby will refuse to start.
-    Higher values can then be supplied and the server
-    restarted to begin recovery again.  These parameters are:
+    The settings of some parameters determine the size of shared memory for
+    tracking transaction IDs, locks, and prepared transactions.  These shared
+    memory structures must be no smaller on a standby than on the primary in
+    order to ensure that the standby does not run out of shared memory during
+    recovery.  For example, if the primary had used a prepared transaction but
+    the standby had not allocated any shared memory for tracking prepared
+    transactions, then recovery could not continue until the standby's
+    configuration is changed.  The parameters affected are:
 
       <itemizedlist>
        <listitem>
@@ -2199,6 +2195,33 @@ <title>Administrator's Overview</title>
         </para>
        </listitem>
       </itemizedlist>
+
+    The easiest way to ensure this does not become a problem is to have these
+    parameters set on the standbys to values equal to or greater than on the
+    primary.  Therefore, if you want to increase these values, you should do
+    so on all standby servers first, before applying the changes to the
+    primary server.  Conversely, if you want to decrease these values, you
+    should do so on the primary server first, before applying the changes to
+    all standby servers.  Keep in mind that when a standby is promoted, it
+    becomes the new reference for the required parameter settings for the
+    standbys that follow it.  Therefore, to avoid this becoming a problem
+    during a switchover or failover, it is recommended to keep these settings
+    the same on all standby servers.
+   </para>
+
+   <para>
+    The WAL tracks changes to these parameters on the
+    primary, and if a standby processes WAL that indicates that the current
+    value on the primary is higher than its own value, it will log a warning
+    and pause recovery, for example:
+<screen>
+WARNING:  hot standby is not possible because max_connections = 80 is a lower setting than on the master server (its value was 100)
+LOG:  recovery will be paused
+DETAIL:  If recovery is unpaused, the server will shut down.
+HINT:  You can then restart the server after making the necessary configuration changes.
+</screen>
+    At that point, the settings on the standby need to be updated and the
+    instance restarted before recovery can continue.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dcec2111b3..e0e22e26f4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6236,12 +6236,27 @@ static void
 RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
 {
 	if (currValue < minValue)
-		ereport(ERROR,
+	{
+		ereport(WARNING,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the master server (its value was %d)",
 						param_name,
 						currValue,
 						minValue)));
+		ereport(LOG,
+				(errmsg("recovery will be paused"),
+				 errdetail("If recovery is unpaused, the server will shut down."),
+				 errhint("You can then restart the server after making the necessary configuration changes.")));
+		SetRecoveryPause(true);
+		while (RecoveryIsPaused())
+		{
+			pg_usleep(1000000L);	/* 1000 ms */
+			HandleStartupProcInterrupts();
+		}
+		ereport(FATAL,
+				(errmsg("recovery aborted because of insufficient parameter settings"),
+				 errhint("You can restart the server after making the necessary configuration changes.")));
+	}
 }
 
 /*
-- 
2.27.0

#21Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#20)
2 attachment(s)
Re: Improve handling of parameter differences in physical replication

Here is a minimally updated new patch version to resolve a merge conflict.

On 2020-06-24 10:00, Peter Eisentraut wrote:

Here is another stab at this subject.

This is a much simplified variant: When encountering a parameter change
in the WAL that requires a higher value than the standby's current
setting, we log a warning (rather than the error issued until now) and
pause recovery. If you resume (unpause) recovery, the instance shuts
down as before.

This allows you to keep your standbys running for a bit (depending on
lag requirements) and schedule the required restart more deliberately.

I had previously suggested making this new behavior configurable, but
there didn't seem to be much interest in that, so I have not included
that here.

The documentation changes are mostly carried over from previous patch
versions (but adjusted for the actual behavior of the patch).

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v4-0001-Replace-a-macro-by-a-function.patch (text/plain)
From 6dfa5445eed5c4bf601577df1a874bf10b0e562a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 24 Jun 2020 08:49:42 +0200
Subject: [PATCH v4 1/2] Replace a macro by a function

Using a macro is ugly and not justified here.
---
 src/backend/access/transam/xlog.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0a97b1d37f..b9d011fa8d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6235,16 +6235,17 @@ GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream)
  * Note that text field supplied is a parameter name and does not require
  * translation
  */
-#define RecoveryRequiresIntParameter(param_name, currValue, minValue) \
-do { \
-	if ((currValue) < (minValue)) \
-		ereport(ERROR, \
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the primary server (its value was %d)", \
-						param_name, \
-						currValue, \
-						minValue))); \
-} while(0)
+static void
+RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
+{
+	if (currValue < minValue)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the primary server (its value was %d)",
+						param_name,
+						currValue,
+						minValue)));
+}
 
 /*
  * Check to see if required parameters are set high enough on this server
-- 
2.27.0

v4-0002-Pause-recovery-for-insufficient-parameter-setting.patch (text/plain)
From 4b5812c41fe581042249a1af43ac96df75f14a23 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 24 Jun 2020 09:18:21 +0200
Subject: [PATCH v4 2/2] Pause recovery for insufficient parameter settings

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

This patch changes this behavior to pause recovery at that point
instead.  That allows read traffic on the standby to continue while
database administrators figure out next steps.  When recovery is
unpaused, the server shuts down (as before).  The idea is to fix the
parameters while recovery is paused and then restart when there is a
maintenance window.
---
 doc/src/sgml/high-availability.sgml | 47 +++++++++++++++++++++--------
 src/backend/access/transam/xlog.c   | 17 ++++++++++-
 2 files changed, 51 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 6a9184f314..5aa8eeced2 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2158,18 +2158,14 @@ <title>Administrator's Overview</title>
    </para>
 
    <para>
-    The setting of some parameters on the standby will need reconfiguration
-    if they have been changed on the primary. For these parameters,
-    the value on the standby must
-    be equal to or greater than the value on the primary.
-    Therefore, if you want to increase these values, you should do so on all
-    standby servers first, before applying the changes to the primary server.
-    Conversely, if you want to decrease these values, you should do so on the
-    primary server first, before applying the changes to all standby servers.
-    If these parameters
-    are not set high enough then the standby will refuse to start.
-    Higher values can then be supplied and the server
-    restarted to begin recovery again.  These parameters are:
+    The settings of some parameters determine the size of shared memory for
+    tracking transaction IDs, locks, and prepared transactions.  These shared
+    memory structures must be no smaller on a standby than on the primary in
+    order to ensure that the standby does not run out of shared memory during
+    recovery.  For example, if the primary had used a prepared transaction but
+    the standby had not allocated any shared memory for tracking prepared
+    transactions, then recovery could not continue until the standby's
+    configuration is changed.  The parameters affected are:
 
       <itemizedlist>
        <listitem>
@@ -2198,6 +2194,33 @@ <title>Administrator's Overview</title>
         </para>
        </listitem>
       </itemizedlist>
+
+    The easiest way to ensure this does not become a problem is to have these
+    parameters set on the standbys to values equal to or greater than on the
+    primary.  Therefore, if you want to increase these values, you should do
+    so on all standby servers first, before applying the changes to the
+    primary server.  Conversely, if you want to decrease these values, you
+    should do so on the primary server first, before applying the changes to
+    all standby servers.  Keep in mind that when a standby is promoted, it
+    becomes the new reference for the required parameter settings for the
+    standbys that follow it.  Therefore, to avoid this becoming a problem
+    during a switchover or failover, it is recommended to keep these settings
+    the same on all standby servers.
+   </para>
+
+   <para>
+    The WAL tracks changes to these parameters on the
+    primary, and if a standby processes WAL that indicates that the current
+    value on the primary is higher than its own value, it will log a warning
+    and pause recovery, for example:
+<screen>
+WARNING:  hot standby is not possible because max_connections = 80 is a lower setting than on the master server (its value was 100)
+LOG:  recovery will be paused
+DETAIL:  If recovery is unpaused, the server will shut down.
+HINT:  You can then restart the server after making the necessary configuration changes.
+</screen>
+    At that point, the settings on the standby need to be updated and the
+    instance restarted before recovery can continue.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b9d011fa8d..d5ae40cdf7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6239,12 +6239,27 @@ static void
 RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
 {
 	if (currValue < minValue)
-		ereport(ERROR,
+	{
+		ereport(WARNING,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the primary server (its value was %d)",
 						param_name,
 						currValue,
 						minValue)));
+		ereport(LOG,
+				(errmsg("recovery will be paused"),
+				 errdetail("If recovery is unpaused, the server will shut down."),
+				 errhint("You can then restart the server after making the necessary configuration changes.")));
+		SetRecoveryPause(true);
+		while (RecoveryIsPaused())
+		{
+			pg_usleep(1000000L);	/* 1000 ms */
+			HandleStartupProcInterrupts();
+		}
+		ereport(FATAL,
+				(errmsg("recovery aborted because of insufficient parameter settings"),
+				 errhint("You can restart the server after making the necessary configuration changes.")));
+	}
 }
 
 /*
-- 
2.27.0

#22Sergei Kornilov
In reply to: Peter Eisentraut (#21)
Re: Improve handling of parameter differences in physical replication

Hello

Thank you! I'm on vacation, so I was finally able to review the patch.

It seems the WAIT_EVENT_RECOVERY_PAUSE addition was lost during the patch simplification.
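Something along the lines of what recoveryPausesHere() already does, i.e. report the wait event around the sleep (just a rough, untested sketch):

	while (RecoveryIsPaused())
	{
		HandleStartupProcInterrupts();
		pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
		pg_usleep(1000000L);	/* 1 second */
		pgstat_report_wait_end();
	}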

ereport(FATAL,
(errmsg("recovery aborted because of insufficient parameter settings"),
errhint("You can restart the server after making the necessary configuration changes.")));

I think we should repeat the conflicting param_name and minValue here. pg_wal_replay_resume can be called days after recovery was paused, and the initial message can be difficult to find.
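Something like this, so the reason is visible right next to the shutdown (only a sketch of what I mean):

	ereport(FATAL,
			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			 errmsg("recovery aborted because of insufficient parameter settings"),
			 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
					   param_name, currValue, minValue),
			 errhint("You can restart the server after making the necessary configuration changes.")));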

errmsg("recovery will be paused")

Maybe use the same "recovery has paused" message as in recoveryPausesHere? It doesn't seem to make any difference since we set the pause right after that, but it would be a little less work for translators.

I'm not sure about "If recovery is unpaused". The word "resumed" seems to be the one usually used in the docs.

regards, Sergei

#23Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#22)
1 attachment(s)
Re: Improve handling of parameter differences in physical replication

On 2020-11-19 20:17, Sergei Kornilov wrote:

It seems the WAIT_EVENT_RECOVERY_PAUSE addition was lost during the patch simplification.

added

ereport(FATAL,
(errmsg("recovery aborted because of insufficient parameter settings"),
errhint("You can restart the server after making the necessary configuration changes.")));

I think we should repeat the conflicting param_name and minValue here. pg_wal_replay_resume can be called days after recovery was paused, and the initial message can be difficult to find.

done

errmsg("recovery will be paused")

Maybe use the same "recovery has paused" message as in recoveryPausesHere? It doesn't seem to make any difference since we set the pause right after that, but it would be a little less work for translators.

done

I'm not sure about "If recovery is unpaused". The word "resumed" seems to be the one usually used in the docs.

I think I like "unpaused" better here, because "resumed" would seem to
imply that recovery can actually continue.

One thing that has not been added to my patch is the equivalent of
496ee647ecd2917369ffcf1eaa0b2cdca07c8730, which allows promotion while
recovery is paused. I'm not sure that would be necessary, and it
doesn't look easy to add either.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachments:

v5-0001-Pause-recovery-for-insufficient-parameter-setting.patch (text/plain)
From 99724e2ee14b5f3ec926c7afdc056863a7e2294f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Fri, 20 Nov 2020 13:39:09 +0100
Subject: [PATCH v5] Pause recovery for insufficient parameter settings

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

This patch changes this behavior to pause recovery at that point
instead.  That allows read traffic on the standby to continue while
database administrators figure out next steps.  When recovery is
unpaused, the server shuts down (as before).  The idea is to fix the
parameters while recovery is paused and then restart when there is a
maintenance window.

Discussion: https://www.postgresql.org/message-id/flat/4ad69a4c-cc9b-0dfe-0352-8b1b0cd36c7b@2ndquadrant.com
---
 doc/src/sgml/high-availability.sgml | 48 +++++++++++++++++++++--------
 src/backend/access/transam/xlog.c   | 38 ++++++++++++++++++++---
 2 files changed, 69 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 19d7bd2b28..e9a30dd88b 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2120,18 +2120,14 @@ <title>Administrator's Overview</title>
    </para>
 
    <para>
-    The setting of some parameters on the standby will need reconfiguration
-    if they have been changed on the primary. For these parameters,
-    the value on the standby must
-    be equal to or greater than the value on the primary.
-    Therefore, if you want to increase these values, you should do so on all
-    standby servers first, before applying the changes to the primary server.
-    Conversely, if you want to decrease these values, you should do so on the
-    primary server first, before applying the changes to all standby servers.
-    If these parameters
-    are not set high enough then the standby will refuse to start.
-    Higher values can then be supplied and the server
-    restarted to begin recovery again.  These parameters are:
+    The settings of some parameters determine the size of shared memory for
+    tracking transaction IDs, locks, and prepared transactions.  These shared
+    memory structures must be no smaller on a standby than on the primary in
+    order to ensure that the standby does not run out of shared memory during
+    recovery.  For example, if the primary had used a prepared transaction but
+    the standby had not allocated any shared memory for tracking prepared
+    transactions, then recovery could not continue until the standby's
+    configuration is changed.  The parameters affected are:
 
       <itemizedlist>
        <listitem>
@@ -2160,6 +2156,34 @@ <title>Administrator's Overview</title>
         </para>
        </listitem>
       </itemizedlist>
+
+    The easiest way to ensure this does not become a problem is to have these
+    parameters set on the standbys to values equal to or greater than on the
+    primary.  Therefore, if you want to increase these values, you should do
+    so on all standby servers first, before applying the changes to the
+    primary server.  Conversely, if you want to decrease these values, you
+    should do so on the primary server first, before applying the changes to
+    all standby servers.  Keep in mind that when a standby is promoted, it
+    becomes the new reference for the required parameter settings for the
+    standbys that follow it.  Therefore, to avoid this becoming a problem
+    during a switchover or failover, it is recommended to keep these settings
+    the same on all standby servers.
+   </para>
+
+   <para>
+    The WAL tracks changes to these parameters on the
+    primary, and if a standby processes WAL that indicates that the current
+    value on the primary is higher than its own value, it will log a warning
+    and pause recovery, for example:
+<screen>
+WARNING:  hot standby is not possible because of insufficient parameter settings
+DETAIL:  max_connections = 80 is a lower setting than on the primary server, where its value was 100.
+LOG:  recovery has paused
+DETAIL:  If recovery is unpaused, the server will shut down.
+HINT:  You can then restart the server after making the necessary configuration changes.
+</screen>
+    At that point, the settings on the standby need to be updated and the
+    instance restarted before recovery can continue.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13f1d8c3dc..a3519d50e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6232,12 +6232,40 @@ static void
 RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
 {
 	if (currValue < minValue)
-		ereport(ERROR,
+	{
+		ereport(WARNING,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the primary server (its value was %d)",
-						param_name,
-						currValue,
-						minValue)));
+				 errmsg("hot standby is not possible because of insufficient parameter settings"),
+				 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
+						   param_name,
+						   currValue,
+						   minValue)));
+
+		SetRecoveryPause(true);
+
+		ereport(LOG,
+				(errmsg("recovery has paused"),
+				 errdetail("If recovery is unpaused, the server will shut down."),
+				 errhint("You can then restart the server after making the necessary configuration changes.")));
+
+		while (RecoveryIsPaused())
+		{
+			HandleStartupProcInterrupts();
+			pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
+			pg_usleep(1000000L);	/* 1000 ms */
+			pgstat_report_wait_end();
+		}
+
+		ereport(FATAL,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("recovery aborted because of insufficient parameter settings"),
+				 /* Repeat the detail from above so it's easy to find in the log. */
+				 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
+						   param_name,
+						   currValue,
+						   minValue),
+				 errhint("You can restart the server after making the necessary configuration changes.")));
+	}
 }
 
 /*
-- 
2.29.2

#24Sergei Kornilov
In reply to: Masahiko Sawada (#12)
Re: Improve handling of parameter differences in physical replication

Hello

I think I like "unpaused" better here, because "resumed" would seem to
imply that recovery can actually continue.

Good, I agree.

One thing that has not been added to my patch is the equivalent of
496ee647ecd2917369ffcf1eaa0b2cdca07c8730, which allows promotion while
recovery is paused. I'm not sure that would be necessary, and it
doesn't look easy to add either.

Hmm... Good question. How about putting CheckForStandbyTrigger() in a wait loop, but reporting FATAL with an appropriate message, such as "promotion is not possible because of insufficient parameter settings"?
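Roughly like this (untested sketch; the wait-event reporting is omitted for brevity):

	while (RecoveryIsPaused())
	{
		HandleStartupProcInterrupts();

		if (CheckForStandbyTrigger())
			ereport(FATAL,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("promotion is not possible because of insufficient parameter settings"),
					 errhint("Restart the server after making the necessary configuration changes.")));

		pg_usleep(1000000L);	/* 1 second */
	}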
It would also suit me if we just documented that we ignore promote requests here. I don't think this is an important case. And yes, it's not easy to allow promotion, since we have already updated the control file.

Probably we need to pause only after we have reached consistency?

2020-11-20 18:10:23.617 MSK 19722 @ from [vxid: txid:0] [] LOG: entering standby mode
2020-11-20 18:10:23.632 MSK 19722 @ from [vxid: txid:0] [] WARNING: hot standby is not possible because of insufficient parameter settings
2020-11-20 18:10:23.632 MSK 19722 @ from [vxid: txid:0] [] DETAIL: max_connections = 100 is a lower setting than on the primary server, where its value was 150.
2020-11-20 18:10:23.632 MSK 19722 @ from [vxid: txid:0] [] LOG: recovery has paused
2020-11-20 18:10:23.632 MSK 19722 @ from [vxid: txid:0] [] DETAIL: If recovery is unpaused, the server will shut down.
2020-11-20 18:10:23.632 MSK 19722 @ from [vxid: txid:0] [] HINT: You can then restart the server after making the necessary configuration changes.
2020-11-20 18:13:09.767 MSK 19755 melkij@postgres from [local] [vxid: txid:0] [] FATAL: the database system is starting up
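In other words, only enter the pause once the standby can actually serve read-only queries, and otherwise keep the immediate fatal exit. Roughly (untested sketch; reachedConsistency is the existing flag in xlog.c, and a hot-standby-active check would work similarly):

	if (currValue < minValue)
	{
		if (reachedConsistency)
		{
			/* warn, then pause until the administrator resumes recovery */
			SetRecoveryPause(true);
			while (RecoveryIsPaused())
			{
				HandleStartupProcInterrupts();
				pg_usleep(1000000L);	/* 1 second */
			}
		}

		/* not yet consistent, or recovery was resumed: shut down as before */
		ereport(FATAL,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("recovery aborted because of insufficient parameter settings")));
	}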

regards, Sergei

#25Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#24)
1 attachment(s)
Re: Improve handling of parameter differences in physical replication

On 2020-11-20 16:47, Sergei Kornilov wrote:

Hmm... Good question. How about putting CheckForStandbyTrigger() in a wait loop, but reporting FATAL with an appropriate message, such as "promotion is not possible because of insufficient parameter settings"?
It would also suit me if we just documented that we ignore promote requests here. I don't think this is an important case. And yes, it's not easy to allow promotion, since we have already updated the control file.

Probably we need to pause only after we have reached consistency?

Here is an updated patch that implements both of these points. I have
used hot standby active instead of reached consistency. I guess
arguments could be made either way, but the original use case really
cared about hot standby.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachments:

v6-0001-Pause-recovery-for-insufficient-parameter-setting.patch (text/plain)
From 2d93c2918af7ff186aa4a6a2327286b6eb1a2859 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 30 Nov 2020 19:21:40 +0100
Subject: [PATCH v6] Pause recovery for insufficient parameter settings

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

This patch changes this behavior for hot standbys to pause recovery at
that point instead.  That allows read traffic on the standby to
continue while database administrators figure out next steps.  When
recovery is unpaused, the server shuts down (as before).  The idea is
to fix the parameters while recovery is paused and then restart when
there is a maintenance window.

Discussion: https://www.postgresql.org/message-id/flat/4ad69a4c-cc9b-0dfe-0352-8b1b0cd36c7b@2ndquadrant.com
---
 doc/src/sgml/high-availability.sgml | 51 +++++++++++++++++++------
 src/backend/access/transam/xlog.c   | 59 ++++++++++++++++++++++++++---
 2 files changed, 93 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 19d7bd2b28..71f7f58cb0 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2120,18 +2120,14 @@ <title>Administrator's Overview</title>
    </para>
 
    <para>
-    The setting of some parameters on the standby will need reconfiguration
-    if they have been changed on the primary. For these parameters,
-    the value on the standby must
-    be equal to or greater than the value on the primary.
-    Therefore, if you want to increase these values, you should do so on all
-    standby servers first, before applying the changes to the primary server.
-    Conversely, if you want to decrease these values, you should do so on the
-    primary server first, before applying the changes to all standby servers.
-    If these parameters
-    are not set high enough then the standby will refuse to start.
-    Higher values can then be supplied and the server
-    restarted to begin recovery again.  These parameters are:
+    The settings of some parameters determine the size of shared memory for
+    tracking transaction IDs, locks, and prepared transactions.  These shared
+    memory structures must be no smaller on a standby than on the primary in
+    order to ensure that the standby does not run out of shared memory during
+    recovery.  For example, if the primary had used a prepared transaction but
+    the standby had not allocated any shared memory for tracking prepared
+    transactions, then recovery could not continue until the standby's
+    configuration is changed.  The parameters affected are:
 
       <itemizedlist>
        <listitem>
@@ -2160,6 +2156,37 @@ <title>Administrator's Overview</title>
         </para>
        </listitem>
       </itemizedlist>
+
+    The easiest way to ensure this does not become a problem is to have these
+    parameters set on the standbys to values equal to or greater than on the
+    primary.  Therefore, if you want to increase these values, you should do
+    so on all standby servers first, before applying the changes to the
+    primary server.  Conversely, if you want to decrease these values, you
+    should do so on the primary server first, before applying the changes to
+    all standby servers.  Keep in mind that when a standby is promoted, it
+    becomes the new reference for the required parameter settings for the
+    standbys that follow it.  Therefore, to avoid this becoming a problem
+    during a switchover or failover, it is recommended to keep these settings
+    the same on all standby servers.
+   </para>
+
+   <para>
+    The WAL tracks changes to these parameters on the
+    primary.  If a hot standby processes WAL that indicates that the current
+    value on the primary is higher than its own value, it will log a warning
+    and pause recovery, for example:
+<screen>
+WARNING:  hot standby is not possible because of insufficient parameter settings
+DETAIL:  max_connections = 80 is a lower setting than on the primary server, where its value was 100.
+LOG:  recovery has paused
+DETAIL:  If recovery is unpaused, the server will shut down.
+HINT:  You can then restart the server after making the necessary configuration changes.
+</screen>
+    At that point, the settings on the standby need to be updated and the
+    instance restarted before recovery can continue.  If the standby is not a
+    hot standby, then when it encounters the incompatible parameter change, it
+    will shut down immediately without pausing, since there is then no value
+    in keeping it up.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13f1d8c3dc..f9a41e0dfb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6232,12 +6232,61 @@ static void
 RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue)
 {
 	if (currValue < minValue)
-		ereport(ERROR,
+	{
+		if (LocalHotStandbyActive)
+		{
+			bool		warned_for_promote = false;
+
+			ereport(WARNING,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("hot standby is not possible because of insufficient parameter settings"),
+					 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
+							   param_name,
+							   currValue,
+							   minValue)));
+
+			SetRecoveryPause(true);
+
+			ereport(LOG,
+					(errmsg("recovery has paused"),
+					 errdetail("If recovery is unpaused, the server will shut down."),
+					 errhint("You can then restart the server after making the necessary configuration changes.")));
+
+			while (RecoveryIsPaused())
+			{
+				HandleStartupProcInterrupts();
+
+				if (CheckForStandbyTrigger())
+				{
+					if (!warned_for_promote)
+						ereport(WARNING,
+								(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+								 errmsg("promotion is not possible because of insufficient parameter settings"),
+								 /* Repeat the detail from above so it's easy to find in the log. */
+								 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
+										   param_name,
+										   currValue,
+										   minValue),
+								 errhint("Restart the server after making the necessary configuration changes.")));
+					warned_for_promote = true;
+				}
+
+				pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
+				pg_usleep(1000000L);	/* 1000 ms */
+				pgstat_report_wait_end();
+			}
+		}
+
+		ereport(FATAL,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("hot standby is not possible because %s = %d is a lower setting than on the primary server (its value was %d)",
-						param_name,
-						currValue,
-						minValue)));
+				 errmsg("recovery aborted because of insufficient parameter settings"),
+				 /* Repeat the detail from above so it's easy to find in the log. */
+				 errdetail("%s = %d is a lower setting than on the primary server, where its value was %d.",
+						   param_name,
+						   currValue,
+						   minValue),
+				 errhint("You can restart the server after making the necessary configuration changes.")));
+	}
 }
 
 /*
-- 
2.29.2

#26Sergei Kornilov
In reply to: Peter Eisentraut (#25)
Re: Improve handling of parameter differences in physical replication

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

Hello
Looks good to me. I think the patch is ready for committer.

The new status of this patch is: Ready for Committer

#27Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Sergei Kornilov (#26)
Re: Improve handling of parameter differences in physical replication

On 2021-01-15 12:28, Sergei Kornilov wrote:

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

Hello
Looks good to me. I think the patch is ready for committer.

The new status of this patch is: Ready for Committer

committed

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/