Requiring recovery.signal or standby.signal when recovering with a backup_label

Started by Michael Paquieralmost 3 years ago37 messages

michael@paquier.xyz

almost 3 years ago

1 attachment(s)

Hi all,

This is a follow-up of the point I have made a few weeks ago on this
thread of pgsql-bugs about $subject:
/messages/by-id/Y/Q/17rpYS7YGbIt@paquier.xyz
/messages/by-id/Y/v0c+3W89NBT/if@paquier.xyz

Here is a short summary of what I think is incorrect, and what I'd
like to do to improve things moving forward, this pointing to a
simple solution..

While looking at the so-said thread, I have dug into the recovery code
to see what looks like an incorrect assumption behind the two boolean
flags named ArchiveRecoveryRequested and InArchiveRecovery that we
have in xlogrecovery.c to control the behavior of archive recovery in
the startup process. For information, as of HEAD, these two are
described as follows:
/*
* When ArchiveRecoveryRequested is set, archive recovery was requested,
* ie. signal files were present. When InArchiveRecovery is set, we are
* currently recovering using offline XLOG archives. These variables are only
* valid in the startup process.
*
* When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
* currently performing crash recovery using only XLOG files in pg_wal, but
* will switch to using offline XLOG archives as soon as we reach the end of
* WAL in pg_wal.
*/
bool ArchiveRecoveryRequested = false;
bool InArchiveRecovery = false;

When you read this text alone, its assumptions are simple. When the
startup process finds a recovery.signal or a standby.signal, we switch
ArchiveRecoveryRequested to true. If there is a standby.signal,
InArchiveRecovery would be immediately set to true before beginning
the redo loop. If we begin redo with a recovery.signal,
ArchiveRecoveryRequested = true and InArchiveRecovery = false, crash
recovery happens first, consuming all the WAL in pg_wal/, then we'd
move on with archive recovery.

Now comes the problem of the other thread, which is what happens when
you use a backup_label *without* a recovery.signal or a
standby.signal. In this case, as currently coded, it is possible to
enforce ArchiveRecoveryRequested = false and later InArchiveRecovery =
true. Not setting ArchiveRecoveryRequested has a couple of undesired
effect. First, this skips some initialization steps that may be
needed at a later point in recovery. The thread quoted above has
reported one aspect of that: we miss some hot-standby related
intialization that can reflect if replaying enough WAL that a restart
point could happen. Depending on the amount of data copied into
pg_wal/ before starting a node with only a backup_label it may also be
possible that a consistent point has been reached, where restart
points would be legit. A second Kiss Cool effect (understands who
can), is that we miss the initialization of the recoveryWakeupLatch.
A third effect is that some code paths can use GUC values related to
recovery without ArchiveRecoveryRequested being around, one example
seems to be hot_standby whose default is true.

It is worth noting the end of FinishWalRecovery(), that includes this
part:
if (ArchiveRecoveryRequested)
{
/*
* We are no longer in archive recovery state.
*
* We are now done reading the old WAL. Turn off archive fetching if
* it was active.
*/
Assert(InArchiveRecovery);
InArchiveRecovery = false;

I have been pondering for a few weeks now about what kind of
definition would suit to a cluster having a backup_label file without
a signal file added, which is documented as required by the docs in
the HA section as well as pg_rewind. It is true that there could be a
point to allow such a configuration so as a node recovers without a
TLI jump, but I cannot find appealing this case, as well, as a node
could just happily overwrite WAL segments in the archives on an
existing timeline, potentially corruption other nodes writing on the
same TLI. There are a few other recovery scenarios where one copies
directly WAL segments into pg_wal/ that can lead to a lot of weird
inconsistencies as well, one being the report of the thread of
pgsql-hackers.

At the end, I'd like to think that we should just require
a recovery.signal or a standby.signal if we have a backup_label file,
and even enforce this rule at the end of recovery for some sanity
checks. I don't think that we can just enforce
ArchiveRecoveryRequested in this path, either, as a backup_label would
be renamed to .old once the control file knows up to which LSN it
needs to replay to reach consistency and if an end-of-backup record is
required. That's not something that can be reasonably backpatched, as
it could disturb some recovery workflows, even if these are kind of in
a dangerous spot, IMO, so I would like to propose that only on HEAD
for 16~ because the recovery code has never really considered this
combination of ArchiveRecoveryRequested and InArchiveRecovery.

While digging into that, I have found one TAP test of pg_basebackup
that was doing recovery with just a backup_label file, with a
restore_command already set. A second case was in pg_rewind, were we
have a node without standby.signal, still it uses a primary_conninfo.

Attached is a patch on the lines of what I am thinking about. This
reworks a bit some of the actions at the beginning of the startup
process:
- Report the set of LOGs showing the state of the node after reading
the backup_label.
- Enforce a rule in ShutdownWalRecovery() and document the
restriction.
- Add a check with the signal files after finding a backup_label
file.
- The validation checks on the recovery parameters are applied (aka
restore_command required with recovery.signal, or a primary_conninfo
required on standby for streaming, etc.).

My apologies for the long message, but this deserves some attention,
IMHO.

So, any thoughts?
--
Michael

Attachments:

v1-0001-Strengthen-use-of-ArchiveRecoveryRequested-and-In.patchtext/x-diff; charset=us-asciiDownload

From 0a19d16c185473b7cb3cfab0b6129b714c827d28 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 10 Mar 2023 15:51:32 +0900
Subject: [PATCH v1] Strengthen use of ArchiveRecoveryRequested and
 InArchiveRecovery

---
 src/backend/access/transam/xlogrecovery.c     | 112 +++++++++++-------
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |   3 +-
 src/bin/pg_rewind/t/008_min_recovery_point.pl |   1 +
 3 files changed, 75 insertions(+), 41 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index dbe9394762..001dcf2077 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -125,15 +125,19 @@ static TimeLineID curFileTLI;
 
 /*
  * When ArchiveRecoveryRequested is set, archive recovery was requested,
- * ie. signal files were present.  When InArchiveRecovery is set, we are
- * currently recovering using offline XLOG archives.  These variables are only
- * valid in the startup process.
+ * ie. signal files or backup_label were present.  When InArchiveRecovery is
+ * set, we are currently recovering using offline XLOG archives.  These
+ + variables are only valid in the startup process.  Note that the presence of
+ * a backup_label file forces archive recovery even if there is no signal
+ * file.
  *
  * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
-*/
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
+ */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
 
@@ -540,42 +544,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	readRecoverySignalFile();
 	validateRecoveryParameters();
 
-	if (ArchiveRecoveryRequested)
-	{
-		if (StandbyModeRequested)
-			ereport(LOG,
-					(errmsg("entering standby mode")));
-		else if (recoveryTarget == RECOVERY_TARGET_XID)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to XID %u",
-							recoveryTargetXid)));
-		else if (recoveryTarget == RECOVERY_TARGET_TIME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to %s",
-							timestamptz_to_str(recoveryTargetTime))));
-		else if (recoveryTarget == RECOVERY_TARGET_NAME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to \"%s\"",
-							recoveryTargetName)));
-		else if (recoveryTarget == RECOVERY_TARGET_LSN)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
-							LSN_FORMAT_ARGS(recoveryTargetLSN))));
-		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to earliest consistent point")));
-		else
-			ereport(LOG,
-					(errmsg("starting archive recovery")));
-	}
-
-	/*
-	 * Take ownership of the wakeup latch if we're going to sleep during
-	 * recovery.
-	 */
-	if (ArchiveRecoveryRequested)
-		OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
-
+	/* Set the WAL reading processor, as backup_label may need it */
 	private = palloc0(sizeof(XLogPageReadPrivate));
 	xlogreader =
 		XLogReaderAllocate(wal_segment_size, NULL,
@@ -609,11 +578,30 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	replay_image_masked = (char *) palloc(BLCKSZ);
 	primary_image_masked = (char *) palloc(BLCKSZ);
 
+	/*
+	 * Read the backup_label file.  We want to run this part of the recovery
+	 * process after checking for signal files and perform validation of the
+	 * recovery parameters, as it may be possible that a backup needs to be
+	 * run, but no signal files have been set.
+	 */
 	if (read_backup_label(&CheckPointLoc, &CheckPointTLI, &backupEndRequired,
 						  &backupFromStandby))
 	{
 		List	   *tablespaces = NIL;
 
+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+							 DataDir, DataDir)));
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		if (ArchiveRecoveryRequested)
+			OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * Archive recovery was requested, and thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
@@ -706,6 +694,15 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
+		/* No backup_label file has been found if we are here. */
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		if (ArchiveRecoveryRequested)
+			OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * If tablespace_map file is present without backup_label file, there
 		 * is no use of such file.  There is no harm in retaining it, but it
@@ -789,6 +786,35 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		wasShutdown = ((record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN);
 	}
 
+	if (ArchiveRecoveryRequested)
+	{
+		if (StandbyModeRequested)
+			ereport(LOG,
+					(errmsg("entering standby mode")));
+		else if (recoveryTarget == RECOVERY_TARGET_XID)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to XID %u",
+							recoveryTargetXid)));
+		else if (recoveryTarget == RECOVERY_TARGET_TIME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to %s",
+							timestamptz_to_str(recoveryTargetTime))));
+		else if (recoveryTarget == RECOVERY_TARGET_NAME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to \"%s\"",
+							recoveryTargetName)));
+		else if (recoveryTarget == RECOVERY_TARGET_LSN)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
+							LSN_FORMAT_ARGS(recoveryTargetLSN))));
+		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to earliest consistent point")));
+		else
+			ereport(LOG,
+					(errmsg("starting archive recovery")));
+	}
+
 	/*
 	 * If the location of the checkpoint record is not on the expected
 	 * timeline in the history of the requested timeline, we cannot proceed:
@@ -1574,6 +1600,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b60cb78a0d..26d2b11fd5 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -375,7 +375,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir     = "$tempdir/tblspc1replica";
diff --git a/src/bin/pg_rewind/t/008_min_recovery_point.pl b/src/bin/pg_rewind/t/008_min_recovery_point.pl
index c753a64fdb..0026e6491e 100644
--- a/src/bin/pg_rewind/t/008_min_recovery_point.pl
+++ b/src/bin/pg_rewind/t/008_min_recovery_point.pl
@@ -152,6 +152,7 @@ move(
 	"$tmp_folder/node_2-postgresql.conf.tmp",
 	"$node_2_pgdata/postgresql.conf");
 
+$node_2->append_conf('standby.signal', '');
 $node_2->start;
 
 # Check contents of the test tables after rewind. The rows inserted in node 3
-- 
2.39.2

Michael Paquier

michael@paquier.xyz

almost 3 years ago

In reply to: Michael Paquier (#1)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Fri, Mar 10, 2023 at 03:59:04PM +0900, Michael Paquier wrote:

My apologies for the long message, but this deserves some attention,
IMHO.

Note: A CF entry has been added as of [1]https://commitfest.postgresql.org/43/4244/, and I have added an item in
the list of live issues on the open item page for 16.

[1]: https://commitfest.postgresql.org/43/4244/
[2]: https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Live_issues -- Michael
--
Michael

David Zhang

david.zhang@highgo.ca

over 2 years ago

In reply to: Michael Paquier (#2)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

I believe before users can make a backup using pg_basebackup and then
start the backup as an independent Primary server for whatever reasons.
Now, if this is still allowed, then users need to be aware that the
backup_label must be manually deleted, otherwise, the backup won't be
able to start as a Primary.

The current message below doesn't provide such a hint.

+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+							 DataDir, DataDir)));

On 2023-03-12 6:06 p.m., Michael Paquier wrote:

On Fri, Mar 10, 2023 at 03:59:04PM +0900, Michael Paquier wrote:

My apologies for the long message, but this deserves some attention,
IMHO.

Note: A CF entry has been added as of [1], and I have added an item in
the list of live issues on the open item page for 16.

[1]:https://commitfest.postgresql.org/43/4244/
[2]:https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Live_issues
--
Michael

Best regards,

David

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: David Zhang (#3)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Fri, Jul 14, 2023 at 01:32:49PM -0700, David Zhang wrote:

I believe before users can make a backup using pg_basebackup and then start
the backup as an independent Primary server for whatever reasons. Now, if
this is still allowed, then users need to be aware that the backup_label
must be manually deleted, otherwise, the backup won't be able to start as a
Primary.

Delete a backup_label from a fresh base backup can easily lead to data
corruption, as the startup process would pick up as LSN to start
recovery from the control file rather than the backup_label file.
This would happen if a checkpoint updates the redo LSN in the control
file while a backup happens and the control file is copied after the
checkpoint, for instance. If one wishes to deploy a new primary from
a base backup, recovery.signal is the way to go, making sure that the
new primary is bumped into a new timeline once recovery finishes, on
top of making sure that the startup process starts recovery from a
position where the cluster would be able to achieve a consistent
state.

The current message below doesn't provide such a hint.

+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find
recovery.signal or standby.signal when recovering with
backup_label"), 
+					 errhint("If you are restoring
from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\"
and add required recovery options.",
+							 DataDir,
DataDir)));

How would you rewrite that? I am not sure how many details we want to
put here in terms of differences between recovery.signal and
standby.signal, still we surely should mention these are the two
possible choices.
--
Michael

David Zhang

david.zhang@highgo.ca

over 2 years ago

In reply to: Michael Paquier (#4)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On 2023-07-16 6:27 p.m., Michael Paquier wrote:

Delete a backup_label from a fresh base backup can easily lead to data
corruption, as the startup process would pick up as LSN to start
recovery from the control file rather than the backup_label file.
This would happen if a checkpoint updates the redo LSN in the control
file while a backup happens and the control file is copied after the
checkpoint, for instance. If one wishes to deploy a new primary from
a base backup, recovery.signal is the way to go, making sure that the
new primary is bumped into a new timeline once recovery finishes, on
top of making sure that the startup process starts recovery from a
position where the cluster would be able to achieve a consistent
state.

Thanks a lot for sharing this information.

How would you rewrite that? I am not sure how many details we want to
put here in terms of differences between recovery.signal and
standby.signal, still we surely should mention these are the two
possible choices.

Honestly, I can't convince myself to mention the backup_label here too.
But, I can share some information regarding my testing of the patch and
the corresponding results.

To assess the impact of the patch, I executed the following commands for
before and after,

pg_basebackup -h localhost -p 5432 -U david -D pg_backup1

pg_ctl -D pg_backup1 -l /tmp/logfile start

Before the patch, there were no issues encountered when starting an
independent Primary server.

However, after applying the patch, I observed the following behavior
when starting from the base backup:

1) simply start server from a base backup

FATAL: could not find recovery.signal or standby.signal when recovering
with backup_label

HINT: If you are restoring from a backup, touch
"/media/david/disk1/pg_backup1/recovery.signal" or
"/media/david/disk1/pg_backup1/standby.signal" and add required recovery
options.

2) touch a recovery.signal file and then try to start the server, the
following error was encountered:

FATAL: must specify restore_command when standby mode is not enabled

3) touch a standby.signal file, then the server successfully started,
however, it operates in standby mode, whereas the intended behavior was
for it to function as a primary server.

Best regards,

David

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: David Zhang (#5)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote:

1) simply start server from a base backup

FATAL: could not find recovery.signal or standby.signal when recovering
with backup_label

HINT: If you are restoring from a backup, touch
"/media/david/disk1/pg_backup1/recovery.signal" or
"/media/david/disk1/pg_backup1/standby.signal" and add required recovery
options.

Note the difference when --write-recovery-conf is specified, where a
standby.conf is created with a primary_conninfo in
postgresql.auto.conf. So, yes, that's expected by default with the
patch.

2) touch a recovery.signal file and then try to start the server, the
following error was encountered:

FATAL: must specify restore_command when standby mode is not enabled

Yes, that's also something expected in the scope of the v1 posted.
The idea behind this restriction is that specifying recovery.signal is
equivalent to asking for archive recovery, but not specifying
restore_command is equivalent to not provide any options to be able to
recover. See validateRecoveryParameters() and note that this
restriction exists since the beginning of times, introduced in commit
66ec2db. I tend to agree that there is something to be said about
self-contained backups taken from pg_basebackup, though, as these
would fail if no restore_command is specified, and this restriction is
in place before Postgres has introduced replication and easier ways to
have base backups. As a whole, I think that there is a good argument
in favor of removing this restriction for the case where archive
recovery is requested if users have all their WAL in pg_wal/ to be
able to recover up to a consistent point, keeping these GUC
restrictions if requesting a standby (not recovery.signal, only
standby.signal).

3) touch a standby.signal file, then the server successfully started,
however, it operates in standby mode, whereas the intended behavior was for
it to function as a primary server.

standby.signal implies that the server will start in standby mode. If
one wants to deploy a new primary, that would imply a timeline jump at
the end of recovery, you would need to specify recovery.signal
instead.

We need more discussions and more opinions, but the discussion has
stalled for a few months now. In case, I am adding Thomas Munro in CC
who has mentioned to me at PGcon that he was interested in this patch
(this thread's problem is not directly related to the fact that the
checkpointer now runs in crash recovery, though).

For now, I am attaching a rebased v2.
--
Michael

Attachments:

v2-0001-Strengthen-use-of-ArchiveRecoveryRequested-and-In.patchtext/x-diff; charset=us-asciiDownload

From 9945a86161b1cf23f7002a8f1f4bce20061693d6 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 20 Jul 2023 07:59:56 +0900
Subject: [PATCH v2] Strengthen use of ArchiveRecoveryRequested and
 InArchiveRecovery

---
 src/backend/access/transam/xlogrecovery.c     | 112 +++++++++++-------
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |   3 +-
 src/bin/pg_rewind/t/008_min_recovery_point.pl |   1 +
 3 files changed, 75 insertions(+), 41 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index becc2bda62..86899452b4 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -125,15 +125,19 @@ static TimeLineID curFileTLI;
 
 /*
  * When ArchiveRecoveryRequested is set, archive recovery was requested,
- * ie. signal files were present.  When InArchiveRecovery is set, we are
- * currently recovering using offline XLOG archives.  These variables are only
- * valid in the startup process.
+ * ie. signal files or backup_label were present.  When InArchiveRecovery is
+ * set, we are currently recovering using offline XLOG archives.  These
+ + variables are only valid in the startup process.  Note that the presence of
+ * a backup_label file forces archive recovery even if there is no signal
+ * file.
  *
  * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
-*/
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
+ */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
 
@@ -540,42 +544,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	readRecoverySignalFile();
 	validateRecoveryParameters();
 
-	if (ArchiveRecoveryRequested)
-	{
-		if (StandbyModeRequested)
-			ereport(LOG,
-					(errmsg("entering standby mode")));
-		else if (recoveryTarget == RECOVERY_TARGET_XID)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to XID %u",
-							recoveryTargetXid)));
-		else if (recoveryTarget == RECOVERY_TARGET_TIME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to %s",
-							timestamptz_to_str(recoveryTargetTime))));
-		else if (recoveryTarget == RECOVERY_TARGET_NAME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to \"%s\"",
-							recoveryTargetName)));
-		else if (recoveryTarget == RECOVERY_TARGET_LSN)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
-							LSN_FORMAT_ARGS(recoveryTargetLSN))));
-		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to earliest consistent point")));
-		else
-			ereport(LOG,
-					(errmsg("starting archive recovery")));
-	}
-
-	/*
-	 * Take ownership of the wakeup latch if we're going to sleep during
-	 * recovery.
-	 */
-	if (ArchiveRecoveryRequested)
-		OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
-
+	/* Set the WAL reading processor, as backup_label may need it */
 	private = palloc0(sizeof(XLogPageReadPrivate));
 	xlogreader =
 		XLogReaderAllocate(wal_segment_size, NULL,
@@ -609,11 +578,30 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	replay_image_masked = (char *) palloc(BLCKSZ);
 	primary_image_masked = (char *) palloc(BLCKSZ);
 
+	/*
+	 * Read the backup_label file.  We want to run this part of the recovery
+	 * process after checking for signal files and perform validation of the
+	 * recovery parameters, as it may be possible that a backup needs to be
+	 * run, but no signal files have been set.
+	 */
 	if (read_backup_label(&CheckPointLoc, &CheckPointTLI, &backupEndRequired,
 						  &backupFromStandby))
 	{
 		List	   *tablespaces = NIL;
 
+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+							 DataDir, DataDir)));
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		if (ArchiveRecoveryRequested)
+			OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * Archive recovery was requested, and thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
@@ -706,6 +694,15 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
+		/* No backup_label file has been found if we are here. */
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		if (ArchiveRecoveryRequested)
+			OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * If tablespace_map file is present without backup_label file, there
 		 * is no use of such file.  There is no harm in retaining it, but it
@@ -789,6 +786,35 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		wasShutdown = ((record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN);
 	}
 
+	if (ArchiveRecoveryRequested)
+	{
+		if (StandbyModeRequested)
+			ereport(LOG,
+					(errmsg("entering standby mode")));
+		else if (recoveryTarget == RECOVERY_TARGET_XID)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to XID %u",
+							recoveryTargetXid)));
+		else if (recoveryTarget == RECOVERY_TARGET_TIME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to %s",
+							timestamptz_to_str(recoveryTargetTime))));
+		else if (recoveryTarget == RECOVERY_TARGET_NAME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to \"%s\"",
+							recoveryTargetName)));
+		else if (recoveryTarget == RECOVERY_TARGET_LSN)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
+							LSN_FORMAT_ARGS(recoveryTargetLSN))));
+		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to earliest consistent point")));
+		else
+			ereport(LOG,
+					(errmsg("starting archive recovery")));
+	}
+
 	/*
 	 * If the location of the checkpoint record is not on the expected
 	 * timeline in the history of the requested timeline, we cannot proceed:
@@ -1574,6 +1600,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b..b9e54f4562 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -392,7 +392,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir = "$tempdir/tblspc1replica";
diff --git a/src/bin/pg_rewind/t/008_min_recovery_point.pl b/src/bin/pg_rewind/t/008_min_recovery_point.pl
index d4c89451e6..1cff5b7019 100644
--- a/src/bin/pg_rewind/t/008_min_recovery_point.pl
+++ b/src/bin/pg_rewind/t/008_min_recovery_point.pl
@@ -152,6 +152,7 @@ move(
 	"$tmp_folder/node_2-postgresql.conf.tmp",
 	"$node_2_pgdata/postgresql.conf");
 
+$node_2->append_conf('standby.signal', '');
 $node_2->start;
 
 # Check contents of the test tables after rewind. The rows inserted in node 3
-- 
2.40.1

Bowen Shi

zxwsbg12138@gmail.com

over 2 years ago

In reply to: Michael Paquier (#6)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

Thanks for the patch.

I rerun the test in
/messages/by-id/ZQtzcH2lvo8leXEr@paquier.xyz
. We can discuss all the problems in this thread.

First I encountered the problem " FATAL: could not find
recovery.signal or standby.signal when recovering with backup_label ",
then I deleted the backup_label file and started the instance
successfully.

Delete a backup_label from a fresh base backup can easily lead to data
corruption, as the startup process would pick up as LSN to start
recovery from the control file rather than the backup_label file.
This would happen if a checkpoint updates the redo LSN in the control
file while a backup happens and the control file is copied after the
checkpoint, for instance. If one wishes to deploy a new primary from
a base backup, recovery.signal is the way to go, making sure that the
new primary is bumped into a new timeline once recovery finishes, on
top of making sure that the startup process starts recovery from a
position where the cluster would be able to achieve a consistent
state.

ereport(FATAL,
(errmsg("could not find redo location referenced by checkpoint record"),
errhint("If you are restoring from a backup, touch
\"%s/recovery.signal\" and add required recovery options.\n"
"If you are not restoring from a backup, try removing the file
\"%s/backup_label\".\n"
"Be careful: removing \"%s/backup_label\" will result in a corrupt
cluster if restoring from a backup.",
DataDir, DataDir, DataDir)));

There are two similar error messages in xlogrecovery.c. Maybe we can
modify the error messages to be similar.

--
Bowen Shi

Show quoted text

On Thu, 21 Sept 2023 at 11:01, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote:

1) simply start server from a base backup

FATAL: could not find recovery.signal or standby.signal when recovering
with backup_label

HINT: If you are restoring from a backup, touch
"/media/david/disk1/pg_backup1/recovery.signal" or
"/media/david/disk1/pg_backup1/standby.signal" and add required recovery
options.

Note the difference when --write-recovery-conf is specified, where a
standby.conf is created with a primary_conninfo in
postgresql.auto.conf. So, yes, that's expected by default with the
patch.

2) touch a recovery.signal file and then try to start the server, the
following error was encountered:

FATAL: must specify restore_command when standby mode is not enabled

Yes, that's also something expected in the scope of the v1 posted.
The idea behind this restriction is that specifying recovery.signal is
equivalent to asking for archive recovery, but not specifying
restore_command is equivalent to not provide any options to be able to
recover. See validateRecoveryParameters() and note that this
restriction exists since the beginning of times, introduced in commit
66ec2db. I tend to agree that there is something to be said about
self-contained backups taken from pg_basebackup, though, as these
would fail if no restore_command is specified, and this restriction is
in place before Postgres has introduced replication and easier ways to
have base backups. As a whole, I think that there is a good argument
in favor of removing this restriction for the case where archive
recovery is requested if users have all their WAL in pg_wal/ to be
able to recover up to a consistent point, keeping these GUC
restrictions if requesting a standby (not recovery.signal, only
standby.signal).

3) touch a standby.signal file, then the server successfully started,
however, it operates in standby mode, whereas the intended behavior was for
it to function as a primary server.

standby.signal implies that the server will start in standby mode. If
one wants to deploy a new primary, that would imply a timeline jump at
the end of recovery, you would need to specify recovery.signal
instead.

We need more discussions and more opinions, but the discussion has
stalled for a few months now. In case, I am adding Thomas Munro in CC
who has mentioned to me at PGcon that he was interested in this patch
(this thread's problem is not directly related to the fact that the
checkpointer now runs in crash recovery, though).

For now, I am attaching a rebased v2.
--
Michael

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Bowen Shi (#7)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Thu, Sep 21, 2023 at 11:45:06AM +0800, Bowen Shi wrote:

First I encountered the problem " FATAL: could not find
recovery.signal or standby.signal when recovering with backup_label ",
then I deleted the backup_label file and started the instance
successfully.

Doing that is equal to corrupting your instance as recovery would
begin from the latest redo LSN stored in the control file, but the
physical relation files included in the backup may include blocks that
require records that are needed before this redo LSN and the LSN
stored in the backup_label.

Delete a backup_label from a fresh base backup can easily lead to data
corruption, as the startup process would pick up as LSN to start
recovery from the control file rather than the backup_label file.
This would happen if a checkpoint updates the redo LSN in the control
file while a backup happens and the control file is copied after the
checkpoint, for instance. If one wishes to deploy a new primary from
a base backup, recovery.signal is the way to go, making sure that the
new primary is bumped into a new timeline once recovery finishes, on
top of making sure that the startup process starts recovery from a
position where the cluster would be able to achieve a consistent
state.

And that's what I mean here. In more details. So, really, don't do
that.

ereport(FATAL,
(errmsg("could not find redo location referenced by checkpoint record"),
errhint("If you are restoring from a backup, touch
\"%s/recovery.signal\" and add required recovery options.\n"
"If you are not restoring from a backup, try removing the file
\"%s/backup_label\".\n"
"Be careful: removing \"%s/backup_label\" will result in a corrupt
cluster if restoring from a backup.",
DataDir, DataDir, DataDir)));

There are two similar error messages in xlogrecovery.c. Maybe we can
modify the error messages to be similar.

The patch adds the following message, which is written this way to be
consistent with the two others, already:

+    ereport(FATAL,
+            (errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+             errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+                     DataDir, DataDir)));

But you have an interesting point here, why isn't standby.signal also
mentioned in the two existing comments? Depending on what's wanted by
the user this can be equally useful to report back.

Attached is a slightly updated patch, where I have also removed the
check on ArchiveRecoveryRequested because the FATAL generated for
!ArchiveRecoveryRequested makes sure that it is useless after reading
the backup_label file.

This patch has been around for a few months now. Do others have
opinions about the direction taken here to make the presence of
recovery.signal or standby.signal a hard requirement when a
backup_label file is found (HEAD only)?
--
Michael

Attachments:

v3-0001-Strengthen-use-of-ArchiveRecoveryRequested-and-In.patchtext/x-diff; charset=us-asciiDownload

From db2fd4f3f3a60f4e36a5d93d7d0867a3705e21e2 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 27 Sep 2023 16:17:42 +0900
Subject: [PATCH v3] Strengthen use of ArchiveRecoveryRequested and
 InArchiveRecovery

---
 src/backend/access/transam/xlogrecovery.c     | 119 +++++++++++-------
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |   3 +-
 src/bin/pg_rewind/t/008_min_recovery_point.pl |   1 +
 3 files changed, 78 insertions(+), 45 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index becc2bda62..7b5d1d6baa 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -125,15 +125,19 @@ static TimeLineID curFileTLI;
 
 /*
  * When ArchiveRecoveryRequested is set, archive recovery was requested,
- * ie. signal files were present.  When InArchiveRecovery is set, we are
- * currently recovering using offline XLOG archives.  These variables are only
- * valid in the startup process.
+ * ie. signal files or backup_label were present.  When InArchiveRecovery is
+ * set, we are currently recovering using offline XLOG archives.  These
+ + variables are only valid in the startup process.  Note that the presence of
+ * a backup_label file forces archive recovery even if there is no signal
+ * file.
  *
  * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
-*/
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
+ */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
 
@@ -540,42 +544,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	readRecoverySignalFile();
 	validateRecoveryParameters();
 
-	if (ArchiveRecoveryRequested)
-	{
-		if (StandbyModeRequested)
-			ereport(LOG,
-					(errmsg("entering standby mode")));
-		else if (recoveryTarget == RECOVERY_TARGET_XID)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to XID %u",
-							recoveryTargetXid)));
-		else if (recoveryTarget == RECOVERY_TARGET_TIME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to %s",
-							timestamptz_to_str(recoveryTargetTime))));
-		else if (recoveryTarget == RECOVERY_TARGET_NAME)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to \"%s\"",
-							recoveryTargetName)));
-		else if (recoveryTarget == RECOVERY_TARGET_LSN)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
-							LSN_FORMAT_ARGS(recoveryTargetLSN))));
-		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
-			ereport(LOG,
-					(errmsg("starting point-in-time recovery to earliest consistent point")));
-		else
-			ereport(LOG,
-					(errmsg("starting archive recovery")));
-	}
-
-	/*
-	 * Take ownership of the wakeup latch if we're going to sleep during
-	 * recovery.
-	 */
-	if (ArchiveRecoveryRequested)
-		OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
-
+	/* Set the WAL reading processor, as backup_label may need it */
 	private = palloc0(sizeof(XLogPageReadPrivate));
 	xlogreader =
 		XLogReaderAllocate(wal_segment_size, NULL,
@@ -609,11 +578,29 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	replay_image_masked = (char *) palloc(BLCKSZ);
 	primary_image_masked = (char *) palloc(BLCKSZ);
 
+	/*
+	 * Read the backup_label file.  We want to run this part of the recovery
+	 * process after checking for signal files and perform validation of the
+	 * recovery parameters, as it may be possible that a backup needs to be
+	 * run, but no signal files have been set.
+	 */
 	if (read_backup_label(&CheckPointLoc, &CheckPointTLI, &backupEndRequired,
 						  &backupFromStandby))
 	{
 		List	   *tablespaces = NIL;
 
+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+							 DataDir, DataDir)));
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * Archive recovery was requested, and thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
@@ -651,20 +638,20 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 								checkPoint.ThisTimeLineID))
 					ereport(FATAL,
 							(errmsg("could not find redo location referenced by checkpoint record"),
-							 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
+							 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.\n"
 									 "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
 									 "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",
-									 DataDir, DataDir, DataDir)));
+									 DataDir, DataDir, DataDir, DataDir)));
 			}
 		}
 		else
 		{
 			ereport(FATAL,
 					(errmsg("could not locate required checkpoint record"),
-					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.\n"
 							 "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
 							 "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",
-							 DataDir, DataDir, DataDir)));
+							 DataDir, DataDir, DataDir, DataDir)));
 			wasShutdown = false;	/* keep compiler quiet */
 		}
 
@@ -706,6 +693,15 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
+		/* No backup_label file has been found if we are here. */
+
+		/*
+		 * Take ownership of the wakeup latch if we're going to sleep during
+		 * recovery if needed.
+		 */
+		if (ArchiveRecoveryRequested)
+			OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
 		/*
 		 * If tablespace_map file is present without backup_label file, there
 		 * is no use of such file.  There is no harm in retaining it, but it
@@ -789,6 +785,35 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		wasShutdown = ((record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN);
 	}
 
+	if (ArchiveRecoveryRequested)
+	{
+		if (StandbyModeRequested)
+			ereport(LOG,
+					(errmsg("entering standby mode")));
+		else if (recoveryTarget == RECOVERY_TARGET_XID)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to XID %u",
+							recoveryTargetXid)));
+		else if (recoveryTarget == RECOVERY_TARGET_TIME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to %s",
+							timestamptz_to_str(recoveryTargetTime))));
+		else if (recoveryTarget == RECOVERY_TARGET_NAME)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to \"%s\"",
+							recoveryTargetName)));
+		else if (recoveryTarget == RECOVERY_TARGET_LSN)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"",
+							LSN_FORMAT_ARGS(recoveryTargetLSN))));
+		else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE)
+			ereport(LOG,
+					(errmsg("starting point-in-time recovery to earliest consistent point")));
+		else
+			ereport(LOG,
+					(errmsg("starting archive recovery")));
+	}
+
 	/*
 	 * If the location of the checkpoint record is not on the expected
 	 * timeline in the history of the requested timeline, we cannot proceed:
@@ -1574,6 +1599,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b..b9e54f4562 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -392,7 +392,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir = "$tempdir/tblspc1replica";
diff --git a/src/bin/pg_rewind/t/008_min_recovery_point.pl b/src/bin/pg_rewind/t/008_min_recovery_point.pl
index d4c89451e6..1cff5b7019 100644
--- a/src/bin/pg_rewind/t/008_min_recovery_point.pl
+++ b/src/bin/pg_rewind/t/008_min_recovery_point.pl
@@ -152,6 +152,7 @@ move(
 	"$tmp_folder/node_2-postgresql.conf.tmp",
 	"$node_2_pgdata/postgresql.conf");
 
+$node_2->append_conf('standby.signal', '');
 $node_2->start;
 
 # Check contents of the test tables after rewind. The rows inserted in node 3
-- 
2.40.1

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 2 years ago

In reply to: Michael Paquier (#1)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in

My apologies for the long message, but this deserves some attention,
IMHO.

So, any thoughts?

Sorry for being late. However, I agree with David's concern regarding
the unnecessary inconvenience it introduces. I'd like to maintain the
functionality.

While I agree that InArchiveRecovery should be activated only if
ArchiveReArchiveRecoveryRequested is true, I oppose to the notion that
the mere presence of backup_label should be interpreted as a request
for archive recovery (even if it is implied in a comment in
InitWalRecovery()). Instead, I propose that we separate backup_label
and archive recovery, in other words, we should not turn on
InArchiveRecovery if !ArchiveRecoveryRequested, regardless of the
presence of backup_label. We can know the minimum required recovery
LSN by the XLOG_BACKUP_END record.

The attached is a quick mock-up, but providing an approximation of my
thoughts. (For example, end_of_backup_reached could potentially be
extended to the ArchiveRecoveryRequested case and we could simplify
the condition..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xlogrecov_not_enter_arch_if_not_requested.txttext/plain; charset=us-asciiDownload

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fcbde10529..e4af945319 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5489,17 +5489,15 @@ StartupXLOG(void)
 	set_ps_display("");
 
 	/*
-	 * When recovering from a backup (we are in recovery, and archive recovery
-	 * was requested), complain if we did not roll forward far enough to reach
-	 * the point where the database is consistent.  For regular online
-	 * backup-from-primary, that means reaching the end-of-backup WAL record
-	 * (at which point we reset backupStartPoint to be Invalid), for
-	 * backup-from-replica (which can't inject records into the WAL stream),
-	 * that point is when we reach the minRecoveryPoint in pg_control (which
-	 * we purposefully copy last when backing up from a replica).  For
-	 * pg_rewind (which creates a backup_label with a method of "pg_rewind")
-	 * or snapshot-style backups (which don't), backupEndRequired will be set
-	 * to false.
+	 * When recovering from a backup, complain if we did not roll forward far
+	 * enough to reach the point where the database is consistent.  For regular
+	 * online backup-from-primary, that means reaching the end-of-backup WAL
+	 * record, for backup-from-replica (which can't inject records into the WAL
+	 * stream), that point is when we reach the minRecoveryPoint in pg_control
+	 * (which we purposefully copy last when backing up from a replica).  For
+	 * pg_rewind (which creates a backup_label with a method of "pg_rewind") or
+	 * snapshot-style backups (which don't), backupEndRequired will be set to
+	 * false.
 	 *
 	 * Note: it is indeed okay to look at the local variable
 	 * LocalMinRecoveryPoint here, even though ControlFile->minRecoveryPoint
@@ -5508,7 +5506,7 @@ StartupXLOG(void)
 	 */
 	if (InRecovery &&
 		(EndOfLog < LocalMinRecoveryPoint ||
-		 !XLogRecPtrIsInvalid(ControlFile->backupStartPoint)))
+		 (haveBackupLabel && !endOfRecoveryInfo->end_of_backup_reached)))
 	{
 		/*
 		 * Ran off end of WAL before reaching end-of-backup WAL record, or
@@ -5516,16 +5514,13 @@ StartupXLOG(void)
 		 * recover from an online backup but never called pg_backup_stop(), or
 		 * you didn't archive all the WAL needed.
 		 */
-		if (ArchiveRecoveryRequested || ControlFile->backupEndRequired)
-		{
-			if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint) || ControlFile->backupEndRequired)
-				ereport(FATAL,
-						(errmsg("WAL ends before end of online backup"),
-						 errhint("All WAL generated while online backup was taken must be available at recovery.")));
-			else
-				ereport(FATAL,
-						(errmsg("WAL ends before consistent recovery point")));
-		}
+		if (haveBackupLabel && !endOfRecoveryInfo->end_of_backup_reached)
+			ereport(FATAL,
+					(errmsg("WAL ends before end of online backup"),
+					 errhint("All WAL generated while online backup was taken must be available at recovery.")));
+		else
+			ereport(FATAL,
+					(errmsg("WAL ends before consistent recovery point")));
 	}
 
 	/*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index becc2bda62..d3cbf0703b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -281,6 +281,7 @@ static TimeLineID minRecoveryPointTLI;
 static XLogRecPtr backupStartPoint;
 static XLogRecPtr backupEndPoint;
 static bool backupEndRequired = false;
+static bool backupEndReached = false;
 
 /*
  * Have we reached a consistent database state?  In crash recovery, we have
@@ -615,11 +616,12 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		List	   *tablespaces = NIL;
 
 		/*
-		 * Archive recovery was requested, and thanks to the backup label
+		 * When archive recovery was requested, thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
 		 * archive recovery directly.
 		 */
-		InArchiveRecovery = true;
+		if (ArchiveRecoveryRequested)
+			InArchiveRecovery = true;
 		if (StandbyModeRequested)
 			EnableStandbyMode();
 
@@ -1531,6 +1533,19 @@ FinishWalRecovery(void)
 	result->standby_signal_file_found = standby_signal_file_found;
 	result->recovery_signal_file_found = recovery_signal_file_found;
 
+	/*
+	 * If archive recovery was performed, backupEndReached indicates passing
+	 * the end of backup. If not, a valid backupEndPoint value suggests the
+	 * recovery began from a base backup; verify if recovery surpasses that
+	 * point.
+	 */
+	if (backupEndReached ||
+		(!XLogRecPtrIsInvalid(backupEndPoint) &&
+		 backupEndPoint <= XLogRecoveryCtl->lastReplayedEndRecPtr))
+		result->end_of_backup_reached = true;
+	else
+		result->end_of_backup_reached = false;
+
 	return result;
 }
 
@@ -2133,6 +2148,7 @@ CheckRecoveryConsistency(void)
 		backupStartPoint = InvalidXLogRecPtr;
 		backupEndPoint = InvalidXLogRecPtr;
 		backupEndRequired = false;
+		backupEndReached = true;
 	}
 
 	/*
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 47c29350f5..b8c1a97224 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -129,6 +129,8 @@ typedef struct
 	 */
 	bool		standby_signal_file_found;
 	bool		recovery_signal_file_found;
+
+	bool		end_of_backup_reached;
 } EndOfWalRecoveryInfo;
 
 extern EndOfWalRecoveryInfo *FinishWalRecovery(void);

#10

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 2 years ago

In reply to: Michael Paquier (#6)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

Sorry, it seems that I posted at the wrong position..

At Thu, 28 Sep 2023 12:58:51 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in

My apologies for the long message, but this deserves some attention,
IMHO.

So, any thoughts?

Sorry for being late. However, I agree with David's concern regarding
the unnecessary inconvenience it introduces. I'd like to maintain the
functionality.

While I agree that InArchiveRecovery should be activated only if
ArchiveReArchiveRecoveryRequested is true, I oppose to the notion that
the mere presence of backup_label should be interpreted as a request
for archive recovery (even if it is implied in a comment in
InitWalRecovery()). Instead, I propose that we separate backup_label
and archive recovery, in other words, we should not turn on
InArchiveRecovery if !ArchiveRecoveryRequested, regardless of the
presence of backup_label. We can know the minimum required recovery
LSN by the XLOG_BACKUP_END record.

The attached is a quick mock-up, but providing an approximation of my
thoughts. (For example, end_of_backup_reached could potentially be
extended to the ArchiveRecoveryRequested case and we could simplify
the condition..)

regards

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xlogrecov_not_enter_arch_if_not_requested.txttext/plain; charset=us-asciiDownload

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fcbde10529..e4af945319 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5489,17 +5489,15 @@ StartupXLOG(void)
 	set_ps_display("");
 
 	/*
-	 * When recovering from a backup (we are in recovery, and archive recovery
-	 * was requested), complain if we did not roll forward far enough to reach
-	 * the point where the database is consistent.  For regular online
-	 * backup-from-primary, that means reaching the end-of-backup WAL record
-	 * (at which point we reset backupStartPoint to be Invalid), for
-	 * backup-from-replica (which can't inject records into the WAL stream),
-	 * that point is when we reach the minRecoveryPoint in pg_control (which
-	 * we purposefully copy last when backing up from a replica).  For
-	 * pg_rewind (which creates a backup_label with a method of "pg_rewind")
-	 * or snapshot-style backups (which don't), backupEndRequired will be set
-	 * to false.
+	 * When recovering from a backup, complain if we did not roll forward far
+	 * enough to reach the point where the database is consistent.  For regular
+	 * online backup-from-primary, that means reaching the end-of-backup WAL
+	 * record, for backup-from-replica (which can't inject records into the WAL
+	 * stream), that point is when we reach the minRecoveryPoint in pg_control
+	 * (which we purposefully copy last when backing up from a replica).  For
+	 * pg_rewind (which creates a backup_label with a method of "pg_rewind") or
+	 * snapshot-style backups (which don't), backupEndRequired will be set to
+	 * false.
 	 *
 	 * Note: it is indeed okay to look at the local variable
 	 * LocalMinRecoveryPoint here, even though ControlFile->minRecoveryPoint
@@ -5508,7 +5506,7 @@ StartupXLOG(void)
 	 */
 	if (InRecovery &&
 		(EndOfLog < LocalMinRecoveryPoint ||
-		 !XLogRecPtrIsInvalid(ControlFile->backupStartPoint)))
+		 (haveBackupLabel && !endOfRecoveryInfo->end_of_backup_reached)))
 	{
 		/*
 		 * Ran off end of WAL before reaching end-of-backup WAL record, or
@@ -5516,16 +5514,13 @@ StartupXLOG(void)
 		 * recover from an online backup but never called pg_backup_stop(), or
 		 * you didn't archive all the WAL needed.
 		 */
-		if (ArchiveRecoveryRequested || ControlFile->backupEndRequired)
-		{
-			if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint) || ControlFile->backupEndRequired)
-				ereport(FATAL,
-						(errmsg("WAL ends before end of online backup"),
-						 errhint("All WAL generated while online backup was taken must be available at recovery.")));
-			else
-				ereport(FATAL,
-						(errmsg("WAL ends before consistent recovery point")));
-		}
+		if (haveBackupLabel && !endOfRecoveryInfo->end_of_backup_reached)
+			ereport(FATAL,
+					(errmsg("WAL ends before end of online backup"),
+					 errhint("All WAL generated while online backup was taken must be available at recovery.")));
+		else
+			ereport(FATAL,
+					(errmsg("WAL ends before consistent recovery point")));
 	}
 
 	/*
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index becc2bda62..d3cbf0703b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -281,6 +281,7 @@ static TimeLineID minRecoveryPointTLI;
 static XLogRecPtr backupStartPoint;
 static XLogRecPtr backupEndPoint;
 static bool backupEndRequired = false;
+static bool backupEndReached = false;
 
 /*
  * Have we reached a consistent database state?  In crash recovery, we have
@@ -615,11 +616,12 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		List	   *tablespaces = NIL;
 
 		/*
-		 * Archive recovery was requested, and thanks to the backup label
+		 * When archive recovery was requested, thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
 		 * archive recovery directly.
 		 */
-		InArchiveRecovery = true;
+		if (ArchiveRecoveryRequested)
+			InArchiveRecovery = true;
 		if (StandbyModeRequested)
 			EnableStandbyMode();
 
@@ -1531,6 +1533,19 @@ FinishWalRecovery(void)
 	result->standby_signal_file_found = standby_signal_file_found;
 	result->recovery_signal_file_found = recovery_signal_file_found;
 
+	/*
+	 * If archive recovery was performed, backupEndReached indicates passing
+	 * the end of backup. If not, a valid backupEndPoint value suggests the
+	 * recovery began from a base backup; verify if recovery surpasses that
+	 * point.
+	 */
+	if (backupEndReached ||
+		(!XLogRecPtrIsInvalid(backupEndPoint) &&
+		 backupEndPoint <= XLogRecoveryCtl->lastReplayedEndRecPtr))
+		result->end_of_backup_reached = true;
+	else
+		result->end_of_backup_reached = false;
+
 	return result;
 }
 
@@ -2133,6 +2148,7 @@ CheckRecoveryConsistency(void)
 		backupStartPoint = InvalidXLogRecPtr;
 		backupEndPoint = InvalidXLogRecPtr;
 		backupEndRequired = false;
+		backupEndReached = true;
 	}
 
 	/*
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 47c29350f5..b8c1a97224 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -129,6 +129,8 @@ typedef struct
 	 */
 	bool		standby_signal_file_found;
 	bool		recovery_signal_file_found;
+
+	bool		end_of_backup_reached;
 } EndOfWalRecoveryInfo;
 
 extern EndOfWalRecoveryInfo *FinishWalRecovery(void);

Import Notes

Reply to msg id not found: ZRPY9VwlNYygle4o@paquier.xyz20230928.125851.2051261991058034791.horikyota.ntt@gmail.com

#11

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Kyotaro Horiguchi (#9)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Thu, Sep 28, 2023 at 12:58:51PM +0900, Kyotaro Horiguchi wrote:

The attached is a quick mock-up, but providing an approximation of my
thoughts. (For example, end_of_backup_reached could potentially be
extended to the ArchiveRecoveryRequested case and we could simplify
the condition..)

I am not sure why this is related to this thread..

static XLogRecPtr backupStartPoint;
static XLogRecPtr backupEndPoint;
static bool backupEndRequired = false;
+static bool backupEndReached = false;

Anyway, sneaking at your suggestion, this is actually outlining the
main issue I have with this code currently. We have so many static
booleans to control one behavior over the other that we always try to
make this code more complicated, while we should try to make it
simpler instead.
--
Michael

#12

David Steele

david@pgmasters.net

over 2 years ago

In reply to: Kyotaro Horiguchi (#9)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On 9/27/23 23:58, Kyotaro Horiguchi wrote:

At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in

My apologies for the long message, but this deserves some attention,
IMHO.

So, any thoughts?

Sorry for being late. However, I agree with David's concern regarding
the unnecessary inconvenience it introduces. I'd like to maintain the
functionality.

After some playing around, I find I agree with Michael on this, i.e.
require at least standby.signal when a backup_label is present.

According to my testing, you can preserve the "independent server"
functionality by setting archive_command = /bin/false. In this case the
timeline is not advanced and recovery proceeds from whatever is
available in pg_wal.

I think this type of recovery from a backup label without a timeline
change should absolutely be the exception, not the default as it seems
to be now. If the server is truly independent, then the timeline change
is not important. If the server is not independent, then the timeline
change is critical.

So overall, +1 for Michael's patch, though I have only read through it
and not tested it yet.

One comment, though, if we are going to require recovery.signal when
backup_label is present, should it just be implied? Why error and force
the user to create it?

Regards,
-David

#13

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: David Steele (#12)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote:

After some playing around, I find I agree with Michael on this, i.e. require
at least standby.signal when a backup_label is present.

According to my testing, you can preserve the "independent server"
functionality by setting archive_command = /bin/false. In this case the
timeline is not advanced and recovery proceeds from whatever is available in
pg_wal.

I've seen folks depend on such setups in the past, actually, letting a
process outside Postgres just "push" WAL segments to pg_wal instead of
Postgres pulling it with a restore_command or a primary_conninfo for a
standby.

I think this type of recovery from a backup label without a timeline change
should absolutely be the exception, not the default as it seems to be now.

This can mess up archives pretty easily, additionally, so it's not
something to encourage..

If the server is truly independent, then the timeline change is not
important. If the server is not independent, then the timeline change is
critical.

So overall, +1 for Michael's patch, though I have only read through it and
not tested it yet.

Reviews, thoughts and opinions are welcome.

One comment, though, if we are going to require recovery.signal when
backup_label is present, should it just be implied? Why error and force the
user to create it?

That's one thing I was considering, but I also cannot convince myself
that this is the best option because the presence of recovery.signal
or standby.standby (if both, standby.signal takes priority) makes it
clear what type of recovery is wanted at disk level. I'd be OK if
folks think that this is a sensible consensus, as well, even if I
don't really agree with it.

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used. All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

A last thing, that had better be covered in a separate thread and
patch, is about validateRecoveryParameters(). These days, I'd like to
think that it may be OK to lift at least the restriction on
restore_command being required if we are doing recovery to ease the
case of self-contained backups (aka the case where all the WAL needed
to reach a consistent point is in pg_wal/ or its tarball)
--
Michael

#14

David Steele

david@pgmasters.net

about 2 years ago

In reply to: Michael Paquier (#13)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On 9/28/23 19:59, Michael Paquier wrote:

On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote:

So overall, +1 for Michael's patch, though I have only read through it and
not tested it yet.

Reviews, thoughts and opinions are welcome.

OK, I have now reviewed and tested the patch and it looks good to me. I
stopped short of marking this RfC since there are other reviewers in the
mix.

I dislike that we need to repeat:

OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);

But I see the logic behind why you did it and there's no better way to
do it as far as I can see.

One comment, though, if we are going to require recovery.signal when
backup_label is present, should it just be implied? Why error and force the
user to create it?

That's one thing I was considering, but I also cannot convince myself
that this is the best option because the presence of recovery.signal
or standby.standby (if both, standby.signal takes priority) makes it
clear what type of recovery is wanted at disk level. I'd be OK if
folks think that this is a sensible consensus, as well, even if I
don't really agree with it.

I'm OK with keeping it as required for now.

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used. All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

This change makes it more obvious if configuration is missing (since
you'll get an error), however +1 for adding this to pg_basebackup.

A last thing, that had better be covered in a separate thread and
patch, is about validateRecoveryParameters(). These days, I'd like to
think that it may be OK to lift at least the restriction on
restore_command being required if we are doing recovery to ease the
case of self-contained backups (aka the case where all the WAL needed
to reach a consistent point is in pg_wal/ or its tarball)

Hmmm, I'm not sure about this. I'd prefer users set
restore_command=/bin/false explicitly to fetch WAL from pg_wal by
default if that's what they really intend.

Regards,
-David

#15

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: David Steele (#14)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote:

On 9/28/23 19:59, Michael Paquier wrote:
OK, I have now reviewed and tested the patch and it looks good to me. I
stopped short of marking this RfC since there are other reviewers in the
mix.

Thanks for the review. Yes, I am wondering if other people would
chime in here. It doesn't feel like this has gathered enough
opinions. Now this thread has been around for many months, and we've
done quite a few changes in the backup APIs in the last few years with
few users complaining back about them..

I dislike that we need to repeat:

OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch);

But I see the logic behind why you did it and there's no better way to do it
as far as I can see.

The main point is that there is no meaning in setting the latch until
the backup_label file is read because if ArchiveRecoveryRequested is
*not* set the startup process would outright fail as of the lack of
[recovery|standby].signal.

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used. All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

This change makes it more obvious if configuration is missing (since you'll
get an error), however +1 for adding this to pg_basebackup.

Looking at the streaming APIs of pg_basebackup, it looks like this
would be a matter of using bbstreamer_inject_file() to inject an empty
file into the stream. Still something seems to be off once
compression methods are involved.. Hmm. I am not sure. Well, this
could always be done as a patch independant of this one, under a
separate discussion. There are extra arguments about whether it would
be a good idea to add a recovery.signal even when taking a backup from
a standby, and do that only in 17~.

A last thing, that had better be covered in a separate thread and
patch, is about validateRecoveryParameters(). These days, I'd like to
think that it may be OK to lift at least the restriction on
restore_command being required if we are doing recovery to ease the
case of self-contained backups (aka the case where all the WAL needed
to reach a consistent point is in pg_wal/ or its tarball)

Hmmm, I'm not sure about this. I'd prefer users set
restore_command=/bin/false explicitly to fetch WAL from pg_wal by default if
that's what they really intend.

It wouldn't be the first time we break compatibility in this area, so
perhaps you are right and keeping this requirement is fine, even if it
requires one extra step when recovering a self-contained backup
generated by pg_basebackup. At least this forces users to look at
their setup and check if something is wrong. We'd likely finish with
a few "bug" reports, as well :D
--
Michael

#16

Bowen Shi

zxwsbg12138@gmail.com

about 2 years ago

In reply to: Michael Paquier (#15)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

It looks good to me.

I have reviewed the code and tested the patch with basic check-world test an pgbench test (metioned in /messages/by-id/ZQtzcH2lvo8leXEr@paquier.xyz

Another reviewer has also approved it, so I change the status to RFC.

The new status of this patch is: Ready for Committer

#17

Laurenz Albe

laurenz.albe@cybertec.at

about 2 years ago

In reply to: Michael Paquier (#15)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, 2023-10-16 at 14:54 +0900, Michael Paquier wrote:

Thanks for the review. Yes, I am wondering if other people would
chime in here. It doesn't feel like this has gathered enough
opinions.

I don't have strong feelings either way. If you have backup_label
but no signal file, starting PostgreSQL may succeed (if the WAL
with the checkpoint happens to be in pg_wal) or it may fail with
an error message. There is no danger of causing damage unless you
remove backup_label, right?

I cannot think of a use case where you use such a configuration on
purpose, and the current error message is more crypric than a plain
"you must have a signal file to start from a backup", so perhaps
your patch is a good idea.

Yours,
Laurenz Albe

#18

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Laurenz Albe (#17)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 16, 2023 at 05:48:43PM +0200, Laurenz Albe wrote:

I don't have strong feelings either way. If you have backup_label
but no signal file, starting PostgreSQL may succeed (if the WAL
with the checkpoint happens to be in pg_wal) or it may fail with
an error message. There is no danger of causing damage unless you
remove backup_label, right?

A bit more happens currently if you have a backup_label with no signal
files, unfortunately, because this causes some startup states to not
be initialized. See around here:
/messages/by-id/Y/Q/17rpYS7YGbIt@paquier.xyz
/messages/by-id/Y/v0c+3W89NBT/if@paquier.xyz

I cannot think of a use case where you use such a configuration on
purpose, and the current error message is more crypric than a plain
"you must have a signal file to start from a backup", so perhaps
your patch is a good idea.

I hope so.
--
Michael

#19

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Michael Paquier (#15)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 16, 2023 at 02:54:35PM +0900, Michael Paquier wrote:

On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote:

On 9/28/23 19:59, Michael Paquier wrote:

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used. All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

This change makes it more obvious if configuration is missing (since you'll
get an error), however +1 for adding this to pg_basebackup.

Looking at the streaming APIs of pg_basebackup, it looks like this
would be a matter of using bbstreamer_inject_file() to inject an empty
file into the stream. Still something seems to be off once
compression methods are involved.. Hmm. I am not sure. Well, this
could always be done as a patch independant of this one, under a
separate discussion. There are extra arguments about whether it would
be a good idea to add a recovery.signal even when taking a backup from
a standby, and do that only in 17~.

Hmm. On this specific point, it would actually be much simpler to
force recovery.signal to be in the contents streamed to a BASE_BACKUP.
This does not step on your proposal at [1]/messages/by-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net, though, because you'd
still require a .signal file for recovery as far as I understand :/

[1]: /messages/by-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net

Would folks be OK to move on with the patch of this thread at the end?
I am attempting a last-call kind of thing.
--
Michael

#20

David Steele

david@pgmasters.net

about 2 years ago

In reply to: Michael Paquier (#19)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On 10/27/23 03:22, Michael Paquier wrote:

On Mon, Oct 16, 2023 at 02:54:35PM +0900, Michael Paquier wrote:

On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote:

On 9/28/23 19:59, Michael Paquier wrote:

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used. All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

This change makes it more obvious if configuration is missing (since you'll
get an error), however +1 for adding this to pg_basebackup.

Looking at the streaming APIs of pg_basebackup, it looks like this
would be a matter of using bbstreamer_inject_file() to inject an empty
file into the stream. Still something seems to be off once
compression methods are involved.. Hmm. I am not sure. Well, this
could always be done as a patch independant of this one, under a
separate discussion. There are extra arguments about whether it would
be a good idea to add a recovery.signal even when taking a backup from
a standby, and do that only in 17~.

Hmm. On this specific point, it would actually be much simpler to
force recovery.signal to be in the contents streamed to a BASE_BACKUP.

That sounds like the right plan to me. Nice and simple.

This does not step on your proposal at [1], though, because you'd
still require a .signal file for recovery as far as I understand :/

[1]: /messages/by-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net

Yes.

Would folks be OK to move on with the patch of this thread at the end?
I am attempting a last-call kind of thing.

I'm still +1 for the patch as it stands.

Regards,
-David

#21

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: David Steele (#20)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Fri, Oct 27, 2023 at 09:31:10AM -0400, David Steele wrote:

That sounds like the right plan to me. Nice and simple.

I'll tackle that in a separate thread with a patch registered for the
upcoming CF of November.

I'm still +1 for the patch as it stands.

I have been reviewing the patch, and applied portions of it as of
dc5bd388 and 1ffdc03c and they're quite independent pieces. After
that, the remaining bits of the patch to change the behavior is now
straight-forward. I have written a commit message for it, while on
it, as per the attached.
--
Michael

Attachments:

v4-0001-Require-recovery.signal-or-standby.signal-when-re.patchtext/x-diff; charset=us-asciiDownload

From 26a8432fe3ab8426e7797d85d19b0fe69d3384c9 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 30 Oct 2023 16:02:52 +0900
Subject: [PATCH v4] Require recovery.signal or standby.signal when reading a
 backup_file

Historically, the startup process uses two static variables to control
if archive recovery should happen, when either recovery.signal or
standby.signal are defined in the data folder at the beginning of
recovery:
- ArchiveRecoveryRequested, set to "true" if either .signal file exists.
This controls if archive recovery has been requested.
- InArchiveRecovery, that controls if recovery should switch to archive
mode during recovery.  It is initially "false" but switched to "true"
during recovery for two cases:
-- Just after finding and reading a valid backup_label file, close to
the beginning of recovery.
-- When no backup_label file is found, it would be set to "false" to
force crash recovery, where all the WAL located in pg_wal/ is read
first.  When all the local WAL records have been consumed, it is
changed to "true" to switch to archive recovery mode, equivalent to a
base backup.

Now comes the fact that recovery has an old logical problem: when a
backup_label is found but no .signal file is defined, the startup
process finishes in a state where ArchiveRecoveryRequested is "false"
but InArchiveRecovery is "true".  This is proving to cause various
issues, that got worse since restart points can run during crash
recovery for the benefit of being able to start and stop instances
without doing crash recovery from its initial point (7ff23c6d277d):
- Some standby states needed by archive recovery would not be set,
causing correctness issues, like assertion failures during recovery.
- recoveryWakeupLatch would not be set.
- Some GUCs are messed up, hot_standby that depends on
ArchiveRecoveryRequested.

This configuration was possible when recovering from a base backup taken
by pg_basebackup without -R.  Note that the documentation requires at
least to set recovery.signal to restore from a backup, but the startup
process was not making this policy explicit.  In most cases, one would
have been able to complete recovery, but that's a matter of luck,
really, as it depends on the workload of the origin server.  This has
the advantage of simplifying the logic for the archive recovery case, as
InArchiveRecovery now requires ArchiveRecoveryRequested to be set.
Recovering a self-contained backup now requires a recovery.signal, with
a restore_command set in postgresql.conf (or related conf file).

One test of pg_basebackup and one test of pg_rewind need to be tweaked
to avoid the FATAL introduced by this patch when a base_backup is found
without a .signal file, even if the intention of both is to have a
recovery.signal set.

Reviewed-by: David Steele, Bowen Shi
Discussion: https://postgr.es/m/ZArVOMifjzE7f8W7@paquier.xyz
Discussion: https://postgr.es/m/Y/Q/17rpYS7YGbIt@paquier.xyz
Discussion: https://postgr.es/m/Y/v0c+3W89NBT/if@paquier.xyz
---
 src/backend/access/transam/xlogrecovery.c     | 22 ++++++++++++++++---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |  3 ++-
 src/bin/pg_rewind/t/008_min_recovery_point.pl |  1 +
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c61566666a..0088dfec2e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -125,14 +125,18 @@ static TimeLineID curFileTLI;
 
 /*
  * When ArchiveRecoveryRequested is set, archive recovery was requested,
- * ie. signal files were present.  When InArchiveRecovery is set, we are
- * currently recovering using offline XLOG archives.  These variables are only
- * valid in the startup process.
+ * ie. signal files or backup_label were present.  When InArchiveRecovery is
+ * set, we are currently recovering using offline XLOG archives.  These
+ * variables are only valid in the startup process.  Note that the presence of
+ * a backup_label file requires the presence of one of the two .signal files
+ * to enforce archive recovery.
  *
  * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
  */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
@@ -594,6 +598,12 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	{
 		List	   *tablespaces = NIL;
 
+		if (!ArchiveRecoveryRequested)
+			ereport(FATAL,
+					(errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"),
+					 errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+							 DataDir, DataDir)));
+
 		/*
 		 * Archive recovery was requested, and thanks to the backup label
 		 * file, we know how far we need to replay to reach consistency. Enter
@@ -1591,6 +1601,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b..b9e54f4562 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -392,7 +392,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir = "$tempdir/tblspc1replica";
diff --git a/src/bin/pg_rewind/t/008_min_recovery_point.pl b/src/bin/pg_rewind/t/008_min_recovery_point.pl
index d4c89451e6..1cff5b7019 100644
--- a/src/bin/pg_rewind/t/008_min_recovery_point.pl
+++ b/src/bin/pg_rewind/t/008_min_recovery_point.pl
@@ -152,6 +152,7 @@ move(
 	"$tmp_folder/node_2-postgresql.conf.tmp",
 	"$node_2_pgdata/postgresql.conf");
 
+$node_2->append_conf('standby.signal', '');
 $node_2->start;
 
 # Check contents of the test tables after rewind. The rows inserted in node 3
-- 
2.42.0

#22

Roberto Mello

roberto.mello@gmail.com

about 2 years ago

In reply to: Michael Paquier (#21)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 1:09 AM Michael Paquier <michael@paquier.xyz> wrote:

I have been reviewing the patch, and applied portions of it as of
dc5bd388 and 1ffdc03c and they're quite independent pieces. After
that, the remaining bits of the patch to change the behavior is now
straight-forward. I have written a commit message for it, while on
it, as per the attached.

A suggestion for the hint message in an effort to improve readability:

"If you are restoring from a backup, ensure \"%s/recovery.signal\" or
\"%s/standby.signal\" is present and add required recovery options."

I realize the original use of "touch" is a valid shortcut for what I
suggest above, however that will be less clear for the not-so-un*x-inclined
users of Postgres, while for some it'll be downright confusing, IMHO. It
also provides the advantage of being crystal clear on what needs to be done
to fix the problem.

Roberto

#23

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Michael Paquier (#21)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 3:09 AM Michael Paquier <michael@paquier.xyz> wrote:

I have been reviewing the patch, and applied portions of it as of
dc5bd388 and 1ffdc03c and they're quite independent pieces. After
that, the remaining bits of the patch to change the behavior is now
straight-forward. I have written a commit message for it, while on
it, as per the attached.

I would encourage some caution here.

In a vacuum, I'm in favor of this, and for the same reasons as you,
namely, that the huge pile of Booleans that we use to control recovery
is confusing, and it's difficult to make sure that all the code paths
are adequately tested, and I think some of the things that actually
work here are not documented.

But in practice, I think there is a possibility of something like this
backfiring very hard. Notice that the first two people who commented
on the thread saw the error and immediately removed backup_label even
though that's 100% wrong. It shows how utterly willing users are to
remove backup_label for any reason or no reason at all. If we convert
cases where things would have worked into cases where people nuke
backup_label and then it appears to work, we're going to be worse off
in the long run, no matter how crazy the idea of removing backup_label
may seem to us.

Also, Andres just recently mentioned to me that he uses this procedure
of starting a server with a backup_label but no recovery.signal or
standby.signal file regularly, and thinks other people do too. I was
surprised, since I've never done that, except maybe when I was a noob
and didn't have a clue. But Andres is far from a noob.

--
Robert Haas
EDB: http://www.enterprisedb.com

#24

Andres Freund

andres@anarazel.de

about 2 years ago

In reply to: Michael Paquier (#21)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

Hi,

On 2023-10-30 16:08:50 +0900, Michael Paquier wrote:

From 26a8432fe3ab8426e7797d85d19b0fe69d3384c9 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 30 Oct 2023 16:02:52 +0900
Subject: [PATCH v4] Require recovery.signal or standby.signal when reading a
backup_file

Historically, the startup process uses two static variables to control
if archive recovery should happen, when either recovery.signal or
standby.signal are defined in the data folder at the beginning of
recovery:

I think the problem with these variables is that they're a really messy state
machine - something this patch doesn't meaningfully improve IMO.

This configuration was possible when recovering from a base backup taken
by pg_basebackup without -R. Note that the documentation requires at
least to set recovery.signal to restore from a backup, but the startup
process was not making this policy explicit.

Maybe I just didn't check the right place, but from I saw, this, at most, is
implied, rather than explicitly stated.

In most cases, one would have been able to complete recovery, but that's a
matter of luck, really, as it depends on the workload of the origin server.

With -X ... we have all the necessary WAL locally, how does the workload on
the primary matter? If you pass --no-slot, pg_basebackup might fail to fetch
the necessary wal, but then you'd also have gotten an error.

I agree with Robert that this would be a good error check on a green field,
but that I am less convinced it's going to help more than hurt now.

Right now running pg_basebackup with -X stream, without --write-recovery-conf,
gives you a copy of a cluster that will come up correctly as a distinct
instance.

With this change applied, you need to know that the way to avoid the existing
FATAL about restore_command at startup (when recovery.signal exists but
restore_command isn't set)) is to is to set "restore_command = false",
something we don't explain anywhere afaict. We should lessen the need to ever
use restore_command, not increase it.

It also seems risky to have people get used to restore_command = false,
because that effectively disables detection of other timelines etc. But, this
method does force a new timeline - which will be the same on each clone of the
database...

I also just don't think that it's always desirable to create a new timeline.

Greetings,

Andres Freund

#25

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Robert Haas (#23)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 01:55:13PM -0400, Robert Haas wrote:

I would encourage some caution here.

Thanks for chiming here.

In a vacuum, I'm in favor of this, and for the same reasons as you,
namely, that the huge pile of Booleans that we use to control recovery
is confusing, and it's difficult to make sure that all the code paths
are adequately tested, and I think some of the things that actually
work here are not documented.

Yep, same feeling here.

But in practice, I think there is a possibility of something like this
backfiring very hard. Notice that the first two people who commented
on the thread saw the error and immediately removed backup_label even
though that's 100% wrong. It shows how utterly willing users are to
remove backup_label for any reason or no reason at all. If we convert
cases where things would have worked into cases where people nuke
backup_label and then it appears to work, we're going to be worse off
in the long run, no matter how crazy the idea of removing backup_label
may seem to us.

As far as I know, there's one paragraph in the docs that implies this
mode without giving an actual hint that this may be OK or not, so
shrug:
https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS
"As with base backups, the easiest way to produce a standalone hot
backup is to use the pg_basebackup tool. If you include the -X
parameter when calling it, all the write-ahead log required to use the
backup will be included in the backup automatically, and no special
action is required to restore the backup."

And a few lines down we imply to use restore_command, something that
we check is set only if recovery.signal is set. See additionally
validateRecoveryParameters(), where the comments imply that
InArchiveRecovery would be set only when there's a restore command.

As you're telling me, and I've considered that as an option as well,
perhaps we should just consider the presence of a backup_label file
with no .signal files as a synonym of crash recovery? In the recovery
path, currently the essence of the problem is when we do
InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
that it should do archive recovery but we don't want it, and that does
not really make sense. The rest of the code sort of implies that this
is not a suported combination. So basically, my suggestion here, is
to just replay WAL up to the end of what's in your local pg_wal/ and
hope for the best, without TLI jumps, except that we'd do nothing.
Doing a pg_basebackup -X stream followed by a restart would work fine
with that, because all the WAL is here.

A point of contention is if we'd better be stricter about satisfying
backupEndPoint in such a case, but the redo code only wants to do
something here when ArchiveRecoveryRequested is set (aka there's a
.signal file set), and we would not want a TLI jump at the end of
recovery, so I don't see an argument with caring about backupEndPoint
in this case.

At the end, I'm OK as long as ArchiveRecoveryRequested=false
InArchiveRecovery=true does not exist anymore, because it's much
easier to get what's going on with the redo path, IMHO.

(I have a patch at hand to show the idea, will post it with a reply to
Andres' message.)

Also, Andres just recently mentioned to me that he uses this procedure
of starting a server with a backup_label but no recovery.signal or
standby.signal file regularly, and thinks other people do too. I was
surprised, since I've never done that, except maybe when I was a noob
and didn't have a clue. But Andres is far from a noob.

At this stage, that's basically at your own risk, as the code thinks
it's OK to force what's basically archive-recovery-without-being-it.
So it basically works, but it can also easily backfire, as well..
--
Michael

#26

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Roberto Mello (#22)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 10:32:28AM -0600, Roberto Mello wrote:

I realize the original use of "touch" is a valid shortcut for what I
suggest above, however that will be less clear for the not-so-un*x-inclined
users of Postgres, while for some it'll be downright confusing, IMHO. It
also provides the advantage of being crystal clear on what needs to be done
to fix the problem.

Indeed, "touch" may be better in this path if we'd throw an ERROR to
enforce a given policy, and that's more consistent with the rest of
the area.
--
Michael

#27

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Andres Freund (#24)

1 attachment(s)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 12:47:41PM -0700, Andres Freund wrote:

I think the problem with these variables is that they're a really messy state
machine - something this patch doesn't meaningfully improve IMO.

Okay. Yes, this is my root issue as well. We're at the stage where
we should reduce the possible set of combinations and assumptions
we're inventing because people can do undocumented stuff, then perhaps
refactor the code on top of that (say, if one combination with too
booleans is not possible, switch to a three-state enum rather than 2
bools, etc).

This configuration was possible when recovering from a base backup taken
by pg_basebackup without -R. Note that the documentation requires at
least to set recovery.signal to restore from a backup, but the startup
process was not making this policy explicit.

Maybe I just didn't check the right place, but from I saw, this, at most, is
implied, rather than explicitly stated.

See the doc reference here:
/messages/by-id/ZUBM6BNQnEh7lzIM@paquier.xyz

So it kind of implies it, still also mentions restore_command. It's
like Schrödinger's cat, yes and no at the same time.

With -X ... we have all the necessary WAL locally, how does the workload on
the primary matter? If you pass --no-slot, pg_basebackup might fail to fetch
the necessary wal, but then you'd also have gotten an error.

[...]

Right now running pg_basebackup with -X stream, without --write-recovery-conf,
gives you a copy of a cluster that will come up correctly as a distinct
instance.

[...]

I also just don't think that it's always desirable to create a new timeline.

Yeah. Another argument I was mentioning to Robert is that we may want
to just treat the case where you have a backup_label without any
signal files just the same as crash recovery, replaying all the local
pg_wal/, and nothing else. For example, something like the attached
should make sure that InArchiveRecovery=true should never be set if
ArchiveRecoveryRequested is not set.

The attached would still cause redo to complain on a "WAL ends before
end of online backup" if not all the WAL is here (reason behind the
tweak of 010_pg_basebackup.pl, but the previous tweak to pg_rewind's
008_min_recovery_point.pl is not required here).

Attached is the idea I had in mind, in terms of code, FWIW.
--
Michael

Attachments:

0001-Force-crash-recovery-with-backup_label-and-no-.signa.patchtext/x-diff; charset=us-asciiDownload

From fbd9fa407c04799ad4401d3ba5c6b67a0d922631 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 31 Oct 2023 10:14:01 +0900
Subject: [PATCH] Force crash recovery with backup_label and no .signal files

---
 src/backend/access/transam/xlogrecovery.c    | 29 ++++++++++++++++----
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  3 +-
 doc/src/sgml/backup.sgml                     |  4 ++-
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c61566666a..de5787d7e8 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -133,6 +133,8 @@ static TimeLineID curFileTLI;
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
  */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
@@ -595,13 +597,22 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		List	   *tablespaces = NIL;
 
 		/*
-		 * Archive recovery was requested, and thanks to the backup label
-		 * file, we know how far we need to replay to reach consistency. Enter
-		 * archive recovery directly.
+		 * If archive recovery was requested, and we know how far we need to
+		 * replay to reach consistency thanks to the backup label file, then
+		 * enter archive recovery directly in this case.
+		 *
+		 * If archive recovery was not requested, then do crash recovery and
+		 * replay all the local WAL.  This still checks that all the WAL up
+		 * to backupEndRequired has been replayed.  This case is useful when
+		 * restoring from a standalone base backup, taken with pg_basebackup
+		 * --wal-method=stream, for example.
 		 */
-		InArchiveRecovery = true;
-		if (StandbyModeRequested)
-			EnableStandbyMode();
+		if (ArchiveRecoveryRequested)
+		{
+			InArchiveRecovery = true;
+			if (StandbyModeRequested)
+				EnableStandbyMode();
+		}
 
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -1591,6 +1602,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b..b9e54f4562 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -392,7 +392,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir = "$tempdir/tblspc1replica";
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 8cb24d6ae5..5ba7a284cf 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1380,7 +1380,9 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
       tool. If you include the <literal>-X</literal> parameter when calling
       it, all the write-ahead log required to use the backup will be
       included in the backup automatically, and no special action is
-      required to restore the backup.
+      required to restore the backup. Restoring a standalone backup is
+      equivalent to crash recovery, replaying all the WAL stored in
+      <filename>pg_wal</filename> up to its end.
      </para>
     </sect3>
 
-- 
2.42.0

#28

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Michael Paquier (#25)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Oct 30, 2023 at 8:40 PM Michael Paquier <michael@paquier.xyz> wrote:

As far as I know, there's one paragraph in the docs that implies this
mode without giving an actual hint that this may be OK or not, so
shrug:
https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS
"As with base backups, the easiest way to produce a standalone hot
backup is to use the pg_basebackup tool. If you include the -X
parameter when calling it, all the write-ahead log required to use the
backup will be included in the backup automatically, and no special
action is required to restore the backup."

I see your point, but that's way too subtle. As far as I know, the
only actually-documented procedure for restoring is this one:

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY

That procedure actually is badly in need of some updating, IMHO,
because close to half of it is about moving your existing database
cluster out of the way, which may or may not be needed in the case of
any particular backup restore. Also, it unconditionally mentions
creating recovery.signal, with no mention of standby.signal. And
certainly not with neither. It also gives zero motivation for actually
doing this and says nothing useful about backup_label.

Both recovery.signal and standby.signal are documented in
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY
but you'd have no real reason to look in a list of GUCs for
information about a file on disk. recovery.signal but not
standby.signal is mentioned in
https://www.postgresql.org/docs/current/warm-standby.html but nowhere
that I can find do we explicitly talk about running with at least one
of them.

As you're telling me, and I've considered that as an option as well,
perhaps we should just consider the presence of a backup_label file
with no .signal files as a synonym of crash recovery? In the recovery
path, currently the essence of the problem is when we do
InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
that it should do archive recovery but we don't want it, and that does
not really make sense. The rest of the code sort of implies that this
is not a suported combination. So basically, my suggestion here, is
to just replay WAL up to the end of what's in your local pg_wal/ and
hope for the best, without TLI jumps, except that we'd do nothing.

This sentence seems to be incomplete.

But I was not saying we should treat the case where we have a
backup_label file like crash recovery. The real question here is why
we don't treat it fully like archive recovery. I don't know off-hand
what is different if I start the server with both backup_label and
recovery.signal vs. if I start it with only backup_label, but I
question whether there should be any difference at all.

A point of contention is if we'd better be stricter about satisfying
backupEndPoint in such a case, but the redo code only wants to do
something here when ArchiveRecoveryRequested is set (aka there's a
.signal file set), and we would not want a TLI jump at the end of
recovery, so I don't see an argument with caring about backupEndPoint
in this case.

This is a bit hard for me to understand, but I disagree strongly with
the idea that we should ever ignore a backup end point if we have one.

--
Robert Haas
EDB: http://www.enterprisedb.com

#29

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Robert Haas (#28)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Tue, Oct 31, 2023 at 08:28:07AM -0400, Robert Haas wrote:

On Mon, Oct 30, 2023 at 8:40 PM Michael Paquier <michael@paquier.xyz> wrote:

As far as I know, there's one paragraph in the docs that implies this
mode without giving an actual hint that this may be OK or not, so
shrug:
https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS
"As with base backups, the easiest way to produce a standalone hot
backup is to use the pg_basebackup tool. If you include the -X
parameter when calling it, all the write-ahead log required to use the
backup will be included in the backup automatically, and no special
action is required to restore the backup."

I see your point, but that's way too subtle. As far as I know, the
only actually-documented procedure for restoring is this one:
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY

That procedure actually is badly in need of some updating, IMHO,
because close to half of it is about moving your existing database
cluster out of the way, which may or may not be needed in the case of
any particular backup restore. Also, it unconditionally mentions
creating recovery.signal, with no mention of standby.signal. And
certainly not with neither. It also gives zero motivation for actually
doing this and says nothing useful about backup_label.

Both recovery.signal and standby.signal are documented in
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY
but you'd have no real reason to look in a list of GUCs for
information about a file on disk. recovery.signal but not
standby.signal is mentioned in
https://www.postgresql.org/docs/current/warm-standby.html but nowhere
that I can find do we explicitly talk about running with at least one
of them.

Point 7. of what you quote says to use one? True that this needs a
refresh, and perhaps a bit fat warning about the fact that these are
required if you want to fetch WAL from other sources than the local
pg_wal/. Perhaps there may be a point of revisiting the default
behavior of recovery_target_timeline in this case, I don't know.

As you're telling me, and I've considered that as an option as well,
perhaps we should just consider the presence of a backup_label file
with no .signal files as a synonym of crash recovery? In the recovery
path, currently the essence of the problem is when we do
InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
that it should do archive recovery but we don't want it, and that does
not really make sense. The rest of the code sort of implies that this
is not a suported combination. So basically, my suggestion here, is
to just replay WAL up to the end of what's in your local pg_wal/ and
hope for the best, without TLI jumps, except that we'd do nothing.

This sentence seems to be incomplete.

I've re-read it, and it looks OK to me. What I mean with this
paragraph are two things:
- Remove InArchiveRecovery=true and ArchiveRecoveryRequested=false as
a possible combination in the code.
- Treat backup_label with no .signal file as the same as crash
recovery, that:
-- Does no TLI jump at the end of recovery.
-- Expects all the WAL to be in pg_wal/.

But I was not saying we should treat the case where we have a
backup_label file like crash recovery. The real question here is why
we don't treat it fully like archive recovery.

Timeline jump at the end of recovery? Archive recovery forces a TLI
jump by default at the end of redo if there's a signal file, and some
users may not want a TLI jump by default?

I don't know off-hand
what is different if I start the server with both backup_label and
recovery.signal vs. if I start it with only backup_label, but I
question whether there should be any difference at all.

Perhaps we could do that, but note that backup_label is renamed to
backup_label.old at the beginning of redo. The code has historically
always enforced InArchiveRecovery=true when there's a backup label,
and InArchiveRecovery=false where there is no backup label, so we
don't get the same recovery behavior if a cluster is restarted while
it was still performing recovery. I don't quite see how it is
possible to make this code simpler without enforcing a policy to take
care of this inconsistency. I've listed two of them on this thread:
- Force the presence of a .recovery file when there is a
backup_label, to force archive recovery.
- Force crash recovery if there are no signal files but a
backup_label, then a restart of a cluster that began a restore while
it processed a backup would be confused: should it do crash recovery
or archive recovery?

My guess, based on what I read from the feedback of this thread, is
that it could be more helpful to do the second thing, not the third
one, because this is better with standalone backups: no TLI jumps and
restore happens with all the local WAL in pg_wal/, without any GUCs to
control how recovery should run.

You are suggesting a third, hybrid, approach. Now note we have always
checked for signal files before the backup_label. Recovery GUCs are
checked only if there's one of the two signal files. It seems to me
that what you are suggesting would make the code a bit harder to
follow, actually, and more inconsistent with stable branches because
we would need to check the control file contents *before* checking for
the .signal files or backup_label to be able to see if archive
recovery *should* happen, depending on if there's a backupEndPoint.

A point of contention is if we'd better be stricter about satisfying
backupEndPoint in such a case, but the redo code only wants to do
something here when ArchiveRecoveryRequested is set (aka there's a
.signal file set), and we would not want a TLI jump at the end of
recovery, so I don't see an argument with caring about backupEndPoint
in this case.

This is a bit hard for me to understand, but I disagree strongly with
the idea that we should ever ignore a backup end point if we have one.

Actually, while experimenting yesterday before sending my reply to
you, I have noticed that redo cares about backupEndPoint even if you
force crash recovery when there's only a backup_label file. There's a
case in pg_basebackup that would fail, but that's accidental, AFAIK:
/messages/by-id/ZUBVKfL6FR6NOQyt@paquier.xyz

See in StartupXLOG(), around the comment "complain if we did not roll
forward far enough to reach". This complains if archive recovery has
been requested *or* if we retrieved a backup end LSN from the
backup_label.
--
Michael

#30

Kyotaro Horiguchi

horikyota.ntt@gmail.com

about 2 years ago

In reply to: Michael Paquier (#29)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

At Wed, 1 Nov 2023 08:39:17 +0900, Michael Paquier <michael@paquier.xyz> wrote in

See in StartupXLOG(), around the comment "complain if we did not roll
forward far enough to reach". This complains if archive recovery has
been requested *or* if we retrieved a backup end LSN from the
backup_label.

Please note that backupStartPoint is not reset even when reaching the
backup end point during crash recovery. If backup_label enforces
archive recovery, I think this point won't be an issue as you
mentioned. For the record, my earlier proposal aimed to detect
reaching the end point even during crash recovery.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#31

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Kyotaro Horiguchi (#30)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Thu, Nov 02, 2023 at 11:03:35AM +0900, Kyotaro Horiguchi wrote:

At Wed, 1 Nov 2023 08:39:17 +0900, Michael Paquier <michael@paquier.xyz> wrote in

See in StartupXLOG(), around the comment "complain if we did not roll
forward far enough to reach". This complains if archive recovery has
been requested *or* if we retrieved a backup end LSN from the
backup_label.

Please note that backupStartPoint is not reset even when reaching the
backup end point during crash recovery. If backup_label enforces
archive recovery, I think this point won't be an issue as you
mentioned. For the record, my earlier proposal aimed to detect
reaching the end point even during crash recovery.

Good point. Not doing ReachedEndOfBackup() at the end of crash
recovery feels inconsistent, especially since we care about some of
these fields in this case.

If a .signal file is required when we read a backup_label, yes that
would not be a problem because we'd always link backupEndPoint's
destiny with a requested archive recovery, but there seem to be little
love for enforcing that based on the feedback of this thread, so..
--
Michael

#32

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Michael Paquier (#29)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Tue, Oct 31, 2023 at 7:39 PM Michael Paquier <michael@paquier.xyz> wrote:

Point 7. of what you quote says to use one? True that this needs a
refresh, and perhaps a bit fat warning about the fact that these are
required if you want to fetch WAL from other sources than the local
pg_wal/. Perhaps there may be a point of revisiting the default
behavior of recovery_target_timeline in this case, I don't know.

I don't really know what to say to this -- sure, point 7 of
"Recovering Using a Continuous Archive Backup" says to use
recovery.signal. But as I said in the preceding paragraph, it doesn't
say either "use recovery.signal or standby.signal". Nor does it or
anything else in the documentation explain under what circumstances
you're allowed to have neither. So the whole thing is very unclear.

As you're telling me, and I've considered that as an option as well,
perhaps we should just consider the presence of a backup_label file
with no .signal files as a synonym of crash recovery? In the recovery
path, currently the essence of the problem is when we do
InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
that it should do archive recovery but we don't want it, and that does
not really make sense. The rest of the code sort of implies that this
is not a suported combination. So basically, my suggestion here, is
to just replay WAL up to the end of what's in your local pg_wal/ and
hope for the best, without TLI jumps, except that we'd do nothing.

This sentence seems to be incomplete.

I've re-read it, and it looks OK to me.

Well, the sentence ends with "except that we'd do nothing" and I don't
know what that means. It would make sense to me if it said "except
that we'd do nothing about <whatever>" or "except that we'd do nothing
instead of <something>" but as you've written it basically seems to
boil down to "my suggestion is to replay WAL except do nothing" which
makes no sense. If you replay WAL, you're not doing nothing.

But I was not saying we should treat the case where we have a
backup_label file like crash recovery. The real question here is why
we don't treat it fully like archive recovery.

Timeline jump at the end of recovery? Archive recovery forces a TLI
jump by default at the end of redo if there's a signal file, and some
users may not want a TLI jump by default?

Uggh. I don't know what to think about that. I bet some people do want
that, but that makes it pretty easy to end up with multiple copies of
the same cluster running on the same TLI, too, which is not a thing
that you really want to have happen.

At the end of the day, I'm coming around to the view that the biggest
problem here is the documentation. Nobody can really know what's
supposed to work right now because the documentation doesn't say which
things you are and are not allowed to do and what results you should
expect in each case. If it did, it would be easier to discuss possible
behavior changes. Right now, it's hard to change any code at all,
because there's no list of supported scenarios, so you can't tell
whether a potential change affects a scenario that somebody thinks
should work, or only cases that nobody can possibly care about. It's
sort of possible to reason your way through that, to an extent, but
it's pretty hard. The fact that I didn't know that starting from a
backup with neither recovery.signal nor standby.signal was a thing
that anybody did or cared about is good evidence of that.

I'm coming to the understanding that we have four supported scenarios.
One, no backup_label, no recovery.signal, and no standby.signal.
Hence, replay WAL until the end, then start up. Two, backup_label
exists but neither recovery.signal nor standby.signal does. As before,
but if I understand correctly, now we can check that we reached the
backup end location. Three, recovery.signal exists, with or without
backup_label. Now we create a new TLI at the end of recovery, and
also, now can fetch WAL that is not present in pg_wal using
primary_conninfo or restore_command. In fact, I think we may prefer to
do that over using WAL we have locally, but I'm not quite sure about
that. Fourth, standby.signal exists, with or without backup_label. As
the previous scenario, but now when we reach the end of WAL we wait
for more to appear instead of ending recovery. I have a feeling this
is not quite an exhaustive list of differences between the various
modes, and I'm not even sure that it lists all of the things someone
might try to do. Thoughts?

I also feel like the terminology here sometimes obscures more than it
illuminates. For instance, it seems like ArchiveRecoveryRequested
really means "are any signal files present?" while InArchiveRecovery
means "are we fetching WAL from outside pg_wal rather than using
what's in pg_wal?". But these are not obvious from the names, and
sometimes we have additional variables with overlapping meanings, like
readSource, which indicates whether we're reading from pg_wal, the
archive, or the walreceiver, and yet is probably not redundant with
InArchiveRecovery. In any event, I think that we need to start with
the question of what behavior(s) we want to expose to users, and then
back into the question of what internal variables and states need to
exist in order to support that behavior. We cannot start by deciding
what variables we'd like to get rid of and then trying to justify the
resulting behavior changes on the grounds that they simplify the code.
Users aren't going to like that, hackers aren't going to like that,
and the resulting behavior probably won't be anything great.

--
Robert Haas
EDB: http://www.enterprisedb.com

#33

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Robert Haas (#32)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Wed, Nov 08, 2023 at 01:16:58PM -0500, Robert Haas wrote:

On Tue, Oct 31, 2023 at 7:39 PM Michael Paquier <michael@paquier.xyz> wrote:

As you're telling me, and I've considered that as an option as well,
perhaps we should just consider the presence of a backup_label file
with no .signal files as a synonym of crash recovery? In the recovery
path, currently the essence of the problem is when we do
InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
that it should do archive recovery but we don't want it, and that does
not really make sense. The rest of the code sort of implies that this
is not a suported combination. So basically, my suggestion here, is
to just replay WAL up to the end of what's in your local pg_wal/ and
hope for the best, without TLI jumps, except that we'd do nothing.

This sentence seems to be incomplete.

I've re-read it, and it looks OK to me.

Well, the sentence ends with "except that we'd do nothing" and I don't
know what that means. It would make sense to me if it said "except
that we'd do nothing about <whatever>" or "except that we'd do nothing
instead of <something>" but as you've written it basically seems to
boil down to "my suggestion is to replay WAL except do nothing" which
makes no sense. If you replay WAL, you're not doing nothing.

Sure, sorry for the confusion. By "we'd do nothing", I mean precirely
"to take no specific action related to archive recovery and recovery
parameters at the end of recovery", meaning that a combination of
backup_label with no signal file would be the same as crash recovery,
replaying WAL up to the end of what can be found in pg_wal/, and only
that.

But I was not saying we should treat the case where we have a
backup_label file like crash recovery. The real question here is why
we don't treat it fully like archive recovery.

Timeline jump at the end of recovery? Archive recovery forces a TLI
jump by default at the end of redo if there's a signal file, and some
users may not want a TLI jump by default?

Uggh. I don't know what to think about that. I bet some people do want
that, but that makes it pretty easy to end up with multiple copies of
the same cluster running on the same TLI, too, which is not a thing
that you really want to have happen.

Andres has mentioned upthread that this is something he's been using
to quickly be able to clone a cluster. I would not recommend doing
that, personally, but if that's useful in some cases, well, why not.

At the end of the day, I'm coming around to the view that the biggest
problem here is the documentation. Nobody can really know what's
supposed to work right now because the documentation doesn't say which
things you are and are not allowed to do and what results you should
expect in each case. If it did, it would be easier to discuss possible
behavior changes. Right now, it's hard to change any code at all,
because there's no list of supported scenarios, so you can't tell
whether a potential change affects a scenario that somebody thinks
should work, or only cases that nobody can possibly care about. It's
sort of possible to reason your way through that, to an extent, but
it's pretty hard. The fact that I didn't know that starting from a
backup with neither recovery.signal nor standby.signal was a thing
that anybody did or cared about is good evidence of that.

That's one problem, not all of it, because the code takes extra
assumptions around that.

I also feel like the terminology here sometimes obscures more than it
illuminates. For instance, it seems like ArchiveRecoveryRequested
really means "are any signal files present?" while InArchiveRecovery
means "are we fetching WAL from outside pg_wal rather than using
what's in pg_wal?". But these are not obvious from the names, and
sometimes we have additional variables with overlapping meanings, like
readSource, which indicates whether we're reading from pg_wal, the
archive, or the walreceiver, and yet is probably not redundant with
InArchiveRecovery. In any event, I think that we need to start with
the question of what behavior(s) we want to expose to users, and then
back into the question of what internal variables and states need to
exist in order to support that behavior. We cannot start by deciding
what variables we'd like to get rid of and then trying to justify the
resulting behavior changes on the grounds that they simplify the code.
Users aren't going to like that, hackers aren't going to like that,
and the resulting behavior probably won't be anything great.

Note as well that InArchiveRecovery is set when there's a
backup_label, but that the code would check for the existence of a
restore_command only if a signal file exists. That's strange, but if
people have been relying on this behavior, so be it.

At this stage, it looks pretty clear to me that there's no consensus
on what to do, and nobody's happy with the proposal of this thread, so
I am going to mark it as rejected.
--
Michael

#34

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Michael Paquier (#33)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote:

Sure, sorry for the confusion. By "we'd do nothing", I mean precirely
"to take no specific action related to archive recovery and recovery
parameters at the end of recovery", meaning that a combination of
backup_label with no signal file would be the same as crash recovery,
replaying WAL up to the end of what can be found in pg_wal/, and only
that.

By being slightly more precise. I also mean to fail recovery if it is
not possible to replay up to the end-of-backup LSN marked in the label
file because we are missing some stuff in pg_wal/, which is something
that the code is currently able to handle.
--
Michael

#35

Andres Freund

andres@anarazel.de

about 2 years ago

In reply to: Michael Paquier (#34)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

Hi,

On 2023-11-09 12:16:52 +0900, Michael Paquier wrote:

On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote:

Sure, sorry for the confusion. By "we'd do nothing", I mean precirely
"to take no specific action related to archive recovery and recovery
parameters at the end of recovery", meaning that a combination of
backup_label with no signal file would be the same as crash recovery,
replaying WAL up to the end of what can be found in pg_wal/, and only
that.

I don't think those are equivalent - in the "backup_label with no signal file"
case we start recovery at a different location than the "crash recovery" case
does.

By being slightly more precise. I also mean to fail recovery if it is
not possible to replay up to the end-of-backup LSN marked in the label
file because we are missing some stuff in pg_wal/, which is something
that the code is currently able to handle.

"able to handle" as in detect and error out? Because that's the only possible
sane thing to do, correct?

Greetings,

Andres Freund

#36

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Andres Freund (#35)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

On Mon, Nov 13, 2023 at 03:41:44PM -0800, Andres Freund wrote:

On 2023-11-09 12:16:52 +0900, Michael Paquier wrote:

On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote:

Sure, sorry for the confusion. By "we'd do nothing", I mean precirely
"to take no specific action related to archive recovery and recovery
parameters at the end of recovery", meaning that a combination of
backup_label with no signal file would be the same as crash recovery,
replaying WAL up to the end of what can be found in pg_wal/, and only
that.

I don't think those are equivalent - in the "backup_label with no signal file"
case we start recovery at a different location than the "crash recovery" case
does.

It depends on how you see things, and based on my read of the thread
or the code we've never really put a clear definition what a
"backup_label with no signal file" should do. The definition I was
suggesting is to make it work the same way as crash recovery
internally:
- use the start LSN from the backup_label.
- replay up to the end of local WAL.
- don't rely on any recovery GUCs.
- if at the end of recovery replay has not reached the end-of-backup
record, then fail.

By being slightly more precise. I also mean to fail recovery if it is
not possible to replay up to the end-of-backup LSN marked in the label
file because we are missing some stuff in pg_wal/, which is something
that the code is currently able to handle.

"able to handle" as in detect and error out? Because that's the only possible
sane thing to do, correct?

By "able to handle", I mean to detect that the expected LSN has not
been reached and FATAL, or fail recovery. So yes.
--
Michael

#37

Andres Freund

andres@anarazel.de

about 2 years ago

In reply to: Michael Paquier (#36)

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

Hi,

On 2023-11-14 09:13:44 +0900, Michael Paquier wrote:

On Mon, Nov 13, 2023 at 03:41:44PM -0800, Andres Freund wrote:

On 2023-11-09 12:16:52 +0900, Michael Paquier wrote:

On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote:

Sure, sorry for the confusion. By "we'd do nothing", I mean precirely
"to take no specific action related to archive recovery and recovery
parameters at the end of recovery", meaning that a combination of
backup_label with no signal file would be the same as crash recovery,
replaying WAL up to the end of what can be found in pg_wal/, and only
that.

I don't think those are equivalent - in the "backup_label with no signal file"
case we start recovery at a different location than the "crash recovery" case
does.

It depends on how you see things, and based on my read of the thread
or the code we've never really put a clear definition what a
"backup_label with no signal file" should do. The definition I was
suggesting is to make it work the same way as crash recovery
internally:
- use the start LSN from the backup_label.

That's fundamentally different from crash recovery!

- replay up to the end of local WAL.
- don't rely on any recovery GUCs.
- if at the end of recovery replay has not reached the end-of-backup
record, then fail.

Also different from crash recovery.

It doesn't make sense to me to say "work the same way" when there are such
fundamental differences.

Greetings,

Andres Freund