Enforcing that all WAL has been replayed after restoring from backup

Started by Heikki Linnakangasover 14 years ago19 messages
#1Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
1 attachment(s)

Currently, if you take a backup with "pg_basebackup -x" (which means it
will include all the WAL to required restore within the backup dump),
and hit Ctrl-C while the WAL is being streamed, you end up with a data
directory that you can start postmaster from, even though the backup was
not complete. So what appears to be a valid backup - it starts up fine -
can actually be corrupt.

I put in a check against that back in March, but it had to be reverted
because it broke crash recovery when the system crashed while a
pg_start_backup() based backup was in progress:

http://archives.postgresql.org/message-id/4DA58686.1050501@enterprisedb.com

Here's a patch to add it back in a more fine-grained fashion. The patch
adds an extra line to backup_label, indicating whether the backup was
taken with pg_start/stop_backup(), or by streaming (= pg_basebackup).
For a backup taken with pg_start_backup(), the behavior is kept the same
as it has been - if the end-of-backup record is not reached during crash
recovery, the database starts up anyway. But for a streamed backup, you
get an error at startup.

I think this is a nice additional safeguard to have, making streamed
backups more robust. I'd like to add this to 9.1, but it required an
extra field to be added to the control file, so it would force an
initdb. It's probably not worth that. Or, we could sneak in the extra
boolean field to some currently unused pad space in the ControlFile struct.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

require-backup-end-record-1.patchtext/x-diff; name=require-backup-end-record-1.patchDownload
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6a6959f..d057e66 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -662,7 +662,8 @@ static bool CheckForStandbyTrigger(void);
 static void xlog_outrec(StringInfo buf, XLogRecord *record);
 #endif
 static void pg_start_backup_callback(int code, Datum arg);
-static bool read_backup_label(XLogRecPtr *checkPointLoc);
+static bool read_backup_label(XLogRecPtr *checkPointLoc,
+				  bool *backupEndRequired);
 static void rm_redo_error_callback(void *arg);
 static int	get_sync_bit(int method);
 
@@ -6016,6 +6017,7 @@ StartupXLOG(void)
 	XLogRecord *record;
 	uint32		freespace;
 	TransactionId oldestActiveXID;
+	bool		backupEndRequired = false;
 
 	/*
 	 * Read control file and check XLOG status looks valid.
@@ -6149,7 +6151,7 @@ StartupXLOG(void)
 	if (StandbyMode)
 		OwnLatch(&XLogCtl->recoveryWakeupLatch);
 
-	if (read_backup_label(&checkPointLoc))
+	if (read_backup_label(&checkPointLoc, &backupEndRequired))
 	{
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -6328,7 +6330,10 @@ StartupXLOG(void)
 		 * set backupStartPoint if we're starting recovery from a base backup
 		 */
 		if (haveBackupLabel)
+		{
 			ControlFile->backupStartPoint = checkPoint.redo;
+			ControlFile->backupEndRequired = backupEndRequired;
+		}
 		ControlFile->time = (pg_time_t) time(NULL);
 		/* No need to hold ControlFileLock yet, we aren't up far enough */
 		UpdateControlFile();
@@ -6698,9 +6703,13 @@ StartupXLOG(void)
 		 * crashes while an online backup is in progress. We must not treat
 		 * that as an error, or the database will refuse to start up.
 		 */
-		if (InArchiveRecovery)
+		if (InArchiveRecovery || ControlFile->backupEndRequired)
 		{
-			if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
+			if (ControlFile->backupEndRequired)
+				ereport(FATAL,
+						(errmsg("WAL ends before end of online backup"),
+						 errhint("All WAL generated while online backup was taken must be available at recovery.")));
+			else if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
 				ereport(FATAL,
 						(errmsg("WAL ends before end of online backup"),
 						 errhint("Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.")));
@@ -8531,6 +8540,7 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
 			if (XLByteLT(ControlFile->minRecoveryPoint, lsn))
 				ControlFile->minRecoveryPoint = lsn;
 			MemSet(&ControlFile->backupStartPoint, 0, sizeof(XLogRecPtr));
+			ControlFile->backupEndRequired = false;
 			UpdateControlFile();
 
 			LWLockRelease(ControlFileLock);
@@ -9013,6 +9023,8 @@ do_pg_start_backup(const char *backupidstr, bool fast, char **labelfile)
 						 startpoint.xlogid, startpoint.xrecoff, xlogfilename);
 		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
 						 checkpointloc.xlogid, checkpointloc.xrecoff);
+		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
+						 exclusive ? "pg_start_backup" : "streamed");
 		appendStringInfo(&labelfbuf, "START TIME: %s\n", strfbuf);
 		appendStringInfo(&labelfbuf, "LABEL: %s\n", backupidstr);
 
@@ -9768,15 +9780,19 @@ pg_xlogfile_name(PG_FUNCTION_ARGS)
  *
  * Returns TRUE if a backup_label was found (and fills the checkpoint
  * location and its REDO location into *checkPointLoc and RedoStartLSN,
- * respectively); returns FALSE if not.
+ * respectively); returns FALSE if not. If this backup_label came from a
+ * streamed backup, *backupEndRequired is set to TRUE.
  */
 static bool
-read_backup_label(XLogRecPtr *checkPointLoc)
+read_backup_label(XLogRecPtr *checkPointLoc, bool *backupEndRequired)
 {
 	char		startxlogfilename[MAXFNAMELEN];
 	TimeLineID	tli;
 	FILE	   *lfp;
 	char		ch;
+	char		backuptype[16];
+
+	*backupEndRequired = false;
 
 	/*
 	 * See if label file is present
@@ -9809,6 +9825,17 @@ read_backup_label(XLogRecPtr *checkPointLoc)
 		ereport(FATAL,
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
+	/*
+	 * BACKUP METHOD line is new in 9.0. Don't complain if it doesn't exist,
+	 * in case you're restoring from a backup taken with an 9.0 beta version
+	 * that didn't emit it.
+	 */
+	if (fscanf(lfp, "BACKUP METHOD: %16s", backuptype) == 1)
+	{
+		if (strcmp(backuptype, "streamed") == 0)
+			*backupEndRequired = true;
+	}
+
 	if (ferror(lfp) || FreeFile(lfp))
 		ereport(FATAL,
 				(errcode_for_file_access(),
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index d543ef6..1648d7d 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -21,7 +21,7 @@
 
 
 /* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION	903
+#define PG_CONTROL_VERSION	911
 
 /*
  * Body of CheckPoint XLOG records.  This is declared here because we keep
@@ -137,9 +137,16 @@ typedef struct ControlFileData
 	 * we use the redo pointer as a cross-check when we see an end-of-backup
 	 * record, to make sure the end-of-backup record corresponds the base
 	 * backup we're recovering from.
+	 *
+	 * If backupEndRequired is true, we know for sure that we're restoring
+	 * from a backup, and must see a backup-end record before we can safely
+	 * start up. If it's false, but backupStartPoint is set, a backup_label
+	 * file was found at startup but it may have been a leftover from a stray
+	 * pg_start_backup() call, not accompanied with pg_stop_backup().
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	XLogRecPtr	backupStartPoint;
+	bool		backupEndRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
#2Alvaro Herrera
alvherre@commandprompt.com
In reply to: Heikki Linnakangas (#1)
Re: Enforcing that all WAL has been replayed after restoring from backup

Excerpts from Heikki Linnakangas's message of mar ago 09 05:00:00 -0400 2011:

I think this is a nice additional safeguard to have, making streamed
backups more robust. I'd like to add this to 9.1, but it required an
extra field to be added to the control file, so it would force an
initdb. It's probably not worth that. Or, we could sneak in the extra
boolean field to some currently unused pad space in the ControlFile struct.

How about making the new backup_label field optional? If absent, assume
current behavior.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#2)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 09.08.2011 18:20, Alvaro Herrera wrote:

Excerpts from Heikki Linnakangas's message of mar ago 09 05:00:00 -0400 2011:

I think this is a nice additional safeguard to have, making streamed
backups more robust. I'd like to add this to 9.1, but it required an
extra field to be added to the control file, so it would force an
initdb. It's probably not worth that. Or, we could sneak in the extra
boolean field to some currently unused pad space in the ControlFile struct.

How about making the new backup_label field optional? If absent, assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#3)
Re: Enforcing that all WAL has been replayed after restoring from backup

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional? If absent, assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah. I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

regards, tom lane

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#4)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 09.08.2011 19:07, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional? If absent, assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah. I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Done.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#6Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#4)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Tue, Aug 9, 2011 at 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional?  If absent, assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah.  I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case? Or add a signal
handler in the pg_basebackup client emitting a warning about it?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Magnus Hagander (#6)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 10.08.2011 12:29, Magnus Hagander wrote:

On Tue, Aug 9, 2011 at 18:07, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional? If absent, assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah. I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case?

Something like "Note: if you abort the backup before it's finished, the
backup won't be valid" ? That seems pretty obvious to me, hardly worth
documenting.

Or add a signal
handler in the pg_basebackup client emitting a warning about it?

We don't have such a signal handler pg_dump either. I don't think we
should add it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#8Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#7)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 10, 2011 at 12:44, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 10.08.2011 12:29, Magnus Hagander wrote:

On Tue, Aug 9, 2011 at 18:07, Tom Lane<tgl@sss.pgh.pa.us>  wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional?  If absent,
assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah.  I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case?

Something like "Note: if you abort the backup before it's finished, the
backup won't be valid" ? That seems pretty obvious to me, hardly worth
documenting.

I meant something more along the line of that it looks ok, but may be corrupted.

Or add a signal
handler in the pg_basebackup client emitting a warning about it?

We don't have such a signal handler pg_dump either. I don't think we should
add it.

Hmm. I guess an aborted pg_dump will also "look ok but actually be
corrupt" (or incomplete). Good point.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#9Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#8)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 10, 2011 at 6:53 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Aug 10, 2011 at 12:44, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 10.08.2011 12:29, Magnus Hagander wrote:

On Tue, Aug 9, 2011 at 18:07, Tom Lane<tgl@sss.pgh.pa.us>  wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional?  If absent,
assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah.  I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case?

Something like "Note: if you abort the backup before it's finished, the
backup won't be valid" ? That seems pretty obvious to me, hardly worth
documenting.

I meant something more along the line of that it looks ok, but may be corrupted.

Yeah. I'm frankly pretty nervous about shipping 9.1 with this
problem, but note that I don't have a better idea. I'd favor making
pg_basebackup emit a warning or maybe even remove the backup if it's
aborted midway through.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Simon Riggs
simon@2ndquadrant.com
In reply to: Robert Haas (#9)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 10, 2011 at 1:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 10, 2011 at 6:53 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Aug 10, 2011 at 12:44, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 10.08.2011 12:29, Magnus Hagander wrote:

On Tue, Aug 9, 2011 at 18:07, Tom Lane<tgl@sss.pgh.pa.us>  wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional?  If absent,
assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah.  I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case?

Something like "Note: if you abort the backup before it's finished, the
backup won't be valid" ? That seems pretty obvious to me, hardly worth
documenting.

I meant something more along the line of that it looks ok, but may be corrupted.

Yeah.  I'm frankly pretty nervous about shipping 9.1 with this
problem, but note that I don't have a better idea.  I'd favor making
pg_basebackup emit a warning or maybe even remove the backup if it's
aborted midway through.

I don't understand why we need to change pg_control for this?

Why can't we just add a line to backup_label as the first action of
pg_basebackup and then updated it the last action to show the backup
set is complete?

That would be safe for 9.1

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#11Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#10)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 10.08.2011 15:34, Simon Riggs wrote:

On Wed, Aug 10, 2011 at 1:19 PM, Robert Haas<robertmhaas@gmail.com> wrote:

On Wed, Aug 10, 2011 at 6:53 AM, Magnus Hagander<magnus@hagander.net> wrote:

On Wed, Aug 10, 2011 at 12:44, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 10.08.2011 12:29, Magnus Hagander wrote:

On Tue, Aug 9, 2011 at 18:07, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 09.08.2011 18:20, Alvaro Herrera wrote:

How about making the new backup_label field optional? If absent,
assume
current behavior.

That's how I actually did it in the patch. However, the problem wrt.
requiring initdb is not the new field in backup_label, it's the new
field in the control file.

Yeah. I think it's too late to be fooling with pg_control for 9.1.
Just fix it in HEAD.

Should we add a note to the documentation of pg_basebackup in 9.1
telling people to take care about the failure case?

Something like "Note: if you abort the backup before it's finished, the
backup won't be valid" ? That seems pretty obvious to me, hardly worth
documenting.

I meant something more along the line of that it looks ok, but may be corrupted.

Yeah. I'm frankly pretty nervous about shipping 9.1 with this
problem, but note that I don't have a better idea. I'd favor making
pg_basebackup emit a warning or maybe even remove the backup if it's
aborted midway through.

I don't understand why we need to change pg_control for this?

Why can't we just add a line to backup_label as the first action of
pg_basebackup and then updated it the last action to show the backup
set is complete?

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in
memory until the end of backup, and append it to the output as last. The
backup can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required
WAL files manually to the pg_xlog directory. But it'd be much better
than nothing for 9.1.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#11)
Re: Enforcing that all WAL has been replayed after restoring from backup

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in
memory until the end of backup, and append it to the output as last. The
backup can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required
WAL files manually to the pg_xlog directory. But it'd be much better
than nothing for 9.1.

Maybe we're overcomplicating this. What about changing pg_basebackup to
print a message when the backup is completely sent/received? People
would get used to that quickly, and would know to be suspicious if they
didn't see it.

regards, tom lane

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 10, 2011 at 1:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in
memory until the end of backup, and append it to the output as last. The
backup can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required
WAL files manually to the pg_xlog directory. But it'd be much better
than nothing for 9.1.

Maybe we're overcomplicating this.  What about changing pg_basebackup to
print a message when the backup is completely sent/received?  People
would get used to that quickly, and would know to be suspicious if they
didn't see it.

Yeah, but would they be sufficiently suspicious to think "oh, my
backup is hopeless corrupted even if it seems to work"?

I think a clearer warning is needed, at the very least, and if there's
a way to prevent it altogether at least in straightforward cases, that
would be even better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#12)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 10, 2011 at 19:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in
memory until the end of backup, and append it to the output as last. The
backup can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required
WAL files manually to the pg_xlog directory. But it'd be much better
than nothing for 9.1.

Maybe we're overcomplicating this.  What about changing pg_basebackup to
print a message when the backup is completely sent/received?  People
would get used to that quickly, and would know to be suspicious if they
didn't see it.

That would suck for scripts, and have people redirect the output to
/dev/null instead, wouldn't it? And it violates the "unix expectation"
that is that a successful command will not write anything to it's
output...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#15Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#8)
Re: Enforcing that all WAL has been replayed after restoring from backup

Magnus Hagander <magnus@hagander.net> writes:

Or add a signal
handler in the pg_basebackup client emitting a warning about it?

We don't have such a signal handler pg_dump either. I don't think we should
add it.

Hmm. I guess an aborted pg_dump will also "look ok but actually be
corrupt" (or incomplete). Good point.

What about having the signal handler corrupt the backup by adding some
garbage into it? Now the failure case is obvious…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#16Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#11)
1 attachment(s)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Thu, Aug 11, 2011 at 1:34 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in memory
until the end of backup, and append it to the output as last. The backup
can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required WAL
files manually to the pg_xlog directory. But it'd be much better than
nothing for 9.1.

We need to skip checking whether we've reached the end backup location
only when the server crashes while base backup using pg_start_backup. Right?
We can do this by *not* initializing ControlFile->backupStartPoint if the server
is doing crash recovery and backupEndRequired is false. What about the attached
patch?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

require-backup-end-record-2.patchtext/x-patch; charset=US-ASCII; name=require-backup-end-record-2.patchDownload
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 11035e6..d0d68d4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6329,11 +6329,8 @@ StartupXLOG(void)
 		/*
 		 * set backupStartPoint if we're starting recovery from a base backup
 		 */
-		if (haveBackupLabel)
-		{
+		if ((InArchiveRecovery && haveBackupLabel) || backupEndRequired)
 			ControlFile->backupStartPoint = checkPoint.redo;
-			ControlFile->backupEndRequired = backupEndRequired;
-		}
 		ControlFile->time = (pg_time_t) time(NULL);
 		/* No need to hold ControlFileLock yet, we aren't up far enough */
 		UpdateControlFile();
@@ -6703,20 +6700,13 @@ StartupXLOG(void)
 		 * crashes while an online backup is in progress. We must not treat
 		 * that as an error, or the database will refuse to start up.
 		 */
-		if (InArchiveRecovery || ControlFile->backupEndRequired)
-		{
-			if (ControlFile->backupEndRequired)
-				ereport(FATAL,
-						(errmsg("WAL ends before end of online backup"),
-						 errhint("All WAL generated while online backup was taken must be available at recovery.")));
-			else if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
-				ereport(FATAL,
-						(errmsg("WAL ends before end of online backup"),
-						 errhint("Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.")));
-			else
-				ereport(FATAL,
-					  (errmsg("WAL ends before consistent recovery point")));
-		}
+		if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
+			ereport(FATAL,
+					(errmsg("WAL ends before end of online backup"),
+					 errhint("Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.")));
+		else
+			ereport(FATAL,
+					(errmsg("WAL ends before consistent recovery point")));
 	}
 
 	/*
@@ -8540,7 +8530,6 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
 			if (XLByteLT(ControlFile->minRecoveryPoint, lsn))
 				ControlFile->minRecoveryPoint = lsn;
 			MemSet(&ControlFile->backupStartPoint, 0, sizeof(XLogRecPtr));
-			ControlFile->backupEndRequired = false;
 			UpdateControlFile();
 
 			LWLockRelease(ControlFileLock);
@@ -9826,8 +9815,8 @@ read_backup_label(XLogRecPtr *checkPointLoc, bool *backupEndRequired)
 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 				 errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
 	/*
-	 * BACKUP METHOD line is new in 9.0. Don't complain if it doesn't exist,
-	 * in case you're restoring from a backup taken with an 9.0 beta version
+	 * BACKUP METHOD line is new in 9.1. Don't complain if it doesn't exist,
+	 * in case you're restoring from a backup taken with an 9.1 beta version
 	 * that didn't emit it.
 	 */
 	if (fscanf(lfp, "BACKUP METHOD: %19s", backuptype) == 1)
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 6688c19..9600b50 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -137,16 +137,9 @@ typedef struct ControlFileData
 	 * we use the redo pointer as a cross-check when we see an end-of-backup
 	 * record, to make sure the end-of-backup record corresponds the base
 	 * backup we're recovering from.
-	 *
-	 * If backupEndRequired is true, we know for sure that we're restoring
-	 * from a backup, and must see a backup-end record before we can safely
-	 * start up. If it's false, but backupStartPoint is set, a backup_label
-	 * file was found at startup but it may have been a leftover from a stray
-	 * pg_start_backup() call, not accompanied by pg_stop_backup().
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	XLogRecPtr	backupStartPoint;
-	bool		backupEndRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
#17Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#16)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 16.08.2011 04:10, Fujii Masao wrote:

On Thu, Aug 11, 2011 at 1:34 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Hmm, that's not possible for the 'tar' output, but would work for 'dir'
output. Another similar idea would be to withhold the control file in memory
until the end of backup, and append it to the output as last. The backup
can't be restored until the control file is written out.

That won't protect from more complicated scenarios, like if you take the
backup without the -x flag, and copy some but not all of the required WAL
files manually to the pg_xlog directory. But it'd be much better than
nothing for 9.1.

We need to skip checking whether we've reached the end backup location
only when the server crashes while base backup using pg_start_backup. Right?

Yes.

We can do this by *not* initializing ControlFile->backupStartPoint if the server
is doing crash recovery and backupEndRequired is false. What about the attached
patch?

Hmm, this behaves slightly differently, if you first try to start the
restored server without recovery.conf, stop recovery, and restart it
after adding recovery.conf. But I guess that's not a big deal, the check
is simply skipped in that case, which is what always happens without
this patch anyway. Committed this to 9.1, but kept master as it was.

(sorry for the delay, I wanted to fix the bogus comment as soon as I saw
it, but needed some time to ponder the rest of the patch)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#18Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#17)
Re: Enforcing that all WAL has been replayed after restoring from backup

On Wed, Aug 17, 2011 at 5:49 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Hmm, this behaves slightly differently, if you first try to start the
restored server without recovery.conf, stop recovery, and restart it after
adding recovery.conf. But I guess that's not a big deal, the check is simply
skipped in that case, which is what always happens without this patch
anyway.

Oh, I forgot to consider that case. Yeah, I agree with you.

Committed this to 9.1,

Thanks a lot!

but kept master as it was.

So, in master, we should change pg_controldata.c and pg_resetxlog.c for
new pg_control field "backupEndRequired"?

(sorry for the delay, I wanted to fix the bogus comment as soon as I saw it,
but needed some time to ponder the rest of the patch)

NM. Thanks!

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#19Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#18)
Re: Enforcing that all WAL has been replayed after restoring from backup

On 17.08.2011 12:26, Fujii Masao wrote:

So, in master, we should change pg_controldata.c and pg_resetxlog.c for
new pg_control field "backupEndRequired"?

Ah, good catch! Fixed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com