Patch for option in pg_resetxlog for restore from WAL files

Started by Amit Kapila over 13 years ago, 6 messages
#1 Amit Kapila
amit.kapila@huawei.com
1 attachment(s)

The patch implementing the design described in the mail chain below is attached to this mail.

Let me know if there are any problems or objections.

Shall I add it to the CF (2012-09)?

Tests done for this patch
-------------------------------------
Test-1
1. Start the server and do some operations.
2. Kill the server with the abort option, delete the pg_control file of the database while the server process is still running, and create an empty file with the same name.
3. Check the restore of the pg_control file with the -f option.

Result
The control file is recreated and recovery happens with the remaining redo after server restart.

Test-2
1. Start the server and shut it down once a checkpoint has happened and the old xlog files have been cleaned up, so that only one checkpoint record is present in the xlog files.
2. Kill the server with the abort option, delete the pg_control file of the database, and create an empty file with the same name.
3. Check the restore of the pg_control file with the -f option.

Result
The control file is recreated and recovery happens with the remaining redo after server restart.

Test-3
1. Start the server and execute operations that result in a record header being split between two redo files.
2. Test the restore of the control file after the server has been shut down and the control file has been deleted and recreated as an empty file.
3. Delete the first xlog file, which contains the checkpoint record.
4. Check the restore of the pg_control file with the -f option.

Result
The control file is recreated with guessed values and recovery is not possible.

Test-4
1. Start the server and shut it down normally.
2. Delete the pg_control file of the database and create an empty file with the same name.
3. Delete all the valid xlog files and create some invalid xlog files.
4. Check the restore of the pg_control file with the -f option.

Result
The control file is recreated with guessed values and recovery is not possible.

Test-5
1. Start the server on the database.
2. Delete the pg_control file of a database and create an empty file with the same name.
3. Try to restore the control file while the server is already running on the same database.

Result
The restore of the control file fails because the server is already running and the pid file already exists.
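
The check behind Test-5 can be pictured roughly as below. This is only a sketch, assuming the lock file is named postmaster.pid and lives directly in the data directory; the helper name lock_file_present is hypothetical and is not taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns true if <datadir>/postmaster.pid can be opened for reading, which
 * is taken here as a sign that a server may be running on that data
 * directory.  Any failure to open the file is treated as "no lock file";
 * a real tool would distinguish "file does not exist" from other errors.
 */
bool
lock_file_present(const char *datadir)
{
	char		path[1024];
	FILE	   *fp;
	int			len;

	len = snprintf(path, sizeof(path), "%s/postmaster.pid", datadir);
	if (len < 0 || (size_t) len >= sizeof(path))
		return true;			/* path too long: be conservative and refuse */

	fp = fopen(path, "r");
	if (fp == NULL)
		return false;

	fclose(fp);
	return true;
}
```

A caller would refuse to touch pg_control whenever lock_file_present(datadir) returns true, which matches the failure seen in Test-5.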

From: Amit Kapila [mailto:amit.kapila@huawei.com]
Sent: Thursday, July 05, 2012 10:21 AM

From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Friday, June 22, 2012 8:59 PM
On Fri, Jun 22, 2012 at 5:25 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
Based on the discussion and suggestions in this mail chain, the following
features can be implemented:

1. Compute the value of max LSN in data pages based on user input, whether
he wants it for an individual file, a particular directory or the whole
database.

2a. Search the available WAL files for the latest checkpoint record and
print the value.

2b. Search the available WAL files for the latest checkpoint record and
recreate a pg_control file pointing at that checkpoint.

I have kept both options to address different kinds of corruption
scenarios.

I think I can see all of those things being potentially useful. There
are a couple of pending patches that will revise the WAL format
slightly; not sure how much those are likely to interfere with any
development you might do on (2) in the meantime.

Below are the details of Option-2; for Option-1, I will send a separate mail.

New option for pg_resetxlog:
-----------------------------
1. Introduce option -r to restore the control file values from the WAL files,
if possible, and print those values.
2. The user needs to give option -f along with -r to rewrite the control file
from the WAL files.
3. If the control information cannot be obtained from the WAL files, the
control data is guessed and pg_resetxlog proceeds as a normal xlog reset.
4. If the control information is restored, option -l is ignored.
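
How -r, -f and -n combine can be sketched as a small decision function. This is only an illustration modeled on the flow in the attached patch; the names Action and decide_action are hypothetical and do not appear in pg_resetxlog.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of pg_resetxlog. */
typedef enum
{
	ACTION_PRINT_ONLY,			/* show the values, change nothing */
	ACTION_RESTORE,				/* rewrite pg_control, keep the WAL files */
	ACTION_RESET				/* rewrite pg_control and reset the WAL */
} Action;

/*
 * restored: a checkpoint was recovered from WAL (-r given, search succeeded)
 * guessed:  the control values had to be guessed
 * force:    -f was given
 * noupdate: -n was given
 */
Action
decide_action(bool restored, bool guessed, bool force, bool noupdate)
{
	/* Without -f, restored or guessed values are only printed. */
	if (((guessed || restored) && !force) || noupdate)
		return ACTION_PRINT_ONLY;

	/* With restored values the existing WAL is kept for recovery. */
	return restored ? ACTION_RESTORE : ACTION_RESET;
}
```

For example, decide_action(true, false, false, false) corresponds to running with -r alone and only prints the values, while decide_action(true, false, true, false) corresponds to -r -f and rewrites the control file without resetting the WAL.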

Design for new Option:
----------------------

1. Validate the pg_xlog directory before trying to restore the control
values. If the directory is invalid, the control values are guessed.
2. Read the pg_xlog directory and examine all the existing files.
3. If a file is a valid xlog file, add it to a list kept in increasing order;
otherwise ignore it and continue with the next file.
4. From the list, pick the file at which to start reading for a checkpoint
record (per the patch, the oldest file of the latest timeline).
5. Read the first page of that file and validate it. If the validation
fails, the restore falls back to guessed values.
6. Read the first record, as the start of a record, from the identified first xlog file.
7. If the first record is a continuation of a previous record, ignore it
and continue with the next record.
8. Once the entire record has been read, validate it. If it is not a valid
record, the search for further records stops and the control values
are guessed.
9. Search all the files, up to the end of the last file, for the latest
checkpoint record.
10. If the search does not reach the last file (because a file is missing
or a record is invalid), treat this as a failure to find the checkpoint record
and fall back to guessing the control values.
11. After finding the last checkpoint record, update the control file with
the checkpoint record information.
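
Steps 3, 4 and 10 can be sketched as follows. This is a minimal standalone illustration, not the patch's code: it uses a sorted array instead of the patch's linked list, it orders by timeline first (which may differ from the patch's CompareXLogFileNames), and the names Seg, parse_wal_name, cmp_seg and start_of_last_run are hypothetical. It also assumes 16 MB segments, that is, 256 segments per log id.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 16 MB segments, hence 256 segments per 4 GB "log id". */
#define SEGS_PER_ID 256

/* Hypothetical stand-in for the patch's XLogFileName list node. */
typedef struct
{
	uint32_t	tli;			/* timeline ID */
	uint64_t	segno;			/* absolute segment number */
} Seg;

/* Parse a 24-hex-digit WAL file name; return false for anything else. */
bool
parse_wal_name(const char *name, Seg *out)
{
	unsigned int tli;
	unsigned int log;
	unsigned int seg;

	if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(name, "%8x%8x%8x", &tli, &log, &seg) != 3)
		return false;

	out->tli = tli;
	out->segno = (uint64_t) log * SEGS_PER_ID + seg;
	return true;
}

/* qsort comparator: order by timeline first, then by segment number. */
int
cmp_seg(const void *a, const void *b)
{
	const Seg  *x = a;
	const Seg  *y = b;

	if (x->tli != y->tli)
		return (x->tli < y->tli) ? -1 : 1;
	if (x->segno != y->segno)
		return (x->segno < y->segno) ? -1 : 1;
	return 0;
}

/*
 * Sort the segments and return the index at which the last gap-free run
 * starts.  A timeline change or a missing segment starts a new run.
 */
size_t
start_of_last_run(Seg *segs, size_t n)
{
	size_t		start = 0;
	size_t		i;

	qsort(segs, n, sizeof(Seg), cmp_seg);
	for (i = 1; i < n; i++)
	{
		if (segs[i].tli != segs[i - 1].tli ||
			segs[i].segno != segs[i - 1].segno + 1)
			start = i;
	}
	return start;
}
```

A search for the checkpoint record would then begin at segs[start_of_last_run(segs, n)] and must be able to read through to the last element; if it cannot, the design above falls back to guessed values.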

Implementation:
----------------
1. We need to use most of the functionality of the functions listed below.
One way is to duplicate, inside the pg_resetxlog module, the parts of these
functions that pg_resetxlog requires. I have also checked other modules
but did not find a way for a server utility to use common functionality
from the backend code.
Could you please point me to the appropriate way of doing this?

The list of functions:
1. ValidateXLOGDirectoryStructure
2. XLogPageRead
3. ReadRecord
4. RecordIsValid
5. ValidXLOGPageHeader
6. ValidXLogRecordHeader

Suggestions/Comments/Thoughts?

With Regards,
Amit Kapila.

Attachments:

restore_pg_control_data.patch (text/plain)
diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml
index 27b9ab41..5d6aef6 100644
--- a/doc/src/sgml/ref/pg_resetxlog.sgml
+++ b/doc/src/sgml/ref/pg_resetxlog.sgml
@@ -24,6 +24,7 @@ PostgreSQL documentation
    <command>pg_resetxlog</command>
    <arg choice="opt"><option>-f</option></arg>
    <arg choice="opt"><option>-n</option></arg>
+   <arg choice="opt"><option>-r</option></arg>
    <arg choice="opt"><option>-o</option> <replaceable class="parameter">oid</replaceable></arg>
    <arg choice="opt"><option>-x</option> <replaceable class="parameter">xid</replaceable></arg>
    <arg choice="opt"><option>-e</option> <replaceable class="parameter">xid_epoch</replaceable></arg>
@@ -61,6 +62,20 @@ PostgreSQL documentation
   </para>
 
   <para>
+   If <command>pg_resetxlog</command> complains that it cannot determine valid
+   data for <filename>pg_control</>, you can restore the
+   <filename>pg_control</> data from the WAL files, if they are intact, by
+   specifying the <option>-r</> (restore) option.  In this case the values
+   recovered from the WAL files are displayed in place of the missing data.
+   If the values look correct, you can force them to be written to the
+   control file by specifying <option>-f</> together with <option>-r</>.
+   Once the missing data has been restored from the WAL files, the database
+   can be recovered and used normally after a restart.  If the missing data
+   cannot be restored, plausible values are substituted for it instead,
+   as described below.
+  </para>
+
+  <para>
    If <command>pg_resetxlog</command> complains that it cannot determine
    valid data for <filename>pg_control</>, you can force it to proceed anyway
    by specifying the <option>-f</> (force) option.  In this case plausible
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index d5d89ec..9e9caf4 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -19,6 +19,13 @@
  * This is all pretty straightforward except for the intuition part of
  * step 2 ...
  *
+ * The algorithm for restoring the pg_control values from old xlog files:
+ *	  1. Retrieve all of the active xlog files from the xlog directory into a
+ *		 list in increasing order of timeline and xlog segment number.
+ *	  2. Search the list to find the oldest xlog file of the latest timeline.
+ *	  3. Search the records from the oldest xlog file of the latest timeline
+ *		 to the latest xlog file of the latest timeline; whenever a checkpoint
+ *		 record is found, update the latest and previous checkpoint.
  *
  * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -62,16 +69,87 @@ extern char *optarg;
 static ControlFileData ControlFile;		/* pg_control values */
 static XLogSegNo newXlogSegNo;	/* new XLOG segment # */
 static bool guessed = false;	/* T if we had to guess at any values */
+static bool restored = false;	/* T if we restored any values from WAL */
 static const char *progname;
 
+/*
+ * We use a list to store the active xlog files that were found in the
+ * xlog directory, kept in increasing order of timeline and segment
+ * number.
+ *
+ */
+typedef struct XLogFileName
+{
+	TimeLineID	tli;
+	XLogSegNo	segno;
+	char		fname[MAXFNAMELEN];
+	struct XLogFileName *next;
+}	XLogFileName;
+
+/* The list head */
+static XLogFileName *xlogfilelist = NULL;
+
+/* LastXLogfile is the latest file in the latest time line,
+ *	CurXLogfile is the oldest file in the latest time line
+ */
+static XLogFileName *CurXLogFile,
+		   *LastXLogFile;
+
+/*
+ * readFile is -1 or a kernel FD for an open log file segment.
+ * When it's open, readOff is the current seek offset in the file.
+ * readSegNo identifies the segment.  These variables are only
+ * used to read the XLOG.
+ */
+static int	readFile = -1;
+static XLogSegNo readSegNo = 0;
+static TimeLineID readTli;
+static uint32 readOff = 0;
+
+/* Buffer for currently read page (XLOG_BLCKSZ bytes) */
+static char *readBuf = NULL;
+
+/* Buffer for current ReadRecord result (expandable) */
+static char *readRecordBuf = NULL;
+static uint32 readRecordBufSize = 0;
+
+/* State information for XLOG reading */
+static XLogRecPtr ReadRecPtr;	/* start of last record read */
+static XLogRecPtr EndRecPtr;	/* end+1 of last record read */
+
+/* The last checkpoint found in xlog file.*/
+static CheckPoint lastcheckpoint;
+
+/* The last and previous checkpoint pointers found in xlog file.*/
+static XLogRecPtr prevchkp,
+			lastchkp;
+
 static bool ReadControlFile(void);
 static void GuessControlValues(void);
-static void PrintControlValues(bool guessed);
+static void PrintControlValues(void);
 static void RewriteControlFile(void);
 static void FindEndOfXLOG(void);
 static void KillExistingXLOG(void);
 static void KillExistingArchiveStatus(void);
 static void WriteEmptyXLOG(void);
+static void RestoreControlValues(bool mode);
+static bool SearchLastCheckpoint(void);
+static bool CompareXLogFileNames(XLogFileName * f1, XLogFileName * f2);
+static void AddXLogFileIntoList(char *fname);
+static bool ValidXLogFileName(char *fname);
+static bool ValidXLOGPageHeader(XLogPageHeader hdr, uint32 tli,
+					XLogSegNo segno);
+static void ValidateXLOGDirectoryStructure(void);
+static bool PrepareXLogFileList(void);
+static bool IsNextSeg(XLogFileName * prev, XLogFileName * cur);
+static bool GetStartXLogFile(void);
+static bool RecordIsValid(XLogRecord *record);
+static void CleanUpList(XLogFileName * list);
+static bool ValidXLogRecordHeader(XLogRecPtr *RecPtr, XLogRecord *record,
+					  bool randAccess);
+static bool XLogPageRead(XLogRecPtr *RecPtr);
+static XLogRecord *ReadRecord(XLogRecPtr *RecPtr, bool randAccess,
+		   bool *bContRec);
 static void usage(void);
 
 
@@ -81,6 +159,7 @@ main(int argc, char *argv[])
 	int			c;
 	bool		force = false;
 	bool		noupdate = false;
+	bool		restore = false;
 	uint32		set_xid_epoch = (uint32) -1;
 	TransactionId set_xid = 0;
 	Oid			set_oid = 0;
@@ -112,7 +191,7 @@ main(int argc, char *argv[])
 	}
 
 
-	while ((c = getopt(argc, argv, "fl:m:no:O:x:e:")) != -1)
+	while ((c = getopt(argc, argv, "fl:m:no:O:x:e:r")) != -1)
 	{
 		switch (c)
 		{
@@ -123,6 +202,9 @@ main(int argc, char *argv[])
 			case 'n':
 				noupdate = true;
 				break;
+			case 'r':
+				restore = true;
+				break;
 
 			case 'e':
 				set_xid_epoch = strtoul(optarg, &endptr, 0);
@@ -271,15 +353,21 @@ main(int argc, char *argv[])
 	}
 
 	/*
-	 * Attempt to read the existing pg_control file
+	 * Attempt to read the existing pg_control file, if the pg_control file is
+	 * not proper then it will first try to restore pg_control values from WAL
+	 * files, if failed then guess the values. if the restore option is not
+	 * provided then the values will be guessed directly.
 	 */
 	if (!ReadControlFile())
-		GuessControlValues();
+		RestoreControlValues(restore);
 
-	/*
-	 * Also look at existing segment files to set up newXlogSegNo
-	 */
-	FindEndOfXLOG();
+	if (!restored)
+	{
+		/*
+		 * Also look at existing segment files to set up newXlogSegNo
+		 */
+		FindEndOfXLOG();
+	}
 
 	/*
 	 * Adjust fields if required by switches.  (Do this now so that printout,
@@ -324,9 +412,9 @@ main(int argc, char *argv[])
 	 * If we had to guess anything, and -f was not given, just print the
 	 * guessed values and exit.  Also print if -n is given.
 	 */
-	if ((guessed && !force) || noupdate)
+	if (((guessed || restored) && !force) || noupdate)
 	{
-		PrintControlValues(guessed);
+		PrintControlValues();
 		if (!noupdate)
 		{
 			printf(_("\nIf these values seem acceptable, use -f to force reset.\n"));
@@ -351,11 +439,17 @@ main(int argc, char *argv[])
 	 * Else, do the dirty deed.
 	 */
 	RewriteControlFile();
-	KillExistingXLOG();
-	KillExistingArchiveStatus();
-	WriteEmptyXLOG();
+	if (!restored)
+	{
+		KillExistingXLOG();
+		KillExistingArchiveStatus();
+		WriteEmptyXLOG();
+
+		printf(_("Transaction log reset\n"));
+	}
+	else
+		printf(_("Restore is successful\n"));
 
-	printf(_("Transaction log reset\n"));
 	return 0;
 }
 
@@ -434,7 +528,6 @@ ReadControlFile(void)
 	return false;
 }
 
-
 /*
  * Guess at pg_control values when we can't read the old ones.
  */
@@ -481,7 +574,6 @@ GuessControlValues(void)
 	ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
 
 	/* minRecoveryPoint, backupStartPoint and backupEndPoint can be left zero */
-
 	ControlFile.wal_level = WAL_LEVEL_MINIMAL;
 	ControlFile.MaxConnections = 100;
 	ControlFile.max_prepared_xacts = 0;
@@ -518,13 +610,15 @@ GuessControlValues(void)
  * reset by RewriteControlFile().
  */
 static void
-PrintControlValues(bool guessed)
+PrintControlValues(void)
 {
 	char		sysident_str[32];
 	char		fname[MAXFNAMELEN];
 
 	if (guessed)
 		printf(_("Guessed pg_control values:\n\n"));
+	else if (restored)
+		printf(_("Restored pg_control values:\n\n"));
 	else
 		printf(_("pg_control values:\n\n"));
 
@@ -535,10 +629,23 @@ PrintControlValues(bool guessed)
 	snprintf(sysident_str, sizeof(sysident_str), UINT64_FORMAT,
 			 ControlFile.system_identifier);
 
-	XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID, newXlogSegNo);
+	if (restored)
+	{
+		XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID,
+					 ControlFile.checkPointCopy.redo);
+
+		printf(_("First log segment after restore:        %s\n"),
+			   fname);
+	}
+	else
+	{
+		XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID,
+					 newXlogSegNo);
+
+		printf(_("First log segment after reset:        %s\n"),
+			   fname);
+	}
 
-	printf(_("First log segment after reset:        %s\n"),
-		   fname);
 	printf(_("pg_control version number:            %u\n"),
 		   ControlFile.pg_control_version);
 	printf(_("Catalog version number:               %u\n"),
@@ -599,22 +706,26 @@ RewriteControlFile(void)
 	int			fd;
 	char		buffer[PG_CONTROL_SIZE];		/* need not be aligned */
 
-	/*
-	 * Adjust fields as needed to force an empty XLOG starting at
-	 * newXlogSegNo.
-	 */
-	XLogSegNoOffsetToRecPtr(newXlogSegNo, SizeOfXLogLongPHD,
-							ControlFile.checkPointCopy.redo);
-	ControlFile.checkPointCopy.time = (pg_time_t) time(NULL);
+	if (!restored)
+	{
+		/*
+		 * Adjust fields as needed to force an empty XLOG starting at
+		 * newXlogSegNo.
+		 */
+		XLogSegNoOffsetToRecPtr(newXlogSegNo, SizeOfXLogLongPHD,
+								ControlFile.checkPointCopy.redo);
+		ControlFile.checkPointCopy.time = (pg_time_t) time(NULL);
+
+		ControlFile.state = DB_SHUTDOWNED;
+		ControlFile.time = (pg_time_t) time(NULL);
+		ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
+		ControlFile.prevCheckPoint = 0;
+		ControlFile.minRecoveryPoint = 0;
+		ControlFile.backupStartPoint = 0;
+		ControlFile.backupEndPoint = 0;
+		ControlFile.backupEndRequired = false;
+	}
 
-	ControlFile.state = DB_SHUTDOWNED;
-	ControlFile.time = (pg_time_t) time(NULL);
-	ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
-	ControlFile.prevCheckPoint = 0;
-	ControlFile.minRecoveryPoint = 0;
-	ControlFile.backupStartPoint = 0;
-	ControlFile.backupEndPoint = 0;
-	ControlFile.backupEndRequired = false;
 
 	/*
 	 * Force the defaults for max_* settings. The values don't really matter
@@ -985,6 +1096,895 @@ WriteEmptyXLOG(void)
 	close(fd);
 }
 
+/*
+ * Restores the pg_control values by searching the xlog segment files
+ * or by guessing it.
+ */
+static void
+RestoreControlValues(bool mode)
+{
+	bool		result = false;
+
+	GuessControlValues();
+
+	if (mode)
+	{
+		/*
+		 * update the checkpoint value in control file, by searching xlog
+		 * segment files.
+		 */
+		result = SearchLastCheckpoint();
+		if (result)				/* The last checkpoint had been found. */
+		{
+			restored = true;
+			guessed = false;
+
+			ControlFile.checkPointCopy = lastcheckpoint;
+			ControlFile.checkPoint = lastchkp;
+			ControlFile.prevCheckPoint = prevchkp;
+			ControlFile.time = lastcheckpoint.time;
+
+			/*
+			 * Always set as in production, which makes the database to
+			 * recover always if the redo present.
+			 */
+			ControlFile.state = DB_IN_PRODUCTION;
+
+			newXlogSegNo = readSegNo;
+		}
+
+		/* Clean up the list. */
+		CleanUpList(xlogfilelist);
+	}
+}
+
+
+/*
+ * Search for the latest checkpoint record in the xlog segment files.
+ *
+ * Returns true if at least one checkpoint record was found in the
+ * xlog segment files, false otherwise.
+ */
+static bool
+SearchLastCheckpoint(void)
+{
+	bool		randAccess = true;
+	bool		bContRec = false;
+	bool		bFoundChkPoint = false;
+
+	/* First time through, permanently allocate readBuf. */
+	readBuf = (char *) malloc(XLOG_BLCKSZ);
+	if (NULL == readBuf)
+	{
+		fprintf(stderr, _("%s: Memory allocation failed for size: %d\n"),
+				progname, XLOG_BLCKSZ);
+		exit(1);
+	}
+
+	/*
+	 * Retrieve all of the active xlog files from the xlog directory into a
+	 * list in increasing order of timeline and segment number.
+	 */
+	if (!PrepareXLogFileList())
+		goto READ_BUF_FREE;
+
+	/* Select the oldest segment file in the latest timeline. */
+	if (!GetStartXLogFile())
+	{
+		fprintf(stderr, _("%s: No xlog files are found in directory: %s\n"),
+				progname, XLOGDIR);
+		goto READ_BUF_FREE;
+	}
+
+	/* Form the first record address to read from xlog file */
+	XLogSegNoOffsetToRecPtr(CurXLogFile->segno, 0, EndRecPtr);
+	ReadRecPtr = EndRecPtr;
+
+	fprintf(stderr, _("%s: Trying to read checkpoint record from xlog files in"
+					  " directory: %s\n"), progname, XLOGDIR);
+
+	/*
+	 * Search the XLog segment files from beginning to end.  Whenever a
+	 * checkpoint record is found, remember it; the control file is updated
+	 * later from this information.
+	 */
+	while (ReadRecord(&EndRecPtr, randAccess, &bContRec) != NULL)
+	{
+		/* Ignore the first record if it is a continuation record */
+		if (bContRec)
+			continue;
+
+		randAccess = false;
+
+		/* To see if the record is checkpoint record. */
+		if (((XLogRecord *) readRecordBuf)->xl_rmid == RM_XLOG_ID)
+		{
+			CheckPoint *chkpoint;
+			uint8		info
+			= ((XLogRecord *) readRecordBuf)->xl_info & ~XLR_INFO_MASK;
+
+			if ((info == XLOG_CHECKPOINT_SHUTDOWN)
+				|| (info == XLOG_CHECKPOINT_ONLINE))
+			{
+				chkpoint
+					= (CheckPoint *) XLogRecGetData(((XLogRecord *) readRecordBuf));
+				prevchkp = lastchkp;
+				lastchkp = ReadRecPtr;
+				lastcheckpoint = *chkpoint;
+				bFoundChkPoint = true;
+			}
+		}
+	}
+
+	/* Check the checkpoint record is found or not */
+	if (!bFoundChkPoint)
+	{
+		fprintf(stderr, _("%s: Checkpoint redo record is not found\n"),
+				progname);
+	}
+
+	/*
+	 * We can not know clearly if we had reached the end. But just check if we
+	 * reach the last segment file, if it is not, then some problem there.
+	 */
+	if (readSegNo != LastXLogFile->segno)
+	{
+		fprintf(stderr, _("%s: Finished reading xlog segno:%lu is not matching"
+						  " with last xlog file from directory:%lu\n"),
+				progname, readSegNo, LastXLogFile->segno);
+	}
+
+	if (readFile >= 0)
+	{
+		close(readFile);
+		readFile = -1;
+	}
+
+	if (readRecordBuf)
+	{
+		free(readRecordBuf);
+		readRecordBufSize = 0;
+	}
+
+READ_BUF_FREE:
+	free(readBuf);
+	return bFoundChkPoint;
+}
+
+/*
+ * Compare two xlog files by name to see which one is the latest.
+ * Returns true if file 1 is the latest file.
+ */
+static bool
+CompareXLogFileNames(XLogFileName * f1, XLogFileName * f2)
+{
+	if (f2->tli >= f1->tli)
+	{
+		if (f2->segno > f1->segno)
+			return false;
+	}
+
+	return true;
+}
+
+
+
+/*
+ * Add a file that was found in the xlog folder to xlogfilelist.
+ * The xlogfile list is maintained in increasing order.
+ *
+ * The input parameter is the name of the xlog file, which is assumed to
+ * be valid.
+ */
+static void
+AddXLogFileIntoList(char *fname)
+{
+	XLogFileName *NewSegFile,
+			  **currp;
+	char		path[MAXPGPATH];
+
+	/* Allocate a new node for the new file. */
+	NewSegFile = (XLogFileName *) malloc(sizeof(XLogFileName));
+	if (NULL == NewSegFile)
+	{
+		fprintf(stderr, _("%s: Memory allocation failed for size:%lu\n"),
+				progname, sizeof(XLogFileName));
+		exit(1);
+	}
+
+	strcpy(NewSegFile->fname, fname);	/* setup the name */
+
+	/* extract the time line, xlog segment number from the name. */
+	XLogFromFileName(fname, &(NewSegFile->tli), &(NewSegFile->segno));
+
+	NewSegFile->next = NULL;
+
+	/* Ensure the xlog file is active and valid. */
+	snprintf(path, MAXPGPATH, "%s/%s", XLOGDIR, NewSegFile->fname);
+	readFile = open(path, O_RDONLY | PG_BINARY, 0);
+	if (readFile < 0)
+	{
+		fprintf(stderr, _("%s: Can not open xlog file %s.\n"), progname, path);
+		free(NewSegFile);
+		return;
+	}
+
+	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+	{
+		fprintf(stderr, _("%s: Can not read xlog file %s.\n"), progname, path);
+		free(NewSegFile);
+		goto insert_failure;
+	}
+
+	if (!ValidXLOGPageHeader((XLogPageHeader) readBuf,
+							 NewSegFile->tli,
+							 NewSegFile->segno))
+	{
+		free(NewSegFile);
+		goto insert_failure;
+	}
+
+	/* try to search the list and find the insert point. */
+	currp = &xlogfilelist;
+
+	while (*currp && CompareXLogFileNames(NewSegFile, *currp))
+		currp = &((*currp)->next);
+
+	NewSegFile->next = *currp;
+	*currp = NewSegFile;
+
+insert_failure:
+	close(readFile);
+	readFile = -1;
+	return;
+}
+
+/*
+ * validates XLOG directory structure
+ */
+static void
+ValidateXLOGDirectoryStructure(void)
+{
+	char		path[MAXPGPATH];
+	struct stat stat_buf;
+
+	/* Check for pg_xlog; if it doesn't exist, error out */
+	if (stat(XLOGDIR, &stat_buf) != 0
+		|| !S_ISDIR(stat_buf.st_mode))
+	{
+		fprintf(stderr, _("%s: required WAL directory \"%s\" does not exist\n"),
+				progname, XLOGDIR);
+		exit(1);
+	}
+
+	/* Check for archive_status */
+	snprintf(path, MAXPGPATH, XLOGDIR "/archive_status");
+	if ((stat(path, &stat_buf) != 0)
+		|| !S_ISDIR(stat_buf.st_mode))
+	{
+		fprintf(stderr, _("%s: required WAL directory \"%s\" does not exist\n"),
+				progname, path);
+		exit(1);
+	}
+}
+
+/*
+ * Check if the file is a valid xlog file.
+ * Return true for the input file is a valid xlog file.
+ * The input parameter is the name of the xlog file.
+ */
+static bool
+ValidXLogFileName(char *fname)
+{
+	uint32		logTLI,
+				logId,
+				logSeg;
+
+	if (strlen(fname) != 24 ||
+		strspn(fname, "0123456789ABCDEF") != 24 ||
+		sscanf(fname, "%8x%8x%8x", &logTLI, &logId, &logSeg) != 3)
+		return false;
+	return true;
+
+}
+
+/* Ensure the xlog page header is valid.*/
+static bool
+ValidXLOGPageHeader(XLogPageHeader hdr, uint32 tli, XLogSegNo segno)
+{
+	XLogRecPtr	recaddr;
+
+	if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
+	{
+		return false;
+	}
+
+	if ((hdr->xlp_info & ~XLP_ALL_FLAGS) != 0)
+	{
+		return false;
+	}
+
+	if (hdr->xlp_info & XLP_LONG_HEADER)
+	{
+		XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr;
+
+		if (longhdr->xlp_seg_size != XLogSegSize)
+		{
+			return false;
+		}
+
+		if (longhdr->xlp_xlog_blcksz != XLOG_BLCKSZ)
+		{
+			return false;
+		}
+
+		/* Get the system identifier from the segment file header. */
+		ControlFile.system_identifier = ((XLogLongPageHeader) longhdr)->xlp_sysid;
+	}
+	else if (0 == readOff)
+	{
+		/* As first page of file doesn't have a long header */
+		return false;
+	}
+
+	XLogSegNoOffsetToRecPtr(segno, readOff, recaddr);
+
+	if (!XLByteEQ(hdr->xlp_pageaddr, recaddr))
+	{
+		return false;
+	}
+
+	if (hdr->xlp_tli != tli)
+	{
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Validate an XLOG record header.
+ */
+static bool
+ValidXLogRecordHeader(XLogRecPtr *RecPtr, XLogRecord *record, bool randAccess)
+{
+	/*
+	 * xl_len == 0 is bad data for everything except XLOG SWITCH, where it is
+	 * required.
+	 */
+	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
+	{
+		if (record->xl_len != 0)
+		{
+			return false;
+		}
+	}
+	else if (record->xl_len == 0)
+	{
+		return false;
+	}
+
+	if (record->xl_tot_len < SizeOfXLogRecord + record->xl_len ||
+		record->xl_tot_len > SizeOfXLogRecord + record->xl_len +
+		XLR_MAX_BKP_BLOCKS * (sizeof(BkpBlock) + BLCKSZ))
+	{
+		return false;
+	}
+	if (record->xl_rmid > RM_MAX_ID)
+	{
+		return false;
+	}
+
+	if (randAccess)
+	{
+		/*
+		 * We can't exactly verify the prev-link, but surely it should be less
+		 * than the record's own address.
+		 */
+		if (!XLByteLT(record->xl_prev, *RecPtr))
+		{
+			return false;
+		}
+	}
+	else
+	{
+		/*
+		 * Record's prev-link should exactly match our previous location. This
+		 * check guards against torn WAL pages where a stale but valid-looking
+		 * WAL record starts on a sector boundary.
+		 */
+		if (!XLByteEQ(record->xl_prev, ReadRecPtr))
+		{
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * Prepare all the existing XLOG files into a list
+ */
+static bool
+PrepareXLogFileList(void)
+{
+	DIR		   *xldir;
+	struct dirent *xlde;
+
+	ValidateXLOGDirectoryStructure();
+
+	/* Open the xlog directory. */
+	xldir = opendir(XLOGDIR);
+	if (xldir == NULL)
+	{
+		fprintf(stderr, _("%s: could not open directory \"%s\": %s\n"),
+				progname, XLOGDIR, strerror(errno));
+		return false;
+	}
+
+	/* Search the directory, insert the segment files into the xlogfilelist. */
+	errno = 0;
+	while ((xlde = readdir(xldir)) != NULL)
+	{
+		if (ValidXLogFileName(xlde->d_name))
+		{
+			/* XLog file is found, insert it into the xlogfilelist. */
+			AddXLogFileIntoList(xlde->d_name);
+		}
+		errno = 0;
+	}
+#ifdef WIN32
+	if (GetLastError() == ERROR_NO_MORE_FILES)
+		errno = 0;
+#endif
+
+	if (errno)
+	{
+		fprintf(stderr, _("%s: could not read from directory \"%s\": %s\n"),
+				progname, XLOGDIR, strerror(errno));
+		closedir(xldir);
+		return false;
+	}
+
+	closedir(xldir);
+	return true;
+}
+
+/* Check whether two segment files are contiguous. */
+static bool
+IsNextSeg(XLogFileName * prev, XLogFileName * cur)
+{
+	uint32		logid;
+	uint32		logsegno;
+	XLogSegNo	nextSegno;
+
+	if (prev->tli != cur->tli)
+		return false;
+
+	logid = prev->segno / XLogSegmentsPerXLogId;
+	logsegno = prev->segno % XLogSegmentsPerXLogId;
+
+	if ((logsegno + 1) < XLogSegmentsPerXLogId)
+	{
+		logsegno++;
+	}
+	else
+	{
+		logsegno = 0;
+		logid++;
+	}
+
+	nextSegno = (uint64) logid *XLogSegmentsPerXLogId + logsegno;
+
+	if (nextSegno == cur->segno)
+		return true;
+
+	return false;
+}
+
+/*
+ * Select the oldest xlog file in the latest time line to start
+ * the search for checkpoint redo record.
+ */
+static bool
+GetStartXLogFile(void)
+{
+	XLogFileName *tmp;
+
+	if (NULL == xlogfilelist)
+	{
+		return false;
+	}
+
+	tmp = CurXLogFile = xlogfilelist;
+
+	while (tmp->next != NULL)
+	{
+		/*
+		 * We should ensure that the segment files from the first to the
+		 * last are contiguous.
+		 */
+		if (!IsNextSeg(tmp, tmp->next))
+		{
+			CurXLogFile = tmp->next;
+		}
+
+		tmp = tmp->next;
+	}
+
+	LastXLogFile = tmp;
+	return true;
+}
+
+/*
+ * CRC-check an XLOG record.  We do not believe the contents of an XLOG
+ * record (other than to the minimal extent of computing the amount of
+ * data to read in) until we've checked the CRCs.
+ *
+ * We assume all of the record has been read into memory at *record.
+ */
+static bool
+RecordIsValid(XLogRecord *record)
+{
+	pg_crc32	crc;
+	int			i;
+	uint32		len = record->xl_len;
+	BkpBlock	bkpb;
+	char	   *blk;
+
+	/* First the rmgr data */
+	INIT_CRC32(crc);
+	COMP_CRC32(crc, XLogRecGetData(record), len);
+
+	/* Add in the backup blocks, if any */
+	blk = (char *) XLogRecGetData(record) + len;
+	for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
+	{
+		uint32		blen;
+
+		if (!(record->xl_info & XLR_SET_BKP_BLOCK(i)))
+			continue;
+
+		memcpy(&bkpb, blk, sizeof(BkpBlock));
+		if (bkpb.hole_offset + bkpb.hole_length > BLCKSZ)
+		{
+			return false;
+		}
+		blen = sizeof(BkpBlock) + BLCKSZ - bkpb.hole_length;
+		COMP_CRC32(crc, blk, blen);
+		blk += blen;
+	}
+
+	/* Check that xl_tot_len agrees with our calculation */
+	if (blk != (char *) record + record->xl_tot_len)
+	{
+		return false;
+	}
+
+	/* Finally include the record header */
+	COMP_CRC32(crc, (char *) record, offsetof(XLogRecord, xl_crc));
+	FIN_CRC32(crc);
+
+	if (!EQ_CRC32(record->xl_crc, crc))
+	{
+		return false;
+	}
+
+	return true;
+}
+
+/* Clean up the allocated list.*/
+static void
+CleanUpList(XLogFileName * list)
+{
+	XLogFileName *tmp;
+
+	tmp = list;
+	while (list != NULL)
+	{
+		tmp = list->next;
+		free(list);
+		list = tmp;
+	}
+}
+
+/*
+ * Reads one XLOG page.
+ */
+static bool
+XLogPageRead(XLogRecPtr *RecPtr)
+{
+	bool		switchlog = false;
+	uint32		targetPageOff;
+	uint32		targetRecOff;
+	XLogSegNo	targetSegNo;
+	char		path[MAXPGPATH];
+
+	XLByteToSeg(*RecPtr, targetSegNo);
+	targetPageOff = (((*RecPtr) % XLogSegSize) / XLOG_BLCKSZ) * XLOG_BLCKSZ;
+	targetRecOff = (*RecPtr) % XLOG_BLCKSZ;
+
+	/* Fast exit if we have read the record in the current buffer already */
+	if ((targetSegNo == readSegNo)
+		&& (targetPageOff == readOff)
+		&& (targetRecOff < XLOG_BLCKSZ))
+		return true;
+
+	/*
+	 * See if we need to switch to a new segment because the requested record
+	 * is not in the currently open one.
+	 */
+	if (readFile >= 0 && !XLByteInSeg(*RecPtr, readSegNo))
+	{
+		close(readFile);
+		readFile = -1;
+	}
+
+	XLByteToSeg(*RecPtr, readSegNo);
+
+	if (readFile < 0)
+	{
+		if (!CurXLogFile)
+		{
+			fprintf(stderr, _("%s: No next file is found to read. "
+							  "Incomplete xlog files.\n"), progname);
+			return false;
+		}
+
+		/* Open a  Xlog segment file. */
+		snprintf(path, MAXPGPATH, "%s/%s", XLOGDIR, CurXLogFile->fname);
+
+		readFile = open(path, O_RDONLY | PG_BINARY, 0);
+		if (readFile < 0)
+		{
+			fprintf(stderr, _("%s: Can not open xlog file %s.\n"), progname, path);
+			return false;
+		}
+
+		readTli = CurXLogFile->tli;
+
+		switchlog = true;
+		CurXLogFile = CurXLogFile->next;
+	}
+
+	/*
+	 * The following case is not possible, because the search always starts
+	 * from the first page.
+	 */
+	if (switchlog && targetPageOff != 0)
+	{
+		Assert(0);
+	}
+
+	/* Read the requested page */
+	readOff = targetPageOff;
+	if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0)
+		return false;
+
+	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+		return false;
+
+	if (!ValidXLOGPageHeader((XLogPageHeader) readBuf, readTli, readSegNo))
+		return false;
+
+	Assert(targetSegNo == readSegNo);
+	Assert(targetPageOff == readOff);
+
+	return true;
+}
+
+/*
+ * Attempt to read an XLOG record.
+ */
+static XLogRecord *
+ReadRecord(XLogRecPtr *RecPtr, bool randAccess, bool *bContRec)
+{
+	uint32		targetRecOff;
+	XLogRecord *record;
+	uint32		len;
+	uint32		total_len;
+	uint32		pageHeaderSize;
+	bool		gotheader = false;
+	XLogPageHeader pageHeader;
+
+	/* Read the page of the record */
+	if (!XLogPageRead(RecPtr))
+		return NULL;
+
+	pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
+	targetRecOff = (*RecPtr) % XLOG_BLCKSZ;
+	if (targetRecOff == 0)
+	{
+		(*RecPtr) += pageHeaderSize;
+		targetRecOff = pageHeaderSize;
+	}
+	else if (targetRecOff < pageHeaderSize)
+	{
+		return NULL;
+	}
+
+	pageHeader = (XLogPageHeader) readBuf;
+
+	/*
+	 * Check whether the first record on this page is a continuation record.
+	 */
+	if (!(*bContRec)
+		&& randAccess
+		&& (pageHeader->xlp_info & XLP_FIRST_IS_CONTRECORD))
+	{
+		*bContRec = true;
+	}
+	else
+		*bContRec = false;
+
+	/*
+	 * NB: Even though we use an XLogRecord pointer here, the whole record
+	 * header might not fit on this page. xl_tot_len is the first field in
+	 * struct, so it must be on this page, but we cannot safely access any
+	 * other fields yet.
+	 */
+	record = (XLogRecord *) (readBuf + (*RecPtr) % XLOG_BLCKSZ);
+	if (!(*bContRec))
+	{
+		total_len = record->xl_tot_len;
+	}
+	else
+	{
+		total_len = pageHeader->xlp_rem_len;
+	}
+
+	/*
+	 * If we got the whole header already, validate it immediately. Otherwise
+	 * we validate it after reading the rest of the header from the next page.
+	 */
+	if (*bContRec)
+	{
+		gotheader = true;
+	}
+	else if (targetRecOff <= (XLOG_BLCKSZ - SizeOfXLogRecord))
+	{
+		if (!ValidXLogRecordHeader(RecPtr, record, randAccess))
+			return NULL;
+		gotheader = true;
+	}
+
+	/*
+	 * Allocate or enlarge readRecordBuf as needed.  To avoid useless small
+	 * increases, round its size to a multiple of BLCKSZ, and make sure it's
+	 * at least 4*BLCKSZ + XLOG_BLCKSZ to start with. (That is enough for all
+	 * "normal" records, but very large commit or abort records might need
+	 * more space.)
+	 */
+	if (total_len > readRecordBufSize)
+	{
+		uint32		newSize = total_len;
+
+		newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
+		newSize = Max(newSize, (4 * BLCKSZ) + XLOG_BLCKSZ);
+
+		if (readRecordBuf)
+			free(readRecordBuf);
+
+		readRecordBuf = (char *) malloc(newSize);
+		if (!readRecordBuf)
+		{
+			fprintf(stderr, _("%s: Memory allocation failed for size: %u\n"),
+					progname, newSize);
+			readRecordBufSize = 0;
+			exit(1);
+		}
+
+		readRecordBufSize = newSize;
+	}
+
+	len = XLOG_BLCKSZ - (*RecPtr) % XLOG_BLCKSZ;		/* available in block */
+	if (total_len > len)
+	{
+		/* Need to reassemble record */
+		char	   *contrecord;
+		XLogRecPtr	pagelsn;
+		char	   *buffer;
+		uint32		gotlen;
+
+		/* Initialize pagelsn to the beginning of the page this record is on */
+		pagelsn = ((*RecPtr) / XLOG_BLCKSZ) * XLOG_BLCKSZ;
+
+		/* Copy the first fragment of the record from the first page. */
+		memcpy(readRecordBuf, readBuf + (*RecPtr) % XLOG_BLCKSZ, len);
+		buffer = readRecordBuf + len;
+		gotlen = len;
+
+		do
+		{
+			/* Calculate pointer to beginning of next page */
+			XLByteAdvance(pagelsn, XLOG_BLCKSZ);
+
+			/* Read the next page; fail if it is not available */
+			if (!XLogPageRead(&pagelsn))
+				return NULL;
+
+			/* Check that the continuation on next page looks valid */
+			pageHeader = (XLogPageHeader) readBuf;
+			if (!(pageHeader->xlp_info & XLP_FIRST_IS_CONTRECORD))
+			{
+				return NULL;
+			}
+
+			/*
+			 * Cross-check that xlp_rem_len agrees with how much of the record
+			 * we expect there to be left.
+			 */
+			if (pageHeader->xlp_rem_len == 0 ||
+				total_len != (pageHeader->xlp_rem_len + gotlen))
+			{
+				return NULL;
+			}
+
+			/* Append the continuation from this page to the buffer */
+			pageHeaderSize = XLogPageHeaderSize(pageHeader);
+			contrecord = (char *) readBuf + pageHeaderSize;
+			len = XLOG_BLCKSZ - pageHeaderSize;
+
+			if (pageHeader->xlp_rem_len < len)
+				len = pageHeader->xlp_rem_len;
+
+			memcpy(buffer, (char *) contrecord, len);
+			buffer += len;
+			gotlen += len;
+
+			/* If we just reassembled the record header, validate it. */
+			if (!gotheader)
+			{
+				record = (XLogRecord *) readRecordBuf;
+				if (!ValidXLogRecordHeader(RecPtr, record, randAccess))
+					return NULL;
+
+				gotheader = true;
+			}
+		} while (pageHeader->xlp_rem_len > len);
+
+		record = (XLogRecord *) readRecordBuf;
+		if (!(*bContRec))
+		{
+			if (!RecordIsValid(record))
+				return NULL;
+		}
+
+		pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
+		ReadRecPtr = *RecPtr;
+		XLogSegNoOffsetToRecPtr(
+								readSegNo,
+				readOff + pageHeaderSize + MAXALIGN(pageHeader->xlp_rem_len),
+								EndRecPtr);
+	}
+	else
+	{
+		/* Record does not cross a page boundary */
+		if (!(*bContRec))
+		{
+			if (!RecordIsValid(record))
+				return NULL;
+		}
+
+		ReadRecPtr = *RecPtr;
+		EndRecPtr = *RecPtr + MAXALIGN(total_len);
+		memcpy(readRecordBuf, record, total_len);
+	}
+
+	/*
+	 * Special processing if it's an XLOG SWITCH record
+	 */
+	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
+	{
+		/* Pretend it extends to end of segment */
+		EndRecPtr += XLogSegSize - 1;
+		EndRecPtr -= EndRecPtr % XLogSegSize;
+
+		readOff = XLogSegSize - XLOG_BLCKSZ;
+	}
+
+	return record;
+}
 
 static void
 usage(void)
@@ -994,6 +1994,7 @@ usage(void)
 	printf(_("Options:\n"));
 	printf(_("  -e XIDEPOCH      set next transaction ID epoch\n"));
 	printf(_("  -f               force update to be done\n"));
+	printf(_("  -r               restore the control values if possible\n"));
 	printf(_("  -l xlogfile      force minimum WAL starting location for new transaction log\n"));
 	printf(_("  -m XID           set next multitransaction ID\n"));
 	printf(_("  -n               no update, just show extracted control values (for testing)\n"));
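
The address arithmetic that XLogPageRead() and the XLOG_SWITCH branch of ReadRecord() rely on can be tried in isolation. The following is only a rough sketch, not part of the patch: it assumes a 64-bit integer record pointer and the default sizes (8 kB WAL pages, 16 MB segments), and every SKETCH_/Sketch name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults; a real build takes these from its configuration. */
#define SKETCH_XLOG_BLCKSZ		8192u
#define SKETCH_XLOG_SEG_SIZE	(16u * 1024u * 1024u)

/* Hypothetical helper type: where a record pointer lands. */
typedef struct SketchTarget
{
	uint64_t	segno;			/* segment number */
	uint32_t	page_off;		/* offset of the page within the segment */
	uint32_t	rec_off;		/* offset of the record within the page */
} SketchTarget;

/* Same computation as targetSegNo/targetPageOff/targetRecOff in the patch. */
SketchTarget
sketch_locate(uint64_t recptr)
{
	SketchTarget t;

	assert(SKETCH_XLOG_SEG_SIZE % SKETCH_XLOG_BLCKSZ == 0);

	t.segno = recptr / SKETCH_XLOG_SEG_SIZE;
	t.page_off = (uint32_t) (((recptr % SKETCH_XLOG_SEG_SIZE)
							  / SKETCH_XLOG_BLCKSZ) * SKETCH_XLOG_BLCKSZ);
	t.rec_off = (uint32_t) (recptr % SKETCH_XLOG_BLCKSZ);
	return t;
}

/* Rounds an end pointer up to the next segment boundary (XLOG_SWITCH case). */
uint64_t
sketch_round_up_to_segment(uint64_t endptr)
{
	endptr += SKETCH_XLOG_SEG_SIZE - 1;
	endptr -= endptr % SKETCH_XLOG_SEG_SIZE;
	return endptr;
}
```

A pointer that already sits on a segment boundary is left unchanged by the rounding, which is why the patch can apply it unconditionally after a switch record.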
#2Amit kapila
amit.kapila@huawei.com
In reply to: Amit kapila (#1)
FW: Patch for option in pg_resetxlog for restore from WAL files

I have uploaded the patch for the new pg_resetxlog option at the location below:

https://commitfest.postgresql.org/action/patch_view?id=897

This completes the implementation of Option-2 discussed in the mail below.
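
Option-2, as described in the quoted mail further down, searches the available WAL files for the latest checkpoint record. The selection step could look roughly like the sketch below. It is not taken from the patch: SketchRecord is a hypothetical, simplified stand-in for a decoded record, and the constants mirror what I believe the PostgreSQL headers of this period define, so please treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the PostgreSQL headers of this period. */
#define RM_XLOG_ID					0
#define XLR_INFO_MASK				0x0F
#define XLOG_CHECKPOINT_SHUTDOWN	0x00
#define XLOG_CHECKPOINT_ONLINE		0x10

/* Hypothetical, simplified view of one decoded WAL record. */
typedef struct SketchRecord
{
	uint64_t	lsn;			/* start position of the record */
	uint8_t		rmid;			/* resource manager id */
	uint8_t		info;			/* xl_info flag bits */
} SketchRecord;

/* Returns nonzero if the record is a shutdown or online checkpoint. */
int
sketch_is_checkpoint(const SketchRecord *rec)
{
	uint8_t		info = (uint8_t) (rec->info & ~XLR_INFO_MASK);

	return rec->rmid == RM_XLOG_ID &&
		(info == XLOG_CHECKPOINT_SHUTDOWN || info == XLOG_CHECKPOINT_ONLINE);
}

/* Returns the checkpoint record with the highest LSN, or NULL if none. */
const SketchRecord *
sketch_latest_checkpoint(const SketchRecord *recs, size_t nrecs)
{
	const SketchRecord *latest = NULL;
	size_t		i;

	for (i = 0; i < nrecs; i++)
	{
		if (!sketch_is_checkpoint(&recs[i]))
			continue;
		if (latest == NULL || recs[i].lsn > latest->lsn)
			latest = &recs[i];
	}
	return latest;
}
```

A NULL result corresponds to the case where no usable checkpoint is found and the control values have to be guessed.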

Next I will work on Option-1: computing the maximum LSN found in data pages,
where the user chooses whether to scan an individual file, a particular directory, or the whole database.
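
To make the Option-1 idea concrete, here is a minimal sketch of the per-file part, with nothing implemented yet. It assumes the default 8 kB block size and that each page begins with its LSN stored as two 32-bit halves (high, then low) in host byte order; the function name and SKETCH_PAGE_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192	/* assumed default block size */

/*
 * Returns the largest page LSN found in buf, as a 64-bit value.
 * Only whole pages are examined; returns 0 if there are none.
 */
uint64_t
sketch_max_page_lsn(const unsigned char *buf, size_t len)
{
	uint64_t	max_lsn = 0;
	size_t		off;

	assert(buf != NULL || len == 0);

	for (off = 0; off + SKETCH_PAGE_SIZE <= len; off += SKETCH_PAGE_SIZE)
	{
		uint32_t	hi;
		uint32_t	lo;
		uint64_t	lsn;

		/* memcpy avoids alignment assumptions about buf */
		memcpy(&hi, buf + off, sizeof(hi));
		memcpy(&lo, buf + off + sizeof(hi), sizeof(lo));

		lsn = ((uint64_t) hi << 32) | lo;
		if (lsn > max_lsn)
			max_lsn = lsn;
	}
	return max_lsn;
}
```

A real tool would also need to validate each page header before trusting its LSN, and would apply this to every relation file when a directory or the whole database is requested.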

From: Amit kapila
Sent: Wednesday, July 18, 2012 7:17 PM

Patch implementing the design in below mail chain is attached with this mail.

From: Amit Kapila [mailto:amit.kapila@huawei.com]
Sent: Thursday, July 05, 2012 10:21 AM

From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Friday, June 22, 2012 8:59 PM
On Fri, Jun 22, 2012 at 5:25 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
Based on the discussion and suggestions in this mail chain, the following
features can be implemented:

1. To compute the value of max LSN in data pages, based on user input
specifying whether it is wanted for an individual file, a particular
directory or the whole database.

2a. To search the available WAL files for the latest checkpoint record
and print its value.

2b. To search the available WAL files for the latest checkpoint record
and recreate a pg_control file pointing at that checkpoint.

I have kept both options to address different kinds of corruption
scenarios.

I think I can see all of those things being potentially useful. There
are a couple of pending patches that will revise the WAL format
slightly; not sure how much those are likely to interfere with any
development you might do on (2) in the meantime.

With Regards,

Amit Kapila.

#3Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Amit kapila (#1)
Re: Patch for option in pg_resetxlog for restore from WAL files

On 18.07.2012 16:47, Amit kapila wrote:

Patch implementing the design in below mail chain is attached with this mail.

This patch copies the ReadRecord() function and a bunch of related
functions from xlog.c into pg_resetxlog.c. There's a separate patch in
the current commitfest to make that code reusable, without having to
copy-paste it to every tool that wants to parse the WAL. See
https://commitfest.postgresql.org/action/patch_view?id=860. This patch
should be refactored to make use of that framework, as soon as it's
committed.

- Heikki

#4Amit Kapila
amit.kapila@huawei.com
In reply to: Heikki Linnakangas (#3)
Re: Patch for option in pg_resetxlog for restore from WAL files

On Monday, September 24, 2012 2:30 PM Heikki Linnakangas wrote:
On 18.07.2012 16:47, Amit kapila wrote:

Patch implementing the design in below mail chain is attached with
this mail.

This patch copies the ReadRecord() function and a bunch of related
functions from xlog.c into pg_resetxlog.c. There's a separate patch in
the current commitfest to make that code reusable, without having to
copy-paste it to every tool that wants to parse the WAL. See
https://commitfest.postgresql.org/action/patch_view?id=860. This patch
should be refactored to make use of that framework, as soon as it's
committed.

Sure. Thanks for the feedback.

With Regards,
Amit Kapila.

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#3)
Re: Patch for option in pg_resetxlog for restore from WAL files

On 24 September 2012 04:00, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 18.07.2012 16:47, Amit kapila wrote:

Patch implementing the design in below mail chain is attached with this
mail.

This patch copies the ReadRecord() function and a bunch of related functions
from xlog.c into pg_resetxlog.c. There's a separate patch in the current
commitfest to make that code reusable, without having to copy-paste it to
every tool that wants to parse the WAL. See
https://commitfest.postgresql.org/action/patch_view?id=860. This patch
should be refactored to make use of that framework, as soon as it's
committed.

Agreed, moving to next commitfest.

Amit, suggest review of the patch that this now depends upon.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6Amit Kapila
amit.kapila@huawei.com
In reply to: Simon Riggs (#5)
Re: Patch for option in pg_resetxlog for restore from WAL files

On Tuesday, September 25, 2012 6:27 PM Simon Riggs wrote:

On 24 September 2012 04:00, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 18.07.2012 16:47, Amit kapila wrote:

Patch implementing the design in below mail chain is attached with this
mail.

This patch copies the ReadRecord() function and a bunch of related functions
from xlog.c into pg_resetxlog.c. There's a separate patch in the current
commitfest to make that code reusable, without having to copy-paste it to
every tool that wants to parse the WAL. See
https://commitfest.postgresql.org/action/patch_view?id=860. This patch
should be refactored to make use of that framework, as soon as it's
committed.

Agreed, moving to next commitfest.

Amit, suggest review of the patch that this now depends upon.

Earlier I had planned to finish this in the current CommitFest, provided the
XLogReader patch is committed by next week.
However, if you feel it is better to target the next CommitFest, I will do
it that way.

With Regards,
Amit Kapila.