[PATCH] xlogreader-v4

andres@anarazel.de

over 13 years ago

From: Andres Freund <andres@2ndquadrant.com>
Subject: [PATCH] xlogreader-v4
In-Reply-To:

Hi,

this is the latest and obviously best version of xlogreader & xlogdump with
changes both from Heikki and me.

Changes:
* windows build support for pg_xlogdump
* xlogdump moved to contrib
* xlogdump option parsing enhancements
* generic cleanups
* a few more comments
* removal of some ugliness in XLogFindNextRecord

I think its mostly ready, for xlogdump minimally these two issues remain:

const char *
timestamptz_to_str(TimestampTz dt)
{
return "unimplemented-timestamp";
}

const char *
relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
{
return "unimplemented-relpathbackend";
}

aren't exactly the nicest wrapper functions. I think its ok to simply copy
relpathbackend from the backend, but timestamptz_to_str? Thats a heck of a lot
of code.

Patches 1 and 2 and 5 are just preparatory and probably can be applied
beforehand.

3 and 4 are the real meat of this and especially 3 needs some careful review.

Input welcome!

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

[PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

From: Andres Freund <andres@anarazel.de>

c.h already had parts of the assert support (StaticAssert*) and its the shared
file between postgres.h and postgres_fe.h. This makes it easier to build
frontend programs which have to do the hack.
---
src/include/c.h | 65 +++++++++++++++++++++++++++++++++++++++++++++++
src/include/postgres.h | 54 ++-------------------------------------
src/include/postgres_fe.h | 12 ---------
3 files changed, 67 insertions(+), 64 deletions(-)

diff --git a/src/include/c.h b/src/include/c.h
index f7db157..c30df8b 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -694,6 +694,71 @@ typedef NameData *Name;

 /*
+ * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
+ * - plai  9/5/90
+ *
+ * It should _NOT_ be defined in releases or in benchmark copies
+ */
+
+/*
+ * Assert() can be used in both frontend and backend code. In frontend code it
+ * just calls the standard assert, if it's available. If use of assertions is
+ * not configured, it does nothing.
+ */
+#ifndef USE_ASSERT_CHECKING
+
+#define Assert(condition)
+#define AssertMacro(condition)	((void)true)
+#define AssertArg(condition)
+#define AssertState(condition)
+
+#elif defined FRONTEND
+
+#include <assert.h>
+#define Assert(p) assert(p)
+#define AssertMacro(p)	((void) assert(p))
+
+#else /* USE_ASSERT_CHECKING && FRONTEND */
+
+/*
+ * Trap
+ *		Generates an exception if the given condition is true.
+ */
+#define Trap(condition, errorType) \
+	do { \
+		if ((assert_enabled) && (condition)) \
+			ExceptionalCondition(CppAsString(condition), (errorType), \
+								 __FILE__, __LINE__); \
+	} while (0)
+
+/*
+ *	TrapMacro is the same as Trap but it's intended for use in macros:
+ *
+ *		#define foo(x) (AssertMacro(x != 0), bar(x))
+ *
+ *	Isn't CPP fun?
+ */
+#define TrapMacro(condition, errorType) \
+	((bool) ((! assert_enabled) || ! (condition) || \
+			 (ExceptionalCondition(CppAsString(condition), (errorType), \
+								   __FILE__, __LINE__), 0)))
+
+#define Assert(condition) \
+		Trap(!(condition), "FailedAssertion")
+
+#define AssertMacro(condition) \
+		((void) TrapMacro(!(condition), "FailedAssertion"))
+
+#define AssertArg(condition) \
+		Trap(!(condition), "BadArgument")
+
+#define AssertState(condition) \
+		Trap(!(condition), "BadState")
+
+#endif /* USE_ASSERT_CHECKING && !FRONTEND */
+
+
+/*
  * Macros to support compile-time assertion checks.
  *
  * If the "condition" (a compile-time-constant expression) evaluates to false,
diff --git a/src/include/postgres.h b/src/include/postgres.h
index b6e922f..bbe125a 100644
--- a/src/include/postgres.h
+++ b/src/include/postgres.h
@@ -25,7 +25,7 @@
  *	  -------	------------------------------------------------
  *		1)		variable-length datatypes (TOAST support)
  *		2)		datum type + support macros
- *		3)		exception handling definitions
+ *		3)		exception handling
  *
  *	 NOTES
  *
@@ -627,62 +627,12 @@ extern Datum Float8GetDatum(float8 X);

 /* ----------------------------------------------------------------
- *				Section 3:	exception handling definitions
- *							Assert, Trap, etc macros
+ *				Section 3:	exception handling backend support
  * ----------------------------------------------------------------
  */

extern PGDLLIMPORT bool assert_enabled;

-/*
- * USE_ASSERT_CHECKING, if defined, turns on all the assertions.
- * - plai  9/5/90
- *
- * It should _NOT_ be defined in releases or in benchmark copies
- */
-
-/*
- * Trap
- *		Generates an exception if the given condition is true.
- */
-#define Trap(condition, errorType) \
-	do { \
-		if ((assert_enabled) && (condition)) \
-			ExceptionalCondition(CppAsString(condition), (errorType), \
-								 __FILE__, __LINE__); \
-	} while (0)
-
-/*
- *	TrapMacro is the same as Trap but it's intended for use in macros:
- *
- *		#define foo(x) (AssertMacro(x != 0), bar(x))
- *
- *	Isn't CPP fun?
- */
-#define TrapMacro(condition, errorType) \
-	((bool) ((! assert_enabled) || ! (condition) || \
-			 (ExceptionalCondition(CppAsString(condition), (errorType), \
-								   __FILE__, __LINE__), 0)))
-
-#ifndef USE_ASSERT_CHECKING
-#define Assert(condition)
-#define AssertMacro(condition)	((void)true)
-#define AssertArg(condition)
-#define AssertState(condition)
-#else
-#define Assert(condition) \
-		Trap(!(condition), "FailedAssertion")
-
-#define AssertMacro(condition) \
-		((void) TrapMacro(!(condition), "FailedAssertion"))
-
-#define AssertArg(condition) \
-		Trap(!(condition), "BadArgument")
-
-#define AssertState(condition) \
-		Trap(!(condition), "BadState")
-#endif   /* USE_ASSERT_CHECKING */
-
 extern void ExceptionalCondition(const char *conditionName,
 					 const char *errorType,
 			 const char *fileName, int lineNumber) __attribute__((noreturn));
diff --git a/src/include/postgres_fe.h b/src/include/postgres_fe.h
index af31227..0f35ecc 100644
--- a/src/include/postgres_fe.h
+++ b/src/include/postgres_fe.h
@@ -24,16 +24,4 @@

#include "c.h"

-/*
- * Assert() can be used in both frontend and backend code. In frontend code it
- * just calls the standard assert, if it's available. If use of assertions is
- * not configured, it does nothing.
- */
-#ifdef USE_ASSERT_CHECKING
-#include <assert.h>
-#define Assert(p) assert(p)
-#else
-#define Assert(p)
-#endif
-
#endif /* POSTGRES_FE_H */
--
1.7.12.289.g0ce9864.dirty

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

[PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

From: Andres Freund <andres@anarazel.de>

relpathbackend() (via some of its wrappers) is used in *_desc routines which we
want to be useable without a backend environment arround.

Change signature to return a 'const char *' to make misuse easier to
detect. That necessicates also changing the 'FileName' typedef to 'const char
*' which seems to be a good idea anyway.
---
src/backend/access/rmgrdesc/smgrdesc.c | 6 ++---
src/backend/access/rmgrdesc/xactdesc.c | 6 ++---
src/backend/access/transam/xlogutils.c | 9 +++----
src/backend/catalog/catalog.c | 49 +++++++++++-----------------------
src/backend/storage/buffer/bufmgr.c | 12 +++------
src/backend/storage/file/fd.c | 6 ++---
src/backend/storage/smgr/md.c | 23 +++++-----------
src/backend/utils/adt/dbsize.c | 4 +--
src/include/catalog/catalog.h | 2 +-
src/include/storage/fd.h | 9 +++----
10 files changed, 42 insertions(+), 84 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index bcabf89..490c8c7 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -26,19 +26,17 @@ smgr_desc(StringInfo buf, uint8 xl_info, char *rec)
 	if (info == XLOG_SMGR_CREATE)
 	{
 		xl_smgr_create *xlrec = (xl_smgr_create *) rec;
-		char	   *path = relpathperm(xlrec->rnode, xlrec->forkNum);
+		const char *path = relpathperm(xlrec->rnode, xlrec->forkNum);

 		appendStringInfo(buf, "file create: %s", path);
-		pfree(path);
 	}
 	else if (info == XLOG_SMGR_TRUNCATE)
 	{
 		xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
-		char	   *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
+		const char *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);

 		appendStringInfo(buf, "file truncate: %s to %u blocks", path,
 						 xlrec->blkno);
-		pfree(path);
 	}
 	else
 		appendStringInfo(buf, "UNKNOWN");
diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 2471279..b86a53e 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -35,10 +35,9 @@ xact_desc_commit(StringInfo buf, xl_xact_commit *xlrec)
 		appendStringInfo(buf, "; rels:");
 		for (i = 0; i < xlrec->nrels; i++)
 		{
-			char	   *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
+			const char *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);

 			appendStringInfo(buf, " %s", path);
-			pfree(path);
 		}
 	}
 	if (xlrec->nsubxacts > 0)
@@ -105,10 +104,9 @@ xact_desc_abort(StringInfo buf, xl_xact_abort *xlrec)
 		appendStringInfo(buf, "; rels:");
 		for (i = 0; i < xlrec->nrels; i++)
 		{
-			char	   *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);
+			const char *path = relpathperm(xlrec->xnodes[i], MAIN_FORKNUM);

 			appendStringInfo(buf, " %s", path);
-			pfree(path);
 		}
 	}
 	if (xlrec->nsubxacts > 0)
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index f9a6e62..8266f3c 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -57,7 +57,7 @@ static void
 report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
 					BlockNumber blkno, bool present)
 {
-	char	   *path = relpathperm(node, forkno);
+	const char *path = relpathperm(node, forkno);

if (present)
elog(elevel, "page %u of relation %s is uninitialized",
@@ -65,7 +65,6 @@ report_invalid_page(int elevel, RelFileNode node, ForkNumber forkno,
else
elog(elevel, "page %u of relation %s does not exist",
blkno, path);
- pfree(path);
}

 /* Log a reference to an invalid page */
@@ -153,11 +152,10 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno)
 		{
 			if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2)
 			{
-				char	   *path = relpathperm(hentry->key.node, forkno);
+				const char *path = relpathperm(hentry->key.node, forkno);

elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
- pfree(path);
}

 			if (hash_search(invalid_page_tab,
@@ -186,11 +184,10 @@ forget_invalid_pages_db(Oid dbid)
 		{
 			if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2)
 			{
-				char	   *path = relpathperm(hentry->key.node, hentry->key.forkno);
+				const char *path = relpathperm(hentry->key.node, hentry->key.forkno);

elog(DEBUG2, "page %u of relation %s has been dropped",
hentry->key.blkno, path);
- pfree(path);
}

 			if (hash_search(invalid_page_tab,
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 9686486..6455ef0 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -112,54 +112,45 @@ forkname_chars(const char *str, ForkNumber *fork)
 /*
  * relpathbackend - construct path to a relation's file
  *
- * Result is a palloc'd string.
+ * Result is a pointer to a statically allocated string.
  */
-char *
+const char *
 relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
 {
-	int			pathlen;
-	char	   *path;
+	static char path[MAXPGPATH];

 	if (rnode.spcNode == GLOBALTABLESPACE_OID)
 	{
 		/* Shared system relations live in {datadir}/global */
 		Assert(rnode.dbNode == 0);
 		Assert(backend == InvalidBackendId);
-		pathlen = 7 + OIDCHARS + 1 + FORKNAMECHARS + 1;
-		path = (char *) palloc(pathlen);
 		if (forknum != MAIN_FORKNUM)
-			snprintf(path, pathlen, "global/%u_%s",
+			snprintf(path, MAXPGPATH, "global/%u_%s",
 					 rnode.relNode, forkNames[forknum]);
 		else
-			snprintf(path, pathlen, "global/%u", rnode.relNode);
+			snprintf(path, MAXPGPATH, "global/%u", rnode.relNode);
 	}
 	else if (rnode.spcNode == DEFAULTTABLESPACE_OID)
 	{
 		/* The default tablespace is {datadir}/base */
 		if (backend == InvalidBackendId)
 		{
-			pathlen = 5 + OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1;
-			path = (char *) palloc(pathlen);
 			if (forknum != MAIN_FORKNUM)
-				snprintf(path, pathlen, "base/%u/%u_%s",
+				snprintf(path, MAXPGPATH, "base/%u/%u_%s",
 						 rnode.dbNode, rnode.relNode,
 						 forkNames[forknum]);
 			else
-				snprintf(path, pathlen, "base/%u/%u",
+				snprintf(path, MAXPGPATH, "base/%u/%u",
 						 rnode.dbNode, rnode.relNode);
 		}
 		else
 		{
-			/* OIDCHARS will suffice for an integer, too */
-			pathlen = 5 + OIDCHARS + 2 + OIDCHARS + 1 + OIDCHARS + 1
-				+ FORKNAMECHARS + 1;
-			path = (char *) palloc(pathlen);
 			if (forknum != MAIN_FORKNUM)
-				snprintf(path, pathlen, "base/%u/t%d_%u_%s",
+				snprintf(path, MAXPGPATH, "base/%u/t%d_%u_%s",
 						 rnode.dbNode, backend, rnode.relNode,
 						 forkNames[forknum]);
 			else
-				snprintf(path, pathlen, "base/%u/t%d_%u",
+				snprintf(path, MAXPGPATH, "base/%u/t%d_%u",
 						 rnode.dbNode, backend, rnode.relNode);
 		}
 	}
@@ -168,38 +159,30 @@ relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
 		/* All other tablespaces are accessed via symlinks */
 		if (backend == InvalidBackendId)
 		{
-			pathlen = 9 + 1 + OIDCHARS + 1
-				+ strlen(TABLESPACE_VERSION_DIRECTORY) + 1 + OIDCHARS + 1
-				+ OIDCHARS + 1 + FORKNAMECHARS + 1;
-			path = (char *) palloc(pathlen);
 			if (forknum != MAIN_FORKNUM)
-				snprintf(path, pathlen, "pg_tblspc/%u/%s/%u/%u_%s",
+				snprintf(path, MAXPGPATH, "pg_tblspc/%u/%s/%u/%u_%s",
 						 rnode.spcNode, TABLESPACE_VERSION_DIRECTORY,
 						 rnode.dbNode, rnode.relNode,
 						 forkNames[forknum]);
 			else
-				snprintf(path, pathlen, "pg_tblspc/%u/%s/%u/%u",
+				snprintf(path, MAXPGPATH, "pg_tblspc/%u/%s/%u/%u",
 						 rnode.spcNode, TABLESPACE_VERSION_DIRECTORY,
 						 rnode.dbNode, rnode.relNode);
 		}
 		else
 		{
-			/* OIDCHARS will suffice for an integer, too */
-			pathlen = 9 + 1 + OIDCHARS + 1
-				+ strlen(TABLESPACE_VERSION_DIRECTORY) + 1 + OIDCHARS + 2
-				+ OIDCHARS + 1 + OIDCHARS + 1 + FORKNAMECHARS + 1;
-			path = (char *) palloc(pathlen);
 			if (forknum != MAIN_FORKNUM)
-				snprintf(path, pathlen, "pg_tblspc/%u/%s/%u/t%d_%u_%s",
+				snprintf(path, MAXPGPATH, "pg_tblspc/%u/%s/%u/t%d_%u_%s",
 						 rnode.spcNode, TABLESPACE_VERSION_DIRECTORY,
 						 rnode.dbNode, backend, rnode.relNode,
 						 forkNames[forknum]);
 			else
-				snprintf(path, pathlen, "pg_tblspc/%u/%s/%u/t%d_%u",
+				snprintf(path, MAXPGPATH, "pg_tblspc/%u/%s/%u/t%d_%u",
 						 rnode.spcNode, TABLESPACE_VERSION_DIRECTORY,
 						 rnode.dbNode, backend, rnode.relNode);
 		}
 	}
+
 	return path;
 }

@@ -534,7 +517,7 @@ Oid
GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
{
RelFileNodeBackend rnode;
- char *rpath;
+ const char *rpath;
int fd;
bool collides;
BackendId backend;
@@ -599,8 +582,6 @@ GetNewRelFileNode(Oid reltablespace, Relation pg_class, char relpersistence)
*/
collides = false;
}
-
- pfree(rpath);
} while (collides);

 	return rnode.node.relNode;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 03ed41d..6c2620d 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1757,7 +1757,7 @@ PrintBufferLeakWarning(Buffer buffer)
 {
 	volatile BufferDesc *buf;
 	int32		loccount;
-	char	   *path;
+	const char *path;
 	BackendId	backend;

Assert(BufferIsValid(buffer));
@@ -1782,7 +1782,6 @@ PrintBufferLeakWarning(Buffer buffer)
buffer, path,
buf->tag.blockNum, buf->flags,
buf->refcount, loccount);
- pfree(path);
}

/*
@@ -2901,7 +2900,7 @@ AbortBufferIO(void)
if (sv_flags & BM_IO_ERROR)
{
/* Buffer is pinned, so we can read tag without spinlock */
- char *path;
+ const char *path;

 				path = relpathperm(buf->tag.rnode, buf->tag.forkNum);
 				ereport(WARNING,
@@ -2909,7 +2908,6 @@ AbortBufferIO(void)
 						 errmsg("could not write block %u of %s",
 								buf->tag.blockNum, path),
 						 errdetail("Multiple failures --- write error might be permanent.")));
-				pfree(path);
 			}
 		}
 		TerminateBufferIO(buf, false, BM_IO_ERROR);
@@ -2927,11 +2925,10 @@ shared_buffer_write_error_callback(void *arg)
 	/* Buffer is pinned, so we can read the tag without locking the spinlock */
 	if (bufHdr != NULL)
 	{
-		char	   *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);
+		const char *path = relpathperm(bufHdr->tag.rnode, bufHdr->tag.forkNum);

errcontext("writing block %u of relation %s",
bufHdr->tag.blockNum, path);
- pfree(path);
}
}

@@ -2945,11 +2942,10 @@ local_buffer_write_error_callback(void *arg)

 	if (bufHdr != NULL)
 	{
-		char	   *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
+		const char *path = relpathbackend(bufHdr->tag.rnode, MyBackendId,
 										  bufHdr->tag.forkNum);

 		errcontext("writing block %u of relation %s",
 				   bufHdr->tag.blockNum, path);
-		pfree(path);
 	}
 }
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index c026731..64d2a1e 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -556,7 +556,7 @@ set_max_safe_fds(void)
  * this module wouldn't have any open files to close at that point anyway.
  */
 int
-BasicOpenFile(FileName fileName, int fileFlags, int fileMode)
+BasicOpenFile(const char *fileName, int fileFlags, int fileMode)
 {
 	int			fd;

@@ -878,7 +878,7 @@ FileInvalidate(File file)
  * (which should always be $PGDATA when this code is running).
  */
 File
-PathNameOpenFile(FileName fileName, int fileFlags, int fileMode)
+PathNameOpenFile(const char *fileName, int fileFlags, int fileMode)
 {
 	char	   *fnamecopy;
 	File		file;
@@ -1548,7 +1548,7 @@ TryAgain:
  * Like AllocateFile, but returns an unbuffered fd like open(2)
  */
 int
-OpenTransientFile(FileName fileName, int fileFlags, int fileMode)
+OpenTransientFile(const char *fileName, int fileFlags, int fileMode)
 {
 	int			fd;

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 20eb36a..4ef47b1 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -264,7 +264,7 @@ mdexists(SMgrRelation reln, ForkNumber forkNum)
 void
 mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
 {
-	char	   *path;
+	const char *path;
 	File		fd;

if (isRedo && reln->md_fd[forkNum] != NULL)
@@ -298,8 +298,6 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
}
}

- pfree(path);
-
reln->md_fd[forkNum] = _fdvec_alloc();

reln->md_fd[forkNum]->mdfd_vfd = fd;
@@ -380,7 +378,7 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
static void
mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
{
- char *path;
+ const char *path;
int ret;

path = relpath(rnode, forkNum);
@@ -449,8 +447,6 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
}
pfree(segpath);
}
-
- pfree(path);
}

/*
@@ -545,7 +541,7 @@ static MdfdVec *
mdopen(SMgrRelation reln, ForkNumber forknum, ExtensionBehavior behavior)
{
MdfdVec *mdfd;
- char *path;
+ const char *path;
File fd;

/* No work if already open */
@@ -571,7 +567,6 @@ mdopen(SMgrRelation reln, ForkNumber forknum, ExtensionBehavior behavior)
if (behavior == EXTENSION_RETURN_NULL &&
FILE_POSSIBLY_DELETED(errno))
{
- pfree(path);
return NULL;
}
ereport(ERROR,
@@ -580,8 +575,6 @@ mdopen(SMgrRelation reln, ForkNumber forknum, ExtensionBehavior behavior)
}
}

- pfree(path);
-
reln->md_fd[forknum] = mdfd = _fdvec_alloc();

mdfd->mdfd_vfd = fd;
@@ -1279,7 +1272,7 @@ mdpostckpt(void)
while (pendingUnlinks != NIL)
{
PendingUnlinkEntry *entry = (PendingUnlinkEntry *) linitial(pendingUnlinks);
- char *path;
+ const char *path;

/*
* New entries are appended to the end, so if the entry is new we've
@@ -1309,7 +1302,6 @@ mdpostckpt(void)
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m", path)));
}
- pfree(path);

 		/* And remove the list entry */
 		pendingUnlinks = list_delete_first(pendingUnlinks);
@@ -1634,8 +1626,8 @@ _fdvec_alloc(void)
 static char *
 _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
 {
-	char	   *path,
-			   *fullpath;
+	const char *path;
+	char	   *fullpath;

path = relpath(reln->smgr_rnode, forknum);

@@ -1644,10 +1636,9 @@ _mdfd_segpath(SMgrRelation reln, ForkNumber forknum, BlockNumber segno)
/* be sure we have enough space for the '.segno' */
fullpath = (char *) palloc(strlen(path) + 12);
sprintf(fullpath, "%s.%u", path, segno);
- pfree(path);
}
else
- fullpath = path;
+ fullpath = pstrdup(path);

 	return fullpath;
 }
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 89ad386..c285007 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -268,7 +268,7 @@ static int64
 calculate_relation_size(RelFileNode *rfn, BackendId backend, ForkNumber forknum)
 {
 	int64		totalsize = 0;
-	char	   *relationpath;
+	const char *relationpath;
 	char		pathname[MAXPGPATH];
 	unsigned int segcount = 0;

@@ -756,7 +756,7 @@ pg_relation_filepath(PG_FUNCTION_ARGS)
 	Form_pg_class relform;
 	RelFileNode rnode;
 	BackendId	backend;
-	char	   *path;
+	const char *path;

 	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
 	if (!HeapTupleIsValid(tuple))
diff --git a/src/include/catalog/catalog.h b/src/include/catalog/catalog.h
index 5d506fe..299cb03 100644
--- a/src/include/catalog/catalog.h
+++ b/src/include/catalog/catalog.h
@@ -31,7 +31,7 @@ extern const char *forkNames[];
 extern ForkNumber forkname_to_number(char *forkName);
 extern int	forkname_chars(const char *str, ForkNumber *);

-extern char *relpathbackend(RelFileNode rnode, BackendId backend,
+extern const char *relpathbackend(RelFileNode rnode, BackendId backend,
 			   ForkNumber forknum);
 extern char *GetDatabasePath(Oid dbNode, Oid spcNode);

diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index bd36c9d..b2565c4 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -45,9 +45,6 @@
 /*
  * FileSeek uses the standard UNIX lseek(2) flags.
  */
-
-typedef char *FileName;
-
 typedef int File;

@@ -65,7 +62,7 @@ extern int max_safe_fds;
*/

 /* Operations on virtual Files --- equivalent to Unix kernel file ops */
-extern File PathNameOpenFile(FileName fileName, int fileFlags, int fileMode);
+extern File PathNameOpenFile(const char *fileName, int fileFlags, int fileMode);
 extern File OpenTemporaryFile(bool interXact);
 extern void FileClose(File file);
 extern int	FilePrefetch(File file, off_t offset, int amount);
@@ -86,11 +83,11 @@ extern struct dirent *ReadDir(DIR *dir, const char *dirname);
 extern int	FreeDir(DIR *dir);

 /* Operations to allow use of a plain kernel FD, with automatic cleanup */
-extern int	OpenTransientFile(FileName fileName, int fileFlags, int fileMode);
+extern int	OpenTransientFile(const char *fileName, int fileFlags, int fileMode);
 extern int	CloseTransientFile(int fd);

 /* If you've really really gotta have a plain kernel FD, use this */
-extern int	BasicOpenFile(FileName fileName, int fileFlags, int fileMode);
+extern int	BasicOpenFile(const char *fileName, int fileFlags, int fileMode);

/* Miscellaneous support routines */
extern void InitFileAccess(void);
--
1.7.12.289.g0ce9864.dirty

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

[PATCH 3/5] Split out xlog reading into its own module called xlogreader

From: Andres Freund <andres@anarazel.de>

The way xlog reading was done up to now made it impossible to use that
nontrivial code outside of xlog.c although it is useful for different purposes
like debugging wal (xlogdump) and decoding wal back into logical changes.

Authors: Heikki Linnakangas, Andres Freund, Alvaro Herrera
Reviewed-By: Alvaro Herrera
---
src/backend/access/transam/Makefile | 2 +-
src/backend/access/transam/xlog.c | 830 +++++----------------------
src/backend/access/transam/xlogreader.c | 987 ++++++++++++++++++++++++++++++++
src/backend/nls.mk | 5 +-
src/include/access/xlogreader.h | 141 +++++
5 files changed, 1261 insertions(+), 704 deletions(-)
create mode 100644 src/backend/access/transam/xlogreader.c
create mode 100644 src/include/access/xlogreader.h

diff --git a/src/backend/access/transam/Makefile b/src/backend/access/transam/Makefile
index 700cfd8..eb6cfc5 100644
--- a/src/backend/access/transam/Makefile
+++ b/src/backend/access/transam/Makefile
@@ -14,7 +14,7 @@ include $(top_builddir)/src/Makefile.global

 OBJS = clog.o transam.o varsup.o xact.o rmgr.o slru.o subtrans.o multixact.o \
 	timeline.o twophase.o twophase_rmgr.o xlog.o xlogarchive.o xlogfuncs.o \
-	xlogutils.o
+	xlogreader.o xlogutils.o

include $(top_srcdir)/src/backend/common.mk

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 51a515a..310a654 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -30,6 +30,7 @@
 #include "access/twophase.h"
 #include "access/xact.h"
 #include "access/xlog_internal.h"
+#include "access/xlogreader.h"
 #include "access/xlogutils.h"
 #include "catalog/catversion.h"
 #include "catalog/pg_control.h"
@@ -548,7 +549,6 @@ static int	readFile = -1;
 static XLogSegNo readSegNo = 0;
 static uint32 readOff = 0;
 static uint32 readLen = 0;
-static bool	readFileHeaderValidated = false;
 static XLogSource readSource = 0;		/* XLOG_FROM_* code */

/*
@@ -561,6 +561,13 @@ static XLogSource readSource = 0; /* XLOG_FROM_* code */
static XLogSource currentSource = 0; /* XLOG_FROM_* code */
static bool lastSourceFailed = false;

+typedef struct XLogPageReadPrivate
+{
+	int			emode;
+	bool		fetching_ckpt;	/* are we fetching a checkpoint record? */
+	bool		randAccess;
+} XLogPageReadPrivate;
+
 /*
  * These variables track when we last obtained some WAL data to process,
  * and where we got it from.  (XLogReceiptSource is initially the same as
@@ -572,18 +579,9 @@ static bool	lastSourceFailed = false;
 static TimestampTz XLogReceiptTime = 0;
 static XLogSource XLogReceiptSource = 0;	/* XLOG_FROM_* code */

-/* Buffer for currently read page (XLOG_BLCKSZ bytes) */
-static char *readBuf = NULL;
-
-/* Buffer for current ReadRecord result (expandable) */
-static char *readRecordBuf = NULL;
-static uint32 readRecordBufSize = 0;
-
/* State information for XLOG reading */
static XLogRecPtr ReadRecPtr; /* start of last record read */
static XLogRecPtr EndRecPtr; /* end+1 of last record read */
-static TimeLineID lastPageTLI = 0;
-static TimeLineID lastSegmentTLI = 0;

 static XLogRecPtr minRecoveryPoint;		/* local copy of
 										 * ControlFile->minRecoveryPoint */
@@ -627,8 +625,8 @@ static bool InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 			 int source, bool notexistOk);
 static int XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source);
-static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
-			 bool randAccess);
+static int XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
+				 int reqLen, char *readBuf, TimeLineID *readTLI);
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 							bool fetching_ckpt);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
@@ -639,12 +637,11 @@ static void UpdateLastRemovedPtr(char *filename);
 static void ValidateXLOGDirectoryStructure(void);
 static void CleanupBackupHistory(void);
 static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force);
-static XLogRecord *ReadRecord(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt);
+static XLogRecord *ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
+		   int emode, bool fetching_ckpt);
 static void CheckRecoveryConsistency(void);
-static bool ValidXLogPageHeader(XLogPageHeader hdr, int emode, bool segmentonly);
-static bool ValidXLogRecordHeader(XLogRecPtr *RecPtr, XLogRecord *record,
-					  int emode, bool randAccess);
-static XLogRecord *ReadCheckpointRecord(XLogRecPtr RecPtr, int whichChkpt);
+static XLogRecord *ReadCheckpointRecord(XLogReaderState *xlogreader,
+					 XLogRecPtr RecPtr, int whichChkpt);
 static bool rescanLatestTimeLine(void);
 static void WriteControlFile(void);
 static void ReadControlFile(void);
@@ -2652,9 +2649,6 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 		if (source != XLOG_FROM_STREAM)
 			XLogReceiptTime = GetCurrentTimestamp();

-		/* The file header needs to be validated on first access */
-		readFileHeaderValidated = false;
-
 		return fd;
 	}
 	if (errno != ENOENT || !notfoundOk) /* unexpected failure? */
@@ -2709,7 +2703,8 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)

 		if (source == XLOG_FROM_ANY || source == XLOG_FROM_ARCHIVE)
 		{
-			fd = XLogFileRead(segno, emode, tli, XLOG_FROM_ARCHIVE, true);
+			fd = XLogFileRead(segno, emode, tli,
+							  XLOG_FROM_ARCHIVE, true);
 			if (fd != -1)
 			{
 				elog(DEBUG1, "got WAL segment from archive");
@@ -2721,7 +2716,8 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)

 		if (source == XLOG_FROM_ANY || source == XLOG_FROM_PG_XLOG)
 		{
-			fd = XLogFileRead(segno, emode, tli, XLOG_FROM_PG_XLOG, true);
+			fd = XLogFileRead(segno, emode, tli,
+							  XLOG_FROM_PG_XLOG, true);
 			if (fd != -1)
 			{
 				if (!expectedTLEs)
@@ -3178,102 +3174,6 @@ RestoreBackupBlock(XLogRecPtr lsn, XLogRecord *record, int block_index,
 }

 /*
- * CRC-check an XLOG record.  We do not believe the contents of an XLOG
- * record (other than to the minimal extent of computing the amount of
- * data to read in) until we've checked the CRCs.
- *
- * We assume all of the record (that is, xl_tot_len bytes) has been read
- * into memory at *record.  Also, ValidXLogRecordHeader() has accepted the
- * record's header, which means in particular that xl_tot_len is at least
- * SizeOfXlogRecord, so it is safe to fetch xl_len.
- */
-static bool
-RecordIsValid(XLogRecord *record, XLogRecPtr recptr, int emode)
-{
-	pg_crc32	crc;
-	int			i;
-	uint32		len = record->xl_len;
-	BkpBlock	bkpb;
-	char	   *blk;
-	size_t		remaining = record->xl_tot_len;
-
-	/* First the rmgr data */
-	if (remaining < SizeOfXLogRecord + len)
-	{
-		/* ValidXLogRecordHeader() should've caught this already... */
-		ereport(emode_for_corrupt_record(emode, recptr),
-				(errmsg("invalid record length at %X/%X",
-						(uint32) (recptr >> 32), (uint32) recptr)));
-		return false;
-	}
-	remaining -= SizeOfXLogRecord + len;
-	INIT_CRC32(crc);
-	COMP_CRC32(crc, XLogRecGetData(record), len);
-
-	/* Add in the backup blocks, if any */
-	blk = (char *) XLogRecGetData(record) + len;
-	for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
-	{
-		uint32		blen;
-
-		if (!(record->xl_info & XLR_BKP_BLOCK(i)))
-			continue;
-
-		if (remaining < sizeof(BkpBlock))
-		{
-			ereport(emode_for_corrupt_record(emode, recptr),
-					(errmsg("invalid backup block size in record at %X/%X",
-							(uint32) (recptr >> 32), (uint32) recptr)));
-			return false;
-		}
-		memcpy(&bkpb, blk, sizeof(BkpBlock));
-
-		if (bkpb.hole_offset + bkpb.hole_length > BLCKSZ)
-		{
-			ereport(emode_for_corrupt_record(emode, recptr),
-					(errmsg("incorrect hole size in record at %X/%X",
-							(uint32) (recptr >> 32), (uint32) recptr)));
-			return false;
-		}
-		blen = sizeof(BkpBlock) + BLCKSZ - bkpb.hole_length;
-
-		if (remaining < blen)
-		{
-			ereport(emode_for_corrupt_record(emode, recptr),
-					(errmsg("invalid backup block size in record at %X/%X",
-							(uint32) (recptr >> 32), (uint32) recptr)));
-			return false;
-		}
-		remaining -= blen;
-		COMP_CRC32(crc, blk, blen);
-		blk += blen;
-	}
-
-	/* Check that xl_tot_len agrees with our calculation */
-	if (remaining != 0)
-	{
-		ereport(emode_for_corrupt_record(emode, recptr),
-				(errmsg("incorrect total length in record at %X/%X",
-						(uint32) (recptr >> 32), (uint32) recptr)));
-		return false;
-	}
-
-	/* Finally include the record header */
-	COMP_CRC32(crc, (char *) record, offsetof(XLogRecord, xl_crc));
-	FIN_CRC32(crc);
-
-	if (!EQ_CRC32(record->xl_crc, crc))
-	{
-		ereport(emode_for_corrupt_record(emode, recptr),
-		(errmsg("incorrect resource manager data checksum in record at %X/%X",
-				(uint32) (recptr >> 32), (uint32) recptr)));
-		return false;
-	}
-
-	return true;
-}
-
-/*
  * Attempt to read an XLOG record.
  *
  * If RecPtr is not NULL, try to read a record at that position.  Otherwise
@@ -3286,511 +3186,65 @@ RecordIsValid(XLogRecord *record, XLogRecPtr recptr, int emode)
  * the returned record pointer always points there.
  */
 static XLogRecord *
-ReadRecord(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt)
+ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
+		   bool fetching_ckpt)
 {
 	XLogRecord *record;
-	XLogRecPtr	tmpRecPtr = EndRecPtr;
-	bool		randAccess = false;
-	uint32		len,
-				total_len;
-	uint32		targetRecOff;
-	uint32		pageHeaderSize;
-	bool		gotheader;
-
-	if (readBuf == NULL)
-	{
-		/*
-		 * First time through, permanently allocate readBuf.  We do it this
-		 * way, rather than just making a static array, for two reasons: (1)
-		 * no need to waste the storage in most instantiations of the backend;
-		 * (2) a static char array isn't guaranteed to have any particular
-		 * alignment, whereas malloc() will provide MAXALIGN'd storage.
-		 */
-		readBuf = (char *) malloc(XLOG_BLCKSZ);
-		Assert(readBuf != NULL);
-	}
-
-	if (RecPtr == NULL)
-	{
-		RecPtr = &tmpRecPtr;
-
-		/*
-		 * RecPtr is pointing to end+1 of the previous WAL record.  If
-		 * we're at a page boundary, no more records can fit on the current
-		 * page. We must skip over the page header, but we can't do that
-		 * until we've read in the page, since the header size is variable.
-		 */
-	}
-	else
-	{
-		/*
-		 * In this case, the passed-in record pointer should already be
-		 * pointing to a valid record starting position.
-		 */
-		if (!XRecOffIsValid(*RecPtr))
-			ereport(PANIC,
-					(errmsg("invalid record offset at %X/%X",
-							(uint32) (*RecPtr >> 32), (uint32) *RecPtr)));
+	XLogPageReadPrivate *private = (XLogPageReadPrivate *) xlogreader->private_data;

-		/*
-		 * Since we are going to a random position in WAL, forget any prior
-		 * state about what timeline we were in, and allow it to be any
-		 * timeline in expectedTLEs.  We also set a flag to allow curFileTLI
-		 * to go backwards (but we can't reset that variable right here, since
-		 * we might not change files at all).
-		 */
-		/* see comment in ValidXLogPageHeader */
-		lastPageTLI = lastSegmentTLI = 0;
-		randAccess = true;		/* allow curFileTLI to go backwards too */
-	}
+	/* Pass through parameters to XLogPageRead */
+	private->fetching_ckpt = fetching_ckpt;
+	private->emode = emode;
+	private->randAccess = (RecPtr != InvalidXLogRecPtr);

/* This is the first try to read this page. */
lastSourceFailed = false;
-retry:
- /* Read the page containing the record */
- if (!XLogPageRead(RecPtr, emode, fetching_ckpt, randAccess))
- return NULL;

-	pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
-	targetRecOff = (*RecPtr) % XLOG_BLCKSZ;
-	if (targetRecOff == 0)
-	{
-		/*
-		 * At page start, so skip over page header.  The Assert checks that
-		 * we're not scribbling on caller's record pointer; it's OK because we
-		 * can only get here in the continuing-from-prev-record case, since
-		 * XRecOffIsValid rejected the zero-page-offset case otherwise.
-		 */
-		Assert(RecPtr == &tmpRecPtr);
-		(*RecPtr) += pageHeaderSize;
-		targetRecOff = pageHeaderSize;
-	}
-	else if (targetRecOff < pageHeaderSize)
+	do
 	{
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("invalid record offset at %X/%X",
-						(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-		goto next_record_is_invalid;
-	}
-	if ((((XLogPageHeader) readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD) &&
-		targetRecOff == pageHeaderSize)
-	{
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("contrecord is requested by %X/%X",
-						(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-		goto next_record_is_invalid;
-	}
-
-	/*
-	 * Read the record length.
-	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of
-	 * the struct, so it must be on this page (the records are MAXALIGNed),
-	 * but we cannot access any other fields until we've verified that we
-	 * got the whole header.
-	 */
-	record = (XLogRecord *) (readBuf + (*RecPtr) % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
-	 * check is necessary here to ensure that we enter the "Need to reassemble
-	 * record" code path below; otherwise we might fail to apply
-	 * ValidXLogRecordHeader at all.
-	 */
-	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
-	{
-		if (!ValidXLogRecordHeader(RecPtr, record, emode, randAccess))
-			goto next_record_is_invalid;
-		gotheader = true;
-	}
-	else
-	{
-		if (total_len < SizeOfXLogRecord)
+		char   *errormsg;
+		record = XLogReadRecord(xlogreader, RecPtr, &errormsg);
+		ReadRecPtr = xlogreader->ReadRecPtr;
+		EndRecPtr = xlogreader->EndRecPtr;
+		if (record == NULL)
 		{
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errmsg("invalid record length at %X/%X",
-							(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-			goto next_record_is_invalid;
-		}
-		gotheader = false;
-	}
-
-	/*
-	 * Allocate or enlarge readRecordBuf as needed.  To avoid useless small
-	 * increases, round its size to a multiple of XLOG_BLCKSZ, and make sure
-	 * it's at least 4*Max(BLCKSZ, XLOG_BLCKSZ) to start with.  (That is
-	 * enough for all "normal" records, but very large commit or abort records
-	 * might need more space.)
-	 */
-	if (total_len > readRecordBufSize)
-	{
-		uint32		newSize = total_len;
-
-		newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
-		newSize = Max(newSize, 4 * Max(BLCKSZ, XLOG_BLCKSZ));
-		if (readRecordBuf)
-			free(readRecordBuf);
-		readRecordBuf = (char *) malloc(newSize);
-		if (!readRecordBuf)
-		{
-			readRecordBufSize = 0;
-			/* We treat this as a "bogus data" condition */
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errmsg("record length %u at %X/%X too long",
-							total_len, (uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-			goto next_record_is_invalid;
-		}
-		readRecordBufSize = newSize;
-	}
-
-	len = XLOG_BLCKSZ - (*RecPtr) % XLOG_BLCKSZ;
-	if (total_len > len)
-	{
-		/* Need to reassemble record */
-		char	   *contrecord;
-		XLogPageHeader pageHeader;
-		XLogRecPtr	pagelsn;
-		char	   *buffer;
-		uint32		gotlen;
-
-		/* Initialize pagelsn to the beginning of the page this record is on */
-		pagelsn = ((*RecPtr) / XLOG_BLCKSZ) * XLOG_BLCKSZ;
-
-		/* Copy the first fragment of the record from the first page. */
-		memcpy(readRecordBuf, readBuf + (*RecPtr) % XLOG_BLCKSZ, len);
-		buffer = readRecordBuf + len;
-		gotlen = len;
+			ereport(emode_for_corrupt_record(emode,
+											 RecPtr ? RecPtr : EndRecPtr),
+					(errmsg_internal("%s", errormsg) /* already translated */));

-		do
-		{
-			/* Calculate pointer to beginning of next page */
-			pagelsn += XLOG_BLCKSZ;
-			/* Wait for the next page to become available */
-			if (!XLogPageRead(&pagelsn, emode, false, false))
-				return NULL;
-
-			/* Check that the continuation on next page looks valid */
-			pageHeader = (XLogPageHeader) readBuf;
-			if (!(pageHeader->xlp_info & XLP_FIRST_IS_CONTRECORD))
-			{
-				ereport(emode_for_corrupt_record(emode, *RecPtr),
-						(errmsg("there is no contrecord flag in log segment %s, offset %u",
-								XLogFileNameP(curFileTLI, readSegNo),
-								readOff)));
-				goto next_record_is_invalid;
-			}
-			/*
-			 * Cross-check that xlp_rem_len agrees with how much of the record
-			 * we expect there to be left.
-			 */
-			if (pageHeader->xlp_rem_len == 0 ||
-				total_len != (pageHeader->xlp_rem_len + gotlen))
-			{
-				ereport(emode_for_corrupt_record(emode, *RecPtr),
-						(errmsg("invalid contrecord length %u in log segment %s, offset %u",
-								pageHeader->xlp_rem_len,
-								XLogFileNameP(curFileTLI, readSegNo),
-								readOff)));
-				goto next_record_is_invalid;
-			}
+			lastSourceFailed = true;

-			/* Append the continuation from this page to the buffer */
-			pageHeaderSize = XLogPageHeaderSize(pageHeader);
-			contrecord = (char *) readBuf + pageHeaderSize;
-			len = XLOG_BLCKSZ - pageHeaderSize;
-			if (pageHeader->xlp_rem_len < len)
-				len = pageHeader->xlp_rem_len;
-			memcpy(buffer, (char *) contrecord, len);
-			buffer += len;
-			gotlen += len;
-
-			/* If we just reassembled the record header, validate it. */
-			if (!gotheader)
+			if (readFile >= 0)
 			{
-				record = (XLogRecord *) readRecordBuf;
-				if (!ValidXLogRecordHeader(RecPtr, record, emode, randAccess))
-					goto next_record_is_invalid;
-				gotheader = true;
+				close(readFile);
+				readFile = -1;
 			}
-		} while (pageHeader->xlp_rem_len > len);
-
-		record = (XLogRecord *) readRecordBuf;
-		if (!RecordIsValid(record, *RecPtr, emode))
-			goto next_record_is_invalid;
-		pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
-		XLogSegNoOffsetToRecPtr(
-			readSegNo,
-			readOff + pageHeaderSize + MAXALIGN(pageHeader->xlp_rem_len),
-			EndRecPtr);
-		ReadRecPtr = *RecPtr;
-	}
-	else
-	{
-		/* Record does not cross a page boundary */
-		if (!RecordIsValid(record, *RecPtr, emode))
-			goto next_record_is_invalid;
-		EndRecPtr = *RecPtr + MAXALIGN(total_len);
-
-		ReadRecPtr = *RecPtr;
-		memcpy(readRecordBuf, record, total_len);
-	}
-
-	/*
-	 * Special processing if it's an XLOG SWITCH record
-	 */
-	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
-	{
-		/* Pretend it extends to end of segment */
-		EndRecPtr += XLogSegSize - 1;
-		EndRecPtr -= EndRecPtr % XLogSegSize;
-
-		/*
-		 * Pretend that readBuf contains the last page of the segment. This is
-		 * just to avoid Assert failure in StartupXLOG if XLOG ends with this
-		 * segment.
-		 */
-		readOff = XLogSegSize - XLOG_BLCKSZ;
-	}
-	return record;
-
-next_record_is_invalid:
-	lastSourceFailed = true;
-
-	if (readFile >= 0)
-	{
-		close(readFile);
-		readFile = -1;
-	}
-
-	/* In standby-mode, keep trying */
-	if (StandbyMode)
-		goto retry;
-	else
-		return NULL;
-}
-
-/*
- * Check whether the xlog header of a page just read in looks valid.
- *
- * This is just a convenience subroutine to avoid duplicated code in
- * ReadRecord.	It's not intended for use from anywhere else.
- */
-static bool
-ValidXLogPageHeader(XLogPageHeader hdr, int emode, bool segmentonly)
-{
-	XLogRecPtr	recaddr;
-
-	XLogSegNoOffsetToRecPtr(readSegNo, readOff, recaddr);
-
-	if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
-	{
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("invalid magic number %04X in log segment %s, offset %u",
-						hdr->xlp_magic,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}
-	if ((hdr->xlp_info & ~XLP_ALL_FLAGS) != 0)
-	{
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("invalid info bits %04X in log segment %s, offset %u",
-						hdr->xlp_info,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}
-	if (hdr->xlp_info & XLP_LONG_HEADER)
-	{
-		XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr;
-
-		if (longhdr->xlp_sysid != ControlFile->system_identifier)
-		{
-			char		fhdrident_str[32];
-			char		sysident_str[32];
-
-			/*
-			 * Format sysids separately to keep platform-dependent format code
-			 * out of the translatable message string.
-			 */
-			snprintf(fhdrident_str, sizeof(fhdrident_str), UINT64_FORMAT,
-					 longhdr->xlp_sysid);
-			snprintf(sysident_str, sizeof(sysident_str), UINT64_FORMAT,
-					 ControlFile->system_identifier);
-			ereport(emode_for_corrupt_record(emode, recaddr),
-					(errmsg("WAL file is from different database system"),
-					 errdetail("WAL file database system identifier is %s, pg_control database system identifier is %s.",
-							   fhdrident_str, sysident_str)));
-			return false;
-		}
-		if (longhdr->xlp_seg_size != XLogSegSize)
-		{
-			ereport(emode_for_corrupt_record(emode, recaddr),
-					(errmsg("WAL file is from different database system"),
-					 errdetail("Incorrect XLOG_SEG_SIZE in page header.")));
-			return false;
-		}
-		if (longhdr->xlp_xlog_blcksz != XLOG_BLCKSZ)
-		{
-			ereport(emode_for_corrupt_record(emode, recaddr),
-					(errmsg("WAL file is from different database system"),
-					 errdetail("Incorrect XLOG_BLCKSZ in page header.")));
-			return false;
+			break;
 		}
-	}
-	else if (readOff == 0)
-	{
-		/* hmm, first page of file doesn't have a long header? */
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("invalid info bits %04X in log segment %s, offset %u",
-						hdr->xlp_info,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}
-
-	if (hdr->xlp_pageaddr != recaddr)
-	{
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("unexpected pageaddr %X/%X in log segment %s, offset %u",
-						(uint32) (hdr->xlp_pageaddr >> 32), (uint32) hdr->xlp_pageaddr,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}

-	/*
-	 * Check page TLI is one of the expected values.
-	 */
-	if (!tliInHistory(hdr->xlp_tli, expectedTLEs))
-	{
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("unexpected timeline ID %u in log segment %s, offset %u",
-						hdr->xlp_tli,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}
-
-	/*
-	 * Since child timelines are always assigned a TLI greater than their
-	 * immediate parent's TLI, we should never see TLI go backwards across
-	 * successive pages of a consistent WAL sequence.
-	 *
-	 * Of course this check should only be applied when advancing sequentially
-	 * across pages; therefore ReadRecord resets lastPageTLI and
-	 * lastSegmentTLI to zero when going to a random page.
-	 *
-	 * Sometimes we re-open a segment that's already been partially replayed.
-	 * In that case we cannot perform the normal TLI check: if there is a
-	 * timeline switch within the segment, the first page has a smaller TLI
-	 * than later pages following the timeline switch, and we might've read
-	 * them already. As a weaker test, we still check that it's not smaller
-	 * than the TLI we last saw at the beginning of a segment. Pass
-	 * segmentonly = true when re-validating the first page like that, and the
-	 * page you're actually interested in comes later.
-	 */
-	if (hdr->xlp_tli < (segmentonly ? lastSegmentTLI : lastPageTLI))
-	{
-		ereport(emode_for_corrupt_record(emode, recaddr),
-				(errmsg("out-of-sequence timeline ID %u (after %u) in log segment %s, offset %u",
-						hdr->xlp_tli,
-						segmentonly ? lastSegmentTLI : lastPageTLI,
-						XLogFileNameP(curFileTLI, readSegNo),
-						readOff)));
-		return false;
-	}
-	lastPageTLI = hdr->xlp_tli;
-	if (readOff == 0)
-		lastSegmentTLI = hdr->xlp_tli;
-
-	return true;
-}
-
-/*
- * Validate an XLOG record header.
- *
- * This is just a convenience subroutine to avoid duplicated code in
- * ReadRecord.	It's not intended for use from anywhere else.
- */
-static bool
-ValidXLogRecordHeader(XLogRecPtr *RecPtr, XLogRecord *record, int emode,
-					  bool randAccess)
-{
-	/*
-	 * xl_len == 0 is bad data for everything except XLOG SWITCH, where it is
-	 * required.
-	 */
-	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
-	{
-		if (record->xl_len != 0)
-		{
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errmsg("invalid xlog switch record at %X/%X",
-							(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-			return false;
-		}
-	}
-	else if (record->xl_len == 0)
-	{
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("record with zero length at %X/%X",
-						(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-		return false;
-	}
-	if (record->xl_tot_len < SizeOfXLogRecord + record->xl_len ||
-		record->xl_tot_len > SizeOfXLogRecord + record->xl_len +
-		XLR_MAX_BKP_BLOCKS * (sizeof(BkpBlock) + BLCKSZ))
-	{
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("invalid record length at %X/%X",
-						(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-		return false;
-	}
-	if (record->xl_rmid > RM_MAX_ID)
-	{
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("invalid resource manager ID %u at %X/%X",
-						record->xl_rmid, (uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-		return false;
-	}
-	if (randAccess)
-	{
 		/*
-		 * We can't exactly verify the prev-link, but surely it should be less
-		 * than the record's own address.
+		 * Check page TLI is one of the expected values.
 		 */
-		if (!(record->xl_prev < *RecPtr))
+		if (!tliInHistory(xlogreader->latestPageTLI, expectedTLEs))
 		{
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errmsg("record with incorrect prev-link %X/%X at %X/%X",
-							(uint32) (record->xl_prev >> 32), (uint32) record->xl_prev,
-							(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
+			char		fname[MAXFNAMELEN];
+			XLogSegNo segno;
+			int32 offset;
+
+			XLByteToSeg(xlogreader->latestPagePtr, segno);
+			offset = xlogreader->latestPagePtr % XLogSegSize;
+			XLogFileName(fname, xlogreader->readPageTLI, segno);
+			ereport(emode_for_corrupt_record(emode,
+											 RecPtr ? RecPtr : EndRecPtr),
+					(errmsg("unexpected timeline ID %u in log segment %s, offset %u",
+							xlogreader->latestPageTLI,
+							fname,
+							offset)));
 			return false;
 		}
-	}
-	else
-	{
-		/*
-		 * Record's prev-link should exactly match our previous location. This
-		 * check guards against torn WAL pages where a stale but valid-looking
-		 * WAL record starts on a sector boundary.
-		 */
-		if (record->xl_prev != ReadRecPtr)
-		{
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errmsg("record with incorrect prev-link %X/%X at %X/%X",
-							(uint32) (record->xl_prev >> 32), (uint32) record->xl_prev,
-							(uint32) ((*RecPtr) >> 32), (uint32) *RecPtr)));
-			return false;
-		}
-	}
+	} while (StandbyMode && record == NULL);

-	return true;
+	return record;
 }

/*
@@ -5235,6 +4689,8 @@ StartupXLOG(void)
bool backupEndRequired = false;
bool backupFromStandby = false;
DBState dbstate_at_startup;
+ XLogReaderState *xlogreader;
+ XLogPageReadPrivate private;

/*
* Read control file and check XLOG status looks valid.
@@ -5351,6 +4807,16 @@ StartupXLOG(void)
if (StandbyMode)
OwnLatch(&XLogCtl->recoveryWakeupLatch);

+	/* Set up XLOG reader facility */
+	MemSet(&private, 0, sizeof(XLogPageReadPrivate));
+	xlogreader = XLogReaderAllocate(InvalidXLogRecPtr, &XLogPageRead, &private);
+	if (!xlogreader)
+		ereport(ERROR,
+				(errcode(ERRCODE_OUT_OF_MEMORY),
+				 errmsg("out of memory"),
+				 errdetail("Failed while allocating an XLog reading processor")));
+	xlogreader->system_identifier = ControlFile->system_identifier;
+
 	if (read_backup_label(&checkPointLoc, &backupEndRequired,
 						  &backupFromStandby))
 	{
@@ -5358,7 +4824,7 @@ StartupXLOG(void)
 		 * When a backup_label file is present, we want to roll forward from
 		 * the checkpoint it identifies, rather than using pg_control.
 		 */
-		record = ReadCheckpointRecord(checkPointLoc, 0);
+		record = ReadCheckpointRecord(xlogreader, checkPointLoc, 0);
 		if (record != NULL)
 		{
 			memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
@@ -5376,7 +4842,7 @@ StartupXLOG(void)
 			 */
 			if (checkPoint.redo < checkPointLoc)
 			{
-				if (!ReadRecord(&(checkPoint.redo), LOG, false))
+				if (!ReadRecord(xlogreader, checkPoint.redo, LOG, false))
 					ereport(FATAL,
 							(errmsg("could not find redo location referenced by checkpoint record"),
 							 errhint("If you are not restoring from a backup, try removing the file \"%s/backup_label\".", DataDir)));
@@ -5400,7 +4866,7 @@ StartupXLOG(void)
 		 */
 		checkPointLoc = ControlFile->checkPoint;
 		RedoStartLSN = ControlFile->checkPointCopy.redo;
-		record = ReadCheckpointRecord(checkPointLoc, 1);
+		record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1);
 		if (record != NULL)
 		{
 			ereport(DEBUG1,
@@ -5419,7 +4885,7 @@ StartupXLOG(void)
 		else
 		{
 			checkPointLoc = ControlFile->prevCheckPoint;
-			record = ReadCheckpointRecord(checkPointLoc, 2);
+			record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2);
 			if (record != NULL)
 			{
 				ereport(LOG,
@@ -5777,12 +5243,12 @@ StartupXLOG(void)
 		if (checkPoint.redo < RecPtr)
 		{
 			/* back up to find the record */
-			record = ReadRecord(&(checkPoint.redo), PANIC, false);
+			record = ReadRecord(xlogreader, checkPoint.redo, PANIC, false);
 		}
 		else
 		{
 			/* just have to read next record after CheckPoint */
-			record = ReadRecord(NULL, LOG, false);
+			record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 		}

if (record != NULL)
@@ -5963,7 +5429,7 @@ StartupXLOG(void)
break;

 				/* Else, try to fetch the next WAL record */
-				record = ReadRecord(NULL, LOG, false);
+				record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);

 			/*
@@ -6013,7 +5479,7 @@ StartupXLOG(void)
 	 * Re-fetch the last valid or last applied record, so we can identify the
 	 * exact endpoint of what we consider the valid portion of WAL.
 	 */
-	record = ReadRecord(&LastRec, PANIC, false);
+	record = ReadRecord(xlogreader, LastRec, PANIC, false);
 	EndOfLog = EndRecPtr;
 	XLByteToPrevSeg(EndOfLog, endLogSegNo);

@@ -6117,7 +5583,7 @@ StartupXLOG(void)
 	 * we will use that below.)
 	 */
 	if (InArchiveRecovery)
-		exitArchiveRecovery(curFileTLI, endLogSegNo);
+		exitArchiveRecovery(xlogreader->readPageTLI, endLogSegNo);

 	/*
 	 * Prepare to write WAL starting at EndOfLog position, and init xlog
@@ -6136,8 +5602,15 @@ StartupXLOG(void)
 	 * record spans, not the one it starts in.	The last block is indeed the
 	 * one we want to use.
 	 */
-	Assert(readOff == (XLogCtl->xlblocks[0] - XLOG_BLCKSZ) % XLogSegSize);
-	memcpy((char *) Insert->currpage, readBuf, XLOG_BLCKSZ);
+	if (EndOfLog % XLOG_BLCKSZ == 0)
+	{
+		memset(Insert->currpage, 0, XLOG_BLCKSZ);
+	}
+	else
+	{
+		Assert(readOff == (XLogCtl->xlblocks[0] - XLOG_BLCKSZ) % XLogSegSize);
+		memcpy((char *) Insert->currpage, xlogreader->readBuf, XLOG_BLCKSZ);
+	}
 	Insert->currpos = (char *) Insert->currpage +
 		(EndOfLog + XLOG_BLCKSZ - XLogCtl->xlblocks[0]);

@@ -6288,23 +5761,13 @@ StartupXLOG(void)
if (standbyState != STANDBY_DISABLED)
ShutdownRecoveryTransactionEnvironment();

-	/* Shut down readFile facility, free space */
+	/* Shut down xlogreader */
 	if (readFile >= 0)
 	{
 		close(readFile);
 		readFile = -1;
 	}
-	if (readBuf)
-	{
-		free(readBuf);
-		readBuf = NULL;
-	}
-	if (readRecordBuf)
-	{
-		free(readRecordBuf);
-		readRecordBuf = NULL;
-		readRecordBufSize = 0;
-	}
+	XLogReaderFree(xlogreader);

 	/*
 	 * If any of the critical GUCs have changed, log them before we allow
@@ -6554,7 +6017,7 @@ LocalSetXLogInsertAllowed(void)
  * 1 for "primary", 2 for "secondary", 0 for "other" (backup_label)
  */
 static XLogRecord *
-ReadCheckpointRecord(XLogRecPtr RecPtr, int whichChkpt)
+ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int whichChkpt)
 {
 	XLogRecord *record;

@@ -6578,7 +6041,7 @@ ReadCheckpointRecord(XLogRecPtr RecPtr, int whichChkpt)
return NULL;
}

-	record = ReadRecord(&RecPtr, LOG, true);
+	record = ReadRecord(xlogreader, RecPtr, LOG, true);

 	if (record == NULL)
 	{
@@ -9332,28 +8795,24 @@ CancelBackup(void)
  * XLogPageRead() to try fetching the record from another source, or to
  * sleep and retry.
  */
-static bool
-XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
-			 bool randAccess)
+static int
+XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
+			 char *readBuf, TimeLineID *readTLI)
 {
+	XLogPageReadPrivate *private =
+		(XLogPageReadPrivate *) xlogreader->private_data;
+	int			emode = private->emode;
 	uint32		targetPageOff;
-	uint32		targetRecOff;
-	XLogSegNo	targetSegNo;
-
-	XLByteToSeg(*RecPtr, targetSegNo);
-	targetPageOff = (((*RecPtr) % XLogSegSize) / XLOG_BLCKSZ) * XLOG_BLCKSZ;
-	targetRecOff = (*RecPtr) % XLOG_BLCKSZ;
+	XLogSegNo	targetSegNo PG_USED_FOR_ASSERTS_ONLY;

-	/* Fast exit if we have read the record in the current buffer already */
-	if (!lastSourceFailed && targetSegNo == readSegNo &&
-		targetPageOff == readOff && targetRecOff < readLen)
-		return true;
+	XLByteToSeg(targetPagePtr, targetSegNo);
+	targetPageOff = targetPagePtr % XLogSegSize;

 	/*
 	 * See if we need to switch to a new segment because the requested record
 	 * is not in the currently open one.
 	 */
-	if (readFile >= 0 && !XLByteInSeg(*RecPtr, readSegNo))
+	if (readFile >= 0 && !XLByteInSeg(targetPagePtr, readSegNo))
 	{
 		/*
 		 * Request a restartpoint if we've replayed too much xlog since the
@@ -9374,39 +8833,34 @@ XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 		readSource = 0;
 	}

-	XLByteToSeg(*RecPtr, readSegNo);
+	XLByteToSeg(targetPagePtr, readSegNo);

 retry:
 	/* See if we need to retrieve more data */
 	if (readFile < 0 ||
-		(readSource == XLOG_FROM_STREAM && receivedUpto <= *RecPtr))
+		(readSource == XLOG_FROM_STREAM &&
+		 receivedUpto <= targetPagePtr + reqLen))
 	{
 		if (StandbyMode)
 		{
-			if (!WaitForWALToBecomeAvailable(*RecPtr, randAccess,
-											 fetching_ckpt))
+			if (!WaitForWALToBecomeAvailable(targetPagePtr + reqLen,
+											 private->randAccess,
+											 private->fetching_ckpt))
 				goto triggered;
 		}
-		else
+		/* In archive or crash recovery. */
+		else if (readFile < 0)
 		{
-			/* In archive or crash recovery. */
-			if (readFile < 0)
-			{
-				int			source;
+			int source;

-				/* Reset curFileTLI if random fetch. */
-				if (randAccess)
-					curFileTLI = 0;
-
-				if (InArchiveRecovery)
-					source = XLOG_FROM_ANY;
-				else
-					source = XLOG_FROM_PG_XLOG;
+			if (InArchiveRecovery)
+				source = XLOG_FROM_ANY;
+			else
+				source = XLOG_FROM_PG_XLOG;

-				readFile = XLogFileReadAnyTLI(readSegNo, emode, source);
-				if (readFile < 0)
-					return false;
-			}
+			readFile = XLogFileReadAnyTLI(readSegNo, emode, source);
+			if (readFile < 0)
+				return -1;
 		}
 	}

@@ -9424,72 +8878,46 @@ retry:
 	 */
 	if (readSource == XLOG_FROM_STREAM)
 	{
-		if (((*RecPtr) / XLOG_BLCKSZ) != (receivedUpto / XLOG_BLCKSZ))
-		{
+		if (((targetPagePtr) / XLOG_BLCKSZ) != (receivedUpto / XLOG_BLCKSZ))
 			readLen = XLOG_BLCKSZ;
-		}
 		else
 			readLen = receivedUpto % XLogSegSize - targetPageOff;
 	}
 	else
 		readLen = XLOG_BLCKSZ;

-	if (!readFileHeaderValidated && targetPageOff != 0)
-	{
-		/*
-		 * Whenever switching to a new WAL segment, we read the first page of
-		 * the file and validate its header, even if that's not where the
-		 * target record is.  This is so that we can check the additional
-		 * identification info that is present in the first page's "long"
-		 * header.
-		 */
-		readOff = 0;
-		if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
-		{
-			char fname[MAXFNAMELEN];
-			XLogFileName(fname, curFileTLI, readSegNo);
-			ereport(emode_for_corrupt_record(emode, *RecPtr),
-					(errcode_for_file_access(),
-					 errmsg("could not read from log segment %s, offset %u: %m",
-							fname, readOff)));
-			goto next_record_is_invalid;
-		}
-		if (!ValidXLogPageHeader((XLogPageHeader) readBuf, emode, true))
-			goto next_record_is_invalid;
-	}
-
 	/* Read the requested page */
 	readOff = targetPageOff;
 	if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0)
 	{
 		char fname[MAXFNAMELEN];
+
 		XLogFileName(fname, curFileTLI, readSegNo);
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
+		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 		 errmsg("could not seek in log segment %s to offset %u: %m",
-				fname, readOff)));
+						fname, readOff)));
 		goto next_record_is_invalid;
 	}
+
 	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
 	{
 		char fname[MAXFNAMELEN];
+
 		XLogFileName(fname, curFileTLI, readSegNo);
-		ereport(emode_for_corrupt_record(emode, *RecPtr),
+		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 		 errmsg("could not read from log segment %s, offset %u: %m",
-				fname, readOff)));
+						fname, readOff)));
 		goto next_record_is_invalid;
 	}
-	if (!ValidXLogPageHeader((XLogPageHeader) readBuf, emode, false))
-		goto next_record_is_invalid;
-
-	readFileHeaderValidated = true;

 	Assert(targetSegNo == readSegNo);
 	Assert(targetPageOff == readOff);
-	Assert(targetRecOff < readLen);
+	Assert(reqLen <= readLen);

-	return true;
+	*readTLI = curFileTLI;
+	return readLen;

next_record_is_invalid:
lastSourceFailed = true;
@@ -9504,7 +8932,7 @@ next_record_is_invalid:
if (StandbyMode)
goto retry;
else
- return false;
+ return -1;

triggered:
if (readFile >= 0)
@@ -9513,7 +8941,7 @@ triggered:
readLen = 0;
readSource = 0;

-	return false;
+	return -1;
 }

 /*
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
new file mode 100644
index 0000000..6a420e6
--- /dev/null
+++ b/src/backend/access/transam/xlogreader.c
@@ -0,0 +1,987 @@
+/*-------------------------------------------------------------------------
+ *
+ * xlogreader.c
+ *		Generic xlog reading facility
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/backend/access/transam/xlogreader.c
+ *
+ * NOTES
+ *		Documentation about how do use this interface can be found in
+ *		xlogreader.h, more specifically in the definition of the
+ *		XLogReaderState struct where all parameters are documented.
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xlog.h"
+#include "access/xlog_internal.h"
+#include "access/xlogreader.h"
+#include "catalog/pg_control.h"
+
+static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
+
+static bool ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr,
+								XLogPageHeader hdr);
+static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
+		XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
+static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
+						    XLogRecPtr recptr);
+static int ReadPageInternal(struct XLogReaderState *state, XLogRecPtr pageptr,
+				 int reqLen);
+static void report_invalid_record(XLogReaderState *state, const char *fmt, ...)
+/* This extension allows gcc to check the format string for consistency with
+   the supplied arguments. */
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
+
+/* size of the buffer allocated for error message. */
+#define MAX_ERRORMSG_LEN 1000
+
+/*
+ * Construct a string in state->errormsg_buf explaining what's wrong with
+ * the current record being read.
+ */
+static void
+report_invalid_record(XLogReaderState *state, const char *fmt, ...)
+{
+	va_list	args;
+
+	fmt = _(fmt);
+
+	va_start(args, fmt);
+	vsnprintf(state->errormsg_buf, MAX_ERRORMSG_LEN, fmt, args);
+	va_end(args);
+}
+
+/*
+ * Allocate and initialize a new xlog reader
+ *
+ * Returns NULL if the xlogreader couldn't be allocated.
+ */
+XLogReaderState *
+XLogReaderAllocate(XLogRecPtr startpoint, XLogPageReadCB pagereadfunc,
+				   void *private_data)
+{
+	XLogReaderState *state;
+
+	state = (XLogReaderState *) malloc(sizeof(XLogReaderState));
+	if (!state)
+		return NULL;
+	MemSet(state, 0, sizeof(XLogReaderState));
+
+	/*
+	 * Permanently allocate readBuf.  We do it this way, rather than just
+	 * making a static array, for two reasons: (1) no need to waste the
+	 * storage in most instantiations of the backend; (2) a static char array
+	 * isn't guaranteed to have any particular alignment, whereas malloc()
+	 * will provide MAXALIGN'd storage.
+	 */
+	state->readBuf = (char *) malloc(XLOG_BLCKSZ);
+	if (!state->readBuf)
+	{
+		free(state);
+		return NULL;
+	}
+
+	state->read_page = pagereadfunc;
+	state->private_data = private_data;
+	state->EndRecPtr = startpoint;
+	state->readPageTLI = 0;
+	state->system_identifier = 0;
+	state->errormsg_buf = malloc(MAX_ERRORMSG_LEN + 1);
+	if (!state->errormsg_buf)
+	{
+		free(state->readBuf);
+		free(state);
+		return NULL;
+	}
+	state->errormsg_buf[0] = '\0';
+
+	/*
+	 * Allocate an initial readRecordBuf of minimal size, which can later be
+	 * enlarged if necessary.
+	 */
+	if (!allocate_recordbuf(state, 0))
+	{
+		free(state->errormsg_buf);
+		free(state->readBuf);
+		free(state);
+		return NULL;
+	}
+
+	return state;
+}
+
+void
+XLogReaderFree(XLogReaderState *state)
+{
+	free(state->errormsg_buf);
+	if (state->readRecordBuf)
+		free(state->readRecordBuf);
+	free(state->readBuf);
+	free(state);
+}
+
+/*
+ * Allocate readRecordBuf to fit a record of at least the given length.
+ * Returns true if successful, false if out of memory.
+ *
+ * readRecordBufSize is set to the new buffer size.
+ *
+ * To avoid useless small increases, round its size to a multiple of
+ * XLOG_BLCKSZ, and make sure it's at least 5*Max(BLCKSZ, XLOG_BLCKSZ) to start
+ * with.  (That is enough for all "normal" records, but very large commit or
+ * abort records might need more space.)
+ */
+static bool
+allocate_recordbuf(XLogReaderState *state, uint32 reclength)
+{
+	uint32		newSize = reclength;
+
+	newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
+	newSize = Max(newSize, 5 * Max(BLCKSZ, XLOG_BLCKSZ));
+
+	if (state->readRecordBuf)
+		free(state->readRecordBuf);
+	state->readRecordBuf = (char *) malloc(newSize);
+	if (!state->readRecordBuf)
+	{
+		state->readRecordBufSize = 0;
+		return false;
+	}
+
+	state->readRecordBufSize = newSize;
+	return true;
+}
+
+/*
+ * Attempt to read an XLOG record.
+ *
+ * If RecPtr is not NULL, try to read a record at that position.  Otherwise
+ * try to read a record just after the last one previously read.
+ *
+ * If no valid record is available, returns NULL. On NULL return, *errormsg
+ * is usually set to a string with details of the failure. One typical error
+ * where *errormsg is not set is when the read_page callback returns an error.
+ *
+ * The returned pointer (or *errormsg) points to an internal buffer that's
+ * valid until the next call to XLogReadRecord.
+ */
+XLogRecord *
+XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
+{
+	XLogRecord *record;
+	XLogRecPtr	tmpRecPtr = state->EndRecPtr;
+	XLogRecPtr  targetPagePtr;
+	bool		randAccess = false;
+	uint32		len,
+				total_len;
+	uint32		targetRecOff;
+	uint32		pageHeaderSize;
+	bool		gotheader;
+	int         readOff;
+
+	*errormsg = NULL;
+	state->errormsg_buf[0] = '\0';
+
+	if (RecPtr == InvalidXLogRecPtr)
+	{
+		RecPtr = tmpRecPtr;
+
+		if (state->ReadRecPtr == InvalidXLogRecPtr)
+			randAccess = true;
+
+		/*
+		 * RecPtr is pointing to end+1 of the previous WAL record.	If we're
+		 * at a page boundary, no more records can fit on the current page. We
+		 * must skip over the page header, but we can't do that until we've
+		 * read in the page, since the header size is variable.
+		 */
+	}
+	else
+	{
+		/*
+		 * In this case, the passed-in record pointer should already be
+		 * pointing to a valid record starting position.
+		 */
+		Assert(XRecOffIsValid(RecPtr));
+		randAccess = true;		/* allow readPageTLI to go backwards too */
+	}
+
+	targetPagePtr = RecPtr - (RecPtr % XLOG_BLCKSZ);
+
+	/* Read the page containing the record into state->readBuf */
+	readOff = ReadPageInternal(state, targetPagePtr, SizeOfXLogRecord);
+
+	if (readOff < 0)
+	{
+		if (state->errormsg_buf[0] != '\0')
+			*errormsg = state->errormsg_buf;
+		return NULL;
+	}
+
+	/* ReadPageInternal always returns at least the page header */
+	pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) state->readBuf);
+	targetRecOff = RecPtr % XLOG_BLCKSZ;
+	if (targetRecOff == 0)
+	{
+		/*
+		 * At page start, so skip over page header.
+		 */
+		RecPtr += pageHeaderSize;
+		targetRecOff = pageHeaderSize;
+	}
+	else if (targetRecOff < pageHeaderSize)
+	{
+		report_invalid_record(state, "invalid record offset at %X/%X",
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		*errormsg = state->errormsg_buf;
+		return NULL;
+	}
+
+	if ((((XLogPageHeader) state->readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD) &&
+		targetRecOff == pageHeaderSize)
+	{
+		report_invalid_record(state, "contrecord is requested by %X/%X",
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		*errormsg = state->errormsg_buf;
+		return NULL;
+	}
+
+	/* ReadPageInternal has verified the page header */
+	Assert(pageHeaderSize <= readOff);
+
+	/*
+	 * Ensure the whole record header or at least the part on this page is
+	 * read.
+	 */
+	readOff = ReadPageInternal(state,
+							   targetPagePtr,
+							   Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ));
+	if (readOff < 0)
+	{
+		if (state->errormsg_buf[0] != '\0')
+			*errormsg = state->errormsg_buf;
+		return NULL;
+	}
+
+	/*
+	 * Read the record length.
+	 *
+	 * NB: Even though we use an XLogRecord pointer here, the whole record
+	 * header might not fit on this page. xl_tot_len is the first field of the
+	 * struct, so it must be on this page (the records are MAXALIGNed), but we
+	 * cannot access any other fields until we've verified that we got the
+	 * whole header.
+	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+	total_len = record->xl_tot_len;
+
+	/*
+	 * If the whole record header is on this page, validate it immediately.
+	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
+	 * rest of the header after reading it from the next page.	The xl_tot_len
+	 * check is necessary here to ensure that we enter the "Need to reassemble
+	 * record" code path below; otherwise we might fail to apply
+	 * ValidXLogRecordHeader at all.
+	 */
+	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
+	{
+		if (!ValidXLogRecordHeader(state, RecPtr, state->ReadRecPtr, record,
+								   randAccess))
+		{
+			if (state->errormsg_buf[0] != '\0')
+				*errormsg = state->errormsg_buf;
+			return NULL;
+		}
+		gotheader = true;
+	}
+	else
+	{
+		/* XXX: more validation should be done here */
+		if (total_len < SizeOfXLogRecord)
+		{
+			report_invalid_record(state, "invalid record length at %X/%X",
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+			*errormsg = state->errormsg_buf;
+			return NULL;
+		}
+		gotheader = false;
+	}
+
+	/*
+	 * Enlarge readRecordBuf as needed.
+	 */
+	if (total_len > state->readRecordBufSize &&
+		!allocate_recordbuf(state, total_len))
+	{
+		/* We treat this as a "bogus data" condition */
+		report_invalid_record(state, "record length %u at %X/%X too long",
+							  total_len,
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		*errormsg = state->errormsg_buf;
+		return NULL;
+	}
+
+	len = XLOG_BLCKSZ - RecPtr % XLOG_BLCKSZ;
+	if (total_len > len)
+	{
+		/* Need to reassemble record */
+		char	   *contdata;
+		XLogPageHeader pageHeader;
+		char	   *buffer;
+		uint32		gotlen;
+
+		/* Copy the first fragment of the record from the first page. */
+		memcpy(state->readRecordBuf,
+			   state->readBuf + RecPtr % XLOG_BLCKSZ, len);
+		buffer = state->readRecordBuf + len;
+		gotlen = len;
+
+		do
+		{
+			/* Calculate pointer to beginning of next page */
+			targetPagePtr += XLOG_BLCKSZ;
+
+			/* Wait for the next page to become available */
+			readOff = ReadPageInternal(state, targetPagePtr,
+									   Min(len, XLOG_BLCKSZ));
+
+			if (readOff < 0)
+				goto err;
+
+			Assert(SizeOfXLogShortPHD <= readOff);
+
+			/* Check that the continuation on next page looks valid */
+			pageHeader = (XLogPageHeader) state->readBuf;
+			if (!(pageHeader->xlp_info & XLP_FIRST_IS_CONTRECORD))
+			{
+				report_invalid_record(state,
+									  "there is no contrecord flag at %X/%X",
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+				goto err;
+			}
+
+			/*
+			 * Cross-check that xlp_rem_len agrees with how much of the record
+			 * we expect there to be left.
+			 */
+			if (pageHeader->xlp_rem_len == 0 ||
+				total_len != (pageHeader->xlp_rem_len + gotlen))
+			{
+				report_invalid_record(state,
+									  "invalid contrecord length %u at %X/%X",
+									  pageHeader->xlp_rem_len,
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+				goto err;
+			}
+
+			/* Append the continuation from this page to the buffer */
+			pageHeaderSize = XLogPageHeaderSize(pageHeader);
+			Assert(pageHeaderSize <= readOff);
+
+			contdata = (char *) state->readBuf + pageHeaderSize;
+			len = XLOG_BLCKSZ - pageHeaderSize;
+			if (pageHeader->xlp_rem_len < len)
+				len = pageHeader->xlp_rem_len;
+
+			memcpy(buffer, (char *) contdata, len);
+			buffer += len;
+			gotlen += len;
+
+			/* If we just reassembled the record header, validate it. */
+			if (!gotheader)
+			{
+				record = (XLogRecord *) state->readRecordBuf;
+				if (!ValidXLogRecordHeader(state, RecPtr, state->ReadRecPtr,
+										   record, randAccess))
+					goto err;
+				gotheader = true;
+			}
+		} while (gotlen < total_len);
+
+		Assert(gotheader);
+
+		record = (XLogRecord *) state->readRecordBuf;
+		if (!ValidXLogRecord(state, record, RecPtr))
+			goto err;
+
+		pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) state->readBuf);
+		state->ReadRecPtr = RecPtr;
+		state->EndRecPtr = targetPagePtr + pageHeaderSize
+			+ MAXALIGN(pageHeader->xlp_rem_len);
+	}
+	else
+	{
+		/* Wait for the record data to become available */
+		readOff = ReadPageInternal(state, targetPagePtr,
+								   Min(targetRecOff + total_len, XLOG_BLCKSZ));
+		if (readOff < 0)
+			goto err;
+
+		/* Record does not cross a page boundary */
+		if (!ValidXLogRecord(state, record, RecPtr))
+			goto err;
+
+		state->EndRecPtr = RecPtr + MAXALIGN(total_len);
+
+		state->ReadRecPtr = RecPtr;
+		memcpy(state->readRecordBuf, record, total_len);
+	}
+
+	/*
+	 * Special processing if it's an XLOG SWITCH record
+	 */
+	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
+	{
+		/* Pretend it extends to end of segment */
+		state->EndRecPtr += XLogSegSize - 1;
+		state->EndRecPtr -= state->EndRecPtr % XLogSegSize;
+	}
+
+	return record;
+
+err:
+	/*
+	 * Invalidate the xlog page we've cached. We might read from a different
+	 * source after failure.
+	 */
+	state->readSegNo = 0;
+	state->readOff = 0;
+	state->readLen = 0;
+
+	if (state->errormsg_buf[0] != '\0')
+		*errormsg = state->errormsg_buf;
+
+	return NULL;
+}
+
+/*
+ * Read a single xlog page including at least [pagestart, RecPtr] of valid data
+ * via the read_page() callback.
+ *
+ * Returns -1 if the required page cannot be read for some reason.
+ *
+ * We fetch the page from a reader-local cache if we know we have the required
+ * data and if there hasn't been any error since caching the data.
+ */
+static int
+ReadPageInternal(struct XLogReaderState *state, XLogRecPtr pageptr,
+				 int reqLen)
+{
+	int			readLen;
+	uint32		targetPageOff;
+	XLogSegNo	targetSegNo;
+	XLogPageHeader hdr;
+
+	Assert((pageptr % XLOG_BLCKSZ) == 0);
+
+	XLByteToSeg(pageptr, targetSegNo);
+	targetPageOff = (pageptr % XLogSegSize);
+
+	/* check whether we have all the requested data already */
+	if (targetSegNo == state->readSegNo && targetPageOff == state->readOff &&
+		reqLen < state->readLen)
+		return state->readLen;
+
+	/*
+	 * Data is not cached.
+	 *
+	 * Everytime we actually read the page, even if we looked at parts of it
+	 * before, we need to do verification as the read_page callback might now
+	 * be rereading data from a different source.
+	 *
+	 * Whenever switching to a new WAL segment, we read the first page of the
+	 * file and validate its header, even if that's not where the target record
+	 * is.  This is so that we can check the additional identification info
+	 * that is present in the first page's "long" header.
+	 */
+	if (targetSegNo != state->readSegNo &&
+		targetPageOff != 0)
+	{
+		XLogPageHeader hdr;
+		XLogRecPtr targetSegmentPtr = pageptr - targetPageOff;
+
+		readLen = state->read_page(state, targetSegmentPtr, XLOG_BLCKSZ,
+								   state->readBuf, &state->readPageTLI);
+
+		if (readLen < 0)
+			goto err;
+
+		Assert(readLen <= XLOG_BLCKSZ);
+
+		/* we can be sure to have enough WAL available, we scrolled back */
+		Assert(readLen == XLOG_BLCKSZ);
+
+		hdr = (XLogPageHeader) state->readBuf;
+
+		if (!ValidXLogPageHeader(state, targetSegmentPtr, hdr))
+			goto err;
+	}
+
+	/* now read the target data */
+	readLen = state->read_page(state, pageptr, Max(reqLen, SizeOfXLogShortPHD),
+							   state->readBuf, &state->readPageTLI);
+	if (readLen < 0)
+		goto err;
+
+	Assert(readLen <= XLOG_BLCKSZ);
+
+	/* check we have enough data to check for the actual length of a the page header */
+	if (readLen <= SizeOfXLogShortPHD)
+		goto err;
+
+	Assert(readLen >= reqLen);
+
+	hdr = (XLogPageHeader) state->readBuf;
+
+	/* still not enough */
+	if (readLen < XLogPageHeaderSize(hdr))
+	{
+		readLen = state->read_page(state, pageptr, XLogPageHeaderSize(hdr),
+								   state->readBuf, &state->readPageTLI);
+		if (readLen < 0)
+			goto err;
+	}
+
+	if (!ValidXLogPageHeader(state, pageptr, hdr))
+		goto err;
+
+	/* update cache information */
+	state->readSegNo = targetSegNo;
+	state->readOff = targetPageOff;
+	state->readLen = readLen;
+
+	return readLen;
+err:
+	state->readSegNo = 0;
+	state->readOff = 0;
+	state->readLen = 0;
+	return -1;
+}
+
+/*
+ * Validate an XLOG record header.
+ *
+ * This is just a convenience subroutine to avoid duplicated code in
+ * XLogReadRecord.	It's not intended for use from anywhere else.
+ */
+static bool
+ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
+					  XLogRecPtr PrevRecPtr, XLogRecord *record,
+					  bool randAccess)
+{
+	/*
+	 * xl_len == 0 is bad data for everything except XLOG SWITCH, where it is
+	 * required.
+	 */
+	if (record->xl_rmid == RM_XLOG_ID && record->xl_info == XLOG_SWITCH)
+	{
+		if (record->xl_len != 0)
+		{
+			report_invalid_record(state,
+								  "invalid xlog switch record at %X/%X",
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+			return false;
+		}
+	}
+	else if (record->xl_len == 0)
+	{
+		report_invalid_record(state,
+							  "record with zero length at %X/%X",
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		return false;
+	}
+	if (record->xl_tot_len < SizeOfXLogRecord + record->xl_len ||
+		record->xl_tot_len > SizeOfXLogRecord + record->xl_len +
+		XLR_MAX_BKP_BLOCKS * (sizeof(BkpBlock) + BLCKSZ))
+	{
+		report_invalid_record(state,
+							  "invalid record length at %X/%X",
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		return false;
+	}
+	if (record->xl_rmid > RM_MAX_ID)
+	{
+		report_invalid_record(state,
+							  "invalid resource manager ID %u at %X/%X",
+							  record->xl_rmid, (uint32) (RecPtr >> 32),
+							  (uint32) RecPtr);
+		return false;
+	}
+	if (randAccess)
+	{
+		/*
+		 * We can't exactly verify the prev-link, but surely it should be less
+		 * than the record's own address.
+		 */
+		if (!(record->xl_prev < RecPtr))
+		{
+			report_invalid_record(state,
+								  "record with incorrect prev-link %X/%X at %X/%X",
+								  (uint32) (record->xl_prev >> 32),
+								  (uint32) record->xl_prev,
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+			return false;
+		}
+	}
+	else
+	{
+		/*
+		 * Record's prev-link should exactly match our previous location. This
+		 * check guards against torn WAL pages where a stale but valid-looking
+		 * WAL record starts on a sector boundary.
+		 */
+		if (record->xl_prev != PrevRecPtr)
+		{
+			report_invalid_record(state,
+								  "record with incorrect prev-link %X/%X at %X/%X",
+								  (uint32) (record->xl_prev >> 32),
+								  (uint32) record->xl_prev,
+								  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+
+/*
+ * CRC-check an XLOG record.  We do not believe the contents of an XLOG
+ * record (other than to the minimal extent of computing the amount of
+ * data to read in) until we've checked the CRCs.
+ *
+ * We assume all of the record (that is, xl_tot_len bytes) has been read
+ * into memory at *record.	Also, ValidXLogRecordHeader() has accepted the
+ * record's header, which means in particular that xl_tot_len is at least
+ * SizeOfXlogRecord, so it is safe to fetch xl_len.
+ */
+static bool
+ValidXLogRecord(XLogReaderState *state, XLogRecord *record, XLogRecPtr recptr)
+{
+	pg_crc32	crc;
+	int			i;
+	uint32		len = record->xl_len;
+	BkpBlock	bkpb;
+	char	   *blk;
+	size_t		remaining = record->xl_tot_len;
+
+	/* First the rmgr data */
+	if (remaining < SizeOfXLogRecord + len)
+	{
+		/* ValidXLogRecordHeader() should've caught this already... */
+		report_invalid_record(state, "invalid record length at %X/%X",
+							  (uint32) (recptr >> 32), (uint32) recptr);
+		return false;
+	}
+	remaining -= SizeOfXLogRecord + len;
+	INIT_CRC32(crc);
+	COMP_CRC32(crc, XLogRecGetData(record), len);
+
+	/* Add in the backup blocks, if any */
+	blk = (char *) XLogRecGetData(record) + len;
+	for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
+	{
+		uint32		blen;
+
+		if (!(record->xl_info & XLR_BKP_BLOCK(i)))
+			continue;
+
+		if (remaining < sizeof(BkpBlock))
+		{
+			report_invalid_record(state,
+							  "invalid backup block size in record at %X/%X",
+								  (uint32) (recptr >> 32), (uint32) recptr);
+			return false;
+		}
+		memcpy(&bkpb, blk, sizeof(BkpBlock));
+
+		if (bkpb.hole_offset + bkpb.hole_length > BLCKSZ)
+		{
+			report_invalid_record(state,
+								  "incorrect hole size in record at %X/%X",
+								  (uint32) (recptr >> 32), (uint32) recptr);
+			return false;
+		}
+		blen = sizeof(BkpBlock) + BLCKSZ - bkpb.hole_length;
+
+		if (remaining < blen)
+		{
+			report_invalid_record(state,
+							  "invalid backup block size in record at %X/%X",
+								  (uint32) (recptr >> 32), (uint32) recptr);
+			return false;
+		}
+		remaining -= blen;
+		COMP_CRC32(crc, blk, blen);
+		blk += blen;
+	}
+
+	/* Check that xl_tot_len agrees with our calculation */
+	if (remaining != 0)
+	{
+		report_invalid_record(state,
+							  "incorrect total length in record at %X/%X",
+							  (uint32) (recptr >> 32), (uint32) recptr);
+		return false;
+	}
+
+	/* Finally include the record header */
+	COMP_CRC32(crc, (char *) record, offsetof(XLogRecord, xl_crc));
+	FIN_CRC32(crc);
+
+	if (!EQ_CRC32(record->xl_crc, crc))
+	{
+		report_invalid_record(state,
+				 "incorrect resource manager data checksum in record at %X/%X",
+							  (uint32) (recptr >> 32), (uint32) recptr);
+		return false;
+	}
+
+	return true;
+}
+
+static bool
+ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr,
+					XLogPageHeader hdr)
+{
+	XLogRecPtr	recaddr;
+	XLogSegNo segno;
+	int32 offset;
+
+	Assert((recptr % XLOG_BLCKSZ) == 0);
+
+	XLByteToSeg(recptr, segno);
+	offset = recptr % XLogSegSize;
+
+	XLogSegNoOffsetToRecPtr(segno, offset, recaddr);
+
+	if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
+	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, state->readPageTLI, segno);
+
+		report_invalid_record(state,
+					  "invalid magic number %04X in log segment %s, offset %u",
+							  hdr->xlp_magic,
+							  fname,
+							  offset);
+		return false;
+	}
+
+	if ((hdr->xlp_info & ~XLP_ALL_FLAGS) != 0)
+	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, state->readPageTLI, segno);
+
+		report_invalid_record(state,
+						"invalid info bits %04X in log segment %s, offset %u",
+							  hdr->xlp_info,
+							  fname,
+							  offset);
+		return false;
+	}
+
+	if (hdr->xlp_info & XLP_LONG_HEADER)
+	{
+		XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr;
+
+		if (state->system_identifier &&
+		    longhdr->xlp_sysid != state->system_identifier)
+		{
+			char		fhdrident_str[32];
+			char		sysident_str[32];
+
+			/*
+			 * Format sysids separately to keep platform-dependent format code
+			 * out of the translatable message string.
+			 */
+			snprintf(fhdrident_str, sizeof(fhdrident_str), UINT64_FORMAT,
+					 longhdr->xlp_sysid);
+			snprintf(sysident_str, sizeof(sysident_str), UINT64_FORMAT,
+					 state->system_identifier);
+			report_invalid_record(state,
+					  "WAL file is from different database system: WAL file database system identifier is %s, pg_control database system identifier is %s.",
+								  fhdrident_str, sysident_str);
+			return false;
+		}
+		else if (longhdr->xlp_seg_size != XLogSegSize)
+		{
+			report_invalid_record(state,
+					  "WAL file is from different database system: Incorrect XLOG_SEG_SIZE in page header.");
+			return false;
+		}
+		else if (longhdr->xlp_xlog_blcksz != XLOG_BLCKSZ)
+		{
+			report_invalid_record(state,
+					 "WAL file is from different database system: Incorrect XLOG_BLCKSZ in page header.");
+			return false;
+		}
+	}
+	else if (offset == 0)
+	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, state->readPageTLI, segno);
+
+		/* hmm, first page of file doesn't have a long header? */
+		report_invalid_record(state,
+					  "invalid info bits %04X in log segment %s, offset %u",
+							  hdr->xlp_info,
+							  fname,
+							  offset);
+		return false;
+	}
+
+	if (hdr->xlp_pageaddr != recaddr)
+	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, state->readPageTLI, segno);
+
+		report_invalid_record(state,
+			  "unexpected pageaddr %X/%X in log segment %s, offset %u",
+			  (uint32) (hdr->xlp_pageaddr >> 32), (uint32) hdr->xlp_pageaddr,
+							  fname,
+							  offset);
+		return false;
+	}
+
+	/*
+	 * Since child timelines are always assigned a TLI greater than their
+	 * immediate parent's TLI, we should never see TLI go backwards across
+	 * successive pages of a consistent WAL sequence.
+	 *
+	 * Of course this check should only be applied when advancing sequentially
+	 * across pages; therefore ReadRecord resets lastPageTLI and lastSegmentTLI
+	 * to zero when going to a random page. FIXME
+	 *
+	 * Sometimes we re-read a segment that's already been (partially) read. So
+	 * we only verify TLIs for pages that are later than the last remembered
+	 * LSN.
+	 *
+	 * XXX: This is slightly less precise than the check we did in earlier
+	 * times. I don't see a problem with that though.
+	 */
+	if (state->latestPagePtr < recptr)
+	{
+		if (hdr->xlp_tli < state->latestPageTLI)
+		{
+			char		fname[MAXFNAMELEN];
+
+			XLogFileName(fname, state->readPageTLI, segno);
+
+			report_invalid_record(state,
+								  "out-of-sequence timeline ID %u (after %u) in log segment %s, offset %u",
+								  hdr->xlp_tli,
+								  state->latestPageTLI,
+								  fname,
+								  offset);
+			return false;
+		}
+	}
+	state->latestPagePtr = recptr;
+	state->latestPageTLI = hdr->xlp_tli;
+	return true;
+}
+
+/*
+ * Functions that are currently only needed in the backend, but are better
+ * implemented inside xlogreader because the internal functions available
+ * there.
+ */
+#ifdef FRONTEND
+
+/*
+ * Find the first record with at an lsn >= RecPtr.
+ *
+ * Useful for checking wether RecPtr is a valid xlog address for reading and to
+ * find the first valid address after some address when dumping records for
+ * debugging purposes.
+ */
+XLogRecPtr
+XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr)
+{
+   XLogReaderState saved_state = *state;
+   XLogRecPtr  targetPagePtr;
+   XLogRecPtr  tmpRecPtr;
+   int targetRecOff;
+   XLogRecPtr found = InvalidXLogRecPtr;
+   uint32      pageHeaderSize;
+   XLogPageHeader header;
+   XLogRecord *record;
+   uint32 readLen;
+   char       *errormsg;
+
+   if (RecPtr == InvalidXLogRecPtr)
+       RecPtr = state->EndRecPtr;
+
+   targetRecOff = RecPtr % XLOG_BLCKSZ;
+
+   /* scroll back to page boundary */
+   targetPagePtr = RecPtr - targetRecOff;
+
+   /* Read the page containing the record */
+   readLen = ReadPageInternal(state, targetPagePtr, targetRecOff);
+   if (readLen < 0)
+       goto err;
+
+   header = (XLogPageHeader) state->readBuf;
+
+   pageHeaderSize = XLogPageHeaderSize(header);
+
+   /* make sure we have enough data for the page header */
+   readLen = ReadPageInternal(state, targetPagePtr, pageHeaderSize);
+   if (readLen < 0)
+       goto err;
+
+   /* skip over potential continuation data */
+   if (header->xlp_info & XLP_FIRST_IS_CONTRECORD)
+   {
+       /* record headers are MAXALIGN'ed */
+       tmpRecPtr = targetPagePtr + pageHeaderSize
+           + MAXALIGN(header->xlp_rem_len);
+   }
+   else
+   {
+       tmpRecPtr = targetPagePtr + pageHeaderSize;
+   }
+
+   /*
+    * we know now that tmpRecPtr is an address pointing to a valid XLogRecord
+    * because either were at the first record after the beginning of a page or
+    * we just jumped over the remaining data of a continuation.
+    */
+   while ((record = XLogReadRecord(state, tmpRecPtr, &errormsg)))
+   {
+       /* continue after the record */
+       tmpRecPtr = InvalidXLogRecPtr;
+
+       /* past the record we've found, break out */
+       if (RecPtr <= state->ReadRecPtr)
+       {
+           found = state->ReadRecPtr;
+           goto out;
+       }
+   }
+
+err:
+out:
+   /* Reset state to what we had before finding the record */
+   state->readSegNo = 0;
+   state->readOff = 0;
+   state->readLen = 0;
+   state->ReadRecPtr = saved_state.ReadRecPtr;
+   state->EndRecPtr = saved_state.EndRecPtr;
+   return found;
+}
+
+#endif /* FRONTEND */
diff --git a/src/backend/nls.mk b/src/backend/nls.mk
index 30f6a2b..c072de7 100644
--- a/src/backend/nls.mk
+++ b/src/backend/nls.mk
@@ -4,12 +4,13 @@ AVAIL_LANGUAGES  = de es fr ja pt_BR tr zh_CN zh_TW
 GETTEXT_FILES    = + gettext-files
 GETTEXT_TRIGGERS = $(BACKEND_COMMON_GETTEXT_TRIGGERS) \
     GUC_check_errmsg GUC_check_errdetail GUC_check_errhint \
-    write_stderr yyerror parser_yyerror
+    write_stderr yyerror parser_yyerror report_invalid_record
 GETTEXT_FLAGS    = $(BACKEND_COMMON_GETTEXT_FLAGS) \
     GUC_check_errmsg:1:c-format \
     GUC_check_errdetail:1:c-format \
     GUC_check_errhint:1:c-format \
-    write_stderr:1:c-format
+    write_stderr:1:c-format \
+    report_invalid_record:2:c-format

 gettext-files: distprep
 	find $(srcdir)/ $(srcdir)/../port/ -name '*.c' -print | LC_ALL=C sort >$@
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
new file mode 100644
index 0000000..acc8309
--- /dev/null
+++ b/src/include/access/xlogreader.h
@@ -0,0 +1,141 @@
+/*-------------------------------------------------------------------------
+ *
+ * readxlog.h
+ *
+ *		Generic xlog reading facility.
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/include/access/xlogreader.h
+ *
+ * NOTES
+ *		Check the definition of the XLogReaderState struct for instructions on
+ *		how to use the XLogReader infrastructure.
+ *
+ *		The basic idea is to allocate an XLogReaderState via
+ *		XLogReaderAllocate, and call XLogReadRecord() until it returns NULL.
+ *-------------------------------------------------------------------------
+ */
+#ifndef XLOGREADER_H
+#define XLOGREADER_H
+
+#include "access/xlog_internal.h"
+
+struct XLogReaderState;
+
+/*
+ * The callbacks are explained in more detail inside the XLogReaderState
+ * struct.
+ */
+
+typedef int (*XLogPageReadCB) (struct XLogReaderState *state,
+							   XLogRecPtr pageptr,
+							   int reqLen,
+							   char *readBuf,
+							   TimeLineID *pageTLI);
+
+typedef struct XLogReaderState
+{
+	/* ----------------------------------------
+	 * Public parameters
+	 * ----------------------------------------
+	 */
+
+	/*
+	 * Data input callback (mandatory).
+	 *
+	 * This callback shall read the the xlog page (of size XLOG_BLKSZ) in which
+	 * RecPtr resides. All data <= RecPtr must be visible. The callback shall
+	 * return the range of actually valid bytes returned or -1 upon
+	 * failure.
+	 *
+	 * *pageTLI should be set to the TLI of the file the page was read from
+	 * to be in. It is currently used only for error reporting purposes, to
+	 * reconstruct the name of the WAL file where an error occurred.
+	 */
+	XLogPageReadCB read_page;
+
+	/*
+	 * System identifier of the xlog files were about to read.
+	 *
+	 * Set to zero (the default value) if unknown or unimportant.
+	 */
+	uint64		system_identifier;
+
+	/*
+	 * Opaque data for callbacks to use.  Not used by XLogReader.
+	 */
+	void	   *private_data;
+
+	/*
+	 * From where to where are we reading
+	 */
+	XLogRecPtr	ReadRecPtr;		/* start of last record read */
+	XLogRecPtr	EndRecPtr;		/* end+1 of last record read */
+
+	/*
+	 * TLI of the current xlog page
+	 */
+	TimeLineID	ReadTimeLineID;
+
+	/* ----------------------------------------
+	 * private/internal state
+	 * ----------------------------------------
+	 */
+
+	/* Buffer for currently read page (XLOG_BLCKSZ bytes) */
+	char	   *readBuf;
+
+	/* last read segment, segment offset, read length, TLI */
+	XLogSegNo   readSegNo;
+	uint32      readOff;
+	uint32      readLen;
+	TimeLineID  readPageTLI;
+
+	/* beginning of last page read, and its TLI  */
+	XLogRecPtr	latestPagePtr;
+	TimeLineID	latestPageTLI;
+
+	/* Buffer for current ReadRecord result (expandable) */
+	char	   *readRecordBuf;
+	uint32		readRecordBufSize;
+
+	/* Buffer to hold error message */
+	char	   *errormsg_buf;
+} XLogReaderState;
+
+/*
+ * Get a new XLogReader
+ *
+ * At least the read_page callback, startptr and endptr have to be set before
+ * the reader can be used.
+ */
+extern XLogReaderState *XLogReaderAllocate(XLogRecPtr startpoint,
+				   XLogPageReadCB pagereadfunc, void *private_data);
+
+/*
+ * Free an XLogReader
+ */
+extern void XLogReaderFree(XLogReaderState *state);
+
+/*
+ * Read the next record from xlog. Returns NULL on end-of-WAL or on failure.
+ */
+extern struct XLogRecord *XLogReadRecord(XLogReaderState *state, XLogRecPtr ptr,
+			   char **errormsg);
+
+/*
+ * Functions that are currently only needed in the backend, but are better
+ * implemented inside xlogreader because the internal functions available
+ * there.
+ */
+#ifdef FRONTEND
+/*
+ * Find the address of the first record with a lsn >= RecPtr.
+ */
+extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr);
+
+#endif /* FRONTEND */
+
+#endif   /* XLOGREADER_H */
-- 
1.7.12.289.g0ce9864.dirty

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

[PATCH 4/5] Add pg_xlogdump contrib module

From: Andres Freund <andres@anarazel.de>

Authors: Andres Freund, Heikki Linnakangas
---
contrib/Makefile | 1 +
contrib/pg_xlogdump/Makefile | 37 +++
contrib/pg_xlogdump/compat.c | 58 ++++
contrib/pg_xlogdump/pg_xlogdump.c | 654 ++++++++++++++++++++++++++++++++++++++
contrib/pg_xlogdump/tables.c | 78 +++++
doc/src/sgml/ref/allfiles.sgml | 1 +
doc/src/sgml/ref/pg_xlogdump.sgml | 76 +++++
doc/src/sgml/reference.sgml | 1 +
src/backend/access/transam/rmgr.c | 1 +
src/backend/catalog/catalog.c | 2 +
src/tools/msvc/Mkvcbuild.pm | 16 +-
11 files changed, 924 insertions(+), 1 deletion(-)
create mode 100644 contrib/pg_xlogdump/Makefile
create mode 100644 contrib/pg_xlogdump/compat.c
create mode 100644 contrib/pg_xlogdump/pg_xlogdump.c
create mode 100644 contrib/pg_xlogdump/tables.c
create mode 100644 doc/src/sgml/ref/pg_xlogdump.sgml

diff --git a/contrib/Makefile b/contrib/Makefile
index fcd7c1e..5d290b8 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -39,6 +39,7 @@ SUBDIRS = \
 		pg_trgm		\
 		pg_upgrade	\
 		pg_upgrade_support \
+		pg_xlogdump	\
 		pgbench		\
 		pgcrypto	\
 		pgrowlocks	\
diff --git a/contrib/pg_xlogdump/Makefile b/contrib/pg_xlogdump/Makefile
new file mode 100644
index 0000000..1adef35
--- /dev/null
+++ b/contrib/pg_xlogdump/Makefile
@@ -0,0 +1,37 @@
+# contrib/pg_xlogdump/Makefile
+
+PGFILEDESC = "pg_xlogdump"
+PGAPPICON=win32
+
+PROGRAM = pg_xlogdump
+OBJS =  pg_xlogdump.o compat.o tables.o xlogreader.o $(RMGRDESCOBJS) \
+	$(WIN32RES)
+
+# XXX: Perhaps this should be done by a wildcard rule so that you don't need
+# to remember to add new rmgrdesc files to this list.
+RMGRDESCSOURCES = clogdesc.c dbasedesc.c gindesc.c gistdesc.c hashdesc.c \
+	heapdesc.c mxactdesc.c nbtdesc.c relmapdesc.c seqdesc.c smgrdesc.c \
+	spgdesc.c standbydesc.c tblspcdesc.c xactdesc.c xlogdesc.c
+
+RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
+
+EXTRA_CLEAN = $(RMGRDESCSOURCES) xlogreader.c
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_xlogdump
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+
+xlogreader.c: % : $(top_srcdir)/src/backend/access/transam/%
+	rm -f $@ && $(LN_S) $< .
+
+$(RMGRDESCSOURCES): % : $(top_srcdir)/src/backend/access/rmgrdesc/%
+	rm -f $@ && $(LN_S) $< .
diff --git a/contrib/pg_xlogdump/compat.c b/contrib/pg_xlogdump/compat.c
new file mode 100644
index 0000000..e150afb
--- /dev/null
+++ b/contrib/pg_xlogdump/compat.c
@@ -0,0 +1,58 @@
+/*-------------------------------------------------------------------------
+ *
+ * compat.c
+ *		Reimplementations of various backend functions.
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/pg_xlogdump/compat.c
+ *
+ * This file contains client-side implementations for various backend
+ * functions that the rm_desc functions in *desc.c files rely on.
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include "catalog/catalog.h"
+#include "datatype/timestamp.h"
+#include "lib/stringinfo.h"
+#include "storage/relfilenode.h"
+#include "utils/timestamp.h"
+#include "utils/datetime.h"
+
+const char *
+timestamptz_to_str(TimestampTz dt)
+{
+	return "unimplemented-timestamp";
+}
+
+const char *
+relpathbackend(RelFileNode rnode, BackendId backend, ForkNumber forknum)
+{
+	return "unimplemented-relpathbackend";
+}
+
+/*
+ * Provide a hacked up compat layer for StringInfos so xlog desc functions can
+ * be linked/called.
+ */
+void
+appendStringInfo(StringInfo str, const char *fmt, ...)
+{
+	va_list		args;
+
+	va_start(args, fmt);
+	vprintf(fmt, args);
+	va_end(args);
+}
+
+void
+appendStringInfoString(StringInfo str, const char *string)
+{
+	appendStringInfo(str, "%s", string);
+}
diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
new file mode 100644
index 0000000..29ee73e
--- /dev/null
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -0,0 +1,654 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_xlogdump.c - decode and display WAL
+ *
+ * Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		  contrib/pg_xlogdump/pg_xlogdump.c
+ *-------------------------------------------------------------------------
+ */
+
+/* ugly hack, same as in e.g pg_controldata */
+#define FRONTEND 1
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/xlog.h"
+#include "access/xlogreader.h"
+#include "access/rmgr.h"
+#include "access/transam.h"
+
+#include "catalog/catalog.h"
+
+#include "getopt_long.h"
+
+static const char *progname;
+
+typedef struct XLogDumpPrivateData
+{
+	TimeLineID	timeline;
+	char	   *inpath;
+	XLogRecPtr	startptr;
+	XLogRecPtr	endptr;
+
+	/* display options */
+	bool		bkp_details;
+	int			stop_after_records;
+	int			already_displayed_records;
+
+	/* filter options */
+	int         filter_by_rmgr;
+	TransactionId filter_by_xid;
+} XLogDumpPrivateData;
+
+static void fatal_error(const char *fmt, ...)
+__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
+
+static void fatal_error(const char *fmt, ...)
+{
+	va_list		args;
+	fflush(stdout);
+
+	fprintf(stderr, "%s: fatal_error: ", progname);
+	va_start(args, fmt);
+	vfprintf(stderr, fmt, args);
+	va_end(args);
+	fputc('\n', stderr);
+	exit(EXIT_FAILURE);
+}
+
+/*
+ * Check whether directory exists and whether we can open it. Keep errno set
+ * error reporting by the caller.
+ */
+static bool
+verify_directory(const char *directory)
+{
+	int fd = open(directory, O_DIRECTORY|O_RDONLY);
+	if (fd < 0)
+		return false;
+	close(fd);
+	return true;
+}
+
+static void
+split_path(const char *path, char **dir, char **fname)
+{
+	char *sep;
+
+	/* split filepath into directory & filename */
+	sep = strrchr(path, '/');
+
+	/* directory path */
+	if (sep != NULL)
+	{
+		/* windows doesn't have strndup */
+		*dir = strdup(path);
+		(*dir)[(sep - path) + 1] = '\0';
+		*fname = strdup(sep + 1);
+		}
+	/* local directory */
+	else
+	{
+		*dir = NULL;
+		*fname = strdup(path);
+	}
+}
+
+/*
+ * Try to find the file in several places:
+ * if directory == NULL:
+ *   fname
+ *   XLOGDIR / fname
+ *   $PGDATA / XLOGDIR / fname
+ * else
+ *   directory / fname
+ *   directory / XLOGDIR / fname
+ *
+ * return a read only fd
+ */
+static int
+fuzzy_open_file(const char *directory, const char *fname)
+{
+	int fd = -1;
+	char fpath[MAXPGPATH];
+
+	if (directory == NULL)
+	{
+		const char* datadir;
+
+		/* fname */
+		fd = open(fname, O_RDONLY | PG_BINARY, 0);
+		if (fd < 0 && errno != ENOENT)
+			return -1;
+		else if (fd > 0)
+			return fd;
+
+		/* XLOGDIR / fname */
+		snprintf(fpath, MAXPGPATH, "%s/%s",
+				 XLOGDIR, fname);
+		fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+		if (fd < 0 && errno != ENOENT)
+			return -1;
+		else if (fd > 0)
+			return fd;
+
+		datadir = getenv("PGDATA");
+		/* $PGDATA / XLOGDIR / fname */
+		if (datadir != NULL)
+		{
+			snprintf(fpath, MAXPGPATH, "%s/%s/%s",
+					 datadir, XLOGDIR, fname);
+			fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+			if (fd < 0 && errno != ENOENT)
+				return -1;
+			else if (fd > 0)
+				return fd;
+		}
+	}
+	else
+	{
+		/* directory / fname */
+		snprintf(fpath, MAXPGPATH, "%s/%s",
+				 directory, fname);
+		fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+		if (fd < 0 && errno != ENOENT)
+			return -1;
+		else if (fd > 0)
+			return fd;
+
+		/* directory / XLOGDIR / fname */
+		snprintf(fpath, MAXPGPATH, "%s/%s/%s",
+				 directory, XLOGDIR, fname);
+		fd = open(fpath, O_RDONLY | PG_BINARY, 0);
+		if (fd < 0 && errno != ENOENT)
+			return -1;
+		else if (fd > 0)
+			return fd;
+	}
+	return -1;
+}
+
+/* this should probably be put in a general implementation */
+static void
+XLogDumpXLogRead(const char *directory, TimeLineID timeline_id,
+				 XLogRecPtr startptr, char *buf, Size count)
+{
+	char	   *p;
+	XLogRecPtr	recptr;
+	Size		nbytes;
+
+	static int	sendFile = -1;
+	static XLogSegNo sendSegNo = 0;
+	static uint32 sendOff = 0;
+
+	p = buf;
+	recptr = startptr;
+	nbytes = count;
+
+	while (nbytes > 0)
+	{
+		uint32		startoff;
+		int			segbytes;
+		int			readbytes;
+
+		startoff = recptr % XLogSegSize;
+
+		if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo))
+		{
+			char		fname[MAXFNAMELEN];
+
+			/* Switch to another logfile segment */
+			if (sendFile >= 0)
+				close(sendFile);
+
+			XLByteToSeg(recptr, sendSegNo);
+
+			XLogFileName(fname, timeline_id, sendSegNo);
+
+			sendFile = fuzzy_open_file(directory, fname);
+
+			if (sendFile < 0)
+				fatal_error("could not find file \"%s\": %s",
+							fname, strerror(errno));
+			sendOff = 0;
+		}
+
+		/* Need to seek in the file? */
+		if (sendOff != startoff)
+		{
+			if (lseek(sendFile, (off_t) startoff, SEEK_SET) < 0)
+			{
+				int		err = errno;
+				char	fname[MAXPGPATH];
+				XLogFileName(fname, timeline_id, sendSegNo);
+
+				fatal_error("could not seek in log segment %s to offset %u: %s",
+							fname, startoff, strerror(err));
+			}
+			sendOff = startoff;
+		}
+
+		/* How many bytes are within this segment? */
+		if (nbytes > (XLogSegSize - startoff))
+			segbytes = XLogSegSize - startoff;
+		else
+			segbytes = nbytes;
+
+		readbytes = read(sendFile, p, segbytes);
+		if (readbytes <= 0)
+		{
+			int		err = errno;
+			char	fname[MAXPGPATH];
+			XLogFileName(fname, timeline_id, sendSegNo);
+
+			fatal_error("could not read from log segment %s, offset %d, length %d: %s",
+						fname, sendOff, segbytes, strerror(err));
+		}
+
+		/* Update state for read */
+		recptr += readbytes;
+
+		sendOff += readbytes;
+		nbytes -= readbytes;
+		p += readbytes;
+	}
+}
+
+static int
+XLogDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+				 char *readBuff, TimeLineID *curFileTLI)
+{
+	XLogDumpPrivateData *private = state->private_data;
+	int			count = XLOG_BLCKSZ;
+
+	if (private->endptr != InvalidXLogRecPtr)
+	{
+		if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+			count = XLOG_BLCKSZ;
+		else if (targetPagePtr + reqLen <= private->endptr)
+			count = private->endptr - targetPagePtr;
+		else
+			return -1;
+	}
+
+	XLogDumpXLogRead(private->inpath, private->timeline, targetPagePtr,
+					 readBuff, count);
+
+	return count;
+}
+
+static void
+XLogDumpDisplayRecord(XLogReaderState *state, XLogRecord *record)
+{
+	XLogDumpPrivateData *config = (XLogDumpPrivateData *) state->private_data;
+	const RmgrData *rmgr = &RmgrTable[record->xl_rmid];
+
+	if (config->filter_by_rmgr != -1 &&
+	    config->filter_by_rmgr != record->xl_rmid)
+		return;
+
+	if (TransactionIdIsValid(config->filter_by_xid) &&
+	    config->filter_by_xid != record->xl_xid)
+		return;
+
+	config->already_displayed_records++;
+
+	printf("xlog record: rmgr: %-11s, record_len: %6u, tot_len: %6u, tx: %10u, lsn: %X/%08X, prev %X/%08X, bkp: %u%u%u%u, desc:",
+			rmgr->rm_name,
+			record->xl_len, record->xl_tot_len,
+			record->xl_xid,
+			(uint32) (state->ReadRecPtr >> 32), (uint32) state->ReadRecPtr,
+			(uint32) (record->xl_prev >> 32), (uint32) record->xl_prev,
+			!!(XLR_BKP_BLOCK(0) & record->xl_info),
+			!!(XLR_BKP_BLOCK(1) & record->xl_info),
+			!!(XLR_BKP_BLOCK(2) & record->xl_info),
+			!!(XLR_BKP_BLOCK(3) & record->xl_info));
+
+	/* the desc routine will printf the description directly to stdout */
+	rmgr->rm_desc(NULL, record->xl_info, XLogRecGetData(record));
+
+	putchar('\n');
+
+	if (config->bkp_details)
+	{
+		int		off;
+		char   *blk = (char *) XLogRecGetData(record) + record->xl_len;
+
+		for (off = 0; off < XLR_MAX_BKP_BLOCKS; off++)
+		{
+			BkpBlock	bkpb;
+
+			if (!(XLR_BKP_BLOCK(off) & record->xl_info))
+				continue;
+
+			memcpy(&bkpb, blk, sizeof(BkpBlock));
+			blk += sizeof(BkpBlock);
+			blk += BLCKSZ - bkpb.hole_length;
+
+			printf("\tbackup bkp #%u; rel %u/%u/%u; fork: %s; block: %u; hole: offset: %u, length: %u\n",
+				   off, bkpb.node.spcNode, bkpb.node.dbNode, bkpb.node.relNode,
+				   forkNames[bkpb.fork], bkpb.block, bkpb.hole_offset, bkpb.hole_length);
+		}
+	}
+}
+
+static void
+usage(void)
+{
+	printf("%s: reads/writes postgres transaction logs for debugging.\n\n",
+		   progname);
+	printf("Usage:\n");
+	printf("  %s [OPTION] [STARTSEG [ENDSEG]] \n", progname);
+	printf("\nOptions:\n");
+	printf("  -b, --bkp-details      output detailed information about backup blocks\n");
+	printf("  -e, --end RECPTR       read wal up to RECPTR\n");
+	printf("  -h, --help             show this help, then exit\n");
+	printf("  -n, --limit RECORDS    only display n records, abort afterwards\n");
+	printf("  -p, --path PATH        from where do we want to read? cwd/pg_xlog is the default\n");
+	printf("  -r, --rmgr RMGR        only show records generated by the rmgr RMGR\n");
+	printf("  -s, --start RECPTR     read wal in directory indicated by -p starting at RECPTR\n");
+	printf("  -t, --timeline TLI     which timeline do we want to read, defaults to 1\n");
+	printf("  -V, --version          output version information, then exit\n");
+	printf("  -x, --xid XID          only show records with transactionid XID\n");
+}
+
+int
+main(int argc, char **argv)
+{
+	uint32		xlogid;
+	uint32		xrecoff;
+	XLogReaderState *xlogreader_state;
+	XLogDumpPrivateData private;
+	XLogRecord *record;
+	XLogRecPtr	first_record;
+	char	   *errormsg;
+
+	static struct option long_options[] = {
+		{"bkp-details", no_argument, NULL, 'b'},
+		{"end", required_argument, NULL, 'e'},
+		{"help", no_argument, NULL, '?'},
+		{"limit", required_argument, NULL, 'n'},
+		{"path", required_argument, NULL, 'p'},
+		{"rmgr", required_argument, NULL, 'r'},
+		{"start", required_argument, NULL, 's'},
+		{"timeline", required_argument, NULL, 't'},
+		{"xid", required_argument, NULL, 'x'},
+		{"version", no_argument, NULL, 'V'},
+		{NULL, 0, NULL, 0}
+	};
+
+	int			option;
+	int			optindex = 0;
+
+	progname = get_progname(argv[0]);
+
+	memset(&private, 0, sizeof(XLogDumpPrivateData));
+
+	private.timeline = 1;
+	private.bkp_details = false;
+	private.startptr = InvalidXLogRecPtr;
+	private.endptr = InvalidXLogRecPtr;
+	private.stop_after_records = -1;
+	private.already_displayed_records = 0;
+	private.filter_by_rmgr = -1;
+	private.filter_by_xid = InvalidTransactionId;
+
+	if (argc <= 1)
+	{
+		fprintf(stderr, "%s: no arguments specified\n", progname);
+		goto bad_argument;
+	}
+
+	while ((option = getopt_long(argc, argv, "be:?n:p:r:s:t:Vx:",
+								 long_options, &optindex)) != -1)
+	{
+		switch (option)
+		{
+			case 'b':
+				private.bkp_details = true;
+				break;
+			case 'e':
+				if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2)
+				{
+					fprintf(stderr, "%s: could not parse parse --end %s\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+				private.endptr = (uint64)xlogid << 32 | xrecoff;
+				break;
+			case '?':
+				usage();
+				exit(EXIT_SUCCESS);
+				break;
+			case 'n':
+				if (sscanf(optarg, "%d", &private.stop_after_records) != 1)
+				{
+					fprintf(stderr, "%s: could not parse parse --limit %s\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+				break;
+			case 'p':
+				private.inpath = strdup(optarg);
+				break;
+			case 'r':
+			{
+				int i;
+				for (i = 0; i < RM_MAX_ID; i++)
+				{
+					if (strcmp(optarg, RmgrTable[i].rm_name) == 0)
+					{
+						private.filter_by_rmgr = i;
+						break;
+					}
+				}
+
+				if (private.filter_by_rmgr == -1)
+				{
+					fprintf(stderr, "%s: --rmgr %s does not exist\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+			}
+			break;
+			case 's':
+				if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2)
+				{
+					fprintf(stderr, "%s: could not parse parse --end %s\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+				else
+					private.startptr = (uint64)xlogid << 32 | xrecoff;
+				break;
+			case 't':
+				if (sscanf(optarg, "%d", &private.timeline) != 1)
+				{
+					fprintf(stderr, "%s: could not parse timeline --timeline %s\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+				break;
+			case 'V':
+				puts("pg_xlogdump (PostgreSQL) " PG_VERSION);
+				exit(EXIT_SUCCESS);
+				break;
+			case 'x':
+				if (sscanf(optarg, "%u", &private.filter_by_xid) != 1)
+				{
+					fprintf(stderr, "%s: could not parse --xid %s as a valid xid\n",
+							progname, optarg);
+					goto bad_argument;
+				}
+				break;
+			default:
+				goto bad_argument;
+		}
+	}
+
+	if ((optind + 2) < argc)
+	{
+		fprintf(stderr,
+				"%s: too many command-line arguments (first is \"%s\")\n",
+				progname, argv[optind + 2]);
+		goto bad_argument;
+	}
+
+	if (private.inpath != NULL)
+	{
+		/* validdate path points to directory */
+		if (!verify_directory(private.inpath))
+		{
+			fprintf(stderr,
+					"%s: --path %s is cannot be opened: %s",
+					progname, private.inpath, strerror(errno));
+			goto bad_argument;
+		}
+	}
+
+	/* parse files as start/end boundaries, extract path if not specified */
+	if (optind < argc)
+	{
+		char *directory = NULL;
+		char *fname = NULL;
+		int fd;
+		XLogSegNo segno;
+
+		split_path(argv[optind], &directory, &fname);
+
+		if (private.inpath == NULL && directory != NULL)
+		{
+			private.inpath = directory;
+
+			if (!verify_directory(private.inpath))
+				fatal_error("cannot open directory %s: %s",
+							private.inpath, strerror(errno));
+		}
+
+		fd = fuzzy_open_file(private.inpath, fname);
+		if (fd < 0)
+			fatal_error("could not open file %s", fname);
+		close(fd);
+
+		/* parse position from file */
+		XLogFromFileName(fname, &private.timeline, &segno);
+
+		if (XLogRecPtrIsInvalid(private.startptr))
+			XLogSegNoOffsetToRecPtr(segno, 0, private.startptr);
+		else if (!XLByteInSeg(private.startptr, segno))
+		{
+			fprintf(stderr,
+					"%s: --end %X/%X is not inside file \"%s\"\n",
+					progname,
+					(uint32)(private.startptr >> 32),
+					(uint32)private.startptr,
+					fname);
+			goto bad_argument;
+		}
+
+		/* no second file specified, set end position */
+		if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+			XLogSegNoOffsetToRecPtr(segno + 1, 0, private.endptr);
+
+		/* parse ENDSEG if passed */
+		if (optind + 1 < argc)
+		{
+			XLogSegNo endsegno;
+
+			/* ignore directory, already have that */
+			split_path(argv[optind + 1], &directory, &fname);
+
+			fd = fuzzy_open_file(private.inpath, fname);
+			if (fd < 0)
+				fatal_error("could not open file %s", fname);
+			close(fd);
+
+			/* parse position from file */
+			XLogFromFileName(fname, &private.timeline, &endsegno);
+
+			if (endsegno < segno)
+				fatal_error("ENDSEG %s is before STARSEG %s",
+							argv[optind + 1], argv[optind]);
+
+			if (XLogRecPtrIsInvalid(private.endptr))
+				XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.endptr);
+
+			/* set segno to endsegno for check of --end */
+			segno = endsegno;
+		}
+
+
+		if (!XLByteInSeg(private.endptr, segno) &&
+			private.endptr != (segno + 1) * XLogSegSize)
+		{
+			fprintf(stderr,
+					"%s: --end %X/%X is not inside file \"%s\"\n",
+					progname,
+					(uint32)(private.endptr >> 32),
+					(uint32)private.endptr,
+					argv[argc -1]);
+			goto bad_argument;
+		}
+	}
+
+	/* we don't know what to print */
+	if (XLogRecPtrIsInvalid(private.startptr))
+	{
+		fprintf(stderr, "%s: no --start given in range mode.\n", progname);
+		goto bad_argument;
+	}
+
+	/* done with argument parsing, do the actual work */
+
+	/* we have everything we need, start reading */
+	xlogreader_state = XLogReaderAllocate(private.startptr,
+										  XLogDumpReadPage,
+										  &private);
+
+	/* first find a valid recptr to start from */
+	first_record = XLogFindNextRecord(xlogreader_state, private.startptr);
+
+	if (first_record == InvalidXLogRecPtr)
+		fatal_error("could not find a valid record after %X/%X",
+					(uint32) (private.startptr >> 32),
+					(uint32) private.startptr);
+
+	/*
+	 * Display a message that were skipping data if `from` wasn't a pointer
+	 * to the start of a record and also wasn't a pointer to the beginning
+	 * of a segment (e.g. we were used in file mode).
+	 */
+	if (first_record != private.startptr && (private.startptr % XLogSegSize) != 0)
+		printf("first record is after %X/%X, at %X/%X, skipping over %u bytes\n",
+			   (uint32) (private.startptr >> 32), (uint32) private.startptr,
+			   (uint32) (first_record >> 32), (uint32) first_record,
+			   (uint32) (first_record - private.startptr));
+
+	while ((record = XLogReadRecord(xlogreader_state, first_record, &errormsg)))
+	{
+		/* continue after the last record */
+		first_record = InvalidXLogRecPtr;
+		XLogDumpDisplayRecord(xlogreader_state, record);
+
+		/* check whether we printed enough */
+		if (private.stop_after_records > 0 &&
+			private.already_displayed_records >= private.stop_after_records)
+			break;
+	}
+
+	if (errormsg)
+		fatal_error("error in WAL record at %X/%X: %s\n",
+					(uint32)(xlogreader_state->ReadRecPtr >> 32),
+					(uint32)xlogreader_state->ReadRecPtr,
+					errormsg);
+
+	XLogReaderFree(xlogreader_state);
+
+	return EXIT_SUCCESS;
+bad_argument:
+	fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
+	return EXIT_FAILURE;
+}
diff --git a/contrib/pg_xlogdump/tables.c b/contrib/pg_xlogdump/tables.c
new file mode 100644
index 0000000..e947e0d
--- /dev/null
+++ b/contrib/pg_xlogdump/tables.c
@@ -0,0 +1,78 @@
+/*-------------------------------------------------------------------------
+ *
+ * tables.c
+ *		Support data for xlogdump.c
+ *
+ * Portions Copyright (c) 2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/pg_xlogdump/tables.c
+ *
+ * NOTES
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * rmgr.c
+ *
+ * Resource managers definition
+ *
+ * src/backend/access/transam/rmgr.c
+ */
+#include "postgres.h"
+
+#include "access/clog.h"
+#include "access/gin.h"
+#include "access/gist_private.h"
+#include "access/hash.h"
+#include "access/heapam_xlog.h"
+#include "access/multixact.h"
+#include "access/nbtree.h"
+#include "access/spgist.h"
+#include "access/xact.h"
+#include "access/xlog_internal.h"
+#include "catalog/storage_xlog.h"
+#include "commands/dbcommands.h"
+#include "commands/sequence.h"
+#include "commands/tablespace.h"
+#include "storage/standby.h"
+#include "utils/relmapper.h"
+#include "catalog/catalog.h"
+
+/*
+ * Table of fork names.
+ *
+ * needs to be synced with src/backend/catalog/catalog.c
+ */
+const char *forkNames[] = {
+	"main",						/* MAIN_FORKNUM */
+	"fsm",						/* FSM_FORKNUM */
+	"vm",						/* VISIBILITYMAP_FORKNUM */
+	"init"						/* INIT_FORKNUM */
+};
+
+/*
+ * RmgrTable linked only to functions available outside of the backend.
+ *
+ * needs to be synced with src/backend/access/transam/rmgr.c
+ */
+const RmgrData RmgrTable[RM_MAX_ID + 1] = {
+	{"XLOG", NULL, xlog_desc, NULL, NULL, NULL},
+	{"Transaction", NULL, xact_desc, NULL, NULL, NULL},
+	{"Storage", NULL, smgr_desc, NULL, NULL, NULL},
+	{"CLOG", NULL, clog_desc, NULL, NULL, NULL},
+	{"Database", NULL, dbase_desc, NULL, NULL, NULL},
+	{"Tablespace", NULL, tblspc_desc, NULL, NULL, NULL},
+	{"MultiXact", NULL, multixact_desc, NULL, NULL, NULL},
+	{"RelMap", NULL, relmap_desc, NULL, NULL, NULL},
+	{"Standby", NULL, standby_desc, NULL, NULL, NULL},
+	{"Heap2", NULL, heap2_desc, NULL, NULL, NULL},
+	{"Heap", NULL, heap_desc, NULL, NULL, NULL},
+	{"Btree", NULL, btree_desc, NULL, NULL, NULL},
+	{"Hash", NULL, hash_desc, NULL, NULL, NULL},
+	{"Gin", NULL, gin_desc, NULL, NULL, NULL},
+	{"Gist", NULL, gist_desc, NULL, NULL, NULL},
+	{"Sequence", NULL, seq_desc, NULL, NULL, NULL},
+	{"SPGist", NULL, spg_desc, NULL, NULL, NULL}
+};
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index df84054..49cb7ac 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -178,6 +178,7 @@ Complete list of usable sgml source files in this directory.
 <!ENTITY pgReceivexlog      SYSTEM "pg_receivexlog.sgml">
 <!ENTITY pgResetxlog        SYSTEM "pg_resetxlog.sgml">
 <!ENTITY pgRestore          SYSTEM "pg_restore.sgml">
+<!ENTITY pgXlogdump         SYSTEM "pg_xlogdump.sgml">
 <!ENTITY postgres           SYSTEM "postgres-ref.sgml">
 <!ENTITY postmaster         SYSTEM "postmaster.sgml">
 <!ENTITY psqlRef            SYSTEM "psql-ref.sgml">
diff --git a/doc/src/sgml/ref/pg_xlogdump.sgml b/doc/src/sgml/ref/pg_xlogdump.sgml
new file mode 100644
index 0000000..7a27c7b
--- /dev/null
+++ b/doc/src/sgml/ref/pg_xlogdump.sgml
@@ -0,0 +1,76 @@
+<!--
+doc/src/sgml/ref/pg_xlogdump.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="APP-PGXLOGDUMP">
+ <refmeta>
+  <refentrytitle><application>pg_xlogdump</application></refentrytitle>
+  <manvolnum>1</manvolnum>
+  <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+  <refname>pg_xlogdump</refname>
+  <refpurpose>Display the write-ahead log of a <productname>PostgreSQL</productname> database cluster</refpurpose>
+ </refnamediv>
+
+ <indexterm zone="app-pgxlogdump">
+  <primary>pg_xlogdump</primary>
+ </indexterm>
+
+ <refsynopsisdiv>
+  <cmdsynopsis>
+   <command>pg_xlogdump</command>
+   <arg choice="opt"><option>-b</option></arg>
+   <arg choice="opt"><option>-e</option> <replaceable class="parameter">xlogrecptr</replaceable></arg>
+   <arg choice="opt"><option>-f</option> <replaceable class="parameter">filename</replaceable></arg>
+   <arg choice="opt"><option>-h</option></arg>
+   <arg choice="opt"><option>-p</option> <replaceable class="parameter">directory</replaceable></arg>
+   <arg choice="opt"><option>-s</option> <replaceable class="parameter">xlogrecptr</replaceable></arg>
+   <arg choice="opt"><option>-t</option> <replaceable class="parameter">timelineid</replaceable></arg>
+   <arg choice="opt"><option>-v</option></arg>
+  </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1 id="R1-APP-PGXLOGDUMP-1">
+  <title>Description</title>
+  <para>
+   <command>pg_xlogdump</command> display the write-ahead log (WAL) and is only
+   useful for debugging or educational purposes.
+  </para>
+
+  <para>
+   This utility can only be run by the user who installed the server, because
+   it requires read access to the data directory. It does not perform any
+   modifications.
+  </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Options</title>
+
+   <para>
+    The following command-line options control the location and format of the
+    output.
+
+    <variablelist>
+     <varlistentry>
+      <term><option>-p <replaceable class="parameter">directory</replaceable></option></term>
+      <listitem>
+       <para>
+        Directory to find xlog files in.
+       </para>
+      </listitem>
+     </varlistentry>
+    </variablelist>
+   </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Notes</title>
+  <para>
+    Can give wrong results when the server is running.
+  </para>
+ </refsect1>
+</refentry>
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 0872168..fed1fdd 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -225,6 +225,7 @@
    &pgDumpall;
    &pgReceivexlog;
    &pgRestore;
+   &pgXlogdump;
    &psqlRef;
    &reindexdb;
    &vacuumdb;
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index cc210a7..4e94af1 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -24,6 +24,7 @@
 #include "storage/standby.h"
 #include "utils/relmapper.h"

+/* Also update contrib/pg_xlogdump/tables.c if you add something here. */

 const RmgrData RmgrTable[RM_MAX_ID + 1] = {
 	{"XLOG", xlog_redo, xlog_desc, NULL, NULL, NULL},
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index 6455ef0..92ab3ca 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -52,6 +52,8 @@
  * If you add a new entry, remember to update the errhint below, and the
  * documentation for pg_relation_size(). Also keep FORKNAMECHARS above
  * up-to-date.
+ *
+ * Also update contrib/pg_xlogdump/tables.c if you add something here.
  */
 const char *forkNames[] = {
 	"main",						/* MAIN_FORKNUM */
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index d587365..7b6ed41 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -41,7 +41,7 @@ my $contrib_extraincludes =
 my $contrib_extrasource = {
 	'cube' => [ 'cubescan.l', 'cubeparse.y' ],
 	'seg'  => [ 'segscan.l',  'segparse.y' ] };
-my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
+my @contrib_excludes = ('intagg', 'pgcrypto', 'pg_xlogdump', 'sepgsql');

sub mkvcbuild
{
@@ -409,6 +409,20 @@ sub mkvcbuild
'localtime.c');
$zic->AddReference($libpgport);

+	my $pgxlogdump = $solution->AddProject('pg_xlogdump', 'exe', 'contrib');
+	$pgxlogdump->{name} = 'pg_xlogdump';
+	$pgxlogdump->AddIncludeDir('src\backend');
+	$pgxlogdump->AddFiles('contrib\pg_xlogdump',
+		'compat.c', 'pg_xlogdump.c', 'tables.c');
+	$pgxlogdump->AddFile('src\backend\access\transam\xlogreader.c');
+	$pgxlogdump->AddFiles('src\backend\access\rmgrdesc',
+		'clogdesc.c', 'dbasedesc.c', 'gindesc.c', 'gistdesc.c', 'hashdesc.c',
+		'heapdesc.c', 'mxactdesc.c', 'nbtdesc.c', 'relmapdesc.c', 'seqdesc.c',
+		'smgrdesc.c', 'spgdesc.c', 'standbydesc.c', 'tblspcdesc.c',
+		'xactdesc.c', 'xlogdesc.c');
+	$pgxlogdump->AddReference($libpgport);
+	$pgxlogdump->AddDefine('FRONTEND');
+
 	if ($solution->{options}->{xml})
 	{
 		$contrib_extraincludes->{'pgxml'} = [
-- 
1.7.12.289.g0ce9864.dirty

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

[PATCH 5/5] remove spurious space in running_xact's _desc function

From: Andres Freund <andres@anarazel.de>

---
src/backend/access/rmgrdesc/standbydesc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/rmgrdesc/standbydesc.c b/src/backend/access/rmgrdesc/standbydesc.c
index c38892b..5fb6f54 100644
--- a/src/backend/access/rmgrdesc/standbydesc.c
+++ b/src/backend/access/rmgrdesc/standbydesc.c
@@ -57,7 +57,7 @@ standby_desc(StringInfo buf, uint8 xl_info, char *rec)
 	{
 		xl_running_xacts *xlrec = (xl_running_xacts *) rec;

-		appendStringInfo(buf, " running xacts:");
+		appendStringInfo(buf, "running xacts:");
 		standby_desc_running_xacts(buf, xlrec);
 	}
 	else
-- 
1.7.12.289.g0ce9864.dirty

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Thom Brown

thom@linux.com

over 13 years ago

In reply to: Andres Freund (#1)

Re: [PATCH] xlogreader-v4

On 8 January 2013 19:09, Andres Freund <andres@2ndquadrant.com> wrote:

From: Andres Freund <andres@2ndquadrant.com>
Subject: [PATCH] xlogreader-v4
In-Reply-To:

Hi,

this is the latest and obviously best version of xlogreader & xlogdump with
changes both from Heikki and me.

Aren't you forgetting something?

--
Thom

Thom Brown

thom@linux.com

over 13 years ago

In reply to: Thom Brown (#7)

Re: [PATCH] xlogreader-v4

On 8 January 2013 19:15, Thom Brown <thom@linux.com> wrote:

On 8 January 2013 19:09, Andres Freund <andres@2ndquadrant.com> wrote:

From: Andres Freund <andres@2ndquadrant.com>
Subject: [PATCH] xlogreader-v4
In-Reply-To:

Hi,

this is the latest and obviously best version of xlogreader & xlogdump
with
changes both from Heikki and me.

Aren't you forgetting something?

I see, you're posting them separately. Nevermind.

--
Thom

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#2)

Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

Andres Freund <andres@2ndquadrant.com> writes:

From: Andres Freund <andres@anarazel.de>
c.h already had parts of the assert support (StaticAssert*) and its the shared
file between postgres.h and postgres_fe.h. This makes it easier to build
frontend programs which have to do the hack.

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h. I don't think that moving the two different
definitions into an #if block in one file is an improvement. If that
were an improvement, we might as well move everything in both postgres.h
and postgres_fe.h into c.h with a pile of #ifs.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#1)

Re: [PATCH] xlogreader-v4

On 2013-01-08 20:09:42 +0100, Andres Freund wrote:

From: Andres Freund <andres@2ndquadrant.com>
Subject: [PATCH] xlogreader-v4
In-Reply-To:

Hi,

this is the latest and obviously best version of xlogreader & xlogdump with
changes both from Heikki and me.

Changes:
* windows build support for pg_xlogdump

That was done blindly, btw, so I only know it compiles, not that it
runs...

Its in git://git.postgresql.org/git/users/andresfreund/postgres.git
branch xlogreader_v4 btw.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#3)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

Andres Freund <andres@2ndquadrant.com> writes:
maxpg> From: Andres Freund <andres@anarazel.de>

relpathbackend() (via some of its wrappers) is used in *_desc routines which we
want to be useable without a backend environment arround.

I'm 100% unimpressed with making relpathbackend return a pointer to a
static buffer. Who's to say whether that won't create bugs due to
overlapping usages?

Change signature to return a 'const char *' to make misuse easier to
detect.

That seems to create way more churn than is necessary, and it's wrong
anyway if the result is palloc'd.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#9)

Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

On 2013-01-08 14:25:06 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

From: Andres Freund <andres@anarazel.de>
c.h already had parts of the assert support (StaticAssert*) and its the shared
file between postgres.h and postgres_fe.h. This makes it easier to build
frontend programs which have to do the hack.

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h. I don't think that moving the two different
definitions into an #if block in one file is an improvement. If that
were an improvement, we might as well move everything in both postgres.h
and postgres_fe.h into c.h with a pile of #ifs.

The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions. It seems the most
consistent to move all of that into c.h, enough of the assertion stuff
is already there, I don't see an advantage of splitting it across 3
files as it currently is.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#12)

Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-01-08 14:25:06 -0500, Tom Lane wrote:

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h.

The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions.

There is no place, anywhere, that should be including both. So I don't
see the problem.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#11)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

On 2013-01-08 14:28:14 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:
maxpg> From: Andres Freund <andres@anarazel.de>

relpathbackend() (via some of its wrappers) is used in *_desc routines which we
want to be useable without a backend environment arround.

I'm 100% unimpressed with making relpathbackend return a pointer to a
static buffer. Who's to say whether that won't create bugs due to
overlapping usages?

I say it ;). I've gone through all callers and checked. Not that that
guarantees anything, but ...

The reason a static buffer is better is that some of the *desc routines
use relpathbackend() and pfree() the result. That would require
providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

Change signature to return a 'const char *' to make misuse easier to
detect.

That seems to create way more churn than is necessary, and it's wrong
anyway if the result is palloc'd.

It causes warnings in potential external users that pfree() the result
of relpathbackend which seems helpful. Obviously only makes sense if
relpathbackend() returns a pointer into a static buffer...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#14)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-01-08 14:28:14 -0500, Tom Lane wrote:

I'm 100% unimpressed with making relpathbackend return a pointer to a
static buffer. Who's to say whether that won't create bugs due to
overlapping usages?

I say it ;). I've gone through all callers and checked. Not that that
guarantees anything, but ...

Even if you've proven it safe today, it seems unnecessarily fragile.
Just about any other place we've used a static result buffer, we've
regretted it, unless the use cases were *very* narrow.

The reason a static buffer is better is that some of the *desc routines
use relpathbackend() and pfree() the result. That would require
providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

Why? If we have palloc support how hard can it be to map pfree to free?
And why wouldn't we want to? I can hardly imagine providing only palloc
and not pfree support.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#15)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

On 2013-01-08 14:53:29 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-01-08 14:28:14 -0500, Tom Lane wrote:

I'm 100% unimpressed with making relpathbackend return a pointer to a
static buffer. Who's to say whether that won't create bugs due to
overlapping usages?

I say it ;). I've gone through all callers and checked. Not that that
guarantees anything, but ...

Even if you've proven it safe today, it seems unnecessarily fragile.
Just about any other place we've used a static result buffer, we've
regretted it, unless the use cases were *very* narrow.

Hm, relpathbackend seems pretty narrow to me.

Funny, we both argued the other way round than we are now when talking
about the sprintf(..., "%X/%X", (uint32)(recptr >> 32), (uint32)recptr)
thingy ;)

The reason a static buffer is better is that some of the *desc routines
use relpathbackend() and pfree() the result. That would require
providing a stub pfree() in xlogdump which seems to be exceedingly ugly.

Why? If we have palloc support how hard can it be to map pfree to free?
And why wouldn't we want to? I can hardly imagine providing only palloc
and not pfree support.

Uhm, we don't have & need palloc support and I don't think
relpathbackend() is a good justification for adding it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#13)

Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

On 2013-01-08 14:35:12 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-01-08 14:25:06 -0500, Tom Lane wrote:

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h.

The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions.

There is no place, anywhere, that should be including both. So I don't
see the problem.

Sorry, misremembered the problem somewhat. The problem is that code that
includes postgres.h atm ends up with ExceptionalCondition() et
al. declared even if FRONTEND is defined. So if anything uses an assert
you need to provide wrappers for those which seems nasty. If they are
provided centrally and check for FRONTEND that problem doesn't exist.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#16)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

Andres Freund <andres@2ndquadrant.com> writes:

Uhm, we don't have & need palloc support and I don't think
relpathbackend() is a good justification for adding it.

I've said from the very beginning of this effort that it would be
impossible to share any meaningful amount of code between frontend and
backend environments without adding some sort of emulation of
palloc/pfree/elog. I think this patch is just making the code uglier
and more fragile to put off the inevitable, and that we'd be better
served to bite the bullet and do that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#17)

Re: Re: [PATCH 1/5] Centralize Assert* macros into c.h so its common between backend/frontend

Andres Freund wrote:

On 2013-01-08 14:35:12 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-01-08 14:25:06 -0500, Tom Lane wrote:

This patch seems unnecessary given that we already put a version of Assert()
into postgres_fe.h.

The problem is that some (including existing) pieces of code need to
include postgres.h itself, those can't easily include postgres_fe.h as
well without getting into problems with redefinitions.

There is no place, anywhere, that should be including both. So I don't
see the problem.

Sorry, misremembered the problem somewhat. The problem is that code that
includes postgres.h atm ends up with ExceptionalCondition() et
al. declared even if FRONTEND is defined. So if anything uses an assert
you need to provide wrappers for those which seems nasty. If they are
provided centrally and check for FRONTEND that problem doesn't exist.

I think the right fix here is to fix things so that postgres.h is not
necessary. How hard is that? Maybe it just requires some more
reshuffling of xlog headers.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Tom Lane (#18)

Re: [PATCH 2/5] Make relpathbackend return a statically result instead of palloc()'ing it

Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

Uhm, we don't have & need palloc support and I don't think
relpathbackend() is a good justification for adding it.

I've said from the very beginning of this effort that it would be
impossible to share any meaningful amount of code between frontend and
backend environments without adding some sort of emulation of
palloc/pfree/elog. I think this patch is just making the code uglier
and more fragile to put off the inevitable, and that we'd be better
served to bite the bullet and do that.

As far as this patch is concerned, I think it's sufficient to do

#define palloc(x) malloc(x)
#define pfree(x) free(x)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#18)

#22

andres@anarazel.de

over 13 years ago

In reply to: Alvaro Herrera (#19)

#23

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#21)

#24

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Andres Freund (#21)

#25

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Alvaro Herrera (#19)

#26

andres@anarazel.de

over 13 years ago

In reply to: Heikki Linnakangas (#24)

#27

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Andres Freund (#26)

#28

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#23)

#29

andres@anarazel.de

over 13 years ago

In reply to: Heikki Linnakangas (#27)

#30

robertmhaas@gmail.com

over 13 years ago

In reply to: Andres Freund (#16)

#31

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Robert Haas (#30)

#32

andres@anarazel.de

over 13 years ago

In reply to: Robert Haas (#30)

#33

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Robert Haas (#30)

#34

robertmhaas@gmail.com

over 13 years ago

In reply to: Tom Lane (#33)

#35

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Robert Haas (#34)

#36

robertmhaas@gmail.com

over 13 years ago

In reply to: Tom Lane (#35)

#37

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#18)

#38

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#37)

#39

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#37)

#40

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Andres Freund (#37)

#41

andres@anarazel.de

over 13 years ago

In reply to: Heikki Linnakangas (#40)

#42

Magnus Hagander

magnus@hagander.net

over 13 years ago

In reply to: Andres Freund (#38)

#43

andres@anarazel.de

over 13 years ago

In reply to: Magnus Hagander (#42)

#44

Magnus Hagander

magnus@hagander.net

over 13 years ago

In reply to: Andres Freund (#43)

#45

Michael Paquier

michael@paquier.xyz

over 13 years ago

In reply to: Magnus Hagander (#44)

#46

andres@anarazel.de

over 13 years ago

In reply to: Michael Paquier (#45)

#47

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#38)

#48

andres@anarazel.de

over 13 years ago

In reply to: Alvaro Herrera (#47)

#49

andres@anarazel.de

over 13 years ago

In reply to: Alvaro Herrera (#47)

#50

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Magnus Hagander (#44)

#51

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#41)

#52

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#51)

#53

Peter Geoghegan

pg@bowt.ie

over 13 years ago

In reply to: Tom Lane (#50)

#54

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#50)

#55

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#52)

#56

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#55)

#57

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#56)

#58

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#57)

#59

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#54)

#60

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#59)

#61

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#58)

#62

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Tom Lane (#61)

#63

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#62)

#64

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#61)

#65

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#63)

#66

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#63)

#67

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#65)

#68

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#67)

#69

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#68)

#70

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#69)

#71

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#70)

#72

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#71)

#73

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#72)

#74

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#73)

#75

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#74)

#76

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#75)

#77

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#76)

#78

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#77)

#79

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#78)

#80

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#79)

#81

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#80)

#82

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Tom Lane (#81)

#83

andres@anarazel.de

over 13 years ago

In reply to: Heikki Linnakangas (#82)

#84

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Heikki Linnakangas (#82)

#85

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#84)

#86

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Andres Freund (#85)

#87

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Heikki Linnakangas (#86)

#88

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#72)

#89

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#85)

#90

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Heikki Linnakangas (#82)

#91

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#87)

#92

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#90)

#93

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#89)

#94

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#91)

#95

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#88)

#96

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#95)

#97

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#96)

#98

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#97)

#99

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#98)

#100

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#99)

#101

robertmhaas@gmail.com

over 13 years ago

In reply to: Tom Lane (#100)

#102

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Robert Haas (#101)

#103

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#4)

#104

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#25)

#105

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#61)

#106

andres@anarazel.de

over 13 years ago

In reply to: Andres Freund (#104)

#107

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#106)

#108

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Alvaro Herrera (#107)

#109

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Alvaro Herrera (#107)

#110

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Alvaro Herrera (#109)

#111

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#108)

#112

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andres Freund (#111)

#113

andres@anarazel.de

over 13 years ago

In reply to: Tom Lane (#112)

#114

Steve Singer

steve@ssinger.info

over 13 years ago

In reply to: Tom Lane (#61)

#115

andres@anarazel.de

over 13 years ago

In reply to: Steve Singer (#114)

#116

Steve Singer

steve@ssinger.info

over 13 years ago

In reply to: Andres Freund (#115)

#117

andres@anarazel.de

over 13 years ago

In reply to: Steve Singer (#116)

#118

Steve Singer

steve@ssinger.info

over 13 years ago

In reply to: Andres Freund (#117)

#119

heikki.linnakangas@enterprisedb.com

over 13 years ago

In reply to: Tom Lane (#98)

#120

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Tom Lane (#112)

#121

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#5)

#122

andres@anarazel.de

over 13 years ago

In reply to: Alvaro Herrera (#121)

#123

alvherre@2ndquadrant.com

over 13 years ago

In reply to: Andres Freund (#122)

#124

andres@anarazel.de

over 13 years ago

In reply to: Alvaro Herrera (#123)

#125