patch for parallel pg_dump

Started by Joachim Wieland · almost 14 years ago · 80 messages
#1 Joachim Wieland
joe@mcknight.de
2 attachment(s)

So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format. The patch works on Windows and Unix.

In the first phase of a parallel pg_dump/pg_restore, the catalog
backup/restore is done in a single process, which then forks off
worker processes connected to the master process by pipes (on Windows,
the pg_pipe implementation is used). These pipes are only used for a
few commands and status messages. The worker processes then work on
the items assigned to them by the master; in other words, the workers
do not terminate after each item but stay around until the end of the
parallel part of the dump/restore. Once a worker finishes its current
item and sends the status back to the master, it is assigned the next
item, and so forth.
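The master's side of that loop boils down to simple slot bookkeeping. A minimal sketch, assuming the patch's WRKR_* state names but with everything else (the Slot struct, function names) simplified for illustration; the actual message sending is elided:

```c
#include <assert.h>

#define NO_SLOT (-1)

/* Worker states, named as in the patch's parallel.c */
typedef enum { WRKR_IDLE, WRKR_WORKING, WRKR_FINISHED, WRKR_TERMINATED } WorkerStatus;

typedef struct { WorkerStatus status; } Slot;

/* Find the first idle worker slot, or NO_SLOT if all are busy. */
static int get_idle_worker(Slot *slots, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (slots[i].status == WRKR_IDLE)
            return i;
    return NO_SLOT;
}

/* Master side of dispatching an item: mark the slot busy. */
static void dispatch(Slot *slots, int worker)
{
    slots[worker].status = WRKR_WORKING;
    /* ... here the real code sends "DUMP <arg>" down the pipe ... */
}

/* Called when the worker's "OK ..." status arrives: slot is idle again. */
static void reap(Slot *slots, int worker)
{
    slots[worker].status = WRKR_IDLE;
}
```

The master simply alternates between dispatching to whatever slot get_idle_worker() returns and reaping status messages, until the item queue is empty and every slot is idle again.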

In a parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
a parallel dump, however, we need to hold the master's connection open
so that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, each worker process also
requests a shared lock on its table with NOWAIT; if this fails, we
know that there is a conflicting lock in between and that we need to
abort the dump.
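The worker's check amounts to issuing a LOCK ... NOWAIT statement before dumping each table. A hedged sketch of building that statement; the helper below is a simplified stand-in, not the patch's actual lockTableNoWait code, and the real code must use a properly quoted, qualified table name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the statement a worker issues before dumping a table. If a
 * conflicting exclusive lock was queued after the master's shared lock,
 * NOWAIT makes this statement fail immediately instead of blocking
 * behind the exclusive lock (which would deadlock the dump).
 */
static void build_lock_stmt(char *buf, size_t len, const char *qualified_name)
{
    snprintf(buf, len,
             "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualified_name);
}
```

If executing this statement fails, the worker reports the error back to the master, which then shuts down the whole dump.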

Parallel pg_dump sorts the tables and indexes by their sizes so that
it can start with the largest items first.
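That largest-first ordering is a plain descending sort on item size. A minimal sketch with made-up table names and sizes (the patch's actual sort key is the relation size fetched from the catalog):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { const char *name; long long size; } DumpItem;

/* Sort descending by size so the largest tables are dispatched first;
 * this keeps a huge table from being started last and dominating the
 * tail of the parallel dump. */
static int cmp_size_desc(const void *a, const void *b)
{
    const DumpItem *x = (const DumpItem *) a;
    const DumpItem *y = (const DumpItem *) b;

    if (x->size < y->size) return 1;
    if (x->size > y->size) return -1;
    return 0;
}
```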

The connections of the parallel dump use the synchronized snapshot
feature. However, there is also an option --no-synchronized-snapshots
which can be used to dump from an older PostgreSQL version.
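The synchronized-snapshot handshake is just two SQL commands: the master runs pg_export_snapshot() and each worker imports the returned id into its own serializable/repeatable-read transaction. A sketch of building the worker's command; the snapshot id shown is invented, and the quoting here is simplified (real code must escape the literal properly):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Master:  SELECT pg_export_snapshot();           -- returns a snapshot id
 * Worker:  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
 *          SET TRANSACTION SNAPSHOT '<that id>';
 * Afterwards all connections see exactly the same database state.
 */
static void build_set_snapshot(char *buf, size_t len, const char *snapshot_id)
{
    snprintf(buf, len, "SET TRANSACTION SNAPSHOT '%s'", snapshot_id);
}
```

With --no-synchronized-snapshots this step is skipped, which is what makes dumping from older servers possible, at the cost of snapshot consistency guarantees across the worker connections.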

I'm also attaching another use case for parallel backup as a separate
patch: a new archive format that I named "pg_backup_mirror". It's
basically the parallel version of "pg_dump | psql", so it does a
parallel dump and restore of a database from one host to another. The
patch for this is fairly small, but it's still a bit rough and needs
some more work and discussion. Depending on how quickly (or not) we
get done with the review of the main patch, we can then include this
one as well or postpone it.

Joachim

Attachments:

parallel_pg_dump_1.diff (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/port/pipe.c b/src/backend/port/pipe.c
index 357f5ec..4e47d13 100644
*** a/src/backend/port/pipe.c
--- b/src/backend/port/pipe.c
***************
*** 15,23 ****
   *-------------------------------------------------------------------------
   */
  
  #include "postgres.h"
  
- #ifdef WIN32
  int
  pgpipe(int handles[2])
  {
--- 15,42 ----
   *-------------------------------------------------------------------------
   */
  
+ #ifdef WIN32
+ 
+ /*
+  * This pipe implementation is used in both the server and non-server programs.
+  * In the default case we run within the server and use the standard ereport
+  * error reporting.
+  * If the code runs in a non-server program (like pg_dump), then that program
+  * #defines an error routine and includes this .c file.
+  */
+ #ifndef PGPIPE_EREPORT
  #include "postgres.h"
+ #define PGPIPE_EREPORT pgpipe_ereport
+ static void
+ pgpipe_ereport(const char *fmt, ...)
+ {
+ 	va_list args;
+ 	va_start(args, fmt);
+ 	ereport(LOG, (errmsg_internal(fmt, args)));
+ 	va_end(args);
+ }
+ #endif
  
  int
  pgpipe(int handles[2])
  {
*************** pgpipe(int handles[2])
*** 29,35 ****
  
  	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not create socket: %ui", WSAGetLastError())));
  		return -1;
  	}
  
--- 48,54 ----
  
  	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
  	{
! 		PGPIPE_EREPORT("pgpipe could not create socket: %ui", WSAGetLastError());
  		return -1;
  	}
  
*************** pgpipe(int handles[2])
*** 39,76 ****
  	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not bind: %ui", WSAGetLastError())));
  		closesocket(s);
  		return -1;
  	}
  	if (listen(s, 1) == SOCKET_ERROR)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not listen: %ui", WSAGetLastError())));
  		closesocket(s);
  		return -1;
  	}
  	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not getsockname: %ui", WSAGetLastError())));
  		closesocket(s);
  		return -1;
  	}
  	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not create socket 2: %ui", WSAGetLastError())));
  		closesocket(s);
  		return -1;
  	}
  
  	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not connect socket: %ui", WSAGetLastError())));
  		closesocket(s);
  		return -1;
  	}
  	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
  	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not accept socket: %ui", WSAGetLastError())));
  		closesocket(handles[1]);
  		handles[1] = INVALID_SOCKET;
  		closesocket(s);
--- 58,95 ----
  	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
  	{
! 		PGPIPE_EREPORT("pgpipe could not bind: %ui", WSAGetLastError());
  		closesocket(s);
  		return -1;
  	}
  	if (listen(s, 1) == SOCKET_ERROR)
  	{
! 		PGPIPE_EREPORT("pgpipe could not listen: %ui", WSAGetLastError());
  		closesocket(s);
  		return -1;
  	}
  	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
  	{
! 		PGPIPE_EREPORT("pgpipe could not getsockname: %ui", WSAGetLastError());
  		closesocket(s);
  		return -1;
  	}
  	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
  	{
! 		PGPIPE_EREPORT("pgpipe could not create socket 2: %ui", WSAGetLastError());
  		closesocket(s);
  		return -1;
  	}
  
  	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
  	{
! 		PGPIPE_EREPORT("pgpipe could not connect socket: %ui", WSAGetLastError());
  		closesocket(s);
  		return -1;
  	}
  	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
  	{
! 		PGPIPE_EREPORT("pgpipe could not accept socket: %ui", WSAGetLastError());
  		closesocket(handles[1]);
  		handles[1] = INVALID_SOCKET;
  		closesocket(s);
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 033fb1e..00501de 100644
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
*************** override CPPFLAGS := -I$(libpq_srcdir) $
*** 20,26 ****
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_files.o pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
--- 20,27 ----
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_files.o pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o \
! 	parallel.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index d48b276..3bc224b 100644
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
***************
*** 54,59 ****
--- 54,60 ----
  
  #include "compress_io.h"
  #include "dumpmem.h"
+ #include "parallel.h"
  
  /*----------------------
   * Compressor API
*************** size_t
*** 181,186 ****
--- 182,190 ----
  WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
  				   const void *data, size_t dLen)
  {
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	switch (cs->comprAlg)
  	{
  		case COMPR_ALG_LIBZ:
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 350,355 ****
--- 354,362 ----
  	/* no minimal chunk size for zlib */
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		zp->next_in = (void *) buf;
  		zp->avail_in = cnt;
  
*************** ReadDataFromArchiveNone(ArchiveHandle *A
*** 410,415 ****
--- 417,425 ----
  
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		ahwrite(buf, 1, cnt, AH);
  	}
  
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 3493e39..352daba 100644
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <ctype.h>
  
+ #include "dumpmem.h"
  #include "dumputils.h"
  #include "pg_backup.h"
  
*************** static bool parseAclItem(const char *ite
*** 36,41 ****
--- 37,43 ----
  static char *copyAclUserName(PQExpBuffer output, char *input);
  static void AddAcl(PQExpBuffer aclbuf, const char *keyword,
  	   const char *subname);
+ static PQExpBuffer getThreadLocalPQExpBuffer(void);
  
  #ifdef WIN32
  static bool parallel_init_done = false;
*************** init_parallel_dump_utils(void)
*** 55,69 ****
  }
  
  /*
!  *	Quotes input string if it's not a legitimate SQL identifier as-is.
!  *
!  *	Note that the returned string must be used before calling fmtId again,
!  *	since we re-use the same return buffer each time.  Non-reentrant but
!  *	reduces memory leakage. (On Windows the memory leakage will be one buffer
!  *	per thread, which is at least better than one per call).
   */
! const char *
! fmtId(const char *rawid)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
--- 57,67 ----
  }
  
  /*
!  * Non-reentrant but reduces memory leakage. (On Windows the memory leakage
!  * will be one buffer per thread, which is at least better than one per call).
   */
! static PQExpBuffer
! getThreadLocalPQExpBuffer(void)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
*************** fmtId(const char *rawid)
*** 72,80 ****
  	static PQExpBuffer s_id_return = NULL;
  	PQExpBuffer id_return;
  
- 	const char *cp;
- 	bool		need_quotes = false;
- 
  #ifdef WIN32
  	if (parallel_init_done)
  		id_return = (PQExpBuffer) TlsGetValue(tls_index);		/* 0 when not set */
--- 70,75 ----
*************** fmtId(const char *rawid)
*** 101,109 ****
  #else
  		s_id_return = id_return;
  #endif
- 
  	}
  
  	/*
  	 * These checks need to match the identifier production in scan.l. Don't
  	 * use islower() etc.
--- 96,120 ----
  #else
  		s_id_return = id_return;
  #endif
  	}
  
+ 	return id_return;
+ }
+ 
+ /*
+  *	Quotes input string if it's not a legitimate SQL identifier as-is.
+  *
+  *	Note that the returned string must be used before calling fmtId again,
+  *	since we re-use the same return buffer each time.
+  */
+ const char *
+ fmtId(const char *rawid)
+ {
+ 	PQExpBuffer id_return = getThreadLocalPQExpBuffer();
+ 
+ 	const char *cp;
+ 	bool		need_quotes = false;
+ 
  	/*
  	 * These checks need to match the identifier production in scan.l. Don't
  	 * use islower() etc.
*************** fmtId(const char *rawid)
*** 171,176 ****
--- 182,226 ----
  	return id_return->data;
  }
  
+ /*
+  * fmtQualifiedId - convert a qualified name to the proper format for
+  * the source database.
+  *
+  * Like fmtId, use the result before calling again.
+  */
+ const char *
+ fmtQualifiedId(const char *schema, const char *id, int remoteVersion)
+ {
+ 	PQExpBuffer id_return;
+ 	/*
+ 	 * We're using the same PQExpBuffer as fmtId(), that's why we first get all
+ 	 * the values from fmtId and then return them appended in the PQExpBuffer.
+ 	 * The reason we still use the PQExpBuffer to return the string is just for
+ 	 * ease of use in the caller, that doesn't have to free the string
+ 	 * explicitly.
+ 	 */
+ 	char	   *schemaBuf, *idBuf;
+ 
+ 	/* Suppress schema name if fetching from pre-7.3 DB */
+ 	if (remoteVersion >= 70300 && schema && *schema)
+ 	{
+ 		schemaBuf = pg_strdup(fmtId(schema));
+ 	} else
+ 		schemaBuf = pg_strdup("");
+ 
+ 	idBuf = pg_strdup(fmtId(id));
+ 
+ 	/* this will reset the PQExpBuffer */
+ 	id_return = getThreadLocalPQExpBuffer();
+ 	appendPQExpBuffer(id_return, "%s%s%s",
+ 								 schemaBuf,
+ 								 strlen(schemaBuf) > 0 ? "." : "",
+ 								 idBuf);
+ 	free(schemaBuf);
+ 	free(idBuf);
+ 
+ 	return id_return->data;
+ }
  
  /*
   * Convert a string value to an SQL string literal and append it to
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
index b4cf730..060c95d 100644
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*************** extern const char *progname;
*** 24,29 ****
--- 24,30 ----
  
  extern void init_parallel_dump_utils(void);
  extern const char *fmtId(const char *identifier);
+ extern const char *fmtQualifiedId(const char *schema, const char *id, int remoteVersion);
  extern void appendStringLiteral(PQExpBuffer buf, const char *str,
  					int encoding, bool std_strings);
  extern void appendStringLiteralConn(PQExpBuffer buf, const char *str,
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index ...bcde24c .
*** a/src/bin/pg_dump/parallel.c
--- b/src/bin/pg_dump/parallel.c
***************
*** 0 ****
--- 1,1289 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.c
+  *
+  *	Parallel support for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ #include "dumpmem.h"
+ #include "dumputils.h"
+ #include "parallel.h"
+ 
+ #ifndef WIN32
+ #include <sys/types.h>
+ #include <sys/wait.h>
+ #include "signal.h"
+ #include <unistd.h>
+ #include <fcntl.h>
+ #endif
+ 
+ #define PIPE_READ							0
+ #define PIPE_WRITE							1
+ #define SHUTDOWN_GRACE_PERIOD				(500)
+ 
+ /* file-scope variables */
+ #ifdef WIN32
+ static unsigned int	tMasterThreadId = 0;
+ static HANDLE		termEvent = INVALID_HANDLE_VALUE;
+ #else
+ static volatile sig_atomic_t wantAbort = 0;
+ static bool aborting = false;
+ #endif
+ 
+ /*
+  * The parallel error handler is called for any die_horribly() in a child or master process.
+  * It then takes control over shutting down the rest of the gang.
+  */
+ void (*volatile vparallel_error_handler)(ArchiveHandle *AH, const char *modulename,
+ 								const char *fmt, va_list ap)
+ 						__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0))) = NULL;
+ 
+ /* The actual implementation of the error handler function */
+ static void vparallel_error_handler_imp(ArchiveHandle *AH, const char *modulename,
+ 										const char *fmt, va_list ap)
+ 								__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
+ 
+ static const char *modulename = gettext_noop("parallel archiver");
+ 
+ static int ShutdownConnection(PGconn **conn);
+ 
+ static void WaitForTerminatingWorkers(ArchiveHandle *AH, ParallelState *pstate);
+ static void ShutdownWorkersHard(ArchiveHandle *AH, ParallelState *pstate);
+ static void ShutdownWorkersSoft(ArchiveHandle *AH, ParallelState *pstate, bool do_wait);
+ static void PrintStatus(ParallelState *pstate);
+ static bool HasEveryWorkerTerminated(ParallelState *pstate);
+ 
+ static void lockTableNoWait(ArchiveHandle *AH, TocEntry *te);
+ static void WaitForCommands(ArchiveHandle *AH, int pipefd[2]);
+ static char *getMessageFromMaster(ArchiveHandle *AH, int pipefd[2]);
+ static void sendMessageToMaster(ArchiveHandle *AH, int pipefd[2], const char *str);
+ static char *getMessageFromWorker(ArchiveHandle *AH, ParallelState *pstate,
+ 								  bool do_wait, int *worker);
+ static void sendMessageToWorker(ArchiveHandle *AH, ParallelState *pstate,
+ 							    int worker, const char *str);
+ static char *readMessageFromPipe(int fd, bool do_wait);
+ 
+ static void SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 						RestoreOptions *ropt);
+ 
+ #define messageStartsWith(msg, prefix) \
+ 	(strncmp(msg, prefix, strlen(prefix)) == 0)
+ #define messageEquals(msg, pattern) \
+ 	(strcmp(msg, pattern) == 0)
+ 
+ /* architecture dependent #defines */
+ #ifdef WIN32
+ 	/* WIN32 */
+ 	/* pgpipe implemented in src/backend/port/pipe.c */
+ 	#define setnonblocking(fd) \
+ 		do { u_long mode = 1; \
+ 			 ioctlsocket((fd), FIONBIO, &mode); \
+ 		} while(0);
+ 	#define setblocking(fd) \
+ 		do { u_long mode = 0; \
+ 			 ioctlsocket((fd), FIONBIO, &mode); \
+ 		} while(0);
+ #else /* UNIX */
+ 	#define setnonblocking(fd) \
+ 		do { long flags = (long) fcntl((fd), F_GETFL, 0); \
+ 			fcntl(fd, F_SETFL, flags | O_NONBLOCK); \
+ 		} while(0);
+ 	#define setblocking(fd) \
+ 		do { long flags = (long) fcntl((fd), F_GETFL, 0); \
+ 			fcntl(fd, F_SETFL, flags & ~O_NONBLOCK); \
+ 		} while(0);
+ #endif
+ 
+ #ifdef WIN32
+ /*
+  * On Windows, source in the pgpipe implementation from the backend and provide
+  * an own error reporting routine, the backend usually uses ereport() for that.
+  */
+ static void pgdump_pgpipe_ereport(const char* fmt, ...);
+ #define PGPIPE_EREPORT pgdump_pgpipe_ereport
+ #include "../../backend/port/pipe.c"
+ static void
+ pgdump_pgpipe_ereport(const char* fmt, ...)
+ {
+ 	va_list args;
+ 	va_start(args, fmt);
+ 	vwrite_msg("pgpipe", fmt, args);
+ 	va_end(args);
+ }
+ #endif
+ 
+ static int
+ #ifdef WIN32
+ GetSlotOfThread(ParallelState *pstate, unsigned int threadId)
+ #else
+ GetSlotOfProcess(ParallelState *pstate, pid_t pid)
+ #endif
+ {
+ 	int i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->parallelSlot[i].threadId == threadId)
+ #else
+ 		if (pstate->parallelSlot[i].pid == pid)
+ #endif
+ 			return i;
+ 
+ 	Assert(false);
+ 	return NO_SLOT;
+ }
+ 
+ enum escrow_action { GET, SET };
+ static void
+ parallel_error_handler_escrow_data(enum escrow_action act, ParallelState *pstate)
+ {
+ 	static ParallelState *s_pstate = NULL;
+ 
+ 	if (act == SET)
+ 		s_pstate = pstate;
+ 	else
+ 		*pstate = *s_pstate;
+ }
+ 
+ static void
+ vparallel_error_handler_imp(ArchiveHandle *AH,
+ 							const char *modulename,
+ 							const char *fmt, va_list ap)
+ {
+ 	ParallelState pstate;
+ 	char		buf[512];
+ 	int			pipefd[2];
+ 	int			i;
+ 
+ 	if (AH->is_clone)
+ 	{
+ 		/* we are the child, get the message out to the parent */
+ 		parallel_error_handler_escrow_data(GET, &pstate);
+ #ifdef WIN32
+ 		i = GetSlotOfThread(&pstate, GetCurrentThreadId());
+ #else
+ 		i = GetSlotOfProcess(&pstate, getpid());
+ #endif
+ 		if (pstate.parallelSlot[i].inErrorHandling)
+ 			return;
+ 
+ 		pstate.parallelSlot[i].inErrorHandling = true;
+ 
+ 		pipefd[PIPE_READ] = pstate.parallelSlot[i].pipeRevRead;
+ 		pipefd[PIPE_WRITE] = pstate.parallelSlot[i].pipeRevWrite;
+ 
+ 		strcpy(buf, "ERROR ");
+ 		vsnprintf(buf + strlen("ERROR "), sizeof(buf) - strlen("ERROR "), fmt, ap);
+ 
+ 		sendMessageToMaster(AH, pipefd, buf);
+ 		if (AH->connection)
+ 			ShutdownConnection(&(AH->connection));
+ #ifdef WIN32
+ 		ExitThread(1);
+ #else
+ 		exit(1);
+ #endif
+ 	}
+ 	else
+ 	{
+ #ifndef WIN32
+ 		/*
+ 		 * We are the parent. We need the handling variable to see if we're
+ 		 * already handling an error.
+ 		 */
+ 		if (aborting)
+ 			return;
+ 		aborting = 1;
+ 
+ 		signal(SIGPIPE, SIG_IGN);
+ #endif
+ 		/*
+ 		 * Note that technically we're using a new pstate here, the old one
+ 		 * is copied over and then the old one isn't updated anymore. Only
+ 		 * the new one is, which is okay because control will never return
+ 		 * from this function.
+ 		 */
+ 		parallel_error_handler_escrow_data(GET, &pstate);
+ 		ShutdownWorkersHard(AH, &pstate);
+ 		/* Terminate own connection */
+ 		if (AH->connection)
+ 			ShutdownConnection(&(AH->connection));
+ 		vwrite_msg(NULL, fmt, ap);
+ 		exit(1);
+ 	}
+ 	Assert(false);
+ }
+ 
+ /*
+  * If we have one worker that terminates for some reason, we'd like the other
+  * threads to terminate as well (and not finish with their 70 GB table dump
+  * first...). Now in UNIX we can just kill these processes, and let the signal
+  * handler set wantAbort to 1 or more. In Windows we set a termEvent and this
+  * serves as the signal for everyone to terminate.
+  */
+ void
+ checkAborting(ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	if (WaitForSingleObject(termEvent, 0) == WAIT_OBJECT_0)
+ #else
+ 	if (wantAbort)
+ #endif
+ 		/*
+ 		 * Terminate, this error will actually never show up somewhere
+ 		 * because if termEvent/wantAbort is set, then we are already in the
+ 		 * process of going down and already have a reason why we're
+ 		 * terminating.
+ 		 */
+ 		die_horribly(AH, modulename, "worker is terminating");
+ }
+ 
+ /*
+  * A select loop that repeats calling select until a descriptor in the read set
+  * becomes readable. On Windows we have to check for the termination event from
+  * time to time, on Unix we can just block forever.
+  */
+ #ifdef WIN32
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int			i;
+ 	fd_set		saveSet = *workerset;
+ 
+ 	/* should always be the master */
+ 	Assert(tMasterThreadId == GetCurrentThreadId());
+ 
+ 	for (;;)
+ 	{
+ 		/*
+ 		 * sleep a quarter of a second before checking if we should
+ 		 * terminate.
+ 		 */
+ 		struct timeval tv = { 0, 250000 };
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, &tv);
+ 
+ 		if (i == SOCKET_ERROR && WSAGetLastError() == WSAEINTR)
+ 			continue;
+ 		if (i)
+ 			break;
+ 	}
+ 
+ 	return i;
+ }
+ #else /* UNIX */
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int		i;
+ 
+ 	fd_set saveSet = *workerset;
+ 	for (;;)
+ 	{
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, NULL);
+ 		Assert(i != 0);
+ 		if (wantAbort && !aborting) {
+ 			return NO_SLOT;
+ 		}
+ 		if (i < 0 && errno == EINTR)
+ 			continue;
+ 		break;
+ 	}
+ 
+ 	return i;
+ }
+ #endif
+ 
+ /*
+  * Shut down any remaining workers, this has an implicit do_wait == true
+  */
+ static void
+ ShutdownWorkersHard(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ #ifdef WIN32
+ 	/* The workers monitor this event via checkAborting(). */
+ 	SetEvent(termEvent);
+ #endif
+ 	/*
+ 	 * The fastest way we can make them terminate is when they are listening
+ 	 * for new commands and we just tell them to terminate.
+ 	 */
+ 	ShutdownWorkersSoft(AH, pstate, false);
+ 
+ #ifndef WIN32
+ 	{
+ 		int i;
+ 		for (i = 0; i < pstate->numWorkers; i++)
+ 			kill(pstate->parallelSlot[i].pid, SIGTERM);
+ 
+ 		/* Reset our signal handler, if we get signaled again, terminate normally */
+ 		signal(SIGINT, SIG_DFL);
+ 		signal(SIGTERM, SIG_DFL);
+ 		signal(SIGQUIT, SIG_DFL);
+ 	}
+ #endif
+ 
+ 	WaitForTerminatingWorkers(AH, pstate);
+ }
+ 
+ static void
+ WaitForTerminatingWorkers(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	while (!HasEveryWorkerTerminated(pstate))
+ 	{
+ 		int			worker;
+ 		char	   *msg;
+ 
+ 		PrintStatus(pstate);
+ 
+ 		msg = getMessageFromWorker(AH, pstate, true, &worker);
+ 		if (!msg || messageStartsWith(msg, "ERROR "))
+ 			pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
+ 		if (msg)
+ 			free(msg);
+ 	}
+ 	Assert(HasEveryWorkerTerminated(pstate));
+ }
+ 
+ #ifndef WIN32
+ /* Signal handling (UNIX only) */
+ static void
+ sigTermHandler(int signum)
+ {
+ 	wantAbort++;
+ }
+ #endif
+ 
+ /*
+  * This function is called by both UNIX and Windows variants to set up a
+  * worker process.
+  */
+ static void
+ SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 			RestoreOptions *ropt)
+ {
+ 	/*
+ 	 * In dump mode (pg_dump) this calls _SetupWorker() as defined in
+ 	 * pg_dump.c, while in restore mode (pg_restore) it calls _SetupWorker() as
+ 	 * defined in pg_restore.c.
+      *
+ 	 * We get the raw connection only for the reason that we can close it
+ 	 * properly when we shut down. This happens only that way when it is
+ 	 * brought down because of an error.
+ 	 */
+ 	_SetupWorker((Archive *) AH, ropt);
+ 
+ 	Assert(AH->connection != NULL);
+ 
+ 	WaitForCommands(AH, pipefd);
+ 
+ 	closesocket(pipefd[PIPE_READ]);
+ 	closesocket(pipefd[PIPE_WRITE]);
+ }
+ 
+ #ifdef WIN32
+ /*
+  * On Windows the _beginthreadex() function allows us to pass one parameter.
+  * Since we need to pass a few values however, we define a structure here
+  * and then pass a pointer to such a structure in _beginthreadex().
+  */
+ typedef struct {
+ 	ArchiveHandle  *AH;
+ 	RestoreOptions *ropt;
+ 	int				worker;
+ 	int				pipeRead;
+ 	int				pipeWrite;
+ } WorkerInfo;
+ 
+ static unsigned __stdcall
+ init_spawned_worker_win32(WorkerInfo *wi)
+ {
+ 	ArchiveHandle *AH;
+ 	int pipefd[2] = { wi->pipeRead, wi->pipeWrite };
+ 	int worker = wi->worker;
+ 	RestoreOptions *ropt = wi->ropt;
+ 
+ 	AH = CloneArchive(wi->AH);
+ 
+ 	free(wi);
+ 	SetupWorker(AH, pipefd, worker, ropt);
+ 
+ 	DeCloneArchive(AH);
+ 	_endthreadex(0);
+ 	return 0;
+ }
+ #endif
+ 
+ /*
+  * This function starts the parallel dump or restore by spawning off the worker
+  * processes in both Unix and Windows. For Windows, it creates a number of
+  * threads while it does a fork() on Unix.
+  */
+ ParallelState *
+ ParallelBackupStart(ArchiveHandle *AH, RestoreOptions *ropt)
+ {
+ 	ParallelState  *pstate;
+ 	int				i;
+ 	const size_t	slotSize = AH->public.numWorkers * sizeof(ParallelSlot);
+ 
+ 	Assert(AH->public.numWorkers > 0);
+ 
+ 	/* Ensure stdio state is quiesced before forking */
+ 	fflush(NULL);
+ 
+ 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
+ 
+ 	pstate->numWorkers = AH->public.numWorkers;
+ 	pstate->parallelSlot = NULL;
+ 
+ 	if (AH->public.numWorkers == 1)
+ 		return pstate;
+ 
+ 	pstate->parallelSlot = (ParallelSlot *) pg_malloc(slotSize);
+ 	memset((void *) pstate->parallelSlot, 0, slotSize);
+ 
+ 	parallel_error_handler_escrow_data(SET, pstate);
+ 	vparallel_error_handler = vparallel_error_handler_imp;
+ 
+ #ifdef WIN32
+ 	tMasterThreadId = GetCurrentThreadId();
+ 	termEvent = CreateEvent(NULL, true, false, "Terminate");
+ #else
+ 	signal(SIGTERM, sigTermHandler);
+ 	signal(SIGINT, sigTermHandler);
+ 	signal(SIGQUIT, sigTermHandler);
+ #endif
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ #ifdef WIN32
+ 		WorkerInfo *wi;
+ 		uintptr_t	handle;
+ #else
+ 		pid_t		pid;
+ #endif
+ 		int			pipeMW[2], pipeWM[2];
+ 
+ 		if (pgpipe(pipeMW) < 0 || pgpipe(pipeWM) < 0)
+ 			die_horribly(AH, modulename, "Cannot create communication channels: %s",
+ 						 strerror(errno));
+ 
+ 		pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ #ifdef WIN32
+ 		/* Allocate a new structure for every worker */
+ 		wi = (WorkerInfo *) pg_malloc(sizeof(WorkerInfo));
+ 
+ 		wi->ropt = ropt;
+ 		wi->worker = i;
+ 		wi->AH = AH;
+ 		wi->pipeRead = pstate->parallelSlot[i].pipeRevRead = pipeMW[PIPE_READ];
+ 		wi->pipeWrite = pstate->parallelSlot[i].pipeRevWrite = pipeWM[PIPE_WRITE];
+ 
+ 		handle = _beginthreadex(NULL, 0, &init_spawned_worker_win32,
+ 								wi, 0, &(pstate->parallelSlot[i].threadId));
+ 		pstate->parallelSlot[i].hThread = handle;
+ #else
+ 		pid = fork();
+ 		if (pid == 0)
+ 		{
+ 			/* we are the worker */
+ 			int j;
+ 			int pipefd[2] = { pipeMW[PIPE_READ], pipeWM[PIPE_WRITE] };
+ 
+ 			/*
+ 			 * Store the fds for the reverse communication in pstate. Actually
+ 			 * we only use this in case of an error and don't use pstate
+ 			 * otherwise in the worker process. On Windows we write to the
+ 			 * global pstate, in Unix we write to our process-local copy but
+ 			 * that's also where we'd retrieve this information back from.
+ 			 */
+ 			pstate->parallelSlot[i].pipeRevRead = pipefd[PIPE_READ];
+ 			pstate->parallelSlot[i].pipeRevWrite = pipefd[PIPE_WRITE];
+ 			pstate->parallelSlot[i].pid = getpid();
+ 
+ 			/*
+ 			 * Call CloneArchive on Unix as well even though technically we
+ 			 * don't need to because fork() gives us a copy in our own address space
+ 			 * already. But CloneArchive resets the state information, sets is_clone
+ 			 * and also clones the database connection (for parallel dump)
+ 			 * which all seems kinda helpful.
+ 			 */
+ 			AH = CloneArchive(AH);
+ 
+ #ifdef HAVE_SETSID
+ 			/*
+ 			 * If we can, we try to make each process the leader of its own
+ 			 * process group. The reason is that if you hit Ctrl-C and they are
+ 			 * all in the same process group, any termination sequence is
+ 			 * possible, because every process will receive the signal. What
+ 			 * often happens is that a worker receives the signal, terminates
+ 			 * and the master detects that one of the workers had a problem,
+ 			 * even before acting on its own signal. That's still okay because
+ 			 * everyone still terminates but it looks a bit weird.
+ 			 *
+ 			 * With setsid() however, a Ctrl-C is only sent to the master and
+ 			 * he can then cascade it to the worker processes.
+ 			 */
+ 			setsid();
+ #endif
+ 
+ 			closesocket(pipeWM[PIPE_READ]);		/* close read end of Worker -> Master */
+ 			closesocket(pipeMW[PIPE_WRITE]);	/* close write end of Master -> Worker */
+ 
+ 			/*
+ 			 * Close all inherited fds for communication of the master with
+ 			 * the other workers.
+ 			 */
+ 			for (j = 0; j < i; j++)
+ 			{
+ 				closesocket(pstate->parallelSlot[j].pipeRead);
+ 				closesocket(pstate->parallelSlot[j].pipeWrite);
+ 			}
+ 
+ 			SetupWorker(AH, pipefd, i, ropt);
+ 
+ 			exit(0);
+ 		}
+ 		else if (pid < 0)
+ 			/* fork failed */
+ 			die_horribly(AH, modulename,
+ 						 "could not create worker process: %s\n",
+ 						 strerror(errno));
+ 
+ 		/* we are the Master, pid > 0 here */
+ 		Assert(pid > 0);
+ 		closesocket(pipeMW[PIPE_READ]);		/* close read end of Master -> Worker */
+ 		closesocket(pipeWM[PIPE_WRITE]);	/* close write end of Worker -> Master */
+ 
+ 		pstate->parallelSlot[i].pid = pid;
+ #endif
+ 
+ 		pstate->parallelSlot[i].pipeRead = pipeWM[PIPE_READ];
+ 		pstate->parallelSlot[i].pipeWrite = pipeMW[PIPE_WRITE];
+ 
+ 		pstate->parallelSlot[i].args = (ParallelArgs *) pg_malloc(sizeof(ParallelArgs));
+ 		pstate->parallelSlot[i].args->AH = AH;
+ 		pstate->parallelSlot[i].args->te = NULL;
+ 		pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 	}
+ 
+ 	return pstate;
+ }
+ 
+ /*
+  * Tell all of our workers to terminate.
+  *
+  * Pretty straightforward routine, first we tell everyone to terminate, then we
+  * listen to the workers' replies and finally close the sockets that we have
+  * used for communication.
+  */
+ void
+ ParallelBackupEnd(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	if (pstate->numWorkers == 1)
+ 		return;
+ 
+ 	PrintStatus(pstate);
+ 	Assert(IsEveryWorkerIdle(pstate));
+ 
+ 	/* no hard shutdown, let workers exit by themselves and wait for them */
+ 	ShutdownWorkersSoft(AH, pstate, true);
+ 
+ 	PrintStatus(pstate);
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		closesocket(pstate->parallelSlot[i].pipeRead);
+ 		closesocket(pstate->parallelSlot[i].pipeWrite);
+ 	}
+ 
+ 	vparallel_error_handler = NULL;
+ 
+ 	free(pstate->parallelSlot);
+ 	free(pstate);
+ }
+ 
+ 
+ /*
+  * The sequence is the following (for dump, similar for restore):
+  *
+  * Master                                   Worker
+  *
+  *                                          enters WaitForCommands()
+  * DispatchJobForTocEntry(...te...)
+  *
+  * [ Worker is IDLE ]
+  *
+  * arg = (MasterStartParallelItemPtr)()
+  * send: DUMP arg
+  *                                          receive: DUMP arg
+  *                                          str = (WorkerJobDumpPtr)(arg)
+  * [ Worker is WORKING ]                    ... gets te from arg ...
+  *                                          ... dump te ...
+  *                                          send: OK DUMP info
+  *
+  * In ListenToWorkers():
+  *
+  * [ Worker is FINISHED ]
+  * receive: OK DUMP info
+  * status = (MasterEndParallelItemPtr)(info)
+  *
+  * In ReapWorkerStatus(&ptr):
+  * *ptr = status;
+  * [ Worker is IDLE ]
+  */
+ void
+ DispatchJobForTocEntry(ArchiveHandle *AH, ParallelState *pstate, TocEntry *te,
+ 					   T_Action act)
+ {
+ 	int		worker;
+ 	char   *arg;
+ 
+ 	/* our caller makes sure that at least one worker is idle */
+ 	worker = GetIdleWorker(pstate);
+ 	Assert(worker != NO_SLOT);
+ 
+ 	arg = (AH->MasterStartParallelItemPtr)(AH, te, act);
+ 
+ 	sendMessageToWorker(AH, pstate, worker, arg);
+ 
+ 	pstate->parallelSlot[worker].workerStatus = WRKR_WORKING;
+ 	pstate->parallelSlot[worker].args->te = te;
+ 	PrintStatus(pstate);
+ }
+ 
+ static void
+ PrintStatus(ParallelState *pstate)
+ {
+ 	int			i;
+ 	printf("------Status------\n");
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		printf("Status of worker %d: ", i);
+ 		switch (pstate->parallelSlot[i].workerStatus)
+ 		{
+ 			case WRKR_IDLE:
+ 				printf("IDLE");
+ 				break;
+ 			case WRKR_WORKING:
+ 				printf("WORKING");
+ 				break;
+ 			case WRKR_FINISHED:
+ 				printf("FINISHED");
+ 				break;
+ 			case WRKR_TERMINATED:
+ 				printf("TERMINATED");
+ 				break;
+ 		}
+ 		printf("\n");
+ 	}
+ 	printf("------------\n");
+ }
+ 
+ 
+ /*
+  * Find the first free parallel slot (if any).
+  */
+ int
+ GetIdleWorker(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_IDLE)
+ 			return i;
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * Return true iff every worker process is in the WRKR_TERMINATED state.
+  */
+ static bool
+ HasEveryWorkerTerminated(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_TERMINATED)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * Return true iff every worker is in the WRKR_IDLE state.
+  */
+ bool
+ IsEveryWorkerIdle(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_IDLE)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * Performs a soft shutdown and optionally waits for every worker to terminate.
+  * A soft shutdown merely sends a "TERMINATE" message to every worker.
+  */
+ static void
+ ShutdownWorkersSoft(ArchiveHandle *AH, ParallelState *pstate, bool do_wait)
+ {
+ 	int			i;
+ 
+ 	/* soft shutdown */
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_TERMINATED)
+ 		{
+ 			sendMessageToWorker(AH, pstate, i, "TERMINATE");
+ 			pstate->parallelSlot[i].workerStatus = WRKR_WORKING;
+ 		}
+ 	}
+ 
+ 	if (!do_wait)
+ 		return;
+ 
+ 	WaitForTerminatingWorkers(AH, pstate);
+ }
+ 
+ /*
+  * This routine makes some effort to gracefully shut down the database
+  * connection, but not too much, since the parent is waiting for the workers
+  * to terminate. Cancelling a query is asynchronous, so after sending
+  * PQcancel() we wait a little while; calling PQcancel() and then PQfinish()
+  * immediately afterwards would most likely close the connection before the
+  * cancel request has been processed.
+  *
+  * On Windows, when the master process terminates the children's database
+  * connections, it forks off new threads that do nothing but close the
+  * connection. These threads live only as long as they are in this function.
+  * Since a thread must return a value, this function does as well, hence the
+  * (unsigned) int return type.
+  */
+ static int
+ ShutdownConnection(PGconn **conn)
+ {
+ 	PGcancel   *cancel;
+ 	char		errbuf[1];
+ 	int			i;
+ 
+ 	Assert(conn != NULL);
+ 	Assert(*conn != NULL);
+ 
+ 	if ((cancel = PQgetCancel(*conn)))
+ 	{
+ 		PQcancel(cancel, errbuf, sizeof(errbuf));
+ 		PQfreeCancel(cancel);
+ 	}
+ 
+ 	/* give the server a little while */
+ 	for (i = 0; i < 10; i++)
+ 	{
+ 		PQconsumeInput(*conn);
+ 		if (!PQisBusy(*conn))
+ 			break;
+ 		pg_usleep((SHUTDOWN_GRACE_PERIOD / 10) * 1000);
+ 	}
+ 
+ 	PQfinish(*conn);
+ 	*conn = NULL;
+ 	return 0;
+ }
+ 
+ /*
+  * One danger of the parallel backup is a possible deadlock:
+  *
+  * 1) Master dumps the schema and locks all tables in ACCESS SHARE mode.
+  * 2) Another process requests an ACCESS EXCLUSIVE lock (which is not granted
+  *    because the master holds a conflicting ACCESS SHARE lock).
+  * 3) The worker process also requests an ACCESS SHARE lock to read the table.
+  *    The worker's not granted that lock but is enqueued behind the ACCESS
+  *    EXCLUSIVE lock request.
+  *
+  * What we do here is request the lock in ACCESS SHARE mode but with NOWAIT
+  * in the worker prior to touching the table. If we don't get the lock, we
+  * know that somebody else has requested an ACCESS EXCLUSIVE lock, and we
+  * fail the whole backup because we have detected a deadlock.
+  */
+ static void
+ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	const char *qualId;
+ 	PQExpBuffer query = createPQExpBuffer();
+ 	PGresult   *res;
+ 
+ 	Assert(AH->format == archDirectory);
+ 	Assert(strcmp(te->desc, "BLOBS") != 0);
+ 
+ 	/*
+ 	 * We are only locking tables here, so we can look up the fully
+ 	 * qualified name of the relation in the catalogs by the table's OID.
+ 	 */
+ 	appendPQExpBuffer(query, "SELECT pg_namespace.nspname,"
+ 							 "       pg_class.relname "
+ 							 "  FROM pg_class "
+ 							 "  JOIN pg_namespace on pg_namespace.oid = relnamespace "
+ 							 " WHERE pg_class.oid = %u", te->catalogId.oid);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_TUPLES_OK)
+ 		die_horribly(AH, modulename, "could not get relation name for oid %u: %s",
+ 					 te->catalogId.oid, PQerrorMessage(AH->connection));
+ 
+ 	resetPQExpBuffer(query);
+ 
+ 	qualId = fmtQualifiedId(PQgetvalue(res, 0, 0),
+ 							PQgetvalue(res, 0, 1),
+ 							AH->public.remoteVersion);
+ 
+ 	appendPQExpBuffer(query, "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualId);
+ 	PQclear(res);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
+ 		die_horribly(AH, modulename, "could not obtain lock on relation \"%s\". This "
+ 					 "usually means that someone requested an ACCESS EXCLUSIVE lock "
+ 					 "on the table after the pg_dump parent process had obtained the "
+ 					 "initial ACCESS SHARE lock on the table.", qualId);
+ 
+ 	PQclear(res);
+ 	destroyPQExpBuffer(query);
+ }
+ 
+ /*
+  * This is the main routine of the worker process.
+  *
+  * On startup the worker enters this routine and waits for commands from the
+  * master process. After processing a command it comes back here to wait for
+  * the next one, until it finally receives a TERMINATE command and exits.
+  */
+ static void
+ WaitForCommands(ArchiveHandle *AH, int pipefd[2])
+ {
+ 	char	   *command;
+ 	DumpId		dumpId;
+ 	int			nBytes;
+ 	char	   *str = NULL;
+ 	TocEntry   *te;
+ 
+ 	for(;;)
+ 	{
+ 		command = getMessageFromMaster(AH, pipefd);
+ 
+ 		if (messageStartsWith(command, "DUMP "))
+ 		{
+ 			Assert(AH->format == archDirectory);
+ 			sscanf(command + strlen("DUMP "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("DUMP "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 
+ 			/*
+ 			 * Lock the table but with NOWAIT. Note that the parent is already
+ 			 * holding a lock. If we cannot acquire another ACCESS SHARE MODE
+ 			 * lock, then somebody else has requested an exclusive lock in the
+ 			 * meantime.  lockTableNoWait dies in this case to prevent a
+ 			 * deadlock.
+ 			 */
+ 			if (strcmp(te->desc, "BLOBS") != 0)
+ 				lockTableNoWait(AH, te);
+ 
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobDumpPtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(AH, pipefd, str);
+ 			free(str);
+ 		}
+ 		else if (messageStartsWith(command, "RESTORE "))
+ 		{
+ 			Assert(AH->format == archDirectory || AH->format == archCustom);
+ 			Assert(AH->connection != NULL);
+ 
+ 			sscanf(command + strlen("RESTORE "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("RESTORE "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobRestorePtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(AH, pipefd, str);
+ 			free(str);
+ 		}
+ 		else if (messageEquals(command, "TERMINATE"))
+ 		{
+ 			PQfinish(AH->connection);
+ 			AH->connection = NULL;
+ 			return;
+ 		}
+ 		else
+ 		{
+ 			die_horribly(AH, modulename,
+ 						 "unknown command on communication channel: %s", command);
+ 		}
+ 	}
+ }
+ 
+ /*
+  * Note the status change:
+  *
+  * DispatchJobForTocEntry		WRKR_IDLE -> WRKR_WORKING
+  * ListenToWorkers				WRKR_WORKING -> WRKR_FINISHED / WRKR_TERMINATED
+  * ReapWorkerStatus				WRKR_FINISHED -> WRKR_IDLE
+  *
+  * Calling ReapWorkerStatus() alone while all workers are working will not
+  * give you an idle worker: you first need to call ListenToWorkers(), which
+  * receives and processes the status (i.e. the result) of a worker's
+  * execution, and only then ReapWorkerStatus().
+  */
+ void
+ ListenToWorkers(ArchiveHandle *AH, ParallelState *pstate, bool do_wait)
+ {
+ 	int			worker;
+ 	char	   *msg;
+ 
+ 	msg = getMessageFromWorker(AH, pstate, do_wait, &worker);
+ 
+ 	if (!msg)
+ 	{
+ 		Assert(!do_wait);
+ 		return;
+ 	}
+ 
+ 	if (messageStartsWith(msg, "OK "))
+ 	{
+ 		char	   *statusString;
+ 		TocEntry   *te;
+ 
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_FINISHED;
+ 		te = pstate->parallelSlot[worker].args->te;
+ 		if (messageStartsWith(msg, "OK RESTORE "))
+ 		{
+ 			statusString = msg + strlen("OK RESTORE ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_RESTORE);
+ 		}
+ 		else if (messageStartsWith(msg, "OK DUMP "))
+ 		{
+ 			statusString = msg + strlen("OK DUMP ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_DUMP);
+ 		}
+ 		else
+ 			die_horribly(AH, modulename, "invalid message received from worker: %s", msg);
+ 	}
+ 	else if (messageStartsWith(msg, "ERROR "))
+ 	{
+ 		Assert(AH->format == archDirectory || AH->format == archCustom);
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
+ 		die_horribly(AH, modulename, "%s", msg + strlen("ERROR "));
+ 	}
+ 	else
+ 		die_horribly(AH, modulename, "invalid message received from worker: %s", msg);
+ 
+ 	PrintStatus(pstate);
+ 
+ 	/* both Unix and Win32 return pg_malloc()ed space, so we free it */
+ 	free(msg);
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * This function is used to get the status of a finished worker. If a worker
+  * has finished, its status is stored in *status, the id of the worker is
+  * returned, and the worker becomes idle again.
+  */
+ int
+ ReapWorkerStatus(ParallelState *pstate, int *status)
+ {
+ 	int			i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_FINISHED)
+ 		{
+ 			*status = pstate->parallelSlot[i].status;
+ 			pstate->parallelSlot[i].status = 0;
+ 			pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 			PrintStatus(pstate);
+ 			return i;
+ 		}
+ 	}
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It does not return until at least one worker process is idle.
+  */
+ void
+ EnsureIdleWorker(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int		ret_worker;
+ 	int		work_status;
+ 
+ 	for (;;)
+ 	{
+ 		int nTerm = 0;
+ 		while ((ret_worker = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
+ 		{
+ 			if (work_status != 0)
+ 				die_horribly(AH, modulename, "error processing a parallel work item\n");
+ 
+ 			nTerm++;
+ 		}
+ 
+ 		/*
+ 		 * We need to make sure that we have an idle worker before dispatching
+ 		 * the next item. If nTerm > 0, we already have one (quick check).
+ 		 */
+ 		if (nTerm > 0)
+ 			return;
+ 
+ 		/* explicit check for an idle worker */
+ 		if (GetIdleWorker(pstate) != NO_SLOT)
+ 			return;
+ 
+ 		/*
+ 		 * If we have no idle worker, read the result of one or more workers
+ 		 * and loop back to call ReapWorkerStatus() on them.
+ 		 */
+ 		ListenToWorkers(AH, pstate, true);
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It waits for all workers to terminate.
+  */
+ void
+ EnsureWorkersFinished(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int			work_status;
+ 
+ 	if (!pstate || pstate->numWorkers == 1)
+ 		return;
+ 
+ 	/* Wait for the remaining worker processes to finish */
+ 	while (!IsEveryWorkerIdle(pstate))
+ 	{
+ 		if (ReapWorkerStatus(pstate, &work_status) == NO_SLOT)
+ 			ListenToWorkers(AH, pstate, true);
+ 		else if (work_status != 0)
+ 			die_horribly(AH, modulename, "error processing a parallel work item\n");
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It returns the next message on the communication channel, blocking until it
+  * becomes available.
+  */
+ static char *
+ getMessageFromMaster(ArchiveHandle *AH, int pipefd[2])
+ {
+ 	return readMessageFromPipe(pipefd[PIPE_READ], true);
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It sends a message to the master on the communication channel.
+  */
+ static void
+ sendMessageToMaster(ArchiveHandle *AH, int pipefd[2], const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pipefd[PIPE_WRITE], str, len) != len)
+ 		die_horribly(AH, modulename,
+ 					 "could not write to the communication channel: %s",
+ 					 strerror(errno));
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It returns the next message from the worker on the communication channel,
+  * optionally blocking (do_wait) until it becomes available.
+  *
+  * The id of the worker is returned in *worker.
+  */
+ static char *
+ getMessageFromWorker(ArchiveHandle *AH, ParallelState *pstate, bool do_wait, int *worker)
+ {
+ 	int			i;
+ 	fd_set		workerset;
+ 	int			maxFd = -1;
+ 	struct		timeval nowait = { 0, 0 };
+ 
+ 	FD_ZERO(&workerset);
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_TERMINATED)
+ 			continue;
+ 		FD_SET(pstate->parallelSlot[i].pipeRead, &workerset);
+ 		/* actually WIN32 ignores the first parameter to select()... */
+ 		if (pstate->parallelSlot[i].pipeRead > maxFd)
+ 			maxFd = pstate->parallelSlot[i].pipeRead;
+ 	}
+ 
+ 	if (do_wait)
+ 	{
+ 		i = select_loop(maxFd, &workerset);
+ 		Assert(i != 0);
+ 	}
+ 	else
+ 	{
+ 		if ((i = select(maxFd + 1, &workerset, NULL, NULL, &nowait)) == 0)
+ 			return NULL;
+ 	}
+ 
+ #ifndef WIN32
+ 	if (wantAbort && !aborting)
+ 		die_horribly(AH, modulename, "terminated by user\n");
+ #endif
+ 
+ 	if (i < 0)
+ 	{
+ 		write_msg(NULL, "error in getMessageFromWorker(): %s", strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		char	   *msg;
+ 
+ 		if (!FD_ISSET(pstate->parallelSlot[i].pipeRead, &workerset))
+ 			continue;
+ 
+ 		msg = readMessageFromPipe(pstate->parallelSlot[i].pipeRead, false);
+ 		*worker = i;
+ 		return msg;
+ 	}
+ 	Assert(false);
+ 	return NULL;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It sends a message to a certain worker on the communication channel.
+  */
+ static void
+ sendMessageToWorker(ArchiveHandle *AH, ParallelState *pstate, int worker, const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pstate->parallelSlot[worker].pipeWrite, str, len) != len)
+ 		die_horribly(AH, modulename,
+ 					 "could not write to the communication channel: %s",
+ 					 strerror(errno));
+ }
+ 
+ /*
+  * The underlying function to read a message from the communication channel (fd)
+  * with optional blocking (do_wait).
+  */
+ static char *
+ readMessageFromPipe(int fd, bool do_wait)
+ {
+ 	char	   *msg;
+ 	int			msgsize, bufsize;
+ 	int			ret;
+ 
+ 	/*
+ 	 * The problem here is that we need to deal with several possibilities:
+ 	 * we could receive only a partial message, or several messages at once.
+ 	 * The caller expects us to return exactly one message, however.
+ 	 *
+ 	 * We could either read in as much as we can and keep track of what we
+ 	 * have already delivered to the caller, or we can read byte by byte:
+ 	 * once we see a (char) 0, we know that it is the message's end. The
+ 	 * latter is inefficient for larger amounts of data, but since we only
+ 	 * read on the command channel, the performance loss seems preferable to
+ 	 * the trouble of keeping internal state for each file descriptor.
+ 	 */
+ 
+ 	bufsize = 64;  /* could be any number */
+ 	msg = (char *) pg_malloc(bufsize);
+ 
+ 	msgsize = 0;
+ 	for (;;)
+ 	{
+ 		Assert(msgsize <= bufsize);
+ 		/*
+ 		 * If we do non-blocking read, only set the channel non-blocking for
+ 		 * the very first character. We trust in our messages to be
+ 		 * \0-terminated, so if there is any character in the beginning, then
+ 		 * we read the message until we find a \0 somewhere, which indicates
+ 		 * the end of the message.
+ 		 */
+ 		if (msgsize == 0 && !do_wait)
+ 			setnonblocking(fd);
+ 
+ 		ret = piperead(fd, msg + msgsize, 1);
+ 
+ 		if (msgsize == 0 && !do_wait)
+ 		{
+ 			int		saved_errno = errno;
+ 
+ 			setblocking(fd);
+ 			/* no data was available */
+ 			if (ret < 0 && saved_errno == EAGAIN)
+ 			{
+ 				free(msg);
+ 				return NULL;
+ 			}
+ 		}
+ 
+ 		/* worker has closed the connection or another error happened */
+ 		if (ret <= 0)
+ 		{
+ 			free(msg);
+ 			return NULL;
+ 		}
+ 
+ 		Assert(ret == 1);
+ 
+ 		if (msg[msgsize] == '\0')
+ 			return msg;
+ 
+ 		msgsize++;
+ 		if (msgsize == bufsize)
+ 		{
+ 			/* could be any number */
+ 			bufsize += 16;
+ 			msg = (char *) pg_realloc(msg, bufsize);
+ 		}
+ 	}
+ }
+ 
diff --git a/src/bin/pg_dump/parallel.h b/src/bin/pg_dump/parallel.h
index ...4c86b9b .
*** a/src/bin/pg_dump/parallel.h
--- b/src/bin/pg_dump/parallel.h
***************
*** 0 ****
--- 1,91 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.h
+  *
+  *	Parallel support header file for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ struct _archiveHandle;
+ struct _tocEntry;
+ 
+ typedef enum
+ {
+ 	WRKR_TERMINATED = 0,
+ 	WRKR_IDLE,
+ 	WRKR_WORKING,
+ 	WRKR_FINISHED
+ } T_WorkerStatus;
+ 
+ typedef enum _action
+ {
+ 	ACT_DUMP,
+ 	ACT_RESTORE
+ } T_Action;
+ 
+ /* Arguments needed for a worker process */
+ typedef struct _parallel_args
+ {
+ 	struct _archiveHandle *AH;
+ 	struct _tocEntry	  *te;
+ } ParallelArgs;
+ 
+ /* State for each parallel activity slot */
+ typedef struct _parallel_slot
+ {
+ 	ParallelArgs	   *args;
+ 	T_WorkerStatus		workerStatus;
+ 	int					status;
+ 	int					pipeRead;
+ 	int					pipeWrite;
+ 	int					pipeRevRead;
+ 	int					pipeRevWrite;
+ #ifdef WIN32
+ 	uintptr_t			hThread;
+ 	unsigned int		threadId;
+ #else
+ 	pid_t				pid;
+ #endif
+ 	bool				inErrorHandling;
+ } ParallelSlot;
+ 
+ #define NO_SLOT (-1)
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelSlot *parallelSlot;
+ } ParallelState;
+ 
+ extern int GetIdleWorker(ParallelState *pstate);
+ extern bool IsEveryWorkerIdle(ParallelState *pstate);
+ extern void ListenToWorkers(struct _archiveHandle *AH, ParallelState *pstate, bool do_wait);
+ extern int ReapWorkerStatus(ParallelState *pstate, int *status);
+ extern void EnsureIdleWorker(struct _archiveHandle *AH, ParallelState *pstate);
+ extern void EnsureWorkersFinished(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern ParallelState *ParallelBackupStart(struct _archiveHandle *AH,
+ 										  RestoreOptions *ropt);
+ extern void DispatchJobForTocEntry(struct _archiveHandle *AH,
+ 								   ParallelState *pstate,
+ 								   struct _tocEntry *te, T_Action act);
+ extern void ParallelBackupEnd(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern void (* volatile vparallel_error_handler)(struct _archiveHandle *AH,
+ 									const char *modulename,
+ 									const char *fmt, va_list ap);
+ 
+ extern void checkAborting(struct _archiveHandle *AH);
+ 
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 8926488..767f865 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** typedef struct _Archive
*** 90,95 ****
--- 90,97 ----
  	int			minRemoteVersion;		/* allowable range */
  	int			maxRemoteVersion;
  
+ 	int			numWorkers;		/* number of parallel processes */
+ 
  	/* info needed for string escaping */
  	int			encoding;		/* libpq code for client_encoding */
  	bool		std_strings;	/* standard_conforming_strings */
*************** typedef struct _restoreOptions
*** 150,156 ****
  	int			suppressDumpWarnings;	/* Suppress output of WARNING entries
  										 * to stderr */
  	bool		single_txn;
- 	int			number_of_jobs;
  
  	bool	   *idWanted;		/* array showing which dump IDs to emit */
  } RestoreOptions;
--- 152,157 ----
*************** typedef struct _restoreOptions
*** 162,180 ****
  
  /* Lets the archive know we have a DB connection to shutdown if it dies */
  
! PGconn *ConnectDatabase(Archive *AH,
  				const char *dbname,
  				const char *pghost,
  				const char *pgport,
  				const char *username,
  				enum trivalue prompt_password);
  
  /* Called to add a TOC entry */
  extern void ArchiveEntry(Archive *AHX,
  			 CatalogId catalogId, DumpId dumpId,
  			 const char *tag,
  			 const char *namespace, const char *tablespace,
! 			 const char *owner, bool withOids,
  			 const char *desc, teSection section,
  			 const char *defn,
  			 const char *dropStmt, const char *copyStmt,
--- 163,183 ----
  
  /* Lets the archive know we have a DB connection to shutdown if it dies */
  
! PGconn *ConnectDatabase(Archive *AHX,
  				const char *dbname,
  				const char *pghost,
  				const char *pgport,
  				const char *username,
  				enum trivalue prompt_password);
+ PGconn *CloneDatabaseConnection(Archive *AHX);
  
  /* Called to add a TOC entry */
  extern void ArchiveEntry(Archive *AHX,
  			 CatalogId catalogId, DumpId dumpId,
  			 const char *tag,
  			 const char *namespace, const char *tablespace,
! 			 const char *owner,
! 			 unsigned long int relpages, bool withOids,
  			 const char *desc, teSection section,
  			 const char *defn,
  			 const char *dropStmt, const char *copyStmt,
*************** extern void PrintTOCSummary(Archive *AH,
*** 203,208 ****
--- 206,214 ----
  
  extern RestoreOptions *NewRestoreOptions(void);
  
+ /* We have one in pg_dump.c and another one in pg_restore.c */
+ extern void _SetupWorker(Archive *AHX, RestoreOptions *ropt);
+ 
  /* Rearrange and filter TOC entries */
  extern void SortTocFromFile(Archive *AHX, RestoreOptions *ropt);
  extern void InitDummyWantedList(Archive *AHX, RestoreOptions *ropt);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 234e50f..0c81dfe 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 23,82 ****
  #include "pg_backup_db.h"
  #include "dumpmem.h"
  #include "dumputils.h"
  
  #include <ctype.h>
  #include <unistd.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  
  #ifdef WIN32
  #include <io.h>
  #endif
  
  #include "libpq/libpq-fs.h"
  
- /*
-  * Special exit values from worker children.  We reserve 0 for normal
-  * success; 1 and other small values should be interpreted as crashes.
-  */
- #define WORKER_CREATE_DONE		10
- #define WORKER_INHIBIT_DATA		11
- #define WORKER_IGNORED_ERRORS	12
- 
- /*
-  * Unix uses exit to return result from worker child, so function is void.
-  * Windows thread result comes via function return.
-  */
- #ifndef WIN32
- #define parallel_restore_result void
- #else
- #define parallel_restore_result DWORD
- #endif
- 
- /* IDs for worker children are either PIDs or thread handles */
- #ifndef WIN32
- #define thandle pid_t
- #else
- #define thandle HANDLE
- #endif
- 
- /* Arguments needed for a worker child */
- typedef struct _restore_args
- {
- 	ArchiveHandle *AH;
- 	TocEntry   *te;
- } RestoreArgs;
- 
- /* State for each parallel activity slot */
- typedef struct _parallel_slot
- {
- 	thandle		child_id;
- 	RestoreArgs *args;
- } ParallelSlot;
- 
- #define NO_SLOT (-1)
- 
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
  #define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
  
--- 23,46 ----
  #include "pg_backup_db.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  #include <ctype.h>
+ #include <fcntl.h>
  #include <unistd.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  
+ static const char *modulename = gettext_noop("archiver");
+ 
+ 
  #ifdef WIN32
  #include <io.h>
  #endif
  
  #include "libpq/libpq-fs.h"
  
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
  #define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
  
*************** typedef struct _outputContext
*** 87,94 ****
  	int			gzOut;
  } OutputContext;
  
- static const char *modulename = gettext_noop("archiver");
- 
  /* index array created by fix_dependencies -- only used in parallel restore */
  static TocEntry **tocsByDumpId; /* index by dumpId - 1 */
  static DumpId maxDumpId;		/* length of above array */
--- 51,56 ----
*************** static teReqs _tocEntryRequired(TocEntry
*** 115,121 ****
  static bool _tocEntryIsACL(TocEntry *te);
  static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
- static TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  static void _moveBefore(ArchiveHandle *AH, TocEntry *pos, TocEntry *te);
  static int	_discoverArchiveFormat(ArchiveHandle *AH);
  
--- 77,82 ----
*************** static void RestoreOutput(ArchiveHandle
*** 132,152 ****
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_parallel(ArchiveHandle *AH);
! static thandle spawn_restore(RestoreArgs *args);
! static thandle reap_child(ParallelSlot *slots, int n_slots, int *work_status);
! static bool work_in_progress(ParallelSlot *slots, int n_slots);
! static int	get_next_slot(ParallelSlot *slots, int n_slots);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots);
! static parallel_restore_result parallel_restore(RestoreArgs *args);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
--- 93,111 ----
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_prefork(ArchiveHandle *AH);
! static void restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
! 										 TocEntry *pending_list);
! static void restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelState *pstate);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
*************** static void reduce_dependencies(ArchiveH
*** 156,164 ****
  					TocEntry *ready_list);
  static void mark_create_done(ArchiveHandle *AH, TocEntry *te);
  static void inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te);
- static ArchiveHandle *CloneArchive(ArchiveHandle *AH);
- static void DeCloneArchive(ArchiveHandle *AH);
- 
  
  /*
   *	Wrapper functions.
--- 115,120 ----
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 245,251 ****
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (ropt->number_of_jobs > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
--- 201,207 ----
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (AH->public.numWorkers > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 411,417 ****
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 		restore_toc_entries_parallel(AH);
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
--- 367,391 ----
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 	{
! 		ParallelState  *pstate;
! 		TocEntry		pending_list;
! 
! 		par_list_header_init(&pending_list);
! 
! 		/* This runs PRE_DATA items and then disconnects from the database */
! 		restore_toc_entries_prefork(AH);
! 		Assert(AH->connection == NULL);
! 
! 		/* ParallelBackupStart() will actually fork the processes */
! 		pstate = ParallelBackupStart(AH, ropt);
! 		restore_toc_entries_parallel(AH, pstate, &pending_list);
! 		ParallelBackupEnd(AH, pstate);
! 
! 		/* reconnect the master and see if we missed something */
! 		restore_toc_entries_postfork(AH, &pending_list);
! 		Assert(AH->connection != NULL);
! 	}
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
*************** static int
*** 476,482 ****
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			retval = 0;
  	teReqs		reqs;
  	bool		defnDumped;
  
--- 450,456 ----
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			status = WORKER_OK;
  	teReqs		reqs;
  	bool		defnDumped;
  
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 518,524 ****
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						retval = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
--- 492,498 ----
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						status = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 533,539 ****
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					retval = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
--- 507,513 ----
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					status = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 651,657 ****
  		}
  	}
  
! 	return retval;
  }
  
  /*
--- 625,634 ----
  		}
  	}
  
! 	if (AH->public.n_errors > 0 && status == WORKER_OK)
! 		status = WORKER_IGNORED_ERRORS;
! 
! 	return status;
  }
  
  /*
*************** ArchiveEntry(Archive *AHX,
*** 753,759 ****
  			 const char *tag,
  			 const char *namespace,
  			 const char *tablespace,
! 			 const char *owner, bool withOids,
  			 const char *desc, teSection section,
  			 const char *defn,
  			 const char *dropStmt, const char *copyStmt,
--- 730,737 ----
  			 const char *tag,
  			 const char *namespace,
  			 const char *tablespace,
! 			 const char *owner,
! 			 unsigned long int relpages, bool withOids,
  			 const char *desc, teSection section,
  			 const char *defn,
  			 const char *dropStmt, const char *copyStmt,
*************** static void
*** 1429,1434 ****
--- 1407,1421 ----
  vdie_horribly(ArchiveHandle *AH, const char *modulename,
  			  const char *fmt, va_list ap)
  {
+ 	/*
+ 	 * If an error handler is set for the parallel operation, control
+ 	 * will not return from it.
+ 	 */
+ 	if (vparallel_error_handler)
+ 		vparallel_error_handler(AH, modulename, fmt, ap);
+ 
+ 	Assert(!vparallel_error_handler);
+ 
  	vwrite_msg(modulename, fmt, ap);
  
  	if (AH)
*************** _moveBefore(ArchiveHandle *AH, TocEntry
*** 1534,1540 ****
  	pos->prev = te;
  }
  
! static TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
--- 1521,1527 ----
  	pos->prev = te;
  }
  
! TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
*************** _allocAH(const char *FileSpec, const Arc
*** 1944,1949 ****
--- 1931,1938 ----
  
  	AH->archiveDumpVersion = PG_VERSION;
  
+ 	AH->is_clone = false;
+ 
  	AH->createDate = time(NULL);
  
  	AH->intSize = sizeof(int);
*************** _allocAH(const char *FileSpec, const Arc
*** 2035,2082 ****
  
  
  void
! WriteDataChunks(ArchiveHandle *AH)
  {
  	TocEntry   *te;
- 	StartDataPtr startPtr;
- 	EndDataPtr	endPtr;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (te->dataDumper != NULL)
! 		{
! 			AH->currToc = te;
! 			/* printf("Writing data for %d (%x)\n", te->id, te); */
! 
! 			if (strcmp(te->desc, "BLOBS") == 0)
! 			{
! 				startPtr = AH->StartBlobsPtr;
! 				endPtr = AH->EndBlobsPtr;
! 			}
! 			else
! 			{
! 				startPtr = AH->StartDataPtr;
! 				endPtr = AH->EndDataPtr;
! 			}
! 
! 			if (startPtr != NULL)
! 				(*startPtr) (AH, te);
  
  			/*
! 			 * printf("Dumper arg for %d is %x\n", te->id, te->dataDumperArg);
  			 */
  
! 			/*
! 			 * The user-provided DataDumper routine needs to call
! 			 * AH->WriteData
! 			 */
! 			(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
  
! 			if (endPtr != NULL)
! 				(*endPtr) (AH, te);
! 			AH->currToc = NULL;
! 		}
  	}
  }
  
  void
--- 2024,2088 ----
  
  
  void
! WriteDataChunks(ArchiveHandle *AH, ParallelState *pstate)
  {
  	TocEntry   *te;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (!te->hadDumper)
! 			continue;
  
+ 		if (pstate && pstate->numWorkers > 1)
+ 		{
  			/*
! 			 * If we are in a parallel backup, then we are always the master
! 			 * process.
  			 */
+ 			EnsureIdleWorker(AH, pstate);
+ 			Assert(GetIdleWorker(pstate) != NO_SLOT);
+ 			DispatchJobForTocEntry(AH, pstate, te, ACT_DUMP);
+ 		}
+ 		else
+ 		{
+ 			WriteDataChunksForTocEntry(AH, te);
+ 		}
+ 	}
+ 	EnsureWorkersFinished(AH, pstate);
+ }
  
! void
! WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te)
! {
! 	StartDataPtr startPtr;
! 	EndDataPtr	endPtr;
  
! 	AH->currToc = te;
! 
! 	if (strcmp(te->desc, "BLOBS") == 0)
! 	{
! 		startPtr = AH->StartBlobsPtr;
! 		endPtr = AH->EndBlobsPtr;
! 	}
! 	else
! 	{
! 		startPtr = AH->StartDataPtr;
! 		endPtr = AH->EndDataPtr;
  	}
+ 
+ 	if (startPtr != NULL)
+ 		(*startPtr) (AH, te);
+ 
+ 	/*
+ 	 * The user-provided DataDumper routine needs to call
+ 	 * AH->WriteData
+ 	 */
+ 	(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
+ 
+ 	if (endPtr != NULL)
+ 		(*endPtr) (AH, te);
+ 
+ 	AH->currToc = NULL;
  }
  
  void
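In the parallel branch above, the master hands each dumpable TOC entry to an idle worker (`EnsureIdleWorker()` / `GetIdleWorker()`). A sketch of the slot bookkeeping such a master might use — the names and the status enum are illustrative, since the patch's parallel.c is not part of this excerpt:

```c
#include <assert.h>
#include <stdbool.h>

#define NO_SLOT (-1)

typedef enum
{
	WRKR_IDLE,
	WRKR_WORKING,
	WRKR_TERMINATED
} T_WorkerStatus;

/* Return the index of the first idle worker, or NO_SLOT if all are busy. */
static int
get_idle_worker(const T_WorkerStatus *slots, int numWorkers)
{
	int			i;

	for (i = 0; i < numWorkers; i++)
		if (slots[i] == WRKR_IDLE)
			return i;
	return NO_SLOT;
}

/* True when no worker is running anything, i.e. all work has been reaped. */
static bool
every_worker_idle(const T_WorkerStatus *slots, int numWorkers)
{
	int			i;

	for (i = 0; i < numWorkers; i++)
		if (slots[i] == WRKR_WORKING)
			return false;
	return true;
}
```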
*************** WriteToc(ArchiveHandle *AH)
*** 2086,2093 ****
  	char		workbuf[32];
  	int			i;
  
- 	/* printf("%d TOC Entries to save\n", AH->tocCount); */
- 
  	WriteInt(AH, AH->tocCount);
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
--- 2092,2097 ----
*************** dumpTimestamp(ArchiveHandle *AH, const c
*** 3239,3274 ****
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
- 
- /*
-  * Main engine for parallel restore.
-  *
-  * Work is done in three phases.
-  * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
-  * just as for a standard restore.	Second we process the remaining non-ACL
-  * steps in parallel worker children (threads on Windows, processes on Unix),
-  * each of which connects separately to the database.  Finally we process all
-  * the ACL entries in a single connection (that happens back in
-  * RestoreArchive).
-  */
  static void
! restore_toc_entries_parallel(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
- 	int			n_slots = ropt->number_of_jobs;
- 	ParallelSlot *slots;
- 	int			work_status;
- 	int			next_slot;
  	bool		skipped_some;
- 	TocEntry	pending_list;
- 	TocEntry	ready_list;
  	TocEntry   *next_work_item;
- 	thandle		ret_child;
- 	TocEntry   *te;
  
! 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(sizeof(ParallelSlot), n_slots);
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3243,3264 ----
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
  static void
! restore_toc_entries_prefork(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
  	bool		skipped_some;
  	TocEntry   *next_work_item;
  
! 	ahlog(AH, 2, "entering restore_toc_entries_prefork\n");
  
! 	/* we haven't got round to making this work for all archive formats */
! 	if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 		die_horribly(AH, modulename, "parallel restore is not supported with this archive file format\n");
! 
! 	/* doesn't work if the archive represents dependencies as OIDs, either */
! 	if (AH->version < K_VERS_1_8)
! 		die_horribly(AH, modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3336,3352 ****
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
  
  	/*
! 	 * Initialize the lists of pending and ready items.  After this setup, the
! 	 * pending list is everything that needs to be done but is blocked by one
! 	 * or more dependencies, while the ready list contains items that have no
! 	 * remaining dependencies.	Note: we don't yet filter out entries that
! 	 * aren't going to be restored.  They might participate in dependency
! 	 * chains connecting entries that should be restored, so we treat them as
! 	 * live until we actually process them.
  	 */
- 	par_list_header_init(&pending_list);
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
--- 3326,3367 ----
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
+ }
+ 
+ /*
+  * Main engine for parallel restore.
+  *
+  * Work is done in three phases.
+  * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
+  * just as for a standard restore. This is done in restore_toc_entries_prefork().
+  * Second we process the remaining non-ACL steps in parallel worker children
+  * (threads on Windows, processes on Unix); these are forked off and have
+  * set up their connections before restore_toc_entries_parallel() runs.
+  * Finally we process all the ACL entries in a single connection (that happens
+  * back in RestoreArchive).
+  */
+ static void
+ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
+ 							 TocEntry *pending_list)
+ {
+ 	int			work_status;
+ 	bool		skipped_some;
+ 	TocEntry	ready_list;
+ 	TocEntry   *next_work_item;
+ 	int			ret_child;
+ 
+ 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
  	/*
! 	 * Initialize the list of ready items; the pending list has already
! 	 * been initialized by the caller.  After this setup, the pending
! 	 * list is everything that needs to be done but is blocked by one or more
! 	 * dependencies, while the ready list contains items that have no remaining
! 	 * dependencies. Note: we don't yet filter out entries that aren't going
! 	 * to be restored. They might participate in dependency chains connecting
! 	 * entries that should be restored, so we treat them as live until we
! 	 * actually process them.
  	 */
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
*************** restore_toc_entries_parallel(ArchiveHand
*** 3371,3377 ****
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(&pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
--- 3386,3392 ----
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
*************** restore_toc_entries_parallel(ArchiveHand
*** 3385,3393 ****
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list,
! 												slots, n_slots)) != NULL ||
! 		   work_in_progress(slots, n_slots))
  	{
  		if (next_work_item != NULL)
  		{
--- 3400,3407 ----
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list, pstate)) != NULL ||
! 		   !IsEveryWorkerIdle(pstate))
  	{
  		if (next_work_item != NULL)
  		{
*************** restore_toc_entries_parallel(ArchiveHand
*** 3407,3461 ****
  				continue;
  			}
  
! 			if ((next_slot = get_next_slot(slots, n_slots)) != NO_SLOT)
! 			{
! 				/* There is work still to do and a worker slot available */
! 				thandle		child;
! 				RestoreArgs *args;
! 
! 				ahlog(AH, 1, "launching item %d %s %s\n",
! 					  next_work_item->dumpId,
! 					  next_work_item->desc, next_work_item->tag);
  
! 				par_list_remove(next_work_item);
  
! 				/* this memory is dealloced in mark_work_done() */
! 				args = pg_malloc(sizeof(RestoreArgs));
! 				args->AH = CloneArchive(AH);
! 				args->te = next_work_item;
  
! 				/* run the step in a worker child */
! 				child = spawn_restore(args);
  
! 				slots[next_slot].child_id = child;
! 				slots[next_slot].args = args;
  
! 				continue;
  			}
- 		}
  
! 		/*
! 		 * If we get here there must be work being done.  Either there is no
! 		 * work available to schedule (and work_in_progress returned true) or
! 		 * there are no slots available.  So we wait for a worker to finish,
! 		 * and process the result.
! 		 */
! 		ret_child = reap_child(slots, n_slots, &work_status);
  
! 		if (WIFEXITED(work_status))
! 		{
! 			mark_work_done(AH, &ready_list,
! 						   ret_child, WEXITSTATUS(work_status),
! 						   slots, n_slots);
! 		}
! 		else
! 		{
! 			die_horribly(AH, modulename, "worker process crashed: status %d\n",
! 						 work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
  	 * Now reconnect the single parent connection.
--- 3421,3493 ----
  				continue;
  			}
  
! 			ahlog(AH, 1, "launching item %d %s %s\n",
! 				  next_work_item->dumpId,
! 				  next_work_item->desc, next_work_item->tag);
  
! 			par_list_remove(next_work_item);
  
! 			Assert(GetIdleWorker(pstate) != NO_SLOT);
! 			DispatchJobForTocEntry(AH, pstate, next_work_item, ACT_RESTORE);
! 		}
! 		else
! 		{
! 			/* at least one child is working and we have nothing ready. */
! 			Assert(!IsEveryWorkerIdle(pstate));
! 		}
  
! 		for (;;)
! 		{
! 			int nTerm = 0;
  
! 			/*
! 			 * In order to reduce dependency counts as early as possible, and
! 			 * in particular to reap the status of workers whose finished
! 			 * items unblock pending items, we first do a non-blocking check
! 			 * for terminated workers.
! 			 *
! 			 * However, if we do not have any other work items currently that
! 			 * workers can work on, we do not busy-loop here but instead
! 			 * really wait for at least one worker to terminate. Hence we call
! 			 * ListenToWorkers(..., ..., do_wait = true) in this case.
! 			 */
! 			ListenToWorkers(AH, pstate, !next_work_item);
  
! 			while ((ret_child = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
! 			{
! 				nTerm++;
! 				mark_work_done(AH, &ready_list, ret_child, work_status, pstate);
  			}
  
! 			/*
! 			 * We need to make sure that we have an idle worker before
! 			 * re-running the outer loop. If nTerm > 0, at least one worker
! 			 * just became idle (quick check).
! 			 */
! 			if (nTerm > 0)
! 				break;
  
! 			/* if nobody terminated, explicitly check for an idle worker */
! 			if (GetIdleWorker(pstate) != NO_SLOT)
! 				break;
! 
! 			/*
! 			 * If we have no idle worker, read the result of one or more
! 			 * workers and loop back to call ReapWorkerStatus() on them.
! 			 */
! 			ListenToWorkers(AH, pstate, true);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
+ }
+ 
+ static void
+ restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list)
+ {
+ 	RestoreOptions *ropt = AH->ropt;
+ 	TocEntry   *te;
+ 
+ 	ahlog(AH, 2, "entering restore_toc_entries_postfork\n");
  
  	/*
  	 * Now reconnect the single parent connection.
*************** restore_toc_entries_parallel(ArchiveHand
*** 3471,3477 ****
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list.par_next; te != &pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
--- 3503,3509 ----
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list->par_next; te != pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3482,3602 ****
  }
  
  /*
-  * create a worker child to perform a restore step in parallel
-  */
- static thandle
- spawn_restore(RestoreArgs *args)
- {
- 	thandle		child;
- 
- 	/* Ensure stdio state is quiesced before forking */
- 	fflush(NULL);
- 
- #ifndef WIN32
- 	child = fork();
- 	if (child == 0)
- 	{
- 		/* in child process */
- 		parallel_restore(args);
- 		die_horribly(args->AH, modulename,
- 					 "parallel_restore should not return\n");
- 	}
- 	else if (child < 0)
- 	{
- 		/* fork failed */
- 		die_horribly(args->AH, modulename,
- 					 "could not create worker process: %s\n",
- 					 strerror(errno));
- 	}
- #else
- 	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
- 									args, 0, NULL);
- 	if (child == 0)
- 		die_horribly(args->AH, modulename,
- 					 "could not create worker thread: %s\n",
- 					 strerror(errno));
- #endif
- 
- 	return child;
- }
- 
- /*
-  *	collect status from a completed worker child
-  */
- static thandle
- reap_child(ParallelSlot *slots, int n_slots, int *work_status)
- {
- #ifndef WIN32
- 	/* Unix is so much easier ... */
- 	return wait(work_status);
- #else
- 	static HANDLE *handles = NULL;
- 	int			hindex,
- 				snum,
- 				tnum;
- 	thandle		ret_child;
- 	DWORD		res;
- 
- 	/* first time around only, make space for handles to listen on */
- 	if (handles == NULL)
- 		handles = (HANDLE *) pg_calloc(sizeof(HANDLE), n_slots);
- 
- 	/* set up list of handles to listen to */
- 	for (snum = 0, tnum = 0; snum < n_slots; snum++)
- 		if (slots[snum].child_id != 0)
- 			handles[tnum++] = slots[snum].child_id;
- 
- 	/* wait for one to finish */
- 	hindex = WaitForMultipleObjects(tnum, handles, false, INFINITE);
- 
- 	/* get handle of finished thread */
- 	ret_child = handles[hindex - WAIT_OBJECT_0];
- 
- 	/* get the result */
- 	GetExitCodeThread(ret_child, &res);
- 	*work_status = res;
- 
- 	/* dispose of handle to stop leaks */
- 	CloseHandle(ret_child);
- 
- 	return ret_child;
- #endif
- }
- 
- /*
-  * are we doing anything now?
-  */
- static bool
- work_in_progress(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id != 0)
- 			return true;
- 	}
- 	return false;
- }
- 
- /*
-  * find the first free parallel slot (if any).
-  */
- static int
- get_next_slot(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id == 0)
- 			return i;
- 	}
- 	return NO_SLOT;
- }
- 
- 
- /*
   * Check if te1 has an exclusive lock requirement for an item that te2 also
   * requires, whether or not te2's requirement is for an exclusive lock.
   */
--- 3514,3519 ----
*************** par_list_remove(TocEntry *te)
*** 3669,3675 ****
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
--- 3586,3592 ----
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelState *pstate)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3684,3694 ****
  	{
  		int			count = 0;
  
! 		for (k = 0; k < n_slots; k++)
! 			if (slots[k].args->te != NULL &&
! 				slots[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (n_slots == 0 || count * 4 < n_slots)
  			pref_non_data = false;
  	}
  
--- 3601,3611 ----
  	{
  		int			count = 0;
  
! 		for (k = 0; k < pstate->numWorkers; k++)
! 			if (pstate->parallelSlot[k].args->te != NULL &&
! 				pstate->parallelSlot[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (pstate->numWorkers == 0 || count * 4 < pstate->numWorkers)
  			pref_non_data = false;
  	}
  
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3704,3716 ****
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < n_slots && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (slots[i].args == NULL)
  				continue;
! 			running_te = slots[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
--- 3621,3633 ----
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < pstate->numWorkers && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (pstate->parallelSlot[i].workerStatus != WRKR_WORKING)
  				continue;
! 			running_te = pstate->parallelSlot[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3745,3805 ****
  /*
   * Restore a single TOC item in parallel with others
   *
!  * this is the procedure run as a thread (Windows) or a
!  * separate process (everything else).
   */
! static parallel_restore_result
! parallel_restore(RestoreArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			retval;
! 
! 	/*
! 	 * Close and reopen the input file so we have a private file pointer that
! 	 * doesn't stomp on anyone else's file pointer, if we're actually going to
! 	 * need to read from the file. Otherwise, just close it except on Windows,
! 	 * where it will possibly be needed by other threads.
! 	 *
! 	 * Note: on Windows, since we are using threads not processes, the reopen
! 	 * call *doesn't* close the original file pointer but just open a new one.
! 	 */
! 	if (te->section == SECTION_DATA)
! 		(AH->ReopenPtr) (AH);
! #ifndef WIN32
! 	else
! 		(AH->ClosePtr) (AH);
! #endif
! 
! 	/*
! 	 * We need our own database connection, too
! 	 */
! 	ConnectDatabase((Archive *) AH, ropt->dbname,
! 					ropt->pghost, ropt->pgport, ropt->username,
! 					ropt->promptPassword);
  
  	_doSetFixedOutputState(AH);
  
! 	/* Restore the TOC item */
! 	retval = restore_toc_entry(AH, te, ropt, true);
! 
! 	/* And clean up */
! 	PQfinish(AH->connection);
! 	AH->connection = NULL;
  
! 	/* If we reopened the file, we are done with it, so close it now */
! 	if (te->section == SECTION_DATA)
! 		(AH->ClosePtr) (AH);
  
! 	if (retval == 0 && AH->public.n_errors)
! 		retval = WORKER_IGNORED_ERRORS;
  
! #ifndef WIN32
! 	exit(retval);
! #else
! 	return retval;
! #endif
  }
  
  
--- 3662,3690 ----
  /*
   * Restore a single TOC item in parallel with others
   *
!  * this is run in the worker, i.e. in a thread (Windows) or a separate process
!  * (everything else). A worker process executes several such work items during
!  * a parallel backup or restore. Once we terminate here and report back that
!  * our work is finished, the master process will assign us a new work item.
   */
! int
! parallel_restore(ParallelArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			status;
  
  	_doSetFixedOutputState(AH);
  
! 	Assert(AH->connection != NULL);
  
! 	AH->public.n_errors = 0;
  
! 	/* Restore the TOC item */
! 	status = restore_toc_entry(AH, te, ropt, true);
  
! 	return status;
  }
  
  
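As the new comment on parallel_restore() says, a worker now executes several work items per run instead of exiting after each one. A simplified, self-contained sketch of that lifecycle — the job array, callback, and `demo_run` failure rule are purely illustrative, since the real worker receives items over a pipe from the master:

```c
#include <assert.h>

#define WORKER_OK 0

typedef int (*RunItemFn) (int dumpId);

/*
 * Process every assigned item, record its status, and return how many
 * items did not finish with WORKER_OK.  In the real patch each status is
 * reported back to the master, which then assigns the next item.
 */
static int
worker_loop(const int *jobs, int njobs, RunItemFn run, int *statuses)
{
	int			i;
	int			nfail = 0;

	for (i = 0; i < njobs; i++)
	{
		statuses[i] = run(jobs[i]);
		if (statuses[i] != WORKER_OK)
			nfail++;
	}
	return nfail;
}

/* Example job: pretend even-numbered dump ids fail (illustrative only). */
static int
demo_run(int dumpId)
{
	return (dumpId % 2 == 0) ? 11 : WORKER_OK;
}
```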
*************** parallel_restore(RestoreArgs *args)
*** 3811,3835 ****
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots)
  {
  	TocEntry   *te = NULL;
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id == worker)
- 		{
- 			slots[i].child_id = 0;
- 			te = slots[i].args->te;
- 			DeCloneArchive(slots[i].args->AH);
- 			free(slots[i].args);
- 			slots[i].args = NULL;
  
! 			break;
! 		}
! 	}
  
  	if (te == NULL)
  		die_horribly(AH, modulename, "could not find slot of finished worker\n");
--- 3696,3707 ----
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate)
  {
  	TocEntry   *te = NULL;
  
! 	te = pstate->parallelSlot[worker].args->te;
  
  	if (te == NULL)
  		die_horribly(AH, modulename, "could not find slot of finished worker\n");
*************** inhibit_data_for_failed_table(ArchiveHan
*** 4184,4193 ****
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
-  *
-  * These could be public, but no need at present.
   */
! static ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
--- 4056,4063 ----
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
   */
! ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
*************** CloneArchive(ArchiveHandle *AH)
*** 4213,4221 ****
--- 4083,4145 ----
  	/* clone has its own error count, too */
  	clone->public.n_errors = 0;
  
+ 	/*
+ 	 * Remember that we're a clone; this is used when deciding whether to
+ 	 * install a synchronized snapshot.
+ 	 */
+ 	clone->is_clone = true;
+ 
+ 	/*
+ 	 * Connect our new clone object to the database:
+ 	 * In a parallel restore the parent has already disconnected; in a
+ 	 * parallel backup we open a new connection with the parent's parameters.
+ 	 */
+ 	if (AH->ropt)
+ 	{
+ 		RestoreOptions *ropt = AH->ropt;
+ 		Assert(AH->connection == NULL);
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, ropt->dbname,
+ 					ropt->pghost, ropt->pgport, ropt->username,
+ 					ropt->promptPassword);
+ 	}
+ 	else
+ 	{
+ 		char	   *dbname;
+ 		char	   *pghost;
+ 		char	   *pgport;
+ 		char	   *username;
+ 		const char *encname;
+ 
+ 		Assert(AH->connection != NULL);
+ 
+ 		/*
+ 		 * Even though we are technically accessing the parent's connection
+ 		 * object here, these calls are safe: they merely return pointers to
+ 		 * data cached in the PGconn and do not send anything to, or receive
+ 		 * anything from, the database.
+ 		 */
+ 		dbname = PQdb(AH->connection);
+ 		pghost = PQhost(AH->connection);
+ 		pgport = PQport(AH->connection);
+ 		username = PQuser(AH->connection);
+ 		encname = pg_encoding_to_char(AH->public.encoding);
+ 
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, dbname, pghost, pgport, username, TRI_NO);
+ 
+ 		/*
+ 		 * Set the same client encoding.  Since the value came straight from
+ 		 * pg_encoding_to_char(), setting that very same value should not
+ 		 * fail.  Also see the comment in SetupConnection().
+ 		 */
+ 		PQsetClientEncoding(clone->connection, encname);
+ 	}
+ 
  	/* Let the format-specific code have a chance too */
  	(clone->ClonePtr) (clone);
  
+ 	Assert(clone->connection != NULL);
  	return clone;
  }
  
*************** CloneArchive(ArchiveHandle *AH)
*** 4224,4230 ****
   *
   * Note: we assume any clone-local connection was already closed.
   */
! static void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
--- 4148,4154 ----
   *
   * Note: we assume any clone-local connection was already closed.
   */
! void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 6dd5158..3b10384 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** typedef z_stream *z_streamp;
*** 100,108 ****
--- 100,120 ----
  #define K_OFFSET_POS_SET 2
  #define K_OFFSET_NO_DATA 3
  
+ /*
+  * Special exit values from worker children.  We reserve 0 for normal
+  * success; 1 and other small values should be interpreted as crashes.
+  */
+ #define WORKER_OK                     0
+ #define WORKER_CREATE_DONE            10
+ #define WORKER_INHIBIT_DATA           11
+ #define WORKER_IGNORED_ERRORS         12
+ 
  struct _archiveHandle;
  struct _tocEntry;
  struct _restoreList;
+ struct _parallel_args;
+ struct _parallel_state;
+ enum _action;
  
  typedef void (*ClosePtr) (struct _archiveHandle * AH);
  typedef void (*ReopenPtr) (struct _archiveHandle * AH);
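The WORKER_* values above are the per-item status codes a worker reports back; anything else means the child crashed. A small sketch of how a master-side classifier might read them — the helper function is hypothetical, the defines are copied from the hunk above:

```c
#include <assert.h>
#include <string.h>

/* Copied from the pg_backup_archiver.h hunk above. */
#define WORKER_OK             0
#define WORKER_CREATE_DONE    10
#define WORKER_INHIBIT_DATA   11
#define WORKER_IGNORED_ERRORS 12

/* Hypothetical helper: classify a worker's per-item status for logging. */
static const char *
worker_status_name(int status)
{
	switch (status)
	{
		case WORKER_OK:
			return "ok";
		case WORKER_CREATE_DONE:
			return "create done";
		case WORKER_INHIBIT_DATA:
			return "inhibit data";
		case WORKER_IGNORED_ERRORS:
			return "ignored errors";
		default:
			/* 1 and other small values are treated as crashes */
			return "crashed";
	}
}
```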
*************** typedef void (*PrintTocDataPtr) (struct
*** 130,135 ****
--- 142,154 ----
  typedef void (*ClonePtr) (struct _archiveHandle * AH);
  typedef void (*DeClonePtr) (struct _archiveHandle * AH);
  
+ typedef char *(*WorkerJobRestorePtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*WorkerJobDumpPtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*MasterStartParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 											enum _action act);
+ typedef int (*MasterEndParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 										const char *str, enum _action act);
+ 
  typedef size_t (*CustomOutPtr) (struct _archiveHandle * AH, const void *buf, size_t len);
  
  typedef enum
*************** typedef struct _archiveHandle
*** 188,193 ****
--- 207,213 ----
  								 * Added V1.7 */
  	ArchiveFormat format;		/* Archive format */
  
+ 	bool		is_clone;		/* have we been cloned ? */
  	sqlparseInfo sqlparse;		/* state for parsing INSERT data */
  
  	time_t		createDate;		/* Date archive created */
*************** typedef struct _archiveHandle
*** 228,233 ****
--- 248,259 ----
  	StartBlobPtr StartBlobPtr;
  	EndBlobPtr EndBlobPtr;
  
+ 	MasterStartParallelItemPtr MasterStartParallelItemPtr;
+ 	MasterEndParallelItemPtr MasterEndParallelItemPtr;
+ 
+ 	WorkerJobDumpPtr WorkerJobDumpPtr;
+ 	WorkerJobRestorePtr WorkerJobRestorePtr;
+ 
  	ClonePtr ClonePtr;			/* Clone format-specific fields */
  	DeClonePtr DeClonePtr;		/* Clean up cloned fields */
  
*************** typedef struct _archiveHandle
*** 237,242 ****
--- 263,271 ----
  	char	   *archdbname;		/* DB name *read* from archive */
  	enum trivalue promptPassword;
  	char	   *savedPassword;	/* password for ropt->username, if known */
+ 	char	   *use_role;
+ 	char	   *sync_snapshot_id;	/* sync snapshot id for parallel
+ 									   operation */
  	PGconn	   *connection;
  	int			connectToDB;	/* Flag to indicate if direct DB connection is
  								 * required */
*************** typedef struct _tocEntry
*** 324,329 ****
--- 353,359 ----
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
+ extern int parallel_restore(struct _parallel_args *args);
  
  extern void die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  extern void warn_or_die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
*************** extern void WriteHead(ArchiveHandle *AH)
*** 334,342 ****
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH);
  
  extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
  extern bool checkSeek(FILE *fp);
  
  #define appendStringLiteralAHX(buf,str,AH) \
--- 364,377 ----
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH, struct _parallel_state *pstate);
! extern void WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te);
! 
! extern ArchiveHandle *CloneArchive(ArchiveHandle *AH);
! extern void DeCloneArchive(ArchiveHandle *AH);
  
  extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
+ TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  extern bool checkSeek(FILE *fp);
  
  #define appendStringLiteralAHX(buf,str,AH) \
*************** int			ahprintf(ArchiveHandle *AH, const
*** 378,381 ****
--- 413,428 ----
  
  void		ahlog(ArchiveHandle *AH, int level, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
+ #ifdef USE_ASSERT_CHECKING
+ #define Assert(condition) \
+ 	do { \
+ 		if (!(condition)) \
+ 		{ \
+ 			write_msg(NULL, "Failed assertion in %s, line %d\n", \
+ 					  __FILE__, __LINE__); \
+ 			abort(); \
+ 		} \
+ 	} while (0)
+ #else
+ #define Assert(condition)
+ #endif
+ 
  #endif
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 31fa373..f3070f7 100644
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "compress_io.h"
  #include "dumputils.h"
  #include "dumpmem.h"
+ #include "parallel.h"
  
  /*--------
   * Routines in the format interface
*************** static void _LoadBlobs(ArchiveHandle *AH
*** 60,65 ****
--- 61,70 ----
  static void _Clone(ArchiveHandle *AH);
  static void _DeClone(ArchiveHandle *AH);
  
+ static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
+ static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
+ char *_WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te);
+ 
  typedef struct
  {
  	CompressorState *cs;
*************** static size_t _CustomReadFunc(ArchiveHan
*** 87,94 ****
  
  static const char *modulename = gettext_noop("custom archiver");
  
- 
- 
  /*
   *	Init routine required by ALL formats. This is a global routine
   *	and should be declared in pg_backup_archiver.h
--- 92,97 ----
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 127,132 ****
--- 130,142 ----
  	AH->ClonePtr = _Clone;
  	AH->DeClonePtr = _DeClone;
  
+ 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
+ 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
+ 
+ 	/* no parallel dump in the custom archive, only parallel restore */
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = _WorkerJobRestoreCustom;
+ 
  	/* Set up a private area. */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
  	AH->formatData = (void *) ctx;
*************** _CloseArchive(ArchiveHandle *AH)
*** 698,704 ****
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
--- 708,714 ----
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
*************** _DeClone(ArchiveHandle *AH)
*** 796,801 ****
--- 806,886 ----
  	free(ctx);
  }
  
+ /*
+  * This function is executed in the child of a parallel restore for the
+  * custom format archive and restores the data of the given TOC entry.
+  */
+ char *
+ _WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * Short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It creates a string
+  * that is understood by the worker function of the respective archive
+  * format (for the custom format, _WorkerJobRestoreCustom).
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows, because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char			buf[64]; /* short fixed-size string + number */
+ 
+ 	/* no parallel dump in the custom archive format */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response
+  * of the worker function of the respective archive format (for the custom
+  * format, _WorkerJobRestoreCustom).
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, status, n_errors;
+ 
+ 	/* no parallel dump in the custom archive */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	sscanf(str, "%d %d %d%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 	Assert(nBytes == strlen(str));
+ 	Assert(dumpId == te->dumpId);
+ 
+ 	AH->public.n_errors += n_errors;
+ 
+ 	return status;
+ }
+ 
  /*--------------------------------------------------
   * END OF FORMAT CALLBACKS
   *--------------------------------------------------
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 9c6d7c1..995cf31 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
***************
*** 35,40 ****
--- 35,42 ----
  
  #include "compress_io.h"
  #include "dumpmem.h"
+ #include "dumputils.h"
+ #include "parallel.h"
  
  #include <dirent.h>
  #include <sys/stat.h>
*************** typedef struct
*** 50,55 ****
--- 52,58 ----
  	cfp		   *dataFH;			/* currently open data file */
  
  	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
+ 	ParallelState *pstate;		/* for parallel backup / restore */
  } lclContext;
  
  typedef struct
*************** static int	_ReadByte(ArchiveHandle *);
*** 69,74 ****
--- 72,78 ----
  static size_t _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len);
  static size_t _ReadBuf(ArchiveHandle *AH, void *buf, size_t len);
  static void _CloseArchive(ArchiveHandle *AH);
+ static void _ReopenArchive(ArchiveHandle *AH);
  static void _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  
  static void _WriteExtraToc(ArchiveHandle *AH, TocEntry *te);
*************** static void _StartBlob(ArchiveHandle *AH
*** 80,90 ****
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
  
! static char *prependDirectory(ArchiveHandle *AH, const char *relativeFilename);
  
  static void createDirectory(const char *dir);
! 
  
  /*
   *	Init routine required by ALL formats. This is a global routine
--- 84,101 ----
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
+ static void _Clone(ArchiveHandle *AH);
+ static void _DeClone(ArchiveHandle *AH);
  
! static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
! static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
! static char *_WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te);
! static char *_WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te);
  
  static void createDirectory(const char *dir);
! static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
  
  /*
   *	Init routine required by ALL formats. This is a global routine
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 111,117 ****
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = NULL;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
--- 122,128 ----
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = _ReopenArchive;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 122,129 ****
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = NULL;
! 	AH->DeClonePtr = NULL;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
--- 133,146 ----
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = _Clone;
! 	AH->DeClonePtr = _DeClone;
! 
! 	AH->WorkerJobRestorePtr = _WorkerJobRestoreDirectory;
! 	AH->WorkerJobDumpPtr = _WorkerJobDumpDirectory;
! 
! 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
! 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 152,161 ****
  	}
  	else
  	{							/* Read Mode */
! 		char	   *fname;
  		cfp		   *tocFH;
  
! 		fname = prependDirectory(AH, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
--- 169,178 ----
  	}
  	else
  	{							/* Read Mode */
! 		char	   fname[MAXPGPATH];
  		cfp		   *tocFH;
  
! 		prependDirectory(AH, fname, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 281,289 ****
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
--- 298,306 ----
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 372,379 ****
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char	   *fname = prependDirectory(AH, tctx->filename);
  
  		_PrintFileData(AH, fname, ropt);
  	}
  }
--- 389,397 ----
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char		fname[MAXPGPATH];
  
+ 		prependDirectory(AH, fname, tctx->filename);
  		_PrintFileData(AH, fname, ropt);
  	}
  }
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 383,394 ****
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
--- 401,412 ----
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
*************** _CloseArchive(ArchiveHandle *AH)
*** 515,521 ****
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char	   *fname = prependDirectory(AH, "toc.dat");
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
--- 533,544 ----
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char		fname[MAXPGPATH];
! 
! 		prependDirectory(AH, fname, "toc.dat");
! 
! 		/* this will actually fork the processes for a parallel backup */
! 		ctx->pstate = ParallelBackupStart(AH, NULL);
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
*************** _CloseArchive(ArchiveHandle *AH)
*** 536,546 ****
  		if (cfclose(tocFH) != 0)
  			die_horribly(AH, modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
! 		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
  }
  
  
  /*
   * BLOB support
--- 559,582 ----
  		if (cfclose(tocFH) != 0)
  			die_horribly(AH, modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
! 		WriteDataChunks(AH, ctx->pstate);
! 
! 		ParallelBackupEnd(AH, ctx->pstate);
  	}
  	AH->FH = NULL;
  }
  
+ /*
+  * Reopen the archive's file handle.
+  */
+ static void
+ _ReopenArchive(ArchiveHandle *AH)
+ {
+ 	/*
+ 	 * Our TOC is in memory, our data files are opened by each child anyway as
+ 	 * they are separate. We support reopening the archive by just doing nothing.
+ 	 */
+ }
  
  /*
   * BLOB support
*************** static void
*** 557,565 ****
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
--- 593,601 ----
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
*************** createDirectory(const char *dir)
*** 652,663 ****
  					 dir, strerror(errno));
  }
  
! 
  static char *
! prependDirectory(ArchiveHandle *AH, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
- 	static char buf[MAXPGPATH];
  	char	   *dname;
  
  	dname = ctx->directory;
--- 688,703 ----
  					 dir, strerror(errno));
  }
  
! /*
!  * Gets a relative file name and prepends the output directory, writing the
!  * result to buf. The caller needs to make sure that buf is MAXPGPATH bytes
!  * big. Can't use a static char[MAXPGPATH] inside the function because we run
!  * multithreaded on Windows.
!  */
  static char *
! prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
  	char	   *dname;
  
  	dname = ctx->directory;
*************** prependDirectory(ArchiveHandle *AH, cons
*** 671,673 ****
--- 711,864 ----
  
  	return buf;
  }
+ 
+ /*
+  * Clone format-specific fields during parallel dump or restore.
+  */
+ static void
+ _Clone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	AH->formatData = (lclContext *) pg_malloc(sizeof(lclContext));
+ 	if (AH->formatData == NULL)
+ 		die_horribly(AH, modulename, "out of memory\n");
+ 	memcpy(AH->formatData, ctx, sizeof(lclContext));
+ 	ctx = (lclContext *) AH->formatData;
+ 
+ 	/*
+ 	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
+ 	 * entry per archive, so no parallelism is possible.  Likewise,
+ 	 * TOC-entry-local state isn't an issue because any one TOC entry is
+ 	 * touched by just one worker child.
+ 	 */
+ 
+ 	/*
+ 	 * We also don't copy the ParallelState pointer (pstate), only the master
+ 	 * process ever writes to it.
+ 	 */
+ }
+ 
+ static void
+ _DeClone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 	free(ctx);
+ }
+ 
+ /*
+  * This function is executed in the parent process. Depending on the desired
+  * action (dump or restore) it creates a string that is understood by the
+  * _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows, because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char	buf[64];
+ 
+ 	if (act == ACT_DUMP)
+ 		snprintf(buf, sizeof(buf), "DUMP %d", te->dumpId);
+ 	else if (act == ACT_RESTORE)
+ 		snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel backup for the
+  * directory archive and dumps the actual data.
+  *
+  * We are currently returning only the DumpId, so theoretically we could
+  * make this function return an int (or a DumpId). However, to facilitate
+  * further enhancements, and because sooner or later we need to convert
+  * this to a string and send it via a message anyway, we stick with
+  * char *. It is parsed on the other side by the _MasterEndParallelItem()
+  * function of the respective archive format.
+  */
+ static char *
+ _WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * Short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
+ 
+ 	/* This should never happen */
+ 	if (!tctx)
+ 		die_horribly(AH, modulename, "error during backup\n");
+ 
+ 	/*
+ 	 * WriteDataChunksForTocEntry() returns void; we either succeed or fail
+ 	 * and die horribly. A failure will be detected by the parent when the
+ 	 * child dies unexpectedly.
+ 	 */
+ 	WriteDataChunksForTocEntry(AH, te);
+ 
+ 	snprintf(buf, buflen, "OK DUMP %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel restore for the
+  * directory archive and restores the data of the given TOC entry.
+  */
+ static char *
+ _WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * Short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response of
+  * the _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, n_errors;
+ 	int			status = 0;
+ 
+ 	if (act == ACT_DUMP)
+ 	{
+ 		sscanf(str, "%d%n", &dumpId, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 	}
+ 	else if (act == ACT_RESTORE)
+ 	{
+ 		sscanf(str, "%d %d %d%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 
+ 		AH->public.n_errors += n_errors;
+ 	}
+ 
+ 	return status;
+ }
diff --git a/src/bin/pg_dump/pg_backup_files.c b/src/bin/pg_dump/pg_backup_files.c
index ffcbb8f..13e8ed3 100644
*** a/src/bin/pg_dump/pg_backup_files.c
--- b/src/bin/pg_dump/pg_backup_files.c
*************** InitArchiveFmt_Files(ArchiveHandle *AH)
*** 102,107 ****
--- 102,113 ----
  	AH->ClonePtr = NULL;
  	AH->DeClonePtr = NULL;
  
+ 	AH->MasterStartParallelItemPtr = NULL;
+ 	AH->MasterEndParallelItemPtr = NULL;
+ 
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = NULL;
+ 
  	/*
  	 * Set up some special context used in compressing data.
  	 */
*************** _CloseArchive(ArchiveHandle *AH)
*** 455,461 ****
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
  			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
! 		WriteDataChunks(AH);
  	}
  
  	AH->FH = NULL;
--- 461,467 ----
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
  			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
! 		WriteDataChunks(AH, NULL);
  	}
  
  	AH->FH = NULL;
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 39ce417..929083e 100644
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
*************** InitArchiveFmt_Tar(ArchiveHandle *AH)
*** 157,162 ****
--- 157,168 ----
  	AH->ClonePtr = NULL;
  	AH->DeClonePtr = NULL;
  
+ 	AH->MasterStartParallelItemPtr = NULL;
+ 	AH->MasterEndParallelItemPtr = NULL;
+ 
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = NULL;
+ 
  	/*
  	 * Set up some special context used in compressing data.
  	 */
*************** _CloseArchive(ArchiveHandle *AH)
*** 835,841 ****
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
--- 841,847 ----
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 13fc667..5a15435 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	disable_dollar_quoting = 0;
*** 140,145 ****
--- 140,146 ----
  static int	dump_inserts = 0;
  static int	column_inserts = 0;
  static int	no_security_labels = 0;
+ static int  no_synchronized_snapshots = 0;
  static int	no_unlogged_table_data = 0;
  static int	serializable_deferrable = 0;
  
*************** static const char *convertTSFunction(Oid
*** 227,235 ****
  static Oid	findLastBuiltinOid_V71(const char *);
  static Oid	findLastBuiltinOid_V70(void);
  static void selectSourceSchema(const char *schemaName);
  static char *getFormattedTypeName(Oid oid, OidOptions opts);
  static char *myFormatType(const char *typname, int32 typmod);
- static const char *fmtQualifiedId(const char *schema, const char *id);
  static void getBlobs(Archive *AH);
  static void dumpBlob(Archive *AH, BlobInfo *binfo);
  static int	dumpBlobs(Archive *AH, void *arg);
--- 228,237 ----
  static Oid	findLastBuiltinOid_V71(const char *);
  static Oid	findLastBuiltinOid_V70(void);
  static void selectSourceSchema(const char *schemaName);
+ static void selectSourceSchemaOnAH(ArchiveHandle *AH, const char *schemaName);
+ static void selectSourceSchemaOnConnection(PGconn *conn, const char *schemaName);
  static char *getFormattedTypeName(Oid oid, OidOptions opts);
  static char *myFormatType(const char *typname, int32 typmod);
  static void getBlobs(Archive *AH);
  static void dumpBlob(Archive *AH, BlobInfo *binfo);
  static int	dumpBlobs(Archive *AH, void *arg);
*************** static void binary_upgrade_extension_mem
*** 246,255 ****
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti);
  static void do_sql_command(PGconn *conn, const char *query);
  static void check_sql_result(PGresult *res, PGconn *conn, const char *query,
  				 ExecStatusType expected);
  
  int
  main(int argc, char **argv)
--- 248,260 ----
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer);
  static void do_sql_command(PGconn *conn, const char *query);
  static void check_sql_result(PGresult *res, PGconn *conn, const char *query,
  				 ExecStatusType expected);
+ static void SetupConnection(Archive *AHX, const char *dumpencoding,
+ 							const char *use_role);
+ static char *get_synchronized_snapshot(ArchiveHandle *AH);
  
  int
  main(int argc, char **argv)
*************** main(int argc, char **argv)
*** 262,274 ****
  	const char *pgport = NULL;
  	const char *username = NULL;
  	const char *dumpencoding = NULL;
- 	const char *std_strings;
  	bool		oids = false;
  	TableInfo  *tblinfo;
  	int			numTables;
  	DumpableObject **dobjs;
  	int			numObjs;
  	int			i;
  	enum trivalue prompt_password = TRI_DEFAULT;
  	int			compressLevel = -1;
  	int			plainText = 0;
--- 267,279 ----
  	const char *pgport = NULL;
  	const char *username = NULL;
  	const char *dumpencoding = NULL;
  	bool		oids = false;
  	TableInfo  *tblinfo;
  	int			numTables;
  	DumpableObject **dobjs;
  	int			numObjs;
  	int			i;
+ 	int			numWorkers = 1;
  	enum trivalue prompt_password = TRI_DEFAULT;
  	int			compressLevel = -1;
  	int			plainText = 0;
*************** main(int argc, char **argv)
*** 297,302 ****
--- 302,308 ----
  		{"format", required_argument, NULL, 'F'},
  		{"host", required_argument, NULL, 'h'},
  		{"ignore-version", no_argument, NULL, 'i'},
+ 		{"jobs", required_argument, NULL, 'j'},
  		{"no-reconnect", no_argument, NULL, 'R'},
  		{"oids", no_argument, NULL, 'o'},
  		{"no-owner", no_argument, NULL, 'O'},
*************** main(int argc, char **argv)
*** 336,341 ****
--- 342,348 ----
  		{"serializable-deferrable", no_argument, &serializable_deferrable, 1},
  		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
  		{"no-security-labels", no_argument, &no_security_labels, 1},
+ 		{"no-synchronized-snapshots", no_argument, &no_synchronized_snapshots, 1},
  		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
  
  		{NULL, 0, NULL, 0}
*************** main(int argc, char **argv)
*** 373,379 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:in:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
--- 380,386 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:ij:n:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 414,419 ****
--- 421,430 ----
  				/* ignored, deprecated option */
  				break;
  
+ 			case 'j':			/* number of dump jobs */
+ 				numWorkers = atoi(optarg);
+ 				break;
+ 
  			case 'n':			/* include schema(s) */
  				simple_string_list_append(&schema_include_patterns, optarg);
  				include_everything = false;
*************** main(int argc, char **argv)
*** 575,580 ****
--- 586,612 ----
  			compressLevel = 0;
  	}
  
+ 	/*
+ 	 * On Windows we can only have at most MAXIMUM_WAIT_OBJECTS (= 64 usually)
+ 	 * parallel jobs because that's the maximum limit for the
+ 	 * WaitForMultipleObjects() call.
+ 	 */
+ 	if (numWorkers <= 0
+ #ifdef WIN32
+ 			|| numWorkers > MAXIMUM_WAIT_OBJECTS
+ #endif
+ 		)
+ 	{
+ 		write_msg(NULL, _("%s: invalid number of parallel jobs\n"), progname);
+ 		exit(1);
+ 	}
+ 
+ 	/* Parallel backup only in the directory archive format so far */
+ 	if (archiveFormat != archDirectory && numWorkers > 1)
+ 	{
+ 		write_msg(NULL, "parallel backup only supported by the directory format\n");
+ 		exit(1);
+ 	}
+ 
  	/* Open the output file */
  	g_fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
  
*************** main(int argc, char **argv)
*** 601,606 ****
--- 633,640 ----
  	g_fout->minRemoteVersion = 70000;
  	g_fout->maxRemoteVersion = (my_version / 100) * 100 + 99;
  
+ 	g_fout->numWorkers = numWorkers;
+ 
  	/*
  	 * Open the database using the Archiver, so it knows about it. Errors mean
  	 * death.
*************** main(int argc, char **argv)
*** 608,702 ****
  	g_conn = ConnectDatabase(g_fout, dbname, pghost, pgport,
  							 username, prompt_password);
  
! 	/* Set the client encoding if requested */
! 	if (dumpencoding)
  	{
! 		if (PQsetClientEncoding(g_conn, dumpencoding) < 0)
! 		{
! 			write_msg(NULL, "invalid client encoding \"%s\" specified\n",
! 					  dumpencoding);
! 			exit(1);
! 		}
  	}
  
  	/*
- 	 * Get the active encoding and the standard_conforming_strings setting, so
- 	 * we know how to escape strings.
- 	 */
- 	g_fout->encoding = PQclientEncoding(g_conn);
- 
- 	std_strings = PQparameterStatus(g_conn, "standard_conforming_strings");
- 	g_fout->std_strings = (std_strings && strcmp(std_strings, "on") == 0);
- 
- 	/* Set the role if requested */
- 	if (use_role && g_fout->remoteVersion >= 80100)
- 	{
- 		PQExpBuffer query = createPQExpBuffer();
- 
- 		appendPQExpBuffer(query, "SET ROLE %s", fmtId(use_role));
- 		do_sql_command(g_conn, query->data);
- 		destroyPQExpBuffer(query);
- 	}
- 
- 	/* Set the datestyle to ISO to ensure the dump's portability */
- 	do_sql_command(g_conn, "SET DATESTYLE = ISO");
- 
- 	/* Likewise, avoid using sql_standard intervalstyle */
- 	if (g_fout->remoteVersion >= 80400)
- 		do_sql_command(g_conn, "SET INTERVALSTYLE = POSTGRES");
- 
- 	/*
- 	 * If supported, set extra_float_digits so that we can dump float data
- 	 * exactly (given correctly implemented float I/O code, anyway)
- 	 */
- 	if (g_fout->remoteVersion >= 90000)
- 		do_sql_command(g_conn, "SET extra_float_digits TO 3");
- 	else if (g_fout->remoteVersion >= 70400)
- 		do_sql_command(g_conn, "SET extra_float_digits TO 2");
- 
- 	/*
- 	 * If synchronized scanning is supported, disable it, to prevent
- 	 * unpredictable changes in row ordering across a dump and reload.
- 	 */
- 	if (g_fout->remoteVersion >= 80300)
- 		do_sql_command(g_conn, "SET synchronize_seqscans TO off");
- 
- 	/*
- 	 * Disable timeouts if supported.
- 	 */
- 	if (g_fout->remoteVersion >= 70300)
- 		do_sql_command(g_conn, "SET statement_timeout = 0");
- 
- 	/*
- 	 * Quote all identifiers, if requested.
- 	 */
- 	if (quote_all_identifiers && g_fout->remoteVersion >= 90100)
- 		do_sql_command(g_conn, "SET quote_all_identifiers = true");
- 
- 	/*
  	 * Disable security label support if server version < v9.1.x (prevents
  	 * access to nonexistent pg_seclabel catalog)
  	 */
  	if (g_fout->remoteVersion < 90100)
  		no_security_labels = 1;
  
- 	/*
- 	 * Start transaction-snapshot mode transaction to dump consistent data.
- 	 */
- 	do_sql_command(g_conn, "BEGIN");
- 	if (g_fout->remoteVersion >= 90100)
- 	{
- 		if (serializable_deferrable)
- 			do_sql_command(g_conn,
- 						   "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, "
- 						   "READ ONLY, DEFERRABLE");
- 		else
- 			do_sql_command(g_conn,
- 						   "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
- 	}
- 	else
- 		do_sql_command(g_conn, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
- 
  	/* Select the appropriate subquery to convert user IDs to names */
  	if (g_fout->remoteVersion >= 80100)
  		username_subquery = "SELECT rolname FROM pg_catalog.pg_roles WHERE oid =";
--- 642,665 ----
  	g_conn = ConnectDatabase(g_fout, dbname, pghost, pgport,
  							 username, prompt_password);
  
! 	/* Find the last built-in OID, if needed */
! 	if (g_fout->remoteVersion < 70300)
  	{
! 		if (g_fout->remoteVersion >= 70100)
! 			g_last_builtin_oid = findLastBuiltinOid_V71(PQdb(g_conn));
! 		else
! 			g_last_builtin_oid = findLastBuiltinOid_V70();
! 		if (g_verbose)
! 			write_msg(NULL, "last built-in OID is %u\n", g_last_builtin_oid);
  	}
  
  	/*
  	 * Disable security label support if server version < v9.1.x (prevents
  	 * access to nonexistent pg_seclabel catalog)
  	 */
  	if (g_fout->remoteVersion < 90100)
  		no_security_labels = 1;
  
  	/* Select the appropriate subquery to convert user IDs to names */
  	if (g_fout->remoteVersion >= 80100)
  		username_subquery = "SELECT rolname FROM pg_catalog.pg_roles WHERE oid =";
*************** main(int argc, char **argv)
*** 705,721 ****
  	else
  		username_subquery = "SELECT usename FROM pg_user WHERE usesysid =";
  
! 	/* Find the last built-in OID, if needed */
! 	if (g_fout->remoteVersion < 70300)
  	{
! 		if (g_fout->remoteVersion >= 70100)
! 			g_last_builtin_oid = findLastBuiltinOid_V71(PQdb(g_conn));
! 		else
! 			g_last_builtin_oid = findLastBuiltinOid_V70();
! 		if (g_verbose)
! 			write_msg(NULL, "last built-in OID is %u\n", g_last_builtin_oid);
  	}
  
  	/* Expand schema selection patterns into OID lists */
  	if (schema_include_patterns.head != NULL)
  	{
--- 668,687 ----
  	else
  		username_subquery = "SELECT usename FROM pg_user WHERE usesysid =";
  
! 	if (numWorkers > 1)
  	{
! 		/* check the version for the synchronized snapshots feature */
! 		if (g_fout->remoteVersion < 90200 && !no_synchronized_snapshots)
! 		{
! 			write_msg(NULL, "Synchronized snapshots are not available in this server version.\n"
! 					  "You might have to run with --no-synchronized-snapshots\n");
! 			exit(1);
! 		}
! 		else if (g_fout->remoteVersion >= 90200 && no_synchronized_snapshots)
! 			write_msg(NULL, "Ignoring --no-synchronized-snapshots\n");
  	}
  
+ 	SetupConnection(g_fout, dumpencoding, use_role);
+ 
  	/* Expand schema selection patterns into OID lists */
  	if (schema_include_patterns.head != NULL)
  	{
*************** main(int argc, char **argv)
*** 797,802 ****
--- 763,772 ----
  	else
  		sortDumpableObjectsByTypeOid(dobjs, numObjs);
  
+ 	/* If we do a parallel dump, we want the largest tables to go first */
+ 	if (archiveFormat == archDirectory && numWorkers > 1)
+ 		sortDataAndIndexObjectsBySize(dobjs, numObjs);
+ 
  	sortDumpableObjects(dobjs, numObjs);
  
  	/*
*************** help(const char *progname)
*** 862,867 ****
--- 832,838 ----
  	printf(_("  -f, --file=FILENAME         output file or directory name\n"));
  	printf(_("  -F, --format=c|d|t|p        output file format (custom, directory, tar,\n"
  			 "                              plain text (default))\n"));
+ 	printf(_("  -j, --jobs=NUM              use this many parallel jobs to dump\n"));
  	printf(_("  -v, --verbose               verbose mode\n"));
  	printf(_("  -Z, --compress=0-9          compression level for compressed formats\n"));
  	printf(_("  --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
*************** help(const char *progname)
*** 891,896 ****
--- 862,868 ----
  	printf(_("  --exclude-table-data=TABLE  do NOT dump data for the named table(s)\n"));
  	printf(_("  --inserts                   dump data as INSERT commands, rather than COPY\n"));
  	printf(_("  --no-security-labels        do not dump security label assignments\n"));
+ 	printf(_("  --no-synchronized-snapshots do not use synchronized snapshots in parallel jobs\n"));
  	printf(_("  --no-tablespaces            do not dump tablespace assignments\n"));
  	printf(_("  --no-unlogged-table-data    do not dump unlogged table data\n"));
  	printf(_("  --quote-all-identifiers     quote all identifiers, even if not key words\n"));
*************** exit_nicely(void)
*** 922,927 ****
--- 894,1066 ----
  	exit(1);
  }
  
+ /*
+  * Initialize the connection for a new worker process.
+  */
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	SetupConnection(AHX, NULL, NULL);
+ }
+ 
+ static void
+ SetupConnection(Archive *AHX, const char *dumpencoding, const char *use_role)
+ {
+ 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	const char *std_strings;
+ 	PGconn *conn = AH->connection;
+ 
+ 	/*
+ 	 * Set the client encoding if requested. If dumpencoding == NULL, either
+ 	 * no encoding was requested, or we are a cloned connection, in which case
+ 	 * the encoding has already been set in CloneArchive from the original
+ 	 * connection's encoding.
+ 	 */
+ 	if (dumpencoding)
+ 	{
+ 		if (PQsetClientEncoding(AH->connection, dumpencoding) < 0)
+ 		{
+ 			write_msg(NULL, "invalid client encoding \"%s\" specified\n",
+ 					  dumpencoding);
+ 			exit(1);
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * Get the active encoding and the standard_conforming_strings setting, so
+ 	 * we know how to escape strings.
+ 	 */
+ 	AHX->encoding = PQclientEncoding(conn);
+ 
+ 	std_strings = PQparameterStatus(conn, "standard_conforming_strings");
+ 	AHX->std_strings = (std_strings && strcmp(std_strings, "on") == 0);
+ 
+ 	/* Set the role if requested */
+ 	if (!use_role && AH->use_role)
+ 		use_role = AH->use_role;
+ 
+ 	if (use_role && AHX->remoteVersion >= 80100)
+ 	{
+ 		PQExpBuffer query = createPQExpBuffer();
+ 
+ 		appendPQExpBuffer(query, "SET ROLE %s", fmtId(use_role));
+ 		do_sql_command(conn, query->data);
+ 		destroyPQExpBuffer(query);
+ 
+ 		/* save this for later use on parallel connections */
+ 		if (!AH->use_role)
+ 			AH->use_role = pg_strdup(use_role);
+ 	}
+ 
+ 	/* Set the datestyle to ISO to ensure the dump's portability */
+ 	do_sql_command(conn, "SET DATESTYLE = ISO");
+ 
+ 	/* Likewise, avoid using sql_standard intervalstyle */
+ 	if (AHX->remoteVersion >= 80400)
+ 		do_sql_command(conn, "SET INTERVALSTYLE = POSTGRES");
+ 
+ 	/*
+ 	 * If supported, set extra_float_digits so that we can dump float data
+ 	 * exactly (given correctly implemented float I/O code, anyway)
+ 	 */
+ 	if (AHX->remoteVersion >= 80500)
+ 		do_sql_command(conn, "SET extra_float_digits TO 3");
+ 	else if (AHX->remoteVersion >= 70400)
+ 		do_sql_command(conn, "SET extra_float_digits TO 2");
+ 
+ 	/*
+ 	 * If synchronized scanning is supported, disable it, to prevent
+ 	 * unpredictable changes in row ordering across a dump and reload.
+ 	 */
+ 	if (AHX->remoteVersion >= 80300)
+ 		do_sql_command(conn, "SET synchronize_seqscans TO off");
+ 
+ 	/*
+ 	 * Quote all identifiers, if requested.
+ 	 */
+ 	if (quote_all_identifiers && AHX->remoteVersion >= 90100)
+ 		do_sql_command(conn, "SET quote_all_identifiers = true");
+ 
+ 	/*
+ 	 * Disable timeouts if supported.
+ 	 */
+ 	if (AHX->remoteVersion >= 70300)
+ 		do_sql_command(conn, "SET statement_timeout = 0");
+ 
+ 	/*
+ 	 * Start transaction-snapshot mode transaction to dump consistent data.
+ 	 */
+ 	do_sql_command(conn, "BEGIN");
+ 	if (AHX->remoteVersion >= 90100)
+ 	{
+ 		if (serializable_deferrable)
+ 			do_sql_command(conn,
+ 						   "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, "
+ 						   "READ ONLY, DEFERRABLE");
+ 		else
+ 			do_sql_command(conn,
+ 						   "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
+ 	}
+ 	else
+ 		do_sql_command(conn, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
+ 
+ 	if (AHX->numWorkers > 1 && AHX->remoteVersion >= 90200)
+ 	{
+ 		if (AH->is_clone)
+ 		{
+ 			PQExpBuffer query = createPQExpBuffer();
+ 
+ 			appendPQExpBuffer(query, "SET TRANSACTION SNAPSHOT ");
+ 			appendStringLiteralConn(query, AH->sync_snapshot_id, conn);
+ 			do_sql_command(conn, query->data);
+ 			destroyPQExpBuffer(query);
+ 		}
+ 		else
+ 		{
+ 			/*
+ 			 * If the server version were too old for synchronized snapshots,
+ 			 * we would have errored out earlier already. So at this point we
+ 			 * either have the feature, or the user has explicitly requested
+ 			 * not to use it. Note that when the feature is available, it is
+ 			 * always used; it cannot be switched off then.
+ 			 */
+ 			AH->sync_snapshot_id = get_synchronized_snapshot(AH);
+ 		}
+ 	}
+ }
+ 
+ static char *
+ get_synchronized_snapshot(ArchiveHandle *AH)
+ {
+ 	const char *query = "SELECT pg_export_snapshot()";
+ 	char	   *result;
+ 	int			ntups;
+ 	PGconn	   *conn = AH->connection;
+ 	PGresult   *res = PQexec(conn, query);
+ 
+ 	check_sql_result(res, conn, query, PGRES_TUPLES_OK);
+ 
+ 	/* Expecting a single result only */
+ 	ntups = PQntuples(res);
+ 	if (ntups != 1)
+ 	{
+ 		write_msg(NULL, ngettext("query returned %d row instead of one: %s\n",
+ 							   "query returned %d rows instead of one: %s\n",
+ 								 ntups),
+ 				  ntups, query);
+ 		exit_nicely();
+ 	}
+ 
+ 	result = pg_strdup(PQgetvalue(res, 0, 0));
+ 	PQclear(res);
+ 
+ 	return result;
+ }
+ 
  static ArchiveFormat
  parseArchiveFormat(const char *format, ArchiveMode *mode)
  {
*************** selectDumpableObject(DumpableObject *dob
*** 1252,1263 ****
--- 1391,1412 ----
  static int
  dumpTableData_copy(Archive *fout, void *dcontext)
  {
+ 	/*
+ 	 * This is a data dumper routine, executed in a child process for a
+ 	 * parallel backup, so it must not access the global g_conn; it must use
+ 	 * AH->connection instead.
+ 	 */
+ 	ArchiveHandle *AH = (ArchiveHandle *) fout;
  	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	const char *classname = tbinfo->dobj.name;
  	const bool	hasoids = tbinfo->hasoids;
  	const bool	oids = tdinfo->oids;
  	PQExpBuffer q = createPQExpBuffer();
+ 	/*
+ 	 * Note: we can't use getThreadLocalPQExpBuffer() here, because fmtId(),
+ 	 * which we call below, uses it already.
+ 	 */
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	PGresult   *res;
  	int			ret;
  	char	   *copybuf;
*************** dumpTableData_copy(Archive *fout, void *
*** 1272,1278 ****
  	 * this ensures reproducible results in case the table contains regproc,
  	 * regclass, etc columns.
  	 */
! 	selectSourceSchema(tbinfo->dobj.namespace->dobj.name);
  
  	/*
  	 * If possible, specify the column list explicitly so that we have no
--- 1421,1427 ----
  	 * this ensures reproducible results in case the table contains regproc,
  	 * regclass, etc columns.
  	 */
! 	selectSourceSchemaOnAH(AH, tbinfo->dobj.namespace->dobj.name);
  
  	/*
  	 * If possible, specify the column list explicitly so that we have no
*************** dumpTableData_copy(Archive *fout, void *
*** 1280,1287 ****
  	 * column ordering of COPY will not be what we want in certain corner
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
! 	if (g_fout->remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
--- 1429,1436 ----
  	 * column ordering of COPY will not be what we want in certain corner
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
! 	if (AH->public.remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo, clistBuf);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
*************** dumpTableData_copy(Archive *fout, void *
*** 1289,1295 ****
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname),
  						  column_list);
  	}
  	else if (tdinfo->filtercond)
--- 1438,1445 ----
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname,
! 										 AH->public.remoteVersion),
  						  column_list);
  	}
  	else if (tdinfo->filtercond)
*************** dumpTableData_copy(Archive *fout, void *
*** 1306,1328 ****
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname),
  						  tdinfo->filtercond);
  	}
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname),
  						  column_list);
  	}
! 	res = PQexec(g_conn, q->data);
! 	check_sql_result(res, g_conn, q->data, PGRES_COPY_OUT);
  	PQclear(res);
  
  	for (;;)
  	{
! 		ret = PQgetCopyData(g_conn, &copybuf, 0);
  
  		if (ret < 0)
  			break;				/* done or error */
--- 1456,1481 ----
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname,
! 										 AH->public.remoteVersion),
  						  tdinfo->filtercond);
  	}
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname,
! 										 AH->public.remoteVersion),
  						  column_list);
  	}
! 	res = PQexec(AH->connection, q->data);
! 	check_sql_result(res, AH->connection, q->data, PGRES_COPY_OUT);
  	PQclear(res);
+ 	destroyPQExpBuffer(clistBuf);
  
  	for (;;)
  	{
! 		ret = PQgetCopyData(AH->connection, &copybuf, 0);
  
  		if (ret < 0)
  			break;				/* done or error */
*************** dumpTableData_copy(Archive *fout, void *
*** 1385,1398 ****
  	{
  		/* copy data transfer failed */
  		write_msg(NULL, "Dumping the contents of table \"%s\" failed: PQgetCopyData() failed.\n", classname);
! 		write_msg(NULL, "Error message from server: %s", PQerrorMessage(g_conn));
  		write_msg(NULL, "The command was: %s\n", q->data);
  		exit_nicely();
  	}
  
  	/* Check command status and return to normal libpq state */
! 	res = PQgetResult(g_conn);
! 	check_sql_result(res, g_conn, q->data, PGRES_COMMAND_OK);
  	PQclear(res);
  
  	destroyPQExpBuffer(q);
--- 1538,1551 ----
  	{
  		/* copy data transfer failed */
  		write_msg(NULL, "Dumping the contents of table \"%s\" failed: PQgetCopyData() failed.\n", classname);
! 		write_msg(NULL, "Error message from server: %s", PQerrorMessage(AH->connection));
  		write_msg(NULL, "The command was: %s\n", q->data);
  		exit_nicely();
  	}
  
  	/* Check command status and return to normal libpq state */
! 	res = PQgetResult(AH->connection);
! 	check_sql_result(res, AH->connection, q->data, PGRES_COMMAND_OK);
  	PQclear(res);
  
  	destroyPQExpBuffer(q);
*************** dumpTableData_copy(Archive *fout, void *
*** 1410,1415 ****
--- 1563,1573 ----
  static int
  dumpTableData_insert(Archive *fout, void *dcontext)
  {
+ 	/*
+ 	 * This is a data dumper routine, executed in a child process for a
+ 	 * parallel backup, so it must not access the global g_conn; it must use
+ 	 * AH->connection instead.
+ 	 */
+ 	ArchiveHandle *AH = (ArchiveHandle *) fout;
  	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	const char *classname = tbinfo->dobj.name;
*************** dumpTableData_insert(Archive *fout, void
*** 1425,1458 ****
  	 * this ensures reproducible results in case the table contains regproc,
  	 * regclass, etc columns.
  	 */
! 	selectSourceSchema(tbinfo->dobj.namespace->dobj.name);
  
  	if (fout->remoteVersion >= 70100)
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname));
  	}
  	else
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname));
  	}
  	if (tdinfo->filtercond)
  		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
  
! 	res = PQexec(g_conn, q->data);
! 	check_sql_result(res, g_conn, q->data, PGRES_COMMAND_OK);
  
  	do
  	{
  		PQclear(res);
  
! 		res = PQexec(g_conn, "FETCH 100 FROM _pg_dump_cursor");
! 		check_sql_result(res, g_conn, "FETCH 100 FROM _pg_dump_cursor",
  						 PGRES_TUPLES_OK);
  		nfields = PQnfields(res);
  		for (tuple = 0; tuple < PQntuples(res); tuple++)
--- 1583,1618 ----
  	 * this ensures reproducible results in case the table contains regproc,
  	 * regclass, etc columns.
  	 */
! 	selectSourceSchemaOnAH(AH, tbinfo->dobj.namespace->dobj.name);
  
  	if (fout->remoteVersion >= 70100)
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname,
! 										 AH->public.remoteVersion));
  	}
  	else
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
  						  fmtQualifiedId(tbinfo->dobj.namespace->dobj.name,
! 										 classname,
! 										 AH->public.remoteVersion));
  	}
  	if (tdinfo->filtercond)
  		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
  
! 	res = PQexec(AH->connection, q->data);
! 	check_sql_result(res, AH->connection, q->data, PGRES_COMMAND_OK);
  
  	do
  	{
  		PQclear(res);
  
! 		res = PQexec(AH->connection, "FETCH 100 FROM _pg_dump_cursor");
! 		check_sql_result(res, AH->connection, "FETCH 100 FROM _pg_dump_cursor",
  						 PGRES_TUPLES_OK);
  		nfields = PQnfields(res);
  		for (tuple = 0; tuple < PQntuples(res); tuple++)
*************** dumpTableData_insert(Archive *fout, void
*** 1550,1556 ****
  
  	archprintf(fout, "\n\n");
  
! 	do_sql_command(g_conn, "CLOSE _pg_dump_cursor");
  
  	destroyPQExpBuffer(q);
  	return 1;
--- 1710,1716 ----
  
  	archprintf(fout, "\n\n");
  
! 	do_sql_command(AH->connection, "CLOSE _pg_dump_cursor");
  
  	destroyPQExpBuffer(q);
  	return 1;
*************** dumpTableData(Archive *fout, TableDataIn
*** 1568,1573 ****
--- 1728,1734 ----
  {
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	PQExpBuffer copyBuf = createPQExpBuffer();
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	DataDumperPtr dumpFn;
  	char	   *copyStmt;
  
*************** dumpTableData(Archive *fout, TableDataIn
*** 1583,1589 ****
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
--- 1744,1750 ----
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo, clistBuf),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
*************** dumpTableData(Archive *fout, TableDataIn
*** 1596,1608 ****
  
  	ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
  				 tbinfo->dobj.name, tbinfo->dobj.namespace->dobj.name,
! 				 NULL, tbinfo->rolname,
  				 false, "TABLE DATA", SECTION_DATA,
  				 "", "", copyStmt,
  				 tdinfo->dobj.dependencies, tdinfo->dobj.nDeps,
  				 dumpFn, tdinfo);
  
  	destroyPQExpBuffer(copyBuf);
  }
  
  /*
--- 1757,1770 ----
  
  	ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
  				 tbinfo->dobj.name, tbinfo->dobj.namespace->dobj.name,
! 				 NULL, tbinfo->rolname, tbinfo->relpages,
  				 false, "TABLE DATA", SECTION_DATA,
  				 "", "", copyStmt,
  				 tdinfo->dobj.dependencies, tdinfo->dobj.nDeps,
  				 dumpFn, tdinfo);
  
  	destroyPQExpBuffer(copyBuf);
+ 	destroyPQExpBuffer(clistBuf);
  }
  
  /*
*************** dumpDatabase(Archive *AH)
*** 1979,1984 ****
--- 2141,2147 ----
  				 NULL,			/* Namespace */
  				 NULL,			/* Tablespace */
  				 dba,			/* Owner */
+ 				 0,				/* relpages */
  				 false,			/* with oids */
  				 "DATABASE",	/* Desc */
  				 SECTION_PRE_DATA,		/* Section */
*************** dumpDatabase(Archive *AH)
*** 2027,2033 ****
  						  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
  						  LargeObjectRelationId);
  		ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 					 "pg_largeobject", NULL, NULL, "",
  					 false, "pg_largeobject", SECTION_PRE_DATA,
  					 loOutQry->data, "", NULL,
  					 NULL, 0,
--- 2190,2196 ----
  						  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
  						  LargeObjectRelationId);
  		ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 					 "pg_largeobject", NULL, NULL, "", 0,
  					 false, "pg_largeobject", SECTION_PRE_DATA,
  					 loOutQry->data, "", NULL,
  					 NULL, 0,
*************** dumpDatabase(Archive *AH)
*** 2066,2072 ****
  							  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
  							  LargeObjectMetadataRelationId);
  			ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 						 "pg_largeobject_metadata", NULL, NULL, "",
  						 false, "pg_largeobject_metadata", SECTION_PRE_DATA,
  						 loOutQry->data, "", NULL,
  						 NULL, 0,
--- 2229,2235 ----
  							  atoi(PQgetvalue(lo_res, 0, i_relfrozenxid)),
  							  LargeObjectMetadataRelationId);
  			ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 						 "pg_largeobject_metadata", NULL, NULL, "", 0,
  						 false, "pg_largeobject_metadata", SECTION_PRE_DATA,
  						 loOutQry->data, "", NULL,
  						 NULL, 0,
*************** dumpDatabase(Archive *AH)
*** 2101,2107 ****
  			appendPQExpBuffer(dbQry, ";\n");
  
  			ArchiveEntry(AH, dbCatId, createDumpId(), datname, NULL, NULL,
! 						 dba, false, "COMMENT", SECTION_NONE,
  						 dbQry->data, "", NULL,
  						 &dbDumpId, 1, NULL, NULL);
  		}
--- 2264,2270 ----
  			appendPQExpBuffer(dbQry, ";\n");
  
  			ArchiveEntry(AH, dbCatId, createDumpId(), datname, NULL, NULL,
! 						 dba, 0, false, "COMMENT", SECTION_NONE,
  						 dbQry->data, "", NULL,
  						 &dbDumpId, 1, NULL, NULL);
  		}
*************** dumpDatabase(Archive *AH)
*** 2128,2134 ****
  		emitShSecLabels(g_conn, res, seclabelQry, "DATABASE", datname);
  		if (strlen(seclabelQry->data))
  			ArchiveEntry(AH, dbCatId, createDumpId(), datname, NULL, NULL,
! 						 dba, false, "SECURITY LABEL", SECTION_NONE,
  						 seclabelQry->data, "", NULL,
  						 &dbDumpId, 1, NULL, NULL);
  		destroyPQExpBuffer(seclabelQry);
--- 2291,2297 ----
  		emitShSecLabels(g_conn, res, seclabelQry, "DATABASE", datname);
  		if (strlen(seclabelQry->data))
  			ArchiveEntry(AH, dbCatId, createDumpId(), datname, NULL, NULL,
! 						 dba, 0, false, "SECURITY LABEL", SECTION_NONE,
  						 seclabelQry->data, "", NULL,
  						 &dbDumpId, 1, NULL, NULL);
  		destroyPQExpBuffer(seclabelQry);
*************** dumpEncoding(Archive *AH)
*** 2157,2163 ****
  	appendPQExpBuffer(qry, ";\n");
  
  	ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 				 "ENCODING", NULL, NULL, "",
  				 false, "ENCODING", SECTION_PRE_DATA,
  				 qry->data, "", NULL,
  				 NULL, 0,
--- 2320,2326 ----
  	appendPQExpBuffer(qry, ";\n");
  
  	ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 				 "ENCODING", NULL, NULL, "", 0,
  				 false, "ENCODING", SECTION_PRE_DATA,
  				 qry->data, "", NULL,
  				 NULL, 0,
*************** dumpStdStrings(Archive *AH)
*** 2184,2190 ****
  					  stdstrings);
  
  	ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 				 "STDSTRINGS", NULL, NULL, "",
  				 false, "STDSTRINGS", SECTION_PRE_DATA,
  				 qry->data, "", NULL,
  				 NULL, 0,
--- 2347,2353 ----
  					  stdstrings);
  
  	ArchiveEntry(AH, nilCatalogId, createDumpId(),
! 				 "STDSTRINGS", NULL, NULL, "", 0,
  				 false, "STDSTRINGS", SECTION_PRE_DATA,
  				 qry->data, "", NULL,
  				 NULL, 0,
*************** dumpBlob(Archive *AH, BlobInfo *binfo)
*** 2296,2302 ****
  	ArchiveEntry(AH, binfo->dobj.catId, binfo->dobj.dumpId,
  				 binfo->dobj.name,
  				 NULL, NULL,
! 				 binfo->rolname, false,
  				 "BLOB", SECTION_PRE_DATA,
  				 cquery->data, dquery->data, NULL,
  				 binfo->dobj.dependencies, binfo->dobj.nDeps,
--- 2459,2465 ----
  	ArchiveEntry(AH, binfo->dobj.catId, binfo->dobj.dumpId,
  				 binfo->dobj.name,
  				 NULL, NULL,
! 				 binfo->rolname, 0, false,
  				 "BLOB", SECTION_PRE_DATA,
  				 cquery->data, dquery->data, NULL,
  				 binfo->dobj.dependencies, binfo->dobj.nDeps,
*************** dumpBlob(Archive *AH, BlobInfo *binfo)
*** 2331,2338 ****
   *	dump the data contents of all large objects
   */
  static int
! dumpBlobs(Archive *AH, void *arg)
  {
  	const char *blobQry;
  	const char *blobFetchQry;
  	PGresult   *res;
--- 2494,2506 ----
   *	dump the data contents of all large objects
   */
  static int
! dumpBlobs(Archive *AHX, void *arg)
  {
+ 	/*
+ 	 * This is a data dumper routine, executed in a child process for a
+ 	 * parallel backup, so it must not access the global g_conn; it must use
+ 	 * AH->connection instead.
+ 	 */
+ 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  	const char *blobQry;
  	const char *blobFetchQry;
  	PGresult   *res;
*************** dumpBlobs(Archive *AH, void *arg)
*** 2345,2365 ****
  		write_msg(NULL, "saving large objects\n");
  
  	/* Make sure we are in proper schema */
! 	selectSourceSchema("pg_catalog");
  
  	/*
  	 * Currently, we re-fetch all BLOB OIDs using a cursor.  Consider scanning
  	 * the already-in-memory dumpable objects instead...
  	 */
! 	if (AH->remoteVersion >= 90000)
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT oid FROM pg_largeobject_metadata";
! 	else if (AH->remoteVersion >= 70100)
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT DISTINCT loid FROM pg_largeobject";
  	else
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT oid FROM pg_class WHERE relkind = 'l'";
  
! 	res = PQexec(g_conn, blobQry);
! 	check_sql_result(res, g_conn, blobQry, PGRES_COMMAND_OK);
  
  	/* Command to fetch from cursor */
  	blobFetchQry = "FETCH 1000 IN bloboid";
--- 2513,2533 ----
  		write_msg(NULL, "saving large objects\n");
  
  	/* Make sure we are in proper schema */
! 	selectSourceSchemaOnAH(AH, "pg_catalog");
  
  	/*
  	 * Currently, we re-fetch all BLOB OIDs using a cursor.  Consider scanning
  	 * the already-in-memory dumpable objects instead...
  	 */
! 	if (AH->public.remoteVersion >= 90000)
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT oid FROM pg_largeobject_metadata";
! 	else if (AH->public.remoteVersion >= 70100)
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT DISTINCT loid FROM pg_largeobject";
  	else
  		blobQry = "DECLARE bloboid CURSOR FOR SELECT oid FROM pg_class WHERE relkind = 'l'";
  
! 	res = PQexec(AH->connection, blobQry);
! 	check_sql_result(res, AH->connection, blobQry, PGRES_COMMAND_OK);
  
  	/* Command to fetch from cursor */
  	blobFetchQry = "FETCH 1000 IN bloboid";
*************** dumpBlobs(Archive *AH, void *arg)
*** 2369,2376 ****
  		PQclear(res);
  
  		/* Do a fetch */
! 		res = PQexec(g_conn, blobFetchQry);
! 		check_sql_result(res, g_conn, blobFetchQry, PGRES_TUPLES_OK);
  
  		/* Process the tuples, if any */
  		ntups = PQntuples(res);
--- 2537,2544 ----
  		PQclear(res);
  
  		/* Do a fetch */
! 		res = PQexec(AH->connection, blobFetchQry);
! 		check_sql_result(res, AH->connection, blobFetchQry, PGRES_TUPLES_OK);
  
  		/* Process the tuples, if any */
  		ntups = PQntuples(res);
*************** dumpBlobs(Archive *AH, void *arg)
*** 2381,2413 ****
  
  			blobOid = atooid(PQgetvalue(res, i, 0));
  			/* Open the BLOB */
! 			loFd = lo_open(g_conn, blobOid, INV_READ);
  			if (loFd == -1)
  			{
  				write_msg(NULL, "could not open large object %u: %s",
! 						  blobOid, PQerrorMessage(g_conn));
  				exit_nicely();
  			}
  
! 			StartBlob(AH, blobOid);
  
  			/* Now read it in chunks, sending data to archive */
  			do
  			{
! 				cnt = lo_read(g_conn, loFd, buf, LOBBUFSIZE);
  				if (cnt < 0)
  				{
  					write_msg(NULL, "error reading large object %u: %s",
! 							  blobOid, PQerrorMessage(g_conn));
  					exit_nicely();
  				}
  
! 				WriteData(AH, buf, cnt);
  			} while (cnt > 0);
  
! 			lo_close(g_conn, loFd);
  
! 			EndBlob(AH, blobOid);
  		}
  	} while (ntups > 0);
  
--- 2549,2583 ----
  
  			blobOid = atooid(PQgetvalue(res, i, 0));
  			/* Open the BLOB */
! 			loFd = lo_open(AH->connection, blobOid, INV_READ);
  			if (loFd == -1)
  			{
  				write_msg(NULL, "could not open large object %u: %s",
! 						  blobOid, PQerrorMessage(AH->connection));
  				exit_nicely();
  			}
  
! 			StartBlob(AHX, blobOid);
  
  			/* Now read it in chunks, sending data to archive */
  			do
  			{
! 				cnt = lo_read(AH->connection, loFd, buf, LOBBUFSIZE);
  				if (cnt < 0)
  				{
  					write_msg(NULL, "error reading large object %u: %s",
! 							  blobOid, PQerrorMessage(AH->connection));
  					exit_nicely();
  				}
  
! 				/* avoid writing empty chunks */
! 				if (cnt > 0)
! 					WriteData(AHX, buf, cnt);
  			} while (cnt > 0);
  
! 			lo_close(AH->connection, loFd);
  
! 			EndBlob(AHX, blobOid);
  		}
  	} while (ntups > 0);
  
*************** getTables(int *numTables)
*** 3929,3934 ****
--- 4099,4105 ----
  	int			i_reloptions;
  	int			i_toastreloptions;
  	int			i_reloftype;
+ 	int			i_relpages;
  
  	/* Make sure we are in proper schema */
  	selectSourceSchema("pg_catalog");
*************** getTables(int *numTables)
*** 3968,3973 ****
--- 4139,4145 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "c.relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4004,4009 ****
--- 4176,4182 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4039,4044 ****
--- 4212,4218 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4074,4079 ****
--- 4248,4254 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4110,4115 ****
--- 4285,4291 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4145,4150 ****
--- 4321,4327 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(int *numTables)
*** 4176,4181 ****
--- 4353,4359 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(int *numTables)
*** 4202,4207 ****
--- 4380,4386 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(int *numTables)
*** 4238,4243 ****
--- 4417,4423 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(int *numTables)
*** 4292,4297 ****
--- 4472,4478 ----
  	i_reloptions = PQfnumber(res, "reloptions");
  	i_toastreloptions = PQfnumber(res, "toast_reloptions");
  	i_reloftype = PQfnumber(res, "reloftype");
+ 	i_relpages = PQfnumber(res, "relpages");
  
  	if (lockWaitTimeout && g_fout->remoteVersion >= 70300)
  	{
*************** getTables(int *numTables)
*** 4346,4351 ****
--- 4527,4533 ----
  		tblinfo[i].reltablespace = pg_strdup(PQgetvalue(res, i, i_reltablespace));
  		tblinfo[i].reloptions = pg_strdup(PQgetvalue(res, i, i_reloptions));
  		tblinfo[i].toast_reloptions = pg_strdup(PQgetvalue(res, i, i_toastreloptions));
+ 		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  
  		/* other fields were zeroed above */
  
*************** getTables(int *numTables)
*** 4375,4381 ****
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
  						 fmtQualifiedId(tblinfo[i].dobj.namespace->dobj.name,
! 										tblinfo[i].dobj.name));
  			do_sql_command(g_conn, query->data);
  		}
  
--- 4557,4564 ----
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
  						 fmtQualifiedId(tblinfo[i].dobj.namespace->dobj.name,
! 										tblinfo[i].dobj.name,
! 										g_fout->remoteVersion));
  			do_sql_command(g_conn, query->data);
  		}
  
*************** dumpComment(Archive *fout, const char *t
*** 6874,6880 ****
  		 * post-data.
  		 */
  		ArchiveEntry(fout, nilCatalogId, createDumpId(),
! 					 target, namespace, NULL, owner,
  					 false, "COMMENT", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(dumpId), 1,
--- 7057,7063 ----
  		 * post-data.
  		 */
  		ArchiveEntry(fout, nilCatalogId, createDumpId(),
! 					 target, namespace, NULL, owner, 0,
  					 false, "COMMENT", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(dumpId), 1,
*************** dumpTableComment(Archive *fout, TableInf
*** 6935,6941 ****
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tbinfo->dobj.namespace->dobj.name,
! 						 NULL, tbinfo->rolname,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tbinfo->dobj.dumpId), 1,
--- 7118,7124 ----
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tbinfo->dobj.namespace->dobj.name,
! 						 NULL, tbinfo->rolname, 0,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tbinfo->dobj.dumpId), 1,
*************** dumpTableComment(Archive *fout, TableInf
*** 6957,6963 ****
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tbinfo->dobj.namespace->dobj.name,
! 						 NULL, tbinfo->rolname,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tbinfo->dobj.dumpId), 1,
--- 7140,7146 ----
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tbinfo->dobj.namespace->dobj.name,
! 						 NULL, tbinfo->rolname, 0,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tbinfo->dobj.dumpId), 1,
*************** dumpDumpableObject(Archive *fout, Dumpab
*** 7265,7271 ****
  			break;
  		case DO_BLOB_DATA:
  			ArchiveEntry(fout, dobj->catId, dobj->dumpId,
! 						 dobj->name, NULL, NULL, "",
  						 false, "BLOBS", SECTION_DATA,
  						 "", "", NULL,
  						 dobj->dependencies, dobj->nDeps,
--- 7448,7454 ----
  			break;
  		case DO_BLOB_DATA:
  			ArchiveEntry(fout, dobj->catId, dobj->dumpId,
! 						 dobj->name, NULL, NULL, "", 0,
  						 false, "BLOBS", SECTION_DATA,
  						 "", "", NULL,
  						 dobj->dependencies, dobj->nDeps,
*************** dumpNamespace(Archive *fout, NamespaceIn
*** 7312,7318 ****
  	ArchiveEntry(fout, nspinfo->dobj.catId, nspinfo->dobj.dumpId,
  				 nspinfo->dobj.name,
  				 NULL, NULL,
! 				 nspinfo->rolname,
  				 false, "SCHEMA", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 nspinfo->dobj.dependencies, nspinfo->dobj.nDeps,
--- 7495,7501 ----
  	ArchiveEntry(fout, nspinfo->dobj.catId, nspinfo->dobj.dumpId,
  				 nspinfo->dobj.name,
  				 NULL, NULL,
! 				 nspinfo->rolname, 0,
  				 false, "SCHEMA", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 nspinfo->dobj.dependencies, nspinfo->dobj.nDeps,
*************** dumpExtension(Archive *fout, ExtensionIn
*** 7430,7436 ****
  	ArchiveEntry(fout, extinfo->dobj.catId, extinfo->dobj.dumpId,
  				 extinfo->dobj.name,
  				 NULL, NULL,
! 				 "",
  				 false, "EXTENSION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 extinfo->dobj.dependencies, extinfo->dobj.nDeps,
--- 7613,7619 ----
  	ArchiveEntry(fout, extinfo->dobj.catId, extinfo->dobj.dumpId,
  				 extinfo->dobj.name,
  				 NULL, NULL,
! 				 "", 0,
  				 false, "EXTENSION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 extinfo->dobj.dependencies, extinfo->dobj.nDeps,
*************** dumpEnumType(Archive *fout, TypeInfo *ty
*** 7578,7584 ****
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
--- 7761,7767 ----
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, 0, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
*************** dumpRangeType(Archive *fout, TypeInfo *t
*** 7709,7715 ****
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
--- 7892,7898 ----
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, 0, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
*************** dumpBaseType(Archive *fout, TypeInfo *ty
*** 8103,8109 ****
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
--- 8286,8292 ----
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, 0, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
*************** dumpDomain(Archive *fout, TypeInfo *tyin
*** 8270,8276 ****
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, false,
  				 "DOMAIN", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
--- 8453,8459 ----
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, 0, false,
  				 "DOMAIN", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
*************** dumpCompositeType(Archive *fout, TypeInf
*** 8477,8483 ****
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
--- 8660,8666 ----
  				 tyinfo->dobj.name,
  				 tyinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tyinfo->rolname, 0, false,
  				 "TYPE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tyinfo->dobj.dependencies, tyinfo->dobj.nDeps,
*************** dumpCompositeTypeColComments(Archive *fo
*** 8597,8603 ****
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tyinfo->dobj.namespace->dobj.name,
! 						 NULL, tyinfo->rolname,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tyinfo->dobj.dumpId), 1,
--- 8780,8786 ----
  			ArchiveEntry(fout, nilCatalogId, createDumpId(),
  						 target->data,
  						 tyinfo->dobj.namespace->dobj.name,
! 						 NULL, tyinfo->rolname, 0,
  						 false, "COMMENT", SECTION_NONE,
  						 query->data, "", NULL,
  						 &(tyinfo->dobj.dumpId), 1,
*************** dumpShellType(Archive *fout, ShellTypeIn
*** 8650,8656 ****
  				 stinfo->dobj.name,
  				 stinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 stinfo->baseType->rolname, false,
  				 "SHELL TYPE", SECTION_PRE_DATA,
  				 q->data, "", NULL,
  				 stinfo->dobj.dependencies, stinfo->dobj.nDeps,
--- 8833,8839 ----
  				 stinfo->dobj.name,
  				 stinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 stinfo->baseType->rolname, 0, false,
  				 "SHELL TYPE", SECTION_PRE_DATA,
  				 q->data, "", NULL,
  				 stinfo->dobj.dependencies, stinfo->dobj.nDeps,
*************** dumpProcLang(Archive *fout, ProcLangInfo
*** 8824,8830 ****
  
  	ArchiveEntry(fout, plang->dobj.catId, plang->dobj.dumpId,
  				 plang->dobj.name,
! 				 lanschema, NULL, plang->lanowner,
  				 false, "PROCEDURAL LANGUAGE", SECTION_PRE_DATA,
  				 defqry->data, delqry->data, NULL,
  				 plang->dobj.dependencies, plang->dobj.nDeps,
--- 9007,9013 ----
  
  	ArchiveEntry(fout, plang->dobj.catId, plang->dobj.dumpId,
  				 plang->dobj.name,
! 				 lanschema, NULL, plang->lanowner, 0,
  				 false, "PROCEDURAL LANGUAGE", SECTION_PRE_DATA,
  				 defqry->data, delqry->data, NULL,
  				 plang->dobj.dependencies, plang->dobj.nDeps,
*************** dumpFunc(Archive *fout, FuncInfo *finfo)
*** 9394,9400 ****
  				 funcsig_tag,
  				 finfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 finfo->rolname, false,
  				 "FUNCTION", SECTION_PRE_DATA,
  				 q->data, delqry->data, NULL,
  				 finfo->dobj.dependencies, finfo->dobj.nDeps,
--- 9577,9583 ----
  				 funcsig_tag,
  				 finfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 finfo->rolname, 0, false,
  				 "FUNCTION", SECTION_PRE_DATA,
  				 q->data, delqry->data, NULL,
  				 finfo->dobj.dependencies, finfo->dobj.nDeps,
*************** dumpCast(Archive *fout, CastInfo *cast)
*** 9558,9564 ****
  
  	ArchiveEntry(fout, cast->dobj.catId, cast->dobj.dumpId,
  				 labelq->data,
! 				 "pg_catalog", NULL, "",
  				 false, "CAST", SECTION_PRE_DATA,
  				 defqry->data, delqry->data, NULL,
  				 cast->dobj.dependencies, cast->dobj.nDeps,
--- 9741,9747 ----
  
  	ArchiveEntry(fout, cast->dobj.catId, cast->dobj.dumpId,
  				 labelq->data,
! 				 "pg_catalog", NULL, "", 0,
  				 false, "CAST", SECTION_PRE_DATA,
  				 defqry->data, delqry->data, NULL,
  				 cast->dobj.dependencies, cast->dobj.nDeps,
*************** dumpOpr(Archive *fout, OprInfo *oprinfo)
*** 9805,9811 ****
  				 oprinfo->dobj.name,
  				 oprinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 oprinfo->rolname,
  				 false, "OPERATOR", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 oprinfo->dobj.dependencies, oprinfo->dobj.nDeps,
--- 9988,9994 ----
  				 oprinfo->dobj.name,
  				 oprinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 oprinfo->rolname, 0,
  				 false, "OPERATOR", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 oprinfo->dobj.dependencies, oprinfo->dobj.nDeps,
*************** dumpOpclass(Archive *fout, OpclassInfo *
*** 10338,10344 ****
  				 opcinfo->dobj.name,
  				 opcinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 opcinfo->rolname,
  				 false, "OPERATOR CLASS", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 opcinfo->dobj.dependencies, opcinfo->dobj.nDeps,
--- 10521,10527 ----
  				 opcinfo->dobj.name,
  				 opcinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 opcinfo->rolname, 0,
  				 false, "OPERATOR CLASS", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 opcinfo->dobj.dependencies, opcinfo->dobj.nDeps,
*************** dumpOpfamily(Archive *fout, OpfamilyInfo
*** 10666,10672 ****
  				 opfinfo->dobj.name,
  				 opfinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 opfinfo->rolname,
  				 false, "OPERATOR FAMILY", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 opfinfo->dobj.dependencies, opfinfo->dobj.nDeps,
--- 10849,10855 ----
  				 opfinfo->dobj.name,
  				 opfinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 opfinfo->rolname, 0,
  				 false, "OPERATOR FAMILY", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 opfinfo->dobj.dependencies, opfinfo->dobj.nDeps,
*************** dumpCollation(Archive *fout, CollInfo *c
*** 10768,10774 ****
  				 collinfo->dobj.name,
  				 collinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 collinfo->rolname,
  				 false, "COLLATION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 collinfo->dobj.dependencies, collinfo->dobj.nDeps,
--- 10951,10957 ----
  				 collinfo->dobj.name,
  				 collinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 collinfo->rolname, 0,
  				 false, "COLLATION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 collinfo->dobj.dependencies, collinfo->dobj.nDeps,
*************** dumpConversion(Archive *fout, ConvInfo *
*** 10880,10886 ****
  				 convinfo->dobj.name,
  				 convinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 convinfo->rolname,
  				 false, "CONVERSION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 convinfo->dobj.dependencies, convinfo->dobj.nDeps,
--- 11063,11069 ----
  				 convinfo->dobj.name,
  				 convinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 convinfo->rolname, 0,
  				 false, "CONVERSION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 convinfo->dobj.dependencies, convinfo->dobj.nDeps,
*************** dumpAgg(Archive *fout, AggInfo *agginfo)
*** 11129,11135 ****
  				 aggsig_tag,
  				 agginfo->aggfn.dobj.namespace->dobj.name,
  				 NULL,
! 				 agginfo->aggfn.rolname,
  				 false, "AGGREGATE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 agginfo->aggfn.dobj.dependencies, agginfo->aggfn.dobj.nDeps,
--- 11312,11318 ----
  				 aggsig_tag,
  				 agginfo->aggfn.dobj.namespace->dobj.name,
  				 NULL,
! 				 agginfo->aggfn.rolname, 0,
  				 false, "AGGREGATE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 agginfo->aggfn.dobj.dependencies, agginfo->aggfn.dobj.nDeps,
*************** dumpTSParser(Archive *fout, TSParserInfo
*** 11228,11233 ****
--- 11411,11417 ----
  				 prsinfo->dobj.namespace->dobj.name,
  				 NULL,
  				 "",
+ 				 0,
  				 false, "TEXT SEARCH PARSER", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 prsinfo->dobj.dependencies, prsinfo->dobj.nDeps,
*************** dumpTSDictionary(Archive *fout, TSDictIn
*** 11326,11331 ****
--- 11510,11516 ----
  				 dictinfo->dobj.namespace->dobj.name,
  				 NULL,
  				 dictinfo->rolname,
+ 				 0,
  				 false, "TEXT SEARCH DICTIONARY", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 dictinfo->dobj.dependencies, dictinfo->dobj.nDeps,
*************** dumpTSTemplate(Archive *fout, TSTemplate
*** 11392,11397 ****
--- 11577,11583 ----
  				 tmplinfo->dobj.namespace->dobj.name,
  				 NULL,
  				 "",
+ 				 0,
  				 false, "TEXT SEARCH TEMPLATE", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 tmplinfo->dobj.dependencies, tmplinfo->dobj.nDeps,
*************** dumpTSConfig(Archive *fout, TSConfigInfo
*** 11531,11536 ****
--- 11717,11723 ----
  				 cfginfo->dobj.namespace->dobj.name,
  				 NULL,
  				 cfginfo->rolname,
+ 				 0,
  				 false, "TEXT SEARCH CONFIGURATION", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 cfginfo->dobj.dependencies, cfginfo->dobj.nDeps,
*************** dumpForeignDataWrapper(Archive *fout, Fd
*** 11605,11610 ****
--- 11792,11798 ----
  				 NULL,
  				 NULL,
  				 fdwinfo->rolname,
+ 				 0,
  				 false, "FOREIGN DATA WRAPPER", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 fdwinfo->dobj.dependencies, fdwinfo->dobj.nDeps,
*************** dumpForeignServer(Archive *fout, Foreign
*** 11708,11713 ****
--- 11896,11902 ----
  				 NULL,
  				 NULL,
  				 srvinfo->rolname,
+ 				 0,
  				 false, "SERVER", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 srvinfo->dobj.dependencies, srvinfo->dobj.nDeps,
*************** dumpUserMappings(Archive *fout,
*** 11825,11831 ****
  					 tag->data,
  					 namespace,
  					 NULL,
! 					 owner, false,
  					 "USER MAPPING", SECTION_PRE_DATA,
  					 q->data, delq->data, NULL,
  					 &dumpId, 1,
--- 12014,12020 ----
  					 tag->data,
  					 namespace,
  					 NULL,
! 					 owner, 0, false,
  					 "USER MAPPING", SECTION_PRE_DATA,
  					 q->data, delq->data, NULL,
  					 &dumpId, 1,
*************** dumpDefaultACL(Archive *fout, DefaultACL
*** 11896,11901 ****
--- 12085,12091 ----
  	   daclinfo->dobj.namespace ? daclinfo->dobj.namespace->dobj.name : NULL,
  				 NULL,
  				 daclinfo->defaclrole,
+ 				 0,
  				 false, "DEFAULT ACL", SECTION_NONE,
  				 q->data, "", NULL,
  				 daclinfo->dobj.dependencies, daclinfo->dobj.nDeps,
*************** dumpACL(Archive *fout, CatalogId objCatI
*** 11953,11958 ****
--- 12143,12149 ----
  					 tag, nspname,
  					 NULL,
  					 owner ? owner : "",
+ 					 0,
  					 false, "ACL", SECTION_NONE,
  					 sql->data, "", NULL,
  					 &(objDumpId), 1,
*************** dumpSecLabel(Archive *fout, const char *
*** 12029,12035 ****
  	{
  		ArchiveEntry(fout, nilCatalogId, createDumpId(),
  					 target, namespace, NULL, owner,
! 					 false, "SECURITY LABEL", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(dumpId), 1,
  					 NULL, NULL);
--- 12220,12226 ----
  	{
  		ArchiveEntry(fout, nilCatalogId, createDumpId(),
  					 target, namespace, NULL, owner,
! 					 0, false, "SECURITY LABEL", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(dumpId), 1,
  					 NULL, NULL);
*************** dumpTableSecLabel(Archive *fout, TableIn
*** 12107,12113 ****
  					 target->data,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL, tbinfo->rolname,
! 					 false, "SECURITY LABEL", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(tbinfo->dobj.dumpId), 1,
  					 NULL, NULL);
--- 12298,12304 ----
  					 target->data,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL, tbinfo->rolname,
! 					 0, false, "SECURITY LABEL", SECTION_NONE,
  					 query->data, "", NULL,
  					 &(tbinfo->dobj.dumpId), 1,
  					 NULL, NULL);
*************** dumpTableSchema(Archive *fout, TableInfo
*** 12888,12893 ****
--- 13079,13085 ----
  				 tbinfo->dobj.namespace->dobj.name,
  			(tbinfo->relkind == RELKIND_VIEW) ? NULL : tbinfo->reltablespace,
  				 tbinfo->rolname,
+ 				 0,
  			   (strcmp(reltypename, "TABLE") == 0) ? tbinfo->hasoids : false,
  				 reltypename, SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
*************** dumpAttrDef(Archive *fout, AttrDefInfo *
*** 12961,12966 ****
--- 13153,13159 ----
  				 tbinfo->dobj.namespace->dobj.name,
  				 NULL,
  				 tbinfo->rolname,
+ 				 0,
  				 false, "DEFAULT", SECTION_PRE_DATA,
  				 q->data, delq->data, NULL,
  				 adinfo->dobj.dependencies, adinfo->dobj.nDeps,
*************** dumpIndex(Archive *fout, IndxInfo *indxi
*** 13062,13068 ****
  					 indxinfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 indxinfo->tablespace,
! 					 tbinfo->rolname, false,
  					 "INDEX", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 indxinfo->dobj.dependencies, indxinfo->dobj.nDeps,
--- 13255,13261 ----
  					 indxinfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 indxinfo->tablespace,
! 					 tbinfo->rolname, indxinfo->relpages, false,
  					 "INDEX", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 indxinfo->dobj.dependencies, indxinfo->dobj.nDeps,
*************** dumpConstraint(Archive *fout, Constraint
*** 13185,13191 ****
  					 coninfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 indxinfo->tablespace,
! 					 tbinfo->rolname, false,
  					 "CONSTRAINT", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
--- 13378,13384 ----
  					 coninfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 indxinfo->tablespace,
! 					 tbinfo->rolname, 0, false,
  					 "CONSTRAINT", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
*************** dumpConstraint(Archive *fout, Constraint
*** 13218,13224 ****
  					 coninfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL,
! 					 tbinfo->rolname, false,
  					 "FK CONSTRAINT", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
--- 13411,13417 ----
  					 coninfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL,
! 					 tbinfo->rolname, 0, false,
  					 "FK CONSTRAINT", SECTION_POST_DATA,
  					 q->data, delq->data, NULL,
  					 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
*************** dumpConstraint(Archive *fout, Constraint
*** 13253,13259 ****
  						 coninfo->dobj.name,
  						 tbinfo->dobj.namespace->dobj.name,
  						 NULL,
! 						 tbinfo->rolname, false,
  						 "CHECK CONSTRAINT", SECTION_POST_DATA,
  						 q->data, delq->data, NULL,
  						 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
--- 13446,13452 ----
  						 coninfo->dobj.name,
  						 tbinfo->dobj.namespace->dobj.name,
  						 NULL,
! 						 tbinfo->rolname, 0, false,
  						 "CHECK CONSTRAINT", SECTION_POST_DATA,
  						 q->data, delq->data, NULL,
  						 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
*************** dumpConstraint(Archive *fout, Constraint
*** 13289,13295 ****
  						 coninfo->dobj.name,
  						 tyinfo->dobj.namespace->dobj.name,
  						 NULL,
! 						 tyinfo->rolname, false,
  						 "CHECK CONSTRAINT", SECTION_POST_DATA,
  						 q->data, delq->data, NULL,
  						 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
--- 13482,13488 ----
  						 coninfo->dobj.name,
  						 tyinfo->dobj.namespace->dobj.name,
  						 NULL,
! 						 tyinfo->rolname, 0, false,
  						 "CHECK CONSTRAINT", SECTION_POST_DATA,
  						 q->data, delq->data, NULL,
  						 coninfo->dobj.dependencies, coninfo->dobj.nDeps,
*************** dumpSequence(Archive *fout, TableInfo *t
*** 13579,13585 ****
  					 tbinfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL,
! 					 tbinfo->rolname,
  					 false, "SEQUENCE", SECTION_PRE_DATA,
  					 query->data, delqry->data, NULL,
  					 tbinfo->dobj.dependencies, tbinfo->dobj.nDeps,
--- 13772,13778 ----
  					 tbinfo->dobj.name,
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL,
! 					 tbinfo->rolname, 0,
  					 false, "SEQUENCE", SECTION_PRE_DATA,
  					 query->data, delqry->data, NULL,
  					 tbinfo->dobj.dependencies, tbinfo->dobj.nDeps,
*************** dumpSequence(Archive *fout, TableInfo *t
*** 13615,13621 ****
  							 tbinfo->dobj.name,
  							 tbinfo->dobj.namespace->dobj.name,
  							 NULL,
! 							 tbinfo->rolname,
  							 false, "SEQUENCE OWNED BY", SECTION_PRE_DATA,
  							 query->data, "", NULL,
  							 &(tbinfo->dobj.dumpId), 1,
--- 13808,13814 ----
  							 tbinfo->dobj.name,
  							 tbinfo->dobj.namespace->dobj.name,
  							 NULL,
! 							 tbinfo->rolname, 0,
  							 false, "SEQUENCE OWNED BY", SECTION_PRE_DATA,
  							 query->data, "", NULL,
  							 &(tbinfo->dobj.dumpId), 1,
*************** dumpSequence(Archive *fout, TableInfo *t
*** 13645,13650 ****
--- 13838,13844 ----
  					 tbinfo->dobj.namespace->dobj.name,
  					 NULL,
  					 tbinfo->rolname,
+ 					 0,
  					 false, "SEQUENCE SET", SECTION_PRE_DATA,
  					 query->data, "", NULL,
  					 &(tbinfo->dobj.dumpId), 1,
*************** dumpTrigger(Archive *fout, TriggerInfo *
*** 13845,13851 ****
  				 tginfo->dobj.name,
  				 tbinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tbinfo->rolname, false,
  				 "TRIGGER", SECTION_POST_DATA,
  				 query->data, delqry->data, NULL,
  				 tginfo->dobj.dependencies, tginfo->dobj.nDeps,
--- 14039,14045 ----
  				 tginfo->dobj.name,
  				 tbinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tbinfo->rolname, 0, false,
  				 "TRIGGER", SECTION_POST_DATA,
  				 query->data, delqry->data, NULL,
  				 tginfo->dobj.dependencies, tginfo->dobj.nDeps,
*************** dumpRule(Archive *fout, RuleInfo *rinfo)
*** 13967,13973 ****
  				 rinfo->dobj.name,
  				 tbinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tbinfo->rolname, false,
  				 "RULE", SECTION_POST_DATA,
  				 cmd->data, delcmd->data, NULL,
  				 rinfo->dobj.dependencies, rinfo->dobj.nDeps,
--- 14161,14167 ----
  				 rinfo->dobj.name,
  				 tbinfo->dobj.namespace->dobj.name,
  				 NULL,
! 				 tbinfo->rolname, 0, false,
  				 "RULE", SECTION_POST_DATA,
  				 cmd->data, delcmd->data, NULL,
  				 rinfo->dobj.dependencies, rinfo->dobj.nDeps,
*************** getDependencies(void)
*** 14268,14273 ****
--- 14462,14509 ----
  	destroyPQExpBuffer(query);
  }
  
+ /*
+  * Select the source schema, using a static variable to remember which schema
+  * is currently set as the default.  This function is never called in a
+  * parallel context, so the static variable is safe.  Parallel workers call
+  * selectSourceSchemaOnAH instead, which remembers the current schema in
+  * AH->currSchema.
+  */
+ static void
+ selectSourceSchema(const char *schemaName)
+ {
+ 	static char *currSchemaName = NULL;
+ 
+ 	if (!schemaName || *schemaName == '\0' ||
+ 		(currSchemaName && strcmp(currSchemaName, schemaName) == 0))
+ 		return;					/* no need to do anything */
+ 
+ 	selectSourceSchemaOnConnection(g_conn, schemaName);
+ 
+ 	if (currSchemaName)
+ 		free(currSchemaName);
+ 	currSchemaName = pg_strdup(schemaName);
+ }
+ 
+ /*
+  * This function lets a DataDumper function select a schema on an
+  * ArchiveHandle. These functions can be called from a threaded program for
+  * parallel dump/restore and must therefore not access global variables
+  * (read-only access to g_fout->remoteVersion is okay, however).
+  */
+ static void
+ selectSourceSchemaOnAH(ArchiveHandle *AH, const char *schemaName)
+ {
+ 	if (!schemaName || *schemaName == '\0' ||
+ 		(AH->currSchema && strcmp(AH->currSchema, schemaName) == 0))
+ 		return;					/* no need to do anything */
+ 
+ 	selectSourceSchemaOnConnection(AH->connection, schemaName);
+ 
+ 	if (AH->currSchema)
+ 		free(AH->currSchema);
+ 	AH->currSchema = pg_strdup(schemaName);
+ }
  
  /*
   * selectSourceSchema - make the specified schema the active search path
*************** getDependencies(void)
*** 14280,14301 ****
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
   */
  static void
! selectSourceSchema(const char *schemaName)
  {
- 	static char *curSchemaName = NULL;
  	PQExpBuffer query;
  
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (g_fout->remoteVersion < 70300)
  		return;
- 	/* Ignore null schema names */
- 	if (schemaName == NULL || *schemaName == '\0')
- 		return;
- 	/* Optimize away repeated selection of same schema */
- 	if (curSchemaName && strcmp(curSchemaName, schemaName) == 0)
- 		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
--- 14516,14536 ----
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
+  *
+  * This function is called only from selectSourceSchemaOnAH and
+  * selectSourceSchema.
   */
  static void
! selectSourceSchemaOnConnection(PGconn *conn, const char *schemaName)
  {
  	PQExpBuffer query;
  
+ 	/* This is checked by the callers already */
+ 	Assert(schemaName != NULL && *schemaName != '\0');
+ 
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (g_fout->remoteVersion < 70300)
  		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
*************** selectSourceSchema(const char *schemaNam
*** 14303,14314 ****
  	if (strcmp(schemaName, "pg_catalog") != 0)
  		appendPQExpBuffer(query, ", pg_catalog");
  
! 	do_sql_command(g_conn, query->data);
  
  	destroyPQExpBuffer(query);
- 	if (curSchemaName)
- 		free(curSchemaName);
- 	curSchemaName = pg_strdup(schemaName);
  }
  
  /*
--- 14538,14546 ----
  	if (strcmp(schemaName, "pg_catalog") != 0)
  		appendPQExpBuffer(query, ", pg_catalog");
  
! 	do_sql_command(conn, query->data);
  
  	destroyPQExpBuffer(query);
  }
  
  /*
*************** myFormatType(const char *typname, int32
*** 14459,14529 ****
  }
  
  /*
-  * fmtQualifiedId - convert a qualified name to the proper format for
-  * the source database.
-  *
-  * Like fmtId, use the result before calling again.
-  */
- static const char *
- fmtQualifiedId(const char *schema, const char *id)
- {
- 	static PQExpBuffer id_return = NULL;
- 
- 	if (id_return)				/* first time through? */
- 		resetPQExpBuffer(id_return);
- 	else
- 		id_return = createPQExpBuffer();
- 
- 	/* Suppress schema name if fetching from pre-7.3 DB */
- 	if (g_fout->remoteVersion >= 70300 && schema && *schema)
- 	{
- 		appendPQExpBuffer(id_return, "%s.",
- 						  fmtId(schema));
- 	}
- 	appendPQExpBuffer(id_return, "%s",
- 					  fmtId(id));
- 
- 	return id_return->data;
- }
- 
- /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti)
  {
- 	static PQExpBuffer q = NULL;
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	if (q)						/* first time through? */
! 		resetPQExpBuffer(q);
! 	else
! 		q = createPQExpBuffer();
! 
! 	appendPQExpBuffer(q, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(q, ", ");
! 		appendPQExpBuffer(q, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(q, ")");
! 	return q->data;
  }
  
  /*
--- 14691,14727 ----
  }
  
  /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer)
  {
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	appendPQExpBuffer(buffer, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(buffer, ", ");
! 		appendPQExpBuffer(buffer, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(buffer, ")");
! 	return buffer->data;
  }
  
  /*
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 11c4d37..dba617b 100644
*** a/src/bin/pg_dump/pg_dump.h
--- b/src/bin/pg_dump/pg_dump.h
*************** typedef struct _tableInfo
*** 257,262 ****
--- 257,263 ----
  	/* these two are set only if table is a sequence owned by a column: */
  	Oid			owning_tab;		/* OID of table owning sequence */
  	int			owning_col;		/* attr # of column owning sequence */
+ 	int			relpages;
  
  	bool		interesting;	/* true if need to collect more data */
  
*************** typedef struct _indxInfo
*** 328,333 ****
--- 329,335 ----
  	bool		indisclustered;
  	/* if there is an associated constraint object, its dumpId: */
  	DumpId		indexconstraint;
+ 	int			relpages;		/* relpages of the underlying table */
  } IndxInfo;
  
  typedef struct _ruleInfo
*************** extern void parseOidArray(const char *st
*** 532,537 ****
--- 534,540 ----
  extern void sortDumpableObjects(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeName(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeOid(DumpableObject **objs, int numObjs);
+ extern void sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs);
  
  /*
   * version specific routines
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 4d1ae94..1286124 100644
*** a/src/bin/pg_dump/pg_dump_sort.c
--- b/src/bin/pg_dump/pg_dump_sort.c
*************** static void repairDependencyLoop(Dumpabl
*** 121,126 ****
--- 121,213 ----
  static void describeDumpableObject(DumpableObject *obj,
  					   char *buf, int bufsize);
  
+ static int DOSizeCompare(const void *p1, const void *p2);
+ 
+ static int
+ findFirstEqualType(DumpableObjectType type, DumpableObject **objs, int numObjs)
+ {
+ 	int i;
+ 	for (i = 0; i < numObjs; i++)
+ 		if (objs[i]->objType == type)
+ 			return i;
+ 	return -1;
+ }
+ 
+ static int
+ findFirstDifferentType(DumpableObjectType type, DumpableObject **objs, int numObjs, int start)
+ {
+ 	int i;
+ 	for (i = start; i < numObjs; i++)
+ 		if (objs[i]->objType != type)
+ 			return i;
+ 	return numObjs;
+ }
+ 
+ /*
+  * When we do a parallel dump, we want to start with the largest items first.
+  *
+  * Say we have the objects in this order:
+  * ....DDDDD....III....
+  *
+  * with D = Table data, I = Index, . = other object
+  *
+  * This sorting function takes each contiguous D or I block and sorts it
+  * internally by size, largest first.
+  */
+ void
+ sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs)
+ {
+ 	int		startIdx, endIdx;
+ 	void   *startPtr;
+ 
+ 	if (numObjs <= 1)
+ 		return;
+ 
+ 	startIdx = findFirstEqualType(DO_TABLE_DATA, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_TABLE_DATA, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ 
+ 	startIdx = findFirstEqualType(DO_INDEX, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_INDEX, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ }
+ 
+ static int
+ DOSizeCompare(const void *p1, const void *p2)
+ {
+ 	DumpableObject *obj1 = *(DumpableObject **) p1;
+ 	DumpableObject *obj2 = *(DumpableObject **) p2;
+ 	int			obj1_size = 0;
+ 	int			obj2_size = 0;
+ 
+ 	if (obj1->objType == DO_TABLE_DATA)
+ 		obj1_size = ((TableDataInfo *) obj1)->tdtable->relpages;
+ 	if (obj1->objType == DO_INDEX)
+ 		obj1_size = ((IndxInfo *) obj1)->relpages;
+ 
+ 	if (obj2->objType == DO_TABLE_DATA)
+ 		obj2_size = ((TableDataInfo *) obj2)->tdtable->relpages;
+ 	if (obj2->objType == DO_INDEX)
+ 		obj2_size = ((IndxInfo *) obj2)->relpages;
+ 
+ 	/* we want to see the biggest item go first */
+ 	if (obj1_size > obj2_size)
+ 		return -1;
+ 	if (obj2_size > obj1_size)
+ 		return 1;
+ 
+ 	return 0;
+ }
  
  /*
   * Sort the given objects into a type/name-based ordering
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 4c93667..438b8f0 100644
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
*************** doShellQuoting(PQExpBuffer buf, const ch
*** 1913,1915 ****
--- 1913,1918 ----
  	appendPQExpBufferChar(buf, '"');
  #endif   /* WIN32 */
  }
+ 
+ /* Dummy: no parallel dump/restore for pg_dumpall yet */
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ }
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 6ff1ab8..965e9f2 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 72,77 ****
--- 72,78 ----
  	RestoreOptions *opts;
  	int			c;
  	int			exit_code;
+ 	int			numWorkers = 1;
  	Archive    *AH;
  	char	   *inputFileSpec;
  	static int	disable_triggers = 0;
*************** main(int argc, char **argv)
*** 183,189 ****
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				opts->number_of_jobs = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
--- 184,190 ----
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				numWorkers = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
*************** main(int argc, char **argv)
*** 338,344 ****
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && opts->number_of_jobs > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
--- 339,345 ----
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && numWorkers > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
*************** main(int argc, char **argv)
*** 403,408 ****
--- 404,420 ----
  		InitDummyWantedList(AH, opts);
  	}
  
+ 	/* See comments in pg_dump.c */
+ #ifdef WIN32
+ 	if (numWorkers > MAXIMUM_WAIT_OBJECTS)
+ 	{
+ 		fprintf(stderr, _("%s: invalid number of parallel jobs\n"), progname);
+ 		exit(1);
+ 	}
+ #endif
+ 
+ 	AH->numWorkers = numWorkers;
+ 
  	if (opts->tocSummary)
  		PrintTOCSummary(AH, opts);
  	else
*************** main(int argc, char **argv)
*** 421,426 ****
--- 433,445 ----
  	return exit_code;
  }
  
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	(AH->ReopenPtr) (AH);
+ }
+ 
  static void
  usage(const char *progname)
  {
pg_backup_mirror_1.difftext/x-patch; charset=US-ASCII; name=pg_backup_mirror_1.diffDownload
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 00501de..8e2e983 100644
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
*************** kwlookup.c: % : $(top_srcdir)/src/backen
*** 30,37 ****
  
  all: pg_dump pg_restore pg_dumpall
  
! pg_dump: pg_dump.o common.o pg_dump_sort.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
! 	$(CC) $(CFLAGS) pg_dump.o common.o pg_dump_sort.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
  pg_restore: pg_restore.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
  	$(CC) $(CFLAGS) pg_restore.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
--- 30,37 ----
  
  all: pg_dump pg_restore pg_dumpall
  
! pg_dump: pg_dump.o common.o pg_dump_sort.o pg_backup_mirror.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
! 	$(CC) $(CFLAGS) pg_dump.o common.o pg_dump_sort.o pg_backup_mirror.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
  pg_restore: pg_restore.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
  	$(CC) $(CFLAGS) pg_restore.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index bcde24c..b4d0af4 100644
*** a/src/bin/pg_dump/parallel.c
--- b/src/bin/pg_dump/parallel.c
*************** WaitForCommands(ArchiveHandle *AH, int p
*** 921,927 ****
  		}
  		else if (messageStartsWith(command, "RESTORE "))
  		{
! 			Assert(AH->format == archDirectory || AH->format == archCustom);
  			Assert(AH->connection != NULL);
  
  			sscanf(command + strlen("RESTORE "), "%d%n", &dumpId, &nBytes);
--- 921,927 ----
  		}
  		else if (messageStartsWith(command, "RESTORE "))
  		{
! 			Assert(AH->format == archDirectory || AH->format == archCustom || AH->format == archMirror);
  			Assert(AH->connection != NULL);
  
  			sscanf(command + strlen("RESTORE "), "%d%n", &dumpId, &nBytes);
*************** ListenToWorkers(ArchiveHandle *AH, Paral
*** 1004,1010 ****
  	}
  	else if (messageStartsWith(msg, "ERROR "))
  	{
! 		Assert(AH->format == archDirectory || AH->format == archCustom);
  		pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
  		die_horribly(AH, modulename, "%s", msg + strlen("ERROR "));
  	}
--- 1004,1010 ----
  	}
  	else if (messageStartsWith(msg, "ERROR "))
  	{
! 		Assert(AH->format == archDirectory || AH->format == archCustom || AH->format == archMirror);
  		pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
  		die_horribly(AH, modulename, "%s", msg + strlen("ERROR "));
  	}
diff --git a/src/bin/pg_dump/parallel.h b/src/bin/pg_dump/parallel.h
index 4c86b9b..194dea4 100644
*** a/src/bin/pg_dump/parallel.h
--- b/src/bin/pg_dump/parallel.h
*************** typedef enum
*** 32,38 ****
  typedef enum _action
  {
  	ACT_DUMP,
! 	ACT_RESTORE,
  } T_Action;
  
  /* Arguments needed for a worker process */
--- 32,38 ----
  typedef enum _action
  {
  	ACT_DUMP,
! 	ACT_RESTORE
  } T_Action;
  
  /* Arguments needed for a worker process */
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 767f865..3c90904 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** typedef enum _archiveFormat
*** 51,57 ****
  	archFiles = 2,
  	archTar = 3,
  	archNull = 4,
! 	archDirectory = 5
  } ArchiveFormat;
  
  typedef enum _archiveMode
--- 51,58 ----
  	archFiles = 2,
  	archTar = 3,
  	archNull = 4,
! 	archDirectory = 5,
! 	archMirror = 6
  } ArchiveFormat;
  
  typedef enum _archiveMode
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0c81dfe..68e5307 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*************** CloseArchive(Archive *AHX)
*** 154,160 ****
  	int			res = 0;
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
! 	(*AH->ClosePtr) (AH);
  
  	/* Close the output */
  	if (AH->gzOut)
--- 154,161 ----
  	int			res = 0;
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
! 	if (AH->ClosePtr)
! 		(*AH->ClosePtr) (AH);
  
  	/* Close the output */
  	if (AH->gzOut)
*************** _allocAH(const char *FileSpec, const Arc
*** 2015,2020 ****
--- 2016,2025 ----
  			InitArchiveFmt_Tar(AH);
  			break;
  
+ 		case archMirror:
+ 			InitArchiveFmt_Mirror(AH);
+ 			break;
+ 
  		default:
  			die_horribly(AH, modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 3b10384..929a32c 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** extern void EndRestoreBlob(ArchiveHandle
*** 398,406 ****
  extern void EndRestoreBlobs(ArchiveHandle *AH);
  
  extern void InitArchiveFmt_Custom(ArchiveHandle *AH);
  extern void InitArchiveFmt_Files(ArchiveHandle *AH);
  extern void InitArchiveFmt_Null(ArchiveHandle *AH);
- extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
  extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
  
  extern bool isValidTarHeader(char *header);
--- 398,407 ----
  extern void EndRestoreBlobs(ArchiveHandle *AH);
  
  extern void InitArchiveFmt_Custom(ArchiveHandle *AH);
+ extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
  extern void InitArchiveFmt_Files(ArchiveHandle *AH);
+ extern void InitArchiveFmt_Mirror(ArchiveHandle *AH);
  extern void InitArchiveFmt_Null(ArchiveHandle *AH);
  extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
  
  extern bool isValidTarHeader(char *header);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 995cf31..5e1e8d9 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
*************** static void _DeClone(ArchiveHandle *AH);
*** 89,96 ****
  
  static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
  static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
- static char *_WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te);
- static char *_WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te);
  
  static void createDirectory(const char *dir);
  static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
--- 89,94 ----
diff --git a/src/bin/pg_dump/pg_backup_mirror.c b/src/bin/pg_dump/pg_backup_mirror.c
index ...100cea1 .
*** a/src/bin/pg_dump/pg_backup_mirror.c
--- b/src/bin/pg_dump/pg_backup_mirror.c
***************
*** 0 ****
--- 1,453 ----
+ /*-------------------------------------------------------------------------
+  *
+  * pg_backup_mirror.c
+  *
+  * Mirror a database from one host to another by performing a parallel
+  * dump and restore in a single run of pg_dump.
+  *
+  *	Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+  *	Portions Copyright (c) 1994, Regents of the University of California
+  *	Portions Copyright (c) 2000, Philip Warner
+  *
+  *	Rights are granted to use this software in any way so long
+  *	as this notice is not removed.
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/pg_backup_mirror.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "compress_io.h"
+ #include "dumpmem.h"
+ #include "dumputils.h"
+ #include "parallel.h"
+ 
+ typedef struct
+ {
+ 	PGconn	   *dumpConn;
+ 	PGconn	   *restoreConn;
+ } lclContext;
+ 
+ typedef struct
+ {
+ 	int			status;
+ } lclTocEntry;
+ 
+ static const char *modulename = gettext_noop("mirror archiver");
+ 
+ /* prototypes for private functions */
+ static void _ArchiveEntry(ArchiveHandle *AH, TocEntry *te);
+ static size_t _WriteData(ArchiveHandle *AH, const void *data, size_t dLen);
+ static void _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
+ 
+ static void _StartBlobs(ArchiveHandle *AH, TocEntry *te);
+ static void _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
+ static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
+ static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
+ static void _Clone(ArchiveHandle *AH);
+ static void _DeClone(ArchiveHandle *AH);
+ 
+ static void _ReopenArchive(ArchiveHandle *AH);
+ 
+ static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
+ static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
+ static char *_WorkerJobRestoreMirror(ArchiveHandle *AH, TocEntry *te);
+ 
+ /*
+  *	Init routine required by ALL formats. This is a global routine
+  *	and should be declared in pg_backup_archiver.h
+  *
+  *	Its task is to create any extra archive context (using AH->formatData),
+  *	and to initialize the supported function pointers.
+  *
+  *	It should also prepare whatever its input source is for reading/writing,
+  *	and in the case of a read mode connection, it should load the Header & TOC.
+  */
+ void
+ InitArchiveFmt_Mirror(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx;
+ 
+ 	/* Assuming static functions, this can be copied for each format. */
+ 	AH->ArchiveEntryPtr = _ArchiveEntry;
+ 	AH->StartDataPtr = NULL;
+ 	AH->WriteDataPtr = _WriteData;
+ 	AH->EndDataPtr = NULL;
+ 	AH->WriteBytePtr = NULL;
+ 	AH->ReadBytePtr = NULL;
+ 	AH->WriteBufPtr = NULL;
+ 	AH->ReadBufPtr = NULL;
+ 	AH->ClosePtr = NULL;
+ 	AH->ReopenPtr = _ReopenArchive;
+ 	AH->PrintTocDataPtr = _PrintTocData;
+ 	AH->ReadExtraTocPtr = NULL;
+ 	AH->WriteExtraTocPtr = NULL;
+ 	AH->PrintExtraTocPtr = NULL;
+ 
+ 	AH->StartBlobsPtr = _StartBlobs;
+ 	AH->StartBlobPtr = _StartBlob;
+ 	AH->EndBlobPtr = _EndBlob;
+ 	AH->EndBlobsPtr = _EndBlobs;
+ 
+ 	AH->ClonePtr = _Clone;
+ 	AH->DeClonePtr = _DeClone;
+ 
+ 	AH->WorkerJobRestorePtr = _WorkerJobRestoreMirror;
+ 	AH->WorkerJobDumpPtr = NULL;
+ 
+ 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
+ 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
+ 
+ 	/* Set up our private context */
+ 	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
+ 	AH->formatData = (void *) ctx;
+ 
+ 	/* Initialize LO buffering */
+ 	AH->lo_buf_size = LOBBUFSIZE;
+ 	AH->lo_buf = (void *) pg_malloc(LOBBUFSIZE);
+ }
+ 
+ /*
+  * Called by the Archiver when the dumper creates a new TOC entry.
+  */
+ static void
+ _ArchiveEntry(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	lclTocEntry *tctx = (lclTocEntry *) pg_calloc(1, sizeof(lclTocEntry));
+ 
+ 	te->formatData = (void *) tctx;
+ }
+ 
+ void
+ setActiveConnection(ArchiveHandle *AH, T_Action act, bool connOnly)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	if (act == ACT_DUMP)
+ 	{
+ 		AH->connection = ctx->dumpConn;
+ 		if (connOnly)
+ 			return;
+ 	}
+ 	else
+ 	{
+ 		AH->connection = ctx->restoreConn;
+ 		if (connOnly)
+ 			return;
+ 	}
+ }
+ 
+ void
+ MirrorDecoupleDumpConn(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	ctx->dumpConn = AH->connection;
+ 	AH->connection = NULL;
+ }
+ 
+ /*
+  * Called by the archiver when saving TABLE DATA (not schema). This routine
+  * should save whatever format-specific information is needed to read
+  * the archive back.
+  *
+  * It is called just prior to the dumper's 'DataDumper' routine being called.
+  *
+  * We create the data file for writing.
+  *
+  * static void _StartData(ArchiveHandle *AH, TocEntry *te) { }
+  */
+ 
+ /*
+  * Called by archiver when dumper calls WriteData. This routine is
+  * called for both BLOB and TABLE data; it is the responsibility of
+  * the format to manage each kind of data using StartBlob/StartData.
+  *
+  * It should only be called from within a DataDumper routine.
+  */
+ static size_t
+ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
+ {
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ 	ahwrite(data, 1, dLen, AH);
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ 
+ 	return dLen;
+ }
+ 
+ /*
+  * Called by the archiver when a dumper's 'DataDumper' routine has
+  * finished.
+  *
+  * static void _EndData(ArchiveHandle *AH, TocEntry *te) { }
+  */
+ 
+ /*
+  * Print data for a given TOC entry
+  */
+ static void
+ _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
+ {
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ 	WriteDataChunksForTocEntry(AH, te);
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ }
+ 
+ /*
+  * Write a byte of data to the archive.
+  * Called by the archiver to do integer & byte output to the archive.
+  * These routines are only used to read & write the headers & TOC.
+  *
+  * static int _WriteByte(ArchiveHandle *AH, const int i) { }
+  */
+ 
+ /*
+  * Read a byte of data from the archive.
+  * Called by the archiver to read bytes & integers from the archive.
+  * These routines are only used to read & write headers & TOC.
+  * EOF should be treated as a fatal error.
+  *
+  * static int _ReadByte(ArchiveHandle *AH) { }
+  */
+ 
+ /*
+  * Write a buffer of data to the archive.
+  *
+  * Called by the archiver to write a block of bytes to the TOC or a data file.
+  *
+  * static size_t _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len) { }
+  */
+ 
+ /*
+  * Read a block of bytes from the archive.
+  *
+  * Called by the archiver to read a block of bytes from the archive
+  *
+  * static size_t _ReadBuf(ArchiveHandle *AH, void *buf, size_t len) { }
+  */
+ 
+ /*
+  * Close the archive.
+  *
+  * When writing the archive, this is the routine that actually starts
+  * the process of saving it to files. No data should be written prior
+  * to this point, since the user could sort the TOC after creating it.
+  *
+  * If an archive is to be written, this routine must call:
+  *		WriteHead			to save the archive header
+  *		WriteToc			to save the TOC entries
+  *		WriteDataChunks		to save all DATA & BLOBs.
+  *
+  * static void _CloseArchive(ArchiveHandle *AH) { }
+  */
+ 
+ /*
+  * Reopen the archive.  Nothing to do here: the mirror format has no file
+  * handle to reopen; worker connections are set up in _Clone instead.
+  */
+ static void
+ _ReopenArchive(ArchiveHandle *AH)
+ {
+ }
+ 
+ /*
+  * BLOB support
+  */
+ 
+ /*
+  * Called by the archiver when starting to save all BLOB DATA (not schema).
+  * It is called just prior to the dumper's DataDumper routine.
+  *
+  * We open the large object TOC file here, so that we can append a line to
+  * it for each blob.
+  */
+ static void
+ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ 	StartRestoreBlobs(AH);
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ }
+ 
+ /*
+  * Called by the archiver when we're about to start dumping a blob.
+  *
+  * We create a file to write the blob to.
+  */
+ static void
+ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
+ {
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ 	StartRestoreBlob(AH, oid, AH->ropt->dropSchema);
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ }
+ 
+ /*
+  * Called by the archiver when the dumper is finished writing a blob.
+  *
+  * We close the blob file and write an entry to the blob TOC file for it.
+  */
+ static void
+ _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
+ {
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ 	EndRestoreBlob(AH, oid);
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ }
+ 
+ /*
+  * Called by the archiver when finishing saving all BLOB DATA.
+  *
+  * We close the blobs TOC file.
+  */
+ static void
+ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	setActiveConnection(AH, ACT_RESTORE, true);
+ 	EndRestoreBlobs(AH);
+ 	setActiveConnection(AH, ACT_DUMP, true);
+ }
+ 
+ /*
+  * Clone format-specific fields during parallel restoration.
+  */
+ static void
+ _Clone(ArchiveHandle *AH)
+ {
+ 	char	   *dbname;
+ 	char	   *pghost;
+ 	char	   *pgport;
+ 	char	   *username;
+ 	const char *encname;
+ 
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	AH->formatData = (lclContext *) pg_malloc(sizeof(lclContext));
+ 	memcpy(AH->formatData, ctx, sizeof(lclContext));
+ 	ctx = (lclContext *) AH->formatData;
+ 
+ 	Assert(ctx->dumpConn != NULL);
+ 
+ 	ctx->restoreConn = AH->connection;
+ 	AH->connection = NULL;
+ 
+ 	/*
+ 	 * Even though we are technically accessing the parent's database object
+ 	 * here, it is fine to call these functions because they all just
+ 	 * return a pointer and do not actually send/receive any data to/from the
+ 	 * database.
+ 	 */
+ 	dbname = PQdb(ctx->dumpConn);
+ 	pghost = PQhost(ctx->dumpConn);
+ 	pgport = PQport(ctx->dumpConn);
+ 	username = PQuser(ctx->dumpConn);
+ 	/* XXX could we have different encodings ? */
+ 	encname = pg_encoding_to_char(AH->public.encoding);
+ 
+ 	if (AH->savedPassword)
+ 		AH->savedPassword = NULL;
+ 
+ 	/* XXX this does more than we most likely want */
+ 	ConnectDatabase((Archive *) AH, dbname, pghost, pgport, username, TRI_NO);
+ 
+ 	ctx->dumpConn = AH->connection;
+ 
+ 	/*
+ 	 * Leave the dumpConn in AH->connection so that we run SetupConnection
+ 	 * on it.
+ 	 */
+ 
+ 	/*
+ 	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
+ 	 * entry per archive, so no parallelism is possible.  Likewise,
+ 	 * TOC-entry-local state isn't an issue because any one TOC entry is
+ 	 * touched by just one worker child.
+ 	 */
+ 
+ 	/*
+ 	 * We also don't copy the ParallelState pointer (pstate), only the master
+ 	 * process ever writes to it.
+ 	 */
+ }
+ 
+ static void
+ _DeClone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 	free(ctx);
+ }
+ 
+ /*
+  * This function is executed in the parent process. Depending on the desired
+  * action (dump or restore) it creates a string that is understood by the
+  * _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows, because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char	buf[64];
+ 
+ 	snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in a worker process of a parallel mirror run
+  * and restores the actual data for the given TOC entry.
+  */
+ static char *
+ _WorkerJobRestoreMirror(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * A short fixed-size string plus an ID so far; this needs to be
+ 	 * malloc'ed instead of static because we work with threads on Windows.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	/* Prepare the item */
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	/* Prepare AH */
+ 	setActiveConnection(AH, ACT_RESTORE, false);
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	tctx->status = status;
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	setActiveConnection(AH, ACT_RESTORE, false);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response of
+  * the _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, n_errors;
+ 	int			status = 0;
+ 
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	sscanf(str, "%d %d %d%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 	Assert(dumpId == te->dumpId);
+ 	Assert(nBytes == strlen(str));
+ 
+ 	AH->public.n_errors += n_errors;
+ 
+ 	return status;
+ }
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5a15435..e8b60ff 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	no_security_labels = 0;
*** 143,148 ****
--- 143,149 ----
  static int  no_synchronized_snapshots = 0;
  static int	no_unlogged_table_data = 0;
  static int	serializable_deferrable = 0;
+ static int  mirror_exit_on_error = 0;
  
  
  static void help(const char *progname);
*************** main(int argc, char **argv)
*** 266,271 ****
--- 267,276 ----
  	const char *pghost = NULL;
  	const char *pgport = NULL;
  	const char *username = NULL;
+ 	char	   *restore_dbname = NULL;
+ 	char	   *restore_pghost = NULL;
+ 	char	   *restore_pgport = NULL;
+ 	char	   *restore_username = NULL;
  	const char *dumpencoding = NULL;
  	bool		oids = false;
  	TableInfo  *tblinfo;
*************** main(int argc, char **argv)
*** 288,297 ****
--- 293,304 ----
  	RestoreOptions *ropt;
  	ArchiveFormat archiveFormat = archUnknown;
  	ArchiveMode archiveMode;
+ 	int			exit_code;
  
  	static int	disable_triggers = 0;
  	static int	outputNoTablespaces = 0;
  	static int	use_setsessauth = 0;
+ 	static int	in_restore_options = 0;
  
  	static struct option long_options[] = {
  		{"data-only", no_argument, NULL, 'a'},
*************** main(int argc, char **argv)
*** 345,350 ****
--- 352,365 ----
  		{"no-synchronized-snapshots", no_argument, &no_synchronized_snapshots, 1},
  		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
  
+ 		/*
+ 		 * The following option switches interpretation of the connection
+ 		 * options to the target database, for pg_backup_mirror.
+ 		 */
+ 		{"restore", no_argument, &in_restore_options, 1},
+ 		{"dbname", required_argument, NULL, 6},
+ 		{"mirror-exit-on-error", no_argument, &mirror_exit_on_error, 1},
+ 
  		{NULL, 0, NULL, 0}
  	};
  
*************** main(int argc, char **argv)
*** 414,420 ****
  				break;
  
  			case 'h':			/* server host */
! 				pghost = optarg;
  				break;
  
  			case 'i':
--- 429,438 ----
  				break;
  
  			case 'h':			/* server host */
! 				if (!in_restore_options)
! 					pghost = optarg;
! 				else
! 					restore_pghost = optarg;
  				break;
  
  			case 'i':
*************** main(int argc, char **argv)
*** 443,449 ****
  				break;
  
  			case 'p':			/* server port */
! 				pgport = optarg;
  				break;
  
  			case 'R':
--- 461,470 ----
  				break;
  
  			case 'p':			/* server port */
! 				if (!in_restore_options)
! 					pgport = optarg;
! 				else
! 					restore_pgport = optarg;
  				break;
  
  			case 'R':
*************** main(int argc, char **argv)
*** 468,474 ****
  				break;
  
  			case 'U':
! 				username = optarg;
  				break;
  
  			case 'v':			/* verbose */
--- 489,498 ----
  				break;
  
  			case 'U':
! 				if (!in_restore_options)
! 					username = optarg;
! 				else
! 					restore_username = optarg;
  				break;
  
  			case 'v':			/* verbose */
*************** main(int argc, char **argv)
*** 511,516 ****
--- 535,545 ----
  				set_section(optarg, &dumpSections);
  				break;
  
+ 			case 6:
+ 				/* XXX check that we're in --restore ? */
+ 				restore_dbname = optarg;
+ 				break;
+ 
  			default:
  				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
  				exit(1);
*************** main(int argc, char **argv)
*** 602,608 ****
  	}
  
! 	/* Parallel backup only in the directory archive format so far */
! 	if (archiveFormat != archDirectory && numWorkers > 1) {
! 		write_msg(NULL, "parallel backup only supported by the directory format\n");
  	exit(1);
  }
--- 631,637 ----
  	}
  
! 	/* Parallel backup only in the directory and mirror archive formats so far */
! 	if (archiveFormat != archDirectory && archiveFormat != archMirror && numWorkers > 1) {
! 		write_msg(NULL, "parallel backup only supported by the directory and mirror formats\n");
  		exit(1);
  	}
*************** main(int argc, char **argv)
*** 764,770 ****
  		sortDumpableObjectsByTypeOid(dobjs, numObjs);
  
  	/* If we do a parallel dump, we want the largest tables to go first */
! 	if (archiveFormat == archDirectory && numWorkers > 1)
  		sortDataAndIndexObjectsBySize(dobjs, numObjs);
  
  	sortDumpableObjects(dobjs, numObjs);
--- 793,799 ----
  		sortDumpableObjectsByTypeOid(dobjs, numObjs);
  
  	/* If we do a parallel dump, we want the largest tables to go first */
! 	if (numWorkers > 1 && (archiveFormat == archDirectory || archiveFormat == archMirror))
  		sortDataAndIndexObjectsBySize(dobjs, numObjs);
  
  	sortDumpableObjects(dobjs, numObjs);
*************** main(int argc, char **argv)
*** 789,795 ****
  	/*
  	 * And finally we can do the actual output.
  	 */
! 	if (plainText)
  	{
  		ropt = NewRestoreOptions();
  		ropt->filename = (char *) filename;
--- 818,824 ----
  	/*
  	 * And finally we can do the actual output.
  	 */
! 	if (plainText || archiveFormat == archMirror)
  	{
  		ropt = NewRestoreOptions();
  		ropt->filename = (char *) filename;
*************** main(int argc, char **argv)
*** 810,823 ****
  
  		ropt->suppressDumpWarnings = true;		/* We've already shown them */
  
  		RestoreArchive(g_fout, ropt);
  	}
  
  	CloseArchive(g_fout);
  
  	PQfinish(g_conn);
  
! 	exit(0);
  }
  
  
--- 839,874 ----
  
  		ropt->suppressDumpWarnings = true;		/* We've already shown them */
  
+ 		if (archiveFormat == archMirror)
+ 		{
+ 			ropt->dbname = restore_dbname;
+ 			ropt->pghost = restore_pghost;
+ 			ropt->pgport = restore_pgport;
+ 			ropt->username = restore_username;
+ 			ropt->promptPassword = prompt_password;
+ 			ropt->useDB = 1;
+ 			ropt->exit_on_error = g_fout->exit_on_error = mirror_exit_on_error;
+ 
+ 			/* Disconnect the dump connection */
+ 			MirrorDecoupleDumpConn((ArchiveHandle *) g_fout);
+ 		}
+ 
  		RestoreArchive(g_fout, ropt);
  	}
  
+ 	/* done, print a summary of ignored errors */
+ 	if (g_fout->n_errors)
+ 		fprintf(stderr, _("WARNING: errors ignored on restore: %d\n"),
+ 				g_fout->n_errors);
+ 
+ 	/* Compute the exit code now; CloseArchive may free the handle. */
+ 	exit_code = g_fout->n_errors ? 1 : 0;
+ 
  	CloseArchive(g_fout);
  
  	PQfinish(g_conn);
  
! 	exit(exit_code);
  }
  
  
*************** parseArchiveFormat(const char *format, A
*** 1089,1094 ****
--- 1140,1149 ----
  		 * documented.
  		 */
  		archiveFormat = archFiles;
+ 	else if (pg_strcasecmp(format, "m") == 0)
+ 		archiveFormat = archMirror;
+ 	else if (pg_strcasecmp(format, "mirror") == 0)
+ 		archiveFormat = archMirror;
  	else if (pg_strcasecmp(format, "p") == 0)
  		archiveFormat = archNull;
  	else if (pg_strcasecmp(format, "plain") == 0)
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 965e9f2..7db2c88 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** usage(const char *progname)
*** 498,500 ****
--- 498,504 ----
  	printf(_("\nIf no input file name is supplied, then standard input is used.\n\n"));
  	printf(_("Report bugs to <pgsql-bugs@postgresql.org>.\n"));
  }
+ 
+ /* Dummy: while implemented as an archive format, the mirror runs entirely in pg_dump */
+ void
+ InitArchiveFmt_Mirror(ArchiveHandle *AH)
+ {
+ }
+ 
#2Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#1)
Re: patch for parallel pg_dump

On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:

So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format, the patch works on Windows and Unix.

This patch introduces a large amount of notational churn, as perhaps
well-illustrated by this hunk:

  static int
! dumpBlobs(Archive *AHX, void *arg)
  {
+       /*
+        * This is a data dumper routine, executed in a child for parallel
+        * backup, so it must not access the global g_conn but AH->connection
+        * instead.
+        */
+       ArchiveHandle *AH = (ArchiveHandle *) AHX;

It seems pretty grotty to have a static function that gets an argument
of the wrong type, and then just immediately turns around and casts it
to something else. It's not exactly obvious that that's even safe,
especially if you don't know that ArchiveHandle is a struct whose
first element is an Archive. But even if you do know that subclassing
is intended, that doesn't prove that the particular Archive object is
always going to be an ArchiveHandle under the hood. If it is, why not
just pass it as an ArchiveHandle to begin with? I think this needs to
be revised in some way. At least in the few cases I checked, the only
point of getting at the ArchiveHandle this way was to find
AH->connection, which suggests to me either that AH->connection should
be in the "public" section, inside Archive rather than ArchiveHandle,
or else that we ought to just pass the connection object to this
function (and all of its friends who have similar issues) as a
separate argument. Either way, I think that would make this patch
both cleaner and less-invasive. In fact we might want to pull out
just that change and commit it separately to simplify review of the
remaining work.

It's not clear to me why fmtQualifiedId needs to move to dumputils.c.
The way you have it, fmtQualifiedId() is now with fmtId(), but no
longer with fmtCopyColumnList(), the only other similarly named
function in that directory. That may be more logical, or it may not,
but rearranging the code like this makes it a lot harder to review,
and I would prefer that we make such changes as separate commits if
we're going to do them, so that diff can do something sensible with
the changes that are integral to the patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#2)
Re: patch for parallel pg_dump

On Fri, Jan 27, 2012 at 10:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:

So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format, the patch works on Windows and Unix.

This patch introduces a large amount of notational churn, as perhaps
well-illustrated by this hunk:

 static int
! dumpBlobs(Archive *AHX, void *arg)
 {
+       /*
+        * This is a data dumper routine, executed in a child for parallel backup,
+        * so it must not access the global g_conn but AH->connection instead.
+        */
+       ArchiveHandle *AH = (ArchiveHandle *) AHX;

It seems pretty grotty to have a static function that gets an argument
of the wrong type, and then just immediately turns around and casts it
to something else.  It's not exactly obvious that that's even safe,
especially if you don't know that ArchiveHandle is a struct whose
first element is an Archive.  But even if you do know that subclassing
is intended, that doesn't prove that the particular Archive object is
always going to be an ArchiveHandle under the hood.  If it is, why not
just pass it as an ArchiveHandle to begin with?  I think this needs to
be revised in some way.  At least in the few cases I checked, the only
point of getting at the ArchiveHandle this way was to find
AH->connection, which suggests to me either that AH->connection should
be in the "public" section, inside Archive rather than ArchiveHandle,
or else that we ought to just pass the connection object to this
function (and all of its friends who have similar issues) as a
separate argument.  Either way, I think that would make this patch
both cleaner and less-invasive.  In fact we might want to pull out
just that change and commit it separately to simplify review of the
remaining work.

It's not clear to me why fmtQualifiedId needs to move to dumputils.c.
The way you have it, fmtQualifiedId() is now with fmtId(), but no
longer with fmtCopyColumnList(), the only other similarly named
function in that directory.  That may be more logical, or it may not,
but rearranging the code like this makes it a lot harder to review,
and I would prefer that we make such changes as separate commits if
we're going to do them, so that diff can do something sensible with
the changes that are integral to the patch.

Woops, hit send a little too soon there. I'll try to make some more
substantive comments after looking at this more.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#3)
Re: patch for parallel pg_dump

On Fri, Jan 27, 2012 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:

It's not clear to me why fmtQualifiedId needs to move to dumputils.c.
The way you have it, fmtQualifiedId() is now with fmtId(), but no
longer with fmtCopyColumnList(), the only other similarly named
function in that directory.  That may be more logical, or it may not,
but rearranging the code like this makes it a lot harder to review,
and I would prefer that we make such changes as separate commits if
we're going to do them, so that diff can do something sensible with
the changes that are integral to the patch.

Woops, hit send a little too soon there.  I'll try to make some more
substantive comments after looking at this more.

And maybe retract some of the bogus ones, like the above: I see why
you moved this, now - parallel.c needs it.

Still looking...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#1)
Re: patch for parallel pg_dump

On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:

So this is the parallel pg_dump patch, generalizing the existing
parallel restore and allowing parallel dumps for the directory archive
format, the patch works on Windows and Unix.

It seems a little unfortunate that we are using threads here on
Windows and processes on Linux. Maybe it's too late to revisit that
decision, but it seems like a recipe for subtle bugs.

In parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
parallel dump, however, we need to hold the master's connection open so
that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, the worker process also requests
a shared lock on the table with NOWAIT, and if this fails, we know that
there is a conflicting lock in between and that we need to abort the
dump.

I think this is an acceptable limitation, but the window where it can
happen seems awfully wide right now. As things stand, it seems like
we don't try to lock the table in the child until we're about to
access it, which means that, on a large database, we could dump out
99% of the database and then be forced to abort the dump because of a
conflicting lock on the very last table. We could fix that by having
every child lock every table right at the beginning, so that all
possible failures of this type would happen before we do any work, but
that will eat up a lot of lock table space. It would be nice if the
children could somehow piggyback on the parent's locks, but I don't
see any obvious way to make that work. Maybe we just have to live
with it the way it is, but I worry that people whose dumps fail 10
hours into a 12 hour parallel dump are going to be grumpy.

The connections of the parallel dump use the synchronized snapshot
feature. However there's also an option --no-synchronized-snapshots
which can be used to dump from an older PostgreSQL version.

Right now, you have it set up so that --no-synchronized-snapshots is
ignored even if synchronized snapshots are supported, which doesn't
make much sense to me. I think you should allow
--no-synchronized-snapshots with any release, and error out if it's
not specified and the version is too old to work without it. It's
probably not a good idea to run with --no-synchronized-snapshots ever,
and doubly so if they're not available, but I'd rather leave that
decision to the user. (Imagine, for example, that we discovered a bug
in our synchronized snapshot implementation.)

I am tempted to advocate calling the flag --unsynchronized-snapshots,
because to me that underscores the danger a little more clearly, but
perhaps that is too clever.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#5)
Re: patch for parallel pg_dump

On 27.01.2012 18:46, Robert Haas wrote:

On Sun, Jan 15, 2012 at 1:01 PM, Joachim Wieland <joe@mcknight.de> wrote:

In parallel restore, the master closes its own connection to the
database before forking off worker processes, just as it does now. In
parallel dump, however, we need to hold the master's connection open so
that we can detect deadlocks. The issue is that somebody could have
requested an exclusive lock after the master initially requested a
shared lock on all tables. Therefore, the worker process also requests
a shared lock on the table with NOWAIT, and if this fails, we know that
there is a conflicting lock in between and that we need to abort the
dump.

I think this is an acceptable limitation, but the window where it can
happen seems awfully wide right now. As things stand, it seems like
we don't try to lock the table in the child until we're about to
access it, which means that, on a large database, we could dump out
99% of the database and then be forced to abort the dump because of a
conflicting lock on the very last table. We could fix that by having
every child lock every table right at the beginning, so that all
possible failures of this type would happen before we do any work, but
that will eat up a lot of lock table space. It would be nice if the
children could somehow piggyback on the parent's locks, but I don't
see any obvious way to make that work. Maybe we just have to live
with it the way it is, but I worry that people whose dumps fail 10
hours into a 12 hour parallel dump are going to be grumpy.

If the master process keeps the locks it acquires in the beginning, you
could fall back to dumping those tables where the child lock fails using
the master connection.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#6)
Re: patch for parallel pg_dump

On Fri, Jan 27, 2012 at 11:53 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

If the master process keeps the locks it acquires in the beginning, you
could fall back to dumping those tables where the child lock fails using the
master connection.

Hmm, that's a thought.

Another idea I just had is to allow a transaction that has imported a
snapshot to jump the queue when attempting to acquire a lock that the
backend from which the snapshot was imported already holds. We don't
want to allow queue-jumping in general because there are numerous
places in the code where we rely on the current behavior to avoid
starving strong lockers, but it seems like it might be reasonable to
allow it in this special case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#2)
Re: patch for parallel pg_dump

On Fri, Jan 27, 2012 at 4:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:

But even if you do know that subclassing
is intended, that doesn't prove that the particular Archive object is
always going to be an ArchiveHandle under the hood.  If it is, why not
just pass it as an ArchiveHandle to begin with?

I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump, which would get rid of these useless casts. I admit that I
might have made it a bit worse by adding a few more of these casts, but
the fundamental issue was already there, and there is precedent for
casting between them in both directions :-)

Joachim

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joachim Wieland (#8)
Re: patch for parallel pg_dump

Joachim Wieland <joe@mcknight.de> writes:

I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump, which would get rid of these useless casts.

I'd like to see a more thoroughgoing look at the basic structure of
pg_dump. Everybody who's ever looked at that code has found it
confusing, with the possible exception of the original author who is
long gone from the project anyway. I don't know exactly what would make
it better, but the useless distinction between Archive and ArchiveHandle
seems like a minor annoyance, not the core disease.

Not that there'd be anything wrong with starting with that.

regards, tom lane

#10Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#9)
Re: patch for parallel pg_dump

On Sun, Jan 29, 2012 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Joachim Wieland <joe@mcknight.de> writes:

I know that you took back some of your comments, but I'm with you
here. Archive is allocated as an ArchiveHandle and then cast back to
Archive*, so you always know that an Archive is an ArchiveHandle. I'm
all for getting rid of Archive and just using ArchiveHandle throughout
pg_dump, which would get rid of these useless casts.

I'd like to see a more thoroughgoing look at the basic structure of
pg_dump.  Everybody who's ever looked at that code has found it
confusing, with the possible exception of the original author who is
long gone from the project anyway.  I don't know exactly what would make
it better, but the useless distinction between Archive and ArchiveHandle
seems like a minor annoyance, not the core disease.

Not that there'd be anything wrong with starting with that.

After some study, I'm reluctant to completely abandon the
Archive/ArchiveHandle distinction because it seems to me that it does
serve some purpose: right now, nothing in pg_dump.c - where all the
code to actually dump stuff lives - knows anything about what's inside
the ArchiveHandle, just the Archive. So having two data structures
really does serve a purpose: if it's part of the Archive, you need it
in order to query the system catalogs and generate SQL. If it isn't,
you don't. Considering how much more crap there is inside an
ArchiveHandle than an Archive, I don't think we should lightly abandon
that distinction.

Now, that having been said, there are probably better ways of making
that distinction than what we have here; Archive and ArchiveHandle
could be better named, perhaps, and we could have pointers from one
structure to the other instead of magically embedding them inside each
other. All the function pointers that are part of the ArchiveHandle
could be separated out into a static, constant structure with a name
like ArchiveMethod, and we could keep a pointer to the appropriate
ArchiveMethod inside each ArchiveHandle instead of copying all the
pointers into it. I think that'd be a lot less confusing.

But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style. So I
suggest that we refactor pg_dump.c to get rid of g_conn and g_fout.
Getting rid of g_fout should be fairly easy: in many places, we're
already passing Archive *fout as a parameter. If we pass it as a
parameter to the functions that need it but aren't yet getting it as a
parameter, then it can cease to exist as a global variable and become
local to main().

Getting rid of g_conn is a little harder. Joachim's patch takes the
view that we can safely overload the existing ArchiveHandle.connection
member. Currently, that member is the connection to which we are
doing a direct-to-database restore; what he's proposing to do is also
use it to store the connection from which we are doing the dump. But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things. So I'm
inclined to think we could just add a "PGconn *remote_connection"
member to the Archive structure (to go with all of the similarly
named "remote" fields, all of which describe the DB to be dumped), and
then that would be available everywhere that the Archive itself is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#10)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else). It's also bad style. So I
suggest that we refactor pg_dump.c to get rid of g_conn and g_fout.

I've looked at that a bit in the past and decided that the notational
overhead would be too much. OTOH, if we're going to be forced into it
in order to support parallel pg_dump, we might as well do it first in a
separate patch.

... But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things. So I'm
inclined to think we could just add a "PGconn *remote_connection"
member to the Archive structure (to go with all of the similarly
named "remote" fields, all of which describe the DB to be dumped), and
then that would be available everywhere that the Archive itself is.

I always thought that the "remote" terminology was singularly unhelpful.
It implies there's a "local" connection floating around somewhere, which
of course there is not, and it does nothing to remind you of whether the
connection leads to a database being dumped or a database being restored
into. If we are going to have two fields, could we name them something
less opaque, perhaps "src_connection" and "dest_connection"?

regards, tom lane

#12Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#10)
Re: patch for parallel pg_dump

On Mon, Jan 30, 2012 at 12:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:

But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else).  It's also bad style.

Technically, since most of pg_dump.c dumps the catalog, and since this
isn't done in parallel but only in the master process, most functions
need not be changed for the parallel dump. Only those that are
called from the worker threads need to be changed; this has been done
in e.g. dumpBlobs(), the change that you quoted upthread.

But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things.

Actually I've tried that, but in the end concluded that it's best to
have at most one database connection in an ArchiveHandle if you don't
want to do a lot more refactoring. Besides the normal connection
parameters like host, port, ..., there are also std_strings, encoding,
savedPassword, currUser/currSchema, lo_buf, remoteVersion, ... for
which it wouldn't be obvious where they belonged.

Speaking about refactoring, I'm happy to also throw in the idea to
make the dump and restore more symmetrical than they are now. I kinda
disliked RestoreOptions* being a member of the ArchiveHandle without
having something similar for the dump. Ideally I'd say there should be
a DumpOptions and a RestoreOptions field (with a "struct Connection"
being part of them containing all the different connection
parameters). They could be a union if you wanted to allow only one
connection, or not if you want more than one.

#13Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#12)
Re: patch for parallel pg_dump

On Tue, Jan 31, 2012 at 12:55 AM, Joachim Wieland <joe@mcknight.de> wrote:

On Mon, Jan 30, 2012 at 12:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:

But the immediate problem is that pg_dump.c is heavily reliant on
global variables, which isn't going to fly if we want this code to use
threads on Windows (or anywhere else).  It's also bad style.

Technically, since most of pg_dump.c dumps the catalog, and since this
isn't done in parallel but only in the master process, most functions
need not be changed for the parallel dump. Only those that are
called from the worker threads need to be changed; this has been done
in e.g. dumpBlobs(), the change that you quoted upthread.

If we're going to go to the trouble of cleaning this up, I'd prefer to
rationalize it rather than doing just the absolute bare minimum amount
of stuff needed to make it appear to work.

But it
seems possible that we might someday want to dump from one database
and restore into another database at the same time, so maybe we ought
to play it safe and use different variables for those things.

Actually I've tried that, but in the end concluded that it's best to
have at most one database connection in an ArchiveHandle if you don't
want to do a lot more refactoring. Besides the normal connection
parameters like host, port, ..., there are also std_strings, encoding,
savedPassword, currUser/currSchema, lo_buf, remoteVersion, ... for
which it wouldn't be obvious where they belonged.

And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.

I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring. The current
situation is not serving anyone well.

Speaking about refactoring, I'm happy to also throw in the idea to
make the dump and restore more symmetrical than they are now. I kinda
disliked RestoreOptions* being a member of the ArchiveHandle without
having something similar for the dump. Ideally I'd say there should be
a DumpOptions and a RestoreOptions field (with a "struct Connection"
being part of them containing all the different connection
parameters). They could be a union if you wanted to allow only one
connection, or not if you want more than one.

I like the idea of a struct Connection. I think that could make a lot of sense.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#13)
Re: patch for parallel pg_dump

On Tue, Jan 31, 2012 at 9:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.

I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring.  The current
situation is not serving anyone well.

I guess I'd find it cleaner to have just one connection per Archive
(or ArchiveHandle). If you need two connections, why not just have two
Archive objects, as they would have different characteristics anyway,
one for dumping data, one to restore.

#15Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#14)
Re: patch for parallel pg_dump

On Tue, Jan 31, 2012 at 4:46 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Tue, Jan 31, 2012 at 9:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

And just for added fun and excitement, they all have inconsistent
naming conventions and inadequate documentation.

I think if we need more refactoring in order to support multiple
database connections, we should go do that refactoring.  The current
situation is not serving anyone well.

I guess I'd find it cleaner to have just one connection per Archive
(or ArchiveHandle). If you need two connections, why not just have two
Archive objects, as they would have different characteristics anyway,
one for dumping data, one to restore.

I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?

And then ArchiveHandle can store all the things that aren't related to
a specific connection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#15)
Re: patch for parallel pg_dump

On Wed, Feb 1, 2012 at 12:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?

And then ArchiveHandle can store all the things that aren't related to
a specific connection.

How about something like this:
(Hopefully you'll come up with better names...)

StateHandle {
Connection
Schema
Archive
Methods
union {
DumpOptions
RestoreOptions
}
}

Dumping would mean to do:

Connection -> Schema -> Archive using DumpOptions through the
specified Methods

Restore:

Archive -> Schema -> Connection using RestoreOptions through the
specified Methods

Dumping from one database and restoring into another one would be two
StateHandles with different Connections, Archive == NULL, Schema
pointing to the same Schema, Methods most likely also pointing to the
same function pointer table and each with different Options in the
union of course.

Granted, you could say that above I've only grouped the elements of
the ArchiveHandle, but I don't really see how breaking it up into
several objects makes it any better or easier. There could be some
accessor functions to hide the details of the whole object from the
different consumers.

#17Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#16)
Re: patch for parallel pg_dump

On Thu, Feb 2, 2012 at 8:31 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Wed, Feb 1, 2012 at 12:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we're more-or-less proposing to rename "Archive" to
"Connection", aren't we?

And then ArchiveHandle can store all the things that aren't related to
a specific connection.

How about something like this:
(Hopefully you'll come up with better names...)

StateHandle {
  Connection
  Schema
  Archive
  Methods
  union {
     DumpOptions
     RestoreOptions
  }
}

Dumping would mean to do:

 Connection -> Schema -> Archive using DumpOptions through the
specified Methods

Restore:

  Archive -> Schema -> Connection using RestoreOptions through the
specified Methods

Dumping from one database and restoring into another one would be two
StateHandles with different Connections, Archive == NULL, Schema
pointing to the same Schema, Methods most likely also pointing to the
same function pointer table and each with different Options in the
union of course.

Granted, you could say that above I've only grouped the elements of
the ArchiveHandle, but I don't really see how breaking it up into
several objects makes it any better or easier. There could be some
accessor functions to hide the details of the whole object from the
different consumers.

I'm not sure I understand what the various structures would contain.

My gut feeling for how to begin grinding through this is to go through
and do the following:

1. Rename Archive to Connection.
2. Add a PGconn object to it.
3. Change the declaration inside ArchiveHandle from "Archive public"
to "Connection source_connection".

I think that'd get us significantly closer to sanity and be pretty
simple to understand, and then we can take additional passes over it
until we're happy with what we've got.

If you're OK with that much I'll go do it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#17)
Re: patch for parallel pg_dump

On Fri, Feb 3, 2012 at 7:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:

If you're OK with that much I'll go do it.

Sure, go ahead!

#19Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#18)
Re: patch for parallel pg_dump

On Fri, Feb 3, 2012 at 10:43 AM, Joachim Wieland <joe@mcknight.de> wrote:

On Fri, Feb 3, 2012 at 7:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:

If you're OK with that much I'll go do it.

Sure, go ahead!

It turns out that (as you anticipated) there are some problems with my
previous proposal. For one thing, an Archive isn't really just a
connection, because it's also used as a data sink by passing it to
functions like ArchiveEntry(). For two things, as you probably
realized but I failed to, ConnectDatabase() is already setting
AH->connection to the same PGconn it returns, so the idea that we can
potentially have multiple connection objects using the existing
framework is not really true; or at least it's going to require more
work than I originally thought. I think it might still be worth doing
at some point, but I think we probably need to clean up some of the
rest of this mess first.

I've now rejiggered things so that the Archive is passed down to
everything that needs it, so the global variable g_fout is gone. I've
also added a couple of additional accessors to pg_backup_db.c so that
most places that issue queries can simply use those routines without
needing to peek under the hood into the ArchiveHandle. This is not
quite enough to get rid of g_conn, but it's close: the major stumbling
block at this point is probably exit_nicely(). The gyrations we're
going through to make sure that AH->connection gets closed before
exiting are fairly annoying; maybe we should invent something in
dumputils.c along the line of the backend's on_shmem_exit().

I'm starting to think it might make sense to press on with this
refactoring just a bit further and eliminate the distinction between
Archive and ArchiveHandle. Given a few more accessors (and it really
looks like it would only be a few), pg_dump.c could treat an
ArchiveHandle as an opaque struct, which would accomplish more or less
the same thing as the current design, but in a less confusing fashion
- i.e. without giving the reader the idea that the author desperately
wishes he were coding in C++. That would allow simplification in a
number of other places; just to take one example, we wouldn't need both
appendStringLiteralAH and appendStringLiteralAHX.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

most places that issue queries can simply use those routines without
needing to peek under the hood into the ArchiveHandle. This is not
quite enough to get rid of g_conn, but it's close: the major stumbling
block at this point is probably exit_nicely(). The gyrations we're
going through to make sure that AH->connection gets closed before
exiting are fairly annoying; maybe we should invent something in
dumputils.c along the line of the backend's on_shmem_exit().

Do we actually care about closing the connection? Worst case is that
the backend logs an "unexpected EOF" message. But yeah, an atexit hook
might be the easiest solution.

regards, tom lane

#21Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#19)
Re: patch for parallel pg_dump

On Tue, Feb 7, 2012 at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:

It turns out that (as you anticipated) there are some problems with my
previous proposal.

I assume you're talking to Tom, as my powers of anticipation are
actually quite limited... :-)

This is not
quite enough to get rid of g_conn, but it's close: the major stumbling
block at this point is probably exit_nicely().  The gyrations we're
going through to make sure that AH->connection gets closed before
exiting are fairly annoying; maybe we should invent something in
dumputils.c along the line of the backend's on_shmem_exit().

Yeah, this becomes even more important with parallel jobs, where you
want all worker processes to die once the parent exits. Otherwise some
ten already-started processes would continue to dump your ten largest
tables for the next few hours with the master process long dead, all
while you're about to start up the next master process...

In my patch I dealt with exactly the same problem for the error
handler by creating a separate function that has a static variable (a
pointer to the ParallelState). The value is set and retrieved through
the same function, so yes, it's kinda global but then again it can
only be accessed from this function, which is only called from the
error handler.

I'm starting to think it might make sense to press on with this
refactoring just a bit further and eliminate the distinction between
Archive and ArchiveHandle.

How about doing more refactoring after applying the patch, you'd then
see what is really needed and then we'd also have an actual use case
for more than one connection (You might have already guessed that this
proposal is heavily influenced by my self-interest of avoiding too
much work to make my patch match your refactoring)...

#22Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#21)
Re: patch for parallel pg_dump

On Tue, Feb 7, 2012 at 10:21 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Tue, Feb 7, 2012 at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:

It turns out that (as you anticipated) there are some problems with my
previous proposal.

I assume you're talking to Tom, as my powers of anticipation are
actually quite limited... :-)

No, I was talking to you, actually. You earlier suggested that an
ArchiveHandle was only meant to contain a single PGconn, and it seems
you're right. We can refactor that assumption away, but not
trivially.

In my patch I dealt with exactly the same problem for the error
handler by creating a separate function that has a static variable (a
pointer to the ParallelState). The value is set and retrieved through
the same function, so yes, it's kinda global but then again it can
only be accessed from this function, which is only called from the
error handler.

How did you make this thread-safe?

I'm starting to think it might make sense to press on with this
refactoring just a bit further and eliminate the distinction between
Archive and ArchiveHandle.

How about doing more refactoring after applying the patch? You'd then
see what is really needed, and we'd also have an actual use case
for more than one connection. (You might have already guessed that this
proposal is heavily influenced by my self-interest in avoiding too
much work to make my patch match your refactoring...)

Well, I don't think your patch is going to be too heavily affected
either way, because most of your changes were not in pg_dump.c, and
the need for any changes at all in pg_dump.c should rapidly be going
away. At any rate, the problem with stopping here is that g_conn is
still floating around, and one way or the other we've got to get rid
of it if you want to have more than one ArchiveHandle floating around
at a time. So we at least need to press on far enough to get to that
point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#23Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#22)
Re: patch for parallel pg_dump

On Wed, Feb 8, 2012 at 1:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:

In my patch I dealt with exactly the same problem for the error
handler by creating a separate function that has a static variable (a
pointer to the ParallelState). The value is set and retrieved through
the same function, so yes, it's kinda global but then again it can
only be accessed from this function, which is only called from the
error handler.

How did you make this thread-safe?

The ParallelState has a ParallelSlot for each worker process which
contains that worker process's thread id. So a worker process just
walks through the slots until it finds its own thread id and then goes
from there.

In particular, it only gets the file descriptors to and from the
parent from this structure, to communicate the error it encountered,
but it doesn't get the PGconn. This is because that error handler is
only called when a worker calls die_horribly(AH, ...) in which case
the connection is already known through AH.

Termination via a signal just sets a variable that is checked in the
I/O routines and there we also have AH to shut down the connection
(actually to call die_horribly()).

So we at least need to press on far enough to get to that point.

Sure, let me know if I can help you with something.

#24Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#23)
Re: patch for parallel pg_dump

On Wed, Feb 8, 2012 at 7:56 PM, Joachim Wieland <joe@mcknight.de> wrote:

So we at least need to press on far enough to get to that point.

Sure, let me know if I can help you with something.

Alright. I think (hope) that I've pushed this far enough to serve the
needs of parallel pg_dump. The error handling is still pretty grotty
and might need a bit more surgery to handle the parallel case, but I
think that making this actually not ugly will require eliminating the
Archive/ArchiveHandle distinction, which is probably a good thing to
do but, as you point out, maybe not the first priority right now.

Can you provide an updated patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#25Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#24)
1 attachment(s)
Re: patch for parallel pg_dump

On Thu, Feb 16, 2012 at 1:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Can you provide an updated patch?

Robert, updated patch is attached.

Attachments:

parallel_pg_dump_2.diff.gz (application/x-gzip)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#25)
Re: patch for parallel pg_dump

On Thu, Feb 23, 2012 at 11:37 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Thu, Feb 16, 2012 at 1:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Can you provide an updated patch?

Robert, updated patch is attached.

Well, I was hoping someone else would do some work on this, but here
we are. Some more comments from me:

+                       /*
+                        * If the version is lower and we don't have
+                        * synchronized snapshots yet, we will error out
+                        * earlier already. So either we have the feature
+                        * or the user has given the explicit command not
+                        * to use it.
+                        * Note: If we have it, we always use it, you
+                        * cannot switch it off then.
+                        */

I don't agree with this design decision. I think that
--no-synchronized-snapshots should be available even on server
versions that support it.

+ if (archiveFormat != archDirectory && numWorkers > 1) {

Style.

-                        const char *owner, bool withOids,
+                        const char *owner,
+                        unsigned long int relpages, bool withOids,

The new argument to ArchiveEntry() is unused. Removing it would
declutter things a good bit.

+#include "../../backend/port/pipe.c"

This seems really ugly. I suggest that we commit a separate patch to
move src/backend/port/pipe.c to src/port and refactor the interface so
that it has a signature something like this:

int win32_setup_pipe(int handles[2], char **error_string, int *error_code);

The backend can have a wrapper function around this that calls ereport
using the error_string and error_code, and any front-end code that
wants to use this can do so directly.

+/*
+ * The parallel error handler is called for any die_horribly() in a
+ * child or master process.  It then takes control over shutting down
+ * the rest of the gang.
+ */

I think this needs to be revised to take control in exit_nicely(),
maybe by using on_exit_nicely(). Trapping die_horribly() won't catch
everything.

+               if (msgsize == 0 && !do_wait) {
+                       setnonblocking(fd);
+               }

Style.

+               if (msg[msgsize] == '\0') {
+                       return msg;
+               }

Style.

I find myself somewhat uncomfortable with the extent to which this is
relying on being able to set blocking and nonblocking flags on and off
repeatedly. Maybe there's no real problem with it except for being
inefficient, but the way I'd expect readMessageFromPipe() to be
written is: always leave the pipe in blocking mode. If you want a
non-blocking read, then use select() first to check whether it's ready
for read; if not, just do the read directly. Actually, looking
further, it seems that you already have such logic in
getMessageFromWorker(), so I'm unclear why do_wait needs to be passed
down to readMessageFromPipe() at all.

+ * Hence this function returns an (unsigned) int.
+ */
+static int

It doesn't look unsigned?

+extern const char *fmtQualifiedId(struct Archive *fout,
+                                  const char *schema, const char *id);

I don't think we want to expose struct Archive to dumputils.h. Can we
find a different place to put this?

+enum escrow_action { GET, SET };
+static void
+parallel_error_handler_escrow_data(enum escrow_action act,
ParallelState *pstate)
+{
+       static ParallelState *s_pstate = NULL;
+
+       if (act == SET)
+               s_pstate = pstate;
+       else
+               *pstate = *s_pstate;
+}

This seems like a mighty complicated way to implement a global variable.

+#ifdef HAVE_SETSID
+                       /*
+                        * If we can, we try to make each process the
+                        * leader of its own process group.  The reason is
+                        * that if you hit Ctrl-C and they are all in the
+                        * same process group, any termination sequence is
+                        * possible, because every process will receive the
+                        * signal.  What often happens is that a worker
+                        * receives the signal, terminates and the master
+                        * detects that one of the workers had a problem,
+                        * even before acting on its own signal.  That's
+                        * still okay because everyone still terminates but
+                        * it looks a bit weird.
+                        *
+                        * With setsid() however, a Ctrl-C is only sent to
+                        * the master and he can then cascade it to the
+                        * worker processes.
+                        */
+                       setsid();
+#endif

This doesn't seem like a very good idea, because if the master fails
to propagate the ^C to all the workers for any reason, then the user
will have to hunt them down manually and kill them. Or imagine that
someone hits ^Z, for example. I think the right thing to do is to
have the workers catch the signal and handle it by terminating,
passing along a notification to the master that they were terminated
by user action; then the master can DTRT.

+typedef struct {

Style.

There's probably more, but the non-PostgreSQL part of my life is calling me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#26)
2 attachment(s)
Re: patch for parallel pg_dump

On Sat, Mar 10, 2012 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:

-                        const char *owner, bool withOids,
+                        const char *owner,
+                        unsigned long int relpages, bool withOids,

The new argument to ArchiveEntry() is unused.  Removing it would
declutter things a good bit.

How do you mean it's unused? pg_dump_sort.c uses relpages to dump the
largest tables first. What you don't want to see in a parallel dump is
a worker starting to dump a large table while everybody else is
already idle...

The backend can have a wrapper function around this that calls ereport
using the error_string and error_code, and any front-end code that
wants to use this can do so directly.

I tried this actually (patch attached) but then I wanted to test it
and couldn't find anything that used pgpipe() on Windows.

pg_basebackup/pg_basebackup.c is using it but it's in an #ifndef WIN32
and the same is true for postmaster/syslogger.c. Am I missing
something or has this Windows implementation become stale by now? I'll
append the patch but haven't adapted the pg_dump patch yet to use it.
Should we still go forward the way you proposed?

+/*
+ * The parallel error handler is called for any die_horribly() in a
+ * child or master process.  It then takes control over shutting down
+ * the rest of the gang.
+ */

I think this needs to be revised to take control in exit_nicely(),
maybe by using on_exit_nicely().  Trapping die_horribly() won't catch
everything.

It's actually not designed to catch everything. This whole error
handler thing is only there to report a single error to the user which
is hopefully the root cause of why everybody is shutting down. Assume
for example that we cannot get a lock on one table in a worker. Then
the worker would die_horribly() saying that it cannot get a lock. The
master would receive that message and shut down. Shutting down for the
master means killing all the other workers.

The master terminates because a worker died. And all the other workers
die because the master killed them. Yet the root cause for the
termination was the fact that one of the workers couldn't get a lock,
and this is the one and only message that the user should see.

If a child terminates without leaving a message, the master will still
detect it and just say "a worker process died unexpectedly" (this part
was actually broken, but now it's fixed :-) )

Actually, looking
further, it seems that you already have such logic in
getMessageFromWorker(), so I'm unclear why do_wait needs to be passed
down to readMessageFromPipe() at all.

That was a very good catch! Thanks!

+extern const char *fmtQualifiedId(struct Archive *fout,
+                                  const char *schema, const char *id);

I don't think we want to expose struct Archive to dumputils.h.  Can we
find a different place to put this?

Right now it's there because fmtId() is there.
fmtQualifiedId is used by dumputils.c, parallel.c and pg_dump.c,
making the headers dumputils.h, parallel.h and pg_dump.h obvious
candidates for the prototype. parallel.h doesn't make much sense. We
can put it in pg_dump.h if you think that's better, but then we should
also move fmtId() and fmtQualifiedId() into pg_dump.c...

Or we change fmtQualifiedId to take an int and then we always pass the
archive version instead of the Archive* ?

+enum escrow_action { GET, SET };
+static void
+parallel_error_handler_escrow_data(enum escrow_action act,
ParallelState *pstate)
+{
+       static ParallelState *s_pstate = NULL;
+
+       if (act == SET)
+               s_pstate = pstate;
+       else
+               *pstate = *s_pstate;
+}

This seems like a mighty complicated way to implement a global variable.

Well, we talked about that before, when you complained that you
couldn't get rid of the global g_conn because of the exit handler.
You're right that in fact it is an indirect global variable here but
it's clearly limited to the use of the error handler and you can be
sure that nobody other than this function writes to it or accesses it
without calling this function.

If you want to make the ParallelState global then the immediate next
question is why would you still pass it around as an argument to the
various functions and not just access the global variable instead from
everywhere...

(I have accepted and implemented all other proposals that I didn't
comment on here)

Joachim

Attachments:

pgpipe.diff (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/port/pipe.c b/src/backend/port/pipe.c
index 357f5ec..4ca69c4 100644
*** a/src/backend/port/pipe.c
--- b/src/backend/port/pipe.c
***************
*** 1,13 ****
  /*-------------------------------------------------------------------------
   *
   * pipe.c
!  *	  pipe()
   *
   * Copyright (c) 1996-2012, PostgreSQL Global Development Group
   *
!  *	This is a replacement version of pipe for Win32 which allows
!  *	returned handles to be used in select(). Note that read/write calls
!  *	must be replaced with recv/send.
   *
   * IDENTIFICATION
   *	  src/backend/port/pipe.c
--- 1,14 ----
  /*-------------------------------------------------------------------------
   *
   * pipe.c
!  *	  pgpipe()
!  *    piperead()
   *
   * Copyright (c) 1996-2012, PostgreSQL Global Development Group
   *
!  *	This is a wrapper to call pgwin32_pipe() in Windows and report any
!  *  errors via ereport(LOG, ...). The backend checks the return code and
!  *  can throw a more fatal error if necessary.
   *
   * IDENTIFICATION
   *	  src/backend/port/pipe.c
***************
*** 21,95 ****
  int
  pgpipe(int handles[2])
  {
! 	SOCKET		s;
! 	struct sockaddr_in serv_addr;
! 	int			len = sizeof(serv_addr);
! 
! 	handles[0] = handles[1] = INVALID_SOCKET;
! 
! 	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not create socket: %ui", WSAGetLastError())));
! 		return -1;
! 	}
  
! 	memset((void *) &serv_addr, 0, sizeof(serv_addr));
! 	serv_addr.sin_family = AF_INET;
! 	serv_addr.sin_port = htons(0);
! 	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
! 	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not bind: %ui", WSAGetLastError())));
! 		closesocket(s);
! 		return -1;
! 	}
! 	if (listen(s, 1) == SOCKET_ERROR)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not listen: %ui", WSAGetLastError())));
! 		closesocket(s);
! 		return -1;
! 	}
! 	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not getsockname: %ui", WSAGetLastError())));
! 		closesocket(s);
! 		return -1;
! 	}
! 	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not create socket 2: %ui", WSAGetLastError())));
! 		closesocket(s);
! 		return -1;
! 	}
  
! 	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not connect socket: %ui", WSAGetLastError())));
! 		closesocket(s);
! 		return -1;
! 	}
! 	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
! 	{
! 		ereport(LOG, (errmsg_internal("pgpipe could not accept socket: %ui", WSAGetLastError())));
! 		closesocket(handles[1]);
! 		handles[1] = INVALID_SOCKET;
! 		closesocket(s);
! 		return -1;
! 	}
! 	closesocket(s);
! 	return 0;
  }
  
- 
  int
  piperead(int s, char *buf, int len)
  {
! 	int			ret = recv(s, buf, len, 0);
! 
! 	if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
! 		/* EOF on the pipe! (win32 socket based implementation) */
! 		ret = 0;
! 	return ret;
  }
  
  #endif
--- 22,41 ----
  int
  pgpipe(int handles[2])
  {
! 	char	   *error_string;
! 	int			error_code;
! 	int			ret;
  
! 	if ((ret = pgwin32_pipe(handles, &error_string, &error_code)) != 0)
! 		ereport(LOG, (errmsg_internal("%s: %ui", error_string, error_code)));
  
! 	return ret;
  }
  
  int
  piperead(int s, char *buf, int len)
  {
! 	return pgwin32_piperead(s, buf, len);
  }
  
  #endif
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index e2dd23b..166b935 100644
*** a/src/include/port/win32.h
--- b/src/include/port/win32.h
*************** extern int	pgwin32_ReserveSharedMemoryRe
*** 400,405 ****
--- 400,409 ----
  /* in backend/port/win32/crashdump.c */
  extern void pgwin32_install_crashdump_handler(void);
  
+ /* in port/win32pipe.c */
+ int pgwin32_pipe(int handles[2], char **error_string, int *error_code);
+ int pgwin32_piperead(int s, char *buf, int len);
+ 
  /* in port/win32error.c */
  extern void _dosmaperr(unsigned long);
  
diff --git a/src/port/win32pipe.c b/src/port/win32pipe.c
index ...05e1da3 .
*** a/src/port/win32pipe.c
--- b/src/port/win32pipe.c
***************
*** 0 ****
--- 1,99 ----
+ /*-------------------------------------------------------------------------
+  *
+  * win32pipe.c
+  *	  pgwin32_pipe()
+  *
+  * Copyright (c) 1996-2012, PostgreSQL Global Development Group
+  *
+  *	This is a replacement version of pipe for Win32 which allows
+  *	returned handles to be used in select(). Note that read/write calls
+  *	must be replaced with recv/send.
+  *
+  * IDENTIFICATION
+  *	  src/port/win32pipe.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ 
+ int
+ pgwin32_pipe(int handles[2], char **error_string, int *error_code)
+ {
+ 	SOCKET		s;
+ 	struct sockaddr_in serv_addr;
+ 	int			len = sizeof(serv_addr);
+ 
+ 	handles[0] = handles[1] = INVALID_SOCKET;
+ 
+ 	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not create socket";
+ 		return -1;
+ 	}
+ 
+ 	memset((void *) &serv_addr, 0, sizeof(serv_addr));
+ 	serv_addr.sin_family = AF_INET;
+ 	serv_addr.sin_port = htons(0);
+ 	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+ 	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not bind";
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (listen(s, 1) == SOCKET_ERROR)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not listen";
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not getsockname";
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not create socket";
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 
+ 	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not connect socket";
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
+ 	{
+ 		*error_code = WSAGetLastError();
+ 		*error_string = "pgwin32_pipe could not accept socket";
+ 		closesocket(handles[1]);
+ 		handles[1] = INVALID_SOCKET;
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	closesocket(s);
+ 	return 0;
+ }
+ 
+ 
+ int
+ pgwin32_piperead(int s, char *buf, int len)
+ {
+ 	int			ret = recv(s, buf, len, 0);
+ 
+ 	if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
+ 		/* EOF on the pipe! (win32 socket based implementation) */
+ 		ret = 0;
+ 	return ret;
+ }
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 64ec408..fda79fc 100644
*** a/src/tools/msvc/Mkvcbuild.pm
--- b/src/tools/msvc/Mkvcbuild.pm
*************** sub mkvcbuild
*** 58,64 ****
        erand48.c snprintf.c strlcat.c strlcpy.c dirmod.c exec.c noblock.c path.c
        pgcheckdir.c pg_crc.c pgmkdirp.c pgsleep.c pgstrcasecmp.c qsort.c qsort_arg.c
        sprompt.c thread.c getopt.c getopt_long.c dirent.c rint.c win32env.c
!       win32error.c win32setlocale.c);
  
      $libpgport = $solution->AddProject('libpgport','lib','misc');
      $libpgport->AddDefine('FRONTEND');
--- 58,64 ----
        erand48.c snprintf.c strlcat.c strlcpy.c dirmod.c exec.c noblock.c path.c
        pgcheckdir.c pg_crc.c pgmkdirp.c pgsleep.c pgstrcasecmp.c qsort.c qsort_arg.c
        sprompt.c thread.c getopt.c getopt_long.c dirent.c rint.c win32env.c
!       win32error.c win32setlocale.c win32pipe.c);
  
      $libpgport = $solution->AddProject('libpgport','lib','misc');
      $libpgport->AddDefine('FRONTEND');
parallel_pg_dump_3.diff.gz (application/x-gzip)
l�n����r�s�8��	���-�$���f9^Dq2S��R��;iwG���_.����RY�;�Z���M�l��_��l���O�u&�;��Up��x�h�:k,���`l���<�+�|�������$�C��$�d�~��O�A�{�C��"
�����K�Y1#O���(�9o��k�Jh^4��%/53�p6�M�I�����=��r��M��y����We?f:�1j���pw����p�S@v�
�}��2
�~*y�L�'�������������Z��mpp�����Yoo�lZTsZ�<:>���Ql��Z�M, Pp��I��2�B �`������d�!�����n�
k��W�Y������dH2Uo��dq�=���'=������~��HLfo�
)�iE,���'�Hq-����R��s����i������&P�t�zo��
�����[�T[�ucm��������<���=?`(��S��J����g��r�Y[t�������P������?�60�/�GKU��&"������k��7����2�`Z�����r��7�������$�56��s?7�>1�[���I��2��<�wv7�D[��-_��0Fv�l^\��y0:R����v����/�������i���^8]���i^E���0^�Q�J^f
��t�Me�T���9��v/�y8AST� ��Q�"Q�n�~fV���F/(�����g�2_C'�y��
�Yw���"���q.4twE���"d�����u��I��*���������#�����|Nz������c�# .Z��i��.��EJ)�zl��a�(�<��;p>0�m1��=���B��9^C)	�rPJ#������c":*�% [t�������l����0K��3��7L[���r�32�6+o�(�����d���v��z��{�S�S	;cg����2��T��G�����r���Z�d�o��L��8����TY��Eg��>�0g�OS�?s�x�?(G�����ve�{�����?����Ag��!D����������lXP��N������5�\��aB��{ptz\���h��L5uK�����3��`�L-� ��Gk��x��ZF�U��&���|�P_k�g��/�5Z~���5����|�'�022��%_"�Q
�L��|��z���n���T��������-����W��~7��������m�P���u'�GF��h6Q���I��B��u����<Nh�":)`���U
�/��q=;��[��M�"0��Q�j�oT;������@�j#��
CiH�	�v[Uo]\����N��.i&����E^���w:B����q.7yH>�l����rz�������B���V��%�2��w&A�}��;z�{���@��U�b����}�R�nl
8�0K>�&��Qu0�������*�A�L��3���')qW�Fu~��7�u������_�K�������l�UX��-b�3
d�vB8TP�UH9o��$�b�.��f_�OlQ��u�"~��R�1`:"����T������S��<0����a ����_��-fq4�j(j�L���Mm;�����_Dq������e'O�����Q,�I����=8�/����(9k\�f�)����)�H�F��b[�VX�C�g�T�Kn��w�����e�~�����p���W�f�mo\[����*�6O,8��;�o�%JM�^t��w�����4�f�bo��L6�G�����������%��8]m]C��?i�A����V�� ng\��X��RL�s���M��"!�;6Auo�����=�4@x���U����r5���>�#/�v����
��K��w�ac��N���6�5+~��������:�=�E~�<]o�,]�v����	��_�`j��2���]E��@-���Q��)�=
�(@:�E����[�2�O�3���S\/\�}�>P�u�*����}�$ey�%�k�DjT}�a��@�����+�h=�X��+i��+89+"�5����p.�g@Rl���Me��
�W�������h��e���Gk��Pd�v/g��y��B�xk$La�7�����/�� y��i�)���O	��[T/F'^w�%z��8Y�l�1�E����K�<����z��},Q6P���A�e0�?�hg���2�5Y/��8:�	$��$-p����bV�-@/�}�)�.������#����\�M ����^AV�k�\����c
���PMp�����-=6�""�r��`���3���x�+��WX���X
ks'+t�����M@�?�A�����n�t������Z�o�b��V8��i�
N����u|~�
dy�&����!�����G������L)9&v:vm�u��>%��������B[���9���D�������$��� z���,�)����H���q� �b��-�s>B9�/�������[a��V�y�Z	7K����D������Dg�-�!����sTei+�e>e�IG�"���.�3\�i�YL���R��Nx���R�4w�sg!�=I����gO�Z����uKS�����n	�.�8*�7�S(�Y8���1QR0�P5���vO�V�si
+�?(0���Xy�z/(U�M`UC{���A���vq�r�"��cW���H-��|�:8J�m�5�C�{n��I4���=�$��NN�N��u�=\1��r��-E;<EO��L��A)KikA��T�Y�"�lP�te6B<p=�q���I�I"���	����J�����1^�CM��y
]���Y�>�m,�J
s����,�fX�Qf����!b�C������o�E~;������.��&T��f���Gtkp�T ��<�s�:����r���6��"%��%�q��p��!���pK�h
�HGd{2~d�G;`��2S����df)�+tUZ�
0a����[<S{��<R�{��-�&yQ�����OqD�G�f.�g0LSr�#E��`�.��E���$'��W��V���p����S��^��4�Y�F&$��{@B%��]������������^��y�u��b��!e��9nu�4�w���e������m^���z�btR��Dh�'����������$xU�D��=�J�b>&J���uG&Yh9�������:�X���H���T/Rb�FLQ�H�o��4�v0� ��)g���2��_fA��8�8��8-�
Vp��b��K����`�
��+�am�����m��(>`.����@X��N����j��e��~+�)�� �F�no�L2���f�.s�&��fEH��U1��4������$w1�|�A�b����������W�.���[����>�� �R��P�zo�����U������Y���?;���f�Aj>���a��K0�������KYrSzK2���knY&����F�8�`fxJ��X�����s	1����2<\��rI��\y�!��6�@~��j}����R�l�����|���7�xf)�~y�K���|���\T|_\j:j�b�o=��?����nA5�&���u1�uI8z�[������>�r��s��X�[p�K0{��`w���Om��K�ws����l�@pk����B�R�����%�B�#N�D8\�����1�kI��V����b��������q�y�i�s�5Xo�bY)-�X��#�cs�b~[U5�2��o����p��i)mBI�<A��@�3���>��2�mF�����7��@����/e �[��
F+,����(5-���qimM{HI��.�!@����j*9&W�����$��	:op�%]QL}HWN��T������+�lj����������UQ;mh��2tT��)n@�����o;��^9T�����+h��cJ�w;�� ���.�I�������SR��5����
:��6�RM�J�����Ng��\3[�N��M��g��dQ����E�������k=�]�4��3��2h��Y�-�Pm�R����r��9G���$�c��g-�.���Ut��]�!6�P��mR���3E���mj$)ksM5!�;%F�j�8�,t�4_
��u�-�-�k�$/�5��P���N�O�`F������g�{�N!F��$?Xtx�+.���w8������Sgh�}�
)��u����Y�����r>��S�
���&k^�i���y��cGTE7��Y������~�&�����]H�W%��(����tqy��%��O��n� ��K�q������u����f%�l��eM��`1���
X�������s�e�L8�4�I����Vp��1e�����8��i��w:h��������Z|4�cX�+*�L<t�$q��|t-�����49��.�s���j���j�D�����[� �K+���M��E5��
����;���[T?�(3�����xTy����6�m
xz�����]��,��e���9==�� �B�s�������j���'���������u�?]
��\p������z�"f�����4�Vf�j�@���zY��F���2�a���F��D��k
WEw��s���-���|=�����n�$����*X,5Y�{�j�����K(U�B����+���N���9����M�5-�QF�i�Q)��.���"|�c��0�Y *�R=,��ZT�x�&���b�RF��v�t�R�c�jg��.bY5|%�4�=I���1��w<��"b��Md�ZN���5�oj�D�z�����`n���=�"��-0�fo���~��t"F7�>X��+?��d��K��R13uq1,�@������o��0el�����rSb�9�_*�����^��R#��D��[t�s������������n)�����J���R����(e>��A��)���B�.����sz�d���Q5�U�3��pE]'p����LT%�������%�w\{O���&�B��%.&��H�������M���Y3Oq�%��8U<�F�O��(H�c,�$
��c\���������|�/{�����x"�]�^q�?`�C����7�C�|��X�.��=�����y@�'^RNd�x2�R�'��<gB��JY,��e��$3�l�Y���|���a�����8&��c�%A&=[����%l�U��
��*=9������$�7�cx3!0�����o�=�������[��bm!>�4���T�E�3�~�F5eNk�Z�C�W��d��y�^D���)�������[�&~�W���(�{GX��l�
)$���h>V��j��r�Z��`�{�sp�;~�����^����->���a�c��F��,mZ�hB��#C�B��a<h`v��(���{uO�b�B���7��MP�AlMJ\��*����,k]&��_�D����3R�;YWD�Y��3k��~K�_��|�Z�e�}&�`�Y5���*(>"g ]�����m�����VL��9@�,�0�~�Y-0P>�vIF��JU11��#�F�>���(���*i�r32�$�'A
$�8���)�Gp�.O��~����X
�~�L#=f��f�]�O@67�����6���}:Xvq6�{�U~�����]�I4����)@�]�J��	[��t�q���Qu�[
���������F�eW"X��V4�Z��1��1��a:������O�{<�&��N�{6���O�������YM�~�M*�g�>�%m{��(���T0L�y���������V�0�4��+-Y�$���i���I��E=VYNTg�����=q�����G�[����cWjU"�Ho���4��ac>�����j(Tbh����$q�����DD����������a��:	��R����O���grR��m���F����
2�@pf�3IJQ�i�(�	&��iG�:��e�F7(�`�r�#��]��]�v��,������l�R�����?3%�(���u�#7l��!�cL�8�8���I&8Dz� S���Mq�.Q�r��"�i�rwE���x�E�������������:d;�����.V������'q/%�G�c0 �@N�0N��*��09�*�o(1iatP�
��9]]��|?���b���H1O��pU�(2������|�8H�;���	Q\��*C�Jf���������n�E&�L=bqFiU��\�g�����;������l0E�o�����Y������]3���G�G2ylM���w!�^Z%mP&|���LP�����vl:qw�M���
�_S|q�v`8�J���@b��XETLD�gdM{���I�L;���Y!�2�������,�����&�4���u���@���*�F�A�"��1�*
��
*����	/Bg���&���Ap�H1���8u�q���Z���)Z`b�����Y�P�_!%9��r8��d���8i40���q���Dv*�o*ed*[��E��a"���Le��h��'�z��
�
���C�3x�HH�N(_�D����M��������.�|���acd��_��.;@@mT:�
~]�L���}�$�kOG�i� ���[�����S��Irf��Qb��h�PT��\�vQ��uS�<��Q�E)zMv�tzGK*�Obl�2$����1O��_a�
3iw��\M9+x�F���HY���)y��E��k9�p��]� F/S�R�����D�:o���l�#B���F<?Y)�pb�(9�K�X�HL�^�>�P)�F�F�\��!���{,��]����J^�����l�N�]kT|��5��o��I4��40�p�e�f�_>@�$�<��6����V��NnJ=?���xd�4#���9F���sF��=���'�x���l�L
-�9�zAI��������V:~�Zm�6`����kN#����Z]���U-�Q�7lV
�'�x�d.	1��Fe`%u"I
��������hK
�W���oDd�L�.z���N2�V�,n���48�:�[8-t����Q~�2�1�G��������t\�������r�=�Cj�8H/��6o�TF��������s3��Ne�cX`�T*c���v�w3����j��J�L����[���?H>�
�h�HaI0���c��������ZRfDU�!��)TM!���R�2�)���BX��A�:`�Qk�Fn���
���G�V�>�� �u��'��Q���_��1�7�u�4<O�M������/E�H�F�K~#�����o&{�i�GD��I��-����w��>���������>���A��8�N�K"������8�U���1�7sL��;����@I��K�������$2������Wc��������C��� �bK������r�������`��q��e��p��l�!r�
�8�����	`�&�_�zQ�57���JRU��II],B��n�w����L���vuv�%���,��l4rv�G�'5���V���o.,R�3��0�K�)��	�>�O3�e�Oq4�����T���M��g�L�"
�$�����'�K����'i���K�|�fb��Q�3hv�r������W�TZ���r�#�.��4�������Cd��{�~�������_�X\=���S5���;~��t�������z���G�����'G�:� ..BO"����k�����)�
��Lp��<N<��s����������&�]o ��f_�N�&Q���.����W������-�9M�j2v�jNSG
j�������}����D�6��8�%�M�t;G'����d�������t,E�UE~���,�p�)}-�TG�1�gJ�8o0u#����W���]��U���&jX�@awv1�;-F��R]��7���������\}g�0~f�Q��a�����}M_���T)Ipu�/X�8���:�J�	�2G}~2|����4����X�.�=���-���d����I>�j�q0)��~���~�KE��~���Rk�������sjc�M��e$�6�3�*v�x���W��gv����	Nc��~���2���y�j������m|�?3��(��XXFs�=�����6��'��]6���k������_k���-���&���#�����$�A�#�
B���pwz������<������AwtJ�G���	�r���e��4��k��Kd������^������h�9}���OI*��A:���a8E�,�b�Xk��|e�yuR����k���_�u���
F�i�K���h�~/������d�s��	����ZNT������&��s��
�N�����2���r����	���,	z��M>�o^�����������������}���KA�7 =�h��u�")�g����� 
������6�3�L��0��@���1m�vt���@�����g`��h���\�:����\^��`�D����E~nuK{�Tc�B�O���������Q���dc�	�a���;�~����U���/�
��1H�������*?w�Ujr�����jYswK�5�<X��Z��5�>f�g�����F��q��w�����7j-+�\�g�B�?>���%%c�O8�����~o��t�����j}J-�Si�T�6�C���Sy���fj�qX��^E��FW���.��1~��IW���z�_����o�my�?GEoi�_J��?��/�0�!�b���%u:b������H�9���_o���Q����e�9z�E��0����}����)ZUN��h�T1��F��
�~l���8��Up��[�a���9�%�����c(�_F�A0� ��;��\�'{6��j�o5�z[U��US�@D�
(��Ph_��Y7��
$��������+3��7��+�,3#�f>"�NK=,�.q���-+RD��+HM�gbg-�^,_yq����[�L��:�@�5�j�������:��%�=���u��`����+��m�w����`�L�'�K��`<X�p��Vb�Oq�Y�3KMT�)-�*O��{K����o�����r��q����XV����7���n��c�NK����jQ2�����q|��c��$��kH���3o%�M�#4����C���$�����.��}F�\��8xq�h ��2��+S���':�De�h��B��g�;�]������h	W�V�v�J����[������Z�#�m����<u*��1�
rm���9��(����Sbrv����9����T��<�|$��R
2����(���b��I-�rWj�21a��K�t���t�^4�>��:{h���>������^��`(�(���i.�7�;T|�t��vMV��c�	6�o��Vc�Q��<��7SI���(���^+KC��
��Bd��hb��iP�l;������2y��~"i�T3�A�Q�|-i8�����.�s���INF����J�*-�r3$^7[�=e��K���������Qf�S�c��%U�>����y� Fa���&]X�Q����s%��S��>9�x
�"�������\{�����O2gE���[���R��$���>So���a���T5�/�����[q�	>pn������m���w�n~����Y��g���f}]wg������R�C�+�%�
���MK�wX�7��ipb�qq�g�� �����?��-|
��?xD|��Q�&�_�)+I�����<�M0�L��TDfe^uyl���h��������}�����^��{5��f�.d�O�[0 a��7���A�Ez�����2PH�H"������'���w��{q|�z����9<|����Szy�1��g�.PiX�D�1u��j�\���j�|Y��=�"���6O�Qo��������F��c�nC��3�u��/t����D|>��DjV�f�,�Y�-�`t���r�x%�GOa�\�>����w�j ���j���9\z_N������m��.�$?��Q4Zi/e�6+��� ���Or�����k�8diy�5��Iiy�B�#����X/;����&�?��1��)��E��g���jw�������(4���% �L��\CHF+4W���
Tx��ob����Z�xa��:@��k��V<����p%B�Zs
�,��`�G��uq`����fH�?�%6���H�P���kk7��p�����B^���u�8+=X�n��J5W��{����*���^�L���/�NO���]�Y0��N/��w�����$R��$>���y�(�����j��ZU�H�,��R;��Z,7�����-�~li��M��"����n9���'`?r���9k�����TC���Uk�Z�1O|A�������V�����H���3}_l��������,A�)�96"��A0JP�A��qC[�`�V���^�2���5����K���mg:Ta�?wG����-o�}S���M����(���B�o��H��C�5�*Ib2�rqD"������
��������-�0i�0�Qo�c6B�����#��{�����h���19Q;w����[]�jD���E�������|�s���sW��2��gX`��E�Xc?�N��u5�#z����y��iL�E^��J�(�7TJa�$dV�cl�aI��,0��_����/%��3��C�n��W4��*�Z��C|;���28�
���!�_�t�z�
����~���g���Dr!{�N�l��{;���\����_��w0x�#7E���"G\���g�P��X*�\V	"yu��1���-BJ.1��&S9�v�3��SDQ��h��|�������
 ���Q����3�Uf���%�r������C
��'r-
�H�>�s���w��)P��"�Z���r��]��a�J�o��!	�\��T����|��P����m8IN�8ZM�h�K,�N
_�Yq�����a�G�$w%~E�J���E'���*Y���r'o�f��7k�SU	I��y{�������&Y5��,��45�s����v���7��X�CCo6Q�%G���z��mE�3���lV���^��9{��q���0��p
=�����|��PT����6��F,c&��F��
��\O��	��ac��7ms�c4������t�����U@�2`�B����rj��d��]�,���gRM>?<~���k���H�j�M���[��*+�����xN��P1'&�����l���RH)��������l�0��S5s�����Iu$���-��b/���N��W^<0�[j��x�1o����bU�}���H�{�4����||>���9�7>'���l�|!�$���b�\_B������������d�N!�������E��]�Ui���Et'��+�)�d��1���������M>����3r��-�����{����y���:tE�e��UC�9<
�3Gg����m��J�.}�X�Hb���d
�9�y��k��,�b7a���}�	M�`vS��Z�-<�	�(]����J�/�����$����E���cf(�/fo�7tC��#�=�,���l��]a52)38��(����]1]���F�;#��X����$��)9���47�j���H(K�r����'����1:��W�	s/�B�~�xW���sH2z�+������2(��(�����'t�d�@Uc^�?���a��)0<.03�/Z3):E�$��c��8?�8�����P��O1��U$E��k8��(34V�5�FXK�x{��s�Q����}@�d�e�h���
�����eq<L�A�m?a�������h�UX4d��'�b���8�WtD��a����2���Ue<�e0�A�R��H�/L&,����8�������g���p�(�~9H�,G�����0
��p��r��MM�����~^?�ec��}�l�w{�9X�S4[>-B*$�@L��1��$k����<Q��Y���)\6I1�������)�2�1F�A8�t�fSJaL('%���iI
��p�J?��>� ��E;�)a�Nu�qL��'�C��W����@dY*�"9LF��Fa!m���������cj@.
|$�i�,�
�Q�0��N�=������MT�x�	T5�����V��usi���
���TqJ]=�9&1���-	B�(L��8<�QDx��������6�u��x�%��[a�\�7��MV��xi6an)�om�;/!��X"$���~��q��[�O��P�?BYWe�#�������6C�P�|�����R�{�0��QxkKD�}Z�=���]�b��jq�����(�%���<� �O�H<_�=v�p{v�����Y6�O��(�O�[�����0�K��N=�.����/�x�c�2�_�*s\MU���b���z�I�.�Xrc+/�Y�tO;�������\��,��f�o�2�yiE����Q�<���&��!yx�����������E�Y��/8��B~�S���\���^]�����=�Zp����F��4��*����?����N��w�+�I�s��v@�����`��=�
�����eF�AY5��#�7+%(#�&������F�t��:��D"V`�Vn��vq��������d~�b�;���[)�������a�]xx����a# {�h<�O{��"�!�H�����lV�6|�L��P��Hn�<��\��A4�]M�4�D�������l�b]�8I[�N��h2��O�� y�8����U�*��4�)����m��x�U�+ �ob��C ,�g�@�zoW�hZ���y��q0��4�M���2����A�h���jq��x)���tv{`��X�K�����:���]��`8�c�On�%��$��0\�TvR�wS��4���`hR��l�R�����tW�[v�8x�L3DJ;�Y��LF����(*mgq9gx��l�/�������b^�<|�?����/��a��t����]W�&p!��P!�/ ��c.b�h���[�m0�+:8��Z�f;I2%����	f'�+9���||�Y/�����v�3���G��o^���}��O��i��N���It��z+���l��8h8��O/�n��)i6��������������gX�'�O6�f�O3���;�������O���^�
�S�s��S<2�����U@���}����oxeT���!�t��gT�mo�����3�Nsm���ZmU���cX�2y�.]	�J��W�V�N����k�m0�����p,0C[�;����>�����m{i���S��H�������cj�PJhU��~1cM��i E
F:1��T��"g��fLZc��!�����]��o:���-�'���}?�K�+�?�m�g	��%����X?+�)�,���lS�w����,
��9��qry��|���0����+�7S�X�����O�oj���M=�5�N��5`������B
�P���MG!7�������O/����GO����'��������O?�����O-7�5�1����/�2b�j�7a2���/�zC�,��{N����m���C����M�N��7�Df��0�:���
=���%�mm��S��<]K�`!�=&Ua2E��In����=��f	^�=Qc�l`��n�%3�8��f�$�����=��Ig�U,d����!T4����8����`��������E��p,7����bV���q�!��r�����z����o_��������u����D��7�)���<��)��E������X��>�Wp��+�����|��!�8��o�Z���f�r�������3.��e��k�����g5�1wv���5��.�����.~
8<���5���G��T1]������c2t{�����.�5����M� �=�sx���1|}�����%n\��HB�1������)�e��c
b���B��s�\���KE����8�spZ��p.>e m�z�p�o7�/He1��]XL3���'�������]�ek��~�nb�p�����Ip%�������+������1�d���'����B@|�����*j��*g��(&5�u����M-����(�D,�c$lG���}6x]�d�!��eye��*��v���:�������x�j��iK�b�+eIO��I�~����(��z�
?�0P���FgV�n1Y���D����l�p��|�����A
W��3��-<`[b�(`K�)���>����g';G�;�g�G������v����Co�m|r�sx��;��}�dg�;>:����|�r��e!���4<���fr��f�?H���~���K��G��Gt������5��k���!�I�E�PnvX��;�#�0�Up�fw���x==�3�}�7��8����5g�I����W,�{�^������{�!1&���R�2V>p�&t���V���Z�S����9�&��L���6��z^����2��c�i������
��zG@lf�^�\�����[/�]����������9�?��1i��Nc5y��taQo����
/.����:�/mm�^��t.Mv�����E���w�q��9>��Z��c��������U[������Y��6��mSh��S\���������8n�
��DsS���F�w�U�#+Zkd���@ f!�pB�4N
y�g��B<o8�;��JFz���g�����,6w�|!��~������q�:��j��TM@�8~O�y[#�#���o��6��m�];5(�[���79��������7��u�p��nr2q�T�k��CBr�������hYK7�z��_*�����N
��gGo_������
�U\9D����~�A�h��W���z��DA�����/}V����V�.GT������#cd�������*�?;;x���L\kG�7���@��|���`x�{|E�T�o��i6
p+�H���z��>���3f�8[?~k��A�����������#,������������{��E�gukz|����&5�bx���\������4���v�������0I���p���&9��3G��S5d�&E��9��uo�C�0�<Wz�a���F�s�~���n�%x�r$X[OW	6�
7���Y�;M�/L�*�e��Zc�����9E���(�	��{�pLEm�*P%w�sQ#UK����K��a�����9��^��^�������ClQ��[nie������L�9��$2 D^�B
��`b�'�8��)5����}����a@t���r���EP��`�C�o�j�ixN�c����|�_=�n�w[��������LF��d�#j�?���_W��S��s���R�6�;V�%��7�W3_��>s^Z�+s?"�Oc���b[U�X������{�v���F�n��wk�e����7i6�G��1\���	ft��2�~�
�8
��LI�!@l�f>d\�1v�dq���3�l��F2+�m�k=��k�#����k��7��g,.'�c�@�����'`�0ly
W
F�������Mp<��Xv�.&l�X��B5��z��}M����*�,���Q�y�t�W=�R�����UZVh}��,?W���H��3�������:}Z��B�������Ch��;�����o�F�b�\N�a6��h�R:
���'��O������9}�����)��<�!o�t�'j���OT)J�
���"�Wf�S�}�M��	��A��5	nT����T��S�c��6	�����
��?���,��Mg;�8�sY_"���O*��u��G��f�����^Hh�\���	CB8�K������8��y����z0�"9TD${�Z~��<������	�vlR���Kw�>�9�g������z�)��	���s��X�%1�E�0%F�o�P��9�M&.��C��F��T�����������B��
��;�z�\4f;@t������,�\��Hx������+��!ze"��zQ����u7z�Nw��;o��I���2�1����;���R6	e��y��@�@�$��<I7T�����C1�P�T2����|F�������(�*X(B38J1�0�����*y�,��������S\l�$��n�Hk�88cY�zY������ceC1�O��|��;{{������G$��U�`��r)��XpD-Ni5�y�^���x���ic�K���
��&���PR���A�����8�oa��F��Z�=Q����"��[m`4a���N�{���%&�^z��������	&���<���	�����v�Qe�%��M\�<U������M�:�?���rW���4�������b�*}Ym�0,�]u��/=s��V�!���~���F��J_L�|x����*�^x��������s3+K�
A� j�]���nEq��4�A�7�]L����Myt������oR>n���`)������xwp�
������h�LM�7����I�&�(mP�S��P��d]_T�D�R���N"%*��L1�OM� 8�'���y&	
d�Y:�g�1�n"�IN��=4�����Zbo�lg��N�I���������+��93��r�d�����������@5��`�U��T���'��u�����-�Mx�������X��1[�w���u0�!���m�8�mYn��J���*����� f�#����~}f������/��1&��{��/~�j��^�O���q�����W�wN�7|l������7'���k��(_�U�U�T2���������G=�����h�'�|���M���u'��6��f��I8�e�3<L��N�fIvo���G�6
�a���5MW� �!8�e�������
�[�pF�j�����T���`��/����+�t@��B�u��q�H��X����XZr�k!��~c�X�5���Zh���:J����m��H�;<��,���)�hy`��2��� ���%X`2`��Ji������3(G�G��ji:s����|O�_USz~!c������������\��(�m�*��\���%\���&�t>�`@._6�0(0Xi�T1�C�����^��'g��w��:3���;����U	�~�f
��N!\�V�������2$
�Wk�FY896�,Up���e�oc�h�x���e���aj�,�-�_����yku�yk
s�~�E�k�������������o�����*���P\��������X����v�>�������j���Y\�6?���6z�7���Vca�����z��X��Q��3�i����0x���E,��g`y��v�r�u^�@9��O$���z�H�_.�o4���z�m��l�?�j�%�
�V�.}.;�����e���H���',���E�.��%-���A'R��a��[�&�+~��b4[��2<����?��T���)�!�|:�$�cF��w�x��^V�zJJ8����.?u��`�0P��dN�4��5���J�	 ����J+6[�Ur�l��\�u����;�^B)�6�����ZY}72_��6���(�I�����#0�<j������W��RO�>��L����?�������3��SB��^Gt�00�KA�xv��Dcly������Su;�H����j6��N�p������Z�]�����]�&lW��;�.���
�F�[�a�*�����@���\o��Z�����[��+����gb���S���m��Z���`�Rt�t�dZ��^�����>���[����	�z�^�Ykv��B������m������/��j�;�`�
WE�fy�6�
���4���@�{�����q��y)>PY,����t5�g����
�"\~cze���������3�s�|G�\w�a����mV4�5��w��5�������CW
�GB�����W�������m�@�=[<\
$�%59��i��3+jT���K@�|����`C�
�!����]q�[qF��Sujm���[�7����.��t+w������V�FJsKe������j�:`|/��Jh��������,���/3���AWZ�yY���Y0��L����W��&�
���xY�E^���p�IV�lp�]���v��z����9as?l�V+p ��?�-�m�`[���-��T�g
���$L������{:����;����L<O��`^[?�%��P6(��@�3��npTD��]�rXNJ����������l�m'���Yu.�|������pN��~!�m4���j'�u�(�>L�%�����uQ�������;s���(��5I����35]��5@�ng�����'���5 ���J��1O�P�W� h;�y�|L��u�@��\�<������N��fj�0Ss���	<�h�C�/�<���|G3��hT���g�y�8��#<�6���m42o�\�
71������o��6gb�`��\j5��j��/����_~���'��a�i��p��_��v�����8CnB�/.Eg:h��T���8�A}}��zA������rY���p"�m��N����8�w�z����cM�V��G��c�W@���:���t1�����.�m�_�5����[���.�qpf7��1`I��j�V������Ul�_B�x;G��i5Z~������{���oW�=�^�qu�/D��4�U�6'r��e����N��v.\Z1�:}�x�-}d3Qq��H��R�v����0�|m0����i48��!b�B�9�H���\e�Y]��z�\U��(��_9�7����'PZ�O�`���������';�������;/�u����
'��; v�n!Z�]��GwA:�/���7�<l|1������S� �n�^�������i�k��Y����h>�����ji��`���;��b�F����K�|��|K����q����W���J�������tN��=����i��[�J����R��<����=�i����za�N����Va�/�:�5@��J���4&�#F$,g�N����R_K����}��9���p�o�Ov��O�'���sA����H�f��vRz�/��Jp��"���$z����*�z���)�dt=(��� ,�G������}�8��A>6U+u�}Y����_s����#�F��p|�E+~��j�����3M�F�����%����������9`�U��h�:i2���1��
@�UUcN������P�WXx~6\� ��@��n9X��E��{|�3���S\Cj�6Z�1Cm��v5�_y<�W�G>VM�j�&�;�]���n4RX5�P���e�:c����ON���`t�o�U��ht��Z}h���WH^�\d4T����qa8�j��Fv��q��QZB�~[?�/�IyE(�z!����'�/w��i�������XV��A���|)������\��7�idDB��p\H�|�iWs����t�Gx�@W:�&]��k��6����n�d���f��t��r�;��
�]�<������?�����D��j��8�j9��t��Tp�7�1�@�Q�����w@`�9�'���x�o8��yP=���D_���@�Vk#�5��&>��R�Z��y��l����{��<S���"���Qx�!?��v��$x��7�����F7uK�.�C�o5��������'��93O]����(��i^L0���)f��P�Cj��E��q�U�����v6;x�����`{q|������{�Nv���/���S\;�C�4�~���5k��C![M�vZ�������\!�/�������%9��p��lr��4-T�U[�
m�)���&��Q�b)sL@��4���5�35.����{�.���������>��|�%�z���l���f�����v�k��P�����1�cm�� ����R����{:�XX�a~�1��s4��_��=<�`.sL�������B��AY�j?�bb[��Q�V�1���v+��QD�����}�1�:��������������xbX��"��</�\�����L����b�-I����ry�W�v��K�`�b���?������rj�Xrap�H��F��i�VI_!9nc�D��s�D�~.�(�X-F�Z������w��IV[}1�p����qjm���O+
S�FW���V��UT�}8z��y'��;8����`����7O��C&I��Q��)�K\]o��p�j��m����Z�|�uqz��Q����d
|BF���m�q��n�&t�
z�nw�[��iR��"Pf��{���K?���q��cx��#C��c(��`
�Z��b��I��y%
`��qt=5P�kp����?��<>=[�WM-��iz����^�6��h-�!}%���&��7���~���a��
�\���p�X�ow��N�Nvw��A�]P�R��U�[������+���mA�~�W���T<��Y0��b���y�����l��gA��|s@�/~�1�^i�����*�\KW�Y�cxX_��.
�e`zgL�U@�������_6�o<���*Z�:?��p��;\oG8�� �5Z+������s�H��l�"S�����z�r��V���K#�o<��������|%�S-���<5�������t������#O�U!rO�jT��6*�t�b`�vM��l1��@�;~w���=������=&�dB'�z��)���`��~P����Ku�Ro��������@zE�i�iD��t�l�����3W�9
/.r�����`��d�t����������/(vvr���k�N���0w�)�uQD��$��)��g�C�[En���8s��c�1�k���n����/�J'o��op5�@g����8LhZ�Ctg���;������ANJ�=�O^-�j���Sw��7��~J�N��?a,��3>��X�-S7�������t���A8
��*6����}��w�O.y�p�}h,*6���e��FU��ZE����ZtiT�ZY��*�]��C0�	������J��PK�;�A�fc/�������������_��q\y�1�������W��2/~k]]��x�Fyb
�ae>�+[�m�=�g��N�����&����lz��x���6��v�qV)�#����?I�p����^8��F�����[�r��{.���8A^%�?[���A2�N�q�����I4
��l<V��c�#n���5m��������?���C	{��?�N���?��7�[X�5Z��[�[4�b��������{O�{�s����>gI+��}�s�
r
�bbf���������B�X�~��M�[���8+��U%1��K;�0�xm����*�����W�!���Y�N���"��<&������i9S��yx����;qL������	����8~�	3�Dz��P�����Z���������S�����s?��%5��dy����h���<M1M���n)W$��r.�
<��n��uu�"�^��-����Vv��^�P0s�^�v<�kx&FW�Ot`��=�M<� ����A�������"�J)����q��a<���0�����L����i��	P��pD���E?�pW����]��R�^|6�g�w�����{���������w%������)w.i�..�"w��8HlT2�	?�������������������}�-:�[�[U����G4D���`�s����$��J���h��K��O�c�(f5��^�@Hw���o?q�^�p�>�E��&�8�}����0d� i���u��"��cj��)��p�I��&C��k��p�Xq�j(`���<*�d1V"��:�66K��2��������v�zB9��&%Lsl��c�j\�w2���`��$��Q��"��Y���G���(n���F�Q4�R��4�@tC��4^��������Dt�;-1������+����$l�k�	Z�/��F�y�RwKO!���"�l������p*py���%���������|_ba�$��x4j�?Tb�0Z�?^]�{<0rN���G�d���d*_�d[�fa�u@�m��\t��R��,��e�_�!�o:r-��8����b����v��8�pz�� ��M��E7T<sl�����������<�8N���UA�/y���������=If���h6�$6�ZI�Z��AB�;�����`�&�O�y��&]�s�Gt�o�B����?��
�D�H1��hd�"�3�$�����' ��V�[������)DR����GA�\n+�*��j��j4���32��y���������A������l�������3������f����:R2�><fCE+�$���SJ��!��T�&�;>�C���/�X"��M�P�a�����������:�x���%����P8<�,<.($a>�Y�+�j��0�A�;0�g�v�z�����v�o5�a����Df�o�1�@q�]�z����H0K��qJ���Lh���������_�w�(��-��7���������Og�@auR�x��8�3��o�W��Z�o���f��7��a�,���[",�t��a�b45Oc���y~��9J�: ,|��g�D���]�y�g�	ZY��(]�H��C/E�z8����I�x�Q�:����5
*���(�b.�S�2:'�
D�o?�j;��a]+���4p�N0<�Uu �=�0���<H��Ko����=Ao�$yO����$
E���~b�s��r6���������k�&���#����Q�C�%wn���
�K`
����/$���4aKa��#���#�B_|��>�-�&�F�?�����yk�G��N��.�>2M�-���������<�2k�ODA�j{d��p�"�D}��h���oP�Jx��w&�D��S�&�U��w�����o��e�+���y��h
��)�-�{��rpp��XC�'�4$3xl�<���(�^~��������+)��
`^�7O�\��8�){�e����c|w����6�O���)�+��dx|�i�����`Z����y�����o��K�d�����2R(�;�Q�\�sr�{9�Vd�'g�lSc�H���(���x�����3gt�S�J�k�8�_\�c�������T�|�-���'e`^W�%~��O���bt>������G0W�E��\aUX���!)y�D����
��,������q����j��s�8��'�ts��|Vl�pc����!1�&�k{�7h���7�zd{@k������wm��k��]+�w-g� �WM���9:��r��xE���g~��'m�'������m\U���+|Y:<�)*Kg��oO�����d
YC����X���{w{K���m�:�a�R.����>,���sY[n��
����O�������(8�����n�v��G��:���fu��a��Q���!��wG����A�� �ru��I�gy�26���Q�ey�Ab.i�w
[|�.���}'w^�=�����bj]'%��O�P>\������ys��j�r}8:�T[��P���I��������m������%�P��J2T��4��	@�VV
����a�D����p�1�Q5[��R�{x�v^�z\�j��%/�q��lze.��a#I�%��3�^u�� d�k;/���f�"�?���S�Y�����y
�~��ce`�!�4��E��}���(��w�eQ�|h�>������]/�O����Q����������y�����z��[)c?O�����%����[���w��B��l��� ��c�ZQ|��������n���(k��s�'#`��x�����.~�
�������;�vB�k���}/Pt�3���5���W����D|=P=W����'��\���_g���E�R�vR�F��7j�8������}�C0$���+�v�d,�`�9�a�6�C�{�����#T+�]c�Rd�w~>x��u����Y���_�w�N-����pf����4�l ��Dt�j�m��9�yE�k��C�pH��)L6���9�q��)`�A|�]�UaS,MV��f��pS�� �-0�z+�A��Sd7SK�C��D��� �����V��=�f1����X�Yo�'tR��
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joachim Wieland (#27)
Re: patch for parallel pg_dump

Joachim Wieland <joe@mcknight.de> writes:

On Sat, Mar 10, 2012 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:

-                        const char *owner, bool withOids,
+                        const char *owner,
+                        unsigned long int relpages, bool withOids,

The new argument to ArchiveEntry() is unused. Removing it would
declutter things a good bit.

How do you mean it's unused? pg_dump_sort.c uses relpages to dump the
largest tables first. What you don't want to see in a parallel dump is
a worker starting to dump a large table while everybody else is
already idle...

Used or not, I think you could find a less ugly and less invasive way to
pass that around than this. We should try to avoid adding arguments to
ArchiveEntry that apply to only one object type.

(I'm also unconvinced that sorting by relation size is a good idea
anyway. Anything that makes the dump order less predictable gets
push-back, IME.)

regards, tom lane

#29Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#28)
Re: patch for parallel pg_dump

Tom Lane <tgl@sss.pgh.pa.us> wrote:

(I'm also unconvinced that sorting by relation size is a good idea
anyway. Anything that makes the dump order less predictable gets
push-back, IME.)

Given that people often use diff on files from pg_dump,
unpredictable ordering can be a bad thing. On the other hand, that
is not something you would probably want to do with the output of a
*parallel* dump, so if it only affects that, it probably makes sense.
It seems like a reasonable heuristic to avoid having all but some
big table done, and having to wait for that while the other
processors are sitting idle.

-Kevin

#30Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#28)
Re: patch for parallel pg_dump

On Tuesday, March 13, 2012 02:48:11 PM Tom Lane wrote:

(I'm also unconvinced that sorting by relation size is a good idea
anyway. Anything that makes the dump order less predictable gets
push-back, IME.)

Why? Especially in the directory format - which is a prerequisite for parallel
dump if I remember this correctly - I don't really see a negative point in a
slightly different dump order. Given it's not deterministic anyway.

Andres

#31Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#27)
Re: patch for parallel pg_dump

On Mon, Mar 12, 2012 at 11:35 PM, Joachim Wieland <joe@mcknight.de> wrote:

How do you mean it's unused? pg_dump_sort.c uses relpages to dump the
largest tables first. What you don't want to see in a parallel dump is
a worker starting to dump a large table while everybody else is
already idle...

What I mean is that the function ArchiveEntry() is defined in
pg_backup_archiver.c, and it takes an argument called relpages, and
the string "relpages" does not appear anywhere else in that file.

The backend can have a wrapper function around this that calls ereport
using the error_string and error_code, and any front-end code that
wants to use this can do so directly.

I tried this actually (patch attached) but then I wanted to test it
and couldn't find anything that used pgpipe() on Windows.

pg_basebackup/pg_basebackup.c is using it but it's in an #ifndef WIN32
and the same is true for postmaster/syslogger.c. Am I missing
something or has this Windows implementation become stale by now? I'll
append the patch but haven't adapted the pg_dump patch yet to use it.
Should we still go forward the way you proposed?

Dunno. Can we get an opinion on that from one of the Windows guys?
Andrew, Magnus?

+/*
+ * The parallel error handler is called for any die_horribly() in a
+ * child or master process.
+ * It then takes control over shutting down the rest of the gang.
+ */

I think this needs to be revised to take control in exit_nicely(),
maybe by using on_exit_nicely().  Trapping die_horribly() won't catch
everything.

It's actually not designed to catch everything. This whole error
handler thing is only there to report a single error to the user which
is hopefully the root cause of why everybody is shutting down. Assume
for example that we cannot get a lock on one table in a worker. Then
the worker would die_horribly() saying that it cannot get a lock. The
master would receive that message and shut down. Shutting down for the
master means killing all the other workers.

The master terminates because a worker died. And all the other workers
die because the master killed them. Yet the root cause for the
termination was the fact that one of the workers couldn't get a lock,
and this is the one and only message that the user should see.

If a child terminates without leaving a message, the master will still
detect it and just say "a worker process died unexpectedly" (this part
was actually broken, but now it's fixed :-) )

All that may be true, but I still don't see why it's right for this to
apply in the cases where the worker thread says die_horribly(), but
not in the cases where the worker says exit_horribly().

Or we change fmtQualifiedId to take an int and then we always pass the
archive version instead of the Archive* ?

Hmm, I think that might make sense.

+enum escrow_action { GET, SET };
+static void
+parallel_error_handler_escrow_data(enum escrow_action act,
ParallelState *pstate)
+{
+       static ParallelState *s_pstate = NULL;
+
+       if (act == SET)
+               s_pstate = pstate;
+       else
+               *pstate = *s_pstate;
+}

This seems like a mighty complicated way to implement a global variable.

Well, we talked about that before, when you complained that you
couldn't get rid of the global g_conn because of the exit handler.
You're right that in fact it is an indirect global variable here but
it's clearly limited to the use of the error handler and you can be
sure that nobody other than this function writes to it or accesses it
without calling this function.

Sure, but since all the function does is write to it or access it,
what good does that do me?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#32Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#31)
Re: patch for parallel pg_dump

On 03/13/2012 01:53 PM, Robert Haas wrote:

I tried this actually (patch attached) but then I wanted to test it
and couldn't find anything that used pgpipe() on Windows.

pg_basebackup/pg_basebackup.c is using it but it's in an #ifndef WIN32
and the same is true for postmaster/syslogger.c. Am I missing
something or has this Windows implementation become stale by now? I'll
append the patch but haven't adapted the pg_dump patch yet to use it.
Should we still go forward the way you proposed?

Dunno. Can we get an opinion on that from one of the Windows guys?
Andrew, Magnus?

I haven't had time to review this patch or even follow all the
discussion as I was hoping. I'll try to review the whole thing shortly.

cheers

andrew

#33Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#29)
Re: patch for parallel pg_dump

On Tue, Mar 13, 2012 at 9:59 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

(I'm also unconvinced that sorting by relation size is a good idea
anyway.  Anything that makes the dump order less predictable gets
push-back, IME.)

Given that people often use diff on files from pg_dump,
unpredictable ordering can be a bad thing.  On the other hand, that
is not something you would probably want to do with the output of a
*parallel* dump, so if it only affects that, it probably makes sense.
It seems like a reasonable heuristic to avoid having all but some
big table done, and having to wait for that while the other
processors are sitting idle.

Yeah, I think it's a good heuristic. Finishing the dump in the
minimum possible time is sufficiently similar to the knapsack problem
as to make me suspect that there is no easy way to be certain of
getting the optimal dump order (and we don't even have perfect
information, since we know little about the characteristics of the
underlying storage). But dumping tables in descending order figures
to get many easy cases right - e.g. suppose there are 200 tables of
size X and 1 table of size 100*X, and we have 3 workers to play with.
If we dump the tables in an essentially random order (relative to
size) then the overall time will get longer the more little tables we
dump before we start the big one.

Now, if we have tables of sizes 10*X, 9*X, 8*X, 6*X, and 5*X and two
workers, then the first worker will get the 10*X table, the second
worker will get the 9*X table, then the second worker will start the
8*X table, then the first worker will get the 6*X and 5*X tables and,
assuming dump time is a uniform function of table size, we'll finish
after 21 time units. Had we been smarter, we could have assigned the
9*X, 6*X, and 5*X tables to one worker and the 10*X and 8*X tables to
the other and finished in just 20 time units. There's probably a way
to construct a more extreme example of this, but I think in practice
if there's any loss due to this kind of effect it will be small, and
descending-size order certainly seems more likely to be right than
leaving it to chance.

A bigger problem is that dumping relations A and B at the same time
might involve a lot more I/O contention than dumping relations A and C
at the same time if, say, A and B are in the same tablespace and C is
not. I have no idea what to do about that in general, but for a first
version of this feature I think it's fine to punt.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#34Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#31)
1 attachment(s)
Re: patch for parallel pg_dump

On Tue, Mar 13, 2012 at 1:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:

What I mean is that the function ArchiveEntry() is defined in
pg_backup_archiver.c, and it takes an argument called relpages, and
the string "relpages" does not appear anywhere else in that file.

Uhm, that's kinda concerning, isn't it... fixed...

[...pgpipe...]

Dunno.  Can we get an opinion on that from one of the Windows guys?
Andrew, Magnus?

Waiting for the verdict here...

If a child terminates without leaving a message, the master will still
detect it and just say "a worker process died unexpectedly" (this part
was actually broken, but now it's fixed :-) )

All that may be true, but I still don't see why it's right for this to
apply in the cases where the worker thread says die_horribly(), but
not in the cases where the worker says exit_horribly().

Hm, I'm not calling the error handler from exit_horribly because it
doesn't have the AH. It looks like the code assumes that
die_horribly() is called whenever AH is available and if not,
exit_horribly() should be called which eventually calls these
preregistered exit-hooks via exit_nicely() to clean up the connection.

I think we should somehow unify both functions; the code is not very
consistent in this respect, since it also calls exit_horribly() when it
has AH available. See for example pg_backup_tar.c.

Or is there another distinction between them? The question how to
clean it up basically brings us back to the question what to do about
global variables in general and for error handlers in particular.

Or we change fmtQualifiedId to take an int and then we always pass the
archive version instead of the Archive* ?

Hmm, I think that might make sense.

Done.

+enum escrow_action { GET, SET };
+static void
+parallel_error_handler_escrow_data(enum escrow_action act,
ParallelState *pstate)
+{
+       static ParallelState *s_pstate = NULL;
+
+       if (act == SET)
+               s_pstate = pstate;
+       else
+               *pstate = *s_pstate;
+}

This seems like a mighty complicated way to implement a global variable.

Well, we talked about that before, when you complained that you
couldn't get rid of the global g_conn because of the exit handler.
You're right that in fact it is an indirect global variable here but
it's clearly limited to the use of the error handler and you can be
sure that nobody other than this function writes to it or accesses it
without calling this function.

Sure, but since all the function does is write to it or access it,
what good does that do me?

It encapsulates the variable so that it can only be used for one
specific use case.

Attaching a new version.

Attachments:

parallel_pg_dump_4.diff.gz (application/x-gzip)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#34)
Re: patch for parallel pg_dump

On Wed, Mar 14, 2012 at 12:34 AM, Joachim Wieland <joe@mcknight.de> wrote:

If a child terminates without leaving a message, the master will still
detect it and just say "a worker process died unexpectedly" (this part
was actually broken, but now it's fixed :-) )

All that may be true, but I still don't see why it's right for this to
apply in the cases where the worker thread says die_horribly(), but
not in the cases where the worker says exit_horribly().

Hm, I'm not calling the error handler from exit_horribly because it
doesn't have the AH. It looks like the code assumes that
die_horribly() is called whenever AH is available and if not,
exit_horribly() should be called which eventually calls these
preregistered exit-hooks via exit_nicely() to clean up the connection.

I think we should somehow unify both functions; the code is not very
consistent in this respect, since it also calls exit_horribly() when it
has AH available. See for example pg_backup_tar.c.

I think we should get rid of die_horribly(), and instead arrange
to always clean up AH via an on_exit_nicely hook.

Sure, but since all the function does is write to it or access it,
what good does that do me?

It encapsulates the variable so that it can only be used for one
specific use case.

Seems pointless to me.

+       /*
+        * This is a data dumper routine, executed in a child for parallel backu
+        * so it must not access the global g_conn but AH->connection instead.
+        */

There's no g_conn any more. This and several other references to it
should be updated or expunged.

+       {
+               write_msg(NULL, "parallel backup only supported by the directory
+               exit(1);
+       }

I think this should exit_horribly() with that message. It definitely
can't use exit() rather than exit_nicely(); more generally, every copy
of exit() that you've added here should be exit_nicely() instead, or
use some higher-level routine like exit_horribly().

+                       write_msg(NULL, "No synchronized snapshots available in
+                                                "You might have to run with --n
+                       exit(1);

In addition to the previous problem, what do you mean by "might"? The
real problem is that on pre-9.2 versions multiple jobs are not OK
unless that option is used; I think we should say that more directly.

/*
* The sequence is the following (for dump, similar for restore):
*
* Master Worker
*
* enters WaitForCommands()
* DispatchJobForTocEntry(...te...)
*
* [ Worker is IDLE ]
*
* arg = (MasterStartParallelItemPtr)()
* send: DUMP arg
* receive: DUMP arg
* str = (WorkerJobDumpPtr)(arg)
* [ Worker is WORKING ] ... gets te from arg ...
* ... dump te ...
* send: OK DUMP info
*
* In ListenToWorkers():
*
* [ Worker is FINISHED ]
* receive: OK DUMP info
* status = (MasterEndParallelItemPtr)(info)
*
* In ReapWorkerStatus(&ptr):
* *ptr = status;
* [ Worker is IDLE ]
*/

I don't find this comment very clear, and would suggest rewriting it
using prose rather than an ASCII diagram. Note also that any sort of
thing that does look like an ASCII diagram must be surrounded by lines
of dashes within the comment block, or pgindent will make hash of it.
There are a couple of other places where this is an issue as well,
like the comment for ListenToWorkers().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#36Andrew Dunstan
adunstan@postgresql.org
In reply to: Andrew Dunstan (#32)
Re: patch for parallel pg_dump

On 03/13/2012 02:10 PM, Andrew Dunstan wrote:

On 03/13/2012 01:53 PM, Robert Haas wrote:

I tried this actually (patch attached) but then I wanted to test it
and couldn't find anything that used pgpipe() on Windows.

pg_basebackup/pg_basebackup.c is using it but it's in an #ifndef WIN32
and the same is true for postmaster/syslogger.c. Am I missing
something or has this Windows implementation become stale by now? I'll
append the patch but haven't adapted the pg_dump patch yet to use it.
Should we still go forward the way you proposed?

Dunno. Can we get an opinion on that from one of the Windows guys?
Andrew, Magnus?

I haven't had time to review this patch or even follow all the
discussion as I was hoping. I'll try to review the whole thing shortly.

pgpipe used to be used in pgstat.c, but that's no longer true in any
live branch, so it's probably long dead. I'd be inclined to rip it out
if possible rather than expand its use.

I've just started looking at the patch, and I'm curious to know why it
didn't follow the pattern of parallel pg_restore, which created a new
worker for each table rather than passing messages to looping worker
threads as this appears to do. That might have avoided a lot of the need
for this message passing infrastructure, if it could have been done. But
maybe I just need to review the patch and the discussions some more.

cheers

andrew

#37Alvaro Herrera
alvherre@commandprompt.com
In reply to: Andrew Dunstan (#36)
Re: patch for parallel pg_dump

Excerpts from Andrew Dunstan's message of Wed Mar 14 17:39:59 -0300 2012:

pgpipe used to be used in pgstat.c, but that's no longer true in any
live branch, so it's probably long dead. I'd be inclined to rip it out
if possible rather than expand its use.

our pgpipe() function is interesting -- all the callers that use it
first verify that they aren't WIN32. If they are, they are using a
#define that makes it plain pipe(). And the function is only defined in
WIN32. It seems a reasonable idea to kill both pgpipe() and piperead().

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#38Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#36)
Re: patch for parallel pg_dump

On Wed, Mar 14, 2012 at 4:39 PM, Andrew Dunstan <adunstan@postgresql.org> wrote:

I've just started looking at the patch, and I'm curious to know why it
didn't follow the pattern of parallel pg_restore which created a new worker
for each table rather than passing messages to looping worker threads as
this appears to do. That might have avoided a lot of the need for this
message passing infrastructure, if it could have been done. But maybe I just
need to review the patch and the discussions some more.

Hmm, I hadn't actually considered that idea. Not sure whether it's
better or worse than the current implementation...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#39Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#35)
Re: patch for parallel pg_dump

On Wed, Mar 14, 2012 at 2:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should get rid of die_horribly(), and instead arrange
to always clean up AH via an on_exit_nicely hook.

Good. The only exit handler I've seen so far is
pgdump_cleanup_at_exit. If there's no other one, is it okay to remove
all of this stacking functionality (see on_exit_nicely_index /
MAX_ON_EXIT_NICELY) from dumputils.c and just define two global
variables, one for the function and one for the arg that this function
would operate on (or a struct of both)?

We'd then have the current function and AHX (or only &AH->connection
from it) in the non-parallel case and as soon as we enter the parallel
dump, we can exchange it for another function operating on
ParallelState*. This avoids having to deal with thread-local storage
on Windows, because ParallelState* is just large enough to hold all
the required data and a specific thread can easily find its own slot
with its threadId.

Sure, but since all the function does is write to it or access it,
what good does that do me?

It encapsulates the variable so that it can only be used for one
specific use case.

Seems pointless to me.

Not so much to me if the alternative is to make ParallelState* a
global variable, but anyway, with the concept proposed above,
ParallelState* would be the arg that the parallel exit handler would
operate on, so it would indeed be global but hidden behind a different
name and a void* pointer.

(I will address all the other points you brought up in my next patch)

#40Joachim Wieland
joe@mcknight.de
In reply to: Andrew Dunstan (#36)
Re: patch for parallel pg_dump

On Wed, Mar 14, 2012 at 4:39 PM, Andrew Dunstan <adunstan@postgresql.org> wrote:

I've just started looking at the patch, and I'm curious to know why it
didn't follow the pattern of parallel pg_restore which created a new worker
for each table rather than passing messages to looping worker threads as
this appears to do. That might have avoided a lot of the need for this
message passing infrastructure, if it could have been done. But maybe I just
need to review the patch and the discussions some more.

The main reason for this design has now been overcome by the
flexibility of the synchronized snapshot feature, which makes it
possible to adopt the snapshot of another transaction even if that
transaction has been running for quite some time already. In other
previously proposed implementations of this feature, workers had to
connect at the same time and then could not close their transactions
without losing the snapshot.

The other drawback of the fork-per-TocEntry approach is the somewhat
limited bandwidth of information from the worker back to the master;
it's basically just the return code. That's fine if there is no error,
but if there is, then the master can't tell any further details (e.g.
"could not get lock on table foo", or "could not write to file bar: no
space left on device").

This restriction does not apply only to error messages. For example,
what I'd also like to have in pg_dump would be checksums on a
per-TocEntry basis. The individual workers would calculate the
checksums when writing the file and then send them back to the master
for integration into the TOC. I don't see how such a feature could be
implemented in a straightforward way without a message passing
infrastructure.

#41Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#39)
Re: patch for parallel pg_dump

On Thu, Mar 15, 2012 at 12:56 AM, Joachim Wieland <joe@mcknight.de> wrote:

On Wed, Mar 14, 2012 at 2:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should get rid of die_horribly(), and instead arrange
to always clean up AH via an on_exit_nicely hook.

Good. The only exit handler I've seen so far is
pgdump_cleanup_at_exit. If there's no other one, is it okay to remove
all of this stacking functionality (see on_exit_nicely_index /
MAX_ON_EXIT_NICELY) from dumputils.c and just define two global
variables, one for the function and one for the arg that this function
would operate on (or a struct of both)?

No. That code is included by other things - like pg_dumpall - that
don't know there's such a thing as an Archive. But I don't see that
as a big problem; just on_exit_nicely whatever you want. We could
also add on_exit_nicely_reset(), if needed, to clear the existing
handlers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#42Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#41)
Re: patch for parallel pg_dump

On Fri, Mar 16, 2012 at 12:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Good. The only exit handler I've seen so far is
pgdump_cleanup_at_exit. If there's no other one, is it okay to remove
all of this stacking functionality (see on_exit_nicely_index /
MAX_ON_EXIT_NICELY) from dumputils.c and just define two global
variables, one for the function and one for the arg that this function
would operate on (or a struct of both)?

No.  That code is included by other things - like pg_dumpall - that
don't know there's such a thing as an Archive.  But I don't see that
as a big problem; just on_exit_nicely whatever you want.  We could
also add on_exit_nicely_reset(), if needed, to clear the existing
handlers.

Yes, on_exit_nicely_reset() would be what I'd need in the child
process to remove, after the fork, all callbacks inherited from the parent.

I still can't find any other hooks except for pgdump_cleanup_at_exit
from pg_dump.c. I guess what you're saying is that we provide
dumputils.c to other programs and, even though none of them currently
sets any exit callback, you want to keep the functionality so that
they can set multiple exit hooks in the future should the need for
them arise.

#43Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#35)
1 attachment(s)
Re: patch for parallel pg_dump

On Wed, Mar 14, 2012 at 2:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should somehow unify both functions, the code is not very
consistent in this respect, it also calls exit_horribly() when it has
AH available. See for example pg_backup_tar.c

I think we should get rid of die_horribly(), and instead arrange
to always clean up AH via an on_exit_nicely hook.

Attached is a patch that gets rid of die_horribly().

For the parallel case it maintains an array with as many elements as
there are worker processes. When the workers start, they record their
PID (or thread ID on Windows) and their ArchiveHandle (AH). The exit
handler in a process can then find its own ArchiveHandle by comparing
its own PID with the elements of the array.

Attachments:

pg_dump_die_horribly.diff (text/x-patch; charset=US-ASCII)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c30b8f9..ff8e714 100644
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
*************** EndCompressorZlib(ArchiveHandle *AH, Com
*** 256,263 ****
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
--- 256,263 ----
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 274,281 ****
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			die_horribly(AH, modulename,
! 						 "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
--- 274,281 ----
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			exit_horribly(modulename,
! 						  "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 295,303 ****
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					die_horribly(AH, modulename,
! 								 "could not write to output file: %s\n",
! 								 strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
--- 295,303 ----
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					exit_horribly(modulename,
! 								  "could not write to output file: %s\n",
! 								  strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
*************** WriteDataToArchiveZlib(ArchiveHandle *AH
*** 318,324 ****
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * die_horribly()
  	 */
  	return dLen;
  }
--- 318,324 ----
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * exit_horribly()
  	 */
  	return dLen;
  }
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 361,368 ****
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				die_horribly(AH, modulename,
! 							 "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
--- 361,368 ----
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				exit_horribly(modulename,
! 							  "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 377,392 ****
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			die_horribly(AH, modulename,
! 						 "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
--- 377,392 ----
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			exit_horribly(modulename,
! 						  "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
*************** WriteDataToArchiveNone(ArchiveHandle *AH
*** 426,434 ****
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		die_horribly(AH, modulename,
! 					 "could not write to output file: %s\n",
! 					 strerror(errno));
  	return dLen;
  }
  
--- 426,434 ----
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n",
! 					  strerror(errno));
  	return dLen;
  }
  
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 0b24220..d9681f6 100644
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
*************** static void AddAcl(PQExpBuffer aclbuf, c
*** 49,54 ****
--- 49,55 ----
  #ifdef WIN32
  static bool parallel_init_done = false;
  static DWORD tls_index;
+ static DWORD mainThreadId;
  #endif
  
  void
*************** init_parallel_dump_utils(void)
*** 59,64 ****
--- 60,66 ----
  	{
  		tls_index = TlsAlloc();
  		parallel_init_done = true;
+ 		mainThreadId = GetCurrentThreadId();
  	}
  #endif
  }
*************** on_exit_nicely(on_exit_nicely_callback f
*** 1313,1318 ****
--- 1315,1327 ----
  	on_exit_nicely_index++;
  }
  
+ /* Delete any previously set callback functions */
+ void
+ on_exit_nicely_reset(void)
+ {
+ 	on_exit_nicely_index = 0;
+ }
+ 
  /* Run accumulated on_exit_nicely callbacks and then exit quietly. */
  void
  exit_nicely(int code)
*************** exit_nicely(int code)
*** 1320,1324 ****
--- 1329,1337 ----
  	while (--on_exit_nicely_index >= 0)
  		(*on_exit_nicely_list[on_exit_nicely_index].function)(code,
  			on_exit_nicely_list[on_exit_nicely_index].arg);
+ #ifdef WIN32
+ 	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
+ 		ExitThread(code);
+ #endif
  	exit(code);
  }
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
index 82cf940..2865c0f 100644
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*************** extern void set_section (const char *arg
*** 62,67 ****
--- 62,68 ----
  
  typedef void (*on_exit_nicely_callback) (int code, void *arg);
  extern void on_exit_nicely(on_exit_nicely_callback function, void *arg);
+ extern void on_exit_nicely_reset(void);
  extern void exit_nicely(int code) __attribute__((noreturn));
  
  #endif   /* DUMPUTILS_H */
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index ff0db46..ba553d3 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** extern void ConnectDatabase(Archive *AH,
*** 167,172 ****
--- 167,173 ----
  				enum trivalue prompt_password);
  extern void DisconnectDatabase(Archive *AHX);
  extern PGconn *GetConnection(Archive *AHX);
+ extern void archive_close_connection(int code, void *arg);
  
  /* Called to add a TOC entry */
  extern void ArchiveEntry(Archive *AHX,
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 79f7dda..ebf11f7 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 61,71 ****
--- 61,88 ----
  #define thandle HANDLE
  #endif
  
+ typedef struct _parallel_state_entry
+ {
+ #ifdef WIN32
+ 	unsigned int threadId;
+ #else
+ 	pid_t		pid;
+ #endif
+ 	ArchiveHandle *AH;
+ } ParallelStateEntry;
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelStateEntry *pse;
+ } ParallelState;
+ 
  /* Arguments needed for a worker child */
  typedef struct _restore_args
  {
  	ArchiveHandle *AH;
  	TocEntry   *te;
+ 	ParallelStateEntry *pse;
  } RestoreArgs;
  
  /* State for each parallel activity slot */
*************** static int	_discoverArchiveFormat(Archiv
*** 122,131 ****
  
  static int	RestoringToDB(ArchiveHandle *AH);
  static void dump_lo_buf(ArchiveHandle *AH);
- static void vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 						  const char *fmt, va_list ap)
- 	__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0), noreturn));
- 
  static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
  static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
  static OutputContext SaveOutput(ArchiveHandle *AH);
--- 139,144 ----
*************** CloseArchive(Archive *AHX)
*** 208,215 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
! 					 strerror(errno));
  }
  
  /* Public */
--- 221,228 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
! 					  strerror(errno));
  }
  
  /* Public */
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 234,247 ****
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		die_horribly(AH, modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		die_horribly(AH, modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
--- 247,260 ----
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		exit_horribly(modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		exit_horribly(modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 251,261 ****
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			die_horribly(AH, modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			die_horribly(AH, modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
--- 264,274 ----
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			exit_horribly(modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			exit_horribly(modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 274,280 ****
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				die_horribly(AH, modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
--- 287,293 ----
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				exit_horribly(modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 286,292 ****
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			die_horribly(AH, modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
--- 299,305 ----
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			exit_horribly(modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
*************** WriteData(Archive *AHX, const void *data
*** 734,740 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		die_horribly(AH, modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
--- 747,753 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		exit_horribly(modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
*************** StartBlob(Archive *AHX, Oid oid)
*** 889,895 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		die_horribly(AH, modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
--- 902,908 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		exit_horribly(modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
*************** StartRestoreBlob(ArchiveHandle *AH, Oid
*** 976,988 ****
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				die_horribly(AH, modulename, "could not create large object %u: %s",
! 							 oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			die_horribly(AH, modulename, "could not open large object %u: %s",
! 						 oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
--- 989,1001 ----
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				exit_horribly(modulename, "could not create large object %u: %s",
! 							  oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			exit_horribly(modulename, "could not open large object %u: %s",
! 						  oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1038,1045 ****
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		die_horribly(AH, modulename, "could not open TOC file \"%s\": %s\n",
! 					 ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
--- 1051,1058 ----
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		exit_horribly(modulename, "could not open TOC file \"%s\": %s\n",
! 					  ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1086,1093 ****
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			die_horribly(AH, modulename, "could not find entry for ID %d\n",
! 						 id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
--- 1099,1106 ----
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			exit_horribly(modulename, "could not find entry for ID %d\n",
! 						  id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1107,1114 ****
  	}
  
  	if (fclose(fh) != 0)
! 		die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 1120,1127 ----
  	}
  
  	if (fclose(fh) != 0)
! 		exit_horribly(modulename, "could not close TOC file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** SetOutput(ArchiveHandle *AH, const char
*** 1224,1234 ****
  	if (!AH->OF)
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 filename, strerror(errno));
  		else
! 			die_horribly(AH, modulename, "could not open output file: %s\n",
! 						 strerror(errno));
  	}
  }
  
--- 1237,1247 ----
  	if (!AH->OF)
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  filename, strerror(errno));
  		else
! 			exit_horribly(modulename, "could not open output file: %s\n",
! 						  strerror(errno));
  	}
  }
  
*************** RestoreOutput(ArchiveHandle *AH, OutputC
*** 1254,1260 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
--- 1267,1273 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
*************** dump_lo_buf(ArchiveHandle *AH)
*** 1332,1338 ****
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			die_horribly(AH, modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
--- 1345,1351 ----
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			exit_horribly(modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
*************** ahwrite(const void *ptr, size_t size, si
*** 1391,1397 ****
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
--- 1404,1410 ----
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
*************** ahwrite(const void *ptr, size_t size, si
*** 1399,1405 ****
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
--- 1412,1418 ----
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
*************** ahwrite(const void *ptr, size_t size, si
*** 1414,1468 ****
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				die_horribly(AH, modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
- 
- /* Report a fatal error and exit(1) */
- static void
- vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 			  const char *fmt, va_list ap)
- {
- 	vwrite_msg(modulename, fmt, ap);
- 
- 	if (AH)
- 	{
- 		if (AH->public.verbose)
- 			write_msg(NULL, "*** aborted because of error\n");
- 		DisconnectDatabase(&AH->public);
- 	}
- 
- 	exit_nicely(1);
- }
- 
- /* As above, but with variable arg list */
- void
- die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...)
- {
- 	va_list		ap;
- 
- 	va_start(ap, fmt);
- 	vdie_horribly(AH, modulename, fmt, ap);
- 	va_end(ap);
- }
- 
- /* As above, but with a complaint about a particular query. */
- void
- die_on_query_failure(ArchiveHandle *AH, const char *modulename,
- 					 const char *query)
- {
- 	write_msg(modulename, "query failed: %s",
- 			  PQerrorMessage(AH->connection));
- 	die_horribly(AH, modulename, "query was: %s\n", query);
- }
- 
  /* on some error, we may decide to go on... */
  void
! warn_or_die_horribly(ArchiveHandle *AH,
  					 const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
--- 1427,1442 ----
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				exit_horribly(modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
  /* on some error, we may decide to go on... */
  void
! warn_or_exit_horribly(ArchiveHandle *AH,
  					 const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
*************** warn_or_die_horribly(ArchiveHandle *AH,
*** 1500,1513 ****
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
  	if (AH->public.exit_on_error)
! 		vdie_horribly(AH, modulename, fmt, ap);
  	else
- 	{
- 		vwrite_msg(modulename, fmt, ap);
  		AH->public.n_errors++;
- 	}
- 	va_end(ap);
  }
  
  #ifdef NOT_USED
--- 1474,1486 ----
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
+ 	vwrite_msg(modulename, fmt, ap);
+ 	va_end(ap);
+ 
  	if (AH->public.exit_on_error)
! 		exit_nicely(1);
  	else
  		AH->public.n_errors++;
  }
  
  #ifdef NOT_USED
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1626,1632 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
--- 1599,1605 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1639,1645 ****
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				die_horribly(AH, modulename, "file offset in dump file is too large\n");
  		}
  	}
  
--- 1612,1618 ----
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				exit_horribly(modulename, "file offset in dump file is too large\n");
  		}
  	}
  
*************** ReadStr(ArchiveHandle *AH)
*** 1733,1739 ****
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
--- 1706,1712 ----
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1776,1783 ****
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
--- 1749,1756 ----
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1786,1817 ****
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			die_horribly(AH, modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						 AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			die_horribly(AH, modulename, "could not open input file: %s\n",
! 						 strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
--- 1759,1790 ----
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			exit_horribly(modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						  AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			exit_horribly(modulename, "could not open input file: %s\n",
! 						  strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1819,1828 ****
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			die_horribly(AH, modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			die_horribly(AH, modulename, "input file is too short (read %lu, expected 5)\n",
! 						 (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
--- 1792,1801 ----
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			exit_horribly(modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			exit_horribly(modulename, "input file is too short (read %lu, expected 5)\n",
! 						  (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1883,1896 ****
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			die_horribly(AH, modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
--- 1856,1869 ----
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			exit_horribly(modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			exit_horribly(modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			exit_horribly(modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1910,1917 ****
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			die_horribly(AH, modulename, "could not close input file: %s\n",
! 						 strerror(errno));
  
  	return AH->format;
  }
--- 1883,1890 ----
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			exit_horribly(modulename, "could not close input file: %s\n",
! 						  strerror(errno));
  
  	return AH->format;
  }
*************** _allocAH(const char *FileSpec, const Arc
*** 2034,2040 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
--- 2007,2013 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
*************** ReadToc(ArchiveHandle *AH)
*** 2156,2164 ****
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			die_horribly(AH, modulename,
! 					   "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						 te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
--- 2129,2137 ----
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			exit_horribly(modulename,
! 						  "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						  te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
*************** processEncodingEntry(ArchiveHandle *AH,
*** 2313,2325 ****
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			die_horribly(AH, modulename, "unrecognized encoding \"%s\"\n",
! 						 ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		die_horribly(AH, modulename, "invalid ENCODING item: %s\n",
! 					 te->defn);
  
  	free(defn);
  }
--- 2286,2298 ----
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			exit_horribly(modulename, "unrecognized encoding \"%s\"\n",
! 						  ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		exit_horribly(modulename, "invalid ENCODING item: %s\n",
! 					  te->defn);
  
  	free(defn);
  }
*************** processStdStringsEntry(ArchiveHandle *AH
*** 2336,2343 ****
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		die_horribly(AH, modulename, "invalid STDSTRINGS item: %s\n",
! 					 te->defn);
  }
  
  static teReqs
--- 2309,2316 ----
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		exit_horribly(modulename, "invalid STDSTRINGS item: %s\n",
! 					  te->defn);
  }
  
  static teReqs
*************** _doSetSessionAuth(ArchiveHandle *AH, con
*** 2544,2552 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_die_horribly... use -O instead to skip this. */
! 			die_horribly(AH, modulename, "could not set session user to \"%s\": %s",
! 						 user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2517,2525 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_exit_horribly... use -O instead to skip this. */
! 			exit_horribly(modulename, "could not set session user to \"%s\": %s",
! 						  user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _doSetWithOids(ArchiveHandle *AH, const
*** 2576,2584 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_with_oids: %s",
! 								 PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2549,2557 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_with_oids: %s",
! 								  PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectOutputSchema(ArchiveHandle *AH, c
*** 2714,2722 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set search_path to \"%s\": %s",
! 								 schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2687,2695 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set search_path to \"%s\": %s",
! 								  schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectTablespace(ArchiveHandle *AH, con
*** 2775,2783 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_tablespace to %s: %s",
! 								 fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2748,2756 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_tablespace to %s: %s",
! 								  fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** ReadHead(ArchiveHandle *AH)
*** 3157,3166 ****
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			die_horribly(AH, modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
--- 3130,3139 ----
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			exit_horribly(modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
*************** ReadHead(ArchiveHandle *AH)
*** 3173,3185 ****
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			die_horribly(AH, modulename, "unsupported version (%d.%d) in file header\n",
! 						 AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			die_horribly(AH, modulename, "sanity check on integer size (%lu) failed\n",
! 						 (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
--- 3146,3158 ----
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			exit_horribly(modulename, "unsupported version (%d.%d) in file header\n",
! 						  AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			exit_horribly(modulename, "sanity check on integer size (%lu) failed\n",
! 						  (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
*************** ReadHead(ArchiveHandle *AH)
*** 3192,3199 ****
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			die_horribly(AH, modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						 AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
--- 3165,3172 ----
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			exit_horribly(modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						  AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
*************** dumpTimestamp(ArchiveHandle *AH, const c
*** 3297,3302 ****
--- 3270,3321 ----
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
+ static void
+ setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	pse->threadId = GetCurrentThreadId();
+ #else
+ 	pse->pid = getpid();
+ #endif
+ 	pse->AH = AH;
+ }
+ 
+ static void
+ unsetProcessIdentifier(ParallelStateEntry *pse)
+ {
+ #ifdef WIN32
+ 	pse->threadId = 0;
+ #else
+ 	pse->pid = 0;
+ #endif
+ 	pse->AH = NULL;
+ }
+ 
+ static int
+ GetMySlot(ParallelState *pstate)
+ {
+ 	int			i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->pse[i].threadId == GetCurrentThreadId())
+ #else
+ 		if (pstate->pse[i].pid == getpid())
+ #endif
+ 			return i;
+ 
+ 	return NO_SLOT;
+ }
+ 
+ static void
+ archive_close_connection_parallel(int code, void *ps)
+ {
+ 	ParallelState *pstate = (ParallelState *) ps;
+ 	int			slotno = GetMySlot(pstate);
+ 	if (slotno != NO_SLOT && pstate->pse[slotno].AH)
+ 		DisconnectDatabase(&pstate->pse[slotno].AH->public);
+ }
  
  /*
   * Main engine for parallel restore.
*************** restore_toc_entries_parallel(ArchiveHand
*** 3323,3332 ****
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(sizeof(ParallelSlot), n_slots);
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3342,3358 ----
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
+ 	ParallelState *pstate;
+ 	int			i;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
! 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
! 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
! 	pstate->numWorkers = ropt->number_of_jobs;
! 	for (i = 0; i < pstate->numWorkers; i++)
! 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3382,3387 ****
--- 3408,3420 ----
  	 */
  	DisconnectDatabase(&AH->public);
  
+ 	/*
+ 	 * Both pg_dump and pg_restore, when they reach this code, have set at
+ 	 * most one exit handler, so we can simply reset the handlers here.
+ 	 */
+ 	on_exit_nicely_reset();
+ 	on_exit_nicely(archive_close_connection_parallel, pstate);
+ 
  	/* blow away any transient state from the old connection */
  	if (AH->currUser)
  		free(AH->currUser);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3480,3485 ****
--- 3513,3519 ----
  				args = pg_malloc(sizeof(RestoreArgs));
  				args->AH = CloneArchive(AH);
  				args->te = next_work_item;
+ 				args->pse = &pstate->pse[next_slot];
  
  				/* run the step in a worker child */
  				child = spawn_restore(args);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3507,3518 ****
  		}
  		else
  		{
! 			die_horribly(AH, modulename, "worker process crashed: status %d\n",
! 						 work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
  	 * Now reconnect the single parent connection.
--- 3541,3554 ----
  		}
  		else
  		{
! 			exit_horribly(modulename, "worker process crashed: status %d\n",
! 						  work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
+ 	on_exit_nicely_reset();
+ 	on_exit_nicely(archive_close_connection, AH);
  
  	/*
  	 * Now reconnect the single parent connection.
*************** spawn_restore(RestoreArgs *args)
*** 3555,3577 ****
  	{
  		/* in child process */
  		parallel_restore(args);
! 		die_horribly(args->AH, modulename,
! 					 "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker process: %s\n",
! 					 strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker thread: %s\n",
! 					 strerror(errno));
  #endif
  
  	return child;
--- 3591,3613 ----
  	{
  		/* in child process */
  		parallel_restore(args);
! 		exit_horribly(modulename,
! 					  "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		exit_horribly(modulename,
! 					  "could not create worker process: %s\n",
! 					  strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		exit_horribly(modulename,
! 					  "could not create worker thread: %s\n",
! 					  strerror(errno));
  #endif
  
  	return child;
*************** parallel_restore(RestoreArgs *args)
*** 3813,3818 ****
--- 3849,3856 ----
  	RestoreOptions *ropt = AH->ropt;
  	int			retval;
  
+ 	setProcessIdentifier(args->pse, AH);
+ 
  	/*
  	 * Close and reopen the input file so we have a private file pointer that
  	 * doesn't stomp on anyone else's file pointer, if we're actually going to
*************** parallel_restore(RestoreArgs *args)
*** 3843,3848 ****
--- 3881,3887 ----
  
  	/* And clean up */
  	DisconnectDatabase((Archive *) AH);
+ 	unsetProcessIdentifier(args->pse);
  
  	/* If we reopened the file, we are done with it, so close it now */
  	if (te->section == SECTION_DATA)
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3888,3894 ****
  	}
  
  	if (te == NULL)
! 		die_horribly(AH, modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
--- 3927,3933 ----
  	}
  
  	if (te == NULL)
! 		exit_horribly(modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3903,3910 ****
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		die_horribly(AH, modulename, "worker process failed: exit code %d\n",
! 					 status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
--- 3942,3949 ----
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		exit_horribly(modulename, "worker process failed: exit code %d\n",
! 					  status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index fa8c58c..b29d0f1 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** typedef struct _tocEntry
*** 324,332 ****
  } TocEntry;
  
  
! extern void die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4), noreturn));
! extern void die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query) __attribute__((noreturn));
! extern void warn_or_die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
--- 324,330 ----
  } TocEntry;
  
  
! extern void warn_or_exit_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 31fa373..87242c5 100644
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 146,160 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 146,160 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 165,179 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 165,179 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 367,373 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
--- 367,373 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 437,445 ****
  					break;
  
  				default:		/* Always have a default */
! 					die_horribly(AH, modulename,
! 								 "unrecognized data block type (%d) while searching archive\n",
! 								 blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
--- 437,445 ----
  					break;
  
  				default:		/* Always have a default */
! 					exit_horribly(modulename,
! 								  "unrecognized data block type (%d) while searching archive\n",
! 								  blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 449,456 ****
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			die_horribly(AH, modulename, "error during file seek: %s\n",
! 						 strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
--- 449,456 ----
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			exit_horribly(modulename, "error during file seek: %s\n",
! 						  strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 459,483 ****
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 						 "which cannot be handled due to lack of data offsets in archive\n",
! 						 te->dumpId);
  		else if (!ctx->hasSeek)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 				  "which cannot be handled due to non-seekable input file\n",
! 						 te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly corrupt archive\n",
! 						 te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		die_horribly(AH, modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					 id, te->dumpId);
  
  	switch (blkType)
  	{
--- 459,483 ----
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to lack of data offsets in archive\n",
! 						  te->dumpId);
  		else if (!ctx->hasSeek)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to non-seekable input file\n",
! 						  te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly corrupt archive\n",
! 						  te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		exit_horribly(modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					  id, te->dumpId);
  
  	switch (blkType)
  	{
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 490,497 ****
  			break;
  
  		default:				/* Always have a default */
! 			die_horribly(AH, modulename, "unrecognized data block type %d while restoring archive\n",
! 						 blkType);
  			break;
  	}
  }
--- 490,497 ----
  			break;
  
  		default:				/* Always have a default */
! 			exit_horribly(modulename, "unrecognized data block type %d while restoring archive\n",
! 						  blkType);
  			break;
  	}
  }
*************** _skipData(ArchiveHandle *AH)
*** 571,581 ****
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				die_horribly(AH, modulename,
! 							 "could not read from input file: end of file\n");
  			else
! 				die_horribly(AH, modulename,
! 					"could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
--- 571,581 ----
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				exit_horribly(modulename,
! 							  "could not read from input file: end of file\n");
  			else
! 				exit_horribly(modulename,
! 							  "could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
*************** _WriteByte(ArchiveHandle *AH, const int
*** 604,610 ****
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		die_horribly(AH, modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
--- 604,610 ----
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		exit_horribly(modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
*************** _ReadByte(ArchiveHandle *AH)
*** 624,630 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 624,630 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 645,651 ****
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		die_horribly(AH, modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
--- 645,651 ----
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		exit_horribly(modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 712,718 ****
  	}
  
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
--- 712,718 ----
  	}
  
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
*************** _ReopenArchive(ArchiveHandle *AH)
*** 731,767 ****
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		die_horribly(AH, modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		die_horribly(AH, modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		die_horribly(AH, modulename, "could not determine seek position in archive file: %s\n",
! 					 strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n",
! 					 strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 					 AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		die_horribly(AH, modulename, "could not set seek position in archive file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 731,767 ----
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		exit_horribly(modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		exit_horribly(modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		exit_horribly(modulename, "could not determine seek position in archive file: %s\n",
! 					  strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n",
! 					  strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 					  AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		exit_horribly(modulename, "could not set seek position in archive file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** _Clone(ArchiveHandle *AH)
*** 778,784 ****
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		die_horribly(AH, modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
--- 778,784 ----
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		exit_horribly(modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
*************** _readBlockHeader(ArchiveHandle *AH, int
*** 840,846 ****
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll die_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
--- 840,846 ----
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll exit_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
*************** _CustomReadFunc(ArchiveHandle *AH, char
*** 905,914 ****
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			die_horribly(AH, modulename,
! 						 "could not read from input file: end of file\n");
  		else
! 			die_horribly(AH, modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
--- 905,914 ----
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			exit_horribly(modulename,
! 						  "could not read from input file: end of file\n");
  		else
! 			exit_horribly(modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index a843eac..4a8283a 100644
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
*************** static PGconn *_connectDB(ArchiveHandle
*** 30,42 ****
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(ArchiveHandle *AH, const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		die_horribly(AH, modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
--- 30,42 ----
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		exit_horribly(modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
*************** _check_database_version(ArchiveHandle *A
*** 48,60 ****
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(AH, PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		die_horribly(AH, modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(AH, remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
--- 48,60 ----
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		exit_horribly(modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
*************** _check_database_version(ArchiveHandle *A
*** 67,73 ****
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		die_horribly(AH, NULL, "aborting because of server version mismatch\n");
  	}
  }
  
--- 67,73 ----
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		exit_horribly(NULL, "aborting because of server version mismatch\n");
  	}
  }
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 145,151 ****
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  
  	do
--- 145,151 ----
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  
  	do
*************** _connectDB(ArchiveHandle *AH, const char
*** 176,187 ****
  		free(values);
  
  		if (!newConn)
! 			die_horribly(AH, modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				die_horribly(AH, modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
--- 176,187 ----
  		free(values);
  
  		if (!newConn)
! 			exit_horribly(modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				exit_horribly(modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 197,206 ****
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				die_horribly(AH, modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 197,206 ----
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				exit_horribly(modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 238,250 ****
  	bool		new_pass;
  
  	if (AH->connection)
! 		die_horribly(AH, modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
--- 238,250 ----
  	bool		new_pass;
  
  	if (AH->connection)
! 		exit_horribly(modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
*************** ConnectDatabase(Archive *AHX,
*** 280,286 ****
  		free(values);
  
  		if (!AH->connection)
! 			die_horribly(AH, modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
--- 280,286 ----
  		free(values);
  
  		if (!AH->connection)
! 			exit_horribly(modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
*************** ConnectDatabase(Archive *AHX,
*** 290,296 ****
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 290,296 ----
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 299,305 ****
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		die_horribly(AH, modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
--- 299,305 ----
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		exit_horribly(modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
*************** GetConnection(Archive *AHX)
*** 325,336 ****
--- 325,352 ----
  	return AH->connection;
  }
  
+ void
+ archive_close_connection(int code, void *arg)
+ {
+ 	Archive	   *AH = (Archive *) arg;
+ 
+ 	DisconnectDatabase(AH);
+ }
+ 
  static void
  notice_processor(void *arg, const char *message)
  {
  	write_msg(NULL, "%s", message);
  }
  
+ /* Like exit_horribly(), but with a complaint about a particular query. */
+ static void
+ die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query)
+ {
+ 	write_msg(modulename, "query failed: %s",
+ 			  PQerrorMessage(AH->connection));
+ 	exit_horribly(modulename, "query was: %s\n", query);
+ }
  
  void
  ExecuteSqlStatement(Archive *AHX, const char *query)
*************** ExecuteSqlCommand(ArchiveHandle *AH, con
*** 393,400 ****
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_die_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								 desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
--- 409,416 ----
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_exit_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								  desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
*************** ExecuteSqlCommandBuf(ArchiveHandle *AH,
*** 495,502 ****
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyData: %s",
! 						 PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
--- 511,518 ----
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyData: %s",
! 						  PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
*************** EndDBCopyMode(ArchiveHandle *AH, TocEntr
*** 541,554 ****
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyEnd: %s",
! 						 PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								 te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
--- 557,570 ----
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyEnd: %s",
! 						  PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								  te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4b59516..8d43cd2 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 142,148 ****
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
--- 142,148 ----
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 160,168 ****
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename,
! 						 "could not open input file \"%s\": %s\n",
! 						 fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
--- 160,168 ----
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			exit_horribly(modulename,
! 						  "could not open input file \"%s\": %s\n",
! 						  fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 177,183 ****
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
--- 177,183 ----
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 288,295 ****
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 288,295 ----
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 346,352 ****
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
--- 346,352 ----
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 357,363 ****
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		die_horribly(AH, modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
--- 357,363 ----
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		exit_horribly(modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 397,404 ****
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					 fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
--- 397,404 ----
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					  fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 407,414 ****
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			die_horribly(AH, modulename, "invalid line in large object TOC file: %s\n",
! 						 line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
--- 407,414 ----
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			exit_horribly(modulename, "invalid line in large object TOC file: %s\n",
! 						  line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 416,427 ****
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		die_horribly(AH, modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file \"%s\": %s\n",
! 					 fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
--- 416,427 ----
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		exit_horribly(modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file \"%s\": %s\n",
! 					  fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
*************** _WriteByte(ArchiveHandle *AH, const int
*** 441,447 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	return 1;
  }
--- 441,447 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	return 1;
  }
*************** _ReadByte(ArchiveHandle *AH)
*** 460,466 ****
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  
  	return res;
  }
--- 460,466 ----
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 477,483 ****
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
--- 477,483 ----
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 524,531 ****
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
--- 524,531 ----
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
*************** _CloseArchive(ArchiveHandle *AH)
*** 538,545 ****
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 						 strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
--- 538,545 ----
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
! 						  strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 568,575 ****
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 568,575 ----
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 588,594 ****
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
--- 588,594 ----
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 611,617 ****
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		die_horribly(AH, modulename, "could not write to blobs TOC file\n");
  }
  
  /*
--- 611,617 ----
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		exit_horribly(modulename, "could not write to blobs TOC file\n");
  }
  
  /*
*************** prependDirectory(ArchiveHandle *AH, cons
*** 667,673 ****
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		die_horribly(AH, modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
--- 667,673 ----
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		exit_horribly(modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
diff --git a/src/bin/pg_dump/pg_backup_files.c b/src/bin/pg_dump/pg_backup_files.c
index a7fd91d..d765838 100644
*** a/src/bin/pg_dump/pg_backup_files.c
--- b/src/bin/pg_dump/pg_backup_files.c
*************** InitArchiveFmt_Files(ArchiveHandle *AH)
*** 169,175 ****
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
--- 169,175 ----
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 259,266 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 tctx->filename, strerror(errno));
  }
  
  static size_t
--- 259,266 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  tctx->filename, strerror(errno));
  }
  
  static size_t
*************** _EndData(ArchiveHandle *AH, TocEntry *te
*** 280,286 ****
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
--- 280,286 ----
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 304,310 ****
  #endif
  
  	if (AH->FH == NULL)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
--- 304,310 ----
  #endif
  
  	if (AH->FH == NULL)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 314,320 ****
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file after reading\n");
  }
  
  
--- 314,320 ----
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close data file after reading\n");
  }
  
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 376,382 ****
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
--- 376,382 ----
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 389,395 ****
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
--- 389,395 ----
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
*************** _WriteByte(ArchiveHandle *AH, const int
*** 401,407 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
--- 401,407 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
*************** _ReadByte(ArchiveHandle *AH)
*** 416,422 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 416,422 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 429,435 ****
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
--- 429,435 ----
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 454,460 ****
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
--- 454,460 ----
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 486,492 ****
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
--- 486,492 ----
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 507,513 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 507,513 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 526,532 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
--- 526,532 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 541,547 ****
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close large object file\n");
  }
  
  /*
--- 541,547 ----
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close large object file\n");
  }
  
  /*
*************** _EndBlobs(ArchiveHandle *AH, TocEntry *t
*** 558,562 ****
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
--- 558,562 ----
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_backup_null.c b/src/bin/pg_dump/pg_backup_null.c
index 201f0d9..ba1e461 100644
*** a/src/bin/pg_dump/pg_backup_null.c
--- b/src/bin/pg_dump/pg_backup_null.c
*************** InitArchiveFmt_Null(ArchiveHandle *AH)
*** 74,80 ****
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		die_horribly(AH, NULL, "this format cannot be read\n");
  }
  
  /*
--- 74,80 ----
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		exit_horribly(NULL, "this format cannot be read\n");
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 149,155 ****
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		die_horribly(AH, NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
--- 149,155 ----
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		exit_horribly(NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 4823ede..451c957 100644
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 355,361 ****
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				die_horribly(AH, modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
--- 355,361 ----
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				exit_horribly(modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 369,375 ****
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			die_horribly(AH, modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
--- 369,375 ----
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			exit_horribly(modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 411,417 ****
  #endif
  
  		if (tm->tmpFH == NULL)
! 			die_horribly(AH, modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
--- 411,417 ----
  #endif
  
  		if (tm->tmpFH == NULL)
! 			exit_horribly(modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 420,426 ****
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				die_horribly(AH, modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
--- 420,426 ----
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				exit_horribly(modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
*************** tarClose(ArchiveHandle *AH, TAR_MEMBER *
*** 447,453 ****
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			die_horribly(AH, modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
--- 447,453 ----
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			exit_horribly(modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
*************** _tarReadRaw(ArchiveHandle *AH, void *buf
*** 547,553 ****
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			die_horribly(AH, modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
--- 547,553 ----
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			exit_horribly(modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
*************** tarWrite(const void *buf, size_t len, TA
*** 584,591 ****
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		die_horribly(th->AH, modulename,
! 					 "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
--- 584,591 ----
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 672,679 ****
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
--- 672,679 ----
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 690,698 ****
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						 tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
--- 690,698 ----
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						  tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
*************** _ReadByte(ArchiveHandle *AH)
*** 784,790 ****
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
--- 784,790 ----
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
*************** _CloseArchive(ArchiveHandle *AH)
*** 878,884 ****
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				die_horribly(AH, modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
--- 878,884 ----
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				exit_horribly(modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 934,940 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 934,940 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1077,1083 ****
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		die_horribly(AH, modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
--- 1077,1083 ----
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		exit_horribly(modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1085,1099 ****
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			die_horribly(AH, modulename,
! 						 "could not write to output file: %s\n",
! 						 strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		die_horribly(AH, modulename, "could not close temporary file: %s\n",
! 					 strerror(errno));
  
  	if (len != th->fileLen)
  	{
--- 1085,1099 ----
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			exit_horribly(modulename,
! 						  "could not write to output file: %s\n",
! 						  strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		exit_horribly(modulename, "could not close temporary file: %s\n",
! 					  strerror(errno));
  
  	if (len != th->fileLen)
  	{
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1102,1116 ****
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		die_horribly(AH, modulename, "actual file length (%s) does not match expected (%s)\n",
! 					 buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			die_horribly(AH, modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
--- 1102,1116 ----
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		exit_horribly(modulename, "actual file length (%s) does not match expected (%s)\n",
! 					  buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			exit_horribly(modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1159,1165 ****
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
--- 1159,1165 ----
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1177,1185 ****
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			die_horribly(AH, modulename, "restoring data out of order is not supported in this archive format: "
! 						 "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						 th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
--- 1177,1185 ----
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			exit_horribly(modulename, "restoring data out of order is not supported in this archive format: "
! 						  "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						  th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1189,1195 ****
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
--- 1189,1195 ----
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1222,1228 ****
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			die_horribly(AH, modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
--- 1222,1228 ----
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			exit_horribly(modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1237,1247 ****
  			return 0;
  
  		if (len != 512)
! 			die_horribly(AH, modulename,
! 						 ngettext("incomplete tar header found (%lu byte)\n",
! 								  "incomplete tar header found (%lu bytes)\n",
! 								  len),
! 						 (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
--- 1237,1247 ----
  			return 0;
  
  		if (len != 512)
! 			exit_horribly(modulename,
! 						  ngettext("incomplete tar header found (%lu byte)\n",
! 								   "incomplete tar header found (%lu bytes)\n",
! 								   len),
! 						  (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1285,1294 ****
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		die_horribly(AH, modulename,
! 					 "corrupt tar header found in %s "
! 					 "(expected %d, computed %d) file position %s\n",
! 					 tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
--- 1285,1294 ----
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		exit_horribly(modulename,
! 					  "corrupt tar header found in %s "
! 					  "(expected %d, computed %d) file position %s\n",
! 					  tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
*************** _tarWriteHeader(TAR_MEMBER *th)
*** 1379,1383 ****
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		die_horribly(th->AH, modulename, "could not write to output file: %s\n", strerror(errno));
  }
--- 1379,1383 ----
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b0a5ff..de1955f 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	serializable_deferrable = 0;
*** 144,150 ****
  
  
  static void help(const char *progname);
- static void pgdump_cleanup_at_exit(int code, void *arg);
  static void setup_connection(Archive *AH, const char *dumpencoding,
  				 char *use_role);
  static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
--- 144,149 ----
*************** main(int argc, char **argv)
*** 575,581 ****
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 	on_exit_nicely(pgdump_cleanup_at_exit, fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
--- 574,580 ----
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 	on_exit_nicely(archive_close_connection, fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
*************** help(const char *progname)
*** 837,850 ****
  }
  
  static void
- pgdump_cleanup_at_exit(int code, void *arg)
- {
- 	Archive	   *AH = (Archive *) arg;
- 
- 	DisconnectDatabase(AH);
- }
- 
- static void
  setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
  {
  	PGconn	   *conn = GetConnection(AH);
--- 836,841 ----
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index b5f4c62..370409b 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 384,389 ****
--- 384,396 ----
  
  	AH = OpenArchive(inputFileSpec, opts->format);
  
+ 	/*
+ 	 * We don't have a connection yet but that doesn't matter. The connection
+ 	 * is initialized to NULL and if we terminate through exit_nicely() while
+ 	 * it's still NULL, the cleanup function will just be a no-op.
+ 	 */
+ 	on_exit_nicely(archive_close_connection, AH);
+ 
  	/* Let the archiver know how noisy to be */
  	AH->verbose = opts->verbose;
  
#44 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joachim Wieland (#43)
Re: patch for parallel pg_dump

Excerpts from Joachim Wieland's message of Mon Mar 19 00:31:47 -0300 2012:

> On Wed, Mar 14, 2012 at 2:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>>> I think we should somehow unify both functions; the code is not very
>>> consistent in this respect, and it also calls exit_horribly() when it has
>>> AH available. See for example pg_backup_tar.c.
>>
>> I think we should get rid of die_horribly(), and instead arrange
>> to always clean up AH via an on_exit_nicely hook.
>
> Attached is a patch that gets rid of die_horribly().
>
> For the parallel case it maintains an array with as many elements as
> we have worker processes. When the workers start, they enter their Pid
> (or ThreadId) and their ArchiveHandle (AH). The exit handler function
> in a process can then find its own ArchiveHandle by comparing its own
> Pid with all the elements in the array.

Sounds good to me in general ... my only gripe is this: I wonder if it
would be better to have a central routine that knows about both
archive_close_connection and archive_close_connection_parallel -- and
have the argument to the callback be a struct that contains a pointer to
the struct with the connection to be closed, a ParallelState (either of
which can be NULL), and a flag stating which of the two is in use. That
way, you avoid having to reset the callbacks when you switch from AH to
parallel; instead you just clear out the AH connection, set the
ParallelState, and flip the switch. The general
archive_close_connection checks the flag to know which routine to call.

I mean, what you have probably works fine now, but it doesn't seem very
extensible.
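
A minimal C sketch of the proposed shape (the type names echo the thread, but the struct members and cleanup bodies are stand-ins invented for illustration, not pg_dump's actual cleanup code):

```c
#include <stddef.h>

/* Stand-ins for the real pg_dump types; illustration only. */
typedef struct ArchiveHandle { int connection_open; } ArchiveHandle;
typedef struct ParallelState { int workers_running; } ParallelState;

/*
 * One struct passed to a single on_exit_nicely callback.  Either member
 * may be NULL; the flag is the "switch" that says which cleanup path
 * is currently in use.
 */
typedef struct ShutdownInformation
{
	ArchiveHandle *AH;
	ParallelState *pstate;
	int			parallel_in_use;
} ShutdownInformation;

/*
 * Central cleanup routine: registered once, it dispatches on the flag,
 * so the caller never re-registers a different callback when the dump
 * moves between its serial and parallel phases.
 */
void
archive_close_connection(int code, void *arg)
{
	ShutdownInformation *si = (ShutdownInformation *) arg;

	(void) code;
	if (si->parallel_in_use)
	{
		if (si->pstate)
			si->pstate->workers_running = 0;	/* shut down workers */
	}
	else if (si->AH)
		si->AH->connection_open = 0;	/* close the master's connection */
}
```

Switching from serial to parallel would then amount to clearing the AH member, setting pstate, and flipping parallel_in_use, while the registered callback stays untouched.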

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#45 Joachim Wieland
joe@mcknight.de
In reply to: Alvaro Herrera (#44)
1 attachment(s)
Re: patch for parallel pg_dump

On Mon, Mar 19, 2012 at 9:14 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

> Sounds good to me in general ... my only gripe is this: I wonder if it
> would be better to have a central routine that knows about both
> archive_close_connection and archive_close_connection_parallel -- and
> the argument to the callback is a struct that contains both a pointer to
> the struct with the connection to be closed [...]

I had a similar idea before, but concluded that it would require making
this struct globally available, so that everybody (pg_dump.c /
pg_restore.c / pg_backup_archiver.c) can access it to set the
appropriate state.

I gave it a second thought and have now defined a function that these
consumers can call; that way, the variable can at least stay at file
scope.

Also, we don't need the switch: we can set the ParallelState in the
struct before any child forks off and reset it to NULL after the last
child has terminated.
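
In sketch form, that approach looks like the following (simplified stand-in types; register_shutdown_state is a name made up for this sketch, and the real callback calls DisconnectDatabase() and terminates workers rather than poking int fields):

```c
#include <stddef.h>

/* Stand-ins for the real types; illustration only. */
typedef struct Archive { int connected; } Archive;
typedef struct ParallelState { int numWorkers; } ParallelState;

/*
 * File-scope state, as in the patch: callers in pg_dump.c / pg_restore.c
 * never touch the variable directly, they go through the setter below.
 */
typedef struct ShutdownInformation
{
	ParallelState *pstate;
	Archive	   *AHX;
} ShutdownInformation;

static ShutdownInformation shutdown_info;

/* The function the consumers call instead of accessing a global struct. */
void
register_shutdown_state(Archive *AHX, ParallelState *pstate)
{
	shutdown_info.AHX = AHX;
	shutdown_info.pstate = pstate;
}

/*
 * No flag needed: pstate is non-NULL exactly while children are running,
 * because it is set before the first fork and reset to NULL after the
 * last child terminates.
 */
void
archive_close_connection(int code, void *arg)
{
	(void) code;
	(void) arg;

	if (shutdown_info.pstate != NULL)
		shutdown_info.pstate->numWorkers = 0;	/* parallel cleanup path */
	else if (shutdown_info.AHX != NULL)
		shutdown_info.AHX->connected = 0;	/* plain disconnect path */
}
```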

New patch attached, thanks for your comments.

Attachments:

pg_dump_die_horribly.2.diff (text/x-patch)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c30b8f9..ff8e714 100644
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
*************** EndCompressorZlib(ArchiveHandle *AH, Com
*** 256,263 ****
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
--- 256,263 ----
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 274,281 ****
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			die_horribly(AH, modulename,
! 						 "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
--- 274,281 ----
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			exit_horribly(modulename,
! 						  "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 295,303 ****
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					die_horribly(AH, modulename,
! 								 "could not write to output file: %s\n",
! 								 strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
--- 295,303 ----
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					exit_horribly(modulename,
! 								  "could not write to output file: %s\n",
! 								  strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
*************** WriteDataToArchiveZlib(ArchiveHandle *AH
*** 318,324 ****
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * die_horribly()
  	 */
  	return dLen;
  }
--- 318,324 ----
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * exit_horribly()
  	 */
  	return dLen;
  }
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 361,368 ****
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				die_horribly(AH, modulename,
! 							 "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
--- 361,368 ----
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				exit_horribly(modulename,
! 							  "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 377,392 ****
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			die_horribly(AH, modulename,
! 						 "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
--- 377,392 ----
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			exit_horribly(modulename,
! 						  "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
*************** WriteDataToArchiveNone(ArchiveHandle *AH
*** 426,434 ****
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		die_horribly(AH, modulename,
! 					 "could not write to output file: %s\n",
! 					 strerror(errno));
  	return dLen;
  }
  
--- 426,434 ----
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n",
! 					  strerror(errno));
  	return dLen;
  }
  
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 0b24220..d9681f6 100644
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
*************** static void AddAcl(PQExpBuffer aclbuf, c
*** 49,54 ****
--- 49,55 ----
  #ifdef WIN32
  static bool parallel_init_done = false;
  static DWORD tls_index;
+ static DWORD mainThreadId;
  #endif
  
  void
*************** init_parallel_dump_utils(void)
*** 59,64 ****
--- 60,66 ----
  	{
  		tls_index = TlsAlloc();
  		parallel_init_done = true;
+ 		mainThreadId = GetCurrentThreadId();
  	}
  #endif
  }
*************** on_exit_nicely(on_exit_nicely_callback f
*** 1313,1318 ****
--- 1315,1327 ----
  	on_exit_nicely_index++;
  }
  
+ /* Delete any previously set callback functions */
+ void
+ on_exit_nicely_reset(void)
+ {
+ 	on_exit_nicely_index = 0;
+ }
+ 
  /* Run accumulated on_exit_nicely callbacks and then exit quietly. */
  void
  exit_nicely(int code)
*************** exit_nicely(int code)
*** 1320,1324 ****
--- 1329,1337 ----
  	while (--on_exit_nicely_index >= 0)
  		(*on_exit_nicely_list[on_exit_nicely_index].function)(code,
  			on_exit_nicely_list[on_exit_nicely_index].arg);
+ #ifdef WIN32
+ 	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
+ 		ExitThread(code);
+ #endif
  	exit(code);
  }
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
index 82cf940..2865c0f 100644
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*************** extern void set_section (const char *arg
*** 62,67 ****
--- 62,68 ----
  
  typedef void (*on_exit_nicely_callback) (int code, void *arg);
  extern void on_exit_nicely(on_exit_nicely_callback function, void *arg);
+ extern void on_exit_nicely_reset(void);
  extern void exit_nicely(int code) __attribute__((noreturn));
  
  #endif   /* DUMPUTILS_H */
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index ff0db46..ba553d3 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** extern void ConnectDatabase(Archive *AH,
*** 167,172 ****
--- 167,173 ----
  				enum trivalue prompt_password);
  extern void DisconnectDatabase(Archive *AHX);
  extern PGconn *GetConnection(Archive *AHX);
+ extern void archive_close_connection(int code, void *arg);
  
  /* Called to add a TOC entry */
  extern void ArchiveEntry(Archive *AHX,
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 79f7dda..5a66343 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 61,71 ****
--- 61,88 ----
  #define thandle HANDLE
  #endif
  
+ typedef struct _parallel_state_entry
+ {
+ #ifdef WIN32
+ 	unsigned int threadId;
+ #else
+ 	pid_t		pid;
+ #endif
+ 	ArchiveHandle *AH;
+ } ParallelStateEntry;
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelStateEntry *pse;
+ } ParallelState;
+ 
  /* Arguments needed for a worker child */
  typedef struct _restore_args
  {
  	ArchiveHandle *AH;
  	TocEntry   *te;
+ 	ParallelStateEntry *pse;
  } RestoreArgs;
  
  /* State for each parallel activity slot */
*************** typedef struct _parallel_slot
*** 75,80 ****
--- 92,105 ----
  	RestoreArgs *args;
  } ParallelSlot;
  
+ typedef struct _shutdown_information
+ {
+ 	ParallelState *pstate;
+ 	Archive       *AHX;
+ } ShutdownInformation;
+ 
+ static ShutdownInformation shutdown_info;
+ 
  #define NO_SLOT (-1)
  
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
*************** static int	_discoverArchiveFormat(Archiv
*** 122,131 ****
  
  static int	RestoringToDB(ArchiveHandle *AH);
  static void dump_lo_buf(ArchiveHandle *AH);
- static void vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 						  const char *fmt, va_list ap)
- 	__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0), noreturn));
- 
  static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
  static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
  static OutputContext SaveOutput(ArchiveHandle *AH);
--- 147,152 ----
*************** CloseArchive(Archive *AHX)
*** 208,215 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
! 					 strerror(errno));
  }
  
  /* Public */
--- 229,236 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
! 					  strerror(errno));
  }
  
  /* Public */
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 234,247 ****
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		die_horribly(AH, modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		die_horribly(AH, modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
--- 255,268 ----
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		exit_horribly(modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		exit_horribly(modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 251,261 ****
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			die_horribly(AH, modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			die_horribly(AH, modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
--- 272,282 ----
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			exit_horribly(modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			exit_horribly(modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 274,280 ****
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				die_horribly(AH, modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
--- 295,301 ----
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				exit_horribly(modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 286,292 ****
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			die_horribly(AH, modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
--- 307,313 ----
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			exit_horribly(modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
*************** WriteData(Archive *AHX, const void *data
*** 734,740 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		die_horribly(AH, modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
--- 755,761 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		exit_horribly(modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
*************** StartBlob(Archive *AHX, Oid oid)
*** 889,895 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		die_horribly(AH, modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
--- 910,916 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		exit_horribly(modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
*************** StartRestoreBlob(ArchiveHandle *AH, Oid
*** 976,988 ****
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				die_horribly(AH, modulename, "could not create large object %u: %s",
! 							 oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			die_horribly(AH, modulename, "could not open large object %u: %s",
! 						 oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
--- 997,1009 ----
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				exit_horribly(modulename, "could not create large object %u: %s",
! 							  oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			exit_horribly(modulename, "could not open large object %u: %s",
! 						  oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1038,1045 ****
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		die_horribly(AH, modulename, "could not open TOC file \"%s\": %s\n",
! 					 ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
--- 1059,1066 ----
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		exit_horribly(modulename, "could not open TOC file \"%s\": %s\n",
! 					  ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1086,1093 ****
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			die_horribly(AH, modulename, "could not find entry for ID %d\n",
! 						 id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
--- 1107,1114 ----
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			exit_horribly(modulename, "could not find entry for ID %d\n",
! 						  id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1107,1114 ****
  	}
  
  	if (fclose(fh) != 0)
! 		die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 1128,1135 ----
  	}
  
  	if (fclose(fh) != 0)
! 		exit_horribly(modulename, "could not close TOC file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** SetOutput(ArchiveHandle *AH, const char
*** 1224,1234 ****
  	if (!AH->OF)
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 filename, strerror(errno));
  		else
! 			die_horribly(AH, modulename, "could not open output file: %s\n",
! 						 strerror(errno));
  	}
  }
  
--- 1245,1255 ----
  	if (!AH->OF)
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  filename, strerror(errno));
  		else
! 			exit_horribly(modulename, "could not open output file: %s\n",
! 						  strerror(errno));
  	}
  }
  
*************** RestoreOutput(ArchiveHandle *AH, OutputC
*** 1254,1260 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
--- 1275,1281 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
*************** dump_lo_buf(ArchiveHandle *AH)
*** 1332,1338 ****
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			die_horribly(AH, modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
--- 1353,1359 ----
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			exit_horribly(modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
*************** ahwrite(const void *ptr, size_t size, si
*** 1391,1397 ****
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
--- 1412,1418 ----
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
*************** ahwrite(const void *ptr, size_t size, si
*** 1399,1405 ****
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
--- 1420,1426 ----
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
*************** ahwrite(const void *ptr, size_t size, si
*** 1414,1468 ****
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				die_horribly(AH, modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
- 
- /* Report a fatal error and exit(1) */
- static void
- vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 			  const char *fmt, va_list ap)
- {
- 	vwrite_msg(modulename, fmt, ap);
- 
- 	if (AH)
- 	{
- 		if (AH->public.verbose)
- 			write_msg(NULL, "*** aborted because of error\n");
- 		DisconnectDatabase(&AH->public);
- 	}
- 
- 	exit_nicely(1);
- }
- 
- /* As above, but with variable arg list */
- void
- die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...)
- {
- 	va_list		ap;
- 
- 	va_start(ap, fmt);
- 	vdie_horribly(AH, modulename, fmt, ap);
- 	va_end(ap);
- }
- 
- /* As above, but with a complaint about a particular query. */
- void
- die_on_query_failure(ArchiveHandle *AH, const char *modulename,
- 					 const char *query)
- {
- 	write_msg(modulename, "query failed: %s",
- 			  PQerrorMessage(AH->connection));
- 	die_horribly(AH, modulename, "query was: %s\n", query);
- }
- 
  /* on some error, we may decide to go on... */
  void
! warn_or_die_horribly(ArchiveHandle *AH,
  					 const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
--- 1435,1450 ----
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				exit_horribly(modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
  /* on some error, we may decide to go on... */
  void
! warn_or_exit_horribly(ArchiveHandle *AH,
! 					  const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
*************** warn_or_die_horribly(ArchiveHandle *AH,
*** 1500,1513 ****
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
  	if (AH->public.exit_on_error)
! 		vdie_horribly(AH, modulename, fmt, ap);
  	else
- 	{
- 		vwrite_msg(modulename, fmt, ap);
  		AH->public.n_errors++;
- 	}
- 	va_end(ap);
  }
  
  #ifdef NOT_USED
--- 1482,1494 ----
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
+ 	vwrite_msg(modulename, fmt, ap);
+ 	va_end(ap);
+ 
  	if (AH->public.exit_on_error)
! 		exit_nicely(1);
  	else
  		AH->public.n_errors++;
  }
  
  #ifdef NOT_USED
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1626,1632 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
--- 1607,1613 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1639,1645 ****
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				die_horribly(AH, modulename, "file offset in dump file is too large\n");
  		}
  	}
  
--- 1620,1626 ----
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				exit_horribly(modulename, "file offset in dump file is too large\n");
  		}
  	}
  
*************** ReadStr(ArchiveHandle *AH)
*** 1733,1739 ****
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
--- 1714,1720 ----
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1776,1783 ****
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
--- 1757,1764 ----
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1786,1817 ****
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			die_horribly(AH, modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						 AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			die_horribly(AH, modulename, "could not open input file: %s\n",
! 						 strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
--- 1767,1798 ----
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			exit_horribly(modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						  AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			exit_horribly(modulename, "could not open input file: %s\n",
! 						  strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1819,1828 ****
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			die_horribly(AH, modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			die_horribly(AH, modulename, "input file is too short (read %lu, expected 5)\n",
! 						 (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
--- 1800,1809 ----
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			exit_horribly(modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			exit_horribly(modulename, "input file is too short (read %lu, expected 5)\n",
! 						  (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1883,1896 ****
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			die_horribly(AH, modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
--- 1864,1877 ----
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			exit_horribly(modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			exit_horribly(modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			exit_horribly(modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1910,1917 ****
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			die_horribly(AH, modulename, "could not close input file: %s\n",
! 						 strerror(errno));
  
  	return AH->format;
  }
--- 1891,1898 ----
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			exit_horribly(modulename, "could not close input file: %s\n",
! 						  strerror(errno));
  
  	return AH->format;
  }
*************** _allocAH(const char *FileSpec, const Arc
*** 2034,2040 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
--- 2015,2021 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
*************** ReadToc(ArchiveHandle *AH)
*** 2156,2164 ****
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			die_horribly(AH, modulename,
! 					   "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						 te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
--- 2137,2145 ----
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			exit_horribly(modulename,
! 						  "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						  te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
*************** processEncodingEntry(ArchiveHandle *AH,
*** 2313,2325 ****
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			die_horribly(AH, modulename, "unrecognized encoding \"%s\"\n",
! 						 ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		die_horribly(AH, modulename, "invalid ENCODING item: %s\n",
! 					 te->defn);
  
  	free(defn);
  }
--- 2294,2306 ----
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			exit_horribly(modulename, "unrecognized encoding \"%s\"\n",
! 						  ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		exit_horribly(modulename, "invalid ENCODING item: %s\n",
! 					  te->defn);
  
  	free(defn);
  }
*************** processStdStringsEntry(ArchiveHandle *AH
*** 2336,2343 ****
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		die_horribly(AH, modulename, "invalid STDSTRINGS item: %s\n",
! 					 te->defn);
  }
  
  static teReqs
--- 2317,2324 ----
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		exit_horribly(modulename, "invalid STDSTRINGS item: %s\n",
! 					  te->defn);
  }
  
  static teReqs
*************** _doSetSessionAuth(ArchiveHandle *AH, con
*** 2544,2552 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_die_horribly... use -O instead to skip this. */
! 			die_horribly(AH, modulename, "could not set session user to \"%s\": %s",
! 						 user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2525,2533 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_exit_horribly... use -O instead to skip this. */
! 			exit_horribly(modulename, "could not set session user to \"%s\": %s",
! 						  user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _doSetWithOids(ArchiveHandle *AH, const
*** 2576,2584 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_with_oids: %s",
! 								 PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2557,2565 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_with_oids: %s",
! 								  PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectOutputSchema(ArchiveHandle *AH, c
*** 2714,2722 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set search_path to \"%s\": %s",
! 								 schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2695,2703 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set search_path to \"%s\": %s",
! 								  schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectTablespace(ArchiveHandle *AH, con
*** 2775,2783 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_tablespace to %s: %s",
! 								 fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2756,2764 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_tablespace to %s: %s",
! 								  fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** ReadHead(ArchiveHandle *AH)
*** 3157,3166 ****
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			die_horribly(AH, modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
--- 3138,3147 ----
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			exit_horribly(modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
*************** ReadHead(ArchiveHandle *AH)
*** 3173,3185 ****
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			die_horribly(AH, modulename, "unsupported version (%d.%d) in file header\n",
! 						 AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			die_horribly(AH, modulename, "sanity check on integer size (%lu) failed\n",
! 						 (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
--- 3154,3166 ----
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			exit_horribly(modulename, "unsupported version (%d.%d) in file header\n",
! 						  AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			exit_horribly(modulename, "sanity check on integer size (%lu) failed\n",
! 						  (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
*************** ReadHead(ArchiveHandle *AH)
*** 3192,3199 ****
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			die_horribly(AH, modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						 AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
--- 3173,3180 ----
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			exit_horribly(modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						  AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
*************** dumpTimestamp(ArchiveHandle *AH, const c
*** 3297,3302 ****
--- 3278,3341 ----
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
+ static void
+ setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	pse->threadId = GetCurrentThreadId();
+ #else
+ 	pse->pid = getpid();
+ #endif
+ 	pse->AH = AH;
+ }
+ 
+ static void
+ unsetProcessIdentifier(ParallelStateEntry *pse)
+ {
+ #ifdef WIN32
+ 	pse->threadId = 0;
+ #else
+ 	pse->pid = 0;
+ #endif
+ 	pse->AH = NULL;
+ }
+ 
+ static int
+ GetMySlot(ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->pse[i].threadId == GetCurrentThreadId())
+ #else
+ 		if (pstate->pse[i].pid == getpid())
+ #endif
+ 			return i;
+ 
+ 	return NO_SLOT;
+ }
+ 
+ void
+ archive_close_connection(int code, void *arg)
+ {
+ 	ShutdownInformation *si = (ShutdownInformation *) arg;
+ 	if (si->pstate)
+ 	{
+ 		int slotno = GetMySlot(si->pstate);
+ 		if (slotno != NO_SLOT && si->pstate->pse[slotno].AH)
+ 			DisconnectDatabase(&si->pstate->pse[slotno].AH->public);
+ 	}
+ 	else if (si->AHX)
+ 		DisconnectDatabase(si->AHX);
+ }
+ 
+ void
+ on_exit_close_archive(Archive *AHX)
+ {
+ 	shutdown_info.AHX = AHX;
+ 	on_exit_nicely(archive_close_connection, &shutdown_info);
+ }
  
  /*
   * Main engine for parallel restore.
*************** restore_toc_entries_parallel(ArchiveHand
*** 3323,3332 ****
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(sizeof(ParallelSlot), n_slots);
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3362,3378 ----
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
+ 	ParallelState *pstate;
+ 	int			i;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
! 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
! 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
! 	pstate->numWorkers = ropt->number_of_jobs;
! 	for (i = 0; i < pstate->numWorkers; i++)
! 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3382,3387 ****
--- 3428,3439 ----
  	 */
  	DisconnectDatabase(&AH->public);
  
+ 	/*
+ 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
+ 	 * and falls back to AHX otherwise.
+ 	 */
+ 	shutdown_info.pstate = pstate;
+ 
  	/* blow away any transient state from the old connection */
  	if (AH->currUser)
  		free(AH->currUser);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3480,3485 ****
--- 3532,3538 ----
  				args = pg_malloc(sizeof(RestoreArgs));
  				args->AH = CloneArchive(AH);
  				args->te = next_work_item;
+ 				args->pse = &pstate->pse[next_slot];
  
  				/* run the step in a worker child */
  				child = spawn_restore(args);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3507,3520 ****
  		}
  		else
  		{
! 			die_horribly(AH, modulename, "worker process crashed: status %d\n",
! 						 work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
  	 * Now reconnect the single parent connection.
  	 */
  	ConnectDatabase((Archive *) AH, ropt->dbname,
--- 3560,3579 ----
  		}
  		else
  		{
! 			exit_horribly(modulename, "worker process crashed: status %d\n",
! 						  work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
+ 	 * Clear the pstate so that the exit handler falls back to closing
+ 	 * AH->connection directly.
+ 	 */
+ 	shutdown_info.pstate = NULL;
+ 
+ 	/*
  	 * Now reconnect the single parent connection.
  	 */
  	ConnectDatabase((Archive *) AH, ropt->dbname,
*************** spawn_restore(RestoreArgs *args)
*** 3555,3577 ****
  	{
  		/* in child process */
  		parallel_restore(args);
! 		die_horribly(args->AH, modulename,
! 					 "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker process: %s\n",
! 					 strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker thread: %s\n",
! 					 strerror(errno));
  #endif
  
  	return child;
--- 3614,3636 ----
  	{
  		/* in child process */
  		parallel_restore(args);
! 		exit_horribly(modulename,
! 					  "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		exit_horribly(modulename,
! 					  "could not create worker process: %s\n",
! 					  strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		exit_horribly(modulename,
! 					  "could not create worker thread: %s\n",
! 					  strerror(errno));
  #endif
  
  	return child;
*************** parallel_restore(RestoreArgs *args)
*** 3813,3818 ****
--- 3872,3879 ----
  	RestoreOptions *ropt = AH->ropt;
  	int			retval;
  
+ 	setProcessIdentifier(args->pse, AH);
+ 
  	/*
  	 * Close and reopen the input file so we have a private file pointer that
  	 * doesn't stomp on anyone else's file pointer, if we're actually going to
*************** parallel_restore(RestoreArgs *args)
*** 3843,3848 ****
--- 3904,3910 ----
  
  	/* And clean up */
  	DisconnectDatabase((Archive *) AH);
+ 	unsetProcessIdentifier(args->pse);
  
  	/* If we reopened the file, we are done with it, so close it now */
  	if (te->section == SECTION_DATA)
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3888,3894 ****
  	}
  
  	if (te == NULL)
! 		die_horribly(AH, modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
--- 3950,3956 ----
  	}
  
  	if (te == NULL)
! 		exit_horribly(modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3903,3910 ****
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		die_horribly(AH, modulename, "worker process failed: exit code %d\n",
! 					 status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
--- 3965,3972 ----
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		exit_horribly(modulename, "worker process failed: exit code %d\n",
! 					  status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index fa8c58c..5d3af9c 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** typedef struct _tocEntry
*** 323,332 ****
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
  
! extern void die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4), noreturn));
! extern void die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query) __attribute__((noreturn));
! extern void warn_or_die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
--- 323,331 ----
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
+ extern void on_exit_close_archive(Archive *AHX);
  
! extern void warn_or_exit_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 31fa373..87242c5 100644
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 146,160 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 146,160 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 165,179 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 165,179 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 367,373 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
--- 367,373 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 437,445 ****
  					break;
  
  				default:		/* Always have a default */
! 					die_horribly(AH, modulename,
! 								 "unrecognized data block type (%d) while searching archive\n",
! 								 blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
--- 437,445 ----
  					break;
  
  				default:		/* Always have a default */
! 					exit_horribly(modulename,
! 								  "unrecognized data block type (%d) while searching archive\n",
! 								  blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 449,456 ****
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			die_horribly(AH, modulename, "error during file seek: %s\n",
! 						 strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
--- 449,456 ----
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			exit_horribly(modulename, "error during file seek: %s\n",
! 						  strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 459,483 ****
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 						 "which cannot be handled due to lack of data offsets in archive\n",
! 						 te->dumpId);
  		else if (!ctx->hasSeek)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 				  "which cannot be handled due to non-seekable input file\n",
! 						 te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly corrupt archive\n",
! 						 te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		die_horribly(AH, modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					 id, te->dumpId);
  
  	switch (blkType)
  	{
--- 459,483 ----
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to lack of data offsets in archive\n",
! 						  te->dumpId);
  		else if (!ctx->hasSeek)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to non-seekable input file\n",
! 						  te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly corrupt archive\n",
! 						  te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		exit_horribly(modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					  id, te->dumpId);
  
  	switch (blkType)
  	{
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 490,497 ****
  			break;
  
  		default:				/* Always have a default */
! 			die_horribly(AH, modulename, "unrecognized data block type %d while restoring archive\n",
! 						 blkType);
  			break;
  	}
  }
--- 490,497 ----
  			break;
  
  		default:				/* Always have a default */
! 			exit_horribly(modulename, "unrecognized data block type %d while restoring archive\n",
! 						  blkType);
  			break;
  	}
  }
*************** _skipData(ArchiveHandle *AH)
*** 571,581 ****
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				die_horribly(AH, modulename,
! 							 "could not read from input file: end of file\n");
  			else
! 				die_horribly(AH, modulename,
! 					"could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
--- 571,581 ----
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				exit_horribly(modulename,
! 							  "could not read from input file: end of file\n");
  			else
! 				exit_horribly(modulename,
! 							  "could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
*************** _WriteByte(ArchiveHandle *AH, const int
*** 604,610 ****
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		die_horribly(AH, modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
--- 604,610 ----
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		exit_horribly(modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
*************** _ReadByte(ArchiveHandle *AH)
*** 624,630 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 624,630 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 645,651 ****
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		die_horribly(AH, modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
--- 645,651 ----
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		exit_horribly(modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 712,718 ****
  	}
  
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
--- 712,718 ----
  	}
  
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
*************** _ReopenArchive(ArchiveHandle *AH)
*** 731,767 ****
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		die_horribly(AH, modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		die_horribly(AH, modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		die_horribly(AH, modulename, "could not determine seek position in archive file: %s\n",
! 					 strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n",
! 					 strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 					 AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		die_horribly(AH, modulename, "could not set seek position in archive file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 731,767 ----
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		exit_horribly(modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		exit_horribly(modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		exit_horribly(modulename, "could not determine seek position in archive file: %s\n",
! 					  strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n",
! 					  strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 					  AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		exit_horribly(modulename, "could not set seek position in archive file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** _Clone(ArchiveHandle *AH)
*** 778,784 ****
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		die_horribly(AH, modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
--- 778,784 ----
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		exit_horribly(modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
*************** _readBlockHeader(ArchiveHandle *AH, int
*** 840,846 ****
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll die_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
--- 840,846 ----
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll exit_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
*************** _CustomReadFunc(ArchiveHandle *AH, char
*** 905,914 ****
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			die_horribly(AH, modulename,
! 						 "could not read from input file: end of file\n");
  		else
! 			die_horribly(AH, modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
--- 905,914 ----
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			exit_horribly(modulename,
! 						  "could not read from input file: end of file\n");
  		else
! 			exit_horribly(modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index a843eac..4a8283a 100644
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
*************** static PGconn *_connectDB(ArchiveHandle
*** 30,42 ****
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(ArchiveHandle *AH, const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		die_horribly(AH, modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
--- 30,42 ----
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		exit_horribly(modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
*************** _check_database_version(ArchiveHandle *A
*** 48,60 ****
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(AH, PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		die_horribly(AH, modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(AH, remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
--- 48,60 ----
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		exit_horribly(modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
*************** _check_database_version(ArchiveHandle *A
*** 67,73 ****
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		die_horribly(AH, NULL, "aborting because of server version mismatch\n");
  	}
  }
  
--- 67,73 ----
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		exit_horribly(NULL, "aborting because of server version mismatch\n");
  	}
  }
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 145,151 ****
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  
  	do
--- 145,151 ----
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  
  	do
*************** _connectDB(ArchiveHandle *AH, const char
*** 176,187 ****
  		free(values);
  
  		if (!newConn)
! 			die_horribly(AH, modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				die_horribly(AH, modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
--- 176,187 ----
  		free(values);
  
  		if (!newConn)
! 			exit_horribly(modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				exit_horribly(modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 197,206 ****
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				die_horribly(AH, modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 197,206 ----
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				exit_horribly(modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 238,250 ****
  	bool		new_pass;
  
  	if (AH->connection)
! 		die_horribly(AH, modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
--- 238,250 ----
  	bool		new_pass;
  
  	if (AH->connection)
! 		exit_horribly(modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
*************** ConnectDatabase(Archive *AHX,
*** 280,286 ****
  		free(values);
  
  		if (!AH->connection)
! 			die_horribly(AH, modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
--- 280,286 ----
  		free(values);
  
  		if (!AH->connection)
! 			exit_horribly(modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
*************** ConnectDatabase(Archive *AHX,
*** 290,296 ****
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 290,296 ----
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 299,305 ****
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		die_horribly(AH, modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
--- 299,305 ----
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		exit_horribly(modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
*************** GetConnection(Archive *AHX)
*** 325,336 ****
--- 325,352 ----
  	return AH->connection;
  }
  
+ void
+ archive_close_connection(int code, void *arg)
+ {
+ 	Archive	   *AH = (Archive *) arg;
+ 
+ 	DisconnectDatabase(AH);
+ }
+ 
  static void
  notice_processor(void *arg, const char *message)
  {
  	write_msg(NULL, "%s", message);
  }
  
+ /* Like exit_horribly(), but with a complaint about a particular query. */
+ static void
+ die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query)
+ {
+ 	write_msg(modulename, "query failed: %s",
+ 			  PQerrorMessage(AH->connection));
+ 	exit_horribly(modulename, "query was: %s\n", query);
+ }
  
  void
  ExecuteSqlStatement(Archive *AHX, const char *query)
*************** ExecuteSqlCommand(ArchiveHandle *AH, con
*** 393,400 ****
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_die_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								 desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
--- 409,416 ----
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_exit_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								  desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
*************** ExecuteSqlCommandBuf(ArchiveHandle *AH,
*** 495,502 ****
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyData: %s",
! 						 PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
--- 511,518 ----
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyData: %s",
! 						  PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
*************** EndDBCopyMode(ArchiveHandle *AH, TocEntr
*** 541,554 ****
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyEnd: %s",
! 						 PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								 te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
--- 557,570 ----
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyEnd: %s",
! 						  PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								  te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4b59516..8d43cd2 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 142,148 ****
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
--- 142,148 ----
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 160,168 ****
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename,
! 						 "could not open input file \"%s\": %s\n",
! 						 fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
--- 160,168 ----
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			exit_horribly(modulename,
! 						  "could not open input file \"%s\": %s\n",
! 						  fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 177,183 ****
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
--- 177,183 ----
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 288,295 ****
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 288,295 ----
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 346,352 ****
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
--- 346,352 ----
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 357,363 ****
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		die_horribly(AH, modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
--- 357,363 ----
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		exit_horribly(modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 397,404 ****
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					 fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
--- 397,404 ----
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					  fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 407,414 ****
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			die_horribly(AH, modulename, "invalid line in large object TOC file: %s\n",
! 						 line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
--- 407,414 ----
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			exit_horribly(modulename, "invalid line in large object TOC file: %s\n",
! 						  line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 416,427 ****
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		die_horribly(AH, modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file \"%s\": %s\n",
! 					 fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
--- 416,427 ----
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		exit_horribly(modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file \"%s\": %s\n",
! 					  fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
*************** _WriteByte(ArchiveHandle *AH, const int
*** 441,447 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	return 1;
  }
--- 441,447 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	return 1;
  }
*************** _ReadByte(ArchiveHandle *AH)
*** 460,466 ****
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  
  	return res;
  }
--- 460,466 ----
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 477,483 ****
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
--- 477,483 ----
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 524,531 ****
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
--- 524,531 ----
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
*************** _CloseArchive(ArchiveHandle *AH)
*** 538,545 ****
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 						 strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
--- 538,545 ----
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
! 						  strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 568,575 ****
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 568,575 ----
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 588,594 ****
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
--- 588,594 ----
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 611,617 ****
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		die_horribly(AH, modulename, "could not write to blobs TOC file\n");
  }
  
  /*
--- 611,617 ----
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		exit_horribly(modulename, "could not write to blobs TOC file\n");
  }
  
  /*
*************** prependDirectory(ArchiveHandle *AH, cons
*** 667,673 ****
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		die_horribly(AH, modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
--- 667,673 ----
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		exit_horribly(modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
diff --git a/src/bin/pg_dump/pg_backup_files.c b/src/bin/pg_dump/pg_backup_files.c
index a7fd91d..d765838 100644
*** a/src/bin/pg_dump/pg_backup_files.c
--- b/src/bin/pg_dump/pg_backup_files.c
*************** InitArchiveFmt_Files(ArchiveHandle *AH)
*** 169,175 ****
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
--- 169,175 ----
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 259,266 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 tctx->filename, strerror(errno));
  }
  
  static size_t
--- 259,266 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  tctx->filename, strerror(errno));
  }
  
  static size_t
*************** _EndData(ArchiveHandle *AH, TocEntry *te
*** 280,286 ****
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
--- 280,286 ----
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 304,310 ****
  #endif
  
  	if (AH->FH == NULL)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
--- 304,310 ----
  #endif
  
  	if (AH->FH == NULL)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 314,320 ****
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file after reading\n");
  }
  
  
--- 314,320 ----
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close data file after reading\n");
  }
  
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 376,382 ****
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
--- 376,382 ----
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 389,395 ****
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
--- 389,395 ----
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
*************** _WriteByte(ArchiveHandle *AH, const int
*** 401,407 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
--- 401,407 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
*************** _ReadByte(ArchiveHandle *AH)
*** 416,422 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 416,422 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 429,435 ****
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
--- 429,435 ----
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 454,460 ****
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
--- 454,460 ----
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 486,492 ****
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
--- 486,492 ----
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 507,513 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 507,513 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 526,532 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
--- 526,532 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 541,547 ****
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close large object file\n");
  }
  
  /*
--- 541,547 ----
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close large object file\n");
  }
  
  /*
*************** _EndBlobs(ArchiveHandle *AH, TocEntry *t
*** 558,562 ****
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
--- 558,562 ----
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_backup_null.c b/src/bin/pg_dump/pg_backup_null.c
index 201f0d9..ba1e461 100644
*** a/src/bin/pg_dump/pg_backup_null.c
--- b/src/bin/pg_dump/pg_backup_null.c
*************** InitArchiveFmt_Null(ArchiveHandle *AH)
*** 74,80 ****
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		die_horribly(AH, NULL, "this format cannot be read\n");
  }
  
  /*
--- 74,80 ----
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		exit_horribly(NULL, "this format cannot be read\n");
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 149,155 ****
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		die_horribly(AH, NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
--- 149,155 ----
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		exit_horribly(NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 4823ede..451c957 100644
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 355,361 ****
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				die_horribly(AH, modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
--- 355,361 ----
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				exit_horribly(modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 369,375 ****
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			die_horribly(AH, modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
--- 369,375 ----
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			exit_horribly(modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 411,417 ****
  #endif
  
  		if (tm->tmpFH == NULL)
! 			die_horribly(AH, modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
--- 411,417 ----
  #endif
  
  		if (tm->tmpFH == NULL)
! 			exit_horribly(modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 420,426 ****
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				die_horribly(AH, modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
--- 420,426 ----
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				exit_horribly(modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
*************** tarClose(ArchiveHandle *AH, TAR_MEMBER *
*** 447,453 ****
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			die_horribly(AH, modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
--- 447,453 ----
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			exit_horribly(modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
*************** _tarReadRaw(ArchiveHandle *AH, void *buf
*** 547,553 ****
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			die_horribly(AH, modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
--- 547,553 ----
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			exit_horribly(modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
*************** tarWrite(const void *buf, size_t len, TA
*** 584,591 ****
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		die_horribly(th->AH, modulename,
! 					 "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
--- 584,591 ----
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 672,679 ****
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
--- 672,679 ----
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 690,698 ****
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						 tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
--- 690,698 ----
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						  tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
*************** _ReadByte(ArchiveHandle *AH)
*** 784,790 ****
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
--- 784,790 ----
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
*************** _CloseArchive(ArchiveHandle *AH)
*** 878,884 ****
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				die_horribly(AH, modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
--- 878,884 ----
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				exit_horribly(modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 934,940 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 934,940 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1077,1083 ****
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		die_horribly(AH, modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
--- 1077,1083 ----
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		exit_horribly(modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1085,1099 ****
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			die_horribly(AH, modulename,
! 						 "could not write to output file: %s\n",
! 						 strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		die_horribly(AH, modulename, "could not close temporary file: %s\n",
! 					 strerror(errno));
  
  	if (len != th->fileLen)
  	{
--- 1085,1099 ----
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			exit_horribly(modulename,
! 						  "could not write to output file: %s\n",
! 						  strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		exit_horribly(modulename, "could not close temporary file: %s\n",
! 					  strerror(errno));
  
  	if (len != th->fileLen)
  	{
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1102,1116 ****
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		die_horribly(AH, modulename, "actual file length (%s) does not match expected (%s)\n",
! 					 buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			die_horribly(AH, modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
--- 1102,1116 ----
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		exit_horribly(modulename, "actual file length (%s) does not match expected (%s)\n",
! 					  buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			exit_horribly(modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1159,1165 ****
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
--- 1159,1165 ----
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1177,1185 ****
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			die_horribly(AH, modulename, "restoring data out of order is not supported in this archive format: "
! 						 "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						 th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
--- 1177,1185 ----
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			exit_horribly(modulename, "restoring data out of order is not supported in this archive format: "
! 						  "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						  th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1189,1195 ****
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
--- 1189,1195 ----
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1222,1228 ****
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			die_horribly(AH, modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
--- 1222,1228 ----
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			exit_horribly(modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1237,1247 ****
  			return 0;
  
  		if (len != 512)
! 			die_horribly(AH, modulename,
! 						 ngettext("incomplete tar header found (%lu byte)\n",
! 								  "incomplete tar header found (%lu bytes)\n",
! 								  len),
! 						 (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
--- 1237,1247 ----
  			return 0;
  
  		if (len != 512)
! 			exit_horribly(modulename,
! 						  ngettext("incomplete tar header found (%lu byte)\n",
! 								   "incomplete tar header found (%lu bytes)\n",
! 								   len),
! 						  (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1285,1294 ****
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		die_horribly(AH, modulename,
! 					 "corrupt tar header found in %s "
! 					 "(expected %d, computed %d) file position %s\n",
! 					 tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
--- 1285,1294 ----
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		exit_horribly(modulename,
! 					  "corrupt tar header found in %s "
! 					  "(expected %d, computed %d) file position %s\n",
! 					  tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
*************** _tarWriteHeader(TAR_MEMBER *th)
*** 1379,1383 ****
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		die_horribly(th->AH, modulename, "could not write to output file: %s\n", strerror(errno));
  }
--- 1379,1383 ----
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b0a5ff..089c98f 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	serializable_deferrable = 0;
*** 144,150 ****
  
  
  static void help(const char *progname);
- static void pgdump_cleanup_at_exit(int code, void *arg);
  static void setup_connection(Archive *AH, const char *dumpencoding,
  				 char *use_role);
  static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
--- 144,149 ----
*************** main(int argc, char **argv)
*** 575,581 ****
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 	on_exit_nicely(pgdump_cleanup_at_exit, fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
--- 574,582 ----
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 
! 	/* Register the cleanup hook */
! 	on_exit_close_archive(fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
*************** help(const char *progname)
*** 837,850 ****
  }
  
  static void
- pgdump_cleanup_at_exit(int code, void *arg)
- {
- 	Archive	   *AH = (Archive *) arg;
- 
- 	DisconnectDatabase(AH);
- }
- 
- static void
  setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
  {
  	PGconn	   *conn = GetConnection(AH);
--- 838,843 ----
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index b5f4c62..edd0de9 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 384,389 ****
--- 384,396 ----
  
  	AH = OpenArchive(inputFileSpec, opts->format);
  
+ 	/*
+ 	 * We don't have a connection yet but that doesn't matter. The connection
+ 	 * is initialized to NULL and if we terminate through exit_nicely() while
+ 	 * it's still NULL, the cleanup function will just be a no-op.
+ 	 */
+ 	on_exit_close_archive(AH);
+ 
  	/* Let the archiver know how noisy to be */
  	AH->verbose = opts->verbose;
  
#46 Erik Rijkers
er@xs4all.nl
In reply to: Joachim Wieland (#45)
Re: patch for parallel pg_dump

On Tue, March 20, 2012 04:04, Joachim Wieland wrote:

On Mon, Mar 19, 2012 at 9:14 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Sounds good to me in general ... my only gripe is this: I wonder if it
would be better to have a central routine that knows about both
archive_close_connection and archive_close_connection_parallel -- and
the argument to the callback is a struct that contains both a pointer to
the struct with the connection to be closed [...]

I had a similar idea before but concluded that it requires making this
struct globally available so that everybody (pg_dump.c / pg_restore.c /
pg_backup_archiver.c) can access it to set the appropriate state.

I gave it a second thought and have now defined a function that these
consumers can call; that way the variable can at least stay at file
scope.

Also, we don't need this switch: we can set the ParallelState in the
struct before any child forks off and reset it to NULL after the last
child has terminated.

New patch attached, thanks for your comments.

[pg_dump_die_horribly.2.diff]

In my hands, the patch complains:

In file included from gram.y:13255:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16243:23: warning: unused variable ‘yyg’ [-Wunused-variable]
pg_backup_archiver.c:3320:1: error: static declaration of ‘archive_close_connection’ follows
non-static declaration
pg_backup.h:170:13: note: previous declaration of ‘archive_close_connection’ was here
make[3]: *** [pg_backup_archiver.o] Error 1
make[2]: *** [all-pg_dump-recurse] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2
-- make returned 2 - abort

Erik Rijkers

#47 Erik Rijkers
er@xs4all.nl
In reply to: Erik Rijkers (#46)
Re: patch for parallel pg_dump

I should add: Centos 5.8, gcc 4.6.3

Erik Rijkers

#48 Joachim Wieland
joe@mcknight.de
In reply to: Erik Rijkers (#46)
1 attachment(s)
Re: patch for parallel pg_dump

On Tue, Mar 20, 2012 at 12:03 AM, Erik Rijkers <er@xs4all.nl> wrote:

In my hands, the patch complains:

Thanks, updated patch attached.

Attachments:

pg_dump_die_horribly.3.diff (text/x-patch; charset=US-ASCII)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c30b8f9..ff8e714 100644
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
*************** EndCompressorZlib(ArchiveHandle *AH, Com
*** 256,263 ****
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
--- 256,263 ----
  	DeflateCompressorZlib(AH, cs, true);
  
  	if (deflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression stream: %s\n", zp->msg);
  
  	free(cs->zlibOut);
  	free(cs->zp);
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 274,281 ****
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			die_horribly(AH, modulename,
! 						 "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
--- 274,281 ----
  	{
  		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
  		if (res == Z_STREAM_ERROR)
! 			exit_horribly(modulename,
! 						  "could not compress data: %s\n", zp->msg);
  		if ((flush && (zp->avail_out < cs->zlibOutSize))
  			|| (zp->avail_out == 0)
  			|| (zp->avail_in != 0)
*************** DeflateCompressorZlib(ArchiveHandle *AH,
*** 295,303 ****
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					die_horribly(AH, modulename,
! 								 "could not write to output file: %s\n",
! 								 strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
--- 295,303 ----
  				size_t		len = cs->zlibOutSize - zp->avail_out;
  
  				if (cs->writeF(AH, out, len) != len)
! 					exit_horribly(modulename,
! 								  "could not write to output file: %s\n",
! 								  strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
*************** WriteDataToArchiveZlib(ArchiveHandle *AH
*** 318,324 ****
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * die_horribly()
  	 */
  	return dLen;
  }
--- 318,324 ----
  
  	/*
  	 * we have either succeeded in writing dLen bytes or we have called
! 	 * exit_horribly()
  	 */
  	return dLen;
  }
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 361,368 ****
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				die_horribly(AH, modulename,
! 							 "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
--- 361,368 ----
  
  			res = inflate(zp, 0);
  			if (res != Z_OK && res != Z_STREAM_END)
! 				exit_horribly(modulename,
! 							  "could not uncompress data: %s\n", zp->msg);
  
  			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 377,392 ****
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			die_horribly(AH, modulename,
! 						 "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		die_horribly(AH, modulename,
! 					 "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
--- 377,392 ----
  		zp->avail_out = ZLIB_OUT_SIZE;
  		res = inflate(zp, 0);
  		if (res != Z_OK && res != Z_STREAM_END)
! 			exit_horribly(modulename,
! 						  "could not uncompress data: %s\n", zp->msg);
  
  		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
  		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
  	}
  
  	if (inflateEnd(zp) != Z_OK)
! 		exit_horribly(modulename,
! 					  "could not close compression library: %s\n", zp->msg);
  
  	free(buf);
  	free(out);
*************** WriteDataToArchiveNone(ArchiveHandle *AH
*** 426,434 ****
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		die_horribly(AH, modulename,
! 					 "could not write to output file: %s\n",
! 					 strerror(errno));
  	return dLen;
  }
  
--- 426,434 ----
  	 * do a check here as well...
  	 */
  	if (cs->writeF(AH, data, dLen) != dLen)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n",
! 					  strerror(errno));
  	return dLen;
  }
  
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 0b24220..d9681f6 100644
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
*************** static void AddAcl(PQExpBuffer aclbuf, c
*** 49,54 ****
--- 49,55 ----
  #ifdef WIN32
  static bool parallel_init_done = false;
  static DWORD tls_index;
+ static DWORD mainThreadId;
  #endif
  
  void
*************** init_parallel_dump_utils(void)
*** 59,64 ****
--- 60,66 ----
  	{
  		tls_index = TlsAlloc();
  		parallel_init_done = true;
+ 		mainThreadId = GetCurrentThreadId();
  	}
  #endif
  }
*************** on_exit_nicely(on_exit_nicely_callback f
*** 1313,1318 ****
--- 1315,1327 ----
  	on_exit_nicely_index++;
  }
  
+ /* Delete any previously set callback functions */
+ void
+ on_exit_nicely_reset(void)
+ {
+ 	on_exit_nicely_index = 0;
+ }
+ 
  /* Run accumulated on_exit_nicely callbacks and then exit quietly. */
  void
  exit_nicely(int code)
*************** exit_nicely(int code)
*** 1320,1324 ****
--- 1329,1337 ----
  	while (--on_exit_nicely_index >= 0)
  		(*on_exit_nicely_list[on_exit_nicely_index].function)(code,
  			on_exit_nicely_list[on_exit_nicely_index].arg);
+ #ifdef WIN32
+ 	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
+ 		ExitThread(code);
+ #endif
  	exit(code);
  }
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
index 82cf940..2865c0f 100644
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*************** extern void set_section (const char *arg
*** 62,67 ****
--- 62,68 ----
  
  typedef void (*on_exit_nicely_callback) (int code, void *arg);
  extern void on_exit_nicely(on_exit_nicely_callback function, void *arg);
+ extern void on_exit_nicely_reset(void);
  extern void exit_nicely(int code) __attribute__((noreturn));
  
  #endif   /* DUMPUTILS_H */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 79f7dda..a684a31 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 61,71 ****
--- 61,88 ----
  #define thandle HANDLE
  #endif
  
+ typedef struct _parallel_state_entry
+ {
+ #ifdef WIN32
+ 	unsigned int threadId;
+ #else
+ 	pid_t		pid;
+ #endif
+ 	ArchiveHandle *AH;
+ } ParallelStateEntry;
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelStateEntry *pse;
+ } ParallelState;
+ 
  /* Arguments needed for a worker child */
  typedef struct _restore_args
  {
  	ArchiveHandle *AH;
  	TocEntry   *te;
+ 	ParallelStateEntry *pse;
  } RestoreArgs;
  
  /* State for each parallel activity slot */
*************** typedef struct _parallel_slot
*** 75,80 ****
--- 92,105 ----
  	RestoreArgs *args;
  } ParallelSlot;
  
+ typedef struct _shutdown_information
+ {
+ 	ParallelState *pstate;
+ 	Archive       *AHX;
+ } ShutdownInformation;
+ 
+ static ShutdownInformation shutdown_info;
+ 
  #define NO_SLOT (-1)
  
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
*************** static int	_discoverArchiveFormat(Archiv
*** 122,131 ****
  
  static int	RestoringToDB(ArchiveHandle *AH);
  static void dump_lo_buf(ArchiveHandle *AH);
- static void vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 						  const char *fmt, va_list ap)
- 	__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0), noreturn));
- 
  static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
  static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
  static OutputContext SaveOutput(ArchiveHandle *AH);
--- 147,152 ----
*************** static void inhibit_data_for_failed_tabl
*** 160,165 ****
--- 181,191 ----
  static ArchiveHandle *CloneArchive(ArchiveHandle *AH);
  static void DeCloneArchive(ArchiveHandle *AH);
  
+ static void setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH);
+ static void unsetProcessIdentifier(ParallelStateEntry *pse);
+ static int GetMySlot(ParallelState *pstate);
+ static void archive_close_connection(int code, void *arg);
+ 
  
  /*
   *	Wrapper functions.
*************** CloseArchive(Archive *AHX)
*** 208,215 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
! 					 strerror(errno));
  }
  
  /* Public */
--- 234,241 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
! 					  strerror(errno));
  }
  
  /* Public */
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 234,247 ****
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		die_horribly(AH, modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		die_horribly(AH, modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
--- 260,273 ----
  	 * connected to, not the one we will create, which is very bad...
  	 */
  	if (ropt->createDB && ropt->dropSchema)
! 		exit_horribly(modulename, "-C and -c are incompatible options\n");
  
  	/*
  	 * -C is not compatible with -1, because we can't create a database inside
  	 * a transaction block.
  	 */
  	if (ropt->createDB && ropt->single_txn)
! 		exit_horribly(modulename, "-C and -1 are incompatible options\n");
  
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 251,261 ****
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			die_horribly(AH, modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			die_horribly(AH, modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
--- 277,287 ----
  	{
  		/* We haven't got round to making this work for all archive formats */
  		if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 			exit_horribly(modulename, "parallel restore is not supported with this archive file format\n");
  
  		/* Doesn't work if the archive represents dependencies as OIDs */
  		if (AH->version < K_VERS_1_8)
! 			exit_horribly(modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  		/*
  		 * It's also not gonna work if we can't reopen the input file, so
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 274,280 ****
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				die_horribly(AH, modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
--- 300,306 ----
  		{
  			reqs = _tocEntryRequired(te, ropt, false);
  			if (te->hadDumper && (reqs & REQ_DATA) != 0)
! 				exit_horribly(modulename, "cannot restore from compressed archive (compression not supported in this installation)\n");
  		}
  	}
  #endif
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 286,292 ****
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			die_horribly(AH, modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
--- 312,318 ----
  	{
  		ahlog(AH, 1, "connecting to database for restore\n");
  		if (AH->version < K_VERS_1_3)
! 			exit_horribly(modulename, "direct database connections are not supported in pre-1.3 archives\n");
  
  		/* XXX Should get this from the archive */
  		AHX->minRemoteVersion = 070100;
*************** WriteData(Archive *AHX, const void *data
*** 734,740 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		die_horribly(AH, modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
--- 760,766 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->currToc)
! 		exit_horribly(modulename, "internal error -- WriteData cannot be called outside the context of a DataDumper routine\n");
  
  	return (*AH->WriteDataPtr) (AH, data, dLen);
  }
*************** StartBlob(Archive *AHX, Oid oid)
*** 889,895 ****
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		die_horribly(AH, modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
--- 915,921 ----
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
  	if (!AH->StartBlobPtr)
! 		exit_horribly(modulename, "large-object output not supported in chosen format\n");
  
  	(*AH->StartBlobPtr) (AH, AH->currToc, oid);
  
*************** StartRestoreBlob(ArchiveHandle *AH, Oid
*** 976,988 ****
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				die_horribly(AH, modulename, "could not create large object %u: %s",
! 							 oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			die_horribly(AH, modulename, "could not open large object %u: %s",
! 						 oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
--- 1002,1014 ----
  		{
  			loOid = lo_create(AH->connection, oid);
  			if (loOid == 0 || loOid != oid)
! 				exit_horribly(modulename, "could not create large object %u: %s",
! 							  oid, PQerrorMessage(AH->connection));
  		}
  		AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
  		if (AH->loFd == -1)
! 			exit_horribly(modulename, "could not open large object %u: %s",
! 						  oid, PQerrorMessage(AH->connection));
  	}
  	else
  	{
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1038,1045 ****
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		die_horribly(AH, modulename, "could not open TOC file \"%s\": %s\n",
! 					 ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
--- 1064,1071 ----
  	/* Setup the file */
  	fh = fopen(ropt->tocFile, PG_BINARY_R);
  	if (!fh)
! 		exit_horribly(modulename, "could not open TOC file \"%s\": %s\n",
! 					  ropt->tocFile, strerror(errno));
  
  	incomplete_line = false;
  	while (fgets(buf, sizeof(buf), fh) != NULL)
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1086,1093 ****
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			die_horribly(AH, modulename, "could not find entry for ID %d\n",
! 						 id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
--- 1112,1119 ----
  		/* Find TOC entry */
  		te = getTocEntryByDumpId(AH, id);
  		if (!te)
! 			exit_horribly(modulename, "could not find entry for ID %d\n",
! 						  id);
  
  		/* Mark it wanted */
  		ropt->idWanted[id - 1] = true;
*************** SortTocFromFile(Archive *AHX, RestoreOpt
*** 1107,1114 ****
  	}
  
  	if (fclose(fh) != 0)
! 		die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 1133,1140 ----
  	}
  
  	if (fclose(fh) != 0)
! 		exit_horribly(modulename, "could not close TOC file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** SetOutput(ArchiveHandle *AH, const char
*** 1224,1234 ****
  	if (!AH->OF)
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 filename, strerror(errno));
  		else
! 			die_horribly(AH, modulename, "could not open output file: %s\n",
! 						 strerror(errno));
  	}
  }
  
--- 1250,1260 ----
  	if (!AH->OF)
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  filename, strerror(errno));
  		else
! 			exit_horribly(modulename, "could not open output file: %s\n",
! 						  strerror(errno));
  	}
  }
  
*************** RestoreOutput(ArchiveHandle *AH, OutputC
*** 1254,1260 ****
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		die_horribly(AH, modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
--- 1280,1286 ----
  		res = fclose(AH->OF);
  
  	if (res != 0)
! 		exit_horribly(modulename, "could not close output file: %s\n",
  					 strerror(errno));
  
  	AH->gzOut = savedContext.gzOut;
*************** dump_lo_buf(ArchiveHandle *AH)
*** 1332,1338 ****
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			die_horribly(AH, modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
--- 1358,1364 ----
  							  AH->lo_buf_used),
  			  (unsigned long) AH->lo_buf_used, (unsigned long) res);
  		if (res != AH->lo_buf_used)
! 			exit_horribly(modulename,
  			"could not write to large object (result: %lu, expected: %lu)\n",
  					   (unsigned long) res, (unsigned long) AH->lo_buf_used);
  	}
*************** ahwrite(const void *ptr, size_t size, si
*** 1391,1397 ****
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
--- 1417,1423 ----
  	{
  		res = GZWRITE(ptr, size, nmemb, AH->OF);
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  		return res;
  	}
  	else if (AH->CustomOutPtr)
*************** ahwrite(const void *ptr, size_t size, si
*** 1399,1405 ****
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			die_horribly(AH, modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
--- 1425,1431 ----
  		res = AH->CustomOutPtr (AH, ptr, size * nmemb);
  
  		if (res != (nmemb * size))
! 			exit_horribly(modulename, "could not write to custom output routine\n");
  		return res;
  	}
  	else
*************** ahwrite(const void *ptr, size_t size, si
*** 1414,1468 ****
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				die_horribly(AH, modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
- 
- /* Report a fatal error and exit(1) */
- static void
- vdie_horribly(ArchiveHandle *AH, const char *modulename,
- 			  const char *fmt, va_list ap)
- {
- 	vwrite_msg(modulename, fmt, ap);
- 
- 	if (AH)
- 	{
- 		if (AH->public.verbose)
- 			write_msg(NULL, "*** aborted because of error\n");
- 		DisconnectDatabase(&AH->public);
- 	}
- 
- 	exit_nicely(1);
- }
- 
- /* As above, but with variable arg list */
- void
- die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...)
- {
- 	va_list		ap;
- 
- 	va_start(ap, fmt);
- 	vdie_horribly(AH, modulename, fmt, ap);
- 	va_end(ap);
- }
- 
- /* As above, but with a complaint about a particular query. */
- void
- die_on_query_failure(ArchiveHandle *AH, const char *modulename,
- 					 const char *query)
- {
- 	write_msg(modulename, "query failed: %s",
- 			  PQerrorMessage(AH->connection));
- 	die_horribly(AH, modulename, "query was: %s\n", query);
- }
- 
  /* on some error, we may decide to go on... */
  void
! warn_or_die_horribly(ArchiveHandle *AH,
  					 const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
--- 1440,1455 ----
  		{
  			res = fwrite(ptr, size, nmemb, AH->OF);
  			if (res != nmemb)
! 				exit_horribly(modulename, "could not write to output file: %s\n",
  							 strerror(errno));
  			return res;
  		}
  	}
  }
  
  /* on some error, we may decide to go on... */
  void
! warn_or_exit_horribly(ArchiveHandle *AH,
  					 const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
*************** warn_or_die_horribly(ArchiveHandle *AH,
*** 1500,1513 ****
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
  	if (AH->public.exit_on_error)
! 		vdie_horribly(AH, modulename, fmt, ap);
  	else
- 	{
- 		vwrite_msg(modulename, fmt, ap);
  		AH->public.n_errors++;
- 	}
- 	va_end(ap);
  }
  
  #ifdef NOT_USED
--- 1487,1499 ----
  	AH->lastErrorTE = AH->currentTE;
  
  	va_start(ap, fmt);
+ 	vwrite_msg(modulename, fmt, ap);
+ 	va_end(ap);
+ 
  	if (AH->public.exit_on_error)
! 		exit_nicely(1);
  	else
  		AH->public.n_errors++;
  }
  
  #ifdef NOT_USED
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1626,1632 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
--- 1612,1618 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unexpected data offset flag %d\n", offsetFlg);
  	}
  
  	/*
*************** ReadOffset(ArchiveHandle *AH, pgoff_t *
*** 1639,1645 ****
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				die_horribly(AH, modulename, "file offset in dump file is too large\n");
  		}
  	}
  
--- 1625,1631 ----
  		else
  		{
  			if ((*AH->ReadBytePtr) (AH) != 0)
! 				exit_horribly(modulename, "file offset in dump file is too large\n");
  		}
  	}
  
*************** ReadStr(ArchiveHandle *AH)
*** 1733,1739 ****
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
--- 1719,1725 ----
  	{
  		buf = (char *) pg_malloc(l + 1);
  		if ((*AH->ReadBufPtr) (AH, (void *) buf, l) != l)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		buf[l] = '\0';
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1776,1783 ****
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
--- 1762,1769 ----
  			char		buf[MAXPGPATH];
  
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1786,1817 ****
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
! 							 AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			die_horribly(AH, modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						 AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			die_horribly(AH, modulename, "could not open input file: %s\n",
! 						 strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
--- 1772,1803 ----
  
  #ifdef HAVE_LIBZ
  			if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
! 				exit_horribly(modulename, "directory name too long: \"%s\"\n",
! 							  AH->fSpec);
  			if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
  			{
  				AH->format = archDirectory;
  				return AH->format;
  			}
  #endif
! 			exit_horribly(modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
! 						  AH->fSpec);
  			fh = NULL;			/* keep compiler quiet */
  		}
  		else
  		{
  			fh = fopen(AH->fSpec, PG_BINARY_R);
  			if (!fh)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  	}
  	else
  	{
  		fh = stdin;
  		if (!fh)
! 			exit_horribly(modulename, "could not open input file: %s\n",
! 						  strerror(errno));
  	}
  
  	cnt = fread(sig, 1, 5, fh);
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1819,1828 ****
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			die_horribly(AH, modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			die_horribly(AH, modulename, "input file is too short (read %lu, expected 5)\n",
! 						 (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
--- 1805,1814 ----
  	if (cnt != 5)
  	{
  		if (ferror(fh))
! 			exit_horribly(modulename, "could not read input file: %s\n", strerror(errno));
  		else
! 			exit_horribly(modulename, "input file is too short (read %lu, expected 5)\n",
! 						  (unsigned long) cnt);
  	}
  
  	/* Save it, just in case we need it later */
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1883,1896 ****
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			die_horribly(AH, modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			die_horribly(AH, modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
--- 1869,1882 ----
  			 strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
  		{
  			/* looks like it's probably a text format dump. so suggest they try psql */
! 			exit_horribly(modulename, "input file appears to be a text format dump. Please use psql.\n");
  		}
  
  		if (AH->lookaheadLen != 512)
! 			exit_horribly(modulename, "input file does not appear to be a valid archive (too short?)\n");
  
  		if (!isValidTarHeader(AH->lookahead))
! 			exit_horribly(modulename, "input file does not appear to be a valid archive\n");
  
  		AH->format = archTar;
  	}
*************** _discoverArchiveFormat(ArchiveHandle *AH
*** 1910,1917 ****
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			die_horribly(AH, modulename, "could not close input file: %s\n",
! 						 strerror(errno));
  
  	return AH->format;
  }
--- 1896,1903 ----
  	/* Close the file */
  	if (wantClose)
  		if (fclose(fh) != 0)
! 			exit_horribly(modulename, "could not close input file: %s\n",
! 						  strerror(errno));
  
  	return AH->format;
  }
*************** _allocAH(const char *FileSpec, const Arc
*** 2034,2040 ****
  			break;
  
  		default:
! 			die_horribly(AH, modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
--- 2020,2026 ----
  			break;
  
  		default:
! 			exit_horribly(modulename, "unrecognized file format \"%d\"\n", fmt);
  	}
  
  	return AH;
*************** ReadToc(ArchiveHandle *AH)
*** 2156,2164 ****
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			die_horribly(AH, modulename,
! 					   "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						 te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
--- 2142,2150 ----
  
  		/* Sanity check */
  		if (te->dumpId <= 0)
! 			exit_horribly(modulename,
! 						  "entry ID %d out of range -- perhaps a corrupt TOC\n",
! 						  te->dumpId);
  
  		te->hadDumper = ReadInt(AH);
  
*************** processEncodingEntry(ArchiveHandle *AH,
*** 2313,2325 ****
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			die_horribly(AH, modulename, "unrecognized encoding \"%s\"\n",
! 						 ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		die_horribly(AH, modulename, "invalid ENCODING item: %s\n",
! 					 te->defn);
  
  	free(defn);
  }
--- 2299,2311 ----
  		*ptr2 = '\0';
  		encoding = pg_char_to_encoding(ptr1);
  		if (encoding < 0)
! 			exit_horribly(modulename, "unrecognized encoding \"%s\"\n",
! 						  ptr1);
  		AH->public.encoding = encoding;
  	}
  	else
! 		exit_horribly(modulename, "invalid ENCODING item: %s\n",
! 					  te->defn);
  
  	free(defn);
  }
*************** processStdStringsEntry(ArchiveHandle *AH
*** 2336,2343 ****
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		die_horribly(AH, modulename, "invalid STDSTRINGS item: %s\n",
! 					 te->defn);
  }
  
  static teReqs
--- 2322,2329 ----
  	else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
  		AH->public.std_strings = false;
  	else
! 		exit_horribly(modulename, "invalid STDSTRINGS item: %s\n",
! 					  te->defn);
  }
  
  static teReqs
*************** _doSetSessionAuth(ArchiveHandle *AH, con
*** 2544,2552 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_die_horribly... use -O instead to skip this. */
! 			die_horribly(AH, modulename, "could not set session user to \"%s\": %s",
! 						 user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2530,2538 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			/* NOT warn_or_exit_horribly... use -O instead to skip this. */
! 			exit_horribly(modulename, "could not set session user to \"%s\": %s",
! 						  user, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _doSetWithOids(ArchiveHandle *AH, const
*** 2576,2584 ****
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_with_oids: %s",
! 								 PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2562,2570 ----
  		res = PQexec(AH->connection, cmd->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_with_oids: %s",
! 								  PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectOutputSchema(ArchiveHandle *AH, c
*** 2714,2722 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set search_path to \"%s\": %s",
! 								 schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2700,2708 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set search_path to \"%s\": %s",
! 								  schemaName, PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** _selectTablespace(ArchiveHandle *AH, con
*** 2775,2783 ****
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename,
! 								 "could not set default_tablespace to %s: %s",
! 								 fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
--- 2761,2769 ----
  		res = PQexec(AH->connection, qry->data);
  
  		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename,
! 								  "could not set default_tablespace to %s: %s",
! 								  fmtId(want), PQerrorMessage(AH->connection));
  
  		PQclear(res);
  	}
*************** ReadHead(ArchiveHandle *AH)
*** 3157,3166 ****
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			die_horribly(AH, modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			die_horribly(AH, modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
--- 3143,3152 ----
  	if (!AH->readHeader)
  	{
  		if ((*AH->ReadBufPtr) (AH, tmpMag, 5) != 5)
! 			exit_horribly(modulename, "unexpected end of file\n");
  
  		if (strncmp(tmpMag, "PGDMP", 5) != 0)
! 			exit_horribly(modulename, "did not find magic string in file header\n");
  
  		AH->vmaj = (*AH->ReadBytePtr) (AH);
  		AH->vmin = (*AH->ReadBytePtr) (AH);
*************** ReadHead(ArchiveHandle *AH)
*** 3173,3185 ****
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			die_horribly(AH, modulename, "unsupported version (%d.%d) in file header\n",
! 						 AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			die_horribly(AH, modulename, "sanity check on integer size (%lu) failed\n",
! 						 (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
--- 3159,3171 ----
  		AH->version = ((AH->vmaj * 256 + AH->vmin) * 256 + AH->vrev) * 256 + 0;
  
  		if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
! 			exit_horribly(modulename, "unsupported version (%d.%d) in file header\n",
! 						  AH->vmaj, AH->vmin);
  
  		AH->intSize = (*AH->ReadBytePtr) (AH);
  		if (AH->intSize > 32)
! 			exit_horribly(modulename, "sanity check on integer size (%lu) failed\n",
! 						  (unsigned long) AH->intSize);
  
  		if (AH->intSize > sizeof(int))
  			write_msg(modulename, "WARNING: archive was made on a machine with larger integers, some operations might fail\n");
*************** ReadHead(ArchiveHandle *AH)
*** 3192,3199 ****
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			die_horribly(AH, modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						 AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
--- 3178,3185 ----
  		fmt = (*AH->ReadBytePtr) (AH);
  
  		if (AH->format != fmt)
! 			exit_horribly(modulename, "expected format (%d) differs from format found in file (%d)\n",
! 						  AH->format, fmt);
  	}
  
  	if (AH->version >= K_VERS_1_2)
*************** dumpTimestamp(ArchiveHandle *AH, const c
*** 3297,3302 ****
--- 3283,3346 ----
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
+ static void
+ setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	pse->threadId = GetCurrentThreadId();
+ #else
+ 	pse->pid = getpid();
+ #endif
+ 	pse->AH = AH;
+ }
+ 
+ static void
+ unsetProcessIdentifier(ParallelStateEntry *pse)
+ {
+ #ifdef WIN32
+ 	pse->threadId = 0;
+ #else
+ 	pse->pid = 0;
+ #endif
+ 	pse->AH = NULL;
+ }
+ 
+ static int
+ GetMySlot(ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->pse[i].threadId == GetCurrentThreadId())
+ #else
+ 		if (pstate->pse[i].pid == getpid())
+ #endif
+ 			return i;
+ 
+ 	return NO_SLOT;
+ }
+ 
+ static void
+ archive_close_connection(int code, void *arg)
+ {
+ 	ShutdownInformation *si = (ShutdownInformation *) arg;
+ 	if (si->pstate)
+ 	{
+ 		int slotno = GetMySlot(si->pstate);
+ 		if (slotno != NO_SLOT && si->pstate->pse[slotno].AH)
+ 			DisconnectDatabase(&si->pstate->pse[slotno].AH->public);
+ 	}
+ 	else if (si->AHX)
+ 		DisconnectDatabase(si->AHX);
+ }
+ 
+ void
+ on_exit_close_archive(Archive *AHX)
+ {
+ 	shutdown_info.AHX = AHX;
+ 	on_exit_nicely(archive_close_connection, &shutdown_info);
+ }
  
  /*
   * Main engine for parallel restore.
*************** restore_toc_entries_parallel(ArchiveHand
*** 3323,3332 ****
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(sizeof(ParallelSlot), n_slots);
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3367,3383 ----
  	TocEntry   *next_work_item;
  	thandle		ret_child;
  	TocEntry   *te;
+ 	ParallelState *pstate;
+ 	int			i;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
! 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
! 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
! 	pstate->numWorkers = ropt->number_of_jobs;
! 	for (i = 0; i < pstate->numWorkers; i++)
! 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3382,3387 ****
--- 3433,3444 ----
  	 */
  	DisconnectDatabase(&AH->public);
  
+ 	/*
+ 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
+ 	 * and falls back to AHX otherwise.
+ 	 */
+ 	shutdown_info.pstate = pstate;
+ 
  	/* blow away any transient state from the old connection */
  	if (AH->currUser)
  		free(AH->currUser);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3480,3485 ****
--- 3537,3543 ----
  				args = pg_malloc(sizeof(RestoreArgs));
  				args->AH = CloneArchive(AH);
  				args->te = next_work_item;
+ 				args->pse = &pstate->pse[next_slot];
  
  				/* run the step in a worker child */
  				child = spawn_restore(args);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3507,3520 ****
  		}
  		else
  		{
! 			die_horribly(AH, modulename, "worker process crashed: status %d\n",
! 						 work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
  	 * Now reconnect the single parent connection.
  	 */
  	ConnectDatabase((Archive *) AH, ropt->dbname,
--- 3565,3584 ----
  		}
  		else
  		{
! 			exit_horribly(modulename, "worker process crashed: status %d\n",
! 						  work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
  	/*
+ 	 * Remove the pstate again, so the exit handler will now fall back to
+ 	 * closing AH->connection again.
+ 	 */
+ 	shutdown_info.pstate = NULL;
+ 
+ 	/*
  	 * Now reconnect the single parent connection.
  	 */
  	ConnectDatabase((Archive *) AH, ropt->dbname,
*************** spawn_restore(RestoreArgs *args)
*** 3555,3577 ****
  	{
  		/* in child process */
  		parallel_restore(args);
! 		die_horribly(args->AH, modulename,
! 					 "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker process: %s\n",
! 					 strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		die_horribly(args->AH, modulename,
! 					 "could not create worker thread: %s\n",
! 					 strerror(errno));
  #endif
  
  	return child;
--- 3619,3641 ----
  	{
  		/* in child process */
  		parallel_restore(args);
! 		exit_horribly(modulename,
! 					  "parallel_restore should not return\n");
  	}
  	else if (child < 0)
  	{
  		/* fork failed */
! 		exit_horribly(modulename,
! 					  "could not create worker process: %s\n",
! 					  strerror(errno));
  	}
  #else
  	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
  									args, 0, NULL);
  	if (child == 0)
! 		exit_horribly(modulename,
! 					  "could not create worker thread: %s\n",
! 					  strerror(errno));
  #endif
  
  	return child;
*************** parallel_restore(RestoreArgs *args)
*** 3813,3818 ****
--- 3877,3884 ----
  	RestoreOptions *ropt = AH->ropt;
  	int			retval;
  
+ 	setProcessIdentifier(args->pse, AH);
+ 
  	/*
  	 * Close and reopen the input file so we have a private file pointer that
  	 * doesn't stomp on anyone else's file pointer, if we're actually going to
*************** parallel_restore(RestoreArgs *args)
*** 3843,3848 ****
--- 3909,3915 ----
  
  	/* And clean up */
  	DisconnectDatabase((Archive *) AH);
+ 	unsetProcessIdentifier(args->pse);
  
  	/* If we reopened the file, we are done with it, so close it now */
  	if (te->section == SECTION_DATA)
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3888,3894 ****
  	}
  
  	if (te == NULL)
! 		die_horribly(AH, modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
--- 3955,3961 ----
  	}
  
  	if (te == NULL)
! 		exit_horribly(modulename, "could not find slot of finished worker\n");
  
  	ahlog(AH, 1, "finished item %d %s %s\n",
  		  te->dumpId, te->desc, te->tag);
*************** mark_work_done(ArchiveHandle *AH, TocEnt
*** 3903,3910 ****
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		die_horribly(AH, modulename, "worker process failed: exit code %d\n",
! 					 status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
--- 3970,3977 ----
  	else if (status == WORKER_IGNORED_ERRORS)
  		AH->public.n_errors++;
  	else if (status != 0)
! 		exit_horribly(modulename, "worker process failed: exit code %d\n",
! 					  status);
  
  	reduce_dependencies(AH, te, ready_list);
  }
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index fa8c58c..5d3af9c 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** typedef struct _tocEntry
*** 323,332 ****
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
  
! extern void die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4), noreturn));
! extern void die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query) __attribute__((noreturn));
! extern void warn_or_die_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
--- 323,331 ----
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
+ extern void on_exit_close_archive(Archive *AHX);
  
! extern void warn_or_exit_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
  extern void WriteTOC(ArchiveHandle *AH);
  extern void ReadTOC(ArchiveHandle *AH);
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 31fa373..87242c5 100644
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 146,160 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open output file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 146,160 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_W);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdout;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open output file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 165,179 ****
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 							 AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				die_horribly(AH, modulename, "could not open input file: %s\n",
! 							 strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
--- 165,179 ----
  		{
  			AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 							  AH->fSpec, strerror(errno));
  		}
  		else
  		{
  			AH->FH = stdin;
  			if (!AH->FH)
! 				exit_horribly(modulename, "could not open input file: %s\n",
! 							  strerror(errno));
  		}
  
  		ctx->hasSeek = checkSeek(AH->FH);
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 367,373 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
--- 367,373 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object\n");
  
  	WriteInt(AH, oid);
  
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 437,445 ****
  					break;
  
  				default:		/* Always have a default */
! 					die_horribly(AH, modulename,
! 								 "unrecognized data block type (%d) while searching archive\n",
! 								 blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
--- 437,445 ----
  					break;
  
  				default:		/* Always have a default */
! 					exit_horribly(modulename,
! 								  "unrecognized data block type (%d) while searching archive\n",
! 								  blkType);
  					break;
  			}
  			_readBlockHeader(AH, &blkType, &id);
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 449,456 ****
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			die_horribly(AH, modulename, "error during file seek: %s\n",
! 						 strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
--- 449,456 ----
  	{
  		/* We can just seek to the place we need to be. */
  		if (fseeko(AH->FH, tctx->dataPos, SEEK_SET) != 0)
! 			exit_horribly(modulename, "error during file seek: %s\n",
! 						  strerror(errno));
  
  		_readBlockHeader(AH, &blkType, &id);
  	}
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 459,483 ****
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 						 "which cannot be handled due to lack of data offsets in archive\n",
! 						 te->dumpId);
  		else if (!ctx->hasSeek)
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly due to out-of-order restore request, "
! 				  "which cannot be handled due to non-seekable input file\n",
! 						 te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			die_horribly(AH, modulename, "could not find block ID %d in archive -- "
! 						 "possibly corrupt archive\n",
! 						 te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		die_horribly(AH, modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					 id, te->dumpId);
  
  	switch (blkType)
  	{
--- 459,483 ----
  	if (blkType == EOF)
  	{
  		if (tctx->dataState == K_OFFSET_POS_NOT_SET)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to lack of data offsets in archive\n",
! 						  te->dumpId);
  		else if (!ctx->hasSeek)
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly due to out-of-order restore request, "
! 						  "which cannot be handled due to non-seekable input file\n",
! 						  te->dumpId);
  		else	/* huh, the dataPos led us to EOF? */
! 			exit_horribly(modulename, "could not find block ID %d in archive -- "
! 						  "possibly corrupt archive\n",
! 						  te->dumpId);
  	}
  
  	/* Are we sane? */
  	if (id != te->dumpId)
! 		exit_horribly(modulename, "found unexpected block ID (%d) when reading data -- expected %d\n",
! 					  id, te->dumpId);
  
  	switch (blkType)
  	{
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 490,497 ****
  			break;
  
  		default:				/* Always have a default */
! 			die_horribly(AH, modulename, "unrecognized data block type %d while restoring archive\n",
! 						 blkType);
  			break;
  	}
  }
--- 490,497 ----
  			break;
  
  		default:				/* Always have a default */
! 			exit_horribly(modulename, "unrecognized data block type %d while restoring archive\n",
! 						  blkType);
  			break;
  	}
  }
*************** _skipData(ArchiveHandle *AH)
*** 571,581 ****
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				die_horribly(AH, modulename,
! 							 "could not read from input file: end of file\n");
  			else
! 				die_horribly(AH, modulename,
! 					"could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
--- 571,581 ----
  		if (cnt != blkLen)
  		{
  			if (feof(AH->FH))
! 				exit_horribly(modulename,
! 							  "could not read from input file: end of file\n");
  			else
! 				exit_horribly(modulename,
! 							  "could not read from input file: %s\n", strerror(errno));
  		}
  
  		ctx->filePos += blkLen;
*************** _WriteByte(ArchiveHandle *AH, const int
*** 604,610 ****
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		die_horribly(AH, modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
--- 604,610 ----
  	if (res != EOF)
  		ctx->filePos += 1;
  	else
! 		exit_horribly(modulename, "could not write byte: %s\n", strerror(errno));
  	return res;
  }
  
*************** _ReadByte(ArchiveHandle *AH)
*** 624,630 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 624,630 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 645,651 ****
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		die_horribly(AH, modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
--- 645,651 ----
  	res = fwrite(buf, 1, len, AH->FH);
  
  	if (res != len)
! 		exit_horribly(modulename,
  					 "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 712,718 ****
  	}
  
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
--- 712,718 ----
  	}
  
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n", strerror(errno));
  
  	AH->FH = NULL;
  }
*************** _ReopenArchive(ArchiveHandle *AH)
*** 731,767 ****
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		die_horribly(AH, modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		die_horribly(AH, modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		die_horribly(AH, modulename, "could not determine seek position in archive file: %s\n",
! 					 strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close archive file: %s\n",
! 					 strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
! 					 AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		die_horribly(AH, modulename, "could not set seek position in archive file: %s\n",
! 					 strerror(errno));
  }
  
  /*
--- 731,767 ----
  	pgoff_t		tpos;
  
  	if (AH->mode == archModeWrite)
! 		exit_horribly(modulename, "can only reopen input archives\n");
  
  	/*
  	 * These two cases are user-facing errors since they represent unsupported
  	 * (but not invalid) use-cases.  Word the error messages appropriately.
  	 */
  	if (AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "parallel restore from stdin is not supported\n");
  	if (!ctx->hasSeek)
! 		exit_horribly(modulename, "parallel restore from non-seekable file is not supported\n");
  
  	errno = 0;
  	tpos = ftello(AH->FH);
  	if (errno)
! 		exit_horribly(modulename, "could not determine seek position in archive file: %s\n",
! 					  strerror(errno));
  
  #ifndef WIN32
  	if (fclose(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close archive file: %s\n",
! 					  strerror(errno));
  #endif
  
  	AH->FH = fopen(AH->fSpec, PG_BINARY_R);
  	if (!AH->FH)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
! 					  AH->fSpec, strerror(errno));
  
  	if (fseeko(AH->FH, tpos, SEEK_SET) != 0)
! 		exit_horribly(modulename, "could not set seek position in archive file: %s\n",
! 					  strerror(errno));
  }
  
  /*
*************** _Clone(ArchiveHandle *AH)
*** 778,784 ****
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		die_horribly(AH, modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
--- 778,784 ----
  
  	/* sanity check, shouldn't happen */
  	if (ctx->cs != NULL)
! 		exit_horribly(modulename, "compressor active\n");
  
  	/*
  	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
*************** _readBlockHeader(ArchiveHandle *AH, int
*** 840,846 ****
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll die_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
--- 840,846 ----
  	int			byt;
  
  	/*
! 	 * Note: if we are at EOF with a pre-1.3 input file, we'll exit_horribly
  	 * inside ReadInt rather than returning EOF.  It doesn't seem worth
  	 * jumping through hoops to deal with that case better, because no such
  	 * files are likely to exist in the wild: only some 7.1 development
*************** _CustomReadFunc(ArchiveHandle *AH, char
*** 905,914 ****
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			die_horribly(AH, modulename,
! 						 "could not read from input file: end of file\n");
  		else
! 			die_horribly(AH, modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
--- 905,914 ----
  	if (cnt != blkLen)
  	{
  		if (feof(AH->FH))
! 			exit_horribly(modulename,
! 						  "could not read from input file: end of file\n");
  		else
! 			exit_horribly(modulename,
  					"could not read from input file: %s\n", strerror(errno));
  	}
  	return cnt;
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index a843eac..b315e68 100644
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
*************** static PGconn *_connectDB(ArchiveHandle
*** 30,42 ****
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(ArchiveHandle *AH, const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		die_horribly(AH, modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
--- 30,42 ----
  static void notice_processor(void *arg, const char *message);
  
  static int
! _parse_version(const char *versionString)
  {
  	int			v;
  
  	v = parse_version(versionString);
  	if (v < 0)
! 		exit_horribly(modulename, "could not parse version string \"%s\"\n", versionString);
  
  	return v;
  }
*************** _check_database_version(ArchiveHandle *A
*** 48,60 ****
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(AH, PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		die_horribly(AH, modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(AH, remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
--- 48,60 ----
  	const char *remoteversion_str;
  	int			remoteversion;
  
! 	myversion = _parse_version(PG_VERSION);
  
  	remoteversion_str = PQparameterStatus(AH->connection, "server_version");
  	if (!remoteversion_str)
! 		exit_horribly(modulename, "could not get server_version from libpq\n");
  
! 	remoteversion = _parse_version(remoteversion_str);
  
  	AH->public.remoteVersionStr = pg_strdup(remoteversion_str);
  	AH->public.remoteVersion = remoteversion;
*************** _check_database_version(ArchiveHandle *A
*** 67,73 ****
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		die_horribly(AH, NULL, "aborting because of server version mismatch\n");
  	}
  }
  
--- 67,73 ----
  	{
  		write_msg(NULL, "server version: %s; %s version: %s\n",
  				  remoteversion_str, progname, PG_VERSION);
! 		exit_horribly(NULL, "aborting because of server version mismatch\n");
  	}
  }
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 145,151 ****
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  
  	do
--- 145,151 ----
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  
  	do
*************** _connectDB(ArchiveHandle *AH, const char
*** 176,187 ****
  		free(values);
  
  		if (!newConn)
! 			die_horribly(AH, modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				die_horribly(AH, modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
--- 176,187 ----
  		free(values);
  
  		if (!newConn)
! 			exit_horribly(modulename, "failed to reconnect to database\n");
  
  		if (PQstatus(newConn) == CONNECTION_BAD)
  		{
  			if (!PQconnectionNeedsPassword(newConn))
! 				exit_horribly(modulename, "could not reconnect to database: %s",
  							 PQerrorMessage(newConn));
  			PQfinish(newConn);
  
*************** _connectDB(ArchiveHandle *AH, const char
*** 197,206 ****
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				die_horribly(AH, modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 197,206 ----
  			if (AH->promptPassword != TRI_NO)
  				password = simple_prompt("Password: ", 100, false);
  			else
! 				exit_horribly(modulename, "connection needs password\n");
  
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 238,250 ****
  	bool		new_pass;
  
  	if (AH->connection)
! 		die_horribly(AH, modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			die_horribly(AH, modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
--- 238,250 ----
  	bool		new_pass;
  
  	if (AH->connection)
! 		exit_horribly(modulename, "already connected to a database\n");
  
  	if (prompt_password == TRI_YES && password == NULL)
  	{
  		password = simple_prompt("Password: ", 100, false);
  		if (password == NULL)
! 			exit_horribly(modulename, "out of memory\n");
  	}
  	AH->promptPassword = prompt_password;
  
*************** ConnectDatabase(Archive *AHX,
*** 280,286 ****
  		free(values);
  
  		if (!AH->connection)
! 			die_horribly(AH, modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
--- 280,286 ----
  		free(values);
  
  		if (!AH->connection)
! 			exit_horribly(modulename, "failed to connect to database\n");
  
  		if (PQstatus(AH->connection) == CONNECTION_BAD &&
  			PQconnectionNeedsPassword(AH->connection) &&
*************** ConnectDatabase(Archive *AHX,
*** 290,296 ****
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				die_horribly(AH, modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
--- 290,296 ----
  			PQfinish(AH->connection);
  			password = simple_prompt("Password: ", 100, false);
  			if (password == NULL)
! 				exit_horribly(modulename, "out of memory\n");
  			new_pass = true;
  		}
  	} while (new_pass);
*************** ConnectDatabase(Archive *AHX,
*** 299,305 ****
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		die_horribly(AH, modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
--- 299,305 ----
  
  	/* check to see that the backend connection was successfully made */
  	if (PQstatus(AH->connection) == CONNECTION_BAD)
! 		exit_horribly(modulename, "connection to database \"%s\" failed: %s",
  					 PQdb(AH->connection), PQerrorMessage(AH->connection));
  
  	/* check for version mismatch */
*************** notice_processor(void *arg, const char *
*** 331,336 ****
--- 331,344 ----
  	write_msg(NULL, "%s", message);
  }
  
+ /* Like exit_horribly(), but with a complaint about a particular query. */
+ static void
+ die_on_query_failure(ArchiveHandle *AH, const char *modulename, const char *query)
+ {
+ 	write_msg(modulename, "query failed: %s",
+ 			  PQerrorMessage(AH->connection));
+ 	exit_horribly(modulename, "query was: %s\n", query);
+ }
  
  void
  ExecuteSqlStatement(Archive *AHX, const char *query)
*************** ExecuteSqlCommand(ArchiveHandle *AH, con
*** 393,400 ****
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_die_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								 desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
--- 401,408 ----
  				errStmt[DB_MAX_ERR_STMT - 2] = '.';
  				errStmt[DB_MAX_ERR_STMT - 1] = '\0';
  			}
! 			warn_or_exit_horribly(AH, modulename, "%s: %s    Command was: %s\n",
! 								  desc, PQerrorMessage(conn), errStmt);
  			break;
  	}
  
*************** ExecuteSqlCommandBuf(ArchiveHandle *AH,
*** 495,502 ****
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyData: %s",
! 						 PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
--- 503,510 ----
  		 */
  		if (AH->pgCopyIn &&
  			PQputCopyData(AH->connection, buf, bufLen) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyData: %s",
! 						  PQerrorMessage(AH->connection));
  	}
  	else if (AH->outputKind == OUTPUT_OTHERDATA)
  	{
*************** EndDBCopyMode(ArchiveHandle *AH, TocEntr
*** 541,554 ****
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			die_horribly(AH, modulename, "error returned by PQputCopyEnd: %s",
! 						 PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_die_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								 te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
--- 549,562 ----
  		PGresult   *res;
  
  		if (PQputCopyEnd(AH->connection, NULL) <= 0)
! 			exit_horribly(modulename, "error returned by PQputCopyEnd: %s",
! 						  PQerrorMessage(AH->connection));
  
  		/* Check command status and return to normal libpq state */
  		res = PQgetResult(AH->connection);
  		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			warn_or_exit_horribly(AH, modulename, "COPY failed for table \"%s\": %s",
! 								  te->tag, PQerrorMessage(AH->connection));
  		PQclear(res);
  
  		AH->pgCopyIn = false;
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4b59516..8d43cd2 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 142,148 ****
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		die_horribly(AH, modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
--- 142,148 ----
  	 */
  
  	if (!AH->fSpec || strcmp(AH->fSpec, "") == 0)
! 		exit_horribly(modulename, "no output directory specified\n");
  
  	ctx->directory = AH->fSpec;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 160,168 ****
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename,
! 						 "could not open input file \"%s\": %s\n",
! 						 fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
--- 160,168 ----
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
! 			exit_horribly(modulename,
! 						  "could not open input file \"%s\": %s\n",
! 						  fname, strerror(errno));
  
  		ctx->dataFH = tocFH;
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 177,183 ****
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
--- 177,183 ----
  
  		/* Nothing else in the file, so close it again... */
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
  						 strerror(errno));
  		ctx->dataFH = NULL;
  	}
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 288,295 ****
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 288,295 ----
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 346,352 ****
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
--- 346,352 ----
  	cfp = cfopen_read(filename, PG_BINARY_R);
  
  	if (!cfp)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	buf = pg_malloc(ZLIB_OUT_SIZE);
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 357,363 ****
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		die_horribly(AH, modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
--- 357,363 ----
  
  	free(buf);
  	if (cfclose(cfp) != 0)
! 		exit_horribly(modulename, "could not close data file: %s\n",
  					 strerror(errno));
  }
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 397,404 ****
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					 fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
--- 397,404 ----
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open large object TOC file \"%s\" for input: %s\n",
! 					  fname, strerror(errno));
  
  	/* Read the blobs TOC file line-by-line, and process each blob */
  	while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 407,414 ****
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			die_horribly(AH, modulename, "invalid line in large object TOC file: %s\n",
! 						 line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
--- 407,414 ----
  		char		path[MAXPGPATH];
  
  		if (sscanf(line, "%u %s\n", &oid, fname) != 2)
! 			exit_horribly(modulename, "invalid line in large object TOC file: %s\n",
! 						  line);
  
  		StartRestoreBlob(AH, oid, ropt->dropSchema);
  		snprintf(path, MAXPGPATH, "%s/%s", ctx->directory, fname);
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 416,427 ****
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		die_horribly(AH, modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file \"%s\": %s\n",
! 					 fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
--- 416,427 ----
  		EndRestoreBlob(AH, oid);
  	}
  	if (!cfeof(ctx->blobsTocFH))
! 		exit_horribly(modulename, "error reading large object TOC file \"%s\"\n",
  					 fname);
  
  	if (cfclose(ctx->blobsTocFH) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file \"%s\": %s\n",
! 					  fname, strerror(errno));
  
  	ctx->blobsTocFH = NULL;
  
*************** _WriteByte(ArchiveHandle *AH, const int
*** 441,447 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	return 1;
  }
--- 441,447 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (cfwrite(&c, 1, ctx->dataFH) != 1)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	return 1;
  }
*************** _ReadByte(ArchiveHandle *AH)
*** 460,466 ****
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  
  	return res;
  }
--- 460,466 ----
  
  	res = cfgetc(ctx->dataFH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 477,483 ****
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
--- 477,483 ----
  
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n",
  					 strerror(errno));
  
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 524,531 ****
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 						 fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
--- 524,531 ----
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
! 			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 						  fname, strerror(errno));
  		ctx->dataFH = tocFH;
  
  		/*
*************** _CloseArchive(ArchiveHandle *AH)
*** 538,545 ****
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n",
! 						 strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
--- 538,545 ----
  		AH->format = archDirectory;
  		WriteToc(AH);
  		if (cfclose(tocFH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n",
! 						  strerror(errno));
  		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 568,575 ****
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 fname, strerror(errno));
  }
  
  /*
--- 568,575 ----
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
  	if (ctx->blobsTocFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  fname, strerror(errno));
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 588,594 ****
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
--- 588,594 ----
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  
  	if (ctx->dataFH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 611,617 ****
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		die_horribly(AH, modulename, "could not write to blobs TOC file\n");
  }
  
  /*
--- 611,617 ----
  	/* register the blob in blobs.toc */
  	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
  	if (cfwrite(buf, len, ctx->blobsTocFH) != len)
! 		exit_horribly(modulename, "could not write to blobs TOC file\n");
  }
  
  /*
*************** prependDirectory(ArchiveHandle *AH, cons
*** 667,673 ****
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		die_horribly(AH, modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
--- 667,673 ----
  	dname = ctx->directory;
  
  	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
! 		exit_horribly(modulename, "path name too long: %s", dname);
  
  	strcpy(buf, dname);
  	strcat(buf, "/");
diff --git a/src/bin/pg_dump/pg_backup_files.c b/src/bin/pg_dump/pg_backup_files.c
index a7fd91d..d765838 100644
*** a/src/bin/pg_dump/pg_backup_files.c
--- b/src/bin/pg_dump/pg_backup_files.c
*************** InitArchiveFmt_Files(ArchiveHandle *AH)
*** 169,175 ****
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
--- 169,175 ----
  		ReadToc(AH);
  		/* Nothing else in the file... */
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  	}
  }
  
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 259,266 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open output file \"%s\": %s\n",
! 					 tctx->filename, strerror(errno));
  }
  
  static size_t
--- 259,266 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
! 					  tctx->filename, strerror(errno));
  }
  
  static size_t
*************** _EndData(ArchiveHandle *AH, TocEntry *te
*** 280,286 ****
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
--- 280,286 ----
  
  	/* Close the file */
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close data file\n");
  
  	tctx->FH = NULL;
  }
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 304,310 ****
  #endif
  
  	if (AH->FH == NULL)
! 		die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
--- 304,310 ----
  #endif
  
  	if (AH->FH == NULL)
! 		exit_horribly(modulename, "could not open input file \"%s\": %s\n",
  					 filename, strerror(errno));
  
  	while ((cnt = GZREAD(buf, 1, 4095, AH->FH)) > 0)
*************** _PrintFileData(ArchiveHandle *AH, char *
*** 314,320 ****
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		die_horribly(AH, modulename, "could not close data file after reading\n");
  }
  
  
--- 314,320 ----
  	}
  
  	if (GZCLOSE(AH->FH) != 0)
! 		exit_horribly(modulename, "could not close data file after reading\n");
  }
  
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 376,382 ****
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
--- 376,382 ----
  	ctx->blobToc = fopen("blobs.toc", PG_BINARY_R);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename, "could not open large object TOC for input: %s\n", strerror(errno));
  
  	_getBlobTocEntry(AH, &oid, fname);
  
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 389,395 ****
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
--- 389,395 ----
  	}
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  
  	EndRestoreBlobs(AH);
  }
*************** _WriteByte(ArchiveHandle *AH, const int
*** 401,407 ****
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		die_horribly(AH, modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
--- 401,407 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  
  	if (fputc(i, AH->FH) == EOF)
! 		exit_horribly(modulename, "could not write byte\n");
  
  	ctx->filePos += 1;
  
*************** _ReadByte(ArchiveHandle *AH)
*** 416,422 ****
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
--- 416,422 ----
  
  	res = getc(AH->FH);
  	if (res == EOF)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return res;
  }
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 429,435 ****
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		die_horribly(AH, modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
--- 429,435 ----
  
  	res = fwrite(buf, 1, len, AH->FH);
  	if (res != len)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  
  	ctx->filePos += res;
  	return res;
*************** _CloseArchive(ArchiveHandle *AH)
*** 454,460 ****
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			die_horribly(AH, modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
--- 454,460 ----
  		WriteHead(AH);
  		WriteToc(AH);
  		if (fclose(AH->FH) != 0)
! 			exit_horribly(modulename, "could not close TOC file: %s\n", strerror(errno));
  		WriteDataChunks(AH);
  	}
  
*************** _StartBlobs(ArchiveHandle *AH, TocEntry
*** 486,492 ****
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		die_horribly(AH, modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
--- 486,492 ----
  	ctx->blobToc = fopen(fname, PG_BINARY_W);
  
  	if (ctx->blobToc == NULL)
! 		exit_horribly(modulename,
  		"could not open large object TOC for output: %s\n", strerror(errno));
  }
  
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 507,513 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 507,513 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 526,532 ****
  #endif
  
  	if (tctx->FH == NULL)
! 		die_horribly(AH, modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
--- 526,532 ----
  #endif
  
  	if (tctx->FH == NULL)
! 		exit_horribly(modulename, "could not open large object file \"%s\" for input: %s\n",
  					 fname, strerror(errno));
  }
  
*************** _EndBlob(ArchiveHandle *AH, TocEntry *te
*** 541,547 ****
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		die_horribly(AH, modulename, "could not close large object file\n");
  }
  
  /*
--- 541,547 ----
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  
  	if (GZCLOSE(tctx->FH) != 0)
! 		exit_horribly(modulename, "could not close large object file\n");
  }
  
  /*
*************** _EndBlobs(ArchiveHandle *AH, TocEntry *t
*** 558,562 ****
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		die_horribly(AH, modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
--- 558,562 ----
  	/* WriteInt(AH, 0); */
  
  	if (fclose(ctx->blobToc) != 0)
! 		exit_horribly(modulename, "could not close large object TOC file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_backup_null.c b/src/bin/pg_dump/pg_backup_null.c
index 201f0d9..ba1e461 100644
*** a/src/bin/pg_dump/pg_backup_null.c
--- b/src/bin/pg_dump/pg_backup_null.c
*************** InitArchiveFmt_Null(ArchiveHandle *AH)
*** 74,80 ****
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		die_horribly(AH, NULL, "this format cannot be read\n");
  }
  
  /*
--- 74,80 ----
  	 * Now prevent reading...
  	 */
  	if (AH->mode == archModeRead)
! 		exit_horribly(NULL, "this format cannot be read\n");
  }
  
  /*
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 149,155 ****
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		die_horribly(AH, NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
--- 149,155 ----
  	bool		old_blob_style = (AH->version < K_VERS_1_12);
  
  	if (oid == 0)
! 		exit_horribly(NULL, "invalid OID for large object\n");
  
  	/* With an old archive we must do drop and create logic here */
  	if (old_blob_style && AH->ropt->dropSchema)
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 4823ede..451c957 100644
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 355,361 ****
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				die_horribly(AH, modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
--- 355,361 ----
  				 * Couldn't find the requested file. Future: do SEEK(0) and
  				 * retry.
  				 */
! 				exit_horribly(modulename, "could not find file \"%s\" in archive\n", filename);
  			}
  			else
  			{
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 369,375 ****
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			die_horribly(AH, modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
--- 369,375 ----
  		if (AH->compression == 0)
  			tm->nFH = ctx->tarFH;
  		else
! 			exit_horribly(modulename, "compression is not supported by tar archive format\n");
  		/* tm->zFH = gzdopen(dup(fileno(ctx->tarFH)), "rb"); */
  #else
  		tm->nFH = ctx->tarFH;
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 411,417 ****
  #endif
  
  		if (tm->tmpFH == NULL)
! 			die_horribly(AH, modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
--- 411,417 ----
  #endif
  
  		if (tm->tmpFH == NULL)
! 			exit_horribly(modulename, "could not generate temporary file name: %s\n", strerror(errno));
  
  #ifdef HAVE_LIBZ
  
*************** tarOpen(ArchiveHandle *AH, const char *f
*** 420,426 ****
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				die_horribly(AH, modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
--- 420,426 ----
  			sprintf(fmode, "wb%d", AH->compression);
  			tm->zFH = gzdopen(dup(fileno(tm->tmpFH)), fmode);
  			if (tm->zFH == NULL)
! 				exit_horribly(modulename, "could not open temporary file\n");
  		}
  		else
  			tm->nFH = tm->tmpFH;
*************** tarClose(ArchiveHandle *AH, TAR_MEMBER *
*** 447,453 ****
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			die_horribly(AH, modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
--- 447,453 ----
  	 */
  	if (AH->compression != 0)
  		if (GZCLOSE(th->zFH) != 0)
! 			exit_horribly(modulename, "could not close tar member\n");
  
  	if (th->mode == 'w')
  		_tarAddFile(AH, th);	/* This will close the temp file */
*************** _tarReadRaw(ArchiveHandle *AH, void *buf
*** 547,553 ****
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			die_horribly(AH, modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
--- 547,553 ----
  				res = fread(&((char *) buf)[used], 1, len, th->nFH);
  		}
  		else
! 			exit_horribly(modulename, "internal error -- neither th nor fh specified in tarReadRaw()\n");
  	}
  
  	ctx->tarFHpos += res + used;
*************** tarWrite(const void *buf, size_t len, TA
*** 584,591 ****
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		die_horribly(th->AH, modulename,
! 					 "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
--- 584,591 ----
  		res = fwrite(buf, 1, len, th->nFH);
  
  	if (res != len)
! 		exit_horribly(modulename,
! 					  "could not write to output file: %s\n", strerror(errno));
  
  	th->pos += res;
  	return res;
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 672,679 ****
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
--- 672,679 ----
  		 * we search the string for it in a paranoid sort of way.
  		 */
  		if (strncmp(tmpCopy, "copy ", 5) != 0)
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"copy\" in string \"%s\"\n", tmpCopy);
  
  		pos1 = 5;
  		for (pos1 = 5; pos1 < strlen(tmpCopy); pos1++)
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 690,698 ****
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			die_horribly(AH, modulename,
! 						 "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						 tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
--- 690,698 ----
  				break;
  
  		if (pos2 >= strlen(tmpCopy))
! 			exit_horribly(modulename,
! 						  "invalid COPY statement -- could not find \"from stdin\" in string \"%s\" starting at position %lu\n",
! 						  tmpCopy, (unsigned long) pos1);
  
  		ahwrite(tmpCopy, 1, pos2, AH);	/* 'copy "table" [with oids]' */
  		ahprintf(AH, " from '$$PATH$$/%s' %s", tctx->filename, &tmpCopy[pos2 + 10]);
*************** _ReadByte(ArchiveHandle *AH)
*** 784,790 ****
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		die_horribly(AH, modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
--- 784,790 ----
  
  	res = tarRead(&c, 1, ctx->FH);
  	if (res != 1)
! 		exit_horribly(modulename, "unexpected end of file\n");
  	ctx->filePos += 1;
  	return c;
  }
*************** _CloseArchive(ArchiveHandle *AH)
*** 878,884 ****
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				die_horribly(AH, modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
--- 878,884 ----
  		for (i = 0; i < 512; i++)
  		{
  			if (fputc(0, ctx->tarFH) == EOF)
! 				exit_horribly(modulename,
  					   "could not write null block at end of tar archive\n");
  		}
  	}
*************** _StartBlob(ArchiveHandle *AH, TocEntry *
*** 934,940 ****
  	char	   *sfx;
  
  	if (oid == 0)
! 		die_horribly(AH, modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
--- 934,940 ----
  	char	   *sfx;
  
  	if (oid == 0)
! 		exit_horribly(modulename, "invalid OID for large object (%u)\n", oid);
  
  	if (AH->compression != 0)
  		sfx = ".gz";
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1077,1083 ****
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		die_horribly(AH, modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
--- 1077,1083 ----
  	 * because pgoff_t can't exceed the compared maximum on their platform.
  	 */
  	if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
! 		exit_horribly(modulename, "archive member too large for tar format\n");
  
  	_tarWriteHeader(th);
  
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1085,1099 ****
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			die_horribly(AH, modulename,
! 						 "could not write to output file: %s\n",
! 						 strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		die_horribly(AH, modulename, "could not close temporary file: %s\n",
! 					 strerror(errno));
  
  	if (len != th->fileLen)
  	{
--- 1085,1099 ----
  	{
  		res = fwrite(buf, 1, cnt, th->tarFH);
  		if (res != cnt)
! 			exit_horribly(modulename,
! 						  "could not write to output file: %s\n",
! 						  strerror(errno));
  		len += res;
  	}
  
  	if (fclose(tmp) != 0)		/* This *should* delete it... */
! 		exit_horribly(modulename, "could not close temporary file: %s\n",
! 					  strerror(errno));
  
  	if (len != th->fileLen)
  	{
*************** _tarAddFile(ArchiveHandle *AH, TAR_MEMBE
*** 1102,1116 ****
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		die_horribly(AH, modulename, "actual file length (%s) does not match expected (%s)\n",
! 					 buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			die_horribly(AH, modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
--- 1102,1116 ----
  
  		snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) len);
  		snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) th->fileLen);
! 		exit_horribly(modulename, "actual file length (%s) does not match expected (%s)\n",
! 					  buf1, buf2);
  	}
  
  	pad = ((len + 511) & ~511) - len;
  	for (i = 0; i < pad; i++)
  	{
  		if (fputc('\0', th->tarFH) == EOF)
! 			exit_horribly(modulename, "could not output padding at end of tar member\n");
  	}
  
  	ctx->tarFHpos += len + pad;
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1159,1165 ****
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
--- 1159,1165 ----
  	if (!_tarGetHeader(AH, th))
  	{
  		if (filename)
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  		else
  		{
  			/*
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1177,1185 ****
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			die_horribly(AH, modulename, "restoring data out of order is not supported in this archive format: "
! 						 "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						 th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
--- 1177,1185 ----
  
  		id = atoi(th->targetFile);
  		if ((TocIDRequired(AH, id, AH->ropt) & REQ_DATA) != 0)
! 			exit_horribly(modulename, "restoring data out of order is not supported in this archive format: "
! 						  "\"%s\" is required, but comes before \"%s\" in the archive file.\n",
! 						  th->targetFile, filename);
  
  		/* Header doesn't match, so read to next header */
  		len = ((th->fileLen + 511) & ~511);		/* Padded length */
*************** _tarPositionTo(ArchiveHandle *AH, const
*** 1189,1195 ****
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			die_horribly(AH, modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
--- 1189,1195 ----
  			_tarReadRaw(AH, &header[0], 512, NULL, ctx->tarFH);
  
  		if (!_tarGetHeader(AH, th))
! 			exit_horribly(modulename, "could not find header for file \"%s\" in tar archive\n", filename);
  	}
  
  	ctx->tarNextMember = ctx->tarFHpos + ((th->fileLen + 511) & ~511);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1222,1228 ****
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			die_horribly(AH, modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
--- 1222,1228 ----
  
  			snprintf(buf1, sizeof(buf1), INT64_FORMAT, (int64) ftello(ctx->tarFH));
  			snprintf(buf2, sizeof(buf2), INT64_FORMAT, (int64) ftello(ctx->tarFHpos));
! 			exit_horribly(modulename,
  			  "mismatch in actual vs. predicted file position (%s vs. %s)\n",
  						 buf1, buf2);
  		}
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1237,1247 ****
  			return 0;
  
  		if (len != 512)
! 			die_horribly(AH, modulename,
! 						 ngettext("incomplete tar header found (%lu byte)\n",
! 								  "incomplete tar header found (%lu bytes)\n",
! 								  len),
! 						 (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
--- 1237,1247 ----
  			return 0;
  
  		if (len != 512)
! 			exit_horribly(modulename,
! 						  ngettext("incomplete tar header found (%lu byte)\n",
! 								   "incomplete tar header found (%lu bytes)\n",
! 								   len),
! 						  (unsigned long) len);
  
  		/* Calc checksum */
  		chk = _tarChecksum(h);
*************** _tarGetHeader(ArchiveHandle *AH, TAR_MEM
*** 1285,1294 ****
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		die_horribly(AH, modulename,
! 					 "corrupt tar header found in %s "
! 					 "(expected %d, computed %d) file position %s\n",
! 					 tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
--- 1285,1294 ----
  		char		buf[100];
  
  		snprintf(buf, sizeof(buf), INT64_FORMAT, (int64) ftello(ctx->tarFH));
! 		exit_horribly(modulename,
! 					  "corrupt tar header found in %s "
! 					  "(expected %d, computed %d) file position %s\n",
! 					  tag, sum, chk, buf);
  	}
  
  	th->targetFile = pg_strdup(tag);
*************** _tarWriteHeader(TAR_MEMBER *th)
*** 1379,1383 ****
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		die_horribly(th->AH, modulename, "could not write to output file: %s\n", strerror(errno));
  }
--- 1379,1383 ----
  	}
  
  	if (fwrite(h, 1, 512, th->tarFH) != 512)
! 		exit_horribly(modulename, "could not write to output file: %s\n", strerror(errno));
  }
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b0a5ff..089c98f 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	serializable_deferrable = 0;
*** 144,150 ****
  
  
  static void help(const char *progname);
- static void pgdump_cleanup_at_exit(int code, void *arg);
  static void setup_connection(Archive *AH, const char *dumpencoding,
  				 char *use_role);
  static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
--- 144,149 ----
*************** main(int argc, char **argv)
*** 575,581 ****
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 	on_exit_nicely(pgdump_cleanup_at_exit, fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
--- 574,582 ----
  
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
! 
! 	/* Register the cleanup hook */
! 	on_exit_close_archive(fout);
  
  	if (fout == NULL)
  		exit_horribly(NULL, "could not open output file \"%s\" for writing\n", filename);
*************** help(const char *progname)
*** 837,850 ****
  }
  
  static void
- pgdump_cleanup_at_exit(int code, void *arg)
- {
- 	Archive	   *AH = (Archive *) arg;
- 
- 	DisconnectDatabase(AH);
- }
- 
- static void
  setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
  {
  	PGconn	   *conn = GetConnection(AH);
--- 838,843 ----
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index b5f4c62..edd0de9 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 384,389 ****
--- 384,396 ----
  
  	AH = OpenArchive(inputFileSpec, opts->format);
  
+ 	/*
+ 	 * We don't have a connection yet but that doesn't matter. The connection
+ 	 * is initialized to NULL and if we terminate through exit_nicely() while
+ 	 * it's still NULL, the cleanup function will just be a no-op.
+ 	 */
+ 	on_exit_close_archive(AH);
+ 
  	/* Let the archiver know how noisy to be */
  	AH->verbose = opts->verbose;
  
#49Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joachim Wieland (#48)
Re: patch for parallel pg_dump

Excerpts from Joachim Wieland's message of Tue Mar 20 08:26:52 -0300 2012:

On Tue, Mar 20, 2012 at 12:03 AM, Erik Rijkers <er@xs4all.nl> wrote:

In my hands, the patch complains:

Thanks, updated patch attached.

Applied, with some minor tweaks, thanks.

I didn't try the WIN32 compile. I hope I didn't break it (assuming it
was working in your patch).

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#50Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joachim Wieland (#34)
Re: patch for parallel pg_dump

Are you going to provide a rebased version?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#51Joachim Wieland
joe@mcknight.de
In reply to: Alvaro Herrera (#50)
Re: patch for parallel pg_dump

On Fri, Mar 23, 2012 at 11:11 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Are you going to provide a rebased version?

Yes, working on that.

#52Joachim Wieland
joe@mcknight.de
In reply to: Alvaro Herrera (#50)
1 attachment(s)
Re: patch for parallel pg_dump

On Fri, Mar 23, 2012 at 11:11 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Are you going to provide a rebased version?

Rebased version attached, this patch also includes Robert's earlier suggestions.

Attachments:

parallel_pg_dump_5.diff.gz (application/x-gzip) — Download
flw�]��c�H�G�6���f�L�=�|L<*'��p�9�����wA����.��l�p��+5��n��\�k�9{�}^�����5��e
/��Q0U����nm��i�p
M�����vzY�i��S3b+�����m
���������{��>���<y��e�riu���aLH��i�x����n��[��Nboab��T"b���!�4��E��M�$^��������r~���������x��-8FN��z� }�}��vNo�j�+�>���d�vu+�4��
$\K��t*���p��|���](J-�u�#I�'@��q��)��?A[+/n����
�~YiPt�=�4�����L�����|���s�}���c����v8E
��-?~�?�{�=p���K���_I�������,�$�*-����t:�L���l�__0��SHW���C��h�n�m���<�^/M4lv@
��`�1����<LM�����[�I��e�
���n���;��}
pL��L=�x��$e�8
"`=@ �x*c�g�5��o{Z�q�7��q��>
#53Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#52)
Re: patch for parallel pg_dump

On Sun, Mar 25, 2012 at 10:50 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Fri, Mar 23, 2012 at 11:11 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Are you going to provide a rebased version?

Rebased version attached, this patch also includes Robert's earlier suggestions.

I keep hoping someone who knows Windows is going to take a look at
this, but so far no luck. It could also really use some attention
from someone who has an actual really big database handy, to see how
successful it is in reducing the dump time. Without those things, I
can't see this getting committed. But in the meantime, a few fairly
minor comments based on reading the code.

I'm wondering if we really need this much complexity around shutting
down workers. I'm not sure I understand why we need both a "hard" and
a "soft" method of shutting them down. At least on non-Windows
systems, it seems like it would be entirely sufficient to just send a
SIGTERM when you want them to die. They don't even need to catch it;
they can just die. You could also set things up so that if the
connection to the parent process is closed, the worker exits. Then,
during normal shutdown, you don't need to kill them at all. The
master can simply exit, and the child processes will follow suit. The
checkAborting stuff all goes away.

The existing coding of on_exit_nicely is intentional, and copied from
similar logic for on_shmem_exit in the backend. Is there really a
compelling reason to reverse the firing order of exit hooks?

On my system:

parallel.c: In function ‘WaitForTerminatingWorkers’:
parallel.c:275: warning: ‘slot’ may be used uninitialized in this function
make: *** [parallel.o] Error 1

Which actually looks like a semi-legitimate gripe.

+       if (numWorkers > MAXIMUM_WAIT_OBJECTS)
+       {
+               fprintf(stderr, _("%s: invalid number of parallel jobs\n"), progname);
+               exit(1);
+       }

I think this error message could be more clear. How about "maximum
number of parallel jobs is %d"?

+void _SetupWorker(Archive *AHX, RestoreOptions *ropt) {}

Thankfully, this bit in pg_dumpall.c appears to be superfluous. I
hope this is just a holdover from an earlier version that we can lose.

-                                         const char *modulename, const char *fmt,...)
+                                         const char *modulename, const char *fmt,...)

Useless hunk.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#54Alvaro Herrera
alvherre@commandprompt.com
In reply to: Robert Haas (#53)
Re: patch for parallel pg_dump

Excerpts from Robert Haas's message of Wed Mar 28 14:46:30 -0300 2012:

I keep hoping someone who knows Windows is going to take a look at
this, but so far no luck. It could also really use some attention
from someone who has an actual really big database handy, to see how
successful it is in reducing the dump time. Without those things, I
can't see this getting committed. But in the meantime, a few fairly
minor comments based on reading the code.

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and it's voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch. That
would be mostly mechanical. Then the interesting stuff would apply on
top of that. That would make review easier.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#55Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#54)
Re: patch for parallel pg_dump

On Wed, Mar 28, 2012 at 2:20 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of Wed Mar 28 14:46:30 -0300 2012:

I keep hoping someone who knows Windows is going to take a look at
this, but so far no luck.  It could also really use some attention
from someone who has an actual really big database handy, to see how
successful it is in reducing the dump time.  Without those things, I
can't see this getting committed.  But in the meantime, a few fairly
minor comments based on reading the code.

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and it's voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch.  That
would be mostly mechanical.  Then the interesting stuff would apply on
top of that.  That would make review easier.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#56Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#55)
Re: patch for parallel pg_dump

On 03/28/2012 02:27 PM, Robert Haas wrote:

On Wed, Mar 28, 2012 at 2:20 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of Wed Mar 28 14:46:30 -0300 2012:

I keep hoping someone who knows Windows is going to take a look at
this, but so far no luck. It could also really use some attention
from someone who has an actual really big database handy, to see how
successful it is in reducing the dump time. Without those things, I
can't see this getting committed. But in the meantime, a few fairly
minor comments based on reading the code.

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and it's voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch. That
would be mostly mechanical. Then the interesting stuff would apply on
top of that. That would make review easier.

+1.

+1 also.

FYI I am just starting some test runs on Windows. I also have a
reasonably sized db (98 gb) on my SL6 server which should be perfect for
testing this (I just partitioned its two main tables), and will try to
get some timing runs.

cheers

andrew

#57Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#56)
Re: patch for parallel pg_dump

On 03/28/2012 03:17 PM, Andrew Dunstan wrote:

On 03/28/2012 02:27 PM, Robert Haas wrote:

On Wed, Mar 28, 2012 at 2:20 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of Wed Mar 28 14:46:30 -0300 2012:

I keep hoping someone who knows Windows is going to take a look at
this, but so far no luck. It could also really use some attention
from someone who has an actual really big database handy, to see how
successful it is in reducing the dump time. Without those things, I
can't see this getting committed. But in the meantime, a few fairly
minor comments based on reading the code.

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and it's voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch. That
would be mostly mechanical. Then the interesting stuff would apply on
top of that. That would make review easier.

+1.

+1 also.

FYI I am just starting some test runs on Windows. I also have a
reasonably sized db (98 gb) on my SL6 server which should be perfect
for testing this (I just partitioned its two main tables), and will
try to get some timing runs.

First hurdle: It doesn't build under Windows/mingw-w64:

x86_64-w64-mingw32-gcc -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g
-I../../../src/interfaces/libpq -I../../../src/include
-I./src/include/port/win32 -DEXEC_BACKEND
-I/c/prog/mingwdep/include "-I../../../src/include/port/win32" -c
-o parallel.o parallel.c
parallel.c:40:12: error: static declaration of 'pgpipe' follows
non-static declaration
../../../src/include/port.h:268:12: note: previous declaration of
'pgpipe' was here
parallel.c:41:12: error: static declaration of 'piperead' follows
non-static declaration
../../../src/include/port.h:269:12: note: previous declaration of
'piperead' was here
parallel.c: In function 'ParallelBackupStart':
parallel.c:455:9: warning: passing argument 3 of '_beginthreadex'
from incompatible pointer type
c:\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/4.5.3/../../../../x86_64-w64-mingw32/include/process.h:31:29:
note: expected 'unsigned int (*)(void *)' but argument is of type
'unsigned int (*)(struct WorkerInfo *)'
make[3]: *** [parallel.o] Error 1
make[3]: Leaving directory
`/home/andrew/bf/root/HEAD/pgsql/src/bin/pg_dump'
make[2]: *** [all-pg_dump-recurse] Error 2
make[2]: Leaving directory `/home/andrew/bf/root/HEAD/pgsql/src/bin'
make[1]: *** [all-bin-recurse] Error 2
make[1]: Leaving directory `/home/andrew/bf/root/HEAD/pgsql/src'
make: *** [all-src-recurse] Error 2

I'll have a look at that in a little while.

cheers

andrew


cheers

andrew

#58Joachim Wieland
joe@mcknight.de
In reply to: Andrew Dunstan (#57)
Re: patch for parallel pg_dump

On Wed, Mar 28, 2012 at 5:19 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

First hurdle: It doesn't build under Windows/mingw-w64:

  parallel.c:40:12: error: static declaration of 'pgpipe' follows
  non-static declaration

Strange, I'm not seeing this but I'm building with VC2005. What
happens is that you're pulling in the pgpipe.h header. I have moved
these functions as static functions into pg_dump since you voted for
removing them from the other location (because as it turned out,
nobody else is currently using them).

Joachim

#59Andrew Dunstan
andrew@dunslane.net
In reply to: Joachim Wieland (#58)
Re: patch for parallel pg_dump

On 03/28/2012 08:28 PM, Joachim Wieland wrote:

On Wed, Mar 28, 2012 at 5:19 PM, Andrew Dunstan<andrew@dunslane.net> wrote:

First hurdle: It doesn't build under Windows/mingw-w64:

parallel.c:40:12: error: static declaration of 'pgpipe' follows
non-static declaration

Strange, I'm not seeing this but I'm building with VC2005. What
happens is that you're pulling in the pgpipe.h header. I have moved
these functions as static functions into pg_dump since you voted for
removing them from the other location (because as it turned out,
nobody else is currently using them).

But your patch hasn't got rid of them, and so it's declared twice. There
is no pgpipe.h, BTW, it's declared in port.h. If VC2005 doesn't complain
about the double declaration then that's a bug in the compiler, IMNSHO.
Doesn't it even issue a warning? I no longer use VC2005 for anything,
BTW, I use VC2008 or later.

Anyway, ISTM the best thing is just for us to get rid of pgpipe without
further ado. I'll try to get a patch together for that.

cheers

andrew


Joachim

#60Joachim Wieland
joe@mcknight.de
In reply to: Andrew Dunstan (#59)
Re: patch for parallel pg_dump

On Thu, Mar 29, 2012 at 2:46 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

But your patch hasn't got rid of them, and so it's declared twice. There is
no pgpipe.h, BTW, it's declared in port.h. If VC2005 doesn't complain about
the double declaration then that's a bug in the compiler, IMNSHO. Doesn't it
even issue a warning? I no longer use VC2005 for anything, BTW, I use VC2008
or later.

I agree, the compiler should have found it, but no, I don't even get a
warning. I just verified it and when I add a #error right after the
prototypes in port.h then it hits the #error on every file, so it
definitely sees both prototypes and doesn't complain... cl.exe is
running with /W3.

Anyway, ISTM the best thing is just for us to get rid of pgpipe without
further ado. I'll try to get a patch together for that.

Thanks.

#61Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#53)
Re: patch for parallel pg_dump

On Wed, Mar 28, 2012 at 1:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I'm wondering if we really need this much complexity around shutting
down workers.  I'm not sure I understand why we need both a "hard" and
a "soft" method of shutting them down.  At least on non-Windows
systems, it seems like it would be entirely sufficient to just send a
SIGTERM when you want them to die.  They don't even need to catch it;
they can just die.

At least on my Linux test system, even if all pg_dump processes are
gone, the server happily continues sending data. When I strace an
individual backend process, I see a lot of Broken pipe writes, but
that doesn't stop it from just writing out the whole table to a closed
file descriptor. This is a 9.0-latest server.

--- SIGPIPE (Broken pipe) @ 0 (0) ---
read(13, "\220\370\0\0\240\240r\266\3\0\4\0\264\1\320\1\0 \4
\0\0\0\0\270\237\220\0h\237\230\0"..., 8192) = 8192
read(13, "\220\370\0\0\350\300r\266\3\0\4\0\264\1\320\1\0 \4
\0\0\0\0\260\237\230\0h\237\220\0"..., 8192) = 8192
sendto(7, "d\0\0\0Acpp\t15.00000\t1245240000\taut"..., 8192, 0, NULL,
0) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
read(13, "\220\370\0\0000\341r\266\3\0\5\0\260\1\340\1\0 \4
\0\0\0\0\270\237\220\0p\237\220\0"..., 8192) = 8192
sendto(7, "d\0\0\0Dcpp\t15.00000\t1245672000\taut"..., 8192, 0, NULL,
0) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
read(13,  <unfinished ...>

I guess that https://commitfest.postgresql.org/action/patch_view?id=663
would take care of that in the future, but without it (i.e anything <=
9.2) it's quite annoying if you want to Ctrl-C a pg_dump and then have
to manually hunt down and kill all the backend processes.

I tested the above with immediately returning from
DisconnectDatabase() in pg_backup_db.c on Linux. The important thing
is that it calls PQcancel() on its connection before terminating.

On Windows, several pages indicate that you can only cleanly terminate
a thread from within the thread, e.g. the last paragraph on

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686724%28v=vs.85%29.aspx

The patch is basically doing what this page is recommending.

I'll try your proposal about terminating in the child when the
connection is closed, that sounds reasonable and I don't see an
immediate problem with that.

The existing coding of on_exit_nicely is intentional, and copied from
similar logic for on_shmem_exit in the backend. Is there really a
compelling reason to reverse the firing order of exit hooks?

No, reversing the order was not intended. I rewrote it to a for loop
because the current implementation modifies a global variable and so
on Windows only one thread would call the exit hook.

I'll add all your other suggestions to the next version of my patch. Thanks!

Joachim

#62Andrew Dunstan
andrew@dunslane.net
In reply to: Joachim Wieland (#60)
Re: patch for parallel pg_dump

On 03/28/2012 09:12 PM, Joachim Wieland wrote:

On Thu, Mar 29, 2012 at 2:46 AM, Andrew Dunstan<andrew@dunslane.net> wrote:

But your patch hasn't got rid of them, and so it's declared twice. There is
no pgpipe.h, BTW, it's declared in port.h. If VC2005 doesn't complain about
the double declaration then that's a bug in the compiler, IMNSHO. Doesn't it
even issue a warning? I no longer use VC2005 for anything, BTW, I use VC2008
or later.

I agree, the compiler should have found it, but no, I don't even get a
warning. I just verified it and when I add a #error right after the
prototypes in port.h then it hits the #error on every file, so it
definitely sees both prototypes and doesn't complain... cl.exe is
running with /W3.

src/bin/pg_dump/pg_backup_archiver.c

Anyway, ISTM the best thing is just for us to get rid of pgpipe without
further ado. I'll try to get a patch together for that.

Thanks.

OK, I have committed this. Following that, you need to add a couple of
macros for pipewrite to parallel.c. There's also a missing cast in the
call to beginthreadex().

I'll continue testing tomorrow.

cheers

andrew

#63Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#61)
Re: patch for parallel pg_dump

On Wed, Mar 28, 2012 at 9:54 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Wed, Mar 28, 2012 at 1:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I'm wondering if we really need this much complexity around shutting
down workers.  I'm not sure I understand why we need both a "hard" and
a "soft" method of shutting them down.  At least on non-Windows
systems, it seems like it would be entirely sufficient to just send a
SIGTERM when you want them to die.  They don't even need to catch it;
they can just die.

At least on my Linux test system, even if all pg_dump processes are
gone, the server happily continues sending data. When I strace an
individual backend process, I see a lot of Broken pipe writes, but
that doesn't stop it from just writing out the whole table to a closed
file descriptor. This is a 9.0-latest server.

Wow, yuck. At least now I understand why you're doing it like that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#64Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#62)
1 attachment(s)
Re: patch for parallel pg_dump

On 03/28/2012 11:43 PM, Andrew Dunstan wrote:

On 03/28/2012 09:12 PM, Joachim Wieland wrote:

On Thu, Mar 29, 2012 at 2:46 AM, Andrew Dunstan<andrew@dunslane.net>
wrote:

But your patch hasn't got rid of them, and so it's declared twice.
There is
no pgpipe.h, BTW, it's declared in port.h. If VC2005 doesn't
complain about
the double declaration then that's a bug in the compiler, IMNSHO.
Doesn't it
even issue a warning? I no longer use VC2005 for anything, BTW, I
use VC2008
or later.

I agree, the compiler should have found it, but no, I don't even get a
warning. I just verified it and when I add a #error right after the
prototypes in port.h then it hits the #error on every file, so it
definitely sees both prototypes and doesn't complain... cl.exe is
running with /W3.

src/bin/pg_dump/pg_backup_archiver.c

Anyway, ISTM the best thing is just for us to get rid of pgpipe without
further ado. I'll try to get a patch together for that.

Thanks.

OK, I have committed this. Following that, you need to add a couple of
macros for pipewrite to parallel.c. There's also a missing cast in the
call to beginthreadex().

I'll continue testing tomorrow.

Here is an updated patch that builds now that pgpipe has been removed.

cheers

andrew

Attachments:

parallel_pgdump_nopgpipe.patch (text/x-patch)
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
***************
*** 20,26 **** override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
--- 20,27 ----
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o \
! 	parallel.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
***************
*** 55,60 ****
--- 55,61 ----
  #include "compress_io.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  /*----------------------
   * Compressor API
***************
*** 182,187 **** size_t
--- 183,191 ----
  WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
  				   const void *data, size_t dLen)
  {
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	switch (cs->comprAlg)
  	{
  		case COMPR_ALG_LIBZ:
***************
*** 351,356 **** ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
--- 355,363 ----
  	/* no minimal chunk size for zlib */
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		zp->next_in = (void *) buf;
  		zp->avail_in = cnt;
  
***************
*** 411,416 **** ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
--- 418,426 ----
  
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		ahwrite(buf, 1, cnt, AH);
  	}
  
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <ctype.h>
  
+ #include "dumpmem.h"
  #include "dumputils.h"
  #include "pg_backup.h"
  
***************
*** 35,40 **** static struct
--- 36,42 ----
  } on_exit_nicely_list[MAX_ON_EXIT_NICELY];
  
  static int on_exit_nicely_index;
+ void (*on_exit_msg_func)(const char *modulename, const char *fmt, va_list ap) = vwrite_msg;
  
  #define supports_grant_options(version) ((version) >= 70400)
  
***************
*** 45,50 **** static bool parseAclItem(const char *item, const char *type,
--- 47,53 ----
  static char *copyAclUserName(PQExpBuffer output, char *input);
  static void AddAcl(PQExpBuffer aclbuf, const char *keyword,
  	   const char *subname);
+ static PQExpBuffer getThreadLocalPQExpBuffer(void);
  
  #ifdef WIN32
  static bool parallel_init_done = false;
***************
*** 66,80 **** init_parallel_dump_utils(void)
  }
  
  /*
!  *	Quotes input string if it's not a legitimate SQL identifier as-is.
!  *
!  *	Note that the returned string must be used before calling fmtId again,
!  *	since we re-use the same return buffer each time.  Non-reentrant but
!  *	reduces memory leakage. (On Windows the memory leakage will be one buffer
!  *	per thread, which is at least better than one per call).
   */
! const char *
! fmtId(const char *rawid)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
--- 69,79 ----
  }
  
  /*
!  * Non-reentrant but reduces memory leakage. (On Windows the memory leakage
!  * will be one buffer per thread, which is at least better than one per call).
   */
! static PQExpBuffer
! getThreadLocalPQExpBuffer(void)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
***************
*** 83,91 **** fmtId(const char *rawid)
  	static PQExpBuffer s_id_return = NULL;
  	PQExpBuffer id_return;
  
- 	const char *cp;
- 	bool		need_quotes = false;
- 
  #ifdef WIN32
  	if (parallel_init_done)
  		id_return = (PQExpBuffer) TlsGetValue(tls_index);		/* 0 when not set */
--- 82,87 ----
***************
*** 112,120 **** fmtId(const char *rawid)
  #else
  		s_id_return = id_return;
  #endif
- 
  	}
  
  	/*
  	 * These checks need to match the identifier production in scan.l. Don't
  	 * use islower() etc.
--- 108,132 ----
  #else
  		s_id_return = id_return;
  #endif
  	}
  
+ 	return id_return;
+ }
+ 
+ /*
+  *	Quotes input string if it's not a legitimate SQL identifier as-is.
+  *
+  *	Note that the returned string must be used before calling fmtId again,
+  *	since we re-use the same return buffer each time.
+  */
+ const char *
+ fmtId(const char *rawid)
+ {
+ 	PQExpBuffer id_return = getThreadLocalPQExpBuffer();
+ 
+ 	const char *cp;
+ 	bool		need_quotes = false;
+ 
  	/*
  	 * These checks need to match the identifier production in scan.l. Don't
  	 * use islower() etc.
***************
*** 182,187 **** fmtId(const char *rawid)
--- 194,228 ----
  	return id_return->data;
  }
  
+ /*
+  * fmtQualifiedId - convert a qualified name to the proper format for
+  * the source database.
+  *
+  * Like fmtId, use the result before calling again.
+  *
+  * Since we call fmtId and it also uses getThreadLocalPQExpBuffer() we cannot
+  * use it until we're finished with calling fmtId().
+  */
+ const char *
+ fmtQualifiedId(int remoteVersion, const char *schema, const char *id)
+ {
+ 	PQExpBuffer id_return;
+ 	PQExpBuffer lcl_pqexp = createPQExpBuffer();
+ 
+ 	/* Suppress schema name if fetching from pre-7.3 DB */
+ 	if (remoteVersion >= 70300 && schema && *schema)
+ 	{
+ 		appendPQExpBuffer(lcl_pqexp, "%s.", fmtId(schema));
+ 	}
+ 	appendPQExpBuffer(lcl_pqexp, "%s", fmtId(id));
+ 
+ 	id_return = getThreadLocalPQExpBuffer();
+ 
+ 	appendPQExpBuffer(id_return, "%s", lcl_pqexp->data);
+ 	destroyPQExpBuffer(lcl_pqexp);
+ 
+ 	return id_return->data;
+ }
  
  /*
   * Convert a string value to an SQL string literal and append it to
***************
*** 1269,1275 **** exit_horribly(const char *modulename, const char *fmt,...)
  	va_list		ap;
  
  	va_start(ap, fmt);
! 	vwrite_msg(modulename, fmt, ap);
  	va_end(ap);
  
  	exit_nicely(1);
--- 1310,1316 ----
  	va_list		ap;
  
  	va_start(ap, fmt);
! 	on_exit_msg_func(modulename, fmt, ap);
  	va_end(ap);
  
  	exit_nicely(1);
***************
*** 1319,1327 **** on_exit_nicely(on_exit_nicely_callback function, void *arg)
  void
  exit_nicely(int code)
  {
! 	while (--on_exit_nicely_index >= 0)
! 		(*on_exit_nicely_list[on_exit_nicely_index].function)(code,
! 			on_exit_nicely_list[on_exit_nicely_index].arg);
  #ifdef WIN32
  	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
  		ExitThread(code);
--- 1360,1369 ----
  void
  exit_nicely(int code)
  {
! 	int i;
! 	for (i = 0; i < on_exit_nicely_index; i++)
! 		(*on_exit_nicely_list[i].function)(code,
! 			on_exit_nicely_list[i].arg);
  #ifdef WIN32
  	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
  		ExitThread(code);
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
***************
*** 24,29 **** extern const char *progname;
--- 24,31 ----
  
  extern void init_parallel_dump_utils(void);
  extern const char *fmtId(const char *identifier);
+ extern const char *fmtQualifiedId(int remoteVersion,
+ 								  const char *schema, const char *id);
  extern void appendStringLiteral(PQExpBuffer buf, const char *str,
  					int encoding, bool std_strings);
  extern void appendStringLiteralConn(PQExpBuffer buf, const char *str,
***************
*** 60,65 **** extern void exit_horribly(const char *modulename, const char *fmt,...)
--- 62,69 ----
  				__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3), noreturn));
  extern void set_section (const char *arg, int *dumpSections);
  
+ extern void (*on_exit_msg_func)(const char *modulename, const char *fmt, va_list ap)
+ 				__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 0)));
  typedef void (*on_exit_nicely_callback) (int code, void *arg);
  extern void on_exit_nicely(on_exit_nicely_callback function, void *arg);
  extern void exit_nicely(int code) __attribute__((noreturn));
*** /dev/null
--- b/src/bin/pg_dump/parallel.c
***************
*** 0 ****
--- 1,1321 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.c
+  *
+  *	Parallel support for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ #include "dumpmem.h"
+ #include "dumputils.h"
+ #include "parallel.h"
+ 
+ #ifndef WIN32
+ #include <sys/types.h>
+ #include <sys/wait.h>
+ #include "signal.h"
+ #include <unistd.h>
+ #include <fcntl.h>
+ #endif
+ 
+ #define PIPE_READ							0
+ #define PIPE_WRITE							1
+ 
+ /* file-scope variables */
+ #ifdef WIN32
+ static unsigned int	tMasterThreadId = 0;
+ static HANDLE		termEvent = INVALID_HANDLE_VALUE;
+ static int pgpipe(int handles[2]);
+ static int piperead(int s, char *buf, int len);
+ #define pipewrite(a,b,c)   send(a,b,c,0)
+ #else
+ static volatile sig_atomic_t wantAbort = 0;
+ static bool aborting = false;
+ #define pipewrite(a,b,c)   write(a,b,c)
+ #endif
+ 
+ typedef struct ShutdownInformation
+ {
+     ParallelState *pstate;
+     Archive       *AHX;
+ } ShutdownInformation;
+ 
+ static ShutdownInformation shutdown_info;
+ 
+ static const char *modulename = gettext_noop("parallel archiver");
+ 
+ static ParallelSlot *GetMyPSlot(ParallelState *pstate);
+ static void parallel_exit_msg_func(const char *modulename,
+ 								   const char *fmt, va_list ap)
+ 			__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 0)));
+ static void parallel_msg_master(ParallelSlot *slot, const char *modulename,
+ 								const char *fmt, va_list ap)
+ 			__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
+ static void archive_close_connection(int code, void *arg);
+ static void ShutdownWorkersHard(ParallelState *pstate);
+ static void ShutdownWorkersSoft(ParallelState *pstate, bool do_wait);
+ static void WaitForTerminatingWorkers(ParallelState *pstate);
+ #ifndef WIN32
+ static void sigTermHandler(int signum);
+ #endif
+ static void SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 						RestoreOptions *ropt);
+ static void PrintStatus(ParallelState *pstate);
+ static bool HasEveryWorkerTerminated(ParallelState *pstate);
+ 
+ static void lockTableNoWait(ArchiveHandle *AH, TocEntry *te);
+ static void WaitForCommands(ArchiveHandle *AH, int pipefd[2]);
+ static char *getMessageFromMaster(int pipefd[2]);
+ static void sendMessageToMaster(int pipefd[2], const char *str);
+ static int select_loop(int maxFd, fd_set *workerset);
+ static char *getMessageFromWorker(ParallelState *pstate,
+ 								  bool do_wait, int *worker);
+ static void sendMessageToWorker(ParallelState *pstate,
+ 							    int worker, const char *str);
+ static char *readMessageFromPipe(int fd);
+ 
+ #define messageStartsWith(msg, prefix) \
+ 	(strncmp(msg, prefix, strlen(prefix)) == 0)
+ #define messageEquals(msg, pattern) \
+ 	(strcmp(msg, pattern) == 0)
+ 
+ static ParallelSlot *
+ GetMyPSlot(ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->parallelSlot[i].threadId == GetCurrentThreadId())
+ #else
+ 		if (pstate->parallelSlot[i].pid == getpid())
+ #endif
+ 			return &(pstate->parallelSlot[i]);
+ 
+ 	return NULL;
+ }
+ 
+ /*
+  * This is the function that will be called from exit_horribly() to print the
+  * error message. If the worker process does exit_horribly(), we forward its
+  * last words to the master process. The master process then does exit_horribly()
+  * with this error message itself and prints it normally. After printing the
+  * message, exit_horribly() on the master will shut down the remaining worker
+  * processes.
+  */
+ static void
+ parallel_exit_msg_func(const char *modulename, const char *fmt, va_list ap)
+ {
+ 	ParallelState *pstate = shutdown_info.pstate;
+ 	ParallelSlot *slot;
+ 
+ 	Assert(pstate);
+ 
+ 	slot = GetMyPSlot(pstate);
+ 
+ 	if (!slot)
+ 		/* We're the parent, just write the message out */
+ 		vwrite_msg(modulename, fmt, ap);
+ 	else
+ 		/* If we're a worker process, send the msg to the master process */
+ 		parallel_msg_master(slot, modulename, fmt, ap);
+ }
+ 
+ /* Sends the error message from the worker to the master process */
+ static void
+ parallel_msg_master(ParallelSlot *slot, const char *modulename,
+ 					const char *fmt, va_list ap)
+ {
+ 	char		buf[512];
+ 	int			pipefd[2];
+ 
+ 	pipefd[PIPE_READ] = slot->pipeRevRead;
+ 	pipefd[PIPE_WRITE] = slot->pipeRevWrite;
+ 
+ 	strcpy(buf, "ERROR ");
+ 	vsnprintf(buf + strlen("ERROR "),
+ 			  sizeof(buf) - strlen("ERROR "), fmt, ap);
+ 
+ 	sendMessageToMaster(pipefd, buf);
+ }
+ 
+ /*
+  * pg_dump and pg_restore register the Archive pointer for the exit handler
+  * (called from exit_horribly). This function mainly exists so that we can keep
+  * shutdown_info in file scope only.
+  */
+ void
+ on_exit_close_archive(Archive *AHX)
+ {
+ 	shutdown_info.AHX = AHX;
+ 	on_exit_nicely(archive_close_connection, &shutdown_info);
+ }
+ 
+ /* This function can close archives in both the parallel and non-parallel case. */
+ static void
+ archive_close_connection(int code, void *arg)
+ {
+ 	ShutdownInformation *si = (ShutdownInformation *) arg;
+ 
+ 	if (si->pstate)
+ 	{
+ 		ParallelSlot *slot = GetMyPSlot(si->pstate);
+ 
+ 		if (!slot)
+ 		{
+ 			/*
+ 			 * We're the master: We have already printed out the message passed
+ 			 * to exit_horribly() either from the master itself or from a
+ 			 * worker process. Now we need to close our own database connection
+ 			 * (only open during parallel dump but not restore) and shut down
+ 			 * the remaining workers.
+ 			 */
+ 			DisconnectDatabase(si->AHX);
+ #ifndef WIN32
+ 			/*
+ 			 * Setting aborting to true switches to best-effort-mode
+ 			 * (send/receive but ignore errors) in communicating with our
+ 			 * workers.
+ 			 */
+ 			aborting = true;
+ #endif
+ 			ShutdownWorkersHard(si->pstate);
+ 		}
+ 		else if (slot->args->AH)
+ 			DisconnectDatabase(&(slot->args->AH->public));
+ 	}
+ 	else if (si->AHX)
+ 		DisconnectDatabase(si->AHX);
+ }
+ 
+ /*
+  * If one worker terminates for some reason, we'd like the other workers to
+  * terminate as well (and not finish their 70 GB table dump first...). On
+  * Unix we can just kill the worker processes and let the signal handler set
+  * wantAbort to 1. On Windows we set a termEvent, which serves as the signal
+  * for everyone to terminate.
+  */
+ void
+ checkAborting(ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	if (WaitForSingleObject(termEvent, 0) == WAIT_OBJECT_0)
+ #else
+ 	if (wantAbort)
+ #endif
+ 		exit_horribly(modulename, "worker is terminating\n");
+ }
+ 
+ /*
+  * Shut down any remaining workers; this has an implicit do_wait == true.
+  *
+  * The fastest way to make the workers terminate gracefully is to tell them
+  * to do so while they are listening for new commands.
+  */
+ static void
+ ShutdownWorkersHard(ParallelState *pstate)
+ {
+ #ifndef WIN32
+ 	int i;
+ 	signal(SIGPIPE, SIG_IGN);
+ 	ShutdownWorkersSoft(pstate, false);
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		kill(pstate->parallelSlot[i].pid, SIGTERM);
+ 	WaitForTerminatingWorkers(pstate);
+ #else
+ 	/* The workers monitor this event via checkAborting(). */
+ 	SetEvent(termEvent);
+ 	/* No hard shutdown on Windows, wait for the workers to terminate */
+ 	ShutdownWorkersSoft(pstate, true);
+ #endif
+ }
+ 
+ /*
+  * Performs a soft shutdown and optionally waits for every worker to terminate.
+  * A soft shutdown sends a "TERMINATE" message to every worker only.
+  */
+ static void
+ ShutdownWorkersSoft(ParallelState *pstate, bool do_wait)
+ {
+ 	int			i;
+ 
+ 	/* soft shutdown */
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_TERMINATED)
+ 		{
+ 			sendMessageToWorker(pstate, i, "TERMINATE");
+ 			pstate->parallelSlot[i].workerStatus = WRKR_WORKING;
+ 		}
+ 	}
+ 
+ 	if (!do_wait)
+ 		return;
+ 
+ 	WaitForTerminatingWorkers(pstate);
+ }
+ 
+ /*
+  * Wait for the termination of the processes using the OS-specific method.
+  */
+ static void
+ WaitForTerminatingWorkers(ParallelState *pstate)
+ {
+ 	while (!HasEveryWorkerTerminated(pstate))
+ 	{
+ 		ParallelSlot *slot = NULL;
+ 		int j;
+ #ifndef WIN32
+ 		int		status;
+ 		pid_t	pid = wait(&status);
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].pid == pid)
+ 				slot = &(pstate->parallelSlot[j]);
+ #else
+ 		uintptr_t hThread;
+ 		DWORD	ret;
+ 		uintptr_t *lpHandles = pg_malloc(sizeof(HANDLE) * pstate->numWorkers);
+ 		int nrun = 0;
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].workerStatus != WRKR_TERMINATED)
+ 			{
+ 				lpHandles[nrun] = pstate->parallelSlot[j].hThread;
+ 				nrun++;
+ 			}
+ 		ret = WaitForMultipleObjects(nrun, (HANDLE*) lpHandles, false, INFINITE);
+ 		Assert(ret != WAIT_FAILED);
+ 		hThread = lpHandles[ret - WAIT_OBJECT_0];
+ 
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].hThread == hThread)
+ 				slot = &(pstate->parallelSlot[j]);
+ 
+ 		free(lpHandles);
+ #endif
+ 		Assert(slot);
+ 
+ 		slot->workerStatus = WRKR_TERMINATED;
+ 
+ 		PrintStatus(pstate);
+ 	}
+ 	Assert(HasEveryWorkerTerminated(pstate));
+ }
+ 
+ #ifndef WIN32
+ /* Signal handling (UNIX only) */
+ static void
+ sigTermHandler(int signum)
+ {
+ 	wantAbort = 1;
+ }
+ #endif
+ 
+ /*
+  * This function is called by both UNIX and Windows variants to set up a
+  * worker process.
+  */
+ static void
+ SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 			RestoreOptions *ropt)
+ {
+ 	/*
+ 	 * In dump mode (pg_dump) this calls _SetupWorker() as defined in
+ 	 * pg_dump.c, while in restore mode (pg_restore) it calls _SetupWorker() as
+ 	 * defined in pg_restore.c.
+ 	 *
+ 	 * We keep the raw database connection only so that we can close it
+ 	 * properly on shutdown; this matters when the process is brought down
+ 	 * because of an error.
+ 	 */
+ 	_SetupWorker((Archive *) AH, ropt);
+ 
+ 	Assert(AH->connection != NULL);
+ 
+ 	WaitForCommands(AH, pipefd);
+ 
+ 	closesocket(pipefd[PIPE_READ]);
+ 	closesocket(pipefd[PIPE_WRITE]);
+ }
+ 
+ #ifdef WIN32
+ /*
+  * On Windows the _beginthreadex() function allows us to pass one parameter.
+  * Since we need to pass a few values however, we define a structure here
+  * and then pass a pointer to such a structure in _beginthreadex().
+  */
+ typedef struct {
+ 	ArchiveHandle  *AH;
+ 	RestoreOptions *ropt;
+ 	int				worker;
+ 	int				pipeRead;
+ 	int				pipeWrite;
+ } WorkerInfo;
+ 
+ static unsigned __stdcall
+ init_spawned_worker_win32(WorkerInfo *wi)
+ {
+ 	ArchiveHandle *AH;
+ 	int pipefd[2] = { wi->pipeRead, wi->pipeWrite };
+ 	int worker = wi->worker;
+ 	RestoreOptions *ropt = wi->ropt;
+ 
+ 	AH = CloneArchive(wi->AH);
+ 
+ 	free(wi);
+ 	SetupWorker(AH, pipefd, worker, ropt);
+ 
+ 	DeCloneArchive(AH);
+ 	_endthreadex(0);
+ 	return 0;
+ }
+ #endif
+ 
+ /*
+  * This function starts the parallel dump or restore by spawning off the
+  * worker processes in both Unix and Windows. On Windows it creates a number
+  * of threads, while on Unix it fork()s one process per worker.
+  */
+ ParallelState *
+ ParallelBackupStart(ArchiveHandle *AH, RestoreOptions *ropt)
+ {
+ 	ParallelState  *pstate;
+ 	int				i;
+ 	const size_t	slotSize = AH->public.numWorkers * sizeof(ParallelSlot);
+ 
+ 	Assert(AH->public.numWorkers > 0);
+ 
+ 	/* Ensure stdio state is quiesced before forking */
+ 	fflush(NULL);
+ 
+ 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
+ 
+ 	pstate->numWorkers = AH->public.numWorkers;
+ 	pstate->parallelSlot = NULL;
+ 
+ 	if (AH->public.numWorkers == 1)
+ 		return pstate;
+ 
+ 	pstate->parallelSlot = (ParallelSlot *) pg_malloc(slotSize);
+ 	memset((void *) pstate->parallelSlot, 0, slotSize);
+ 
+ 	/*
+ 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if
+ 	 * set and falls back to AHX otherwise.
+ 	 */
+ 	shutdown_info.pstate = pstate;
+ 	on_exit_msg_func = parallel_exit_msg_func;
+ 
+ #ifdef WIN32
+ 	tMasterThreadId = GetCurrentThreadId();
+ 	termEvent = CreateEvent(NULL, true, false, "Terminate");
+ #else
+ 	signal(SIGTERM, sigTermHandler);
+ 	signal(SIGINT, sigTermHandler);
+ 	signal(SIGQUIT, sigTermHandler);
+ #endif
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ #ifdef WIN32
+ 		WorkerInfo *wi;
+ 		uintptr_t	handle;
+ #else
+ 		pid_t		pid;
+ #endif
+ 		int			pipeMW[2], pipeWM[2];
+ 
+ 		if (pgpipe(pipeMW) < 0 || pgpipe(pipeWM) < 0)
+ 			exit_horribly(modulename, "could not create communication channels: %s\n",
+ 						  strerror(errno));
+ 
+ 		pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 		pstate->parallelSlot[i].args = (ParallelArgs *) pg_malloc(sizeof(ParallelArgs));
+ 		pstate->parallelSlot[i].args->AH = NULL;
+ 		pstate->parallelSlot[i].args->te = NULL;
+ #ifdef WIN32
+ 		/* Allocate a new structure for every worker */
+ 		wi = (WorkerInfo *) pg_malloc(sizeof(WorkerInfo));
+ 
+ 		wi->ropt = ropt;
+ 		wi->worker = i;
+ 		wi->AH = AH;
+ 		wi->pipeRead = pstate->parallelSlot[i].pipeRevRead = pipeMW[PIPE_READ];
+ 		wi->pipeWrite = pstate->parallelSlot[i].pipeRevWrite = pipeWM[PIPE_WRITE];
+ 
+ 		handle = _beginthreadex(NULL, 0, (void *) &init_spawned_worker_win32,
+ 								wi, 0, &(pstate->parallelSlot[i].threadId));
+ 		pstate->parallelSlot[i].hThread = handle;
+ #else
+ 		pid = fork();
+ 		if (pid == 0)
+ 		{
+ 			/* we are the worker */
+ 			int j;
+ 			int pipefd[2] = { pipeMW[PIPE_READ], pipeWM[PIPE_WRITE] };
+ 
+ 			/*
+ 			 * Store the fds for the reverse communication in pstate. We only
+ 			 * use these in case of an error and don't otherwise touch pstate
+ 			 * in the worker process. On Windows we write to the global pstate;
+ 			 * on Unix we write to our process-local copy, which is also where
+ 			 * we later retrieve this information from.
+ 			 */
+ 			pstate->parallelSlot[i].pipeRevRead = pipefd[PIPE_READ];
+ 			pstate->parallelSlot[i].pipeRevWrite = pipefd[PIPE_WRITE];
+ 			pstate->parallelSlot[i].pid = getpid();
+ 
+ 			/*
+ 			 * Call CloneArchive on Unix as well, even though technically we
+ 			 * wouldn't need to because fork() already gives us a copy in our
+ 			 * own address space. But CloneArchive resets the state information
+ 			 * and also clones the database connection (for parallel dump),
+ 			 * both of which are helpful.
+ 			 */
+ 			pstate->parallelSlot[i].args->AH = CloneArchive(AH);
+ 
+ 			closesocket(pipeWM[PIPE_READ]);		/* close read end of Worker -> Master */
+ 			closesocket(pipeMW[PIPE_WRITE]);	/* close write end of Master -> Worker */
+ 
+ 			/*
+ 			 * Close all inherited fds for communication of the master with
+ 			 * the other workers.
+ 			 */
+ 			for (j = 0; j < i; j++)
+ 			{
+ 				closesocket(pstate->parallelSlot[j].pipeRead);
+ 				closesocket(pstate->parallelSlot[j].pipeWrite);
+ 			}
+ 
+ 			SetupWorker(pstate->parallelSlot[i].args->AH, pipefd, i, ropt);
+ 
+ 			exit(0);
+ 		}
+ 		else if (pid < 0)
+ 			/* fork failed */
+ 			exit_horribly(modulename,
+ 						  "could not create worker process: %s\n",
+ 						  strerror(errno));
+ 
+ 		/* we are the Master, pid > 0 here */
+ 		Assert(pid > 0);
+ 		closesocket(pipeMW[PIPE_READ]);		/* close read end of Master -> Worker */
+ 		closesocket(pipeWM[PIPE_WRITE]);	/* close write end of Worker -> Master */
+ 
+ 		pstate->parallelSlot[i].pid = pid;
+ #endif
+ 
+ 		pstate->parallelSlot[i].pipeRead = pipeWM[PIPE_READ];
+ 		pstate->parallelSlot[i].pipeWrite = pipeMW[PIPE_WRITE];
+ 	}
+ 
+ 	return pstate;
+ }
+ 
+ /*
+  * Tell all of our workers to terminate.
+  *
+  * A pretty straightforward routine: first we tell every worker to terminate,
+  * then we listen to the workers' replies, and finally we close the sockets
+  * that we have used for communication.
+  */
+ void
+ ParallelBackupEnd(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	if (pstate->numWorkers == 1)
+ 		return;
+ 
+ 	PrintStatus(pstate);
+ 	Assert(IsEveryWorkerIdle(pstate));
+ 
+ 	/* no hard shutdown, let workers exit by themselves and wait for them */
+ 	ShutdownWorkersSoft(pstate, true);
+ 
+ 	PrintStatus(pstate);
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		closesocket(pstate->parallelSlot[i].pipeRead);
+ 		closesocket(pstate->parallelSlot[i].pipeWrite);
+ 	}
+ 
+ 	/*
+ 	 * Reset the pstate so that the exit handler in the parent falls back to
+ 	 * closing AH->connection (if connected).
+ 	 */
+ 	shutdown_info.pstate = NULL;
+ 
+ 	free(pstate->parallelSlot);
+ 	free(pstate);
+ }
+ 
+ 
+ /*
+  * The sequence is the following (for dump, similar for restore):
+  *
+  * The master process starts the parallel backup in ParallelBackupStart(); this
+  * forks the worker processes, which enter WaitForCommands().
+  *
+  * The master process dispatches an individual work item to one of the worker
+  * processes in DispatchJobForTocEntry(). It calls
+  * AH->MasterStartParallelItemPtr, a routine of the output format. This
+  * function's arguments are the parent's archive handle AH (containing the full
+  * catalog information), the TocEntry that the worker should work on and a
+  * T_Action act indicating whether this is a backup or a restore item.  The
+  * function then converts the TocEntry assignment into a string that is then
+  * sent over to the worker process. In the simplest case that would be
+  * something like "DUMP 1234", with 1234 being the TocEntry id.
+  *
+  * The worker receives the message in the routine pointed to by
+  * WorkerJobDumpPtr or WorkerJobRestorePtr. These are also pointers to
+  * corresponding routines of the respective output format, e.g.
+  * _WorkerJobDumpDirectory().
+  *
+  * Remember that we have forked off the workers only after we have read in the
+  * catalog. That's why our worker processes can also access the catalog
+  * information. Now they re-translate the textual representation to a TocEntry
+  * on their side and do the required action (restore or dump).
+  *
+  * The result is again a textual string that is sent back to the master and is
+  * interpreted by AH->MasterEndParallelItemPtr. This function can update state
+  * or catalog information on the master's side, depending on the reply from the
+  * worker process. In the end it returns a status, which is 0 for successful
+  * execution.
+  *
+  * ---------------------------------------------------------------------
+  * Master                                   Worker
+  *
+  *                                          enters WaitForCommands()
+  * DispatchJobForTocEntry(...te...)
+  *
+  * [ Worker is IDLE ]
+  *
+  * arg = (MasterStartParallelItemPtr)()
+  * send: DUMP arg
+  *                                          receive: DUMP arg
+  *                                          str = (WorkerJobDumpPtr)(arg)
+  * [ Worker is WORKING ]                    ... gets te from arg ...
+  *                                          ... dump te ...
+  *                                          send: OK DUMP info
+  *
+  * In ListenToWorkers():
+  *
+  * [ Worker is FINISHED ]
+  * receive: OK DUMP info
+  * status = (MasterEndParallelItemPtr)(info)
+  *
+  * In ReapWorkerStatus(&ptr):
+  * *ptr = status;
+  * [ Worker is IDLE ]
+  * ---------------------------------------------------------------------
+  */
+ void
+ DispatchJobForTocEntry(ArchiveHandle *AH, ParallelState *pstate, TocEntry *te,
+ 					   T_Action act)
+ {
+ 	int		worker;
+ 	char   *arg;
+ 
+ 	/* our caller makes sure that at least one worker is idle */
+ 	Assert(GetIdleWorker(pstate) != NO_SLOT);
+ 	worker = GetIdleWorker(pstate);
+ 	Assert(worker != NO_SLOT);
+ 
+ 	arg = (AH->MasterStartParallelItemPtr)(AH, te, act);
+ 
+ 	sendMessageToWorker(pstate, worker, arg);
+ 
+ 	pstate->parallelSlot[worker].workerStatus = WRKR_WORKING;
+ 	pstate->parallelSlot[worker].args->te = te;
+ 	PrintStatus(pstate);
+ }
+ 
+ static void
+ PrintStatus(ParallelState *pstate)
+ {
+ 	int			i;
+ 	printf("------Status------\n");
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		printf("Status of worker %d: ", i);
+ 		switch (pstate->parallelSlot[i].workerStatus)
+ 		{
+ 			case WRKR_IDLE:
+ 				printf("IDLE");
+ 				break;
+ 			case WRKR_WORKING:
+ 				printf("WORKING");
+ 				break;
+ 			case WRKR_FINISHED:
+ 				printf("FINISHED");
+ 				break;
+ 			case WRKR_TERMINATED:
+ 				printf("TERMINATED");
+ 				break;
+ 		}
+ 		printf("\n");
+ 	}
+ 	printf("------------\n");
+ }
+ 
+ 
+ /*
+  * Find the first free parallel slot (if any).
+  */
+ int
+ GetIdleWorker(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_IDLE)
+ 			return i;
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * Return true iff every worker process is in the WRKR_TERMINATED state.
+  */
+ static bool
+ HasEveryWorkerTerminated(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_TERMINATED)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * Return true iff every worker is in the WRKR_IDLE state.
+  */
+ bool
+ IsEveryWorkerIdle(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_IDLE)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * ---------------------------------------------------------------------
+  * One danger of the parallel backup is a possible deadlock:
+  *
+  * 1) Master dumps the schema and locks all tables in ACCESS SHARE mode.
+  * 2) Another process requests an ACCESS EXCLUSIVE lock (which is not granted
+  *    because the master holds a conflicting ACCESS SHARE lock).
+  * 3) The worker process also requests an ACCESS SHARE lock to read the table.
+  *    The worker is not granted that lock but is queued up behind the ACCESS
+  *    EXCLUSIVE lock request.
+  * ---------------------------------------------------------------------
+  *
+  * Now what we do here is to just request a lock in ACCESS SHARE mode but
+  * with NOWAIT in the worker prior to touching the table. If we don't get the
+  * lock, we know that somebody else has requested an ACCESS EXCLUSIVE lock in
+  * the meantime, and we simply fail the whole backup because we have detected
+  * a deadlock.
+  */
+ static void
+ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	Archive *AHX = (Archive *) AH;
+ 	const char *qualId;
+ 	PQExpBuffer query = createPQExpBuffer();
+ 	PGresult   *res;
+ 
+ 	Assert(AH->format == archDirectory);
+ 	Assert(strcmp(te->desc, "BLOBS") != 0);
+ 
+ 	appendPQExpBuffer(query, "SELECT pg_namespace.nspname,"
+ 							 "       pg_class.relname "
+ 							 "  FROM pg_class "
+ 							 "  JOIN pg_namespace ON pg_namespace.oid = relnamespace "
+ 							 " WHERE pg_class.oid = %u", te->catalogId.oid);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_TUPLES_OK)
+ 		exit_horribly(modulename, "could not get relation name for oid %u: %s\n",
+ 					  te->catalogId.oid, PQerrorMessage(AH->connection));
+ 
+ 	resetPQExpBuffer(query);
+ 
+ 	qualId = fmtQualifiedId(AHX->remoteVersion, PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
+ 
+ 	appendPQExpBuffer(query, "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualId);
+ 	PQclear(res);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
+ 		exit_horribly(modulename, "could not obtain lock on relation \"%s\". This "
+ 					  "usually means that someone requested an ACCESS EXCLUSIVE lock "
+ 					  "on the table after the pg_dump parent process has gotten the "
+ 					  "initial ACCESS SHARE lock on the table.\n", qualId);
+ 
+ 	PQclear(res);
+ 	destroyPQExpBuffer(query);
+ }
+ 
+ /*
+  * This is the main routine of the worker process.
+  *
+  * On startup, the worker enters this routine and waits for commands from the
+  * master process. After processing a command, it comes back here to wait for
+  * the next one. Eventually it receives a TERMINATE command and exits.
+  */
+ static void
+ WaitForCommands(ArchiveHandle *AH, int pipefd[2])
+ {
+ 	char	   *command;
+ 	DumpId		dumpId;
+ 	int			nBytes;
+ 	char	   *str = NULL;
+ 	TocEntry   *te;
+ 
+ 	for(;;)
+ 	{
+ 		command = getMessageFromMaster(pipefd);
+ 
+ 		if (messageStartsWith(command, "DUMP "))
+ 		{
+ 			Assert(AH->format == archDirectory);
+ 			sscanf(command + strlen("DUMP "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("DUMP "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 
+ 			/*
+ 			 * Lock the table but with NOWAIT. Note that the parent is already
+ 			 * holding a lock. If we cannot acquire another ACCESS SHARE MODE
+ 			 * lock, then somebody else has requested an exclusive lock in the
+ 			 * meantime.  lockTableNoWait dies in this case to prevent a
+ 			 * deadlock.
+ 			 */
+ 			if (strcmp(te->desc, "BLOBS") != 0)
+ 				lockTableNoWait(AH, te);
+ 
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobDumpPtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(pipefd, str);
+ 			free(str);
+ 		}
+ 		else if (messageStartsWith(command, "RESTORE "))
+ 		{
+ 			Assert(AH->format == archDirectory || AH->format == archCustom);
+ 			Assert(AH->connection != NULL);
+ 
+ 			sscanf(command + strlen("RESTORE "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("RESTORE "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobRestorePtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(pipefd, str);
+ 			free(str);
+ 		}
+ 		else if (messageEquals(command, "TERMINATE"))
+ 		{
+ 			PQfinish(AH->connection);
+ 			AH->connection = NULL;
+ 			return;
+ 		}
+ 		else
+ 			exit_horribly(modulename,
+ 						  "unknown command on communication channel: %s\n",
+ 						  command);
+ 	}
+ }
+ 
+ /*
+  * ---------------------------------------------------------------------
+  * Note the status change:
+  *
+  * DispatchJobForTocEntry		WRKR_IDLE -> WRKR_WORKING
+  * ListenToWorkers				WRKR_WORKING -> WRKR_FINISHED / WRKR_TERMINATED
+  * ReapWorkerStatus				WRKR_FINISHED -> WRKR_IDLE
+  * ---------------------------------------------------------------------
+  *
+  * Just calling ReapWorkerStatus() when all workers are working might or might
+  * not give you an idle worker, because you need to call ListenToWorkers() in
+  * between and only thereafter ReapWorkerStatus(). ListenToWorkers() is
+  * necessary in order to receive and deal with the status (= result) of the
+  * worker's execution.
+  */
+ void
+ ListenToWorkers(ArchiveHandle *AH, ParallelState *pstate, bool do_wait)
+ {
+ 	int			worker;
+ 	char	   *msg;
+ 
+ 	msg = getMessageFromWorker(pstate, do_wait, &worker);
+ 
+ 	if (!msg)
+ 	{
+ 		if (do_wait)
+ 			exit_horribly(modulename, "a worker process died unexpectedly\n");
+ 		return;
+ 	}
+ 
+ 	if (messageStartsWith(msg, "OK "))
+ 	{
+ 		char	   *statusString;
+ 		TocEntry   *te;
+ 
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_FINISHED;
+ 		te = pstate->parallelSlot[worker].args->te;
+ 		if (messageStartsWith(msg, "OK RESTORE "))
+ 		{
+ 			statusString = msg + strlen("OK RESTORE ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_RESTORE);
+ 		}
+ 		else if (messageStartsWith(msg, "OK DUMP "))
+ 		{
+ 			statusString = msg + strlen("OK DUMP ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_DUMP);
+ 		}
+ 		else
+ 			exit_horribly(modulename,
+ 						  "invalid message received from worker: %s\n", msg);
+ 	}
+ 	else if (messageStartsWith(msg, "ERROR "))
+ 	{
+ 		Assert(AH->format == archDirectory || AH->format == archCustom);
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
+ 		exit_horribly(modulename, "%s", msg + strlen("ERROR "));
+ 	}
+ 	else
+ 		exit_horribly(modulename, "invalid message received from worker: %s\n", msg);
+ 
+ 	PrintStatus(pstate);
+ 
+ 	/* both Unix and Win32 return pg_malloc()ed space, so we free it */
+ 	free(msg);
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * This function is used to get the return value of a terminated worker
+  * process. If a process has terminated, its status is stored in *status and
+  * the id of the worker is returned.
+  */
+ int
+ ReapWorkerStatus(ParallelState *pstate, int *status)
+ {
+ 	int			i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_FINISHED)
+ 		{
+ 			*status = pstate->parallelSlot[i].status;
+ 			pstate->parallelSlot[i].status = 0;
+ 			pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 			PrintStatus(pstate);
+ 			return i;
+ 		}
+ 	}
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It looks for an idle worker process and only returns if there is one.
+  */
+ void
+ EnsureIdleWorker(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int		ret_worker;
+ 	int		work_status;
+ 
+ 	for (;;)
+ 	{
+ 		int nTerm = 0;
+ 		while ((ret_worker = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
+ 		{
+ 			if (work_status != 0)
+ 				exit_horribly(modulename, "error processing a parallel work item\n");
+ 
+ 			nTerm++;
+ 		}
+ 
+ 		/*
+ 		 * We need to make sure that we have an idle worker before dispatching
+ 		 * the next item. If nTerm > 0 we already have one (quick check).
+ 		 */
+ 		if (nTerm > 0)
+ 			return;
+ 
+ 		/* explicit check for an idle worker */
+ 		if (GetIdleWorker(pstate) != NO_SLOT)
+ 			return;
+ 
+ 		/*
+ 		 * If we have no idle worker, read the result of one or more workers
+ 		 * and loop back to call ReapWorkerStatus() on them.
+ 		 */
+ 		ListenToWorkers(AH, pstate, true);
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It waits for all workers to terminate.
+  */
+ void
+ EnsureWorkersFinished(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int			work_status;
+ 
+ 	if (!pstate || pstate->numWorkers == 1)
+ 		return;
+ 
+ 	/* Waiting for the remaining worker processes to finish */
+ 	while (!IsEveryWorkerIdle(pstate))
+ 	{
+ 		if (ReapWorkerStatus(pstate, &work_status) == NO_SLOT)
+ 			ListenToWorkers(AH, pstate, true);
+ 		else if (work_status != 0)
+ 			exit_horribly(modulename, "error processing a parallel work item\n");
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It returns the next message on the communication channel, blocking until it
+  * becomes available.
+  */
+ static char *
+ getMessageFromMaster(int pipefd[2])
+ {
+ 	return readMessageFromPipe(pipefd[PIPE_READ]);
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It sends a message to the master on the communication channel.
+  */
+ static void
+ sendMessageToMaster(int pipefd[2], const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pipefd[PIPE_WRITE], str, len) != len)
+ 		exit_horribly(modulename,
+ 					  "error writing to the communication channel: %s\n",
+ 					  strerror(errno));
+ }
+ 
+ /*
+  * A loop that repeatedly calls select() until a descriptor in the read set
+  * becomes readable. On Windows we have to check for the termination event
+  * from time to time; on Unix we can just block forever.
+  */
+ #ifdef WIN32
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int			i;
+ 	fd_set		saveSet = *workerset;
+ 
+ 	/* should always be the master */
+ 	Assert(tMasterThreadId == GetCurrentThreadId());
+ 
+ 	for (;;)
+ 	{
+ 		/*
+ 		 * sleep a quarter of a second before checking if we should
+ 		 * terminate.
+ 		 */
+ 		struct timeval tv = { 0, 250000 };
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, &tv);
+ 
+ 		if (i == SOCKET_ERROR && WSAGetLastError() == WSAEINTR)
+ 			continue;
+ 		if (i)
+ 			break;
+ 	}
+ 
+ 	return i;
+ }
+ #else /* UNIX */
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int		i;
+ 
+ 	fd_set saveSet = *workerset;
+ 	for (;;)
+ 	{
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, NULL);
+ 
+ 		/*
+ 		 * If we Ctrl-C the master process, it's likely that we interrupt
+ 		 * select() here. The signal handler will set wantAbort == true and
+ 		 * the shutdown sequence starts from here. Note that we'll come back
+ 		 * here later when we tell all workers to terminate and read their
+ 		 * responses, but by then aborting is set to true.
+ 		 */
+ 		if (wantAbort && !aborting)
+ 			exit_horribly(modulename, "terminated by user\n");
+ 
+ 		if (i < 0 && errno == EINTR)
+ 			continue;
+ 		break;
+ 	}
+ 
+ 	return i;
+ }
+ #endif
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It returns the next message from the worker on the communication channel,
+  * optionally blocking (do_wait) until it becomes available.
+  *
+  * The id of the worker is returned in *worker.
+  */
+ static char *
+ getMessageFromWorker(ParallelState *pstate, bool do_wait, int *worker)
+ {
+ 	int			i;
+ 	fd_set		workerset;
+ 	int			maxFd = -1;
+ 	struct		timeval nowait = { 0, 0 };
+ 
+ 	FD_ZERO(&workerset);
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_TERMINATED)
+ 			continue;
+ 		FD_SET(pstate->parallelSlot[i].pipeRead, &workerset);
+ 		/* actually WIN32 ignores the first parameter to select()... */
+ 		if (pstate->parallelSlot[i].pipeRead > maxFd)
+ 			maxFd = pstate->parallelSlot[i].pipeRead;
+ 	}
+ 
+ 	if (do_wait)
+ 	{
+ 		i = select_loop(maxFd, &workerset);
+ 		Assert(i != 0);
+ 	}
+ 	else
+ 	{
+ 		if ((i = select(maxFd + 1, &workerset, NULL, NULL, &nowait)) == 0)
+ 			return NULL;
+ 	}
+ 
+ 	if (i < 0)
+ 		exit_horribly(modulename, "error in getMessageFromWorker(): %s\n", strerror(errno));
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		char	   *msg;
+ 
+ 		if (!FD_ISSET(pstate->parallelSlot[i].pipeRead, &workerset))
+ 			continue;
+ 
+ 		msg = readMessageFromPipe(pstate->parallelSlot[i].pipeRead);
+ 		*worker = i;
+ 		return msg;
+ 	}
+ 	Assert(false);
+ 	return NULL;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It sends a message to a certain worker on the communication channel.
+  */
+ static void
+ sendMessageToWorker(ParallelState *pstate, int worker, const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pstate->parallelSlot[worker].pipeWrite, str, len) != len)
+ 	{
+ 		/*
+ 		 * If we're already aborting anyway, we don't care whether the write
+ 		 * succeeds.  The child might be gone already.
+ 		 */
+ #ifndef WIN32
+ 		if (!aborting)
+ #endif
+ 			exit_horribly(modulename,
+ 						  "Error writing to the communication channel: %s\n",
+ 						  strerror(errno));
+ 	}
+ }
+ 
+ /*
+  * The underlying function to read a message from the communication channel
+  * (fd).  It blocks until a complete, NUL-terminated message is available.
+  */
+ static char *
+ readMessageFromPipe(int fd)
+ {
+ 	char	   *msg;
+ 	int			msgsize, bufsize;
+ 	int			ret;
+ 
+ 	/*
+ 	 * The problem here is that we need to deal with several possibilities:
+ 	 * we could receive only a partial message or several messages at once.
+ 	 * The caller expects us to return exactly one message however.
+ 	 *
+ 	 * We could either read in as much as we can and keep track of what we
+ 	 * delivered back to the caller or we just read byte by byte. Once we see
+ 	 * (char) 0, we know that it's the message's end. This would be quite
+ 	 * inefficient for more data but since we are reading only on the command
+ 	 * channel, the performance loss does not seem worth the trouble of keeping
+ 	 * internal states for different file descriptors.
+ 	 */
+ 
+ 	bufsize = 64;  /* could be any number */
+ 	msg = (char *) pg_malloc(bufsize);
+ 
+ 	msgsize = 0;
+ 	for (;;)
+ 	{
+ 		Assert(msgsize <= bufsize);
+ 		ret = piperead(fd, msg + msgsize, 1);
+ 
+ 		/* worker has closed the connection or an error occurred */
+ 		if (ret <= 0)
+ 			return NULL;
+ 
+ 		Assert(ret == 1);
+ 
+ 		if (msg[msgsize] == '\0')
+ 			return msg;
+ 
+ 		msgsize++;
+ 		if (msgsize == bufsize)
+ 		{
+ 			/* could be any number */
+ 			bufsize += 16;
+ 			msg = (char *) pg_realloc(msg, bufsize);
+ 		}
+ 	}
+ }
+ 
+ #ifdef WIN32
+ /*
+  *	This is a replacement version of pipe for Win32 which allows returned
+  *	handles to be used in select(). Note that read/write calls must be replaced
+  *	with recv/send.
+  */
+ 
+ static int
+ pgpipe(int handles[2])
+ {
+ 	SOCKET		s;
+ 	struct sockaddr_in serv_addr;
+ 	int			len = sizeof(serv_addr);
+ 
+ 	handles[0] = handles[1] = INVALID_SOCKET;
+ 
+ 	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not create socket: %d\n",
+ 				  WSAGetLastError());
+ 		return -1;
+ 	}
+ 
+ 	memset((void *) &serv_addr, 0, sizeof(serv_addr));
+ 	serv_addr.sin_family = AF_INET;
+ 	serv_addr.sin_port = htons(0);
+ 	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+ 	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not bind: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (listen(s, 1) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not listen: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not getsockname: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not create second socket: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 
+ 	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not connect socket: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not accept socket: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(handles[1]);
+ 		handles[1] = INVALID_SOCKET;
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	closesocket(s);
+ 	return 0;
+ }
+ 
+ static int
+ piperead(int s, char *buf, int len)
+ {
+ 	int			ret = recv(s, buf, len, 0);
+ 
+ 	if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
+ 		/* EOF on the pipe! (win32 socket based implementation) */
+ 		ret = 0;
+ 	return ret;
+ }
+ #endif
*** /dev/null
--- b/src/bin/pg_dump/parallel.h
***************
*** 0 ****
--- 1,86 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.h
+  *
+  *	Parallel support header file for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ struct _archiveHandle;
+ struct _tocEntry;
+ 
+ typedef enum
+ {
+ 	WRKR_TERMINATED = 0,
+ 	WRKR_IDLE,
+ 	WRKR_WORKING,
+ 	WRKR_FINISHED
+ } T_WorkerStatus;
+ 
+ typedef enum _action
+ {
+ 	ACT_DUMP,
+ 	ACT_RESTORE,
+ } T_Action;
+ 
+ /* Arguments needed for a worker process */
+ typedef struct _parallel_args
+ {
+ 	struct _archiveHandle *AH;
+ 	struct _tocEntry	  *te;
+ } ParallelArgs;
+ 
+ /* State for each parallel activity slot */
+ typedef struct _parallel_slot
+ {
+ 	ParallelArgs	   *args;
+ 	T_WorkerStatus		workerStatus;
+ 	int					status;
+ 	int					pipeRead;
+ 	int					pipeWrite;
+ 	int					pipeRevRead;
+ 	int					pipeRevWrite;
+ #ifdef WIN32
+ 	uintptr_t			hThread;
+ 	unsigned int		threadId;
+ #else
+ 	pid_t				pid;
+ #endif
+ } ParallelSlot;
+ 
+ #define NO_SLOT (-1)
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelSlot *parallelSlot;
+ } ParallelState;
+ 
+ extern int GetIdleWorker(ParallelState *pstate);
+ extern bool IsEveryWorkerIdle(ParallelState *pstate);
+ extern void ListenToWorkers(struct _archiveHandle *AH, ParallelState *pstate, bool do_wait);
+ extern int ReapWorkerStatus(ParallelState *pstate, int *status);
+ extern void EnsureIdleWorker(struct _archiveHandle *AH, ParallelState *pstate);
+ extern void EnsureWorkersFinished(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern ParallelState *ParallelBackupStart(struct _archiveHandle *AH,
+ 										  RestoreOptions *ropt);
+ extern void DispatchJobForTocEntry(struct _archiveHandle *AH,
+ 								   ParallelState *pstate,
+ 								   struct _tocEntry *te, T_Action act);
+ extern void ParallelBackupEnd(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern void checkAborting(struct _archiveHandle *AH);
+ 
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
***************
*** 89,97 **** struct Archive
--- 89,101 ----
  	int			minRemoteVersion;		/* allowable range */
  	int			maxRemoteVersion;
  
+ 	int			numWorkers;		/* number of parallel processes */
+ 	char	   *sync_snapshot_id;  /* sync snapshot id for parallel operation */
+ 
  	/* info needed for string escaping */
  	int			encoding;		/* libpq code for client_encoding */
  	bool		std_strings;	/* standard_conforming_strings */
+ 	char	   *use_role;		/* Issue SET ROLE to this role */
  
  	/* error handling */
  	bool		exit_on_error;	/* whether to exit on SQL errors... */
***************
*** 149,155 **** typedef struct _restoreOptions
  	int			suppressDumpWarnings;	/* Suppress output of WARNING entries
  										 * to stderr */
  	bool		single_txn;
- 	int			number_of_jobs;
  
  	bool	   *idWanted;		/* array showing which dump IDs to emit */
  } RestoreOptions;
--- 153,158 ----
***************
*** 201,206 **** extern void PrintTOCSummary(Archive *AH, RestoreOptions *ropt);
--- 204,212 ----
  
  extern RestoreOptions *NewRestoreOptions(void);
  
+ /* There is one implementation each in pg_dump.c and pg_restore.c */
+ extern void _SetupWorker(Archive *AHX, RestoreOptions *ropt);
+ 
  /* Rearrange and filter TOC entries */
  extern void SortTocFromFile(Archive *AHX, RestoreOptions *ropt);
  extern void InitDummyWantedList(Archive *AHX, RestoreOptions *ropt);
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 23,30 ****
--- 23,32 ----
  #include "pg_backup_db.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  #include <ctype.h>
+ #include <fcntl.h>
  #include <unistd.h>
  #include <sys/stat.h>
  #include <sys/types.h>
***************
*** 36,107 ****
  
  #include "libpq/libpq-fs.h"
  
- /*
-  * Special exit values from worker children.  We reserve 0 for normal
-  * success; 1 and other small values should be interpreted as crashes.
-  */
- #define WORKER_CREATE_DONE		10
- #define WORKER_INHIBIT_DATA		11
- #define WORKER_IGNORED_ERRORS	12
- 
- /*
-  * Unix uses exit to return result from worker child, so function is void.
-  * Windows thread result comes via function return.
-  */
- #ifndef WIN32
- #define parallel_restore_result void
- #else
- #define parallel_restore_result DWORD
- #endif
- 
- /* IDs for worker children are either PIDs or thread handles */
- #ifndef WIN32
- #define thandle pid_t
- #else
- #define thandle HANDLE
- #endif
- 
- typedef struct ParallelStateEntry
- {
- #ifdef WIN32
- 	unsigned int threadId;
- #else
- 	pid_t		pid;
- #endif
- 	ArchiveHandle *AH;
- } ParallelStateEntry;
- 
- typedef struct ParallelState
- {
- 	int			numWorkers;
- 	ParallelStateEntry *pse;
- } ParallelState;
- 
- /* Arguments needed for a worker child */
- typedef struct _restore_args
- {
- 	ArchiveHandle *AH;
- 	TocEntry   *te;
- 	ParallelStateEntry *pse;
- } RestoreArgs;
- 
- /* State for each parallel activity slot */
- typedef struct _parallel_slot
- {
- 	thandle		child_id;
- 	RestoreArgs *args;
- } ParallelSlot;
- 
- typedef struct ShutdownInformation
- {
- 	ParallelState *pstate;
- 	Archive       *AHX;
- } ShutdownInformation;
- 
- static ShutdownInformation shutdown_info;
- 
- #define NO_SLOT (-1)
- 
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
  #define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
  
--- 38,43 ----
***************
*** 141,147 **** static teReqs _tocEntryRequired(TocEntry *te, RestoreOptions *ropt, bool include
  static bool _tocEntryIsACL(TocEntry *te);
  static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
- static TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  static void _moveBefore(ArchiveHandle *AH, TocEntry *pos, TocEntry *te);
  static int	_discoverArchiveFormat(ArchiveHandle *AH);
  
--- 77,82 ----
***************
*** 154,174 **** static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_parallel(ArchiveHandle *AH);
! static thandle spawn_restore(RestoreArgs *args);
! static thandle reap_child(ParallelSlot *slots, int n_slots, int *work_status);
! static bool work_in_progress(ParallelSlot *slots, int n_slots);
! static int	get_next_slot(ParallelSlot *slots, int n_slots);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots);
! static parallel_restore_result parallel_restore(RestoreArgs *args);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
--- 89,107 ----
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_prefork(ArchiveHandle *AH);
! static void restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
! 										 TocEntry *pending_list);
! static void restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelState *pstate);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
***************
*** 178,191 **** static void reduce_dependencies(ArchiveHandle *AH, TocEntry *te,
  					TocEntry *ready_list);
  static void mark_create_done(ArchiveHandle *AH, TocEntry *te);
  static void inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te);
- static ArchiveHandle *CloneArchive(ArchiveHandle *AH);
- static void DeCloneArchive(ArchiveHandle *AH);
- 
- static void setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH);
- static void unsetProcessIdentifier(ParallelStateEntry *pse);
- static ParallelStateEntry *GetMyPSEntry(ParallelState *pstate);
- static void archive_close_connection(int code, void *arg);
- 
  
  /*
   *	Wrapper functions.
--- 111,116 ----
***************
*** 272,278 **** RestoreArchive(Archive *AHX, RestoreOptions *ropt)
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (ropt->number_of_jobs > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
--- 197,203 ----
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (AH->public.numWorkers > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
***************
*** 438,444 **** RestoreArchive(Archive *AHX, RestoreOptions *ropt)
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 		restore_toc_entries_parallel(AH);
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
--- 363,387 ----
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 	{
! 		ParallelState  *pstate;
! 		TocEntry		pending_list;
! 
! 		par_list_header_init(&pending_list);
! 
! 		/* This runs PRE_DATA items and then disconnects from the database */
! 		restore_toc_entries_prefork(AH);
! 		Assert(AH->connection == NULL);
! 
! 		/* ParallelBackupStart() will actually fork the processes */
! 		pstate = ParallelBackupStart(AH, ropt);
! 		restore_toc_entries_parallel(AH, pstate, &pending_list);
! 		ParallelBackupEnd(AH, pstate);
! 
! 		/* reconnect the master and see if we missed something */
! 		restore_toc_entries_postfork(AH, &pending_list);
! 		Assert(AH->connection != NULL);
! 	}
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
***************
*** 500,506 **** static int
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			retval = 0;
  	teReqs		reqs;
  	bool		defnDumped;
  
--- 443,449 ----
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			status = WORKER_OK;
  	teReqs		reqs;
  	bool		defnDumped;
  
***************
*** 542,548 **** restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						retval = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
--- 485,491 ----
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						status = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
***************
*** 557,563 **** restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					retval = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
--- 500,506 ----
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					status = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
***************
*** 675,681 **** restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  		}
  	}
  
! 	return retval;
  }
  
  /*
--- 618,627 ----
  		}
  	}
  
! 	if (AH->public.n_errors > 0 && status == WORKER_OK)
! 		status = WORKER_IGNORED_ERRORS;
! 
! 	return status;
  }
  
  /*
***************
*** 1447,1453 **** ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
  /* on some error, we may decide to go on... */
  void
  warn_or_exit_horribly(ArchiveHandle *AH,
!  					  const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
  
--- 1393,1399 ----
  /* on some error, we may decide to go on... */
  void
  warn_or_exit_horribly(ArchiveHandle *AH,
! 					  const char *modulename, const char *fmt,...)
  {
  	va_list		ap;
  
***************
*** 1524,1530 **** _moveBefore(ArchiveHandle *AH, TocEntry *pos, TocEntry *te)
  	pos->prev = te;
  }
  
! static TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
--- 1470,1476 ----
  	pos->prev = te;
  }
  
! TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
***************
*** 2021,2068 **** _allocAH(const char *FileSpec, const ArchiveFormat fmt,
  
  
  void
! WriteDataChunks(ArchiveHandle *AH)
  {
  	TocEntry   *te;
- 	StartDataPtr startPtr;
- 	EndDataPtr	endPtr;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (te->dataDumper != NULL)
! 		{
! 			AH->currToc = te;
! 			/* printf("Writing data for %d (%x)\n", te->id, te); */
! 
! 			if (strcmp(te->desc, "BLOBS") == 0)
! 			{
! 				startPtr = AH->StartBlobsPtr;
! 				endPtr = AH->EndBlobsPtr;
! 			}
! 			else
! 			{
! 				startPtr = AH->StartDataPtr;
! 				endPtr = AH->EndDataPtr;
! 			}
! 
! 			if (startPtr != NULL)
! 				(*startPtr) (AH, te);
  
  			/*
! 			 * printf("Dumper arg for %d is %x\n", te->id, te->dataDumperArg);
  			 */
  
! 			/*
! 			 * The user-provided DataDumper routine needs to call
! 			 * AH->WriteData
! 			 */
! 			(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
  
! 			if (endPtr != NULL)
! 				(*endPtr) (AH, te);
! 			AH->currToc = NULL;
! 		}
  	}
  }
  
  void
--- 1967,2029 ----
  
  
  void
! WriteDataChunks(ArchiveHandle *AH, ParallelState *pstate)
  {
  	TocEntry   *te;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (!te->hadDumper)
! 			continue;
  
+ 		if (pstate && pstate->numWorkers > 1)
+ 		{
  			/*
! 			 * If we are in a parallel backup, then we are always the master
! 			 * process.
  			 */
+ 			EnsureIdleWorker(AH, pstate);
+ 			Assert(GetIdleWorker(pstate) != NO_SLOT);
+ 			DispatchJobForTocEntry(AH, pstate, te, ACT_DUMP);
+ 		}
+ 		else
+ 			WriteDataChunksForTocEntry(AH, te);
+ 	}
+ 	EnsureWorkersFinished(AH, pstate);
+ }
  
! void
! WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te)
! {
! 	StartDataPtr startPtr;
! 	EndDataPtr	endPtr;
  
! 	AH->currToc = te;
! 
! 	if (strcmp(te->desc, "BLOBS") == 0)
! 	{
! 		startPtr = AH->StartBlobsPtr;
! 		endPtr = AH->EndBlobsPtr;
! 	}
! 	else
! 	{
! 		startPtr = AH->StartDataPtr;
! 		endPtr = AH->EndDataPtr;
  	}
+ 
+ 	if (startPtr != NULL)
+ 		(*startPtr) (AH, te);
+ 
+ 	/*
+ 	 * The user-provided DataDumper routine needs to call
+ 	 * AH->WriteData
+ 	 */
+ 	(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
+ 
+ 	if (endPtr != NULL)
+ 		(*endPtr) (AH, te);
+ 
+ 	AH->currToc = NULL;
  }
  
  void
***************
*** 3276,3342 **** dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim)
  		ahprintf(AH, "-- %s %s\n\n", msg, buf);
  }
  
- static void
- setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH)
- {
- #ifdef WIN32
- 	pse->threadId = GetCurrentThreadId();
- #else
- 	pse->pid = getpid();
- #endif
- 	pse->AH = AH;
- }
- 
- static void
- unsetProcessIdentifier(ParallelStateEntry *pse)
- {
- #ifdef WIN32
- 	pse->threadId = 0;
- #else
- 	pse->pid = 0;
- #endif
- 	pse->AH = NULL;
- }
- 
- static ParallelStateEntry *
- GetMyPSEntry(ParallelState *pstate)
- {
- 	int i;
- 
- 	for (i = 0; i < pstate->numWorkers; i++)
- #ifdef WIN32
- 		if (pstate->pse[i].threadId == GetCurrentThreadId())
- #else
- 		if (pstate->pse[i].pid == getpid())
- #endif
- 			return &(pstate->pse[i]);
- 
- 	return NULL;
- }
- 
- static void
- archive_close_connection(int code, void *arg)
- {
- 	ShutdownInformation *si = (ShutdownInformation *) arg;
- 
- 	if (si->pstate)
- 	{
- 		ParallelStateEntry *entry = GetMyPSEntry(si->pstate);
- 
- 		if (entry != NULL && entry->AH)
- 			DisconnectDatabase(&(entry->AH->public));
- 	}
- 	else if (si->AHX)
- 		DisconnectDatabase(si->AHX);
- }
- 
- void
- on_exit_close_archive(Archive *AHX)
- {
- 	shutdown_info.AHX = AHX;
- 	on_exit_nicely(archive_close_connection, &shutdown_info);
- }
- 
  /*
   * Main engine for parallel restore.
   *
--- 3237,3242 ----
***************
*** 3349,3378 **** on_exit_close_archive(Archive *AHX)
   * RestoreArchive).
   */
  static void
! restore_toc_entries_parallel(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
- 	int			n_slots = ropt->number_of_jobs;
- 	ParallelSlot *slots;
- 	int			work_status;
- 	int			next_slot;
  	bool		skipped_some;
- 	TocEntry	pending_list;
- 	TocEntry	ready_list;
  	TocEntry   *next_work_item;
- 	thandle		ret_child;
- 	TocEntry   *te;
- 	ParallelState *pstate;
- 	int			i;
  
! 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
! 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
! 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
! 	pstate->numWorkers = ropt->number_of_jobs;
! 	for (i = 0; i < pstate->numWorkers; i++)
! 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3249,3269 ----
   * RestoreArchive).
   */
  static void
! restore_toc_entries_prefork(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
  	bool		skipped_some;
  	TocEntry   *next_work_item;
  
! 	ahlog(AH, 2, "entering restore_toc_entries_prefork\n");
  
! 	/* we haven't got round to making this work for all archive formats */
! 	if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
! 		exit_horribly(modulename, "parallel restore is not supported with this archive file format\n");
! 
! 	/* doesn't work if the archive represents dependencies as OIDs, either */
! 	if (AH->version < K_VERS_1_8)
! 		exit_horribly(modulename, "parallel restore is not supported with archives made by pre-8.0 pg_dump\n");
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
***************
*** 3428,3439 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  	 */
  	DisconnectDatabase(&AH->public);
  
- 	/*
- 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
- 	 * and falls back to AHX otherwise.
- 	 */
- 	shutdown_info.pstate = pstate;
- 
  	/* blow away any transient state from the old connection */
  	if (AH->currUser)
  		free(AH->currUser);
--- 3319,3324 ----
***************
*** 3445,3461 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
  
  	/*
! 	 * Initialize the lists of pending and ready items.  After this setup, the
! 	 * pending list is everything that needs to be done but is blocked by one
! 	 * or more dependencies, while the ready list contains items that have no
! 	 * remaining dependencies.	Note: we don't yet filter out entries that
! 	 * aren't going to be restored.  They might participate in dependency
! 	 * chains connecting entries that should be restored, so we treat them as
! 	 * live until we actually process them.
  	 */
- 	par_list_header_init(&pending_list);
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
--- 3330,3371 ----
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
+ }
+ 
+ /*
+  * Main engine for parallel restore.
+  *
+  * Work is done in three phases.
+  * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
+  * just as for a standard restore. This is done in restore_toc_entries_prefork().
+  * Second we process the remaining non-ACL steps in parallel worker children
+  * (threads on Windows, processes on Unix); these fork off and set up their
+  * connections before we call restore_toc_entries_parallel().
+  * Finally we process all the ACL entries in a single connection (that happens
+  * back in RestoreArchive).
+  */
+ static void
+ restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
+ 							 TocEntry *pending_list)
+ {
+ 	int			work_status;
+ 	bool		skipped_some;
+ 	TocEntry	ready_list;
+ 	TocEntry   *next_work_item;
+ 	int			ret_child;
+ 
+ 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
  	/*
! 	 * Initialize the list of ready items; the pending list has already been
! 	 * initialized in the caller.  After this setup, the pending
! 	 * list is everything that needs to be done but is blocked by one or more
! 	 * dependencies, while the ready list contains items that have no remaining
! 	 * dependencies. Note: we don't yet filter out entries that aren't going
! 	 * to be restored. They might participate in dependency chains connecting
! 	 * entries that should be restored, so we treat them as live until we
! 	 * actually process them.
  	 */
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
***************
*** 3480,3486 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(&pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
--- 3390,3396 ----
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
***************
*** 3494,3502 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list,
! 												slots, n_slots)) != NULL ||
! 		   work_in_progress(slots, n_slots))
  	{
  		if (next_work_item != NULL)
  		{
--- 3404,3411 ----
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list, pstate)) != NULL ||
! 		   !IsEveryWorkerIdle(pstate))
  	{
  		if (next_work_item != NULL)
  		{
***************
*** 3516,3577 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  				continue;
  			}
  
! 			if ((next_slot = get_next_slot(slots, n_slots)) != NO_SLOT)
! 			{
! 				/* There is work still to do and a worker slot available */
! 				thandle		child;
! 				RestoreArgs *args;
  
! 				ahlog(AH, 1, "launching item %d %s %s\n",
! 					  next_work_item->dumpId,
! 					  next_work_item->desc, next_work_item->tag);
  
! 				par_list_remove(next_work_item);
! 
! 				/* this memory is dealloced in mark_work_done() */
! 				args = pg_malloc(sizeof(RestoreArgs));
! 				args->AH = CloneArchive(AH);
! 				args->te = next_work_item;
! 				args->pse = &pstate->pse[next_slot];
  
! 				/* run the step in a worker child */
! 				child = spawn_restore(args);
  
! 				slots[next_slot].child_id = child;
! 				slots[next_slot].args = args;
  
! 				continue;
  			}
- 		}
  
! 		/*
! 		 * If we get here there must be work being done.  Either there is no
! 		 * work available to schedule (and work_in_progress returned true) or
! 		 * there are no slots available.  So we wait for a worker to finish,
! 		 * and process the result.
! 		 */
! 		ret_child = reap_child(slots, n_slots, &work_status);
  
! 		if (WIFEXITED(work_status))
! 		{
! 			mark_work_done(AH, &ready_list,
! 						   ret_child, WEXITSTATUS(work_status),
! 						   slots, n_slots);
! 		}
! 		else
! 		{
! 			exit_horribly(modulename, "worker process crashed: status %d\n",
! 						  work_status);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
  
! 	/*
! 	 * Remove the pstate again, so the exit handler will now fall back to
! 	 * closing AH->connection again.
! 	 */
! 	shutdown_info.pstate = NULL;
  
  	/*
  	 * Now reconnect the single parent connection.
--- 3425,3495 ----
  				continue;
  			}
  
! 			ahlog(AH, 1, "launching item %d %s %s\n",
! 				  next_work_item->dumpId,
! 				  next_work_item->desc, next_work_item->tag);
  
! 			par_list_remove(next_work_item);
  
! 			Assert(GetIdleWorker(pstate) != NO_SLOT);
! 			DispatchJobForTocEntry(AH, pstate, next_work_item, ACT_RESTORE);
! 		}
! 		else
! 			/* at least one child is working and we have nothing ready. */
! 			Assert(!IsEveryWorkerIdle(pstate));
  
! 		for (;;)
! 		{
! 			int nTerm = 0;
  
! 			/*
! 			 * In order to reduce dependencies as soon as possible and
! 			 * especially to reap the status of workers that are working on
! 			 * items that pending items depend on, we do a non-blocking check
! 			 * for finished workers first.
! 			 *
! 			 * However, if we do not have any other work items currently that
! 			 * workers can work on, we do not busy-loop here but instead
! 			 * really wait for at least one worker to terminate. Hence we call
! 			 * ListenToWorkers(..., ..., do_wait = true) in this case.
! 			 */
! 			ListenToWorkers(AH, pstate, !next_work_item);
  
! 			while ((ret_child = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
! 			{
! 				nTerm++;
! 				mark_work_done(AH, &ready_list, ret_child, work_status, pstate);
  			}
  
! 			/*
! 			 * We need to make sure that we have an idle worker before
! 			 * re-running the loop.  If nTerm > 0, we already have one.
! 			 */
! 			if (nTerm > 0)
! 				break;
  
! 			/* if nobody terminated, explicitly check for an idle worker */
! 			if (GetIdleWorker(pstate) != NO_SLOT)
! 				break;
! 
! 			/*
! 			 * If we have no idle worker, read the result of one or more
! 			 * workers and loop back to call ReapWorkerStatus() on them.
! 			 */
! 			ListenToWorkers(AH, pstate, true);
  		}
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
+ }
  
! static void
! restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list)
! {
! 	RestoreOptions *ropt = AH->ropt;
! 	TocEntry   *te;
! 
! 	ahlog(AH, 2, "entering restore_toc_entries_postfork\n");
  
  	/*
  	 * Now reconnect the single parent connection.
***************
*** 3587,3593 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list.par_next; te != &pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
--- 3505,3511 ----
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list->par_next; te != pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
***************
*** 3598,3718 **** restore_toc_entries_parallel(ArchiveHandle *AH)
  }
  
  /*
-  * create a worker child to perform a restore step in parallel
-  */
- static thandle
- spawn_restore(RestoreArgs *args)
- {
- 	thandle		child;
- 
- 	/* Ensure stdio state is quiesced before forking */
- 	fflush(NULL);
- 
- #ifndef WIN32
- 	child = fork();
- 	if (child == 0)
- 	{
- 		/* in child process */
- 		parallel_restore(args);
- 		exit_horribly(modulename,
- 					  "parallel_restore should not return\n");
- 	}
- 	else if (child < 0)
- 	{
- 		/* fork failed */
- 		exit_horribly(modulename,
- 					  "could not create worker process: %s\n",
- 					  strerror(errno));
- 	}
- #else
- 	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
- 									args, 0, NULL);
- 	if (child == 0)
- 		exit_horribly(modulename,
- 					  "could not create worker thread: %s\n",
- 					  strerror(errno));
- #endif
- 
- 	return child;
- }
- 
- /*
-  *	collect status from a completed worker child
-  */
- static thandle
- reap_child(ParallelSlot *slots, int n_slots, int *work_status)
- {
- #ifndef WIN32
- 	/* Unix is so much easier ... */
- 	return wait(work_status);
- #else
- 	static HANDLE *handles = NULL;
- 	int			hindex,
- 				snum,
- 				tnum;
- 	thandle		ret_child;
- 	DWORD		res;
- 
- 	/* first time around only, make space for handles to listen on */
- 	if (handles == NULL)
- 		handles = (HANDLE *) pg_calloc(sizeof(HANDLE), n_slots);
- 
- 	/* set up list of handles to listen to */
- 	for (snum = 0, tnum = 0; snum < n_slots; snum++)
- 		if (slots[snum].child_id != 0)
- 			handles[tnum++] = slots[snum].child_id;
- 
- 	/* wait for one to finish */
- 	hindex = WaitForMultipleObjects(tnum, handles, false, INFINITE);
- 
- 	/* get handle of finished thread */
- 	ret_child = handles[hindex - WAIT_OBJECT_0];
- 
- 	/* get the result */
- 	GetExitCodeThread(ret_child, &res);
- 	*work_status = res;
- 
- 	/* dispose of handle to stop leaks */
- 	CloseHandle(ret_child);
- 
- 	return ret_child;
- #endif
- }
- 
- /*
-  * are we doing anything now?
-  */
- static bool
- work_in_progress(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id != 0)
- 			return true;
- 	}
- 	return false;
- }
- 
- /*
-  * find the first free parallel slot (if any).
-  */
- static int
- get_next_slot(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id == 0)
- 			return i;
- 	}
- 	return NO_SLOT;
- }
- 
- 
- /*
   * Check if te1 has an exclusive lock requirement for an item that te2 also
   * requires, whether or not te2's requirement is for an exclusive lock.
   */
--- 3516,3521 ----
***************
*** 3785,3791 **** par_list_remove(TocEntry *te)
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
--- 3588,3594 ----
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelState *pstate)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
***************
*** 3800,3810 **** get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
  	{
  		int			count = 0;
  
! 		for (k = 0; k < n_slots; k++)
! 			if (slots[k].args->te != NULL &&
! 				slots[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (n_slots == 0 || count * 4 < n_slots)
  			pref_non_data = false;
  	}
  
--- 3603,3613 ----
  	{
  		int			count = 0;
  
! 		for (k = 0; k < pstate->numWorkers; k++)
! 			if (pstate->parallelSlot[k].args->te != NULL &&
! 				pstate->parallelSlot[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (pstate->numWorkers == 0 || count * 4 < pstate->numWorkers)
  			pref_non_data = false;
  	}
  
***************
*** 3820,3832 **** get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < n_slots && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (slots[i].args == NULL)
  				continue;
! 			running_te = slots[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
--- 3623,3635 ----
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < pstate->numWorkers && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (pstate->parallelSlot[i].workerStatus != WRKR_WORKING)
  				continue;
! 			running_te = pstate->parallelSlot[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
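The conflict-avoidance logic in get_next_work_item() above (skip a ready entry whose lock requirements clash with an item a worker is currently running) can be sketched in miniature. The types and fields below are illustrative stand-ins, not the patch's actual TocEntry/ParallelState structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a schedulable TOC entry. */
typedef struct WorkItem
{
	int			id;
	int			lockTable;		/* table this item must lock, -1 if none */
	bool		running;		/* currently assigned to a worker? */
} WorkItem;

/*
 * Pick the first idle item whose lock requirement does not conflict with any
 * currently running item -- the same idea as get_next_work_item() skipping
 * ready entries that have lock conflicts with busy workers.
 */
static WorkItem *
next_schedulable(WorkItem *items, int n)
{
	int			i,
				j;

	for (i = 0; i < n; i++)
	{
		bool		conflicts = false;

		if (items[i].running)
			continue;
		for (j = 0; j < n && !conflicts; j++)
			if (items[j].running &&
				items[j].lockTable >= 0 &&
				items[j].lockTable == items[i].lockTable)
				conflicts = true;
		if (!conflicts)
			return &items[i];
	}
	return NULL;
}
```

As in the patch, a conflicting item is not dropped; it simply stays in the ready list and becomes schedulable once the conflicting worker finishes.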
***************
*** 3861,3923 **** get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
  /*
   * Restore a single TOC item in parallel with others
   *
!  * this is the procedure run as a thread (Windows) or a
!  * separate process (everything else).
   */
! static parallel_restore_result
! parallel_restore(RestoreArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			retval;
! 
! 	setProcessIdentifier(args->pse, AH);
! 
! 	/*
! 	 * Close and reopen the input file so we have a private file pointer that
! 	 * doesn't stomp on anyone else's file pointer, if we're actually going to
! 	 * need to read from the file. Otherwise, just close it except on Windows,
! 	 * where it will possibly be needed by other threads.
! 	 *
! 	 * Note: on Windows, since we are using threads not processes, the reopen
! 	 * call *doesn't* close the original file pointer but just open a new one.
! 	 */
! 	if (te->section == SECTION_DATA)
! 		(AH->ReopenPtr) (AH);
! #ifndef WIN32
! 	else
! 		(AH->ClosePtr) (AH);
! #endif
! 
! 	/*
! 	 * We need our own database connection, too
! 	 */
! 	ConnectDatabase((Archive *) AH, ropt->dbname,
! 					ropt->pghost, ropt->pgport, ropt->username,
! 					ropt->promptPassword);
  
  	_doSetFixedOutputState(AH);
  
! 	/* Restore the TOC item */
! 	retval = restore_toc_entry(AH, te, ropt, true);
! 
! 	/* And clean up */
! 	DisconnectDatabase((Archive *) AH);
! 	unsetProcessIdentifier(args->pse);
  
! 	/* If we reopened the file, we are done with it, so close it now */
! 	if (te->section == SECTION_DATA)
! 		(AH->ClosePtr) (AH);
  
! 	if (retval == 0 && AH->public.n_errors)
! 		retval = WORKER_IGNORED_ERRORS;
  
! #ifndef WIN32
! 	exit(retval);
! #else
! 	return retval;
! #endif
  }
  
  
--- 3664,3692 ----
  /*
   * Restore a single TOC item in parallel with others
   *
!  * this is run in the worker, i.e. in a thread (Windows) or a separate process
!  * (everything else). A worker process executes several such work items during
!  * a parallel backup or restore. Once we terminate here and report back that
!  * our work is finished, the master process will assign us a new work item.
   */
! int
! parallel_restore(ParallelArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			status;
  
  	_doSetFixedOutputState(AH);
  
! 	Assert(AH->connection != NULL);
  
! 	AH->public.n_errors = 0;
  
! 	/* Restore the TOC item */
! 	status = restore_toc_entry(AH, te, ropt, true);
  
! 	return status;
  }
  
  
***************
*** 3929,3953 **** parallel_restore(RestoreArgs *args)
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots)
  {
  	TocEntry   *te = NULL;
- 	int			i;
  
! 	for (i = 0; i < n_slots; i++)
! 	{
! 		if (slots[i].child_id == worker)
! 		{
! 			slots[i].child_id = 0;
! 			te = slots[i].args->te;
! 			DeCloneArchive(slots[i].args->AH);
! 			free(slots[i].args);
! 			slots[i].args = NULL;
! 
! 			break;
! 		}
! 	}
  
  	if (te == NULL)
  		exit_horribly(modulename, "could not find slot of finished worker\n");
--- 3698,3709 ----
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate)
  {
  	TocEntry   *te = NULL;
  
! 	te = pstate->parallelSlot[worker].args->te;
  
  	if (te == NULL)
  		exit_horribly(modulename, "could not find slot of finished worker\n");
***************
*** 4302,4311 **** inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te)
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
-  *
-  * These could be public, but no need at present.
   */
! static ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
--- 4058,4065 ----
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
   */
! ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
***************
*** 4331,4339 **** CloneArchive(ArchiveHandle *AH)
--- 4085,4141 ----
  	/* clone has its own error count, too */
  	clone->public.n_errors = 0;
  
+ 	/*
+ 	 * Connect our new clone object to the database:
+ 	 * In parallel restore the parent is already disconnected.
+ 	 * In parallel backup we clone the parent's existing connection.
+ 	 */
+ 	if (AH->ropt)
+ 	{
+ 		RestoreOptions *ropt = AH->ropt;
+ 		Assert(AH->connection == NULL);
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, ropt->dbname,
+ 					ropt->pghost, ropt->pgport, ropt->username,
+ 					ropt->promptPassword);
+ 	}
+ 	else
+ 	{
+ 		char	   *dbname;
+ 		char	   *pghost;
+ 		char	   *pgport;
+ 		char	   *username;
+ 		const char *encname;
+ 
+ 		Assert(AH->connection != NULL);
+ 
+ 		/*
+ 		 * Even though we are technically accessing the parent's database
+ 		 * connection here, these calls are safe: they just return pointers
+ 		 * into the PGconn object and do not send or receive any data to or
+ 		 * from the database.
+ 		 */
+ 		dbname = PQdb(AH->connection);
+ 		pghost = PQhost(AH->connection);
+ 		pgport = PQport(AH->connection);
+ 		username = PQuser(AH->connection);
+ 		encname = pg_encoding_to_char(AH->public.encoding);
+ 
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, dbname, pghost, pgport, username, TRI_NO);
+ 
+ 		/*
+ 		 * Set the same encoding: whatever we set here is what we got from
+ 		 * pg_encoding_to_char(), so setting that very same value again should
+ 		 * not fail. Also see the comment in SetupConnection().
+ 		 */
+ 		PQsetClientEncoding(clone->connection, encname);
+ 	}
+ 
  	/* Let the format-specific code have a chance too */
  	(clone->ClonePtr) (clone);
  
+ 	Assert(clone->connection != NULL);
  	return clone;
  }
  
***************
*** 4342,4348 **** CloneArchive(ArchiveHandle *AH)
   *
   * Note: we assume any clone-local connection was already closed.
   */
! static void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
--- 4144,4150 ----
   *
   * Note: we assume any clone-local connection was already closed.
   */
! void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
***************
*** 100,107 **** typedef z_stream *z_streamp;
--- 100,120 ----
  #define K_OFFSET_POS_SET 2
  #define K_OFFSET_NO_DATA 3
  
+ /*
+  * Special exit values from worker children.  We reserve 0 for normal
+  * success; 1 and other small values should be interpreted as crashes.
+  */
+ #define WORKER_OK                     0
+ #define WORKER_CREATE_DONE            10
+ #define WORKER_INHIBIT_DATA           11
+ #define WORKER_IGNORED_ERRORS         12
+ 
  struct _archiveHandle;
  struct _tocEntry;
+ struct _restoreList;
+ struct _parallel_args;
+ struct _parallel_state;
+ enum _action;
  
  typedef void (*ClosePtr) (struct _archiveHandle * AH);
  typedef void (*ReopenPtr) (struct _archiveHandle * AH);
***************
*** 129,134 **** typedef void (*PrintTocDataPtr) (struct _archiveHandle * AH, struct _tocEntry *
--- 142,154 ----
  typedef void (*ClonePtr) (struct _archiveHandle * AH);
  typedef void (*DeClonePtr) (struct _archiveHandle * AH);
  
+ typedef char *(*WorkerJobRestorePtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*WorkerJobDumpPtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*MasterStartParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 											enum _action act);
+ typedef int (*MasterEndParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 										const char *str, enum _action act);
+ 
  typedef size_t (*CustomOutPtr) (struct _archiveHandle * AH, const void *buf, size_t len);
  
  typedef enum
***************
*** 227,232 **** typedef struct _archiveHandle
--- 247,258 ----
  	StartBlobPtr StartBlobPtr;
  	EndBlobPtr EndBlobPtr;
  
+ 	MasterStartParallelItemPtr MasterStartParallelItemPtr;
+ 	MasterEndParallelItemPtr MasterEndParallelItemPtr;
+ 
+ 	WorkerJobDumpPtr WorkerJobDumpPtr;
+ 	WorkerJobRestorePtr WorkerJobRestorePtr;
+ 
  	ClonePtr ClonePtr;			/* Clone format-specific fields */
  	DeClonePtr DeClonePtr;		/* Clean up cloned fields */
  
***************
*** 236,241 **** typedef struct _archiveHandle
--- 262,268 ----
  	char	   *archdbname;		/* DB name *read* from archive */
  	enum trivalue promptPassword;
  	char	   *savedPassword;	/* password for ropt->username, if known */
+ 	char	   *use_role;
  	PGconn	   *connection;
  	int			connectToDB;	/* Flag to indicate if direct DB connection is
  								 * required */
***************
*** 323,328 **** typedef struct _tocEntry
--- 350,356 ----
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
+ extern int parallel_restore(struct _parallel_args *args);
  extern void on_exit_close_archive(Archive *AHX);
  
  extern void warn_or_exit_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
***************
*** 333,341 **** extern void WriteHead(ArchiveHandle *AH);
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH);
  
  extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
  extern bool checkSeek(FILE *fp);
  
  #define appendStringLiteralAHX(buf,str,AH) \
--- 361,374 ----
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH, struct _parallel_state *pstate);
! extern void WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te);
! 
! extern ArchiveHandle *CloneArchive(ArchiveHandle *AH);
! extern void DeCloneArchive(ArchiveHandle *AH);
  
  extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
+ TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  extern bool checkSeek(FILE *fp);
  
  #define appendStringLiteralAHX(buf,str,AH) \
***************
*** 376,379 **** int			ahprintf(ArchiveHandle *AH, const char *fmt,...) __attribute__((format(PG_
--- 409,424 ----
  
  void		ahlog(ArchiveHandle *AH, int level, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
+ #ifdef USE_ASSERT_CHECKING
+ #define Assert(condition) \
+ 	do { \
+ 		if (!(condition)) \
+ 		{ \
+ 			write_msg(NULL, "Failed assertion in %s, line %d\n", \
+ 					  __FILE__, __LINE__); \
+ 			abort(); \
+ 		} \
+ 	} while (0)
+ #else
+ #define Assert(condition)	((void) 0)
+ #endif
+ 
  #endif
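Macro-based assertions like the Assert definition above are conventionally wrapped in do { } while (0) so they expand to a single statement and are safe inside an unbraced if/else. A small standalone sketch (all names are illustrative, not from the patch; failures are counted instead of aborting so the example is testable):

```c
#include <assert.h>
#include <stdio.h>

static int	assert_failures = 0;

/* Stand-in for write_msg()/abort(): just count and report failures. */
static void
report_failure(const char *file, int line)
{
	fprintf(stderr, "Failed assertion in %s, line %d\n", file, line);
	assert_failures++;
}

/*
 * The do { } while (0) wrapper makes the macro one statement, so the
 * trailing semicolon at the call site parses correctly and no dangling-else
 * ambiguity is introduced.
 */
#define MY_ASSERT(condition) \
	do { \
		if (!(condition)) \
			report_failure(__FILE__, __LINE__); \
	} while (0)

static int
checked_divide(int a, int b)
{
	if (b != 0)
		MY_ASSERT(a >= 0);		/* safe even in this unbraced if/else */
	else
		return 0;				/* binds to the outer if, as intended */
	return a / b;
}
```

With a bare `if`-based macro instead, the `else` above would silently bind to the `if` hidden inside the macro expansion.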
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "compress_io.h"
  #include "dumputils.h"
  #include "dumpmem.h"
+ #include "parallel.h"
  
  /*--------
   * Routines in the format interface
***************
*** 60,65 **** static void _LoadBlobs(ArchiveHandle *AH, bool drop);
--- 61,70 ----
  static void _Clone(ArchiveHandle *AH);
  static void _DeClone(ArchiveHandle *AH);
  
+ static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
+ static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
+ char *_WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te);
+ 
  typedef struct
  {
  	CompressorState *cs;
***************
*** 127,132 **** InitArchiveFmt_Custom(ArchiveHandle *AH)
--- 132,144 ----
  	AH->ClonePtr = _Clone;
  	AH->DeClonePtr = _DeClone;
  
+ 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
+ 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
+ 
+ 	/* no parallel dump in the custom archive, only parallel restore */
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = _WorkerJobRestoreCustom;
+ 
  	/* Set up a private area. */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
  	AH->formatData = (void *) ctx;
***************
*** 698,704 **** _CloseArchive(ArchiveHandle *AH)
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
--- 710,716 ----
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
***************
*** 796,801 **** _DeClone(ArchiveHandle *AH)
--- 808,888 ----
  	free(ctx);
  }
  
+ /*
+  * This function is executed in the child of a parallel restore from a
+  * custom format archive and restores the actual data for one TOC entry.
+  */
+ char *
+ _WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/* short fixed-size string + some ID so far; this needs to be malloc'ed
+ 	 * rather than static because we use threads on Windows */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. Depending on the desired
+  * action (dump or restore) it creates a string that is understood by the
+  * _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char			buf[64]; /* short fixed-size string + number */
+ 
+ 	/* no parallel dump in the custom archive format */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response of
+  * the _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, status, n_errors;
+ 
+ 	/* no parallel dump in the custom archive */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	sscanf(str, "%d %d %d%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 	Assert(nBytes == strlen(str));
+ 	Assert(dumpId == te->dumpId);
+ 
+ 	AH->public.n_errors += n_errors;
+ 
+ 	return status;
+ }
+ 
  /*--------------------------------------------------
   * END OF FORMAT CALLBACKS
   *--------------------------------------------------
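The command/status strings exchanged between master and workers above ("RESTORE &lt;dumpId&gt;" down the pipe, "OK RESTORE &lt;dumpId&gt; &lt;status&gt; &lt;n_errors&gt;" back, parsed with sscanf plus an %n trailing-junk check) can be exercised standalone. The helper names below are illustrative, not the patch's _MasterStartParallelItem/_MasterEndParallelItem:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Master side: build the command string sent down the pipe to a worker. */
static void
build_restore_command(char *buf, size_t buflen, int dumpId)
{
	snprintf(buf, buflen, "RESTORE %d", dumpId);
}

/* Worker side: build the status message sent back to the master. */
static void
build_worker_reply(char *buf, size_t buflen,
				   int dumpId, int status, int n_errors)
{
	snprintf(buf, buflen, "OK RESTORE %d %d %d", dumpId, status, n_errors);
}

/*
 * Master side again: parse the payload after the "OK RESTORE " prefix the
 * way the _MasterEndParallelItem callbacks do. The %n conversion records how
 * many characters were consumed; comparing it against strlen() rejects any
 * trailing garbage. Returns the worker status, or -1 on a malformed message.
 */
static int
parse_worker_reply(const char *str, int *dumpId, int *n_errors)
{
	int			nBytes = 0,
				status;

	if (sscanf(str, "%d %d %d%n", dumpId, &status, n_errors, &nBytes) != 3)
		return -1;
	if (nBytes != (int) strlen(str))
		return -1;
	return status;
}
```

Since these strings travel over a plain pipe, keeping them short, fixed-format, and strictly validated on the receiving end is what makes the protocol robust against a crashed or confused worker.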
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
***************
*** 308,319 **** ConnectDatabase(Archive *AHX,
  	PQsetNoticeProcessor(AH->connection, notice_processor, NULL);
  }
  
  void
  DisconnectDatabase(Archive *AHX)
  {
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
! 	PQfinish(AH->connection);		/* noop if AH->connection is NULL */
  	AH->connection = NULL;
  }
  
--- 308,337 ----
  	PQsetNoticeProcessor(AH->connection, notice_processor, NULL);
  }
  
+ /*
+  * Close the connection to the database, cancelling the current query first
+  * if one is running.
+  */
  void
  DisconnectDatabase(Archive *AHX)
  {
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	PGcancel   *cancel;
+ 	char		errbuf[1];
+ 
+ 	if (!AH->connection)
+ 		return;
  
! 	if (PQtransactionStatus(AH->connection) == PQTRANS_ACTIVE)
! 	{
! 		if ((cancel = PQgetCancel(AH->connection)))
! 		{
! 			PQcancel(cancel, errbuf, sizeof(errbuf));
! 			PQfreeCancel(cancel);
! 		}
! 	}
! 
! 	PQfinish(AH->connection);
  	AH->connection = NULL;
  }
  
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
***************
*** 36,41 ****
--- 36,42 ----
  #include "compress_io.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  #include <dirent.h>
  #include <sys/stat.h>
***************
*** 51,56 **** typedef struct
--- 52,58 ----
  	cfp		   *dataFH;			/* currently open data file */
  
  	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
+ 	ParallelState *pstate;		/* for parallel backup / restore */
  } lclContext;
  
  typedef struct
***************
*** 70,75 **** static int	_ReadByte(ArchiveHandle *);
--- 72,78 ----
  static size_t _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len);
  static size_t _ReadBuf(ArchiveHandle *AH, void *buf, size_t len);
  static void _CloseArchive(ArchiveHandle *AH);
+ static void _ReopenArchive(ArchiveHandle *AH);
  static void _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  
  static void _WriteExtraToc(ArchiveHandle *AH, TocEntry *te);
***************
*** 81,91 **** static void _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
  
! static char *prependDirectory(ArchiveHandle *AH, const char *relativeFilename);
  
  static void createDirectory(const char *dir);
! 
  
  /*
   *	Init routine required by ALL formats. This is a global routine
--- 84,99 ----
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
+ static void _Clone(ArchiveHandle *AH);
+ static void _DeClone(ArchiveHandle *AH);
  
! static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
! static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
! static char *_WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te);
! static char *_WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te);
  
  static void createDirectory(const char *dir);
! static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
  
  /*
   *	Init routine required by ALL formats. This is a global routine
***************
*** 112,118 **** InitArchiveFmt_Directory(ArchiveHandle *AH)
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = NULL;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
--- 120,126 ----
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = _ReopenArchive;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
***************
*** 123,130 **** InitArchiveFmt_Directory(ArchiveHandle *AH)
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = NULL;
! 	AH->DeClonePtr = NULL;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
--- 131,144 ----
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = _Clone;
! 	AH->DeClonePtr = _DeClone;
! 
! 	AH->WorkerJobRestorePtr = _WorkerJobRestoreDirectory;
! 	AH->WorkerJobDumpPtr = _WorkerJobDumpDirectory;
! 
! 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
! 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
***************
*** 153,162 **** InitArchiveFmt_Directory(ArchiveHandle *AH)
  	}
  	else
  	{							/* Read Mode */
! 		char	   *fname;
  		cfp		   *tocFH;
  
! 		fname = prependDirectory(AH, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
--- 167,176 ----
  	}
  	else
  	{							/* Read Mode */
! 		char	   fname[MAXPGPATH];
  		cfp		   *tocFH;
  
! 		prependDirectory(AH, fname, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
***************
*** 282,290 **** _StartData(ArchiveHandle *AH, TocEntry *te)
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
--- 296,304 ----
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
***************
*** 309,314 **** _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
--- 323,331 ----
  	if (dLen == 0)
  		return 0;
  
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	return cfwrite(data, dLen, ctx->dataFH);
  }
  
***************
*** 376,383 **** _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt)
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char	   *fname = prependDirectory(AH, tctx->filename);
  
  		_PrintFileData(AH, fname, ropt);
  	}
  }
--- 393,401 ----
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char		fname[MAXPGPATH];
  
+ 		prependDirectory(AH, fname, tctx->filename);
  		_PrintFileData(AH, fname, ropt);
  	}
  }
***************
*** 387,398 **** _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt)
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
--- 405,416 ----
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
***************
*** 475,480 **** _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
--- 493,501 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  	size_t		res;
  
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
  		exit_horribly(modulename, "could not write to output file: %s\n",
***************
*** 519,525 **** _CloseArchive(ArchiveHandle *AH)
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char	   *fname = prependDirectory(AH, "toc.dat");
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
--- 540,551 ----
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char		fname[MAXPGPATH];
! 
! 		prependDirectory(AH, fname, "toc.dat");
! 
! 		/* this will actually fork the processes for a parallel backup */
! 		ctx->pstate = ParallelBackupStart(AH, NULL);
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
***************
*** 540,550 **** _CloseArchive(ArchiveHandle *AH)
  		if (cfclose(tocFH) != 0)
  			exit_horribly(modulename, "could not close TOC file: %s\n",
  						  strerror(errno));
! 		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
  }
  
  
  /*
   * BLOB support
--- 566,589 ----
  		if (cfclose(tocFH) != 0)
  			exit_horribly(modulename, "could not close TOC file: %s\n",
  						  strerror(errno));
! 		WriteDataChunks(AH, ctx->pstate);
! 
! 		ParallelBackupEnd(AH, ctx->pstate);
  	}
  	AH->FH = NULL;
  }
  
+ /*
+  * Reopen the archive's file handle.
+  */
+ static void
+ _ReopenArchive(ArchiveHandle *AH)
+ {
+ 	/*
+ 	 * Our TOC is in memory and each child opens its own data files, so we
+ 	 * support reopening the archive by just doing nothing.
+ 	 */
+ }
  
  /*
   * BLOB support
***************
*** 561,569 **** static void
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
--- 600,608 ----
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
***************
*** 656,667 **** createDirectory(const char *dir)
  					  dir, strerror(errno));
  }
  
! 
  static char *
! prependDirectory(ArchiveHandle *AH, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
- 	static char buf[MAXPGPATH];
  	char	   *dname;
  
  	dname = ctx->directory;
--- 695,710 ----
  					  dir, strerror(errno));
  }
  
! /*
!  * Takes a relative file name and prepends the output directory, writing the
!  * result to buf. The caller must ensure that buf is at least MAXPGPATH bytes
!  * long. We can't use a static char[MAXPGPATH] inside the function because we
!  * run multithreaded on Windows.
!  */
  static char *
! prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
  	char	   *dname;
  
  	dname = ctx->directory;
***************
*** 675,677 **** prependDirectory(ArchiveHandle *AH, const char *relativeFilename)
--- 718,869 ----
  
  	return buf;
  }
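The revised prependDirectory signature illustrates a common thread-safety pattern: the caller supplies the output buffer instead of the function returning a pointer into function-local static storage, so concurrent worker threads on Windows cannot clobber each other's results. A minimal standalone sketch (the names and the path-length constant are illustrative stand-ins for prependDirectory and MAXPGPATH):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FAKE_MAXPGPATH 256		/* illustrative stand-in for MAXPGPATH */

/*
 * Thread-safe variant: the caller owns buf (at least FAKE_MAXPGPATH bytes),
 * so simultaneous calls from different threads each write into their own
 * storage. A static char buf[FAKE_MAXPGPATH] here would be shared state.
 */
static char *
prepend_directory(const char *dir, char *buf, const char *relativeFilename)
{
	snprintf(buf, FAKE_MAXPGPATH, "%s/%s", dir, relativeFilename);
	return buf;
}
```

Returning buf as well keeps call sites terse, e.g. `cfopen_read(prepend_directory(dir, fname, "toc.dat"), ...)`, without reintroducing shared state.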
+ 
+ /*
+  * Clone format-specific fields during a parallel dump or restore.
+  */
+ static void
+ _Clone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	AH->formatData = (lclContext *) pg_malloc(sizeof(lclContext));
+ 	memcpy(AH->formatData, ctx, sizeof(lclContext));
+ 	ctx = (lclContext *) AH->formatData;
+ 
+ 	/*
+ 	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
+ 	 * entry per archive, so no parallelism is possible.  Likewise,
+ 	 * TOC-entry-local state isn't an issue because any one TOC entry is
+ 	 * touched by just one worker child.
+ 	 */
+ 
+ 	/*
+ 	 * We also don't make a copy of what the ParallelState pointer (pstate)
+ 	 * refers to; only the master process ever writes to it.
+ 	 */
+ }
+ 
+ static void
+ _DeClone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 	free(ctx);
+ }
+ 
+ /*
+  * This function is executed in the parent process. Depending on the desired
+  * action (dump or restore) it creates a string that is understood by the
+  * _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char	buf[64];
+ 
+ 	if (act == ACT_DUMP)
+ 		snprintf(buf, sizeof(buf), "DUMP %d", te->dumpId);
+ 	else if (act == ACT_RESTORE)
+ 		snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel backup for the
+  * directory archive and dumps the actual data.
+  *
+  * We currently return only the DumpId, so in theory we could make this
+  * function return an int (or a DumpId). However, to facilitate further
+  * enhancements, and because sooner or later we need to convert the result
+  * to a string and send it via a message anyway, we stick with char *. It
+  * is parsed on the other side by the _MasterEndParallelItem() function of
+  * the respective dump format.
+  */
+ static char *
+ _WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * Short fixed-size string + some ID so far; this needs to be malloc'ed
+ 	 * rather than static because on Windows we work with threads.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
+ 
+ 	/* This should never happen */
+ 	if (!tctx)
+ 		exit_horribly(modulename, "error during backup\n");
+ 
+ 	/*
+ 	 * This function returns void. We either fail and die horribly or succeed...
+ 	 * A failure will be detected by the parent when the child dies unexpectedly.
+ 	 */
+ 	WriteDataChunksForTocEntry(AH, te);
+ 
+ 	snprintf(buf, buflen, "OK DUMP %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel restore from a
+  * directory archive and restores the actual data for a given TOC entry.
+  */
+ static char *
+ _WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/*
+ 	 * Short fixed-size string + some ID so far; this needs to be malloc'ed
+ 	 * rather than static because on Windows we work with threads.
+ 	 */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response of
+  * the _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, n_errors;
+ 	int			status = 0;
+ 
+ 	if (act == ACT_DUMP)
+ 	{
+ 		sscanf(str, "%u%n", &dumpId, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 	}
+ 	else if (act == ACT_RESTORE)
+ 	{
+ 		sscanf(str, "%u %u %u%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 
+ 		AH->public.n_errors += n_errors;
+ 	}
+ 
+ 	return status;
+ }
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
***************
*** 155,160 **** InitArchiveFmt_Tar(ArchiveHandle *AH)
--- 155,166 ----
  	AH->ClonePtr = NULL;
  	AH->DeClonePtr = NULL;
  
+ 	AH->MasterStartParallelItemPtr = NULL;
+ 	AH->MasterEndParallelItemPtr = NULL;
+ 
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = NULL;
+ 
  	/*
  	 * Set up some special context used in compressing data.
  	 */
***************
*** 834,840 **** _CloseArchive(ArchiveHandle *AH)
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
--- 840,846 ----
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
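The pg_dump.c hunks below validate the new -j/--jobs value: it must be positive, and on Windows it may not exceed MAXIMUM_WAIT_OBJECTS because of the WaitForMultipleObjects() limit. The same check, pulled out into a testable helper (an illustrative function, not one the patch defines):

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound imposed by WaitForMultipleObjects() on Windows. */
#define MAX_WIN32_WORKERS 64

/*
 * Validate a requested worker count the way the pg_dump.c hunk does: it
 * must be positive, and on Windows it must not exceed the
 * WaitForMultipleObjects() limit.
 */
static bool
num_workers_valid(int numWorkers, bool is_windows)
{
	if (numWorkers <= 0)
		return false;
	if (is_windows && numWorkers > MAX_WIN32_WORKERS)
		return false;
	return true;
}
```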
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
***************
*** 139,144 **** static int	disable_dollar_quoting = 0;
--- 139,145 ----
  static int	dump_inserts = 0;
  static int	column_inserts = 0;
  static int	no_security_labels = 0;
+ static int  no_synchronized_snapshots = 0;
  static int	no_unlogged_table_data = 0;
  static int	serializable_deferrable = 0;
  
***************
*** 235,242 **** static Oid	findLastBuiltinOid_V70(Archive *fout);
  static void selectSourceSchema(Archive *fout, const char *schemaName);
  static char *getFormattedTypeName(Archive *fout, Oid oid, OidOptions opts);
  static char *myFormatType(const char *typname, int32 typmod);
- static const char *fmtQualifiedId(Archive *fout,
- 								  const char *schema, const char *id);
  static void getBlobs(Archive *fout);
  static void dumpBlob(Archive *fout, BlobInfo *binfo);
  static int	dumpBlobs(Archive *fout, void *arg);
--- 236,241 ----
***************
*** 254,261 **** static void binary_upgrade_extension_member(PQExpBuffer upgrade_buffer,
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti);
! static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
  
  int
  main(int argc, char **argv)
--- 253,261 ----
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer);
! static char *get_synchronized_snapshot(Archive *fout);
! static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, const char *query);
  
  int
  main(int argc, char **argv)
***************
*** 274,279 **** main(int argc, char **argv)
--- 274,280 ----
  	DumpableObject **dobjs;
  	int			numObjs;
  	int			i;
+ 	int			numWorkers = 1;
  	enum trivalue prompt_password = TRI_DEFAULT;
  	int			compressLevel = -1;
  	int			plainText = 0;
***************
*** 303,308 **** main(int argc, char **argv)
--- 304,310 ----
  		{"format", required_argument, NULL, 'F'},
  		{"host", required_argument, NULL, 'h'},
  		{"ignore-version", no_argument, NULL, 'i'},
+ 		{"jobs", required_argument, NULL, 'j'},
  		{"no-reconnect", no_argument, NULL, 'R'},
  		{"oids", no_argument, NULL, 'o'},
  		{"no-owner", no_argument, NULL, 'O'},
***************
*** 342,347 **** main(int argc, char **argv)
--- 344,350 ----
  		{"serializable-deferrable", no_argument, &serializable_deferrable, 1},
  		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
  		{"no-security-labels", no_argument, &no_security_labels, 1},
+ 		{"no-synchronized-snapshots", no_argument, &no_synchronized_snapshots, 1},
  		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
  
  		{NULL, 0, NULL, 0}
***************
*** 349,354 **** main(int argc, char **argv)
--- 352,359 ----
  
  	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_dump"));
  
+ 	init_parallel_dump_utils();
+ 
  	g_verbose = false;
  
  	strcpy(g_comment_start, "-- ");
***************
*** 379,385 **** main(int argc, char **argv)
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:in:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
--- 384,390 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:ij:n:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
***************
*** 420,425 **** main(int argc, char **argv)
--- 425,434 ----
  				/* ignored, deprecated option */
  				break;
  
+ 			case 'j':			/* number of dump jobs */
+ 				numWorkers = atoi(optarg);
+ 				break;
+ 
  			case 'n':			/* include schema(s) */
  				simple_string_list_append(&schema_include_patterns, optarg);
  				include_everything = false;
***************
*** 572,577 **** main(int argc, char **argv)
--- 581,602 ----
  			compressLevel = 0;
  	}
  
+ 	/*
+ 	 * On Windows we can have at most MAXIMUM_WAIT_OBJECTS (usually 64)
+ 	 * parallel jobs because that is the maximum for the
+ 	 * WaitForMultipleObjects() call.
+ 	 */
+ 	if (numWorkers <= 0
+ #ifdef WIN32
+ 			|| numWorkers > MAXIMUM_WAIT_OBJECTS
+ #endif
+ 		)
+ 		exit_horribly(NULL, "%s: invalid number of parallel jobs\n", progname);
+ 
+ 	/* Parallel backup only in the directory archive format so far */
+ 	if (archiveFormat != archDirectory && numWorkers > 1)
+ 		exit_horribly(NULL, "parallel backup only supported by the directory format\n");
+ 
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
  
***************
*** 595,600 **** main(int argc, char **argv)
--- 620,627 ----
  	fout->minRemoteVersion = 70000;
  	fout->maxRemoteVersion = (my_version / 100) * 100 + 99;
  
+ 	fout->numWorkers = numWorkers;
+ 
  	/*
  	 * Open the database using the Archiver, so it knows about it. Errors mean
  	 * death.
***************
*** 609,633 **** main(int argc, char **argv)
  	if (fout->remoteVersion < 90100)
  		no_security_labels = 1;
  
- 	/*
- 	 * Start transaction-snapshot mode transaction to dump consistent data.
- 	 */
- 	ExecuteSqlStatement(fout, "BEGIN");
- 	if (fout->remoteVersion >= 90100)
- 	{
- 		if (serializable_deferrable)
- 			ExecuteSqlStatement(fout,
- 								"SET TRANSACTION ISOLATION LEVEL "
- 								"SERIALIZABLE, READ ONLY, DEFERRABLE");
- 		else
- 			ExecuteSqlStatement(fout,
- 						   		"SET TRANSACTION ISOLATION LEVEL "
- 								"REPEATABLE READ");
- 	}
- 	else
- 		ExecuteSqlStatement(fout,
- 							"SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
- 
  	/* Select the appropriate subquery to convert user IDs to names */
  	if (fout->remoteVersion >= 80100)
  		username_subquery = "SELECT rolname FROM pg_catalog.pg_roles WHERE oid =";
--- 636,641 ----
***************
*** 636,641 **** main(int argc, char **argv)
--- 644,657 ----
  	else
  		username_subquery = "SELECT usename FROM pg_user WHERE usesysid =";
  
+ 	/* check the version for the synchronized snapshots feature */
+ 	if (numWorkers > 1 && fout->remoteVersion < 90200
+ 		&& !no_synchronized_snapshots)
+ 		exit_horribly(NULL,
+ 					 "synchronized snapshots are not available in this server version; "
+ 					 "run with --no-synchronized-snapshots instead if you do not "
+ 					 "need synchronized snapshots\n");
+ 
  	/* Find the last built-in OID, if needed */
  	if (fout->remoteVersion < 70300)
  	{
***************
*** 723,728 **** main(int argc, char **argv)
--- 739,748 ----
  	else
  		sortDumpableObjectsByTypeOid(dobjs, numObjs);
  
+ 	/* If we do a parallel dump, we want the largest tables to go first */
+ 	if (archiveFormat == archDirectory && numWorkers > 1)
+ 		sortDataAndIndexObjectsBySize(dobjs, numObjs);
+ 
  	sortDumpableObjects(dobjs, numObjs);
  
  	/*
***************
*** 786,791 **** help(const char *progname)
--- 806,812 ----
  	printf(_("  -f, --file=FILENAME         output file or directory name\n"));
  	printf(_("  -F, --format=c|d|t|p        output file format (custom, directory, tar,\n"
  			 "                              plain text (default))\n"));
+ 	printf(_("  -j, --jobs=NUM              use this many parallel jobs to dump\n"));
  	printf(_("  -v, --verbose               verbose mode\n"));
  	printf(_("  -Z, --compress=0-9          compression level for compressed formats\n"));
  	printf(_("  --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
***************
*** 815,820 **** help(const char *progname)
--- 836,842 ----
  	printf(_("  --exclude-table-data=TABLE  do NOT dump data for the named table(s)\n"));
  	printf(_("  --inserts                   dump data as INSERT commands, rather than COPY\n"));
  	printf(_("  --no-security-labels        do not dump security label assignments\n"));
+ 	printf(_("  --no-synchronized-snapshots parallel processes should not use synchronized snapshots\n"));
  	printf(_("  --no-tablespaces            do not dump tablespace assignments\n"));
  	printf(_("  --no-unlogged-table-data    do not dump unlogged table data\n"));
  	printf(_("  --quote-all-identifiers     quote all identifiers, even if not key words\n"));
***************
*** 843,849 **** setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
  	PGconn	   *conn = GetConnection(AH);
  	const char *std_strings;
  
! 	/* Set the client encoding if requested */
  	if (dumpencoding)
  	{
  		if (PQsetClientEncoding(conn, dumpencoding) < 0)
--- 865,876 ----
  	PGconn	   *conn = GetConnection(AH);
  	const char *std_strings;
  
! 	/*
! 	 * Set the client encoding if requested.  If dumpencoding == NULL, then
! 	 * either it hasn't been requested or we're a cloned connection, in which
! 	 * case it has already been set in CloneArchive according to the original
! 	 * connection's encoding.
! 	 */
  	if (dumpencoding)
  	{
  		if (PQsetClientEncoding(conn, dumpencoding) < 0)
***************
*** 861,866 **** setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
--- 888,897 ----
  	AH->std_strings = (std_strings && strcmp(std_strings, "on") == 0);
  
  	/* Set the role if requested */
+ 	if (!use_role && AH->use_role)
+ 		use_role = AH->use_role;	/* on a worker, inherit the master's role */
+ 
  	if (use_role && AH->remoteVersion >= 80100)
  	{
  		PQExpBuffer query = createPQExpBuffer();
***************
*** 868,873 **** setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
--- 899,908 ----
  		appendPQExpBuffer(query, "SET ROLE %s", fmtId(use_role));
  		ExecuteSqlStatement(AH, query->data);
  		destroyPQExpBuffer(query);
+ 
+ 		/* save this for later use on parallel connections */
+ 		if (!AH->use_role)
+ 			AH->use_role = strdup(use_role);
  	}
  
  	/* Set the datestyle to ISO to ensure the dump's portability */
***************
*** 904,909 **** setup_connection(Archive *AH, const char *dumpencoding, char *use_role)
--- 939,997 ----
  	 */
  	if (quote_all_identifiers && AH->remoteVersion >= 90100)
  		ExecuteSqlStatement(AH, "SET quote_all_identifiers = true");
+ 
+ 	/*
+ 	 * Start transaction-snapshot mode transaction to dump consistent data.
+ 	 */
+ 	ExecuteSqlStatement(AH, "BEGIN");
+ 	if (AH->remoteVersion >= 90100)
+ 	{
+ 		if (serializable_deferrable)
+ 			ExecuteSqlStatement(AH,
+ 						   "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, "
+ 						   "READ ONLY, DEFERRABLE");
+ 		else
+ 			ExecuteSqlStatement(AH,
+ 						   "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
+ 	}
+ 	else
+ 		ExecuteSqlStatement(AH, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
+ 
+ 	if (AH->numWorkers > 1 && AH->remoteVersion >= 90200 && !no_synchronized_snapshots)
+ 	{
+ 		if (AH->sync_snapshot_id)
+ 		{
+ 			PQExpBuffer query = createPQExpBuffer();
+ 
+ 			appendPQExpBuffer(query, "SET TRANSACTION SNAPSHOT ");
+ 			appendStringLiteralConn(query, AH->sync_snapshot_id, conn);
+ 			ExecuteSqlStatement(AH, query->data);
+ 			destroyPQExpBuffer(query);
+ 		}
+ 		else
+ 			AH->sync_snapshot_id = get_synchronized_snapshot(AH);
+ 	}
+ }
+ 
+ /*
+  * Initialize the connection for a new worker process.
+  */
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	setup_connection(AHX, NULL, NULL);
+ }
+ 
+ static char *
+ get_synchronized_snapshot(Archive *fout)
+ {
+ 	const char *query = "SELECT pg_export_snapshot()";
+ 	char	   *result;
+ 	PGresult   *res;
+ 
+ 	res = ExecuteSqlQueryForSingleRow(fout, query);
+ 	result = pg_strdup(PQgetvalue(res, 0, 0));
+ 	PQclear(res);
+ 
+ 	return result;
  }
  
  static ArchiveFormat
***************
*** 1220,1225 **** dumpTableData_copy(Archive *fout, void *dcontext)
--- 1308,1318 ----
  	const bool	hasoids = tbinfo->hasoids;
  	const bool	oids = tdinfo->oids;
  	PQExpBuffer q = createPQExpBuffer();
+ 	/*
+ 	 * Note: we can't use getThreadLocalPQExpBuffer() here, because we call
+ 	 * fmtId, which uses it already.
+ 	 */
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	PGconn	   *conn = GetConnection(fout);
  	PGresult   *res;
  	int			ret;
***************
*** 1244,1257 **** dumpTableData_copy(Archive *fout, void *dcontext)
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
  	if (fout->remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
  	if (oids && hasoids)
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
--- 1337,1350 ----
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
  	if (fout->remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo, clistBuf);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
  	if (oids && hasoids)
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
***************
*** 1269,1275 **** dumpTableData_copy(Archive *fout, void *dcontext)
  		else
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  tdinfo->filtercond);
--- 1362,1368 ----
  		else
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  tdinfo->filtercond);
***************
*** 1277,1289 **** dumpTableData_copy(Archive *fout, void *dcontext)
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
  	}
  	res = ExecuteSqlQuery(fout, q->data, PGRES_COPY_OUT);
  	PQclear(res);
  
  	for (;;)
  	{
--- 1370,1383 ----
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
  	}
  	res = ExecuteSqlQuery(fout, q->data, PGRES_COPY_OUT);
  	PQclear(res);
+ 	destroyPQExpBuffer(clistBuf);
  
  	for (;;)
  	{
***************
*** 1402,1408 **** dumpTableData_insert(Archive *fout, void *dcontext)
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
--- 1496,1502 ----
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
***************
*** 1410,1416 **** dumpTableData_insert(Archive *fout, void *dcontext)
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
--- 1504,1510 ----
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
***************
*** 1542,1547 **** dumpTableData(Archive *fout, TableDataInfo *tdinfo)
--- 1636,1642 ----
  {
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	PQExpBuffer copyBuf = createPQExpBuffer();
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	DataDumperPtr dumpFn;
  	char	   *copyStmt;
  
***************
*** 1553,1559 **** dumpTableData(Archive *fout, TableDataInfo *tdinfo)
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
--- 1648,1654 ----
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo, clistBuf),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
***************
*** 1573,1578 **** dumpTableData(Archive *fout, TableDataInfo *tdinfo)
--- 1668,1674 ----
  				 dumpFn, tdinfo);
  
  	destroyPQExpBuffer(copyBuf);
+ 	destroyPQExpBuffer(clistBuf);
  }
  
  /*
***************
*** 3842,3847 **** getTables(Archive *fout, int *numTables)
--- 3938,3944 ----
  	int			i_reloptions;
  	int			i_toastreloptions;
  	int			i_reloftype;
+ 	int			i_relpages;
  
  	/* Make sure we are in proper schema */
  	selectSourceSchema(fout, "pg_catalog");
***************
*** 3881,3886 **** getTables(Archive *fout, int *numTables)
--- 3978,3984 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "c.relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 3917,3922 **** getTables(Archive *fout, int *numTables)
--- 4015,4021 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 3952,3957 **** getTables(Archive *fout, int *numTables)
--- 4051,4057 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 3987,3992 **** getTables(Archive *fout, int *numTables)
--- 4087,4093 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 4023,4028 **** getTables(Archive *fout, int *numTables)
--- 4124,4130 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 4058,4063 **** getTables(Archive *fout, int *numTables)
--- 4160,4166 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
***************
*** 4089,4094 **** getTables(Archive *fout, int *numTables)
--- 4192,4198 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
***************
*** 4115,4120 **** getTables(Archive *fout, int *numTables)
--- 4219,4225 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
***************
*** 4151,4156 **** getTables(Archive *fout, int *numTables)
--- 4256,4262 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
***************
*** 4204,4209 **** getTables(Archive *fout, int *numTables)
--- 4310,4316 ----
  	i_reloptions = PQfnumber(res, "reloptions");
  	i_toastreloptions = PQfnumber(res, "toast_reloptions");
  	i_reloftype = PQfnumber(res, "reloftype");
+ 	i_relpages = PQfnumber(res, "relpages");
  
  	if (lockWaitTimeout && fout->remoteVersion >= 70300)
  	{
***************
*** 4260,4265 **** getTables(Archive *fout, int *numTables)
--- 4367,4373 ----
  		tblinfo[i].reltablespace = pg_strdup(PQgetvalue(res, i, i_reltablespace));
  		tblinfo[i].reloptions = pg_strdup(PQgetvalue(res, i, i_reloptions));
  		tblinfo[i].toast_reloptions = pg_strdup(PQgetvalue(res, i, i_toastreloptions));
+ 		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  
  		/* other fields were zeroed above */
  
***************
*** 4288,4294 **** getTables(Archive *fout, int *numTables)
  			resetPQExpBuffer(query);
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
! 						 fmtQualifiedId(fout,
  										tblinfo[i].dobj.namespace->dobj.name,
  										tblinfo[i].dobj.name));
  			ExecuteSqlStatement(fout, query->data);
--- 4396,4402 ----
  			resetPQExpBuffer(query);
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
! 						 fmtQualifiedId(fout->remoteVersion,
  										tblinfo[i].dobj.namespace->dobj.name,
  										tblinfo[i].dobj.name));
  			ExecuteSqlStatement(fout, query->data);
***************
*** 4422,4428 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
  				i_conoid,
  				i_condef,
  				i_tablespace,
! 				i_options;
  	int			ntups;
  
  	for (i = 0; i < numTables; i++)
--- 4530,4537 ----
  				i_conoid,
  				i_condef,
  				i_tablespace,
! 				i_options,
! 				i_relpages;
  	int			ntups;
  
  	for (i = 0; i < numTables; i++)
***************
*** 4464,4469 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4573,4579 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
***************
*** 4489,4494 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4599,4605 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
***************
*** 4517,4522 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4628,4634 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
***************
*** 4545,4550 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4657,4663 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
***************
*** 4573,4578 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4686,4692 ----
  							  "pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, false AS indisclustered, "
+ 							  "t.relpages, "
  							  "CASE WHEN i.indisprimary THEN 'p'::char "
  							  "ELSE '0'::char END AS contype, "
  							  "t.relname AS conname, "
***************
*** 4599,4604 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4713,4719 ----
  							  "pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, false AS indisclustered, "
+ 							  "t.relpages, "
  							  "CASE WHEN i.indisprimary THEN 'p'::char "
  							  "ELSE '0'::char END AS contype, "
  							  "t.relname AS conname, "
***************
*** 4627,4632 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4742,4748 ----
  		i_indnkeys = PQfnumber(res, "indnkeys");
  		i_indkey = PQfnumber(res, "indkey");
  		i_indisclustered = PQfnumber(res, "indisclustered");
+ 		i_relpages = PQfnumber(res, "relpages");
  		i_contype = PQfnumber(res, "contype");
  		i_conname = PQfnumber(res, "conname");
  		i_condeferrable = PQfnumber(res, "condeferrable");
***************
*** 4669,4674 **** getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
--- 4785,4791 ----
  			parseOidArray(PQgetvalue(res, j, i_indkey),
  						  indxinfo[j].indkeys, INDEX_MAX_KEYS);
  			indxinfo[j].indisclustered = (PQgetvalue(res, j, i_indisclustered)[0] == 't');
+ 			indxinfo[j].relpages = atoi(PQgetvalue(res, j, i_relpages));
  			contype = *(PQgetvalue(res, j, i_contype));
  
  			if (contype == 'p' || contype == 'u' || contype == 'x')
***************
*** 14070,14091 **** getDependencies(Archive *fout)
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
   */
  static void
  selectSourceSchema(Archive *fout, const char *schemaName)
  {
- 	static char *curSchemaName = NULL;
  	PQExpBuffer query;
  
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (fout->remoteVersion < 70300)
  		return;
- 	/* Ignore null schema names */
- 	if (schemaName == NULL || *schemaName == '\0')
- 		return;
- 	/* Optimize away repeated selection of same schema */
- 	if (curSchemaName && strcmp(curSchemaName, schemaName) == 0)
- 		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
--- 14187,14207 ----
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
+  *
+  * This function is called only from selectSourceSchemaOnAH and
+  * selectSourceSchema.
   */
  static void
  selectSourceSchema(Archive *fout, const char *schemaName)
  {
  	PQExpBuffer query;
  
+ 	/* This is checked by the callers already */
+ 	Assert(schemaName != NULL && *schemaName != '\0');
+ 
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (fout->remoteVersion < 70300)
  		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
***************
*** 14096,14104 **** selectSourceSchema(Archive *fout, const char *schemaName)
  	ExecuteSqlStatement(fout, query->data);
  
  	destroyPQExpBuffer(query);
- 	if (curSchemaName)
- 		free(curSchemaName);
- 	curSchemaName = pg_strdup(schemaName);
  }
  
  /*
--- 14212,14217 ----
***************
*** 14236,14313 **** myFormatType(const char *typname, int32 typmod)
  }
  
  /*
-  * fmtQualifiedId - convert a qualified name to the proper format for
-  * the source database.
-  *
-  * Like fmtId, use the result before calling again.
-  */
- static const char *
- fmtQualifiedId(Archive *fout, const char *schema, const char *id)
- {
- 	static PQExpBuffer id_return = NULL;
- 
- 	if (id_return)				/* first time through? */
- 		resetPQExpBuffer(id_return);
- 	else
- 		id_return = createPQExpBuffer();
- 
- 	/* Suppress schema name if fetching from pre-7.3 DB */
- 	if (fout->remoteVersion >= 70300 && schema && *schema)
- 	{
- 		appendPQExpBuffer(id_return, "%s.",
- 						  fmtId(schema));
- 	}
- 	appendPQExpBuffer(id_return, "%s",
- 					  fmtId(id));
- 
- 	return id_return->data;
- }
- 
- /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti)
  {
- 	static PQExpBuffer q = NULL;
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	if (q)						/* first time through? */
! 		resetPQExpBuffer(q);
! 	else
! 		q = createPQExpBuffer();
! 
! 	appendPQExpBuffer(q, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(q, ", ");
! 		appendPQExpBuffer(q, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(q, ")");
! 	return q->data;
  }
  
  /*
   * Execute an SQL query and verify that we got exactly one row back.
   */
  static PGresult *
! ExecuteSqlQueryForSingleRow(Archive *fout, char *query)
  {
  	PGresult   *res;
  	int			ntups;
--- 14349,14392 ----
  }
  
  /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer)
  {
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	appendPQExpBuffer(buffer, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(buffer, ", ");
! 		appendPQExpBuffer(buffer, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(buffer, ")");
! 	return buffer->data;
  }
  
  /*
   * Execute an SQL query and verify that we got exactly one row back.
   */
  static PGresult *
! ExecuteSqlQueryForSingleRow(Archive *fout, const char *query)
  {
  	PGresult   *res;
  	int			ntups;
*** a/src/bin/pg_dump/pg_dump.h
--- b/src/bin/pg_dump/pg_dump.h
***************
*** 256,261 **** typedef struct _tableInfo
--- 256,262 ----
  	/* these two are set only if table is a sequence owned by a column: */
  	Oid			owning_tab;		/* OID of table owning sequence */
  	int			owning_col;		/* attr # of column owning sequence */
+ 	int			relpages;
  
  	bool		interesting;	/* true if need to collect more data */
  
***************
*** 319,324 **** typedef struct _indxInfo
--- 320,326 ----
  	bool		indisclustered;
  	/* if there is an associated constraint object, its dumpId: */
  	DumpId		indexconstraint;
+ 	int			relpages;		/* relpages of the underlying table */
  } IndxInfo;
  
  typedef struct _ruleInfo
***************
*** 523,528 **** extern void parseOidArray(const char *str, Oid *array, int arraysize);
--- 525,531 ----
  extern void sortDumpableObjects(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeName(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeOid(DumpableObject **objs, int numObjs);
+ extern void sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs);
  
  /*
   * version specific routines
*** a/src/bin/pg_dump/pg_dump_sort.c
--- b/src/bin/pg_dump/pg_dump_sort.c
***************
*** 121,126 **** static void repairDependencyLoop(DumpableObject **loop,
--- 121,213 ----
  static void describeDumpableObject(DumpableObject *obj,
  					   char *buf, int bufsize);
  
+ static int DOSizeCompare(const void *p1, const void *p2);
+ 
+ static int
+ findFirstEqualType(DumpableObjectType type, DumpableObject **objs, int numObjs)
+ {
+ 	int i;
+ 	for (i = 0; i < numObjs; i++)
+ 		if (objs[i]->objType == type)
+ 			return i;
+ 	return -1;
+ }
+ 
+ static int
+ findFirstDifferentType(DumpableObjectType type, DumpableObject **objs, int numObjs, int start)
+ {
+ 	int i;
+ 	for (i = start; i < numObjs; i++)
+ 		if (objs[i]->objType != type)
+ 			return i;
+ 	return numObjs - 1;
+ }
+ 
+ /*
+  * When we do a parallel dump, we want to start with the largest items first.
+  *
+  * Say we have the objects in this order:
+  * ....DDDDD....III....
+  *
+  * with D = Table data, I = Index, . = other object
+  *
+  * This sorting function now takes each of the D or I blocks and sorts them
+  * according to their size.
+  */
+ void
+ sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs)
+ {
+ 	int		startIdx, endIdx;
+ 	void   *startPtr;
+ 
+ 	if (numObjs <= 1)
+ 		return;
+ 
+ 	startIdx = findFirstEqualType(DO_TABLE_DATA, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_TABLE_DATA, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ 
+ 	startIdx = findFirstEqualType(DO_INDEX, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_INDEX, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ }
+ 
+ static int
+ DOSizeCompare(const void *p1, const void *p2)
+ {
+ 	DumpableObject *obj1 = *(DumpableObject **) p1;
+ 	DumpableObject *obj2 = *(DumpableObject **) p2;
+ 	int			obj1_size = 0;
+ 	int			obj2_size = 0;
+ 
+ 	if (obj1->objType == DO_TABLE_DATA)
+ 		obj1_size = ((TableDataInfo *) obj1)->tdtable->relpages;
+ 	if (obj1->objType == DO_INDEX)
+ 		obj1_size = ((IndxInfo *) obj1)->relpages;
+ 
+ 	if (obj2->objType == DO_TABLE_DATA)
+ 		obj2_size = ((TableDataInfo *) obj2)->tdtable->relpages;
+ 	if (obj2->objType == DO_INDEX)
+ 		obj2_size = ((IndxInfo *) obj2)->relpages;
+ 
+ 	/* we want to see the biggest item go first */
+ 	if (obj1_size > obj2_size)
+ 		return -1;
+ 	if (obj2_size > obj1_size)
+ 		return 1;
+ 
+ 	return 0;
+ }
  
  /*
   * Sort the given objects into a type/name-based ordering
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
***************
*** 1918,1920 **** doShellQuoting(PQExpBuffer buf, const char *str)
--- 1918,1923 ----
  	appendPQExpBufferChar(buf, '"');
  #endif   /* WIN32 */
  }
+ 
+ /* dummy, no parallel dump/restore for pg_dumpall yet */
+ void _SetupWorker(Archive *AHX, RestoreOptions *ropt) {}
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
***************
*** 72,77 **** main(int argc, char **argv)
--- 72,78 ----
  	RestoreOptions *opts;
  	int			c;
  	int			exit_code;
+ 	int			numWorkers = 1;
  	Archive    *AH;
  	char	   *inputFileSpec;
  	static int	disable_triggers = 0;
***************
*** 183,189 **** main(int argc, char **argv)
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				opts->number_of_jobs = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
--- 184,190 ----
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				numWorkers = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
***************
*** 338,344 **** main(int argc, char **argv)
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && opts->number_of_jobs > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
--- 339,345 ----
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && numWorkers > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
***************
*** 405,410 **** main(int argc, char **argv)
--- 406,422 ----
  		InitDummyWantedList(AH, opts);
  	}
  
+ 	/* See comments in pg_dump.c */
+ #ifdef WIN32
+ 	if (numWorkers > MAXIMUM_WAIT_OBJECTS)
+ 	{
+ 		fprintf(stderr, _("%s: invalid number of parallel jobs\n"),	progname);
+ 		exit(1);
+ 	}
+ #endif
+ 
+ 	AH->numWorkers = numWorkers;
+ 
  	if (opts->tocSummary)
  		PrintTOCSummary(AH, opts);
  	else
***************
*** 423,428 **** main(int argc, char **argv)
--- 435,447 ----
  	return exit_code;
  }
  
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	(AH->ReopenPtr) (AH);
+ }
+ 
  static void
  usage(const char *progname)
  {
*** a/src/tools/msvc/Mkvcbuild.pm
--- b/src/tools/msvc/Mkvcbuild.pm
***************
*** 344,349 **** sub mkvcbuild
--- 344,351 ----
      $pgdump->AddFile('src\bin\pg_dump\pg_dump_sort.c');
      $pgdump->AddFile('src\bin\pg_dump\keywords.c');
      $pgdump->AddFile('src\backend\parser\kwlookup.c');
+     $pgdump->AddLibrary('wsock32.lib');
+     $pgdump->AddLibrary('ws2_32.lib');
  
      my $pgdumpall = AddSimpleFrontend('pg_dump', 1);
  
***************
*** 368,373 **** sub mkvcbuild
--- 370,377 ----
      $pgrestore->AddFile('src\bin\pg_dump\pg_restore.c');
      $pgrestore->AddFile('src\bin\pg_dump\keywords.c');
      $pgrestore->AddFile('src\backend\parser\kwlookup.c');
+     $pgrestore->AddLibrary('wsock32.lib');
+     $pgrestore->AddLibrary('ws2_32.lib');
  
      my $zic = $solution->AddProject('zic','exe','utils');
      $zic->AddFiles('src\timezone','zic.c','ialloc.c','scheck.c','localtime.c');
#65 Joachim Wieland
joe@mcknight.de
In reply to: Alvaro Herrera (#54)
2 attachment(s)
Re: patch for parallel pg_dump

On Wed, Mar 28, 2012 at 2:20 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and its voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch.  That
would be mostly mechanical.  Then the interesting stuff would apply on
top of that.  That would make review easier.

Unfortunately this is not really the case. What is being moved out of
pg_backup_archiver.c and into parallel.c is either the shutdown logic
that was applied only a few days ago, or code that is necessary to
change the parallel restore logic from one-thread-per-dump-object to
the message-passing framework, in which a worker starts at the
beginning and then receives a new dump object from the master whenever
it is idle.

Instead, I have now split the patch into two parts. The first part
does not depend on the parallel functionality and can be applied right
away. The second part then adds the parallelism on top.

Here's the complete list of changes of the first patch:

- split up restore_toc_entries_parallel into
restore_toc_entries_prefork / restore_toc_entries_parallel and
restore_toc_entries_postfork
- remove static char from prependDirectory
- remove static PQExpBuffer from fmtCopyColumnList
- get the relpages numbers, add a function that sorts by table size
(not called yet)
- remove static char* from selectSourceSchema
- add a function getThreadLocalPQExpBuffer that returns a PQExpBuffer
living in thread-local memory
- make fmtId use the thread-local PQExpBuffer
- add function fmtQualifiedId which is schema + fmtId()
- add a function pointer on_exit_msg_func that will be called to
handle the last words of a process (currently it only prints the
message)
- move ropt->number_of_jobs from RestoreOptions to AHX->numWorkers so
that it can be used for backup as well
- make getTocEntryByDumpId non-static, even though it's not necessary now
- make CloneArchive / DeCloneArchive non-static, even though it's not
necessary now
- in DisconnectDatabase, cancel any still-active transaction with
PQcancel before closing the connection
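
The fmtQualifiedId item deserves a note: fmtId() returns a pointer
into a shared, reused buffer, so naively calling it twice (once for
the schema, once for the relation) would let the second call clobber
the first result. Here is a minimal C sketch of the hazard and the
workaround; the mini_* names are hypothetical stand-ins, not the
actual pg_dump code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical miniature of fmtId(): like the real one, it returns a
 * pointer into a single reused buffer, so the result is only valid
 * until the next call.
 */
static const char *
mini_fmtId(const char *rawid)
{
	static char buf[256];

	snprintf(buf, sizeof(buf), "\"%s\"", rawid);
	return buf;
}

/*
 * Mirrors the trick in the patch's fmtQualifiedId(): copy the schema
 * part into a local buffer before calling mini_fmtId() again, so the
 * second call cannot overwrite the first result.
 */
const char *
mini_fmtQualifiedId(const char *schema, const char *id)
{
	static char result[512];
	char		local[256];

	snprintf(local, sizeof(local), "%s", mini_fmtId(schema));
	snprintf(result, sizeof(result), "%s.%s", local, mini_fmtId(id));
	return result;			/* e.g. "public"."accounts" */
}
```

Had both mini_fmtId() results been passed to a single snprintf(), the
schema and relation arguments would point at the same buffer and the
output would be wrong; the local copy plays the role of the local
PQExpBuffer in the patch.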

Attachments:

parallel_pg_dump_6-part1.diff (text/x-patch, charset US-ASCII)
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index 9b30629..f8ab57b 100644
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <ctype.h>
  
+ #include "dumpmem.h"
  #include "dumputils.h"
  #include "pg_backup.h"
  
*************** static struct
*** 39,44 ****
--- 40,46 ----
  } on_exit_nicely_list[MAX_ON_EXIT_NICELY];
  
  static int on_exit_nicely_index;
+ void (*on_exit_msg_func)(const char *modulename, const char *fmt, va_list ap) = vwrite_msg;
  
  #define supports_grant_options(version) ((version) >= 70400)
  
*************** static bool parseAclItem(const char *ite
*** 49,54 ****
--- 51,57 ----
  static char *copyAclUserName(PQExpBuffer output, char *input);
  static void AddAcl(PQExpBuffer aclbuf, const char *keyword,
  	   const char *subname);
+ static PQExpBuffer getThreadLocalPQExpBuffer(void);
  
  #ifdef WIN32
  static bool parallel_init_done = false;
*************** init_parallel_dump_utils(void)
*** 70,84 ****
  }
  
  /*
!  *	Quotes input string if it's not a legitimate SQL identifier as-is.
!  *
!  *	Note that the returned string must be used before calling fmtId again,
!  *	since we re-use the same return buffer each time.  Non-reentrant but
!  *	reduces memory leakage. (On Windows the memory leakage will be one buffer
!  *	per thread, which is at least better than one per call).
   */
! const char *
! fmtId(const char *rawid)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
--- 73,83 ----
  }
  
  /*
!  * Non-reentrant but reduces memory leakage. (On Windows the memory leakage
!  * will be one buffer per thread, which is at least better than one per call).
   */
! static PQExpBuffer
! getThreadLocalPQExpBuffer(void)
  {
  	/*
  	 * The Tls code goes awry if we use a static var, so we provide for both
*************** fmtId(const char *rawid)
*** 87,95 ****
  	static PQExpBuffer s_id_return = NULL;
  	PQExpBuffer id_return;
  
- 	const char *cp;
- 	bool		need_quotes = false;
- 
  #ifdef WIN32
  	if (parallel_init_done)
  		id_return = (PQExpBuffer) TlsGetValue(tls_index);		/* 0 when not set */
--- 86,91 ----
*************** fmtId(const char *rawid)
*** 119,124 ****
--- 115,137 ----
  
  	}
  
+ 	return id_return;
+ }
+ 
+ /*
+  *	Quotes input string if it's not a legitimate SQL identifier as-is.
+  *
+  *	Note that the returned string must be used before calling fmtId again,
+  *	since we re-use the same return buffer each time.
+  */
+ const char *
+ fmtId(const char *rawid)
+ {
+ 	PQExpBuffer id_return = getThreadLocalPQExpBuffer();
+ 
+ 	const char *cp;
+ 	bool		need_quotes = false;
+ 
  	/*
  	 * These checks need to match the identifier production in scan.l. Don't
  	 * use islower() etc.
*************** fmtId(const char *rawid)
*** 186,191 ****
--- 199,233 ----
  	return id_return->data;
  }
  
+ /*
+  * fmtQualifiedId - convert a qualified name to the proper format for
+  * the source database.
+  *
+  * Like fmtId, use the result before calling again.
+  *
+  * Since we call fmtId and it also uses getThreadLocalPQExpBuffer() we cannot
+  * use it until we're finished with calling fmtId().
+  */
+ const char *
+ fmtQualifiedId(int remoteVersion, const char *schema, const char *id)
+ {
+ 	PQExpBuffer id_return;
+ 	PQExpBuffer lcl_pqexp = createPQExpBuffer();
+ 
+ 	/* Suppress schema name if fetching from pre-7.3 DB */
+ 	if (remoteVersion >= 70300 && schema && *schema)
+ 	{
+ 		appendPQExpBuffer(lcl_pqexp, "%s.", fmtId(schema));
+ 	}
+ 	appendPQExpBuffer(lcl_pqexp, "%s", fmtId(id));
+ 
+ 	id_return = getThreadLocalPQExpBuffer();
+ 
+ 	appendPQExpBuffer(id_return, "%s", lcl_pqexp->data);
+ 	destroyPQExpBuffer(lcl_pqexp);
+ 
+ 	return id_return->data;
+ }
  
  /*
   * Convert a string value to an SQL string literal and append it to
*************** exit_horribly(const char *modulename, co
*** 1273,1279 ****
  	va_list		ap;
  
  	va_start(ap, fmt);
! 	vwrite_msg(modulename, fmt, ap);
  	va_end(ap);
  
  	exit_nicely(1);
--- 1315,1321 ----
  	va_list		ap;
  
  	va_start(ap, fmt);
! 	on_exit_msg_func(modulename, fmt, ap);
  	va_end(ap);
  
  	exit_nicely(1);
*************** on_exit_nicely(on_exit_nicely_callback f
*** 1319,1331 ****
  	on_exit_nicely_index++;
  }
  
! /* Run accumulated on_exit_nicely callbacks and then exit quietly. */
  void
  exit_nicely(int code)
  {
! 	while (--on_exit_nicely_index >= 0)
! 		(*on_exit_nicely_list[on_exit_nicely_index].function)(code,
! 			on_exit_nicely_list[on_exit_nicely_index].arg);
  #ifdef WIN32
  	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
  		ExitThread(code);
--- 1361,1374 ----
  	on_exit_nicely_index++;
  }
  
! /* Run accumulated on_exit_nicely callbacks in reverse order and then exit quietly. */
  void
  exit_nicely(int code)
  {
! 	int i;
! 	for (i = on_exit_nicely_index - 1; i >= 0; i--)
! 		(*on_exit_nicely_list[i].function)(code,
! 			on_exit_nicely_list[i].arg);
  #ifdef WIN32
  	if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
  		ExitThread(code);
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
index 82cf940..9c374c6 100644
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*************** extern const char *progname;
*** 24,29 ****
--- 24,31 ----
  
  extern void init_parallel_dump_utils(void);
  extern const char *fmtId(const char *identifier);
+ extern const char *fmtQualifiedId(int remoteVersion,
+ 								  const char *schema, const char *id);
  extern void appendStringLiteral(PQExpBuffer buf, const char *str,
  					int encoding, bool std_strings);
  extern void appendStringLiteralConn(PQExpBuffer buf, const char *str,
*************** extern void exit_horribly(const char *mo
*** 60,65 ****
--- 62,69 ----
  				__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3), noreturn));
  extern void set_section (const char *arg, int *dumpSections);
  
+ extern void (*on_exit_msg_func)(const char *modulename, const char *fmt, va_list ap)
+ 				__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 0)));
  typedef void (*on_exit_nicely_callback) (int code, void *arg);
  extern void on_exit_nicely(on_exit_nicely_callback function, void *arg);
  extern void exit_nicely(int code) __attribute__((noreturn));
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 61c6863..22c19fe 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** struct Archive
*** 89,94 ****
--- 89,96 ----
  	int			minRemoteVersion;		/* allowable range */
  	int			maxRemoteVersion;
  
+ 	int			numWorkers;		/* number of parallel processes */
+ 
  	/* info needed for string escaping */
  	int			encoding;		/* libpq code for client_encoding */
  	bool		std_strings;	/* standard_conforming_strings */
*************** typedef struct _restoreOptions
*** 149,155 ****
  	int			suppressDumpWarnings;	/* Suppress output of WARNING entries
  										 * to stderr */
  	bool		single_txn;
- 	int			number_of_jobs;
  
  	bool	   *idWanted;		/* array showing which dump IDs to emit */
  } RestoreOptions;
--- 151,156 ----
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index e292659..6e21c09 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*************** static teReqs _tocEntryRequired(TocEntry
*** 141,147 ****
  static bool _tocEntryIsACL(TocEntry *te);
  static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
- static TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  static void _moveBefore(ArchiveHandle *AH, TocEntry *pos, TocEntry *te);
  static int	_discoverArchiveFormat(ArchiveHandle *AH);
  
--- 141,146 ----
*************** static void RestoreOutput(ArchiveHandle
*** 154,160 ****
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_parallel(ArchiveHandle *AH);
  static thandle spawn_restore(RestoreArgs *args);
  static thandle reap_child(ParallelSlot *slots, int n_slots, int *work_status);
  static bool work_in_progress(ParallelSlot *slots, int n_slots);
--- 153,161 ----
  
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
! static void restore_toc_entries_prefork(ArchiveHandle *AH);
! static void restore_toc_entries_parallel(ArchiveHandle *AH, TocEntry *pending_list);
! static void restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list);
  static thandle spawn_restore(RestoreArgs *args);
  static thandle reap_child(ParallelSlot *slots, int n_slots, int *work_status);
  static bool work_in_progress(ParallelSlot *slots, int n_slots);
*************** static void reduce_dependencies(ArchiveH
*** 178,185 ****
  					TocEntry *ready_list);
  static void mark_create_done(ArchiveHandle *AH, TocEntry *te);
  static void inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te);
- static ArchiveHandle *CloneArchive(ArchiveHandle *AH);
- static void DeCloneArchive(ArchiveHandle *AH);
  
  static void setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH);
  static void unsetProcessIdentifier(ParallelStateEntry *pse);
--- 179,184 ----
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 272,278 ****
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (ropt->number_of_jobs > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
--- 271,277 ----
  	/*
  	 * If we're going to do parallel restore, there are some restrictions.
  	 */
! 	parallel_mode = (AH->public.numWorkers > 1 && ropt->useDB);
  	if (parallel_mode)
  	{
  		/* We haven't got round to making this work for all archive formats */
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 438,444 ****
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 		restore_toc_entries_parallel(AH);
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
--- 437,458 ----
  	 * In parallel mode, turn control over to the parallel-restore logic.
  	 */
  	if (parallel_mode)
! 	{
! 		TocEntry pending_list;
! 
! 		par_list_header_init(&pending_list);
! 
! 		/* This runs PRE_DATA items and then disconnects from the database */
! 		restore_toc_entries_prefork(AH);
! 		Assert(AH->connection == NULL);
! 
! 		/* This will actually fork the processes */
! 		restore_toc_entries_parallel(AH, &pending_list);
! 
! 		/* reconnect the master and see if we missed something */
! 		restore_toc_entries_postfork(AH, &pending_list);
! 		Assert(AH->connection != NULL);
! 	}
  	else
  	{
  		for (te = AH->toc->next; te != AH->toc; te = te->next)
*************** _moveBefore(ArchiveHandle *AH, TocEntry
*** 1524,1530 ****
  	pos->prev = te;
  }
  
! static TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
--- 1538,1544 ----
  	pos->prev = te;
  }
  
! TocEntry *
  getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
  {
  	TocEntry   *te;
*************** on_exit_close_archive(Archive *AHX)
*** 3337,3378 ****
  	on_exit_nicely(archive_close_connection, &shutdown_info);
  }
  
- /*
-  * Main engine for parallel restore.
-  *
-  * Work is done in three phases.
-  * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
-  * just as for a standard restore.	Second we process the remaining non-ACL
-  * steps in parallel worker children (threads on Windows, processes on Unix),
-  * each of which connects separately to the database.  Finally we process all
-  * the ACL entries in a single connection (that happens back in
-  * RestoreArchive).
-  */
  static void
! restore_toc_entries_parallel(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
- 	int			n_slots = ropt->number_of_jobs;
- 	ParallelSlot *slots;
- 	int			work_status;
- 	int			next_slot;
  	bool		skipped_some;
- 	TocEntry	pending_list;
- 	TocEntry	ready_list;
  	TocEntry   *next_work_item;
- 	thandle		ret_child;
- 	TocEntry   *te;
- 	ParallelState *pstate;
- 	int			i;
- 
- 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
! 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
! 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
! 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
! 	pstate->numWorkers = ropt->number_of_jobs;
! 	for (i = 0; i < pstate->numWorkers; i++)
! 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
--- 3351,3364 ----
  	on_exit_nicely(archive_close_connection, &shutdown_info);
  }
  
  static void
! restore_toc_entries_prefork(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
  	bool		skipped_some;
  	TocEntry   *next_work_item;
  
! 	ahlog(AH, 2, "entering restore_toc_entries_prefork\n");
  
  	/* Adjust dependency information */
  	fix_dependencies(AH);
*************** restore_toc_entries_parallel(ArchiveHand
*** 3428,3439 ****
  	 */
  	DisconnectDatabase(&AH->public);
  
- 	/*
- 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
- 	 * and falls back to AHX otherwise.
- 	 */
- 	shutdown_info.pstate = pstate;
- 
  	/* blow away any transient state from the old connection */
  	if (AH->currUser)
  		free(AH->currUser);
--- 3414,3419 ----
*************** restore_toc_entries_parallel(ArchiveHand
*** 3445,3461 ****
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
  
  	/*
! 	 * Initialize the lists of pending and ready items.  After this setup, the
! 	 * pending list is everything that needs to be done but is blocked by one
! 	 * or more dependencies, while the ready list contains items that have no
! 	 * remaining dependencies.	Note: we don't yet filter out entries that
! 	 * aren't going to be restored.  They might participate in dependency
! 	 * chains connecting entries that should be restored, so we treat them as
! 	 * live until we actually process them.
  	 */
- 	par_list_header_init(&pending_list);
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
--- 3425,3483 ----
  		free(AH->currTablespace);
  	AH->currTablespace = NULL;
  	AH->currWithOids = -1;
+ }
+ 
+ /*
+  * Main engine for parallel restore.
+  *
+  * Work is done in three phases.
+  * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
+  * just as for a standard restore. This is done in restore_toc_entries_prefork().
+  * Second we process the remaining non-ACL steps in parallel worker children
+  * (threads on Windows, processes on Unix), each of which connects separately
+  * to the database.
+  * Finally we process all the ACL entries in a single connection (that happens
+  * back in RestoreArchive).
+  */
+ static void
+ restore_toc_entries_parallel(ArchiveHandle *AH, TocEntry *pending_list)
+ {
+ 	ParallelState *pstate;
+ 	ParallelSlot *slots;
+ 	int			n_slots = AH->public.numWorkers;
+ 	TocEntry   *next_work_item;
+ 	int			next_slot;
+ 	TocEntry	ready_list;
+ 	int			ret_child;
+ 	bool		skipped_some;
+ 	int			work_status;
+ 	int			i;
+ 
+ 	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
+ 
+ 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
+ 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
+ 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
+ 	pstate->numWorkers = AH->public.numWorkers;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		unsetProcessIdentifier(&(pstate->pse[i]));
  
  	/*
! 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
! 	 * and falls back to AHX otherwise.
! 	 */
! 	shutdown_info.pstate = pstate;
! 
! 	/*
! 	 * Initialize the lists of ready items, the list for pending items has
! 	 * already been initialized in the caller.  After this setup, the pending
! 	 * list is everything that needs to be done but is blocked by one or more
! 	 * dependencies, while the ready list contains items that have no remaining
! 	 * dependencies.	Note: we don't yet filter out entries that aren't going
! 	 * to be restored.  They might participate in dependency chains connecting
! 	 * entries that should be restored, so we treat them as live until we
! 	 * actually process them.
  	 */
  	par_list_header_init(&ready_list);
  	skipped_some = false;
  	for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
*************** restore_toc_entries_parallel(ArchiveHand
*** 3480,3486 ****
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(&pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
--- 3502,3508 ----
  		}
  
  		if (next_work_item->depCount > 0)
! 			par_list_append(pending_list, next_work_item);
  		else
  			par_list_append(&ready_list, next_work_item);
  	}
*************** restore_toc_entries_parallel(ArchiveHand
*** 3566,3571 ****
--- 3588,3602 ----
  	}
  
  	ahlog(AH, 1, "finished main parallel loop\n");
+ }
+ 
+ static void
+ restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list)
+ {
+ 	RestoreOptions *ropt = AH->ropt;
+ 	TocEntry   *te;
+ 
+ 	ahlog(AH, 2, "entering restore_toc_entries_postfork\n");
  
  	/*
  	 * Remove the pstate again, so the exit handler will now fall back to
*************** restore_toc_entries_parallel(ArchiveHand
*** 3587,3593 ****
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list.par_next; te != &pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
--- 3618,3624 ----
  	 * dependencies, or some other pathological condition. If so, do it in the
  	 * single parent connection.
  	 */
! 	for (te = pending_list->par_next; te != pending_list; te = te->par_next)
  	{
  		ahlog(AH, 1, "processing missed item %d %s %s\n",
  			  te->dumpId, te->desc, te->tag);
*************** inhibit_data_for_failed_table(ArchiveHan
*** 4302,4311 ****
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
-  *
-  * These could be public, but no need at present.
   */
! static ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
--- 4333,4340 ----
   *
   * Enough of the structure is cloned to ensure that there is no
   * conflict between different threads each with their own clone.
   */
! ArchiveHandle *
  CloneArchive(ArchiveHandle *AH)
  {
  	ArchiveHandle *clone;
*************** CloneArchive(ArchiveHandle *AH)
*** 4342,4348 ****
   *
   * Note: we assume any clone-local connection was already closed.
   */
! static void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
--- 4371,4377 ----
   *
   * Note: we assume any clone-local connection was already closed.
   */
! void
  DeCloneArchive(ArchiveHandle *AH)
  {
  	/* Clear format-specific state */
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index c84ec61..005a8fe 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** extern void ReadHead(ArchiveHandle *AH);
*** 334,341 ****
--- 334,344 ----
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
  extern void WriteDataChunks(ArchiveHandle *AH);
+ extern ArchiveHandle *CloneArchive(ArchiveHandle *AH);
+ extern void DeCloneArchive(ArchiveHandle *AH);
  
  extern teReqs TocIDRequired(ArchiveHandle *AH, DumpId id, RestoreOptions *ropt);
+ TocEntry *getTocEntryByDumpId(ArchiveHandle *AH, DumpId id);
  extern bool checkSeek(FILE *fp);
  
  #define appendStringLiteralAHX(buf,str,AH) \
*************** int			ahprintf(ArchiveHandle *AH, const
*** 376,379 ****
--- 379,394 ----
  
  void		ahlog(ArchiveHandle *AH, int level, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
  
+ #ifdef USE_ASSERT_CHECKING
+ #define Assert(condition) \
+ 	if (!(condition)) \
+ 	{ \
+ 		write_msg(NULL, "Failed assertion in %s, line %d\n", \
+ 				  __FILE__, __LINE__); \
+ 		abort();\
+ 	}
+ #else
+ #define Assert(condition)
+ #endif
+ 
  #endif
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index b315e68..985a5b0 100644
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
*************** ConnectDatabase(Archive *AHX,
*** 308,319 ****
  	PQsetNoticeProcessor(AH->connection, notice_processor, NULL);
  }
  
  void
  DisconnectDatabase(Archive *AHX)
  {
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
  
! 	PQfinish(AH->connection);		/* noop if AH->connection is NULL */
  	AH->connection = NULL;
  }
  
--- 308,337 ----
  	PQsetNoticeProcessor(AH->connection, notice_processor, NULL);
  }
  
+ /*
+  * Close the connection to the database, first canceling any query that is
+  * still running.
+  */
  void
  DisconnectDatabase(Archive *AHX)
  {
  	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	PGcancel   *cancel;
+ 	char		errbuf[1];
  
! 	if (!AH->connection)
! 		return;
! 
! 	if (PQtransactionStatus(AH->connection) == PQTRANS_ACTIVE)
! 	{
! 		if ((cancel = PQgetCancel(AH->connection)))
! 		{
! 			PQcancel(cancel, errbuf, sizeof(errbuf));
! 			PQfreeCancel(cancel);
! 		}
! 	}
! 
! 	PQfinish(AH->connection);
  	AH->connection = NULL;
  }
  
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 8d43cd2..e504684 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
*************** static void _EndBlob(ArchiveHandle *AH,
*** 82,89 ****
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
  
! static char *prependDirectory(ArchiveHandle *AH, const char *relativeFilename);
! 
  static void createDirectory(const char *dir);
  
  
--- 82,88 ----
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
  
! static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
  static void createDirectory(const char *dir);
  
  
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 153,162 ****
  	}
  	else
  	{							/* Read Mode */
! 		char	   *fname;
  		cfp		   *tocFH;
  
! 		fname = prependDirectory(AH, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
--- 152,161 ----
  	}
  	else
  	{							/* Read Mode */
! 		char	   fname[MAXPGPATH];
  		cfp		   *tocFH;
  
! 		prependDirectory(AH, fname, "toc.dat");
  
  		tocFH = cfopen_read(fname, PG_BINARY_R);
  		if (tocFH == NULL)
*************** _StartData(ArchiveHandle *AH, TocEntry *
*** 282,290 ****
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
--- 281,289 ----
  {
  	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, tctx->filename);
  
  	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
  	if (ctx->dataFH == NULL)
*************** _PrintTocData(ArchiveHandle *AH, TocEntr
*** 376,383 ****
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char	   *fname = prependDirectory(AH, tctx->filename);
  
  		_PrintFileData(AH, fname, ropt);
  	}
  }
--- 375,383 ----
  		_LoadBlobs(AH, ropt);
  	else
  	{
! 		char		fname[MAXPGPATH];
  
+ 		prependDirectory(AH, fname, tctx->filename);
  		_PrintFileData(AH, fname, ropt);
  	}
  }
*************** _LoadBlobs(ArchiveHandle *AH, RestoreOpt
*** 387,398 ****
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
--- 387,398 ----
  {
  	Oid			oid;
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  	char		line[MAXPGPATH];
  
  	StartRestoreBlobs(AH);
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	ctx->blobsTocFH = cfopen_read(fname, PG_BINARY_R);
  
*************** _CloseArchive(ArchiveHandle *AH)
*** 519,525 ****
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char	   *fname = prependDirectory(AH, "toc.dat");
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
--- 519,527 ----
  	if (AH->mode == archModeWrite)
  	{
  		cfp		   *tocFH;
! 		char		fname[MAXPGPATH];
! 
! 		prependDirectory(AH, fname, "toc.dat");
  
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
*************** static void
*** 561,569 ****
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char	   *fname;
  
! 	fname = prependDirectory(AH, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
--- 563,571 ----
  _StartBlobs(ArchiveHandle *AH, TocEntry *te)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
! 	char		fname[MAXPGPATH];
  
! 	prependDirectory(AH, fname, "blobs.toc");
  
  	/* The blob TOC file is never compressed */
  	ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
*************** createDirectory(const char *dir)
*** 656,667 ****
  					  dir, strerror(errno));
  }
  
! 
  static char *
! prependDirectory(ArchiveHandle *AH, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
- 	static char buf[MAXPGPATH];
  	char	   *dname;
  
  	dname = ctx->directory;
--- 658,673 ----
  					  dir, strerror(errno));
  }
  
! /*
!  * Takes a relative file name and prepends the output directory, writing the
!  * result to buf. The caller must ensure that buf is at least MAXPGPATH bytes
!  * long. We can't use a static char[MAXPGPATH] inside the function because we
!  * run multithreaded on Windows.
!  */
  static char *
! prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename)
  {
  	lclContext *ctx = (lclContext *) AH->formatData;
  	char	   *dname;
  
  	dname = ctx->directory;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f9fbaee..7f91eb9 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static Oid	findLastBuiltinOid_V70(Archiv
*** 235,242 ****
  static void selectSourceSchema(Archive *fout, const char *schemaName);
  static char *getFormattedTypeName(Archive *fout, Oid oid, OidOptions opts);
  static char *myFormatType(const char *typname, int32 typmod);
- static const char *fmtQualifiedId(Archive *fout,
- 								  const char *schema, const char *id);
  static void getBlobs(Archive *fout);
  static void dumpBlob(Archive *fout, BlobInfo *binfo);
  static int	dumpBlobs(Archive *fout, void *arg);
--- 235,240 ----
*************** static void binary_upgrade_extension_mem
*** 254,260 ****
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti);
  static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
  
  int
--- 252,258 ----
  								DumpableObject *dobj,
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
! static const char *fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer);
  static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
  
  int
*************** dumpTableData_copy(Archive *fout, void *
*** 1220,1225 ****
--- 1218,1228 ----
  	const bool	hasoids = tbinfo->hasoids;
  	const bool	oids = tdinfo->oids;
  	PQExpBuffer q = createPQExpBuffer();
+ 	/*
+ 	 * Note: we can't use getThreadLocalPQExpBuffer() here because we call
+ 	 * fmtId(), which already uses it.
+ 	 */
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	PGconn	   *conn = GetConnection(fout);
  	PGresult   *res;
  	int			ret;
*************** dumpTableData_copy(Archive *fout, void *
*** 1244,1257 ****
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
  	if (fout->remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
  	if (oids && hasoids)
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
--- 1247,1260 ----
  	 * cases involving ADD COLUMN and inheritance.)
  	 */
  	if (fout->remoteVersion >= 70300)
! 		column_list = fmtCopyColumnList(tbinfo, clistBuf);
  	else
  		column_list = "";		/* can't select columns in COPY */
  
  	if (oids && hasoids)
  	{
  		appendPQExpBuffer(q, "COPY %s %s WITH OIDS TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
*************** dumpTableData_copy(Archive *fout, void *
*** 1269,1275 ****
  		else
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  tdinfo->filtercond);
--- 1272,1278 ----
  		else
  			appendPQExpBufferStr(q, "* ");
  		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  tdinfo->filtercond);
*************** dumpTableData_copy(Archive *fout, void *
*** 1277,1289 ****
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
  	}
  	res = ExecuteSqlQuery(fout, q->data, PGRES_COPY_OUT);
  	PQclear(res);
  
  	for (;;)
  	{
--- 1280,1293 ----
  	else
  	{
  		appendPQExpBuffer(q, "COPY %s %s TO stdout;",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname),
  						  column_list);
  	}
  	res = ExecuteSqlQuery(fout, q->data, PGRES_COPY_OUT);
  	PQclear(res);
+ 	destroyPQExpBuffer(clistBuf);
  
  	for (;;)
  	{
*************** dumpTableData_insert(Archive *fout, void
*** 1402,1408 ****
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
--- 1406,1412 ----
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM ONLY %s",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
*************** dumpTableData_insert(Archive *fout, void
*** 1410,1416 ****
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
! 						  fmtQualifiedId(fout,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
--- 1414,1420 ----
  	{
  		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
  						  "SELECT * FROM %s",
! 						  fmtQualifiedId(fout->remoteVersion,
  										 tbinfo->dobj.namespace->dobj.name,
  										 classname));
  	}
*************** dumpTableData(Archive *fout, TableDataIn
*** 1542,1547 ****
--- 1546,1552 ----
  {
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	PQExpBuffer copyBuf = createPQExpBuffer();
+ 	PQExpBuffer clistBuf = createPQExpBuffer();
  	DataDumperPtr dumpFn;
  	char	   *copyStmt;
  
*************** dumpTableData(Archive *fout, TableDataIn
*** 1553,1559 ****
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
--- 1558,1564 ----
  		appendPQExpBuffer(copyBuf, "COPY %s ",
  						  fmtId(tbinfo->dobj.name));
  		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
! 						  fmtCopyColumnList(tbinfo, clistBuf),
  					  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
  		copyStmt = copyBuf->data;
  	}
*************** dumpTableData(Archive *fout, TableDataIn
*** 1573,1578 ****
--- 1578,1584 ----
  				 dumpFn, tdinfo);
  
  	destroyPQExpBuffer(copyBuf);
+ 	destroyPQExpBuffer(clistBuf);
  }
  
  /*
*************** getTables(Archive *fout, int *numTables)
*** 3842,3847 ****
--- 3848,3854 ----
  	int			i_reloptions;
  	int			i_toastreloptions;
  	int			i_reloftype;
+ 	int			i_relpages;
  
  	/* Make sure we are in proper schema */
  	selectSourceSchema(fout, "pg_catalog");
*************** getTables(Archive *fout, int *numTables)
*** 3881,3886 ****
--- 3888,3894 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "c.relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 3917,3922 ****
--- 3925,3931 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 3952,3957 ****
--- 3961,3967 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 3987,3992 ****
--- 3997,4003 ----
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4023,4028 ****
--- 4034,4040 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4058,4063 ****
--- 4070,4076 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
  						  "d.refobjsubid AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4089,4094 ****
--- 4102,4108 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4115,4120 ****
--- 4129,4135 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4151,4156 ****
--- 4166,4172 ----
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
  						  "'p' AS relpersistence, "
+ 						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
  						  "NULL::int4 AS owning_col, "
*************** getTables(Archive *fout, int *numTables)
*** 4204,4209 ****
--- 4220,4226 ----
  	i_reloptions = PQfnumber(res, "reloptions");
  	i_toastreloptions = PQfnumber(res, "toast_reloptions");
  	i_reloftype = PQfnumber(res, "reloftype");
+ 	i_relpages = PQfnumber(res, "relpages");
  
  	if (lockWaitTimeout && fout->remoteVersion >= 70300)
  	{
*************** getTables(Archive *fout, int *numTables)
*** 4260,4265 ****
--- 4277,4283 ----
  		tblinfo[i].reltablespace = pg_strdup(PQgetvalue(res, i, i_reltablespace));
  		tblinfo[i].reloptions = pg_strdup(PQgetvalue(res, i, i_reloptions));
  		tblinfo[i].toast_reloptions = pg_strdup(PQgetvalue(res, i, i_toastreloptions));
+ 		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  
  		/* other fields were zeroed above */
  
*************** getTables(Archive *fout, int *numTables)
*** 4288,4294 ****
  			resetPQExpBuffer(query);
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
! 						 fmtQualifiedId(fout,
  										tblinfo[i].dobj.namespace->dobj.name,
  										tblinfo[i].dobj.name));
  			ExecuteSqlStatement(fout, query->data);
--- 4306,4312 ----
  			resetPQExpBuffer(query);
  			appendPQExpBuffer(query,
  							  "LOCK TABLE %s IN ACCESS SHARE MODE",
! 						 fmtQualifiedId(fout->remoteVersion,
  										tblinfo[i].dobj.namespace->dobj.name,
  										tblinfo[i].dobj.name));
  			ExecuteSqlStatement(fout, query->data);
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4427,4433 ****
  				i_conoid,
  				i_condef,
  				i_tablespace,
! 				i_options;
  	int			ntups;
  
  	for (i = 0; i < numTables; i++)
--- 4445,4452 ----
  				i_conoid,
  				i_condef,
  				i_tablespace,
! 				i_options,
! 				i_relpages;
  	int			ntups;
  
  	for (i = 0; i < numTables; i++)
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4469,4474 ****
--- 4488,4494 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4494,4499 ****
--- 4514,4520 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4522,4527 ****
--- 4543,4549 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4550,4555 ****
--- 4572,4578 ----
  					 "pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, i.indisclustered, "
+ 							  "t.relpages, "
  							  "c.contype, c.conname, "
  							  "c.condeferrable, c.condeferred, "
  							  "c.tableoid AS contableoid, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4578,4583 ****
--- 4601,4607 ----
  							  "pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, false AS indisclustered, "
+ 							  "t.relpages, "
  							  "CASE WHEN i.indisprimary THEN 'p'::char "
  							  "ELSE '0'::char END AS contype, "
  							  "t.relname AS conname, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4604,4609 ****
--- 4628,4634 ----
  							  "pg_get_indexdef(i.indexrelid) AS indexdef, "
  							  "t.relnatts AS indnkeys, "
  							  "i.indkey, false AS indisclustered, "
+ 							  "t.relpages, "
  							  "CASE WHEN i.indisprimary THEN 'p'::char "
  							  "ELSE '0'::char END AS contype, "
  							  "t.relname AS conname, "
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4632,4637 ****
--- 4657,4663 ----
  		i_indnkeys = PQfnumber(res, "indnkeys");
  		i_indkey = PQfnumber(res, "indkey");
  		i_indisclustered = PQfnumber(res, "indisclustered");
+ 		i_relpages = PQfnumber(res, "relpages");
  		i_contype = PQfnumber(res, "contype");
  		i_conname = PQfnumber(res, "conname");
  		i_condeferrable = PQfnumber(res, "condeferrable");
*************** getIndexes(Archive *fout, TableInfo tbli
*** 4674,4679 ****
--- 4700,4706 ----
  			parseOidArray(PQgetvalue(res, j, i_indkey),
  						  indxinfo[j].indkeys, INDEX_MAX_KEYS);
  			indxinfo[j].indisclustered = (PQgetvalue(res, j, i_indisclustered)[0] == 't');
+ 			indxinfo[j].relpages = atoi(PQgetvalue(res, j, i_relpages));
  			contype = *(PQgetvalue(res, j, i_contype));
  
  			if (contype == 'p' || contype == 'u' || contype == 'x')
*************** getDependencies(Archive *fout)
*** 14075,14096 ****
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
   */
  static void
  selectSourceSchema(Archive *fout, const char *schemaName)
  {
- 	static char *curSchemaName = NULL;
  	PQExpBuffer query;
  
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (fout->remoteVersion < 70300)
  		return;
- 	/* Ignore null schema names */
- 	if (schemaName == NULL || *schemaName == '\0')
- 		return;
- 	/* Optimize away repeated selection of same schema */
- 	if (curSchemaName && strcmp(curSchemaName, schemaName) == 0)
- 		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
--- 14102,14122 ----
   *
   * Whenever the selected schema is not pg_catalog, be careful to qualify
   * references to system catalogs and types in our emitted commands!
+  *
+  * This function is called only from selectSourceSchemaOnAH and
+  * selectSourceSchema.
   */
  static void
  selectSourceSchema(Archive *fout, const char *schemaName)
  {
  	PQExpBuffer query;
  
+ 	/* This is checked by the callers already */
+ 	Assert(schemaName != NULL && *schemaName != '\0');
+ 
  	/* Not relevant if fetching from pre-7.3 DB */
  	if (fout->remoteVersion < 70300)
  		return;
  
  	query = createPQExpBuffer();
  	appendPQExpBuffer(query, "SET search_path = %s",
*************** selectSourceSchema(Archive *fout, const
*** 14101,14109 ****
  	ExecuteSqlStatement(fout, query->data);
  
  	destroyPQExpBuffer(query);
- 	if (curSchemaName)
- 		free(curSchemaName);
- 	curSchemaName = pg_strdup(schemaName);
  }
  
  /*
--- 14127,14132 ----
*************** myFormatType(const char *typname, int32
*** 14241,14311 ****
  }
  
  /*
-  * fmtQualifiedId - convert a qualified name to the proper format for
-  * the source database.
-  *
-  * Like fmtId, use the result before calling again.
-  */
- static const char *
- fmtQualifiedId(Archive *fout, const char *schema, const char *id)
- {
- 	static PQExpBuffer id_return = NULL;
- 
- 	if (id_return)				/* first time through? */
- 		resetPQExpBuffer(id_return);
- 	else
- 		id_return = createPQExpBuffer();
- 
- 	/* Suppress schema name if fetching from pre-7.3 DB */
- 	if (fout->remoteVersion >= 70300 && schema && *schema)
- 	{
- 		appendPQExpBuffer(id_return, "%s.",
- 						  fmtId(schema));
- 	}
- 	appendPQExpBuffer(id_return, "%s",
- 					  fmtId(id));
- 
- 	return id_return->data;
- }
- 
- /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti)
  {
- 	static PQExpBuffer q = NULL;
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	if (q)						/* first time through? */
! 		resetPQExpBuffer(q);
! 	else
! 		q = createPQExpBuffer();
! 
! 	appendPQExpBuffer(q, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(q, ", ");
! 		appendPQExpBuffer(q, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(q, ")");
! 	return q->data;
  }
  
  /*
--- 14264,14300 ----
  }
  
  /*
   * Return a column list clause for the given relation.
   *
   * Special case: if there are no undropped columns in the relation, return
   * "", not an invalid "()" column list.
   */
  static const char *
! fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer)
  {
  	int			numatts = ti->numatts;
  	char	  **attnames = ti->attnames;
  	bool	   *attisdropped = ti->attisdropped;
  	bool		needComma;
  	int			i;
  
! 	appendPQExpBuffer(buffer, "(");
  	needComma = false;
  	for (i = 0; i < numatts; i++)
  	{
  		if (attisdropped[i])
  			continue;
  		if (needComma)
! 			appendPQExpBuffer(buffer, ", ");
! 		appendPQExpBuffer(buffer, "%s", fmtId(attnames[i]));
  		needComma = true;
  	}
  
  	if (!needComma)
  		return "";				/* no undropped columns */
  
! 	appendPQExpBuffer(buffer, ")");
! 	return buffer->data;
  }
  
  /*
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index fba6953..0efc15f 100644
*** a/src/bin/pg_dump/pg_dump.h
--- b/src/bin/pg_dump/pg_dump.h
*************** typedef struct _tableInfo
*** 256,261 ****
--- 256,262 ----
  	/* these two are set only if table is a sequence owned by a column: */
  	Oid			owning_tab;		/* OID of table owning sequence */
  	int			owning_col;		/* attr # of column owning sequence */
+ 	int			relpages;
  
  	bool		interesting;	/* true if need to collect more data */
  
*************** typedef struct _indxInfo
*** 319,324 ****
--- 320,326 ----
  	bool		indisclustered;
  	/* if there is an associated constraint object, its dumpId: */
  	DumpId		indexconstraint;
+ 	int			relpages;		/* relpages of the underlying table */
  } IndxInfo;
  
  typedef struct _ruleInfo
*************** extern void parseOidArray(const char *st
*** 523,528 ****
--- 525,531 ----
  extern void sortDumpableObjects(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeName(DumpableObject **objs, int numObjs);
  extern void sortDumpableObjectsByTypeOid(DumpableObject **objs, int numObjs);
+ extern void sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs);
  
  /*
   * version specific routines
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index a6533be..fe4d965 100644
*** a/src/bin/pg_dump/pg_dump_sort.c
--- b/src/bin/pg_dump/pg_dump_sort.c
*************** static void repairDependencyLoop(Dumpabl
*** 121,126 ****
--- 121,213 ----
  static void describeDumpableObject(DumpableObject *obj,
  					   char *buf, int bufsize);
  
+ static int DOSizeCompare(const void *p1, const void *p2);
+ 
+ static int
+ findFirstEqualType(DumpableObjectType type, DumpableObject **objs, int numObjs)
+ {
+ 	int i;
+ 	for (i = 0; i < numObjs; i++)
+ 		if (objs[i]->objType == type)
+ 			return i;
+ 	return -1;
+ }
+ 
+ static int
+ findFirstDifferentType(DumpableObjectType type, DumpableObject **objs, int numObjs, int start)
+ {
+ 	int i;
+ 	for (i = start; i < numObjs; i++)
+ 		if (objs[i]->objType != type)
+ 			return i;
+ 	return numObjs;
+ }
+ 
+ /*
+  * When we do a parallel dump, we want to start with the largest items first.
+  *
+  * Say we have the objects in this order:
+  * ....DDDDD....III....
+  *
+  * with D = Table data, I = Index, . = other object
+  *
+  * This sorting function now takes each of the D or I blocks and sorts them
+  * according to their size.
+  */
+ void
+ sortDataAndIndexObjectsBySize(DumpableObject **objs, int numObjs)
+ {
+ 	int		startIdx, endIdx;
+ 	void   *startPtr;
+ 
+ 	if (numObjs <= 1)
+ 		return;
+ 
+ 	startIdx = findFirstEqualType(DO_TABLE_DATA, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_TABLE_DATA, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ 
+ 	startIdx = findFirstEqualType(DO_INDEX, objs, numObjs);
+ 	if (startIdx >= 0)
+ 	{
+ 		endIdx = findFirstDifferentType(DO_INDEX, objs, numObjs, startIdx);
+ 		startPtr = objs + startIdx;
+ 		qsort(startPtr, endIdx - startIdx, sizeof(DumpableObject *),
+ 			  DOSizeCompare);
+ 	}
+ }
+ 
+ static int
+ DOSizeCompare(const void *p1, const void *p2)
+ {
+ 	DumpableObject *obj1 = *(DumpableObject **) p1;
+ 	DumpableObject *obj2 = *(DumpableObject **) p2;
+ 	int			obj1_size = 0;
+ 	int			obj2_size = 0;
+ 
+ 	if (obj1->objType == DO_TABLE_DATA)
+ 		obj1_size = ((TableDataInfo *) obj1)->tdtable->relpages;
+ 	if (obj1->objType == DO_INDEX)
+ 		obj1_size = ((IndxInfo *) obj1)->relpages;
+ 
+ 	if (obj2->objType == DO_TABLE_DATA)
+ 		obj2_size = ((TableDataInfo *) obj2)->tdtable->relpages;
+ 	if (obj2->objType == DO_INDEX)
+ 		obj2_size = ((IndxInfo *) obj2)->relpages;
+ 
+ 	/* we want to see the biggest item go first */
+ 	if (obj1_size > obj2_size)
+ 		return -1;
+ 	if (obj2_size > obj1_size)
+ 		return 1;
+ 
+ 	return 0;
+ }
  
  /*
   * Sort the given objects into a type/name-based ordering
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index bd2feff..11c83f7 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 72,77 ****
--- 72,78 ----
  	RestoreOptions *opts;
  	int			c;
  	int			exit_code;
+ 	int			numWorkers = 1;
  	Archive    *AH;
  	char	   *inputFileSpec;
  	static int	disable_triggers = 0;
*************** main(int argc, char **argv)
*** 183,189 ****
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				opts->number_of_jobs = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
--- 184,190 ----
  				break;
  
  			case 'j':			/* number of restore jobs */
! 				numWorkers = atoi(optarg);
  				break;
  
  			case 'l':			/* Dump the TOC summary */
*************** main(int argc, char **argv)
*** 338,344 ****
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && opts->number_of_jobs > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
--- 339,345 ----
  	}
  
  	/* Can't do single-txn mode with multiple connections */
! 	if (opts->single_txn && numWorkers > 1)
  	{
  		fprintf(stderr, _("%s: cannot specify both --single-transaction and multiple jobs\n"),
  				progname);
*************** main(int argc, char **argv)
*** 405,410 ****
--- 406,413 ----
  		InitDummyWantedList(AH, opts);
  	}
  
+ 	AH->numWorkers = numWorkers;
+ 
  	if (opts->tocSummary)
  		PrintTOCSummary(AH, opts);
  	else
[Attachment: parallel_pg_dump_6-part2.diff (text/x-patch)]
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index e9be18b..f3d554f 100644
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
*************** override CPPFLAGS := -I$(libpq_srcdir) $
*** 20,26 ****
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
--- 20,27 ----
  
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
  	pg_backup_null.o pg_backup_tar.o \
! 	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o \
! 	parallel.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
  
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ff8e714..f104127 100644
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
***************
*** 55,60 ****
--- 55,61 ----
  #include "compress_io.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  /*----------------------
   * Compressor API
*************** size_t
*** 182,187 ****
--- 183,191 ----
  WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
  				   const void *data, size_t dLen)
  {
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	switch (cs->comprAlg)
  	{
  		case COMPR_ALG_LIBZ:
*************** ReadDataFromArchiveZlib(ArchiveHandle *A
*** 351,356 ****
--- 355,363 ----
  	/* no minimal chunk size for zlib */
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		zp->next_in = (void *) buf;
  		zp->avail_in = cnt;
  
*************** ReadDataFromArchiveNone(ArchiveHandle *A
*** 411,416 ****
--- 418,426 ----
  
  	while ((cnt = readF(AH, &buf, &buflen)))
  	{
+ 		/* Are we aborting? */
+ 		checkAborting(AH);
+ 
  		ahwrite(buf, 1, cnt, AH);
  	}
  
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index ...965a3d2 .
*** a/src/bin/pg_dump/parallel.c
--- b/src/bin/pg_dump/parallel.c
***************
*** 0 ****
--- 1,1300 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.c
+  *
+  *	Parallel support for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ #include "dumpmem.h"
+ #include "dumputils.h"
+ #include "parallel.h"
+ 
+ #ifndef WIN32
+ #include <sys/types.h>
+ #include <sys/wait.h>
+ #include "signal.h"
+ #include <unistd.h>
+ #include <fcntl.h>
+ #endif
+ 
+ #define PIPE_READ							0
+ #define PIPE_WRITE							1
+ 
+ /* file-scope variables */
+ #ifdef WIN32
+ static unsigned int	tMasterThreadId = 0;
+ static HANDLE		termEvent = INVALID_HANDLE_VALUE;
+ static int pgpipe(int handles[2]);
+ static int piperead(int s, char *buf, int len);
+ #define pipewrite(a,b,c)	send(a,b,c,0)
+ #else
+ static volatile sig_atomic_t wantAbort = 0;
+ static bool aborting = false;
+ #define pgpipe(a)			pipe(a)
+ #define piperead(a,b,c)		read(a,b,c)
+ #define pipewrite(a,b,c)	write(a,b,c)
+ #endif
+ 
+ typedef struct ShutdownInformation
+ {
+ 	ParallelState *pstate;
+ 	Archive       *AHX;
+ } ShutdownInformation;
+ 
+ static ShutdownInformation shutdown_info;
+ 
+ static const char *modulename = gettext_noop("parallel archiver");
+ 
+ static ParallelSlot *GetMyPSlot(ParallelState *pstate);
+ static void parallel_exit_msg_func(const char *modulename,
+ 								   const char *fmt, va_list ap)
+ 			__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 0)));
+ static void parallel_msg_master(ParallelSlot *slot, const char *modulename,
+ 								const char *fmt, va_list ap)
+ 			__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
+ static void archive_close_connection(int code, void *arg);
+ static void ShutdownWorkersHard(ParallelState *pstate);
+ static void WaitForTerminatingWorkers(ParallelState *pstate);
+ #ifndef WIN32
+ static void sigTermHandler(int signum);
+ #endif
+ static void SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 						RestoreOptions *ropt);
+ static void PrintStatus(ParallelState *pstate);
+ static bool HasEveryWorkerTerminated(ParallelState *pstate);
+ 
+ static void lockTableNoWait(ArchiveHandle *AH, TocEntry *te);
+ static void WaitForCommands(ArchiveHandle *AH, int pipefd[2]);
+ static char *getMessageFromMaster(int pipefd[2]);
+ static void sendMessageToMaster(int pipefd[2], const char *str);
+ static int select_loop(int maxFd, fd_set *workerset);
+ static char *getMessageFromWorker(ParallelState *pstate,
+ 								  bool do_wait, int *worker);
+ static void sendMessageToWorker(ParallelState *pstate,
+ 							    int worker, const char *str);
+ static char *readMessageFromPipe(int fd);
+ 
+ #define messageStartsWith(msg, prefix) \
+ 	(strncmp(msg, prefix, strlen(prefix)) == 0)
+ #define messageEquals(msg, pattern) \
+ 	(strcmp(msg, pattern) == 0)
+ 
+ static ParallelSlot *
+ GetMyPSlot(ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ #ifdef WIN32
+ 		if (pstate->parallelSlot[i].threadId == GetCurrentThreadId())
+ #else
+ 		if (pstate->parallelSlot[i].pid == getpid())
+ #endif
+ 			return &(pstate->parallelSlot[i]);
+ 
+ 	return NULL;
+ }
+ 
+ /*
+  * This is the function that will be called from exit_horribly() to print the
+  * error message. If the worker process does exit_horribly(), we forward its
+  * last words to the master process. The master process then does exit_horribly()
+  * with this error message itself and prints it normally. After printing the
+  * message, exit_horribly() on the master will shut down the remaining worker
+  * processes.
+  */
+ static void
+ parallel_exit_msg_func(const char *modulename, const char *fmt, va_list ap)
+ {
+ 	ParallelState *pstate = shutdown_info.pstate;
+ 	ParallelSlot *slot;
+ 
+ 	Assert(pstate);
+ 
+ 	slot = GetMyPSlot(pstate);
+ 
+ 	if (!slot)
+ 		/* We're the parent, just write the message out */
+ 		vwrite_msg(modulename, fmt, ap);
+ 	else
+ 		/* If we're a worker process, send the msg to the master process */
+ 		parallel_msg_master(slot, modulename, fmt, ap);
+ }
+ 
+ /* Sends the error message from the worker to the master process */
+ static void
+ parallel_msg_master(ParallelSlot *slot, const char *modulename,
+ 					const char *fmt, va_list ap)
+ {
+ 	char		buf[512];
+ 	int			pipefd[2];
+ 
+ 	pipefd[PIPE_READ] = slot->pipeRevRead;
+ 	pipefd[PIPE_WRITE] = slot->pipeRevWrite;
+ 
+ 	strcpy(buf, "ERROR ");
+ 	vsnprintf(buf + strlen("ERROR "),
+ 			  sizeof(buf) - strlen("ERROR "), fmt, ap);
+ 
+ 	sendMessageToMaster(pipefd, buf);
+ }
+ 
+ /*
+  * pg_dump and pg_restore register the Archive pointer for the exit handler
+  * (called from exit_horribly). This function mainly exists so that we can keep
+  * shutdown_info in file scope only.
+  */
+ void
+ on_exit_close_archive(Archive *AHX)
+ {
+ 	shutdown_info.AHX = AHX;
+ 	on_exit_nicely(archive_close_connection, &shutdown_info);
+ }
+ 
+ /* This function can close archives in both the parallel and non-parallel case. */
+ static void
+ archive_close_connection(int code, void *arg)
+ {
+ 	ShutdownInformation *si = (ShutdownInformation *) arg;
+ 
+ 	if (si->pstate)
+ 	{
+ 		ParallelSlot *slot = GetMyPSlot(si->pstate);
+ 
+ 		if (!slot)
+ 		{
+ 			/*
+ 			 * We're the master: We have already printed out the message passed
+ 			 * to exit_horribly() either from the master itself or from a
+ 			 * worker process. Now we need to close our own database connection
+ 			 * (only open during parallel dump but not restore) and shut down
+ 			 * the remaining workers.
+ 			 */
+ 			DisconnectDatabase(si->AHX);
+ #ifndef WIN32
+ 			/*
+ 			 * Setting aborting to true switches to best-effort-mode
+ 			 * (send/receive but ignore errors) in communicating with our
+ 			 * workers.
+ 			 */
+ 			aborting = true;
+ #endif
+ 			ShutdownWorkersHard(si->pstate);
+ 		}
+ 		else if (slot->args->AH)
+ 			DisconnectDatabase(&(slot->args->AH->public));
+ 	}
+ 	else if (si->AHX)
+ 		DisconnectDatabase(si->AHX);
+ }
+ 
+ /*
+  * If we have one worker that terminates for some reason, we'd like the other
+  * threads to terminate as well (and not finish with their 70 GB table dump
+  * first...). On Unix we can just kill these processes and let the signal
+  * handler set wantAbort to 1. On Windows we set termEvent, which serves as
+  * the signal for everyone to terminate.
+  */
+ void
+ checkAborting(ArchiveHandle *AH)
+ {
+ #ifdef WIN32
+ 	if (WaitForSingleObject(termEvent, 0) == WAIT_OBJECT_0)
+ #else
+ 	if (wantAbort)
+ #endif
+ 		exit_horribly(modulename, "worker is terminating\n");
+ }
+ 
+ /*
+  * Shut down any remaining workers; this has an implicit do_wait == true.
+  *
+  * The fastest way we can make the workers terminate gracefully is when
+  * they are listening for new commands and we just tell them to terminate.
+  */
+ static void
+ ShutdownWorkersHard(ParallelState *pstate)
+ {
+ #ifndef WIN32
+ 	int i;
+ 	signal(SIGPIPE, SIG_IGN);
+ 
+ 	/* close the sockets so that the workers know they can exit */
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		closesocket(pstate->parallelSlot[i].pipeRead);
+ 		closesocket(pstate->parallelSlot[i].pipeWrite);
+ 	}
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		kill(pstate->parallelSlot[i].pid, SIGTERM);
+ 
+ #else
+ 	/* The workers monitor this event via checkAborting(). */
+ 	SetEvent(termEvent);
+ #endif
+ 
+ 	WaitForTerminatingWorkers(pstate);
+ }
+ 
+ /*
+  * Wait for the termination of the processes using the OS-specific method.
+  */
+ static void
+ WaitForTerminatingWorkers(ParallelState *pstate)
+ {
+ 	while (!HasEveryWorkerTerminated(pstate))
+ 	{
+ 		ParallelSlot *slot = NULL;
+ 		int j;
+ #ifndef WIN32
+ 		int		status;
+ 		pid_t	pid = wait(&status);
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].pid == pid)
+ 				slot = &(pstate->parallelSlot[j]);
+ #else
+ 		uintptr_t hThread;
+ 		DWORD	ret;
+ 		uintptr_t *lpHandles = pg_malloc(sizeof(HANDLE) * pstate->numWorkers);
+ 		int nrun = 0;
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].workerStatus != WRKR_TERMINATED)
+ 			{
+ 				lpHandles[nrun] = pstate->parallelSlot[j].hThread;
+ 				nrun++;
+ 			}
+ 		ret = WaitForMultipleObjects(nrun, (HANDLE*) lpHandles, false, INFINITE);
+ 		Assert(ret != WAIT_FAILED);
+ 		hThread = lpHandles[ret - WAIT_OBJECT_0];
+ 
+ 		for (j = 0; j < pstate->numWorkers; j++)
+ 			if (pstate->parallelSlot[j].hThread == hThread)
+ 				slot = &(pstate->parallelSlot[j]);
+ 
+ 		free(lpHandles);
+ #endif
+ 		Assert(slot);
+ 
+ 		slot->workerStatus = WRKR_TERMINATED;
+ 
+ 		PrintStatus(pstate);
+ 	}
+ 	Assert(HasEveryWorkerTerminated(pstate));
+ }
+ 
+ #ifndef WIN32
+ /* Signal handling (UNIX only) */
+ static void
+ sigTermHandler(int signum)
+ {
+ 	wantAbort = 1;
+ }
+ #endif
+ 
+ /*
+  * This function is called by both UNIX and Windows variants to set up a
+  * worker process.
+  */
+ static void
+ SetupWorker(ArchiveHandle *AH, int pipefd[2], int worker,
+ 			RestoreOptions *ropt)
+ {
+ 	/*
+ 	 * In dump mode (pg_dump) this calls _SetupWorker() as defined in
+ 	 * pg_dump.c, while in restore mode (pg_restore) it calls _SetupWorker() as
+ 	 * defined in pg_restore.c.
+ 	 *
+ 	 * We get the raw connection only so that we can close it properly when
+ 	 * we shut down. That is the only way the connection gets closed when the
+ 	 * worker is brought down because of an error.
+ 	 */
+ 	_SetupWorker((Archive *) AH, ropt);
+ 
+ 	Assert(AH->connection != NULL);
+ 
+ 	WaitForCommands(AH, pipefd);
+ 
+ 	closesocket(pipefd[PIPE_READ]);
+ 	closesocket(pipefd[PIPE_WRITE]);
+ }
+ 
+ #ifdef WIN32
+ /*
+  * On Windows the _beginthreadex() function allows us to pass one parameter.
+  * Since we need to pass a few values however, we define a structure here
+  * and then pass a pointer to such a structure in _beginthreadex().
+  */
+ typedef struct {
+ 	ArchiveHandle  *AH;
+ 	RestoreOptions *ropt;
+ 	int				worker;
+ 	int				pipeRead;
+ 	int				pipeWrite;
+ } WorkerInfo;
+ 
+ static unsigned __stdcall
+ init_spawned_worker_win32(WorkerInfo *wi)
+ {
+ 	ArchiveHandle *AH;
+ 	int pipefd[2] = { wi->pipeRead, wi->pipeWrite };
+ 	int worker = wi->worker;
+ 	RestoreOptions *ropt = wi->ropt;
+ 
+ 	AH = CloneArchive(wi->AH);
+ 
+ 	free(wi);
+ 	SetupWorker(AH, pipefd, worker, ropt);
+ 
+ 	DeCloneArchive(AH);
+ 	_endthreadex(0);
+ 	return 0;
+ }
+ #endif
+ 
+ /*
+  * This function starts the parallel dump or restore by spawning off the worker
+  * processes in both Unix and Windows. For Windows, it creates a number of
+  * threads while it does a fork() on Unix.
+  */
+ ParallelState *
+ ParallelBackupStart(ArchiveHandle *AH, RestoreOptions *ropt)
+ {
+ 	ParallelState  *pstate;
+ 	int				i;
+ 	const size_t	slotSize = AH->public.numWorkers * sizeof(ParallelSlot);
+ 
+ 	Assert(AH->public.numWorkers > 0);
+ 
+ 	/* Ensure stdio state is quiesced before forking */
+ 	fflush(NULL);
+ 
+ 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
+ 
+ 	pstate->numWorkers = AH->public.numWorkers;
+ 	pstate->parallelSlot = NULL;
+ 
+ 	if (AH->public.numWorkers == 1)
+ 		return pstate;
+ 
+ 	pstate->parallelSlot = (ParallelSlot *) pg_malloc(slotSize);
+ 	memset((void *) pstate->parallelSlot, 0, slotSize);
+ 
+ 	/*
+ 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if
+ 	 * set and falls back to AHX otherwise.
+ 	 */
+ 	shutdown_info.pstate = pstate;
+ 	on_exit_msg_func = parallel_exit_msg_func;
+ 
+ #ifdef WIN32
+ 	tMasterThreadId = GetCurrentThreadId();
+ 	termEvent = CreateEvent(NULL, true, false, NULL);
+ #else
+ 	signal(SIGTERM, sigTermHandler);
+ 	signal(SIGINT, sigTermHandler);
+ 	signal(SIGQUIT, sigTermHandler);
+ #endif
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ #ifdef WIN32
+ 		WorkerInfo *wi;
+ 		uintptr_t	handle;
+ #else
+ 		pid_t		pid;
+ #endif
+ 		int			pipeMW[2], pipeWM[2];
+ 
+ 		if (pgpipe(pipeMW) < 0 || pgpipe(pipeWM) < 0)
+ 			exit_horribly(modulename, "could not create communication channels: %s\n",
+ 						  strerror(errno));
+ 
+ 		pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 		pstate->parallelSlot[i].args = (ParallelArgs *) pg_malloc(sizeof(ParallelArgs));
+ 		pstate->parallelSlot[i].args->AH = NULL;
+ 		pstate->parallelSlot[i].args->te = NULL;
+ #ifdef WIN32
+ 		/* Allocate a new structure for every worker */
+ 		wi = (WorkerInfo *) pg_malloc(sizeof(WorkerInfo));
+ 
+ 		wi->ropt = ropt;
+ 		wi->worker = i;
+ 		wi->AH = AH;
+ 		wi->pipeRead = pstate->parallelSlot[i].pipeRevRead = pipeMW[PIPE_READ];
+ 		wi->pipeWrite = pstate->parallelSlot[i].pipeRevWrite = pipeWM[PIPE_WRITE];
+ 
+ 		handle = _beginthreadex(NULL, 0, &init_spawned_worker_win32,
+ 								wi, 0, &(pstate->parallelSlot[i].threadId));
+ 		pstate->parallelSlot[i].hThread = handle;
+ #else
+ 		pid = fork();
+ 		if (pid == 0)
+ 		{
+ 			/* we are the worker */
+ 			int j;
+ 			int pipefd[2] = { pipeMW[PIPE_READ], pipeWM[PIPE_WRITE] };
+ 
+ 			/*
+ 			 * Store the fds for the reverse communication in pstate. The
+ 			 * worker process uses them only in case of an error. On Windows
+ 			 * the workers write to the global pstate; on Unix each worker
+ 			 * writes to its process-local copy, which is also where it later
+ 			 * reads this information back from.
+ 			 */
+ 			pstate->parallelSlot[i].pipeRevRead = pipefd[PIPE_READ];
+ 			pstate->parallelSlot[i].pipeRevWrite = pipefd[PIPE_WRITE];
+ 			pstate->parallelSlot[i].pid = getpid();
+ 
+ 			/*
+ 			 * Call CloneArchive on Unix as well, even though technically we
+ 			 * don't need to: fork() already gives us a copy in our own
+ 			 * address space. But CloneArchive resets the state information
+ 			 * and also clones the database connection (for parallel dump),
+ 			 * both of which we want here.
+ 			 */
+ 			pstate->parallelSlot[i].args->AH = CloneArchive(AH);
+ 
+ 			closesocket(pipeWM[PIPE_READ]);		/* close read end of Worker -> Master */
+ 			closesocket(pipeMW[PIPE_WRITE]);	/* close write end of Master -> Worker */
+ 
+ 			/*
+ 			 * Close all inherited fds for communication of the master with
+ 			 * the other workers.
+ 			 */
+ 			for (j = 0; j < i; j++)
+ 			{
+ 				closesocket(pstate->parallelSlot[j].pipeRead);
+ 				closesocket(pstate->parallelSlot[j].pipeWrite);
+ 			}
+ 
+ 			SetupWorker(pstate->parallelSlot[i].args->AH, pipefd, i, ropt);
+ 
+ 			exit(0);
+ 		}
+ 		else if (pid < 0)
+ 			/* fork failed */
+ 			exit_horribly(modulename,
+ 						  "could not create worker process: %s\n",
+ 						  strerror(errno));
+ 
+ 		/* we are the Master, pid > 0 here */
+ 		Assert(pid > 0);
+ 		closesocket(pipeMW[PIPE_READ]);		/* close read end of Master -> Worker */
+ 		closesocket(pipeWM[PIPE_WRITE]);	/* close write end of Worker -> Master */
+ 
+ 		pstate->parallelSlot[i].pid = pid;
+ #endif
+ 
+ 		pstate->parallelSlot[i].pipeRead = pipeWM[PIPE_READ];
+ 		pstate->parallelSlot[i].pipeWrite = pipeMW[PIPE_WRITE];
+ 	}
+ 
+ 	return pstate;
+ }
+ 
+ /*
+  * Tell all of our workers to terminate.
+  *
+  * Pretty straightforward routine: closing the sockets that we have used for
+  * communication tells the workers that they can exit; we then wait for them
+  * to terminate.
+  */
+ void
+ ParallelBackupEnd(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int i;
+ 
+ 	if (pstate->numWorkers == 1)
+ 		return;
+ 
+ 	PrintStatus(pstate);
+ 	Assert(IsEveryWorkerIdle(pstate));
+ 
+ 	/* close the sockets so that the workers know they can exit */
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		closesocket(pstate->parallelSlot[i].pipeRead);
+ 		closesocket(pstate->parallelSlot[i].pipeWrite);
+ 	}
+ 	WaitForTerminatingWorkers(pstate);
+ 
+ 	/*
+ 	 * Remove the pstate again, so the exit handler in the parent will now
+ 	 * again fall back to closing AH->connection (if connected).
+ 	 */
+ 	shutdown_info.pstate = NULL;
+ 
+ 	free(pstate->parallelSlot);
+ 	free(pstate);
+ }
+ 
+ 
+ /*
+  * The sequence is the following (for dump, similar for restore):
+  *
+  * The master process starts the parallel backup in ParallelBackupStart(); this
+  * forks the worker processes, which enter WaitForCommands().
+  *
+  * The master process dispatches an individual work item to one of the worker
+  * processes in DispatchJobForTocEntry(). It calls
+  * AH->MasterStartParallelItemPtr, a routine of the output format. This
+  * function's arguments are the parent's archive handle AH (containing the full
+  * catalog information), the TocEntry that the worker should work on and a
+  * T_Action act indicating whether this is a backup or a restore item.  The
+  * function then converts the TocEntry assignment into a string that is then
+  * sent over to the worker process. In the simplest case that would be
+  * something like "DUMP 1234", with 1234 being the TocEntry id.
+  *
+  * The worker receives the message in the routine pointed to by
+  * WorkerJobDumpPtr or WorkerJobRestorePtr. These are also pointers to
+  * corresponding routines of the respective output format, e.g.
+  * _WorkerJobDumpDirectory().
+  *
+  * Remember that we have forked off the workers only after we have read in the
+  * catalog. That's why our worker processes can also access the catalog
+  * information. Now they re-translate the textual representation to a TocEntry
+  * on their side and do the required action (restore or dump).
+  *
+  * The result is again a textual string that is sent back to the master and is
+  * interpreted by AH->MasterEndParallelItemPtr. This function can update state
+  * or catalog information on the master's side, depending on the reply from the
+  * worker process. In the end it returns status which is 0 for successful
+  * execution.
+  *
+  * ---------------------------------------------------------------------
+  * Master                                   Worker
+  *
+  *                                          enters WaitForCommands()
+  * DispatchJobForTocEntry(...te...)
+  *
+  * [ Worker is IDLE ]
+  *
+  * arg = (MasterStartParallelItemPtr)()
+  * send: DUMP arg
+  *                                          receive: DUMP arg
+  *                                          str = (WorkerJobDumpPtr)(arg)
+  * [ Worker is WORKING ]                    ... gets te from arg ...
+  *                                          ... dump te ...
+  *                                          send: OK DUMP info
+  *
+  * In ListenToWorkers():
+  *
+  * [ Worker is FINISHED ]
+  * receive: OK DUMP info
+  * status = (MasterEndParallelItemPtr)(info)
+  *
+  * In ReapWorkerStatus(&ptr):
+  * *ptr = status;
+  * [ Worker is IDLE ]
+  * ---------------------------------------------------------------------
+  */
+ void
+ DispatchJobForTocEntry(ArchiveHandle *AH, ParallelState *pstate, TocEntry *te,
+ 					   T_Action act)
+ {
+ 	int		worker;
+ 	char   *arg;
+ 
+ 	/* our caller makes sure that at least one worker is idle */
+ 	Assert(GetIdleWorker(pstate) != NO_SLOT);
+ 	worker = GetIdleWorker(pstate);
+ 	Assert(worker != NO_SLOT);
+ 
+ 	arg = (AH->MasterStartParallelItemPtr)(AH, te, act);
+ 
+ 	sendMessageToWorker(pstate, worker, arg);
+ 
+ 	pstate->parallelSlot[worker].workerStatus = WRKR_WORKING;
+ 	pstate->parallelSlot[worker].args->te = te;
+ 	PrintStatus(pstate);
+ }
+ 
+ static void
+ PrintStatus(ParallelState *pstate)
+ {
+ 	int			i;
+ 	printf("------Status------\n");
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		printf("Status of worker %d: ", i);
+ 		switch (pstate->parallelSlot[i].workerStatus)
+ 		{
+ 			case WRKR_IDLE:
+ 				printf("IDLE");
+ 				break;
+ 			case WRKR_WORKING:
+ 				printf("WORKING");
+ 				break;
+ 			case WRKR_FINISHED:
+ 				printf("FINISHED");
+ 				break;
+ 			case WRKR_TERMINATED:
+ 				printf("TERMINATED");
+ 				break;
+ 		}
+ 		printf("\n");
+ 	}
+ 	printf("------------\n");
+ }
+ 
+ 
+ /*
+  * Find the first free parallel slot (if any).
+  */
+ int
+ GetIdleWorker(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_IDLE)
+ 			return i;
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * Return true iff every worker process is in the WRKR_TERMINATED state.
+  */
+ static bool
+ HasEveryWorkerTerminated(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_TERMINATED)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * Return true iff every worker is in the WRKR_IDLE state.
+  */
+ bool
+ IsEveryWorkerIdle(ParallelState *pstate)
+ {
+ 	int			i;
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 		if (pstate->parallelSlot[i].workerStatus != WRKR_IDLE)
+ 			return false;
+ 	return true;
+ }
+ 
+ /*
+  * ---------------------------------------------------------------------
+  * One danger of the parallel backup is a possible deadlock:
+  *
+  * 1) Master dumps the schema and locks all tables in ACCESS SHARE mode.
+  * 2) Another process requests an ACCESS EXCLUSIVE lock (which is not granted
+  *    because the master holds a conflicting ACCESS SHARE lock).
+  * 3) The worker process also requests an ACCESS SHARE lock to read the table.
+  *    The worker is not granted that lock but is enqueued behind the ACCESS
+  *    EXCLUSIVE lock request.
+  * ---------------------------------------------------------------------
+  *
+  * What we do here is request the lock in ACCESS SHARE mode but with NOWAIT
+  * in the worker prior to touching the table. If we don't get the lock, we
+  * know that somebody else has requested an ACCESS EXCLUSIVE lock, and we
+  * fail the whole backup because we have detected a potential deadlock.
+  */
+ static void
+ lockTableNoWait(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	Archive *AHX = (Archive *) AH;
+ 	const char *qualId;
+ 	PQExpBuffer query = createPQExpBuffer();
+ 	PGresult   *res;
+ 
+ 	Assert(AH->format == archDirectory);
+ 	Assert(strcmp(te->desc, "BLOBS") != 0);
+ 
+ 	appendPQExpBuffer(query, "SELECT pg_namespace.nspname,"
+ 							 "       pg_class.relname "
+ 							 "  FROM pg_class "
+ 							 "  JOIN pg_namespace on pg_namespace.oid = relnamespace "
+ 							 " WHERE pg_class.oid = %u", te->catalogId.oid);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_TUPLES_OK)
+ 		exit_horribly(modulename, "could not get relation name for oid %d: %s\n",
+ 					  te->catalogId.oid, PQerrorMessage(AH->connection));
+ 
+ 	resetPQExpBuffer(query);
+ 
+ 	qualId = fmtQualifiedId(AHX->remoteVersion, PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
+ 
+ 	appendPQExpBuffer(query, "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT", qualId);
+ 	PQclear(res);
+ 
+ 	res = PQexec(AH->connection, query->data);
+ 
+ 	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
+ 		exit_horribly(modulename, "could not obtain lock on relation \"%s\". This "
+ 					  "usually means that someone requested an ACCESS EXCLUSIVE lock "
+ 					  "on the table after the pg_dump parent process has gotten the "
+ 					  "initial ACCESS SHARE lock on the table.\n", qualId);
+ 
+ 	PQclear(res);
+ 	destroyPQExpBuffer(query);
+ }
+ 
+ /*
+  * This is the main routine for a worker process. On startup the worker
+  * enters this routine and waits for commands from the master process. After
+  * processing a command it comes back here to wait for the next one. The
+  * worker exits once the master closes the communication channel.
+  */
+ static void
+ WaitForCommands(ArchiveHandle *AH, int pipefd[2])
+ {
+ 	char	   *command;
+ 	DumpId		dumpId;
+ 	int			nBytes;
+ 	char	   *str = NULL;
+ 	TocEntry   *te;
+ 
+ 	for (;;)
+ 	{
+ 		if (!(command = getMessageFromMaster(pipefd)))
+ 		{
+ 			PQfinish(AH->connection);
+ 			AH->connection = NULL;
+ 			return;
+ 		}
+ 
+ 		if (messageStartsWith(command, "DUMP "))
+ 		{
+ 			Assert(AH->format == archDirectory);
+ 			sscanf(command + strlen("DUMP "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("DUMP "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 
+ 			/*
+ 			 * Lock the table but with NOWAIT. Note that the parent is already
+ 			 * holding a lock. If we cannot acquire another ACCESS SHARE MODE
+ 			 * lock, then somebody else has requested an exclusive lock in the
+ 			 * meantime.  lockTableNoWait dies in this case to prevent a
+ 			 * deadlock.
+ 			 */
+ 			if (strcmp(te->desc, "BLOBS") != 0)
+ 				lockTableNoWait(AH, te);
+ 
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobDumpPtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(pipefd, str);
+ 			free(str);
+ 		}
+ 		else if (messageStartsWith(command, "RESTORE "))
+ 		{
+ 			Assert(AH->format == archDirectory || AH->format == archCustom);
+ 			Assert(AH->connection != NULL);
+ 
+ 			sscanf(command + strlen("RESTORE "), "%d%n", &dumpId, &nBytes);
+ 			Assert(nBytes == strlen(command) - strlen("RESTORE "));
+ 
+ 			te = getTocEntryByDumpId(AH, dumpId);
+ 			Assert(te != NULL);
+ 			/*
+ 			 * The message we return here has been pg_malloc()ed and we are
+ 			 * responsible for free()ing it.
+ 			 */
+ 			str = (AH->WorkerJobRestorePtr)(AH, te);
+ 			Assert(AH->connection != NULL);
+ 			sendMessageToMaster(pipefd, str);
+ 			free(str);
+ 		}
+ 		else
+ 			exit_horribly(modulename,
+ 						  "Unknown command on communication channel: %s\n",
+ 						  command);
+ 	}
+ }
+ 
+ /*
+  * ---------------------------------------------------------------------
+  * Note the status change:
+  *
+  * DispatchJobForTocEntry		WRKR_IDLE -> WRKR_WORKING
+  * ListenToWorkers				WRKR_WORKING -> WRKR_FINISHED / WRKR_TERMINATED
+  * ReapWorkerStatus				WRKR_FINISHED -> WRKR_IDLE
+  * ---------------------------------------------------------------------
+  *
+  * Just calling ReapWorkerStatus() when all workers are working might or
+  * might not give you an idle worker: you need to call ListenToWorkers() in
+  * between and only thereafter ReapWorkerStatus(). ListenToWorkers() is what
+  * receives and records the status (i.e. the result) of the worker's
+  * execution.
+  */
+ void
+ ListenToWorkers(ArchiveHandle *AH, ParallelState *pstate, bool do_wait)
+ {
+ 	int			worker;
+ 	char	   *msg;
+ 
+ 	msg = getMessageFromWorker(pstate, do_wait, &worker);
+ 
+ 	if (!msg)
+ 	{
+ 		if (do_wait)
+ 			exit_horribly(modulename, "A worker process died unexpectedly\n");
+ 		return;
+ 	}
+ 
+ 	if (messageStartsWith(msg, "OK "))
+ 	{
+ 		char	   *statusString;
+ 		TocEntry   *te;
+ 
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_FINISHED;
+ 		te = pstate->parallelSlot[worker].args->te;
+ 		if (messageStartsWith(msg, "OK RESTORE "))
+ 		{
+ 			statusString = msg + strlen("OK RESTORE ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_RESTORE);
+ 		}
+ 		else if (messageStartsWith(msg, "OK DUMP "))
+ 		{
+ 			statusString = msg + strlen("OK DUMP ");
+ 			pstate->parallelSlot[worker].status =
+ 				(AH->MasterEndParallelItemPtr)
+ 					(AH, te, statusString, ACT_DUMP);
+ 		}
+ 		else
+ 			exit_horribly(modulename,
+ 						  "Invalid message received from worker: %s\n", msg);
+ 	}
+ 	else if (messageStartsWith(msg, "ERROR "))
+ 	{
+ 		Assert(AH->format == archDirectory || AH->format == archCustom);
+ 		pstate->parallelSlot[worker].workerStatus = WRKR_TERMINATED;
+ 		exit_horribly(modulename, "%s", msg + strlen("ERROR "));
+ 	}
+ 	else
+ 		exit_horribly(modulename, "Invalid message received from worker: %s\n", msg);
+ 
+ 	PrintStatus(pstate);
+ 
+ 	/* both Unix and Win32 return pg_malloc()ed space, so we free it */
+ 	free(msg);
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * This function is used to get the return value of a terminated worker
+  * process. If a process has terminated, its status is stored in *status and
+  * the id of the worker is returned.
+  */
+ int
+ ReapWorkerStatus(ParallelState *pstate, int *status)
+ {
+ 	int			i;
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_FINISHED)
+ 		{
+ 			*status = pstate->parallelSlot[i].status;
+ 			pstate->parallelSlot[i].status = 0;
+ 			pstate->parallelSlot[i].workerStatus = WRKR_IDLE;
+ 			PrintStatus(pstate);
+ 			return i;
+ 		}
+ 	}
+ 	return NO_SLOT;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It looks for an idle worker process and only returns if there is one.
+  */
+ void
+ EnsureIdleWorker(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int		ret_worker;
+ 	int		work_status;
+ 
+ 	for (;;)
+ 	{
+ 		int nTerm = 0;
+ 		while ((ret_worker = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
+ 		{
+ 			if (work_status != 0)
+ 				exit_horribly(modulename, "Error processing a parallel work item.\n");
+ 
+ 			nTerm++;
+ 		}
+ 
+ 		/*
+ 		 * We need to make sure that we have an idle worker before dispatching
+ 		 * the next item. If nTerm > 0 we already have one (quick check).
+ 		 */
+ 		if (nTerm > 0)
+ 			return;
+ 
+ 		/* explicit check for an idle worker */
+ 		if (GetIdleWorker(pstate) != NO_SLOT)
+ 			return;
+ 
+ 		/*
+ 		 * If we have no idle worker, read the result of one or more
+ 		 * workers and loop the loop to call ReapWorkerStatus() on them
+ 		 */
+ 		ListenToWorkers(AH, pstate, true);
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It waits for all workers to terminate.
+  */
+ void
+ EnsureWorkersFinished(ArchiveHandle *AH, ParallelState *pstate)
+ {
+ 	int			work_status;
+ 
+ 	if (!pstate || pstate->numWorkers == 1)
+ 		return;
+ 
+ 	/* Waiting for the remaining worker processes to finish */
+ 	while (!IsEveryWorkerIdle(pstate))
+ 	{
+ 		if (ReapWorkerStatus(pstate, &work_status) == NO_SLOT)
+ 			ListenToWorkers(AH, pstate, true);
+ 		else if (work_status != 0)
+ 			exit_horribly(modulename, "Error processing a parallel work item\n");
+ 	}
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It returns the next message on the communication channel, blocking until it
+  * becomes available.
+  */
+ static char *
+ getMessageFromMaster(int pipefd[2])
+ {
+ 	return readMessageFromPipe(pipefd[PIPE_READ]);
+ }
+ 
+ /*
+  * This function is executed in the worker process.
+  *
+  * It sends a message to the master on the communication channel.
+  */
+ static void
+ sendMessageToMaster(int pipefd[2], const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pipefd[PIPE_WRITE], str, len) != len)
+ 		exit_horribly(modulename,
+ 					  "Error writing to the communication channel: %s\n",
+ 					  strerror(errno));
+ }
+ 
+ /*
+  * A select loop that repeats calling select until a descriptor in the read set
+  * becomes readable. On Windows we have to check for the termination event from
+  * time to time, on Unix we can just block forever.
+  */
+ #ifdef WIN32
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int			i;
+ 	fd_set		saveSet = *workerset;
+ 
+ 	/* should always be the master */
+ 	Assert(tMasterThreadId == GetCurrentThreadId());
+ 
+ 	for (;;)
+ 	{
+ 		/*
+ 		 * sleep a quarter of a second before checking if we should
+ 		 * terminate.
+ 		 */
+ 		struct timeval tv = { 0, 250000 };
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, &tv);
+ 
+ 		if (i == SOCKET_ERROR && WSAGetLastError() == WSAEINTR)
+ 			continue;
+ 		if (i)
+ 			break;
+ 	}
+ 
+ 	return i;
+ }
+ #else /* UNIX */
+ static int
+ select_loop(int maxFd, fd_set *workerset)
+ {
+ 	int			i;
+ 	fd_set		saveSet = *workerset;
+ 
+ 	for (;;)
+ 	{
+ 		*workerset = saveSet;
+ 		i = select(maxFd + 1, workerset, NULL, NULL, NULL);
+ 
+ 		/*
+ 		 * If we Ctrl-C the master process, it's likely that we interrupt
+ 		 * select() here. The signal handler will set wantAbort == true and the
+ 		 * shutdown journey starts from here. Note that we'll come back here
+ 		 * later when we tell all workers to terminate and read their
+ 		 * responses. But then we have aborting set to true.
+ 		 */
+ 		if (wantAbort && !aborting)
+ 			exit_horribly(modulename, "terminated by user\n");
+ 
+ 		if (i < 0 && errno == EINTR)
+ 			continue;
+ 		break;
+ 	}
+ 
+ 	return i;
+ }
+ #endif
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It returns the next message from the worker on the communication channel,
+  * optionally blocking (do_wait) until it becomes available.
+  *
+  * The id of the worker is returned in *worker.
+  */
+ static char *
+ getMessageFromWorker(ParallelState *pstate, bool do_wait, int *worker)
+ {
+ 	int			i;
+ 	fd_set		workerset;
+ 	int			maxFd = -1;
+ 	struct		timeval nowait = { 0, 0 };
+ 
+ 	FD_ZERO(&workerset);
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		if (pstate->parallelSlot[i].workerStatus == WRKR_TERMINATED)
+ 			continue;
+ 		FD_SET(pstate->parallelSlot[i].pipeRead, &workerset);
+ 		/* actually WIN32 ignores the first parameter to select()... */
+ 		if (pstate->parallelSlot[i].pipeRead > maxFd)
+ 			maxFd = pstate->parallelSlot[i].pipeRead;
+ 	}
+ 
+ 	if (do_wait)
+ 	{
+ 		i = select_loop(maxFd, &workerset);
+ 		Assert(i != 0);
+ 	}
+ 	else
+ 	{
+ 		if ((i = select(maxFd + 1, &workerset, NULL, NULL, &nowait)) == 0)
+ 			return NULL;
+ 	}
+ 
+ 	if (i < 0)
+ 		exit_horribly(modulename, "error in ListenToWorkers(): %s\n",
+ 					  strerror(errno));
+ 
+ 	for (i = 0; i < pstate->numWorkers; i++)
+ 	{
+ 		char	   *msg;
+ 
+ 		if (!FD_ISSET(pstate->parallelSlot[i].pipeRead, &workerset))
+ 			continue;
+ 
+ 		msg = readMessageFromPipe(pstate->parallelSlot[i].pipeRead);
+ 		*worker = i;
+ 		return msg;
+ 	}
+ 	Assert(false);
+ 	return NULL;
+ }
+ 
+ /*
+  * This function is executed in the master process.
+  *
+  * It sends a message to a certain worker on the communication channel.
+  */
+ static void
+ sendMessageToWorker(ParallelState *pstate, int worker, const char *str)
+ {
+ 	int			len = strlen(str) + 1;
+ 
+ 	if (pipewrite(pstate->parallelSlot[worker].pipeWrite, str, len) != len)
+ 	{
+ 		/*
+ 		 * If we're already aborting anyway, don't care if we succeed or not.
+ 		 * The child might have gone already.
+ 		 */
+ #ifndef WIN32
+ 		if (!aborting)
+ #endif
+ 			exit_horribly(modulename,
+ 						  "error writing to the communication channel: %s\n",
+ 						  strerror(errno));
+ 	}
+ }
+ 
+ /*
+  * The underlying function to read a message from the communication channel (fd)
+  * with optional blocking (do_wait).
+  */
+ static char *
+ readMessageFromPipe(int fd)
+ {
+ 	char	   *msg;
+ 	int			msgsize, bufsize;
+ 	int			ret;
+ 
+ 	/*
+ 	 * The problem here is that we need to deal with several possibilities:
+ 	 * we could receive only a partial message or several messages at once.
+ 	 * The caller expects us to return exactly one message however.
+ 	 *
+ 	 * We could either read in as much as we can and keep track of what we
+ 	 * delivered back to the caller or we just read byte by byte. Once we see
+ 	 * (char) 0, we know that it's the message's end. This would be quite
+ 	 * inefficient for more data but since we are reading only on the command
+ 	 * channel, the performance loss does not seem worth the trouble of keeping
+ 	 * internal states for different file descriptors.
+ 	 */
+ 
+ 	bufsize = 64;  /* could be any number */
+ 	msg = (char *) pg_malloc(bufsize);
+ 
+ 	msgsize = 0;
+ 	for (;;)
+ 	{
+ 		Assert(msgsize <= bufsize);
+ 		ret = piperead(fd, msg + msgsize, 1);
+ 
+ 		/* worker has closed the connection or another error happened */
+ 		if (ret <= 0)
+ 		{
+ 			free(msg);
+ 			return NULL;
+ 		}
+ 
+ 		Assert(ret == 1);
+ 
+ 		if (msg[msgsize] == '\0')
+ 			return msg;
+ 
+ 		msgsize++;
+ 		if (msgsize == bufsize)
+ 		{
+ 			/* could be any number */
+ 			bufsize += 16;
+ 			msg = (char *) pg_realloc(msg, bufsize);
+ 		}
+ 	}
+ }
+ 
+ #ifdef WIN32
+ /*
+  *	This is a replacement version of pipe for Win32 which allows returned
+  *	handles to be used in select(). Note that read/write calls must be replaced
+  *	with recv/send.
+  */
+ 
+ static int
+ pgpipe(int handles[2])
+ {
+ 	SOCKET		s;
+ 	struct sockaddr_in serv_addr;
+ 	int			len = sizeof(serv_addr);
+ 
+ 	handles[0] = handles[1] = INVALID_SOCKET;
+ 
+ 	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not create socket: %d\n",
+ 				  WSAGetLastError());
+ 		return -1;
+ 	}
+ 
+ 	memset((void *) &serv_addr, 0, sizeof(serv_addr));
+ 	serv_addr.sin_family = AF_INET;
+ 	serv_addr.sin_port = htons(0);
+ 	serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+ 	if (bind(s, (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not bind: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (listen(s, 1) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not listen: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if (getsockname(s, (SOCKADDR *) &serv_addr, &len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not getsockname: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[1] = socket(PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not create socket 2: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 
+ 	if (connect(handles[1], (SOCKADDR *) &serv_addr, len) == SOCKET_ERROR)
+ 	{
+ 		write_msg(modulename, "pgpipe could not connect socket: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	if ((handles[0] = accept(s, (SOCKADDR *) &serv_addr, &len)) == INVALID_SOCKET)
+ 	{
+ 		write_msg(modulename, "pgpipe could not accept socket: %d\n",
+ 				  WSAGetLastError());
+ 		closesocket(handles[1]);
+ 		handles[1] = INVALID_SOCKET;
+ 		closesocket(s);
+ 		return -1;
+ 	}
+ 	closesocket(s);
+ 	return 0;
+ }
+ 
+ static int
+ piperead(int s, char *buf, int len)
+ {
+ 	int			ret = recv(s, buf, len, 0);
+ 
+ 	if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
+ 		/* EOF on the pipe! (win32 socket based implementation) */
+ 		ret = 0;
+ 	return ret;
+ }
+ #endif
diff --git a/src/bin/pg_dump/parallel.h b/src/bin/pg_dump/parallel.h
index ...51c6757 .
*** a/src/bin/pg_dump/parallel.h
--- b/src/bin/pg_dump/parallel.h
***************
*** 0 ****
--- 1,86 ----
+ /*-------------------------------------------------------------------------
+  *
+  * parallel.h
+  *
+  *	Parallel support header file for the pg_dump archiver
+  *
+  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *	The author is not responsible for loss or damages that may
+  *	result from its use.
+  *
+  * IDENTIFICATION
+  *		src/bin/pg_dump/parallel.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "pg_backup_db.h"
+ 
+ struct _archiveHandle;
+ struct _tocEntry;
+ 
+ typedef enum
+ {
+ 	WRKR_TERMINATED = 0,
+ 	WRKR_IDLE,
+ 	WRKR_WORKING,
+ 	WRKR_FINISHED
+ } T_WorkerStatus;
+ 
+ typedef enum _action
+ {
+ 	ACT_DUMP,
+ 	ACT_RESTORE,
+ 	ACT_RESTORE
+ 
+ /* Arguments needed for a worker process */
+ typedef struct _parallel_args
+ {
+ 	struct _archiveHandle *AH;
+ 	struct _tocEntry	  *te;
+ } ParallelArgs;
+ 
+ /* State for each parallel activity slot */
+ typedef struct _parallel_slot
+ {
+ 	ParallelArgs	   *args;
+ 	T_WorkerStatus		workerStatus;
+ 	int					status;
+ 	int					pipeRead;
+ 	int					pipeWrite;
+ 	int					pipeRevRead;
+ 	int					pipeRevWrite;
+ #ifdef WIN32
+ 	uintptr_t			hThread;
+ 	unsigned int		threadId;
+ #else
+ 	pid_t				pid;
+ #endif
+ } ParallelSlot;
+ 
+ #define NO_SLOT (-1)
+ 
+ typedef struct _parallel_state
+ {
+ 	int			numWorkers;
+ 	ParallelSlot *parallelSlot;
+ } ParallelState;
+ 
+ extern int GetIdleWorker(ParallelState *pstate);
+ extern bool IsEveryWorkerIdle(ParallelState *pstate);
+ extern void ListenToWorkers(struct _archiveHandle *AH, ParallelState *pstate, bool do_wait);
+ extern int ReapWorkerStatus(ParallelState *pstate, int *status);
+ extern void EnsureIdleWorker(struct _archiveHandle *AH, ParallelState *pstate);
+ extern void EnsureWorkersFinished(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern ParallelState *ParallelBackupStart(struct _archiveHandle *AH,
+ 										  RestoreOptions *ropt);
+ extern void DispatchJobForTocEntry(struct _archiveHandle *AH,
+ 								   ParallelState *pstate,
+ 								   struct _tocEntry *te, T_Action act);
+ extern void ParallelBackupEnd(struct _archiveHandle *AH, ParallelState *pstate);
+ 
+ extern void checkAborting(struct _archiveHandle *AH);
+ 
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 22c19fe..a21e05a 100644
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** struct Archive
*** 90,99 ****
--- 90,101 ----
  	int			maxRemoteVersion;
  
  	int			numWorkers;		/* number of parallel processes */
+ 	char	   *sync_snapshot_id;  /* sync snapshot id for parallel operation */
  
  	/* info needed for string escaping */
  	int			encoding;		/* libpq code for client_encoding */
  	bool		std_strings;	/* standard_conforming_strings */
+ 	char	   *use_role;		/* Issue SET ROLE to this */
  
  	/* error handling */
  	bool		exit_on_error;	/* whether to exit on SQL errors... */
*************** extern void PrintTOCSummary(Archive *AH,
*** 202,207 ****
--- 204,212 ----
  
  extern RestoreOptions *NewRestoreOptions(void);
  
+ /* We have one in pg_dump.c and another one in pg_restore.c */
+ extern void _SetupWorker(Archive *AHX, RestoreOptions *ropt);
+ 
  /* Rearrange and filter TOC entries */
  extern void SortTocFromFile(Archive *AHX, RestoreOptions *ropt);
  extern void InitDummyWantedList(Archive *AHX, RestoreOptions *ropt);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 6e21c09..c9a1a22 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***************
*** 23,30 ****
--- 23,32 ----
  #include "pg_backup_db.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  #include <ctype.h>
+ #include <fcntl.h>
  #include <unistd.h>
  #include <sys/stat.h>
  #include <sys/types.h>
***************
*** 36,107 ****
  
  #include "libpq/libpq-fs.h"
  
- /*
-  * Special exit values from worker children.  We reserve 0 for normal
-  * success; 1 and other small values should be interpreted as crashes.
-  */
- #define WORKER_CREATE_DONE		10
- #define WORKER_INHIBIT_DATA		11
- #define WORKER_IGNORED_ERRORS	12
- 
- /*
-  * Unix uses exit to return result from worker child, so function is void.
-  * Windows thread result comes via function return.
-  */
- #ifndef WIN32
- #define parallel_restore_result void
- #else
- #define parallel_restore_result DWORD
- #endif
- 
- /* IDs for worker children are either PIDs or thread handles */
- #ifndef WIN32
- #define thandle pid_t
- #else
- #define thandle HANDLE
- #endif
- 
- typedef struct ParallelStateEntry
- {
- #ifdef WIN32
- 	unsigned int threadId;
- #else
- 	pid_t		pid;
- #endif
- 	ArchiveHandle *AH;
- } ParallelStateEntry;
- 
- typedef struct ParallelState
- {
- 	int			numWorkers;
- 	ParallelStateEntry *pse;
- } ParallelState;
- 
- /* Arguments needed for a worker child */
- typedef struct _restore_args
- {
- 	ArchiveHandle *AH;
- 	TocEntry   *te;
- 	ParallelStateEntry *pse;
- } RestoreArgs;
- 
- /* State for each parallel activity slot */
- typedef struct _parallel_slot
- {
- 	thandle		child_id;
- 	RestoreArgs *args;
- } ParallelSlot;
- 
- typedef struct ShutdownInformation
- {
- 	ParallelState *pstate;
- 	Archive       *AHX;
- } ShutdownInformation;
- 
- static ShutdownInformation shutdown_info;
- 
- #define NO_SLOT (-1)
- 
  #define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
  #define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
  
--- 38,43 ----
*************** static void RestoreOutput(ArchiveHandle
*** 154,175 ****
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
  static void restore_toc_entries_prefork(ArchiveHandle *AH);
! static void restore_toc_entries_parallel(ArchiveHandle *AH, TocEntry *pending_list);
  static void restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list);
- static thandle spawn_restore(RestoreArgs *args);
- static thandle reap_child(ParallelSlot *slots, int n_slots, int *work_status);
- static bool work_in_progress(ParallelSlot *slots, int n_slots);
- static int	get_next_slot(ParallelSlot *slots, int n_slots);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots);
! static parallel_restore_result parallel_restore(RestoreArgs *args);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
--- 90,107 ----
  static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel);
  static void restore_toc_entries_prefork(ArchiveHandle *AH);
! static void restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
! 										 TocEntry *pending_list);
  static void restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list);
  static void par_list_header_init(TocEntry *l);
  static void par_list_append(TocEntry *l, TocEntry *te);
  static void par_list_remove(TocEntry *te);
  static TocEntry *get_next_work_item(ArchiveHandle *AH,
  				   TocEntry *ready_list,
! 				   ParallelState *pstate);
  static void mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate);
  static void fix_dependencies(ArchiveHandle *AH);
  static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
  static void repoint_table_dependencies(ArchiveHandle *AH,
*************** static void reduce_dependencies(ArchiveH
*** 180,191 ****
  static void mark_create_done(ArchiveHandle *AH, TocEntry *te);
  static void inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te);
  
- static void setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH);
- static void unsetProcessIdentifier(ParallelStateEntry *pse);
- static ParallelStateEntry *GetMyPSEntry(ParallelState *pstate);
- static void archive_close_connection(int code, void *arg);
- 
- 
  /*
   *	Wrapper functions.
   *
--- 112,117 ----
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 438,444 ****
  	 */
  	if (parallel_mode)
  	{
! 		TocEntry pending_list;
  
  		par_list_header_init(&pending_list);
  
--- 364,371 ----
  	 */
  	if (parallel_mode)
  	{
! 		ParallelState  *pstate;
! 		TocEntry		pending_list;
  
  		par_list_header_init(&pending_list);
  
*************** RestoreArchive(Archive *AHX, RestoreOpti
*** 446,453 ****
  		restore_toc_entries_prefork(AH);
  		Assert(AH->connection == NULL);
  
! 		/* This will actually fork the processes */
! 		restore_toc_entries_parallel(AH, &pending_list);
  
  		/* reconnect the master and see if we missed something */
  		restore_toc_entries_postfork(AH, &pending_list);
--- 373,382 ----
  		restore_toc_entries_prefork(AH);
  		Assert(AH->connection == NULL);
  
! 		/* ParallelBackupStart() will actually fork the processes */
! 		pstate = ParallelBackupStart(AH, ropt);
! 		restore_toc_entries_parallel(AH, pstate, &pending_list);
! 		ParallelBackupEnd(AH, pstate);
  
  		/* reconnect the master and see if we missed something */
  		restore_toc_entries_postfork(AH, &pending_list);
*************** static int
*** 514,520 ****
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			retval = 0;
  	teReqs		reqs;
  	bool		defnDumped;
  
--- 443,449 ----
  restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
  				  RestoreOptions *ropt, bool is_parallel)
  {
! 	int			status = WORKER_OK;
  	teReqs		reqs;
  	bool		defnDumped;
  
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 556,562 ****
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						retval = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
--- 485,491 ----
  				if (ropt->noDataForFailedTables)
  				{
  					if (is_parallel)
! 						status = WORKER_INHIBIT_DATA;
  					else
  						inhibit_data_for_failed_table(AH, te);
  				}
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 571,577 ****
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					retval = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
--- 500,506 ----
  				 * just set the return value.
  				 */
  				if (is_parallel)
! 					status = WORKER_CREATE_DONE;
  				else
  					mark_create_done(AH, te);
  			}
*************** restore_toc_entry(ArchiveHandle *AH, Toc
*** 689,695 ****
  		}
  	}
  
! 	return retval;
  }
  
  /*
--- 618,627 ----
  		}
  	}
  
! 	if (AH->public.n_errors > 0 && status == WORKER_OK)
! 		status = WORKER_IGNORED_ERRORS;
! 
! 	return status;
  }
  
  /*
*************** _allocAH(const char *FileSpec, const Arc
*** 2035,2082 ****
  
  
  void
! WriteDataChunks(ArchiveHandle *AH)
  {
  	TocEntry   *te;
- 	StartDataPtr startPtr;
- 	EndDataPtr	endPtr;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (te->dataDumper != NULL)
! 		{
! 			AH->currToc = te;
! 			/* printf("Writing data for %d (%x)\n", te->id, te); */
! 
! 			if (strcmp(te->desc, "BLOBS") == 0)
! 			{
! 				startPtr = AH->StartBlobsPtr;
! 				endPtr = AH->EndBlobsPtr;
! 			}
! 			else
! 			{
! 				startPtr = AH->StartDataPtr;
! 				endPtr = AH->EndDataPtr;
! 			}
! 
! 			if (startPtr != NULL)
! 				(*startPtr) (AH, te);
  
  			/*
! 			 * printf("Dumper arg for %d is %x\n", te->id, te->dataDumperArg);
  			 */
  
! 			/*
! 			 * The user-provided DataDumper routine needs to call
! 			 * AH->WriteData
! 			 */
! 			(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
  
! 			if (endPtr != NULL)
! 				(*endPtr) (AH, te);
! 			AH->currToc = NULL;
! 		}
  	}
  }
  
  void
--- 1967,2029 ----
  
  
  void
! WriteDataChunks(ArchiveHandle *AH, ParallelState *pstate)
  {
  	TocEntry   *te;
  
  	for (te = AH->toc->next; te != AH->toc; te = te->next)
  	{
! 		if (!te->hadDumper)
! 			continue;
  
+ 		if (pstate && pstate->numWorkers > 1)
+ 		{
  			/*
! 			 * If we are in a parallel backup, then we are always the master
! 			 * process.
  			 */
+ 			EnsureIdleWorker(AH, pstate);
+ 			Assert(GetIdleWorker(pstate) != NO_SLOT);
+ 			DispatchJobForTocEntry(AH, pstate, te, ACT_DUMP);
+ 		}
+ 		else
+ 			WriteDataChunksForTocEntry(AH, te);
+ 	}
+ 	EnsureWorkersFinished(AH, pstate);
+ }
  
! void
! WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te)
! {
! 	StartDataPtr startPtr;
! 	EndDataPtr	endPtr;
  
! 	AH->currToc = te;
! 
! 	if (strcmp(te->desc, "BLOBS") == 0)
! 	{
! 		startPtr = AH->StartBlobsPtr;
! 		endPtr = AH->EndBlobsPtr;
! 	}
! 	else
! 	{
! 		startPtr = AH->StartDataPtr;
! 		endPtr = AH->EndDataPtr;
  	}
+ 
+ 	if (startPtr != NULL)
+ 		(*startPtr) (AH, te);
+ 
+ 	/*
+ 	 * The user-provided DataDumper routine needs to call
+ 	 * AH->WriteData
+ 	 */
+ 	(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);
+ 
+ 	if (endPtr != NULL)
+ 		(*endPtr) (AH, te);
+ 
+ 	AH->currToc = NULL;
  }
  
  void
*************** dumpTimestamp(ArchiveHandle *AH, const c
*** 3291,3357 ****
  }
  
  static void
- setProcessIdentifier(ParallelStateEntry *pse, ArchiveHandle *AH)
- {
- #ifdef WIN32
- 	pse->threadId = GetCurrentThreadId();
- #else
- 	pse->pid = getpid();
- #endif
- 	pse->AH = AH;
- }
- 
- static void
- unsetProcessIdentifier(ParallelStateEntry *pse)
- {
- #ifdef WIN32
- 	pse->threadId = 0;
- #else
- 	pse->pid = 0;
- #endif
- 	pse->AH = NULL;
- }
- 
- static ParallelStateEntry *
- GetMyPSEntry(ParallelState *pstate)
- {
- 	int i;
- 
- 	for (i = 0; i < pstate->numWorkers; i++)
- #ifdef WIN32
- 		if (pstate->pse[i].threadId == GetCurrentThreadId())
- #else
- 		if (pstate->pse[i].pid == getpid())
- #endif
- 			return &(pstate->pse[i]);
- 
- 	return NULL;
- }
- 
- static void
- archive_close_connection(int code, void *arg)
- {
- 	ShutdownInformation *si = (ShutdownInformation *) arg;
- 
- 	if (si->pstate)
- 	{
- 		ParallelStateEntry *entry = GetMyPSEntry(si->pstate);
- 
- 		if (entry != NULL && entry->AH)
- 			DisconnectDatabase(&(entry->AH->public));
- 	}
- 	else if (si->AHX)
- 		DisconnectDatabase(si->AHX);
- }
- 
- void
- on_exit_close_archive(Archive *AHX)
- {
- 	shutdown_info.AHX = AHX;
- 	on_exit_nicely(archive_close_connection, &shutdown_info);
- }
- 
- static void
  restore_toc_entries_prefork(ArchiveHandle *AH)
  {
  	RestoreOptions *ropt = AH->ropt;
--- 3238,3243 ----
*************** restore_toc_entries_prefork(ArchiveHandl
*** 3434,3480 ****
   * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
   * just as for a standard restore. This is done in restore_toc_entries_prefork().
   * Second we process the remaining non-ACL steps in parallel worker children
!  * (threads on Windows, processes on Unix), each of which connects separately
!  * to the database.
   * Finally we process all the ACL entries in a single connection (that happens
   * back in RestoreArchive).
   */
  static void
! restore_toc_entries_parallel(ArchiveHandle *AH, TocEntry *pending_list)
  {
! 	ParallelState *pstate;
! 	ParallelSlot *slots;
! 	int			n_slots = AH->public.numWorkers;
! 	TocEntry   *next_work_item;
! 	int			next_slot;
  	TocEntry	ready_list;
  	int			ret_child;
- 	bool		skipped_some;
- 	int			work_status;
- 	int			i;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
- 	slots = (ParallelSlot *) pg_calloc(n_slots, sizeof(ParallelSlot));
- 	pstate = (ParallelState *) pg_malloc(sizeof(ParallelState));
- 	pstate->pse = (ParallelStateEntry *) pg_calloc(n_slots, sizeof(ParallelStateEntry));
- 	pstate->numWorkers = AH->public.numWorkers;
- 	for (i = 0; i < pstate->numWorkers; i++)
- 		unsetProcessIdentifier(&(pstate->pse[i]));
- 
- 	/*
- 	 * Set the pstate in the shutdown_info. The exit handler uses pstate if set
- 	 * and falls back to AHX otherwise.
- 	 */
- 	shutdown_info.pstate = pstate;
- 
  	/*
  	 * Initialize the lists of ready items, the list for pending items has
  	 * already been initialized in the caller.  After this setup, the pending
  	 * list is everything that needs to be done but is blocked by one or more
  	 * dependencies, while the ready list contains items that have no remaining
! 	 * dependencies.	Note: we don't yet filter out entries that aren't going
! 	 * to be restored.  They might participate in dependency chains connecting
  	 * entries that should be restored, so we treat them as live until we
  	 * actually process them.
  	 */
--- 3320,3349 ----
   * First we process all SECTION_PRE_DATA tocEntries, in a single connection,
   * just as for a standard restore. This is done in restore_toc_entries_prefork().
   * Second we process the remaining non-ACL steps in parallel worker children
!  * (threads on Windows, processes on Unix); these fork off and set up their
!  * connections before we call restore_toc_entries_parallel_forked.
   * Finally we process all the ACL entries in a single connection (that happens
   * back in RestoreArchive).
   */
  static void
! restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
! 							 TocEntry *pending_list)
  {
! 	int			work_status;
! 	bool		skipped_some;
  	TocEntry	ready_list;
+ 	TocEntry   *next_work_item;
  	int			ret_child;
  
  	ahlog(AH, 2, "entering restore_toc_entries_parallel\n");
  
  	/*
  	 * Initialize the lists of ready items, the list for pending items has
  	 * already been initialized in the caller.  After this setup, the pending
  	 * list is everything that needs to be done but is blocked by one or more
  	 * dependencies, while the ready list contains items that have no remaining
! 	 * dependencies. Note: we don't yet filter out entries that aren't going
! 	 * to be restored. They might participate in dependency chains connecting
  	 * entries that should be restored, so we treat them as live until we
  	 * actually process them.
  	 */
*************** restore_toc_entries_parallel(ArchiveHand
*** 3516,3524 ****
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list,
! 												slots, n_slots)) != NULL ||
! 		   work_in_progress(slots, n_slots))
  	{
  		if (next_work_item != NULL)
  		{
--- 3385,3392 ----
  
  	ahlog(AH, 1, "entering main parallel loop\n");
  
! 	while ((next_work_item = get_next_work_item(AH, &ready_list, pstate)) != NULL ||
! 		   !IsEveryWorkerIdle(pstate))
  	{
  		if (next_work_item != NULL)
  		{
*************** restore_toc_entries_parallel(ArchiveHand
*** 3538,3589 ****
  				continue;
  			}
  
! 			if ((next_slot = get_next_slot(slots, n_slots)) != NO_SLOT)
! 			{
! 				/* There is work still to do and a worker slot available */
! 				thandle		child;
! 				RestoreArgs *args;
! 
! 				ahlog(AH, 1, "launching item %d %s %s\n",
! 					  next_work_item->dumpId,
! 					  next_work_item->desc, next_work_item->tag);
  
! 				par_list_remove(next_work_item);
  
! 				/* this memory is dealloced in mark_work_done() */
! 				args = pg_malloc(sizeof(RestoreArgs));
! 				args->AH = CloneArchive(AH);
! 				args->te = next_work_item;
! 				args->pse = &pstate->pse[next_slot];
  
! 				/* run the step in a worker child */
! 				child = spawn_restore(args);
  
! 				slots[next_slot].child_id = child;
! 				slots[next_slot].args = args;
  
! 				continue;
  			}
- 		}
  
! 		/*
! 		 * If we get here there must be work being done.  Either there is no
! 		 * work available to schedule (and work_in_progress returned true) or
! 		 * there are no slots available.  So we wait for a worker to finish,
! 		 * and process the result.
! 		 */
! 		ret_child = reap_child(slots, n_slots, &work_status);
  
! 		if (WIFEXITED(work_status))
! 		{
! 			mark_work_done(AH, &ready_list,
! 						   ret_child, WEXITSTATUS(work_status),
! 						   slots, n_slots);
! 		}
! 		else
! 		{
! 			exit_horribly(modulename, "worker process crashed: status %d\n",
! 						  work_status);
  		}
  	}
  
--- 3406,3463 ----
  				continue;
  			}
  
! 			ahlog(AH, 1, "launching item %d %s %s\n",
! 				  next_work_item->dumpId,
! 				  next_work_item->desc, next_work_item->tag);
  
! 			par_list_remove(next_work_item);
  
! 			Assert(GetIdleWorker(pstate) != NO_SLOT);
! 			DispatchJobForTocEntry(AH, pstate, next_work_item, ACT_RESTORE);
! 		}
! 		else
! 			/* at least one child is working and we have nothing ready. */
! 			Assert(!IsEveryWorkerIdle(pstate));
  
! 		for (;;)
! 		{
! 			int nTerm = 0;
  
! 			/*
! 			 * In order to reduce dependencies as soon as possible and
! 			 * especially to reap the status of workers who are working on
! 			 * items that pending items depend on, we do a non-blocking check
! 			 * for ended workers first.
! 			 *
! 			 * However, if we do not have any other work items currently that
! 			 * workers can work on, we do not busy-loop here but instead
! 			 * really wait for at least one worker to terminate. Hence we call
! 			 * ListenToWorkers(..., ..., do_wait = true) in this case.
! 			 */
! 			ListenToWorkers(AH, pstate, !next_work_item);
  
! 			while ((ret_child = ReapWorkerStatus(pstate, &work_status)) != NO_SLOT)
! 			{
! 				nTerm++;
! 				mark_work_done(AH, &ready_list, ret_child, work_status, pstate);
  			}
  
! 			/*
! 			 * We need to make sure that we have an idle worker before re-running the
! 			 * loop. If nTerm > 0 we already have that (quick check).
! 			 */
! 			if (nTerm > 0)
! 				break;
  
! 			/* if nobody terminated, explicitly check for an idle worker */
! 			if (GetIdleWorker(pstate) != NO_SLOT)
! 				break;
! 
! 			/*
! 			 * If we have no idle worker, read the result of one or more
! 			 * workers and loop back to call ReapWorkerStatus() on them.
! 			 */
! 			ListenToWorkers(AH, pstate, true);
  		}
  	}
  
*************** restore_toc_entries_postfork(ArchiveHand
*** 3599,3610 ****
  	ahlog(AH, 2, "entering restore_toc_entries_postfork\n");
  
  	/*
- 	 * Remove the pstate again, so the exit handler will now fall back to
- 	 * closing AH->connection again.
- 	 */
- 	shutdown_info.pstate = NULL;
- 
- 	/*
  	 * Now reconnect the single parent connection.
  	 */
  	ConnectDatabase((Archive *) AH, ropt->dbname,
--- 3473,3478 ----
*************** restore_toc_entries_postfork(ArchiveHand
*** 3629,3749 ****
  }
  
  /*
-  * create a worker child to perform a restore step in parallel
-  */
- static thandle
- spawn_restore(RestoreArgs *args)
- {
- 	thandle		child;
- 
- 	/* Ensure stdio state is quiesced before forking */
- 	fflush(NULL);
- 
- #ifndef WIN32
- 	child = fork();
- 	if (child == 0)
- 	{
- 		/* in child process */
- 		parallel_restore(args);
- 		exit_horribly(modulename,
- 					  "parallel_restore should not return\n");
- 	}
- 	else if (child < 0)
- 	{
- 		/* fork failed */
- 		exit_horribly(modulename,
- 					  "could not create worker process: %s\n",
- 					  strerror(errno));
- 	}
- #else
- 	child = (HANDLE) _beginthreadex(NULL, 0, (void *) parallel_restore,
- 									args, 0, NULL);
- 	if (child == 0)
- 		exit_horribly(modulename,
- 					  "could not create worker thread: %s\n",
- 					  strerror(errno));
- #endif
- 
- 	return child;
- }
- 
- /*
-  *	collect status from a completed worker child
-  */
- static thandle
- reap_child(ParallelSlot *slots, int n_slots, int *work_status)
- {
- #ifndef WIN32
- 	/* Unix is so much easier ... */
- 	return wait(work_status);
- #else
- 	static HANDLE *handles = NULL;
- 	int			hindex,
- 				snum,
- 				tnum;
- 	thandle		ret_child;
- 	DWORD		res;
- 
- 	/* first time around only, make space for handles to listen on */
- 	if (handles == NULL)
- 		handles = (HANDLE *) pg_calloc(sizeof(HANDLE), n_slots);
- 
- 	/* set up list of handles to listen to */
- 	for (snum = 0, tnum = 0; snum < n_slots; snum++)
- 		if (slots[snum].child_id != 0)
- 			handles[tnum++] = slots[snum].child_id;
- 
- 	/* wait for one to finish */
- 	hindex = WaitForMultipleObjects(tnum, handles, false, INFINITE);
- 
- 	/* get handle of finished thread */
- 	ret_child = handles[hindex - WAIT_OBJECT_0];
- 
- 	/* get the result */
- 	GetExitCodeThread(ret_child, &res);
- 	*work_status = res;
- 
- 	/* dispose of handle to stop leaks */
- 	CloseHandle(ret_child);
- 
- 	return ret_child;
- #endif
- }
- 
- /*
-  * are we doing anything now?
-  */
- static bool
- work_in_progress(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id != 0)
- 			return true;
- 	}
- 	return false;
- }
- 
- /*
-  * find the first free parallel slot (if any).
-  */
- static int
- get_next_slot(ParallelSlot *slots, int n_slots)
- {
- 	int			i;
- 
- 	for (i = 0; i < n_slots; i++)
- 	{
- 		if (slots[i].child_id == 0)
- 			return i;
- 	}
- 	return NO_SLOT;
- }
- 
- 
- /*
   * Check if te1 has an exclusive lock requirement for an item that te2 also
   * requires, whether or not te2's requirement is for an exclusive lock.
   */
--- 3497,3502 ----
*************** par_list_remove(TocEntry *te)
*** 3816,3822 ****
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelSlot *slots, int n_slots)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
--- 3569,3575 ----
   */
  static TocEntry *
  get_next_work_item(ArchiveHandle *AH, TocEntry *ready_list,
! 				   ParallelState *pstate)
  {
  	bool		pref_non_data = false;	/* or get from AH->ropt */
  	TocEntry   *data_te = NULL;
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3831,3841 ****
  	{
  		int			count = 0;
  
! 		for (k = 0; k < n_slots; k++)
! 			if (slots[k].args->te != NULL &&
! 				slots[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (n_slots == 0 || count * 4 < n_slots)
  			pref_non_data = false;
  	}
  
--- 3584,3594 ----
  	{
  		int			count = 0;
  
! 		for (k = 0; k < pstate->numWorkers; k++)
! 			if (pstate->parallelSlot[k].args->te != NULL &&
! 				pstate->parallelSlot[k].args->te->section == SECTION_DATA)
  				count++;
! 		if (pstate->numWorkers == 0 || count * 4 < pstate->numWorkers)
  			pref_non_data = false;
  	}
  
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3851,3863 ****
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < n_slots && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (slots[i].args == NULL)
  				continue;
! 			running_te = slots[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
--- 3604,3616 ----
  		 * that a currently running item also needs lock on, or vice versa. If
  		 * so, we don't want to schedule them together.
  		 */
! 		for (i = 0; i < pstate->numWorkers && !conflicts; i++)
  		{
  			TocEntry   *running_te;
  
! 			if (pstate->parallelSlot[i].workerStatus != WRKR_WORKING)
  				continue;
! 			running_te = pstate->parallelSlot[i].args->te;
  
  			if (has_lock_conflicts(te, running_te) ||
  				has_lock_conflicts(running_te, te))
*************** get_next_work_item(ArchiveHandle *AH, To
*** 3892,3954 ****
  /*
   * Restore a single TOC item in parallel with others
   *
!  * this is the procedure run as a thread (Windows) or a
!  * separate process (everything else).
   */
! static parallel_restore_result
! parallel_restore(RestoreArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			retval;
! 
! 	setProcessIdentifier(args->pse, AH);
! 
! 	/*
! 	 * Close and reopen the input file so we have a private file pointer that
! 	 * doesn't stomp on anyone else's file pointer, if we're actually going to
! 	 * need to read from the file. Otherwise, just close it except on Windows,
! 	 * where it will possibly be needed by other threads.
! 	 *
! 	 * Note: on Windows, since we are using threads not processes, the reopen
! 	 * call *doesn't* close the original file pointer but just open a new one.
! 	 */
! 	if (te->section == SECTION_DATA)
! 		(AH->ReopenPtr) (AH);
! #ifndef WIN32
! 	else
! 		(AH->ClosePtr) (AH);
! #endif
! 
! 	/*
! 	 * We need our own database connection, too
! 	 */
! 	ConnectDatabase((Archive *) AH, ropt->dbname,
! 					ropt->pghost, ropt->pgport, ropt->username,
! 					ropt->promptPassword);
  
  	_doSetFixedOutputState(AH);
  
! 	/* Restore the TOC item */
! 	retval = restore_toc_entry(AH, te, ropt, true);
! 
! 	/* And clean up */
! 	DisconnectDatabase((Archive *) AH);
! 	unsetProcessIdentifier(args->pse);
  
! 	/* If we reopened the file, we are done with it, so close it now */
! 	if (te->section == SECTION_DATA)
! 		(AH->ClosePtr) (AH);
  
! 	if (retval == 0 && AH->public.n_errors)
! 		retval = WORKER_IGNORED_ERRORS;
  
! #ifndef WIN32
! 	exit(retval);
! #else
! 	return retval;
! #endif
  }
  
  
--- 3645,3673 ----
  /*
   * Restore a single TOC item in parallel with others
   *
!  * This is run in the worker, i.e. in a thread (Windows) or a separate process
!  * (everything else). A worker process executes several such work items during
!  * a parallel backup or restore. Once we finish an item here and report back
!  * that our work is done, the master process will assign us a new work item.
   */
! int
! parallel_restore(ParallelArgs *args)
  {
  	ArchiveHandle *AH = args->AH;
  	TocEntry   *te = args->te;
  	RestoreOptions *ropt = AH->ropt;
! 	int			status;
  
  	_doSetFixedOutputState(AH);
  
! 	Assert(AH->connection != NULL);
  
! 	AH->public.n_errors = 0;
  
! 	/* Restore the TOC item */
! 	status = restore_toc_entry(AH, te, ropt, true);
  
! 	return status;
  }
  
  
*************** parallel_restore(RestoreArgs *args)
*** 3960,3984 ****
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   thandle worker, int status,
! 			   ParallelSlot *slots, int n_slots)
  {
  	TocEntry   *te = NULL;
- 	int			i;
  
! 	for (i = 0; i < n_slots; i++)
! 	{
! 		if (slots[i].child_id == worker)
! 		{
! 			slots[i].child_id = 0;
! 			te = slots[i].args->te;
! 			DeCloneArchive(slots[i].args->AH);
! 			free(slots[i].args);
! 			slots[i].args = NULL;
! 
! 			break;
! 		}
! 	}
  
  	if (te == NULL)
  		exit_horribly(modulename, "could not find slot of finished worker\n");
--- 3679,3690 ----
   */
  static void
  mark_work_done(ArchiveHandle *AH, TocEntry *ready_list,
! 			   int worker, int status,
! 			   ParallelState *pstate)
  {
  	TocEntry   *te = NULL;
  
! 	te = pstate->parallelSlot[worker].args->te;
  
  	if (te == NULL)
  		exit_horribly(modulename, "could not find slot of finished worker\n");
*************** CloneArchive(ArchiveHandle *AH)
*** 4360,4368 ****
--- 4066,4122 ----
  	/* clone has its own error count, too */
  	clone->public.n_errors = 0;
  
+ 	/*
+ 	 * Connect our new clone object to the database:
+ 	 * In parallel restore the parent is already disconnected.
+ 	 * In parallel backup we clone the parent's existing connection.
+ 	 */
+ 	if (AH->ropt)
+ 	{
+ 		RestoreOptions *ropt = AH->ropt;
+ 		Assert(AH->connection == NULL);
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, ropt->dbname,
+ 					ropt->pghost, ropt->pgport, ropt->username,
+ 					ropt->promptPassword);
+ 	}
+ 	else
+ 	{
+ 		char	   *dbname;
+ 		char	   *pghost;
+ 		char	   *pgport;
+ 		char	   *username;
+ 		const char *encname;
+ 
+ 		Assert(AH->connection != NULL);
+ 
+ 		/*
+ 		 * Even though we are technically accessing the parent's database object
+ 		 * here, these functions are fine to call because they all just return a
+ 		 * pointer and do not actually send/receive any data to/from the
+ 		 * database.
+ 		 */
+ 		dbname = PQdb(AH->connection);
+ 		pghost = PQhost(AH->connection);
+ 		pgport = PQport(AH->connection);
+ 		username = PQuser(AH->connection);
+ 		encname = pg_encoding_to_char(AH->public.encoding);
+ 
+ 		/* this also sets clone->connection */
+ 		ConnectDatabase((Archive *) clone, dbname, pghost, pgport, username, TRI_NO);
+ 
+ 		/*
+ 		 * Set the same encoding; whatever we set here is what we got from
+ 		 * pg_encoding_to_char(), so we really shouldn't run into an error setting
+ 		 * that very same value. Also see the comment in SetupConnection().
+ 		 */
+ 		PQsetClientEncoding(clone->connection, encname);
+ 	}
+ 
  	/* Let the format-specific code have a chance too */
  	(clone->ClonePtr) (clone);
  
+ 	Assert(clone->connection != NULL);
  	return clone;
  }
  
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 005a8fe..28c036a 100644
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*************** typedef z_stream *z_streamp;
*** 100,107 ****
--- 100,120 ----
  #define K_OFFSET_POS_SET 2
  #define K_OFFSET_NO_DATA 3
  
+ /*
+  * Special exit values from worker children.  We reserve 0 for normal
+  * success; 1 and other small values should be interpreted as crashes.
+  */
+ #define WORKER_OK                     0
+ #define WORKER_CREATE_DONE            10
+ #define WORKER_INHIBIT_DATA           11
+ #define WORKER_IGNORED_ERRORS         12
+ 
  struct _archiveHandle;
  struct _tocEntry;
+ struct _restoreList;
+ struct _parallel_args;
+ struct _parallel_state;
+ enum _action;
  
  typedef void (*ClosePtr) (struct _archiveHandle * AH);
  typedef void (*ReopenPtr) (struct _archiveHandle * AH);
*************** typedef void (*PrintTocDataPtr) (struct
*** 129,134 ****
--- 142,154 ----
  typedef void (*ClonePtr) (struct _archiveHandle * AH);
  typedef void (*DeClonePtr) (struct _archiveHandle * AH);
  
+ typedef char *(*WorkerJobRestorePtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*WorkerJobDumpPtr)(struct _archiveHandle * AH, struct _tocEntry * te);
+ typedef char *(*MasterStartParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 											enum _action act);
+ typedef int (*MasterEndParallelItemPtr)(struct _archiveHandle * AH, struct _tocEntry * te,
+ 										const char *str, enum _action act);
+ 
  typedef size_t (*CustomOutPtr) (struct _archiveHandle * AH, const void *buf, size_t len);
  
  typedef enum
*************** typedef struct _archiveHandle
*** 227,232 ****
--- 247,258 ----
  	StartBlobPtr StartBlobPtr;
  	EndBlobPtr EndBlobPtr;
  
+ 	MasterStartParallelItemPtr MasterStartParallelItemPtr;
+ 	MasterEndParallelItemPtr MasterEndParallelItemPtr;
+ 
+ 	WorkerJobDumpPtr WorkerJobDumpPtr;
+ 	WorkerJobRestorePtr WorkerJobRestorePtr;
+ 
  	ClonePtr ClonePtr;			/* Clone format-specific fields */
  	DeClonePtr DeClonePtr;		/* Clean up cloned fields */
  
*************** typedef struct _archiveHandle
*** 236,241 ****
--- 262,268 ----
  	char	   *archdbname;		/* DB name *read* from archive */
  	enum trivalue promptPassword;
  	char	   *savedPassword;	/* password for ropt->username, if known */
+ 	char	   *use_role;
  	PGconn	   *connection;
  	int			connectToDB;	/* Flag to indicate if direct DB connection is
  								 * required */
*************** typedef struct _tocEntry
*** 323,328 ****
--- 350,356 ----
  	int			nLockDeps;		/* number of such dependencies */
  } TocEntry;
  
+ extern int parallel_restore(struct _parallel_args *args);
  extern void on_exit_close_archive(Archive *AHX);
  
  extern void warn_or_exit_horribly(ArchiveHandle *AH, const char *modulename, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
*************** extern void WriteHead(ArchiveHandle *AH)
*** 333,339 ****
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH);
  extern ArchiveHandle *CloneArchive(ArchiveHandle *AH);
  extern void DeCloneArchive(ArchiveHandle *AH);
  
--- 361,368 ----
  extern void ReadHead(ArchiveHandle *AH);
  extern void WriteToc(ArchiveHandle *AH);
  extern void ReadToc(ArchiveHandle *AH);
! extern void WriteDataChunks(ArchiveHandle *AH, struct _parallel_state *pstate);
! extern void WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te);
  extern ArchiveHandle *CloneArchive(ArchiveHandle *AH);
  extern void DeCloneArchive(ArchiveHandle *AH);
  
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 18d158f..748d96f 100644
*** a/src/bin/pg_dump/pg_backup_custom.c
--- b/src/bin/pg_dump/pg_backup_custom.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "compress_io.h"
  #include "dumputils.h"
  #include "dumpmem.h"
+ #include "parallel.h"
  
  /*--------
   * Routines in the format interface
*************** static void _LoadBlobs(ArchiveHandle *AH
*** 60,65 ****
--- 61,70 ----
  static void _Clone(ArchiveHandle *AH);
  static void _DeClone(ArchiveHandle *AH);
  
+ static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
+ static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
+ static char *_WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te);
+ 
  typedef struct
  {
  	CompressorState *cs;
*************** InitArchiveFmt_Custom(ArchiveHandle *AH)
*** 127,132 ****
--- 132,144 ----
  	AH->ClonePtr = _Clone;
  	AH->DeClonePtr = _DeClone;
  
+ 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
+ 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
+ 
+ 	/* no parallel dump in the custom archive, only parallel restore */
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = _WorkerJobRestoreCustom;
+ 
  	/* Set up a private area. */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
  	AH->formatData = (void *) ctx;
*************** _CloseArchive(ArchiveHandle *AH)
*** 698,704 ****
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
--- 710,716 ----
  		tpos = ftello(AH->FH);
  		WriteToc(AH);
  		ctx->dataStart = _getFilePos(AH, ctx);
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * If possible, re-write the TOC in order to update the data offset
*************** _DeClone(ArchiveHandle *AH)
*** 796,801 ****
--- 808,888 ----
  	free(ctx);
  }
  
+ /*
+  * This function is executed in the child of a parallel restore of a custom
+  * format archive and restores the actual data for one TOC entry.
+  */
+ char *
+ _WorkerJobRestoreCustom(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/* short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It creates a command
+  * string that is understood by the _WorkerJobRestoreCustom function of
+  * this format; the custom archive supports parallel restore only, not
+  * parallel dump.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows, because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char			buf[64]; /* short fixed-size string + number */
+ 
+ 	/* no parallel dump in the custom archive format */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the parent process. It analyzes the response
+  * of the _WorkerJobRestoreCustom function of this format (only parallel
+  * restore is supported for the custom archive).
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, status, n_errors;
+ 
+ 	/* no parallel dump in the custom archive */
+ 	Assert(act == ACT_RESTORE);
+ 
+ 	sscanf(str, "%u %u %u%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 	Assert(nBytes == strlen(str));
+ 	Assert(dumpId == te->dumpId);
+ 
+ 	AH->public.n_errors += n_errors;
+ 
+ 	return status;
+ }
+ 
  /*--------------------------------------------------
   * END OF FORMAT CALLBACKS
   *--------------------------------------------------
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index e504684..d846010 100644
*** a/src/bin/pg_dump/pg_backup_directory.c
--- b/src/bin/pg_dump/pg_backup_directory.c
***************
*** 36,41 ****
--- 36,42 ----
  #include "compress_io.h"
  #include "dumpmem.h"
  #include "dumputils.h"
+ #include "parallel.h"
  
  #include <dirent.h>
  #include <sys/stat.h>
*************** typedef struct
*** 51,56 ****
--- 52,58 ----
  	cfp		   *dataFH;			/* currently open data file */
  
  	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
+ 	ParallelState *pstate;		/* for parallel backup / restore */
  } lclContext;
  
  typedef struct
*************** static int	_ReadByte(ArchiveHandle *);
*** 70,75 ****
--- 72,78 ----
  static size_t _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len);
  static size_t _ReadBuf(ArchiveHandle *AH, void *buf, size_t len);
  static void _CloseArchive(ArchiveHandle *AH);
+ static void _ReopenArchive(ArchiveHandle *AH);
  static void _PrintTocData(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt);
  
  static void _WriteExtraToc(ArchiveHandle *AH, TocEntry *te);
*************** static void _StartBlob(ArchiveHandle *AH
*** 81,90 ****
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
  
! static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
! static void createDirectory(const char *dir);
  
  
  /*
   *	Init routine required by ALL formats. This is a global routine
--- 84,99 ----
  static void _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid);
  static void _EndBlobs(ArchiveHandle *AH, TocEntry *te);
  static void _LoadBlobs(ArchiveHandle *AH, RestoreOptions *ropt);
+ static void _Clone(ArchiveHandle *AH);
+ static void _DeClone(ArchiveHandle *AH);
  
! static char *_MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act);
! static int _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act);
! static char *_WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te);
! static char *_WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te);
  
+ static void createDirectory(const char *dir);
+ static char *prependDirectory(ArchiveHandle *AH, char *buf, const char *relativeFilename);
  
  /*
   *	Init routine required by ALL formats. This is a global routine
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 111,117 ****
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = NULL;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
--- 120,126 ----
  	AH->WriteBufPtr = _WriteBuf;
  	AH->ReadBufPtr = _ReadBuf;
  	AH->ClosePtr = _CloseArchive;
! 	AH->ReopenPtr = _ReopenArchive;
  	AH->PrintTocDataPtr = _PrintTocData;
  	AH->ReadExtraTocPtr = _ReadExtraToc;
  	AH->WriteExtraTocPtr = _WriteExtraToc;
*************** InitArchiveFmt_Directory(ArchiveHandle *
*** 122,129 ****
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = NULL;
! 	AH->DeClonePtr = NULL;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
--- 131,144 ----
  	AH->EndBlobPtr = _EndBlob;
  	AH->EndBlobsPtr = _EndBlobs;
  
! 	AH->ClonePtr = _Clone;
! 	AH->DeClonePtr = _DeClone;
! 
! 	AH->WorkerJobRestorePtr = _WorkerJobRestoreDirectory;
! 	AH->WorkerJobDumpPtr = _WorkerJobDumpDirectory;
! 
! 	AH->MasterStartParallelItemPtr = _MasterStartParallelItem;
! 	AH->MasterEndParallelItemPtr = _MasterEndParallelItem;
  
  	/* Set up our private context */
  	ctx = (lclContext *) pg_calloc(1, sizeof(lclContext));
*************** _WriteData(ArchiveHandle *AH, const void
*** 308,313 ****
--- 323,331 ----
  	if (dLen == 0)
  		return 0;
  
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	return cfwrite(data, dLen, ctx->dataFH);
  }
  
*************** _WriteBuf(ArchiveHandle *AH, const void
*** 475,480 ****
--- 493,501 ----
  	lclContext *ctx = (lclContext *) AH->formatData;
  	size_t		res;
  
+ 	/* Are we aborting? */
+ 	checkAborting(AH);
+ 
  	res = cfwrite(buf, len, ctx->dataFH);
  	if (res != len)
  		exit_horribly(modulename, "could not write to output file: %s\n",
*************** _CloseArchive(ArchiveHandle *AH)
*** 523,528 ****
--- 544,552 ----
  
  		prependDirectory(AH, fname, "toc.dat");
  
+ 		/* this will actually fork the processes for a parallel backup */
+ 		ctx->pstate = ParallelBackupStart(AH, NULL);
+ 
  		/* The TOC is always created uncompressed */
  		tocFH = cfopen_write(fname, PG_BINARY_W, 0);
  		if (tocFH == NULL)
*************** _CloseArchive(ArchiveHandle *AH)
*** 542,552 ****
  		if (cfclose(tocFH) != 0)
  			exit_horribly(modulename, "could not close TOC file: %s\n",
  						  strerror(errno));
! 		WriteDataChunks(AH);
  	}
  	AH->FH = NULL;
  }
  
  
  /*
   * BLOB support
--- 566,589 ----
  		if (cfclose(tocFH) != 0)
  			exit_horribly(modulename, "could not close TOC file: %s\n",
  						  strerror(errno));
! 		WriteDataChunks(AH, ctx->pstate);
! 
! 		ParallelBackupEnd(AH, ctx->pstate);
  	}
  	AH->FH = NULL;
  }
  
+ /*
+  * Reopen the archive's file handle.
+  */
+ static void
+ _ReopenArchive(ArchiveHandle *AH)
+ {
+ 	/*
+ 	 * Our TOC is in memory and our data files are opened separately by each
+ 	 * child anyway, so we support reopening the archive by just doing nothing.
+ 	 */
+ }
  
  /*
   * BLOB support
*************** prependDirectory(ArchiveHandle *AH, char
*** 681,683 ****
--- 718,869 ----
  
  	return buf;
  }
+ 
+ /*
+  * Clone format-specific fields during parallel backup or restore.
+  */
+ static void
+ _Clone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 
+ 	AH->formatData = (lclContext *) pg_malloc(sizeof(lclContext));
+ 	memcpy(AH->formatData, ctx, sizeof(lclContext));
+ 	ctx = (lclContext *) AH->formatData;
+ 
+ 	/*
+ 	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
+ 	 * entry per archive, so no parallelism is possible.  Likewise,
+ 	 * TOC-entry-local state isn't an issue because any one TOC entry is
+ 	 * touched by just one worker child.
+ 	 */
+ 
+ 	/*
+ 	 * We also don't copy the ParallelState pointer (pstate); only the master
+ 	 * process ever writes to it.
+ 	 */
+ }
+ 
+ static void
+ _DeClone(ArchiveHandle *AH)
+ {
+ 	lclContext *ctx = (lclContext *) AH->formatData;
+ 	free(ctx);
+ }
+ 
+ /*
+  * This function is executed in the parent process. Depending on the desired
+  * action (dump or restore) it creates a string that is understood by the
+  * _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static char *
+ _MasterStartParallelItem(ArchiveHandle *AH, TocEntry *te, T_Action act)
+ {
+ 	/*
+ 	 * A static char is okay here, even on Windows, because we call this
+ 	 * function only from one process (the master).
+ 	 */
+ 	static char	buf[64];
+ 
+ 	if (act == ACT_DUMP)
+ 		snprintf(buf, sizeof(buf), "DUMP %d", te->dumpId);
+ 	else if (act == ACT_RESTORE)
+ 		snprintf(buf, sizeof(buf), "RESTORE %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel backup for the
+  * directory archive and dumps the actual data.
+  *
+  * We are currently returning only the DumpId so theoretically we could
+  * make this function return an int (or a DumpId). However, to
+  * facilitate further enhancements, and because sooner or later we need to
+  * convert this to a string and send it via a message anyway, we stick with
+  * char *. It is parsed on the other side by the _MasterEndParallelItem()
+  * function of the respective dump format.
+  */
+ static char *
+ _WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/* short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	lclTocEntry *tctx = (lclTocEntry *) te->formatData;
+ 
+ 	/* This should never happen */
+ 	if (!tctx)
+ 		exit_horribly(modulename, "error during backup\n");
+ 
+ 	/*
+ 	 * WriteDataChunksForTocEntry() returns void; we either die horribly or succeed.
+ 	 * A failure will be detected by the parent when the child dies unexpectedly.
+ 	 */
+ 	WriteDataChunksForTocEntry(AH, te);
+ 
+ 	snprintf(buf, buflen, "OK DUMP %d", te->dumpId);
+ 
+ 	return buf;
+ }
+ 
+ /*
+  * This function is executed in the child of a parallel restore of a
+  * directory archive and restores the actual data for one TOC entry.
+  */
+ static char *
+ _WorkerJobRestoreDirectory(ArchiveHandle *AH, TocEntry *te)
+ {
+ 	/* short fixed-size string + some ID; this needs to be malloc'ed
+ 	 * instead of static because we work with threads on Windows */
+ 	const int	buflen = 64;
+ 	char	   *buf = (char*) pg_malloc(buflen);
+ 	ParallelArgs pargs;
+ 	int			status;
+ 	lclTocEntry *tctx;
+ 
+ 	tctx = (lclTocEntry *) te->formatData;
+ 
+ 	pargs.AH = AH;
+ 	pargs.te = te;
+ 
+ 	status = parallel_restore(&pargs);
+ 
+ 	snprintf(buf, buflen, "OK RESTORE %d %d %d", te->dumpId, status,
+ 			 status == WORKER_IGNORED_ERRORS ? AH->public.n_errors : 0);
+ 
+ 	return buf;
+ }
+ /*
+  * This function is executed in the parent process. It analyzes the response of
+  * the _WorkerJobDumpDirectory/_WorkerJobRestoreDirectory functions of the
+  * respective dump format.
+  */
+ static int
+ _MasterEndParallelItem(ArchiveHandle *AH, TocEntry *te, const char *str, T_Action act)
+ {
+ 	DumpId		dumpId;
+ 	int			nBytes, n_errors;
+ 	int			status = 0;
+ 
+ 	if (act == ACT_DUMP)
+ 	{
+ 		sscanf(str, "%u%n", &dumpId, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 	}
+ 	else if (act == ACT_RESTORE)
+ 	{
+ 		sscanf(str, "%u %u %u%n", &dumpId, &status, &n_errors, &nBytes);
+ 
+ 		Assert(dumpId == te->dumpId);
+ 		Assert(nBytes == strlen(str));
+ 
+ 		AH->public.n_errors += n_errors;
+ 	}
+ 
+ 	return status;
+ }
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 451c957..824a822 100644
*** a/src/bin/pg_dump/pg_backup_tar.c
--- b/src/bin/pg_dump/pg_backup_tar.c
*************** InitArchiveFmt_Tar(ArchiveHandle *AH)
*** 155,160 ****
--- 155,166 ----
  	AH->ClonePtr = NULL;
  	AH->DeClonePtr = NULL;
  
+ 	AH->MasterStartParallelItemPtr = NULL;
+ 	AH->MasterEndParallelItemPtr = NULL;
+ 
+ 	AH->WorkerJobDumpPtr = NULL;
+ 	AH->WorkerJobRestorePtr = NULL;
+ 
  	/*
  	 * Set up some special context used in compressing data.
  	 */
*************** _CloseArchive(ArchiveHandle *AH)
*** 834,840 ****
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
--- 840,846 ----
  		/*
  		 * Now send the data (tables & blobs)
  		 */
! 		WriteDataChunks(AH, NULL);
  
  		/*
  		 * Now this format wants to append a script which does a full restore
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7f91eb9..4c7b603 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** static int	disable_dollar_quoting = 0;
*** 139,144 ****
--- 139,145 ----
  static int	dump_inserts = 0;
  static int	column_inserts = 0;
  static int	no_security_labels = 0;
+ static int  no_synchronized_snapshots = 0;
  static int	no_unlogged_table_data = 0;
  static int	serializable_deferrable = 0;
  
*************** static void binary_upgrade_extension_mem
*** 253,258 ****
--- 254,260 ----
  								const char *objlabel);
  static const char *getAttrName(int attrnum, TableInfo *tblInfo);
  static const char *fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer);
+ static char *get_synchronized_snapshot(Archive *fout);
  static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query);
  
  int
*************** main(int argc, char **argv)
*** 272,277 ****
--- 274,280 ----
  	DumpableObject **dobjs;
  	int			numObjs;
  	int			i;
+ 	int			numWorkers = 1;
  	enum trivalue prompt_password = TRI_DEFAULT;
  	int			compressLevel = -1;
  	int			plainText = 0;
*************** main(int argc, char **argv)
*** 301,306 ****
--- 304,310 ----
  		{"format", required_argument, NULL, 'F'},
  		{"host", required_argument, NULL, 'h'},
  		{"ignore-version", no_argument, NULL, 'i'},
+ 		{"jobs", required_argument, NULL, 'j'},
  		{"no-reconnect", no_argument, NULL, 'R'},
  		{"oids", no_argument, NULL, 'o'},
  		{"no-owner", no_argument, NULL, 'O'},
*************** main(int argc, char **argv)
*** 340,345 ****
--- 344,350 ----
  		{"serializable-deferrable", no_argument, &serializable_deferrable, 1},
  		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
  		{"no-security-labels", no_argument, &no_security_labels, 1},
+ 		{"no-synchronized-snapshots", no_argument, &no_synchronized_snapshots, 1},
  		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
  
  		{NULL, 0, NULL, 0}
*************** main(int argc, char **argv)
*** 347,352 ****
--- 352,363 ----
  
  	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_dump"));
  
+ 	/*
+ 	 * Initialize what we need for parallel execution, especially for thread
+ 	 * support on Windows.
+ 	 */
+ 	init_parallel_dump_utils();
+ 
  	g_verbose = false;
  
  	strcpy(g_comment_start, "-- ");
*************** main(int argc, char **argv)
*** 377,383 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:in:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
--- 388,394 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "abcCE:f:F:h:ij:n:N:oOp:RsS:t:T:U:vwWxZ:",
  							long_options, &optindex)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 418,423 ****
--- 429,438 ----
  				/* ignored, deprecated option */
  				break;
  
+ 			case 'j':			/* number of dump jobs */
+ 				numWorkers = atoi(optarg);
+ 				break;
+ 
  			case 'n':			/* include schema(s) */
  				simple_string_list_append(&schema_include_patterns, optarg);
  				include_everything = false;
*************** main(int argc, char **argv)
*** 570,575 ****
--- 585,606 ----
  			compressLevel = 0;
  	}
  
+ 	/*
+ 	 * On Windows we can only have at most MAXIMUM_WAIT_OBJECTS (= 64 usually)
+ 	 * parallel jobs because that's the maximum limit for the
+ 	 * WaitForMultipleObjects() call.
+ 	 */
+ 	if (numWorkers <= 0
+ #ifdef WIN32
+ 			|| numWorkers > MAXIMUM_WAIT_OBJECTS
+ #endif
+ 		)
+ 		exit_horribly(NULL, "%s: invalid number of parallel jobs\n", progname);
+ 
+ 	/* Parallel backup only in the directory archive format so far */
+ 	if (archiveFormat != archDirectory && numWorkers > 1)
+ 		exit_horribly(NULL, "parallel backup only supported by the directory format\n");
+ 
  	/* Open the output file */
  	fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
  
*************** main(int argc, char **argv)
*** 593,598 ****
--- 624,631 ----
  	fout->minRemoteVersion = 70000;
  	fout->maxRemoteVersion = (my_version / 100) * 100 + 99;
  
+ 	fout->numWorkers = numWorkers;
+ 
  	/*
  	 * Open the database using the Archiver, so it knows about it. Errors mean
  	 * death.
*************** main(int argc, char **argv)
*** 607,631 ****
  	if (fout->remoteVersion < 90100)
  		no_security_labels = 1;
  
- 	/*
- 	 * Start transaction-snapshot mode transaction to dump consistent data.
- 	 */
- 	ExecuteSqlStatement(fout, "BEGIN");
- 	if (fout->remoteVersion >= 90100)
- 	{
- 		if (serializable_deferrable)
- 			ExecuteSqlStatement(fout,
- 								"SET TRANSACTION ISOLATION LEVEL "
- 								"SERIALIZABLE, READ ONLY, DEFERRABLE");
- 		else
- 			ExecuteSqlStatement(fout,
- 						   		"SET TRANSACTION ISOLATION LEVEL "
- 								"REPEATABLE READ");
- 	}
- 	else
- 		ExecuteSqlStatement(fout,
- 							"SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
- 
  	/* Select the appropriate subquery to convert user IDs to names */
  	if (fout->remoteVersion >= 80100)
  		username_subquery = "SELECT rolname FROM pg_catalog.pg_roles WHERE oid =";
--- 640,645 ----
*************** main(int argc, char **argv)
*** 634,639 ****
--- 648,661 ----
  	else
  		username_subquery = "SELECT usename FROM pg_user WHERE usesysid =";
  
+ 	/* check the version for the synchronized snapshots feature */
+ 	if (numWorkers > 1 && fout->remoteVersion < 90200
+ 		&& !no_synchronized_snapshots)
+ 		exit_horribly(NULL,
+ 					 "No synchronized snapshots available in this server version. "
+ 					 "Run with --no-synchronized-snapshots instead if you do not "
+ 					 "need synchronized snapshots.\n");
+ 
  	/* Find the last built-in OID, if needed */
  	if (fout->remoteVersion < 70300)
  	{
*************** main(int argc, char **argv)
*** 721,726 ****
--- 743,752 ----
  	else
  		sortDumpableObjectsByTypeOid(dobjs, numObjs);
  
+ 	/* If we do a parallel dump, we want the largest tables to go first */
+ 	if (archiveFormat == archDirectory && numWorkers > 1)
+ 		sortDataAndIndexObjectsBySize(dobjs, numObjs);
+ 
  	sortDumpableObjects(dobjs, numObjs);
  
  	/*
*************** help(const char *progname)
*** 784,789 ****
--- 810,816 ----
  	printf(_("  -f, --file=FILENAME         output file or directory name\n"));
  	printf(_("  -F, --format=c|d|t|p        output file format (custom, directory, tar,\n"
  			 "                              plain text (default))\n"));
+ 	printf(_("  -j, --jobs=NUM              use this many parallel jobs to dump\n"));
  	printf(_("  -v, --verbose               verbose mode\n"));
  	printf(_("  -Z, --compress=0-9          compression level for compressed formats\n"));
  	printf(_("  --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
*************** help(const char *progname)
*** 813,818 ****
--- 840,846 ----
  	printf(_("  --exclude-table-data=TABLE  do NOT dump data for the named table(s)\n"));
  	printf(_("  --inserts                   dump data as INSERT commands, rather than COPY\n"));
  	printf(_("  --no-security-labels        do not dump security label assignments\n"));
+ 	printf(_("  --no-synchronized-snapshots parallel processes should not use synchronized snapshots\n"));
  	printf(_("  --no-tablespaces            do not dump tablespace assignments\n"));
  	printf(_("  --no-unlogged-table-data    do not dump unlogged table data\n"));
  	printf(_("  --quote-all-identifiers     quote all identifiers, even if not key words\n"));
*************** setup_connection(Archive *AH, const char
*** 841,847 ****
  	PGconn	   *conn = GetConnection(AH);
  	const char *std_strings;
  
! 	/* Set the client encoding if requested */
  	if (dumpencoding)
  	{
  		if (PQsetClientEncoding(conn, dumpencoding) < 0)
--- 869,880 ----
  	PGconn	   *conn = GetConnection(AH);
  	const char *std_strings;
  
! 	/*
! 	 * Set the client encoding if requested.  If dumpencoding is NULL, either
! 	 * no encoding was requested or this is a cloned connection, in which
! 	 * case CloneArchive has already set the client encoding to match the
! 	 * original connection.
! 	 */
  	if (dumpencoding)
  	{
  		if (PQsetClientEncoding(conn, dumpencoding) < 0)
*************** setup_connection(Archive *AH, const char
*** 859,864 ****
--- 892,901 ----
  	AH->std_strings = (std_strings && strcmp(std_strings, "on") == 0);
  
  	/* Set the role if requested */
+ 	if (!use_role && AH->use_role)
+ 		use_role = AH->use_role;
+ 
+ 	/* Set the role if requested */
  	if (use_role && AH->remoteVersion >= 80100)
  	{
  		PQExpBuffer query = createPQExpBuffer();
*************** setup_connection(Archive *AH, const char
*** 866,871 ****
--- 903,912 ----
  		appendPQExpBuffer(query, "SET ROLE %s", fmtId(use_role));
  		ExecuteSqlStatement(AH, query->data);
  		destroyPQExpBuffer(query);
+ 
+ 		/* save this for later use on parallel connections */
+ 		if (!AH->use_role)
+ 			AH->use_role = strdup(use_role);
  	}
  
  	/* Set the datestyle to ISO to ensure the dump's portability */
*************** setup_connection(Archive *AH, const char
*** 902,907 ****
--- 943,1001 ----
  	 */
  	if (quote_all_identifiers && AH->remoteVersion >= 90100)
  		ExecuteSqlStatement(AH, "SET quote_all_identifiers = true");
+ 
+ 	/*
+ 	 * Start transaction-snapshot mode transaction to dump consistent data.
+ 	 */
+ 	ExecuteSqlStatement(AH, "BEGIN");
+ 	if (AH->remoteVersion >= 90100)
+ 	{
+ 		if (serializable_deferrable)
+ 			ExecuteSqlStatement(AH,
+ 						   "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, "
+ 						   "READ ONLY, DEFERRABLE");
+ 		else
+ 			ExecuteSqlStatement(AH,
+ 						   "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");
+ 	}
+ 	else
+ 		ExecuteSqlStatement(AH, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
+ 
+ 	if (AH->numWorkers > 1 && AH->remoteVersion >= 90200 && !no_synchronized_snapshots)
+ 	{
+ 		if (AH->sync_snapshot_id)
+ 		{
+ 			PQExpBuffer query = createPQExpBuffer();
+ 			appendPQExpBuffer(query, "SET TRANSACTION SNAPSHOT ");
+ 			appendStringLiteralConn(query, AH->sync_snapshot_id, conn);
+ 			ExecuteSqlStatement(AH, query->data);
+ 			destroyPQExpBuffer(query);
+ 		}
+ 		else
+ 			AH->sync_snapshot_id = get_synchronized_snapshot(AH);
+ 	}
+ }
+ 
+ /*
+  * Initialize the connection for a new worker process.
+  */
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	setup_connection(AHX, NULL, NULL);
+ }
+ 
+ static char*
+ get_synchronized_snapshot(Archive *fout)
+ {
+ 	char	   *query = "select pg_export_snapshot()";
+ 	char	   *result;
+ 	PGresult   *res;
+ 
+ 	res = ExecuteSqlQueryForSingleRow(fout, query);
+ 	result = strdup(PQgetvalue(res, 0, 0));
+ 	PQclear(res);
+ 
+ 	return result;
  }
  
  static ArchiveFormat
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 11c83f7..99dc7a6 100644
*** a/src/bin/pg_dump/pg_restore.c
--- b/src/bin/pg_dump/pg_restore.c
*************** main(int argc, char **argv)
*** 406,411 ****
--- 406,421 ----
  		InitDummyWantedList(AH, opts);
  	}
  
+ 	/* See comments in pg_dump.c */
+ #ifdef WIN32
+ 	if (numWorkers > MAXIMUM_WAIT_OBJECTS)
+ 	{
+ 		fprintf(stderr, _("%s: maximum number of parallel jobs is %d\n"),
+ 				progname, MAXIMUM_WAIT_OBJECTS);
+ 		exit(1);
+ 	}
+ #endif
+ 
  	AH->numWorkers = numWorkers;
  
  	if (opts->tocSummary)
*************** main(int argc, char **argv)
*** 426,431 ****
--- 436,448 ----
  	return exit_code;
  }
  
+ void
+ _SetupWorker(Archive *AHX, RestoreOptions *ropt)
+ {
+ 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ 	(AH->ReopenPtr) (AH);
+ }
+ 
  static void
  usage(const char *progname)
  {
#66Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#65)
Re: patch for parallel pg_dump

On Sun, Apr 1, 2012 at 12:35 PM, Joachim Wieland <joe@mcknight.de> wrote:

On Wed, Mar 28, 2012 at 2:20 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

My main comment about the current patch is that it looks like it's
touching pg_restore parallel code by moving some stuff into parallel.c.
If that's really the case and its voluminous, maybe this patch would
shrink a bit if we could do the code moving in a first patch.  That
would be mostly mechanical.  Then the interesting stuff would apply on
top of that.  That would make review easier.

Unfortunately this is not really the case. What is being moved out of
pg_backup_archiver.c and into parallel.c is either the shutdown logic
that has been applied only a few days ago or is necessary to change
the parallel restore logic from one-thread-per-dump-object to the
message passing framework where a worker starts in the beginning and
then receives a new dump object from the master every time it's idle.

Hmm. It looks to me like the part-two patch still contains a bunch of
code rearrangement. For example, the current code for
pg_backup_archiver.c patch contains this:

typedef struct ParallelState
{
int numWorkers;
ParallelStateEntry *pse;
} ParallelState;

In the revised patch, that's removed, and parallel.c instead contains this:

typedef struct _parallel_state
{
int numWorkers;
ParallelSlot *parallelSlot;
} ParallelState;

Perhaps I am missing something, but it looks to me like that's the
same code, except that for some reason the identifiers have been
renamed. I see little point in renaming the struct from ParallelState
to _parallel_state (in fact, I like the new name less) or changing
ParallelStateEntry to ParallelSlot (which is no worse, but it's not
better either).

On a similar note, what's the point of changing struct Archive to have
int numWorkers instead of int number_of_jobs, and furthermore
shuffling the declaration around to a different part of the struct?
If that's really an improvement, it should be a separate patch, but my
guess is that it is just rearranging deck chairs. It gives rise to
subsidiary diff hunks like this:

        /*
         * If we're going to do parallel restore, there are some restrictions.
         */
!       parallel_mode = (ropt->number_of_jobs > 1 && ropt->useDB);
        if (parallel_mode)
        {
                /* We haven't got round to making this work for all
archive formats */
--- 271,277 ----
        /*
         * If we're going to do parallel restore, there are some restrictions.
         */
!       parallel_mode = (AH->public.numWorkers > 1 && ropt->useDB);
        if (parallel_mode)
        {
                /* We haven't got round to making this work for all
archive formats */

On another note, I am not sure that I like the messaging protocol
you've designed. It seems to me that this has little chance of being
reliable:

+ void (*on_exit_msg_func)(const char *modulename, const char *fmt,
va_list ap) = vwrite_msg;

I believe the idea here is that you're going to capture the dying gasp
of the worker thread and send it to the master instead of printing it
out. But that doesn't seem very reliable. There's code all over the
place (and, in particular, in pg_dump.c) that assumes that we may
write messages at any time, and in particular that we may emit
multiple messages just before croaking. Even if you were to hook
vwrite_msg, I'm not sure that would do it, because there are certainly
situations in which libpq can print out errors directly, for example,
or a system library might cough something up. I'm thinking that the
boss process should really probably be capturing stdout and stderr and
reading messages from there, and interpreting any messages that it
sees as non-fatal (but still reporting them back to the user) unless
the worker exits unexpectedly (which the master can detect by noticing
that the connection has been closed).

Also, I like the idea of making it possible to use assertions in
front-end code. But I think that if we're going to do that, we should
make it work in general, not just for things that include
pg_backup_archiver.h. It seems to me that the way to do this would be
to move the definitions of Assert(), AssertMacro(), AssertArg(),
AssertState(), Trap(), TrapMacro(), and ExceptionalCondition() out of
postgres.h and into a new header file, say, pg_assert.h. postgres.h
can include that automatically, and people writing front-end code can
include it if they're so inclined. The only difficulty is where and
how to define ExceptionalCondition(). The obvious place seems to be
dumputils.c, which seems to be our unofficial place to stick stuff
that may be needed by a variety of frontend code. Really, I think we
ought to think about creating a library that is more explicitly for
that purpose (libpgfrontend?) or else figuring out how to incorporate
it into libpq, but that's a project for another day.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#67Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#66)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 9:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Apr 1, 2012 at 12:35 PM, Joachim Wieland <joe@mcknight.de> wrote:

Unfortunately this is not really the case. What is being moved out of
pg_backup_archiver.c and into parallel.c is either the shutdown logic
that has been applied only a few days ago or is necessary to change
the parallel restore logic from one-thread-per-dump-object to the
message passing framework where a worker starts in the beginning and
then receives a new dump object from the master every time it's idle.

Hmm.  It looks to me like the part-two patch still contains a bunch of
code rearrangement.  For example, the current code for
pg_backup_archiver.c patch contains this:

typedef struct ParallelState
{
       int                     numWorkers;
       ParallelStateEntry *pse;
} ParallelState;

In the revised patch, that's removed, and parallel.c instead contains this:

typedef struct _parallel_state
{
     int                     numWorkers;
     ParallelSlot *parallelSlot;
} ParallelState;

Yes, this is what I referred to as the part of the shutdown logic that
we only applied a few days ago. I basically backported what I had in
parallel.c into pg_backup_archiver.c which is where all the parallel
logic lives at the moment. Moving it out of pg_backup_archiver.c and
into a new parallel.c file means that I'd either have to move the
declaration to a header or create accessor functions and declare these
in a header. I actually tried it and both solutions created more lines
than they would save later on, especially with the lines that will
remove this temporary arrangement again...

The current parallel restore engine already has a "ParallelSlot"
structure but uses it in a slightly different way. That's why I
created the one in the shutdown logic as "ParallelStateEntry" for now.
This will be gone with the final patch and at the end there will only
be a "ParallelSlot" left that will serve both purposes. That's why you
see this renaming (and the removal of the current ParallelSlot
structure).

"struct _parallel_state" won't be used anywhere, except for forward
declarations in headers. I just used it because that seemed to be the
naming scheme, other structures are called similarly, e.g.:

$ grep "struct _" pg_backup_archiver.c
typedef struct _restore_args
typedef struct _parallel_slot
typedef struct _outputContext

I'm fine with any name, just let me know what you prefer.

On a similar note, what's the point of changing struct Archive to have
int numWorkers instead of int number_of_jobs, and furthermore
shuffling the declaration around to a different part of the struct?

number_of_jobs was in the struct RestoreOptions before, a structure
that is not used when doing a dump. I moved it to the Archive as it is
used by both dump and restore and since all other code talks about
"workers" I changed the name to "numWorkers".

On another note, I am not sure that I like the messaging protocol
you've designed.  It seems to me that this has little chance of being
reliable:

+ void (*on_exit_msg_func)(const char *modulename, const char *fmt,
va_list ap) = vwrite_msg;

I believe the idea here is that you're going to capture the dying gasp
of the worker thread and send it to the master instead of printing it
out.  But that doesn't seem very reliable.  There's code all over the
place (and, in particular, in pg_dump.c) that assumes that we may
write messages at any time, and in particular that we may emit
multiple messages just before croaking.

I guess you're talking about the code in pg_dump that reads in the
database schema and the details of all the different objects in the
schema. This code is run before forking off workers and is always
executed in the master.

pg_dump only forks when all the catalog data has been read so only
actual TABLE DATA and BLOBs are dumped from the workers. I claim that
in at least 90% the functions involved here use exit_horribly() and
output a clear message about why they're dying. If they don't but just
die, the master will say "worker died unexpectedly". As you said a few
mails before, any function exiting at this stage should rather call
exit_horribly() to properly clean up after itself.

The advantage of using the message passing system for the last error
message is that you get exactly one message and it's very likely that
it accurately describes what happened to a worker to make it stop.

Also, I like the idea of making it possible to use assertions in
front-end code.  But I think that if we're going to do that, we should
make it work in general, not just for things that include
pg_backup_archiver.h.

I completely agree. Assertions helped a lot dealing with concurrent
code. How do you want to tackle this for now? Want me to create a
separate header pg_assert.h as part of my patch? Or is it okay to
factor it out later and include it from the general header then?

#68Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joachim Wieland (#67)
Re: patch for parallel pg_dump

Excerpts from Joachim Wieland's message of mar abr 03 11:40:31 -0300 2012:

On Tue, Apr 3, 2012 at 9:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Hmm.  It looks to me like the part-two patch still contains a bunch of
code rearrangement.  For example, the current code for
pg_backup_archiver.c patch contains this:

typedef struct ParallelState
{
       int                     numWorkers;
       ParallelStateEntry *pse;
} ParallelState;

In the revised patch, that's removed, and parallel.c instead contains this:

typedef struct _parallel_state
{
     int                     numWorkers;
     ParallelSlot *parallelSlot;
} ParallelState;

"struct _parallel_state" won't be used anywhere, except for forward
declarations in headers. I just used it because that seemed to be the
naming scheme, other structures are called similarly, e.g.:

$ grep "struct _" pg_backup_archiver.c
typedef struct _restore_args
typedef struct _parallel_slot
typedef struct _outputContext

I'm fine with any name, just let me know what you prefer.

I think I can explain this one. In the refactoring patch that Joachim
submitted and I committed, the struct was called _parallel_state, using
the same naming convention with the useless _ suffix and all lowercase
that already plagued the pg_dump code. This name is pointlessly
different from the typedef -- maybe the original pg_dump author thought
the names would collide and chose them different. But this is not true,
looks ugly to me, and furthermore it is inconsistent with what we do in
the rest of the PG code which is to use struct names identical to the
typedef names. So I changed it -- without realizing that the subsequent
patch would move the declarations elsewhere, losing my renaming.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#69Robert Haas
robertmhaas@gmail.com
In reply to: Joachim Wieland (#67)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 10:40 AM, Joachim Wieland <joe@mcknight.de> wrote:

On a similar note, what's the point of changing struct Archive to have
int numWorkers instead of int number_of_jobs, and furthermore
shuffling the declaration around to a different part of the struct?

number_of_jobs was in the struct RestoreOptions before, a structure
that is not used when doing a dump. I moved it to the Archive as it is
used by both dump and restore and since all other code talks about
"workers" I changed the name to "numWorkers".

Gah. Somehow I feel that splitting up this patch into two pieces
hasn't made anything any better.

On another note, I am not sure that I like the messaging protocol
you've designed.  It seems to me that this has little chance of being
reliable:

+ void (*on_exit_msg_func)(const char *modulename, const char *fmt,
va_list ap) = vwrite_msg;

I believe the idea here is that you're going to capture the dying gasp
of the worker thread and send it to the master instead of printing it
out.  But that doesn't seem very reliable.  There's code all over the
place (and, in particular, in pg_dump.c) that assumes that we may
write messages at any time, and in particular that we may emit
multiple messages just before croaking.

I guess you're talking about the code in pg_dump that reads in the
database schema and the details of all the different objects in the
schema. This code is run before forking off workers and is always
executed in the master.

OK, but it seems like a pretty fragile assumption that none of the
workers will ever manage to emit any other error messages. We don't
rely on this kind of assumption in the backend, which is a lot
better-structured and less spaghetti-like than the pg_dump code.

Also, I like the idea of making it possible to use assertions in
front-end code.  But I think that if we're going to do that, we should
make it work in general, not just for things that include
pg_backup_archiver.h.

I completely agree. Assertions helped a lot dealing with concurrent
code. How do you want to tackle this for now? Want me to create a
separate header pg_assert.h as part of my patch? Or is it okay to
factor it out later and include it from the general header then?

I'll just go do it, barring objections.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#69)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 10:40 AM, Joachim Wieland <joe@mcknight.de> wrote:

I completely agree. Assertions helped a lot dealing with concurrent
code. How do you want to tackle this for now? Want me to create a
separate header pg_assert.h as part of my patch? Or is it okay to
factor it out later and include it from the general header then?

I'll just go do it, barring objections.

If the necessary support code isn't going to be available *everywhere*,
it should not be in postgres.h. So I did not care for your proposal to
put it in dumputils.

Possibly we could move assert.c into src/port/ and make it part of
libpgport?

regards, tom lane

#71Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#70)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 10:40 AM, Joachim Wieland <joe@mcknight.de> wrote:

I completely agree. Assertions helped a lot dealing with concurrent
code. How do you want to tackle this for now? Want me to create a
separate header pg_assert.h as part of my patch? Or is it okay to
factor it out later and include it from the general header then?

I'll just go do it, barring objections.

If the necessary support code isn't going to be available *everywhere*,
it should not be in postgres.h.  So I did not care for your proposal to
put it in dumputils.

Err... I didn't suggest putting it in postgres.h. I suggested taking
it OUT of postgres.h and putting it in a separate header file.

Possibly we could move assert.c into src/port/ and make it part of
libpgport?

The trouble is that it calls write_stderr(), which has a non-trivial
implementation on Windows that I don't believe will be suitable for
front-end code. If we can find a reasonable way to work around that
issue then I think that would work. We might also want to rename
ExceptionalCondition() to pg_exceptional_condition or something like
that if we're going to include it in libpgport.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#72Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#70)
Re: patch for parallel pg_dump

Excerpts from Tom Lane's message of mar abr 03 12:38:20 -0300 2012:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 10:40 AM, Joachim Wieland <joe@mcknight.de> wrote:

I completely agree. Assertions helped a lot dealing with concurrent
code. How do you want to tackle this for now? Want me to create a
separate header pg_assert.h as part of my patch? Or is it okay to
factor it out later and include it from the general header then?

I'll just go do it, barring objections.

If the necessary support code isn't going to be available *everywhere*,
it should not be in postgres.h. So I did not care for your proposal to
put it in dumputils.

Possibly we could move assert.c into src/port/ and make it part of
libpgport?

That only leaves assert_enabled to be handled. In the backend it lives
in guc.c; what to do about frontend code?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#71)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Possibly we could move assert.c into src/port/ and make it part of
libpgport?

The trouble is that it calls write_stderr(), which has a non-trivial
implementation on Windows that I don't believe will be suitable for
front-end code. If we can find a reasonable way to work around that
issue then I think that would work.

Well, if we don't have a solution to that problem then it's premature
to propose making Assert available to frontend code. So my opinion
is that that idea is too half-baked to be pushing into 9.2 at this
time. Let's put it on the to-do list instead.

regards, tom lane

#74Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#73)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Possibly we could move assert.c into src/port/ and make it part of
libpgport?

The trouble is that it calls write_stderr(), which has a non-trivial
implementation on Windows that I don't believe will be suitable for
front-end code.  If we can find a reasonable way to work around that
issue then I think that would work.

Well, if we don't have a solution to that problem then it's premature
to propose making Assert available to frontend code.  So my opinion
is that that idea is too half-baked to be pushing into 9.2 at this
time.  Let's put it on the to-do list instead.

It's more baked than Joachim's existing solution, and I don't favor
punting his whole patch because we don't want to give five minutes of
thought to this problem. The patch may need to be punted for other
reasons, of course.

Maybe we could just stick #ifdef BACKEND in the libpgport code. If
we're in the backend, we write_stderr(). Otherwise we just
fprintf(stderr, ...).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#72)
Re: patch for parallel pg_dump

Alvaro Herrera <alvherre@commandprompt.com> writes:

That only leaves assert_enabled to be handled. In the backend it lives
in guc.c; what to do about frontend code?

There's no mechanism for turning such a switch on or off in most
frontend code anyway. I'd think it could just be assumed to be on
in the frontend implementation --- ie, frontend Asserts would always
be active in --enable-cassert builds. There's not really any need
to do better until/unless we start having Asserts that impact
performance on the frontend side.

regards, tom lane

#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#74)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, if we don't have a solution to that problem then it's premature
to propose making Assert available to frontend code.  So my opinion
is that that idea is too half-baked to be pushing into 9.2 at this
time.  Let's put it on the to-do list instead.

It's more baked than Joachim's existing solution, and I don't favor
punting his whole patch because we don't want to give five minutes of
thought to this problem. The patch may need to be punted for other
reasons, of course.

Ripping out the Asserts surely can't take long.

Maybe we could just stick #ifdef BACKEND in the libpgport code. If
we're in the backend, we write_stderr(). Otherwise we just
fprintf(stderr, ...).

No, the reason for write_stderr() is that fprintf(stderr) is unreliable
on Windows. If memory serves, it can actually crash in some situations.

More generally, it's not very clear what to do with error reports in an
OS environment where stderr isn't a universally supported concept.
So (IMO anyway) you can't just ignore the problem. And it's not one
that's going to be solved in five minutes, either.

regards, tom lane

#77Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#76)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, if we don't have a solution to that problem then it's premature
to propose making Assert available to frontend code.  So my opinion
is that that idea is too half-baked to be pushing into 9.2 at this
time.  Let's put it on the to-do list instead.

It's more baked than Joachim's existing solution, and I don't favor
punting his whole patch because we don't want to give five minutes of
thought to this problem.  The patch may need to be punted for other
reasons, of course.

Ripping out the Asserts surely can't take long.

Yeah, but asserts exist for a reason: to make it possible to find bugs
more easily. Let's not be penny-wise and pound-foolish.

Maybe we could just stick #ifdef BACKEND in the libpgport code.  If
we're in the backend, we write_stderr().  Otherwise we just
fprintf(stderr, ...).

No, the reason for write_stderr() is that fprintf(stderr) is unreliable
on Windows.  If memory serves, it can actually crash in some situations.

Dude, we're already doing fprintf(stderr) all over pg_dump. If it's
unreliable even in front-end code, we're screwed anyway. That is a
non-objection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#77)
Re: patch for parallel pg_dump

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

No, the reason for write_stderr() is that fprintf(stderr) is unreliable
on Windows.  If memory serves, it can actually crash in some situations.

Dude, we're already doing fprintf(stderr) all over pg_dump. If it's
unreliable even in front-end code, we're screwed anyway. That is a
non-objection.

No, it isn't. The fact that it works in pg_dump doesn't extrapolate
to other places. (In particular, it will absolutely not work in libpq,
at least not in all the environments where libpq is supposed to work.)

I think what we've got at the moment is something that's adequate for
pg_dump, and that's all that it is. Concluding that it can be used in
all frontend code is way premature, and therefore I'm -1 on the idea
of exposing it in non-pg_dump header files.

regards, tom lane

#79Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#78)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 3, 2012 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

No, the reason for write_stderr() is that fprintf(stderr) is unreliable
on Windows.  If memory serves, it can actually crash in some situations.

Dude, we're already doing fprintf(stderr) all over pg_dump.  If it's
unreliable even in front-end code, we're screwed anyway.  That is a
non-objection.

No, it isn't.  The fact that it works in pg_dump doesn't extrapolate
to other places.  (In particular, it will absolutely not work in libpq,
at least not in all the environments where libpq is supposed to work.)

Well, I didn't propose putting the assert machinery into libpq, but
FWIW fprintf(stderr, ...) is already used there - see
defaultNoticeProcessor() for example. It's also used in libpgport,
which is where we're proposing to put this code (see pgsymlink, for
example). Furthermore, the code would only run in the event that
assertions are enabled (which only developers normally do) and it
would only break if run as a service (which would be unusual when
debugging) and only in the case where the process was in the middle of
crashing anyway (in which case the worst case is that it crashes just
before printing the error message rather than just after).

However, if despite all that you're still worried about it, it appears
special handling is only needed to cover the case where code might be
running as a service, and apparently none of our utilities worry about
that right now except for pg_ctl, which handles it by defining a
frontend-safe version of write_stderr(). So perhaps we ought to merge
the frontend version in pg_ctl with the backend version and put the
result in libpgport (possibly renamed) and then ExceptionalCondition()
can go into pgport and simply call write_stderr() and it will get the
appropriate version depending on whether it's running in frontend or
backend code. That's a tad more work than I was expecting to do here,
but I'm still not convinced that it's more than can be done in an
afternoon.

I think what we've got at the moment is something that's adequate for
pg_dump, and that's all that it is.  Concluding that it can be used in
all frontend code is way premature, and therefore I'm -1 on the idea
of exposing it in non-pg_dump header files.

It seems desirable to share as much of the implementation with the
backend as possible, and moving the macros from postgres.h into their
own header file (which postgres.h would include) would allow that,
leaving it as the application's problem to provide
ExceptionalCondition(), which pg_dump can surely do. If we don't do
at least that much, then we're going to end up either (1) removing the
assertions, which doesn't seem prudent, or (2) doing a cut-and-paste
from postgres.h into a pg_dump header file, which seems to have no
virtues whatsoever, or (3) having an implementation in pg_dump that is
completely separate from and different from what we have in the
backend, which seems even worse than the preceding option. The
minimum we need to avoid creating a mess here is no more than
moving 25 lines of code from postgres.h to a separate header file that
pg_dump can include. I don't see why that's a big deal. I also don't
see why we can't fix the problem more completely, perhaps along the
lines suggested above.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#80Joachim Wieland
joe@mcknight.de
In reply to: Robert Haas (#69)
Re: patch for parallel pg_dump

On Tue, Apr 3, 2012 at 11:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

OK, but it seems like a pretty fragile assumption that none of the
workers will ever manage to emit any other error messages.  We don't
rely on this kind of assumption in the backend, which is a lot
better-structured and less spaghetti-like than the pg_dump code.

Yeah, but even if they don't use exit_horribly(), the user would still
see the output; stdout/stderr aren't closed, and every process can
still write to them.

As a test, I printed out some messages upon seeing a specific dump id
in a worker:

if (strcmp(command, "DUMP 3540") == 0)
{
    write_msg(NULL, "Info 1\n");
    printf("Info 2\n");
    exit_horribly(modulename, "that's why\n");
}

$ ./pg_dump -j 7 ...
pg_dump: Info 1
Info 2
pg_dump: [parallel archiver] that's why

if (strcmp(command, "DUMP 3540") == 0)
{
    write_msg(NULL, "Info 1\n");
    printf("Info 2\n");
    fprintf(stderr, "exiting on my own\n");
    exit(1);
}

$ ./pg_dump -j 7 ...
pg_dump: Info 1
Info 2
exiting on my own
pg_dump: [parallel archiver] A worker process died unexpectedly