pg_basebackup stream xlog to tar
Attached patch adds support for -X stream to work with .tar and .tar.gz
file formats.
If tar mode is specified, a separate pg_xlog.tar (or .tar.gz) file is
created and the data is streamed into it. Regular mode does not (should
not) see any changes in how it works.
The implementation creates a "walmethod" for directory and one for tar,
which is basically a set of function pointers that we pass around as part
of the StreamCtl structure. All calls to modify the files are sent through
the current method, using the normal open/read/write calls as it is now for
directories, and the more complicated method for tar and targz.
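To make the dispatch concrete, here is a rough sketch (not code from the
patch; the helper name and its arguments are made up for illustration) of
how a caller goes through the method, using the names from walmethods.h:

    static bool
    write_one_chunk(StreamCtl *stream, const char *fname,
                    const char *buf, int len)
    {
        /* stream->walmethod was set up by CreateWalDirectoryMethod() or
         * CreateWalTarMethod(); this code never touches the filesystem
         * directly. */
        Walfile     f;

        f = stream->walmethod->open_for_write(fname, ".partial", XLogSegSize);
        if (f == NULL)
        {
            fprintf(stderr, "open failed: %s\n",
                    stream->walmethod->getlasterror());
            return false;
        }
        if (stream->walmethod->write(f, buf, len) != len)
        {
            fprintf(stderr, "write failed: %s\n",
                    stream->walmethod->getlasterror());
            stream->walmethod->close(f, CLOSE_UNLINK);
            return false;
        }
        return stream->walmethod->close(f, CLOSE_NORMAL) == 0;
    }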
The tar method doesn't support everything that pg_receivexlog requires,
but I don't think it makes any real sense to support pg_receivexlog in tar
mode. It does support all the things that pg_basebackup needs.
Some smaller pieces of functionality, like unlinking files on failure and
padding files, have been moved into the walmethod because they have to be
implemented differently (we cannot pre-pad a compressed file, for example
-- the size will depend on the compression ratio anyway).
AFAICT we never actually documented that -X stream doesn't work with tar in
the manpage of current versions. Only in the error message. We might want
to fix that in backbranches.
In passing this also fixes an XXX comment about not re-lseeking on the WAL
file all the time -- the walmethod now tracks the current position in the
file in a variable.
Finally, to make this work, the print_tar_number() function is now exported
from port/tar.c along with the other ones already exported from there.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Attachments:
pg_basebackup_stream_tar.patch (text/x-patch)
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 03615da..981d201 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
target directory, the tar contents will be written to
standard output, suitable for piping to for example
<productname>gzip</productname>. This is only possible if
- the cluster has no additional tablespaces.
+ the cluster has no additional tablespaces and transaction
+ log streaming is not used.
</para>
</listitem>
</varlistentry>
@@ -323,6 +324,10 @@ PostgreSQL documentation
If the log has been rotated when it's time to transfer it, the
backup will fail and be unusable.
</para>
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
</listitem>
</varlistentry>
@@ -339,6 +344,9 @@ PostgreSQL documentation
client can keep up with transaction log received, using this mode
requires no extra transaction logs to be saved on the master.
</para>
+ <para>The transaction log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
all: pg_basebackup pg_receivexlog pg_recvlogical
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 351a420..58c0821 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -365,7 +365,7 @@ typedef struct
{
PGconn *bgconn;
XLogRecPtr startptr;
- char xlogdir[MAXPGPATH];
+ char xlog[MAXPGPATH]; /* directory or tarfile depending on mode */
char *sysidentifier;
int timeline;
} logstreamer_param;
@@ -383,9 +383,13 @@ LogStreamerMain(logstreamer_param *param)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = false;
stream.mark_done = true;
- stream.basedir = param->xlogdir;
stream.partial_suffix = NULL;
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel);
+
if (!ReceiveXlogStream(param->bgconn, &stream))
/*
@@ -395,6 +399,14 @@ LogStreamerMain(logstreamer_param *param)
*/
return 1;
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return 1;
+ }
+
PQfinish(param->bgconn);
return 0;
}
@@ -445,22 +457,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
/* Error message already written in GetConnection() */
exit(1);
- snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
- /*
- * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
- * basedir/pg_xlog as the directory entry in the tar file may arrive
- * later.
- */
- snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
- basedir);
+ snprintf(param->xlog, sizeof(param->xlog), "%s/pg_xlog", basedir);
- if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ if (format == 'p')
{
- fprintf(stderr,
- _("%s: could not create directory \"%s\": %s\n"),
- progname, statusdir, strerror(errno));
- disconnect_and_exit(1);
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
+
+ if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ {
+ fprintf(stderr,
+ _("%s: could not create directory \"%s\": %s\n"),
+ progname, statusdir, strerror(errno));
+ disconnect_and_exit(1);
+ }
}
/*
@@ -2110,16 +2125,6 @@ main(int argc, char **argv)
exit(1);
}
- if (format != 'p' && streamwal)
- {
- fprintf(stderr,
- _("%s: WAL streaming can only be used in plain mode\n"),
- progname);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
- progname);
- exit(1);
- }
-
if (replication_slot && !streamwal)
{
fprintf(stderr,
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 7f7ee9d..9b4c101 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -337,11 +337,19 @@ StreamLog(void)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = synchronous;
stream.mark_done = false;
- stream.basedir = basedir;
+ stream.walmethod = CreateWalDirectoryMethod(basedir);
stream.partial_suffix = ".partial";
ReceiveXlogStream(conn, &stream);
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return;
+ }
+
PQfinish(conn);
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 062730b..9197eeb 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -26,7 +26,7 @@
/* fd and filename for currently open WAL file */
-static int walfile = -1;
+static Walfile *walfile = NULL;
static char current_walfile_name[MAXPGPATH] = "";
static bool reportFlushPosition = false;
static XLogRecPtr lastFlushPosition = InvalidXLogRecPtr;
@@ -37,7 +37,7 @@ static PGresult *HandleCopyStream(PGconn *conn, StreamCtl *stream,
XLogRecPtr *stoppos);
static int CopyStreamPoll(PGconn *conn, long timeout_ms);
static int CopyStreamReceive(PGconn *conn, long timeout, char **buffer);
-static bool ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+static bool ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status);
static bool ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr *blockpos);
@@ -52,33 +52,33 @@ static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos,
uint32 *timeline);
static bool
-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
{
- int fd;
+ Walfile *f;
static char tmppath[MAXPGPATH];
- snprintf(tmppath, sizeof(tmppath), "%s/archive_status/%s.done",
- basedir, fname);
+ snprintf(tmppath, sizeof(tmppath), "archive_status/%s.done",
+ fname);
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(tmppath, NULL, 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create archive status file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
- close(fd);
+ stream->walmethod->close(f, CLOSE_UNLINK);
return false;
}
- close(fd);
+ stream->walmethod->close(f, CLOSE_NORMAL);
return true;
}
@@ -92,79 +92,65 @@ mark_file_as_archived(const char *basedir, const char *fname)
static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
- int f;
+ Walfile *f;
char fn[MAXPGPATH];
- struct stat statbuf;
- char *zerobuf;
- int bytes;
+ ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno);
XLogFileName(current_walfile_name, stream->timeline, segno);
- snprintf(fn, sizeof(fn), "%s/%s%s", stream->basedir, current_walfile_name,
+ snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
stream->partial_suffix ? stream->partial_suffix : "");
- f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (f == -1)
- {
- fprintf(stderr,
- _("%s: could not open transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- return false;
- }
-
- /*
- * Verify that the file is either empty (just created), or a complete
- * XLogSegSize segment. Anything in between indicates a corrupt file.
- */
- if (fstat(f, &statbuf) != 0)
- {
- fprintf(stderr,
- _("%s: could not stat transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
- return false;
- }
- if (statbuf.st_size == XLogSegSize)
- {
- /* File is open and ready to use */
- walfile = f;
- return true;
- }
- if (statbuf.st_size != 0)
- {
- fprintf(stderr,
- _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
- progname, fn, (int) statbuf.st_size, XLogSegSize);
- close(f);
- return false;
- }
- /* New, empty, file. So pad it to 16Mb with zeroes */
- zerobuf = pg_malloc0(XLOG_BLCKSZ);
- for (bytes = 0; bytes < XLogSegSize; bytes += XLOG_BLCKSZ)
+ if (stream->walmethod->existsfile(fn))
{
- if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ /*
+ * Verify that the file is either empty (just created), or a complete
+ * XLogSegSize segment. Anything in between indicates a corrupt file.
+ */
+ size = stream->walmethod->get_file_size(fn);
+ if (size < 0)
+ {
+ fprintf(stderr,
+ _("%s: could not get size of transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ if (size == XLogSegSize)
+ {
+ /* Already padded file. Open it for use */
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0);
+ if (f == NULL)
+ {
+ fprintf(stderr,
+ _("%s: could not open existing transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ walfile = f;
+ return true;
+ }
+ if (size != 0)
{
fprintf(stderr,
- _("%s: could not pad transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- free(zerobuf);
- close(f);
- unlink(fn);
+ _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
+ progname, fn, (int) size, XLogSegSize);
return false;
}
}
- free(zerobuf);
- if (lseek(f, SEEK_SET, 0) != 0)
+ /* No file existed, so create one */
+
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize);
+ if (f == NULL)
{
fprintf(stderr,
- _("%s: could not seek to beginning of transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
+ _("%s: could not open transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
+
walfile = f;
return true;
}
@@ -178,56 +164,50 @@ static bool
close_walfile(StreamCtl *stream, XLogRecPtr pos)
{
off_t currpos;
+ int r;
- if (walfile == -1)
+ if (walfile == NULL)
return true;
- currpos = lseek(walfile, 0, SEEK_CUR);
+ currpos = stream->walmethod->get_current_pos(walfile);
if (currpos == -1)
{
fprintf(stderr,
_("%s: could not determine seek position in file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (close(walfile) != 0)
+ if (stream->partial_suffix)
{
- fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
- return false;
+ if (currpos == XLOG_SEG_SIZE)
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
+ else
+ {
+ fprintf(stderr,
+ _("%s: not renaming \"%s%s\", segment is not complete\n"),
+ progname, current_walfile_name, stream->partial_suffix);
+ r = stream->walmethod->close(walfile, CLOSE_NO_RENAME);
+ }
}
- walfile = -1;
+ else
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
- /*
- * If we finished writing a .partial file, rename it into place.
- */
- if (currpos == XLOG_SEG_SIZE && stream->partial_suffix)
- {
- char oldfn[MAXPGPATH];
- char newfn[MAXPGPATH];
+ walfile = NULL;
- snprintf(oldfn, sizeof(oldfn), "%s/%s%s", stream->basedir, current_walfile_name, stream->partial_suffix);
- snprintf(newfn, sizeof(newfn), "%s/%s", stream->basedir, current_walfile_name);
- if (rename(oldfn, newfn) != 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- return false;
- }
+ if (r != 0)
+ {
+ fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ return false;
}
- else if (stream->partial_suffix)
- fprintf(stderr,
- _("%s: not renaming \"%s%s\", segment is not complete\n"),
- progname, current_walfile_name, stream->partial_suffix);
/*
* Mark file as archived if requested by the caller - pg_basebackup needs
@@ -238,7 +218,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
if (currpos == XLOG_SEG_SIZE && stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, current_walfile_name))
+ if (!mark_file_as_archived(stream, current_walfile_name))
return false;
}
@@ -253,9 +233,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
- char path[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
/*
* Timeline 1 never has a history file. We treat that as if it existed,
@@ -266,31 +244,16 @@ existsTimeLineHistoryFile(StreamCtl *stream)
TLHistoryFileName(histfname, stream->timeline);
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- fd = open(path, O_RDONLY | PG_BINARY, 0);
- if (fd < 0)
- {
- if (errno != ENOENT)
- fprintf(stderr, _("%s: could not open timeline history file \"%s\": %s\n"),
- progname, path, strerror(errno));
- return false;
- }
- else
- {
- close(fd);
- return true;
- }
+ return stream->walmethod->existsfile(histfname);
}
static bool
writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
{
int size = strlen(content);
- char path[MAXPGPATH];
char tmppath[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
+ Walfile *f;
/*
* Check that the server's idea of how timeline history files should be
@@ -304,62 +267,39 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
return false;
}
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- /*
- * Write into a temp file name.
- */
- snprintf(tmppath, MAXPGPATH, "%s.tmp", path);
-
- unlink(tmppath);
-
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(histfname, ".tmp", 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
- errno = 0;
- if ((int) write(fd, content, size) != size)
+ if ((int) stream->walmethod->write(f, content, size) != size)
{
- int save_errno = errno;
+ fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
+ progname, histfname, stream->walmethod->getlasterror());
/*
* If we fail to make the file, delete it to release disk space
*/
- close(fd);
- unlink(tmppath);
- errno = save_errno;
+ stream->walmethod->close(f, CLOSE_UNLINK);
- fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
- close(fd);
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
+ stream->walmethod->close(f, CLOSE_NORMAL);
return false;
}
- if (close(fd) != 0)
+ if (stream->walmethod->close(f, CLOSE_NORMAL) != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
- return false;
- }
-
- /*
- * Now move the completed history file into place with its final name.
- */
- if (rename(tmppath, path) < 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\" to \"%s\": %s\n"),
- progname, tmppath, path, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
@@ -367,7 +307,7 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
if (stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, histfname))
+ if (!mark_file_as_archived(stream, histfname))
return false;
}
@@ -736,10 +676,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
}
error:
- if (walfile != -1 && close(walfile) != 0)
+ if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ walfile = NULL;
return false;
}
@@ -823,12 +763,12 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
* If synchronous option is true, issue sync command as soon as there
* are WAL data which has not been flushed yet.
*/
- if (stream->synchronous && lastFlushPosition < blockpos && walfile != -1)
+ if (stream->synchronous && lastFlushPosition < blockpos && walfile != NULL)
{
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
goto error;
}
lastFlushPosition = blockpos;
@@ -879,7 +819,7 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
/* Check the message type. */
if (copybuf[0] == 'k')
{
- if (!ProcessKeepaliveMsg(conn, copybuf, r, blockpos,
+ if (!ProcessKeepaliveMsg(conn, stream, copybuf, r, blockpos,
&last_status))
goto error;
}
@@ -1032,7 +972,7 @@ CopyStreamReceive(PGconn *conn, long timeout, char **buffer)
* Process the keepalive message.
*/
static bool
-ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status)
{
int pos;
@@ -1059,7 +999,7 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
if (replyRequested && still_sending)
{
if (reportFlushPosition && lastFlushPosition < blockpos &&
- walfile != -1)
+ walfile != NULL)
{
/*
* If a valid flush location needs to be reported, flush the
@@ -1068,10 +1008,10 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
* data has been successfully replicated or not, at the normal
* shutdown of the server.
*/
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
lastFlushPosition = blockpos;
@@ -1129,7 +1069,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* Verify that the initial location in the stream matches where we think
* we are.
*/
- if (walfile == -1)
+ if (walfile == NULL)
{
/* No file open yet */
if (xlogoff != 0)
@@ -1143,12 +1083,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
{
/* More data in existing segment */
- /* XXX: store seek value don't reseek all the time */
- if (lseek(walfile, 0, SEEK_CUR) != xlogoff)
+ if (stream->walmethod->get_current_pos(walfile) != xlogoff)
{
fprintf(stderr,
_("%s: got WAL data offset %08x, expected %08x\n"),
- progname, xlogoff, (int) lseek(walfile, 0, SEEK_CUR));
+ progname, xlogoff, (int) stream->walmethod->get_current_pos(walfile));
return false;
}
}
@@ -1169,7 +1108,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
bytes_to_write = bytes_left;
- if (walfile == -1)
+ if (walfile == NULL)
{
if (!open_walfile(stream, *blockpos))
{
@@ -1178,14 +1117,13 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
}
}
- if (write(walfile,
- copybuf + hdr_len + bytes_written,
- bytes_to_write) != bytes_to_write)
+ if (stream->walmethod->write(walfile, copybuf + hdr_len + bytes_written,
+ bytes_to_write) != bytes_to_write)
{
fprintf(stderr,
_("%s: could not write %u bytes to WAL file \"%s\": %s\n"),
progname, bytes_to_write, current_walfile_name,
- strerror(errno));
+ stream->walmethod->getlasterror());
return false;
}
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 554ff8b..e6db14a 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -13,6 +13,7 @@
#define RECEIVELOG_H
#include "libpq-fe.h"
+#include "walmethods.h"
#include "access/xlogdefs.h"
@@ -39,7 +40,7 @@ typedef struct StreamCtl
stream_stop_callback stream_stop; /* Stop streaming when returns true */
- char *basedir; /* Received segments written to this dir */
+ WalWriteMethod *walmethod; /* How to write the WAL */
char *partial_suffix; /* Suffix appended to partially received files */
} StreamCtl;
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
new file mode 100644
index 0000000..53fce28
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -0,0 +1,822 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.c - implementations of different ways to write received wal
+ *
+ * NOTE! The caller must ensure that only one method is instantiated in
+ * any given program, and that it's only instantiated once!
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.c
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+#include <time.h>
+#include <unistd.h>
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+#include "pgtar.h"
+
+#include "receivelog.h"
+
+/* Size of zlib buffer for .tar.gz */
+#define ZLIB_OUT_SIZE 4096
+
+/*-------------------------------------------------------------------------
+ * WalDirectoryMethod - write wal to a directory looking like pg_xlog
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Global static data for this method
+ */
+typedef struct DirectoryMethodData
+{
+ char *basedir;
+} DirectoryMethodData;
+static DirectoryMethodData *dir_data = NULL;
+
+/*
+ * Local file handle
+ */
+typedef struct DirectoryMethodFile
+{
+ int fd;
+ off_t currpos;
+ char *pathname;
+ char *temp_suffix;
+} DirectoryMethodFile;
+
+static char *
+dir_getlasterror(void)
+{
+ /* Directory method always sets errno, so just use strerror */
+ return strerror(errno);
+}
+
+static Walfile
+dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+ DirectoryMethodFile *f;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, pathname, temp_suffix ? temp_suffix : "");
+
+ fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (fd < 0)
+ return NULL;
+
+ if (pad_to_size)
+ {
+ /* Always pre-pad on regular files */
+ char *zerobuf;
+ int bytes;
+
+ zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ for (bytes = 0; bytes < pad_to_size; bytes += XLOG_BLCKSZ)
+ {
+ if (write(fd, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ {
+ int save_errno = errno;
+
+ pg_free(zerobuf);
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+ pg_free(zerobuf);
+
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
+ }
+
+ f = pg_malloc0(sizeof(DirectoryMethodFile));
+ f->fd = fd;
+ f->currpos = 0;
+ f->pathname = pg_strdup(pathname);
+ if (temp_suffix)
+ f->temp_suffix = pg_strdup(temp_suffix);
+ return f;
+}
+
+static ssize_t
+dir_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+
+ Assert(f != NULL);
+
+ r = write(df->fd, buf, count);
+ if (r > 0)
+ df->currpos += r;
+ return r;
+}
+
+static off_t
+dir_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+
+ /* Use a cached value to prevent lots of reseeks */
+ return ((DirectoryMethodFile *) f)->currpos;
+}
+
+static int
+dir_close(Walfile f, WalCloseMethod method)
+{
+ int r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+ static char tmppath[MAXPGPATH];
+ static char tmppath2[MAXPGPATH];
+
+ Assert(f != NULL);
+
+ r = close(df->fd);
+
+ if (r == 0)
+ {
+ /* Build path to the current version of the file */
+ if (method == CLOSE_NORMAL && df->temp_suffix)
+ {
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, df->pathname);
+ r = rename(tmppath, tmppath2);
+ }
+ else if (method == CLOSE_UNLINK)
+ {
+ /* Unlink the file once it's closed */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix ? df->temp_suffix : "");
+ r = unlink(tmppath);
+ }
+ /* else either CLOSE_NORMAL and no temp suffix, or CLOSE_NO_RENAME */
+ }
+
+ pg_free(df->pathname);
+ if (df->temp_suffix)
+ pg_free(df->temp_suffix);
+ pg_free(df);
+
+ return r;
+}
+
+static int
+dir_fsync(Walfile f)
+{
+ Assert(f != NULL);
+
+ return fsync(((DirectoryMethodFile *) f)->fd);
+}
+
+static ssize_t
+dir_get_file_size(const char *pathname)
+{
+ struct stat statbuf;
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ if (stat(tmppath, &statbuf) != 0)
+ return -1;
+
+ return statbuf.st_size;
+}
+
+static int
+dir_unlink(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ return unlink(tmppath);
+}
+
+static bool
+dir_existsfile(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ fd = open(tmppath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+ return true;
+}
+
+static bool
+dir_finish(void)
+{
+ /* No cleanup necessary */
+ return true;
+}
+
+
+WalWriteMethod *
+CreateWalDirectoryMethod(const char *basedir)
+{
+ WalWriteMethod *method;
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = dir_open_for_write;
+ method->write = dir_write;
+ method->get_current_pos = dir_get_current_pos;
+ method->get_file_size = dir_get_file_size;
+ method->close = dir_close;
+ method->fsync = dir_fsync;
+ method->unlink = dir_unlink;
+ method->existsfile = dir_existsfile;
+ method->finish = dir_finish;
+ method->getlasterror = dir_getlasterror;
+
+ dir_data = pg_malloc0(sizeof(DirectoryMethodData));
+ dir_data->basedir = pg_strdup(basedir);
+
+ return method;
+}
+
+
+/*-------------------------------------------------------------------------
+ * WalTarMethod - write wal to a tar file containing pg_xlog contents
+ *-------------------------------------------------------------------------
+ */
+
+typedef struct TarMethodFile
+{
+ off_t ofs_start; /* Where does the *header* for this file start */
+ off_t currpos;
+ char header[512];
+ char *pathname;
+ size_t pad_to_size;
+} TarMethodFile;
+
+typedef struct TarMethodData
+{
+ char *tarfilename;
+ int fd;
+ int compression;
+ TarMethodFile *currentfile;
+ char lasterror[1024];
+#ifdef HAVE_LIBZ
+ z_streamp zp;
+ void *zlibOut;
+#endif
+} TarMethodData;
+static TarMethodData *tar_data = NULL;
+
+#define tar_clear_error() tar_data->lasterror[0] = '\0'
+#define tar_set_error(msg) strlcpy(tar_data->lasterror, msg, sizeof(tar_data->lasterror))
+
+static char *
+tar_getlasterror(void)
+{
+ /*
+ * If a custom error is set, return that one. Otherwise, assume errno is
+ * set and return that one.
+ */
+ if (tar_data->lasterror[0])
+ return tar_data->lasterror;
+ return strerror(errno);
+}
+
+#ifdef HAVE_LIBZ
+static bool
+tar_write_compressed_data(void *buf, size_t count, bool flush)
+{
+ tar_data->zp->next_in = buf;
+ tar_data->zp->avail_in = count;
+
+ while (tar_data->zp->avail_in || flush)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+ }
+
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (flush)
+ {
+ /* Reset the stream for writing */
+ if (deflateReset(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateReset failed");
+ return false;
+ }
+ }
+
+ return true;
+}
+#endif
+
+static ssize_t
+tar_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /* Tarfile will always be positioned at the end */
+ if (!tar_data->compression)
+ {
+ r = write(tar_data->fd, buf, count);
+ if (r > 0)
+ ((TarMethodFile *) f)->currpos += r;
+ return r;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data((void *) buf, count, false))
+ return -1;
+ ((TarMethodFile *) f)->currpos += count;
+ return count;
+ }
+#endif
+}
+
+static bool
+tar_write_padding_data(TarMethodFile * f, size_t bytes)
+{
+ char *zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ size_t bytesleft = bytes;
+
+ while (bytesleft)
+ {
+ size_t bytestowrite = bytesleft > XLOG_BLCKSZ ? XLOG_BLCKSZ : bytesleft;
+
+ size_t r = tar_write(f, zerobuf, bytestowrite);
+
+ if (r < 0)
+ return false;
+ bytesleft -= r;
+ }
+ return true;
+}
+
+static Walfile
+tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ int save_errno;
+ static char tmppath[MAXPGPATH];
+
+ tar_clear_error();
+
+ if (tar_data->fd < 0)
+ {
+ /*
+ * We open the tar file only when we first try to write to it.
+ */
+ tar_data->fd = open(tar_data->tarfilename,
+ O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (tar_data->fd < 0)
+ return NULL;
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ tar_data->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ tar_data->zp->zalloc = Z_NULL;
+ tar_data->zp->zfree = Z_NULL;
+ tar_data->zp->opaque = Z_NULL;
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+
+ if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
+ }
+#endif
+
+ /* There's no tar header itself, the file starts with regular files */
+ }
+
+ Assert(tar_data->currentfile == NULL);
+ if (tar_data->currentfile != NULL)
+ {
+ tar_set_error("implementation error: tar files can't have more than one open file\n");
+ return NULL;
+ }
+
+ tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ /* Create a header with size set to 0 - we will fill out the size on close */
+ if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
+ {
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ tar_set_error("could not create tar header");
+ return NULL;
+ }
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush existing data */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ return NULL;
+
+ /* Turn off compression for header */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->ofs_start = lseek(tar_data->fd, 0, SEEK_CUR);
+ if (tar_data->currentfile->ofs_start == -1)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ tar_data->currentfile->currpos = 0;
+
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tar_data->currentfile->header, 512) != 512)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Write header through the zlib APIs but with no compression */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return NULL;
+
+ /* Re-enable compression for the rest of the file */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->pathname = pg_strdup(pathname);
+
+ /*
+ * Uncompressed files are padded on creation, but for compression we can't
+ * do that
+ */
+ if (pad_to_size)
+ {
+ if (tar_data->compression)
+ {
+ tar_data->currentfile->pad_to_size = pad_to_size;
+ }
+ else
+ {
+ /* Uncompressed, so pad now */
+ tar_data->currentfile->pad_to_size = 0;
+ tar_write_padding_data(tar_data->currentfile, pad_to_size);
+ /* Seek back to start */
+ if (lseek(tar_data->fd, tar_data->currentfile->ofs_start, SEEK_SET) != tar_data->currentfile->ofs_start)
+ return NULL;
+
+ tar_data->currentfile->currpos = 0;
+ }
+ }
+
+ return tar_data->currentfile;
+}
+
+static ssize_t
+tar_get_file_size(const char *pathname)
+{
+ tar_clear_error();
+
+ /* Currently not used, so not supported */
+ errno = ENOSYS;
+ return -1;
+}
+
+static off_t
+tar_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ return ((TarMethodFile *) f)->currpos;
+}
+
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
+
+static int
+tar_close(Walfile f, WalCloseMethod method)
+{
+ ssize_t filesize;
+ int padding;
+ TarMethodFile *tf = (TarMethodFile *) f;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ if (method == CLOSE_UNLINK)
+ {
+ if (tar_data->compression)
+ {
+ tar_set_error("unlink not supported with compression");
+ return -1;
+ }
+
+ /*
+ * Unlink the file that we just wrote to the tar. We do this by
+ * truncating it to the start of the header. This is safe as we only
+ * allow writing of the very last file.
+ */
+ if (ftruncate(tar_data->fd, tf->ofs_start) != 0)
+ return -1;
+
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+ }
+
+ /*
+ * Pad the file itself with zeroes if necessary. Note that this is
+ * different from the tar format padding -- this is the padding we asked
+ * for when the file was opened.
+ */
+ if (tf->pad_to_size)
+ {
+ size_t sizeleft = tf->pad_to_size - tf->currpos;
+
+ if (sizeleft)
+ {
+ if (!tar_write_padding_data(tf, sizeleft))
+ return -1;
+ }
+ }
+
+ /*
+ * Get the size of the file, and pad the current data up to the nearest
+ * 512 byte boundary.
+ */
+ filesize = tar_get_current_pos(f);
+ padding = ((filesize + 511) & ~511) - filesize;
+ if (padding)
+ {
+ char zerobuf[512];
+
+ MemSet(zerobuf, 0, padding);
+ if (tar_write(f, zerobuf, padding) != padding)
+ return -1;
+ }
+
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush the current buffer */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ {
+ errno = EINVAL;
+ return -1;
+ }
+ }
+#endif
+
+ /*
+ * Now go back and update the header with the correct filesize and
+ * possibly also renaming the file. We overwrite the entire current header
+ * when done, including the checksum.
+ */
+ print_tar_number(&(tf->header[124]), 12, filesize);
+
+ if (method == CLOSE_NORMAL)
+
+ /*
+ * We overwrite it with what it was before if we have no tempname,
+ * since we're going to write the buffer anyway.
+ */
+ strlcpy(&(tf->header[0]), tf->pathname, 100);
+
+ print_tar_number(&(tf->header[148]), 8, tarChecksum(((TarMethodFile *) f)->header));
+ if (lseek(tar_data->fd, tf->ofs_start, SEEK_SET) != ((TarMethodFile *) f)->ofs_start)
+ return -1;
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tf->header, 512) != 512)
+ return -1;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Turn off compression */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+
+ /* Overwrite the header, assuming the size will be the same */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return -1;
+
+ /* Turn compression back on */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+ }
+#endif
+
+ /* Move file pointer back down to end, so we can write the next file */
+ if (lseek(tar_data->fd, 0, SEEK_END) < 0)
+ return -1;
+
+ /* Always fsync on close, so the padding gets fsynced */
+ tar_fsync(f);
+
+ /* Clean up and done */
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+}
+
+static int
+tar_unlink(const char *pathname)
+{
+ tar_clear_error();
+ errno = ENOSYS;
+ return -1;
+}
+
+static bool
+tar_existsfile(const char *pathname)
+{
+ tar_clear_error();
+ /* We only deal with new tarfiles, so nothing externally created exists */
+ return false;
+}
+
+static bool
+tar_finish(void)
+{
+ char zerobuf[1024];
+
+ tar_clear_error();
+
+ if (tar_data->currentfile)
+ {
+ if (tar_close(tar_data->currentfile, CLOSE_NORMAL) != 0)
+ return false;
+ }
+
+ /* A tarfile always ends with two empty blocks */
+ MemSet(zerobuf, 0, sizeof(zerobuf));
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, zerobuf, sizeof(zerobuf)) != sizeof(zerobuf))
+ return false;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data(zerobuf, sizeof(zerobuf), false))
+ return false;
+
+ /* Also flush all data to make sure the gzip stream is finished */
+ tar_data->zp->next_in = NULL;
+ tar_data->zp->avail_in = 0;
+ while (true)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, Z_FINISH);
+
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+ }
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (deflateEnd(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateEnd failed");
+ return false;
+ }
+ }
+#endif
+
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
+
+ if (close(tar_data->fd) != 0)
+ return false;
+
+ tar_data->fd = -1;
+
+ return true;
+}
+
+WalWriteMethod *
+CreateWalTarMethod(const char *tarbase, int compression)
+{
+ WalWriteMethod *method;
+ const char *suffix = (compression > 0) ? ".tar.gz" : ".tar";
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = tar_open_for_write;
+ method->write = tar_write;
+ method->get_current_pos = tar_get_current_pos;
+ method->get_file_size = tar_get_file_size;
+ method->close = tar_close;
+ method->fsync = tar_fsync;
+ method->unlink = tar_unlink;
+ method->existsfile = tar_existsfile;
+ method->finish = tar_finish;
+ method->getlasterror = tar_getlasterror;
+
+ tar_data = pg_malloc0(sizeof(TarMethodData));
+ tar_data->tarfilename = pg_malloc0(strlen(tarbase) + strlen(suffix) + 1);
+ sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
+ tar_data->fd = -1;
+ tar_data->compression = compression;
+ if (compression)
+ tar_data->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ return method;
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
new file mode 100644
index 0000000..9922cfd
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -0,0 +1,46 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.h
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.h
+ *-------------------------------------------------------------------------
+ */
+
+
+typedef void *Walfile;
+
+typedef enum
+{
+ CLOSE_NORMAL,
+ CLOSE_UNLINK,
+ CLOSE_NO_RENAME,
+} WalCloseMethod;
+
+typedef struct WalWriteMethod WalWriteMethod;
+struct WalWriteMethod
+{
+ Walfile(*open_for_write) (const char *pathname, const char *temp_suffix, size_t pad_to_size);
+ int (*close) (Walfile f, WalCloseMethod method);
+ int (*unlink) (const char *pathname);
+ bool (*existsfile) (const char *pathname);
+ ssize_t (*get_file_size) (const char *pathname);
+
+ ssize_t (*write) (Walfile f, const void *buf, size_t count);
+ off_t (*get_current_pos) (Walfile f);
+ int (*fsync) (Walfile f);
+ bool (*finish) (void);
+ char *(*getlasterror) (void);
+};
+
+/*
+ * Available WAL methods:
+ * - WalDirectoryMethod - write WAL to regular files in a standard pg_xlog
+ * - TarDirectoryMethod - write WAL to a tarfile corresponding to pg_xlog
+ * (only implements the methods required for pg_basebackup,
+ * not all those required for pg_receivexlog)
+ */
+WalWriteMethod *CreateWalDirectoryMethod(const char *basedir);
+WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression);
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 45ca400..1d179f0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -22,4 +22,5 @@ enum tarError
extern enum tarError tarCreateHeader(char *h, const char *filename, const char *linktarget,
pgoff_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
+extern void print_tar_number(char *s, int len, uint64 val);
extern int tarChecksum(char *header);
diff --git a/src/port/tar.c b/src/port/tar.c
index 52a2113..f1da959 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -16,7 +16,7 @@
* support only non-negative numbers, so we don't worry about the GNU rules
* for handling negative numbers.)
*/
-static void
+void
print_tar_number(char *s, int len, uint64 val)
{
if (val < (((uint64) 1) << ((len - 1) * 3)))
On Thu, Sep 1, 2016 at 6:58 AM, Magnus Hagander <magnus@hagander.net> wrote:
Attached patch adds support for -X stream to work with .tar and .tar.gz file
formats.
Nice.
If tar mode is specified, a separate pg_xlog.tar (or .tar.gz) file is
created and the data is streamed into it. Regular mode is (should not) see
any changes in how it works.
Could you use XLOGDIR from xlog_internal.h at least?
The implementation creates a "walmethod" for directory and one for tar,
which is basically a set of function pointers that we pass around as part of
the StreamCtl structure. All calls to modify the files are sent through the
current method, using the normal open/read/write calls as it is now for
directories, and the more complicated method for tar and targz.
That looks pretty cool by looking at the code.
The tar method doesn't support all things that are required for
pg_receivexlog, but I don't think it makes any real sense to have support
for pg_receivexlog in tar mode. But it does support all the things that
pg_basebackup needs.
Agreed. Your patch is complicated enough.
Some smaller pieces of functionality like unlinking files on failure and
padding files have been moved into the walmethod because they have to be
differently implemented (we cannot pre-pad a compressed file -- the size
will depend on the compression ration anyway -- for example).
AFAICT we never actually documented that -X stream doesn't work with tar in
the manpage of current versions. Only in the error message. We might want to
fix that in backbranches.
+1 for documenting that in back-branches.
In passing this also fixes an XXX comment about not re-lseeking on the WAL
file all the time -- the walmethod now tracks the current position in the
file in a variable.
Finally, to make this work, the pring_tar_number() function is now exported
from port/tar.c along with the other ones already exported from there.
walmethods.c:387:9: warning: comparison of unsigned expression < 0 is
always false [-Wtautological-compare]
if (r < 0)
This patch is generating a warning for me with clang.
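Presumably the culprit is the size_t r in tar_write_padding_data():
tar_write() returns ssize_t, so storing the result in an unsigned variable
makes the r < 0 check dead code. A minimal sketch of the kind of change
that should silence it:

    /* keep the result signed so the error check is not optimized away */
    ssize_t     r = tar_write(f, zerobuf, bytestowrite);

    if (r < 0)
        return false;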
I have just tested this feature:
$ pg_basebackup -X stream -D data -F t
Which generates that:
$ ls data
base.tar pg_xlog.tar
However after decompressing pg_xlog.tar the segments don't have a correct size:
$ ls -l 0*
-rw------- 1 mpaquier _guest 3.9M Sep 1 16:12 000000010000000000000011
Even if that's filling them with zeros during pg_basebackup when a
segment is done, those should be 16MB to allow users to reuse them
directly.
--
Michael
On Thu, Sep 1, 2016 at 9:19 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Thu, Sep 1, 2016 at 6:58 AM, Magnus Hagander <magnus@hagander.net>
wrote:
Attached patch adds support for -X stream to work with .tar and .tar.gz
file
formats.
Nice.
If tar mode is specified, a separate pg_xlog.tar (or .tar.gz) file is
created and the data is streamed into it. Regular mode is (should not) see
any changes in how it works.
Could you use XLOGDIR from xlog_internal.h at least?
Yes, that makes sense.
The implementation creates a "walmethod" for directory and one for tar,
which is basically a set of function pointers that we pass around as part of
the StreamCtl structure. All calls to modify the files are sent through
the
current method, using the normal open/read/write calls as it is now for
directories, and the more complicated method for tar and targz.
That looks pretty cool by looking at the code.
The tar method doesn't support all things that are required for
pg_receivexlog, but I don't think it makes any real sense to have support
for pg_receivexlog in tar mode. But it does support all the things that
pg_basebackup needs.
Agreed. Your patch is enough complicated.
Some smaller pieces of functionality like unlinking files on failure and
padding files have been moved into the walmethod because they have to be
differently implemented (we cannot pre-pad a compressed file -- the size
will depend on the compression ration anyway -- for example).
AFAICT we never actually documented that -X stream doesn't work with tar
in
the manpage of current versions. Only in the error message. We might
want to
fix that in backbranches.
+1 for documenting that in back-branches.
In passing this also fixes an XXX comment about not re-lseeking on the
WAL
file all the time -- the walmethod now tracks the current position in the
file in a variable.
Finally, to make this work, the pring_tar_number() function is now
exported
from port/tar.c along with the other ones already exported from there.
walmethods.c:387:9: warning: comparison of unsigned expression < 0 is
always false [-Wtautological-compare]
if (r < 0)
This patch is generating a warning for me with clang.
I have just tested this feature:
$ pg_basebackup -X stream -D data -F t
Which generates that:
$ ls data
base.tar pg_xlog.tar
However after decompressing pg_xlog.tar the segments don't have a correct
size:
$ ls -l 0*
-rw------- 1 mpaquier _guest 3.9M Sep 1 16:12 000000010000000000000011
Even if that's filling them with zeros during pg_basebackup when a
segment is done, those should be 16MB to allow users to reuse them
directly.
Huh, that's annoying. I must've broken that when I fixed padding for
compressed files. It forgets the padding when it updates the size of the
tarfile (works fine for compressed files because padding is done at the
end).
That's definitely not intended - it's supposed to be 16Mb. And it actually
writes 16Mb to the tarfile, it's the extraction that doesn't see them. That
also means that if you get more than one member of the tarfile at this
point, it will be corrupt. (It's not corrupt in the .tar.gz case, clearly
my testing of the very last iteration of the patch forgot to doublecheck
this with both).
Oops. Will fix.
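Roughly, I think the fix will be something along these lines in tar_close()
(untested sketch; it also assumes pad_to_size is kept set for pre-padded
uncompressed files rather than being reset to 0 in tar_open_for_write()):

    /* The pre-padded zeroes are already in the tarfile, so the header
     * must claim the full padded size even if currpos is smaller. */
    filesize = tar_get_current_pos(f);
    if (tf->pad_to_size && filesize < tf->pad_to_size)
        filesize = tf->pad_to_size;
    print_tar_number(&(tf->header[124]), 12, filesize);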
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Thu, Sep 1, 2016 at 4:41 PM, Magnus Hagander <magnus@hagander.net> wrote:
That's definitely not intended - it's supposed to be 16Mb. And it actually
writes 16Mb to the tarfile, it's the extraction that doesn't see them. That
also means that if you get more than one member of the tarfile at this
point, it will be corrupt. (It's not corrupt in the .tar.gz case, clearly my
testing of the very last iteration of the patch forgot to doublecheck this
with both).
Oops. Will fix.
If at the same time you could add some tests in pg_basebackup/t, that
would be great :)
--
Michael
On Thu, Sep 1, 2016 at 9:43 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Thu, Sep 1, 2016 at 4:41 PM, Magnus Hagander <magnus@hagander.net>
wrote:
That's definitely not intended - it's supposed to be 16Mb. And it
actually
writes 16Mb to the tarfile, it's the extraction that doesn't see them.
That
also means that if you get more than one member of the tarfile at this
point, it will be corrupt. (It's not corrupt in the .tar.gz case, clearly my
testing of the very last iteration of the patch forgot to doublecheck
this
with both).
Oops. Will fix.
If at the same time you could add some tests in pg_basebackup/t, that
would be great :)
That's definitely on my slightly-longer-term plan. But I've successfully
managed to avoid perl long enough now that looking at the code in those
tests is mighty confusing :) So I need a bunch of readup before I can
figure that out. (yes, that means I've managed to avoid our own discussions
about that style tests on this list quite successfully too :P)
We don't seem to check for similar issues as the one just found in the
existing tests though, do we? As in, we don't actually verify that the xlog
files being streamed are 16Mb? (Or for that matter that the tarfile emitted
by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Thu, Sep 1, 2016 at 5:13 PM, Magnus Hagander <magnus@hagander.net> wrote:
We don't seem to check for similar issues as the one just found in the
existing tests though, do we? As in, we don't actually verify that the xlog
files being streamed are 16Mb? (Or for that matter that the tarfile emitted
by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)
No. There are no checks on the WAL file size (you should use the output
of pg_controldata to see how large the segments should be). For the
tar file, the complication is in its untar... Perl provides some ways
to untar things, though the oldest version that we support in the TAP
tests does not offer that :(
--
Michael
On Thu, Sep 1, 2016 at 2:39 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Thu, Sep 1, 2016 at 5:13 PM, Magnus Hagander <magnus@hagander.net>
wrote:
We don't seem to check for similar issues as the one just found in the
existing tests though, do we? As in, we don't actually verify that the xlog
files being streamed are 16Mb? (Or for that matter that the tarfile
emitted
by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)
No. There is no checks on the WAL file size (you should use the output
of pg_controldata to see how large the segments should be). For the
tar file, the complication is in its untar... Perl provides some ways
to untar things, though the oldest version that we support in the TAP
tests does not offer that :(
Ugh. That would be nice to have, but I think that's outside the scope of
this patch.
PFA is an updated version of this patch that:
* documents a magic value passed to zlib (which their documentation describes
as a magic value, but provides no define for; see the note after this list)
* fixes the padding of tar files
* adds a most basic test that the -X stream -Ft does produce a tarfile
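The magic value being referred to is presumably the windowBits argument to
deflateInit2(): the relevant part is the 16 added to the maximum window size
of 15, which per the zlib manual selects a gzip wrapper rather than a raw
zlib stream, but for which zlib provides no named constant. A sketch of the
call with the comment spelled out:

    /* windowBits 15 = maximum window; adding 16 selects the gzip format
     * (documented in zlib.h as a magic offset, with no #define) */
    if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED,
                     15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
        tar_set_error("deflateInit2 failed");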
As for using XLOGDIR to drive the name of the tarfile: pg_basebackup is
already hardcoded to use pg_xlog. And so are the tests. We probably want to
fix that, but that's a separate step and this patch will be easier to
review and test if we keep it out for now.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Attachments:
pg_basebackup_stream_tar_v2.patch (text/x-patch)
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 03615da..981d201 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
target directory, the tar contents will be written to
standard output, suitable for piping to for example
<productname>gzip</productname>. This is only possible if
- the cluster has no additional tablespaces.
+ the cluster has no additional tablespaces and transaction
+ log streaming is not used.
</para>
</listitem>
</varlistentry>
@@ -323,6 +324,10 @@ PostgreSQL documentation
If the log has been rotated when it's time to transfer it, the
backup will fail and be unusable.
</para>
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
</listitem>
</varlistentry>
@@ -339,6 +344,9 @@ PostgreSQL documentation
client can keep up with transaction log received, using this mode
requires no extra transaction logs to be saved on the master.
</para>
+ <para>The transaction log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
all: pg_basebackup pg_receivexlog pg_recvlogical
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 351a420..58c0821 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -365,7 +365,7 @@ typedef struct
{
PGconn *bgconn;
XLogRecPtr startptr;
- char xlogdir[MAXPGPATH];
+ char xlog[MAXPGPATH]; /* directory or tarfile depending on mode */
char *sysidentifier;
int timeline;
} logstreamer_param;
@@ -383,9 +383,13 @@ LogStreamerMain(logstreamer_param *param)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = false;
stream.mark_done = true;
- stream.basedir = param->xlogdir;
stream.partial_suffix = NULL;
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel);
+
if (!ReceiveXlogStream(param->bgconn, &stream))
/*
@@ -395,6 +399,14 @@ LogStreamerMain(logstreamer_param *param)
*/
return 1;
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return 1;
+ }
+
PQfinish(param->bgconn);
return 0;
}
@@ -445,22 +457,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
/* Error message already written in GetConnection() */
exit(1);
- snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
- /*
- * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
- * basedir/pg_xlog as the directory entry in the tar file may arrive
- * later.
- */
- snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
- basedir);
+ snprintf(param->xlog, sizeof(param->xlog), "%s/pg_xlog", basedir);
- if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ if (format == 'p')
{
- fprintf(stderr,
- _("%s: could not create directory \"%s\": %s\n"),
- progname, statusdir, strerror(errno));
- disconnect_and_exit(1);
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
+
+ if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ {
+ fprintf(stderr,
+ _("%s: could not create directory \"%s\": %s\n"),
+ progname, statusdir, strerror(errno));
+ disconnect_and_exit(1);
+ }
}
/*
@@ -2110,16 +2125,6 @@ main(int argc, char **argv)
exit(1);
}
- if (format != 'p' && streamwal)
- {
- fprintf(stderr,
- _("%s: WAL streaming can only be used in plain mode\n"),
- progname);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
- progname);
- exit(1);
- }
-
if (replication_slot && !streamwal)
{
fprintf(stderr,
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 7f7ee9d..9b4c101 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -337,11 +337,19 @@ StreamLog(void)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = synchronous;
stream.mark_done = false;
- stream.basedir = basedir;
+ stream.walmethod = CreateWalDirectoryMethod(basedir);
stream.partial_suffix = ".partial";
ReceiveXlogStream(conn, &stream);
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return;
+ }
+
PQfinish(conn);
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 062730b..9197eeb 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -26,7 +26,7 @@
/* fd and filename for currently open WAL file */
-static int walfile = -1;
+static Walfile *walfile = NULL;
static char current_walfile_name[MAXPGPATH] = "";
static bool reportFlushPosition = false;
static XLogRecPtr lastFlushPosition = InvalidXLogRecPtr;
@@ -37,7 +37,7 @@ static PGresult *HandleCopyStream(PGconn *conn, StreamCtl *stream,
XLogRecPtr *stoppos);
static int CopyStreamPoll(PGconn *conn, long timeout_ms);
static int CopyStreamReceive(PGconn *conn, long timeout, char **buffer);
-static bool ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+static bool ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status);
static bool ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr *blockpos);
@@ -52,33 +52,33 @@ static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos,
uint32 *timeline);
static bool
-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
{
- int fd;
+ Walfile *f;
static char tmppath[MAXPGPATH];
- snprintf(tmppath, sizeof(tmppath), "%s/archive_status/%s.done",
- basedir, fname);
+ snprintf(tmppath, sizeof(tmppath), "archive_status/%s.done",
+ fname);
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(tmppath, NULL, 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create archive status file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
- close(fd);
+ stream->walmethod->close(f, CLOSE_UNLINK);
return false;
}
- close(fd);
+ stream->walmethod->close(f, CLOSE_NORMAL);
return true;
}
@@ -92,79 +92,65 @@ mark_file_as_archived(const char *basedir, const char *fname)
static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
- int f;
+ Walfile *f;
char fn[MAXPGPATH];
- struct stat statbuf;
- char *zerobuf;
- int bytes;
+ ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno);
XLogFileName(current_walfile_name, stream->timeline, segno);
- snprintf(fn, sizeof(fn), "%s/%s%s", stream->basedir, current_walfile_name,
+ snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
stream->partial_suffix ? stream->partial_suffix : "");
- f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (f == -1)
- {
- fprintf(stderr,
- _("%s: could not open transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- return false;
- }
-
- /*
- * Verify that the file is either empty (just created), or a complete
- * XLogSegSize segment. Anything in between indicates a corrupt file.
- */
- if (fstat(f, &statbuf) != 0)
- {
- fprintf(stderr,
- _("%s: could not stat transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
- return false;
- }
- if (statbuf.st_size == XLogSegSize)
- {
- /* File is open and ready to use */
- walfile = f;
- return true;
- }
- if (statbuf.st_size != 0)
- {
- fprintf(stderr,
- _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
- progname, fn, (int) statbuf.st_size, XLogSegSize);
- close(f);
- return false;
- }
- /* New, empty, file. So pad it to 16Mb with zeroes */
- zerobuf = pg_malloc0(XLOG_BLCKSZ);
- for (bytes = 0; bytes < XLogSegSize; bytes += XLOG_BLCKSZ)
+ if (stream->walmethod->existsfile(fn))
{
- if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ /*
+ * Verify that the file is either empty (just created), or a complete
+ * XLogSegSize segment. Anything in between indicates a corrupt file.
+ */
+ size = stream->walmethod->get_file_size(fn);
+ if (size < 0)
+ {
+ fprintf(stderr,
+ _("%s: could not get size of transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ if (size == XLogSegSize)
+ {
+ /* Already padded file. Open it for use */
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0);
+ if (f == NULL)
+ {
+ fprintf(stderr,
+ _("%s: could not open existing transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ walfile = f;
+ return true;
+ }
+ if (size != 0)
{
fprintf(stderr,
- _("%s: could not pad transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- free(zerobuf);
- close(f);
- unlink(fn);
+ _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
+ progname, fn, (int) size, XLogSegSize);
return false;
}
}
- free(zerobuf);
- if (lseek(f, SEEK_SET, 0) != 0)
+ /* No file existed, so create one */
+
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize);
+ if (f == NULL)
{
fprintf(stderr,
- _("%s: could not seek to beginning of transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
+ _("%s: could not open transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
+
walfile = f;
return true;
}
@@ -178,56 +164,50 @@ static bool
close_walfile(StreamCtl *stream, XLogRecPtr pos)
{
off_t currpos;
+ int r;
- if (walfile == -1)
+ if (walfile == NULL)
return true;
- currpos = lseek(walfile, 0, SEEK_CUR);
+ currpos = stream->walmethod->get_current_pos(walfile);
if (currpos == -1)
{
fprintf(stderr,
_("%s: could not determine seek position in file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (close(walfile) != 0)
+ if (stream->partial_suffix)
{
- fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
- return false;
+ if (currpos == XLOG_SEG_SIZE)
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
+ else
+ {
+ fprintf(stderr,
+ _("%s: not renaming \"%s%s\", segment is not complete\n"),
+ progname, current_walfile_name, stream->partial_suffix);
+ r = stream->walmethod->close(walfile, CLOSE_NO_RENAME);
+ }
}
- walfile = -1;
+ else
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
- /*
- * If we finished writing a .partial file, rename it into place.
- */
- if (currpos == XLOG_SEG_SIZE && stream->partial_suffix)
- {
- char oldfn[MAXPGPATH];
- char newfn[MAXPGPATH];
+ walfile = NULL;
- snprintf(oldfn, sizeof(oldfn), "%s/%s%s", stream->basedir, current_walfile_name, stream->partial_suffix);
- snprintf(newfn, sizeof(newfn), "%s/%s", stream->basedir, current_walfile_name);
- if (rename(oldfn, newfn) != 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- return false;
- }
+ if (r != 0)
+ {
+ fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ return false;
}
- else if (stream->partial_suffix)
- fprintf(stderr,
- _("%s: not renaming \"%s%s\", segment is not complete\n"),
- progname, current_walfile_name, stream->partial_suffix);
/*
* Mark file as archived if requested by the caller - pg_basebackup needs
@@ -238,7 +218,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
if (currpos == XLOG_SEG_SIZE && stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, current_walfile_name))
+ if (!mark_file_as_archived(stream, current_walfile_name))
return false;
}
@@ -253,9 +233,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
- char path[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
/*
* Timeline 1 never has a history file. We treat that as if it existed,
@@ -266,31 +244,16 @@ existsTimeLineHistoryFile(StreamCtl *stream)
TLHistoryFileName(histfname, stream->timeline);
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- fd = open(path, O_RDONLY | PG_BINARY, 0);
- if (fd < 0)
- {
- if (errno != ENOENT)
- fprintf(stderr, _("%s: could not open timeline history file \"%s\": %s\n"),
- progname, path, strerror(errno));
- return false;
- }
- else
- {
- close(fd);
- return true;
- }
+ return stream->walmethod->existsfile(histfname);
}
static bool
writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
{
int size = strlen(content);
- char path[MAXPGPATH];
char tmppath[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
+ Walfile *f;
/*
* Check that the server's idea of how timeline history files should be
@@ -304,62 +267,39 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
return false;
}
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- /*
- * Write into a temp file name.
- */
- snprintf(tmppath, MAXPGPATH, "%s.tmp", path);
-
- unlink(tmppath);
-
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(histfname, ".tmp", 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
- errno = 0;
- if ((int) write(fd, content, size) != size)
+ if ((int) stream->walmethod->write(f, content, size) != size)
{
- int save_errno = errno;
+ fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
+ progname, histfname, stream->walmethod->getlasterror());
/*
* If we fail to make the file, delete it to release disk space
*/
- close(fd);
- unlink(tmppath);
- errno = save_errno;
+ stream->walmethod->close(f, CLOSE_UNLINK);
- fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
- close(fd);
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
+ stream->walmethod->close(f, CLOSE_NORMAL);
return false;
}
- if (close(fd) != 0)
+ if (stream->walmethod->close(f, CLOSE_NORMAL) != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
- return false;
- }
-
- /*
- * Now move the completed history file into place with its final name.
- */
- if (rename(tmppath, path) < 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\" to \"%s\": %s\n"),
- progname, tmppath, path, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
@@ -367,7 +307,7 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
if (stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, histfname))
+ if (!mark_file_as_archived(stream, histfname))
return false;
}
@@ -736,10 +676,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
}
error:
- if (walfile != -1 && close(walfile) != 0)
+ if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ walfile = NULL;
return false;
}
@@ -823,12 +763,12 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
* If synchronous option is true, issue sync command as soon as there
* are WAL data which has not been flushed yet.
*/
- if (stream->synchronous && lastFlushPosition < blockpos && walfile != -1)
+ if (stream->synchronous && lastFlushPosition < blockpos && walfile != NULL)
{
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
goto error;
}
lastFlushPosition = blockpos;
@@ -879,7 +819,7 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
/* Check the message type. */
if (copybuf[0] == 'k')
{
- if (!ProcessKeepaliveMsg(conn, copybuf, r, blockpos,
+ if (!ProcessKeepaliveMsg(conn, stream, copybuf, r, blockpos,
&last_status))
goto error;
}
@@ -1032,7 +972,7 @@ CopyStreamReceive(PGconn *conn, long timeout, char **buffer)
* Process the keepalive message.
*/
static bool
-ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status)
{
int pos;
@@ -1059,7 +999,7 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
if (replyRequested && still_sending)
{
if (reportFlushPosition && lastFlushPosition < blockpos &&
- walfile != -1)
+ walfile != NULL)
{
/*
* If a valid flush location needs to be reported, flush the
@@ -1068,10 +1008,10 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
* data has been successfully replicated or not, at the normal
* shutdown of the server.
*/
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
lastFlushPosition = blockpos;
@@ -1129,7 +1069,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* Verify that the initial location in the stream matches where we think
* we are.
*/
- if (walfile == -1)
+ if (walfile == NULL)
{
/* No file open yet */
if (xlogoff != 0)
@@ -1143,12 +1083,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
{
/* More data in existing segment */
- /* XXX: store seek value don't reseek all the time */
- if (lseek(walfile, 0, SEEK_CUR) != xlogoff)
+ if (stream->walmethod->get_current_pos(walfile) != xlogoff)
{
fprintf(stderr,
_("%s: got WAL data offset %08x, expected %08x\n"),
- progname, xlogoff, (int) lseek(walfile, 0, SEEK_CUR));
+ progname, xlogoff, (int) stream->walmethod->get_current_pos(walfile));
return false;
}
}
@@ -1169,7 +1108,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
bytes_to_write = bytes_left;
- if (walfile == -1)
+ if (walfile == NULL)
{
if (!open_walfile(stream, *blockpos))
{
@@ -1178,14 +1117,13 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
}
}
- if (write(walfile,
- copybuf + hdr_len + bytes_written,
- bytes_to_write) != bytes_to_write)
+ if (stream->walmethod->write(walfile, copybuf + hdr_len + bytes_written,
+ bytes_to_write) != bytes_to_write)
{
fprintf(stderr,
_("%s: could not write %u bytes to WAL file \"%s\": %s\n"),
progname, bytes_to_write, current_walfile_name,
- strerror(errno));
+ stream->walmethod->getlasterror());
return false;
}
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 554ff8b..e6db14a 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -13,6 +13,7 @@
#define RECEIVELOG_H
#include "libpq-fe.h"
+#include "walmethods.h"
#include "access/xlogdefs.h"
@@ -39,7 +40,7 @@ typedef struct StreamCtl
stream_stop_callback stream_stop; /* Stop streaming when returns true */
- char *basedir; /* Received segments written to this dir */
+ WalWriteMethod *walmethod; /* How to write the WAL */
char *partial_suffix; /* Suffix appended to partially received files */
} StreamCtl;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 6c33936..797076d 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 51;
+use Test::More tests => 53;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -189,6 +189,10 @@ $node->command_ok(
'pg_basebackup -X stream runs');
ok(grep(/^[0-9A-F]{24}$/, slurp_dir("$tempdir/backupxf/pg_xlog")),
'WAL files copied');
+$node->command_ok(
+ [ 'pg_basebackup', '-D', "$tempdir/backupxst", '-X', 'stream', '-Ft' ],
+ 'pg_basebackup -X stream runs in tar mode');
+ok(-f "$tempdir/backupxst/pg_xlog.tar");
$node->command_fails(
[ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
new file mode 100644
index 0000000..7a5c6e3
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -0,0 +1,838 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.c - implementations of different ways to write received wal
+ *
+ * NOTE! The caller must ensure that only one method is instantiated in
+ * any given program, and that it's only instantiated once!
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.c
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+#include <time.h>
+#include <unistd.h>
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+#include "pgtar.h"
+
+#include "receivelog.h"
+
+/* Size of zlib buffer for .tar.gz */
+#define ZLIB_OUT_SIZE 4096
+
+/*-------------------------------------------------------------------------
+ * WalDirectoryMethod - write wal to a directory looking like pg_xlog
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Global static data for this method
+ */
+typedef struct DirectoryMethodData
+{
+ char *basedir;
+} DirectoryMethodData;
+static DirectoryMethodData *dir_data = NULL;
+
+/*
+ * Local file handle
+ */
+typedef struct DirectoryMethodFile
+{
+ int fd;
+ off_t currpos;
+ char *pathname;
+ char *temp_suffix;
+} DirectoryMethodFile;
+
+static char *
+dir_getlasterror(void)
+{
+ /* Directory method always sets errno, so just use strerror */
+ return strerror(errno);
+}
+
+static Walfile
+dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+ DirectoryMethodFile *f;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, pathname, temp_suffix ? temp_suffix : "");
+
+ fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (fd < 0)
+ return NULL;
+
+ if (pad_to_size)
+ {
+ /* Always pre-pad on regular files */
+ char *zerobuf;
+ int bytes;
+
+ zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ for (bytes = 0; bytes < pad_to_size; bytes += XLOG_BLCKSZ)
+ {
+ if (write(fd, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ {
+ int save_errno = errno;
+
+ pg_free(zerobuf);
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+ pg_free(zerobuf);
+
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
+ }
+
+ f = pg_malloc0(sizeof(DirectoryMethodFile));
+ f->fd = fd;
+ f->currpos = 0;
+ f->pathname = pg_strdup(pathname);
+ if (temp_suffix)
+ f->temp_suffix = pg_strdup(temp_suffix);
+ return f;
+}
+
+static ssize_t
+dir_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+
+ Assert(f != NULL);
+
+ r = write(df->fd, buf, count);
+ if (r > 0)
+ df->currpos += r;
+ return r;
+}
+
+static off_t
+dir_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+
+ /* Use a cached value to prevent lots of reseeks */
+ return ((DirectoryMethodFile *) f)->currpos;
+}
+
+static int
+dir_close(Walfile f, WalCloseMethod method)
+{
+ int r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+ static char tmppath[MAXPGPATH];
+ static char tmppath2[MAXPGPATH];
+
+ Assert(f != NULL);
+
+ r = close(df->fd);
+
+ if (r == 0)
+ {
+ /* Build path to the current version of the file */
+ if (method == CLOSE_NORMAL && df->temp_suffix)
+ {
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, df->pathname);
+ r = rename(tmppath, tmppath2);
+ }
+ else if (method == CLOSE_UNLINK)
+ {
+ /* Unlink the file once it's closed */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix ? df->temp_suffix : "");
+ r = unlink(tmppath);
+ }
+ /* else either CLOSE_NORMAL and no temp suffix, or CLOSE_NO_RENAME */
+ }
+
+ pg_free(df->pathname);
+ if (df->temp_suffix)
+ pg_free(df->temp_suffix);
+ pg_free(df);
+
+ return r;
+}
+
+static int
+dir_fsync(Walfile f)
+{
+ Assert(f != NULL);
+
+ return fsync(((DirectoryMethodFile *) f)->fd);
+}
+
+static ssize_t
+dir_get_file_size(const char *pathname)
+{
+ struct stat statbuf;
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ if (stat(tmppath, &statbuf) != 0)
+ return -1;
+
+ return statbuf.st_size;
+}
+
+static int
+dir_unlink(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ return unlink(tmppath);
+}
+
+static bool
+dir_existsfile(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ fd = open(tmppath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+ return true;
+}
+
+static bool
+dir_finish(void)
+{
+ /* No cleanup necessary */
+ return true;
+}
+
+
+WalWriteMethod *
+CreateWalDirectoryMethod(const char *basedir)
+{
+ WalWriteMethod *method;
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = dir_open_for_write;
+ method->write = dir_write;
+ method->get_current_pos = dir_get_current_pos;
+ method->get_file_size = dir_get_file_size;
+ method->close = dir_close;
+ method->fsync = dir_fsync;
+ method->unlink = dir_unlink;
+ method->existsfile = dir_existsfile;
+ method->finish = dir_finish;
+ method->getlasterror = dir_getlasterror;
+
+ dir_data = pg_malloc0(sizeof(DirectoryMethodData));
+ dir_data->basedir = pg_strdup(basedir);
+
+ return method;
+}
+
+
+/*-------------------------------------------------------------------------
+ * WalTarMethod - write wal to a tar file containing pg_xlog contents
+ *-------------------------------------------------------------------------
+ */
+
+typedef struct TarMethodFile
+{
+ off_t ofs_start; /* Where does the *header* for this file start */
+ off_t currpos;
+ char header[512];
+ char *pathname;
+ size_t pad_to_size;
+} TarMethodFile;
+
+typedef struct TarMethodData
+{
+ char *tarfilename;
+ int fd;
+ int compression;
+ TarMethodFile *currentfile;
+ char lasterror[1024];
+#ifdef HAVE_LIBZ
+ z_streamp zp;
+ void *zlibOut;
+#endif
+} TarMethodData;
+static TarMethodData *tar_data = NULL;
+
+#define tar_clear_error() tar_data->lasterror[0] = '\0'
+#define tar_set_error(msg) strlcpy(tar_data->lasterror, msg, sizeof(tar_data->lasterror))
+
+static char *
+tar_getlasterror(void)
+{
+ /*
+ * If a custom error is set, return that one. Otherwise, assume errno is
+ * set and return that one.
+ */
+ if (tar_data->lasterror[0])
+ return tar_data->lasterror;
+ return strerror(errno);
+}
+
+#ifdef HAVE_LIBZ
+static bool
+tar_write_compressed_data(void *buf, size_t count, bool flush)
+{
+ tar_data->zp->next_in = buf;
+ tar_data->zp->avail_in = count;
+
+ while (tar_data->zp->avail_in || flush)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+ }
+
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (flush)
+ {
+ /* Reset the stream for writing */
+ if (deflateReset(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateReset failed");
+ return false;
+ }
+ }
+
+ return true;
+}
+#endif
+
+static ssize_t
+tar_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /* Tarfile will always be positioned at the end */
+ if (!tar_data->compression)
+ {
+ r = write(tar_data->fd, buf, count);
+ if (r > 0)
+ ((TarMethodFile *) f)->currpos += r;
+ return r;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data((void *) buf, count, false))
+ return -1;
+ ((TarMethodFile *) f)->currpos += count;
+ return count;
+ }
+#endif
+}
+
+static bool
+tar_write_padding_data(TarMethodFile * f, size_t bytes)
+{
+ char *zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ size_t bytesleft = bytes;
+
+ while (bytesleft)
+ {
+ size_t bytestowrite = bytesleft > XLOG_BLCKSZ ? XLOG_BLCKSZ : bytesleft;
+
+ size_t r = tar_write(f, zerobuf, bytestowrite);
+
+ if (r < 0)
+ return false;
+ bytesleft -= r;
+ }
+ return true;
+}
+
+static Walfile
+tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ int save_errno;
+ static char tmppath[MAXPGPATH];
+
+ tar_clear_error();
+
+ if (tar_data->fd < 0)
+ {
+ /*
+ * We open the tar file only when we first try to write to it.
+ */
+ tar_data->fd = open(tar_data->tarfilename,
+ O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (tar_data->fd < 0)
+ return NULL;
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ tar_data->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ tar_data->zp->zalloc = Z_NULL;
+ tar_data->zp->zfree = Z_NULL;
+ tar_data->zp->opaque = Z_NULL;
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+
+ /*
+ * Initialize deflation library. Adding the magic value 16 to the
+ * default 15 for the windowBits parameter makes the output be
+ * gzip instead of zlib.
+ */
+ if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
+ }
+#endif
+
+ /* There's no tar header itself, the file starts with regular files */
+ }
+
+ Assert(tar_data->currentfile == NULL);
+ if (tar_data->currentfile != NULL)
+ {
+ tar_set_error("implementation error: tar files can't have more than one open file\n");
+ return NULL;
+ }
+
+ tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ /* Create a header with size set to 0 - we will fill out the size on close */
+ if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
+ {
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ tar_set_error("could not create tar header");
+ return NULL;
+ }
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush existing data */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ return NULL;
+
+ /* Turn off compression for header */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->ofs_start = lseek(tar_data->fd, 0, SEEK_CUR);
+ if (tar_data->currentfile->ofs_start == -1)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ tar_data->currentfile->currpos = 0;
+
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tar_data->currentfile->header, 512) != 512)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Write header through the zlib APIs but with no compression */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return NULL;
+
+ /* Re-enable compression for the rest of the file */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->pathname = pg_strdup(pathname);
+
+ /*
+ * Uncompressed files are padded on creation, but for compression we can't
+ * do that
+ */
+ if (pad_to_size)
+ {
+ tar_data->currentfile->pad_to_size = pad_to_size;
+ if (!tar_data->compression)
+ {
+ /* Uncompressed, so pad now */
+ tar_write_padding_data(tar_data->currentfile, pad_to_size);
+ /* Seek back to start */
+ if (lseek(tar_data->fd, tar_data->currentfile->ofs_start, SEEK_SET) != tar_data->currentfile->ofs_start)
+ return NULL;
+
+ tar_data->currentfile->currpos = 0;
+ }
+ }
+
+ return tar_data->currentfile;
+}
+
+static ssize_t
+tar_get_file_size(const char *pathname)
+{
+ tar_clear_error();
+
+ /* Currently not used, so not supported */
+ errno = ENOSYS;
+ return -1;
+}
+
+static off_t
+tar_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ return ((TarMethodFile *) f)->currpos;
+}
+
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
+
+static int
+tar_close(Walfile f, WalCloseMethod method)
+{
+ ssize_t filesize;
+ int padding;
+ TarMethodFile *tf = (TarMethodFile *) f;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ if (method == CLOSE_UNLINK)
+ {
+ if (tar_data->compression)
+ {
+ tar_set_error("unlink not supported with compression");
+ return -1;
+ }
+
+ /*
+ * Unlink the file that we just wrote to the tar. We do this by
+ * truncating it to the start of the header. This is safe as we only
+ * allow writing of the very last file.
+ */
+ if (ftruncate(tar_data->fd, tf->ofs_start) != 0)
+ return -1;
+
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+ }
+
+ /*
+ * Pad the file itself with zeroes if necessary. Note that this is
+ * different from the tar format padding -- this is the padding we asked
+ * for when the file was opened.
+ */
+ if (tf->pad_to_size)
+ {
+ if (tar_data->compression)
+ {
+ /*
+ * A compressed tarfile is padded on close since we cannot know
+ * the size of the compressed output until the end.
+ */
+ size_t sizeleft = tf->pad_to_size - tf->currpos;
+
+ if (sizeleft)
+ {
+ if (!tar_write_padding_data(tf, sizeleft))
+ return -1;
+ }
+ }
+ else
+ {
+ /*
+ * An uncompressed tarfile was padded on creation, so just adjust
+ * the current position as if we seeked to the end.
+ */
+ tf->currpos = tf->pad_to_size;
+ }
+ }
+
+ /*
+ * Get the size of the file, and pad the current data up to the nearest
+ * 512 byte boundary.
+ */
+ filesize = tar_get_current_pos(f);
+ padding = ((filesize + 511) & ~511) - filesize;
+ if (padding)
+ {
+ char zerobuf[512];
+
+ MemSet(zerobuf, 0, padding);
+ if (tar_write(f, zerobuf, padding) != padding)
+ return -1;
+ }
+
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush the current buffer */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ {
+ errno = EINVAL;
+ return -1;
+ }
+ }
+#endif
+
+ /*
+ * Now go back and update the header with the correct filesize and
+ * possibly also renaming the file. We overwrite the entire current header
+ * when done, including the checksum.
+ */
+ print_tar_number(&(tf->header[124]), 12, filesize);
+
+ if (method == CLOSE_NORMAL)
+
+ /*
+ * We overwrite it with what it was before if we have no tempname,
+ * since we're going to write the buffer anyway.
+ */
+ strlcpy(&(tf->header[0]), tf->pathname, 100);
+
+ print_tar_number(&(tf->header[148]), 8, tarChecksum(((TarMethodFile *) f)->header));
+ if (lseek(tar_data->fd, tf->ofs_start, SEEK_SET) != ((TarMethodFile *) f)->ofs_start)
+ return -1;
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tf->header, 512) != 512)
+ return -1;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Turn off compression */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+
+ /* Overwrite the header, assuming the size will be the same */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return -1;
+
+ /* Turn compression back on */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+ }
+#endif
+
+ /* Move file pointer back down to end, so we can write the next file */
+ if (lseek(tar_data->fd, 0, SEEK_END) < 0)
+ return -1;
+
+ /* Always fsync on close, so the padding gets fsynced */
+ tar_fsync(f);
+
+ /* Clean up and done */
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+}
+
+static int
+tar_unlink(const char *pathname)
+{
+ tar_clear_error();
+ errno = ENOSYS;
+ return -1;
+}
+
+static bool
+tar_existsfile(const char *pathname)
+{
+ tar_clear_error();
+ /* We only deal with new tarfiles, so nothing externally created exists */
+ return false;
+}
+
+static bool
+tar_finish(void)
+{
+ char zerobuf[1024];
+
+ tar_clear_error();
+
+ if (tar_data->currentfile)
+ {
+ if (tar_close(tar_data->currentfile, CLOSE_NORMAL) != 0)
+ return false;
+ }
+
+ /* A tarfile always ends with two empty blocks */
+ MemSet(zerobuf, 0, sizeof(zerobuf));
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, zerobuf, sizeof(zerobuf)) != sizeof(zerobuf))
+ return false;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data(zerobuf, sizeof(zerobuf), false))
+ return false;
+
+ /* Also flush all data to make sure the gzip stream is finished */
+ tar_data->zp->next_in = NULL;
+ tar_data->zp->avail_in = 0;
+ while (true)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, Z_FINISH);
+
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+ }
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (deflateEnd(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateEnd failed");
+ return false;
+ }
+ }
+#endif
+
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
+
+ if (close(tar_data->fd) != 0)
+ return false;
+
+ tar_data->fd = -1;
+
+ return true;
+}
+
+WalWriteMethod *
+CreateWalTarMethod(const char *tarbase, int compression)
+{
+ WalWriteMethod *method;
+ const char *suffix = (compression > 0) ? ".tar.gz" : ".tar";
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = tar_open_for_write;
+ method->write = tar_write;
+ method->get_current_pos = tar_get_current_pos;
+ method->get_file_size = tar_get_file_size;
+ method->close = tar_close;
+ method->fsync = tar_fsync;
+ method->unlink = tar_unlink;
+ method->existsfile = tar_existsfile;
+ method->finish = tar_finish;
+ method->getlasterror = tar_getlasterror;
+
+ tar_data = pg_malloc0(sizeof(TarMethodData));
+ tar_data->tarfilename = pg_malloc0(strlen(tarbase) + strlen(suffix) + 1);
+ sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
+ tar_data->fd = -1;
+ tar_data->compression = compression;
+ if (compression)
+ tar_data->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ return method;
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
new file mode 100644
index 0000000..9922cfd
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -0,0 +1,46 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.h
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.h
+ *-------------------------------------------------------------------------
+ */
+
+
+typedef void *Walfile;
+
+typedef enum
+{
+ CLOSE_NORMAL,
+ CLOSE_UNLINK,
+ CLOSE_NO_RENAME,
+} WalCloseMethod;
+
+typedef struct WalWriteMethod WalWriteMethod;
+struct WalWriteMethod
+{
+ Walfile(*open_for_write) (const char *pathname, const char *temp_suffix, size_t pad_to_size);
+ int (*close) (Walfile f, WalCloseMethod method);
+ int (*unlink) (const char *pathname);
+ bool (*existsfile) (const char *pathname);
+ ssize_t (*get_file_size) (const char *pathname);
+
+ ssize_t (*write) (Walfile f, const void *buf, size_t count);
+ off_t (*get_current_pos) (Walfile f);
+ int (*fsync) (Walfile f);
+ bool (*finish) (void);
+ char *(*getlasterror) (void);
+};
+
+/*
+ * Available WAL methods:
+ * - WalDirectoryMethod - write WAL to regular files in a standard pg_xlog
+ * - TarDirectoryMethod - write WAL to a tarfile corresponding to pg_xlog
+ * (only implements the methods required for pg_basebackup,
+ * not all those required for pg_receivexlog)
+ */
+WalWriteMethod *CreateWalDirectoryMethod(const char *basedir);
+WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression);
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 45ca400..1d179f0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -22,4 +22,5 @@ enum tarError
extern enum tarError tarCreateHeader(char *h, const char *filename, const char *linktarget,
pgoff_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
+extern void print_tar_number(char *s, int len, uint64 val);
extern int tarChecksum(char *header);
diff --git a/src/port/tar.c b/src/port/tar.c
index 52a2113..f1da959 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -16,7 +16,7 @@
* support only non-negative numbers, so we don't worry about the GNU rules
* for handling negative numbers.)
*/
-static void
+void
print_tar_number(char *s, int len, uint64 val)
{
if (val < (((uint64) 1) << ((len - 1) * 3)))
On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander <magnus@hagander.net> wrote:
Ugh. That would be nice to have, but I think that's outside the scope of
this patch.
A test for this patch that could have value would be to use
pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the
size of the segments. If you have an idea to untar something without
the in-core perl support because we need to have the TAP stuff able to
run on at least 5.8.8, I'd welcome an idea. Honestly I still have
none, and that's why the recovery tests do not use tarballs in their
tests when using pg_basebackup. In short let's not add something more
for this patch.
PFA is an updated version of this patch that:
* documents a magic value passed to zlib (which is in their documentation as
being a magic value, but has no define)
* fixes the padding of tar files
* adds a most basic test that the -X stream -Ft does produce a tarfile
So I had a more serious look at this patch, and it basically makes
more generic the operations done for the plain mode by adding a set of
routines that can be used by both tar and plain mode to work on the
WAL files streamed. Elegant.
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
Nit: number of spaces here.
-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
Just passing WalMethod as argument would be enough, but... My patch
adding the fsync calls to pg_basebackup could just make use of
StreamCtl, so let's keep it as you suggest.
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
[...]
+ return stream->walmethod->existsfile(histfname);
}
existsfile always returns false for the tar method. This does not
matter much because pg_basebackup exits immediately in case of a
failure, but I think that this deserves a comment in ReceiveXlogStream
where existsTimeLineHistoryFile is called.
I find the use of existsfile() in open_walfile() rather confusing
because this relies on the fact that existsfile() always returns
false for the tar mode. We could add an additional field in WalMethod
to store the method type and use that more, but that may make the code
more confusing than what you propose. What do you think?
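For illustration, the alternative floated above might look roughly like
this (a hypothetical sketch, names invented here and not part of the
patch):

typedef enum
{
	WAL_METHOD_DIRECTORY,
	WAL_METHOD_TAR
} WalMethodType;

/*
 * WalWriteMethod would grow a "WalMethodType type" member, filled in by
 * CreateWalDirectoryMethod()/CreateWalTarMethod(), and open_walfile()
 * could then test the method type explicitly instead of relying on
 * existsfile() always returning false for tar:
 */
	if (stream->walmethod->type == WAL_METHOD_DIRECTORY &&
		stream->walmethod->existsfile(fn))
	{
		/* verify the size of the pre-existing segment, as today */
	}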
+ int (*unlink) (const char *pathname);
The unlink method is used nowhere. This could just be removed.
-static void
+void
print_tar_number(char *s, int len, uint64 val)
This could be an independent patch. Or not.
I think that I found another bug regarding the contents of the
segments. I did pg_basebackup -F t -X stream, then untarred pg_xlog.tar
which contained segment 1/0/2, then:
$ pg_xlogdump 000000010000000000000002
pg_xlogdump: FATAL: could not find a valid record after 0/2000000
I'd expect this segment to have records, up to a XLOG_SWITCH record.
As for using XLOGDIR to drive the name of the tarfile. pg_basebackup is
already hardcoded to use pg_xlog. And so are the tests. We probably want to
fix that, but that's a separate step and this patch will be easier to review
and test if we keep it out for now.
Yes. pg_basebackup is not the only thing missing the point here; here is
most of the list:
$ git grep "pg_xlog\"" -- *.c *.h
src/backend/access/transam/xlog.c:static const char *xlogSourceNames[] = {"any", "archive", "pg_xlog", "stream"};
src/backend/replication/basebackup.c: dir = AllocateDir("pg_xlog");
src/backend/replication/basebackup.c: (errmsg("could not open directory \"%s\": %m", "pg_xlog")));
src/backend/replication/basebackup.c: while ((de = ReadDir(dir, "pg_xlog")) != NULL)
src/backend/replication/basebackup.c: if (strcmp(pathbuf, "./pg_xlog") == 0)
src/backend/storage/file/fd.c: if (lstat("pg_xlog", &st) < 0)
src/backend/storage/file/fd.c: "pg_xlog")));
src/backend/storage/file/fd.c: if (pgwin32_is_junction("pg_xlog"))
src/backend/storage/file/fd.c: walkdir("pg_xlog", pre_sync_fname, false, DEBUG1);
src/backend/storage/file/fd.c: walkdir("pg_xlog", datadir_fsync_fname, false, LOG);
src/bin/initdb/initdb.c: snprintf(pg_xlog, MAXPGPATH, "%s/pg_xlog", pg_data);
src/bin/initdb/initdb.c: subdirloc = psprintf("%s/pg_xlog", pg_data);
src/bin/pg_basebackup/pg_basebackup.c: snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
src/bin/pg_basebackup/pg_basebackup.c: if (!((pg_str_endswith(filename, "/pg_xlog") ||
src/bin/pg_basebackup/pg_basebackup.c: linkloc = psprintf("%s/pg_xlog", basedir);
src/bin/pg_rewind/copy_fetch.c: strcmp(path, "pg_xlog") == 0)
src/bin/pg_rewind/filemap.c: if (strcmp(path, "pg_xlog") == 0 && type == FILE_TYPE_SYMLINK)
src/bin/pg_rewind/filemap.c: if (exists && !S_ISDIR(statbuf.st_mode) && strcmp(path, "pg_xlog") != 0)
src/bin/pg_rewind/filemap.c: if (strcmp(path, "pg_xlog") == 0 && type == FILE_TYPE_SYMLINK)
src/bin/pg_upgrade/exec.c: "pg_xlog"};
src/include/access/xlog_internal.h:#define XLOGDIR "pg_xlog"
Thanks,
--
Michael
On Mon, Sep 5, 2016 at 4:01 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
[ review comments ]
This thread has been sitting idle for more than 3 weeks, so I'm
marking it "Returned with Feedback" in the CommitFest application.
Magnus, Michael's latest round of comments seem pretty trivial, so
perhaps you want to just fix whichever of them seem to you to have
merit and commit without waiting for the next CommitFest. Or, you can
resubmit for the next CommitFest if you think it needs more review.
But the CommitFest is just about over so it's time to clean out old
entries, one way or the other.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sep 28, 2016 19:11, "Robert Haas" <robertmhaas@gmail.com> wrote:
On Mon, Sep 5, 2016 at 4:01 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
[ review comments ]
This thread has been sitting idle for more than 3 weeks, so I'm
marking it "Returned with Feedback" in the CommitFest application.
Magnus, Michael's latest round of comments seem pretty trivial, so
perhaps you want to just fix whichever of them seem to you to have
merit and commit without waiting for the next CommitFest. Or, you can
resubmit for the next CommitFest if you think it needs more review.
But the CommitFest is just about over so it's time to clean out old
entries, one way or the other.
Yeah, understood. I was planning to get back to it this week, but failed to
find the time. I still have some hope for later this week, but most likely
not until the next.
/Magnus
On Mon, Sep 5, 2016 at 10:01 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander <magnus@hagander.net> wrote:
Ugh. That would be nice to have, but I think that's outside the scope of
this patch.
A test for this patch that could have value would be to use
pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the
size of the segments. If you have an idea to untar something without
the in-core perl support because we need to have the TAP stuff able to
run on at least 5.8.8, I'd welcome an idea. Honestly I still have
none, and that's why the recovery tests do not use tarballs in their
tests when using pg_basebackup. In short let's not add something more
for this patch.
PFA is an updated version of this patch that:
* documents a magic value passed to zlib (which is in their documentation
as being a magic value, but has no define)
* fixes the padding of tar files
* adds a most basic test that the -X stream -Ft does produce a tarfile
So I had a more serious look at this patch, and it basically makes
more generic the operations done for the plain mode by adding a set of
routines that can be used by both tar and plain mode to work on the
WAL files streamed. Elegant.
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
Nit: number of spaces here.
Fixed.
-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
Just passing WalMethod as argument would be enough, but... My patch
adding the fsync calls to pg_basebackup could just make use of
StreamCtl, so let's keep it as you suggest.
Yeah, I think it's cleaner to pass the whole structure around really. If
not now, we'd need it eventually. That makes all callers more consistent.
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
[...]
+ return stream->walmethod->existsfile(histfname);
}
existsfile always returns false for the tar method. This does not
matter much because pg_basebackup exits immediately in case of a
failure, but I think that this deserves a comment in ReceiveXlogStream
where existsTimeLineHistoryFile is called.
OK, added. As you say, the behaviour is expected, but it makes sense to
mention it clearly there.
I find the use of existsfile() in open_walfile() rather confusing
because this relies on the fact that existsfile() always returns
false for the tar mode. We could add an additional field in WalMethod
to store the method type and use that more, but that may make the code
more confusing than what you propose. What do you think?
Yeah, I'm not sure that helps. The point is that the abstraction is
supposed to take care of that. But if it's confusing, then clearly a
comment is warranted there, so I've added that. Do you think that makes it
clear enough?
+ int (*unlink) (const char *pathname);
The unlink method is used nowhere. This could just be removed.
That's clearly a missed cleanup. Removed, thanks.
-static void +void print_tar_number(char *s, int len, uint64 val) This could be an independent patch. Or not.
Could be, but we don't really have any other uses for it.
I think that I found another bug regarding the contents of the
segments. I did pg_basebackup -F t -X stream, then untarred pg_xlog.tar
which contained segment 1/0/2, then:
$ pg_xlogdump 000000010000000000000002
pg_xlogdump: FATAL: could not find a valid record after 0/2000000
I'd expect this segment to have records, up to a XLOG_SWITCH record.
Ugh. That's definitely broken, yes. It seeked back and overwrote the tar
header with the data, instead of starting where the file part was supposed
to be. It worked fine on compressed files; it was when implementing that
support that the uncompressed case broke.
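For what it's worth, the symptom matches the seek-back after pre-padding in
the v2 tar_open_for_write(), which returns to ofs_start (the header
position) rather than to the first data byte. A minimal sketch of the
likely correction, assuming that is indeed the culprit:

	/* after pre-padding, return to just past the 512-byte tar header,
	 * not to the header itself, so data writes don't clobber it */
	if (lseek(tar_data->fd, tar_data->currentfile->ofs_start + 512,
			  SEEK_SET) != tar_data->currentfile->ofs_start + 512)
		return NULL;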
So what's our basic rule for these perl tests: are we allowed to use
pg_xlogdump from within a pg_basebackup test? If so, that could actually be
a useful test: do the backup, extract the xlog, and verify that it contains
the SWITCH record.
I also noticed that using -Z5 created a .tar.gz while -z created a plain
.tar (which was nonetheless compressed). That's because -z sets
compresslevel to -1, meaning the default level, which the suffix check
doesn't count as compression.
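A minimal sketch of one way to fix that, assuming the suffix selection in
CreateWalTarMethod() is the only place testing for compression > 0 (the
deflate paths already treat any nonzero level, including -1, as
compression):

	/* hypothetical: any nonzero level, including the -1 default, means
	 * the output is gzipped, so name the file accordingly */
	const char *suffix = (compression != 0) ? ".tar.gz" : ".tar";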
Again, apologies for getting late back into the game here.
//Magnus
Attachments:
pg_basebackup_stream_tar_v3.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 9f1eae1..a4236a5 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
target directory, the tar contents will be written to
standard output, suitable for piping to for example
<productname>gzip</productname>. This is only possible if
- the cluster has no additional tablespaces.
+ the cluster has no additional tablespaces and transaction
+ log streaming is not used.
</para>
</listitem>
</varlistentry>
@@ -323,6 +324,10 @@ PostgreSQL documentation
If the log has been rotated when it's time to transfer it, the
backup will fail and be unusable.
</para>
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
</listitem>
</varlistentry>
@@ -339,6 +344,9 @@ PostgreSQL documentation
client can keep up with transaction log received, using this mode
requires no extra transaction logs to be saved on the master.
</para>
+ <para>The transaction log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
all: pg_basebackup pg_receivexlog pg_recvlogical
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 42f3b27..467d4fe 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -438,7 +438,7 @@ typedef struct
{
PGconn *bgconn;
XLogRecPtr startptr;
- char xlogdir[MAXPGPATH];
+ char xlog[MAXPGPATH]; /* directory or tarfile depending on mode */
char *sysidentifier;
int timeline;
} logstreamer_param;
@@ -458,9 +458,13 @@ LogStreamerMain(logstreamer_param *param)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = false;
stream.mark_done = true;
- stream.basedir = param->xlogdir;
stream.partial_suffix = NULL;
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel);
+
if (!ReceiveXlogStream(param->bgconn, &stream))
/*
@@ -470,6 +474,14 @@ LogStreamerMain(logstreamer_param *param)
*/
return 1;
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return 1;
+ }
+
PQfinish(param->bgconn);
return 0;
}
@@ -520,22 +532,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
/* Error message already written in GetConnection() */
exit(1);
- snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
- /*
- * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
- * basedir/pg_xlog as the directory entry in the tar file may arrive
- * later.
- */
- snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
- basedir);
+ snprintf(param->xlog, sizeof(param->xlog), "%s/pg_xlog", basedir);
- if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ if (format == 'p')
{
- fprintf(stderr,
- _("%s: could not create directory \"%s\": %s\n"),
- progname, statusdir, strerror(errno));
- disconnect_and_exit(1);
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
+
+ if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ {
+ fprintf(stderr,
+ _("%s: could not create directory \"%s\": %s\n"),
+ progname, statusdir, strerror(errno));
+ disconnect_and_exit(1);
+ }
}
/*
@@ -2195,16 +2210,6 @@ main(int argc, char **argv)
exit(1);
}
- if (format != 'p' && streamwal)
- {
- fprintf(stderr,
- _("%s: WAL streaming can only be used in plain mode\n"),
- progname);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
- progname);
- exit(1);
- }
-
if (replication_slot && !streamwal)
{
fprintf(stderr,
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 7f7ee9d..9b4c101 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -337,11 +337,19 @@ StreamLog(void)
stream.standby_message_timeout = standby_message_timeout;
stream.synchronous = synchronous;
stream.mark_done = false;
- stream.basedir = basedir;
+ stream.walmethod = CreateWalDirectoryMethod(basedir);
stream.partial_suffix = ".partial";
ReceiveXlogStream(conn, &stream);
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return;
+ }
+
PQfinish(conn);
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 062730b..481944f 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -26,7 +26,7 @@
/* fd and filename for currently open WAL file */
-static int walfile = -1;
+static Walfile *walfile = NULL;
static char current_walfile_name[MAXPGPATH] = "";
static bool reportFlushPosition = false;
static XLogRecPtr lastFlushPosition = InvalidXLogRecPtr;
@@ -37,7 +37,7 @@ static PGresult *HandleCopyStream(PGconn *conn, StreamCtl *stream,
XLogRecPtr *stoppos);
static int CopyStreamPoll(PGconn *conn, long timeout_ms);
static int CopyStreamReceive(PGconn *conn, long timeout, char **buffer);
-static bool ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+static bool ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status);
static bool ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr *blockpos);
@@ -52,33 +52,33 @@ static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos,
uint32 *timeline);
static bool
-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
{
- int fd;
+ Walfile *f;
static char tmppath[MAXPGPATH];
- snprintf(tmppath, sizeof(tmppath), "%s/archive_status/%s.done",
- basedir, fname);
+ snprintf(tmppath, sizeof(tmppath), "archive_status/%s.done",
+ fname);
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(tmppath, NULL, 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create archive status file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
- close(fd);
+ stream->walmethod->close(f, CLOSE_UNLINK);
return false;
}
- close(fd);
+ stream->walmethod->close(f, CLOSE_NORMAL);
return true;
}
@@ -92,79 +92,70 @@ mark_file_as_archived(const char *basedir, const char *fname)
static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
- int f;
+ Walfile *f;
char fn[MAXPGPATH];
- struct stat statbuf;
- char *zerobuf;
- int bytes;
+ ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno);
XLogFileName(current_walfile_name, stream->timeline, segno);
- snprintf(fn, sizeof(fn), "%s/%s%s", stream->basedir, current_walfile_name,
+ snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
stream->partial_suffix ? stream->partial_suffix : "");
- f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (f == -1)
- {
- fprintf(stderr,
- _("%s: could not open transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- return false;
- }
/*
- * Verify that the file is either empty (just created), or a complete
- * XLogSegSize segment. Anything in between indicates a corrupt file.
+ * When streaming to files, if a file already exists we verify that it's
+ * either empty (just created), or a complete XLogSegSize segment (in
+ * which case it has been created and padded). Anything else indicates a
+ * corrupt file.
+ *
+ * When streaming to tar, no file with this name will exist before, so we
+ * never have to verify a size.
*/
- if (fstat(f, &statbuf) != 0)
- {
- fprintf(stderr,
- _("%s: could not stat transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
- return false;
- }
- if (statbuf.st_size == XLogSegSize)
- {
- /* File is open and ready to use */
- walfile = f;
- return true;
- }
- if (statbuf.st_size != 0)
- {
- fprintf(stderr,
- _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
- progname, fn, (int) statbuf.st_size, XLogSegSize);
- close(f);
- return false;
- }
-
- /* New, empty, file. So pad it to 16Mb with zeroes */
- zerobuf = pg_malloc0(XLOG_BLCKSZ);
- for (bytes = 0; bytes < XLogSegSize; bytes += XLOG_BLCKSZ)
+ if (stream->walmethod->existsfile(fn))
{
- if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ size = stream->walmethod->get_file_size(fn);
+ if (size < 0)
{
fprintf(stderr,
- _("%s: could not pad transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- free(zerobuf);
- close(f);
- unlink(fn);
+ _("%s: could not get size of transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ if (size == XLogSegSize)
+ {
+ /* Already padded file. Open it for use */
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0);
+ if (f == NULL)
+ {
+ fprintf(stderr,
+ _("%s: could not open existing transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ walfile = f;
+ return true;
+ }
+ if (size != 0)
+ {
+ fprintf(stderr,
+ _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
+ progname, fn, (int) size, XLogSegSize);
return false;
}
}
- free(zerobuf);
- if (lseek(f, SEEK_SET, 0) != 0)
+ /* No file existed, so create one */
+
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize);
+ if (f == NULL)
{
fprintf(stderr,
- _("%s: could not seek to beginning of transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
+ _("%s: could not open transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
+
walfile = f;
return true;
}
@@ -178,56 +169,50 @@ static bool
close_walfile(StreamCtl *stream, XLogRecPtr pos)
{
off_t currpos;
+ int r;
- if (walfile == -1)
+ if (walfile == NULL)
return true;
- currpos = lseek(walfile, 0, SEEK_CUR);
+ currpos = stream->walmethod->get_current_pos(walfile);
if (currpos == -1)
{
fprintf(stderr,
_("%s: could not determine seek position in file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (close(walfile) != 0)
+ if (stream->partial_suffix)
{
- fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
- return false;
+ if (currpos == XLOG_SEG_SIZE)
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
+ else
+ {
+ fprintf(stderr,
+ _("%s: not renaming \"%s%s\", segment is not complete\n"),
+ progname, current_walfile_name, stream->partial_suffix);
+ r = stream->walmethod->close(walfile, CLOSE_NO_RENAME);
+ }
}
- walfile = -1;
+ else
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
- /*
- * If we finished writing a .partial file, rename it into place.
- */
- if (currpos == XLOG_SEG_SIZE && stream->partial_suffix)
- {
- char oldfn[MAXPGPATH];
- char newfn[MAXPGPATH];
+ walfile = NULL;
- snprintf(oldfn, sizeof(oldfn), "%s/%s%s", stream->basedir, current_walfile_name, stream->partial_suffix);
- snprintf(newfn, sizeof(newfn), "%s/%s", stream->basedir, current_walfile_name);
- if (rename(oldfn, newfn) != 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- return false;
- }
+ if (r != 0)
+ {
+ fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ return false;
}
- else if (stream->partial_suffix)
- fprintf(stderr,
- _("%s: not renaming \"%s%s\", segment is not complete\n"),
- progname, current_walfile_name, stream->partial_suffix);
/*
* Mark file as archived if requested by the caller - pg_basebackup needs
@@ -238,7 +223,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
if (currpos == XLOG_SEG_SIZE && stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, current_walfile_name))
+ if (!mark_file_as_archived(stream, current_walfile_name))
return false;
}
@@ -253,9 +238,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
- char path[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
/*
* Timeline 1 never has a history file. We treat that as if it existed,
@@ -266,31 +249,16 @@ existsTimeLineHistoryFile(StreamCtl *stream)
TLHistoryFileName(histfname, stream->timeline);
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- fd = open(path, O_RDONLY | PG_BINARY, 0);
- if (fd < 0)
- {
- if (errno != ENOENT)
- fprintf(stderr, _("%s: could not open timeline history file \"%s\": %s\n"),
- progname, path, strerror(errno));
- return false;
- }
- else
- {
- close(fd);
- return true;
- }
+ return stream->walmethod->existsfile(histfname);
}
static bool
writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
{
int size = strlen(content);
- char path[MAXPGPATH];
char tmppath[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
+ Walfile *f;
/*
* Check that the server's idea of how timeline history files should be
@@ -304,62 +272,39 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
return false;
}
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- /*
- * Write into a temp file name.
- */
- snprintf(tmppath, MAXPGPATH, "%s.tmp", path);
-
- unlink(tmppath);
-
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(histfname, ".tmp", 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
- errno = 0;
- if ((int) write(fd, content, size) != size)
+ if ((int) stream->walmethod->write(f, content, size) != size)
{
- int save_errno = errno;
+ fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
+ progname, histfname, stream->walmethod->getlasterror());
/*
* If we fail to make the file, delete it to release disk space
*/
- close(fd);
- unlink(tmppath);
- errno = save_errno;
+ stream->walmethod->close(f, CLOSE_UNLINK);
- fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
return false;
}
- if (fsync(fd) != 0)
+ if (stream->walmethod->fsync(f) != 0)
{
- close(fd);
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
+ stream->walmethod->close(f, CLOSE_NORMAL);
return false;
}
- if (close(fd) != 0)
+ if (stream->walmethod->close(f, CLOSE_NORMAL) != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
- return false;
- }
-
- /*
- * Now move the completed history file into place with its final name.
- */
- if (rename(tmppath, path) < 0)
- {
- fprintf(stderr, _("%s: could not rename file \"%s\" to \"%s\": %s\n"),
- progname, tmppath, path, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
@@ -367,7 +312,7 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
if (stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, histfname))
+ if (!mark_file_as_archived(stream, histfname))
return false;
}
@@ -577,7 +522,9 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
{
/*
* Fetch the timeline history file for this timeline, if we don't have
- * it already.
+ * it already. When streaming log to tar, this will always return
+ * false, as we are never streaming into an existing file and therefore
+ * there can be no pre-existing timeline history file.
*/
if (!existsTimeLineHistoryFile(stream))
{
@@ -736,10 +683,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
}
error:
- if (walfile != -1 && close(walfile) != 0)
+ if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ walfile = NULL;
return false;
}
@@ -823,12 +770,12 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
* If synchronous option is true, issue sync command as soon as there
* are WAL data which has not been flushed yet.
*/
- if (stream->synchronous && lastFlushPosition < blockpos && walfile != -1)
+ if (stream->synchronous && lastFlushPosition < blockpos && walfile != NULL)
{
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
goto error;
}
lastFlushPosition = blockpos;
@@ -879,7 +826,7 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
/* Check the message type. */
if (copybuf[0] == 'k')
{
- if (!ProcessKeepaliveMsg(conn, copybuf, r, blockpos,
+ if (!ProcessKeepaliveMsg(conn, stream, copybuf, r, blockpos,
&last_status))
goto error;
}
@@ -1032,7 +979,7 @@ CopyStreamReceive(PGconn *conn, long timeout, char **buffer)
* Process the keepalive message.
*/
static bool
-ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
+ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
XLogRecPtr blockpos, int64 *last_status)
{
int pos;
@@ -1059,7 +1006,7 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
if (replyRequested && still_sending)
{
if (reportFlushPosition && lastFlushPosition < blockpos &&
- walfile != -1)
+ walfile != NULL)
{
/*
* If a valid flush location needs to be reported, flush the
@@ -1068,10 +1015,10 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
* data has been successfully replicated or not, at the normal
* shutdown of the server.
*/
- if (fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
lastFlushPosition = blockpos;
@@ -1129,7 +1076,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* Verify that the initial location in the stream matches where we think
* we are.
*/
- if (walfile == -1)
+ if (walfile == NULL)
{
/* No file open yet */
if (xlogoff != 0)
@@ -1143,12 +1090,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
{
/* More data in existing segment */
- /* XXX: store seek value don't reseek all the time */
- if (lseek(walfile, 0, SEEK_CUR) != xlogoff)
+ if (stream->walmethod->get_current_pos(walfile) != xlogoff)
{
fprintf(stderr,
_("%s: got WAL data offset %08x, expected %08x\n"),
- progname, xlogoff, (int) lseek(walfile, 0, SEEK_CUR));
+ progname, xlogoff, (int) stream->walmethod->get_current_pos(walfile));
return false;
}
}
@@ -1169,7 +1115,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
bytes_to_write = bytes_left;
- if (walfile == -1)
+ if (walfile == NULL)
{
if (!open_walfile(stream, *blockpos))
{
@@ -1178,14 +1124,13 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
}
}
- if (write(walfile,
- copybuf + hdr_len + bytes_written,
- bytes_to_write) != bytes_to_write)
+ if (stream->walmethod->write(walfile, copybuf + hdr_len + bytes_written,
+ bytes_to_write) != bytes_to_write)
{
fprintf(stderr,
_("%s: could not write %u bytes to WAL file \"%s\": %s\n"),
progname, bytes_to_write, current_walfile_name,
- strerror(errno));
+ stream->walmethod->getlasterror());
return false;
}
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 554ff8b..e6db14a 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -13,6 +13,7 @@
#define RECEIVELOG_H
#include "libpq-fe.h"
+#include "walmethods.h"
#include "access/xlogdefs.h"
@@ -39,7 +40,7 @@ typedef struct StreamCtl
stream_stop_callback stream_stop; /* Stop streaming when returns true */
- char *basedir; /* Received segments written to this dir */
+ WalWriteMethod *walmethod; /* How to write the WAL */
char *partial_suffix; /* Suffix appended to partially received files */
} StreamCtl;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index fd9857d..de4631e 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 54;
+use Test::More tests => 56;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -197,6 +197,10 @@ $node->command_ok(
'pg_basebackup -X stream runs');
ok(grep(/^[0-9A-F]{24}$/, slurp_dir("$tempdir/backupxf/pg_xlog")),
'WAL files copied');
+$node->command_ok(
+ [ 'pg_basebackup', '-D', "$tempdir/backupxst", '-X', 'stream', '-Ft' ],
+ 'pg_basebackup -X stream runs in tar mode');
+ok(-f "$tempdir/backupxst/pg_xlog.tar");
$node->command_fails(
[ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
new file mode 100644
index 0000000..c1c574b
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -0,0 +1,818 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.c - implementations of different ways to write received wal
+ *
+ * NOTE! The caller must ensure that only one method is instantiated in
+ * any given program, and that it's only instantiated once!
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.c
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+#include <time.h>
+#include <unistd.h>
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+#include "pgtar.h"
+
+#include "receivelog.h"
+
+/* Size of zlib buffer for .tar.gz */
+#define ZLIB_OUT_SIZE 4096
+
+/*-------------------------------------------------------------------------
+ * WalDirectoryMethod - write wal to a directory looking like pg_xlog
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Global static data for this method
+ */
+typedef struct DirectoryMethodData
+{
+ char *basedir;
+} DirectoryMethodData;
+static DirectoryMethodData *dir_data = NULL;
+
+/*
+ * Local file handle
+ */
+typedef struct DirectoryMethodFile
+{
+ int fd;
+ off_t currpos;
+ char *pathname;
+ char *temp_suffix;
+} DirectoryMethodFile;
+
+static char *
+dir_getlasterror(void)
+{
+ /* Directory method always sets errno, so just use strerror */
+ return strerror(errno);
+}
+
+static Walfile
+dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+ DirectoryMethodFile *f;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, pathname, temp_suffix ? temp_suffix : "");
+
+ fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (fd < 0)
+ return NULL;
+
+ if (pad_to_size)
+ {
+ /* Always pre-pad on regular files */
+ char *zerobuf;
+ int bytes;
+
+ zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ for (bytes = 0; bytes < pad_to_size; bytes += XLOG_BLCKSZ)
+ {
+ if (write(fd, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ {
+ int save_errno = errno;
+
+ pg_free(zerobuf);
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+ pg_free(zerobuf);
+
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
+ }
+
+ f = pg_malloc0(sizeof(DirectoryMethodFile));
+ f->fd = fd;
+ f->currpos = 0;
+ f->pathname = pg_strdup(pathname);
+ if (temp_suffix)
+ f->temp_suffix = pg_strdup(temp_suffix);
+ return f;
+}
+
+static ssize_t
+dir_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+
+ Assert(f != NULL);
+
+ r = write(df->fd, buf, count);
+ if (r > 0)
+ df->currpos += r;
+ return r;
+}
+
+static off_t
+dir_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+
+ /* Use a cached value to prevent lots of reseeks */
+ return ((DirectoryMethodFile *) f)->currpos;
+}
+
+static int
+dir_close(Walfile f, WalCloseMethod method)
+{
+ int r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+ static char tmppath[MAXPGPATH];
+ static char tmppath2[MAXPGPATH];
+
+ Assert(f != NULL);
+
+ r = close(df->fd);
+
+ if (r == 0)
+ {
+ /* Build path to the current version of the file */
+ if (method == CLOSE_NORMAL && df->temp_suffix)
+ {
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, df->pathname);
+ r = rename(tmppath, tmppath2);
+ }
+ else if (method == CLOSE_UNLINK)
+ {
+ /* Unlink the file once it's closed */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix ? df->temp_suffix : "");
+ r = unlink(tmppath);
+ }
+ /* else either CLOSE_NORMAL and no temp suffix, or CLOSE_NO_RENAME */
+ }
+
+ pg_free(df->pathname);
+ if (df->temp_suffix)
+ pg_free(df->temp_suffix);
+ pg_free(df);
+
+ return r;
+}
+
+static int
+dir_fsync(Walfile f)
+{
+ Assert(f != NULL);
+
+ return fsync(((DirectoryMethodFile *) f)->fd);
+}
+
+static ssize_t
+dir_get_file_size(const char *pathname)
+{
+ struct stat statbuf;
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ if (stat(tmppath, &statbuf) != 0)
+ return -1;
+
+ return statbuf.st_size;
+}
+
+static bool
+dir_existsfile(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ fd = open(tmppath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+ return true;
+}
+
+static bool
+dir_finish(void)
+{
+ /* No cleanup necessary */
+ return true;
+}
+
+
+WalWriteMethod *
+CreateWalDirectoryMethod(const char *basedir)
+{
+ WalWriteMethod *method;
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = dir_open_for_write;
+ method->write = dir_write;
+ method->get_current_pos = dir_get_current_pos;
+ method->get_file_size = dir_get_file_size;
+ method->close = dir_close;
+ method->fsync = dir_fsync;
+ method->existsfile = dir_existsfile;
+ method->finish = dir_finish;
+ method->getlasterror = dir_getlasterror;
+
+ dir_data = pg_malloc0(sizeof(DirectoryMethodData));
+ dir_data->basedir = pg_strdup(basedir);
+
+ return method;
+}
+
+
+/*-------------------------------------------------------------------------
+ * WalTarMethod - write wal to a tar file containing pg_xlog contents
+ *-------------------------------------------------------------------------
+ */
+
+typedef struct TarMethodFile
+{
+ off_t ofs_start; /* Where does the *header* for this file start */
+ off_t currpos;
+ char header[512];
+ char *pathname;
+ size_t pad_to_size;
+} TarMethodFile;
+
+typedef struct TarMethodData
+{
+ char *tarfilename;
+ int fd;
+ int compression;
+ TarMethodFile *currentfile;
+ char lasterror[1024];
+#ifdef HAVE_LIBZ
+ z_streamp zp;
+ void *zlibOut;
+#endif
+} TarMethodData;
+static TarMethodData *tar_data = NULL;
+
+#define tar_clear_error() tar_data->lasterror[0] = '\0'
+#define tar_set_error(msg) strlcpy(tar_data->lasterror, msg, sizeof(tar_data->lasterror))
+
+static char *
+tar_getlasterror(void)
+{
+ /*
+ * If a custom error is set, return that one. Otherwise, assume errno is
+ * set and return that one.
+ */
+ if (tar_data->lasterror[0])
+ return tar_data->lasterror;
+ return strerror(errno);
+}
+
+#ifdef HAVE_LIBZ
+static bool
+tar_write_compressed_data(void *buf, size_t count, bool flush)
+{
+ tar_data->zp->next_in = buf;
+ tar_data->zp->avail_in = count;
+
+ while (tar_data->zp->avail_in || flush)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+ }
+
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (flush)
+ {
+ /* Reset the stream for writing */
+ if (deflateReset(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateReset failed");
+ return false;
+ }
+ }
+
+ return true;
+}
+#endif
+
+static ssize_t
+tar_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /* Tarfile will always be positioned at the end */
+ if (!tar_data->compression)
+ {
+ r = write(tar_data->fd, buf, count);
+ if (r > 0)
+ ((TarMethodFile *) f)->currpos += r;
+ return r;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data((void *) buf, count, false))
+ return -1;
+ ((TarMethodFile *) f)->currpos += count;
+ return count;
+ }
+#endif
+}
+
+static bool
+tar_write_padding_data(TarMethodFile * f, size_t bytes)
+{
+ char *zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ size_t bytesleft = bytes;
+
+ while (bytesleft)
+ {
+ size_t bytestowrite = bytesleft > XLOG_BLCKSZ ? XLOG_BLCKSZ : bytesleft;
+
+ size_t r = tar_write(f, zerobuf, bytestowrite);
+
+ if (r < 0)
+ return false;
+ bytesleft -= r;
+ }
+ return true;
+}
+
+static Walfile
+tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ int save_errno;
+ static char tmppath[MAXPGPATH];
+
+ tar_clear_error();
+
+ if (tar_data->fd < 0)
+ {
+ /*
+ * We open the tar file only when we first try to write to it.
+ */
+ tar_data->fd = open(tar_data->tarfilename,
+ O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (tar_data->fd < 0)
+ return NULL;
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ tar_data->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ tar_data->zp->zalloc = Z_NULL;
+ tar_data->zp->zfree = Z_NULL;
+ tar_data->zp->opaque = Z_NULL;
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+
+ /*
+ * Initialize deflation library. Adding the magic value 16 to the
+ * default 15 for the windowBits parameter makes the output be
+ * gzip instead of zlib.
+ */
+ if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
+ }
+#endif
+
+ /* There's no tar header itself, the file starts with regular files */
+ }
+
+ Assert(tar_data->currentfile == NULL);
+ if (tar_data->currentfile != NULL)
+ {
+ tar_set_error("implementation error: tar files can't have more than one open file\n");
+ return NULL;
+ }
+
+ tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ /* Create a header with size set to 0 - we will fill out the size on close */
+ if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
+ {
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ tar_set_error("could not create tar header");
+ return NULL;
+ }
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush existing data */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ return NULL;
+
+ /* Turn off compression for header */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->ofs_start = lseek(tar_data->fd, 0, SEEK_CUR);
+ if (tar_data->currentfile->ofs_start == -1)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ tar_data->currentfile->currpos = 0;
+
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tar_data->currentfile->header, 512) != 512)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Write header through the zlib APIs but with no compression */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return NULL;
+
+ /* Re-enable compression for the rest of the file */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->pathname = pg_strdup(pathname);
+
+ /*
+ * Uncompressed files are padded on creation, but for compression we can't
+ * do that
+ */
+ if (pad_to_size)
+ {
+ tar_data->currentfile->pad_to_size = pad_to_size;
+ if (!tar_data->compression)
+ {
+ /* Uncompressed, so pad now */
+ tar_write_padding_data(tar_data->currentfile, pad_to_size);
+ /* Seek back to start */
+ if (lseek(tar_data->fd, tar_data->currentfile->ofs_start + 512, SEEK_SET) != tar_data->currentfile->ofs_start + 512)
+ return NULL;
+
+ tar_data->currentfile->currpos = 0;
+ }
+ }
+
+ return tar_data->currentfile;
+}
+
+static ssize_t
+tar_get_file_size(const char *pathname)
+{
+ tar_clear_error();
+
+ /* Currently not used, so not supported */
+ errno = ENOSYS;
+ return -1;
+}
+
+static off_t
+tar_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ return ((TarMethodFile *) f)->currpos;
+}
+
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
+
+static int
+tar_close(Walfile f, WalCloseMethod method)
+{
+ ssize_t filesize;
+ int padding;
+ TarMethodFile *tf = (TarMethodFile *) f;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ if (method == CLOSE_UNLINK)
+ {
+ if (tar_data->compression)
+ {
+ tar_set_error("unlink not supported with compression");
+ return -1;
+ }
+
+ /*
+ * Unlink the file that we just wrote to the tar. We do this by
+ * truncating it to the start of the header. This is safe as we only
+ * allow writing of the very last file.
+ */
+ if (ftruncate(tar_data->fd, tf->ofs_start) != 0)
+ return -1;
+
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+ }
+
+ /*
+ * Pad the file itself with zeroes if necessary. Note that this is
+ * different from the tar format padding -- this is the padding we asked
+ * for when the file was opened.
+ */
+ if (tf->pad_to_size)
+ {
+ if (tar_data->compression)
+ {
+ /*
+ * A compressed tarfile is padded on close since we cannot know
+ * the size of the compressed output until the end.
+ */
+ size_t sizeleft = tf->pad_to_size - tf->currpos;
+
+ if (sizeleft)
+ {
+ if (!tar_write_padding_data(tf, sizeleft))
+ return -1;
+ }
+ }
+ else
+ {
+ /*
+ * An uncompressed tarfile was padded on creation, so just adjust
+ * the current position as if we seeked to the end.
+ */
+ tf->currpos = tf->pad_to_size;
+ }
+ }
+
+ /*
+ * Get the size of the file, and pad the current data up to the nearest
+ * 512 byte boundary.
+ */
+ filesize = tar_get_current_pos(f);
+ padding = ((filesize + 511) & ~511) - filesize;
+ if (padding)
+ {
+ char zerobuf[512];
+
+ MemSet(zerobuf, 0, padding);
+ if (tar_write(f, zerobuf, padding) != padding)
+ return -1;
+ }
+
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush the current buffer */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ {
+ errno = EINVAL;
+ return -1;
+ }
+ }
+#endif
+
+ /*
+ * Now go back and update the header with the correct filesize and
+ * possibly also renaming the file. We overwrite the entire current header
+ * when done, including the checksum.
+ */
+ print_tar_number(&(tf->header[124]), 12, filesize);
+
+ if (method == CLOSE_NORMAL)
+
+ /*
+ * We overwrite it with what it was before if we have no tempname,
+ * since we're going to write the buffer anyway.
+ */
+ strlcpy(&(tf->header[0]), tf->pathname, 100);
+
+ print_tar_number(&(tf->header[148]), 8, tarChecksum(((TarMethodFile *) f)->header));
+ if (lseek(tar_data->fd, tf->ofs_start, SEEK_SET) != ((TarMethodFile *) f)->ofs_start)
+ return -1;
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tf->header, 512) != 512)
+ return -1;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Turn off compression */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+
+ /* Overwrite the header, assuming the size will be the same */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return -1;
+
+ /* Turn compression back on */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+ }
+#endif
+
+ /* Move file pointer back down to end, so we can write the next file */
+ if (lseek(tar_data->fd, 0, SEEK_END) < 0)
+ return -1;
+
+ /* Always fsync on close, so the padding gets fsynced */
+ tar_fsync(f);
+
+ /* Clean up and done */
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+}
+
+static bool
+tar_existsfile(const char *pathname)
+{
+ tar_clear_error();
+ /* We only deal with new tarfiles, so nothing externally created exists */
+ return false;
+}
+
+static bool
+tar_finish(void)
+{
+ char zerobuf[1024];
+
+ tar_clear_error();
+
+ if (tar_data->currentfile)
+ {
+ if (tar_close(tar_data->currentfile, CLOSE_NORMAL) != 0)
+ return false;
+ }
+
+ /* A tarfile always ends with two empty blocks */
+ MemSet(zerobuf, 0, sizeof(zerobuf));
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, zerobuf, sizeof(zerobuf)) != sizeof(zerobuf))
+ return false;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data(zerobuf, sizeof(zerobuf), false))
+ return false;
+
+ /* Also flush all data to make sure the gzip stream is finished */
+ tar_data->zp->next_in = NULL;
+ tar_data->zp->avail_in = 0;
+ while (true)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, Z_FINISH);
+
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+ }
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (deflateEnd(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateEnd failed");
+ return false;
+ }
+ }
+#endif
+
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
+
+ if (close(tar_data->fd) != 0)
+ return false;
+
+ tar_data->fd = -1;
+
+ return true;
+}
+
+WalWriteMethod *
+CreateWalTarMethod(const char *tarbase, int compression)
+{
+ WalWriteMethod *method;
+ const char *suffix = (compression != 0) ? ".tar.gz" : ".tar";
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = tar_open_for_write;
+ method->write = tar_write;
+ method->get_current_pos = tar_get_current_pos;
+ method->get_file_size = tar_get_file_size;
+ method->close = tar_close;
+ method->fsync = tar_fsync;
+ method->existsfile = tar_existsfile;
+ method->finish = tar_finish;
+ method->getlasterror = tar_getlasterror;
+
+ tar_data = pg_malloc0(sizeof(TarMethodData));
+ tar_data->tarfilename = pg_malloc0(strlen(tarbase) + strlen(suffix) + 1);
+ sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
+ tar_data->fd = -1;
+ tar_data->compression = compression;
+ if (compression)
+ tar_data->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ return method;
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
new file mode 100644
index 0000000..39dd6a9
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -0,0 +1,45 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.h
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.h
+ *-------------------------------------------------------------------------
+ */
+
+
+typedef void *Walfile;
+
+typedef enum
+{
+ CLOSE_NORMAL,
+ CLOSE_UNLINK,
+ CLOSE_NO_RENAME,
+} WalCloseMethod;
+
+typedef struct WalWriteMethod WalWriteMethod;
+struct WalWriteMethod
+{
+ Walfile(*open_for_write) (const char *pathname, const char *temp_suffix, size_t pad_to_size);
+ int (*close) (Walfile f, WalCloseMethod method);
+ bool (*existsfile) (const char *pathname);
+ ssize_t (*get_file_size) (const char *pathname);
+
+ ssize_t (*write) (Walfile f, const void *buf, size_t count);
+ off_t (*get_current_pos) (Walfile f);
+ int (*fsync) (Walfile f);
+ bool (*finish) (void);
+ char *(*getlasterror) (void);
+};
+
+/*
+ * Available WAL methods:
+ * - WalDirectoryMethod - write WAL to regular files in a standard pg_xlog
+ * - TarDirectoryMethod - write WAL to a tarfile corresponding to pg_xlog
+ * (only implements the methods required for pg_basebackup,
+ * not all those required for pg_receivexlog)
+ */
+WalWriteMethod *CreateWalDirectoryMethod(const char *basedir);
+WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression);
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 45ca400..1d179f0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -22,4 +22,5 @@ enum tarError
extern enum tarError tarCreateHeader(char *h, const char *filename, const char *linktarget,
pgoff_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
+extern void print_tar_number(char *s, int len, uint64 val);
extern int tarChecksum(char *header);
diff --git a/src/port/tar.c b/src/port/tar.c
index 52a2113..f1da959 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -16,7 +16,7 @@
* support only non-negative numbers, so we don't worry about the GNU rules
* for handling negative numbers.)
*/
-static void
+void
print_tar_number(char *s, int len, uint64 val)
{
if (val < (((uint64) 1) << ((len - 1) * 3)))
On Thu, Sep 29, 2016 at 12:44 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Mon, Sep 5, 2016 at 10:01 AM, Michael Paquier <michael.paquier@gmail.com> wrote:

On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander <magnus@hagander.net> wrote:

Ugh. That would be nice to have, but I think that's outside the scope of this patch.

A test for this patch that could have value would be to use pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the size of the segments. If you have an idea for untarring something without the in-core perl support (we need the TAP stuff to be able to run on at least 5.8.8), I'd welcome it. Honestly I still have none, and that's why the recovery tests do not use tarballs when they use pg_basebackup. In short, let's not add anything more for this patch.

PFA is an updated version of this patch that:
* documents a magic value passed to zlib (which is in their documentation as being a magic value, but has no define) - see the sketch below
* fixes the padding of tar files
* adds a most basic test that -X stream -Ft does produce a tarfile
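For reference, a minimal standalone sketch (illustrative only, not the patch's code) of the windowBits trick the first bullet refers to: zlib documents the value 16 as a magic number but provides no define for it, and adding it to the usual 15 window bits makes deflate produce gzip framing instead of zlib framing.

#include <string.h>
#include <zlib.h>

/*
 * Initialize a deflate stream whose output is gzip-framed rather than
 * zlib-framed. windowBits = 15 is the normal maximum; adding 16 (the
 * "magic value") requests a gzip header and trailer.
 */
static int
init_gzip_deflate(z_stream *zs, int level)
{
	memset(zs, 0, sizeof(*zs));
	zs->zalloc = Z_NULL;
	zs->zfree = Z_NULL;
	zs->opaque = Z_NULL;

	return deflateInit2(zs, level, Z_DEFLATED,
						15 + 16,	/* 15 window bits, +16 selects gzip */
						8,			/* default memLevel */
						Z_DEFAULT_STRATEGY);
}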
So I had a more serious look at this patch, and it basically makes more generic the operations done for the plain mode, by adding a set of routines that can be used by both tar and plain mode to work on the WAL files streamed. Elegant.

+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>

Nit: number of spaces here.

Fixed.

-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)

Just passing WalMethod as argument would be enough, but... My patch adding the fsync calls to pg_basebackup could just make use of StreamCtl, so let's keep it as you suggest.

Yeah, I think it's cleaner to pass the whole structure around really. If not now, we'd need it eventually. That makes all callers more consistent.

static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
[...]
+ return stream->walmethod->existsfile(histfname);
}

existsfile always returns false for the tar method. This does not matter much because pg_basebackup exits immediately in case of a failure, but I think that this deserves a comment in ReceiveXlogStream where existsTimeLineHistoryFile is called.

OK, added. As you say, the behaviour is expected, but it makes sense to mention it clearly there.
I find the use of existsfile() in open_walfile() rather confusing, because it relies on the fact that existsfile() always returns false for the tar mode. We could add an additional field in WalMethod to store the method type and use that more, but that may make the code more confusing than what you propose. What do you think?

Yeah, I'm not sure that helps. The point is that the abstraction is supposed to take care of that. But if it's confusing, then clearly a comment is warranted there, so I've added that. Do you think that makes it clear enough?

+ int (*unlink) (const char *pathname);

The unlink method is used nowhere. This could just be removed.

That's clearly a missed cleanup. Removed, thanks.

-static void
+void
print_tar_number(char *s, int len, uint64 val)

This could be an independent patch. Or not.

Could be, but we don't really have any other uses for it.
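For context, a short sketch of what the now-exported print_tar_number() is used for (this mirrors the header fix-up done in tar_close() above; the helper name here is made up): once a tar member's final size is known, the size and checksum fields of the in-memory 512-byte ustar header are rewritten in place at their standard offsets.

#include "postgres_fe.h"
#include "pgtar.h"

/*
 * Patch the size (offset 124, 12 bytes, octal) and checksum (offset 148,
 * 8 bytes) fields of a 512-byte ustar header once the final member size
 * is known; the header can then be rewritten at its original position.
 */
static void
fix_member_header(char *header, uint64 filesize)
{
	print_tar_number(&header[124], 12, filesize);
	print_tar_number(&header[148], 8, tarChecksum(header));
}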
I think that I found another bug regarding the contents of the segments. I did pg_basebackup -F t -X stream, then untarred pg_xlog.tar, which contained segment 1/0/2, then:

$ pg_xlogdump 000000010000000000000002
pg_xlogdump: FATAL: could not find a valid record after 0/2000000

I'd expect this segment to have records, up to a XLOG_SWITCH record.

Ugh. That's definitely broken, yes. It seeked back and overwrote the tar header with the data, instead of starting where the file part was supposed to be. It worked fine on compressed files, and it's when implementing that support that it broke.
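To make the failure mode concrete, a minimal sketch (names hypothetical, not the patch's code) of the offset bookkeeping for an uncompressed tar member: the 512-byte ustar header sits at the member's start offset, so after pre-padding the segment with zeroes the file position has to be restored to just past the header, not to the header itself.

#include <sys/types.h>
#include <unistd.h>

/*
 * After writing the 512-byte header at hdr_start and pre-padding the
 * member, seek back to the first data byte. Seeking to hdr_start instead
 * would make the first WAL block overwrite the header, which is the bug
 * described above.
 */
static int
rewind_to_member_data(int fd, off_t hdr_start)
{
	off_t		data_start = hdr_start + 512;

	if (lseek(fd, data_start, SEEK_SET) != data_start)
		return -1;
	return 0;
}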
So what's our basic rule for these perl tests - are we allowed to use pg_xlogdump from within a pg_basebackup test? If so, that could actually be a useful test: do the backup, extract the xlog, and verify that it contains the SWITCH record.

I also noticed that using -Z5 created a .tar.gz and -z created a .tar (which was compressed), because compresslevel is set to -1 with -z, meaning the default.
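A hypothetical illustration of that -z / -Z inconsistency (this is not the actual fix, and the helper is made up): normalizing zlib's "use the default" marker to a concrete level up front lets the suffix choice and the "are we compressing?" checks rely on the same simple test.

#include <zlib.h>

/*
 * Map Z_DEFAULT_COMPRESSION (-1, which plain -z produces) to a concrete
 * level, so that the file suffix and the compression code path are chosen
 * consistently from (compresslevel > 0).
 */
static const char *
choose_xlog_tar_suffix(int *compresslevel)
{
	if (*compresslevel == Z_DEFAULT_COMPRESSION)
		*compresslevel = 6;		/* zlib's documented default level */

	return (*compresslevel > 0) ? ".tar.gz" : ".tar";
}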
Again, apologies for getting late back into the game here.

And here's yet another version, now rebased on top of the fsync and nosync changes that got applied. In particular, this conflicted with pretty much every single change from the fsync patch, so I'm definitely looking for another round of review before this can be committed.

I ended up moving much of the fsync stuff into walmethods.c, since it depends on whether we are using tar or not (obviously only the parts about the WAL, not the base backup). So there's a significant risk that I missed something there.
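As a rough sketch of what moving the fsync handling behind the walmethod looks like (simplified; the do_sync field name is an assumption, though the v4 patch below does pass a do_sync flag to CreateWalDirectoryMethod()), the directory method can simply make its fsync callback a no-op when syncing is disabled:

/*
 * Simplified sketch: honour a do_sync flag stored in DirectoryMethodData.
 * With syncing disabled the callback reports success without flushing.
 */
static int
dir_fsync(Walfile f)
{
	Assert(f != NULL);

	if (!dir_data->do_sync)
		return 0;

	return fsync(((DirectoryMethodFile *) f)->fd);
}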
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Attachments:
pg_basebackup_stream_tar_v4.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 55e913f..e024531 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
target directory, the tar contents will be written to
standard output, suitable for piping to for example
<productname>gzip</productname>. This is only possible if
- the cluster has no additional tablespaces.
+ the cluster has no additional tablespaces and transaction
+ log streaming is not used.
</para>
</listitem>
</varlistentry>
@@ -323,6 +324,10 @@ PostgreSQL documentation
If the log has been rotated when it's time to transfer it, the
backup will fail and be unusable.
</para>
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
</listitem>
</varlistentry>
@@ -339,6 +344,9 @@ PostgreSQL documentation
client can keep up with transaction log received, using this mode
requires no extra transaction logs to be saved on the master.
</para>
+ <para>The transaction log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
all: pg_basebackup pg_receivexlog pg_recvlogical
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0f5d9d6..d82d80c 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -443,7 +443,7 @@ typedef struct
{
PGconn *bgconn;
XLogRecPtr startptr;
- char xlogdir[MAXPGPATH];
+ char xlog[MAXPGPATH]; /* directory or tarfile depending on mode */
char *sysidentifier;
int timeline;
} logstreamer_param;
@@ -464,9 +464,13 @@ LogStreamerMain(logstreamer_param *param)
stream.synchronous = false;
stream.do_sync = do_sync;
stream.mark_done = true;
- stream.basedir = param->xlogdir;
stream.partial_suffix = NULL;
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel, do_sync);
+
if (!ReceiveXlogStream(param->bgconn, &stream))
/*
@@ -476,6 +480,14 @@ LogStreamerMain(logstreamer_param *param)
*/
return 1;
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return 1;
+ }
+
PQfinish(param->bgconn);
return 0;
}
@@ -526,22 +538,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
/* Error message already written in GetConnection() */
exit(1);
- snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
- /*
- * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
- * basedir/pg_xlog as the directory entry in the tar file may arrive
- * later.
- */
- snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
- basedir);
+ snprintf(param->xlog, sizeof(param->xlog), "%s/pg_xlog", basedir);
- if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ if (format == 'p')
{
- fprintf(stderr,
- _("%s: could not create directory \"%s\": %s\n"),
- progname, statusdir, strerror(errno));
- disconnect_and_exit(1);
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
+
+ if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ {
+ fprintf(stderr,
+ _("%s: could not create directory \"%s\": %s\n"),
+ progname, statusdir, strerror(errno));
+ disconnect_and_exit(1);
+ }
}
/*
@@ -2234,16 +2249,6 @@ main(int argc, char **argv)
exit(1);
}
- if (format != 'p' && streamwal)
- {
- fprintf(stderr,
- _("%s: WAL streaming can only be used in plain mode\n"),
- progname);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
- progname);
- exit(1);
- }
-
if (replication_slot && !streamwal)
{
fprintf(stderr,
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index a58a251..bbdf96e 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -338,11 +338,19 @@ StreamLog(void)
stream.synchronous = synchronous;
stream.do_sync = true;
stream.mark_done = false;
- stream.basedir = basedir;
+ stream.walmethod = CreateWalDirectoryMethod(basedir, stream.do_sync);
stream.partial_suffix = ".partial";
ReceiveXlogStream(conn, &stream);
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return;
+ }
+
PQfinish(conn);
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 8f29d19..925dc51 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -30,7 +30,7 @@
/* fd and filename for currently open WAL file */
-static int walfile = -1;
+static Walfile *walfile = NULL;
static char current_walfile_name[MAXPGPATH] = "";
static bool reportFlushPosition = false;
static XLogRecPtr lastFlushPosition = InvalidXLogRecPtr;
@@ -56,29 +56,23 @@ static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos,
uint32 *timeline);
static bool
-mark_file_as_archived(const char *basedir, const char *fname, bool do_sync)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
{
- int fd;
+ Walfile *f;
static char tmppath[MAXPGPATH];
- snprintf(tmppath, sizeof(tmppath), "%s/archive_status/%s.done",
- basedir, fname);
+ snprintf(tmppath, sizeof(tmppath), "archive_status/%s.done",
+ fname);
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(tmppath, NULL, 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create archive status file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
return false;
}
- close(fd);
-
- if (do_sync && fsync_fname(tmppath, false, progname) != 0)
- return false;
-
- if (do_sync && fsync_parent_path(tmppath, progname) != 0)
- return false;
+ stream->walmethod->close(f, CLOSE_NORMAL);
return true;
}
@@ -92,100 +86,79 @@ mark_file_as_archived(const char *basedir, const char *fname, bool do_sync)
static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
- int f;
+ Walfile *f;
char fn[MAXPGPATH];
- struct stat statbuf;
- char *zerobuf;
- int bytes;
+ ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno);
XLogFileName(current_walfile_name, stream->timeline, segno);
- snprintf(fn, sizeof(fn), "%s/%s%s", stream->basedir, current_walfile_name,
+ snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
stream->partial_suffix ? stream->partial_suffix : "");
- f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (f == -1)
- {
- fprintf(stderr,
- _("%s: could not open transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- return false;
- }
/*
- * Verify that the file is either empty (just created), or a complete
- * XLogSegSize segment. Anything in between indicates a corrupt file.
+ * When streaming to files, if a file already exists we verify that it's
+ * either empty (just created), or a complete XLogSegSize segment (in
+ * which case it has been created and padded). Anything else indicates a
+ * corrupt file.
+ *
+ * When streaming to tar, no file with this name will exist before, so we
+ * never have to verify a size.
*/
- if (fstat(f, &statbuf) != 0)
+ if (stream->walmethod->existsfile(fn))
{
- fprintf(stderr,
- _("%s: could not stat transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
- return false;
- }
- if (statbuf.st_size == XLogSegSize)
- {
- /* File is open and ready to use */
- walfile = f;
-
- /*
- * fsync, in case of a previous crash between padding and fsyncing the
- * file.
- */
- if (stream->do_sync && fsync_fname(fn, false, progname) != 0)
- return false;
- if (stream->do_sync && fsync_parent_path(fn, progname) != 0)
+ size = stream->walmethod->get_file_size(fn);
+ if (size < 0)
+ {
+ fprintf(stderr,
+ _("%s: could not get size of transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
+ }
+ if (size == XLogSegSize)
+ {
+ /* Already padded file. Open it for use */
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0);
+ if (f == NULL)
+ {
+ fprintf(stderr,
+ _("%s: could not open existing transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
- return true;
- }
- if (statbuf.st_size != 0)
- {
- fprintf(stderr,
- _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
- progname, fn, (int) statbuf.st_size, XLogSegSize);
- close(f);
- return false;
- }
+ /* fsync file in case of a previous crash */
+ if (!stream->walmethod->fsync(f))
+ {
+ stream->walmethod->close(f, CLOSE_UNLINK);
+ return false;
+ }
- /* New, empty, file. So pad it to 16Mb with zeroes */
- zerobuf = pg_malloc0(XLOG_BLCKSZ);
- for (bytes = 0; bytes < XLogSegSize; bytes += XLOG_BLCKSZ)
- {
- if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ walfile = f;
+ return true;
+ }
+ if (size != 0)
{
fprintf(stderr,
- _("%s: could not pad transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- free(zerobuf);
- close(f);
- unlink(fn);
+ _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
+ progname, fn, (int) size, XLogSegSize);
return false;
}
+ /* File existed and was empty, so fall through and open */
}
- free(zerobuf);
- /*
- * fsync WAL file and containing directory, to ensure the file is
- * persistently created and zeroed. That's particularly important when
- * using synchronous mode, where the file is modified and fsynced
- * in-place, without a directory fsync.
- */
- if (stream->do_sync && fsync_fname(fn, false, progname) != 0)
- return false;
- if (stream->do_sync && fsync_parent_path(fn, progname) != 0)
- return false;
+ /* No file existed, so create one */
- if (lseek(f, SEEK_SET, 0) != 0)
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize);
+ if (f == NULL)
{
fprintf(stderr,
- _("%s: could not seek to beginning of transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
+ _("%s: could not open transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
+
walfile = f;
return true;
}
@@ -199,55 +172,43 @@ static bool
close_walfile(StreamCtl *stream, XLogRecPtr pos)
{
off_t currpos;
+ int r;
- if (walfile == -1)
+ if (walfile == NULL)
return true;
- currpos = lseek(walfile, 0, SEEK_CUR);
+ currpos = stream->walmethod->get_current_pos(walfile);
if (currpos == -1)
{
fprintf(stderr,
_("%s: could not determine seek position in file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->partial_suffix)
{
- fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- return false;
+ if (currpos == XLOG_SEG_SIZE)
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
+ else
+ {
+ fprintf(stderr,
+ _("%s: not renaming \"%s%s\", segment is not complete\n"),
+ progname, current_walfile_name, stream->partial_suffix);
+ r = stream->walmethod->close(walfile, CLOSE_NO_RENAME);
+ }
}
+ else
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
- if (close(walfile) != 0)
+ walfile = NULL;
+
+ if (r != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- walfile = -1;
-
- /*
- * If we finished writing a .partial file, rename it into place.
- */
- if (currpos == XLOG_SEG_SIZE && stream->partial_suffix)
- {
- char oldfn[MAXPGPATH];
- char newfn[MAXPGPATH];
-
- snprintf(oldfn, sizeof(oldfn), "%s/%s%s", stream->basedir, current_walfile_name, stream->partial_suffix);
- snprintf(newfn, sizeof(newfn), "%s/%s", stream->basedir, current_walfile_name);
- if (durable_rename(oldfn, newfn, progname) != 0)
- {
- /* durable_rename produced a log entry */
- return false;
- }
- }
- else if (stream->partial_suffix)
- fprintf(stderr,
- _("%s: not renaming \"%s%s\", segment is not complete\n"),
- progname, current_walfile_name, stream->partial_suffix);
/*
* Mark file as archived if requested by the caller - pg_basebackup needs
@@ -258,8 +219,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
if (currpos == XLOG_SEG_SIZE && stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, current_walfile_name,
- stream->do_sync))
+ if (!mark_file_as_archived(stream, current_walfile_name))
return false;
}
@@ -274,9 +234,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
- char path[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
/*
* Timeline 1 never has a history file. We treat that as if it existed,
@@ -287,31 +245,15 @@ existsTimeLineHistoryFile(StreamCtl *stream)
TLHistoryFileName(histfname, stream->timeline);
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- fd = open(path, O_RDONLY | PG_BINARY, 0);
- if (fd < 0)
- {
- if (errno != ENOENT)
- fprintf(stderr, _("%s: could not open timeline history file \"%s\": %s\n"),
- progname, path, strerror(errno));
- return false;
- }
- else
- {
- close(fd);
- return true;
- }
+ return stream->walmethod->existsfile(histfname);
}
static bool
writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
{
int size = strlen(content);
- char path[MAXPGPATH];
- char tmppath[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
+ Walfile *f;
/*
* Check that the server's idea of how timeline history files should be
@@ -325,53 +267,31 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
return false;
}
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- /*
- * Write into a temp file name.
- */
- snprintf(tmppath, MAXPGPATH, "%s.tmp", path);
-
- unlink(tmppath);
-
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(histfname, ".tmp", 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
- errno = 0;
- if ((int) write(fd, content, size) != size)
+ if ((int) stream->walmethod->write(f, content, size) != size)
{
- int save_errno = errno;
+ fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
+ progname, histfname, stream->walmethod->getlasterror());
/*
* If we fail to make the file, delete it to release disk space
*/
- close(fd);
- unlink(tmppath);
- errno = save_errno;
+ stream->walmethod->close(f, CLOSE_UNLINK);
- fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
return false;
}
- if (close(fd) != 0)
+ if (stream->walmethod->close(f, CLOSE_NORMAL) != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
- return false;
- }
-
- /*
- * Now move the completed history file into place with its final name.
- */
- if (durable_rename(tmppath, path, progname) < 0)
- {
- /* durable_rename produced a log entry */
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
@@ -379,8 +299,7 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
if (stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, histfname,
- stream->do_sync))
+ if (!mark_file_as_archived(stream, histfname))
return false;
}
@@ -590,7 +509,9 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
{
/*
* Fetch the timeline history file for this timeline, if we don't have
- * it already.
+ * it already. When streaming log to tar, this will always return
+ * false, as we are never streaming into an existing file and therefor
+ * there can be no pre-existing timeline history file.
*/
if (!existsTimeLineHistoryFile(stream))
{
@@ -749,10 +670,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
}
error:
- if (walfile != -1 && close(walfile) != 0)
+ if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ walfile = NULL;
return false;
}
@@ -836,12 +757,12 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
* If synchronous option is true, issue sync command as soon as there
* are WAL data which has not been flushed yet.
*/
- if (stream->synchronous && lastFlushPosition < blockpos && walfile != -1)
+ if (stream->synchronous && lastFlushPosition < blockpos && walfile != NULL)
{
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
goto error;
}
lastFlushPosition = blockpos;
@@ -1072,7 +993,7 @@ ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
if (replyRequested && still_sending)
{
if (reportFlushPosition && lastFlushPosition < blockpos &&
- walfile != -1)
+ walfile != NULL)
{
/*
* If a valid flush location needs to be reported, flush the
@@ -1081,10 +1002,10 @@ ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* data has been successfully replicated or not, at the normal
* shutdown of the server.
*/
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
lastFlushPosition = blockpos;
@@ -1142,7 +1063,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* Verify that the initial location in the stream matches where we think
* we are.
*/
- if (walfile == -1)
+ if (walfile == NULL)
{
/* No file open yet */
if (xlogoff != 0)
@@ -1156,12 +1077,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
{
/* More data in existing segment */
- /* XXX: store seek value don't reseek all the time */
- if (lseek(walfile, 0, SEEK_CUR) != xlogoff)
+ if (stream->walmethod->get_current_pos(walfile) != xlogoff)
{
fprintf(stderr,
_("%s: got WAL data offset %08x, expected %08x\n"),
- progname, xlogoff, (int) lseek(walfile, 0, SEEK_CUR));
+ progname, xlogoff, (int) stream->walmethod->get_current_pos(walfile));
return false;
}
}
@@ -1182,7 +1102,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
bytes_to_write = bytes_left;
- if (walfile == -1)
+ if (walfile == NULL)
{
if (!open_walfile(stream, *blockpos))
{
@@ -1191,14 +1111,13 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
}
}
- if (write(walfile,
- copybuf + hdr_len + bytes_written,
- bytes_to_write) != bytes_to_write)
+ if (stream->walmethod->write(walfile, copybuf + hdr_len + bytes_written,
+ bytes_to_write) != bytes_to_write)
{
fprintf(stderr,
_("%s: could not write %u bytes to WAL file \"%s\": %s\n"),
progname, bytes_to_write, current_walfile_name,
- strerror(errno));
+ stream->walmethod->getlasterror());
return false;
}
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 7a3bbc5..b5913ea 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -13,6 +13,7 @@
#define RECEIVELOG_H
#include "libpq-fe.h"
+#include "walmethods.h"
#include "access/xlogdefs.h"
@@ -41,7 +42,7 @@ typedef struct StreamCtl
stream_stop_callback stream_stop; /* Stop streaming when returns true */
- char *basedir; /* Received segments written to this dir */
+ WalWriteMethod *walmethod; /* How to write the WAL */
char *partial_suffix; /* Suffix appended to partially received files */
} StreamCtl;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index a52bd4e..dca8c1c 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 67;
+use Test::More tests => 69;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -237,6 +237,10 @@ $node->command_ok(
'pg_basebackup -X stream runs');
ok(grep(/^[0-9A-F]{24}$/, slurp_dir("$tempdir/backupxf/pg_xlog")),
'WAL files copied');
+$node->command_ok(
+ [ 'pg_basebackup', '-D', "$tempdir/backupxst", '-X', 'stream', '-Ft' ],
+ 'pg_basebackup -X stream runs in tar mode');
+ok(-f "$tempdir/backupxst/pg_xlog.tar", "tar file was created");
$node->command_fails(
[ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
new file mode 100644
index 0000000..e8d74b1
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -0,0 +1,873 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.c - implementations of different ways to write received wal
+ *
+ * NOTE! The caller must ensure that only one method is instantiated in
+ * any given program, and that it's only instantiated once!
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.c
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+#include <time.h>
+#include <unistd.h>
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+#include "pgtar.h"
+#include "common/file_utils.h"
+
+#include "receivelog.h"
+#include "streamutil.h"
+
+/* Size of zlib buffer for .tar.gz */
+#define ZLIB_OUT_SIZE 4096
+
+/*-------------------------------------------------------------------------
+ * WalDirectoryMethod - write wal to a directory looking like pg_xlog
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Global static data for this method
+ */
+typedef struct DirectoryMethodData
+{
+ char *basedir;
+ bool sync;
+} DirectoryMethodData;
+static DirectoryMethodData *dir_data = NULL;
+
+/*
+ * Local file handle
+ */
+typedef struct DirectoryMethodFile
+{
+ int fd;
+ off_t currpos;
+ char *pathname;
+ char *fullpath;
+ char *temp_suffix;
+} DirectoryMethodFile;
+
+static char *
+dir_getlasterror(void)
+{
+ /* Directory method always sets errno, so just use strerror */
+ return strerror(errno);
+}
+
+static Walfile
+dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+ DirectoryMethodFile *f;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, pathname, temp_suffix ? temp_suffix : "");
+
+ fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (fd < 0)
+ return NULL;
+
+ if (pad_to_size)
+ {
+ /* Always pre-pad on regular files */
+ char *zerobuf;
+ int bytes;
+
+ zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ for (bytes = 0; bytes < pad_to_size; bytes += XLOG_BLCKSZ)
+ {
+ if (write(fd, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ {
+ int save_errno = errno;
+
+ pg_free(zerobuf);
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+ pg_free(zerobuf);
+
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
+ }
+
+ f = pg_malloc0(sizeof(DirectoryMethodFile));
+ f->fd = fd;
+ f->currpos = 0;
+ f->pathname = pg_strdup(pathname);
+ f->fullpath = pg_strdup(tmppath);
+ if (temp_suffix)
+ f->temp_suffix = pg_strdup(temp_suffix);
+
+ /*
+ * fsync WAL file and containing directory, to ensure the file is
+ * persistently created and zeroed (if padded). That's particularly
+ * important when using synchronous mode, where the file is modified
+ * and fsynced in-place, without a directory fsync.
+ */
+ if (dir_data->sync)
+ {
+ if (fsync_fname(f->fullpath, false, progname) != 0)
+ return NULL;
+ if (fsync_parent_path(f->fullpath, progname) != 0)
+ return NULL;
+ }
+ return f;
+}
+
+static ssize_t
+dir_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+
+ Assert(f != NULL);
+
+ r = write(df->fd, buf, count);
+ if (r > 0)
+ df->currpos += r;
+ return r;
+}
+
+static off_t
+dir_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+
+ /* Use a cached value to prevent lots of reseeks */
+ return ((DirectoryMethodFile *) f)->currpos;
+}
+
+static int
+dir_close(Walfile f, WalCloseMethod method)
+{
+ int r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+ static char tmppath[MAXPGPATH];
+ static char tmppath2[MAXPGPATH];
+
+ Assert(f != NULL);
+
+ r = close(df->fd);
+
+ if (r == 0)
+ {
+ /* Build path to the current version of the file */
+ if (method == CLOSE_NORMAL && df->temp_suffix)
+ {
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, df->pathname);
+ r = durable_rename(tmppath, tmppath2, progname);
+ }
+ else if (method == CLOSE_UNLINK)
+ {
+ /* Unlink the file once it's closed */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix ? df->temp_suffix : "");
+ r = unlink(tmppath);
+ }
+ else
+ {
+ /*
+ * Else either CLOSE_NORMAL and no temp suffix,
+ * or CLOSE_NO_RENAME. In this case, fsync the file and
+ * containing directory if sync mode is requested.
+ */
+ if (dir_data->sync)
+ {
+ r = fsync_fname(df->fullpath, false, progname);
+ if (r == 0)
+ r = fsync_parent_path(df->fullpath, progname);
+ }
+ }
+ }
+
+ pg_free(df->pathname);
+ pg_free(df->fullpath);
+ if (df->temp_suffix)
+ pg_free(df->temp_suffix);
+ pg_free(df);
+
+ return r;
+}
+
+static int
+dir_fsync(Walfile f)
+{
+ Assert(f != NULL);
+
+ if (!dir_data->sync)
+ return 0;
+
+ return fsync(((DirectoryMethodFile *) f)->fd);
+}
+
+static ssize_t
+dir_get_file_size(const char *pathname)
+{
+ struct stat statbuf;
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ if (stat(tmppath, &statbuf) != 0)
+ return -1;
+
+ return statbuf.st_size;
+}
+
+static bool
+dir_existsfile(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ fd = open(tmppath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+ return true;
+}
+
+static bool
+dir_finish(void)
+{
+ if (dir_data->sync)
+ {
+ /*
+ * Files are fsynced when they are closed, but we need
+ * to fsync the directory entry here as well.
+ */
+ if (fsync_fname(dir_data->basedir, true, progname) != 0)
+ return false;
+ }
+ return true;
+}
+
+
+WalWriteMethod *
+CreateWalDirectoryMethod(const char *basedir, bool sync)
+{
+ WalWriteMethod *method;
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = dir_open_for_write;
+ method->write = dir_write;
+ method->get_current_pos = dir_get_current_pos;
+ method->get_file_size = dir_get_file_size;
+ method->close = dir_close;
+ method->fsync = dir_fsync;
+ method->existsfile = dir_existsfile;
+ method->finish = dir_finish;
+ method->getlasterror = dir_getlasterror;
+
+ dir_data = pg_malloc0(sizeof(DirectoryMethodData));
+ dir_data->basedir = pg_strdup(basedir);
+ dir_data->sync = sync;
+
+ return method;
+}
+
+
+/*-------------------------------------------------------------------------
+ * WalTarMethod - write wal to a tar file containing pg_xlog contents
+ *-------------------------------------------------------------------------
+ */
+
+typedef struct TarMethodFile
+{
+ off_t ofs_start; /* Where does the *header* for this file start */
+ off_t currpos;
+ char header[512];
+ char *pathname;
+ size_t pad_to_size;
+} TarMethodFile;
+
+typedef struct TarMethodData
+{
+ char *tarfilename;
+ int fd;
+ int compression;
+ bool sync;
+ TarMethodFile *currentfile;
+ char lasterror[1024];
+#ifdef HAVE_LIBZ
+ z_streamp zp;
+ void *zlibOut;
+#endif
+} TarMethodData;
+static TarMethodData *tar_data = NULL;
+
+#define tar_clear_error() tar_data->lasterror[0] = '\0'
+#define tar_set_error(msg) strlcpy(tar_data->lasterror, msg, sizeof(tar_data->lasterror))
+
+static char *
+tar_getlasterror(void)
+{
+ /*
+ * If a custom error is set, return that one. Otherwise, assume errno is
+ * set and return that one.
+ */
+ if (tar_data->lasterror[0])
+ return tar_data->lasterror;
+ return strerror(errno);
+}
+
+#ifdef HAVE_LIBZ
+static bool
+tar_write_compressed_data(void *buf, size_t count, bool flush)
+{
+ tar_data->zp->next_in = buf;
+ tar_data->zp->avail_in = count;
+
+ while (tar_data->zp->avail_in || flush)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+ }
+
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (flush)
+ {
+ /* Reset the stream for writing */
+ if (deflateReset(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateReset failed");
+ return false;
+ }
+ }
+
+ return true;
+}
+#endif
+
+static ssize_t
+tar_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /* Tarfile will always be positioned at the end */
+ if (!tar_data->compression)
+ {
+ r = write(tar_data->fd, buf, count);
+ if (r > 0)
+ ((TarMethodFile *) f)->currpos += r;
+ return r;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data((void *) buf, count, false))
+ return -1;
+ ((TarMethodFile *) f)->currpos += count;
+ return count;
+ }
+#endif
+}
+
+static bool
+tar_write_padding_data(TarMethodFile * f, size_t bytes)
+{
+ char *zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ size_t bytesleft = bytes;
+
+ while (bytesleft)
+ {
+ size_t bytestowrite = bytesleft > XLOG_BLCKSZ ? XLOG_BLCKSZ : bytesleft;
+
+ size_t r = tar_write(f, zerobuf, bytestowrite);
+
+ if (r < 0)
+ return false;
+ bytesleft -= r;
+ }
+ return true;
+}
+
+static Walfile
+tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ int save_errno;
+ static char tmppath[MAXPGPATH];
+
+ tar_clear_error();
+
+ if (tar_data->fd < 0)
+ {
+ /*
+ * We open the tar file only when we first try to write to it.
+ */
+ tar_data->fd = open(tar_data->tarfilename,
+ O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (tar_data->fd < 0)
+ return NULL;
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ tar_data->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ tar_data->zp->zalloc = Z_NULL;
+ tar_data->zp->zfree = Z_NULL;
+ tar_data->zp->opaque = Z_NULL;
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+
+ /*
+ * Initialize deflation library. Adding the magic value 16 to the
+ * default 15 for the windowBits parameter makes the output be
+ * gzip instead of zlib.
+ */
+ if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
+ }
+#endif
+
+ /* There's no tar header itself, the file starts with regular files */
+ }
+
+ Assert(tar_data->currentfile == NULL);
+ if (tar_data->currentfile != NULL)
+ {
+ tar_set_error("implementation error: tar files can't have more than one open file\n");
+ return NULL;
+ }
+
+ tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ /* Create a header with size set to 0 - we will fill out the size on close */
+ if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
+ {
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ tar_set_error("could not create tar header");
+ return NULL;
+ }
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush existing data */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ return NULL;
+
+ /* Turn off compression for header */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->ofs_start = lseek(tar_data->fd, 0, SEEK_CUR);
+ if (tar_data->currentfile->ofs_start == -1)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ tar_data->currentfile->currpos = 0;
+
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tar_data->currentfile->header, 512) != 512)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Write header through the zlib APIs but with no compression */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return NULL;
+
+ /* Re-enable compression for the rest of the file */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->pathname = pg_strdup(pathname);
+
+ /*
+ * Uncompressed files are padded on creation, but for compression we can't
+ * do that
+ */
+ if (pad_to_size)
+ {
+ tar_data->currentfile->pad_to_size = pad_to_size;
+ if (!tar_data->compression)
+ {
+ /* Uncompressed, so pad now */
+ tar_write_padding_data(tar_data->currentfile, pad_to_size);
+ /* Seek back to start */
+ if (lseek(tar_data->fd, tar_data->currentfile->ofs_start + 512, SEEK_SET) != tar_data->currentfile->ofs_start + 512)
+ return NULL;
+
+ tar_data->currentfile->currpos = 0;
+ }
+ }
+
+ return tar_data->currentfile;
+}
+
+static ssize_t
+tar_get_file_size(const char *pathname)
+{
+ tar_clear_error();
+
+ /* Currently not used, so not supported */
+ errno = ENOSYS;
+ return -1;
+}
+
+static off_t
+tar_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ return ((TarMethodFile *) f)->currpos;
+}
+
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
+
+static int
+tar_close(Walfile f, WalCloseMethod method)
+{
+ ssize_t filesize;
+ int padding;
+ TarMethodFile *tf = (TarMethodFile *) f;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ if (method == CLOSE_UNLINK)
+ {
+ if (tar_data->compression)
+ {
+ tar_set_error("unlink not supported with compression");
+ return -1;
+ }
+
+ /*
+ * Unlink the file that we just wrote to the tar. We do this by
+ * truncating it to the start of the header. This is safe as we only
+ * allow writing of the very last file.
+ */
+ if (ftruncate(tar_data->fd, tf->ofs_start) != 0)
+ return -1;
+
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+ }
+
+ /*
+ * Pad the file itself with zeroes if necessary. Note that this is
+ * different from the tar format padding -- this is the padding we asked
+ * for when the file was opened.
+ */
+ if (tf->pad_to_size)
+ {
+ if (tar_data->compression)
+ {
+ /*
+ * A compressed tarfile is padded on close since we cannot know
+ * the size of the compressed output until the end.
+ */
+ size_t sizeleft = tf->pad_to_size - tf->currpos;
+
+ if (sizeleft)
+ {
+ if (!tar_write_padding_data(tf, sizeleft))
+ return -1;
+ }
+ }
+ else
+ {
+ /*
+ * An uncompressed tarfile was padded on creation, so just adjust
+ * the current position as if we seeked to the end.
+ */
+ tf->currpos = tf->pad_to_size;
+ }
+ }
+
+ /*
+ * Get the size of the file, and pad the current data up to the nearest
+ * 512 byte boundary.
+ */
+ filesize = tar_get_current_pos(f);
+ padding = ((filesize + 511) & ~511) - filesize;
+ if (padding)
+ {
+ char zerobuf[512];
+
+ MemSet(zerobuf, 0, padding);
+ if (tar_write(f, zerobuf, padding) != padding)
+ return -1;
+ }
+
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush the current buffer */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ {
+ errno = EINVAL;
+ return -1;
+ }
+ }
+#endif
+
+ /*
+ * Now go back and update the header with the correct filesize and
+ * possibly also renaming the file. We overwrite the entire current header
+ * when done, including the checksum.
+ */
+ print_tar_number(&(tf->header[124]), 12, filesize);
+
+ if (method == CLOSE_NORMAL)
+
+ /*
+ * We overwrite it with what it was before if we have no tempname,
+ * since we're going to write the buffer anyway.
+ */
+ strlcpy(&(tf->header[0]), tf->pathname, 100);
+
+ print_tar_number(&(tf->header[148]), 8, tarChecksum(((TarMethodFile *) f)->header));
+ if (lseek(tar_data->fd, tf->ofs_start, SEEK_SET) != ((TarMethodFile *) f)->ofs_start)
+ return -1;
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tf->header, 512) != 512)
+ return -1;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Turn off compression */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+
+ /* Overwrite the header, assuming the size will be the same */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return -1;
+
+ /* Turn compression back on */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+ }
+#endif
+
+ /* Move file pointer back down to end, so we can write the next file */
+ if (lseek(tar_data->fd, 0, SEEK_END) < 0)
+ return -1;
+
+ /* Always fsync on close, so the padding gets fsynced */
+ tar_fsync(f);
+
+ /* Clean up and done */
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+}
+
+static bool
+tar_existsfile(const char *pathname)
+{
+ tar_clear_error();
+ /* We only deal with new tarfiles, so nothing externally created exists */
+ return false;
+}
+
+static bool
+tar_finish(void)
+{
+ char zerobuf[1024];
+
+ tar_clear_error();
+
+ if (tar_data->currentfile)
+ {
+ if (tar_close(tar_data->currentfile, CLOSE_NORMAL) != 0)
+ return false;
+ }
+
+ /* A tarfile always ends with two empty blocks */
+ MemSet(zerobuf, 0, sizeof(zerobuf));
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, zerobuf, sizeof(zerobuf)) != sizeof(zerobuf))
+ return false;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data(zerobuf, sizeof(zerobuf), false))
+ return false;
+
+ /* Also flush all data to make sure the gzip stream is finished */
+ tar_data->zp->next_in = NULL;
+ tar_data->zp->avail_in = 0;
+ while (true)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, Z_FINISH);
+
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+ }
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (deflateEnd(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateEnd failed");
+ return false;
+ }
+ }
+#endif
+
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
+
+ if (close(tar_data->fd) != 0)
+ return false;
+
+ tar_data->fd = -1;
+
+ if (tar_data->sync)
+ {
+ if (fsync_fname(tar_data->tarfilename, false, progname) != 0)
+ return false;
+ if (fsync_parent_path(tar_data->tarfilename, progname) != 0)
+ return false;
+ }
+
+ return true;
+}
+
+WalWriteMethod *
+CreateWalTarMethod(const char *tarbase, int compression, bool sync)
+{
+ WalWriteMethod *method;
+ const char *suffix = (compression != 0) ? ".tar.gz" : ".tar";
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = tar_open_for_write;
+ method->write = tar_write;
+ method->get_current_pos = tar_get_current_pos;
+ method->get_file_size = tar_get_file_size;
+ method->close = tar_close;
+ method->fsync = tar_fsync;
+ method->existsfile = tar_existsfile;
+ method->finish = tar_finish;
+ method->getlasterror = tar_getlasterror;
+
+ tar_data = pg_malloc0(sizeof(TarMethodData));
+ tar_data->tarfilename = pg_malloc0(strlen(tarbase) + strlen(suffix) + 1);
+ sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
+ tar_data->fd = -1;
+ tar_data->compression = compression;
+ tar_data->sync = sync;
+ if (compression)
+ tar_data->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ return method;
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
new file mode 100644
index 0000000..fa58f81
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -0,0 +1,45 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.h
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.h
+ *-------------------------------------------------------------------------
+ */
+
+
+typedef void *Walfile;
+
+typedef enum
+{
+ CLOSE_NORMAL,
+ CLOSE_UNLINK,
+ CLOSE_NO_RENAME,
+} WalCloseMethod;
+
+typedef struct WalWriteMethod WalWriteMethod;
+struct WalWriteMethod
+{
+ Walfile(*open_for_write) (const char *pathname, const char *temp_suffix, size_t pad_to_size);
+ int (*close) (Walfile f, WalCloseMethod method);
+ bool (*existsfile) (const char *pathname);
+ ssize_t (*get_file_size) (const char *pathname);
+
+ ssize_t (*write) (Walfile f, const void *buf, size_t count);
+ off_t (*get_current_pos) (Walfile f);
+ int (*fsync) (Walfile f);
+ bool (*finish) (void);
+ char *(*getlasterror) (void);
+};
+
+/*
+ * Available WAL methods:
+ * - WalDirectoryMethod - write WAL to regular files in a standard pg_xlog
+ * - TarDirectoryMethod - write WAL to a tarfile corresponding to pg_xlog
+ * (only implements the methods required for pg_basebackup,
+ * not all those required for pg_receivexlog)
+ */
+WalWriteMethod *CreateWalDirectoryMethod(const char *basedir, bool sync);
+WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression, bool sync);
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 45ca400..1d179f0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -22,4 +22,5 @@ enum tarError
extern enum tarError tarCreateHeader(char *h, const char *filename, const char *linktarget,
pgoff_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
+extern void print_tar_number(char *s, int len, uint64 val);
extern int tarChecksum(char *header);
diff --git a/src/port/tar.c b/src/port/tar.c
index 52a2113..f1da959 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -16,7 +16,7 @@
* support only non-negative numbers, so we don't worry about the GNU rules
* for handling negative numbers.)
*/
-static void
+void
print_tar_number(char *s, int len, uint64 val)
{
if (val < (((uint64) 1) << ((len - 1) * 3)))
(Squashing two emails into one)
On Fri, Sep 30, 2016 at 11:16 PM, Magnus Hagander <magnus@hagander.net> wrote:
And here's yet another version, now rebased on top of the fsync and nosync
changes that got applied.
My fault :p
In particular, this conflicted with pretty much every single change from the
fsync patch, so I'm definitely looking for another round of review before
this can be committed.
Could you rebase once again? This is conflicting with the recent
changes in open_walfile() and close_walfile() of 728ceba.
I ended up moving much of the fsync stuff into walmethods.c, since they were
dependent on if we used tar or not (obviously only the parts about the wal,
not the basebackup). So there's a significant risk that I missed something
there.
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
This comment could be improved, like "normal operation is to rename
the file" for example.
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
[...]
+ if (fsync_fname(f->fullpath, false, progname) != 0)
+ return NULL;
+ if (fsync_parent_path(f->fullpath, progname) != 0)
+ return NULL;
fd leaks for those three code paths. And when one of those fsync calls
fails, the previously pg_malloc'd f leaks as well. It may be a good idea
to have a single routine doing all the pg_free work for
DirectoryMethodFile. You'll need it as well in dir_close(). Or even
better: do the fsync calls before allocating f. For pg_basebackup it
does not matter much, but it does for pg_receivexlog, which has retry
logic.
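To illustrate, a rough sketch of the tail of dir_open_for_write() with that
ordering (sketch only, against the v4 code and untested; the lseek error
path would similarly need a close(fd) before returning):

    if (dir_data->sync)
    {
        /* sync before allocating anything, so the error paths leak nothing */
        if (fsync_fname(tmppath, false, progname) != 0 ||
            fsync_parent_path(tmppath, progname) != 0)
        {
            close(fd);
            return NULL;
        }
    }

    f = pg_malloc0(sizeof(DirectoryMethodFile));
    f->fd = fd;
    f->currpos = 0;
    f->pathname = pg_strdup(pathname);
    f->fullpath = pg_strdup(tmppath);
    if (temp_suffix)
        f->temp_suffix = pg_strdup(temp_suffix);

    return f;

That way, a failure before the pg_malloc0() has only the fd to clean up.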
+ if (deflateInit2(tar_data->zp, tar_data->compression,
Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
tar_data->zp leaks here. Perhaps that does not matter, as tar mode is
just used by pg_basebackup now, but if we introduce some retry logic
I'd prefer avoiding any problems in the future.
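i.e. something like this (again just a sketch, not the patch itself):

    if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED,
                     15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
    {
        /* don't leak the z_stream we just allocated */
        pg_free(tar_data->zp);
        tar_data->zp = NULL;
        tar_set_error("deflateInit2 failed");
        return NULL;
    }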
On Thu, Sep 29, 2016 at 7:44 PM, Magnus Hagander <magnus@hagander.net> wrote:
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
[...]
+ return stream->walmethod->existsfile(histfname);
}
existsfile always returns false for the tar method. This does not
matter much because pg_basebackup exits immediately in case of a
failure, but I think that this deserves a comment in ReceiveXlogStream
where existsTimeLineHistoryFile is called.
OK, added. As you say, the behaviour is expected, but it makes sense to
mention it clearly there.
Thanks.
+ * false, as we are never streaming into an existing file and therefor
s/therefor/therefore.
So what's our basic rule for these perl tests - are we allowed to use
pg_xlogdump from within a pg_basebackup test? If so that could actually be a
useful test - do the backup, extract the xlog and verify that it contains
the SWITCH record.
pg_xlogdump is part of the default temporary installation, so using it
is fine. The issue though is how do we untar pg_xlog.tar without a
native perl call? That's not present down to 5.8.8. The test you are
proposing in 010_pg_basebackup.pl is the best we can do for now.
--
Michael
On Tue, Oct 4, 2016 at 12:05 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
(Squashing two emails into one)
On Fri, Sep 30, 2016 at 11:16 PM, Magnus Hagander <magnus@hagander.net>
wrote:
And here's yet another version, now rebased on top of the fsync and nosync
changes that got applied.
My fault :p
Yes, definitely :P
In particular, this conflicted with pretty much every single change from the
fsync patch, so I'm definitely looking for another round of review before
this can be committed.
Could you rebase once again? This is conflicting with the recent
changes in open_walfile() and close_walfile() of 728ceba.
Done.
I ended up moving much of the fsync stuff into walmethods.c, since they were
dependent on if we used tar or not (obviously only the parts about the wal,
not the basebackup). So there's a significant risk that I missed something
there.
+ /* If we have a temp prefix, normal is we rename the file */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
This comment could be improved, like "normal operation is to rename
the file" for example.
Agreed and fixed.
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ return NULL;
[...]
+ if (fsync_fname(f->fullpath, false, progname) != 0)
+ return NULL;
+ if (fsync_parent_path(f->fullpath, progname) != 0)
+ return NULL;
fd leaks for those three code paths. And when one of those fsync calls
fails, the previously pg_malloc'd f leaks as well. It may be a good idea
to have a single routine doing all the pg_free work for
DirectoryMethodFile. You'll need it as well in dir_close(). Or even
better: do the fsync calls before allocating f. For pg_basebackup it
does not matter much, but it does for pg_receivexlog, which has retry
logic.
Agreed, moving the fsyncs is definitely the best there.
+ if (deflateInit2(tar_data->zp, tar_data->compression,
Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
tar_data->zp leaks here. Perhaps that does not matter, as tar mode is
just used by pg_basebackup now, but if we introduce some retry logic
I'd prefer avoiding any problems in the future.
Agreed, leaks are bad even if they are not a direct problem right now.
Fixed.
On Thu, Sep 29, 2016 at 7:44 PM, Magnus Hagander <magnus@hagander.net>
wrote:
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
[...]
+ return stream->walmethod->existsfile(histfname);
}
existsfile always returns false for the tar method. This does not
matter much because pg_basebackup exits immediately in case of a
failure, but I think that this deserves a comment in ReceiveXlogStream
where existsTimeLineHistoryFile is called.
OK, added. As you say, the behaviour is expected, but it makes sense to
mention it clearly there.
Thanks.
+ * false, as we are never streaming into an existing file and therefor
s/therefor/therefore.
Fixed.
So what's our basic rule for these perl tests - are we allowed to use
pg_xlogdump from within a pg_basebackup test? If so that could actually be a
useful test - do the backup, extract the xlog and verify that it contains
the SWITCH record.
pg_xlogdump is part of the default temporary installation, so using it
is fine. The issue though is how do we untar pg_xlog.tar without a
native perl call? That's not present down to 5.8.8. The test you are
proposing in 010_pg_basebackup.pl is the best we can do for now.
My initial thought was actually adding that check to non-tar format.
But I agree, to test the tar format things specifically we *somehow* need
to be able to untar. We either need to rely on a system tar (which will
likely break badly on Windows) or we need to rely on a perl tar module.
But independent of this patch, actually putting that test in for non-tar
mode would probably not be a bad idea -- if that breaks, it's likely both
break, after all.
Thanks!
//Magnus
Attachments:
pg_basebackup_stream_tar_v5.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 55e913f..e024531 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -180,7 +180,8 @@ PostgreSQL documentation
target directory, the tar contents will be written to
standard output, suitable for piping to for example
<productname>gzip</productname>. This is only possible if
- the cluster has no additional tablespaces.
+ the cluster has no additional tablespaces and transaction
+ log streaming is not used.
</para>
</listitem>
</varlistentry>
@@ -323,6 +324,10 @@ PostgreSQL documentation
If the log has been rotated when it's time to transfer it, the
backup will fail and be unusable.
</para>
+ <para>
+ The transaction log files will be written to
+ the <filename>base.tar</filename> file.
+ </para>
</listitem>
</varlistentry>
@@ -339,6 +344,9 @@ PostgreSQL documentation
client can keep up with transaction log received, using this mode
requires no extra transaction logs to be saved on the master.
</para>
+ <para>The transaction log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index fa1ce8b..52ac9e9 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq
-OBJS=receivelog.o streamutil.o $(WIN32RES)
+OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
all: pg_basebackup pg_receivexlog pg_recvlogical
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0f5d9d6..d82d80c 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -443,7 +443,7 @@ typedef struct
{
PGconn *bgconn;
XLogRecPtr startptr;
- char xlogdir[MAXPGPATH];
+ char xlog[MAXPGPATH]; /* directory or tarfile depending on mode */
char *sysidentifier;
int timeline;
} logstreamer_param;
@@ -464,9 +464,13 @@ LogStreamerMain(logstreamer_param *param)
stream.synchronous = false;
stream.do_sync = do_sync;
stream.mark_done = true;
- stream.basedir = param->xlogdir;
stream.partial_suffix = NULL;
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel, do_sync);
+
if (!ReceiveXlogStream(param->bgconn, &stream))
/*
@@ -476,6 +480,14 @@ LogStreamerMain(logstreamer_param *param)
*/
return 1;
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return 1;
+ }
+
PQfinish(param->bgconn);
return 0;
}
@@ -526,22 +538,25 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
/* Error message already written in GetConnection() */
exit(1);
- snprintf(param->xlogdir, sizeof(param->xlogdir), "%s/pg_xlog", basedir);
-
- /*
- * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
- * basedir/pg_xlog as the directory entry in the tar file may arrive
- * later.
- */
- snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
- basedir);
+ snprintf(param->xlog, sizeof(param->xlog), "%s/pg_xlog", basedir);
- if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ if (format == 'p')
{
- fprintf(stderr,
- _("%s: could not create directory \"%s\": %s\n"),
- progname, statusdir, strerror(errno));
- disconnect_and_exit(1);
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
+
+ if (pg_mkdir_p(statusdir, S_IRWXU) != 0 && errno != EEXIST)
+ {
+ fprintf(stderr,
+ _("%s: could not create directory \"%s\": %s\n"),
+ progname, statusdir, strerror(errno));
+ disconnect_and_exit(1);
+ }
}
/*
@@ -2234,16 +2249,6 @@ main(int argc, char **argv)
exit(1);
}
- if (format != 'p' && streamwal)
- {
- fprintf(stderr,
- _("%s: WAL streaming can only be used in plain mode\n"),
- progname);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
- progname);
- exit(1);
- }
-
if (replication_slot && !streamwal)
{
fprintf(stderr,
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index a58a251..bbdf96e 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -338,11 +338,19 @@ StreamLog(void)
stream.synchronous = synchronous;
stream.do_sync = true;
stream.mark_done = false;
- stream.basedir = basedir;
+ stream.walmethod = CreateWalDirectoryMethod(basedir, stream.do_sync);
stream.partial_suffix = ".partial";
ReceiveXlogStream(conn, &stream);
+ if (!stream.walmethod->finish())
+ {
+ fprintf(stderr,
+ _("%s: could not finish writing WAL files: %s\n"),
+ progname, strerror(errno));
+ return;
+ }
+
PQfinish(conn);
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index b0fa916..fcd0269 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -30,7 +30,7 @@
/* fd and filename for currently open WAL file */
-static int walfile = -1;
+static Walfile *walfile = NULL;
static char current_walfile_name[MAXPGPATH] = "";
static bool reportFlushPosition = false;
static XLogRecPtr lastFlushPosition = InvalidXLogRecPtr;
@@ -56,29 +56,23 @@ static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos,
uint32 *timeline);
static bool
-mark_file_as_archived(const char *basedir, const char *fname, bool do_sync)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
{
- int fd;
+ Walfile *f;
static char tmppath[MAXPGPATH];
- snprintf(tmppath, sizeof(tmppath), "%s/archive_status/%s.done",
- basedir, fname);
+ snprintf(tmppath, sizeof(tmppath), "archive_status/%s.done",
+ fname);
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(tmppath, NULL, 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create archive status file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, tmppath, stream->walmethod->getlasterror());
return false;
}
- close(fd);
-
- if (do_sync && fsync_fname(tmppath, false, progname) != 0)
- return false;
-
- if (do_sync && fsync_parent_path(tmppath, progname) != 0)
- return false;
+ stream->walmethod->close(f, CLOSE_NORMAL);
return true;
}
@@ -95,121 +89,82 @@ mark_file_as_archived(const char *basedir, const char *fname, bool do_sync)
static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
- int f;
+ Walfile *f;
char fn[MAXPGPATH];
- struct stat statbuf;
- char *zerobuf;
- int bytes;
+ ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno);
XLogFileName(current_walfile_name, stream->timeline, segno);
- snprintf(fn, sizeof(fn), "%s/%s%s", stream->basedir, current_walfile_name,
+ snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
stream->partial_suffix ? stream->partial_suffix : "");
- f = open(fn, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (f == -1)
- {
- fprintf(stderr,
- _("%s: could not open transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- return false;
- }
/*
- * Verify that the file is either empty (just created), or a complete
- * XLogSegSize segment. Anything in between indicates a corrupt file.
+ * When streaming to files, if an existing file exists we verify that it's
+ * either empty (just created), or a complete XLogSegSize segment (in
+ * which case it has been created and padded). Anything else indicates a
+ * corrupt file.
+ *
+ * When streaming to tar, no file with this name will exist before, so we
+ * never have to verify a size.
*/
- if (fstat(f, &statbuf) != 0)
+ if (stream->walmethod->existsfile(fn))
{
- fprintf(stderr,
- _("%s: could not stat transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
- return false;
- }
- if (statbuf.st_size == XLogSegSize)
- {
- /*
- * fsync, in case of a previous crash between padding and fsyncing the
- * file.
- */
- if (stream->do_sync)
+ size = stream->walmethod->get_file_size(fn);
+ if (size < 0)
{
- if (fsync_fname(fn, false, progname) != 0 ||
- fsync_parent_path(fn, progname) != 0)
+ fprintf(stderr,
+ _("%s: could not get size of transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
+ return false;
+ }
+ if (size == XLogSegSize)
+ {
+ /* Already padded file. Open it for use */
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0);
+ if (f == NULL)
{
- /* error already printed */
- close(f);
+ fprintf(stderr,
+ _("%s: could not open existing transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
- }
- /* File is open and ready to use */
- walfile = f;
- return true;
- }
- if (statbuf.st_size != 0)
- {
- fprintf(stderr,
- _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
- progname, fn, (int) statbuf.st_size, XLogSegSize);
- close(f);
- return false;
- }
+ /* fsync file in case of a previous crash */
+ if (!stream->walmethod->fsync(f))
+ {
+ stream->walmethod->close(f, CLOSE_UNLINK);
+ return false;
+ }
- /*
- * New, empty, file. So pad it to 16Mb with zeroes. If we fail partway
- * through padding, we should attempt to unlink the file on failure, so as
- * not to leave behind a partially-filled file.
- */
- zerobuf = pg_malloc0(XLOG_BLCKSZ);
- for (bytes = 0; bytes < XLogSegSize; bytes += XLOG_BLCKSZ)
- {
- errno = 0;
- if (write(f, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ walfile = f;
+ return true;
+ }
+ if (size != 0)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
fprintf(stderr,
- _("%s: could not pad transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- free(zerobuf);
- close(f);
- unlink(fn);
+ _("%s: transaction log file \"%s\" has %d bytes, should be 0 or %d\n"),
+ progname, fn, (int) size, XLogSegSize);
return false;
}
+ /* File existed and was empty, so fall through and open */
}
- free(zerobuf);
- /*
- * fsync WAL file and containing directory, to ensure the file is
- * persistently created and zeroed. That's particularly important when
- * using synchronous mode, where the file is modified and fsynced
- * in-place, without a directory fsync.
- */
- if (stream->do_sync)
- {
- if (fsync_fname(fn, false, progname) != 0 ||
- fsync_parent_path(fn, progname) != 0)
- {
- /* error already printed */
- close(f);
- return false;
- }
- }
+ /* No file existed, so create one */
- if (lseek(f, SEEK_SET, 0) != 0)
+ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize);
+ if (f == NULL)
{
fprintf(stderr,
- _("%s: could not seek to beginning of transaction log file \"%s\": %s\n"),
- progname, fn, strerror(errno));
- close(f);
+ _("%s: could not open transaction log file \"%s\": %s\n"),
+ progname, fn, stream->walmethod->getlasterror());
return false;
}
- /* File is open and ready to use */
walfile = f;
return true;
}
@@ -223,59 +178,46 @@ static bool
close_walfile(StreamCtl *stream, XLogRecPtr pos)
{
off_t currpos;
+ int r;
- if (walfile == -1)
+ if (walfile == NULL)
return true;
- currpos = lseek(walfile, 0, SEEK_CUR);
+ currpos = stream->walmethod->get_current_pos(walfile);
if (currpos == -1)
{
fprintf(stderr,
_("%s: could not determine seek position in file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- close(walfile);
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ stream->walmethod->close(walfile, CLOSE_UNLINK);
+ walfile = NULL;
+
return false;
}
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->partial_suffix)
{
- fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- close(walfile);
- walfile = -1;
- return false;
+ if (currpos == XLOG_SEG_SIZE)
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
+ else
+ {
+ fprintf(stderr,
+ _("%s: not renaming \"%s%s\", segment is not complete\n"),
+ progname, current_walfile_name, stream->partial_suffix);
+ r = stream->walmethod->close(walfile, CLOSE_NO_RENAME);
+ }
}
+ else
+ r = stream->walmethod->close(walfile, CLOSE_NORMAL);
- if (close(walfile) != 0)
+ walfile = NULL;
+
+ if (r != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
- walfile = -1;
-
- /*
- * If we finished writing a .partial file, rename it into place.
- */
- if (currpos == XLOG_SEG_SIZE && stream->partial_suffix)
- {
- char oldfn[MAXPGPATH];
- char newfn[MAXPGPATH];
-
- snprintf(oldfn, sizeof(oldfn), "%s/%s%s", stream->basedir, current_walfile_name, stream->partial_suffix);
- snprintf(newfn, sizeof(newfn), "%s/%s", stream->basedir, current_walfile_name);
- if (durable_rename(oldfn, newfn, progname) != 0)
- {
- /* durable_rename produced a log entry */
- return false;
- }
- }
- else if (stream->partial_suffix)
- fprintf(stderr,
- _("%s: not renaming \"%s%s\", segment is not complete\n"),
- progname, current_walfile_name, stream->partial_suffix);
/*
* Mark file as archived if requested by the caller - pg_basebackup needs
@@ -286,8 +228,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
if (currpos == XLOG_SEG_SIZE && stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, current_walfile_name,
- stream->do_sync))
+ if (!mark_file_as_archived(stream, current_walfile_name))
return false;
}
@@ -302,9 +243,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos)
static bool
existsTimeLineHistoryFile(StreamCtl *stream)
{
- char path[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
/*
* Timeline 1 never has a history file. We treat that as if it existed,
@@ -315,31 +254,15 @@ existsTimeLineHistoryFile(StreamCtl *stream)
TLHistoryFileName(histfname, stream->timeline);
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- fd = open(path, O_RDONLY | PG_BINARY, 0);
- if (fd < 0)
- {
- if (errno != ENOENT)
- fprintf(stderr, _("%s: could not open timeline history file \"%s\": %s\n"),
- progname, path, strerror(errno));
- return false;
- }
- else
- {
- close(fd);
- return true;
- }
+ return stream->walmethod->existsfile(histfname);
}
static bool
writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
{
int size = strlen(content);
- char path[MAXPGPATH];
- char tmppath[MAXPGPATH];
char histfname[MAXFNAMELEN];
- int fd;
+ Walfile *f;
/*
* Check that the server's idea of how timeline history files should be
@@ -353,53 +276,31 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
return false;
}
- snprintf(path, sizeof(path), "%s/%s", stream->basedir, histfname);
-
- /*
- * Write into a temp file name.
- */
- snprintf(tmppath, MAXPGPATH, "%s.tmp", path);
-
- unlink(tmppath);
-
- fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
- if (fd < 0)
+ f = stream->walmethod->open_for_write(histfname, ".tmp", 0);
+ if (f == NULL)
{
fprintf(stderr, _("%s: could not create timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
- errno = 0;
- if ((int) write(fd, content, size) != size)
+ if ((int) stream->walmethod->write(f, content, size) != size)
{
- int save_errno = errno;
+ fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
+ progname, histfname, stream->walmethod->getlasterror());
/*
* If we fail to make the file, delete it to release disk space
*/
- close(fd);
- unlink(tmppath);
- errno = save_errno;
+ stream->walmethod->close(f, CLOSE_UNLINK);
- fprintf(stderr, _("%s: could not write timeline history file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
return false;
}
- if (close(fd) != 0)
+ if (stream->walmethod->close(f, CLOSE_NORMAL) != 0)
{
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, tmppath, strerror(errno));
- return false;
- }
-
- /*
- * Now move the completed history file into place with its final name.
- */
- if (durable_rename(tmppath, path, progname) < 0)
- {
- /* durable_rename produced a log entry */
+ progname, histfname, stream->walmethod->getlasterror());
return false;
}
@@ -407,8 +308,7 @@ writeTimeLineHistoryFile(StreamCtl *stream, char *filename, char *content)
if (stream->mark_done)
{
/* writes error message if failed */
- if (!mark_file_as_archived(stream->basedir, histfname,
- stream->do_sync))
+ if (!mark_file_as_archived(stream, histfname))
return false;
}
@@ -618,7 +518,9 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
{
/*
* Fetch the timeline history file for this timeline, if we don't have
- * it already.
+ * it already. When streaming log to tar, this will always return
+ * false, as we are never streaming into an existing file and
+ * therefore there can be no pre-existing timeline history file.
*/
if (!existsTimeLineHistoryFile(stream))
{
@@ -777,10 +679,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
}
error:
- if (walfile != -1 && close(walfile) != 0)
+ if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
- walfile = -1;
+ progname, current_walfile_name, stream->walmethod->getlasterror());
+ walfile = NULL;
return false;
}
@@ -864,12 +766,12 @@ HandleCopyStream(PGconn *conn, StreamCtl *stream,
* If synchronous option is true, issue sync command as soon as there
* are WAL data which has not been flushed yet.
*/
- if (stream->synchronous && lastFlushPosition < blockpos && walfile != -1)
+ if (stream->synchronous && lastFlushPosition < blockpos && walfile != NULL)
{
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
goto error;
}
lastFlushPosition = blockpos;
@@ -1100,7 +1002,7 @@ ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
if (replyRequested && still_sending)
{
if (reportFlushPosition && lastFlushPosition < blockpos &&
- walfile != -1)
+ walfile != NULL)
{
/*
* If a valid flush location needs to be reported, flush the
@@ -1109,10 +1011,10 @@ ProcessKeepaliveMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* data has been successfully replicated or not, at the normal
* shutdown of the server.
*/
- if (stream->do_sync && fsync(walfile) != 0)
+ if (stream->walmethod->fsync(walfile) != 0)
{
fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
- progname, current_walfile_name, strerror(errno));
+ progname, current_walfile_name, stream->walmethod->getlasterror());
return false;
}
lastFlushPosition = blockpos;
@@ -1170,7 +1072,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
* Verify that the initial location in the stream matches where we think
* we are.
*/
- if (walfile == -1)
+ if (walfile == NULL)
{
/* No file open yet */
if (xlogoff != 0)
@@ -1184,12 +1086,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
{
/* More data in existing segment */
- /* XXX: store seek value don't reseek all the time */
- if (lseek(walfile, 0, SEEK_CUR) != xlogoff)
+ if (stream->walmethod->get_current_pos(walfile) != xlogoff)
{
fprintf(stderr,
_("%s: got WAL data offset %08x, expected %08x\n"),
- progname, xlogoff, (int) lseek(walfile, 0, SEEK_CUR));
+ progname, xlogoff, (int) stream->walmethod->get_current_pos(walfile));
return false;
}
}
@@ -1210,7 +1111,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
else
bytes_to_write = bytes_left;
- if (walfile == -1)
+ if (walfile == NULL)
{
if (!open_walfile(stream, *blockpos))
{
@@ -1219,14 +1120,13 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len,
}
}
- if (write(walfile,
- copybuf + hdr_len + bytes_written,
- bytes_to_write) != bytes_to_write)
+ if (stream->walmethod->write(walfile, copybuf + hdr_len + bytes_written,
+ bytes_to_write) != bytes_to_write)
{
fprintf(stderr,
_("%s: could not write %u bytes to WAL file \"%s\": %s\n"),
progname, bytes_to_write, current_walfile_name,
- strerror(errno));
+ stream->walmethod->getlasterror());
return false;
}
diff --git a/src/bin/pg_basebackup/receivelog.h b/src/bin/pg_basebackup/receivelog.h
index 7a3bbc5..b5913ea 100644
--- a/src/bin/pg_basebackup/receivelog.h
+++ b/src/bin/pg_basebackup/receivelog.h
@@ -13,6 +13,7 @@
#define RECEIVELOG_H
#include "libpq-fe.h"
+#include "walmethods.h"
#include "access/xlogdefs.h"
@@ -41,7 +42,7 @@ typedef struct StreamCtl
stream_stop_callback stream_stop; /* Stop streaming when returns true */
- char *basedir; /* Received segments written to this dir */
+ WalWriteMethod *walmethod; /* How to write the WAL */
char *partial_suffix; /* Suffix appended to partially received files */
} StreamCtl;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index a52bd4e..dca8c1c 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 67;
+use Test::More tests => 69;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -237,6 +237,10 @@ $node->command_ok(
'pg_basebackup -X stream runs');
ok(grep(/^[0-9A-F]{24}$/, slurp_dir("$tempdir/backupxf/pg_xlog")),
'WAL files copied');
+$node->command_ok(
+ [ 'pg_basebackup', '-D', "$tempdir/backupxst", '-X', 'stream', '-Ft' ],
+ 'pg_basebackup -X stream runs in tar mode');
+ok(-f "$tempdir/backupxst/pg_xlog.tar", "tar file was created");
$node->command_fails(
[ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
new file mode 100644
index 0000000..2cdd75b
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -0,0 +1,891 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.c - implementations of different ways to write received wal
+ *
+ * NOTE! The caller must ensure that only one method is instantiated in
+ * any given program, and that it's only instantiated once!
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.c
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <sys/stat.h>
+#include <time.h>
+#include <unistd.h>
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+#include "pgtar.h"
+#include "common/file_utils.h"
+
+#include "receivelog.h"
+#include "streamutil.h"
+
+/* Size of zlib buffer for .tar.gz */
+#define ZLIB_OUT_SIZE 4096
+
+/*-------------------------------------------------------------------------
+ * WalDirectoryMethod - write wal to a directory looking like pg_xlog
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Global static data for this method
+ */
+typedef struct DirectoryMethodData
+{
+ char *basedir;
+ bool sync;
+} DirectoryMethodData;
+static DirectoryMethodData *dir_data = NULL;
+
+/*
+ * Local file handle
+ */
+typedef struct DirectoryMethodFile
+{
+ int fd;
+ off_t currpos;
+ char *pathname;
+ char *fullpath;
+ char *temp_suffix;
+} DirectoryMethodFile;
+
+static char *
+dir_getlasterror(void)
+{
+ /* Directory method always sets errno, so just use strerror */
+ return strerror(errno);
+}
+
+static Walfile
+dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+ DirectoryMethodFile *f;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, pathname, temp_suffix ? temp_suffix : "");
+
+ fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (fd < 0)
+ return NULL;
+
+ if (pad_to_size)
+ {
+ /* Always pre-pad on regular files */
+ char *zerobuf;
+ int bytes;
+
+ zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ for (bytes = 0; bytes < pad_to_size; bytes += XLOG_BLCKSZ)
+ {
+ if (write(fd, zerobuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+ {
+ int save_errno = errno;
+
+ pg_free(zerobuf);
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+ pg_free(zerobuf);
+
+ if (lseek(fd, 0, SEEK_SET) != 0)
+ {
+ int save_errno = errno;
+
+ close(fd);
+ errno = save_errno;
+ return NULL;
+ }
+ }
+
+ /*
+ * fsync WAL file and containing directory, to ensure the file is
+ * persistently created and zeroed (if padded). That's particularly
+ * important when using synchronous mode, where the file is modified and
+ * fsynced in-place, without a directory fsync.
+ */
+ if (dir_data->sync)
+ {
+ if (fsync_fname(tmppath, false, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ if (fsync_parent_path(tmppath, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ }
+
+ f = pg_malloc0(sizeof(DirectoryMethodFile));
+ f->fd = fd;
+ f->currpos = 0;
+ f->pathname = pg_strdup(pathname);
+ f->fullpath = pg_strdup(tmppath);
+ if (temp_suffix)
+ f->temp_suffix = pg_strdup(temp_suffix);
+
+ return f;
+}
+
+static ssize_t
+dir_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+
+ Assert(f != NULL);
+
+ r = write(df->fd, buf, count);
+ if (r > 0)
+ df->currpos += r;
+ return r;
+}
+
+static off_t
+dir_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+
+ /* Use a cached value to prevent lots of reseeks */
+ return ((DirectoryMethodFile *) f)->currpos;
+}
+
+static int
+dir_close(Walfile f, WalCloseMethod method)
+{
+ int r;
+ DirectoryMethodFile *df = (DirectoryMethodFile *) f;
+ static char tmppath[MAXPGPATH];
+ static char tmppath2[MAXPGPATH];
+
+ Assert(f != NULL);
+
+ r = close(df->fd);
+
+ if (r == 0)
+ {
+ /* Build path to the current version of the file */
+ if (method == CLOSE_NORMAL && df->temp_suffix)
+ {
+ /*
+ * If we have a temp prefix, normal operation is to rename the
+ * file.
+ */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, df->pathname);
+ r = durable_rename(tmppath, tmppath2, progname);
+ }
+ else if (method == CLOSE_UNLINK
+ )
+ {
+ /* Unlink the file once it's closed */
+ snprintf(tmppath, sizeof(tmppath), "%s/%s%s",
+ dir_data->basedir, df->pathname, df->temp_suffix ? df->temp_suffix : "");
+ r = unlink(tmppath);
+ }
+ else
+ {
+ /*
+ * Else either CLOSE_NORMAL and no temp suffix, or
+ * CLOSE_NO_RENAME. In this case, fsync the file and containing
+ * directory if sync mode is requested.
+ */
+ if (dir_data->sync)
+ {
+ r = fsync_fname(df->fullpath, false, progname);
+ if (r == 0)
+ r = fsync_parent_path(df->fullpath, progname);
+ }
+ }
+ }
+
+ pg_free(df->pathname);
+ pg_free(df->fullpath);
+ if (df->temp_suffix)
+ pg_free(df->temp_suffix);
+ pg_free(df);
+
+ return r;
+}
+
+static int
+dir_fsync(Walfile f)
+{
+ Assert(f != NULL);
+
+ if (!dir_data->sync)
+ return 0;
+
+ return fsync(((DirectoryMethodFile *) f)->fd);
+}
+
+static ssize_t
+dir_get_file_size(const char *pathname)
+{
+ struct stat statbuf;
+ static char tmppath[MAXPGPATH];
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ if (stat(tmppath, &statbuf) != 0)
+ return -1;
+
+ return statbuf.st_size;
+}
+
+static bool
+dir_existsfile(const char *pathname)
+{
+ static char tmppath[MAXPGPATH];
+ int fd;
+
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, pathname);
+
+ fd = open(tmppath, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+ return true;
+}
+
+static bool
+dir_finish(void)
+{
+ if (dir_data->sync)
+ {
+ /*
+ * Files are fsynced when they are closed, but we need to fsync the
+ * directory entry here as well.
+ */
+ if (fsync_fname(dir_data->basedir, true, progname) != 0)
+ return false;
+ }
+ return true;
+}
+
+
+WalWriteMethod *
+CreateWalDirectoryMethod(const char *basedir, bool sync)
+{
+ WalWriteMethod *method;
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = dir_open_for_write;
+ method->write = dir_write;
+ method->get_current_pos = dir_get_current_pos;
+ method->get_file_size = dir_get_file_size;
+ method->close = dir_close;
+ method->fsync = dir_fsync;
+ method->existsfile = dir_existsfile;
+ method->finish = dir_finish;
+ method->getlasterror = dir_getlasterror;
+
+ dir_data = pg_malloc0(sizeof(DirectoryMethodData));
+ dir_data->basedir = pg_strdup(basedir);
+ dir_data->sync = sync;
+
+ return method;
+}
+
+
+/*-------------------------------------------------------------------------
+ * WalTarMethod - write wal to a tar file containing pg_xlog contents
+ *-------------------------------------------------------------------------
+ */
+
+typedef struct TarMethodFile
+{
+ off_t ofs_start; /* Where does the *header* for this file start */
+ off_t currpos;
+ char header[512];
+ char *pathname;
+ size_t pad_to_size;
+} TarMethodFile;
+
+typedef struct TarMethodData
+{
+ char *tarfilename;
+ int fd;
+ int compression;
+ bool sync;
+ TarMethodFile *currentfile;
+ char lasterror[1024];
+#ifdef HAVE_LIBZ
+ z_streamp zp;
+ void *zlibOut;
+#endif
+} TarMethodData;
+static TarMethodData *tar_data = NULL;
+
+#define tar_clear_error() tar_data->lasterror[0] = '\0'
+#define tar_set_error(msg) strlcpy(tar_data->lasterror, msg, sizeof(tar_data->lasterror))
+
+static char *
+tar_getlasterror(void)
+{
+ /*
+ * If a custom error is set, return that one. Otherwise, assume errno is
+ * set and return that one.
+ */
+ if (tar_data->lasterror[0])
+ return tar_data->lasterror;
+ return strerror(errno);
+}
+
+#ifdef HAVE_LIBZ
+static bool
+tar_write_compressed_data(void *buf, size_t count, bool flush)
+{
+ tar_data->zp->next_in = buf;
+ tar_data->zp->avail_in = count;
+
+ while (tar_data->zp->avail_in || flush)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+ }
+
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (flush)
+ {
+ /* Reset the stream for writing */
+ if (deflateReset(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateReset failed");
+ return false;
+ }
+ }
+
+ return true;
+}
+#endif
+
+static ssize_t
+tar_write(Walfile f, const void *buf, size_t count)
+{
+ ssize_t r;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /* Tarfile will always be positioned at the end */
+ if (!tar_data->compression)
+ {
+ r = write(tar_data->fd, buf, count);
+ if (r > 0)
+ ((TarMethodFile *) f)->currpos += r;
+ return r;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data((void *) buf, count, false))
+ return -1;
+ ((TarMethodFile *) f)->currpos += count;
+ return count;
+ }
+#endif
+}
+
+static bool
+tar_write_padding_data(TarMethodFile * f, size_t bytes)
+{
+ char *zerobuf = pg_malloc0(XLOG_BLCKSZ);
+ size_t bytesleft = bytes;
+
+ while (bytesleft)
+ {
+ size_t bytestowrite = bytesleft > XLOG_BLCKSZ ? XLOG_BLCKSZ : bytesleft;
+
+ size_t r = tar_write(f, zerobuf, bytestowrite);
+
+ if (r < 0)
+ return false;
+ bytesleft -= r;
+ }
+ return true;
+}
+
+static Walfile
+tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
+{
+ int save_errno;
+ static char tmppath[MAXPGPATH];
+
+ tar_clear_error();
+
+ if (tar_data->fd < 0)
+ {
+ /*
+ * We open the tar file only when we first try to write to it.
+ */
+ tar_data->fd = open(tar_data->tarfilename,
+ O_WRONLY | O_CREAT | PG_BINARY, S_IRUSR | S_IWUSR);
+ if (tar_data->fd < 0)
+ return NULL;
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ tar_data->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ tar_data->zp->zalloc = Z_NULL;
+ tar_data->zp->zfree = Z_NULL;
+ tar_data->zp->opaque = Z_NULL;
+ tar_data->zp->next_out = tar_data->zlibOut;
+ tar_data->zp->avail_out = ZLIB_OUT_SIZE;
+
+ /*
+ * Initialize deflation library. Adding the magic value 16 to the
+ * default 15 for the windowBits parameter makes the output be
+ * gzip instead of zlib.
+ */
+ if (deflateInit2(tar_data->zp, tar_data->compression, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
+ {
+ pg_free(tar_data->zp);
+ tar_data->zp = NULL;
+ tar_set_error("deflateInit2 failed");
+ return NULL;
+ }
+ }
+#endif
+
+ /* There's no tar header itself, the file starts with regular files */
+ }
+
+ Assert(tar_data->currentfile == NULL);
+ if (tar_data->currentfile != NULL)
+ {
+ tar_set_error("implementation error: tar files can't have more than one open file\n");
+ return NULL;
+ }
+
+ tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ /* Create a header with size set to 0 - we will fill out the size on close */
+ if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
+ {
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ tar_set_error("could not create tar header");
+ return NULL;
+ }
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush existing data */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ return NULL;
+
+ /* Turn off compression for header */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->ofs_start = lseek(tar_data->fd, 0, SEEK_CUR);
+ if (tar_data->currentfile->ofs_start == -1)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ tar_data->currentfile->currpos = 0;
+
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tar_data->currentfile->header, 512) != 512)
+ {
+ save_errno = errno;
+ pg_free(tar_data->currentfile);
+ tar_data->currentfile = NULL;
+ errno = save_errno;
+ return NULL;
+ }
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Write header through the zlib APIs but with no compression */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return NULL;
+
+ /* Re-enable compression for the rest of the file */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return NULL;
+ }
+ }
+#endif
+
+ tar_data->currentfile->pathname = pg_strdup(pathname);
+
+ /*
+ * Uncompressed files are padded on creation, but for compression we can't
+ * do that
+ */
+ if (pad_to_size)
+ {
+ tar_data->currentfile->pad_to_size = pad_to_size;
+ if (!tar_data->compression)
+ {
+ /* Uncompressed, so pad now */
+ tar_write_padding_data(tar_data->currentfile, pad_to_size);
+ /* Seek back to start */
+ if (lseek(tar_data->fd, tar_data->currentfile->ofs_start + 512, SEEK_SET) != tar_data->currentfile->ofs_start + 512)
+ return NULL;
+
+ tar_data->currentfile->currpos = 0;
+ }
+ }
+
+ return tar_data->currentfile;
+}
+
+static ssize_t
+tar_get_file_size(const char *pathname)
+{
+ tar_clear_error();
+
+ /* Currently not used, so not supported */
+ errno = ENOSYS;
+ return -1;
+}
+
+static off_t
+tar_get_current_pos(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ return ((TarMethodFile *) f)->currpos;
+}
+
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
+
+static int
+tar_close(Walfile f, WalCloseMethod method)
+{
+ ssize_t filesize;
+ int padding;
+ TarMethodFile *tf = (TarMethodFile *) f;
+
+ Assert(f != NULL);
+ tar_clear_error();
+
+ if (method == CLOSE_UNLINK)
+ {
+ if (tar_data->compression)
+ {
+ tar_set_error("unlink not supported with compression");
+ return -1;
+ }
+
+ /*
+ * Unlink the file that we just wrote to the tar. We do this by
+ * truncating it to the start of the header. This is safe as we only
+ * allow writing of the very last file.
+ */
+ if (ftruncate(tar_data->fd, tf->ofs_start) != 0)
+ return -1;
+
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+ }
+
+ /*
+ * Pad the file itself with zeroes if necessary. Note that this is
+ * different from the tar format padding -- this is the padding we asked
+ * for when the file was opened.
+ */
+ if (tf->pad_to_size)
+ {
+ if (tar_data->compression)
+ {
+ /*
+ * A compressed tarfile is padded on close since we cannot know
+ * the size of the compressed output until the end.
+ */
+ size_t sizeleft = tf->pad_to_size - tf->currpos;
+
+ if (sizeleft)
+ {
+ if (!tar_write_padding_data(tf, sizeleft))
+ return -1;
+ }
+ }
+ else
+ {
+ /*
+ * An uncompressed tarfile was padded on creation, so just adjust
+ * the current position as if we seeked to the end.
+ */
+ tf->currpos = tf->pad_to_size;
+ }
+ }
+
+ /*
+ * Get the size of the file, and pad the current data up to the nearest
+ * 512 byte boundary.
+ */
+ filesize = tar_get_current_pos(f);
+ padding = ((filesize + 511) & ~511) - filesize;
+ if (padding)
+ {
+ char zerobuf[512];
+
+ MemSet(zerobuf, 0, padding);
+ if (tar_write(f, zerobuf, padding) != padding)
+ return -1;
+ }
+
+
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ {
+ /* Flush the current buffer */
+ if (!tar_write_compressed_data(NULL, 0, true))
+ {
+ errno = EINVAL;
+ return -1;
+ }
+ }
+#endif
+
+ /*
+ * Now go back and update the header with the correct filesize and
+ * possibly also renaming the file. We overwrite the entire current header
+ * when done, including the checksum.
+ */
+ print_tar_number(&(tf->header[124]), 12, filesize);
+
+ if (method == CLOSE_NORMAL)
+
+ /*
+ * We overwrite it with what it was before if we have no tempname,
+ * since we're going to write the buffer anyway.
+ */
+ strlcpy(&(tf->header[0]), tf->pathname, 100);
+
+ print_tar_number(&(tf->header[148]), 8, tarChecksum(((TarMethodFile *) f)->header));
+ if (lseek(tar_data->fd, tf->ofs_start, SEEK_SET) != ((TarMethodFile *) f)->ofs_start)
+ return -1;
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, tf->header, 512) != 512)
+ return -1;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ /* Turn off compression */
+ if (deflateParams(tar_data->zp, 0, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+
+ /* Overwrite the header, assuming the size will be the same */
+ if (!tar_write_compressed_data(tar_data->currentfile->header, 512, true))
+ return -1;
+
+ /* Turn compression back on */
+ if (deflateParams(tar_data->zp, tar_data->compression, 0) != Z_OK)
+ {
+ tar_set_error("deflateParams failed");
+ return -1;
+ }
+ }
+#endif
+
+ /* Move file pointer back down to end, so we can write the next file */
+ if (lseek(tar_data->fd, 0, SEEK_END) < 0)
+ return -1;
+
+ /* Always fsync on close, so the padding gets fsynced */
+ tar_fsync(f);
+
+ /* Clean up and done */
+ pg_free(tf->pathname);
+ pg_free(tf);
+ tar_data->currentfile = NULL;
+
+ return 0;
+}
+
+static bool
+tar_existsfile(const char *pathname)
+{
+ tar_clear_error();
+ /* We only deal with new tarfiles, so nothing externally created exists */
+ return false;
+}
+
+static bool
+tar_finish(void)
+{
+ char zerobuf[1024];
+
+ tar_clear_error();
+
+ if (tar_data->currentfile)
+ {
+ if (tar_close(tar_data->currentfile, CLOSE_NORMAL) != 0)
+ return false;
+ }
+
+ /* A tarfile always ends with two empty blocks */
+ MemSet(zerobuf, 0, sizeof(zerobuf));
+ if (!tar_data->compression)
+ {
+ if (write(tar_data->fd, zerobuf, sizeof(zerobuf)) != sizeof(zerobuf))
+ return false;
+ }
+#ifdef HAVE_LIBZ
+ else
+ {
+ if (!tar_write_compressed_data(zerobuf, sizeof(zerobuf), false))
+ return false;
+
+ /* Also flush all data to make sure the gzip stream is finished */
+ tar_data->zp->next_in = NULL;
+ tar_data->zp->avail_in = 0;
+ while (true)
+ {
+ int r;
+
+ r = deflate(tar_data->zp, Z_FINISH);
+
+ if (r == Z_STREAM_ERROR)
+ {
+ tar_set_error("deflate failed");
+ return false;
+ }
+ if (tar_data->zp->avail_out < ZLIB_OUT_SIZE)
+ {
+ size_t len = ZLIB_OUT_SIZE - tar_data->zp->avail_out;
+
+ if (write(tar_data->fd, tar_data->zlibOut, len) != len)
+ return false;
+ }
+ if (r == Z_STREAM_END)
+ break;
+ }
+
+ if (deflateEnd(tar_data->zp) != Z_OK)
+ {
+ tar_set_error("deflateEnd failed");
+ return false;
+ }
+ }
+#endif
+
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
+
+ if (close(tar_data->fd) != 0)
+ return false;
+
+ tar_data->fd = -1;
+
+ if (tar_data->sync)
+ {
+ if (fsync_fname(tar_data->tarfilename, false, progname) != 0)
+ return false;
+ if (fsync_parent_path(tar_data->tarfilename, progname) != 0)
+ return false;
+ }
+
+ return true;
+}
+
+WalWriteMethod *
+CreateWalTarMethod(const char *tarbase, int compression, bool sync)
+{
+ WalWriteMethod *method;
+ const char *suffix = (compression != 0) ? ".tar.gz" : ".tar";
+
+ method = pg_malloc0(sizeof(WalWriteMethod));
+ method->open_for_write = tar_open_for_write;
+ method->write = tar_write;
+ method->get_current_pos = tar_get_current_pos;
+ method->get_file_size = tar_get_file_size;
+ method->close = tar_close;
+ method->fsync = tar_fsync;
+ method->existsfile = tar_existsfile;
+ method->finish = tar_finish;
+ method->getlasterror = tar_getlasterror;
+
+ tar_data = pg_malloc0(sizeof(TarMethodData));
+ tar_data->tarfilename = pg_malloc0(strlen(tarbase) + strlen(suffix) + 1);
+ sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
+ tar_data->fd = -1;
+ tar_data->compression = compression;
+ tar_data->sync = sync;
+ if (compression)
+ tar_data->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ return method;
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
new file mode 100644
index 0000000..fa58f81
--- /dev/null
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -0,0 +1,45 @@
+/*-------------------------------------------------------------------------
+ *
+ * walmethods.h
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_basebackup/walmethods.h
+ *-------------------------------------------------------------------------
+ */
+
+
+typedef void *Walfile;
+
+typedef enum
+{
+ CLOSE_NORMAL,
+ CLOSE_UNLINK,
+ CLOSE_NO_RENAME,
+} WalCloseMethod;
+
+typedef struct WalWriteMethod WalWriteMethod;
+struct WalWriteMethod
+{
+ Walfile(*open_for_write) (const char *pathname, const char *temp_suffix, size_t pad_to_size);
+ int (*close) (Walfile f, WalCloseMethod method);
+ bool (*existsfile) (const char *pathname);
+ ssize_t (*get_file_size) (const char *pathname);
+
+ ssize_t (*write) (Walfile f, const void *buf, size_t count);
+ off_t (*get_current_pos) (Walfile f);
+ int (*fsync) (Walfile f);
+ bool (*finish) (void);
+ char *(*getlasterror) (void);
+};
+
+/*
+ * Available WAL methods:
+ * - WalDirectoryMethod - write WAL to regular files in a standard pg_xlog
+ * - TarDirectoryMethod - write WAL to a tarfile corresponding to pg_xlog
+ * (only implements the methods required for pg_basebackup,
+ * not all those required for pg_receivexlog)
+ */
+WalWriteMethod *CreateWalDirectoryMethod(const char *basedir, bool sync);
+WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression, bool sync);
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 45ca400..1d179f0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -22,4 +22,5 @@ enum tarError
extern enum tarError tarCreateHeader(char *h, const char *filename, const char *linktarget,
pgoff_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
+extern void print_tar_number(char *s, int len, uint64 val);
extern int tarChecksum(char *header);
diff --git a/src/port/tar.c b/src/port/tar.c
index 52a2113..f1da959 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -16,7 +16,7 @@
* support only non-negative numbers, so we don't worry about the GNU rules
* for handling negative numbers.)
*/
-static void
+void
print_tar_number(char *s, int len, uint64 val)
{
if (val < (((uint64) 1) << ((len - 1) * 3)))
On Sat, Oct 15, 2016 at 8:51 AM, Magnus Hagander <magnus@hagander.net> wrote:
Fixed.
OK, I had an extra look at the patch:
+ <para>The transactionn log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
s/transactionn/transaction/, and the <para> markup should be on its own line.
+ if (dir_data->sync)
+ {
+ if (fsync_fname(tmppath, false, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ if (fsync_parent_path(tmppath, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ }
Nit: squashing both things together would simplify the code.
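To make the nit concrete, a minimal sketch of the squashed version (assuming fsync_fname() and fsync_parent_path() keep the signatures used in the patch; this is an illustration, not the committed code):
if (dir_data->sync)
{
	/* sketch: fold both fsync checks into a single error path */
	if (fsync_fname(tmppath, false, progname) != 0 ||
		fsync_parent_path(tmppath, progname) != 0)
	{
		close(fd);
		return NULL;
	}
}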
+ else if (method == CLOSE_UNLINK
+ )
Your finger slipped here.
Except for those points, it looks pretty good to me, so I am switching it to
ready for committer.
But independently of this patch, actually putting that test in for non-tar
mode would probably not be a bad idea -- if that breaks, it's likely both
break, after all.
Agreed (you were able to break only tar upthread with your patch). One
way to do that elegantly would be to:
1) extend slurp_dir to return only files that have a matching pattern.
That's not difficult to do:
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -184,10 +184,14 @@ sub generate_ascii_string
sub slurp_dir
{
- my ($dir) = @_;
+ my ($dir, $match_pattern) = @_;
opendir(my $dh, $dir)
or die "could not opendir \"$dir\": $!";
my @direntries = readdir $dh;
+ if (defined($match_pattern))
+ {
+ @direntries = grep($match_pattern, @direntries);
+ }
closedir $dh;
return @direntries;
}
Sorting them at the same time may be a good idea.
2) Add an option to pg_xlogdump so its output can be written to a
file. It would be awkward to rely on grabbing the output data from a
pipe... particularly on Windows. Thinking about it, would that
actually be useful to others? That's not a complicated patch.
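To make the idea concrete, here is a rough standalone sketch of how a frontend tool could point stdout at a file; the -o option and the helper name are hypothetical, not something pg_xlogdump provides:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: send stdout to a file named by a "-o FILE" option */
static void
redirect_output_to_file(const char *path)
{
	/* freopen() reuses the stdout stream, so existing printf() calls need no change */
	if (freopen(path, "w", stdout) == NULL)
	{
		fprintf(stderr, "could not open output file \"%s\"\n", path);
		exit(1);
	}
}

int
main(int argc, char **argv)
{
	if (argc > 2 && strcmp(argv[1], "-o") == 0)
		redirect_output_to_file(argv[2]);
	printf("record descriptions would be printed here\n");
	return 0;
}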
--
Michael
On Mon, Oct 17, 2016 at 2:37 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Except for those points, it looks pretty good to me, so I am switching it to
ready for committer.
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
This part conflicts with f82ec32, where you need to make pg_basebackup
aware of the backend version. I promise that's the last conflict; at
least I don't have more patches planned in the area.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Oct 21, 2016 at 2:02 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Mon, Oct 17, 2016 at 2:37 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Except for those points, it looks pretty good to me, so I am switching it to
ready for committer.
+ /*
+ * Create pg_xlog/archive_status (and thus pg_xlog) so we can write to
+ * basedir/pg_xlog as the directory entry in the tar file may arrive
+ * later.
+ */
+ snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status",
+ basedir);
This part conflicts with f82ec32, where you need to make pg_basebackup
aware of the backend version. I promise that's the last conflict; at
least I don't have more patches planned in the area.
It also broke the tests and invalidated some documentation. But it was easy
enough to fix.
I've now applied this, so next time you get to do the merging :P Joking
aside, please review and let me know if you can spot something I messed up
in the final merge.
Thanks for your repeated reviews!
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Mon, Oct 17, 2016 at 7:37 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Sat, Oct 15, 2016 at 8:51 AM, Magnus Hagander <magnus@hagander.net>
wrote:
Fixed.
OK, I had an extra look at the patch:
+ <para>The transactionn log files are written to a separate file
+ called <filename>pg_xlog.tar</filename>.
+ </para>
s/transactionn/transaction/, and the <para> markup should be on its own line.
+ if (dir_data->sync)
+ {
+ if (fsync_fname(tmppath, false, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ if (fsync_parent_path(tmppath, progname) != 0)
+ {
+ close(fd);
+ return NULL;
+ }
+ }
Nit: squashing both things together would simplify the code.
+ else if (method == CLOSE_UNLINK
+ )
Your finger slipped here.
Except for those points, it looks pretty good to me, so I am switching it to
ready for committer.
I incorporated those changes before pushing.
But independently of this patch, actually putting that test in for non-tar
mode would probably not be a bad idea -- if that breaks, it's likely both
break, after all.
Agreed (you were able to break only tar upthread with your patch). One
way to do that elegantly would be to:
1) extend slurp_dir to return only files that have a matching pattern.
That's not difficult to do:
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -184,10 +184,14 @@ sub generate_ascii_string
sub slurp_dir
{
- my ($dir) = @_;
+ my ($dir, $match_pattern) = @_;
opendir(my $dh, $dir)
or die "could not opendir \"$dir\": $!";
my @direntries = readdir $dh;
+ if (defined($match_pattern))
+ {
+ @direntries = grep($match_pattern, @direntries);
+ }
closedir $dh;
return @direntries;
}
Sorting them at the same time may be a good idea.
2) Add an option to pg_xlogdump so its output can be written to a
file. It would be awkward to rely on grabbing the output data from a
pipe... particularly on Windows. Thinking about it, would that
actually be useful to others? That's not a complicated patch.
I think both of those would be worthwhile. Not just for the testability in
itself; such a flag to pg_xlogdump would probably be useful in other
cases as well, beyond just the testing.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sun, Oct 23, 2016 at 10:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Mon, Oct 17, 2016 at 7:37 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
But independently of this patch, actually putting that test in for non-tar
mode would probably not be a bad idea -- if that breaks, it's likely both
break, after all.
Agreed (you were able to break only tar upthread with your patch). One
way to do that elegantly would be to:
1) extend slurp_dir to return only files that have a matching pattern.
That's not difficult to do:
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -184,10 +184,14 @@ sub generate_ascii_string
sub slurp_dir
{
- my ($dir) = @_;
+ my ($dir, $match_pattern) = @_;
opendir(my $dh, $dir)
or die "could not opendir \"$dir\": $!";
my @direntries = readdir $dh;
+ if (defined($match_pattern))
+ {
+ @direntries = grep($match_pattern, @direntries);
+ }
closedir $dh;
return @direntries;
}
Sorting them at the same time may be a good idea.
2) Add an option to pg_xlogdump so its output can be written to a
file. It would be awkward to rely on grabbing the output data from a
pipe... particularly on Windows. Thinking about it, would that
actually be useful to others? That's not a complicated patch.
I think both of those would be worthwhile. Not just for the testability in
itself; such a flag to pg_xlogdump would probably be useful in other
cases as well, beyond just the testing.
Looking quickly at the code, it does not seem that complicated... I
may just send patches tomorrow for all those things and be done with
it, all that on a new dedicated thread.
--
Michael
On Sun, Oct 23, 2016 at 10:52 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Sun, Oct 23, 2016 at 10:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
I think both of those would be worthwhile. Not just for the testability in
itself; such a flag to pg_xlogdump would probably be useful in other
cases as well, beyond just the testing.
Looking quickly at the code, it does not seem that complicated... I
may just send patches tomorrow for all those things and be done with
it, all that on a new dedicated thread.
Actually, it is not that simple after noticing that pg_xlogdump emulates some
of the backend's StringInfo routines and calls vprintf() at the end to
output everything to stdout, which is ugly. The cleanest solution here
would be to make StringInfo a bit more portable and allow it in
frontends, something that may be useful for any utility playing with
rm_desc. A less clean solution would be to store a fd
pointing to a file (or stdout) in compat.c and output to it. I'd
slightly prefer the first solution, but that does not seem worth the
effort just for pg_xlogdump and one test.
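As an illustration of the second, less clean option -- the names below are invented for the sketch, nothing in compat.c actually provides them -- the idea would be roughly:
#include <stdarg.h>
#include <stdio.h>

/* sketch: compat-layer state holding the output target for rm_desc text */
static FILE *desc_output = NULL;

/* choose the target once, e.g. stdout or a file opened by the caller */
static void
set_desc_output(FILE *f)
{
	desc_output = f;
}

/* route all formatted description output through the stored target */
static void
desc_printf(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(desc_output ? desc_output : stdout, fmt, ap);
	va_end(ap);
}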
--
Michael
On 2016-10-17 14:37:05 +0900, Michael Paquier wrote:
2) Add an option to pg_xlogdump so its output can be written to a
file. It would be awkward to rely on grabbing the output data from a
pipe... particularly on Windows. Thinking about it, would that
actually be useful to others? That's not a complicated patch.
Hm? Just redirecting output seems less complicated? And AFAIK that works on
Windows as well?
Greetings,
Andres Freund
On Mon, Oct 24, 2016 at 1:38 PM, Andres Freund <andres@anarazel.de> wrote:
On 2016-10-17 14:37:05 +0900, Michael Paquier wrote:
2) Add an option to pg_xlogdump so its output can be written to a
file. It would be awkward to rely on grabbing the output data from a
pipe... particularly on Windows. Thinking about it, would that
actually be useful to others? That's not a complicated patch.
Hm? Just redirecting output seems less complicated? And AFAIK that works on
Windows as well?
In the TAP suite STDOUT is already redirected to the log files. Perhaps
we could just do a SELECT FILE; to redirect the output of pg_xlogdump
temporarily into a custom location, just for the sake of a test like
that.
--
Michael
On Sun, Oct 23, 2016 at 10:28 PM, Magnus Hagander <magnus@hagander.net> wrote:
It also broke the tests and invalidated some documentation. But it was easy
enough to fix.
I've now applied this, so next time you get to do the merging :P Joking
aside, please review and let me know if you can spot something I messed up
in the final merge.
Just had another look at it.
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
fsync() should not be called here if --no-sync is used.
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
Similarly, the fsync call in tar_finish() should not happen when
--no-sync is used.
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog,
compresslevel, do_sync);
LogStreamerMain() exits immediately once it is done, but I think that
we had better be tidy here and clean up the WAL methods that have been
allocated. I am thinking here about a potential retry on
failure, though the best shot in this area would be with
ReceiveXlogStream().
Attached is a patch addressing those points.
--
Michael
Attachments:
pg_basebackup-tar-fixes.patchinvalid/octet-stream; name=pg_basebackup-tar-fixes.patchDownload
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 16cab97..e2875df 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -495,6 +495,13 @@ LogStreamerMain(logstreamer_param *param)
}
PQfinish(param->bgconn);
+
+ if (format == 'p')
+ FreeWalDirectoryMethod();
+ else
+ FreeWalTarMethod();
+ pg_free(stream.walmethod);
+
return 0;
}
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index d1dc046..656622f 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -299,6 +299,13 @@ CreateWalDirectoryMethod(const char *basedir, bool sync)
return method;
}
+void
+FreeWalDirectoryMethod(void)
+{
+ pg_free(dir_data->basedir);
+ pg_free(dir_data);
+}
+
/*-------------------------------------------------------------------------
* WalTarMethod - write wal to a tar file containing pg_xlog contents
@@ -611,6 +618,9 @@ tar_sync(Walfile f)
Assert(f != NULL);
tar_clear_error();
+ if (!tar_data->sync)
+ return 0;
+
/*
* Always sync the whole tarfile, because that's all we can do. This makes
* no sense on compressed files, so just ignore those.
@@ -842,7 +852,8 @@ tar_finish(void)
#endif
/* sync the empty blocks as well, since they're after the last file */
- fsync(tar_data->fd);
+ if (tar_data->sync)
+ fsync(tar_data->fd);
if (close(tar_data->fd) != 0)
return false;
@@ -890,3 +901,14 @@ CreateWalTarMethod(const char *tarbase, int compression, bool sync)
return method;
}
+
+void
+FreeWalTarMethod(void)
+{
+ pg_free(tar_data->tarfilename);
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ pg_free(tar_data->zlibOut);
+#endif
+ pg_free(tar_data);
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
index 0c8eac7..8cea8ff 100644
--- a/src/bin/pg_basebackup/walmethods.h
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -43,3 +43,7 @@ struct WalWriteMethod
*/
WalWriteMethod *CreateWalDirectoryMethod(const char *basedir, bool sync);
WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression, bool sync);
+
+/* Cleanup routines for previously-created methods */
+void FreeWalDirectoryMethod(void);
+void FreeWalTarMethod(void);
On Mon, Oct 24, 2016 at 7:46 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Sun, Oct 23, 2016 at 10:28 PM, Magnus Hagander <magnus@hagander.net>
wrote:
It also broke the tests and invalidated some documentation. But it was easy
enough to fix.
I've now applied this, so next time you get to do the merging :P Joking
aside, please review and let me know if you can spot something I messed up
in the final merge.
Just had another look at it.
+static int
+tar_fsync(Walfile f)
+{
+ Assert(f != NULL);
+ tar_clear_error();
+
+ /*
+ * Always sync the whole tarfile, because that's all we can do. This makes
+ * no sense on compressed files, so just ignore those.
+ */
+ if (tar_data->compression)
+ return 0;
+
+ return fsync(tar_data->fd);
+}
fsync() should not be called here if --no-sync is used.
+ /* sync the empty blocks as well, since they're after the last file */
+ fsync(tar_data->fd);
Similarly, the fsync call in tar_finish() should not happen when
--no-sync is used.
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog,
compresslevel, do_sync);
LogStreamerMain() exits immediately once it is done, but I think that
we had better be tidy here and clean up the WAL methods that have been
allocated. I am thinking here about a potential retry on failure,
though the best shot in this area would be with ReceiveXlogStream().
Attached is a patch addressing those points.
Yeah, agreed.
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog,
compresslevel, do_sync);
LogStreamerMain() exits immediately once it is done, but I think that
we had better be tidy here and clean up the WAL methods that have been
allocated. I am thinking here about a potential retry on failure,
though the best shot in this area would be with ReceiveXlogStream().
Wouldn't the same be needed in pg_receivexlog.c in that case?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Tue, Oct 25, 2016 at 7:12 PM, Magnus Hagander <magnus@hagander.net> wrote:
On Mon, Oct 24, 2016 at 7:46 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Sun, Oct 23, 2016 at 10:28 PM, Magnus Hagander <magnus@hagander.net> wrote:
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog,
compresslevel, do_sync);
LogStreamerMain() exits immediately once it is done, but I think that
we had better be tidy here and clean up the WAL methods that have been
allocated. I am thinking here about a potential retry on failure,
though the best shot in this area would be with ReceiveXlogStream().
Wouldn't the same be needed in pg_receivexlog.c in that case?
Oops, missed that. Thanks for the extra checks. Attached is an updated patch.
--
Michael
Attachments:
pg_basebackup-tar-fixes-2.patchtext/x-diff; charset=US-ASCII; name=pg_basebackup-tar-fixes-2.patchDownload
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 16cab97..e2875df 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -495,6 +495,13 @@ LogStreamerMain(logstreamer_param *param)
}
PQfinish(param->bgconn);
+
+ if (format == 'p')
+ FreeWalDirectoryMethod();
+ else
+ FreeWalTarMethod();
+ pg_free(stream.walmethod);
+
return 0;
}
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index bbdf96e..99445e6 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -352,6 +352,10 @@ StreamLog(void)
}
PQfinish(conn);
+
+ FreeWalDirectoryMethod();
+ pg_free(stream.walmethod);
+
conn = NULL;
}
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index d1dc046..656622f 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -299,6 +299,13 @@ CreateWalDirectoryMethod(const char *basedir, bool sync)
return method;
}
+void
+FreeWalDirectoryMethod(void)
+{
+ pg_free(dir_data->basedir);
+ pg_free(dir_data);
+}
+
/*-------------------------------------------------------------------------
* WalTarMethod - write wal to a tar file containing pg_xlog contents
@@ -611,6 +618,9 @@ tar_sync(Walfile f)
Assert(f != NULL);
tar_clear_error();
+ if (!tar_data->sync)
+ return 0;
+
/*
* Always sync the whole tarfile, because that's all we can do. This makes
* no sense on compressed files, so just ignore those.
@@ -842,7 +852,8 @@ tar_finish(void)
#endif
/* sync the empty blocks as well, since they're after the last file */
- fsync(tar_data->fd);
+ if (tar_data->sync)
+ fsync(tar_data->fd);
if (close(tar_data->fd) != 0)
return false;
@@ -890,3 +901,14 @@ CreateWalTarMethod(const char *tarbase, int compression, bool sync)
return method;
}
+
+void
+FreeWalTarMethod(void)
+{
+ pg_free(tar_data->tarfilename);
+#ifdef HAVE_LIBZ
+ if (tar_data->compression)
+ pg_free(tar_data->zlibOut);
+#endif
+ pg_free(tar_data);
+}
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
index 0c8eac7..8cea8ff 100644
--- a/src/bin/pg_basebackup/walmethods.h
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -43,3 +43,7 @@ struct WalWriteMethod
*/
WalWriteMethod *CreateWalDirectoryMethod(const char *basedir, bool sync);
WalWriteMethod *CreateWalTarMethod(const char *tarbase, int compression, bool sync);
+
+/* Cleanup routines for previously-created methods */
+void FreeWalDirectoryMethod(void);
+void FreeWalTarMethod(void);
On Tue, Oct 25, 2016 at 2:52 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Tue, Oct 25, 2016 at 7:12 PM, Magnus Hagander <magnus@hagander.net>
wrote:
On Mon, Oct 24, 2016 at 7:46 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Sun, Oct 23, 2016 at 10:28 PM, Magnus Hagander <magnus@hagander.net> wrote:
+ if (format == 'p')
+ stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync);
+ else
+ stream.walmethod = CreateWalTarMethod(param->xlog,
compresslevel, do_sync);
LogStreamerMain() exits immediately once it is done, but I think that
we had better be tidy here and clean up the WAL methods that have been
allocated. I am thinking here about a potential retry on failure,
though the best shot in this area would be with ReceiveXlogStream().
Wouldn't the same be needed in pg_receivexlog.c in that case?
Oops, missed that. Thanks for the extra checks. Attached is an updated
patch.
Thanks, applied and pushed.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Wed, Oct 26, 2016 at 2:00 AM, Magnus Hagander <magnus@hagander.net> wrote:
Thanks, applied and pushed.
Thanks.
--
Michael