[RFC] Incremental backup v3: incremental PoC

Started by Marco Nenciarini, about 11 years ago, 44 messages

marco.nenciarini@2ndquadrant.it

about 11 years ago

1 attachment(s)

Hi Hackers,

Following the advice gathered on the list, I've prepared a third partial
patch towards implementing incremental pg_basebackup, as described
here: https://wiki.postgresql.org/wiki/Incremental_backup

== Changes

Compared to the previous version I've made the following changes:

* The backup_profile is not optional anymore. Generating it is cheap
enough not to bother the user with such a choice.

* I've isolated the code which detects the maxLSN of a segment in a
separate getMaxLSN function (getFileMaxLSN in the attached patch). At the
moment it works by scanning the whole file, but I'm looking to replace it
in the next versions.

* I've made it possible to request an incremental backup by passing a "-I
<LSN>" option to pg_basebackup. It is probably too "raw" to remain as
is, but it is useful at this stage to test the code.

* I've modified the backup label to report the fact that the backup was
taken with the incremental option. The result will be something like:

START WAL LOCATION: 0/52000028 (file 000000010000000000000052)
CHECKPOINT LOCATION: 0/52000060
INCREMENTAL FROM LOCATION: 0/51000028
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2014-10-14 16:05:04 CEST
LABEL: pg_basebackup base backup
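
For illustration, the whole-file scan mentioned in the getMaxLSN item above
can be sketched in Python (the patch itself does this in C). This is only a
sketch under assumptions, not the patch's code: it assumes the default 8 kB
block size, that every page starts with its LSN stored as two native-endian
32-bit integers (high half first), and it ignores a trailing partial page.
The name file_max_lsn is hypothetical.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def file_max_lsn(path, blcksz=BLCKSZ):
    """Return the highest page LSN found in one relation segment file."""
    max_lsn = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(blcksz)
            if len(page) < blcksz:
                # End of file (a trailing partial page is ignored).
                break
            # The page header begins with pd_lsn: (xlogid, xrecoff).
            xlogid, xrecoff = struct.unpack_from("=II", page, 0)
            max_lsn = max(max_lsn, (xlogid << 32) | xrecoff)
    return max_lsn
```

The cost is one full read of every relation file per backup, which is why
a cheaper source for the per-segment maxLSN is worth looking for.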

== Testing it

At this stage you can make an incremental file-level backup using this
procedure:

pg_basebackup -v -F p -D /tmp/x -x
LSN=$(awk '/^START WAL/{print $4}' /tmp/x/backup_profile)
pg_basebackup -v -F p -D /tmp/y -I $LSN -x

The result will be an incremental backup in /tmp/y based on the full
backup in /tmp/x.
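
The LSN extracted by awk above is in the textual "high/low" hexadecimal
form. As a rough sketch (in Python, with the hypothetical helper names
parse_lsn, format_lsn and must_send), this is how such a value maps to a
64-bit position, and how the per-file decision in the patch can be read: a
file is sent unless it is a relation file whose maxLSN is not newer than
the requested start point.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/52000028' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Convert a 64-bit LSN back to the 'X/X' text form."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def must_send(is_relfile, file_max_lsn, incremental_start):
    """Mirror of the condition used in sendDir() in the attached patch."""
    return (not is_relfile
            or incremental_start == 0
            or file_max_lsn > incremental_start)
```

A start point of 0 means "not incremental", so every file is sent in that
case.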

You can "reintegrate" the incremental backup into the /tmp/z directory
with the following little Python script, calling it as

./recover.py /tmp/x /tmp/y /tmp/z

----
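#!/usr/bin/env python
# recover.py: rebuild a full backup from a base backup plus an incremental one

import os
import shutil
import sys

if len(sys.argv) != 4:
    sys.stderr.write("usage: %s base incremental destination\n" % sys.argv[0])
    sys.exit(1)

base = sys.argv[1]
incr = sys.argv[2]
dest = sys.argv[3]

if os.path.exists(dest):
    sys.stderr.write("error: destination must not exist (%s)\n" % dest)
    sys.exit(1)

profile = open(os.path.join(incr, 'backup_profile'), 'r')

# Skip the profile header; the file list starts after the 'FILE LIST' line.
for line in profile:
    if line.strip() == 'FILE LIST':
        break

# Start from the incremental backup, then fill in the files it did not send.
shutil.copytree(incr, dest)
for line in profile:
    tblspc, lsn, sent, date, size, path = line.strip().split('\t')
    # Skip files the incremental backup already contains, and non-relation
    # files (no LSN), which are never taken from the base backup.
    if sent == 't' or lsn == '\\N':
        continue
    base_file = os.path.join(base, path)
    dest_file = os.path.join(dest, path)
    shutil.copy2(base_file, dest_file)

profile.close()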
----

It obviously has to be replaced by a full-fledged user tool, but it is
enough to test the concept.
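
To check what an incremental run actually skipped, the file list of a
backup_profile can be summarized with a few lines of Python. This is only a
sketch: it assumes the same layout the script above relies on (a 'FILE LIST'
line followed by tab-separated rows of tablespace, maxLSN, sent flag, mtime,
size and path), and the names iter_profile_entries and summarize are
hypothetical.

```python
def iter_profile_entries(lines):
    """Yield one dict per file-list row of a backup_profile."""
    it = iter(lines)
    # Skip the header section; rows start after the 'FILE LIST' line.
    for line in it:
        if line.strip() == "FILE LIST":
            break
    for line in it:
        line = line.rstrip("\n")
        if not line:
            continue
        tblspc, lsn, sent, mtime, size, path = line.split("\t")
        yield {
            "tablespace": None if tblspc == "\\N" else tblspc,
            "max_lsn": None if lsn == "\\N" else lsn,
            "sent": sent == "t",
            "mtime": int(mtime),
            "size": int(size),
            "path": path,
        }


def summarize(lines):
    """Return (files sent, files skipped, bytes skipped)."""
    sent = skipped = skipped_bytes = 0
    for entry in iter_profile_entries(lines):
        if entry["sent"]:
            sent += 1
        else:
            skipped += 1
            skipped_bytes += entry["size"]
    return sent, skipped, skipped_bytes
```

For example, summarize(open("/tmp/y/backup_profile")) would report how many
files the incremental backup in /tmp/y left to be taken from /tmp/x.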

== What next

I would like to replace the getMaxLSN function with a more-or-less
persistent structure which contains the maxLSN for each data segment.

To make it work I would hook into the ForwardFsyncRequest() function in
src/backend/postmaster/checkpointer.c and update an in-memory hash every
time a block is going to be fsynced. The structure could be persisted on
disk at some point (probably at checkpoint).

I think a good key for the hash would be a BufferTag with blocknum
"rounded" to the start of the segment.
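
As a rough illustration of that key, here is a Python sketch of a
per-segment maxLSN table. Everything in it is an assumption made for the
example: the class and method names are hypothetical, the segment size is
the default 131072 blocks (1 GB of 8 kB blocks), and it assumes the caller
can supply the LSN of the block being written, which this mail does not
specify.

```python
RELSEG_SIZE = 131072  # assumption: blocks per segment in a default build


class SegmentMaxLSN:
    """In-memory map from (tablespace, db, rel, fork, segment start) to maxLSN."""

    def __init__(self, relseg_size=RELSEG_SIZE):
        self.relseg_size = relseg_size
        self.max_lsn = {}

    def segment_key(self, spc, db, rel, fork, blocknum):
        # "Round" the block number down to the first block of its segment.
        return (spc, db, rel, fork, blocknum - blocknum % self.relseg_size)

    def note_write(self, spc, db, rel, fork, blocknum, lsn):
        # Remember the highest LSN seen for the segment holding this block.
        key = self.segment_key(spc, db, rel, fork, blocknum)
        if lsn > self.max_lsn.get(key, 0):
            self.max_lsn[key] = lsn

    def changed_since(self, start_lsn):
        # Segments an incremental backup from start_lsn would have to send.
        return sorted(k for k, v in self.max_lsn.items() if v > start_lsn)
```

With such a table, deciding whether to send a segment becomes a lookup
instead of a scan of the whole file.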

I'm asking here for comments and advice on how to implement it in an
acceptable way.

== Disclaimer

The code here is an intermediate step: it does not contain any
documentation besides the code comments and will be subject to deep and
radical changes. However, I believe it can be a base to allow PostgreSQL
to have its file-based incremental backup, and a block-based incremental
backup after it.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

file-based-incremental-backup.patch (text/plain; charset=UTF-8)

From 5a7365fc3115c831627c087311c702a79cb355bc Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup

Add backup profile to pg_basebackup
INCREMENTAL option naive implementation
---
 src/backend/access/transam/xlog.c      |   7 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 316 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  |  83 +++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   4 +
 8 files changed, 391 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 235b442..4dc79f0 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9718,9724 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9718,9725 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9936,9941 ****
--- 9937,9946 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 133143d..f1248fa 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index fbcecbb..26f3b8e 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,81 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool getFileMaxLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool is_relfile, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
+ static bool validateRelfilenodeName(char *name);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 105,116 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 150,160 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 230,237 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 248,255 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 291,302 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 305,311 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 314,323 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 498,503 ****
--- 529,536 ----
  
  			/* XLogSegSize is a multiple of 512, so no need for padding */
  			FreeFile(fp);
+ 
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
  		}
  
  		/*
*************** perform_base_backup(basebackup_options *
*** 521,532 ****
--- 554,569 ----
  						 errmsg("could not stat file \"%s\": %m", pathbuf)));
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
  		}
  
  		/* Send CopyDone message for the last tar file */
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 555,560 ****
--- 592,598 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 625,630 ****
--- 663,680 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 844,849 ****
--- 894,902 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 854,860 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 907,913 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 887,893 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 940,946 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 899,907 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 952,964 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1100,1114 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1157,1209 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Whithin PGDATA relnode files are contained only in "global"
! 				 * and "base" directory
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 */
! 				is_relfile = has_relfiles && validateRelfilenodeName(de->d_name);
! 
! 				/*
! 				 * If is_relfile get the MaxLSN. If unable to get the MaxLSN
! 				 * set is_relfile to false.
! 				 */
! 				is_relfile = is_relfile && getFileMaxLSN(pathbuf, &statbuf,
! 														 &filemaxlsn);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| filemaxlsn > incremental_startpoint)
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 
! 				/* Write a backup profile entry for this file. */
! 				writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 									   is_relfile, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1303,1305 ****
--- 1398,1595 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Read the maximum LSN found in one data file (relnode file).
+  */
+ static bool
+ getFileMaxLSN(char *filename, struct stat * statbuf, XLogRecPtr *filemaxlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return false;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 	return true;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool is_relfile, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + mtime (10) + size (19) +
+ 	 * path (MAXPGPATH) + separators (4) + trailing \0 = 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[17];
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (is_relfile)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a tar CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Being the concatenation of two separators and the backup label, it
+ 	 * should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
+ 
+ /*
+  * relfilenode name validation.
+  *
+  * Format with_ext == true	[0-9]+[ \w | _vm | _fsm | _init ][\.][0-9]*
+  *		  with_ext == false [0-9]+
+  */
+ static bool
+ validateRelfilenodeName(char *name)
+ {
+ 	int			pos = 0;
+ 
+ 	while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 		pos++;
+ 
+ 	if (name[pos] == '_')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ 			pos++;
+ 	}
+ 	if (name[pos] == '.')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 			pos++;
+ 	}
+ 
+ 	if (name[pos] == 0)
+ 		return true;
+ 
+ 	return false;
+ }
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 154aaac..97f1091 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index a257124..74c5119 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0ebda9a..9902a8a 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 66,71 ****
--- 66,72 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
  
  
  /* Progress counters */
*************** static void verify_dir_is_empty_or_creat
*** 100,106 ****
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,108 ----
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 231,236 ****
--- 233,240 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=STARTPOINT\n"
+ 			 "                         send only changes after STARTPOINT\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** get_tablespace_mapping(const char *dir)
*** 1116,1124 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1120,1135 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1129,1141 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1140,1167 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1342,1348 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1368,1376 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1358,1364 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1386,1394 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1574,1579 ****
--- 1604,1610 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1635,1648 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1666,1685 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1756,1762 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1793,1799 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1790,1795 ****
--- 1827,1837 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 */
+ 	ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1929,1934 ****
--- 1971,1977 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1936,1942 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
! 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 1979,1985 ----
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
! 	uint32		hi, lo;
  	int			option_index;
  
  	progname = get_progname(argv[0]);
*************** main(int argc, char **argv)
*** 1957,1963 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2000,2006 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2075,2080 ****
--- 2118,2133 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				if (sscanf(optarg, "%X/%X", &hi, &lo) != 2)
+ 				{
+ 					fprintf(stderr,
+ 							_("%s: could not parse incremental start position \"%s\"\n"),
+ 							progname, optarg);
+ 					exit(1);
+ 				}
+ 				incremental_startpoint = ((uint64) hi << 32) | lo;
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0f068d9..91d05d5 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 349,355 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 349,356 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 988bce7..9210e67 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,29 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.1.2

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

about 11 years ago

In reply to: Marco Nenciarini (#1)

Re: [RFC] Incremental backup v3: incremental PoC

I've noticed that I forgot to add this to the commitfest.

I've just added it.

It is not meant to end up in a committable state, but at this point I'm
looking for some code review and more discussion.

I'm also about to send an additional patch to implement an LSN map as an
additional fork for heap files.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Michael Paquier

michael.paquier@gmail.com

about 11 years ago

In reply to: Marco Nenciarini (#2)

Re: [RFC] Incremental backup v3: incremental PoC

On Mon, Jan 5, 2015 at 7:56 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

I've noticed that I forgot to add this to the commitfest.

I've just added it.

It is not meant to end up in a committable state, but at this point I'm
looking for some code review and more discussion.

I'm also about to send an additional patch to implement an LSN map as an
additional fork for heap files.

Moved to CF 2015-02.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

about 11 years ago

In reply to: Marco Nenciarini (#1)

Re: [RFC] Incremental backup v3: incremental PoC

On Tue, Oct 14, 2014 at 1:17 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

I would like to replace the getMaxLSN function with a more-or-less persistent
structure which contains the maxLSN for each data segment.

To make it work I would hook into the ForwardFsyncRequest() function in
src/backend/postmaster/checkpointer.c and update an in memory hash every
time a block is going to be fsynced. The structure could be persisted on
disk at some time (probably on checkpoint).

I think a good key for the hash would be a BufferTag with blocknum
"rounded" to the start of the segment.

I'm asking here for comments and advice on how to implement it in an
acceptable way.

I'm afraid this is going to be quite tricky to implement. There's no
way to make the in-memory hash table large enough that it can
definitely contain all of the entries for the entire database. Even
if it's big enough at a certain point in time, somebody can create
100,000 new tables and now it's not big enough any more. This is not
unlike the problem we had with the visibility map and free space map
before 8.4 (and you probably remember how much fun that was).
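
To make the proposal concrete, here is a minimal Python sketch of such a
per-segment map. It is only an illustration, not code from the patch:
SegmentKey, segment_key and MaxLsnMap are hypothetical names, and
RELSEG_SIZE = 131072 assumes the default build (8 kB blocks, 1 GB segments).

```python
from typing import Dict, NamedTuple, Optional

# Blocks per segment, assuming the default 8 kB block and 1 GB segment size.
RELSEG_SIZE = 131072


class SegmentKey(NamedTuple):
    """Hypothetical key: a buffer tag with the block number rounded down."""
    tablespace: int
    database: int
    relfilenode: int
    fork: int
    first_block: int  # first block number of the segment


def segment_key(tablespace: int, database: int, relfilenode: int,
                fork: int, blocknum: int) -> SegmentKey:
    """Round blocknum down to the start of its segment and build the key."""
    first_block = blocknum - (blocknum % RELSEG_SIZE)
    return SegmentKey(tablespace, database, relfilenode, fork, first_block)


class MaxLsnMap:
    """In-memory map from segment key to the highest LSN seen for it."""

    def __init__(self) -> None:
        self._max_lsn: Dict[SegmentKey, int] = {}

    def note_write(self, key: SegmentKey, lsn: int) -> None:
        """Record a write, keeping only the largest LSN per segment."""
        if lsn > self._max_lsn.get(key, 0):
            self._max_lsn[key] = lsn

    def max_lsn(self, key: SegmentKey) -> Optional[int]:
        """Return the recorded max LSN, or None for an unknown segment."""
        return self._max_lsn.get(key)

    def changed_since(self, key: SegmentKey, threshold: int) -> bool:
        """Unknown segments count as changed, which is the safe default."""
        lsn = self._max_lsn.get(key)
        return True if lsn is None else lsn >= threshold
```

The dictionary gains one entry per segment ever written, so its size is not
bounded in advance, which is the sizing problem described above.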

I suggest leaving this out altogether for the first version. I can
think of three possible ways that we can determine which blocks need
to be backed up. One, just read every block in the database and look
at the LSN of each one. Two, maintain a cache of LSN information on a
per-segment (or smaller) basis, as you suggest here. Three, scan the
WAL generated since the incremental backup and summarize it into a
list of blocks that need to be backed up. This last idea could either
be done when the backup is requested, or it could be done as the WAL
is generated and used to populate the LSN cache. In the long run, I
think some variant of approach #3 is likely best, but in the short
run, approach #1 (scan everything) is certainly easiest. While it
doesn't optimize I/O, it still gives you the benefit of reducing the
amount of data that needs to be transferred and stored, and that's not
nothing. If we get that much working, we can improve things more
later.
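
As a rough illustration of approach #1, the following Python sketch reads a
relation file block by block and compares each page LSN with a threshold. It
assumes the default 8 kB block size and that the file is read on the machine
that wrote it (the LSN in the page header is stored as two native-endian
32-bit words). page_lsn and file_changed_since are hypothetical helper names.

```python
import struct

BLCKSZ = 8192  # assumed default block size; it is a build-time option


def page_lsn(page: bytes) -> int:
    """Return the LSN stored in the first 8 bytes of a page header."""
    # pd_lsn is two native-endian uint32 values: high half, then low half.
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def file_changed_since(path: str, threshold_lsn: int) -> bool:
    """Return True if any block in the file has an LSN >= threshold_lsn."""
    with open(path, "rb") as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < 8:
                # End of file (or a tail too short to hold an LSN).
                return False
            if page_lsn(page) >= threshold_lsn:
                # One modified block is enough to require sending the file.
                return True
```

Every block is still read from disk, so this saves transfer and storage
rather than I/O on the server.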

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Jehan-Guillaume de Rorthais

jgdr@dalibo.com

about 11 years ago

In reply to: Robert Haas (#5)

Re: [RFC] Incremental backup v3: incremental PoC

On Tue, 6 Jan 2015 08:26:22 -0500
Robert Haas <robertmhaas@gmail.com> wrote:

Three, scan the WAL generated since the incremental backup and summarize it
into a list of blocks that need to be backed up.

This can be done from the archive side. I was talking about this some months
ago:

/messages/by-id/51C4DD20.3000103@free.fr

One of the traps I can think of is that it requires "full_page_writes=on" so
that we can reconstruct each block correctly. The corollary is that we need to
start a diff backup right after a checkpoint.

And even without "full_page_writes=on", maybe we could add a function, say
"pg_start_backupdiff()", which would force full pages to be logged right after
it is called, the same way "full_page_writes" does after a checkpoint. Diff
backups would then be possible from each LSN where we called
pg_start_backupdiff() until whenever.

Building this backup by merging versions of blocks from WAL is one big step.
But then there is a file format to define, a way to restore it, and a decision
to make about which tools/functions/GUCs to expose to admins.

After discussing with Magnus, he advised me to wait for a diff backup file
format to emerge from online tools, like the one discussed here (at the time,
it was Michael's proposal based on pg_basebackup that was being discussed). But
I wonder how much easier it would be to do this the opposite way. If this idea
of building diff backups offline from archives is feasible, wouldn't it remove
a lot of the trouble you are discussing here?

Regards,
--
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

about 11 years ago

In reply to: Robert Haas (#5)

Re: [RFC] Incremental backup v3: incremental PoC

Il 06/01/15 14:26, Robert Haas ha scritto:

I suggest leaving this out altogether for the first version. I can
think of three possible ways that we can determine which blocks need
to be backed up. One, just read every block in the database and look
at the LSN of each one. Two, maintain a cache of LSN information on a
per-segment (or smaller) basis, as you suggest here. Three, scan the
WAL generated since the incremental backup and summarize it into a
list of blocks that need to be backed up. This last idea could either
be done when the backup is requested, or it could be done as the WAL
is generated and used to populate the LSN cache. In the long run, I
think some variant of approach #3 is likely best, but in the short
run, approach #1 (scan everything) is certainly easiest. While it
doesn't optimize I/O, it still gives you the benefit of reducing the
amount of data that needs to be transferred and stored, and that's not
nothing. If we get that much working, we can improve things more
later.

Hi,
The patch currently uses approach #1, but I've just sent a patch that uses
approach #2.

54AD016E.9020406@2ndquadrant.it

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Gabriele Bartolini

gabriele.bartolini@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#7)

Re: [RFC] Incremental backup v3: incremental PoC

Hi Marco,

could you please send an updated version of the patch against the current
HEAD, to make things easier for reviewers?

Thanks,
Gabriele

--
Gabriele Bartolini - 2ndQuadrant Italia - Managing Director
PostgreSQL Training, Services and Support
gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Gabriele Bartolini (#8)

1 attachment(s)

Re: [RFC] Incremental backup v3: incremental PoC

Il 13/01/15 12:53, Gabriele Bartolini ha scritto:

Hi Marco,

could you please send an updated version of the patch against the current
HEAD, to make things easier for reviewers?

Here is the updated patch for file-based incremental backup.

It is based on the current HEAD.

I'm now working on the client tool to rebuild a full backup starting
from a file-based incremental backup.
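
A minimal Python sketch of how such a tool might read the backup profile,
assuming the layout implied by the patch: a "POSTGRESQL BACKUP PROFILE 1"
header line, the backup label, a "FILE LIST" separator line, then one
tab-separated row per file (tablespace, max LSN, sent flag, mtime, size,
path), with \N standing for a missing value. ProfileEntry, parse_lsn and
parse_profile are hypothetical names, not part of the patch.

```python
from typing import List, NamedTuple, Optional, Tuple

HEADER = "POSTGRESQL BACKUP PROFILE 1"
SEPARATOR = "FILE LIST"
NULL = "\\N"  # marker for a missing value, as written by the patch


class ProfileEntry(NamedTuple):
    tablespace: Optional[str]  # tablespace OID, None for the main data dir
    max_lsn: Optional[int]     # only known for files that were skipped
    sent: bool                 # True if the file is part of this backup
    mtime: int
    size: int
    path: str


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'HI/LO' in hexadecimal to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_profile(text: str) -> Tuple[List[str], List[ProfileEntry]]:
    """Split a profile into its backup label lines and its file entries."""
    lines = text.splitlines()
    if not lines or lines[0] != HEADER:
        raise ValueError("not a backup profile")
    sep = lines.index(SEPARATOR)  # raises ValueError if the separator is missing
    label = lines[1:sep]
    entries = []
    for row in lines[sep + 1:]:
        if not row:
            continue
        # The path is the last field, so limit the split to five tabs.
        tblspc, maxlsn, sent, mtime, size, path = row.split("\t", 5)
        entries.append(ProfileEntry(
            None if tblspc == NULL else tblspc,
            None if maxlsn == NULL else parse_lsn(maxlsn),
            sent == "t",
            int(mtime),
            int(size),
            path,
        ))
    return label, entries
```

Entries with sent set to False are the ones a rebuild tool would need to copy
from the previous backup.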

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

file-based-incremental-backup-v4.patch (text/plain; charset=UTF-8)

From 50ff0872d3901a30b6742900170052eabe0e06dd Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v4

Add backup profile to pg_basebackup
INCREMENTAL option implementation
---
 src/backend/access/transam/xlog.c      |   7 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 335 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  |  83 ++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   4 +
 8 files changed, 409 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 839ea7c..625a5df 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9249,9256 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9469,9478 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 07030a2..05b19c5 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,81 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
+ static bool validateRelfilenodeName(char *name);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 105,116 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 150,160 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 230,237 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 248,255 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 291,302 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 305,311 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 314,323 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 532,541 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 		    /*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 567,575 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 579,587 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 610,616 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 681,698 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 912,920 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 925,931 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 958,964 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 970,982 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1184,1235 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 */
! 				is_relfile = has_relfiles && validateRelfilenodeName(de->d_name);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											  incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1327,1329 ****
--- 1424,1636 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search a relnode file for a page with an LSN greater than or equal to the
+  * threshold. If all the blocks in the file are older than the threshold, the
+  * file can be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 *  If a page has an LSN at or beyond the threshold, stop scanning
+ 		 *  and set the filemaxlsn value to 0, as it is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			return true;
+ 		}
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + mtime (10) + size (19) +
+ 	 * path (MAXPGPATH) + separators (4) + trailing \0 = 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[17];
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a tar CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Since it is the concatenation of two separators and the backup label,
+ 	 * it should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
+ 
+ /*
+  * relfilenode name validation.
+  *
+  * Accepted format: [0-9]*(_[a-z]*)?(\.[0-9]*)?
+  * e.g. "16384", "16384_vm", "16384_fsm.1"
+  */
+ static bool
+ validateRelfilenodeName(char *name)
+ {
+ 	int			pos = 0;
+ 
+ 	while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 		pos++;
+ 
+ 	if (name[pos] == '_')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ 			pos++;
+ 	}
+ 	if (name[pos] == '.')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 			pos++;
+ 	}
+ 
+ 	if (name[pos] == 0)
+ 		return true;
+ 
+ 	return false;
+ }
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..d4adb6a 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,73 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
  
  
  /* Progress counters */
*************** static void verify_dir_is_empty_or_creat
*** 101,107 ****
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 102,109 ----
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 234,241 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=STARTPOINT\n"
+ 			 "                         send only changes after STARTPOINT\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1132,1147 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1152,1179 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1381,1389 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1399,1407 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1617,1623 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1679,1698 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1806,1812 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1840,1850 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 */
+ 	ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 1984,1990 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
! 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 1992,1998 ----
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
! 	uint32		hi, lo;
  	int			option_index;
  
  	progname = get_progname(argv[0]);
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2013,2019 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2131,2146 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				if (sscanf(optarg, "%X/%X", &hi, &lo) != 2)
+ 				{
+ 					fprintf(stderr,
+ 							_("%s: could not parse incremental start position \"%s\"\n"),
+ 							progname, optarg);
+ 					exit(1);
+ 				}
+ 				incremental_startpoint = ((uint64) hi << 32) | lo;
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..9182c0a 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,29 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.1

#10

Gabriele Bartolini

gabriele.bartolini@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#9)

Re: [RFC] Incremental backup v3: incremental PoC

Hi Marco,

thank you for sending an updated patch. I am writing down a report of
this initial (and partial) review.

IMPORTANT: This patch is not complete, as stated by Marco. See the
"Conclusions" section for my proposed TODO list.

== Patch application

I have been able to successfully apply your patch and compile it.
Regression tests passed.

== Initial run

I have created a fresh new instance of PostgreSQL and activated streaming
replication to be used by pg_basebackup. I have done a pgbench run with
scale 100.

I have taken a full consistent backup with pg_basebackup (in plain format):

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -x

I have been able to verify that the backup_profile is correctly placed in
the destination PGDATA directory. Here is an excerpt:

POSTGRESQL BACKUP PROFILE 1
START WAL LOCATION: 0/3000058 (file 000000010000000000000003)
CHECKPOINT LOCATION: 0/300008C
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2015-01-14 10:07:07 CET
LABEL: pg_basebackup base backup
FILE LIST
\N \N t 1421226427 206 backup_label
\N \N t 1421225508 88 postgresql.auto.conf
...
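
As an illustration only, the file list entries can be read back with a few lines of Python 3. This is a sketch, not part of the patch: it assumes the six tab-separated fields written by writeBackupProfileLine (tablespace, max LSN, sent flag, mtime, size, path), with "\N" marking a missing value. The excerpt above shows the separators as spaces. The names ProfileEntry and parse_profile_entry are hypothetical.

```python
from collections import namedtuple

# Hypothetical record type; field order follows writeBackupProfileLine.
ProfileEntry = namedtuple("ProfileEntry", "tablespace maxlsn sent mtime size path")


def parse_profile_entry(line):
    """Split one tab-separated file list entry into a ProfileEntry."""
    fields = line.rstrip("\n").split("\t", 5)
    if len(fields) != 6:
        raise ValueError("malformed backup profile entry: %r" % line)
    tablespace, maxlsn, sent, mtime, size, path = fields
    return ProfileEntry(
        None if tablespace == "\\N" else tablespace,  # \N means "no value"
        None if maxlsn == "\\N" else maxlsn,
        sent == "t",                                  # "t" = file was sent
        int(mtime),
        int(size),
        path,
    )
```

Limiting the split to five separators keeps a path that itself contains a tab in one piece.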

As suggested by Marco, I have manually taken the LSN from this file (next
version must do this automatically).
I have then executed pg_basebackup and activated the incremental feature by
using the LSN from the previous backup, as follows:

LSN=$(awk '/^START WAL/{print $4}' backup_profile)

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -I $LSN -x
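
For illustration, the awk one-liner above can be approximated in Python 3. This is a sketch under stated assumptions, not code from the patch: parse_lsn and start_lsn_from_profile are hypothetical names, and the integer conversion mirrors the "%X/%X" parsing used for the -I option (high 32 bits, then low 32 bits).

```python
import re


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' in hex into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def start_lsn_from_profile(lines):
    """Return the START WAL LOCATION found in a backup profile header."""
    for line in lines:
        if line.startswith("FILE LIST"):
            break  # header section is over, stop looking
        match = re.match(r"START WAL LOCATION: ([0-9A-Fa-f]+/[0-9A-Fa-f]+)", line)
        if match:
            return match.group(1)
    raise ValueError("START WAL LOCATION not found in profile header")
```

A caller could pass an open backup_profile file object as `lines` and hand the returned string to -I unchanged.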

The time taken by this operation has been much lower than the previous one
and the size is much lower (I have not done any operation in the meantime):

du -hs backup-1421226*
1,5G backup-1421226427
17M backup-1421226427

I have done some checks on the file system and then used the prototype of
recovery script in Python written by Marco.

./recover.py backup-1421226427 backup-1421226427 new-data

The cluster started successfully. I have then run a pg_dump of the pgbench
database and was able to reload it on the initial cluster.

== Conclusions

The first run of this patch seems promising.

While the discussion on the LSN map continues (which is mainly an
optimisation of this patch), I would really like to see this patch progress
as it would be a killer feature in several contexts (not in every context).

We are currently releasing file-based incremental backup for Barman, and
customers using the alpha version are seeing, on average, a deduplication
ratio between 50% and 70%. Here is, for example, an excerpt of
"barman show-backup" from one of our customers (a daily saving of 550GB is
not bad):

Base backup information:
Disk usage : 1.1 TiB (1.1 TiB with WALs)
Incremental size : 564.6 GiB (-50.60%)
...

My opinion, Marco, is that for version 5 of this patch, you:

1) update the information on the wiki (it is outdated - I know you have
been busy with LSN map optimisation)
2) modify pg_basebackup in order to accept a directory (or tar file) and
automatically detect the LSN from the backup profile
3) add the documentation regarding the backup profile and pg_basebackup

Once we have all of this, we can continue trying the patch. Some unexplored
paths are:

* tablespace usage
* tar format
* performance impact (in both "read-only" and heavily updated contexts)
* consistency checks

I would then leave for version 6 the pg_restorebackup utility (unless you
want to do everything at once).

One limitation of the current recovery script is that it cannot accept
multiple incremental backups (it just accepts three parameters: base
backup, incremental backup and merge destination). Maybe you can change the
syntax as follows:

./recover.py DESTINATION BACKUP_1 BACKUP_2 [BACKUP_3, ...]
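
A rough Python 3 sketch of how such a chain could be applied, oldest backup first, is given here for illustration. It is not Marco's recover.py: it assumes that a file present in a later backup simply replaces the older copy and that files absent from an incremental are kept, and it ignores files deleted between backups, which a real tool would have to resolve through the backup profile. The names overlay and merge_chain are hypothetical.

```python
import os
import shutil


def overlay(src, dest):
    """Copy every file under src into dest, replacing files that already exist."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)  # also recreates empty directories
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target_dir, name))


def merge_chain(destination, backups):
    """Apply a full backup followed by its incrementals, oldest first."""
    if len(backups) < 2:
        raise ValueError("need a base backup and at least one incremental")
    for backup in backups:
        overlay(backup, destination)
```

With the proposed syntax, the script body would reduce to something like merge_chain(sys.argv[1], sys.argv[2:]).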

Thanks a lot for working on this.

I am looking forward to continuing the review.

Ciao,
Gabriele
--
Gabriele Bartolini - 2ndQuadrant Italia - Managing Director
PostgreSQL Training, Services and Support
gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it

2015-01-13 17:21 GMT+01:00 Marco Nenciarini <marco.nenciarini@2ndquadrant.it>:

On 13/01/15 12:53, Gabriele Bartolini wrote:

Hi Marco,

could you please send an updated version of the patch against the current
HEAD in order to facilitate reviewers?

Here is the updated patch for incremental file based backup.

It is based on the current HEAD.

I'm now working on the client tool to rebuild a full backup starting
from a file based incremental backup.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

#11

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Gabriele Bartolini (#10)

2 attachment(s)

Re: [RFC] Incremental backup v3: incremental PoC

On 14/01/15 17:22, Gabriele Bartolini wrote:

My opinion, Marco, is that for version 5 of this patch, you:

1) update the information on the wiki (it is outdated - I know you have
been busy with LSN map optimisation)

Done.

2) modify pg_basebackup in order to accept a directory (or tar file) and
automatically detect the LSN from the backup profile

New version of patch attached. The -I parameter now requires a backup
profile from a previous backup. I've added a sanity check that forbids
incremental file-level backups if the base timeline is different from
the current one.

3) add the documentation regarding the backup profile and pg_basebackup

Next on my TODO list.

Once we have all of this, we can continue trying the patch. Some
unexplored paths are:

* tablespace usage

I've improved my pg_restorebackup python PoC. It now supports tablespaces.

* tar format
* performance impact (in both "read-only" and heavily updated contexts)

From the server point of view, the current code generates a load similar
to a normal backup. It only adds an initial scan of each data file to
decide whether it has to be sent. Once it finds a single newer page, it
immediately stops scanning and starts sending the file. The I/O impact
should not be that big thanks to the filesystem cache, but I agree with you
that it has to be measured.
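
The scan described above can be sketched in Python 3 as follows. This is only an illustration of the idea behind relnodeIsNewerThanLSN, not the server code: it assumes the default 8192-byte block size and that each page starts with its LSN stored as two 32-bit halves (high, then low) in the machine's native byte order. The name file_is_newer_than is hypothetical.

```python
import struct

BLCKSZ = 8192  # assumed default PostgreSQL block size


def file_is_newer_than(path, threshold_lsn, blcksz=BLCKSZ):
    """Return True as soon as one page carries an LSN >= threshold_lsn."""
    with open(path, "rb") as f:
        while True:
            page = f.read(blcksz)
            if len(page) < 8:
                # End of file (or a truncated tail): no newer page was found.
                return False
            # Page LSN: high and low 32-bit halves, native byte order.
            hi, lo = struct.unpack_from("=II", page, 0)
            if ((hi << 32) | lo) >= threshold_lsn:
                return True  # stop at the first newer page
```

A file that has not changed since the threshold is read to the end, so unchanged files cost one full read while changed files are read at most twice.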

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

file-based-incremental-backup-v5.patch (text/plain; charset=UTF-8)

From f7cf8b9dd7d32f64a30dafaeeaeb56cbcd2eafff Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v5

Add backup profile to pg_basebackup
INCREMENTAL option implementation
---
 src/backend/access/transam/xlog.c      |   7 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 335 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 147 +++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   4 +
 8 files changed, 473 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..1e50625 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9249,9256 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9469,9478 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 07030a2..05b19c5 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,81 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
+ static bool validateRelfilenodeName(char *name);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 105,116 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 150,160 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 230,237 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 248,255 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 291,302 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 305,311 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 314,323 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 532,541 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 		    /*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 567,575 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 579,587 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 610,616 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 681,698 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 912,920 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 925,931 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 958,964 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 970,982 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1184,1235 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 */
! 				is_relfile = has_relfiles && validateRelfilenodeName(de->d_name);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											  incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1327,1329 ****
--- 1424,1636 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search in a relnode file for a page with a LSN greater than the threshold.
+  * If all the blocks in the file are older than the threshold the file can
+  * be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 *  If a page has an LSN equal to or newer than the threshold, stop
+ 		 *  scanning and set filemaxlsn to 0, since the scan is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			return true;
+ 		}
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + mtime (10) + size (19) +
+ 	 * path (MAXPGPATH) + separators (4) + trailing \0 = 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[17];
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a tar CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Since it is the concatenation of two separators and the backup label, it
+ 	 * should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
+ 
+ /*
+  * relfilenode name validation.
+  *
+  * Accepted format: [0-9]*(_[a-z]*)?(\.[0-9]*)?
+  * i.e. digits, an optional fork suffix (_vm, _fsm, _init) and an optional segment number.
+  */
+ static bool
+ validateRelfilenodeName(char *name)
+ {
+ 	int			pos = 0;
+ 
+ 	while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 		pos++;
+ 
+ 	if (name[pos] == '_')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ 			pos++;
+ 	}
+ 	if (name[pos] == '.')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 			pos++;
+ 	}
+ 
+ 	if (name[pos] == 0)
+ 		return true;
+ 
+ 	return false;
+ }
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..892472d 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *profile_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=PROFILE\n"
+ 			 "                         enable incremental from given backup profile\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,778 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *profile_path)
+ {
+ 	FILE	   *lfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/*
+ 	 * See if the backup profile file is present
+ 	 */
+ 	lfp = fopen(profile_path, "r");
+ 	if (!lfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header, don't fail if the header is absent */
+ 	fscanf(lfp, "POSTGRESQL BACKUP PROFILE 1\n");
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(lfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(lfp) || fclose(lfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1184,1199 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1204,1231 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1433,1441 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1451,1459 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1669,1675 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1731,1770 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1878,1884 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1912,1922 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 */
+ 	ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2056,2062 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2064,2069 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2084,2090 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2202,2210 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..9182c0a 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,29 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.1
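
The per-file check performed by relnodeIsNewerThanLSN in the patch above can be illustrated outside the server with a minimal Python sketch. It assumes the default 8 kB block size and that each page starts with pd_lsn stored as two native-endian 32-bit halves (high word first); the names page_lsn and scan_relfile are hypothetical and do not appear in the patch.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def page_lsn(page):
    """Return the LSN held in the first 8 bytes of a page header.

    Assumes pd_lsn is two native-endian uint32 values: high word, then low.
    """
    hi, lo = struct.unpack_from("=II", page, 0)
    return (hi << 32) | lo


def scan_relfile(path, threshold_lsn):
    """Scan a relation file page by page.

    Returns (True, None) as soon as a page with LSN >= threshold_lsn is
    found (the maximum is unknown because the scan stopped early), or
    (False, max_lsn) when every page is older than the threshold.
    """
    max_lsn = 0
    with open(path, "rb") as fp:
        while True:
            page = fp.read(BLCKSZ)
            if len(page) < 8:  # end of file, or a tail too short to inspect
                break
            lsn = page_lsn(page)
            if lsn >= threshold_lsn:
                return True, None
            max_lsn = max(max_lsn, lsn)
    return False, max_lsn
```

A file for which scan_relfile returns (False, max_lsn) is one that an incremental backup could skip, recording max_lsn in the profile instead. The with block closes the file on the early-return path as well.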

pg_restorebackup.py (text/x-python-script)

#12

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#11)

2 attachment(s)

Re: File based incremental backup v6

Hi,

here is another version of the file-based incremental backup patch.

Changelog from the previous one:

* the pg_basebackup --incremental option now takes the directory
containing the base backup instead of the backup profile file
* the backup_profile file is renamed at the same time as the backup_label
file when starting for the first time from a backup
* "pg_basebackup -D -" is handled by appending the backup profile to the
resulting tar stream
* added documentation for the -I/--incremental option to the pg_basebackup doc
* updated the replication protocol documentation
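
The backup_profile layout documented in the protocol.sgml part of the attached patch (a header line, the backup label, a "FILE LIST" separator, then one tab-separated row per file with \N for missing values) can be read with a short parser. This is only a sketch: the names ProfileEntry, parse_lsn and parse_backup_profile are hypothetical, and COPY-style escaping inside paths is not handled.

```python
from typing import NamedTuple, Optional

HEADER = "POSTGRESQL BACKUP PROFILE 1"
SEPARATOR = "FILE LIST"


class ProfileEntry(NamedTuple):
    tablespace: Optional[str]  # tablespace OID, None for files in PGDATA
    maxlsn: Optional[int]      # max LSN of a skipped file, None otherwise
    included: bool             # True if the file was sent in this backup
    mtime: int
    size: int
    relpath: str


def parse_lsn(text):
    """Convert an 'X/X' LSN string into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_backup_profile(text):
    """Return (backup_label_lines, [ProfileEntry, ...])."""
    lines = text.splitlines()
    if not lines or lines[0] != HEADER:
        raise ValueError("missing backup profile header")
    if SEPARATOR not in lines[1:]:
        raise ValueError("missing FILE LIST separator")
    sep = lines.index(SEPARATOR, 1)
    entries = []
    for line in lines[sep + 1:]:
        if not line:
            continue
        # NOTE: escaped characters in relpath are left as-is (assumption).
        tblspc, maxlsn, included, mtime, size, relpath = line.split("\t", 5)
        entries.append(ProfileEntry(
            None if tblspc == r"\N" else tblspc,
            None if maxlsn == r"\N" else parse_lsn(maxlsn),
            included == "t",
            int(mtime),
            int(size),
            relpath,
        ))
    return lines[1:sep], entries
```

The label lines are returned untouched, so the START WAL LOCATION of the base backup can still be extracted from them by the caller.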

The rationale for moving the backup_profile out of the way during
recovery is to avoid using a data directory that has already been
started as the base of a backup.

I've also slightly improved the pg_restorebackup PoC, implementing the
syntax suggested by Gabriele:

./pg_restorebackup.py DESTINATION BACKUP_1 BACKUP_2 [BACKUP_3, ...]

It also supports relocation of tablespaces with the -T option.
The -T option is mandatory if any tablespace was defined in the
PostgreSQL instance when the incremental backup was taken.
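
As a rough illustration of what reintegrating a chain of backups involves, the sketch below decides which backup should supply each file. It is not taken from pg_restorebackup.py: the function name resolve_sources and the input shape (one dict per backup, oldest first, mapping (tablespace, relpath) to the "included" flag of its profile) are assumptions made for this example, as is the rule that the newest profile defines the final file set.

```python
def resolve_sources(profiles):
    """Pick, for every file in the newest backup, the backup that holds it.

    profiles: list of dicts, oldest backup first; each dict maps
              (tablespace, relpath) -> True if the file was included in
              that backup, False if it was skipped as unchanged.
    Returns a dict mapping (tablespace, relpath) -> index into profiles.
    """
    if not profiles:
        raise ValueError("at least one backup is required")
    sources = {}
    # Only files listed in the newest profile belong in the result
    # (assumption: files missing from it were removed in the meantime).
    for key in profiles[-1]:
        # Walk backwards to the most recent backup that actually sent the file.
        for idx in range(len(profiles) - 1, -1, -1):
            if profiles[idx].get(key):
                sources[key] = idx
                break
        else:
            raise LookupError("no backup in the chain contains %r" % (key,))
    return sources
```

With a full backup followed by one incremental, a file skipped by the incremental resolves to index 0 and a file it resent resolves to index 1.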

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

file-based-incremental-backup-v6.patch (text/plain; charset=UTF-8)

From 56fed6e250280f8e5d5c17252db631f33a3c9d8f Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v6

Add backup profile to pg_basebackup
INCREMENTAL option implementation
---
 doc/src/sgml/protocol.sgml             |  86 ++++++++-
 doc/src/sgml/ref/pg_basebackup.sgml    |  31 ++-
 src/backend/access/transam/xlog.c      |  18 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 335 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 191 +++++++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   5 +
 10 files changed, 639 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..fc24648 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of the <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's max LSN in case
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is <literal>t</literal> if
+          the file is included in the backup, <literal>f</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the timestamp of the last file
+          modification
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
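
A minimal sketch of how a client could read the backup_profile format described in the documentation hunk above. It is not part of the patch: `ProfileEntry` and `parse_backup_profile` are hypothetical names, the header and separator strings are copied from the format listing, and no COPY TEXT unescaping is attempted.

```python
from typing import List, NamedTuple, Optional, Tuple

# Marker strings as shown in the documented format (assumed to be exact).
PROFILE_HEADER = "POSTGRESQL BACKUP PROFILE 1"
FILE_LIST_SEPARATOR = "FILE LIST"
NULL_MARKER = r"\N"


class ProfileEntry(NamedTuple):
    tablespace: Optional[str]   # tablespace OID, None for files in PGDATA
    maxlsn: Optional[str]       # max LSN, only set for skipped files
    included: bool              # True if the file is part of the backup
    mtime: int                  # last modification time (epoch seconds)
    size: int                   # file size in bytes
    relpath: str                # path relative to the tablespace root


def parse_backup_profile(text: str) -> Tuple[List[str], List[ProfileEntry]]:
    """Split a backup profile into backup label lines and file entries."""
    lines = text.split("\n")
    if not lines or lines[0] != PROFILE_HEADER:
        raise ValueError("missing backup profile header")
    try:
        sep = lines.index(FILE_LIST_SEPARATOR)
    except ValueError:
        raise ValueError("missing FILE LIST separator") from None

    label = lines[1:sep]
    entries = []
    for line in lines[sep + 1:]:
        if not line:
            continue  # skip the empty string left by the final newline
        tablespace, maxlsn, included, mtime, size, relpath = line.split("\t", 5)
        entries.append(ProfileEntry(
            None if tablespace == NULL_MARKER else tablespace,
            None if maxlsn == NULL_MARKER else maxlsn,
            included == "t",
            int(mtime),
            int(size),
            relpath,
        ))
    return label, entries
```

Splitting on at most five tabs keeps a relpath that itself contains a tab in one piece, which seems the safest reading of a format that does not escape paths.
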
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file 
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    In order to support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as the last step of every backup. This
+    file will be transparently used by <application>pg_basebackup</application>
+    when invoked with the option <option>--incremental</option> to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Also rename the backup profile, if present. This marks the data
+ 		 * directory as not usable as a base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
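
For readers following the `INCREMENTAL FROM LOCATION` label line, here is a small sketch of the `%X/%X` textual LSN format the patch uses when writing the label and when reading it back on the client. The helper names `format_lsn` and `parse_lsn` are hypothetical and exist only for illustration.

```python
def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN as two 32-bit hex halves, like "%X/%X" in C."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def parse_lsn(text: str) -> int:
    """Inverse of format_lsn: combine the high and low hex halves."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)
```

The high half is not zero-padded, so `0/51000028` and `1/5` are both well-formed values.
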
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..fe585c3 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,81 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
+ static bool validateRelfilenodeName(char *name);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 105,116 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 150,160 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 230,237 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 248,255 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 291,302 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 305,311 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 314,323 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 532,541 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 		    /*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 567,575 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 579,587 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 610,616 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 681,698 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 912,920 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 925,931 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 958,964 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 970,982 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1184,1235 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 */
! 				is_relfile = has_relfiles && validateRelfilenodeName(de->d_name);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											  incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1430,1642 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search in a relnode file for a page with a LSN greater than the threshold.
+  * If all the blocks in the file are older than the threshold the file can
+  * be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 		len += cnt;
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 * If a page has an LSN at or past the threshold, stop scanning and
+ 		 * reset filemaxlsn to 0, since the value seen so far is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were scanning it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + included flag (1) + mtime (10) +
+ 	 * size (19) + separators (6) + trailing \0 = 64, plus path (MAXPGPATH)
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char	maxlsn[18];		/* "%X/%X" is at most 17 chars, plus \0 */
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Since it is the concatenation of two separators and the backup label,
+ 	 * it should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
+ 
+ /*
+  * relfilenode name validation.
+  *
+  * Accepted format: [0-9]* optionally followed by a fork suffix _[a-z]*
+  * (such as _vm, _fsm or _init) and then by a segment suffix \.[0-9]*
+  */
+ static bool
+ validateRelfilenodeName(char *name)
+ {
+ 	int			pos = 0;
+ 
+ 	while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 		pos++;
+ 
+ 	if (name[pos] == '_')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ 			pos++;
+ 	}
+ 	if (name[pos] == '.')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 			pos++;
+ 	}
+ 
+ 	if (name[pos] == 0)
+ 		return true;
+ 
+ 	return false;
+ }
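
The scan performed by relnodeIsNewerThanLSN can be sketched outside the server as follows. This is only an illustration, not the patch's code. It assumes the default 8192-byte block size and that each page starts with pd_lsn stored as two native-endian 32-bit words (high half first). The function names are hypothetical.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def page_lsn(page: bytes) -> int:
    """Extract pd_lsn from the first 8 bytes of a page."""
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def read_pages(fp, size_limit: int):
    """Yield pages from a binary file, never reading past size_limit bytes."""
    remaining = size_limit
    while remaining > 0:
        page = fp.read(min(BLCKSZ, remaining))
        if not page:
            break
        remaining -= len(page)
        yield page


def relnode_is_newer_than_lsn(pages, threshold_lsn: int):
    """Return (True, None) as soon as a page reaches the threshold,
    otherwise (False, max_lsn) after all pages have been examined."""
    max_lsn = 0
    for page in pages:
        if len(page) < 8:
            continue  # too short to contain pd_lsn
        lsn = page_lsn(page)
        if lsn >= threshold_lsn:
            return True, None  # partial maximum is not meaningful
        max_lsn = max(max_lsn, lsn)
    return False, max_lsn
```

A caller would open the relation segment in binary mode and pass `read_pages(fp, size_at_stat_time)` as `pages`, so that data appended after the stat call is ignored, mirroring the size cap in the C loop.
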
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..fd67d51 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *reference_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must not be empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* Open the backup profile */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks unless using stdout
! 			 * and not writing the backup profile (res is NULL).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not reported because the
+  * content does not belong to any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If format is tar and we are writing on standard output
+ 	 * append the backup profile to the stream, otherwise put it
+ 	 * in the destination directory
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.2

pg_restorebackup.py (text/x-python-script)

#13

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Gabriele Bartolini (#10)

1 attachment(s)

File based Incremental backup v7

On 27/01/15 10:25, Giuseppe Broccolo wrote:
> Hi Marco,

On 16/01/15 16:55, Marco Nenciarini wrote:

On 14/01/15 17:22, Gabriele Bartolini wrote:

My opinion, Marco, is that for version 5 of this patch, you:

1) update the information on the wiki (it is outdated - I know you have
been busy with LSN map optimisation)

Done.

2) modify pg_basebackup in order to accept a directory (or tar file) and
automatically detect the LSN from the backup profile

New version of the patch attached. The -I parameter now requires a backup
profile from a previous backup. I've added a sanity check that forbids
incremental file-level backups if the base timeline is different from
the current one.
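
As a rough illustration of the check described above, here is a minimal Python sketch of reading the start LSN and timeline from the first two lines of a backup profile. It assumes the header string "POSTGRESQL BACKUP PROFILE 1" and the "START WAL LOCATION" line format shown in the backup label example earlier in the thread; the function names parse_profile_header and check_timeline are hypothetical and are not part of the patch.

```python
import re

# Header line, as defined by BACKUP_PROFILE_HEADER in the patch.
PROFILE_HEADER = "POSTGRESQL BACKUP PROFILE 1"

# Matches e.g.:
#   START WAL LOCATION: 0/52000028 (file 000000010000000000000052)
# The first 8 hex digits of the WAL file name are the timeline ID.
START_RE = re.compile(
    r"^START WAL LOCATION: ([0-9A-Fa-f]+)/([0-9A-Fa-f]+) "
    r"\(file ([0-9A-Fa-f]{8})[0-9A-Fa-f]{16}\)$"
)


def parse_profile_header(lines):
    """Return (start_lsn, timeline) from the first two profile lines."""
    if len(lines) < 2 or lines[0].rstrip("\n") != PROFILE_HEADER:
        raise ValueError("invalid backup profile header")
    match = START_RE.match(lines[1].rstrip("\n"))
    if match is None:
        raise ValueError("invalid START WAL LOCATION line")
    hi, lo, tli = (int(group, 16) for group in match.groups())
    # Combine the two 32-bit halves into a single 64-bit LSN.
    return (hi << 32) | lo, tli


def check_timeline(base_tli, current_tli):
    """Refuse an incremental backup whose base is on another timeline."""
    if base_tli != current_tli:
        raise RuntimeError(
            "incremental backup from a different timeline is not "
            "supported: base=%d current=%d" % (base_tli, current_tli)
        )
```

The sketch only mirrors the logic of the check; the patch itself does this in C inside pg_basebackup.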

3) add the documentation regarding the backup profile and pg_basebackup

Next on my TODO list.

Once we have all of this, we can continue trying the patch. Some
unexplored paths are:

* tablespace usage

I've improved my pg_restorebackup python PoC. It now supports tablespaces.
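
For readers without the attachment, a restore tool of this kind has to read the file list stored in backup_profile. The following is only a sketch, not the code of pg_restorebackup.py: it assumes the layout documented in the attached patch (a "FILE LIST" separator followed by tab-separated lines with the fields tablespace, maxlsn, included, mtime, size, relpath, with \N for NULL). The name read_file_list is hypothetical, and COPY TEXT backslash escapes in paths are deliberately not decoded.

```python
PROFILE_SEPARATOR = "FILE LIST"
FIELDS = ("tablespace", "maxlsn", "included", "mtime", "size", "relpath")


def read_file_list(stream):
    """Yield one dict per file-list line of a backup profile.

    'stream' is any iterable of text lines (an open file, a list, ...).
    """
    in_list = False
    for raw in stream:
        line = raw.rstrip("\n")
        if not in_list:
            # Skip the header and the backup label content.
            in_list = (line == PROFILE_SEPARATOR)
            continue
        if not line:
            continue
        values = line.split("\t")
        if len(values) != len(FIELDS):
            raise ValueError("malformed profile line: %r" % line)
        entry = dict(zip(FIELDS, values))
        # \N is the COPY TEXT representation of NULL.
        for key in ("tablespace", "maxlsn"):
            if entry[key] == "\\N":
                entry[key] = None
        entry["included"] = (entry["included"] == "t")
        entry["size"] = int(entry["size"])
        yield entry
```

Entries with included set to False are the files a restore has to copy from the base backup, while the others come from the incremental one.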

About tablespaces: I noticed that the references to tablespace locations
are lost when recovering an incremental backup that was taken with a
changed tablespace mapping (-T option). Here are the steps I followed:

* creating and filling a test database obtained through pgbench

psql -c "CREATE DATABASE pgbench"
pgbench -U postgres -i -s 5 -F 80 pgbench

* a first base backup with pg_basebackup:

mkdir -p backups/$(date '+%d%m%y%H%M')/data && pg_basebackup -v -F p -D backups/$(date '+%d%m%y%H%M')/data -x

* creation of a new tablespace, alter the table "pgbench_accounts" to
set the new tablespace:

mkdir -p /home/gbroccolo/pgsql/tbls
psql -c "CREATE TABLESPACE tbls LOCATION '/home/gbroccolo/pgsql/tbls'"

psql -c "ALTER TABLE pgbench_accounts SET TABLESPACE tbls" pgbench

* Doing some work on the database:

pgbench -U postgres -T 120 pgbench

* a second incremental backup with pg_basebackup specifying the new
location for the tablespace through the tablespace mapping:

mkdir -p backups/$(date '+%d%m%y%H%M')/data backups/$(date '+%d%m%y%H%M')/tbls && \
pg_basebackup -v -F p -D backups/$(date '+%d%m%y%H%M')/data -x \
    -I backups/2601151641/data/backup_profile \
    -T /home/gbroccolo/pgsql/tbls=/home/gbroccolo/pgsql/backups/$(date '+%d%m%y%H%M')/tbls

* a recovery based on the tool pg_restorebackup.py attached in
/messages/by-id/54B9428E.9020001@2ndquadrant.it

./pg_restorebackup.py backups/2601151641/data backups/2601151707/data /tmp/data \
    -T /home/gbroccolo/pgsql/backups/2601151707/tbls=/tmp/tbls

In the last step, I obtained the following stack trace:

Traceback (most recent call last):
  File "./pg_restorebackup.py", line 74, in <module>
    shutil.copy2(base_file, dest_file)
  File "/home/gbroccolo/.pyenv/versions/2.7.5/lib/python2.7/shutil.py", line 130, in copy2
    copyfile(src, dst)
  File "/home/gbroccolo/.pyenv/versions/2.7.5/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: 'backups/2601151641/data/base/16384/16406_fsm'

Any idea on what's going wrong?

I've done some tests, and it looks like FSM nodes always have
InvalidXLogRecPtr as their LSN.

I've updated the patch to always include files if all their pages have
LSN == InvalidXLogRecPtr.
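
To make the rule concrete, here is a hedged Python sketch of the decision. It is not the server-side C code from the patch. It assumes the default 8 kB block size, a little-endian server, and that each page starts with pd_lsn stored as two 32-bit words (xlogid, xrecoff); the names file_max_lsn and must_send are hypothetical.

```python
import struct

BLCKSZ = 8192      # default PostgreSQL block size (assumption)
INVALID_LSN = 0    # numeric value of InvalidXLogRecPtr


def file_max_lsn(path, byteorder="<"):
    """Return the highest pd_lsn found among the pages of a file."""
    max_lsn = INVALID_LSN
    with open(path, "rb") as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < 8:
                break
            # pd_lsn sits at offset 0 of the page header.
            xlogid, xrecoff = struct.unpack_from(byteorder + "II", page, 0)
            max_lsn = max(max_lsn, (xlogid << 32) | xrecoff)
    return max_lsn


def must_send(path, threshold_lsn):
    """Decide whether a file belongs in the incremental backup.

    A file is sent when at least one page has an LSN greater than or
    equal to the threshold, or when no page carries a valid LSN at all
    (the FSM case described above), since nothing can then be inferred
    about when it last changed.
    """
    max_lsn = file_max_lsn(path)
    return max_lsn == INVALID_LSN or max_lsn >= threshold_lsn
```

With this rule a file such as 16406_fsm, whose pages all report an invalid LSN, is always shipped, so the restore no longer has to look for it in the base backup.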

Updated patch v7 attached.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

file-based-incremental-backup-v7.patch (text/plain; charset=UTF-8)

From bffcdf0d5c3258c8848215011eb8e8b3377d9f18 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v7

Add backup profiles and --incremental to pg_basebackup
---
 doc/src/sgml/protocol.sgml             |  86 ++++++++-
 doc/src/sgml/ref/pg_basebackup.sgml    |  31 ++-
 src/backend/access/transam/xlog.c      |  18 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 344 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 191 ++++++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   5 +
 10 files changed, 648 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..fc24648 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's max LSN in case
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is a <literal>'t'</literal> if
+          the file is included in the backup, <literal>'f'</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the timestamp of the last file
+          modification
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file 
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    In order to support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as the last step of every backup. This
+    file will be transparently used by <application>pg_basebackup</application>
+    when invoked with the option <option>--incremental</option> to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Rename also the backup profile if present. This marks the data
+ 		 * directory as not usable as base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..5be8a10 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,81 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
+ static bool validateRelfilenodeName(char *name);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 105,116 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 150,160 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 230,237 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 248,255 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 291,302 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 305,311 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 314,323 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 532,541 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 		    /*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 567,575 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 579,587 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 610,616 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 681,698 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 912,920 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 925,931 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 958,964 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 970,982 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1184,1235 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 */
! 				is_relfile = has_relfiles && validateRelfilenodeName(de->d_name);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											  incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1430,1651 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search in a relnode file for a page with a LSN greater than the threshold.
+  * If all the blocks in the file are older than the threshold the file can
+  * be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 *  If a page has an LSN equal to or newer than the threshold, stop
+ 		 *  scanning and set the filemaxlsn value to 0, as it is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 
+ 	/*
+ 	 * At this point, if *filemaxlsn contains InvalidXLogRecPtr
+ 	 * the file contains something that doesn't update page LSNs (e.g. FSM)
+ 	 */
+ 	if (*filemaxlsn == InvalidXLogRecPtr)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + mtime (10) + size (19) +
+ 	 * path (MAXPGPATH) + separators (4) + trailing \0 = 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[17];
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a tar CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Since it is the concatenation of two separators and the backup label,
+ 	 * it should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
+ 
+ /*
+  * relfilenode name validation.
+  *
+  * Accepted format: [0-9]*(_[a-z]*)?(\.[0-9]*)?
+  * e.g. "16384", "16384_vm", "16384_fsm", "16384_init", "16384.1"
+  */
+ static bool
+ validateRelfilenodeName(char *name)
+ {
+ 	int			pos = 0;
+ 
+ 	while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 		pos++;
+ 
+ 	if (name[pos] == '_')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
+ 			pos++;
+ 	}
+ 	if (name[pos] == '.')
+ 	{
+ 		pos++;
+ 		while ((name[pos] >= '0') && (name[pos] <= '9'))
+ 			pos++;
+ 	}
+ 
+ 	if (name[pos] == 0)
+ 		return true;
+ 
+ 	return false;
+ }
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..fd67d51 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *profile_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must be not empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* See if label file is present */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks unless using stdout
! 			 * and not writing the backup profile (res is NULL).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If format is tar and we are writing on standard output
+ 	 * append the backup profile to the stream, otherwise put it
+ 	 * in the destination directory
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.2

#14

Giuseppe Broccolo

giuseppe.broccolo@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#13)

Re: File based Incremental backup v7

Hi Marco,

2015-01-27 19:04 GMT+01:00 Marco Nenciarini <marco.nenciarini@2ndquadrant.it>:

I've done some tests, and it looks like FSM nodes always have
InvalidXLogRecPtr as their LSN.

I've updated the patch to always include files whose pages all have
LSN == InvalidXLogRecPtr.

Updated patch v7 attached.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

I've rerun the incremental backup test that introduces a new tablespace
after a base backup, this time using version 7 of the patch and the new
version of the restore script attached in
/messages/by-id/54C7CDAD.6060900@2ndquadrant.it:

# define here your work dir
WORK_DIR='/home/gbroccolo/pgsql'

# preliminary steps
rm -rf /tmp/data /tmp/tbls tbls/ backups/

# create a test db and a backup repository
psql -c "DROP DATABASE IF EXISTS pgbench"
psql -c "CREATE DATABASE pgbench"
pgbench -U postgres -i -s 5 -F 80 pgbench
mkdir -p backups

# a first base backup with pg_basebackup
BASE=$(mkdir -vp backups/$(date '+%d%m%y%H%M') | awk -F'[’‘]' '{print $2}')
echo "start a base backup: $BASE"
mkdir -vp $BASE/data
pg_basebackup -v -F p -D $BASE/data -x -c fast

# creation of a new tablespace, alter the table "pgbench_accounts" to
# set the new tablespace
mkdir -p $WORK_DIR/tbls
CREATE_CMD="CREATE TABLESPACE tbls LOCATION '$WORK_DIR/tbls'"
psql -c "$CREATE_CMD"
psql -c "ALTER TABLE pgbench_accounts SET TABLESPACE tbls" pgbench

# Doing some work on the database
pgbench -U postgres -T 120 pgbench

# a second incremental backup with pg_basebackup specifying the new
# location for the tablespace through the tablespace mapping
INCREMENTAL=$(mkdir -vp backups/$(date '+%d%m%y%H%M') | awk -F'[’‘]' '{print $2}')
echo "start an incremental backup: $INCREMENTAL"
mkdir -vp $INCREMENTAL/data $INCREMENTAL/tbls
pg_basebackup -v -F p -D $INCREMENTAL/data -x -I $BASE/data \
    -T $WORK_DIR/tbls=$WORK_DIR/$INCREMENTAL/tbls -c fast

# restore the database
./pg_restorebackup.py -T $WORK_DIR/$INCREMENTAL/tbls=/tmp/tbls \
    /tmp/data $BASE/data $INCREMENTAL/data
chmod 0700 /tmp/data/
echo "port=5555" >> /tmp/data/postgresql.conf
pg_ctl -D /tmp/data start

Now the restore works fine, and the links pointing to the tablespaces are
also preserved in the restored instance.
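
For anyone who wants to repeat the check, here is a minimal Python sketch of one way to inspect the tablespace links of a restored data directory. It relies only on the fact that PostgreSQL keeps one symlink per tablespace under pg_tblspc; the helper name tablespace_links and the throwaway directories in the demo are hypothetical, and it assumes a platform with symlinks.

```python
import os
import tempfile


def tablespace_links(pgdata):
    """Return {tablespace_oid: link_target} for every symlink in pg_tblspc."""
    tblspc = os.path.join(pgdata, "pg_tblspc")
    links = {}
    for entry in sorted(os.listdir(tblspc)):
        path = os.path.join(tblspc, entry)
        if os.path.islink(path):
            links[entry] = os.readlink(path)
    return links


# Demo on a throwaway directory tree (not a real cluster): one fake
# tablespace with a made-up OID, linked the way a restore would leave it.
with tempfile.TemporaryDirectory() as tmp:
    pgdata = os.path.join(tmp, "data")
    target = os.path.join(tmp, "tbls")
    os.makedirs(os.path.join(pgdata, "pg_tblspc"))
    os.makedirs(target)
    os.symlink(target, os.path.join(pgdata, "pg_tblspc", "16385"))
    demo_links = tablespace_links(pgdata)
    demo_ok = demo_links == {"16385": target}

print(demo_ok)
```

On the restored instance from the script above, the call would be tablespace_links("/tmp/data"), and the single target should be /tmp/tbls.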

Thanks Marco for your reply.

Giuseppe.
--
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@2ndQuadrant.it | www.2ndQuadrant.it

#15

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#13)

3 attachment(s)

File based Incremental backup v8

The current implementation of the copydir function is incompatible with
LSN-based incremental backups. The problem is that the new files are
created with blocks that still carry their old LSNs, so they look old
enough to be skipped and are never backed up.
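
To make the problem concrete, here is a minimal Python sketch (not part of the patch) of the LSN-based selection. It assumes 8 kB blocks and a little-endian page header whose first 8 bytes hold the page LSN as two 32-bit halves; the helper names are hypothetical. The selection rule follows what is described in this thread: a file is sent if any page has an LSN greater than or equal to the start LSN, or if all its pages have an invalid (zero) LSN.

```python
import struct

BLCKSZ = 8192  # assumed block size (PostgreSQL default)


def page_lsn(page):
    """Read the page LSN: first 8 header bytes as (high, low) 32-bit halves."""
    hi, lo = struct.unpack_from("<II", page, 0)  # little-endian is an assumption
    return (hi << 32) | lo


def max_lsn(data):
    """Highest page LSN in a relation file image (0 for an empty file)."""
    return max(
        (page_lsn(data[off:off + BLCKSZ]) for off in range(0, len(data), BLCKSZ)),
        default=0,
    )


def should_send(data, start_lsn):
    """Send the file if it changed since start_lsn, or if it carries no LSN."""
    lsn = max_lsn(data)
    return lsn == 0 or lsn >= start_lsn


def make_page(lsn):
    """Build a fake page: just the LSN followed by zero padding."""
    return struct.pack("<II", lsn >> 32, lsn & 0xFFFFFFFF) + bytes(BLCKSZ - 8)


# A relation last modified before the previous backup...
old_relation = make_page(0x51000010) + make_page(0x50FFFF00)
# ...copied byte for byte into a new database after that backup.
copied = bytes(old_relation)

start_lsn = 0x52000028
print(should_send(copied, start_lsn))  # prints False: the new file is skipped
```

The copied file is brand new on disk, yet nothing in its pages says so, which is exactly why it would be left out of the incremental backup.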

The copydir function is used by:

CREATE DATABASE
ALTER DATABASE SET TABLESPACE

I can imagine two possible solutions:

a) WAL-log the whole copydir operation, setting the LSN accordingly
b) pass to copydir the LSN of the operation which triggered it, and
update the LSN of all the copied blocks

The latter solution is IMO easier to implement and does not deviate
much from the current implementation.
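
As a rough illustration of option (b), here is a Python sketch of a file copy that stamps a given LSN on every page. It is only a model of the idea, not the C code in the attached patch: the name copy_with_lsn, the 8 kB block size and the little-endian header layout are assumptions, and the demo works on a zero-filled throwaway file.

```python
import os
import struct
import tempfile

BLCKSZ = 8192  # assumed block size (PostgreSQL default)


def copy_with_lsn(src_path, dst_path, lsn):
    """Copy a file; if lsn is non-zero, overwrite the LSN of every page."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            buf = bytearray(src.read(BLCKSZ * 8))
            if not buf:
                break
            if lsn:
                # A partial block means the source is being extended.
                if len(buf) % BLCKSZ != 0:
                    raise ValueError("%s size is not a multiple of %d"
                                     % (src_path, BLCKSZ))
                for off in range(0, len(buf), BLCKSZ):
                    # The first 8 bytes of each page hold the LSN (high, low).
                    struct.pack_into("<II", buf, off,
                                     lsn >> 32, lsn & 0xFFFFFFFF)
            dst.write(buf)


# Demo: copy a fake three-page relation file and read the LSNs back.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "16384")
    dst = os.path.join(tmp, "16384.copy")
    with open(src, "wb") as f:
        f.write(bytes(BLCKSZ * 3))
    copy_with_lsn(src, dst, 0x52000028)
    with open(dst, "rb") as f:
        copied = f.read()

stamped = [struct.unpack_from("<II", copied, off)
           for off in range(0, len(copied), BLCKSZ)]
print(stamped == [(0, 0x52000028)] * 3)
```

Passing 0 as the LSN leaves the pages untouched, which mirrors how the patch uses InvalidXLogRecPtr for files that must be copied verbatim.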

I've implemented it and it's attached to this message.

I've also moved the parse_filename_for_nontemp_relation function out of
reinit.c to make it available to both copydir.c and basebackup.c.

I've also limited the LSN comparison to the MAIN fork only, because:

* The FSM fork doesn't use the LSN
* The VM fork updates the LSN only when the visibility bit is set
* INIT forks don't use the LSN. They are only one page anyway.
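
The fork check above can be sketched in Python as follows. This is a regex model of the filename rules that parse_filename_for_nontemp_relation applies (up to OIDCHARS digits, an optional fork suffix, an optional segment number); the function names and the returned tuple are hypothetical and do not match the C signature.

```python
import re

OIDCHARS = 10  # maximum number of digits in an OID

# <oid>[_<fork>][.<segment>], where the main fork has no suffix
_RELFILE = re.compile(
    r"([0-9]{1,%d})(?:_(fsm|vm|init))?(?:\.([0-9]+))?" % OIDCHARS
)


def parse_relation_filename(name):
    """Return (oid, fork, segment), or None if name is not a relation file."""
    m = _RELFILE.fullmatch(name)
    if m is None:
        return None
    oid, fork, segment = m.groups()
    return oid, fork or "main", int(segment) if segment else 0


def compare_lsn(name):
    """Only MAIN fork files are subject to the LSN comparison."""
    parsed = parse_relation_filename(name)
    return parsed is not None and parsed[1] == "main"


for name in ("16384", "16384.2", "16384_fsm", "16384_vm.1", "t3_16384"):
    print(name, parse_relation_filename(name), compare_lsn(name))
```

Temporary relation files (names starting with "t") and anything else that does not match, such as pg_internal.init, are simply not treated as relation files.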

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

0001-public-parse_filename_for_nontemp_relation.patch (text/plain; charset=UTF-8)

From 087faed899b9afab324aff7fa20e715c4f99eb4a Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---------------------------------------
 src/common/relpath.c              | 56 +++++++++++++++++++++++++++++++++++++
 src/include/common/relpath.h      |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*************** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  									  int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  								   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 									int *oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 ----
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
- 									ForkNumber *fork)
- {
- 	int			pos;
- 
- 	/* Look for a non-empty string of digits (that isn't too long). */
- 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
- 		;
- 	if (pos == 0 || pos > OIDCHARS)
- 		return false;
- 	*oidchars = pos;
- 
- 	/* Check for a fork name. */
- 	if (name[pos] != '_')
- 		*fork = MAIN_FORKNUM;
- 	else
- 	{
- 		int			forkchar;
- 
- 		forkchar = forkname_chars(&name[pos + 1], fork);
- 		if (forkchar <= 0)
- 			return false;
- 		pos += forkchar + 1;
- 	}
- 
- 	/* Check for a segment number. */
- 	if (name[pos] == '.')
- 	{
- 		int			segchar;
- 
- 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
- 			;
- 		if (segchar <= 1)
- 			return false;
- 		pos += segchar;
- 	}
- 
- 	/* Now we should be at the end. */
- 	if (name[pos] != '\0')
- 		return false;
- 	return true;
- }
--- 386,388 ----
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 ****
--- 206,264 ----
  	}
  	return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to the a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+ 									ForkNumber *fork)
+ {
+ 	int			pos;
+ 
+ 	/* Look for a non-empty string of digits (that isn't too long). */
+ 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+ 		;
+ 	if (pos == 0 || pos > OIDCHARS)
+ 		return false;
+ 	*oidchars = pos;
+ 
+ 	/* Check for a fork name. */
+ 	if (name[pos] != '_')
+ 		*fork = MAIN_FORKNUM;
+ 	else
+ 	{
+ 		int			forkchar;
+ 
+ 		forkchar = forkname_chars(&name[pos + 1], fork);
+ 		if (forkchar <= 0)
+ 			return false;
+ 		pos += forkchar + 1;
+ 	}
+ 
+ 	/* Check for a segment number. */
+ 	if (name[pos] == '.')
+ 	{
+ 		int			segchar;
+ 
+ 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
+ 			;
+ 		if (segchar <= 1)
+ 			return false;
+ 		pos += segchar;
+ 	}
+ 
+ 	/* Now we should be at the end. */
+ 	if (name[pos] != '\0')
+ 		return false;
+ 	return true;
+ }
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a263779..9736a78 100644
*** a/src/include/common/relpath.h
--- b/src/include/common/relpath.h
*************** extern char *GetDatabasePath(Oid dbNode,
*** 52,57 ****
--- 52,59 ----
  
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
+ extern bool parse_filename_for_nontemp_relation(const char *name,
+ 								int *oidchars, ForkNumber *fork);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
-- 
2.2.2

0002-copydir-LSN.patch (text/plain; charset=UTF-8)

From a4952a2fe57f26e50cfc14de0286e1466caaa4ae Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 11:41:35 +0100
Subject: [PATCH 2/3] copydir LSN

---
 src/backend/commands/dbcommands.c  | 32 ++++++++++++----------
 src/backend/storage/file/copydir.c | 56 +++++++++++++++++++++++++++++++++++---
 src/backend/storage/file/reinit.c  |  3 +-
 src/include/storage/copydir.h      |  6 ++--
 4 files changed, 76 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5e66961..6dd9878 100644
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** createdb(const CreatedbStmt *stmt)
*** 586,591 ****
--- 586,592 ----
  			Oid			dsttablespace;
  			char	   *srcpath;
  			char	   *dstpath;
+ 			XLogRecPtr	recptr;
  			struct stat st;
  
  			/* No need to copy global tablespace */
*************** createdb(const CreatedbStmt *stmt)
*** 609,621 ****
  
  			dstpath = GetDatabasePath(dboid, dsttablespace);
  
- 			/*
- 			 * Copy this subdirectory to the new location
- 			 *
- 			 * We don't need to copy subdirectories
- 			 */
- 			copydir(srcpath, dstpath, false);
- 
  			/* Record the filesystem change in XLOG */
  			{
  				xl_dbase_create_rec xlrec;
--- 610,615 ----
*************** createdb(const CreatedbStmt *stmt)
*** 628,636 ****
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				(void) XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
--- 622,637 ----
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				recptr = XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
+ 
+ 			/*
+ 			 * Copy this subdirectory to the new location
+ 			 *
+ 			 * We don't need to copy subdirectories
+ 			 */
+ 			copydir(srcpath, dstpath, false, recptr);
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
*************** movedb(const char *dbname, const char *t
*** 1214,1223 ****
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		/*
! 		 * Copy files from the old tablespace to the new one
! 		 */
! 		copydir(src_dbpath, dst_dbpath, false);
  
  		/*
  		 * Record the filesystem change in XLOG
--- 1215,1221 ----
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		XLogRecPtr	recptr;
  
  		/*
  		 * Record the filesystem change in XLOG
*************** movedb(const char *dbname, const char *t
*** 1233,1243 ****
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			(void) XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
--- 1231,1246 ----
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			recptr = XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
+ 		 * Copy files from the old tablespace to the new one
+ 		 */
+ 		copydir(src_dbpath, dst_dbpath, false, recptr);
+ 
+ 		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
*************** dbase_redo(XLogReaderState *record)
*** 2045,2050 ****
--- 2048,2054 ----
  	if (info == XLOG_DBASE_CREATE)
  	{
  		xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);
+ 		XLogRecPtr	lsn = record->EndRecPtr;
  		char	   *src_path;
  		char	   *dst_path;
  		struct stat st;
*************** dbase_redo(XLogReaderState *record)
*** 2077,2083 ****
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
--- 2081,2087 ----
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false, lsn);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 41b2c62..92b49ab 100644
*** a/src/backend/storage/file/copydir.c
--- b/src/backend/storage/file/copydir.c
***************
*** 22,27 ****
--- 22,29 ----
  #include <unistd.h>
  #include <sys/stat.h>
  
+ #include "common/relpath.h"
+ #include "storage/bufpage.h"
  #include "storage/copydir.h"
  #include "storage/fd.h"
  #include "miscadmin.h"
***************
*** 32,40 ****
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
--- 34,45 ----
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
+  *
+  * If recptr is different from InvalidXlogRecPtr, LSN of pages in the
+  * destination directory will be updated to recptr.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
*************** copydir(char *fromdir, char *todir, bool
*** 75,84 ****
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true);
  		}
  		else if (S_ISREG(fst.st_mode))
! 			copy_file(fromfile, tofile);
  	}
  	FreeDir(xldir);
  
--- 80,106 ----
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true, recptr);
  		}
  		else if (S_ISREG(fst.st_mode))
! 		{
! 			int			oidchars;
! 			ForkNumber	fork;
! 
! 			/*
! 			 * To support incremental backups, we need to update the LSN in
! 			 * all relation files we are copying.
! 			 *
! 			 * We are updating only the MAIN fork because at the moment
! 			 * blocks in FSM and VM forks are not guaranteed to have an
! 			 * up-to-date LSN
! 			 */
! 			if (parse_filename_for_nontemp_relation(xlde->d_name,
! 						&oidchars, &fork) && fork == MAIN_FORKNUM)
! 				copy_file(fromfile, tofile, recptr);
! 			else
! 				copy_file(fromfile, tofile, InvalidXLogRecPtr);
! 		}
  	}
  	FreeDir(xldir);
  
*************** copydir(char *fromdir, char *todir, bool
*** 130,138 ****
  
  /*
   * copy one file
   */
  void
! copy_file(char *fromfile, char *tofile)
  {
  	char	   *buffer;
  	int			srcfd;
--- 152,163 ----
  
  /*
   * copy one file
+  *
+  * If recptr is different from InvalidXlogRecPtr, the destination file will
+  * have all its pages with LSN set accordingly
   */
  void
! copy_file(char *fromfile, char *tofile, XLogRecPtr recptr)
  {
  	char	   *buffer;
  	int			srcfd;
*************** copy_file(char *fromfile, char *tofile)
*** 176,181 ****
--- 201,229 ----
  					 errmsg("could not read file \"%s\": %m", fromfile)));
  		if (nbytes == 0)
  			break;
+ 
+ 		/*
+ 		 * If a valid recptr has been provided, the resulting file will have
+ 		 * all its pages with LSN set accordingly
+ 		 */
+ 		if (recptr != InvalidXLogRecPtr)
+ 		{
+ 			char		*page;
+ 
+ 			/*
+ 			 * If we are updating LSN of a file, we must be sure that the
+ 			 * source file is not being extended.
+ 			 */
+ 			if (nbytes % BLCKSZ != 0)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 						 errmsg("file \"%s\" size is not multiple of %d",
+ 								fromfile, BLCKSZ)));
+ 
+ 			for (page = buffer; page < (buffer + nbytes); page += BLCKSZ)
+ 				PageSetLSN(page, recptr);
+ 		}
+ 
  		errno = 0;
  		if ((int) write(dstfd, buffer, nbytes) != nbytes)
  		{
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 02b5fee..854ae4a 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <unistd.h>
  
+ #include "access/xlogdefs.h"
  #include "catalog/catalog.h"
  #include "common/relpath.h"
  #include "storage/copydir.h"
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 333,339 ****
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath);
  		}
  
  		FreeDir(dbspace_dir);
--- 334,340 ----
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath, InvalidXLogRecPtr);
  		}
  
  		FreeDir(dbspace_dir);
diff --git a/src/include/storage/copydir.h b/src/include/storage/copydir.h
index 2635a7e..463141d 100644
*** a/src/include/storage/copydir.h
--- b/src/include/storage/copydir.h
***************
*** 13,19 ****
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! extern void copydir(char *fromdir, char *todir, bool recurse);
! extern void copy_file(char *fromfile, char *tofile);
  
  #endif   /* COPYDIR_H */
--- 13,21 ----
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! #include "access/xlogdefs.h"
! 
! extern void copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr);
! extern void copy_file(char *fromfile, char *tofile, XLogRecPtr recptr);
  
  #endif   /* COPYDIR_H */
-- 
2.2.2

0003-File-based-incremental-backup-v8.patch (text/plain; charset=UTF-8)

From 99a0ee4950a77ae6fbe35d294d47b53a273496a6 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH 3/3] File-based incremental backup v8

Add backup profiles and --incremental to pg_basebackup
---
 doc/src/sgml/protocol.sgml             |  86 ++++++++-
 doc/src/sgml/ref/pg_basebackup.sgml    |  31 +++-
 src/backend/access/transam/xlog.c      |  18 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 319 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 191 ++++++++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   5 +
 10 files changed, 623 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..fc24648 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's max LSN in case
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is a <literal>'t'</literal> if
+          the file is included in the backup, <literal>'f'</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the timestamp of the last file
+          modification
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file 
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    In order to support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as last step of every backup. This
+    file will be transparently used by <application>pg_basebackup</application>
+    when invoked with the option <replaceable>--incremental</replaceable> to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Rename also the backup profile if present. This marks the data
+ 		 * directory as not usable as base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..107d70c 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,80 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 104,115 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 149,159 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 229,236 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 247,254 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 290,301 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 304,310 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 313,322 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 531,540 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 566,574 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 578,586 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 609,615 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 680,697 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 911,919 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 924,930 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 957,963 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 969,981 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1183,1243 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the
! 				 * "global" and "base" directories.
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 				int			oidchars;
! 				ForkNumber	forknum;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 *
! 				 * Only the main fork is checked, because it is the only one
! 				 * where page LSNs are always updated; other forks are always sent.
! 				 */
! 				is_relfile = ( has_relfiles
! 					&& parse_filename_for_nontemp_relation(de->d_name,
! 														   &oidchars,
! 														   &forknum)
! 					&& forknum == MAIN_FORKNUM);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											 incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1438,1626 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search a relnode file for a page with an LSN greater than or equal to the
+  * threshold. If all the pages in the file are older than the threshold, the
+  * file can be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 * If a page has an LSN not older than the threshold, stop scanning
+ 		 * and set the filemaxlsn value to 0, as it is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 
+ 		if ((len += cnt) >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were scanning it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 
+ 	/*
+ 	 * At this point, if *filemaxlsn contains InvalidXLogRecPtr
+ 	 * the file contains something that doesn't update page LSNs (e.g. FSM)
+ 	 */
+ 	if (*filemaxlsn == InvalidXLogRecPtr)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in the file list section of the backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + sent flag (1) + mtime (10) +
+ 	 * size (19) + path (MAXPGPATH) + separators (5) + newline + \0 <= 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char	maxlsn[18];		/* "%X/%X" needs up to 17 chars plus \0 */
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is wrapped in a CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Being the concatenation of two separators and the backup label, it
+ 	 * should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
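
For readers who want to experiment with the skip decision outside the server, here is a minimal Python sketch of what relnodeIsNewerThanLSN does. It is not part of the patch. It assumes the default 8 kB block size and that pd_lsn occupies the first 8 bytes of every page as two native-endian 32-bit values (high half first). The names scan_relfile, page_lsn, parse_lsn and make_page are hypothetical.

```python
import io
import struct

BLCKSZ = 8192  # default PostgreSQL block size (assumed; it is a build-time option)


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' in hexadecimal into an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def page_lsn(page):
    """Read pd_lsn from the start of a page header.

    It is stored as two native-endian 32-bit values: high half, then low half.
    """
    hi, lo = struct.unpack("=II", page[:8])
    return (hi << 32) | lo


def scan_relfile(fp, threshold):
    """Return (must_send, max_lsn) for a relation segment opened in binary mode."""
    max_lsn = 0
    while True:
        page = fp.read(BLCKSZ)
        if len(page) < 8:
            break  # end of file (a trailing fragment shorter than pd_lsn is ignored)
        lsn = page_lsn(page)
        if lsn >= threshold:
            # A page at or past the threshold: the file must be sent, and the
            # maximum seen so far is only partial, so it is reported as 0.
            return True, 0
        max_lsn = max(max_lsn, lsn)
    if max_lsn == 0:
        # No page carried an LSN: be conservative and send the file.
        return True, 0
    return False, max_lsn


def make_page(lsn):
    """Build a dummy page carrying only the given LSN (for the demo below)."""
    return struct.pack("=II", lsn >> 32, lsn & 0xFFFFFFFF).ljust(BLCKSZ, b"\0")


if __name__ == "__main__":
    segment = make_page(parse_lsn("0/100")) + make_page(parse_lsn("0/200"))
    print(scan_relfile(io.BytesIO(segment), parse_lsn("0/300")))
    print(scan_relfile(io.BytesIO(segment), parse_lsn("0/200")))
```

Unlike the C function, the sketch takes an already opened file object and does not bound the scan by the size recorded at stat time, so it may look at pages appended while it runs.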
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
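
The grammar change above accepts INCREMENTAL followed by a string constant as a BASE_BACKUP option. As a rough illustration (not part of the patch), a client could compose the command as in the Python sketch below. The helper names format_lsn and base_backup_command are hypothetical, and quoting the label by doubling single quotes is an assumption of this sketch; pg_basebackup itself escapes the label through libpq.

```python
def format_lsn(lsn):
    """Format a 64-bit LSN as 'hi/lo' in uppercase hex, like C's %X/%X."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def base_backup_command(label, incremental_startpoint=0):
    """Compose a BASE_BACKUP command; the clause is omitted for a full backup."""
    # Assumption: doubling single quotes is enough to quote the label here.
    parts = ["BASE_BACKUP", "LABEL '%s'" % label.replace("'", "''")]
    if incremental_startpoint > 0:
        parts.append("INCREMENTAL '%s'" % format_lsn(incremental_startpoint))
    return " ".join(parts)


if __name__ == "__main__":
    print(base_backup_command("pg_basebackup base backup", 0x51000028))
```

A start point of 0 is treated as "no incremental backup requested", matching the way the patch tests incremental_startpoint > 0.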
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..fd67d51 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *reference_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must not be empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* See if the backup profile is present */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks, unless we are writing to stdout
! 			 * and this is not the backup profile (i.e. 'res' is not NULL).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: abort the backup if the base is on a different timeline.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If format is tar and we are writing on standard output
+ 	 * append the backup profile to the stream, otherwise put it
+ 	 * in the destination directory
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.2

#16

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Marco Nenciarini (#15)

Re: File based Incremental backup v8

On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

The current implementation of the copydir function is incompatible with
LSN-based incremental backups. The problem is that new files are created,
but their blocks still carry the old LSN, so they will not be backed
up because they look old enough.

I think this is trying to pollute what's supposed to be a pure
fs-level operation ("copy a directory") into something that is aware
of specific details like the PostgreSQL page format. I really think
that nothing in storage/file should know about the page format. If we
need a function that copies a file while replacing the LSNs, I think
it should be a new function living somewhere else.

A bigger problem is that you are proposing to stamp those files with
LSNs that are, for lack of a better word, fake. I would expect that
this would completely break if checksums are enabled. Also, unlogged
relations typically have an LSN of 0; this would change that in some
cases, and I don't know whether that's OK.

The issues here are similar to those in
/messages/by-id/20150120152819.GC24381@alap3.anarazel.de
- basically, I think we need to make CREATE DATABASE and ALTER
DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
never going to work right. If we're not going to allow that, we need
to disallow hot backups while those operations are in progress.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Robert Haas (#16)

Re: File based Incremental backup v8

On 2015-01-29 12:57:22 -0500, Robert Haas wrote:

The issues here are similar to those in
/messages/by-id/20150120152819.GC24381@alap3.anarazel.de
- basically, I think we need to make CREATE DATABASE and ALTER
DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
never going to work right. If we're not going to allow that, we need
to disallow hot backups while those operations are in progress.

Yeah, the current way is just a hack from the dark ages. It has some
advantages, true, but I don't think they outweigh the disadvantages. I
hope to find time to develop a patch to make those properly WAL-logged
(for master) sometime not too far away.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#18

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Robert Haas (#16)

3 attachment(s)

Re: File based Incremental backup v8

On 29/01/15 18:57, Robert Haas wrote:

On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

The current implementation of the copydir function is incompatible with
LSN-based incremental backups. The problem is that new files are created,
but their blocks still carry the old LSN, so they will not be backed
up because they look old enough.

I think this is trying to pollute what's supposed to be a pure
fs-level operation ("copy a directory") into something that is aware
of specific details like the PostgreSQL page format. I really think
that nothing in storage/file should know about the page format. If we
need a function that copies a file while replacing the LSNs, I think
it should be a new function living somewhere else.

Given that the copydir function is used only during CREATE DATABASE and
ALTER DATABASE SET TABLESPACE, we could move or rename it to a better
place that clearly marks it as "knowing about page format". I'm open to
suggestions on where to place it and on what the correct name should be.

However, the whole copydir patch here should be treated as a "temporary"
thing. It is needed to support any form of LSN-based incremental backup
until proper WAL logging of CREATE DATABASE and ALTER DATABASE SET
TABLESPACE is implemented.

A bigger problem is that you are proposing to stamp those files with
LSNs that are, for lack of a better word, fake. I would expect that
this would completely break if checksums are enabled.

I'm sorry, I completely ignored checksums in the previous patch. The
attached one works with checksums enabled.

Also, unlogged relations typically have an LSN of 0; this would
change that in some cases, and I don't know whether that's OK.

It shouldn't be a problem, because all the code that uses unlogged
relations normally skips all the WAL-related operations. It is also not
a problem from the point of view of an incremental backup, because when
the backup is restored the unlogged tables get reinitialized by the
crash recovery procedure. However, if you think it is worth the effort,
I can rewrite copydir as a two-pass operation that detects the unlogged
tables on the first pass and avoids the LSN update on them. I personally
think that it isn't worth the effort unless someone identifies a real
path where setting LSNs in unlogged relations leads to an issue.
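
The two-pass idea mentioned above could look roughly like the following.
This is only a sketch, not code from the patch: it assumes that an
unlogged relation can be recognised by the presence of an "_init" fork
file in the same directory, and that relation file names follow the
same shape accepted by parse_filename_for_nontemp_relation (up to 10
digits, optional fork suffix, optional segment number). The names
parse_relation_filename and files_to_stamp are hypothetical.

```python
import re

# Assumed file name shape: <oid>[_<fork>][.<segment>], with at most 10 OID digits.
_RELFILE = re.compile(r"([0-9]{1,10})(?:_(fsm|vm|init))?(?:\.([0-9]+))?")


def parse_relation_filename(name):
    """Return (oid, fork) for a non-temporary relation file name, else None."""
    match = _RELFILE.fullmatch(name)
    if match is None:
        return None
    oid, fork, _segment = match.groups()
    return oid, fork or "main"


def files_to_stamp(filenames):
    """Return the MAIN-fork files whose page LSNs would be updated."""
    # First pass: an init fork marks the relation as unlogged (assumption).
    unlogged = set()
    for name in filenames:
        parsed = parse_relation_filename(name)
        if parsed and parsed[1] == "init":
            unlogged.add(parsed[0])

    # Second pass: keep only MAIN-fork files of relations that are not unlogged.
    result = []
    for name in filenames:
        parsed = parse_relation_filename(name)
        if parsed and parsed[1] == "main" and parsed[0] not in unlogged:
            result.append(name)
    return result


if __name__ == "__main__":
    listing = ["16384", "16384.1", "16384_fsm", "16385", "16385_init",
               "PG_VERSION"]
    print(files_to_stamp(listing))
```

The cost of this variant is an extra directory scan per copied database
directory, which is the trade-off weighed in the paragraph above.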

The issues here are similar to those in
/messages/by-id/20150120152819.GC24381@alap3.anarazel.de
- basically, I think we need to make CREATE DATABASE and ALTER
DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
never going to work right. If we're not going to allow that, we need
to disallow hot backups while those operations are in progress.

This is right, but the problem Andres reported is orthogonal to the
one I'm addressing here. Without this copydir patch (or without proper
WAL logging of copydir operations), you cannot take an incremental
backup after a CREATE DATABASE or ALTER DATABASE SET TABLESPACE until
you take a full backup and use it as the base.
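
To make the failure mode concrete, here is a small simulation of the
selection rule described in the protocol documentation of the attached
patch (a file is sent if at least one block has an LSN higher than or
equal to the start LSN). It is a sketch under assumptions, not the
server code: it assumes 8192-byte blocks, that the page LSN is stored
in the first 8 bytes of each page as two native-endian 32-bit integers,
and it uses fake pages and made-up LSN values. All function names are
hypothetical.

```python
import struct

BLCKSZ = 8192  # assumed default block size


def parse_lsn(text):
    """Convert an LSN in 'X/X' hexadecimal notation to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def make_page(lsn):
    """Build a fake page whose first 8 bytes hold the LSN (assumed layout)."""
    header = struct.pack("=II", lsn >> 32, lsn & 0xFFFFFFFF)
    return header + b"\0" * (BLCKSZ - len(header))


def page_lsn(page):
    """Read the LSN back from the first 8 bytes of a page."""
    high, low = struct.unpack_from("=II", page, 0)
    return (high << 32) | low


def max_lsn(data):
    """Highest page LSN found in a block-organised file image."""
    return max((page_lsn(data[off:off + BLCKSZ])
                for off in range(0, len(data), BLCKSZ)), default=0)


def must_send(data, start_lsn):
    """Selection rule: send the file if any block has LSN >= start_lsn."""
    return max_lsn(data) >= start_lsn


if __name__ == "__main__":
    start = parse_lsn("0/51000028")           # start of the previous backup
    copied = make_page(parse_lsn("0/3000000")) * 2   # file copied verbatim
    stamped = make_page(parse_lsn("0/51000100")) * 2  # file copied with new LSN
    print(must_send(copied, start), must_send(stamped, start))
```

A file created by a plain copy after the base backup keeps the old page
LSNs of its source and is therefore skipped, while the same file with
its pages stamped with the LSN of the copy operation is picked up.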

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

0001-public-parse_filename_for_nontemp_relation.patch (text/plain; charset=UTF-8)

From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---------------------------------------
 src/common/relpath.c              | 56 +++++++++++++++++++++++++++++++++++++
 src/include/common/relpath.h      |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*************** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  									  int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  								   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 									int *oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 ----
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
- 									ForkNumber *fork)
- {
- 	int			pos;
- 
- 	/* Look for a non-empty string of digits (that isn't too long). */
- 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
- 		;
- 	if (pos == 0 || pos > OIDCHARS)
- 		return false;
- 	*oidchars = pos;
- 
- 	/* Check for a fork name. */
- 	if (name[pos] != '_')
- 		*fork = MAIN_FORKNUM;
- 	else
- 	{
- 		int			forkchar;
- 
- 		forkchar = forkname_chars(&name[pos + 1], fork);
- 		if (forkchar <= 0)
- 			return false;
- 		pos += forkchar + 1;
- 	}
- 
- 	/* Check for a segment number. */
- 	if (name[pos] == '.')
- 	{
- 		int			segchar;
- 
- 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
- 			;
- 		if (segchar <= 1)
- 			return false;
- 		pos += segchar;
- 	}
- 
- 	/* Now we should be at the end. */
- 	if (name[pos] != '\0')
- 		return false;
- 	return true;
- }
--- 386,388 ----
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 ****
--- 206,264 ----
  	}
  	return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to the a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+ 									ForkNumber *fork)
+ {
+ 	int			pos;
+ 
+ 	/* Look for a non-empty string of digits (that isn't too long). */
+ 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+ 		;
+ 	if (pos == 0 || pos > OIDCHARS)
+ 		return false;
+ 	*oidchars = pos;
+ 
+ 	/* Check for a fork name. */
+ 	if (name[pos] != '_')
+ 		*fork = MAIN_FORKNUM;
+ 	else
+ 	{
+ 		int			forkchar;
+ 
+ 		forkchar = forkname_chars(&name[pos + 1], fork);
+ 		if (forkchar <= 0)
+ 			return false;
+ 		pos += forkchar + 1;
+ 	}
+ 
+ 	/* Check for a segment number. */
+ 	if (name[pos] == '.')
+ 	{
+ 		int			segchar;
+ 
+ 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
+ 			;
+ 		if (segchar <= 1)
+ 			return false;
+ 		pos += segchar;
+ 	}
+ 
+ 	/* Now we should be at the end. */
+ 	if (name[pos] != '\0')
+ 		return false;
+ 	return true;
+ }
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a263779..9736a78 100644
*** a/src/include/common/relpath.h
--- b/src/include/common/relpath.h
*************** extern char *GetDatabasePath(Oid dbNode,
*** 52,57 ****
--- 52,59 ----
  
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
+ extern bool parse_filename_for_nontemp_relation(const char *name,
+ 								int *oidchars, ForkNumber *fork);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
-- 
2.2.2

0002-copydir-LSN-v2.patch (text/plain; charset=UTF-8)

From 98d21da4d10c558323cef1f3895f02b3088345ed Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 11:41:35 +0100
Subject: [PATCH 2/3] copydir LSN v2

---
 src/backend/commands/dbcommands.c  | 32 ++++++++++---------
 src/backend/storage/file/copydir.c | 64 +++++++++++++++++++++++++++++++++++---
 src/backend/storage/file/reinit.c  |  3 +-
 src/include/storage/copydir.h      |  6 ++--
 4 files changed, 84 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5e66961..6dd9878 100644
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** createdb(const CreatedbStmt *stmt)
*** 586,591 ****
--- 586,592 ----
  			Oid			dsttablespace;
  			char	   *srcpath;
  			char	   *dstpath;
+ 			XLogRecPtr	recptr;
  			struct stat st;
  
  			/* No need to copy global tablespace */
*************** createdb(const CreatedbStmt *stmt)
*** 609,621 ****
  
  			dstpath = GetDatabasePath(dboid, dsttablespace);
  
- 			/*
- 			 * Copy this subdirectory to the new location
- 			 *
- 			 * We don't need to copy subdirectories
- 			 */
- 			copydir(srcpath, dstpath, false);
- 
  			/* Record the filesystem change in XLOG */
  			{
  				xl_dbase_create_rec xlrec;
--- 610,615 ----
*************** createdb(const CreatedbStmt *stmt)
*** 628,636 ****
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				(void) XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
--- 622,637 ----
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				recptr = XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
+ 
+ 			/*
+ 			 * Copy this subdirectory to the new location
+ 			 *
+ 			 * We don't need to copy subdirectories
+ 			 */
+ 			copydir(srcpath, dstpath, false, recptr);
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
*************** movedb(const char *dbname, const char *t
*** 1214,1223 ****
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		/*
! 		 * Copy files from the old tablespace to the new one
! 		 */
! 		copydir(src_dbpath, dst_dbpath, false);
  
  		/*
  		 * Record the filesystem change in XLOG
--- 1215,1221 ----
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		XLogRecPtr	recptr;
  
  		/*
  		 * Record the filesystem change in XLOG
*************** movedb(const char *dbname, const char *t
*** 1233,1243 ****
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			(void) XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
--- 1231,1246 ----
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			recptr = XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
+ 		 * Copy files from the old tablespace to the new one
+ 		 */
+ 		copydir(src_dbpath, dst_dbpath, false, recptr);
+ 
+ 		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
*************** dbase_redo(XLogReaderState *record)
*** 2045,2050 ****
--- 2048,2054 ----
  	if (info == XLOG_DBASE_CREATE)
  	{
  		xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);
+ 		XLogRecPtr	lsn = record->EndRecPtr;
  		char	   *src_path;
  		char	   *dst_path;
  		struct stat st;
*************** dbase_redo(XLogReaderState *record)
*** 2077,2083 ****
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
--- 2081,2087 ----
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false, lsn);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 41b2c62..a7a0dc5 100644
*** a/src/backend/storage/file/copydir.c
--- b/src/backend/storage/file/copydir.c
***************
*** 22,27 ****
--- 22,29 ----
  #include <unistd.h>
  #include <sys/stat.h>
  
+ #include "common/relpath.h"
+ #include "storage/bufpage.h"
  #include "storage/copydir.h"
  #include "storage/fd.h"
  #include "miscadmin.h"
***************
*** 32,40 ****
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
--- 34,45 ----
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
+  *
+  * If recptr is different from InvalidXLogRecPtr, LSN of pages in the
+  * destination directory will be updated to recptr.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
*************** copydir(char *fromdir, char *todir, bool
*** 75,84 ****
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true);
  		}
  		else if (S_ISREG(fst.st_mode))
! 			copy_file(fromfile, tofile);
  	}
  	FreeDir(xldir);
  
--- 80,106 ----
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true, recptr);
  		}
  		else if (S_ISREG(fst.st_mode))
! 		{
! 			int			oidchars;
! 			ForkNumber	fork;
! 
! 			/*
! 			 * To support incremental backups, we need to update the LSN in
! 			 * all relation files we are copying.
! 			 *
! 			 * We are updating only the MAIN fork because at the moment
! 			 * blocks in FSM and VM forks are not guaranteed to have an
! 			 * up-to-date LSN
! 			 */
! 			if (parse_filename_for_nontemp_relation(xlde->d_name,
! 						&oidchars, &fork) && fork == MAIN_FORKNUM)
! 				copy_file(fromfile, tofile, recptr);
! 			else
! 				copy_file(fromfile, tofile, InvalidXLogRecPtr);
! 		}
  	}
  	FreeDir(xldir);
  
*************** copydir(char *fromdir, char *todir, bool
*** 130,144 ****
  
  /*
   * copy one file
   */
  void
! copy_file(char *fromfile, char *tofile)
  {
  	char	   *buffer;
  	int			srcfd;
  	int			dstfd;
  	int			nbytes;
  	off_t		offset;
  
  	/* Use palloc to ensure we get a maxaligned buffer */
  #define COPY_BUF_SIZE (8 * BLCKSZ)
--- 152,170 ----
  
  /*
   * copy one file
+  *
+  * If recptr is different from InvalidXLogRecPtr, the destination file will
+  * have all its pages with LSN set accordingly
   */
  void
! copy_file(char *fromfile, char *tofile, XLogRecPtr recptr)
  {
  	char	   *buffer;
  	int			srcfd;
  	int			dstfd;
  	int			nbytes;
  	off_t		offset;
+ 	BlockNumber	blkno = 0;
  
  	/* Use palloc to ensure we get a maxaligned buffer */
  #define COPY_BUF_SIZE (8 * BLCKSZ)
*************** copy_file(char *fromfile, char *tofile)
*** 176,181 ****
--- 202,237 ----
  					 errmsg("could not read file \"%s\": %m", fromfile)));
  		if (nbytes == 0)
  			break;
+ 
+ 		/*
+ 		 * If a valid recptr has been provided, the resulting file will have
+ 		 * all its pages with LSN set accordingly
+ 		 */
+ 		if (recptr != InvalidXLogRecPtr)
+ 		{
+ 			char		*page;
+ 
+ 			/*
+ 			 * If we are updating LSN of a file, we must be sure that the
+ 			 * source file is not being extended.
+ 			 */
+ 			if (nbytes % BLCKSZ != 0)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 						 errmsg("file \"%s\" size is not multiple of %d",
+ 								fromfile, BLCKSZ)));
+ 
+ 			for (page = buffer; page < (buffer + nbytes); page += BLCKSZ, blkno++)
+ 			{
+ 				/* Update LSN only if the page looks valid */
+ 				if (!PageIsNew(page) && PageIsVerified(page, blkno))
+ 				{
+ 					PageSetLSN(page, recptr);
+ 					PageSetChecksumInplace(page, blkno);
+ 				}
+ 			}
+ 		}
+ 
  		errno = 0;
  		if ((int) write(dstfd, buffer, nbytes) != nbytes)
  		{
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 02b5fee..854ae4a 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <unistd.h>
  
+ #include "access/xlogdefs.h"
  #include "catalog/catalog.h"
  #include "common/relpath.h"
  #include "storage/copydir.h"
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 333,339 ****
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath);
  		}
  
  		FreeDir(dbspace_dir);
--- 334,340 ----
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath, InvalidXLogRecPtr);
  		}
  
  		FreeDir(dbspace_dir);
diff --git a/src/include/storage/copydir.h b/src/include/storage/copydir.h
index 2635a7e..463141d 100644
*** a/src/include/storage/copydir.h
--- b/src/include/storage/copydir.h
***************
*** 13,19 ****
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! extern void copydir(char *fromdir, char *todir, bool recurse);
! extern void copy_file(char *fromfile, char *tofile);
  
  #endif   /* COPYDIR_H */
--- 13,21 ----
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! #include "access/xlogdefs.h"
! 
! extern void copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr);
! extern void copy_file(char *fromfile, char *tofile, XLogRecPtr recptr);
  
  #endif   /* COPYDIR_H */
-- 
2.2.2

0003-File-based-incremental-backup-v8.patch (text/plain; charset=UTF-8)

From 9e582cbf480805ccf983a71a50fdde186a54769b Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH 3/3] File-based incremental backup v8

Add backup profiles and --incremental to pg_basebackup
---
 doc/src/sgml/protocol.sgml             |  86 ++++++++-
 doc/src/sgml/ref/pg_basebackup.sgml    |  31 +++-
 src/backend/access/transam/xlog.c      |  18 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 319 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 191 ++++++++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   5 +
 10 files changed, 623 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..fc24648 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's maximum page LSN if
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is <literal>t</literal> if
+          the file is included in the backup, <literal>f</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the time of the last file
+          modification, in seconds since the Unix epoch
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
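
Note for reviewers: the profile format documented above can be inspected with a few lines of Python. The following is only a sketch. The names parse_backup_profile and ProfileEntry are invented here and are not part of the patch, the header and separator strings are taken from the documentation text rather than from the BACKUP_PROFILE_HEADER and BACKUP_PROFILE_SEPARATOR macros, and it assumes that relpath never contains an escaped tab.

```python
from typing import NamedTuple, Optional

# Assumed values, copied from the documented format above.
HEADER = "POSTGRESQL BACKUP PROFILE 1"
SEPARATOR = "FILE LIST"


class ProfileEntry(NamedTuple):
    tablespace: Optional[str]   # tablespace OID, None for files in PGDATA
    maxlsn: Optional[str]       # only set for skipped files
    included: bool              # True if the file is part of the backup
    mtime: int                  # seconds since the Unix epoch
    size: int                   # bytes
    relpath: str                # relative to PGDATA or the tablespace


def parse_backup_profile(text):
    """Split a backup_profile into (label_lines, entries)."""
    lines = text.splitlines()
    if not lines or lines[0] != HEADER:
        raise ValueError("not a backup profile")
    sep = lines.index(SEPARATOR)        # raises ValueError if missing
    label = lines[1:sep]
    entries = []
    for line in lines[sep + 1:]:
        if not line:
            continue
        ts, maxlsn, included, mtime, size, relpath = line.split("\t", 5)
        entries.append(ProfileEntry(
            None if ts == r"\N" else ts,
            None if maxlsn == r"\N" else maxlsn,
            included == "t",
            int(mtime),
            int(size),
            relpath,
        ))
    return label, entries
```

Filtering the returned entries on "not included" gives the list of files that a restore tool would have to take from the reference backup.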
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    To support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as the last step of every backup.
+    <application>pg_basebackup</application> reads this file transparently
+    when invoked with the <option>--incremental</option> option to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Also rename the backup profile, if present. This marks the data
+ 		 * directory as unusable as a base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
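
Note for reviewers: the new INCREMENTAL FROM LOCATION line uses the same %X/%X text form as the other label fields, so a script can tell a full backup from an incremental one by looking at the label. A minimal sketch follows. The helper names parse_lsn, format_lsn and incremental_from are hypothetical and exist only in this example.

```python
import re


def parse_lsn(text):
    """Convert 'hi/lo' hexadecimal text (the %X/%X form) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Inverse of parse_lsn: high and low 32 bits printed as %X/%X."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def incremental_from(label_text):
    """Return the INCREMENTAL FROM LOCATION of a label as an int, or None."""
    m = re.search(r"^INCREMENTAL FROM LOCATION: ([0-9A-F]+/[0-9A-F]+)$",
                  label_text, re.MULTILINE)
    return parse_lsn(m.group(1)) if m else None
```

A label without the extra line yields None, which matches the patch: the line is written only when incremental_startpoint is greater than zero.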
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..107d70c 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,80 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 104,115 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 149,159 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 229,236 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 247,254 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 290,301 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 304,310 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 313,322 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 531,540 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 566,574 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 578,586 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 609,615 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 680,697 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 911,919 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 924,930 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 957,963 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 969,981 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1183,1243 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
+ 				 * Within PGDATA, relnode files are contained only in the
+ 				 * "global" and "base" directories.
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 				int			oidchars;
! 				ForkNumber	forknum;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 *
+ 				 * Only the main fork is checked: it is the only one where
+ 				 * page LSNs are always updated. Other forks are always sent.
! 				 */
+ 				is_relfile = (has_relfiles
! 					&& parse_filename_for_nontemp_relation(de->d_name,
! 														   &oidchars,
! 														   &forknum)
! 					&& forknum == MAIN_FORKNUM);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											 incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1438,1626 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search in a relnode file for a page with a LSN greater than the threshold.
+  * If all the blocks in the file are older than the threshold the file can
+  * be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		len += cnt;
+ 		pagelsn = PageGetLSN(buf);
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 * If a page has an LSN at or beyond the threshold, stop scanning
+ 		 * and reset filemaxlsn to 0, since the value is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 
+ 	/*
+ 	 * At this point, if *filemaxlsn contains InvalidXLogRecPtr
+ 	 * the file contains something that doesn't update page LSNs (e.g. FSM)
+ 	 */
+ 	if (*filemaxlsn == InvalidXLogRecPtr)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + included (1) + mtime (10) +
+ 	 * size (19) + separators (5) + newline + trailing \0 = 64, plus the path
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char	maxlsn[18];		/* "%X/%X" needs up to 17 chars + \0 */
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is sent as a CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the backup profile header section. Since it is
+ 	 * the concatenation of two separators and the backup label, it should be
+ 	 * shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
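
Note for reviewers: the decision taken by relnodeIsNewerThanLSN can be reproduced outside the server to check which files a given threshold would skip. The sketch below is an approximation under stated assumptions: the function name scan_max_lsn is invented here, it assumes the default 8192-byte block size, and it assumes the file was written on a machine with the same byte order, since pd_lsn is stored as two native-endian 32-bit integers at the start of each page header.

```python
import struct

BLCKSZ = 8192  # default block size; an assumption of this sketch


def scan_max_lsn(fp, threshold_lsn):
    """Return (must_send, max_lsn) for a relation file opened in binary mode.

    Mirrors the logic of the patch: the file must be sent as soon as one
    page has an LSN at or beyond the threshold, or when no page carries an
    LSN at all. Otherwise it can be skipped and max_lsn is reported.
    """
    max_lsn = 0
    while True:
        page = fp.read(BLCKSZ)
        if len(page) < 8:               # end of file (or truncated page)
            break
        hi, lo = struct.unpack_from("=II", page, 0)
        page_lsn = (hi << 32) | lo
        if page_lsn >= threshold_lsn:
            return True, None           # scan stops early, max is partial
        max_lsn = max(max_lsn, page_lsn)
    if max_lsn == 0:
        return True, None               # nothing updates page LSNs here
    return False, max_lsn
```

As in the C function, an early exit does not report a maximum LSN, because only part of the file has been examined at that point.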
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
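
Note for reviewers: with the grammar change above, INCREMENTAL takes a quoted string that the server later feeds to pg_lsn_in. The sketch below shows what such a command could look like on the wire. It is illustrative only: base_backup_command is a made-up helper, and the order in which pg_basebackup emits the options is an assumption (the grammar accepts them in any order).

```python
def base_backup_command(label, incremental_lsn=None, progress=False,
                        fast=False, wal=False):
    """Build a BASE_BACKUP replication command string."""
    def quote(value):
        # String constants in the replication grammar double embedded quotes.
        return "'" + value.replace("'", "''") + "'"

    parts = ["BASE_BACKUP", "LABEL", quote(label)]
    if progress:
        parts.append("PROGRESS")
    if fast:
        parts.append("FAST")
    if wal:
        parts.append("WAL")
    if incremental_lsn is not None:
        # The LSN travels as text, for example '0/51000028'.
        parts += ["INCREMENTAL", quote(incremental_lsn)]
    return " ".join(parts)
```

Passing the LSN as a string constant keeps the scanner unchanged apart from the new keyword, at the price of validating the value only when parse_basebackup_options runs.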
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..fd67d51 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *reference_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must not be empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* See if the backup profile is present */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks, except when streaming to stdout
! 			 * a member other than the backup profile (res is NULL only for
! 			 * the profile, which is appended last).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not displayed because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If the format is tar and we are writing to standard output,
+ 	 * append the backup profile to the stream; otherwise put it
+ 	 * in the destination directory.
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.2

#19

Erik Rijkers

er@xs4all.nl

almost 11 years ago

In reply to: Marco Nenciarini (#18)

Re: File based Incremental backup v8

On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:

0001-public-parse_filename_for_nontemp_relation.patch
0002-copydir-LSN-v2.patch
0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled.

This is perhaps not yet really a problem at this stage but I thought I'd mention it:

make --quiet -j 8
In file included from gram.y:14403:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:10174:23: warning: unused variable ‘yyg’ [-Wunused-variable]
struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be unused depending upon options. */
^
basebackup.c: In function ‘writeBackupProfileLine’:
basebackup.c:1545:8: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 8 has type ‘__off_t’ [-Wformat=]
filename);
^
basebackup.c:1545:8: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 8 has type ‘__off_t’ [-Wformat=]
pg_basebackup.c: In function ‘ReceiveTarFile’:
pg_basebackup.c:858:2: warning: implicit declaration of function ‘assert’ [-Wimplicit-function-declaration]
assert(res || (strcmp(basedir, "-") == 0));
^
pg_basebackup.c:865:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
gzFile ztarfile = NULL;
^
pg_basebackup.o: In function `ReceiveAndUnpackTarFile':
pg_basebackup.c:(.text+0x690): undefined reference to `assert'
pg_basebackup.o: In function `ReceiveTarFile':
pg_basebackup.c:(.text+0xeb0): undefined reference to `assert'
pg_basebackup.c:(.text+0x10ad): undefined reference to `assert'
collect2: error: ld returned 1 exit status
make[3]: *** [pg_basebackup] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all-pg_basebackup-recurse] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2

The configure used was:
./configure \
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin.fast \
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib.fast \
--with-pgport=6973 --quiet --enable-depend \
--with-extra-version=_incremental_backup_20150131_1521_08bd0c581158 \
--with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

A build with --enable-cassert and --enable-debug builds fine:

./configure \
--prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin \
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib \
--with-pgport=6973 --quiet --enable-depend \
--with-extra-version=_incremental_backup_20150131_1628_08bd0c581158 \
--enable-cassert --enable-debug \
--with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

I will further test with that.

thanks,

Erik Rijkers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Erik Rijkers (#19)

3 attachment(s)

File based Incremental backup v9

On 31/01/15 17:22, Erik Rijkers wrote:

On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:

0001-public-parse_filename_for_nontemp_relation.patch
0002-copydir-LSN-v2.patch
0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled.

It is due to a typo (assert instead of Assert). You can find the updated
patch attached to this message.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
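
The assert/Assert mix-up discussed in this reply follows a common pattern: a project-level assertion macro that compiles to nothing when checks are disabled, as opposed to the C library assert(), which needs <assert.h>. Without that header the compiler treats assert as an implicitly declared function, which matches the "implicit declaration" warning and the "undefined reference" link error in the build log. The sketch below is only an illustration of that pattern; SketchAssert and SKETCH_ASSERT_CHECKING are hypothetical names, and this is not PostgreSQL's actual definition of Assert.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical project-level assertion macro. When the build-time switch
 * SKETCH_ASSERT_CHECKING is not defined, the macro expands to a no-op, so
 * no symbol has to be resolved at link time.
 */
#ifdef SKETCH_ASSERT_CHECKING
#define SketchAssert(cond) \
	do { \
		if (!(cond)) \
		{ \
			fprintf(stderr, "assertion failed: %s\n", #cond); \
			abort(); \
		} \
	} while (0)
#else
#define SketchAssert(cond) ((void) 0)
#endif

/*
 * Example caller: the assertion documents the precondition, while the
 * run-time check keeps the function safe in builds without assertions.
 */
static int
sketch_checked_div(int a, int b)
{
	SketchAssert(b != 0);
	return (b != 0) ? a / b : 0;
}
```

Compiling with -DSKETCH_ASSERT_CHECKING turns the check on; without it the macro vanishes, so the same source builds in both configurations.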

Attachments:

0001-public-parse_filename_for_nontemp_relation.patch

From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---------------------------------------
 src/common/relpath.c              | 56 +++++++++++++++++++++++++++++++++++++
 src/include/common/relpath.h      |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*************** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  									  int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  								   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 									int *oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 ----
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
- 									ForkNumber *fork)
- {
- 	int			pos;
- 
- 	/* Look for a non-empty string of digits (that isn't too long). */
- 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
- 		;
- 	if (pos == 0 || pos > OIDCHARS)
- 		return false;
- 	*oidchars = pos;
- 
- 	/* Check for a fork name. */
- 	if (name[pos] != '_')
- 		*fork = MAIN_FORKNUM;
- 	else
- 	{
- 		int			forkchar;
- 
- 		forkchar = forkname_chars(&name[pos + 1], fork);
- 		if (forkchar <= 0)
- 			return false;
- 		pos += forkchar + 1;
- 	}
- 
- 	/* Check for a segment number. */
- 	if (name[pos] == '.')
- 	{
- 		int			segchar;
- 
- 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
- 			;
- 		if (segchar <= 1)
- 			return false;
- 		pos += segchar;
- 	}
- 
- 	/* Now we should be at the end. */
- 	if (name[pos] != '\0')
- 		return false;
- 	return true;
- }
--- 386,388 ----
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 ****
--- 206,264 ----
  	}
  	return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+ 									ForkNumber *fork)
+ {
+ 	int			pos;
+ 
+ 	/* Look for a non-empty string of digits (that isn't too long). */
+ 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+ 		;
+ 	if (pos == 0 || pos > OIDCHARS)
+ 		return false;
+ 	*oidchars = pos;
+ 
+ 	/* Check for a fork name. */
+ 	if (name[pos] != '_')
+ 		*fork = MAIN_FORKNUM;
+ 	else
+ 	{
+ 		int			forkchar;
+ 
+ 		forkchar = forkname_chars(&name[pos + 1], fork);
+ 		if (forkchar <= 0)
+ 			return false;
+ 		pos += forkchar + 1;
+ 	}
+ 
+ 	/* Check for a segment number. */
+ 	if (name[pos] == '.')
+ 	{
+ 		int			segchar;
+ 
+ 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
+ 			;
+ 		if (segchar <= 1)
+ 			return false;
+ 		pos += segchar;
+ 	}
+ 
+ 	/* Now we should be at the end. */
+ 	if (name[pos] != '\0')
+ 		return false;
+ 	return true;
+ }
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a263779..9736a78 100644
*** a/src/include/common/relpath.h
--- b/src/include/common/relpath.h
*************** extern char *GetDatabasePath(Oid dbNode,
*** 52,57 ****
--- 52,59 ----
  
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
+ extern bool parse_filename_for_nontemp_relation(const char *name,
+ 								int *oidchars, ForkNumber *fork);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
-- 
2.2.2
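
To see which file names the function moved by this patch accepts, the same three-step scan (OID digits, optional fork suffix, optional segment number) can be reproduced outside the server. This is a simplified sketch, not the patch's code: sketch_parse_relfilename(), SketchFork and SKETCH_OIDCHARS are hypothetical names, and the fork suffix table ("fsm", "vm", "init") is hard-coded here instead of calling forkname_chars().

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_OIDCHARS 10		/* max decimal digits of a 32-bit OID */

typedef enum
{
	SK_MAIN, SK_FSM, SK_VM, SK_INIT
} SketchFork;

/* Hypothetical stand-alone variant of the relation file name check */
static bool
sketch_parse_relfilename(const char *name, int *oidchars, SketchFork *fork)
{
	static const char *const suffixes[] = {"fsm", "vm", "init"};
	static const SketchFork forks[] = {SK_FSM, SK_VM, SK_INIT};
	int			pos = 0;

	/* Step 1: a non-empty run of digits that is not too long */
	while (isdigit((unsigned char) name[pos]))
		pos++;
	if (pos == 0 || pos > SKETCH_OIDCHARS)
		return false;
	*oidchars = pos;

	/* Step 2: optional "_<fork>" suffix, main fork otherwise */
	*fork = SK_MAIN;
	if (name[pos] == '_')
	{
		bool		found = false;
		size_t		i;

		for (i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++)
		{
			size_t		len = strlen(suffixes[i]);

			if (strncmp(&name[pos + 1], suffixes[i], len) == 0)
			{
				*fork = forks[i];
				pos += 1 + (int) len;
				found = true;
				break;
			}
		}
		if (!found)
			return false;
	}

	/* Step 3: optional ".<segment>" with at least one digit */
	if (name[pos] == '.')
	{
		int			segchar = 1;

		while (isdigit((unsigned char) name[pos + segchar]))
			segchar++;
		if (segchar <= 1)
			return false;
		pos += segchar;
	}

	/* Anything left over means this is not a relation file name */
	return name[pos] == '\0';
}
```

With this check, names such as "16384", "16384_vm" and "16384_fsm.1" are accepted, while temporary relation files like "t3_16384" are rejected because they do not start with a digit.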

0002-copydir-LSN-v2.patch

From 98d21da4d10c558323cef1f3895f02b3088345ed Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Thu, 29 Jan 2015 11:41:35 +0100
Subject: [PATCH 2/3] copydir LSN v2

---
 src/backend/commands/dbcommands.c  | 32 ++++++++++---------
 src/backend/storage/file/copydir.c | 64 +++++++++++++++++++++++++++++++++++---
 src/backend/storage/file/reinit.c  |  3 +-
 src/include/storage/copydir.h      |  6 ++--
 4 files changed, 84 insertions(+), 21 deletions(-)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5e66961..6dd9878 100644
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** createdb(const CreatedbStmt *stmt)
*** 586,591 ****
--- 586,592 ----
  			Oid			dsttablespace;
  			char	   *srcpath;
  			char	   *dstpath;
+ 			XLogRecPtr	recptr;
  			struct stat st;
  
  			/* No need to copy global tablespace */
*************** createdb(const CreatedbStmt *stmt)
*** 609,621 ****
  
  			dstpath = GetDatabasePath(dboid, dsttablespace);
  
- 			/*
- 			 * Copy this subdirectory to the new location
- 			 *
- 			 * We don't need to copy subdirectories
- 			 */
- 			copydir(srcpath, dstpath, false);
- 
  			/* Record the filesystem change in XLOG */
  			{
  				xl_dbase_create_rec xlrec;
--- 610,615 ----
*************** createdb(const CreatedbStmt *stmt)
*** 628,636 ****
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				(void) XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
--- 622,637 ----
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				recptr = XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
+ 
+ 			/*
+ 			 * Copy this subdirectory to the new location
+ 			 *
+ 			 * We don't need to copy subdirectories
+ 			 */
+ 			copydir(srcpath, dstpath, false, recptr);
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
*************** movedb(const char *dbname, const char *t
*** 1214,1223 ****
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		/*
! 		 * Copy files from the old tablespace to the new one
! 		 */
! 		copydir(src_dbpath, dst_dbpath, false);
  
  		/*
  		 * Record the filesystem change in XLOG
--- 1215,1221 ----
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		XLogRecPtr	recptr;
  
  		/*
  		 * Record the filesystem change in XLOG
*************** movedb(const char *dbname, const char *t
*** 1233,1243 ****
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			(void) XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
--- 1231,1246 ----
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			recptr = XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
+ 		 * Copy files from the old tablespace to the new one
+ 		 */
+ 		copydir(src_dbpath, dst_dbpath, false, recptr);
+ 
+ 		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
*************** dbase_redo(XLogReaderState *record)
*** 2045,2050 ****
--- 2048,2054 ----
  	if (info == XLOG_DBASE_CREATE)
  	{
  		xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);
+ 		XLogRecPtr	lsn = record->EndRecPtr;
  		char	   *src_path;
  		char	   *dst_path;
  		struct stat st;
*************** dbase_redo(XLogReaderState *record)
*** 2077,2083 ****
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
--- 2081,2087 ----
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false, lsn);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 41b2c62..a7a0dc5 100644
*** a/src/backend/storage/file/copydir.c
--- b/src/backend/storage/file/copydir.c
***************
*** 22,27 ****
--- 22,29 ----
  #include <unistd.h>
  #include <sys/stat.h>
  
+ #include "common/relpath.h"
+ #include "storage/bufpage.h"
  #include "storage/copydir.h"
  #include "storage/fd.h"
  #include "miscadmin.h"
***************
*** 32,40 ****
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
--- 34,45 ----
   *
   * If recurse is false, subdirectories are ignored.  Anything that's not
   * a directory or a regular file is ignored.
+  *
+  * If recptr is different from InvalidXLogRecPtr, the LSN of pages in the
+  * destination directory will be updated to recptr.
   */
  void
! copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr)
  {
  	DIR		   *xldir;
  	struct dirent *xlde;
*************** copydir(char *fromdir, char *todir, bool
*** 75,84 ****
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true);
  		}
  		else if (S_ISREG(fst.st_mode))
! 			copy_file(fromfile, tofile);
  	}
  	FreeDir(xldir);
  
--- 80,106 ----
  		{
  			/* recurse to handle subdirectories */
  			if (recurse)
! 				copydir(fromfile, tofile, true, recptr);
  		}
  		else if (S_ISREG(fst.st_mode))
! 		{
! 			int			oidchars;
! 			ForkNumber	fork;
! 
! 			/*
! 			 * To support incremental backups, we need to update the LSN in
! 			 * all relation files we are copying.
! 			 *
! 			 * We are updating only the MAIN fork because at the moment
! 			 * blocks in FSM and VM forks are not guaranteed to have an
! 			 * up-to-date LSN
! 			 */
! 			if (parse_filename_for_nontemp_relation(xlde->d_name,
! 						&oidchars, &fork) && fork == MAIN_FORKNUM)
! 				copy_file(fromfile, tofile, recptr);
! 			else
! 				copy_file(fromfile, tofile, InvalidXLogRecPtr);
! 		}
  	}
  	FreeDir(xldir);
  
*************** copydir(char *fromdir, char *todir, bool
*** 130,144 ****
  
  /*
   * copy one file
   */
  void
! copy_file(char *fromfile, char *tofile)
  {
  	char	   *buffer;
  	int			srcfd;
  	int			dstfd;
  	int			nbytes;
  	off_t		offset;
  
  	/* Use palloc to ensure we get a maxaligned buffer */
  #define COPY_BUF_SIZE (8 * BLCKSZ)
--- 152,170 ----
  
  /*
   * copy one file
+  *
+  * If recptr is different from InvalidXLogRecPtr, every page of the
+  * destination file will have its LSN set to recptr.
   */
  void
! copy_file(char *fromfile, char *tofile, XLogRecPtr recptr)
  {
  	char	   *buffer;
  	int			srcfd;
  	int			dstfd;
  	int			nbytes;
  	off_t		offset;
+ 	BlockNumber	blkno = 0;
  
  	/* Use palloc to ensure we get a maxaligned buffer */
  #define COPY_BUF_SIZE (8 * BLCKSZ)
*************** copy_file(char *fromfile, char *tofile)
*** 176,181 ****
--- 202,237 ----
  					 errmsg("could not read file \"%s\": %m", fromfile)));
  		if (nbytes == 0)
  			break;
+ 
+ 		/*
+ 		 * If a valid recptr has been provided, the resulting file will have
+ 		 * all its pages with LSN set accordingly
+ 		 */
+ 		if (recptr != InvalidXLogRecPtr)
+ 		{
+ 			char		*page;
+ 
+ 			/*
+ 			 * If we are updating LSN of a file, we must be sure that the
+ 			 * source file is not being extended.
+ 			 */
+ 			if (nbytes % BLCKSZ != 0)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 						 errmsg("file \"%s\" size is not a multiple of %d",
+ 								fromfile, BLCKSZ)));
+ 
+ 			for (page = buffer; page < (buffer + nbytes); page += BLCKSZ, blkno++)
+ 			{
+ 				/* Update LSN only if the page looks valid */
+ 				if (!PageIsNew(page) && PageIsVerified(page, blkno))
+ 				{
+ 					PageSetLSN(page, recptr);
+ 					PageSetChecksumInplace(page, blkno);
+ 				}
+ 			}
+ 		}
+ 
  		errno = 0;
  		if ((int) write(dstfd, buffer, nbytes) != nbytes)
  		{
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 02b5fee..854ae4a 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
***************
*** 16,21 ****
--- 16,22 ----
  
  #include <unistd.h>
  
+ #include "access/xlogdefs.h"
  #include "catalog/catalog.h"
  #include "common/relpath.h"
  #include "storage/copydir.h"
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 333,339 ****
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath);
  		}
  
  		FreeDir(dbspace_dir);
--- 334,340 ----
  
  			/* OK, we're ready to perform the actual copy. */
  			elog(DEBUG2, "copying %s to %s", srcpath, dstpath);
! 			copy_file(srcpath, dstpath, InvalidXLogRecPtr);
  		}
  
  		FreeDir(dbspace_dir);
diff --git a/src/include/storage/copydir.h b/src/include/storage/copydir.h
index 2635a7e..463141d 100644
*** a/src/include/storage/copydir.h
--- b/src/include/storage/copydir.h
***************
*** 13,19 ****
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! extern void copydir(char *fromdir, char *todir, bool recurse);
! extern void copy_file(char *fromfile, char *tofile);
  
  #endif   /* COPYDIR_H */
--- 13,21 ----
  #ifndef COPYDIR_H
  #define COPYDIR_H
  
! #include "access/xlogdefs.h"
! 
! extern void copydir(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr);
! extern void copy_file(char *fromfile, char *tofile, XLogRecPtr recptr);
  
  #endif   /* COPYDIR_H */
-- 
2.2.2
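
The per-page loop that this patch adds to copy_file() can be modelled on a plain byte buffer. The sketch below is a deliberately simplified model under stated assumptions: the LSN is stored as two 32-bit halves (high, then low) at the start of each page, an all-zero page stands in for the PageIsNew() test, and page verification and checksum recalculation are left out entirely. The names sketch_stamp_lsn(), sketch_page_lsn() and SKETCH_BLCKSZ are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192		/* assumed page size for this sketch */

/* An all-zero page is treated as "new" and must be left untouched */
static int
sketch_page_is_blank(const unsigned char *page)
{
	size_t		i;

	for (i = 0; i < SKETCH_BLCKSZ; i++)
		if (page[i] != 0)
			return 0;
	return 1;
}

/*
 * Stamp 'lsn' into every non-blank page of the buffer.  Returns the number
 * of pages stamped, or -1 if the buffer is not a whole number of pages
 * (the patch raises an error in that case).
 */
static long
sketch_stamp_lsn(unsigned char *buf, size_t nbytes, uint64_t lsn)
{
	uint32_t	halves[2];
	long		stamped = 0;
	size_t		off;

	if (nbytes % SKETCH_BLCKSZ != 0)
		return -1;

	halves[0] = (uint32_t) (lsn >> 32);	/* high 32 bits */
	halves[1] = (uint32_t) lsn;			/* low 32 bits */

	for (off = 0; off < nbytes; off += SKETCH_BLCKSZ)
	{
		if (sketch_page_is_blank(buf + off))
			continue;
		memcpy(buf + off, halves, sizeof(halves));
		stamped++;
	}
	return stamped;
}

/* Read back the LSN stored at the start of a page */
static uint64_t
sketch_page_lsn(const unsigned char *page)
{
	uint32_t	halves[2];

	memcpy(halves, page, sizeof(halves));
	return ((uint64_t) halves[0] << 32) | halves[1];
}
```

Stamping every copied page with the LSN of the XLOG_DBASE_CREATE record is what lets a later incremental backup notice the copied files, since the backup only compares page LSNs against its start point.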

0003-File-based-incremental-backup-v9.patch

From 5fc2a495199f9a9a5f1000bd44ad63669a032275 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciarini@2ndQuadrant.it>
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH 3/3] File-based incremental backup v9

Add backup profiles and --incremental to pg_basebackup
---
 doc/src/sgml/protocol.sgml             |  86 ++++++++-
 doc/src/sgml/ref/pg_basebackup.sgml    |  31 +++-
 src/backend/access/transam/xlog.c      |  18 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 319 +++++++++++++++++++++++++++++++--
 src/backend/replication/repl_gram.y    |   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 191 ++++++++++++++++++--
 src/include/access/xlog.h              |   3 +-
 src/include/replication/basebackup.h   |   5 +
 10 files changed, 623 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..fc24648 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's max LSN in case
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is a <literal>'t'</literal> if
+          the file is included in the backup, <literal>'f'</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the timestamp of the last file
+          modification
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    In order to support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as the last step of every backup. This
+    file will be transparently used by <application>pg_basebackup</application>
+    when invoked with the option <option>--incremental</option> to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Also rename the backup profile, if present. This marks the data
+ 		 * directory as not usable as a base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..107d70c 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,80 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 104,115 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 149,159 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 229,236 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 247,254 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 290,301 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 304,310 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 313,322 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 531,540 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 		    /*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 566,574 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 578,586 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 609,615 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 680,697 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 911,919 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 924,930 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 957,963 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 969,981 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1183,1243 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 				int			oidchars;
! 				ForkNumber	forknum;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 *
! 				 * Only the main fork is checked, because it is the only one
! 				 * where page LSNs are always updated
! 				 */
! 				is_relfile = ( has_relfiles
! 					&& parse_filename_for_nontemp_relation(de->d_name,
! 														   &oidchars,
! 														   &forknum)
! 					&& forknum == MAIN_FORKNUM);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											 incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1438,1626 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search a relnode file for a page whose LSN is at or beyond the threshold.
+  * If all the blocks in the file are older than the threshold the file can
+  * be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 *  If a page has an LSN at or beyond the threshold, stop scanning
+ 		 *  and set the filemaxlsn value to 0, as it is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 
+ 	/*
+ 	 * At this point, if *filemaxlsn contains InvalidXLogRecPtr
+ 	 * the file contains something that doesn't update page LSNs (e.g. FSM)
+ 	 */
+ 	if (*filemaxlsn == InvalidXLogRecPtr)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + included (1) + mtime (10) + size (19)
+ 	 * + path (MAXPGPATH) + separators/newline (6) + trailing \0 = 64 <= 65
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[17];
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is sent as a CopyOutResponse containing
+  * a tar stream with only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the content of the backup profile header section.
+ 	 * Since it is the concatenation of two separators and the backup label, it
+ 	 * should be shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..c03e7e0 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *profile_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must be not empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* See if label file is present */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	Assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks unless using stdout
! 			 * and not writing the backup profile (res is NULL).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progresses are not displayed because the
+  * content it is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	Assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If format is tar and we are writing on standard output
+ 	 * append the backup profile to the stream, otherwise put it
+ 	 * in the destination directory
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.2.2
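
For readers following the patch above: read_backup_profile_header() takes the starting LSN and the timeline from the "START WAL LOCATION" line of the backup profile. Below is a minimal standalone sketch of that parsing step, assuming the line format shown in the backup label example. The function name parse_start_wal_location and its return convention are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse a line such as
 *   "START WAL LOCATION: 0/52000028 (file 000000010000000000000052)"
 * The LSN is the pair of hex numbers around the slash; the timeline is
 * taken from the first 8 hex digits of the WAL file name.
 * Returns 0 on success, -1 on malformed input.
 * (Hypothetical helper, for illustration only.)
 */
int
parse_start_wal_location(const char *line, uint64_t *lsn, uint32_t *timeline)
{
	unsigned int hi, lo, tli;
	char		rest[17];		/* remaining 16 hex digits of the file name */
	char		closing;

	assert(line != NULL && lsn != NULL && timeline != NULL);

	if (sscanf(line, "START WAL LOCATION: %X/%X (file %8X%16[0-9A-F]%c",
			   &hi, &lo, &tli, rest, &closing) != 5 || closing != ')')
		return -1;

	/* Combine the two 32-bit halves into one 64-bit position */
	*lsn = ((uint64_t) hi << 32) | lo;
	*timeline = tli;
	return 0;
}
```

Unlike the patch, the sketch checks the closing parenthesis rather than the trailing newline, so it can be used on a line that has already been stripped.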

#21

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Marco Nenciarini (#20)

Re: File based Incremental backup v9

On Sat, Jan 31, 2015 at 6:47 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

> On 31/01/15 17:22, Erik Rijkers wrote:
>
>> On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
>>
>>> 0001-public-parse_filename_for_nontemp_relation.patch
>>> 0002-copydir-LSN-v2.patch
>>> 0003-File-based-incremental-backup-v8.patch
>>
>> Hi,
>>
>> It looks like it only compiles with assert enabled.
>
> It is due to a typo (assert instead of Assert). You can find the updated
> patch attached to this message.

I would sure like it if you would avoid changing the subject line
every time you post a new version of this patch. It breaks the
threading for me.

It seems to have also broken it for the CommitFest app, which thinks
v3 is the last version. I was not able to attach the new version.
When I clicked on "attach thread" without having logged in, it took me
to a bad URL. When I clicked on it after having logged in, it
purported to work, but AFAICS, it didn't actually do anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22

Magnus Hagander

magnus@hagander.net

almost 11 years ago

In reply to: Robert Haas (#21)

Re: File based Incremental backup v9

On Mon, Feb 2, 2015 at 10:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> On Sat, Jan 31, 2015 at 6:47 PM, Marco Nenciarini
> <marco.nenciarini@2ndquadrant.it> wrote:
>
>> On 31/01/15 17:22, Erik Rijkers wrote:
>>
>>> On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
>>>
>>>> 0001-public-parse_filename_for_nontemp_relation.patch
>>>> 0002-copydir-LSN-v2.patch
>>>> 0003-File-based-incremental-backup-v8.patch
>>>
>>> Hi,
>>>
>>> It looks like it only compiles with assert enabled.
>>
>> It is due to a typo (assert instead of Assert). You can find the updated
>> patch attached to this message.
>
> I would sure like it if you would avoid changing the subject line
> every time you post a new version of this patch. It breaks the
> threading for me.

+1 - it does break gmail.

> It seems to have also broken it for the CommitFest app, which thinks
> v3 is the last version. I was not able to attach the new version.

The CF app has detected that it's the same thread, because of the headers
(gmail is the buggy one here - the headers of the email are perfectly
correct).

It does not, however, pick up and show the change of subject there (but you
can see it if you click the link for the latest version into the archives -
the link under "latest" or "latest attachment" both go to the v9 patch).

> When I clicked on "attach thread" without having logged in, it took me
> to a bad URL. When I clicked on it after having logged in, it

Clearly a bug.

> purported to work, but AFAICS, it didn't actually do anything.

That's because the thread is already there, and you're adding it again. Of
course, it wouldn't hurt if it actually told you that :)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#23

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Magnus Hagander (#22)

Re: File based Incremental backup v9

On 02/02/15 22:28, Magnus Hagander wrote:

> On Mon, Feb 2, 2015 at 10:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Sat, Jan 31, 2015 at 6:47 PM, Marco Nenciarini
>> <marco.nenciarini@2ndquadrant.it> wrote:
>>
>>> On 31/01/15 17:22, Erik Rijkers wrote:
>>>
>>>> On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
>>>>
>>>>> 0001-public-parse_filename_for_nontemp_relation.patch
>>>>> 0002-copydir-LSN-v2.patch
>>>>> 0003-File-based-incremental-backup-v8.patch
>>>>
>>>> Hi,
>>>>
>>>> It looks like it only compiles with assert enabled.
>>>
>>> It is due to a typo (assert instead of Assert). You can find the updated
>>> patch attached to this message.
>>
>> I would sure like it if you would avoid changing the subject line
>> every time you post a new version of this patch. It breaks the
>> threading for me.
>
> +1 - it does break gmail.

Ok, sorry for that.

>> It seems to have also broken it for the CommitFest app, which thinks
>> v3 is the last version. I was not able to attach the new version.
>
> The CF app has detected that it's the same thread, because of the
> headers (gmail is the buggy one here - the headers of the email are
> perfectly correct).
>
> It does not, however, pick up and show the change of subject there (but
> you can see it if you click the link for the latest version into the
> archives - the link under "latest" or "latest attachment" both go to the
> v9 patch).
>
>> When I clicked on "attach thread" without having logged in, it took me
>> to a bad URL. When I clicked on it after having logged in, it
>
> Clearly a bug.
>
>> purported to work, but AFAICS, it didn't actually do anything.
>
> That's because the thread is already there, and you're adding it again.
> Of course, it wouldn't hurt if it actually told you that :)

I'm also confused by the "(Patch: No)" part at the end of every line
if you expand the last attachment line.

Every message shown here contains one or more attached patches.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

#24

Magnus Hagander

magnus@hagander.net

almost 11 years ago

In reply to: Magnus Hagander (#22)

Re: File based Incremental backup v9

On Mon, Feb 2, 2015 at 10:28 PM, Magnus Hagander <magnus@hagander.net>
wrote:

> On Mon, Feb 2, 2015 at 10:06 PM, Robert Haas <robertmhaas@gmail.com>
> wrote:
>
>> When I clicked on "attach thread" without having logged in, it took me
>> to a bad URL. When I clicked on it after having logged in, it
>
> Clearly a bug.

The bug has now been fixed.

>> purported to work, but AFAICS, it didn't actually do anything.
>
> That's because the thread is already there, and you're adding it again. Of
> course, it wouldn't hurt if it actually told you that :)

A message telling you what happened has been added.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#25

Francesco Canovai

francesco.canovai@2ndquadrant.it

almost 11 years ago

In reply to: Marco Nenciarini (#20)

Re: File based Incremental backup v9

Hi Marco,

On Sunday 01 February 2015 00:47:24 Marco Nenciarini wrote:

> You can find the updated patch attached to this message.

I've been testing the v9 patch with checksums enabled and I end up with a lot
of warnings like these:

WARNING: page verification failed, calculated checksum 47340 but expected 47342
WARNING: page verification failed, calculated checksum 16649 but expected 16647
WARNING: page verification failed, calculated checksum 13567 but expected 13565
WARNING: page verification failed, calculated checksum 14110 but expected 14108
WARNING: page verification failed, calculated checksum 40990 but expected 40988
WARNING: page verification failed, calculated checksum 46242 but expected 46244

I can reproduce the problem with the following script:

WORKDIR=/home/fcanovai/tmp
psql -c "CREATE DATABASE pgbench"
pgbench -i -s 100 --foreign-keys pgbench
mkdir $WORKDIR/tbsp
psql -c "CREATE TABLESPACE tbsp LOCATION '$WORKDIR/tbsp'"
psql -c "ALTER DATABASE pgbench SET TABLESPACE tbsp"

Regards,
Francesco

--
Francesco Canovai - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
francesco.canovai@2ndQuadrant.it | www.2ndQuadrant.it

#26

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Robert Haas (#16)

3 attachment(s)

Re: File based Incremental backup v8

Hi,

I've attached an updated version of the patch. This fixes the issue on
checksum calculation for segments after the first one.

To solve it I've added an optional uint32 *segno argument to
parse_filename_for_nontemp_relation, so I can know the segment number
and calculate the block number correctly.
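
To illustrate why the segment number matters: page checksums in PostgreSQL mix the block number into the computed value, so a page read from segment file N has to be verified with its block number counted from the start of the relation, not from the start of that segment file. The following is a minimal sketch, assuming the default build settings (8 kB blocks, 1 GB segments, hence 131072 blocks per segment). The helper names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: default build, 1 GB segments of 8 kB blocks */
#define BLOCKS_PER_SEGMENT 131072u

/*
 * Extract the segment number from a relation file name such as
 * "16384", "16384.2" or "16384_fsm.1". A name without a ".N" suffix
 * is the first segment (0). Hypothetical helper.
 */
uint32_t
segment_number_from_name(const char *name)
{
	const char *dot;

	assert(name != NULL);
	dot = strrchr(name, '.');
	return dot ? (uint32_t) strtoul(dot + 1, NULL, 10) : 0;
}

/*
 * Convert a block position inside one segment file into the block
 * number relative to the whole relation, which is the value the
 * checksum calculation expects.
 */
uint32_t
absolute_block_number(uint32_t segno, uint32_t block_in_segment)
{
	assert(block_in_segment < BLOCKS_PER_SEGMENT);
	return segno * BLOCKS_PER_SEGMENT + block_in_segment;
}
```

With these assumptions, block 5 of the file "16384.2" is block 262149 of the relation. Using 5 instead would produce a mismatch like the warnings reported for segments after the first one.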

On 29/01/15 18:57, Robert Haas wrote:

> On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini
> <marco.nenciarini@2ndquadrant.it> wrote:
>
>> The current implementation of the copydir function is incompatible with
>> LSN-based incremental backups. The problem is that new files are created,
>> but their blocks still carry the old LSN, so they will not be backed
>> up because they look old enough.
>
> I think this is trying to pollute what's supposed to be a pure
> fs-level operation ("copy a directory") into something that is aware
> of specific details like the PostgreSQL page format. I really think
> that nothing in storage/file should know about the page format. If we
> need a function that copies a file while replacing the LSNs, I think
> it should be a new function living somewhere else.

I've named it copydir_set_lsn and placed it as a static function in
dbcommands.c. This leaves the copydir and copy_file functions in copydir.c
untouched. The copydir function in copydir.c is now unused, while the copy_file
function is still used during unlogged tables reinit.
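
As a rough illustration of the two operations discussed here (deciding whether a file is "new enough" to be sent, and stamping copied pages with a fresh LSN), here is a minimal sketch. It assumes 8 kB pages whose first 8 bytes hold the LSN as two 32-bit words (high, then low) in native byte order. All names are hypothetical; this is not the copydir_set_lsn code from the patch, and it does not touch page checksums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192		/* assumption: default block size */

/* Read the LSN stored at the start of a page (assumed layout). */
uint64_t
page_get_lsn(const unsigned char *page)
{
	uint32_t	hi, lo;

	memcpy(&hi, page, sizeof(hi));
	memcpy(&lo, page + sizeof(hi), sizeof(lo));
	return ((uint64_t) hi << 32) | lo;
}

/* Overwrite the LSN stored at the start of a page. */
void
page_set_lsn(unsigned char *page, uint64_t lsn)
{
	uint32_t	hi = (uint32_t) (lsn >> 32);
	uint32_t	lo = (uint32_t) lsn;

	memcpy(page, &hi, sizeof(hi));
	memcpy(page + sizeof(hi), &lo, sizeof(lo));
}

/*
 * Return 1 if at least one page has an LSN higher than or equal to
 * start_lsn, i.e. the file would be sent in an incremental backup.
 */
int
file_needs_backup(const unsigned char *pages, size_t npages, uint64_t start_lsn)
{
	size_t		i;

	assert(pages != NULL || npages == 0);
	for (i = 0; i < npages; i++)
		if (page_get_lsn(pages + i * SKETCH_BLCKSZ) >= start_lsn)
			return 1;
	return 0;
}

/* Stamp every page of a copied buffer with the given LSN. */
void
stamp_pages(unsigned char *pages, size_t npages, uint64_t lsn)
{
	size_t		i;

	for (i = 0; i < npages; i++)
		page_set_lsn(pages + i * SKETCH_BLCKSZ, lsn);
}
```

A file copied without stamping keeps its old page LSNs, so file_needs_backup() returns 0 for it even though the file itself is new. That is the situation the copydir-LSN patch tries to avoid.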

> A bigger problem is that you are proposing to stamp those files with
> LSNs that are, for lack of a better word, fake. I would expect that
> this would completely break if checksums are enabled. Also, unlogged
> relations typically have an LSN of 0; this would change that in some
> cases, and I don't know whether that's OK.

I've investigated a bit and I have not been able to find any problem here.

> The issues here are similar to those in
> /messages/by-id/20150120152819.GC24381@alap3.anarazel.de
> - basically, I think we need to make CREATE DATABASE and ALTER
> DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
> never going to work right. If we're not going to allow that, we need
> to disallow hot backups while those operations are in progress.

As already said, the copydir-LSN patch should be treated as a temporary
measure until proper WAL logging of CREATE DATABASE and ALTER DATABASE SET
TABLESPACE is implemented. At that time we could probably get rid
of the whole copydir.[ch] file, moving the copy_file function inside reinit.c.
Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

Attachments:

0001-public-parse_filename_for_nontemp_relation.patch (text/plain; charset=UTF-8)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*************** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  									  int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  								   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 									int *oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 ----
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
- 									ForkNumber *fork)
- {
- 	int			pos;
- 
- 	/* Look for a non-empty string of digits (that isn't too long). */
- 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
- 		;
- 	if (pos == 0 || pos > OIDCHARS)
- 		return false;
- 	*oidchars = pos;
- 
- 	/* Check for a fork name. */
- 	if (name[pos] != '_')
- 		*fork = MAIN_FORKNUM;
- 	else
- 	{
- 		int			forkchar;
- 
- 		forkchar = forkname_chars(&name[pos + 1], fork);
- 		if (forkchar <= 0)
- 			return false;
- 		pos += forkchar + 1;
- 	}
- 
- 	/* Check for a segment number. */
- 	if (name[pos] == '.')
- 	{
- 		int			segchar;
- 
- 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
- 			;
- 		if (segchar <= 1)
- 			return false;
- 		pos += segchar;
- 	}
- 
- 	/* Now we should be at the end. */
- 	if (name[pos] != '\0')
- 		return false;
- 	return true;
- }
--- 386,388 ----
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 ****
--- 206,264 ----
  	}
  	return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to the a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+ 									ForkNumber *fork)
+ {
+ 	int			pos;
+ 
+ 	/* Look for a non-empty string of digits (that isn't too long). */
+ 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+ 		;
+ 	if (pos == 0 || pos > OIDCHARS)
+ 		return false;
+ 	*oidchars = pos;
+ 
+ 	/* Check for a fork name. */
+ 	if (name[pos] != '_')
+ 		*fork = MAIN_FORKNUM;
+ 	else
+ 	{
+ 		int			forkchar;
+ 
+ 		forkchar = forkname_chars(&name[pos + 1], fork);
+ 		if (forkchar <= 0)
+ 			return false;
+ 		pos += forkchar + 1;
+ 	}
+ 
+ 	/* Check for a segment number. */
+ 	if (name[pos] == '.')
+ 	{
+ 		int			segchar;
+ 
+ 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
+ 			;
+ 		if (segchar <= 1)
+ 			return false;
+ 		pos += segchar;
+ 	}
+ 
+ 	/* Now we should be at the end. */
+ 	if (name[pos] != '\0')
+ 		return false;
+ 	return true;
+ }
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index a263779..9736a78 100644
*** a/src/include/common/relpath.h
--- b/src/include/common/relpath.h
*************** extern char *GetDatabasePath(Oid dbNode,
*** 52,57 ****
--- 52,59 ----
  
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
+ extern bool parse_filename_for_nontemp_relation(const char *name,
+ 								int *oidchars, ForkNumber *fork);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
-- 
2.3.0

0002-file-based-incremental-backup-v9.patch (text/plain; charset=UTF-8)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3a753a0..a1db67c 100644
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** The commands accepted in walsender mode 
*** 1882,1888 ****
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
--- 1882,1888 ----
    </varlistentry>
  
    <varlistentry>
!     <term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
       <indexterm><primary>BASE_BACKUP</primary></indexterm>
      </term>
      <listitem>
*************** The commands accepted in walsender mode 
*** 1905,1910 ****
--- 1905,1928 ----
         </varlistentry>
  
         <varlistentry>
+         <term><literal>INCREMENTAL</literal> <replaceable>'start_lsn'</replaceable></term>
+         <listitem>
+          <para>
+           Requests a file-level incremental backup of all files changed after
+           <replaceable>start_lsn</replaceable>. When operating with
+           <literal>INCREMENTAL</literal>, the content of every block-organised
+           file will be analyzed and the file will be sent if at least one
+           block has a LSN higher than or equal to the provided
+           <replaceable>start_lsn</replaceable>.
+          </para>
+          <para>
+           The <filename>backup_profile</filename> will contain information on
+           every file that has been analyzed, even those that have not been sent.
+          </para>
+         </listitem>
+        </varlistentry>
+ 
+        <varlistentry>
          <term><literal>PROGRESS</></term>
          <listitem>
           <para>
*************** The commands accepted in walsender mode 
*** 2022,2028 ****
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, a final ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
--- 2040,2046 ----
        <quote>ustar interchange format</> specified in the POSIX 1003.1-2008
        standard) dump of the tablespace contents, except that the two trailing
        blocks of zeroes specified in the standard are omitted.
!       After the tar data is complete, an ordinary result set will be sent,
        containing the WAL end position of the backup, in the same format as
        the start position.
       </para>
*************** The commands accepted in walsender mode 
*** 2073,2082 ****
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, a final regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
      </listitem>
    </varlistentry>
  </variablelist>
--- 2091,2162 ----
        the server supports it.
       </para>
       <para>
!       Once all tablespaces have been sent, another regular result set will
        be sent. This result set contains the end position of the
        backup, given in XLogRecPtr format as a single column in a single row.
       </para>
+      <para>
+       Finally, a last CopyResponse will be sent, containing only the
+       <filename>backup_profile</filename> file, in tar format.
+      </para>
+      <para>
+       The <filename>backup_profile</filename> file will have the following
+       format:
+ <programlisting>
+ POSTGRESQL BACKUP PROFILE 1
+ &lt;backup label content&gt;
+ FILE LIST
+ &lt;file list&gt;
+ </programlisting>
+       where <replaceable>&lt;backup label content&gt;</replaceable> is a
+       verbatim copy of the content of <filename>backup_label</filename> file
+       and the <replaceable>&lt;file list&gt;</replaceable> section is made up
+       of one line per file examined by the backup, having the following format
+       (standard COPY TEXT file, tab separated):
+ <programlisting>
+ tablespace maxlsn included mtime size relpath
+ </programlisting>
+      </para>
+      <para>
+       The meaning of the fields is the following:
+       <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+         <para>
+          <replaceable>tablespace</replaceable> is the OID of the tablespace
+          (or <literal>\N</literal> for files in PGDATA)
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>maxlsn</replaceable> is the file's max LSN in case
+          the file has been skipped, <literal>\N</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>included</replaceable> is a <literal>'t'</literal> if
+          the file is included in the backup, <literal>'f'</literal> otherwise
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>mtime</replaceable> is the timestamp of the last file
+          modification
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>size</replaceable> is the number of bytes of the file
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <replaceable>relpath</replaceable> is the path of the file relative
+          to the tablespace root (PGDATA or the tablespace)
+         </para>
+        </listitem>
+       </itemizedlist>
+      </para>
      </listitem>
    </varlistentry>
  </variablelist>
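
[Editor's sketch, not part of the patch] The backup_profile layout documented above (header line, verbatim backup label, "FILE LIST" separator, then one tab-separated line per file) can be read back with a few lines of Python. This is a minimal sketch under stated assumptions: the names parse_backup_profile and ProfileEntry are hypothetical, and COPY TEXT backslash escapes in relpath are not decoded (the server code in this patch writes the path unescaped).

```python
from typing import List, NamedTuple, Optional, Tuple

HEADER = "POSTGRESQL BACKUP PROFILE 1"   # first line, per the documented format
SEPARATOR = "FILE LIST"                  # separates the label from the file list
NULL = "\\N"                             # COPY TEXT marker for a missing value


class ProfileEntry(NamedTuple):          # hypothetical helper type
    tablespace: Optional[str]            # tablespace OID, None for PGDATA
    maxlsn: Optional[str]                # set only for skipped files
    included: bool                       # True if the file was sent
    mtime: int
    size: int
    relpath: str


def parse_backup_profile(text: str) -> Tuple[List[str], List[ProfileEntry]]:
    """Return (backup label lines, file entries) from a backup_profile."""
    lines = text.split("\n")
    if not lines or lines[0] != HEADER:
        raise ValueError("not a backup profile")
    if SEPARATOR not in lines:
        raise ValueError("missing FILE LIST section")
    sep = lines.index(SEPARATOR)
    label = lines[1:sep]
    entries = []
    for line in lines[sep + 1:]:
        if not line:
            continue                     # skip the trailing empty line
        # relpath is the last field, so limit the split to 5 tabs
        tablespace, maxlsn, included, mtime, size, relpath = line.split("\t", 5)
        entries.append(ProfileEntry(
            None if tablespace == NULL else tablespace,
            None if maxlsn == NULL else maxlsn,
            included == "t",
            int(mtime),
            int(size),
            relpath,
        ))
    return label, entries
```

Entries with included == False are the files an incremental backup skipped, so a restore tool would have to take them from the reference backup.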
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 642fccf..a13b188 100644
*** a/doc/src/sgml/ref/pg_basebackup.sgml
--- b/doc/src/sgml/ref/pg_basebackup.sgml
*************** PostgreSQL documentation
*** 158,163 ****
--- 158,165 ----
              tablespaces, the main data directory will be placed in the
              target directory, but all other tablespaces will be placed
              in the same absolute path as they have on the server.
+             The <filename>backup_profile</filename> file will be placed in
+             this directory.
             </para>
             <para>
              This is the default format.
*************** PostgreSQL documentation
*** 174,186 ****
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
             </para>
             </listitem>
           </varlistentry>
--- 176,192 ----
              data directory will be written to a file named
              <filename>base.tar</filename>, and all other tablespaces will
              be named after the tablespace OID.
!             The <filename>backup_profile</filename> file will be placed in
!             this directory.
!            </para>
             <para>
              If the value <literal>-</literal> (dash) is specified as
              target directory, the tar contents will be written to
              standard output, suitable for piping to for example
              <productname>gzip</productname>. This is only possible if
              the cluster has no additional tablespaces.
+             In this case, the <filename>backup_profile</filename> file 
+             will be sent to standard output as part of the tar stream.
             </para>
             </listitem>
           </varlistentry>
*************** PostgreSQL documentation
*** 189,194 ****
--- 195,214 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>-I <replaceable class="parameter">directory</replaceable></option></term>
+       <term><option>--incremental=<replaceable class="parameter">directory</replaceable></option></term>
+       <listitem>
+         <para>
+         Directory containing the backup to use as a start point for a file-level
+         incremental backup. <application>pg_basebackup</application> will read
+         the <filename>backup_profile</filename> file and then create an
+         incremental backup containing only the files which have been modified
+         after the start point.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>-r <replaceable class="parameter">rate</replaceable></option></term>
        <term><option>--max-rate=<replaceable class="parameter">rate</replaceable></option></term>
        <listitem>
*************** PostgreSQL documentation
*** 588,593 ****
--- 608,622 ----
    </para>
  
    <para>
+    In order to support file-level incremental backups, a
+    <filename>backup_profile</filename> file
+    is generated in the target directory as the last step of every backup. This
+    file will be transparently used by <application>pg_basebackup</application>
+    when invoked with the option <option>--incremental</option> to start
+    a new file-level incremental backup.
+   </para>
+ 
+   <para>
     <application>pg_basebackup</application> works with servers of the same
     or an older major version, down to 9.1. However, WAL streaming mode (-X
     stream) only works with server version 9.3 and later.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..a642a04 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 47,52 ****
--- 47,53 ----
  #include "replication/snapbuild.h"
  #include "replication/walreceiver.h"
  #include "replication/walsender.h"
+ #include "replication/basebackup.h"
  #include "storage/barrier.h"
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
*************** StartupXLOG(void)
*** 6164,6169 ****
--- 6165,6173 ----
  		 * the latest recovery restartpoint instead of going all the way back
  		 * to the backup start point.  It seems prudent though to just rename
  		 * the file out of the way rather than delete it completely.
+ 		 *
+ 		 * Also rename the backup profile, if present. This marks the data
+ 		 * directory as not usable as the base for an incremental backup.
  		 */
  		if (haveBackupLabel)
  		{
*************** StartupXLOG(void)
*** 6173,6178 ****
--- 6177,6189 ----
  						(errcode_for_file_access(),
  						 errmsg("could not rename file \"%s\" to \"%s\": %m",
  								BACKUP_LABEL_FILE, BACKUP_LABEL_OLD)));
+ 			unlink(BACKUP_PROFILE_OLD);
+ 			if (rename(BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD) != 0
+ 					&& errno != ENOENT)
+ 				ereport(FATAL,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+ 								 BACKUP_PROFILE_FILE, BACKUP_PROFILE_OLD)));
  		}
  
  		/* Check that the GUCs used to generate the WAL allow recovery */
*************** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 ****
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
--- 9260,9267 ----
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
! 				   XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
  				   char **labelfile)
  {
  	bool		exclusive = (labelfile == NULL);
*************** do_pg_start_backup(const char *backupids
*** 9468,9473 ****
--- 9480,9489 ----
  			 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  		appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 		if (incremental_startpoint > 0)
+ 			appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 							 (uint32) (incremental_startpoint >> 32),
+ 							 (uint32) incremental_startpoint);
  		appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  						 exclusive ? "pg_start_backup" : "streamed");
  		appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
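
[Editor's sketch, not part of the patch] The "INCREMENTAL FROM LOCATION" line added above prints the 64-bit LSN as two 32-bit halves in hexadecimal ("%X/%X"), the same textual form used by the other label lines. A small Python sketch of that conversion in both directions may help when scripting around the label; format_lsn and parse_lsn are hypothetical helper names, not functions from the patch.

```python
def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN as 'HIGH/LOW' in upper-case hex, like "%X/%X"."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def parse_lsn(text: str) -> int:
    """Inverse of format_lsn: turn 'HIGH/LOW' back into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def incremental_label_line(lsn: int) -> str:
    """Build the label line emitted when an incremental start point is given."""
    return "INCREMENTAL FROM LOCATION: %s\n" % format_lsn(lsn)
```

Comparing LSNs should be done on the integers returned by parse_lsn, since the textual form does not sort correctly as a string.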
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*************** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 ****
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
--- 59,65 ----
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  		   errmsg("must be superuser or replication role to run a backup")));
  
! 	startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
  	PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3058ce9..107d70c 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 30,40 ****
--- 30,42 ----
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*************** typedef struct
*** 46,56 ****
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
--- 48,62 ----
  	bool		nowait;
  	bool		includewal;
  	uint32		maxrate;
+ 	XLogRecPtr	incremental_startpoint;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly,
! 					 List *tablespaces, bool has_relfiles,
! 					 XLogRecPtr incremental_startpoint);
! static int64 sendTablespace(char *path, bool sizeonly,
! 				XLogRecPtr incremental_startpoint);
  static bool sendFile(char *readfilename, char *tarfilename,
  		 struct stat * statbuf, bool missing_ok);
  static void sendFileWithContent(const char *filename, const char *content);
*************** static void parse_basebackup_options(Lis
*** 64,69 ****
--- 70,80 ----
  static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
  static int	compareWalFileNames(const void *a, const void *b);
  static void throttle(size_t increment);
+ static bool relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 				XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn);
+ static void writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 								   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent);
+ static void sendBackupProfile(const char *labelfile);
  
  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;
*************** static int64 elapsed_min_unit;
*** 93,98 ****
--- 104,115 ----
  /* The last check of the transfer rate. */
  static int64 throttled_last;
  
+ /* Temporary file containing the backup profile */
+ static File backup_profile_fd = 0;
+ 
+ /* Tablespace being currently sent. Used in backup profile generation */
+ static char *current_tablespace = NULL;
+ 
  typedef struct
  {
  	char	   *oid;
*************** perform_base_backup(basebackup_options *
*** 132,138 ****
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
--- 149,159 ----
  
  	backup_started_in_recovery = RecoveryInProgress();
  
! 	/* Open a temporary file to hold the profile content. */
! 	backup_profile_fd = OpenTemporaryFile(false);
! 
! 	startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint,
! 								  opt->incremental_startpoint, &starttli,
  								  &labelfile);
  	/*
  	 * Once do_pg_start_backup has been called, ensure that any failure causes
*************** perform_base_backup(basebackup_options *
*** 208,214 ****
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
--- 229,236 ----
  			ti->oid = pstrdup(de->d_name);
  			ti->path = pstrdup(linkpath);
  			ti->rpath = relpath ? pstrdup(relpath) : NULL;
! 			ti->size = opt->progress ? sendTablespace(fullpath, true,
! 											opt->incremental_startpoint) : -1;
  			tablespaces = lappend(tablespaces, ti);
  #else
  
*************** perform_base_backup(basebackup_options *
*** 225,231 ****
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
--- 247,254 ----
  
  		/* Add a node for the base directory at the end */
  		ti = palloc0(sizeof(tablespaceinfo));
! 		ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, false,
! 										   opt->incremental_startpoint) : -1;
  		tablespaces = lappend(tablespaces, ti);
  
  		/* Send tablespace header */
*************** perform_base_backup(basebackup_options *
*** 267,272 ****
--- 290,301 ----
  			pq_sendint(&buf, 0, 2);		/* natts */
  			pq_endmessage(&buf);
  
+ 			/*
+ 			 * Save the current tablespace, used in writeBackupProfileLine
+ 			 * function
+ 			 */
+ 			current_tablespace = ti->oid;
+ 
  			if (ti->path == NULL)
  			{
  				struct stat statbuf;
*************** perform_base_backup(basebackup_options *
*** 275,281 ****
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--- 304,310 ----
  				sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
  
  				/* ... then the bulk of the files ... */
! 				sendDir(".", 1, false, tablespaces, false, opt->incremental_startpoint);
  
  				/* ... and pg_control after everything else. */
  				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
*************** perform_base_backup(basebackup_options *
*** 284,292 ****
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
  			}
  			else
! 				sendTablespace(ti->path, false);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
--- 313,322 ----
  							 errmsg("could not stat control file \"%s\": %m",
  									XLOG_CONTROL_FILE)));
  				sendFile(XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf, false);
+ 				writeBackupProfileLine(XLOG_CONTROL_FILE, &statbuf, false, 0, true);
  			}
  			else
! 				sendTablespace(ti->path, false, opt->incremental_startpoint);
  
  			/*
  			 * If we're including WAL, and this is the main data directory we
*************** perform_base_backup(basebackup_options *
*** 501,507 ****
  
  			FreeFile(fp);
  
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
--- 531,540 ----
  
  			FreeFile(fp);
  
! 			/* Add the WAL file to backup profile */
! 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
! 
! 			/*
  			 * Mark file as archived, otherwise files can get archived again
  			 * after promotion of a new node. This is in line with
  			 * walreceiver.c always doing a XLogArchiveForceDone() after a
*************** perform_base_backup(basebackup_options *
*** 533,538 ****
--- 566,574 ----
  
  			sendFile(pathbuf, pathbuf, &statbuf, false);
  
+ 			/* Add the WAL file to backup profile */
+ 			writeBackupProfileLine(pathbuf, &statbuf, false, 0, true);
+ 
  			/* unconditionally mark file as archived */
  			StatusFilePath(pathbuf, fname, ".done");
  			sendFileWithContent(pathbuf, "");
*************** perform_base_backup(basebackup_options *
*** 542,547 ****
--- 578,586 ----
  		pq_putemptymessage('c');
  	}
  	SendXlogRecPtrResult(endptr, endtli);
+ 
+ 	/* Send the profile file. */
+ 	sendBackupProfile(labelfile);
  }
  
  /*
*************** parse_basebackup_options(List *options, 
*** 570,575 ****
--- 609,615 ----
  	bool		o_nowait = false;
  	bool		o_wal = false;
  	bool		o_maxrate = false;
+ 	bool		o_incremental = false;
  
  	MemSet(opt, 0, sizeof(*opt));
  	foreach(lopt, options)
*************** parse_basebackup_options(List *options, 
*** 640,645 ****
--- 680,697 ----
  			opt->maxrate = (uint32) maxrate;
  			o_maxrate = true;
  		}
+ 		else if (strcmp(defel->defname, "incremental") == 0)
+ 		{
+ 			if (o_incremental)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_SYNTAX_ERROR),
+ 						 errmsg("duplicate option \"%s\"", defel->defname)));
+ 
+ 			opt->incremental_startpoint = DatumGetLSN(
+ 				DirectFunctionCall1(pg_lsn_in,
+ 									CStringGetDatum(strVal(defel->arg))));
+ 			o_incremental = true;
+ 		}
  		else
  			elog(ERROR, "option \"%s\" not recognized",
  				 defel->defname);
*************** sendFileWithContent(const char *filename
*** 859,864 ****
--- 911,919 ----
  		MemSet(buf, 0, pad);
  		pq_putmessage('d', buf, pad);
  	}
+ 
+ 	/* Write a backup profile entry for this file. */
+ 	writeBackupProfileLine(filename, &statbuf, false, 0, true);
  }
  
  /*
*************** sendFileWithContent(const char *filename
*** 869,875 ****
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
--- 924,930 ----
   * Only used to send auxiliary tablespaces, not PGDATA.
   */
  static int64
! sendTablespace(char *path, bool sizeonly, XLogRecPtr incremental_startpoint)
  {
  	int64		size;
  	char		pathbuf[MAXPGPATH];
*************** sendTablespace(char *path, bool sizeonly
*** 902,908 ****
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
  
  	return size;
  }
--- 957,963 ----
  	size = 512;					/* Size of the header just added */
  
  	/* Send all the files in the tablespace version directory */
! 	size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true, incremental_startpoint);
  
  	return size;
  }
*************** sendTablespace(char *path, bool sizeonly
*** 914,922 ****
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  {
  	DIR		   *dir;
  	struct dirent *de;
--- 969,981 ----
   *
   * Omit any directory in the tablespaces list, to avoid backing up
   * tablespaces twice when they were created inside PGDATA.
+  *
+  * If 'has_relfiles' is set, this directory will be checked to identify
+  * relnode files and compute their maxLSN.
   */
  static int64
! sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
! 		bool has_relfiles, XLogRecPtr incremental_startpoint)
  {
  	DIR		   *dir;
  	struct dirent *de;
*************** sendDir(char *path, int basepathlen, boo
*** 1124,1138 ****
  				}
  			}
  			if (!skip_this_dir)
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 				sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf,
! 								true);
  
  			if (sent || sizeonly)
  			{
--- 1183,1243 ----
  				}
  			}
  			if (!skip_this_dir)
! 			{
! 				bool	subdir_has_relfiles;
! 
! 				/*
! 				 * Within PGDATA, relnode files are contained only in the "global"
! 				 * and "base" directories.
! 				 */
! 				subdir_has_relfiles = has_relfiles
! 					|| strcmp(pathbuf, "./global") == 0
! 					|| strcmp(pathbuf, "./base") == 0;
! 
! 				size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces,
! 								subdir_has_relfiles, incremental_startpoint);
! 			}
  		}
  		else if (S_ISREG(statbuf.st_mode))
  		{
  			bool		sent = false;
  
  			if (!sizeonly)
! 			{
! 				bool		is_relfile;
! 				XLogRecPtr	filemaxlsn = 0;
! 				int			oidchars;
! 				ForkNumber	forknum;
! 
! 				/*
! 				 * If the current directory can have relnode files, check the file
! 				 * name to see if it is one of them.
! 				 *
! 				 * Only the main fork is checked, because it is the only one
! 				 * where page LSNs are always updated; other forks are always sent.
! 				 */
! 				is_relfile = ( has_relfiles
! 					&& parse_filename_for_nontemp_relation(de->d_name,
! 														   &oidchars,
! 														   &forknum)
! 					&& forknum == MAIN_FORKNUM);
! 
! 				if (!is_relfile
! 					|| incremental_startpoint == 0
! 					|| relnodeIsNewerThanLSN(pathbuf, &statbuf, &filemaxlsn,
! 											 incremental_startpoint))
! 				{
! 					sent = sendFile(pathbuf, pathbuf + basepathlen + 1,
! 									&statbuf, true);
! 					/* Write a backup profile entry for the sent file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   false, 0, sent);
! 				}
! 				else
! 					/* Write a backup profile entry for the skipped file. */
! 					writeBackupProfileLine(pathbuf + basepathlen + 1, &statbuf,
! 										   true, filemaxlsn, sent);
! 			}
  
  			if (sent || sizeonly)
  			{
*************** throttle(size_t increment)
*** 1333,1335 ****
--- 1438,1626 ----
  		/* Sleep was necessary but might have been interrupted. */
  		throttled_last = GetCurrentIntegerTimestamp();
  }
+ 
+ /*
+  * Search a relnode file for a page with an LSN greater than or equal to
+  * the threshold. If all the blocks in the file are older than the threshold,
+  * the file can be safely skipped during an incremental backup.
+  */
+ static bool
+ relnodeIsNewerThanLSN(char *filename, struct stat * statbuf,
+ 		XLogRecPtr *filemaxlsn, XLogRecPtr thresholdlsn)
+ {
+ 	FILE	   *fp;
+ 	char		buf[BLCKSZ];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	XLogRecPtr	pagelsn;
+ 
+ 	*filemaxlsn = 0;
+ 
+ 	fp = AllocateFile(filename, "rb");
+ 	if (fp == NULL)
+ 	{
+ 		if (errno == ENOENT)
+ 			return true;
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", filename)));
+ 	}
+ 
+ 	while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp)) > 0)
+ 	{
+ 		pagelsn = PageGetLSN(buf);
+ 
+ 		/* Keep the max LSN found */
+ 		if (*filemaxlsn < pagelsn)
+ 			*filemaxlsn = pagelsn;
+ 
+ 		/*
+ 		 *  If a page has an LSN equal to or newer than the threshold, stop
+ 		 *  scanning and set filemaxlsn to 0, as the value is only partial.
+ 		 */
+ 		if (thresholdlsn <= pagelsn)
+ 		{
+ 			*filemaxlsn = 0;
+ 			FreeFile(fp);
+ 			return true;
+ 		}
+ 		len += cnt;		/* track bytes scanned so the size checks work */
+ 		if (len >= statbuf->st_size)
+ 		{
+ 			/*
+ 			 * Reached end of file. The file could be longer, if it was
+ 			 * extended while we were sending it, but for a base backup we can
+ 			 * ignore such extended data. It will be restored from WAL.
+ 			 */
+ 			break;
+ 		}
+ 	}
+ 
+ 	FreeFile(fp);
+ 
+ 	/*
+ 	 * At this point, if *filemaxlsn contains InvalidXLogRecPtr
+ 	 * the file contains something that doesn't update page LSNs (e.g. FSM)
+ 	 */
+ 	if (*filemaxlsn == InvalidXLogRecPtr)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ /*
+  * Write an entry in file list section of backup profile.
+  */
+ static void
+ writeBackupProfileLine(const char *filename, struct stat * statbuf,
+ 					   bool has_maxlsn, XLogRecPtr filemaxlsn, bool sent)
+ {
+ 	/*
+ 	 * tablespace oid (10) + max LSN (17) + included (1) + mtime (10) +
+ 	 * size (19) + separators (6) + trailing \0 (1) = 64, plus path (MAXPGPATH)
+ 	 */
+ 	char	buf[MAXPGPATH + 65];
+ 	char    maxlsn[18];		/* "%X/%X" needs up to 17 chars plus \0 */
+ 	int		rowlen;
+ 
+ 	Assert(backup_profile_fd > 0);
+ 
+ 	/* Prepare maxlsn */
+ 	if (has_maxlsn)
+ 	{
+ 		snprintf(maxlsn, sizeof(maxlsn), "%X/%X",
+ 				 (uint32) (filemaxlsn >> 32), (uint32) filemaxlsn);
+ 	}
+ 	else
+ 	{
+ 		strlcpy(maxlsn, "\\N", sizeof(maxlsn));
+ 	}
+ 
+ 	rowlen = snprintf(buf, sizeof(buf), "%s\t%s\t%s\t%u\t%lld\t%s\n",
+ 					  current_tablespace ? current_tablespace : "\\N",
+ 					  maxlsn,
+ 					  sent ? "t" : "f",
+ 					  (uint32) statbuf->st_mtime,
+ 					  (long long) statbuf->st_size,
+ 					  filename);
+ 	FileWrite(backup_profile_fd, buf, rowlen);
+ }
+ 
+ /*
+  * Send the backup profile. It is sent in its own CopyOutResponse, as a tar
+  * stream containing only one file.
+  */
+ static void
+ sendBackupProfile(const char *labelfile)
+ {
+ 	StringInfoData msgbuf;
+ 	struct stat statbuf;
+ 	char		buf[TAR_SEND_SIZE];
+ 	size_t		cnt;
+ 	pgoff_t		len = 0;
+ 	size_t		pad;
+ 	char *backup_profile = FilePathName(backup_profile_fd);
+ 
+ 	/* Send CopyOutResponse message */
+ 	pq_beginmessage(&msgbuf, 'H');
+ 	pq_sendbyte(&msgbuf, 0);		/* overall format */
+ 	pq_sendint(&msgbuf, 0, 2);		/* natts */
+ 	pq_endmessage(&msgbuf);
+ 
+ 	if (lstat(backup_profile, &statbuf) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not stat backup_profile file \"%s\": %m",
+ 						backup_profile)));
+ 
+ 	/* Set the file position to the beginning. */
+ 	FileSeek(backup_profile_fd, 0, SEEK_SET);
+ 
+ 	/*
+ 	 * Fill the buffer with the backup profile header section. Since it is
+ 	 * the concatenation of two separators and the backup label, it should be
+ 	 * shorter than TAR_SEND_SIZE.
+ 	 */
+ 	cnt = snprintf(buf, sizeof(buf), "%s\n%s%s\n",
+ 				   BACKUP_PROFILE_HEADER,
+ 				   labelfile,
+ 				   BACKUP_PROFILE_SEPARATOR);
+ 
+ 	/* Add size of backup label and separators */
+ 	statbuf.st_size += cnt;
+ 
+ 	_tarWriteHeader(BACKUP_PROFILE_FILE, NULL, &statbuf);
+ 
+ 	/* Send backup profile header */
+ 	if (pq_putmessage('d', buf, cnt))
+ 		ereport(ERROR,
+ 				(errmsg("base backup could not send data, aborting backup")));
+ 
+ 	len += cnt;
+ 	throttle(cnt);
+ 
+ 	while ((cnt = FileRead(backup_profile_fd, buf, sizeof(buf))) > 0)
+ 	{
+ 		/* Send the chunk as a CopyData message */
+ 		if (pq_putmessage('d', buf, cnt))
+ 			ereport(ERROR,
+ 					(errmsg("base backup could not send data, aborting backup")));
+ 
+ 		len += cnt;
+ 		throttle(cnt);
+ 
+ 	}
+ 
+ 	/*
+ 	 * Pad to 512 byte boundary, per tar format requirements. (This small
+ 	 * piece of data is probably not worth throttling.)
+ 	 */
+ 	pad = ((len + 511) & ~511) - len;
+ 	if (pad > 0)
+ 	{
+ 		MemSet(buf, 0, pad);
+ 		pq_putmessage('d', buf, pad);
+ 	}
+ 
+ 	pq_putemptymessage('c');        /* CopyDone */
+ }
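
[Editor's sketch, not part of the patch] For readers who want to experiment with the skip decision outside the server, the logic of relnodeIsNewerThanLSN can be approximated in Python. This is only a sketch under loud assumptions: the block size is the default 8192 bytes, the file is read on a machine with the same byte order as the server, and the page LSN is taken from the first 8 bytes of each page (two 32-bit halves, high part first), which is how the page header stores it. The function names are hypothetical.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def page_lsn(page: bytes) -> int:
    """Extract the LSN stored at the start of a page header."""
    # Two native-endian uint32 values: high half, then low half.
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def file_is_newer_than_lsn(path: str, threshold: int):
    """Return (must_send, max_lsn).

    must_send is True as soon as one page has an LSN >= threshold, or when
    no page carries an LSN at all; max_lsn is only meaningful (not None)
    when the whole file was scanned and can be skipped.
    """
    max_lsn = 0
    with open(path, "rb") as fp:
        while True:
            page = fp.read(BLCKSZ)
            if len(page) < 8:          # end of file (or a truncated tail)
                break
            lsn = page_lsn(page)
            if lsn >= threshold:
                return True, None      # partial scan: max LSN is unknown
            max_lsn = max(max_lsn, lsn)
    if max_lsn == 0:
        return True, None              # nothing with an LSN: always send
    return False, max_lsn
```

As in the C function, the scan stops at the first page that reaches the threshold, so the reported maximum is only available for files that end up being skipped.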
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index 2a41eb1..684cf4d 100644
*** a/src/backend/replication/repl_gram.y
--- b/src/backend/replication/repl_gram.y
*************** Node *replication_parse_result;
*** 75,80 ****
--- 75,81 ----
  %token K_PHYSICAL
  %token K_LOGICAL
  %token K_SLOT
+ %token K_INCREMENTAL
  
  %type <node>	command
  %type <node>	base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
*************** base_backup_opt:
*** 168,173 ****
--- 169,179 ----
  				  $$ = makeDefElem("max_rate",
  								   (Node *)makeInteger($2));
  				}
+ 			| K_INCREMENTAL SCONST
+ 				{
+ 				  $$ = makeDefElem("incremental",
+ 								   (Node *)makeString($2));
+ 				}
  			;
  
  create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 449c127..a6d0dd8 100644
*** a/src/backend/replication/repl_scanner.l
--- b/src/backend/replication/repl_scanner.l
*************** TIMELINE_HISTORY	{ return K_TIMELINE_HIS
*** 96,101 ****
--- 96,102 ----
  PHYSICAL			{ return K_PHYSICAL; }
  LOGICAL				{ return K_LOGICAL; }
  SLOT				{ return K_SLOT; }
+ INCREMENTAL			{ return K_INCREMENTAL; }
  
  ","				{ return ','; }
  ";"				{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fbf7106..c03e7e0 100644
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** static bool writerecoveryconf = false;
*** 67,72 ****
--- 67,74 ----
  static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
  static pg_time_t last_progress_report = 0;
  static int32 maxrate = 0;		/* no limit by default */
+ static XLogRecPtr incremental_startpoint = 0;
+ static TimeLineID incremental_timeline = 0;
  
  
  /* Progress counters */
*************** static void usage(void);
*** 99,107 ****
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
--- 101,111 ----
  static void disconnect_and_exit(int code);
  static void verify_dir_is_empty_or_create(char *dirname);
  static void progress_report(int tablespacenum, const char *filename, bool force);
+ static void read_backup_profile_header(const char *profile_path);
  
  static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
! static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 									const char *dest_path);
  static void GenerateRecoveryConf(PGconn *conn);
  static void WriteRecoveryConf(void);
  static void BaseBackup(void);
*************** usage(void)
*** 232,237 ****
--- 236,243 ----
  	printf(_("\nOptions controlling the output:\n"));
  	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
  	printf(_("  -F, --format=p|t       output format (plain (default), tar)\n"));
+ 	printf(_("  -I, --incremental=DIRECTORY\n"
+ 			 "                         incremental backup from an existing backup\n"));
  	printf(_("  -r, --max-rate=RATE    maximum transfer rate to transfer data directory\n"
  			 "                         (in kB/s, or use suffix \"k\" or \"M\")\n"));
  	printf(_("  -R, --write-recovery-conf\n"
*************** parse_max_rate(char *src)
*** 717,722 ****
--- 723,794 ----
  	return (int32) result;
  }
  
+ 
+ /*
+  * Read incremental_startpoint and incremental_timeline
+  * from a backup profile.
+  */
+ static void
+ read_backup_profile_header(const char *reference_path)
+ {
+ 	char 		profile_path[MAXPGPATH];
+ 	FILE	   *pfp;
+ 	char		ch;
+ 	uint32		hi,
+ 				lo;
+ 
+ 	/* The directory must exist and must be not empty */
+ 	if (pg_check_dir(reference_path) < 3)
+ 	{
+ 		fprintf(stderr, _("%s: invalid incremental base directory \"%s\"\n"),
+ 				progname, reference_path);
+ 		exit(1);
+ 	}
+ 
+ 	/* Build the backup profile location */
+ 	join_path_components(profile_path, reference_path, BACKUP_PROFILE_FILE);
+ 
+ 	/* See if the backup profile is present */
+ 	pfp = fopen(profile_path, "r");
+ 	if (!pfp)
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ 
+ 	/* Consume the profile header */
+ 	fscanf(pfp, BACKUP_PROFILE_HEADER);
+ 	if (fscanf(pfp, "%c", &ch) != 1 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 
+ 	/*
+ 	 * Read and parse the START WAL LOCATION (this code
+ 	 * is pretty crude, but we are not expecting any variability in the file
+ 	 * format).
+ 	 */
+ 	if (fscanf(pfp, "START WAL LOCATION: %X/%X (file %08X%*16s)%c",
+ 			   &hi, &lo, &incremental_timeline, &ch) != 4 || ch != '\n')
+ 	{
+ 		fprintf(stderr, _("%s: invalid data in file \"%s\"\n"),
+ 				progname, profile_path);
+ 		exit(1);
+ 	}
+ 	incremental_startpoint = ((uint64) hi) << 32 | lo;
+ 
+ 	if (ferror(pfp) || fclose(pfp))
+ 	{
+ 		fprintf(stderr, _("%s: could not read file \"%s\": %s\n"),
+ 				progname, profile_path, strerror(errno));
+ 		exit(1);
+ 	}
+ }
+ 
+ 
  /*
   * Write a piece of tar data
   */
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 773,784 ****
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace = PQgetisnull(res, rownum, 0);
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
--- 845,866 ----
  	char	   *copybuf = NULL;
  	FILE	   *tarfile = NULL;
  	char		tarhdr[512];
! 	bool		basetablespace;
  	bool		in_tarhdr = true;
  	bool		skip_file = false;
  	size_t		tarhdrsz = 0;
  	size_t		filesz = 0;
  
+ 	/*
+ 	 * If 'res' is NULL, we are appending the backup profile to
+ 	 * the standard output tar stream.
+ 	 */
+ 	Assert(res || (strcmp(basedir, "-") == 0));
+ 	if (res)
+ 		basetablespace = PQgetisnull(res, rownum, 0);
+ 	else
+ 		basetablespace = true;
+ 
  #ifdef HAVE_LIBZ
  	gzFile		ztarfile = NULL;
  #endif
*************** ReceiveTarFile(PGconn *conn, PGresult *r
*** 939,946 ****
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/* 2 * 512 bytes empty data at end of file */
! 			WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
--- 1021,1033 ----
  					WRITE_TAR_DATA(zerobuf, padding);
  			}
  
! 			/*
! 			 * Write the end-of-file blocks unless using stdout
! 			 * and not writing the backup profile (res is NULL).
! 			 */
! 			if (!res || strcmp(basedir, "-") != 0)
! 				/* 2 * 512 bytes empty data at end of file */
! 				WRITE_TAR_DATA(zerobuf, sizeof(zerobuf));
  
  #ifdef HAVE_LIBZ
  			if (ztarfile != NULL)
*************** get_tablespace_mapping(const char *dir)
*** 1128,1136 ****
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
--- 1215,1230 ----
   * If the data is for the main data directory, it will be restored in the
   * specified directory. If it's for another tablespace, it will be restored
   * in the original or mapped directory.
+  *
+  * If 'res' is NULL, the destination directory is taken from the
+  * 'dest_path' parameter.
+  *
+  * When 'dest_path' is specified, progress is not reported because the
+  * content is not in any tablespace.
   */
  static void
! ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum,
! 						const char *dest_path)
  {
  	char		current_path[MAXPGPATH];
  	char		filename[MAXPGPATH];
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1141,1153 ****
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	basetablespace = PQgetisnull(res, rownum, 0);
! 	if (basetablespace)
! 		strlcpy(current_path, basedir, sizeof(current_path));
  	else
! 		strlcpy(current_path,
! 				get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 				sizeof(current_path));
  
  	/*
  	 * Get the COPY data
--- 1235,1262 ----
  	char	   *copybuf = NULL;
  	FILE	   *file = NULL;
  
! 	/* 'res' and 'dest_path' are mutually exclusive */
! 	Assert(!res != !dest_path);
! 
! 	/*
! 	 * If 'res' is NULL, the destination directory is taken from the
! 	 * 'dest_path' parameter.
! 	 */
! 	if (res)
! 	{
! 		basetablespace = PQgetisnull(res, rownum, 0);
! 		if (basetablespace)
! 			strlcpy(current_path, basedir, sizeof(current_path));
! 		else
! 			strlcpy(current_path,
! 					get_tablespace_mapping(PQgetvalue(res, rownum, 1)),
! 					sizeof(current_path));
! 	}
  	else
! 	{
! 		basetablespace = false;
! 		strlcpy(current_path, dest_path, sizeof(current_path));
! 	}
  
  	/*
  	 * Get the COPY data
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1355,1361 ****
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
--- 1464,1472 ----
  				disconnect_and_exit(1);
  			}
  			totaldone += r;
! 			/* report progress unless a custom destination is used */
! 			if (!dest_path)
! 				progress_report(rownum, filename, false);
  
  			current_len_left -= r;
  			if (current_len_left == 0 && current_padding == 0)
*************** ReceiveAndUnpackTarFile(PGconn *conn, PG
*** 1371,1377 ****
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
--- 1482,1490 ----
  			}
  		}						/* continuing data in existing file */
  	}							/* loop over all data blocks */
! 	/* report progress unless a custom destination is used */
! 	if (!dest_path)
! 		progress_report(rownum, filename, true);
  
  	if (file != NULL)
  	{
*************** BaseBackup(void)
*** 1587,1592 ****
--- 1700,1706 ----
  	char	   *basebkp;
  	char		escaped_label[MAXPGPATH];
  	char	   *maxrate_clause = NULL;
+ 	char	   *incremental_clause = NULL;
  	int			i;
  	char		xlogstart[64];
  	char		xlogend[64];
*************** BaseBackup(void)
*** 1648,1661 ****
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
--- 1762,1801 ----
  	if (maxrate > 0)
  		maxrate_clause = psprintf("MAX_RATE %u", maxrate);
  
+ 	if (incremental_startpoint > 0)
+ 	{
+ 		incremental_clause = psprintf("INCREMENTAL '%X/%X'",
+ 									  (uint32) (incremental_startpoint >> 32),
+ 									  (uint32) incremental_startpoint);
+ 
+ 		/*
+ 		 * Sanity check: if from a different timeline abort the backup.
+ 		 */
+ 		if (latesttli != incremental_timeline)
+ 		{
+ 			fprintf(stderr,
+ 					_("%s: incremental backup from a different timeline "
+ 					  "is not supported: base=%u current=%u\n"),
+ 					progname, incremental_timeline, latesttli);
+ 			disconnect_and_exit(1);
+ 		}
+ 
+ 		if (verbose)
+ 			fprintf(stderr, _("incremental from point: %X/%X on timeline %u\n"),
+ 					(uint32) (incremental_startpoint >> 32),
+ 					(uint32) incremental_startpoint,
+ 					incremental_timeline);
+ 	}
+ 
  	basebkp =
! 		psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
  				 escaped_label,
  				 showprogress ? "PROGRESS" : "",
  				 includewal && !streamwal ? "WAL" : "",
  				 fastcheckpoint ? "FAST" : "",
  				 includewal ? "NOWAIT" : "",
! 				 maxrate_clause ? maxrate_clause : "",
! 				 incremental_clause ? incremental_clause : "");
  
  	if (PQsendQuery(conn, basebkp) == 0)
  	{
*************** BaseBackup(void)
*** 1769,1775 ****
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
--- 1909,1915 ----
  		if (format == 't')
  			ReceiveTarFile(conn, res, i);
  		else
! 			ReceiveAndUnpackTarFile(conn, res, i, NULL);
  	}							/* Loop over all tablespaces */
  
  	if (showprogress)
*************** BaseBackup(void)
*** 1803,1808 ****
--- 1943,1960 ----
  		fprintf(stderr, "transaction log end point: %s\n", xlogend);
  	PQclear(res);
  
+ 	/*
+ 	 * Get the backup profile
+ 	 *
+ 	 * If format is tar and we are writing on standard output
+ 	 * append the backup profile to the stream, otherwise put it
+ 	 * in the destination directory
+ 	 */
+ 	if (format == 't' && (strcmp(basedir, "-") == 0))
+ 		ReceiveTarFile(conn, NULL, -1);
+ 	else
+ 		ReceiveAndUnpackTarFile(conn, NULL, -1, basedir);
+ 
  	res = PQgetResult(conn);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
*************** main(int argc, char **argv)
*** 1942,1947 ****
--- 2094,2100 ----
  		{"username", required_argument, NULL, 'U'},
  		{"no-password", no_argument, NULL, 'w'},
  		{"password", no_argument, NULL, 'W'},
+ 		{"incremental", required_argument, NULL, 'I'},
  		{"status-interval", required_argument, NULL, 's'},
  		{"verbose", no_argument, NULL, 'v'},
  		{"progress", no_argument, NULL, 'P'},
*************** main(int argc, char **argv)
*** 1949,1955 ****
  		{NULL, 0, NULL, 0}
  	};
  	int			c;
- 
  	int			option_index;
  
  	progname = get_progname(argv[0]);
--- 2102,2107 ----
*************** main(int argc, char **argv)
*** 1970,1976 ****
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWvP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
--- 2122,2128 ----
  		}
  	}
  
! 	while ((c = getopt_long(argc, argv, "D:F:r:RT:xX:l:zZ:d:c:h:p:U:s:wWI:vP",
  							long_options, &option_index)) != -1)
  	{
  		switch (c)
*************** main(int argc, char **argv)
*** 2088,2093 ****
--- 2240,2248 ----
  			case 'W':
  				dbgetpassword = 1;
  				break;
+ 			case 'I':
+ 				read_backup_profile_header(optarg);
+ 				break;
  			case 's':
  				standby_message_timeout = atoi(optarg) * 1000;
  				if (standby_message_timeout < 0)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 138deaf..4bb261a 100644
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
*************** extern void SetWalWriterSleeping(bool sl
*** 249,255 ****
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				   TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
--- 249,256 ----
   * Starting/stopping a base backup
   */
  extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
! 				  XLogRecPtr incremental_startpoint,
! 				  TimeLineID *starttli_p, char **labelfile);
  extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
  				  TimeLineID *stoptli_p);
  extern void do_pg_abort_backup(void);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 64f2bd5..08f8e90 100644
*** a/src/include/replication/basebackup.h
--- b/src/include/replication/basebackup.h
***************
*** 20,25 ****
--- 20,30 ----
  #define MAX_RATE_LOWER	32
  #define MAX_RATE_UPPER	1048576
  
+ /* Backup profile */
+ #define BACKUP_PROFILE_HEADER		"POSTGRESQL BACKUP PROFILE 1"
+ #define BACKUP_PROFILE_SEPARATOR	"FILE LIST"
+ #define BACKUP_PROFILE_FILE			"backup_profile"
+ #define BACKUP_PROFILE_OLD			"backup_profile.old"
  
  extern void SendBaseBackup(BaseBackupCmd *cmd);
  
-- 
2.3.0
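
The header parsing that read_backup_profile_header performs in the patch above can be sketched outside C as follows. This is only an illustration under assumptions: the helper names parse_profile_header and format_lsn are hypothetical, and the layout (header line followed by the START WAL LOCATION line) is simply what the fscanf calls in the patch expect.

```python
import re

# Header string taken from BACKUP_PROFILE_HEADER in the patch.
BACKUP_PROFILE_HEADER = "POSTGRESQL BACKUP PROFILE 1"

# Same shape as the fscanf format in the patch: the two hex halves of the
# LSN, then a 24-digit WAL file name whose first 8 hex digits are the
# timeline.
_START_RE = re.compile(
    r"START WAL LOCATION: ([0-9A-Fa-f]+)/([0-9A-Fa-f]+) "
    r"\(file ([0-9A-Fa-f]{8})[0-9A-Fa-f]{16}\)"
)


def parse_profile_header(text):
    """Return (start_lsn, timeline) from the first two profile lines.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    lines = text.split("\n")
    if len(lines) < 2 or lines[0] != BACKUP_PROFILE_HEADER:
        raise ValueError("missing backup profile header")
    match = _START_RE.fullmatch(lines[1])
    if match is None:
        raise ValueError("invalid START WAL LOCATION line")
    hi, lo, tli = (int(group, 16) for group in match.groups())
    # An LSN is a 64-bit value printed as two 32-bit hex halves.
    return (hi << 32) | lo, tli


def format_lsn(lsn):
    """Format a 64-bit LSN as the INCREMENTAL clause does (%X/%X)."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)
```

The round trip through format_lsn mirrors how pg_basebackup rebuilds the `INCREMENTAL '%X/%X'` clause from incremental_startpoint before sending BASE_BACKUP.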

0003-copydir-LSN-v3.patch (text/plain; charset=UTF-8)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5e66961..7409471 100644
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** static bool have_createdb_privilege(void
*** 89,94 ****
--- 89,98 ----
  static void remove_dbtablespaces(Oid db_id);
  static bool check_db_file_conflict(Oid db_id);
  static int	errdetail_busy_db(int notherbackends, int npreparedxacts);
+ static void copydir_set_lsn(char *fromdir, char *todir, bool recurse,
+ 			XLogRecPtr recptr);
+ static void copy_file_set_lsn(char *fromfile, char *tofile,
+ 			XLogRecPtr recptr);
  
  
  /*
*************** createdb(const CreatedbStmt *stmt)
*** 586,591 ****
--- 590,596 ----
  			Oid			dsttablespace;
  			char	   *srcpath;
  			char	   *dstpath;
+ 			XLogRecPtr	recptr;
  			struct stat st;
  
  			/* No need to copy global tablespace */
*************** createdb(const CreatedbStmt *stmt)
*** 609,621 ****
  
  			dstpath = GetDatabasePath(dboid, dsttablespace);
  
- 			/*
- 			 * Copy this subdirectory to the new location
- 			 *
- 			 * We don't need to copy subdirectories
- 			 */
- 			copydir(srcpath, dstpath, false);
- 
  			/* Record the filesystem change in XLOG */
  			{
  				xl_dbase_create_rec xlrec;
--- 614,619 ----
*************** createdb(const CreatedbStmt *stmt)
*** 628,636 ****
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				(void) XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
--- 626,641 ----
  				XLogBeginInsert();
  				XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 				recptr = XLogInsert(RM_DBASE_ID,
  								  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  			}
+ 
+ 			/*
+ 			 * Copy this subdirectory to the new location
+ 			 *
+ 			 * We don't need to copy subdirectories
+ 			 */
+ 			copydir_set_lsn(srcpath, dstpath, false, recptr);
  		}
  		heap_endscan(scan);
  		heap_close(rel, AccessShareLock);
*************** movedb(const char *dbname, const char *t
*** 1214,1223 ****
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		/*
! 		 * Copy files from the old tablespace to the new one
! 		 */
! 		copydir(src_dbpath, dst_dbpath, false);
  
  		/*
  		 * Record the filesystem change in XLOG
--- 1219,1225 ----
  	PG_ENSURE_ERROR_CLEANUP(movedb_failure_callback,
  							PointerGetDatum(&fparms));
  	{
! 		XLogRecPtr	recptr;
  
  		/*
  		 * Record the filesystem change in XLOG
*************** movedb(const char *dbname, const char *t
*** 1233,1243 ****
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			(void) XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
--- 1235,1250 ----
  			XLogBeginInsert();
  			XLogRegisterData((char *) &xlrec, sizeof(xl_dbase_create_rec));
  
! 			recptr = XLogInsert(RM_DBASE_ID,
  							  XLOG_DBASE_CREATE | XLR_SPECIAL_REL_UPDATE);
  		}
  
  		/*
+ 		 * Copy files from the old tablespace to the new one
+ 		 */
+ 		copydir_set_lsn(src_dbpath, dst_dbpath, false, recptr);
+ 
+ 		/*
  		 * Update the database's pg_database tuple
  		 */
  		ScanKeyInit(&scankey,
*************** dbase_redo(XLogReaderState *record)
*** 2045,2050 ****
--- 2052,2058 ----
  	if (info == XLOG_DBASE_CREATE)
  	{
  		xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);
+ 		XLogRecPtr	lsn = record->EndRecPtr;
  		char	   *src_path;
  		char	   *dst_path;
  		struct stat st;
*************** dbase_redo(XLogReaderState *record)
*** 2077,2083 ****
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir(src_path, dst_path, false);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
--- 2085,2091 ----
  		 *
  		 * We don't need to copy subdirectories
  		 */
! 		copydir_set_lsn(src_path, dst_path, false, lsn);
  	}
  	else if (info == XLOG_DBASE_DROP)
  	{
*************** dbase_redo(XLogReaderState *record)
*** 2128,2130 ****
--- 2136,2377 ----
  	else
  		elog(PANIC, "dbase_redo: unknown op code %u", info);
  }
+ 
+ /*
+  * copydir_set_lsn: copy a directory, optionally setting page LSNs
+  *
+  * If recurse is false, subdirectories are ignored.  Anything that's not
+  * a directory or a regular file is ignored.
+  *
+  * If recptr is different from InvalidXLogRecPtr, the LSN of pages in the
+  * destination directory will be set to recptr.
+  */
+ void
+ copydir_set_lsn(char *fromdir, char *todir, bool recurse, XLogRecPtr recptr)
+ {
+ 	DIR		   *xldir;
+ 	struct dirent *xlde;
+ 	char		fromfile[MAXPGPATH];
+ 	char		tofile[MAXPGPATH];
+ 
+ 	if (mkdir(todir, S_IRWXU) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not create directory \"%s\": %m", todir)));
+ 
+ 	xldir = AllocateDir(fromdir);
+ 	if (xldir == NULL)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open directory \"%s\": %m", fromdir)));
+ 
+ 	while ((xlde = ReadDir(xldir, fromdir)) != NULL)
+ 	{
+ 		struct stat fst;
+ 
+ 		/* If we got a cancel signal during the copy of the directory, quit */
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		if (strcmp(xlde->d_name, ".") == 0 ||
+ 			strcmp(xlde->d_name, "..") == 0)
+ 			continue;
+ 
+ 		snprintf(fromfile, MAXPGPATH, "%s/%s", fromdir, xlde->d_name);
+ 		snprintf(tofile, MAXPGPATH, "%s/%s", todir, xlde->d_name);
+ 
+ 		if (lstat(fromfile, &fst) < 0)
+ 			ereport(ERROR,
+ 					(errcode_for_file_access(),
+ 					 errmsg("could not stat file \"%s\": %m", fromfile)));
+ 
+ 		if (S_ISDIR(fst.st_mode))
+ 		{
+ 			/* recurse to handle subdirectories */
+ 			if (recurse)
+ 				copydir_set_lsn(fromfile, tofile, true, recptr);
+ 		}
+ 		else if (S_ISREG(fst.st_mode))
+ 			copy_file_set_lsn(fromfile, tofile, recptr);
+ 	}
+ 	FreeDir(xldir);
+ 
+ 	/*
+ 	 * Be paranoid here and fsync all files to ensure the copy is really done.
+ 	 * But if fsync is disabled, we're done.
+ 	 */
+ 	if (!enableFsync)
+ 		return;
+ 
+ 	xldir = AllocateDir(todir);
+ 	if (xldir == NULL)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open directory \"%s\": %m", todir)));
+ 
+ 	while ((xlde = ReadDir(xldir, todir)) != NULL)
+ 	{
+ 		struct stat fst;
+ 
+ 		if (strcmp(xlde->d_name, ".") == 0 ||
+ 			strcmp(xlde->d_name, "..") == 0)
+ 			continue;
+ 
+ 		snprintf(tofile, MAXPGPATH, "%s/%s", todir, xlde->d_name);
+ 
+ 		/*
+ 		 * We don't need to sync subdirectories here since the recursive
+ 		 * copydir will do it before it returns
+ 		 */
+ 		if (lstat(tofile, &fst) < 0)
+ 			ereport(ERROR,
+ 					(errcode_for_file_access(),
+ 					 errmsg("could not stat file \"%s\": %m", tofile)));
+ 
+ 		if (S_ISREG(fst.st_mode))
+ 			fsync_fname(tofile, false);
+ 	}
+ 	FreeDir(xldir);
+ 
+ 	/*
+ 	 * It's important to fsync the destination directory itself as individual
+ 	 * file fsyncs don't guarantee that the directory entry for the file is
+ 	 * synced. Recent versions of ext4 have made the window much wider but
+ 	 * it's been true for ext3 and other filesystems in the past.
+ 	 */
+ 	fsync_fname(todir, true);
+ }
+ 
+ /*
+  * copy one file
+  *
+  * If recptr is different from InvalidXLogRecPtr, the destination file will
+  * have the LSN of all its pages set accordingly
+  */
+ void
+ copy_file_set_lsn(char *fromfile, char *tofile, XLogRecPtr recptr)
+ {
+ 	char	   *buffer;
+ 	int			srcfd;
+ 	int			dstfd;
+ 	int			nbytes;
+ 	off_t		offset;
+ 	BlockNumber	blkno = 0;
+ 
+ 	/* Use palloc to ensure we get a maxaligned buffer */
+ #define COPY_BUF_SIZE (8 * BLCKSZ)
+ 
+ 	buffer = palloc(COPY_BUF_SIZE);
+ 
+ 	/*
+ 	 * To support incremental backups, we need to update the LSN in
+ 	 * all relation files we are copying.
+ 	 *
+ 	 * We are updating only the MAIN fork because at the moment
+ 	 * blocks in FSM and VM forks are not guaranteed to have an
+ 	 * up-to-date LSN
+ 	 */
+ 	if (recptr != InvalidXLogRecPtr)
+ 	{
+ 		char 	   *filename = last_dir_separator(fromfile);
+ 		ForkNumber	fork;
+ 		int			oidchars;
+ 		uint32		segno;
+ 
+ 		if (filename &&
+ 				*(filename + 1) &&
+ 				parse_filename_for_nontemp_relation(filename + 1,
+ 						&oidchars, &fork, &segno) && fork == MAIN_FORKNUM)
+ 			blkno = segno * RELSEG_SIZE;
+ 		else
+ 			recptr = InvalidXLogRecPtr;
+ 	}
+ 
+ 	/*
+ 	 * Open the files
+ 	 */
+ 	srcfd = OpenTransientFile(fromfile, O_RDONLY | PG_BINARY, 0);
+ 	if (srcfd < 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open file \"%s\": %m", fromfile)));
+ 
+ 	dstfd = OpenTransientFile(tofile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
+ 							  S_IRUSR | S_IWUSR);
+ 	if (dstfd < 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not create file \"%s\": %m", tofile)));
+ 
+ 	/*
+ 	 * Do the data copying.
+ 	 */
+ 	for (offset = 0;; offset += nbytes)
+ 	{
+ 		/* If we got a cancel signal during the copy of the file, quit */
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
+ 		if (nbytes < 0)
+ 			ereport(ERROR,
+ 					(errcode_for_file_access(),
+ 					 errmsg("could not read file \"%s\": %m", fromfile)));
+ 		if (nbytes == 0)
+ 			break;
+ 
+ 		/*
+ 		 * If a valid recptr has been provided, the resulting file will have
+ 		 * all its pages with LSN set accordingly
+ 		 */
+ 		if (recptr != InvalidXLogRecPtr)
+ 		{
+ 			char		*page;
+ 
+ 			/*
+ 			 * If we are updating LSN of a file, we must be sure that the
+ 			 * source file is not being extended.
+ 			 */
+ 			if (nbytes % BLCKSZ != 0)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 						 errmsg("file \"%s\" size is not multiple of %d",
+ 								fromfile, BLCKSZ)));
+ 
+ 			for (page = buffer; page < (buffer + nbytes); page += BLCKSZ, blkno++)
+ 			{
+ 				/* Update LSN only if the page looks valid */
+ 				if (!PageIsNew(page) && PageIsVerified(page, blkno))
+ 				{
+ 					PageSetLSN(page, recptr);
+ 					PageSetChecksumInplace(page, blkno);
+ 				}
+ 			}
+ 		}
+ 
+ 		errno = 0;
+ 		if ((int) write(dstfd, buffer, nbytes) != nbytes)
+ 		{
+ 			/* if write didn't set errno, assume problem is no disk space */
+ 			if (errno == 0)
+ 				errno = ENOSPC;
+ 			ereport(ERROR,
+ 					(errcode_for_file_access(),
+ 					 errmsg("could not write to file \"%s\": %m", tofile)));
+ 		}
+ 
+ 		/*
+ 		 * We fsync the files later but first flush them to avoid spamming the
+ 		 * cache and hopefully get the kernel to start writing them out before
+ 		 * the fsync comes.  Ignore any error, since it's only a hint.
+ 		 */
+ 		(void) pg_flush_data(dstfd, offset, nbytes);
+ 	}
+ 
+ 	if (CloseTransientFile(dstfd))
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not close file \"%s\": %m", tofile)));
+ 
+ 	CloseTransientFile(srcfd);
+ 
+ 	pfree(buffer);
+ }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 107d70c..8f85752 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
*************** sendDir(char *path, int basepathlen, boo
*** 1219,1225 ****
  				is_relfile = ( has_relfiles
  					&& parse_filename_for_nontemp_relation(de->d_name,
  														   &oidchars,
! 														   &forknum)
  					&& forknum == MAIN_FORKNUM);
  
  				if (!is_relfile
--- 1219,1226 ----
  				is_relfile = ( has_relfiles
  					&& parse_filename_for_nontemp_relation(de->d_name,
  														   &oidchars,
! 														   &forknum,
! 														   NULL)
  					&& forknum == MAIN_FORKNUM);
  
  				if (!is_relfile
diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index 02b5fee..2f7dca6 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 190,196 ****
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
--- 190,196 ----
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum, NULL))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 243,249 ****
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum))
  				continue;
  
  			/* We never remove the init fork. */
--- 243,249 ----
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum, NULL))
  				continue;
  
  			/* We never remove the init fork. */
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 313,319 ****
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
--- 313,319 ----
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum, NULL))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
*************** ResetUnloggedRelationsInDbspaceDir(const
*** 364,370 ****
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
--- 364,370 ----
  
  			/* Skip anything that doesn't look like a relation data file. */
  			if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars,
! 													 &forkNum, NULL))
  				continue;
  
  			/* Also skip it unless this is the init fork. */
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 83a1e3a..63972bd 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 213,218 ****
--- 213,222 ----
   * This function returns true if the file appears to be in the correct format
   * for a non-temporary relation and false otherwise.
   *
+  * The segno parameter can be safely set to NULL.
+  * It should be of BlockNumber* type, but it is declared as uint32
+  * to avoid depending on storage/block.h
+  *
   * NB: If this function returns true, the caller is entitled to assume that
   * *oidchars has been set to the a value no more than OIDCHARS, and thus
   * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
*************** GetRelationPath(Oid dbNode, Oid spcNode,
*** 221,227 ****
   */
  bool
  parse_filename_for_nontemp_relation(const char *name, int *oidchars,
! 									ForkNumber *fork)
  {
  	int			pos;
  
--- 225,231 ----
   */
  bool
  parse_filename_for_nontemp_relation(const char *name, int *oidchars,
! 									ForkNumber *fork, uint32 *segno)
  {
  	int			pos;
  
*************** parse_filename_for_nontemp_relation(cons
*** 246,257 ****
  	}
  
  	/* Check for a segment number. */
  	if (name[pos] == '.')
  	{
  		int			segchar;
  
  		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
! 			;
  		if (segchar <= 1)
  			return false;
  		pos += segchar;
--- 250,264 ----
  	}
  
  	/* Check for a segment number. */
+ 	if (segno)
+ 		*segno = 0;
  	if (name[pos] == '.')
  	{
  		int			segchar;
  
  		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
! 			if (segno)
! 				*segno = *segno * 10 + name[pos + segchar] - '0';
  		if (segchar <= 1)
  			return false;
  		pos += segchar;
diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h
index 9736a78..9dd492f 100644
*** a/src/include/common/relpath.h
--- b/src/include/common/relpath.h
*************** extern char *GetDatabasePath(Oid dbNode,
*** 53,59 ****
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
  extern bool parse_filename_for_nontemp_relation(const char *name,
! 								int *oidchars, ForkNumber *fork);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
--- 53,60 ----
  extern char *GetRelationPath(Oid dbNode, Oid spcNode, Oid relNode,
  				int backendId, ForkNumber forkNumber);
  extern bool parse_filename_for_nontemp_relation(const char *name,
! 								int *oidchars, ForkNumber *fork,
! 								uint32 *segno);
  
  /*
   * Wrapper macros for GetRelationPath.  Beware of multiple
-- 
2.3.0
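
The block number matters in the patch above because PageIsVerified and PageSetChecksumInplace both take it, so copy_file_set_lsn derives the first block of each segment as segno * RELSEG_SIZE. A minimal sketch of that file-name parsing, assuming the default build-time constants and the usual fork suffixes (the helper names here are hypothetical, not part of the patch):

```python
import re

# Build-time defaults in PostgreSQL; both are configurable at compile time,
# so treat these values as assumptions.
BLCKSZ = 8192
RELSEG_SIZE = 131072  # blocks per 1 GB segment with 8 kB blocks

# <oid>[_<fork>][.<segno>], with at most 10 digits in the OID part.
_RELFILE_RE = re.compile(r"([0-9]{1,10})(?:_(fsm|vm|init))?(?:\.([0-9]+))?")


def parse_relation_filename(name):
    """Return (oid, fork, segno) for a non-temporary relation file, else None."""
    match = _RELFILE_RE.fullmatch(name)
    if match is None:
        return None
    oid, fork, segno = match.groups()
    return int(oid), fork or "main", int(segno) if segno else 0


def first_block_of_segment(name):
    """First block number stored in the file, or None if LSNs must not be set.

    Mirrors the patch: only MAIN fork segments qualify.
    """
    parsed = parse_relation_filename(name)
    if parsed is None or parsed[1] != "main":
        return None
    return parsed[2] * RELSEG_SIZE
```

For example, `first_block_of_segment("16384.3")` is three segments in, while a visibility-map file such as `16384_vm` yields None and would be copied without touching any LSN.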

#27

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Marco Nenciarini (#26)

Re: File based Incremental backup v8

On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi,

I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

When I applied three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery, from
the first full backup and subsequent incremental backups? What commands should
we use for that, for example? It's better to document that.

What does "1" of the heading line in backup_profile mean?

Sorry if this has been already discussed so far. Why is a backup profile file
necessary? Maybe it's necessary in the future, but currently seems not.
Several pieces of information such as LSN, modification time, and size are tracked in the
backup profile for every backed-up file, but they are not used for now. If it's not
required now, I'm inclined to remove it to simplify the code.

Have we really reached consensus on the current design, especially that
basically every file needs to be read to check whether it has been modified
since the last backup, even when *no* modification has happened since then?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Fujii Masao (#27)

Re: File based Incremental backup v8

Il 02/03/15 14:21, Fujii Masao ha scritto:

On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi,

I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

I'll add an explicit cast at those two lines.

When I applied three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery, from
the first full backup and subsequent incremental backups? What commands should
we use for that, for example? It's better to document that.

I've sent a Python PoC that supports only the plain format (not the tar one).
I'm currently rewriting it in C (with tar support as well) and I'll send a new patch containing it ASAP.

What does "1" of the heading line in backup_profile mean?

Nothing. It's a version number. If you think it's misleading I will remove it.

Sorry if this has been already discussed so far. Why is a backup profile file
necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.

Several infos like LSN, modification time, size, etc are tracked in a backup
profile file for every backup files, but they are not used for now. If it's now
not required, I'm inclined to remove it to simplify the code.

I've put the LSN there mainly for debugging purposes, but it can also be used to check the file during pg_restorebackup execution. The sent field is probably redundant (if sent = False and LSN is not set, we should probably just avoid writing a line about that file) and I'll remove it in the next patch.

We've really gotten the consensus about the current design, especially that
every files basically need to be read to check whether they have been modified
since last backup even when *no* modification happens since last backup?

The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed not to use file system timestamps as they are not reliable for that purpose.
Using the LSN has a significant advantage over using a checksum, as we can start the full copy as soon as we find a block with an LSN greater than the threshold.
There are two cases: 1) the file has changed, so we can assume that we detect it after reading 50% of the file on average, then we send it, taking advantage of the file system cache; 2) the file has not changed, so we read it without sending anything.
It will end up producing I/O comparable to a normal backup.
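
The early-exit scan described above can be sketched roughly as follows. This is an illustration only, not code from the patch: it assumes the default 8 kB block size and that each page starts with its LSN stored as two native-endian 32-bit words (high word first). The names parse_lsn, page_lsn and scan_file are made up for this example.

```python
import struct

BLCKSZ = 8192  # assumed block size (the PostgreSQL default)


def parse_lsn(text):
    """Convert an LSN written as 'X/Y' (hexadecimal) into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def page_lsn(page):
    """Read the LSN from the first 8 bytes of a page.

    Assumption: two native-endian uint32 values, high word first.
    """
    high, low = struct.unpack_from("=II", page, 0)
    return (high << 32) | low


def scan_file(path, threshold):
    """Return (changed, blocks_read), stopping at the first newer block."""
    blocks_read = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < BLCKSZ:
                # End of file (a trailing partial block is ignored here).
                return False, blocks_read
            blocks_read += 1
            if page_lsn(page) > threshold:
                # One newer block is enough: the whole file must be sent.
                return True, blocks_read
```

For an unchanged file, blocks_read equals the number of blocks in the file, which is the case where the file is read without anything being sent.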

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

#29

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Marco Nenciarini (#28)

Re: File based Incremental backup v8

On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Il 02/03/15 14:21, Fujii Masao ha scritto:

On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi,

I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

I'll add an explicit cast at those two lines.

When I applied three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery, from
the first full backup and subsequent incremental backups? What commands should
we use for that, for example? It's better to document that.

I've sent a python PoC that supports the plain format only (not the tar one).
I'm currently rewriting it in C (with also the tar support) and I'll send a new patch containing it ASAP.

Yeah, if a special tool is required for that purpose, the patch should include it.

What does "1" of the heading line in backup_profile mean?

Nothing. It's a version number. If you think it's misleading I will remove it.

A version number of file format of backup profile? If it's required for
the validation of backup profile file as a safe-guard, it should be included
in the profile file. For example, it might be useful to check whether
pg_basebackup executable is compatible with the "source" backup that
you specify. But more info might be needed for such validation.

Sorry if this has been already discussed so far. Why is a backup profile file
necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.

Maybe I'm missing something. Seems we can detect that even without a profile.
For example, please imagine the case where the file has been deleted since
the last full backup and then the incremental backup is taken. In this case,
that deleted file exists only in the full backup. We can detect the deletion of
the file by checking both full and incremental backups.

We've really gotten the consensus about the current design, especially that
every files basically need to be read to check whether they have been modified
since last backup even when *no* modification happens since last backup?

The real problem here is that there is currently no way to detect that a file is not changed since the last backup. We agreed to not use file system timestamps as they are not reliable for that purpose.

TBH I prefer a timestamp-based approach in the first version of incremental backup
even if it's less reliable than an LSN-based one. I think that some users who are
using timestamp-based rsync (i.e., the default mode) for backups would be
satisfied with a timestamp-based one.

Using the LSN has a significant advantage over using a checksum, as we can start the full copy as soon as we find a block with an LSN greater than the threshold.
There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of file system cache; 2) the file is not changed, so we read it without sending anything.
It will end up producing an I/O comparable to a normal backup.

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

Regards,

--
Fujii Masao


#30

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Fujii Masao (#29)

Re: File based Incremental backup v8

Hi Fujii,

Il 03/03/15 11:48, Fujii Masao ha scritto:

On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Il 02/03/15 14:21, Fujii Masao ha scritto:

On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi,

I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

I'll add an explicit cast at those two lines.

When I applied three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery, from
the first full backup and subsequent incremental backups? What commands should
we use for that, for example? It's better to document that.

I've sent a python PoC that supports the plain format only (not the tar one).
I'm currently rewriting it in C (with also the tar support) and I'll send a new patch containing it ASAP.

Yeah, if special tool is required for that purpose, the patch should include it.

I'm working on it. The interface will be exactly the same as that of the PoC script I attached to

54C7CDAD.6060900@2ndquadrant.it

What does "1" of the heading line in backup_profile mean?

Nothing. It's a version number. If you think it's misleading I will remove it.

A version number of file format of backup profile? If it's required for
the validation of backup profile file as a safe-guard, it should be included
in the profile file. For example, it might be useful to check whether
pg_basebackup executable is compatible with the "source" backup that
you specify. But more info might be needed for such validation.

The current implementation bails out with an error if the header line is different from what it expects.
It also reports an error if the second line is not the start WAL location. That's all that pg_basebackup needs to start a new incremental backup. All the other information is useful to reconstruct a full backup in the case of an incremental backup, or maybe to check the completeness of an archived full backup.
Initially the profile was present only in incremental backups, but after some discussion on the list we agreed to always write it.
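
A minimal sketch of the two checks just described, for illustration only. The exact header text is an assumption (the constant EXPECTED_HEADER below is hypothetical), and read_start_lsn is a made-up name. The second-line layout follows the "START WAL LOCATION: <LSN> ..." form that the awk command in the first message of this thread relies on.

```python
# Hypothetical header text; the real string written by the patch may differ.
EXPECTED_HEADER = "POSTGRESQL BACKUP PROFILE 1"


def read_start_lsn(lines):
    """Validate the first two profile lines and return the start LSN as text."""
    lines = list(lines)
    if len(lines) < 2 or lines[0].rstrip("\n") != EXPECTED_HEADER:
        raise ValueError("unexpected backup profile header")
    fields = lines[1].split()
    # Expected shape: START WAL LOCATION: 0/52000028 (file ...)
    if len(fields) < 4 or fields[:3] != ["START", "WAL", "LOCATION:"]:
        raise ValueError("second line is not the start WAL location")
    return fields[3]
```

Anything after the second line is ignored here, matching the statement that the header and the start WAL location are all that is needed to start a new incremental backup.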

Sorry if this has been already discussed so far. Why is a backup profile file
necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.

Maybe I'm missing something. Seems we can detect that even without a profile.
For example, please imagine the case where the file has been deleted since
the last full backup and then the incremental backup is taken. In this case,
that deleted file exists only in the full backup. We can detect the deletion of
the file by checking both full and incremental backups.

When you take an incremental backup, only changed files are sent. Without the backup_profile in the incremental backup, you cannot detect a deleted file, because it's indistinguishable from a file that has not changed.
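
To make the argument above concrete, here is a small sketch. It is not part of the patch; it assumes the profile of the incremental backup lists every file that existed in the cluster when that backup was taken, and classify_files is a made-up name.

```python
def classify_files(full_files, incremental_files, profile_files):
    """Split file names into deleted, unchanged and sent.

    full_files:        files present in the previous full backup
    incremental_files: files actually contained in the incremental backup
    profile_files:     files listed in the incremental backup's profile
    """
    full = set(full_files)
    sent = set(incremental_files)
    listed = set(profile_files)
    return {
        # In the full backup but no longer listed: deleted in the meantime.
        "deleted": sorted(full - listed),
        # Listed but not sent: unchanged, to be taken from the full backup.
        "unchanged": sorted(listed - sent),
        # Present in the incremental backup: changed or newly created.
        "sent": sorted(sent),
    }
```

Without profile_files, both "deleted" and "unchanged" files look the same: they exist in the full backup and are absent from the incremental one.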

We've really gotten the consensus about the current design, especially that
every files basically need to be read to check whether they have been modified
since last backup even when *no* modification happens since last backup?

The real problem here is that there is currently no way to detect that a file is not changed since the last backup. We agreed to not use file system timestamps as they are not reliable for that purpose.

TBH I prefer timestamp-based approach in the first version of incremental backup
even if's less reliable than LSN-based one. I think that some users who are
using timestamp-based rsync (i.e., default mode) for the backup would be
satisfied with timestamp-based one.

The original design was to compare size+timestamp+checksums (only if everything else matches and the file has been modified after the start of the backup), but the feedback from the list was that we cannot trust the filesystem mtime and we must use LSN instead.

Using the LSN has a significant advantage over using a checksum, as we can start the full copy as soon as we find a block with an LSN greater than the threshold.
There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of file system cache; 2) the file is not changed, so we read it without sending anything.
It will end up producing an I/O comparable to a normal backup.

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

I don't get what you mean here. Can you elaborate on this point?

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

#31

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Marco Nenciarini (#30)

Re: File based Incremental backup v8

On Thu, Mar 5, 2015 at 1:59 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi Fujii,

Il 03/03/15 11:48, Fujii Masao ha scritto:

On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Il 02/03/15 14:21, Fujii Masao ha scritto:

On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:

Hi,

I've attached an updated version of the patch.

basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
basebackup.c:1565: warning: format '%lld' expects type 'long long
int', but argument 8 has type '__off_t'
pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code

I'll add an explicit cast at those two lines.

When I applied three patches and compiled the code, I got the above warnings.

How can we get the full backup that we can use for the archive recovery, from
the first full backup and subsequent incremental backups? What commands should
we use for that, for example? It's better to document that.

I've sent a python PoC that supports the plain format only (not the tar one).
I'm currently rewriting it in C (with also the tar support) and I'll send a new patch containing it ASAP.

Yeah, if special tool is required for that purpose, the patch should include it.

I'm working on it. The interface will be exactly the same of the PoC script I've attached to

54C7CDAD.6060900@2ndquadrant.it

What does "1" of the heading line in backup_profile mean?

Nothing. It's a version number. If you think it's misleading I will remove it.

A version number of file format of backup profile? If it's required for
the validation of backup profile file as a safe-guard, it should be included
in the profile file. For example, it might be useful to check whether
pg_basebackup executable is compatible with the "source" backup that
you specify. But more info might be needed for such validation.

The current implementation bail out with an error if the header line is different from what it expect.
It also reports and error if the 2nd line is not the start WAL location. That's all that pg_basebackup needs to start a new incremental backup. All the other information are useful to reconstruct a full backup in case of an incremental backup, or maybe to check the completeness of an archived full backup.
Initially the profile was present only in incremental backups, but after some discussion on list we agreed to always write it.

Don't we need more checks on the compatibility of the backup-target database
cluster and the source incremental backup? Without such checks, I'm afraid
we can easily end up with corrupted incremental backups. For example, shouldn't
pg_basebackup emit an error if the target and source have different system IDs,
as walreceiver does? What happens if the timeline ID differs between the
source and target? What happens if the source was taken from the standby but
the new incremental backup is taken from the master? Do we need to check for these cases?

Sorry if this has been already discussed so far. Why is a backup profile file
necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.

Maybe I'm missing something. Seems we can detect that even without a profile.
For example, please imagine the case where the file has been deleted since
the last full backup and then the incremental backup is taken. In this case,
that deleted file exists only in the full backup. We can detect the deletion of
the file by checking both full and incremental backups.

When you take an incremental backup, only changed files are sent. Without the backup_profile in the incremental backup, you cannot detect a deleted file, because it's indistinguishable from a file that is not changed.

Yeah, you're right!

We've really gotten the consensus about the current design, especially that
every files basically need to be read to check whether they have been modified
since last backup even when *no* modification happens since last backup?

The real problem here is that there is currently no way to detect that a file is not changed since the last backup. We agreed to not use file system timestamps as they are not reliable for that purpose.

TBH I prefer timestamp-based approach in the first version of incremental backup
even if's less reliable than LSN-based one. I think that some users who are
using timestamp-based rsync (i.e., default mode) for the backup would be
satisfied with timestamp-based one.

The original design was to compare size+timestamp+checksums (only if everything else matches and the file has been modified after the start of the backup), but the feedback from the list was that we cannot trust the filesystem mtime and we must use LSN instead.

Using the LSN has a significant advantage over using a checksum, as we can start the full copy as soon as we find a block with an LSN greater than the threshold.
There are two cases: 1) the file is changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of file system cache; 2) the file is not changed, so we read it without sending anything.
It will end up producing an I/O comparable to a normal backup.

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

I don't get what do you mean here. Can you elaborate this point?

The proposed version of LSN-based incremental backup has some limitations
(e.g., every database file needs to be read even when there has been no modification
in the database since the last backup, which may make the backup take longer than
users expect) which may disappoint users. So I'm afraid that the users who can
benefit from the feature might be very limited. IOW, I'm just sticking to
the idea of a timestamp-based one :) But I should drop it if the majority on
the list prefers the LSN-based one even with such limitations.

Regards,

--
Fujii Masao


#32

Bruce Momjian

bruce@momjian.us

almost 11 years ago

In reply to: Fujii Masao (#31)

Re: File based Incremental backup v8

On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

I don't get what do you mean here. Can you elaborate this point?

The proposed version of LSN-based incremental backup has some limitations
(e.g., every database files need to be read even when there is no modification
in database since last backup, and which may make the backup time longer than
users expect) which may disappoint users. So I'm afraid that users who can
benefit from the feature might be very limited. IOW, I'm just sticking to
the idea of timestamp-based one :) But I should drop it if the majority in
the list prefers the LSN-based one even if it has such limitations.

We need numbers on how effective each level of tracking will be. Until
then, the patch can't move forward.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


#33

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Bruce Momjian (#32)

Re: File based Incremental backup v8

On Wed, Mar 4, 2015 at 11:42 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

I don't get what do you mean here. Can you elaborate this point?

The proposed version of LSN-based incremental backup has some limitations
(e.g., every database files need to be read even when there is no modification
in database since last backup, and which may make the backup time longer than
users expect) which may disappoint users. So I'm afraid that users who can
benefit from the feature might be very limited. IOW, I'm just sticking to
the idea of timestamp-based one :) But I should drop it if the majority in
the list prefers the LSN-based one even if it has such limitations.

We need numbers on how effective each level of tracking will be. Until
then, the patch can't move forward.

The point is that this is a stepping stone toward what will ultimately
be a better solution. You can use timestamps today if (a) whole-file
granularity is good enough for you and (b) you trust your system clock
to never go backwards. In fact, if you use pg_start_backup() and
pg_stop_backup(), you don't even need a server patch; you can just go
right ahead and implement whatever you like. A server patch would be
needed to make pg_basebackup do a file-time-based incremental backup,
but I'm not excited about that because I think the approach is a
dead-end.

If you want block-level granularity, and you should, an approach based
on file times is never going to get you there. An approach based on
LSNs can. If the first version of the patch requires reading the
whole database, fine, it's not going to perform all that terribly
well. But we can optimize that later by keeping track of which blocks
have been modified since a given LSN. If we do that, we can get
better reliability than the timestamp approach can ever offer, plus
excellent transfer and storage characteristics.

What I'm unhappy with about this patch is that it insists on sending
the whole file if a single block in that file has changed. That is
lame. To get something useful out of this, we should be looking to
send only those blocks whose LSNs have actually changed. That would
reduce I/O (in the worst case, the current patch reads each file in its
entirety twice) and transfer bandwidth as compared to the proposed
patch. We'd still have to read the whole database so it might very
well do more I/O than the file-timestamp approach, but it would beat
the file-timestamp approach on transfer bandwidth and on the amount of
storage required to store the incremental. In many workloads, I
expect those savings would be quite significant. If we then went back
in a later release and implemented one of the various proposals to
avoid needing to read every block, we'd then have a very robust and
complete solution.
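
The block-level selection argued for above could look roughly like this. It is a sketch under stated assumptions, not a proposal for the wire protocol: it assumes 8 kB blocks and a page LSN stored in the first 8 bytes as two native-endian 32-bit words (high word first), and changed_blocks is a made-up name. It still reads every block, but only the blocks it yields would need to be transferred and stored.

```python
import struct

BLCKSZ = 8192  # assumed block size (the PostgreSQL default)


def changed_blocks(path, threshold):
    """Yield (block_number, page) for every page whose LSN exceeds threshold."""
    with open(path, "rb") as f:
        blkno = 0
        while True:
            page = f.read(BLCKSZ)
            if len(page) < BLCKSZ:
                break  # end of file; a trailing partial block is ignored
            # Assumption: LSN is two native-endian uint32 words, high first.
            high, low = struct.unpack_from("=II", page, 0)
            if ((high << 32) | low) > threshold:
                yield blkno, page
            blkno += 1
```

A reconstruction tool would then need the block numbers in order to overlay these pages onto the corresponding file from the previous backup, which is the extra tooling the message above refers to.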

But I agree with Fujii to the extent that I see little value in
committing this patch in the form proposed. Being smart enough to use
the LSN to identify changed blocks, but then sending the entirety of
every file anyway because you don't want to go to the trouble of
figuring out how to revise the wire protocol to identify the
individual blocks being sent and write the tools to reconstruct a full
backup based on that data, does not seem like enough of a win. As
Fujii says, if we ship this patch as written, people will just keep
using the timestamp-based approach anyway. Let's wait until we have
something that is, at least in some circumstances, a material
improvement over the status quo before committing anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#34

Gabriele Bartolini

gabriele.bartolini@2ndquadrant.it

almost 11 years ago

In reply to: Robert Haas (#33)

Re: File based Incremental backup v8

Hi Robert,

2015-03-06 3:10 GMT+11:00 Robert Haas <robertmhaas@gmail.com>:

But I agree with Fujii to the extent that I see little value in
committing this patch in the form proposed. Being smart enough to use
the LSN to identify changed blocks, but then sending the entirety of
every file anyway because you don't want to go to the trouble of
figuring out how to revise the wire protocol to identify the
individual blocks being sent and write the tools to reconstruct a full
backup based on that data, does not seem like enough of a win.

I believe the main point is to look at this from a user-interface point of view.
If/when we switch to block-level incremental support, this will be
completely transparent to the end user, even if we start with a file-level
approach with an LSN check.

The win is already determined by the average space/time gained by users of
VLDBs with a good chunk of read-only data. Our Barman users with incremental
backup (released recently; its algorithm is comparable to that of the
file-level backup proposed by Marco) see on average a data
deduplication ratio of between 50% and 70% of the cluster size.

A tangible example is depicted here, with Navionics saving 8.2TB a week
thanks to this approach (and 17 hours instead of 50 for backup time):
http://blog.2ndquadrant.com/incremental-backup-barman-1-4-0/

However, even smaller databases will benefit. It is clear that very small
databases as well as frequently updated ones won't be interested in
incremental backup, but that has never been the use case for this feature.

I believe that if we still think that this approach is not worth it, we are
making a big mistake. The way I see it, this patch follows an agile
approach and it is an important step towards incremental backup on a block
basis.

As Fujii says, if we ship this patch as written, people will just keep
using the timestamp-based approach anyway.

I think that allowing users to back up incrementally
through streaming replication (even though based on files) will give more
flexibility to system and database administrators for their disaster
recovery solutions.

Thanks,
Gabriele

#35

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Gabriele Bartolini (#34)

Re: File based Incremental backup v8

On Fri, Mar 6, 2015 at 9:38 AM, Gabriele Bartolini
<gabriele.bartolini@2ndquadrant.it> wrote:

I believe the main point is to look at this from a user-interface point of view.
If/when we switch to block-level incremental support, this will be
completely transparent to the end user, even if we start with a file-level
approach with an LSN check.

I don't think that's true. To have a real file-level incremental
backup you need the ability to take the incremental backup, and then
you also need the ability to take a full backup + an incremental
backup taken later and reassemble a full image of the cluster on which
you can run recovery. The means of doing that is going to be
different for an approach that only copies certain blocks vs. one that
copies whole files. Once we have the block-based approach, nobody
will ever use the file-based approach, so whatever code or tools we
write to do that will all be dead code, yet we'll still have to
support them for many years.

By the way, unless I'm missing something, this patch only seems to
include the code to construct an incremental backup, but no tools
whatsoever to do anything useful with it once you've got it. I think
that's 100% unacceptable. Users need to be able to manipulate
PostgreSQL backups using either standard operating system tools or
tools provided with PostgreSQL. Some people may prefer to use
something like repmgr or pitrtools or omnipitr in addition, but that
shouldn't be a requirement for incremental backup to be usable.

Agile development is good, but that does not mean you can divide a big
project into arbitrarily small chunks. At some point the chunks are
too small to be sure that the overall direction is right, and/or
individually useless.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#36

Marco Nenciarini

marco.nenciarini@2ndquadrant.it

almost 11 years ago

In reply to: Bruce Momjian (#32)

1 attachment(s)

Re: File based Incremental backup v8

Il 05/03/15 05:42, Bruce Momjian ha scritto:

On Thu, Mar 5, 2015 at 01:25:13PM +0900, Fujii Masao wrote:

Yeah, it might make the situation better than today. But I'm afraid that
many users might get disappointed about that behavior of an incremental
backup after the release...

I don't get what do you mean here. Can you elaborate this point?

The proposed version of LSN-based incremental backup has some limitations
(e.g., every database files need to be read even when there is no modification
in database since last backup, and which may make the backup time longer than
users expect) which may disappoint users. So I'm afraid that users who can
benefit from the feature might be very limited. IOW, I'm just sticking to
the idea of timestamp-based one :) But I should drop it if the majority in
the list prefers the LSN-based one even if it has such limitations.

We need numbers on how effective each level of tracking will be. Until
then, the patch can't move forward.

I've written a little test script to estimate how much space can be saved by file-level incremental backup, and I've run it on some real backups I have access to.

The script takes two base backup directories and simulates how much data could be saved in the second backup by using incremental backup (based on file size/time and on LSN).

It assumes that every file in base, global and pg_tblspc which matches both size and modification time will also match from the LSN point of view.

The result is that many databases can take advantage of incremental backup, even if not so big, and considering LSNs yields a result almost identical to the approach based on filesystem metadata.
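
For readers without the attachment, the size/time part of such a comparison can be sketched as below. This is not the attached script: matching_bytes is a made-up name, it only covers the file size and modification time check (not the LSN one), it compares modification times at one-second resolution, and it does not follow symlinked tablespaces.

```python
import os

# Directories the message above says are considered.
TRACKED_DIRS = ("base", "global", "pg_tblspc")


def matching_bytes(first, second):
    """Return (matched, total) byte counts for the second backup.

    A file counts as matching when a file with the same relative path,
    size and modification time exists in the first backup.
    """
    matched = total = 0
    for top in TRACKED_DIRS:
        for dirpath, _dirs, files in os.walk(os.path.join(second, top)):
            for name in files:
                path2 = os.path.join(dirpath, name)
                st2 = os.stat(path2)
                total += st2.st_size
                path1 = os.path.join(first, os.path.relpath(path2, second))
                if not os.path.isfile(path1):
                    continue  # new file, would have to be sent
                st1 = os.stat(path1)
                if (st1.st_size == st2.st_size
                        and int(st1.st_mtime) == int(st2.st_mtime)):
                    matched += st2.st_size
    return matched, total
```

Dividing matched by total gives a percentage of the kind reported as "Matching files size" below, although the totals in those reports may cover more than the three directories counted here.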

== Very big geographic database (similar to openstreetmap main DB), it contains versioned data, interval two months

First backup size: 13286623850656 (12.1TiB)
Second backup size: 13323511925626 (12.1TiB)
Matching files count: 17094
Matching LSN count: 14580
Matching files size: 9129755116499 (8.3TiB, 68.5%)
Matching LSN size: 9128568799332 (8.3TiB, 68.5%)

== Big on-line store database, old data regularly moved to historic partitions, interval one day

First backup size: 1355285058842 (1.2TiB)
Second backup size: 1358389467239 (1.2TiB)
Matching files count: 3937
Matching LSN count: 2821
Matching files size: 762292960220 (709.9GiB, 56.1%)
Matching LSN size: 762122543668 (709.8GiB, 56.1%)

== Ticketing system database, interval one day

First backup size: 144988275 (138.3MiB)
Second backup size: 146135155 (139.4MiB)
Matching files count: 3124
Matching LSN count: 2641
Matching files size: 76908986 (73.3MiB, 52.6%)
Matching LSN size: 67747928 (64.6MiB, 46.4%)

== Online store, interval one day

First backup size: 20418561133 (19.0GiB)
Second backup size: 20475290733 (19.1GiB)
Matching files count: 5744
Matching LSN count: 4302
Matching files size: 4432709876 (4.1GiB, 21.6%)
Matching LSN size: 4388993884 (4.1GiB, 21.4%)

== Heavily updated database, interval one week

First backup size: 3203198962 (3.0GiB)
Second backup size: 3222409202 (3.0GiB)
Matching files count: 1801
Matching LSN count: 1273
Matching files size: 91206317 (87.0MiB, 2.8%)
Matching LSN size: 69083532 (65.9MiB, 2.1%)
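
For reference, the percentages above appear to be the matching size divided by the size of the second backup, and the sizes in parentheses are in binary units. A small sketch that reproduces the "Matching files size" line of the first database (the helper names are only for illustration):

```python
def human_size(num_bytes):
    """Format a byte count using binary units, one decimal place."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return "%.1f%s" % (size, unit)
        size /= 1024
    return "%.1f%s" % (size, "TiB")


def matching_share(matching_bytes, second_backup_bytes):
    """Percentage of the second backup that matches the first one."""
    return 100.0 * matching_bytes / second_backup_bytes


# Figures taken from the geographic database above.
second = 13323511925626
matching = 9129755116499
print("%s, %.1f%%" % (human_size(matching), matching_share(matching, second)))
```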

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

#37

Bruce Momjian

bruce@momjian.us

almost 11 years ago

In reply to: Robert Haas (#33)

Re: File based Incremental backup v8

On Thu, Mar 5, 2015 at 11:10:08AM -0500, Robert Haas wrote:

But I agree with Fujii to the extent that I see little value in
committing this patch in the form proposed. Being smart enough to use
the LSN to identify changed blocks, but then sending the entirety of
every file anyway because you don't want to go to the trouble of
figuring out how to revise the wire protocol to identify the
individual blocks being sent and write the tools to reconstruct a full
backup based on that data, does not seem like enough of a win. As
Fujii says, if we ship this patch as written, people will just keep
using the timestamp-based approach anyway. Let's wait until we have
something that is, at least in some circumstances, a material
improvement over the status quo before committing anything.

The big problem I have with this patch is that it has not followed the
proper process for development, i.e. at the top of the TODO list we
have:

Desirability -> Design -> Implement -> Test -> Review -> Commit

This patch has continued in development without getting agreement on
its Desirability or Design, meaning we are going to continue going back
to those points until there is agreement. Posting more versions of this
patch is not going to change that.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#38

Gabriele Bartolini

gabriele.bartolini@2ndquadrant.it

almost 11 years ago

In reply to: Bruce Momjian (#37)

Re: File based Incremental backup v8

Hi Bruce,

2015-03-08 5:37 GMT+11:00 Bruce Momjian <bruce@momjian.us>:

Desirability -> Design -> Implement -> Test -> Review -> Commit

This patch has continued in development without getting agreement on
its Desirability or Design, meaning we are going to continue going back
to those points until there is agreement. Posting more versions of this
patch is not going to change that.

Could you please elaborate on that?

I actually think the approach that has been followed is what makes open
source and collaborative development work. The initial idea was based on a
timestamp approach which, thanks to the input of several developers, led
Marco to develop LSN-based checks and move the feature implementation
forward.

The numbers that Marco has posted clearly show that a lot of users will
benefit from this file-based approach for incremental backup through
pg_basebackup.

As far as I see it, the only missing bit is the pg_restorebackup tool which
is quite trivial - given the existing prototype in Python.

Thanks,
Gabriele

#39

Gabriele Bartolini

gabriele.bartolini@2ndquadrant.it

almost 11 years ago

In reply to: Robert Haas (#35)

Re: File based Incremental backup v8

Hi Robert,

2015-03-07 2:57 GMT+11:00 Robert Haas <robertmhaas@gmail.com>:

By the way, unless I'm missing something, this patch only seems to
include the code to construct an incremental backup, but no tools
whatsoever to do anything useful with it once you've got it.

As stated previously, Marco is writing a tool called pg_restorebackup (the
prototype in Python has already been posted) to be included in the core. I
am in Australia now and not in the office so I cannot confirm it, but I am
pretty sure he had already written it and was about to send it to the list.

He's been trying to find more data (see what he has just sent) in order
to show that even a file-based approach is useful.

I think that's 100% unacceptable.

I agree, that's why pg_restorebackup written in C is part of this patch.
See: https://wiki.postgresql.org/wiki/Incremental_backup

Users need to be able to manipulate
PostgreSQL backups using either standard operating system tools or
tools provided with PostgreSQL. Some people may prefer to use
something like repmgr or pitrtools or omnipitr in addition, but that
shouldn't be a requirement for incremental backup to be usable.

Not at all. I believe those tools will have to use pg_basebackup and
pg_restorebackup. If they want to use the streaming replication protocol,
they will be responsible for making sure that they adapt their technology
if the protocol changes.

Agile development is good, but that does not mean you can divide a big
project into arbitrarily small chunks. At some point the chunks are
too small to be sure that the overall direction is right, and/or
individually useless.

The goal has always been to provide "file-based incremental backup". I can
assure you that this has always been our compass and the direction to follow.

I repeat that, using pg_restorebackup, this patch will transparently let
users benefit from incremental backup even when it is moved to an
internal block-level logic. Users will continue to execute pg_basebackup
and pg_restorebackup, regardless of whether it is file-based (in 9.5, for
example, saving between 50-70% of space and time) or block-level (in 9.6,
for example).

My proposal is that Marco provides pg_restorebackup according to the
initial plan - a matter of hours/days.

Cheers,
Gabriele

#40

Bruce Momjian

bruce@momjian.us

almost 11 years ago

In reply to: Gabriele Bartolini (#38)

Re: File based Incremental backup v8

On Sun, Mar 8, 2015 at 09:26:38AM +1100, Gabriele Bartolini wrote:

Hi Bruce,

2015-03-08 5:37 GMT+11:00 Bruce Momjian <bruce@momjian.us>:

Desirability -> Design -> Implement -> Test -> Review -> Commit

This patch has continued in development without getting agreement on
its Desirability or Design, meaning we are going to continue going back
to those points until there is agreement. Posting more versions of this
patch is not going to change that.

Could you please elaborate on that?

I actually think the approach that has been followed is what makes open source
and collaborative development work. The initial idea was based on a timestamp
approach which, thanks to the input of several developers, led Marco to develop
LSN-based checks and move the feature implementation forward.

OK, if you think everyone should just go off on their own and work on
patches that have little chance of being accepted, you can do it, but that
rarely makes successful open source software. You can do whatever you
want with your patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

#41

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Gabriele Bartolini (#39)

Re: File based Incremental backup v8

On Sat, Mar 7, 2015 at 5:45 PM, Gabriele Bartolini
<gabriele.bartolini@2ndquadrant.it> wrote:

By the way, unless I'm missing something, this patch only seems to
include the code to construct an incremental backup, but no tools
whatsoever to do anything useful with it once you've got it.

As stated previously, Marco is writing a tool called pg_restorebackup (the
prototype in Python has already been posted) to be included in the core. I
am in Australia now and not in the office so I cannot confirm it, but I am
pretty sure he had already written it and was about to send it to the list.

Gabriele, the deadline for the last CommitFest was three weeks ago.
Having a patch that you are "about to send to the list" is not good
enough at this point.

I think that's 100% unacceptable.

I agree, that's why pg_restorebackup written in C is part of this patch.
See: https://wiki.postgresql.org/wiki/Incremental_backup

No, it *isn't* part of this patch. You may have a plan to add it to
this patch, but that's not the same thing.

The goal has always been to provide "file-based incremental backup". I can
assure you that this has always been our compass and the direction to follow.

Regardless of community feedback? OK. Let's see how that works out for you.

I repeat that, using pg_restorebackup, this patch will transparently let
users benefit from incremental backup even when it is moved to an
internal block-level logic. Users will continue to execute pg_basebackup and
pg_restorebackup, regardless of whether it is file-based (in 9.5, for example,
saving between 50-70% of space and time) or block-level (in 9.6, for example).

I understand that. But I also understand that in other cases it's
going to be slower than a full backup. This problem has been pointed
out several times, and you're just refusing to admit that it's a real
issue. A user with a bunch of tables where only the rows near the end
of the file get updated is going to repeatedly read those files until
it hits the first modified block and then rewind and reread the whole
file. I pointed this problem out back in early October and suggested
some ways of fixing it; Heikki followed up with his own suggestions
for modifying my idea. Instead of implementing any of that, or even
discussing it, you're still plugging away on a design that no
committer has endorsed and that several committers obviously have
concerns about.
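
To put rough numbers on that worst case, here is a small sketch of the read cost for a single file. It is only a model of the behaviour described above, not code from the patch: LSNs are plain integers, a block counts as modified when its LSN is greater than the reference LSN, and the function names are invented for the example.

```python
def blocks_read_file_level(block_lsns, since_lsn):
    """Blocks read for one file by a scan-then-resend strategy.

    The file is scanned block by block until a block newer than since_lsn
    is found; if one is found, the whole file is read again to send it.
    """
    for i, lsn in enumerate(block_lsns):
        if lsn > since_lsn:
            # i + 1 blocks scanned so far, then the full file is reread.
            return (i + 1) + len(block_lsns)
    # Unchanged file: scanned once to the end, nothing sent.
    return len(block_lsns)


def blocks_read_full(block_lsns):
    """A full backup reads every block exactly once."""
    return len(block_lsns)


# Hypothetical 1000-block file where only the last block changed.
lsns = [10] * 999 + [99]
print(blocks_read_file_level(lsns, since_lsn=50), blocks_read_full(lsns))
```

Under these assumptions the file-level scan reads the file twice when the only modified block is the last one, while a file modified in its first block costs almost nothing extra.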

It's also pretty clear that nobody likes the backup profile, at least
in the form it exists today. But it's still here, many patch versions
later.

I think there's absolutely no point in spending more time on this for
9.5. At least 4 committers have looked at it and none of them are
convinced by the current design; feedback from almost half a year ago
hasn't been incorporated; obviously-needed parts of the patch
(pg_restorebackup) are missing weeks after the last CF deadline.
Let's mark this Rejected in the CF app and move on.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#42

Michael Paquier

michael.paquier@gmail.com

almost 11 years ago

In reply to: Robert Haas (#41)

Re: File based Incremental backup v8

On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think there's absolutely no point in spending more time on this for
9.5. At least 4 committers have looked at it and none of them are
convinced by the current design; feedback from almost half a year ago
hasn't been incorporated; obviously-needed parts of the patch
(pg_restorebackup) are missing weeks after the last CF deadline.
Let's mark this Rejected in the CF app and move on.

Agreed. I lost a bit of interest in this patch lately, but if all the
necessary parts of the patch were not posted before the CF deadline
that's not something we should consider for integration at this point.
Let's give it a couple of months of fresh air and, Gabriele, I am sure
you will be able to come back with something far more advanced for the
first CF of 9.6.
--
Michael

#43

David Fetter

david@fetter.org

almost 10 years ago

In reply to: Michael Paquier (#42)

Re: File based Incremental backup v8

On Tue, Mar 10, 2015 at 08:25:27AM +0900, Michael Paquier wrote:

On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think there's absolutely no point in spending more time on this for
9.5. At least 4 committers have looked at it and none of them are
convinced by the current design; feedback from almost half a year ago
hasn't been incorporated; obviously-needed parts of the patch
(pg_restorebackup) are missing weeks after the last CF deadline.
Let's mark this Rejected in the CF app and move on.

Agreed. I lost a bit of interest in this patch lately, but if all the
necessary parts of the patch were not posted before the CF deadline
that's not something we should consider for integration at this point.
Let's give it a couple of months of fresh air and, Gabriele, I am sure
you will be able to come back with something far more advanced for the
first CF of 9.6.

What's the latest on this patch?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#44

Michael Paquier

michael.paquier@gmail.com

almost 10 years ago

In reply to: David Fetter (#43)

Re: File based Incremental backup v8

On Tue, Jan 26, 2016 at 12:55 AM, David Fetter <david@fetter.org> wrote:

On Tue, Mar 10, 2015 at 08:25:27AM +0900, Michael Paquier wrote:

On Tue, Mar 10, 2015 at 1:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think there's absolutely no point in spending more time on this for
9.5. At least 4 committers have looked at it and none of them are
convinced by the current design; feedback from almost half a year ago
hasn't been incorporated; obviously-needed parts of the patch
(pg_restorebackup) are missing weeks after the last CF deadline.
Let's mark this Rejected in the CF app and move on.

Agreed. I lost a bit of interest in this patch lately, but if all the
necessary parts of the patch were not posted before the CF deadline
that's not something we should consider for integration at this point.
Let's give it a couple of months of fresh air and, Gabriele, I am sure
you will be able to come back with something far more advanced for the
first CF of 9.6.

What's the latest on this patch?

My guess is that Marco and Gabriele are working on something directly
for barman, the backup tool they use, with a differential backup
implementation based on tracking blocks modified by WAL records (far
faster for large data sets than scanning all the relation files of
PGDATA).
Regards,
--
Michael
