Patch: restrict pg_rewind to whitelisted directories
The attached patch is cleaned up and filed for the commit fest this next
month:
Here's the full commit message via Mercurial. I will likely have a new
branch per version the patch since that's the closest thing to a rebase in
this version control system.
changeset: 60492:47f87a2d2fa1
tag: mine/pg_rewind_restrict_dirs
parent: 60446:e638ba9c3c11
user: Chris Travers <chris.travers@gmail.com>
date: Mon Oct 30 12:25:18 2017 +0100
files: doc/src/sgml/ref/pg_rewind.sgml src/bin/pg_rewind/copy_fetch.c
src/bin/pg_rewind/fetch.c src/bin/pg_rewind/fetch.h
src/bin/pg_rewind/libpq_fetch.c src/bin/pg_rewind/pg_rewind.c
src/bin/pg_rewind/t/003_extrafiles.pl
description:
Restrict pg_rewind to whitelisted directories.
This is intended to be a minimum working version and in fact builds and
passes tests.
Note that tests for extra files have been changed to reflect new behavior
and additional
debugging informnation added in to output in case of failure.
The patch iterates through a series of set directories to synchronize them
only. This improves
predictability of the complete state of the system after a rewind.
One important outstanding question here is whether we need to ensure the
possibility of backing
up other files if they exist via an --include-path command line switch
(this would not be a glob).
In the thread discussing this patch, Michael Paquier has expressed concern
about configuration
files created by extensions or other components not being copied. I could
add such a switch but
the patch is long enough, and it is unclear enough to the extent this is
needed at present, so
I am leaving it at the reviewer's discretion whether I should add this here
or submit a second
patch later to add the ability to add additional paths to the filemap.
Either way, it is worth noting that I expect to have a subsequent patch
either incorporated here or in a further submission that takes this and
adds the ability to include additional directories or files via a command
line flag. This will *not* be a shell glob but one directory or file per
invocation of the switch (similar to -t in pg_dump).
--
Best Regards,
Chris Travers
Database Administrator
Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin
Attachments:
pg_rewind_restrict_dirs.v2.patchapplication/octet-stream; name=pg_rewind_restrict_dirs.v2.patchDownload
diff -r e638ba9c3c11 -r e37682f34fe1 doc/src/sgml/ref/pg_rewind.sgml
--- a/doc/src/sgml/ref/pg_rewind.sgml Sun Oct 29 12:58:40 2017 +0530
+++ b/doc/src/sgml/ref/pg_rewind.sgml Mon Oct 30 14:09:56 2017 +0100
@@ -48,17 +48,27 @@
</para>
<para>
- The result is equivalent to replacing the target data directory with the
- source one. Only changed blocks from relation files are copied;
- all other files are copied in full, including configuration files. The
- advantage of <application>pg_rewind</application> over taking a new base backup, or
- tools like <application>rsync</application>, is that <application>pg_rewind</application> does
- not require reading through unchanged blocks in the cluster. This makes
+ The result is equivalent to replacing the data-related files in the target
+ data directory with the source one. Only changed blocks from relation files
+ are copied; all other files relating to control or WAL information are copied
+ in full. The advantage of <application>pg_rewind</> over taking a new base
+ backup, or tools like <application>rsync</>, is that <application>pg_rewind</>
+ does not require reading through unchanged blocks in the cluster. This makes
it a lot faster when the database is large and only a small
fraction of blocks differ between the clusters.
</para>
<para>
+ A second advantage is predictability. <application>pg_rewind</> is aware of
+ what directories are relevant to restoring replication and which ones are not.
+ The result is that you get something of a guaranteed state at the end. Log
+ files on the source are unlikely to clobber those on the client. The
+ <filename>postgresql.conf.auto</> is unlikely to be copied over. Replication
+ slot information is removed (and must be added again), and so forth. Your
+ system is as it had been before, but the data is synchronized from the master.
+ </para>
+
+ <para>
<application>pg_rewind</application> examines the timeline histories of the source
and target clusters to determine the point where they diverged, and
expects to find WAL in the target cluster's <filename>pg_wal</filename> directory
@@ -84,7 +94,9 @@
<application>pg_rewind</application> session, it must be made available when the
target server is started. This can be done by creating a
<filename>recovery.conf</filename> file in the target data directory with a
- suitable <varname>restore_command</varname>.
+ suitable <varname>restore_command</varbane>. Alternatively replication slots can
+ be set just before promotion to ensure that the wal files have not been
+ removed.
</para>
<para>
@@ -206,7 +218,7 @@
<para>
The basic idea is to copy all file system-level changes from the source
- cluster to the target cluster:
+ cluster to the target cluster if they implicate the data stored:
</para>
<procedure>
@@ -229,9 +241,7 @@
</step>
<step>
<para>
- Copy all other files such as <filename>pg_xact</filename> and
- configuration files from the source cluster to the target cluster
- (everything except the relation files).
+ Copy wal-related files such as those in <filename>pg_xact</filename>.
</para>
</step>
<step>
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/copy_fetch.c
--- a/src/bin/pg_rewind/copy_fetch.c Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/copy_fetch.c Mon Oct 30 14:09:56 2017 +0100
@@ -26,6 +26,28 @@
static void recurse_dir(const char *datadir, const char *path,
process_file_callback_t callback);
+/* List of directories to synchronize:
+ * base data dirs (and ablespaces)
+ * wal/transaction data
+ * and that is it.
+ *
+ * This array is null-terminated to make
+ * it easy to expand
+ */
+
+const char *rewind_dirs[] = {
+ "base", // Default tablespace
+ "global", // global tablespace
+ "pg_commit_ts", // In case we need to do PITR before up to sync
+ "pg_logical", // WAL related and no good reason to exclude
+ "pg_multixact", // WAL related and may need for vacuum-related reasons
+ "pg_tblspc", // Pther tablespaces
+ "pg_twophase", // mostly to *clear*
+ "pg_wal", // WAL
+ "pg_xact", // Commits of transactions
+ NULL
+};
+
static void execute_pagemap(datapagemap_t *pagemap, const char *path);
/*
@@ -33,13 +55,16 @@
* for each file.
*/
void
-traverse_datadir(const char *datadir, process_file_callback_t callback)
+traverse_rewinddirs(const char *datadir, process_file_callback_t callback)
{
- recurse_dir(datadir, NULL, callback);
+ int i;
+ for(i = 0; rewind_dirs[i] != NULL; i++){
+ recurse_dir(datadir, rewind_dirs[i], callback);
+ }
}
/*
- * recursive part of traverse_datadir
+ * recursive part of traverse_rewinddirs
*
* parentpath is the current subdirectory's path relative to datadir,
* or NULL at the top level.
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/fetch.c
--- a/src/bin/pg_rewind/fetch.c Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/fetch.c Mon Oct 30 14:09:56 2017 +0100
@@ -28,7 +28,7 @@
fetchSourceFileList(void)
{
if (datadir_source)
- traverse_datadir(datadir_source, &process_source_file);
+ traverse_rewinddirs(datadir_source, &process_source_file);
else
libpqProcessFileList();
}
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/fetch.h
--- a/src/bin/pg_rewind/fetch.h Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/fetch.h Mon Oct 30 14:09:56 2017 +0100
@@ -36,9 +36,11 @@
extern XLogRecPtr libpqGetCurrentXlogInsertLocation(void);
/* in copy_fetch.c */
+extern const char *rewind_dirs[];
extern void copy_executeFileMap(filemap_t *map);
typedef void (*process_file_callback_t) (const char *path, file_type_t type, size_t size, const char *link_target);
extern void traverse_datadir(const char *datadir, process_file_callback_t callback);
+extern void traverse_rewinddirs(const char *datadir, process_file_callback_t callback);
#endif /* FETCH_H */
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/libpq_fetch.c
--- a/src/bin/pg_rewind/libpq_fetch.c Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/libpq_fetch.c Mon Oct 30 14:09:56 2017 +0100
@@ -147,7 +147,7 @@
{
PGresult *res;
const char *sql;
- int i;
+ int i, p;
/*
* Create a recursive directory listing of the whole data directory.
@@ -163,7 +163,8 @@
"WITH RECURSIVE files (path, filename, size, isdir) AS (\n"
" SELECT '' AS path, filename, size, isdir FROM\n"
" (SELECT pg_ls_dir('.', true, false) AS filename) AS fn,\n"
- " pg_stat_file(fn.filename, true) AS this\n"
+ " LATERAL pg_stat_file(fn.filename, true) AS this\n"
+ " WHERE filename = $1\n"
" UNION ALL\n"
" SELECT parent.path || parent.filename || '/' AS path,\n"
" fn, this.size, this.isdir\n"
@@ -177,44 +178,56 @@
"FROM files\n"
"LEFT OUTER JOIN pg_tablespace ON files.path = 'pg_tblspc/'\n"
" AND oid::text = files.filename\n";
- res = PQexec(conn, sql);
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
- pg_fatal("could not fetch file list: %s",
+ /* Going through the directories in a loop. Doing it this way
+ * makes it easier to add more inclusions later.
+ *
+ * Note that the query filters out on top-level directories before
+ * recursion so this will not give us problems in terms of listing
+ * lots of files many times.
+ */
+ for (p = 0; rewind_dirs[p] != NULL; ++p)
+ {
+ const char *paths[1];
+ paths[0] = rewind_dirs[p];
+ res = PQexecParams(conn, sql, 1, NULL, paths, NULL, NULL, 0);
+
+ if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ pg_fatal("could not fetch file list: %s",
PQresultErrorMessage(res));
- /* sanity check the result set */
- if (PQnfields(res) != 4)
- pg_fatal("unexpected result set while fetching file list\n");
+ /* sanity check the result set */
+ if (PQnfields(res) != 4)
+ pg_fatal("unexpected result set while fetching file list\n");
- /* Read result to local variables */
- for (i = 0; i < PQntuples(res); i++)
- {
- char *path = PQgetvalue(res, i, 0);
- int64 filesize = atol(PQgetvalue(res, i, 1));
- bool isdir = (strcmp(PQgetvalue(res, i, 2), "t") == 0);
- char *link_target = PQgetvalue(res, i, 3);
- file_type_t type;
-
- if (PQgetisnull(res, 0, 1))
+ /* Read result to local variables */
+ for (i = 0; i < PQntuples(res); i++)
{
- /*
- * The file was removed from the server while the query was
- * running. Ignore it.
- */
- continue;
- }
+ char *path = PQgetvalue(res, i, 0);
+ int64 filesize = atol(PQgetvalue(res, i, 1));
+ bool isdir = (strcmp(PQgetvalue(res, i, 2), "t") == 0);
+ char *link_target = PQgetvalue(res, i, 3);
+ file_type_t type;
- if (link_target[0])
- type = FILE_TYPE_SYMLINK;
- else if (isdir)
- type = FILE_TYPE_DIRECTORY;
- else
- type = FILE_TYPE_REGULAR;
+ if (PQgetisnull(res, 0, 1))
+ {
+ /*
+ * The file was removed from the server while the query was
+ * running. Ignore it.
+ */
+ continue;
+ }
- process_source_file(path, type, filesize, link_target);
+ if (link_target[0])
+ type = FILE_TYPE_SYMLINK;
+ else if (isdir)
+ type = FILE_TYPE_DIRECTORY;
+ else
+ type = FILE_TYPE_REGULAR;
+ process_source_file(path, type, filesize, link_target);
+ }
+ PQclear(res);
}
- PQclear(res);
}
/*----
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/pg_rewind.c
--- a/src/bin/pg_rewind/pg_rewind.c Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/pg_rewind.c Mon Oct 30 14:09:56 2017 +0100
@@ -287,7 +287,7 @@
pg_log(PG_PROGRESS, "reading source file list\n");
fetchSourceFileList();
pg_log(PG_PROGRESS, "reading target file list\n");
- traverse_datadir(datadir_target, &process_target_file);
+ traverse_rewinddirs(datadir_target, &process_target_file);
/*
* Read the target WAL from last checkpoint before the point of fork, to
diff -r e638ba9c3c11 -r e37682f34fe1 src/bin/pg_rewind/t/003_extrafiles.pl
--- a/src/bin/pg_rewind/t/003_extrafiles.pl Sun Oct 29 12:58:40 2017 +0530
+++ b/src/bin/pg_rewind/t/003_extrafiles.pl Mon Oct 30 14:09:56 2017 +0100
@@ -71,12 +71,12 @@
"$test_master_datadir/tst_both_dir/both_file2",
"$test_master_datadir/tst_both_dir/both_subdir",
"$test_master_datadir/tst_both_dir/both_subdir/both_file3",
- "$test_master_datadir/tst_standby_dir",
- "$test_master_datadir/tst_standby_dir/standby_file1",
- "$test_master_datadir/tst_standby_dir/standby_file2",
- "$test_master_datadir/tst_standby_dir/standby_subdir",
-"$test_master_datadir/tst_standby_dir/standby_subdir/standby_file3" ],
- "file lists match");
+ "$test_master_datadir/tst_master_dir",
+ "$test_master_datadir/tst_master_dir/master_file1",
+ "$test_master_datadir/tst_master_dir/master_file2",
+ "$test_master_datadir/tst_master_dir/master_subdir",
+ "$test_master_datadir/tst_master_dir/master_subdir/master_file3", ],
+ "file lists match") or (diag("Files found:"), diag(explain(\@paths)));
RewindTest::clean_rewind_test();
}
On Mon, Oct 30, 2017 at 6:44 PM, Chris Travers <chris.travers@adjust.com> wrote:
The attached patch is cleaned up and filed for the commit fest this next
month:
It's generally better to post the patch on the same message as the
discussion thread, or at least link back to the discussion thread from
the new one. Otherwise people may have trouble understanding the
history.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 31, 2017 at 1:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 30, 2017 at 6:44 PM, Chris Travers <chris.travers@adjust.com>
wrote:The attached patch is cleaned up and filed for the commit fest this next
month:It's generally better to post the patch on the same message as the
discussion thread, or at least link back to the discussion thread from
the new one. Otherwise people may have trouble understanding the
history.
Fair point. Mostly concerned about the WIP marker there. This is my first
time around this.
I will then retire this thread and attach the patch on the other one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Best Regards,
Chris Travers
Database Administrator
Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin