PATCH: Exclude additional directories in pg_basebackup
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.
The attached patch adds these directories to the pg_basebackup
exclusions and takes an array-based approach to excluding directories
and files during backup.
I also incorporated Ashutosh's patch to fix corruption when
pg_stat_tmp/pg_replslot are links [1]/messages/by-id/CAE9k0Pm7=x_o0W7E2b2s2cWcZdcBGczGdrxttzXOZGp8bEBcGw@mail.gmail.com/. This logic has been extended to
all excluded directories.
Perhaps these patches should be merged in the CF but first I'd like to
see if anyone can identify problems with the additional exclusions.
Thanks,
--
-David
david@pgmasters.net
[1]: /messages/by-id/CAE9k0Pm7=x_o0W7E2b2s2cWcZdcBGczGdrxttzXOZGp8bEBcGw@mail.gmail.com/
/messages/by-id/CAE9k0Pm7=x_o0W7E2b2s2cWcZdcBGczGdrxttzXOZGp8bEBcGw@mail.gmail.com/
Attachments:
basebackup-exclusions-v1.patchtext/plain; charset=UTF-8; name=basebackup-exclusions-v1.patch; x-mac-creator=0; x-mac-type=0Download+139-62
David,
* David Steele (david@pgmasters.net) wrote:
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.
That would be here:
b4e94836-786b-6020-e1b3-3d7390f95497@aiven.io
Thanks!
Stephen
On 8/15/16 2:39 PM, David Steele wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.
If someone wanted to scratch an itch, it would also be useful to put
things that are zeroed under a single dedicated directory, so that folks
who wanted to could mount that on a ram drive. It would also be useful
to do that for stuff that's only reset on crash (to put it on local
storage as opposed to a SAN).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8/15/16 4:58 PM, Jim Nasby wrote:
On 8/15/16 2:39 PM, David Steele wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.If someone wanted to scratch an itch, it would also be useful to put
things that are zeroed under a single dedicated directory, so that folks
who wanted to could mount that on a ram drive. It would also be useful
to do that for stuff that's only reset on crash (to put it on local
storage as opposed to a SAN).
I definitely have something like that in mind and thought this was a
good place to start.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.
Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas wrote:
Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.
Hmm. At least async.c (pg_notify) deletes the whole directory at
startup so it's fine, but for the others (pg_serial, pg_subtrans) I
think it'd be saner to keep any "active" files (probably just the
latest). I don't remember how pg_snapshot works, but it's probably fine
to start with an empty subdir (is it possible to export a snapshot from
a prepared transaction?)
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Robert,
On 8/17/16 11:27 AM, Robert Haas wrote:
On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.
I've done pretty extensive testing in pgBackRest and haven't seen issues
in any supported version (plus I audited each init() function for every
version back to where it was introduced). The patch also passes all the
pg_basebackup TAP tests in master.
If you are correct it may indicate a problem anyway. Consider a standby
backup where the files in these directories may be incredibly stale
since they are not replicated. Once restored to a master should we
trust anything in these files?
pg_serial, pg_notify, pg_subtrans are not even fsync'd
(SlruCtl->do_fsync = false). It's hard to imagine there's anything of
value in there or that it can be trusted if there is.
The files in pg_snapshot and pg_dynshmem are simply deleted on startup
so that seems safe enough.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Aug 17, 2016 at 2:50 PM, David Steele <david@pgmasters.net> wrote:
Hi Robert,
On 8/17/16 11:27 AM, Robert Haas wrote:
On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.I've done pretty extensive testing in pgBackRest and haven't seen issues
in any supported version (plus I audited each init() function for every
version back to where it was introduced). The patch also passes all the
pg_basebackup TAP tests in master.If you are correct it may indicate a problem anyway. Consider a standby
backup where the files in these directories may be incredibly stale
since they are not replicated. Once restored to a master should we
trust anything in these files?pg_serial, pg_notify, pg_subtrans are not even fsync'd
(SlruCtl->do_fsync = false). It's hard to imagine there's anything of
value in there or that it can be trusted if there is.
It's not just a question of whether the data has value; it's a
question of whether the SLRU code will handle the situation correctly
in all cases if the directory contains no files. I don't think you
can draw a firm conclusion on that without reading the code.
The files in pg_snapshot and pg_dynshmem are simply deleted on startup
so that seems safe enough.
Agreed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Aug 17, 2016 at 1:50 PM, David Steele <david@pgmasters.net> wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.I've done pretty extensive testing in pgBackRest and haven't seen issues
in any supported version (plus I audited each init() function for every
version back to where it was introduced). The patch also passes all the
pg_basebackup TAP tests in master.If you are correct it may indicate a problem anyway. Consider a standby
backup where the files in these directories may be incredibly stale
since they are not replicated. Once restored to a master should we
trust anything in these files?pg_serial, pg_notify, pg_subtrans are not even fsync'd
(SlruCtl->do_fsync = false). It's hard to imagine there's anything of
value in there or that it can be trusted if there is.
The contents of pg_serial could be deleted safely. As I recall, I
initially had it cleaned out on startup, but Heikki (who was the
main committer for this feature) added SLRU buffer flushing for it
on checkpoint and took out the startup delete code with the
explanation that he thought someone might want to look at the file
contents for debugging purposes. I would be a bit surprised if
anyone ever has used it for that, but that is the reason the
contents are not deleted just like pg_snapshot and pg_dynshmem.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8/17/16 2:53 PM, Robert Haas wrote:
On Wed, Aug 17, 2016 at 2:50 PM, David Steele <david@pgmasters.net> wrote:
Hi Robert,
On 8/17/16 11:27 AM, Robert Haas wrote:
On Mon, Aug 15, 2016 at 3:39 PM, David Steele <david@pgmasters.net> wrote:
Recently a hacker proposed a patch to add pg_dynshmem to the list of
directories whose contents are excluded in pg_basebackup. I wasn't able
to find the original email despite several attempts.That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.Eh ... I doubt very much that it's safe to blow away the entire
contents of an SLRU between shutdown and startup, even if the data is
technically transient data that won't be needed again after the system
is reset.If you are correct it may indicate a problem anyway. Consider a standby
backup where the files in these directories may be incredibly stale
since they are not replicated. Once restored to a master should we
trust anything in these files?pg_serial, pg_notify, pg_subtrans are not even fsync'd
(SlruCtl->do_fsync = false). It's hard to imagine there's anything of
value in there or that it can be trusted if there is.It's not just a question of whether the data has value; it's a
question of whether the SLRU code will handle the situation correctly
in all cases if the directory contains no files. I don't think you
can draw a firm conclusion on that without reading the code.
A good point, Robert. I read the code but I should have shared my
findings in the original post. I'll only cover the the cases we have not
already decided were safe (those that explicitly delete files in their
init routines, pg_snapshots and pg_dynshmem).
First, SlruPhysicalWritePage() will create a file/page that does not
exist and SlruPhysicalReadPage() will return zeros for a file/page that
does not exist during recovery. Not that I think the latter is
important for pg_serial, pg_notify, or pg_subtrans since they are not
WAL-logged. As far as I can see the slru system is very resilient overall.
For pg_notify, AsyncShmemInit() explicitly deletes all files on startup.
For pg_subtrans, StartupSUBTRANS() explicitly zeroes all required pages
but does not flush them to disk. Since SlruPhysicalWritePage() is
tolerant of missing files/pages this should be fine.
For pg_serial, OldSerXidInit() doesn't zero anything out since it
ignores the old transactions and they get truncated as the transaction
horizon moves. The old data hangs around for a little while so it could
theoretically be used for debugging as Kevin notes. Since pg_serial
would only be cleared after a restore I don't see that as a big issue.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Aug 18, 2016 at 1:35 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
I don't remember how pg_snapshot works, but it's probably fine
to start with an empty subdir (is it possible to export a snapshot from
a prepared transaction?)
From xact.c:
/*
* Likewise, don't allow PREPARE after pg_export_snapshot. This could be
* supported if we added cleanup logic to twophase.c, but for now it
* doesn't seem worth the trouble.
*/
if (XactHasExportedSnapshots())
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot PREPARE a transaction that has exported snapshots")));
So that's fine.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8/17/16 7:56 PM, Michael Paquier wrote:
On Thu, Aug 18, 2016 at 1:35 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:I don't remember how pg_snapshot works, but it's probably fine
to start with an empty subdir (is it possible to export a snapshot from
a prepared transaction?)From xact.c:
/*
* Likewise, don't allow PREPARE after pg_export_snapshot. This could be
* supported if we added cleanup logic to twophase.c, but for now it
* doesn't seem worth the trouble.
*/
if (XactHasExportedSnapshots())
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot PREPARE a transaction that has exported snapshots")));
So that's fine.
Thank you for tracking that down, Michael.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8/15/16 3:39 PM, David Steele wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.The attached patch adds these directories to the pg_basebackup
exclusions and takes an array-based approach to excluding directories
and files during backup.
We do support other backup methods besides using pg_basebackup. So I
think we need to document the required or recommended handling of each
of these directories. And then pg_basebackup should become a consumer
of that documentation.
The current documentation on this is at
<https://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA>,
which covers pg_xlog and pg_replslot. I think that documentation should
be expanded, maybe with a simple list that is easy to copy into an
exclude file, following by more detail on each directory.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/1/16 9:53 AM, Peter Eisentraut wrote:
On 8/15/16 3:39 PM, David Steele wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.The attached patch adds these directories to the pg_basebackup
exclusions and takes an array-based approach to excluding directories
and files during backup.We do support other backup methods besides using pg_basebackup. So I
think we need to document the required or recommended handling of each
of these directories. And then pg_basebackup should become a consumer
of that documentation.The current documentation on this is at
<https://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA>,
which covers pg_xlog and pg_replslot. I think that documentation should
be expanded, maybe with a simple list that is easy to copy into an
exclude file, following by more detail on each directory.
Attached is a new patch that adds sgml documentation. I can expand on
each directory individually if you think that's necessary, but thought
it was better to lump them into a few categories.
--
-David
david@pgmasters.net
Attachments:
basebackup-exclusions-v2.patchtext/plain; charset=UTF-8; name=basebackup-exclusions-v2.patch; x-mac-creator=0; x-mac-type=0Download+155-62
On Wed, Sep 7, 2016 at 12:16 AM, David Steele <david@pgmasters.net> wrote:
On 9/1/16 9:53 AM, Peter Eisentraut wrote:
On 8/15/16 3:39 PM, David Steele wrote:
That patch got me thinking about what else could be excluded and after
some investigation I found the following: pg_notify, pg_serial,
pg_snapshots, pg_subtrans. These directories are all cleaned, zeroed,
or rebuilt on server start.The attached patch adds these directories to the pg_basebackup
exclusions and takes an array-based approach to excluding directories
and files during backup.We do support other backup methods besides using pg_basebackup. So I
think we need to document the required or recommended handling of each
of these directories. And then pg_basebackup should become a consumer
of that documentation.The current documentation on this is at
<https://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA>,
which covers pg_xlog and pg_replslot. I think that documentation should
be expanded, maybe with a simple list that is easy to copy into an
exclude file, following by more detail on each directory.Attached is a new patch that adds sgml documentation. I can expand on each
directory individually if you think that's necessary, but thought it was
better to lump them into a few categories.
+ be ommitted from the backup as they will be initialized on postmaster
+ startup. If the <xref linkend="GUC-STATS-TEMP-DIRECTORY"> is set and is
+ under the database cluster directory then the contents of the directory
+ specified by <xref linkend="GUC-STATS-TEMP-DIRECTORY"> can also
be ommitted.
s/ommitted/omitted/
+#define EXCLUDE_DIR_MAX 8
+#define EXCLUDE_DIR_STAT_TMP 0
+
+const char *excludeDirContents[EXCLUDE_DIR_MAX] =
+{
+ /*
+ * Skip temporary statistics files. The first array position will be
+ * filled with the value of pgstat_stat_directory relative to PGDATA.
+ * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is set
+ * because PGSS_TEXT_FILE is always created there.
+ */
+ NULL,
I find that ugly. I'd rather use an array with undefined size for the
fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and
EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on
_tarWriteHeader...
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/6/16 10:25 PM, Michael Paquier wrote:
On Wed, Sep 7, 2016 at 12:16 AM, David Steele <david@pgmasters.net> wrote:
Attached is a new patch that adds sgml documentation. I can expand on each
directory individually if you think that's necessary, but thought it was
better to lump them into a few categories.+ be ommitted from the backup as they will be initialized on postmaster + startup. If the <xref linkend="GUC-STATS-TEMP-DIRECTORY"> is set and is + under the database cluster directory then the contents of the directory + specified by <xref linkend="GUC-STATS-TEMP-DIRECTORY"> can also be ommitted.s/ommitted/omitted/
Thanks!
+#define EXCLUDE_DIR_MAX 8 +#define EXCLUDE_DIR_STAT_TMP 0 + +const char *excludeDirContents[EXCLUDE_DIR_MAX] = +{ + /* + * Skip temporary statistics files. The first array position will be + * filled with the value of pgstat_stat_directory relative to PGDATA. + * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is set + * because PGSS_TEXT_FILE is always created there. + */ + NULL, I find that ugly. I'd rather use an array with undefined size for the fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on _tarWriteHeader...
My goal was to be able to fully reuse the code that creates the paths,
but this could also be done by following your suggestion and also moving
the path code into a function.
Any opinion on this, Peter?
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/7/16 8:21 AM, David Steele wrote:
On 9/6/16 10:25 PM, Michael Paquier wrote:
I find that ugly. I'd rather use an array with undefined size for the
fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and
EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on
_tarWriteHeader...My goal was to be able to fully reuse the code that creates the paths,
but this could also be done by following your suggestion and also moving
the path code into a function.
I just realized all I did was repeat what you said...
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/6/16 10:25 PM, Michael Paquier wrote:
On Wed, Sep 7, 2016 at 12:16 AM, David Steele <david@pgmasters.net> wrote:
Attached is a new patch that adds sgml documentation. I can expand on each
directory individually if you think that's necessary, but thought it was
better to lump them into a few categories.+ be ommitted from the backup as they will be initialized on postmaster + startup. If the <xref linkend="GUC-STATS-TEMP-DIRECTORY"> is set and is + under the database cluster directory then the contents of the directory + specified by <xref linkend="GUC-STATS-TEMP-DIRECTORY"> can also be ommitted.s/ommitted/omitted/
Fixed.
+#define EXCLUDE_DIR_MAX 8 +#define EXCLUDE_DIR_STAT_TMP 0 + +const char *excludeDirContents[EXCLUDE_DIR_MAX] = +{ + /* + * Skip temporary statistics files. The first array position will be + * filled with the value of pgstat_stat_directory relative to PGDATA. + * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is set + * because PGSS_TEXT_FILE is always created there. + */ + NULL, I find that ugly. I'd rather use an array with undefined size for the fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on _tarWriteHeader...
Done. Also writing out pg_xlog with the new routine.
--
-David
david@pgmasters.net
Attachments:
basebackup-exclusions-v3.patchtext/plain; charset=UTF-8; name=basebackup-exclusions-v3.patch; x-mac-creator=0; x-mac-type=0Download+186-68
On Fri, Sep 9, 2016 at 10:18 PM, David Steele <david@pgmasters.net> wrote:
On 9/6/16 10:25 PM, Michael Paquier wrote:
On Wed, Sep 7, 2016 at 12:16 AM, David Steele <david@pgmasters.net> wrote:
Attached is a new patch that adds sgml documentation. I can expand on
each
directory individually if you think that's necessary, but thought it was
better to lump them into a few categories.+ be ommitted from the backup as they will be initialized on postmaster + startup. If the <xref linkend="GUC-STATS-TEMP-DIRECTORY"> is set and is + under the database cluster directory then the contents of the directory + specified by <xref linkend="GUC-STATS-TEMP-DIRECTORY"> can also be ommitted.s/ommitted/omitted/
Fixed.
+#define EXCLUDE_DIR_MAX 8 +#define EXCLUDE_DIR_STAT_TMP 0 + +const char *excludeDirContents[EXCLUDE_DIR_MAX] = +{ + /* + * Skip temporary statistics files. The first array position will be + * filled with the value of pgstat_stat_directory relative to PGDATA. + * PG_STAT_TMP_DIR must be skipped even when stats_temp_directory is set + * because PGSS_TEXT_FILE is always created there. + */ + NULL, I find that ugly. I'd rather use an array with undefined size for the fixed elements finishing by NULL, remove EXCLUDE_DIR_MAX and EXCLUDE_DIR_STAT_TMP and use a small routine to do the work done on _tarWriteHeader...Done. Also writing out pg_xlog with the new routine.
Thanks, this looks in far better shape now.
+ /* Removed on startup, see DeleteAllExportedSnapshotFiles(). */
+ "pg_snapshots",
Using SNAPSHOT_EXPORT_DIR is possible here.
+_tarWriteDir(char *pathbuf, int basepathlen, struct stat *statbuf,
+ bool sizeonly)
That's nice, thanks! This even takes care of the fact when directories
other than pg_xlog are symlinks or junction points.
+ /* Recreated on startup, see StartupReplicationSlots(). */
+ "pg_replslot",
This comment is incorrect, pg_replslot is skipped in a base backup
because that's just useless.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
The comments on excludeDirContents are somewhat imprecise. For
example, none of those directories are actually removed or recreated
on startup, only the contents are.
The comment for pg_replslot is incorrect. I think you can copy
replication slots just fine, but you usually don't want to.
What is the 2 for in this code?
if (strcmp(pathbuf + 2, excludeFile[excludeIdx]) == 0)
I would keep the signature of _tarWriteDir() similar to
_tarWriteHeader(). So don't pass sizeonly in there, or do pass in
sizeonly but do the same with _tarWriteHeader().
The documentation in pg_basebackup.sgml and protocol.sgml should be
updated.
Add some tests. At least test that one entry from the directory list
and one entry from the files list is not contained in the backup
result. Testing the symlink handling would also be good. (The
pg_basebackup tests claim that Windows doesn't support symlinks and
therefore skip all the symlink tests. That seems a bit at odds with
your code handling symlinks on Windows.)
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers