pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
Hi All,
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory and
found that on the backup location pg_stat_tmp, pg_repl_slot is a corrupt
file rather than a link or directory whereas pg_clog and pg_log are getting
skipped. As per the documentation of pg_basebackup, symbolic links on any
directories
other than tablespace and xlog should be skipped. But this statement is not
true for pg_replslot and pg_stat_tmp. The reason is as follows:
pg_basebackup is expecting pg_stat_tmp/pg_replslot to be a directory and
irrespective of whether pg_stat_tmp is empty or not, it will always
include it as a empty directory in backup path. Now, in my case i have
created a softlink for pg_stat_tmp/pg_replslot and pg_basebackup is trying
to create a tar format header without changing the filemode as it does in
case of pg_xlog. Also, linkpath is not considered as in case of pg_tblspc.
This is the reason why a regular file is getting created in the backup path
even though i have a symbolic link in the source path but ideally it should
be skipped.
*Solution:* Skip pg_stat_tmp and pg_replslot if they are symbolic link.
Attached is the patch that fix this issue.
With Regards,
Ashutosh Sharma
EnterpriseDB: *http://www.enterprisedb.com <http://www.enterprisedb.com>*
Attachments:
Allow_pg_basebackup_to_skip_symlinks_if_not_tblspc_and_xlog.patchtext/x-patch; charset=US-ASCII; name=Allow_pg_basebackup_to_skip_symlinks_if_not_tblspc_and_xlog.patchDownload
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 1008873..5163b85 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -970,9 +970,25 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
strncmp(de->d_name, PG_STAT_TMP_DIR, strlen(PG_STAT_TMP_DIR)) == 0)
{
if (!sizeonly)
- _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
- size += 512;
- continue;
+ {
+ /* If pg_stat_tmp is a symlink, skip it */
+#ifndef WIN32
+ if (S_ISLNK(statbuf.st_mode))
+#else
+ if (pgwin32_is_junction(pathbuf))
+#endif
+ {
+ ereport(WARNING,
+ (errmsg("skipping special file \"%s\"", pathbuf)));
+ continue;
+ }
+ else
+ {
+ _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
+ size += 512;
+ continue;
+ }
+ }
}
/*
@@ -982,9 +998,25 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
if (strcmp(de->d_name, "pg_replslot") == 0)
{
if (!sizeonly)
- _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
- size += 512; /* Size of the header just added */
- continue;
+ {
+ /* If pg_replslot is a symlink, skip it */
+#ifndef WIN32
+ if (S_ISLNK(statbuf.st_mode))
+#else
+ if (pgwin32_is_junction(pathbuf))
+#endif
+ {
+ ereport(WARNING,
+ (errmsg("skipping special file \"%s\"", pathbuf)));
+ continue;
+ }
+ else
+ {
+ _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
+ size += 512; /* Size of the header just added */
+ continue;
+ }
+ }
}
/*
Hi,
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
That's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.
What were you actually trying to achieve?
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I was just curious to know how would "*pg_basebackup*" behave if we do
create a symbolic link for directories other than pg_xlog/pg_tblspc.
However it is clearly mentioned in the documentation of pg_basebackup that
if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
created then it would be skipped. But, that is not the case for pg_replslot
and pg_stat_tmp. Is this not an issue. Should these directories not be
skipped. Please let me know your thoughts on this. Thanks.
With Regards,
Ashutosh Sharma
EnterpriseDB: *http://www.enterprisedb.com <http://www.enterprisedb.com>*
On Thu, Apr 14, 2016 at 8:42 PM, Andres Freund <andres@anarazel.de> wrote:
Show quoted text
Hi,
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directoryThat's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.What were you actually trying to achieve?
Greetings,
Andres Freund
On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
wrote:
Hi,
I was just curious to know how would "*pg_basebackup*" behave if we do
create a symbolic link for directories other than pg_xlog/pg_tblspc.
However it is clearly mentioned in the documentation of pg_basebackup that
if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
created then it would be skipped. But, that is not the case for pg_replslot
and pg_stat_tmp. Is this not an issue. Should these directories not be
skipped. Please let me know your thoughts on this. Thanks.
I agree that actually generating a corrupt tarfile is not good. But I think
the correct fix is to actually generate an empty placeholder directory
rather than skipping it - thereby making the backup look the same as it
would if it was a correct directory where we just skipped the contents.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
wrote:I was just curious to know how would "*pg_basebackup*" behave if we do
create a symbolic link for directories other than pg_xlog/pg_tblspc.
However it is clearly mentioned in the documentation of pg_basebackup that
if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
created then it would be skipped. But, that is not the case for pg_replslot
and pg_stat_tmp. Is this not an issue. Should these directories not be
skipped. Please let me know your thoughts on this. Thanks.I agree that actually generating a corrupt tarfile is not good. But I think
the correct fix is to actually generate an empty placeholder directory
rather than skipping it - thereby making the backup look the same as it
would if it was a correct directory where we just skipped the contents.
Hmm, if your server has replication slots and pg_basebackup doesn't copy
them, is the copy okay? It may be a waste to try to create all the
decoded data if there was any spillage, but perhaps it should at least
backup the state? If this is excessive datadir knowledge in
pg_basebackup, perhaps it should throw an error if some of the critical
subdirs aren't copied?
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 14, 2016 at 8:44 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Magnus Hagander wrote:
On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
wrote:I was just curious to know how would "*pg_basebackup*" behave if we do
create a symbolic link for directories other than pg_xlog/pg_tblspc.
However it is clearly mentioned in the documentation of pg_basebackupthat
if a Symbolic link for the directories other than pg_tblspc and
pg_xlog is
created then it would be skipped. But, that is not the case for
pg_replslot
and pg_stat_tmp. Is this not an issue. Should these directories not be
skipped. Please let me know your thoughts on this. Thanks.I agree that actually generating a corrupt tarfile is not good. But I
think
the correct fix is to actually generate an empty placeholder directory
rather than skipping it - thereby making the backup look the same as it
would if it was a correct directory where we just skipped the contents.Hmm, if your server has replication slots and pg_basebackup doesn't copy
them, is the copy okay?
If pg_replslot is a directory and not a symlink, we skip the contents and
create an empty directory. So it'd better be ok...
It may be a waste to try to create all the
decoded data if there was any spillage, but perhaps it should at least
backup the state? If this is excessive datadir knowledge in
pg_basebackup, perhaps it should throw an error if some of the critical
subdirs aren't copied?
The knowledge isn't actually in pg_basebackup, it's in basebackup.c in
backend/replication.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directoryThat's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.
I think various tools might choke on such a configuration, but I'm not
entirely sure why we haven't made them all work. Is there some more
fundamental problem?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-04-14 14:55:37 -0400, Robert Haas wrote:
On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directoryThat's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.I think various tools might choke on such a configuration, but I'm not
entirely sure why we haven't made them all work. Is there some more
fundamental problem?
I don't think there's a fundamental problem, just that it'd require
adding code to numerous places, and that it'd make it harder to reason
about things. For example you have to be a lot more careful about
iterating over the data directory, because loops suddenly are a major
concern. Fsyncing files & directories suddenly needs special care to
fsync the correct directory for a file, lest it's symlinked somewhere.
It's harder to perform checks like the "not at the top of the
filesystem" when every file and directory can be somewhere else.
- Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/14/16 2:55 PM, Robert Haas wrote:
On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directoryThat's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.I think various tools might choke on such a configuration, but I'm not
entirely sure why we haven't made them all work. Is there some more
fundamental problem?
I'm don't think there's a fundamental problem it just takes a lot of
work to get it right. Extensive link support has been added for
pgBackRest 1.0 and it took a lot of effort, not just for the backup
itself (loop detection, links into multiple hierarchy levels, etc.) but
adding options to the restore to make remapping possible on systems with
different disk layouts.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/14/16 3:01 PM, Andres Freund wrote:
On 2016-04-14 14:55:37 -0400, Robert Haas wrote:
On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
I tried performing pg_basebackup after creating a symbolic link for
pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directoryThat's not supported, and I strongly suspect that you're goint to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.I think various tools might choke on such a configuration, but I'm not
entirely sure why we haven't made them all work. Is there some more
fundamental problem?<...> Fsyncing files & directories suddenly needs special care to
fsync the correct directory for a file, lest it's symlinked somewhere.
That's a good point. I'll need to make sure pgBackRest is correctly
handling that case on restore.
--
-David
david@pgmasters.net
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
Knowing that pg_basebackup always creates an empty directory for
pg_stat_tmp and pg_replslot in backup location, even i think it would be
better to handle these directories in such a way that pg_basebackup
generates an empty directory for pg_replslot and pg_stat_tmp if they are
symbolic link.
PFA patch for the same.
With Regards,
Ashutosh Sharma
EnterpriseDB: *http://www.enterprisedb.com <http://www.enterprisedb.com>*
On Thu, Apr 14, 2016 at 11:57 PM, Magnus Hagander <magnus@hagander.net>
wrote:
Show quoted text
On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
wrote:Hi,
I was just curious to know how would "*pg_basebackup*" behave if we do
create a symbolic link for directories other than pg_xlog/pg_tblspc.
However it is clearly mentioned in the documentation of pg_basebackup that
if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
created then it would be skipped. But, that is not the case for pg_replslot
and pg_stat_tmp. Is this not an issue. Should these directories not be
skipped. Please let me know your thoughts on this. Thanks.I agree that actually generating a corrupt tarfile is not good. But I
think the correct fix is to actually generate an empty placeholder
directory rather than skipping it - thereby making the backup look the same
as it would if it was a correct directory where we just skipped the
contents.--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Attachments:
Allow-pg_basebackup-to-create-pg_stat_tmp-and-pg_replslot-as-an-empty-dir-if-they-are-Symbolic-link.patchtext/plain; charset=US-ASCII; name=Allow-pg_basebackup-to-create-pg_stat_tmp-and-pg_replslot-as-an-empty-dir-if-they-are-Symbolic-link.patchDownload
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 1008873..5e3932b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -970,7 +970,16 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
strncmp(de->d_name, PG_STAT_TMP_DIR, strlen(PG_STAT_TMP_DIR)) == 0)
{
if (!sizeonly)
+ {
+ /* If pg_stat_tmp is a symlink, write it as a directory anyway */
+#ifndef WIN32
+ if (S_ISLNK(statbuf.st_mode))
+#else
+ if (pgwin32_is_junction(pathbuf))
+#endif
+ statbuf.st_mode = S_IFDIR | S_IRWXU;
_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
+ }
size += 512;
continue;
}
@@ -982,7 +991,16 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
if (strcmp(de->d_name, "pg_replslot") == 0)
{
if (!sizeonly)
+ {
+ /* If pg_replslot is a symlink, write it as a directory anyway */
+#ifndef WIN32
+ if (S_ISLNK(statbuf.st_mode))
+#else
+ if (pgwin32_is_junction(pathbuf))
+#endif
+ statbuf.st_mode = S_IFDIR | S_IRWXU;
_tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf);
+ }
size += 512; /* Size of the header just added */
continue;
}
On 4/26/16 5:02 AM, Ashutosh Sharma wrote:
Knowing that pg_basebackup always creates an empty directory for
pg_stat_tmp and pg_replslot in backup location, even i think it would be
better to handle these directories in such a way that pg_basebackup
generates an empty directory for pg_replslot and pg_stat_tmp if they are
symbolic link.
I just wanted to update you, I have taken this commit fest entry patch
to review because I think it will be addresses as part of "Exclude
additional directories in pg_basebackup", which I'm also reviewing.
Therefore, I'm not actually planning on discussing this patch further.
Please correct me if this assessment does not match your expectations.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/23/16 2:12 PM, Peter Eisentraut wrote:
On 4/26/16 5:02 AM, Ashutosh Sharma wrote:
Knowing that pg_basebackup always creates an empty directory for
pg_stat_tmp and pg_replslot in backup location, even i think it would be
better to handle these directories in such a way that pg_basebackup
generates an empty directory for pg_replslot and pg_stat_tmp if they are
symbolic link.I just wanted to update you, I have taken this commit fest entry patch
to review because I think it will be addresses as part of "Exclude
additional directories in pg_basebackup", which I'm also reviewing.
Therefore, I'm not actually planning on discussing this patch further.
Please correct me if this assessment does not match your expectations.
Just to be clear, and as I noted in [1]/messages/by-id/raw/ced3b05f-c1d9-c262-ce63-9744ef7e6de8@pgmasters.net, I pulled the symlink logic from
this this patch into the exclusion patch [2]https://commitfest.postgresql.org/10/721/. I think it makes sense to
only commit [2]https://commitfest.postgresql.org/10/721/ as long as Ashutosh gets credit for his contribution.
Thanks,
--
-David
david@pgmasters.net
[1]: /messages/by-id/raw/ced3b05f-c1d9-c262-ce63-9744ef7e6de8@pgmasters.net
/messages/by-id/raw/ced3b05f-c1d9-c262-ce63-9744ef7e6de8@pgmasters.net
[2]: https://commitfest.postgresql.org/10/721/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Peter,
I just wanted to update you, I have taken this commit fest entry patch
to review because I think it will be addresses as part of "Exclude
additional directories in pg_basebackup", which I'm also reviewing.
Therefore, I'm not actually planning on discussing this patch further.
Please correct me if this assessment does not match your expectations.
Thanks for the update. I am absolutely OK with it. I feel it would be
a good idea to review "Exclude additional directories in
pg_basebackup" which also addresses the issue reported by me.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/25/16 8:06 AM, Ashutosh Sharma wrote:
Hi Peter,
I just wanted to update you, I have taken this commit fest entry patch
to review because I think it will be addresses as part of "Exclude
additional directories in pg_basebackup", which I'm also reviewing.
Therefore, I'm not actually planning on discussing this patch further.
Please correct me if this assessment does not match your expectations.Thanks for the update. I am absolutely OK with it. I feel it would be
a good idea to review "Exclude additional directories in
pg_basebackup" which also addresses the issue reported by me.
That has been committed.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers