Re: pg_basebackup vs. Windows and tablespaces
Magnus Hagander wrote:
On Mon, Aug 5, 2013 at 10:03 PM, Noah Misch <noah(at)leadboat(dot)com>
wrote:
On Thu, Aug 01, 2013 at 01:04:42PM -0400, Andrew Dunstan wrote:
On 08/01/2013 12:15 PM, Noah Misch wrote:
1. Include in the base backup a file listing symbolic links/junction points,
then have archive recovery recreate them. This file would be managed like the
backup label file; exclusive backups would actually write it to the master
data directory, and non-exclusive backups would incorporate it on the fly.
pg_basebackup could also omit the actual links from its backup. Nearly any
tar or file copy utility would then suffice.
I like #1, it seems nice and workable.
Agreed. I'll lean in that direction for resolving the proximate problem.
+1.
I have implemented the above feature, which will help to
restore symlinks during archive recovery.
Implementation details:
-----------------------------------
1. This feature is implemented only for tar format on Windows,
as native Windows utilities are not able to create symlinks while
extracting files from tar (it might be possible to create symlinks
if cygwin is installed on your system, however I feel we need this
feature to work for native Windows as well). Another reason not to
create it for non-tar (plain) format is that plain format can update the
symlinks via the -T option, and backing up the symlink file during that
operation can lead to spurious symlinks after archive recovery.
2. Symlink file format:
<oid> <linkpath>
16387 E:\PostgreSQL\tbs
The symlink file will contain entries for all the tablespaces
under the pg_tblspc directory. I have kept the file name as
symlink_label (suggestions are welcome if you want some
different name for this file).
3. While taking an exclusive backup, write the symlink file
in the master data directory, similar to the backup_label file.
4. Non-exclusive backups include the symlink file in the archive.
5. Archive recovery will create symlinks if the symlink_label file
is present and contains information about symlinks; it will rename
the file to symlink_label.old once it is done with the file.
6. Cancel backup will rename the file symlink_label to
symlink_label.old to prevent the server from trying to create symlinks
during archive recovery.
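The entry format in item 2 and the reading side of item 5 can be sketched as follows. This is a minimal illustration, not the code from the attached patch: `SymlinkEntry`, `parse_symlink_entry`, and the `MAX_LINK_PATH` limit are hypothetical names and values chosen for the example, and it assumes the OID and the link path are separated by a single space and that the path runs to the end of the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINK_PATH 1024      /* assumption: illustrative limit only */

/* Hypothetical in-memory form of one symlink_label entry. */
typedef struct SymlinkEntry
{
    unsigned long oid;                      /* tablespace OID (link name under pg_tblspc) */
    char          linkpath[MAX_LINK_PATH];  /* location the link should point to */
} SymlinkEntry;

/*
 * Parse one "<oid> <linkpath>" line.  Returns 0 on success and -1 if the
 * line is malformed.  The path is taken verbatim up to the end of the
 * line, so a path containing spaces is kept intact.
 */
int
parse_symlink_entry(const char *line, SymlinkEntry *entry)
{
    char   *end;
    size_t  len;

    assert(line != NULL && entry != NULL);

    if (!isdigit((unsigned char) line[0]))
        return -1;              /* the line must start with the OID */

    entry->oid = strtoul(line, &end, 10);
    if (*end != ' ')
        return -1;              /* no separator after the OID */

    end++;                      /* skip the separating space */
    len = strcspn(end, "\r\n"); /* ignore a trailing line terminator */
    if (len == 0 || len >= MAX_LINK_PATH)
        return -1;              /* empty or over-long path */

    memcpy(entry->linkpath, end, len);
    entry->linkpath[len] = '\0';
    return 0;
}
```

During archive recovery each parsed entry would then be turned into a link named after the OID under pg_tblspc, after which the file is renamed to symlink_label.old as described in item 5.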
Feedback?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
extend_basebackup_to_include_symlink_v1.patch (application/octet-stream, +431 -111)
15 July 2014 19:29 Amit Kapila Wrote,
Implementation details:
-----------------------------------
1. This feature is implemented only for tar format on Windows,
as native Windows utilities are not able to create symlinks while
extracting files from tar (it might be possible to create symlinks
if cygwin is installed on your system, however I feel we need this
feature to work for native Windows as well). Another reason not to
create it for non-tar (plain) format is that plain format can update the
symlinks via the -T option, and backing up the symlink file during that
operation can lead to spurious symlinks after archive recovery.
I have reviewed the patch and did not find any major issues.
There are some comments I would like to share with you:
1. Rebase the patch to current GIT head.
2. + * Construct symlink file
+ */
+ initStringInfo(&symlinkfbuf);
I think the declaration and initialization of the symlinkfbuf string can be moved under the #ifdef WIN32 compile-time macro;
on other platforms it is simply allocated and freed, which can be avoided.
3. + /*
+ * native windows utilites are not able create symlinks while
+ * extracting files from tar.
+ */
Rephrase the above sentence and fix spelling mistake (utilities are not able to create)
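Comment 2 amounts to scoping the buffer to the one platform that uses it. A rough sketch of the idea follows; it is not taken from the patch. `LinkBuf` is a hypothetical stand-in for PostgreSQL's `StringInfo`, and `collect_tablespaces` is a made-up function used only to show where the `#ifdef WIN32` guards would sit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData; not the PostgreSQL type. */
typedef struct LinkBuf
{
    char   *data;
    size_t  len;
    size_t  cap;
} LinkBuf;

void
linkbuf_init(LinkBuf *buf)
{
    assert(buf != NULL);
    buf->cap = 256;
    buf->len = 0;
    buf->data = malloc(buf->cap);
    if (buf->data == NULL)
        abort();
    buf->data[0] = '\0';
}

void
linkbuf_append(LinkBuf *buf, const char *s)
{
    size_t n = strlen(s);

    /* grow until the new text and its terminating NUL fit */
    while (buf->len + n + 1 > buf->cap)
    {
        buf->cap *= 2;
        buf->data = realloc(buf->data, buf->cap);
        if (buf->data == NULL)
            abort();
    }
    memcpy(buf->data + buf->len, s, n + 1);
    buf->len += n;
}

/*
 * Walk the tablespace list and return how many entries were seen.  The
 * symlink buffer is declared, filled, and freed only when compiling for
 * Windows; other platforms never allocate it.
 */
int
collect_tablespaces(const char *const *linkpaths, int ntablespaces)
{
    int     i;
#ifdef WIN32
    LinkBuf symlinkfbuf;

    linkbuf_init(&symlinkfbuf);
#endif

    for (i = 0; i < ntablespaces; i++)
    {
#ifdef WIN32
        linkbuf_append(&symlinkfbuf, linkpaths[i]);
        linkbuf_append(&symlinkfbuf, "\n");
#else
        (void) linkpaths[i];    /* nothing to record on this platform */
#endif
    }

#ifdef WIN32
    /* ... the buffer contents would be added to the archive here ... */
    free(symlinkfbuf.data);
#endif
    return i;
}
```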
I haven’t done the testing yet; once I finish testing I will share the results with you.
Regards,
Dilip
On Wed, Aug 20, 2014 at 12:12 PM, Dilip kumar <dilip.kumar@huawei.com>
wrote:
I have reviewed the patch and did not find any major comments.
Thanks for the review.
There are some comments I would like to share with you
1. Rebase the patch to current GIT head.
Done.
2. + * Construct symlink file
+ */
+ initStringInfo(&symlinkfbuf);
I think declaration and initialization of symlinkfbuf string
can be moved under #ifdef WIN32 compile time macro,
for other platform it’s simply allocated and freed which can be avoided.
Agreed, I have changed the patch as per your suggestion.
3. + /*
+ * native windows utilites are not able create symlinks while
+ * extracting files from tar.
+ */
Rephrase the above sentence and fix spelling mistake
(utilities are not able to create)
Done.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
extend_basebackup_to_include_symlink_v2.patch (application/octet-stream, +437 -111)
On 20 August 2014 19:49, Amit Kapila Wrote
There are some comments I would like to share with you
1. Rebase the patch to current GIT head.
Done.
+ initStringInfo(&symlinkfbuf);
I think declaration and initialization of symlinkfbuf string can be moved under #ifdef WIN32 compile time macro,
for other platform it’s simply allocated and freed which can be avoided.
Agreed, I have changed the patch as per your suggestion.
I have done the testing and the behavior is as expected.
Do we need some documentation changes? I mean, is this limitation on Windows mentioned anywhere?
If no change is needed then I will move the patch to “Ready For Committer”.
Thanks & Regards,
Dilip
On Thu, Sep 11, 2014 at 9:10 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
I have done the testing and behavior is as per expectation,
Do we need to do some document change? I mean is this limitation on
windows is mentioned anywhere ?
I don't think such a limitation is currently mentioned in the docs;
however, I think we can update the docs at the locations below:
1. In the description of pg_start_backup on the page below:
http://www.postgresql.org/docs/devel/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
2. In Explanation of Base Backup, basically under heading
Making a Base Backup Using the Low Level API at below
page:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-BASE-BACKUP
In general, we can explain the symlink_label file wherever
we explain the backup_label file.
If you think it is sufficient to explain symlink_label in
the above places, then I can update the patch.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 11 September 2014 10:21, Amit kapila Wrote,
I don't think currently such a limitation is mentioned in docs,
however I think we can update the docs at below locations:
1. In description of pg_start_backup in below page:
http://www.postgresql.org/docs/devel/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
2. In Explanation of Base Backup, basically under heading
Making a Base Backup Using the Low Level API at below
page:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-BASE-BACKUP
In general, we can explain about symlink_label file where ever
we are explaining about backup_label file.
If you think it is sufficient to explain about symlink_label in
above places, then I can update the patch.
I think this will be sufficient.
Regards,
Dilip
On Fri, Sep 12, 2014 at 1:50 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
On 11 September 2014 10:21, Amit kapila Wrote,
I don't think currently such a limitation is mentioned in docs,
however I think we can update the docs at below locations:
1. In description of pg_start_backup in below page:
http://www.postgresql.org/docs/devel/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
2. In Explanation of Base Backup, basically under heading
Making a Base Backup Using the Low Level API at below
page:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-BASE-BACKUP
In general, we can explain about symlink_label file where ever
we are explaining about backup_label file.
If you think it is sufficient to explain about symlink_label in
above places, then I can update the patch.
I think this will be sufficient.
Please find updated patch to include those documentation changes.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
extend_basebackup_to_include_symlink_v3.patch (application/octet-stream, +457 -125)
On 12 September 2014 14:34, Amit Kapila Wrote
Please find updated patch to include those documentation changes.
Looks fine, Moved to Ready for committer.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Sep 12, 2014 at 5:07 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
On 12 September 2014 14:34, Amit Kapila Wrote
Please find updated patch to include those documentation changes.
Looks fine, Moved to Ready for committer.
Thanks a lot for the review.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Sep 12, 2014 at 6:12 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Sep 12, 2014 at 5:07 PM, Dilip kumar <dilip.kumar@huawei.com>
wrote:
On 12 September 2014 14:34, Amit Kapila Wrote
Please find updated patch to include those documentation changes.
Looks fine, Moved to Ready for committer.
Thanks a lot for the review.
This patch has been in the "Ready for committer" stage for more than 1.5 months.
I believe this is important functionality, since without it the tar
format of pg_basebackup is not usable on Windows. I feel this
will add value to the pg_basebackup utility, and moreover the need
and design were agreed upon on the list before development.
Can any committer please have a look at this patch?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Amit Kapila wrote:
This patch is in "Ready for committer" stage for more than 1.5 months.
I believe this is an important functionality such that without this tar
format of pg_basebackup is not usable on Windows. I feel this
will add a value to pg_basebackup utility and moreover the need
and design has been agreed upon the list before development.
Can any Committer please have a look at this patch?
Is this still relevant after this commit?
commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Sat Feb 22 13:38:06 2014 -0500
pg_basebackup: Add support for relocating tablespaces
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/13/14 11:52 AM, Alvaro Herrera wrote:
Amit Kapila wrote:
This patch is in "Ready for committer" stage for more than 1.5 months.
I believe this is an important functionality such that without this tar
format of pg_basebackup is not usable on Windows. I feel this
will add a value to pg_basebackup utility and moreover the need
and design has been agreed upon the list before development.
Can any Committer please have a look at this patch?
Is this still relevant after this commit?
commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Sat Feb 22 13:38:06 2014 -0500
pg_basebackup: Add support for relocating tablespaces
I believe so.
The commit only applies to "plain" output. Amit's complaint is that tar
utilities on Windows don't unpack symlinks, so the "tar" format isn't
useful on Windows when tablespaces are used. So he wants the recovery
mechanism to restore the symlinks.
I'm not fully on board with that premise. (Get a better tar tool.
Submit a patch.)
But this also ties in with the recent discovery that the tar format
cannot handle symlinks longer than 99 bytes. So this patch could also
fix that problem by putting the untruncated name of the symlink in the
WAL data.
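The 99-byte figure follows from the tar header layout: in the POSIX ustar header the link target is stored in a fixed 100-byte `linkname` field, which leaves 99 bytes when the writer keeps a terminating NUL. The sketch below lays out that header and checks whether a target fits. `link_target_fits` is a hypothetical helper for illustration only and is not part of pg_basebackup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* POSIX ustar header: fixed-width character fields, 512 bytes in total. */
typedef struct UstarHeader
{
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char chksum[8];
    char typeflag;          /* '2' marks a symbolic link */
    char linkname[100];     /* link target; the field discussed above */
    char magic[6];
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char prefix[155];
    char pad[12];
} UstarHeader;

/*
 * Return 1 if the link target can be stored in the linkname field with a
 * terminating NUL, 0 if it would have to be truncated.
 */
int
link_target_fits(const char *target)
{
    assert(target != NULL);
    return strlen(target) < sizeof(((UstarHeader *) 0)->linkname);
}
```

A target that fails this check is the case where a separate file carrying the full path, as proposed in this thread, would preserve what the tar header cannot.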
On Thu, Nov 13, 2014 at 4:33 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
I'm not fully on board with that premise. (Get a better tar tool.
Submit a patch.)
Noah was unable to find one that works:
/messages/by-id/20130801161519.GA334956@tornado.leadboat.com
If most tar tools worked, and there was one that didn't, I think
that'd be a reasonable argument. But telling people to get a better
tool when they'd have to write it first seems rather unfriendly.
But this also ties in with the recent discovery that the tar format
cannot handle symlinks longer than 99 bytes. So this patch could also
fix that problem by putting the untruncated name of the symlink in the
WAL data.
Yeah, seems like a chance to kill two birds with one stone.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Nov 14, 2014 at 3:03 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
On 11/13/14 11:52 AM, Alvaro Herrera wrote:
Amit Kapila wrote:
This patch is in "Ready for committer" stage for more than 1.5 months.
I believe this is an important functionality such that without this tar
format of pg_basebackup is not usable on Windows. I feel this
will add a value to pg_basebackup utility and moreover the need
and design has been agreed upon the list before development.
Can any Committer please have a look at this patch?
Is this still relevant after this commit?
commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Sat Feb 22 13:38:06 2014 -0500
pg_basebackup: Add support for relocating tablespaces
I believe so.
The commit only applies to "plain" output. Amit's complaint is that tar
utilities on Windows don't unpack symlinks, so the "tar" format isn't
useful on Windows when tablespaces are used. So he wants the recovery
mechanism to restore the symlinks.
I'm not fully on board with that premise. (Get a better tar tool.
Submit a patch.)
For the native Windows environment, I have checked all the tools I could find
(WinRAR, tar, 7-Zip, etc.) and none of them works. I also searched
extensively for a workaround, but it seems there
is no way to reliably handle this issue. See:
http://sourceforge.net/p/mingw/bugs/2002/
Then I started a discussion in the tar community to see if they could suggest
some way, but there too I could not find a reliable solution, except that
it might work in some cases if cygwin is installed. See the thread below:
https://lists.gnu.org/archive/html/bug-tar/2014-07/msg00007.html
Only after spending a good amount of time looking for a workaround or alternative
did I decide that it is important to write this patch to make the tar format
of pg_basebackup usable for Windows users.
But this also ties in with the recent discovery that the tar format
cannot handle symlinks longer than 99 bytes. So this patch could also
fix that problem by putting the untruncated name of the symlink in the
WAL data.
I have mentioned on that thread that this could be useful for Linux users as well;
however, I think we might want to provide it as an option for
Linux users. In general, I think it is good to have this patch for Windows
users, and later, if we find that Linux users can also benefit from
this functionality, we can expose it with an additional option.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 14, 2014 at 3:03 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
On 11/13/14 11:52 AM, Alvaro Herrera wrote:
Amit Kapila wrote:
This patch is in "Ready for committer" stage for more than 1.5 months.
I believe this is an important functionality such that without this tar
format of pg_basebackup is not usable on Windows. I feel this
will add a value to pg_basebackup utility and moreover the need
and design has been agreed upon the list before development.
Can any Committer please have a look at this patch?
Is this still relevant after this commit?
commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Sat Feb 22 13:38:06 2014 -0500
pg_basebackup: Add support for relocating tablespaces
I believe so.
The commit only applies to "plain" output. Amit's complaint is that tar
utilities on Windows don't unpack symlinks, so the "tar" format isn't
useful on Windows when tablespaces are used. So he wants the recovery
mechanism to restore the symlinks.
I'm not fully on board with that premise. (Get a better tar tool.
Submit a patch.)
For native Windows environment, I have checked all the tools I could find
(Winrar, tar, 7-zip, etc...) and none of them is working and even checked
a lot on google to try to find some workaround for this, but it seems there
is no way to reliably handle this issue. Refer link :
http://sourceforge.net/p/mingw/bugs/2002/
Then I started discussion in tar community to see if they can suggest
some way, but there also I could not find a reliable solution except that
it might work in some cases if cygwin is installed. You can refer below
thread:
https://lists.gnu.org/archive/html/bug-tar/2014-07/msg00007.html
After spending good amount of time for finding a workaround or alternative,
only I decided that it is important to write this patch to make tar format
for pg_basebackup usable for Windows users.
But this also ties in with the recent discovery that the tar format
cannot handle symlinks longer than 99 bytes. So this patch could also
fix that problem by putting the untruncated name of the symlink in the
WAL data.
I have mentioned that this can be usable for Linux users as well on that
thread, however I think we might want to provide it with an option for
linux users. In general, I think it is good to have this patch for Windows
users and later if we find that Linux users can also get the benefit with
this functionality, we can expose the same with an additional option.
Why make it an option instead of just always doing it this way?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Nov 14, 2014 at 9:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
I have mentioned that this can be usable for Linux users as well on that
thread, however I think we might want to provide it with an option for
linux users. In general, I think it is good to have this patch for Windows
users and later if we find that Linux users can also get the benefit with
this functionality, we can expose the same with an additional option.
Why make it an option instead of just always doing it this way?
To avoid extra work during archive recovery if it is not required. I
understand that this might not create any measurable difference, but
there is still additional I/O involved (reading from the file) which can be avoided.
OTOH, if that is okay, then I think we can avoid a few of the #ifdef WIN32 blocks that
this patch introduces and have consistency for this operation on
both Linux and Windows.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 14, 2014 at 2:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 14, 2014 at 9:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
I have mentioned that this can be usable for Linux users as well on that
thread, however I think we might want to provide it with an option for
linux users. In general, I think it is good to have this patch for Windows
users and later if we find that Linux users can also get the benefit with
this functionality, we can expose the same with an additional option.
Why make it an option instead of just always doing it this way?
To avoid extra work during archive recovery if it is not required. I
understand that this might not create any measurable difference, but
still there is addition I/O involved (read from file) which can be avoided.
Yeah, but it's trivial. We're not going to create enough tablespaces on
one cluster for the cost of dropping a few extra symlinks in place to
matter.
OTOH, if that is okay, then I think we can avoid few #ifdef WIN32 that
this patch introduces and can have consistency for this operation on
both linux and Windows.
Having one code path for everything seems appealing to me, but what do
others think?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Nov 14, 2014 at 2:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
OTOH, if that is okay, then I think we can avoid few #ifdef WIN32 that
this patch introduces and can have consistency for this operation on
both linux and Windows.
Having one code path for everything seems appealing to me, but what do
others think?
Generally I'd be in favor of avoiding platform-dependent code where
possible, but that doesn't represent a YES vote for this particular
patch. It looks pretty messy in a quick look, even granting that the
#ifdef WIN32's would all go away.
A larger question here is about forward/backward compatibility of the
basebackup files. Changing the representation of symlinks like this
would break that. Maybe we don't care, not sure (is there already a
catversion check for these things?). Changing the file format for only
some platforms seems like definitely a bad idea though.
regards, tom lane
On Fri, Nov 14, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Generally I'd be in favor of avoiding platform-dependent code where
possible, but that doesn't represent a YES vote for this particular
patch. It looks pretty messy in a quick look, even granting that the
#ifdef WIN32's would all go away.
Hmm, OK. I have not read the patch. Hopefully that's something that
could be fixed.
A larger question here is about forward/backward compatibility of the
basebackup files. Changing the representation of symlinks like this
would break that. Maybe we don't care, not sure (is there already a
catversion check for these things?). Changing the file format for only
some platforms seems like definitely a bad idea though.
What are the practical consequences of changing the file format? I
think that an old backup containing symlinks could be made to work on
a new server that knows how to create them, and we should probably
design it that way, but a physical backup isn't compatible across
major versions anyway, so it doesn't have the same kinds of
repercussions as changing something like the pg_dump file format.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Amit Kapila wrote:
2. Symlink file format:
<oid> <linkpath>
16387 E:\PostgreSQL\tbs
Symlink file will contain entries for all the tablspaces
under pg_tblspc directory. I have kept the file name as
symlink_label (suggestion are welcome if you want some
different name for this file).
I think symlink_label isn't a very good name. This file is not a label
in the sense that backup_label is; it seems more a "catalog" to me. And
it's not, in essence, about symlinks either, but rather about
tablespaces. I would name it following the term "tablespace catalog" or
some variation thereof.
I know we don't expect that users would have to look at the file or edit
it in normal cases, but it seems better to make it be human-readable. I
would think that the file needs to have tablespace names too, then, not
just OIDs. Maybe we don't use the tablespace name for anything other
than "documentation" purposes if someone decides to look at the file, so
perhaps it should look like a comment:
<oid> <link path> ; <tablespace name>
We already do this in pg_restore -l output IIRC.
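One way to picture the suggested layout, where the tablespace name is purely informational, is a small formatter. `format_tablespace_entry` and the example tablespace name `fast_tbs` are hypothetical; they only illustrate the `<oid> <link path> ; <tablespace name>` shape proposed above and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write one entry as "<oid> <link path> ; <tablespace name>\n" into out.
 * Returns the number of characters written, or -1 if the entry does not
 * fit in the buffer.
 */
int
format_tablespace_entry(char *out, size_t outsz, unsigned int oid,
                        const char *linkpath, const char *spcname)
{
    int n;

    assert(out != NULL && linkpath != NULL && spcname != NULL);

    n = snprintf(out, outsz, "%u %s ; %s\n", oid, linkpath, spcname);
    if (n < 0 || (size_t) n >= outsz)
        return -1;              /* output error or truncated entry */
    return n;
}
```

How a reader would separate the path from the trailing comment, for example when a path itself contains " ; ", is not settled in the thread and is left open here.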
One use case mentioned upthread is having the clone be created in the
same machine as the source server. Does your proposal help with it?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services