Bug in WAL backup documentation
Our WAL backup documentation says in some parts of it:
..."%p is replaced by the absolute path of the file to archive..." [1]http://www.postgresql.org/docs/8.1/interactive/backup-online.html#BACKUP-ARCHIVING-WAL
I think this is (at least for 8.1 and upcoming 8.2 releases) wrong, since
the archiver
replaces this with pg_xlog/<LOGFILENAME> only, so that the archive command
is invoked with the relative path to the database data directory and its
xlog files. This
applies to the restore_command as well. Attached is a small patch against
HEAD, which
will adjust things in the documentation.
[1]: http://www.postgresql.org/docs/8.1/interactive/backup-online.html#BACKUP-ARCHIVING-WAL
http://www.postgresql.org/docs/8.1/interactive/backup-online.html#BACKUP-ARCHIVING-WAL
--
Thanks
Bernd
Attachments:
wal_backup_doc.patchapplication/octet-stream; name=wal_backup_doc.patchDownload
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.92
diff -c -r2.92 backup.sgml
*** doc/src/sgml/backup.sgml 30 Oct 2006 00:08:02 -0000 2.92
--- doc/src/sgml/backup.sgml 3 Nov 2006 11:18:06 -0000
***************
*** 518,532 ****
linkend="guc-archive-command"> configuration parameter, which in practice
will always be placed in the <filename>postgresql.conf</filename> file.
In this string,
! any <literal>%p</> is replaced by the absolute path of the file to
! archive, while any <literal>%f</> is replaced by the file name only.
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is something
like
<programlisting>
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
</programlisting>
! which will copy archivable WAL segments to the directory
<filename>/mnt/server/archivedir</>. (This is an example, not a
recommendation, and may not work on all platforms.)
</para>
--- 518,532 ----
linkend="guc-archive-command"> configuration parameter, which in practice
will always be placed in the <filename>postgresql.conf</filename> file.
In this string,
! any <literal>%p</> is replaced by the relative path of the file to
! the database data directory, while any <literal>%f</> is replaced by the file name only.
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is something
like
<programlisting>
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
</programlisting>
! which will copy archivable WAL segments from pg_xlog/ to the directory
<filename>/mnt/server/archivedir</>. (This is an example, not a
recommendation, and may not work on all platforms.)
</para>
***************
*** 597,603 ****
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original full path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
--- 597,603 ----
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original relative path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
***************
*** 915,921 ****
WAL file segments. Like the <varname>archive_command</>, this is
a shell command string. It may contain <literal>%f</>, which is
replaced by the name of the desired log file, and <literal>%p</>,
! which is replaced by the absolute path to copy the log file to.
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is
something like
--- 915,922 ----
WAL file segments. Like the <varname>archive_command</>, this is
a shell command string. It may contain <literal>%f</>, which is
replaced by the name of the desired log file, and <literal>%p</>,
! which is replaced by the relative path to the database data
! directory to copy the log file to.
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is
something like
***************
*** 1003,1010 ****
the WAL file series. This parameter is required.
Any <literal>%f</> in the string is
replaced by the name of the file to retrieve from the archive,
! and any <literal>%p</> is replaced by the absolute path to copy
! it to on the server.
Write <literal>%%</> to embed an actual <literal>%</> character
in the command.
</para>
--- 1004,1012 ----
the WAL file series. This parameter is required.
Any <literal>%f</> in the string is
replaced by the name of the file to retrieve from the archive,
! and any <literal>%p</> is replaced by the relative path to copy
! it to on the server (Under normal circumstances this will be
! the pg_xlog directory, located in the database data directory).
Write <literal>%%</> to embed an actual <literal>%</> character
in the command.
</para>
Bernd Helmle <mailings@oopsware.de> writes:
Our WAL backup documentation says in some parts of it:
..."%p is replaced by the absolute path of the file to archive..." [1]
I think this is (at least for 8.1 and upcoming 8.2 releases) wrong, since
the archiver replaces this with pg_xlog/<LOGFILENAME> only,
Good point. Do we want to consider that this is a code bug rather than
a doc bug? The relative path is more efficient as long as the archiver
script doesn't do a "cd", but if it does then there'd be a problem.
You could argue that the code should be tweaked to continue supplying
an absolute path.
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?
regards, tom lane
On Fri, Nov 03, 2006 at 11:25:09AM -0500, Tom Lane wrote:
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?
Arguably you could give people a choice, say %P for the absolute path
and %p for the relative one. In Unix you can easily prepend $PWD to the
string, but I don't know how easy that is in Windows.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.
Tom Lane wrote:
Bernd Helmle <mailings@oopsware.de> writes:
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?
I think supplying the absolute path makes archiving scripts less
error-prone, which is a good time. So I'd vote for absolute paths.
greetings, Florian Pflug
On Fri, 2006-11-03 at 17:34 +0100, Martijn van Oosterhout wrote:
On Fri, Nov 03, 2006 at 11:25:09AM -0500, Tom Lane wrote:
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?
In Unix you can easily prepend $PWD to the
string, but I don't know how easy that is in Windows.
Windows input anyone?
Given the lack of a comprehensive test suite at this stage, I'd vote on
the side of least change right now. We know the existing mechanism
works, and as Martijn point out there is a workaround, plus as Tom
discusses this would only happen if people "cd" which in my book would
be bad programming form anyway.
+1 Doc bug for 8.2, feature request for 8.3, unless Windows bites.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote:
On Fri, 2006-11-03 at 17:34 +0100, Martijn van Oosterhout wrote:
On Fri, Nov 03, 2006 at 11:25:09AM -0500, Tom Lane wrote:
Since 8.1 has done this all along and no one's actually complained
about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?In Unix you can easily prepend $PWD to the
string, but I don't know how easy that is in Windows.Windows input anyone?
Of course you can get the current directory on Windows, if that's what the
question is.
cheers
andrew
Since 8.1 has done this all along and no one's actually
complained
about it, I guess no one is using scripts that do "cd". I'm
inclined to go with Bernd's suggestion to change the docsto match
the code, but does anyone have a contrary opinion?
In Unix you can easily prepend $PWD to the string, but I don't know
how easy that is in Windows.Windows input anyone?
%CD% gives the same as $PWD in a command shell:
C:\Program Files\Microsoft Visual Studio 8\VC>echo %CD%
C:\Program Files\Microsoft Visual Studio 8\VC
//Magnus
"Simon Riggs" <simon@2ndquadrant.com> writes:
On Fri, Nov 03, 2006 at 11:25:09AM -0500, Tom Lane wrote:
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?
+1 Doc bug for 8.2, feature request for 8.3, unless Windows bites.
Looking back in the archives, I note that one of the arguments for
making the server use relative paths everywhere was so that it'd be
robust against things like DBAs moving directories that contain live
postmasters. If we provide a %P option, or otherwise encourage people
to write scripts that depend on the absolute path of $PGDATA, we'd lose
some of this robustness. So that might be an argument for leaving the
code as-is indefinitely ... not a very strong argument maybe, but it's
more than just we're-too-lazy-to-add-%P.
Anyway, I've corrected the documentation in HEAD and 8.1.
regards, tom lane
On Sat, 2006-11-04 at 13:29 -0500, Tom Lane wrote:
"Simon Riggs" <simon@2ndquadrant.com> writes:
On Fri, Nov 03, 2006 at 11:25:09AM -0500, Tom Lane wrote:
Since 8.1 has done this all along and no one's actually complained about
it, I guess no one is using scripts that do "cd". I'm inclined to go
with Bernd's suggestion to change the docs to match the code, but does
anyone have a contrary opinion?+1 Doc bug for 8.2, feature request for 8.3, unless Windows bites.
Looking back in the archives, I note that one of the arguments for
making the server use relative paths everywhere was so that it'd be
robust against things like DBAs moving directories that contain live
postmasters. If we provide a %P option, or otherwise encourage people
to write scripts that depend on the absolute path of $PGDATA, we'd lose
some of this robustness. So that might be an argument for leaving the
code as-is indefinitely ... not a very strong argument maybe, but it's
more than just we're-too-lazy-to-add-%P.Anyway, I've corrected the documentation in HEAD and 8.1.
I think I can fulfil Bernd, Florian and Martijn's wishes by supplying an
additional substitutable parameter %d which is replaced by the DataDir.
This allows people to use an absolute directory if they wish, allows us
to continue with the functionality of %p as-is and all without a
possible confusion between %p and %P. It also allows %d to be used as an
identifier which might be used to locate the appropriate archive for
those with multiple servers without editing the archive_command for each
of those servers.
So using %d/%p will give you the absolute path for forward-slashers.
Works for archive and recovery.
Patch included, code and docs.
Code comments now discuss relative paths also.
Comments?
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachments:
pitr_percentd.patchtext/x-patch; charset=UTF-8; name=pitr_percentd.patchDownload
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.93
diff -c -r2.93 backup.sgml
*** doc/src/sgml/backup.sgml 4 Nov 2006 18:20:27 -0000 2.93
--- doc/src/sgml/backup.sgml 5 Nov 2006 14:44:34 -0000
***************
*** 521,527 ****
any <literal>%p</> is replaced by the path name of the file to
archive, while any <literal>%f</> is replaced by the file name only.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory.)
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is something
like
--- 521,528 ----
any <literal>%p</> is replaced by the path name of the file to
archive, while any <literal>%f</> is replaced by the file name only.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory, which can also be specified
! using %d, if required)
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is something
like
***************
*** 599,605 ****
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original full path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
--- 600,606 ----
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original relative path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
***************
*** 919,925 ****
replaced by the name of the desired log file, and <literal>%p</>,
which is replaced by the path name to copy the log file to.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory.)
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is
something like
--- 920,927 ----
replaced by the name of the desired log file, and <literal>%p</>,
which is replaced by the path name to copy the log file to.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory, which can also be specified
! using %d, if required.)
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is
something like
***************
*** 1010,1016 ****
and any <literal>%p</> is replaced by the path name to copy
it to on the server.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory.)
Write <literal>%%</> to embed an actual <literal>%</> character
in the command.
</para>
--- 1012,1019 ----
and any <literal>%p</> is replaced by the path name to copy
it to on the server.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory, which can also be specified
! using %d, if required)
Write <literal>%%</> to embed an actual <literal>%</> character
in the command.
</para>
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.93
diff -c -r1.93 config.sgml
*** doc/src/sgml/config.sgml 4 Nov 2006 18:20:27 -0000 1.93
--- doc/src/sgml/config.sgml 5 Nov 2006 14:44:41 -0000
***************
*** 1576,1582 ****
replaced by the path name of the file to archive, and any
<literal>%f</> is replaced by the file name only.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory.)
Use <literal>%%</> to embed an actual <literal>%</> character in the
command. For more information see <xref
linkend="backup-archiving-wal">.
--- 1576,1583 ----
replaced by the path name of the file to archive, and any
<literal>%f</> is replaced by the file name only.
(The path name is relative to the working directory of the server,
! i.e., the cluster's data directory, which can also be specified
! using %d, if required)
Use <literal>%%</> to embed an actual <literal>%</> character in the
command. For more information see <xref
linkend="backup-archiving-wal">.
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.252
diff -c -r1.252 xlog.c
*** src/backend/access/transam/xlog.c 18 Oct 2006 22:44:11 -0000 1.252
--- src/backend/access/transam/xlog.c 5 Nov 2006 14:44:48 -0000
***************
*** 2416,2423 ****
{
switch (sp[1])
{
case 'p':
! /* %p: full path of target file */
sp++;
StrNCpy(dp, xlogpath, endp - dp);
make_native_path(dp);
--- 2416,2430 ----
{
switch (sp[1])
{
+ case 'd':
+ /* %d: DataDirectory of server */
+ sp++;
+ StrNCpy(dp, DataDir, endp - dp);
+ make_native_path(dp);
+ dp += strlen(dp);
+ break;
case 'p':
! /* %p: relative path of target file */
sp++;
StrNCpy(dp, xlogpath, endp - dp);
make_native_path(dp);
Index: src/backend/postmaster/pgarch.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgarch.c,v
retrieving revision 1.25
diff -c -r1.25 pgarch.c
*** src/backend/postmaster/pgarch.c 7 Aug 2006 17:41:42 -0000 1.25
--- src/backend/postmaster/pgarch.c 5 Nov 2006 14:44:49 -0000
***************
*** 416,423 ****
{
switch (sp[1])
{
case 'p':
! /* %p: full path of source file */
sp++;
StrNCpy(dp, pathname, endp - dp);
make_native_path(dp);
--- 416,430 ----
{
switch (sp[1])
{
+ case 'd':
+ /* %d: DataDirectory of server */
+ sp++;
+ StrNCpy(dp, DataDir, endp - dp);
+ make_native_path(dp);
+ dp += strlen(dp);
+ break;
case 'p':
! /* %p: relative path of source file */
sp++;
StrNCpy(dp, pathname, endp - dp);
make_native_path(dp);
"Simon Riggs" <simon@2ndquadrant.com> writes:
On Sat, 2006-11-04 at 13:29 -0500, Tom Lane wrote:
Looking back in the archives, I note that one of the arguments for
making the server use relative paths everywhere was so that it'd be
robust against things like DBAs moving directories that contain live
postmasters. If we provide a %P option, or otherwise encourage people
to write scripts that depend on the absolute path of $PGDATA, we'd lose
some of this robustness.
I think I can fulfil Bernd, Florian and Martijn's wishes by supplying an
additional substitutable parameter %d which is replaced by the DataDir.
This fails to respond to the concern that DataDir might be out-of-date.
regards, tom lane
On Sun, 2006-11-05 at 11:10 -0500, Tom Lane wrote:
"Simon Riggs" <simon@2ndquadrant.com> writes:
On Sat, 2006-11-04 at 13:29 -0500, Tom Lane wrote:
Looking back in the archives, I note that one of the arguments for
making the server use relative paths everywhere was so that it'd be
robust against things like DBAs moving directories that contain live
postmasters. If we provide a %P option, or otherwise encourage people
to write scripts that depend on the absolute path of $PGDATA, we'd lose
some of this robustness.I think I can fulfil Bernd, Florian and Martijn's wishes by supplying an
additional substitutable parameter %d which is replaced by the DataDir.This fails to respond to the concern that DataDir might be out-of-date.
I'm not suggesting that the option is necessary, but I am suggesting
offering it to those who consider it useful.
Let's allow it, but document the concern about its use in certain
circumstances.
I'm pretty sure most people don't move live postmasters very frequently,
plus it isn't clear to me why we should support the people that want
that to do that, yet not the people who want the absolute-path option.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
I'm pretty sure most people don't move live postmasters very frequently,
plus it isn't clear to me why we should support the people that want
that to do that, yet not the people who want the absolute-path option.
As already discussed upthread, anyone who wants the path can get it from
`pwd` or local equivalent --- and that mechanism is robust (as long as
the directory move doesn't happen while any particular instance of the
script is running). I don't see why we should go out of our way to
provide a bad substitute for pwd.
BTW, I note that some post-startup uses of DataDir have crept back in,
in places like utils/adt/dbsize.c. I'll be sure to clean those up
before release...
regards, tom lane
On Sun, 2006-11-05 at 11:49 -0500, Tom Lane wrote:
I don't see why we should go out of our way to
provide a bad substitute for pwd.
That argument is conclusive. Agreed.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Sun, 2006-11-05 at 15:02 +0000, Simon Riggs wrote:
Code comments now discuss relative paths also.
Patch containing just the minor cleanup of docs and code comments.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachments:
pitr_cleanup.patchtext/x-patch; charset=UTF-8; name=pitr_cleanup.patchDownload
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.93
diff -c -r2.93 backup.sgml
*** doc/src/sgml/backup.sgml 4 Nov 2006 18:20:27 -0000 2.93
--- doc/src/sgml/backup.sgml 6 Nov 2006 08:21:22 -0000
***************
*** 599,605 ****
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original full path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
--- 599,605 ----
In writing your archive command, you should assume that the file names to
be archived may be up to 64 characters long and may contain any
combination of ASCII letters, digits, and dots. It is not necessary to
! remember the original relative path (<literal>%p</>) but it is necessary to
remember the file name (<literal>%f</>).
</para>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.252
diff -c -r1.252 xlog.c
*** src/backend/access/transam/xlog.c 18 Oct 2006 22:44:11 -0000 1.252
--- src/backend/access/transam/xlog.c 6 Nov 2006 08:21:31 -0000
***************
*** 2417,2423 ****
switch (sp[1])
{
case 'p':
! /* %p: full path of target file */
sp++;
StrNCpy(dp, xlogpath, endp - dp);
make_native_path(dp);
--- 2417,2423 ----
switch (sp[1])
{
case 'p':
! /* %p: relative path of target file */
sp++;
StrNCpy(dp, xlogpath, endp - dp);
make_native_path(dp);
Index: src/backend/postmaster/pgarch.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgarch.c,v
retrieving revision 1.25
diff -c -r1.25 pgarch.c
*** src/backend/postmaster/pgarch.c 7 Aug 2006 17:41:42 -0000 1.25
--- src/backend/postmaster/pgarch.c 6 Nov 2006 08:21:33 -0000
***************
*** 417,423 ****
switch (sp[1])
{
case 'p':
! /* %p: full path of source file */
sp++;
StrNCpy(dp, pathname, endp - dp);
make_native_path(dp);
--- 417,423 ----
switch (sp[1])
{
case 'p':
! /* %p: relative path of source file */
sp++;
StrNCpy(dp, pathname, endp - dp);
make_native_path(dp);
On Sun, Nov 05, 2006 at 11:49:36 -0500,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
As already discussed upthread, anyone who wants the path can get it from
`pwd` or local equivalent --- and that mechanism is robust (as long as
the directory move doesn't happen while any particular instance of the
script is running). I don't see why we should go out of our way to
provide a bad substitute for pwd.
I think you also still need read access to the intervening directories.
If the command works by walking up and matching inode numbers with names,
then it will break if it can't read the names. (For example /bin/pwd
breaks when it can't read a parent directories filenames.)
On Tue, Nov 07, 2006 at 07:11:35PM -0600, Bruno Wolff III wrote:
On Sun, Nov 05, 2006 at 11:49:36 -0500,
Tom Lane <tgl@sss.pgh.pa.us> wrote:As already discussed upthread, anyone who wants the path can get it from
`pwd` or local equivalent --- and that mechanism is robust (as long as
the directory move doesn't happen while any particular instance of the
script is running). I don't see why we should go out of our way to
provide a bad substitute for pwd.I think you also still need read access to the intervening directories.
If the command works by walking up and matching inode numbers with names,
then it will break if it can't read the names. (For example /bin/pwd
breaks when it can't read a parent directories filenames.)
That's system dependant though, Linux getcwd doesn't have that problem
for example. Should probably dig up some documention on which systems
would be affected by this.
Have a ncie day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.