Docs for archive_cleanup_command are poor

Started by Brendan Jurd over 15 years ago, 10 messages
#1Brendan Jurd
direvus@gmail.com

Hi folks,

I have just set up HS+SR for the first time, and for the most part,
the docs were excellent. The one exception for me was the discussion
of archive_cleanup_command. This is a pretty important part of
constructing a healthy standby server, and IMO the docs don't give it
the treatment it deserves.

Under "25.2.4. Setting Up a Standby Server", we have:

"You can use archive_cleanup_command to prune the archive of files no
longer needed by the standby."

... then a few paragraphs later ...

"If you're using a WAL archive, its size can be minimized using the
archive_cleanup_command option to remove files that are no longer
required by the standby server. Note however, that if you're using the
archive for backup purposes, you need to retain files needed to
recover from at least the latest base backup, even if they're no
longer needed by the standby."

So there are a couple of brief mentions of what
archive_cleanup_command is for, but nothing about how it works, no
examples of how to use it, and no links at all. Contrast how we deal
with archive_command, restore_command and primary_conninfo.

I'd like to suggest a few ways we could improve on this:

1. Remove the former paragraph. It's stranded out there on its own in
the middle of some unrelated text, and doesn't say anything of
substance not also said in the latter paragraph.

2. Include an example archive_cleanup_command in the recovery.conf
example snippet.

3. Link to 26.1 which actually explains how a_c_c works.

4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
This is the utility that most newcomers to WAL archiving will want to
use, so it's rather weird of us not to advertise it.
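
To make the mechanism behind these suggestions concrete, here is a rough Python sketch of what a cleanup command does with %r, including the caveat about keeping files needed by the latest base backup. It is an illustration only, not how pg_archivecleanup is implemented. The function name cleanup_candidates, the keep_latest_backup flag, and the choice to compare only the last 16 hex digits of a segment name (ignoring the timeline prefix) are assumptions made for this example.

```python
import re

# WAL segment files have 24 hex digits; backup history files are named after
# the first WAL segment needed by that base backup.
WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")
BACKUP_HISTORY = re.compile(r"^([0-9A-F]{24})\.[0-9A-F]{8}\.backup$")


def segment_key(name):
    """Order segments by log/segment digits, ignoring the timeline (assumption)."""
    return name[8:24]


def cleanup_candidates(archive_names, restart_file, keep_latest_backup=True):
    """Return archived WAL segments that sort strictly before the cutoff.

    restart_file plays the role of %r: the earliest file that must be kept.
    With keep_latest_backup, the cutoff is pulled back so that the most recent
    base backup found in the archive remains recoverable.
    """
    cutoff = segment_key(restart_file)
    if keep_latest_backup:
        starts = [m.group(1) for m in map(BACKUP_HISTORY.match, archive_names) if m]
        if starts:
            latest = max(starts, key=segment_key)
            cutoff = min(cutoff, segment_key(latest))
    return sorted(
        n for n in archive_names
        if WAL_SEGMENT.match(n) and segment_key(n) < cutoff
    )


if __name__ == "__main__":
    archive = [
        "000000010000000000000001",
        "000000010000000000000002",
        "000000010000000000000002.00000020.backup",
        "000000010000000000000003",
        "000000010000000000000004",
    ]
    # Dry run: print what would be removed, delete nothing.
    print(cleanup_candidates(archive, "000000010000000000000004"))
```

The sketch only lists candidates; anything that actually deletes files should be tested against a copy of the archive first.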

I'm willing to write a patch for this, but I thought I'd raise the
suggestions on-list first, before getting too invested. So, please
comment if you have an opinion on this.

Cheers,
BJ

#2Fujii Masao
masao.fujii@gmail.com
In reply to: Brendan Jurd (#1)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus@gmail.com> wrote:

> Hi folks,
>
> I have just set up HS+SR for the first time, and for the most part,
> the docs were excellent.  The one exception for me was the discussion
> of archive_cleanup_command.  This is a pretty important part of
> constructing a healthy standby server, and IMO the docs don't give it
> the treatment it deserves.
>
> Under "25.2.4. Setting Up a Standby Server", we have:
>
> "You can use archive_cleanup_command to prune the archive of files no
> longer needed by the standby."
>
> ... then a few paragraphs later ...
>
> "If you're using a WAL archive, its size can be minimized using the
> archive_cleanup_command  option to remove files that are no longer
> required by the standby server. Note however, that if you're using the
> archive for backup purposes, you need to retain files needed to
> recover from at least the latest base backup, even if they're no
> longer needed by the standby."
>
> So there are a couple of brief mentions of what
> archive_cleanup_command is for, but nothing about how it works, no
> examples of how to use it, and no links at all.  Contrast how we deal
> with archive_command, restore_command and primary_conninfo.
>
> I'd like to suggest a few ways we could improve on this:
>
> 1. Remove the former paragraph.  It's stranded out there on its own in
> the middle of some unrelated text, and doesn't say anything of
> substance not also said in the latter paragraph.
>
> 2. Include an example archive_cleanup_command in the recovery.conf
> example snippet.
>
> 3. Link to 26.1 which actually explains how a_c_c works.
>
> 4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
> This is the utility that most newcomers to WAL archiving will want to
> use, so it's rather weird of us not to advertise it.
>
> I'm willing to write a patch for this, but I thought I'd raise the
> suggestions on-list first, before getting too invested.  So, please
> comment if you have an opinion on this.

Agreed.

And, ISTM that we should mention that we must not just specify
pg_archivecleanup in archive_cleanup_command when there are multiple
standby servers. This is because, in that case, we must calculate
the oldest restart point in those standbys and delete the archived
WAL files according to that point.
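
One way this could be approached is sketched below in Python. It is a hypothetical scheme, not something PostgreSQL or pg_archivecleanup provides: it assumes each standby somehow reports its own %r value (for example by having its archive_cleanup_command record it somewhere shared), and that a separate job then prunes the archive using the oldest of those values. The function names and the rule of comparing only the last 16 hex digits of a segment name are assumptions for illustration.

```python
import re

WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")


def wal_key(name):
    """Order by log/segment digits, ignoring the 8-digit timeline (assumption)."""
    return name[8:24]


def oldest_restart_file(restart_files):
    """Pick the restart file of the standby that is furthest behind."""
    return min(restart_files, key=wal_key)


def removable_for_all_standbys(archive_names, restart_files):
    """List archived segments that no standby needs any longer.

    restart_files holds one %r value per standby (hypothetical input).
    A segment is removable only if it sorts before the oldest of them.
    """
    if not restart_files:
        return []  # no reports yet: remove nothing
    keep_from = wal_key(oldest_restart_file(restart_files))
    return sorted(
        n for n in archive_names
        if WAL_SEGMENT.match(n) and wal_key(n) < keep_from
    )


if __name__ == "__main__":
    archive = [
        "000000010000000000000001",
        "000000010000000000000002",
        "000000010000000000000003",
    ]
    reported = ["000000010000000000000003", "000000010000000000000002"]
    # Dry run: show what could be removed without deleting anything.
    print(removable_for_all_standbys(archive, reported))
```

If a standby has not reported yet, or its report is stale, the safe behaviour is to remove nothing, which is why the empty case returns an empty list.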

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#2)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On Tue, Oct 12, 2010 at 8:28 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

> On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus@gmail.com> wrote:
>
>> Hi folks,
>>
>> I have just set up HS+SR for the first time, and for the most part,
>> the docs were excellent.  The one exception for me was the discussion
>> of archive_cleanup_command.  This is a pretty important part of
>> constructing a healthy standby server, and IMO the docs don't give it
>> the treatment it deserves.
>>
>> Under "25.2.4. Setting Up a Standby Server", we have:
>>
>> "You can use archive_cleanup_command to prune the archive of files no
>> longer needed by the standby."
>>
>> ... then a few paragraphs later ...
>>
>> "If you're using a WAL archive, its size can be minimized using the
>> archive_cleanup_command  option to remove files that are no longer
>> required by the standby server. Note however, that if you're using the
>> archive for backup purposes, you need to retain files needed to
>> recover from at least the latest base backup, even if they're no
>> longer needed by the standby."
>>
>> So there are a couple of brief mentions of what
>> archive_cleanup_command is for, but nothing about how it works, no
>> examples of how to use it, and no links at all.  Contrast how we deal
>> with archive_command, restore_command and primary_conninfo.
>>
>> I'd like to suggest a few ways we could improve on this:
>>
>> 1. Remove the former paragraph.  It's stranded out there on its own in
>> the middle of some unrelated text, and doesn't say anything of
>> substance not also said in the latter paragraph.
>>
>> 2. Include an example archive_cleanup_command in the recovery.conf
>> example snippet.
>>
>> 3. Link to 26.1 which actually explains how a_c_c works.
>>
>> 4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
>> This is the utility that most newcomers to WAL archiving will want to
>> use, so it's rather weird of us not to advertise it.
>>
>> I'm willing to write a patch for this, but I thought I'd raise the
>> suggestions on-list first, before getting too invested.  So, please
>> comment if you have an opinion on this.
>
> Agreed.

Is someone working on a patch?

> And, ISTM that we should mention that we must not just specify
> pg_archivecleanup in archive_cleanup_command when there are multiple
> standby servers. This is because, in that case, we must calculate
> the oldest restart point in those standbys and delete the archived
> WAL files according to that point.

How do we expect people to do that, by the way?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Brendan Jurd
direvus@gmail.com
In reply to: Robert Haas (#3)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On 14 October 2010 08:45, Robert Haas <robertmhaas@gmail.com> wrote:

> Is someone working on a patch?

Yes, I will prepare a patch to get us started.

Cheers,
BJ

#5Brendan Jurd
direvus@gmail.com
In reply to: Fujii Masao (#2)
1 attachment(s)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On 12 October 2010 23:28, Fujii Masao <masao.fujii@gmail.com> wrote:

> On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus@gmail.com> wrote:
>
>> I have just set up HS+SR for the first time, and for the most part,
>> the docs were excellent.  The one exception for me was the discussion
>> of archive_cleanup_command.  This is a pretty important part of
>> constructing a healthy standby server, and IMO the docs don't give it
>> the treatment it deserves.
>>
>> ...
>
> Agreed.
>
> And, ISTM that we should mention that we must not just specify
> pg_archivecleanup in archive_cleanup_command when there are multiple
> standby servers. This is because, in that case, we must calculate
> the oldest restart point in those standbys and delete the archived
> WAL files according to that point.

As promised, here is a patch to try to address $SUBJECT.

Summary of changes:

In 25.2.4. "Setting Up a Standby Server":

* Get rid of the extraneous short paragraph,
* move the full-size paragraph up to where the now-extinct short para was,
* add an archive_cleanup_command to the example recovery.conf,
* flesh out the wording,
* add links to 26.1 and F.22.

In 26.1. "Archive recovery settings":

* Add detail to the description of how it works,
* add an example recovery.conf snippet,
* per Fujii-san's comment, indicate that multi-standby setups require
more finesse,
* link to F.22.

In F.22. "pg_archivecleanup":

* Edit and clarify wording,
* standardise label for the <archivelocation> argument,
* again indicate the multi-standby issue,
* link to 25.2.

I'll drop this onto the next open commitfest. If it passes muster, it
sure wouldn't hurt to backpatch it to 9.0.

Cheers,
BJ

Attachments:

acc-docs.diff (text/plain; charset=US-ASCII; name=acc-docs.diff)
*** a/doc/src/sgml/high-availability.sgml
--- b/doc/src/sgml/high-availability.sgml
***************
*** 681,691 **** protocol to make nodes agree on a serializable transactional order.
     </para>
  
     <para>
-     You can use <varname>archive_cleanup_command</> to prune the archive of
-     files no longer needed by the standby.
-    </para>
- 
-    <para>
      If you're setting up the standby server for high availability purposes,
      set up WAL archiving, connections and authentication like the primary
      server, because the standby server will work as a primary server after
--- 681,686 ----
***************
*** 697,708 **** protocol to make nodes agree on a serializable transactional order.
--- 692,716 ----
     </para>
  
     <para>
+     If you're using a WAL archive, its size can be minimized using the <xref
+     linkend="archive-cleanup-command"> parameter to remove files that are no
+     longer required by the standby server.
+     The <application>pg_archivecleanup</> utility is designed specifically to
+     be used with <varname>archive_cleanup_command</> in typical single-standby
+     configurations, see <xref linkend="pgarchivecleanup">.
+     Note however, that if you're using the archive for backup purposes, you
+     need to retain files needed to recover from at least the latest base
+     backup, even if they're no longer needed by the standby.
+    </para>
+ 
+    <para>
      A simple example of a <filename>recovery.conf</> is:
  <programlisting>
  standby_mode = 'on'
  primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
  restore_command = 'cp /path/to/archive/%f %p'
  trigger_file = '/path/to/trigger_file'
+ archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'
  </programlisting>
     </para>
  
***************
*** 712,725 **** trigger_file = '/path/to/trigger_file'
      the primary to allow them to be connected simultaneously.
     </para>
  
-    <para>
-     If you're using a WAL archive, its size can be minimized using
-     the <varname>archive_cleanup_command</> option to remove files that are
-     no longer required by the standby server. Note however, that if you're
-     using the archive for backup purposes, you need to retain files needed
-     to recover from at least the latest base backup, even if they're no
-     longer needed by the standby.
-    </para>
    </sect2>
  
    <sect2 id="streaming-replication">
--- 720,725 ----
*** a/doc/src/sgml/pgarchivecleanup.sgml
--- b/doc/src/sgml/pgarchivecleanup.sgml
***************
*** 8,17 ****
   </indexterm>
  
   <para>
!   <application>pg_archivecleanup</> is designed to cleanup an archive when used
!   as an <literal>archive_cleanup_command</literal> when running with
!   <literal>standby_mode = on</literal>. <application>pg_archivecleanup</> can
!   also be used as a standalone program to clean WAL file archives.
   </para>
  
   <para>
--- 8,18 ----
   </indexterm>
  
   <para>
!   <application>pg_archivecleanup</> is designed to be used as an
!   <literal>archive_cleanup_command</literal> to clean up WAL file archives when
!   running as a standby server (see <xref linkend="warm-standby">).
!   <application>pg_archivecleanup</> can also be used as a standalone program to
!   clean WAL file archives.
   </para>
  
   <para>
***************
*** 39,58 ****
     server to use <application>pg_archivecleanup</>, put this into its
     <filename>recovery.conf</filename> configuration file:
  <programlisting>
! archive_cleanup_command = 'pg_archivecleanup <replaceable>archiveDir</> %r'
  </programlisting>
!    where <replaceable>archiveDir</> is the directory from which WAL segment
!    files should be restored.
    </para>
    <para>
!    When used within <literal>archive_cleanup_command</literal>,
!    all WAL files logically preceding the value of the <literal>%r</>
!    will be removed <replaceable>archivelocation</>. This minimizes
!    the number of files that need to be retained, while preserving
!    crash-restart capability.  Use of this parameter is appropriate if the
!    <replaceable>archivelocation</> is a transient staging area for this
!    particular standby server, but <emphasis>not</> when the
!    <replaceable>archivelocation</> is intended as a long-term WAL archive area.
    </para>
    <para>
     The full syntax of <application>pg_archivecleanup</>'s command line is
--- 40,60 ----
     server to use <application>pg_archivecleanup</>, put this into its
     <filename>recovery.conf</filename> configuration file:
  <programlisting>
! archive_cleanup_command = 'pg_archivecleanup <replaceable>archivelocation</> %r'
  </programlisting>
!    where <replaceable>archivelocation</> is the directory from which WAL segment
!    files should be removed.
    </para>
    <para>
!    When used within <xref linkend="archive-cleanup-command">, all WAL files
!    logically preceding the value of the <literal>%r</> argument will be removed
!    from <replaceable>archivelocation</>. This minimizes the number of files
!    that need to be retained, while preserving crash-restart capability.  Use of
!    this parameter is appropriate if the <replaceable>archivelocation</> is a
!    transient staging area for this particular standby server, but
!    <emphasis>not</> when the <replaceable>archivelocation</> is intended as a
!    long-term WAL archive area, or when multiple standby servers are recovering
!    from the same archive location.
    </para>
    <para>
     The full syntax of <application>pg_archivecleanup</>'s command line is
*** a/doc/src/sgml/recovery-config.sgml
--- b/doc/src/sgml/recovery-config.sgml
***************
*** 80,99 **** restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
        </indexterm>
        <listitem>
         <para>
!         This parameter specifies a shell command that will be executed at
!         every restartpoint. This parameter is optional. The purpose of the
!         <varname>archive_cleanup_command</> is to provide a mechanism for cleaning
!         up old archived WAL files that are no longer needed by the standby
!         server.
!         Any <literal>%r</> is replaced by the name of the file
!         containing the last valid restart point. That is the earliest file that
!         must be kept to allow a restore to be restartable, so this information
!         can be used to truncate the archive to just the minimum required to
!         support restart from the current restore. <literal>%r</> would
!         typically be used in a warm-standby configuration
!         (see <xref linkend="warm-standby">).
!         Write <literal>%%</> to embed an actual <literal>%</> character
!         in the command.
         </para>
         <para>
          If the command returns a non-zero exit status then a WARNING log
--- 80,109 ----
        </indexterm>
        <listitem>
         <para>
!         This optional parameter specifies a shell command that will be executed
!         at every restartpoint.  The purpose of
!         <varname>archive_cleanup_command</> is to provide a mechanism for
!         cleaning up old archived WAL files that are no longer needed by the
!         standby server.
!         Any <literal>%r</> is replaced by the name of the file containing the
!         last valid restart point.
!         That is the earliest file that must be <emphasis>kept</> to allow a
!         restore to be restartable, and so all files earlier than <literal>%r</>
!         may be safely removed.
!         This information can be used to truncate the archive to just the
!         minimum required to support restart from the current restore.
!         The <application>pg_archivecleanup</> utility provided in
!         <literal>contrib</> (see <xref linkend="pgarchivecleanup">) serves as a
!         convenient target for <varname>archive_cleanup_command</> in typical
!         single-standby configurations, for example:
! <programlisting>
! archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'
! </programlisting>
!         Note however that if multiple standby servers are restoring from the
!         same archive directory, you will need to ensure that you do not delete
!         WAL files until they are no longer needed by any of the servers.
!         <varname>archive_cleanup_command</> would typically be used in a
!         warm-standby configuration (see <xref linkend="warm-standby">).
!         Write <literal>%%</> to embed an actual <literal>%</> character in the
!         command.
         </para>
         <para>
          If the command returns a non-zero exit status then a WARNING log
#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Brendan Jurd (#5)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:

> I'll drop this onto the next open commitfest. If it passes muster, it
> sure wouldn't hurt to backpatch it to 9.0.

Committed. Not sure there's anything there worth backpatching? There
aren't any doc bugs there.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#7Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#6)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On Thu, Oct 14, 2010 at 2:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

> On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:
>
>> I'll drop this onto the next open commitfest.  If it passes muster, it
>> sure wouldn't hurt to backpatch it to 9.0.
>
> Committed. Not sure there's anything there worth backpatching? There
> aren't any doc bugs there.

It's a while until 9.1 comes out, so it might be helpful to back-patch to 9.0.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Brendan Jurd
direvus@gmail.com
In reply to: Simon Riggs (#6)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On 15 October 2010 05:33, Simon Riggs <simon@2ndquadrant.com> wrote:

> On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:
>
>> I'll drop this onto the next open commitfest.  If it passes muster, it
>> sure wouldn't hurt to backpatch it to 9.0.
>
> Committed. Not sure there's anything there worth backpatching? There
> aren't any doc bugs there.

Thanks for the commit Simon.

Agreed that there are no doc bugs. The reason I suggested a backpatch
is that I'm concerned that a lot of people are going to be approaching
the whole Standby topic for the first time with 9.0, so it would be
nice to give those folks an accessible account of how
archive_cleanup_command is meant to be used.

I was also working from the assumption that the "we only backpatch bug
fixes" policy applied to the code, not so much to the documentation.
If I was in error about that, well fair enough then.

Cheers,
BJ

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brendan Jurd (#8)
Re: [HACKERS] Docs for archive_cleanup_command are poor

Brendan Jurd <direvus@gmail.com> writes:

> Agreed that there are no doc bugs. The reason I suggested a backpatch
> is that I'm concerned that a lot of people are going to be approaching
> the whole Standby topic for the first time with 9.0, so it would be
> nice to give those folks an accessible account of how
> archive_cleanup_command is meant to be used.

Yeah, if this is a material improvement in the usefulness of the HS
docs, I think including it into 9.0.x isn't a bad idea.

regards, tom lane

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#9)
Re: [HACKERS] Docs for archive_cleanup_command are poor

On Thu, 2010-10-14 at 18:05 -0400, Tom Lane wrote:

> Brendan Jurd <direvus@gmail.com> writes:
>
>> Agreed that there are no doc bugs. The reason I suggested a backpatch
>> is that I'm concerned that a lot of people are going to be approaching
>> the whole Standby topic for the first time with 9.0, so it would be
>> nice to give those folks an accessible account of how
>> archive_cleanup_command is meant to be used.
>
> Yeah, if this is a material improvement in the usefulness of the HS
> docs, I think including it into 9.0.x isn't a bad idea.

Done.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services