Backup docs

Started by Magnus Hagander over 13 years ago, 10 messages
#1 Magnus Hagander
magnus@hagander.net

In reference to:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html

I would like to see that page changed to list pg_basebackup as the
"default" way of doing base backups, and then list the "manual way" as
an option if you need more flexibility.

The reason being that for the majority of users that's going to be
flexible enough, and it's easier to use. And it doesn't hurt to show
that setting these things up really doesn't have to be that hard.

But since I'm definitely slightly biased on this, I'm not going to go
changing anything, or even write up suggested changes, until I can
get some agreement that making this change is good in the first place
;) Thus, please...

I'd also like to add "pg_basebackup -x" under standalone hot backups,
again as the main option.

I also wonder if we need a tl;dr; section of that whole page that just
goes through *what to do*, rather than why we do it? Of course not
removing the details, just showing the simplest case in, um, a simpler
way?
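
For illustration only, the simplest case might look roughly like the sketch below. This is not proposed wording for the docs. The `run` and `take_base_backup` helpers, the target directory and the label are hypothetical; only `pg_basebackup` and its `-D`, `-F`, `-l` and `-P` options belong to the tool. The sketch assumes connection settings come from the usual libpq environment variables and that the server accepts a replication connection for that user.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# take_base_backup TARGET_DIR FORMAT LABEL
#   FORMAT is "p" (plain files) or "t" (tar archive).
take_base_backup() {
    run pg_basebackup -D "$1" -F "$2" -l "$3" -P
}

# Hypothetical target directory and label.
take_base_backup /var/lib/pgsql/backups/base p nightly
```

Running the script as-is only prints the command; setting DRY_RUN=0 would actually invoke pg_basebackup.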

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#2 Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#1)
Re: Backup docs

On 5 June 2012 14:43, Magnus Hagander <magnus@hagander.net> wrote:

In reference to:
http://www.postgresql.org/docs/devel/static/continuous-archiving.html

I would like to see that page changed to list pg_basebackup as the
"default" way of doing base backups, and then list the "manual way" as
an option if you need more flexibility.

Agreed, but prefer phrasing old approach as "lower level API" or similar.

The reason being that for the majority of users that's going to be
flexible enough, and it's easier to use. And it doesn't hurt to show
that setting these things up really doesn't have to be that hard.

But since I'm definitely slightly biased on this, I'm not going to go
changing anything, or even write up suggested changes, until I can
get some agreement that making this change is good in the first place
;) Thus, please...

I'd also like to add "pg_basebackup -x" under standalone hot backups,
again as the main option.

I also wonder if we need a tl;dr; section of that whole page that just
goes through *what to do*, rather than why we do it? Of course not
removing the details, just showing the simplest case in, um, a simpler
way?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#3 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Magnus Hagander (#1)
Re: Backup docs

Magnus Hagander <magnus@hagander.net> wrote:

In reference to:

http://www.postgresql.org/docs/devel/static/continuous-archiving.html

I would like to see that page changed to list pg_basebackup as the
"default" way of doing base backups, and then list the "manual
way" as an option if you need more flexibility.

That sounds reasonable to me, as long as we don't lose information
on the alternatives. I agree that we should *emphasize* the easiest
steps to set up and run. The lower-level alternatives could even be
moved to a separate "tuning" section. (Speaking of which, if we
have such a section I think it would make sense to describe the
rsync techniques which minimize network traffic in the docs.)
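
One possible reading of the rsync approach, sketched under stated assumptions: the copy is made against an existing previous copy on the backup host, so rsync only transfers what changed, and `-z` compresses what it does send. The `run` and `lowlevel_rsync_backup` helpers, the paths and the host name are hypothetical. The separate `psql` calls rely on the `pg_start_backup()`/`pg_stop_backup()` functions as they exist in the releases discussed in this thread; PostgreSQL 15 and later renamed them and require a single session for both calls, so this shape would not carry over unchanged.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# lowlevel_rsync_backup DATA_DIR DEST LABEL
lowlevel_rsync_backup() {
    src=$1; dest=$2; label=$3

    # Step 1: enter backup mode; stop here if that fails.
    run psql -c "SELECT pg_start_backup('$label');" || return 1

    # Step 2: copy the cluster. Against an existing copy at DEST, rsync
    # sends only the differences. WAL files and the pid file are skipped.
    run rsync -a -z --delete '--exclude=pg_xlog/*' \
        --exclude=postmaster.pid "$src/" "$dest/"
    rc=$?

    # Step 3: always leave backup mode, even if the copy failed.
    run psql -c "SELECT pg_stop_backup();" || return 1
    return $rc
}

# Hypothetical data directory, destination and label.
lowlevel_rsync_backup /usr/local/pgsql/data backuphost:/backups/base nightly
```

The function returns rsync's exit status, so a caller can decide whether the copy is usable before archiving it.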

Basically, a simple, straightforward description of the easy way to
get going is desperately needed, with alternatives separated out a
bit, with some hint as to when it might be worth going to the extra
trouble.

-Kevin

#4 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#1)
Re: Backup docs

Magnus Hagander <magnus@hagander.net> writes:

I would like to see that page changed to list pg_basebackup as the
"default" way of doing base backups, and then list the "manual way" as
an option if you need more flexibility.

+1

I'd also like to add "pg_basebackup -x" under standalone hot backups,
again as the main option.

+1

I also wonder if we need a tl;dr; section of that whole page that just
goes through *what to do*, rather than why we do it? Of course not
removing the details, just showing the simplest case in, um, a simpler
way?

+1

Come to think of it, that is the perfect occasion to have the
tutorial open itself up to dealing with admin tasks, right?

Please let's apply that documentation patch to 9.2 too.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#5 Robert Haas
robertmhaas@gmail.com
In reply to: Dimitri Fontaine (#4)
Re: Backup docs

On Wed, Jun 13, 2012 at 3:20 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Please let's apply that documentation patch to 9.2 too.

Agreed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6 Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#5)
1 attachment(s)
Re: Backup docs

On Thu, Jun 14, 2012 at 10:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jun 13, 2012 at 3:20 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Please let's apply that documentation patch to 9.2 too.

Agreed.

Here's a patch that does the first two things. Does not attempt a
tl;dr section yet. Also adds a subheader for the notes about
compressing archive logs that seems to have been missing for a long
time - that's definitely valid for things that aren't standalone
backups, and is arguably a lot more *useful* in cases that aren't
standalone backups (since standalone backups won't have very much
log).

No removed text, just moved around and added some.

Unless there are objections to this one specifically, I'll go ahead
and commit it soon.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

backup_docs.patch (application/octet-stream; name=backup_docs.patch)
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 0180df5..ea639e9 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -724,7 +724,72 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_xlog/
    <title>Making a Base Backup</title>
 
    <para>
-    The procedure for making a base backup is relatively simple:
+    The easiest way to perform a base backup is to use the
+    <xref linkend="app-pgbasebackup"> tool. It can create
+    a base backup either as regular files or as a tar archive. If more
+    flexibility than <xref linkend="app-pgbasebackup"> can provide is
+    required, you can also make a base backup using the low level API
+    (see <xref linkend="backup-lowlevel-base-backup">).
+   </para>
+
+   <para>
+    It is not necessary to be concerned about the amount of time it takes
+    to make a base backup. However, if you normally run the
+    server with <varname>full_page_writes</> disabled, you might notice a drop
+    in performance while the backup runs since <varname>full_page_writes</> is
+    effectively forced on during backup mode.
+   </para>
+
+   <para>
+    To make use of the backup, you will need to keep all the WAL
+    segment files generated during and after the file system backup.
+    To aid you in doing this, the base backup process
+    creates a <firstterm>backup history file</> that is immediately
+    stored into the WAL archive area. This file is named after the first
+    WAL segment file that you need for the file system backup.
+    For example, if the starting WAL file is
+    <literal>0000000100001234000055CD</> the backup history file will be
+    named something like
+    <literal>0000000100001234000055CD.007C9330.backup</>. (The second
+    part of the file name stands for an exact position within the WAL
+    file, and can ordinarily be ignored.) Once you have safely archived
+    the file system backup and the WAL segment files used during the
+    backup (as specified in the backup history file), all archived WAL
+    segments with names numerically less are no longer needed to recover
+    the file system backup and can be deleted. However, you should
+    consider keeping several backup sets to be absolutely certain that
+    you can recover your data.
+   </para>
+
+   <para>
+    The backup history file is just a small text file. It contains the
+    label string you gave to <xref linkend="app-pgbasebackup">, as well as
+    the starting and ending times and WAL segments of the backup.
+    If you used the label to identify the associated dump file,
+    then the archived history file is enough to tell you which dump file to
+    restore.
+   </para>
+
+   <para>
+    Since you have to keep around all the archived WAL files back to your
+    last base backup, the interval between base backups should usually be
+    chosen based on how much storage you want to expend on archived WAL
+    files.  You should also consider how long you are prepared to spend
+    recovering, if recovery should be necessary &mdash; the system will have to
+    replay all those WAL segments, and that could take awhile if it has
+    been a long time since the last base backup.
+   </para>
+  </sect2>
+
+  <sect2 id="backup-lowlevel-base-backup">
+   <title>Making a Base Backup Using the Low Level API</title>
+   <para>
+    The procedure for making a base backup using the low level
+    APIs contains a few more steps than
+    the <xref linkend="app-pgbasebackup"> method, but is relatively
+    simple. It is very important that these steps are executed in
+    sequence, and that the success of a step is verified before
+    proceeding to the next step.
   <orderedlist>
    <listitem>
     <para>
@@ -814,17 +879,6 @@ SELECT pg_stop_backup();
    </para>
 
    <para>
-    You can also use the <xref linkend="app-pgbasebackup"> tool to take
-    the backup, instead of manually copying the files. This tool will do
-    the equivalent of <function>pg_start_backup()</>, copy and
-    <function>pg_stop_backup()</> steps automatically, and transfers the
-    backup over a regular <productname>PostgreSQL</productname> connection
-    using the replication protocol, instead of requiring file system level
-    access. <command>pg_basebackup</command> does not interfere with file system level backups
-    taken using <function>pg_start_backup()</>/<function>pg_stop_backup()</>.
-   </para>
-
-   <para>
     Some file system backup tools emit warnings or errors
     if the files they are trying to copy change while the copy proceeds.
     When taking a base backup of an active database, this situation is normal
@@ -843,19 +897,6 @@ SELECT pg_stop_backup();
    </para>
 
    <para>
-    It is not necessary to be concerned about the amount of time elapsed
-    between <function>pg_start_backup</> and the start of the actual backup,
-    nor between the end of the backup and <function>pg_stop_backup</>; a
-    few minutes' delay won't hurt anything.  (However, if you normally run the
-    server with <varname>full_page_writes</> disabled, you might notice a drop
-    in performance between <function>pg_start_backup</> and
-    <function>pg_stop_backup</>, since <varname>full_page_writes</> is
-    effectively forced on during backup mode.)  You must ensure that these
-    steps are carried out in sequence, without any possible
-    overlap, or you will invalidate the backup.
-   </para>
-
-   <para>
     Be certain that your backup dump includes all of the files under
     the database cluster directory (e.g., <filename>/usr/local/pgsql/data</>).
     If you are using tablespaces that do not reside underneath this directory,
@@ -879,46 +920,6 @@ SELECT pg_stop_backup();
    </para>
 
    <para>
-    To make use of the backup, you will need to keep all the WAL
-    segment files generated during and after the file system backup.
-    To aid you in doing this, the <function>pg_stop_backup</> function
-    creates a <firstterm>backup history file</> that is immediately
-    stored into the WAL archive area. This file is named after the first
-    WAL segment file that you need for the file system backup.
-    For example, if the starting WAL file is
-    <literal>0000000100001234000055CD</> the backup history file will be
-    named something like
-    <literal>0000000100001234000055CD.007C9330.backup</>. (The second
-    part of the file name stands for an exact position within the WAL
-    file, and can ordinarily be ignored.) Once you have safely archived
-    the file system backup and the WAL segment files used during the
-    backup (as specified in the backup history file), all archived WAL
-    segments with names numerically less are no longer needed to recover
-    the file system backup and can be deleted. However, you should
-    consider keeping several backup sets to be absolutely certain that
-    you can recover your data.
-   </para>
-
-   <para>
-    The backup history file is just a small text file. It contains the
-    label string you gave to <function>pg_start_backup</>, as well as
-    the starting and ending times and WAL segments of the backup.
-    If you used the label to identify the associated dump file,
-    then the archived history file is enough to tell you which dump file to
-    restore.
-   </para>
-
-   <para>
-    Since you have to keep around all the archived WAL files back to your
-    last base backup, the interval between base backups should usually be
-    chosen based on how much storage you want to expend on archived WAL
-    files.  You should also consider how long you are prepared to spend
-    recovering, if recovery should be necessary &mdash; the system will have to
-    replay all those WAL segments, and that could take awhile if it has
-    been a long time since the last base backup.
-   </para>
-
-   <para>
     It's also worth noting that the <function>pg_start_backup</> function
     makes a file named <filename>backup_label</> in the database cluster
     directory, which is removed by <function>pg_stop_backup</>.
@@ -1214,7 +1215,18 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
      </para>
 
      <para>
-      To prepare for standalone hot backups, set <varname>wal_level</> to
+      As with base backups, the easiest way to produce a standalone
+      hot backup is to use the <xref linkend="app-pgbasebackup">
+      tool. If you include the <literal>-X</> parameter when calling
+      it, all the transaction log required to use the backup will be
+      included in the backup automatically, and no special action is
+      required to restore the backup.
+     </para>
+
+     <para>
+      If more flexibility in copying the backup files is needed, a lower
+      level process can be used for standalone hot backups as well.
+      To prepare for low level standalone hot backups, set <varname>wal_level</> to
       <literal>archive</> (or <literal>hot_standby</>), <varname>archive_mode</> to
       <literal>on</>, and set up an <varname>archive_command</> that performs
       archiving only when a <emphasis>switch file</> exists.  For example:
@@ -1246,6 +1258,11 @@ tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
       Please remember to add error handling to your backup scripts.
      </para>
 
+    </sect3>
+
+    <sect3 id="compressed-archive-logs">
+     <title>Compressed Archive Logs</title>
+
      <para>
       If archive storage size is a concern, you can use
       <application>gzip</application> to compress the archive files:

#7 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#6)
Re: Backup docs

Magnus Hagander <magnus@hagander.net> writes:

-    The procedure for making a base backup is relatively simple:
+    The easiest way to perform a base backup is to use the
+    <xref linkend="app-pgbasebackup"> tool. It can create
+    a base backup either as regular files or as a tar archive. If more
+    flexibility than <xref linkend="app-pgbasebackup"> can provide is
+    required, you can also make a base backup using the low level API
+    (see <xref linkend="backup-lowlevel-base-backup">).
+   </para>

Good start.

+   <para>
+    It is not necessary to be concerned about the amount of time it takes
+    to make a base backup. However, if you normally run the

Why not?

+    file, and can ordinarily be ignored.) Once you have safely archived
+    the file system backup and the WAL segment files used during the
+    backup (as specified in the backup history file), all archived WAL
+    segments with names numerically less are no longer needed to recover
+    the file system backup and can be deleted. However, you should
+    consider keeping several backup sets to be absolutely certain that
+    you can recover your data.
+   </para>

You're frightening off users by not giving details, I think. How to be
certain I can recover my data? Is there a way that I can't when a backup
has been successfully made? How can I check?

Also I don't see mention of basebackup+wal files all in one with the -x
option, which I thought would have to be addressed here?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#8 Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#7)
Re: Backup docs

On Sat, Jun 16, 2012 at 4:39 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Magnus Hagander <magnus@hagander.net> writes:

-    The procedure for making a base backup is relatively simple:
+    The easiest way to perform a base backup is to use the
+    <xref linkend="app-pgbasebackup"> tool. It can create
+    a base backup either as regular files or as a tar archive. If more
+    flexibility than <xref linkend="app-pgbasebackup"> can provide is
+    required, you can also make a base backup using the low level API
+    (see <xref linkend="backup-lowlevel-base-backup">).
+   </para>

Good start.

+   <para>
+    It is not necessary to be concerned about the amount of time it takes
+    to make a base backup. However, if you normally run the

Why not?

This is copied from the old documentation. It used to say "It is not
necessary to be concerned about the amount of time elapsed between
pg_start_backup and the start of the actual backup, nor between the
end of the backup and pg_stop_backup".

And the whole idea was to simplify the text at the beginning ;)

+    file, and can ordinarily be ignored.) Once you have safely archived
+    the file system backup and the WAL segment files used during the
+    backup (as specified in the backup history file), all archived WAL
+    segments with names numerically less are no longer needed to recover
+    the file system backup and can be deleted. However, you should
+    consider keeping several backup sets to be absolutely certain that
+    you can recover your data.
+   </para>

You're frightening off users by not giving details, I think. How to be

This is copied exactly from what it is today. I'm sure it can be
improved, but it's not the goal of this patch. Let's not let
perfection get in the way of improvement...
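
As a small illustration of the naming rule in the quoted paragraph: the starting WAL segment is the part of the backup history file name before the first dot, and a tool such as pg_archivecleanup can list the archived segments that logically precede it. This is a sketch under assumptions. The `run`, `start_segment` and `list_obsolete_wal` helpers are hypothetical, and the archive directory is simply the example path used elsewhere in backup.sgml. The `-n` option asks pg_archivecleanup to report what it would remove without removing anything.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_segment HISTORY_FILE
#   Prints the starting WAL segment name encoded in a backup history
#   file name such as 0000000100001234000055CD.007C9330.backup.
start_segment() {
    base=${1##*/}                 # strip any directory part
    printf '%s\n' "${base%%.*}"   # drop ".<offset>.backup"
}

# list_obsolete_wal ARCHIVE_DIR HISTORY_FILE
#   Dry-run listing of archived segments older than the backup start.
list_obsolete_wal() {
    run pg_archivecleanup -n "$1" "$(start_segment "$2")"
}

start_segment 0000000100001234000055CD.007C9330.backup
list_obsolete_wal /mnt/server/archivedir 0000000100001234000055CD.007C9330.backup
```

Keeping several backup sets, as the quoted text advises, would mean passing the history file of the oldest backup still wanted rather than the newest one.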

Also I don't see mention of basebackup+wal files all in one with the -x
option, which I thought would have to be addressed here?

It does, it's under "standalone hot backups". The second to last part
of the patch.
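
For reference, a standalone hot backup along those lines might be taken as in the sketch below. The `run` and `take_standalone_backup` helpers and the target directory are hypothetical. The sketch uses `-X fetch`, which on 9.2 should be equivalent to the `-x` shorthand mentioned earlier in the thread; later major versions are understood to have dropped `-x`, so the long-lived spelling is used here.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# take_standalone_backup TARGET_DIR
#   Writes a gzip-compressed tar backup that also contains the WAL
#   needed to restore it, so no separate archive is required.
take_standalone_backup() {
    run pg_basebackup -D "$1" -F t -z -X fetch
}

# Hypothetical target directory.
take_standalone_backup /var/lib/pgsql/backups/standalone
```

With fetch mode the WAL is collected at the end of the backup, so the server has to retain enough WAL for the duration of the run.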

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#9 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#8)
Re: Backup docs

Magnus Hagander <magnus@hagander.net> writes:

This is copied from the old documentation. It used to say "It is not
necessary to be concerned about the amount of time elapsed between
pg_start_backup and the start of the actual backup, nor between the
end of the backup and pg_stop_backup".

And the whole idea was to simplify the text at the beginning ;)

Oh I see, not your patch to fix then. I just quickly read the diff, as you
can see.

This is copied exactly from what it is today. I'm sure it can be
improved, but it's not the goal of this patch. Let's not let
perfection get in the way of improvement...

Same.

It does, it's under "standalone hot backups". The second to last part
of the patch.

Perfect then.

Sorry for the noise, regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#10 Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#9)
Re: Backup docs

On Sun, Jun 17, 2012 at 12:13 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Magnus Hagander <magnus@hagander.net> writes:

This is copied from the old documentation. It used to say "It is not
necessary to be concerned about the amount of time elapsed between
pg_start_backup and the start of the actual backup, nor between the
end of the backup and pg_stop_backup".

And the whole idea was to simplify the text at the beginning ;)

Oh I see, not your patch to fix then. I just quickly read the diff, as you
can see.

This is copied exactly from what it is today. I'm sure it can be
improved, but it's not the goal of this patch. Let's not let
perfection get in the way of improvement...

Same.

It does, it's under "standalone hot backups". The second to last part
of the patch.

Perfect then.

Sorry for the noise, regards,

np, thanks for checking. Applied.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/