Incrementally Updated Backup
Way past feature freeze, but this small change allows a powerful new
feature utilising the Restartable Recovery capability. Very useful for
very large database backups...
Includes full documentation.
Perhaps a bit rushed, but inclusion in 8.2 would be great. (Ouch, don't
shout back, read the patch first....)
-----------------------------
Docs copied here as a better explanation:
<title>Incrementally Updated Backups</title>
<para>
Restartable Recovery can also be utilised to avoid the need to take
regular complete base backups, thus improving backup performance in
situations where the server is heavily loaded or the database is
very large. This concept is known as incrementally updated backups.
</para>
<para>
If we take a backup of the server files after a recovery is partially
completed, we will be able to restart the recovery from the last
restartpoint. This backup is now further forward along the timeline
than the original base backup, so we can refer to it as an incrementally
updated backup. If we need to recover, it will be faster to recover from
the incrementally updated backup than from the base backup.
</para>
<para>
The <xref linkend="startup-after-recovery"> option in the recovery.conf
file is provided to allow the recovery to complete up to the current last
WAL segment, yet without starting the database. This option allows us
to stop the server and take a backup of the partially recovered server
files: this is the incrementally updated backup.
</para>
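For illustration, a recovery.conf using the proposed option might look like this (the startup_after_recovery option exists only in this patch, and the archive path is a hypothetical placeholder):

```
restore_command = 'cp /mnt/server/archivedir/%f %p'
startup_after_recovery = 'false'
```

With this in place, recovery replays all available archived WAL and then stops without opening the database for connections, leaving the files ready to be copied.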
<para>
We can use the incrementally updated backup concept to come up with a
streamlined backup schedule. For example:
<orderedlist>
<listitem>
<para>
Set up continuous archiving
</para>
</listitem>
<listitem>
<para>
Take a weekly base backup
</para>
</listitem>
<listitem>
<para>
After 24 hours, restore the base backup to another server, then run a
partial recovery and take a backup of the latest database state to
produce an incrementally updated backup.
</para>
</listitem>
<listitem>
<para>
After the next 24 hours, restore the incrementally updated backup to the
second server, then run a partial recovery and, at the end, take a backup
of the partially recovered files.
</para>
</listitem>
<listitem>
<para>
Repeat previous step each day, until the end of the week.
</para>
</listitem>
</orderedlist>
</para>
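As a sketch only, the daily cycle above could be driven by cron on the second server; the script names and paths here are hypothetical placeholders for the restore/recover/backup commands described in the steps:

```
# crontab on the second (backup) server -- hypothetical scripts
# Sunday 02:00: restore the fresh weekly base backup and begin recovery
0 2 * * 0   /usr/local/pgsql/bin/restore_and_recover.sh base
# Monday-Saturday 02:00: restore the previous incrementally updated
# backup, recover up to the latest archived WAL, stop the server, and
# archive the partially recovered files as the new incremental backup
0 2 * * 1-6 /usr/local/pgsql/bin/restore_and_recover.sh incremental
```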
<para>
A base backup need only be taken once per week, yet the same level of
protection is offered as if base backups were taken nightly.
</para>
</sect2>
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachments:
iubackup.patch (text/x-patch; charset=UTF-8)
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.86
diff -c -r2.86 backup.sgml
*** doc/src/sgml/backup.sgml 16 Sep 2006 00:30:11 -0000 2.86
--- doc/src/sgml/backup.sgml 19 Sep 2006 14:03:32 -0000
***************
*** 1063,1068 ****
--- 1063,1081 ----
</listitem>
</varlistentry>
+ <varlistentry id="startup-after-recovery"
+ xreflabel="startup_after_recovery">
+ <term><varname>startup_after_recovery</varname>
+ (<type>boolean</type>)
+ </term>
+ <listitem>
+ <para>
+ Allows an incrementally updated backup to be taken.
+ See <xref linkend="backup-incremental-updated"> for discussion.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect3>
***************
*** 1137,1142 ****
--- 1150,1229 ----
</para>
</sect2>
+ <sect2 id="backup-incremental-updated">
+ <title>Incrementally Updated Backups</title>
+
+ <indexterm zone="backup">
+ <primary>incrementally updated backups</primary>
+ </indexterm>
+
+ <para>
+ Restartable Recovery can also be utilised to avoid the need to take
+ regular complete base backups, thus improving backup performance in
+ situations where the server is heavily loaded or the database is
+ very large. This concept is known as incrementally updated backups.
+ </para>
+
+ <para>
+ If we take a backup of the server files after a recovery is partially
+ completed, we will be able to restart the recovery from the last
+ restartpoint. This backup is now further forward along the timeline
+ than the original base backup, so we can refer to it as an incrementally
+ updated backup. If we need to recover, it will be faster to recover from
+ the incrementally updated backup than from the base backup.
+ </para>
+
+ <para>
+ The <xref linkend="startup-after-recovery"> option in the recovery.conf
+ file is provided to allow the recovery to complete up to the current last
+ WAL segment, yet without starting the database. This option allows us
+ to stop the server and take a backup of the partially recovered server
+ files: this is the incrementally updated backup.
+ </para>
+
+ <para>
+ We can use the incrementally updated backup concept to come up with a
+ streamlined backup schedule. For example:
+ <orderedlist>
+ <listitem>
+ <para>
+ Set up continuous archiving
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Take weekly base backup
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ After 24 hours, restore base backup to another server, then run a
+ partial recovery and take a backup of the latest database state to
+ produce an incrementally updated backup.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ After next 24 hours, restore the incrementally updated backup to the
+ second server, then run a partial recovery, at the end, take a backup
+ of the partially recovered files.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Repeat previous step each day, until the end of the week.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ A weekly backup need only be taken once per week, yet the same level of
+ protection is offered as if base backups were taken nightly.
+ </para>
+
+ </sect2>
+
<sect2 id="continuous-archiving-caveats">
<title>Caveats</title>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.249
diff -c -r1.249 xlog.c
*** src/backend/access/transam/xlog.c 21 Aug 2006 16:16:31 -0000 1.249
--- src/backend/access/transam/xlog.c 19 Sep 2006 14:03:36 -0000
***************
*** 182,187 ****
--- 182,188 ----
static bool recoveryTarget = false;
static bool recoveryTargetExact = false;
static bool recoveryTargetInclusive = true;
+ static bool startupAfterRecovery = true;
static TransactionId recoveryTargetXid;
static time_t recoveryTargetTime;
***************
*** 2506,2514 ****
* or because the administrator has specified the restore program
* incorrectly. We have to assume the former.
*/
! ereport(DEBUG2,
(errmsg("could not restore file \"%s\" from archive: return code %d",
xlogfname, rc)));
/*
* if an archived file is not available, there might still be a version of
--- 2507,2520 ----
* or because the administrator has specified the restore program
* incorrectly. We have to assume the former.
*/
! ereport((startupAfterRecovery ? DEBUG2 : LOG),
(errmsg("could not restore file \"%s\" from archive: return code %d",
xlogfname, rc)));
+
+ if (startupAfterRecovery)
+ ereport(ERROR,
+ (errmsg("recovery ends normally with startup_after_recovery=false")));
+
/*
* if an archived file is not available, there might still be a version of
***************
*** 4343,4348 ****
--- 4349,4366 ----
ereport(LOG,
(errmsg("recovery_target_inclusive = %s", tok2)));
}
+ else if (strcmp(tok1, "startup_after_recovery") == 0)
+ {
+ if (strcmp(tok2, "false") == 0)
+ startupAfterRecovery = false;
+ else
+ {
+ startupAfterRecovery = true;
+ tok2 = "true";
+ }
+ ereport(LOG,
+ (errmsg("startup_after_recovery = %s", tok2)));
+ }
else
ereport(FATAL,
(errmsg("unrecognized recovery parameter \"%s\"",
Simon Riggs wrote:
+
+ if (startupAfterRecovery)
+ ereport(ERROR,
+ (errmsg("recovery ends normally with startup_after_recovery=false")));
+
I find this part of the patch a bit ugly. Isn't there a better way to
exit than throwing an error that's not really an error?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki@enterprisedb.com> writes:
Simon Riggs wrote:
+
+ if (startupAfterRecovery)
+ ereport(ERROR,
+ (errmsg("recovery ends normally with startup_after_recovery=false")));
+
I find this part of the patch a bit ugly. Isn't there a better way to
exit than throwing an error that's not really an error?
This patch has obviously been thrown together with no thought and even
less testing. It breaks the normal case (I think the above if-test is
backwards), and I don't believe that it works for the advertised purpose
either (because nothing gets done to force a checkpoint before aborting,
thus the files on disk are not up to date with the end of WAL).
Also, I'm not sold that the concept is even useful. Apparently the idea
is to offload the expense of taking periodic base backups from a master
server, by instead backing up a PITR slave's fileset --- which is fine.
But why in the world would you want to stop the slave to do it? ISTM
we would want to arrange things so that you can copy the slave's files
while it continues replicating, just as with a standard base backup.
regards, tom lane
No, too late.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Tue, 2006-09-19 at 12:13 -0400, Tom Lane wrote:
Also, I'm not sold that the concept is even useful. Apparently the idea
is to offload the expense of taking periodic base backups from a master
server, by instead backing up a PITR slave's fileset --- which is fine.
Good. That's the key part of the idea and it's a useful one, so I was
looking to document it for 8.2
I thought of this idea separately, then, as usual, realised that this
idea has a long heritage: Change Accumulation has been in production use
with IMS for at least 20 years.
But why in the world would you want to stop the slave to do it? ISTM
we would want to arrange things so that you can copy the slave's files
while it continues replicating, just as with a standard base backup.
You can do that, of course, but my thinking was that people would regard
the technique as "unsupported", so I added a quick flag as a prototype.
On Tue, 2006-09-19 at 12:13 -0400, Tom Lane wrote:
This patch has obviously been thrown together with no thought and even
less testing. It breaks the normal case (I think the above if-test is
backwards), and I don't believe that it works for the advertised purpose
either (because nothing gets done to force a checkpoint before aborting,
thus the files on disk are not up to date with the end of WAL).
Yes, it was done very quickly and submitted to ensure it could be
considered yesterday for inclusion. I described it as rushed, which it
certainly was because of personal time pressure that day: I thought that
made it clear that discussion was needed. Heikki mentioned to me that it
wasn't clear, so those criticisms are accepted.
On Tue, 2006-09-19 at 16:05 +0100, Heikki Linnakangas wrote:
Simon Riggs wrote:
+
+ if (startupAfterRecovery)
+ ereport(ERROR,
+ (errmsg("recovery ends normally with startup_after_recovery=false")));
+
I find this part of the patch a bit ugly.
Me too.
Overall, my own thoughts and Tom's and Heikki's comments indicate I
should withdraw the patch rather than fix it. Patch withdrawn.
Enclosed is a new doc patch to describe the capability, without any software change.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Attachments:
iub_doc.patch (text/x-patch; charset=UTF-8)
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.86
diff -c -r2.86 backup.sgml
*** doc/src/sgml/backup.sgml 16 Sep 2006 00:30:11 -0000 2.86
--- doc/src/sgml/backup.sgml 20 Sep 2006 12:43:55 -0000
***************
*** 1137,1142 ****
--- 1150,1197 ----
</para>
</sect2>
+ <sect2 id="backup-incremental-updated">
+ <title>Incrementally Updated Backups</title>
+
+ <indexterm zone="backup">
+ <primary>incrementally updated backups</primary>
+ </indexterm>
+
+ <indexterm zone="backup">
+ <primary>change accumulation</primary>
+ </indexterm>
+
+ <para>
+ Restartable Recovery can also be utilised to offload the expense of
+ taking periodic base backups from a main server, by instead backing
+ up a Standby server's files. This concept is also generally known as
+ incrementally updated backups, log change accumulation or more simply,
+ change accumulation.
+ </para>
+
+ <para>
+ If we take a backup of the server files whilst a recovery is in progress,
+ we will be able to restart the recovery from the last restartpoint.
+ That backup now has many of the changes from previous WAL archive files,
+ so this version is now an updated version of the original base backup.
+ If we need to recover, it will be faster to recover from the
+ incrementally updated backup than from the base backup.
+ </para>
+
+ <para>
+ To make use of this capability you will need to set up a Standby database
+ on a second system, as described in <xref linkend="warm-standby">. By
+ taking a backup of the Standby server while it is running you will
+ have produced an incrementally updated backup. Once this configuration
+ has been implemented you will no longer need to produce regular base
+ backups of the Primary server: all base backups can be performed on the
+ Standby server. If you wish to do this, it is not a requirement that you
+ also implement the failover features of a Warm Standby configuration,
+ though you may find it desirable to do both.
+ </para>
+
+ </sect2>
+
<sect2 id="continuous-archiving-caveats">
<title>Caveats</title>
***************
*** 1287,1292 ****
--- 1342,1355 ----
really offers a solution for Disaster Recovery, not HA.
</para>
+ <para>
+ When running a Standby Server, backups can be performed on the Standby
+ rather than the Primary, thereby offloading the expense of
+ taking periodic base backups. (See
+ <xref linkend="backup-incremental-updated">)
+ </para>
+
+
<para>
Other mechanisms for High Availability replication are available, both
commercially and as open-source software.
On Wed, Sep 20, 2006 at 02:09:43PM +0100, Simon Riggs wrote:
But why in the world would you want to stop the slave to do it? ISTM
we would want to arrange things so that you can copy the slave's files
while it continues replicating, just as with a standard base backup.
You can do that, of course, but my thinking was that people would regard
the technique as "unsupported", so I added a quick flag as a prototype.
An advantage to being able to stop the server is that you could have one
server processing backups for multiple PostgreSQL clusters by going
through them 1 (or more likely, 2, 4, etc) at a time, essentially
providing N+1 capability.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes:
An advantage to being able to stop the server is that you could have one
server processing backups for multiple PostgreSQL clusters by going
through them 1 (or more likely, 2, 4, etc) at a time, essentially
providing N+1 capability.
Why wouldn't you implement that by putting N postmasters onto the backup
server? It'd be far more efficient than the proposed patch, which by
aborting at random points is essentially guaranteeing a whole lot of
useless re-replay of WAL whenever you restart it.
regards, tom lane
On Wed, Sep 20, 2006 at 04:26:30PM -0400, Tom Lane wrote:
"Jim C. Nasby" <jim@nasby.net> writes:
An advantage to being able to stop the server is that you could have one
server processing backups for multiple PostgreSQL clusters by going
through them 1 (or more likely, 2, 4, etc) at a time, essentially
providing N+1 capability.
Why wouldn't you implement that by putting N postmasters onto the backup
server? It'd be far more efficient than the proposed patch, which by
aborting at random points is essentially guaranteeing a whole lot of
useless re-replay of WAL whenever you restart it.
My thought is that in many environments it would take much beefier
hardware to support N postmasters running simultaneously than to cycle
through them periodically bringing the backups up-to-date.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes:
My thought is that in many environments it would take much beefier
hardware to support N postmasters running simultaneously than to cycle
through them periodically bringing the backups up-to-date.
How you figure that? The cycling approach will require more total I/O
due to extra page re-reads ... particularly if it's built on a patch
like this one that abandons work-in-progress at arbitrary points.
A postmaster running WAL replay does not require all that much in the
way of CPU resources. It is going to need I/O comparable to the gross
I/O load of its master, but cycling isn't going to reduce that at all.
regards, tom lane
On Wed, Sep 20, 2006 at 05:50:48PM -0400, Tom Lane wrote:
"Jim C. Nasby" <jim@nasby.net> writes:
My thought is that in many envoronments it would take much beefier
hardware to support N postmasters running simultaneously than to cycle
through them periodically bringing the backups up-to-date.
How you figure that? The cycling approach will require more total I/O
due to extra page re-reads ... particularly if it's built on a patch
like this one that abandons work-in-progress at arbitrary points.
A postmaster running WAL replay does not require all that much in the
way of CPU resources. It is going to need I/O comparable to the gross
I/O load of its master, but cycling isn't going to reduce that at all.
True, but running several dozen instances on a single machine will
require a lot more memory (or, conversely, each individual database gets
a lot less memory to use).
Of course, this is all hand-waving right now... it'd be interesting to
see which approach was actually better.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
True, but running several dozen instances on a single machine will
require a lot more memory (or, conversely, each individual database gets
a lot less memory to use).
Of course, this is all hand-waving right now... it'd be interesting to
see which approach was actually better.
I'm running 4 WAL logging standby clusters on a single machine. While
the load on the master servers occasionally goes up to >60, the load on
the standby machine has never climbed above 5.
Of course when the master servers are all loaded, the standby gets
behind with the recovery... but eventually it gets up to date again.
I would be very surprised if it fell less far behind in the 1-by-1
scenario.
Cheers,
Csaba.
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Documentation patch applied. Thanks. Your documentation changes can be
viewed in five minutes using links on the developer's page,
http://www.postgresql.org/developer/testing.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +