Incrementally Updated Backups: Docs Clarification

Started by Thomas F. O'Connell about 19 years ago | 4 messages | docs, general
#1 Thomas F. O'Connell
tf@o.ptimized.com
docs, general

I'll preemptively apologize for cross-posting this, but I follow the
lists via Google Groups, and I never saw this show up on -general
there. I see it in the archives, but the lack of discussion on
-general makes me wonder if it was widely read. Since it involves what
I perceive to be a lack of clarity in the docs, I'll offer to make up
for my cross-posting by helping to improve the docs as mentioned
below once I get things working...

I'm about to begin playing with incrementally updated backups for a
warm standby scenario, but I need some help understanding this
paragraph in postgres terms. From 23.4.5 in the 8.2.3 docs:

"If we take a backup of the standby server's files while it is
following logs shipped from the primary, we will be able to reload
that data and restart the standby's recovery process from the last
restart point. We no longer need to keep WAL files from before the
restart point. If we need to recover, it will be faster to recover
from the incrementally updated backup than from the original base
backup."

I'm specifically confused about the meaning of the following phrases:

"backup of the standby server's files" - Which files?

"reload that data" - What does this mean in postgres terms?

"last restart point" - What is this? Wouldn't it be able to restart
from the last recovered file, which would presumably occur later than
the last restart point?

Does this mean make a filesystem backup of the standby server's data
directory while it's stopped, and then start it again with that data
and the restricted set of WAL files needed to continue recovery? I'd
like to see the language here converted to words that have more
meaning in the context of postgres. I'd be happy to attempt a
revision of this section once I'm able to complete an incrementally
updated backup successfully.

Here's how I envision it playing out in practice:

1. stop standby postgres server
2. [optional] preserve data directory, remove unnecessary WAL files
3. restart standby server

Is that all there is to it?

--
Thomas F. O'Connell

optimizing modern web applications
: for search engines, for usability, and for performance :

http://o.ptimized.com/
615-260-0005

#2 Simon Riggs
simon@2ndQuadrant.com
In reply to: Thomas F. O'Connell (#1)
docs, general
Re: [DOCS] Incrementally Updated Backups: Docs Clarification

On Thu, 2007-04-19 at 15:48 -0500, Thomas F. O'Connell wrote:

> "If we take a backup of the standby server's files while it is
> following logs shipped from the primary, we will be able to reload
> that data and restart the standby's recovery process from the last
> restart point. We no longer need to keep WAL files from before the
> restart point. If we need to recover, it will be faster to recover
> from the incrementally updated backup than from the original base
> backup."
>
> I'm specifically confused about the meaning of the following phrases:
>
> "backup of the standby server's files" - Which files?

the files that make up the database server:
- data directory
- all tablespace directories
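
A minimal Python sketch of the file set described above (illustrative only, not part of the original message). It assumes tablespaces show up as symbolic links under `pg_tblspc` inside the data directory; the function name `standby_backup_paths` and the demo layout are hypothetical.

```python
import os
import tempfile


def standby_backup_paths(data_dir):
    """Return the directories a file-level backup of the standby should cover.

    Assumption: each tablespace is a symbolic link inside
    <data_dir>/pg_tblspc pointing at the real tablespace directory.
    """
    paths = [os.path.realpath(data_dir)]
    tblspc_dir = os.path.join(data_dir, "pg_tblspc")
    if os.path.isdir(tblspc_dir):
        for entry in sorted(os.listdir(tblspc_dir)):
            link = os.path.join(tblspc_dir, entry)
            if os.path.islink(link):
                # Follow the link to the real tablespace directory.
                paths.append(os.path.realpath(link))
    return paths


def demo():
    # Hypothetical layout built in a temp directory, for illustration only.
    root = os.path.realpath(tempfile.mkdtemp())
    data_dir = os.path.join(root, "data")
    tablespace = os.path.join(root, "fast_disk")
    os.makedirs(os.path.join(data_dir, "pg_tblspc"))
    os.makedirs(tablespace)
    os.symlink(tablespace, os.path.join(data_dir, "pg_tblspc", "16385"))
    return data_dir, tablespace, standby_backup_paths(data_dir)
```

The sketch only enumerates directories; copying them (with tar, rsync, or similar) is a separate step.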

> "reload that data" - What does this mean in postgres terms?

copy back from wherever you put them in the first place

"that data" referring to the "files that make up the db server"

> "last restart point" - What is this? Wouldn't it be able to restart
> from the last recovered file, which would presumably occur later than
> the last restart point?

No, we don't restart file-by-file.

http://developer.postgresql.org/pgdocs/postgres/continuous-archiving.html#BACKUP-PITR-RECOVERY

"If recovery finds a corruption in the WAL..." onwards explains the
restart mechanism. It's much like checkpointing, so we don't restart
from the last log file; we restart from a point possibly many log
files in the past.
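
A minimal Python sketch of the "WAL files from before the restart point" idea (illustrative only, not part of the original message). It assumes the name of the WAL segment that holds the last restart point is already known; how to obtain it is outside the sketch. The function name `removable_segments` is hypothetical. WAL segment file names are 24 uppercase hex digits (timeline, log, segment), so within one timeline a plain string comparison orders them.

```python
import re

# A WAL segment file name is 24 hex digits: timeline, log, segment (8 each).
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(archived_names, restart_segment):
    """Return archived WAL segments that sort before the restart segment.

    Assumption: restart_segment is the segment holding the last restart
    point. Only segments on the same timeline are considered; anything
    that is not a plain segment file (*.backup, *.history) is kept.
    """
    if not SEGMENT_NAME.match(restart_segment):
        raise ValueError("not a WAL segment name: %r" % restart_segment)
    timeline = restart_segment[:8]
    return sorted(
        name
        for name in archived_names
        if SEGMENT_NAME.match(name)
        and name[:8] == timeline
        and name < restart_segment
    )


# Hypothetical archive listing, for illustration only.
archive = [
    "000000010000000000000003",
    "000000010000000000000004",
    "000000010000000000000005",
    "000000010000000000000003.00000020.backup",
    "00000002.history",
]
old = removable_segments(archive, "000000010000000000000005")
# old == ["000000010000000000000003", "000000010000000000000004"]
```

The segment containing the restart point itself is kept, since recovery restarts from inside it.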

> Does this mean make a filesystem backup of the standby server's data
> directory while it's stopped, and then start it again with that data
> and the restricted set of WAL files needed to continue recovery?

No need to stop the server. Where did you read that you need to do that?

> I'd like to see the language here converted to words that have more
> meaning in the context of postgres. I'd be happy to attempt a revision
> of this section once I'm able to complete an incrementally updated
> backup successfully.

Feel free to provide updates that make it clearer.

> Here's how I envision it playing out in practice:
>
> 1. stop standby postgres server
> 2. [optional] preserve data directory, remove unnecessary WAL files
> 3. restart standby server

Step 2 only.

Clearly not an optional step, since it's a one-stage process. :-)

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#3 Thomas F. O'Connell
tf@o.ptimized.com
In reply to: Simon Riggs (#2)
docs, general
Re: [DOCS] Incrementally Updated Backups: Docs Clarification

On Apr 25, 2007, at 9:42 AM, Simon Riggs wrote:

> [...]

Well, this conversation made things a lot clearer, but I'm not sure
(yet) how to patch the docs. It seems like the original version is
written in general terms, whereas what our Q&A produces here is very
postgres-specific. I'll see if I can produce a version that would
add clarity (for me).

--
Thomas F. O'Connell


#4 Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#2)
docs, general
Re: [DOCS] Incrementally Updated Backups: Docs Clarification

I have updated the docs by changing a few words, patch attached.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/diff (text/x-diff, +2 -2)