Problem with PITR recovery
I had a problem using PITR recovery just now. If I do:
SELECT pg_start_backup('label');
do my tar
SELECT pg_stop_backup();
and stop the server, delete /data, then recover from the tar, delete
files in pg_xlog, then set recovery.conf to restore, it fails, I think
because no actual pg_xlog file was archived since the tar.
The problem is that we don't archive the partially written xlog file,
and in this case that xlog file contains the information needed to make
the tar file consistent.
Is this a known problem? Do we document this? If so, I can't find it.
I am concerned about folks cleaning out their archive directory after
the pg_stop_backup() not realizing they need that last xlog file to make
the tar valid.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
The problem is that we don't archive the partially written xlog file,
and in this case that xlog file contains the information needed to make
the tar file consistent.
Is this a known problem? Do we document this? If so, I can't find it.
Yes, and yes. You did not follow the procedure:
http://www.postgresql.org/docs/8.0/static/backup-online.html#BACKUP-PITR-RECOVERY
In particular, step 2 says:
: ... you need at the least to copy the contents of the pg_xlog
: subdirectory of the cluster data directory, as it may contain logs which
: were not archived before the system went down.
Possibly this needs to be highlighted a little better.
regards, tom lane
Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
The problem is that we don't archive the partially written xlog file,
and in this case that xlog file contains the information needed to make
the tar file consistent.Is this a known problem? Do we document this? If so, I can't find it.
Yes, and yes. You did not follow the procedure:
http://www.postgresql.org/docs/8.0/static/backup-online.html#BACKUP-PITR-RECOVERY
In particular, step 2 says:
: ... you need at the least to copy the contents of the pg_xlog
: subdirectory of the cluster data directory, as it may contain logs which
: were not archived before the system went down.Possibly this needs to be highlighted a little better.
I figured that part of the goal of PITR was that you could recover from
just the tar backup and archived WAL files --- using the pg_xlog
contents is nice, but not something we can require.
I understood the last missing WAL log would cause missing information,
but not that it would make the tar backup unusable.
It would be nice if we could force a new WAL file on pg_stop_backup()
and archive the WAL file needed to match the tar file. How hard would
that be?
I see in the docs:
To make use of this backup, you will need to keep around all the WAL
segment files generated at or after the starting time of the backup. To
aid you in doing this, the pg_stop_backup function creates a backup
history file that is immediately stored into the WAL archive area. This
file is named after the first WAL segment file that you need to have to
make use of the backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named something
like 0000000100001234000055CD.007C9330.backup. (The second part of this
file name stands for an exact position within the WAL file, and can
ordinarily be ignored.) Once you have safely archived the backup dump
file, you can delete all archived WAL segments with names numerically
preceding this one.
I am not clear on what the "backup dump file" is? I assume it means
0000000100001234000055CD. It is called "WAL segment file" above. I
will rename that phrase to match the above terminology. Patch attached
and applied.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Attachments:
/bjm/difftext/plainDownload+4-4
On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote:
[about backup procedure with PITR documentation
I see in the docs:
To make use of this backup, you will need to keep around all the WAL
segment files generated at or after the starting time of the backup. To
aid you in doing this, the pg_stop_backup function creates a backup
history file that is immediately stored into the WAL archive area. This
file is named after the first WAL segment file that you need to have to
make use of the backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named something
like 0000000100001234000055CD.007C9330.backup. (The second part of this
file name stands for an exact position within the WAL file, and can
ordinarily be ignored.) Once you have safely archived the backup dump
file, you can delete all archived WAL segments with names numerically
preceding this one.I am not clear on what the "backup dump file" is? I assume it means
0000000100001234000055CD. It is called "WAL segment file" above. I
will rename that phrase to match the above terminology. Patch attached
and applied.
Doesn't it refer to the backup file itself (the tar file of the data
directory) ?
You do not want to start deleting WAL segments until that one is safely
archived.
gnari
Ragnar Hafsta��� wrote:
On Sat, 2005-04-16 at 23:06 -0400, Bruce Momjian wrote:
[about backup procedure with PITR documentationI see in the docs:
To make use of this backup, you will need to keep around all the WAL
segment files generated at or after the starting time of the backup. To
aid you in doing this, the pg_stop_backup function creates a backup
history file that is immediately stored into the WAL archive area. This
file is named after the first WAL segment file that you need to have to
make use of the backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named something
like 0000000100001234000055CD.007C9330.backup. (The second part of this
file name stands for an exact position within the WAL file, and can
ordinarily be ignored.) Once you have safely archived the backup dump
file, you can delete all archived WAL segments with names numerically
preceding this one.I am not clear on what the "backup dump file" is? I assume it means
0000000100001234000055CD. It is called "WAL segment file" above. I
will rename that phrase to match the above terminology. Patch attached
and applied.Doesn't it refer to the backup file itself (the tar file of the data
directory) ?
No. That is what I thought it meant on first reading, but looking
closer it is referring to the numbered file, and the tar file has no
specific number.
You do not want to start deleting WAL segments until that one is safely
archived.
Right, but the point of the paragraph is that you need the WAL file that
goes with the backup history file number.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
I figured that part of the goal of PITR was that you could recover from
just the tar backup and archived WAL files --- using the pg_xlog
contents is nice, but not something we can require.I understood the last missing WAL log would cause missing information,
but not that it would make the tar backup unusable.It would be nice if we could force a new WAL file on pg_stop_backup()
and archive the WAL file needed to match the tar file. How hard would
that be?I see in the docs:
To make use of this backup, you will need to keep around all the WAL
segment files generated at or after the starting time of the backup. To
aid you in doing this, the pg_stop_backup function creates a backup
history file that is immediately stored into the WAL archive area. This
file is named after the first WAL segment file that you need to have to
make use of the backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named something
like 0000000100001234000055CD.007C9330.backup. (The second part of this
file name stands for an exact position within the WAL file, and can
ordinarily be ignored.) Once you have safely archived the backup dump
file, you can delete all archived WAL segments with names numerically
preceding this one.I am not clear on what the "backup dump file" is? I assume it means
0000000100001234000055CD. It is called "WAL segment file" above. I
will rename that phrase to match the above terminology. Patch attached
and applied.
I found that the docs mention above are inaccurate because they state
you only need the WAL segment used at the start of the file system
backup, while you really need all the WAL segments used _during_ the
backup before you can safely delete the older WAL segments. Here is
updated text I have applied to HEAD and 8.0.X:
Once you have safely archived the WAL segment files used during the file
system backup (as specified in the backup history file), you can delete
all archived WAL segments with names numerically less. Keep in mind that
only completed WAL segment files are archived, so there will be delay
between running pg_stop_backup and the archiving of all WAL segment
files needed to make the file system backup consistent.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgman wrote:
I figured that part of the goal of PITR was that you could recover from
just the tar backup and archived WAL files --- using the pg_xlog
contents is nice, but not something we can require.I understood the last missing WAL log would cause missing information,
but not that it would make the tar backup unusable.It would be nice if we could force a new WAL file on pg_stop_backup()
and archive the WAL file needed to match the tar file. How hard would
that be?
Added to TODO:
* Force archiving of partially-full WAL files when pg_stop_backup() is
called or the server is stopped
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Import Notes
Reply to msg id not found: | Resolved by subject fallback
I could still use a little clarification. It seems sort of like there is
an extra step, like:
(1) start archiving
(2) pg_start_backup()
(3) copy PGDATA directory with tar
(4) pg_stop_backup()
(5) ??
And the text you have at
http://candle.pha.pa.us/main/writings/pgsql/sgml/backup-online.html
says: "To make use of this backup, you will need to keep around all the
WAL segment files generated during and after the file system backup.".
How long after? Wouldn't you be keeping the WAL segments afterward
anyway by archiving?
I've tested and been able to recover using PITR before, but I'd like a
little clarification on the steps to make absolutely sure that the base
backup I have is viable.
Can you sort of run through the failure case again, and how to prevent
it?
Regards,
Jeff Davis
Show quoted text
On Sun, 2005-04-17 at 21:38 -0400, Bruce Momjian wrote:
Bruce Momjian wrote:
I figured that part of the goal of PITR was that you could recover from
just the tar backup and archived WAL files --- using the pg_xlog
contents is nice, but not something we can require.I understood the last missing WAL log would cause missing information,
but not that it would make the tar backup unusable.It would be nice if we could force a new WAL file on pg_stop_backup()
and archive the WAL file needed to match the tar file. How hard would
that be?I see in the docs:
To make use of this backup, you will need to keep around all the WAL
segment files generated at or after the starting time of the backup. To
aid you in doing this, the pg_stop_backup function creates a backup
history file that is immediately stored into the WAL archive area. This
file is named after the first WAL segment file that you need to have to
make use of the backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named something
like 0000000100001234000055CD.007C9330.backup. (The second part of this
file name stands for an exact position within the WAL file, and can
ordinarily be ignored.) Once you have safely archived the backup dump
file, you can delete all archived WAL segments with names numerically
preceding this one.I am not clear on what the "backup dump file" is? I assume it means
0000000100001234000055CD. It is called "WAL segment file" above. I
will rename that phrase to match the above terminology. Patch attached
and applied.I found that the docs mention above are inaccurate because they state
you only need the WAL segment used at the start of the file system
backup, while you really need all the WAL segments used _during_ the
backup before you can safely delete the older WAL segments. Here is
updated text I have applied to HEAD and 8.0.X:Once you have safely archived the WAL segment files used during the file
system backup (as specified in the backup history file), you can delete
all archived WAL segments with names numerically less. Keep in mind that
only completed WAL segment files are archived, so there will be delay
between running pg_stop_backup and the archiving of all WAL segment
files needed to make the file system backup consistent.
Jeff Davis wrote:
I could still use a little clarification. It seems sort of like there is
an extra step, like:(1) start archiving
(2) pg_start_backup()
(3) copy PGDATA directory with tar
(4) pg_stop_backup()
(5) ??And the text you have at
http://candle.pha.pa.us/main/writings/pgsql/sgml/backup-online.htmlsays: "To make use of this backup, you will need to keep around all the
WAL segment files generated during and after the file system backup.".How long after? Wouldn't you be keeping the WAL segments afterward
anyway by archiving?I've tested and been able to recover using PITR before, but I'd like a
little clarification on the steps to make absolutely sure that the base
backup I have is viable.Can you sort of run through the failure case again, and how to prevent
it?
The failure case in the original docs is that you do your
pg_stop_backup(), and then delete all the WAL file before the *.backup
file that was just created. However, you do not have a valid tar backup
until you have archived all the WAL files used from the *.backup WAL
file up to the WAL file that was active at pg_stop_backup(), which is
mentioned in the *.backup file. If you went and deleted your old WAL
files anyway, without waiting for those other WAL files to be archived,
and your disk drive crashed, you wouldn't have a tar backup you could
use, and you had deleted the old WAL files you would have needed to
recover your previous tar backup.
Is there something in the current wording that needs clarification?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Mon, 2005-04-18 at 00:20 -0400, Bruce Momjian wrote:
Jeff Davis wrote:
Can you sort of run through the failure case again, and how to prevent
it?The failure case in the original docs is that you do your
pg_stop_backup(), and then delete all the WAL file before the *.backup
file that was just created. However, you do not have a valid tar backup
until you have archived all the WAL files used from the *.backup WAL
file up to the WAL file that was active at pg_stop_backup(), which is
mentioned in the *.backup file. If you went and deleted your old WAL
files anyway, without waiting for those other WAL files to be archived,
and your disk drive crashed, you wouldn't have a tar backup you could
use, and you had deleted the old WAL files you would have needed to
recover your previous tar backup.Is there something in the current wording that needs clarification?
So, as I understand it: everything works great as long as everything has
been archived up to and including the WAL file that was active when you
did pg_stop_backup(). However, if you do pg_stop_backup() and
immediately delete PGDATA (before any WAL files are archived), the
backup may fail.
I think, to clear it up a little, you might add a step 5 before saying
"If this returns successfully, you're done.", so that people know for
sure that they get a good base backup. It actually seems like something
that maybe pg_stop_backup() should do in the future.
It's a little unclear how you tell which WAL segment was active during
pg_stop_backup(), but that shouldn't be a practical concern since you
can just manually archive them all.
Maybe step 5 could be something like:
(5) Make a copy of all WAL segments above XXXX.backup and store with the
base backup. When it's time to recover, if those WAL segments were not
properly archived, you need to have them available.
(probably needs rewording)
Regards,
Jeff Davis
On Mon, 18 Apr 2005, Bruce Momjian wrote:
Jeff Davis wrote:
I could still use a little clarification. It seems sort of like there is
an extra step, like:(1) start archiving
(2) pg_start_backup()
(3) copy PGDATA directory with tar
(4) pg_stop_backup()
(5) ??And the text you have at
http://candle.pha.pa.us/main/writings/pgsql/sgml/backup-online.htmlsays: "To make use of this backup, you will need to keep around all the
WAL segment files generated during and after the file system backup.".How long after? Wouldn't you be keeping the WAL segments afterward
anyway by archiving?I've tested and been able to recover using PITR before, but I'd like a
little clarification on the steps to make absolutely sure that the base
backup I have is viable.Can you sort of run through the failure case again, and how to prevent
it?The failure case in the original docs is that you do your
pg_stop_backup(), and then delete all the WAL file before the *.backup
file that was just created. However, you do not have a valid tar backup
until you have archived all the WAL files used from the *.backup WAL
file up to the WAL file that was active at pg_stop_backup(), which is
mentioned in the *.backup file. If you went and deleted your old WAL
files anyway, without waiting for those other WAL files to be archived,
and your disk drive crashed, you wouldn't have a tar backup you could
use, and you had deleted the old WAL files you would have needed to
recover your previous tar backup.Is there something in the current wording that needs clarification?
I'd say it's very not cool :) It's not we all expected from PITR.
I recall now Simon mentioned about that and have it in his TODO.
Other thing I don't understand what's the problem to generate WAL file
by demand ? Probably, TODO should says about this.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
I'd say it's very not cool :) It's not we all
expected from PITR.
I recall now Simon mentioned about that and have it
in his TODO.
Other thing I don't understand what's the problem to
generate WAL file
by demand ? Probably, TODO should says about this.
This would definetly be a good feature to have. What
I would prefer is:
1) have the pitr stop command write out and close the
WAL that it is currently using.
2) have another stored proc which can be invoked at
any time that will write out and close the WAL that is
currently in use when that command is executed.
3) have a feature in postgres that will automatically
write out and close the WAL if the server hasn't had
any activity in XX minutes, or hasn't closed a WAL
file in XX minutes.
The reason for this is "the Friday night" scenario.
Let's say you have your WAL's FTP'd to a remote server
off-site. Friday at 4:50 PM Postgres starts a new
WAL, and everyone goes home for the weekend at 5pm.
No activity occurs on the database all weekend long,
so the new WAL never fills and is never closed. If
something should happen during the weekend, and the
disks are ruined on the PG DB server, the last WAL is
never sent to the remote off-site server. The last
transactions of the day are lost, even though they
could have taken place days ago. With feature 3, you
can guarantee that the oldest WAL is XX minutes old,
so at least you have all the transactions within the
last XX minutes.
Of course feature #3 also needs to have some smarts to
it, so it doesn't create a bunch of completely empty
WAL's everytime the timer runs out. It should only
write and close the WAL if there is actually some new
data in it.
Later
Rob
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Import Notes
Reply to msg id not found: 6667 | Resolved by subject fallback
Jeff Davis wrote:
On Mon, 2005-04-18 at 00:20 -0400, Bruce Momjian wrote:
Jeff Davis wrote:
Can you sort of run through the failure case again, and how to prevent
it?The failure case in the original docs is that you do your
pg_stop_backup(), and then delete all the WAL file before the *.backup
file that was just created. However, you do not have a valid tar backup
until you have archived all the WAL files used from the *.backup WAL
file up to the WAL file that was active at pg_stop_backup(), which is
mentioned in the *.backup file. If you went and deleted your old WAL
files anyway, without waiting for those other WAL files to be archived,
and your disk drive crashed, you wouldn't have a tar backup you could
use, and you had deleted the old WAL files you would have needed to
recover your previous tar backup.Is there something in the current wording that needs clarification?
So, as I understand it: everything works great as long as everything has
been archived up to and including the WAL file that was active when you
did pg_stop_backup(). However, if you do pg_stop_backup() and
immediately delete PGDATA (before any WAL files are archived), the
backup may fail.
Right, and that is the issue that wasn't documented before, and I was
even unclear about it myself when testing initially.
I think, to clear it up a little, you might add a step 5 before saying
"If this returns successfully, you're done.", so that people know for
I see your point. New text is:
4 Again connect to the database as a superuser, and issue the command
SELECT pg_stop_backup();
This should return successfully.
5 Once the WAL segment files used during the backup are archived as
part of normal database activity, you are done.
sure that they get a good base backup. It actually seems like something
that maybe pg_stop_backup() should do in the future.
Yes, I added that to the TODO list:
* Force archiving of partially-full WAL files when pg_stop_backup() is
called or the server is stopped
Doing this will allow administrators to know more easily when the
archive contins all the files needed for point-in-time recovery.
It's a little unclear how you tell which WAL segment was active during
pg_stop_backup(), but that shouldn't be a practical concern since you
can just manually archive them all.
We do have this sentence:
Once you have safely archived the WAL segment files used during the file
system backup (as specified in the backup history file), you can delete
all archived WAL segments with names numerically less.
The information is actually in the *.backup file. I think that is the
only way to know.
And you can't manually copy the WAL files to the archive because they
aren't full and the recommended archive_command will fail if those files
are already in the archive. You could copy them off somewhere else, I
suppose.
Maybe step 5 could be something like:
(5) Make a copy of all WAL segments above XXXX.backup and store with the
base backup. When it's time to recover, if those WAL segments were not
properly archived, you need to have them available.
Again, that doesn't work because of the "no overwrite" behavior of the
archive_command.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Mon, 18 Apr 2005, Rob Butler wrote:
I'd say it's very not cool :) It's not we all
expected from PITR.
I recall now Simon mentioned about that and have it
in his TODO.
Other thing I don't understand what's the problem to
generate WAL file
by demand ? Probably, TODO should says about this.This would definetly be a good feature to have. What
I would prefer is:1) have the pitr stop command write out and close the
WAL that it is currently using.2) have another stored proc which can be invoked at
any time that will write out and close the WAL that is
currently in use when that command is executed.3) have a feature in postgres that will automatically
write out and close the WAL if the server hasn't had
any activity in XX minutes, or hasn't closed a WAL
file in XX minutes.The reason for this is "the Friday night" scenario.
This is exactly what I'm worry about ! Very typical
scenario. I hope PITR improvement could be done in
8.0.X development cycle.
Let's say you have your WAL's FTP'd to a remote server
off-site. Friday at 4:50 PM Postgres starts a new
WAL, and everyone goes home for the weekend at 5pm.
No activity occurs on the database all weekend long,
so the new WAL never fills and is never closed. If
something should happen during the weekend, and the
disks are ruined on the PG DB server, the last WAL is
never sent to the remote off-site server. The last
transactions of the day are lost, even though they
could have taken place days ago. With feature 3, you
can guarantee that the oldest WAL is XX minutes old,
so at least you have all the transactions within the
last XX minutes.Of course feature #3 also needs to have some smarts to
it, so it doesn't create a bunch of completely empty
WAL's everytime the timer runs out. It should only
write and close the WAL if there is actually some new
data in it.Later
Rob__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov wrote:
Is there something in the current wording that needs clarification?
I'd say it's very not cool :) It's not we all expected from PITR.
I recall now Simon mentioned about that and have it in his TODO.
Other thing I don't understand what's the problem to generate WAL file
by demand ? Probably, TODO should says about this.
Yes, we have TODO items for that and I added another one yesterday.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Rob Butler <crodster2k@yahoo.com> wrote on 18.04.2005, 15:05:20:
I'd say it's very not cool :) It's not we all
expected from PITR.
I recall now Simon mentioned about that and have it
in his TODO.
Other thing I don't understand what's the problem to
generate WAL file
by demand ? Probably, TODO should says about this.This would definetly be a good feature to have. What
I would prefer is:1) have the pitr stop command write out and close the
WAL that it is currently using.2) have another stored proc which can be invoked at
any time that will write out and close the WAL that is
currently in use when that command is executed.3) have a feature in postgres that will automatically
write out and close the WAL if the server hasn't had
any activity in XX minutes, or hasn't closed a WAL
file in XX minutes.
Yes, I have been working on a design.
1) is required to make PITR better for low transaction rate users.
3) is required to allow standby replication
2) is a standard feature on other DBMS, but I'd have to consider that as
optional.
Anyway, I'll post more in a few hours on this.
Best Regards, Simon Riggs
Import Notes
Resolved by subject fallback
Bruce Momjian <pgman@candle.pha.pa.us> writes:
I see your point. New text is:
4 Again connect to the database as a superuser, and issue the command
SELECT pg_stop_backup();
This should return successfully.
5 Once the WAL segment files used during the backup are archived as
part of normal database activity, you are done.sure that they get a good base backup. It actually seems like something
that maybe pg_stop_backup() should do in the future.Yes, I added that to the TODO list:
* Force archiving of partially-full WAL files when pg_stop_backup() is
called or the server is stopped
You could even make pg_stop_backup() hang until that's complete.
--
greg
OK, I updated the two current TODO items:
* Allow point-in-time recovery to archive partially filled write-ahead
logs
Currently only full WAL files are archived. This means that the most
recent transactions aren't available for recovery in case of a disk
failure. This could be triggered by a user command or a timer.
* Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
Doing this will allow administrators to know more easily when the
archive contins all the files needed for point-in-time recovery.
Is this OK?
---------------------------------------------------------------------------
Rob Butler wrote:
I'd say it's very not cool :) It's not we all
expected from PITR.
I recall now Simon mentioned about that and have it
in his TODO.
Other thing I don't understand what's the problem to
generate WAL file
by demand ? Probably, TODO should says about this.This would definetly be a good feature to have. What
I would prefer is:1) have the pitr stop command write out and close the
WAL that it is currently using.2) have another stored proc which can be invoked at
any time that will write out and close the WAL that is
currently in use when that command is executed.3) have a feature in postgres that will automatically
write out and close the WAL if the server hasn't had
any activity in XX minutes, or hasn't closed a WAL
file in XX minutes.The reason for this is "the Friday night" scenario.
Let's say you have your WAL's FTP'd to a remote server
off-site. Friday at 4:50 PM Postgres starts a new
WAL, and everyone goes home for the weekend at 5pm.
No activity occurs on the database all weekend long,
so the new WAL never fills and is never closed. If
something should happen during the weekend, and the
disks are ruined on the PG DB server, the last WAL is
never sent to the remote off-site server. The last
transactions of the day are lost, even though they
could have taken place days ago. With feature 3, you
can guarantee that the oldest WAL is XX minutes old,
so at least you have all the transactions within the
last XX minutes.Of course feature #3 also needs to have some smarts to
it, so it doesn't create a bunch of completely empty
WAL's everytime the timer runs out. It should only
write and close the WAL if there is actually some new
data in it.Later
Rob__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Oleg Bartunov wrote:
On Mon, 18 Apr 2005, Rob Butler wrote:
I'd say it's very not cool :) It's not we all
expected from PITR.
I recall now Simon mentioned about that and have it
in his TODO.
Other thing I don't understand what's the problem to
generate WAL file
by demand ? Probably, TODO should says about this.This would definetly be a good feature to have. What
I would prefer is:1) have the pitr stop command write out and close the
WAL that it is currently using.2) have another stored proc which can be invoked at
any time that will write out and close the WAL that is
currently in use when that command is executed.3) have a feature in postgres that will automatically
write out and close the WAL if the server hasn't had
any activity in XX minutes, or hasn't closed a WAL
file in XX minutes.The reason for this is "the Friday night" scenario.
This is exactly what I'm worry about ! Very typical
scenario. I hope PITR improvement could be done in
8.0.X development cycle.
Yes, I described this exact scenario during a talk I gave on Saturday.
I think the only way to do this for 8.0.X now is to run a cron job that
just copies pg_xlog off to another location every so often.
Of course, there is the risk that your cron copy will fail in the
middle, leaving the WAL file corrupt. You would have to copy to a
temporary directory, then once that succeeds, move the files to overlay
the previous copies.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Greg Stark wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
I see your point. New text is:
4 Again connect to the database as a superuser, and issue the command
SELECT pg_stop_backup();
This should return successfully.
5 Once the WAL segment files used during the backup are archived as
part of normal database activity, you are done.sure that they get a good base backup. It actually seems like something
that maybe pg_stop_backup() should do in the future.Yes, I added that to the TODO list:
* Force archiving of partially-full WAL files when pg_stop_backup() is
called or the server is stoppedYou could even make pg_stop_backup() hang until that's complete.
You mean don't force the archive copy but just have pg_stop_backup()
hang until the files fill? Yea, we could do that, but there is no way
to know how long the hang might take.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073