Forcing current WAL file to be archived
Where are we on these TODO items:
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]
Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.
o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
Doing this will allow administrators to know more easily when
the archive contains all the files needed for point-in-time
recovery.
http://archives.postgresql.org/pgsql-patches/2005-04/msg00121.php
o Add reporting of the current WAL file, perhaps as part of
partial log file archiving
Seems they should be completed for 8.2. I have only a /contrib version for
the last one.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
Where are we on these TODO items:
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]
I believe we'd agreed that the necessary infrastructure for this is
just a function to tell the current WAL segment name and offset.
o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
I see no need for that to be "automatic". I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior. Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.
regards, tom lane
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Where are we on these TODO items:
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]I believe we'd agreed that the necessary infrastructure for this is
just a function to tell the current WAL segment name and offset.
Yes, perhaps, though I can envision a GUC that does regularly partial
archiving. I will add a question mark to the item. In fact, the
description has more details:
o Allow point-in-time recovery to archive partially filled
write-ahead logs? [pitr]
Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.
o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stoppedI see no need for that to be "automatic". I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior. Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.
I assumed we would have a function like pg_finish_wal_segment(), and
server stop and stop_backup would call it too, the reason being, it
would greatly simplify our documentation on how to use PITR if these
were done automatically.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
I assumed we would have a function like pg_finish_wal_segment(), and
server stop and stop_backup would call it too,
That idea is *exactly* what I'm objecting to.
the reason being, it
would greatly simplify our documentation on how to use PITR if these
were done automatically.
No it wouldn't, it'd just bloat the already excessive WAL volume.
regards, tom lane
Ühel kenal päeval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
Where are we on these TODO items:
o Add reporting of the current WAL file, perhaps as part of
partial log file archiving
It would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.
This could be used to solve all the above problems with some extra work
on side of WAL-shipping framework.
Marko Kreen thought he might also do some work on such a function, but
he is on a vacation, so I'm not sure he will be able to get it done by
feature freeze.
It would be extra nice if postgres would nudge some external process on
each WAL write via a signal or UDP packet (after write and before flush)
so there would be virtually no delay between WAL write and notification,
but just the function would also go a long way.
Seems they should be completed for 8.2. I have only a /contrib version for
the last one.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
I assumed we would have a function like pg_finish_wal_segment(), and
server stop and stop_backup would call it too,That idea is *exactly* what I'm objecting to.
the reason being, it
would greatly simplify our documentation on how to use PITR if these
were done automatically.No it wouldn't, it'd just bloat the already excessive WAL volume.
Well, it only would happen when you have PITR enabled.
For example, if you do pg_stop_backup(), in what cases would you not
also call pg_finish_wal_segment()? I can't think of one. Maybe the
server restart case isn't necessary.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Hannu Krosing wrote:
?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
Where are we on these TODO items:
o Add reporting of the current WAL file, perhaps as part of
partial log file archivingIt would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.
OK, "offset" added to TODO item. What would the offset give us?
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
For example, if you do pg_stop_backup(), in what cases would you not
also call pg_finish_wal_segment()? I can't think of one.
I can't see why you would need to, unless your intention is not to run
PITR at all but only to make a filesystem backup instead of using
pg_dump. Normally you'd be running a continuing archival process and
there's no particular need to force the current WAL segment off to
archive at that exact instant.
My point here is that forcing the current segment to archive is a
function of whatever your continuous-archiving process is, and it's
not necessarily tied to backups. We should not prejudge when people
want that fairly-expensive function to be invoked.
regards, tom lane
On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Where are we on these TODO items:
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]I believe we'd agreed that the necessary infrastructure for this is
just a function to tell the current WAL segment name and offset.
Not that I remember. That was just a proposal for backpatching to
8.1/8.0 so that it would be easier to cope with PITR at those releases.
o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stoppedI see no need for that to be "automatic". I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior. Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.
Putting it into pg_stop_backup was what we previously agreed.
Where is the loss of flexibility?
I need to get this straight because I was actually intending to do this
for 8.2, i.e. next few days.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Tue, 2006-07-25 at 11:20 -0400, Bruce Momjian wrote:
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Where are we on these TODO items:
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]I believe we'd agreed that the necessary infrastructure for this is
just a function to tell the current WAL segment name and offset.Yes, perhaps, though I can envision a GUC that does regularly partial
archiving. I will add a question mark to the item.
I was planning to add a new GUC
archive_timeout (integer) = max # secs between log file switches
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Ühel kenal päeval, T, 2006-07-25 kell 11:27, kirjutas Bruce Momjian:
Hannu Krosing wrote:
?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
Where are we on these TODO items:
o Add reporting of the current WAL file, perhaps as part of
partial log file archivingIt would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.OK, "offset" added to TODO item. What would the offset give us?
the offset returned by lseek() on the WAL file, that is the end of the
part of the WAL file which has actually been written to.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Simon Riggs <simon@2ndquadrant.com> writes:
On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
I see no need for that to be "automatic". I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior. Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.
Putting it into pg_stop_backup was what we previously agreed.
Where is the loss of flexibility?
I don't see why you think this function should be tied to making a
backup. There are other reasons for wanting to force a WAL switch
than that, and there are scenarios in which you don't need a WAL
switch at the end of a backup.
regards, tom lane
Hannu Krosing wrote:
?hel kenal p?eval, T, 2006-07-25 kell 11:27, kirjutas Bruce Momjian:
Hannu Krosing wrote:
?hel kenal p?eval, T, 2006-07-25 kell 10:51, kirjutas Bruce Momjian:
Where are we on these TODO items:
o Add reporting of the current WAL file, perhaps as part of
partial log file archivingIt would be nice to have a function that tells both filename and offset
of current WAL file, so it would be possible to do live async streaming
of up-to-subsecond changes without too much overhead.OK, "offset" added to TODO item. What would the offset give us?
the offset returned by lseek() on the WAL file, that is the end of the
part of the WAL file which has actually been written to.
Sorry, I was actually asking what use the offset would be to a user.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
For example, if you do pg_stop_backup(), in what cases would you not
also call pg_finish_wal_segment()? I can't think of one.I can't see why you would need to, unless your intention is not to run
PITR at all but only to make a filesystem backup instead of using
pg_dump.
If thats all you want you can set
archive_command = 'echo %f %p > /dev/null'
Normally you'd be running a continuing archival process and
there's no particular need to force the current WAL segment off to
archive at that exact instant.
That's exactly the point of contention. When we originally completed
PITR we thought that was acceptable. It isn't and many people have stuck
pins in effigies of me since then. :-/
My point here is that forcing the current segment to archive is a
function of whatever your continuous-archiving process is, and it's
not necessarily tied to backups. We should not prejudge when people
want that fairly-expensive function to be invoked.
The point is until that last WAL file is backed up, the whole backup is
useless. It isn't good policy to have a backup's value be contingent on
some future event.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
OK, "offset" added to TODO item. What would the offset give us?
The last offset could be remembered by the external program, and it only
has to transfer from the last offset to the new one. It allows
incremental streaming of the WAL files... of course the external program
will be a lot more complex than the current shell scripts which can be
used for WAL archiving...
The problems I see with this is if in this case the normal postgres WAL
archiving won't conflict with this streaming ? And if yes, wouldn't it
be better to have a separate mechanism for the stream based archiving ?
I mean what will happen if postgres successfully archives the WAL file
and then moves it out of way before the streaming process finishes with
it, the streaming process will have a problem...
A few months ago I spent some time thinking about a solution where a WAL
based standby could be built using only normal data base connections to
the master server, and one of the ideas was to create a WAL subscription
mechanism where the standby subscribes for getting WAL files, and
updates it's subscription status with the last processed WAL file after
each processed file. The master can then recycle the WAL files only
after they were confirmed by all current subscriptions... and to avoid
excessive WAL file bloat if a slave goes offline, the subscription could
be canceled automatically if it gets too much behind.
If this mechanism is in place, it would be also nice if the slave could
ask for the WAL records to be streamed on a normal data base connection.
The function which would do it could be smart enough to stream the
current WAL file too up to the current offset and then wait for new
records. The slave would invoke the function for each WAL file it needs
to transfer, and then when finished it would update it's subscription
status and continue with the next one. The streaming function should not
update the subscription status as this way the slave can ask for the
file again if something goes wrong with the transfer.
The third thing needed to create a facility for one-connection-standby
building is to be able to stream the OS files of the DB through a DB
connection - I guess that can be done with a relatively simple C
function...
With all these things in place, a program could be written which would
run on the standby machine and completely automatically set up the
standby, only needing a simple connection string to the master...
Cheers,
Csaba.
Simon Riggs <simon@2ndquadrant.com> writes:
I was planning to add a new GUC
archive_timeout (integer) = max # secs between log file switches
That's fine, but feature freeze is in a week and we don't even have the
basic function for manually doing a log file switch. Let's get that
done first and then think about automatic switches.
Also, forcing another 16M of WAL out every few seconds is a pretty crude
and inefficient way of making sure your hot standby server is up to
date. As Hannu noted, an archiving script can do better than that if it
can access the current WAL file name and offset. So we really want to
get the function to return that info done as well.
regards, tom lane
Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
On Tue, 2006-07-25 at 11:07 -0400, Tom Lane wrote:
I see no need for that to be "automatic". I'd vote for a simple
function pg_finish_wal_segment() or something like that, which you
call just after pg_stop_backup() if you want this behavior. Trying
to tie it into pg_stop_backup() will only make things more complicated
and less flexible.Putting it into pg_stop_backup was what we previously agreed.
Where is the loss of flexibility?I don't see why you think this function should be tied to making a
backup. There are other reasons for wanting to force a WAL switch
than that, and there are scenarios in which you don't need a WAL
Yes, that is why we would have a separate function too.
switch at the end of a backup.
Well, I figured if you just did a backup, you would want a switch in
_most_ cases, and since you just did a backup, I figured an extra WAL
file would be minimal additional overhead.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
I was planning to add a new GUC
archive_timeout (integer) = max # secs between log file switchesThat's fine, but feature freeze is in a week and we don't even have the
basic function for manually doing a log file switch. Let's get that
done first and then think about automatic switches.Also, forcing another 16M of WAL out every few seconds is a pretty crude
and inefficient way of making sure your hot standby server is up to
date. As Hannu noted, an archiving script can do better than that if it
can access the current WAL file name and offset. So we really want to
get the function to return that info done as well.
Agreed. One concern I have is that we have waited for Simon to complete
this for 1.5 years, and now with a week left he is still working on it
(or starting on it). I am wondering if someone else needs to take this
on, because if Simon doesn't complete it in a week, we don't have it for
8.2, and we would then have to sit through another year of PITR
complaints. :-(
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
* Simon Riggs (simon@2ndquadrant.com) wrote:
On Tue, 2006-07-25 at 11:20 -0400, Bruce Momjian wrote:
Yes, perhaps, though I can envision a GUC that does regularly partial
archiving. I will add a question mark to the item.I was planning to add a new GUC
archive_timeout (integer) = max # secs between log file switches
I'd love to see both this GUC and the function itself make it into 8.2..
I'm tempted to agree with Bruce about running the wal-archive-function
after pg_stop_backup(). The backup isn't any good without all the WALs
which were used during the backup anyway (iirc) so I can't really think
why you'd want any time at all between "backup happening" and "backup
actually usable".
Also, compared to the backup itself I'd tend to doubt there would be
much of a performance hit. It may be expensive compared to other
regular queries/operations but an rsync across the entire database isn't
exactly cheap.
Thanks,
Stephen
Simon Riggs wrote:
On Tue, 2006-07-25 at 11:31 -0400, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
For example, if you do pg_stop_backup(), in what cases would you not
also call pg_finish_wal_segment()? I can't think of one.I can't see why you would need to, unless your intention is not to run
PITR at all but only to make a filesystem backup instead of using
pg_dump.If thats all you want you can set
archive_command = 'echo %f %p > /dev/null'
Uh, what good is a file system backup without the WAL files modified
during the backup?
Normally you'd be running a continuing archival process and
there's no particular need to force the current WAL segment off to
archive at that exact instant.That's exactly the point of contention. When we originally completed
PITR we thought that was acceptable. It isn't and many people have stuck
pins in effigies of me since then. :-/My point here is that forcing the current segment to archive is a
function of whatever your continuous-archiving process is, and it's
not necessarily tied to backups. We should not prejudge when people
want that fairly-expensive function to be invoked.The point is until that last WAL file is backed up, the whole backup is
useless. It isn't good policy to have a backup's value be contingent on
some future event.
Good analysis.
--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +