Warm-Standby using WAL archiving / Separate pg_restorelog application
Hi
I've now set up a warm-standby machine using wal archiving. The restore_command on the
warm-standby machine loops until the wal requested by postgres appears, instead of
returning 1. Additionally, restore_command checks for two special flag files, "abort"
and "take_online". If "take_online" exists, then it exits with code 1 when the requested
wal does not exist - this allows me to take the slave online if the master fails.
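For illustration, a blocking restore_command along these lines could be sketched in Python as follows. This is a minimal sketch, not the script actually used here: the archive and flag directories, the polling interval, and the function name restore_wal are all assumptions, and since the message does not say what the "abort" flag does, the sketch simply treats it as "give up immediately with a distinct non-zero exit code".

```python
import os
import shutil
import sys
import time


def restore_wal(wal_name, dest_path, archive_dir, flag_dir,
                poll_seconds=5.0, sleep=time.sleep):
    """Wait for wal_name to appear in archive_dir, then copy it to dest_path.

    Returns the exit code the restore_command should report:
    0 = wal restored, 1 = wal missing and "take_online" requested,
    2 = "abort" flag found (assumed meaning, not stated in the thread).
    """
    source = os.path.join(archive_dir, wal_name)
    while True:
        if os.path.exists(os.path.join(flag_dir, "abort")):
            return 2
        if os.path.exists(source):
            shutil.copy(source, dest_path)
            return 0
        # The wal is not there (yet). Only report failure if the
        # administrator asked for the slave to be taken online.
        if os.path.exists(os.path.join(flag_dir, "take_online")):
            return 1
        sleep(poll_seconds)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical locations; adjust to the local setup.
    sys.exit(restore_wal(sys.argv[1], sys.argv[2],
                         "/var/lib/pgsql/wal_archive",
                         "/var/lib/pgsql/standby_flags"))
```

With the script saved under a hypothetical name such as wait_for_wal.py, the setting would look like restore_command = 'python /path/to/wait_for_wal.py %f %p'. A production script would likely also need to cope with half-copied wals and with requests for files that never arrive, such as timeline history files; this sketch does neither.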
This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.
Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.
I think that all those problems could be solved if postgres provided a standalone application
that could restore one wal into a specified data-dir. It should be possible to call this
application repeatedly to restore wals as they are received from the master. Since "pg_restorelog"
would be called separately for every wal, it'd be easy to detect errors recovering a specific wal.
Do you think this idea is feasible? How hard would it be to turn the current archived-wal-recovery code
into a standalone executable? (That of course would need to be called when postgres is _not_ running.)
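To make the monitoring argument concrete, here is a rough Python sketch of how a driver script might call such a tool once per wal. Note that pg_restorelog does not exist; it is only the application proposed above, so its name, its "-D" option, and the helper apply_wal_segments are purely hypothetical.

```python
import subprocess


def apply_wal_segments(segments, data_dir,
                       restore_tool="pg_restorelog", run=subprocess.run):
    """Apply wal segments in order, stopping at the first failure.

    Returns (applied, failed): the list of segments that were applied
    successfully, and the segment that failed (or None if all succeeded).
    The command line used here is hypothetical; no such tool exists.
    """
    applied = []
    for segment in segments:
        result = run([restore_tool, "-D", data_dir, segment])
        if result.returncode != 0:
            # One call per wal means the failing segment is known exactly.
            return applied, segment
        applied.append(segment)
    return applied, None
```

Because each wal gets its own process and exit code, a monitoring system would only need to alert on a non-None second return value.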
greetings, Florian Pflug
On 7/10/06, Florian G. Pflug <fgp@phlo.org> wrote:
This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.
Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.
what I would really like to see is to have the postmaster start up in
a special read only mode where it could auto-restore wal files placed
there by an external process but not generate any of its own. This
would be a step towards a pitr based simple replication method.
merlin
Merlin Moncure wrote:
On 7/10/06, Florian G. Pflug <fgp@phlo.org> wrote:
This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.
Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.
what I would really like to see is to have the postmaster start up in
a special read only mode where it could auto-restore wal files placed
there by an external process but not generate any of its own. This
would be a step towards a pitr based simple replication method.
I didn't dare to ask for being able to actually _access_ a wal-shipping
based slave (in read only mode) - from how I interpret the code, it's
a _long_ way to get that working. So I figured a stand-alone executable
that just recovers _one_ archived wal would at least remove that administrative
burden that my current solution brings. And it would be easy to monitor
the slave - much easier than with any automatic pickup of wals.
greetings, Florian Pflug
Just having a standby mode that survived shutdown/startup would be a nice
start...
I also do the blocking-restore-command technique, which although workable,
has a bit of a house-of-cards feel to it sometimes.
On 7/10/06 5:40 PM, "Florian G. Pflug" <fgp@phlo.org> wrote:
Merlin Moncure wrote:
On 7/10/06, Florian G. Pflug <fgp@phlo.org> wrote:
This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.
Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.
what I would really like to see is to have the postmaster start up in
a special read only mode where it could auto-restore wal files placed
there by an external process but not generate any of its own. This
would be a step towards a pitr based simple replication method.
I didn't dare to ask for being able to actually _access_ a wal-shipping
based slave (in read only mode) - from how I interpret the code, it's
a _long_ way to get that working. So I figured a stand-alone executable
that just recovers _one_ archived wal would at least remove that administrative
burden that my current solution brings. And it would be easy to monitor
the slave - much easier than with any automatic pickup of wals.
On Mon, 2006-07-10 at 19:34 +0200, Florian G. Pflug wrote:
This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.
Just submitted a patch to allow restartable recovery, which addresses
this concern.
Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.
What do you mean?
If there is an ERROR in the WAL file, it stops.
If the restore of the WAL file fails, it retries a few times before
giving up.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On a fine day, Tue, 2006-07-11 at 08:38, Andrew Rawnsley wrote:
Just having a standby mode that survived shutdown/startup would be a nice
start...
I think that Simon Riggs did some work on this at the code sprint
yesterday.
I also do the blocking-restore-command technique, which although workable,
has a bit of a house-of-cards feel to it sometimes.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com