Warm-Standby using WAL archiving / Separate pg_restorelog application

Started by Florian G. Pflug over 19 years ago, 6 messages
#1 Florian G. Pflug
fgp@phlo.org

Hi

I've now set up a warm-standby machine by using wal archiving. The restore_command on the
warm-standby machine loops until the wal requested by postgres appears, instead of
returning 1. Additionally, restore_command checks for two special flag files, "abort"
and "take_online". If "take_online" exists, then it exits with code 1 in case of a
nonexistent wal - this allows me to take the slave online if the master fails.
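
For illustration, the blocking restore_command described above could be sketched
roughly as follows. This is a minimal Python sketch, not the script actually used
here: the function name, the directory arguments and the polling interval are
assumptions, and so is the meaning of the "abort" flag (taken here to mean "stop
immediately with a failure code"), which the message does not spell out.

```python
import shutil
import time
from pathlib import Path


def restore_wal(archive_dir, wal_name, dest_path, flag_dir,
                poll_seconds=5.0, sleep=time.sleep):
    """Block until wal_name shows up in archive_dir, then copy it to dest_path.

    Returns the exit code to hand back to postgres: 0 once the file has been
    copied, 1 if a flag file says to stop waiting.
    """
    archive_dir, flag_dir = Path(archive_dir), Path(flag_dir)
    source = archive_dir / wal_name
    while True:
        # Assumption: "abort" means give up right away, even if the file exists.
        if (flag_dir / "abort").exists():
            return 1
        if source.exists():
            shutil.copy(source, dest_path)
            return 0
        # "take_online" only matters when the requested wal is missing:
        # reporting failure ends recovery, so the standby comes online.
        if (flag_dir / "take_online").exists():
            return 1
        sleep(poll_seconds)
```

A small wrapper script would pass the %f (requested file name) and %p (destination
path) values from restore_command into this function and exit with its return
value. The process that ships wals should presumably move each file into the
archive directory atomically, so that a half-copied segment is never picked up.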

This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.

Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.

I think that all those problems could be solved if postgres provided a standalone application
that could restore one wal into a specified data-dir. It should be possible to call this
application repeatedly to restore wals as they are received from the master. Since "pg_restorelog"
would be called separately for every wal, it'd be easy to detect errors recovering a specific wal.
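
To make the monitoring argument concrete, here is a rough sketch of a driver loop
around such a tool. Note that "pg_restorelog" is only the proposal made in this
message and does not exist; its command-line form (the -D option and the
positional wal path) is invented for the example, as are the function names and
the incoming/applied directory layout. The sketch also assumes a single timeline,
so that sorting segment file names gives the order in which to apply them.

```python
import subprocess
from pathlib import Path


def run_restorelog(wal_path, data_dir):
    """Call the proposed pg_restorelog tool (hypothetical, it does not exist)."""
    result = subprocess.run(["pg_restorelog", "-D", str(data_dir), str(wal_path)])
    return result.returncode


def apply_pending_wals(incoming_dir, applied_dir, data_dir, restore=run_restorelog):
    """Apply wal files in name order and stop at the first failure.

    Returns (names_applied, name_that_failed_or_None), so a monitoring
    script can report exactly which wal could not be recovered.
    """
    incoming_dir, applied_dir = Path(incoming_dir), Path(applied_dir)
    applied = []
    for wal in sorted(p for p in incoming_dir.iterdir() if p.is_file()):
        if restore(wal, data_dir) != 0:
            return applied, wal.name
        # Move the segment aside so a later run does not apply it twice.
        wal.rename(applied_dir / wal.name)
        applied.append(wal.name)
    return applied, None
```

Because each wal is a separate invocation with its own exit code, a cron job or
monitoring check could call apply_pending_wals() periodically and raise an alert
whenever the second element of the result is not None.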

Do you think this idea is feasible? How hard would it be to turn the current archived-wal-recovery-code
into a standalone executable? (It would of course need to be called when postgres is _not_ running.)

greetings, Florian Pflug

#2 Merlin Moncure
mmoncure@gmail.com
In reply to: Florian G. Pflug (#1)
Re: Warm-Standby using WAL archiving / Separate pg_restorelog application

On 7/10/06, Florian G. Pflug <fgp@phlo.org> wrote:

This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.

Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.

what I would really like to see is to have the postmaster start up in
a special read only mode where it could auto-restore wal files placed
there by an external process but not generate any of its own. This
would be a step towards a pitr based simple replication method.

merlin

#3 Florian G. Pflug
fgp@phlo.org
In reply to: Merlin Moncure (#2)
Re: Warm-Standby using WAL archiving / Separate pg_restorelog

Merlin Moncure wrote:

On 7/10/06, Florian G. Pflug <fgp@phlo.org> wrote:

This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.

Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.

what I would really like to see is to have the postmaster start up in
a special read only mode where it could auto-restore wal files placed
there by an external process but not generate any of its own. This
would be a step towards a pitr based simple replication method.

I didn't dare to ask for being able to actually _access_ a wal-shipping
based slave (in read only mode) - from how I read the code, it's
a _long_ way to get that working. So I figured a stand-alone executable
that just recovers _one_ archived wal would at least remove the administrative
burden that my current solution brings. And it would be easy to monitor
the slave - much easier than with any automatic pickup of wals.

greetings, Florian Pflug

#4 Andrew Rawnsley
ronz@investoranalytics.com
In reply to: Florian G. Pflug (#3)
Re: Warm-Standby using WAL archiving / Separate

Just having a standby mode that survived shutdown/startup would be a nice
start...

I also do the blocking-restore-command technique, which although workable,
has a bit of a house-of-cards feel to it sometimes.

#5 Simon Riggs
simon@2ndquadrant.com
In reply to: Florian G. Pflug (#1)
Re: Warm-Standby using WAL archiving / Separate

On Mon, 2006-07-10 at 19:34 +0200, Florian G. Pflug wrote:

This method seems to work, but it is neither particularly fool-proof nor
administrator friendly. It's not possible e.g. to reboot the slave without postgres
aborting the recovery, and therefore processing all wals generated since the last
backup all over again.

Just submitted a patch to allow restartable recovery, which addresses
this concern.

Monitoring this system is hard too, since there is no easy way to detect errors
while restoring a particular wal.

What do you mean?

If there is an ERROR in the WAL file, it stops.
If the restore of the WAL file fails, it retries a few times before
giving up.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#6 Hannu Krosing
hannu@skype.net
In reply to: Andrew Rawnsley (#4)
Re: Warm-Standby using WAL archiving / Separate

On Tue, 2006-07-11 at 08:38, Andrew Rawnsley wrote:

Just having a standby mode that survived shutdown/startup would be a nice
start...

I think that Simon Riggs did some work on this at the code sprint
yesterday.

I also do the blocking-restore-command technique, which although workable,
has a bit of a house-of-cards feel to it sometimes.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com