warm standby and reciprocating failover

Started by james bardin over 16 years ago, 2 messages, general
#1 james bardin
jbardin@bu.edu

I wasn't sure which list is better suited, so this is cross posted
from pgsql-admin.
-Thanks

On Fri, Aug 21, 2009 at 10:46 AM, james bardin<jbardin@bu.edu> wrote:

I have a working warm standby system, running 8.4 (thanks for urging
me to upgrade from the Red Hat provided release).
One of the new requirements is going to be for a (non-DBA) admin to
easily swap services between the two servers for maintenance.

The first move runs smoothly, as expected: postgres ships the last
partial WAL file immediately on shutdown, we trigger the standby, and we're up.
I'm now running into issues bringing the first server back up in
standby mode. After the second server finishes recovery, the major
number of the wal files is incremented (say from 00000001 to
00000002), and the 00000002.history file is shipped back to the first
server. The first server however is still looking for 00000001x files.
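
To make the naming concrete: a WAL segment file name is 24 hex digits, and the leading 8 digits (the "major number" above) are the timeline ID. The sketch below only illustrates that layout; `timeline_of` is a hypothetical helper name, not part of PostgreSQL or pg_standby.

```python
import re

# A WAL segment name is 24 hex digits: 8 for the timeline ID,
# 8 for the log file number, 8 for the segment number.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")


def timeline_of(wal_name):
    """Return the timeline ID encoded in a WAL segment file name.

    Hypothetical helper for illustration only.
    """
    match = WAL_NAME.match(wal_name)
    if match is None:
        raise ValueError("not a WAL segment name: %r" % wal_name)
    return int(match.group(1), 16)


# Segments written before and after the failover carry different prefixes.
print(timeline_of("000000010000000000000003"))
print(timeline_of("00000002000000000000000A"))
```

A server still replaying timeline 1 keeps asking for names that begin with 00000001, which matches the behaviour described above.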

Is there a way to ship back the missing information from the recovery
process, without doing another base backup of data/ ?

On Mon, Aug 24, 2009 at 11:34 AM, james bardin<jbardin@bu.edu> wrote:

So I've been experimenting with this timeline problem without any success.
Is it possible that there are changes made during recovery that aren't logged?

I tried recovery_target_timeline='X' on the standby, where X is the
new timeline created after recovery on the new master. This fails,
with some "unexpected timeline ID" lines and a
PANIC: could not locate a valid checkpoint record

I also tried using recovery_target_timeline='latest'. This fell back
gracefully to an earlier state, but changes were lost. Also, it never
waited on pg_standby, and finished recovering immediately.
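
For reference, 'latest' is documented as recovering along the newest timeline found in the archive. A rough sketch of that selection, assuming it amounts to picking the highest-numbered .history file, might look like the following. `latest_timeline` is a hypothetical name, and this is not the server's actual code.

```python
import re

# Timeline history files are named with the 8-hex-digit timeline ID
# followed by ".history", e.g. 00000002.history.
HISTORY_NAME = re.compile(r"^([0-9A-F]{8})\.history$")


def latest_timeline(filenames, current=1):
    """Return the highest timeline ID that has a history file.

    Hypothetical helper: falls back to `current` when the archive
    contains no history files at all.
    """
    latest = current
    for name in filenames:
        match = HISTORY_NAME.match(name)
        if match:
            latest = max(latest, int(match.group(1), 16))
    return latest


archive = ["000000010000000000000003", "00000002.history"]
print(latest_timeline(archive))
```

This covers only the timeline choice. It does not explain why recovery finished immediately instead of waiting on pg_standby.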

Although it doesn't solve this problem, can pg_standby be used with
recovery_target_timeline='latest', or should I file a bug?

Thanks
-jim

#2 james bardin
jbardin@bu.edu
In reply to: james bardin (#1)
Re: warm standby and reciprocating failover

On Mon, Aug 24, 2009 at 12:45 PM, james bardin<jbardin@bu.edu> wrote:

I tried recovery_target_timeline='X' on the standby, where X is the
new timeline created after recovery on the new master. This fails,
with some "unexpected timeline ID" lines and a
PANIC:  could not locate a valid checkpoint record

I also tried using recovery_target_timeline='latest'. This fell back
gracefully to an earlier state, but changes were lost. Also, it never
waited on pg_standby, and finished recovering immediately.

It seems that this is related to the issue in this bug report:
http://archives.postgresql.org/pgsql-bugs/2009-05/msg00060.php

The follow-up discussion is very long, and I couldn't work out any
workaround for the issue from it.