Recovery continually requests new WAL files

Started by Alex Good, almost 14 years ago, 7 messages, general
#1 Alex Good
alexjsgood@gmail.com

Hey!

I have a simple setup with one master and one backup server. I took a
base backup, copied it (including the backup_label file) to the data
directory of the slave, wrote a recovery.conf and started the server.
It happily restores everything up to and including the WAL file
mentioned in backup_label, and then tries to fetch the next archive
file, which has not been archived yet. I can't for the life of me
figure out what is going on.

Here's a breakdown of what I do.

On the master:

call pg_start_backup('label')

tar -zcf backup.tar.gz base global pg_clog pg_multixact pg_notify pg_serial pg_subtrans pg_tblspc pg_twophase backup_label

call pg_stop_backup()

scp backup.tar.gz slave_hostname:/var/lib/postgresql/9.1/main

Then on the slave server:

rm -rf global base pg_clog pg_multixact pg_notify pg_serial pg_subtrans pg_tblspc pg_twophase pg_xlog/*
mkdir pg_xlog/archive_status
tar -xvf backup.tar.gz

restart postgresql

----------------
recovery.conf
-----------------
restore_command = 'scp master-hostname:/var/lib/postgresql/9.1/main/wal_archives/%f %p'
standby_mode=on
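
As a rough illustration of how the restore_command above is filled in for each requested segment, the sketch below mimics the %f and %p substitution in Python. The function name expand_restore_command and the target path pg_xlog/RECOVERYXLOG are assumptions made for this example, not values taken from this setup. Other placeholders, such as %r, are deliberately left untouched.

```python
def expand_restore_command(template: str, wal_file: str, target_path: str) -> str:
    """Fill in %f (file to fetch) and %p (where to put it); %% becomes a literal %."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            nxt = template[i + 1]
            if nxt == "f":
                out.append(wal_file)
                i += 2
                continue
            if nxt == "p":
                out.append(target_path)
                i += 2
                continue
            if nxt == "%":
                out.append("%")
                i += 2
                continue
        # Anything else (including placeholders this sketch does not handle)
        # is copied through unchanged.
        out.append(ch)
        i += 1
    return "".join(out)


# Illustrative values only: the target path is an assumption for the example.
command = expand_restore_command(
    "scp master-hostname:/var/lib/postgresql/9.1/main/wal_archives/%f %p",
    "00000001000000000000000B",
    "pg_xlog/RECOVERYXLOG",
)
print(command)
```

If the expanded command exits with a non-zero status, as scp does when the file is missing, the server treats that segment as not available in the archive.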

And here's what I'm seeing in the logs on the recovering server

2012-06-12 16:31:26 UTC FATAL: the database system is starting up
2012-06-12 16:31:27 UTC FATAL: the database system is starting up
2012-06-12 16:31:27 UTC FATAL: the database system is starting up
2012-06-12 16:31:27 UTC LOG: incomplete startup packet
2012-06-12 16:31:30 UTC LOG: restored log file "00000001000000000000000A" from archive
2012-06-12 16:31:30 UTC LOG: redo starts at 0/A000078
2012-06-12 16:31:30 UTC LOG: consistent recovery state reached at 0/B000000
scp: /var/lib/postgresql/9.1/main/wal_archives/00000001000000000000000B: No such file or directory
scp: /var/lib/postgresql/9.1/main/wal_archives/00000001000000000000000B: No such file or directory
scp: /var/lib/postgresql/9.1/main/wal_archives/00000001000000000000000B: No such file or directory
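
For reference, the file names in this log can be derived from the positions it reports. The sketch below is a minimal Python helper, assuming the default 16 MB WAL segment size; the name wal_file_name is made up for the example. It maps a timeline ID and a log position such as 0/A000078 to the segment file that contains it.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def wal_file_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-character WAL segment file name that holds the given position."""
    high_hex, low_hex = lsn.split("/")
    log_id = int(high_hex, 16)                      # part before the slash
    segment = int(low_hex, 16) // segment_size      # which segment inside that log
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


print(wal_file_name(1, "0/A000078"))  # 00000001000000000000000A
print(wal_file_name(1, "0/B000000"))  # 00000001000000000000000B
```

Under that assumption, the "redo starts" position falls in segment 0A, and the "consistent recovery state" position is the very start of segment 0B, which is the file the restore_command is then asked for.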

I'm confused by this because the 00000001000000000000000B archive wasn't
created until after the pg_stop_backup call so why is it needed?

Any help would be appreciated; I've been banging my head against this
one for a while.

Thanks
Alex

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alex Good (#1)
Re: Recovery continually requests new WAL files

Alex Good wrote:

I have a simple setup with one master and one backup server. I have an
issue where I have performed a backup and copied it to the data
directory for the slave, written a recovery.conf and copied in the
backup_label file and then started the server, it happily restores
everything up until and including the WAL file mentioned in the
backup_label and then attempts to obtain the next archive file which
has not yet been archived. I can't for the life of me figure out what
is going on.

What else would you expect?

Are you planning to use streaming replication?

If yes, what are your configuration parameters for replication?

Yours,
Laurenz Albe

#3 Alex Good
alexjsgood@gmail.com
In reply to: Laurenz Albe (#2)
Re: Recovery continually requests new WAL files

On 13/06/12 09:10, Albe Laurenz wrote:

What else would you expect?

Are you planning to use streaming replication?

If yes, what are your configuration parameters for replication?

Yours,
Laurenz Albe

I expected the server to request each WAL file up to the one that was
archived during pg_stop_backup, and then to consider itself recovered.
Clearly I have misunderstood something here.

These two servers actually sit behind pgpool, which is in replication
mode (so I don't have streaming replication set up). I chose it because
it gives me synchronous replication as well as automatic failover. I am
trying to understand the recovery process so I can use it to set up
pgpool's online recovery feature.

Thanks
Alex Good


#4 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alex Good (#3)
Re: Recovery continually requests new WAL files

Alex Good wrote:

What I expected to see was the server requesting each WAL file up until
the one which was archived during pg_stop_backup and then the server
would consider itself to be recovered. Clearly I have misunderstood
something here.

These two servers are actually sat behind pgpool which is in replication
mode (so I don't have streaming replication set up) which I chose
because it gives me synchronous replication as well as automatic
failover. I am trying to understand the recovery process so I can use it
to set up pgpool's online recovery feature.

Oh, you didn't say that it is about pgpool.

You might try asking on their mailing lists:
http://www.pgpool.net/mediawiki/index.php/Mailing_lists

Yours,
Laurenz Albe

#5 Alex Good
alexjsgood@gmail.com
In reply to: Laurenz Albe (#4)
Re: Recovery continually requests new WAL files

On 13/06/12 10:29, Albe Laurenz wrote:

Oh, you didn't say that it is about pgpool.

You might try to ask their mailing lists:
http://www.pgpool.net/mediawiki/index.php/Mailing_lists

Yours,
Laurenz Albe

Although pgpool is involved, this isn't actually about pgpool. I've been
running through the recovery process manually to understand what needs
to be done to get online recovery working with pgpool. Pgpool isn't
actually running at the moment.

Anyway, I think what I had misunderstood was the meaning of the
'standby_mode' parameter in recovery.conf. If I remove it, the process
behaves as I expect, except that the restoring server ends up on a new
timeline. I would prefer it to be on the same timeline as the master. I
have set recovery_target_timeline = 'latest' in recovery.conf, but this
still increments the timeline. Is there any way to get the recovery to
stay on the same timeline other than explicitly specifying the
timeline?
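
The difference between the two behaviours can be pictured with a toy model. The Python sketch below is not how PostgreSQL implements recovery; simulate_recovery, its parameters and its messages are invented for illustration, and the polling is cut off after max_polls attempts only so that the example terminates.

```python
from typing import List, Set


def simulate_recovery(archive: Set[int], first_segment: int,
                      standby_mode: bool, max_polls: int = 3) -> List[str]:
    """Toy model: replay archived segments, then either finish or keep polling."""
    events: List[str] = []
    segment = first_segment
    polls = 0
    while True:
        if segment in archive:
            events.append(f"restored segment {segment:02X}")
            segment += 1
            polls = 0
            continue
        if not standby_mode:
            # Without standby_mode, a missing segment means the end of the archive.
            events.append(f"segment {segment:02X} missing: recovery ends on a new timeline")
            return events
        # With standby_mode, the same missing segment is simply requested again.
        polls += 1
        events.append(f"segment {segment:02X} missing: retrying")
        if polls >= max_polls:
            events.append("simulation stopped, a real standby would keep waiting")
            return events


for line in simulate_recovery({0x0A}, 0x0A, standby_mode=True):
    print(line)
for line in simulate_recovery({0x0A}, 0x0A, standby_mode=False):
    print(line)
```

With standby_mode=True the model repeats the request for segment 0B, which matches the repeated scp errors in the log from the first message.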

Thanks
Alex

#6 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alex Good (#5)
Re: Recovery continually requests new WAL files

Alex Good wrote:

Although pgpool is involved this isn't actually about pgpool, I've been
running through the recovery process manually to try and understand what
needs to be done in order to get online recovery working with pgpool.
Pgpool isn't actually running at the moment.

Oh, I see.

Anyway, I think what I had misunderstood was the meaning of the
'standby_mode' parameter in recovery.conf. If I remove that then the
process behaves as I expect it to except that the restoring server ends
up restoring to a new timeline, I would prefer that it be on the same
timeline as the master, I have set recovery_target_timeline = 'latest'
in recovery.conf but this still increments the timeline. Is there any
way to get the recovery to stay on the same timeline other than
explicitly specifying the timeline?

That's why I asked if this is about streaming replication.

It is by design that a new timeline is opened after recovery.
This is to tell the WAL sequence from before and after recovery apart.
Is it a problem for you?
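
To make the timeline visible: the first eight hexadecimal digits of a WAL file name are the timeline ID. The Python sketch below splits a name into its three parts and shows what the same segment would be called one timeline later; both function names are invented for this illustration.

```python
from typing import Tuple


def split_wal_file_name(name: str) -> Tuple[int, int, int]:
    """Split a 24-character WAL file name into (timeline, log id, segment)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL file name")
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def on_next_timeline(name: str) -> str:
    """Name of the same segment position on the following timeline."""
    timeline, log_id, segment = split_wal_file_name(name)
    return f"{timeline + 1:08X}{log_id:08X}{segment:08X}"


print(split_wal_file_name("00000001000000000000000B"))  # (1, 0, 11)
print(on_next_timeline("00000001000000000000000B"))     # 00000002000000000000000B
```

Because the timeline ID is part of the file name, WAL written after the recovery cannot overwrite or be mistaken for WAL that the master wrote on timeline 1.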

Yours,
Laurenz Albe

#7 Alex Good
alexjsgood@gmail.com
In reply to: Laurenz Albe (#6)
Re: Recovery continually requests new WAL files

On 13/06/12 11:10, Albe Laurenz wrote:

Oh, I see.

That's why I asked if this is about streaming replication.

It is by design that a new timeline is opened after recovery.
This is to tell the WAL sequence from before and after recovery apart.
Is it a problem for you?

Yours,
Laurenz Albe

Well, I had assumed it was a bad thing, because I intend to use the
recovery procedure to add backup servers to the pgpool cluster, and it
seemed to make more sense for them all to be on the same timeline.

Having thought about it, though, I don't think it matters. Thanks very
much for your help; I've been banging my head against this for a while.

Thanks
Alex Good