Warm standby can't start because logs stream too quickly from the master

Started by Zach Walton over 8 years ago, 4 messages, general
#1 Zach Walton
zacwalt@gmail.com

Looking at the startup process:

postgres 16749 4.1 6.7 17855104 8914544 ? Ss 18:36 0:44 postgres: startup process recovering 0000000800005B1C00000030

Then a few seconds later:

postgres 16749 4.2 7.0 17855104 9294172 ? Ss 18:36 0:47 postgres: startup process recovering 0000000800005B1C00000047

It's replaying logs from the master, but it's always a few behind, so
startup never finishes. Here's a demonstration:

# while :; do echo $(ls data/pg_xlog/ | grep -n $(ps aux | egrep "startup process" | awk '{print $15}')) $(ls data/pg_xlog/ | wc -l); sleep 1; done
# current replay location # number of WALs in pg_xlog
1655:0000000800005B1C00000064 1659
1656:0000000800005B1C00000065 1660
1658:0000000800005B1C00000067 1661
1659:0000000800005B1C00000068 1662
1660:0000000800005B1C00000069 1663
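
The gap can also be read straight off the segment names rather than from line counts. A minimal sketch, assuming 24-hex-digit WAL file names, the default 16 MB segment size (256 segments per log id, so the last eight digits run from 00000000 to 000000FF) and a release that uses all 256 of them (9.3 or later); wal_segno and wal_lag are hypothetical helper names, not PostgreSQL tools:

```sh
# Hypothetical helpers. Assumes names of the form TTTTTTTTLLLLLLLLSSSSSSSS
# (timeline, log id, segment) and 256 segments per log id (16 MB segments).

# Print the absolute segment number encoded in a WAL file name.
wal_segno() (
    log=$(printf '%s' "$1" | cut -c9-16)
    seg=$(printf '%s' "$1" | cut -c17-24)
    echo $(( 0x$log * 256 + 0x$seg ))
)

# Print how many segments the replay position ($1) is behind the newest
# segment ($2). The timeline field is ignored.
wal_lag() (
    echo $(( $(wal_segno "$2") - $(wal_segno "$1") ))
)

# The two ps snapshots above are 23 segments apart:
wal_lag 0000000800005B1C00000030 0000000800005B1C00000047   # prints 23
```

Fed the segment from the startup process title and the newest file name in pg_xlog, wal_lag gives the number of segments still to be replayed.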

Generally this works itself out if I wait (sometimes a really long time).
Is there a configuration option that allows a warm standby to start without
having fully replayed the logs from the master?

* Note: wal_keep_segments is set to 8192 on these servers, which have large
disks, to allow for recovery within a couple of hours of a failover without
resorting to restoring from archive
* This is specifically an issue for pgpool recovery, which fails if a
standby can't start within (by default) 300 seconds. Open to toggling that
param if there's no way around this.
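
For reference, a startup deadline of this kind amounts to polling until the standby accepts connections or the time runs out. A rough sketch of that idea, not pgpool's actual implementation; wait_until is a hypothetical helper, and the host name in the usage line below is a placeholder:

```sh
# wait_until TIMEOUT CMD [ARGS...]
# Re-run CMD about once per second until it succeeds (status 0) or roughly
# TIMEOUT seconds have passed (status 1).
wait_until() (
    timeout=$1
    shift
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            exit 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
)
```

With pg_isready (shipped with PostgreSQL 9.3 and later) as the check, a 300-second deadline would look like:

    wait_until 300 pg_isready -q -h standby.example.com -p 5432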

#2 Joshua D. Drake
jd@commandprompt.com
In reply to: Zach Walton (#1)
Re: Warm standby can't start because logs stream too quickly from the master

On 12/02/2017 11:02 AM, Zach Walton wrote:

> Generally this works itself out if I wait (sometimes a really long
> time). Is there a configuration option that allows a warm standby to
> start without having fully replayed the logs from the master?
>
> * Note: wal_keep_segments is set to 8192 on these servers, which have
> large disks, to allow for recovery within a couple of hours of a
> failover without resorting to restoring from archive
> * This is specifically an issue for pgpool recovery, which fails if a
> standby can't start within (by default) 300 seconds. Open to toggling
> that param if there's no way around this.

It only needs to reach a consistent state; it doesn't need to restore all
the logs. What does your recovery.conf say, and are you *100% sure* you
issued a pg_stop_backup()?

JD

--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.org
***** Unless otherwise stated, opinions are my own. *****

#3 Jeff Janes
jeff.janes@gmail.com
In reply to: Zach Walton (#1)
Re: Warm standby can't start because logs stream too quickly from the master

On Sat, Dec 2, 2017 at 11:02 AM, Zach Walton <zacwalt@gmail.com> wrote:

> Looking at the startup process:
>
> postgres 16749 4.1 6.7 17855104 8914544 ? Ss 18:36 0:44 postgres: startup process recovering 0000000800005B1C00000030
>
> Then a few seconds later:
>
> postgres 16749 4.2 7.0 17855104 9294172 ? Ss 18:36 0:47 postgres: startup process recovering 0000000800005B1C00000047
>
> It's replaying logs from the master, but it's always a few behind, so
> startup never finishes. Here's a demonstration:
>
> # while :; do echo $(ls data/pg_xlog/ | grep -n $(ps aux | egrep "startup process" | awk '{print $15}')) $(ls data/pg_xlog/ | wc -l); sleep 1; done
> # current replay location # number of WALs in pg_xlog
> 1655:0000000800005B1C00000064 1659
> 1656:0000000800005B1C00000065 1660
> 1658:0000000800005B1C00000067 1661
> 1659:0000000800005B1C00000068 1662
> 1660:0000000800005B1C00000069 1663
>
> Generally this works itself out if I wait (sometimes a really long time).
> Is there a configuration option that allows a warm standby to start without
> having fully replayed the logs from the master?

Warm standbys aren't supposed to start up; that is what makes them warm.
Are you trying to set up a hot standby? Or are you trying to promote a warm
standby to be the new master? (Usually you would do that only when the
current master has died, and so is no longer generating logs.)
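
Which kind of standby a server is comes down to the hot_standby setting in postgresql.conf: with it on, the standby accepts read-only connections once it reaches a consistent state; with it off, it stays in recovery and refuses connections until it is promoted. A small sketch for checking the setting, assuming plain "key = value" lines (include files and the form without "=" are not handled); conf_get and standby_kind are hypothetical helpers:

```sh
# conf_get FILE KEY
# Print the last uncommented value of KEY in FILE, without quotes or
# trailing comment. Prints nothing if KEY is not set.
conf_get() (
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*'\{0,1\}\([^'#[:space:]]*\).*/\1/p" "$1" | tail -n 1
)

# standby_kind FILE
# Print "hot" if hot_standby is enabled in FILE, otherwise "warm".
# An unset value counts as off, the default before PostgreSQL 10.
standby_kind() (
    case $(conf_get "$1" hot_standby) in
        on|true|yes|1) echo hot ;;
        *) echo warm ;;
    esac
)
```

For example, standby_kind data/postgresql.conf prints "warm" when the only hot_standby line is commented out.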

Cheers,

Jeff

#4 Zach Walton
zacwalt@gmail.com
In reply to: Joshua D. Drake (#2)
Re: Warm standby can't start because logs stream too quickly from the master

This was my fault. I'd restored recovery.conf from recovery.done to try to
recover manually after automated recovery failed. Everything's working
after stopping the database and running pgpool online recovery again.

Thanks for the help.
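
For context: on releases that use recovery.conf, the server renames it to recovery.done when recovery ends (for example on promotion), so the file name shows whether the next start will enter recovery. Renaming recovery.done back to recovery.conf sends the server into recovery again. A small sketch that reports which state a data directory is in; recovery_state is a hypothetical helper and its messages are illustrative:

```sh
# recovery_state DATADIR
# Report what the recovery files in DATADIR imply for the next startup.
recovery_state() (
    datadir=$1
    if [ -f "$datadir/recovery.conf" ]; then
        echo "recovery.conf present: server will start in recovery"
    elif [ -f "$datadir/recovery.done" ]; then
        echo "recovery.done present: recovery already finished"
    else
        echo "no recovery file: normal startup"
    fi
)
```

Running recovery_state data before starting the server shows whether a leftover or restored recovery.conf is about to put it back into recovery.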

On Sat, Dec 2, 2017 at 1:55 PM, Joshua D. Drake <jd@commandprompt.com>
wrote:

> On 12/02/2017 11:02 AM, Zach Walton wrote:
>
>> Generally this works itself out if I wait (sometimes a really long time).
>> Is there a configuration option that allows a warm standby to start without
>> having fully replayed the logs from the master?
>>
>> * Note: wal_keep_segments is set to 8192 on these servers, which have
>> large disks, to allow for recovery within a couple of hours of a failover
>> without resorting to restoring from archive
>> * This is specifically an issue for pgpool recovery, which fails if a
>> standby can't start within (by default) 300 seconds. Open to toggling that
>> param if there's no way around this.
>
> It only needs to reach a consistent state; it doesn't need to restore all
> the logs. What does your recovery.conf say, and are you *100% sure* you
> issued a pg_stop_backup()?
>
> JD