pgsql: If recovery_target_timeline is set to 'latest' and standby mode

Started by Heikki Linnakangas almost 15 years ago, 5 messages
#1 Heikki Linnakangas
heikki.linnakangas@iki.fi

If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.
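
A minimal sketch of the rescan idea, in Python rather than the C of xlog.c. It assumes the archive is a plain local directory and that timeline history files are named with the TLI as eight hex digits followed by ".history". The function names are hypothetical and this is not the code from the commit, which fetches files through restore_command rather than reading a directory directly.

```python
import os


def history_file_name(tli: int) -> str:
    # Timeline history files carry the TLI as 8 uppercase hex digits.
    return f"{tli:08X}.history"


def find_newest_timeline(archive_dir: str, start_tli: int) -> int:
    """Probe for successive timeline history files above start_tli.

    Returns the highest TLI reachable without a gap, or start_tli
    if no newer history file exists in archive_dir (assumed layout).
    """
    newest = start_tli
    probe = start_tli + 1
    while os.path.exists(os.path.join(archive_dir, history_file_name(probe))):
        newest = probe
        probe += 1
    return newest
```

A standby waiting for new WAL could call something like find_newest_timeline() periodically and switch its recovery target when the result exceeds its current TLI.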

This patch only scans the archive for TLI changes; it won't follow a TLI
change in streaming replication. That is much needed too, but it would be a
much bigger patch than I dare to sneak in this late in the release cycle.
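
Because the rescan only looks at the archive, a standby that wants to follow a timeline change needs a restore_command pointing at it. A rough recovery.conf sketch follows. The archive path and the cp-based command are assumptions for illustration, not part of the commit.

```
# recovery.conf on the standby (illustrative; archive path is hypothetical)
standby_mode = 'on'
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
recovery_target_timeline = 'latest'
```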

There was discussion on improving the sanity checking of the WAL segments so
that the system would notice more reliably if the new timeline isn't an
ancestor of the current one, but that is not included in this patch.

Reviewed by Fujii Masao.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/1a4ab9ec23f0635a4c15b069df60b545814650e9

Modified Files
--------------
doc/src/sgml/high-availability.sgml | 5 ++-
doc/src/sgml/recovery-config.sgml | 4 +-
src/backend/access/transam/xlog.c | 79 +++++++++++++++++++++++++++++++++-
3 files changed, 83 insertions(+), 5 deletions(-)

#2 Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#1)
Re: [COMMITTERS] pgsql: If recovery_target_timeline is set to 'latest' and standby mode

On Mon, Mar 7, 2011 at 20:16, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:

If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.

Can we make recovery_target_timeline='latest' the default when we are
in standby mode? That would suddenly make it a lot easier to "repoint
a slave" after a switchover...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#3 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Magnus Hagander (#2)
Re: Re: [COMMITTERS] pgsql: If recovery_target_timeline is set to 'latest' and standby mode

On 07.03.2011 21:20, Magnus Hagander wrote:

On Mon, Mar 7, 2011 at 20:16, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:

If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.

Can we make recovery_target_timeline='latest' the default when we are
in standby mode? That would suddenly make it a lot easier to "repoint
a slave" after a switchover...

Hmm, seems reasonable. 'latest' is what you usually want, at least in
standby mode. Though it would be strange to have a different default
depending on the value of another setting. Maybe we should change the
default regardless of standby_mode?

We'd need a magic value to mean the current default behavior, to recover
to the current timeline. 'current'?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4 Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#3)
Re: Re: [COMMITTERS] pgsql: If recovery_target_timeline is set to 'latest' and standby mode

On Mon, Mar 7, 2011 at 20:24, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.03.2011 21:20, Magnus Hagander wrote:

On Mon, Mar 7, 2011 at 20:16, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:

If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.

Can we make recovery_target_timeline='latest' the default when we are
in standby mode? That would suddenly make it a lot easier to "repoint
a slave" after a switchover...

Hmm, seems reasonable. 'latest' is what you usually want, at least in
standby mode. Though it would be strange to have a different default
depending on the value of another setting. Maybe we should change the
default regardless of standby_mode?

Seems like a much narrower use case in ordinary recovery mode, but we
could definitely change both.

We'd need a magic value to mean the current default behavior, to recover to
the current timeline. 'current'?

I didn't realize we didn't already have that. In principle, I think we
should *always* be able to specify in a config file whatever comes out
as a default. There should be no magic behavior that cannot be
explicitly specified.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#1)
Re: [COMMITTERS] pgsql: If recovery_target_timeline is set to 'latest' and standby mode

On Mon, 2011-03-07 at 19:16 +0000, Heikki Linnakangas wrote:

If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.

This patch only scans the archive for TLI changes; it won't follow a TLI
change in streaming replication. That is much needed too, but it would be a
much bigger patch than I dare to sneak in this late in the release cycle.

There was discussion on improving the sanity checking of the WAL segments so
that the system would notice more reliably if the new timeline isn't an
ancestor of the current one, but that is not included in this patch.

This appears to rely on the existence of an archive, which isn't always
there. That is neither documented nor checked for, as far as I can see.

If the idea is to support downstream standbys via file-based replication,
it should really say that. Shame it doesn't support streaming only.

There's also a comment in the code about something the admin needs to
make sure doesn't happen. That needs to be in the docs also.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services