what to do after a failover

Started by Rita, about 6 years ago | 5 messages | general
#1 Rita
rmorgan466@gmail.com

I run a master and standby setup with PostgreSQL 11. The systems are
identical in hardware and software setup. If the master goes down, I
can do a pg_ctl promote on the standby and point my applications to the
standby (new master).

Once the original master is back online, when is an appropriate time to
fail back? And are there any other steps besides the promote once the
failover is done?

-- 
--- Get your facts first, then you can distort them as you please.--
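
A minimal sketch of the promotion step described above, written in Python only to keep the example self-contained. The helper name promote_command and the data directory path are assumptions for illustration; pg_ctl promote and pg_is_in_recovery() are the stock PostgreSQL pieces.

```python
import shlex


def promote_command(pgdata):
    """Return the argv that promotes the standby whose data directory is pgdata."""
    return ["pg_ctl", "-D", pgdata, "promote"]


# SQL to run on the promoted node afterwards: it returns false once the
# server has left recovery and accepts writes.
CHECK_PROMOTED_SQL = "SELECT pg_is_in_recovery();"


if __name__ == "__main__":
    # Hypothetical data directory; adjust to the local layout.
    print(shlex.join(promote_command("/var/lib/pgsql/11/data")))
    print(CHECK_PROMOTED_SQL)
```

The argv list can be passed to subprocess.run on the standby host; the query is meant to be run with psql against the promoted node before applications are repointed.
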
#2 Michael Paquier
michael@paquier.xyz
In reply to: Rita (#1)
Re: what to do after a failover

On Wed, Jan 08, 2020 at 11:06:28PM -0500, Rita wrote:

> I run a master and standby setup with PostgreSQL 11. The systems are
> identical in hardware and software setup. If the master goes down, I
> can do a pg_ctl promote on the standby and point my applications to the
> standby (new master).
>
> Once the original master is back online, when is an appropriate time to
> fail back? And are there any other steps besides the promote once the
> failover is done?

Make sure that you still have an HA configuration able to handle
multiple levels of failure, with standbys always available after a
promotion.

The options available to rebuild your HA configuration after a
failover depend on the version of PostgreSQL you are using. After a
failover, the simplest solution is to always recreate a new
standby from a base backup taken from the freshly promoted primary,
though it can be costly depending on the size of your instance. You
could also use pg_rewind (available in core since 9.5) to recycle the
previous primary and reuse it as a standby of the newly promoted
cluster. Note that there are community-based solutions for such things,
like pg_auto_failover or Pacemaker-based setups, just to name two.
These rely on more complex architectures, where a third node is present
to monitor the others (any sane HA infrastructure ought to do at least
that, to be honest).
--
Michael
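
For readers who want to see what the pg_rewind route looks like, here is a hedged sketch that only builds the command line. The function name rewind_command, the path, and the connection string are assumptions for illustration. pg_rewind itself expects the old primary to be cleanly shut down and to have been running with wal_log_hints = on or with data checksums enabled.

```python
def rewind_command(target_pgdata, source_conninfo, dry_run=False):
    """Build the pg_rewind argv that resynchronizes the old primary.

    target_pgdata: data directory of the old primary (cleanly shut down).
    source_conninfo: libpq connection string for the newly promoted primary.
    dry_run: when True, pg_rewind reports what it would do without changing files.
    """
    argv = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={source_conninfo}",
        "--progress",
    ]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    # Hypothetical path and host name; adjust to the local setup.
    cmd = rewind_command(
        "/var/lib/pgsql/11/data",
        "host=new-primary port=5432 user=postgres dbname=postgres",
        dry_run=True,
    )
    print(cmd)
```

On PostgreSQL 11, a recovery.conf pointing at the new primary still has to be created in the rewound data directory before the old primary is started as a standby.
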

#3 Rita
rmorgan466@gmail.com
In reply to: Michael Paquier (#2)
Re: what to do after a failover

Thanks for the response.
I am using PostgreSQL 11.
I want something simple and I have a strong preference for stock
tools. After the promotion, once the original master comes back online,
I was thinking of doing a pg_basebackup to sync. Any thoughts about
that? I had a very hard time with pg_rewind and I didn't like its
complexity.

#4 Jehan-Guillaume de Rorthais
In reply to: Rita (#3)
Re: what to do after a failover

On Thu, 9 Jan 2020 06:55:18 -0500
Rita <rmorgan466@gmail.com> wrote:

> Thanks for the response.
> I am using PostgreSQL 11.
> I want something simple and I have a strong preference for stock
> tools. After the promotion, once the original master comes back online,
> I was thinking of doing a pg_basebackup to sync. Any thoughts about
> that?

If you can afford that, this is the cleanest and easiest procedure you could
find.

Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
whole instance from the newly promoted primary to the original one.

Regards,
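
A minimal sketch of that procedure, assuming a replication role and host name that are purely illustrative (replicator, new-primary). It checks that the target PGDATA is empty, then builds a pg_basebackup command that streams WAL during the copy (-X stream) and writes the recovery settings for a standby (-R).

```python
import os


def pgdata_is_empty(pgdata):
    """True if pgdata does not exist yet or contains no entries."""
    return not os.path.exists(pgdata) or not os.listdir(pgdata)


def basebackup_command(host, pgdata, user="replicator", port=5432):
    """Build the pg_basebackup argv that copies the new primary into pgdata."""
    return [
        "pg_basebackup",
        "-h", host,        # newly promoted primary
        "-p", str(port),
        "-U", user,        # hypothetical role with REPLICATION privilege
        "-D", pgdata,      # must be empty or not exist yet
        "-X", "stream",    # stream WAL while the backup runs
        "-R",              # write standby recovery settings
        "-P",              # show progress
    ]


if __name__ == "__main__":
    pgdata = "/var/lib/pgsql/11/data"  # hypothetical path on the old primary
    if pgdata_is_empty(pgdata):
        print(basebackup_command("new-primary", pgdata))
    else:
        print("PGDATA is not empty; move the old data directory aside first")
```

The old data directory is better moved aside than deleted until the rebuilt standby is confirmed to be replicating.
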

#5 Michael Paquier
michael@paquier.xyz
In reply to: Jehan-Guillaume de Rorthais (#4)
Re: what to do after a failover

On Thu, Jan 09, 2020 at 03:14:59PM +0100, Jehan-Guillaume de Rorthais wrote:

> If you can afford that, this is the cleanest and easiest procedure you could
> find.
>
> Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
> whole instance from the newly promoted primary to the original one.

Simple is easier to understand. Now, the larger your instance, the
longer it takes to copy a base backup and the longer you reduce the
availability of your cluster. So be careful with what you choose.
--
Michael
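
To put a rough number on that trade-off, here is a back-of-the-envelope estimate. The instance size and throughput are hypothetical values chosen for illustration, and the calculation ignores WAL replay and checkpoint time.

```python
def estimated_copy_seconds(instance_bytes, throughput_bytes_per_s):
    """Lower bound on base backup duration: size divided by sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return instance_bytes / throughput_bytes_per_s


GIB = 1024 ** 3
MIB = 1024 ** 2

if __name__ == "__main__":
    # Hypothetical: a 500 GiB instance copied at a sustained 100 MiB/s.
    seconds = estimated_copy_seconds(500 * GIB, 100 * MIB)
    print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours without a standby")
```

During that window the new primary runs without a standby, which is the reduced availability mentioned above.
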