what to do after a failover

Started by Rita over 6 years ago · 5 messages · general

rmorgan466@gmail.com

over 6 years ago

I run a master and standby setup with PostgreSQL 11. The systems are
identical in hardware and software. If the master goes down, I
can run pg_ctl promote on the standby and point my applications at the
standby (the new master).

Once the original master is back online, when is an appropriate time to fail
back over? And is there anything else to do besides the promote once the
failover is done?

-- 
--- Get your facts first, then you can distort them as you please.--

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Rita (#1)

Re: what to do after a failover

On Wed, Jan 08, 2020 at 11:06:28PM -0500, Rita wrote:

> I run a master and standby setup with PostgreSQL 11. The systems are
> identical in hardware and software. If the master goes down, I
> can run pg_ctl promote on the standby and point my applications at the
> standby (the new master).
>
> Once the original master is back online, when is an appropriate time to fail
> back over? And is there anything else to do besides the promote once the
> failover is done?

Make sure that you still have an HA configuration that can handle
multiple levels of failure, with standbys always available after a
promotion.

The options available to rebuild your HA configuration after a
failover depend on the version of PostgreSQL you are using. After a
failover, the simplest solution is to always recreate a new
standby from a base backup taken from the freshly promoted primary,
though that can be costly depending on the size of your instance. You
could also use pg_rewind (available in core since 9.5) to recycle the
previous primary and reuse it as a standby of the newly promoted
cluster. Note that there are community-based solutions for such things,
like pg_auto_failover or Pacemaker-based setups, to name just two. These
rely on more complex architectures, where a third node is present to
monitor the others (to be honest, any sane HA infrastructure ought to do
at least that).
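
For illustration, here is a minimal sketch of how the pg_rewind route
could be scripted. It only builds and prints the command line, it does not
run anything. The data directory path and the connection string are
hypothetical placeholders, not values from this thread. pg_rewind expects
the old primary to have been shut down cleanly and needs either
wal_log_hints = on or data checksums on the cluster, and on PostgreSQL 11
the rewound node still needs a recovery.conf before it can start as a
standby.

```python
import shlex


def pg_rewind_command(target_pgdata, source_conninfo, dry_run=True):
    """Build the argv for rewinding the old primary against the new one.

    target_pgdata:   data directory of the old primary (must be stopped).
    source_conninfo: libpq connection string for the newly promoted primary.
    dry_run:         when True, add --dry-run so nothing is modified.
    """
    cmd = [
        "pg_rewind",
        "--target-pgdata=" + target_pgdata,
        "--source-server=" + source_conninfo,
        "--progress",
    ]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Hypothetical values, for illustration only.
cmd = pg_rewind_command(
    "/var/lib/postgresql/11/main",
    "host=new-primary port=5432 user=postgres dbname=postgres",
)
print(shlex.join(cmd))
```

Defaulting to a dry run is a deliberate choice in this sketch: it lets you
see whether a rewind is possible before the old primary's data directory
is touched.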
--
Michael

Rita

rmorgan466@gmail.com

over 6 years ago

In reply to: Michael Paquier (#2)

Re: what to do after a failover

Thanks for the response.
I am using PostgreSQL 11.
I want something simple and I have a strong preference for stock
tools. After the promotion, once the original master comes back online, I was
thinking of running pg_basebackup to resync it. Any thoughts about that? I had a
very hard time with pg_rewind and I didn't like its complexity.


Jehan-Guillaume de Rorthais

jgdr@dalibo.com

over 6 years ago

In reply to: Rita (#3)

Re: what to do after a failover

On Thu, 9 Jan 2020 06:55:18 -0500
Rita <rmorgan466@gmail.com> wrote:

> Thanks for the response.
> I am using PostgreSQL 11.
> I want something simple and I have a strong preference for stock
> tools. After the promotion, once the original master comes back online, I was
> thinking of running pg_basebackup to resync it. Any thoughts about that?

If you can afford it, this is the cleanest and easiest procedure you will
find.

Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
whole instance from the newly promoted primary to the original one.
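
As a rough sketch of that procedure, the snippet below checks that the
target directory is empty and builds a pg_basebackup command line to run
on the original master. The host name, data directory and replication
role are hypothetical placeholders. The flags are standard pg_basebackup
options: -X stream streams WAL alongside the copy, -R writes a
recovery.conf on PostgreSQL 11 and -P reports progress. Nothing is
executed here.

```python
import os
import shlex


def pgdata_is_empty(pgdata):
    """Return True if the directory is missing or contains no entries."""
    return not os.path.exists(pgdata) or not os.listdir(pgdata)


def pg_basebackup_command(new_primary_host, pgdata, repl_user="replicator"):
    """Build the argv for re-seeding the old master from the new primary."""
    return [
        "pg_basebackup",
        "-h", new_primary_host,  # the freshly promoted primary
        "-U", repl_user,         # role with the REPLICATION attribute (assumed name)
        "-D", pgdata,            # must be empty or not exist yet
        "-X", "stream",          # stream WAL while the copy runs
        "-R",                    # write recovery settings for a standby
        "-P",                    # show progress
    ]


# Hypothetical values, for illustration only.
target = "/var/lib/postgresql/11/main"
if pgdata_is_empty(target):
    print(shlex.join(pg_basebackup_command("new-primary", target)))
else:
    print("PGDATA is not empty; move it aside or remove it first:", target)
```

Moving the old data directory aside instead of deleting it keeps a
fallback until the new standby is confirmed to be streaming.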

Regards,

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Jehan-Guillaume de Rorthais (#4)

Re: what to do after a failover

On Thu, Jan 09, 2020 at 03:14:59PM +0100, Jehan-Guillaume de Rorthais wrote:

> If you can afford it, this is the cleanest and easiest procedure you will
> find.
>
> Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
> whole instance from the newly promoted primary to the original one.

Simple is easier to understand. Now, the larger your instance, the
longer it takes to copy a base backup, and the longer you reduce the
availability of your cluster. So be careful with what you choose.
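
To put a rough number on that trade-off, a back-of-the-envelope estimate
is simply instance size divided by sustained copy throughput. The sizes
and the 100 MiB/s rate below are made-up figures for illustration. Real
throughput depends on network, disks and load on the new primary, so
treat the result as a lower bound on the time spent without a standby.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def estimated_copy_seconds(instance_bytes, throughput_bytes_per_sec):
    """Estimate how long a full base backup takes at a steady throughput."""
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return instance_bytes / throughput_bytes_per_sec


# Hypothetical instance sizes at an assumed sustained 100 MiB/s.
for size_gib in (50, 500, 2000):
    seconds = estimated_copy_seconds(size_gib * GiB, 100 * MiB)
    print(f"{size_gib} GiB: roughly {seconds / 60:.0f} minutes without a standby")
```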
--
Michael