Manual failover cluster
Team,
I have a PG 10 cluster with a master and two hot-standby nodes. There is a requirement for a manual failover (nodes switching roles) at will. This is a vanilla 3-node PG cluster that was built with WAL archiving (to a central location) and streaming replication to two hot standby nodes. The failover is scripted in Ansible. Ansible massages and moves around the archive/restore scripts, the conf files, and the trigger, and calls `pg_ctlcluster` to start/stop. This part _seems_ to be doing the job fine.
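(For context, the standby-side recovery.conf that Ansible drops in is roughly of this shape -- the hostnames, paths, and trigger file below are genericized placeholders, not our real values:)

```ini
# recovery.conf on each hot standby (PG 10) -- placeholder values
standby_mode = 'on'
# stream from the current master
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
# fall back to the central WAL archive when streaming lags
restore_command = 'cp /wal_archive/%f %p'
# promotion trigger checked by the failover playbook
trigger_file = '/var/lib/postgresql/10/main/failover.trigger'
```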
The issue I am struggling with is the apparent fragility of the process - all 3 nodes will end up in a "good" state after the switch only every other time. Other times I have to rebase the hot standby from the new master with pg_basebackup. It seems the issues are mostly with the nodes that end up as slaves after the role switch runs.
They get errors like a mismatch in timelines, recovering from the same WAL over and over again, "invalid resource manager ID in primary checkpoint record", etc.
In this light, I am wondering - using what's offered by PostgreSQL itself, i.e. streaming WAL replication with log shipping - can I expect this kind of failover to be 100% reliable on the PG side? Is anyone doing this reliably on PostgreSQL 10.1x?
Thanks!
Moishe
What parameters have you set in the recovery.conf file?
Regards,
Ninad Shah
On Fri, 20 Aug 2021 at 18:53, Hispaniola Sol <moishap@hotmail.com> wrote:
Hi Moishe,
Since we use pgbackrest ourselves, this is the process I followed to set up
something similar on PG 10:
https://pgstef.github.io/2018/11/28/combining_pgbackrest_and_streaming_replication.html
(Not knowing much [if at all] about the reason for your requirements, I
would recommend looking into enabling PITR backups in addition to the hot
standby[s], which as you've seen are not bulletproof.)
Anyway, just wanted to add that your experience roughly matches ours -- it
seems that an attempt to promote the standby isn't always smooth. My sample
size is almost certainly smaller than yours, about half a dozen attempts.
In the past I've attributed this to a poor understanding on my part of the
behind-the-scenes of the process, and chalked it up to having made one
mistake or another in the manual steps (mainly in the order they are
executed). That being said, if you find a way to faithfully reproduce the
issue, I'm sure the community will want to know; there is certainly an
expectation that failover is reliable on the PG software side of things,
as long as there are no hardware or operator issues! Again, not knowing a
lot about your setup, my first instinct would be to troubleshoot your
automated scripts -- you might find that you need to change the order in
which things are run on server B vs. server A, for example.
On Fri, Aug 20, 2021 at 9:23 AM Hispaniola Sol <moishap@hotmail.com> wrote:
Sorry, I misspoke there - I meant to say that since one should not count on
the standby-failover process to always run smoothly (whether it's due to
hardware, operator, automated scripts, or software issues), DB backups
should also be in place if at all possible.
On Mon, Aug 23, 2021 at 1:37 PM Saul Perdomo <saul.perdomo@gmail.com> wrote:
Hi Saul,
Hope you are doing well. My apologies for the delayed response.
pgBackRest helps build streaming replication. While performing a role
reversal (switchover), it is mandatory to set recovery_target_timeline to
'latest' in recovery.conf (in the data directory). The steps to perform a
switchover are as below.
1) Stop the master database.
2) Promote the slave database.
3) Prepare the recovery.conf file on the stopped master node, and set
recovery_target_timeline to 'latest' in that file.
4) Start the stopped master database; it will automatically come up as a
slave.
5) Check the status of the database.
Additionally, entries in pg_hba.conf are required.
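As a sketch, the recovery.conf from step 3 would look something like this -- the hostname, user, and archive path are placeholders, not values from your setup:

```ini
# recovery.conf on the stopped (old) master, before restarting it as a slave
standby_mode = 'on'
# point at the freshly promoted master
primary_conninfo = 'host=new-master.example.com port=5432 user=replicator'
restore_command = 'cp /wal_archive/%f %p'
# follow the new timeline created by the promotion
recovery_target_timeline = 'latest'
```

And the new master's pg_hba.conf needs a replication entry for the old master, e.g. `host replication replicator 10.0.0.2/32 md5` (placeholder address and user).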
Hope this helps.
Regards,
Ninad Shah
On Mon, 23 Aug 2021 at 23:12, Saul Perdomo <saul.perdomo@gmail.com> wrote:
Are you sure that it is *mandatory*? Because from my recollection I've only
needed to set it manually when (for one reason or another) my recovery
attempt fails and I'm in what the docs call a "complex re-recovery
situation" -- not a fun time:
recovery_target_timeline (string)
Specifies recovering into a particular timeline. The default is to recover
along the same timeline that was current when the base backup was taken.
Setting this to latest recovers to the latest timeline found in the
archive, which is useful in a standby server. Other than that you only need
to set this parameter in complex re-recovery situations, where you need to
return to a state that itself was reached after a point-in-time recovery.
See Section 25.3.5
<https://www.postgresql.org/docs/10/continuous-archiving.html#BACKUP-TIMELINES>
for discussion.
From: https://www.postgresql.org/docs/10/recovery-target-settings.html
On Fri, Aug 27, 2021 at 2:36 AM Ninad Shah <nshah.postgres@gmail.com> wrote: