Streaming replication failover process - Pgsql 9.2

Started by Patrick Bover 9 years ago17 messagesgeneral

Jump to latest

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

Hi all,

There will be a network maintenance at the company where my servers are...

I've got one master and one slave server, running PostgreSQL 9.2.

As the network will be down, the internet won't be working as well as the
intranet. Both servers won't be able to communicate each other. Not by
streaming replication and wal_files too.

That should take at most 10 minutes.

1. When the connection comes back, will the master and slave work as
expected? The streaming replication should be ok?

2. As the master will be down, I don't wanna slave turns into a master by
Failover. The trigger_file line on recovery.conf on the slave server is
commented, so should be ok here right? The slave will be still a slave once
master is down....

Thanks!
Patrick

John R Pierce

pierce@hogranch.com

over 9 years ago

In reply to: Patrick B (#1)

Re: Streaming replication failover process - Pgsql 9.2

On 7/10/2016 2:19 PM, Patrick B wrote:

1. When the connection comes back, will the master and slave work as
expected? The streaming replication should be ok?

as long as you have sufficient WAL available it should recover fine.
you might have to restart the slave to get it to reconnect.

2. As the master will be down, I don't wanna slave turns into a master
by Failover. The trigger_file line on recovery.conf on the slave
server is commented, so should be ok here right? The slave will be
still a slave once master is down....

failover is a function of whatever cluster management software you use,
postgres won't failover on its own.

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Adrian Klaver

adrian.klaver@aklaver.com

over 9 years ago

In reply to: Patrick B (#1)

Re: Streaming replication failover process - Pgsql 9.2

On 07/10/2016 02:19 PM, Patrick B wrote:

Hi all,

There will be a network maintenance at the company where my servers are...

I've got one master and one slave server, running PostgreSQL 9.2.

As the network will be down, the internet won't be working as well as
the intranet. Both servers won't be able to communicate each other. Not
by streaming replication and wal_files too.

That should take at most 10 minutes.

1. When the connection comes back, will the master and slave work as
expected? The streaming replication should be ok?

Assuming you have wal_keep_segments set high enough to keep the master
from recycling the WAL files before the network comes up again.

2. As the master will be down, I don't wanna slave turns into a master
by Failover. The trigger_file line on recovery.conf on the slave server
is commented, so should be ok here right? The slave will be still a
slave once master is down....

Thanks!
Patrick

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Andreas Kretschmer

andreas@a-kretschmer.de

over 9 years ago

In reply to: Patrick B (#1)

Re: Streaming replication failover process - Pgsql 9.2

Am 10.07.2016 um 23:19 schrieb Patrick B:

Hi all,

There will be a network maintenance at the company where my servers are...

I've got one master and one slave server, running PostgreSQL 9.2.

As the network will be down, the internet won't be working as well as
the intranet. Both servers won't be able to communicate each other.
Not by streaming replication and wal_files too.

That should take at most 10 minutes.

1. When the connection comes back, will the master and slave work as
expected? The streaming replication should be ok?

if the master holds all needed WAL's there should be no problem.

You can ensure that with wal_keep_segments, or, in newer version, via
replication slots.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

John R Pierce

pierce@hogranch.com

over 9 years ago

In reply to: Andreas Kretschmer (#4)

Re: Streaming replication failover process - Pgsql 9.2

On 7/10/2016 2:42 PM, Andreas Kretschmer wrote:

1. When the connection comes back, will the master and slave work as
expected? The streaming replication should be ok?

if the master holds all needed WAL's there should be no problem.

You can ensure that with wal_keep_segments, or, in newer version, via
replication slots.

or via an external WAL archive, that the master writes, and the slave
has access to.

--
john r pierce, recycling bits in santa cruz

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: John R Pierce (#5)

Re: Streaming replication failover process - Pgsql 9.2

thanks guys.. thanks for all the comments...

I'm not shipping the wal_files into master, I actually ship them into slave
and another backup server as well.

So I'll have to change my archive_command then :)

Thanks!

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Patrick B (#6)

Re: Streaming replication failover process - Pgsql 9.2

archive_command = 'cp %p /var/lib/pgsql/archive/%f'

That would be ok right guys?

I will also setup wal_keep_segments to 512

Adrian Klaver

adrian.klaver@aklaver.com

over 9 years ago

In reply to: Patrick B (#7)

Re: Streaming replication failover process - Pgsql 9.2

On 07/10/2016 04:28 PM, Patrick B wrote:

archive_command = 'cp %p /var/lib/pgsql/archive/%f'

This would be where?

And does the corresponding restore_command point to the same place?

That would be ok right guys?

I will also setup wal_keep_segments to 512

The WAL segments kept would be dependent the number of WAL segments your
database generates on average over the worse case downtime interval plus
a CYA buffer.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Adrian Klaver (#8)

Re: Streaming replication failover process - Pgsql 9.2

2016-07-11 12:18 GMT+12:00 Adrian Klaver <adrian.klaver@aklaver.com>:

On 07/10/2016 04:28 PM, Patrick B wrote:

archive_command = 'cp %p /var/lib/pgsql/archive/%f'

This would be where?

master server

And does the corresponding restore_command point to the same place?

yes.. the slaves have the restore_command pointing to the same place.
As they won't be able to communicate with master, once all the services are
up again, will have to copy them manually

Show quoted text

That would be ok right guys?

I will also setup wal_keep_segments to 512

The WAL segments kept would be dependent the number of WAL segments your
database generates on average over the worse case downtime interval plus a
CYA buffer.

#10

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Patrick B (#9)

Re: Streaming replication failover process - Pgsql 9.2

If the master server can't send the wal_files through the slaves, shouldn't
the wal_files be in "background" waiting to be delivered?

Otherwise what's the purpose of them? If a network fails I'd loose those
files?

#11

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Patrick B (#10)

Re: Streaming replication failover process - Pgsql 9.2

oh ok.. got it..

wal_keep_segments = To prevent the primary server from removing the
WAL segments required for the standby server before shipping them, set
the minimum number of segments retained in the pg_xlog directory

so it would be ok just by increasing that parameter, right? Once the
servers are back online, the shipping should start normally.

Is that right guys?

cheers

#12

John R Pierce

pierce@hogranch.com

over 9 years ago

In reply to: Patrick B (#7)

Re: Streaming replication failover process - Pgsql 9.2

On 7/10/2016 4:28 PM, Patrick B wrote:

archive_command = 'cp %p /var/lib/pgsql/archive/%f'

That would be ok right guys?

normally, you want to ship your WAL archives to a NFS server or
something similar, which the master and all the slaves can read.

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: John R Pierce (#12)

Re: Streaming replication failover process - Pgsql 9.2

2016-07-11 15:48 GMT+12:00 John R Pierce <pierce@hogranch.com>:

On 7/10/2016 4:28 PM, Patrick B wrote:

archive_command = 'cp %p /var/lib/pgsql/archive/%f'

That would be ok right guys?

normally, you want to ship your WAL archives to a NFS server or something
similar, which the master and all the slaves can read.

what if the network goes down?

#14

John R Pierce

pierce@hogranch.com

over 9 years ago

In reply to: Patrick B (#13)

Re: Streaming replication failover process - Pgsql 9.2

On 7/10/2016 8:51 PM, Patrick B wrote:

what if the network goes down?

that WAL server could be located in the same data center as the master
database server. if your local area network goes down, well, you're
probably in a world of hurt.

if the wide area network is mission critical, it would be smart to have
multiple resilient network paths

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15

Adrian Klaver

adrian.klaver@aklaver.com

over 9 years ago

In reply to: Patrick B (#10)

Re: Streaming replication failover process - Pgsql 9.2

On 07/10/2016 07:17 PM, Patrick B wrote:

If the master server can't send the wal_files through the slaves,
shouldn't the wal_files be in "background" waiting to be delivered?

Short version, yes, assuming you are talking about archiving the WAL
files somewhere and assuming there is sufficient space for the quantity
of WAL files stored on what ever 'device' you are storing them on.

Long version and, I would say, required reading:

https://www.postgresql.org/docs/9.2/static/high-availability.html

Otherwise what's the purpose of them? If a network fails I'd loose those
files?

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#16

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Adrian Klaver (#15)

Re: Streaming replication failover process - Pgsql 9.2

Hi guys,

I'm setting up a new slave server, using Postgres 9.2. This new slave
server I'll call: New_slave.

I ran this command, from the new_slave server. It will connects to my other
slave and copy the DB.

ssh postgres@slave05 'pg_basebackup --pgdata=- --format=tar

--label=new_slave --progress --host=localhost --port=5432
--username=replicator --xlog' | tar -x --no-same-owner

That took me 10 days, as I'm copying +2TB. Note that I'm not using
*--xlog-method=stream*, but I was copying the wal_files manually to the
new_slave server. By the time the command above (pg_basebackup) finished,
there was 1TB of wal_files.

I started to recovery the Database, setting the
recovery.conf.recovery_command.

and then I got this error:

WAL segment `../wal_archive/000000020000171B000000D9` not found

So.. shit.. I don't know what happened, but I checked the master server and
actually that file wasn't shipped into the new_server. I don't know why.

Questions:
1 - Do you have any recommendations? I'll have to perform that again.

2 - If I use pg_basebackup with -xlog-stream, do I have to have
wal_shippment to the new_slave? Or the basebackup command would take care
of them?

*Using the pg_basebackup with stream should be?*

ssh postgres@slave05 'pg_basebackup --pgdata=- --format=tar

--label=new_slave --xlog-method=stream --progress --host=localhost
--port=5432 --username=replicator --xlog' | tar -x --no-same-owner

3 - Instead of using pg_basebackup like that, should I dump into local
disk(on the slave05 server) and then copy to the new_slave?

Cheers

#17

Patrick B

patrickbakerbr@gmail.com

over 9 years ago

In reply to: Patrick B (#16)

Re: Streaming replication failover process - Pgsql 9.2

sorry... wrong email. Will create a new topic.