Live streaming replication setup issue!

Started by Ashish Chauhan about 10 years ago | 8 messages | general
#1 Ashish Chauhan
Ashish.Chauhan@support.com

Hi,

Currently we have master -> slave -> DR hot standby streaming replication in our production environment. Replication between the master and slave servers is running fine. Replication between the slave and DR servers is broken, and I am trying to fix it. For the DR server, the slave server acts as the master.

Issue: A few days ago, the DR server lagged behind the slave server and replication stopped. I tried to set up replication from the slave to DR again with the pg_basebackup command (replication from master to slave is still running). I can restart Postgres on the DR server without any error, but when I try to run psql on DR, it throws the error below.

psql: FATAL: the database system is starting up

Slave configuration:
hot_standby = on
listen_addresses = '*'
wal_level = hot_standby
wal_keep_segments = 3000
max_wal_senders = 5

default_statistics_target = 100
maintenance_work_mem = 1792MB
checkpoint_completion_target = 0.7
effective_cache_size = 22GB
work_mem = 144MB
wal_buffers = 16MB
checkpoint_segments = 32
shared_buffers = 7GB
max_connections = 300

DR Server configuration:
listen_addresses = '*'
hot_standby = on
wal_level = hot_standby

default_statistics_target = 100
maintenance_work_mem = 896MB
checkpoint_completion_target = 0.7
effective_cache_size = 10GB
work_mem = 352MB
wal_buffers = 16MB
checkpoint_segments = 32
shared_buffers = 3584MB
max_connections = 300

How do I set up replication between the DR server and the slave server while the slave and master servers are running? I cannot stop the master server. Can someone please guide me through the steps?

Thanks for your help in advance.

Thanks
-Ashish

#2 Venkata B Nagothi
nag1010@gmail.com
In reply to: Ashish Chauhan (#1)
Re: Live streaming replication setup issue!

Ashish Chauhan wrote:

How do I setup replication between DR server and slave server while slave
server and master server are running? I cannot stop master server. Can
someone please guide with steps?

The steps are much the same. You can set up replication between slave and
DR by using a backup of the master database plus WAL archives (if
available), and setting primary_conninfo in recovery.conf on DR to point to
the slave database. Can you please let us know which version of PostgreSQL
you are using? That is important for us to make any suggestions.

Regards,
Venkata B N

Fujitsu Australia

#3 Ashish Chauhan
Ashish.Chauhan@support.com
In reply to: Venkata B Nagothi (#2)
Re: Live streaming replication setup issue!

I am using Postgres 9.4.

I removed the /etc/Postgres/9.4/main directory from the DR server and am fetching the data with the pg_basebackup command below, but no luck.

sudo -u postgres pg_basebackup -h <slave server ip> -D /var/lib/postgresql/9.4/main -U postgres -v -P

Do I need to enable these two parameters on the slave server: archive_mode = on and archive_command = some command?

The biggest issue is that replication is running between master and slave, and I need to build DR from the slave server.
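
For readers following along, here is a minimal sketch of how the base backup command above could be assembled, written in Python so each argument is easy to inspect. The hostname `slave.example.internal` and the helper `build_basebackup_cmd` are hypothetical. The `-X stream` and `-R` options are not part of the command used in this thread; they are standard pg_basebackup options added here as an assumption, so that the backup carries the WAL it needs and a recovery.conf pointing at the slave is written into the new data directory. Whether this alone resolves the "starting up" error is not established in the thread.

```python
import shlex


def build_basebackup_cmd(upstream_host, data_dir, user="postgres"):
    """Return the pg_basebackup argument list for cloning a standby."""
    return [
        "pg_basebackup",
        "-h", upstream_host,  # the slave acts as the upstream for DR
        "-D", data_dir,       # target data directory on DR (empty or absent)
        "-U", user,
        "-X", "stream",       # assumption: stream WAL alongside the copy
        "-R",                 # assumption: write a minimal recovery.conf
        "-v",
        "-P",
    ]


# Hypothetical host name; the data directory is the one used in the thread.
cmd = build_basebackup_cmd("slave.example.internal", "/var/lib/postgresql/9.4/main")
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed command would be run on the DR server as the postgres OS user, with the DR instance stopped.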

Thanks,
Ashish

#4 Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: Ashish Chauhan (#1)
Re: Live streaming replication setup issue!

Ashish Chauhan wrote:

Currently we have master -> slave -> DR hot standby streaming replication in
current prod environment. Between master and slave server replication running
fine. Between slave and DR server replication is broken and I am trying to fix
it. For DR server, slave server is master server.

Issue: Few days back, DR was lagging behind slave server and stopped
replication. I tried to setup the replication from slave to DR (currently
there is replication running from master to slave) with pg_basebackup command,
I am able to restart Postgres without any error on DR server but when I try to
run any psql on DR, it throwing up below error.

psql: FATAL: the database system is starting up

Please show us your recovery.conf. It should include this line:

standby_mode = 'on'

--
Andreas Kretschmer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5 Ashish Chauhan
Ashish.Chauhan@support.com
In reply to: Andreas Kretschmer (#4)
Re: Live streaming replication setup issue!

Below is the recovery.conf on the slave:

#---------------------------------------------------------------------------
# STANDBY SERVER PARAMETERS
#---------------------------------------------------------------------------
#
# standby_mode
#
# When standby_mode is enabled, the PostgreSQL server will work as a
# standby. It will continuously wait for the additional XLOG records, using
# restore_command and/or primary_conninfo.
#
standby_mode = 'on'
#
# primary_conninfo
#
# If set, the PostgreSQL server will try to connect to the primary using this
# connection string and receive XLOG records continuously.
#
primary_conninfo = 'host=<master server ip> port=5432'
#
#
# By default, a standby server keeps restoring XLOG records from the
# primary indefinitely. If you want to stop the standby mode, finish recovery
# and open the system in read/write mode, specify path to a trigger file.
# The server will poll the trigger file path periodically and start as a
# primary server when it's found.
#
trigger_file = '/data/main/primary.trigger'

Thanks,
Ashish

#6 Venkata B Nagothi
nag1010@gmail.com
In reply to: Ashish Chauhan (#5)
Re: Live streaming replication setup issue!

On Fri, Feb 19, 2016 at 6:24 PM, Ashish Chauhan <Ashish.Chauhan@support.com>
wrote:

Below is recovery.conf on slave

#---------------------------------------------------------------------------
# STANDBY SERVER PARAMETERS
#---------------------------------------------------------------------------
#
# standby_mode
#
# When standby_mode is enabled, the PostgreSQL server will work as a
# standby. It will continuously wait for the additional XLOG records, using
# restore_command and/or primary_conninfo.
#
standby_mode = 'on'
#
# primary_conninfo
#
# If set, the PostgreSQL server will try to connect to the primary using this
# connection string and receive XLOG records continuously.
#
primary_conninfo = 'host=<master server ip> port=5432'
#
#
# By default, a standby server keeps restoring XLOG records from the
# primary indefinitely. If you want to stop the standby mode, finish recovery
# and open the system in read/write mode, specify path to a trigger file.
# The server will poll the trigger file path periodically and start as a
# primary server when it's found.
#
trigger_file = '/data/main/primary.trigger'

Can you consider adding recovery_target_timeline = 'latest' as well? Also,
can you let us know whether you see anything unusual in the PostgreSQL log
files on DR?

Is DR in complete sync with the slave?
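
To make the suggestion concrete, here is a minimal sketch in Python that renders a recovery.conf for the DR server on PostgreSQL 9.4. It combines the settings shown in the slave's file with recovery_target_timeline and points primary_conninfo at the slave rather than the master, as suggested earlier in the thread. The helper `render_recovery_conf` and the host name `slave.example.internal` are hypothetical, and the trigger file path is simply reused from the slave's configuration as an assumption.

```python
def render_recovery_conf(upstream_host, port=5432, trigger_file=None):
    """Build recovery.conf text for a 9.4 standby streaming from upstream_host."""
    lines = [
        "standby_mode = 'on'",
        # For DR, the upstream is the slave, not the master.
        f"primary_conninfo = 'host={upstream_host} port={port}'",
        # Follow a timeline switch on the upstream if one occurs.
        "recovery_target_timeline = 'latest'",
    ]
    if trigger_file:
        lines.append(f"trigger_file = '{trigger_file}'")
    return "\n".join(lines) + "\n"


# Hypothetical host name; trigger path reused from the slave's file.
print(render_recovery_conf("slave.example.internal",
                           trigger_file="/data/main/primary.trigger"))
```

The resulting text would go into recovery.conf in the DR data directory before starting the server.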

Regards,
Venkata B N

Fujitsu Australia

#7 Ashish Chauhan
Ashish.Chauhan@support.com
In reply to: Venkata B Nagothi (#6)
Re: Live streaming replication setup issue!

Thanks Venkata, I am able to set up replication now. Just wondering: when I check replication_delay and lag, I get negative numbers. Any idea why?

   receive    |    replay    | replication_delay | lag
--------------+--------------+-------------------+-----
 796/BA9D8000 | 796/BA9D7FF0 | -00:00:01.612415  |  -2

Thanks,
Ashish

#8 Venkata B Nagothi
nag1010@gmail.com
In reply to: Ashish Chauhan (#7)
Re: Live streaming replication setup issue!

On Tue, Feb 23, 2016 at 10:02 AM, Ashish Chauhan <Ashish.Chauhan@support.com>
wrote:

Thanks Venkata, I am able to setup replication now. Just wondering when I
check replication_delay and lag, I am getting negative number, any idea why?

   receive    |    replay    | replication_delay | lag
--------------+--------------+-------------------+-----
 796/BA9D8000 | 796/BA9D7FF0 | -00:00:01.612415  |  -2

The receive and replay locations fall in the same WAL file, please see
below:

postgres=# select pg_xlogfile_name('796/BA9D8000');
     pg_xlogfile_name
--------------------------
 0000000100000796000000BA
(1 row)

postgres=# select pg_xlogfile_name('796/BA9D7FF0');
     pg_xlogfile_name
--------------------------
 0000000100000796000000BA
(1 row)

That means replication is streaming continuously and may be behind by a few
WAL records. Do you see the lag all the time? Did you test whether
replication is working fine?

You can check that via pg_controldata as well. What does sync_state in
pg_stat_replication say?
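
As a rough cross-check of the pg_xlogfile_name output above, the two locations can also be compared by hand. The sketch below is plain Python and assumes the default 16 MB WAL segment size and timeline 1; the helper names `lsn_to_bytes` and `wal_file_name` are hypothetical and not part of PostgreSQL.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def lsn_to_bytes(lsn):
    """Convert an LSN such as '796/BA9D8000' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(lsn, timeline=1):
    """Name of the WAL segment file that contains the given LSN."""
    position = lsn_to_bytes(lsn)
    log_id = position >> 32
    segment = (position & 0xFFFFFFFF) // WAL_SEGMENT_SIZE
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


receive, replay = "796/BA9D8000", "796/BA9D7FF0"
gap = lsn_to_bytes(receive) - lsn_to_bytes(replay)
print(gap)                     # 16
print(wal_file_name(receive))  # 0000000100000796000000BA
print(wal_file_name(replay))   # 0000000100000796000000BA
```

Under these assumptions the replay location trails the receive location by 16 bytes, and both sit in the same segment file.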

Regards,
Venkata B N

Fujitsu Australia