DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1

Started by Kumar, Devesh | almost 2 years ago | 3 messages | bugs
#1 Kumar, Devesh
devesh.kumar@cmegroup.com

Hello,

We are setting up replication and testing failover and failback scenarios.
Failover succeeds. During failback, when we try to rejoin the original
primary as the new standby, pg_rewind fails with the errors below. Could
someone please take a look and advise?

DETAIL: pg_rewind command is "/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2'"
DEBUG: executing:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2' 2>/tmp/repmgr_command.wgVGPS
DEBUG: result of command was 1 (256)
DEBUG: local_command(): output returned was:
pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/802B668
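
The file name in the "could not open file" error can be tied back to the LSN in the last error line. The following is only a minimal sketch, assuming the default 16 MB WAL segment size (the thread does not state wal_segment_size); parse_lsn and wal_segment_name are hypothetical helper names for illustration, not part of PostgreSQL or repmgr.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/802B668' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_name(timeline: int, lsn: str,
                     segment_size: int = 16 * 1024 * 1024) -> str:
    """Return the name of the WAL segment file that contains `lsn`.

    Assumption: segment_size defaults to 16 MB, the PostgreSQL default.
    """
    segments_per_xlogid = 0x100000000 // segment_size
    segno = parse_lsn(lsn) // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


if __name__ == "__main__":
    # LSN from "could not find previous WAL record at 0/802B668"
    print(wal_segment_name(1, "0/802B668"))
    # Divergence point from "servers diverged at WAL location 0/9000000"
    print(wal_segment_name(1, "0/9000000"))
```

Under that assumption, 0/802B668 maps to 000000010000000000000008, the file pg_rewind reports as missing from pg_wal, while the divergence point 0/9000000 falls in the following segment.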

___________________________

*DEVESH KUMAR*

Database Admin I – India

M: +91 6366843695

devesh.kumar@cmegroup.com <firstname.lastname@cmegroup.com>

Address: Tridib Building Block B 5th Floor

Bagmane Tech Park CV Raman Nagar,

Bengaluru, 560093, IN
www.cmegroup.com

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Kumar, Devesh (#1)
Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1

On Sat, 2024-04-27 at 00:36 +0530, Kumar, Devesh wrote:

> Currently we are working on setting up replication and testing failover scenarios
> and failback. During our testing, failover is getting successful. During Failback,
> when we are reverting the original primary instance as the new standby, we are
> getting pg_rewind errors. Kindly can someone check and let us know.
>
> pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
> pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
> pg_rewind: error: could not find previous WAL record at 0/802B668

You should show the exact commands used for failover and failback.

Yours,
Laurenz Albe

#3 Kumar, Devesh
devesh.kumar@cmegroup.com
In reply to: Laurenz Albe (#2)
Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1

Hello Laurenz,

Thanks for the response. The details are below:

Primary repmgr.conf Details
[image: image.png]

Secondary repmgr.conf Details

[image: image.png]

Failover steps:

We stopped the PostgreSQL service on the primary server. repmgrd then
automatically failed over to the standby and promoted it to the new primary.

Status after failover:

[image: image.png]

Failback steps:

1. We executed a checkpoint on the new primary (originally the standby).
2. We ran the node rejoin command below with --dry-run to check whether the
original primary is eligible to rejoin:

repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d 'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind --config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v --dry-run

NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 7360952088605465701
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 2
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/9000028
INFO: prerequisites for using pg_rewind are met
INFO: file "postgresql.conf" would be copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not found, skipping
INFO: file "pg_hba.conf" would be copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: pg_rewind would now be executed
DETAIL: pg_rewind command is:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2'
INFO: prerequisites for executing NODE REJOIN are met

3. We executed the node rejoin command:

repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d 'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind --config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v

NOTICE: using provided configuration file "/opt/postgresql/15.6/bin/repmgr.conf"
DEBUG: server version number is: 150000
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_primary_node_id():
SELECT node_id FROM repmgr.nodes WHERE type = 'primary' AND active IS TRUE
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE n.node_id = 2
NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=10.29.97.241 port=5432 fallback_application_name=repmgr options=-csearch_path="
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_recovery_type(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE n.node_id = 1
DEBUG: local timeline: 1; rejoin target timeline: 2
DEBUG: get_timeline_history():
TIMELINE_HISTORY 2
DEBUG: local tli: 1; local_xlogpos: 0/9000028; follow_target_history->tli: 1; follow_target_history->end: 0/9000000
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 2
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/9000028
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'full_page_writes' AND setting = 'off'
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'wal_log_hints' AND setting = 'on'
INFO: prerequisites for using pg_rewind are met
DEBUG: using archive directory "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
DEBUG: copying "postgresql.conf" to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not found, skipping
DEBUG: copying "pg_hba.conf" to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: 2 files copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2'"
DEBUG: executing:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2' 2>/tmp/repmgr_command.wgVGPS
DEBUG: result of command was 1 (256)
DEBUG: local_command(): output returned was:
pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/802B668

ERROR: pg_rewind execution failed
DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/802B668
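
The DEBUG line with local_xlogpos and follow_target_history->end suggests why repmgr decides a rewind is needed: the local node's position on timeline 1 lies past the point where the rejoin target forked off. The sketch below only illustrates that comparison with the values from the log; it is not repmgr's actual implementation, and TimelineFork, rewind_required and bytes_past_fork are hypothetical names.

```python
from dataclasses import dataclass


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/9000028' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass(frozen=True)
class TimelineFork:
    """Hypothetical record of where the rejoin target left a timeline."""
    tli: int   # timeline the target forked off from
    end: str   # LSN at which that timeline ended on the target


def rewind_required(local_tli: int, local_lsn: str, fork: TimelineFork) -> bool:
    """True if the local node wrote WAL beyond the fork point on the same timeline."""
    return local_tli == fork.tli and parse_lsn(local_lsn) > parse_lsn(fork.end)


def bytes_past_fork(local_lsn: str, fork: TimelineFork) -> int:
    """Number of WAL bytes the local node is ahead of the fork point."""
    return parse_lsn(local_lsn) - parse_lsn(fork.end)


if __name__ == "__main__":
    # Values taken from the DEBUG line in the log above.
    fork = TimelineFork(tli=1, end="0/9000000")
    print(rewind_required(1, "0/9000028", fork))
    print(bytes_past_fork("0/9000028", fork))
```

With the logged values, the old primary is 0x28 (40) bytes past the fork point, which is consistent with the NOTICE that pg_rewind execution is required.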


On Mon, Apr 29, 2024 at 3:37 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

> You should show the exact commands used for failover and failback.
>
> Yours,
> Laurenz Albe
