replication isn't replicating

Started by Brad White, about 2 years ago, 4 messages, general
#1 Brad White
b55white@gmail.com

Errors from the Primary server

2024-01-15 00:01:06.166 CST [1428] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:06.166 CST [1428] STATEMENT: START_REPLICATION 2/A2000000
TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:11.158 CST [3472] STATEMENT: START_REPLICATION 2/A2000000
TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] LOG: could not receive data from
client: An existing connection was forcibly closed by the remote host.

2024-01-15 00:01:16.166 CST [664] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:16.166 CST [664] STATEMENT: START_REPLICATION 2/A2000000
TIMELINE 1
2024-01-15 00:01:21.161 CST [2016] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:21.161 CST [2016] STATEMENT: START_REPLICATION 2/A2000000
TIMELINE 1
2024-01-15 00:01:21.161 CST [2016] LOG: could not receive data from
client: An existing connection was forcibly closed by the remote host.

[repeat for 550000 lines]
----------------------------------------
Errors from the backup server

2024-01-15 01:13:57.893 CST [2988] LOG: started streaming WAL from primary
at 2/A2000000 on timeline 1
2024-01-15 01:13:57.893 CST [2988] FATAL: could not receive data from WAL
stream: ERROR: requested WAL segment 0000000100000002000000A2 has already
been removed
2024-01-15 01:13:57.893 CST [1792] LOG: waiting for WAL to become
available at 2/A2002000
2024-01-15 01:14:02.884 CST [2552] LOG: started streaming WAL from primary
at 2/A2000000 on timeline 1
2024-01-15 01:14:02.884 CST [2552] FATAL: could not receive data from WAL
stream: ERROR: requested WAL segment 0000000100000002000000A2 has already
been removed
2024-01-15 01:14:02.884 CST [1792] LOG: waiting for WAL to become
available at 2/A2002000

[repeat for 49000 lines]

What's my next step?

Thanks,
Brad.

#2 Brad White
b55white@gmail.com
In reply to: Brad White (#1)
Re: replication isn't replicating

Sorry for the repeat. It looked like it hadn't been sent. 😔

#3 Emanuel Calvo
3manuek@gmail.com
In reply to: Brad White (#1)
Re: replication isn't replicating

On Tue, 16 Jan 2024 at 22:47, Brad White (<b55white@gmail.com>) wrote:

Errors from the Primary server

2024-01-15 00:01:06.166 CST [1428] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:06.166 CST [1428] STATEMENT: START_REPLICATION
2/A2000000 TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:11.158 CST [3472] STATEMENT: START_REPLICATION
2/A2000000 TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] LOG: could not receive data from
client: An existing connection was forcibly closed by the remote host.

These log entries mean that some node is requesting a WAL segment that was
already removed from the server.
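
For reference, the segment name in the error can be derived from the timeline
and LSN in the START_REPLICATION statement. A minimal Python sketch, assuming
the default 16 MB WAL segment size (the function names are only illustrative,
not part of PostgreSQL):

```python
# Assumption: default WAL segment size of 16 MB (it is fixed at initdb time
# and may differ on a given cluster).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '2/A2000000' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_name(timeline: int, lsn: str,
                     segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-hex-digit WAL file name that contains the given LSN."""
    segno = parse_lsn(lsn) // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return "{:08X}{:08X}{:08X}".format(
        timeline, segno // segments_per_xlogid, segno % segments_per_xlogid
    )


# LSN and timeline taken from the STATEMENT lines in the log above.
print(wal_segment_name(1, "2/A2000000"))  # 0000000100000002000000A2
```

Under that assumption, both 2/A2000000 and 2/A2002000 fall in the same
segment, which matches the file name reported in the errors.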

2024-01-15 00:01:16.166 CST [664] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:16.166 CST [664] STATEMENT: START_REPLICATION 2/A2000000
TIMELINE 1
2024-01-15 00:01:21.161 CST [2016] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:21.161 CST [2016] STATEMENT: START_REPLICATION
2/A2000000 TIMELINE 1
2024-01-15 00:01:21.161 CST [2016] LOG: could not receive data from
client: An existing connection was forcibly closed by the remote host.

[repeat for 550000 lines]
----------------------------------------
Errors from the backup server

2024-01-15 01:13:57.893 CST [2988] LOG: started streaming WAL from
primary at 2/A2000000 on timeline 1
2024-01-15 01:13:57.893 CST [2988] FATAL: could not receive data from WAL
stream: ERROR: requested WAL segment 0000000100000002000000A2 has already
been removed
2024-01-15 01:13:57.893 CST [1792] LOG: waiting for WAL to become
available at 2/A2002000
2024-01-15 01:14:02.884 CST [2552] LOG: started streaming WAL from
primary at 2/A2000000 on timeline 1
2024-01-15 01:14:02.884 CST [2552] FATAL: could not receive data from WAL
stream: ERROR: requested WAL segment 0000000100000002000000A2 has already
been removed
2024-01-15 01:14:02.884 CST [1792] LOG: waiting for WAL to become
available at 2/A2002000

These are related to the backup not finding that segment, so it means
you'll need to resync your backup stream. I assume that you're using barman
with streaming backup, as described at
https://docs.pgbarman.org/release/3.9.0/#streaming-backup .

Hope this helps.

--
Emanuel Calvo
OnGres Database Engineer | ViaDB Founder

#4 Brad White
b55white@gmail.com
In reply to: Emanuel Calvo (#3)
Re: replication isn't replicating

On Tue, Jan 16, 2024 at 3:53 PM Emanuel Calvo <3manuek@gmail.com> wrote:

On Tue, 16 Jan 2024 at 22:47, Brad White (<b55white@gmail.com>) wrote:

Errors from the Primary server

2024-01-15 00:01:06.166 CST [1428] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:06.166 CST [1428] STATEMENT: START_REPLICATION
2/A2000000 TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] ERROR: requested WAL segment
0000000100000002000000A2 has already been removed
2024-01-15 00:01:11.158 CST [3472] STATEMENT: START_REPLICATION
2/A2000000 TIMELINE 1
2024-01-15 00:01:11.158 CST [3472] LOG: could not receive data from
client: An existing connection was forcibly closed by the remote host.

These log entries mean that some node is requesting a WAL segment that was
already removed from the server.

2024-01-15 01:14:02.884 CST [2552] LOG: started streaming WAL from
primary at 2/A2000000 on timeline 1
2024-01-15 01:14:02.884 CST [2552] FATAL: could not receive data from
WAL stream: ERROR: requested WAL segment 0000000100000002000000A2 has
already been removed
2024-01-15 01:14:02.884 CST [1792] LOG: waiting for WAL to become
available at 2/A2002000

These are related to the backup not finding that segment, so it means
you'll need to resync your backup stream.

You pointed me in the right direction.
Turns out the files are still there, so it must be a permission issue.
pgUser has full access to the files.
Postgres is running as pgUser, except that wasn't true on the backup,
where it was running as 'Network Service'.
Should be better now.

Aaaand I'm wrong.
Still getting the same errors on both servers.
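
One way to double-check the "files are still there" observation is to look
for the exact segment name in the primary's pg_wal directory. As far as I
know, streaming replication is served from pg_wal, so a copy that exists only
in a WAL archive would not satisfy the request. A rough Python sketch; the
function name and the example path are hypothetical:

```python
import os
import re

# WAL segment files have 24 uppercase hex digits as their name; other entries
# in pg_wal (archive_status, .history, .partial files) are skipped.
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def check_segment(wal_dir: str, wanted: str) -> dict:
    """Report whether `wanted` exists in `wal_dir`, plus the name range found."""
    segments = sorted(f for f in os.listdir(wal_dir) if SEGMENT_RE.match(f))
    return {
        "present": wanted in segments,
        # Lexicographic order; meaningful when all files share one timeline.
        "oldest": segments[0] if segments else None,
        "newest": segments[-1] if segments else None,
    }


# Example call (hypothetical path, adjust to the primary's data directory):
# check_segment(r"C:\pgdata\pg_wal", "0000000100000002000000A2")
```

If "present" comes back False while "oldest" is a later segment, the primary
really has recycled the file, whatever copies exist elsewhere.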