Requested WAL segment xxx has already been removed
Hi all,
I recently hit an error with our streaming replication setup:
2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0
It appears the requested WAL segment 00000001000000000000000C had already been
archived, and I confirmed its presence in the archive directory. However, when
the standby tried to request this file, the primary only searched for it in
pg_wal and didn't check the archive directory. I had to manually copy the
segment into pg_wal to get streaming replication working again.
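For reference, the manual workaround amounted to something like this (the
archive path is hypothetical; adjust it to wherever archive_command writes):

    cp /path/to/archive/00000001000000000000000C "$PGDATA"/pg_wal/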
My question is: Can we make the primary automatically search the archive if
restore_command is set?
I found that Fujii Masao also requested this feature [1], but it seems there
wasn't a consensus.
I've attached a script to reproduce this issue.
[1]: /messages/by-id/AANLkTinN=xsPOoaXzVFSp1OkfMDAB1f_d-F91xjEZDV8@mail.gmail.com
--
Regards,
Japin Li
On Mon, 14 Jul 2025 at 10:08, Japin Li <japinli@hotmail.com> wrote:
Hi all,
I recently hit an error with our streaming replication setup:
2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0

My question is: Can we make the primary automatically search the archive if
restore_command is set?
If we are talking about physical replication, then restore_command could
just as well be (and, more importantly, should be) used on the standby.
So the main question here is: why wasn't the standby properly configured?

However, with logical replication it is a different story, and it would be
really great if restore_command were used to fetch WALs when they are
missing.
Regards,
--
Alexander Kukushkin
On Mon, 14 Jul 2025 at 10:21, Alexander Kukushkin <cyberdemn@gmail.com> wrote:
On Mon, 14 Jul 2025 at 10:08, Japin Li <japinli@hotmail.com> wrote:
Hi all,
I recently hit an error with our streaming replication setup:
2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0

My question is: Can we make the primary automatically search the archive if
restore_command is set?

If we are talking about physical replication, then restore_command could
just as well be (and, more importantly, should be) used on the standby.
Yes, I'm referring to physical replication.
So the main question here is: why wasn't the standby properly configured?
The configuration is as expected. My test script simulates two distinct hosts
by using local archive storage.

For physical replication across distinct hosts without shared WAL archive
storage, WALs are archived locally (as in my test). When the primary's
walsender needs a WAL file that has been archived and is no longer in its
pg_wal directory, the segment must be copied manually into the primary's
pg_wal or the standby's pg_wal (or into the standby's archive directory,
with restore_command used to fetch it).
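For concreteness, the archiving side of the test is roughly the following
(the archive path is hypothetical):

    # postgresql.conf on each node, archiving to node-local storage
    archive_mode = on
    archive_command = 'cp %p /path/to/archive/%f'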
What prevents us from using the primary's restore_command to retrieve the
necessary WALs?
However, with logical replication it is a different story, and it would be
really great if restore_command were used to fetch WALs when they are missing.
--
Regards,
Japin Li
On 2025/07/14 17:08, Japin Li wrote:
Hi all,
I recently hit an error with our streaming replication setup:
2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0
It appears the requested WAL segment 00000001000000000000000C had already been
archived, and I confirmed its presence in the archive directory. However, when
the standby tried to request this file, the primary only searched for it in
pg_wal and didn't check the archive directory. I had to manually copy the
segment into pg_wal to get streaming replication working again.

My question is: Can we make the primary automatically search the archive if
restore_command is set?

I found that Fujii Masao also requested this feature [1], but it seems there
wasn't a consensus.
Yeah, I still like this idea. It's useful, for example, when we want to
temporarily retain WAL files, such as during planned standby maintenance,
to avoid the "requested WAL segment ... has already been removed" error.
Using a replication slot is one way to retain WAL files in pg_wal,
but it requires the pg_wal directory to be large enough to hold all
WAL generated during that time, which isn't always practical.
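For example, the two existing retention knobs look like this (the size and
slot name are made up):

    # keep at least 10GB of WAL in pg_wal regardless of consumers
    psql -c "ALTER SYSTEM SET wal_keep_size = '10GB'"
    psql -c "SELECT pg_reload_conf()"
    # or pin WAL with a physical replication slot
    psql -c "SELECT pg_create_physical_replication_slot('standby1')"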
Regards,
--
Fujii Masao
NTT DATA Japan Corporation
On Mon, 14 Jul 2025 at 20:33, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2025/07/14 17:08, Japin Li wrote:
Hi all,
I recently hit an error with our streaming replication setup:
2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0

It appears the requested WAL segment 00000001000000000000000C had already been
archived, and I confirmed its presence in the archive directory. However, when
the standby tried to request this file, the primary only searched for it in
pg_wal and didn't check the archive directory. I had to manually copy the
segment into pg_wal to get streaming replication working again.

My question is: Can we make the primary automatically search the archive if
restore_command is set?

I found that Fujii Masao also requested this feature [1], but it seems there
wasn't a consensus.

Yeah, I still like this idea. It's useful, for example, when we want to
temporarily retain WAL files, such as during planned standby maintenance,
to avoid "requested WAL segment ... removed." error.Using a replication slot is one way to retain WAL files in pg_wal,
but it requires the pg_wal directory to be large enough to hold all
WAL generated during that time, which isn't always practical.
Agreed. Here is a patch that fixes this.
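With the patch applied, it should be enough to configure restore_command on
the primary so the walsender can fall back to the archive; a minimal sketch
(the archive path is hypothetical):

    psql -c "ALTER SYSTEM SET restore_command = 'cp /path/to/archive/%f %p'"
    psql -c "SELECT pg_reload_conf()"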
--
Regards,
Japin Li
Attachments:
0001-Allow-the-walsender-to-retrieve-WALs-from-the-archiv.patch (text/x-diff)
From 9df3700bf0152c44e232755137c4681fd2c72e50 Mon Sep 17 00:00:00 2001
From: Japin Li <japinli@hotmail.com>
Date: Tue, 15 Jul 2025 13:58:53 +0800
Subject: [PATCH] Allow the walsender to retrieve WALs from the archive
---
src/backend/access/transam/xlogarchive.c | 4 ++--
src/backend/replication/walsender.c | 10 ++++++++++
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 1ef1713c91a..fe932c11f44 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -66,9 +66,9 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 
 	/*
	 * Ignore restore_command when not in archive recovery (meaning we are in
-	 * crash recovery).
+	 * crash recovery) and in non-walsender processes.
	 */
-	if (!ArchiveRecoveryRequested)
+	if (!ArchiveRecoveryRequested && !am_walsender)
		goto not_available;
 
	/* In standby mode, restore_command might not be supplied */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 28b8591efa5..438b5d27a32 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -53,6 +53,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog_internal.h"
+#include "access/xlogarchive.h"
 #include "access/xlogreader.h"
 #include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
@@ -3068,6 +3069,15 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
		int			save_errno = errno;
 
		XLogFileName(xlogfname, *tli_p, nextSegNo, wal_segment_size);
+
+		/* Restore WALs from archive if not found in XLOGDIR. */
+		if (RestoreArchivedFile(path, xlogfname, xlogfname, wal_segment_size, false))
+		{
+			state->seg.ws_file = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+			if (state->seg.ws_file >= 0)
+				return;
+		}
+
		errno = save_errno;
		ereport(ERROR,
				(errcode_for_file_access(),
--
2.43.0
Hi Japin,

Thank you for working on this. It is useful: when a standby node goes down
for hardware repair, the WAL it needs has usually already been archived by
the time it comes back. The wal_keep_size parameter is difficult to estimate
accurately, as hardware repair or replacement times are often unpredictable.
If the machine can be fixed within a few days, the archived WAL files are
likely still available in the archive directory. One small regret is that
PostgreSQL currently lacks a way to rate-limit WAL sending.

Thanks
Hi,
On Mon, 14 Jul 2025 at 11:24, Japin Li <japinli@hotmail.com> wrote:
The configuration is as expected. My test script simulates two distinct hosts
by using local archive storage.

For physical replication across distinct hosts without shared WAL archive
storage, WALs are archived locally (as in my test). When the primary's
walsender needs a WAL file that has been archived and is no longer in its
pg_wal directory, the segment must be copied manually into the primary's
pg_wal or the standby's pg_wal (or into the standby's archive directory,
with restore_command used to fetch it).

What prevents us from using the primary's restore_command to retrieve the
necessary WALs?
I am just talking about the practical side of local archive storage.
Such archives will be lost along with the server in case of disaster and
therefore provide only little value.

Equally well, a physical standby can use restore_command to copy files
from the archive on the primary via ssh/rsync or similar. This approach
has been used for ages and works just fine.
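For example, on the standby, something like this (host and path are made up):

    # postgresql.conf
    restore_command = 'rsync -a primary:/path/to/archive/%f %p'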
What is really painful right now is that logical walsenders can only look
into pg_wal, and unfortunately replication slots don't give a 100% guarantee
of WAL retention because of max_slot_wal_keep_size. That is, using
restore_command for logical walsenders would be really helpful and would
solve some problems and pain points with logical replication.
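As a sketch, slots at risk can be spotted with:

    psql -c "SELECT slot_name, wal_status, safe_wal_size FROM pg_replication_slots"

where wal_status turns 'unreserved' and eventually 'lost' once
max_slot_wal_keep_size is exceeded.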
However, if we start calling restore_command also for physical walsenders,
it might result in increased resource usage on the primary without providing
much additional value. For example, restore_command might be failing while
the standby indefinitely continues making replication connection attempts.

I don't mind if it also works for physical replication, but IMO there
should be a way to opt out of it.
Regards,
--
Alexander Kukushkin
Hi,

What is really painful right now is that logical walsenders can only look
into pg_wal, and unfortunately replication slots don't give a 100% guarantee
of WAL retention because of max_slot_wal_keep_size. That is, using
restore_command for logical walsenders would be really helpful and would
solve some problems and pain points with logical replication.

restore_command has to be implemented on top of ssh or NFS shared storage,
and in most companies security audit requirements make it impossible to
establish passwordless ssh trust between hosts. It would be very convenient
if this feature were implemented.
Thanks
On Tue, 15 Jul 2025 at 11:24, Alexander Kukushkin <cyberdemn@gmail.com> wrote:
Hi,
On Mon, 14 Jul 2025 at 11:24, Japin Li <japinli@hotmail.com> wrote:
The configuration is as expected. My test script simulates two distinct hosts
by using local archive storage.

For physical replication across distinct hosts without shared WAL archive
storage, WALs are archived locally (as in my test). When the primary's
walsender needs a WAL file that has been archived and is no longer in its
pg_wal directory, the segment must be copied manually into the primary's
pg_wal or the standby's pg_wal (or into the standby's archive directory,
with restore_command used to fetch it).

What prevents us from using the primary's restore_command to retrieve the
necessary WALs?

I am just talking about the practical side of local archive storage.
Yes, it's quite niche in its usage.
Such archives will be lost along with the server in case of disaster and
therefore provide only little value. Equally well, a physical standby can
use restore_command to copy files from the archive on the primary via
ssh/rsync or similar. This approach has been used for ages and works just fine.
However, some environments might prohibit password-free scp or the use of
shared directories.
What is really painful right now is that logical walsenders can only look
into pg_wal, and unfortunately replication slots don't give a 100% guarantee
of WAL retention because of max_slot_wal_keep_size. That is, using
restore_command for logical walsenders would be really helpful and would
solve some problems and pain points with logical replication.
I agree; logical walsenders offer greater value than physical ones.
However, if we start calling restore_command also for physical walsenders,
it might result in increased resource usage on the primary without providing
much additional value. For example, restore_command might be failing while
the standby indefinitely continues making replication connection attempts.
IIRC, the standby will indefinitely attempt to connect for replication, even
without restore_command configured.
I don't mind if it also works for physical replication, but IMO there should be a way to opt out of it.
--
Regards,
Japin Li
On Tue, 15 Jul 2025 at 12:08, Japin Li <japinli@hotmail.com> wrote:
IIRC, the standby will indefinitely attempt to connect for replication, even
without restore_command configured.
That's correct. However, right now each attempt just tries to open the WAL
segment in pg_wal and fails, which is cheap. Calling restore_command is more
expensive, and therefore the impact on resource usage would be bigger.
Regards,
--
Alexander Kukushkin