Simplify standby state machine a bit in WaitForWALToBecomeAvailable()
Hi,
In standby mode, the state machine in WaitForWALToBecomeAvailable()
reads WAL from pg_wal after failing to read from the archive. This is
currently implemented in XLogFileReadAnyTLI() by calling
XLogFileRead() with source XLOG_FROM_PG_WAL after it fails with source
XLOG_FROM_PG_ARCHIVE and the current source isn't changed at all.
Also, passing the source to XLogFileReadAnyTLI() in
WaitForWALToBecomeAvailable() isn't straight i.e. it's not necessary
to pass in XLOG_FROM_ANY at all. These things make the state machine a
bit complicated and hard to understand.
The attached patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly to
read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().
Thoughts?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Simplify-standby-state-machine-in-WaitForWALToBec.patchapplication/x-patch; name=v1-0001-Simplify-standby-state-machine-in-WaitForWALToBec.patchDownload
From e20354b60f4f7040e034cbd26142d8e69dd01be5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 18 Oct 2022 05:52:35 +0000
Subject: [PATCH v1] Simplify standby state machine in
WaitForWALToBecomeAvailable()
In standby mode, the state machine in
WaitForWALToBecomeAvailable() reads WAL from pg_wal after failing
to read from the archive. This is currently implemented in
XLogFileReadAnyTLI() by calling XLogFileRead() with source
XLOG_FROM_PG_WAL after it fails with source XLOG_FROM_PG_ARCHIVE
and the current source isn't changed at all. Also, passing the
source to XLogFileReadAnyTLI() in WaitForWALToBecomeAvailable()
isn't straight i.e. it's not necessary to pass in XLOG_FROM_ANY at
all. These things make the state machine a bit complicated and
hard to understand.
This patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly
to read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().
---
src/backend/access/transam/xlogrecovery.c | 40 ++++++++---------------
1 file changed, 14 insertions(+), 26 deletions(-)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..6ac961f32b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3478,6 +3478,14 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
switch (currentSource)
{
case XLOG_FROM_ARCHIVE:
+
+ /*
+ * After failing to read from archive, we try to read from
+ * pg_wal.
+ */
+ currentSource = XLOG_FROM_PG_WAL;
+ break;
+
case XLOG_FROM_PG_WAL:
/*
@@ -3632,12 +3640,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
if (randAccess)
curFileTLI = 0;
- /*
- * Try to restore the file from archive, or read an existing
- * file from pg_wal.
- */
readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2,
- currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY :
currentSource);
if (readFile >= 0)
return XLREAD_SUCCESS; /* success! */
@@ -4199,29 +4202,14 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, XLogSource source)
continue;
}
- if (source == XLOG_FROM_ANY || source == XLOG_FROM_ARCHIVE)
+ fd = XLogFileRead(segno, emode, tli, source, true);
+ if (fd != -1)
{
- fd = XLogFileRead(segno, emode, tli,
- XLOG_FROM_ARCHIVE, true);
- if (fd != -1)
- {
+ if (source == XLOG_FROM_ARCHIVE)
elog(DEBUG1, "got WAL segment from archive");
- if (!expectedTLEs)
- expectedTLEs = tles;
- return fd;
- }
- }
-
- if (source == XLOG_FROM_ANY || source == XLOG_FROM_PG_WAL)
- {
- fd = XLogFileRead(segno, emode, tli,
- XLOG_FROM_PG_WAL, true);
- if (fd != -1)
- {
- if (!expectedTLEs)
- expectedTLEs = tles;
- return fd;
- }
+ if (!expectedTLEs)
+ expectedTLEs = tles;
+ return fd;
}
}
--
2.34.1
On Tue, Oct 18, 2022 at 12:01 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
In standby mode, the state machine in WaitForWALToBecomeAvailable()
reads WAL from pg_wal after failing to read from the archive. This is
currently implemented in XLogFileReadAnyTLI() by calling
XLogFileRead() with source XLOG_FROM_PG_WAL after it fails with source
XLOG_FROM_PG_ARCHIVE and the current source isn't changed at all.
Also, passing the source to XLogFileReadAnyTLI() in
WaitForWALToBecomeAvailable() isn't straight i.e. it's not necessary
to pass in XLOG_FROM_ANY at all. These things make the state machine a
bit complicated and hard to understand.The attached patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly to
read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().Thoughts?
+1
Regards,
Amul
On Tue, Oct 18, 2022 at 1:03 PM Amul Sul <sulamul@gmail.com> wrote:
On Tue, Oct 18, 2022 at 12:01 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:Hi,
In standby mode, the state machine in WaitForWALToBecomeAvailable()
reads WAL from pg_wal after failing to read from the archive. This is
currently implemented in XLogFileReadAnyTLI() by calling
XLogFileRead() with source XLOG_FROM_PG_WAL after it fails with source
XLOG_FROM_PG_ARCHIVE and the current source isn't changed at all.
Also, passing the source to XLogFileReadAnyTLI() in
WaitForWALToBecomeAvailable() isn't straight i.e. it's not necessary
to pass in XLOG_FROM_ANY at all. These things make the state machine a
bit complicated and hard to understand.The attached patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly to
read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().Thoughts?
+1
Thanks. Let's see what others think about it. I will add a CF entry in a while.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Oct 18, 2022 at 12:01:07PM +0530, Bharath Rupireddy wrote:
The attached patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly to
read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().
This looks correct to me. The only thing that stood out to me was the loop
through 'tles' in XLogFileReadyAnyTLI. With this change, we'd loop through
the timelines for both XLOG_FROM_PG_ARCHIVE and XLOG_FROM_PG_WAL, whereas
now we only loop through the timelines once. However, I doubt this makes
much difference in practice. You'd only do the extra loop whenever
restoring from the archives failed.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Dec 30, 2022 at 10:32:57AM -0800, Nathan Bossart wrote:
This looks correct to me. The only thing that stood out to me was the loop
through 'tles' in XLogFileReadyAnyTLI. With this change, we'd loop through
the timelines for both XLOG_FROM_PG_ARCHIVE and XLOG_FROM_PG_WAL, whereas
now we only loop through the timelines once. However, I doubt this makes
much difference in practice. You'd only do the extra loop whenever
restoring from the archives failed.
case XLOG_FROM_ARCHIVE:
+
+ /*
+ * After failing to read from archive, we try to read from
+ * pg_wal.
+ */
+ currentSource = XLOG_FROM_PG_WAL;
+ break;
In standby mode, the priority lookup order is pg_wal -> archive ->
stream. With this change, we would do pg_wal -> archive -> pg_wal ->
stream, meaning that it could influence some recovery scenarios while
involving more lookups than necessary to the local pg_wal/ directory?
See, on failure where the current source is XLOG_FROM_ARCHIVE, we
would not switch anymore directly to XLOG_FROM_STREAM.
--
Michael
On Tue, Jan 3, 2023 at 7:47 AM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Dec 30, 2022 at 10:32:57AM -0800, Nathan Bossart wrote:
This looks correct to me. The only thing that stood out to me was the loop
through 'tles' in XLogFileReadyAnyTLI. With this change, we'd loop through
the timelines for both XLOG_FROM_PG_ARCHIVE and XLOG_FROM_PG_WAL, whereas
now we only loop through the timelines once. However, I doubt this makes
much difference in practice. You'd only do the extra loop whenever
restoring from the archives failed.case XLOG_FROM_ARCHIVE: + + /* + * After failing to read from archive, we try to read from + * pg_wal. + */ + currentSource = XLOG_FROM_PG_WAL; + break; In standby mode, the priority lookup order is pg_wal -> archive -> stream. With this change, we would do pg_wal -> archive -> pg_wal -> stream, meaning that it could influence some recovery scenarios while involving more lookups than necessary to the local pg_wal/ directory?See, on failure where the current source is XLOG_FROM_ARCHIVE, we
would not switch anymore directly to XLOG_FROM_STREAM.
I think there's a bit of disconnect here - here's what I understand:
Standby when started can either enter to crash recovery (if it is a
restart after crash) or enter to archive recovery directly.
The standby, when in crash recovery:
currentSource is set to XLOG_FROM_PG_WAL in
WaitForWALToBecomeAvailable() and it continues to exhaust replaying
all the WAL in the pg_wal directory.
After all the pg_wal is exhausted during crash recovery, currentSource
is set to XLOG_FROM_ANY in ReadRecord() and the standby enters archive
recovery mode (see below).
The standby, when in archive recovery:
In WaitForWALToBecomeAvailable() currentSource is set to
XLOG_FROM_ARCHIVE and it enters XLogFileReadAnyTLI() - first tries to
fetch WAL from archive and returns if succeeds otherwise tries to
fetch from pg_wal and returns if succeeds, otherwise returns with
failure.
If failure is returned from XLogFileReadAnyTLI(), change the
currentSource to XLOG_FROM_STREAM.
If a failure in XLOG_FROM_STREAM, the currentSource is set to
XLOG_FROM_ARCHIVE and XLogFileReadAnyTLI() is called again.
Note that the standby exits from this WaitForWALToBecomeAvailable()
state machine when the promotion signal is detected and before which
all the wal from archive -> pg_wal is exhausted.
Note that currentSource is set to XLOG_FROM_PG_WAL in
WaitForWALToBecomeAvailable() only after the server exits archive
recovery i.e. InArchiveRecovery is set to false in
FinishWalRecovery(). However, exhausting pg_wal for recovery is built
inherently within XLogFileReadAnyTLI().
In summary:
the flow when the standby is in crash recovery is pg_wal -> [archive
-> pg_wal -> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
the flow when the standby is in archive recovery is [archive -> pg_wal
-> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
The proposed patch makes the inherent state change to pg_wal after
failure to read from archive in XLogFileReadAnyTLI() to explicit by
setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
think it doesn't alter the existing state machine or add any new extra
lookups in pg_wal.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sat, Dec 31, 2022 at 12:03 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Tue, Oct 18, 2022 at 12:01:07PM +0530, Bharath Rupireddy wrote:
The attached patch attempts to simplify the code a bit by changing the
current source to XLOG_FROM_PG_WAL after failing in
XLOG_FROM_PG_ARCHIVE so that the state machine can move smoothly to
read from pg_wal. And we can just pass the current source to
XLogFileReadAnyTLI(). It also enables us to reduce a bit of extra
XLogFileRead() code in XLogFileReadAnyTLI().This looks correct to me. The only thing that stood out to me was the loop
through 'tles' in XLogFileReadyAnyTLI. With this change, we'd loop through
the timelines for both XLOG_FROM_PG_ARCHIVE and XLOG_FROM_PG_WAL, whereas
now we only loop through the timelines once. However, I doubt this makes
much difference in practice. You'd only do the extra loop whenever
restoring from the archives failed.
Right. With the patch, we'd loop again through the tles after a
failure from the archive. Since the curFileTLI isn't changed unless a
successful read, we'd read from pg_wal with tli where we earlier left
off reading from the archive. I'm not sure if this extra looping is
worth worrying about.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Jan 03, 2023 at 02:53:10PM +0530, Bharath Rupireddy wrote:
In summary:
the flow when the standby is in crash recovery is pg_wal -> [archive
-> pg_wal -> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
the flow when the standby is in archive recovery is [archive -> pg_wal
-> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
This is my understanding as well.
The proposed patch makes the inherent state change to pg_wal after
failure to read from archive in XLogFileReadAnyTLI() to explicit by
setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
think it doesn't alter the existing state machine or add any new extra
lookups in pg_wal.
I'm assuming this change would simplify your other patch that modifieѕ
WaitForWALToBecomeAvailable() [0]https://commitfest.postgresql.org/41/3663/. Is that correct?
[0]: https://commitfest.postgresql.org/41/3663/
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Jan 4, 2023 at 12:03 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Jan 03, 2023 at 02:53:10PM +0530, Bharath Rupireddy wrote:
In summary:
the flow when the standby is in crash recovery is pg_wal -> [archive
-> pg_wal -> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...
the flow when the standby is in archive recovery is [archive -> pg_wal
-> stream] -> [archive -> pg_wal -> stream] -> [] -> [] ...This is my understanding as well.
The proposed patch makes the inherent state change to pg_wal after
failure to read from archive in XLogFileReadAnyTLI() to explicit by
setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
think it doesn't alter the existing state machine or add any new extra
lookups in pg_wal.I'm assuming this change would simplify your other patch that modifieѕ
WaitForWALToBecomeAvailable() [0]. Is that correct?
Yes, it does simplify the other feature patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Jan 03, 2023 at 02:53:10PM +0530, Bharath Rupireddy wrote:
The proposed patch makes the inherent state change to pg_wal after
failure to read from archive in XLogFileReadAnyTLI() to explicit by
setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
think it doesn't alter the existing state machine or add any new extra
lookups in pg_wal.
Well, did you notice 4d894b41? It introduced this change:
- readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, currentSource);
+ readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2,
+ currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY :
+ currentSource);
And this patch basically undoes that, meaning that we would basically
look at the archives first for all the expected TLIs, but only if no
files were found in pg_wal/.
The change is subtle, see XLogFileReadAnyTLI(). On HEAD we go through
each timeline listed and check both archives and then pg_wal/ after
the last source that failed was the archives. The patch does
something different: it checks all the timelines for the archives,
then all the timelines in pg_wal/ with two separate calls to
XLogFileReadAnyTLI().
--
Michael
On Wed, Mar 1, 2023 at 1:46 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jan 03, 2023 at 02:53:10PM +0530, Bharath Rupireddy wrote:
The proposed patch makes the inherent state change to pg_wal after
failure to read from archive in XLogFileReadAnyTLI() to explicit by
setting currentSource to XLOG_FROM_PG_WAL in the state machine. I
think it doesn't alter the existing state machine or add any new extra
lookups in pg_wal.Well, did you notice 4d894b41? It introduced this change:
- readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, currentSource); + readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, + currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY : + currentSource);And this patch basically undoes that, meaning that we would basically
look at the archives first for all the expected TLIs, but only if no
files were found in pg_wal/.The change is subtle, see XLogFileReadAnyTLI(). On HEAD we go through
each timeline listed and check both archives and then pg_wal/ after
the last source that failed was the archives. The patch does
something different: it checks all the timelines for the archives,
then all the timelines in pg_wal/ with two separate calls to
XLogFileReadAnyTLI().
Thanks. Yeah, the patch proposed here just reverts that commit [1]commit 4d894b41cd12179b710526eba9dc62c2b99abc4d Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Date: Fri Feb 14 15:15:09 2014 +0200
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4d894b41cd12179b710526eba9dc62c2b99abc4d.
That commit fixes an issue - "If there is a WAL segment with same ID
but different TLI present in both the WAL archive and pg_xlog, prefer
the one with higher TLI.".
I will withdraw the patch proposed in this thread.
[1]: commit 4d894b41cd12179b710526eba9dc62c2b99abc4d Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Date: Fri Feb 14 15:15:09 2014 +0200
commit 4d894b41cd12179b710526eba9dc62c2b99abc4d
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri Feb 14 15:15:09 2014 +0200
Change the order that pg_xlog and WAL archive are polled for WAL segments.
If there is a WAL segment with same ID but different TLI present in both
the WAL archive and pg_xlog, prefer the one with higher TLI. Before this
patch, the archive was polled first, for all expected TLIs, and only if no
file was found was pg_xlog scanned. This was a change in behavior from 9.3,
which first scanned archive and pg_xlog for the highest TLI, then archive
and pg_xlog for the next highest TLI and so forth. This patch reverts the
behavior back to what it was in 9.2.
The reason for this is that if for example you try to do archive recovery
to timeline 2, which branched off timeline 1, but the WAL for timeline 2 is
not archived yet, we would replay past the timeline switch point on
timeline 1 using the archived files, before even looking timeline 2's files
in pg_xlog
Report and patch by Kyotaro Horiguchi. Backpatch to 9.3 where the behavior
was changed.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 03, 2023 at 01:38:38PM +0530, Bharath Rupireddy wrote:
I will withdraw the patch proposed in this thread.
Okay, thanks for confirming.
--
Michael
On Fri, Mar 03, 2023 at 01:38:38PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 1, 2023 at 1:46 PM Michael Paquier <michael@paquier.xyz> wrote:
Well, did you notice 4d894b41? It introduced this change:
- readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, currentSource); + readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, + currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY : + currentSource);And this patch basically undoes that, meaning that we would basically
look at the archives first for all the expected TLIs, but only if no
files were found in pg_wal/.The change is subtle, see XLogFileReadAnyTLI(). On HEAD we go through
each timeline listed and check both archives and then pg_wal/ after
the last source that failed was the archives. The patch does
something different: it checks all the timelines for the archives,
then all the timelines in pg_wal/ with two separate calls to
XLogFileReadAnyTLI().Thanks. Yeah, the patch proposed here just reverts that commit [1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4d894b41cd12179b710526eba9dc62c2b99abc4d.That commit fixes an issue - "If there is a WAL segment with same ID
but different TLI present in both the WAL archive and pg_xlog, prefer
the one with higher TLI.".
Given both Bharath and I missed this, perhaps we should add a comment about
this behavior.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 03, 2023 at 04:33:39PM -0800, Nathan Bossart wrote:
Given both Bharath and I missed this, perhaps we should add a comment about
this behavior.
Makes sense to me to document that in a better way. What about the
addition of a short paragraph at the top of XLogFileReadAnyTLI() that
explains the behaviors we expect depending on the values of
XLogSource? The case of XLOG_FROM_ANY with the order to scan the
archives then pg_wal/ for each timeline is the most important bit,
surely.
--
Michael
On Sat, Mar 4, 2023 at 8:00 AM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Mar 03, 2023 at 04:33:39PM -0800, Nathan Bossart wrote:
Given both Bharath and I missed this, perhaps we should add a comment about
this behavior.Makes sense to me to document that in a better way. What about the
addition of a short paragraph at the top of XLogFileReadAnyTLI() that
explains the behaviors we expect depending on the values of
XLogSource? The case of XLOG_FROM_ANY with the order to scan the
archives then pg_wal/ for each timeline is the most important bit,
surely.
+1. I will send a patch in a bit.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sat, Mar 4, 2023 at 8:14 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Mar 4, 2023 at 8:00 AM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Mar 03, 2023 at 04:33:39PM -0800, Nathan Bossart wrote:
Given both Bharath and I missed this, perhaps we should add a comment about
this behavior.Makes sense to me to document that in a better way. What about the
addition of a short paragraph at the top of XLogFileReadAnyTLI() that
explains the behaviors we expect depending on the values of
XLogSource? The case of XLOG_FROM_ANY with the order to scan the
archives then pg_wal/ for each timeline is the most important bit,
surely.+1. I will send a patch in a bit.
Okay, here's a patch attached.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Clarify-XLogFileReadAnyTLI-s-behaviour-for-XLOG_F.patchapplication/octet-stream; name=v1-0001-Clarify-XLogFileReadAnyTLI-s-behaviour-for-XLOG_F.patchDownload
From fa88e2cecd33dfcd8800c89b183fa99b46037e7e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 4 Mar 2023 04:15:04 +0000
Subject: [PATCH v1] Clarify XLogFileReadAnyTLI()'s behaviour for XLOG_FROM_ANY
---
src/backend/access/transam/xlogrecovery.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index dbe9394762..c5bc295fdb 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -4165,6 +4165,16 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
* Open a logfile segment for reading (during recovery).
*
* This version searches for the segment with any TLI listed in expectedTLEs.
+ *
+ * When source == XLOG_FROM_ANY, this function first searches for the segment
+ * with a TLI in archive first, if not found, it searches in pg_wal. This way,
+ * if there is a WAL segment with same passed-in segno but different TLI
+ * present in both the archive and pg_wal, it prefers the one with higher TLI.
+ * The reason for this is that if for example we try to do archive recovery to
+ * timeline 2, which branched off timeline 1, but the WAL for timeline 2 is not
+ * archived yet, we would replay past the timeline switch point on timeline 1
+ * using the archived WAL segment, before even looking timeline 2's WAL
+ * segments in pg_wal.
*/
static int
XLogFileReadAnyTLI(XLogSegNo segno, int emode, XLogSource source)
--
2.34.1
On Sat, Mar 04, 2023 at 09:47:05AM +0530, Bharath Rupireddy wrote:
Okay, here's a patch attached.
Thanks.
+ * When source == XLOG_FROM_ANY, this function first searches for the segment
+ * with a TLI in archive first, if not found, it searches in pg_wal. This way,
+ * if there is a WAL segment with same passed-in segno but different TLI
+ * present in both the archive and pg_wal, it prefers the one with higher TLI.
+ * The reason for this is that if for example we try to do archive recovery to
+ * timeline 2, which branched off timeline 1, but the WAL for timeline 2 is not
+ * archived yet, we would replay past the timeline switch point on timeline 1
+ * using the archived WAL segment, before even looking timeline 2's WAL
+ * segments in pg_wal.
This is pretty much what the commit has mentioned. The first half
provides enough details, IMO.
--
Michael
On Mon, Mar 6, 2023 at 1:26 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Mar 04, 2023 at 09:47:05AM +0530, Bharath Rupireddy wrote:
Okay, here's a patch attached.
Thanks.
+ * When source == XLOG_FROM_ANY, this function first searches for the segment + * with a TLI in archive first, if not found, it searches in pg_wal. This way, + * if there is a WAL segment with same passed-in segno but different TLI + * present in both the archive and pg_wal, it prefers the one with higher TLI. + * The reason for this is that if for example we try to do archive recovery to + * timeline 2, which branched off timeline 1, but the WAL for timeline 2 is not + * archived yet, we would replay past the timeline switch point on timeline 1 + * using the archived WAL segment, before even looking timeline 2's WAL + * segments in pg_wal.This is pretty much what the commit has mentioned. The first half
provides enough details, IMO.
IMO, mentioning the example from the commit message in the function
comment makes things more clear - one doesn't have to go look for the
commit message for that.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com