corruption of WAL page header is never reported
Hello,
I found that any corruption of WAL page header found during recovery is never
reported in log messages. If wal page header is broken, it is detected in
XLogReaderValidatePageHeader called from XLogPageRead, but the error messages
are always reset and never reported.
if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
{
/* reset any error XLogReaderValidatePageHeader() might have set */
xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;
}
Since the commit 06687198018, we call XLogReaderValidatePageHeader here so that
we can check a page header and retry immediately if it's invalid, but the error
message is reset immediately and not reported. I guess the reason why the error
message is reset is because we might get the right WAL after some retries.
However, I think it is better to report the error for each check in order to let
users know the actual issues founded in the WAL.
I attached a patch to fix it in this way.
Or, if we wouldn't like to report an error for each check and also what we want
to check here is just about old recycled WAL instead of header corruption itself,
I wander that we could check just xlp_pageaddr instead of calling
XLogReaderValidatePageHeader.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
report_corrupt_page_header.patchtext/x-diff; name=report_corrupt_page_header.patchDownload+7-2
Em sáb., 17 de jul. de 2021 às 16:57, Yugo NAGATA <nagata@sraoss.co.jp>
escreveu:
Hello,
I found that any corruption of WAL page header found during recovery is
never
reported in log messages. If wal page header is broken, it is detected in
XLogReaderValidatePageHeader called from XLogPageRead, but the error
messages
are always reset and never reported.if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr,
readBuf))
{
/* reset any error XLogReaderValidatePageHeader() might
have set */
xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;
}Since the commit 06687198018, we call XLogReaderValidatePageHeader here so
that
we can check a page header and retry immediately if it's invalid, but the
error
message is reset immediately and not reported. I guess the reason why the
error
message is reset is because we might get the right WAL after some retries.
However, I think it is better to report the error for each check in order
to let
users know the actual issues founded in the WAL.I attached a patch to fix it in this way.
I think to keep the same behavior as before, is necessary always run:
/* reset any error XLogReaderValidatePageHeader() might have set */
xlogreader->errormsg_buf[0] = '\0';
not?
regards,
Ranier Vilela
Attachments:
v1_report_corrupt_page_header.patchtext/x-patch; charset=US-ASCII; name=v1_report_corrupt_page_header.patchDownload+5-0
On Sat, 17 Jul 2021 18:40:02 -0300
Ranier Vilela <ranier.vf@gmail.com> wrote:
Em sáb., 17 de jul. de 2021 às 16:57, Yugo NAGATA <nagata@sraoss.co.jp>
escreveu:Hello,
I found that any corruption of WAL page header found during recovery is
never
reported in log messages. If wal page header is broken, it is detected in
XLogReaderValidatePageHeader called from XLogPageRead, but the error
messages
are always reset and never reported.if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr,
readBuf))
{
/* reset any error XLogReaderValidatePageHeader() might
have set */
xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;
}Since the commit 06687198018, we call XLogReaderValidatePageHeader here so
that
we can check a page header and retry immediately if it's invalid, but the
error
message is reset immediately and not reported. I guess the reason why the
error
message is reset is because we might get the right WAL after some retries.
However, I think it is better to report the error for each check in order
to let
users know the actual issues founded in the WAL.I attached a patch to fix it in this way.
I think to keep the same behavior as before, is necessary always run:
/* reset any error XLogReaderValidatePageHeader() might have set */
xlogreader->errormsg_buf[0] = '\0';not?
If we are not in StandbyMode, the check is not retried, and an error is returned
immediately. So, I think ,we don't have to display an error message in such cases,
and neither reset it. Instead, it would be better to leave the error message
handling to the caller of XLogReadRecord.
Regards,
Yugo Nagat
--
Yugo NAGATA <nagata@sraoss.co.jp>
Hello.
At Sun, 18 Jul 2021 04:55:05 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
Hello,
I found that any corruption of WAL page header found during recovery is never
reported in log messages. If wal page header is broken, it is detected in
XLogReaderValidatePageHeader called from XLogPageRead, but the error messages
are always reset and never reported.
Good catch! Currently recovery stops showing no reason if it is
stopped by page-header errors.
I attached a patch to fix it in this way.
However, it is a kind of a roof-over-a-roof. What we should do is
just omitting the check in XLogPageRead while in standby mode.
Or, if we wouldn't like to report an error for each check and also what we want
to check here is just about old recycled WAL instead of header corruption itself,
I wander that we could check just xlp_pageaddr instead of calling
XLogReaderValidatePageHeader.
I'm not sure. But as described in the commit message, the commit
intended to save a common case and there's no obvious reason to (and
not to) restrict the check only to page address. So it uses the
established checking function.
I was tempted to adjust the comment just above by adding "while in
standby mode", but "so that we can retry immediately" is suggesting
that so I didn't do that in the attached.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v1-0001-Don-t-forget-message-of-hage-header-errors-while-.patchtext/x-patch; charset=us-asciiDownload+2-2
On Mon, 19 Jul 2021 15:14:41 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Hello.
At Sun, 18 Jul 2021 04:55:05 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
Hello,
I found that any corruption of WAL page header found during recovery is never
reported in log messages. If wal page header is broken, it is detected in
XLogReaderValidatePageHeader called from XLogPageRead, but the error messages
are always reset and never reported.Good catch! Currently recovery stops showing no reason if it is
stopped by page-header errors.I attached a patch to fix it in this way.
However, it is a kind of a roof-over-a-roof. What we should do is
just omitting the check in XLogPageRead while in standby mode.
Your patch doesn't fix the issue that the error message is never reported in
standby mode. When a WAL page header is broken, the standby would silently repeat
retrying forever.
I think we have to let users know the corruption of WAL page header even in
standby mode, not? A corruption of WAL record header is always reported,
by the way. (See that XLogReadRecord is calling ValidXLogRecordHeader.)
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
At Mon, 19 Jul 2021 16:00:39 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
Your patch doesn't fix the issue that the error message is never reported in
standby mode. When a WAL page header is broken, the standby would silently repeat
retrying forever.
Ok, I see your point and agree to that.
I think we have to let users know the corruption of WAL page header even in
standby mode, not? A corruption of WAL record header is always reported,
by the way. (See that XLogReadRecord is calling ValidXLogRecordHeader.)
Howeer, I'm still on the opinion that we don't need to check that
while in standby mode.
How about the attached?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Don-t-hide-xlog-page-header-errors-while-recovery.patchtext/x-patch; charset=us-asciiDownload+10-2
me> Howeer, I'm still on the opinion that we don't need to check that
me> while in standby mode.
Of course it's typo of "while not in standby mode".
sorry..
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, 19 Jul 2021 17:47:07 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Mon, 19 Jul 2021 16:00:39 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
Your patch doesn't fix the issue that the error message is never reported in
standby mode. When a WAL page header is broken, the standby would silently repeat
retrying forever.Ok, I see your point and agree to that.
I think we have to let users know the corruption of WAL page header even in
standby mode, not? A corruption of WAL record header is always reported,
by the way. (See that XLogReadRecord is calling ValidXLogRecordHeader.)Howeer, I'm still on the opinion that we don't need to check that
while in standby mode.How about the attached?
On Mon, 19 Jul 2021 17:50:16 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
me> Howeer, I'm still on the opinion that we don't need to check that
me> while in standby mode.Of course it's typo of "while not in standby mode".
Thanks for updating the patch. I agree with you.
I think it is nice to fix to perform the check only during standby mode
because it make a bit clearer why we check it immediately in XLogPageRead.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Em seg., 19 de jul. de 2021 às 06:15, Yugo NAGATA <nagata@sraoss.co.jp>
escreveu:
On Mon, 19 Jul 2021 17:47:07 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:At Mon, 19 Jul 2021 16:00:39 +0900, Yugo NAGATA <nagata@sraoss.co.jp>
wrote in
Your patch doesn't fix the issue that the error message is never
reported in
standby mode. When a WAL page header is broken, the standby would
silently repeat
retrying forever.
Ok, I see your point and agree to that.
I think we have to let users know the corruption of WAL page header
even in
standby mode, not? A corruption of WAL record header is always
reported,
by the way. (See that XLogReadRecord is calling ValidXLogRecordHeader.)
Howeer, I'm still on the opinion that we don't need to check that
while in standby mode.How about the attached?
On Mon, 19 Jul 2021 17:50:16 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:me> Howeer, I'm still on the opinion that we don't need to check that
me> while in standby mode.Of course it's typo of "while not in standby mode".
Thanks for updating the patch. I agree with you.
I think it is nice to fix to perform the check only during standby mode
because it make a bit clearer why we check it immediately in XLogPageRead.
And as I had reviewed, your first patch was wrong and now with the Kyotaro
version,
to keep the same behavior, it is necessary to reset the error, correct?
regards,
Ranier Vilela
On Mon, 19 Jul 2021 06:32:28 -0300
Ranier Vilela <ranier.vf@gmail.com> wrote:
Em seg., 19 de jul. de 2021 às 06:15, Yugo NAGATA <nagata@sraoss.co.jp>
escreveu:On Mon, 19 Jul 2021 17:47:07 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:At Mon, 19 Jul 2021 16:00:39 +0900, Yugo NAGATA <nagata@sraoss.co.jp>
wrote in
Your patch doesn't fix the issue that the error message is never
reported in
standby mode. When a WAL page header is broken, the standby would
silently repeat
retrying forever.
Ok, I see your point and agree to that.
I think we have to let users know the corruption of WAL page header
even in
standby mode, not? A corruption of WAL record header is always
reported,
by the way. (See that XLogReadRecord is calling ValidXLogRecordHeader.)
Howeer, I'm still on the opinion that we don't need to check that
while in standby mode.How about the attached?
On Mon, 19 Jul 2021 17:50:16 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:me> Howeer, I'm still on the opinion that we don't need to check that
me> while in standby mode.Of course it's typo of "while not in standby mode".
Thanks for updating the patch. I agree with you.
I think it is nice to fix to perform the check only during standby mode
because it make a bit clearer why we check it immediately in XLogPageRead.And as I had reviewed, your first patch was wrong and now with the Kyotaro
version,
to keep the same behavior, it is necessary to reset the error, correct?
Well, I think my first patch was not wrong. The difference with the latest
patch is just whether we perform the additional check when we are not in
standby mode or not. The behavior is basically the same although which function
detects and prints the page-header error in cases of crash recovery is different.
On the other hand, in your patch, the error message was always ommitted in cases
of crash recovery, and it seemed to me wrong.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
On 2021/07/19 18:52, Yugo NAGATA wrote:
Well, I think my first patch was not wrong. The difference with the latest
patch is just whether we perform the additional check when we are not in
standby mode or not. The behavior is basically the same although which function
detects and prints the page-header error in cases of crash recovery is different.
Yes, so which patch do you think is better? I like your version
because there seems no reason why XLogPageRead() should skip
XLogReaderValidatePageHeader() when not in standby mode.
Also I'm tempted to move ereport() and reset of errmsg_buf to
under next_record_is_invalid as follows. That is, in standby mode
whenever we find an invalid record and retry reading WAL page
in XLogPageRead(), we report the error message and reset it.
For now in XLogPageRead(), only XLogReaderValidatePageHeader()
sets errmsg_buf, but in the future other code or function doing that
may be added. For that case, the following change seems more elegant.
Thought?
* shouldn't be a big deal from a performance point of view.
*/
if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
- {
- /* reset any error XLogReaderValidatePageHeader() might have set */
- xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;
- }
return readLen;
@@ -12517,7 +12513,17 @@ next_record_is_invalid:
/* In standby-mode, keep trying */
if (StandbyMode)
+ {
+ if (xlogreader->errormsg_buf[0])
+ {
+ ereport(emode_for_corrupt_record(emode, EndRecPtr),
+ (errmsg_internal("%s", xlogreader->errormsg_buf)));
+
+ /* reset any error XLogReaderValidatePageHeader() might have set */
+ xlogreader->errormsg_buf[0] = '\0';
+ }
goto retry;
+ }
else
return -1;
}
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
On 2021/07/19 18:52, Yugo NAGATA wrote:
Well, I think my first patch was not wrong. The difference with the
latest
patch is just whether we perform the additional check when we are not
in
standby mode or not. The behavior is basically the same although which
function
detects and prints the page-header error in cases of crash recovery is
different.Yes, so which patch do you think is better? I like your version
because there seems no reason why XLogPageRead() should skip
XLogReaderValidatePageHeader() when not in standby mode.
Did you read the comment just above?
xlog.c:12523
* Check the page header immediately, so that we can retry immediately if
* it's not valid. This may seem unnecessary, because XLogReadRecord()
* validates the page header anyway, and would propagate the failure up to
* ReadRecord(), which would retry. However, there's a corner case with
* continuation records, if a record is split across two pages such that
So when not in standby mode, the same check is performed by xlogreader
which has the responsibility to validate the binary data read by
XLogPageRead. The page-header validation is a compromise to save a
specific case.
Also I'm tempted to move ereport() and reset of errmsg_buf to
under next_record_is_invalid as follows. That is, in standby mode
whenever we find an invalid record and retry reading WAL page
in XLogPageRead(), we report the error message and reset it.
For now in XLogPageRead(), only XLogReaderValidatePageHeader()
sets errmsg_buf, but in the future other code or function doing that
may be added. For that case, the following change seems more elegant.
Thought?
I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.
* shouldn't be a big deal from a performance point of view. */ if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)) - { - /* reset any error XLogReaderValidatePageHeader() might have set */ - xlogreader->errormsg_buf[0] = '\0'; goto next_record_is_invalid; - } return readLen; @@ -12517,7 +12513,17 @@ next_record_is_invalid: /* In standby-mode, keep trying */ if (StandbyMode) + { + if (xlogreader->errormsg_buf[0]) + { + ereport(emode_for_corrupt_record(emode, EndRecPtr), + (errmsg_internal("%s", xlogreader->errormsg_buf))); + + /* reset any error XLogReaderValidatePageHeader() might have set */ + xlogreader->errormsg_buf[0] = '\0'; + } goto retry; + } else return -1; }
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Sorry, please let me add something.
At Thu, 02 Sep 2021 13:17:16 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Also I'm tempted to move ereport() and reset of errmsg_buf to
under next_record_is_invalid as follows. That is, in standby mode
whenever we find an invalid record and retry reading WAL page
in XLogPageRead(), we report the error message and reset it.
For now in XLogPageRead(), only XLogReaderValidatePageHeader()
sets errmsg_buf, but in the future other code or function doing that
may be added. For that case, the following change seems more elegant.
Thought?I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.
In other words, XLogReaderValidatePageHeader is foreign for
XLogPageRead and the function indeuces the need of extra care for
errormsg_buf that is not relevant to the elog-capable module.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
(.. sorry for the chained-mails)
At Thu, 02 Sep 2021 13:31:31 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Sorry, please let me add something.
At Thu, 02 Sep 2021 13:17:16 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Also I'm tempted to move ereport() and reset of errmsg_buf to
under next_record_is_invalid as follows. That is, in standby mode
whenever we find an invalid record and retry reading WAL page
in XLogPageRead(), we report the error message and reset it.
For now in XLogPageRead(), only XLogReaderValidatePageHeader()
sets errmsg_buf, but in the future other code or function doing that
may be added. For that case, the following change seems more elegant.
Thought?I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.In other words, XLogReaderValidatePageHeader is foreign for
XLogPageRead and the function indeuces the need of extra care for
errormsg_buf that is not relevant to the elog-capable module.
However, I can agree that the error handling code can be moved further
later. Like this,
* shouldn't be a big deal from a performance point of view.
*/
- if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
- /* reset any error XLogReaderValidatePageHeader() might have set */
- xlogreader->errormsg_buf[0] = '\0';
- goto next_record_is_invalid;
+ if (... && XLogReaderValidatePageHeader())
+ goto page_header_is_invalid;
...
return readlen;
+ page_header_is_invalid:
+ /*
+ * in this case we consume this error right now then retry immediately,
+ * the message is already translated
+ */
+ if (xlogreader->errormsg_buf[0])
+ ereport(emode_for_corrupt_record(emode, EndRecPtr),
+ (errmsg_internal("%s", xlogreader->errormsg_buf)));
+
+ /* reset any error XLogReaderValidatePageHeader() might have set */
+ xlogreader->errormsg_buf[0] = '\0';
next_record_is_invalid:
lastSourceFailed = true;
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 2021/09/02 13:17, Kyotaro Horiguchi wrote:
Did you read the comment just above?
Yes!
xlog.c:12523
* Check the page header immediately, so that we can retry immediately if
* it's not valid. This may seem unnecessary, because XLogReadRecord()
* validates the page header anyway, and would propagate the failure up to
* ReadRecord(), which would retry. However, there's a corner case with
* continuation records, if a record is split across two pages such thatSo when not in standby mode, the same check is performed by xlogreader
which has the responsibility to validate the binary data read by
XLogPageRead. The page-header validation is a compromise to save a
specific case.
Yes, so XLogPageRead() can skip the validation check of page head if not
in standby mode. On the other hand, there is no problem if it still performs
the validation check as it does for now. No?
I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.
I don't think that the proposed change does that. But maybe I failed to get
your point yet... Anyway the proposed change just tries to reset
errormsg_buf whenever XLogPageRead() retries, whatever error happened
before. Also if errormsg_buf is set at that moment, it's reported.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
At Thu, 2 Sep 2021 14:44:31 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
On 2021/09/02 13:17, Kyotaro Horiguchi wrote:
Did you read the comment just above?
Yes!
Glad to hear that, or..:p
xlog.c:12523
* Check the page header immediately, so that we can retry immediately if
* it's not valid. This may seem unnecessary, because XLogReadRecord()
* validates the page header anyway, and would propagate the failure up to
* ReadRecord(), which would retry. However, there's a corner case with
* continuation records, if a record is split across two pages such thatSo when not in standby mode, the same check is performed by xlogreader
which has the responsibility to validate the binary data read by
XLogPageRead. The page-header validation is a compromise to save a
specific case.Yes, so XLogPageRead() can skip the validation check of page head if
not
in standby mode. On the other hand, there is no problem if it still
performs
the validation check as it does for now. No?
Practically yes, and it has always been like that as you say.
I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.I don't think that the proposed change does that. But maybe I failed
It's about your idea in a recent mail. not about the proposed
patch(es).
to get
your point yet... Anyway the proposed change just tries to reset
errormsg_buf whenever XLogPageRead() retries, whatever error happened
before. Also if errormsg_buf is set at that moment, it's reported.
I believe errmsg_buf is an interface to emit error messages dedicated
to xlogreader that doesn't have an access to elog facility, and
xlogreader doesn't (or ought not to or expect to) suppose
non-xlogreader callback functions set the variable. In that sense I
don't think theoriginally proposed patch is proper for the reason that
the non-xlogreader callback function may set errmsg_buf. This is what
I meant by the word "modularity".
For that reason I avoided in my second proposal to call
XLogReaderValidatePageHeader() at all while not in standby mode,
because calling the validator function while in non-standby mode
results in the non-xlogreader function return errmsg_buf. Of course
we can instead always consume errmsg_buf in the function but I don't
like to shadow the caller's task.
Does that makes sense?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 2021/09/02 16:26, Kyotaro Horiguchi wrote:
I believe errmsg_buf is an interface to emit error messages dedicated
to xlogreader that doesn't have an access to elog facility, and
xlogreader doesn't (or ought not to or expect to) suppose
non-xlogreader callback functions set the variable. In that sense I
don't think theoriginally proposed patch is proper for the reason that
the non-xlogreader callback function may set errmsg_buf. This is what
I meant by the word "modularity".For that reason I avoided in my second proposal to call
XLogReaderValidatePageHeader() at all while not in standby mode,
because calling the validator function while in non-standby mode
results in the non-xlogreader function return errmsg_buf. Of course
we can instead always consume errmsg_buf in the function but I don't
like to shadow the caller's task.
Understood. Thanks for clarifying this!
Does that makes sense?
Yes, I'm fine with your latest patch.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
At Thu, 2 Sep 2021 21:52:00 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
On 2021/09/02 16:26, Kyotaro Horiguchi wrote:
I believe errmsg_buf is an interface to emit error messages dedicated
to xlogreader that doesn't have an access to elog facility, and
xlogreader doesn't (or ought not to or expect to) suppose
non-xlogreader callback functions set the variable. In that sense I
don't think theoriginally proposed patch is proper for the reason that
the non-xlogreader callback function may set errmsg_buf. This is what
I meant by the word "modularity".
For that reason I avoided in my second proposal to call
XLogReaderValidatePageHeader() at all while not in standby mode,
because calling the validator function while in non-standby mode
results in the non-xlogreader function return errmsg_buf. Of course
we can instead always consume errmsg_buf in the function but I don't
like to shadow the caller's task.Understood. Thanks for clarifying this!
Does that makes sense?
Yes, I'm fine with your latest patch.
Thanks. Maybe some additional comment is needed.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 03 Sep 2021 16:55:36 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Yes, I'm fine with your latest patch.
Thanks. Maybe some additional comment is needed.
Something like this.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v2-0001-Don-t-hide-xlog-page-header-errors-while-recovery.difftext/x-patch; charset=us-asciiDownload+13-1
On 2021-Sep-03, Kyotaro Horiguchi wrote:
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 24165ab03e..b621ad6b0f 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -12496,9 +12496,21 @@ retry: * * Validating the page header is cheap enough that doing it twice * shouldn't be a big deal from a performance point of view. + * + * Don't call XLogReaderValidatePageHeader here while not in standby mode + * so that this function won't return with a valid errmsg_buf. */ - if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)) + if (StandbyMode && + !XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
OK, but I don't understand why we have a comment that says (referring to
non-standby mode) "doing it twice shouldn't be a big deal", followed by
"Don't do it twice while not in standby mode" -- that seems quite
contradictory. I think the new comment should overwrite the previous
one, something like this:
- * Validating the page header is cheap enough that doing it twice
- * shouldn't be a big deal from a performance point of view.
+ *
+ * We do this in standby mode only,
+ * so that this function won't return with a valid errmsg_buf.
*/
- if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
+ if (StandbyMode &&
+ !XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/