Add information to rm_redo_error_callback()
Hi hackers,
I've attached a small patch to add information to rm_redo_error_callback().
The changes attached in this patch came while working on the "Add
information during standby recovery conflicts" patch (See [1]https://commitfest.postgresql.org/29/2604).
The goal is to add more information during the callback (if doable), so
that something like:
2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168
would get extra information that way:
2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0
As this could be useful outside of [1]https://commitfest.postgresql.org/29/2604, a dedicated "sub" patch has been
created (thanks Sawada for the suggestion).
I will add this patch to the next commitfest. I look forward to your
feedback about the idea and/or implementation.
Regards,
Bertrand
Attachments:
v1-0001-improve-rm_redo_error_callback.patchtext/plain; charset=UTF-8; name=v1-0001-improve-rm_redo_error_callback.patch; x-mac-creator=0; x-mac-type=0Download
src/backend/access/transam/xlog.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 55cac186dc..cf3267a347 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11745,10 +11745,22 @@ rm_redo_error_callback(void *arg)
{
XLogReaderState *record = (XLogReaderState *) arg;
StringInfoData buf;
+ int block_id;
+ RelFileNode rnode;
+ ForkNumber forknum;
+ BlockNumber blknum;
initStringInfo(&buf);
xlog_outdesc(&buf, record);
+ for (block_id = 0; block_id <= record->max_block_id; block_id++)
+ {
+ if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
+ appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ block_id, rnode.spcNode, rnode.dbNode,
+ rnode.relNode, forkNames[forknum],
+ blknum);
+ }
/* translator: %s is a WAL record description */
errcontext("WAL redo at %X/%X for %s",
(uint32) (record->ReadRecPtr >> 32),
On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi hackers,
I've attached a small patch to add information to rm_redo_error_callback().
The changes attached in this patch came while working on the "Add
information during standby recovery conflicts" patch (See [1]).The goal is to add more information during the callback (if doable), so
that something like:2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168would get extra information that way:
2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0As this could be useful outside of [1], a dedicated "sub" patch has been
created (thanks Sawada for the suggestion).I will add this patch to the next commitfest. I look forward to your
feedback about the idea and/or implementation.
Thank you for starting the new thread for this patch!
I think this patch is simple enough and improves information shown in
errcontext.
I have two comments on the patch:
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index 756b838e6a..8b2024e9e9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg)
{
XLogReaderState *record = (XLogReaderState *) arg;
StringInfoData buf;
+ int block_id;
+ RelFileNode rnode;
+ ForkNumber forknum;
+ BlockNumber blknum;
initStringInfo(&buf);
xlog_outdesc(&buf, record);
+ for (block_id = 0; block_id <= record->max_block_id; block_id++)
+ {
+ if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
+ appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ block_id, rnode.spcNode, rnode.dbNode,
+ rnode.relNode, forkNames[forknum],
+ blknum);
+ }
/* translator: %s is a WAL record description */
errcontext("WAL redo at %X/%X for %s",
(uint32) (record->ReadRecPtr >> 32),
rnode, forknum and blknum can be declared within the for loop.
I think it's better to put a new line just before the comment starting
from "translator:".
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 8/10/20 7:10 AM, Masahiko Sawada wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi hackers,
I've attached a small patch to add information to rm_redo_error_callback().
The changes attached in this patch came while working on the "Add
information during standby recovery conflicts" patch (See [1]).The goal is to add more information during the callback (if doable), so
that something like:2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168would get extra information that way:
2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0As this could be useful outside of [1], a dedicated "sub" patch has been
created (thanks Sawada for the suggestion).I will add this patch to the next commitfest. I look forward to your
feedback about the idea and/or implementation.Thank you for starting the new thread for this patch!
I think this patch is simple enough and improves information shown in
errcontext.I have two comments on the patch:
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 756b838e6a..8b2024e9e9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg) { XLogReaderState *record = (XLogReaderState *) arg; StringInfoData buf; + int block_id; + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum;initStringInfo(&buf);
xlog_outdesc(&buf, record);+ for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) + appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u", + block_id, rnode.spcNode, rnode.dbNode, + rnode.relNode, forkNames[forknum], + blknum); + } /* translator: %s is a WAL record description */ errcontext("WAL redo at %X/%X for %s", (uint32) (record->ReadRecPtr >> 32),rnode, forknum and blknum can be declared within the for loop.
I think it's better to put a new line just before the comment starting
from "translator:".
Thanks for looking at it!
I've attached a new version as per your comments.
Thanks,
Bertrand
Attachments:
v1-0002-improve-rm_redo_error_callback.patchtext/plain; charset=UTF-8; name=v1-0002-improve-rm_redo_error_callback.patch; x-mac-creator=0; x-mac-type=0Download
src/backend/access/transam/xlog.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 55cac186dc..757b2bf3d0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11745,10 +11745,24 @@ rm_redo_error_callback(void *arg)
{
XLogReaderState *record = (XLogReaderState *) arg;
StringInfoData buf;
+ int block_id;
initStringInfo(&buf);
xlog_outdesc(&buf, record);
+ for (block_id = 0; block_id <= record->max_block_id; block_id++)
+ {
+ RelFileNode rnode;
+ ForkNumber forknum;
+ BlockNumber blknum;
+
+ if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
+ appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
+ block_id, rnode.spcNode, rnode.dbNode,
+ rnode.relNode, forkNames[forknum],
+ blknum);
+ }
+
/* translator: %s is a WAL record description */
errcontext("WAL redo at %X/%X for %s",
(uint32) (record->ReadRecPtr >> 32),
On Tue, 11 Aug 2020 at 00:07, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi,
On 8/10/20 7:10 AM, Masahiko Sawada wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi hackers,
I've attached a small patch to add information to rm_redo_error_callback().
The changes attached in this patch came while working on the "Add
information during standby recovery conflicts" patch (See [1]).The goal is to add more information during the callback (if doable), so
that something like:2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168would get extra information that way:
2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for
Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0As this could be useful outside of [1], a dedicated "sub" patch has been
created (thanks Sawada for the suggestion).I will add this patch to the next commitfest. I look forward to your
feedback about the idea and/or implementation.Thank you for starting the new thread for this patch!
I think this patch is simple enough and improves information shown in
errcontext.I have two comments on the patch:
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 756b838e6a..8b2024e9e9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg) { XLogReaderState *record = (XLogReaderState *) arg; StringInfoData buf; + int block_id; + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum;initStringInfo(&buf);
xlog_outdesc(&buf, record);+ for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) + appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u", + block_id, rnode.spcNode, rnode.dbNode, + rnode.relNode, forkNames[forknum], + blknum); + } /* translator: %s is a WAL record description */ errcontext("WAL redo at %X/%X for %s", (uint32) (record->ReadRecPtr >> 32),rnode, forknum and blknum can be declared within the for loop.
I think it's better to put a new line just before the comment starting
from "translator:".Thanks for looking at it!
I've attached a new version as per your comments.
Thank you for updating the patch!
The patch looks good to me. I've set this patch as Ready for Committer.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
Thank you for updating the patch!
The patch looks good to me. I've set this patch as Ready for Committer.
+ for (block_id = 0; block_id <= record->max_block_id; block_id++)
+ {
+ RelFileNode rnode;
+ ForkNumber forknum;
+ BlockNumber blknum;
Doesn't this potentially create duplicate information in some of the
RM's desc() callbacks, and are we sure that the information provided
is worth having for all the RMs? As one example, gin_desc() looks at
some of the block information, so there are overlaps. It may be
worth thinking about showing more information for has_image and
apply_image if a block is in_use?
--
Michael
On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
Thank you for updating the patch!
The patch looks good to me. I've set this patch as Ready for Committer.
+ for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum;Doesn't this potentially create duplicate information in some of the
RM's desc() callbacks, and are we sure that the information provided
is worth having for all the RMs? As one example, gin_desc() looks at
some of the block information, so there are overlaps.
Yeah, there is duplicate information in some RMs. I thought that we
can change individual RM’s desc() functions to output the block
information but as long as I see the pg_waldump outputs these are not
annoying to me and many of RM’s desc() doesn’t show the block
information.
It may be
worth thinking about showing more information for has_image and
apply_image if a block is in_use?
Yes. I’m okay with adding information for has_image and apply_image
but IMHO I'm not sure how these shown in errcontext would help. If an
error related to has_image or apply_image happens, errmsg should show
something detailed information about FPI.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
Thanks for the feedback.
On 8/11/20 12:03 PM, Masahiko Sawada wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
Thank you for updating the patch!
The patch looks good to me. I've set this patch as Ready for Committer.
+ for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum;Doesn't this potentially create duplicate information in some of the
RM's desc() callbacks, and are we sure that the information provided
is worth having for all the RMs? As one example, gin_desc() looks at
some of the block information, so there are overlaps.Yeah, there is duplicate information in some RMs. I thought that we
can change individual RM’s desc() functions to output the block
information but as long as I see the pg_waldump outputs these are not
annoying to me and many of RM’s desc() doesn’t show the block
information.
Having this "pg_waldump" kind of format in this place
(rm_redo_error_callback()) ensures that we'll always see the same piece
of information during rm_redo.
I think it's good to guarantee that we'll always see the same piece of
information (should a new RM desc() be created in the future for
example), even if it could lead to some information overlap in some cases.
It may be
worth thinking about showing more information for has_image and
apply_image if a block is in_use?Yes. I’m okay with adding information for has_image and apply_image
but IMHO I'm not sure how these shown in errcontext would help. If an
error related to has_image or apply_image happens, errmsg should show
something detailed information about FPI.
I am ok too, but I am also not sure that errcontext is the right place
for that.
Thanks
Bertrand
Show quoted text
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-Aug-17, Drouvot, Bertrand wrote:
Having this "pg_waldump" kind of format in this place
(rm_redo_error_callback()) ensures that we'll always see the same piece of
information during rm_redo.I think it's good to guarantee that we'll always see the same piece of
information (should a new RM desc() be created in the future for example),
even if it could lead to some information overlap in some cases.
I agree.
I think we should treat the changes to remove rm_desc-specific info
items that are redundant as separate improvements that don't need to
block this patch. They would be, at worst, only minor annoyances.
And the removal, as was said, can affect other things that we might want
to think about separately.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Aug 17, 2020 at 05:47:13PM +0200, Drouvot, Bertrand wrote:
I think it's good to guarantee that we'll always see the same piece of
information (should a new RM desc() be created in the future for example),
even if it could lead to some information overlap in some cases.
I am ok too, but I am also not sure that errcontext is the right place for
that.
Hmm. I still think that knowing at least about a FPW could be an
interesting piece of information even here. Anyway, instead of
copying a logic that exists already in xlog_outrec(), why not moving
the block information print into a separate routine out of the
WAL_DEBUG section, and just reuse the same format for the context of
the redo error callback? That would also be more consistent with what
we do in pg_waldump where we don't show the fork name of a block when
it is on a MAIN_FORKNUM. And this would avoid a third copy of the
same logic. If we add the XID, previous LSN and the record length
on the stack of what is printed, we could just reuse the existing
routine, still that's perhaps too much information displayed.
--
Michael
On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote:
Hmm. I still think that knowing at least about a FPW could be an
interesting piece of information even here. Anyway, instead of
copying a logic that exists already in xlog_outrec(), why not moving
the block information print into a separate routine out of the
WAL_DEBUG section, and just reuse the same format for the context of
the redo error callback? That would also be more consistent with what
we do in pg_waldump where we don't show the fork name of a block when
it is on a MAIN_FORKNUM. And this would avoid a third copy of the
same logic. If we add the XID, previous LSN and the record length
on the stack of what is printed, we could just reuse the existing
routine, still that's perhaps too much information displayed.
Seeing nothing, I took a swing at that, and finished with the
attached that refactors the logic and prints the block information as
wanted. Any objections to that?
--
Michael
Attachments:
redo-callback-block-v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 79a77ebbfe..e96075158a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -940,6 +940,7 @@ static bool CheckForStandbyTrigger(void);
#ifdef WAL_DEBUG
static void xlog_outrec(StringInfo buf, XLogReaderState *record);
#endif
+static void xlog_block_info(StringInfo buf, XLogReaderState *record);
static void xlog_outdesc(StringInfo buf, XLogReaderState *record);
static void pg_start_backup_callback(int code, Datum arg);
static void pg_stop_backup_callback(int code, Datum arg);
@@ -10258,6 +10259,19 @@ xlog_outrec(StringInfo buf, XLogReaderState *record)
appendStringInfo(buf, "; len %u",
XLogRecGetDataLen(record));
+ xlog_block_info(buf, record);
+}
+#endif /* WAL_DEBUG */
+
+/*
+ * Returns a string giving information about all the blocks in an
+ * XLogRecord.
+ */
+static void
+xlog_block_info(StringInfo buf, XLogReaderState *record)
+{
+ int block_id;
+
/* decode block references */
for (block_id = 0; block_id <= record->max_block_id; block_id++)
{
@@ -10284,7 +10298,6 @@ xlog_outrec(StringInfo buf, XLogReaderState *record)
appendStringInfoString(buf, " FPW");
}
}
-#endif /* WAL_DEBUG */
/*
* Returns a string describing an XLogRecord, consisting of its identity
@@ -11765,6 +11778,7 @@ rm_redo_error_callback(void *arg)
initStringInfo(&buf);
xlog_outdesc(&buf, record);
+ xlog_block_info(&buf, record);
/* translator: %s is a WAL record description */
errcontext("WAL redo at %X/%X for %s",
Hi,
On 10/1/20 9:41 AM, Michael Paquier wrote:
On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote:
Hmm. I still think that knowing at least about a FPW could be an
interesting piece of information even here. Anyway, instead of
copying a logic that exists already in xlog_outrec(), why not moving
the block information print into a separate routine out of the
WAL_DEBUG section, and just reuse the same format for the context of
the redo error callback? That would also be more consistent with what
we do in pg_waldump where we don't show the fork name of a block when
it is on a MAIN_FORKNUM. And this would avoid a third copy of the
same logic. If we add the XID, previous LSN and the record length
on the stack of what is printed, we could just reuse the existing
routine, still that's perhaps too much information displayed.Seeing nothing, I took a swing at that, and finished with the
attached that refactors the logic and prints the block information as
wanted. Any objections to that?
Sorry for the late reply and thanks for looking at it!
Had a look at it and did a few tests: looks all good to me.
No objections at all, thanks!
Bertrand
Show quoted text
--
Michael
On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote:
Had a look at it and did a few tests: looks all good to me.
No objections at all, thanks!
Thanks for double-checking. Applied, then.
--
Michael
On 10/2/20 2:47 AM, Michael Paquier wrote:
On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote:
Had a look at it and did a few tests: looks all good to me.
No objections at all, thanks!
Thanks for double-checking. Applied, then.
Thanks!
Bertrand