Assertion for logically decoding multi inserts into the catalog
My patch for using heap_multi_insert in the catalog [1]https://commitfest.postgresql.org/23/2125/ failed the logical
decoding part of test/recovery [2]/messages/by-id/CA+hUKGLg1vFiXnkxjp_bea5+VP8D=vHRwSdvj7Rbikr_u4xFbg@mail.gmail.com.
The assertion it failed on seems to not take multi inserts into the catalog
into consideration, while the main logic does. This assertion hasn't tripped
since there are no multi inserts into the catalog, but if we introduce them it
will so I’m raising it in a separate thread as it is sort of unrelated from the
patch in question.
The attached patch fixes my test failure and makes sense to me, but this code
is far from my neck of the tree, so I’m really not sure this is the best way to
express the assertion.
cheers ./daniel
[1]: https://commitfest.postgresql.org/23/2125/
[2]: /messages/by-id/CA+hUKGLg1vFiXnkxjp_bea5+VP8D=vHRwSdvj7Rbikr_u4xFbg@mail.gmail.com
Attachments:
logdec_assert.patchapplication/octet-stream; name=logdec_assert.patch; x-unix-mode=0644Download
From bac62a06bbf8b19a891d7a25a78b572f007f75ec Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 8 Jul 2019 16:25:35 +0200
Subject: [PATCH] Fix assertion in decoding multi-insert
The assertion in the code for logical decoding of a multi-insert
didn't take multi-inserts into the catalog into consideration, but
the actual decoding did. This assertion hasn't tripped since we
don't logically decode catalog inserts and until now there are no
multi inserts into the catalog. If that happens thoug, this assert
will trip so better fix it now.
---
src/backend/replication/logical/decode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 151c3ef882..5468a6ecb8 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -974,7 +974,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
ReorderBufferQueueChange(ctx->reorder, XLogRecGetXid(r),
buf->origptr, change);
}
- Assert(data == tupledata + tuplelen);
+ Assert(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE &&
+ data == tupledata + tuplelen);
}
/*
--
2.14.1.145.gb3622a4ee
On 08/07/2019 22:42, Daniel Gustafsson wrote:
My patch for using heap_multi_insert in the catalog [1] failed the logical
decoding part of test/recovery [2].The assertion it failed on seems to not take multi inserts into the catalog
into consideration, while the main logic does. This assertion hasn't tripped
since there are no multi inserts into the catalog, but if we introduce them it
will so I’m raising it in a separate thread as it is sort of unrelated from the
patch in question.The attached patch fixes my test failure and makes sense to me, but this code
is far from my neck of the tree, so I’m really not sure this is the best way to
express the assertion.--- a/src/backend/replication/logical/decode.c +++ b/src/backend/replication/logical/decode.c @@ -974,7 +974,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf) ReorderBufferQueueChange(ctx->reorder, XLogRecGetXid(r), buf->origptr, change); } - Assert(data == tupledata + tuplelen); + Assert(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE && + data == tupledata + tuplelen); }/*
This patch makes the assertion more strict than it was before. I don't
see how it could possibly make a regression failure go away. On the
contrary. So, huh?
- Heikki
On 31 Jul 2019, at 19:20, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
This patch makes the assertion more strict than it was before. I don't see how it could possibly make a regression failure go away. On the contrary. So, huh?
Yeah, this is clearly fat-fingered, the intent is to only run the Assert in
case XLH_INSERT_CONTAINS_NEW_TUPLE is set in xlrec->flags, as it only applies
under that condition. The attached is tested in both in the multi-insert patch
and on HEAD, but I wish I could figure out a better way to express this Assert.
cheers ./daniel
Attachments:
logdec_assert-v2.patchapplication/octet-stream; name=logdec_assert-v2.patch; x-unix-mode=0644Download
From 094d4a0cb69525f478f1452bded176fd2d3d2904 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 8 Jul 2019 16:25:35 +0200
Subject: [PATCH] Fix assertion in decoding multi-insert
The assertion in the code for logical decoding of a multi-insert
didn't take multi-inserts into the catalog into consideration, but
the actual decoding did. This assertion hasn't tripped since we
don't logically decode catalog inserts and until now there are no
multi inserts into the catalog. If that happens thoug, this assert
will trip so better fix it now.
---
src/backend/replication/logical/decode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 151c3ef882..36d43eee2a 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -974,7 +974,8 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
ReorderBufferQueueChange(ctx->reorder, XLogRecGetXid(r),
buf->origptr, change);
}
- Assert(data == tupledata + tuplelen);
+ Assert(data == tupledata + tuplelen ||
+ ~(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE));
}
/*
--
2.14.1.145.gb3622a4ee
On Tue, Aug 06, 2019 at 12:52:09AM +0200, Daniel Gustafsson wrote:
Yeah, this is clearly fat-fingered, the intent is to only run the Assert in
case XLH_INSERT_CONTAINS_NEW_TUPLE is set in xlrec->flags, as it only applies
under that condition. The attached is tested in both in the multi-insert patch
and on HEAD, but I wish I could figure out a better way to express this Assert.
- Assert(data == tupledata + tuplelen);
+ Assert(data == tupledata + tuplelen ||
+ ~(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE));
I find this way to formulate the assertion a bit confusing, as what
you want is basically to make sure that XLH_INSERT_CONTAINS_NEW_TUPLE
is not set in the context of catalogs. So you could just use that
instead:
(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE) == 0
Anyway, if you make a parallel with heap_multi_insert() and the way
each xl_multi_insert_tuple is built, I think that the error does not
come from this assertion, but with the way the data length is computed
in DecodeMultiInsert as a move to the next chunk of tuple data is only
done if XLH_INSERT_CONTAINS_NEW_TUPLE is set. So, in my opinion,
something to fix here is to make sure that we compute the correct
length even if XLH_INSERT_CONTAINS_NEW_TUPLE is *not* set, and then
make sure at the end that the tuple length matches to the end.
This way, we also make sure that we never finish on a state where
the block data associated to the multi-insert record is NULL but
because of a mistake there is some tuple data detected, or that the
tuple data set has a final length which matches the expected outcome.
And actually, it seems to me that this happens in your original patch
to open access to multi-insert for catalogs, because for some reason
XLogRecGetBlockData() returns NULL with a non-zero tuplelen in
DecodeMultiInsert(). I can see that with the TAP test
010_logical_decoding_timelines.pl
Attached is a patch for that. Thoughts?
--
Michael
Attachments:
logdec_assert-v3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 151c3ef882..81d2574f9f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -901,6 +901,7 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
return;
tupledata = XLogRecGetBlockData(r, 0, &tuplelen);
+ Assert(tupledata != NULL);
data = tupledata;
for (i = 0; i < xlrec->ntuples; i++)
@@ -916,6 +917,10 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
memcpy(&change->data.tp.relnode, &rnode, sizeof(RelFileNode));
+ xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
+ data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
+ datalen = xlhdr->datalen;
+
/*
* CONTAINS_NEW_TUPLE will always be set currently as multi_insert
* isn't used for catalogs, but better be future proof.
@@ -927,10 +932,6 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
{
HeapTupleHeader header;
- xlhdr = (xl_multi_insert_tuple *) SHORTALIGN(data);
- data = ((char *) xlhdr) + SizeOfMultiInsertTuple;
- datalen = xlhdr->datalen;
-
change->data.tp.newtuple =
ReorderBufferGetTupleBuf(ctx->reorder, datalen);
@@ -953,8 +954,6 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
memcpy((char *) tuple->tuple.t_data + SizeofHeapTupleHeader,
(char *) data,
datalen);
- data += datalen;
-
header->t_infomask = xlhdr->t_infomask;
header->t_infomask2 = xlhdr->t_infomask2;
header->t_hoff = xlhdr->t_hoff;
@@ -973,6 +972,9 @@ DecodeMultiInsert(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
ReorderBufferQueueChange(ctx->reorder, XLogRecGetXid(r),
buf->origptr, change);
+
+ /* move to the next xl_multi_insert_tuple entry */
+ data += datalen;
}
Assert(data == tupledata + tuplelen);
}
On 6 Aug 2019, at 05:36, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Aug 06, 2019 at 12:52:09AM +0200, Daniel Gustafsson wrote:
Yeah, this is clearly fat-fingered, the intent is to only run the Assert in
case XLH_INSERT_CONTAINS_NEW_TUPLE is set in xlrec->flags, as it only applies
under that condition. The attached is tested in both in the multi-insert patch
and on HEAD, but I wish I could figure out a better way to express this Assert.- Assert(data == tupledata + tuplelen); + Assert(data == tupledata + tuplelen || + ~(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE)); I find this way to formulate the assertion a bit confusing, as what you want is basically to make sure that XLH_INSERT_CONTAINS_NEW_TUPLE is not set in the context of catalogs. So you could just use that instead: (xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE) == 0Anyway, if you make a parallel with heap_multi_insert() and the way
each xl_multi_insert_tuple is built, I think that the error does not
come from this assertion, but with the way the data length is computed
in DecodeMultiInsert as a move to the next chunk of tuple data is only
done if XLH_INSERT_CONTAINS_NEW_TUPLE is set. So, in my opinion,
something to fix here is to make sure that we compute the correct
length even if XLH_INSERT_CONTAINS_NEW_TUPLE is *not* set, and then
make sure at the end that the tuple length matches to the end.This way, we also make sure that we never finish on a state where
the block data associated to the multi-insert record is NULL but
because of a mistake there is some tuple data detected, or that the
tuple data set has a final length which matches the expected outcome.
And actually, it seems to me that this happens in your original patch
to open access to multi-insert for catalogs, because for some reason
XLogRecGetBlockData() returns NULL with a non-zero tuplelen in
DecodeMultiInsert(). I can see that with the TAP test
010_logical_decoding_timelines.plAttached is a patch for that. Thoughts?
Thanks, this is a much better approach and it passes tests for me. +1 on this
version (regardless of outcome of the other patch as this is separate).
cheers ./daniel
On Tue, Aug 06, 2019 at 03:08:48PM +0200, Daniel Gustafsson wrote:
Thanks, this is a much better approach and it passes tests for me. +1 on this
version (regardless of outcome of the other patch as this is separate).
I had an extra lookup at this stuff this morning, and applied the
patch. Please note that I have kept the assertion on tupledata which
cannot be NULL and added a comment about that because it is not
possible to finish yet in a state where we do not have tuple data in
this context, but it actually could be the case if we begin to use
multi-inserts with system catalogs, so the assertion is here to make
future patch authors careful about that. We could in this case bypass
DecodeMultiInsert() if tupledata is NULL and assert that
XLH_INSERT_CONTAINS_NEW_TUPLE should not be set, or we could just
bypass the logic if XLH_INSERT_CONTAINS_NEW_TUPLE is not set at all.
Let's sort that out in your other patch.
--
Michael