Bug in jsonb minus operator
I'm seeing the following problem on the master branch:
postgres=# select '{"foo":5}'::jsonb - 'bar'; -- okay
?column?
------------
{"foo": 5}
(1 row)
postgres=# select '{"foo":{"bar":5}}'::jsonb - 'foo'; -- okay
?column?
----------
{}
(1 row)
postgres=# select '{"foo":{"bar":5}}'::jsonb - 'rion'; -- spurious error
ERROR: XX000: unknown type of jsonb container to convert
LOCATION: convertJsonbValue, jsonb_util.c:1430
This is an elog() - a "can't happen" error - due to this restriction
within convertJsonbValue():
/*
* A JsonbValue passed as val should never have a type of jbvBinary, and
* neither should any of its sub-components. Those values will be produced
* by convertJsonbArray and convertJsonbObject, the results of which will
* not be passed back to this function as an argument.
*/
This call to convertJsonbValue() actually originates from the final
line of the new jsonb_delete() function, here:
#5 0x0000000000877e10 in jsonb_delete (fcinfo=0x160e060) at jsonfuncs.c:3389
3389 PG_RETURN_JSONB(JsonbValueToJsonb(res));
I explored writing a fix for this bug. I went back and forth on it,
but I think that the most promising approach might be to simply teach
convertJsonbValue() to care about jbvBinary-typed JsonbValue
variables. That way, the jsonb_delete() function could actually expect
this to work. I can't remember why I thought it was a good idea to
have convertJsonbValue() reject jbvBinary values, but I believe the
reason was that it simply wasn't necessary. As it says of conversion
from in memory to disk representation (at the top of json_util.c):
/*
* Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
*
* There isn't a JsonbToJsonbValue(), because generally we find it more
* convenient to directly iterate through the Jsonb representation and only
* really convert nested scalar values. JsonbIteratorNext() does this, so that
* clients of the iteration code don't have to directly deal with the binary
* representation (JsonbDeepContains() is a notable exception, although all
* exceptions are internal to this module). In general, functions that accept
* a JsonbValue argument are concerned with the manipulation of scalar values,
* or simple containers of scalar values, where it would be inconvenient to
* deal with a great amount of other state.
*/
jsonb_delete() is unusual in that it converts from a JsonbValue back
to the on-disk Jsonb varlena format, but with a nested jbvBinary-typed
value (in the presence of a nested object or array) -- it seems like
it wouldn't be too hard to just roll with that. jsonb_delete() makes
the mistake of not expecting to see jbvBinary values (since it doesn't
always recurse into the json structure). We shouldn't really deal with
jbvBinary-typed values specially from outside specialized utility
routines like JsonbDeepContains() as noted in the above comment,
though (so offhand I don't think we should teach jsonb_delete()
anything new, as that would be a modularity violation). Thoughts?
Note that the existence related operators (that, like the minus
operator should only work at the top level) don't go through
JsonbValue conversion of the lhs value as part of their search at all.
I don't think that their findJsonbValueFromContainer() routine (or a
similar routine) is the right way of approaching this, though - that's
a specialized routine, that doesn't care if an array value (which is,
you'll recall, a key for the purposes of existence checking) appears
once or multiple times. On that topic, I think it's sloppy that "Table
9-41. Additional jsonb Operators" isn't clear about the fact that the
"operator - text" op matches things on the same basis as the existence
operators -- notice how the existence operator notes with emphasis
that it cares about array *string* values only.
Finally, I don't think an operator implementation function that is
jsonb-only belongs anywhere other than jsonb_ops.c (for the same
reason, replacePath() should live in jsonb_util.c). And I'm
disappointed that the jsonb tests can no longer be run atomically with
'make installcheck TESTS="jsonb"' - I liked being able to do that.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/16/2015 08:04 PM, Peter Geoghegan wrote:
I'm seeing the following problem on the master branch:
postgres=# select '{"foo":5}'::jsonb - 'bar'; -- okay
?column?
------------
{"foo": 5}
(1 row)postgres=# select '{"foo":{"bar":5}}'::jsonb - 'foo'; -- okay
?column?
----------
{}
(1 row)postgres=# select '{"foo":{"bar":5}}'::jsonb - 'rion'; -- spurious error
ERROR: XX000: unknown type of jsonb container to convert
LOCATION: convertJsonbValue, jsonb_util.c:1430This is an elog() - a "can't happen" error - due to this restriction
within convertJsonbValue():/*
* A JsonbValue passed as val should never have a type of jbvBinary, and
* neither should any of its sub-components. Those values will be produced
* by convertJsonbArray and convertJsonbObject, the results of which will
* not be passed back to this function as an argument.
*/This call to convertJsonbValue() actually originates from the final
line of the new jsonb_delete() function, here:#5 0x0000000000877e10 in jsonb_delete (fcinfo=0x160e060) at jsonfuncs.c:3389
3389 PG_RETURN_JSONB(JsonbValueToJsonb(res));I explored writing a fix for this bug. I went back and forth on it,
but I think that the most promising approach might be to simply teach
convertJsonbValue() to care about jbvBinary-typed JsonbValue
variables. That way, the jsonb_delete() function could actually expect
this to work. I can't remember why I thought it was a good idea to
have convertJsonbValue() reject jbvBinary values, but I believe the
reason was that it simply wasn't necessary.
Sure. I thought we'd covered this but it's possible that we didn't, or
that it got rebroken. There have been complaints about the limitation on
values containing jbvBinary, so let's just remove it if that can be done
simply, as it seems to be a not uncommonly encountered problem.
Are you going to submit a patch for that?
jsonb_delete() is unusual in that it converts from a JsonbValue back
to the on-disk Jsonb varlena format, but with a nested jbvBinary-typed
value (in the presence of a nested object or array) -- it seems like
it wouldn't be too hard to just roll with that. jsonb_delete() makes
the mistake of not expecting to see jbvBinary values (since it doesn't
always recurse into the json structure). We shouldn't really deal with
jbvBinary-typed values specially from outside specialized utility
routines like JsonbDeepContains() as noted in the above comment,
though (so offhand I don't think we should teach jsonb_delete()
anything new, as that would be a modularity violation). Thoughts?
Assuming the above fix we won't have to teach it anything new, right?
Note that the existence related operators (that, like the minus
operator should only work at the top level) don't go through
JsonbValue conversion of the lhs value as part of their search at all.
I don't think that their findJsonbValueFromContainer() routine (or a
similar routine) is the right way of approaching this, though - that's
a specialized routine, that doesn't care if an array value (which is,
you'll recall, a key for the purposes of existence checking) appears
once or multiple times.
Seems reasonable.
On that topic, I think it's sloppy that "Table
9-41. Additional jsonb Operators" isn't clear about the fact that the
"operator - text" op matches things on the same basis as the existence
operators -- notice how the existence operator notes with emphasis
that it cares about array *string* values only.
OK, please submit a patch that you think clears it up.
Finally, I don't think an operator implementation function that is
jsonb-only belongs anywhere other than jsonb_ops.c (for the same
reason, replacePath() should live in jsonb_util.c).
The heading on jsonb_op.c (note spelling) suggests that it's for indexed
operations. I don't mind rearranging the code layout to something people
think more logical, but I also don't want to engage in unnecessary code
churn. I'm not that religious about the organization.
And I'm
disappointed that the jsonb tests can no longer be run atomically with
'make installcheck TESTS="jsonb"' - I liked being able to do that.
It was always known that some cases would not work with TESTS - that was
part of Tom's reservation about the whole thing. You can say
make check-tests TESTS="create_table copy jsonb"
and while create_table will fail on an irrelevant test, copy and jsonb
will succeed.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Sure. I thought we'd covered this but it's possible that we didn't, or that
it got rebroken. There have been complaints about the limitation on values
containing jbvBinary, so let's just remove it if that can be done simply, as
it seems to be a not uncommonly encountered problem.Are you going to submit a patch for that?
I'll try and come up with something. It's not a trivial fix.
Assuming the above fix we won't have to teach it anything new, right?
I'm not 100% sure, but I think not.
On that topic, I think it's sloppy that "Table
9-41. Additional jsonb Operators" isn't clear about the fact that the
"operator - text" op matches things on the same basis as the existence
operators -- notice how the existence operator notes with emphasis
that it cares about array *string* values only.OK, please submit a patch that you think clears it up.
I was going to better document the new operators anyway, so I'll get
this in passing.
The heading on jsonb_op.c (note spelling) suggests that it's for indexed
operations. I don't mind rearranging the code layout to something people
think more logical, but I also don't want to engage in unnecessary code
churn. I'm not that religious about the organization.
I see what you mean.
It was always known that some cases would not work with TESTS - that was
part of Tom's reservation about the whole thing. You can say
Sure, but you added it, and you mentioned jsonb in the commit message,
before jsonb was even committed to the master branch. :-)
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/17/2015 09:08 PM, Peter Geoghegan wrote:
On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Sure. I thought we'd covered this but it's possible that we didn't, or that
it got rebroken. There have been complaints about the limitation on values
containing jbvBinary, so let's just remove it if that can be done simply, as
it seems to be a not uncommonly encountered problem.Are you going to submit a patch for that?
I'll try and come up with something. It's not a trivial fix.
Here's a thought. in pushJsonbValue() the value pushed is called
scalarVal, and in all our regression tests it is in fact scalar.
However, the Asserts inside the function tell a different story:
case WJB_VALUE:
Assert(IsAJsonbScalar(scalarVal) ||
scalarVal->type == jbvBinary);
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
Assert(IsAJsonbScalar(scalarVal) ||
scalarVal->type == jbvBinary);
appendElement(*pstate, scalarVal);
break;
and indeed it turns out that your delete example does push a value with
jbvBinary, thus triggering the problem. So, could we deal with that, by,
say, setting up an iterator for the binary datum and just pushing all of
its values?
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/18/2015 08:17 AM, Andrew Dunstan wrote:
On 05/17/2015 09:08 PM, Peter Geoghegan wrote:
On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:Sure. I thought we'd covered this but it's possible that we didn't,
or that
it got rebroken. There have been complaints about the limitation on
values
containing jbvBinary, so let's just remove it if that can be done
simply, as
it seems to be a not uncommonly encountered problem.Are you going to submit a patch for that?
I'll try and come up with something. It's not a trivial fix.
Here's a thought. in pushJsonbValue() the value pushed is called
scalarVal, and in all our regression tests it is in fact scalar.
However, the Asserts inside the function tell a different story:case WJB_VALUE:
Assert(IsAJsonbScalar(scalarVal) ||
scalarVal->type == jbvBinary);
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
Assert(IsAJsonbScalar(scalarVal) ||
scalarVal->type == jbvBinary);
appendElement(*pstate, scalarVal);
break;and indeed it turns out that your delete example does push a value
with jbvBinary, thus triggering the problem. So, could we deal with
that, by, say, setting up an iterator for the binary datum and just
pushing all of its values?
Here's an patch along those lines. It seems to do the trick, at least
for your test case, and it has the merit of being very small, so small
I'd like to backpatch it - accepting jbvBinary as is in pushJsonbValue
seems like a bug to me.
cheers
andrew
Attachments:
jbvbinaryfix.patchtext/x-patch; name=jbvbinaryfix.patchDownload
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 5cd9140..03cc125 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -57,6 +57,9 @@ static void appendElement(JsonbParseState *pstate, JsonbValue *scalarVal);
static int lengthCompareJsonbStringValue(const void *a, const void *b);
static int lengthCompareJsonbPair(const void *a, const void *b, void *arg);
static void uniqueifyJsonbObject(JsonbValue *object);
+static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
+ JsonbIteratorToken seq,
+ JsonbValue *scalarVal);
/*
* Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
@@ -505,8 +508,37 @@ fillJsonbValue(JsonbContainer *container, int index,
* JsonbValue. There is one exception -- WJB_BEGIN_ARRAY callers may pass a
* "raw scalar" pseudo array to append that.
*/
+
JsonbValue *
pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
+ JsonbValue *jbval)
+{
+ JsonbIterator *it;
+ JsonbValue *res = NULL;
+ JsonbValue v;
+ JsonbIteratorToken tok;
+
+ if (!jbval || (seq != WJB_ELEM && seq != WJB_VALUE) ||
+ jbval->type != jbvBinary)
+ {
+ /* drop through */
+ return pushJsonbValueScalar(pstate, seq, jbval);
+ }
+
+ /* unpack the binary and add each piece to the pstate */
+ it = JsonbIteratorInit(jbval->val.binary.data);
+ while ((tok = JsonbIteratorNext(&it, &v, false)) != WJB_DONE)
+ res = pushJsonbValueScalar(pstate, tok,
+ tok < WJB_BEGIN_ARRAY ? &v : NULL);
+
+ return res;
+}
+
+
+
+
+static JsonbValue *
+pushJsonbValueScalar(JsonbParseState **pstate, JsonbIteratorToken seq,
JsonbValue *scalarVal)
{
JsonbValue *result = NULL;
@@ -549,13 +581,11 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
appendKey(*pstate, scalarVal);
break;
case WJB_VALUE:
- Assert(IsAJsonbScalar(scalarVal) ||
- scalarVal->type == jbvBinary);
+ Assert(IsAJsonbScalar(scalarVal));
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
- Assert(IsAJsonbScalar(scalarVal) ||
- scalarVal->type == jbvBinary);
+ Assert(IsAJsonbScalar(scalarVal));
appendElement(*pstate, scalarVal);
break;
case WJB_END_OBJECT:
On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Here's an patch along those lines. It seems to do the trick, at least for
your test case, and it has the merit of being very small, so small I'd like
to backpatch it - accepting jbvBinary as is in pushJsonbValue seems like a
bug to me.
Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.
Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/18/2015 10:52 PM, Peter Geoghegan wrote:
On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Here's an patch along those lines. It seems to do the trick, at least for
your test case, and it has the merit of being very small, so small I'd like
to backpatch it - accepting jbvBinary as is in pushJsonbValue seems like a
bug to me.Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.
Umm, no, the raw scalar pseudo arrays are of type jbvArray, not
jbvBinary. And they are pushed with WJB_BEGIN_ARRAY, not with WJB_ELEM
or WJB_VALUE, as the comment notes. See this fragment of code from
JsonbValueToJsonb:
scalarArray.type = jbvArray;
scalarArray.val.array.rawScalar = true;
scalarArray.val.array.nElems = 1;
pushJsonbValue(&pstate, WJB_BEGIN_ARRAY, &scalarArray);
I tested this by removing the assert test for jbvBinary in the WJB_ELEM
and WJB_VALUE switch beranches and then running the regression suite. No
assertion failure was triggered. While that's not guaranteed to be a
perfect test, it doesn't seem like a bad one. Can you pose a counter
example where this will break?
Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.
If you don't the logic gets more complex, as you need to keep track of
what level of the object you are at. The virtue of this change is that
it will simplify a lot of such processing by removing the unnecessary
restriction on passing jbvBinary values to pushJsonbValue().
If you have a better fix for the bug you complained about I'll be happy
to take a look.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/19/2015 07:11 PM, Andrew Dunstan wrote:
On 05/18/2015 10:52 PM, Peter Geoghegan wrote:
On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:Here's an patch along those lines. It seems to do the trick, at
least for
your test case, and it has the merit of being very small, so small
I'd like
to backpatch it - accepting jbvBinary as is in pushJsonbValue seems
like a
bug to me.Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.Umm, no, the raw scalar pseudo arrays are of type jbvArray, not
jbvBinary. And they are pushed with WJB_BEGIN_ARRAY, not with WJB_ELEM
or WJB_VALUE, as the comment notes. See this fragment of code from
JsonbValueToJsonb:scalarArray.type = jbvArray;
scalarArray.val.array.rawScalar = true;
scalarArray.val.array.nElems = 1;pushJsonbValue(&pstate, WJB_BEGIN_ARRAY, &scalarArray);
I tested this by removing the assert test for jbvBinary in the
WJB_ELEM and WJB_VALUE switch beranches and then running the
regression suite. No assertion failure was triggered. While that's not
guaranteed to be a perfect test, it doesn't seem like a bad one. Can
you pose a counter example where this will break?Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.If you don't the logic gets more complex, as you need to keep track of
what level of the object you are at. The virtue of this change is that
it will simplify a lot of such processing by removing the unnecessary
restriction on passing jbvBinary values to pushJsonbValue().If you have a better fix for the bug you complained about I'll be
happy to take a look.
OK, here's a slightly updated version of the patch. The current
behaviour in which pushJsonbValue() happily pushes a jbvBinary value,
only to have it fail later when we try to get a Jsonb out of it, seems
like a plain bug. The fact that none of our present 9.4 code exercises
that bug is lucky, but since this is an exposed API, and has been the
cause of past complaint, I propose, in the absence of further objections
or comments, to apply it to both master and 9.4, once the new releases
are out of the way.
cheers
andrew
Attachments:
jbvbinaryfix2.patchtext/x-patch; name=jbvbinaryfix2.patchDownload
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 5cd9140..974e386 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -57,6 +57,9 @@ static void appendElement(JsonbParseState *pstate, JsonbValue *scalarVal);
static int lengthCompareJsonbStringValue(const void *a, const void *b);
static int lengthCompareJsonbPair(const void *a, const void *b, void *arg);
static void uniqueifyJsonbObject(JsonbValue *object);
+static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
+ JsonbIteratorToken seq,
+ JsonbValue *scalarVal);
/*
* Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
@@ -503,10 +506,43 @@ fillJsonbValue(JsonbContainer *container, int index,
*
* Only sequential tokens pertaining to non-container types should pass a
* JsonbValue. There is one exception -- WJB_BEGIN_ARRAY callers may pass a
- * "raw scalar" pseudo array to append that.
+ * "raw scalar" pseudo array to append it - the actual scalar should be passed
+ * next and it will be added as the only member of the array.
+ *
+ * Values of type jvbBinary, which are rolled up arrays and objects,
+ * are unpacked before being added to the result.
*/
JsonbValue *
pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
+ JsonbValue *jbval)
+{
+ JsonbIterator *it;
+ JsonbValue *res = NULL;
+ JsonbValue v;
+ JsonbIteratorToken tok;
+
+ if (!jbval || (seq != WJB_ELEM && seq != WJB_VALUE) ||
+ jbval->type != jbvBinary)
+ {
+ /* drop through */
+ return pushJsonbValueScalar(pstate, seq, jbval);
+ }
+
+ /* unpack the binary and add each piece to the pstate */
+ it = JsonbIteratorInit(jbval->val.binary.data);
+ while ((tok = JsonbIteratorNext(&it, &v, false)) != WJB_DONE)
+ res = pushJsonbValueScalar(pstate, tok,
+ tok < WJB_BEGIN_ARRAY ? &v : NULL);
+
+ return res;
+}
+
+/*
+ * Do the actual pushing, with only scalar or pseudo-scalar-array values
+ * accepted.
+ */
+static JsonbValue *
+pushJsonbValueScalar(JsonbParseState **pstate, JsonbIteratorToken seq,
JsonbValue *scalarVal)
{
JsonbValue *result = NULL;
@@ -549,13 +585,11 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
appendKey(*pstate, scalarVal);
break;
case WJB_VALUE:
- Assert(IsAJsonbScalar(scalarVal) ||
- scalarVal->type == jbvBinary);
+ Assert(IsAJsonbScalar(scalarVal));
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
- Assert(IsAJsonbScalar(scalarVal) ||
- scalarVal->type == jbvBinary);
+ Assert(IsAJsonbScalar(scalarVal));
appendElement(*pstate, scalarVal);
break;
case WJB_END_OBJECT:
diff --git a/src/include/utils/jsonb.h b/src/include/utils/jsonb.h
index 7b56175..b02934a 100644
--- a/src/include/utils/jsonb.h
+++ b/src/include/utils/jsonb.h
@@ -418,7 +418,7 @@ extern JsonbValue *findJsonbValueFromContainer(JsonbContainer *sheader,
extern JsonbValue *getIthJsonbValueFromContainer(JsonbContainer *sheader,
uint32 i);
extern JsonbValue *pushJsonbValue(JsonbParseState **pstate,
- JsonbIteratorToken seq, JsonbValue *scalarVal);
+ JsonbIteratorToken seq, JsonbValue *jbVal);
extern JsonbIterator *JsonbIteratorInit(JsonbContainer *container);
extern JsonbIteratorToken JsonbIteratorNext(JsonbIterator **it, JsonbValue *val,
bool skipNested);