Further issues with jsonb semantics, documentation
I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.
"operator jsonb - integer"
===================
Summary: I think that this operator has a problem, but a problem that
can easily be fixed.
I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?
Consider this case:
postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
?column?
------------------
{"a": 6, "c": 5}
(1 row)
Clearly anyone expecting the value "a" to be removed here would be in
for a surprise. Moreover, it is inconsistent with the established
behavior of the corresponding array-wise subscript operator:
postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
?column?
----------
[null]
(1 row)
I suggest, given that this is conceptually a data-modifying operator,
that the minus operator/jsonb_delete() case raise an error rather than
matching "operator jsonb -> integer" and returning NULL. I say this as
the person who successfully argued that the -> operator case above
should return NULL during the 9.4 beta period; returning SQL NULL for
the delete/minus operator feels like going too far in the direction of
permissiveness, even for jsonb; my expression index argument does not
apply here as it did for the "operator jsonb -> integer" case.
"operator jsonb - text"
================
Summary: I think that this operator is fine.
Documentation needs work, though. The "operator jsonb - text" operator
ought to be documented as in the attached patch, which is closer to
the equivalent hstore operator, and emphasizes the "operator jsonb ?
text" definition of a key. It should emphasize its similarity to the
established "operator jsonb ? text" operator, and in particular that
array elements behave as keys *iff* they're strings.
"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================
Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.
The new "operator jsonb - text[]" operator is confusingly inconsistent with:
A) "operator jsonb text"
and:
B) the established "operator hstore - text[]" operator, since that
operator deletes all key/value pairs that have keys that match any of
the right operand text array values. In contrast, this new operator is
passed as its right operand an array of text elements that constitute
a "path" (so the order in the rhs text[] operand matters). If the text
element in the rhs text[] operand happens to be what would pass for a
Postgres integer literal, it can be used to traverse lhs array values
through subscripting at that nesting level.
Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:
postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)
Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.
More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):
postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4 behavior
?column?
----------
t
(1 row)
postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)
Perhaps most questionably of all, the non-array based minus/delete
operator (which I like) *does* have the same idea of matching a key as
the established "operator jsonb ?| text[]" operator (and "operator
jsonb ? text", etc):
postgres=# select '["a"]'::jsonb - 'a'::text; -- new to 9.5 operator,
*does* delete!
?column?
----------
[]
(1 row)
This conceptual model for manipulating jsonb is entirely new and novel
to this new operator "operator text[]" (and jsonb_set()).
"operator jsonb - text[]" categorization/conceptual model
==========================================
Operators like the established "operator jsonb -> integer" operator (a
jsonb "array-wise" operator) always seemed okay to me because the rhs
operand really was a Postgres integer, and because it's explicitly an
array-wise operator (just like "operator -> text" is explicitly
object-wise). But now, with these new operators, you've added a
"shadow type" system to certain rhs text[] operands, consisting of
types not explicitly delineated by JSON-style double quotes (for
strings, say). So there is kind of a second shadow type system in
play, similar to that of jsonb except that text[] "shadow types"
cannot be distinguished -- '{0}'::text[] could be intended to affect
an array or an object with the key value "0" in one of its pairs.
Accordingly, I would vote for changing this, and making the nested
deletion stuff only care about object key values and array *string*
elements. This backs out of the idea of a new "shadow type" system for
text arrays. I think that this is conceptually a lot cleaner, while
actually not being significantly less useful for most use cases
(nesting tends to involve only objects anyway). As noted in my summary
of the previous section, I would also vote for scraping "operator
jsonb - text[]" as a jsonb_delete() wrapper (see closing summary below
for more on why).
While I appreciate that Andrew wanted to make deletion as flexible as
possible, these inconsistencies feel arbitrary to me.
Closing summary
=============
At the very least, some of these new jsonb operators need to decide if they're:
1) Specifically "array-wise" or "object-wise", like the existing
operators "operator -> integer", or "operator -> text".
or:
2) "Key" operators. Operators that share the "operator jsonb ? text"
idea of a key, and operate on jsonb datums accordingly. "Keys" here
include array elements that are strings.
As I've said, I prefer option 2 for deletion, which is why I like
"operator jsonb - text" but not "operator jsonb - text[]", but either
way it needs to be *a lot* clearer than it is at the moment.
You might also say that there is a category 3 jsonb operator, of which
the containment operator ("operator @>") is the best example. It takes
jsonb on the rhs, and so naturally cares about types including
container types on the lhs. Clearly these new operators are far enough
away from category 3 that we cannot really contemplate moving them
into category 3, particularly post feature freeze (even though, as I
said, I think that would be a superior approach).
Some of what I've said here is just my opinion, but I feel pretty
confident that for example we don't want to add a fourth "operator
category" to jsonb, a weird hybrid between category 2 and category 3.
Having 3 "operator categories" seems quite enough. And by making
*nested* deletion only possible through a function call to
jsonb_delete() (by scrapping "operator jsonb - text[]" as I suggest),
you also avoid having an *operator* that is still somewhat not like
other operators in category 2 (since the other operators in category 2
don't care about nesting). That feels cleaner -- IMV the oddness of
walking a path based on a text[] value ought to be confined to a well
named function. Plus you can then add a new "operator jsonb - text[]"
that matches "operator hstore - text[]" and comports with the existing
"operator jsonb - text".
If you don't like my taxonomy for jsonb operators, then feel free to
come up with your own. As things stand, it is hard to describe a
taxonomy that makes the operators easy to understand and
non-surprising -- there'd be more exceptions than rules. I feel we
need to be disciplined about this stuff, or jsonb will become much
harder to use than it needs to be.
Opinions?
--
Peter Geoghegan
Attachments:
jsonb-delete-doc.patchtext/x-patch; charset=US-ASCII; name=jsonb-delete-doc.patchDownload+3-2
On Wed, Jun 3, 2015 at 7:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
Consider this case:
postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
?column?
------------------
{"a": 6, "c": 5}
(1 row)Clearly anyone expecting the value "a" to be removed here would be in
for a surprise. Moreover, it is inconsistent with the established
behavior of the corresponding array-wise subscript operator:postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
?column?
----------
[null]
(1 row)
For similar reasons, I think that this inconsistency is unacceptable:
postgres=# select '["a", "b", "c"]'::jsonb - -1;
?column?
------------
["a", "b"]
(1 row)
postgres=# select '["a", "b", "c"]'::jsonb -> -1;
?column?
----------
[null]
(1 row)
jsonb now supports Python-style negative subscripting to index
backward. I think that this a fine idea. However, I also think it's a
big POLA violation that this was not done for the ordinary array
subscripting operator ("operator jsonb -> integer") at the same time
as "operator jsonb - integer" was added. Although doing this will
require a compatibility note in the 9.5 release notes, it's extremely
unlikely to destabilize anybody's app, and makes a lot of sense.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/03/2015 10:02 PM, Peter Geoghegan wrote:
I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.
Yes, I wish you had raised these issues months ago when this was
published. That's the way the process is supposed to work.
"operator jsonb - integer"
===================Summary: I think that this operator has a problem, but a problem that
can easily be fixed.I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?
The origin of this is nested hstore. Looking at my last version of that
patch, I see:
SELECT 'a=>1, b=>2, c=>3'::hstore - 3;
?column?
------------------------
"a"=>1, "b"=>2, "c"=>3
(1 row)
But I agree that it's not a great contribution to science, especially
since the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error
if we try to apply integer delete to an object instead of an array.
"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.The new "operator jsonb - text[]" operator is confusingly inconsistent with:
A) "operator jsonb text"
What exactly is this? I have no idea what you're talking about.
and:
B) the established "operator hstore - text[]" operator, since that
operator deletes all key/value pairs that have keys that match any of
the right operand text array values. In contrast, this new operator is
passed as its right operand an array of text elements that constitute
a "path" (so the order in the rhs text[] operand matters). If the text
element in the rhs text[] operand happens to be what would pass for a
Postgres integer literal, it can be used to traverse lhs array values
through subscripting at that nesting level.
The fact that hstore uses it that way doesn't really concern me. Since
hstore isn't nested it doesn't make a whole lot of sense for it to mean
anything else there. But json(b) is nested, and jsonb - path seems quite
a reasonable treatment, something you're much more likely to want to do
than removeing top level elements in bulk.
Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.
The behaviour of jsonb_set is pretty explicitly documented. If we wanted
to do something else then we'd have to disable the special meaning given
to negative indices, but that would mean in turn we wouldn't be able to
prepend to an array.
More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4 behavior
?column?
----------
t
(1 row)postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)
You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 6/4/15 8:43 AM, Andrew Dunstan wrote:
You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.
Has the idea of a specific json_path datatype been discussed? I feel it
would add a lot of clarity to the operators. It would also make it easy
to have an array of paths, something that's difficult to do today
because a path can be an arbitrary length and arrays don't support that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/04/2015 11:33 AM, Jim Nasby wrote:
On 6/4/15 8:43 AM, Andrew Dunstan wrote:
You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant
is. The fact that they are both text arrays doesn't mean that they
should mean the same thing. And this is really the whole problem with
the rest of your analysis.Has the idea of a specific json_path datatype been discussed? I feel
it would add a lot of clarity to the operators. It would also make it
easy to have an array of paths, something that's difficult to do today
because a path can be an arbitrary length and arrays don't support that.
I actually thought of doing something like that earlier today, although
I was thinking of making it an array under the hood - I'm not sure how
much call there is for an array of paths. We could probably finesse
that. I agree that there is some sense in having such a type, especially
if we later want to implement json(b)_patch, see
<http://tools.ietf.org/html/rfc6902>. And if we do we should call the
type json_pointer to be consistent with
<http://tools.ietf.org/html/rfc6901>.
However, this is certainly not 9.5 material.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.Yes, I wish you had raised these issues months ago when this was published.
That's the way the process is supposed to work.
I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.
"operator jsonb - integer"
===================
I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?
But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.
Cool. Do you want to write a patch, or should I?
Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?
"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.The new "operator jsonb - text[]" operator is confusingly inconsistent
with:A) "operator jsonb text"
What exactly is this? I have no idea what you're talking about.
It's a typo -- I meant "operator jsonb - text". The fact that
"operator jsonb - text" and "operator jsonb - text[]" diverge in the
way they do seems confusing.
The fact that hstore uses it that way doesn't really concern me. Since
hstore isn't nested it doesn't make a whole lot of sense for it to mean
anything else there.
It seems pretty obvious to me that it makes just as much sense as in
hstore. In hstore, you might want to delete multiple key/value pairs
at once, for exactly the same reason as you might want to with jsonb.
Certainly, you'll also want to support nested deletion with jsonb, but
that's beside the point.
But json(b) is nested, and jsonb - path seems quite a
reasonable treatment, something you're much more likely to want to do than
removeing top level elements in bulk.
Probably true. I think that this interface for nested deletion is
complicated enough and inconsistent enough that I'd rather not have an
operator at all, just a function (so somewhat like jsonb_set() --
jsonb_delete()). That is my main point on "operator jsonb - text[]";
I think the interface is complicated and inconsistent with everything
else for no good reason.
Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}') ;
jsonb_delete
--------------
["a", "b"]
(1 row)Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.The behaviour of jsonb_set is pretty explicitly documented. If we wanted to
do something else then we'd have to disable the special meaning given to
negative indices, but that would mean in turn we wouldn't be able to prepend
to an array.
jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.
I think that traversing arrays in nested documents is a rare
requirement, because the ordering within arrays is unstable. If you
already know the ordinal number of the thing you want to nuke, then
you probably have already locked the row, and you might as well
manipulate the JSON using Javascript or Python at that stage.
Making jsonb_delete() buy into the "operator jsonb ? text" idea of a
key (a thing that it must delete) would also allow jsonb_delete() to
reliably delete particular strings in arrays, which actually does make
a lot of sense (think of arrays of "tags"). But FWIW it's the
inconsistency that bothers me most.
More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):postgres=# select '["a"]'::jsonb ?| '{a}'::text[]; -- historic/9.4
behavior
?column?
----------
t
(1 row)postgres=# select '["a"]'::jsonb - '{a}'::text[]; -- new to 9.5
operator, does not delete!
?column?
----------
["a"]
(1 row)You are conflating two different things here, quite pointlessly. The RH
operand of ?| is not a path, whereas the RH operand of this - variant is.
The fact that they are both text arrays doesn't mean that they should mean
the same thing. And this is really the whole problem with the rest of your
analysis.
I'm not conflating anything -- obviously I understand that "operator
jsonb - text[]" is not supposed to be an "any" variant of "operator
jsonb - text". My point is that a reasonable person could infer that
it *is* an "any" variant of "operator jsonb - text", based on hstore's
behavior for its two deletion operators, and based on the other jsonb
operators, and based on the fact that it will often behave that way
anyway.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to have
jsonb 'some json value' - '{foo,bar}'
operate too differently from
jsonb 'some json value' - json_path '{foo,bar}'
And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 12:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to havejsonb 'some json value' - '{foo,bar}'
operate too differently from
jsonb 'some json value' - json_path '{foo,bar}'
And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.
Fair point, but FWIW I don't think it'll end up being a new type like
json_path -- it'll just be jsonb, as with containment. I can see there
being an operator that performs deletion in a very similar way to how
"operator jsonb @> jsonb" performs containment (recall that jsonb
containment is a very JSON-ish flavor of containment).
I would like these new-to-9.5 deletion operators to work at the top
level only, like "operator jsonb ? text" and "operator jsonb ?| text",
sharing their idea of a key, __including that string array elements
are keys__. We haven't got a containment-style nested delete operator
for 9.5, but we can hope for it in the future. In the meantime, you
get much of the benefit of that with jsonb_delete(), which *can*
support nested deletion. It does so by buying into the "operator jsonb
? text" idea of a key (including that string array elements are keys),
although with a twist: the paths text[] right operand operates at
multiple nesting levels (not supporting traversing arrays, as Andrew
implemented it, but OTOH adding support for deleting String array
elements based on the string alone, useful for "tag" arrays).
If in 9.6 we have something like an "operator jsonb @- jsonb" operator
for containment style deletion, and a 9.5 era "operator jsonb - text"
and operator jsonb - text[]" pair of operators for existence style
deletion (matching "operator jsonb ? text", operating only on the top
level), that will be pretty good. The fact that jsonb_delete() will
have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
being usable through an operator) won't really matter then. I want to
keep the "twist" I described out of any jsonb operators that are
shipped, and only use it within functions.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 1:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
I would like these new-to-9.5 deletion operators to work at the top
level only, like "operator jsonb ? text" and "operator jsonb ?| text",
sharing their idea of a key, __including that string array elements
are keys__. We haven't got a containment-style nested delete operator
for 9.5, but we can hope for it in the future. In the meantime, you
get much of the benefit of that with jsonb_delete(), which *can*
support nested deletion. It does so by buying into the "operator jsonb
? text" idea of a key (including that string array elements are keys),
although with a twist: the paths text[] right operand operates at
multiple nesting levels (not supporting traversing arrays, as Andrew
implemented it, but OTOH adding support for deleting String array
elements based on the string alone, useful for "tag" arrays).If in 9.6 we have something like an "operator jsonb @- jsonb" operator
for containment style deletion, and a 9.5 era "operator jsonb - text"
and operator jsonb - text[]" pair of operators for existence style
deletion (matching "operator jsonb ? text", operating only on the top
level), that will be pretty good. The fact that jsonb_delete() will
have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
being usable through an operator) won't really matter then. I want to
keep the "twist" I described out of any jsonb operators that are
shipped, and only use it within functions.
To be clear: these two paragraphs are a proposal about how I'd like to
change things for 9.5 to make the jsonb operators more consistent than
the way things are in the master branch, while still offering nested
deletion through a function.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I'm just skimming here, but if a jsonb_path type is being proposed,
Is this not the purpose of JSQuery?
https://code.google.com/p/gwtquery/wiki/JsQuery
David
Attachments:
On 06/04/2015 04:13 PM, David E. Wheeler wrote:
On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I'm just skimming here, but if a jsonb_path type is being proposed,
Is this not the purpose of JSQuery?
No, it doesn't seem to have anything at all to do with it. What I
suggested would be an implementation of json_pointer - see
<http://tools.ietf.org/html/rfc6901>
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/04/2015 03:10 PM, Peter Geoghegan wrote:
On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.Yes, I wish you had raised these issues months ago when this was published.
That's the way the process is supposed to work.I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.
That's happened to me in the past. My view has generally been that in
that case I have missed my chance, and I need to live with what others
have done. That seems to me preferable to tearing up any pretense we
might have to be following a defined development process.
I should point out that I have already gone out of my way to accommodate
concerns you expressed extremely late about this set of features, and I
have lately indicated another area where we can adjust it to meet your
objections. Re-litigating this wholesale seems quite a different kettle
of fish, however.
Just in case it's not clear: I am not at all happy.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.
Here is a further example of why I find this new "shadow type" system
for rhs text[] operands to be pretty questionable:
postgres=# select jsonb_set('[1, 2, 3, 4, 5,6,7,8,9,10,11,12]',
'{"5e10"}'::text[], '"Input unsanitized"') ;
jsonb_set
-----------------------------------------------------------
[1, 2, 3, 4, 5, "Input unsanitized", 7, 8, 9, 10, 11, 12]
(1 row)
BTW, there is a bug here -- strtol() needs additional defenses [1]https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer -- Peter Geoghegan
(before casting to int):
postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
jsonb_set
----------------------------------------------------------------------------------
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
unsanitized", 18]
(1 row)
[1]: https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer -- Peter Geoghegan
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 5:31 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
Just in case it's not clear: I am not at all happy.
I've offered to help you with several of the issue I raised; I had
intended to offer more help.
The issues I raise seem pretty substantive to me. I'm trying to make
sure that we don't end up with something bad that we need to live with
indefinitely. I have offered you something not far off an "everybody
wins" proposal (i.e. no real loss of functionality), and that was my
first proposal.
I don't know what more I could do for you. I *am* trying to help.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/04/2015 03:16 PM, Alvaro Herrera wrote:
I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to havejsonb 'some json value' - '{foo,bar}'
operate too differently from
jsonb 'some json value' - json_path '{foo,bar}'
And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.
The boat has sailed on this. We have had the #> and #>> operators since
9.3, i.e. even before we got the operators that Peter wants us to adopt
the usage from, and their right hand operands are text arrays with the
same path semantics.
'some jsonb value' - '{foo,bar}' is already ambiguous - the RH operand
could be a single text datum or a text array.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.Cool. Do you want to write a patch, or should I?
Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?
Please let me know if you want me to write these two patches.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 06/05/2015 01:39 PM, Peter Geoghegan wrote:
On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
But I agree that it's not a great contribution to science, especially since
the index will be applied to the list of elements in the somewhat
counter-intuitive storage order we use, and we could just raise an error if
we try to apply integer delete to an object instead of an array.Cool. Do you want to write a patch, or should I?
Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?Please let me know if you want me to write these two patches.
Send the first one, I'm still thinking about the second one.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andrew Dunstan wrote:
On 06/04/2015 03:16 PM, Alvaro Herrera wrote:
I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument. We can provide that functionality with just
functions. For example, it will be confusing to havejsonb 'some json value' - '{foo,bar}'
operate too differently from
jsonb 'some json value' - json_path '{foo,bar}'
And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.The boat has sailed on this. We have had the #> and #>> operators since 9.3,
i.e. even before we got the operators that Peter wants us to adopt the usage
from, and their right hand operands are text arrays with the same path
semantics.
Well, some boats sailed, but maybe those were different boats. I don't
think we should shut discussion off only because we made some choice or
other in the past. Since we haven't released yet, we can base decisions
on what's the most useful API for users, rather on what got committed in
the initial patch.
'some jsonb value' - '{foo,bar}' is already ambiguous - the RH operand
could be a single text datum or a text array.
Hmm, but that's not in 9.4, so we can still tweak it if necessary.
Consider this jsonb datum. Nobody in their right mind would have a key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}": "uh"}' ;
jsonb
------------------------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)
This seems pretty surprising to me:
-- here, the -(jsonb,text) operator is silently chosen, even though the
-- right operand looks like an array. And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)
-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - _text '{c,a}';
?column?
-------------------------------
{"a": "1", "b": "2", "c": {}}
(1 fila)
But this seems worse to me, because we silently do nothing:
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)
I think the first operator can be qualified as dangerous. If you delete
that one, then it's fine because you can't do that query anymore because
of the conflict with -(jsonb, int).
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}';
ERROR: operator is not unique: jsonb - unknown
L�NEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '{c,a}'...
^
SUGERENCIA: Could not choose a best candidate operator. You might need to add explicit type casts.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jun 5, 2015 at 8:32 , Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
Andrew Dunstan wrote:
'some jsonb value' - '{foo,bar}' is already ambiguous - the RH
operand
could be a single text datum or a text array.Hmm, but that's not in 9.4, so we can still tweak it if necessary.
Consider this jsonb datum. Nobody in their right mind would have a
key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}":
"uh"}' ;
jsonb
------------------------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)This seems pretty surprising to me:
-- here, the -(jsonb,text) operator is silently chosen, even though
the
-- right operand looks like an array. And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
_text '{c,a}';
?column?
-------------------------------
{"a": "1", "b": "2", "c": {}}
(1 fila)But this seems worse to me, because we silently do nothing:
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
?column?
---------------------------------------
{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)I think the first operator can be qualified as dangerous. If you
delete
that one, then it's fine because you can't do that query anymore
because
of the conflict with -(jsonb, int).alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}';
ERROR: operator is not unique: jsonb - unknown
LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
'{c,a}'...
^
SUGERENCIA: Could not choose a best candidate operator. You might
need to add explicit type casts.
That's a good point, and it won't get any better if/when we add the
json point support in 9.6 since the syntax would be something like
select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we
will again silently do nothing. That's going to cause bugs in
applications using this.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 6/5/15 2:08 PM, Petr Jelinek wrote:
That's a good point, and it won't get any better if/when we add the json
point support in 9.6 since the syntax would be something like select
jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
silently do nothing. That's going to cause bugs in applications using this.
Yeah, this is a miniature version of the pain I've felt with variant:
trying to get sane casting for a data type that encompasses other types
in current Postgres is essentially impossible. Your only option is to
put implicit or assignment casts in and cross your fingers, or to do
only explicit casts and force the user to cast everything (which is a
PITA). Even a json_pointer type may not help this much unless we have
some way to reliable transform an unknown into a json_pointer.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers