BUG #9519: Allows storing scalar json, but fails when querying

Started by Alf Kristian Støyleabout 12 years ago14 messagesbugs
Jump to latest
#1Alf Kristian Støyle
alf.kristian@gmail.com

The following bug has been logged on the website:

Bug reference: 9519
Logged by: Alf Kristian Støyle
Email address: alf.kristian@gmail.com
PostgreSQL version: 9.3.2
Operating system: Red Hat 4.6.3-2
Description:

Steps to reproduce:
create table jtest (data json);
=> CREATE TABLE

insert into jtest (data) values ('1');
=> INSERT 0 1

select data->>'foo' from jtest;
=> ERROR: cannot extract element from a scalar

I think the insert should fail, since '1' is not valid JSON.

After the data is in the database every query using the ->> operator,
hitting the row containing '1' will fail.

Our database runs in Amazon RDS, 'select version();' returns:
PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3
20120306 (Red Hat 4.6.3-2), 64-bit

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Marko Tiikkaja
marko@joh.to
In reply to: Alf Kristian Støyle (#1)
Re: BUG #9519: Allows storing scalar json, but fails when querying

On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:

I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object. If you only want to
store JSON objects in the table, consider using a CHECK constraint.

Regards,
Marko Tiikkaja

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3John R Pierce
pierce@hogranch.com
In reply to: Marko Tiikkaja (#2)
Re: BUG #9519: Allows storing scalar json, but fails when querying

On 3/10/2014 12:52 PM, Marko Tiikkaja wrote:

On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:

I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object.

then why does querying it fail ?

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Alf Kristian Støyle
alf.kristian@gmail.com
In reply to: Marko Tiikkaja (#2)
Re: BUG #9519: Allows storing scalar json, but fails when querying

If the JSON datatype accepts JSON values (not just objects), which I
supposed is a valid approach, then shouldn't the JSON query operators be
able to deal with that?

This is the "unfortunate" part:

*select data->>'foo' from jtest; => ERROR: cannot extract element from a
scalar*
So either, only accept JSON object (not simple JSON values), or make the
JSON query operators work with JSON values as well.

Having to add an extra check constraint here is not obvious, imho.

For the record, we are working around this problem, through a small hack.
Some other system is storing this kind of data in our database. We have
reported a bug with them to fix it.

This is not a big problem for us, but we love Postgres, so we thought we
should report this.

Cheers,
Alf

On 10 March 2014 20:52, Marko Tiikkaja <marko@joh.to> wrote:

Show quoted text

On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:

I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object. If you only want to
store JSON objects in the table, consider using a CHECK constraint.

Regards,
Marko Tiikkaja

#5John R Pierce
pierce@hogranch.com
In reply to: Alf Kristian Støyle (#4)
Re: BUG #9519: Allows storing scalar json, but fails when querying

On 3/10/2014 1:39 PM, Alf Kristian St�yle wrote:

This is the "unfortunate" part:
/select data->>'foo' from jtest;
=> ERROR: cannot extract element from a scalar
/
So either, only accept JSON object (not simple JSON values), or make
the JSON query operators work with JSON values as well.

the json ->> 'fieldname' operator has no meaning when applied to a
scalar value. what would you suggest it should do? return a NULL ?

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Alf Kristian Støyle (#4)
Re: BUG #9519: Allows storing scalar json, but fails when querying

alf.kristian wrote

If the JSON datatype accepts JSON values (not just objects), which I
supposed is a valid approach, then shouldn't the JSON query operators be
able to deal with that?

This is the "unfortunate" part:

*select data->>'foo' from jtest; => ERROR: cannot extract element from a
scalar*
So either, only accept JSON object (not simple JSON values), or make the
JSON query operators work with JSON values as well.

Having to add an extra check constraint here is not obvious, imho.

For the record, we are working around this problem, through a small hack.
Some other system is storing this kind of data in our database. We have
reported a bug with them to fix it.

This is not a big problem for us, but we love Postgres, so we thought we
should report this.

Cheers,
Alf

On 10 March 2014 20:52, Marko Tiikkaja &lt;

marko@

&gt; wrote:

On 3/10/14, 4:09 PM,

alf.kristian@

wrote:

I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object. If you only want to
store JSON objects in the table, consider using a CHECK constraint.

Regards,
Marko Tiikkaja

Not really sure how robust you expect the system to be in the face of
polymorphic columns.

SELECT ('["a","b","c"]'::json)->>'not_a_key' -- ERROR: cannot extract field
from a non-object

The system supposes that, at a structural level, you are dealing with
column-consistent data and so if you ask for something that does not make
sense (i.e., an object key when you have an array or a scalar) it will warn
you.

I guess, in theory, any de-referencing that does not find a valid target
could return NULL...though I'm not sure that is an improvement.

This is a relational database and so it is expected that a column defines a
single thing and that thing naturally can be one-of a scalar, object, or
array. Given that underlying assumption - though likely never truly spelled
out anywhere in the documentation - allowing and then throwing a run-time
error when the specific sub-type of json does not match the given operator
makes sense.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795449.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7David G. Johnston
david.g.johnston@gmail.com
In reply to: John R Pierce (#5)
Re: BUG #9519: Allows storing scalar json, but fails when querying

John R Pierce wrote

On 3/10/2014 1:39 PM, Alf Kristian Støyle wrote:

This is the "unfortunate" part:
/select data->>'foo' from jtest;
=> ERROR: cannot extract element from a scalar
/
So either, only accept JSON object (not simple JSON values), or make
the JSON query operators work with JSON values as well.

the json ->> 'fieldname' operator has no meaning when applied to a
scalar value. what would you suggest it should do? return a NULL ?

"key does not exist" is represented by NULL if the operator is applied to an
object so, in theory, since the key does not exist in a scalar the same
value - NULL - would not be unexpected.

From a theory perspective, and based upon typical database normalization
rules, the current behavior makes the most sense but it does force the user
to be consistent in defining of their JSON models - a little big-brother-ish
but if you are only going to support a single set of rules the more-strict
ones are generally preferable.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795451.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#8Jeff Janes
jeff.janes@gmail.com
In reply to: Alf Kristian Støyle (#1)
Re: BUG #9519: Allows storing scalar json, but fails when querying

On Mon, Mar 10, 2014 at 8:09 AM, <alf.kristian@gmail.com> wrote:

The following bug has been logged on the website:

Bug reference: 9519
Logged by: Alf Kristian Støyle
Email address: alf.kristian@gmail.com
PostgreSQL version: 9.3.2
Operating system: Red Hat 4.6.3-2
Description:

Steps to reproduce:
create table jtest (data json);
=> CREATE TABLE

insert into jtest (data) values ('1');
=> INSERT 0 1

select data->>'foo' from jtest;
=> ERROR: cannot extract element from a scalar

I think the insert should fail, since '1' is not valid JSON.

After the data is in the database every query using the ->> operator,
hitting the row containing '1' will fail.

Lets say the value was instead {"a":1}.

Now every query using data->'a'->>'b' will fail when it hits that row.

So forbidding values does not fix the problem, it just moves it down a
level.

A possible solution is to make ->> return NULL (like it does for accessing
values of non-existent keys) rather than raise an error when used on a
scalar. Whether this would be an improvement, I don't know.

Note that the construct:
data #> '{a,b}'
does return null in this case, and does not raise an error. You could
argue that that is an inconsistency. On the other hand, you could argue it
provides you with the flexibility to accomplish different things depending
on which you desire.

So if you want the NULL behavior, you could use this to get it:

data #>> '{foo}'

Cheers,

Jeff

#9Christian Kruse
christian@2ndquadrant.com
In reply to: David G. Johnston (#6)
Re: BUG #9519: Allows storing scalar json, but fails when querying

Hi,

On 10/03/14 14:09, David Johnston wrote:

The system supposes that, at a structural level, you are dealing with
column-consistent data and so if you ask for something that does not make
sense (i.e., an object key when you have an array or a scalar) it will warn
you.

Looking at the common usage of JSON it doesn't seem to be a good idea
to imply column-consistent JSON. Normally when JSON is used the
application ensures consistency and when a non-existent key is
accessed NULL is returned. I think this is expected behavior and we
should do it like that, too; if one wants it different, he should use
CHECK constraints.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10Alf Kristian Støyle
alf.kristian@gmail.com
In reply to: Jeff Janes (#8)
Re: BUG #9519: Allows storing scalar json, but fails when querying

The #> operator works in SELECT, e.g. does not fail on when JSON column
contains JSON values. Thanks for the tip!

However when using it in the WHERE clause I get no result.

select * from jtest;
data
-------------
1
1
{"a" : "b"}
(3 rows)

select data #> '{"a"}' from jtest;
?column?
----------

"b"
(3 rows)

select data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
ERROR: operator does not exist: json = unknown
LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
?column?
----------
(0 rows)

Am I doing a wrong conversion here, or is something else going on? If the
data in the database did not contain scalar values, then ->> works fine in
WHERE. The following is almost the query we are actually trying run
(checking for existence):

select data->>'a' from jtest where data->>'a' = 'b';
?column?
----------
b
(1 row)

Regarding the ->> operator, I think it is unfortunate behavior it fails
like that, I suppose we were expecting NULL behavior. However we are
working around this, so if you don't think this should change, then we are
fine with that :)

Just a note though. It took us a while to track down the problem. We have a
table with several million rows, and suddenly our queries started failing,
since someone had started to insert scalars. Others might also struggle to
figure out what is wrong if they bump into this behavior.

Cheers,
Alf

On 10 March 2014 22:42, Jeff Janes <jeff.janes@gmail.com> wrote:

Show quoted text

On Mon, Mar 10, 2014 at 8:09 AM, <alf.kristian@gmail.com> wrote:

The following bug has been logged on the website:

Bug reference: 9519
Logged by: Alf Kristian Støyle
Email address: alf.kristian@gmail.com
PostgreSQL version: 9.3.2
Operating system: Red Hat 4.6.3-2
Description:

Steps to reproduce:
create table jtest (data json);
=> CREATE TABLE

insert into jtest (data) values ('1');
=> INSERT 0 1

select data->>'foo' from jtest;
=> ERROR: cannot extract element from a scalar

I think the insert should fail, since '1' is not valid JSON.

After the data is in the database every query using the ->> operator,
hitting the row containing '1' will fail.

Lets say the value was instead {"a":1}.

Now every query using data->'a'->>'b' will fail when it hits that row.

So forbidding values does not fix the problem, it just moves it down a
level.

A possible solution is to make ->> return NULL (like it does for accessing
values of non-existent keys) rather than raise an error when used on a
scalar. Whether this would be an improvement, I don't know.

Note that the construct:
data #> '{a,b}'
does return null in this case, and does not raise an error. You could
argue that that is an inconsistency. On the other hand, you could argue it
provides you with the flexibility to accomplish different things depending
on which you desire.

So if you want the NULL behavior, you could use this to get it:

data #>> '{foo}'

Cheers,

Jeff

#11David G. Johnston
david.g.johnston@gmail.com
In reply to: Christian Kruse (#9)
Re: BUG #9519: Allows storing scalar json, but fails when querying

Christian Kruse-4 wrote

Hi,

On 10/03/14 14:09, David Johnston wrote:

The system supposes that, at a structural level, you are dealing with
column-consistent data and so if you ask for something that does not make
sense (i.e., an object key when you have an array or a scalar) it will
warn
you.

Looking at the common usage of JSON it doesn't seem to be a good idea
to imply column-consistent JSON. Normally when JSON is used the
application ensures consistency and when a non-existent key is
accessed NULL is returned. I think this is expected behavior and we
should do it like that, too; if one wants it different, he should use
CHECK constraints.

The database is the application. It's job is to store data so that,
multiple, other applications can access it and in such a way that those
applications do not have to enforce data integrity since the database has
taken care of that responsibility.

I can see it both ways and so maybe a set of null-returning functions and
operators need to be made standard so the user can choose which behavior is
desired. I cannot see changing the behavior of the current operators since
these are in the wild - though turning errors into non-errors isn't really a
risk since no existing code can reasonably be said to rely on such behavior.
The advantage of avoiding data-specific errors is probably worth it. It
makes finding problems harder, not easier, to locate if one doesn't make
liberal use of check constraints but personally I'd rather have some wrong
data interspersed with the good data instead of throwing out al the data
because one single value is bad - a value that may never make it to the
final result anyway.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795536.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#12David G. Johnston
david.g.johnston@gmail.com
In reply to: Alf Kristian Støyle (#10)
Re: BUG #9519: Allows storing scalar json, but fails when querying

Alf Kristian Støyle wrote

select data #> '{"a"}' from jtest;
?column?
----------

"b"
(3 rows)

select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
?column?
----------
(0 rows)

Am I doing a wrong conversion here, or is something else going on? If the
data in the database did not contain scalar values, then ->> works fine in
WHERE. The following is almost the query we are actually trying run
(checking for existence):

Side not - please follow the example set by others and trim your quoting and
bottom-post.

As to your comment - the example above shows that the where clause is
evaluating to:

Where '"b"' = 'b' -- which is false.

Maybe try #>> which is defined to return text instead of an object.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795538.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#13Jeff Janes
jeff.janes@gmail.com
In reply to: Alf Kristian Støyle (#10)
Re: BUG #9519: Allows storing scalar json, but fails when querying

On Tue, Mar 11, 2014 at 1:32 AM, Alf Kristian Støyle <alf.kristian@gmail.com

wrote:

select data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
ERROR: operator does not exist: json = unknown
LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

"#>>" returns text directly, just like ->> vs ->.

select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
?column?
----------
(0 rows)

Am I doing a wrong conversion here, or is something else going on?

If you put the (data #> '{"a"}')::text construct in the select, you can
see what is going on. pulling out the element as JSON quotes the value (if
it not a number), because that is what JSON values are supposed to be, and
then converting to text leaves the quotes in place. So you are comparing
the 3 character '"b"' to the one character string 'b', and they are not
equal. If you use #>>, it is pulled out as text in the first place and the
quotes are not put on there.

If the data in the database did not contain scalar values, then ->> works
fine in WHERE. The following is almost the query we are actually trying run
(checking for existence):

select data->>'a' from jtest where data->>'a' = 'b';
?column?
----------
b
(1 row)

Regarding the ->> operator, I think it is unfortunate behavior it fails
like that, I suppose we were expecting NULL behavior. However we are
working around this, so if you don't think this should change, then we are
fine with that :)

I don't really have an opinion on that, it just isn't obvious which way is
better--I can see times I would want either one. There are people who have
thought about this much more deeply than I have, but they haven't shown up
on this thread yet. (I think they are too busy over on the hackers list,
arguing over what behavior the next generation of json operators should
have.)

You can create a new operator with the behavior you want. I would like
some simple notation one could add to an operator or function invocation
which means "catch errors and convert to null", as I have several plperl
functions which I have created in two forms, one with an eval block and one
without. It would be nice to have one function with a run-time notation to
distinguish the behavior.

Cheers,

Jeff

#14Alf Kristian Støyle
alf.kristian@gmail.com
In reply to: Jeff Janes (#13)
Re: BUG #9519: Allows storing scalar json, but fails when querying

Thank you for your help, and thorough explanation.

Creating a new operator, is way out of my league, and we now have a good
way to work around our problems. So we are very pleased :)

I suppose if no one else complains, this shouldn't be fixed/changed, and
you shouldn't spend time creating a new operator.

Thanks again. Cheers,
Alf

On 11 March 2014 17:03, Jeff Janes <jeff.janes@gmail.com> wrote:

Show quoted text

On Tue, Mar 11, 2014 at 1:32 AM, Alf Kristian Støyle <
alf.kristian@gmail.com> wrote:

select data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
ERROR: operator does not exist: json = unknown
LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

"#>>" returns text directly, just like ->> vs ->.

select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
?column?
----------
(0 rows)

Am I doing a wrong conversion here, or is something else going on?

If you put the (data #> '{"a"}')::text construct in the select, you can
see what is going on. pulling out the element as JSON quotes the value (if
it not a number), because that is what JSON values are supposed to be, and
then converting to text leaves the quotes in place. So you are comparing
the 3 character '"b"' to the one character string 'b', and they are not
equal. If you use #>>, it is pulled out as text in the first place and the
quotes are not put on there.

If the data in the database did not contain scalar values, then ->> works
fine in WHERE. The following is almost the query we are actually trying run
(checking for existence):

select data->>'a' from jtest where data->>'a' = 'b';
?column?
----------
b
(1 row)

Regarding the ->> operator, I think it is unfortunate behavior it fails
like that, I suppose we were expecting NULL behavior. However we are
working around this, so if you don't think this should change, then we are
fine with that :)

I don't really have an opinion on that, it just isn't obvious which way is
better--I can see times I would want either one. There are people who have
thought about this much more deeply than I have, but they haven't shown up
on this thread yet. (I think they are too busy over on the hackers list,
arguing over what behavior the next generation of json operators should
have.)

You can create a new operator with the behavior you want. I would like
some simple notation one could add to an operator or function invocation
which means "catch errors and convert to null", as I have several plperl
functions which I have created in two forms, one with an eval block and one
without. It would be nice to have one function with a run-time notation to
distinguish the behavior.

Cheers,

Jeff