json/jsonb inconsistence

Started by Teodor Sigaevover 11 years ago3 messages
#1Teodor Sigaev
teodor@sigaev.ru

# select '"\uaBcD"'::json;
json
----------
"\uaBcD"

but

# select '"\uaBcD"'::jsonb;
ERROR: invalid input syntax for type json
LINE 1: select '"\uaBcD"'::jsonb;
^
DETAIL: Unicode escape values cannot be used for code point values above 007F
when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: ...

and

# select '"\uaBcD"'::json -> 0;
ERROR: invalid input syntax for type json
DETAIL: Unicode escape values cannot be used for code point values above 007F
when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: ...
Time: 0,696 ms

More than, json looks strange:

# select '["\uaBcD"]'::json;
json
------------
["\uaBcD"]

but

# select '["\uaBcD"]'::json->0;
ERROR: invalid input syntax for type json
DETAIL: Unicode escape values cannot be used for code point values above 007F
when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: [...

Looks like random parse rules.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Teodor Sigaev (#1)
Re: json/jsonb inconsistence

On 05/29/2014 07:55 AM, Teodor Sigaev wrote:

# select '"\uaBcD"'::json;
json
----------
"\uaBcD"

but

# select '"\uaBcD"'::jsonb;
ERROR: invalid input syntax for type json
LINE 1: select '"\uaBcD"'::jsonb;
^
DETAIL: Unicode escape values cannot be used for code point values
above 007F when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: ...

and

# select '"\uaBcD"'::json -> 0;
ERROR: invalid input syntax for type json
DETAIL: Unicode escape values cannot be used for code point values
above 007F when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: ...
Time: 0,696 ms

More than, json looks strange:

# select '["\uaBcD"]'::json;
json
------------
["\uaBcD"]

but

# select '["\uaBcD"]'::json->0;
ERROR: invalid input syntax for type json
DETAIL: Unicode escape values cannot be used for code point values
above 007F when the server encoding is not UTF8.
CONTEXT: JSON data, line 1: [...

Looks like random parse rules.

It is documented that for json we don't check the validity of unicode
escapes until we try to use them. That's because the original json
parser didn't check them, and if we started doing so now users who
pg_upgraded would find themselves with invalid data in the database. The
rules for jsonb are more strict, because we actually resolve the unicode
escapes when constructing the jsonb object. There is nothing at all
random about it, although I agree it's slightly inconsistent. It's been
the subject of some discussion on -hackers previously, IIRC. I actually
referred to this difference in my talk at pgCon last Friday.

Frankly, if you want to use json/jsonb, you are going to be best served
by using a UTF8-encoded database, or not using non-ascii unicode escapes
in json at all.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Teodor Sigaev
teodor@sigaev.ru
In reply to: Andrew Dunstan (#2)
Re: json/jsonb inconsistence

inconsistent. It's been the subject of some discussion on -hackers previously,
IIRC. I actually referred to this difference in my talk at pgCon last Friday.

I see, and I wasn't on your talk :( I'm playing around jsquery and now trying to
implement UTF escapes there.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers