why are null bytes allowed in JSON columns?
Hi,
I am not sure if this is a bug or a known inconvenience. First create a
table with a JSON column:
create table test(t json);
insert into test(t) values ('{"foo": "bar"}');
select * from test where t->>'foo' = 'bar';
-- t
-- ----------------
-- {"foo": "bar"}
-- (1 row)
Now, insert a record with a null byte
insert into test(t) values ('{"foo\u0000": "bar"}');
select * from test where t->>'foo' = 'bar';
-- ERROR: unsupported Unicode escape sequence
-- DETAIL: \u0000 cannot be converted to text.
-- CONTEXT: JSON data, line 1: {...
insert into test(t) values ('{"foo\u0000": "bar"}');
select * from test where t->>'foo' = 'bar';
-- ERROR: unsupported Unicode escape sequence
-- DETAIL: \u0000 cannot be converted to text.
-- CONTEXT: JSON data, line 1: {...
Once the null byte is inserted the JSON operator ->> can no longer be
applied to the column. JSONB columns don't allow null bytes at all. Should
the same constraint apply on JSON? If not, applications must be vigilant to
guard against null bytes, or queries could break at read time. My reading
of table 8.23 in https://www.postgresql.org/docs/16/datatype-json.html is
they should be disallowed at insert.
On Thursday, September 21, 2023, Wyatt Alt <wyatt.alt@gmail.com> wrote:
I am not sure if this is a bug or a known inconvenience.
It’s a reason to not use json in new development and instead use jsonb.
Once the null byte is inserted the JSON operator ->> can no longer be
applied to the column. JSONB columns don't allow null bytes at all. Should
the same constraint apply on JSON? If not, applications must be vigilant to
guard against null bytes, or queries could break at read time. My reading
of table 8.23 in https://www.postgresql.org/docs/16/datatype-json.html is
they should be disallowed at insert.
That table basically describes how jsonb behaves, not json. I agree that
fact could be made clearer instead of having to deduce that from reading
the prose which explicitly says json is basically lacks safeties for any
usage within the database. But it also imposes fewer restrictions in
exchange.
David J.
Thanks for the clarification. I now see the "Note" above the table explains
this.
On 2023-09-21 Th 21:57, Wyatt Alt wrote:
Thanks for the clarification. I now see the "Note" above the table
explains this.
Essentially the json data type allows anything that is syntactically
valid according to the standard. That doesn't mean that functions
operating on the type can always work, and this is the most obvious of
such restrictions.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com