Removing null bytes from a json column

Started by Timothy Garnettalmost 9 years ago2 messagesgeneral
Jump to latest
#1Timothy Garnett
tgarnett@panjiva.com

Does anyone have some tips on how to deal with an existing json type column
that has some null bytes ( \u0000) in it? It seems like anything I do that
touches any row with a null byte just errors. I'd love to just remove them
if I could find some way to find them, but I'm having trouble even figuring
out how to do that.

This error I get is always:

PG::UntranslatableCharacter: ERROR: unsupported Unicode escape sequence
DETAIL: \u0000 cannot be converted to text.
CONTEXT: JSON data, line 1:
...st_name":"efxkerbs","company":"efxkerbs","email":...

Convert the column to jsonb -> errors.

Trying to find one of the offending rows:
select id from xxx where command->'args'->2->>'company' = 'efxkerbs';
- same error on \u0000
select id from issued_crm_commands where command->'args'->2->'company' =
'"efxkerbs"':json;
- no equality operator between json
select id from issued_crm_commands where
command->'args'->2->'company'::bytea = 'efxkerbs'::bytea;
- no conversion from json to bytea

Any ideas on how to find rows with a \u0000 in the json?

Tim

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Timothy Garnett (#1)
Re: Removing null bytes from a json column

Timothy Garnett <tgarnett@panjiva.com> writes:

Does anyone have some tips on how to deal with an existing json type column
that has some null bytes ( \u0000) in it? It seems like anything I do that
touches any row with a null byte just errors. I'd love to just remove them
if I could find some way to find them, but I'm having trouble even figuring
out how to do that.

Doesn't it work to cast to text and do a LIKE or regex search?

regression=# select '{"z":"\u0000"}'::json::text ~ '\\u0000';
?column?
----------
t
(1 row)

regression=# select '{"z":"\u0001"}'::json::text ~ '\\u0000';
?column?
----------
f
(1 row)

It's true that we won't let you cast such a value to JSONB or do any
exciting JSON-ish manipulations on it, but I'm not seeing an error
in cast-to-text.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general