json_strip_nulls()

Started by Erwin Brandstetterabout 4 years ago5 messagesdocs
Jump to latest
#1Erwin Brandstetter
brsaweda@gmail.com

The manual says this about json_strip_nulls()[1]https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE:

Deletes all object fields that have null values from the given JSON

value, recursively. Null values that are not object fields are untouched.

[1]: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE
https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE

But the function also strips all insignificant white space:

test=> SELECT json_strip_nulls(json '{"a": 1 ,
test'> "foo" : "bar"
test'> }');
json_strip_nulls
---------------------
{"a":1,"foo":"bar"}

This is a useful feature to trim noise from json values, but unreliable
while undocumented. So let's document the behavior:

*Deletes all object fields that have null values from the given JSON value,
recursively. Null values that are not object fields are untouched.
json_strip_nulls additionally removes all insignificant white space.*

If that's undesirable, let's remove the functionality - and provide a
dedicated function for the task.
I found similar (unresolved) considerations here:

/messages/by-id/20160610110633.GG18838@zip.com.au

There were related requests on Stackoverflow:
-
https://stackoverflow.com/questions/27505181/fetching-compact-version-of-jsonb-in-postgresql
-
https://stackoverflow.com/questions/70813106/jsonb-cast-to-text-without-sporious-spaces

Regards
Erwin

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Erwin Brandstetter (#1)
Re: json_strip_nulls()

On Sat, Jan 22, 2022 at 12:11 PM Erwin Brandstetter <brsaweda@gmail.com>
wrote:

But the function also strips all insignificant white space:

test=> SELECT json_strip_nulls(json '{"a": 1 ,
test'> "foo" : "bar"
test'> }');
json_strip_nulls
---------------------
{"a":1,"foo":"bar"}

This is a useful feature to trim noise from json values, but unreliable
while undocumented. So let's document the behavior:

json_strip_nulls doesn't make any promise regarding its output json other
than that it is valid. Since we are munging the json we are arguably
within our rights to output whatever transformed version we want. The
format should not be documented.

If that's undesirable, let's remove the functionality - and provide a
dedicated function for the task.

I agree we should at least provide a function that takes a json or jsonb
and outputs its text representation in a minimalist form.

I found similar (unresolved) considerations here:

/messages/by-id/20160610110633.GG18838@zip.com.au

This whole situation went through a fairly lengthy discussion back in 2016:

/messages/by-id/CAH7T-ap6R_xzWz98c6AQzQuGsK_vpgr-et4VRaSjgxqom--ibw@mail.gmail.com

It's an interesting, and IMO, disappointing thread. Maybe we can do better
now and focus on one missing capability the community desires and actually
get something committed.

David J.

#3Erwin Brandstetter
brsaweda@gmail.com
In reply to: David G. Johnston (#2)
Re: json_strip_nulls()

On Sat, 22 Jan 2022 at 20:31, David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Sat, Jan 22, 2022 at 12:11 PM Erwin Brandstetter <brsaweda@gmail.com>
wrote:

But the function also strips all insignificant white space:
[...]
This is a useful feature to trim noise from json values, but unreliable
while undocumented. So let's document the behavior:

json_strip_nulls doesn't make any promise regarding its output json other
than that it is valid. Since we are munging the json we are arguably
within our rights to output whatever transformed version we want. The
format should not be documented.

Within our rights, maybe. The manual makes related promises[1]https://www.postgresql.org/docs/current/datatype-json.html:

Because the json type stores an exact copy of the input text, it will

preserve semantically-insignificant white space between tokens

And[2]https://www.postgresql.org/docs/current/datatype-json.html#JSON-KEYS-ELEMENTS:

As previously stated, when a JSON value is input and then printed without

any additional processing, json outputs the same text that was input,

Not strictly contradicting, but the current behavior of json_strip_nulls()
is still surprising. Either the input should be preserved as far as
possible or, failing that, the actual behavior documented.

[1]: https://www.postgresql.org/docs/current/datatype-json.html
[2]: https://www.postgresql.org/docs/current/datatype-json.html#JSON-KEYS-ELEMENTS
https://www.postgresql.org/docs/current/datatype-json.html#JSON-KEYS-ELEMENTS

If that's undesirable, let's remove the functionality - and provide a
dedicated function for the task.

I agree we should at least provide a function that takes a json or jsonb
and outputs its text representation in a minimalist form.

I found similar (unresolved) considerations here:

/messages/by-id/20160610110633.GG18838@zip.com.au

This whole situation went through a fairly lengthy discussion back in 2016:

/messages/by-id/CAH7T-ap6R_xzWz98c6AQzQuGsK_vpgr-et4VRaSjgxqom--ibw@mail.gmail.com

It's an interesting, and IMO, disappointing thread. Maybe we can do
better now and focus on one missing capability the community desires and
actually get something committed.

That would address the main issue here: to have a function doing that
reliably.

Regards
Erwin

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erwin Brandstetter (#3)
Re: json_strip_nulls()

Erwin Brandstetter <brsaweda@gmail.com> writes:

On Sat, 22 Jan 2022 at 20:31, David G. Johnston <david.g.johnston@gmail.com>
wrote:

json_strip_nulls doesn't make any promise regarding its output json other
than that it is valid. Since we are munging the json we are arguably
within our rights to output whatever transformed version we want. The
format should not be documented.

Within our rights, maybe. The manual makes related promises[1]:

Because the json type stores an exact copy of the input text, it will
preserve semantically-insignificant white space between tokens

And[2]:

As previously stated, when a JSON value is input and then printed without
any additional processing, json outputs the same text that was input,

"Without any additional processing" is the key restriction there.

Not strictly contradicting, but the current behavior of json_strip_nulls()
is still surprising. Either the input should be preserved as far as
possible or, failing that, the actual behavior documented.

It is documented --- you just quoted the text that does so.

I don't have a lot of sympathy for "JSON-reading" code that fails to
conform to the JSON RFC, so I'm disinclined to work harder than that.

regards, tom lane

#5Erwin Brandstetter
brsaweda@gmail.com
In reply to: Tom Lane (#4)
Re: json_strip_nulls()

On Sat, 22 Jan 2022 at 22:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erwin Brandstetter <brsaweda@gmail.com> writes:

On Sat, 22 Jan 2022 at 20:31, David G. Johnston <

david.g.johnston@gmail.com>

wrote:

json_strip_nulls doesn't make any promise regarding its output json

other

than that it is valid. Since we are munging the json we are arguably
within our rights to output whatever transformed version we want. The
format should not be documented.

Within our rights, maybe. The manual makes related promises[1]:

Because the json type stores an exact copy of the input text, it will
preserve semantically-insignificant white space between tokens

And[2]:

As previously stated, when a JSON value is input and then printed

without

any additional processing, json outputs the same text that was input,

"Without any additional processing" is the key restriction there.

Not strictly contradicting, but the current behavior of

json_strip_nulls()

is still surprising. Either the input should be preserved as far as
possible or, failing that, the actual behavior documented.

It is documented --- you just quoted the text that does so.

I don't have a lot of sympathy for "JSON-reading" code that fails to
conform to the JSON RFC, so I'm disinclined to work harder than that.

I suggest to clarify the behavior of json_strip_nulls() in the manual: that
it also strips insignificant white space. If that may change in future
versions, also say so. People are starting to use json_strip_nulls() for
the purpose (and may regret it later):
https://stackoverflow.com/questions/27505181/fetching-compact-version-of-jsonb-in-postgresql/56842519#56842519

If (like I assume) json_strip_nulls() should not be relied upon to strip
whitespace, it would be great to have a separate, dedicated function for
that. That's independent from the first suggestion.

Regards
Erwin