Question about duplicate JSONTYPE_JSON check

Started by Maciek Sakrejda10 months ago10 messages
#1Maciek Sakrejda
maciek@pganalyze.com

While exploring the jsonb code, I noticed that in
datum_to_jsonb_internal, the tcategory checks compares against
JSONTYPE_JSON twice. There's no reason for that, right?

...

Ok, so, to try to answer my own question, I went looking at the
history, and this comes from "Unify JSON categorize type API and
export for external use" [0]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3c152a27b06313fe27bd47079658f928e291986b. Specifically, the change was

-            (tcategory == JSONBTYPE_ARRAY ||
-             tcategory == JSONBTYPE_COMPOSITE ||
-             tcategory == JSONBTYPE_JSON ||
-             tcategory == JSONBTYPE_JSONB ||
-             tcategory == JSONBTYPE_JSONCAST))
+            (tcategory == JSONTYPE_ARRAY ||
+             tcategory == JSONTYPE_COMPOSITE ||
+             tcategory == JSONTYPE_JSON ||
+             tcategory == JSONTYPE_JSONB ||
+             tcategory == JSONTYPE_JSON))

So "JSONBTYPE_JSONCAST" turned into "JSONTYPE_JSON". Should that have
been "JSONTYPE_CAST" (that seems to be the corresponding value in the
new enum) instead?

Thanks,
Maciek

[0]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3c152a27b06313fe27bd47079658f928e291986b

#2Maciek Sakrejda
maciek@pganalyze.com
In reply to: Maciek Sakrejda (#1)
Re: Question about duplicate JSONTYPE_JSON check

I'm adding the author/committer and reviewer of 3c152a2, since I think
this may be a bug (my apologies if I'm misunderstanding this). See my
previous e-mail quoted below:

Show quoted text

On Mon, Mar 10, 2025 at 5:11 PM Maciek Sakrejda <maciek@pganalyze.com> wrote:

While exploring the jsonb code, I noticed that in
datum_to_jsonb_internal, the tcategory checks compares against
JSONTYPE_JSON twice. There's no reason for that, right?

...

Ok, so, to try to answer my own question, I went looking at the
history, and this comes from "Unify JSON categorize type API and
export for external use" [0]. Specifically, the change was

-            (tcategory == JSONBTYPE_ARRAY ||
-             tcategory == JSONBTYPE_COMPOSITE ||
-             tcategory == JSONBTYPE_JSON ||
-             tcategory == JSONBTYPE_JSONB ||
-             tcategory == JSONBTYPE_JSONCAST))
+            (tcategory == JSONTYPE_ARRAY ||
+             tcategory == JSONTYPE_COMPOSITE ||
+             tcategory == JSONTYPE_JSON ||
+             tcategory == JSONTYPE_JSONB ||
+             tcategory == JSONTYPE_JSON))

So "JSONBTYPE_JSONCAST" turned into "JSONTYPE_JSON". Should that have
been "JSONTYPE_CAST" (that seems to be the corresponding value in the
new enum) instead?

Thanks,
Maciek

[0]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3c152a27b06313fe27bd47079658f928e291986b

#3Tender Wang
tndrwang@gmail.com
In reply to: Maciek Sakrejda (#1)
Re: Question about duplicate JSONTYPE_JSON check

Maciek Sakrejda <maciek@pganalyze.com> 于2025年3月11日周二 08:12写道:

While exploring the jsonb code, I noticed that in
datum_to_jsonb_internal, the tcategory checks compares against
JSONTYPE_JSON twice. There's no reason for that, right?

Yeah, the second JSONTYPE_JSON seems redundant.

...

Ok, so, to try to answer my own question, I went looking at the
history, and this comes from "Unify JSON categorize type API and
export for external use" [0]. Specifically, the change was

-            (tcategory == JSONBTYPE_ARRAY ||
-             tcategory == JSONBTYPE_COMPOSITE ||
-             tcategory == JSONBTYPE_JSON ||
-             tcategory == JSONBTYPE_JSONB ||
-             tcategory == JSONBTYPE_JSONCAST))
+            (tcategory == JSONTYPE_ARRAY ||
+             tcategory == JSONTYPE_COMPOSITE ||
+             tcategory == JSONTYPE_JSON ||
+             tcategory == JSONTYPE_JSONB ||
+             tcategory == JSONTYPE_JSON))

So "JSONBTYPE_JSONCAST" turned into "JSONTYPE_JSON". Should that have
been "JSONTYPE_CAST" (that seems to be the corresponding value in the
new enum) instead?

The below else branch has code if (tcategory == JSONTYPE_CAST). I guess
here the
second JSONTYPE_JSON may just be removed.
@Amit Langote <amitlangote09@gmail.com> please check out this.

--
Thanks,
Tender Wang

#4Amit Langote
amitlangote09@gmail.com
In reply to: Tender Wang (#3)
Re: Question about duplicate JSONTYPE_JSON check

On Wed, Mar 12, 2025 at 10:00 AM Tender Wang <tndrwang@gmail.com> wrote:

Maciek Sakrejda <maciek@pganalyze.com> 于2025年3月11日周二 08:12写道:

While exploring the jsonb code, I noticed that in
datum_to_jsonb_internal, the tcategory checks compares against
JSONTYPE_JSON twice. There's no reason for that, right?

Yeah, the second JSONTYPE_JSON seems redundant.

Ok, so, to try to answer my own question, I went looking at the
history, and this comes from "Unify JSON categorize type API and
export for external use" [0]. Specifically, the change was

-            (tcategory == JSONBTYPE_ARRAY ||
-             tcategory == JSONBTYPE_COMPOSITE ||
-             tcategory == JSONBTYPE_JSON ||
-             tcategory == JSONBTYPE_JSONB ||
-             tcategory == JSONBTYPE_JSONCAST))
+            (tcategory == JSONTYPE_ARRAY ||
+             tcategory == JSONTYPE_COMPOSITE ||
+             tcategory == JSONTYPE_JSON ||
+             tcategory == JSONTYPE_JSONB ||
+             tcategory == JSONTYPE_JSON))

So "JSONBTYPE_JSONCAST" turned into "JSONTYPE_JSON". Should that have
been "JSONTYPE_CAST" (that seems to be the corresponding value in the
new enum) instead?

The below else branch has code if (tcategory == JSONTYPE_CAST). I guess here the
second JSONTYPE_JSON may just be removed.
@Amit Langote please check out this.

Looks like a copy-paste bug on my part. Will fix, thanks for the report.

--
Thanks, Amit Langote

#5Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#4)
1 attachment(s)
Re: Question about duplicate JSONTYPE_JSON check

On Wed, Mar 12, 2025 at 12:07 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Mar 12, 2025 at 10:00 AM Tender Wang <tndrwang@gmail.com> wrote:

Maciek Sakrejda <maciek@pganalyze.com> 于2025年3月11日周二 08:12写道:

While exploring the jsonb code, I noticed that in
datum_to_jsonb_internal, the tcategory checks compares against
JSONTYPE_JSON twice. There's no reason for that, right?

Yeah, the second JSONTYPE_JSON seems redundant.

Ok, so, to try to answer my own question, I went looking at the
history, and this comes from "Unify JSON categorize type API and
export for external use" [0]. Specifically, the change was

-            (tcategory == JSONBTYPE_ARRAY ||
-             tcategory == JSONBTYPE_COMPOSITE ||
-             tcategory == JSONBTYPE_JSON ||
-             tcategory == JSONBTYPE_JSONB ||
-             tcategory == JSONBTYPE_JSONCAST))
+            (tcategory == JSONTYPE_ARRAY ||
+             tcategory == JSONTYPE_COMPOSITE ||
+             tcategory == JSONTYPE_JSON ||
+             tcategory == JSONTYPE_JSONB ||
+             tcategory == JSONTYPE_JSON))

So "JSONBTYPE_JSONCAST" turned into "JSONTYPE_JSON". Should that have
been "JSONTYPE_CAST" (that seems to be the corresponding value in the
new enum) instead?

The below else branch has code if (tcategory == JSONTYPE_CAST). I guess here the
second JSONTYPE_JSON may just be removed.
@Amit Langote please check out this.

Looks like a copy-paste bug on my part. Will fix, thanks for the report.

I was able to construct a test case that crashes due to this bug:

CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');
CREATE FUNCTION mood_to_json(mood) RETURNS json AS $$
SELECT to_json($1::text);
$$ LANGUAGE sql IMMUTABLE;
CREATE CAST (mood AS json) WITH FUNCTION mood_to_json(mood) AS IMPLICIT;

SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Attached patch adds the one-line fix and the above test case.

Will push tomorrow.

--
Thanks, Amit Langote

Attachments:

v1-0001-Fix-copy-paste-error-in-datum_to_jsonb_internal.patchapplication/octet-stream; name=v1-0001-Fix-copy-paste-error-in-datum_to_jsonb_internal.patchDownload
From 7ee4f13bad9fd366aed1fb29ea947b974539eea4 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Wed, 12 Mar 2025 16:56:00 +0900
Subject: [PATCH v1] Fix copy-paste error in datum_to_jsonb_internal()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit 3c152a27b06 mistakenly repeated JSONTYPE_JSON in a condition,
omitting JSONTYPE_CAST. As a result, datum_to_jsonb_internal() failed
to reject inputs that were casts (e.g., from an enum to json as in the
example below) when used as keys in JSON constructors.

This led to a crash in cases like:

  SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);

where 'happy'::mood is implicitly cast to json. The missing check
meant such casted values weren’t properly rejected as invalid
(non-scalar) JSON keys.

Reported-by: Maciek Sakrejda <maciek@pganalyze.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry%2BXrq-ghjcejBRhcRMzWZvbd__QdgJA%40mail.gmail.com
Backpatch-through: 17
---
 src/backend/utils/adt/jsonb.c         |  2 +-
 src/test/regress/expected/sqljson.out | 12 ++++++++++++
 src/test/regress/sql/sqljson.sql      | 11 +++++++++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 8394a20e0e5..da94d424d61 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -657,7 +657,7 @@ datum_to_jsonb_internal(Datum val, bool is_null, JsonbInState *result,
 			  tcategory == JSONTYPE_COMPOSITE ||
 			  tcategory == JSONTYPE_JSON ||
 			  tcategory == JSONTYPE_JSONB ||
-			  tcategory == JSONTYPE_JSON))
+			  tcategory == JSONTYPE_CAST))
 	{
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/test/regress/expected/sqljson.out b/src/test/regress/expected/sqljson.out
index 7c3e673e5ea..eb320f003f5 100644
--- a/src/test/regress/expected/sqljson.out
+++ b/src/test/regress/expected/sqljson.out
@@ -573,6 +573,18 @@ SELECT JSON_OBJECT(1: 1, '2': NULL, '3': 1, 4: NULL, '5': 'a' ABSENT ON NULL WIT
  {"1": 1, "3": 1, "5": "a"}
 (1 row)
 
+-- BUG: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry%2BXrq-ghjcejBRhcRMzWZvbd__QdgJA%40mail.gmail.com
+-- datum_to_jsonb_internal() didn't catch keys that are casts instead of a simple scalar
+CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');
+CREATE FUNCTION mood_to_json(mood) RETURNS json AS $$
+  SELECT to_json($1::text);
+$$ LANGUAGE sql IMMUTABLE;
+CREATE CAST (mood AS json) WITH FUNCTION mood_to_json(mood) AS IMPLICIT;
+SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
+ERROR:  key value must be scalar, not array, composite, or json
+DROP CAST (mood AS json);
+DROP FUNCTION mood_to_json;
+DROP TYPE mood;
 -- JSON_ARRAY()
 SELECT JSON_ARRAY();
  json_array 
diff --git a/src/test/regress/sql/sqljson.sql b/src/test/regress/sql/sqljson.sql
index 9b5d82602d0..3fd6ac260b8 100644
--- a/src/test/regress/sql/sqljson.sql
+++ b/src/test/regress/sql/sqljson.sql
@@ -152,6 +152,17 @@ SELECT JSON_OBJECT(1: 1, '2': NULL, '1': 1 ABSENT ON NULL WITH UNIQUE RETURNING
 SELECT JSON_OBJECT(1: 1, '2': NULL, '1': 1 ABSENT ON NULL WITHOUT UNIQUE RETURNING jsonb);
 SELECT JSON_OBJECT(1: 1, '2': NULL, '3': 1, 4: NULL, '5': 'a' ABSENT ON NULL WITH UNIQUE RETURNING jsonb);
 
+-- BUG: https://postgr.es/m/CADXhmgTJtJZK9A3Na_ry%2BXrq-ghjcejBRhcRMzWZvbd__QdgJA%40mail.gmail.com
+-- datum_to_jsonb_internal() didn't catch keys that are casts instead of a simple scalar
+CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');
+CREATE FUNCTION mood_to_json(mood) RETURNS json AS $$
+  SELECT to_json($1::text);
+$$ LANGUAGE sql IMMUTABLE;
+CREATE CAST (mood AS json) WITH FUNCTION mood_to_json(mood) AS IMPLICIT;
+SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
+DROP CAST (mood AS json);
+DROP FUNCTION mood_to_json;
+DROP TYPE mood;
 
 -- JSON_ARRAY()
 SELECT JSON_ARRAY();
-- 
2.43.0

#6Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Amit Langote (#5)
Re: Question about duplicate JSONTYPE_JSON check

On 2025-Mar-12, Amit Langote wrote:

I was able to construct a test case that crashes due to this bug:

CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');
CREATE FUNCTION mood_to_json(mood) RETURNS json AS $$
SELECT to_json($1::text);
$$ LANGUAGE sql IMMUTABLE;
CREATE CAST (mood AS json) WITH FUNCTION mood_to_json(mood) AS IMPLICIT;

SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
server closed the connection unexpectedly

Good reaction time :-) I see that that line shows as not even uncovered
in the report, but as non-existant (no background color as opposed to
red):

https://coverage.postgresql.org/src/backend/utils/adt/jsonb.c.gcov.html#660

Evidently the compiler must be optimizing it out as dead code.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
https://twitter.com/gunnarmorling/status/1596080409259003906

#7Amit Langote
amitlangote09@gmail.com
In reply to: Álvaro Herrera (#6)
Re: Question about duplicate JSONTYPE_JSON check

On Wed, Mar 12, 2025 at 7:09 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2025-Mar-12, Amit Langote wrote:

I was able to construct a test case that crashes due to this bug:

CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');
CREATE FUNCTION mood_to_json(mood) RETURNS json AS $$
SELECT to_json($1::text);
$$ LANGUAGE sql IMMUTABLE;
CREATE CAST (mood AS json) WITH FUNCTION mood_to_json(mood) AS IMPLICIT;

SELECT JSON_OBJECT('happy'::mood: '123'::jsonb);
server closed the connection unexpectedly

Good reaction time :-) I see that that line shows as not even uncovered
in the report, but as non-existant (no background color as opposed to
red):

https://coverage.postgresql.org/src/backend/utils/adt/jsonb.c.gcov.html#660

Evidently the compiler must be optimizing it out as dead code.

Ah, I did wonder about the coverage, thanks for pointing it out.

Patch look good for committing?

--
Thanks, Amit Langote

#8Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Amit Langote (#7)
Re: Question about duplicate JSONTYPE_JSON check

On 2025-Mar-12, Amit Langote wrote:

Patch look good for committing?

Ah sorry, I should have said so -- yes, it looks good to me. I feel a
slight dislike for using URL-escaped characters in the mailing list link
you added, because it means I cannot directly copy/paste the message-id
string into my email client program. Not a huge issue for sure, and it
seems a majority of links in the source tree are already like that
anyway, but this seems an inclusive, safe, welcoming nitpicking space :-)

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

#9Maciek Sakrejda
maciek@pganalyze.com
In reply to: Amit Langote (#5)
Re: Question about duplicate JSONTYPE_JSON check

Thanks for the quick fix. I was able to reproduce the assertion
failure and to confirm that it's resolved with the patch. Looks good
to me.

#10Amit Langote
amitlangote09@gmail.com
In reply to: Álvaro Herrera (#8)
Re: Question about duplicate JSONTYPE_JSON check

On Wed, Mar 12, 2025 at 21:40 Álvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2025-Mar-12, Amit Langote wrote:

Patch look good for committing?

Ah sorry, I should have said so -- yes, it looks good to me.

Thanks (Maciek, Tender too) for the review.

I feel a

slight dislike for using URL-escaped characters in the mailing list link
you added, because it means I cannot directly copy/paste the message-id
string into my email client program. Not a huge issue for sure, and it
seems a majority of links in the source tree are already like that
anyway, but this seems an inclusive, safe, welcoming nitpicking space :-)

Ah, no problem, fixed it. I prefer to see URLs in that way too, but somehow
didn’t notice what I had done.