Cache lookup error when using jsonb, json_build_object and a WITH clause

Started by Michael Paquieralmost 12 years ago9 messageshackers
Jump to latest
#1Michael Paquier
michael@paquier.xyz

Hi all,

I found the following error when playing with jsonb and json_build_object:
=# with jsonb_data as (select * from jsonb_each('{"aa" :
"po"}'::jsonb)) select json_build_object(key,value) from jsonb_data;
ERROR: XX000: cache lookup failed for type 2147483650
LOCATION: lookup_type_cache, typcache.c:193

I would have expected the result to be the same as in the case of json:
=# with json_data as (select * from json_each('{"aa" : "po"}'::json))
select json_build_object(key,value) from json_data;
json_build_object
-------------------
{"aa" : "po"}
(1 row)

Regards,
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#1)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

Michael Paquier <michael.paquier@gmail.com> writes:

I found the following error when playing with jsonb and json_build_object:
=# with jsonb_data as (select * from jsonb_each('{"aa" :
"po"}'::jsonb)) select json_build_object(key,value) from jsonb_data;
ERROR: XX000: cache lookup failed for type 2147483650
LOCATION: lookup_type_cache, typcache.c:193

The immediate problem seems to be that add_json() did not get taught
that jsonb is of TYPCATEGORY_JSON; somebody missed updating that copy
of logic that's been copied and pasted several times too many, IMNSHO.

However, now that I look at this code, it seems like it's got more
problems than that:

* it will be fooled utterly by domains over types it's interested in.

* there is nothing stopping somebody from making user-defined types
with category 'j' or 'c', which will confuse it even more.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#1)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

Hi,

On 2014-05-09 21:40:07 +0900, Michael Paquier wrote:

Hi all,

I found the following error when playing with jsonb and json_build_object:
=# with jsonb_data as (select * from jsonb_each('{"aa" :
"po"}'::jsonb)) select json_build_object(key,value) from jsonb_data;
ERROR: XX000: cache lookup failed for type 2147483650
LOCATION: lookup_type_cache, typcache.c:193

I would have expected the result to be the same as in the case of json:
=# with json_data as (select * from json_each('{"aa" : "po"}'::json))
select json_build_object(key,value) from json_data;
json_build_object
-------------------
{"aa" : "po"}
(1 row)

Whoa. There's two wierd things here:
a) "jsonb" has a typcategory 'C'. Marking a composite type. "json" has
'U'.

b) datum_to_json() thinks it's a good idea to use typcategory to decide
how a type is output. Isn't that pertty fundamentally flawed? To
detect composite types it really should look at typtype, now?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#3)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

Andres Freund <andres@2ndquadrant.com> writes:

Whoa. There's two wierd things here:
a) "jsonb" has a typcategory 'C'. Marking a composite type. "json" has
'U'.

Yeah, that's flat out wrong. I changed it before seeing your message.

b) datum_to_json() thinks it's a good idea to use typcategory to decide
how a type is output. Isn't that pertty fundamentally flawed?

Indeed. I think the bit that uses TYPCATEGORY_NUMERIC as a hint to decide
whether the value can be left unquoted (assuming it looks like a number)
might be all right, but the rest of this seems pretty bogus.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

I wrote:

Andres Freund <andres@2ndquadrant.com> writes:

b) datum_to_json() thinks it's a good idea to use typcategory to decide
how a type is output. Isn't that pertty fundamentally flawed?

Indeed. I think the bit that uses TYPCATEGORY_NUMERIC as a hint to decide
whether the value can be left unquoted (assuming it looks like a number)
might be all right, but the rest of this seems pretty bogus.

Actually, that would be a security hole if it weren't that CREATE TYPE for
new base types is superuser-only. Otherwise a user-defined type could
fool this logic with a malicious choice of typcategory. jsonb itself was
darn close to being a "malicious choice of typcategory" --- it's entirely
accidental that Michael's example didn't lead to a crash or even more
interesting stuff, since the code was trying to process a jsonb as though
it were a regular composite type. Other choices of typcategory could have
sent the code into the array path for something that's not an array, or
have allowed escaping to be bypassed for something that's not json, etc.

In short, there are defined ways to decide if a type is array or
composite, and this ain't how.

After further reflection I think we should lose the TYPCATEGORY_NUMERIC
business too. ruleutils.c hard-wires the set of types it will consider
to be numeric, and I see no very good reason not to do likewise here.
That will remove the need to look up the typcategory at all.

So we need to:

1. Refactor so there's only one copy of the control logic.

2. Smash domains to their base types.

3. Identify boolean, numeric, and json types by direct tests of type OID.

4. Identify array and composite types using standard methods.

Anybody see other problems to fix here?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

On 05/09/2014 10:07 AM, Tom Lane wrote:

After further reflection I think we should lose the TYPCATEGORY_NUMERIC
business too. ruleutils.c hard-wires the set of types it will consider
to be numeric, and I see no very good reason not to do likewise here.
That will remove the need to look up the typcategory at all.

So we need to:

1. Refactor so there's only one copy of the control logic.

2. Smash domains to their base types.

3. Identify boolean, numeric, and json types by direct tests of type OID.

4. Identify array and composite types using standard methods.

Anybody see other problems to fix here?

I guess this is my fault. I recall some discussions when some of this
was first being written about the best way to make the type based
decisions, not sure at this remove whether on list or off. The origin of
it is in 9.2, so if you're going to adjust it you should probably go
back that far.

I was aware of the domain problem, but in 2 years or so nobody has
complained about it, so I guess nobody is defining domains over json.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

On 2014-05-09 10:07:10 -0400, Tom Lane wrote:

I wrote:

Andres Freund <andres@2ndquadrant.com> writes:

b) datum_to_json() thinks it's a good idea to use typcategory to decide
how a type is output. Isn't that pertty fundamentally flawed?

Indeed. I think the bit that uses TYPCATEGORY_NUMERIC as a hint to decide
whether the value can be left unquoted (assuming it looks like a number)
might be all right, but the rest of this seems pretty bogus.

Actually, that would be a security hole if it weren't that CREATE TYPE for
new base types is superuser-only.

Yea. I actual wonder why CREATE TYPE seems to allow createing
TYPCATEGORY_COMPOSITE types - there really doesn't seem to be a usecase.

After further reflection I think we should lose the TYPCATEGORY_NUMERIC
business too. ruleutils.c hard-wires the set of types it will consider
to be numeric, and I see no very good reason not to do likewise here.
That will remove the need to look up the typcategory at all.

Maybe we should expose that list or the functionality in a neater way?
It's already been copied to test_decoding...

So we need to:

1. Refactor so there's only one copy of the control logic.

2. Smash domains to their base types.

3. Identify boolean, numeric, and json types by direct tests of type OID.

4. Identify array and composite types using standard methods.

Anybody see other problems to fix here?

Yea.
5)
if (val_type > FirstNormalObjectId)
isn't fundamentally incorrect but imo shouldn't be replaced by something
like !IsCatalogType() (akin to IsCatalogRelation). At least if we decide
that hunk is safe from other POVs - I am not actually 100% sure yet.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

Andrew Dunstan <andrew@dunslane.net> writes:

I guess this is my fault. I recall some discussions when some of this
was first being written about the best way to make the type based
decisions, not sure at this remove whether on list or off. The origin of
it is in 9.2, so if you're going to adjust it you should probably go
back that far.

Right, will back-patch.

I was aware of the domain problem, but in 2 years or so nobody has
complained about it, so I guess nobody is defining domains over json.

Actually, I was more concerned about domains over other types.
For instance

regression=# create domain dd as int;
CREATE DOMAIN
regression=# select json_build_object('foo', 43);
json_build_object
-------------------
{"foo" : 43}
(1 row)

regression=# select json_build_object('foo', 43::dd);
json_build_object
-------------------
{"foo" : "43"}
(1 row)

With the attached patch, you get 43 without any quotes, which seems
right to me. However, given the lack of complaints, it might be better
to not make this behavioral change in the back branches. I'll omit
the getBaseType() call from the back-patches.

Draft HEAD patch attached.

regards, tom lane

Attachments:

fix-json-type-categorization.patchtext/x-diff; charset=us-ascii; name=fix-json-type-categorization.patchDownload+374-385
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Cache lookup error when using jsonb, json_build_object and a WITH clause

Andres Freund <andres@2ndquadrant.com> writes:

Anybody see other problems to fix here?

Yea.
5)
if (val_type > FirstNormalObjectId)
isn't fundamentally incorrect but imo shouldn't be replaced by something
like !IsCatalogType() (akin to IsCatalogRelation). At least if we decide
that hunk is safe from other POVs - I am not actually 100% sure yet.

I didn't particularly like that either. The test is off-by-one, for
one thing (a type created right at OID wraparound could have
FirstNormalObjectId). However, it seems reasonable to avoid fruitless
syscache searches for builtin types, and I'm not seeing a lot of point
to wrapping this test in some macro.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers