Re: pgsql: Add more SQL/JSON constructor functions
On 2024-May-27, Alvaro Herrera wrote:
JSON_SERIALIZE()
I just noticed this behavior, which looks like a bug to me:
select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":
I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Import Notes
Reply to msg id not found: 202405271313.qkl63gffldmx@alvherre.pgsql
On Monday, May 27, 2024, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:
JSON_SERIALIZE()
I just noticed this behavior, which looks like a bug to me:
select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?
It’s not a self-evident bug given that this is exactly how casting data to
varchar(n) behaves as directed by the SQL Standard.
I'd probably leave the internal consistency and take the opportunity to
educate the reader that text is the preferred type in PostgreSQL and,
especially here, there is little good reason to use anything else.
David J.
Hi Alvaro,
On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:
JSON_SERIALIZE()
I just noticed this behavior, which looks like a bug to me:
select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?
I remember using the reasoning mentioned by David G when testing
json_query() et al with varchar(n), so you get:
select json_query('{"a":1, "a":2}', '$' returning varchar(5));
json_query
------------
{"a":
(1 row)
which is the same as:
select '{"a":1, "a":2}'::varchar(5);
varchar
---------
{"a":
(1 row)
Also,
select json_value('{"a":"abcdef"}', '$.a' returning varchar(5) error on error);
json_value
------------
abcde
(1 row)
This behavior comes from using COERCE_EXPLICIT_CAST when creating the
coercion expression to convert json_*() functions' argument to the
RETURNING type.
--
Thanks, Amit Langote
Amit Langote <amitlangote09@gmail.com> writes:
On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:
I just noticed this behavior, which looks like a bug to me:select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?
This behavior comes from using COERCE_EXPLICIT_CAST when creating the
coercion expression to convert json_*() functions' argument to the
RETURNING type.
Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit. I see
little reason that this should work differently from
select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
json_serialize
----------------
{"a":
(1 row)
regards, tom lane
On 29.05.24 18:44, Tom Lane wrote:
Amit Langote <amitlangote09@gmail.com> writes:
On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-May-27, Alvaro Herrera wrote:
I just noticed this behavior, which looks like a bug to me:select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize
────────────────
{"a":I think this function should throw an error if the destination type
doesn't have room for the output json. Otherwise, what good is the
serialization function?This behavior comes from using COERCE_EXPLICIT_CAST when creating the
coercion expression to convert json_*() functions' argument to the
RETURNING type.Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit. I see
little reason that this should work differently fromselect json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
json_serialize
----------------
{"a":
(1 row)
The SQL standard says essentially that the output of json_serialize() is
some string that when parsed back in gives you an equivalent JSON value
as the input. That doesn't seem compatible with truncating the output.
If you want output truncation, you can of course use an actual cast.
But it makes sense that the RETURNING clause is separate from that.
Peter Eisentraut <peter@eisentraut.org> writes:
On 29.05.24 18:44, Tom Lane wrote:
Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit.
The SQL standard says essentially that the output of json_serialize() is
some string that when parsed back in gives you an equivalent JSON value
as the input. That doesn't seem compatible with truncating the output.
Maybe you should take this up with the SQL committee? If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).
If you want output truncation, you can of course use an actual cast.
But it makes sense that the RETURNING clause is separate from that.
Are you trying to say that the RETURNING clause's specified type
isn't the actual output type? I can't buy that either.
Again, if you think our existing behavior isn't right, I think
it's a problem for the SQL committee not us.
regards, tom lane
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).
It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:
"""
ii) Let JV be an implementation-dependent (UV097) value of type TT and
encoding ENC such that these two conditions hold:
1) JV is a JSON text.
2) When the General Rules of Subclause 9.42, “Parsing JSON text”, are
applied with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE
KEYS as UNIQUENESS CONSTRAINT; let CST be the STATUS and let CSJI be the
SQL/JSON ITEM returned from the application of those General Rules, CST
is successful completion (00000) and CSJI is an SQL/JSON item that is
equivalent to SJI.
If there is no such JV, then let ST be the exception condition: data
exception — invalid JSON text (22032).
iii) If JV is longer than the length or maximum length of TT, then an
exception condition is raised: data exception — string data, right
truncation (22001).
"""
Oracle also behaves accordingly:
SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(20)) from
dual;
JSON_SERIALIZE('{"A"
--------------------
{"a":1,"a":2}
SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from
dual;
select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from dual
*
ERROR at line 1:
ORA-40478: output value too large (maximum: 5)
JZN-00018: Input to serializer is too large
Help: https://docs.oracle.com/error-help/db/ora-40478/
As opposed to:
SQL> select cast(json_serialize('{"a":1, "a":2}') as varchar2(5)) from dual;
CAST(
-----
{"a":
Peter Eisentraut <peter@eisentraut.org> writes:
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).
It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:
Meh. Who needs consistency? But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
regards, tom lane
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:Meh. Who needs consistency? But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
OK, will post a patch to do so in a new thread on -hackers.
--
Thanks, Amit Langote
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:Meh. Who needs consistency? But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.OK, will post a patch to do so in a new thread on -hackers.
Oops, didn't realize that this is already on -hackers.
Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
type specifies a length limit.
Given that this also affects JSON_OBJECT() et al that got added in
v16, maybe back-patching is in order but I'd like to hear opinions on
that.
--
Thanks, Amit Langote
Attachments:
v1-0001-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchapplication/octet-stream; name=v1-0001-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchDownload+73-26
On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:Meh. Who needs consistency? But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.OK, will post a patch to do so in a new thread on -hackers.
Oops, didn't realize that this is already on -hackers.
Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
type specifies a length limit.
hi.
i am a little confused.
here[1]/messages/by-id/3189.1717001075@sss.pgh.pa.us tom says:
Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit. I see
little reason that this should work differently fromselect json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
json_serialize
----------------
{"a":
(1 row)
if i understand it correctly, and my english interpretation is fine.
i think tom means something like:
select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
json_serialize('{"a":1, "a":2}' returning varchar(5));
should return true.
the master will return true, but apply your patch, the above query
will yield an error.
your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);
another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.
------------------------------------------
current in ExecInitJsonExpr we have
if (jsexpr->coercion_expr)
...
else if (jsexpr->use_json_coercion)
...
else if (jsexpr->use_io_coercion)
...
do you think it is necessary to add following asserts:
Assert (!(jsexpr->coercion_expr && jsexpr->use_json_coercion))
Assert (!(jsexpr->coercion_expr && jsexpr->use_io_coercion))
Hi,
Thanks for taking a look.
On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Jun 18, 2024 at 5:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Jun 4, 2024 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
On 02.06.24 21:46, Tom Lane wrote:
If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).It asks for the latter behavior, essentially (but it's not defined in
terms of casts). It says:Meh. Who needs consistency? But I guess the answer is to do what was
suggested earlier and change the code to use COERCE_IMPLICIT_CAST.OK, will post a patch to do so in a new thread on -hackers.
Oops, didn't realize that this is already on -hackers.
Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
type specifies a length limit.hi.
i am a little confused.here[1] tom says:
Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit. I see
little reason that this should work differently fromselect json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
json_serialize
----------------
{"a":
(1 row)if i understand it correctly, and my english interpretation is fine.
i think tom means something like:select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
json_serialize('{"a":1, "a":2}' returning varchar(5));should return true.
the master will return true, but apply your patch, the above query
will yield an error.
The RETURNING variant giving an error is what the standard asks us to
do apparently. I read Tom's last message on this thread as agreeing
to that, even though hesitantly. He can correct me if I got that
wrong.
your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.
Interesting examples, thanks for sharing.
Attached updated version should take into account that typmod may be
hiding under domains. Please test.
------------------------------------------
current in ExecInitJsonExpr we haveif (jsexpr->coercion_expr)
...
else if (jsexpr->use_json_coercion)
...
else if (jsexpr->use_io_coercion)
...do you think it is necessary to add following asserts:
Assert (!(jsexpr->coercion_expr && jsexpr->use_json_coercion))
Assert (!(jsexpr->coercion_expr && jsexpr->use_io_coercion))
Yeah, perhaps, but let's consider this independently please.
--
Thanks, Amit Langote
Attachments:
v2-0001-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchapplication/octet-stream; name=v2-0001-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchDownload+136-41
On Fri, Jun 21, 2024 at 10:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Jun 21, 2024 at 4:05 PM jian he <jian.universality@gmail.com> wrote:
hi.
i am a little confused.here[1] tom says:
Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit. I see
little reason that this should work differently fromselect json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
json_serialize
----------------
{"a":
(1 row)if i understand it correctly, and my english interpretation is fine.
i think tom means something like:select json_serialize('{"a":1, "a":2}' returning text)::varchar(5) =
json_serialize('{"a":1, "a":2}' returning varchar(5));should return true.
the master will return true, but apply your patch, the above query
will yield an error.The RETURNING variant giving an error is what the standard asks us to
do apparently. I read Tom's last message on this thread as agreeing
to that, even though hesitantly. He can correct me if I got that
wrong.your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.Interesting examples, thanks for sharing.
Attached updated version should take into account that typmod may be
hiding under domains. Please test.
I'd like to push this one tomorrow, barring objections.
I could use some advice on backpatching. As I mentioned upthread,
this changes the behavior for JSON_OBJECT(), JSON_ARRAY(),
JSON_ARRAYAGG(), JSON_OBJECTAGG() too, which were added in v16.
Should this change be backpatched? In general, what's our stance on
changes that cater to improving standard compliance, but are not
necessarily bugs.
--
Thanks, Amit Langote
On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
The RETURNING variant giving an error is what the standard asks us to
do apparently. I read Tom's last message on this thread as agreeing
to that, even though hesitantly. He can correct me if I got that
wrong.your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.Interesting examples, thanks for sharing.
Attached updated version should take into account that typmod may be
hiding under domains. Please test.
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
return
ERROR: value too long for type character(2)
should return 13
I found out the source of the problem is in coerceJsonExprOutput
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}
I'd like to push this one tomorrow, barring objections.
Currently the latest patch available cannot be `git apply` cleanly.
@@ -464,3 +466,9 @@ SELECT JSON_QUERY(jsonb 'null', '$xyz' PASSING 1 AS xyz);
SELECT JSON_EXISTS(jsonb '1', '$' DEFAULT 1 ON ERROR);
SELECT JSON_VALUE(jsonb '1', '$' EMPTY ON ERROR);
SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);
+
+-- Test implicit coercion domain over fixed-legth type specified in RETURNING
+CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12'));
+SELECT JSON_QUERY(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);
+SELECT JSON_VALUE(jsonb '123', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);
+SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 ERROR ON ERROR);
cannot found `SELECT JSON_QUERY(jsonb '1', '$' TRUE ON ERROR);` in
https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/sql/sqljson_queryfuncs.sql
hi.
I have assembled a list of simple examples, some works (for comparison
sake), most not work
as intended.
CREATE DOMAIN queryfuncs_char2 AS char(2) CHECK (VALUE NOT IN ('12'));
CREATE DOMAIN queryfuncs_d_interval AS interval(2) CHECK (VALUE is not null);
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'12' on error);
SELECT JSON_VALUE(jsonb '12', '$' RETURNING queryfuncs_char2 default
'11' on error);
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
SELECT JSON_VALUE(jsonb '"111"', '$' RETURNING queryfuncs_char2
default '17' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 default
'14' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 omit
quotes default '15' on error);
SELECT JSON_QUERY(jsonb '111', '$' RETURNING queryfuncs_char2 keep
quotes default '16' on error);
SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on error);
SELECT JSON_VALUE(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on empty);
SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on error);
SELECT JSON_QUERY(jsonb '"01:23:45.6789"', '$' RETURNING
queryfuncs_d_interval default '01:23:45.6789' on empty);
above 4 queries fails, meaning the changes you propose within
transformJsonBehavior is wrong?
i think it's because the COERCION_IMPLICIT cast from text to domain
queryfuncs_d_interval is not doable.
json_table seems also have problem with "exists" cast to other type, example:
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a char(2) EXISTS
PATH '$.a' ));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a'));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a' error on error));
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a queryfuncs_char2
EXISTS PATH '$.a' error on empty));
----------------------------------------------------------------------------------------------------
SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
for the above example:
coerceJsonExprOutput, coerceJsonFuncExpr set the result datum coercion
node to RelabelType:
RelabelType is not error safe. so the above query will fail converting
text 111 to queryfuncs_char2
which is not what we want.
I think making coerceJsonExprOutput the following way can solve this problem.
your patch cannot apply cleanly, I just posted the actual code snippet
of coerceJsonExprOutput, not a diff file.
static void
coerceJsonExprOutput(ParseState *pstate, JsonExpr *jsexpr)
{
JsonReturning *returning = jsexpr->returning;
Node *context_item = jsexpr->formatted_expr;
int default_typmod;
Oid default_typid;
bool omit_quotes =
jsexpr->op == JSON_QUERY_OP && jsexpr->omit_quotes;
Node *coercion_expr = NULL;
int32 baseTypmod = returning->typmod;
Assert(returning);
/*
* Check for cases where the coercion should be handled at runtime, that
* is, without using a cast expression.
*/
if (jsexpr->op == JSON_VALUE_OP)
{
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}
else
{
/* domain type, typmod > 0 can only use use_io_coercion */
(void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
if (baseTypmod > 0)
{
jsexpr->use_io_coercion = true;
return;
}
}
}
else if (jsexpr->op == JSON_QUERY_OP)
{
/*
* Cast functions from jsonb to the following types (jsonb_bool() et
* al) don't handle errors softly, so coerce either by calling
* json_populate_type() or the type's input function so that any
* errors are handled appropriately. The latter only if OMIT QUOTES is
* true.
*/
switch (returning->typid)
{
case BOOLOID:
case NUMERICOID:
case INT2OID:
case INT4OID:
case INT8OID:
case FLOAT4OID:
case FLOAT8OID:
if (jsexpr->omit_quotes)
jsexpr->use_io_coercion = true;
else
jsexpr->use_json_coercion = true;
return;
default:
break;
}
/*
* for returning domain type, we cannot use coercion expression.
* it may not be able to catch the error, for example RelabelType
* for we either use_io_coercion or use_json_coercion.
*/
if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
(void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
/*
* coerceJsonFuncExpr() creates implicit casts for types with typmod,
* which (if present) don't handle errors softly, so use runtime
* coercion.
*/
if (baseTypmod > 0)
{
if (jsexpr->omit_quotes)
jsexpr->use_io_coercion = true;
else
jsexpr->use_json_coercion = true;
return;
}
}
...
-------------------------------
Hi,
On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote:
On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
The RETURNING variant giving an error is what the standard asks us to
do apparently. I read Tom's last message on this thread as agreeing
to that, even though hesitantly. He can correct me if I got that
wrong.your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.Interesting examples, thanks for sharing.
Attached updated version should take into account that typmod may be
hiding under domains. Please test.SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
return
ERROR: value too long for type character(2)
should return 13I found out the source of the problem is in coerceJsonExprOutput
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}
Thanks for this test case and the analysis. Yes, using a cast
expression for coercion to the RETURNING type generally seems to be a
source of many problems that could've been solved by fixing things so
that only use_io_coercion and use_json_coercion are enough to handle
all the cases.
I've attempted that in the attached 0001, which removes
JsonExpr.coercion_expr and a bunch of code around it.
0002 is now the original patch minus the changes to make
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
because the changes in 0001 covers them. The changes for JsonBehavior
expression coercion as they were in the last version of the patch are
still needed, but I decided to move those into 0001 so that the
changes for query functions are all in 0001 and those for constructors
in 0002. It would be nice to get rid of that coerce_to_target_type()
call to coerce the "behavior expression" to RETURNING type, but I'm
leaving that as a task for another day.
--
Thanks, Amit Langote
Attachments:
v3-0001-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchapplication/octet-stream; name=v3-0001-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchDownload+170-236
v3-0002-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchapplication/octet-stream; name=v3-0002-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchDownload+52-2
On Thu, Jun 27, 2024 at 6:57 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Jun 26, 2024 at 11:46 PM jian he <jian.universality@gmail.com> wrote:
On Wed, Jun 26, 2024 at 8:39 PM Amit Langote <amitlangote09@gmail.com> wrote:
The RETURNING variant giving an error is what the standard asks us to
do apparently. I read Tom's last message on this thread as agreeing
to that, even though hesitantly. He can correct me if I got that
wrong.your patch will make domain and char(n) behavior inconsistent.
create domain char2 as char(2);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char2 ERROR ON ERROR);
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING char(2) ERROR ON ERROR);another example:
SELECT JSON_query(jsonb '"aaa"', '$' RETURNING char(2) keep quotes
default '"aaa"'::jsonb ON ERROR);
same value (jsonb "aaa") error on error will yield error,
but `default expression on error` can coerce the value to char(2),
which looks a little bit inconsistent, I think.Interesting examples, thanks for sharing.
Attached updated version should take into account that typmod may be
hiding under domains. Please test.SELECT JSON_VALUE(jsonb '111', '$' RETURNING queryfuncs_char2 default
'13' on error);
return
ERROR: value too long for type character(2)
should return 13I found out the source of the problem is in coerceJsonExprOutput
/*
* Use cast expression for domain types; we need CoerceToDomain here.
*/
if (get_typtype(returning->typid) != TYPTYPE_DOMAIN)
{
jsexpr->use_io_coercion = true;
return;
}Thanks for this test case and the analysis. Yes, using a cast
expression for coercion to the RETURNING type generally seems to be a
source of many problems that could've been solved by fixing things so
that only use_io_coercion and use_json_coercion are enough to handle
all the cases.I've attempted that in the attached 0001, which removes
JsonExpr.coercion_expr and a bunch of code around it.0002 is now the original patch minus the changes to make
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
because the changes in 0001 covers them. The changes for JsonBehavior
expression coercion as they were in the last version of the patch are
still needed, but I decided to move those into 0001 so that the
changes for query functions are all in 0001 and those for constructors
in 0002. It would be nice to get rid of that coerce_to_target_type()
call to coerce the "behavior expression" to RETURNING type, but I'm
leaving that as a task for another day.
Updated 0001 to remove outdated references, remove some more unnecessary code.
--
Thanks, Amit Langote
Attachments:
v4-0002-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchapplication/octet-stream; name=v4-0002-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patchDownload+52-2
v4-0001-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchapplication/octet-stream; name=v4-0001-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchDownload+189-277
On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
I've attempted that in the attached 0001, which removes
JsonExpr.coercion_expr and a bunch of code around it.0002 is now the original patch minus the changes to make
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
because the changes in 0001 covers them. The changes for JsonBehavior
expression coercion as they were in the last version of the patch are
still needed, but I decided to move those into 0001 so that the
changes for query functions are all in 0001 and those for constructors
in 0002. It would be nice to get rid of that coerce_to_target_type()
call to coerce the "behavior expression" to RETURNING type, but I'm
leaving that as a task for another day.Updated 0001 to remove outdated references, remove some more unnecessary code.
i found some remaining references of "coercion_expr" should be removed.
src/include/nodes/primnodes.h
/* JsonExpr's collation, if coercion_expr is NULL. */
src/include/nodes/execnodes.h
/*
* Address of the step to coerce the result value of jsonpath evaluation
* to the RETURNING type using JsonExpr.coercion_expr. -1 if no coercion
* is necessary or if either JsonExpr.use_io_coercion or
* JsonExpr.use_json_coercion is true.
*/
int jump_eval_coercion;
src/backend/jit/llvm/llvmjit_expr.c
/* coercion_expr code */
LLVMPositionBuilderAtEnd(b, b_coercion);
if (jsestate->jump_eval_coercion >= 0)
LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]);
else
LLVMBuildUnreachable(b);
src/backend/executor/execExprInterp.c
/*
* Checks if an error occurred either when evaluating JsonExpr.coercion_expr or
* in ExecEvalJsonCoercion(). If so, this sets JsonExprState.error to trigger
* the ON ERROR handling steps.
*/
void
ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
{
}
if (jbv == NULL)
{
/* Will be coerced with coercion_expr, if any. */
*op->resvalue = (Datum) 0;
*op->resnull = true;
}
src/backend/executor/execExpr.c
/*
* Jump to coerce the NULL using coercion_expr if present. Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked. jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*/
/*
* Jump to coerce the NULL using coercion_expr if present. Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked. jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*/
On Fri, Jun 28, 2024 at 3:14 PM jian he <jian.universality@gmail.com> wrote:
On Thu, Jun 27, 2024 at 7:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
I've attempted that in the attached 0001, which removes
JsonExpr.coercion_expr and a bunch of code around it.0002 is now the original patch minus the changes to make
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE() behave as we would like,
because the changes in 0001 covers them. The changes for JsonBehavior
expression coercion as they were in the last version of the patch are
still needed, but I decided to move those into 0001 so that the
changes for query functions are all in 0001 and those for constructors
in 0002. It would be nice to get rid of that coerce_to_target_type()
call to coerce the "behavior expression" to RETURNING type, but I'm
leaving that as a task for another day.Updated 0001 to remove outdated references, remove some more unnecessary code.
i found some remaining references of "coercion_expr" should be removed.
src/include/nodes/primnodes.h
/* JsonExpr's collation, if coercion_expr is NULL. */src/include/nodes/execnodes.h
/*
* Address of the step to coerce the result value of jsonpath evaluation
* to the RETURNING type using JsonExpr.coercion_expr. -1 if no coercion
* is necessary or if either JsonExpr.use_io_coercion or
* JsonExpr.use_json_coercion is true.
*/
int jump_eval_coercion;src/backend/jit/llvm/llvmjit_expr.c
/* coercion_expr code */
LLVMPositionBuilderAtEnd(b, b_coercion);
if (jsestate->jump_eval_coercion >= 0)
LLVMBuildBr(b, opblocks[jsestate->jump_eval_coercion]);
else
LLVMBuildUnreachable(b);src/backend/executor/execExprInterp.c
/*
* Checks if an error occurred either when evaluating JsonExpr.coercion_expr or
* in ExecEvalJsonCoercion(). If so, this sets JsonExprState.error to trigger
* the ON ERROR handling steps.
*/
void
ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
{
}if (jbv == NULL)
{
/* Will be coerced with coercion_expr, if any. */
*op->resvalue = (Datum) 0;
*op->resnull = true;
}src/backend/executor/execExpr.c
/*
* Jump to coerce the NULL using coercion_expr if present. Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked. jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*//*
* Jump to coerce the NULL using coercion_expr if present. Coercing NULL
* is only interesting when the RETURNING type is a domain whose
* constraints must be checked. jsexpr->coercion_expr containing a
* CoerceToDomain node must have been set in that case.
*/
Thanks for checking.
Will push the attached 0001 and 0002 shortly.
--
Thanks, Amit Langote
Attachments:
v5-0001-SQL-JSON-Fix-coercion-of-constructor-outputs-to-t.patchapplication/octet-stream; name=v5-0001-SQL-JSON-Fix-coercion-of-constructor-outputs-to-t.patchDownload+52-2
v5-0002-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchapplication/octet-stream; name=v5-0002-SQL-JSON-Always-coerce-JsonExpr-result-at-runtime.patchDownload+210-303
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 233b7b1cc9..df766cdec1 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -3583,6 +3583,7 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr, Node *res; int location; Oid exprtype = exprType(expr); + int32 baseTypmod = returning->typmod;/* if output type is not specified or equals to function type, return */
if (!OidIsValid(returning->typid) || returning->typid == exprtype)
@@ -3611,10 +3612,19 @@ coerceJsonFuncExpr(ParseState *pstate, Node *expr,
return (Node *) fexpr;
}+ /* + * For domains, consider the base type's typmod to decide whether to setup + * an implicit or explicit cast. + */ + if (get_typtype(returning->typid) == TYPTYPE_DOMAIN) + (void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
I didn't review this patch in detail, but I just noticed this tiny bit
and wanted to say that I don't like this coding style where you
initialize a variable to a certain value, and much later you override it
with a completely different value. It seems much clearer to leave it
uninitialized at first, and have both places that determine the value
together,
if (get_typtype(returning->typid) == TYPTYPE_DOMAIN)
(void) getBaseTypeAndTypmod(returning->typid, &baseTypmod);
else
baseTypmod = returning->typmod;
Not only because in the domain case the initializer value is a downright
lie, but also because of considerations such as if you later add code
that uses the variable in between those two places, you'd be introducing
a bug in the domain case because it hasn't been set. With the coding I
propose, the compiler immediately tells you that the initialization is
missing.
TBH I'm not super clear on why we decide on explicit or implicit cast
based on presence of a typmod. Why isn't it better to always use an
implicit one?
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Small aircraft do not crash frequently ... usually only once!"
(ponder, http://thedailywtf.com/)