POC: PLpgSQL FOREACH IN JSON ARRAY

Started by Pavel Stehule10 days ago10 messages
Jump to latest
#1Pavel Stehule
pavel.stehule@gmail.com

Hi

I wrote PoC for previously proposed plpgsql statement FOREACH IN JSON ARRAY

It looks like:

do $$
declare x int;
begin
foreach x in json array '[1,2,3,4]'
loop
raise notice 'x: %', x;
end loop;
end;
$$

do $$
declare x int; y int;
begin
foreach x, y in json array '[{"x": 100, "y": 1000}, {"y": 1000, "x":
100}]'
loop
raise notice 'x: %, y: %', x, y;
end loop;
end
$$

My first motivation for this patch is performance. This is faster (3 - 4 x)
than using FOR IN SELECT FROM json_array_elements, because there is no
overhead of SQL executor. Second motivation is a little bit better
readability, because inside plpgsql' statements we have info about used
variables and we can use it.

The behavior is very similar to FOREACH IN ARRAY with one significant
difference - the values of JSON objects are assigned to the composite
variable or lists of variables by names (not by position). It made this
decision because jsonb doesn't preserve the position of the field in
object, and then assignment based on position cannot work.

The code is relatively short now - about 400 lines +/- and the code is
simple without risks.

There are some open questions - mainly if default mode for mapping json
fields to plpgsql variables should be in lax or strict mode. Now, it is
something between (cast errors are raised) - it is consistent
with jsonb_populate_record - but it should not be the final design. I
cannot say what is better - currently implemented behavior is consistent
with common plpgsql behaviour, but SQL/JSON is different. I can imagine
that default behaviour will be lax, and with some optional clauses we can
push behave to strict mode. I have no strong opinion about it. Maybe I
prefer the current "strict" behaviour a little bit, because it is more
"safe", but it is only my personal opinion. But again, I have no strong
opinion about this question and I very much invite any discussion about it.

This is proof of concept patch - casting between plpgsql arrays and json
arrays is not supported, documentation and regress tests are minimalistic,
but it is good enough for testing and good enough for decision, if this
feature is wanted or not (or if it needs some modifications).

This is a new feature (and proprietary feature). There should not be any
compatibility issues.

What do you think about this feature?

Regards

Pavel

Attachments:

v20260228-1-0001-FOREACH-scalar-IN-JSON-ARRAY.patchtext/x-patch; charset=US-ASCII; name=v20260228-1-0001-FOREACH-scalar-IN-JSON-ARRAY.patchDownload+669-11
#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#1)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

Hi

small update

Now assignment to plpgsql array variable and json array works. Some minor
code cleaning + more regress tests.

Regards

Pavel

Attachments:

v20260301-2-0001-FOREACH-scalar-IN-JSON-ARRAY.patchtext/x-patch; charset=US-ASCII; name=v20260301-2-0001-FOREACH-scalar-IN-JSON-ARRAY.patchDownload+797-9
#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#2)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

ne 1. 3. 2026 v 6:44 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

Hi

small update

Now assignment to plpgsql array variable and json array works. Some minor
code cleaning + more regress tests.

fix uninitialized argument of json_populate_type function

regards

Pavel

Show quoted text

Regards

Pavel

Attachments:

v20260301-3-0001-FOREACH-scalar-IN-JSON-ARRAY.patchtext/x-patch; charset=US-ASCII; name=v20260301-3-0001-FOREACH-scalar-IN-JSON-ARRAY.patchDownload+800-9
#4Jim Jones
jim.jones@uni-muenster.de
In reply to: Pavel Stehule (#3)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

Hi Pavel,

I quickly tested the patch, and I also could observe a ~3x performance
improvement!

A few first impressions:

## in exec_stmt_foreach_json_a the boolean variable found is declared as
false, bit its value is never set to true until exec_set_found() is called:

/*
* Set the FOUND variable to indicate the result of executing the loop
* (namely, whether we looped one or more times). This must be set here so
* that it does not interfere with the value of the FOUND variable inside
* the loop processing itself.
*/
exec_set_found(estate, found);

Test:

DO $$
DECLARE
x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3]'
LOOP
RAISE NOTICE 'x: %', x;
END LOOP;

IF FOUND THEN
RAISE NOTICE 'FOUND is true';
ELSE
RAISE NOTICE 'FOUND is false';
END IF;
END;
$$;
NOTICE: x: 1
NOTICE: x: 2
NOTICE: x: 3
NOTICE: FOUND is false

## Suggestion in the plpgsql.sgml

The <literal>FOREACH</literal> loop is much like a
<literal>FOREACH</literal> loop,

to

"much like a regular <literal>FOREACH</literal> loop over arrays"

## Typo in comment

/*
* We cannot to use fieldnames for tupdescentry, because
* these names can be suffixed by name of row variable.
...

We cannot to use > We cannot use

## Nit pick

These error messages are not wrong, but IMO a errhint/errdetail could
add some value here:

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from a scalar")));

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from an object")));

Something like this perhaps?

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from a scalar"),
errhint("FOREACH IN JSON ARRAY requires an array value.")));

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("FOREACH expression must evaluate to a JSON array"),
errdetail("Cannot iterate over a scalar value.")));

Thanks for the patch!

Best, Jim

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#4)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

Hi

út 3. 3. 2026 v 8:43 odesílatel Jim Jones <jim.jones@uni-muenster.de>
napsal:

Hi Pavel,

I quickly tested the patch, and I also could observe a ~3x performance
improvement!

A few first impressions:

## in exec_stmt_foreach_json_a the boolean variable found is declared as
false, bit its value is never set to true until exec_set_found() is called:

/*
* Set the FOUND variable to indicate the result of executing the loop
* (namely, whether we looped one or more times). This must be set here so
* that it does not interfere with the value of the FOUND variable inside
* the loop processing itself.
*/
exec_set_found(estate, found);

Test:

DO $$
DECLARE
x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3]'
LOOP
RAISE NOTICE 'x: %', x;
END LOOP;

IF FOUND THEN
RAISE NOTICE 'FOUND is true';
ELSE
RAISE NOTICE 'FOUND is false';
END IF;
END;
$$;
NOTICE: x: 1
NOTICE: x: 2
NOTICE: x: 3
NOTICE: FOUND is false

fixed + regress tests

## Suggestion in the plpgsql.sgml

The <literal>FOREACH</literal> loop is much like a
<literal>FOREACH</literal> loop,

to

"much like a regular <literal>FOREACH</literal> loop over arrays"

done

## Typo in comment

/*
* We cannot to use fieldnames for tupdescentry, because
* these names can be suffixed by name of row variable.
...

We cannot to use > We cannot use

fixed

## Nit pick

These error messages are not wrong, but IMO a errhint/errdetail could
add some value here:

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from a scalar")));

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from an object")));

Something like this perhaps?

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot extract elements from a scalar"),
errhint("FOREACH IN JSON ARRAY requires an array value.")));

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("FOREACH expression must evaluate to a JSON array"),
errdetail("Cannot iterate over a scalar value.")));

I rewrote it to

if (JB_ROOT_IS_SCALAR(jb))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("FOREACH expression must evaluate to a JSON array"),
errhint("Cannot iterate over a scalar value.")));
else if (JB_ROOT_IS_OBJECT(jb))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("FOREACH expression must evaluate to a JSON array"),
errdetail("Cannot iterate over a object value.")));

Assert(JB_ROOT_IS_ARRAY(jb));

+ regress tests

Thanks for the patch!

Thank you for check

Regards

Pavel

Show quoted text

Best, Jim

Attachments:

v20260303-4-0001-FOREACH-scalar-IN-JSON-ARRAY.patchtext/x-patch; charset=US-ASCII; name=v20260303-4-0001-FOREACH-scalar-IN-JSON-ARRAY.patchDownload+923-9
#6Jim Jones
jim.jones@uni-muenster.de
In reply to: Pavel Stehule (#5)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

I reviewed the code I have nothing to add at this point. LGTM!

The tests touch a lot of different scenarios, but for the sake of
completeness I'd like to suggest adding these three cases:

-- EXIT and CONTINUE can be triggered by LOOP_RC_PROCESSING
DO $$
DECLARE x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3,4,5]'
LOOP
EXIT WHEN x = 3;
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

DO $$
DECLARE x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3,4,5]'
LOOP
CONTINUE WHEN x % 2 = 0;
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

-- Variable instead of string
DO $$
DECLARE x int; arr jsonb;
BEGIN
SELECT jsonb_agg(i) INTO arr
FROM generate_series(1,3) i;

FOREACH x IN JSON ARRAY arr
LOOP
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

Thanks!

Best, Jim

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#6)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

Hi

st 4. 3. 2026 v 12:35 odesílatel Jim Jones <jim.jones@uni-muenster.de>
napsal:

I reviewed the code I have nothing to add at this point. LGTM!

The tests touch a lot of different scenarios, but for the sake of
completeness I'd like to suggest adding these three cases:

-- EXIT and CONTINUE can be triggered by LOOP_RC_PROCESSING
DO $$
DECLARE x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3,4,5]'
LOOP
EXIT WHEN x = 3;
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

DO $$
DECLARE x int;
BEGIN
FOREACH x IN JSON ARRAY '[1,2,3,4,5]'
LOOP
CONTINUE WHEN x % 2 = 0;
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

-- Variable instead of string
DO $$
DECLARE x int; arr jsonb;
BEGIN
SELECT jsonb_agg(i) INTO arr
FROM generate_series(1,3) i;

FOREACH x IN JSON ARRAY arr
LOOP
RAISE NOTICE '%', x;
END LOOP;
END;
$$;

I merged these examples to tests

Thank you for review

Regards

Pavel

Show quoted text

Thanks!

Best, Jim

Attachments:

v20260304-5-0001-FOREACH-scalar-IN-JSON-ARRAY.patchtext/x-patch; charset=US-ASCII; name=v20260304-5-0001-FOREACH-scalar-IN-JSON-ARRAY.patchDownload+1006-9
#8Shreeya Sharma
shreeya2304@gmail.com
In reply to: Pavel Stehule (#7)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

Status: SUCCESS
Applied against: /Users/shreeyasharma/postgresql (Current HEAD)
Build status: PASS
Test status: PASS
Notes:
- Compilation passed: True
- Regression tests (make check) passed: True

#9Shreeya Sharma
shreeya2304@gmail.com
In reply to: Pavel Stehule (#7)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

The changes are good

The new status of this patch is: Ready for Committer

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Shreeya Sharma (#9)
Re: POC: PLpgSQL FOREACH IN JSON ARRAY

po 9. 3. 2026 v 7:12 odesílatel Shreeya Sharma <shreeya2304@gmail.com>
napsal:

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

The changes are good

The new status of this patch is: Ready for Committer

Thank you very much

Regards

Pavel