Variable substitution in jsonb functions fails for jsonpath operator like_regex

Started by Erwin Brandstetterover 2 years ago8 messagesbugs
Jump to latest
#1Erwin Brandstetter
brsaweda@gmail.com

The functions jsonb_path_exists() and friends accept a "vars" parameter for
parameter substitution in the jsonpath argument. This seems to work for all
jsonpath operators except "like_regex":

SELECT * FROM tbl
WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $foo)', '{"foo":
"CEO"}');

ERROR: syntax error at or near "$foo" of jsonpath input
LINE 3: WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $...

Notably, the same works even for "starts with":

SELECT * FROM tbl
WHERE jsonb_path_exists(data, '$[*].value ? (@ starts with $foo)',
'{"foo": "CEO"}');

I also tested related jsonb functions including jsonb_path_query(),
jsonb_path_query_first(), jsonb_path_query(). Always the same error message.

Here is the question on stackoverflow.com that brought the issue to my
attention (plus my answer with more details):
https://stackoverflow.com/questions/77317468/variable-substitution-for-postgres-jsonpath-operator-like-regex/77318568

Here is a related fiddle to play with:
https://dbfiddle.uk/4yRjIYlh

Tested with Postgres 16.0.

Regards
Erwin

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erwin Brandstetter (#1)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

Erwin Brandstetter <brsaweda@gmail.com> writes:

The functions jsonb_path_exists() and friends accept a "vars" parameter for
parameter substitution in the jsonpath argument. This seems to work for all
jsonpath operators except "like_regex":

SELECT * FROM tbl
WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $foo)', '{"foo":
"CEO"}');

ERROR: syntax error at or near "$foo" of jsonpath input
LINE 3: WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $...

Notably, the same works even for "starts with":

Hmm, maybe just an oversight in jsonpath_gram.y?

predicate:
...
| expr STARTS_P WITH_P starts_with_initial
| expr LIKE_REGEX_P STRING_P
| expr LIKE_REGEX_P STRING_P FLAG_P STRING_P
;

starts_with_initial:
STRING_P { $$ = makeItemString(&$1); }
| VARIABLE_P { $$ = makeItemVariable(&$1); }
;

Maybe it wouldn't take more than s/STRING_P/starts_with_initial/
here (though I'd then rename starts_with_initial to something
more generic). With bad luck there might be some dependency on
this restriction downstream of the grammar, but I suspect not.
Didn't try though.

regards, tom lane

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#2)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

On Wed, Oct 18, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erwin Brandstetter <brsaweda@gmail.com> writes:

The functions jsonb_path_exists() and friends accept a "vars" parameter

for

parameter substitution in the jsonpath argument. This seems to work for

all

jsonpath operators except "like_regex":

SELECT * FROM tbl
WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $foo)',

'{"foo":

"CEO"}');

ERROR: syntax error at or near "$foo" of jsonpath input
LINE 3: WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex $...

Notably, the same works even for "starts with":

Hmm, maybe just an oversight in jsonpath_gram.y?

predicate:
...
| expr STARTS_P WITH_P starts_with_initial
| expr LIKE_REGEX_P STRING_P
| expr LIKE_REGEX_P STRING_P FLAG_P STRING_P
;

starts_with_initial:
STRING_P { $$ = makeItemString(&$1); }
| VARIABLE_P { $$ = makeItemVariable(&$1); }
;

Maybe it wouldn't take more than s/STRING_P/starts_with_initial/
here (though I'd then rename starts_with_initial to something
more generic). With bad luck there might be some dependency on
this restriction downstream of the grammar, but I suspect not.
Didn't try though.

I tried it, and it didn't work. No error, it just doesn't match
anything--including literal values which do match things in HEAD.

Maybe the problem is that the regex pattern is compiled at the same time
the jsonpath is compiled? Then it can't just have a different pattern
slotted in later through a variable.

For example, this finds the row in HEAD but not with the proposed change:

with tbl as (select '[{"value":"CEO"}]'::jsonb data) SELECT * FROM tbl
WHERE jsonb_path_exists(data, '$[*].value ? (@ like_regex "CEO")');

Cheers,

Jeff

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#3)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

Jeff Janes <jeff.janes@gmail.com> writes:

On Wed, Oct 18, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maybe it wouldn't take more than s/STRING_P/starts_with_initial/
here (though I'd then rename starts_with_initial to something
more generic).

I tried it, and it didn't work. No error, it just doesn't match
anything--including literal values which do match things in HEAD.

Oh, duh, makeItemLikeRegex() is expecting a plain JsonPathString,
and JsonPathParseItem.value.like_regex is set up for just a
constant pattern. We could no doubt fix that, but it's a bigger
lift than I was hoping.

Maybe the problem is that the regex pattern is compiled at the same time
the jsonpath is compiled?

Doesn't look that way to me: executeLikeRegex is just using
RE_compile_and_execute every time. (It's "caching" a text datum
representing the pattern string, which might be a good candidate for
the silliest use of caching I've ever seen in PG; it's surely not
buying any useful increment of performance.)

regards, tom lane

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#4)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

On Wed, Oct 18, 2023, 18:47 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeff Janes <jeff.janes@gmail.com> writes:

On Wed, Oct 18, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maybe it wouldn't take more than s/STRING_P/starts_with_initial/
here (though I'd then rename starts_with_initial to something
more generic).

I tried it, and it didn't work. No error, it just doesn't match
anything--including literal values which do match things in HEAD.

Oh, duh, makeItemLikeRegex() is expecting a plain JsonPathString,
and JsonPathParseItem.value.like_regex is set up for just a
constant pattern. We could no doubt fix that, but it's a bigger
lift than I was hoping.

Not in a position to do it myself but we should confirm we aren't simply
following the standard here, and if so decide whether we want to deviate.

David J.

Show quoted text
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#5)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Wed, Oct 18, 2023, 18:47 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Oh, duh, makeItemLikeRegex() is expecting a plain JsonPathString,
and JsonPathParseItem.value.like_regex is set up for just a
constant pattern. We could no doubt fix that, but it's a bigger
lift than I was hoping.

Not in a position to do it myself but we should confirm we aren't simply
following the standard here, and if so decide whether we want to deviate.

Hmm ... looks like we *are* following the standard:

<JSON like_regex predicate> ::=
<JSON path wff> like_regex <JSON like_regex pattern>
[ flag <JSON like_regex flags> ]
<JSON like_regex pattern> ::=
<JSON path string literal>
<JSON like_regex flag> ::=
<JSON path string literal>

whereas "starts with" has

<JSON starts with predicate> ::=
<JSON starts with whole> starts with <JSON starts with initial>
<JSON starts with whole> ::=
<JSON path wff>
<JSON starts with initial> ::=
<JSON path wff>

The text mentions that "the second operand is permitted to be an SQL/JSON
sequence and to support existential semantics", whereas they evidently
don't want that for a regex pattern.

regards, tom lane

#7Erwin Brandstetter
brsaweda@gmail.com
In reply to: Tom Lane (#6)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

On Thu, 19 Oct 2023 at 21:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Wed, Oct 18, 2023, 18:47 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Oh, duh, makeItemLikeRegex() is expecting a plain JsonPathString,
and JsonPathParseItem.value.like_regex is set up for just a
constant pattern. We could no doubt fix that, but it's a bigger
lift than I was hoping.

Not in a position to do it myself but we should confirm we aren't simply
following the standard here, and if so decide whether we want to deviate.

Hmm ... looks like we *are* following the standard:

<JSON like_regex predicate> ::=
<JSON path wff> like_regex <JSON like_regex pattern>
[ flag <JSON like_regex flags> ]
<JSON like_regex pattern> ::=
<JSON path string literal>
<JSON like_regex flag> ::=
<JSON path string literal>

whereas "starts with" has

<JSON starts with predicate> ::=
<JSON starts with whole> starts with <JSON starts with initial>
<JSON starts with whole> ::=
<JSON path wff>
<JSON starts with initial> ::=
<JSON path wff>

The text mentions that "the second operand is permitted to be an SQL/JSON
sequence and to support existential semantics", whereas they evidently
don't want that for a regex pattern.

So input from "vars" cannot be substituted into the jsonpath expression
after "like_regex" (as opposed to all other jsonpath operators). Seems
pretty random from a user's perspective.

Be that as it may, if that's the consensus, I'll have a closer look at the
manual page to try and convey the rules.
Do we have an accord?

Regards
Erwin

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erwin Brandstetter (#7)
Re: Variable substitution in jsonb functions fails for jsonpath operator like_regex

Erwin Brandstetter <brsaweda@gmail.com> writes:

On Thu, 19 Oct 2023 at 21:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm ... looks like we *are* following the standard:
...
The text mentions that "the second operand is permitted to be an SQL/JSON
sequence and to support existential semantics", whereas they evidently
don't want that for a regex pattern.

So input from "vars" cannot be substituted into the jsonpath expression
after "like_regex" (as opposed to all other jsonpath operators). Seems
pretty random from a user's perspective.

I agree it looks pretty random if you haven't drilled down into the
spec's fine print. Personally I wouldn't be opposed to extending
the spec here (not that I'm volunteering to write the patch).

Nosing around in jsonpath_gram.y, I see datetime_template as the
only other place where there's a random-seeming choice to allow
STRING_P but not VARIABLE_P. Should we tackle that too?

regards, tom lane