jsonb existence queries are misimplemented by jsonb_ops
I wrote:
The readability of that comment starts to go downhill with its use of
"reset" to refer to what everything else calls a "recheck" flag, and in
any case it's claiming that we *don't* need a recheck for exists (a
statement I suspect to be false, but more later).
And, indeed, it's false:
regression=# create table j (f1 jsonb);
CREATE TABLE
regression=# insert into j values ('{"foo": {"bar": "baz"}}');
INSERT 0 1
regression=# insert into j values ('{"foo": {"blah": "baz"}}');
INSERT 0 1
regression=# insert into j values ('{"fool": {"bar": "baz"}}');
INSERT 0 1
regression=# create index on j using gin(f1);
CREATE INDEX
regression=# select * from j where f1 ? 'bar';
f1
----
(0 rows)
regression=# set enable_seqscan to 0;
SET
regression=# select * from j where f1 ? 'bar';
f1
--------------------------
{"foo": {"bar": "baz"}}
{"fool": {"bar": "baz"}}
(2 rows)
The indexscan is incorrectly returning rows where the queried key exists
but isn't at top-level.
We could fix this either by giving up on no-recheck for existence queries,
or by changing the way that non-top-level keys get indexed. However
I suspect the latter would break containment queries, or at least make
their lives a lot more difficult.
Another idea would be to change the definition of the exists operator
so that it *does* look into sub-objects. It seems rather random to me
that containment looks into sub-objects but exists doesn't. However,
possibly there are good reasons for the non-orthogonality.
Thoughts?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I wrote:
Another idea would be to change the definition of the exists operator
so that it *does* look into sub-objects. It seems rather random to me
that containment looks into sub-objects but exists doesn't. However,
possibly there are good reasons for the non-orthogonality.
No, wait, containment *doesn't* look into sub-objects:
regression=# select * from j where f1 @> '{"foo": {"bar": "baz"}}';
f1
-------------------------
{"foo": {"bar": "baz"}}
(1 row)
regression=# select * from j where f1 @> '{"bar": "baz"}';
f1
----
(0 rows)
This is rather surprising in view of the way that section 8.14.4
goes on about nesting. But I guess the user-facing docs for jsonb
are in little better shape than the internal docs.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 7, 2014 at 1:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, wait, containment *doesn't* look into sub-objects:
regression=# select * from j where f1 @> '{"foo": {"bar": "baz"}}';
f1
-------------------------
{"foo": {"bar": "baz"}}
(1 row)regression=# select * from j where f1 @> '{"bar": "baz"}';
f1
----
(0 rows)
Yes it does. It's just not intended to work like that. You have to
"give a path" to what you're looking for.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 7, 2014 at 1:51 PM, Peter Geoghegan <pg@heroku.com> wrote:
Yes it does. It's just not intended to work like that. You have to
"give a path" to what you're looking for.
There is exactly one exception explicitly noted as such in the
user-visible docs, which is arrays and the containment of single
elements.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Peter Geoghegan <pg@heroku.com> writes:
On Wed, May 7, 2014 at 1:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, wait, containment *doesn't* look into sub-objects:
regression=# select * from j where f1 @> '{"foo": {"bar": "baz"}}';
f1
-------------------------
{"foo": {"bar": "baz"}}
(1 row)regression=# select * from j where f1 @> '{"bar": "baz"}';
f1
----
(0 rows)
Yes it does. It's just not intended to work like that. You have to
"give a path" to what you're looking for.
Right, so the top-level item in the query has to be at top level in the
searched-for row. This means that we *could* distinguish top-level from
non-top-level keys in the index items, and still be able to generate
correct index probes for a containment search (since we could similarly
mark the generated keys as to whether they came from top level in the
query object).
So the question is whether that's a good idea or not. It seems like it is
a good idea for the current existence operator, and about a wash for the
current containment operator, but perhaps a bad idea for other definitions
of containment.
In any case, something here needs to change. Thoughts?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 7, 2014 at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The indexscan is incorrectly returning rows where the queried key exists
but isn't at top-level.We could fix this either by giving up on no-recheck for existence queries,
or by changing the way that non-top-level keys get indexed. However
I suspect the latter would break containment queries, or at least make
their lives a lot more difficult.
Thanks for looking into this.
I don't know that it would make life all that much harder for
containment. jsonb_ops is already not compelling for testing
containment where there is a lot of nesting. ISTM that
make_scalar_key() could be taught to respect that distinction. ISTM
that containment will indifferently treat the top-level keys as top
level, and the nested keys as nested, because if you're looking at
something nested you have to put a top-level key in the rhs value to
do so, and it will be marked as top level, or not top level in a
bijective fashion.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers