jsonb search

Started by armand pirvualmost 10 years ago3 messagesgeneral
Jump to latest
#1armand pirvu
armand.pirvu@gmail.com

Hi

In my quest of JSONB querying and searching without having to actually cast into a text, I found JSQuery

I do admit my JSONB knowledge shortcoming and I am not a developer but a DBA. As such some examples would be greatly appreciated since I tend to understand better

I compiled and installed the extension

1 - Exact matching without knowing the hierarchy, just the key and element, I built a set like

col1 | col2
------+--------------------------------------------------
1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}

JSQuqery is super

SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';

Now I can do a performance boost using

CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);

I see this yield

from

testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual time=0.016..160.777 rows=1 loops=1)
Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Rows Removed by Filter: 499999
Planning time: 0.042 ms
Execution time: 160.799 ms
(5 rows)

to

testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual time=0.018..0.019 rows=1 loops=1)
Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Planning time: 0.039 ms
Execution time: 0.038 ms
(7 rows)

A whooping 4000 times improvement

But I also noticed a vodka index

testdb=# CREATE INDEX idx2 ON
testdb-# test1 USING vodka (col2);
ERROR: access method "vodka" does not exist

What am I missing ?

2 - Is there anyway I can accomplish a pattern and/or case insensitive search using JSQuery similar to

select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress') ilike '%3%YAH%';

select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like '%3%yah%';

If so what indexing strategy can be used to have similar gains as above ?

Many thanks for any help

Armand

#2Oleg Bartunov
oleg@sai.msu.su
In reply to: armand pirvu (#1)
Re: jsonb search

On Tue, Jun 28, 2016 at 5:15 PM, Armand Pirvu (home)
<armand.pirvu@gmail.com> wrote:

Hi

In my quest of JSONB querying and searching without having to actually cast
into a text, I found JSQuery

I do admit my JSONB knowledge shortcoming and I am not a developer but a
DBA. As such some examples would be greatly appreciated since I tend to
understand better

I compiled and installed the extension

1 - Exact matching without knowing the hierarchy, just the key and element,
I built a set like

col1 | col2
------+--------------------------------------------------
1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}

JSQuqery is super

SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';

Now I can do a performance boost using

CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);

I see this yield

from

testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress
= "10@yahoo.com"';
Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual
time=0.016..160.777 rows=1 loops=1)
Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Rows Removed by Filter: 499999
Planning time: 0.042 ms
Execution time: 160.799 ms
(5 rows)

to

testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress =
"10@yahoo.com"';
Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual
time=0.018..0.019 rows=1 loops=1)
Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual
time=0.011..0.011 rows=1 loops=1)
Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Planning time: 0.039 ms
Execution time: 0.038 ms
(7 rows)

A whooping 4000 times improvement

But I also noticed a vodka index

testdb=# CREATE INDEX idx2 ON
testdb-# test1 USING vodka (col2);
ERROR: access method "vodka" does not exist

What am I missing ?

2 - Is there anyway I can accomplish a pattern and/or case insensitive
search using JSQuery similar to

select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress')
ilike '%3%YAH%';

select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like
'%3%yah%';

If so what indexing strategy can be used to have similar gains as above ?

Many thanks for any help

Vodka is our experimental prototype of access method of next
generation and it doesn't exists in production-ready form. You can
check our presentation
http://www.sai.msu.su/~megera/postgres/talks/highload-2014-vodka.pdf
to understand jsquery limitation and why we stop its development.
Also, 2 years ago I wrote (in russian)
http://obartunov.livejournal.com/179422.html about jsonb query
language and our plans. Google translate might helps

https://translate.google.com/translate?sl=auto&amp;tl=en&amp;js=y&amp;prev=_t&amp;hl=en&amp;ie=UTF-8&amp;u=http%3A%2F%2Fobartunov.livejournal.com%2F179422.html&amp;edit-text=&amp;act=url

Armand

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Arthur Silva
arthurprs@gmail.com
In reply to: Oleg Bartunov (#2)
Re: jsonb search

On Tue, Jun 28, 2016 at 5:09 PM, Oleg Bartunov <obartunov@gmail.com> wrote:

On Tue, Jun 28, 2016 at 5:15 PM, Armand Pirvu (home)
<armand.pirvu@gmail.com> wrote:

Hi

In my quest of JSONB querying and searching without having to actually

cast

into a text, I found JSQuery

I do admit my JSONB knowledge shortcoming and I am not a developer but a
DBA. As such some examples would be greatly appreciated since I tend to
understand better

I compiled and installed the extension

1 - Exact matching without knowing the hierarchy, just the key and

element,

I built a set like

col1 | col2
------+--------------------------------------------------
1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}

JSQuqery is super

SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';

Now I can do a performance boost using

CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);

I see this yield

from

testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@

'*.EmailAddress

= "10@yahoo.com"';
Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual
time=0.016..160.777 rows=1 loops=1)
Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Rows Removed by Filter: 499999
Planning time: 0.042 ms
Execution time: 160.799 ms
(5 rows)

to

testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress =
"10@yahoo.com"';
Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68)

(actual

time=0.018..0.019 rows=1 loops=1)
Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0)

(actual

time=0.011..0.011 rows=1 loops=1)
Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com

"'::jsquery)

Planning time: 0.039 ms
Execution time: 0.038 ms
(7 rows)

A whooping 4000 times improvement

But I also noticed a vodka index

testdb=# CREATE INDEX idx2 ON
testdb-# test1 USING vodka (col2);
ERROR: access method "vodka" does not exist

What am I missing ?

2 - Is there anyway I can accomplish a pattern and/or case insensitive
search using JSQuery similar to

select * from test2 where upper((col2 -> 'Home Email') ->>

'EmailAddress')

ilike '%3%YAH%';

select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like
'%3%yah%';

If so what indexing strategy can be used to have similar gains as above ?

Many thanks for any help

Vodka is our experimental prototype of access method of next
generation and it doesn't exists in production-ready form. You can
check our presentation
http://www.sai.msu.su/~megera/postgres/talks/highload-2014-vodka.pdf
to understand jsquery limitation and why we stop its development.
Also, 2 years ago I wrote (in russian)
http://obartunov.livejournal.com/179422.html about jsonb query
language and our plans. Google translate might helps

https://translate.google.com/translate?sl=auto&amp;tl=en&amp;js=y&amp;prev=_t&amp;hl=en&amp;ie=UTF-8&amp;u=http%3A%2F%2Fobartunov.livejournal.com%2F179422.html&amp;edit-text=&amp;act=url

Armand

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Could you share your future plans for it (or it's reincarnation), if any?

Even in the limited form, vodka is very impressive.

--
Arthur Silva