Performance of JSON_TABLE vs jsonb_to_recordset

Started by Alexander Lakhinover 1 year ago4 messages
#1Alexander Lakhin
exclusion@gmail.com
1 attachment(s)

Hello hackers,

When playing with JSON_TABLE, I tried to replace tenk1 in regression tests
with a view based on JSON_TABLE, with the same content, and discovered
that for one sub-optimal query it's execution duration increased many-fold.
With the preparation script attached, I see the following durations
(for a build compiled by clang 18.1.3 with -O3):
explain (verbose, analyze)
select
  (select max((select i.unique2 from tenk1 i where i.unique1 = o.unique1)))
from tenk1 o;
-- original tenk1
 Execution Time: 4769.481 ms

explain (verbose, analyze)
select
  (select max((select i.unique2 from jsonb_rs_tenk1 i where i.unique1 = o.unique1)))
from jsonb_rs_tenk1 o;
-- Function Call: jsonb_to_recordset...
 Execution Time: 6841.767 ms

explain (verbose, analyze)
select
  (select max((select i.unique2 from jsontable_tenk1 i where i.unique1 = o.unique1)))
from jsontable_tenk1 o;
-- Table Function Call: JSON_TABLE...
 Execution Time: 288310.131 ms
(with 63% of time spent inside ExecEvalJsonExprPath())

Just for fun I've tested also XMLTABLE with the similar content:
explain (verbose, analyze)
select
  (select max((select i.unique2 from xml_tenk1 i where i.unique1 = o.unique1)))
from xml_tenk1 o;
-- Table Function Call: XMLTABLE...
 Execution Time: 1235066.636 ms

Maybe it's worth to add a note to the JSON_TABLE() documentation saying that
jsonb_to_recordset is (inherently?) more performant when processing arrays
of flat structures for users not to re-discover this fact...

Best regards,
Alexander

Attachments:

jsontable-perf-setup.sqlapplication/sql; name=jsontable-perf-setup.sqlDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#1)
Re: Performance of JSON_TABLE vs jsonb_to_recordset

Alexander Lakhin <exclusion@gmail.com> writes:

When playing with JSON_TABLE, I tried to replace tenk1 in regression tests
with a view based on JSON_TABLE, with the same content, and discovered
that for one sub-optimal query it's execution duration increased many-fold.
With the preparation script attached, I see the following durations
(for a build compiled by clang 18.1.3 with -O3):
explain (verbose, analyze)
select
  (select max((select i.unique2 from tenk1 i where i.unique1 = o.unique1)))
from tenk1 o;
-- original tenk1
 Execution Time: 4769.481 ms

Hm, I get about 13 ms for that example. Do you have some really
expensive debugging infrastructure enabled, perhaps?

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: Performance of JSON_TABLE vs jsonb_to_recordset

I wrote:

Alexander Lakhin <exclusion@gmail.com> writes:

explain (verbose, analyze)
select
  (select max((select i.unique2 from tenk1 i where i.unique1 = o.unique1)))
from tenk1 o;
-- original tenk1
 Execution Time: 4769.481 ms

Hm, I get about 13 ms for that example. Do you have some really
expensive debugging infrastructure enabled, perhaps?

Oh, never mind, now I see you are testing a version of the table
with no indexes, rather than the way it's set up in the regression
database. Apologies for the noise.

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#1)
Re: Performance of JSON_TABLE vs jsonb_to_recordset

Alexander Lakhin <exclusion@gmail.com> writes:

explain (verbose, analyze)
select
(select max((select i.unique2 from jsontable_tenk1 i where i.unique1 = o.unique1)))
from jsontable_tenk1 o;
-- Table Function Call: JSON_TABLE...
Execution Time: 288310.131 ms
(with 63% of time spent inside ExecEvalJsonExprPath())

Yeah, I looked at this with perf too, and what I'm seeing is

- 55.87% ExecEvalJsonExprPath
- 39.30% JsonPathValue
- 37.63% executeJsonPath
- 34.87% executeItem (inlined)
- executeItemOptUnwrapTarget
- 32.39% executeNextItem
- 31.02% executeItem (inlined)
- 30.90% executeItemOptUnwrapTarget
- 26.81% getKeyJsonValueFromContainer
14.35% getJsonbOffset (inlined)
- 4.90% lengthCompareJsonbString (inlined)
3.19% __memcmp_avx2_movbe
- 2.32% palloc
1.67% AllocSetAlloc
0.93% fillJsonbValue
1.18% executeNextItem
0.51% findJsonbValueFromContainer
- 1.04% jspGetNext
0.72% jspInitByBuffer
- 1.46% check_stack_depth
stack_is_too_deep (inlined)
0.61% jspInitByBuffer
- 9.82% ExecGetJsonValueItemString (inlined)
- 8.68% DirectFunctionCall1Coll
- 8.07% numeric_out
- 6.15% get_str_from_var
- 2.07% palloc
- 1.80% AllocSetAlloc
0.72% AllocSetAllocChunkFromBlock (inlined)
1.28% init_var_from_num
- 1.61% namein
0.90% __strlen_avx2
0.52% palloc0
- 0.74% int4in
0.69% pg_strtoint32_safe

Depressingly small amount of useful work being done there compared
to the management overhead. Seems like some micro-optimization
in this area could be a useful project for v18.

regards, tom lane