PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
Hi,
Mark Callaghan reported a regression in 15, in the post linked below (with
comments in the twitter thread, hence the link):
https://twitter.com/MarkCallaghanDB/status/1537475430227161098
A differential flame graph shows increased time spent doing memory
allocations, below ExecInitExpr():
https://mdcallag.github.io/reports/22_06_06_ibench.20m.pg.all/l.i0.0.143.15b1.n.svg
Quite the useful comparison.
That led me to look at the size of ExprEvalStep - and indeed, it has
*exploded* in 15. 10-13: 64 bytes, 14: 88 bytes, 15: 320 bytes.
The comment above the union for data fields says:
/*
* Inline data for the operation. Inline data is faster to access, but
* also bloats the size of all instructions. The union should be kept to
* no more than 40 bytes on 64-bit systems (so that the entire struct is
* no more than 64 bytes, a single cacheline on common systems).
*/
However, jsonexpr/EEOP_JSONEXPR is 296 bytes, and
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.
The EEOP_JSONEXPR stuff was added during 15 development in:
commit 1a36bc9dba8eae90963a586d37b6457b32b2fed4
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2022-03-03 13:11:14 -0500
SQL/JSON query functions
the EEOP_HASHED_SCALARARRAYOP stuff was added during 14 development in:
commit 50e17ad281b8d1c1b410c9833955bc80fbad4078
Author: David Rowley <drowley@postgresql.org>
Date: 2021-04-08 23:51:22 +1200
Speedup ScalarArrayOpExpr evaluation
Unfortunately ExprEvalStep is public, so I don't think we can fix the 14
regression. If somebody installed updated server packages while the server is
running, we could end up loading extensions referencing ExprEvalStep (at least
plpgsql and LLVMJIT).
It's not great to have an ABI break at this point of the 15 cycle, but I don't
think we have a choice. Exploding the size of ExprEvalStep by ~4x is bad -
both for memory overhead (expressions often have dozens to hundreds of steps)
and expression evaluation performance.
The Hashed SAO case can perhaps be squeezed sufficiently to fit inline, but
clearly that's not going to happen for the json case. So we should just move
that out of line.
Maybe it's worth sticking a StaticAssert() for the struct size somewhere. I'm
a bit wary about that being too noisy, there are some machines with odd
alignment requirements. Perhaps worth restricting the assertion to x86-64 +
armv8 or such?
It very well might be that this isn't the full explanation of the regression
Mark observed. E.g. the difference in DecodeDateTime() looks more likely to be
caused by 591e088dd5b - but we need to fix the above issue, whether it's the
cause of the regression Mark observed or not.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
However, jsonexpr/EEOP_JSONEXPR is 296 bytes, and
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.
Oops.
Maybe it's worth sticking a StaticAssert() for the struct size
somewhere.
Indeed. I thought we had one already.
I'm a bit wary about that being too noisy, there are some machines with
odd alignment requirements. Perhaps worth restricting the assertion to
x86-64 + armv8 or such?
I'd put it in first and only reconsider if it shows unfixable problems.
regards, tom lane
Hi,
On 2022-06-16 16:31:30 -0700, Andres Freund wrote:
The EEOP_JSONEXPR stuff was added during 15 development in:
commit 1a36bc9dba8eae90963a586d37b6457b32b2fed4
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2022-03-03 13:11:14 -0500
SQL/JSON query functions
I'm quite confused about part of the struct definition of this:
struct JsonCoercionsState
{
struct JsonCoercionState
{
JsonCoercion *coercion; /* coercion expression */
ExprState *estate; /* coercion expression state */
} null,
string,
numeric ,
boolean,
date,
time,
timetz,
timestamp,
timestamptz,
composite;
} coercions; /* states for coercion from SQL/JSON item
* types directly to the output type */
Why on earth do we have coercion state for all these different types? That
really adds up:
struct {
JsonExpr * jsexpr; /* 24 8 */
struct {
FmgrInfo func; /* 32 48 */
/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
Oid typioparam; /* 80 4 */
} input; /* 32 56 */
/* XXX last struct has 4 bytes of padding */
NullableDatum * formatted_expr; /* 88 8 */
NullableDatum * res_expr; /* 96 8 */
NullableDatum * coercion_expr; /* 104 8 */
NullableDatum * pathspec; /* 112 8 */
ExprState * result_expr; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
ExprState * default_on_empty; /* 128 8 */
ExprState * default_on_error; /* 136 8 */
List * args; /* 144 8 */
void * cache; /* 152 8 */
struct JsonCoercionsState coercions; /* 160 160 */
} jsonexpr; /* 24 296 */
And why is FmgrInfo stored inline in the struct? Everything else just stores
pointers to FmgrInfo.
Now that I look at this: It's a *bad* idea to have a subsidiary ExprState inside
an ExprState. Nearly always the correct thing to do is to build those
expressions as part of the containing expression. There's plenty of memory and
evaluation overhead in jumping to a different expression. And I see no reason
for doing it that way here?
This stuff doesn't look ready.
Greetings,
Andres Freund
On Fri, 17 Jun 2022 at 11:31, Andres Freund <andres@anarazel.de> wrote:
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.
commit 50e17ad281b8d1c1b410c9833955bc80fbad4078
Author: David Rowley <drowley@postgresql.org>
Date: 2021-04-08 23:51:22 +1200
Speedup ScalarArrayOpExpr evaluation
I've put together the attached patch which removes 4 fields from the
hashedscalararrayop portion of the struct which, once the JSON part is
fixed, will put sizeof(ExprEvalStep) back down to 64 bytes again.
The attached patch causes some extra pointer dereferencing to perform
a hashed saop step, so I tested the performance on f4fb45d15 (prior to
the JSON patch that pushed the sizeof(ExprEvalStep) up further). I
found:
setup:
create table a (a int);
insert into a select x from generate_series(1000000,2000000) x;
bench.sql
select * from a where a in(1,2,3,4,5,6,7,8,9,10);
f4fb45d15 + reduce_sizeof_hashedsaop_ExprEvalStep.patch
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.841851 (without initial connection time)
tps = 44.986472 (without initial connection time)
tps = 44.944315 (without initial connection time)
f4fb45d15
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.446127 (without initial connection time)
tps = 44.614134 (without initial connection time)
tps = 44.895011 (without initial connection time)
(Patched is ~0.61% faster here)
So, there appears to be no performance regression due to the extra
indirection. There's maybe even some gains due to the smaller step
size.
David
Attachments:
On Thu, Jun 16, 2022 at 7:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
So, there appears to be no performance regression due to the extra
indirection. There's maybe even some gains due to the smaller step
size.
Have you tried this with the insert benchmark [1]?
I've run it myself in the past (when working on B-Tree deduplication).
It's quite straightforward to set up and run.
[1] http://smalldatum.blogspot.com/2017/06/the-insert-benchmark.html
--
Peter Geoghegan
On Fri, 17 Jun 2022 at 15:33, Peter Geoghegan <pg@bowt.ie> wrote:
Have you tried this with the insert benchmark [1]?
I was mostly focusing on the performance of the hashed saop feature
after having removed the additional fields that pushed ExprEvalStep
over 64 bytes in 14.
I agree it would be good to do further benchmarking to see if there's
anything else that's snuck into 15 that's slowed that benchmark down,
but we can likely work on that after we get the ExprEvalStep size back
to 64 bytes again.
David
Hi,
On 2022-06-17 16:53:31 +1200, David Rowley wrote:
On Fri, 17 Jun 2022 at 15:33, Peter Geoghegan <pg@bowt.ie> wrote:
Have you tried this with the insert benchmark [1]?
I was mostly focusing on the performance of the hashed saop feature
after having removed the additional fields that pushed ExprEvalStep
over 64 bytes in 14.
I agree it would be good to do further benchmarking to see if there's
anything else that's snuck into 15 that's slowed that benchmark down,
but we can likely work on that after we get the ExprEvalStep size back
to 64 bytes again.
I did reproduce a regression between 14 and 15, using both pgbench -Mprepared
-S (scale 1) and TPC-H Q01 (scale 5). Between 7-10% - not good, particularly
given that it hasn't been found so far. Fixing the json size issue gets that down
to ~2%. Not sure what that's caused by yet.
Greetings,
Andres Freund
Attachments:
Hi,
On 2022-06-16 22:22:28 -0700, Andres Freund wrote:
On 2022-06-17 16:53:31 +1200, David Rowley wrote:
On Fri, 17 Jun 2022 at 15:33, Peter Geoghegan <pg@bowt.ie> wrote:
Have you tried this with the insert benchmark [1]?
I was mostly focusing on the performance of the hashed saop feature
after having removed the additional fields that pushed ExprEvalStep
over 64 bytes in 14.
I agree it would be good to do further benchmarking to see if there's
anything else that's snuck into 15 that's slowed that benchmark down,
but we can likely work on that after we get the ExprEvalStep size back
to 64 bytes again.
I did reproduce a regression between 14 and 15, using both pgbench -Mprepared
-S (scale 1) and TPC-H Q01 (scale 5). Between 7-10% - not good, particularly
given that it hasn't been found so far. Fixing the json size issue gets that down
to ~2%. Not sure what that's caused by yet.
The remaining difference looks like it's largely caused by the
enable_timeout_after(IDLE_STATS_UPDATE_TIMEOUT, ...) introduced as part of the
pgstats patch. It's only really visible when I pin a single connection pgbench
to the same CPU core as the server (which gives a ~16% boost here).
It's not the timeout itself - that we amortize nicely (via 09cf1d522). It's
that enable_timeout_after() does a GetCurrentTimestamp().
Not sure yet what the best way to fix that is.
We could just leave the timer active and add some gating condition indicating
idleness to the IdleStatsUpdateTimeoutPending body in ProcessInterrupts()?
Or we could add a timeout.c API that specifies the timeout?
pgstat_report_stat() uses GetCurrentTransactionStopTimestamp(), it seems like
it'd make sense to use the same for arming the timeout?
- Andres
At Thu, 16 Jun 2022 23:24:56 -0700, Andres Freund <andres@anarazel.de> wrote in
The remaining difference looks like it's largely caused by the
enable_timeout_after(IDLE_STATS_UPDATE_TIMEOUT, ...) introduced as part of the
pgstats patch. It's only really visible when I pin a single connection pgbench
to the same CPU core as the server (which gives a ~16% boost here).
It's not the timeout itself - that we amortize nicely (via 09cf1d522). It's
that enable_timeout_after() does a GetCurrentTimestamp().
Not sure yet what the best way to fix that is.
We could just leave the timer active and add some gating condition indicating
idleness to the IdleStatsUpdateTimeoutPending body in ProcessInterrupts()?
Or we could add a timeout.c API that specifies the timeout?
I sometimes wanted this, but I don't see a simple way to sort multiple
relative timeouts in absolute time order. Maybe we can skip
GetCurrentTimestamp only when inserting the first timeout, but I don't
think it benefits this case.
pgstat_report_stat() uses GetCurrentTransactionStopTimestamp(), it seems like
it'd make sense to use the same for arming the timeout?
This seems like the feasible best fix for this specific issue.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 17 Jun 2022 15:54:13 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Or we could add a timeout.c API that specifies the timeout?
I sometimes wanted this, but I don't see a simple way to sort multiple
relative timeouts in absolute time order. Maybe we can skip
GetCurrentTimestamp only when inserting the first timeout, but I don't
think it benefits this case.
Or we can use a free-running interval timer and an individual down-counter
for each timeout. I think we need at most 0.1s resolution, and the error
of a long-running timer doesn't harm?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 17 Jun 2022 15:59:26 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Fri, 17 Jun 2022 15:54:13 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Or we could add a timeout.c API that specifies the timeout?
I sometimes wanted this, but I don't see a simple way to sort multiple
relative timeouts in absolute time order. Maybe we can skip
GetCurrentTimestamp only when inserting the first timeout, but I don't
think it benefits this case.
Or we can use a free-running interval timer and an individual down-counter
for each timeout. I think we need at most 0.1s resolution, and the error
of a long-running timer doesn't harm?
Yeah, stupid. We don't want to wake the process at such a high frequency..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Jun 16, 2022 at 10:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 17 Jun 2022 at 11:31, Andres Freund <andres@anarazel.de> wrote:
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.
commit 50e17ad281b8d1c1b410c9833955bc80fbad4078
Author: David Rowley <drowley@postgresql.org>
Date: 2021-04-08 23:51:22 +1200
Speedup ScalarArrayOpExpr evaluation
I've put together the attached patch which removes 4 fields from the
hashedscalararrayop portion of the struct which, once the JSON part is
fixed, will put sizeof(ExprEvalStep) back down to 64 bytes again.
The attached patch causes some extra pointer dereferencing to perform
a hashed saop step, so I tested the performance on f4fb45d15 (prior to
the JSON patch that pushed the sizeof(ExprEvalStep) up further). I
found:
setup:
create table a (a int);
insert into a select x from generate_series(1000000,2000000) x;
bench.sql
select * from a where a in(1,2,3,4,5,6,7,8,9,10);
f4fb45d15 + reduce_sizeof_hashedsaop_ExprEvalStep.patch
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.841851 (without initial connection time)
tps = 44.986472 (without initial connection time)
tps = 44.944315 (without initial connection time)
f4fb45d15
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.446127 (without initial connection time)
tps = 44.614134 (without initial connection time)
tps = 44.895011 (without initial connection time)
(Patched is ~0.61% faster here)
So, there appears to be no performance regression due to the extra
indirection. There's maybe even some gains due to the smaller step
size.
I didn't see that comment when working on this (it's quite a long
unioned struct; I concur on adding an assert to catch it).
This patch looks very reasonable to me though.
James Coleman
Andres Freund <andres@anarazel.de> writes:
The remaining difference looks like it's largely caused by the
enable_timeout_after(IDLE_STATS_UPDATE_TIMEOUT, ...) introduced as part of the
pgstats patch. It's only really visible when I pin a single connection pgbench
to the same CPU core as the server (which gives a ~16% boost here).
It's not the timeout itself - that we amortize nicely (via 09cf1d522). It's
that enable_timeout_after() does a GetCurrentTimestamp().
Not sure yet what the best way to fix that is.
Maybe not queue a new timeout if the old one is still active?
BTW, it looks like that patch also falsified this comment
(postgres.c:4478):
* At most one of these timeouts will be active, so there's no need to
* worry about combining the timeout.c calls into one.
Maybe fixing that end of things would be a simpler way of buying back
the delta.
Or we could add a timeout.c API that specifies the timeout?
Don't think that will help: it'd be morally equivalent to
enable_timeout_at(), which also has to do GetCurrentTimestamp().
regards, tom lane
I wrote:
Andres Freund <andres@anarazel.de> writes:
Or we could add a timeout.c API that specifies the timeout?
Don't think that will help: it'd be morally equivalent to
enable_timeout_at(), which also has to do GetCurrentTimestamp().
BTW, if we were willing to drop get_timeout_start_time(), it might
be possible to avoid doing GetCurrentTimestamp() in enable_timeout_at,
in the common case where the specified timestamp is beyond signal_due_at
so that no setitimer call is needed. But getting the race conditions
right could be tricky. On the whole this doesn't sound like something
to tackle post-beta.
regards, tom lane
Hi,
On 2022-06-17 14:14:54 +1200, David Rowley wrote:
I've put together the attached patch which removes 4 fields from the
hashedscalararrayop portion of the struct which, once the JSON part is
fixed, will put sizeof(ExprEvalStep) back down to 64 bytes again.
The attached patch causes some extra pointer dereferencing to perform
a hashed saop step, so I tested the performance on f4fb45d15 (prior to
the JSON patch that pushed the sizeof(ExprEvalStep) up further). I
found:
What do you think about the approach prototyped in my patch to move the hash
FunctionCallInfo into the element_tab? With a tiny bit more work that should
reduce the amount of dereferencing over the state today, while also keeping
below the limit?
setup:
create table a (a int);
insert into a select x from generate_series(1000000,2000000) x;
bench.sql
select * from a where a in(1,2,3,4,5,6,7,8,9,10);
f4fb45d15 + reduce_sizeof_hashedsaop_ExprEvalStep.patch
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.841851 (without initial connection time)
tps = 44.986472 (without initial connection time)
tps = 44.944315 (without initial connection time)
f4fb45d15
drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 -M prepared postgres
tps = 44.446127 (without initial connection time)
tps = 44.614134 (without initial connection time)
tps = 44.895011 (without initial connection time)
(Patched is ~0.61% faster here)
So, there appears to be no performance regression due to the extra
indirection. There's maybe even some gains due to the smaller step
size.
"smaller step size"?
Greetings,
Andres Freund
Hi,
On 2022-06-17 10:33:08 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
The remaining difference looks like it's largely caused by the
enable_timeout_after(IDLE_STATS_UPDATE_TIMEOUT, ...) introduced as part of the
pgstats patch. It's only really visible when I pin a single connection pgbench
to the same CPU core as the server (which gives a ~16% boost here).
It's not the timeout itself - that we amortize nicely (via 09cf1d522). It's
that enable_timeout_after() does a GetCurrentTimestamp().
Not sure yet what the best way to fix that is.
Maybe not queue a new timeout if the old one is still active?
Right now we disable the timer after ReadCommand(). We can of course change
that. At first I thought we might need more bookkeeping to do so, to avoid
ProcessInterrupts() triggering pgstat_report_stat() when the timer fires
later, but we probably can jury-rig something with DoingCommandRead &&
IsTransactionOrTransactionBlock() or such.
I guess one advantage of something like this could be that we could possibly
move the arming of the timeout to pgstat.c. But that looks like it might be
more complicated than really worth it.
BTW, it looks like that patch also falsified this comment
(postgres.c:4478):
* At most one of these timeouts will be active, so there's no need to
* worry about combining the timeout.c calls into one.
Hm, yea. I guess we can just disable them at once.
Maybe fixing that end of things would be a simpler way of buying back
the delta.
I don't think that'll do the trick - in the case I'm looking at none of the
other timers are active...
Or we could add a timeout.c API that specifies the timeout?
Don't think that will help: it'd be morally equivalent to
enable_timeout_at(), which also has to do GetCurrentTimestamp().
I should have been more precise - what I meant was a timeout.c API that allows
the caller to pass in "now", which in this case we'd get from
GetCurrentTransactionStopTimestamp(), which would avoid the additional
timestamp computation.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
I should have been more precise - what I meant was a timeout.c API that allows
the caller to pass in "now", which in this case we'd get from
GetCurrentTransactionStopTimestamp(), which would avoid the additional
timestamp computation.
I don't care for that one bit: it makes the accuracy of all timeouts
dependent on how careful that caller is to provide an up-to-date "now".
In the example at hand, there is WAY too much code between
SetCurrentTransactionStopTimestamp() and the timer arming to make me
think the results will be acceptable.
regards, tom lane
Hi,
On 2022-06-17 13:43:49 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I should have been more precise - what I meant was a timeout.c API that allows
the caller to pass in "now", which in this case we'd get from
GetCurrentTransactionStopTimestamp(), which would avoid the additional
timestamp computation.I don't care for that one bit: it makes the accuracy of all timeouts
dependent on how careful that caller is to provide an up-to-date "now".
I don't think it'd necessarily have to influence the accuracy of all timeouts
- but I've not looked at timeout.c much before. From what I understand we use
'now' for two things: First, to set ->start_time in enable_timeout() and
second to schedule the alarm in schedule_alarm(). An inaccurate start_time
won't cause problems for other timers afaics and it looks to me that it
wouldn't be too hard to only require an accurate 'now' if the new timeout is
nearest_timeout and now + nearest_timeout < signal_due_at?
It's probably too complicated to tinker with now though.
Greetings,
Andres Freund
Hi,
On 2022-06-17 10:30:55 -0700, Andres Freund wrote:
On 2022-06-17 10:33:08 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
The remaining difference looks like it's largely caused by the
enable_timeout_after(IDLE_STATS_UPDATE_TIMEOUT, ...) introduced as part of the
pgstats patch. It's only really visible when I pin a single connection pgbench
to the same CPU core as the server (which gives a ~16% boost here).
It's not the timeout itself - that we amortize nicely (via 09cf1d522). It's
that enable_timeout_after() does a GetCurrentTimestamp().
Not sure yet what the best way to fix that is.
Maybe not queue a new timeout if the old one is still active?
Right now we disable the timer after ReadCommand(). We can of course change
that. At first I thought we might need more bookkeeping to do so, to avoid
ProcessInterrupts() triggering pgstat_report_stat() when the timer fires
later, but we probably can jury-rig something with DoingCommandRead &&
IsTransactionOrTransactionBlock() or such.
Here's a patch for that.
One thing I noticed is that disable_timeout() calls do
schedule_alarm(GetCurrentTimestamp()) if there's any other active timeout,
even if the to-be-disabled timer is already disabled. Of course callers of
disable_timeout() can guard against that using get_timeout_active(), but that
spreads repetitive code around...
I opted to add a fastpath for that, instead of using
get_timeout_active(). Afaics that's safe to do without disarming the signal
handler, but I'd welcome a look from somebody that knows this code.
I guess one advantage of something like this could be that we could possibly
move the arming of the timeout to pgstat.c. But that looks like it might be
more complicated than really worth it.
I didn't do that yet, but am curious whether others think this would be
preferable.
BTW, it looks like that patch also falsified this comment
(postgres.c:4478):
* At most one of these timeouts will be active, so there's no need to
* worry about combining the timeout.c calls into one.
Hm, yea. I guess we can just disable them at once.
With the proposed change we don't need to combine the separate timeout.c calls
into one, or update the comment, as it should now look the same as in 14.
I also attached my heavily-WIP patches for the ExprEvalStep issues, I
accidentally had only included a small part of the contents of the json fix.
Greetings,
Andres Freund
Attachments:
On 2022-06-17 Fr 16:06, Andres Freund wrote:
I also attached my heavily-WIP patches for the ExprEvalStep issues,
Many thanks
I
accidentally had only included a small part of the contents of the json fix.
Yeah, that confused me mightily last week :-)
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2022-06-21 17:11:33 -0400, Andrew Dunstan wrote:
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
Cool.
Any chance you could look at fixing the "structure" of the generated
expression "program". The recursive ExecEvalExpr() calls are really not ok...
Greetings,
Andres Freund
On 2022-06-21 Tu 17:25, Andres Freund wrote:
Hi,
On 2022-06-21 17:11:33 -0400, Andrew Dunstan wrote:
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
Cool.
Any chance you could look at fixing the "structure" of the generated
expression "program". The recursive ExecEvalExpr() calls are really not ok...
Yes, but I don't guarantee to have a fix in time for Beta2.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Tue, Jun 21, 2022 at 05:41:07PM -0400, Andrew Dunstan wrote:
On 2022-06-21 Tu 17:25, Andres Freund wrote:
On 2022-06-21 17:11:33 -0400, Andrew Dunstan wrote:
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
Cool.
Any chance you could look at fixing the "structure" of the generated
expression "program". The recursive ExecEvalExpr() calls are really not ok...
By how much does the size of ExprEvalStep go down once you don't
inline the JSON structures as of 0004 in [1]? And what of 0003? The
JSON portions seem like the largest portion of the cake, though both
are must-fixes.
Yes, but I don't guarantee to have a fix in time for Beta2.
IMHO, it would be nice to get something done for beta2. Now the
thread is rather fresh and I guess that more performance study is
required even for 0004, so.. Waiting for beta3 would be a better move at
this stage. Is somebody confident enough in the patches proposed?
0004 looks rather sane, seen from here, at least.
[1] /messages/by-id/20220617200605.3moq7dtxua5cxemv@alap3.anarazel.de
--
Michael
Hi,
On 2022-06-23 16:38:12 +0900, Michael Paquier wrote:
On Tue, Jun 21, 2022 at 05:41:07PM -0400, Andrew Dunstan wrote:
On 2022-06-21 Tu 17:25, Andres Freund wrote:
On 2022-06-21 17:11:33 -0400, Andrew Dunstan wrote:
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
Cool.
Any chance you could look at fixing the "structure" of the generated
expression "program". The recursive ExecEvalExpr() calls are really not ok...
By how much does the size of ExprEvalStep go down once you don't
inline the JSON structures as of 0004 in [1]? And what of 0003?
0004 gets us back to 64 bytes, if 0003 is applied first. 0003 alone doesn't
yield a size reduction, because obviously 0004 is the bigger problem. Applying
just 0004 you end up with 88 bytes.
The JSON portions seem like the largest portion of the cake, though both are
must-fixes.
Yep.
Yes, but I don't guarantee to have a fix in time for Beta2.
IMHO, it would be nice to get something done for beta2. Now the
thread is rather fresh and I guess that more performance study is
required even for 0004, so..
I don't think there's a whole lot of performance study needed for 0004 - the
current state is obviously wrong.
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Waiting for beta3 would be a better move at this stage. Is somebody confident
enough in the patches proposed?
0001 is the one that needs the most careful analysis, I think. 0002 I'd be fine
with pushing after reviewing it again. For 0003 David's approach might be
better or worse, it doesn't matter much I think. 0004 is ok I think, perhaps
with the exception of quibbling over some naming decisions?
Greetings,
Andres Freund
On 2022-06-23 Th 21:51, Andres Freund wrote:
Hi,
On 2022-06-23 16:38:12 +0900, Michael Paquier wrote:
On Tue, Jun 21, 2022 at 05:41:07PM -0400, Andrew Dunstan wrote:
On 2022-06-21 Tu 17:25, Andres Freund wrote:
On 2022-06-21 17:11:33 -0400, Andrew Dunstan wrote:
I and a couple of colleagues have looked it over. As far as it goes the
json fix looks kosher to me. I'll play with it some more.
Cool.
Any chance you could look at fixing the "structure" of the generated
expression "program". The recursive ExecEvalExpr() calls are really not ok...
By how much does the size of ExprEvalStep go down once you don't
inline the JSON structures as of 0004 in [1]? And what of 0003?
0004 gets us back to 64 bytes, if 0003 is applied first. 0003 alone doesn't
yield a size reduction, because obviously 0004 is the bigger problem. Applying
just 0004 you end up with 88 bytes.
The JSON portions seem like the largest portion of the cake, though both are
must-fixes.
Yep.
Yes, but I don't guarantee to have a fix in time for Beta2.
IMHO, it would be nice to get something done for beta2. Now the
thread is rather fresh and I guess that more performance study is
required even for 0004, so..
I don't think there's a whole lot of performance study needed for 0004 - the
current state is obviously wrong.
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Waiting for beta3 would be a better move at this stage. Is somebody confident
enough in the patches proposed?
0001 is the one that needs the most careful analysis, I think. 0002 I'd be fine
with pushing after reviewing it again. For 0003 David's approach might be
better or worse, it doesn't matter much I think. 0004 is ok I think, perhaps
with the exception of quibbling over some naming decisions?
The attached very small patch applies on top of your 0002 and deals with
the FmgrInfo complaint.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments:
On Sat, 18 Jun 2022 at 05:21, Andres Freund <andres@anarazel.de> wrote:
On 2022-06-17 14:14:54 +1200, David Rowley wrote:
So, there appears to be no performance regression due to the extra
indirection. There's maybe even some gains due to the smaller step
size.
"smaller step size"?
I mean smaller sizeof(ExprEvalStep).
David
On Sat, 18 Jun 2022 at 08:06, Andres Freund <andres@anarazel.de> wrote:
I also attached my heavily-WIP patches for the ExprEvalStep issues, I
accidentally had only included a small part of the contents of the json fix.
I've now looked at the 0003 patch. I like the idea you have about
moving some of the additional fields into ScalarArrayOpExprHashTable.
I think the patch can even go a little further and move the hash_finfo
into there too. This means we don't need to dereference the "op" in
saop_element_hash().
To make this work, I did need to tag the ScalarArrayOpExpr into the
ExprEvalStep. That's required now since some of the initialization of
the hash function fields is delayed until
ExecEvalHashedScalarArrayOp(). We need to know the
ScalarArrayOpExpr's hashfuncid and inputcollid.
Your v2 patch did shift off some of this initialization work to
ExecEvalHashedScalarArrayOp(). The attached v3 takes that a bit
further. This saves a bit more work for ScalarArrayOpExprs that are
evaluated 0 times.
Another small thing which I considered doing was to put the
hash_fcinfo_data field as the final field in
ScalarArrayOpExprHashTable so that we could allocate the memory for
the hash_fcinfo_data in the same allocation as the
ScalarArrayOpExprHashTable. This would reduce the pointer
dereferencing done in saop_element_hash() a bit further. I just
didn't notice anywhere else where we do that for FunctionCallInfo, so
I resisted doing this.
(There was also a small bug in your patch where you mistakenly cast to
an OpExpr instead of ScalarArrayOpExpr when you were fetching the
inputcollid)
David
Attachments:
Hi,
On 2022-06-24 10:29:06 -0400, Andrew Dunstan wrote:
On 2022-06-23 Th 21:51, Andres Freund wrote:
On 2022-06-23 16:38:12 +0900, Michael Paquier wrote:
On Tue, Jun 21, 2022 at 05:41:07PM -0400, Andrew Dunstan wrote:
Yes, but I don't guarantee to have a fix in time for Beta2.
IMHO, it would be nice to get something done for beta2. Now the
thread is rather fresh and I guess that more performance study is
required even for 0004, so..
I don't think there's a whole lot of performance study needed for 0004 - the
current state is obviously wrong.
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Any updates here?
I'd mentioned the significant space use due to all JsonCoercionsState for all
the types. Another related aspect is that this code is just weird - the same
struct name (JsonCoercionsState), nested in each other?
struct JsonCoercionsState
{
struct JsonCoercionState
{
JsonCoercion *coercion; /* coercion expression */
ExprState *estate; /* coercion expression state */
} null,
string,
numeric ,
boolean,
date,
time,
timetz,
timestamp,
timestamptz,
composite;
} coercions; /* states for coercion from SQL/JSON item
* types directly to the output type */
Also note the weird numeric indentation that pgindent does...
The attached very small patch applies on top of your 0002 and deals with
the FmgrInfo complaint.
Now that the FmgrInfo is part of a separately allocated struct, that doesn't
seem necessary anymore.
- Andres
On 2022-07-05 Tu 14:36, Andres Freund wrote:
Hi,
On 2022-06-24 10:29:06 -0400, Andrew Dunstan wrote:
On 2022-06-23 Th 21:51, Andres Freund wrote:
On 2022-06-23 16:38:12 +0900, Michael Paquier wrote:
On Tue, Jun 21, 2022 at 05:41:07PM -0400, Andrew Dunstan wrote:
Yes, but I don't guarantee to have a fix in time for Beta2.
IMHO, it would be nice to get something done for beta2. Now the
thread is rather fresh and I guess that more performance study is
required even for 0004, so..
I don't think there's a whole lot of performance study needed for 0004 - the
current state is obviously wrong.
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Any updates here?
Not yet. A colleague and I are working on it. I'll post a status this
week if we can't post a fix.
I'd mentioned the significant space use due to all JsonCoercionsState for all
the types. Another related aspect is that this code is just weird - the same
struct name (JsonCoercionsState), nested in each other?
struct JsonCoercionsState
{
struct JsonCoercionState
{
JsonCoercion *coercion; /* coercion expression */
ExprState *estate; /* coercion expression state */
} null,
string,
numeric ,
boolean,
date,
time,
timetz,
timestamp,
timestamptz,
composite;
} coercions; /* states for coercion from SQL/JSON item
* types directly to the output type */
Also note the weird numeric indentation that pgindent does...
Yeah, we'll try to fix that.
The attached very small patch applies on top of your 0002 and deals with
the FmgrInfo complaint.
Now that the FmgrInfo is part of a separately allocated struct, that doesn't
seem necessary anymore.
Right, but you complained that we should do it the same way as it's done
elsewhere, so I thought I'd do that anyway.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2022-06-23 18:51:45 -0700, Andres Freund wrote:
Waiting for beta3 would be a better move at this stage. Is somebody confident
enough in the patches proposed?
0001 is the one that needs the most careful analysis, I think. 0002 I'd be fine
with pushing after reviewing it again. For 0003 David's approach might be
better or worse, it doesn't matter much I think. 0004 is ok I think, perhaps
with the exception of quibbling over some naming decisions?
I don't quite feel comfortable with 0001 without review by others. So my
current plan is to drop it and use get_timeout_active() "manually". We can
improve this in HEAD to remove the redundancy.
I've pushed what was 0004, will push what was 0002 with the above change in a
short while unless somebody protests PDQ. Then will look at David's edition of
my 0003.
Greetings,
Andres Freund
Hi,
On 2022-06-29 11:40:45 +1200, David Rowley wrote:
On Sat, 18 Jun 2022 at 08:06, Andres Freund <andres@anarazel.de> wrote:
I also attached my heavily-WIP patches for the ExprEvalStep issues, I
accidentally had only included a small part of the contents of the json fix.
I've now looked at the 0003 patch. I like the idea you have about
moving some of the additional fields into ScalarArrayOpExprHashTable.
I think the patch can even go a little further and move the hash_finfo
into there too. This means we don't need to dereference the "op" in
saop_element_hash().
Makes sense.
To make this work, I did need to tag the ScalarArrayOpExpr into the
ExprEvalStep. That's required now since some of the initialization of
the hash function fields is delayed until
ExecEvalHashedScalarArrayOp(). We need to know the
ScalarArrayOpExpr's hashfuncid and inputcollid.
Makes sense.
Another small thing which I considered doing was to put the
hash_fcinfo_data field as the final field in
ScalarArrayOpExprHashTable so that we could allocate the memory for
the hash_fcinfo_data in the same allocation as the
ScalarArrayOpExprHashTable. This would reduce the pointer
dereferencing done in saop_element_hash() a bit further. I just
didn't notice anywhere else where we do that for FunctionCallInfo, so
I resisted doing this.
I think that'd make sense - it does add a bit of size calculation magic, but
it shouldn't be a problem. I'm fairly sure we do this in other parts of the
code.
(There was also a small bug in your patch where you mistakenly cast to
an OpExpr instead of ScalarArrayOpExpr when you were fetching the
inputcollid)
Ooops.
Are you good pushing this? I'm fine with you doing so whether you adapt it
further or not.
Greetings,
Andres Freund
Thanks for looking at this.
On Wed, 6 Jul 2022 at 12:32, Andres Freund <andres@anarazel.de> wrote:
On 2022-06-29 11:40:45 +1200, David Rowley wrote:
Another small thing which I considered doing was to put the
hash_fcinfo_data field as the final field in
ScalarArrayOpExprHashTable so that we could allocate the memory for
the hash_fcinfo_data in the same allocation as the
ScalarArrayOpExprHashTable. This would reduce the pointer
dereferencing done in saop_element_hash() a bit further. I just
didn't notice anywhere else where we do that for FunctionCallInfo, so
I resisted doing this.
I think that'd make sense - it does add a bit of size calculation magic, but
it shouldn't be a problem. I'm fairly sure we do this in other parts of the
code.
I've now adjusted that. I also changed the hash_finfo field to make
it so the FmgrInfo is inline rather than a pointer. This saves an
additional dereference in saop_element_hash() and also saves a
palloc().
I had to adjust the palloc for the ScalarArrayOpExprHashTable struct
into a palloc0 due to the FmgrInfo being inlined. I considered just
zeroing out the hash_finfo portion but thought it wasn't worth the
extra code.
Are you good pushing this? I'm fine with you doing so whether you adapt it
further or not.
Pushed.
David
On 2022-07-05 Tu 15:04, Andrew Dunstan wrote:
On 2022-07-05 Tu 14:36, Andres Freund wrote:
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Any updates here?
Not yet. A colleague and I are working on it. I'll post a status this
week if we can't post a fix.
We're still working on it. We've made substantial progress but there are
some tests failing that we need to fix.
I'd mentioned the significant space use due to all JsonCoercionsState for all
the types. Another related aspect is that this code is just weird - the same
struct name (JsonCoercionsState), nested in each other?
struct JsonCoercionsState
{
struct JsonCoercionState
{
JsonCoercion *coercion; /* coercion expression */
ExprState *estate; /* coercion expression state */
} null,
string,
numeric ,
boolean,
date,
time,
timetz,
timestamp,
timestamptz,
composite;
} coercions; /* states for coercion from SQL/JSON item
* types directly to the output type */
Also note the weird numeric indentation that pgindent does...
Yeah, we'll try to fix that.
Actually, it's not the same name: JsonCoercionsState vs
JsonCoercionState. But I agree that it's a subtle enough difference that
we should use something more obvious. Maybe JsonCoercionStates instead
of JsonCoercionsState? The plural at the end would be harder to miss.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2022-07-08 17:05:49 -0400, Andrew Dunstan wrote:
Actually, it's not the same name: JsonCoercionsState vs
JsonCoercionState. But I agree that it's a subtle enough difference that
we should use something more obvious. Maybe JsonCoercionStates instead
of JsonCoercionsState? The plural at the end would be harder to miss.
Given that it's a one-off use struct, why name it? Then we don't have to
figure out a name we never use.
I also still would like to understand why we need pre-allocated space for all
these types. How could multiple datums be coerced in an interleaved manner?
And if that's possible, why can't multiple datums of the same type be coerced
at the same time?
Greetings,
Andres Freund
Hi,
On 2022-07-08 17:05:49 -0400, Andrew Dunstan wrote:
On 2022-07-05 Tu 15:04, Andrew Dunstan wrote:
On 2022-07-05 Tu 14:36, Andres Freund wrote:
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Any updates here?
Not yet. A colleague and I are working on it. I'll post a status this
week if we can't post a fix.
We're still working on it. We've made substantial progress but there are
some tests failing that we need to fix.
I think we need to resolve this soon - or consider the alternatives. A lot of
the new json stuff doesn't seem fully baked, so I'm starting to wonder if we
have to consider pushing it a release further down.
Perhaps you could post your current state? I might be able to help resolving
some of the problems.
Greetings,
Andres Freund
On 2022-07-15 Fr 17:07, Andres Freund wrote:
Hi,
On 2022-07-08 17:05:49 -0400, Andrew Dunstan wrote:
On 2022-07-05 Tu 15:04, Andrew Dunstan wrote:
On 2022-07-05 Tu 14:36, Andres Freund wrote:
I think Andrew's beta 2 comment was more about my other architectural
complaints around the json expression eval stuff.
Right. That's being worked on but it's not going to be a mechanical fix.
Any updates here?
Not yet. A colleague and I are working on it. I'll post a status this
week if we can't post a fix.
We're still working on it. We've made substantial progress but there are
some tests failing that we need to fix.
I think we need to resolve this soon - or consider the alternatives. A lot of
the new json stuff doesn't seem fully baked, so I'm starting to wonder if we
have to consider pushing it a release further down.
Perhaps you could post your current state? I might be able to help resolving
some of the problems.
Ok. Here is the state of things. This has proved to be rather more
intractable than I expected. Almost all the legwork here has been done
by Amit Langote, for which he deserves both my thanks and considerable
credit, but I take responsibility for it.
I just discovered today that this scheme is failing under
"force_parallel_mode = regress". I have as yet no idea if that can be
fixed simply or not. Apart from that I think the main outstanding issue
is to fill in the gaps in llvm_compile_expr().
If you have help you can offer that would be very welcome.
I'd still very much like to get this done, but if the decision is we've
run out of time I'll be sad but understand.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments:
Hi,
On Tue, Jul 19, 2022 at 4:09 AM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2022-07-15 Fr 17:07, Andres Freund wrote:
Perhaps you could post your current state? I might be able to help resolving
some of the problems.
Ok. Here is the state of things. This has proved to be rather more
intractable than I expected. Almost all the legwork here has been done
by Amit Langote, for which he deserves both my thanks and considerable
credit, but I take responsibility for it.
I just discovered today that this scheme is failing under
"force_parallel_mode = regress". I have as yet no idea if that can be
fixed simply or not.
The errors Andrew mentions here had to do with a bug in the new
coercion evaluation logic. The old code in ExecEvalJsonExpr() would
skip coercion evaluation, and thus also the sub-transaction associated
with it, for some JsonExprs that the new code would not, and that didn't
sit well with the invariant that a parallel worker shouldn't try to
start a sub-transaction.
That bug has been fixed in the attached updated version.
Apart from that I think the main outstanding issue
is to fill in the gaps in llvm_compile_expr().
About that, I was wondering if the blocks in llvm_compile_expr() need
to be hand-coded to match what's added in ExecInterpExpr() or if I've
missed some tool that can be used instead?
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
Hi,
On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
About that, I was wondering if the blocks in llvm_compile_expr() need
to be hand-coded to match what's added in ExecInterpExpr() or if I've
missed some tool that can be used instead?
The easiest way is to just call an external function for the implementation of
the step. But yes, otherwise you need to handcraft it.
Greetings,
Andres Freund
On Wed, Jul 20, 2022 at 12:37 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
About that, I was wondering if the blocks in llvm_compile_expr() need
to be hand-coded to match what's added in ExecInterpExpr() or if I've
missed some tool that can be used instead?
The easiest way is to just call an external function for the implementation of
the step. But yes, otherwise you need to handcraft it.
Ok, thanks.
So I started updating llvm_compile_expr() for handling the new
ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
realized that code could have been consolidated into less code, or
IOW, into fewer new ExprEvalSteps. So, I refactored things that way
and am now retrying adding the code to llvm_compile_expr() based on
new, better consolidated, code.
Here's the updated version, without the llvm pieces, in case you'd
like to look at it even in this state. I'll post a version with llvm
pieces filled in tomorrow. (I have merged the different patches into
one for convenience.)
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
On Wed, Jul 20, 2022 at 11:09 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Jul 20, 2022 at 12:37 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
About that, I was wondering if the blocks in llvm_compile_expr() need
to be hand-coded to match what's added in ExecInterpExpr() or if I've
missed some tool that can be used instead?The easiest way is to just call an external function for the implementation of
the step. But yes, otherwise you need to handcraft it.
Ok, thanks.
So I started updating llvm_compile_expr() for handling the new
ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
realized that code could have been consolidated into less code, or
IOW, into fewer new ExprEvalSteps. So, I refactored things that way
and am now retrying adding the code to llvm_compile_expr() based on
new, better consolidated, code.
Here's the updated version, without the llvm pieces, in case you'd
like to look at it even in this state. I'll post a version with llvm
pieces filled in tomorrow. (I have merged the different patches into
one for convenience.)
And here's a version with llvm pieces filled in.
Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:
set jit_above_cost to 0;
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
On Thu, Jul 21, 2022 at 11:55 PM Amit Langote <amitlangote09@gmail.com>
wrote:
On Wed, Jul 20, 2022 at 11:09 PM Amit Langote <amitlangote09@gmail.com>
wrote:
On Wed, Jul 20, 2022 at 12:37 AM Andres Freund <andres@anarazel.de>
wrote:
On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
About that, I was wondering if the blocks in llvm_compile_expr()
need
to be hand-coded to match what's added in ExecInterpExpr() or if
I've
missed some tool that can be used instead?
The easiest way is to just call an external function for the
implementation of
the step. But yes, otherwise you need to handcraft it.
Ok, thanks.
So I started updating llvm_compile_expr() for handling the new
ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
realized that code could have been consolidated into less code, or
IOW, into fewer new ExprEvalSteps. So, I refactored things that way
and am now retrying adding the code to llvm_compile_expr() based on
new, better consolidated, code.
Here's the updated version, without the llvm pieces, in case you'd
like to look at it even in this state. I'll post a version with llvm
pieces filled in tomorrow. (I have merged the different patches into
one for convenience.)
And here's a version with llvm pieces filled in.
Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:
set jit_above_cost to 0;
Oh and I did build --with-llvm. :-)
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On 2022-Jul-21, Amit Langote wrote:
Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:
I suggest building with --enable-coverage, then run the regression tests
and do "make coverage-html" and see if your code appears covered in the
report.
--
Álvaro Herrera PostgreSQL Developer - https://www.EnterpriseDB.com/
"How can you trust something you pay for and don't see,
and not trust something you're given and shown?" (Germán Poo)
On Fri, Jul 22, 2022 at 2:12 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Jul-21, Amit Langote wrote:
Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:
I suggest building with --enable-coverage, then run the regression tests
and do "make coverage-html" and see if your code appears covered in the
report.
Thanks for the suggestion. I just did and it seems that both the
additions to ExecInterpExpr() and to llvm_compile_expr() are well
covered.
BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
is to add `set jit_above_cost to 0` at the top of the test file, or
are we missing a force_jit_mode, like there is force_parallel_mode?
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Fri, 22 Jul 2022 at 15:22, Amit Langote <amitlangote09@gmail.com> wrote:
BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
is to add `set jit_above_cost to 0` at the top of the test file, or
are we missing a force_jit_mode, like there is force_parallel_mode?
I don't think we'd need any setting to hide the JIT counters from
EXPLAIN ANALYZE since those only show with COSTS ON, which we tend not
to do.
I think for testing, you could just zero all the jit*above_cost GUCs.
If you look at the config_extra in [1], you'll see that animal runs
the tests with modified JIT parameters.
BTW, I was working on code inside llvm_compile_expr() a few days ago
and I thought I'd gotten the new evaluation steps I was adding correct
as it worked fine with jit_above_cost=0, but on further testing, it
crashed with jit_inline_above_cost=0. Might be worth doing both to see
if everything works as intended.
David
[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2022-07-22%2003%3A04%3A03
On Fri, Jul 22, 2022 at 1:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 22 Jul 2022 at 15:22, Amit Langote <amitlangote09@gmail.com> wrote:
BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
is to add `set jit_above_cost to 0` at the top of the test file, or
are we missing a force_jit_mode, like there is force_parallel_mode?
I don't think we'd need any setting to hide the JIT counters from
EXPLAIN ANALYZE since those only show with COSTS ON, which we tend not
to do.
Ah, makes sense.
I think for testing, you could just zero all the jit*above_cost GUCs.
If you look at the config_extra in [1], you'll see that animal runs
the tests with modified JIT parameters.
BTW, I was working on code inside llvm_compile_expr() a few days ago
and I thought I'd gotten the new evaluation steps I was adding correct
as it worked fine with jit_above_cost=0, but on further testing, it
crashed with jit_inline_above_cost=0. Might be worth doing both to see
if everything works as intended.
Thanks for the pointer.
So I didn't see things going bust on re-testing with all
jit_*_above_cost parameters set to 0, so maybe the
llvm_compile_expr() additions are alright.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Fri, Jul 22, 2022 at 2:49 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Jul 22, 2022 at 1:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
BTW, I was working on code inside llvm_compile_expr() a few days ago
and I thought I'd gotten the new evaluation steps I was adding correct
as it worked fine with jit_above_cost=0, but on further testing, it
crashed with jit_inline_above_cost=0. Might be worth doing both to see
if everything works as intended.
Thanks for the pointer.
So I didn't see things going bust on re-testing with all
jit_*_above_cost parameters set to 0, so maybe the
llvm_compile_expr() additions are alright.
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
On 2022-07-27 We 04:01, Amit Langote wrote:
On Fri, Jul 22, 2022 at 2:49 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Jul 22, 2022 at 1:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
BTW, I was working on code inside llvm_compile_expr() a few days ago
and I thought I'd gotten the new evaluation steps I was adding correct
as it worked fine with jit_above_cost=0, but on further testing, it
crashed with jit_inline_above_cost=0. Might be worth doing both to see
if everything works as intended.
Thanks for the pointer.
So I didn't see things going bust on re-testing with all
jit_*_above_cost parameters set to 0, so maybe the
llvm_compile_expr() additions are alright.
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.
Andres,
this work has been done in response to a complaint from you. Does this
address your concerns satisfactorily?
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2022-07-29 14:27:36 -0400, Andrew Dunstan wrote:
this work has been done in response to a complaint from you. Does this
address your concerns satisfactorily?
Will look. Was on vacation for the last two weeks...
Greetings,
Andres Freund
Hi,
On 2022-07-27 17:01:13 +0900, Amit Langote wrote:
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.
- I've asked a couple times before: Why do we need space for every possible
datatype at once in JsonItemCoercions? Can there be multiple "concurrent"
coercions in process?
The whole coercion stuff just seems incredibly clunky (in a slightly
different shape before this patch). ExecEvalJsonExprItemCoercion() calls
ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type
elements in JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a convoluted
path), which again will dispatch on the type (extracting the json object
again afaics!), to then somehow eventually get the coerced value.
I cannot make any sense of this. This code should not have been committed
in this state.
- Looks like there's still some recursive expression states, namely
JsonExprState->{result_coercion, coercions}?
- Looks like the JsonExpr code in ExecInitExprRec() is big enough to
potentially benefit from splitting out into a separate function?
- looks like JsonExprPostEvalState could be moved to execExprInterp.c?
- I ran the patch against LLVM 14 built with assertions enabled, and it
triggers an assertion failure:
#3 0x00007f75d165c242 in __GI___assert_fail (
assertion=0x7f75c278d511 "getOperand(0)->getType() == getOperand(1)->getType() && \"Both operands to ICmp instruction are not of the same type!\"",
file=0x7f75c2780366 "/home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h", line=1192,
function=0x7f75c27d9dcc "void llvm::ICmpInst::AssertOK()") at assert.c:101
#4 0x00007f75c2b9b25c in llvm::ICmpInst::AssertOK (this=0x55e019290ca0) at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h:1191
#5 0x00007f75c2b9b0ea in llvm::ICmpInst::ICmpInst (this=0x55e019290ca0, pred=llvm::CmpInst::ICMP_EQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, NameStr="")
at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h:1246
#6 0x00007f75c2b93c99 in llvm::IRBuilderBase::CreateICmp (this=0x55e0192894f0, P=llvm::CmpInst::ICMP_EQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, Name="")
at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/IRBuilder.h:2202
#7 0x00007f75c2c1bc5d in LLVMBuildICmp (B=0x55e0192894f0, Op=LLVMIntEQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, Name=0x7f75d0d24cbc "")
at /home/andres/src/llvm-project-14/llvm/lib/IR/Core.cpp:3927
#8 0x00007f75d0d20b1f in llvm_compile_expr (state=0x55e019201380) at /home/andres/src/postgresql/src/backend/jit/llvm/llvmjit_expr.c:2392
...
#19 0x000055e0184c16d4 in exec_simple_query (query_string=0x55e01912f6e0 "SELECT JSON_EXISTS(NULL::jsonb, '$');") at /home/andres/src/postgresql/src/backend/tcop/postgres.c:1204
this triggers easily interactively - which is nice because that allows to
dump the types:
p getOperand(0)->getType()->dump() -> prints i64
p getOperand(1)->getType()->dump() -> prints i32
The immediate issue is that you're setting v_jumpaddrp up as a pointer to a
pointer to size_t - but then compare it to i32.
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?
- why is EvalJsonPathVar() in execExprInterp.c, when it's only ever called
from within jsonpath_exec.c?
- s/JsobbValue/JsonbValue/
Greetings,
Andres Freund
Hi,
Thanks for looking into this.
On Tue, Aug 2, 2022 at 9:39 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-07-27 17:01:13 +0900, Amit Langote wrote:
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.
- I've asked a couple times before: Why do we need space for every possible
datatype at once in JsonItemCoercions? Can there be multiple "concurrent"
coercions in process?
This topic has been a head-scratcher for me too from the beginning,
but I've since come to understand (convince myself) that we do need
the coercions for all possible types, because we don't know the type
of the JSON item that's going to pop out of the main JSON path
expression until we've run it through the JSON path executor that
ExecEvalJson() invokes. So, it's not possible to statically assign
the coercion. I am not really sure if different coercions may be used
in the same query over multiple evaluations of the same JSON path
expression, but maybe that's also possible.
The whole coercion stuff just seems incredibly clunky (in a slightly
different shape before this patch). ExecEvalJsonExprItemCoercion() calls
ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type
elements in JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a convoluted
path), which again will dispatch on the type (extracting the json object
again afaics!), to then somehow eventually get the coerced value.
I think it might be possible to make this a bit simpler, by not
leaving anything coercion-related in ExecEvalJsonExpr(). I left some
pieces there, because I thought the error of not finding an
appropriate coercion must be thrown right away as the code in
ExecEvalJsonExpr() does after calling ExecGetJsonItemCoercion().
ExecPrepareJsonItemCoercion() is called later when it's time to
actually evaluate the coercion. If we move the error path to
ExecPrepareJsonItemCoercion(), both ExecGetJsonItemCoercion() and the
error path code in ExecEvalJsonExpr() will be unnecessary. I will
give that a try.
- Looks like there's still some recursive expression states, namely
JsonExprState->{result_coercion, coercions}?
So, the problem with inlining coercion evaluation into the main parent
JsonExpr's steps is that it needs to be wrapped in a sub-transaction to
catch any errors and return NULL instead. I don't know a way to wrap
ExprEvalStep evaluation in a sub-transaction to achieve that effect.
- Looks like the JsonExpr code in ExecInitExprRec() is big enough to
potentially benefit from splitting out into a separate function?
Thought about it too, so will do.
- looks like JsonExprPostEvalState could be moved to execExprInterp.c?
OK, will give that a try.
- I ran the patch against LLVM 14 built with assertions enabled, and it
triggers an assertion failure:
#3 0x00007f75d165c242 in __GI___assert_fail (
assertion=0x7f75c278d511 "getOperand(0)->getType() == getOperand(1)->getType() && \"Both operands to ICmp instruction are not of the same type!\"",
file=0x7f75c2780366 "/home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h", line=1192,
function=0x7f75c27d9dcc "void llvm::ICmpInst::AssertOK()") at assert.c:101
#4 0x00007f75c2b9b25c in llvm::ICmpInst::AssertOK (this=0x55e019290ca0) at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h:1191
#5 0x00007f75c2b9b0ea in llvm::ICmpInst::ICmpInst (this=0x55e019290ca0, pred=llvm::CmpInst::ICMP_EQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, NameStr="")
at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/Instructions.h:1246
#6 0x00007f75c2b93c99 in llvm::IRBuilderBase::CreateICmp (this=0x55e0192894f0, P=llvm::CmpInst::ICMP_EQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, Name="")
at /home/andres/src/llvm-project-14/llvm/include/llvm/IR/IRBuilder.h:2202
#7 0x00007f75c2c1bc5d in LLVMBuildICmp (B=0x55e0192894f0, Op=LLVMIntEQ, LHS=0x55e019290c10, RHS=0x55e01928ce80, Name=0x7f75d0d24cbc "")
at /home/andres/src/llvm-project-14/llvm/lib/IR/Core.cpp:3927
#8 0x00007f75d0d20b1f in llvm_compile_expr (state=0x55e019201380) at /home/andres/src/postgresql/src/backend/jit/llvm/llvmjit_expr.c:2392
...
#19 0x000055e0184c16d4 in exec_simple_query (query_string=0x55e01912f6e0 "SELECT JSON_EXISTS(NULL::jsonb, '$');") at /home/andres/src/postgresql/src/backend/tcop/postgres.c:1204

This triggers easily interactively - which is nice because that allows to
dump the types:

p getOperand(0)->getType()->dump() -> prints i64
p getOperand(1)->getType()->dump() -> prints i32

The immediate issue is that you're setting v_jumpaddrp up as a pointer to a
pointer to size_t - but then compare it to i32.
Ooh, thanks for letting me know. So maybe I am missing some
llvmjit_emit.h/type.c infrastructure to read an int32 value
(jumpdone) out of an int32 pointer (&jumpdone)?
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?
Hmm, I looked at the code for other expressions that jump, especially
CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
added statically. I thought we can't use them in this case, because
the conditions are very ad-hoc, like if the JSON path computation
returned an "empty" item or if the "error" flag was set during that
computation, etc.
- why is EvalJsonPathVar() in execExprInterp.c, when it's only ever called
from within jsonpath_exec.c?
Hadn't noticed that because the patch didn't really have to touch it,
but yes, maybe it makes sense to move it there.
- s/JsobbValue/JsonbValue/
Oops, will fix.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Hi,
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
On Tue, Aug 2, 2022 at 9:39 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-07-27 17:01:13 +0900, Amit Langote wrote:
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.

- I've asked a couple times before: Why do we need space for every possible
datatype at once in JsonItemCoercions? Can there be multiple "concurrent"
coercions in process?

This topic has been a head-scratcher for me too from the beginning,
but I've since come to understand (convince myself) that we do need
the coercions for all possible types, because we don't know the type
of the JSON item that's going to pop out of the main JSON path
expression until we've run it through the JSON path executor that
ExecEvalJson() invokes. So, it's not possible to statically assign
the coercion.
Sure. But that doesn't mean we have to have memory for every possible type *at
the same time*.
I am not really sure if different coercions may be used
in the same query over multiple evaluations of the same JSON path
expression, but maybe that's also possible.
Even if the type can change, I don't think that means we need to have space
for multiple types at the same time - there can't be multiple coercions
happening at the same time, otherwise there could be two coercions of the same
type as well. So we don't need memory for every coercion type.
The whole coercion stuff just seems incredibly clunky (in a slightly
different shape before this patch). ExecEvalJsonExprItemCoercion() calls
ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type
elements in JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a convoluted
path), which again will dispatch on the type (extracting the json object
again afaics!), to then somehow eventually get the coerced value.

I think it might be possible to make this a bit simpler, by not
leaving anything coercion-related in ExecEvalJsonExpr().
Honestly, this code seems like it should just be rewritten from scratch.
I left some pieces there, because I thought the error of not finding an
appropriate coercion must be thrown right away as the code in
ExecEvalJsonExpr() does after calling ExecGetJsonItemCoercion().
ExecPrepareJsonItemCoercion() is called later when it's time to
actually evaluate the coercion. If we move the error path to
ExecPrepareJsonItemCoercion(), both ExecGetJsonItemCoercion() and the
error path code in ExecEvalJsonExpr() will be unnecessary. I will
give that a try.
Why do we need the separation of prepare and then evaluation? They're executed
straight after each other?
- Looks like there's still some recursive expression states, namely
JsonExprState->{result_coercion, coercions}?

So, the problem with inlining coercion evaluation into the main parent
JsonExpr's is that it needs to be wrapped in a sub-transaction to
catch any errors and return NULL instead. I don't know a way to wrap
ExprEvalStep evaluation in a sub-transaction to achieve that effect.
But we don't need to wrap arbitrary evaluation in a subtransaction - afaics
the coercion calls a single function, not an arbitrary expression?
Ooh, thanks for letting me know. So maybe I am missing some
llvmjit_emit.h/type.c infrastructure to read an int32 value
(jumpdone) out of an int32 pointer (&jumpdone)?
No, you just need to replace l_ptr(TypeSizeT) with l_ptr(LLVMInt32Type()).
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?

Hmm, I looked at the code for other expressions that jump, especially
CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
added statically. I thought we can't use them in this case, because
the conditions are very ad-hoc, like if the JSON path computation
returned an "empty" item or if the "error" flag was set during that
computation, etc.
The minimal fix would be to return the jump target from the function, and then
jump to that. That at least avoids the roundtrip to memory you have right now.
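In plain C terms, the shape of that suggestion looks something like the sketch below. This is purely illustrative - MiniStep, eval_skip_store(), eval_skip_return(), and the step numbers are all invented, not the actual executor code - but it shows the difference between storing the jump target in the step (a roundtrip through memory) and returning it directly:

```c
#include <assert.h>

/* Hypothetical sketch: rather than the helper writing its chosen jump
 * target into a step field that the caller must then reload from memory,
 * the helper can simply return the target. */

typedef struct MiniStep
{
    int jumpdone;                       /* scratch slot for the jump target */
} MiniStep;

/* Current shape: store the target in the step; the caller reloads it. */
static void eval_skip_store(MiniStep *op, int is_empty)
{
    op->jumpdone = is_empty ? 3 : 1;    /* only two possible targets */
}

/* Suggested shape: just return the target - no roundtrip through memory. */
static int eval_skip_return(int is_empty)
{
    return is_empty ? 3 : 1;
}

/* Both conventions pick the same step address. */
static int next_step(MiniStep *op, int is_empty)
{
    eval_skip_store(op, is_empty);
    assert(op->jumpdone == eval_skip_return(is_empty));
    return op->jumpdone;
}
```

In the interpreted path the difference is a load, but for the JIT it also means the emitted code can branch on the call's return value instead of re-reading a struct field.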
Greetings,
Andres Freund
Hi,
On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
On Tue, Aug 2, 2022 at 9:39 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-07-27 17:01:13 +0900, Amit Langote wrote:
Here's an updated version of the patch, with mostly cosmetic changes.
In particular, I added comments describing the new llvm_compile_expr()
blobs.

- I've asked a couple times before: Why do we need space for every possible
datatype at once in JsonItemCoercions? Can there be multiple "concurrent"
coercions in process?

This topic has been a head-scratcher for me too from the beginning,
but I've since come to understand (convince myself) that we do need
the coercions for all possible types, because we don't know the type
of the JSON item that's going to pop out of the main JSON path
expression until we've run it through the JSON path executor that
ExecEvalJson() invokes. So, it's not possible to statically assign
the coercion.

Sure. But that doesn't mean we have to have memory for every possible type *at
the same time*.

I am not really sure if different coercions may be used
in the same query over multiple evaluations of the same JSON path
expression, but maybe that's also possible.

Even if the type can change, I don't think that means we need to have space
for multiple types at the same time - there can't be multiple coercions
happening at the same time, otherwise there could be two coercions of the same
type as well. So we don't need memory for every coercion type.
Do you find it unnecessary to statically allocate memory for
JsonItemCoercionState for each possible coercion, as in the following
struct definition:
typedef struct JsonItemCoercionsState
{
JsonItemCoercionState null;
JsonItemCoercionState string;
JsonItemCoercionState numeric;
JsonItemCoercionState boolean;
JsonItemCoercionState date;
JsonItemCoercionState time;
JsonItemCoercionState timetz;
JsonItemCoercionState timestamp;
JsonItemCoercionState timestamptz;
JsonItemCoercionState composite;
} JsonItemCoercionsState;
A given JsonItemCoercionState (note singular Coercion) contains:
typedef struct JsonItemCoercionState
{
/* Expression used to evaluate the coercion */
JsonCoercion *coercion;
/* ExprEvalStep to compute this coercion's expression */
int jump_eval_expr;
} JsonItemCoercionState;
jump_eval_expr above is the address in JsonItemCoercions'
ExprState.steps of the 1st ExprEvalStep corresponding to
coercion->expr. IIUC, all ExprEvalSteps needed to evaluate an
expression and its children must be allocated statically in
ExecInitExprRec(), and none on-the-fly as needed. So, this considers
all coercions and allocates states of all statically.
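One way to read Andres's objection is sketched below. This is purely illustrative (the names and the reduced set of coercion kinds are invented): since at most one coercion is ever in flight per JSON item, the scratch state could in principle be a single tagged slot that is re-pointed at the coercion chosen at run time, rather than one member per possible item type.

```c
#include <assert.h>

/* Illustrative alternative to the one-member-per-type struct quoted
 * above (hypothetical names; only three kinds shown): keep one slot
 * plus a tag, since only one coercion is evaluated per JSON item. */

typedef enum CoercionKind
{
    COERCE_STRING,
    COERCE_NUMERIC,
    COERCE_BOOLEAN
} CoercionKind;

typedef struct JsonItemCoercionState
{
    int jump_eval_expr;         /* step address of the coercion expression */
} JsonItemCoercionState;

typedef struct JsonItemCoercionsStateAlt
{
    CoercionKind          active;   /* which coercion the slot now holds */
    JsonItemCoercionState slot;     /* single slot, reused per evaluation */
} JsonItemCoercionsStateAlt;

/* Point the slot at the coercion selected for the current item. */
static int select_coercion(JsonItemCoercionsStateAlt *state,
                           CoercionKind kind, int jump_addr)
{
    state->active = kind;
    state->slot.jump_eval_expr = jump_addr;
    return state->slot.jump_eval_expr;
}
```

Whether this is compatible with ExecInitExprRec()'s requirement that all steps be laid out statically - the point raised above - is exactly the open question; the sketch only shows the memory shape, not the step allocation.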
The whole coercion stuff just seems incredibly clunky (in a slightly
different shape before this patch). ExecEvalJsonExprItemCoercion() calls
ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type
elements in JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a convoluted
path), which again will dispatch on the type (extracting the json object
again afaics!), to then somehow eventually get the coerced value.

I think it might be possible to make this a bit simpler, by not
leaving anything coercion-related in ExecEvalJsonExpr().

Honestly, this code seems like it should just be rewritten from scratch.
Based on what I wrote above, please let me know if I've misunderstood
your concerns about over-allocation of coercion state. I can try to
rewrite one more time if I know what this should look like instead.
I left some pieces there, because I thought the error of not finding an
appropriate coercion must be thrown right away as the code in
ExecEvalJsonExpr() does after calling ExecGetJsonItemCoercion().
ExecPrepareJsonItemCoercion() is called later when it's time to
actually evaluate the coercion. If we move the error path to
ExecPrepareJsonItemCoercion(), both ExecGetJsonItemCoercion() and the
error path code in ExecEvalJsonExpr() will be unnecessary. I will
give that a try.

Why do we need the separation of prepare and then evaluation? They're executed
straight after each other?
ExecPrepareJsonItemCoercion() is a helper routine to choose the
coercion and extract the Datum out of the JsonbValue produced by the
EEOP_JSONEXPR_PATH step to feed to the coercion expression's
ExprEvalStep. The coercion evaluation will be done by jumping to said
step in ExecInterpExpr().
- Looks like there's still some recursive expression states, namely
JsonExprState->{result_coercion, coercions}?

So, the problem with inlining coercion evaluation into the main parent
JsonExpr's is that it needs to be wrapped in a sub-transaction to
catch any errors and return NULL instead. I don't know a way to wrap
ExprEvalStep evaluation in a sub-transaction to achieve that effect.

But we don't need to wrap arbitrary evaluation in a subtransaction - afaics
the coercion calls a single function, not an arbitrary expression?
It can do EEOP_IOCOERCE for example, and the input/output function may
cause an error depending on what comes out of the JSON blob.
IIUC, those errors need to be caught to satisfy some SQL/JSON spec.
Ooh, thanks for letting me know. So maybe I am missing some
llvmjit_emit.h/type.c infrastructure to read an int32 value
(jumpdone) out of an int32 pointer (&jumpdone)?

No, you just need to replace l_ptr(TypeSizeT) with l_ptr(LLVMInt32Type()).
OK, thanks.
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?

Hmm, I looked at the code for other expressions that jump, especially
CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
added statically. I thought we can't use them in this case, because
the conditions are very ad-hoc, like if the JSON path computation
returned an "empty" item or if the "error" flag was set during that
computation, etc.

The minimal fix would be to return the jump target from the function, and then
jump to that. That at least avoids the roundtrip to memory you have right now.
You mean like this:
LLVMValueRef v_args[2];
LLVMValueRef v_ret;
/*
* Call ExecEvalJsonExprSkip() to decide if JSON path
* evaluation can be skipped. This returns the step
* address to jump to.
*/
v_args[0] = v_state;
v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
v_ret = LLVMBuildCall(b,
llvm_pg_func(mod,
"ExecEvalJsonExprSkip"),
v_args, lengthof(v_args), "");
Actually, this is how I had started, but never figured out how to jump
to the address in v_ret. As in, how to extract the plain C int32
value that is the jump address from v_ret, an LLVMValueRef, to use in
the following statement:
LLVMBuildBr(b, opblocks[<int32-in-v_ret>]);
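For what it's worth, the underlying constraint is that opblocks[] can only be indexed with a value known while the IR is being built; a target chosen at run time needs a dispatch construct over the small, statically known set of possible targets (a switch or a chain of conditional branches). A plain-C analogy, with invented step numbers and return codes standing in for "branch to opblocks[n]":

```c
#include <assert.h>

/* Plain-C analogy: a runtime integer cannot pick an array element at
 * IR-build time; instead, dispatch over the known possible targets. */

enum { STEP_COERCE = 1, STEP_SKIP = 3 };

static int dispatch(int target)
{
    switch (target)             /* runtime value selects among known targets */
    {
        case STEP_COERCE:
            return 100;         /* stand-in for "jump to opblocks[1]" */
        case STEP_SKIP:
            return 300;         /* stand-in for "jump to opblocks[3]" */
        default:
            return -1;          /* unreachable for valid targets */
    }
}
```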
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Hi,
I tried to look into some of the questions from Amit, but I have e.g. no idea
what exactly the use of subtransactions tries to achieve - afaics 1a36bc9dba8
is the first patch to introduce the need to evaluate parts of expressions in a
subtransaction - but there's not a single comment explaining why.
It's really hard to understand the new json code. It's a substantial amount of
new infrastructure, without any design documentation that I can find. And it's
not like it's just code that's equivalent to nearby stuff. To me this falls
way below our standards and I think it's *months* of work to fix.
Even just the expression evaluation code: EvalJsonPathVar(),
ExecEvalJsonConstructor(), ExecEvalJsonExpr(), ExecEvalJson(). There's one
layer of subtransactions in one of the paths in ExecEvalJsonExpr(), another in
ExecEvalJson(). Some paths of ExecEvalJsonExpr() go through subtransactions,
others don't.
It's one thing for a small set of changes to be of this quality. But this is
pretty large - a quick summing of diffstat ends up with about 17k lines added,
of which ~2.5k are docs, ~4.8k are tests.
On 2022-08-04 17:01:48 +0900, Amit Langote wrote:
On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote:
The whole coercion stuff just seems incredibly clunky (in a slightly
different shape before this patch). ExecEvalJsonExprItemCoercion() calls
ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type
elements in JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a convoluted
path), which again will dispatch on the type (extracting the json object
again afaics!), to then somehow eventually get the coerced value.

I think it might be possible to make this a bit simpler, by not
leaving anything coercion-related in ExecEvalJsonExpr().

Honestly, this code seems like it should just be rewritten from scratch.
Based on what I wrote above, please let me know if I've misunderstood
your concerns about over-allocation of coercion state. I can try to
rewrite one more time if I know what this should look like instead.
I don't know. I don't understand the design of what needs to have error
handling, and what not.
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?

Hmm, I looked at the code for other expressions that jump, especially
CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
added statically. I thought we can't use them in this case, because
the conditions are very ad-hoc, like if the JSON path computation
returned an "empty" item or if the "error" flag was set during that
computation, etc.

The minimal fix would be to return the jump target from the function, and then
jump to that. That at least avoids the roundtrip to memory you have right now.

You mean like this:
LLVMValueRef v_args[2];
LLVMValueRef v_ret;

/*
* Call ExecEvalJsonExprSkip() to decide if JSON path
* evaluation can be skipped. This returns the step
* address to jump to.
*/
v_args[0] = v_state;
v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
v_ret = LLVMBuildCall(b,
llvm_pg_func(mod,
"ExecEvalJsonExprSkip"),
v_args, lengthof(v_args), "");

Actually, this is how I had started, but never figured out how to jump
to the address in v_ret. As in, how to extract the plain C int32
value that is the jump address from v_ret, an LLVMValueRef, to use in
the following statement:

LLVMBuildBr(b, opblocks[<int32-in-v_ret>]);
We could make that work, but even keeping it more similar to your current
code, you're already dealing with a variable jump target. Only that you load
it from a variable, instead of the function return value. So you could just
use v_ret instead of v_jumpaddr, and your code would be simpler and faster.
Greetings,
Andres Freund
Hi Andres,
On Sat, Aug 6, 2022 at 5:37 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-08-04 17:01:48 +0900, Amit Langote wrote:
On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote:
Honestly, this code seems like it should just be rewritten from scratch.
Based on what I wrote above, please let me know if I've misunderstood
your concerns about over-allocation of coercion state. I can try to
rewrite one more time if I know what this should look like instead.

I don't know. I don't understand the design of what needs to have error
handling, and what not.
AFAIK, there are two things that the JSON path code considers can
cause an error when evaluating a JsonExpr:
* Actual JSON path evaluation in ExecEvalJsonExpr(), because it
invokes JsonPath*() family of functions defined in jsonpath_exec.c,
which can possibly cause an error. Actually, I looked again today as
to what goes on in them and it seems there is a throwErrors variable
being used to catch and prevent any errors found by the JSON path
machinery itself, and it has the same value as the throwErrors
variable in ExecEvalJsonExpr(). Given that the latter is set per the
ON ERROR specification (throw errors or return NULL / a default
expression in lieu), maybe this part doesn't really need a
sub-transaction. To check, I took off the sub-transaction around this
part and can see that no tests fail.
* Evaluating coercion expression in ExecEvalJsonExprCoercion(), which
involves passing a user-specified expression through either
EEOP_IOCOERCE or json_populate_type(), both of which can cause errors
that are not suppressible as those in jsonpath_exec.c are. So, this
part does need a sub-transaction to satisfy the ON ERROR behavior. To
check, I took out the sub-transaction around the coercion evaluation,
and some tests in jsonb_sqljson do indeed fail, like this, for example:
SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING int);
- json_value
-------------
-
-(1 row)
-
+ERROR: invalid input syntax for type integer: "aaa"
Note that both the JSON path expression and the coercion would run as
part of the one EEOP_JSONEXPR ExecEvalStep before this patch and thus
would be wrapped under the same sub-transaction, even if only the
latter apparently needs it.
With this patch, I tried to keep that same behavior, but because the
coercion evaluation has now been broken out into its own step, it must
use another sub-transaction, given that the same sub-transaction can
no longer wrap both things. But given my finding that the JSON path
expression doesn't really need one, maybe the new EEOP_JSONEXPR_PATH
step can run without one, while keeping it for the new
EEOP_JSONEXPR_COERCION step.
I first was confused why the code tries to load the jump target
dynamically. But then I saw that the interpreted code sets it dynamically -
why? That's completely unnecessary overhead afaics? There's just two
possible jump targets, no?

Hmm, I looked at the code for other expressions that jump, especially
CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
added statically. I thought we can't use them in this case, because
the conditions are very ad-hoc, like if the JSON path computation
returned an "empty" item or if the "error" flag was set during that
computation, etc.

The minimal fix would be to return the jump target from the function, and then
jump to that. That at least avoids the roundtrip to memory you have right now.

You mean like this:
LLVMValueRef v_args[2];
LLVMValueRef v_ret;

/*
* Call ExecEvalJsonExprSkip() to decide if JSON path
* evaluation can be skipped. This returns the step
* address to jump to.
*/
v_args[0] = v_state;
v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
v_ret = LLVMBuildCall(b,
llvm_pg_func(mod,
"ExecEvalJsonExprSkip"),
v_args, lengthof(v_args), "");

Actually, this is how I had started, but never figured out how to jump
to the address in v_ret. As in, how to extract the plain C int32
value that is the jump address from v_ret, an LLVMValueRef, to use in
the following statement:

LLVMBuildBr(b, opblocks[<int32-in-v_ret>]);
We could make that work, but even keeping it more similar to your current
code, you're already dealing with a variable jump target. Only that you load
it from a variable, instead of the function return type. So you could just
v_ret instead of v_jumpaddr, and your code would be simpler and faster.
Ah, I see you mean to continue to use all the LLVMBuildCondBr()s as
the code currently does, but use v_ret like in the code above, instead
of v_jumpaddr, to access the jump address returned by the
step-choosing function. I've done that in the updated patch. This
also allows us to get rid of all the jumpdone fields in the
ExprEvalStep.
I've also moved the blocks of code in ExecInitExprRec() that
initialize the state for JsonExpr and JsonItemCoercions into new
functions. I've also moved EvalJsonPathVar() from execExprInterp.c to
jsonpath_exec.c where it's used.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
On 8/5/22 4:36 PM, Andres Freund wrote:
Hi,
I tried to look into some of the questions from Amit, but I have e.g. no idea
what exactly the use of subtransactions tries to achieve - afaics 1a36bc9dba8
is the first patch to introduce the need to evaluate parts of expressions in a
subtransaction - but there's not a single comment explaining why.

It's really hard to understand the new json code. It's a substantial amount of
new infrastructure, without any design documentation that I can find. And it's
not like it's just code that's equivalent to nearby stuff. To me this falls
way below our standards and I think it's *months* of work to fix.

Even just the expression evaluation code: EvalJsonPathVar(),
ExecEvalJsonConstructor(), ExecEvalJsonExpr(), ExecEvalJson(). There's one
layer of subtransactions in one of the paths in ExecEvalJsonExpr(), another in
ExecEvalJson(). Some paths of ExecEvalJsonExpr() go through subtransactions,
others don't.

It's one thing for a small set of changes to be of this quality. But this is
pretty large - a quick summing of diffstat ends up with about 17k lines added,
of which ~2.5k are docs, ~4.8k are tests.
The RMT met today to discuss the state of this open item surrounding the
SQL/JSON feature set. We discussed the specific concerns raised about
the code and debated four different options:
1. Do nothing and continue with the community process of stabilizing
the code without significant redesign
2. Recommend holding up the v15 release to allow for the code to be
redesigned and fixed (as based on Andres' estimates, this would push the
release out several months).
3. Revert the problematic parts of the code but try to include some
of the features in the v15 release (e.g. JSON_TABLE)
4. Revert the feature set and redesign and try to include for v16
Based on the concerns raised, the RMT is recommending option #4, to
revert the SQL/JSON changes for v15, and come back with a redesign for v16.
If folks think there are some bits we can include in v15, we can
consider option #3. (Personally, I would like to see if we can keep
JSON_TABLE, but if there is too much overlap with the problematic
portions of the code I am fine with waiting for v16).
At this stage in the release process coupled with the concerns, we're a
bit worried about introducing changes that are unpredictable in terms of
stability and maintenance. We also do not want to hold up the release
while this feature set goes through a redesign without agreement on
what such a design would look like as well as a timeline.
Note that the above are the RMT's recommendations; while the RMT can
explicitly call for a revert, we want to first guide the discussion on
the best path forward knowing the challenges for including these
features in v15.
Thanks,
Jonathan
On 2022-08-09 Tu 09:59, Jonathan S. Katz wrote:
On 8/5/22 4:36 PM, Andres Freund wrote:
Hi,
I tried to look into some of the questions from Amit, but I have e.g.
no idea
what exactly the use of subtransactions tries to achieve - afaics
1a36bc9dba8
is the first patch to introduce needing to evaluate parts expressions
in a
subtransaction - but there's not a single comment explaining why.It's really hard to understand the new json code. It's a substantial
amount of
new infrastructure, without any design documentation that I can find.
And it's
not like it's just code that's equivalent to nearby stuff. To me this
falls
way below our standards and I think it's *months* of work to fix.

Even just the expression evaluation code: EvalJsonPathVar(),
ExecEvalJsonConstructor(), ExecEvalJsonExpr(), ExecEvalJson().
There's one
layer of subtransactions in one of the paths in ExecEvalJsonExpr(),
another in
ExecEvalJson(). Some paths of ExecEvalJsonExpr() go through
subtransactions,
others don't.

It's one thing for a small set of changes to be of this quality. But
this is
pretty large - a quick summing of diffstat ends up with about 17k
lines added,
of which ~2.5k are docs, ~4.8k are tests.

The RMT met today to discuss the state of this open item surrounding
the SQL/JSON feature set. We discussed the specific concerns raised
about the code and debated four different options:

1. Do nothing and continue with the community process of stabilizing
the code without significant redesign

2. Recommend holding up the v15 release to allow for the code to be
redesigned and fixed (as based on Andres' estimates, this would push
the release out several months).

3. Revert the problematic parts of the code but try to include some
of the features in the v15 release (e.g. JSON_TABLE)

4. Revert the feature set and redesign and try to include for v16
Based on the concerns raised, the RMT is recommending option #4, to
revert the SQL/JSON changes for v15, and come back with a redesign for
v16.

If folks think there are some bits we can include in v15, we can
consider option #3. (Personally, I would like to see if we can keep
JSON_TABLE, but if there is too much overlap with the problematic
portions of the code I am fine with waiting for v16).

At this stage in the release process coupled with the concerns, we're
a bit worried about introducing changes that are unpredictable in
terms of stability and maintenance. We also do not want to hold up the
release while this feature set goes through a redesign without
agreement on what such a design would look like as well as a timeline.

Note that the above are the RMT's recommendations; while the RMT can
explicitly call for a revert, we want to first guide the discussion on
the best path forward knowing the challenges for including these
features in v15.
I very much doubt option 3 is feasible. The parts that are controversial
go back at least in part to the first patches of the series. Trying to
salvage anything would almost certainly be more disruptive than trying
to fix it.
I'm not sure what the danger is to stability, performance or correctness
in applying the changes Amit has proposed for release 15. But if that
danger is judged to be too great then I agree we should revert.
I should add that I'm very grateful to Amit for his work, and I'm sure
it isn't wasted effort, whatever the decision.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 8/9/22 11:03 AM, Andrew Dunstan wrote:
On 2022-08-09 Tu 09:59, Jonathan S. Katz wrote:
The RMT met today to discuss the state of this open item surrounding
the SQL/JSON feature set. We discussed the specific concerns raised
about the code and debated four different options:

1. Do nothing and continue with the community process of stabilizing
the code without significant redesign

2. Recommend holding up the v15 release to allow for the code to be
redesigned and fixed (as based on Andres' estimates, this would push
the release out several months).

3. Revert the problematic parts of the code but try to include some
of the features in the v15 release (e.g. JSON_TABLE)

4. Revert the feature set and redesign and try to include for v16
Based on the concerns raised, the RMT is recommending option #4, to
revert the SQL/JSON changes for v15, and come back with a redesign for
v16.

If folks think there are some bits we can include in v15, we can
consider option #3. (Personally, I would like to see if we can keep
JSON_TABLE, but if there is too much overlap with the problematic
portions of the code I am fine with waiting for v16).

At this stage in the release process coupled with the concerns, we're
a bit worried about introducing changes that are unpredictable in
terms of stability and maintenance. We also do not want to hold up the
release while this feature set goes through a redesign without
agreement on what such a design would look like as well as a timeline.

Note that the above are the RMT's recommendations; while the RMT can
explicitly call for a revert, we want to first guide the discussion on
the best path forward knowing the challenges for including these
features in v15.

I very much doubt option 3 is feasible. The parts that are controversial
go back at least in part to the first patches of the series. Trying to
salvage anything would almost certainly be more disruptive than trying
to fix it.

I'm not sure what the danger is to stability, performance or correctness
in applying the changes Amit has proposed for release 15. But if that
danger is judged to be too great then I agree we should revert.
Speaking personally, I would like to see what we could do to include
support for this batch of the SQL/JSON features in v15. What is included
looks like it closes most of the gap on what we've been missing
syntactically since the standard was adopted, and the JSON_TABLE work is
very convenient for converting JSON data into a relational format. I
believe having this feature set is important for maintaining standards
compliance, interoperability, tooling support, and general usability.
Plus, JSON still seems to be pretty popular :)
Rereading the thread for the umpteenth time, I have seen Amit working
through Andres' concerns. From my read, the ones that seem pressing are:
* Lack of design documentation, which may be leading to some of the
general design concerns
* The use of substransactions within the executor, though docs
explaining the decisions on that could alleviate it (I realize this is a
big topic and any summary I give won't capture the full nuance)
* Debate on how to handle the type coercions
(Please correct me if I've missed anything).
I hope that these can be addressed satisfactorily in a reasonable (read:
now a much shorter) timeframe so we can include the SQL/JSON work in v15.
With my RMT hat on, the issue is that we're now at beta 3 and we still
have not reached a resolution on this open item. Even if it were
committed tomorrow, we would definitely need a beta 4, and we would want
to let the code bake a bit more to ensure we get adequate test coverage
on it. This would likely put the release date into October, presuming we
have not found any other issues that could cause a release delay.
With my advocacy hat on, we're at the point in the release cycle where I
kick off the GA release process (e.g. announcement drafting). Not
knowing the final status of a feature that's likely to be highlighted
makes it difficult to write said announcement as well as kick off the other
machinery (e.g. translations). If there is at least a decision on next
steps, I can adjust the GA release process timeline.
I should add that I'm very grateful to Amit for his work, and I'm sure
it isn't wasted effort, whatever the decision.
+1. While I've been quiet on this thread to date, I have definitely seen
Amit working hard on addressing the concerns.
Thanks,
Jonathan
Hi,
On 2022-08-09 14:04:48 -0400, Jonathan S. Katz wrote:
On 8/9/22 11:03 AM, Andrew Dunstan wrote:
On 2022-08-09 Tu 09:59, Jonathan S. Katz wrote:
The RMT met today to discuss the state of this open item surrounding
the SQL/JSON feature set. We discussed the specific concerns raised
about the code and debated four different options:

1. Do nothing and continue with the community process of stabilizing
the code without significant redesign

2. Recommend holding up the v15 release to allow for the code to be
redesigned and fixed (based on Andres' estimates, this would push
the release out several months).
Obviously that's a question of the resources brought to bear.
From my angle: I've obviously some of my own work to take care of as well, but
it's also just hard to improve something that includes a lot of undocumented
design decisions.
3. Revert the problematic parts of the code but try to include some
of the features in the v15 release (e.g. JSON_TABLE)
I very much doubt option 3 is feasible. The parts that are controversial
go back at least in part to the first patches of the series. Trying to
salvage anything would almost certainly be more disruptive than trying
to fix it.
Agreed.
4. Revert the feature set and redesign and try to include it for v16

Based on the concerns raised, the RMT is recommending option #4, to
revert the SQL/JSON changes for v15, and come back with a redesign for
v16.

If folks think there are some bits we can include in v15, we can
consider option #3. (Personally, I would like to see if we can keep
JSON_TABLE, but if there is too much overlap with the problematic
portions of the code I am fine with waiting for v16.)

At this stage in the release process coupled with the concerns, we're
a bit worried about introducing changes that are unpredictable in
terms of stability and maintenance. We also do not want to hold up the
release while this feature set goes through a redesign without
agreement on what such a design would look like as well as a timeline.

Note that the above are the RMT's recommendations; while the RMT can
explicitly call for a revert, we want to first guide the discussion on
the best path forward knowing the challenges for including these
features in v15.
Unless we decide on 4 immediately, I think it might be worth starting a
separate thread to get more attention. The subject doesn't necessarily have
everyone follow along.
I'm not sure what the danger is to stability, performance or correctness
in applying the changes Amit has proposed for release 15. But if that
danger is judged to be too great then I agree we should revert.
My primary problem is that as-is the code is nearly unmaintainable. It's hard
for Amit to fix that, given that he's not one of the original authors.
Rereading the thread for the umpteenth time, I have seen Amit working
through Andres' concerns. From my read, the ones that seem pressing are:

* Lack of design documentation, which may be leading to some of the general
design concerns

* The use of subtransactions within the executor, though docs explaining
the decisions on that could alleviate it (I realize this is a big topic and
any summary I give won't capture the full nuance)
I don't think subtransactions per se are a fundamental problem. I think the
error handling implementation is ridiculously complicated, and while I started
to hack on improving it, I stopped when I really couldn't understand what
errors it actually needs to handle when and why.
* Debate on how to handle the type coercions
That's partially related to the error handling issue above.
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the complicated code
around coercions.
I should add that I'm very grateful to Amit for his work, and I'm sure
it isn't wasted effort, whatever the decision.

+1. While I've been quiet on this thread to date, I have definitely seen
Amit working hard on addressing the concerns.
+1
Greetings,
Andres Freund
"Jonathan S. Katz" <jkatz@postgresql.org> writes:
Speaking personally, I would like to see what we could do to include
support for this batch of the SQL/JSON features in v15. What is included
looks like it closes most of the gap on what we've been missing
syntactically since the standard was adopted, and the JSON_TABLE work is
very convenient for converting JSON data into a relational format. I
believe having this feature set is important for maintaining standards
compliance, interoperability, tooling support, and general usability.
Plus, JSON still seems to be pretty popular :)
...
I hope that these can be addressed satisfactorily in a reasonable (read:
now a much shorter) timeframe so we can include the SQL/JSON work in v15.
We have delayed releases for $COOL_FEATURE in the past, and I think
our batting average on that is still .000: not once has it worked out
well. I think we're better off getting the pain over with quickly,
so I regretfully vote for revert. And for a full redesign/rewrite
before we try again; based on Andres' comments, it needs that.
regards, tom lane
Hi,
On 2022-08-09 15:17:44 -0400, Tom Lane wrote:
We have delayed releases for $COOL_FEATURE in the past, and I think
our batting average on that is still .000: not once has it worked out
well.
I think it semi worked when jsonb (?) first went in - it took a while and a
lot of effort from a lot of people, but in the end we made it work, and it was
a success from our users' perspectives, I think. OTOH, it's not a great sign
this is around json again...
- Andres
On 8/9/22 3:22 PM, Andres Freund wrote:
Hi,
On 2022-08-09 15:17:44 -0400, Tom Lane wrote:
We have delayed releases for $COOL_FEATURE in the past, and I think
our batting average on that is still .000: not once has it worked out
well.

I think it semi worked when jsonb (?) first went in - it took a while and a
lot of effort from a lot of people, but in the end we made it work, and it was
a success from our users' perspectives, I think.
Yeah, this was the example I was thinking of. To continue with the
baseball analogy, it was a home-run from a PR perspective, and I can say
as a power user at the time, the 9.4 JSONB representation worked well
for my use case. Certainly newer functionality has made JSON easier to
work with in PG.
(I can't remember what the 9.5 hold up was).
The cases where we either delayed/punted on $COOL_FEATURE that cause me
concern are the ones where we say "OK, we'll fix this in the next
release" and we are then waiting 2, 3, 4 releases for the work to be
completed. And to be clear, I'm thinking of this as "known issues" vs.
"iterating towards the whole solution".
OTOH, it's not a great sign this is around json again...
Yeah, I was thinking about that too.
Per Andres comment upthread, let's open a new thread to discuss the
SQL/JSON + v15 topic to improve visibility and get more feedback. I can
do that shortly.
We can continue with the technical discussion in here.
Jonathan
On 8/9/22 2:57 PM, Andres Freund wrote:
Hi,
On 2022-08-09 14:04:48 -0400, Jonathan S. Katz wrote:
2. Recommend holding up the v15 release to allow for the code to be
redesigned and fixed (based on Andres' estimates, this would push
the release out several months).

Obviously that's a question of the resources brought to bear.
From my angle: I've obviously some of my own work to take care of as well, but
it's also just hard to improve something that includes a lot of undocumented
design decisions.
*nod*
4. Revert the feature set and redesign and try to include for v16
Unless we decide on 4 immediately, I think it might be worth starting a
separate thread to get more attention. The subject doesn't necessarily have
everyone follow along.
*nod* I'll do this shortly.
Rereading the thread for the umpteenth time, I have seen Amit working
through Andres' concerns. From my read, the ones that seem pressing are:

* Lack of design documentation, which may be leading to some of the general
design concerns

* The use of subtransactions within the executor, though docs explaining
the decisions on that could alleviate it (I realize this is a big topic and
any summary I give won't capture the full nuance)

I don't think subtransactions per se are a fundamental problem. I think the
error handling implementation is ridiculously complicated, and while I started
to hack on improving it, I stopped when I really couldn't understand what
errors it actually needs to handle when and why.
Ah, thanks for the clarification. That makes sense.
* Debate on how to handle the type coercions
That's partially related to the error handling issue above.
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the complicated code
around coercions.
If we went down this path, would this make you feel more comfortable
with including this work in the v15 release?
Thanks,
Jonathan
On 2022-08-09 Tu 15:50, Jonathan S. Katz wrote:
On 8/9/22 3:22 PM, Andres Freund wrote:
Hi,
On 2022-08-09 15:17:44 -0400, Tom Lane wrote:
We have delayed releases for $COOL_FEATURE in the past, and I think
our batting average on that is still .000: not once has it worked out
well.

I think it semi worked when jsonb (?) first went in - it took a while and a
lot of effort from a lot of people, but in the end we made it work, and it was
a success from our users' perspectives, I think.

Yeah, this was the example I was thinking of. To continue with the
baseball analogy, it was a home-run from a PR perspective, and I can
say as a power user at the time, the 9.4 JSONB representation worked
well for my use case. Certainly newer functionality has made JSON
easier to work with in PG.

(I can't remember what the 9.5 hold up was.)

The cases where we either delayed/punted on $COOL_FEATURE that cause
me concern are the ones where we say "OK, we'll fix this in the next
release" and we are then waiting 2, 3, 4 releases for the work to be
completed. And to be clear, I'm thinking of this as "known issues" vs.
"iterating towards the whole solution".
Where we ended up with jsonb was a long way from where we started, but
technical difficulties were largely confined because it didn't involve
anything like the parser or the expression evaluation code. Here the SQL
Standards Committee has imposed a pretty substantial technical burden on
us and the code that Andres complains of is attempting to deal with that.
OTOH, it's not a great sign this is around json again...
Yeah, I was thinking about that too.
Ouch :-(
I think after 10 years of being involved with our JSON features, I'm
going to take a substantial break on that front.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 8/9/22 4:15 PM, Andrew Dunstan wrote:
On 2022-08-09 Tu 15:50, Jonathan S. Katz wrote:
On 8/9/22 3:22 PM, Andres Freund wrote:
OTOH, it's not a great sign this is around json again...
Yeah, I was thinking about that too.
Ouch :-(
I think after 10 years of being involved with our JSON features, I'm
going to take a substantial break on that front.
I hope that wasn't taken as a slight, but just an observation. There
are other feature areas where I can make similar observations. All this
work around a database system is challenging as there are many
considerations that need to be made.
You've done an awesome job driving the JSON work forward and it is
greatly appreciated.
Thanks,
Jonathan
On 2022-08-09 Tu 16:19, Jonathan S. Katz wrote:
On 8/9/22 4:15 PM, Andrew Dunstan wrote:
On 2022-08-09 Tu 15:50, Jonathan S. Katz wrote:
On 8/9/22 3:22 PM, Andres Freund wrote:
OTOH, it's not a great sign this is around json again...
Yeah, I was thinking about that too.
Ouch :-(
I think after 10 years of being involved with our JSON features, I'm
going to take a substantial break on that front.

I hope that wasn't taken as a slight, but just an observation. There
are other feature areas where I can make similar observations. All
this work around a database system is challenging as there are many
considerations that need to be made.

You've done an awesome job driving the JSON work forward and it is
greatly appreciated.
Thanks, I appreciate that (and I wasn't fishing for compliments). It's
more that I feel a bit tired of it ... some of my colleagues will
confirm that I've been saying this for a while, so it's not spurred by
this setback.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Wed, Aug 10, 2022 at 3:57 AM Andres Freund <andres@anarazel.de> wrote:
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the complicated code
around coercions.
Could you please clarify how you think we might do the io coercion
wrapped with a subtransaction all as a single execution step? I
would've thought that we couldn't do the sub-transaction without
leaving ExecInterpExpr() anyway, so maybe you meant the io coercion
itself was done using some code outside ExecInterpExpr()?
The current JsonExpr code does it by recursively calling
ExecInterpExpr() using the nested ExprState expressly for the
coercion.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On 8/10/22 9:27 AM, Amit Langote wrote:
On Wed, Aug 10, 2022 at 3:57 AM Andres Freund <andres@anarazel.de> wrote:
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the complicated code
around coercions.

Could you please clarify how you think we might do the io coercion
wrapped with a subtransaction all as a single execution step? I
would've thought that we couldn't do the sub-transaction without
leaving ExecInterpExpr() anyway, so maybe you meant the io coercion
itself was done using some code outside ExecInterpExpr()?

The current JsonExpr code does it by recursively calling
ExecInterpExpr() using the nested ExprState expressly for the
coercion.
With RMT hat on, Andres do you have any thoughts on this?
Thanks,
Jonathan
Hi,
On 2022-08-11 13:08:27 -0400, Jonathan S. Katz wrote:
On 8/10/22 9:27 AM, Amit Langote wrote:
On Wed, Aug 10, 2022 at 3:57 AM Andres Freund <andres@anarazel.de> wrote:
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the complicated code
around coercions.

Could you please clarify how you think we might do the io coercion
wrapped with a subtransaction all as a single execution step? I
would've thought that we couldn't do the sub-transaction without
leaving ExecInterpExpr() anyway, so maybe you meant the io coercion
itself was done using some code outside ExecInterpExpr()?

The current JsonExpr code does it by recursively calling
ExecInterpExpr() using the nested ExprState expressly for the
coercion.
The basic idea is to rip out all the type-dependent stuff and replace it
with a single JSON_IOCOERCE step, which has a parameter about whether to wrap
things in a subtransaction or not. That step would always perform the coercion
by calling the text output function of the input and the text input function
of the output.
With RMT hat on, Andres do you have any thoughts on this?
I think I need to prototype how it'd look like to give a more detailed
answer. I have a bunch of meetings over the next few hours, but after that I
can give it a shot.
Greetings,
Andres Freund
I wrote:
Andres Freund <andres@anarazel.de> writes:
Maybe it's worth sticking a StaticAssert() for the struct size
somewhere.
Indeed. I thought we had one already.
I'm a bit wary about that being too noisy, there are some machines with
odd alignment requirements. Perhaps worth restricting the assertion to
x86-64 + armv8 or such?
I'd put it in first and only reconsider if it shows unfixable problems.
Now that we've got the sizeof(ExprEvalStep) under control, shouldn't
we do the attached?
regards, tom lane
Hi,
On 2023-02-22 16:34:44 -0500, Tom Lane wrote:
I wrote:
Andres Freund <andres@anarazel.de> writes:
Maybe it's worth sticking a StaticAssert() for the struct size
somewhere.

Indeed. I thought we had one already.

I'm a bit wary about that being too noisy, there are some machines with
odd alignment requirements. Perhaps worth restricting the assertion to
x86-64 + armv8 or such?

I'd put it in first and only reconsider if it shows unfixable problems.
Now that we've got the sizeof(ExprEvalStep) under control, shouldn't
we do the attached?
Indeed. Pushed.
Let's hope there's no rarely used architecture with odd alignment rules.
Greetings,
Andres Freund
On Wed, Feb 22, 2023 at 5:47 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-02-22 16:34:44 -0500, Tom Lane wrote:
I wrote:
Andres Freund <andres@anarazel.de> writes:
Maybe it's worth sticking a StaticAssert() for the struct size
somewhere.

Indeed. I thought we had one already.

I'm a bit wary about that being too noisy, there are some machines with
odd alignment requirements. Perhaps worth restricting the assertion to
x86-64 + armv8 or such?

I'd put it in first and only reconsider if it shows unfixable problems.

Now that we've got the sizeof(ExprEvalStep) under control, shouldn't
we do the attached?

Indeed. Pushed.
Let's hope there's no rarely used architecture with odd alignment rules.
Greetings,
Andres Freund
I have a question about this that may affect some of my future work.
My not-ready-for-16 work on CAST( ... ON DEFAULT ... ) involved making
FuncExpr/IoCoerceExpr/ArrayCoerceExpr have a safe_mode flag, and that
necessitates adding a reserror boolean to ExprEvalStep for subsequent steps
to test if the error happened.
Will that change be throwing some architectures over the 64 byte count?
Corey Huinker <corey.huinker@gmail.com> writes:
My not-ready-for-16 work on CAST( ... ON DEFAULT ... ) involved making
FuncExpr/IoCoerceExpr/ArrayCoerceExpr have a safe_mode flag, and that
necessitates adding a reserror boolean to ExprEvalStep for subsequent steps
to test if the error happened.
Why do you want it in ExprEvalStep ... couldn't it be in ExprState?
I can't see why you'd need more than one at a time during evaluation.
regards, tom lane
Hi,
On 2023-02-23 13:39:14 -0500, Corey Huinker wrote:
My not-ready-for-16 work on CAST( ... ON DEFAULT ... ) involved making
FuncExpr/IoCoerceExpr/ArrayCoerceExpr have a safe_mode flag, and that
necessitates adding a reserror boolean to ExprEvalStep for subsequent steps
to test if the error happened.
I think if that requires adding a new variable to each ExprEvalStep, it's
DOA. The overhead would be too high. But I don't see why it would need to be
added to all ExprEvalSteps instead of individual steps, or perhaps ExprEvalState?
Will that change be throwing some architectures over the 64 byte count?
It would.
I find the 'pahole' tool very useful for looking at struct layout.
struct ExprEvalStep {
    intptr_t opcode;                           /*  0  8 */
    Datum *resvalue;                           /*  8  8 */
    _Bool *resnull;                            /* 16  8 */
    union {
        struct {
            int last_var;                      /* 24  4 */
            _Bool fixed;                       /* 28  1 */
            /* XXX 3 bytes hole, try to pack */
            TupleDesc known_desc;              /* 32  8 */
            const TupleTableSlotOps *kind;     /* 40  8 */
        } fetch;                               /* 24 24 */
        ...
        struct {
            Datum *values;                     /* 24  8 */
            _Bool *nulls;                      /* 32  8 */
            int nelems;                        /* 40  4 */
            MinMaxOp op;                       /* 44  4 */
            FmgrInfo *finfo;                   /* 48  8 */
            FunctionCallInfo fcinfo_data;      /* 56  8 */
        } minmax;                              /* 24 40 */
        ...
    } d;                                       /* 24 40 */

    /* size: 64, cachelines: 1, members: 4 */
};
We don't have memory to spare in the "general" portion of ExprEvalStep
(currently 24 bytes), as several of the type-specific portions are already 40
bytes large.
Greetings,
Andres Freund
Hi,
On 2023-02-23 13:56:56 -0500, Tom Lane wrote:
Corey Huinker <corey.huinker@gmail.com> writes:
My not-ready-for-16 work on CAST( ... ON DEFAULT ... ) involved making
FuncExpr/IoCoerceExpr/ArrayCoerceExpr have a safe_mode flag, and that
necessitates adding a reserror boolean to ExprEvalStep for subsequent steps
to test if the error happened.

Why do you want it in ExprEvalStep ... couldn't it be in ExprState?
I can't see why you'd need more than one at a time during evaluation.
I don't know exactly what CAST( ... ON DEFAULT ... ) is aiming for - I guess
it wants to assign a different value when the cast fails? Is the default
expression a constant, or does it need to be runtime evaluated? If a const,
then the cast steps just could assign the new value. If runtime evaluation is
needed I'd expect the various coerce steps to jump to the value implementing
the default expression in case of a failure.
So I'm not sure we even need a reserror field in ExprState.
Greetings,
Andres Freund
On Thu, Feb 23, 2023 at 2:39 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-02-23 13:56:56 -0500, Tom Lane wrote:
Corey Huinker <corey.huinker@gmail.com> writes:
My not-ready-for-16 work on CAST( ... ON DEFAULT ... ) involved making
FuncExpr/IoCoerceExpr/ArrayCoerceExpr have a safe_mode flag, and that
necessitates adding a reserror boolean to ExprEvalStep for subsequent steps
to test if the error happened.

Why do you want it in ExprEvalStep ... couldn't it be in ExprState?
I can't see why you'd need more than one at a time during evaluation.

I don't know exactly what CAST( ... ON DEFAULT ... ) is aiming for - I guess
it wants to assign a different value when the cast fails? Is the default
expression a constant, or does it need to be runtime evaluated? If a const,
then the cast steps just could assign the new value. If runtime evaluation is
needed I'd expect the various coerce steps to jump to the value implementing
the default expression in case of a failure.
The default expression is itself a cast expression. So CAST (expr1 AS
some_type DEFAULT expr2 ON ERROR) would basically be a safe-mode cast of
expr1 to some_type, and only upon failure would the non-safe cast of expr2
to some_type be executed. Granted, the most common use case would be for
expr2 to be a constant or something that folds into a constant, but the
proposed spec allows for it.
My implementation involved adding a setting to CoalesceExpr that tested for
error flags rather than null flags, hence putting it in ExprEvalStep and
ExprState (perhaps mistakenly). Copying and adapting EEOP_JUMP_IF_NOT_NULL
led me to this:
EEO_CASE(EEOP_JUMP_IF_NOT_ERROR)
{
    /* Transfer control if current result is non-error */
    if (!*op->reserror)
    {
        *op->reserror = false;
        EEO_JUMP(op->d.jump.jumpdone);
    }

    /* reset error flag */
    *op->reserror = false;
    EEO_NEXT();
}