SQL/JSON features for v15
Hi,
(Personal hat, not RMT hat unless otherwise noted).
This thread[1]/messages/by-id/20220616233130.rparivafipt6doj3@alap3.anarazel.de raised some concerns around the implementation of the
SQL/JSON features that are slated for v15, which includes an outstanding
open item[2]https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items. Given the current state of the discussion, when the RMT
met on Aug 8, they several options, readable here[3]/messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org. Given we are now
into the later part of the release cycle, we need to make some decisions
on how to proceed with this feature given the concerns raised.
Per additional discussion on the thread, the group wanted to provide
more visibility into the discussion to get opinions on how to proceed
for the v15 release.
Without rehashing the thread, the options presented were:
1. Fixing the concerns addressed in the thread around the v15 SQL/JSON
features implementation, noting that this would likely entail at least
one more beta release and would push the GA date past our normal timeframe.
2. Try to commit a subset of the features that caused less debate. This
was ruled out.
3. Revert the v15 SQL/JSON features work.
<RMT hat>
Based on the current release timing and the open issues presented on the
thread, and the RMT had recommended reverting, but preferred to drive
consensus on next steps.
</RMT hat>
From a release advocacy standpoint, I need about 6 weeks lead time to
put together the GA launch. We're at the point where I typically deliver
a draft release announcement. From this, given this involves a high
visibility feature, I would want some clarity on what option we would
like to pursue. Once the announcement translation process has begun (and
this is when we have consensus on the release announcement), it becomes
more challenging to change things out.
From a personal standpoint (restating from[3]/messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org), I would like to see
what we could do to include support for this batch of the SQL/JSON
features in v15. What is included looks like it closes most of the gap
on what we've been missing syntactically since the standard was adopted,
and the JSON_TABLE work is very convenient for converting JSON data into
a relational format. I believe having this feature set is important for
maintaining standards compliance, interoperability, tooling support, and
general usability. Plus, JSON still seems to be pretty popular.
We're looking for additional input on what makes sense as a best course
of action, given what is presented in[3]/messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org.
Thanks,
Jonathan
[1]: /messages/by-id/20220616233130.rparivafipt6doj3@alap3.anarazel.de
/messages/by-id/20220616233130.rparivafipt6doj3@alap3.anarazel.de
[2]: https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
[3]: /messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org
/messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org
On 8/9/22 4:58 PM, Jonathan S. Katz wrote:
We're looking for additional input on what makes sense as a best course
of action, given what is presented in[3].
Missed adding Amit on the CC.
Jonathan
On 2022-08-09 Tu 16:58, Jonathan S. Katz wrote:
Hi,
(Personal hat, not RMT hat unless otherwise noted).
This thread[1] raised some concerns around the implementation of the
SQL/JSON features that are slated for v15, which includes an
outstanding open item[2]. Given the current state of the discussion,
when the RMT met on Aug 8, they several options, readable here[3].
Given we are now into the later part of the release cycle, we need to
make some decisions on how to proceed with this feature given the
concerns raised.Per additional discussion on the thread, the group wanted to provide
more visibility into the discussion to get opinions on how to proceed
for the v15 release.Without rehashing the thread, the options presented were:
1. Fixing the concerns addressed in the thread around the v15 SQL/JSON
features implementation, noting that this would likely entail at least
one more beta release and would push the GA date past our normal
timeframe.2. Try to commit a subset of the features that caused less debate.
This was ruled out.3. Revert the v15 SQL/JSON features work.
<RMT hat>
Based on the current release timing and the open issues presented on
the thread, and the RMT had recommended reverting, but preferred to
drive consensus on next steps.
</RMT hat>From a release advocacy standpoint, I need about 6 weeks lead time to
put together the GA launch. We're at the point where I typically
deliver a draft release announcement. From this, given this involves a
high visibility feature, I would want some clarity on what option we
would like to pursue. Once the announcement translation process has
begun (and this is when we have consensus on the release
announcement), it becomes more challenging to change things out.From a personal standpoint (restating from[3]), I would like to see
what we could do to include support for this batch of the SQL/JSON
features in v15. What is included looks like it closes most of the gap
on what we've been missing syntactically since the standard was
adopted, and the JSON_TABLE work is very convenient for converting
JSON data into a relational format. I believe having this feature set
is important for maintaining standards compliance, interoperability,
tooling support, and general usability. Plus, JSON still seems to be
pretty popular.We're looking for additional input on what makes sense as a best
course of action, given what is presented in[3].Thanks,
Jonathan
[1]
/messages/by-id/20220616233130.rparivafipt6doj3@alap3.anarazel.de
[2] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
[3]
/messages/by-id/787cef45-15de-8f1d-ed58-a1c1435bfc0e@postgresql.org
To preserve options I will start preparing reversion patches. Given
there are I think more than 20 commits all told that could be fun, and
will probably take me a little while. The sad part is that to the best
of my knowledge this code is producing correct results, and not
disturbing the stability or performance of anything else. There was a
performance issue but it's been dealt with AIUI.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 8/10/22 11:50 AM, Andrew Dunstan wrote:
On 2022-08-09 Tu 16:58, Jonathan S. Katz wrote:
Hi,
(Personal hat, not RMT hat unless otherwise noted).
This thread[1] raised some concerns around the implementation of the
SQL/JSON features that are slated for v15, which includes an
outstanding open item[2]. Given the current state of the discussion,
when the RMT met on Aug 8, they several options, readable here[3].
Given we are now into the later part of the release cycle, we need to
make some decisions on how to proceed with this feature given the
concerns raised.Per additional discussion on the thread, the group wanted to provide
more visibility into the discussion to get opinions on how to proceed
for the v15 release.Without rehashing the thread, the options presented were:
1. Fixing the concerns addressed in the thread around the v15 SQL/JSON
features implementation, noting that this would likely entail at least
one more beta release and would push the GA date past our normal
timeframe.2. Try to commit a subset of the features that caused less debate.
This was ruled out.3. Revert the v15 SQL/JSON features work.
<RMT hat>
Based on the current release timing and the open issues presented on
the thread, and the RMT had recommended reverting, but preferred to
drive consensus on next steps.
</RMT hat>From a release advocacy standpoint, I need about 6 weeks lead time to
put together the GA launch. We're at the point where I typically
deliver a draft release announcement. From this, given this involves a
high visibility feature, I would want some clarity on what option we
would like to pursue. Once the announcement translation process has
begun (and this is when we have consensus on the release
announcement), it becomes more challenging to change things out.From a personal standpoint (restating from[3]), I would like to see
what we could do to include support for this batch of the SQL/JSON
features in v15. What is included looks like it closes most of the gap
on what we've been missing syntactically since the standard was
adopted, and the JSON_TABLE work is very convenient for converting
JSON data into a relational format. I believe having this feature set
is important for maintaining standards compliance, interoperability,
tooling support, and general usability. Plus, JSON still seems to be
pretty popular.We're looking for additional input on what makes sense as a best
course of action, given what is presented in[3].
To preserve options I will start preparing reversion patches. Given
there are I think more than 20 commits all told that could be fun, and
will probably take me a little while. The sad part is that to the best
of my knowledge this code is producing correct results, and not
disturbing the stability or performance of anything else. There was a
performance issue but it's been dealt with AIUI.
Personally, I hope we don't need to revert. If everything from the open
item standpoint is addressed, I want to ensure we capture and complete
the remaining issues that were raised on the other thread, i.e.
* adding design docs
* simplifying the type-coercion code
* another other design concerns that were presented
We switched this discussion out to a different thread to get some more
visibility on the issue and see if other folks would weigh in. Thus far,
there has not been much additional say either way. It would be good if
other folks chimed in.
Thanks,
Jonathan
Hi,
Continuation from the thread at
/messages/by-id/20220811171740.m5b4h7x63g4lzgrk@awork3.anarazel.de
On 2022-08-11 10:17:40 -0700, Andres Freund wrote:
On 2022-08-11 13:08:27 -0400, Jonathan S. Katz wrote:
With RMT hat on, Andres do you have any thoughts on this?
I think I need to prototype how it'd look like to give a more detailed
answer. I have a bunch of meetings over the next few hours, but after that I
can give it a shot.
I started hacking on this Friday. I think there's some relatively easy
improvements that make the code substantially more understandable, at least
for me, without even addressing the structural stuff.
One thing I could use help understanding is the logic behind
ExecEvalJsonNeedsSubTransaction() - there's no useful comments, so it's hard
to follow.
bool
ExecEvalJsonNeedsSubTransaction(JsonExpr *jsexpr,
struct JsonCoercionsState *coercions)
{
/* want error to be raised, so clearly no subtrans needed */
if (jsexpr->on_error->btype == JSON_BEHAVIOR_ERROR)
return false;
if (jsexpr->op == JSON_EXISTS_OP && !jsexpr->result_coercion)
return false;
if (!coercions)
return true;
return false;
}
I guess the !coercions bit is just about the planner, where we want to be
pessimistic about when subtransactions are used, for the purpose of
parallelism? Because that's the only place that passes in NULL.
What really baffles me is that last 'return false' - it seems to indicate that
there's no paths during query execution where
ExecEvalJsonNeedsSubTransaction() returns true. And indeed, tests pass with an
Assert(!needSubtrans) added to ExecEvalJson() (and then unsurprisingly also
after removing the ExecEvalJsonExprSubtrans() indirection).
What's going on here?
We, somewhat confusingly, still rely on subtransactions, heavily
so. Responsible for that is this hunk of code:
bool throwErrors = jexpr->on_error->btype == JSON_BEHAVIOR_ERROR;
[...]
cxt.error = throwErrors ? NULL : &error;
cxt.coercionInSubtrans = !needSubtrans && !throwErrors;
Assert(!needSubtrans || cxt.error);
So basically we start a subtransaction inside ExecEvalJsonExpr(), to coerce
the result type, whenever !needSubtrans (which is always!), unless ERROR ON
ERROR is used.
Which then also explains the theory behind the EXISTS_OP check in
ExecEvalJsonNeedsSubTransaction(). In that case ExecEvalJsonExpr() returns
early, before doing a return value coercion, thus not starting a
subtransaction.
I don't think it's sane from a performance view to start a subtransaction for
every coercion, particularly because most coercion paths will never trigger an
error, leaving things like out-of-memory or interrupts aside. And those are
re-thrown by ExecEvalJsonExprSubtrans(). A quick and dirty benchmark shows
ERROR ON ERROR nearly 2xing speed. I'm worried about the system impact of
using subtransactions this heavily, it's not exactly the best performing
system - the only reason it's kind of ok here is that it's going to be very
rare to allocate a subxid, I think.
Next question:
/*
* We should catch exceptions of category ERRCODE_DATA_EXCEPTION and
* execute the corresponding ON ERROR behavior then.
*/
oldcontext = CurrentMemoryContext;
oldowner = CurrentResourceOwner;
Assert(error);
BeginInternalSubTransaction(NULL);
/* Want to execute expressions inside function's memory context */
MemoryContextSwitchTo(oldcontext);
PG_TRY();
{
res = func(op, econtext, res, resnull, p, error);
/* Commit the inner transaction, return to outer xact context */
ReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
}
PG_CATCH();
{
ErrorData *edata;
int ecategory;
/* Save error info in oldcontext */
MemoryContextSwitchTo(oldcontext);
edata = CopyErrorData();
FlushErrorState();
/* Abort the inner transaction */
RollbackAndReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
Two points:
1) I suspect it's not safe to switch to oldcontext before calling func().
On error we'll have leaked memory into oldcontext and we'll just continue
on. It might not be very consequential here, because the calling context
presumably isn't very long lived, but that's probably not something we should
rely on.
Also, are we sure that the context will be in a clean state when it's used
within an erroring subtransaction?
I think the right thing here would be to stay in the subtransaction context
and then copy the datum out to the surrounding context in the success case.
2) If there was an out-of-memory error, it'll have been in oldcontext. So
switching back to it before calling CopyErrorData() doesn't seem good - we'll
just hit OOM issues again.
I realize that both of these issues are present in plenty other code (see
e.g. plperl_spi_exec()). So I'm curious why they are ok?
Greetings,
Andres Freund
Hi,
On 16.08.2022 01:38, Andres Freund wrote:
Continuation from the thread at
/messages/by-id/20220811171740.m5b4h7x63g4lzgrk@awork3.anarazel.deI started hacking on this Friday. I think there's some relatively easy
improvements that make the code substantially more understandable, at least
for me, without even addressing the structural stuff.
I also started hacking Friday, hacked all weekend, and now have a new
version of the patch.
I received your message when I finished writing of mine, so I will
try answer your new questions only in next message. But in short, I
can say that some things like ExecEvalJsonExprSubtrans() were fixed.
I took Amit's patch and tried to simplify execution further.
Explanation of the patches is at the very end of message.
Next, I try to answer some of previous questions.
On Aug 2, 2022 at 9:39 AM Andres Freund<andres@anarazel.de> wrote:
The whole coercion stuff just seems incredibly clunky (in a
slightly different shape before this patch).
ExecEvalJsonExprItemCoercion() calls ExecPrepareJsonItemCoercion(),
which gets a pointer to one of the per-type elements in
JsonItemCoercionsState, dispatching on the type of the json
object. Then we later call ExecGetJsonItemCoercion() (via a
convoluted path), which again will dispatch on the type
(extracting the json object again afaics!), to then somehow
eventually get the coerced value. I think it might be possible
to make this a bit simpler, by not leaving anything
coercion-related in ExecEvalJsonExpr().
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
I left some pieces there, because I thought the error of not finding an
appropriate coercion must be thrown right away as the code in
ExecEvalJsonExpr() does after calling ExecGetJsonItemCoercion().
ExecPrepareJsonItemCoercion() is called later when it's time to
actually evaluate the coercion. If we move the error path to
ExecPrepareJsonItemCoercion(), both ExecGetJsonItemCoercion() and the
error path code in ExecEvalJsonExpr() will be unnecessary. I will
give that a try.
The first dispatch is done only for throwing error about missing cast
without starting subtransaction in which second dispatch is executed.
I agree, this is bad that result of first dispatch is not used later,
and I have removed second dispatch.
I don't understand the design of what needs to have error handling,
and what not.
I don't think subtransactions per-se are a fundamental problem.
I think the error handling implementation is ridiculously complicated,
and while I started to hack on improving it, I stopped when I really
couldn't understand what errors it actually needs to handle when and
why.
Here is the diagram that may help to understand error handling in
SQL/JSON functions (I hope it will be displayed correctly):
JSON path -------
expression \
->+-----------+ SQL/JSON +----------+ Result
PASSING args ------->| JSON path |--> item or --->| Output |-> SQL
->| executor | JSONB .->| Coercion | value
/ +-----------+ datum | +----------+
JSON + - - - -+ | | | |
Context ->: FORMAT : v v | v
item : JSON : error? empty? | error?
+ - - - -+ | | | |
| | +----------+ | /
v | | ON EMPTY |--> SQL --' /
error? | +----------+ value /
| | | /
\ | v /
\ \ error? /
\ \ | /
\______ \ | _____________/
\ \ | /
v v v v +----------+
+----------+ | Output | Result
| ON ERROR |--->| Coercion |--> SQL
+----------+ +----------+ value
| |
V V
EXCEPTION EXCEPTION
The first dashed box "FORMAT JSON" used for parsing JSON is absent in
our implementation, because we support only jsonb type which is
pre-parsed. This could be used in queries like that:
JSON_VALUE('invalid json', '$' DEFAULT 'error' ON ERROR) => 'error'
JSON path executor already has error handling and does not need
subtransactions. We had to add functions like numeric_add_opt_error()
which return error flag instead of throwing exceptions.
On Aug 10, 2022 at 3:57 AM Andres Freund<andres@anarazel.de> wrote:
One way this code could be drastically simplified is to force all
type-coercions to go through the "io coercion" path, which could be
implemented as a single execution step (which thus could trivially
start/finish a subtransaction) and would remove a lot of the
complicated code around coercions.
Could you please clarify how you think we might do the io coercion
wrapped with a subtransaction all as a single execution step? I
would've thought that we couldn't do the sub-transaction without
leaving ExecInterpExpr() anyway, so maybe you meant the io coercion
itself was done using some code outside ExecInterpExpr()?
The current JsonExpr code does it by recursively calling
ExecInterpExpr() using the nested ExprState expressly for the
coercion.
The basic idea is to rip out all the type-dependent stuff out and
replace it with a single JSON_IOCERCE step, which has a parameter
about whether to wrap things in a subtransaction or not. That step
would always perform the coercion by calling the text output function
of the input and the text input function of the output.
On Aug 3, 2022 at 12:00 AM Andres Freund<andres@anarazel.de> wrote:
But we don't need to wrap arbitrary evaluation in a subtransaction -
afaics the coercion calls a single function, not an arbitrary
expression?
SQL standard says that scalar SQL/JSON items are converted to SQL type
through CAST(corresponding_SQL_type_for_item AS returning_type).
Our JSON_VALUE implementation supports arbitrary output types that can
have specific CASTs from numeric, bool, datetime, which we can't
emulate with simple I/O coercion. But supporting of arbitrary types
may be dangerous, because SQL standard denotes only a limited set of
types:
The <data type> contained in the explicit or implicit
<JSON returning clause> shall be a <predefined type> that identifies
a character string data type, numeric data type, boolean data type,
or datetime data type.
I/O coercion will not even work in the following simple case:
JSON_VALUE('1.23', '$' RETURNING int)
It is expected to return 1::int, like ordinary cast 1.23::numeric::int.
Exceptions may come not only from coercions. Expressions in DEFAULT ON
EMPTY can also throw exceptions, which also must be handled.
Here is excerpt from ISO/IEC 19075-6:2021(E) "Part 6: Support for JSON",
which explains SQL standard features in human-readable manner:
6.4.3 JSON_VALUE:
<JSON value error behavior> specifies what to do if there is an
unhandled error. Unhandled errors can arise if there is an input
conversion error (for example, if the context item cannot be parsed),
an error returned by the SQL/JSON path engine, or an output
conversion error. The choices are the same as for
<JSON value empty behavior>.
When using DEFAULT <value expression> for either the empty or error
behavior, what happens if the <value expression> raises an exception?
The answer is that an error during empty behavior "falls through"
to the error behavior. If the error behavior itself has an error,
there is no further recourse but to raise the exception.
So, we need to support execution of arbitrary expressions inside a
subtransaction, and do not try to somehow simplify coercions.
In Amit's fix, wrapping DEFAULT ON EMPTY into subtransactions was
lost, mainly because there were no tests for this case. The following
test should not fall on the second row:
SELECT JSON_VALUE(jsonb '1', '$.a' RETURNING int
DEFAULT 1 / x ON EMPTY
DEFAULT 2 ON ERROR)
FROM (VALUES (1::int), (0)) x(x);
json_value
------------
1
2
I have added this test in 0003.
On Aug 3, 2022 at 12:00 AM Andres Freund<andres@anarazel.de> wrote:
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
I am not really sure if different coercions may be used
in the same query over multiple evaluations of the same JSON path
expression, but maybe that's also possible.
Even if the type can change, I don't think that means we need to have
space for multiple types at the same time - there can't be multiple
coercions happening at the same time, otherwise there could be two
coercions of the same type as well. So we don't need memory for
every coercion type.
Only the one item coercion is used in execution of JSON_VALUE().
Multiple coercions could be executed, if we supported quite useful SRF
JSON_QUERY() using "RETURNING SETOF type" (I had this idea for a long
time, but I didn't dare to implement it).
I don't understand what "memory" you mean. If we will not emit all
possible expressions statically, we will need to generate them
dynamically at run-time, and this could be hardly acceptable. In the
last version of the fix there is only 4 bytes (int jump) of additional
state space per coercion.
On Aug 6, 2022 at 5:37 Andres Freund<andres@anarazel.de> wrote:
There's one layer of subtransactions in one of the paths in
ExecEvalJsonExpr(), another in ExecEvalJson(). Some paths of
ExecEvalJsonExpr() go through subtransactions, others don't.
Really, there is only one level of subtransactions. Outer subtransaction
may be used for FORMAT JSON handling which always requires
subtransaction at the beginning of expression execution.
Inner subtransactions are conditional, they are started only and when
there is no outer subtransaction.
Now, outer subtransactions are not used at all,
ExecEvalJsonNeedsSubtransaction(NULL) always returns false. (AFAIR,
FORMAT JSON was present in older version of SQL/JSON patches, then it
was removed, but outer subtransactions were not). In the last version
of the fix I have removed them completely and moved inner
subtransactions into a separate executor step (see below).
The description of the patches:
0001 - Fix returning of json[b] domains in JSON_VALUE()
(This may require a separate thread.)
I found a bug in returning json[b] domains in JSON_VALUE(). json[b]
has special processing in JSON_VALUE, bypassing oridinary
SQL/JSON item type => SQL type coercions. But json[b] domains miss
this processing:
CREATE DOMAIN jsonb_not_null AS jsonb NOT NULL;
SELECT JSON_VALUE('"123"', '$' RETURNING jsonb);
"123"
SELECT JSON_VALUE( '123', '$' RETURNING jsonb);
123
SELECT JSON_VALUE('"123"', '$' RETURNING jsonb_not_null);
123
SELECT JSON_VALUE( '123', '$' RETURNING jsonb_not_null ERROR ON ERROR);
ERROR: SQL/JSON item cannot be cast to target type
Fixed by examinating output base type in parse_expr.c and skipping
allocation of item coercions, what later will be a signal for special
processing in ExecEvalJsonExpr().
0002 - Add EEOP_SUBTRANS executor step
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
So, the problem with inlining coercion evaluation into the main parent
JsonExpr's is that it needs to be wrapped in a sub-transaction to
catch any errors and return NULL instead. I don't know a way to wrap
ExprEvalStep evaluation in a sub-transaction to achieve that effect.
I also don't know way to run subtransactions without recursion in
executor, but I still managed to elimiate subsidary ExprStates.
I have introduced new EEOP_SUBTRANS step which executes its subsequent
steps in a subtransaction. It recursively calls a new variant of
ExecInterpExpr() in which starting stepno is passed. The return from
subtransaction is done with EEOP_DONE step that emitted after
subexpression. This step can be reused for other future expressions,
that's why it has no JSON prefix in its name (you could see recent
message in the thread about casts with default values, which are
missing in PostgreSQL).
But for JIT I still had to construct additional ExprState with a
function compiled from subexpression steps.
0003 - Simplify JsonExpr execution:
- New EEOP_SUBTRANS was used to wrap individual coercion expressions:
after execution it jumps to "done" or "onerror" step
- JSONEXPR_ITEM_COERCE step was removed
- JSONEXPR_COERCE split into JSONEXPR_IOCOERCE and JSONEXPR_POPULATE
- Removed all JsonExprPostEvalState
- JSONEXPR step simply returns jump address to one of its possible
continuations: done, onempty, onerror, coercion, coercion_subtrans,
io_coercion or one of item_coercions
- Fixed JsonExprNeedsSubTransaction(): considired more cases
- Eliminated transactions on Const expressions
--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company
Attachments:
v6-0001-Fix-returning-of-json-b-domains-in-JSON_VALUE.patchtext/x-patch; charset=UTF-8; name=v6-0001-Fix-returning-of-json-b-domains-in-JSON_VALUE.patchDownload+318-27
v6-0002-Add-EEOP_SUBTRANS-executor-step.patchtext/x-patch; charset=UTF-8; name=v6-0002-Add-EEOP_SUBTRANS-executor-step.patchDownload+198-9
v6-0003-Simplify-JsonExpr-execution.patchtext/x-patch; charset=UTF-8; name=v6-0003-Simplify-JsonExpr-execution.patchDownload+1110-587
Hi,
On 2022-08-15 15:38:53 -0700, Andres Freund wrote:
Next question:
/*
* We should catch exceptions of category ERRCODE_DATA_EXCEPTION and
* execute the corresponding ON ERROR behavior then.
*/
oldcontext = CurrentMemoryContext;
oldowner = CurrentResourceOwner;Assert(error);
BeginInternalSubTransaction(NULL);
/* Want to execute expressions inside function's memory context */
MemoryContextSwitchTo(oldcontext);PG_TRY();
{
res = func(op, econtext, res, resnull, p, error);/* Commit the inner transaction, return to outer xact context */
ReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;
}
PG_CATCH();
{
ErrorData *edata;
int ecategory;/* Save error info in oldcontext */
MemoryContextSwitchTo(oldcontext);
edata = CopyErrorData();
FlushErrorState();/* Abort the inner transaction */
RollbackAndReleaseCurrentSubTransaction();
MemoryContextSwitchTo(oldcontext);
CurrentResourceOwner = oldowner;Two points:
1) I suspect it's not safe to switch to oldcontext before calling func().
On error we'll have leaked memory into oldcontext and we'll just continue
on. It might not be very consequential here, because the calling context
presumably isn't very long lived, but that's probably not something we should
rely on.Also, are we sure that the context will be in a clean state when it's used
within an erroring subtransaction?I think the right thing here would be to stay in the subtransaction context
and then copy the datum out to the surrounding context in the success case.2) If there was an out-of-memory error, it'll have been in oldcontext. So
switching back to it before calling CopyErrorData() doesn't seem good - we'll
just hit OOM issues again.I realize that both of these issues are present in plenty other code (see
e.g. plperl_spi_exec()). So I'm curious why they are ok?
Certainly seems to be missing a FreeErrorData() for the happy path?
It'd be nicer if we didn't copy the error. In the case we rethrow we don't
need it, because we can just PG_RE_THROW(). And in the other path we just want
to get the error code. It just risks additional errors to CopyErrorData(). But
it's not entirely obvious that geterrcode() is intended for this:
* This is only intended for use in error callback subroutines, since there
* is no other place outside elog.c where the concept is meaningful.
*/
a PG_CATCH() block isn't really an error callback subroutine. But it should be
fine.
Greetings,
Andres Freund
Hi,
On 2022-08-16 04:02:17 +0300, Nikita Glukhov wrote:
Hi,
On 16.08.2022 01:38, Andres Freund wrote:
Continuation from the thread at
/messages/by-id/20220811171740.m5b4h7x63g4lzgrk@awork3.anarazel.deI started hacking on this Friday. I think there's some relatively easy
improvements that make the code substantially more understandable, at least
for me, without even addressing the structural stuff.I also started hacking Friday, hacked all weekend, and now have a new
version of the patch.
Cool.
I don't understand the design of what needs to have error handling,
and what not.I don't think subtransactions per-se are a fundamental problem.
I think the error handling implementation is ridiculously complicated,
and while I started to hack on improving it, I stopped when I really
couldn't understand what errors it actually needs to handle when and
why.Here is the diagram that may help to understand error handling in
SQL/JSON functions (I hope it will be displayed correctly):
I think that is helpful.
JSON path ------- expression \ ->+-----------+ SQL/JSON +----------+ Result PASSING args ------->| JSON path |--> item or --->| Output |-> SQL ->| executor | JSONB .->| Coercion | value / +-----------+ datum | +----------+ JSON + - - - -+ | | | | Context ->: FORMAT : v v | v item : JSON : error? empty? | error? + - - - -+ | | | | | | +----------+ | / v | | ON EMPTY |--> SQL --' / error? | +----------+ value / | | | / \ | v / \ \ error? / \ \ | / \______ \ | _____________/ \ \ | / v v v v +----------+ +----------+ | Output | Result | ON ERROR |--->| Coercion |--> SQL +----------+ +----------+ value | | V V EXCEPTION EXCEPTIONThe first dashed box "FORMAT JSON" used for parsing JSON is absent in
our implementation, because we support only jsonb type which is
pre-parsed. This could be used in queries like that:
JSON_VALUE('invalid json', '$' DEFAULT 'error' ON ERROR) => 'error'
On Aug 3, 2022 at 12:00 AM Andres Freund<andres@anarazel.de> wrote:
But we don't need to wrap arbitrary evaluation in a subtransaction -
afaics the coercion calls a single function, not an arbitrary
expression?SQL standard says that scalar SQL/JSON items are converted to SQL type
through CAST(corresponding_SQL_type_for_item AS returning_type).
Our JSON_VALUE implementation supports arbitrary output types that can
have specific CASTs from numeric, bool, datetime, which we can't
emulate with simple I/O coercion. But supporting of arbitrary types
may be dangerous, because SQL standard denotes only a limited set of
types:The <data type> contained in the explicit or implicit
<JSON returning clause> shall be a <predefined type> that identifies
a character string data type, numeric data type, boolean data type,
or datetime data type.I/O coercion will not even work in the following simple case:
JSON_VALUE('1.23', '$' RETURNING int)
It is expected to return 1::int, like ordinary cast 1.23::numeric::int.
Whether it's just IO coercions or also coercions through function calls
doesn't matter terribly, as long as both can be wrapped as a single
interpretation step. You can have a EEOP_JSON_COERCE_IO,
EEOP_JSON_COERCE_FUNC that respectively call input/output function and the
transformation routine within a subtransaction. On error they can jump to some
on_error execution step.
The difficulty is likely just dealing with the intermediary nodes like
RelabelType.
Exceptions may come not only from coercions. Expressions in DEFAULT ON
EMPTY can also throw exceptions, which also must be handled.
Are there other cases?
Only the one item coercion is used in execution of JSON_VALUE().
Multiple coercions could be executed, if we supported quite useful SRF
JSON_QUERY() using "RETURNING SETOF type" (I had this idea for a long
time, but I didn't dare to implement it).I don't understand what "memory" you mean.
I'm not entirely sure what I meant at that time either. Understanding this
code involves a lot of guessing since there's practically no explanatory
comments.
If we will not emit all possible expressions statically, we will need to
generate them dynamically at run-time, and this could be hardly acceptable.
I'm not convinced that that's true. We spend a fair amount of memory
generating expression paths for the per-type elements in JsonItemCoercions,
most of which will never be used. Even trivial stuff ends up with ~2kB.
Then there's of course the executor side, where the various ExprStates really
add up:
MemoryContextStats(CurrentMemoryContext) in ExecInitExprRec(), just before
if (jext->coercions)
ExecutorState: 8192 total in 1 blocks; 4464 free (0 chunks); 3728 used
ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Grand total: 16384 bytes in 2 blocks; 12392 free (0 chunks); 3992 used
just after:
ExecutorState: 32768 total in 3 blocks; 15032 free (2 chunks); 17736 used
ExprContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Grand total: 40960 bytes in 4 blocks; 22960 free (2 chunks); 18000 used
for SELECT JSON_VALUE(NULL::jsonb, '$');
In the last version of the fix there is only 4 bytes (int jump) of
additional state space per coercion.
That's certainly a *lot* better.
On Aug 6, 2022 at 5:37 Andres Freund<andres@anarazel.de> wrote:
There's one layer of subtransactions in one of the paths in
ExecEvalJsonExpr(), another in ExecEvalJson(). Some paths of
ExecEvalJsonExpr() go through subtransactions, others don't.Really, there is only one level of subtransactions. Outer subtransaction
may be used for FORMAT JSON handling which always requires
subtransaction at the beginning of expression execution.
Inner subtransactions are conditional, they are started only and when
there is no outer subtransaction.
Yea, I realized that by now as well. But the code doesn't make that
understandable. E.g.:
Now, outer subtransactions are not used at all,
ExecEvalJsonNeedsSubtransaction(NULL) always returns false. (AFAIR,
FORMAT JSON was present in older version of SQL/JSON patches, then it
was removed, but outer subtransactions were not).
is very misleading.
0002 - Add EEOP_SUBTRANS executor step
On 2022-08-02 12:05:55 +0900, Amit Langote wrote:
So, the problem with inlining coercion evaluation into the main parent
JsonExpr's is that it needs to be wrapped in a sub-transaction to
catch any errors and return NULL instead. I don't know a way to wrap
ExprEvalStep evaluation in a sub-transaction to achieve that effect.I also don't know way to run subtransactions without recursion in
executor, but I still managed to elimiate subsidary ExprStates.I have introduced new EEOP_SUBTRANS step which executes its subsequent
steps in a subtransaction. It recursively calls a new variant of
ExecInterpExpr() in which starting stepno is passed. The return from
subtransaction is done with EEOP_DONE step that emitted after
subexpression. This step can be reused for other future expressions,
that's why it has no JSON prefix in its name (you could see recent
message in the thread about casts with default values, which are
missing in PostgreSQL).
I've wondered about this as well, but I think it'd require quite careful work
to be safe. And certainly isn't something we can do at this point in the cycle
- it'll potentially impact every query, not just ones with json in, if we
screw up something (or introduce overhead).
But for JIT I still had to construct additional ExprState with a
function compiled from subexpression steps.
JIT is one of the reason *not* want to construct subsidiary ExprState's, since
they will trigger separate code generation (and thus overhead).
Why did you have to do this?
0003 - Simplify JsonExpr execution:
- New EEOP_SUBTRANS was used to wrap individual coercion expressions:
after execution it jumps to "done" or "onerror" step
- JSONEXPR_ITEM_COERCE step was removed
- JSONEXPR_COERCE split into JSONEXPR_IOCOERCE and JSONEXPR_POPULATE
- Removed all JsonExprPostEvalState
- JSONEXPR step simply returns jump address to one of its possible
continuations: done, onempty, onerror, coercion, coercion_subtrans,
io_coercion or one of item_coercions
- Fixed JsonExprNeedsSubTransaction(): considired more cases
- Eliminated transactions on Const expressions
I pushed a few cleanups to https://github.com/anarazel/postgres/commits/json
while I was hacking on this (ignore that it's based on the meson tree, that's
just faster for me). Some of them might not be applicable anymore, but it
might still make sense for you to look at.
Greetings,
Andres Freund
On Mon, Aug 15, 2022 at 6:39 PM Andres Freund <andres@anarazel.de> wrote:
I don't think it's sane from a performance view to start a subtransaction for
every coercion, particularly because most coercion paths will never trigger an
error, leaving things like out-of-memory or interrupts aside. And those are
re-thrown by ExecEvalJsonExprSubtrans(). A quick and dirty benchmark shows
ERROR ON ERROR nearly 2xing speed. I'm worried about the system impact of
using subtransactions this heavily, it's not exactly the best performing
system - the only reason it's kind of ok here is that it's going to be very
rare to allocate a subxid, I think.
I agree. It kinda surprises me that we thought it was OK to commit
something that uses that many subtransactions. I feel like that's
going to cause people to hose themselves in ways that we can't really
do anything about. Like they'll test it out, it will work, and then
when they put it into production, they'll have constant wraparound
issues for which the only real solution is to not use the feature they
relied on to build the application.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 8/15/22 10:14 PM, Andres Freund wrote:
I pushed a few cleanups to https://github.com/anarazel/postgres/commits/json
while I was hacking on this (ignore that it's based on the meson tree, that's
just faster for me). Some of them might not be applicable anymore, but it
might still make sense for you to look at.
With RMT hat on, this appears to be making progress. A few questions /
comments for the group:
1. Nikita: Did you have a chance to review Andres's changes as well?
2. There seems to be some, though limited, progress on design docs.
Andres keeps making a point on adding additional comments to the code to
make it easier to follow. Please do not lose sight of this.
3. Robert raised a point about the use of subtransactions and the
increased risk of wraparound on busy systems using the SQL/JSON
features. Do these patches help reduce this risk? I read some clarity on
the use of subtransactions within the patchset, but want to better
understand if the risks pointed out are a concern.
Thanks everyone for your work on this so far!
Jonathan
Hi,
On 17.08.2022 04:45, Jonathan S. Katz wrote:
On 8/15/22 10:14 PM, Andres Freund wrote:
I pushed a few cleanups to
https://github.com/anarazel/postgres/commits/json
while I was hacking on this (ignore that it's based on the meson
tree, that's
just faster for me). Some of them might not be applicable anymore,
but it
might still make sense for you to look at.With RMT hat on, this appears to be making progress. A few questions /
comments for the group:1. Nikita: Did you have a chance to review Andres's changes as well?
Yes, I have reviewed Andres's changes, they all are ok.
Then I started to do on the top of it other fixes that help to avoid
subtransactions when they are not needed. And it ended in the new
refactoring of coercion code. Also I moved here from v6-0003 fix of
ExecEvalJsonNeedSubtransaction() which considers more cases.
On 16.08.2022 05:14, Andres Freund wrote:
But for JIT I still had to construct additional ExprState with a
function compiled from subexpression steps.
Why did you have to do this?
I simply did not dare to implement compilation of recursively-callable
function with additional parameter stepno. In the v8 patch I did it
by adding a switch with all possible jump addresses of EEOP_SUBTRANS
steps in the beginning of the function. And it really seems to work
faster, but needs more exploration. See patch 0003, where both
variants preserved using #ifdef.
The desciprion of the v7 patches:
0001 Simplify JsonExpr execution
Andres's changes + mine:
- Added JsonCoercionType enum, fields like via_io replaced with it
- Emit only context item steps in JSON_TABLE_OP case
- Skip coercion of NULLs to non-domain types (is it correct?)
0002 Fix returning of json[b] domains in JSON_VALUE:
simply rebase of v6 onto 0001
0003 Add EEOP_SUBTRANS executor step
v6 + new recursive JIT
0004 Split JsonExpr execution into steps
simply rebase of v6 + used LLMBuildSwitch() in EEOP_JSONEXPR
--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company
Attachments:
v7-0001-Simplify-JsonExpr-execution.patchtext/x-patch; charset=UTF-8; name=v7-0001-Simplify-JsonExpr-execution.patchDownload+472-348
v7-0002-Fix-returning-of-json-b-domains-in-JSON_VALUE.patchtext/x-patch; charset=UTF-8; name=v7-0002-Fix-returning-of-json-b-domains-in-JSON_VALUE.patchDownload+287-11
v7-0003-Add-EEOP_SUBTRANS-executor-step.patchtext/x-patch; charset=UTF-8; name=v7-0003-Add-EEOP_SUBTRANS-executor-step.patchDownload+330-13
v7-0004-Split-JsonExpr-execution-into-steps.patchtext/x-patch; charset=UTF-8; name=v7-0004-Split-JsonExpr-execution-into-steps.patchDownload+1085-624
Hi,
On 8/17/22 11:45 PM, Nikita Glukhov wrote:
Hi,
On 17.08.2022 04:45, Jonathan S. Katz wrote:
On 8/15/22 10:14 PM, Andres Freund wrote:
I pushed a few cleanups to
https://github.com/anarazel/postgres/commits/json
while I was hacking on this (ignore that it's based on the meson
tree, that's
just faster for me). Some of them might not be applicable anymore,
but it
might still make sense for you to look at.With RMT hat on, this appears to be making progress. A few questions /
comments for the group:1. Nikita: Did you have a chance to review Andres's changes as well?
Yes, I have reviewed Andres's changes, they all are ok.
Thank you!
Then I started to do on the top of it other fixes that help to avoid
subtransactions when they are not needed. And it ended in the new
refactoring of coercion code. Also I moved here from v6-0003 fix of
ExecEvalJsonNeedSubtransaction() which considers more cases.
Great.
Andres, Robert: Do these changes address your concerns about the use of
substransactions and reduce the risk of xid wraparound?
On 16.08.2022 05:14, Andres Freund wrote:
But for JIT I still had to construct additional ExprState with a
function compiled from subexpression steps.Why did you have to do this?
I simply did not dare to implement compilation of recursively-callable
function with additional parameter stepno. In the v8 patch I did it
by adding a switch with all possible jump addresses of EEOP_SUBTRANS
steps in the beginning of the function. And it really seems to work
faster, but needs more exploration. See patch 0003, where both
variants preserved using #ifdef.The desciprion of the v7 patches:
0001 Simplify JsonExpr execution
Andres's changes + mine:
- Added JsonCoercionType enum, fields like via_io replaced with it
- Emit only context item steps in JSON_TABLE_OP case
- Skip coercion of NULLs to non-domain types (is it correct?)0002 Fix returning of json[b] domains in JSON_VALUE:
simply rebase of v6 onto 00010003 Add EEOP_SUBTRANS executor step
v6 + new recursive JIT0004 Split JsonExpr execution into steps
simply rebase of v6 + used LLMBuildSwitch() in EEOP_JSONEXPR
What do folks think of these patches?
Thanks,
Jonathan
On 8/19/22 10:11 AM, Jonathan S. Katz wrote:
Hi,
On 8/17/22 11:45 PM, Nikita Glukhov wrote:
Hi,
On 17.08.2022 04:45, Jonathan S. Katz wrote:
On 8/15/22 10:14 PM, Andres Freund wrote:
I pushed a few cleanups to
https://github.com/anarazel/postgres/commits/json
while I was hacking on this (ignore that it's based on the meson
tree, that's
just faster for me). Some of them might not be applicable anymore,
but it
might still make sense for you to look at.With RMT hat on, this appears to be making progress. A few questions
/ comments for the group:1. Nikita: Did you have a chance to review Andres's changes as well?
Yes, I have reviewed Andres's changes, they all are ok.
Thank you!
Then I started to do on the top of it other fixes that help to avoid
subtransactions when they are not needed. And it ended in the new
refactoring of coercion code. Also I moved here from v6-0003 fix of
ExecEvalJsonNeedSubtransaction() which considers more cases.Great.
Andres, Robert: Do these changes address your concerns about the use of
substransactions and reduce the risk of xid wraparound?On 16.08.2022 05:14, Andres Freund wrote:
But for JIT I still had to construct additional ExprState with a
function compiled from subexpression steps.Why did you have to do this?
I simply did not dare to implement compilation of recursively-callable
function with additional parameter stepno. In the v8 patch I did it
by adding a switch with all possible jump addresses of EEOP_SUBTRANS
steps in the beginning of the function. And it really seems to work
faster, but needs more exploration. See patch 0003, where both
variants preserved using #ifdef.The desciprion of the v7 patches:
0001 Simplify JsonExpr execution
Andres's changes + mine:
- Added JsonCoercionType enum, fields like via_io replaced with it
- Emit only context item steps in JSON_TABLE_OP case
- Skip coercion of NULLs to non-domain types (is it correct?)0002 Fix returning of json[b] domains in JSON_VALUE:
simply rebase of v6 onto 00010003 Add EEOP_SUBTRANS executor step
v6 + new recursive JIT0004 Split JsonExpr execution into steps
simply rebase of v6 + used LLMBuildSwitch() in EEOP_JSONEXPRWhat do folks think of these patches?
Andres, Andrew, Amit, Robert -- as you have either worked on this or
expressed opinions -- any thoughts on this current patch set?
Thanks,
Jonathan
On Tue, Aug 23, 2022 at 10:52 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:
Andres, Andrew, Amit, Robert -- as you have either worked on this or
expressed opinions -- any thoughts on this current patch set?
FWIW, I've started looking at these patches.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Hi,
On 2022-08-22 21:52:01 -0400, Jonathan S. Katz wrote:
Andres, Andrew, Amit, Robert -- as you have either worked on this or
expressed opinions -- any thoughts on this current patch set?
To me it feels like there's a probably too much work here to cram it at this
point. If several other committers shared the load of working on this it'd
perhaps be doable, but I've not seen many volunteers.
Greetings,
Andres Freund
On Mon, Aug 22, 2022 at 07:57:29PM -0700, Andres Freund wrote:
To me it feels like there's a probably too much work here to cram it at this
point. If several other committers shared the load of working on this it'd
perhaps be doable, but I've not seen many volunteers.
While 0002 is dead simple, I am worried about the complexity created
by 0001, 0003 (particularly tightening subtransactions with a
CASE_EEOP) and 0004 at this late stage of the release process:
/messages/by-id/7d83684b-7932-9f29-400b-0beedfafcdd4@postgrespro.ru
This is not a good sign after three betas for a feature as complex as
this one.
--
Michael
Hi Nikita,
On Thu, Aug 18, 2022 at 12:46 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
The desciprion of the v7 patches:
0001 Simplify JsonExpr execution
Andres's changes + mine:
- Added JsonCoercionType enum, fields like via_io replaced with it
- Emit only context item steps in JSON_TABLE_OP case
- Skip coercion of NULLs to non-domain types (is it correct?)
I like the parser changes to add JsonCoercionType, because that makes
ExecEvalJsonExprCoercion() so much simpler to follow.
In coerceJsonExpr():
+ if (!allow_io_coercion)
+ return NULL;
+
Might it make more sense to create a JsonCoercion even in this case
and assign it the type JSON_COERCION_ERROR, rather than allow the
coercion to be NULL and doing the following in ExecInitExprRec():
+ if (!*coercion)
+ /* Missing coercion here means missing cast */
+ cstate->type = JSON_COERCION_ERROR;
Likewise in transformJsonFuncExpr():
+ if (coercion_expr != (Node *) placeholder)
+ {
+ jsexpr->result_coercion = makeNode(JsonCoercion);
+ jsexpr->result_coercion->expr = coercion_expr;
+ jsexpr->result_coercion->ctype = JSON_COERCION_VIA_EXPR;
+ }
How about creating a JSON_COERCION_NONE coercion in the else block of
this, just like coerceJsonExpr() does?
Related to that, the JSON_EXISTS_OP block in
ExecEvalJsonExprInternal() sounds to assume that result_coercion would
always be non-NULL, per the comment in the last line:
case JSON_EXISTS_OP:
{
bool exists = JsonPathExists(item, path,
jsestate->args,
error);
*resnull = error && *error;
res = BoolGetDatum(exists);
break; /* always use result coercion */
}
...but it won't be if the above condition is false?
0002 Fix returning of json[b] domains in JSON_VALUE:
simply rebase of v6 onto 0001
Especially after seeing the new comments in this one, I'm wondering if
it makes sense to rename result_coercion to, say, default_coercion?
0003 Add EEOP_SUBTRANS executor step
v6 + new recursive JIT0004 Split JsonExpr execution into steps
simply rebase of v6 + used LLMBuildSwitch() in EEOP_JSONEXPR
Will need to spend more time looking at these.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Aug 23, 2022 at 4:48 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Aug 18, 2022 at 12:46 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
The desciprion of the v7 patches:
0003 Add EEOP_SUBTRANS executor step
v6 + new recursive JIT0004 Split JsonExpr execution into steps
simply rebase of v6 + used LLMBuildSwitch() in EEOP_JSONEXPRWill need to spend more time looking at these.
0004 adds the following to initJsonItemCoercions():
+ /* When returning JSON types, no need to initialize coercions */
+ /* XXX domain types on json/jsonb */
+ if (returning->typid == JSONBOID || returning->typid == JSONOID)
+ return NULL;
But maybe it's dead code, because 0001 has this:
+ if (jsexpr->returning->typid != JSONOID &&
+ jsexpr->returning->typid != JSONBOID)
+ jsexpr->coercions =
+ initJsonItemCoercions(pstate, jsexpr->returning,
+ exprType(contextItemExpr));
+ /* We need to handle RETURNING int etc. */
Is this a TODO and what does it mean?
+ * "JsonCoercion == NULL" means no cast is available.
+ * "JsonCoercion.expr == NULL" means no coercion is needed.
As said in my previous email, I wonder if these cases are better
handled by adding JSON_COERCION_ERROR and JSON_COERCION_NONE
coercions?
+/* Skip calling ExecEvalJson() on a JsonExpr? */
ExecEvalJsonExpr()
Will look more.
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On 8/23/22 12:13 AM, Michael Paquier wrote:
On Mon, Aug 22, 2022 at 07:57:29PM -0700, Andres Freund wrote:
To me it feels like there's a probably too much work here to cram it at this
point. If several other committers shared the load of working on this it'd
perhaps be doable, but I've not seen many volunteers.While 0002 is dead simple, I am worried about the complexity created
by 0001, 0003 (particularly tightening subtransactions with a
CASE_EEOP) and 0004 at this late stage of the release process:
/messages/by-id/7d83684b-7932-9f29-400b-0beedfafcdd4@postgrespro.ruThis is not a good sign after three betas for a feature as complex as
this one.
I see Amit is taking a closer look at the patches.
The RMT had its regular meeting today and discussed the state of
progress on this and how it reflects release timing. Our feeling is that
regardless if the patchset is included/reverted, it would necessitate a
Beta 4 (to be discussed with release team). While no Beta 4 date is set,
given where we are this would probably push the release into early
October to allow for adequate testing time.
To say it another way, if we want to ensure we can have a 15.1 in the
November update releases, we need to make a decision soon on how we want
to proceed.
Thanks,
Jonathan
On Mon, Aug 22, 2022 at 9:52 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
Andres, Robert: Do these changes address your concerns about the use of
substransactions and reduce the risk of xid wraparound?Andres, Andrew, Amit, Robert -- as you have either worked on this or
expressed opinions -- any thoughts on this current patch set?
I do not think that using subtransactions as part of the expression
evaluation process is a sound idea pretty much under any
circumstances. Maybe if the subtransations aren't commonly created and
don't usually get XIDs there wouldn't be a big problem in practice,
but it's an awfully heavyweight operation to be done inside expression
evaluation even in corner cases. I think that if we need to make
certain operations that would throw errors not throw errors, we need
to refactor interfaces until it's possible to return an error
indicator up to the appropriate level, not just let the error be
thrown and catch it.
The patches in question are thousands of lines of new code that I
simply do not have time or interest to review in detail. I didn't
commit this feature, or write this feature, or review this feature.
I'm not familiar with any of the code. To really know what's going on
here, I would need to review not only the new patches but also all the
code in the original commits, and probably some of the preexisting
code from before those commits that I have never examined in the past.
That would take me quite a few months even if I had absolutely nothing
else to do. And because I haven't been involved in this patch set in
any way, I don't think it's really my responsibility.
At the end of the day, the RMT is going to have to take a call here.
It seems to me that Andres's concerns about code quality and lack of
comments are probably somewhat legitimate, and in particular I do not
think the use of subtransactions is a good idea. I also don't think
that trying to fix those problems or generally improve the code by
committing thousands of lines of new code in August when we're
targeting a release in September or October is necessarily a good
idea. But I'm also not in a position to say that the project is going
to be irreparably damaged if we just ship what we've got, perhaps
after fixing the most acute problems that we currently know about.
This is after all relatively isolated from the rest of the system.
Fixing the stuff that touches the core executor is probably pretty
important, but beyond that, the worst thing that happens is the
feature sucks and people who try to use it have bad experiences. That
would be bad, and might be a sufficient reason to revert, but it's not
nearly as bad as, say, the whole system being slow, or data loss for
every user, or something like that. And we do have other bad code in
the system. Is this a lot worse? I'm not in a position to say one way
or the other.
--
Robert Haas
EDB: http://www.enterprisedb.com