MERGE ... WHEN NOT MATCHED BY SOURCE
This allows MERGE to UPDATE or DELETE target rows where there is no
matching source row. In addition, it allows the existing "WHEN NOT
MATCHED" syntax to include an optional "BY TARGET" to make its meaning
more explicit. E.g.,
MERGE INTO tgt USING src ON ...
WHEN NOT MATCHED BY SOURCE THEN UPDATE/DELETE ...
WHEN NOT MATCHED BY TARGET THEN INSERT ...
AFAIK, this is not part of the standard (though I only have a very old
draft copy). It is supported by at least 2 other major DB vendors
though, and I think it usefully rounds off the set of possible MERGE
actions.
Attached is a WIP patch. I haven't updated the docs yet, and there are
probably a few other things to tidy up and test, but the basic
functionality is there.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v1.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v1.patchDownload+343-56
On Fri, 30 Dec 2022 at 16:56, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Attached is a WIP patch.
Updated patch attached, now with updated docs and some other minor tidying up.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v2.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v2.patchDownload+420-71
I haven't read this patch other than superficially; I suppose the
feature it's introducing is an OK one to have as an extension to the
standard. (I hope the community members that are committee members
will propose this extension to become part of the standard.)
On 2023-Jan-02, Dean Rasheed wrote:
--- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root) /* * Add resjunk entries for any Vars used in each action's * targetlist and WHEN condition that belong to relations other - * than target. Note that aggregates, window functions and - * placeholder vars are not possible anywhere in MERGE's WHEN - * clauses. (PHVs may be added later, but they don't concern us - * here.) + * than target. Note that aggregates and window functions are not + * possible anywhere in MERGE's WHEN clauses, but PlaceHolderVars + * may have been added by subquery pullup. */ vars = pull_var_clause((Node *) list_concat_copy((List *) action->qual, action->targetList), - 0); + PVC_INCLUDE_PLACEHOLDERS);
Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something
that can already be hit by existing features of MERGE? In other words
-- is this a bug fix that should be backpatched ahead of introducing NOT
MATCHED BY SOURCE?
@@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M
*/
is_terminal[0] = false;
is_terminal[1] = false;
+ is_terminal[2] = false;
I think these 0/1/2 should be replaced by the values of MergeMatchKind.
+ /* Join type required */ + if (left_join && right_join) + qry->mergeJoinType = JOIN_FULL; + else if (left_join) + qry->mergeJoinType = JOIN_LEFT; + else if (right_join) + qry->mergeJoinType = JOIN_RIGHT; + else + qry->mergeJoinType = JOIN_INNER;
One of the review comments that MERGE got initially was that parse
analysis was not a place to "do query optimization", in the sense that
the original code was making a decision whether to make an outer or
inner join based on the set of WHEN clauses that appear in the command.
That's how we ended up with transform_MERGE_to_join and
mergeUseOuterJoin instead. This new code is certainly not the same, but
it makes me a bit unconfortable. Maybe it's OK, though.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I haven't read this patch other than superficially; I suppose the
feature it's introducing is an OK one to have as an extension to the
standard. (I hope the community members that are committee members
will propose this extension to become part of the standard.)
Thanks for looking!
--- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root) /* * Add resjunk entries for any Vars used in each action's * targetlist and WHEN condition that belong to relations other - * than target. Note that aggregates, window functions and - * placeholder vars are not possible anywhere in MERGE's WHEN - * clauses. (PHVs may be added later, but they don't concern us - * here.) + * than target. Note that aggregates and window functions are not + * possible anywhere in MERGE's WHEN clauses, but PlaceHolderVars + * may have been added by subquery pullup. */ vars = pull_var_clause((Node *) list_concat_copy((List *) action->qual, action->targetList), - 0); + PVC_INCLUDE_PLACEHOLDERS);Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something
that can already be hit by existing features of MERGE? In other words
-- is this a bug fix that should be backpatched ahead of introducing NOT
MATCHED BY SOURCE?
It's new because of NOT MATCHED BY SOURCE, and I also found that I had
to make the same change in the MERGE INTO view patch, in the case
where the target view is simple enough to allow subquery pullup, but
also had INSTEAD OF triggers causing the pullup to happen in the
planner rather than the rewriter.
I couldn't think of a way that it could happen with the existing MERGE
code though, so I don't think it's a bug that needs fixing and
back-patching.
@@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M
*/
is_terminal[0] = false;
is_terminal[1] = false;
+ is_terminal[2] = false;I think these 0/1/2 should be replaced by the values of MergeMatchKind.
Agreed.
+ /* Join type required */ + if (left_join && right_join) + qry->mergeJoinType = JOIN_FULL; + else if (left_join) + qry->mergeJoinType = JOIN_LEFT; + else if (right_join) + qry->mergeJoinType = JOIN_RIGHT; + else + qry->mergeJoinType = JOIN_INNER;One of the review comments that MERGE got initially was that parse
analysis was not a place to "do query optimization", in the sense that
the original code was making a decision whether to make an outer or
inner join based on the set of WHEN clauses that appear in the command.
That's how we ended up with transform_MERGE_to_join and
mergeUseOuterJoin instead. This new code is certainly not the same, but
it makes me a bit unconfortable. Maybe it's OK, though.
Yeah I agree, it's a bit ugly. Perhaps a better solution would be to
do away with that field entirely and just make the decision in
transform_MERGE_to_join() by examining the action list again. That
would require making MergeAction's "matched" field a MergeMatchKind
rather than a bool, but maybe that's not so bad, since retaining that
information might prove useful one day.
Regards,
Dean
On Thu, 5 Jan 2023 at 13:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
+ /* Join type required */ + if (left_join && right_join) + qry->mergeJoinType = JOIN_FULL; + else if (left_join) + qry->mergeJoinType = JOIN_LEFT; + else if (right_join) + qry->mergeJoinType = JOIN_RIGHT; + else + qry->mergeJoinType = JOIN_INNER;One of the review comments that MERGE got initially was that parse
analysis was not a place to "do query optimization", in the sense that
the original code was making a decision whether to make an outer or
inner join based on the set of WHEN clauses that appear in the command.
That's how we ended up with transform_MERGE_to_join and
mergeUseOuterJoin instead. This new code is certainly not the same, but
it makes me a bit unconfortable. Maybe it's OK, though.Yeah I agree, it's a bit ugly. Perhaps a better solution would be to
do away with that field entirely and just make the decision in
transform_MERGE_to_join() by examining the action list again.
Attached is an updated patch taking that approach, allowing
mergeUseOuterJoin to be removed from the Query node, which I think is
probably a good thing.
Aside from that, it includes a few additional comment updates in the
executor that I'd missed, and psql tab completion support.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v3.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v3.patchDownload+493-82
Rebased version attached.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v4.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v4.patchDownload+492-81
On Tue, 10 Jan 2023 at 14:43, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Rebased version attached.
Rebased version, following 8eba3e3f02 and 5d29d525ff.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v5.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v5.patchDownload+492-82
On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:
On Tue, 10 Jan 2023 at 14:43, Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:Rebased version attached.
Rebased version, following 8eba3e3f02 and 5d29d525ff.
Regards,
Dean
Hi,
In transform_MERGE_to_join :
+ if (action->matchKind ==
MERGE_WHEN_NOT_MATCHED_BY_SOURCE)
+ tgt_only_tuples = true;
+ if (action->matchKind ==
MERGE_WHEN_NOT_MATCHED_BY_TARGET)
There should be an `else` in front of the second `if`.
When tgt_only_tuples and src_only_tuples are both true, we can come out of
the loop.
Cheers
On Sat, 21 Jan 2023 at 14:18, Ted Yu <yuzhihong@gmail.com> wrote:
On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Rebased version, following 8eba3e3f02 and 5d29d525ff.
Another rebased version attached.
In transform_MERGE_to_join :
+ if (action->matchKind == MERGE_WHEN_NOT_MATCHED_BY_SOURCE) + tgt_only_tuples = true; + if (action->matchKind == MERGE_WHEN_NOT_MATCHED_BY_TARGET)There should be an `else` in front of the second `if`.
When tgt_only_tuples and src_only_tuples are both true, we can come out of the loop.
I decided not to do that. Adding an "else" doesn't change the code
that the compiler generates, and IMO it's slightly more readable
without it, since it keeps the line length shorter, and the test
conditions aligned, but that's a matter of opinion / personal
preference.
I think adding extra logic to exit the loop early if both
tgt_only_tuples and src_only_tuples are true would be a premature
optimisation, increasing the code size for no real benefit. In
practice, there are unlikely to be more than a few merge actions in
the list.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v6.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v6.patchDownload+491-81
On 1/4/23 12:57, Alvaro Herrera wrote:
I haven't read this patch other than superficially; I suppose the
feature it's introducing is an OK one to have as an extension to the
standard. (I hope the community members that are committee members
will propose this extension to become part of the standard.)
I have been doing some research on this, reading the original papers
that introduced the feature and its improvements.
I don't see anything that ever considered what this patch proposes, even
though SQL Server has it. (The initial MERGE didn't even have DELETE!)
SOURCE and TARGET are not currently keywords, but the only things that
can come after MATCHED are THEN and AND, so I don't foresee any issues
with us implementing this before the committee accepts such a change
proposal. I also don't see how the committee could possibly change the
semantics of this, and two implementations having it is a good argument
for getting it in.
We should be cautious in doing something differently from SQL Server
here, and I would appreciate any differences being brought to my
attention so I can incorporate them into a specification, even if that
means resorting to the hated "implementation-defined".
--
Vik Fearing
I see the PlaceHolderVar issue turned out to be a pre-existing bug after all.
Rebased version attached.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v7.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v7.patchDownload+487-76
On 2023-Mar-19, Dean Rasheed wrote:
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y new file mode 100644 index efe88cc..e1ebc8d --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y
+merge_when_tgt_matched: + WHEN MATCHED { $$ = MERGE_WHEN_MATCHED; } + | WHEN NOT MATCHED BY SOURCE { $$ = MERGE_WHEN_NOT_MATCHED_BY_SOURCE; } + ;
I think a one-line comment on why this "matched" production matches "NOT
MATCHED BY" would be useful. I think you have a big one in
transformMergeStmt already.
+ /* Combine it with the action's WHEN condition */ + if (action->qual == NULL) + action->qual = (Node *) ntest; + else + action->qual = + (Node *) makeBoolExpr(AND_EXPR, + list_make2(ntest, action->qual), + -1);
Hmm, I think ->qual is already in implicit-and form, so do you really
need to makeBoolExpr, or would it be sufficient to append this new
condition to the list?
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)
On Tue, 21 Mar 2023 at 10:28, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
+ /* Combine it with the action's WHEN condition */ + if (action->qual == NULL) + action->qual = (Node *) ntest; + else + action->qual = + (Node *) makeBoolExpr(AND_EXPR, + list_make2(ntest, action->qual), + -1);Hmm, I think ->qual is already in implicit-and form, so do you really
need to makeBoolExpr, or would it be sufficient to append this new
condition to the list?
No, this has come directly from transformWhereClause() in the parser,
so it's an expression tree, not a list. Transforming to implicit-and
form doesn't happen until later.
Looking at it with fresh eyes though, I realise that I could have just written
action->qual = make_and_qual((Node *) ntest, action->qual);
which is equivalent, but more concise.
Regards,
Dean
On 2023-Mar-21, Dean Rasheed wrote:
Looking at it with fresh eyes though, I realise that I could have just written
action->qual = make_and_qual((Node *) ntest, action->qual);
which is equivalent, but more concise.
Nice.
I have no further observations about this patch.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2023-Mar-21, Dean Rasheed wrote:
Looking at it with fresh eyes though, I realise that I could have just written
action->qual = make_and_qual((Node *) ntest, action->qual);
which is equivalent, but more concise.
Nice.
I have no further observations about this patch.
Looking at this one afresh, it seems that the change to make Vars
outer-join aware broke it -- the Var in the qual to test whether the
source row is null needs to be marked as nullable by the join added by
transform_MERGE_to_join(). That's something that needs to be done in
transform_MERGE_to_join(), so it makes more sense to add the new qual
there rather than in transformMergeStmt().
Also, now that MERGE has ruleutils support, it's clear that adding the
qual in transformMergeStmt() isn't right anyway, since it would then
appear in the deparsed output.
So attached is an updated patch doing that, which seems neater all
round, since adding the qual is closely related to the join-type
choice, which is now a decision taken entirely in
transform_MERGE_to_join(). This requires a new "mergeSourceRelation"
field on the Query structure, but as before, it does away with the
"mergeUseOuterJoin" field.
I've also updated the ruleutils support. In the absence of any WHEN
NOT MATCHED BY SOURCE actions, this will output not-matched actions
simply as "WHEN NOT MATCHED" for backwards compatibility, and to be
SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE
actions though, I think it's preferable to output explicit "BY SOURCE"
and "BY TARGET" qualifiers for all not-matched actions, to make the
meaning clearer.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v8.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v8.patchDownload+593-88
Hi, this patch was marked in CF as "Needs Review" [1]https://commitfest.postgresql.org/46/4092/, but there has
been no activity on this thread for 6+ months.
Is anything else planned? Can you post something to elicit more
interest in the latest patch? Otherwise, if nothing happens then the
CF entry will be closed ("Returned with feedback") at the end of this
CF.
======
[1]: https://commitfest.postgresql.org/46/4092/
Kind Regards,
Peter Smith.
On Sat, 1 Jul 2023 at 18:04, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2023-Mar-21, Dean Rasheed wrote:
Looking at it with fresh eyes though, I realise that I could have just written
action->qual = make_and_qual((Node *) ntest, action->qual);
which is equivalent, but more concise.
Nice.
I have no further observations about this patch.
Looking at this one afresh, it seems that the change to make Vars
outer-join aware broke it -- the Var in the qual to test whether the
source row is null needs to be marked as nullable by the join added by
transform_MERGE_to_join(). That's something that needs to be done in
transform_MERGE_to_join(), so it makes more sense to add the new qual
there rather than in transformMergeStmt().Also, now that MERGE has ruleutils support, it's clear that adding the
qual in transformMergeStmt() isn't right anyway, since it would then
appear in the deparsed output.So attached is an updated patch doing that, which seems neater all
round, since adding the qual is closely related to the join-type
choice, which is now a decision taken entirely in
transform_MERGE_to_join(). This requires a new "mergeSourceRelation"
field on the Query structure, but as before, it does away with the
"mergeUseOuterJoin" field.I've also updated the ruleutils support. In the absence of any WHEN
NOT MATCHED BY SOURCE actions, this will output not-matched actions
simply as "WHEN NOT MATCHED" for backwards compatibility, and to be
SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE
actions though, I think it's preferable to output explicit "BY SOURCE"
and "BY TARGET" qualifiers for all not-matched actions, to make the
meaning clearer.
CFBot shows that the patch does not apply anymore as in [1]http://cfbot.cputube.org/patch_46_4092.log:
=== Applying patches on top of PostgreSQL commit ID
f2bf8fb04886e3ea82e7f7f86696ac78e06b7e60 ===
=== applying patch ./support-merge-when-not-matched-by-source-v8.patch
...
patching file doc/src/sgml/ref/merge.sgml
Hunk #5 FAILED at 409.
Hunk #9 FAILED at 673.
2 out of 9 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/merge.sgml.rej
..
patching file src/include/nodes/parsenodes.h
Hunk #1 succeeded at 175 (offset -8 lines).
Hunk #2 succeeded at 1657 (offset -6 lines).
Hunk #3 succeeded at 1674 (offset -6 lines).
Hunk #4 FAILED at 1696.
1 out of 4 hunks FAILED -- saving rejects to file
src/include/nodes/parsenodes.h.rej
Please post an updated version for the same.
[1]: http://cfbot.cputube.org/patch_46_4092.log
Regards,
Vignesh
On Mon, 22 Jan 2024 at 02:10, Peter Smith <smithpb2250@gmail.com> wrote:
Hi, this patch was marked in CF as "Needs Review" [1], but there has
been no activity on this thread for 6+ months.Is anything else planned? Can you post something to elicit more
interest in the latest patch? Otherwise, if nothing happens then the
CF entry will be closed ("Returned with feedback") at the end of this
CF.
I think it has had a decent amount of review and all the review
comments have been addressed. I'm not quite sure from Alvaro's last
comment whether he was implying that he thought it was ready for
commit.
Looking back through the thread, the general sentiment seems to be in
favour of adding this feature, and I still think it's worth doing, but
I haven't managed to find much time to progress it recently.
Regards,
Dean
n Fri, 26 Jan 2024 at 14:59, vignesh C <vignesh21@gmail.com> wrote:
CFBot shows that the patch does not apply anymore as in [1]:
Rebased version attached.
Regards,
Dean
Attachments:
support-merge-when-not-matched-by-source-v9.patchtext/x-patch; charset=US-ASCII; name=support-merge-when-not-matched-by-source-v9.patchDownload+588-88
On 2024-Jan-26, Dean Rasheed wrote:
I think it has had a decent amount of review and all the review
comments have been addressed. I'm not quite sure from Alvaro's last
comment whether he was implying that he thought it was ready for
commit.
Well, firstly this is clearly a feature we want to have, even though
it's non-standard, because people use it and other implementations have
it. (Eh, so maybe somebody should be talking to the SQL standard
committee about it). As for code quality, I didn't do a comprehensive
review, but I think it is quite reasonable. Therefore, my inclination
would be to get it committed soonish, and celebrate it widely so that
people can test it soon and complain if they see something they don't
like.
I have to say that I find the idea of booting patches as Returned with
Feedback just because of inactivity (as opposed to unresponsive authors)
rather wrong-headed, and I wish we didn't do it.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/