Some problems regarding the self-join elimination code
While working on another patch, I looked at ChangeVarNodes() and found
some issues introduced by the self-join elimination commit. I think
it'd better to fix or improve them.
* ChangeVarNodes_walker includes special handling for RestrictInfo
nodes. And it adjusts RestrictInfo.orclause only if the rt_index of
the relation to be removed is contained in RestrictInfo.clause_relids.
I don't think this is correct. Even if the to-be-removed relation is
not present in clause_relids, it can still be found somewhere within
the orclause. As an example, consider:
create table t (a int primary key, b int, c int);
explain (costs off)
select * from t t1
inner join t t2 on t1.a = t2.a
left join t t3 on t1.c = 1 and (t2.b = t3.b or t2.b = 1);
server closed the connection unexpectedly
This query triggers an Assert failure in match_orclause_to_indexcol,
and the root cause is that the removed relation is still present
in the outer_relids of the OR-clauses.
To fix, I think we may need to run ChangeVarNodes_walker on the
orclause in all cases.
* After the adjustment, the num_base_rels is recalculated as:
+ rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
I don't think this is correct either. num_base_rels is the number of
base rels in clause_relids and should not count outer-join relids.
And we have no guarantee that clause_relids does not include any
outer-join relids.
To fix, I think we should exclude all outer-join relids. Perhaps we
can leverage root->outer_join_rels to achieve this.
* I'm not sure why the self-join elimination commit removed all the
comments in ChangeVarNodes(Extended). I think these comments are very
helpful for understanding the code.
- /*
- * Must be prepared to start with a Query or a bare expression tree; if
- * it's a Query, go straight to query_tree_walker to make sure that
- * sublevels_up doesn't get incremented prematurely.
- */
if (node && IsA(node, Query))
{
Query *qry = (Query *) node;
- /*
- * If we are starting at a Query, and sublevels_up is
zero, then we
- * must also fix rangetable indexes in the Query
itself --- namely
- * resultRelation, mergeTargetRelation, exclRelIndex
and rowMarks
- * entries. sublevels_up cannot be zero when recursing into a
- * subquery, so there's no need to have the same logic inside
- * ChangeVarNodes_walker.
- */
if (sublevels_up == 0)
{
ListCell *l;
@@ -701,7 +772,6 @@ ChangeVarNodes(Node *node, int rt_index, int
new_index, int sublevels_up)
if (qry->mergeTargetRelation == rt_index)
qry->mergeTargetRelation = new_index;
- /* this is unlikely to ever be used, but ... */
* Speaking of comments, I’ve noticed that some changes are lacking
comments. For example, there are almost no comments regarding the
special handling of RestrictInfo nodes in ChangeVarNodes_walker,
except for an explanation about adding the NullTest.
BTW, there's a typo in that explanation: qual() should be equal().
This is not a thorough review of the code; I just randomly looked at
ChangeVarNodes(). It's possible that there are other issues that
haven't been discovered, but I think we should at least fix these.
I'll provide a patch for the fixes later. Apologies for being so
nitpicky.
Thanks
Richard
On Wed, Apr 2, 2025 at 10:26 PM Richard Guo <guofenglinux@gmail.com> wrote:
This is not a thorough review of the code; I just randomly looked at
ChangeVarNodes(). It's possible that there are other issues that
haven't been discovered, but I think we should at least fix these.
I'll provide a patch for the fixes later. Apologies for being so
nitpicky.
With more exposure to the changes in ChangeVarNodes(), I have some
more concerns. As I understand it, ChangeVarNodes() is designed to
update the varno fields of Var nodes in the given tree that belong to
the specific relation; neat and clear. However, now, within this
function, we also create new NullTest quals for SJE, which doesn't
seem like something this function should be handling. It makes this
function look a bit like a kludge.
Actually, while working on the reduce-NullTest-during-constant-folding
patch, I tried to figure out where the new NOT NULL quals from SJE are
added. I was a bit surprised to find that they are added in
ChangeVarNodes. I didn't expect this function to have this extra
responsibility.
Additionally, I have a minor suggestion regarding the new boolean
parameter, "change_RangeTblRef". AFAIU, by convention, we typically
use flags to control the behavior of walker functions, like in
pull_var_clause. While it's fine to use a boolean parameter here
given that we currently only have one flag, for the sake of future
extensibility, I think using flags might be a better choice.
Any ideas on how we could improve this function? Apologies again for
being nitpicky.
Thanks
Richard
On 4/2/25 15:26, Richard Guo wrote:
While working on another patch, I looked at ChangeVarNodes() and found
some issues introduced by the self-join elimination commit. I think
it'd better to fix or improve them.* ChangeVarNodes_walker includes special handling for RestrictInfo
nodes. And it adjusts RestrictInfo.orclause only if the rt_index of
the relation to be removed is contained in RestrictInfo.clause_relids.I don't think this is correct. Even if the to-be-removed relation is
not present in clause_relids, it can still be found somewhere within
the orclause. As an example, consider:
Yeah, discovering how we tolerated such a gaffe I found that the comment
on the clause_relids field is quite vague here:
"The relids (varnos+varnullingrels) actually referenced in the clause:"
and only in the RestrictInfo description you can find some additional
information. I think we should modify the check, uniting clause_relids
and required_relids before searching for the removing relid.
+ rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
I don't think this is correct either. num_base_rels is the number of
base rels in clause_relids and should not count outer-join relids.
And we have no guarantee that clause_relids does not include any
outer-join relids.
It is a clear mistake, no apologies to me.
To fix, I think we should exclude all outer-join relids. Perhaps we
can leverage root->outer_join_rels to achieve this.
I think we may use a difference in size of rinfo->clause_relids before
and after adjustion. If something were removed, just decrease num_base_rels.
* Speaking of comments, I’ve noticed that some changes are lacking
comments. For example, there are almost no comments regarding the
special handling of RestrictInfo nodes in ChangeVarNodes_walker,
except for an explanation about adding the NullTest.
Ok, it seems that comments were removed too hastily. Not addressed it yet.
I also added little code refactoring to make it more readable. See
fix.diff in attachment as my proposal for future discussion.
--
regards, Andrei Lepikhov
Attachments:
fix.difftext/x-patch; charset=UTF-8; name=fix.diffDownload
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 1c61085a0a7..df96b1bba6a 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -637,41 +637,58 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
if (IsA(node, RestrictInfo))
{
- RestrictInfo *rinfo = (RestrictInfo *) node;
- int relid = -1;
- bool is_req_equal =
- (rinfo->required_relids == rinfo->clause_relids);
- bool clause_relids_is_multiple =
- (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
-
- if (bms_is_member(context->rt_index, rinfo->clause_relids))
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+ int relid = -1;
+ bool is_req_equal =
+ (rinfo->required_relids == rinfo->clause_relids);
+ bool is_multiple =
+ (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
+
+ if (bms_is_member(context->rt_index, rinfo->clause_relids
+ /*bms_union(rinfo->clause_relids, rinfo->required_relids)*/))
{
- expression_tree_walker((Node *) rinfo->clause, ChangeVarNodes_walker, (void *) context);
- expression_tree_walker((Node *) rinfo->orclause, ChangeVarNodes_walker, (void *) context);
+ Relids new_clause_relids;
- rinfo->clause_relids =
- adjust_relid_set(rinfo->clause_relids, context->rt_index, context->new_index);
+ expression_tree_walker((Node *) rinfo->clause,
+ ChangeVarNodes_walker, (void *) context);
+ expression_tree_walker((Node *) rinfo->orclause,
+ ChangeVarNodes_walker, (void *) context);
+
+ new_clause_relids = adjust_relid_set(rinfo->clause_relids,
+ context->rt_index,
+ context->new_index);
+
+ /* This operation is legal until we remove only baserels */
+ rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
+ bms_num_members(new_clause_relids);
+
+ rinfo->clause_relids = new_clause_relids;
rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
- rinfo->left_relids =
- adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
- rinfo->right_relids =
- adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
+ rinfo->left_relids = adjust_relid_set(rinfo->left_relids,
+ context->rt_index,
+ context->new_index);
+ rinfo->right_relids = adjust_relid_set(rinfo->right_relids,
+ context->rt_index,
+ context->new_index);
}
if (is_req_equal)
rinfo->required_relids = rinfo->clause_relids;
else
- rinfo->required_relids =
- adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
+ rinfo->required_relids = adjust_relid_set(rinfo->required_relids,
+ context->rt_index,
+ context->new_index);
- rinfo->outer_relids =
- adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
- rinfo->incompatible_relids =
- adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
+ rinfo->outer_relids = adjust_relid_set(rinfo->outer_relids,
+ context->rt_index,
+ context->new_index);
+ rinfo->incompatible_relids = adjust_relid_set(rinfo->incompatible_relids,
+ context->rt_index,
+ context->new_index);
if (rinfo->mergeopfamilies &&
bms_get_singleton_member(rinfo->clause_relids, &relid) &&
- clause_relids_is_multiple &&
+ is_multiple &&
relid == context->new_index && IsA(rinfo->clause, OpExpr))
{
Expr *leftOp;
On 4/4/25 09:37, Richard Guo wrote:
On Wed, Apr 2, 2025 at 10:26 PM Richard Guo <guofenglinux@gmail.com> wrote:
With more exposure to the changes in ChangeVarNodes(), I have some
more concerns. As I understand it, ChangeVarNodes() is designed to
update the varno fields of Var nodes in the given tree that belong to
the specific relation; neat and clear. However, now, within this
function, we also create new NullTest quals for SJE, which doesn't
seem like something this function should be handling. It makes this
function look a bit like a kludge.
To be precise, inside the function we replace old clause with the
NullTest, because relid replacement action made it degenerate. It seems
essential to me and I think it may be ok to add a comment at the top of
ChangeVarNodes, describing this minor optimisation.
Additionally, I have a minor suggestion regarding the new boolean
parameter, "change_RangeTblRef". AFAIU, by convention, we typically
use flags to control the behavior of walker functions, like in
pull_var_clause. While it's fine to use a boolean parameter here
given that we currently only have one flag, for the sake of future
extensibility, I think using flags might be a better choice.
That was exactly why I wasn't happy with replacing the change_relid
routine with ChangeVarNodes.
But variant with a flag seems non-trivial to implement. Do you have any
proposal about the code?
--
regards, Andrei Lepikhov
On Tue, 8 Apr 2025 at 12:31, Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/4/25 09:37, Richard Guo wrote:
On Wed, Apr 2, 2025 at 10:26 PM Richard Guo <guofenglinux@gmail.com> wrote:
With more exposure to the changes in ChangeVarNodes(), I have some
more concerns. As I understand it, ChangeVarNodes() is designed to
update the varno fields of Var nodes in the given tree that belong to
the specific relation; neat and clear. However, now, within this
function, we also create new NullTest quals for SJE, which doesn't
seem like something this function should be handling. It makes this
function look a bit like a kludge.To be precise, inside the function we replace old clause with the
NullTest, because relid replacement action made it degenerate. It seems
essential to me and I think it may be ok to add a comment at the top of
ChangeVarNodes, describing this minor optimisation.
I recall raising the same concerns when I originally reviewed the
patch. I don't think this code belongs in the rewriter, because it's
very specific to how SJE wants to handle these kinds of nodes.
Note also that RestrictInfo is defined in nodes/pathnodes.h, which
describes its contents as internal data structures for the planner,
suggesting that the rewriter has no business processing those kinds of
nodes.
I don't think simply adding a comment addresses those concerns.
Additionally, I have a minor suggestion regarding the new boolean
parameter, "change_RangeTblRef". AFAIU, by convention, we typically
use flags to control the behavior of walker functions, like in
pull_var_clause. While it's fine to use a boolean parameter here
given that we currently only have one flag, for the sake of future
extensibility, I think using flags might be a better choice.That was exactly why I wasn't happy with replacing the change_relid
routine with ChangeVarNodes.
But variant with a flag seems non-trivial to implement. Do you have any
proposal about the code?
Perhaps the way to do it is to make ChangeVarNodesExtended() take a
callback function to be invoked for each node visited. The callback
(which would then be in the planner, with the other SJE code) would do
any special handling it needed (for RangeTblRef and RestrictInfo
nodes), and call ChangeVarNodes_walker() for any other types of node,
to get the default behaviour.
Regards,
Dean
On Tue, Apr 8, 2025 at 11:12 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Tue, 8 Apr 2025 at 12:31, Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/4/25 09:37, Richard Guo wrote:
With more exposure to the changes in ChangeVarNodes(), I have some
more concerns. As I understand it, ChangeVarNodes() is designed to
update the varno fields of Var nodes in the given tree that belong to
the specific relation; neat and clear. However, now, within this
function, we also create new NullTest quals for SJE, which doesn't
seem like something this function should be handling. It makes this
function look a bit like a kludge.
To be precise, inside the function we replace old clause with the
NullTest, because relid replacement action made it degenerate. It seems
essential to me and I think it may be ok to add a comment at the top of
ChangeVarNodes, describing this minor optimisation.
I recall raising the same concerns when I originally reviewed the
patch. I don't think this code belongs in the rewriter, because it's
very specific to how SJE wants to handle these kinds of nodes.
Correct. And I don't think it's this function's responsibility to
create new NullTest quals for SJE.
Note also that RestrictInfo is defined in nodes/pathnodes.h, which
describes its contents as internal data structures for the planner,
suggesting that the rewriter has no business processing those kinds of
nodes.
I don't think simply adding a comment addresses those concerns.
Agreed. We may need more than just a comment change.
Additionally, I have a minor suggestion regarding the new boolean
parameter, "change_RangeTblRef". AFAIU, by convention, we typically
use flags to control the behavior of walker functions, like in
pull_var_clause. While it's fine to use a boolean parameter here
given that we currently only have one flag, for the sake of future
extensibility, I think using flags might be a better choice.
That was exactly why I wasn't happy with replacing the change_relid
routine with ChangeVarNodes.
But variant with a flag seems non-trivial to implement. Do you have any
proposal about the code?
Perhaps the way to do it is to make ChangeVarNodesExtended() take a
callback function to be invoked for each node visited. The callback
(which would then be in the planner, with the other SJE code) would do
any special handling it needed (for RangeTblRef and RestrictInfo
nodes), and call ChangeVarNodes_walker() for any other types of node,
to get the default behaviour.
Yeah, this might be a better approach. Perhaps we can borrow some
ideas from replace_rte_variables.
Thanks
Richard
On 4/9/25 04:05, Richard Guo wrote:
On Tue, Apr 8, 2025 at 11:12 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Tue, 8 Apr 2025 at 12:31, Andrei Lepikhov <lepihov@gmail.com> wrote:
Perhaps the way to do it is to make ChangeVarNodesExtended() take a
callback function to be invoked for each node visited. The callback
(which would then be in the planner, with the other SJE code) would do
any special handling it needed (for RangeTblRef and RestrictInfo
nodes), and call ChangeVarNodes_walker() for any other types of node,
to get the default behaviour.Yeah, this might be a better approach. Perhaps we can borrow some
ideas from replace_rte_variables.
It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1]/messages/by-id/3622801.1715010885@sss.pgh.pa.us to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node change?
[1]: /messages/by-id/3622801.1715010885@sss.pgh.pa.us
--
regards, Andrei Lepikhov
On Wed, Apr 9, 2025 at 10:39 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/9/25 04:05, Richard Guo wrote:
On Tue, Apr 8, 2025 at 11:12 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Tue, 8 Apr 2025 at 12:31, Andrei Lepikhov <lepihov@gmail.com> wrote:
Perhaps the way to do it is to make ChangeVarNodesExtended() take a
callback function to be invoked for each node visited. The callback
(which would then be in the planner, with the other SJE code) would do
any special handling it needed (for RangeTblRef and RestrictInfo
nodes), and call ChangeVarNodes_walker() for any other types of node,
to get the default behaviour.Yeah, this might be a better approach. Perhaps we can borrow some
ideas from replace_rte_variables.It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1] to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node change?
What about adding a callback to ChangeVarNodes_context that would
called for each RestrictInfo after changing varnodes itself? SJE
could use a callback that replaces OpExpr with NullTest when needed.
------
Regards,
Alexander Korotkov
Supabase
On 4/10/25 13:36, Alexander Korotkov wrote:
On Wed, Apr 9, 2025 at 10:39 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1] to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node change?What about adding a callback to ChangeVarNodes_context that would
called for each RestrictInfo after changing varnodes itself? SJE
could use a callback that replaces OpExpr with NullTest when needed.
I think it is doable, of course. Just looking forward a little, it may
need more complication in the future (SJE definitely should be widened
to partitioned tables) and it may be simpler to have two different
routines for two different stages of planning.
--
regards, Andrei Lepikhov
On 4/10/25 14:39, Andrei Lepikhov wrote:
On 4/10/25 13:36, Alexander Korotkov wrote:
On Wed, Apr 9, 2025 at 10:39 AM Andrei Lepikhov <lepihov@gmail.com>
wrote:It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1] to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node
change?What about adding a callback to ChangeVarNodes_context that would
called for each RestrictInfo after changing varnodes itself? SJE
could use a callback that replaces OpExpr with NullTest when needed.I think it is doable, of course. Just looking forward a little, it may
need more complication in the future (SJE definitely should be widened
to partitioned tables) and it may be simpler to have two different
routines for two different stages of planning.
To provide some food for thought, here is a draft in attachment which
addresses both issues: RestrictInfo relid replacement and move
SJE-specific code out of the ChangeVarNodes routine (callback approach).
--
regards, Andrei Lepikhov
Attachments:
v0-0001-Switch-the-approach-to-ChangeVarNodes-s-extensibi.patchtext/x-patch; charset=UTF-8; name=v0-0001-Switch-the-approach-to-ChangeVarNodes-s-extensibi.patchDownload
From 6b68703d7b38326393da58e42618e4915a0e4590 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepihov@gmail.com>
Date: Fri, 11 Apr 2025 14:30:33 +0200
Subject: [PATCH v0] Switch the approach to ChangeVarNodes's extensibility.
---
src/backend/optimizer/plan/analyzejoins.c | 149 +++++++++++++++++++---
src/backend/rewrite/rewriteManip.c | 95 ++------------
src/include/rewrite/rewriteManip.h | 14 +-
3 files changed, 161 insertions(+), 97 deletions(-)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 6b58567f511..0f0ed1785e6 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -74,6 +74,7 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
List *restrictlist,
List **extra_clauses);
static int self_join_candidates_cmp(const void *a, const void *b);
+static bool ChangeVarNodes_callback(Node *node, void *arg);
/*
@@ -397,7 +398,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
{
Assert(subst > 0);
- ChangeVarNodes((Node *) sjinf->semi_rhs_exprs, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) sjinf->semi_rhs_exprs, relid, subst,
+ 0, ChangeVarNodes_callback);
}
}
@@ -458,7 +460,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
sjinfo->ojrelid, subst);
Assert(!bms_is_empty(phv->phrels));
- ChangeVarNodes((Node *) phv->phexpr, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
+ ChangeVarNodes_callback);
Assert(phv->phnullingrels == NULL); /* no need to adjust */
}
@@ -512,7 +515,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
}
if (subst > 0)
- ChangeVarNodes((Node *) otherrel->lateral_vars, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) otherrel->lateral_vars, relid,
+ subst, 0, ChangeVarNodes_callback);
}
}
@@ -746,7 +750,8 @@ remove_rel_from_eclass(EquivalenceClass *ec, SpecialJoinInfo *sjinfo,
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
if (sjinfo == NULL)
- ChangeVarNodes((Node *) rinfo, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) rinfo, relid, subst, 0,
+ ChangeVarNodes_callback);
else
remove_rel_from_restrictinfo(rinfo, relid, sjinfo->ojrelid);
}
@@ -1537,7 +1542,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
em->em_jdomain->jd_relids = adjust_relid_set(em->em_jdomain->jd_relids, from, to);
/* We only process inner joins */
- ChangeVarNodes((Node *) em->em_expr, from, to, 0);
+ ChangeVarNodesExtended((Node *) em->em_expr, from, to, 0,
+ ChangeVarNodes_callback);
foreach_node(EquivalenceMember, other, new_members)
{
@@ -1571,7 +1577,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
continue;
}
- ChangeVarNodes((Node *) rinfo, from, to, 0);
+ ChangeVarNodesExtended((Node *) rinfo, from, to, 0,
+ ChangeVarNodes_callback);
/*
* After switching the clause to the remaining relation, check it for
@@ -1666,6 +1673,108 @@ add_non_redundant_clauses(PlannerInfo *root,
}
}
+static bool
+ChangeVarNodes_callback(Node *node, void *arg)
+{
+ ChangeVarNodes_context *context = (ChangeVarNodes_context *) arg;
+
+ if (node == NULL)
+ return false;
+
+ if (IsA(node, RangeTblRef))
+ {
+ return false;
+ }
+ else if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+ int relid = -1;
+ bool is_req_equal =
+ (rinfo->required_relids == rinfo->clause_relids);
+ bool is_multiple =
+ (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
+
+ if (bms_is_member(context->rt_index,
+ bms_union(rinfo->clause_relids, rinfo->required_relids)))
+ {
+ Relids new_clause_relids;
+
+ expression_tree_walker((Node *) rinfo->clause,
+ ChangeVarNodes_callback, arg);
+ expression_tree_walker((Node *) rinfo->orclause,
+ ChangeVarNodes_callback, arg);
+
+ new_clause_relids = adjust_relid_set(rinfo->clause_relids,
+ context->rt_index,
+ context->new_index);
+
+ /* This operation is legal until we remove only baserels */
+ rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
+ bms_num_members(new_clause_relids);
+
+ rinfo->clause_relids = new_clause_relids;
+ rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
+ rinfo->left_relids = adjust_relid_set(rinfo->left_relids,
+ context->rt_index,
+ context->new_index);
+ rinfo->right_relids = adjust_relid_set(rinfo->right_relids,
+ context->rt_index,
+ context->new_index);
+ }
+
+ if (is_req_equal)
+ rinfo->required_relids = rinfo->clause_relids;
+ else
+ rinfo->required_relids = adjust_relid_set(rinfo->required_relids,
+ context->rt_index,
+ context->new_index);
+
+ rinfo->outer_relids = adjust_relid_set(rinfo->outer_relids,
+ context->rt_index,
+ context->new_index);
+ rinfo->incompatible_relids = adjust_relid_set(rinfo->incompatible_relids,
+ context->rt_index,
+ context->new_index);
+
+ if (rinfo->mergeopfamilies &&
+ bms_get_singleton_member(rinfo->clause_relids, &relid) &&
+ is_multiple &&
+ relid == context->new_index && IsA(rinfo->clause, OpExpr))
+ {
+ Expr *leftOp;
+ Expr *rightOp;
+
+ leftOp = (Expr *) get_leftop(rinfo->clause);
+ rightOp = (Expr *) get_rightop(rinfo->clause);
+
+ /*
+ * For self-join elimination, changing varnos could transform
+ * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
+ * as "t1.a" is not null. We use qual() to check for such a case,
+ * and then we replace the qual for a check for not null
+ * (NullTest).
+ */
+ if (leftOp != NULL && equal(leftOp, rightOp))
+ {
+ NullTest *ntest = makeNode(NullTest);
+
+ ntest->arg = leftOp;
+ ntest->nulltesttype = IS_NOT_NULL;
+ ntest->argisrow = false;
+ ntest->location = -1;
+ rinfo->clause = (Expr *) ntest;
+ rinfo->mergeopfamilies = NIL;
+ rinfo->left_em = NULL;
+ rinfo->right_em = NULL;
+ }
+ Assert(rinfo->orclause == NULL);
+ }
+ return false;
+ }
+ else
+ return standard_ChangeVarNodes_walker(node, context);
+}
+
/*
* Remove a relation after we have proven that it participates only in an
* unneeded unique self-join.
@@ -1714,7 +1823,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
foreach_node(RestrictInfo, rinfo, joininfos)
{
remove_join_clause_from_rels(root, rinfo, rinfo->required_relids);
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1732,7 +1842,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
restrictlist);
foreach_node(RestrictInfo, rinfo, toRemove->baserestrictinfo)
{
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1774,7 +1885,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
{
Node *node = lfirst(lc);
- ChangeVarNodes(node, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended(node, toRemove->relid, toKeep->relid, 0,
+ ChangeVarNodes_callback);
if (!list_member(toKeep->reltarget->exprs, node))
toKeep->reltarget->exprs = lappend(toKeep->reltarget->exprs, node);
}
@@ -1818,14 +1930,18 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
* Replace varno in all the query structures, except nodes RangeTblRef
* otherwise later remove_rel_from_joinlist will yield errors.
*/
- ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid, 0, false);
+ ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
/* Replace links in the planner info */
remove_rel_from_query(root, toRemove, toKeep->relid, NULL, NULL);
/* At last, replace varno in root targetlist and HAVING clause */
- ChangeVarNodes((Node *) root->processed_tlist, toRemove->relid, toKeep->relid, 0);
- ChangeVarNodes((Node *) root->processed_groupClause, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
+ toKeep->relid, 0, ChangeVarNodes_callback);
+ ChangeVarNodesExtended((Node *) root->processed_groupClause,
+ toRemove->relid, toKeep->relid, 0,
+ ChangeVarNodes_callback);
adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
@@ -1908,8 +2024,10 @@ split_selfjoin_quals(PlannerInfo *root, List *joinquals, List **selfjoinquals,
* when we have cast of the same var to different (but compatible)
* types.
*/
- ChangeVarNodes(rightexpr, bms_singleton_member(rinfo->right_relids),
- bms_singleton_member(rinfo->left_relids), 0);
+ ChangeVarNodesExtended(rightexpr,
+ bms_singleton_member(rinfo->right_relids),
+ bms_singleton_member(rinfo->left_relids), 0,
+ ChangeVarNodes_callback);
if (equal(leftexpr, rightexpr))
sjoinquals = lappend(sjoinquals, rinfo);
@@ -1945,7 +2063,8 @@ match_unique_clauses(PlannerInfo *root, RelOptInfo *outer, List *uclauses,
bms_is_empty(rinfo->right_relids));
clause = (Expr *) copyObject(rinfo->clause);
- ChangeVarNodes((Node *) clause, relid, outer->relid, 0);
+ ChangeVarNodesExtended((Node *) clause, relid, outer->relid, 0,
+ ChangeVarNodes_callback);
iclause = bms_is_empty(rinfo->left_relids) ? get_rightop(clause) :
get_leftop(clause);
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 1c61085a0a7..790cd738f9e 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -550,16 +550,18 @@ offset_relid_set(Relids relids, int offset)
* earlier to ensure that no unwanted side-effects occur!
*/
-typedef struct
-{
- int rt_index;
- int new_index;
- int sublevels_up;
- bool change_RangeTblRef;
-} ChangeVarNodes_context;
-
static bool
ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
+{
+ /* if caller has defined a callback it may need to alter default behaviour */
+ if (context->callback)
+ return context->callback(node, (void *) context);
+ else
+ return standard_ChangeVarNodes_walker(node, context);
+}
+
+bool
+standard_ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
{
if (node == NULL)
return false;
@@ -588,7 +590,7 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
cexpr->cvarno = context->new_index;
return false;
}
- if (IsA(node, RangeTblRef) && context->change_RangeTblRef)
+ if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) node;
@@ -635,75 +637,6 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
return false;
}
- if (IsA(node, RestrictInfo))
- {
- RestrictInfo *rinfo = (RestrictInfo *) node;
- int relid = -1;
- bool is_req_equal =
- (rinfo->required_relids == rinfo->clause_relids);
- bool clause_relids_is_multiple =
- (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
-
- if (bms_is_member(context->rt_index, rinfo->clause_relids))
- {
- expression_tree_walker((Node *) rinfo->clause, ChangeVarNodes_walker, (void *) context);
- expression_tree_walker((Node *) rinfo->orclause, ChangeVarNodes_walker, (void *) context);
-
- rinfo->clause_relids =
- adjust_relid_set(rinfo->clause_relids, context->rt_index, context->new_index);
- rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
- rinfo->left_relids =
- adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
- rinfo->right_relids =
- adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
- }
-
- if (is_req_equal)
- rinfo->required_relids = rinfo->clause_relids;
- else
- rinfo->required_relids =
- adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
-
- rinfo->outer_relids =
- adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
- rinfo->incompatible_relids =
- adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
-
- if (rinfo->mergeopfamilies &&
- bms_get_singleton_member(rinfo->clause_relids, &relid) &&
- clause_relids_is_multiple &&
- relid == context->new_index && IsA(rinfo->clause, OpExpr))
- {
- Expr *leftOp;
- Expr *rightOp;
-
- leftOp = (Expr *) get_leftop(rinfo->clause);
- rightOp = (Expr *) get_rightop(rinfo->clause);
-
- /*
- * For self-join elimination, changing varnos could transform
- * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
- * as "t1.a" is not null. We use qual() to check for such a case,
- * and then we replace the qual for a check for not null
- * (NullTest).
- */
- if (leftOp != NULL && equal(leftOp, rightOp))
- {
- NullTest *ntest = makeNode(NullTest);
-
- ntest->arg = leftOp;
- ntest->nulltesttype = IS_NOT_NULL;
- ntest->argisrow = false;
- ntest->location = -1;
- rinfo->clause = (Expr *) ntest;
- rinfo->mergeopfamilies = NIL;
- rinfo->left_em = NULL;
- rinfo->right_em = NULL;
- }
- Assert(rinfo->orclause == NULL);
- }
- return false;
- }
if (IsA(node, AppendRelInfo))
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -749,14 +682,14 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
*/
void
ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef)
+ int sublevels_up, ChangeVarNodes_callback_type callback)
{
ChangeVarNodes_context context;
context.rt_index = rt_index;
context.new_index = new_index;
context.sublevels_up = sublevels_up;
- context.change_RangeTblRef = change_RangeTblRef;
+ context.callback = callback;
if (node && IsA(node, Query))
{
@@ -792,7 +725,7 @@ ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
void
ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
{
- ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, true);
+ ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, NULL);
}
/*
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index ea3908739c6..6d46274f1ee 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -41,7 +41,18 @@ typedef enum ReplaceVarsNoMatchOption
REPLACEVARS_SUBSTITUTE_NULL, /* replace with a NULL Const */
} ReplaceVarsNoMatchOption;
+typedef bool (*ChangeVarNodes_callback_type) (Node *node, void *arg);
+typedef struct
+{
+ int rt_index;
+ int new_index;
+ int sublevels_up;
+ ChangeVarNodes_callback_type callback;
+} ChangeVarNodes_context;
+
+extern bool standard_ChangeVarNodes_walker(Node *node,
+ ChangeVarNodes_context *context);
extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
extern void CombineRangeTables(List **dst_rtable, List **dst_perminfos,
List *src_rtable, List *src_perminfos);
@@ -49,7 +60,8 @@ extern void OffsetVarNodes(Node *node, int offset, int sublevels_up);
extern void ChangeVarNodes(Node *node, int rt_index, int new_index,
int sublevels_up);
extern void ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef);
+ int sublevels_up,
+ ChangeVarNodes_callback_type callback);
extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
int min_sublevels_up);
extern void IncrementVarSublevelsUp_rtable(List *rtable,
--
2.39.5
On Fri, Apr 11, 2025 at 5:46 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/10/25 14:39, Andrei Lepikhov wrote:
On 4/10/25 13:36, Alexander Korotkov wrote:
On Wed, Apr 9, 2025 at 10:39 AM Andrei Lepikhov <lepihov@gmail.com>
wrote:It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1] to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node
change?What about adding a callback to ChangeVarNodes_context that would
called for each RestrictInfo after changing varnodes itself? SJE
could use a callback that replaces OpExpr with NullTest when needed.I think it is doable, of course. Just looking forward a little, it may
need more complication in the future (SJE definitely should be widened
to partitioned tables) and it may be simpler to have two different
routines for two different stages of planning.To provide some food for thought, here is a draft in attachment which
addresses both issues: RestrictInfo relid replacement and move
SJE-specific code out of the ChangeVarNodes routine (callback approach).
Thank you, Andrei. I've put it all together.
0001 Fixes material bugs in ChangeVarNodes_walker() including regression test
0002 Puts back comments which got accidentally removed
0003 Refactors ChangeVarNodesExtended() with custom user-defined callback
I'm going to further work on improvement of these patches.
------
Regards,
Alexander Korotkov
Supabase
On Sun, Apr 27, 2025 at 2:02 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Fri, Apr 11, 2025 at 5:46 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/10/25 14:39, Andrei Lepikhov wrote:
On 4/10/25 13:36, Alexander Korotkov wrote:
On Wed, Apr 9, 2025 at 10:39 AM Andrei Lepikhov <lepihov@gmail.com>
wrote:It seems we are coming to the conclusion that join removal optimisation
may do something out of ChangeVarNodes resposibility. Before further
complicating of this function code I would like to know opinion of Tom,
who initially proposed [1] to use this routine. May be better a) return
to more specialised change_relid / sje_walker machinery or b) move
ChangeVarNodes out of rewriteManip and make it multi-purpose routine,
allowing to transform expression that may happen after a Var node
change?What about adding a callback to ChangeVarNodes_context that would
called for each RestrictInfo after changing varnodes itself? SJE
could use a callback that replaces OpExpr with NullTest when needed.I think it is doable, of course. Just looking forward a little, it may
need more complication in the future (SJE definitely should be widened
to partitioned tables) and it may be simpler to have two different
routines for two different stages of planning.To provide some food for thought, here is a draft in attachment which
addresses both issues: RestrictInfo relid replacement and move
SJE-specific code out of the ChangeVarNodes routine (callback approach).Thank you, Andrei. I've put it all together.
0001 Fixes material bugs in ChangeVarNodes_walker() including regression test
0002 Puts back comments which got accidentally removed
0003 Refactors ChangeVarNodesExtended() with custom user-defined callbackI'm going to further work on improvement of these patches.
Sorry, I accidentally sent some messages off-list. So, now 0001 and
0002 are pushed.
I've revised the remaining refactoring patch. I've made a callback an
additional callback, but not the replacement to the walker. It looks
better for me now. Also, I've written some comments and the commit
message. Any thoughts?
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v4-0001-Refactor-ChangeVarNodesExtended-using-the-custom-.patchapplication/octet-stream; name=v4-0001-Refactor-ChangeVarNodesExtended-using-the-custom-.patchDownload
From 506ce279c7aad015f7be4d7791e47c7466212c88 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Sun, 27 Apr 2025 14:00:15 +0300
Subject: [PATCH v4] Refactor ChangeVarNodesExtended() using the custom
callback
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
to ChangeVarNodes_walker(). This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker(). Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within the analyzejoins.c.
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
---
src/backend/optimizer/plan/analyzejoins.c | 160 ++++++++++++++++++++--
src/backend/rewrite/rewriteManip.c | 133 ++++--------------
src/include/rewrite/rewriteManip.h | 17 ++-
3 files changed, 186 insertions(+), 124 deletions(-)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index be19167e4a2..467202b9e6c 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -74,6 +74,8 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
List *restrictlist,
List **extra_clauses);
static int self_join_candidates_cmp(const void *a, const void *b);
+static bool ChangeVarNodes_callback(Node *node,
+ ChangeVarNodes_context *context);
/*
@@ -397,7 +399,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
{
Assert(subst > 0);
- ChangeVarNodes((Node *) sjinf->semi_rhs_exprs, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) sjinf->semi_rhs_exprs, relid, subst,
+ 0, ChangeVarNodes_callback);
}
}
@@ -469,7 +472,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
sjinfo->ojrelid, subst);
Assert(!bms_is_empty(phv->phrels));
- ChangeVarNodes((Node *) phv->phexpr, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
+ ChangeVarNodes_callback);
Assert(phv->phnullingrels == NULL); /* no need to adjust */
}
@@ -523,7 +527,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
}
if (subst > 0)
- ChangeVarNodes((Node *) otherrel->lateral_vars, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) otherrel->lateral_vars, relid,
+ subst, 0, ChangeVarNodes_callback);
}
}
@@ -757,7 +762,8 @@ remove_rel_from_eclass(EquivalenceClass *ec, SpecialJoinInfo *sjinfo,
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
if (sjinfo == NULL)
- ChangeVarNodes((Node *) rinfo, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) rinfo, relid, subst, 0,
+ ChangeVarNodes_callback);
else
remove_rel_from_restrictinfo(rinfo, relid, sjinfo->ojrelid);
}
@@ -1548,7 +1554,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
em->em_jdomain->jd_relids = adjust_relid_set(em->em_jdomain->jd_relids, from, to);
/* We only process inner joins */
- ChangeVarNodes((Node *) em->em_expr, from, to, 0);
+ ChangeVarNodesExtended((Node *) em->em_expr, from, to, 0,
+ ChangeVarNodes_callback);
foreach_node(EquivalenceMember, other, new_members)
{
@@ -1582,7 +1589,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
continue;
}
- ChangeVarNodes((Node *) rinfo, from, to, 0);
+ ChangeVarNodesExtended((Node *) rinfo, from, to, 0,
+ ChangeVarNodes_callback);
/*
* After switching the clause to the remaining relation, check it for
@@ -1677,6 +1685,118 @@ add_non_redundant_clauses(PlannerInfo *root,
}
}
+/*
+ * A custom callback for ChangeVarNodesExtended() providing
+ * Self-join elimination (SJE) related functionality
+ *
+ * SJE needs to skip the RangeTblRef node
+ * type. During SJE's last step, remove_rel_from_joinlist() removes
+ * remaining RangeTblRefs with target relid. If ChangeVarNodes() replaces
+ * the target relid before, remove_rel_from_joinlist() fails to identify
+ * the nodes to delete.
+ *
+ * SJE also needs to change the relids within RestrictInfo's.
+ */
+static bool
+ChangeVarNodes_callback(Node *node, ChangeVarNodes_context *context)
+{
+ if (IsA(node, RangeTblRef))
+ {
+ return true;
+ }
+ else if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+ int relid = -1;
+ bool is_req_equal =
+ (rinfo->required_relids == rinfo->clause_relids);
+ bool clause_relids_is_multiple =
+ (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
+
+ /*
+ * Recurse down into clauses if the target relation is present in
+ * clause_relids or required_relids. We must check required_relids
+ * because the relation not present in clause_relids might still be
+ * present somewhere in orclause.
+ */
+ if (bms_is_member(context->rt_index, rinfo->clause_relids) ||
+ bms_is_member(context->rt_index, rinfo->required_relids))
+ {
+ Relids new_clause_relids;
+
+ ChangeVarNodesWalkExpression((Node *) rinfo->clause, context);
+ ChangeVarNodesWalkExpression((Node *) rinfo->orclause, context);
+
+ new_clause_relids = adjust_relid_set(rinfo->clause_relids,
+ context->rt_index,
+ context->new_index);
+
+ /*
+ * Incrementally adjust num_base_rels based on the change of
+ * clause_relids, which could contain both base relids and
+ * outer-join relids. This operation is legal until we remove
+ * only baserels.
+ */
+ rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
+ bms_num_members(new_clause_relids);
+
+ rinfo->clause_relids = new_clause_relids;
+ rinfo->left_relids =
+ adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
+ rinfo->right_relids =
+ adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
+ }
+
+ if (is_req_equal)
+ rinfo->required_relids = rinfo->clause_relids;
+ else
+ rinfo->required_relids =
+ adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
+
+ rinfo->outer_relids =
+ adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
+ rinfo->incompatible_relids =
+ adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
+
+ if (rinfo->mergeopfamilies &&
+ bms_get_singleton_member(rinfo->clause_relids, &relid) &&
+ clause_relids_is_multiple &&
+ relid == context->new_index && IsA(rinfo->clause, OpExpr))
+ {
+ Expr *leftOp;
+ Expr *rightOp;
+
+ leftOp = (Expr *) get_leftop(rinfo->clause);
+ rightOp = (Expr *) get_rightop(rinfo->clause);
+
+ /*
+ * For self-join elimination, changing varnos could transform
+ * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
+ * as "t1.a" is not null. We use qual() to check for such a case,
+ * and then we replace the qual for a check for not null
+ * (NullTest).
+ */
+ if (leftOp != NULL && equal(leftOp, rightOp))
+ {
+ NullTest *ntest = makeNode(NullTest);
+
+ ntest->arg = leftOp;
+ ntest->nulltesttype = IS_NOT_NULL;
+ ntest->argisrow = false;
+ ntest->location = -1;
+ rinfo->clause = (Expr *) ntest;
+ rinfo->mergeopfamilies = NIL;
+ rinfo->left_em = NULL;
+ rinfo->right_em = NULL;
+ }
+ Assert(rinfo->orclause == NULL);
+ }
+ return true;
+ }
+
+ return false;
+}
+
/*
* Remove a relation after we have proven that it participates only in an
* unneeded unique self-join.
@@ -1725,7 +1845,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
foreach_node(RestrictInfo, rinfo, joininfos)
{
remove_join_clause_from_rels(root, rinfo, rinfo->required_relids);
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1743,7 +1864,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
restrictlist);
foreach_node(RestrictInfo, rinfo, toRemove->baserestrictinfo)
{
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1785,7 +1907,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
{
Node *node = lfirst(lc);
- ChangeVarNodes(node, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended(node, toRemove->relid, toKeep->relid, 0,
+ ChangeVarNodes_callback);
if (!list_member(toKeep->reltarget->exprs, node))
toKeep->reltarget->exprs = lappend(toKeep->reltarget->exprs, node);
}
@@ -1829,14 +1952,18 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
* Replace varno in all the query structures, except nodes RangeTblRef
* otherwise later remove_rel_from_joinlist will yield errors.
*/
- ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid, 0, false);
+ ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid,
+ 0, ChangeVarNodes_callback);
/* Replace links in the planner info */
remove_rel_from_query(root, toRemove, toKeep->relid, NULL, NULL);
/* At last, replace varno in root targetlist and HAVING clause */
- ChangeVarNodes((Node *) root->processed_tlist, toRemove->relid, toKeep->relid, 0);
- ChangeVarNodes((Node *) root->processed_groupClause, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
+ toKeep->relid, 0, ChangeVarNodes_callback);
+ ChangeVarNodesExtended((Node *) root->processed_groupClause,
+ toRemove->relid, toKeep->relid, 0,
+ ChangeVarNodes_callback);
adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
@@ -1919,8 +2046,10 @@ split_selfjoin_quals(PlannerInfo *root, List *joinquals, List **selfjoinquals,
* when we have cast of the same var to different (but compatible)
* types.
*/
- ChangeVarNodes(rightexpr, bms_singleton_member(rinfo->right_relids),
- bms_singleton_member(rinfo->left_relids), 0);
+ ChangeVarNodesExtended(rightexpr,
+ bms_singleton_member(rinfo->right_relids),
+ bms_singleton_member(rinfo->left_relids), 0,
+ ChangeVarNodes_callback);
if (equal(leftexpr, rightexpr))
sjoinquals = lappend(sjoinquals, rinfo);
@@ -1956,7 +2085,8 @@ match_unique_clauses(PlannerInfo *root, RelOptInfo *outer, List *uclauses,
bms_is_empty(rinfo->right_relids));
clause = (Expr *) copyObject(rinfo->clause);
- ChangeVarNodes((Node *) clause, relid, outer->relid, 0);
+ ChangeVarNodesExtended((Node *) clause, relid, outer->relid, 0,
+ ChangeVarNodes_callback);
iclause = bms_is_empty(rinfo->left_relids) ? get_rightop(clause) :
get_leftop(clause);
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 3f8b8b6eed9..e031b3edf58 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -550,19 +550,15 @@ offset_relid_set(Relids relids, int offset)
* earlier to ensure that no unwanted side-effects occur!
*/
-typedef struct
-{
- int rt_index;
- int new_index;
- int sublevels_up;
- bool change_RangeTblRef;
-} ChangeVarNodes_context;
-
static bool
ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
{
if (node == NULL)
return false;
+
+ if (context->callback && context->callback(node, context))
+ return false;
+
if (IsA(node, Var))
{
Var *var = (Var *) node;
@@ -588,7 +584,7 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
cexpr->cvarno = context->new_index;
return false;
}
- if (IsA(node, RangeTblRef) && context->change_RangeTblRef)
+ if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) node;
@@ -635,95 +631,6 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
return false;
}
- if (IsA(node, RestrictInfo))
- {
- RestrictInfo *rinfo = (RestrictInfo *) node;
- int relid = -1;
- bool is_req_equal =
- (rinfo->required_relids == rinfo->clause_relids);
- bool clause_relids_is_multiple =
- (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
-
- /*
- * Recurse down into clauses if the target relation is present in
- * clause_relids or required_relids. We must check required_relids
- * because the relation not present in clause_relids might still be
- * present somewhere in orclause.
- */
- if (bms_is_member(context->rt_index, rinfo->clause_relids) ||
- bms_is_member(context->rt_index, rinfo->required_relids))
- {
- Relids new_clause_relids;
-
- expression_tree_walker((Node *) rinfo->clause, ChangeVarNodes_walker, (void *) context);
- expression_tree_walker((Node *) rinfo->orclause, ChangeVarNodes_walker, (void *) context);
-
- new_clause_relids = adjust_relid_set(rinfo->clause_relids,
- context->rt_index,
- context->new_index);
-
- /*
- * Incrementally adjust num_base_rels based on the change of
- * clause_relids, which could contain both base relids and
- * outer-join relids. This operation is legal until we remove
- * only baserels.
- */
- rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
- bms_num_members(new_clause_relids);
-
- rinfo->clause_relids = new_clause_relids;
- rinfo->left_relids =
- adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
- rinfo->right_relids =
- adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
- }
-
- if (is_req_equal)
- rinfo->required_relids = rinfo->clause_relids;
- else
- rinfo->required_relids =
- adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
-
- rinfo->outer_relids =
- adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
- rinfo->incompatible_relids =
- adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
-
- if (rinfo->mergeopfamilies &&
- bms_get_singleton_member(rinfo->clause_relids, &relid) &&
- clause_relids_is_multiple &&
- relid == context->new_index && IsA(rinfo->clause, OpExpr))
- {
- Expr *leftOp;
- Expr *rightOp;
-
- leftOp = (Expr *) get_leftop(rinfo->clause);
- rightOp = (Expr *) get_rightop(rinfo->clause);
-
- /*
- * For self-join elimination, changing varnos could transform
- * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
- * as "t1.a" is not null. We use qual() to check for such a case,
- * and then we replace the qual for a check for not null
- * (NullTest).
- */
- if (leftOp != NULL && equal(leftOp, rightOp))
- {
- NullTest *ntest = makeNode(NullTest);
-
- ntest->arg = leftOp;
- ntest->nulltesttype = IS_NOT_NULL;
- ntest->argisrow = false;
- ntest->location = -1;
- rinfo->clause = (Expr *) ntest;
- rinfo->mergeopfamilies = NIL;
- rinfo->left_em = NULL;
- rinfo->right_em = NULL;
- }
- Assert(rinfo->orclause == NULL);
- }
- return false;
- }
if (IsA(node, AppendRelInfo))
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -757,26 +664,25 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
/*
- * ChangeVarNodesExtended - similar to ChangeVarNodes, but has additional
- * 'change_RangeTblRef' param
+ * ChangeVarNodesExtended - similar to ChangeVarNodes, but with an additional
+ * 'callback' param
*
* ChangeVarNodes changes a given node and all of its underlying nodes.
- * However, self-join elimination (SJE) needs to skip the RangeTblRef node
- * type. During SJE's last step, remove_rel_from_joinlist() removes
- * remaining RangeTblRefs with target relid. If ChangeVarNodes() replaces
- * the target relid before, remove_rel_from_joinlist() fails to identify
- * the nodes to delete.
+ * This version of function additionally takes a callback, which has a
+ * chance to process a node before ChangeVarNodes_walker. A callback
+ * returns a boolean value indicating if given node should be skipped from
+ * further processing by ChangeVarNodes_walker.
*/
void
ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef)
+ int sublevels_up, ChangeVarNodes_callback_type callback)
{
ChangeVarNodes_context context;
context.rt_index = rt_index;
context.new_index = new_index;
context.sublevels_up = sublevels_up;
- context.change_RangeTblRef = change_RangeTblRef;
+ context.callback = callback;
/*
* Must be prepared to start with a Query or a bare expression tree; if
@@ -826,7 +732,18 @@ ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
void
ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
{
- ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, true);
+ ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, NULL);
+}
+
+/*
+ * ChangeVarNodesWalkExpression - process expression within the custom
+ * callback provided to the
+ * ChangeVarNodesExtended.
+ */
+void
+ChangeVarNodesWalkExpression(Node *node, ChangeVarNodes_context *context)
+{
+ expression_tree_walker(node, ChangeVarNodes_walker, (void *) context);
}
/*
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index ea3908739c6..66197ce624d 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -41,6 +41,18 @@ typedef enum ReplaceVarsNoMatchOption
REPLACEVARS_SUBSTITUTE_NULL, /* replace with a NULL Const */
} ReplaceVarsNoMatchOption;
+typedef struct ChangeVarNodes_context ChangeVarNodes_context;
+
+typedef bool (*ChangeVarNodes_callback_type) (Node *node,
+ ChangeVarNodes_context *arg);
+
+struct ChangeVarNodes_context
+{
+ int rt_index;
+ int new_index;
+ int sublevels_up;
+ ChangeVarNodes_callback_type callback;
+};
extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
extern void CombineRangeTables(List **dst_rtable, List **dst_perminfos,
@@ -49,7 +61,10 @@ extern void OffsetVarNodes(Node *node, int offset, int sublevels_up);
extern void ChangeVarNodes(Node *node, int rt_index, int new_index,
int sublevels_up);
extern void ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef);
+ int sublevels_up,
+ ChangeVarNodes_callback_type callback);
+extern void ChangeVarNodesWalkExpression(Node *node,
+ ChangeVarNodes_context *context);
extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
int min_sublevels_up);
extern void IncrementVarSublevelsUp_rtable(List *rtable,
--
2.39.5 (Apple Git-154)
On 4/30/25 13:22, Alexander Korotkov wrote:
Thank you, Andrei. I've put it all together.
0001 Fixes material bugs in ChangeVarNodes_walker() including regression test
0002 Puts back comments which got accidentally removed
0003 Refactors ChangeVarNodesExtended() with custom user-defined callbackI've revised the remaining refactoring patch. I've made a callback an
additional callback, but not the replacement to the walker. It looks
better for me now. Also, I've written some comments and the commit
message. Any thoughts?
It seems quite elegant to me.
I have not precisely examined the part with the RestrictInfo replacement
logic - I guess you just copied it - I reviewed the rest of the patch.
Generally, it looks good, but let me be a little picky.
1. Looking into the callback-related code, I suggest changing the name
of ChangeVarNodes_callback to something less general and more specific -
like replace_relid_callback. My variant doesn't seem the best, but
general-purposed name seems worse.
2. This callback doesn't modify query replacement logic's behaviour- it
only affects expressions. It makes sense to write about that in the
description of the ChangeVarNodesExtended.
3. Should the ChangeVarNodesWalkExpression function return the walker's
returning value?
--
regards, Andrei Lepikhov
Hi, Andrei!
Thank you for your review!
On Wed, Apr 30, 2025 at 4:34 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/30/25 13:22, Alexander Korotkov wrote:
Thank you, Andrei. I've put it all together.
0001 Fixes material bugs in ChangeVarNodes_walker() includingregression test
0002 Puts back comments which got accidentally removed
0003 Refactors ChangeVarNodesExtended() with custom user-definedcallback
I've revised the remaining refactoring patch. I've made a callback an
additional callback, but not the replacement to the walker. It looks
better for me now. Also, I've written some comments and the commit
message. Any thoughts?It seems quite elegant to me.
I have not precisely examined the part with the RestrictInfo replacement
logic - I guess you just copied it - I reviewed the rest of the patch.Generally, it looks good, but let me be a little picky.
1. Looking into the callback-related code, I suggest changing the name
of ChangeVarNodes_callback to something less general and more specific -
like replace_relid_callback. My variant doesn't seem the best, but
general-purposed name seems worse.
I didn't have any better ideas and picked the name you proposed. Then I
renamed ChangeVarNodes_callback_type to just ChangeVarNodes_callback. Now
it seems consitent with other type names like ChangeVarNodes_context (which
is not ChangeVarNodes_context_type).
2. This callback doesn't modify query replacement logic's behaviour- it
only affects expressions. It makes sense to write about that in the
description of the ChangeVarNodesExtended.
I've added a couple sentences to ChangeVarNodesExtended().
3. Should the ChangeVarNodesWalkExpression function return the walker's
returning value?
Done.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v5-0001-Refactor-ChangeVarNodesExtended-using-the-custom-.patchapplication/octet-stream; name=v5-0001-Refactor-ChangeVarNodesExtended-using-the-custom-.patchDownload
From 0ab371d825c7d3d7efa215083229e2ddc9ec59a3 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Sun, 27 Apr 2025 14:00:15 +0300
Subject: [PATCH v5] Refactor ChangeVarNodesExtended() using the custom
callback
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
to ChangeVarNodes_walker(). This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker(). Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within the analyzejoins.c.
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
---
src/backend/optimizer/plan/analyzejoins.c | 160 ++++++++++++++++++++--
src/backend/rewrite/rewriteManip.c | 138 ++++---------------
src/include/rewrite/rewriteManip.h | 17 ++-
3 files changed, 191 insertions(+), 124 deletions(-)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index be19167e4a2..f823dde0fac 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -74,6 +74,8 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
List *restrictlist,
List **extra_clauses);
static int self_join_candidates_cmp(const void *a, const void *b);
+static bool replace_relid_callback(Node *node,
+ ChangeVarNodes_context *context);
/*
@@ -397,7 +399,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
{
Assert(subst > 0);
- ChangeVarNodes((Node *) sjinf->semi_rhs_exprs, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) sjinf->semi_rhs_exprs, relid, subst,
+ 0, replace_relid_callback);
}
}
@@ -469,7 +472,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
sjinfo->ojrelid, subst);
Assert(!bms_is_empty(phv->phrels));
- ChangeVarNodes((Node *) phv->phexpr, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
+ replace_relid_callback);
Assert(phv->phnullingrels == NULL); /* no need to adjust */
}
@@ -523,7 +527,8 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
}
if (subst > 0)
- ChangeVarNodes((Node *) otherrel->lateral_vars, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) otherrel->lateral_vars, relid,
+ subst, 0, replace_relid_callback);
}
}
@@ -757,7 +762,8 @@ remove_rel_from_eclass(EquivalenceClass *ec, SpecialJoinInfo *sjinfo,
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
if (sjinfo == NULL)
- ChangeVarNodes((Node *) rinfo, relid, subst, 0);
+ ChangeVarNodesExtended((Node *) rinfo, relid, subst, 0,
+ replace_relid_callback);
else
remove_rel_from_restrictinfo(rinfo, relid, sjinfo->ojrelid);
}
@@ -1548,7 +1554,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
em->em_jdomain->jd_relids = adjust_relid_set(em->em_jdomain->jd_relids, from, to);
/* We only process inner joins */
- ChangeVarNodes((Node *) em->em_expr, from, to, 0);
+ ChangeVarNodesExtended((Node *) em->em_expr, from, to, 0,
+ replace_relid_callback);
foreach_node(EquivalenceMember, other, new_members)
{
@@ -1582,7 +1589,8 @@ update_eclasses(EquivalenceClass *ec, int from, int to)
continue;
}
- ChangeVarNodes((Node *) rinfo, from, to, 0);
+ ChangeVarNodesExtended((Node *) rinfo, from, to, 0,
+ replace_relid_callback);
/*
* After switching the clause to the remaining relation, check it for
@@ -1677,6 +1685,118 @@ add_non_redundant_clauses(PlannerInfo *root,
}
}
+/*
+ * A custom callback for ChangeVarNodesExtended() providing
+ * Self-join elimination (SJE) related functionality
+ *
+ * SJE needs to skip the RangeTblRef node
+ * type. During SJE's last step, remove_rel_from_joinlist() removes
+ * remaining RangeTblRefs with target relid. If ChangeVarNodes() replaces
+ * the target relid before, remove_rel_from_joinlist() fails to identify
+ * the nodes to delete.
+ *
+ * SJE also needs to change the relids within RestrictInfo's.
+ */
+static bool
+replace_relid_callback(Node *node, ChangeVarNodes_context *context)
+{
+ if (IsA(node, RangeTblRef))
+ {
+ return true;
+ }
+ else if (IsA(node, RestrictInfo))
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) node;
+ int relid = -1;
+ bool is_req_equal =
+ (rinfo->required_relids == rinfo->clause_relids);
+ bool clause_relids_is_multiple =
+ (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
+
+ /*
+ * Recurse down into clauses if the target relation is present in
+ * clause_relids or required_relids. We must check required_relids
+ * because the relation not present in clause_relids might still be
+ * present somewhere in orclause.
+ */
+ if (bms_is_member(context->rt_index, rinfo->clause_relids) ||
+ bms_is_member(context->rt_index, rinfo->required_relids))
+ {
+ Relids new_clause_relids;
+
+ ChangeVarNodesWalkExpression((Node *) rinfo->clause, context);
+ ChangeVarNodesWalkExpression((Node *) rinfo->orclause, context);
+
+ new_clause_relids = adjust_relid_set(rinfo->clause_relids,
+ context->rt_index,
+ context->new_index);
+
+ /*
+ * Incrementally adjust num_base_rels based on the change of
+ * clause_relids, which could contain both base relids and
+ * outer-join relids. This operation is legal until we remove
+ * only baserels.
+ */
+ rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
+ bms_num_members(new_clause_relids);
+
+ rinfo->clause_relids = new_clause_relids;
+ rinfo->left_relids =
+ adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
+ rinfo->right_relids =
+ adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
+ }
+
+ if (is_req_equal)
+ rinfo->required_relids = rinfo->clause_relids;
+ else
+ rinfo->required_relids =
+ adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
+
+ rinfo->outer_relids =
+ adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
+ rinfo->incompatible_relids =
+ adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
+
+ if (rinfo->mergeopfamilies &&
+ bms_get_singleton_member(rinfo->clause_relids, &relid) &&
+ clause_relids_is_multiple &&
+ relid == context->new_index && IsA(rinfo->clause, OpExpr))
+ {
+ Expr *leftOp;
+ Expr *rightOp;
+
+ leftOp = (Expr *) get_leftop(rinfo->clause);
+ rightOp = (Expr *) get_rightop(rinfo->clause);
+
+ /*
+ * For self-join elimination, changing varnos could transform
+ * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
+ * as "t1.a" is not null. We use qual() to check for such a case,
+ * and then we replace the qual for a check for not null
+ * (NullTest).
+ */
+ if (leftOp != NULL && equal(leftOp, rightOp))
+ {
+ NullTest *ntest = makeNode(NullTest);
+
+ ntest->arg = leftOp;
+ ntest->nulltesttype = IS_NOT_NULL;
+ ntest->argisrow = false;
+ ntest->location = -1;
+ rinfo->clause = (Expr *) ntest;
+ rinfo->mergeopfamilies = NIL;
+ rinfo->left_em = NULL;
+ rinfo->right_em = NULL;
+ }
+ Assert(rinfo->orclause == NULL);
+ }
+ return true;
+ }
+
+ return false;
+}
+
/*
* Remove a relation after we have proven that it participates only in an
* unneeded unique self-join.
@@ -1725,7 +1845,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
foreach_node(RestrictInfo, rinfo, joininfos)
{
remove_join_clause_from_rels(root, rinfo, rinfo->required_relids);
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, replace_relid_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1743,7 +1864,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
restrictlist);
foreach_node(RestrictInfo, rinfo, toRemove->baserestrictinfo)
{
- ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) rinfo, toRemove->relid, toKeep->relid,
+ 0, replace_relid_callback);
if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
jinfo_candidates = lappend(jinfo_candidates, rinfo);
@@ -1785,7 +1907,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
{
Node *node = lfirst(lc);
- ChangeVarNodes(node, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended(node, toRemove->relid, toKeep->relid, 0,
+ replace_relid_callback);
if (!list_member(toKeep->reltarget->exprs, node))
toKeep->reltarget->exprs = lappend(toKeep->reltarget->exprs, node);
}
@@ -1829,14 +1952,18 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
* Replace varno in all the query structures, except nodes RangeTblRef
* otherwise later remove_rel_from_joinlist will yield errors.
*/
- ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid, 0, false);
+ ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid,
+ 0, replace_relid_callback);
/* Replace links in the planner info */
remove_rel_from_query(root, toRemove, toKeep->relid, NULL, NULL);
/* At last, replace varno in root targetlist and HAVING clause */
- ChangeVarNodes((Node *) root->processed_tlist, toRemove->relid, toKeep->relid, 0);
- ChangeVarNodes((Node *) root->processed_groupClause, toRemove->relid, toKeep->relid, 0);
+ ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
+ toKeep->relid, 0, replace_relid_callback);
+ ChangeVarNodesExtended((Node *) root->processed_groupClause,
+ toRemove->relid, toKeep->relid, 0,
+ replace_relid_callback);
adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
@@ -1919,8 +2046,10 @@ split_selfjoin_quals(PlannerInfo *root, List *joinquals, List **selfjoinquals,
* when we have cast of the same var to different (but compatible)
* types.
*/
- ChangeVarNodes(rightexpr, bms_singleton_member(rinfo->right_relids),
- bms_singleton_member(rinfo->left_relids), 0);
+ ChangeVarNodesExtended(rightexpr,
+ bms_singleton_member(rinfo->right_relids),
+ bms_singleton_member(rinfo->left_relids), 0,
+ replace_relid_callback);
if (equal(leftexpr, rightexpr))
sjoinquals = lappend(sjoinquals, rinfo);
@@ -1956,7 +2085,8 @@ match_unique_clauses(PlannerInfo *root, RelOptInfo *outer, List *uclauses,
bms_is_empty(rinfo->right_relids));
clause = (Expr *) copyObject(rinfo->clause);
- ChangeVarNodes((Node *) clause, relid, outer->relid, 0);
+ ChangeVarNodesExtended((Node *) clause, relid, outer->relid, 0,
+ replace_relid_callback);
iclause = bms_is_empty(rinfo->left_relids) ? get_rightop(clause) :
get_leftop(clause);
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index 3f8b8b6eed9..cd786aa4112 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -550,19 +550,15 @@ offset_relid_set(Relids relids, int offset)
* earlier to ensure that no unwanted side-effects occur!
*/
-typedef struct
-{
- int rt_index;
- int new_index;
- int sublevels_up;
- bool change_RangeTblRef;
-} ChangeVarNodes_context;
-
static bool
ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
{
if (node == NULL)
return false;
+
+ if (context->callback && context->callback(node, context))
+ return false;
+
if (IsA(node, Var))
{
Var *var = (Var *) node;
@@ -588,7 +584,7 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
cexpr->cvarno = context->new_index;
return false;
}
- if (IsA(node, RangeTblRef) && context->change_RangeTblRef)
+ if (IsA(node, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) node;
@@ -635,95 +631,6 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
return false;
}
- if (IsA(node, RestrictInfo))
- {
- RestrictInfo *rinfo = (RestrictInfo *) node;
- int relid = -1;
- bool is_req_equal =
- (rinfo->required_relids == rinfo->clause_relids);
- bool clause_relids_is_multiple =
- (bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
-
- /*
- * Recurse down into clauses if the target relation is present in
- * clause_relids or required_relids. We must check required_relids
- * because the relation not present in clause_relids might still be
- * present somewhere in orclause.
- */
- if (bms_is_member(context->rt_index, rinfo->clause_relids) ||
- bms_is_member(context->rt_index, rinfo->required_relids))
- {
- Relids new_clause_relids;
-
- expression_tree_walker((Node *) rinfo->clause, ChangeVarNodes_walker, (void *) context);
- expression_tree_walker((Node *) rinfo->orclause, ChangeVarNodes_walker, (void *) context);
-
- new_clause_relids = adjust_relid_set(rinfo->clause_relids,
- context->rt_index,
- context->new_index);
-
- /*
- * Incrementally adjust num_base_rels based on the change of
- * clause_relids, which could contain both base relids and
- * outer-join relids. This operation is legal until we remove
- * only baserels.
- */
- rinfo->num_base_rels -= bms_num_members(rinfo->clause_relids) -
- bms_num_members(new_clause_relids);
-
- rinfo->clause_relids = new_clause_relids;
- rinfo->left_relids =
- adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
- rinfo->right_relids =
- adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
- }
-
- if (is_req_equal)
- rinfo->required_relids = rinfo->clause_relids;
- else
- rinfo->required_relids =
- adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
-
- rinfo->outer_relids =
- adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
- rinfo->incompatible_relids =
- adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
-
- if (rinfo->mergeopfamilies &&
- bms_get_singleton_member(rinfo->clause_relids, &relid) &&
- clause_relids_is_multiple &&
- relid == context->new_index && IsA(rinfo->clause, OpExpr))
- {
- Expr *leftOp;
- Expr *rightOp;
-
- leftOp = (Expr *) get_leftop(rinfo->clause);
- rightOp = (Expr *) get_rightop(rinfo->clause);
-
- /*
- * For self-join elimination, changing varnos could transform
- * "t1.a = t2.a" into "t1.a = t1.a". That is always true as long
- * as "t1.a" is not null. We use qual() to check for such a case,
- * and then we replace the qual for a check for not null
- * (NullTest).
- */
- if (leftOp != NULL && equal(leftOp, rightOp))
- {
- NullTest *ntest = makeNode(NullTest);
-
- ntest->arg = leftOp;
- ntest->nulltesttype = IS_NOT_NULL;
- ntest->argisrow = false;
- ntest->location = -1;
- rinfo->clause = (Expr *) ntest;
- rinfo->mergeopfamilies = NIL;
- rinfo->left_em = NULL;
- rinfo->right_em = NULL;
- }
- Assert(rinfo->orclause == NULL);
- }
- return false;
- }
if (IsA(node, AppendRelInfo))
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -757,26 +664,28 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
}
/*
- * ChangeVarNodesExtended - similar to ChangeVarNodes, but has additional
- * 'change_RangeTblRef' param
+ * ChangeVarNodesExtended - similar to ChangeVarNodes, but with an additional
+ * 'callback' param
*
* ChangeVarNodes changes a given node and all of its underlying nodes.
- * However, self-join elimination (SJE) needs to skip the RangeTblRef node
- * type. During SJE's last step, remove_rel_from_joinlist() removes
- * remaining RangeTblRefs with target relid. If ChangeVarNodes() replaces
- * the target relid before, remove_rel_from_joinlist() fails to identify
- * the nodes to delete.
+ * This version of function additionally takes a callback, which has a
+ * chance to process a node before ChangeVarNodes_walker. A callback
+ * returns a boolean value indicating if given node should be skipped from
+ * further processing by ChangeVarNodes_walker. The callback is called
+ * only for expressions and other children nodes of a Query processed by
+ * a walker. Initial processing of the root Query doesn't involve the
+ * callback.
*/
void
ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef)
+ int sublevels_up, ChangeVarNodes_callback callback)
{
ChangeVarNodes_context context;
context.rt_index = rt_index;
context.new_index = new_index;
context.sublevels_up = sublevels_up;
- context.change_RangeTblRef = change_RangeTblRef;
+ context.callback = callback;
/*
* Must be prepared to start with a Query or a bare expression tree; if
@@ -826,7 +735,20 @@ ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
void
ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
{
- ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, true);
+ ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, NULL);
+}
+
+/*
+ * ChangeVarNodesWalkExpression - process expression within the custom
+ * callback provided to the
+ * ChangeVarNodesExtended.
+ */
+bool
+ChangeVarNodesWalkExpression(Node *node, ChangeVarNodes_context *context)
+{
+ return expression_tree_walker(node,
+ ChangeVarNodes_walker,
+ (void *) context);
}
/*
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index ea3908739c6..3e941393da2 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -41,6 +41,18 @@ typedef enum ReplaceVarsNoMatchOption
REPLACEVARS_SUBSTITUTE_NULL, /* replace with a NULL Const */
} ReplaceVarsNoMatchOption;
+typedef struct ChangeVarNodes_context ChangeVarNodes_context;
+
+typedef bool (*ChangeVarNodes_callback) (Node *node,
+ ChangeVarNodes_context *arg);
+
+struct ChangeVarNodes_context
+{
+ int rt_index;
+ int new_index;
+ int sublevels_up;
+ ChangeVarNodes_callback callback;
+};
extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
extern void CombineRangeTables(List **dst_rtable, List **dst_perminfos,
@@ -49,7 +61,10 @@ extern void OffsetVarNodes(Node *node, int offset, int sublevels_up);
extern void ChangeVarNodes(Node *node, int rt_index, int new_index,
int sublevels_up);
extern void ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
- int sublevels_up, bool change_RangeTblRef);
+ int sublevels_up,
+ ChangeVarNodes_callback callback);
+extern bool ChangeVarNodesWalkExpression(Node *node,
+ ChangeVarNodes_context *context);
extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
int min_sublevels_up);
extern void IncrementVarSublevelsUp_rtable(List *rtable,
--
2.39.5 (Apple Git-154)
On 1/5/2025 14:11, Alexander Korotkov wrote:
3. Should the ChangeVarNodesWalkExpression function return the walker's
returning value?Done.
Thanks for your efforts! Looks good to me.
--
regards, Andrei Lepikhov
On Thu, May 1, 2025 at 8:22 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 1/5/2025 14:11, Alexander Korotkov wrote:
3. Should the ChangeVarNodesWalkExpression function return the walker's
returning value?Done.
Thanks for your efforts! Looks good to me.
Thank you. I'm going to push this if there is no objections.
------
Regards,
Alexander Korotkov
Supabase