About displaying NestLoopParam

Started by Richard Guoover 3 years ago4 messages
#1Richard Guo
guofenglinux@gmail.com

I have a question about displaying NestLoopParam. In the plan below,

# explain (costs off)
select * from a, lateral (select sum(i) as i from b where exists (select
sum(c.i) from c where c.j = a.j and c.i = b.i) ) ss where a.i = ss.i;
QUERY PLAN
--------------------------------------------------------------------
Nested Loop
-> Seq Scan on a
-> Subquery Scan on ss
Filter: (a.i = ss.i)
-> Aggregate
-> Seq Scan on b
Filter: (SubPlan 1)
SubPlan 1
-> Aggregate
-> Seq Scan on c
Filter: ((j = $0) AND (i = b.i))
(11 rows)

There are three Params. Param 0 (a.j) and param 2 (a.i) are from
nestParams of the NestLoop. Param 1 (b.i) is from parParam of the
SubPlan. As we can see, param 1 and param 2 are displayed as the
corresponding expressions, while param 0 is displayed as $0.

I'm not saying this is a bug, but just curious why param 0 cannot be
displayed as the referenced expression. And I find the reason is that in
function find_param_referent(), we have the 'in_same_plan_level' flag
controlling that if we have emerged from a subplan, i.e. not the same
plan level any more, we would not look further for the matching
NestLoopParam. Param 0 suits this situation.

And there is a comment there also saying,

/*
* NestLoops transmit params to their inner child only; also, once
* we've crawled up out of a subplan, this couldn't possibly be
* the right match.
*/

My question is why is that?

Thanks
Richard

#2Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#1)
1 attachment(s)
Re: About displaying NestLoopParam

On Fri, Sep 16, 2022 at 5:59 PM Richard Guo <guofenglinux@gmail.com> wrote:

I'm not saying this is a bug, but just curious why param 0 cannot be
displayed as the referenced expression. And I find the reason is that in
function find_param_referent(), we have the 'in_same_plan_level' flag
controlling that if we have emerged from a subplan, i.e. not the same
plan level any more, we would not look further for the matching
NestLoopParam. Param 0 suits this situation.

And there is a comment there also saying,

/*
* NestLoops transmit params to their inner child only; also, once
* we've crawled up out of a subplan, this couldn't possibly be
* the right match.
*/

After thinking of this for more time, I still don't see the reason why
we cannot display NestLoopParam after we've emerged from a subplan.

It seems these params are from parameterized subqueryscan and their
values are supplied by an upper nestloop. These params should have been
processed in process_subquery_nestloop_params() that we just add the
PlannerParamItem entries to root->curOuterParams, in the form of
NestLoopParam, using the same PARAM_EXEC slots.

So I propose the patch attached to remove the 'in_same_plan_level' flag
so that we can display NestLoopParam across subplan. Please correct me
if I'm wrong.

Thanks
Richard

Attachments:

v1-0001-Display-NestLoopParam-across-subplan.patchapplication/octet-stream; name=v1-0001-Display-NestLoopParam-across-subplan.patchDownload
From 30d3a692607215b5dd98e15eddfe7128cbcfc245 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 20 Sep 2022 15:41:07 +0800
Subject: [PATCH v1] Display NestLoopParam across subplan

---
 src/backend/utils/adt/ruleutils.c       | 34 ++-----------------------
 src/test/regress/expected/join.out      |  8 +++---
 src/test/regress/expected/subselect.out |  2 +-
 3 files changed, 7 insertions(+), 37 deletions(-)

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2b7b1b0c0f..ee950fc6f6 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -7900,12 +7900,10 @@ find_param_referent(Param *param, deparse_context *context,
 	{
 		deparse_namespace *dpns;
 		Plan	   *child_plan;
-		bool		in_same_plan_level;
 		ListCell   *lc;
 
 		dpns = (deparse_namespace *) linitial(context->namespaces);
 		child_plan = dpns->plan;
-		in_same_plan_level = true;
 
 		foreach(lc, dpns->ancestors)
 		{
@@ -7913,13 +7911,10 @@ find_param_referent(Param *param, deparse_context *context,
 			ListCell   *lc2;
 
 			/*
-			 * NestLoops transmit params to their inner child only; also, once
-			 * we've crawled up out of a subplan, this couldn't possibly be
-			 * the right match.
+			 * NestLoops transmit params to their inner child only.
 			 */
 			if (IsA(ancestor, NestLoop) &&
-				child_plan == innerPlan(ancestor) &&
-				in_same_plan_level)
+				child_plan == innerPlan(ancestor))
 			{
 				NestLoop   *nl = (NestLoop *) ancestor;
 
@@ -7977,35 +7972,10 @@ find_param_referent(Param *param, deparse_context *context,
 					}
 				}
 
-				/* We have emerged from a subplan. */
-				in_same_plan_level = false;
-
 				/* SubPlan isn't a kind of Plan, so skip the rest */
 				continue;
 			}
 
-			/*
-			 * Check to see if we're emerging from an initplan of the current
-			 * ancestor plan.  Initplans never have any parParams, so no need
-			 * to search that list, but we need to know if we should reset
-			 * in_same_plan_level.
-			 */
-			foreach(lc2, ((Plan *) ancestor)->initPlan)
-			{
-				SubPlan    *subplan = lfirst_node(SubPlan, lc2);
-
-				if (child_plan != (Plan *) list_nth(dpns->subplans,
-													subplan->plan_id - 1))
-					continue;
-
-				/* No parameters to be had here. */
-				Assert(subplan->parParam == NIL);
-
-				/* We have emerged from an initplan. */
-				in_same_plan_level = false;
-				break;
-			}
-
 			/* No luck, crawl up to next ancestor */
 			child_plan = (Plan *) ancestor;
 		}
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2ed2e542a4..48cb0b9f35 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -5985,8 +5985,8 @@ lateral (select * from int8_tbl t1,
                                      where q2 = (select greatest(t1.q1,t2.q2))
                                        and (select v.id=0)) offset 0) ss2) ss
          where t1.q1 = ss.q2) ss0;
-                           QUERY PLAN                            
------------------------------------------------------------------
+                              QUERY PLAN                              
+----------------------------------------------------------------------
  Nested Loop
    Output: "*VALUES*".column1, t1.q1, t1.q2, ss2.q1, ss2.q2
    ->  Seq Scan on public.int8_tbl t1
@@ -6007,10 +6007,10 @@ lateral (select * from int8_tbl t1,
                              One-Time Filter: $4
                              InitPlan 1 (returns $2)
                                ->  Result
-                                     Output: GREATEST($0, t2.q2)
+                                     Output: GREATEST(t1.q1, t2.q2)
                              InitPlan 2 (returns $4)
                                ->  Result
-                                     Output: ($3 = 0)
+                                     Output: ("*VALUES*".column1 = 0)
                              ->  Seq Scan on public.int8_tbl t3
                                    Output: t3.q1, t3.q2
                                    Filter: (t3.q2 = $2)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 63d26d44fc..22af8fafa1 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -1155,7 +1155,7 @@ where o.ten = 0;
                SubPlan 1
                  ->  Seq Scan on public.int4_tbl
                        Output: int4_tbl.f1
-                       Filter: (int4_tbl.f1 <= $0)
+                       Filter: (int4_tbl.f1 <= o.hundred)
 (14 rows)
 
 select sum(ss.tst::int) from
-- 
2.31.0

#3Greg Stark
stark@mit.edu
In reply to: Richard Guo (#2)
Re: About displaying NestLoopParam

So I guess I don't have much to add since I don't really understand
the Param infrastructure, certainly not any better than you seem to.

I do note that the code in question was added in this commit in 2010.
That predates the addition of LATERAL in 2013. I suppose those
comments may be talking about InitPlans for things like constant
subqueries that have been pulled up to InitPlans in queries like:

explain verbose select * from x join y on (x.i=y.j) where y.j+1=(select 5) ;

Which your patch doesn't eliminate the $0 in. I don't know if the code
you're removing is just for efficiency -- to avoid trawling through
nodes of the plan that can't be relevant -- or for correctness.

Fwiw your patch applied for me and built without warnings and seems to
work for all the queries I've thrown at it so far. That's hardly an
exhaustive test of course.

commit 1cc29fe7c60ba643c114979dbe588d3a38005449
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Jul 13 20:57:19 2010 +0000

Teach EXPLAIN to print PARAM_EXEC Params as the referenced expressions,
rather than just $N. This brings the display of nestloop-inner-indexscan
plans back to where it's been, and incidentally improves the display of
SubPlan parameters as well. In passing, simplify the EXPLAIN code by
having it deal primarily in the PlanState tree rather than separately
searching Plan and PlanState trees. This is noticeably cleaner for
subplans, and about a wash elsewhere.

One small difference from previous behavior is that EXPLAIN will no longer
qualify local variable references in inner-indexscan plan nodes, since it
no longer sees such nodes as possibly referencing multiple tables. Vars
referenced through PARAM_EXEC Params are still forcibly qualified, though,
so I don't think the display is any more confusing than before. Adjust a
couple of examples in the documentation to match this behavior.

On Tue, 20 Sept 2022 at 05:00, Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Sep 16, 2022 at 5:59 PM Richard Guo <guofenglinux@gmail.com> wrote:

I'm not saying this is a bug, but just curious why param 0 cannot be
displayed as the referenced expression. And I find the reason is that in
function find_param_referent(), we have the 'in_same_plan_level' flag
controlling that if we have emerged from a subplan, i.e. not the same
plan level any more, we would not look further for the matching
NestLoopParam. Param 0 suits this situation.

And there is a comment there also saying,

/*
* NestLoops transmit params to their inner child only; also, once
* we've crawled up out of a subplan, this couldn't possibly be
* the right match.
*/

After thinking of this for more time, I still don't see the reason why
we cannot display NestLoopParam after we've emerged from a subplan.

It seems these params are from parameterized subqueryscan and their
values are supplied by an upper nestloop. These params should have been
processed in process_subquery_nestloop_params() that we just add the
PlannerParamItem entries to root->curOuterParams, in the form of
NestLoopParam, using the same PARAM_EXEC slots.

So I propose the patch attached to remove the 'in_same_plan_level' flag
so that we can display NestLoopParam across subplan. Please correct me
if I'm wrong.

Thanks
Richard

--
greg

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#3)
Re: About displaying NestLoopParam

Greg Stark <stark@mit.edu> writes:

I do note that the code in question was added in this commit in 2010.
That predates the addition of LATERAL in 2013.

Yeah. It's pretty clear from the comments that I was concerned about
false matches of PARAM_EXEC numbers. I think that was a live issue
at the time but is so no longer, cf. 46c508fbc and 1db5667ba.
The possibility of LATERAL references is what makes it interesting
to search higher in the plan tree, so there wasn't any real reason to
take any risk of a false match.

I think I might've also been concerned about printing misleading
names for any Vars we did find, due to them belonging to a different
query level. That's probably a dead issue too now that ruleutils
assigns unique aliases to all RTEs in the query (I'm not sure if
it did at the time).

Looking at this now, it seems a little weird to me that we allow
LATERAL values to be passed down directly into the subplan rather
than having them go through the parParam mechanism. (If they did,
ruleutils' restriction would be fine.) I don't know of a reason
to change that, though.

I suppose those
comments may be talking about InitPlans for things like constant
subqueries that have been pulled up to InitPlans in queries like:
explain verbose select * from x join y on (x.i=y.j) where y.j+1=(select 5) ;
Which your patch doesn't eliminate the $0 in.

No, because that $0 is for a subplan/initplan output, which we don't
have any other sort of name for. Your example produces output that
explains what it is:

InitPlan 1 (returns $0)
...
Filter: ((y.j + 1) = $0)

although I'm not sure that we document that anywhere user-facing.

Fwiw your patch applied for me and built without warnings and seems to
work for all the queries I've thrown at it so far. That's hardly an
exhaustive test of course.

I'm content to apply this (although I quibble with removal of some
of the commentary). Worst case, somebody will find an example where
it produces wrong/misleading output, and we can revert it. But
the regression test changes show that it does produce useful output
in at least some cases.

regards, tom lane