A problem about ParamPathInfo for an AppendPath

Started by Richard Guoabout 3 years ago4 messages
#1Richard Guo
guofenglinux@gmail.com

While trying the codes of MemoizePath I noticed that MemoizePath cannot
be generated if inner path is a union-all AppendPath, even if it is
parameterized. And I found the problem is that for a union-all
AppendPath the ParamPathInfo is not created in the usual way. Instead
it is created with get_appendrel_parampathinfo, which just creates a
struct with no ppi_clauses. As a result, MemoizePath cannot find a
proper cache key to use.

This problem does not exist for AppendPath generated for a partitioned
table though, because in this case we always create a normal param_info
with ppi_clauses, for use in run-time pruning.

As an illustration, we can use the table 'prt' in sql/memoize.sql with
the settings below

set enable_hashjoin to off;
set enable_mergejoin to off;
set enable_seqscan to off;
set enable_material to off;

explain (costs off) select * from prt_p1 t1 join prt t2 on t1.a = t2.a;
QUERY PLAN
------------------------------------------------------------------
Nested Loop
-> Index Only Scan using iprt_p1_a on prt_p1 t1
-> Memoize
Cache Key: t1.a
Cache Mode: logical
-> Append
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1
Index Cond: (a = t1.a)
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2
Index Cond: (a = t1.a)
(10 rows)

explain (costs off) select * from prt_p1 t1 join (select * from prt_p1
union all select * from prt_p2) t2 on t1.a = t2.a;
QUERY PLAN
-------------------------------------------------------
Nested Loop
-> Index Only Scan using iprt_p1_a on prt_p1 t1
-> Append
-> Index Only Scan using iprt_p1_a on prt_p1
Index Cond: (a = t1.a)
-> Index Only Scan using iprt_p2_a on prt_p2
Index Cond: (a = t1.a)
(7 rows)

As we can see, MemoizePath can be generated for partitioned AppendPath
but not for union-all AppendPath.

For the fix I think we can relax the check in create_append_path and
always use get_baserel_parampathinfo if the parent is a baserel.

--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1266,7 +1266,7 @@ create_append_path(PlannerInfo *root,
     * partition, and it's not necessary anyway in that case.  Must skip it
if
     * we don't have "root", too.)
     */
-   if (root && rel->reloptkind == RELOPT_BASEREL &&
IS_PARTITIONED_REL(rel))
+   if (root && rel->reloptkind == RELOPT_BASEREL)
        pathnode->path.param_info = get_baserel_parampathinfo(root,
                                                              rel,

required_outer);

Thanks
Richard

#2Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#1)
1 attachment(s)
Re: A problem about ParamPathInfo for an AppendPath

On Tue, Dec 6, 2022 at 5:00 PM Richard Guo <guofenglinux@gmail.com> wrote:

As we can see, MemoizePath can be generated for partitioned AppendPath
but not for union-all AppendPath.

For the fix I think we can relax the check in create_append_path and
always use get_baserel_parampathinfo if the parent is a baserel.

BTW, IIUC currently we don't generate any parameterized MergeAppend
paths, as explained in generate_orderedappend_paths. So the codes that
gather information from a MergeAppend path's param_info for run-time
partition pruning in create_merge_append_plan seem unnecessary.

Attached is a patch for this change and the changes described upthread.

Thanks
Richard

Attachments:

v1-0001-Fix-ParamPathInfo-for-AppendPath.patchapplication/octet-stream; name=v1-0001-Fix-ParamPathInfo-for-AppendPath.patchDownload
From 53d506973e108e0dfbb6b80485d417bb5a863fa5 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Mon, 12 Dec 2022 11:29:58 +0800
Subject: [PATCH v1] Fix ParamPathInfo for AppendPath

---
 src/backend/optimizer/plan/createplan.c | 12 ++----------
 src/backend/optimizer/util/pathnode.c   |  8 +++++++-
 src/test/regress/expected/memoize.out   | 24 ++++++++++++++++++++++++
 src/test/regress/sql/memoize.sql        |  8 ++++++++
 4 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 66139928e8..9124d09b91 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1531,16 +1531,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 
 		prunequal = extract_actual_clauses(rel->baserestrictinfo, false);
 
-		if (best_path->path.param_info)
-		{
-			List	   *prmquals = best_path->path.param_info->ppi_clauses;
-
-			prmquals = extract_actual_clauses(prmquals, false);
-			prmquals = (List *) replace_nestloop_params(root,
-														(Node *) prmquals);
-
-			prunequal = list_concat(prunequal, prmquals);
-		}
+		/* We don't currently generate any parameterized MergeAppend paths */
+		Assert(best_path->path.param_info == NULL);
 
 		if (prunequal != NIL)
 			node->part_prune_index = make_partition_pruneinfo(root, rel,
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 55deee555a..db23818e41 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1258,6 +1258,9 @@ create_append_path(PlannerInfo *root,
 	pathnode->path.pathtarget = rel->reltarget;
 
 	/*
+	 * There are several reasons why we might need parameterized quals for
+	 * baserel.
+	 *
 	 * When generating an Append path for a partitioned table, there may be
 	 * parameterized quals that are useful for run-time pruning.  Hence,
 	 * compute path.param_info the same way as for any other baserel, so that
@@ -1265,8 +1268,11 @@ create_append_path(PlannerInfo *root,
 	 * would not work right for a non-baserel, ie a scan on a non-leaf child
 	 * partition, and it's not necessary anyway in that case.  Must skip it if
 	 * we don't have "root", too.)
+	 *
+	 * Another reason is that we need parameterized quals to search for memoize
+	 * cache keys when generating a Memoize path atop an Append path.
 	 */
-	if (root && rel->reloptkind == RELOPT_BASEREL && IS_PARTITIONED_REL(rel))
+	if (root && rel->reloptkind == RELOPT_BASEREL)
 		pathnode->path.param_info = get_baserel_parampathinfo(root,
 															  rel,
 															  required_outer);
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index de43afa76e..2b82a27494 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -234,6 +234,30 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
                      Heap Fetches: N
 (21 rows)
 
+-- Ensure memoize works with parameterized union-all Append path
+SET enable_partitionwise_join TO off;
+SELECT explain_memoize('
+SELECT * FROM prt_p1 t1 INNER JOIN
+(SELECT * FROM prt_p1 UNION ALL SELECT * FROM prt_p2) t2
+ON t1.a = t2.a;', false);
+                                   explain_memoize                                   
+-------------------------------------------------------------------------------------
+ Nested Loop (actual rows=16 loops=N)
+   ->  Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
+         Heap Fetches: N
+   ->  Memoize (actual rows=4 loops=N)
+         Cache Key: t1.a
+         Cache Mode: logical
+         Hits: 3  Misses: 1  Evictions: Zero  Overflows: 0  Memory Usage: NkB
+         ->  Append (actual rows=4 loops=N)
+               ->  Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
+                     Index Cond: (a = t1.a)
+                     Heap Fetches: N
+               ->  Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
+                     Index Cond: (a = t1.a)
+                     Heap Fetches: N
+(14 rows)
+
 DROP TABLE prt;
 RESET enable_partitionwise_join;
 -- Exercise Memoize code that flushes the cache when a parameter changes which
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 17c5b4bfab..a78c38ba6b 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -119,6 +119,14 @@ ANALYZE prt;
 SELECT explain_memoize('
 SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
 
+-- Ensure memoize works with parameterized union-all Append path
+SET enable_partitionwise_join TO off;
+
+SELECT explain_memoize('
+SELECT * FROM prt_p1 t1 INNER JOIN
+(SELECT * FROM prt_p1 UNION ALL SELECT * FROM prt_p2) t2
+ON t1.a = t2.a;', false);
+
 DROP TABLE prt;
 
 RESET enable_partitionwise_join;
-- 
2.31.0

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#2)
Re: A problem about ParamPathInfo for an AppendPath

Richard Guo <guofenglinux@gmail.com> writes:

Attached is a patch for this change and the changes described upthread.

Pushed. I thought the comment needed to be completely rewritten not just
tweaked, and I felt it was probably reasonable to continue to exclude
dummy paths from getting the more expensive treatment.

regards, tom lane

#4Richard Guo
guofenglinux@gmail.com
In reply to: Tom Lane (#3)
Re: A problem about ParamPathInfo for an AppendPath

On Fri, Mar 17, 2023 at 6:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pushed. I thought the comment needed to be completely rewritten not just
tweaked, and I felt it was probably reasonable to continue to exclude
dummy paths from getting the more expensive treatment.

Yes agreed. Thanks for the changes and pushing.

Thanks
Richard