Check volatile functions in ppi_clauses for memoize node

Started by Richard Guoover 2 years ago3 messages
#1Richard Guo
guofenglinux@gmail.com
1 attachment(s)

In get_memoize_path() we have a theory about avoiding memoize node if
there are volatile functions in the inner rel's target/restrict list.

/*
* We can't use a memoize node if there are volatile functions in the
* inner rel's target list or restrict list. A cache hit could reduce the
* number of calls to these functions.
*/
if (contain_volatile_functions((Node *) innerrel->reltarget))
return NULL;

foreach(lc, innerrel->baserestrictinfo)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

if (contain_volatile_functions((Node *) rinfo))
return NULL;
}

It seems that the check for restrict list is not thorough, because for a
parameterized scan we are supposed to also consider all the join clauses
available from the outer rels, ie, ppi_clauses. For instance,

create table t (a float);
insert into t values (1.0), (1.0), (1.0), (1.0);
analyze t;

explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)

According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.

Thanks
Richard

Attachments:

v1-0001-Check-volatile-functions-in-ppi_clauses-for-memoize-node.patchapplication/octet-stream; name=v1-0001-Check-volatile-functions-in-ppi_clauses-for-memoize-node.patchDownload
From e74e92de47e4522dc6b2fccfbdbceb1ba30fb80f Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Fri, 4 Aug 2023 18:07:24 +0800
Subject: [PATCH v1] Check volatile functions in ppi_clauses for memoize node

---
 src/backend/optimizer/path/joinpath.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 4b58936fa4..676a1eec84 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -582,6 +582,7 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel,
 	List	   *hash_operators;
 	ListCell   *lc;
 	bool		binary_mode;
+	List	   *restrictlist;
 
 	/* Obviously not if it's disabled */
 	if (!enable_memoize)
@@ -655,11 +656,18 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel,
 	 * We can't use a memoize node if there are volatile functions in the
 	 * inner rel's target list or restrict list.  A cache hit could reduce the
 	 * number of calls to these functions.
+	 *
+	 * Note that we need to also consider all the join clauses available from
+	 * the outer relation(s) as restriction clauses.
 	 */
 	if (contain_volatile_functions((Node *) innerrel->reltarget))
 		return NULL;
 
-	foreach(lc, innerrel->baserestrictinfo)
+	restrictlist = innerrel->baserestrictinfo;
+	if (inner_path->param_info)
+		restrictlist = list_concat_copy(restrictlist,
+										inner_path->param_info->ppi_clauses);
+	foreach(lc, restrictlist)
 	{
 		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
 
-- 
2.31.0

#2David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#1)
Re: Check volatile functions in ppi_clauses for memoize node

On Fri, 4 Aug 2023 at 22:26, Richard Guo <guofenglinux@gmail.com> wrote:

explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)

According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.

Thanks for the patch. I've pushed a variation of it.

I didn't really see the need to make a single list with all the
RestrictInfos. It seems fine just to write code and loop over the
ppi_clauses checking for volatility.

David

#3Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#2)
Re: Check volatile functions in ppi_clauses for memoize node

On Mon, Aug 7, 2023 at 6:19 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 4 Aug 2023 at 22:26, Richard Guo <guofenglinux@gmail.com> wrote:

explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)

According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.

Thanks for the patch. I've pushed a variation of it.

I didn't really see the need to make a single list with all the
RestrictInfos. It seems fine just to write code and loop over the
ppi_clauses checking for volatility.

That looks good. Thanks for pushing it!

Thanks
Richard