Check volatile functions in ppi_clauses for memoize node
In get_memoize_path() we have a theory about avoiding memoize node if
there are volatile functions in the inner rel's target/restrict list.
/*
* We can't use a memoize node if there are volatile functions in the
* inner rel's target list or restrict list. A cache hit could reduce the
* number of calls to these functions.
*/
if (contain_volatile_functions((Node *) innerrel->reltarget))
return NULL;
foreach(lc, innerrel->baserestrictinfo)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
if (contain_volatile_functions((Node *) rinfo))
return NULL;
}
It seems that the check for restrict list is not thorough, because for a
parameterized scan we are supposed to also consider all the join clauses
available from the outer rels, ie, ppi_clauses. For instance,
create table t (a float);
insert into t values (1.0), (1.0), (1.0), (1.0);
analyze t;
explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)
According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.
Thanks
Richard
Attachments:
v1-0001-Check-volatile-functions-in-ppi_clauses-for-memoize-node.patchapplication/octet-stream; name=v1-0001-Check-volatile-functions-in-ppi_clauses-for-memoize-node.patchDownload
From e74e92de47e4522dc6b2fccfbdbceb1ba30fb80f Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Fri, 4 Aug 2023 18:07:24 +0800
Subject: [PATCH v1] Check volatile functions in ppi_clauses for memoize node
---
src/backend/optimizer/path/joinpath.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 4b58936fa4..676a1eec84 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -582,6 +582,7 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel,
List *hash_operators;
ListCell *lc;
bool binary_mode;
+ List *restrictlist;
/* Obviously not if it's disabled */
if (!enable_memoize)
@@ -655,11 +656,18 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel,
* We can't use a memoize node if there are volatile functions in the
* inner rel's target list or restrict list. A cache hit could reduce the
* number of calls to these functions.
+ *
+ * Note that we need to also consider all the join clauses available from
+ * the outer relation(s) as restriction clauses.
*/
if (contain_volatile_functions((Node *) innerrel->reltarget))
return NULL;
- foreach(lc, innerrel->baserestrictinfo)
+ restrictlist = innerrel->baserestrictinfo;
+ if (inner_path->param_info)
+ restrictlist = list_concat_copy(restrictlist,
+ inner_path->param_info->ppi_clauses);
+ foreach(lc, restrictlist)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
--
2.31.0
On Fri, 4 Aug 2023 at 22:26, Richard Guo <guofenglinux@gmail.com> wrote:
explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.
Thanks for the patch. I've pushed a variation of it.
I didn't really see the need to make a single list with all the
RestrictInfos. It seems fine just to write code and loop over the
ppi_clauses checking for volatility.
David
On Mon, Aug 7, 2023 at 6:19 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 4 Aug 2023 at 22:26, Richard Guo <guofenglinux@gmail.com> wrote:
explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2) s
on t1.a = s.t2a + random();
QUERY PLAN
-----------------------------------------------
Nested Loop Left Join
-> Seq Scan on t t1
-> Memoize
Cache Key: t1.a
Cache Mode: binary
-> Seq Scan on t t2
Filter: (t1.a = (a + random()))
(7 rows)According to the theory we should not use memoize node for this query
because of the volatile function in the inner side. So propose a patch
to fix that.Thanks for the patch. I've pushed a variation of it.
I didn't really see the need to make a single list with all the
RestrictInfos. It seems fine just to write code and loop over the
ppi_clauses checking for volatility.
That looks good. Thanks for pushing it!
Thanks
Richard