"type with xxxx does not exist" when doing ExecMemoize()

Started by Tender Wangalmost 2 years ago19 messages
#1Tender Wang
tndrwang@gmail.com
2 attachment(s)

Hi,

I met Memoize node failed When I used sqlancer test postgres.
database0=# explain select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
QUERY PLAN
--------------------------------------------------------------------------------------
Nested Loop (cost=0.17..21.20 rows=4 width=32)
-> Seq Scan on t5 (cost=0.00..1.04 rows=4 width=14)
-> Memoize (cost=0.17..6.18 rows=1 width=32)
Cache Key: (t5.c0 - t5.c0)
Cache Mode: logical
-> Index Only Scan using t0_c0_key on t0 (cost=0.15..6.17 rows=1
width=32)
Index Cond: (c0 = (t5.c0 - t5.c0))
(7 rows)

database0=# select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
ERROR: type with OID 2139062143 does not exist

How to repeat:
The attached database0.log (created by sqlancer) included statements to
repeat this issue.
Firstly, create database test;
then;
psql postgres
\i /xxx/database0.log

I analyzed aboved issue this weekend. And I found that
After called ResetExprContext() in MemoizeHash_hash(), the data in
mstate->probeslot was corrputed.

in prepare_probe_slot: the data as below:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$1 = {vl_len_ = 36, rangetypid = 3904}
after called ResetExprContext() in MemoizeHash_hash:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$3 = {vl_len_ = 264, rangetypid = 2139062143}

I think in prepare_probe_slot(), should called datumCopy as the attached
patch does.

Any thoughts? Thanks.
--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

database0.logapplication/octet-stream; name=database0.logDownload
0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patchapplication/octet-stream; name=0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patchDownload
From fe773d211c7da0f97ae764815ad3d87225e0e366 Mon Sep 17 00:00:00 2001
From: "tender.wang" <tender.wang@openpie.com>
Date: Sun, 25 Feb 2024 20:18:04 +0800
Subject: [PATCH] Fix RangeType oid not found when doing Memoize.

After call ResetExprContext(), the data in mstate->proeslot may be freed too,
So we copy the result to probeslot from ExecEvalExpr.
---
 src/backend/executor/nodeMemoize.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 18870f10e1..5f1468ad0b 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -312,17 +312,21 @@ prepare_probe_slot(MemoizeState *mstate, MemoizeKey *key)
 	if (key == NULL)
 	{
 		ExprContext *econtext = mstate->ss.ps.ps_ExprContext;
-		MemoryContext oldcontext;
-
-		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+		Datum value;
+		bool isnull;
+		TupleDesc tup = pslot->tts_tupleDescriptor;
+		Form_pg_attribute att;
 
 		/* Set the probeslot's values based on the current parameter values */
 		for (int i = 0; i < numKeys; i++)
-			pslot->tts_values[i] = ExecEvalExpr(mstate->param_exprs[i],
-												econtext,
-												&pslot->tts_isnull[i]);
-
-		MemoryContextSwitchTo(oldcontext);
+		{
+			att = TupleDescAttr(tup, i);
+			value = ExecEvalExprSwitchContext(mstate->param_exprs[i],
+											  econtext,
+											  &isnull);
+			pslot->tts_values[i] = datumCopy(value, att->attbyval, att->attlen);
+			pslot->tts_isnull[i] = isnull;
+		}
 	}
 	else
 	{
-- 
2.25.1

#2Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#1)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 25/2/2024 20:32, Tender Wang wrote:

I think in prepare_probe_slot(), should called datumCopy as the attached
patch does.

Any thoughts? Thanks.

Thanks for the report.
I think it is better to invent a Runtime Memory Context; likewise, it is
already designed in IndexScan and derivatives. Here, you just allocate
the value in some upper memory context.
Also, I'm curious why such a trivial error hasn't been found for a long time

--
regards,
Andrei Lepikhov
Postgres Professional

#3Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Andrei Lepikhov (#2)
1 attachment(s)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 26/2/2024 09:52, Andrei Lepikhov wrote:

On 25/2/2024 20:32, Tender Wang wrote:

I think in prepare_probe_slot(), should called datumCopy as the
attached patch does.

Any thoughts? Thanks.

Thanks for the report.
I think it is better to invent a Runtime Memory Context; likewise, it is
already designed in IndexScan and derivatives. Here, you just allocate
the value in some upper memory context.
Also, I'm curious why such a trivial error hasn't been found for a long
time

Hmmm. I see the problem (test.sql in attachment for reproduction and
results). We only detect it by the number of Hits:
Cache Key: t1.x, (t1.t)::numeric
Cache Mode: logical
Hits: 0 Misses: 30 Evictions: 0 Overflows: 0 Memory Usage: 8kB

We see no hits in logical mode and 100 hits in binary mode. We see 15
hits for both logical and binary mode if parameters are integer numbers
- no problems with resetting expression context.

Your patch resolves the issue for logical mode - I see 15 hits for
integer and complex keys. But I still see 100 hits in binary mode. Maybe
we still have a problem?

What's more, why the Memoize node doesn't see the problem at all?

--
regards,
Andrei Lepikhov
Postgres Professional

Attachments:

test.sqlapplication/sql; name=test.sqlDownload
#4Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#1)
1 attachment(s)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 26/2/2024 12:44, Tender Wang wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru
<mailto:a.lepikhov@postgrespro.ru>> 于2024年2月26日周一 10:57写道:

On 25/2/2024 20:32, Tender Wang wrote:

I think in prepare_probe_slot(), should called datumCopy as the

attached

patch does.

Any thoughts? Thanks.

Thanks for the report.
I think it is better to invent a Runtime Memory Context; likewise,
it is
already designed in IndexScan and derivatives. Here, you just allocate
the value in some upper memory context.
Also, I'm curious why such a trivial error hasn't been found for a
long time

Make sense. I found MemoizeState already has a MemoryContext, so I used it.
I update the patch.

This approach is better for me. In the next version of this patch, I
included a test case. I am still unsure about the context chosen and the
stability of the test case. Richard, you recently fixed some Memoize
issues, could you look at this problem and patch?

--
regards,
Andrei Lepikhov
Postgres Professional

Attachments:

v3-0001-Store-Memoize-probeslot-values-in-the-hash-table-mem.patchtext/plain; charset=UTF-8; name=v3-0001-Store-Memoize-probeslot-values-in-the-hash-table-mem.patchDownload
From e32900e50730bccfde26355609d6b0b3e970f0a8 Mon Sep 17 00:00:00 2001
From: "tender.wang" <tender.wang@openpie.com>
Date: Mon, 26 Feb 2024 13:38:30 +0800
Subject: [PATCH] Store Memoize probeslot values in the hash table memory
 context.

Values of probeslot evaluates in expression memory context which can be reset
on hash comparison when we still need it for the equality comparison.
So we copy the result to probeslot from ExecEvalExpr.
---
 src/backend/executor/nodeMemoize.c    | 23 ++++++++++++++++-------
 src/test/regress/expected/memoize.out | 26 ++++++++++++++++++++++++++
 src/test/regress/sql/memoize.sql      | 14 ++++++++++++++
 3 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 18870f10e1..6402943772 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -312,17 +312,26 @@ prepare_probe_slot(MemoizeState *mstate, MemoizeKey *key)
 	if (key == NULL)
 	{
 		ExprContext *econtext = mstate->ss.ps.ps_ExprContext;
+		Datum value;
+		bool isnull;
+		TupleDesc tup = pslot->tts_tupleDescriptor;
+		Form_pg_attribute att;
 		MemoryContext oldcontext;
 
-		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
-
 		/* Set the probeslot's values based on the current parameter values */
 		for (int i = 0; i < numKeys; i++)
-			pslot->tts_values[i] = ExecEvalExpr(mstate->param_exprs[i],
-												econtext,
-												&pslot->tts_isnull[i]);
-
-		MemoryContextSwitchTo(oldcontext);
+		{
+			att = TupleDescAttr(tup, i);
+			value = ExecEvalExprSwitchContext(mstate->param_exprs[i],
+											  econtext,
+											  &isnull);
+			/* Copy the value to avoid freed after resetting ExprContext */
+			oldcontext = MemoryContextSwitchTo(mstate->tableContext);
+			pslot->tts_values[i] = datumCopy(value, att->attbyval, att->attlen);
+			MemoryContextSwitchTo(oldcontext);
+
+			pslot->tts_isnull[i] = isnull;
+		}
 	}
 	else
 	{
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index cf6886a288..9842b238c6 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -196,6 +196,32 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
 (10 rows)
 
 DROP TABLE flt;
+CREATE TABLE expr_key (x numeric, t text);
+INSERT INTO expr_key (x,t) SELECT d1::numeric, d1::text FROM (
+  SELECT round((d/pi())::numeric,7) AS d1 FROM generate_series(1,20) AS d);
+-- Having duplicates we must see hits in the Memoize node
+INSERT INTO expr_key SELECT * FROM expr_key;
+CREATE INDEX expr_key_idx_x_t ON expr_key (x,t);
+VACUUM ANALYZE expr_key;
+SELECT explain_memoize('
+SELECT * FROM expr_key t1 INNER JOIN expr_key t2
+ON t1.x=t2.t::numeric AND t1.t::numeric=t2.x;', true);
+                                      explain_memoize                                      
+-------------------------------------------------------------------------------------------
+ Nested Loop (actual rows=80 loops=N)
+   ->  Index Only Scan using expr_key_idx_x_t on expr_key t1 (actual rows=40 loops=N)
+         Heap Fetches: N
+   ->  Memoize (actual rows=2 loops=N)
+         Cache Key: t1.x, (t1.t)::numeric
+         Cache Mode: logical
+         Hits: N  Misses: N  Evictions: Zero  Overflows: 0  Memory Usage: NkB
+         ->  Index Only Scan using expr_key_idx_x_t on expr_key t2 (actual rows=2 loops=N)
+               Index Cond: (x = (t1.t)::numeric)
+               Filter: (t1.x = (t)::numeric)
+               Heap Fetches: N
+(11 rows)
+
+DROP TABLE expr_key;
 -- Exercise Memoize in binary mode with a large fixed width type and a
 -- varlena type.
 CREATE TABLE strtest (n name, t text);
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 1f4ab0ba3b..f2f7643571 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -103,6 +103,20 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
 
 DROP TABLE flt;
 
+CREATE TABLE expr_key (x numeric, t text);
+INSERT INTO expr_key (x,t) SELECT d1::numeric, d1::text FROM (
+  SELECT round((d/pi())::numeric,7) AS d1 FROM generate_series(1,20) AS d);
+-- Having duplicates we must see hits in the Memoize node
+INSERT INTO expr_key SELECT * FROM expr_key;
+CREATE INDEX expr_key_idx_x_t ON expr_key (x,t);
+VACUUM ANALYZE expr_key;
+
+SELECT explain_memoize('
+SELECT * FROM expr_key t1 INNER JOIN expr_key t2
+ON t1.x=t2.t::numeric AND t1.t::numeric=t2.x;', true);
+
+DROP TABLE expr_key;
+
 -- Exercise Memoize in binary mode with a large fixed width type and a
 -- varlena type.
 CREATE TABLE strtest (n name, t text);
-- 
2.43.2

#5Tender Wang
tndrwang@gmail.com
In reply to: Andrei Lepikhov (#2)
Re: "type with xxxx does not exist" when doing ExecMemoize()

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年2月26日周一 10:57写道:

On 25/2/2024 20:32, Tender Wang wrote:

I think in prepare_probe_slot(), should called datumCopy as the attached
patch does.

Any thoughts? Thanks.

Thanks for the report.
I think it is better to invent a Runtime Memory Context; likewise, it is
already designed in IndexScan and derivatives. Here, you just allocate
the value in some upper memory context.

Also, I'm curious why such a trivial error hasn't been found for a long
time

I analyze this issue again. I found that the forms of qual in
Memoize.sql(regress) are all like this:

table1.c0 OP table2.c0
If table2.c0 is the param value, the probeslot->tts_values[i] just store
the pointer. The memorycontext of this pointer is
ExecutorContext not ExprContext, Reset ExprContext doesn't change the data
of probeslot->tts_values[i].
So such a trivial error hasn't been found before.

--

regards,
Andrei Lepikhov
Postgres Professional

--
Tender Wang
OpenPie: https://en.openpie.com/

#6Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#5)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 26/2/2024 15:14, Tender Wang wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru
<mailto:a.lepikhov@postgrespro.ru>> 于2024年2月26日周一 10:57写道:

On 25/2/2024 20:32, Tender Wang wrote:

I think in prepare_probe_slot(), should called datumCopy as the

attached

patch does.

Any thoughts? Thanks.

Thanks for the report.
I think it is better to invent a Runtime Memory Context; likewise,
it is
already designed in IndexScan and derivatives. Here, you just allocate
the value in some upper memory context.

Also, I'm curious why such a trivial error hasn't been found for a
long time

  I analyze this issue again. I found that the forms of qual in
Memoize.sql(regress) are all like this:

  table1.c0 OP table2.c0
If table2.c0 is the param value, the probeslot->tts_values[i] just store
the pointer.  The memorycontext of this pointer is
ExecutorContext not ExprContext, Reset ExprContext doesn't change the
data of probeslot->tts_values[i].
So such a trivial error hasn't been found before.

I'm not happy with using table context for the probeslot values. As I
see, in the case of a new entry, the cache_lookup copies data from this
slot. If a match is detected, the allocated probeslot memory piece will
not be freed up to hash table reset. Taking this into account, should we
invent some new runtime context?

--
regards,
Andrei Lepikhov
Postgres Professional

#7Richard Guo
guofenglinux@gmail.com
In reply to: Andrei Lepikhov (#4)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru>
wrote:

On 26/2/2024 12:44, Tender Wang wrote:

Make sense. I found MemoizeState already has a MemoryContext, so I used

it.

I update the patch.

This approach is better for me. In the next version of this patch, I
included a test case. I am still unsure about the context chosen and the
stability of the test case. Richard, you recently fixed some Memoize
issues, could you look at this problem and patch?

I looked at this issue a bit. It seems to me what happens is that at
first the memory areas referenced by probeslot->tts_values[] are
allocated in the per tuple context (see prepare_probe_slot). And then
in MemoizeHash_hash, after we've calculated the hashkey, we will reset
the per tuple context. However, later in MemoizeHash_equal, we still
need to reference the values in probeslot->tts_values[], which have been
cleared.

Actually the context would always be reset in MemoizeHash_equal, for
both binary and logical mode. So I kind of wonder if it's necessary to
reset the context in MemoizeHash_hash.

The ResetExprContext call in MemoizeHash_hash was introduced in
0b053e78b to fix a memory leak issue.

commit 0b053e78b5990cd01e7169ee5bd2bb8e4045deea
Author: David Rowley <drowley@postgresql.org>
Date: Thu Oct 5 20:30:47 2023 +1300

Fix memory leak in Memoize code

It seems to me that switching to the per-tuple memory context is
sufficient to fix the memory leak. Calling ResetExprContext in
MemoizeHash_hash each time seems too aggressive.

I tried to remove the ResetExprContext call in MemoizeHash_hash and did
not see the memory leak with the repro query in [1]/messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru.

diff --git a/src/backend/executor/nodeMemoize.c
b/src/backend/executor/nodeMemoize.c
index 18870f10e1..f2f025520d 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -207,7 +207,6 @@ MemoizeHash_hash(struct memoize_hash *tb, const
MemoizeKey *key)
                }
        }

- ResetExprContext(econtext);
MemoryContextSwitchTo(oldcontext);
return murmurhash32(hashkey);
}

Looping in David to have a look.

[1]: /messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru
/messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru

Thanks
Richard

#8Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Richard Guo (#7)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 26/2/2024 18:34, Richard Guo wrote:

On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:

On 26/2/2024 12:44, Tender Wang wrote:

Make sense. I found MemoizeState already has a MemoryContext, so

I used it.

I update the patch.

This approach is better for me. In the next version of this patch, I
included a test case. I am still unsure about the context chosen and
the
stability of the test case. Richard, you recently fixed some Memoize
issues, could you look at this problem and patch?

I looked at this issue a bit.  It seems to me what happens is that at
first the memory areas referenced by probeslot->tts_values[] are
allocated in the per tuple context (see prepare_probe_slot).  And then
in MemoizeHash_hash, after we've calculated the hashkey, we will reset
the per tuple context.  However, later in MemoizeHash_equal, we still
need to reference the values in probeslot->tts_values[], which have been
cleared.

Agree

Actually the context would always be reset in MemoizeHash_equal, for
both binary and logical mode.  So I kind of wonder if it's necessary to
reset the context in MemoizeHash_hash.

I can only provide one thought against this solution: what if we have a
lot of unique hash values, maybe all of them? In that case, we still
have a kind of 'leak' David fixed by the commit 0b053e78b5.
Also, I have a segfault report of one client. As I see, it was caused by
too long text column in the table slot. As I see, key value, stored in
the Memoize hash table, was corrupted, and the most plain reason is this
bug. Should we add a test on this bug, and what do you think about the
one proposed in v3?

--
regards,
Andrei Lepikhov
Postgres Professional

#9Tender Wang
tndrwang@gmail.com
In reply to: Andrei Lepikhov (#8)
1 attachment(s)
Re: "type with xxxx does not exist" when doing ExecMemoize()

The attached patch is a new version based on v3(not including Andrei's the
test case). There is no need to call datumCopy when
isnull is true.

I have not added a new runtime memoryContext so far. Continue to use
mstate->tableContext, I'm not sure the memory used of probeslot will affect
mstate->mem_limit.
Maybe adding a new memoryContext is better. I think I should spend a little
time to learn nodeMemoize.c more deeply.

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年2月26日周一 20:29写道:

On 26/2/2024 18:34, Richard Guo wrote:

On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:

On 26/2/2024 12:44, Tender Wang wrote:

Make sense. I found MemoizeState already has a MemoryContext, so

I used it.

I update the patch.

This approach is better for me. In the next version of this patch, I
included a test case. I am still unsure about the context chosen and
the
stability of the test case. Richard, you recently fixed some Memoize
issues, could you look at this problem and patch?

I looked at this issue a bit. It seems to me what happens is that at
first the memory areas referenced by probeslot->tts_values[] are
allocated in the per tuple context (see prepare_probe_slot). And then
in MemoizeHash_hash, after we've calculated the hashkey, we will reset
the per tuple context. However, later in MemoizeHash_equal, we still
need to reference the values in probeslot->tts_values[], which have been
cleared.

Agree

Actually the context would always be reset in MemoizeHash_equal, for
both binary and logical mode. So I kind of wonder if it's necessary to
reset the context in MemoizeHash_hash.

I can only provide one thought against this solution: what if we have a
lot of unique hash values, maybe all of them? In that case, we still
have a kind of 'leak' David fixed by the commit 0b053e78b5.
Also, I have a segfault report of one client. As I see, it was caused by
too long text column in the table slot. As I see, key value, stored in
the Memoize hash table, was corrupted, and the most plain reason is this
bug. Should we add a test on this bug, and what do you think about the
one proposed in v3?

--
regards,
Andrei Lepikhov
Postgres Professional

--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patchapplication/octet-stream; name=v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patchDownload
From 91a1c0ce97346bef188e99d98a6de90741a1a0c8 Mon Sep 17 00:00:00 2001
From: "tender.wang" <tender.wang@openpie.com>
Date: Wed, 28 Feb 2024 14:40:00 +0800
Subject: [PATCH v4] Fix RangeType oid not found when doing Memoize.

After call ResetExprContext(), the data in mstate->proeslot may be freed too,
So we copy the result to probeslot from ExecEvalExpr.
---
 src/backend/executor/nodeMemoize.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 18870f10e1..3790df71c0 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -312,17 +312,33 @@ prepare_probe_slot(MemoizeState *mstate, MemoizeKey *key)
 	if (key == NULL)
 	{
 		ExprContext *econtext = mstate->ss.ps.ps_ExprContext;
+		Datum value;
+		bool isnull;
+		TupleDesc tup = pslot->tts_tupleDescriptor;
+		Form_pg_attribute att;
 		MemoryContext oldcontext;
 
-		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
-
 		/* Set the probeslot's values based on the current parameter values */
 		for (int i = 0; i < numKeys; i++)
-			pslot->tts_values[i] = ExecEvalExpr(mstate->param_exprs[i],
-												econtext,
-												&pslot->tts_isnull[i]);
-
-		MemoryContextSwitchTo(oldcontext);
+		{
+			att = TupleDescAttr(tup, i);
+			value = ExecEvalExprSwitchContext(mstate->param_exprs[i],
+											  econtext,
+											  &isnull);
+			if (isnull)
+			{
+				pslot->tts_values[i] = (Datum ) 0;
+				pslot->tts_isnull[i] = true;
+			}
+			else
+			{
+				/* Copy the value to avoid freed after resetting ExprContext */
+				oldcontext = MemoryContextSwitchTo(mstate->tableContext);
+				pslot->tts_values[i] = datumCopy(value, att->attbyval, att->attlen);
+				pslot->tts_isnull[i] = false;
+				MemoryContextSwitchTo(oldcontext);
+			}
+		}
 	}
 	else
 	{
-- 
2.25.1

#10Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#9)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 28/2/2024 13:53, Tender Wang wrote:

The attached patch is a new version based on v3(not including Andrei's
the test case). There is no need to call datumCopy when
isnull is true.

I have not added a new runtime memoryContext so far. Continue to use
mstate->tableContext, I'm not sure the memory used of probeslot will
affect mstate->mem_limit.
Maybe adding a new memoryContext is better. I think I should spend a
little time to learn nodeMemoize.c more deeply.

I am curious about your reasons to stay with tableContext. In terms of
memory allocation, Richard's approach looks better.
Also, You don't need to initialize tts_values[i] at all if tts_isnull[i]
set to true.

--
regards,
Andrei Lepikhov
Postgres Professional

#11Tender Wang
tndrwang@gmail.com
In reply to: Andrei Lepikhov (#10)
1 attachment(s)
Re: "type with xxxx does not exist" when doing ExecMemoize()

I read Memoize code and how other node use ResetExprContext() recently.

The comments about per_tuple_memory said that :

* ecxt_per_tuple_memory is a short-term context for expression results.
* As the name suggests, it will typically be reset once per tuple,
* before we begin to evaluate expressions for that tuple. Each
* ExprContext normally has its very own per-tuple memory context.

So ResetExprContext() should called once per tuple, but not in Hash and
Equal function just as Richard said before.
In ExecResult() and ExecProjectSet(), they call ResetExprContext() once
when enter these functions.
So I think ExecMemoize() can do the same way.

The attached patch includes below modifications:
1.
When I read the code in nodeMemoize.c, I found a typos: outer should be
inner,
if I don't misunderstand the intend of Memoize.

2.
I found that almost executor node call CHECK_FOR_INTERRUPTS(), so I add it.
Is it right to add it for ExecMemoize()?

3.
I remove ResetExprContext() from Hash and Equal funciton. And I call it
when enter
ExecMemoize() just like ExecPrejectSet() does.
ExecQualAndReset() is replaed with ExecQual().

4.
This patch doesn't include test case. I use the Andrei's test case, but I
don't repeat the aboved issue.
I may need to spend some more time to think about how to repeat this issue
easily.

So, what do you think about the one proposed in v5? @Andrei Lepikhov
<a.lepikhov@postgrespro.ru> @Richard Guo <guofenglinux@gmail.com> @David
Rowley <dgrowleyml@gmail.com> .
I don't want to continue to do work based on v3 patch. As Andrei Lepikhov
said, using mstate->tableContext for probeslot
is not good. v5 looks more simple.

Attachments:

v5-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patchapplication/octet-stream; name=v5-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patchDownload
From 8497ae88c203d751013357d32a7940fd353b46c0 Mon Sep 17 00:00:00 2001
From: "tender.wang" <tender.wang@openpie.com>
Date: Thu, 29 Feb 2024 12:31:34 +0800
Subject: [PATCH v5] Fix wrong used ResetExprContext() in ExecMemoize().

In commit 0b053e78, ResetExprContext() was called in Hash and Equal func.
The memory allocation for probeslot is also from per_tuple_memory.
It will be safe for probeslot if the data in probeslot is pass-by-value or
plain var referencd outertuple slot. But it will be unsafe if the data in
probeslot is pass-by-reference and pointer to per_tuple_memory.

per_tuple_memory is a short-term context for expression results.
As the name suggests, it will typically be reset once per tuple,
before we begin to evaluate expressions for that tuple.

So we call RestExprContext() every time when we enter ExecMemoize(),
ohter function doesn't call it anymore. This way is as same as ExecProjectSet
and ExecResult does.
---
 src/backend/executor/nodeMemoize.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 18870f10e1..4b544afc6c 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -13,7 +13,7 @@
  * Memoize nodes are intended to sit above parameterized nodes in the plan
  * tree in order to cache results from them.  The intention here is that a
  * repeat scan with a parameter value that has already been seen by the node
- * can fetch tuples from the cache rather than having to re-scan the outer
+ * can fetch tuples from the cache rather than having to re-scan the inner
  * node all over again.  The query planner may choose to make use of one of
  * these when it thinks rescans for previously seen values are likely enough
  * to warrant adding the additional node.
@@ -207,7 +207,6 @@ MemoizeHash_hash(struct memoize_hash *tb, const MemoizeKey *key)
 		}
 	}
 
-	ResetExprContext(econtext);
 	MemoryContextSwitchTo(oldcontext);
 	return murmurhash32(hashkey);
 }
@@ -265,7 +264,6 @@ MemoizeHash_equal(struct memoize_hash *tb, const MemoizeKey *key1,
 			}
 		}
 
-		ResetExprContext(econtext);
 		MemoryContextSwitchTo(oldcontext);
 		return match;
 	}
@@ -273,7 +271,7 @@ MemoizeHash_equal(struct memoize_hash *tb, const MemoizeKey *key1,
 	{
 		econtext->ecxt_innertuple = tslot;
 		econtext->ecxt_outertuple = pslot;
-		return ExecQualAndReset(mstate->cache_eq_expr, econtext);
+		return ExecQual(mstate->cache_eq_expr, econtext);
 	}
 }
 
@@ -701,6 +699,17 @@ ExecMemoize(PlanState *pstate)
 	MemoizeState *node = castNode(MemoizeState, pstate);
 	PlanState  *outerNode;
 	TupleTableSlot *slot;
+	ExprContext *econtext;
+
+	CHECK_FOR_INTERRUPTS();
+
+	econtext = node->ss.ps.ps_ExprContext;
+
+	/*
+	 * Reset per-tuple memory context to free any expression evaluation
+	 * storage allocated in the previous tuple cycle.
+	 */
+	ResetExprContext(econtext);
 
 	switch (node->mstatus)
 	{
-- 
2.25.1

#12Tender Wang
tndrwang@gmail.com
In reply to: Tender Wang (#11)
1 attachment(s)
Re: "type with xxxx does not exist" when doing ExecMemoize()

Hi,

When I think about how to add a test case for v5 version patch, and I want
to test if v5 version patch has memory leak.
This thread [1]/messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru provided a way how to repeat the memory leak, so I used it
to test v5 patch. I didn't found memory leak on
v5 patch.

But I found other interesting issue. When changed whereClause in [1]/messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru, the
query reported below error:

"ERROR could not find memoization table entry"

the query:
EXPLAIN analyze
select sum(q.id_table1)
from (
SELECT t2.*
FROM table1 t1
JOIN table2 t2
ON (t2.id_table1 + t2.id_table1) = t1.id) q;

But on v5 patch, it didn't report error.

I guess it is the same reason that data in probeslot was reset in Hash
function.

I debug the above query, and get this:
before
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$1 = {vl_len_ = 48, choice = {n_header = 32770, n_long = {n_sign_dscale =
32770, n_weight = 60, n_data = 0x564632ebd708}, n_short = {n_header =
32770, n_data = 0x564632ebd706}}}
after
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$2 = {vl_len_ = 264, choice = {n_header = 32639, n_long = {n_sign_dscale =
32639, n_weight = 32639, n_data = 0x564632ebd6a8}, n_short = {n_header =
32639, n_data = 0x564632ebd6a6}}}

So after call ResetExprContext() in Hash function, the data in probeslot is
corrupted. It is not sure what error will happen when executing on
corrupted data.

During debug, I learned that numeric_add doesn't have type check like
rangetype, so aboved query will not report "type with xxx does not exist".

And I realize that the test case added by Andrei Lepikhov in v3 is right.
So in v6 patch I add Andrei Lepikhov's test case. Thanks a lot.

Now I think the v6 version patch seems to be complete now.

[1]: /messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru
/messages/by-id/83281eed63c74e4f940317186372abfd@cft.ru

--
Tender Wang
OpenPie: https://en.openpie.com/

Attachments:

v6-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patchapplication/octet-stream; name=v6-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patchDownload
From 55723ecdd1e1ff0c0c3438827c05cb1a29b671e0 Mon Sep 17 00:00:00 2001
From: "tender.wang" <tender.wang@openpie.com>
Date: Fri, 1 Mar 2024 14:49:11 +0800
Subject: [PATCH v6] Fix wrong used ResetExprContext() in ExecMemoize().

In commit 0b053e78, ResetExprContext() was called in Hash and Equal func.
The memory allocation for probeslot is also from per_tuple_memory.
It will be safe for probeslot if the data in probeslot is pass-by-value or
plain var referencd outertuple slot. But it will be unsafe if the data in
probeslot is pass-by-reference and pointer to per_tuple_memory.

per_tuple_memory is a short-term context for expression results.
As the name suggests, it will typically be reset once per tuple,
before we begin to evaluate expressions for that tuple.

So we call RestExprContext() every time when we enter ExecMemoize(),
ohter function doesn't call it anymore. This way is as same as ExecProjectSet
and ExecResult does.
---
 src/backend/executor/nodeMemoize.c    | 17 +++++++++++++----
 src/test/regress/expected/memoize.out | 25 +++++++++++++++++++++++++
 src/test/regress/sql/memoize.sql      | 14 ++++++++++++++
 3 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 18870f10e1..4b544afc6c 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -13,7 +13,7 @@
  * Memoize nodes are intended to sit above parameterized nodes in the plan
  * tree in order to cache results from them.  The intention here is that a
  * repeat scan with a parameter value that has already been seen by the node
- * can fetch tuples from the cache rather than having to re-scan the outer
+ * can fetch tuples from the cache rather than having to re-scan the inner
  * node all over again.  The query planner may choose to make use of one of
  * these when it thinks rescans for previously seen values are likely enough
  * to warrant adding the additional node.
@@ -207,7 +207,6 @@ MemoizeHash_hash(struct memoize_hash *tb, const MemoizeKey *key)
 		}
 	}
 
-	ResetExprContext(econtext);
 	MemoryContextSwitchTo(oldcontext);
 	return murmurhash32(hashkey);
 }
@@ -265,7 +264,6 @@ MemoizeHash_equal(struct memoize_hash *tb, const MemoizeKey *key1,
 			}
 		}
 
-		ResetExprContext(econtext);
 		MemoryContextSwitchTo(oldcontext);
 		return match;
 	}
@@ -273,7 +271,7 @@ MemoizeHash_equal(struct memoize_hash *tb, const MemoizeKey *key1,
 	{
 		econtext->ecxt_innertuple = tslot;
 		econtext->ecxt_outertuple = pslot;
-		return ExecQualAndReset(mstate->cache_eq_expr, econtext);
+		return ExecQual(mstate->cache_eq_expr, econtext);
 	}
 }
 
@@ -701,6 +699,17 @@ ExecMemoize(PlanState *pstate)
 	MemoizeState *node = castNode(MemoizeState, pstate);
 	PlanState  *outerNode;
 	TupleTableSlot *slot;
+	ExprContext *econtext;
+
+	CHECK_FOR_INTERRUPTS();
+
+	econtext = node->ss.ps.ps_ExprContext;
+
+	/*
+	 * Reset per-tuple memory context to free any expression evaluation
+	 * storage allocated in the previous tuple cycle.
+	 */
+	ResetExprContext(econtext);
 
 	switch (node->mstatus)
 	{
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index cf6886a288..7667ae393c 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -196,6 +196,31 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
 (10 rows)
 
 DROP TABLE flt;
+CREATE TABLE expr_key (x numeric, t text);
+INSERT INTO expr_key (x,t) SELECT d1::numeric, d1::text FROM (
+  SELECT round((d/pi())::numeric,7) AS d1 FROM generate_series(1,20) AS d);
+INSERT INTO expr_key SELECT * FROM expr_key;
+CREATE INDEX expr_key_idx_x_t ON expr_key (x,t);
+VACUUM ANALYZE expr_key;
+SELECT explain_memoize('
+SELECT * FROM expr_key t1 INNER JOIN expr_key t2
+ON t1.x=t2.t::numeric AND t1.t::numeric=t2.x;', true);
+                                      explain_memoize                                      
+-------------------------------------------------------------------------------------------
+ Nested Loop (actual rows=80 loops=N)
+   ->  Index Only Scan using expr_key_idx_x_t on expr_key t1 (actual rows=40 loops=N)
+         Heap Fetches: N
+   ->  Memoize (actual rows=2 loops=N)
+         Cache Key: t1.x, (t1.t)::numeric
+         Cache Mode: logical
+         Hits: N  Misses: N  Evictions: Zero  Overflows: 0  Memory Usage: NkB
+         ->  Index Only Scan using expr_key_idx_x_t on expr_key t2 (actual rows=2 loops=N)
+               Index Cond: (x = (t1.t)::numeric)
+               Filter: (t1.x = (t)::numeric)
+               Heap Fetches: N
+(11 rows)
+
+DROP TABLE expr_key;
 -- Exercise Memoize in binary mode with a large fixed width type and a
 -- varlena type.
 CREATE TABLE strtest (n name, t text);
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 1f4ab0ba3b..00003e3cff 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -103,6 +103,20 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
 
 DROP TABLE flt;
 
+CREATE TABLE expr_key (x numeric, t text);
+INSERT INTO expr_key (x,t) SELECT d1::numeric, d1::text FROM (
+  SELECT round((d/pi())::numeric,7) AS d1 FROM generate_series(1,20) AS d);
+
+INSERT INTO expr_key SELECT * FROM expr_key;
+CREATE INDEX expr_key_idx_x_t ON expr_key (x,t);
+VACUUM ANALYZE expr_key;
+
+SELECT explain_memoize('
+SELECT * FROM expr_key t1 INNER JOIN expr_key t2
+ON t1.x=t2.t::numeric AND t1.t::numeric=t2.x;', true);
+
+DROP TABLE expr_key;
+
 -- Exercise Memoize in binary mode with a large fixed width type and a
 -- varlena type.
 CREATE TABLE strtest (n name, t text);
-- 
2.25.1

#13Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#12)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 1/3/2024 14:18, Tender Wang wrote:

During debug, I learned that numeric_add doesn't have type check like
rangetype, so aboved query will not report "type with xxx does not exist".

And I realize that  the test case added by Andrei Lepikhov  in v3 is
right. So in v6 patch I add Andrei Lepikhov's test case.  Thanks a lot.

Now I think the v6 version patch seems to be complete now.

I've passed through the patch, and it looks okay. Although I am afraid
of the same problems that future changes can cause and how to detect
them, it works correctly.

--
regards,
Andrei Lepikhov
Postgres Professional

#14Tender Wang
tndrwang@gmail.com
In reply to: Andrei Lepikhov (#13)
Re: "type with xxxx does not exist" when doing ExecMemoize()

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年3月5日周二 17:36写道:

On 1/3/2024 14:18, Tender Wang wrote:

During debug, I learned that numeric_add doesn't have type check like
rangetype, so aboved query will not report "type with xxx does not

exist".

And I realize that the test case added by Andrei Lepikhov in v3 is
right. So in v6 patch I add Andrei Lepikhov's test case. Thanks a lot.

Now I think the v6 version patch seems to be complete now.

I've passed through the patch, and it looks okay. Although I am afraid
of the same problems that future changes can cause and how to detect
them, it works correctly.

Thanks for reviewing it, and I add it to commitfest 2024-07.

--
Tender Wang
OpenPie: https://en.openpie.com/

#15Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Tender Wang (#14)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On 6/3/2024 10:10, Tender Wang wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru
<mailto:a.lepikhov@postgrespro.ru>> 于2024年3月5日周二 17:36写道:

On 1/3/2024 14:18, Tender Wang wrote:

During debug, I learned that numeric_add doesn't have type check

like

rangetype, so aboved query will not report "type with xxx does

not exist".

And I realize that  the test case added by Andrei Lepikhov  in v3 is
right. So in v6 patch I add Andrei Lepikhov's test case.  Thanks

a lot.

Now I think the v6 version patch seems to be complete now.

I've passed through the patch, and it looks okay. Although I am afraid
of the same problems that future changes can cause and how to detect
them, it works correctly.

Thanks for reviewing it, and I add it to commitfest 2024-07.

I think, it is a bug. Should it be fixed (and back-patched) earlier?

--
regards,
Andrei Lepikhov
Postgres Professional

#16Tender Wang
tndrwang@gmail.com
In reply to: Andrei Lepikhov (#15)
Re: "type with xxxx does not exist" when doing ExecMemoize()

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年3月6日周三 11:37写道:

I think, it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. Need David to review it as he knows this area best.

--
Tender Wang
OpenPie: https://en.openpie.com/

#17David Rowley
dgrowleyml@gmail.com
In reply to: Tender Wang (#16)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On Thu, 7 Mar 2024 at 15:24, Tender Wang <tndrwang@gmail.com> wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年3月6日周三 11:37写道:

I think, it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. Need David to review it as he knows this area best.

This is on my list of things to do. Just not at the top yet.

David

#18David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#17)
Re: "type with xxxx does not exist" when doing ExecMemoize()

On Thu, 7 Mar 2024 at 22:50, David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 7 Mar 2024 at 15:24, Tender Wang <tndrwang@gmail.com> wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年3月6日周三 11:37写道:

I think, it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. Need David to review it as he knows this area best.

This is on my list of things to do. Just not at the top yet.

I've gone over this patch and I'm happy with the changes to
nodeMemoize.c. The thing I did change was the newly added test. The
problem there was the test was passing for me with and without the
code fix. I ended up changing the test so the cache hits and misses
are reported. That required moving the test to above where the
work_mem is set to 64KB so we can be certain the values will all be
cached and the cache hits are predictable.

My other changes were just cosmetic.

Thanks for working on this fix. I've pushed the patch.

David

#19Tender Wang
tndrwang@gmail.com
In reply to: David Rowley (#18)
Re: "type with xxxx does not exist" when doing ExecMemoize()

David Rowley <dgrowleyml@gmail.com> 于2024年3月11日周一 13:25写道:

On Thu, 7 Mar 2024 at 22:50, David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 7 Mar 2024 at 15:24, Tender Wang <tndrwang@gmail.com> wrote:

Andrei Lepikhov <a.lepikhov@postgrespro.ru> 于2024年3月6日周三 11:37写道:

I think, it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. Need David to review it as he knows this area best.

This is on my list of things to do. Just not at the top yet.

I've gone over this patch and I'm happy with the changes to
nodeMemoize.c. The thing I did change was the newly added test. The
problem there was the test was passing for me with and without the
code fix. I ended up changing the test so the cache hits and misses
are reported. That required moving the test to above where the
work_mem is set to 64KB so we can be certain the values will all be
cached and the cache hits are predictable.

My other changes were just cosmetic.

Thanks for working on this fix. I've pushed the patch.

David

Thanks for pushing the patch.
--
Tender Wang
OpenPie: https://en.openpie.com/