Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Started by Richard Guoabout 2 years ago25 messages
#1Richard Guo
guofenglinux@gmail.com
1 attachment(s)

While working on BUG #18187 [1]/messages/by-id/18187-831da249cbd2ff8e@postgresql.org, I noticed that we also have issues with
how SJE replaces join clauses involving the removed rel. As an example,
consider the query below, which would trigger an Assert.

create table t (a int primary key, b int);

explain (costs off)
select * from t t1
inner join t t2 on t1.a = t2.a
left join t t3 on t1.b > 1 and t1.b < 2;
server closed the connection unexpectedly

The Assert failure happens in remove_self_join_rel() when we're trying
to remove t1. The two join clauses of t1, 't1.b > 1' and 't1.b < 2',
share the same pointer of 'required_relids', which is {t1, t3} at first.
After we've performed replace_varno for the first clause, the
required_relids becomes {t2, t3}, which is no problem. However, the
second clause's required_relids also becomes {t2, t3}, because they are
actually the same pointer. So when we proceed with replace_varno on the
second clause, we'd trigger the Assert.

Off the top of my head I'm thinking that we can fix this kind of issue
by bms_copying the bitmapset first before we make a substitution in
replace_relid(), like attached.

Alternatively, we can revise distribute_qual_to_rels() as below so that
different RestrictInfos don't share the same pointer of required_relids.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node
*clause,
         * nonnullable-side rows failing the qual.
         */
        Assert(ojscope);
-       relids = ojscope;
+       relids = bms_copy(ojscope);
        Assert(!pseudoconstant);
    }
    else

With this way, I'm worrying that there are other places where we should
avoid sharing the same pointer to Bitmapset structure. I'm not sure how
to discover all these places.

Any thoughts?

[1]: /messages/by-id/18187-831da249cbd2ff8e@postgresql.org
/messages/by-id/18187-831da249cbd2ff8e@postgresql.org

Thanks
Richard

Attachments:

v1-0001-Fix-how-SJE-replaces-join-clauses.patchapplication/octet-stream; name=v1-0001-Fix-how-SJE-replaces-join-clauses.patchDownload
From 8669edb991f4cb9d48fe8f85bec86636eef24652 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 14 Nov 2023 18:55:24 +0800
Subject: [PATCH v1] Fix how SJE replaces join clauses

---
 src/backend/optimizer/plan/analyzejoins.c |  9 +++++++--
 src/test/regress/expected/join.out        | 15 +++++++++++++++
 src/test/regress/sql/join.sql             |  6 ++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b9be19a687..f94449df94 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1518,12 +1518,17 @@ replace_relid(Relids relids, int oldId, int newId)
 	if (oldId < 0)
 		return relids;
 
+	/* Delete relid without substitution. */
 	if (newId < 0)
-		/* Delete relid without substitution. */
 		return bms_del_member(relids, oldId);
 
+	/*
+	 * Substitute newId for oldId.  We must make a copy of the original relids
+	 * before starting the substitution, because the same pointer to a
+	 * Bitmapset structure might be shared among different places.
+	 */
 	if (bms_is_member(oldId, relids))
-		return bms_add_member(bms_del_member(relids, oldId), newId);
+		return bms_add_member(bms_del_member(bms_copy(relids), oldId), newId);
 
 	return relids;
 }
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2c73270143..5ceecc4413 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -6836,6 +6836,21 @@ select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
    ->  Function Scan on generate_series t3
 (6 rows)
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+                  QUERY PLAN                  
+----------------------------------------------
+ Nested Loop Left Join
+   Join Filter: ((t2.id > 1) AND (t2.id < 2))
+   ->  Seq Scan on emp1 t2
+         Filter: (id IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on emp1 t3
+(6 rows)
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 8a8a63bd2f..c3ec340e2d 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2606,6 +2606,12 @@ explain (costs off)
 select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
     lateral (select t1.id as t1id, * from generate_series(1,1) t3) s on true;
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
-- 
2.31.0

#2Alexander Korotkov
aekorotkov@gmail.com
In reply to: Richard Guo (#1)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi!

Thank you for spotting this and dealing with this.

On Tue, Nov 14, 2023 at 1:15 PM Richard Guo <guofenglinux@gmail.com> wrote:

While working on BUG #18187 [1], I noticed that we also have issues with
how SJE replaces join clauses involving the removed rel. As an example,
consider the query below, which would trigger an Assert.

create table t (a int primary key, b int);

explain (costs off)
select * from t t1
inner join t t2 on t1.a = t2.a
left join t t3 on t1.b > 1 and t1.b < 2;
server closed the connection unexpectedly

The Assert failure happens in remove_self_join_rel() when we're trying
to remove t1. The two join clauses of t1, 't1.b > 1' and 't1.b < 2',
share the same pointer of 'required_relids', which is {t1, t3} at first.
After we've performed replace_varno for the first clause, the
required_relids becomes {t2, t3}, which is no problem. However, the
second clause's required_relids also becomes {t2, t3}, because they are
actually the same pointer. So when we proceed with replace_varno on the
second clause, we'd trigger the Assert.

Off the top of my head I'm thinking that we can fix this kind of issue
by bms_copying the bitmapset first before we make a substitution in
replace_relid(), like attached.

I remember, I've removed bms_copy() from here. Now I understand why
that was needed. But I'm still not particularly happy about it. The
reason is that logic of replace_relid() becomes cumbersome. In some
cases it performs modification in-place, while in other cases it
copies.

Alternatively, we can revise distribute_qual_to_rels() as below so that
different RestrictInfos don't share the same pointer of required_relids.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
* nonnullable-side rows failing the qual.
*/
Assert(ojscope);
-       relids = ojscope;
+       relids = bms_copy(ojscope);
Assert(!pseudoconstant);
}
else

With this way, I'm worrying that there are other places where we should
avoid sharing the same pointer to Bitmapset structure. I'm not sure how
to discover all these places.

This looks better to me. However, I'm not sure what the overhead
would be? How much would it increase the memory footprint?

It's possibly dumb option, but what about just removing the assert?

------
Regards,
Alexander Korotkov

#3Andres Freund
andres@anarazel.de
In reply to: Richard Guo (#1)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi,

On 2023-11-14 19:14:57 +0800, Richard Guo wrote:

While working on BUG #18187 [1], I noticed that we also have issues with
how SJE replaces join clauses involving the removed rel. As an example,
consider the query below, which would trigger an Assert.

create table t (a int primary key, b int);

explain (costs off)
select * from t t1
inner join t t2 on t1.a = t2.a
left join t t3 on t1.b > 1 and t1.b < 2;
server closed the connection unexpectedly

The Assert failure happens in remove_self_join_rel() when we're trying
to remove t1. The two join clauses of t1, 't1.b > 1' and 't1.b < 2',
share the same pointer of 'required_relids', which is {t1, t3} at first.
After we've performed replace_varno for the first clause, the
required_relids becomes {t2, t3}, which is no problem. However, the
second clause's required_relids also becomes {t2, t3}, because they are
actually the same pointer. So when we proceed with replace_varno on the
second clause, we'd trigger the Assert.

Good catch.

Off the top of my head I'm thinking that we can fix this kind of issue
by bms_copying the bitmapset first before we make a substitution in
replace_relid(), like attached.

Alternatively, we can revise distribute_qual_to_rels() as below so that
different RestrictInfos don't share the same pointer of required_relids.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node
*clause,
* nonnullable-side rows failing the qual.
*/
Assert(ojscope);
-       relids = ojscope;
+       relids = bms_copy(ojscope);
Assert(!pseudoconstant);
}
else

With this way, I'm worrying that there are other places where we should
avoid sharing the same pointer to Bitmapset structure.

Indeed.

I'm not sure how to discover all these places. Any thoughts?

At the very least I think we should add a mode to bitmapset.c mode where
every modification of a bitmapset reallocates, rather than just when the size
actually changes. Because we only reallocte and free in relatively uncommon
cases, particularly on 64bit systems, it's very easy to not find spots that
continue to use the input pointer to one of the modifying bms functions.

A very hacky implementation of that indeed catches this bug with the existing
regression tests.

The tests do *not* pass with just the attached applied, as the "Delete relid
without substitution" path has the same issue. With that also copying and all
the "reusing" bms* functions always reallocating, the tests pass - kinda.

The kinda because there are callers to bms_(add|del)_members() that pass the
same bms as a and b, which only works if the reallocation happens "late".

Greetings,

Andres

#4Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#2)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi,

On 2023-11-14 14:42:13 +0200, Alexander Korotkov wrote:

It's possibly dumb option, but what about just removing the assert?

That's not at all an option - the in-place bms_* functions can free their
input. So a dangling pointer to the "old" version is a use-after-free waiting
to happen - you just need a query that actually gets to bitmapsets that are a
bit larger.

Greetings,

Andres

#5Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#4)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Wed, Nov 15, 2023 at 8:04 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-11-14 14:42:13 +0200, Alexander Korotkov wrote:

It's possibly dumb option, but what about just removing the assert?

That's not at all an option - the in-place bms_* functions can free their
input. So a dangling pointer to the "old" version is a use-after-free waiting
to happen - you just need a query that actually gets to bitmapsets that are a
bit larger.

Yeah, now I got it, thank you. I was under the wrong impression that
bitmapset has the level of indirection, so the pointer remains valid.
Now, I see that bitmapset manipulation functions can do free/repalloc
making the previous bitmapset pointer invalid.

------
Regards,
Alexander Korotkov

#6Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#3)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Wed, Nov 15, 2023 at 8:02 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-11-14 19:14:57 +0800, Richard Guo wrote:

While working on BUG #18187 [1], I noticed that we also have issues with
how SJE replaces join clauses involving the removed rel. As an example,
consider the query below, which would trigger an Assert.

create table t (a int primary key, b int);

explain (costs off)
select * from t t1
inner join t t2 on t1.a = t2.a
left join t t3 on t1.b > 1 and t1.b < 2;
server closed the connection unexpectedly

The Assert failure happens in remove_self_join_rel() when we're trying
to remove t1. The two join clauses of t1, 't1.b > 1' and 't1.b < 2',
share the same pointer of 'required_relids', which is {t1, t3} at first.
After we've performed replace_varno for the first clause, the
required_relids becomes {t2, t3}, which is no problem. However, the
second clause's required_relids also becomes {t2, t3}, because they are
actually the same pointer. So when we proceed with replace_varno on the
second clause, we'd trigger the Assert.

Good catch.

Off the top of my head I'm thinking that we can fix this kind of issue
by bms_copying the bitmapset first before we make a substitution in
replace_relid(), like attached.

Alternatively, we can revise distribute_qual_to_rels() as below so that
different RestrictInfos don't share the same pointer of required_relids.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node
*clause,
* nonnullable-side rows failing the qual.
*/
Assert(ojscope);
-       relids = ojscope;
+       relids = bms_copy(ojscope);
Assert(!pseudoconstant);
}
else

With this way, I'm worrying that there are other places where we should
avoid sharing the same pointer to Bitmapset structure.

Indeed.

I'm not sure how to discover all these places. Any thoughts?

At the very least I think we should add a mode to bitmapset.c mode where
every modification of a bitmapset reallocates, rather than just when the size
actually changes. Because we only reallocte and free in relatively uncommon
cases, particularly on 64bit systems, it's very easy to not find spots that
continue to use the input pointer to one of the modifying bms functions.

A very hacky implementation of that indeed catches this bug with the existing
regression tests.

The tests do *not* pass with just the attached applied, as the "Delete relid
without substitution" path has the same issue. With that also copying and all
the "reusing" bms* functions always reallocating, the tests pass - kinda.

The kinda because there are callers to bms_(add|del)_members() that pass the
same bms as a and b, which only works if the reallocation happens "late".

+1,
Neat idea. I'm willing to work on this. Will propose the patch soon.

------
Regards,
Alexander Korotkov

#7Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#6)
2 attachment(s)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Wed, Nov 15, 2023 at 5:07 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Wed, Nov 15, 2023 at 8:02 AM Andres Freund <andres@anarazel.de> wrote:

The kinda because there are callers to bms_(add|del)_members() that pass the
same bms as a and b, which only works if the reallocation happens "late".

+1,
Neat idea. I'm willing to work on this. Will propose the patch soon.

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification. I also find it useful to add assert to all
bitmapset functions on argument NodeTag. This allows you to find
access to hanging pointers earlier.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

Any thoughts?

------
Regards,
Alexander Korotkov

Attachments:

0001-REALLOCATE_BITMAPSETS-manual-compile-time-option-v1.patchapplication/octet-stream; name=0001-REALLOCATE_BITMAPSETS-manual-compile-time-option-v1.patchDownload
From 5a18e880fb53ee5b5a0be092b1cb796636dbb60c Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Sun, 19 Nov 2023 01:03:12 +0200
Subject: [PATCH 1/2] REALLOCATE_BITMAPSETS manual compile-time option

This option forces each bitmapset modification to reallocate bitmapset.  This
is useful for debugging hangling pointers to bitmapset's.

Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL+SsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw@mail.gmail.com
---
 src/backend/nodes/bitmapset.c  | 146 +++++++++++++++++++++++++++++----
 src/include/pg_config_manual.h |   6 ++
 2 files changed, 138 insertions(+), 14 deletions(-)

diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 704879f5660..8be3e3e2c5e 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -84,6 +84,7 @@ bms_copy(const Bitmapset *a)
 
 	if (a == NULL)
 		return NULL;
+	Assert(IsA(a, Bitmapset));
 	size = BITMAPSET_SIZE(a->nwords);
 	result = (Bitmapset *) palloc(size);
 	memcpy(result, a, size);
@@ -98,8 +99,8 @@ bms_equal(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -139,8 +140,8 @@ bms_compare(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -215,6 +216,9 @@ bms_union(const Bitmapset *a, const Bitmapset *b)
 	int			otherlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return bms_copy(b);
@@ -253,9 +257,13 @@ bms_intersect(const Bitmapset *a, const Bitmapset *b)
 	int			resultlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL || b == NULL)
 		return NULL;
+
 	/* Identify shorter and longer input; copy the shorter one */
 	if (a->nwords <= b->nwords)
 	{
@@ -299,8 +307,8 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 	Bitmapset  *result;
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -308,6 +316,8 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 	if (b == NULL)
 		return bms_copy(a);
 
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/*
 	 * In Postgres' usage, an empty result is a very common case, so it's
 	 * worth optimizing for that by testing bms_nonempty_difference().  This
@@ -364,8 +374,8 @@ bms_is_subset(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -373,6 +383,8 @@ bms_is_subset(const Bitmapset *a, const Bitmapset *b)
 	if (b == NULL)
 		return false;
 
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/* 'a' can't be a subset of 'b' if it contains more words */
 	if (a->nwords > b->nwords)
 		return false;
@@ -399,8 +411,8 @@ bms_subset_compare(const Bitmapset *a, const Bitmapset *b)
 	int			shortlen;
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -411,6 +423,9 @@ bms_subset_compare(const Bitmapset *a, const Bitmapset *b)
 	}
 	if (b == NULL)
 		return BMS_SUBSET2;
+
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/* Check common words */
 	result = BMS_EQUAL;			/* status so far */
 	shortlen = Min(a->nwords, b->nwords);
@@ -467,6 +482,9 @@ bms_is_member(int x, const Bitmapset *a)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return false;
+
+	Assert(IsA(a, Bitmapset));
+
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 	if (wordnum >= a->nwords)
@@ -495,6 +513,8 @@ bms_member_index(Bitmapset *a, int x)
 	if (!bms_is_member(x, a))
 		return -1;
 
+	Assert(IsA(a, Bitmapset));
+
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
@@ -529,6 +549,9 @@ bms_overlap(const Bitmapset *a, const Bitmapset *b)
 	int			shortlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL || b == NULL)
 		return false;
@@ -553,6 +576,8 @@ bms_overlap_list(const Bitmapset *a, const List *b)
 	int			wordnum,
 				bitnum;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	if (a == NULL || b == NIL)
 		return false;
 
@@ -582,8 +607,8 @@ bms_nonempty_difference(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -617,6 +642,9 @@ bms_singleton_member(const Bitmapset *a)
 
 	if (a == NULL)
 		elog(ERROR, "bitmapset is empty");
+
+	Assert(IsA(a, Bitmapset));
+
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -657,6 +685,7 @@ bms_get_singleton_member(const Bitmapset *a, int *member)
 
 	if (a == NULL)
 		return false;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -690,6 +719,7 @@ bms_num_members(const Bitmapset *a)
 
 	if (a == NULL)
 		return 0;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -717,6 +747,7 @@ bms_membership(const Bitmapset *a)
 
 	if (a == NULL)
 		return BMS_EMPTY_SET;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -759,6 +790,7 @@ bms_add_member(Bitmapset *a, int x)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return bms_make_singleton(x);
+	Assert(IsA(a, Bitmapset));
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
@@ -767,8 +799,15 @@ bms_add_member(Bitmapset *a, int x)
 	{
 		int			oldnwords = a->nwords;
 		int			i;
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
 
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(wordnum + 1));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#else
 		a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(wordnum + 1));
+#endif
 		a->nwords = wordnum + 1;
 		/* zero out the enlarged portion */
 		i = oldnwords;
@@ -777,6 +816,16 @@ bms_add_member(Bitmapset *a, int x)
 			a->words[i] = 0;
 		} while (++i < a->nwords);
 	}
+#ifdef REALLOCATE_BITMAPSETS
+	else
+	{
+		Bitmapset  *tmp = a;
+
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+	}
+#endif
 
 	a->words[wordnum] |= ((bitmapword) 1 << bitnum);
 	return a;
@@ -794,14 +843,24 @@ bms_del_member(Bitmapset *a, int x)
 {
 	int			wordnum,
 				bitnum;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
 
 	if (x < 0)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return NULL;
+	Assert(IsA(a, Bitmapset));
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* member can't exist.  Return 'a' unmodified */
 	if (unlikely(wordnum >= a->nwords))
 		return a;
@@ -839,6 +898,9 @@ bms_add_members(Bitmapset *a, const Bitmapset *b)
 	int			otherlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return bms_copy(b);
@@ -852,6 +914,13 @@ bms_add_members(Bitmapset *a, const Bitmapset *b)
 	}
 	else
 	{
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
+
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#endif
 		result = a;
 		other = b;
 	}
@@ -884,6 +953,8 @@ bms_add_range(Bitmapset *a, int lower, int upper)
 				ushiftbits,
 				wordnum;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	/* do nothing if nothing is called for, without further checking */
 	if (upper < lower)
 		return a;
@@ -902,9 +973,16 @@ bms_add_range(Bitmapset *a, int lower, int upper)
 	{
 		int			oldnwords = a->nwords;
 		int			i;
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
 
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(uwordnum + 1));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#else
 		/* ensure we have enough words to store the upper bit */
 		a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(uwordnum + 1));
+#endif
 		a->nwords = uwordnum + 1;
 		/* zero out the enlarged portion */
 		i = oldnwords;
@@ -953,6 +1031,12 @@ bms_int_members(Bitmapset *a, const Bitmapset *b)
 	int			lastnonzero;
 	int			shortlen;
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
+
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -962,6 +1046,13 @@ bms_int_members(Bitmapset *a, const Bitmapset *b)
 		pfree(a);
 		return NULL;
 	}
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Intersect b into a; we need never copy */
 	shortlen = Min(a->nwords, b->nwords);
 	lastnonzero = -1;
@@ -993,15 +1084,25 @@ Bitmapset *
 bms_del_members(Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return NULL;
 	if (b == NULL)
 		return a;
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Remove b's bits from a; we need never copy */
 	if (a->nwords > b->nwords)
 	{
@@ -1054,12 +1155,25 @@ bms_join(Bitmapset *a, Bitmapset *b)
 	Bitmapset  *other;
 	int			otherlen;
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
+
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return b;
 	if (b == NULL)
 		return a;
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Identify shorter and longer input; use longer one as result */
 	if (a->nwords < b->nwords)
 	{
@@ -1109,6 +1223,8 @@ bms_next_member(const Bitmapset *a, int prevbit)
 	int			wordnum;
 	bitmapword	mask;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	if (a == NULL)
 		return -2;
 	nwords = a->nwords;
@@ -1168,6 +1284,8 @@ bms_prev_member(const Bitmapset *a, int prevbit)
 	int			ushiftbits;
 	bitmapword	mask;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	/*
 	 * If set is NULL or if there are no more bits to the right then we've
 	 * nothing to do.
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 8a6e67a445d..16c383ba7f7 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -335,6 +335,12 @@
  */
 /* #define COPY_PARSE_PLAN_TREES */
 
+/*
+ * Define this to force Bitmapset reallocation on each modification.  Helps
+ * to find hangling pointers to Bitmapset's.
+ */
+/* #define REALLOCATE_BITMAPSETS */
+
 /*
  * Define this to force all parse and plan trees to be passed through
  * outfuncs.c/readfuncs.c, to facilitate catching errors and omissions in
-- 
2.39.3 (Apple Git-145)

0002-Make-regression-tests-pass-with-REALLOCATE_BITMAP-v1.patchapplication/octet-stream; name=0002-Make-regression-tests-pass-with-REALLOCATE_BITMAP-v1.patchDownload
From 9d26049872a5dab8238ea686c810c32ea713e73e Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Sun, 19 Nov 2023 01:57:07 +0200
Subject: [PATCH 2/2] Make regression tests pass with REALLOCATE_BITMAPSETS
 enabled

Make a copy of bitmapset's which are going to be modified in future.
---
 src/backend/optimizer/path/equivclass.c   | 4 ++--
 src/backend/optimizer/plan/initsplan.c    | 6 +++---
 src/backend/optimizer/util/restrictinfo.c | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 7fa502d6e25..5fd60c095b0 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -151,8 +151,8 @@ process_equivalence(PlannerInfo *root,
 	collation = ((OpExpr *) clause)->inputcollid;
 	item1 = (Expr *) get_leftop(clause);
 	item2 = (Expr *) get_rightop(clause);
-	item1_relids = restrictinfo->left_relids;
-	item2_relids = restrictinfo->right_relids;
+	item1_relids = bms_copy(restrictinfo->left_relids);
+	item2_relids = bms_copy(restrictinfo->right_relids);
 
 	/*
 	 * Ensure both input expressions expose the desired collation (their types
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index b31d8921211..228aa3edb8c 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -1409,8 +1409,8 @@ make_outerjoininfo(PlannerInfo *root,
 							LCS_asString(rc->strength))));
 	}
 
-	sjinfo->syn_lefthand = left_rels;
-	sjinfo->syn_righthand = right_rels;
+	sjinfo->syn_lefthand = bms_copy(left_rels);
+	sjinfo->syn_righthand = bms_copy(right_rels);
 	sjinfo->jointype = jointype;
 	sjinfo->ojrelid = ojrelid;
 	/* these fields may get added to later: */
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 		 * nonnullable-side rows failing the qual.
 		 */
 		Assert(ojscope);
-		relids = ojscope;
+		relids = bms_copy(ojscope);
 		Assert(!pseudoconstant);
 	}
 	else
diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c
index d6d26a2b515..57473f9c001 100644
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -133,7 +133,7 @@ make_restrictinfo_internal(PlannerInfo *root,
 	restrictinfo->can_join = false; /* may get set below */
 	restrictinfo->security_level = security_level;
 	restrictinfo->incompatible_relids = incompatible_relids;
-	restrictinfo->outer_relids = outer_relids;
+	restrictinfo->outer_relids = bms_copy(outer_relids);
 
 	/*
 	 * If it's potentially delayable by lower-level security quals, figure out
-- 
2.39.3 (Apple Git-145)

#8Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Alexander Korotkov (#7)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Sun, Nov 19, 2023 at 6:48 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Wed, Nov 15, 2023 at 5:07 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Wed, Nov 15, 2023 at 8:02 AM Andres Freund <andres@anarazel.de> wrote:

The kinda because there are callers to bms_(add|del)_members() that pass the
same bms as a and b, which only works if the reallocation happens "late".

+1,
Neat idea. I'm willing to work on this. Will propose the patch soon.

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification. I also find it useful to add assert to all
bitmapset functions on argument NodeTag. This allows you to find
access to hanging pointers earlier.

Creating separate patches for REALLOCATE_BITMAPSETs and
Assert(ISA(Bitmapset)) will be easier to review. We will be able to
check whether all the places that require either of the fixes have
been indeed fixed and correctly. I kept switching back and forth.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

I think this will increase memory consumption when planning queries
with partitioned tables (100s or 1000s of partitions). Have you tried
measuring the impact?

We should take hit on memory consumption when there is correctness
involved but not all these cases look correctness problems. For
example. RelOptInfo::left_relids or SpecialJoinInfo::syn_lefthand may
not get modified after they are set. But just because
RelOptInfo::relids of a lower relation was assigned somewhere which
got modified, these two get modified. bms_copy() in
make_specialjoininfo may not be necessary. I haven't tried that myself
so I may be wrong.

What might be useful is to mark a bitmap as "final" once it's know
that it can not change. e.g. RelOptInfo->relids once set never
changes. Each operation that modifies a Bitmapset throws an
error/Asserts if it's marked as "final", thus catching the places
where we expect a Bitmapset being modified when not intended. This
will catch shared bitmapsets as well. We could apply bms_copy in only
those cases then.

--
Best Wishes,
Ashutosh Bapat

#9Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Korotkov (#7)
1 attachment(s)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov <aekorotkov@gmail.com>
wrote:

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification.

+1 to the idea of introducing a reallocation mode to Bitmapset.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

It seems to me that we have always had situations where we share the
same pointer to a Bitmapset structure across different places. I do not
think this is a problem as long as we do not modify the Bitmapsets in a
way that requires reallocation or impact the locations sharing the same
pointer.

So I'm wondering, instead of attempting to avoid sharing pointer to
Bitmapset in all locations that have problems, can we simply bms_copy
the original Bitmapset within replace_relid() before making any
modifications, as I proposed previously? Of course, as Andres pointed
out, we need to do so also for the "Delete relid without substitution"
path. Please see the attached.

Thanks
Richard

Attachments:

v2-0001-Fix-how-SJE-replaces-join-clauses.patchapplication/octet-stream; name=v2-0001-Fix-how-SJE-replaces-join-clauses.patchDownload
From 71aeb11cf5ed42dc6a477ccf897e2be37d1b6132 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Tue, 14 Nov 2023 18:55:24 +0800
Subject: [PATCH v2] Fix how SJE replaces join clauses

---
 src/backend/optimizer/plan/analyzejoins.c | 11 ++++++++---
 src/test/regress/expected/join.out        | 15 +++++++++++++++
 src/test/regress/sql/join.sql             |  6 ++++++
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b9be19a687..75e4ebc64a 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1511,6 +1511,10 @@ replace_varno_walker(Node *node, ReplaceVarnoContext *ctx)
 
 /*
  * Substitute newId by oldId in relids.
+ *
+ * We must make a copy of the original Bitmapset before making any
+ * modifications, because the same pointer to it might be shared among
+ * different places.
  */
 static Bitmapset *
 replace_relid(Relids relids, int oldId, int newId)
@@ -1518,12 +1522,13 @@ replace_relid(Relids relids, int oldId, int newId)
 	if (oldId < 0)
 		return relids;
 
+	/* Delete relid without substitution. */
 	if (newId < 0)
-		/* Delete relid without substitution. */
-		return bms_del_member(relids, oldId);
+		return bms_del_member(bms_copy(relids), oldId);
 
+	/* Substitute newId for oldId. */
 	if (bms_is_member(oldId, relids))
-		return bms_add_member(bms_del_member(relids, oldId), newId);
+		return bms_add_member(bms_del_member(bms_copy(relids), oldId), newId);
 
 	return relids;
 }
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2c73270143..5ceecc4413 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -6836,6 +6836,21 @@ select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
    ->  Function Scan on generate_series t3
 (6 rows)
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+                  QUERY PLAN                  
+----------------------------------------------
+ Nested Loop Left Join
+   Join Filter: ((t2.id > 1) AND (t2.id < 2))
+   ->  Seq Scan on emp1 t2
+         Filter: (id IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on emp1 t3
+(6 rows)
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 8a8a63bd2f..c3ec340e2d 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2606,6 +2606,12 @@ explain (costs off)
 select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
     lateral (select t1.id as t1id, * from generate_series(1,1) t3) s on true;
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
-- 
2.31.0

#10Alexander Korotkov
aekorotkov@gmail.com
In reply to: Richard Guo (#9)
3 attachment(s)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Thu, Nov 23, 2023 at 4:33 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification.

+1 to the idea of introducing a reallocation mode to Bitmapset.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

It seems to me that we have always had situations where we share the
same pointer to a Bitmapset structure across different places. I do not
think this is a problem as long as we do not modify the Bitmapsets in a
way that requires reallocation or impact the locations sharing the same
pointer.

So I'm wondering, instead of attempting to avoid sharing pointer to
Bitmapset in all locations that have problems, can we simply bms_copy
the original Bitmapset within replace_relid() before making any
modifications, as I proposed previously? Of course, as Andres pointed
out, we need to do so also for the "Delete relid without substitution"
path. Please see the attached.

Yes, this makes sense. Thank you for the patch. My initial point was
that replace_relid() should either do in-place in all cases or make a
copy in all cases. Now I see that it should make a copy in all cases.
Note, that without making a copy in delete case, regression tests fail
with REALLOCATE_BITMAPSETS on.

Please, find the revised patchset. As Ashutosh Bapat asked, asserts
are split into separate patch.

------
Regards,
Alexander Korotkov

Attachments:

0003-Make-replace_relid-leave-argument-unmodified-v2.patchapplication/octet-stream; name=0003-Make-replace_relid-leave-argument-unmodified-v2.patchDownload
From c197405fbae7f6b8abe217fbfab8c4882a872229 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 24 Nov 2023 14:57:27 +0200
Subject: [PATCH 3/3] Make replace_relid() leave argument unmodified

There are a lot of situations when we share the same pointer to a Bitmapset
structure across different places.  In order to evade undesirable side effects
replace_relid() function should always return a copy.
---
 src/backend/optimizer/plan/analyzejoins.c | 11 ++++++++---
 src/test/regress/expected/join.out        | 15 +++++++++++++++
 src/test/regress/sql/join.sql             |  6 ++++++
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b9be19a687f..75e4ebc64a2 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1511,6 +1511,10 @@ replace_varno_walker(Node *node, ReplaceVarnoContext *ctx)
 
 /*
  * Substitute newId by oldId in relids.
+ *
+ * We must make a copy of the original Bitmapset before making any
+ * modifications, because the same pointer to it might be shared among
+ * different places.
  */
 static Bitmapset *
 replace_relid(Relids relids, int oldId, int newId)
@@ -1518,12 +1522,13 @@ replace_relid(Relids relids, int oldId, int newId)
 	if (oldId < 0)
 		return relids;
 
+	/* Delete relid without substitution. */
 	if (newId < 0)
-		/* Delete relid without substitution. */
-		return bms_del_member(relids, oldId);
+		return bms_del_member(bms_copy(relids), oldId);
 
+	/* Substitute newId for oldId. */
 	if (bms_is_member(oldId, relids))
-		return bms_add_member(bms_del_member(relids, oldId), newId);
+		return bms_add_member(bms_del_member(bms_copy(relids), oldId), newId);
 
 	return relids;
 }
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 2c73270143b..5ceecc44132 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -6836,6 +6836,21 @@ select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
    ->  Function Scan on generate_series t3
 (6 rows)
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+                  QUERY PLAN                  
+----------------------------------------------
+ Nested Loop Left Join
+   Join Filter: ((t2.id > 1) AND (t2.id < 2))
+   ->  Seq Scan on emp1 t2
+         Filter: (id IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on emp1 t3
+(6 rows)
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 8a8a63bd2f1..c3ec340e2da 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2606,6 +2606,12 @@ explain (costs off)
 select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
     lateral (select t1.id as t1id, * from generate_series(1,1) t3) s on true;
 
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+
 -- We can remove the join even if we find the join can't duplicate rows and
 -- the base quals of each side are different.  In the following case we end up
 -- moving quals over to s1 to make it so it can't match any rows.
-- 
2.39.3 (Apple Git-145)

0001-Add-asserts-to-bimapset-manipulation-functions-v2.patchapplication/octet-stream; name=0001-Add-asserts-to-bimapset-manipulation-functions-v2.patchDownload
From 33b7e3859797150d0d54bd08ae202663d9f754d0 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 24 Nov 2023 15:32:32 +0200
Subject: [PATCH 1/3] Add asserts to bimapset manipulation functions

New asserts validate that arguments are really bitmapsets.  This should help
to early detect accesses to dangling pointers.
---
 src/backend/nodes/bitmapset.c | 77 ++++++++++++++++++++++++++++-------
 1 file changed, 63 insertions(+), 14 deletions(-)

diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 704879f5660..1627922ef7a 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -84,6 +84,7 @@ bms_copy(const Bitmapset *a)
 
 	if (a == NULL)
 		return NULL;
+	Assert(IsA(a, Bitmapset));
 	size = BITMAPSET_SIZE(a->nwords);
 	result = (Bitmapset *) palloc(size);
 	memcpy(result, a, size);
@@ -98,8 +99,8 @@ bms_equal(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -139,8 +140,8 @@ bms_compare(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -215,6 +216,9 @@ bms_union(const Bitmapset *a, const Bitmapset *b)
 	int			otherlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return bms_copy(b);
@@ -253,6 +257,9 @@ bms_intersect(const Bitmapset *a, const Bitmapset *b)
 	int			resultlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL || b == NULL)
 		return NULL;
@@ -299,8 +306,8 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 	Bitmapset  *result;
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -308,6 +315,8 @@ bms_difference(const Bitmapset *a, const Bitmapset *b)
 	if (b == NULL)
 		return bms_copy(a);
 
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/*
 	 * In Postgres' usage, an empty result is a very common case, so it's
 	 * worth optimizing for that by testing bms_nonempty_difference().  This
@@ -364,8 +373,8 @@ bms_is_subset(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -373,6 +382,8 @@ bms_is_subset(const Bitmapset *a, const Bitmapset *b)
 	if (b == NULL)
 		return false;
 
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/* 'a' can't be a subset of 'b' if it contains more words */
 	if (a->nwords > b->nwords)
 		return false;
@@ -399,8 +410,8 @@ bms_subset_compare(const Bitmapset *a, const Bitmapset *b)
 	int			shortlen;
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -411,6 +422,9 @@ bms_subset_compare(const Bitmapset *a, const Bitmapset *b)
 	}
 	if (b == NULL)
 		return BMS_SUBSET2;
+
+	Assert(IsA(a, Bitmapset) && IsA(b, Bitmapset));
+
 	/* Check common words */
 	result = BMS_EQUAL;			/* status so far */
 	shortlen = Min(a->nwords, b->nwords);
@@ -467,6 +481,9 @@ bms_is_member(int x, const Bitmapset *a)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return false;
+
+	Assert(IsA(a, Bitmapset));
+
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 	if (wordnum >= a->nwords)
@@ -495,6 +512,8 @@ bms_member_index(Bitmapset *a, int x)
 	if (!bms_is_member(x, a))
 		return -1;
 
+	Assert(IsA(a, Bitmapset));
+
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
@@ -529,6 +548,9 @@ bms_overlap(const Bitmapset *a, const Bitmapset *b)
 	int			shortlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL || b == NULL)
 		return false;
@@ -553,6 +575,8 @@ bms_overlap_list(const Bitmapset *a, const List *b)
 	int			wordnum,
 				bitnum;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	if (a == NULL || b == NIL)
 		return false;
 
@@ -582,8 +606,8 @@ bms_nonempty_difference(const Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -617,6 +641,9 @@ bms_singleton_member(const Bitmapset *a)
 
 	if (a == NULL)
 		elog(ERROR, "bitmapset is empty");
+
+	Assert(IsA(a, Bitmapset));
+
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -657,6 +684,7 @@ bms_get_singleton_member(const Bitmapset *a, int *member)
 
 	if (a == NULL)
 		return false;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -690,6 +718,7 @@ bms_num_members(const Bitmapset *a)
 
 	if (a == NULL)
 		return 0;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -717,6 +746,7 @@ bms_membership(const Bitmapset *a)
 
 	if (a == NULL)
 		return BMS_EMPTY_SET;
+	Assert(IsA(a, Bitmapset));
 	nwords = a->nwords;
 	wordnum = 0;
 	do
@@ -759,6 +789,7 @@ bms_add_member(Bitmapset *a, int x)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return bms_make_singleton(x);
+	Assert(IsA(a, Bitmapset));
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
@@ -799,6 +830,9 @@ bms_del_member(Bitmapset *a, int x)
 		elog(ERROR, "negative bitmapset member not allowed");
 	if (a == NULL)
 		return NULL;
+
+	Assert(IsA(a, Bitmapset));
+
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
@@ -839,6 +873,9 @@ bms_add_members(Bitmapset *a, const Bitmapset *b)
 	int			otherlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return bms_copy(b);
@@ -884,6 +921,8 @@ bms_add_range(Bitmapset *a, int lower, int upper)
 				ushiftbits,
 				wordnum;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	/* do nothing if nothing is called for, without further checking */
 	if (upper < lower)
 		return a;
@@ -954,6 +993,9 @@ bms_int_members(Bitmapset *a, const Bitmapset *b)
 	int			shortlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return NULL;
@@ -994,8 +1036,8 @@ bms_del_members(Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
 
-	Assert(a == NULL || a->words[a->nwords - 1] != 0);
-	Assert(b == NULL || b->words[b->nwords - 1] != 0);
+	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
+	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
 
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
@@ -1055,6 +1097,9 @@ bms_join(Bitmapset *a, Bitmapset *b)
 	int			otherlen;
 	int			i;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
+
 	/* Handle cases where either input is NULL */
 	if (a == NULL)
 		return b;
@@ -1109,6 +1154,8 @@ bms_next_member(const Bitmapset *a, int prevbit)
 	int			wordnum;
 	bitmapword	mask;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	if (a == NULL)
 		return -2;
 	nwords = a->nwords;
@@ -1168,6 +1215,8 @@ bms_prev_member(const Bitmapset *a, int prevbit)
 	int			ushiftbits;
 	bitmapword	mask;
 
+	Assert(a == NULL || IsA(a, Bitmapset));
+
 	/*
 	 * If set is NULL or if there are no more bits to the right then we've
 	 * nothing to do.
-- 
2.39.3 (Apple Git-145)

0002-REALLOCATE_BITMAPSETS-manual-compile-time-option-v2.patchapplication/octet-stream; name=0002-REALLOCATE_BITMAPSETS-manual-compile-time-option-v2.patchDownload
From a04499b50ac50820184e019b3d329443d55aa39a Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Sun, 19 Nov 2023 01:03:12 +0200
Subject: [PATCH 2/3] REALLOCATE_BITMAPSETS manual compile-time option

This option forces each bitmapset modification to reallocate bitmapset.  This
is useful for debugging hangling pointers to bitmapset's.

Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL+SsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw@mail.gmail.com
---
 src/backend/nodes/bitmapset.c  | 77 ++++++++++++++++++++++++++++++++++
 src/include/pg_config_manual.h |  6 +++
 2 files changed, 83 insertions(+)

diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 1627922ef7a..e13ecaa155d 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -263,6 +263,7 @@ bms_intersect(const Bitmapset *a, const Bitmapset *b)
 	/* Handle cases where either input is NULL */
 	if (a == NULL || b == NULL)
 		return NULL;
+
 	/* Identify shorter and longer input; copy the shorter one */
 	if (a->nwords <= b->nwords)
 	{
@@ -798,8 +799,15 @@ bms_add_member(Bitmapset *a, int x)
 	{
 		int			oldnwords = a->nwords;
 		int			i;
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
 
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(wordnum + 1));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#else
 		a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(wordnum + 1));
+#endif
 		a->nwords = wordnum + 1;
 		/* zero out the enlarged portion */
 		i = oldnwords;
@@ -808,6 +816,16 @@ bms_add_member(Bitmapset *a, int x)
 			a->words[i] = 0;
 		} while (++i < a->nwords);
 	}
+#ifdef REALLOCATE_BITMAPSETS
+	else
+	{
+		Bitmapset  *tmp = a;
+
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+	}
+#endif
 
 	a->words[wordnum] |= ((bitmapword) 1 << bitnum);
 	return a;
@@ -825,6 +843,9 @@ bms_del_member(Bitmapset *a, int x)
 {
 	int			wordnum,
 				bitnum;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
 
 	if (x < 0)
 		elog(ERROR, "negative bitmapset member not allowed");
@@ -836,6 +857,12 @@ bms_del_member(Bitmapset *a, int x)
 	wordnum = WORDNUM(x);
 	bitnum = BITNUM(x);
 
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* member can't exist.  Return 'a' unmodified */
 	if (unlikely(wordnum >= a->nwords))
 		return a;
@@ -889,6 +916,13 @@ bms_add_members(Bitmapset *a, const Bitmapset *b)
 	}
 	else
 	{
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
+
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#endif
 		result = a;
 		other = b;
 	}
@@ -941,9 +975,16 @@ bms_add_range(Bitmapset *a, int lower, int upper)
 	{
 		int			oldnwords = a->nwords;
 		int			i;
+#ifdef REALLOCATE_BITMAPSETS
+		Bitmapset  *tmp = a;
 
+		a = (Bitmapset *) palloc(BITMAPSET_SIZE(uwordnum + 1));
+		memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+		pfree(tmp);
+#else
 		/* ensure we have enough words to store the upper bit */
 		a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(uwordnum + 1));
+#endif
 		a->nwords = uwordnum + 1;
 		/* zero out the enlarged portion */
 		i = oldnwords;
@@ -992,6 +1033,12 @@ bms_int_members(Bitmapset *a, const Bitmapset *b)
 	int			lastnonzero;
 	int			shortlen;
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
+
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
 
 	Assert(a == NULL || IsA(a, Bitmapset));
 	Assert(b == NULL || IsA(b, Bitmapset));
@@ -1004,6 +1051,13 @@ bms_int_members(Bitmapset *a, const Bitmapset *b)
 		pfree(a);
 		return NULL;
 	}
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Intersect b into a; we need never copy */
 	shortlen = Min(a->nwords, b->nwords);
 	lastnonzero = -1;
@@ -1035,6 +1089,9 @@ Bitmapset *
 bms_del_members(Bitmapset *a, const Bitmapset *b)
 {
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
 
 	Assert(a == NULL || (IsA(a, Bitmapset) && a->words[a->nwords - 1] != 0));
 	Assert(b == NULL || (IsA(b, Bitmapset) && b->words[b->nwords - 1] != 0));
@@ -1044,6 +1101,13 @@ bms_del_members(Bitmapset *a, const Bitmapset *b)
 		return NULL;
 	if (b == NULL)
 		return a;
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Remove b's bits from a; we need never copy */
 	if (a->nwords > b->nwords)
 	{
@@ -1096,6 +1160,12 @@ bms_join(Bitmapset *a, Bitmapset *b)
 	Bitmapset  *other;
 	int			otherlen;
 	int			i;
+#ifdef REALLOCATE_BITMAPSETS
+	Bitmapset  *tmp = a;
+#endif
+
+	Assert(a == NULL || IsA(a, Bitmapset));
+	Assert(b == NULL || IsA(b, Bitmapset));
 
 	Assert(a == NULL || IsA(a, Bitmapset));
 	Assert(b == NULL || IsA(b, Bitmapset));
@@ -1105,6 +1175,13 @@ bms_join(Bitmapset *a, Bitmapset *b)
 		return b;
 	if (b == NULL)
 		return a;
+
+#ifdef REALLOCATE_BITMAPSETS
+	a = (Bitmapset *) palloc(BITMAPSET_SIZE(tmp->nwords));
+	memcpy(a, tmp, BITMAPSET_SIZE(tmp->nwords));
+	pfree(tmp);
+#endif
+
 	/* Identify shorter and longer input; use longer one as result */
 	if (a->nwords < b->nwords)
 	{
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 8a6e67a445d..16c383ba7f7 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -335,6 +335,12 @@
  */
 /* #define COPY_PARSE_PLAN_TREES */
 
+/*
+ * Define this to force Bitmapset reallocation on each modification.  Helps
+ * to find hangling pointers to Bitmapset's.
+ */
+/* #define REALLOCATE_BITMAPSETS */
+
 /*
  * Define this to force all parse and plan trees to be passed through
  * outfuncs.c/readfuncs.c, to facilitate catching errors and omissions in
-- 
2.39.3 (Apple Git-145)

#11Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#10)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Fri, Nov 24, 2023 at 3:54 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Thu, Nov 23, 2023 at 4:33 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification.

+1 to the idea of introducing a reallocation mode to Bitmapset.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

It seems to me that we have always had situations where we share the
same pointer to a Bitmapset structure across different places. I do not
think this is a problem as long as we do not modify the Bitmapsets in a
way that requires reallocation or impact the locations sharing the same
pointer.

So I'm wondering, instead of attempting to avoid sharing pointer to
Bitmapset in all locations that have problems, can we simply bms_copy
the original Bitmapset within replace_relid() before making any
modifications, as I proposed previously? Of course, as Andres pointed
out, we need to do so also for the "Delete relid without substitution"
path. Please see the attached.

Yes, this makes sense. Thank you for the patch. My initial point was
that replace_relid() should either do in-place in all cases or make a
copy in all cases. Now I see that it should make a copy in all cases.
Note, that without making a copy in delete case, regression tests fail
with REALLOCATE_BITMAPSETS on.

Please, find the revised patchset. As Ashutosh Bapat asked, asserts
are split into separate patch.

Any objections to pushing this?

------
Regards,
Alexander Korotkov

#12Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Alexander Korotkov (#11)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Mon, Nov 27, 2023 at 6:35 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Fri, Nov 24, 2023 at 3:54 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Thu, Nov 23, 2023 at 4:33 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

It's here. New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
each modification.

+1 to the idea of introducing a reallocation mode to Bitmapset.

I had the feeling of falling into a rabbit hole while debugging all
the cases of failure with this new option. With the second patch
regressions tests pass.

It seems to me that we have always had situations where we share the
same pointer to a Bitmapset structure across different places. I do not
think this is a problem as long as we do not modify the Bitmapsets in a
way that requires reallocation or impact the locations sharing the same
pointer.

So I'm wondering, instead of attempting to avoid sharing pointer to
Bitmapset in all locations that have problems, can we simply bms_copy
the original Bitmapset within replace_relid() before making any
modifications, as I proposed previously? Of course, as Andres pointed
out, we need to do so also for the "Delete relid without substitution"
path. Please see the attached.

Yes, this makes sense. Thank you for the patch. My initial point was
that replace_relid() should either do in-place in all cases or make a
copy in all cases. Now I see that it should make a copy in all cases.
Note, that without making a copy in delete case, regression tests fail
with REALLOCATE_BITMAPSETS on.

Please, find the revised patchset. As Ashutosh Bapat asked, asserts
are split into separate patch.

Any objections to pushing this?

Did we at least measure the memory impact?

How do we ensure that we are not making unnecessary copies of Bitmapsets?

--
Best Wishes,
Ashutosh Bapat

#13Andres Freund
andres@anarazel.de
In reply to: Ashutosh Bapat (#12)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On 2023-11-27 11:29:48 +0530, Ashutosh Bapat wrote:

How do we ensure that we are not making unnecessary copies of Bitmapsets?

We don't - but that's not specific to this patch. Bitmapsets typically aren't
very large, I doubt that it's a significant proportion of the memory
usage. Adding refcounts or such would likely add more overhead than it'd save,
both in time and memory.

I am a bit worried about the maintainability of remove_rel_from_query() et
al. Is there any infrastructure for detecting that some PlannerInfo field that
needs updating wasn't updated? There's not even a note in PlannerInfo that
documents that that needs to happen.

#14Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#13)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Mon, Nov 27, 2023 at 8:07 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-11-27 11:29:48 +0530, Ashutosh Bapat wrote:

How do we ensure that we are not making unnecessary copies of Bitmapsets?

We don't - but that's not specific to this patch. Bitmapsets typically aren't
very large, I doubt that it's a significant proportion of the memory
usage. Adding refcounts or such would likely add more overhead than it'd save,
both in time and memory.

I am a bit worried about the maintainability of remove_rel_from_query() et
al. Is there any infrastructure for detecting that some PlannerInfo field that
needs updating wasn't updated? There's not even a note in PlannerInfo that
documents that that needs to happen.

That makes sense, thank you. We need at least a comment about this.
I'll write a patch adding this comment.

BTW, what do you think about the patches upthread [1].

Links
1. /messages/by-id/CAPpHfdtLgCryACcrmLv=Koq9rAB3=tr5y9D84dGgvUhSCvjzjg@mail.gmail.com

------
Regards,
Alexander Korotkov

#15Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Alexander Korotkov (#14)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On 28/11/2023 01:37, Alexander Korotkov wrote:

On Mon, Nov 27, 2023 at 8:07 PM Andres Freund <andres@anarazel.de> wrote:

Sorry for the late answer, I missed this thread because of vacation.

On 2023-11-27 11:29:48 +0530, Ashutosh Bapat wrote:

How do we ensure that we are not making unnecessary copies of Bitmapsets?

We don't - but that's not specific to this patch. Bitmapsets typically aren't
very large, I doubt that it's a significant proportion of the memory
usage. Adding refcounts or such would likely add more overhead than it'd save,
both in time and memory.

I'd already clashed with Tom on copying the required_relids field and
voluntarily made unnecessary copies in the project [1]Asymmetric partition-wise JOIN /messages/by-id/CAOP8fzaVL_2SCJayLL9kj5pCA46PJOXXjuei6-3aFUV45j4LJQ@mail.gmail.com.
And ... stuck into huge memory consumption. The reason was in Bitmapsets:
When we have 1E3-1E4 partitions and try to reparameterize a join, one
bitmapset field can have a size of about 1kB. Having bitmapset
referencing Relation with a large index value, we had a lot of (for
example, 1E4 * 1kB) copies on each reparametrization of such a field.
Alexander Pyhalov should remember that case.
I don't claim we will certainly catch such an issue here, but it is a
reason why we should look at this option carefully.

I am a bit worried about the maintainability of remove_rel_from_query() et
al. Is there any infrastructure for detecting that some PlannerInfo field that
needs updating wasn't updated? There's not even a note in PlannerInfo that
documents that that needs to happen.

Thanks you for highlighting this issue.> That makes sense, thank you.
We need at least a comment about this.

I'll write a patch adding this comment.

BTW, what do you think about the patches upthread [1].

Links
1. /messages/by-id/CAPpHfdtLgCryACcrmLv=Koq9rAB3=tr5y9D84dGgvUhSCvjzjg@mail.gmail.com

0001 - Looks good and can be applied.
0002 - I am afraid the problems with expanded range table entries are
likewise described above. The patch makes sense, but it requires time to
reproduce corner cases. Maybe we can do it separately from the current
hotfix?
0003 - I think it is really what we need right now: SJE is quite a rare
optimization and executes before the entries expansion procedure. So it
looks less risky.

[1]: Asymmetric partition-wise JOIN /messages/by-id/CAOP8fzaVL_2SCJayLL9kj5pCA46PJOXXjuei6-3aFUV45j4LJQ@mail.gmail.com
/messages/by-id/CAOP8fzaVL_2SCJayLL9kj5pCA46PJOXXjuei6-3aFUV45j4LJQ@mail.gmail.com

--
regards,
Andrei Lepikhov
Postgres Professional

#16Alexander Pyhalov
a.pyhalov@postgrespro.ru
In reply to: Andrei Lepikhov (#15)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Andrei Lepikhov писал(а) 2023-12-08 07:37:

On 28/11/2023 01:37, Alexander Korotkov wrote:

On Mon, Nov 27, 2023 at 8:07 PM Andres Freund <andres@anarazel.de>
wrote:

Sorry for the late answer, I missed this thread because of vacation.

On 2023-11-27 11:29:48 +0530, Ashutosh Bapat wrote:

How do we ensure that we are not making unnecessary copies of
Bitmapsets?

We don't - but that's not specific to this patch. Bitmapsets
typically aren't
very large, I doubt that it's a significant proportion of the memory
usage. Adding refcounts or such would likely add more overhead than
it'd save,
both in time and memory.

I'd already clashed with Tom on copying the required_relids field and
voluntarily made unnecessary copies in the project [1].
And ... stuck into huge memory consumption. The reason was in
Bitmapsets:
When we have 1E3-1E4 partitions and try to reparameterize a join, one
bitmapset field can have a size of about 1kB. Having bitmapset
referencing Relation with a large index value, we had a lot of (for
example, 1E4 * 1kB) copies on each reparametrization of such a field.
Alexander Pyhalov should remember that case.

Yes. If it matters, this happened during reparametrization when 2
partitioned tables with 1000 partitions each were joined. Then
asymmetric pw join managed to eat lots of memory for bitmapsets (by
lots of memory I mean all available on the test VM).

[1] Asymmetric partition-wise JOIN
/messages/by-id/CAOP8fzaVL_2SCJayLL9kj5pCA46PJOXXjuei6-3aFUV45j4LJQ@mail.gmail.com

--
Best regards,
Alexander Pyhalov,
Postgres Professional

#17Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Alexander Pyhalov (#16)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Fri, Dec 8, 2023 at 12:43 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:

Andrei Lepikhov писал(а) 2023-12-08 07:37:

On 28/11/2023 01:37, Alexander Korotkov wrote:

On Mon, Nov 27, 2023 at 8:07 PM Andres Freund <andres@anarazel.de>
wrote:

Sorry for the late answer, I missed this thread because of vacation.

On 2023-11-27 11:29:48 +0530, Ashutosh Bapat wrote:

How do we ensure that we are not making unnecessary copies of
Bitmapsets?

We don't - but that's not specific to this patch. Bitmapsets
typically aren't
very large, I doubt that it's a significant proportion of the memory
usage. Adding refcounts or such would likely add more overhead than
it'd save,
both in time and memory.

I'd already clashed with Tom on copying the required_relids field and
voluntarily made unnecessary copies in the project [1].
And ... stuck into huge memory consumption. The reason was in
Bitmapsets:
When we have 1E3-1E4 partitions and try to reparameterize a join, one
bitmapset field can have a size of about 1kB. Having bitmapset
referencing Relation with a large index value, we had a lot of (for
example, 1E4 * 1kB) copies on each reparametrization of such a field.
Alexander Pyhalov should remember that case.

Yes. If it matters, this happened during reparametrization when 2
partitioned tables with 1000 partitions each were joined. Then
asymmetric pw join managed to eat lots of memory for bitmapsets (by
lots of memory I mean all available on the test VM).

I did some analysis of memory consumption by bitmapsets in such cases.
[1]: https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing
crude and quite WIP. But they will give some idea.

[1]: https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing

--
Best Wishes,
Ashutosh Bapat

#18Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ashutosh Bapat (#17)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi, Ashutosh!

On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

I did some analysis of memory consumption by bitmapsets in such cases.
[1] contains slides with the result of this analysis. The slides are
crude and quite WIP. But they will give some idea.

[1] https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing

Thank you for sharing your analysis. I understand that usage of a
plain bitmap becomes a problem with a large number of partitions. But
I wonder what does "post proposed fixes" mean? Is it the fixes posted
in [1]. If so it's very surprising for me they are reducing the
memory footprint size.

Links.
1. /messages/by-id/CAPpHfdtLgCryACcrmLv=Koq9rAB3=tr5y9D84dGgvUhSCvjzjg@mail.gmail.com

------
Regards,
Alexander Korotkov

#19Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Pyhalov (#16)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Fri, Dec 8, 2023 at 3:13 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru>
wrote:

Andrei Lepikhov писал(а) 2023-12-08 07:37:

I'd already clashed with Tom on copying the required_relids field and
voluntarily made unnecessary copies in the project [1].
And ... stuck into huge memory consumption. The reason was in
Bitmapsets:
When we have 1E3-1E4 partitions and try to reparameterize a join, one
bitmapset field can have a size of about 1kB. Having bitmapset
referencing Relation with a large index value, we had a lot of (for
example, 1E4 * 1kB) copies on each reparametrization of such a field.
Alexander Pyhalov should remember that case.

Yes. If it matters, this happened during reparametrization when 2
partitioned tables with 1000 partitions each were joined. Then
asymmetric pw join managed to eat lots of memory for bitmapsets (by
lots of memory I mean all available on the test VM).

By reparametrization did you mean the work done in
reparameterize_path_by_child()? If so maybe you'd be interested in the
patch [1]/messages/by-id/CAMbWs48PBwe1YadzgKGW_ES=V9BZhq00BaZTOTM6Oye8n_cDNg@mail.gmail.com which postpones reparameterization of paths until createplan.c
and thus can help avoid unnecessary reparametrization work.

[1]: /messages/by-id/CAMbWs48PBwe1YadzgKGW_ES=V9BZhq00BaZTOTM6Oye8n_cDNg@mail.gmail.com
/messages/by-id/CAMbWs48PBwe1YadzgKGW_ES=V9BZhq00BaZTOTM6Oye8n_cDNg@mail.gmail.com

Thanks
Richard

#20Andrei Lepikhov
a.lepikhov@postgrespro.ru
In reply to: Richard Guo (#19)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On 11/12/2023 09:31, Richard Guo wrote:

On Fri, Dec 8, 2023 at 3:13 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru <mailto:a.pyhalov@postgrespro.ru>> wrote:
Andrei Lepikhov писал(а) 2023-12-08 07:37:

I'd already clashed with Tom on copying the required_relids field

and

voluntarily made unnecessary copies in the project [1].
And ... stuck into huge memory consumption. The reason was in
Bitmapsets:
When we have 1E3-1E4 partitions and try to reparameterize a join,

one

bitmapset field can have a size of about 1kB. Having bitmapset
referencing Relation with a large index value, we had a lot of (for
example, 1E4 * 1kB) copies on each reparametrization of such a

field.

Alexander Pyhalov should remember that case.

Yes. If it matters, this happened during reparametrization when 2
partitioned tables with 1000 partitions each were joined. Then
asymmetric  pw join managed to eat lots of memory for bitmapsets (by
lots of memory I mean all available on the test VM).
By reparametrization did you mean the work done in
reparameterize_path_by_child()?  If so maybe you'd be interested in the
patch [1] which postpones reparameterization of paths until createplan.c
and thus can help avoid unnecessary reparametrization work.

Yeah, I have discovered it already. It is a promising solution and only
needs a bit more review. But here, I embraced some corner cases with the
idea that we may not see other cases right now. And also, sometimes the
Bitmapset field is significant - it is not a corner case.

--
regards,
Andrei Lepikhov
Postgres Professional

#21Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Alexander Korotkov (#18)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Fri, Dec 8, 2023 at 11:24 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

Hi, Ashutosh!

On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

I did some analysis of memory consumption by bitmapsets in such cases.
[1] contains slides with the result of this analysis. The slides are
crude and quite WIP. But they will give some idea.

[1] https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing

Thank you for sharing your analysis. I understand that usage of a
plain bitmap becomes a problem with a large number of partitions. But
I wonder what does "post proposed fixes" mean? Is it the fixes posted
in [1]. If so it's very surprising for me they are reducing the
memory footprint size.

No. These are fixes in various threads all listed together in [1]/messages/by-id/CAExHW5s_KwB0Rb9L3TuRJxsvO5UCtEpdskkAeMb5X1EtssMjgg@mail.gmail.com. I
had started investigating memory consumption by Bitmapsets around the
same time. The slides are result of that investigation. I have updated
slides with this reference.

[1]: /messages/by-id/CAExHW5s_KwB0Rb9L3TuRJxsvO5UCtEpdskkAeMb5X1EtssMjgg@mail.gmail.com

They reduce the memory footprint by Bitmapset because they reduce the
objects that contain the bitmapsets, thus reducing the total number of
bitmapsets produced.

--
Best Wishes,
Ashutosh Bapat

#22Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ashutosh Bapat (#21)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi!

On Mon, Dec 11, 2023 at 3:25 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

On Fri, Dec 8, 2023 at 11:24 PM Alexander Korotkov <aekorotkov@gmail.com>
wrote:

On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

I did some analysis of memory consumption by bitmapsets in such cases.
[1] contains slides with the result of this analysis. The slides are
crude and quite WIP. But they will give some idea.

[1]

https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing

Thank you for sharing your analysis. I understand that usage of a
plain bitmap becomes a problem with a large number of partitions. But
I wonder what does "post proposed fixes" mean? Is it the fixes posted
in [1]. If so it's very surprising for me they are reducing the
memory footprint size.

No. These are fixes in various threads all listed together in [1]. I
had started investigating memory consumption by Bitmapsets around the
same time. The slides are result of that investigation. I have updated
slides with this reference.

[1]
/messages/by-id/CAExHW5s_KwB0Rb9L3TuRJxsvO5UCtEpdskkAeMb5X1EtssMjgg@mail.gmail.com

They reduce the memory footprint by Bitmapset because they reduce the
objects that contain the bitmapsets, thus reducing the total number of
bitmapsets produced.

Thank you Ashutosh for your work on this matter. With a large number of
partitions, it definitely makes sense to reduce both Bitmapset's size as
well as the number of Bitmapsets.

I've checked the patchset [1] with your test suite to check the memory
consumption. The results are in the table below.

query | no patch | patch | no self-join
removal
----------------------------------------------------------------------------------
2-way join, non partitioned | 14792 | 15208 | 29152
2-way join, no partitionwise join | 19519576 | 19519576 | 19519576
2-way join, partitionwise join | 40851968 | 40851968 | 40851968
3-way join, non partitioned | 20632 | 21784 | 79376
3-way join, no partitionwise join | 45227224 | 45227224 | 45227224
3-way join, partitionwise join | 151655144 | 151655144 | 151655144
4-way join, non partitioned | 25816 | 27736 | 209128
4-way join, no partitionwise join | 83540712 | 83540712 | 83540712
4-way join, partitionwise join | 463960088 | 463960088 | 463960088
5-way join, non partitioned | 31000 | 33720 | 562552
5-way join, no partitionwise join | 149284376 | 149284376 | 149284376
5-way join, partitionwise join | 1663896608 | 1663896608 | 1663896608

The most noticeable thing for me is that self-join removal doesn't work
with partitioned tables. I think this is the direction for future work on
this subject. In non-partitioned cases, patchset gives a small memory
overhead. However, the memory consumption is still much less than it is
without the self-join removal. So, removing the join still lowers memory
consumption even if it copies some Bitmapsets. Given that patchset [1] is
required for the correctness of memory manipulations in Bitmapsets during
join removals, I'm going to push it if there are no objections.

Links.
1.
/messages/by-id/CAPpHfdtLgCryACcrmLv=Koq9rAB3=tr5y9D84dGgvUhSCvjzjg@mail.gmail.com

------
Regards,
Alexander Korotkov

#23Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Alexander Korotkov (#22)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

Hi Alexander,

On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

Thank you Ashutosh for your work on this matter. With a large number of partitions, it definitely makes sense to reduce both Bitmapset's size as well as the number of Bitmapsets.

I've checked the patchset [1] with your test suite to check the memory consumption. The results are in the table below.

query | no patch | patch | no self-join removal
----------------------------------------------------------------------------------
2-way join, non partitioned | 14792 | 15208 | 29152
2-way join, no partitionwise join | 19519576 | 19519576 | 19519576
2-way join, partitionwise join | 40851968 | 40851968 | 40851968
3-way join, non partitioned | 20632 | 21784 | 79376
3-way join, no partitionwise join | 45227224 | 45227224 | 45227224
3-way join, partitionwise join | 151655144 | 151655144 | 151655144
4-way join, non partitioned | 25816 | 27736 | 209128
4-way join, no partitionwise join | 83540712 | 83540712 | 83540712
4-way join, partitionwise join | 463960088 | 463960088 | 463960088
5-way join, non partitioned | 31000 | 33720 | 562552
5-way join, no partitionwise join | 149284376 | 149284376 | 149284376
5-way join, partitionwise join | 1663896608 | 1663896608 | 1663896608

The most noticeable thing for me is that self-join removal doesn't work with partitioned tables. I think this is the direction for future work on this subject. In non-partitioned cases, patchset gives a small memory overhead. However, the memory consumption is still much less than it is without the self-join removal. So, removing the join still lowers memory consumption even if it copies some Bitmapsets. Given that patchset [1] is required for the correctness of memory manipulations in Bitmapsets during join removals, I'm going to push it if there are no objections.

I am missing the link between this work and the self join work. Can
you please provide me relevant pointers?

--
Best Wishes,
Ashutosh Bapat

#24Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ashutosh Bapat (#23)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Mon, Dec 25, 2023 at 2:56 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

On Sun, Dec 24, 2023 at 5:32 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

Thank you Ashutosh for your work on this matter. With a large number of partitions, it definitely makes sense to reduce both Bitmapset's size as well as the number of Bitmapsets.

I've checked the patchset [1] with your test suite to check the memory consumption. The results are in the table below.

query | no patch | patch | no self-join removal
----------------------------------------------------------------------------------
2-way join, non partitioned | 14792 | 15208 | 29152
2-way join, no partitionwise join | 19519576 | 19519576 | 19519576
2-way join, partitionwise join | 40851968 | 40851968 | 40851968
3-way join, non partitioned | 20632 | 21784 | 79376
3-way join, no partitionwise join | 45227224 | 45227224 | 45227224
3-way join, partitionwise join | 151655144 | 151655144 | 151655144
4-way join, non partitioned | 25816 | 27736 | 209128
4-way join, no partitionwise join | 83540712 | 83540712 | 83540712
4-way join, partitionwise join | 463960088 | 463960088 | 463960088
5-way join, non partitioned | 31000 | 33720 | 562552
5-way join, no partitionwise join | 149284376 | 149284376 | 149284376
5-way join, partitionwise join | 1663896608 | 1663896608 | 1663896608

The most noticeable thing for me is that self-join removal doesn't work with partitioned tables. I think this is the direction for future work on this subject. In non-partitioned cases, patchset gives a small memory overhead. However, the memory consumption is still much less than it is without the self-join removal. So, removing the join still lowers memory consumption even if it copies some Bitmapsets. Given that patchset [1] is required for the correctness of memory manipulations in Bitmapsets during join removals, I'm going to push it if there are no objections.

I am missing the link between this work and the self join work. Can
you please provide me relevant pointers?

This thread was started from the bug in self-join removal [1]. The
fix under consideration [2] makes replace_relid() leave the argument
unmodified. I've used your test set [3] to check the memory overhead
of this solution.

Links.
1. /messages/by-id/CAMbWs4_wJthNtYBL+SsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw@mail.gmail.com
2. /messages/by-id/CAPpHfdtLgCryACcrmLv=Koq9rAB3=tr5y9D84dGgvUhSCvjzjg@mail.gmail.com
3. /messages/by-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com

------
Regards,
Alexander Korotkov

#25Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#22)
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

On Sun, Dec 24, 2023 at 2:02 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

The most noticeable thing for me is that self-join removal doesn't work with partitioned tables. I think this is the direction for future work on this subject. In non-partitioned cases, patchset gives a small memory overhead. However, the memory consumption is still much less than it is without the self-join removal. So, removing the join still lowers memory consumption even if it copies some Bitmapsets. Given that patchset [1] is required for the correctness of memory manipulations in Bitmapsets during join removals, I'm going to push it if there are no objections.

Pushed!

------
Regards,
Alexander Korotkov