Using each rel as both outer and inner for JOIN_ANTI

Started by Richard Guoover 4 years ago32 messages

guofenglinux@gmail.com

over 4 years ago

Hi hackers,

We may have anti-joins in several cases. Sublinks of 'NOT EXISTS' may be
pulled up as anti-joins. Left joins whose join quals are strict for any
nullable var that is forced null by higher qual levels will also be
reduced to anti-joins. So anti-joins are very commonly used in practice.

Currently when populating anti-join with paths, we do not try to swap
the outer and inner to get both paths. That may make us miss some
cheaper paths.

# insert into foo select i, i from generate_series(1,10)i;
INSERT 0 10

# insert into bar select i, i from generate_series(1,5000000)i;
INSERT 0 5000000

# explain select * from foo left join bar on foo.a = bar.c where bar.c is
null;
QUERY PLAN
-------------------------------------------------------------------------
Hash Anti Join (cost=154156.00..173691.19 rows=1 width=16)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=8)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=8)
(5 rows)

I believe if we use the smaller table 'foo' as inner side for this
query, we would have a cheaper plan.

So I'm wondering whether it's worthwhile to use each rel as both outer
and inner for anti-joins, maybe by inventing a JOIN_REVERSE_ANTI join
type.

Thanks
Richard

Heikki Linnakangas

hlinnaka@iki.fi

over 4 years ago

In reply to: Richard Guo (#1)

Re: Using each rel as both outer and inner for JOIN_ANTI

On 24/06/2021 12:50, Richard Guo wrote:

Hi hackers,

We may have anti-joins in several cases. Sublinks of 'NOT EXISTS' may be
pulled up as anti-joins. Left joins whose join quals are strict for any
nullable var that is forced null by higher qual levels will also be
reduced to anti-joins. So anti-joins are very commonly used in practice.

Currently when populating anti-join with paths, we do not try to swap
the outer and inner to get both paths. That may make us miss some
cheaper paths.

# insert into foo select i, i from generate_series(1,10)i;
INSERT 0 10

# insert into bar select i, i from generate_series(1,5000000)i;
INSERT 0 5000000

# explain select * from foo left join bar on foo.a = bar.c where bar.c
is null;
QUERY PLAN
-------------------------------------------------------------------------
Hash Anti Join (cost=154156.00..173691.19 rows=1 width=16)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=8)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=8)
(5 rows)

I believe if we use the smaller table 'foo' as inner side for this
query, we would have a cheaper plan.

How would that work?

- Heikki

Tom Lane

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Heikki Linnakangas (#2)

Re: Using each rel as both outer and inner for JOIN_ANTI

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 24/06/2021 12:50, Richard Guo wrote:

I believe if we use the smaller table 'foo' as inner side for this
query, we would have a cheaper plan.

How would that work?

I think you could make it work for the hash-join case by extending
the existing mechanism for right joins: emit nothing during the main
scan, but mark hashtable entries when a match is found. Then make
a post-pass and emit hash entries that never found a match. So
basically just the inverse behavior of a right join, but with the
same state info.

Merge join could likely support "right anti join" too, though the
benefit of swapping inputs tends to be small there, so it may not be
worth doing.

Obviously there's a pretty fair amount of code to write to make this
happen.

regards, tom lane

Richard Guo

guofenglinux@gmail.com

over 4 years ago

In reply to: Tom Lane (#3)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Jun 24, 2021 at 10:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 24/06/2021 12:50, Richard Guo wrote:

I believe if we use the smaller table 'foo' as inner side for this
query, we would have a cheaper plan.

How would that work?

I think you could make it work for the hash-join case by extending
the existing mechanism for right joins: emit nothing during the main
scan, but mark hashtable entries when a match is found. Then make
a post-pass and emit hash entries that never found a match. So
basically just the inverse behavior of a right join, but with the
same state info.

Merge join could likely support "right anti join" too, though the
benefit of swapping inputs tends to be small there, so it may not be
worth doing.

Obviously there's a pretty fair amount of code to write to make this
happen.

Thanks for the explanation. Attached is a demo code for the hash-join
case, which is only for PoC to show how we can make it work. It's far
from complete, at least we need to adjust the cost calculation for this
'right anti join'.

Am I going in the right direction?

Thanks
Richard

Attachments:

0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchapplication/octet-stream; name=0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchDownload

From 3590a10e5a891fd93e83318c5b0082eae1d98333 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Thu, 24 Jun 2021 16:58:44 +0800
Subject: [PATCH] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c        | 3 +++
 src/backend/executor/nodeHashjoin.c   | 4 +++-
 src/backend/optimizer/path/joinpath.c | 7 +++++++
 src/backend/optimizer/path/joinrels.c | 3 +++
 src/include/nodes/nodes.h             | 4 +++-
 5 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e81b990092..a9f6ebf1f3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1517,6 +1517,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 510bdd39ad..0e901c9d53 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -476,7 +476,8 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					}
 
 					/* In an antijoin, we never return a matched tuple */
-					if (node->js.jointype == JOIN_ANTI)
+					if (node->js.jointype == JOIN_ANTI ||
+							node->js.jointype == JOIN_RIGHT_ANTI)
 					{
 						node->hj_JoinState = HJ_NEED_NEW_OUTER;
 						continue;
@@ -694,6 +695,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index b67b517770..e91293ff54 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -1110,6 +1110,9 @@ sort_inner_and_outer(PlannerInfo *root,
 	List	   *all_pathkeys;
 	ListCell   *l;
 
+	if (jointype == JOIN_RIGHT_ANTI)
+		return;
+
 	/*
 	 * We only consider the cheapest-total-cost input paths, since we are
 	 * assuming here that a sort is required.  We will consider
@@ -1559,6 +1562,9 @@ match_unsorted_outer(PlannerInfo *root,
 	Path	   *matpath = NULL;
 	ListCell   *lc1;
 
+	if (jointype == JOIN_RIGHT_ANTI)
+		return;
+
 	/*
 	 * Nestloop only supports inner, left, semi, and anti joins.  Also, if we
 	 * are doing a right or full mergejoin, we must use *all* the mergeclauses
@@ -2100,6 +2106,7 @@ hash_inner_and_outer(PlannerInfo *root,
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 0dbe2ac726..194f820c2f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -915,6 +915,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..7c07792e56 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -725,6 +725,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -757,7 +758,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
-- 
2.31.0

Emre Hasegeli

emre@hasegeli.com

over 4 years ago

In reply to: Richard Guo (#4)

Re: Using each rel as both outer and inner for JOIN_ANTI

Thanks for the explanation. Attached is a demo code for the hash-join
case, which is only for PoC to show how we can make it work. It's far
from complete, at least we need to adjust the cost calculation for this
'right anti join'.

I applied the patch and executed some queries. Hash Right Anti Joins
seem to be working correctly. Though, some of the tests are failing.
I guessed it's because the other join algorithms do not support right
anti join, but I couldn't reproduce it.

I am impressed by how simple the patch is: only 2 lines to support a
new join algorithm. This is a good case for the quality of Postgres
code. I hope supporting the other join algorithms would be similar.

I am not sure how the cost estimation should differ from straight anti
join. It seems to me that the planner is already making the right
choice by taking into account the cost of the Hash node which makes
the whole cost greater when the inner table is much bigger.

I am not an expert planner, but it feels to me like a good feature
that can provide better plans in some cases. Given it works correctly
and the implementation is so simple, the only argument against it may
be increased planning time. I know that the planner performance is
affected negatively by the number of join paths to consider. This may
not be a bigger deal as typically there are not many anti joins in a
query, but it'd still be a good idea to do some performance tests.

Richard Guo

guofenglinux@gmail.com

over 4 years ago

In reply to: Emre Hasegeli (#5)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli <emre@hasegeli.com> wrote:

Thanks for the explanation. Attached is a demo code for the hash-join
case, which is only for PoC to show how we can make it work. It's far
from complete, at least we need to adjust the cost calculation for this
'right anti join'.

I applied the patch and executed some queries. Hash Right Anti Joins
seem to be working correctly. Though, some of the tests are failing.
I guessed it's because the other join algorithms do not support right
anti join, but I couldn't reproduce it.

Thanks for verifying this patch.

I am impressed by how simple the patch is: only 2 lines to support a
new join algorithm. This is a good case for the quality of Postgres
code. I hope supporting the other join algorithms would be similar.

Yes, thanks to the excellent design pattern of the execution codes, we
only need very few changes to support this new join type.

I am not sure how the cost estimation should differ from straight anti
join. It seems to me that the planner is already making the right
choice by taking into account the cost of the Hash node which makes
the whole cost greater when the inner table is much bigger.

I think we can basically use the same cost calculation as with anti
joins, since they share the fact that the executor will stop after the
first match. However, there are still some differences. Such as when we
consider the number of tuples that will pass the basic join, I think we
need to use unmatched inner rows, rather than unmatched outer rows.

I am not an expert planner, but it feels to me like a good feature
that can provide better plans in some cases. Given it works correctly
and the implementation is so simple, the only argument against it may
be increased planning time. I know that the planner performance is
affected negatively by the number of join paths to consider. This may
not be a bigger deal as typically there are not many anti joins in a
query, but it'd still be a good idea to do some performance tests.

Agree. Performance tests are necessary if we consider finishing this
patch.

Thanks
Richard

Ronan Dunklau

ronan.dunklau@aiven.io

over 4 years ago

In reply to: Richard Guo (#6)

Re: Using each rel as both outer and inner for JOIN_ANTI

Le mardi 29 juin 2021, 10:55:59 CEST Richard Guo a écrit :

On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli <emre@hasegeli.com> wrote:

Thanks for the explanation. Attached is a demo code for the hash-join
case, which is only for PoC to show how we can make it work. It's far
from complete, at least we need to adjust the cost calculation for this
'right anti join'.

I applied the patch and executed some queries. Hash Right Anti Joins
seem to be working correctly. Though, some of the tests are failing.
I guessed it's because the other join algorithms do not support right
anti join, but I couldn't reproduce it.

Thanks for verifying this patch.

I also ran the tests on this patch, and can confirm the tests are failing.

The reason for that is that you request a new outer tuple whenever we have a
match, even when the outer tuple could match several tuples from the hash
table: we end up emitting the duplicates because we switched to another tuple
after the first match.

You can set up a simple test case like this:

create table t1 (id int);
create table t2 (id int);
insert into t1 select generate_series (1, 10000);
insert into t2 VALUES (1), (1), (-1), (-1);
analyze t1, t2;

set enable_hashjoin = off;
explain (analyze) select * from t2 where not exists (SELECT 1 FROM t1 where
t1.id = t2.id);
set enable_nestloop = off;
set enable_hashjoin = on;
explain (analyze) select * from t2 where not exists (SELECT 1 FROM t1 where
t1.id = t2.id);

Also, I'm not familiar enough with the hash join algorithm to know if the
approach of "scanning every outer tuple, skip emitting matching inner tuples"
would be correct, but this is the first problem I notice. Not getting into the
HJ_NEED_NEW_OUTER state when the join type is JOIN_RIGHT_ANTI seems to fix this
specific issue tough.

I think we can basically use the same cost calculation as with anti
joins, since they share the fact that the executor will stop after the
first match. However, there are still some differences. Such as when we
consider the number of tuples that will pass the basic join, I think we
need to use unmatched inner rows, rather than unmatched outer rows.

Due to the fact we cannot just skip at the first match, I'm not sure this works
either.

Show quoted text

Thanks
Richard

Richard Guo

guofenglinux@gmail.com

over 4 years ago

In reply to: Ronan Dunklau (#7)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Tue, Jun 29, 2021 at 10:41 PM Ronan Dunklau <ronan.dunklau@aiven.io>
wrote:

Le mardi 29 juin 2021, 10:55:59 CEST Richard Guo a écrit :

On Tue, Jun 29, 2021 at 3:55 PM Emre Hasegeli <emre@hasegeli.com> wrote:

Thanks for the explanation. Attached is a demo code for the hash-join
case, which is only for PoC to show how we can make it work. It's far
from complete, at least we need to adjust the cost calculation for

this

'right anti join'.

I applied the patch and executed some queries. Hash Right Anti Joins
seem to be working correctly. Though, some of the tests are failing.
I guessed it's because the other join algorithms do not support right
anti join, but I couldn't reproduce it.

Thanks for verifying this patch.

I also ran the tests on this patch, and can confirm the tests are failing.

The reason for that is that you request a new outer tuple whenever we have
a
match, even when the outer tuple could match several tuples from the hash
table: we end up emitting the duplicates because we switched to another
tuple
after the first match.

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

I think we can basically use the same cost calculation as with anti
joins, since they share the fact that the executor will stop after the
first match. However, there are still some differences. Such as when we
consider the number of tuples that will pass the basic join, I think we
need to use unmatched inner rows, rather than unmatched outer rows.

Due to the fact we cannot just skip at the first match, I'm not sure this
works
either.

This is not correct any more since the fact that the executor will stop
after the first match does not hold true. A brief thought show me that
we can use the same cost calculation as with right joins.

Thanks
Richard

Attachments:

v2-0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchapplication/octet-stream; name=v2-0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchDownload

From 27cdb17454507604f87c8362c4d4e48f5d879f35 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Wed, 30 Jun 2021 10:39:36 +0800
Subject: [PATCH] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c        | 3 +++
 src/backend/executor/nodeHashjoin.c   | 9 +++++++++
 src/backend/optimizer/path/joinpath.c | 7 +++++++
 src/backend/optimizer/path/joinrels.c | 3 +++
 src/backend/optimizer/plan/setrefs.c  | 1 +
 src/include/nodes/nodes.h             | 4 +++-
 6 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e81b990092..a9f6ebf1f3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1517,6 +1517,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 510bdd39ad..6592a81de3 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -482,6 +482,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 						continue;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple to scan the
+					 * bucket for matches
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -694,6 +702,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index b67b517770..e91293ff54 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -1110,6 +1110,9 @@ sort_inner_and_outer(PlannerInfo *root,
 	List	   *all_pathkeys;
 	ListCell   *l;
 
+	if (jointype == JOIN_RIGHT_ANTI)
+		return;
+
 	/*
 	 * We only consider the cheapest-total-cost input paths, since we are
 	 * assuming here that a sort is required.  We will consider
@@ -1559,6 +1562,9 @@ match_unsorted_outer(PlannerInfo *root,
 	Path	   *matpath = NULL;
 	ListCell   *lc1;
 
+	if (jointype == JOIN_RIGHT_ANTI)
+		return;
+
 	/*
 	 * Nestloop only supports inner, left, semi, and anti joins.  Also, if we
 	 * are doing a right or full mergejoin, we must use *all* the mergeclauses
@@ -2100,6 +2106,7 @@ hash_inner_and_outer(PlannerInfo *root,
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 0dbe2ac726..194f820c2f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -915,6 +915,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 61ccfd300b..56fc238b0e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2069,6 +2069,7 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
 			inner_itlist->has_non_vars = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			outer_itlist->has_non_vars = false;
 			break;
 		case JOIN_FULL:
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..7c07792e56 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -725,6 +725,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -757,7 +758,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
-- 
2.31.0

Ronan Dunklau

ronan.dunklau@aiven.io

over 4 years ago

In reply to: Richard Guo (#8)

Re: Using each rel as both outer and inner for JOIN_ANTI

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

It looks OK to me.

I think we can basically use the same cost calculation as with anti
joins, since they share the fact that the executor will stop after the
first match. However, there are still some differences. Such as when we
consider the number of tuples that will pass the basic join, I think we
need to use unmatched inner rows, rather than unmatched outer rows.

Due to the fact we cannot just skip at the first match, I'm not sure this
works
either.

This is not correct any more since the fact that the executor will stop
after the first match does not hold true. A brief thought show me that
we can use the same cost calculation as with right joins.

Once you do that, you should also add test coverage for those new plans. Also
consider adding a commitfest entry.

Regards,

--
Ronan Dunklau

#10

Ronan Dunklau

ronan.dunklau@aiven.io

over 4 years ago

In reply to: Ronan Dunklau (#9)

Re: Using each rel as both outer and inner for JOIN_ANTI

Le jeudi 1 juillet 2021, 09:09:38 CEST Ronan Dunklau a écrit :

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

It looks OK to me.

I forgot to mention: you also have failing tests due to the plans changing to
use the new join type. This might not be the case anymore once you update the
cost model, but if that's the case the tests should be updated.

#11

Richard Guo

guofenglinux@gmail.com

over 4 years ago

In reply to: Ronan Dunklau (#10)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Jul 1, 2021 at 3:18 PM Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Le jeudi 1 juillet 2021, 09:09:38 CEST Ronan Dunklau a écrit :

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

It looks OK to me.

I forgot to mention: you also have failing tests due to the plans changing
to
use the new join type. This might not be the case anymore once you update
the
cost model, but if that's the case the tests should be updated.

Thanks! Test cases are updated in v3 patch. Also merge join can do the
'right anti join' too in the same patch.

Thanks again for reviewing this patch.

Thanks
Richard

Attachments:

v3-0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchapplication/octet-stream; name=v3-0001-Using-each-rel-as-both-outer-and-inner-for-anti-join.patchDownload

From 0d2accb93a5bbe2746e8428f92bee49f0d4e8243 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Wed, 30 Jun 2021 10:39:36 +0800
Subject: [PATCH] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c               |   3 +
 src/backend/executor/nodeHashjoin.c          |   9 +
 src/backend/executor/nodeMergejoin.c         |   8 +
 src/backend/optimizer/path/costsize.c        |   3 +-
 src/backend/optimizer/path/joinpath.c        |   5 +
 src/backend/optimizer/path/joinrels.c        |   3 +
 src/backend/optimizer/path/pathkeys.c        |   4 +-
 src/backend/optimizer/plan/setrefs.c         |   1 +
 src/include/nodes/nodes.h                    |   4 +-
 src/test/regress/expected/partition_join.out | 276 ++++++++++---------
 10 files changed, 179 insertions(+), 137 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e81b990092..a9f6ebf1f3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1517,6 +1517,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 510bdd39ad..6592a81de3 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -482,6 +482,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 						continue;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple to scan the
+					 * bucket for matches
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -694,6 +702,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index b41454ab6d..41ac929199 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -805,6 +805,13 @@ ExecMergeJoin(PlanState *pstate)
 						break;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple for further scans
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						break;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -1554,6 +1561,7 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = true;
 			mergestate->mj_NullOuterTupleSlot =
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 8577c7b138..7110ea82ec 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3268,7 +3268,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			outerstartsel = 0.0;
 			outerendsel = 1.0;
 		}
-		else if (jointype == JOIN_RIGHT)
+		else if (jointype == JOIN_RIGHT ||
+				jointype == JOIN_RIGHT_ANTI)
 		{
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index b67b517770..4306e8a559 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -1167,6 +1167,7 @@ sort_inner_and_outer(PlannerInfo *root,
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -1576,6 +1577,7 @@ match_unsorted_outer(PlannerInfo *root,
 			useallclauses = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			nestjoinOK = false;
 			useallclauses = true;
@@ -1754,6 +1756,7 @@ match_unsorted_outer(PlannerInfo *root,
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -2100,6 +2103,7 @@ hash_inner_and_outer(PlannerInfo *root,
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
@@ -2262,6 +2266,7 @@ select_mergejoin_clauses(PlannerInfo *root,
 	switch (jointype)
 	{
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			*mergejoin_allowed = !have_nonmergeable_joinclause;
 			break;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 0dbe2ac726..194f820c2f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -915,6 +915,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index bd9a176d7d..d2c2278274 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1084,7 +1084,9 @@ build_join_pathkeys(PlannerInfo *root,
 					JoinType jointype,
 					List *outer_pathkeys)
 {
-	if (jointype == JOIN_FULL || jointype == JOIN_RIGHT)
+	if (jointype == JOIN_FULL ||
+		jointype == JOIN_RIGHT ||
+		jointype == JOIN_RIGHT_ANTI)
 		return NIL;
 
 	/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 61ccfd300b..56fc238b0e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2069,6 +2069,7 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
 			inner_itlist->has_non_vars = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			outer_itlist->has_non_vars = false;
 			break;
 		case JOIN_FULL:
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d9e417bcd7..7c07792e56 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -725,6 +725,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -757,7 +758,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 27f7525b3e..db69b835a5 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -2414,24 +2414,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2649,24 +2649,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2682,26 +2682,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY t1.b;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Anti Join
-         Hash Cond: (t1.b = t2.a)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.a = t1.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t1_1
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p2 t1_2
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p3 t1_3
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_extra t1_4
-                     Filter: (a = 0)
+               ->  Seq Scan on prt1_adv_p1 t2_1
+               ->  Seq Scan on prt1_adv_p2 t2_2
+               ->  Seq Scan on prt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t2_1
-                     ->  Seq Scan on prt1_adv_p2 t2_2
-                     ->  Seq Scan on prt1_adv_p3 t2_3
+                     ->  Seq Scan on prt2_adv_p1 t1_1
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p2 t1_2
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p3 t1_3
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_extra t1_4
+                           Filter: (a = 0)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -2869,25 +2869,25 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: (t1.a = t2.b)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.b = t1.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+               ->  Seq Scan on prt2_adv_p1 t2_1
+               ->  Seq Scan on prt2_adv_p2 t2_2
+               ->  Seq Scan on prt2_adv_p3_1 t2_3
+               ->  Seq Scan on prt2_adv_p3_2 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-                     ->  Seq Scan on prt2_adv_p3_1 t2_3
-                     ->  Seq Scan on prt2_adv_p3_2 t2_4
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (17 rows)
 
 -- full join
@@ -3291,27 +3291,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3504,27 +3507,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt2_adv t1 LEFT JOIN plt1_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3537,26 +3543,26 @@ SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt1_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt2_adv_extra t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p1 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p2 t1_3
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p3 t1_4
-                     Filter: (b < 10)
+               ->  Seq Scan on plt1_adv_p1 t2_1
+               ->  Seq Scan on plt1_adv_p2 t2_2
+               ->  Seq Scan on plt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt1_adv_p1 t2_1
-                     ->  Seq Scan on plt1_adv_p2 t2_2
-                     ->  Seq Scan on plt1_adv_p3 t2_3
+                     ->  Seq Scan on plt2_adv_extra t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p1 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p2 t1_3
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p3 t1_4
+                           Filter: (b < 10)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -3666,25 +3672,25 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Seq Scan on plt2_adv_p1 t2_1
+               ->  Seq Scan on plt2_adv_p2_1 t2_2
+               ->  Seq Scan on plt2_adv_p2_2 t2_3
+               ->  Seq Scan on plt2_adv_p3 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-                     ->  Seq Scan on plt2_adv_p2_1 t2_2
-                     ->  Seq Scan on plt2_adv_p2_2 t2_3
-                     ->  Seq Scan on plt2_adv_p3 t2_4
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
 (17 rows)
 
 -- full join
@@ -3841,28 +3847,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1_null t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
+               ->  Seq Scan on plt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p1_null t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3_null t2_3
-(19 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a  | b  |  c   
-- 
2.31.0

#12

Zhihong Yu

zyu@yugabyte.com

over 4 years ago

In reply to: Richard Guo (#11)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Jul 1, 2021 at 8:24 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Jul 1, 2021 at 3:18 PM Ronan Dunklau <ronan.dunklau@aiven.io>
wrote:

Le jeudi 1 juillet 2021, 09:09:38 CEST Ronan Dunklau a écrit :

Yes, thanks! I was making a big mistake here thinking the executor can
stop after the first match. That's not true. We need to use each outer
tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

It looks OK to me.

I forgot to mention: you also have failing tests due to the plans
changing to
use the new join type. This might not be the case anymore once you update
the
cost model, but if that's the case the tests should be updated.

Thanks! Test cases are updated in v3 patch. Also merge join can do the
'right anti join' too in the same patch.

Thanks again for reviewing this patch.

Thanks
Richard

Hi,
Minor comment:
+ * In a right-antijoin, we never return a matched tuple.

matched tuple -> matching tuple

Cheers

#13

Richard Guo

guofenglinux@gmail.com

over 4 years ago

In reply to: Zhihong Yu (#12)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Fri, Jul 2, 2021 at 11:59 AM Zhihong Yu <zyu@yugabyte.com> wrote:

On Thu, Jul 1, 2021 at 8:24 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Jul 1, 2021 at 3:18 PM Ronan Dunklau <ronan.dunklau@aiven.io>
wrote:

Le jeudi 1 juillet 2021, 09:09:38 CEST Ronan Dunklau a écrit :

Yes, thanks! I was making a big mistake here thinking the executor

can

stop after the first match. That's not true. We need to use each

outer

tuple to find all the matches and mark the corresponding hashtable
entries. I have updated the patch with the fix.

It looks OK to me.

I forgot to mention: you also have failing tests due to the plans
changing to
use the new join type. This might not be the case anymore once you
update the
cost model, but if that's the case the tests should be updated.

Thanks! Test cases are updated in v3 patch. Also merge join can do the
'right anti join' too in the same patch.

Thanks again for reviewing this patch.

Thanks
Richard

Hi,
Minor comment:
+ * In a right-antijoin, we never return a matched
tuple.

matched tuple -> matching tuple

Thanks for reviewing.

The comment for JOIN_ANTI in ExecHashJoinImpl/ExecMergeJoin is:

"In an antijoin, we never return a matched tuple"

So I think we'd better keep it consistent for JOIN_RIGHT_ANTI?

Thanks
Richard

#14

Richard Guo

guofenglinux@gmail.com

over 3 years ago

In reply to: Richard Guo (#11)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Fri, Jul 2, 2021 at 11:23 AM Richard Guo <guofenglinux@gmail.com> wrote:

Thanks! Test cases are updated in v3 patch. Also merge join can do the
'right anti join' too in the same patch.

Thanks again for reviewing this patch.

Rebased this patch with latest master, with no other changes.

Thanks
Richard

Attachments:

v4-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patchapplication/octet-stream; name=v4-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patchDownload

From c0750160b80eec6339924c96200167f694be99f0 Mon Sep 17 00:00:00 2001
From: pgsql-guo <richard.guo@openpie.com>
Date: Mon, 25 Jul 2022 02:27:47 +0000
Subject: [PATCH v4] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c               |   3 +
 src/backend/executor/nodeHashjoin.c          |   9 +
 src/backend/executor/nodeMergejoin.c         |   8 +
 src/backend/optimizer/path/costsize.c        |   3 +-
 src/backend/optimizer/path/joinpath.c        |   5 +
 src/backend/optimizer/path/joinrels.c        |   3 +
 src/backend/optimizer/path/pathkeys.c        |   4 +-
 src/backend/optimizer/plan/setrefs.c         |   1 +
 src/include/nodes/nodes.h                    |   4 +-
 src/test/regress/expected/partition_join.out | 276 ++++++++++---------
 10 files changed, 179 insertions(+), 137 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e29c2ae206..419b1753e7 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1550,6 +1550,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 87403e2478..7a2a9f17cc 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -482,6 +482,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 						continue;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple to scan the
+					 * bucket for matches
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -694,6 +702,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fed345eae5..1f2176e841 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -805,6 +805,13 @@ ExecMergeJoin(PlanState *pstate)
 						break;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple for further scans
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						break;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -1554,6 +1561,7 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = true;
 			mergestate->mj_NullOuterTupleSlot =
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index fb28e6411a..2a82692801 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3645,7 +3645,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			outerstartsel = 0.0;
 			outerendsel = 1.0;
 		}
-		else if (jointype == JOIN_RIGHT)
+		else if (jointype == JOIN_RIGHT ||
+				jointype == JOIN_RIGHT_ANTI)
 		{
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 2a3f0ab7bf..89408ba8b6 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -1212,6 +1212,7 @@ sort_inner_and_outer(PlannerInfo *root,
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -1621,6 +1622,7 @@ match_unsorted_outer(PlannerInfo *root,
 			useallclauses = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			nestjoinOK = false;
 			useallclauses = true;
@@ -1799,6 +1801,7 @@ match_unsorted_outer(PlannerInfo *root,
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -2145,6 +2148,7 @@ hash_inner_and_outer(PlannerInfo *root,
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
@@ -2307,6 +2311,7 @@ select_mergejoin_clauses(PlannerInfo *root,
 	switch (jointype)
 	{
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			*mergejoin_allowed = !have_nonmergeable_joinclause;
 			break;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..c6e5e0195c 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -915,6 +915,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index b5d6c977ee..40ad79bdba 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1609,7 +1609,9 @@ build_join_pathkeys(PlannerInfo *root,
 					JoinType jointype,
 					List *outer_pathkeys)
 {
-	if (jointype == JOIN_FULL || jointype == JOIN_RIGHT)
+	if (jointype == JOIN_FULL ||
+		jointype == JOIN_RIGHT ||
+		jointype == JOIN_RIGHT_ANTI)
 		return NIL;
 
 	/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..51f548041e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2252,6 +2252,7 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
 			inner_itlist->has_non_vars = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			outer_itlist->has_non_vars = false;
 			break;
 		case JOIN_FULL:
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..4a9aa2308f 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -300,6 +300,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -332,7 +333,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 03926a8413..852b740416 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -2403,24 +2403,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2638,24 +2638,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2671,26 +2671,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY t1.b;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Anti Join
-         Hash Cond: (t1.b = t2.a)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.a = t1.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t1_1
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p2 t1_2
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p3 t1_3
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_extra t1_4
-                     Filter: (a = 0)
+               ->  Seq Scan on prt1_adv_p1 t2_1
+               ->  Seq Scan on prt1_adv_p2 t2_2
+               ->  Seq Scan on prt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t2_1
-                     ->  Seq Scan on prt1_adv_p2 t2_2
-                     ->  Seq Scan on prt1_adv_p3 t2_3
+                     ->  Seq Scan on prt2_adv_p1 t1_1
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p2 t1_2
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p3 t1_3
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_extra t1_4
+                           Filter: (a = 0)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -2858,25 +2858,25 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: (t1.a = t2.b)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.b = t1.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+               ->  Seq Scan on prt2_adv_p1 t2_1
+               ->  Seq Scan on prt2_adv_p2 t2_2
+               ->  Seq Scan on prt2_adv_p3_1 t2_3
+               ->  Seq Scan on prt2_adv_p3_2 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-                     ->  Seq Scan on prt2_adv_p3_1 t2_3
-                     ->  Seq Scan on prt2_adv_p3_2 t2_4
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (17 rows)
 
 -- full join
@@ -3280,27 +3280,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3493,27 +3496,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt2_adv t1 LEFT JOIN plt1_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3526,26 +3532,26 @@ SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt1_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt2_adv_extra t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p1 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p2 t1_3
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p3 t1_4
-                     Filter: (b < 10)
+               ->  Seq Scan on plt1_adv_p1 t2_1
+               ->  Seq Scan on plt1_adv_p2 t2_2
+               ->  Seq Scan on plt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt1_adv_p1 t2_1
-                     ->  Seq Scan on plt1_adv_p2 t2_2
-                     ->  Seq Scan on plt1_adv_p3 t2_3
+                     ->  Seq Scan on plt2_adv_extra t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p1 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p2 t1_3
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p3 t1_4
+                           Filter: (b < 10)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -3655,25 +3661,25 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Seq Scan on plt2_adv_p1 t2_1
+               ->  Seq Scan on plt2_adv_p2_1 t2_2
+               ->  Seq Scan on plt2_adv_p2_2 t2_3
+               ->  Seq Scan on plt2_adv_p3 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-                     ->  Seq Scan on plt2_adv_p2_1 t2_2
-                     ->  Seq Scan on plt2_adv_p2_2 t2_3
-                     ->  Seq Scan on plt2_adv_p3 t2_4
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
 (17 rows)
 
 -- full join
@@ -3830,28 +3836,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1_null t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
+               ->  Seq Scan on plt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p1_null t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3_null t2_3
-(19 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a  | b  |  c   
-- 
2.25.1

#15

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Richard Guo (#14)

Re: Using each rel as both outer and inner for JOIN_ANTI

Richard Guo <guofenglinux@gmail.com> writes:

[ v4-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patch ]

I took a quick look through this. The executor changes are indeed
impressively short, but that's largely because you've paid zero
attention to updating obsoleted comments. For example, in
nodeHashjoin.c there are lots of references to right/full joins
that likely now need to cover right-anti. I'm not sure that the
empty-rel startup optimizations are correct for this case, either.

I don't have a warm feeling about the planner changes being correct:
it looks like what you mostly did was to treat JOIN_RIGHT_ANTI
identically to JOIN_ANTI everywhere, which is surely wrong.
As an example of this, optimizer/README mentions

Similarly, parameterized paths do not normally get preference in add_path
for having cheap startup cost; that's seldom of much value when on the
inside of a nestloop, so it seems not worth keeping extra paths solely for
that. An exception occurs for parameterized paths for the RHS relation of
a SEMI or ANTI join: in those cases, we can stop the inner scan after the
first match, so it's primarily startup not total cost that we care about.

For RIGHT_ANTI it'd become startup of the outer scan that counts, but
I don't think you've gotten that right here.

There are various references to JOIN_ANTI in planner peripheral code,
e.g. selfuncs.c, that probably need adjustment.

[ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]

regards, tom lane

#16

Richard Guo

guofenglinux@gmail.com

over 3 years ago

In reply to: Tom Lane (#15)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

[ v4-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patch ]

I took a quick look through this. The executor changes are indeed
impressively short, but that's largely because you've paid zero
attention to updating obsoleted comments. For example, in
nodeHashjoin.c there are lots of references to right/full joins
that likely now need to cover right-anti. I'm not sure that the
empty-rel startup optimizations are correct for this case, either.

Thanks for the review! Yeah, you're right. I neglected to update the
related comments. Will do that in the new patch. For the empty-rel
startup optimizations, since the right-anti join also does null-fill on
inner relation (the HJ_FILL_INNER case), I think we cannot skip building
the hash table even when the outer rel is completely empty.

I don't have a warm feeling about the planner changes being correct:
it looks like what you mostly did was to treat JOIN_RIGHT_ANTI
identically to JOIN_ANTI everywhere, which is surely wrong.
As an example of this, optimizer/README mentions

Similarly, parameterized paths do not normally get preference in add_path
for having cheap startup cost; that's seldom of much value when on the
inside of a nestloop, so it seems not worth keeping extra paths solely
for
that. An exception occurs for parameterized paths for the RHS relation
of
a SEMI or ANTI join: in those cases, we can stop the inner scan after the
first match, so it's primarily startup not total cost that we care about.

For RIGHT_ANTI it'd become startup of the outer scan that counts, but
I don't think you've gotten that right here.

I think JOIN_RIGHT_ANTI behaves more like JOIN_RIGHT, except that
JOIN_RIGHT returns a matched tuple while JOIN_RIGHT_ANTI does not. For
each outer tuple, right-anti needs to scan the inner rel for every match
and mark its hashtable entry. So I think the right-anti join should not
belong to the case 'in those cases, we can stop the inner scan after the
first match, so it's primarily startup not total cost that we care
about.' Am I thinking it correctly?

[ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]

Maybe this is something we can do. Currently for the query below:

# explain select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------
Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=4)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4)
(5 rows)

I believe we can get a cheaper plan if we are able to swap the outer and
inner for SEMI JOIN and use the smaller 'foo' as inner rel.

Thanks
Richard

#17

Richard Guo

guofenglinux@gmail.com

over 3 years ago

In reply to: Richard Guo (#16)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I took a quick look through this. The executor changes are indeed
impressively short, but that's largely because you've paid zero
attention to updating obsoleted comments. For example, in
nodeHashjoin.c there are lots of references to right/full joins
that likely now need to cover right-anti. I'm not sure that the
empty-rel startup optimizations are correct for this case, either.

Thanks for the review! Yeah, you're right. I neglected to update the
related comments. Will do that in the new patch. For the empty-rel
startup optimizations, since the right-anti join also does null-fill on
inner relation (the HJ_FILL_INNER case), I think we cannot skip building
the hash table even when the outer rel is completely empty.

Here is the new patch which addresses the obsoleted comments.

Thanks
Richard

Attachments:

v5-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patchapplication/octet-stream; name=v5-0001-Using-each-rel-as-both-outer-and-inner-for-anti-j.patchDownload

From 12aabf4cc858e00e266d1b5224685b7496f352c2 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Mon, 8 Aug 2022 14:44:01 +0800
Subject: [PATCH v5] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c               |   3 +
 src/backend/executor/nodeHashjoin.c          |  41 +--
 src/backend/executor/nodeMergejoin.c         |  32 ++-
 src/backend/optimizer/path/costsize.c        |   3 +-
 src/backend/optimizer/path/joinpath.c        |  55 ++--
 src/backend/optimizer/path/joinrels.c        |   3 +
 src/backend/optimizer/path/pathkeys.c        |  10 +-
 src/backend/optimizer/plan/setrefs.c         |   1 +
 src/include/nodes/execnodes.h                |   3 +-
 src/include/nodes/nodes.h                    |   4 +-
 src/test/regress/expected/partition_join.out | 276 ++++++++++---------
 11 files changed, 238 insertions(+), 193 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e078456b19..cb093f31f1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1550,6 +1550,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 87403e2478..140db24afe 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -217,10 +217,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 
 				/*
 				 * If the outer relation is completely empty, and it's not
-				 * right/full join, we can quit without building the hash
-				 * table.  However, for an inner join it is only a win to
-				 * check this when the outer relation's startup cost is less
-				 * than the projected cost of building the hash table.
+				 * right/right-anti/full join, we can quit without building
+				 * the hash table.  However, for an inner join it is only a
+				 * win to check this when the outer relation's startup cost is
+				 * less than the projected cost of building the hash table.
 				 * Otherwise it's best to build the hash table first and see
 				 * if the inner relation is empty.  (When it's a left join, we
 				 * should always make this check, since we aren't going to be
@@ -458,11 +458,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					if (parallel)
 					{
 						/*
-						 * Full/right outer joins are currently not supported
-						 * for parallel joins, so we don't need to set the
-						 * match bit.  Experiments show that it's worth
-						 * avoiding the shared memory traffic on large
-						 * systems.
+						 * Full/right/right-anti outer joins are currently not
+						 * supported for parallel joins, so we don't need to
+						 * set the match bit.  Experiments show that it's worth
+						 * avoiding the shared memory traffic on large systems.
 						 */
 						Assert(!HJ_FILL_INNER(node));
 					}
@@ -482,6 +481,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 						continue;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple to scan the
+					 * bucket for matches
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -527,9 +534,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 			case HJ_FILL_INNER_TUPLES:
 
 				/*
-				 * We have finished a batch, but we are doing right/full join,
-				 * so any unmatched inner tuples in the hashtable have to be
-				 * emitted before we continue to the next batch.
+				 * We have finished a batch, but we are doing
+				 * right/right-anti/full join, so any unmatched inner tuples in
+				 * the hashtable have to be emitted before we continue to the
+				 * next batch.
 				 */
 				if (!ExecScanHashTableForUnmatched(node, econtext))
 				{
@@ -694,6 +702,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
@@ -987,8 +996,8 @@ ExecHashJoinNewBatch(HashJoinState *hjstate)
 	 * side, but there are exceptions:
 	 *
 	 * 1. In a left/full outer join, we have to process outer batches even if
-	 * the inner batch is empty.  Similarly, in a right/full outer join, we
-	 * have to process inner batches even if the outer batch is empty.
+	 * the inner batch is empty.  Similarly, in a right/right-anti/full outer
+	 * join, we have to process inner batches even if the outer batch is empty.
 	 *
 	 * 2. If we have increased nbatch since the initial estimate, we have to
 	 * scan inner batches since they might contain tuples that need to be
@@ -1308,8 +1317,8 @@ ExecReScanHashJoin(HashJoinState *node)
 			/*
 			 * Okay to reuse the hash table; needn't rescan inner, either.
 			 *
-			 * However, if it's a right/full join, we'd better reset the
-			 * inner-tuple match flags contained in the table.
+			 * However, if it's a right/right-anti/full join, we'd better reset
+			 * the inner-tuple match flags contained in the table.
 			 */
 			if (HJ_FILL_INNER(node))
 				ExecHashTableResetMatchFlags(node->hj_HashTable);
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fed345eae5..b16d2a066b 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -805,6 +805,13 @@ ExecMergeJoin(PlanState *pstate)
 						break;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple for further scans
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						break;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -1063,12 +1070,12 @@ ExecMergeJoin(PlanState *pstate)
 					 * them will match this new outer tuple and therefore
 					 * won't be emitted as fill tuples.  This works *only*
 					 * because we require the extra joinquals to be constant
-					 * when doing a right or full join --- otherwise some of
-					 * the rescanned tuples might fail the extra joinquals.
-					 * This obviously won't happen for a constant-true extra
-					 * joinqual, while the constant-false case is handled by
-					 * forcing the merge clause to never match, so we never
-					 * get here.
+					 * when doing a right, right-anti or full join ---
+					 * otherwise some of the rescanned tuples might fail the
+					 * extra joinquals.  This obviously won't happen for a
+					 * constant-true extra joinqual, while the constant-false
+					 * case is handled by forcing the merge clause to never
+					 * match, so we never get here.
 					 */
 					if (!node->mj_SkipMarkRestore)
 					{
@@ -1332,8 +1339,8 @@ ExecMergeJoin(PlanState *pstate)
 
 				/*
 				 * EXEC_MJ_ENDOUTER means we have run out of outer tuples, but
-				 * are doing a right/full join and therefore must null-fill
-				 * any remaining unmatched inner tuples.
+				 * are doing a right/right-anti/full join and therefore must
+				 * null-fill any remaining unmatched inner tuples.
 				 */
 			case EXEC_MJ_ENDOUTER:
 				MJ_printf("ExecMergeJoin: EXEC_MJ_ENDOUTER\n");
@@ -1554,14 +1561,15 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = true;
 			mergestate->mj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 
 			/*
-			 * Can't handle right or full join with non-constant extra
-			 * joinclauses.  This should have been caught by planner.
+			 * Can't handle right, right-anti or full join with non-constant
+			 * extra joinclauses.  This should have been caught by planner.
 			 */
 			if (!check_constant_qual(node->join.joinqual,
 									 &mergestate->mj_ConstFalseJoin))
@@ -1578,8 +1586,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 
 			/*
-			 * Can't handle right or full join with non-constant extra
-			 * joinclauses.  This should have been caught by planner.
+			 * Can't handle right, right-anti or full join with non-constant
+			 * extra joinclauses.  This should have been caught by planner.
 			 */
 			if (!check_constant_qual(node->join.joinqual,
 									 &mergestate->mj_ConstFalseJoin))
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index fb28e6411a..2a82692801 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3645,7 +3645,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			outerstartsel = 0.0;
 			outerendsel = 1.0;
 		}
-		else if (jointype == JOIN_RIGHT)
+		else if (jointype == JOIN_RIGHT ||
+				jointype == JOIN_RIGHT_ANTI)
 		{
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 2a3f0ab7bf..6a3b5adcba 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -284,8 +284,9 @@ add_paths_to_joinrel(PlannerInfo *root,
 	 * 2. Consider paths where the outer relation need not be explicitly
 	 * sorted. This includes both nestloops and mergejoins where the outer
 	 * path is already ordered.  Again, skip this if we can't mergejoin.
-	 * (That's okay because we know that nestloop can't handle right/full
-	 * joins at all, so it wouldn't work in the prohibited cases either.)
+	 * (That's okay because we know that nestloop can't handle
+	 * right/right-anti/full joins at all, so it wouldn't work in the
+	 * prohibited cases either.)
 	 */
 	if (mergejoin_allowed)
 		match_unsorted_outer(root, joinrel, outerrel, innerrel,
@@ -1204,14 +1205,15 @@ sort_inner_and_outer(PlannerInfo *root,
 	 * If the joinrel is parallel-safe, we may be able to consider a partial
 	 * merge join.  However, we can't handle JOIN_UNIQUE_OUTER, because the
 	 * outer path will be partial, and therefore we won't be able to properly
-	 * guarantee uniqueness.  Similarly, we can't handle JOIN_FULL and
-	 * JOIN_RIGHT, because they can produce false null extended rows.  Also,
-	 * the resulting path must not be parameterized.
+	 * guarantee uniqueness.  Similarly, we can't handle JOIN_FULL, JOIN_RIGHT
+	 * and JOIN_RIGHT_ANTI, because they can produce false null extended rows.
+	 * Also, the resulting path must not be parameterized.
 	 */
 	if (joinrel->consider_parallel &&
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -1606,10 +1608,10 @@ match_unsorted_outer(PlannerInfo *root,
 
 	/*
 	 * Nestloop only supports inner, left, semi, and anti joins.  Also, if we
-	 * are doing a right or full mergejoin, we must use *all* the mergeclauses
-	 * as join clauses, else we will not have a valid plan.  (Although these
-	 * two flags are currently inverses, keep them separate for clarity and
-	 * possible future changes.)
+	 * are doing a right, right-anti or full mergejoin, we must use *all* the
+	 * mergeclauses as join clauses, else we will not have a valid plan.
+	 * (Although these two flags are currently inverses, keep them separate
+	 * for clarity and possible future changes.)
 	 */
 	switch (jointype)
 	{
@@ -1621,6 +1623,7 @@ match_unsorted_outer(PlannerInfo *root,
 			useallclauses = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			nestjoinOK = false;
 			useallclauses = true;
@@ -1792,13 +1795,14 @@ match_unsorted_outer(PlannerInfo *root,
 	 * handle JOIN_UNIQUE_OUTER, because the outer path will be partial, and
 	 * therefore we won't be able to properly guarantee uniqueness.  Nor can
 	 * we handle joins needing lateral rels, since partial paths must not be
-	 * parameterized. Similarly, we can't handle JOIN_FULL and JOIN_RIGHT,
-	 * because they can produce false null extended rows.
+	 * parameterized. Similarly, we can't handle JOIN_FULL, JOIN_RIGHT and
+	 * JOIN_RIGHT_ANTI, because they can produce false null extended rows.
 	 */
 	if (joinrel->consider_parallel &&
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -2134,17 +2138,19 @@ hash_inner_and_outer(PlannerInfo *root,
 		 * partial hash join.  However, we can't handle JOIN_UNIQUE_OUTER,
 		 * because the outer path will be partial, and therefore we won't be
 		 * able to properly guarantee uniqueness.  Similarly, we can't handle
-		 * JOIN_FULL and JOIN_RIGHT, because they can produce false null
-		 * extended rows.  Also, the resulting path must not be parameterized.
-		 * We would be able to support JOIN_FULL and JOIN_RIGHT for Parallel
-		 * Hash, since in that case we're back to a single hash table with a
-		 * single set of match bits for each batch, but that will require
-		 * figuring out a deadlock-free way to wait for the probe to finish.
+		 * JOIN_FULL, JOIN_RIGHT and JOIN_RIGHT_ANTI, because they can produce
+		 * false null extended rows.  Also, the resulting path must not be
+		 * parameterized.  We would be able to support JOIN_FULL, JOIN_RIGHT
+		 * and JOIN_RIGHT_ANTI for Parallel Hash, since in that case we're
+		 * back to a single hash table with a single set of match bits for
+		 * each batch, but that will require figuring out a deadlock-free way
+		 * to wait for the probe to finish.
 		 */
 		if (joinrel->consider_parallel &&
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
@@ -2201,11 +2207,11 @@ hash_inner_and_outer(PlannerInfo *root,
  *	  Select mergejoin clauses that are usable for a particular join.
  *	  Returns a list of RestrictInfo nodes for those clauses.
  *
- * *mergejoin_allowed is normally set to true, but it is set to false if
- * this is a right/full join and there are nonmergejoinable join clauses.
- * The executor's mergejoin machinery cannot handle such cases, so we have
- * to avoid generating a mergejoin plan.  (Note that this flag does NOT
- * consider whether there are actually any mergejoinable clauses.  This is
+ * *mergejoin_allowed is normally set to true, but it is set to false if this
+ * is a right/right-anti/full join and there are nonmergejoinable join
+ * clauses.  The executor's mergejoin machinery cannot handle such cases, so
+ * we have to avoid generating a mergejoin plan.  (Note that this flag does
+ * NOT consider whether there are actually any mergejoinable clauses.  This is
  * correct because in some cases we need to build a clauseless mergejoin.
  * Simply returning NIL is therefore not enough to distinguish safe from
  * unsafe cases.)
@@ -2251,8 +2257,8 @@ select_mergejoin_clauses(PlannerInfo *root,
 		{
 			/*
 			 * The executor can handle extra joinquals that are constants, but
-			 * not anything else, when doing right/full merge join.  (The
-			 * reason to support constants is so we can do FULL JOIN ON
+			 * not anything else, when doing right/right-anti/full merge join.
+			 * (The reason to support constants is so we can do FULL JOIN ON
 			 * FALSE.)
 			 */
 			if (!restrictinfo->clause || !IsA(restrictinfo->clause, Const))
@@ -2307,6 +2313,7 @@ select_mergejoin_clauses(PlannerInfo *root,
 	switch (jointype)
 	{
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			*mergejoin_allowed = !have_nonmergeable_joinclause;
 			break;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 9da3ff2f9a..c6e5e0195c 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -915,6 +915,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 1fa7fc99b5..4504cd1fd5 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1626,9 +1626,9 @@ find_var_for_subquery_tle(RelOptInfo *rel, TargetEntry *tle)
  *	  Build the path keys for a join relation constructed by mergejoin or
  *	  nestloop join.  This is normally the same as the outer path's keys.
  *
- *	  EXCEPTION: in a FULL or RIGHT join, we cannot treat the result as
- *	  having the outer path's path keys, because null lefthand rows may be
- *	  inserted at random points.  It must be treated as unsorted.
+ *	  EXCEPTION: in a FULL, RIGHT or RIGHT_ANTI join, we cannot treat the
+ *	  result as having the outer path's path keys, because null lefthand rows
+ *	  may be inserted at random points.  It must be treated as unsorted.
  *
  *	  We truncate away any pathkeys that are uninteresting for higher joins.
  *
@@ -1644,7 +1644,9 @@ build_join_pathkeys(PlannerInfo *root,
 					JoinType jointype,
 					List *outer_pathkeys)
 {
-	if (jointype == JOIN_FULL || jointype == JOIN_RIGHT)
+	if (jointype == JOIN_FULL ||
+		jointype == JOIN_RIGHT ||
+		jointype == JOIN_RIGHT_ANTI)
 		return NIL;
 
 	/*
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 1cb0abdbc1..51f548041e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2252,6 +2252,7 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
 			inner_itlist->has_non_vars = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			outer_itlist->has_non_vars = false;
 			break;
 		case JOIN_FULL:
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..3b3f5b5fd8 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2051,7 +2051,8 @@ typedef struct MergeJoinState
  *								OuterTupleSlot is empty!)
  *		hj_OuterTupleSlot		tuple slot for outer tuples
  *		hj_HashTupleSlot		tuple slot for inner (hashed) tuples
- *		hj_NullOuterTupleSlot	prepared null tuple for right/full outer joins
+ *		hj_NullOuterTupleSlot	prepared null tuple for right/right-anti/full
+ *								outer joins
  *		hj_NullInnerTupleSlot	prepared null tuple for left/full outer joins
  *		hj_FirstOuterTupleSlot	first tuple retrieved from outer plan
  *		hj_JoinState			current state of ExecHashJoin state machine
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cdd6debfa0..4a9aa2308f 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -300,6 +300,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -332,7 +333,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 03926a8413..852b740416 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -2403,24 +2403,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2638,24 +2638,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2671,26 +2671,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY t1.b;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Anti Join
-         Hash Cond: (t1.b = t2.a)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.a = t1.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t1_1
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p2 t1_2
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p3 t1_3
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_extra t1_4
-                     Filter: (a = 0)
+               ->  Seq Scan on prt1_adv_p1 t2_1
+               ->  Seq Scan on prt1_adv_p2 t2_2
+               ->  Seq Scan on prt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t2_1
-                     ->  Seq Scan on prt1_adv_p2 t2_2
-                     ->  Seq Scan on prt1_adv_p3 t2_3
+                     ->  Seq Scan on prt2_adv_p1 t1_1
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p2 t1_2
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p3 t1_3
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_extra t1_4
+                           Filter: (a = 0)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -2858,25 +2858,25 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: (t1.a = t2.b)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.b = t1.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+               ->  Seq Scan on prt2_adv_p1 t2_1
+               ->  Seq Scan on prt2_adv_p2 t2_2
+               ->  Seq Scan on prt2_adv_p3_1 t2_3
+               ->  Seq Scan on prt2_adv_p3_2 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-                     ->  Seq Scan on prt2_adv_p3_1 t2_3
-                     ->  Seq Scan on prt2_adv_p3_2 t2_4
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (17 rows)
 
 -- full join
@@ -3280,27 +3280,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3493,27 +3496,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt2_adv t1 LEFT JOIN plt1_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3526,26 +3532,26 @@ SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt1_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt2_adv_extra t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p1 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p2 t1_3
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p3 t1_4
-                     Filter: (b < 10)
+               ->  Seq Scan on plt1_adv_p1 t2_1
+               ->  Seq Scan on plt1_adv_p2 t2_2
+               ->  Seq Scan on plt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt1_adv_p1 t2_1
-                     ->  Seq Scan on plt1_adv_p2 t2_2
-                     ->  Seq Scan on plt1_adv_p3 t2_3
+                     ->  Seq Scan on plt2_adv_extra t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p1 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p2 t1_3
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p3 t1_4
+                           Filter: (b < 10)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -3655,25 +3661,25 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Seq Scan on plt2_adv_p1 t2_1
+               ->  Seq Scan on plt2_adv_p2_1 t2_2
+               ->  Seq Scan on plt2_adv_p2_2 t2_3
+               ->  Seq Scan on plt2_adv_p3 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-                     ->  Seq Scan on plt2_adv_p2_1 t2_2
-                     ->  Seq Scan on plt2_adv_p2_2 t2_3
-                     ->  Seq Scan on plt2_adv_p3 t2_4
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
 (17 rows)
 
 -- full join
@@ -3830,28 +3836,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1_null t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
+               ->  Seq Scan on plt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p1_null t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3_null t2_3
-(19 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a  | b  |  c   
-- 
2.31.0

#18

Alvaro Herrera

alvherre@alvh.no-ip.org

over 3 years ago

In reply to: Richard Guo (#17)

Re: Using each rel as both outer and inner for JOIN_ANTI

Just for kicks, I ran query in your original post under EXPLAIN ANALYZE
in both patched and unpatched with this last version. I got this (best
of three):

Unpatched:
55432 16devel 437532=# explain (analyze, buffers) select * from foo left join bar on foo.a = bar.c where bar.c is null;
QUERY PLAN
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Anti Join (cost=159039.00..183457.23 rows=10 width=20) (actual time=482.788..483.182 rows=10 loops=1)
Hash Cond: (foo.a = bar.c)
Buffers: shared hit=161 read=21964, temp read=8 written=8
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.020..0.022 rows=10 loops=1)
Buffers: shared hit=1
-> Hash (cost=72124.00..72124.00 rows=5000000 width=12) (actual time=482.128..482.129 rows=0 loops=1)
Buckets: 262144 Batches: 64 Memory Usage: 2048kB
Buffers: shared hit=160 read=21964
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.092..237.431 rows=5000000 loops=1)
Buffers: shared hit=160 read=21964
Planning Time: 0.182 ms
Execution Time: 483.248 ms

Patched:
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Right Anti Join (cost=1.23..90875.24 rows=10 width=20) (actual time=457.654..457.658 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
Buffers: shared hit=33 read=22092
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.020..229.097 rows=5000000 loops=1)
Buffers: shared hit=32 read=22092
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.011..0.012 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.006..0.007 rows=10 loops=1)
Buffers: shared hit=1
Planning Time: 0.067 ms
Execution Time: 457.679 ms

I suppose this looks good as far as the plan goes, but the cost estimation
might be a little bit too optimistic: it is reporting that the new plan
costs 50% of the original, yet the execution time is only 5% lower.

I wonder where does time go (in unpatched) when seqscanning finishes
and before hashing starts.

(I had to disable JIT for the first one, as it insisted on JITting tuple
deforming.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)

#19

Richard Guo

guofenglinux@gmail.com

over 3 years ago

In reply to: Alvaro Herrera (#18)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Tue, Aug 9, 2022 at 6:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

I suppose this looks good as far as the plan goes, but the cost estimation
might be a little bit too optimistic: it is reporting that the new plan
costs 50% of the original, yet the execution time is only 5% lower.

Thanks for trying this patch. Yeah, the estimated cost doesn't match the
execution time here. I tried the query locally and here is what I got
(best of three):

Unpatched:
# explain analyze select * from foo left join bar on foo.a = bar.c where
bar.c is null;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Hash Anti Join (cost=154156.00..173691.19 rows=1 width=16) (actual
time=1548.622..1548.624 rows=0 loops=1)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual
time=0.024..0.026 rows=10 loops=1)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=8) (actual
time=1443.157..1443.158 rows=5000000 loops=1)
Buckets: 262144 Batches: 64 Memory Usage: 5107kB
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=8)
(actual time=0.045..481.059 rows=5000000 loops=1)
Planning Time: 0.262 ms
Execution Time: 1549.138 ms
(8 rows)

Patched:
# explain analyze select * from foo left join bar on foo.a = bar.c where
bar.c is null;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Hash Right Anti Join (cost=1.23..90875.33 rows=1 width=16) (actual
time=985.773..985.775 rows=0 loops=1)
Hash Cond: (bar.c = foo.a)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=8) (actual
time=0.095..438.333 rows=5000000 loops=1)
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.076..0.077
rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual
time=0.060..0.064 rows=10 loops=1)
Planning Time: 0.290 ms
Execution Time: 985.830 ms
(8 rows)

Seems the cost matches the execution time better in my local box.

The right-anti join plan has the same cost estimation with right join
plan in this case. So would you please help to test what the right join
plan looks like in your env for the query below?

select * from foo left join bar on foo.a = bar.c;

Thanks
Richard

#20

Alvaro Herrera

alvherre@alvh.no-ip.org

over 3 years ago

In reply to: Richard Guo (#19)

Re: Using each rel as both outer and inner for JOIN_ANTI

On 2022-Aug-10, Richard Guo wrote:

The right-anti join plan has the same cost estimation with right join
plan in this case. So would you please help to test what the right join
plan looks like in your env for the query below?

select * from foo left join bar on foo.a = bar.c;

You're right, it does.

55432 16devel 475322=# explain (analyze, buffers) select * from foo left join bar on foo.a = bar.c;
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Right Join (cost=1.23..90875.24 rows=10 width=20) (actual time=456.410..456.415 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
Buffers: shared hit=15852 read=6273
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.036..210.468 rows=5000000 loops=1)
Buffers: shared hit=15852 read=6272
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.037..0.038 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared read=1
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.022..0.026 rows=10 loops=1)
Buffers: shared read=1
Planning:
Buffers: shared hit=92 read=13
Planning Time: 1.077 ms
Execution Time: 456.458 ms
(14 filas)

55432 16devel 475322=# explain (analyze, buffers) select * from foo left join bar on foo.a = bar.c where bar.c is null;
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Right Anti Join (cost=1.23..90875.24 rows=10 width=20) (actual time=451.747..451.751 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
Buffers: shared hit=15646 read=6479
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.048..204.940 rows=5000000 loops=1)
Buffers: shared hit=15645 read=6479
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.030..0.031 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.017..0.020 rows=10 loops=1)
Buffers: shared hit=1
Planning Time: 0.227 ms
Execution Time: 451.793 ms

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Every machine is a smoke machine if you operate it wrong enough."
https://twitter.com/libseybieda/status/1541673325781196801

#21

Richard Guo

guofenglinux@gmail.com

over 3 years ago

In reply to: Alvaro Herrera (#20)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Wed, Aug 10, 2022 at 4:40 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2022-Aug-10, Richard Guo wrote:

The right-anti join plan has the same cost estimation with right join
plan in this case. So would you please help to test what the right join
plan looks like in your env for the query below?

select * from foo left join bar on foo.a = bar.c;

You're right, it does.

55432 16devel 475322=# explain (analyze, buffers) select * from foo left
join bar on foo.a = bar.c;
QUERY PLAN

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Right Join (cost=1.23..90875.24 rows=10 width=20) (actual
time=456.410..456.415 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
Buffers: shared hit=15852 read=6273
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12)
(actual time=0.036..210.468 rows=5000000 loops=1)
Buffers: shared hit=15852 read=6272
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.037..0.038
rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared read=1
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual
time=0.022..0.026 rows=10 loops=1)
Buffers: shared read=1
Planning:
Buffers: shared hit=92 read=13
Planning Time: 1.077 ms
Execution Time: 456.458 ms
(14 filas)

Thanks for help testing. Comparing the anti join plan and the right join
plan, the estimated cost and the execution time mismatch a lot. Seems
the cost estimate of hashjoin path is not that precise for this case
even in the unpatched codes. Maybe this is something we need to improve.

Thanks
Richard

#22

Gregory Stark (as CFM)

stark.cfm@gmail.com

almost 3 years ago

In reply to: Richard Guo (#21)

Re: Using each rel as both outer and inner for JOIN_ANTI

So what is the status of this patch?

It looks like you received some feedback from Emre, Tom, Ronan, and
Alvaro but it also looks like you responded to most or all of that.
Are you still blocked waiting for feedback? Anything specific you need
help with?

Or is the patch ready for commit now? In which case it would be good
to rebase it since it's currently failing to apply. Well it would be
good to rebase regardless but it would be especially important if we
want to get it committed :)

#23

Richard Guo

guofenglinux@gmail.com

almost 3 years ago

In reply to: Gregory Stark (as CFM) (#22)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Wed, Mar 15, 2023 at 2:25 AM Gregory Stark (as CFM) <stark.cfm@gmail.com>
wrote:

So what is the status of this patch?

It looks like you received some feedback from Emre, Tom, Ronan, and
Alvaro but it also looks like you responded to most or all of that.
Are you still blocked waiting for feedback? Anything specific you need
help with?

Or is the patch ready for commit now? In which case it would be good
to rebase it since it's currently failing to apply. Well it would be
good to rebase regardless but it would be especially important if we
want to get it committed :)

Thanks for reminding. Attached is the rebased patch, with no other
changes. I think the patch is ready for commit.

Thanks
Richard

Attachments:

v6-0001-Using-each-rel-as-both-outer-and-inner-for-anti-joins.patchapplication/octet-stream; name=v6-0001-Using-each-rel-as-both-outer-and-inner-for-anti-joins.patchDownload

From a1f2dcc0b13363f3f025c050973fb5b09d14e51c Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Wed, 15 Mar 2023 14:59:05 +0800
Subject: [PATCH v6] Using each rel as both outer and inner for anti joins

---
 src/backend/commands/explain.c               |   3 +
 src/backend/executor/nodeHashjoin.c          |  41 +--
 src/backend/executor/nodeMergejoin.c         |  32 ++-
 src/backend/optimizer/path/costsize.c        |   3 +-
 src/backend/optimizer/path/joinpath.c        |  55 ++--
 src/backend/optimizer/path/joinrels.c        |   3 +
 src/backend/optimizer/path/pathkeys.c        |  10 +-
 src/include/nodes/execnodes.h                |   3 +-
 src/include/nodes/nodes.h                    |   4 +-
 src/test/regress/expected/partition_join.out | 276 ++++++++++---------
 10 files changed, 237 insertions(+), 193 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e57bda7b62..49af2ef504 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1550,6 +1550,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_ANTI:
+						jointype = "Right Anti";
+						break;
 					default:
 						jointype = "???";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index b215e3f59a..c9ede41b65 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -217,10 +217,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 
 				/*
 				 * If the outer relation is completely empty, and it's not
-				 * right/full join, we can quit without building the hash
-				 * table.  However, for an inner join it is only a win to
-				 * check this when the outer relation's startup cost is less
-				 * than the projected cost of building the hash table.
+				 * right/right-anti/full join, we can quit without building
+				 * the hash table.  However, for an inner join it is only a
+				 * win to check this when the outer relation's startup cost is
+				 * less than the projected cost of building the hash table.
 				 * Otherwise it's best to build the hash table first and see
 				 * if the inner relation is empty.  (When it's a left join, we
 				 * should always make this check, since we aren't going to be
@@ -458,11 +458,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					if (parallel)
 					{
 						/*
-						 * Full/right outer joins are currently not supported
-						 * for parallel joins, so we don't need to set the
-						 * match bit.  Experiments show that it's worth
-						 * avoiding the shared memory traffic on large
-						 * systems.
+						 * Full/right/right-anti outer joins are currently not
+						 * supported for parallel joins, so we don't need to
+						 * set the match bit.  Experiments show that it's worth
+						 * avoiding the shared memory traffic on large systems.
 						 */
 						Assert(!HJ_FILL_INNER(node));
 					}
@@ -482,6 +481,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 						continue;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple to scan the
+					 * bucket for matches
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -527,9 +534,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 			case HJ_FILL_INNER_TUPLES:
 
 				/*
-				 * We have finished a batch, but we are doing right/full join,
-				 * so any unmatched inner tuples in the hashtable have to be
-				 * emitted before we continue to the next batch.
+				 * We have finished a batch, but we are doing
+				 * right/right-anti/full join, so any unmatched inner tuples in
+				 * the hashtable have to be emitted before we continue to the
+				 * next batch.
 				 */
 				if (!ExecScanHashTableForUnmatched(node, econtext))
 				{
@@ -694,6 +702,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
@@ -987,8 +996,8 @@ ExecHashJoinNewBatch(HashJoinState *hjstate)
 	 * side, but there are exceptions:
 	 *
 	 * 1. In a left/full outer join, we have to process outer batches even if
-	 * the inner batch is empty.  Similarly, in a right/full outer join, we
-	 * have to process inner batches even if the outer batch is empty.
+	 * the inner batch is empty.  Similarly, in a right/right-anti/full outer
+	 * join, we have to process inner batches even if the outer batch is empty.
 	 *
 	 * 2. If we have increased nbatch since the initial estimate, we have to
 	 * scan inner batches since they might contain tuples that need to be
@@ -1298,8 +1307,8 @@ ExecReScanHashJoin(HashJoinState *node)
 			/*
 			 * Okay to reuse the hash table; needn't rescan inner, either.
 			 *
-			 * However, if it's a right/full join, we'd better reset the
-			 * inner-tuple match flags contained in the table.
+			 * However, if it's a right/right-anti/full join, we'd better reset
+			 * the inner-tuple match flags contained in the table.
 			 */
 			if (HJ_FILL_INNER(node))
 				ExecHashTableResetMatchFlags(node->hj_HashTable);
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 809aa215c6..5515b041d0 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -805,6 +805,13 @@ ExecMergeJoin(PlanState *pstate)
 						break;
 					}
 
+					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to use current outer tuple for further scans
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						break;
+
 					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
@@ -1063,12 +1070,12 @@ ExecMergeJoin(PlanState *pstate)
 					 * them will match this new outer tuple and therefore
 					 * won't be emitted as fill tuples.  This works *only*
 					 * because we require the extra joinquals to be constant
-					 * when doing a right or full join --- otherwise some of
-					 * the rescanned tuples might fail the extra joinquals.
-					 * This obviously won't happen for a constant-true extra
-					 * joinqual, while the constant-false case is handled by
-					 * forcing the merge clause to never match, so we never
-					 * get here.
+					 * when doing a right, right-anti or full join ---
+					 * otherwise some of the rescanned tuples might fail the
+					 * extra joinquals.  This obviously won't happen for a
+					 * constant-true extra joinqual, while the constant-false
+					 * case is handled by forcing the merge clause to never
+					 * match, so we never get here.
 					 */
 					if (!node->mj_SkipMarkRestore)
 					{
@@ -1332,8 +1339,8 @@ ExecMergeJoin(PlanState *pstate)
 
 				/*
 				 * EXEC_MJ_ENDOUTER means we have run out of outer tuples, but
-				 * are doing a right/full join and therefore must null-fill
-				 * any remaining unmatched inner tuples.
+				 * are doing a right/right-anti/full join and therefore must
+				 * null-fill any remaining unmatched inner tuples.
 				 */
 			case EXEC_MJ_ENDOUTER:
 				MJ_printf("ExecMergeJoin: EXEC_MJ_ENDOUTER\n");
@@ -1554,14 +1561,15 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = true;
 			mergestate->mj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 
 			/*
-			 * Can't handle right or full join with non-constant extra
-			 * joinclauses.  This should have been caught by planner.
+			 * Can't handle right, right-anti or full join with non-constant
+			 * extra joinclauses.  This should have been caught by planner.
 			 */
 			if (!check_constant_qual(node->join.joinqual,
 									 &mergestate->mj_ConstFalseJoin))
@@ -1578,8 +1586,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 
 			/*
-			 * Can't handle right or full join with non-constant extra
-			 * joinclauses.  This should have been caught by planner.
+			 * Can't handle right, right-anti or full join with non-constant
+			 * extra joinclauses.  This should have been caught by planner.
 			 */
 			if (!check_constant_qual(node->join.joinqual,
 									 &mergestate->mj_ConstFalseJoin))
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 7918bb6f0d..c6d59aa99d 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3327,7 +3327,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			outerstartsel = 0.0;
 			outerendsel = 1.0;
 		}
-		else if (jointype == JOIN_RIGHT)
+		else if (jointype == JOIN_RIGHT ||
+				jointype == JOIN_RIGHT_ANTI)
 		{
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index e6ef0deb23..c1237d53f3 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -286,8 +286,9 @@ add_paths_to_joinrel(PlannerInfo *root,
 	 * 2. Consider paths where the outer relation need not be explicitly
 	 * sorted. This includes both nestloops and mergejoins where the outer
 	 * path is already ordered.  Again, skip this if we can't mergejoin.
-	 * (That's okay because we know that nestloop can't handle right/full
-	 * joins at all, so it wouldn't work in the prohibited cases either.)
+	 * (That's okay because we know that nestloop can't handle
+	 * right/right-anti/full joins at all, so it wouldn't work in the
+	 * prohibited cases either.)
 	 */
 	if (mergejoin_allowed)
 		match_unsorted_outer(root, joinrel, outerrel, innerrel,
@@ -1261,14 +1262,15 @@ sort_inner_and_outer(PlannerInfo *root,
 	 * If the joinrel is parallel-safe, we may be able to consider a partial
 	 * merge join.  However, we can't handle JOIN_UNIQUE_OUTER, because the
 	 * outer path will be partial, and therefore we won't be able to properly
-	 * guarantee uniqueness.  Similarly, we can't handle JOIN_FULL and
-	 * JOIN_RIGHT, because they can produce false null extended rows.  Also,
-	 * the resulting path must not be parameterized.
+	 * guarantee uniqueness.  Similarly, we can't handle JOIN_FULL, JOIN_RIGHT
+	 * and JOIN_RIGHT_ANTI, because they can produce false null extended rows.
+	 * Also, the resulting path must not be parameterized.
 	 */
 	if (joinrel->consider_parallel &&
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -1663,10 +1665,10 @@ match_unsorted_outer(PlannerInfo *root,
 
 	/*
 	 * Nestloop only supports inner, left, semi, and anti joins.  Also, if we
-	 * are doing a right or full mergejoin, we must use *all* the mergeclauses
-	 * as join clauses, else we will not have a valid plan.  (Although these
-	 * two flags are currently inverses, keep them separate for clarity and
-	 * possible future changes.)
+	 * are doing a right, right-anti or full mergejoin, we must use *all* the
+	 * mergeclauses as join clauses, else we will not have a valid plan.
+	 * (Although these two flags are currently inverses, keep them separate
+	 * for clarity and possible future changes.)
 	 */
 	switch (jointype)
 	{
@@ -1678,6 +1680,7 @@ match_unsorted_outer(PlannerInfo *root,
 			useallclauses = false;
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			nestjoinOK = false;
 			useallclauses = true;
@@ -1849,13 +1852,14 @@ match_unsorted_outer(PlannerInfo *root,
 	 * handle JOIN_UNIQUE_OUTER, because the outer path will be partial, and
 	 * therefore we won't be able to properly guarantee uniqueness.  Nor can
 	 * we handle joins needing lateral rels, since partial paths must not be
-	 * parameterized. Similarly, we can't handle JOIN_FULL and JOIN_RIGHT,
-	 * because they can produce false null extended rows.
+	 * parameterized. Similarly, we can't handle JOIN_FULL, JOIN_RIGHT and
+	 * JOIN_RIGHT_ANTI, because they can produce false null extended rows.
 	 */
 	if (joinrel->consider_parallel &&
 		save_jointype != JOIN_UNIQUE_OUTER &&
 		save_jointype != JOIN_FULL &&
 		save_jointype != JOIN_RIGHT &&
+		save_jointype != JOIN_RIGHT_ANTI &&
 		outerrel->partial_pathlist != NIL &&
 		bms_is_empty(joinrel->lateral_relids))
 	{
@@ -2191,17 +2195,19 @@ hash_inner_and_outer(PlannerInfo *root,
 		 * partial hash join.  However, we can't handle JOIN_UNIQUE_OUTER,
 		 * because the outer path will be partial, and therefore we won't be
 		 * able to properly guarantee uniqueness.  Similarly, we can't handle
-		 * JOIN_FULL and JOIN_RIGHT, because they can produce false null
-		 * extended rows.  Also, the resulting path must not be parameterized.
-		 * We would be able to support JOIN_FULL and JOIN_RIGHT for Parallel
-		 * Hash, since in that case we're back to a single hash table with a
-		 * single set of match bits for each batch, but that will require
-		 * figuring out a deadlock-free way to wait for the probe to finish.
+		 * JOIN_FULL, JOIN_RIGHT and JOIN_RIGHT_ANTI, because they can produce
+		 * false null extended rows.  Also, the resulting path must not be
+		 * parameterized.  We would be able to support JOIN_FULL, JOIN_RIGHT
+		 * and JOIN_RIGHT_ANTI for Parallel Hash, since in that case we're
+		 * back to a single hash table with a single set of match bits for
+		 * each batch, but that will require figuring out a deadlock-free way
+		 * to wait for the probe to finish.
 		 */
 		if (joinrel->consider_parallel &&
 			save_jointype != JOIN_UNIQUE_OUTER &&
 			save_jointype != JOIN_FULL &&
 			save_jointype != JOIN_RIGHT &&
+			save_jointype != JOIN_RIGHT_ANTI &&
 			outerrel->partial_pathlist != NIL &&
 			bms_is_empty(joinrel->lateral_relids))
 		{
@@ -2258,11 +2264,11 @@ hash_inner_and_outer(PlannerInfo *root,
  *	  Select mergejoin clauses that are usable for a particular join.
  *	  Returns a list of RestrictInfo nodes for those clauses.
  *
- * *mergejoin_allowed is normally set to true, but it is set to false if
- * this is a right/full join and there are nonmergejoinable join clauses.
- * The executor's mergejoin machinery cannot handle such cases, so we have
- * to avoid generating a mergejoin plan.  (Note that this flag does NOT
- * consider whether there are actually any mergejoinable clauses.  This is
+ * *mergejoin_allowed is normally set to true, but it is set to false if this
+ * is a right/right-anti/full join and there are nonmergejoinable join
+ * clauses.  The executor's mergejoin machinery cannot handle such cases, so
+ * we have to avoid generating a mergejoin plan.  (Note that this flag does
+ * NOT consider whether there are actually any mergejoinable clauses.  This is
  * correct because in some cases we need to build a clauseless mergejoin.
  * Simply returning NIL is therefore not enough to distinguish safe from
  * unsafe cases.)
@@ -2308,8 +2314,8 @@ select_mergejoin_clauses(PlannerInfo *root,
 		{
 			/*
 			 * The executor can handle extra joinquals that are constants, but
-			 * not anything else, when doing right/full merge join.  (The
-			 * reason to support constants is so we can do FULL JOIN ON
+			 * not anything else, when doing right/right-anti/full merge join.
+			 * (The reason to support constants is so we can do FULL JOIN ON
 			 * FALSE.)
 			 */
 			if (!restrictinfo->clause || !IsA(restrictinfo->clause, Const))
@@ -2352,6 +2358,7 @@ select_mergejoin_clauses(PlannerInfo *root,
 	switch (jointype)
 	{
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 		case JOIN_FULL:
 			*mergejoin_allowed = !have_nonmergeable_joinclause;
 			break;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index d7cb11c851..4c6ea3a2f0 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -925,6 +925,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 			add_paths_to_joinrel(root, joinrel, rel1, rel2,
 								 JOIN_ANTI, sjinfo,
 								 restrictlist);
+			add_paths_to_joinrel(root, joinrel, rel2, rel1,
+								 JOIN_RIGHT_ANTI, sjinfo,
+								 restrictlist);
 			break;
 		default:
 			/* other values not expected here */
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index c4e7f97f68..e53ea84224 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1077,9 +1077,9 @@ find_var_for_subquery_tle(RelOptInfo *rel, TargetEntry *tle)
  *	  Build the path keys for a join relation constructed by mergejoin or
  *	  nestloop join.  This is normally the same as the outer path's keys.
  *
- *	  EXCEPTION: in a FULL or RIGHT join, we cannot treat the result as
- *	  having the outer path's path keys, because null lefthand rows may be
- *	  inserted at random points.  It must be treated as unsorted.
+ *	  EXCEPTION: in a FULL, RIGHT or RIGHT_ANTI join, we cannot treat the
+ *	  result as having the outer path's path keys, because null lefthand rows
+ *	  may be inserted at random points.  It must be treated as unsorted.
  *
  *	  We truncate away any pathkeys that are uninteresting for higher joins.
  *
@@ -1095,7 +1095,9 @@ build_join_pathkeys(PlannerInfo *root,
 					JoinType jointype,
 					List *outer_pathkeys)
 {
-	if (jointype == JOIN_FULL || jointype == JOIN_RIGHT)
+	if (jointype == JOIN_FULL ||
+		jointype == JOIN_RIGHT ||
+		jointype == JOIN_RIGHT_ANTI)
 		return NIL;
 
 	/*
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index bc67cb9ed8..f412ba62eb 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2071,7 +2071,8 @@ typedef struct MergeJoinState
  *								OuterTupleSlot is empty!)
  *		hj_OuterTupleSlot		tuple slot for outer tuples
  *		hj_HashTupleSlot		tuple slot for inner (hashed) tuples
- *		hj_NullOuterTupleSlot	prepared null tuple for right/full outer joins
+ *		hj_NullOuterTupleSlot	prepared null tuple for right/right-anti/full
+ *								outer joins
  *		hj_NullInnerTupleSlot	prepared null tuple for left/full outer joins
  *		hj_FirstOuterTupleSlot	first tuple retrieved from outer plan
  *		hj_JoinState			current state of ExecHashJoin state machine
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index bdfef0f461..f8e8fe699a 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -317,6 +317,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
 	 * These codes are used internally in the planner, but are not supported
@@ -349,7 +350,8 @@ typedef enum JoinType
 	  ((1 << JOIN_LEFT) | \
 	   (1 << JOIN_FULL) | \
 	   (1 << JOIN_RIGHT) | \
-	   (1 << JOIN_ANTI))) != 0)
+	   (1 << JOIN_ANTI) | \
+	   (1 << JOIN_RIGHT_ANTI))) != 0)
 
 /*
  * AggStrategy -
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index e18641ab92..762cc8e53b 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -2415,24 +2415,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2650,24 +2650,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Anti Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Anti Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2683,26 +2683,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY t1.b;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Anti Join
-         Hash Cond: (t1.b = t2.a)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.a = t1.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t1_1
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p2 t1_2
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_p3 t1_3
-                     Filter: (a = 0)
-               ->  Seq Scan on prt2_adv_extra t1_4
-                     Filter: (a = 0)
+               ->  Seq Scan on prt1_adv_p1 t2_1
+               ->  Seq Scan on prt1_adv_p2 t2_2
+               ->  Seq Scan on prt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t2_1
-                     ->  Seq Scan on prt1_adv_p2 t2_2
-                     ->  Seq Scan on prt1_adv_p3 t2_3
+                     ->  Seq Scan on prt2_adv_p1 t1_1
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p2 t1_2
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_p3 t1_3
+                           Filter: (a = 0)
+                     ->  Seq Scan on prt2_adv_extra t1_4
+                           Filter: (a = 0)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -2870,25 +2870,25 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: (t1.a = t2.b)
+   ->  Hash Right Anti Join
+         Hash Cond: (t2.b = t1.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+               ->  Seq Scan on prt2_adv_p1 t2_1
+               ->  Seq Scan on prt2_adv_p2 t2_2
+               ->  Seq Scan on prt2_adv_p3_1 t2_3
+               ->  Seq Scan on prt2_adv_p3_2 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-                     ->  Seq Scan on prt2_adv_p3_1 t2_3
-                     ->  Seq Scan on prt2_adv_p3_2 t2_4
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (17 rows)
 
 -- full join
@@ -3292,27 +3292,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3505,27 +3508,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt2_adv t1 LEFT JOIN plt1_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3538,26 +3544,26 @@ SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt1_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                       QUERY PLAN                        
+---------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt2_adv_extra t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p1 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p2 t1_3
-                     Filter: (b < 10)
-               ->  Seq Scan on plt2_adv_p3 t1_4
-                     Filter: (b < 10)
+               ->  Seq Scan on plt1_adv_p1 t2_1
+               ->  Seq Scan on plt1_adv_p2 t2_2
+               ->  Seq Scan on plt1_adv_p3 t2_3
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt1_adv_p1 t2_1
-                     ->  Seq Scan on plt1_adv_p2 t2_2
-                     ->  Seq Scan on plt1_adv_p3 t2_3
+                     ->  Seq Scan on plt2_adv_extra t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p1 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p2 t1_3
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt2_adv_p3 t1_4
+                           Filter: (b < 10)
 (18 rows)
 
 -- full join; currently we can't do partitioned join if there are no matched
@@ -3667,25 +3673,25 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Anti Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Anti Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Seq Scan on plt2_adv_p1 t2_1
+               ->  Seq Scan on plt2_adv_p2_1 t2_2
+               ->  Seq Scan on plt2_adv_p2_2 t2_3
+               ->  Seq Scan on plt2_adv_p3 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-                     ->  Seq Scan on plt2_adv_p2_1 t2_2
-                     ->  Seq Scan on plt2_adv_p2_2 t2_3
-                     ->  Seq Scan on plt2_adv_p3 t2_4
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
 (17 rows)
 
 -- full join
@@ -3842,28 +3848,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Anti Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1_null t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
+               ->  Seq Scan on plt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p1_null t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Anti Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Anti Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3_null t2_3
-(19 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a  | b  |  c   
-- 
2.31.0

#24

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Richard Guo (#23)

Re: Using each rel as both outer and inner for JOIN_ANTI

Richard Guo <guofenglinux@gmail.com> writes:

Thanks for reminding. Attached is the rebased patch, with no other
changes. I think the patch is ready for commit.

Pushed after a little further fooling with the comments. I also had
to rebase it over 11c2d6fdf (Parallel Hash Full Join). I think I did
that correctly, but it's not clear to me whether any of the existing
test cases are now doing parallelized hashed right antijoins. Might
be worth a little more testing.

I think that Alvaro's concern about incorrect cost estimates may be
misplaced. I couldn't find any obvious errors in the costing logic for
this, given that we concluded that the early-exit runtime logic cannot
apply. Also, when I try simply executing Richard's original test query
(in a non-JIT build), the runtimes I get line up quite well ... maybe
too well? ... with the cost estimates:

v15 HEAD w/patch Ratio

Cost estimate 173691.19 90875.33 0.52
Actual (best of 3) 514.200 ms 268.978 ms 0.52

I think the smaller differentials you guys were seeing were all about
EXPLAIN ANALYZE overhead.

regards, tom lane

#25

Thomas Munro

thomas.munro@gmail.com

almost 3 years ago

In reply to: Tom Lane (#24)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Apr 6, 2023 at 9:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

Thanks for reminding. Attached is the rebased patch, with no other
changes. I think the patch is ready for commit.

Pushed after a little further fooling with the comments. I also had
to rebase it over 11c2d6fdf (Parallel Hash Full Join). I think I did
that correctly, but it's not clear to me whether any of the existing
test cases are now doing parallelized hashed right antijoins. Might
be worth a little more testing.

I don't see any (at least that are EXPLAINed). Wondering if we should
add some of these into join_hash.sql, but probably not before I figure
out how to make that whole file run faster...

I think that Alvaro's concern about incorrect cost estimates may be
misplaced. I couldn't find any obvious errors in the costing logic for
this, given that we concluded that the early-exit runtime logic cannot
apply. Also, when I try simply executing Richard's original test query
(in a non-JIT build), the runtimes I get line up quite well ... maybe
too well? ... with the cost estimates:

v15 HEAD w/patch Ratio

Cost estimate 173691.19 90875.33 0.52
Actual (best of 3) 514.200 ms 268.978 ms 0.52

I think the smaller differentials you guys were seeing were all about
EXPLAIN ANALYZE overhead.

I tried the original example from the top of this thread and saw a
decent speedup from parallelism, but only if I set
min_parallel_table_scan_size=0, and otherwise it doesn't choose
Parallel Hash Right Anti Join. Same if I embiggen bar significantly.
Haven't looked yet but I wonder if there is some left/right confusion
on parallel degree computation or something like that...

#26

Thomas Munro

thomas.munro@gmail.com

almost 3 years ago

In reply to: Thomas Munro (#25)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Apr 6, 2023 at 12:17 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I tried the original example from the top of this thread and saw a
decent speedup from parallelism, but only if I set
min_parallel_table_scan_size=0, and otherwise it doesn't choose
Parallel Hash Right Anti Join. Same if I embiggen bar significantly.
Haven't looked yet but I wonder if there is some left/right confusion
on parallel degree computation or something like that...

Ahh, the problem is just that create_plain_partial_paths() doesn't
bother to create a partial path for foo at all, because it's so small,
so hash_inner_and_outer() can't even consider a parallel join (that
needs partial paths on both sides). What we want here is a shared
hash table so we can have shared match flags, an entirely new concern,
but create_plain_partial_path() can't see any point in a partial scan
of such a small table. It works if you're OK creating partial paths
for everything...

+#if 0
/* If any limit was set to zero, the user doesn't want a
parallel scan. */
if (parallel_workers <= 0)
return;
+#endif

#27

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Thomas Munro (#26)

Re: Using each rel as both outer and inner for JOIN_ANTI

Thomas Munro <thomas.munro@gmail.com> writes:

... It works if you're OK creating partial paths
for everything...

Hmm. The committed patch already causes us to investigate more
paths than before, which I was okay with because it only costs
more if there's an antijoin involved --- which it seems like
there's at least a 50% chance of winning on, depending on which
table is bigger.

This:

+#if 0
/* If any limit was set to zero, the user doesn't want a
parallel scan. */
if (parallel_workers <= 0)
return;
+#endif

seems like it adds a lot of new paths with a lot lower chance
of win, but maybe we could tighten the conditions to improve
the odds?

regards, tom lane

#28

Richard Guo

guofenglinux@gmail.com

almost 3 years ago

In reply to: Thomas Munro (#25)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Apr 6, 2023 at 8:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Thu, Apr 6, 2023 at 9:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:

Thanks for reminding. Attached is the rebased patch, with no other
changes. I think the patch is ready for commit.

Pushed after a little further fooling with the comments. I also had
to rebase it over 11c2d6fdf (Parallel Hash Full Join). I think I did
that correctly, but it's not clear to me whether any of the existing
test cases are now doing parallelized hashed right antijoins. Might
be worth a little more testing.

I don't see any (at least that are EXPLAINed). Wondering if we should
add some of these into join_hash.sql, but probably not before I figure
out how to make that whole file run faster...

Thanks Tom for the rebase and pushing. Agreed that we need to add more
testing to cover Parallel Hash Right Anti Join. I tried one in
join_hash.sql as below

explain (costs off)
select count(*) from simple r right join bigger_than_it_looks s using (id)
where r.id is null;
QUERY PLAN
---------------------------------------------------------------------
Aggregate
-> Gather
Workers Planned: 2
-> Parallel Hash Right Anti Join
Hash Cond: (r.id = s.id)
-> Parallel Seq Scan on simple r
-> Parallel Hash
-> Parallel Seq Scan on bigger_than_it_looks s
(8 rows)

But as Thomas said, maybe we need to wait until that file becomes
faster.

Thanks
Richard

#29

Richard Guo

guofenglinux@gmail.com

almost 3 years ago

In reply to: Tom Lane (#27)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Apr 6, 2023 at 1:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

This:

+#if 0
/* If any limit was set to zero, the user doesn't want a
parallel scan. */
if (parallel_workers <= 0)
return;
+#endif

seems like it adds a lot of new paths with a lot lower chance
of win, but maybe we could tighten the conditions to improve
the odds?

Seems it wins if the parallel scan becomes part of a hash join in final
plan. I wonder if we have a way to know that in this early stage.

BTW, zero parallel_workers seems would break some later assumptions, so
we may need to give it a meaningful number if we want to do in this way.

Thanks
Richard

#30

Thomas Munro

thomas.munro@gmail.com

almost 3 years ago

In reply to: Richard Guo (#29)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Thu, Apr 6, 2023 at 6:40 PM Richard Guo <guofenglinux@gmail.com> wrote:

Seems it wins if the parallel scan becomes part of a hash join in final
plan. I wonder if we have a way to know that in this early stage.

I haven't tried but I'm not sure off the top of my head how to make a
decision that early unless it's super coarse grained, like is there an
anti join in here somewhere...

Generally, this seems to be a consequence of our parallel query
planner design where parallel paths are generated separately and
compete on cost, as foretold by Hong and Stonebraker[1]/messages/by-id/CA+hUKGL-Fo9mZyFK1tdmzFng2puRBrgROsCiB1=n7wP79mTZ+g@mail.gmail.com. It's going
to be hard to prune those early without missing out on some good
plans, if you don't have a crystal ball.

I wondered for a while what horrible technical problems would come up
if we could defer creation of paths, so partial_pathlist is empty but
a new RelOptInfo flag says "you can call
finish_the_partial_pathlist_please(root, rel)) if you really want
one". We could try to be sparing about calling it that so we don't
finish up creating them all. That would at least move the
should-I-bother-to-make-this-path? logic close to the place with the
knowledge that it'd be useful, in this case the inner side of a
parallel hash join. One problem is that you have to get a
parallel_workers number from somewhere to make a partial path. The
hash join path code knows what its own parallel_workers number will be
(it inherits it from the outer path, though I can't immediately think
of any good reason why it shouldn't be Max(inner, outer)), but if we
were to feed that into a created-on-demand inner path that is then
cached, we'd have a horrible ordering dependency (some other join in
the query gets planned first with a different parallel_workers number
and it gets cached differently). Yuck. As you noted, 0 isn't a great
number either, but it leads to a another thought...

I wondered if we could somehow use the complete (non-partial) path
directly in some cases here, if certain conditions are met. Aside
from any other technical problems, you might ask "but what about the
estimates/costing we already did for that path"; well the numbers are
usually wrong anyway! We have complete paths, and partial paths for
arbitrary parallel_workers numbers that bubbled up from our scan
size-based heuristics. Every time we combine more than one partial
path, we have to handle non-matching "parallel degree" (I'm using that
word to mean the result of get_parallel_divisor(path), a lightly
grilled version of parallel_workers that makes dubious assumptions,
but I digress). Concretely, parallel append and parallel hash join
already rescale some of their input variables to match their own
nominal parallel degree (ugh, I see a few things we should fix in that
code, but this email is long enough already). I wonder if we might
need some infrastructure to formalise that sort of thing. For
example, look at the row estimates in the EXPLAIN of parallel append
over tables with parallel_workers explicitly set to different numbers
using ALTER TABLE. They are wrong (they're based on different
parallel degrees; turn off parallel_leader_participation to make the
arithmetic easier to follow), while the append node itself has
rescaled them and has a good row estimate for its own nominal parallel
degree, which in turn might be wrong depending on what is above.
Perhaps EXPLAIN and everything else should use some common
infrastructure to deal with this.

In other words, we avoid the need for a try-every-parallel-degree
explosion by rescaling from some arbitrary input degree to some
arbitrary output degree, but we can't go all the way and do the
two-phase Hong thing and rescale from non-partial paths in general
(for various technical reasons that apply to more complex nodes, but
not to basic scans). From where we are, I'm not sure how much of a
big step it is to (1) formalise the path rescaling system and (2) be
able to rescale some qualifying simple complete paths too, if needed
for places like this.

Of course you could quibble with the concept of linear scaling of
various number by parallel degrees; various things aren't linear or
even continuous (probably why Hong's system included hash join
thresholds). Even the concept of get_parallel_divisor(path) as
applied to row estimates is suspect, because it assumes that the
planned workers will show up; if a smaller number shows up (at all,
due to max_parallel_workers, or just to this node because a higher
level parallel append sent workers to different subplans) then a node
might receive many more input tuples than expected and blow through
work_mem, even if all estimates were 100% perfect. I have no idea
what to do about that. At least *that* problem doesn't apply to
parallel hash, which shares the memory for the number of planned
workers, even if fewer show up, but that ideas isn't without critics
either.

I dunno. Sorry for the wall of text/ideas. I see unfinished business
in every direction.

[1]: /messages/by-id/CA+hUKGL-Fo9mZyFK1tdmzFng2puRBrgROsCiB1=n7wP79mTZ+g@mail.gmail.com

#31

Richard Guo

guofenglinux@gmail.com

almost 3 years ago

In reply to: Richard Guo (#16)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]

Maybe this is something we can do. Currently for the query below:

# explain select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------
Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=4)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4)
(5 rows)

I believe we can get a cheaper plan if we are able to swap the outer and
inner for SEMI JOIN and use the smaller 'foo' as inner rel.

I'm thinking about the JOIN_RIGHT_SEMI thing and it seems that it can be
implemented for HashJoin with very short change. What we want to do is
to just have the first match for each inner tuple. So after scanning
the hash bucket for matches, we just need to check whether the inner
tuple has been set match and skip it if so, something like

{
if (!ExecScanHashBucket(node, econtext))
{
/* out of matches; check for possible outer-join fill */
node->hj_JoinState = HJ_FILL_OUTER_TUPLE;
continue;
}
}

+     /*
+      * In a right-semijoin, we only need the first match for each
+      * inner tuple.
+      */
+     if (node->js.jointype == JOIN_RIGHT_SEMI &&
+         HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
+         continue;
+

I have a simple implementation locally and tried it with the query below
and saw a speedup of 2055.617 ms VS. 1156.772 ms (both best of 3).

# explain (costs off, analyze)
select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------------
Hash Semi Join (actual time=1957.748..2055.058 rows=10 loops=1)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (actual time=0.026..0.029 rows=10 loops=1)
-> Hash (actual time=1938.818..1938.819 rows=5000000 loops=1)
Buckets: 262144 Batches: 64 Memory Usage: 4802kB
-> Seq Scan on bar (actual time=0.016..853.010 rows=5000000
loops=1)
Planning Time: 0.327 ms
Execution Time: 2055.617 ms
(8 rows)

# explain (costs off, analyze)
select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------
Hash Right Semi Join (actual time=11.525..1156.713 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
-> Seq Scan on bar (actual time=0.034..523.036 rows=5000000 loops=1)
-> Hash (actual time=0.027..0.029 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on foo (actual time=0.009..0.014 rows=10 loops=1)
Planning Time: 0.312 ms
Execution Time: 1156.772 ms
(8 rows)

It may not be easy for MergeJoin and NestLoop though, as we do not have
a way to know if an inner tuple has been already matched or not. But
the benefit of swapping inputs for MergeJoin and NestLoop seems to be
small, so I think it's OK to ignore them.

So is it worthwhile to make JOIN_RIGHT_SEMI come true?

Thanks
Richard

#32

Richard Guo

guofenglinux@gmail.com

almost 3 years ago

In reply to: Richard Guo (#31)

1 attachment(s)

Re: Using each rel as both outer and inner for JOIN_ANTI

On Fri, Apr 7, 2023 at 3:28 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]

Maybe this is something we can do. Currently for the query below:

# explain select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------
Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
-> Hash (cost=72124.00..72124.00 rows=5000000 width=4)
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4)
(5 rows)

I believe we can get a cheaper plan if we are able to swap the outer and
inner for SEMI JOIN and use the smaller 'foo' as inner rel.

It may not be easy for MergeJoin and NestLoop though, as we do not have
a way to know if an inner tuple has been already matched or not. But
the benefit of swapping inputs for MergeJoin and NestLoop seems to be
small, so I think it's OK to ignore them.

Hmm. Actually we can do it for MergeJoin by avoiding restoring inner
scan to the marked tuple in EXEC_MJ_TESTOUTER, in the case when new
outer tuple == marked tuple. But I'm not sure how much benefit we can
get from Merge Right Semi Join.

For HashJoin, though, there are cases that can surely benefit from Hash
Right Semi Join. So I go ahead and have a try on it as attached.

Thanks
Richard

Attachments:

v1-0001-Support-Right-Semi-Join.patchapplication/octet-stream; name=v1-0001-Support-Right-Semi-Join.patchDownload

From fb90a594973cf9be028f216d4a8441fb47c3f8a3 Mon Sep 17 00:00:00 2001
From: Richard Guo <guofenglinux@gmail.com>
Date: Fri, 7 Apr 2023 18:57:10 +0800
Subject: [PATCH v1] Support Right Semi Join

---
 src/backend/commands/explain.c                |   3 +
 src/backend/executor/nodeHashjoin.c           |  14 +-
 src/backend/optimizer/path/joinpath.c         |  14 +-
 src/backend/optimizer/path/joinrels.c         |   3 +
 src/backend/optimizer/path/pathkeys.c         |   3 +
 src/include/nodes/nodes.h                     |   1 +
 src/test/regress/expected/join.out            |  20 +-
 src/test/regress/expected/partition_join.out  | 216 +++++++++---------
 src/test/regress/expected/select_parallel.out |  32 +--
 src/test/regress/expected/updatable_views.out |   6 +-
 10 files changed, 169 insertions(+), 143 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5334c503e1..c3bd0bdcbc 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1561,6 +1561,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
 					case JOIN_ANTI:
 						jointype = "Anti";
 						break;
+					case JOIN_RIGHT_SEMI:
+						jointype = "Right Semi";
+						break;
 					case JOIN_RIGHT_ANTI:
 						jointype = "Right Anti";
 						break;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 0a3f32f731..ef10f8633e 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -488,6 +488,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					}
 				}
 
+				/*
+				 * In a right-semijoin, we only need the first match for each
+				 * inner tuple.
+				 */
+				if (node->js.jointype == JOIN_RIGHT_SEMI &&
+					HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
+					continue;
+
 				/*
 				 * We've got a match, but still need to test non-hashed quals.
 				 * ExecScanHashBucket already set up all the state needed to
@@ -505,10 +513,7 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					node->hj_MatchedOuter = true;
 
 
-					/*
-					 * This is really only needed if HJ_FILL_INNER(node), but
-					 * we'll avoid the branch and just set it always.
-					 */
+					/* Set the match flag */
 					if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
 						HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
 
@@ -734,6 +739,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	{
 		case JOIN_INNER:
 		case JOIN_SEMI:
+		case JOIN_RIGHT_SEMI:
 			break;
 		case JOIN_LEFT:
 		case JOIN_ANTI:
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index cd80e61fd7..1edb1760f4 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -129,7 +129,7 @@ add_paths_to_joinrel(PlannerInfo *root,
 					 List *restrictlist)
 {
 	JoinPathExtraData extra;
-	bool		mergejoin_allowed = true;
+	bool		mergejoin_allowed = !(jointype == JOIN_RIGHT_SEMI);
 	ListCell   *lc;
 	Relids		joinrelids;
 
@@ -206,7 +206,8 @@ add_paths_to_joinrel(PlannerInfo *root,
 	 * way of implementing a full outer join, so override enable_mergejoin if
 	 * it's a full join.
 	 */
-	if (enable_mergejoin || jointype == JOIN_FULL)
+	if ((enable_mergejoin || jointype == JOIN_FULL) &&
+		jointype != JOIN_RIGHT_SEMI)
 		extra.mergeclause_list = select_mergejoin_clauses(root,
 														  joinrel,
 														  outerrel,
@@ -2232,13 +2233,14 @@ hash_inner_and_outer(PlannerInfo *root,
 			 * total inner path will also be parallel-safe, but if not, we'll
 			 * have to search for the cheapest safe, unparameterized inner
 			 * path.  If doing JOIN_UNIQUE_INNER, we can't use any alternative
-			 * inner path.  If full, right, or right-anti join, we can't use
-			 * parallelism (building the hash table in each backend) because
-			 * no one process has all the match bits.
+			 * inner path.  If full, right, right-anti or right-semi join, we
+			 * can't use parallelism (building the hash table in each backend)
+			 * because no one process has all the match bits.
 			 */
 			if (save_jointype == JOIN_FULL ||
 				save_jointype == JOIN_RIGHT ||
-				save_jointype == JOIN_RIGHT_ANTI)
+				save_jointype == JOIN_RIGHT_ANTI ||
+				save_jointype == JOIN_RIGHT_SEMI)
 				cheapest_safe_inner = NULL;
 			else if (cheapest_total_inner->parallel_safe)
 				cheapest_safe_inner = cheapest_total_inner;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 4c6ea3a2f0..e8290bf80b 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -884,6 +884,9 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 				add_paths_to_joinrel(root, joinrel, rel1, rel2,
 									 JOIN_SEMI, sjinfo,
 									 restrictlist);
+				add_paths_to_joinrel(root, joinrel, rel2, rel1,
+									 JOIN_RIGHT_SEMI, sjinfo,
+									 restrictlist);
 			}
 
 			/*
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index e53ea84224..4eda6eb75b 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -1095,6 +1095,9 @@ build_join_pathkeys(PlannerInfo *root,
 					JoinType jointype,
 					List *outer_pathkeys)
 {
+	/* RIGHT_SEMI should not come here */
+	Assert(jointype != JOIN_RIGHT_SEMI);
+
 	if (jointype == JOIN_FULL ||
 		jointype == JOIN_RIGHT ||
 		jointype == JOIN_RIGHT_ANTI)
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f8e8fe699a..013aaf6040 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -317,6 +317,7 @@ typedef enum JoinType
 	 */
 	JOIN_SEMI,					/* 1 copy of each LHS row that has match(es) */
 	JOIN_ANTI,					/* 1 copy of each LHS row that has no match */
+	JOIN_RIGHT_SEMI,			/* 1 copy of each RHS row that has match(es) */
 	JOIN_RIGHT_ANTI,			/* 1 copy of each RHS row that has no match */
 
 	/*
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 5d59ed7890..90397bc095 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -1909,18 +1909,18 @@ select * from int4_tbl i4, tenk1 a
 where exists(select * from tenk1 b
              where a.twothousand = b.twothousand and a.fivethous <> b.fivethous)
       and i4.f1 = a.tenthous;
-                  QUERY PLAN                  
-----------------------------------------------
- Hash Semi Join
-   Hash Cond: (a.twothousand = b.twothousand)
+                   QUERY PLAN                    
+-------------------------------------------------
+ Hash Right Semi Join
+   Hash Cond: (b.twothousand = a.twothousand)
    Join Filter: (a.fivethous <> b.fivethous)
-   ->  Hash Join
-         Hash Cond: (a.tenthous = i4.f1)
-         ->  Seq Scan on tenk1 a
-         ->  Hash
-               ->  Seq Scan on int4_tbl i4
+   ->  Seq Scan on tenk1 b
    ->  Hash
-         ->  Seq Scan on tenk1 b
+         ->  Hash Join
+               Hash Cond: (a.tenthous = i4.f1)
+               ->  Seq Scan on tenk1 a
+               ->  Hash
+                     ->  Seq Scan on int4_tbl i4
 (10 rows)
 
 --
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 762cc8e53b..b8cbde9df4 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -2329,24 +2329,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Semi Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Semi Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Semi Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2538,24 +2538,24 @@ SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Semi Join
-               Hash Cond: (t1_1.a = t2_1.b)
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_1.b = t1_1.a)
+               ->  Seq Scan on prt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Semi Join
-               Hash Cond: (t1_2.a = t2_2.b)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_2.b = t1_2.a)
+               ->  Seq Scan on prt2_adv_p2 t2_2
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Semi Join
-               Hash Cond: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+         ->  Hash Right Semi Join
+               Hash Cond: (t2_3.b = t1_3.a)
+               ->  Seq Scan on prt2_adv_p3 t2_3
                ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (21 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
@@ -2822,25 +2822,25 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Semi Join
-         Hash Cond: (t1.a = t2.b)
+   ->  Hash Right Semi Join
+         Hash Cond: (t2.b = t1.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t1_1
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p2 t1_2
-                     Filter: (b = 0)
-               ->  Seq Scan on prt1_adv_p3 t1_3
-                     Filter: (b = 0)
+               ->  Seq Scan on prt2_adv_p1 t2_1
+               ->  Seq Scan on prt2_adv_p2 t2_2
+               ->  Seq Scan on prt2_adv_p3_1 t2_3
+               ->  Seq Scan on prt2_adv_p3_2 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t2_1
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-                     ->  Seq Scan on prt2_adv_p3_1 t2_3
-                     ->  Seq Scan on prt2_adv_p3_2 t2_4
+                     ->  Seq Scan on prt1_adv_p1 t1_1
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p2 t1_2
+                           Filter: (b = 0)
+                     ->  Seq Scan on prt1_adv_p3 t1_3
+                           Filter: (b = 0)
 (17 rows)
 
 -- left join
@@ -3219,27 +3219,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3409,27 +3412,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
                ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3 t2_3
-(18 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
@@ -3625,25 +3631,25 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                       QUERY PLAN                       
---------------------------------------------------------
+                      QUERY PLAN                      
+------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Semi Join
-         Hash Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
+   ->  Hash Right Semi Join
+         Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
          ->  Append
-               ->  Seq Scan on plt1_adv_p1 t1_1
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Seq Scan on plt2_adv_p1 t2_1
+               ->  Seq Scan on plt2_adv_p2_1 t2_2
+               ->  Seq Scan on plt2_adv_p2_2 t2_3
+               ->  Seq Scan on plt2_adv_p3 t2_4
          ->  Hash
                ->  Append
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-                     ->  Seq Scan on plt2_adv_p2_1 t2_2
-                     ->  Seq Scan on plt2_adv_p2_2 t2_3
-                     ->  Seq Scan on plt2_adv_p3 t2_4
+                     ->  Seq Scan on plt1_adv_p1 t1_1
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
 (17 rows)
 
 -- left join
@@ -3773,28 +3779,30 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
-                              QUERY PLAN                              
-----------------------------------------------------------------------
+                             QUERY PLAN                             
+--------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
-         ->  Hash Semi Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on plt1_adv_p1_null t1_1
-                     Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c))
+               ->  Seq Scan on plt2_adv_p1 t2_1
                ->  Hash
-                     ->  Seq Scan on plt2_adv_p1 t2_1
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on plt1_adv_p2 t1_2
-                     Filter: (b < 10)
+                     ->  Seq Scan on plt1_adv_p1_null t1_1
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c))
                ->  Seq Scan on plt2_adv_p2 t2_2
-         ->  Nested Loop Semi Join
-               Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.c = t2_3.c))
-               ->  Seq Scan on plt1_adv_p3 t1_3
-                     Filter: (b < 10)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p2 t1_2
+                           Filter: (b < 10)
+         ->  Hash Right Semi Join
+               Hash Cond: ((t2_3.a = t1_3.a) AND (t2_3.c = t1_3.c))
                ->  Seq Scan on plt2_adv_p3_null t2_3
-(19 rows)
+               ->  Hash
+                     ->  Seq Scan on plt1_adv_p3 t1_3
+                           Filter: (b < 10)
+(21 rows)
 
 SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a;
  a | b |  c   
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index d88353d496..d524989f25 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -999,26 +999,26 @@ reset role;
 explain (costs off, verbose)
   select count(*) from tenk1 a where (unique1, two) in
     (select unique1, row_number() over() from tenk1 b);
-                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
  Aggregate
    Output: count(*)
-   ->  Hash Semi Join
-         Hash Cond: ((a.unique1 = b.unique1) AND (a.two = (row_number() OVER (?))))
-         ->  Gather
+   ->  Hash Right Semi Join
+         Hash Cond: ((b.unique1 = a.unique1) AND ((row_number() OVER (?)) = a.two))
+         ->  WindowAgg
+               Output: b.unique1, row_number() OVER (?)
+               ->  Gather
+                     Output: b.unique1
+                     Workers Planned: 4
+                     ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 b
+                           Output: b.unique1
+         ->  Hash
                Output: a.unique1, a.two
-               Workers Planned: 4
-               ->  Parallel Seq Scan on public.tenk1 a
+               ->  Gather
                      Output: a.unique1, a.two
-         ->  Hash
-               Output: b.unique1, (row_number() OVER (?))
-               ->  WindowAgg
-                     Output: b.unique1, row_number() OVER (?)
-                     ->  Gather
-                           Output: b.unique1
-                           Workers Planned: 4
-                           ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 b
-                                 Output: b.unique1
+                     Workers Planned: 4
+                     ->  Parallel Seq Scan on public.tenk1 a
+                           Output: a.unique1, a.two
 (18 rows)
 
 -- LIMIT/OFFSET within sub-selects can't be pushed to workers.
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 0cbedc657d..9f18e227dd 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -2675,10 +2675,10 @@ NOTICE:  snooped value: 8
 SELECT * FROM v1 WHERE b=8;
  a | b |  c   |  d   
 ---+---+------+------
- 9 | 8 | t1   | t11d
- 9 | 8 | t11  | t11d
- 9 | 8 | t12  | t11d
  9 | 8 | t111 | t11d
+ 9 | 8 | t12  | t11d
+ 9 | 8 | t11  | t11d
+ 9 | 8 | t1   | t11d
 (4 rows)
 
 DELETE FROM v1 WHERE snoop(a) AND leakproof(a); -- should not delete everything, just where a>5
-- 
2.31.0